Sunday, October 28, 2012

The Zorn Identity

The Zorn Identity:
MathML-enabled post (click for more details).
My aim here is to make one simple point:


Zorn’s lemma has almost nothing to do with the axiom of choice.


It hasn’t got much to do with set theory, either.

MathML-enabled post (click for more details).
People often describe Zorn’s lemma as a form of the axiom of choice. If
called on to justify that statement, they would probably answer: in the
presence of the Zermelo–Frankel axioms for set theory, the axiom of
choice implies Zorn’s lemma and vice versa. This is, indeed, a fact.

However, I want to argue that the emphasis is misplaced.
Zorn’s lemma is provable entirely without the axiom of choice, until
the very end, where choice is used in a simple and transparent way.

Let me make that precise. Here are three theorems about partially ordered
sets (posets).

Preliminary definitions: for a poset X,


  • a chain in X is a subset of X that, with the order
    inherited from X, is totally ordered

  • an upper bound for a subset C⊆X is an element x∈X
    such that c≤x for all c∈C

  • a maximal element of X is an element m∈X such that the only
    element of X greater than or equal to m is m itself.

The first of the three theorems is usually called “Zorn’s lemma”.


1. Zorn’s maximal theorem 
Let X be a poset in which every chain has an upper bound. Then X
has a maximal element.



For the second, we need another definition: an inflationary
operator
on a poset X is a map of sets s:X→X such that
s(x)≥x for all x∈X. Note that s need not be
order-preserving.


2. Zorn’s fixed point theorem
Let X be a poset in which every chain has an upper bound. Then every
inflationary operator on X has a fixed point.



In the definition of upper bound for a subset C⊆X, it
is not required that the upper bound belongs to C. However, our
third theorem says that if we’ve got an upper bound for every chain in
a poset, then at least one of them must in fact belong to that chain.


3. Zorn’s chain theorem
Let X be a poset. Let u:{chains in  X}→X
be a function assigning to each chain in X an upper bound.
Then u(C)∈C for some chain C.



I confess that these names are completely without historical justification.
I have no idea which of these statements Zorn knew, or proved. For all I
know, Zorn didn’t even prove Zorn’s lemma — that wouldn’t be
surprising, given how inaccurate mathematical attributions often are.

My central point, that Zorn’s lemma has almost nothing to do with the axiom
of choice, is justified as follows:


  • Zorn’s chain theorem is provable without the axiom of choice

  • Zorn’s chain theorem trivially implies Zorn’s maximal theorem
    (a.k.a. Zorn’s lemma).

I won’t show you the choice-free proof of Zorn’s chain theorem, or at
least, not today. But I will show you how Zorn’s chain theorem (3)
trivially implies Zorn’s maximal theorem (1). In fact, I’ll show you how
each of the three theorems trivially implies the others. There’s a
direction to it: both implications 3 ⇒ 2 ⇒ 1 seem
to require choice, but the reverse implications don’t.

If you’re feeling too lazy to read the following proofs, it doesn’t really
matter. All you need to observe is: they’re very short!


  • 1 ⇒ 2  An inflationary operator fixes every
    maximal element.

  • 2 ⇒ 1  There is an inflationary operator s on
    X defined by taking s(x)=x whenever x∈X is maximal, and
    choosing s(x)>x for all other x∈X. Then s has a
    fixed point, so X has a maximal element.

  • 2 ⇒ 3  The restriction of u to one-element chains
    is an inflationary operator, so has a fixed point.

  • 3 ⇒ 2  Choose an upper bound v(C) for each chain,
    and let s be an inflationary operator on X. Then whenever C is a
    chain, s(v(C)) is also an upper bound for C. Hence s(v(C))∈C for some
    chain C, so s(v(C))≤v(C), so v(C) is a fixed point of s.

So, the three theorems are trivially deducible from each other. In
contrast, the proof of any one of them is a relatively serious endeavour.

Describing Zorn’s lemma as a form of the axiom of choice is a bit like
describing a person in terms of their hairstyle. Occasionally hair
matters: wear a mohican and snooty restaurants will refuse
you entry, but part your hair at the side and they’ll let you in.
Nevertheless, for almost all purposes, you’re the same person. The three
theorems above are really one theorem with three different haircuts.

I tried to come up with a similar example from elsewhere in mathematics,
but I couldn’t think of one. (Maybe you can.) What we need is a
nontrivial theorem that exists in two variants, one of which requires the
axiom of choice for its proof and one of which doesn’t. Deducing either
variant from the other should be trivial.

Why does any of this matter? That question can only have a personal
answer. I’ll give you mine, but it’s quite likely you’ll disagree, and
that’s OK — it is, after all, personal.

What it’s not about, for me, is the axiom of choice. I’m not
actually enormously interested in figuring out which parts of mathematics
use the axiom of choice and which don’t. (I don’t think that’s a pointless
endeavour; I’m only stating that it’s not a particular interest of my own
at the moment, just as heart surgery and firefighting aren’t.)

What it is about is a continuing effort to understand the shape and
character of set theory. Like many readers, I suspect, I took a ZFC-based
course on set theory when I was a student. At the time I very much enjoyed
it. Later, I came to realize that there was a qualitative difference
between theorems I’d learned such as “for any sets X and Y, either X
is not an element of Y or Y is not an element of X” (which is only
meaningful in certain formal systems) and those such as “for any sets X
and Y, either there is an injection from X into Y or there is an
injection from Y into X” (which just about every mathematician would agree is meaningful). Now, I find myself wanting to understand
which of the results typically called “set theory” — like Zorn’s
lemma — are really results about order that intrinsically have little
to do with sets.

So let’s treat Zorn’s lemma in exactly the same way as we treat, say, the
classification theorem for finite abelian groups — no mystique, just
a straightforward, run-of-the-mill, result. Let’s not treat it as a
mysterious creature poking its head up from the dark set-theoretic cellar of mathematics. Let’s call it what it is: quite simply, a useful piece
of order theory.

DIGITAL JUICE

No comments:

Post a Comment

Thank's!