Evanalysis
6.2Embedded interactionEstimated reading time: 16 min

6.1.2-6.3 Cantor's theorem, continuum, and choice

Use Cantor's theorem to prove that the reals are uncountable, state the continuum hypothesis, and introduce choice functions, the axiom of choice, and Zorn's lemma.

Course contents

MATH1090: Set theory

Rigorous course notes on logic, sets, number construction, the real numbers, limits, cardinality, and the first algebraic structures, written in linked sections with careful proofs and examples.

Chapter 7Sets with structure1 sections

The previous note compared sizes of sets using bijections and injections. This note begins with the first genuinely large-set theorem of the chapter: Cantor's theorem. It shows that every set is strictly smaller than its power set.

That result immediately gives a proof that the real numbers are uncountable. The same pages then move to two foundational statements: the continuum hypothesis and the axiom of choice. The course does not prove their deep metamathematical facts, but it states clearly how they enter the theory.

Power sets and Cantor's theorem

Definition

Power set notation

For a set XX, the notation 2X2^X denotes the set of all subsets of XX.

Thus

T2XT\in 2^X

means exactly that TXT\subseteq X.

The notation 2X2^X is suggestive: if XX has n elements, then its power set has 2n2^n subsets. Cantor's theorem says that the power set is larger not only for finite sets, but for every set.

Theorem

Cantor's theorem

Let XX be a set. Then

X<2X.|X|<|2^X|.

The proof has two parts.

First, there is an injection

X2X,x{x}.X\to 2^X,\qquad x\mapsto \{x\}.

So X2X|X|\le |2^X|.

Second, there is no bijection from XX to 2X2^X. Suppose, for contradiction, that f:X2Xf:X\to 2^X were a bijection. Define the diagonal set

T={xXxf(x)}.T=\{x\in X\mid x\notin f(x)\}.

Since TT is a subset of XX, we have T2XT\in 2^X. If f were surjective, there would be some yXy\in X such that

f(y)=T.f(y)=T.

Now ask whether yTy\in T.

  • If yTy\in T, then by definition of TT, yf(y)=Ty\notin f(y)=T, a contradiction.
  • If yTy\notin T, then by definition of TT, yf(y)=Ty\in f(y)=T, again a contradiction.

Both cases are impossible. Therefore no bijection X2XX\to 2^X exists, and hence X<2X|X|<|2^X|.

Proof

Why this is a diagonal argument

A diagonal lab

The widget below is meant to support the proof, not replace it. Use it to see how the set TT is forced to disagree with each attempted list of subsets.

Cantor diagonal set construction

Figure. The diagonal set is constructed by reversing the membership decision on the diagonal, so it cannot equal any row of the proposed list.

Read and try

Build Cantor's diagonal set

The lab turns Cantor's diagonal argument into a table that shows why no list can contain every subset.

nf(n)n in f(n)?n in T?
0{0, 2, 4}YesNo
1{0, 1, 3, 5}YesNo
2{2, 3}YesNo
3{0, 4}NoYes
4{1, 4, 5}YesNo
5{}NoYes

T = {3, 5}

This finite table only illustrates the rule. In the proof, T = {n in N : n notin f(n)} differs from every listed f(n) exactly at row n, so no list can contain all subsets of N.

The real numbers are uncountable

Theorem

The reals are uncountable

The set RR of real numbers is uncountable. In particular,

N<R.|N|<|R|.

The proof uses Cantor's theorem and an injection from 2N2^N into RR.

By Cantor's theorem,

N<2N.|N|<|2^N|.

So it is enough to show

2NR.|2^N|\le |R|.

Given a subset SNS\subseteq N, encode it by a sequence of zeros and ones:

an={1,nS,0,nS.a_n= \begin{cases} 1, & n\in S,\\ 0, & n\notin S. \end{cases}

Then define

ϕ(S)=n=0an3n+1[0,1].\phi(S)=\sum_{n=0}^{\infty}\frac{a_n}{3^{n+1}}\in [0,1].

This sends each subset of NN to a real number.

Worked example

Encoding a subset of N

If

S={0,2,5,},S=\{0,2,5,\ldots\},

then the beginning of the sequence is

a0=1,a1=0,a2=1,a3=0,a4=0,a5=1.a_0=1,\quad a_1=0,\quad a_2=1,\quad a_3=0,\quad a_4=0,\quad a_5=1.

The corresponding real number begins as

ϕ(S)=13+133+136+.\phi(S)=\frac13+\frac1{3^3}+\frac1{3^6}+\cdots .

The proof does not need to describe this number by a decimal expansion. It only needs the fact that the infinite series determines a real number.

Why use base 3 rather than base 2? Base 3 is useful so that distinct zero-one sequences cannot be cancelled by the tail.

If SSS\ne S', let k be the first index where the two associated sequences differ. At index k, the two sums differ by exactly

13k+1.\frac1{3^{k+1}}.

The total possible contribution from all later terms is smaller than that leading difference. Therefore the two real numbers are not equal. Hence ϕ\phi is injective, and 2NR|2^N|\le |R|.

Combining this with Cantor's theorem gives

N<2NR,|N|<|2^N|\le |R|,

so RR is uncountable.

Common mistake

The proof only needs an injection into R

To prove 2NR|2^N|\le |R|, we do not need every real number to be hit by ϕ\phi. We only need distinct subsets of NN to produce distinct real numbers.

The continuum hypothesis

Theorem

Continuum hypothesis

The continuum hypothesis says that there is no set SS such that

N<S<R.|N|<|S|<|R|.

This is a natural guess after seeing that RR is larger than NN: perhaps there is no intermediate size between the countably infinite cardinality and the cardinality of the continuum.

This guess has a surprising formal status. The continuum hypothesis is independent of the usual axioms of set theory, ZFC. Godel showed in 1940 that it is consistent with ZFC, and Cohen showed in 1963, using forcing, that its negation is also consistent with ZFC. Therefore it can neither be proved nor disproved from the standard axioms alone.

For this course, the important point is not to prove those results. It is to recognize that cardinal arithmetic quickly reaches foundational questions where axioms matter.

Choice functions and the axiom of choice

Before stating the axiom of choice, define how to take a union of a set whose elements are themselves sets:

S={xXS such that xX}.\bigcup S=\{x\mid \exists X\in S\text{ such that }x\in X\}.

Definition

Choice function

Let SS be a set whose elements are nonempty sets. A choice function for SS is a function

f:SSf:S\to \bigcup S

such that

f(X)Xf(X)\in X

for every XSX\in S.

A choice function chooses one element from each set in the family SS.

If SS contained the empty set, no choice function could exist, because there is no element to choose from the empty set. That is why the definition assumes that the members of SS are nonempty.

Theorem

Axiom of choice

Let SS be a set whose elements are nonempty sets. Then SS has a choice function.

This is an axiom, not a theorem. Like the continuum hypothesis, its truth or falsehood is independent of the other axioms of set theory.

Worked example

What a choice function does

Let

S={{1,2},{3,4,5},{6}}.S=\{\{1,2\},\{3,4,5\},\{6\}\}.

A choice function might choose

f({1,2})=1,f({3,4,5})=4,f({6})=6.f(\{1,2\})=1,\qquad f(\{3,4,5\})=4,\qquad f(\{6\})=6.

The function does not have to choose the smallest element. It only has to choose one member of each nonempty set.

Surjections and cardinal inequalities

Theorem

Surjections give reverse cardinal inequalities with choice

Let f:XYf:X\to Y be a surjective function. Then

XY.|X|\ge |Y|.

To prove this, we need an injection g:YXg:Y\to X.

For each yYy\in Y, the fiber

f1(y)Xf^{-1}(y)\subseteq X

is nonempty because f is surjective. Let

S={f1(y)yY}.S=\{f^{-1}(y)\mid y\in Y\}.

This is a set of nonempty subsets of XX. By the axiom of choice, choose one element from each fiber. Define

g(y)=the chosen element of f1(y).g(y)=\text{the chosen element of }f^{-1}(y).

Then g:YXg:Y\to X is injective. Indeed, if g(y)=g(y)g(y)=g(y'), then the same element of XX lies in both fibers, so

f(g(y))=yandf(g(y))=y.f(g(y))=y \qquad\text{and}\qquad f(g(y'))=y'.

Since g(y)=g(y)g(y)=g(y'), it follows that y=yy=y'.

This explains the earlier warning: the implication from surjection to reverse cardinal inequality uses choice when infinitely many fibers may need to be selected simultaneously.

Countable unions of countable sets

Theorem

A countable union of countable sets is countable

If AnnN{A_n}_{n\in N} is a countable family of countable sets, then

A=nNAnA=\bigcup_{n\in N} A_n

is countable.

The proof is organized by surjections. Since each AnA_n is countable, choose a surjection

fn:NAn.f_n:N\to A_n.

If AnA_n is finite, one may repeat the last element indefinitely to obtain such a surjection. The axiom of choice is used to choose the whole family of maps fn{f_n} simultaneously.

Now define

F:N×NA,F(n,m)=fn(m).F:N\times N\to A,\qquad F(n,m)=f_n(m).

This map is surjective: every element of the union lies in some AnA_n, and then is hit by the corresponding fnf_n.

Finally, N×NN\times N is countable by a diagonal argument analogous to the proof that QQ is countable. Therefore there is a surjection

NN×N.N\to N\times N.

Composing gives a surjection NAN\to A, so AA is countable.

Common mistake

Countable union is not the same as arbitrary union

The theorem here is about a countable family An{A_n}. It does not say that an arbitrary union of countable sets must be countable.

Chains, maximal elements, and Zorn's lemma

The final part of the assigned pages states Zorn's lemma, which is equivalent to the axiom of choice.

Definition

Chain

Let XX be a partially ordered set. A chain SXS\subseteq X is a totally ordered subset: for every a,bSa,b\in S, either

aborba.a\le b \qquad\text{or}\qquad b\le a.

Definition

Maximal element

An element mXm\in X is called a maximal element if there is no xXx\in X with

x>m.x>m.

A maximal element need not be greater than all other elements. This is different from a maximum, which must satisfy mxm\ge x for every xXx\in X.

Worked example

Maximal is weaker than maximum

In a partially ordered set, two elements may be incomparable. If neither is above the other, both can be maximal in a small subset even though neither is a maximum.

So "maximal" means "cannot be extended upward from here," not "dominates every element."

Theorem

Zorn's lemma

The axiom of choice is equivalent to the following statement:

if XX is a nonempty partially ordered set in which every chain has an upper bound in XX, then XX has a maximal element.

The course states this result as a foundational tool. Its full proof belongs to a more advanced treatment, but the definitions should be read carefully now: Zorn's lemma is about partial orders, chains, upper bounds for chains, and the existence of maximal elements.

Common mistakes and subtle points

Common mistake

Do not treat Cantor's theorem as only finite arithmetic

For finite sets, 2n>n2^n>n is familiar. Cantor's theorem is stronger: it applies to every set, including infinite sets.

Common mistake

Do not confuse maximal with maximum

A maximum is above every element. A maximal element merely has no strictly larger element above it. In partial orders these are different ideas.

Common mistake

Do not hide the role of choice

Choosing one element from each of infinitely many nonempty sets is exactly the kind of step the axiom of choice is designed to justify.

Quick checks

Quick check

In Cantor's theorem, what is the diagonal set T?

State it in terms of whether x belongs to f(x).

Solution

Answer

Quick check

Why is base 3 used in the injection from 2^N to R?

Think about the first differing index and the possible tail contribution.

Solution

Answer

Quick check

What does a choice function choose?

Mention both the family of sets and the chosen element.

Solution

Answer

Exercises

Quick check

Show why the singleton map x -> {x} is injective.

Assume two singleton sets are equal.

Solution

Guided solution

Quick check

Explain why a surjective map f:X to Y gives nonempty fibers f^{-1}(y).

Use the definition of surjective.

Solution

Guided solution

Quick check

In Zorn's lemma, why is it not enough to talk only about maximum elements?

Recall that the order is partial, not necessarily total.

Solution

Guided solution

Read this after 6.1 Cardinality, countability, and cardinal inequalities and 2.2 Functions and relations. The next course section begins with intervals, so this note closes the big-set foundations used in Chapter 6.1-6.3.