set theory - cornell universitypi.math.cornell.edu/~web3040/setsandfunctions_s11.pdf ·...

Math 3040 Spring 2011

Set theory†

Contents

1. Sets 11.1. Objects and set formation 11.2. Intersections and Unions 21.3. Differences 31.4. Power sets 41.5. Ordered pairs and binary cartesian products 42. Relations 42.1. General relations 42.2. Functions 62.3. Injective, surjective, and bijective functions 62.4. Inverses 72.5. The set of all functions A → B 72.6. Ordered n-tuples 82.7. Operations 82.8. Equivalence relations 92.9. Order relations 133. Cardinality 153.1. Ordering cardinalities 154. Finite and Infinite sets 16Appendix A. Well-posed definitions 18

The formal concepts of set and function date back to the nineteenth century, which makethem relative newcomers to mathematics. Nevertheless, these ideas have become essentialtools for doing mathematics and for the precise expression of mathematical ideas.

This chapter presents the basic definitions needed for working with sets and functions.It is self-contained, but it is not intended for the student who has never seen the basicconcepts. The main point of the chapter, which appears in its second half, is to present thekey ideas of equivalence relation, order relation, cardinality, and infinity, all of which playa role later in the course.

1. Sets

1.1. Objects and set formation. In mathematics it is important to be able to recognizethat certain distinct objects are in some way related to one another, and then to considerall such related objects as a collectivity, a unified whole—indeed, another object in its ownright. This is the mental process of set-formation, which is a special case of the mentalprocess of abstraction. The collectivity formed is called a set or, synonymously, a collection,

† c©January 1, 2011

1

2

or a family. The objects that we collect to form the set are called members or elements of theset. We may use the usual “curly-bracket” notation for this, as when we consider objectsa, b, c, . . ., and collect them into a set {a, b, c, . . .}. We also use the standard set-buildernotation when we form a set by describing properties. Thus, for example, the notation{x : x is real andx2 > 1} denotes the set of all real numbers whose square is greater than 1.

It is important to note that sets themselves may be members of other sets. For example,there are exactly 120 distinct sets each of which consists of three distinct integers between0 and 9. Examples of these are, say, {0, 1, 2}, {4, 5, 9}, {6, 2, 7}, and so on. Let S denotethe set consisting of all 120 of these. That is, S is a set whose members (or elements) arethemselves sets. This sort of object, namely a set of sets, occurs repeatedly in mathematics.

When an object a is a member of a set A, we describe this notationally by a ∈ A. Thenegation ¬ (a ∈ A) is abbreviated by a 6∈ A. Sets are defined by the elements that they

elementsand

subsets

contain, which means that two sets are equal if and only if they have the same elements. Asubset B of a set A is simply a set formed exclusively from objects of A, but not necessarilyfrom all of them. When this is the case, we write B ⊆ A. We may describe this by sayingthat B is contained in A or included in A. Sometimes we wish to consider subsets of a setA but not the set A itself. We call such subsets of A proper subsets. For such subsets B,we may use the notation of strict set inclusion B ⊂ A.

Two sets are worthy of special notice at this point: namely, the set with no objects(the empty set), and the set containing all the objects we want to consider (the universalset, or universe). The former is denoted by ∅; the latter by U . The empty set is easy tounderstand, and the student should have no trouble verifying that the empty set is a subsetof every set.

Exercise 1. Verify that the following sets are all different: ∅, {∅}, {{∅}}, and {∅, {∅}}.How many elements does each contain?

A cautionary note: Exercise 1 points up the distinction between the membership relation∈ and the inclusion relation ⊆. For {∅} is a member of {{∅}}, but it is not included in it.And ∅ is a subset of {{∅}}, but it is not a member of it. So, do not use the notation ∈ whenyou mean “is a subset of”, and do not use the notation ⊆ when you mean “is a memberof.”

The universal set U is a more complicated concept. Not only do we want its members tothe

universalset

be all of the sets we normally encounter, say in calculus or number theory, but we also wantits elements to be all sets obtained from these via any constructions that we may choose tomake in the course of proofs, etc. This forces U to be very large. In a more advanced coursein set theory, it is shown how to construct such universal sets. Here, we just take them forgranted. The reasons for considering them at all are technical: without some universal setto work in, we end up with logical contradictions.

1.2. Intersections and Unions. Let A and B be sets. Introductory treatments of settheory describe the intersection and union of A and B as follows. The intersection A ∩ Bis the set consisting of all objects x such that x ∈ A and x ∈ B; the union A ∪B is the setconsisting of all objects x such that x ∈ A or x ∈ B. Or, more symbolically,

A ∩B = {x : (x ∈ A) ∧ (x ∈ B)},and

A ∪B = {x : (x ∈ A) ∨ (x ∈ B)}.

3

( Note the similarity between the symbols ∩ and ∧ and the similarity between ∪ and ∨. Ofcourse this is not coincidental.)

For example, the two sets of lower case letters {c, o, r, n, e, l} and {u, n, i, v, e, r, s, t, y}have intersection {e, n, r} and union {c, o, r, n, e, l, u, i, v, s, t, y}.

Often in mathematics, however, it is desirable to consider intersections and unions of manysets, indeed, often infinitely many. So, it is convenient to have definitions of intersectionsand unions of any collection of sets. Accordingly, let F denote a family of sets. Theintersection of the sets in F may be written as ∩F and is the set defined as follows:

∩F consists of eachx that is simultaneously a member of every set inF .

The union of the sets in F may be written as ∪F and is the set defined as follows:

∪F consists of eachx that is a member of at least one set in F .

When the sets of F are listed in some fashion, for example such as F = {A,B, C}, then∩F may be written as the more familiar A ∩B ∩C and ∪F may be written as A ∪B ∪C.Or, when F is an infinite sequence of sets, say F = {A1, A2, A3, . . .}, we may write ∩F as⋂∞

n=1 An and ∪F as⋃∞

n=1 An.As an example of these definitions, let F denote the set of all lines ` lying in the plane R2

and passing through the origin (0, 0) of R2. Each such ` is, of course, itself a set of pointsin the plane. Both ∩F and ∪F are sets of points in the plane.

Exercise 2. Check that ∩F = {(0, 0)} and ∪F = R2.

Exercise 3. Suppose that F is the empty collection of sets. Prove that ∪F = ∅ and∩F = U , where we recall that U denotes the universal set. (The reader might well expectthe first equality but is likely to be surprised at the second one. Give a careful proof ofeach.)

Exercise 4. Let T and S1, S2, S3, . . . be sets. Prove:

(a) T ∩ (⋃i Si) =

⋃i(T ∩ Si).

(b) T ∪ (⋂i Si) =

⋂i(T ∪ Si).

These are known as distributive laws. In the first, we have intersection distributing overunion; in the second, union distributes over intersection.

1.3. Differences. The complement of a set A is defined to be {x ∈ U : x 6∈ A}, and itis denoted by Ac. Sometimes it’s useful to restrict attention only to those elements of Ac

that are also in some other set B, i.e., to the set B ∩ Ac. This is usually called the (set)difference between B and A, and it is denoted by B \ A. Note that A does not have to bea subset of B in order for B \A to be defined.

Caveats: (a) The meaning of the term “set difference” has nothing to do with subtractingnumbers or vectors, etc. No algebra is involved; only intersection and complementation ofsets.(b) Note the difference between the backslash \ and the forward slash /. That is, do notconfuse the notation for set difference, which is B \ A, with the notation B/A. Even incontexts in which both of these are defined (which is not always the case) they will notmean the same thing.

4

1.4. Power sets. From time to time, we shall wish to refer to the collection of all subsetsof a given set S. This collection (which includes the empty set ∅ and the full set S itself)is known as the power set of S. It is denoted either by P(S).

Exercise 5. (a) Suppose S is a set with exactly 4 distinct elements. How many elementsdoes P(S) have? (Hint: Make a list.)

(b) Let S be as in (a). How many elements are there in P(P(S))? You may use acalculator.

1.5. Ordered pairs and binary cartesian products. Given sets A and B and elementsa ∈ A and b ∈ B, the notion of the ordered pair (a, b) is defined by

(a, b) = {{a}, {a, b}}.Proposition 1. {{a}, {a, b}} = {{c}, {c, d}}, if and only if a = c and b = d.

This key property of ordered pairs is the one that students are familiar with when theywork with ordered pairs of real numbers in an analytic geometry course or a calculus course.

Exercise 6. Prove Proposition 1.

Exercise 7. Explain why we cannot define (a, b) simply to be the set {a, b}.The collection of all ordered pairs (a, b) as a ranges over A and b ranges over B is denoted

A×B and is called the Cartesian product of A by B. Note that, in general, this is not thesame as the Cartesian product B ×A.

2. Relations

2.1. General relations. Mathematics is not done with objects and sets alone, just asphysics cannot be done by simply labeling certain things as particles or bodies. Bothsubjects require some way to express various kinds of interactions among the objects ofstudy. In mathematics, this is done using the notion of relation. We focus almost exclusivelyhere on the notion of a binary relation, that is a relation between a pair of objects.

There are many ways to describe or define particular kinds of relations, but it is usefulat the outset to be as general as possible. So, let A and B be any two sets. By a relation ρfrom (elements of) A to (elements of) B, we mean simply a subset of A×B.

The motivation behind this definition is that in its most general form, a relation betweenelements of A and B simply singles out certain pairs of elements, a ∈ A and b ∈ B that arethen said to be related. It is simply a list of related pairs. A philosopher might call this adenotative definition of a relation, that is, a definition that points out all instances in whichthe relation holds and ignores those in which it does not. A so-called connotative definitionwould describe the notion of relation in terms of some sort of qualities or properties thatit has. It is hard to see what these (general) properties would be, which is why we usedenotative definition. Note also that the sort of definition that we give fits nicely into settheory, since our concept of relation is expressed in terms of sets.

Clearly this definition gives the most general possible concept of a (binary) relation, sincewe place no further restrictions on ρ. Two notably uninteresting relations from A to B arethe empty relation ∅ and the full relation A×B.

Now for some notation. Instead of writing (a, b) ∈ ρ, when we want to indicate a pair ofelements that are related, we shall write a ρ b, and say “a is ρ-related to b.” This seems tobe the natural way to define the notation and terminology. Unfortunately, it runs somewhat

5

counter to the historically standard notation and terminology for functions, as we shall seeshortly.

The set of all relations from A to B is simply the power set of A×B, namely P(A×B).We may apply the standard set-theoretic operations to relations. Therefore, for example,

given two relations ρ and σ from A to B, it makes sense to talk about the relations ρ ∩ σand ρ ∪ σ or to say that ρ ⊆ σ.

Taking a cue from what we do with functions, moreover, we may talk about the compo-sition of two relations or about the inverse of a relation.

• Suppose that ρ is a relation from A to B and σ is a relation from B to C. Then thecompositionofrelations

composition (or composite) σ ◦ ρ is the relation from A to C defined as follows: Forany a ∈ A and c ∈ C,

a (σ ◦ ρ) c ⇐⇒ (a ρ b ∧ b σ c), for some b ∈ B.

• With ρ as above, its inverse ρ−1 is the relation from B to A defined as follows: Forinverseof arelation

any a ∈ A and any b ∈ B,

b (ρ−1) a ⇐⇒ a ρ b.

Given a relation ρ from A to B, which we may write as ρ : A → B or Aρ→ B, we define

two sets, the domain and range of ρ, written D(ρ) and R(ρ), respectively, as follows:

D(ρ) = {a : a ∈ A and a ρ b, for some b ∈ B}R(ρ) = {b : b ∈ B and a ρ b, for some a ∈ A}.

Sometimes, the set B is called the codomain of ρ, but there is no special symbol for this.When D(ρ) = A and A = B, we often say that ρ is a relation on A.The most important relation defined on any set A is the identity relation, otherwise known

as equality and usually denoted by =. If it is useful to do so, we may want to emphasizethe particular domain A in this case by writing the relation as =A, though this is almostalways left tacit. Using the above notation, we could write

=A = {(a, a) : a ∈ A}.Other examples of relations will appear shortly.

Exercise 8. We define sets A,B, C, D as follows: A = {a, b, c, d}, B = {α, β, δ}, C ={1, 2}, D = {w, x, y}. Next, we define the relations ρ : A → B, σ : B → C, and τ : C → Das follows: ρ = {(a, α), (b, β), (b, δ), (c, δ)}, σ = {(α, 1), (α, 2), (β, 1), (δ, 1), (δ, 2)}, τ ={(1, w), (1, x)}. Compute the relations (i) σ ◦ ρ, (ii) τ ◦ σ, (iii) τ ◦ (σ ◦ ρ), (iv) (τ ◦ σ) ◦ ρ,(v)ρ−1, (vi) σ−1, (vii) (σ ◦ ρ)−1, and (viii) (ρ−1) ◦ (σ−1). Notice that the relations in (iii)and (iv) are the same and that the relations in (vii) and (viii)are the same.

Exercise 9. Prove the following properties involving the relations ρ : A → B, σ : B → C,and τ : C → D. (Note: These are not the sets and relations defined in the preceding

propertiesofrelations

exercise; they are general sets and relations. Your proofs should not use specific relations,such as those defined in the preceding exercise. The proofs should apply to the given generalrelations.)

(a) (ρ−1)−1 = ρ.(b) (σ ◦ ρ)−1 = (ρ−1) ◦ (σ−1).(c) (τ ◦ σ) ◦ ρ = τ ◦ (σ ◦ ρ).

6

2.2. Functions. Next to the relation of equality, the most common kind of relation is afunctional relation or function. Functions are, for example, the familiar objects of studyin calculus. We begin here by repeating the basic definition and then proceed to defineadditional concepts.

A function f from A to B is a relation from A to B with the following property:

Fun(f,A,B): for every a ∈ A, there is exactly one b ∈ B such that a f b.

Since this b is uniquely determined by a, we may unambiguously denote it by f(a) and referto this as the function value of f at a. This accounts for the function notation f(a) = bwhich now replaces the more general relation notation afb.

Notice that, by definition, the domain D(f) of f : A → B equals A, but the range R(f)need not equal the entire codomain B.

Here is an easy example. Let g : R → R be the real-valued function of a real variablegiven by the equation y = g(x) = x2 − 2x− 2. Then, we have, for example, −2 = g(0) and−2 = g(2). So, although each real number a determines g(a) uniquely, it is not correct tosay that g(a) always determines a uniquely. Moreover, R(g) 6= R, because, for example, −4is not a function value g(a) for any real number a. Try to verify this last fact, either usingcalculus methods or straight algebra.

If we need to refer to the above definition of the function f : A → B, we’ll refer to it as“Fun(f, A, B).”

The two simplest examples of functions are the constant functions and the identity func-tions. Constant functions are those functions with exactly one function value. For anyset A, the identity function idA may be defined to be the relation {(a, a) : a ∈ A}, i.e,idA(a) = a, for all a ∈ A.

It is sometimes convenient to extend the function notation f(a) as follows. Suppose thatf is as above and that C is any subset of A. Then we may let f(C) denote the set

{b ∈ B : f(c) = b, for some c ∈ C}.In words, f(C) is the set of all function values f(c), as c ranges over C. For examplef(∅) = ∅ and f(A) = R(f).

Exercise 10. Let f : A → B and g : B → C be functions. In particular, they are relations,composition

offunctions

and so their composition g◦f : A → C, as a relation, is defined. Prove that this compositionis a function.

2.3. Injective, surjective, and bijective functions. Let f : A → B be a function. Wesay that f is injective if and only if f(a1) 6= f(a2) whenever a1 6= a2. To put it another

injectivefunctions (logically equivalent) way: For all a1 and a2 in A, f(a1) = f(a2) =⇒ a1 = a2. For

example, the real polynomial function y = x3 is injective, whereas the polynomial functiony = x2 is not. A common synonym for injective is the term one-to-one, sometimes written1− 1.

7

Exercise 11. Suppose that f : A → B and g : B → C are both injective. Prove thatg ◦ f : A → C is injective.

We say that f is surjective if and only if every b ∈ B is in R(f). In other words, for everysurjectivefunctionsb ∈ B there exists at least one a ∈ A such that b = f(a). For example, considering each

of the two polynomial functions described above as having domain and codomain equal tothe set of all real numbers, then y = x3 is surjective, whereas y = x2 is not. A commonsynonym for surjective is the term onto.

Exercise 12. Suppose that f : A → B and g : B → C are both surjective. Prove thatg ◦ f : A → C is surjective.

Finally, we say that f is bijective if and only if it is both injective and surjective.bijectivefunctions

Exercise 13. Use the previous two exercises to conclude that if f : A → B and g : B → Care both bijective, then g ◦ f : A → C is bijective.

An injective function is sometimes called an injection, a surjective function a surjection,1-1correspon-dence

and a bijective function a bijection. A common synonym for a bijection is a one-to-onecorrespondence or 1-1 correspondence.

If there is a bijection f : A → B, we may say that A is bijectively related to B or inbijective correspondence with B or in 1-1 correspondence with B.

2.4. Inverses. We may always form the inverse of a function f , using the definition givenabove for the inverse of a relation. The result f−1 is a relation but not usually a function,however, since it may not associate a unique value with every element in its domain. Inorder to ensure that f−1 be a function, we need this uniqueness feature to hold. Thishappens precisely when the original function is injective.

Exercise 14. Let f be a function from A to B with range C. (a) Prove that f is injectiveif and only if f−1 is a function from C to A. (b) Use (a), together with (f−1)−1 = f ,(Exercise 9(a)), to prove that whenever f is an injective function, so is f−1.

Exercise 15. (a) Prove that if f is a bijection from A to B, then f−1 is a bijection fromB to A. (b) Verify that f ◦ f−1 = idB and f−1 ◦ f = idA.

Exercise 16. Prove that if f : A → B and g : B → A are functions satisfying f ◦ g = idB

and g ◦ f = idA, then f is bijective, and g = f−1.notationBA

2.5. The set of all functions A → B. Since a function is a special kind of relation, theset of all functions A → B is a subset of P(A×B). It is usually denoted by BA.

Exercise 17. Consider any function f : A → {0, 1}, and define a subset Bf ⊆ A by therule x ∈ Bf ⇐⇒ f(x) = 1. The set Bf is sometimes called the support of f . Now, define

supportof afunction

a function ψ : {0, 1}A → P(A) by the rule ψ(f) = Bf . Next, choose any subset B ⊆ A, anddefine a function kB : A → {0, 1} by the rule kB(x) = 1 ⇐⇒ x ∈ B. In other words, kB

has constant value 1 for every x ∈ B and constant value 0 for every x 6∈ B. kB is sometimescalled the characteristic function of B. Finally, define a function φ : P(A) → {0, 1}A bythe rule φ(B) = kB.

8

Prove that φ ◦ ψ = id{0,1}A and ψ ◦ φ = idP(A). By the preceding exercise, this showsthat φ and ψ are inverses of one another and both bijections and that the two sets {0, 1}A

and P(A) are in 1-1 correspondence. This correspondence shows why the power set P(A)is often denoted by 2A.

Exercise 18. Define a bijective correspondence B{1,2} → B × B. (Verify that it is abijection.)

This correspondence—if you got it right— allows one to identify B{1,2} with B ×B: i.e.,they may be used interchangeably. That is, the ordered pair (b, b′) may be considered tobe the set {{b}, {b, b′}}, the way we defined it. Or it may be considered to be the functionf : {1, 2} → B given by the rule f(1) = b and f(2) = b′.

2.6. Ordered n-tuples. We may extend this alternative way of looking at ordered pairsto obtain ordered n-tuples and even more general notions as follows: Given any positiveinteger n, we may define the set of all n-tuples of elements of B to be the set B{1,2,...,n},which is usually abbreviated as Bn. However, we do not normally denote its elements viafunction notation but rather in the usual n-tuple form. Thus, if f ∈ B{1,2,...,n}, and we havefunction values f(i) = bi, i = 1, 2, . . . , n, then, instead of f , we write (b1, b2, . . . , bn).

The subscripts used for the n-tuple above are sometimes called indices, and we say thatthe elements b1, b2, . . . , bn are indexed by the set {1, , 2, . . . , n}. This indexing idea canbe extended to sets other than {1, , 2, . . . , n}. In fact, any non-empty set can be used toindex elements of another set. Let I be any such non-empty set, and consider any functionf : I → B. We may write the function value f(i) as bi and then display f as (bi) or as(bi)i∈I . We might informally call this an I-tuple in B, by analogy with the term n-tuple,and refer to the set of functions BI as the set of all I-tuples in B.

For example, suppose we let I be the set of all positive integers {1, 2, 3, . . .} and takeB to be the set of all real numbers R. Then, an I-tuple f ∈ RI can be written as (ri)or (ri)i∈I , or perhaps even as (ri)∞i=1. The student will immediately recognize this as thefamiliar concept of an infinite sequence of real numbers.

If the members of the set B are themselves sets in U , and I is any non-empty set in U ,then an I-tuple (bi) in B is an indexed family of sets, i.e., each bi is a set which is an elementof B. Just as before, we can take the union or intersection of the sets in this family, andwe may write these as

⋃i∈I bi and

⋂i∈I bi, respectively.

2.7. Operations. There is a special kind of function that deserves separate mention. It isknown as an operation. Examples of operations abound in mathematics: ordinary additionof real numbers, vector addition, wedge product, cross product, ordinary multiplication ofnumbers, and so on. Also useful is the notion of an operation of one set on another, with thebest known example being scalar multiplication of vectors. We’ll give a general definitionof these concepts, together with a couple of illustrations.

Let n be any positive integer. An n-ary operation on a set X is defined to be any functionf : Xn → X. Thus, f assigns to every n-tuple (x1, x2, . . . , xn) of elements of X a valuef(x1, x2, . . . , xn) in X.

Usually the notation for an n-ary operation is not the standard function notation. Forexample, addition of n real numbers can be viewed as an operation + : Rn → R. But,given n real numbers r1, r2, . . . , rn, we do not write +(r1, r2, . . . , rn) to denote the sum. Weusually write r1 + r2 + . . . + rn.

9

Two important special cases of n-ary operations are when n = 1 and n = 2. Whenn = 1, we call the operation a unary operation. For example, taking the negative of a realnumber (or the negative of a vector) is a unary operation. The most common operationsin mathematics are for the case n=2. These are the binary operations, such as addition oftwo numbers or vectors, etc. In fact, summing n-numbers or vectors can be done two at atime, so in this sense, binary addition can be said to determine or generate n-ary addition.We’ll see more precise descriptions of how this works later in the course. Similarly formultiplication of numbers and many other important operations. However, it does notwork for all operations; that is, in general, unary and binary operations are not sufficientfor generating all n-ary operations.

An operation of a set Y on a set X is simply a function f : X × Y → X (or sometimes,equivalently, a function Y ×X → X). Note that the target of operations is always the setthat is being operated on (in this case X). So, using f , if we wish to operate on an elementx ∈ X by an element y ∈ Y , we evaluate f at (x, y).

Just as before, we do not usually use the standard function notation f(x, y) for this value.Rather, we may write yfx, or, sometimes, when f is understood, we write simply y · x, oreven more simply yx (juxtaposition).

This is what we do with scalar multiplication of vectors: If V is a real vector space, scalarmultiplication is a function R× V → V . We don’t even give this function a name becausewe usually use juxtaposition: the result of multiplying the vector v ∈ V by the real numberr is rv.

Exercise 19. Show or explain how one can define addition of 5 real numbers using onlybinary addition. Can you give a second definition?

2.8. Equivalence relations. The relation of equality has three important properties whichwe call reflexivity, symmetry, and transitivity. To spell these out in general, let ρ be anyrelation from a set A to itself. We say that ρ is reflexive if aρa for every a ∈ A. We say itis symmetric if, for every a, b ∈ A, aρb ⇒ bρa. Finally, we say that it is transitive if, for alla, b, c ∈ A, aρb and bρc ⇒ aρc. It is obvious that the relation of equality has all of theseproperties.

Here is a somewhat amusing exercise.

Exercise 20. Verify that reflexivity, symmetry, and transitivity may be expressed in thefollowing way:

ρ is reflexive ⇐⇒ =A ⊆ ρ.

ρ is symmetric ⇐⇒ ρ−1 ⊆ ρ.

ρ is transitive ⇐⇒ ρ ◦ ρ ⊆ ρ.

A relation on a set A that is reflexive, symmetric, and transitive is called an equivalencerelation on A.

Given an equivalence relation ρ on a set A, we may use ρ to partition A into certainsubsets. These are usually called ρ-equivalence classes, but we may also call them ρ-orbits

equivalenceclassesfor short. Given a ∈ A, the ρ-orbit of a, written [a]ρ (or simply [a] when it is clear which

equivalence relation we are considering), is defined by

[a]ρ = {b ∈ A : aρb}.

10

By the reflexivity property of ρ, we have a ∈ [a], for every a ∈ A. It follows that⋃

a∈A[a]ρ =A.

Exercise 21. In this exercise, we make use of the set Z of integers {0,±1,±2,±3, . . .},and we assume the standard facts about these numbers. (a) We define a relation ∼ onthe power set P(Z) as follows: If A and B are two sets of integers, let us say that A ∼ Bwhenever B \ A is a finite set. Prove that ∼ is reflexive and transitive but not symmetric.(b) We define a relation © on Z as follows: If a and b are integers, let us say that a© bwhenever −1 ≤ a− b ≤ 1. Prove that © is reflexive and symmetric but not transitive. (c)Is it possible for a relation to be symmetric and transitive but not reflexive? Justify youranswer. (d) Verify that the relation 6= on Z is symmetric but not transitive or reflexive.(e) Here’s a somewhat artificial example of a relation concocted to be reflexive but notsymmetric or transitive. Verify this. The relation, which we’ll denote by o is defined on Zby describing it as a subset of Z× Z as follows:

o = {(a, a) : a ∈ Z} ∪ {(0, 1), (1, 2)}.(f) Describe a relation on Z that is transitive but not reflexive or symmetric.

2.8.1. Examples of equivalence relationsSome of the following equivalence relations will be well known to you and won’t require

much further comment. Others may be new or not very familiar and can be elaboratedmuch further. We will only give the basic definitions here though. At later points in thecourse, we’ll come back to these examples and fill in some finer points.

(a) Let ∆ be the set of all triangles in the plane. The usual relation of congruence oftriangles, often denoted ≡, is an equivalence relation on ∆. The relation of similarityof triangles, sometimes denoted ∼, is also an equivalence relation. Clearly, ≡ ⊆ ∼.

(b) Let S be any set of statements, and let LE(S) denote the set of all possible logicalexpressions in which the atomic statements can be any statements in S. We studiedsuch expressions in The Propositional Calculus. The relation of logical equivalence,⇐⇒ , introduced in The Propositional Calculus, is an equivalence relation on the setLE(S).

(c) Let Z denote the set of integers, and choose any a, b ∈ Z. Then we say that a iscongruent to b mod 2 whenever b−a is divisible by 2, and when this is the case, wecongruence

mod 2 write a ≡2 b. The relation ≡2 is an equivalence relation on Z. There are exactly twoequivalence classes for this relation, one consisting of all the even integers and theother of all the odd integers. These classes are often denoted 0 and 1 respectively,and the set of all (two) ≡2-classes is often denoted by Z2 (cf. The PropositionalCalculus, §3, on mod 2 arithmetic).

A similar definition can be used to define a relation of congruence mod n, for anygiven positive integer n. The relation is often written as ≡n. We say that an integera is congruent mod n to an integer b, written a ≡n b whenever b− a is divisible byn. This is an equivalence relation, and the corresponding quotient set is denoted byZn and consists of exactly n equivalence classes, usually denoted 0, 1, 2, . . . , n − 1.Addition and multiplication can also be defined for these, as will become apparentfrom an exercise below.

11

(d) Let V and W be two vector subspaces of the real vector space Rn, and supposeW ⊆ V . We define a relation ≡W on V by the rule: u ≡W v if and only if u−v ∈ W .This is an equivalence relation. Given any v ∈ V , its ≡W equivalence class consistsof all vectors v + w as w ranges over W . This set is often written as v + W . It issometimes called the affine subspace through v determined by (or parallel to) W .For example, if V = R2 and W is a line through the origin, then v + W is exactlythe line through v parallel to W . In general, the quotient set corresponding to ≡W

is denoted by V/W . One can define the operations of a real vector space on V/W ,the result being called a quotient vector space. This is an important construction inadvanced courses in linear algebra. We’ll return to this kind of example later in thecourse.

(e) Let a and b be real numbers, and consider the ordered pair (a, b). The set of all suchpairs is usually called the Cartesian plane and is denoted by R2. Accordingly, weusually think of the pair as a point in the plane. When either a or b is different from0, then (a, b) determines a unique line in R2: namely, the line through the origin,(0, 0) and (a, b). Let X denote the set of all real pairs (a, b) except for the pair (0, 0).Let us say that two such pairs (a, b) and (a′, b′) are collinear if they determine thesame line through the origin. If that is the case, write (a, b) ∼ (a′, b′). Then ∼ is anequivalence relation on X. Each ∼-orbit corresponds to a unique line through theorigin, and each such line may be given by a point (a, b) in X, hence corresponds tothe ∼-orbit [(a, b)]∼. So, the set of all ∼-orbits can be thought of as the set of alllines in the plane passing through the origin. This set is sometimes called the realprojective line.

(f) One can extend the preceding example to the case of ordered triples of real numbers,or indeed to ordered n-tuples, for any n. Thus, in the case of triples, we look at all(a, b, c), where a, b, c are real numbers at least one of which is not zero. Then wedefine two such triples to be collinear if and only if they lie on the same line throughthe origin. This again defines an equivalence relation, and the set of all equivalenceclasses may be considered to be the set of all lines through the origin in R3. Thisset is sometimes called the real projective plane.

Exercise 22. (a) Verify that ≡n is an equivalence relation.(b) Verify that ≡W is an equivalence relation.(c) Verify that ∼ is an equivalence relation.

2.8.2. PartitionsIn all of the above examples, the ρ-equivalence classes are always disjoint sets (i.e., the

intersection of any two is empty). In the following exercise, you prove that this is noaccident.

Exercise 23. Suppose a, b ∈ A and ρ is an equivalence relation on A. Prove:(a) aρb ⇐⇒ [a]ρ = [b]ρ.(b) ¬(aρb) ⇐⇒ [a]ρ ∩ [b]ρ = ∅.

This exercise shows that either two ρ-orbits are equal, or they are completely disjoint (i.e.,have empty intersection). Thus, the orbits of ρ form a partition of A ( which is a collectionof disjoint subsets of A whose union is A). Another way of describing the collection of sets

12

in a partition of A is that they are mutually exclusive and collectively exhaustive. We callthe partition coming from an equivalence relation ρ the quotient set of ρ, and we denote itby A/ρ. All the examples above exhibit quotient sets. The quotient sets for the first two

quotientsets examples are not very interesting, but the quotient sets for ≡n and ≡W , which we denote

by Zn and V/W above, respectively, both play an important role in algebra. The quotientset X/ ∼ in example (e), which we called the projective line, is often denoted P1. Thecorresponding quotient set in example (f), that is, the projective plane, is denoted P2. Bothof these last examples play an important role in projective geometry.

Given a ρ-orbit, any element in the orbit is called a representative of the orbit (or of theequivalence class). The above exercise indicates that the ρ-orbit may be written as [a] or,equally well, as [a′], where a and a′ are any two representatives of the orbit.

Often, we shall be defining concepts or constructions that involve ρ-orbits, and ourdefinition will make use of one of the representatives of a ρ-orbit. It will be important thento check that the definition is independent of the particular choice of representative; whenthis is the case, the definition is said to be well-posed.1 The following exercise and remarkillustrate how this works in a particular case.

Exercise 24. Suppose that a ≡n a′ and b ≡n b′.

(a) Show that a + b ≡n a′ + b′.(b) Show that ab ≡n a′b′. (Caution: This one is a bit tricky. Try it first in the special

case that a = a′ and then, separately, in the special case that b = b′. Then combinethese cases to obtain the general case.)

This exercise allows us to define addition and multiplication of ≡n-orbits: [a]+[b] = [a+b]and [a][b] = [ab]. The exercise shows that these definitions are well-posed. Convince yourselfof this.2

In a similar way, after proving an analogous exercise for ≡W , one can then define thesum of ≡W -orbits and real scalar multiples of these. An ambitious student might try thisexercise, which is not very different from the foregoing.

Exercise 25. Suppose A is a set, and F is a partition of A. That is, the elements of F aremutually exclusive collectively exhaustive subsets of A (cf., the comments below Exercise23). Define a relation ρ on A as follows:

ρ = {(a, b) ∈ A×A : there is S ∈ F such that a ∈ S and b ∈ S)}.Verify that ρ is an equivalence relation such that A/ρ = F .

This exercise shows that every partition determines an equivalence relation. Exercise 23shows that every equivalence relation determines a partition (namely, the quotient set ofthe relation). It is not hard to see that the rule for going from an equivalence relation to apartition in Exercise 23 is exactly the inverse of the rule for going from a partition to anequivalence relation in this exercise. This shows that the concepts of equivalence relation andthat of partition are basically the same. One focuses on the relational aspect of the concept,the other focuses on the orbits (or equivalence classes). They can be used interchangeably,but usually the context makes one more convenient to use than the the other.

1See the appendix to this chapter for a more detailed discussion of well-posed definitions.2See example (e) in the appendix to this chapter.

13

2.9. Order relations. We deal with order relations (also known as “orderings” or “orderingrelations”) very early in our lives, indeed as soon as we learn one of the words “more, less,before” or “after” (and its meaning). In mathematics, we encounter ordering first whenwe count the positive integers 1, 2, 3, . . ., etc., and we quickly understand from this theaccompanying ordering of rational numbers, and real numbers. There are various kinds oforderings in mathematics, some with very subtle properties. Here we give only the basicdefinitions and some simple examples of orderings.

2.9.1. Partial OrderingsA relation ρ is called antisymmetric provided that whenever aρb and bρa, it follows that

a = b. The usual ordering relations mentioned above are clearly antisymmetric.A relation ρ on a set A is called a weak partial order on A (or more simply a partial

order on A) provided it is reflexive, transitive, and antisymmetric. We often denote partialorderings by symbols similar to the usual “less than or equal” symbol, ≤, for example, by¹.

The standard ordering of the real numbers (or rational numbers, or integers, etc.) is anexample of a partial ordering. Notice that, for two real numbers r and s, it is always thecase that either r ≤ s or s ≤ r; we often express this by saying that any two reals arecomparable. The same need not be true for partial orders in general, however, as the firstof the following examples shows.(a) Let B be any set, and consider its power set P(B). The relation of set inclusion is apartial order on P(B). Thus, we may think of ⊆ as the set of all ordered pairs of the form(S, T ), where both S and T are subsets of B and S ⊆ T . The reader should quickly checknow that ⊆ is indeed, reflexive, transitive, and antisymmetric. Notice, however, that, ingeneral, two subsets of B need not be comparable. For example, if B consists of at leasttwo elements, say a and b. Then neither set inclusion {a} ⊆ {b} nor {b} ⊆ {a} is true.(b) Consider the following relation ρ defined on the set R2 of all ordered pairs of realnumbers: ρ = {((r, s), (t, u)) ∈ R2 × R2 : r ≤ t}. Convince yourself that ρ is reflexive andtransitive but not antisymmetric. So, ρ is not a partial order on R2.(c) Let λ be the relation on R2 given by the rule:

(r, s)λ(t, u) ⇐⇒((

(r ≤ t) ∧ (r 6= t)) ∨ (

(r = t) ∧ (s ≤ u)))

.

Exercise 26. Convince yourself that this is a partial order on R2 with respect to whichany two pairs are comparable. It is known as the lexicographical order.

A partial order for which any two elements are comparable is called a linear order. Thus,linearorderingsthe lexicographical order on R2 is linear, as is the usual order on R.

If ¹ is a partial order on a set A, we may wish to consider this relation only in connectionwith elements of some subset of A, say B ⊆ A. That is, we may wish to restrict ¹ toelements of B. We may denote this restriction by ¹ |B and call it the ordering induced by¹, or the induced ordering. However, for brevity we often use the same symbol ¹ and relyon the context to make the point that we are restricting attention to elements of the subsetB.

Strictly speaking, the definition of ¹ |B is given by

¹ |B = ¹ ∩ (B ×B).

14

Sometimes, we may wish to reverse the above procedure. That is, we may have a set A,a subset B ⊆ A, and a partial order ¹ on B. Then, we may wish to consider some partialorder ¹′ on A such that ¹ = ¹′ |B. We may either call ¹ the restriction of ¹′ to B or ¹′the extension of ¹ to A. We do exactly this when we consider the usual orderings of thereal numbers and the integers. The former is an extension of the latter

In the case of the usual ordering of the real numbers, we sometimes say s ≥ r to meanthe same thing as r ≤ s. That is, we use the inverse ordering. Sometimes this usage is justrandom, sometimes the context makes one or the other seem more appropriate. We shalldo the same thing in general. Thus, we may say b º a to mean the same thing as a ¹ b,where it is understood that º is the inverse of ¹.

A set A, together with a given partial order on A is often called a poset (an obviousabbreviation of partially ordered set).

2.9.2. Strict partial orderings A relation ≺ on a set A, is said to be irreflexive if, forno a ∈ A do we have a ≺ a. If ≺ is both irreflexive and transitive, then we call it a strictpartial order (or strict partial ordering) on A.

Exercise 27. (a) Start with a weak partial order ¹ on A, and define a relation ≺ on A bythe rule: a ≺ b if and only if a ¹ b and a 6= b. Show that ≺ is a strict partial order on A.It is called the strict ordering associated with ¹.

(b) Start with a strict partial order ≺ on A, and define a relation ¹ on A by the rule:a ¹ b if and only if a ≺ b or a = b. Show that ¹ is a weak partial order on A. It is calledthe weak partial order associated with ≺

(c) Show that the constructions in (a) and (b) are inverses of one another. That is showthat, if we start with a weak partial order, form the associated strict order as in (a), andthen use that to form the associated weak partial ordering, as in (b), we get the originalpartial order back. And, if we start with a strict partial order, form the associated weakpartial order as in (b), and then use that to form the associated strict partial order, as in(a), we get the original strict order back.

We shall feel free to make use of the notation used in this exercise. That is, if we aredealing with a weak partial order denoted ¹, then we shall use ≺ to refer to the associatedstrict ordering, and vice versa.

Exercise 28. Suppose that ρ is both an equivalence relation and a partial order on the setA. Prove that ρ is equal to the identity relation =A

Exercise 29. Suppose that A is a set with a partial ordering ¹ and that B is a subset ofdense

subsets A. We say that B is dense in A if, for any elements a and a′ of A satisfying a ≺ a′, thereexists an element b ∈ B such that a ¹ b ¹ a′.

(a) For example, in the case of the real numbers with their usual ordering, the set ofrational numbers, usually denoted Q, is dense in the reals because any two realscontain a rational number between them. Give an argument that demonstrates thisassertion. (You may use the usual decimal definition of real numbers, together withthe standard facts you know about decimals.)

(b) Is Q2 dense in R2 with respect to the lexicographical order? If so, prove it; if not,explain why not.

(c) Consider the set of natural numbers N = {0, 1, 2, 3, . . .} endowed with the restrictionof the usual ordering of the reals. Suppose that S is a subset of N such that thecomplement of S in N contains no pair of consecutive numbers.

15

(i) Show that S is dense in N.(ii) Show that any subset that is dense in N must have this property.

3. Cardinality

One of the most important and useful ways to compare two sets is in terms of their relativesize, that is in terms of what we think of as the number of elements in the sets. However, theconcepts of “number” and “size” are far more sophisticated than the elementary conceptsand constructions to which we have restricted ourselves in this initial development. In fact,as the next chapters will show, we want to use set theory and logic to develop the conceptof number. So, we must seek a way to compare the so-called sizes of sets using only theconcepts already developed. This leads us to the notions of equipotence and cardinality.

Given any two sets S and T , we say that S is equipotent to T and write S ≈ T wheneverthere exists a bijection S

f→ T . When this is the case, f−1 is a bijection T → S, soequipotenceequipotence is a symmetric relation. It is clearly reflexive since every set S has an identity

map, which is a bijection S → S. The student should check that equipotence is transitive.Therefore, ≈ determines an equivalence relation on the collection of all sets in our universe.Given any such set S we call the ≈-equivalence class of S the cardinality of S and we denoteit by card(S).

This notion of equipotence is our substitute for sequential enumeration or counting theelements of a set. In fact, counting the elements of a set S amounts to using a preferredset of so-called natural numbers, and then establishing a bijection between the naturalnumbers, or some subset thereof, and the set S. The concept of equipotence is simpler sinceit makes no reference to a preferred set. It is simply a formal version of the time-honoredprocess of checking that two sets have equally many members by pairing them off.

Using our earlier discussion about equivalence relations, we can write, for any two sets Sand T ,

card(S) = card(T ) ⇐⇒ S ≈ T.

3.1. Ordering cardinalities. We can define an order relation on the set of cardinalitiesas follows: We say that

card(S) ¹ cardT ) ⇐⇒ there is an injective function f : S → T.

Exercise 30. Verify that this definition is well-posed, i.e., that it does not depend on thechoices of representatives S and T of the equipotency-classes card(S) and card(T ).

It is almost obvious that ¹ is reflexive and transitive; we leave it to the reader to checkthis. However, the property of antisymmetry is highly non-trivial. In fact, it’s an importantenough fact that it has a proper name:

The Schroder-Bernstein Theorem: If S and T are sets for which there are injectionsS → T and T → S, then there exists a bijection S → T . Equivalently, if card(S) ¹ card(T )and card(T ) ¹ card(S), then card(S) = card(T ) (which is the antisymmetry property forthe relation ¹).

It follows that the relation ¹ is a partial ordering of the set of cardinalities.

The Schroder-Bernstein Theorem implies a principle used in combinatorics known as thepigeon-hole principle. This asserts that if S and T are sets with #(T ) ≺ #(S), then

pigeon-holeprinciple

16

there is no injective function S → T . (For the existence of such a function would imply#(S) ¹ #(T ), which, together with Schroder-Bernstein and the original hypothesis, leads tothe conclusion #(S) = #(T ). But this contradicts the original hypothesis #(T ) ≺ #(S).)In the context of combinatorics—i.e., when we are dealing with finite sets— this meansthat when #(T ) ≺ #(S) whatever function we define from S to T (whatever T-pigeon-holeaddresses we assign to our S-pigeons), at least one element of T will be the target of morethan one element of S ( at least one pair of pigeons will have to share the same pigeon-hole).

Another non-trivial theorem asserts the following:

Theorem: If S and T are any two sets, then either there is an injection S → T , or thereis an injection T → S. In other words, for any sets S and T , either card(S) ¹ card(T ) orcard(T ) ¹ card(S)

Since this theorem asserts that any two cardinalities are comparable, it follows that ¹ isa linear order on the set of all cardinalities.

It’s worth thinking for a moment about how remarkable this fact is. The universe ofsets that we consider in mathematics is vast, as large as anyone’s imagination can reach.And yet, despite all this diversity, given any two sets in the universe, their cardinalities arecomparable. That is, there must exist an injective map from one of the sets to the other. Ifyou think this is not hard to establish, try to define an injective map from, say, the set ofall 50× 50 matrices of complex numbers to the set of all curves in the plane, or vice versa.The preceding theorem says that such a map exists not only for this pair of sets but for allconceivable pairs of sets.

We do not give proofs of these theorems. Indeed, the proof of the second theorem requiresa property of sets and functions that is substantially more sophisticated than what we havebeen discussing here and is usually assumed as an axiom about sets. Advanced courses inset theory develop these ideas.

However, both of the above theorems are relatively easy to prove in the case of finite sets.They gain most of their power and interest from the fact that they hold for all sets, finiteor not.

4. Finite and Infinite sets

The concluding sentence of the previous section points out a gap in our presentation ofset theory. We have talked about finite and infinite sets throughout the chapter, alwaysto illustrate ideas or to present exercises. We have used these terms informally, intuitively,but not given a formal definition. Since our intuitive concept of, say, the natural numbers,requires that there be infinitely many of them, and since we want to construct the naturalnumbers in a relatively rigorous, formal way from more basic set-theoretic concepts, we needat the very least to have a formal definition of the concepts finite and infinite. Moreover,these concepts should be provably compatible with our just-defined concept of cardinality.Accordingly, we offer the following definition:

Definitions: (a) A set S is is said to be finite if and only if every injection f : S → S isa bijection.

(b) A set S is infinite if and only if it is not finite: that is, there exists an injectionf : S → S that is not a bijection.

17

Note: The injection f appearing in the definition is required to be a function with bothdomain and codomain equal to S, i.e. f : S → S.

The definition given above is expressed entirely in terms of the basic concepts about setsthat we introduced earlier in this chapter; so that’s good. However, they may require alittle bit of intuitive justification.

So consider the definition of finite set in part (a). Let’s see how this is compatible withour intuition. In this example, to emphasize that we are talking about our intuitive conceptof finite set and not the formal definition of finite set just given, we’ll enclose the word“finite” in quotes. We’ll do the same for our intuitve concept of infinite discussed in thenext paragraph. Now suppose S is some well-known set that we know is “finite”: say S ={1, 2, 3, 4, 5, 6, 7, 8, 9}. Let f : S → S be any injection. Then, the range R(f) has exactly 9elements, since we can count them explicitly: f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9).All of these are distinct because f is injective. If f were not a bijection, then it would not besurjective, which means that S would contain at least one more member, i.e., there wouldbe at least 10 elements in S, which is clearly false. So, f has to be bijective. Of course, thesame argument can be applied to “finite” sets with any number of elements, and so it givesan intuitive justification for part (a) of the definition.

Part (b) of the definition is logically equivalent to part(a). However, a separate intuitivejustification of part (b) would add more support to the definition. We can do this asfollows. Let us consider any well-known “infinite” set: for convenience, let us take the setN of natural numbers, {0, 1, 2, . . .}. Define a function f : N→ N by the rule f(n) = n + 1.Thus, f(0) = 1, f(1) = 2, etc. It is easy to check that f is an injection but not a bijection.So, N satisfies part (b) of the definition. A slight modification of this construction of f canbe applied to any set S that we think of as infinite. So any such S satisfies part (b) of thedefinition, that is it satisfies the formal definition of being an infinite set.

Note that a mathematical definition does not logically require intuitive justification. Itmay simply be posited for no reason whatever and then explored as to its consequences.However, this casual approach does not usually lead to interesting mathematics. Here, weare goal-oriented: we want to use the formal definition in ways that accord with and supportour intuition. In this way, the foregoing intuitive considerations provide important justifi-cation for the definition, even though the consequences of the definition are not logicallylinked to the justification.

The following exercises show that the definition is compatible with the notion of cardi-nality introduced earlier.

Exercise 31. Suppose that f : A → B is a bijection.(a) Prove that A is infinite if and only if B is infinite.(b) Conclude from (a) that A is finite if and only if B is finite.

This exercise shows that if sets A and B have the same cardinality, then one set is finiteif and only if the other is. This means that the property of finiteness depends only on the

finiteandinfinitecardinals

cardinality of the set; similarly for the property of being infinite. Therefore, we may applythe terms “finite” and “infinite” directly to the cardinalities themselves. That is, we maysay that card(A) is, say, infinite instead of saying A is infinite.

Exercise 32. (a) Suppose that card(A) ¹ card(B). Prove that if card(A) is infinite,then card(B) is infinite.

18

(b) Again suppose that card(A) ¹ card(B). Conclude from (a) that if card(B) is finite,then card(A) is finite.

It follows immediately from the above exercise that if T is any finite set and S is anarbitrary set, then S ∩ T is finite, since S ∩ T ⊆ T . It is slightly more involved to provethat the union of two finite sets is finite, though this is clearly intuitively true. Instead, weprove the following special case, which will be useful later.

Proposition 2. Let X be a finite set and y any object in our universe. Then the setX ∪ {y} is finite. Moreover, whenever y 6∈ X, card(X) ≺ card(X ∪ {y})Proof. : To prove the first statement, we proceed by contradiction. Suppose X ∪ {y} isinfinite, so that there is an injective map h : X ∪ {y} → X ∪ {y} that is not bijective.

Case 1 : h(y) = y. No value h(x), for x ∈ X, can equal y, because h(y) = y and h isbijective. So, the restriction of h to X is a function with range in X. Let us denote thisrestricted function by h′ and write h′ : X → X. h′ is injective, since h is. Now, since weassume that h is not bijective, and we are also assuming that y ∈ R(h), there must be somex that belongs to X but is not in R(h). This x cannot be of the form h′(x′) for any x′ ∈ Xbecause this value is defined to be h(x′). So, h′ is not bijective. This implies that X isinfinite, contradicting the given finiteness of X.

Case 2 : h(y) 6= y. The exercise at the end of this proof asserts that, given any twoelements a and b in a set S, there is a bijection f : S → S such that f(a) = b and f(b) = a.Apply this result to the case S = X ∪ {y}, a = y and b = h(y). Consider the compositionfh : X ∪{y} → X ∪{y}. Since f is bijective and h is injective, the composition is injective,from earlier results. Moreover, fh(y) = f(h(y)) = y. So, the function fh satisfies theconditions of Case 1. Using the argument in Case 1, now with fh in place of h, we againobtain the contradiction that X is infinite.

This concludes the proof of the first statement.

Before proving the second statement, we first define the function j : X → X ∪{y} by therule j(x) = x. This is clearly injective, so card(X) ¹ card(X ∪ {y}).

Now assume y 6∈ X. We proceed to prove the second statement by contradiction. Supposecard(X) ≺ card(X ∪ {y}) is false. In light of the preceding paragraph, the only way thiscan happen is if card(X ∪ {y}) = card(X)), so let h : X ∪ {y}) → X be a bijection. Thecomposition hj is an injective map X → X. It cannot be bijective because h(y) is notin its image. (The reader should verify this.) Therefore, X is infinite, contradicting ourassumption about X.

¤

Exercise 33. Let S be a set containing at least two elements, say a and b. Prove that thereexists a bijection f : S → S such that f(a) = b and f(b) = a.

Appendix A. Well-posed definitions

In mathematics we often devise some notation and terminology to denote a mathematicalobject or interrelated collection of objects that we wish to use frequently.

19

Sometimes this is simply a matter of temporary convenience. For example, you may begiven a problem to prove something about a function or type of function. So, you say “Letf be a function of such and such a type. . . ,” and you use the symbol “f” throughout yourproof. This is a useful but not very deep case of mathematical definition.

Frequently, however, more than convenience is involved. Mathematics is often a processof mental “bootstrapping,” whereby problems are solved or theorems proved only after newconcepts are introduced and studied. This introduction of a new concept is another, muchdeeper and more important, case of mathematical definition. For an extreme example,the three famous geometric problems of antiquity—trisecting angles, doubling the cube,squaring the circle—remained unsolved for over 2000 years. Not until Arab mathematiciansintroduced the many new concepts and techniques involved in trigonometry and algebra inthe Middle Ages did these geometric problems come within grasp. And even then, it tookhundreds of years of development before a nineteenth century French mathematician wasable to prove unequivocally that the required constructions were impossible to do.

These examples suggest that mathematical definitions appear at all levels in mathematics,and their roles range from that of a simple convenience to that of an important tool forunifying, highlighting and often introducing new mathematical concepts. Obviously, then,when we give a definition in mathematics, it is important that the definition be meaningful,clear and unambiguous. When it fails to be any one of these, we say that it is not well-posed.We may also say, alternatively, that the concept in question is not well-defined.

Now we become more specific and give several examples of attempted definitions thatare not well-posed. As the foregoing chapter indicates, mathematical objects are all of acertain limited number of kinds. There are elements of sets, there are sets themselves,there are relations, there are functions, and there are operations.3 We’ll give an example ofa non-well-posed definition in each case:

(a) Suppose that we try to define an element r by saying that it is “the largest realnumber < 1.” This is an attempt to define a single element (by calling it thelargest...) of a certain set: namely the set of all real numbers < 1. The set iscertainly clearly defined, but the element is not well-defined. For, if x is any realnumber < 1, then (x+1)/2 is also such an real number, and it is plainly larger thanx. So, this definition does not make sense, i.e., it is not meaningful.

(b) Suppose we try to define a set S in our universe by describing it as the set of allobjects x in the universe such that x 6∈ x. Suppose this truly is a set in our universe.Then, for every object y in our universe, we must be able to settle the question,“Is y ∈ S?” If all such questions can be settled, then let us focus on one particularquestion by setting y = S. That is, we ask “Is S ∈ S?” If so, then, by the descriptionof S, we must have S 6∈ S, a contradiction. Therefore, since the assumption thatS ∈ S led to a contradiction, we must have S 6∈ S. But this means that the objectS satisfies the condition to be a member of S. So, S ∈ S. Another contradiction.

Therefore, our attempted definition of S gives rise to something that has some ofthe trappings of a set (e.g., we talk about S as the set of all objects that satisfy a

3To be sure, all of the mentioned objects can be construed to be certain kinds of sets. However, in commonmathematical usage, we distinguish elements from sets, sets from functions, functions from operations, andso on.

20

certain property), but it lacks the key property that a set must have: namely, if Sis a set in our universe, then we must be able to determine whether or not y ∈ Sfor each y in the universe, in particular for the object y = S. One or the other ofthese must hold, and not both. But that’s exactly what we have just shown does nothappen: neither is true! So, either our definition of S is not meaningful (i.e., doesnot specify a meaningful object of discourse) or it is meaningful, but the specifiedobject S is not a set. In either case, we have not defined a set.

The foregoing example is based on the famous Russell’s Paradox.

(c) Getting closer to earth, consider the real valued function f of a real variable x givenby the equation f(x) = 1/(x − 2). This is not well-defined, because when x = 2(which was not excluded in our definition), we obtain the function value 1/0, whichis not a meaningful expression of a real number.

Let us try a different function: Say that we attempt to define g as a real-valuedfunction of a real variable by setting g(x) =

√x. Obviously this is not defined for

negative x. So we cannot get a real-valued function of a real variable unless we ruleout negative x.

Of course, in these two examples, the definitions are easily fixed so as to becomewell-posed by just suitably restricting the domain of the function. In the first case,the domain should be the set of all real number x 6= 2, and in the second case, thedomain should be the set of all non-negative reals.

Easy as this may be, it does point out the importance of being clear about whatthe domain of the function is when you are attempting to define the function.

(d) There is another important type of example involving functions. Suppose we wishto define a function, which we write in classical notation as y = f(x), by defining itvia the equation y2 = x3 − 3x + 10. If we want both y and x to be real variables,then we must restrict x so that the right-hand side is non-negative, since y2 is alwaysnon-negative. This is similar to the examples above, which were fixed by restrictingthe domain. Let’s not go into details here and suppose this is done. But anotherproblem remains. For any positive value of the right-hand side, there are two valuesof y that will work. So, the rule does not define a single function or single-valuedfunction. It does, however, define a relation. Indeed, the curve in R2 specified by theequation is precisely this relation. Parts of this curve can be interpreted as graphs ofwell-defined functions: in particular in this case, the portion of the curve in the upperhalf-plane corresponds to the case in which we restrict y by the condition y ≥ 0, andthe portion in the lower half plane corresponds to the negative case, y ≤ 0. Thesemay be explicitly written as y =

√x3 − 3x + 10 and y = −√x3 − 3x + 10. Things

get more complicated for more general algebraic curves, so it is not always possibleto get explicit formulas like this. However, even for general curves, the ImplicitFunction Theorem shows how we may interpret pieces of the curves as graphs offunctions (usually without nice formulas though). Suffice it to say here, an equationby itself is not always sufficient for giving a well-posed definition of a function.

(e) Next we give an example of a set and two purported binary operations on the set.One of the operations will be well-defined, and the other will not.

21

First we define the set. We begin with R2, the usual Cartesian plane. We definean equivalence relation on R2 as follows: Given points P and Q in R2, we’ll writeP ∼ Q if and only if there is a line L of slope 3 such that P ∈ L and Q ∈ L. It iseasy to check that ∼ is reflexive, symmetric and transitive, i.e., it’s an equivalencerelation. (You should check this!) In terms of analytic geometry, writing P = (a, b)and Q = (c, d), we have

P ∼ Q ⇔ d− b = 3(c− a).

The relation ∼ then partitions R2 into equivalence classes. The set of all equiv-alence classes, that is, the quotient set, is usually denoted by R2/ ∼, but we shalldenote it by X for short. You can visualize each equivalence class as a line in R2 ofslope 3. Every such line corresponds to an equivalence class. So, the set X can bethought of as the set of all slope 3 lines in the plane.

Note that each such line is completely determined once we specify a point on theline (remember the “point-slope formula”). This is in accord with the fact that theequivalence class of the point represented by the pair (a, b) is completely determinedby (a, b). As usual, we denote this equivalence class (or line) by [(a, b)].

Our two operations will be defined on the set X. Recall that an operation is atype of function. So, for it to be well-defined, for any given input, there must be asingle, unambiguous output.

The operations will be denoted ⊕ and ⊗. To define them, we choose two arbitraryequivalence classes [(a, b)] and [(c, d)]. The definitions go as follows:

[(a, b)]⊕ [(c, d)] = [(a + c, b + d)], and

[(a, b)]⊗ [(c, d)] = [(ac, bd)].These attempted definitions look very similar, but only the first one is well-posed.

We’ll check this first.

Let us make another selection of points on the same two lines, say (a′, b′) on theline [(a, b)] and (c′, d′) on the line [(c, d)]. Of course, by our choices, the equivalenceclass [(a′, b′)] is the same as [(a, b)] and [(c′, d′)] = [(c, d)]. Now the right-handside of the equation in the definition of ⊕, namely [(a + c, b + d)], is expressed interms of a, b, c, d, which depend on our choices of the representative points (a, b) and(c, d). If we had chosen the other representatives (a′, b′) and (c′, d′), the value of[(a′, b′)] ⊕ [(c′, d′)] would have been defined to be [(a′ + c′, b′ + d′)]. Fine, providedthat that value is the same as [(a+c, b+d)]! (Same inputs require the same outputs.)So, we have to check that they are indeed the same. In this case, it’s easy, as wenow show.

We know that (a, b) and (a′, b′) are on the same line of slope 3, so b′−b = 3(a′−a).Similarly, d′ − d = 3(c′ − c). Adding these two equations and rearranging terms, weget (b′ + d′) − (b + d) = 3((a′ + c′) − (a + c)). This is precisely the condition for(a′+c′, b′+d′) ∼ (a+c, b+d), or, what amounts to the same thing, [(a′+c′, b′+d′)] =[(a + c, b + d)].. So, we have proved that the outputs resulting from the two choicesof representatives are the same.

So the operation ⊕ is well-defined on X.

22

Now the definition of the operation ⊗ looks very similar to that of the operation⊕, at least superficially. But, in point of fact, ⊗ is not well-defined. This is not hardto prove. We note that the equation giving the purported definition is supposedto be valid for all possible choices of representative ordered pairs. So, to provethat the definition is not well-posed, we need only to come up with one choice ofrepresentatives for which the equation fails. So, let’s just choose any two points Pand Q, say P = (1, 2) and Q = (−1, 5), and then let’s choose two other points P ′and Q′ on the same lines, respectively: say P ′ = (3, 8) and Q′ = (−2, 2). (To checkthat P and P ′ are on the same slope 3 line, just compute 8− 2 = 3(3− 1). Similarlyfor Q′ and Q: 2-5=3(-2-(-1)).) Now let’s apply the definition of ⊗ to the equivalenceclasses [(1, 2)] and [(−1, 5)] and then to the same classes written in terms of theother representatives, [(3, 8)] and [(−2, 2)]. We get:

[(1, 2)]⊗ [(−1, 5)] = [(−1, 10)], and

[(3, 8)]⊗ [(−2, 2)] = [(−6, 16)].Remember, the “inputs” on the left-hand sides of these equations are the same.

But you can check that the two equivalence classes on the right-hand sides of theequations are not equal: the displayed representatives do not lie on the same slope3 line.

Therefore, the same inputs led to different outputs: ⊗ is not well-defined. In thisproof, we needed to produce only two specific inputs that yield different outputs.Other choices would very likely have been just as good.

We have two reasons for giving these very similar-looking operations and goinginto great detail to prove that one is well-defined and one is not.

First, we want to illustrate how one goes about checking whether or not a defi-nition involving equivalence classes is well-posed. One cannot be content with thesuperficial form of the definition. Rather one has to go back to the original set andwork with the representatives of the equivalence classes. To prove that the defini-tion is well-posed, one has to prove that certain equations or properties hold for allchoices of representatives. To show that the definition is not well-posed, one has onlyto come up with a single choice of elements for which the equations or properties donot hold. (See The Predicate Calculus, §14.)

Secondly, although the particular example we have given may seem to be con-trived or obscure, in point of fact it is typical of constructions and definitions thatoccur again and again in more advanced mathematics. The formation of quotientsets and operations on them is an essential tool for constructing a wide variety ofmathematical structures.

set theory - cornell universitypi.math.cornell.edu/~web3040/setsandfunctions_s11.pdf ·...

Documents