

Math Camp for Economists

    Daniel A. Graham

    August 14, 2009

Copyright © 2007-2008 Daniel A. Graham


Contents

Preface

1 Linear Algebra
  1.1 Real Spaces
  1.2 Combinations of Vectors
  1.3 The Standard Linear Equation
  1.4 Separating and Supporting Hyperplanes
  1.5 Answers

2 Matrix Algebra
  2.1 Linear Spaces
  2.2 Real Valued Linear Transformations and Vectors
  2.3 Linear Transformations and Matrices
  2.4 Invertible Transformations
  2.5 Change of Basis and Similar Matrices
  2.6 Systems of Linear Equations
  2.7 Square Matrices
  2.8 Farkas' Lemma
  2.9 Answers

3 Topology
  3.1 Counting
  3.2 Metric Spaces
  3.3 Topological Spaces
  3.4 Sigma Algebras and Measure Spaces
  3.5 Answers

4 Calculus
  4.1 The First Differential
  4.2 Quadratic Forms
  4.3 The Second Differential
  4.4 Convex and Concave Functions
  4.5 Answers

5 Optimization
  5.1 Optimization
  5.2 The Well Posed Problem
  5.3 Comparative Statics

6 Dynamics
  6.1 Dynamic Systems
  6.2 Systems of Linear Differential Equations
  6.3 Liapunov's Method

Notation

Using Mathematica
  Basics
  Input Expressions
  Symbols and Numbers
  Using Prior Expressions
  Commonly Used Commands

Bibliography

List of Problems

Index

Colophon


    Preface

The attached represents the results of many years of effort to economize on the use of my scarce mental resources, particularly memory. My goal has been to extract just those mathematical ideas which are most important to Economics and to present them in a way that emphasizes the approach, common to virtually all of mathematics, that begins with the phrase "let X be a non-empty set" and goes on to add a pinch of this and a dash of that.

I believe that Mathematics is both beautiful and useful and, when viewed in the right way, not nearly as complicated as some would have you believe. For me, the right way is to identify the links connecting the ideas and, whenever possible, to embed them in a visual setting.

The reader should be aware of two aspects of these notes. First, intuition is emphasized. While "Prove Theorem 7" might be a common format for exercises inside courses, "State an interesting proposition and prove it" is far more common outside courses. Intuition is vital for such endeavors. Secondly, use of the symbolic algebra program Mathematica is emphasized for at least the following reasons:

Mathematica is better at solving a wide variety of problems than you or I will ever be. Our comparative advantage is in modeling, not solving.

Mathematica lowers the marginal cost of asking "What if?" questions, thereby inducing us to ask more of them. This is a very good thing. One of the best ways of formulating conjectures about what might be true, for instance, is to examine many specific cases, and this is a relatively cheap endeavor with Mathematica.

Mathematica encourages formulating solution plans and, in general, top-down thinking. After all, with it to do the heavy lifting, all that's left for us is to formulate the problem and plan the steps. This, too, is a very good thing.

Why Mathematica and not Maple, another popular symbolics program? While there are differences, both are wonderful programs and it would be difficult to argue that either is better than the other. I've used both and have a slight personal preference for Mathematica.

    Dan Graham

    Duke University


    Chapter 1

    Linear Algebra

1.1 Real Spaces
  1.1.1 Equality and Inequalities
  1.1.2 Norms, Sums and Products
1.2 Combinations of Vectors
  1.2.1 Linear Combinations
  1.2.2 Affine Combinations
  1.2.3 Convex Combinations
1.3 The Standard Linear Equation
1.4 Separating and Supporting Hyperplanes
1.5 Answers

The following is an informal review of that part of linear algebra which will be most important to subsequent analysis. Please bear in mind that linear algebra is, perhaps, the single most important tool in Economics and forms the basis for many other important areas of mathematics as well.

    1.1 Real Spaces

Recall that the Cartesian product of sets, e.g. Capital Letters × Integers × Lower Case Letters, is itself a set composed of all ordered n-tuples of elements chosen from the respective sets, e.g., (G, 5, f), (F, 1, a) and so forth. Note that no multiplication is involved in forming this product. Now introduce the set consisting of all real numbers, denoted R and called the reals, and real n-space is obtained as the n-fold Cartesian product of the reals with itself:


R^n ≡ R × R × ... × R (n times) = {(x_1, x_2, ..., x_n) | x_i ∈ R, i = 1, 2, ..., n}

Figure 1.1: Vectors in R^2, showing (6, -3) and (-4, 5)

The origin is the point (0, 0, ..., 0) ∈ R^n. It will sometimes be written simply as 0 when no confusion should result. An arbitrary element of this set, x ∈ R^n, is sometimes called a point and sometimes called a vector, and x_i is called the i-th component of x. The existence of two terms for the same thing is due, no doubt, to the fact that it is sometimes useful to think of x = (x_1, x_2), for example, as a point located at x_1 on the first axis and x_2 on the second axis. Other times it is useful to think of x = (x_1, x_2) as a directed arrow or vector with its tail at the origin, (0, 0), and its tip at the point x = (x_1, x_2). See Figure 1.1. It is important to realize, on the other hand, that it is hardly ever useful to think of a vector as a list of its coordinates. Vectors are objects and better regarded as such than as lists of their components.

    1.1.1 Equality and Inequalities

Recall that a relation (or binary relation), R, on a set S is a mapping from S × S to {True, False}, i.e., for every x, y ∈ S, xRy is either True or False. This is illustrated in Figure 1.2 for the relation > on R. Note that points along the 45-degree line, where x = y, map into False.

Figure 1.2: The Relation > on R

Supposing that x, y ∈ R^n, several sorts of relations are possible between these vectors. The vector x is equal to the vector y when each component of x is equal to the corresponding component of y:

Definition 1.1. x = y iff x_i = y_i, i = 1, 2, ..., n.

The vector x is greater than or equal to the vector y when each component of x is at least as great as the corresponding component of y:

Definition 1.2. x ≥ y iff x_i ≥ y_i, i = 1, 2, ..., n.

The vector x is greater than the vector y when each component of x is at least as great as the corresponding component of y and at least one component of x is strictly greater than the corresponding component of y:

Definition 1.3. x > y iff x ≥ y and x ≠ y.

Problem 1.1. [Answer] Suppose x > y. Does it necessarily follow that x ≥ y?

The vector x is strictly greater than the vector y when each component of x is greater than the corresponding component of y:

Definition 1.4. x ≫ y iff x_i > y_i, i = 1, 2, ..., n.

Problem 1.2. [Answer] Suppose x ≫ y. Does it necessarily follow that x > y?


These definitions are standard and they conform to the conventional usage in the special case in which n = 1. The distinctions are illustrated in Figure 1.3. The shaded area in the left-hand panel represents the set of vectors, y, for which y ≫ (4, 3). Note that (4, 3) does not belong to this set, nor does any point directly above (4, 3), nor any point directly to the right of (4, 3). The shaded area in the right-hand panel illustrates the set of vectors, y, for which y ≥ (4, 3). This shaded area differs by including (4, 3), points directly above (4, 3) and points directly to the right of (4, 3). Though not illustrated, the set of y's for which y > (4, 3) corresponds to the shaded area in the right-hand panel with the point (4, 3) itself removed.

Figure 1.3: Vector Inequalities: y ≫ x (left) and y ≥ x (right)

Problem 1.3. A relation, R, on a set S is transitive iff xRy and yRz implies xRz for all x, y, z ∈ S. (i) Is = transitive? (ii) What about ≥? (iii) ≫? (iv) >?

Problem 1.4. A relation, R, on a set S is complete if either xRy or yRx for all x, y ∈ S. When n = 1 it must either be the case that x ≥ y or that y ≥ x (or both). Thus ≥ on R^1 is complete. Is ≥ on R^n complete when n > 1?

Problem 1.5. [Answer] Consider the case in which n = 1 and x, y ∈ R^1. Is there any distinction between x > y and x ≫ y?

    1.1.2 Norms, Sums and Products

The (Euclidean) norm or length of a vector, by an obvious extension of the Pythagorean Theorem, is the square root of the sum of the squares of its components.

Figure 1.4: Vector Norms in R^1, R^2 and R^3: ||(-3)|| = 3, ||(3, 4)|| = 5, ||(2, 4, 4)|| = 6


Definition 1.5. ||x|| ≡ (x_1^2 + x_2^2 + ... + x_n^2)^{1/2}

Note that the absolute value of a real number and the norm of a vector in R^1 are equivalent: if a ∈ R then |a| = (a^2)^{1/2} = ||a||. The norms of vectors in R^2 and R^3 are illustrated in Figure 1.4. The extensions to higher dimensions are analogous.

If x, y ∈ R^n, the dot or inner product of these two vectors is obtained by multiplying the respective components and adding.

Definition 1.6. x · y ≡ x_1 y_1 + x_2 y_2 + ... + x_n y_n = Σ_{i=1}^n x_i y_i

There is a very important geometric interpretation of this dot product. It can be shown that

x · y = ||x|| ||y|| cos θ     (1.1)

where θ is the included angle between the two vectors. Recall that the cosine of θ is bigger than, equal to or less than zero depending upon whether θ is less than, equal to or greater than ninety degrees.

Theorem 1. Suppose x, y ∈ R^n with x, y ≠ 0. Then x · y > 0 iff x and y form an acute angle, x · y = 0 iff x and y form a right angle and x · y < 0 iff x and y form an obtuse angle.

Figure 1.5: Angles, with x = (4, -3), y = (3, 4), z = (-3, 4) and w = (-3, -4)

This theorem is illustrated in Figure 1.5 where (a) x and y form a right angle and x · y = 0, (b) x and w form a right angle and x · w = 0, (c) x and z form an obtuse angle and x · z = -24 < 0, (d) y and w form an obtuse angle and y · w = -25 < 0 and (e) y and z form an acute angle and y · z = 7 > 0. When two vectors form a right angle they are said to be orthogonal. Note that the word orthogonal is just the generalization of the word perpendicular to R^n. Similarly, orthant is the generalization of the word quadrant to R^n.
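Theorem 1 can be checked numerically; a minimal Mathematica sketch using the four vectors of Figure 1.5, with angle a helper function defined here (it simply rearranges Equation 1.1):

x = {4, -3}; y = {3, 4}; z = {-3, 4}; w = {-3, -4};
{x . y, x . w, x . z, y . w, y . z}    (* {0, 0, -24, -25, 7} *)
angle[u_, v_] := ArcCos[u . v/(Norm[u] Norm[v])]
angle[y, z]/Degree // N                (* about 73.7 degrees, an acute angle *)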

Problem 1.6. The Cauchy-Schwarz inequality states that |x · y| ≤ ||x|| ||y||. Show that this inequality follows from Equation 1.1.

Query 1.1. When does Cauchy-Schwarz hold as an equality?

Problem 1.7. Suppose a, x, y ∈ R^n and a · x > a · y. Does it follow that x > y? [Hint: resist any temptation to divide both sides by a.]

Problem 1.8. In Mathematica a vector is a list, e.g. {1,2,3,4} or Table[j,{j,1,4}], and the dot product of two vectors is obtained by placing a period between them. Use Mathematica to evaluate the following dot product:

    Table[j, {j,1,100}] . Table[j-50, {j,1,100}]

    Do the two vectors form an acute angle, an obtuse angle or are they orthogonal?

The sum of two vectors is obtained by adding the respective components. Supposing that x, y ∈ R^n we have:

Definition 1.7. x + y ≡ (x_1 + y_1, x_2 + y_2, ..., x_n + y_n)


Note that the sum of two vectors in R^n is itself a vector in R^n. The set R^n is said, therefore, to be closed with respect to the operation of addition.

The addition of two points from R^2 is illustrated in Figure 1.6. The addition of x = (5, -3) and y = (-1, 5) yields a point located at the corner of the parallelogram whose sides are formed by the vectors x and y. Equivalently, x + y = (4, 2) is obtained by moving the vector x parallel to itself until its tail rests at the tip of y, or by moving the vector y parallel to itself until its tail rests at the tip of x.

Figure 1.6: Vector Addition, with x = (5, -3), y = (-1, 5) and x + y = (4, 2)

The scalar product of a real number and a vector is obtained by multiplying each component of the vector by the real number. If λ ∈ R then:

Definition 1.8. λx ≡ (λx_1, λx_2, ..., λx_n)

Note that this product is itself a vector in R^n. The set R^n is said, therefore, to be closed with respect to the operation of scalar multiplication.

Scalar multiplication is illustrated in Figure 1.7. Note that for any choice of λ, λx lies along the extended line passing through the origin and the point x. The sign of λ determines whether λx will be on the same (λ > 0) or opposite (λ < 0) side of the origin as x.

Figure 1.7: Scalar Product, with x = (3, 1), 2x and -2x

Problem 1.9. In Mathematica if x and y are vectors and a is a real number, then x+y gives the sum of the two vectors and a x gives the scalar product of a and x. Use Mathematica to evaluate the following:

    3 {1,3,5} + 2 {2,4,6}

The norm of λx is

||λx|| = (λ^2 x_1^2 + λ^2 x_2^2 + ... + λ^2 x_n^2)^{1/2} = [λ^2 (x_1^2 + x_2^2 + ... + x_n^2)]^{1/2} = (λ^2)^{1/2} (x_1^2 + x_2^2 + ... + x_n^2)^{1/2} = |λ| ||x||

Multiplying x by λ thus produces a new vector that is |λ| times as long as the original vector. It is not difficult to see that λx points in the same direction as x if λ is positive and in the opposite direction if λ is negative.

Problem 1.10. In Mathematica the norm of the vector x is given by Norm[x]. What is

Norm[Table[j, {j,1,100}]]

    1.2 Combinations of Vectors

    The operations of vector addition and scalar multiplication can be combined.


    1.2.1 Linear Combinations

Definition 1.9. If x^1, x^2, ..., x^k are k vectors in R^n and if α_1, α_2, ..., α_k are real numbers then

z = α_1 x^1 + α_2 x^2 + ... + α_k x^k

is a linear combination of the vectors.

A related concept is that of linear independence.

Definition 1.10. If

α_1 x^1 + α_2 x^2 + ... + α_k x^k = (0, 0, ..., 0)

has no solution (α_1, α_2, ..., α_k) other than the trivial solution, α = 0, then the vectors are said to be linearly independent. Alternatively, if there is a non-trivial solution, α ≠ 0, then the vectors are said to be linearly dependent.

In the latter case we must have α_j ≠ 0 for some j and thus can write:

α_j x^j = -Σ_{i≠j} α_i x^i

or, since α_j ≠ 0,

x^j = -Σ_{i≠j} (α_i/α_j) x^i

Thus x^j is a linear combination of the remaining x's. It follows that vectors are either linearly independent or one of them can be expressed as a linear combination of the rest.

This is illustrated in Figure 1.8. In the right-hand panel, x and y are linearly dependent and a non-trivial solution is α = (1, 2). In the left-hand panel, on the other hand, x and y are linearly independent. Scalar multiples of x lie along the dashed line passing through x and the origin and, similarly, scalar multiples of y lie along the dashed line passing through y and the origin. The only way to have the sum of two points selected from these lines add up to the origin is to choose the origin from each line: the trivial solution α = (0, 0).

Figure 1.8: Linear Independence (left: x = (6, -3), y = (6, 4)) and Dependence (right: x = (4, -8), y = (-2, 4))
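A quick numerical check of independence is to stack the vectors as the rows of a matrix and compute its rank with MatrixRank (used again in Problem 2.6): the rank equals the number of vectors exactly when they are linearly independent. A minimal sketch with the vectors of Figure 1.8:

MatrixRank[{{6, -3}, {6, 4}}]    (* 2: linearly independent *)
MatrixRank[{{4, -8}, {-2, 4}}]   (* 1: linearly dependent *)
1 {4, -8} + 2 {-2, 4}            (* {0, 0}: the non-trivial solution α = (1, 2) *)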

Definition 1.11. If L is a non-empty set which is closed with respect to vector addition and scalar multiplication, i.e. (i) x, y ∈ L ⇒ x + y ∈ L and (ii) λ ∈ R, x ∈ L ⇒ λx ∈ L, then L is called a linear space.

Definition 1.12. If L is a linear space and L ⊆ M then L is a linear subspace of M.

Problem 1.11. Which of the following sets are linear subspaces of R^3?

    1. A point other than the origin? What about the origin?


    2. A line segment? A line through the origin? A line not passing through the origin?

    3. A plane passing through the origin? A plane not passing through the origin?

4. The non-negative orthant, i.e., {x ∈ R^3 | x ≥ 0}?

Query 1.2. Must the intersection of two linear subspaces itself be a linear subspace?

Definition 1.13. The dimension of a linear (sub)space is an integer equaling the largest number of linearly independent vectors which can be selected from the (sub)space.

Problem 1.12. [Answer] What are the dimensions of the following subsets of R^3?

    1. The origin?

    2. A line through the origin?

    3. A plane which passes through the origin?

Definition 1.14. Given a set {x^1, x^2, ..., x^k} of k vectors in R^n, the set of all possible linear combinations of these vectors is referred to as the linear subspace spanned by these vectors.

Linear spaces spanned by independent and dependent vectors are illustrated for the case in which n = 2 in Figure 1.8. Since the two vectors, (6, -3) and (6, 4), in the left-hand panel are linearly independent, every point in R^2 can be obtained as a linear combination of these two vectors. The point (0, 7), for example, corresponds to -1x + 1y. In the right-hand panel, on the other hand, the two vectors, (4, -8) and (-2, 4), are linearly dependent and the linear subspace spanned by these vectors is a one-dimensional, strict subset of R^2 corresponding to the line which passes through the two points and the origin.

Problem 1.13. [Answer] Suppose x, y ∈ R^n with x ≠ 0 and let X = {z ∈ R^n | z = λx, λ ∈ R} be the (1-dimensional) linear space spanned by x. The projection of y upon X, denoted ŷ, is defined to be that element of X which is closest to y, i.e. that element ŷ ∈ X for which the norm of the residual of the projection, ||y - ŷ||, is smallest. Obtain expressions for both λ and ŷ as functions of x and y.

Problem 1.14. Suppose a, y ∈ R^n and let X = {x ∈ R^n | a · x = 0} be the linear subspace orthogonal to a. Obtain an expression for ŷ, the projection of y on X, as a function of a and y. [See Problem 1.13.]

Definition 1.15. A basis for a linear (sub)space is a set of linearly independent vectors which span the (sub)space.

Definition 1.16. An orthonormal basis for a linear (sub)space is a basis with two additional properties:

1. The basis vectors are mutually orthogonal, i.e., if x^i and x^j are distinct vectors in the basis, then x^i · x^j = 0.

2. The length of each basis vector is one, i.e., if x^i is a vector in the basis, then x^i · x^i = 1.

Problem 1.15. [Answer] Give an orthonormal basis, {x^1, x^2, ..., x^n}, for R^n.

    1.2.2 Affine Combinations

In forming linear combinations of vectors no restriction whatever is placed upon the α's other than that they must be real numbers. In the left-hand panel of Figure 1.9, for example, every point in the two-dimensional space corresponds to a linear combination of the two vectors. An affine combination of vectors, on the other hand, is a linear combination which has the additional restriction that the α's add up to one.


Definition 1.17. If x^1, x^2, ..., x^k are k vectors in R^n and if α_1, α_2, ..., α_k are real numbers with the property that

Σ_{i=1}^k α_i = 1

then

z = Σ_{i=1}^k α_i x^i

is an affine combination of the x's.

Problem 1.16. An affine combination of points is necessarily a linear combination as well but not vice versa. True or false?

An affine space bears the same relationship to affine combinations that a linear space does to linear combinations:

Definition 1.18. If L is closed with respect to affine combinations, i.e. affine combinations of points in L are necessarily also in L, then L is called an affine space. If, additionally, L ⊆ M then L is an affine subspace of M.

The affine subspace spanned by a set of vectors is similarly analogous to the linear subspace spanned by a set of vectors.

Definition 1.19. Given a set {x^1, x^2, ..., x^k} of k vectors in R^n, the affine subspace spanned by these vectors is the set of all possible affine combinations of these vectors:

{z ∈ R^n | z = Σ_{i=1}^k α_i x^i, Σ_{i=1}^k α_i = 1}

When k = 2, z = α_1 x^1 + α_2 x^2 is an affine combination of x^1 and x^2 provided that α_1 + α_2 = 1. Suppose now that x^1 ≠ x^2, let α = α_1 and (1 - α) = α_2 and rewrite this as z = αx^1 + (1 - α)x^2. Rewriting again we have z = α(x^1 - x^2) + x^2. Note that when α = 0, z = x^2. Alternatively, when α = 1, z = x^1. In general z is obtained by adding a scalar multiple of (x^1 - x^2) to x^2. It is not difficult to see that such points lie on the extended line passing through x^1 and x^2: the set of all possible affine combinations of two distinct vectors is simply the line determined by the two vectors. This is illustrated for n = 2 by the middle panel in Figure 1.9.

    Figure 1.9: Combinations: linear (left), affine (middle) and convex (right)

Problem 1.17. A linear subspace is necessarily an affine subspace as well but not vice versa. True or false?

Problem 1.18. [Answer] Suppose a is a point in L where L is an affine subspace but not a linear subspace. Let M be the set obtained by subtracting a from L, i.e. M = {z | z = x - a, x ∈ L}. Is M necessarily a linear subspace?

Problem 1.19. Suppose x, y ∈ R^n with x and y linearly independent and consider the affine subspace A = {z ∈ R^n | z = αx + (1 - α)y, α ∈ R}. Find the projection, ô, of the origin on A.


    1.2.3 Convex Combinations

If we add the still further requirement that the α's not only add up to one but also that each is non-negative, then we obtain a convex combination.

Definition 1.20. If x^1, x^2, ..., x^k are k vectors in R^n and if α_1, α_2, ..., α_k are real numbers with the property that

Σ_{i=1}^k α_i = 1,   α_i ≥ 0, i = 1, ..., k

then

z = Σ_{i=1}^k α_i x^i

is a convex combination of the x's.

Again considering the case of k = 2, we know that since the α's must sum to one, convex combinations of two vectors must lie on the line passing through these vectors. The additional requirement that the α's must be non-negative means that convex combinations correspond to points on the line between x^1 and x^2, i.e. the set of all possible convex combinations of two distinct points is the line segment connecting the two points. This is illustrated for n = 2 in Figure 1.9.
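As a minimal Mathematica sketch of this segment characterization, the following generates convex combinations of two points by stepping a weight t from 0 to 1 (the weights t and 1 - t are non-negative and sum to one by construction; the points and the quarter steps are chosen here only for illustration):

x1 = {6, -3}; x2 = {6, 4};
Table[t x1 + (1 - t) x2, {t, 0, 1, 1/4}]
(* {{6, 4}, {6, 9/4}, {6, 1/2}, {6, -5/4}, {6, -3}}: points marching along the segment from x2 to x1 *)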

Problem 1.20. A convex combination of points is necessarily an affine combination and thus a linear combination as well but not vice versa. True or false?

A convex set bears the same relationship to convex combinations that an affine subspace does to affine combinations:

Definition 1.21. If L ⊆ R^n and L is closed with respect to convex combinations, i.e. convex combinations of points in L are necessarily also in L, then L is called a convex set.

Problem 1.21. Show that the intersection of two (or more) convex sets in R^n must itself be a convex set.

Definition 1.22. Given a set L ⊆ R^n, the smallest convex set which contains L is called the convex hull of L. Here smallest means the intersection of all convex sets containing the given set.

The convex hull of a set of vectors corresponds to the set of all convex combinations of the vectors and is thus analogous to the affine space spanned by a set of vectors:

Problem 1.22. Suppose x, y and z are three linearly independent vectors in R^3. Describe the sets which correspond to all (i) linear, (ii) affine and (iii) convex combinations of these three vectors.

    1.3 The Standard Linear Equation


With the geometrical interpretation of the dot product in mind, consider the problem of solving the linear equation

a_1 x_1 + a_2 x_2 + ... + a_n x_n = 0

or

a · x = 0

where a = (a_1, a_2, ..., a_n) is a known vector of coefficients called the normal of the equation, and the problem is to find those x's in R^n which solve the equation. We know that finding such an x is equivalent to finding an x which is orthogonal to a. The solution set,

X(a) ≡ {x ∈ R^n | a · x = 0}

then must consist of all x's which are orthogonal to a. This is illustrated for n = 2 in Figure 1.10.

Figure 1.10: a · x = 0, with normal a = (4, 3)

Problem 1.23. [Answer] Show that X(a) is a linear subspace.

Problem 1.24. [Answer] What is the dimension of X(a)?

Problem 1.25. Suppose a, b, y ∈ R^n are linearly independent and let L = {x ∈ R^n | a · x = 0 and b · x = 0}. Find an expression for ŷ, the projection of y on L, as a function of a, b and y.

Now consider the non-zero version of the linear equation

a_1 x_1 + ... + a_n x_n = b

or

a · x = b

where b is not necessarily equal to 0, and let

X(a, b) ≡ {x ∈ R^n | a · x = b}

denote the solution set for this equation.

Problem 1.26. [Answer] Show that X(a, b) is an affine subspace.

Problem 1.27. When is X(a, b) a linear subspace?

Query 1.3. Which two subsets of a linear space, X, are always linear subspaces?

To provide a geometric characterization of X(a, b), find a point x* that (i) lies in the linear subspace spanned by a and (ii) solves the equation a · x = b. To satisfy (i) it must be the case that x* = λa for some real number λ. To satisfy (ii) it must be the case that a · x* = b. Combining, we have a · (λa) = b or λ = b/(a · a) and thus x* = [b/(a · a)]a.

Now suppose that x is any solution to a · x = 0. It follows that x* + x must solve a · x = b since a · (x* + x) = a · x* + a · x = b + 0 = b. We may therefore obtain solutions to a · x = b simply by adding x* to each solution of a · x = 0. X(a, b) is obtained, in short, by moving X(a) parallel to itself until it passes through x*. The significance of x* is that it is the point in X(a, b) which is closest to the origin. Its norm, moreover, is ||x*|| = |b|/||a||. Note that x* can be interpreted as the intercept of the solution set with the a axis. When b is positive, X(a, b) lies on the same side of the origin as a and a forms a positive dot product (acute angle) with each point in X(a, b). When b is negative, X(a, b) lies on the opposite side of the origin from a and a forms a negative dot product (obtuse angle) with each point in X(a, b).
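A minimal Mathematica sketch of this construction, using the a and b that appear in Figure 1.11 below (xstar is a name introduced here):

a = {4, 3}; b = -25/2;
xstar = (b/(a . a)) a            (* {-2, -3/2} *)
a . xstar == b                   (* True: xstar solves the equation *)
Norm[xstar] == Abs[b]/Norm[a]    (* True: both equal 5/2 *)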


This x* is illustrated in Figure 1.11 for the case in which a = (4, 3), ||a|| = 5 and b = -25/2:

x* = [b/(a · a)]a = (-25/2)(1/25)(4, 3) = (-2, -3/2)

and

||x*|| = ||(-2, -3/2)|| = 5/2 = |b|/||a||

Figure 1.11: a · x = b, with a = (4, 3) and x* = (-2, -3/2)

The solution set for the linear equation a · x = b can thus be given the following interpretation: X(a, b) is an affine subspace orthogonal to the normal a and lying a directed distance equal to b/||a|| from the origin at the closest point. The term directed distance simply means that X(a, b) lies on the same side of the origin as a if b is positive and on the opposite side if b is negative.

This is the standard form for a linear equation. It replaces the familiar slope-intercept form used for n = 2. In this more general form the slope is given by the orthogonal-to-a requirement and the intercept by the point x*, a directed distance b/||a|| out the a axis.

Problem 1.28. Suppose b ∈ R, a, y ∈ R^n and let X(a, b) = {x ∈ R^n | a · x = b}. Obtain an expression for ŷ, the projection of y on X(a, b), as a function of a, b and y. [See Problem 1.13.]

    1.4 Separating and Supporting Hyperplanes

The solution set X(a, b) bears exactly the same relationship to R^n that a plane does to R^3. For example, it is linear (either a linear or an affine subspace) and has a dimension equal to n - 1. For these reasons

X(a, b) ≡ {x ∈ R^n | a · x = b}

is called a hyperplane. This hyperplane divides R^n into two associated half spaces

H+(a, b) ≡ {x ∈ R^n | a · x ≥ b}
H-(a, b) ≡ {x ∈ R^n | a · x ≤ b}

the intersection of which is X(a, b) itself:

H+(a, b) ∩ H-(a, b) = X(a, b)

Definition 1.23. If Z ⊆ R^n is an arbitrary set, then X(a, b) is bounding for Z iff Z is entirely contained in one of X(a, b)'s half-spaces, i.e., either Z ⊆ H+ or Z ⊆ H-.

Definition 1.24. If Z ⊆ R^n is an arbitrary set, then X(a, b) is supporting for Z iff X(a, b) is bounding for Z and X(a, b) touches Z, i.e.,

inf_{z ∈ Z} |a · z - b| = 0

    These concepts together with the following theorem will prove very useful in subsequent analysis.


Theorem 2 (Minkowski's Theorem). If Z and W are non-empty, convex and non-intersecting subsets of R^n, then there exist a ∈ R^n and b ∈ R such that X(a, b) is separating for Z and W, i.e., X(a, b) (i) is bounding for both Z and W, (ii) contains Z in one half-space and (iii) contains W in the other half-space.

Minkowski's Theorem is illustrated for n = 2 in Figure 1.12. In the left-hand panel the antecedent conditions for the theorem are met and the separating hyperplane is illustrated. In the right-hand panel one of the sets is not convex and it is not possible to find a separating hyperplane.

Figure 1.12: Conditions for Minkowski's Theorem: satisfied (left) and violated (right)

    1.5 Answers

Problem 1.1. Yes. From the "only if" in the definition, x > y ⇒ x ≥ y.

Problem 1.2. Yes. From the "only if" in the definition,

x ≫ y ⇒ x_i > y_i, i = 1, 2, ..., n ⇒ x_i ≥ y_i, i = 1, 2, ..., n ⇒ x ≥ y

and since x ≫ y also implies x ≠ y, it follows that x > y.

Problem 1.5. No. If x > y then at least one component of x must be greater than the corresponding component of y. Since there is only one component, this means that every component of x is greater than the corresponding component of y. Thus x > y ⇒ x ≫ y. The converse also holds.

Problem 1.12. The origin has dimension 0. Surprised? Note that α(0, 0, 0) = (0, 0, 0) has an abundance of non-trivial solutions, e.g. α = 1. A line through the origin has dimension 1 and a plane through the origin has dimension 2.

Problem 1.13. Two facts characterize this projection. (i) Since ŷ ∈ X it must be the case that ŷ = λx for some real λ. (ii) The residual of the projection, y - ŷ, must be orthogonal to every vector in X. Since x ∈ X, fact (ii) implies that (y - ŷ) · x = 0 or ŷ · x = y · x. Combining with (i) yields λ(x · x) = y · x or, since x ≠ 0, λ = (y · x)/(x · x) and thus ŷ = [(y · x)/(x · x)]x.
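The formula just derived translates directly into Mathematica; a minimal sketch, with proj a helper defined here and the sample vectors chosen only for illustration:

proj[y_, x_] := ((y . x)/(x . x)) x
x = {1, 2}; y = {3, 1};
yhat = proj[y, x]    (* {1, 2} *)
(y - yhat) . x       (* 0: the residual is orthogonal to x *)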

Problem 1.15.

x^1 = (1, 0, 0, ..., 0)
x^2 = (0, 1, 0, ..., 0)
...
x^n = (0, 0, 0, ..., 1)


Problem 1.18. Yes. The argument proceeds in three steps.

1. M is an affine space: If y^i ∈ M, i = 1, ..., k, then it must be the case that y^i = x^i - a, i = 1, ..., k, for some x^i ∈ L, i = 1, ..., k. Since L is affine, it follows that Σ_i α_i = 1 implies z ≡ Σ_i α_i x^i = Σ_i α_i (y^i + a) ∈ L. But this means that Σ_i α_i y^i = z - a ∈ M. Thus M is an affine space.

2. The origin belongs to M: Since a ∈ L it follows that a - a = 0 ∈ M.

3. An affine space which contains the origin is necessarily a linear space: Suppose that x^i ∈ M, i = 1, ..., k, and α_i ∈ R, i = 1, ..., k. We need to show that the linear combination Σ_i α_i x^i ∈ M. Note that α_i x^i = α_i x^i + (1 - α_i)0 ∈ M since x^i, 0 ∈ M and M is affine. But then Σ_i α_i x^i + (1 - Σ_i α_i)0 = Σ_i α_i x^i ∈ M.

Problem 1.23. (i) If x, x' ∈ X(a) then a · x = a · x' = 0, a · x + a · x' = a · (x + x') = 0 and thus x + x' ∈ X(a). (ii) If x ∈ X(a) then a · x = 0, λ(a · x) = a · (λx) = 0 and thus λx ∈ X(a).

Problem 1.24. Since (i) the dimension of R^n equals n, (ii) a itself spans (occupies) a linear subspace of dimension 1 and (iii) X(a) contains all those x's which are orthogonal to a, it is not hard to see that there are n - 1 directions left in which to find vectors orthogonal to a. Thus the dimension of X(a) must be equal to n - 1.

Problem 1.26. Since x^i ∈ X(a, b) implies a · x^i = b and Σ_i α_i = 1 for any affine combination Σ_i α_i x^i, it follows that

b = (α_1 + ... + α_k)b = α_1 a · x^1 + ... + α_k a · x^k = a · (α_1 x^1 + ... + α_k x^k)

and thus Σ_i α_i x^i ∈ X(a, b).


    Chapter 2

    Matrix Algebra

2.1 Linear Spaces
2.2 Real Valued Linear Transformations and Vectors
2.3 Linear Transformations and Matrices
2.4 Invertible Transformations
2.5 Change of Basis and Similar Matrices
2.6 Systems of Linear Equations
  2.6.1 Homogeneous Equations
  2.6.2 Non-Homogeneous Equations
2.7 Square Matrices
  2.7.1 The Inverse of a Matrix
  2.7.2 Application: Ordinary Least Squares Regression as a Projection
  2.7.3 The Determinant
  2.7.4 Cramer's Rule
  2.7.5 Characteristic Roots
2.8 Farkas' Lemma
  2.8.1 Application: Asset Pricing and Arbitrage
2.9 Answers

    2.1 Linear Spaces

Thus far we have thought of vectors as points in R^n represented by n-tuples of real numbers. This is a little like thinking of 127 Main Street as a 15-character text string when, in reality, it's a house. Similarly, an n-tuple of real numbers is best regarded as the address of the vector that lives there.


Figure 2.1: A linear space (left) and the corresponding address space (right), with x = (1, 0), y = (0, 1), x + y = (1, 1) and 3/2 x = (3/2, 0)

All this can be made less abstract by constructing a coordinate-free linear space using a pencil, ruler, protractor and a blank sheet of paper. Begin by placing a point on the paper and labeling it o to represent the origin. Then arbitrarily pick another couple of points, label them x and y, and draw arrows connecting them to o. This is illustrated in the left-hand panel of Figure 2.1.

The lengths, ||x|| and ||y||, of x and y, respectively, can be measured with the ruler. The scalar product of x and, say, 3/2 can then be obtained by extending x using the ruler until the length is 3/2 times as long as x. Multiplying by a negative real number, say -2, would require extending x in the opposite direction until its length is 2 times the original length. The scalar multiple of an arbitrary point z by the real number a is then obtained by expanding (or contracting) z until its length equals |a| ||z|| and then reversing the direction if a is negative.

To add, say, x and y, use the protractor to construct a parallel to y through x and a parallel to x through y. The intersection of these parallels gives x + y. Adding any other two points would similarly be accomplished by completing the parallelogram formed by the two points.

Note that x and y are linearly independent since ax + by = o has only the trivial solution a = b = 0. Any other point, z, can be expressed as a linear combination, z = ax + by, for appropriate choices of the real numbers a and b. This means that the two vectors, x and y, form a basis for our linear space which, consequently, is 2-dimensional. All this is possible without axes and coordinates.

Now let's add coordinates by choosing x and y, respectively, as the two basis vectors for our linear space. The corresponding 2-dimensional address space is illustrated in the right-hand panel of Figure 2.1 where, for example, (1, 0) is the address of x since x lives 1 unit out the first basis vector (x) and 0 units out the second basis vector (y). In general, (a, b) in the right-hand panel is the address of the vector ax + by in the left-hand panel.

Definition 2.1. A linear space is an abstract set, L, with a special element called the origin and denoted o, together with an operation on pairs of elements in L called addition and denoted +, and another operation on elements in L and real numbers called scalar multiplication, with the property that for any x, y ∈ L and any a ∈ R: (i) x + o = x, (ii) 0x = o, (iii) x + y ∈ L and (iv) ax ∈ L.

    2.2 Real Valued Linear Transformations and Vectors

Suppose that L is an n-dimensional linear space and that b = {b^1, b^2, ..., b^n} ⊆ L form a basis for L. A real-valued linear transformation on L is a map, t, from L into R with the property that t(ax + by) = a t(x) + b t(y) for all real numbers a and b and all x, y ∈ L.


Since b is a basis for L, an arbitrary x̄ ∈ L must be expressible as a linear combination of the elements of b. Thus x̄ = Σ_i x_i b^i where x = (x_1, x_2, ..., x_n) ∈ R^n and, since t is linear, t(x̄) = t(Σ_i x_i b^i) = Σ_i t(b^i) x_i = a · x, where a_i ≡ t(b^i) and thus a ∈ R^n. This means that a · x is the image t(x̄) when x is the address of x̄. It also means that for every real-valued linear transformation on the n-dimensional linear space, L, there is a corresponding vector, a ∈ R^n, that represents the associated formula for getting the image of a point from its address.

Query 2.1. Let L* denote the set of all real-valued linear transformations of the linear space L and define addition and scalar multiplication for elements of L* as follows for all f, g ∈ L* and λ ∈ R:

(f + g)(x) ≡ f(x) + g(x), x ∈ L
(λf)(x) ≡ λf(x), x ∈ L

L* thus defined is called the dual space of L. Is it a linear space and, if so, what is its dimensionality?

    2.3 Linear Transformations and Matrices

In general, a linear transformation is a mapping that is, well, linear. This means (i) that if x maps into T(x) and a is a real number, then ax must map into aT(x) and (ii) that if x and y map into T(x) and T(y), respectively, then x + y must map into T(x) + T(y).

Let's suppose that the domain and range of the linear transformation are both equal to the 2-dimensional linear space illustrated in Figure 2.1 and construct a linear transformation. Consider the left-hand panel of Figure 2.2. First select the same basis vectors as before, x and y. Now choose arbitrary points to be the images of these two points and label them T(x) and T(y). You're done. That's right, you have just constructed a linear transformation. To see why, simply note that any point in the domain, z, can be expressed as a linear combination of the basis vectors, z = ax + by. But then the linearity of T implies that T(z) = T(ax + by) = aT(x) + bT(y). Thus the image of any point in the domain is completely determined by the starting selection of T(x) and T(y).

Figure 2.2: A linear transformation, with x = (1, 0), y = (0, 1), T(x) = (1, 2/3) and T(y) = (1/2, 1)

Note that the T(x) and T(y) in the illustration are linearly independent. This need not be the case; they could be linearly dependent and span either a one-dimensional linear subspace of L, a line, or a zero-dimensional linear subspace, the origin. See Problem 2.1.

As before, the right-hand panel of Figure 2.2 gives the address view of the same linear transformation. This means that x = (1, 0) maps into T(x) = (1, 2/3), y = (0, 1) maps into T(y) = (1/2, 1) and,


in general, z = (z_1, z_2) maps into

T(z_1 x + z_2 y) = z_1 T(x) + z_2 T(y) = (1, 2/3) z_1 + (1/2, 1) z_2 =
[ 1    1/2 ] [ z_1 ]
[ 2/3  1   ] [ z_2 ]

Thus the matrix-vector product

[ 1    1/2 ] [ z_1 ]
[ 2/3  1   ] [ z_2 ]

gives a formula for computing the address of the image of z from the address of z.
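A minimal Mathematica sketch of this address calculation, using the matrix above:

A = {{1, 1/2}, {2/3, 1}};
A . {1, 0}      (* {1, 2/3}: the address of T(x) *)
A . {0, 1}      (* {1/2, 1}: the address of T(y) *)
A . {z1, z2}    (* {z1 + z2/2, (2 z1)/3 + z2}: the general formula *)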

Problem 2.1. Suppose, in the construction of the linear transformation in Figure 2.2, that x and y are linearly independent but that T(x) and T(y) were chosen in a way that made them linearly dependent and, in fact, span only a one-dimensional linear subspace. Discuss the implications for the image of the domain under the transformation and for the matrix that maps addresses for the transformation.

Definition 2.2. A linear transformation is a mapping, T, which associates with each x in some n-dimensional linear space, D, a point T(x) in some m-dimensional linear space, R, with the property that if x^1, ..., x^k ∈ D and α_1, ..., α_k ∈ R then

T(Σ_i α_i x^i) = Σ_i α_i T(x^i)

    It is customary to refer to D as the domain and R as the range of the transformation.

While the definition imposes no restriction upon the values of m and n, it is convenient to assume for the moment that m = n and R = D. Suppose that b^1, ..., b^n ∈ D is a basis for both D and R, and let T(b^j), j = 1, ..., n, be the images of these basis vectors under the transformation. Since T(b^j) belongs to R = D it can be expressed as a linear combination of the basis vectors:

T(b^j) = Σ_{i=1}^n a_ij b^i

The matrix

[ a_11 a_12 ... a_1n ]
[ a_21 a_22 ... a_2n ]
[ ...                ]
[ a_n1 a_n2 ... a_nn ]

obtained in this way has for its j-th column the address of T(b^j) in terms of the basis, i.e., T(b^j) lives a_1j out the basis vector b^1, a_2j out b^2 and so forth.

Similarly, an arbitrary vector x̄ ∈ D can be expressed as a linear combination of the basis vectors

x̄ = Σ_{j=1}^n x_j b^j

and the resulting column vector

[ x_1 ]
[ x_2 ]
[ ... ]
[ x_n ]


can be interpreted as the address of x̄ in terms of the basis.

Now since T is linear,

T(x̄) = T(Σ_{j=1}^n x_j b^j) = Σ_{j=1}^n x_j T(b^j) = Σ_{j=1}^n x_j (Σ_{i=1}^n a_ij b^i) = Σ_{i=1}^n (Σ_{j=1}^n a_ij x_j) b^i

so that the matrix-vector product

[ a_11 a_12 ... a_1n ] [ x_1 ]
[ a_21 a_22 ... a_2n ] [ x_2 ]
[ ...                ] [ ... ]
[ a_n1 a_n2 ... a_nn ] [ x_n ]

can be interpreted as the address of T(x̄) in terms of the basis.

A similar result can be established when m ≠ n, so that to every linear transformation which maps an n-dimensional linear space into an m-dimensional linear space there corresponds an m by n matrix for mapping addresses in terms of given bases for the domain and the range, and vice versa. This being the case, the study of linear transformations centers upon the matrix-vector product Ax or, equivalently, upon the linear transformations, T, for which D = R^n and R = R^m.

Query 2.2. Suppose m ≠ n and thus R ≠ D. Let d^1, d^2, ..., d^n ∈ D be a basis for D and r^1, r^2, ..., r^m ∈ R be a basis for R. Derive the formula for mapping the address of x ∈ D into the address of T(x) ∈ R.

Now choose a subset of the domain, R^n, and recall that:

Definition 2.3. The image of X ⊆ R^n under T is

T(X) ≡ {y ∈ R^m | y = T(x), x ∈ X} = {y ∈ R^m | y = Ax, x ∈ X}

Note that T(R^n), the set of all linear combinations of the columns of A, is a linear subspace with a dimension equal to the number of linearly independent columns, or rank, of A. It is also true that Rank(A) ≤ min{m, n} since there can't be more linearly independent columns than there are columns and since the columns themselves live in R^m. When Rank(A) < n the transformation collapses the domain into a linear subspace. No such collapse takes place when Rank(A) = m = n and T(R^n) = R^n.

    2.4 Invertible Transformations

Definition 2.4. Given a mapping T: R^n → R^m, the inverse image of Y ⊆ R^m is

T⁻¹(Y) ≡ {x ∈ R^n | T(x) ∈ Y}


Show that ||Ax||/||x|| = (a^2 + b^2)^{1/2} and cos(θ) = a/(a^2 + b^2)^{1/2}, where θ is the angle between x and Ax, and thus that this transformation corresponds to a rotation and either a lengthening or a shortening. Hint: for the first part, try Mathematica with

A = {{a, b}, {-b, a}};
x = {x1, x2};

Assuming[{Element[x1, Reals], Element[x2, Reals], Element[a, Reals],
Element[b, Reals]}, Simplify[Norm[A.x]/Norm[x]]]

Query 2.3. Suppose

A =
[ c 0 ]
[ 0 d ]

Interpret the transformation T, i.e., what are the images, T(x) and T(y), of the two basis vectors, x and y?

    2.6 Systems of Linear Equations

Solving simultaneous systems of linear equations involves nothing more than identifying the properties of the inverse image of a linear transformation. To solve the homogeneous system

a_11 x_1 + ... + a_1n x_n = 0
...
a_m1 x_1 + ... + a_mn x_n = 0

or Ax = 0 is to find the inverse image of 0 under this linear transformation. Similarly, to solve the non-homogeneous system

a_11 x_1 + ... + a_1n x_n = b_1
...
a_m1 x_1 + ... + a_mn x_n = b_m

or Ax = b is to find the inverse image of b under this transformation.

Two distinct views of the matrix-vector product prove useful. In the column view, the vector Ax is viewed as a linear combination of the columns of A using the components of x as the weights:

[ a_11 ]       [ a_12 ]             [ a_1n ]
[ a_21 ] x_1 + [ a_22 ] x_2 + ... + [ a_2n ] x_n     (COL)
[ ...  ]       [ ...  ]             [ ...  ]
[ a_m1 ]       [ a_m2 ]             [ a_mn ]

In the row view, the components of the vector Ax are viewed as the dot products of the rows of A with the vector x:

[ (a_11, a_12, ..., a_1n) · x ]
[ (a_21, a_22, ..., a_2n) · x ]     (ROW)
[ ...                         ]
[ (a_m1, a_m2, ..., a_mn) · x ]


    2.6.1 Homogeneous Equations

Consider the homogeneous system Ax = 0 using the column view. A non-trivial solution (x ≠ 0) is possible iff the columns of A are linearly dependent, since a non-trivial linear combination of the columns using the components of x as weights can only be equal to zero if the columns are linearly dependent.

The row view confirms this since x must be orthogonal to each row of A and thus to the linear subspace spanned by the rows of A. This is possible iff Rank(A) = r < n, in which case the rows span an r-dimensional linear subspace and there are n - r directions left to look for things orthogonal. The solution set in this case, not surprisingly, is itself a linear subspace of dimension n - r and is called the null space of A.

Problem 2.6. Suppose

A =
[ 0 1 0 0 0 ]
[ 0 0 0 1 1 ]
[ 0 1 0 1 1 ]
[ 1 1 0 0 1 ]

The Mathematica command

MatrixRank[A]

gives the rank of the matrix A and the command

NullSpace[A]

gives an orthogonal basis for the null space of A. (i) What is the rank of A? (ii) Give an orthogonal basis for the null space of A. (iii) What is the dimension of the null space of A?

The column view is illustrated in the left-hand panel of Figure 2.3 for the case in which

A =
[ 6 -3 ]
[ 4 -2 ]

Figure 2.3: Non-trivial Solutions for Ax = 0: the column view (left) and the row view (right)

Since Rank(A) = 1 there are non-trivial choices for the weights x_1 and x_2 for which x_1 (6, 4) + x_2 (-3, -2) = (0, 0), e.g., (x_1, x_2) = (1, 2). The right-hand panel presents the corresponding row view in which the solution set is a 2 - 1 = 1 dimensional linear subspace orthogonal to the linear subspace spanned by the rows of A. Note that (x_1, x_2) = (1, 2) belongs to the solution set.


    2.6.2 Non-Homogeneous Equations

The non-homogeneous system Ax = b is similar. The column view suggests that a solution is possible iff b lies in the linear subspace spanned by the columns of A. Put somewhat differently, a solution is possible iff Rank(A | b) = Rank(A). Given any one such solution, x*, it is possible to obtain all solutions as follows. Since Ax* = b, it follows that if x is any other solution it must be the case that Ax = Ax* = b or A(x - x*) = 0. Now we already know that solutions to Ax = 0 form a linear subspace of dimension n - Rank(A). The solutions to Ax = b must then correspond to the set obtained by adding x* to each of the solutions to Ax = 0, an affine subspace of dimension n - Rank(A).

This is illustrated in Figure 2.4 for the case in which

A =
[ 4  3 ]
[ -3 4 ]

b =
[ 5   ]
[ -10 ]

Figure 2.4: A Unique Solution for Ax = b: the column view (left) and the row view (right)

Since Rank(A) = 2 = n, the solutions to Ax = b must be an affine subspace of dimension zero, a single point, which corresponds to the trivial solution for Ax = 0. In the column view illustrated in the left-hand panel this unique solution for x is obtained by completing the parallelogram whose sides correspond to the columns of A and whose diagonal corresponds to b. It follows that the unique solution is x = (2, -1). Notice that if the columns of A were chosen as the basis, then the address of b would be (2, -1). In the row view illustrated in the right-hand panel, the unique solution for x corresponds to the intersection of

S_1 = {x | (4, 3) · (x_1, x_2) = 5}

a hyperplane orthogonal to the first row of A and lying a directed distance equal to 5/||(4, 3)|| = 1 from the origin, and

S_2 = {x | (-3, 4) · (x_1, x_2) = -10}

a hyperplane orthogonal to the second row of A and lying a directed distance equal to -10/||(-3, 4)|| = -2 from the origin.


    2.7 Square Matrices

    2.7.1 The Inverse of a Matrix

When T: R^n → R^n is invertible, the inverse image of any point, T⁻¹(x), is itself a point. Thus T⁻¹ is also a linear transformation. As such it has an associated matrix which is denoted, naturally enough, A⁻¹. A consequence of Equation 2.1 is that

A⁻¹Ax = x = AA⁻¹x

or that

A⁻¹A = I = AA⁻¹     (2.3)

where I is the identity matrix:

[ 1 0 ... 0 ]
[ 0 1 ... 0 ]
[ ...       ]
[ 0 0 ... 1 ]

Equation 2.3 thus requires that

A_i · A⁻¹_j = 1 if i = j, and 0 if i ≠ j     (2.4)

where A_i is the i-th row of A and A⁻¹_j is the j-th column of A⁻¹. The j-th column of A⁻¹ must therefore

1. be orthogonal to every row of A save for the j-th,

2. form an acute angle with the j-th row, and

3. be just long enough to make the dot product with the j-th row equal to one.

These requirements can be used to construct the inverse geometrically; see Figure 2.5 for the case in which n = 2 and

A =
[ 4 3 ]
[ 2 6 ]

A⁻¹ =
[ 1/3  -1/6 ]
[ -1/9  2/9 ]


Figure 2.5: Constructing the Inverse, with columns (4, 2) and (3, 6)

In Figure 2.5, R_1 is the set of vectors which are orthogonal to the second column of A and form an acute angle with the first column; the first row of A⁻¹ must belong to this set. Similarly, R_2 is the set of vectors which are orthogonal to the first column of A and form an acute angle with the second column; the second row of A⁻¹ must belong to this set.
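A minimal Mathematica sketch checking Equation 2.3 and the conditions of Equation 2.4 for this example (Ainv is a name introduced here):

A = {{4, 3}, {2, 6}};
Ainv = Inverse[A]                (* {{1/3, -1/6}, {-1/9, 2/9}} *)
A . Ainv == IdentityMatrix[2]    (* True: Equation 2.3 *)
A[[1]] . Ainv[[All, 1]]          (* 1: Equation 2.4 with i = j = 1 *)
A[[1]] . Ainv[[All, 2]]          (* 0: Equation 2.4 with i = 1, j = 2 *)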

Problem 2.7. What problem would be encountered in constructing the inverse if the columns of A were linearly dependent?

Problem 2.8. The formula for the inverse of a 2 by 2 matrix is:

[ a  b ]⁻¹ = 1/(ad - bc) [ d  -b ]
[ c  d ]                 [ -c  a ]

Derive the first row of this inverse using Equation 2.4.

Problem 2.9. Derive the formula given in Problem 2.8 using the Mathematica commands A = {{a,b},{c,d}} and then Inverse[A]//MatrixForm. What difference would it make to replace //MatrixForm with //InputForm?

Problem 2.10. [Answer] Gram's Theorem states that if A is an m by n matrix with m < n and x ∈ R^m then

AAᵀx = 0 ⟺ Aᵀx = 0

Prove Gram's Theorem.

This theorem implies, for example, that if Rank(A) = m then Rank(AAᵀ) = m since Aᵀx = 0 has no solution x ≠ 0, and AAᵀx = 0 must therefore have no solution x ≠ 0 either.

    2.7.2 Application: Ordinary Least Squares Regression as a Projection

    Consider the problem of ordinary least squares regression . In this problem data is available whichdescribes n observations on each of p exogenous and 1 endogenous variables. This arranged asfollows

    X : an n by p exogenous data matrix each row of which corresponds to an observation andeach column of which corresponds to an exogenous variable. There are more observations thanvariables so Rank (X) =p < n .

    y : an n by 1 endogenous data vector each row of which corresponds to an observation onthe endogenous variable.

    The problem is to nd the projection, y , of y on S = {z | z =X, R p }. The term least squaresderives from the fact y is the closest point to y in S and thus minimizes the sum of the squares of the components of the difference see Figure 2.6 on following page .

    There are two key facts:

Since ŷ ∈ S, it must be the case that ŷ = Xβ̂ for some β̂. The problem of finding ŷ thus reduces to one of finding β̂.


Figure 2.6: Ordinary Least Squares Regression. [The figure shows y, its projection ŷ on S, the linear subspace spanned by the columns of X, and the residual from the projection.]

Since ŷ is the projection of y on S, the residual of the projection, y − ŷ, must be orthogonal to S, the space spanned by the columns of X.

These facts are sufficient to identify β̂:

Xⱼ · (y − ŷ) = 0, j = 1, ..., p. To be orthogonal to the space spanned by the columns of X, y − ŷ must be orthogonal to each column of X.

Xᵀ(y − ŷ) = 0. Matrix version of the previous line.

Xᵀy = Xᵀŷ = XᵀXβ̂. Carry out the multiplication and substitute for ŷ.

β̂ = (XᵀX)⁻¹Xᵀy. Multiply both sides by (XᵀX)⁻¹, which exists by virtue of Gram's Theorem.

Problem 2.11. Suppose x₁ = (1, 0, 2), x₂ = (2, 0, 1), y = (3, 3, 3) ∈ R³ and let L = {z ∈ R³ | z = α₁x₁ + α₂x₂, α₁, α₂ ∈ R} be the linear subspace spanned by x₁ and x₂. Find ŷ, the projection of y on L. Hint:

X = {{1, 2}, {0, 0}, {2, 1}}
y = {3, 3, 3}
X . Inverse[Transpose[X] . X] . Transpose[X] . y
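One way to check the answer without giving it away: the residual from the projection should be orthogonal to each column of X. A minimal sketch:

X = {{1, 2}, {0, 0}, {2, 1}};
y = {3, 3, 3};
yhat = X . Inverse[Transpose[X] . X] . Transpose[X] . y;
Transpose[X] . (y - yhat)    (* should be {0, 0} *)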

    2.7.3 The Determinant

The following, due to Hadley [1961], page 87, is a typical definition of the determinant of a square matrix: correct but not particularly intuitive.

Definition 2.7. The determinant of an n by n matrix A, written |A|, is

|A| ≡ Σ (±) a₁ᵢ a₂ⱼ ··· aₙᵣ

the sum being taken over all permutations of the second subscripts, with a term assigned a plus sign if (i, j, ..., r) is an even permutation of (1, 2, ..., n), and a minus sign if it is an odd permutation.

When n = 2 this becomes

| a₁₁  a₁₂ |
| a₂₁  a₂₂ |  =  a₁₁a₂₂ − a₁₂a₂₁


Problem 2.12. Use Mathematica to derive the formulas for the determinant and inverse of a general 3 by 3 matrix by first entering

A = {{a,b,c}, {d,e,f}, {g,h,i}}

and then Det[A] and Inverse[A]//MatrixForm.
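For reference, and as a check on the sign rule in Definition 2.7, the n = 3 case expands to 3! = 6 terms:

|A| = aei − afh − bdi + bfg + cdh − ceg

The three plus signs go with the even permutations (1, 2, 3), (2, 3, 1) and (3, 1, 2) of the second subscripts; the three minus signs go with the odd permutations.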

It is often more useful to recognize that the determinant is another signed magnitude, somewhat analogous to the dot product, which is best understood by examining its sign and its magnitude separately. In Figure 2.7 the (linearly independent) columns of

A =
[  4  3 ]
[ −3  4 ]

|A| = 4·4 − 3·(−3) = 25

have been illustrated and a parallelogram (or, in this case, a square) has been formed by completing the sides formed by these columns.

Figure 2.7: The Determinant in R²: An Oriented Area. [The figure shows the columns A₁ = (4, −3) and A₂ = (3, 4), each of length 5, and the square of area 25 = |A| that they form.]

The first thing to notice is that movement from the first to the second axis is counter-clockwise and that the movement from the first column to the second is also counter-clockwise. Thus the columns of A have the same orientation as the axes. This means that the determinant has a positive sign. (Switch the columns and the determinant would be negative.) The magnitude, moreover, corresponds to the area of this parallelogram.

    Consider, alternatively, the columns of the singular matrix

B =
[ 6  3 ]
[ 2  1 ]

The parallelogram formed by these columns is, in this case, degenerate: a segment of a line rather than an area. The determinant is again equal to the area enclosed within this line interval which, in this case, is equal to zero.

Problem 2.13. The formula for a 2 by 2 determinant is:

| a  b |
| c  d |  =  ad − bc

Show that |ad − bc| is the area of the parallelogram formed by the columns. Hint: let x = (a, c), y = (b, d) and note that the area of the parallelogram is equal to the length of the base, ‖x‖, times the altitude, ‖y − ŷ‖, where ŷ is the projection of y on the linear subspace spanned by x.
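The hint can be checked symbolically. A minimal sketch, using the projection formula ŷ = ((x·y)/(x·x))x for a single spanning vector:

x = {a, c}; y = {b, d};
yhat = ((x . y)/(x . x)) x;                    (* projection of y on the span of x *)
Simplify[(x . x) ((y - yhat) . (y - yhat))]    (* (a d - b c)^2, i.e. (base × altitude)^2 *)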


Figure 2.8: The Determinant in R³: An Oriented Volume. [The figure shows the parallelepiped formed by the columns A₁, A₂ and A₃.]

Higher dimensional cases are analogous. The determinant of a 3 by 3 matrix, for example, has a sign which depends upon whether the columns have the same orientation as the axes and a magnitude equal to the volume of the parallelepiped formed by the columns. [A parallelepiped is a solid each face of which is a parallelogram.] See Figure 2.8. When the columns are linearly dependent the parallelepiped degenerates into a plane area (rank 2) or a line interval (rank 1), both of which have zero volume and the determinant, accordingly, is equal to zero.

The determinant of an n by n matrix, analogously, has a sign which depends upon the orientation of the columns and a magnitude equal to the volume of the hyper-parallelepiped formed by the columns.

Problem 2.14. Suppose A is an n by n matrix. Provide geometrical interpretations for the following propositions (a numeric check follows the list):

1. Suppose that A′ᵢ = Aᵢ for i ≠ k and A′ₖ = λAₖ, i.e., A′ is obtained from A by multiplying the k th column of A by a number λ. Then |A′| = λ|A|.

2. Suppose that A′ᵢ = Aᵢ for i ≠ k and

A′ₖ = Aₖ + Σ_{i≠k} λᵢAᵢ

i.e., that A′ is obtained from A by adding a linear combination of the other columns to the k th column. Then |A′| = |A|.
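A quick numeric check of both propositions; the matrix and the weights are made up (λ = 7 in the first check, and 2A₁ + 5A₃ is added to the second column in the second):

A = {{1, 2, 0}, {0, 1, 3}, {2, 1, 1}};
cols = Transpose[A];    (* the rows of Transpose[A] are the columns of A *)
Det[Transpose[ReplacePart[cols, 2 -> 7 cols[[2]]]]] == 7 Det[A]    (* True *)
Det[Transpose[ReplacePart[cols, 2 -> cols[[2]] + 2 cols[[1]] + 5 cols[[3]]]]] == Det[A]    (* True *)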

Problem 2.15. Suppose A is a non-singular n by n matrix. Show that |Ax| = α|A| for some α ∈ R which depends only upon x. What is α?

2.7.4 Cramer's Rule

An important application of the determinant is provided by:

Theorem 4 (Cramer's Rule). If Ax = b with A an n by n matrix and |A| > 0 then

xᵢ = |Bᵢ| / |A|

where Bᵢ is obtained by replacing the i th column of A with b.

The geometrical interpretation of this theorem is quite simple and is illustrated for the case in which n = 2 in Figure 2.9 on the next page. Note first that the columns of A, labeled A₁ and A₂, are linearly independent and the solution for both x₁ and x₂ can be obtained by completing the parallelogram:

x₁ = ‖b₁‖ / ‖A₁‖        x₂ = ‖b₂‖ / ‖A₂‖


Figure 2.9: Using Cramer's Rule to Solve Ax = b for x₁. [The figure shows the columns A₁ and A₂, the vector b decomposed as b = b₁ + b₂ along those columns, and the labeled points o, c, d, e and f used in the argument below.]

Let's use Cramer's Rule to find, say, x₁. Since we wish to identify the first component of x we begin by replacing the first column of A with b to obtain B₁. Cramer's rule then asserts that

x₁ = |B₁| / |A|

Our task then is to show that

|B₁| / |A| = ‖b₁‖ / ‖A₁‖    (2.5)

Note first that |A| is the oriented volume of the parallelogram formed by the first and second columns of A which, in this case, is positive and could be computed by multiplying the length of the base, oA₂, by the altitude, the distance between the parallel lines ob₂ and A₁e. Since the parallelogram with vertices at o, d, e and A₂ has the same base and the same altitude, its area is also equal to |A|. Call this parallelogram P_A. Turning attention to the numerator, |B₁| is the area of the parallelogram with vertices at o, b, f and A₂. Call this parallelogram P_B. Since P_B has the same base as P_A, the ratio of the area of P_B to the area of P_A, |B₁|/|A|, is the same as the ratio of the distance between ob₂ and b₁f and the distance between ob₂ and A₁e. But this is the same as the ratio ‖b₁‖/‖A₁‖ which establishes Equation 2.5.
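A numeric illustration, reusing the matrix from the inverse example above (b is made up and chosen to give a clean answer):

A = {{4, 3}, {2, 6}}; b = {7, 8};
B1 = Transpose[ReplacePart[Transpose[A], 1 -> b]];    (* replace the 1st column with b *)
B2 = Transpose[ReplacePart[Transpose[A], 2 -> b]];    (* replace the 2nd column with b *)
{Det[B1]/Det[A], Det[B2]/Det[A]}    (* {1, 1}, which matches LinearSolve[A, b] *)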

    2.7.5 Characteristic Roots

A particularly important characterization of a square matrix is provided by its characteristic roots and characteristic vectors:

Definition 2.8. If A is an n by n matrix, λ is a scalar and x ≠ 0 is an n by 1 vector, then λ is a characteristic root of A and x is the associated characteristic vector iff λ and x solve the characteristic equation:

Ax = λx    (2.6)

Characteristic roots and vectors are also sometimes called (i) eigenvalues and eigenvectors or (ii) latent roots and latent vectors.

A fact worth noting about the characteristic roots of a matrix is that they characterize the underlying linear transformation and are invariant with respect to the choice of basis (recall the discussion of Section 2.3 on page 17). To see this note that if A is the matrix representation of the linear transformation T for a particular choice of basis, then to be a characteristic root of A, λ must satisfy

T(x) = λx

But this means that matrices which represent the same linear transformation under alternative choices of basis, i.e., similar matrices, will have the same characteristic roots.

Since Equation 2.6 can be rewritten as the homogeneous equation

[A − λI]x = 0


it follows that λ is a characteristic root of A iff

|A − λI| = 0

The expansion of this determinant

| a₁₁ − λ   a₁₂       ···  a₁ₙ      |
| a₂₁       a₂₂ − λ   ···  a₂ₙ      |
|  ⋮         ⋮          ⋱   ⋮       |
| aₙ₁       aₙ₂       ···  aₙₙ − λ  |  =  0

is a polynomial in λ with (−λ)ⁿ the highest order term [from the product of the diagonal elements]. From the fundamental theorem of algebra we know that such a polynomial will have n, not necessarily distinct, solutions for λ.
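Mathematica can produce this polynomial directly; for instance, for a general 2 by 2 matrix:

CharacteristicPolynomial[{{a, b}, {c, d}}, x]
(* the quadratic x^2 - (a + d) x + a d - b c, up to term ordering;
   its roots are the characteristic roots of the matrix *)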

Problem 2.16. The characteristic roots of A may be either real or complex but if they are complex they must occur in conjugate pairs so that if λ = a + bi is a root then λ̄ = a − bi must also be a root. Show that it follows that both the sum of the roots and the product of the roots are necessarily real.

Two elementary facts about characteristic roots are worth noting.

Theorem 5. If A is an n by n matrix with characteristic roots λᵢ, i = 1, ..., n, then

λ₁ + λ₂ + ··· + λₙ = trace(A)

λ₁λ₂ ··· λₙ = |A|

where the trace of A is the sum of the diagonal elements:

trace(A) ≡ a₁₁ + a₂₂ + ··· + aₙₙ

Since similar matrices must have the same characteristic roots, it follows from Theorem 5 that similar matrices have the same trace and determinant as well.
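Both identities are easy to check numerically; a minimal sketch using the matrix from the inverse example (whose roots are 5 ± √7):

A = {{4, 3}, {2, 6}};
Simplify[Total[Eigenvalues[A]] == Tr[A]]         (* True *)
Simplify[(Times @@ Eigenvalues[A]) == Det[A]]    (* True *)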

Problem 2.17. [Answer] Show that Theorem 5 is valid for the 2 by 2 matrix

A =
[ a  b ]
[ c  d ]

Problem 2.18. What are the characteristic roots of the three canonical matrices given in Equation 2.2 on page 20?

Problem 2.19. Suppose

A =
[ 1  2  3  4 ]
[ 4  1  2  3 ]
[ 3  4  1  2 ]
[ 2  3  4  1 ]

The Mathematica commands for finding the characteristic roots and vectors of a matrix are Eigenvalues[A] and Eigenvectors[A], respectively. What are the characteristic roots and vectors of A?


    2.8 Farkas Lemma

A final result that will prove very important in subsequent analysis takes us into the realm of linear inequalities. It states that a system of linear equations will have a solution precisely when another system of linear inequalities does not have a solution. The importance of this result involves indirection: often it will be easier to establish the existence of a solution to the system of interest by showing that a solution to the complementary system cannot exist.

Theorem 6 (Farkas Lemma). Suppose A is an m by n matrix and b ≠ 0 is a 1 by n row vector. Exactly one of the following holds:

1. yA = b, y > 0 has a solution y ∈ Rᵐ

2. Az ≥ 0, b·z < 0 has a solution z ∈ Rⁿ

Figure 2.10: Farkas Lemma. [Two panels, each showing the rows A₁ and A₂ of A, the cone {yA, y > 0} they generate, and the vector b. In the left-hand panel b lies outside the cone and the region where Az ≥ 0 and b·z < 0 is non-empty; in the right-hand panel b lies inside the cone, so yA = b, y > 0 has a solution.]

The basis of this theorem is quite simple and is illustrated in Figure 2.10. Either

1. a vector z exists which forms a non-obtuse angle with every row of A and an obtuse angle with b (the left-hand panel)

2. or b lies in the cone generated by the rows of A (the right-hand panel)

The key to understanding Figure 2.10 is to fix the rows of A and rotate b clockwise in moving from the left-hand panel to the right-hand panel. Initially b lies outside the cone generated by the rows of A. It follows that there is a vector z for which Az ≥ 0 and for which b·z < 0, i.e., a vector z that makes a non-obtuse angle with every row of A and an obtuse angle with b. As b rotates clockwise this solution disappears precisely at the point at which b enters the cone spanned by the rows of A, but then there is a solution to yA = b with y > 0. This solution persists until b emerges from the cone spanned by the rows of A, but at this point there is again a solution to Az ≥ 0 and b·z < 0.
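The two alternatives are easy to probe numerically. A minimal sketch (the matrix and both b vectors are made up; with A the identity, the cone generated by the rows is the non-negative quadrant):

A = {{1, 0}, {0, 1}};
FindInstance[{y1, y2} . A == {1, 2} && y1 > 0 && y2 > 0, {y1, y2}]
(* b = {1, 2} lies inside the cone: alternative 1 has a solution *)
FindInstance[z1 >= 0 && z2 >= 0 && {-1, 2} . {z1, z2} < 0, {z1, z2}]
(* b = {-1, 2} lies outside: alternative 2 has a solution, e.g. z = {1, 0} *)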


    2.8.1 Application: Asset Pricing and Arbitrage

Consider a two period model of asset pricing. There are n assets which can be traded in the first period at prices p. The first period budget constraint limits an investor endowed with portfolio s to portfolios s′ satisfying

p·s′ ≤ p·s    (2.7)

Asset prices in the second period are uncertain and depend upon which of m possible states of nature occurs. It is common knowledge when first period trading takes place that the second-period price of the j th asset will be aᵢⱼ if the i th state occurs. Let A denote the corresponding m by n matrix of second period prices in which rows correspond to states and columns to assets. Holding portfolio s′ would then pay As′ in the second period, i.e., the i th component of this m-tuple would be the total value of the portfolio if the i th state occurred.

Note that the components of s′ are not required to be non-negative. Indeed, negative components correspond to short positions, e.g., s′₁ = −1 would be interpreted as taking a short position of one share on the first asset. This means the investor borrows a share of this asset from the market, sells it for p₁ and then uses the receipts to purchase other shares. The catch, of course, is that such loans must be repaid in the second period. Our investor would thus be required to purchase one share of the first asset in the second period, whatever its price turns out to be, to repay the first-period loan. The second-period solvency constraint is that the investor must be able to repay such loans or that holding the portfolio not entail bankruptcy in any state

As′ ≥ 0    (2.8)

It is important to realize that the components of As′ are the commodities that investors care about; the components of s′ only matter to the extent that they affect As′. Since p is a vector of security prices and securities are not themselves the focus of interest, the question arises of whether or not it is possible to identify an m-tuple of shadow prices, π, of the commodities of interest. Here πᵢ would be interpreted as the price of a claim to one dollar contingent upon state i occurring in a fictional shadow market. For such shadow prices to be interesting, trade opportunities in the shadow market would have to be equivalent to those in the actual markets, i.e., π would have to satisfy

p = πA, π > 0    (2.9)

Can we be sure that a solution to Equation 2.9 exists? Well, if we make the association y = π, b = p and z = s′, then Farkas Lemma states that either Equation 2.9 will have a solution or there will be a solution, s′, to

As′ ≥ 0, p·s′ < 0    (2.10)

An s′ that satisfied Equation 2.10 would be a good thing, too good in fact. It not only satisfies solvency, As′ ≥ 0, but also pumps money into the pocket of the investor in the first period since p·s′ < 0. In the context of the budget constraint, Equation 2.7, this means that

p·(λs′) = λ(p·s′) ≤ p·s

is satisfied for an arbitrarily large λ > 0 and thus that our investor could acquire infinite first period wealth. This is commonly called an arbitrage opportunity. If we make the reasonable supposition that p and A preclude such arbitrage opportunities, then the existence of shadow prices satisfying Equation 2.9 is guaranteed.
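When m = n and A is non-singular, the shadow prices can be computed directly from Equation 2.9. A minimal numeric sketch (all values are made up; the computed π turns out positive, so these prices admit no arbitrage):

A = {{1, 2}, {3, 1}};           (* second-period prices: rows = states, columns = assets *)
p = {1, 1};                     (* first-period asset prices *)
LinearSolve[Transpose[A], p]    (* solves pi.A == p, giving pi = {2/5, 1/5} > 0 *)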


Query 2.4. Suppose m = n = 2, that the two columns of A, A₁ and A₂, are linearly independent and that p·s = 1, i.e., our investor is worth one dollar in the first period.

1. In a graph of the positive quadrant of R², illustrate A₁ and A₂ and the points v₁ ≡ A₁/p₁ and v₂ ≡ A₂/p₂. Is either v₁ > v₂ or v₂ > v₁ consistent with no arbitrage opportunities?

2. Illustrate the budget constraint for contingent claims under the assumption that no arbitrage opportunities exist. Label the regions corresponding to long positions on both assets, to a long position on the first asset and a short position on the second and to a short position on the first asset and a long position on the second.

    3. What is the effect in your illustration of adding the solvency constraint to the budget constraint?

4. Is it possible to determine the shadow prices, π₁ and π₂, from your illustration and, if so, how?

Query 2.5. Suppose that no arbitrage opportunities exist and let x = As and x′ = As′. What is the budget constraint corresponding to Equation 2.7 on the previous page in terms of x, x′ and π? What is the solvency constraint corresponding to Equation 2.8 on the previous page?

Query 2.6. Suppose that a new asset is introduced, that Rank(A) = Rank(A|b) where b is the vector of state-dependent, second-period prices for the new asset, that no arbitrage opportunities exist either before or after the introduction of the new asset and that p = πA is the vector of first-period prices of the original assets. What must be the first-period price of the new asset?

Query 2.7. Suppose that no arbitrage opportunities exist and that there is a riskless portfolio, s, for which As = (1, 1, ..., 1)ᵀ. What is the one-period riskless rate of return? Hint: What is the first-period cost of buying claims to a sure, second-period dollar?

    2.9 Answers

Problem 2.2 on page 20. Suppose that Rank(A) = m = n and that A is not invertible. Then there must exist y, x, x′ ∈ Rⁿ with x ≠ x′ such that y = Ax = Ax′. But this means that A(x − x′) = 0 with (x − x′) ≠ 0 and thus the columns of A are linearly dependent, a contradiction. Conversely, suppose A is invertible and the columns of A are linearly dependent. Then there exist weights α = (α₁, ..., αₙ) ≠ 0 such that Aα = 0. Now choose any x ∈ Rⁿ and note that x′ = x + α ≠ x and yet A(x′) = A(x + α) = Ax; thus A is not invertible.

Problem 2.10 on page 25. Suppose AAᵀx = 0. Then

xᵀAAᵀx = 0 ⟹ (Aᵀx)ᵀ(Aᵀx) = 0 ⟹ ‖Aᵀx‖ = 0 ⟹ Aᵀx = 0

and, conversely, if Aᵀx = 0 then clearly AAᵀx = 0.

Problem 2.17 on page 30. Expanding the determinant yields

(a − λ)(d − λ) − bc = 0

or

λ² − (a + d)λ + ad − bc = 0

Using the quadratic formula yields

λ₁ = [ (a + d) + √((a + d)² − 4(ad − bc)) ] / 2

λ₂ = [ (a + d) − √((a + d)² − 4(ad − bc)) ] / 2

It follows immediately that

λ₁ + λ₂ = a + d = trace(A)

λ₁λ₂ = ad − bc = |A|


    Chapter 3

    Topology

    3.1 Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

    3.1.1 Countable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

    3.1.2 Uncountable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

    3.2 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

    3.2.1 Open Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    3.2.2 Closed Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

    3.2.3 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

    3.2.4 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    3.3 Topological Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

    3.3.1 Separation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

3.3.2 Generic Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

3.3.3 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

    3.4 Sigma Algebras and Measure Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 45

    3.5 Answers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

This chapter draws much from Simmons [1963], surely one of the most beautiful books about mathematics ever written.

    3.1 Counting

The subject of counting begins simply enough with thoughts of the positive integers, 1, 2, 3, ..., familiar to all of us. But counting was surely important to human beings even before such symbols were invented. Imagine a primitive society of sheep herders whose number system was limited to the symbols 1, 2, 3 and "several", i.e., more than 3. How might they have kept track of herds containing "several" sheep? One simple device might have been to place a stone in a pile for each sheep in the herd and then, each night, to remove a stone for each sheep accounted for. Stones left in the pile would then have indicated strays needing to be found.


    3.1.1 Countable Sets

Similarly, the infinite set

N = {1, 2, 3, ...}

containing all the positive integers or cardinal numbers, serves as a modern pile of stones. While this set is adequate for counting any non-empty, finite set, in mathematics there are many infinite sets just as, for the herdsmen, there were many herds with "several" sheep. The simple but profound idea of a one-to-one correspondence that met the needs of the herdsmen also permits comparing these infinite sets.

Definition 3.1. Two sets are said to be numerically equivalent if there is a one-to-one correspondence between the elements of the two sets.

Definition 3.2. A countable set is a set that is numerically equivalent to the positive integers.

Suppose, for example, that we want to compare the set consisting of all positive integers with the set consisting of all even positive integers. Since the pairing

1 ↔ 2,  2 ↔ 4,  3 ↔ 6,  ...,  n ↔ 2n,  ...

establishes a one-to-one correspondence, the two sets must be regarded as having the same number of elements even though one is a proper subset of the other. This situation is not unusual since every infinite set can, in fact, be put into a one-to-one correspondence with a proper subset of itself.

Similarly, there are exactly as many perfect squares as there are positive integers because these two sets can also be put in a one-to-one correspondence:

1 ↔ 1²,  2 ↔ 2²,  3 ↔ 3²,  ...,  n ↔ n²,  ...

As another example, consider the set of all positive rational numbers, i.e., ratios of positive integers. Surely this set is larger than the positive integers, right? No. The following array includes every positive rational number at least once

1/1  1/2  1/3  1/4  ···
2/1  2/2  2/3  2/4  ···
3/1  3/2  3/3  3/4  ···
 ⋮    ⋮    ⋮    ⋮    ⋱

and can be put into a one-to-one correspondence with the positive integers as follows:

1 ↔ 1/1,  2 ↔ 1/2,  3 ↔ 2/1,  4 ↔ 1/3,  5 ↔ 2/2,  6 ↔ 3/1,  7 ↔ 1/4,  8 ↔ 2/3,  9 ↔ 3/2,  ...
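The pattern walks the anti-diagonals of the array (entries p/q with p + q constant). A minimal Mathematica sketch of the enumeration (the function name is made up):

enumerate[n_] := Flatten[Table[p/(k + 1 - p), {k, 1, n}, {p, 1, k}]];
enumerate[4]
(* {1, 1/2, 2, 1/3, 1, 3, 1/4, 2/3, 3/2, 4}; note Mathematica reduces 2/2 to 1,
   which is why the array contains every positive rational "at least once" *)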

Problem 3.1. Construct a one-to-one correspondence between the set of integers,

{..., −2, −1, 0, 1, 2, ...}

and the set of positive integers.

So how many positive integers are there? The symbol ℵ₀, read "aleph null", is used to represent the number of elements or cardinality of the set. Our list of numbers now includes its first trans-finite number:

1 < 2 < 3 < ··· < ℵ₀


    3.1.2 Uncountable Sets

Not all sets with infinitely many elements are countable. Consider a countable sequence of points of the form x₁, x₂, x₃, ... where each element xᵢ is either 0 or 1 and a countable listing of these sequences such as:

s₁ = (0, 0, 0, 0, 0, 0, 0, ...)
s₂ = (1, 1, 1, 1, 1, 1, 1, ...)
s₃ = (0, 1, 0, 1, 0, 1, 0, ...)
s₄ = (1, 0, 1, 0, 1, 0, 1, ...)
s₅ = (1, 1, 0, 1, 0, 1, 1, ...)
s₆ = (0, 0, 1, 1, 0, 1, 1, ...)
s₇ = (1, 0, 0, 0, 1, 0, 0, ...)
...

It is possible to build a sequence s₀ in such a way that its first element is different from the first element of the first sequence in the list, its second element is different from the second element of the second sequence in the list, and, in general, its n th element is different from the n th element of the n th sequence in the list. For instance (the diagonal elements are bracketed):

s₁ = ([0], 0, 0, 0, 0, 0, 0, ...)
s₂ = (1, [1], 1, 1, 1, 1, 1, ...)
s₃ = (0, 1, [0], 1, 0, 1, 0, ...)
s₄ = (1, 0, 1, [0], 1, 0, 1, ...)
s₅ = (1, 1, 0, 1, [0], 1, 1, ...)
s₆ = (0, 0, 1, 1, 0, [1], 1, ...)
s₇ = (1, 0, 0, 0, 1, 0, [0], ...)
...
s₀ = (1, 0, 1, 1, 1, 0, 1, ...)

Note that the n th element of s₀ is in every case different from the bracketed element in the table above it and thus the new sequence is distinct from all the sequences in the list. From this it follows that the set T, consisting of all countable sequences of zeros and ones, cannot be put into a list s₁, s₂, s₃, .... Otherwise, it would be possible by the above process to construct a sequence s₀ which would both be in T (because it is a sequence of 0s and 1s) and at the same time not in T (because we deliberately construct it not to be in the list). Therefore T cannot be placed in one-to-one correspondence with the positive integers. In other words, T is uncountable.

Now consider the binary representation of a number between zero and one where, for example, 1/2 would be represented as 0.1, 1/4 would be 0.01 and so forth. Since the binary representation of a real number between zero and one must be a countable sequence of zeros and ones preceded by a decimal point, e.g., 0.1011011100..., and since the number of such sequences is uncountable, it follows that the set of real numbers lying between zero and one must also be uncountable.

Surely there are more real numbers than those lying between zero and one, right? No, the set of all real numbers and the set of real numbers between zero and one, or in any other interval, are numerically


Figure 3.1: One-to-one Correspondence Between an Interval and the Real Line. [The figure shows the interval ab bent into a semi-circle resting on the real line at 0, with a point P on the semi-circle projected to a point P′ on the line.]

equivalent. The one-to-one correspondence is illustrated in Figure 3.1. Simply bend the interval ab into a semi-circle, rest the result on the real line and then associate an arbitrary point P from the interval with that point P′ from the real line which corresponds to the intersection of a line from the center of the semi-circle through P with the real line.
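An explicit formula in the same spirit, assuming the interval is (0, 1) (the tangent map stretches the interval over the whole line):

f[x_] := Tan[Pi (x - 1/2)];
{f[1/4], f[1/2], f[3/4]}    (* {-1, 0, 1}; f maps (0, 1) one-to-one onto the real line *)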

We now have a new cardinal number, c, called the cardinal number of the continuum and our list of numbers now includes a second trans-finite number:

1 < 2 < 3 < ··· < ℵ₀ < c

Problem 3.2. The Cantor set is obtained as follows. First let C₁ denote the closed unit interval [0, 1]. Next delete from C₁ the open interval (1/3, 2/3) corresponding to the middle third of C₁ to get C₂ and note that C₂ = [0, 1/3] ∪ [2/3, 1]. Now delete the open middle thirds of the two closed intervals to get

C₃ = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1]

Continuing in this fashion we obtain a sequence of closed sets, each of which contains all its successors. The Cantor set is defined by

C = C₁ ∩ C₂ ∩ C₃ ∩ ···

1. Each Cₙ consists of a number of disjoint closed intervals of equal length. How many closed intervals are there in C₃₀?

2. The intervals removed have lengths 1/3, 2/9, 4/27, ..., 2ⁿ⁻¹/3ⁿ, ... What is the combined length of the intervals that have been removed? Hint: Let Mathematica evaluate

    Sum[2^(n-1)/3^n, {n,1,Infinity}]
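A sketch of the construction itself, which may help with part 1 (the helper name next is made up):

next[intervals_] := Flatten[
  {{#[[1]], #[[1]] + (#[[2]] - #[[1]])/3},
   {#[[2]] - (#[[2]] - #[[1]])/3, #[[2]]}} & /@ intervals, 1];
Nest[next, {{0, 1}}, 2]
(* {{0, 1/9}, {2/9, 1/3}, {2/3, 7/9}, {8/9, 1}}, the four intervals of C_3 *)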

You might be surprised at this point to learn that the cardinality of C is equal to c, i.e., the same as C₁.

An interesting consequence is that since the rational numbers are countable but the real numbers are not, the set of irrational numbers must be uncountable as well or, more poetically:

The rational numbers are spotted along the real line like stars against a black sky, and the dense blackness of the background is the firmament of the irrationals.

E. T. Bell

Are there any cardinal numbers between ℵ₀ and c? No one knows the answer to this question though Cantor himself thought that no such number exists. There are, on the other hand, cardinal numbers larger than c: the number of elements in the class of all subsets of R, for example. This is one consequence of the following theorem.

Theorem 7. If X is any non-empty set, then the cardinal number of X is less than the cardinal number of the class of all subsets of X.


Suppose, for example, that X = {1}; then there are two subsets, ∅ and {1}. If X = {1, 2}, then there are four subsets, ∅, {1}, {2} and {1, 2}. Similarly, X = {1, 2, 3} has eight subsets and, in general, if X has n elements, then there are 2ⁿ subsets.
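Mathematica will enumerate these subsets directly:

Subsets[{1, 2, 3}]                   (* all eight subsets, beginning with the empty set {} *)
Length[Subsets[{1, 2, 3}]] == 2^3    (* True *)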

Continuing into the infinite realm, if X has ℵ₀ elements then there are 2^ℵ₀ > ℵ₀ subsets. Which is larger, c or 2^ℵ₀? As noted above, the cardinality of the unit interval, c, is the same as the cardinality of the set of all countable sequences of zeros and ones. Consider the one-to-one mapping between the set of all subsets of the natural numbers and the set of all countable sequences of zeros and ones defined by

f(S) = (x₁ˢ, x₂ˢ, ...)

where

xᵢˢ = 1 if i ∈ S, and xᵢˢ = 0 otherwise

If, for example, S = {2, 3, 5} then f(S) = (0, 1, 1, 0, 1