Lecture 7
Econ 2001
2015 August 18
Lecture 7 Outline
First, the theorem of the maximum, an amazing result about continuity in optimization problems. Then, we start linear algebra, mostly looking at familiar definitions.
1. Theorem of the Maximum
2. Matrices
3. Matrix Algebra
4. Inverse of a Matrix
5. System of Linear Equations
6. Span and Basis
Berge’s Theorem
Also called the Theorem of the Maximum, Berge's theorem provides conditions under which, in a constrained maximization problem, the maximum value and the maximizing vectors are continuous with respect to the parameters.
Theorem
Let A ⊂ Rm and X ⊂ Rn both be non-empty. Assume f : A × X → R is a continuous function and that ϕ : A → 2^X is a continuous correspondence with ϕ(a) compact and non-empty for all a ∈ A. For all a ∈ A define

h(a) = max_{x∈ϕ(a)} f(a, x)   and   µ(a) = {x ∈ ϕ(a) : h(a) = f(a, x)}.

Then µ(a) is non-empty for all a ∈ A, µ is upper hemicontinuous, and h is continuous.
REMARK
If µ is a function, then it is a continuous function, since a singleton-valued upper hemicontinuous correspondence is a continuous function.
µ(a) is non-empty for all a ∈ A
Proof.
By assumption, ϕ(a) is compact and non-empty for all a ∈ A, and f is continuous. Since a continuous function on a compact set attains its maximum (extreme value theorem), µ(a) is non-empty for all a.
Prove that µ is upper hemicontinuous by contradiction.
Proof.
Since ϕ is uhc and ϕ(a) is closed for all a, ϕ has a closed graph (see the theorem from last class).

If µ is not uhc, there are a sequence {an} in A converging to some â ∈ A and an ε > 0 such that, for every n, there is a point xn ∈ µ(an) that does not belong to Bε(µ(â)). The contradiction is to find a subsequence x_{nk} that converges to a point x∗ ∈ µ(â).

Since ϕ is uhc, there is a δ > 0 such that if a belongs to Bδ(â), then ϕ(a) ⊂ Bε(ϕ(â)). Since ϕ(â) is compact, the set C = {x ∈ Rn : ‖x − y‖ ≤ ε for some y ∈ ϕ(â)} is compact. Since lim_{n→∞} an = â, there is a positive integer N such that an ∈ Bδ(â) when n ≥ N. Therefore xn ∈ µ(an) ⊂ ϕ(an) ⊂ C for n ≥ N, hence xn belongs to the compact set C for n ≥ N.

By Bolzano–Weierstrass, there exists a subsequence x_{nk} that converges to some x∗ ∈ X. Call this subsequence xn again. Since ϕ has a closed graph, we know that x∗ ∈ ϕ(â).

So lim_{n→∞} an = â ∈ A, xn ∈ µ(an) for all n, and lim_{n→∞} xn = x∗ ∈ ϕ(â). Next, show that x∗ ∈ µ(â). Since x∗ ∈ ϕ(â) we know that f(â, x∗) ≤ max_{x∈ϕ(â)} f(â, x) = h(â). If we prove that f(â, x∗) = h(â) we are done (why? because then x∗ ∈ µ(â), while xn ∉ Bε(µ(â)) for all n, contradicting the fact that a subsequence of the xn converges to x∗).
Suppose that f(â, x∗) < h(â). Then there exists x̂ ∈ ϕ(â) such that f(â, x̂) > f(â, x∗). Because ϕ is lhc, for each n there is an x̂n ∈ ϕ(an) such that lim_{n→∞} x̂n = x̂. Since f is continuous, lim_{n→∞} f(an, x̂n) = f(â, x̂) > f(â, x∗). Similarly, lim_{n→∞} f(an, xn) = f(â, x∗). Therefore there is a positive integer K such that for n ≥ K, f(an, x̂n) > f(an, xn), which is impossible because xn ∈ µ(an) and f(an, xn) = max_{x∈ϕ(an)} f(an, x). A contradiction.

This contradiction proves that f(â, x∗) = h(â), and so x∗ ∈ µ(â).
This completes the proof that µ is upper hemicontinuous.
Proof that h is continuous.
Proof.
Let {an} be a sequence in A converging to â ∈ A. We must show that h(an) converges to h(â).

For each n, let xn ∈ µ(an) and let x̄ ∈ µ(â), so that h(an) = f(an, xn) for all n, and h(â) = f(â, x̄). Suppose not: the sequence h(an) does not converge to h(â).

Then there exist ε > 0 and a subsequence nk such that |h(a_{nk}) − h(â)| ≥ ε for all k. Hence |f(a_{nk}, x_{nk}) − f(â, x̄)| ≥ ε for all k. (∗)

Since µ is uhc and the sets µ(a) are closed for all a, we know that µ has a closed graph. By uhc of µ, there is a γ > 0 such that if a ∈ Bγ(â), then µ(a) ⊂ B1(µ(â)); hence µ(a) is contained in the compact set G = {x ∈ X : ‖x − y‖ ≤ 1 for some y ∈ µ(â)}. Since lim_{k→∞} a_{nk} = â, we may assume that a_{nk} ∈ Bγ(â) for all k, and hence x_{nk} ∈ µ(a_{nk}) ⊂ G for all k.

Since G is compact, Bolzano–Weierstrass implies there is a subsequence of x_{nk}, call it x_{nk} again, such that lim_{k→∞} x_{nk} = x∗ for some x∗ ∈ G.

Since lim_{k→∞} a_{nk} = â and lim_{k→∞} x_{nk} = x∗ and, for all k, x_{nk} ∈ µ(a_{nk}), and since µ has a closed graph, it follows that x∗ ∈ µ(â). Therefore f(â, x∗) = h(â) = f(â, x̄). But then (∗) and the continuity of f imply that ε ≤ lim_{k→∞} |f(a_{nk}, x_{nk}) − f(â, x̄)| = |f(â, x∗) − f(â, x̄)| = 0, a contradiction.

This contradiction proves that lim_{n→∞} h(an) = h(â), and so h is continuous.
This is an amazing result. When solving a constrained optimization problem, if

- the objective function is continuous, and
- the correspondence defining the constraint set is continuous, compact-valued, and non-empty-valued,

then

- the problem has a solution,
- the optimized value function is continuous in the parameters, and
- the correspondence defining the optimal choice set is upper hemicontinuous; if this correspondence is a function, it is a continuous function.
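A quick numerical sketch can make this concrete. The objective f(a, x) = −(x − a)², the constant constraint set [0, 1], and the grid below are our own illustrative choices, not from the lecture; the point is only that the approximate value function h and maximizer µ move continuously with the parameter a.

```python
# Numerical sketch of Berge's theorem.
# f(a, x) = -(x - a)**2, constraint set phi(a) = [0, 1] (constant,
# hence a continuous, compact- and non-empty-valued correspondence).
def f(a, x):
    return -(x - a) ** 2

grid = [i / 1000 for i in range(1001)]          # discretization of [0, 1]

def h(a):
    """Approximate value function h(a) = max over [0,1] of f(a, x)."""
    return max(f(a, x) for x in grid)

def mu(a):
    """Approximate maximizer: the grid point attaining the maximum."""
    return max(grid, key=lambda x: f(a, x))

# Both h and mu vary continuously as the parameter a varies.
for a in [0.25, 0.5, 0.75]:
    print(a, round(h(a), 6), mu(a))
```

Here h(a) = 0 and µ(a) = a for a ∈ [0, 1] (up to grid error), with no jumps as a moves.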
Matrices
Definition
An m × n matrix is an element of M_{m×n} (the set of all m × n matrices) and is written as

A = [ α11 α12 ··· α1n
      α21 α22 ··· α2n
      ⋮            ⋮
      αm1 αm2 ··· αmn ] = [αij]

where m denotes the number of rows and n denotes the number of columns.
An m × n matrix is just a collection of mn numbers organized in a particular way.

We can think of a matrix as an element of R^{m×n} if all entries are real numbers. The extra notation makes it possible to distinguish the way that the numbers are organized.
Vectors
Example

A (2×3) = [ 0 1 5
            6 0 2 ]
Notation
Vectors are a special case of matrices:
x = [ x1
      x2
      ⋮
      xn ] ∈ M_{n×1}

This notation emphasizes that we think of a vector with n components as a matrix with n rows and 1 column.
Transpose of a Matrix

Definition
The transpose of a matrix A is denoted A^t. To get the transpose of a matrix, we let the first row of the original matrix become the first column of the new (transposed) matrix, and so on for the remaining rows:

A^t = [ α11 α21 ··· αm1
        α12 α22 ··· αm2
        ⋮            ⋮
        α1n α2n ··· αmn ] = [αji]
Clearly, if A ∈Mm×n , then At ∈Mn×m .
DefinitionA matrix A is symmetric if A = At .
Example
Continuing the previous example, we see that

A^t (3×2) = [ 0 6
              1 0
              5 2 ]
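The "rows become columns" rule can be checked with a small sketch (pure Python; the helper name `transpose` is our own):

```python
def transpose(A):
    """Return A^t: entry (i, j) of A becomes entry (j, i) of A^t."""
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

A = [[0, 1, 5],
     [6, 0, 2]]                 # the 2x3 matrix from the example
print(transpose(A))             # → [[0, 6], [1, 0], [5, 2]]
```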
Matrix Algebra: Addition
Definition (Matrix Addition)
If A (m×n) = [αij] and B (m×n) = [βij], then their sum is taken entry by entry:

A + B = D (m×n) = [ α11+β11 α12+β12 ··· α1n+β1n
                    α21+β21 α22+β22 ··· α2n+β2n
                    ⋮                         ⋮
                    αm1+βm1 αm2+βm2 ··· αmn+βmn ] = [δij] = [αij + βij]
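A minimal sketch of entry-by-entry addition (the helper name `mat_add` is our own):

```python
def mat_add(A, B):
    """D = [a_ij + b_ij]; A and B must have the same shape."""
    assert len(A) == len(B) and len(A[0]) == len(B[0])
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

print(mat_add([[1, 2], [3, 4]], [[10, 20], [30, 40]]))   # → [[11, 22], [33, 44]]
```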
Matrix Algebra: Multiplication
Definition (Matrix Multiplication)
If A (m×k) and B (k×n) are given, then we define

A (m×k) · B (k×n) = C (m×n) = [cij]

such that

cij ≡ Σ_{l=1}^{k} a_il b_lj
Note that the only index being summed over is l .
Matrix Algebra: Multiplication
Example
Let

A (2×3) = [ 0 1 5
            6 0 2 ]

and

B (3×2) = [ 0 3
            1 0
            2 3 ]

Then

A · B (2×2) = [ (0×0)+(1×1)+(5×2)   (0×3)+(1×0)+(5×3)
                (6×0)+(0×1)+(2×2)   (6×3)+(0×0)+(2×3) ]

            = [ 11 15
                 4 24 ]
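The example above can be verified with a short sketch of the rule cij = Σ_l a_il b_lj (the helper name `mat_mul` is our own):

```python
def mat_mul(A, B):
    """C = A·B with c_ij = sum over l of a_il * b_lj.
    Requires cols(A) == rows(B); only the index l is summed over."""
    k = len(B)
    assert len(A[0]) == k
    return [[sum(A[i][l] * B[l][j] for l in range(k))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[0, 1, 5],
     [6, 0, 2]]
B = [[0, 3],
     [1, 0],
     [2, 3]]
print(mat_mul(A, B))            # → [[11, 15], [4, 24]], as in the example
```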
Matrix Algebra: Multiplication
REMARK
In general:

A · B ≠ B · A
Example
As in the previous example,

A (2×3) = [ 0 1 5
            6 0 2 ]

and

B (3×2) = [ 0 3
            1 0
            2 3 ]

Here A · B is 2×2 while B · A is 3×3, so the two products cannot be equal. And if B had instead been, say, 3×4, the product A · B would be defined but the product B · A would not be defined at all!
Matrix Algebra: Square and Identity Matrices
Definition
Any matrix that has the same number of rows as columns is known as a square matrix, and denoted A (n×n).

Definition
The identity matrix is denoted In and is equal to

In (n×n) = [ 1 0 ··· 0
             0 1 ··· 0
             ⋮   ⋱   ⋮
             0 0 ··· 1 ]
REMARK
Any matrix multiplied by the identity matrix gives back the original matrix: for any matrix A (m×n),

A (m×n) · In = A (m×n)   and   Im · A (m×n) = A (m×n)
Diagonal and Triangular Matrices
Definition
A square matrix is called a diagonal matrix if aij = 0 whenever i ≠ j.

Definition
A square matrix is called an upper triangular matrix (resp. lower triangular) if aij = 0 whenever i > j (resp. i < j).
Diagonal matrices are easy to deal with. Triangular matrices are also tractable.

In many applications you can replace an arbitrary square matrix with a related diagonal matrix (a super useful property in macro and metrics). We will prove this.
Matrix Inversion
Definition
We say a matrix A (n×n) is invertible or non-singular if there exists B (n×n) such that

A · B = B · A = In

If A is invertible, we denote its inverse as A^{-1}. So we get

A · A^{-1} = A^{-1} · A = In
A square matrix that is not invertible is called singular.
Determinant of a Matrix
Definition
The determinant of a matrix A (written det A = |A|) is defined inductively as follows.

For n = 1, A (1×1): det A = |A| ≡ a11.

For n ≥ 2, A (n×n):

det A = |A| ≡ a11|A−11| − a12|A−12| + a13|A−13| − ··· ± a1n|A−1n|

where A−1j is the (n − 1) × (n − 1) matrix formed by deleting the first row and jth column of A.
Note
The determinant is useful primarily because a matrix is invertible if and only if its determinant is non-zero.
Determinant of a Matrix
Example
If

A (2×2) = [aij] = [ a11 a12
                    a21 a22 ]   ⇒   det A = a11a22 − a12a21

Example
If

A (3×3) = [aij] = [ a11 a12 a13
                    a21 a22 a23
                    a31 a32 a33 ]

then

det A = a11 | a22 a23 |  −  a12 | a21 a23 |  +  a13 | a21 a22 |
            | a32 a33 |         | a31 a33 |         | a31 a32 |
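The inductive definition translates directly into a recursive sketch (expansion along the first row; the helper name `det` is our own):

```python
def det(A):
    """Determinant by expansion along the first row, as defined above:
    det A = a11|A_-11| - a12|A_-12| + ... ± a1n|A_-1n|."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # A_-1j: delete the first row and the j-th column
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                      # → -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))     # → 24
```

For a diagonal matrix the determinant is just the product of the diagonal entries, as the second call illustrates.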
Adjoint and Inverse
Definition
The adjoint of a matrix A (n×n) is the n × n matrix whose entry ij equals

(adj A)ij = (−1)^{i+j} det A−ji

where A−ji is the (n − 1) × (n − 1) matrix formed by deleting the jth row and ith column of A.
Example
If A is a (2×2) matrix, then

adj A = [  a22 −a12
          −a21  a11 ]
FACT
The inverse of a matrix A (n×n) with det A ≠ 0 is given by

A^{-1} = (1 / det A) · adj A
Example
If A is a (2×2) matrix and invertible, then

A^{-1} = (1 / det A) · [  a22 −a12
                         −a21  a11 ]
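A sketch of the 2×2 inverse formula A^{-1} = (1/det A) · adj A, with a check that A · A^{-1} = I2 (the helper name `inv2` is our own):

```python
def inv2(A):
    """Inverse of a 2x2 matrix via A^{-1} = (1/det A) * adj A."""
    (a, b), (c, d) = A
    det_A = a * d - b * c
    if det_A == 0:
        raise ValueError("matrix is singular")
    # adj A = [[d, -b], [-c, a]], divided entry-wise by det A
    return [[d / det_A, -b / det_A],
            [-c / det_A, a / det_A]]

A = [[1, 2],
     [3, 4]]
B = inv2(A)
# Check A · A^{-1} = I2:
print([[sum(A[i][l] * B[l][j] for l in range(2)) for j in range(2)]
       for i in range(2)])
```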
Inner Product
Definition
If x, y ∈ M_{n×1}, then the inner product (or dot product or scalar product) is given by

x^t y = x1y1 + x2y2 + ··· + xnyn = Σ_{i=1}^{n} xi yi
Note that xty = ytx.
Notation
We usually write the inner product x · y and read it "x dot y":

x · y = Σ_{i=1}^{n} xi yi
Inner Product, Distance, and Norm
Remember the Euclidean distance is

d(x, y) = ||x − y||

where

||z|| = √(z1² + z2² + ··· + zn²) = √(Σ_{i=1}^{n} zi²)
Under the Euclidean metric, the distance between two points is the length ofthe line segment connecting the points.
In this case ||z||, the distance between 0 and z, is the norm of z.
FACT
The norm comes from the inner product: ||z||² = z · z.
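A minimal sketch of the inner product and the norm, checking ||z||² = z · z (the helper names `dot` and `norm` are our own):

```python
import math

def dot(x, y):
    """Inner product x · y = sum of x_i * y_i."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(z):
    """Euclidean norm ||z|| = sqrt(z · z)."""
    return math.sqrt(dot(z, z))

z = [3, 4]
print(dot(z, z), norm(z))       # → 25 5.0  (so ||z||^2 = z · z)
```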
Orthogonality

Definition
We say that x and y are orthogonal (at right angles, perpendicular) if and only if their inner product is zero.
Two vectors are orthogonal whenever x · y = 0. This follows from the Law of Cosines: if a triangle has sides A, B, and C and the angle θ is opposite the side C, then

c² = a² + b² − 2ab cos(θ),

where a, b, and c are the lengths of A, B, and C respectively. Take A and B to be the vectors x and y, let θ be the angle between x and y, and notice that C is the vector x − y. Then

(x − y) · (x − y) = x · x + y · y − 2(x · y)

while the law of cosines gives

||x − y||² = ||x||² + ||y||² − 2||x|| ||y|| cos(θ).

Hence

x · y = ||x|| ||y|| cos(θ),   and   (x · y) / (||x|| ||y||) = cos(θ)
1. The inner product of two non-zero vectors is zero if and only if the cosine of the angle between them is zero (cosine = 0 means they are perpendicular).
2. Since the absolute value of the cosine is less than or equal to one, ||x|| ||y|| ≥ |x · y|.
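The cosine formula cos(θ) = (x · y) / (||x|| ||y||) can be sketched as follows (helper names are our own):

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cos_angle(x, y):
    """cos(theta) = (x · y) / (||x|| ||y||), for non-zero x and y."""
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))

print(cos_angle([1, 0], [0, 1]))     # → 0.0 (orthogonal vectors)
print(cos_angle([1, 1], [2, 2]))     # close to 1 (same direction)
# Note |cos| <= 1 implies ||x|| ||y|| >= |x · y|, as stated above.
```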
Systems of Linear Equations
Consider the system of m linear equations in n variables:
y1 = α11x1 + α12x2 + ··· + α1nxn
y2 = α21x1 + α22x2 + ··· + α2nxn
⋮
yi = αi1x1 + αi2x2 + ··· + αinxn
⋮
ym = αm1x1 + αm2x2 + ··· + αmnxn

where the variables are the xj.
This can be written using matrix notation.
Matrices and Systems of Linear Equations
Notation
A system of m linear equations in n variables can be written as

y (m×1) = A (m×n) · x (n×1)

where

y = [ y1          x = [ x1          A (m×n) = [ α11 α12 ··· α1n
      y2                x2                      α21 α22 ··· α2n
      ⋮                 ⋮                       ⋮            ⋮
      ym ]              xn ]                    αm1 αm2 ··· αmn ] = [αij]

thus

[ y1         [ α11 α12 ··· α1n       [ x1
  y2     =     α21 α22 ··· α2n    ·    x2
  ⋮            ⋮            ⋮          ⋮
  ym ]         αm1 αm2 ··· αmn ]       xn ]
Facts about solutions to linear equations
DefinitionA system of equations of the form Ax = 0 is called homogeneous.
A homogeneous system always has a solution (x = 0). This solution is not unique if there are more unknowns than equations, or if there are as many equations as unknowns and A is singular.
Theorem
When A is square, the system Ax = y has a unique solution if and only if A is nonsingular.

If A is nonsingular, the solution is x = A^{-1}y. If not, then there is a nonzero z such that Az = 0; in that case, whenever you can find one solution to Ax = y, you can find infinitely many (if x solves the system, so does x + λz for any λ).
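For the 2×2 case, x = A^{-1}y can be sketched directly from the inverse formula seen earlier (the helper name `solve2` and the two-equation example system are our own):

```python
def solve2(A, y):
    """Solve A x = y for a nonsingular 2x2 A via x = A^{-1} y."""
    (a, b), (c, d) = A
    det_A = a * d - b * c
    if det_A == 0:
        raise ValueError("A is singular: no unique solution")
    # x = (1/det A) * adj(A) * y
    return [(d * y[0] - b * y[1]) / det_A,
            (-c * y[0] + a * y[1]) / det_A]

# Example system:  x1 + 2 x2 = 5,  3 x1 + 4 x2 = 11  →  x = (1, 2)
print(solve2([[1, 2], [3, 4]], [5, 11]))   # → [1.0, 2.0]
```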
Matrices as Linear Functions
REMARK: Matrices as functions
A matrix is a function that maps vectors into vectors. This function applies a linear transformation to the vector x to get another vector y:
A (m×n) · x (n×1) = y (m×1)

or

[ α11 α12 ··· α1n       [ x1         [ y1
  α21 α22 ··· α2n   ·     x2     =     y2
  ⋮            ⋮          ⋮            ⋮
  αm1 αm2 ··· αmn ]       xn ]         ym ]
Linear Independence

Rn is a vector space, so sums and scalar multiples of elements of Rn are also elements of it. Given some vectors in Rn, any linear combination of them is also in Rn.
Definition
Let X be a vector space over R. A linear combination of x1, ..., xn ∈ X is a vector of the form

y = Σ_{i=1}^{n} αi xi   where α1, ..., αn ∈ R

αi is the coefficient of xi in the linear combination.
Definition
A collection of vectors {x1, ..., xk}, where each xi ∈ X (a vector space over R), is linearly independent if

Σ_{i=1}^{k} λi xi = 0 if and only if λi = 0 for all i.

Equivalently, the collection {x1, ..., xk} ⊂ X is linearly independent if and only if

Σ_{i=1}^{k} λi xi = 0  ⇒  λi = 0 for all i.
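For n vectors in Rn, independence can be checked with the determinant: the vectors are linearly independent iff the matrix with these vectors as rows has non-zero determinant (a standard fact, not proved in these slides). A sketch, with helper names of our own:

```python
def det(A):
    """Determinant by expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(n))

def independent(vectors):
    """For n vectors in R^n: linearly independent iff the matrix with
    these vectors as rows has non-zero determinant."""
    return det(vectors) != 0

print(independent([[1, 1], [-1, 1]]))        # → True
print(independent([[1, 2], [2, 4]]))         # → False (second = 2 × first)
```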
Span and Dimension

The span of a collection of vectors is the set of all their linear combinations.
Definition
If V = {v1, ..., vk} ⊂ X, the span of V is the set of all linear combinations of elements of V:

span V = { y ∈ X : y = Σ_{i=1}^{k} λi vi with vi ∈ V and λi ∈ R }
Fact
span(V) is the smallest vector space containing all of the vectors in V.
Definition
A set V ⊂ X spans X if span V = X.

Definition
The dimension of a vector space V is the smallest number of vectors that span V.

Example
Rn has dimension n. You can span all of it with only n vectors.
Span and Linear Independence
Theorem
If X = {x1, ..., xk} is a linearly independent collection of vectors in Rn and z ∈ span(X), then there are unique λ1, ..., λk such that z = Σ_{i=1}^{k} λi xi.
Proof.
Existence follows from the definition of span. For uniqueness, take two linear combinations of the elements of X that yield z, so that

z = Σ_{i=1}^{k} λi xi   and   z = Σ_{i=1}^{k} λ′i xi.

Subtract one equation from the other to obtain

z − z = 0 = Σ_{i=1}^{k} (λ′i − λi) xi.

By linear independence, λ′i − λi = 0 for all i, as desired.
(Hamel) Basis
Definition
A basis for a vector space V is given by a linearly independent set of vectors in V that spans V.
Remark
A basis must satisfy two conditions:

1. it is linearly independent;
2. it spans the vector space.

Example
{(1, 0), (0, 1)} is a basis for R2 (this is the standard basis; see the Problem Set).
Basis
Example
{(1, 1), (−1, 1)} is another basis for R2:

Let (x, y) = α(1, 1) + β(−1, 1) for some α, β ∈ R. Therefore x = α − β and y = α + β:

x + y = 2α  ⇒  α = (x + y)/2
y − x = 2β  ⇒  β = (y − x)/2

so

(x, y) = ((x + y)/2)(1, 1) + ((y − x)/2)(−1, 1).

Since (x, y) is an arbitrary element of R2, {(1, 1), (−1, 1)} spans R2.

If (x, y) = (0, 0), then α = (0 + 0)/2 = 0 and β = (0 − 0)/2 = 0, so the coefficients are all zero, and therefore {(1, 1), (−1, 1)} is linearly independent.

Since it is linearly independent and spans R2, it is a basis.
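The change-of-basis formulas α = (x + y)/2 and β = (y − x)/2 derived above can be checked numerically (the helper name `coords` is our own):

```python
def coords(x, y):
    """Coordinates (alpha, beta) of (x, y) in the basis {(1,1), (-1,1)}:
    alpha = (x + y)/2, beta = (y - x)/2."""
    return (x + y) / 2, (y - x) / 2

alpha, beta = coords(3.0, 5.0)
print(alpha, beta)                                      # → 4.0 1.0
# Reconstruct: alpha*(1,1) + beta*(-1,1) gives back (3, 5).
print((alpha * 1 + beta * -1, alpha * 1 + beta * 1))    # → (3.0, 5.0)
```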
Basis
Example
{(1, 0), (0, 1), (1, 1)} is not a basis for R2 (why?).

1(1, 0) + 1(0, 1) + (−1)(1, 1) = (0, 0)

so the set is not linearly independent.
Example
{(1, 0, 0), (0, 1, 0)} is not a basis of R3, because it does not span R3 (why?).
Span, Basis, and Linear Independence
Theorem
If X = {x1, ..., xk} is a linearly independent collection of vectors in Rn and z ∈ span(X), then there are unique λ1, ..., λk such that z = Σ_{i=1}^{k} λi xi.
Put together this statement and the fact that Rn has dimension n.
Take X = V, a basis for Rn. Then span(X) = span(V) = Rn. Therefore any vector in Rn can be written uniquely as Σ_{i=1}^{n} λi vi.
Tomorrow
More linear algebra.
1. Eigenvectors and eigenvalues
2. Diagonalization
3. Quadratic Forms
4. Definiteness of Quadratic Forms
5. Unique representation of vectors