THEORY OF ORDINARY DIFFERENTIAL EQUATIONS
Review of Advanced Calculus Topics
John A. Burns
Center for Optimal Design And Control
Interdisciplinary Center for Applied Mathematics
Virginia Polytechnic Institute and State University
Blacksburg, Virginia 24061-0531
MATH 5245
FALL 2012
Topics
Review of Differentiation for Vector Valued Functions
Partial and Directional Derivatives
Derivatives (Fréchet)
Gradients, Jacobians and Hessians
Matrix Theory
Calculus for $F: D(F) \subseteq R^n \to R^m$
We review calculus for vector-valued functions of n variables. We then go to the infinite dimensional case. The following references are good …
Robert G. Bartle, The Elements of Real Analysis, John Wiley & Sons, New York, 1976.
L. V. Kantorovich and G. P. Akilov, Functional Analysis in Normed Spaces, Pergamon Press, New York, 1964.
M. Z. Nashed, Differentiability and Related Properties of Nonlinear Operators: Some Aspects of the Role of Differentials in Nonlinear Functional Analysis, in Nonlinear Functional Analysis and Applications, Academic Press, New York, 1971.
J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
A. E. Taylor, An Introduction to Functional Analysis, John Wiley & Sons, New York, 1967.
E. Zeidler, Applied Functional Analysis, Springer-Verlag, New York, 1995.
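Finite Dimensional Spaces
COLUMN VECTORS
\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \in R^n \ (x_i \in R), \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \in C^n \ (x_i \in C) \]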
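NORMS (SAME for Complex Spaces)
For $x = [x_1, x_2, \dots, x_n]^T \in R^n$:
\[ \|x\|_1 = \sum_{i=1}^{n} |x_i|, \qquad \|x\|_2 = \Big( \sum_{i=1}^{n} |x_i|^2 \Big)^{1/2}, \]
\[ \|x\|_p = \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p} \ (1 \le p < \infty), \qquad \|x\|_\infty = \max_{1 \le i \le n} |x_i| . \]
Here $|r| = \sqrt{r^2}$ is the absolute value of a real number $r$, and $|c| = \sqrt{a^2 + b^2}$ is the modulus of a complex number $c = a + ib$.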
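A norm on $R^n$ is a function $\|\cdot\| : R^n \to R^+$ satisfying
i) $\|x\| \ge 0$, and $\|x\| = 0$ if and only if $x = 0$,
ii) $\|\alpha x\| = |\alpha| \, \|x\|$ for $x \in R^n$ and $\alpha \in R$ (or $x \in C^n$ and $\alpha \in C$),
iii) $\|x + y\| \le \|x\| + \|y\|$ for $x, y \in R^n$ (or $C^n$) (the triangle inequality).
Notation:
$R$ = Real Numbers
$R^+ = [0, +\infty)$
$C$ = Complex Numbers
$R^n$ = n-dimensional Euclidean Space
$C^n$ = n-dimensional Complex Space
$x^T = [x_1, x_2, \dots, x_n]$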
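Finite Dimensional Spaces
The vector space $R^n$ with a norm $\|\cdot\|$ is a normed linear space. In particular, the pair $(R^n, \|\cdot\|_2)$ is called n-dimensional Euclidean space.
[Figure: the unit spheres $\{x : \|x\|_2 = (\sum_{i=1}^{n} x_i^2)^{1/2} = 1\}$ and $\{x : \|x\|_1 = \sum_{i=1}^{n} |x_i| = 1\}$ in the $(x_1, x_2)$ plane]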
The distance between two vectors x and y in Rn is given by dist(x,y) = ||x - y||.
The geometry and differentiability of different norms are DIFFERENT!
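Finite Dimensional Spaces
Equivalent Norm Theorem: If $(X, \|\cdot\|)$ is one of the finite dimensional normed linear spaces $(R^n, \|\cdot\|)$ or $(C^n, \|\cdot\|)$, then for any $p \ge 1$ there are constants $m > 0$ and $M > 0$ such that for all $x \in X$
\[ m \|x\|_p \le \|x\| \le M \|x\|_p . \]
EXAMPLES
\[ \|x\|_2 \le \|x\|_1 \le \sqrt{n}\, \|x\|_2, \qquad \|x\|_\infty \le \|x\|_2 \le \sqrt{n}\, \|x\|_\infty, \qquad \|x\|_\infty \le \|x\|_1 \le n\, \|x\|_\infty \]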
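The three example inequalities between the 1-, 2- and ∞-norms can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the helper names (`norm_1`, `norm_2`, `norm_inf`, `check_equivalence`) and the test vector are hypothetical, and a check on one vector is of course not a proof.

```python
import math

def norm_1(x):
    # sum of absolute values
    return sum(abs(xi) for xi in x)

def norm_2(x):
    # Euclidean norm
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

def norm_inf(x):
    # largest absolute component
    return max(abs(xi) for xi in x)

def check_equivalence(x, tol=1e-12):
    """Return True if x satisfies the three example inequalities (up to tol)."""
    n = len(x)
    n1, n2, ninf = norm_1(x), norm_2(x), norm_inf(x)
    return (
        n2 <= n1 + tol and n1 <= math.sqrt(n) * n2 + tol
        and ninf <= n2 + tol and n2 <= math.sqrt(n) * ninf + tol
        and ninf <= n1 + tol and n1 <= n * ninf + tol
    )

x = [3.0, -4.0, 12.0]  # hypothetical test vector
print(norm_1(x), norm_2(x), norm_inf(x))  # 19.0 13.0 12.0
print(check_equivalence(x))               # True
```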
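Finite Dimensional Spaces
Open Ball About A Point: $B(\hat{x}, \delta) = \{ x \in X : \|x - \hat{x}\| < \delta \}$
Closed Ball About A Point: $\bar{B}(\hat{x}, \delta) = \{ x \in X : \|x - \hat{x}\| \le \delta \}$
For k = 1, 2, 3, … let $x^k = [x_1^k \ x_2^k \ \dots \ x_n^k]^T$ be a sequence of vectors in $R^n$. We say that $\{x^k\}$ converges to $x$ in $R^n$ if
\[ \lim_{k \to \infty} \|x^k - x\| = 0, \]
and we write $x^k \to x$. Equivalently: for each $\varepsilon > 0$ there is a $K(\varepsilon) \ge 1$ such that if $k \ge K(\varepsilon)$, then $\|x^k - x\| < \varepsilon$.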
Finite Dimensional Spaces
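NOTE: Because of the Equivalent Norm Theorem, if $\|\cdot\|_p$ is any norm on $R^n$, then
\[ \lim_{k \to \infty} \|x^k - x\|_2 = 0 \quad \text{if and only if} \quad \lim_{k \to \infty} \|x^k - x\|_p = 0 . \]
NOTE: $x^k \to x$ as $k \to \infty$ IMPLIES that $x_i^k \to x_i$ as $k \to \infty$ for i = 1, 2, …, n,
AND: $x_i^k \to x_i$ as $k \to \infty$ for all i = 1, 2, …, n IMPLIES $x^k \to x$ as $k \to \infty$.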
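Inner Products
An inner product on $R^n$ is a mapping $\langle \cdot, \cdot \rangle : R^n \times R^n \to R$ such that
i) $\langle x, x \rangle \ge 0$, and $\langle x, x \rangle = 0$ if and only if $x = 0$,
ii) $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$ for $\alpha \in R$ and $x, y \in R^n$,
iii) $\langle x, y \rangle = \langle y, x \rangle$ for $x, y \in R^n$,
iv) $\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle$ for $x, y, z \in R^n$.
EXAMPLE: $\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i = x_1 y_1 + \dots + x_n y_n = y^T x \ (= x^T y)$
An inner product on $C^n$ is a mapping $\langle \cdot, \cdot \rangle : C^n \times C^n \to C$ such that
i) $\langle x, x \rangle$ is real, $\langle x, x \rangle \ge 0$, and $\langle x, x \rangle = 0$ if and only if $x = 0$,
ii) $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$ for $\alpha \in C$ and $x, y \in C^n$,
iii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for $x, y \in C^n$,
iv) $\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle$ for $x, y, z \in C^n$.
EXAMPLE: $\langle x, y \rangle = \sum_{i=1}^{n} x_i \bar{y}_i = x_1 \bar{y}_1 + \dots + x_n \bar{y}_n = \bar{y}^T x \ (= y^* x)$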
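Inner Product Spaces
NOTE: For all $x, y, z \in C^n$ and $\alpha \in C$,
\[ \langle x, y + z \rangle = \overline{\langle y + z, x \rangle} = \overline{\langle y, x \rangle} + \overline{\langle z, x \rangle} = \langle x, y \rangle + \langle x, z \rangle \]
and
\[ \langle x, \alpha y \rangle = \overline{\langle \alpha y, x \rangle} = \overline{\alpha \langle y, x \rangle} = \bar{\alpha}\, \overline{\langle y, x \rangle} = \bar{\alpha} \langle x, y \rangle . \]
A norm $\|\cdot\|$ and an inner product $\langle \cdot, \cdot \rangle$ are compatible if $\|x\| = \sqrt{\langle x, x \rangle}$, or $\|x\|^2 = \langle x, x \rangle$.
EXAMPLE: $\|x\|_2 = \Big( \sum_{i=1}^{n} |x_i|^2 \Big)^{1/2} = \Big( \sum_{i=1}^{n} x_i \bar{x}_i \Big)^{1/2} = \sqrt{\langle x, x \rangle}$
REMARK: If $p \ne 2$ then there is NO inner product $\langle \cdot, \cdot \rangle$ with $\|x\|_p = \sqrt{\langle x, x \rangle}$.
If $\langle \cdot, \cdot \rangle$ is an inner product on $R^n$, then the pair $(R^n, \langle \cdot, \cdot \rangle)$ is called an inner product space. Same for $(C^n, \langle \cdot, \cdot \rangle)$.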
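Results for Inner Products
Schwarz Inequality: $|\langle x, y \rangle| \le \|x\| \, \|y\|$
Parallelogram Law: $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$
Two vectors $x, y \in R^n$ (or $C^n$) are called orthogonal if $\langle x, y \rangle = 0$.
Pythagorean Theorem: If $\langle x, y \rangle = 0$, then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.
[Figure: vectors x and y together with x + y]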
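One way to see why a p-norm with p ≠ 2 cannot come from an inner product is that it violates the Parallelogram Law, which every norm compatible with an inner product must satisfy. A minimal sketch, assuming real vectors in $R^2$ and hypothetical helper names (`norm_p`, `parallelogram_gap`):

```python
def norm_p(x, p):
    # p-norm of a real vector, for finite p >= 1
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def parallelogram_gap(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2, measured in the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (norm_p(s, p) ** 2 + norm_p(d, p) ** 2
            - 2.0 * norm_p(x, p) ** 2 - 2.0 * norm_p(y, p) ** 2)

e1, e2 = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_gap(e1, e2, 2))  # zero up to rounding: the law holds for p = 2
print(parallelogram_gap(e1, e2, 1))  # 4.0: the law fails for p = 1
```

For $e_1$ and $e_2$ in the 1-norm the left side is $2^2 + 2^2 = 8$ while the right side is $2 + 2 = 4$, which accounts for the gap of 4.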
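Matrix Notation
We use standard matrix notation ... for real matrices
\[ A = [a_{i,j}] = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \in R^{m \times n}, \qquad A^T = [a_{i,j}]^T = \begin{bmatrix} a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\ a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\ \vdots & \vdots & & \vdots \\ a_{1,n} & a_{2,n} & \cdots & a_{m,n} \end{bmatrix} \in R^{n \times m}, \]
\[ I = I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \in R^{n \times n} \]
AND standard terms ... diagonal, tridiagonal, upper triangular, …
symmetric: $A^T = A$
skew-symmetric: $A^T = -A$
positive definite: $x^T A x > 0$ for $x \ne 0$, $x \in R^n$
orthogonal: $A^T A = I$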
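Matrix Notation
We use standard matrix notation ... for complex matrices
\[ A = [a_{i,j}] = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \in C^{m \times n}, \qquad A^T = [a_{i,j}]^T = \begin{bmatrix} a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\ a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\ \vdots & \vdots & & \vdots \\ a_{1,n} & a_{2,n} & \cdots & a_{m,n} \end{bmatrix} \in C^{n \times m}, \]
\[ A^* = \bar{A}^T = [\bar{a}_{i,j}]^T = \begin{bmatrix} \bar{a}_{1,1} & \bar{a}_{2,1} & \cdots & \bar{a}_{m,1} \\ \bar{a}_{1,2} & \bar{a}_{2,2} & \cdots & \bar{a}_{m,2} \\ \vdots & \vdots & & \vdots \\ \bar{a}_{1,n} & \bar{a}_{2,n} & \cdots & \bar{a}_{m,n} \end{bmatrix} \in C^{n \times m}, \qquad I = I_n \in R^{n \times n} \ \text{(the identity matrix)} \]
AND standard terms ... diagonal, tridiagonal, upper triangular, …
self-adjoint (Hermitian): $A^* = A$
skew-adjoint: $A^* = -A$
positive definite: $x^* A x > 0$ for $x \ne 0$, $x \in C^n$
unitary: $A^* A = I$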
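Matrix Notation
MATRIX NORMS ... for $A = [a_{i,j}] \in R^{m \times n}$
FROBENIUS NORM ...
\[ \|A\|_F = \Big( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{i,j}|^2 \Big)^{1/2} \]
P - NORM ...
\[ \|A\|_p = \sup_{x \ne 0} \frac{\|A x\|_p}{\|x\|_p} \]
FACTS
\[ \|A\|_2 \le \|A\|_F \le \sqrt{n}\, \|A\|_2, \qquad \max_{i,j} |a_{i,j}| \le \|A\|_2 \le \sqrt{mn}\, \max_{i,j} |a_{i,j}|, \qquad \|A\|_2 \le \sqrt{\|A\|_1 \|A\|_\infty}, \]
\[ \|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{i,j}|, \qquad \|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{i,j}| \]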
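The FACTS above can be checked on a small matrix. The sketch below is an illustration only: all function names are hypothetical, the matrix is an arbitrary choice, and the 2-norm is estimated by power iteration on $A^T A$, which is a swapped-in numerical technique (not taken from the slides) and can fail if the starting vector happens to be orthogonal to the dominant singular direction.

```python
import math

def matvec(A, v):
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_norm(A):
    return math.sqrt(sum(a * a for row in A for a in row))

def one_norm(A):
    # largest absolute column sum
    return max(sum(abs(a) for a in col) for col in zip(*A))

def inf_norm(A):
    # largest absolute row sum
    return max(sum(abs(a) for a in row) for row in A)

def two_norm(A, iters=200):
    # power iteration on A^T A; estimates the largest singular value of A
    At = transpose(A)
    v = [1.0] * len(A[0])
    for _ in range(iters):
        w = matvec(At, matvec(A, v))
        size = math.sqrt(sum(wj * wj for wj in w))
        if size == 0.0:
            return 0.0
        v = [wj / size for wj in w]
    Av = matvec(A, v)
    return math.sqrt(sum(a * a for a in Av))

A = [[1.0, 2.0], [3.0, 4.0]]  # hypothetical test matrix
m, n = len(A), len(A[0])
n1, n2, ninf, nF = one_norm(A), two_norm(A), inf_norm(A), frobenius_norm(A)
amax = max(abs(a) for row in A for a in row)
print(n1, ninf)                                  # 6.0 7.0
print(n2 <= nF <= math.sqrt(n) * n2)             # Frobenius bounds
print(amax <= n2 <= math.sqrt(m * n) * amax)     # entrywise bounds
print(n2 <= math.sqrt(n1 * ninf))                # 1-norm / inf-norm bound
```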
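Matrix Notation
\[ A = [a_{i,j}] = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \in R^{m \times n}, \qquad B = [b_{i,j}] = \begin{bmatrix} b_{1,1} & b_{1,2} & \cdots & b_{1,s} \\ b_{2,1} & b_{2,2} & \cdots & b_{2,s} \\ \vdots & \vdots & & \vdots \\ b_{n,1} & b_{n,2} & \cdots & b_{n,s} \end{bmatrix} \in R^{n \times s} \]
THESE MATRIX NORMS ARE MUTUALLY CONSISTENT ...
\[ \|A B\|_F \le \|A\|_F \|B\|_F, \qquad \|A B\|_p \le \|A\|_p \|B\|_p \]
IF Q AND Z ARE BOTH ORTHOGONAL, THEN
\[ \|Q A Z\|_F = \|A\|_F, \qquad \|Q A Z\|_2 = \|A\|_2 \]
Gene H. Golub and Charles Van Loan, Matrix Computations, third edition, The Johns Hopkins University Press, London, 1996.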
Functions of n Variables
Let $D(F) \subseteq R^n$ be a subset of $R^n$ and let $F: D(F) \to R^m$ be a function with domain $D(F) \subseteq R^n$ and range $R(F)$ in $R^m$.
One of the most fundamental problems in engineering and science is: Given $y \in R^m$, find $x \in D(F) \subseteq R^n$ such that
\[ F(x) = y \qquad (1) \]
EXISTENCE: If F is onto $R^m$ (i.e., if $R(F) = R^m$), then there exists at least one solution to (1).
UNIQUENESS: If F is one-to-one (i.e., $F(x_1) = F(x_2)$ implies that $x_1 = x_2$), then the solution to (1) is unique.
MOST IMPORTANT PROBLEMS
ACTUALLY COMPUTING THE SOLUTION
FIND CHECKABLE CONDITIONS FOR EXISTENCE & UNIQUENESS
DEVELOP FAST & ACCURATE ALGORITHMS - SOFTWARE TOOLS
Functions of n Variables
The function F is locally one-to-one at the point $x \in D(F) \subseteq R^n$, if there is a $\delta > 0$ such that F restricted to the set $D(F) \cap B(x, \delta)$ is one-to-one.
The function F is one-to-one on the set $\Omega \subseteq D(F) \subseteq R^n$, if F restricted to the set $\Omega$ is one-to-one.
If F is one-to-one on the set $\Omega \subseteq D(F) \subseteq R^n$, then there exists a function $F^{-1}$ from $F(\Omega) \subseteq R^m$ into $R^n$ such that
$F(F^{-1}(y)) = y$ for all $y \in F(\Omega)$ AND $F^{-1}(F(x)) = x$ for all $x \in \Omega$.
The function $A: D(A) \to R^m$ is linear if
i) $D(A)$ is a linear subspace of $R^n$
AND
ii) if $\alpha, \beta \in R$ and $x_1, x_2 \in D(A)$, then $A(\alpha x_1 + \beta x_2) = \alpha A(x_1) + \beta A(x_2)$.
A linear function $\ell: R^n \to R$ is called a linear functional.
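Functions of n Variables
Let $F: D(F) \subseteq R^n \to R^m$ and $G: D(G) \subseteq R^m \to R^k$ be functions. The composite function $G \circ F : D(G \circ F) \subseteq R^n \to R^k$ is defined on the domain
\[ D(G \circ F) = \{ x \in D(F) : F(x) \in D(G) \} \subseteq D(F) \subseteq R^n \]
by $[G \circ F](x) = G(F(x))$.
[Figure: $x \in R^n$ is mapped by F to $F(x) \in R^m$ and then by G to $G(F(x)) \in R^k$; the composite arrow is $G \circ F$]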
Calculus
If $F: D(F) \subseteq R^n \to R^m$ is a function with domain $D(F) \subseteq R^n$ and range $R(F)$ in $R^m$, then F is continuous at $p \in R^n$, if
i) $p \in D(F)$
ii) For each $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$, such that if $x \in D(F) \cap B(p, \delta)$, then $\|F(x) - F(p)\|_{R^m} < \varepsilon$.
We say that F is continuous, if F is continuous at each $x \in D(F)$.
NOTE: i) and ii) are equivalent to ...
\[ \lim_{\|x - p\|_{R^n} \to 0} \|F(x) - F(p)\|_{R^m} = 0 \]
The function $F: D(F) \subseteq R^n \to R^m$ is a contraction if there is an $\alpha$ with $0 \le \alpha < 1$ such that if $x_0, x_1 \in D(F)$, then $\|F(x_1) - F(x_0)\|_{R^m} \le \alpha \|x_1 - x_0\|_{R^n}$.
Abuse of Notation
Let $D(F) \subseteq R^n$ be a subset of $R^n$ and let $F: D(F) \subseteq R^n \to R^m$ be a function with domain $D(F)$ in $R^n$ and range $R(F)$ in $R^m$. For $x = [x_1 \ x_2 \ \dots \ x_n]^T$ we write
\[ F(x) = \begin{bmatrix} F_1(x) \\ F_2(x) \\ \vdots \\ F_m(x) \end{bmatrix} = F\left( \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \right) = \begin{bmatrix} F_1(x_1, x_2, \dots, x_n) \\ F_2(x_1, x_2, \dots, x_n) \\ \vdots \\ F_m(x_1, x_2, \dots, x_n) \end{bmatrix} = F(x_1, x_2, \dots, x_n) . \]
Notation
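If $A: D(A) \subseteq R^n \to R^m$ is a linear function with domain $D(A)$, then one can show that A is continuous. Moreover, A is bounded. In particular, there is an $M \ge 0$ such that for all $x \in D(A)$ one has
\[ \|A(x)\|_{R^m} \le M \|x\|_{R^n} . \]
The operator norm of A is defined to be
\[ \|A\| = \sup_{x \ne 0} \frac{\|A(x)\|_{R^m}}{\|x\|_{R^n}} . \]
RECALL: All linear operators from $R^n$ to $R^m$ have a matrix representation. Actually, if one selects bases for $R^n$ and $R^m$, then there exists an $m \times n$ matrix
\[ \mathbf{A} = [a_{i,j}] = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \in R^{m \times n} \]
such that
\[ A(x) = \mathbf{A} x \qquad (\text{OPERATOR } A(x), \ \text{MATRIX } \mathbf{A}) . \]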
Matrix Representations
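The standard unit vectors are $e_i = [0 \ 0 \ \dots \ 0 \ 1 \ 0 \ \dots \ 0]^T$, with the 1 in the ith position.
EXAMPLE: $A(x)$ = rotate clockwise by $90^\circ$, so that $x = [x_1, x_2]^T$ is mapped to $[x_2, -x_1]^T$:
\[ A(x) = \mathbf{A} x = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_2 \\ -x_1 \end{bmatrix} \]
WARNING: The use of matrices as representations of linear operators is important. However, be sure not to confuse the representation (the MATRIX $\mathbf{A}$) with the operator $A$.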
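Partial Derivatives
Let $F: D(F) \subseteq R^n \to R^1$ be a real-valued function of n variables. The partial derivative $\frac{\partial}{\partial x_i} F(p)$ at p is a number $D_i = D_i(p)$ satisfying: For each $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$, such that for each t with $-\delta < t < \delta$, $[p + t e_i] \in D(F)$ and
\[ |F(p + t e_i) - F(p) - D_i t| \le \varepsilon |t| \qquad (2) \]
or equivalently,
\[ |F(p_1, p_2, \dots, p_i + t, \dots, p_n) - F(p_1, p_2, \dots, p_i, \dots, p_n) - D_i t| \le \varepsilon |t| . \qquad (3) \]
In limit form,
\[ \lim_{t \to 0} \Big| \frac{1}{t} [F(p + t e_i) - F(p)] - D_i \Big| = 0 . \]
APPLYING THE CHAIN RULE (IF POSSIBLE)
\[ \frac{d}{dt} F(p + t e_i) \Big|_{t=0} = \frac{\partial}{\partial x_i} F(p) = \frac{\partial}{\partial x_i} F(x) \Big|_{x=p} \]
NOTE: The partial derivatives will be denoted by several symbols ...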
Partial Derivatives
\[ D_i F(p) = \frac{\partial}{\partial x_i} F(p) = \frac{\partial F(p)}{\partial x_i} = \frac{\partial}{\partial x_i} F(x) \Big|_{x=p} \]
IMPORTANT REMARK: The partial derivative of a real-valued function F at a point p is a NUMBER!
HIGHER ORDER PARTIAL DERIVATIVES ...
\[ \frac{\partial^2}{\partial x_i^2} F(p) = \frac{\partial^2}{\partial x_i^2} F(x) \Big|_{x=p} \]
NEED GENERAL NOTATION: Let $s = (s_1, s_2, \dots, s_n)$ be a multi-index, where each $s_i$ is a non-negative integer. Let $|s| = s_1 + s_2 + \dots + s_n$ and define the mixed partial derivative $D^s F(p)$ by
\[ D^s F(p) = \frac{\partial^{|s|}}{\partial x_1^{s_1} \, \partial x_2^{s_2} \cdots \partial x_n^{s_n}} F(p) . \]
AGREE THAT $D^{(0,0,\dots,0)} F(p) = D^0 F(p) = F(p)$.
Partial Derivatives
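EXAMPLE: In $R^2$ if s = (1,1), then
\[ D^s F(p) = D^{(1,1)} F(p) = \frac{\partial^{1+1}}{\partial x_1^1 \, \partial x_2^1} F(p) = \frac{\partial^2}{\partial x_1 \partial x_2} F(p) . \]
EXAMPLE: In $R^2$ if s = (2,0), then
\[ D^s F(p) = D^{(2,0)} F(p) = \frac{\partial^{2+0}}{\partial x_1^2 \, \partial x_2^0} F(p) = \frac{\partial^2}{\partial x_1^2} F(p) . \]
EXAMPLE: In $R^2$ if s = (2,0) and r = (0,2), then
\[ [D^s + D^r] F(p) = D^{(2,0)} F(p) + D^{(0,2)} F(p) = \frac{\partial^2}{\partial x_1^2} F(p) + \frac{\partial^2}{\partial x_2^2} F(p) = \Delta F(p) . \]
EXAMPLE: In $R^2$ if s = (1,0) and r = (0,1), then
\[ [D^s + D^r] F(p) = D^{(1,0)} F(p) + D^{(0,1)} F(p) = \frac{\partial}{\partial x_1} F(p) + \frac{\partial}{\partial x_2} F(p) . \]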
Variations and Differentials
Let $\eta \in R^n$ be a “direction”. If there is a $\delta = \delta(p, \eta) > 0$ such that for each t with $-\delta < t < \delta$, $[p + t\eta] \in D(F)$ and the limit (in $R^1$)
\[ \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} = \lim_{t \to 0} \frac{1}{t} [F(p + t\eta) - F(p)] \]
exists, then this limit is called the directional derivative of F at p in the direction $\eta$.
There are various notations and terms used for this derivative:
\[ \delta F(p;\eta) = \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} = V(p, \eta) \]
The first variation of F at p in the direction of $\eta$,
The directional derivative of F at p in the direction $\eta$ (misleading),
The Gateaux variation at p in the direction $\eta$ ...
If $F: D(F) \subseteq R^n \to R^1$, then the partial derivative $\frac{\partial}{\partial x_i} F(p)$ at p is the first variation of F in the direction of the unit vector $e_i$:
\[ \frac{\partial F(p)}{\partial x_i} = \frac{\partial}{\partial x_i} F(p) = \lim_{t \to 0} \frac{1}{t} [F(p + t e_i) - F(p)] = \delta F(p; e_i) . \]
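The first variation can be estimated numerically by replacing the limit with a small difference quotient. The sketch below is an illustration under assumptions: `first_variation` and the test function `F` are hypothetical names, and a central difference is used in place of the one-sided quotient in the definition. A central difference can return a number even when the limit does not exist, so it should only be trusted for functions known to be smooth near p.

```python
def first_variation(F, p, eta, t=1e-6):
    """Central-difference estimate of d/dt F(p + t*eta) at t = 0."""
    plus = [pi + t * ei for pi, ei in zip(p, eta)]
    minus = [pi - t * ei for pi, ei in zip(p, eta)]
    return (F(plus) - F(minus)) / (2.0 * t)

def F(x):
    # hypothetical smooth test function F(x1, x2) = x1^2 * x2 + 3 * x2
    return x[0] ** 2 * x[1] + 3.0 * x[1]

p = [1.0, 2.0]
e1, e2 = [1.0, 0.0], [0.0, 1.0]

# Exact partial derivatives at p: dF/dx1 = 2*x1*x2 = 4 and dF/dx2 = x1^2 + 3 = 4.
print(first_variation(F, p, e1))          # close to 4
print(first_variation(F, p, e2))          # close to 4
print(first_variation(F, p, [1.0, 1.0]))  # close to 8 for this smooth F
```

Taking $\eta = e_i$ recovers the partial derivatives, matching the identity $\delta F(p; e_i) = \frac{\partial}{\partial x_i} F(p)$.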
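Variations and Differentials
? WHEN DO THESE VARIATIONS EXIST ?
REQUIRES: For sufficiently small t, $p + t\eta \in D(F)$ AND the limit
\[ \lim_{t \to 0} \frac{1}{t} [F(p + t\eta) - F(p)] \]
EXISTS, in which case
\[ \delta F(p;\eta) = \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} . \]
If for all i = 1, 2, …, n the partial derivatives
\[ \frac{\partial}{\partial x_i} F(p) = \frac{\partial F(p)}{\partial x_i} = \frac{\partial}{\partial x_i} F(x) \Big|_{x=p} \]
exist, then we can define the gradient vector
\[ \nabla F(p) = \begin{bmatrix} \frac{\partial}{\partial x_1} F(p) \\ \frac{\partial}{\partial x_2} F(p) \\ \vdots \\ \frac{\partial}{\partial x_n} F(p) \end{bmatrix} . \]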
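Variations and Differentials
IF the CHAIN RULE can be used ...
\[ \frac{d}{dt} F(p_1 + t\eta_1, p_2 + t\eta_2, \dots, p_n + t\eta_n) = \sum_{i=1}^{n} \frac{\partial}{\partial x_i} F(p_1 + t\eta_1, p_2 + t\eta_2, \dots, p_n + t\eta_n) \, \eta_i \]
At t = 0
\[ \frac{d}{dt} F(p_1 + t\eta_1, p_2 + t\eta_2, \dots, p_n + t\eta_n) \Big|_{t=0} = \sum_{i=1}^{n} \frac{\partial}{\partial x_i} F(p_1, p_2, \dots, p_n) \, \eta_i = \langle \eta, \nabla F(p) \rangle \]
Thus
\[ \delta F(p;\eta) = \langle \eta, \nabla F(p) \rangle = [\nabla F(p)]^T \eta \qquad (4) \]
WARNING: (4) does not always hold!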
The Gateaux Derivative
If $\delta F(p;\eta)$ exists for all $\eta \in R^n$ AND is linear in $\eta$, then we can define the Gateaux Derivative.
NOTE: $\nabla F(p) \in R^n$ is a vector. It is not a “derivative”. So, what do we mean by a “derivative”? We start with a weak notion of the derivative and then extend it to the stronger notion.
Assume $F: D(F) \subseteq R^n \to R^1$ and $p \in \mathrm{int}[D(F)]$. If the first variation $\delta F(p;\eta)$ of F at p exists for all $\eta \in R^n$ AND for $\eta_1, \eta_2 \in R^n$,
\[ \delta F(p; \eta_1 + \eta_2) = \delta F(p; \eta_1) + \delta F(p; \eta_2), \]
(i.e. $\delta F(p;\eta)$ is linear in $\eta$), then we say that F is Gateaux-differentiable at p. If $\ell = \ell_p$ is the (unique) linear functional $\ell_p: R^n \to R^1$ defined by $\ell_p(\eta) = \delta F(p;\eta)$, then $\ell_p$ is called the Gateaux Derivative of F at p.
The Gateaux Derivative is a linear functional.
Gateaux Derivative
To make the Gateaux derivative something useful, we need the following version of the Riesz Representation Theorem.
Riesz Representation Theorem. If $\ell: R^n \to R^1$ is a linear functional on $R^n$, then
i) $\ell$ is continuous,
and
ii) there exists a fixed vector $a \in R^n$ (depending on $\ell$) such that for all $\eta \in R^n$
\[ \ell(\eta) = \langle \eta, a \rangle \qquad (5) \]
If F is Gateaux differentiable at p we can apply the Riesz Representation Theorem to $\ell_p(\eta) = \delta F(p;\eta)$, so it follows that there is a unique vector $a = a(p) \in R^n$ such that for all $\eta \in R^n$,
\[ \delta F(p;\eta) = \ell_p(\eta) = \langle \eta, a(p) \rangle = [a(p)]^T \eta = \sum_{i=1}^{n} [a(p)]_i \, \eta_i . \qquad (6) \]
Gateaux Derivative
IMPORTANT: The linear functional $\ell_p(\eta) = \delta F(p;\eta)$ is the Gateaux derivative at p. The vector $a(p)$ is just a matrix representation of this linear functional.
Note that the linear functional $\delta F(p;\cdot) = \ell_p: R^n \to R^1$ is the Gateaux derivative of F at $p \in \mathrm{int}[D(F)]$ if and only if for each $\eta \in R^n$
\[ \lim_{t \to 0} \Big| \frac{1}{t} [F(p + t\eta) - F(p)] - \ell_p(\eta) \Big| = 0 \qquad (7) \]
OR EQUIVALENTLY
\[ \lim_{t \to 0} \Big| \frac{1}{t} [F(p + t\eta) - F(p)] - [a(p)]^T \eta \Big| = 0 . \qquad (8) \]
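Gateaux Derivative
If $F: D(F) \subseteq R^n \to R^1$ is Gateaux-differentiable at $p \in \mathrm{int}[D(F)]$, then the partial derivatives $\frac{\partial}{\partial x_i} F(p)$ at p exist. Let $a(p) = [a(p)_1, a(p)_2, \dots, a(p)_n]^T \in R^n$ be the vector defined by (5). It follows that
\[ a(p) = \nabla F(p) = \begin{bmatrix} \frac{\partial}{\partial x_1} F(p) \\ \frac{\partial}{\partial x_2} F(p) \\ \vdots \\ \frac{\partial}{\partial x_n} F(p) \end{bmatrix} . \qquad (9) \]
We have the following theorem.
Theorem D1. If $F: D(F) \subseteq R^n \to R^1$ is Gateaux differentiable at $p \in \mathrm{int}[D(F)]$, then the gradient of F at p exists, and $\ell_p(\eta) = \delta F(p;\eta)$ has the representation
\[ \delta F(p;\eta) = \langle \eta, \nabla F(p) \rangle = \sum_{i=1}^{n} \frac{\partial}{\partial x_i} F(p) \, \eta_i . \qquad (10) \]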
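Gateaux Derivative
RECALL ... If $A: D(A) \subseteq R^n \to R^m$ is a linear function with domain $D(A) \subseteq R^n$ and range $R(A)$ in $R^m$, then the operator norm of A is defined to be
\[ \|A\| = \sup_{x \ne 0} \frac{\|A(x)\|_{R^m}}{\|x\|_{R^n}} . \qquad (11) \]
If $\delta F(p;\cdot) = \ell_p : R^n \to R^1$ is the Gateaux derivative of F at p, then it is a linear functional and the (operator) norm of $\ell_p$ is given by
\[ \|\ell_p\| = \sup_{\eta \ne 0} \frac{|\ell_p(\eta)|}{\|\eta\|_{R^n}} = \sup_{\eta \ne 0} \frac{|\langle \eta, \nabla F(p) \rangle|}{\|\eta\|_{R^n}} = \sup_{\eta \ne 0} \frac{|[\nabla F(p)]^T \eta|}{\|\eta\|_{R^n}} . \qquad (12) \]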
The Derivative
BUT … what is “the derivative” of a function $F: D(F) \subseteq R^n \to R^1$? Best to think of the derivative as a linear function that approximates F locally.
EXAMPLE: $F(x) = (x + 1)^2 - 1$, $\delta F(p;\eta) = 2(p + 1)\eta$
Let $\ell_p$ denote the linear function with slope $F'(p) = 2(p+1)$. If we look at $p = 0$, then $\ell_0(\eta) = 2\eta$ is a linear approximation to $F(\cdot)$ near $p = 0$.
[Figure: graphs of $F(\eta)$ and of the line $\ell_0(\eta) = \delta F(0;\eta)$ for $-2 \le \eta \le 3$; the two graphs touch at $p = 0$]
The Fréchet Derivative
Let $F: D(F) \subseteq R^n \to R^1$ be a real-valued function and assume $p \in \mathrm{int}[D(F)]$. We say that F is (Fréchet) differentiable at p if there exists a linear functional $D: R^n \to R^1$ such that for each $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$ such that if
\[ 0 < \|x - p\|_{R^n} < \delta, \]
then
i) $x \in D(F)$ (13)
ii) $|F(x) - F(p) - D(x - p)| \le \varepsilon \|x - p\|_{R^n}$ (14)
BASIC IDEA … If $F: D(F) \subseteq R^n \to R^1$, then the (Fréchet) derivative of F at a point p is a linear function $D: R^n \to R^1$ that “approximates F near p”.
Notation varies and we shall use ... $D = [DF(p)] = D_x F(p) = F'(p)$
The (Fréchet) Derivative is the LINEAR OPERATOR $D_x F(p)$
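The Fréchet Derivative
Let $F: D(F) \subseteq R^n \to R^1$ be (Fréchet) differentiable at p with derivative $D_x F(p): R^n \to R^1$. Given any $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$ such that if $0 < \|x - p\|_{R^n} < \delta$, then $x \in D(F)$ and
\[ |F(x) - F(p) - [D_x F(p)](x - p)| \le \varepsilon \|x - p\|_{R^n} . \]
Let $\eta = x - p$. Note that $p + \eta = x \in D(F)$, $0 < \|\eta\|_{R^n} < \delta$ and
\[ |F(p + \eta) - F(p) - [D_x F(p)](\eta)| \le \varepsilon \|\eta\|_{R^n}, \]
or equivalently,
\[ \frac{1}{\|\eta\|_{R^n}} |F(p + \eta) - F(p) - [D_x F(p)](\eta)| \le \varepsilon . \]
Thus, the linear functional $D_x F(p): R^n \to R^1$ is the (Fréchet) derivative of the function $F: D(F) \subseteq R^n \to R^1$ at x = p if and only if
\[ \lim_{\|\eta\|_{R^n} \to 0} \frac{1}{\|\eta\|_{R^n}} \big| [F(p + \eta) - F(p)] - [D_x F(p)](\eta) \big| = 0 . \qquad (15) \]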
The Fréchet Derivative
Theorem D2. If $F: D(F) \subseteq R^n \to R^1$ has a Fréchet derivative at $p \in \mathrm{int}[D(F)]$, then F has a Gateaux derivative at p and the two derivatives are equal. In particular, $[D_x F(p)](\eta) = \ell_p(\eta) = \delta F(p;\eta)$.
COMMENTS
There exist functions $F: D(F) \subseteq R^n \to R^1$ such that at $p \in \mathrm{int}[D(F)]$:
(1) F has a Gateaux differential (first variation) at p but F is not Gateaux differentiable at p.
(2) F has a Gateaux derivative at p but F is not Fréchet differentiable at p.
Problem (P1): Let $F: R^2 \to R^1$ be defined by
\[ F(x, y) = \begin{cases} \dfrac{x y^2}{x^2 + y^4}, & (x, y) \ne (0, 0) \\ 0, & (x, y) = (0, 0). \end{cases} \qquad (16) \]
Show that $\delta F(0;\eta)$ exists for all $\eta \in R^2$ but F does not have a Gateaux derivative at $0 = [0, 0]^T$. Also, F is NOT continuous at 0!
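A numerical experiment can suggest what Problem (P1) asks to be shown, without replacing the proof. The sketch below assumes that F is $x y^2 / (x^2 + y^4)$ away from the origin and 0 at the origin; the function names and the step size are hypothetical choices. It evaluates the difference quotient at the origin in three directions and shows that the results are not additive in the direction.

```python
def F(x, y):
    # assumed form of the function in Problem (P1)
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y ** 2 / (x ** 2 + y ** 4)

def variation_at_origin(eta1, eta2, t=1e-6):
    # difference quotient (1/t) * [F(0 + t*eta) - F(0)]
    return F(t * eta1, t * eta2) / t

a = variation_at_origin(1.0, 1.0)  # close to 1
b = variation_at_origin(1.0, 0.0)  # 0.0
c = variation_at_origin(0.0, 1.0)  # 0.0
print(a, b, c)
print(abs(a - (b + c)) > 0.5)      # True: the variation is not additive in eta

# Along the curve x = y^2 the function stays near 1/2, while F(0, 0) = 0.
print(F(1e-6, 1e-3))
```

For $\eta_1 \ne 0$ the quotient tends to $\eta_2^2 / \eta_1$, which is not a linear function of $\eta$.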
The Fréchet Derivative
Problem (P2): Let $F: R^2 \to R^1$ be defined by
\[ F(x, y) = \begin{cases} \dfrac{x^3 y}{x^4 + y^2}, & (x, y) \ne (0, 0) \\ 0, & (x, y) = (0, 0). \end{cases} \qquad (17) \]
Show that F has a Gateaux derivative $\delta F(0;\eta)$ at $p = 0 = [0, 0]^T$, but F does not have a Fréchet derivative at $p = 0 = [0, 0]^T$.
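The gap between the two notions of derivative in Problem (P2) can be observed numerically. The sketch below assumes that F is $x^3 y / (x^4 + y^2)$ away from the origin; the function names are hypothetical, and the experiment is an illustration rather than a proof. Along every straight line through the origin the difference quotient tends to 0, but along the parabola $y = x^2$ the Fréchet quotient with candidate derivative D = 0 stays near 1/2.

```python
import math

def F(x, y):
    # assumed form of the function in Problem (P2)
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 * y / (x ** 4 + y ** 2)

def gateaux_quotient(eta1, eta2, t):
    # (1/t) * [F(0 + t*eta) - F(0)], taken along a straight line
    return F(t * eta1, t * eta2) / t

def frechet_quotient(x, y):
    # |F(x, y) - F(0, 0) - 0| / ||(x, y)||_2, using the candidate derivative D = 0
    return abs(F(x, y)) / math.hypot(x, y)

for s in (1e-1, 1e-2, 1e-3):
    # first column shrinks to 0, second column stays near 0.5
    print(gateaux_quotient(1.0, 1.0, s), frechet_quotient(s, s * s))
```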
Problem (P3): Let $F: R^2 \to R^1$ be defined by
\[ F(x, y) = \begin{cases} \dfrac{2 y \, e^{-1/x^2}}{y^2 + e^{-2/x^2}}, & x \ne 0 \\ 0, & x = 0. \end{cases} \qquad (18) \]
Show that F has a Gateaux derivative $\delta F(0;\eta)$ at $p = 0 = [0, 0]^T$, but F is not continuous at $p = 0 = [0, 0]^T$.
The Fréchet Derivative
Problem (P4): Let $F: R^2 \to R^1$ be defined by
\[ F(x, y) = \begin{cases} \dfrac{x}{y}, & y \ne 0 \\ 0, & y = 0. \end{cases} \qquad (19) \]
Show that F has partial derivatives at $p = 0 = [0, 0]^T$, but $\delta F(0;\eta)$ does not exist unless $\eta_1 \eta_2 = 0$. Also, F is not continuous at (0, 0) = 0.
Problem (P5): Let $F: R^2 \to R^1$ be defined by
\[ F(x, y) = \begin{cases} x^2 + y^2, & \text{if both } x \text{ and } y \text{ are rational} \\ 0, & \text{otherwise.} \end{cases} \qquad (20) \]
Show that F is continuous at only one point $p = 0 = [0, 0]^T$, and yet F is Fréchet differentiable at $p = 0 = [0, 0]^T$.
The Fréchet Derivative
Let $D(F) \subseteq R^n$ be a subset of $R^n$ and let $F: D(F) \to R^1$ be a real-valued function of n variables. We say that F is differentiable on the set $\Omega \subseteq D(F)$, if F is Fréchet differentiable at each point $x \in \Omega$.
Mean Value Theorem For Functionals. Let $F: D(F) \subseteq R^n \to R^1$. Assume that F is Fréchet differentiable at all points in the open set $\Omega \subseteq D(F)$. If x and y are two elements in $\Omega$ such that the line segment $\{ z = \lambda x + (1 - \lambda) y : 0 \le \lambda \le 1 \}$ is contained in $\Omega$, then there is a c with 0 < c < 1 such that
\[ F(y) - F(x) = [D_x F(c x + (1 - c) y)](y - x) \qquad (21) \]
… OR by defining $\mathbf{c} = c x + (1 - c) y$, we have
\[ F(y) - F(x) = [D_x F(\mathbf{c})](y - x) \qquad (22) \]
IMPLIES
\[ |F(y) - F(x)| = |[D_x F(\mathbf{c})](y - x)| \le \|D_x F(\mathbf{c})\| \, \|y - x\| \qquad (23) \]
The Mean Value Theorem
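[Figure: two regions, each with points x and y and the point $\mathbf{c} = c x + (1 - c) y$ on the segment joining them. Left panel: ASSUMPTIONS SATISFIED. Right panel: ASSUMPTIONS FAIL.]
The Mean Value Theorem
[Figure: a convex region containing x, y and $\mathbf{c} = c x + (1 - c) y$]
If the set $\Omega$ is convex, then the theorem applies to all x and y in $\Omega$.
Let $F: D(F) \to R^1$ be a real valued function of n variables. If $\Omega \subseteq D(F)$, then we say that F is Hölder continuous on the set $\Omega$ if there exist constants $M \ge 0$ and $\alpha \in (0, 1]$ such that for all $x, y \in \Omega$,
\[ |F(x) - F(y)| \le M \|x - y\|^{\alpha} . \]
If $\alpha = 1$, then we say that F is Lipschitz continuous.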
Functions $F: D(F) \subseteq R^n \to R^m$
If $F: D(F) \subseteq R^n \to R^m$ is a function with domain $D(F) \subseteq R^n$ and range $R(F)$ in $R^m$, then F has the form
\[ F(x) = \begin{bmatrix} F_1(x) \\ F_2(x) \\ \vdots \\ F_m(x) \end{bmatrix} = F(x_1, x_2, \dots, x_n) = \begin{bmatrix} F_1(x_1, x_2, \dots, x_n) \\ F_2(x_1, x_2, \dots, x_n) \\ \vdots \\ F_m(x_1, x_2, \dots, x_n) \end{bmatrix}, \]
where for i = 1, 2, …, m, $F_i$ is a real valued function with domain $D(F_i) \subseteq R^n$ and range $R(F_i) \subseteq R^1$.
Each function $F_i: D(F) \to R^1$ fits into the previous framework. For example, the definitions and results involving partial derivatives, Gateaux variations, and Fréchet derivatives hold. The notation is as expected … e.g.
\[ D_j F_i(p) = \frac{\partial}{\partial x_j} F_i(p) = \frac{\partial F_i(p)}{\partial x_j} = \frac{\partial}{\partial x_j} F_i(x) \Big|_{x=p} \]
Functions $F: D(F) \subseteq R^n \to R^m$
Let $\eta \in R^n$ be a “direction”. If there is a $\delta = \delta(p, \eta) > 0$ such that for each t with $-\delta < t < \delta$, $[p + t\eta] \in D(F)$ and the limit (in $R^m$)
\[ \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} = \lim_{t \to 0} \frac{1}{t} [F(p + t\eta) - F(p)] \]
exists, then this limit is called the directional derivative of F at p in the direction $\eta$.
There are various notations and terms used for this derivative:
\[ \delta F(p;\eta) = \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} = V(p, \eta) \in R^m \]
The first variation of F at p in the direction of $\eta$,
The directional derivative of F at p in the direction $\eta$ (misleading),
The Gateaux variation at p in the direction $\eta$ ...
NOTE: These variations are vectors in the range space $R^m$.
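Variations and Differentials
? WHEN DO THESE VARIATIONS EXIST ?
REQUIRES: For sufficiently small t, $p + t\eta \in D(F)$ AND the limit
\[ \lim_{t \to 0} \frac{1}{t} [F(p + t\eta) - F(p)] = \delta F(p;\eta) \in R^m \]
EXISTS, in which case
\[ \delta F(p;\eta) = \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} . \]
If for all i = 1, 2, …, m and j = 1, 2, …, n the partial derivatives
\[ \frac{\partial}{\partial x_j} F_i(p) = \frac{\partial F_i(p)}{\partial x_j} = \frac{\partial}{\partial x_j} F_i(x) \Big|_{x=p} \]
exist, then we can define the gradient vectors
\[ \nabla F_i(p) = \begin{bmatrix} \frac{\partial}{\partial x_1} F_i(p) \\ \frac{\partial}{\partial x_2} F_i(p) \\ \vdots \\ \frac{\partial}{\partial x_n} F_i(p) \end{bmatrix}, \qquad i = 1, 2, \dots, m . \]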
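Jacobian Matrix
For each i = 1, 2, …, m the transposed gradient is the row vector
\[ [\nabla F_i(p)]^T = \Big[ \frac{\partial}{\partial x_1} F_i(p) \quad \frac{\partial}{\partial x_2} F_i(p) \quad \cdots \quad \frac{\partial}{\partial x_n} F_i(p) \Big] . \]
If for all i = 1, 2, …, m and j = 1, 2, …, n the partial derivatives
\[ \frac{\partial}{\partial x_j} F_i(p) = \frac{\partial F_i(p)}{\partial x_j} = \frac{\partial}{\partial x_j} F_i(x) \Big|_{x=p} \]
exist, then we can define the Jacobian matrix
\[ JF(p) = JF(x) \big|_{x=p} = \begin{bmatrix} \frac{\partial}{\partial x_1} F_1(p) & \frac{\partial}{\partial x_2} F_1(p) & \cdots & \frac{\partial}{\partial x_n} F_1(p) \\ \frac{\partial}{\partial x_1} F_2(p) & \frac{\partial}{\partial x_2} F_2(p) & \cdots & \frac{\partial}{\partial x_n} F_2(p) \\ \vdots & \vdots & & \vdots \\ \frac{\partial}{\partial x_1} F_m(p) & \frac{\partial}{\partial x_2} F_m(p) & \cdots & \frac{\partial}{\partial x_n} F_m(p) \end{bmatrix} \in R^{m \times n} . \]
Also, since the ith row of $JF(p)$ is $[\nabla F_i(p)]^T$,
\[ JF(p) = \begin{bmatrix} [\nabla F_1(p)]^T \\ [\nabla F_2(p)]^T \\ \vdots \\ [\nabla F_m(p)]^T \end{bmatrix} . \]
NOTE: $JF(p) \in R^{m \times n}$ is a matrix. It is not a derivative. So, again we have to define what we mean by a “derivative”. The definitions are the same as for the real valued functions …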
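When the partial derivatives exist, the Jacobian matrix can be approximated column by column with difference quotients. The following is a minimal sketch under assumptions: `jacobian` and the test map `F` are hypothetical names, central differences are a swapped-in numerical technique, and the result is only an approximation of the entries $\frac{\partial}{\partial x_j} F_i(p)$.

```python
def jacobian(F, p, h=1e-6):
    """m-by-n list of central-difference estimates of dF_i/dx_j at p."""
    n = len(p)
    cols = []
    for j in range(n):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        Fp, Fm = F(plus), F(minus)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(Fp, Fm)])
    m = len(cols[0])
    # cols[j][i] holds dF_i/dx_j; transpose so that row i is the gradient of F_i
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def F(x):
    # hypothetical map F: R^2 -> R^3
    return [x[0] * x[1], x[0] ** 2, x[0] + 3.0 * x[1]]

J = jacobian(F, [1.0, 2.0])
# exact Jacobian at p = (1, 2): [[2, 1], [2, 0], [1, 3]]
for row in J:
    print(row)
```

Row i of the result approximates $[\nabla F_i(p)]^T$, in line with the row form of $JF(p)$.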
The Gateaux Derivative
If $\delta F(p;\eta)$ exists for all $\eta \in R^n$ AND is linear in $\eta$, then we can define the Gateaux Derivative.
Assume $F: D(F) \subseteq R^n \to R^m$ and $p \in \mathrm{int}[D(F)]$. If the first variation $\delta F(p;\eta)$ of F at p exists for all $\eta \in R^n$ AND for $\eta_1, \eta_2 \in R^n$,
\[ \delta F(p; \eta_1 + \eta_2) = \delta F(p; \eta_1) + \delta F(p; \eta_2), \]
(i.e. $\delta F(p;\eta)$ is linear in $\eta$), then we say that F is Gateaux-differentiable at p. If $L = L_p$ is the (unique) linear function $L_p: R^n \to R^m$ defined by $L_p(\eta) = \delta F(p;\eta)$, then $L_p$ is called the Gateaux Derivative of F at p.
The Gateaux Derivative is a linear operator $L_p : R^n \to R^m$.
Gateaux Derivative
To make the Gateaux derivative something useful, we need the following version of the Riesz Representation Theorem.
Riesz Representation Theorem. If $L: R^n \to R^m$ is a linear operator on $R^n$, then
i) L is continuous,
and
ii) there exists a matrix $\mathbf{L} \in R^{m \times n}$ (depending on L and on the bases chosen for $R^n$ and $R^m$) such that for all $\eta \in R^n$
\[ L(\eta) = \mathbf{L} \eta . \]
If F is Gateaux differentiable at p we can apply the Riesz Representation Theorem to $L(\eta) = \delta F(p;\eta)$, so it follows that there is a matrix $\mathbf{L} = \mathbf{L}(p) \in R^{m \times n}$ such that for all $\eta \in R^n$,
\[ \delta F(p;\eta) = L_p(\eta) = \mathbf{L}(p) \, \eta . \]
Gateaux Derivative
IMPORTANT: The linear function $L_p(\eta) = \delta F(p;\eta)$ is the Gateaux derivative of F at p. The matrix $\mathbf{L}(p)$ is just a matrix representation of this linear function.
Note that the linear operator $\delta F(p;\cdot) = L_p: R^n \to R^m$ is the Gateaux derivative of F at $p \in \mathrm{int}[D(F)]$ if and only if for each $\eta \in R^n$
\[ \lim_{t \to 0} \Big\| \frac{1}{t} [F(p + t\eta) - F(p)] - L_p(\eta) \Big\|_{R^m} = 0 \]
OR EQUIVALENTLY
\[ \lim_{t \to 0} \Big\| \frac{1}{t} [F(p + t\eta) - F(p)] - \mathbf{L}(p) \, \eta \Big\|_{R^m} = 0 . \]
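Gateaux Derivative
If $F: D(F) \subseteq R^n \to R^m$ is Gateaux differentiable at $p \in \mathrm{int}[D(F)]$, then the partial derivatives $\frac{\partial}{\partial x_j} F_i(p)$ at p exist. Select the standard bases for $R^n$ and $R^m$ and let
\[ JF(p) = \begin{bmatrix} \frac{\partial}{\partial x_1} F_1(p) & \frac{\partial}{\partial x_2} F_1(p) & \cdots & \frac{\partial}{\partial x_n} F_1(p) \\ \frac{\partial}{\partial x_1} F_2(p) & \frac{\partial}{\partial x_2} F_2(p) & \cdots & \frac{\partial}{\partial x_n} F_2(p) \\ \vdots & \vdots & & \vdots \\ \frac{\partial}{\partial x_1} F_m(p) & \frac{\partial}{\partial x_2} F_m(p) & \cdots & \frac{\partial}{\partial x_n} F_m(p) \end{bmatrix} = \begin{bmatrix} [\nabla F_1(p)]^T \\ [\nabla F_2(p)]^T \\ \vdots \\ [\nabla F_m(p)]^T \end{bmatrix} . \]
Theorem D3. If $F: D(F) \subseteq R^n \to R^m$ is Gateaux differentiable at $p \in \mathrm{int}[D(F)]$, then the Jacobian matrix of F at p exists, and $L_p(\eta) = \delta F(p;\eta)$ has the representation
\[ \delta F(p;\eta) = [JF(p)] \, \eta . \]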
Gateaux Derivative
RECALL ...
If $A: D(A) \subseteq R^n \to R^m$ is a linear function with domain $D(A) \subseteq R^n$ and range $R(A)$ in $R^m$, then the operator norm of A is defined to be
\[ \|A\| = \sup_{x \ne 0} \frac{\|A(x)\|_{R^m}}{\|x\|_{R^n}} . \]
If $\delta F(p;\cdot) = L_p : R^n \to R^m$ is the Gateaux derivative of F at p, then it is a linear operator and the (operator) norm of $L_p$ is given by
\[ \|L_p\| = \sup_{\eta \ne 0} \frac{\|L_p(\eta)\|_{R^m}}{\|\eta\|_{R^n}} = \sup_{\eta \ne 0} \frac{\|[JF(p)] \eta\|_{R^m}}{\|\eta\|_{R^n}} = \|JF(p)\| . \]
The Fréchet Derivative
Let $F: D(F) \subseteq R^n \to R^m$ be a vector-valued function and assume $p \in \mathrm{int}[D(F)]$. We say that F is (Fréchet) differentiable at p if there exists a linear operator $D: R^n \to R^m$ such that for each $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$ such that if
\[ 0 < \|x - p\|_{R^n} < \delta, \]
then
i) $x \in D(F)$
and
ii) $\| [F(x) - F(p)] - D(x - p) \|_{R^m} \le \varepsilon \|x - p\|_{R^n}$.
Notation varies and we shall use ... $D = [DF(p)] = D_x F(p) = F'(p)$
The (Fréchet) Derivative is the LINEAR OPERATOR $D : R^n \to R^m$
The Fréchet Derivative
If F is Fréchet differentiable at p we can apply the Riesz Representation Theorem to $D(\eta) = [D_x F(p)](\eta)$ to obtain a matrix representation with respect to the standard basis …
\[ [D_x F(p)](\eta) = [JF(p)] \, \eta \qquad (\text{OPERATOR } D_x F(p), \ \text{MATRIX } JF(p)) \]
Theorem D4. If $F: D(F) \subseteq R^n \to R^m$ is Fréchet differentiable at $p \in \mathrm{int}[D(F)]$, then the Jacobian matrix of F at p exists, F is continuous at p, and $D_x F(p)$ has the representation
\[ [D_x F(p)](\eta) = [JF(p)] \, \eta . \]
Two Theorems
Theorem D5. If $F: D(F) \subseteq R^n \to R^m$ and the Jacobian matrix of F exists and is continuous in an open ball
\[ B(p, \delta) = \{ x \in R^n : \|x - p\|_{R^n} < \delta \} \]
about $p \in \mathrm{int}[D(F)]$, then F is Fréchet differentiable at p.
Mean Value Theorem For Vector Valued Functions. Let $F: D(F) \to R^m$. Assume that F is Fréchet differentiable at all points in an open set $\Omega \subseteq D(F)$. If x and y are two elements in $\Omega$ such that the line segment $\{ z = \lambda x + (1 - \lambda) y : 0 \le \lambda \le 1 \}$ is contained in $\Omega$, then there is a c with 0 < c < 1 such that for $\mathbf{c} = c x + (1 - c) y$,
\[ \|F(y) - F(x)\|_{R^m} \le \| [D_x F(\mathbf{c})](y - x) \|_{R^m} \le \|D_x F(\mathbf{c})\| \, \|y - x\|_{R^n} . \]
CAN NOT SAY … $[F(y) - F(x)] = [D_x F(\mathbf{c})](y - x)$
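A standard example showing why only the inequality survives for vector valued functions is the circle map $F(t) = (\cos t, \sin t)$ on $[0, 2\pi]$. The sketch below is an illustration with hypothetical names (`F`, `DF`); it samples interior points only, so it demonstrates rather than proves the claim.

```python
import math

def F(t):
    # F: R^1 -> R^2, one trip around the unit circle for t in [0, 2*pi]
    return (math.cos(t), math.sin(t))

def DF(t):
    # derivative F'(t), i.e. the 2-by-1 Jacobian stored as a pair
    return (-math.sin(t), math.cos(t))

x, y = 0.0, 2.0 * math.pi
Fx, Fy = F(x), F(y)

# ||F(y) - F(x)|| is zero up to rounding, because the endpoints coincide.
lhs = math.hypot(Fy[0] - Fx[0], Fy[1] - Fx[1])

# ||[DF(c)](y - x)|| at 99 sample points c strictly between x and y.
rhs_values = [math.hypot(*DF(x + k * (y - x) / 100.0)) * abs(y - x)
              for k in range(1, 100)]

print(lhs)              # about 1e-16
print(min(rhs_values))  # about 6.28 (= 2*pi) at every sample point
```

Since $\|F'(c)\| = 1$ for every c, the vector $[D_x F(c)](y - x)$ always has length $2\pi$ and can never equal $F(y) - F(x) = 0$, while the inequality $0 \le 2\pi$ holds.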
Analysis of $F: D(F) \subseteq R^n \to R^m$
In order to use these concepts we need to know when the usual “calculus” results hold. The important results for this topic are:
• Taylor’s Theorem
• Chain Rule
• Inverse Function Theorem
• Implicit Function Theorem
• Necessary conditions for optimization
• Higher order derivatives
• Convexity
We will cover these topics when needed and indicate where proofs can be found in the references.
We present two important results
Chain Rule
Chain Rule for Fréchet derivative. Let $F: D(F) \subseteq R^n \to R^m$ and $G: D(G) \subseteq R^m \to R^k$ be functions. If F has a Fréchet derivative at p and G has a Fréchet derivative at y = F(p), then the composite function $G \circ F$ has a Fréchet derivative at p and
\[ [D_x (G \circ F)(p)](\eta) = [D_y G(F(p))] \big( [D_x F(p)](\eta) \big) \]
for all $\eta \in R^n$. Or ….
\[ \frac{d}{dx} [(G \circ F)(p)] = \frac{d}{dy} [G(y)] \Big|_{y = F(p)} \cdot \frac{d}{dx} [F(p)] \]
In matrix form,
\[ D_x [(G \circ F)(p)] = D_x [G(F(p))] = [JG(F(p))] \, [JF(p)] . \]
DOES NOT HOLD for Gateaux derivative
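The matrix form of the chain rule can be checked on a small example by multiplying hand-computed Jacobians and comparing with difference quotients of the composite. The maps `F` and `G`, their Jacobians `JF` and `JG`, and the helper `matmul` below are hypothetical choices made only for this sketch.

```python
def matmul(A, B):
    # plain triple-loop matrix product for lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def F(x):   # hypothetical F: R^2 -> R^2
    return [x[0] * x[1], x[0] + x[1]]

def JF(x):  # Jacobian of F, computed by hand
    return [[x[1], x[0]], [1.0, 1.0]]

def G(y):   # hypothetical G: R^2 -> R^1
    return [y[0] ** 2 + 3.0 * y[1]]

def JG(y):  # Jacobian of G, computed by hand
    return [[2.0 * y[0], 3.0]]

p = [1.0, 2.0]
J_chain = matmul(JG(F(p)), JF(p))
print(J_chain)  # [[11.0, 7.0]]

# Cross-check with central differences of the composite G(F(x)).
h = 1e-6
J_fd = []
for j in range(2):
    plus, minus = list(p), list(p)
    plus[j] += h
    minus[j] -= h
    J_fd.append((G(F(plus))[0] - G(F(minus))[0]) / (2.0 * h))
print(J_fd)     # close to [11, 7]
```

Here $(G \circ F)(x) = (x_1 x_2)^2 + 3(x_1 + x_2)$, whose partial derivatives at p = (1, 2) are $2 x_1 x_2^2 + 3 = 11$ and $2 x_1^2 x_2 + 3 = 7$.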
Partial Fréchet Derivatives
Let $Z = R^n \times R^p$ be the product space with norm defined by
\[ \big\| [x, q]^T \big\| = \sqrt{ \|x\|_{R^n}^2 + \|q\|_{R^p}^2 } . \]
If $F: D(F) \subseteq R^n \times R^p \to R^m$ is a function of the two variables x and q, then we can define the partial Gateaux and Fréchet derivatives. If $p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, then define the domain
\[ D(F_1) = \{ x \in R^n : (x, q_0)^T \in \mathrm{int}[D(F)] \} \]
and the function $F_1: D(F_1) \subseteq R^n \to R^m$ by
\[ F_1(x) = F(x, q_0) \in R^m . \]
If $F_1(x) = F(x, q_0)$ has a Gateaux (Fréchet) derivative at $x = x_0$, then we say that $F(x, q_0)$ has a first partial Gateaux (Fréchet) derivative at $z_0 = p$ and we denote this derivative by $D_x F(x_0, q_0)$ or $D_1 F(x_0, q_0)$ or $\partial_x F(x_0, q_0)$. Likewise, we can define $\partial_q F(x_0, q_0)$.
The partial Fréchet derivatives are LINEAR OPERATORS
Partial Fréchet Derivatives
Note that at $p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, the partial Fréchet derivative $\partial_x F(x_0, q_0)$ is a continuous linear operator $[\partial_x F(x_0, q_0)]: R^n \to R^m$ and $\partial_q F(x_0, q_0)$ is a continuous linear operator $[\partial_q F(x_0, q_0)]: R^p \to R^m$.
Assume $F: D(F) \subseteq R^n \times R^p \to R^m$ is a Fréchet differentiable function of the form
\[ F(z) = F(x, q) = \begin{bmatrix} F_1(x, q) \\ F_2(x, q) \\ \vdots \\ F_m(x, q) \end{bmatrix} = \begin{bmatrix} F_1(x_1, x_2, \dots, x_n, q_1, q_2, \dots, q_p) \\ F_2(x_1, x_2, \dots, x_n, q_1, q_2, \dots, q_p) \\ \vdots \\ F_m(x_1, x_2, \dots, x_n, q_1, q_2, \dots, q_p) \end{bmatrix}, \qquad z = [x, q]^T . \]
If all the partial derivatives
\[ \frac{\partial}{\partial x_j} F_i(x, q) \quad \text{and} \quad \frac{\partial}{\partial q_k} F_i(x, q) \]
exist and are continuous on the open set $\Omega$, then we say that F is smooth on $\Omega$ and write $F \in C^1(\Omega)$.
Partial Fréchet Derivatives
If the partial Fréchet derivatives $\partial_x F(x_0, q_0)$ and $\partial_q F(x_0, q_0)$ exist at the point $z_0 = p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, then we define the total derivative of F at p to be the continuous linear operator $d_z F(x_0, q_0) : Z = R^n \times R^p \to R^m$ given by
\[ [d_z F(p)](\eta, \xi) = [\partial_x F(x_0, q_0)](\eta) + [\partial_q F(x_0, q_0)](\xi), \qquad \eta \in R^n, \ \xi \in R^p . \]
Theorem D6. If $F: D(F) \subseteq R^n \times R^p \to R^m$ is Fréchet differentiable at a point $p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, with derivative $[D_z F(p)]: R^n \times R^p \to R^m$, then both partial Fréchet derivatives exist at p and
\[ [D_z F(p)](\eta, \xi) = [d_z F(p)](\eta, \xi) = [\partial_x F(x_0, q_0)](\eta) + [\partial_q F(x_0, q_0)](\xi) . \]
Theorem D7. If the partial Fréchet derivatives $\partial_x F$ and $\partial_q F$ exist and are continuous at each point in a neighborhood $\Omega \subseteq R^n \times R^p$ of $z_0 = p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, then F is Fréchet differentiable at p and
\[ [D_z F(p)](\eta, \xi) = [\partial_x F(x_0, q_0)](\eta) + [\partial_q F(x_0, q_0)](\xi) . \]
Implicit Function Theorem
Assume $F: D(F) \subseteq R^n \times R^p \to R^n$ is a smooth function on a neighborhood of $z_0 = [x_0, q_0]^T \in R^n \times R^p$. If
\[ F(x_0, q_0) = 0 \]
and the partial Fréchet derivative
\[ \partial_x F(x_0, q_0) : R^n \to R^n \]
is one-to-one and onto $R^n$, then there exists an open neighborhood Q of $q_0$ and a function $w: Q \subseteq R^p \to R^n$ such that
i) $w(q_0) = x_0$ and
ii) $F(w(q), q) = 0$ for all $q \in Q$.
Moreover, the Fréchet derivative $D_q w(\hat{q})$ exists, is continuous at each point $\hat{q} \in Q$, and
iii) $[\partial_x F(w(\hat{q}), \hat{q})] \, [D_q w(\hat{q})] + [\partial_q F(w(\hat{q}), \hat{q})] = 0$ for all $\hat{q} \in Q$.
[Figure: the solution set $\{ (x, q) : F(x, q) = 0 \}$ drawn with $R^p$ on the horizontal axis and $R^n$ on the vertical axis, described locally by the graph $x = w(q)$]
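A scalar example (n = p = 1) makes the three conclusions concrete. The sketch below assumes the hypothetical function $F(x, q) = x^2 + q^2 - 1$ and the point $(x_0, q_0) = (0.8, 0.6)$, where $\partial_x F = 2 x_0 \ne 0$; the explicit branch $w(q) = \sqrt{1 - q^2}$ is available only because this example is so simple, and in general w is not known in closed form.

```python
import math

def F(x, q):
    # hypothetical example: the unit circle x^2 + q^2 = 1
    return x * x + q * q - 1.0

def dF_dx(x, q):
    return 2.0 * x

def dF_dq(x, q):
    return 2.0 * q

def w(q):
    # explicit solution branch through (x0, q0) = (0.8, 0.6)
    return math.sqrt(1.0 - q * q)

def dw_dq(q):
    # solve iii): dF/dx * w'(q) + dF/dq = 0 for w'(q)
    return -dF_dq(w(q), q) / dF_dx(w(q), q)

q0 = 0.6
h = 1e-6
fd = (w(q0 + h) - w(q0 - h)) / (2.0 * h)  # difference-quotient check of w'(q0)

print(w(q0))         # close to x0 = 0.8, conclusion i)
print(F(w(q0), q0))  # close to 0, conclusion ii)
print(dw_dq(q0), fd) # both close to -0.75, conclusion iii)
```

Conclusion iii) gives $w'(q) = -q / w(q)$ here, which agrees with differentiating $\sqrt{1 - q^2}$ directly.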