
CONTENTS

CHAPTER I CALCULUS OF VARIATIONS

CHAPTER II TRANSPORT AND LAPLACE EQUATIONS

CHAPTER III HEAT AND WAVE EQUATIONS

CHAPTER IV ANALYTICAL MECHANICS-I

CHAPTER V ANALYTICAL DYNAMICS-II

CHAPTER VI ANALYTICAL MECHANICS-III

CHAPTER VII ANALYTICAL MECHANICS-IV

CHAPTER VIII NONLINEAR FIRST-ORDER PDE

CHAPTER IX REPRESENTATION OF SOLUTIONS

CHAPTER X ATTRACTION AND POTENTIAL-I

CHAPTER XI ATTRACTION AND POTENTIAL-II

CALCULUS OF VARIATIONS

[Figure: a curve c in the xy-plane joining the points (x0, y0) and (x1, y1).]

Chapter-1 Calculus of Variations

1.1 INTRODUCTION

By a functional, we mean a correspondence which assigns a definite real number to each function/curve belonging to some class.

That is, a functional is a kind of function where the independent variable is itself a function. Thus the domain of a functional is a set of admissible functions, rather than a region of a coordinate space.

Examples of Functionals

(1) Consider the set of all rectifiable plane curves between two given points (x0, y0) and (x1, y1). Let this family be denoted by A. The length of a curve y(x) ∈ A is a functional. This length is given by

$$J[y] = l[y(x)] = \int_{x_0}^{x_1} \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx, \qquad y(x) \in A.$$
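To make the idea of a functional concrete, the following minimal sketch evaluates the arc-length functional numerically for two admissible curves joining (0, 0) and (1, 1). The helper name `arc_length`, the midpoint rule, and the two sample curves are illustrative assumptions, not part of the text.

```python
import math

def arc_length(dy, x0, x1, n=2000):
    """Approximate J[y] = integral of sqrt(1 + y'(x)^2) over [x0, x1] (midpoint rule)."""
    step = (x1 - x0) / n
    total = 0.0
    for k in range(n):
        x = x0 + (k + 0.5) * step          # midpoint of the k-th subinterval
        total += math.sqrt(1.0 + dy(x) ** 2) * step
    return total

# Two curves joining (0, 0) and (1, 1), each described by its derivative y'(x).
line_length = arc_length(lambda x: 1.0, 0.0, 1.0)          # y = x
parabola_length = arc_length(lambda x: 2.0 * x, 0.0, 1.0)  # y = x^2

print(line_length, parabola_length)
```

Each curve is mapped to a single real number, which is exactly what is meant by a functional; here the straight line yields the smaller value.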

(2) The area of a surface z = z(x, y) bounded by a given curve C is a functional. This area is determined by the choice of the surface z = z(x, y), as

$$J[z(x, y)] = \iint_D \sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}\,dx\,dy,$$

PARTIAL DIFFERENTIAL EQUATIONS AND MECHANICS


where D is the projection of the surface, z = z(x, y), bounded by the curve C, on the xy-plane.

Functionals, called variable quantities, play an important role in many problems arising in analysis, geometry, mechanics, etc. The first important results in this area are due to Euler (1707-1783). Nevertheless, up to now, the “calculus of functionals” still does not have methods of a generality comparable to the methods of classical analysis (the calculus of functions).

The most developed branch of the “calculus of functionals” is concerned with finding the maxima and minima of functionals, and is called the “calculus of variations”.

Actually, it would be more appropriate to call this branch/subject the “calculus of variations in the narrow sense”, since the significance of the concept of the “variation of a functional” is by no means confined to its applications to the problem of determining the extrema of functionals.

The aim of the “calculus of variations” is to explore methods for finding the maximum or minimum of a functional defined over a class of functions. Several physical laws can be deduced from concise mathematical principles to the effect that a certain functional in a given process attains a maximum or minimum. In mechanics, we have the principle of least action, the principle of conservation of linear momentum, and the principle of conservation of angular momentum. In addition, we have the principle of Castigliano in the theory of elasticity.

The history of the calculus of variations (CV) can be traced back to the year 1696 when John Bernoulli formulated the problem of the brachistochrone (shortest time). In this problem one has to find the curve connecting two given points, A and B, that do not lie on a vertical line, such that a particle sliding down this curve under the influence of gravity alone from the point A reaches point B in the shortest time.

We shall see later on that the curve of quickest descent will not be the straight-line connecting the points A and B, though this is the shortest distance between the points.

Apart from Bernoulli, this problem was independently solved by Leibniz, Newton and L’Hospital. However, the development of the “Calculus of Variations” as an independent mathematical discipline, along with its own methods of investigation, was due to the pioneering studies of Euler (1707-1783).

Apart from the problem described above, three other problems, stated below, motivated the development of CV.

Problem of Geodesics

In this problem, it is required to determine the line of shortest length connecting two given points A(x0, y0, z0) and B(x1, y1, z1) on a surface S given by

ϕ(x, y, z) = 0.

This is a typical variational problem with a constraint. Here, we are required to minimize the arc length given by the functional

$$J[y, z] = \int_{x_0}^{x_1} \sqrt{1 + \left(\frac{dy}{dx}\right)^2 + \left(\frac{dz}{dx}\right)^2}\,dx$$

subject to the constraint ϕ(x, y, z) = 0.

This problem was first solved by Jacob Bernoulli in 1698, but a general method for problems of this category was given by Euler.

A geodesic on a given surface is a curve, lying on that surface, along which distance between two points is minimum. On a plane, a geodesic is a straight line.

The Problem of Minimum Surface of Revolution

A curve y = y(x) ≥ 0 is rotated about the x-axis through an angle 2π. The resulting surface bounded by the planes

x = a and x = b

has the area

$$J[y] = 2\pi \int_a^b y\,\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx.$$

The determination of a particular curve

y = y(x)

which minimizes J[y] is a variational problem.

The Isoperimetric Problem

This problem is: “Among all closed curves of a given length l, find the curve enclosing the greatest area”.

This problem was solved by Euler. The required curve turns out to be a circle. The solution of this problem was known even in ancient Greece.

1.2 FUNCTION SPACES

In the study of functions of n variables, it is convenient to use geometric language, by regarding a set of n numbers (y1, y2,…, yn) as a point in the n-dimensional space.

Linear space. Let L be a non-empty set, consisting of elements x, y, z, of any kind, for which the operations of addition and multiplication by real numbers α, β, … are defined and obey the following axioms :

(i) x + y = y + x

(ii) x + (y + z) = (x + y) + z ;

(iii) there exists an element ‘0’, called the zero element, such that

x + 0 = x = 0 + x for all x ∈ L;

(iv) for each x ∈ L, there exists an element “−x” in L such that

x + (−x) = 0 = (−x) + x;

(v) 1·x = x;

(vi) α(βx) = (αβ)x

(vii) (α + β)x = αx + βx;

(viii) α(x + y) = αx + αy.

Normed Linear Space

A linear space L is said to be a normed linear space, if each x ∈ L is assigned a non-negative number ||x||, called the norm of x, such that

(i) ||x|| = 0 iff x = 0;

(ii) ||x+y|| ≤ ||x|| + ||y||,

(iii) ||αx|| = |α| ||x||.

Function Spaces

Linear spaces whose elements are functions are called function spaces.

In studying functionals of various types, it is reasonable to use various function spaces. The concept of continuity plays an important role for functionals, just as it does for the ordinary functions considered in classical analysis. In order to formulate this concept for functionals, we must somehow introduce a concept of “closeness” for elements in a function space. This is most conveniently done by introducing the concept of the norm of a function. The following normed linear spaces are important for our subsequent studies.

Examples of Normed Linear Spaces of Functions

(1) The space C[a, b], consisting of all continuous functions defined on a closed interval [a, b], is a normed linear space with the norm

$$\|y\|_0 = \max_{a \le x \le b} |y(x)|.$$

(2) The space D1[a, b], consisting of all functions y(x) defined on the closed interval [a, b] which are continuous and have a continuous first derivative, is a normed linear space with the norm

$$\|y\|_1 = \max_{a \le x \le b} |y(x)| + \max_{a \le x \le b} |y'(x)|.$$

Remark. Two functions, y and z, in D1 are regarded as close together if both the functions themselves and their first derivatives are close together, since

||y − z||_1 < ε

implies that

|y(x) − z(x)| < ε and |y′(x) − z′(x)| < ε

for all x ∈ [a, b].
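The difference between closeness in C[a, b] and closeness in D1[a, b] can be seen numerically. The sketch below, under the assumption that sampling on a fine grid is an adequate stand-in for the exact maximum, takes a small but rapidly oscillating difference y − z = 0.01 sin(1000x) on [0, 1]; the function names and sample values are hypothetical choices for illustration.

```python
import math

def sup_norm(f, a, b, samples=20001):
    """Approximate max |f(x)| on [a, b] by sampling on a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + k * step)) for k in range(samples))

def d1_norm(f, df, a, b, samples=20001):
    """Approximate the D1 norm: max |f| + max |f'| on [a, b]."""
    return sup_norm(f, a, b, samples) + sup_norm(df, a, b, samples)

# Illustrative difference of two functions: small amplitude, high frequency.
delta, freq = 0.01, 1000.0
difference = lambda x: delta * math.sin(freq * x)
difference_derivative = lambda x: delta * freq * math.cos(freq * x)

norm0 = sup_norm(difference, 0.0, 1.0)                        # about 0.01
norm1 = d1_norm(difference, difference_derivative, 0.0, 1.0)  # about 10

print(norm0, norm1)
```

The two functions are close in the norm of C[a, b] but far apart in the norm of D1[a, b], because their derivatives differ by as much as 10.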

(3) The space Dn[a, b], consisting of all functions y(x) defined on the closed interval [a, b] which are continuous and have continuous derivatives up to order n inclusive (where n is a fixed positive integer), is a normed linear space with the norm

$$\|y\|_n = \sum_{i=0}^{n} \max_{a \le x \le b} |y^{(i)}(x)|,$$

where y^{(i)}(x) = (d/dx)^i y(x) and y^{(0)}(x) = y(x).

Remark. Two functions in Dn are regarded as close together if the values of the functions themselves and of all their derivatives up to order n inclusive are close together.

Similarly, we can introduce spaces of functions of several variables: the space of continuous functions of n variables, the space of functions of n variables with continuous first derivatives, etc.

Continuity of functionals

After introducing norm on function spaces, it is natural to talk about continuity of functionals defined on a function space L.

Definition. The functional J[y] is said to be continuous at the point ŷ ∈ L if for any ε > 0, there is a δ > 0 such that

|J[y] − J[ŷ]| < ε provided ||y − ŷ|| < δ.

Remark. So far, we have talked about linear spaces, and functionals defined on them. However, in many variational problems, we have to deal with functionals defined on sets of functions which do not form linear spaces.

The set of functions/curves satisfying the constraints of a given variational problem, called the admissible functions, is in general not a linear space.

1.3 THE CONCEPT OF A VARIATION/DIFFERENTIAL OF A FUNCTIONAL

First, we give some preliminary definitions and facts.

Definition. Let N be a normed linear space of functions. Let each h ∈ N be assigned a real number ϕ[h]. That is, let ϕ[h] be a functional defined on N. Then ϕ[h] is said to be a linear functional if

(i) ϕ[αh] = αϕ[h] for any h ∈ N and any real number α;

(ii) ϕ[h1 + h2] = ϕ[h1] + ϕ[h2] for any h1 and h2 in N;

(iii) ϕ[h] is continuous for all h ∈ N.

Example. (1) The integral

$$\phi[h] = \int_a^b h(x)\,dx$$

defines a linear functional on the normed linear space C[a, b].

(2) The integral

$$\phi[h] = \int_a^b \alpha(x)\,h(x)\,dx,$$

where α(x) is a fixed member of the space C[a, b], defines a linear functional on C[a, b].

Lemma 1. If α(x) is continuous in [a, b], and if

$$\int_a^b \alpha(x)\,h(x)\,dx = 0$$

for all h(x) in C[a, b] such that h(a) = h(b) = 0, then

α(x) = 0 for all x in [a, b].

Proof. Suppose the function α(x) is non-zero at some point, say c, in [a, b]. Then, by continuity, there exists some interval [x1, x2], around c and contained in [a, b], such that α(x) has the same sign in [x1, x2]. Without loss of generality it is assumed that

α(x) > 0 in [x1, x2] ⊂ [a, b]. …(1)

Set

$$h(x) = \begin{cases} (x - x_1)(x_2 - x) & \text{for } x \text{ in } [x_1, x_2], \\ 0 & \text{otherwise.} \end{cases}$$ …(2)

Since h(x) is continuous and h(x1) = h(x2) = 0, we have h(x) ∈ C[a, b] with h(a) = h(b) = 0. However,

$$\int_a^b \alpha(x)\,h(x)\,dx = \int_{x_1}^{x_2} \alpha(x)\,(x - x_1)(x_2 - x)\,dx > 0,$$ …(3)

because h(x) = 0 in [a, x1] and [x2, b], and the integrand is positive in the open interval (x1, x2). This contradicts the hypothesis in the statement of the lemma, which proves Lemma 1.

Remark. The lemma still holds if we replace ‘C[a, b]’ by ‘Dn[a, b]’ in the statement of the lemma. In that situation, we use the same proof with

$$h(x) = \begin{cases} [(x - x_1)(x_2 - x)]^{n+1} & \text{for } x \text{ in } [x_1, x_2], \\ 0 & \text{otherwise.} \end{cases}$$

Lemma 2. If α(x) is continuous in [a, b], and if

$$\int_a^b \alpha(x)\,h'(x)\,dx = 0$$

for every function h(x) ∈ D1[a, b] such that h(a) = h(b) = 0, then

α(x) = c for all x in [a, b],

where c is a constant.

Proof. Let c be the constant defined by the condition

$$\int_a^b [\alpha(x) - c]\,dx = 0.$$ …(1)

Let

$$h(x) = \int_a^x [\alpha(\xi) - c]\,d\xi.$$ …(2)

Then h(x) is differentiable and

h′(x) = α(x) − c in [a, b], …(3)

by the fundamental theorem of integral calculus. So

h(x) ∈ D1[a, b]. …(4)

Also, from equations (1) and (2),

h(a) = h(b) = 0. …(5)

That is, h(x) satisfies all the conditions of the lemma. Hence, by hypothesis,

$$\int_a^b \alpha(x)\,h'(x)\,dx = 0.$$ …(6)

Now

$$\int_a^b [\alpha(x) - c]^2\,dx = \int_a^b [\alpha(x) - c]\,h'(x)\,dx = \int_a^b \alpha(x)\,h'(x)\,dx - c \int_a^b h'(x)\,dx = 0 - c\,[h(b) - h(a)] = 0.$$

This gives

$$\int_a^b [\alpha(x) - c]^2\,dx = 0.$$

Since the integrand is continuous and non-negative, it follows that

α(x) − c = 0 for all x in [a, b],

or α(x) = c for all x in [a, b].

This completes the proof of Lemma 2.

Lemma 3. If α(x) and β(x) are continuous in [a, b], and if

$$\int_a^b [\alpha(x)\,h(x) + \beta(x)\,h'(x)]\,dx = 0$$

for every function h(x) ∈ D1[a, b] such that

h(a) = h(b) = 0,

then β(x) is differentiable, and

β′(x) = α(x) for all x in [a, b].

Proof. Set

$$A(x) = \int_a^x \alpha(\xi)\,d\xi, \quad x \in [a, b].$$ …(1)

Then

A(a) = 0 and A′(x) = α(x) for all x ∈ [a, b]. …(2)

Now, integrating by parts,

$$\int_a^b \alpha(x)\,h(x)\,dx = \left[ h(x) \int_a^x \alpha(\xi)\,d\xi \right]_a^b - \int_a^b h'(x) \left\{ \int_a^x \alpha(\xi)\,d\xi \right\} dx = -\int_a^b A(x)\,h'(x)\,dx,$$ …(3)

since h(a) = h(b) = 0. The given condition

$$\int_a^b [\alpha(x)\,h(x) + \beta(x)\,h'(x)]\,dx = 0,$$ …(4)

and the result in (3), lead to

$$\int_a^b [-A(x) + \beta(x)]\,h'(x)\,dx = 0,$$ …(5)

for every function h(x) ∈ D1[a, b] such that h(a) = h(b) = 0. Lemma 2 applied to relation (5) gives at once

−A(x) + β(x) = constant in [a, b],

i.e., (since A(x) is differentiable)

β′(x) = A′(x) in [a, b]. …(6)

Equations (2) and (6) yield

β′(x) = α(x) in [a, b].

This completes the proof of Lemma 3.

We now introduce the concept of the variation/differential of a functional. Let J[y] be a functional defined on some normed linear space. Let

ΔJ[h] = J[y + h] − J[y] …(1)

be its increment corresponding to the increment

h = h(x) …(2)

of the “independent variable” y = y(x). If y is fixed, ΔJ[h] is a functional of h and it is, in general, a nonlinear functional. Suppose that

ΔJ[h] = ϕ[h] + ε ||h||, …(3)

where

ϕ[h] is a linear functional, …(4)

and

ε → 0 …(5)

as ||h|| → 0.

Then the functional J[y] is said to be differentiable, and the principal linear part of the increment ΔJ[h], i.e., ϕ[h], is called the variation/differential of J[y]. It is denoted by δJ[h]. That is,

δJ[h] = ϕ[h]. …(6)
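The definition can be checked on a simple case. The sketch below assumes the illustrative functional J[y] = ∫ y² dx on [0, 1] (not taken from the text), for which the increment is ∫ 2yh dx + ∫ h² dx, so the linear part is ϕ[h] = ∫ 2yh dx. It computes the quantity ε = (ΔJ[h] − ϕ[h]) / ||h|| for h = t sin(πx) and shrinking t, using the norm of C[0, 1]; all names are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    step = (b - a) / n
    return sum(f(a + (k + 0.5) * step) for k in range(n)) * step

def J(y):
    """Illustrative functional J[y] = integral of y(x)^2 over [0, 1]."""
    return integrate(lambda x: y(x) ** 2, 0.0, 1.0)

def base(x):
    return x                       # the fixed function y(x) = x

def shape(x):
    return math.sin(math.pi * x)   # direction of the increment; max |shape| = 1

def epsilon(t):
    """(Delta J[h] - phi[h]) / ||h||_0 for the increment h = t * shape."""
    h = lambda x: t * shape(x)
    increment = J(lambda x: base(x) + h(x)) - J(base)
    linear_part = integrate(lambda x: 2.0 * base(x) * h(x), 0.0, 1.0)
    return (increment - linear_part) / abs(t)

ratios = [epsilon(t) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this choice ε equals t/2 up to numerical error, so it tends to zero with ||h||, as the definition of differentiability requires.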

Theorem 1. The differential of a differentiable functional is unique.

Proof. Before proving the main theorem, we state and prove a lemma.

Statement of lemma. If ϕ[h] is a linear functional and if

$$\frac{\phi[h]}{\|h\|} \to 0$$ …(1)

as ||h|| → 0, then

ϕ[h] = 0 for all h. …(2)

Proof of lemma. If possible, suppose that

ϕ[h0] ≠ 0 for some h0 ≠ 0. …(3)

Define

$$h_n = \frac{h_0}{n}, \qquad \lambda = \frac{\phi[h_0]}{\|h_0\|} \ne 0.$$ …(4)

Then

||hn|| → 0 as n → ∞, …(5)

but

$$\lim_{n \to \infty} \frac{\phi[h_n]}{\|h_n\|} = \lim_{n \to \infty} \frac{\phi[h_0/n]}{\|h_0/n\|} = \lim_{n \to \infty} \frac{\phi[h_0]}{\|h_0\|} = \lambda \ne 0,$$ …(6)

since ϕ is linear. This is contrary to the hypothesis in (1). Hence, the result (2) holds.


Proof of the main theorem

Now, suppose that, if possible, the differential of the functional J[y] is not unique. Then, we can write

ΔJ[h] = ϕ1[h] + ε1 ||h||, …(7)

and

ΔJ[h] = ϕ2[h] + ε2 ||h||, …(8)

where ϕ1[h] and ϕ2[h] are linear functionals, and

ε1, ε2 → 0 …(9)

as ||h|| → 0. Here

ΔJ[h] = J[y + h] − J[y]. …(10)

Equations (7) and (8) imply

ϕ1[h] − ϕ2[h] = (ε2 − ε1) ||h||

or

$$\frac{\phi_1[h] - \phi_2[h]}{\|h\|} = \varepsilon_2 - \varepsilon_1 \to 0$$ …(11)

as ||h|| → 0. Hence, by the above lemma, the linear functional

ϕ1[h] − ϕ2[h]

vanishes identically. This gives

ϕ1[h] = ϕ2[h] for all h, …(12)

implying that the differential of the differentiable functional J[y] is unique. This completes the proof.

Definition (Extremum). The functional J[y] is said to have an extremum for y = ŷ if

J[y] − J[ŷ]

does not change its sign in some neighbourhood of the curve y = ŷ(x).

Definition (Weak Extremum). The functional J[y] is said to have a weak extremum for y = ŷ if there exists an ε > 0 such that

J[y] − J[ŷ]

has the same sign for all y in the domain of definition of the functional which satisfy the condition

||y − ŷ||_1 < ε,

where || ||_1 denotes the norm in the space D1.

Definition (Strong Extremum). The functional J[y] is said to have a strong extremum for y = ŷ if there exists an ε > 0 such that

J[y] − J[ŷ]

has the same sign for all y in the domain of definition of the functional which satisfy the condition

||y − ŷ||_0 < ε,

where || ||_0 denotes the norm in the space C[a, b].

Note. (1) Every strong extremum is simultaneously a weak extremum, since ||y − ŷ||_1 < ε implies ||y − ŷ||_0 < ε.

(2) However, the converse is not true in general.

(3) Finding a weak extremum is simpler than finding a strong extremum.

Theorem 2. A necessary condition for the differentiable functional J[y] to have an extremum for y = ŷ is that its variation vanishes for y = ŷ.

Proof. We are required to prove that

δJ[h] = 0 …(1)

for y = ŷ and all admissible h.

According to the definition of the variation δJ[h] of J[y], we have

ΔJ[h] = δJ[h] + ε ||h||, …(2)

where

ε → 0 …(3)

as ||h|| → 0, and

ΔJ[h] = J[y + h] − J[y]. …(4)

Thus, for sufficiently small ||h||, the sign of ΔJ[h] will be the same as the sign of the variation δJ[h], provided δJ[h] ≠ 0. To be explicit, suppose that J[y] has a minimum for y = ŷ. If possible, suppose that

δJ[h0] ≠ 0 …(5)

for some admissible h0. Then, for any α > 0, no matter how small, we have by linearity

δJ[−αh0] = −δJ[αh0]. …(6)

Hence, (2) can be made to have either sign for arbitrarily small ||h||. But this is impossible, since by hypothesis J[y] has a minimum for y = ŷ, i.e.,

ΔJ[h] = J[ŷ + h] − J[ŷ] ≥ 0 …(7)

for all sufficiently small ||h||. This contradiction completes the proof of the theorem.

1.4 EULER’S EQUATION - SIMPLEST VARIATIONAL PROBLEM

Theorem 3. Let

$$J[y] = \int_a^b F(x, y, y')\,dx$$ …(1)

be a functional defined on the set of functions y(x) which have a continuous first derivative in [a, b] and satisfy the boundary conditions

y(a) = A, y(b) = B. …(2)

Prove that a necessary condition for J[y] to have an extremum for a given function y(x) is that y(x) satisfies the differential equation

$$F_y - \frac{d}{dx}\left(F_{y'}\right) = 0.$$ …(3)

Proof. Let h = h(x) be the increment given to y(x). Then, in order for the function “y + h” to satisfy the boundary conditions in (2), we must have

h(a) = 0, h(b) = 0. …(4)

Now

$$\Delta J[h] = J[y + h] - J[y] = \int_a^b F(x, y + h, y' + h')\,dx - \int_a^b F(x, y, y')\,dx = \int_a^b [F(x, y + h, y' + h') - F(x, y, y')]\,dx.$$ …(5)

Using Taylor’s theorem, we write

$$\Delta J[h] = \int_a^b [h\,F_y(x, y, y') + h'\,F_{y'}(x, y, y')]\,dx + \dots,$$ …(6)

where the subscripts denote partial derivatives with respect to the corresponding arguments, and the dots denote terms of order higher than 1 relative to h and h′. The integral on the right-hand side of (6) represents the principal linear part of the increment ΔJ[h]. Hence, the variation/differential δJ of J[y] is, by definition,

$$\delta J = \int_a^b [h\,F_y(x, y, y') + h'\,F_{y'}(x, y, y')]\,dx.$$ …(7)

We know that a necessary condition for J[y] to have an extremum for y = y(x) is that

δJ = 0 …(8)

for all admissible h. Equations (7) and (8) imply

$$\int_a^b [h\,F_y + h'\,F_{y'}]\,dx = 0$$ …(9)

for all admissible h.

Lemma 3, applied to relation (9) with α = F_y and β = F_{y′}, implies

$$F_y = \frac{d}{dx}\left(F_{y'}\right),$$

i.e.,

$$F_y - \frac{d}{dx}\left(F_{y'}\right) = 0.$$ …(10)

This completes the proof of the theorem.
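As a numerical illustration of the theorem, the sketch below evaluates the variation δJ = ∫ [h F_y + h′ F_y′] dx for an assumed example integrand F = y′² + y² on [0, 1] with y(0) = 0, y(1) = 1 (this integrand is an illustrative choice, not from the text). Its Euler equation is y″ = y, whose solution with these boundary values is y = sinh x / sinh 1. The variation is computed on this extremal and, for comparison, on the straight line y = x, with the admissible increment h = sin(πx).

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3.0

# Partial derivatives of the assumed integrand F(x, y, y') = y'^2 + y^2.
def F_y(x, y, dy):
    return 2.0 * y

def F_dy(x, y, dy):
    return 2.0 * dy

def first_variation(y, dy, h, dh, a, b):
    """delta J = integral over [a, b] of h * F_y + h' * F_y'."""
    integrand = lambda x: h(x) * F_y(x, y(x), dy(x)) + dh(x) * F_dy(x, y(x), dy(x))
    return simpson(integrand, a, b)

# Admissible increment: h(0) = h(1) = 0.
h = lambda x: math.sin(math.pi * x)
dh = lambda x: math.pi * math.cos(math.pi * x)

variation_on_extremal = first_variation(
    lambda x: math.sinh(x) / math.sinh(1.0),
    lambda x: math.cosh(x) / math.sinh(1.0),
    h, dh, 0.0, 1.0)
variation_on_line = first_variation(lambda x: x, lambda x: 1.0, h, dh, 0.0, 1.0)

print(variation_on_extremal, variation_on_line)
```

The variation vanishes (to rounding error) on the solution of Euler's equation, while on the straight line it equals 2/π, so the line cannot be an extremum of this functional.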

Definition 1. Equation (10) is known as Euler’s equation.

Definition 2. The integral curves of Euler’s equation are called extremals.

Remark. Euler’s equation is, in general, a second-order ordinary differential equation, and its solution will, in general, depend on two arbitrary constants, which are determined from the boundary conditions.

y(a) = A, y(b) = B .

Special Cases

We now consider some special cases where Euler’s equation can be reduced to a first-order differential equation, or where its solution can be obtained entirely by evaluating integrals.

Case I. Suppose the integrand does not depend on y.

In this case, the functional under consideration is of type

J[y] = �ba F (x, y′)dx, …(1)

where F does not contain y explicitly. In this case,

F_y = 0. …(2)

Consequently, Euler’s equation becomes

$$\frac{d}{dx}\left(F_{y'}\right) = 0,$$ …(3)

which has the first integral

F_{y′} = C, …(4)

where C is a constant.

Equation (4) is a first-order ordinary differential equation. Solving (4) for y′, we obtain an equation of the form

y′ = f(x, C),

from which y can be found by integration.

Case 2. Suppose the integrand does not depend on x.

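In this case, the functional is

$$J[y] = \int_a^b F(y, y')\,dx.$$ …(1)

Now

$$F_y - \frac{d}{dx}\left(F_{y'}\right) = F_y - \left[\frac{\partial}{\partial y}\left(F_{y'}\right)\frac{dy}{dx} + \frac{\partial}{\partial y'}\left(F_{y'}\right)\frac{dy'}{dx}\right] = F_y - y'\,F_{y'y} - y''\,F_{y'y'}.$$ …(2)

So, Euler’s equation is

$$F_y - y'\,F_{y'y} - y''\,F_{y'y'} = 0.$$

Multiplying by y′, we obtain

$$y'\,F_y - (y')^2\,F_{y'y} - y'\,y''\,F_{y'y'} = 0,$$

that is,

$$\frac{d}{dx}\left[F - y'\,F_{y'}\right] = 0,$$ …(3)

which has the first integral

$$F - y'\,F_{y'} = c,$$ …(4)

where c is a constant. Equation (4) is a first-order differential equation.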

Case 3. Suppose the integrand does not depend on y′. In this case, the functional is

$$J[y] = \int_a^b F(x, y)\,dx,$$ …(1)

so

Fy′ = 0. …(2)

Hence, Euler’s equation becomes

Fy(x, y) = 0 …(3)

This equation is not a differential equation, but an algebraic equation in x and y. Its solution consists of one or more curves y = y(x).

Case 4. Suppose the functional J[y] is of the form

$$J[y] = \int_a^b f(x, y)\,\sqrt{1 + (y')^2}\,dx.$$ …(1)

This functional represents the integral of a function f(x, y) with respect to the arc length s, where

$$ds = \sqrt{1 + (y')^2}\,dx.$$

In this case,

$$F(x, y, y') = f(x, y)\,\sqrt{1 + (y')^2},$$ …(2)

and so

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = f_y\,\sqrt{1 + y'^2} - \frac{d}{dx}\left[\frac{f(x, y)\,y'}{\sqrt{1 + y'^2}}\right]$$

$$= f_y\,\sqrt{1 + y'^2} - \frac{f_x\,y'}{\sqrt{1 + y'^2}} - \frac{f_y\,y'^2}{\sqrt{1 + y'^2}} - \frac{f\,y''}{(1 + y'^2)^{3/2}}$$

$$= \frac{f_y}{\sqrt{1 + y'^2}} - \frac{f_x\,y'}{\sqrt{1 + y'^2}} - \frac{f\,y''}{(1 + y'^2)^{3/2}}$$

$$= \frac{1}{\sqrt{1 + y'^2}}\left[f_y - y'\,f_x - \frac{y''\,f}{1 + y'^2}\right].$$ …(3)

So, Euler’s equation becomes

$$f_y - y'\,f_x - \frac{y''\,f}{1 + y'^2} = 0,$$

which is a differential equation of order 2.

ILLUSTRATIVE EXAMPLES

Example 1. Solve the variational problem

$$J[y] = \int_1^2 \frac{\sqrt{1 + y'^2}}{x}\,dx,$$ …(1)

y(1) = 0, y(2) = 1. …(2)

Solution. We note that the integrand in the given functional does not depend on y explicitly, and

$$F(x, y, y') = \frac{\sqrt{1 + y'^2}}{x}.$$ …(3)

Euler’s equation for such a case is of the form

$$\frac{\partial F}{\partial y'} = c,$$ …(4)

where c is a constant. From equations (3) and (4), we find (exercise)

$$y' = \frac{cx}{\sqrt{1 - c^2 x^2}}.$$ …(5)

Integrating (5), it follows that (exercise)

$$y = -\frac{1}{c}\sqrt{1 - c^2 x^2} + c_1$$

or

$$(y - c_1)^2 + x^2 = \frac{1}{c^2},$$ …(6)

where c1 is a constant. The curve (6) represents a circle with centre (0, c1), lying on the y-axis, and radius 1/c. Using the boundary conditions (2), we find (exercise)

$$c = \frac{1}{\sqrt{5}}, \qquad c_1 = 2.$$ …(7)

So, the required curve is

x² + (y − 2)² = 5. …(8)
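The steps left as exercises can be checked numerically. The following sketch takes the branch y = 2 − √(5 − x²) of the circle (8), verifies the boundary conditions, and evaluates the first integral ∂F/∂y′ = y′ / (x √(1 + y′²)) at several points of [1, 2]; the function names are illustrative.

```python
import math

def y(x):
    """Lower branch of the circle x^2 + (y - 2)^2 = 5."""
    return 2.0 - math.sqrt(5.0 - x * x)

def dy(x):
    """Derivative of y with respect to x."""
    return x / math.sqrt(5.0 - x * x)

def first_integral(x):
    """dF/dy' for F = sqrt(1 + y'^2) / x, evaluated along the curve."""
    return dy(x) / (x * math.sqrt(1.0 + dy(x) ** 2))

values = [first_integral(1.0 + 0.1 * k) for k in range(11)]
print(y(1.0), y(2.0), values[0], values[-1])
```

The first integral takes the same value 1/√5 at every sample point, in agreement with the constant c found in (7).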

Example 2. Among all the curves joining two given points (x0, y0) and (x1, y1), find the one which generates the surface of minimum area when rotated about the x-axis.


Solution. We know that the area of the surface of revolution generated by rotating the curve y = y(x) about the x-axis is

$$2\pi \int_{x_0}^{x_1} y\,\sqrt{1 + y'^2}\,dx.$$

So, the variational problem is

$$J[y] = 2\pi \int_{x_0}^{x_1} y\,\sqrt{1 + y'^2}\,dx,$$ …(1)

with boundary conditions

y(x0) = y0, y(x1) = y1. …(2)

In this variational problem the integrand does not depend explicitly on x, and

$$F(y, y') = 2\pi y\,\sqrt{1 + y'^2},$$ …(3)

and the corresponding Euler equation is

$$F - y'\,F_{y'} = \text{constant}.$$ …(4)

We find (exercise)

$$\frac{y}{\sqrt{1 + y'^2}} = c, \qquad c = \text{constant}.$$ …(5)

We put

y′ = sinh t, …(6)

then (5) implies (exercise)

y = c cosh t. …(7)

Equations (6) and (7) give (exercise)

dx = dy / y′ = c dt.

Integrating it, we obtain

x = ct + c1, …(8)

where c1 is a constant. Eliminating t from equations (7) and (8), we find

$$y = c \cosh\left(\frac{x - c_1}{c}\right).$$ …(9)

The values of the arbitrary constants c and c1 are determined by the given conditions in (2).

The required curve is a catenary passing through the two given points. The surface generated by rotation of the catenary is called a catenoid.
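A small numerical check of this result is sketched below, under the assumption c = 1 and c1 = 0, so that the catenary y = cosh x joins the points (−1, cosh 1) and (1, cosh 1). It confirms that y / √(1 + y′²) is constant along the catenary and compares the area of the catenoid with that of the cylinder generated by the horizontal segment joining the same points. The comparison curve and all names are illustrative; a single comparison does not by itself prove minimality.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3.0

def surface_area(y, dy, a, b):
    """J[y] = 2*pi * integral of y * sqrt(1 + y'^2) over [a, b]."""
    return 2.0 * math.pi * simpson(lambda x: y(x) * math.sqrt(1.0 + dy(x) ** 2), a, b)

c = 1.0  # assumed value of the constant in y = c*cosh((x - c1)/c), with c1 = 0
catenary = lambda x: c * math.cosh(x / c)
catenary_slope = lambda x: math.sinh(x / c)

area_catenoid = surface_area(catenary, catenary_slope, -1.0, 1.0)
height = catenary(1.0)
area_cylinder = surface_area(lambda x: height, lambda x: 0.0, -1.0, 1.0)

# First integral y / sqrt(1 + y'^2), which should equal c along the catenary.
first_integral_values = [
    catenary(x) / math.sqrt(1.0 + catenary_slope(x) ** 2)
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0)
]

print(area_catenoid, area_cylinder, first_integral_values)
```

The catenoid area agrees with the closed form 2π(1 + sinh(2)/2) and is smaller than the area of the cylinder through the same boundary circles.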

Example 3. Minimize the functional

$$J[y] = \int_a^b (x - y)^2\,dx.$$ …(1)

Solution. In this example, the integrand does not contain y′ explicitly and

F(x, y) = (x − y)². …(2)

The corresponding Euler’s equation is

$$\frac{\partial F}{\partial y} = 0,$$ …(3)

which leads to

x − y = 0. …(4)

The required curve (4) is a straight line. Further, the functional (1) vanishes along this line.

This completes the solution.

1.5 THE CASE OF SEVERAL VARIABLES

Now, we consider a further generalization of the simplest variational problem. First we consider the case of n dependent functions. Let

$$J[y_1, y_2, \dots, y_n] = \int_a^b F(x, y_1, y_2, \dots, y_n, y_1', \dots, y_n')\,dx$$ …(1)

be a functional which depends on n continuously differentiable functions

y1(x), y2(x), …, yn(x)

satisfying the boundary conditions

yi(a) = Ai, yi(b) = Bi, 1 ≤ i ≤ n. …(2)

Here, we are looking for an extremum of the functional (1) defined on the set of smooth curves joining two fixed points in (n+1)-dimensional Euclidean space R^{n+1}.

The problem of finding geodesics (shortest curves joining two points of some manifold) is of this type. The same kind of problem arises in geometric optics, in finding the paths along which light rays propagate in an inhomogeneous medium. According to Fermat’s principle, light goes from a point, say P0, to a point, say P1, along the path for which the transit time is the smallest.

Theorem. Prove that a necessary condition for the curve

yi = yi(x), (i = 1, 2,…, n)

to be an extremal of the functional

$$J = \int_a^b F(x, y_1, y_2, \dots, y_n, y_1', y_2', \dots, y_n')\,dx$$ …(1)

is that the functions yi(x) satisfy the equations

$$\frac{\partial F}{\partial y_i} - \frac{d}{dx}\left(\frac{\partial F}{\partial y_i'}\right) = 0, \qquad 1 \le i \le n.$$ …(2)

Proof. First of all, we calculate the variation δJ of the given functional J in (1). We replace each yi(x) by a varied function yi(x) + hi(x).

By definition, the variation δJ of the functional J[y1,…, yn] is the expression which is linear in hi and hi′, 1 ≤ i ≤ n, and which differs from the increment

ΔJ = J[y1 + h1, y2 + h2,…, yn + hn] − J[y1, y2,…, yn] …(3)

by a quantity of order higher than 1 relative to hi and hi′; i = 1, 2,…, n. Since both yi(x) and yi(x) + hi(x) satisfy the boundary conditions

yi(a) = Ai, yi(b) = Bi, 1 ≤ i ≤ n, …(4)

we must have

hi(a) = hi(b) = 0 for each i. …(5)

Using Taylor’s theorem, we obtain

$$\Delta J = \int_a^b [F(x, y_1 + h_1, \dots, y_n + h_n, y_1' + h_1', \dots, y_n' + h_n') - F(x, y_1, \dots, y_n, y_1', \dots, y_n')]\,dx$$

$$= \int_a^b \sum_{i=1}^{n} \left(h_i\,\frac{\partial F}{\partial y_i} + h_i'\,\frac{\partial F}{\partial y_i'}\right) dx + \dots,$$ …(6)

where the dots denote terms of order higher than 1 relative to hi, hi′ (i = 1, 2,…, n). The integral on the right of (6) represents the principal linear part of the increment ΔJ. Hence, by definition, the variation δJ of J is

$$\delta J = \int_a^b \sum_{i=1}^{n} \left(h_i\,\frac{\partial F}{\partial y_i} + h_i'\,\frac{\partial F}{\partial y_i'}\right) dx.$$ …(7)

Since all the increments hi(x) are independent, we can choose one of them quite arbitrarily (as long as the boundary conditions are satisfied), setting all the others equal to zero. Therefore, the necessary condition

δJ = 0 …(8)

for an extremum implies

$$\int_a^b \left(h_i\,\frac{\partial F}{\partial y_i} + h_i'\,\frac{\partial F}{\partial y_i'}\right) dx = 0$$ …(9)

for each i = 1, 2,…, n. There are now n conditions in (9). Using Lemma 3, we obtain

$$\frac{\partial F}{\partial y_i} = \frac{d}{dx}\left(\frac{\partial F}{\partial y_i'}\right), \qquad 1 \le i \le n,$$

or

$$F_{y_i} - \frac{d}{dx}\left(F_{y_i'}\right) = 0, \qquad 1 \le i \le n.$$ …(10)

Equations (10) are called Euler’s equations. We note that (10) is, in general, a system of n second-order ordinary differential equations. The solution of (10) contains, in general, 2n arbitrary constants, which are determined from the boundary conditions in (4). This completes the proof.

Definition. Two functionals are said to be equivalent if they have the same

extremals.

Example. Find the extremal of the functional

$$J[y, z] = \int_0^{\pi/2} (y'^2 + z'^2 + 2yz)\,dx,$$

y(0) = 0, y(π/2) = 1, z(0) = 0, z(π/2) = −1.

Solution. Taking

y1(x) = y(x), y2(x) = z(x), …(1)

and

F = (y1′)² + (y2′)² + 2 y1 y2, …(2)

Euler’s equations

$$\frac{\partial F}{\partial y_i} - \frac{d}{dx}\left(\frac{\partial F}{\partial y_i'}\right) = 0, \qquad i = 1, 2,$$ …(3)

become

$$2z - \frac{d}{dx}(2y') = 0,$$

$$2y - \frac{d}{dx}(2z') = 0.$$

This gives

z = y′′, …(4)

y = z′′. …(5)

Equations (4) and (5) imply

$$\frac{d^4 y}{dx^4} - y = 0.$$ …(6)

The solution of (6) is

y(x) = c1 e^x + c2 e^{−x} + c3 sin x + c4 cos x, …(7)

where c1, c2, c3, c4 are constants. Equations (4) and (7) give

z(x) = c1 e^x + c2 e^{−x} − c3 sin x − c4 cos x. …(8)

Using the given boundary conditions

y(0) = 0, y(π/2) = 1, z(0) = 0, z(π/2) = −1, …(9)

we obtain (exercise)

c1 = c2 = 0, c3 = 1, c4 = 0. …(10)

Hence, the extremal of the given functional is given by

y(x) = sin x,

z(x) = −sin x. …(11)
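The result (11) can be checked against the Euler equations z = y″ and y = z″ and the boundary conditions. The sketch below approximates the second derivatives by central differences; the step size and the sample points are arbitrary choices made for illustration.

```python
import math

def y(x):
    return math.sin(x)

def z(x):
    return -math.sin(x)

def second_derivative(f, x, eps=1e-4):
    """Central-difference approximation of f''(x)."""
    return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / (eps * eps)

points = [0.1 * k for k in range(1, 15)]   # sample points inside (0, pi/2)

# Residuals of the Euler equations z = y'' and y = z''.
residual_1 = max(abs(z(x) - second_derivative(y, x)) for x in points)
residual_2 = max(abs(y(x) - second_derivative(z, x)) for x in points)

boundary_values = (y(0.0), y(math.pi / 2), z(0.0), z(math.pi / 2))
print(residual_1, residual_2, boundary_values)
```

Both residuals are at the level of the finite-difference error, and the boundary values are 0, 1, 0, −1 as required.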

1.6 THE PROBLEM OF GEODESICS

Suppose we have a surface σ specified by a vector equation

$$\vec{r} = \vec{r}(u, v).$$

The shortest curve (of minimum length) lying on the surface σ and connecting two points A and B of the surface σ is called the geodesic connecting the two points.

The equations for the geodesics of σ are the Euler equations of the corresponding variational problem, namely, the problem of finding the minimum distance (measured along the surface σ) between two points of the surface σ.

Euler’s Equations on Geodesics

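A curve lying on the surface

$$\vec{r} = \vec{r}(u, v)$$ …(1)

can be specified by the equations

u = u(t), v = v(t), t being a parameter. …(2)

Let

$$E(u, v) = \vec{r}_u \cdot \vec{r}_u, \qquad F(u, v) = \vec{r}_u \cdot \vec{r}_v, \qquad G(u, v) = \vec{r}_v \cdot \vec{r}_v.$$ …(3)

These quantities are called the coefficients of the first fundamental form of the surface (1). The arc length between the points A and B, corresponding to the parameter values t0 and t1, is given by (using results from Differential Geometry by C. E. Weatherburn)

$$J[u, v] = \int_{t_0}^{t_1} \sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2}\,dt.$$ …(4)

Euler’s equations for the functional (4) are

$$\frac{\partial}{\partial u}\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2} - \frac{d}{dt}\left[\frac{\partial}{\partial u'}\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2}\right] = 0,$$

$$\frac{\partial}{\partial v}\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2} - \frac{d}{dt}\left[\frac{\partial}{\partial v'}\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2}\right] = 0.$$

These become

$$\frac{E_u\,u'^2 + 2F_u\,u'v' + G_u\,v'^2}{\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2}} - \frac{d}{dt}\left[\frac{2(E\,u' + F\,v')}{\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2}}\right] = 0$$ …(5)

and

$$\frac{E_v\,u'^2 + 2F_v\,u'v' + G_v\,v'^2}{\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2}} - \frac{d}{dt}\left[\frac{2(F\,u' + G\,v')}{\sqrt{E\,u'^2 + 2F\,u'v' + G\,v'^2}}\right] = 0.$$ …(6)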

Remark. The concept of a geodesic can be defined not only for surfaces, but also for higher-dimensional manifolds. Finding the geodesics of an n-dimensional manifold reduces to solving a variational problem for a functional depending on n functions.

Example 1. Find the geodesics of the circular cylinder

$$\vec{r} = (a\cos\phi,\ a\sin\phi,\ z).$$

Solution. The variables ϕ and z play the role of the parameters u and v in the above article. Now

$$\vec{r} = (a\cos\phi,\ a\sin\phi,\ z).$$ …(1)

Then

$$\vec{r}_\phi = (-a\sin\phi,\ a\cos\phi,\ 0), \qquad \vec{r}_z = (0,\ 0,\ 1).$$ …(2)

Therefore

$$E = \vec{r}_\phi \cdot \vec{r}_\phi = a^2, \qquad F = \vec{r}_\phi \cdot \vec{r}_z = 0, \qquad G = \vec{r}_z \cdot \vec{r}_z = 1.$$ …(3)

The arc length between two points A(t1) and B(t2) lying on the cylinder (1) is given by the functional

$$J[\phi, z] = \int_{t_1}^{t_2} \sqrt{E\,\phi'^2 + 2F\,\phi' z' + G\,z'^2}\,dt$$ …(4)

or

$$J[\phi, z] = \int_{t_1}^{t_2} \sqrt{a^2\phi'^2 + z'^2}\,dt.$$ …(5)

Euler’s equations for the functional (5) are

$$0 - \frac{d}{dt}\left[\frac{a^2\phi'}{\sqrt{a^2\phi'^2 + z'^2}}\right] = 0,$$ …(6)

$$0 - \frac{d}{dt}\left[\frac{z'}{\sqrt{a^2\phi'^2 + z'^2}}\right] = 0.$$ …(7)

These equations, on integration, yield

$$\frac{a^2\phi'}{\sqrt{a^2\phi'^2 + z'^2}} = k_1, \qquad \frac{z'}{\sqrt{a^2\phi'^2 + z'^2}} = k_2,$$

where k1 and k2 are constants. Dividing the second of these equations by the first, we obtain

z′/ϕ′ = constant

or

$$\frac{dz}{d\phi} = c_1,$$

which has the solution

z = c1ϕ + c2. …(8)

Equation (8) represents a two-parameter family of helical lines lying on the cylinder (1). Thus, a geodesic on the cylinder (1) is a helix.
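To illustrate the conclusion, the sketch below compares the length of a helix with that of another curve on the same cylinder joining the same two points. It assumes radius a = 1, takes ϕ itself as the parameter t (so ϕ′ = 1), and joins (ϕ, z) = (0, 0) to (π/2, 1); the perturbed curve z = 2ϕ/π + 0.2 sin 2ϕ is an arbitrary comparison chosen for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3.0

RADIUS = 1.0  # cylinder radius a (illustrative value)

def curve_length(dz, phi0, phi1):
    """Length of z = z(phi) on the cylinder: integral of sqrt(a^2 + z'(phi)^2) d(phi)."""
    return simpson(lambda phi: math.sqrt(RADIUS ** 2 + dz(phi) ** 2), phi0, phi1)

# Helix z = (2/pi) * phi and a perturbed curve with the same end points.
helix_length = curve_length(lambda phi: 2.0 / math.pi, 0.0, math.pi / 2)
wavy_length = curve_length(
    lambda phi: 2.0 / math.pi + 0.4 * math.cos(2.0 * phi), 0.0, math.pi / 2)

print(helix_length, wavy_length)
```

The helix length equals √((π/2)² + 1), the straight-line distance on the unrolled cylinder, and the perturbed curve is longer.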

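Example 2. Find the geodesics of the sphere

$$\vec{r} = (a\sin\theta\cos\phi,\ a\sin\theta\sin\phi,\ a\cos\theta).$$

Solution. On the surface of the given sphere

$$\vec{r} = (a\sin\theta\cos\phi,\ a\sin\theta\sin\phi,\ a\cos\theta)$$ …(1)

we find (exercise)

E = a², F = 0, G = a² sin²θ. …(2)

Up to the constant factor a, the variational functional is (exercise)

$$J = \int \sqrt{\theta'^2 + \sin^2\theta}\,d\phi, \qquad \theta' = \frac{d\theta}{d\phi}.$$ …(3)

Here, the integrand is

$$F = F(\theta, \theta') = \sqrt{\theta'^2 + \sin^2\theta},$$ …(4)

which is independent of ϕ. So, the corresponding Euler’s equation is

F − θ′ F_{θ′} = constant = c

$$\Rightarrow \sqrt{\theta'^2 + \sin^2\theta} - \frac{\theta'^2}{\sqrt{\theta'^2 + \sin^2\theta}} = c$$

$$\Rightarrow \frac{\sin^2\theta}{\sqrt{\theta'^2 + \sin^2\theta}} = c$$

⇒ sin⁴θ = c²(θ′² + sin²θ)

⇒ c²θ′² = sin⁴θ − c² sin²θ

$$\Rightarrow \frac{d\theta}{d\phi} = \frac{\sin\theta\,\sqrt{\sin^2\theta - c^2}}{c}$$

$$\Rightarrow \frac{d\phi}{d\theta} = \frac{c}{\sin\theta\,\sqrt{\sin^2\theta - c^2}} = \frac{c\,\mathrm{cosec}^2\theta}{\sqrt{1 - c^2\,\mathrm{cosec}^2\theta}} = \frac{c\,\mathrm{cosec}^2\theta}{\sqrt{(1 - c^2) - c^2\cot^2\theta}}.$$

Integrating,

$$\phi = \cos^{-1}\left(\frac{c\cot\theta}{\sqrt{1 - c^2}}\right) + c'$$

$$\Rightarrow \cos(\phi - c') = \frac{c\cot\theta}{\sqrt{1 - c^2}}$$

⇒ c1 cot θ = cos ϕ cos c′ + sin ϕ sin c′, where c1 = c/√(1 − c²)

⇒ c1 cos θ = sin θ cos ϕ cos c′ + sin θ sin ϕ sin c′

⇒ z = Ax + By. …(5)

This is the equation of a plane passing through the centre (0, 0, 0) of the sphere and intersecting the sphere along a great circle.

Thus, the shortest curve, i.e., the geodesic on a sphere, is an arc of a great circle.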
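The link between the plane z = Ax + By and the first integral F − θ′F_θ′ = c can be checked numerically. On the sphere, the plane corresponds to cot θ = A cos ϕ + B sin ϕ. The sketch below, with illustrative values A = 0.5 and B = −0.3, evaluates sin²θ / √(θ′² + sin²θ) along this curve, using a central difference for θ′; a short calculation suggests the constant should be 1/√(1 + A² + B²).

```python
import math

A, B = 0.5, -0.3   # illustrative coefficients of the plane z = A*x + B*y

def theta(phi):
    """Polar angle on the great circle: cot(theta) = A*cos(phi) + B*sin(phi)."""
    return math.atan2(1.0, A * math.cos(phi) + B * math.sin(phi))

def dtheta(phi, eps=1e-5):
    """Central-difference approximation of d(theta)/d(phi)."""
    return (theta(phi + eps) - theta(phi - eps)) / (2.0 * eps)

def first_integral(phi):
    """sin^2(theta) / sqrt(theta'^2 + sin^2(theta)) along the curve."""
    s2 = math.sin(theta(phi)) ** 2
    return s2 / math.sqrt(dtheta(phi) ** 2 + s2)

values = [first_integral(0.3 * k) for k in range(10)]
expected = 1.0 / math.sqrt(1.0 + A * A + B * B)
print(values, expected)
```

The quantity is the same at every sampled ϕ, so the great circle cut out by the plane satisfies the first integral derived above.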

Example 3. Find the geodesic on the plane.

Solution. The geodesic on the plane is an extremal of the functional

$J[y] = \int_{x_0}^{x_1} \sqrt{1 + y'^2}\; dx.$ …(1)

The integrand F does not contain y explicitly. Hence, the corresponding Euler’s equation is

Fy′ = c …(2)

i.e.,

$\dfrac{1}{2}\,(1 + y'^2)^{-1/2}\cdot 2y' = c$

⇒ $y' = c\,\sqrt{1 + y'^2}$

⇒ $y'^2 = c^2(1 + y'^2)$

⇒ $y'^2(1 - c^2) = c^2$

⇒ $y' = A$ (a constant)

⇒ $y(x) = Ax + B.$ …(3)

This is the equation of a straight line in the plane. Thus, geodesics in a plane

are straight lines.

1.7 FUNCTIONALS DEPENDING ON HIGHER-ORDER DERIVATIVES

Problem. Among all functions y(x) belonging to the space $D_n(a, b)$ and satisfying the conditions

$y^{(i)}(a) = A_i,\quad y^{(i)}(b) = B_i,\quad 0 \le i \le n-1,$ …(1)

find the function for which the functional

$J[y] = \int_a^b F(x, y, y', y'', \ldots, y^{(n)})\; dx$ …(2)

has an extremum.



Solution. First, we state the general result which states that a necessary condition for a functional J[y] to have an extremum is that its variation vanish, i.e.,

δJ = 0. …(3)

We replace y(x) by the “varied” function y(x) + h(x), where h(x) belongs to $D_n(a, b)$ and y(x) + h(x) satisfies the boundary conditions (1). For this, we must have

$h^{(i)}(a) = h^{(i)}(b) = 0$ for i = 0, 1, 2, …, n − 1. …(4)

We know that by the variation δJ of the functional J[y], we mean the expression which is linear in h, h′, …, $h^{(n)}$, and which differs from the increment

∆J = J[y + h] − J[y], …(5)

by a quantity of order higher than 1 relative to h, h′,…, h(n). Next, we use Taylor’s theorem to obtain

$\Delta J = \int_a^b \left[F(x, y+h, y'+h', \ldots, y^{(n)}+h^{(n)}) - F(x, y, y', \ldots, y^{(n)})\right] dx$

$= \int_a^b \left[h F_y + h' F_{y'} + \cdots + h^{(n)} F_{y^{(n)}}\right] dx + \cdots,$ …(6)

where the dots denote terms of order higher than 1 relative to h, h′, …, $h^{(n)}$. The last integral in (6) represents the principal linear part of the increment ∆J. Therefore, by definition of the variation of J[y], we write

$\delta J = \int_a^b \left[h F_y + h' F_{y'} + \cdots + h^{(n)} F_{y^{(n)}}\right] dx.$ …(7)

The necessary condition (3) for an extremum implies that

$\int_a^b \left[h F_y + h' F_{y'} + \cdots + h^{(n)} F_{y^{(n)}}\right] dx = 0.$ …(8)

Integrating (8) by parts repeatedly and using the boundary conditions (4), we find that (exercise)

$\int_a^b \left[F_y - \dfrac{d}{dx}(F_{y'}) + \dfrac{d^2}{dx^2}(F_{y''}) - \cdots + (-1)^n \dfrac{d^n}{dx^n}(F_{y^{(n)}})\right] h(x)\; dx = 0,$ …(9)

for any function h(x) which has continuous derivatives and satisfies the boundary conditions (4). It follows from Lemma 1 that

$F_y - \dfrac{d}{dx}(F_{y'}) + \dfrac{d^2}{dx^2}(F_{y''}) - \cdots + (-1)^n \dfrac{d^n}{dx^n}(F_{y^{(n)}}) = 0.$ …(10)

Equation (10) is called Euler’s equation. It is an ordinary differential equation of order 2n; its general solution contains 2n arbitrary constants, which can be determined from the 2n boundary conditions in (1).


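Indeed, using (4),

$\int_a^b h' F_{y'}\, dx = \big[h F_{y'}\big]_a^b - \int_a^b h\, \dfrac{d}{dx}(F_{y'})\, dx = -\int_a^b h\, \dfrac{d}{dx}(F_{y'})\, dx,$

$\int_a^b h'' F_{y''}\, dx = \big[h' F_{y''}\big]_a^b - \int_a^b h'\, \dfrac{d}{dx}(F_{y''})\, dx$

$= -\left\{\left[h\, \dfrac{d}{dx}(F_{y''})\right]_a^b - \int_a^b h\, \dfrac{d^2}{dx^2}(F_{y''})\, dx\right\}$

$= (-1)^2 \int_a^b h\, \dfrac{d^2}{dx^2}(F_{y''})\, dx,$

and, in general,

$\int_a^b h^{(k)} F_{y^{(k)}}\, dx = (-1)^k \int_a^b h\, \dfrac{d^k}{dx^k}(F_{y^{(k)}})\, dx.$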
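A small numerical experiment can make Euler’s equation (10) concrete. The sketch below assumes the illustrative integrand F = y″² on [0, 1] with y(0) = y′(0) = 0, y(1) = 1, y′(1) = 0 (a choice made for the example, not taken from the text). Equation (10) then reads y⁗ = 0, whose solution under these conditions is y = 3x² − 2x³, and the code compares J on this cubic with J on a varied curve y + εh.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def y_xx(x):
    """Second derivative of the extremal y(x) = 3x^2 - 2x^3 (so y'''' = 0)."""
    return 6.0 - 12.0 * x

def h_xx(x):
    """Second derivative of h(x) = x^2 (1 - x)^2, which has h = h' = 0 at x = 0 and x = 1."""
    return 2.0 - 12.0 * x + 12.0 * x * x

def J(eps):
    """Value of J[y + eps*h] for the assumed integrand F = y''^2 on [0, 1]."""
    return simpson(lambda x: (y_xx(x) + eps * h_xx(x)) ** 2, 0.0, 1.0)

J0 = J(0.0)
print(J0, J(0.5), J(-0.5))
```

Here J[y] = 12, while J[y + εh] = 12 + 0.8ε², so the first-order change vanishes and every nonzero ε increases the functional, as the necessary condition δJ = 0 suggests.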

1.8 THE CONCEPT OF VARIATIONAL DERIVATIVE Let J[y] be a functional depending on the function y(x), and suppose we give y(x) an increment h(x) which is different from zero only in the neighbourhood of a point x0.

Let ∆σ denote the area lying between the curves y = y(x) and y = y(x) + h(x). Consider the ratio

$\dfrac{J[y+h] - J[y]}{\Delta\sigma}$ …(1)

of the increment

∆J = J[y+h] −J[y] …(2)

to the area ∆σ

Let the area

∆σ→0 …(3)

in such a way that

max |h(x)|→0 …(4)

as well as the length of the interval in which h(x) is non zero, goes to zero. Then, if the ratio (1) converges to a limit as ∆σ→0, this limit is called the variational derivative of the functional J[y] at the point x0(for the curve y = y(x)), and is denoted by


$\left.\dfrac{\delta J}{\delta y}\right|_{x = x_0}.$ …(5)

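Remarks. (1) In the light of the above, we write

$\Delta J = J[y+h] - J[y] = \left\{\left.\dfrac{\delta J}{\delta y}\right|_{x = x_0} + \epsilon\right\}\Delta\sigma,$ …(6)

where $\epsilon \to 0$ as $\Delta\sigma \to 0$.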

(2) The variation/differential of a functional J[y] at the point x = x0, in terms of the variational derivative, is given by the formula

$\delta J = \left.\dfrac{\delta J}{\delta y}\right|_{x = x_0}\,\Delta\sigma.$ …(7)

1.9 VARIATIONAL PROBLEMS WITH SUBSIDIARY CONDITIONS (THE ISOPERIMETRIC PROBLEM)

Theorem. Given the functional

$J[y] = \int_a^b F(x, y, y')\; dx,$

let the admissible curves satisfy the conditions

y(a) = A, y(b) = B,

$K[y] = \int_a^b G(x, y, y')\; dx = l,$

where K[y] is another functional, and let J[y] have an extremum for y = y(x). Then, if y = y(x) is not an extremal of K[y], there exists a constant λ such that y = y(x) is an extremal of the functional

$\int_a^b (F + \lambda G)\; dx.$

Proof. Let

$J[y] = \int_a^b F(x, y, y')\; dx$ …(1)

have an extremum for the curve y = y(x), subject to the conditions

y(a) = A, y(b) = B, …(2)

$K[y] = \int_a^b G(x, y, y')\; dx = l.$ …(3)

We choose two points x1 and x2 in the interval [a, b], where x1 is arbitrary and x2 satisfies a condition to be stated later on, but is otherwise arbitrary.

We give y(x) an increment

δ1y(x) + δ2y(x),

where

δ1y(x) is non-zero only in the neighbourhood of x1, (4a)

and

δ2y(x) is non-zero only in a neighbourhood of x2. …(4b)

Let y*(x) = y(x) + δ1y(x) + δ2y(x). …(5)

We now require that the “varied” curve y = y*(x) satisfy the condition

K[y*] = K[y]. …(6)

Using variational derivatives, we can write the increment ∆J of the functional J in the form

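$\Delta J = \left\{\left.\dfrac{\delta F}{\delta y}\right|_{x=x_1} + \epsilon_1\right\}\Delta\sigma_1 + \left\{\left.\dfrac{\delta F}{\delta y}\right|_{x=x_2} + \epsilon_2\right\}\Delta\sigma_2,$ …(7)

where

$\Delta\sigma_1 = \int_a^b \delta_1 y(x)\; dx,\qquad \Delta\sigma_2 = \int_a^b \delta_2 y(x)\; dx,$ …(8)

and $\epsilon_1, \epsilon_2 \to 0$ …(9)

as $\Delta\sigma_1, \Delta\sigma_2 \to 0$. …(10)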

Writing ∆K in a form similar to (7), we obtain

∆K = K[y*] −K[y]

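$= \left\{\left.\dfrac{\delta G}{\delta y}\right|_{x=x_1} + \epsilon_1'\right\}\Delta\sigma_1 + \left\{\left.\dfrac{\delta G}{\delta y}\right|_{x=x_2} + \epsilon_2'\right\}\Delta\sigma_2,$ …(11)

where

$\epsilon_1', \epsilon_2' \to 0$ …(12a)

as

$\Delta\sigma_1, \Delta\sigma_2 \to 0$. …(12b)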

Next, we choose the point x2 to be a point for which


$\left.\dfrac{\delta G}{\delta y}\right|_{x=x_2} \ne 0.$ …(13)

Such a point exists, since by hypothesis y = y(x) is not an extremal of the functional K. The condition of the point x2, given in (13), is the condition which we had mentioned earlier. With this choice of point x2, Equations (6) and (11) imply

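$\Delta\sigma_2 = -\left\{\dfrac{\left.\dfrac{\delta G}{\delta y}\right|_{x=x_1}}{\left.\dfrac{\delta G}{\delta y}\right|_{x=x_2}} + \epsilon'\right\}\Delta\sigma_1,$ …(14)

where $\epsilon' \to 0$ as $\Delta\sigma_1 \to 0$.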

We set

$\lambda = -\dfrac{\left.\dfrac{\delta F}{\delta y}\right|_{x=x_2}}{\left.\dfrac{\delta G}{\delta y}\right|_{x=x_2}}.$ …(15)

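Substituting (14) and (15) into (7), we obtain

$\Delta J = \left\{\left.\dfrac{\delta F}{\delta y}\right|_{x=x_1} + \epsilon_1\right\}\Delta\sigma_1 - \left\{\left.\dfrac{\delta F}{\delta y}\right|_{x=x_2} + \epsilon_2\right\}\left\{\dfrac{\left.\dfrac{\delta G}{\delta y}\right|_{x=x_1}}{\left.\dfrac{\delta G}{\delta y}\right|_{x=x_2}} + \epsilon'\right\}\Delta\sigma_1$

$= \left\{\left.\dfrac{\delta F}{\delta y}\right|_{x=x_1} + \lambda \left.\dfrac{\delta G}{\delta y}\right|_{x=x_1}\right\}\Delta\sigma_1 + \epsilon\,\Delta\sigma_1,$ …(16)

where $\epsilon \to 0$ as $\Delta\sigma_1 \to 0$.

This expression for ∆J explicitly involves variational derivatives only at the point x = x1, and the increment h(x) is now just δ1y(x). The “compensating increment” δ2y(x) has been taken into account automatically by using the condition ∆K = 0. Thus, the first term on the right-hand side of (16) is the principal linear part of ∆J. So, the variation of the functional J at the point x1 is

$\delta J = \left\{\left.\dfrac{\delta F}{\delta y}\right|_{x=x_1} + \lambda \left.\dfrac{\delta G}{\delta y}\right|_{x=x_1}\right\}\Delta\sigma_1.$ …(17)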

We know that a necessary condition for an extremum is that

δJ = 0. …(18)

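Since ∆σ1 is nonzero while x1 is arbitrary, we finally obtain

$\dfrac{\delta F}{\delta y} + \lambda\,\dfrac{\delta G}{\delta y} = 0,$

i.e.,

$\left(F_y - \dfrac{d}{dx}(F_{y'})\right) + \lambda\left(G_y - \dfrac{d}{dx}(G_{y'})\right) = 0.$ …(19)

This shows that y = y(x) is an extremal of the functional

$\int_a^b (F + \lambda G)\; dx,$

where λ is given by (15).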

This completes the proof.

Remarks (1) The general solution of differential equation (19) will contain two arbitrary constants in addition to the parameter λ. We shall determine these three quantities from two boundary conditions

y(a) = A,

y(b) = B

and the subsidiary condition

K[y] = l .

Remark (2). The above theorem/result generalizes immediately to the case of functionals depending on several functions.

Suppose we are looking for an extremum of the functional

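$J[y_1, y_2, \ldots, y_n] = \int_a^b F(x, y_1, \ldots, y_n, y_1', \ldots, y_n')\; dx,$ …(20)

subject to the conditions

$y_i(a) = A_i,\quad y_i(b) = B_i,\quad 1 \le i \le n,$ …(21)

and

$\int_a^b G_k(x, y_1, y_2, \ldots, y_n, y_1', \ldots, y_n')\; dx = l_k$ …(22)

for k = 1, 2, …, m.

In this case a necessary condition for an extremum is that

$\dfrac{\partial}{\partial y_i}\left(F + \sum_{k=1}^{m} \lambda_k G_k\right) - \dfrac{d}{dx}\left\{\dfrac{\partial}{\partial y_i'}\left(F + \sum_{k=1}^{m} \lambda_k G_k\right)\right\} = 0$ …(23)

for i = 1, 2, …, n.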

The 2n arbitrary constants appearing in the general solution of the system (23) and the values of the m parameters λ1, λ2, …, λm, sometimes called Lagrange multipliers, are determined from the boundary conditions (21) and the subsidiary conditions (22). Here, the number of Lagrange multipliers equals the number of constraint conditions.

Example 1. Among all curves of length l in the upper half-plane passing through the points (−a, 0) and (a, 0), find the one which together with the interval [−a, a] encloses the largest area.

Solution. We have to find the function

y = y(x)

for which the integral

$J[y] = \int_{-a}^{a} y(x)\; dx$ …(1)

takes the largest value subject to the conditions

y(−a) = 0, y(a) = 0, …(2)

$K[y] = \int_{-a}^{a} \sqrt{1 + y'^2}\; dx = l.$ …(3)

We form the functional

J*[y] = J[y] + λ K[y]

$= \int_{-a}^{a} \left[y + \lambda\sqrt{1 + y'^2}\right] dx.$ …(4)

The corresponding Euler’s equation is

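$\dfrac{\partial}{\partial y}\left[y + \lambda\sqrt{1 + y'^2}\right] - \dfrac{d}{dx}\left\{\lambda\cdot\dfrac{1}{2}\cdot\dfrac{2y'}{\sqrt{1 + y'^2}}\right\} = 0,$

i.e.,

$1 - \lambda\,\dfrac{d}{dx}\left(\dfrac{y'}{\sqrt{1 + y'^2}}\right) = 0.$ …(5)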


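Integrating, we obtain

$x - \dfrac{\lambda y'}{\sqrt{1 + y'^2}} = c_1$

or $(x - c_1) = \dfrac{\lambda y'}{\sqrt{1 + y'^2}}$

or $(x - c_1)^2 (1 + y'^2) = \lambda^2 y'^2,$

so that

$y' = \dfrac{x - c_1}{\sqrt{\lambda^2 - (x - c_1)^2}}.$ …(6)

Integrating (6), we obtain

$y(x) = \int \dfrac{x - c_1}{\sqrt{\lambda^2 - (x - c_1)^2}}\; dx = -\sqrt{\lambda^2 - (x - c_1)^2} + c_2$

or $(x - c_1)^2 + (y - c_2)^2 = \lambda^2.$ …(7)

It is a family of circles. The values of c1, c2 and λ are determined from the given conditions in (2) and (3).

We find $(-a - c_1)^2 + c_2^2 = \lambda^2,$

$(a - c_1)^2 + c_2^2 = \lambda^2$

⇒ $c_1 = 0.$ …(8)

Then, we have

$c_2 = -\sqrt{\lambda^2 - a^2}.$ …(9)

So, solution (7) now becomes

$x^2 + \left(y + \sqrt{\lambda^2 - a^2}\right)^2 = \lambda^2.$ …(10)

This gives

$y = \sqrt{\lambda^2 - x^2} - \sqrt{\lambda^2 - a^2}$

and $y' = -\dfrac{x}{\sqrt{\lambda^2 - x^2}}.$ …(11)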


Now, the condition (3) implies

$l = \int_{-a}^{a} \sqrt{1 + \dfrac{x^2}{\lambda^2 - x^2}}\; dx = \int_{-a}^{a} \dfrac{\lambda}{\sqrt{\lambda^2 - x^2}}\; dx = 2\lambda \sin^{-1}(a/\lambda).$

This gives

$a/\lambda = \sin(l/2\lambda).$ …(12)

Equation (12) is a transcendental equation for λ. Solving it, we find a definite value, say λ = λ0. Then the solution curve (10) becomes

$x^2 + \left(y + \sqrt{\lambda_0^2 - a^2}\right)^2 = \lambda_0^2.$ …(13)

The result (13) is the required curve.
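Since the equation for λ is transcendental, in practice it is solved numerically. The following sketch uses plain bisection on the length relation l = 2λ sin⁻¹(a/λ); the function name `solve_lambda`, the sample values a = 1, l = 2.5 and the restriction 2a < l ≤ πa (an arc no larger than a semicircle, so that y is a single-valued function of x) are assumptions made for the illustration.

```python
import math

def solve_lambda(a, l, iterations=200):
    """Solve l = 2*lam*asin(a/lam) for the radius lam >= a by bisection."""
    if not (2.0 * a < l <= math.pi * a):
        raise ValueError("this sketch assumes 2a < l <= pi*a")

    def residual(lam):
        # Decreasing in lam: equals pi*a - l at lam = a and tends to 2a - l < 0.
        return 2.0 * lam * math.asin(a / lam) - l

    lo, hi = a, 2.0 * a
    while residual(hi) > 0.0:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, l = 1.0, 2.5
lam0 = solve_lambda(a, l)
centre_y = -math.sqrt(lam0 ** 2 - a ** 2)   # the circle is centred at (0, centre_y)
print(lam0, centre_y)
```

The value returned plays the role of λ0, and the extremal is the arc of the circle of radius λ0 centred at (0, −√(λ0² − a²)).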

Example 2. Find the extremal of the functional

$J[y] = \int_0^\pi y'^2\; dx$

subject to the conditions

y(0) = 0, y(π) = 0,

$\int_0^\pi y^2\; dx = 1.$

Solution. We form an auxiliary functional

$J^*[y] = \int_0^\pi F(x, y, y')\; dx,$ …(1)

where

$F(x, y, y') = y'^2 + \lambda y^2,$ λ being a parameter. …(2)

Euler’s equation for (1) is

$F_y - \dfrac{d}{dx}(F_{y'}) = 0,$

i.e., $2\lambda y - \dfrac{d}{dx}(2y') = 0$

or $y'' - \lambda y = 0.$ …(3)


First of all, we claim that

λ < 0. …(4)

If possible, suppose that λ ≥ 0. For λ > 0, the general solution of the second order ODE (3) is

$y(x) = c_1 e^{\sqrt{\lambda}\,x} + c_2 e^{-\sqrt{\lambda}\,x},$ …(5A)

while for λ = 0 it is $y(x) = c_1 + c_2 x$. In either case, the boundary conditions

y(0) = 0, y(π) = 0 …(5)

give (exercise)

c1 = c2 = 0, …(5B)

and y(x) ≡ 0.

Then $\int_0^\pi y^2\; dx = 0 \ne 1,$ …(5C)

which violates the given condition. Hence, our claim in (4) is valid.

Consequently, the solution of ODE (3) is

$y(x) = c_1 \sin(\sqrt{-\lambda}\; x) + c_2 \cos(\sqrt{-\lambda}\; x).$ …(6)

The boundary condition y(0) = 0 gives

c2 = 0, …(7)

and boundary condition

y (π) = 0

implies

λ = −k2 …(8)

for k = 1, 2, 3,…

Thus, a solution of the ODE satisfying the two boundary conditions is

y(x) = c1 sin kx, …(9)

where c1 is a non-zero constant yet to be determined.

The condition

$\int_0^\pi y^2\; dx = 1$ …(10)

gives (exercise)

$c_1 = \pm\sqrt{2/\pi}.$ …(11)

Hence, the extremals are

$y(x) = \pm\sqrt{2/\pi}\;\sin kx,$ …(12)

where k = 1, 2, 3, … . …(13)
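The normalisation constant can be confirmed numerically. The sketch below integrates y² and y′² over [0, π] with Simpson’s rule for the first few values of k; the helper names `simpson` and `check_extremal` are illustrative choices, not part of the text.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

def check_extremal(k):
    """Return (integral of y^2, integral of y'^2) for y = sqrt(2/pi) sin(kx) on [0, pi]."""
    c = math.sqrt(2.0 / math.pi)
    y = lambda x: c * math.sin(k * x)
    dy = lambda x: c * k * math.cos(k * x)
    constraint = simpson(lambda x: y(x) ** 2, 0.0, math.pi)
    value = simpson(lambda x: dy(x) ** 2, 0.0, math.pi)
    return constraint, value

results = {k: check_extremal(k) for k in (1, 2, 3)}
for k, (constraint, value) in results.items():
    print(k, constraint, value)
```

For each k the constraint integral equals 1 and the functional takes the value J = k², so among these extremals the smallest value is attained for k = 1.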

Example 3. Find an extremal of the functional

$J[y, z] = \int_0^1 \left[y'^2 + z'^2 - 4xz' - 4z\right] dx,$

y(0) = 0, y(1) = 1,

z(0) = 0, z(1) = 1,

subject to the condition

$\int_0^1 \left[y'^2 - xy' - z'^2\right] dx = 2.$

Solution. We form an auxiliary functional

$J^*[y, z] = \int_0^1 F(x, y, z, y', z')\; dx,$ …(1)

where

$F(x, y, z, y', z') = (y'^2 + z'^2 - 4xz' - 4z) + \lambda\,(y'^2 - xy' - z'^2),$ …(2)

in which λ is a parameter.

The system of Euler’s equations is

$0 + \dfrac{d}{dx}(2y' + 2\lambda y' - \lambda x) = 0,$ …(3)

$4 + \dfrac{d}{dx}(2z' - 4x - 2\lambda z') = 0.$ …(4)

Solving these equations (exercise), we obtain

$y(x) = \dfrac{\lambda x^2 + 2c_1 x}{4(1 + \lambda)} + c_2,$ …(5)

$z(x) = \dfrac{c_3\,x}{2(1 - \lambda)} + c_4,$ …(6)

where c1, c2, c3, c4 are constants of integration. Using the boundary conditions

y(0) = 0, y(1) = 1, z(0) = 0, z(1) = 1, …(7)

we find (exercise)


$c_1 = \dfrac{3\lambda + 4}{2},\quad c_2 = 0,\quad c_3 = 2(1 - \lambda),\quad c_4 = 0.$ …(8)

Hence, the solution of Euler’s system is

$y(x) = \dfrac{\lambda x^2 + (3\lambda + 4)\,x}{4(1 + \lambda)},$ …(9)

z(x) = x. …(10)

To find λ, we substitute y(x) and z(x) from equations (9) and (10) into the given condition

$\int_0^1 (y'^2 - xy' - z'^2)\; dx = 2,$ …(11)

and find (exercise) two values of λ, namely

$\lambda_1 = -\dfrac{10}{11},\qquad \lambda_2 = -\dfrac{12}{11}.$ …(12)

The actual substitution of λ, y and z in (2), we find that λ2 does not satisfy it, but λ1 does. Hence, the desired extremal is determined by the equations

$y(x) = \dfrac{7x - 5x^2}{2},\qquad z(x) = x.$ …(13)

1.10 FINITE SUBSIDIARY CONDITIONS

We now consider a problem which can be stated as follows:

Problem. Find the functions $y_i(x)$ for which the functional

$J[y_1, y_2, \ldots, y_n] = \int_a^b F(x, y_1, \ldots, y_n, y_1', \ldots, y_n')\; dx$ …(1)

has an extremum, where the admissible functions satisfy the boundary

conditions

yi(a) = Ai, yi(b) = Bi, 1 ≤ i ≤ n, …(2)

and m “finite” subsidiary conditions (m < n)

$g_k(x, y_1, \ldots, y_n) = 0,\quad 1 \le k \le m.$ …(3)

Note. In the above problem, the functional (1) is not considered for all curves satisfying the boundary conditions (2), but only for those which lie in the (n − m)-dimensional manifold defined by the system (3).

Remark. For simplicity, we restrict ourselves to the case n = 2 and m = 1.

Theorem. Given the functional

$J[y, z] = \int_a^b F(x, y, z, y', z')\; dx,$ …(1)

let the admissible curves lie on the surface

g(x, y, z) = 0, …(2)

and satisfy the boundary conditions

y(a) = A1, y(b) = B1, …(3)

z(a) = A2, z(b) = B2, …(4)

and moreover, let J[y, z] have an extremum for the curve

y = y(x), z = z(x) …(5)

Then, if $g_y$ and $g_z$ do not vanish simultaneously at any point of the surface (2), there exists a function λ(x) such that (5) is an extremal of the functional

$\int_a^b \left[F + \lambda(x)\, g\right] dx.$ …(6)

Proof. We are required to prove that (5) satisfies the differential equations

$F_y + \lambda g_y - \dfrac{d}{dx}(F_{y'}) = 0,$ …(7)

$F_z + \lambda g_z - \dfrac{d}{dx}(F_{z'}) = 0.$ …(8)

Let J[y, z] have an extremum for the curve (5), subject to the conditions (2) to (4). Let x1 be an arbitrary point of the interval [a, b]. Next, we give y(x) an increment δy(x) and z(x) an increment δz(x), where both δy(x) and δz(x) are non-zero only in a neighbourhood, say [α, β] ⊂ [a, b], of x1. Using the notion of variational derivatives, we can write the corresponding increment

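$\Delta J = J[y + \delta y, z + \delta z] - J[y, z]$ …(9)

in the form

$\Delta J = \left\{\left.\dfrac{\delta F}{\delta y}\right|_{x=x_1} + \epsilon_1\right\}\Delta\sigma_1 + \left\{\left.\dfrac{\delta F}{\delta z}\right|_{x=x_1} + \epsilon_2\right\}\Delta\sigma_2,$ …(10)

where

$\Delta\sigma_1 = \int_a^b \delta y(x)\; dx,\qquad \Delta\sigma_2 = \int_a^b \delta z(x)\; dx,$ …(11)

and $\epsilon_1, \epsilon_2 \to 0$ as $\Delta\sigma_1, \Delta\sigma_2 \to 0$. …(12)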

We now require that the “varied” curve

y = y*(x) = y(x) +δy(x),

z = z*(x) = z(x) + δz(x) …(13)

satisfy the condition (2), i.e.,

g(x, y*, z*) = 0. …(14)

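Then

$0 = \int_a^b \left[g(x, y^*, z^*) - g(x, y, z)\right] dx = \int_a^b \left[\bar g_y\, \delta y + \bar g_z\, \delta z\right] dx$

$= \left\{\left. g_y\right|_{x=x_1} + \epsilon_1'\right\}\Delta\sigma_1 + \left\{\left. g_z\right|_{x=x_1} + \epsilon_2'\right\}\Delta\sigma_2,$ …(15)

where $\epsilon_1', \epsilon_2' \to 0$ as $\Delta\sigma_1, \Delta\sigma_2 \to 0$, and the overbar indicates that the corresponding derivatives are evaluated along certain intermediate curves. By hypothesis, either

$\left. g_y\right|_{x=x_1}$ or $\left. g_z\right|_{x=x_1}$

is nonzero. If

$\left. g_z\right|_{x=x_1} \ne 0,$ …(16)

we can write the condition (15) in the form

$\Delta\sigma_2 = -\left\{\dfrac{\left. g_y\right|_{x=x_1}}{\left. g_z\right|_{x=x_1}} + \epsilon'\right\}\Delta\sigma_1,$ …(17)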

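where $\epsilon' \to 0$ as $\Delta\sigma_1 \to 0$. Substituting (17) into the formula (10) for ∆J, we obtain

$\Delta J = \left\{\dfrac{\delta F}{\delta y} - \dfrac{g_y}{g_z}\,\dfrac{\delta F}{\delta z}\right\}_{x=x_1}\Delta\sigma_1 + \epsilon\,\Delta\sigma_1,$ …(18)

where $\epsilon \to 0$ as $\Delta\sigma_1 \to 0$. The first term on the right-hand side of (18) is the principal linear part of ∆J. Hence, by definition, the variation δJ of the functional J at the point x1 is

$\delta J = \left\{\dfrac{\delta F}{\delta y} - \dfrac{g_y}{g_z}\,\dfrac{\delta F}{\delta z}\right\}_{x=x_1}\Delta\sigma_1.$ …(19)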

We know that a necessary condition for an extremum of the functional J is that

δJ = 0. …(20)

Since ∆σ1 is non-zero while x1 is arbitrary, equations (19) and (20) imply

$\dfrac{\delta F}{\delta y} - \dfrac{g_y}{g_z}\,\dfrac{\delta F}{\delta z} = 0,$

or $\left(F_y - \dfrac{d}{dx}(F_{y'})\right) - \dfrac{g_y}{g_z}\left(F_z - \dfrac{d}{dx}(F_{z'})\right) = 0,$

or $\dfrac{F_y - \dfrac{d}{dx}(F_{y'})}{g_y} = \dfrac{F_z - \dfrac{d}{dx}(F_{z'})}{g_z}.$ …(21)

Along the curve

y = y(x),

z = z(x),

the common value of the ratios in (21) is some function of x, say −λ(x). Then (21) reduces to the system of differential equations

$F_y + \lambda g_y - \dfrac{d}{dx}(F_{y'}) = 0,$

$F_z + \lambda g_z - \dfrac{d}{dx}(F_{z'}) = 0,$

which are precisely equations (7) and (8). This completes the proof of the theorem.

Remark 1. If the functional J has an extremum for a curve γ, subject to the condition

g(x, y, z, y′, z′) = 0, …(∗)

and if the derivatives $g_{y'}$ and $g_{z'}$ do not vanish simultaneously along γ, then there exists a function λ(x) such that γ is an integral curve of the system of differential equations

$\Phi_y - \dfrac{d}{dx}(\Phi_{y'}) = 0,$

$\Phi_z - \dfrac{d}{dx}(\Phi_{z'}) = 0,$

where

$\Phi = F + \lambda g.$

Remark 2. If we assume that the condition (2) does not hold everywhere, but only at some fixed point

g(x1, y, z) = 0, …(∗∗)

we obtain a condition whose left-hand side can be regarded as a functional of y and z. Thus, the condition (2) can be regarded as an infinite set of conditions, each of which is a functional.

Example 1. Among all curves lying on the sphere $x^2 + y^2 + z^2 = a^2$ and passing through two given points (x0, y0, z0) and (x1, y1, z1), find the one which has the least length.

Solution. The length of the curve

y = y(x), z = z(x) …(1)

is given by the integral

$J[y, z] = \int_{x_0}^{x_1} \sqrt{1 + y'^2 + z'^2}\; dx.$ …(2)

The curve (1) lies on the sphere

$x^2 + y^2 + z^2 = a^2.$ …(3)

We form the auxiliary functional

$J^* = \int_{x_0}^{x_1} \left[\sqrt{1 + y'^2 + z'^2} + \lambda(x)\,(x^2 + y^2 + z^2)\right] dx.$ …(4)

The boundary conditions are

$y(x_0) = y_0,\quad y(x_1) = y_1,\quad z(x_0) = z_0,\quad z(x_1) = z_1.$ …(5)

The Euler’s equations corresponding to the functional (4) are

$2\lambda(x)\, y - \dfrac{d}{dx}\left(\dfrac{y'}{\sqrt{1 + y'^2 + z'^2}}\right) = 0,$ …(6)

$2\lambda(x)\, z - \dfrac{d}{dx}\left(\dfrac{z'}{\sqrt{1 + y'^2 + z'^2}}\right) = 0.$ …(7)

Solving equations (6) and (7), we obtain a family of curves depending on four constants, whose values are determined from the boundary conditions in (5).

Example 2 . Find the shortest distance between the points A(1, −1, 0) and B(2, 1, −1) lying on the surface 15x−7y +z −22 = 0.

Solution. In this question, we have to find the minimum of the functional

$J[y, z] = \int_1^2 \sqrt{1 + y'^2 + z'^2}\; dx,$ …(1)

subject to the conditions

y(1) = −1, y(2) = 1, z(1) = 0, z(2) = −1, …(2)

provided

g(x, y, z) ≡ 15x − 7y + z − 22 = 0. …(3)

To achieve this end, we form an auxiliary functional

$J^*[y, z] = \int_1^2 F(x, y, z, y', z')\; dx,$ …(4)

where

$F = \sqrt{1 + y'^2 + z'^2} + \lambda(x)\,[15x - 7y + z - 22].$ …(5)

The corresponding Euler’s equations are

$0 + \lambda(x)\,\{-7\} - \dfrac{d}{dx}\left(\dfrac{y'}{\sqrt{1 + y'^2 + z'^2}}\right) = 0,$ …(6)

$0 + \lambda(x)\,\{1\} - \dfrac{d}{dx}\left(\dfrac{z'}{\sqrt{1 + y'^2 + z'^2}}\right) = 0.$ …(7)

Adding 7 times equation (7) to equation (6), we get

$\dfrac{d}{dx}\left(\dfrac{y' + 7z'}{\sqrt{1 + y'^2 + z'^2}}\right) = 0.$

Integrating, we find

$\dfrac{y' + 7z'}{\sqrt{1 + y'^2 + z'^2}} = c_1.$ …(8)

From (3), we write

z′ = 7y′ − 15. …(9)

From equations (8) and (9), and then integrating, we obtain (exercise)

y(x) = αx + β. …(10)

Using the boundary conditions in (2), we find (exercise)

α = 2, β = −3, y(x) = 2x − 3. …(11)

From equations (9) and (11), we have

z′ = −1,

giving z(x) = c − x.

The boundary conditions in (2) give

z(x) = 1 − x. …(12)

Putting y(x) and z(x) from equations (11) and (12) into equation (6), we find

λ(x) ≡ 0. …(13)

The desired shortest distance is (exercise)

$l = \int_1^2 \sqrt{1 + y'^2 + z'^2}\; dx = \sqrt{6}.$
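The result of this example is easy to verify directly. The short sketch below checks that the curve y = 2x − 3, z = 1 − x stays on the plane 15x − 7y + z − 22 = 0, passes through A and B, and has length √6; the function names are chosen only for the illustration.

```python
import math

def g(x, y, z):
    """Left-hand side of the constraint 15x - 7y + z - 22 = 0."""
    return 15 * x - 7 * y + z - 22

def y_of(x):
    return 2 * x - 3      # equation (11)

def z_of(x):
    return 1 - x          # equation (12)

A = (1.0, y_of(1.0), z_of(1.0))
B = (2.0, y_of(2.0), z_of(2.0))

# The extremal is a straight segment, so its length is the distance |AB|.
on_plane = all(abs(g(x, y_of(x), z_of(x))) < 1e-12 for x in (1.0, 1.25, 1.5, 2.0))
distance = math.dist(A, B)
print(A, B, on_plane, distance)
```

The end points come out as A(1, −1, 0) and B(2, 1, −1), as required by the boundary conditions (2).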

Books Recommended for Chapter I

1. I.M. Gelfand and S.V. Fomin, Calculus of Variations, Prentice Hall.


Chapter-2

Transport and Laplace Equations

2.1 INTRODUCTION Many physical problems in science, engineering and geometry can be modeled mathematically by partial differential equations (PDE). A partial differential equation is an equation involving an unknown function of two or more variables and certain of its partial derivatives.

Before writing a typical PDE symbolically, we first present the notation and symbols to be used subsequently.

2.1.1 Geometric Notation

(i) Rn = n – dimensional real Euclidean space,

(ii) R1 = R = real line.

(iii) ei = ith standard coordinate vector

= (0, 0, ……, 0, 1, 0,…….0).

(iv) A typical point x in Rn is

x = (x1, x2,……, xn).

Sometimes, we will also regard x as a row or column vector.

(v) $R^n_+$ = open upper half-space

= { x = (x1, x2, ……, xn) ∈ $R^n$ | xn > 0 }.

(vi) $R_+$ = { x ∈ R | x > 0 }.

(vii) U, V, W, etc. usually denote open subsets of $R^n$.

(viii) ∂U = boundary of U.

(ix) $\bar U$ = closure of U

= U ∪ ∂U.

(xi) A typical point in Rn+1 will often be denoted as

(x, t) = (x1, x2,…., xn, t),

and we usually interpret

t = xn+1 = time.


(xii) A point x ∈ $R^n$ will sometimes be written as

x = (x′, xn)

for x′ = (x1, x2, …, x_{n−1}) ∈ $R^{n-1}$.

(xiii) B(x, r) = closed ball in $R^n$ with centre x and radius r > 0

= { y ∈ $R^n$ | |y − x| ≤ r }.

(xiv) $B^0(x, r)$ = open ball in $R^n$ with centre x and radius r > 0

= { y ∈ $R^n$ | |y − x| < r }.

(xv) For a = (a1, a2, …, an) and b = (b1, b2, …, bn),

$a \cdot b = \sum_{i=1}^{n} a_i b_i,$

$|a| = \left(\sum_{i=1}^{n} a_i^2\right)^{1/2} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$ ≡ Euclidean norm of a.

(xvi) Cn = n – dimensional complex space,

(xvii) C1 = C = complex plane.

(xviii) α(n) = volume of the unit ball B(0, 1) in $R^n$

$= \dfrac{\pi^{n/2}}{\Gamma\left(\dfrac{n}{2} + 1\right)}.$

In particular, for n = 3,

$\alpha(3) = \dfrac{4}{3}\pi$ (the volume of a ball of radius r = 1).

(xix) n α(n) = surface area of the unit sphere ∂B(0, 1) in $R^n$.
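The constants α(n) and nα(n) are easy to tabulate. The sketch below evaluates them with the gamma function from the Python standard library; the function names are illustrative.

```python
import math

def unit_ball_volume(n):
    """alpha(n) = pi^(n/2) / Gamma(n/2 + 1), the volume of the unit ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def unit_sphere_area(n):
    """n * alpha(n), the surface area of the unit sphere in R^n."""
    return n * unit_ball_volume(n)

for n in (1, 2, 3, 4):
    print(n, unit_ball_volume(n), unit_sphere_area(n))
```

For n = 2 this gives the area π and circumference 2π of the unit disc, and for n = 3 the familiar values 4π/3 and 4π.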

2.1.2. Notation for Functions

If u : U → R is a real valued function with domain U ⊂ Rn, we write

u(x) = u(x1, x2,……, xn), for x ∈ U.

Definition: A function u is called smooth when u is infinitely differentiable.

If u and v are two functions, then we write

u ≡ v (read: u is identically equal to v)

when the functions u and v agree for all values of their arguments.

(i) We write

u : = v

to define u as equaling v

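(ii) $u^+ = \max(u, 0)$, $u^+ \ge 0$.

(iii) $u^- = -\min(u, 0)$, $u^- \ge 0$.

(iv) $u = u^+ - u^-$.

(v) $|u| = u^+ + u^-$.

(vi) The sign function is defined as

$\mathrm{sgn}(x) = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases}$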

(vii) If u : U → Rm , U ⊂ Rn ,we write

u (x) = (u1(x) , u2(x) , .... , um(x)) for x ∈ U

Here ,uk is the kth component of u for k = 1,2,…., m. Further

uk : U → R.

(viii) The function

$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E \end{cases}$

is called the Indicator Function of E.

(ix) A function u : U → R is called Lipschitz continuous if

| u(x) – u(y) | ≤ C | x – y | for all x, y ∈ U

and for some constant C. Here, on the left there is the norm in R, and on the right the norm in $R^n$.

2.1.3. Notation for Derivatives

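Let u : U → R, x ∈ U ⊂ $R^n$. We write

(i) $\dfrac{\partial u}{\partial x_i}(x) = \lim_{h \to 0} \dfrac{u(x + h e_i) - u(x)}{h}$, provided this limit exists, h ∈ R.

(ii) We usually write $u_{x_i}$ for $\dfrac{\partial u}{\partial x_i}$.

(iii) $u_{x_i x_j} = \dfrac{\partial^2 u}{\partial x_i \partial x_j}$, $u_{x_i x_j x_k} = \dfrac{\partial^3 u}{\partial x_i \partial x_j \partial x_k}$, etc.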

2.1.4. Multiindex Notation

(1) A vector / n-tuple of the form

α = (α1, α2,………., αn), αi is a non-negative integer for each i,

is called a multiindex. Its order is defined as

$|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n = \sum_{i=1}^{n} \alpha_i.$

Note: | α | is a non-negative integer.

Also, we define

$\alpha! = \alpha_1!\,\alpha_2! \cdots \alpha_n!$

(2) For x ∈ $R^n$, we define

$x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}.$

(3) We employ the symbol

Du

to denote the gradient vector of the function u.

(4) Given a multiindex α = (α1, α2, ……, αn), we define

$D^\alpha u(x) = \dfrac{\partial^{|\alpha|} u(x)}{\partial x_1^{\alpha_1}\, \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}} = \partial_{x_1}^{\alpha_1}\, \partial_{x_2}^{\alpha_2} \cdots \partial_{x_n}^{\alpha_n} u = \prod_{i=1}^{n} \left(\dfrac{\partial}{\partial x_i}\right)^{\alpha_i} u.$

In particular, if α = 0, then $D^\alpha$ is the identity operator.

(5) If k is a non-negative integer, we define

$D^k u(x) := \{D^\alpha u(x) : |\alpha| = k\}.$ (*)

Thus, $D^k u(x)$ is the set of all partial derivatives of order k. Assigning some ordering to the various partial derivatives in (*), we can also regard $D^k u(x)$ as a point in $R^{n^k}$.

(6) We define

$|D^k u| = \left(\sum_{|\alpha| = k} |D^\alpha u|^2\right)^{1/2}.$

(7) Special cases:

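(a) When k = 1, Du is a point in $R^n$ and we arrange the elements of Du in a vector of the form

$Du = (u_{x_1}, u_{x_2}, \ldots, u_{x_n})$ = gradient vector.

In particular, for n = 3,

$Du = (u_{x_1}, u_{x_2}, u_{x_3}).$

(b) When k = 2, $D^2u$ can be regarded as an element of $R^{n^2}$, and the elements of $D^2u$ are arranged in a matrix

$D^2u = \begin{pmatrix} \dfrac{\partial^2 u}{\partial x_1^2} & \dfrac{\partial^2 u}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 u}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 u}{\partial x_2 \partial x_1} & \dfrac{\partial^2 u}{\partial x_2^2} & \cdots & \dfrac{\partial^2 u}{\partial x_2 \partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2 u}{\partial x_n \partial x_1} & \dfrac{\partial^2 u}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 u}{\partial x_n^2} \end{pmatrix}_{n \times n}.$

This matrix is called the HESSIAN MATRIX.

For n = 2 (i.e., in two-dimensional space),

u = u(x, y) and $D^2u = \begin{pmatrix} \dfrac{\partial^2 u}{\partial x^2} & \dfrac{\partial^2 u}{\partial x \partial y} \\ \dfrac{\partial^2 u}{\partial y \partial x} & \dfrac{\partial^2 u}{\partial y^2} \end{pmatrix}_{2 \times 2}.$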

(c) $\mathrm{tr}(D^2 u) = \sum_{i=1}^{n} u_{x_i x_i}$ = Laplacian of u = ∆u.

(d) For a function u = u(x, y) of the two sets of variables

x = (x1, x2, ….., xn) and y = (y1, y2, ….., yn),

we write

$D_x u = (u_{x_1}, u_{x_2}, \ldots, u_{x_n}),$

$D_y u = (u_{y_1}, u_{y_2}, \ldots, u_{y_n}).$
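The multiindex conventions above can be made concrete with a few helper functions. The sketch below is only an illustration: multi-indices are represented as tuples of non-negative integers, and the names `order`, `multi_factorial`, `power` and `multi_indices` are chosen for the example.

```python
import math
from itertools import product

def order(alpha):
    """|alpha| = alpha_1 + ... + alpha_n."""
    return sum(alpha)

def multi_factorial(alpha):
    """alpha! = alpha_1! * alpha_2! * ... * alpha_n!."""
    return math.prod(math.factorial(a) for a in alpha)

def power(x, alpha):
    """x^alpha = x_1^alpha_1 * ... * x_n^alpha_n."""
    return math.prod(xi ** ai for xi, ai in zip(x, alpha))

def multi_indices(n, k):
    """All multi-indices alpha with n entries and |alpha| = k."""
    return [alpha for alpha in product(range(k + 1), repeat=n) if sum(alpha) == k]

alpha = (2, 0, 1)
print(order(alpha), multi_factorial(alpha), power((2.0, 5.0, 3.0), alpha))
print(multi_indices(3, 2))
```

For n = 3 and k = 2 there are six multi-indices, one for each distinct second-order partial derivative of a smooth function of three variables.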

2.1.5 Vector – valued Functions

(i) Let U ⊂ Rn and m > 1. Let

u : U → Rm

be a vector – valued function and

u = (u1, u2,….., um).

We define

$D^\alpha u = (D^\alpha u_1, D^\alpha u_2, \ldots, D^\alpha u_m)$

for each multi-index α.

We note that the $D^\alpha u_i$ are as defined earlier under the heading “Multiindex Notation”.

(ii) For a non-negative integer k, we define

$D^k u = \{D^\alpha u : |\alpha| = k\}$

and

$|D^k u| = \left(\sum_{|\alpha| = k} |D^\alpha u|^2\right)^{1/2},$

as defined earlier for scalar-valued functions.

2.1.6 Measures and Integrals

(i) The integral of a function f : U → R over a subset U ⊆ $R^n$, with respect to Lebesgue measure, is denoted by

$\int_U f(x)\; dx$ or simply $\int_U f$.

Note: If no subscript occurs on the integral sign, the region of integration is understood to be $R^n$.

(ii) Let Σ be a smooth (n – 1)-dimensional surface in $R^n$. We write

$\int_\Sigma f\; ds$

for the integral of f over Σ with respect to (n – 1)-dimensional surface measure.

(iii) If C is a curve in $R^n$, we denote by

$\int_C f\; dl$

the integral of f over C with respect to arc length.

(iv) The convolution of the functions f and g, denoted by

f * g,

is given by

$(f * g)(x) = \int f(x - y)\, g(y)\; dy = \int f(y)\, g(x - y)\; dy = (g * f)(x),$

provided the integrals exist.

2.1.7 Function Spaces

(1) C(U) = { u : U → R | u is continuous },

C($\bar U$) = { u ∈ C(U) | u is uniformly continuous }.

(2) $C^k(U)$ = { u : U → R | u is k-times continuously differentiable },

$C^k(\bar U)$ = { u ∈ $C^k(U)$ | $D^\alpha u$ is uniformly continuous for all | α | ≤ k }.

Thus, if u ∈ $C^k(\bar U)$, then $D^\alpha u$ extends continuously to $\bar U$ for each multiindex α such that | α | ≤ k.

(3) $C^\infty(U)$ = { u : U → R | u is infinitely differentiable } $= \bigcap_{k=0}^{\infty} C^k(U),$

$C^\infty(\bar U) = \bigcap_{k=0}^{\infty} C^k(\bar U).$

(4) $L^p(U)$ = { u : U → R | u is Lebesgue measurable, $\|u\|_{L^p(U)} < \infty$ },

where

$\|u\|_{L^p(U)} = \left(\int_U |u|^p\; dx\right)^{1/p},\quad 1 \le p < \infty.$

2.1.8 Notation for Matrices

(1) A = (aij)

= a matrix A which is an m × n matrix with (i, j)th entry aij.

A = diag(d1, d2,…., dn)

= a diagonal matrix.


(2) $M^{m \times n}$ = space of real m × n matrices,

$S^{n \times n}$ = space of real symmetric n × n matrices.

(3) tr A = trace of A = a11 + a22 +…..+ ann

= sum of diagonal elements

(4) det A = determinant of the matrix A

(5) cof A = cofactor matrix of A

= Transpose of (Adj A)

= $(\mathrm{Adj}\, A)^T$,

$A^T$ = transpose of the matrix A.

(6) If A = (aij), B = (bij) are m × n matrices, then

$A : B = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij},$

| A | = norm of the matrix A

$= (A : A)^{1/2}$

$= \left(\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2\right)^{1/2}$

$= \left[(a_{11})^2 + (a_{12})^2 + \cdots + (a_{1n})^2 + (a_{21})^2 + (a_{22})^2 + \cdots + (a_{2n})^2 + \cdots + (a_{mn})^2\right]^{1/2}.$

(7) If A = (aij) ∈ $S^{n \times n}$ and x = (x1, x2, …., xn) ∈ $R^n$, then

$x \cdot A x = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j = \sum_{i,j=1}^{n} a_{ij} x_i x_j$

= Quadratic Form corresponding to (aij).

(8) Let A ∈ $S^{n \times n}$. If

$x \cdot A x \ge \theta\, |x|^2$ for all x ∈ $R^n$ and some real number θ, then we write A ≥ θ I.

(9) For A ∈ $M^{n \times n}$, y ∈ $R^n$, we sometimes write

$y A = A^T y.$

2.2 TRANSPORT EQUATION

The transport equation with constant coefficients is the PDE

$u_t + b \cdot Du = 0$ in $R^n \times [0, \infty)$, (1)

where

b = (b1, b2, …., bn) is a fixed vector in $R^n$,

and

u : $R^n \times [0, \infty)$ → R

is the unknown function, and

u = u(x, t).

Note. Here x = (x1, x2, ….., xn) ∈ $R^n$ is a typical point in space, and t ≥ 0 denotes a typical time.

We write $Du = D_x u = (u_{x_1}, u_{x_2}, \ldots, u_{x_n})$ (2)

for the gradient of the scalar function u with respect to the spatial variable x.

Initial – Value Problem

Let us consider the homogeneous linear initial – value problem

$u_t + b \cdot Du = 0$ in $R^n \times [0, \infty)$, (1)

u = g on $R^n$ × {t = 0}, (2)

where g : Rn → R is known.

The problem is to compute u = u(x ,t).

Solution. Let (x, t) be any given (hence fixed) point in Rn × [0, ∞).

The line through (x, t) with direction (b, 1) is represented parametrically by

$x(s) = x + s\,b,\quad t(s) = t + s,\quad s \in R,$ (3)

i.e., by (x + s b, t + s) for s ∈ R.

This line hits the plane

Γ : $R^n$ × {t = 0} (4)

at the point (x – t b, 0), when

s = − t. (5)

The function u is constant on this line: setting z(s) = u(x + s b, t + s), equation (1) gives ż(s) = b · Du + u_t = 0. Since, moreover,

u(x – t b, 0) = g(x – t b), (6)

by virtue of the given initial condition (2), we deduce that

u(x, t) = g(x – t b) (7)

for x ∈ Rn and t ≥ 0.

So, if the given initial-value problem has a sufficiently regular solution u = u(x, t), it must certainly be given by (7) above. Conversely, if g is $C^1$, then u = u(x, t) defined by (7) is indeed a solution of the given initial-value problem.

Verification: From (7), we find

$u_t = -\,b \cdot Dg(\xi)$, where ξ = x – t b,

$Du = Dg(\xi).$

Hence $u_t + b \cdot Du = [-\,b \cdot Dg(\xi)] + b \cdot [Dg(\xi)]$

= 0 (8)

and, for t = 0,

u(x, 0) = g(x) on $R^n$. (9)

This completes the verification.

Remark: If g is not $C^1$, then there is no $C^1$ solution of the given initial-value problem. But even in this case, formula (7) certainly provides a strong, and in fact the only reasonable, candidate for a solution.

We may thus formally declare

u(x, t) = g(x – t b), x ∈ $R^n$, t ≥ 0, (10)

to be a weak solution of the IVP, even should g not be $C^1$. This all makes sense even if g, and thus u, are discontinuous.
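The formula u(x, t) = g(x − t b) can be tested numerically. The sketch below works in n = 2 with an assumed velocity vector b = (1, −2) and an assumed smooth initial function g (both chosen only for illustration), and evaluates the residual u_t + b · Du by central differences.

```python
import math

b = (1.0, -2.0)            # assumed constant velocity vector (illustrative)

def g(x):
    """Assumed smooth initial data g : R^2 -> R (illustrative)."""
    return math.sin(x[0]) + x[1] ** 2

def u(x, t):
    """Candidate solution u(x, t) = g(x - t b)."""
    return g((x[0] - t * b[0], x[1] - t * b[1]))

def residual(x, t, h=1e-5):
    """Central-difference approximation of u_t + b . Du at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x1 = (u((x[0] + h, x[1]), t) - u((x[0] - h, x[1]), t)) / (2 * h)
    u_x2 = (u((x[0], x[1] + h), t) - u((x[0], x[1] - h), t)) / (2 * h)
    return u_t + b[0] * u_x1 + b[1] * u_x2

point, time = (0.3, -0.7), 0.5
print(residual(point, time), u(point, 0.0) - g(point))
```

Both printed numbers are zero up to discretisation and rounding error: the first checks the PDE, the second the initial condition.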

Non homogeneous problem

Problem: Consider the non – homogeneous initial – value problem

$u_t + b \cdot Du = f$ in $R^n \times [0, \infty)$, (1)

u = g on $R^n$ × {t = 0}, (2)

in which b = (b1, b2, ….., bn) ∈ $R^n$ is a fixed vector, and

u : $R^n \times [0, \infty)$ → R

is the unknown function, and

u = u(x, t),

x = (x1, x2, ……, xn) ∈ $R^n$ is a point in space,

t ≥ 0 denotes a typical time,

$Du = D_x u = (u_{x_1}, u_{x_2}, \ldots, u_{x_n})$

denotes the gradient of u with respect to the spatial variable x,

g : $R^n$ → R

is known, and f : $R^n \times [0, \infty)$ → R

is known. The problem is to compute u = u(x, t).

Solution: Let (x, t) be any given, hence fixed, point in $R^n \times [0, \infty)$. Define a function

z : R → R, z(s) = u(x + s b, t + s) (3)

for all s ∈ R. Then

$\dot z(s) = b \cdot Du(x + s b, t + s) + u_t(x + s b, t + s)$

= f(x + s b, t + s), (4)

using (1).

Now, using (2), (3) and (4), we find

u(x, t) – g(x – b t) = z(0) – u(x – b t, 0)

= z(0) – z(− t)

$= \int_{-t}^{0} \dot z(s)\; ds$

$= \int_{-t}^{0} f(x + s b, t + s)\; ds$

$= \int_{0}^{t} f(x + (s - t)\, b, s)\; ds.$ (5)

This gives

$u(x, t) = g(x - b t) + \int_{0}^{t} f(x + (s - t)\, b, s)\; ds$ (6)

for x ∈ Rn, t ≥ 0

as solution of the given non – homogeneous initial – value problem.
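As a sanity check on the solution formula, the sketch below evaluates it in one space dimension by Simpson’s rule and compares the result with a closed form. The data b = 2, g(x) = sin x and f(x, t) = x t are assumptions made for the example; for them the integral works out to x t²/2 − b t³/6.

```python
import math

def simpson(fun, a, b_end, n=200):
    """Composite Simpson rule on [a, b_end] with n (even) subintervals."""
    h = (b_end - a) / n
    total = fun(a) + fun(b_end)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * fun(a + k * h)
    return total * h / 3

b = 2.0                                   # assumed constant speed (illustrative)
g = math.sin                              # assumed initial data
f = lambda x, t: x * t                    # assumed source term

def u(x, t):
    """u(x, t) = g(x - b t) + integral over [0, t] of f(x + (s - t) b, s) ds."""
    return g(x - b * t) + simpson(lambda s: f(x + (s - t) * b, s), 0.0, t)

def u_closed_form(x, t):
    """The same solution with the integral evaluated by hand for f = x t."""
    return math.sin(x - b * t) + x * t ** 2 / 2 - b * t ** 3 / 6

def residual(x, t, h=1e-5):
    """Central-difference approximation of u_t + b u_x - f at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + b * u_x - f(x, t)

print(u(0.4, 1.5), u_closed_form(0.4, 1.5), residual(0.4, 1.5))
```

Because the integrand is quadratic in s for this choice of f, Simpson’s rule reproduces the closed form up to rounding error.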

2.3 LAPLACE’S EQUATION

Problem: Laplace’s equation is

∆u = 0, (1)

and Poisson’s equation is

∆u = − f . (2)

In equation (2), the minus sign is taken so that the notation is consistent with the notation for general second-order elliptic operators. In both equations (1) and (2),

x ∈ U ⊆ $R^n$, U is an open set,

and the unknown function is

u : $\bar U$ → R, $\bar U$ = closure of U,

u = u(x).

In equation (2),

f : U → R

is given. Further,

∆u = Laplacian of u

$= \sum_{i=1}^{n} u_{x_i x_i}.$

Definition: A function u ∈ $C^2$ is called a harmonic function if u satisfies Laplace’s equation

∆u = 0.

Physical Interpretation

Laplace’s equation comes up in a wide variety of physical contexts – such as when u denotes the chemical concentration / temperature / electrostatic potential.

Laplace’s equation arises as well in the study of analytic functions.

Fundamental Solution of Laplace’s Equation

We attempt to find a solution of the given Laplace equation

∆u = 0 (1)

by searching radial solutions of the form

u(x) = v(r) (2)

where

$r = |x| = (x_1^2 + x_2^2 + \cdots + x_n^2)^{1/2},$ (3)

and v is to be selected, if possible, so that

∆u = 0 (4)

holds.

First, we note that

$\dfrac{\partial r}{\partial x_i} = \dfrac{1}{2}\,(x_1^2 + x_2^2 + \cdots + x_n^2)^{-1/2}\,(2x_i) = \dfrac{x_i}{r},\quad (x \ne 0).$ (5)

Thus, we have

$u_{x_i} = v'(r)\,\dfrac{\partial r}{\partial x_i} = v'(r)\,\dfrac{x_i}{r},$ (6)

and

$u_{x_i x_i} = v''(r)\,\dfrac{x_i^2}{r^2} + v'(r)\left(\dfrac{1}{r} - \dfrac{x_i^2}{r^3}\right)$ (7)

for i = 1, 2, …, n. So,

$\Delta u = \sum_{i=1}^{n} u_{x_i x_i} = v''(r) + \dfrac{n-1}{r}\,v'(r).$ (8)

Hence

∆u = 0

iff

$v'' + \dfrac{n-1}{r}\,v' = 0.$ (9)

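If v′ ≠ 0, we deduce

$$\frac{v''}{v'} = \frac{1-n}{r}, \quad\text{or}\quad \frac{d}{dr}\bigl(\log v'\bigr) = \frac{1-n}{r},$$

or log v′(r) = (1 – n) log r + constant,

$$\text{or}\quad v'(r) = \frac{a}{r^{n-1}}, \tag{10}$$

for some constant a. Consequently, if r > 0, we obtain

$$v(r) = \begin{cases} b \log r + c, & n = 2, \\ \dfrac{b}{r^{n-2}} + c, & n \ge 3, \end{cases} \tag{11}$$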

where b and c are constants. Let

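$$\Phi(x) = \begin{cases} -\dfrac{1}{2\pi}\,\log|x|, & n = 2, \\ \dfrac{1}{n(n-2)\alpha(n)}\,\dfrac{1}{|x|^{n-2}}, & n \ge 3, \end{cases} \tag{12}$$

for x ∈ R^n, x ≠ 0, where α(n) denotes the volume of the unit ball in R^n.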

Then Φ(x) is a solution of the given Laplace equation (1) and is called the Fundamental Solution of Laplace’s Equation.

Note: This fundamental solution is radial.
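As a sanity check, the harmonicity of Φ away from the origin can be tested with finite differences. This is a minimal sketch, not part of the original notes: the function names `alpha`, `fundamental_solution` and `laplacian` are hypothetical, the sample points are arbitrary, and the formula used for α(n) is π^(n/2)/Γ(n/2 + 1).

```python
import math

def alpha(n):
    """Volume of the unit ball in R^n: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def fundamental_solution(x):
    """Phi(x) from (12) for a point x != 0 given as a list of n >= 2 coordinates."""
    n = len(x)
    r = math.sqrt(sum(c * c for c in x))
    if n == 2:
        return -math.log(r) / (2 * math.pi)
    return 1.0 / (n * (n - 2) * alpha(n) * r ** (n - 2))

def laplacian(func, x, h=1e-3):
    """Second-order central-difference approximation of the Laplacian at x."""
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (func(xp) - 2 * func(x) + func(xm)) / h ** 2
    return total

# Arbitrary sample points away from the origin in dimensions 2, 3 and 4.
for point in ([0.5, 0.4], [0.5, 0.4, 0.3], [0.5, 0.4, 0.3, 0.2]):
    print(len(point), laplacian(fundamental_solution, point))
```

Each printed value should be close to zero, up to the O(h^2) error of the difference scheme.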

2.4 FUNDAMENTAL SOLUTION OF POISSON’S EQUATION

Let Φ(x) be the fundamental solution of Laplace’s equation

∆u = 0, (1)

where

$$\Phi(x) = \begin{cases} -\dfrac{1}{2\pi}\,\log|x|, & n = 2, \\ \dfrac{1}{n(n-2)\alpha(n)}\,\dfrac{1}{|x|^{n-2}}, & n \ge 3, \end{cases} \tag{2}$$

and x ∈ R^n, x ≠ 0.

So, the mapping

x → Φ(x) , x ≠ 0 , (3)

is harmonic.

If we shift the origin to a new point y, the PDE (1) is unchanged, so the

mapping

x → Φ(x – y) (4)

is also harmonic as a function of x, x ≠ y.

Now, we consider Poisson’s equation

∆u = − f, (5)

where

f : Rn → R. (6)

We note that the mapping

x → Φ(x – y) f(y), (7)

for x ≠ y, is harmonic for each point y ∈ R^n, and so is the sum of finitely many such expressions constructed for different points y. Consider the convolution

$$u(x) = \int_{\mathbb{R}^n} \Phi(x - y)\,f(y)\,dy. \tag{8}$$

From equations (2) and (8), we write

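$$u(x) = \begin{cases} -\dfrac{1}{2\pi}\displaystyle\int_{\mathbb{R}^2} \log(|x - y|)\,f(y)\,dy, & n = 2, \\ \dfrac{1}{n(n-2)\alpha(n)}\displaystyle\int_{\mathbb{R}^n} \dfrac{f(y)}{|x - y|^{n-2}}\,dy, & n \ge 3. \end{cases} \tag{9}$$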

For simplicity, we assume that the function f, given in Poisson’s equation (5), is twice continuously differentiable with compact support. Now, we shall show that u(x) defined by (9) satisfies

(i) u ∈ C^2(R^n),

(ii) ∆u = −f in R^n.

Consequently, the function in (9) provides us with a formula for a solution of Poisson’s equation (5).

Proof of (i):

We have

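$$u(x) = \int_{\mathbb{R}^n} \Phi(x - y)\,f(y)\,dy = \int_{\mathbb{R}^n} \Phi(y)\,f(x - y)\,dy. \tag{10}$$

Hence

$$\frac{u(x + he_i) - u(x)}{h} = \int_{\mathbb{R}^n} \Phi(y)\left[\frac{f(x + he_i - y) - f(x - y)}{h}\right] dy, \tag{11}$$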

where h ≠ 0 is a real number and ei ∈ Rn,

ei = (0, 0, …,0, 1, 0, ….., 0)

with 1 in the ith slot.

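But

$$\frac{f(x + he_i - y) - f(x - y)}{h} \to \frac{\partial f}{\partial x_i}(x - y) \tag{12}$$

uniformly on R^n as h → 0. Thus, on taking h → 0 in (11) and making use of the result in (12), we write

$$\frac{\partial u}{\partial x_i}(x) = \int_{\mathbb{R}^n} \Phi(y)\,\frac{\partial f}{\partial x_i}(x - y)\,dy, \tag{13}$$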

for i = 1, 2, 3,…, n.

Similarly

$$\frac{\partial^2 u}{\partial x_i\,\partial x_j}(x) = \int_{\mathbb{R}^n} \Phi(y)\,\frac{\partial^2 f}{\partial x_i\,\partial x_j}(x - y)\,dy, \tag{14}$$

for i, j = 1, 2, …, n.

As the expression on the right hand side of (14) is continuous in the variable x, we see that

u ∈ C2(Rn) (15)

This proves (i).

Proof of (ii) :

The function Φ, defined in (2), blows up at x = 0, so for the subsequent calculations we need to isolate this singularity inside a small ball. So, fix ε > 0. Let B(0, ε) denote the open ball centered at x = 0 with radius ε. Then, from equation (10), we obtain

$$\Delta u(x) = \int_{B(0,\varepsilon)} \Phi(y)\,\Delta_x f(x - y)\,dy + \int_{\mathbb{R}^n - B(0,\varepsilon)} \Phi(y)\,\Delta_x f(x - y)\,dy = I_\varepsilon + J_\varepsilon, \text{ say}, \tag{16}$$

where

$$I_\varepsilon = \int_{B(0,\varepsilon)} \Phi(y)\,\Delta_x f(x - y)\,dy, \tag{17}$$

$$J_\varepsilon = \int_{\mathbb{R}^n - B(0,\varepsilon)} \Phi(y)\,\Delta_x f(x - y)\,dy. \tag{18}$$

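Now

$$|I_\varepsilon| \le \int_{B(0,\varepsilon)} |\Phi(y)|\,|\Delta_x f(x - y)|\,dy \le C\,\|D^2 f\|_{L^\infty(\mathbb{R}^n)} \int_{B(0,\varepsilon)} |\Phi(y)|\,dy \le \begin{cases} C\varepsilon^2\,|\log \varepsilon|, & n = 2, \\ C\varepsilon^2, & n \ge 3. \end{cases} \tag{19}$$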

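Also, by integration by parts, we get

$$\begin{aligned} J_\varepsilon &= \int_{\mathbb{R}^n - B(0,\varepsilon)} \Phi(y)\,\Delta_y f(x - y)\,dy \\ &= -\int_{\mathbb{R}^n - B(0,\varepsilon)} D\Phi(y) \cdot D_y f(x - y)\,dy + \int_{\partial B(0,\varepsilon)} \Phi(y)\,\frac{\partial f}{\partial \nu}(x - y)\,dS(y) \\ &= K_\varepsilon + L_\varepsilon, \end{aligned} \tag{20}$$

using the divergence theorem, with ν denoting the inward pointing unit normal along the boundary ∂B(0, ε) of the ball B(0, ε).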

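Further

$$|L_\varepsilon| \le \|Df\|_{L^\infty(\mathbb{R}^n)} \int_{\partial B(0,\varepsilon)} |\Phi(y)|\,dS(y) \le \begin{cases} C\varepsilon\,|\log \varepsilon|, & n = 2, \\ C\varepsilon, & n \ge 3. \end{cases} \tag{21}$$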

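We continue by integrating by parts once again in the term K_ε, to obtain

$$K_\varepsilon = \int_{\mathbb{R}^n - B(0,\varepsilon)} \Delta\Phi(y)\,f(x - y)\,dy - \int_{\partial B(0,\varepsilon)} \frac{\partial \Phi}{\partial \nu}(y)\,f(x - y)\,dS(y) = -\int_{\partial B(0,\varepsilon)} \frac{\partial \Phi}{\partial \nu}(y)\,f(x - y)\,dS(y), \tag{22}$$

since the function Φ is harmonic away from the origin.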

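Now

$$D\Phi(y) = -\frac{1}{n\alpha(n)}\,\frac{y}{|y|^n} \quad (y \ne 0), \qquad \nu = -\frac{y}{|y|} = -\frac{y}{\varepsilon} \tag{23}$$

on the boundary ∂B(0, ε). Consequently

$$\frac{\partial \Phi}{\partial \nu}(y) = \nu \cdot D\Phi(y) = \frac{1}{n\alpha(n)\,\varepsilon^{n-1}} \tag{24}$$

on the boundary ∂B(0, ε).

Since n α(n) ε^{n-1} is the surface area of the sphere ∂B(0, ε), we have

$$K_\varepsilon = -\frac{1}{n\alpha(n)\,\varepsilon^{n-1}} \int_{\partial B(0,\varepsilon)} f(x - y)\,dS(y) = -\fint_{\partial B(x,\varepsilon)} f(y)\,dS(y) \to -f(x), \text{ as } \varepsilon \to 0. \tag{25}$$

Here, a slash through an integral denotes an average value.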

Combining now equations (16) – (25), and letting ∈ → 0, we find

∆ u(x) = − f(x), (26)

as asserted earlier.

Thus, u(x), given by (9), is a solution of (26). This completes the solution of Poisson’s equation.

Remark: (i) We sometimes write

∆ Φ = − δ0

in Rn, δ0 denoting the Dirac measure on Rn giving unit mass to the point x = 0.

Adopting this notation, we formally compute

$$\Delta u(x) = \int_{\mathbb{R}^n} [\Delta_x \Phi(x - y)]\,f(y)\,dy = -\int_{\mathbb{R}^n} \delta_x\,f(y)\,dy = -f(x),$$

x ∈ R^n, in accordance with the above theorem.

Remark (ii). The above theorem (solving Poisson’s equation) is in fact valid under far less stringent smoothness requirements for f.

2.5 MEAN – VALUE FORMULAS FOR LAPLACE’S EQUATION

Let U ⊂ Rn be an open set. Let

u : U → R

be a harmonic function. We define

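(i) the average of f over the ball B(x, r) as

$$\fint_{B(x,r)} f\,dy = \frac{1}{\alpha(n)\,r^n} \int_{B(x,r)} f\,dy,$$

(ii) the average of f over the sphere ∂B(x, r) as

$$\fint_{\partial B(x,r)} f\,dS = \frac{1}{n\alpha(n)\,r^{n-1}} \int_{\partial B(x,r)} f\,dS,$$

where

$$\alpha(n) = \text{volume of the unit ball } B(0,1) \text{ in } \mathbb{R}^n = \frac{\pi^{n/2}}{\Gamma\!\left(\frac{n}{2} + 1\right)},$$

n α(n) = surface area of the unit sphere ∂B(0, 1) in R^n.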

Note: For x ∈ U ⊂ R^n and r > 0, we shall now derive the important mean-value formulas, which declare that

“u(x) equals both the average of u over the sphere ∂B(x, r) and the average of u over the entire ball B(x, r), provided B(x, r) ⊂ U”.

Theorem (Mean – value formulas for Laplace’s equation)

Statement: If u ∈ C^2(U) is harmonic, then

$$u(x) = \fint_{\partial B(x,r)} u\,dS = \fint_{B(x,r)} u\,dy,$$

for each ball B(x, r) ⊂ U.

Proof of Part – I

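Set

$$\varphi(r) = \fint_{\partial B(x,r)} u(y)\,dS(y) = \fint_{\partial B(0,1)} u(x + rz)\,dS(z). \tag{1}$$

Then

$$\varphi'(r) = \fint_{\partial B(0,1)} Du(x + rz) \cdot z\,dS(z), \tag{2}$$

and consequently, using Green’s formula, we compute

$$\varphi'(r) = \fint_{\partial B(x,r)} Du(y) \cdot \frac{y - x}{r}\,dS(y) = \fint_{\partial B(x,r)} \frac{\partial u}{\partial \nu}\,dS(y) = \frac{r}{n} \fint_{B(x,r)} \Delta u(y)\,dy = 0. \tag{3}$$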

Hence φ is constant, and so

$$\varphi(r) = \lim_{t \to 0} \varphi(t) = \lim_{t \to 0} \fint_{\partial B(x,t)} u(y)\,dS(y) = u(x). \tag{4}$$

Equations (1) and (4) prove Part I, i.e.,

$$u(x) = \fint_{\partial B(x,r)} u(y)\,dS(y) = \text{average of } u \text{ over the sphere } \partial B(x,r). \tag{5}$$

Proof of Part – II : We observe that by employing polar coordinates, one gets

$$\int_{B(x,r)} u\,dy = \int_0^r \left( \int_{\partial B(x,\xi)} u\,dS \right) d\xi = u(x) \int_0^r n\alpha(n)\,\xi^{n-1}\,d\xi = \alpha(n)\,r^n\,u(x). \tag{6}$$

Hence

$$u(x) = \frac{1}{\alpha(n)\,r^n} \int_{B(x,r)} u\,dy = \fint_{B(x,r)} u\,dy = \text{average of } u \text{ over the entire ball } B(x,r). \tag{7}$$

This completes the proof of both the mean-value formulas for Laplace’s equation.
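Both mean-value formulas can be illustrated numerically in the plane. The sketch below rests on assumptions that are not in the text: the harmonic function u(x, y) = e^x cos y, the center (0.3, −0.2) and the radius 0.7 are arbitrary choices, and `sphere_average` and `ball_average` are hypothetical helper names. The ball average is computed in polar coordinates, mirroring step (6) of the proof.

```python
import math

def u(x, y):
    """A harmonic function on R^2: the real part of exp(x + i y)."""
    return math.exp(x) * math.cos(y)

def sphere_average(func, cx, cy, r, m=256):
    """Average of func over the circle of radius r centered at (cx, cy)."""
    total = 0.0
    for k in range(m):
        theta = 2 * math.pi * k / m
        total += func(cx + r * math.cos(theta), cy + r * math.sin(theta))
    return total / m

def ball_average(func, cx, cy, r, shells=200):
    """Average over the disk, summing circle averages in polar coordinates (midpoint rule)."""
    total = 0.0
    d_rho = r / shells
    for j in range(shells):
        rho = (j + 0.5) * d_rho
        # The circle of radius rho has length 2*pi*rho; the disk has area pi*r^2.
        total += sphere_average(func, cx, cy, rho) * 2 * rho * d_rho
    return total / r ** 2

cx, cy, r = 0.3, -0.2, 0.7
print(u(cx, cy), sphere_average(u, cx, cy, r), ball_average(u, cx, cy, r))
```

The three printed numbers should agree to many digits, which is what the theorem predicts for a harmonic u.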

Theorem (Converse of mean – value property for Laplace’s equation):

Statement: If u ∈ C^2(U) satisfies the mean-value formula

$$u(x) = \fint_{\partial B(x,r)} u\,dS$$

for each ball B(x, r) ⊂ U, then

u : U → R

is harmonic.

Proof: If possible, assume that

∆u ≢ 0 in U ⊆ R^n. (1)

Then, there exists some open ball

B(x, r) ⊂ U (2)

such that, say,

∆u > 0 within B(x, r). (3)

Set

$$\varphi(r) = \fint_{\partial B(x,r)} u(y)\,dS(y). \tag{4}$$

Then, as proved earlier,

$$\varphi'(r) = \frac{r}{n} \fint_{B(x,r)} \Delta u(y)\,dy. \tag{5}$$

Using (3) and (5), we get

φ′(r) > 0 . (6)

From the hypothesis and equation (4), it follows that

u(x) = φ(r) = constant (7)

This contradicts (6). Hence, the result follows. This completes the proof.

2.6. ENERGY METHODS

Definition (Energy functional):

It is defined as

$$I[w] = \int_U \left( \frac{1}{2}\,|Dw|^2 - wf \right) dx, \tag{1}$$

where w belongs to the admissible set

A = { w ∈ C^2(Ū) | w = g on ∂U }, (2)

and

∆w = − f in U. (3)

Theorem (Dirichlet’s principle):

Statement: Assume u ∈ C^2(Ū) solves the boundary-value problem

$$\begin{cases} \Delta u = -f & \text{in } U, \\ u = g & \text{on } \partial U, \end{cases} \tag{*}$$

where U is an open and bounded subset of R^n and its boundary ∂U is C^1. Then

$$I[u] = \min_{w \in A} I[w], \tag{**}$$

where I[w] is the energy functional and w belongs to the admissible set

A = { w ∈ C^2(Ū) | w = g on ∂U }. (***)

Conversely, if u ∈ A satisfies (**), then u solves the boundary-value problem (*).

Proof (Part – I) : Choose w ∈ A. Then

w = g on ∂U. (1)

Let u ∈ C^2(Ū) solve the BVP (*). Then

∆u = −f in U, (2)

u = g on ∂U. (3)

Now

$$\int_U (\Delta u + f)(u - w)\,dx = 0 \tag{4}$$

by virtue of (2). This gives

$$\int_U \left[ (\Delta u)(u - w) + f(u - w) \right] dx = 0.$$

An integration by parts (using Green’s formula) yields

$$\int_U Du \cdot D(u - w)\,dx = -\int_U (\Delta u)(u - w)\,dx + \int_{\partial U} \frac{\partial u}{\partial \nu}\,(u - w)\,dS = \int_U f\,(u - w)\,dx + 0,$$

using (1), (2) and (3); there is no boundary term, as u = w = g on ∂U. This implies

$$\int_U \left[ Du \cdot D(u - w) - f(u - w) \right] dx = 0. \tag{5}$$

Equation (5) gives

$$\int_U \left[ |Du|^2 - uf \right] dx = \int_U \left[ Du \cdot Dw - wf \right] dx. \tag{6}$$

We know the estimates

$$|Du \cdot Dw| \le |Du|\,|Dw| \le \frac{1}{2}\,|Du|^2 + \frac{1}{2}\,|Dw|^2 \tag{7}$$

by virtue of the Cauchy-Schwarz and Cauchy inequalities. From (6) and (7), we write

$$\int_U \left\{ |Du|^2 - uf \right\} dx \le \frac{1}{2} \int_U |Du|^2\,dx + \int_U \left[ \frac{1}{2}\,|Dw|^2 - wf \right] dx. \tag{8}$$

By definition, the energy functional is given by

$$I[w] = \int_U \left( \frac{1}{2}\,|Dw|^2 - wf \right) dx. \tag{9}$$

Hence, on subtracting ½ ∫_U |Du|^2 dx from both sides, relation (8) gives

I[u] ≤ I[w], w ∈ A. (10)

Since u ∈ A, it follows that

$$I[u] = \min_{w \in A} I[w]. \tag{11}$$

This proves part – I .

Proof of Part – II : Conversely, assume that the conclusion (**) of the statement of the theorem holds.

Let v ∈ C_c^∞(U) be any but fixed function. Let

λ(τ) = I[u + τ v], τ ∈ R (12)

where the energy functional I is defined above in equation (9). Since u + τ v ∈ A for each τ, the scalar function λ(τ) has a minimum at zero, by virtue of assumption (**). So

λ′(0) = 0, (13)

provided this derivative of λ(τ) at τ = 0 exists. But

$$\lambda(\tau) = I[u + \tau v] = \int_U \left[ \frac{1}{2}\,|Du + \tau Dv|^2 - (u + \tau v) f \right] dx = \int_U \left[ \frac{1}{2}\,|Du|^2 + \tau\,Du \cdot Dv + \frac{\tau^2}{2}\,|Dv|^2 - (u + \tau v) f \right] dx. \tag{14}$$

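Equations (13) and (14) give at once

$$\int_U (Du \cdot Dv - vf)\,dx = 0.$$

Integrating by parts (there is no boundary term, since v has compact support in U), this gives

$$\int_U (-\Delta u - f)\,v\,dx = 0. \tag{15}$$

This identity is valid for each function v ∈ C_c^∞(U). So we must have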

− ∆ u – f = 0 in U

or

∆ u = − f in U. (16)

This shows that u solves the given boundary – value problem. Hence, the proof of the converse of Dirichlet’s principle is complete.

This completes the proof of Dirichlet’s principle.
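A one-dimensional illustration of Dirichlet’s principle may help. The sketch below is built on assumptions that are not in the text: U = (0, 1), f ≡ 1 and g = 0, so that u(x) = x(1 − x)/2 solves −u″ = 1 with zero boundary values, and the competitors are w = u + τ sin(πx), which stay in the admissible set. The helper names `simpson`, `energy` and `perturbed_energy` are hypothetical.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def energy(w, dw, f):
    """I[w] = integral over (0, 1) of (1/2)|w'|^2 - w f, with w' supplied analytically."""
    return simpson(lambda x: 0.5 * dw(x) ** 2 - w(x) * f(x), 0.0, 1.0)

# Assumed model problem: -u'' = 1 on (0, 1), u(0) = u(1) = 0.
f = lambda x: 1.0
u = lambda x: 0.5 * x * (1.0 - x)
du = lambda x: 0.5 - x

def perturbed_energy(tau):
    """Energy of the admissible competitor w = u + tau * sin(pi x)."""
    w = lambda x: u(x) + tau * math.sin(math.pi * x)
    dw = lambda x: du(x) + tau * math.pi * math.cos(math.pi * x)
    return energy(w, dw, f)

for tau in (-0.3, -0.1, 0.0, 0.1, 0.3):
    print(tau, perturbed_energy(tau))
```

For this family the energy is I[u] + τ²π²/4, so the smallest printed value occurs at τ = 0, in line with (**).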

Note (i): In other words, Dirichlet’s principle states that if u ∈ A, then the PDE

∆u = − f in U

u = g on ∂U

is equivalent to the statement that the solution function u = u(x) minimizes the associated energy functional I[ · ].

Note (ii): Dirichlet’s principle is an instance of the calculus of variations applied to Laplace’s equation.

Theorem (Uniqueness theorem)

Statement: There exists at most one solution u ∈ C^2(Ū) of the boundary-value problem,

∆ u = − f in U

u = g on ∂ U

where U is bounded, open, and ∂ U is C1.

Proof: If possible, assume that, in addition to u, there is another solution, say ũ, of the given boundary-value problem. Set

w = u − ũ in U. (1)

Since u and ũ are solutions of the given boundary-value problem,

∆u = − f in U (2)

u = g on ∂U (3)

∆ũ = − f in U (4)

ũ = g on ∂U (5)

Now, in U,

∆w = ∆u − ∆ũ

= (− f) – (−f)

= 0 in U, (6)

and

w = 0 on ∂U. (7)

From Green’s formula, we write

$$\int_U |Dw|^2\,dx = \int_U Dw \cdot Dw\,dx = -\int_U w\,(\Delta w)\,dx + \int_{\partial U} \frac{\partial w}{\partial \nu}\,w\,dS = 0 + 0 = 0, \tag{8}$$

using (6) and (7). Equation (8) shows that

Dw ≡ 0 in U. (9)

Hence w is constant on each connected component of U, and since w = 0 on the boundary ∂U, it follows that

w = 0 in U

or

u = ũ in U. (10)

This proves the uniqueness theorem.

2.7 PROPERTIES OF HARMONIC FUNCTIONS

We now present a sequence of interesting deductions about harmonic functions, all based upon the mean – value formulas. Assume for the following that U ⊂ Rn is open and bounded.

Theorem: (Strong maximum principle).

Statement: Suppose u ∈ C^2(U) ∩ C(Ū) is harmonic within U.

(i) Then

$$\max_{\bar U} u = \max_{\partial U} u.$$

(ii) Furthermore, if U is connected and there exists a point x0 ∈ U such that

$$u(x_0) = \max_{\bar U} u,$$

then u is constant within U.

Proof: Suppose there exists a point x0 ∈ U with

$$u(x_0) = M = \max_{\bar U} u. \tag{1}$$

Then for

0 < r < dist(x0, ∂U),

the mean-value property asserts

$$M = u(x_0) = \fint_{B(x_0,r)} u\,dy \le M. \tag{2}$$

As equality holds only

if u ≡ M within B(x0, r), (3)

we see

u(y) = M (4)

for all y ∈ B(x0, r). Hence the set

{x ∈ U | u(x) = M }

is both open and relatively closed in U, and thus equals U if U is connected. This proves assertion (ii), from which (i) follows.

Note : Assertion (i) is the maximum principle for Laplace’s equation and (ii) is the strong maximum principle. Replacing u by –u, we recover also similar assertions with “min” replacing “max”.

Remark (i): The strong maximum principle asserts in particular that if U is connected and

u ∈ C^2(U) ∩ C(Ū)

satisfies

$$\begin{cases} \Delta u = 0 & \text{in } U, \\ u = g & \text{on } \partial U, \end{cases}$$

where g ≥ 0, then u is positive everywhere in U if g is positive somewhere on ∂U.

Remark (ii) : An important application of the maximum principle is establishing the uniqueness of solutions to certain boundary – value problems for Poisson’s equation.

Theorem: (Uniqueness).

Statement: Let g ∈ C(∂U), f ∈ C(U). Then there exists at most one solution u ∈ C^2(U) ∩ C(Ū) of the boundary-value problem

$$\begin{cases} -\Delta u = f & \text{in } U, \\ u = g & \text{on } \partial U. \end{cases} \tag{1}$$

Proof: If u and ũ both satisfy (1), apply the theorem above to the harmonic functions

w = ± (u − ũ).

Local Estimates for Harmonic Functions

Next we employ the mean – value formulas to derive careful estimates on the various partial derivatives of a harmonic function. The precise structure of these estimates will be needed below, when we prove analyticity.

Theorem (Estimates on derivatives)

Statement: Assume u is harmonic in U. Then

$$|D^\alpha u(x_0)| \le \frac{C_k}{r^{n+k}}\,\|u\|_{L^1(B(x_0,r))} \tag{1}$$

for each ball B(x0, r) ⊂ U and each multi-index α of order |α| = k.

Here

$$C_0 = \frac{1}{\alpha(n)}, \qquad C_k = \frac{(2^{n+1} n k)^k}{\alpha(n)} \quad (k = 1, \dots). \tag{2}$$

Proof: 1. We establish (1) and (2) by induction on k, the case k = 0 being immediate from the mean-value formula. For k = 1, we note upon differentiating Laplace’s equation that u_{x_i} (i = 1, …, n) is harmonic. Consequently

$$|u_{x_i}(x_0)| = \left| \fint_{B(x_0,r/2)} u_{x_i}\,dx \right| = \left| \frac{2^n}{\alpha(n)\,r^n} \int_{\partial B(x_0,r/2)} u\,\nu_i\,dS \right| \le \frac{2n}{r}\,\|u\|_{L^\infty(\partial B(x_0,r/2))}. \tag{3}$$

Now if x ∈ ∂B(x0, r/2), then B(x, r/2) ⊂ B(x0, r) ⊂ U, and so

$$|u(x)| \le \frac{1}{\alpha(n)} \left( \frac{2}{r} \right)^n \|u\|_{L^1(B(x_0,r))}.$$

Combining the inequalities above, we deduce

$$|D^\alpha u(x_0)| \le \frac{2^{n+1} n}{\alpha(n)}\,\frac{1}{r^{n+1}}\,\|u\|_{L^1(B(x_0,r))} \tag{4}$$

if |α| = 1. This verifies (1) and (2) for k = 1.

2. Assume now k ≥ 2 and that (1) and (2) are valid for all balls in U and each multi-index of order less than or equal to k – 1. Fix B(x0, r) ⊂ U and let α be a multi-index with |α| = k. Then D^α u = (D^β u)_{x_i} for some i ∈ {1, …, n}, |β| = k – 1. By calculations similar to those in (3), we establish that (exercise)

$$|D^\alpha u(x_0)| \le \frac{nk}{r}\,\|D^\beta u\|_{L^\infty(\partial B(x_0,\,r/k))}. \tag{5}$$

If x ∈ ∂B(x0, r/k), then

$$B\!\left(x, \tfrac{k-1}{k}\,r\right) \subset B(x_0, r) \subset U.$$

Thus (1), (2) for k – 1 imply

$$|D^\beta u(x)| \le \frac{(2^{n+1} n (k-1))^{k-1}}{\alpha(n) \left( \frac{k-1}{k}\,r \right)^{n+k-1}}\,\|u\|_{L^1(B(x_0,r))}. \tag{6}$$

Combining the two previous estimates yields the bound

$$|D^\alpha u(x_0)| \le \frac{(2^{n+1} n k)^k}{\alpha(n)\,r^{n+k}}\,\|u\|_{L^1(B(x_0,r))}. \tag{7}$$

This confirms (1), (2) for |α| = k.

Liouville’s Theorem.

Next we see that there are no nontrivial bounded harmonic functions on all of Rn.

Theorem (Liouville’s Theorem)


Statement: Suppose u : R^n → R is harmonic and bounded. Then u is constant.

Proof: Fix x0 ∈ R^n, r > 0. Then

$$|Du(x_0)| \le \frac{C_1}{r^{n+1}}\,\|u\|_{L^1(B(x_0,r))} \le \frac{C_1\,\alpha(n)}{r}\,\|u\|_{L^\infty(\mathbb{R}^n)} \to 0,$$

as r → ∞. Thus

Du ≡ 0,

and so u is constant. This proves Liouville’s Theorem.

2.8 GREEN’S FUNCTION

Assume now U ⊂ Rn is open, bounded, and ∂U is C1. We propose next to obtain a general representation formula for the solution of Poisson’s equation

−∆u = f in U, (*)

subject to the prescribed boundary condition

u = g on ∂U. (**)

Derivation of Green’s function.

Suppose first of all u ∈ C^2(Ū) is an arbitrary function. Fix x ∈ U, choose ε > 0 so small that B(x, ε) ⊂ U, and apply Green’s formula on the region V_ε = U – B(x, ε) to u(y) and Φ(y – x). We thereby compute

$$\int_{V_\varepsilon} \left[ u(y)\,\Delta\Phi(y - x) - \Phi(y - x)\,\Delta u(y) \right] dy = \int_{\partial V_\varepsilon} \left[ u(y)\,\frac{\partial \Phi}{\partial \nu}(y - x) - \Phi(y - x)\,\frac{\partial u}{\partial \nu}(y) \right] dS(y), \tag{1}$$

ν denoting the outer unit normal vector on ∂V_ε. Recall next

∆Φ(x – y) = 0 for x ≠ y.

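We observe also that

$$\left| \int_{\partial B(x,\varepsilon)} \Phi(y - x)\,\frac{\partial u}{\partial \nu}(y)\,dS(y) \right| \le C\,\varepsilon^{n-1} \max_{\partial B(0,\varepsilon)} |\Phi| = o(1)$$

as ε → 0. Furthermore

$$\int_{\partial B(x,\varepsilon)} u(y)\,\frac{\partial \Phi}{\partial \nu}(y - x)\,dS(y) = \fint_{\partial B(x,\varepsilon)} u(y)\,dS(y) \to u(x)$$

as ε → 0. Hence our sending ε → 0 in (1) yields the formula:

$$u(x) = \int_{\partial U} \left[ \Phi(y - x)\,\frac{\partial u}{\partial \nu}(y) - u(y)\,\frac{\partial \Phi}{\partial \nu}(y - x) \right] dS(y) - \int_U \Phi(y - x)\,\Delta u(y)\,dy. \tag{2}$$

This identity is valid for any point x ∈ U and any function u ∈ C^2(Ū).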

Now formula (2) would permit us to solve for u(x) if we knew the values of ∆u within U and the values of u, ∂u/∂ν along ∂U. However, for our application to Poisson’s equation with prescribed boundary values for u, the normal derivative ∂u/∂ν along ∂U is unknown to us. We must therefore somehow modify (2) to remove this term.

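The idea is now to introduce for fixed x a corrector function

φ^x = φ^x(y),

solving the boundary-value problem:

$$\begin{cases} \Delta \phi^x = 0 & \text{in } U, \\ \phi^x = \Phi(y - x) & \text{on } \partial U. \end{cases} \tag{3}$$

Let us apply Green’s formula once more, now to compute

$$-\int_U \phi^x(y)\,\Delta u(y)\,dy = \int_{\partial U} \left[ u(y)\,\frac{\partial \phi^x}{\partial \nu}(y) - \phi^x(y)\,\frac{\partial u}{\partial \nu}(y) \right] dS(y) = \int_{\partial U} \left[ u(y)\,\frac{\partial \phi^x}{\partial \nu}(y) - \Phi(y - x)\,\frac{\partial u}{\partial \nu}(y) \right] dS(y). \tag{4}$$

We next introduce the following definition.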

Definition: Green’s function for the region U is

G(x, y) = Φ(y – x) − φ^x(y)

for x, y ∈ U, x ≠ y. Adopting this terminology and adding (2) to (4), we find

$$u(x) = -\int_{\partial U} u(y)\,\frac{\partial G}{\partial \nu}(x, y)\,dS(y) - \int_U G(x, y)\,\Delta u(y)\,dy \quad (x \in U), \tag{5}$$

where

$$\frac{\partial G}{\partial \nu}(x, y) = D_y G(x, y) \cdot \nu(y) \tag{6}$$

is the outer normal derivative of G with respect to the variable y. Observe that the term ∂u/∂ν does not appear in equation (5). We introduced the corrector φ^x precisely to achieve this.

Suppose now u ∈ C^2(Ū) solves the boundary-value problem

$$\begin{cases} -\Delta u = f & \text{in } U, \\ u = g & \text{on } \partial U, \end{cases} \tag{7}$$

for given continuous functions f, g. Plugging into (5), we obtain the following theorem.

Theorem: (Representation formula using Green’s function).

Statement: If u ∈ C^2(Ū) solves problem (7), then

$$u(x) = -\int_{\partial U} g(y)\,\frac{\partial G}{\partial \nu}(x, y)\,dS(y) + \int_U f(y)\,G(x, y)\,dy \quad (x \in U). \tag{8}$$

Here we have a formula for the solution of the boundary-value problem (7), provided we can construct Green’s function G for the given domain U. This is in general a difficult matter, and can be done only when U has simple geometry. Subsequent subsections identify some special cases for which an explicit calculation of G is possible.

Remark: Fix x ∈ U. Then regarding G as a function of y, we may symbolically write

$$\begin{cases} -\Delta G = \delta_x & \text{in } U, \\ G = 0 & \text{on } \partial U, \end{cases}$$

δ_x denoting the Dirac measure giving unit mass to the point x.

Before moving on to specific examples, let us record the general assertion that G is symmetric in the variables x and y .

Theorem: (Symmetry of Green’s function)

Statement : For all x, y ∈ U, x ≠ y, we have

G(y, x) = G(x, y). (9)

Proof: Fix x, y ∈ U, x ≠ y. Write

v(z) = G(x, z),

w(z) = G(y, z), (10)

for z ∈ U . Then

∆v(z) = 0 (z ≠ x), (11)

∆w(z) = 0 (z ≠ y) (12)

and

w = v = 0 (13)

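on ∂U. Thus our applying Green’s identity on

V = U – [B(x, ε) ∪ B(y, ε)] (14)

for sufficiently small ε > 0 yields

$$\int_{\partial B(x,\varepsilon)} \left( \frac{\partial v}{\partial \nu}\,w - \frac{\partial w}{\partial \nu}\,v \right) dS(z) = \int_{\partial B(y,\varepsilon)} \left( \frac{\partial w}{\partial \nu}\,v - \frac{\partial v}{\partial \nu}\,w \right) dS(z), \tag{15}$$

ν denoting the inward pointing unit vector field on ∂B(x, ε) ∪ ∂B(y, ε). Now w is smooth near x. So

$$\left| \int_{\partial B(x,\varepsilon)} \frac{\partial w}{\partial \nu}\,v\,dS \right| \le C\,\varepsilon^{n-1} \sup_{\partial B(x,\varepsilon)} |v| = o(1) \tag{16}$$

as ε → 0.

On the other hand, v(z) = Φ(z – x) − φ^x(z), where φ^x is smooth in U. Thus

$$\lim_{\varepsilon \to 0} \int_{\partial B(x,\varepsilon)} \frac{\partial v}{\partial \nu}\,w\,dS = \lim_{\varepsilon \to 0} \int_{\partial B(x,\varepsilon)} \frac{\partial \Phi}{\partial \nu}(x - z)\,w(z)\,dS = w(x).$$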

Thus the left-hand side of (15) converges to w(x) as ε → 0. Likewise the right-hand side converges to v(y). Consequently

G(y, x) = w(x) = v(y) = G(x, y).

This completes the proof.

The Books Recommended for Chapter II

1. L.C. Evans, Partial Differential Equations, Graduate Studies in Mathematics, Volume 19, AMS, 1998.

Chapter-3 Heat and Wave Equations

3.1 INTRODUCTION

The heat equation is

ut −∆u = 0 (∗)

and the non-homogeneous heat equation is

ut − ∆u = f, (∗∗)

where t > 0 and x ∈ U, U ⊂ R^n is open. The unknown function

u = u(x, t) is

u : Ū × [0, ∞) → R. (∗∗∗)

The Laplacian ∆ is taken with respect to the spatial variables x = (x1, x2,…, xn), and

$$\Delta u = \Delta_x u = \sum_{i=1}^{n} u_{x_i x_i}. \tag{∗∗∗∗}$$

In equation (∗∗), the function

f : U×[0, ∞)→R

is given.

Remark (1). The heat equation is also known as the diffusion equation.

Remark (2). In typical applications, the heat equation describes the evolution in time of the density u of some quantity such as heat, chemical concentration, etc.

Remark (3). The heat equation appears as well in the study of Brownian motion.

3.2. FUNDAMENTAL SOLUTION OF HEAT EQUATION

Article :- Derivation of the fundamental solution of the heat equation ut − ∆u = 0, in U × [0, ∞), where U ⊂ R^n is open.

Solution. We observe that the heat equation

ut −∆u = 0 in U ×[0, ∞) …(1)

involves one derivative w.r.t. the time variable t, but two derivatives w.r.t. the space variables x1, x2,…, xn. Consequently, we see that if u = u(x, t) solves (1), then so does u(λx, λ^2 t) for λ ∈ R. This scaling indicates that the ratio

$$\frac{r^2}{t}, \qquad r = |x| = (x_1^2 + x_2^2 + \dots + x_n^2)^{1/2}, \tag{2}$$

is important for the heat equation. It also suggests that we seek a solution of heat equation (1) of the form

$$u = u(x,t) = v\!\left( \frac{r^2}{t} \right) = v\!\left( \frac{|x|^2}{t} \right), \tag{3}$$

for t > 0 and x ∈ R^n, for some function v as yet undetermined.

However, it is quicker to try a solution u having the special structure

$$u(x,t) = \frac{1}{t^\alpha}\,v\!\left( \frac{x}{t^\beta} \right) \tag{4}$$

for x ∈ R^n, t > 0. Here, α and β are constants and the function

v : R^n → R …(5)

must be found.

Inserting (4) into heat equation (1), and thereafter computing, we obtain

$$\alpha\,t^{-(\alpha+1)}\,v(y) + \beta\,t^{-(\alpha+1)}\,y \cdot Dv(y) + t^{-(\alpha+2\beta)}\,\Delta v(y) = 0, \tag{6}$$

where

$$y = t^{-\beta} x = \frac{x}{t^\beta}. \tag{7}$$

Canceling t^{−(α+1)} from equation (6), we find

$$\alpha\,v(y) + \beta\,y \cdot Dv(y) + t^{-(2\beta-1)}\,\Delta v(y) = 0. \tag{8}$$

In order to transform (8) into an expression involving the variable y alone, we take

$$\beta = \frac{1}{2}. \tag{9}$$

Then, equation (8) reduces to

$$\alpha v + \frac{1}{2}\,y \cdot Dv + \Delta v = 0. \tag{10}$$

We simplify further by guessing v to be radial, i.e.,

v(y) = w(|y|), …(11)

for some

w : R→R. …(12)

Then (left as an exercise)

$$y \cdot Dv = r\,w', \qquad \Delta v = w'' + \frac{n-1}{r}\,w', \tag{13}$$

and equation (10) now becomes

$$\alpha w + \frac{1}{2}\,r\,w' + w'' + \frac{n-1}{r}\,w' = 0, \tag{14}$$

for r = |y| and ′ = d/dr.

Now, if we set

α = n/2, …(15)

then equation (14) simplifies to read

$$(r^{n-1} w')' + \frac{1}{2}\,(r^n w)' = 0. \tag{16}$$

On integration, one obtains

$$r^{n-1} w' + \frac{1}{2}\,r^n w = \text{constant} = a. \tag{17}$$

We assume that w and w′ tend to zero as r → ∞. Under these conditions, we find

a = 0. …(18)

Hence, equations (17) and (18) imply

$$w' = -\frac{1}{2}\,r\,w. \tag{19}$$

But then

$$w = b\,e^{-r^2/4}, \tag{20}$$

for some constant b. Combining equations (4), (9), (11), (15) and (20), we conclude that

$$u(x,t) = \frac{b}{t^{n/2}}\,\exp\!\left( -\frac{|x|^2}{4t} \right) \tag{21}$$

solves the heat equation (1).

We define

$$\Phi(x,t) = \begin{cases} \dfrac{1}{(4\pi t)^{n/2}}\,e^{-\frac{|x|^2}{4t}}, & (x \in \mathbb{R}^n,\ t > 0), \\ 0, & (x \in \mathbb{R}^n,\ t < 0). \end{cases} \tag{22}$$

The function Φ(x, t) is called the fundamental solution of the heat equation (1).

Remarks. (1) Φ is singular at the point (0, 0). (2) We will sometimes write

Φ(x, t) = Φ(|x|, t) …(23)

to emphasize that the fundamental solution is radial in the variable x.

Theorem. (Integral of fundamental solution of heat equation)

Statement. For each time t > 0,

�nR

Φ(x, t) dx = 1.

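Proof. With the substitution z = x / (2√t), we find

$$\int_{\mathbb{R}^n} \Phi(x,t)\,dx = \frac{1}{(4\pi t)^{n/2}} \int_{\mathbb{R}^n} \exp\!\left[ -\frac{|x|^2}{4t} \right] dx = \frac{1}{\pi^{n/2}} \int_{\mathbb{R}^n} \exp\!\left[ -|z|^2 \right] dz = \frac{1}{\pi^{n/2}} \prod_{i=1}^{n} \int_{-\infty}^{\infty} \exp\!\left[ -z_i^2 \right] dz_i = 1.$$

This proves the theorem.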
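Both the unit-mass property and the equation Φ_t = Φ_xx can be checked numerically for n = 1. This is a minimal sketch under stated assumptions: the names `heat_kernel`, `simpson`, `total_mass` and `heat_residual` are hypothetical, the integral over R is truncated to [−12√t, 12√t] (where the Gaussian tail is negligible), and the kernel is set to 0 for t ≤ 0 as a convention of this sketch only.

```python
import math

def heat_kernel(x, t):
    """Phi(x, t) from (22) with n = 1; returns 0 for t <= 0 (sketch convention)."""
    if t <= 0:
        return 0.0
    return math.exp(-x * x / (4 * t)) / math.sqrt(4 * math.pi * t)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def total_mass(t):
    """Integral of Phi(., t) over a truncated interval that captures the Gaussian."""
    cutoff = 12 * math.sqrt(t)
    return simpson(lambda x: heat_kernel(x, t), -cutoff, cutoff)

def heat_residual(x, t, h=1e-3):
    """Central-difference estimate of Phi_t - Phi_xx at (x, t), t > h."""
    phi_t = (heat_kernel(x, t + h) - heat_kernel(x, t - h)) / (2 * h)
    phi_xx = (heat_kernel(x + h, t) - 2 * heat_kernel(x, t) + heat_kernel(x - h, t)) / h ** 2
    return phi_t - phi_xx

for t in (0.01, 0.5, 3.0):
    print(t, total_mass(t))
print(heat_residual(0.3, 0.5))
```

The masses should all be close to 1 regardless of t, and the residual close to 0.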

Article. Solve the initial value (or Cauchy) problem

ut −∆u = 0 in Rn×(0, ∞) …(1)

u =g on Rn×{t = 0} …(2)

associated with homogeneous heat equation.


Solution. Let

$$\Phi(x,t) = \frac{1}{(4\pi t)^{n/2}}\,e^{-\frac{|x|^2}{4t}} \quad (x \in \mathbb{R}^n,\ t > 0) \tag{3}$$

be the fundamental solution of the heat equation (1). We note that the function

(x, t) → Φ(x, t) …(4)

solves the heat equation away from the singularity at (0, 0), and thus so does

(x, t) → Φ(x−y, t) for each fixed y∈Rn. …(5)

Consequently, consider the convolution

$$u(x,t) = \int_{\mathbb{R}^n} \Phi(x - y, t)\,g(y)\,dy = \frac{1}{(4\pi t)^{n/2}} \int_{\mathbb{R}^n} e^{-\frac{|x - y|^2}{4t}}\,g(y)\,dy \tag{6}$$

for x ∈ R^n, t > 0.

(A) First, we shall show that u ∈ C^∞(R^n × (0, ∞)).

Since the function (1/t^{n/2}) e^{−|x|^2/(4t)} is infinitely differentiable, with uniformly bounded derivatives of all orders, on R^n × [δ, ∞) for each δ > 0, we see that

u ∈ C^∞(R^n × (0, ∞)). …(7)

(B) Furthermore, from equation (6), we write

$$u_t(x,t) - \Delta u(x,t) = \int_{\mathbb{R}^n} \left[ (\Phi_t - \Delta_x \Phi)(x - y, t) \right] g(y)\,dy = 0,$$

for all x∈Rn and t>0, since the fundamental solution Φ(x, t) itself solves the

heat equation. Thus,

ut(x, t) −∆u(x, t) = 0 …(8)

in Rn×(0, ∞).

(C) Let x0 ∈ R^n be a fixed point. Let ε > 0 be given. Choose δ > 0 such that

|g(y) − g(x0)| < ε …(9)

whenever

|y − x0| < δ for y ∈ R^n. …(10)

We know that

�nR

Φ(x, t) dx = 1 …(11)

for each time t>0 .

Then, if

|x−x0|< δ/2, …(12)

we have, using equations (6) and (11),

$$|u(x,t) - g(x_0)| = \left| \int_{\mathbb{R}^n} \Phi(x - y, t)\,\{ g(y) - g(x_0) \}\,dy \right| \le \int_{B(x_0,\delta)} \Phi(x - y, t)\,|g(y) - g(x_0)|\,dy + \int_{\mathbb{R}^n - B(x_0,\delta)} \Phi(x - y, t)\,|g(y) - g(x_0)|\,dy = I + J, \text{ say}, \tag{13}$$

where

$$I = \int_{B(x_0,\delta)} \Phi(x - y, t)\,|g(y) - g(x_0)|\,dy, \tag{14}$$

$$J = \int_{\mathbb{R}^n - B(x_0,\delta)} \Phi(x - y, t)\,|g(y) - g(x_0)|\,dy. \tag{15}$$

Now, owing to inequality (9) and relation (11), we find

$$I \le \varepsilon \int_{B(x_0,\delta)} \Phi(x - y, t)\,dy \le \varepsilon. \tag{16}$$

Furthermore, if

|x − x0| ≤ δ/2 and |y − x0| ≥ δ, …(17)

then

|y − x0| ≤ |y − x| + |x − x0| ≤ |y − x| + δ/2 ≤ |y − x| + ½ |y − x0|,

or

|y − x| ≥ ½ |y − x0|. …(18)


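Consequently,

$$\begin{aligned} J &\le 2\,\|g\|_{L^\infty} \int_{\mathbb{R}^n - B(x_0,\delta)} \Phi(x - y, t)\,dy \\ &\le \frac{C}{t^{n/2}} \int_{\mathbb{R}^n - B(x_0,\delta)} \exp\!\left[ -\frac{1}{4t}\,|x - y|^2 \right] dy \\ &\le \frac{C}{t^{n/2}} \int_{\mathbb{R}^n - B(x_0,\delta)} \exp\!\left[ -\frac{1}{16t}\,|y - x_0|^2 \right] dy, \quad \text{using (18)} \\ &= \frac{C}{t^{n/2}} \int_{\delta}^{\infty} \exp\!\left[ -\frac{r^2}{16t} \right] r^{n-1}\,dr \to 0 \end{aligned} \tag{19}$$

as t → 0+.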

Hence, if |x−x0| < δ/2 and t > 0 is small enough, then

|u(x, t) −g(x0)| < 2∈, …(20)

using equations (13), (16) and (19). The relation (20) implies

$$\lim_{\substack{(x,t) \to (x_0,0) \\ x \in \mathbb{R}^n,\ t > 0}} u(x,t) = g(x_0) \tag{21}$$

for each point x0 ∈ R^n.
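The convolution formula (6) can be tried out in one dimension. The sketch below uses an assumed initial datum g(y) = cos y, chosen (not from the text) because the exact solution of the Cauchy problem is then e^(−t) cos x; the integral over R is truncated to a window of half-width 12√t around x, and the helper names are hypothetical.

```python
import math

def heat_kernel(x, t):
    """1-D fundamental solution (4 pi t)^(-1/2) exp(-x^2 / (4 t)) for t > 0."""
    return math.exp(-x * x / (4 * t)) / math.sqrt(4 * math.pi * t)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def cauchy_solution(g, x, t):
    """Formula (6) for n = 1, with the integral truncated to [x - 12 sqrt(t), x + 12 sqrt(t)]."""
    cutoff = 12 * math.sqrt(t)
    return simpson(lambda y: heat_kernel(x - y, t) * g(y), x - cutoff, x + cutoff)

g = math.cos  # assumed initial datum; exact solution is exp(-t) * cos(x)

for t in (1.0, 0.25, 1e-4):
    print(t, cauchy_solution(g, 0.4, t), math.exp(-t) * math.cos(0.4))
```

As t decreases the computed value approaches g(0.4) = cos 0.4, in agreement with the limit (21).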

Thus, we have shown that u(x, t), given by (6), is the solution of the initial-value problem consisting of equations (1) and (2). This completes the proof.

3.3 MEAN-VALUE FORMULA FOR THE HEAT EQUATION

Let U ⊂ R^n be open and bounded. We fix a time T > 0.

Definition. The parabolic cylinder is defined as

U_T = U × (0, T],

and the parabolic boundary of U_T is denoted by Γ_T and is defined as

Γ_T = Ū_T − U_T.

Interpretation. We interpret U_T as being the parabolic interior of Ū × [0, T]. We must note that U_T includes the top U × {t = T}. The parabolic boundary Γ_T comprises the bottom and vertical sides of U × [0, T], but not the top.

Definition (Heat ball)

For fixed x ∈ R^n, t ∈ R and r > 0, we define

$$E(x,t;r) = \left\{ (y,s) \in \mathbb{R}^{n+1} \;\middle|\; s \le t \text{ and } \Phi(x - y, t - s) \ge \frac{1}{r^n} \right\}.$$

Note. E(x, t; r) is a region in space-time. Its boundary is a level set of the fundamental solution Φ(x − y, t − s) for the heat equation. The point (x, t) is at the center of the top. E(x, t; r) is called a heat ball.

Theorem. (A mean-value property for the heat equation)

Statement. Let u ∈ C^2_1(U_T) solve the heat equation

ut − ∆u = 0 in R^n × (0, ∞). …(1)

Then

$$u(x,t) = \frac{1}{4r^n} \iint_{E(x,t;r)} u(y,s)\,\frac{|x - y|^2}{(t - s)^2}\,dy\,ds \tag{2}$$

for each heat ball E(x, t; r) ⊂ U_T.

Proof. Formula (2) is a mean-value formula for heat equation. We find that the right hand side of (2) involves only u(y, s) for times s ≤ t. This is reasonable, as the value u(x, t) should not depend upon future times. We may assume upon translating the space and time coordinates that

x = 0, t = 0. …(3)

We write

E(r) = E(0, 0; r) …(4)

and set

$$\varphi(r) = \frac{1}{r^n} \iint_{E(r)} u(y,s)\,\frac{|y|^2}{s^2}\,dy\,ds \tag{5}$$

$$= \iint_{E(1)} u(ry, r^2 s)\,\frac{|y|^2}{s^2}\,dy\,ds. \tag{5A}$$

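We calculate

$$\begin{aligned} \varphi'(r) &= \iint_{E(1)} \left[ \sum_{i=1}^{n} u_{y_i}\,y_i\,\frac{|y|^2}{s^2} + 2r\,u_s\,\frac{|y|^2}{s} \right] dy\,ds \\ &= \frac{1}{r^{n+1}} \iint_{E(r)} \left[ \sum_{i=1}^{n} u_{y_i}\,y_i\,\frac{|y|^2}{s^2} + 2\,u_s\,\frac{|y|^2}{s} \right] dy\,ds \\ &= A + B, \text{ say}. \end{aligned} \tag{6}$$

We introduce the useful function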


$$\psi = -\frac{n}{2}\,\log(-4\pi s) + \frac{|y|^2}{4s} + n \log r. \tag{7}$$

Then

ψ = 0 on ∂E(r), …(8)

since Φ(y, −s) = r^{−n} on ∂E(r), …(9)

by definition of the heat ball. Now, we utilize (7) to write

$$B = \frac{1}{r^{n+1}} \iint_{E(r)} 4\,u_s \sum_{i=1}^{n} y_i\,\psi_{y_i}\,dy\,ds = -\frac{1}{r^{n+1}} \iint_{E(r)} \left[ 4n\,u_s\,\psi + 4 \sum_{i=1}^{n} u_{s y_i}\,y_i\,\psi \right] dy\,ds; \tag{10}$$

there is no boundary term, since ψ = 0 on the boundary ∂E(r), by virtue of (8). Integrating by parts w.r.t. s, we discover

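$$\begin{aligned} B &= \frac{1}{r^{n+1}} \iint_{E(r)} \left[ -4n\,u_s\,\psi + 4 \sum_{i=1}^{n} u_{y_i}\,y_i\,\psi_s \right] dy\,ds \\ &= \frac{1}{r^{n+1}} \iint_{E(r)} \left[ -4n\,u_s\,\psi + 4 \sum_{i=1}^{n} u_{y_i}\,y_i \left( -\frac{n}{2s} - \frac{|y|^2}{4s^2} \right) \right] dy\,ds \\ &= \frac{1}{r^{n+1}} \iint_{E(r)} \left[ -4n\,u_s\,\psi - \frac{2n}{s} \sum_{i=1}^{n} u_{y_i}\,y_i \right] dy\,ds - A. \end{aligned}$$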

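This implies

$$A + B = \frac{1}{r^{n+1}} \iint_{E(r)} \left[ -4n\,\Delta u\,\psi - \frac{2n}{s} \sum_{i=1}^{n} u_{y_i}\,y_i \right] dy\,ds,$$

since u solves the heat equation. So, integrating by parts in y once more,

$$\varphi'(r) = \sum_{i=1}^{n} \frac{1}{r^{n+1}} \iint_{E(r)} \left[ 4n\,u_{y_i}\,\psi_{y_i} - \frac{2n}{s}\,u_{y_i}\,y_i \right] dy\,ds = 0, \tag{11}$$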

by virtue of (7). Equation (11) gives

φ(r) = constant,

or

$$\varphi(r) = \lim_{t \to 0} \varphi(t) = u(0,0) \left( \lim_{t \to 0} \frac{1}{t^n} \iint_{E(t)} \frac{|y|^2}{s^2}\,dy\,ds \right) = 4\,u(0,0), \tag{12}$$

since

$$\frac{1}{t^n} \iint_{E(t)} \frac{|y|^2}{s^2}\,dy\,ds = \iint_{E(1)} \frac{|y|^2}{s^2}\,dy\,ds = 4. \tag{13}$$

From equations (4) and (12), we write

$$u(x,t) = \frac{1}{4}\,\varphi(r). \tag{14}$$

From equations (5) and (14), we have

$$u(x,t) = \frac{1}{4r^n} \iint_{E(x,t;r)} u(y,s)\,\frac{|x - y|^2}{(t - s)^2}\,dy\,ds. \tag{15}$$

This completes the proof of the mean-value formula for the heat equation.

3.4 ENERGY METHODS FOR HEAT EQUATIONS

Theorem. (Uniqueness theorem for heat equation)

Statement. There exists at most one solution u ∈ C^2_1(Ū_T) of the problem

ut − ∆u = f in U_T,

u = g on Γ_T,

where U ⊂ R^n is open and bounded, and ∂U is C^1. The terminal time T > 0 is given.

Proof. Let u and ũ be two solutions of the above problem. Then

ut − ∆u = f in U_T, …(1)

u = g on Γ_T, …(2)

ũt − ∆ũ = f in U_T, …(3)

ũ = g on Γ_T. …(4)

Let w = u − ũ. …(5)

Then equations (1) to (5) yield

wt − ∆w = (ut − ũt) − (∆u − ∆ũ)

= (ut − ∆u) − (ũt − ∆ũ)

= f − f

= 0 in U_T. …(6)

Also w = u − ũ

= g − g

= 0 on Γ_T. …(7)

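Set

$$e(t) = \int_U w^2(x,t)\,dx, \quad 0 \le t \le T. \tag{8}$$

Then, integrating by parts and using w = 0 on ∂U,

$$\dot e(t) \equiv \frac{de}{dt} = 2 \int_U w\,w_t\,dx = 2 \int_U w\,\Delta w\,dx \ \text{(using (6))} = -2 \int_U |Dw|^2\,dx \le 0. \tag{9}$$

So e(t) is a non-increasing function, and so

0 ≤ e(t) ≤ e(0) = 0, for 0 ≤ t ≤ T. …(10)

Consequently

$$\int_U w^2(x,t)\,dx = 0 \text{ for all } 0 \le t \le T,$$

w ≡ 0 in U_T,

u = ũ in U_T.

Hence, the solution is unique. This completes the proof.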
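The decay of the energy e(t) has a simple discrete analogue. The sketch below is an illustration under assumptions that are not in the text: an explicit finite-difference scheme for w_t = w_xx on (0, 1) with zero boundary values, a step ratio dt/dx² = 0.4 (at most 1/2 is assumed for stability), and arbitrary initial data. It monitors the discrete energy dx · Σ w_i², which should never increase. The names `heat_step` and `discrete_energy` are hypothetical.

```python
import math

def heat_step(w, lam):
    """One explicit Euler step for w_t = w_xx with w = 0 at both ends; lam = dt / dx^2."""
    new = [0.0] * len(w)
    for i in range(1, len(w) - 1):
        new[i] = w[i] + lam * (w[i - 1] - 2 * w[i] + w[i + 1])
    return new

def discrete_energy(w, dx):
    """Discrete analogue of e(t) = integral of w^2 over U."""
    return dx * sum(v * v for v in w)

m = 50                 # number of grid intervals on (0, 1)
dx = 1.0 / m
lam = 0.4              # assumed stability choice, lam <= 1/2

# Arbitrary initial difference w = u - u~ vanishing on the boundary.
w = [math.sin(math.pi * i * dx) + 0.5 * math.sin(3 * math.pi * i * dx) for i in range(m + 1)]
w[0] = w[-1] = 0.0

energies = [discrete_energy(w, dx)]
for _ in range(200):
    w = heat_step(w, lam)
    energies.append(discrete_energy(w, dx))

print(energies[0], energies[50], energies[200])
```

If the initial data were zero, as in the uniqueness proof where e(0) = 0, every later energy would be zero as well.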

3.5 PROPERTIES OF SOLUTIONS

First we employ the mean-value property to give a quick proof of the strong maximum principle.

Theorem. (Strong maximum principle for the heat equation).

Statement: Assume u ∈ C^2_1(U_T) ∩ C(Ū_T) solves the heat equation in U_T.

(i) Then

$$\max_{\bar U_T} u = \max_{\Gamma_T} u.$$

(ii) Furthermore, if U is connected and there exists a point (x0, t0) ∈ U_T such that

$$u(x_0, t_0) = \max_{\bar U_T} u,$$

then u is constant in Ū_{t0}.

[Figure: Strong maximum principle for the heat equation; axes R^n and t.]

Assertion (i) is the maximum principle for the heat equation and (ii) is the strong maximum principle. Similar assertions are valid with “min” replacing “max”.

Remark. So if u attains its maximum (or minimum) at an interior point, then u is constant at all earlier times. This accords with our strong intuitive interpretation of the variable t as denoting time : the solution will be constant on the time interval [0, t0] provided the initial and boundary conditions are constant. However, the solution may change at times t > t0, provided the boundary conditions alter after t0. The solution will however not respond to changes in boundary conditions until these changes happen.

Take note that whereas all this is obvious on intuitive, physical grounds, such insights do not constitute a proof. The task is to deduce such behaviour from the PDE.

Proof. 1. Suppose there exists a point (x0, t0) ∈ U_T with

$$u(x_0, t_0) = M = \max_{\bar U_T} u.$$

Then for all sufficiently small r > 0,

E(x0, t0; r) ⊂ U_T,

and we employ the mean-value property to deduce

$$M = u(x_0, t_0) = \frac{1}{4r^n} \iint_{E(x_0,t_0;r)} u(y,s)\,\frac{|x_0 - y|^2}{(t_0 - s)^2}\,dy\,ds \le M,$$

since

$$1 = \frac{1}{4r^n} \iint_{E(x_0,t_0;r)} \frac{|x_0 - y|^2}{(t_0 - s)^2}\,dy\,ds.$$

Equality holds only if u is identically equal to M within E(x0, t0; r).

Consequently

u(y, s) = M for all (y, s) ∈ E(x0, t0; r).

Draw any line segment L in UT connecting (x0, t0) with some other point (y0, s0) ∈ UT, with s0 < t0. Consider r0 = min {s ≥ s0 | u(x, t) = M for all points (x, t) ∈ L, s ≤ t ≤ t0}.

Since u is continuous, the minimum is attained. Assume r0 > s0. Then

u(z0, r0) = M

for some point (z0, r0) on L ∩ UT and so

u ≡ M on E(z0, r0; r) for all sufficiently small r > 0.

Since E(z0, r0; r) contains L ∩ {r0 − σ ≤ t ≤ r0} for some small σ > 0, we have a

contradiction. Thus

r0 = s0,

and hence u ≡ M on L .

2. Now fix any point x∈U and any time 0 ≤ t < t0. There exist points {x0, x1,…,xm = x} such that the line segments in Rn connecting xi−1 to xi lie in U for i = 1,…,m. (This follows since the set of points in U which can be so connected to x0 by a polygonal path is nonempty, open and relatively closed in U.) Select times t0 > t1 >…> tm = t. Then the line segments in Rn+1 connecting (xi−1, ti−1) to (xi, ti) (i = 1,…, m) lie in UT. According to Step 1,

u ≡ M



on each such segment and so

u(x, t) = M.


Remark. The strong maximum principle implies that if U is connected and $u \in C^2_1(U_T) \cap C(\bar{U}_T)$ satisfies

$$\begin{cases} u_t - \Delta u = 0 & \text{in } U_T \\ u = 0 & \text{on } \partial U \times [0, T] \\ u = g & \text{on } U \times \{t = 0\} \end{cases}$$

where g ≥ 0, then u is positive everywhere within UT if g is positive somewhere on U. This is another illustration of infinite propagation speed for disturbances.

An important application of the maximum principle is the following uniqueness assertion.

Theorem. (Uniqueness on bounded domains). Statement. Let $g \in C(\Gamma_T)$, $f \in C(U_T)$. Then there exists at most one solution $u \in C^2_1(U_T) \cap C(\bar{U}_T)$ of the initial/boundary-value problem

$$\begin{cases} u_t - \Delta u = f & \text{in } U_T \\ u = g & \text{on } \Gamma_T. \end{cases}$$ …(1)

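Proof. If $u$ and $\tilde{u}$ are two solutions of (1), apply the previous theorem to

$$w := \pm(u - \tilde{u}).$$

Each choice of sign solves the heat equation with zero data on $\Gamma_T$, so the maximum principle gives $w \le 0$ for both signs, that is, $u \equiv \tilde{u}$.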

3.5 WAVE EQUATION

The wave equation is

utt −∆u = 0 …(∗)

and the non-homogeneous wave equation is

utt − ∆u = f. …(∗∗)

Here t > 0 and x ∈ U, where U ⊂ Rn is open. The unknown function is

$$u : \bar{U} \times [0, \infty) \to \mathbb{R}, \qquad u = u(x, t),$$ …(∗∗∗)

and the Laplacian ∆ is taken with respect to the spatial variables x = (x1, x2,…, xn). In equation (∗∗), the function

$$f : U \times [0, \infty) \to \mathbb{R}$$ …(∗∗∗∗)

is given. Generally, we use the abbreviation

$$\Box u = u_{tt} - \Delta u.$$ …(∗∗∗∗∗)

Remark. The wave equation is a simplified model for a vibrating

string (n = 1),

membrane (n = 2),

elastic solid (n = 3).

In each of the above, u(x, t) represents the displacement in some direction of

the point x at time t ≥ 0.

Solutions by Special Means

Article. d'Alembert's formula (for n = 1)

We consider the initial-value problem for the one-dimensional wave equation in all of R:

utt − uxx = 0 in R × (0, ∞) …(1)

u = g, ut = h on R × {t = 0}, …(2)

where g, h are given functions.

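We desire to derive a formula for u = u(x, t) in terms of the known functions g and h. The two initial conditions in (2) mean that the displacement u(x, 0) and the velocity ut(x, 0) are known. The PDE (1) can be factored to write

$$\left(\frac{\partial}{\partial t} + \frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t} - \frac{\partial}{\partial x}\right) u = u_{tt} - u_{xx} = 0.$$ …(4)

Set

$$v(x, t) = \left(\frac{\partial}{\partial t} - \frac{\partial}{\partial x}\right) u(x, t).$$ …(5)

Then equation (4) says

$$v_t(x, t) + v_x(x, t) = 0; \quad x \in \mathbb{R},\ t > 0.$$ …(6)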

Equation (6) is a homogeneous transport equation with constant coefficients (b = 1). Let

v(x, 0) = a(x). …(7)

We know that the solution of the initial-value problem consisting of the transport equation (6) and condition (7) is

v(x, t) = a(x−t), x∈R, t ≥ 0. …(8)

Combining equations (5) and (8), we obtain

ut(x, t) − ux(x, t) = a(x−t) in R × (0, ∞) …(9)

Also

u(x, 0) = g(x) in R, …(10)

by virtue of initial condition (2). Equations (9) and (10) constitute a nonhomogeneous transport problem (with b = −1). Hence, its solution is

$$u(x, t) = g(x + t) + \int_0^t a\big(x + (s - t)(-1) - s\big)\, ds = g(x + t) + \frac{1}{2}\int_{x-t}^{x+t} a(y)\, dy.$$ …(11)

The initial conditions in (2), together with (5) and (7), imply

a(x) = v(x, 0)

= ut(x, 0) − ux(x, 0)

= h(x) − g′(x), x∈R. …(12)

Substituting (12) into equation (11), we obtain

$$u(x, t) = g(x + t) + \frac{1}{2}\int_{x-t}^{x+t} [h(y) - g'(y)]\, dy = \frac{1}{2}[g(x + t) + g(x - t)] + \frac{1}{2}\int_{x-t}^{x+t} h(y)\, dy,$$ …(13)

for x∈R, t ≥ 0.

This is d'Alembert's formula. We have derived (13) assuming u is a (sufficiently smooth) solution of (1).

Application of d'Alembert's Formula
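As a quick numerical illustration of formula (13), the following minimal sketch evaluates the right-hand side with a simple midpoint rule. The helper names `integrate` and `dalembert` and the sample data g = sin, h = cos are assumptions made for this illustration only; with that data the exact solution is u(x, t) = sin(x + t), which gives something to compare against.

```python
import math

def integrate(func, a, b, n=2000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    if a == b:
        return 0.0
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def dalembert(g, h, x, t):
    """Right-hand side of (13): (g(x+t) + g(x-t))/2 + (1/2) * int_{x-t}^{x+t} h(y) dy."""
    return 0.5 * (g(x + t) + g(x - t)) + 0.5 * integrate(h, x - t, x + t)

# Illustrative data (assumed): g = sin, h = cos, so that u(x, t) = sin(x + t).
x, t = 0.3, 1.2
print(dalembert(math.sin, math.cos, x, t), math.sin(x + t))
```

The two printed values should agree up to the quadrature error of the midpoint rule.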

Initial/boundary-value problem on the half-line R+ = {x > 0}

Example. Consider the problem

utt − uxx = 0 in R+ × (0, ∞)

u = g, ut = h on R+ × {t = 0} …(1)

u = 0 on {x = 0} × (0, ∞),

where g, h are given, with

g(0) = 0, h(0) = 0. …(2)

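Solution. We convert the given problem on the half-line into a problem on the whole of R. We do so by extending the functions u, g, h to all of R by odd reflection. We set

$$\tilde{u}(x, t) = \begin{cases} u(x, t) & \text{for } x \ge 0,\ t \ge 0, \\ -u(-x, t) & \text{for } x \le 0,\ t \ge 0, \end{cases}$$ …(3)

$$\tilde{g}(x) = \begin{cases} g(x) & \text{for } x \ge 0, \\ -g(-x) & \text{for } x \le 0, \end{cases}$$ …(4)

$$\tilde{h}(x) = \begin{cases} h(x) & \text{for } x \ge 0, \\ -h(-x) & \text{for } x \le 0. \end{cases}$$ …(5)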

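Now, problem (1) becomes

$$\begin{cases} \tilde{u}_{tt} = \tilde{u}_{xx} & \text{in } \mathbb{R} \times (0, \infty) \\ \tilde{u} = \tilde{g},\ \tilde{u}_t = \tilde{h} & \text{on } \mathbb{R} \times \{t = 0\}. \end{cases}$$ …(6)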

Hence, d'Alembert's formula for the one-dimensional problem (6) implies

$$\tilde{u}(x, t) = \frac{1}{2}[\tilde{g}(x + t) + \tilde{g}(x - t)] + \frac{1}{2}\int_{x-t}^{x+t} \tilde{h}(y)\, dy.$$ …(7)

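Recalling the definitions of $\tilde{u}, \tilde{g}, \tilde{h}$ in equations (3)−(5), we can transform equation (7) to read, for x ≥ 0, t ≥ 0,

$$u(x, t) = \begin{cases} \dfrac{1}{2}[g(x + t) + g(x - t)] + \dfrac{1}{2}\displaystyle\int_{x-t}^{x+t} h(y)\, dy & \text{for } x \ge t \ge 0, \\[2ex] \dfrac{1}{2}[g(x + t) - g(t - x)] + \dfrac{1}{2}\displaystyle\int_{-x+t}^{x+t} h(y)\, dy & \text{for } 0 \le x \le t. \end{cases}$$ …(8)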

Formula (8) is the solution of the given problem on the half-line R+ = {x > 0}.

Remark. If

h ≡ 0 in R+ × {t = 0}, …(9)

then the solution of the corresponding problem, as given by (8), is

$$u(x, t) = \begin{cases} \dfrac{1}{2}[g(x + t) + g(x - t)] & \text{for } x \ge t \ge 0, \\[1ex] \dfrac{1}{2}[g(x + t) - g(t - x)] & \text{for } 0 \le x \le t. \end{cases}$$ …(10)

The formula (10) shows that the initial displacement, u(x, 0) = g(x), splits into two parts, one moving to the right with speed one (c = 1) and the other to the left with speed one. The latter part reflects off the point x = 0, where the vibrating string is held fixed.
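The reflection argument can be sketched numerically. The code below, a minimal illustration and not part of the original derivation, evaluates the half-line solution in two ways: by applying d'Alembert's formula to the odd extensions (4)-(5), and by the case formula (8). The function names and the sample data g(x) = x e^{-x}, h(x) = sin x (both vanishing at x = 0) are assumptions chosen for this sketch.

```python
import math

def integrate(func, a, b, n=2000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    if a == b:
        return 0.0
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def odd_extension(func):
    """Odd reflection as in (4)-(5): func(x) for x >= 0, -func(-x) for x < 0."""
    return lambda x: func(x) if x >= 0 else -func(-x)

def via_reflection(g, h, x, t):
    """d'Alembert's formula (7) applied to the odd extensions of g and h."""
    g_odd, h_odd = odd_extension(g), odd_extension(h)
    return 0.5 * (g_odd(x + t) + g_odd(x - t)) + 0.5 * integrate(h_odd, x - t, x + t)

def via_formula_8(g, h, x, t):
    """The two-case formula (8) on the half-line x >= 0."""
    if x >= t:
        return 0.5 * (g(x + t) + g(x - t)) + 0.5 * integrate(h, x - t, x + t)
    return 0.5 * (g(x + t) - g(t - x)) + 0.5 * integrate(h, t - x, x + t)

# Illustrative data (assumed); both satisfy the compatibility condition (2).
g = lambda x: x * math.exp(-x)
h = math.sin

print(via_reflection(g, h, 0.5, 2.0), via_formula_8(g, h, 0.5, 2.0))
print(via_reflection(g, h, 0.0, 1.7))  # boundary value at x = 0
```

The first line prints two values that should agree up to quadrature error; the second should be numerically zero, reflecting the fixed end at x = 0.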

Article. Derive Kirchhoff's formula for the solution of the three-dimensional (n = 3) initial-value problem

utt − ∆u = 0 in R3 × (0, ∞) …(1)

u = g on R3 × {t = 0} …(2)

ut = h on R3 × {t = 0}. …(3)

Solution. Suppose u ∈ C2 (R3 × [0, ∞)) solves the above initial-value problem.

We know that

$$U(x; r, t) = \fint_{\partial B(x, r)} u(y, t)\, dS(y)$$ …(4)

defines the average of u(⋅, t) over the sphere ∂B(x, r). Similarly,

$$G(x; r) = \fint_{\partial B(x, r)} g(y)\, dS(y),$$ …(5)

$$H(x; r) = \fint_{\partial B(x, r)} h(y)\, dS(y).$$ …(6)

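For fixed x, we hereafter regard U as a function of r and t only. Next, set

$$\tilde{U} := rU,$$ …(7)

$$\tilde{G} := rG, \qquad \tilde{H} := rH.$$ …(8)

We now assert that $\tilde{U}$ solves

$$\begin{cases} \tilde{U}_{tt} - \tilde{U}_{rr} = 0 & \text{in } \mathbb{R}_+ \times (0, \infty) \\ \tilde{U} = \tilde{G} & \text{on } \mathbb{R}_+ \times \{t = 0\} \\ \tilde{U}_t = \tilde{H} & \text{on } \mathbb{R}_+ \times \{t = 0\} \\ \tilde{U} = 0 & \text{on } \{r = 0\} \times (0, \infty). \end{cases}$$ …(9)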

We note that the transformation in (7) and (8) converts the three-dimensional wave equation into the one-dimensional wave equation. From equation (7), we find

$$\tilde{U}_{tt} = r U_{tt} = r\, \Delta U = r\left[U_{rr} + \frac{2}{r} U_r\right] \quad \text{(radial Laplacian for } n = 3\text{)}$$

$$= r U_{rr} + 2U_r = (U + r U_r)_r = (\tilde{U}_r)_r = \tilde{U}_{rr}.$$ …(10)

The problem (9) is posed on the half-line R+ = {r ≥ 0}. Formula (8) of the half-line example, applied to (9), gives for 0 ≤ r ≤ t

$$\tilde{U}(x; r, t) = \frac{1}{2}[\tilde{G}(r + t) - \tilde{G}(t - r)] + \frac{1}{2}\int_{-r+t}^{r+t} \tilde{H}(y)\, dy.$$ …(11)

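From (4), we find

$$u(x, t) = \lim_{r \to 0^+} U(x; r, t).$$ …(12)

Equations (7), (8), (11) and (12) imply that

$$u(x, t) = \lim_{r \to 0^+} \frac{\tilde{U}(x; r, t)}{r} = \lim_{r \to 0^+} \left[\frac{\tilde{G}(t + r) - \tilde{G}(t - r)}{2r} + \frac{1}{2r}\int_{t-r}^{t+r} \tilde{H}(y)\, dy\right] = \tilde{G}'(t) + \tilde{H}(t).$$ …(13)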

Owing then to (5) and (6), we deduce from (13)

$$u(x, t) = \frac{\partial}{\partial t}\left(t \fint_{\partial B(x, t)} g(y)\, dS(y)\right) + t \fint_{\partial B(x, t)} h(y)\, dS(y).$$ …(14)

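But

$$\fint_{\partial B(x, t)} g(y)\, dS(y) = \fint_{\partial B(0, 1)} g(x + tz)\, dS(z).$$ …(15)

Hence

$$\frac{\partial}{\partial t}\left(\fint_{\partial B(x, t)} g(y)\, dS(y)\right) = \fint_{\partial B(0, 1)} Dg(x + tz) \cdot z\, dS(z) = \fint_{\partial B(x, t)} Dg(y) \cdot \left(\frac{y - x}{t}\right) dS(y).$$ …(16)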

Now, equations (14) and (16) yield

$$u(x, t) = \fint_{\partial B(x, t)} \big[g(y) + Dg(y) \cdot (y - x) + t\, h(y)\big]\, dS(y)$$ …(17)

for x∈R3, t > 0.
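Formula (17) can be sanity-checked on polynomial data. The sketch below is an illustration under stated assumptions, not a general-purpose solver: it replaces the spherical average by the six-point rule z = ±e_i on the unit sphere, which is exact only when the integrand is a polynomial of degree at most 3 in z. The function name `kirchhoff_six_point` and the test data are hypothetical choices. For g(y) = |y|², h = 0 the exact solution is u = |x|² + 3t²; for g = 0, h(y) = y₁ it is u = t x₁.

```python
def kirchhoff_six_point(g, grad_g, h, x, t):
    """Approximate (17) by averaging over the six points x + t*z, z = +-e_i.

    Assumption: the six-point rule reproduces the spherical average exactly
    only for integrands that are polynomials of degree <= 3 in z.
    """
    directions = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    total = 0.0
    for z in directions:
        y = tuple(xi + t * zi for xi, zi in zip(x, z))
        # Dg(y) . (y - x)
        radial = sum(d * (yi - xi) for d, yi, xi in zip(grad_g(y), y, x))
        total += g(y) + radial + t * h(y)
    return total / len(directions)

# Illustrative data (assumed): g = |y|^2 with gradient 2y, h = 0.
g = lambda y: sum(c * c for c in y)
grad_g = lambda y: tuple(2.0 * c for c in y)
zero = lambda y: 0.0

x, t = (1.0, 2.0, -1.0), 0.5
print(kirchhoff_six_point(g, grad_g, zero, x, t))  # compare with |x|^2 + 3 t^2
```

For non-polynomial data a genuine quadrature rule on the sphere would be needed in place of the six-point average.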

The formula (17) is called Kirchhoff's formula for the solution of the initial-value problem (1)−(3) in three dimensions.

Nonhomogeneous Problem

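We next investigate the initial-value problem for the nonhomogeneous wave equation

$$\begin{cases} u_{tt} - \Delta u = f & \text{in } \mathbb{R}^n \times (0, \infty) \\ u = 0,\ u_t = 0 & \text{on } \mathbb{R}^n \times \{t = 0\}. \end{cases}$$ …(1)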

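Motivated by Duhamel's principle, we define u = u(x, t; s) to be the solution of

$$\begin{cases} u_{tt}(\cdot; s) - \Delta u(\cdot; s) = 0 & \text{in } \mathbb{R}^n \times (s, \infty) \\ u(\cdot; s) = 0,\ u_t(\cdot; s) = f(\cdot, s) & \text{on } \mathbb{R}^n \times \{t = s\}. \end{cases}$$ …(2)

Now set

$$u(x, t) := \int_0^t u(x, t; s)\, ds \quad (x \in \mathbb{R}^n,\ t \ge 0).$$ …(3)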

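Duhamel's principle asserts that this is a solution of

$$\begin{cases} u_{tt} - \Delta u = f & \text{in } \mathbb{R}^n \times (0, \infty) \\ u = 0,\ u_t = 0 & \text{on } \mathbb{R}^n \times \{t = 0\}. \end{cases}$$ …(4)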

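Theorem. (Solution of nonhomogeneous wave equation). Statement. Assume n ≥ 2 and $f \in C^{[n/2]+1}(\mathbb{R}^n \times [0, \infty))$. Define u by (3). Then

(i) $u \in C^2(\mathbb{R}^n \times [0, \infty))$,

(ii) $u_{tt} - \Delta u = f$ in $\mathbb{R}^n \times (0, \infty)$,

and

(iii) $$\lim_{\substack{(x, t) \to (x_0, 0) \\ x \in \mathbb{R}^n,\ t > 0}} u(x, t) = 0, \qquad \lim_{\substack{(x, t) \to (x_0, 0) \\ x \in \mathbb{R}^n,\ t > 0}} u_t(x, t) = 0 \quad \text{for each point } x_0 \in \mathbb{R}^n.$$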

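Proof. 1. If n is odd, $\left[\frac{n}{2}\right] + 1 = \frac{n+1}{2}$. Also $u(\cdot, \cdot; s) \in C^2(\mathbb{R}^n \times [s, \infty))$ for each s ≥ 0, and so $u \in C^2(\mathbb{R}^n \times [0, \infty))$. If n is even, $\left[\frac{n}{2}\right] + 1 = \frac{n+2}{2}$. Hence $u \in C^2(\mathbb{R}^n \times [0, \infty))$.

2. We then compute:

$$u_t(x, t) = u(x, t; t) + \int_0^t u_t(x, t; s)\, ds = \int_0^t u_t(x, t; s)\, ds,$$

$$u_{tt}(x, t) = u_t(x, t; t) + \int_0^t u_{tt}(x, t; s)\, ds = f(x, t) + \int_0^t u_{tt}(x, t; s)\, ds.$$

Furthermore

$$\Delta u(x, t) = \int_0^t \Delta u(x, t; s)\, ds = \int_0^t u_{tt}(x, t; s)\, ds.$$

Thus

$$u_{tt}(x, t) - \Delta u(x, t) = f(x, t) \quad (x \in \mathbb{R}^n,\ t > 0),$$

and clearly

u(x, 0) = ut(x, 0) = 0 for x ∈ Rn.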

Examples. (i) Let us work out explicitly how to solve (4) for n = 1. In this case d'Alembert's formula gives

$$u(x, t; s) = \frac{1}{2}\int_{x-t+s}^{x+t-s} f(y, s)\, dy, \qquad u(x, t) = \frac{1}{2}\int_0^t \int_{x-t+s}^{x+t-s} f(y, s)\, dy\, ds.$$

That is,

$$u(x, t) = \frac{1}{2}\int_0^t \int_{x-s}^{x+s} f(y, t - s)\, dy\, ds \quad (x \in \mathbb{R},\ t \ge 0).$$ …(5)
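Formula (5) is easy to test numerically. The following minimal sketch, with the hypothetical helper name `duhamel_1d`, approximates the double integral by a midpoint rule in both variables. The sample right-hand sides are assumptions chosen because the exact solutions are elementary: f ≡ 1 gives u = t²/2, and f(x, t) = x gives u = x t²/2.

```python
def duhamel_1d(f, x, t, n=200):
    """Approximate (5): u = (1/2) int_0^t int_{x-s}^{x+s} f(y, t-s) dy ds (midpoint rule)."""
    ds = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds           # midpoint in s
        dy = 2.0 * s / n             # the y-interval [x-s, x+s] has length 2s
        inner = sum(f(x - s + (j + 0.5) * dy, t - s) for j in range(n)) * dy
        total += inner * ds
    return 0.5 * total

# Illustrative right-hand sides (assumed), written as f(position, time).
print(duhamel_1d(lambda y, tau: 1.0, 0.7, 1.5))  # compare with t^2 / 2
print(duhamel_1d(lambda y, tau: y, 0.7, 1.5))    # compare with x t^2 / 2
```

For these two integrands the midpoint rule is exact up to rounding, so the printed values should match t²/2 and x t²/2 closely; for general f the accuracy depends on n.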

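(ii) For n = 3, Kirchhoff's formula implies

$$u(x, t; s) = (t - s) \fint_{\partial B(x, t-s)} f(y, s)\, dS;$$

so that

$$u(x, t) = \int_0^t (t - s)\left(\fint_{\partial B(x, t-s)} f(y, s)\, dS\right) ds = \frac{1}{4\pi}\int_0^t \int_{\partial B(x, t-s)} \frac{f(y, s)}{t - s}\, dS\, ds = \frac{1}{4\pi}\int_0^t \int_{\partial B(x, r)} \frac{f(y, t - r)}{r}\, dS\, dr.$$

Therefore

$$u(x, t) = \frac{1}{4\pi}\int_{B(x, t)} \frac{f(y, t - |y - x|)}{|y - x|}\, dy \quad (x \in \mathbb{R}^3,\ t \ge 0)$$ …(6)

solves (4) for n = 3.

The integrand on the right is called a retarded potential.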

3.6 ENERGY METHODS

There is the necessity of making more and more smoothness assumptions upon the data g and h to ensure the existence of a C2 solution of the wave equation for larger and larger n. This suggests that perhaps some other way of measuring the size and smoothness of functions may be more appropriate. Indeed we will see in this section that the wave equation is nicely behaved (for all n) with respect to certain integral "energy" norms.

Uniqueness

Let U ⊂ Rn be a bounded, open set with a smooth boundary ∂U, and as usual set $U_T = U \times (0, T]$, $\Gamma_T = \bar{U}_T - U_T$, where T > 0. We are interested in the initial/boundary-value problem

$$\begin{cases} u_{tt} - \Delta u = f & \text{in } U_T \\ u = g & \text{on } \Gamma_T \\ u_t = h & \text{on } U \times \{t = 0\}. \end{cases}$$ …(1)

Theorem. (Uniqueness for wave equation).

Statement. There exists at most one function $u \in C^2(\bar{U}_T)$ solving (1).

Proof. If $\tilde{u}$ is another such solution, then $w := u - \tilde{u}$ solves

$$\begin{cases} w_{tt} - \Delta w = 0 & \text{in } U_T \\ w = 0 & \text{on } \Gamma_T \\ w_t = 0 & \text{on } U \times \{t = 0\}. \end{cases}$$

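Define the "energy"

$$e(t) := \frac{1}{2}\int_U \left(w_t^2(x, t) + |Dw(x, t)|^2\right) dx \quad (0 \le t \le T).$$

We compute

$$\dot{e}(t) = \int_U \left(w_t w_{tt} + Dw \cdot Dw_t\right) dx = \int_U w_t (w_{tt} - \Delta w)\, dx = 0 \qquad \left(\dot{\ } = \frac{d}{dt}\right).$$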
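The second equality in this computation rests on an integration by parts in the spatial variables. Written out, with ν denoting the outward unit normal to ∂U (a symbol introduced here for illustration, not used in the text), the step reads:

```latex
\int_U Dw \cdot Dw_t \, dx
  = -\int_U w_t \, \Delta w \, dx
    + \int_{\partial U} w_t \, \frac{\partial w}{\partial \nu} \, dS ,
\qquad\text{so}\qquad
\dot{e}(t) = \int_U w_t \,(w_{tt} - \Delta w)\, dx
    + \int_{\partial U} w_t \, \frac{\partial w}{\partial \nu} \, dS .
```

The surface integral is the boundary term in question; it drops out as soon as $w_t$ vanishes on $\partial U$.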

There is no boundary term since w = 0, and hence wt = 0, on ∂U × [0, T]. Thus for all 0 ≤ t ≤ T, e(t) = e(0) = 0, and so wt, Dw ≡ 0 within UT. Since w ≡ 0 on U × {t = 0}, we conclude $w = u - \tilde{u} \equiv 0$ in UT.

Domain of dependence

As another illustration of energy methods, let us examine again the domain of dependence of solutions to the wave equation in all of space. For this, suppose u ∈ C2 solves

utt − ∆u = 0 in Rn × (0, ∞).

Fix x0 ∈ Rn, t0 > 0 and consider the cone

C = {(x, t) | 0 ≤ t ≤ t0, |x − x0| ≤ t0 − t}.

[Figure: Cone of dependence, with apex (x0, t0) and cross-section B(x0, t0 − t).]

The Books Recommended for Chapter III

1. L.C. Evans Partial Differential Equations, Graduate Studies

Chapter-4

Analytical Mechanics – I

4.1 INTRODUCTION

In the literature on mechanics there is no single generally accepted interpretation of the term "analytical mechanics". Some writers identify analytical mechanics with theoretical mechanics. Some authors maintain that an exposition in generalized coordinates constitutes the distinguishing feature of analytical mechanics. According to Gantmacher, analytical mechanics is characterized both by a specific system of presentation and also by a definite range of problems investigated. In analytical mechanics, general principles (differential or integral) serve as the foundation, and the basic differential equations of motion are then derived from these principles analytically.

4.2. FREE AND CONSTRAINED SYSTEM

We study the motion of a system of particles Pk (k = 1, 2,…,N) relative to some inertial (Galilean) system of coordinates. There may be restrictions on the positions and velocities of the particles of the system. These restrictions may be of a geometrical or kinematical nature. Such restrictions are called constraints. Systems with such constraints are termed constrained systems. If there are no constraints on the system, then the system is called a free system.

Let t denote time, let $\vec{r}_k$ (k = 1, 2,…, N) be the radius vectors of the particles Pk, taken from a single pole (that is stationary in the given system of coordinates), and let

$$\vec{v}_k = \dot{\vec{r}}_k \quad (k = 1, 2, \dots, N)$$ (1)

denote the velocities of the points Pk of the system. Here the dot (⋅) represents differentiation with respect to time t. Analytically, a constraint is expressed by the equation

$$f(t, \vec{r}_k, \dot{\vec{r}}_k) = 0.$$ (2)

In the general case, constraint (2) is called differential or kinematical. In (2), $f(t, \vec{r}_k, \dot{\vec{r}}_k)$ is an abridged notation for the function

$$f(t, \vec{r}_1, \vec{r}_2, \dots, \vec{r}_N, \dot{\vec{r}}_1, \dot{\vec{r}}_2, \dots, \dot{\vec{r}}_N).$$

Such abbreviated notation will be used throughout the chapters on analytical mechanics.


If the velocities $\dot{\vec{r}}_k$ do not enter into the constraint equation (2), the constraint is termed finite or geometric. Analytically, it is written as

$$f(t, \vec{r}_k) = 0.$$ (3)

Given a finite constraint of type (3), a system cannot occupy an arbitrary position in space at every given instant of time. Finite constraints impose restrictions on the possible positions of the system at time t. But with a differential constraint alone, the system may occupy any arbitrary position in space at any time t. However, in this position the velocities of the particles of the system can no longer be arbitrary, since the differential constraint imposes restrictions on these velocities. From now on, we shall confine our consideration solely to such differential constraints whose equations contain the velocities of the particles in linear form:

$$\vec{l}_1 \cdot \dot{\vec{r}}_1 + \vec{l}_2 \cdot \dot{\vec{r}}_2 + \dots + \vec{l}_N \cdot \dot{\vec{r}}_N + D = 0$$

or

$$\sum_{k=1}^{N} \vec{l}_k \cdot \dot{\vec{r}}_k + D = 0,$$ (4)

where $\vec{l}_k \cdot \dot{\vec{r}}_k$ is the scalar product of the vectors $\vec{l}_k$ and $\dot{\vec{r}}_k$, and the vectors $\vec{l}_k$ and the scalar D are specified functions of time t and of all $\vec{r}_\mu$ (µ = 1, 2,…, N). It is assumed here that the vectors $\vec{l}_k$ cannot all vanish at the same time.

Each finite constraint of type (3) implies, as a consequence, a differential constraint whose equation is obtained by termwise differentiation of equation (3):

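$$\sum_{k=1}^{N} \frac{\partial f}{\partial \vec{r}_k} \cdot \dot{\vec{r}}_k + \frac{\partial f}{\partial t} = 0,$$ (5)

where

$$\vec{r}_k = x_k \hat{i} + y_k \hat{j} + z_k \hat{k},$$ (6)

and $\hat{i}, \hat{j}, \hat{k}$ are mutually orthogonal unit vectors of the coordinate axes. Then

$$\frac{\partial f}{\partial \vec{r}_k} = \frac{\partial f}{\partial x_k} \hat{i} + \frac{\partial f}{\partial y_k} \hat{j} + \frac{\partial f}{\partial z_k} \hat{k} \quad (k = 1, 2, \dots, N)$$ (7)

or

$$\frac{\partial f}{\partial \vec{r}_k} = \operatorname{grad}_k f \quad (k = 1, 2, \dots, N).$$ (8)

But the differential constraint (5) is not equivalent to the finite constraint (3). It is equivalent to the finite constraint

$$f(t, \vec{r}_k) = c,$$ (9)

where c is an arbitrary constant. For this reason, the differential constraint (5) is called integrable.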
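A one-particle worked example may help fix the idea. Assume, purely for illustration, a particle constrained to a sphere of radius R centred at the origin (R is a symbol introduced here, not taken from the text). Then equations (3), (5) and (9) take the following form:

```latex
f(t, \vec{r}) = x^2 + y^2 + z^2 - R^2 = 0
\;\;\Longrightarrow\;\;
\frac{\partial f}{\partial \vec{r}} \cdot \dot{\vec{r}} + \frac{\partial f}{\partial t}
  = 2\,(x\dot{x} + y\dot{y} + z\dot{z}) = 0 ,
\qquad
\frac{d}{dt}\,(x^2 + y^2 + z^2) = 0
\;\;\Longrightarrow\;\;
x^2 + y^2 + z^2 - R^2 = c .
```

The differentiated constraint only says that the distance from the origin is constant; it does not remember which constant, which is why integrating it back returns (9) with an arbitrary c rather than (3) itself.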

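In rectangular Cartesian coordinates, we write

$$\vec{r}_k = x_k \hat{i} + y_k \hat{j} + z_k \hat{k},$$ (10)

$$\vec{l}_k = A_k \hat{i} + B_k \hat{j} + C_k \hat{k},$$ (11)

and

$$\dot{\vec{r}}_k = \dot{x}_k \hat{i} + \dot{y}_k \hat{j} + \dot{z}_k \hat{k},$$ (12)

where Ak, Bk and Ck (k = 1,…, N) are scalar functions of t, x1, y1, z1,…, xN, yN, zN. Then the above constraint equations (2), (3), (4) and (5) are written, respectively, as:

$$f(t, x_k, y_k, z_k, \dot{x}_k, \dot{y}_k, \dot{z}_k) = 0$$ (13)

$$f(t, x_k, y_k, z_k) = 0$$ (14)

$$\sum_{k=1}^{N} (A_k \dot{x}_k + B_k \dot{y}_k + C_k \dot{z}_k) + D = 0$$ (15)

$$\sum_{k=1}^{N} \left(\frac{\partial f}{\partial x_k} \dot{x}_k + \frac{\partial f}{\partial y_k} \dot{y}_k + \frac{\partial f}{\partial z_k} \dot{z}_k\right) + \frac{\partial f}{\partial t} = 0$$ (16)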

4.3 CLASSIFICATION OF CONSTRAINTS

If t does not enter explicitly into the constraint equation, i.e.,

$$\frac{\partial f}{\partial t} = 0,$$ (17)

then the constraint is termed stationary.

Note (1): If the differential constraint (5) is stationary, then equation (5) is linear and homogeneous in the velocities.

Note (2): By analogy, the differential constraint (4) or (15) is termed stationary if

D = 0 (18)

and the vectors $\vec{l}_k$ in equation (4) and, respectively, the coefficients Ak, Bk, and Ck in equation (15) are not explicit functions of time t.


Illustration 1. A particle is constrained to move over a surface. Let the equation of this surface be given in the form

$$f(\vec{r}) = 0$$ (19)

or f(x, y, z) = 0. (20)

This is a finite stationary constraint.

If the surface is moving or undergoing deformation, then the time t enters into the equation of the surface explicitly and its equation is of the form

$$f(t, \vec{r}) = 0$$ (21)

or f(t, x, y, z) = 0. (22)

In this case, the constraint is finite but non-stationary.

System of particles

Definition 1. A system of particles is called holonomic if the particles of the system are not subjected to nonintegrable differential constraints. Thus, any free system of particles is holonomic, and so is any constrained system whose constraints are finite, or differential but integrable. All constraints in a holonomic system may be written in closed form.

Definition 2. A system of particles is called nonholonomic if it is subjected to nonintegrable differential constraints.

Nonintegrable differential constraints are themselves frequently called nonholonomic. Sometimes, integrable differential constraints are termed semiholonomic. A system is called scleronomic if only stationary constraints are imposed on it. Otherwise, it is called rheonomic.

Illustration 2: Two particles are connected by a rod of constant length l. Then the constraint equation is of the form

$$(\vec{r}_1 - \vec{r}_2)^2 - l^2 = 0$$ (23)

or

$$(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 - l^2 = 0.$$ (24)

Here $\vec{r}_1$ and $\vec{r}_2$ are the position vectors of the end points of the given rod. This is a holonomic scleronomic system.

Note: A rigid body may be regarded as a system of particles at fixed distances from one another, that is, subjected to constraints of type (23). Thus a free rigid body is a special case of a constrained holonomic scleronomic system of particles.

Illustration 3. Two particles are connected by a rod of variable length l = f(t). The constraint equation for this is

$$(\vec{r}_1 - \vec{r}_2)^2 - f^2(t) = 0,$$ (25)

or

$$(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 - f^2(t) = 0.$$ (26)

This system is a holonomic rheonomic system.

Illustration 4. Two particles in a plane are connected by a rod of constant length l and are constrained to move in such a manner that the velocity of the middle of the rod is in the direction of the rod. The constraint equations for this are

z1 = 0, z2 = 0, (27)

$$(x_1 - x_2)^2 + (y_1 - y_2)^2 - l^2 = 0,$$ (28)

$$\frac{\dot{x}_1 + \dot{x}_2}{x_1 - x_2} = \frac{\dot{y}_1 + \dot{y}_2}{y_1 - y_2},$$ (29)

since the velocity of the centre of the rod is $\left(\frac{\dot{x}_1 + \dot{x}_2}{2}, \frac{\dot{y}_1 + \dot{y}_2}{2}\right)$ and the direction ratios of the rod are $\langle x_1 - x_2,\ y_1 - y_2 \rangle$. This system is a nonholonomic system because equation (29) defines a nonintegrable differential constraint.

Unilateral Constraints: The constraints discussed so far are called bilateral constraints. Constraints of the form

$$f(t, \vec{r}_k, \dot{\vec{r}}_k) \ge 0$$ (30)

are called unilateral constraints. If the equality sign holds in condition (30), the constraint is said to be taut.

Illustration 5. Consider two particles connected by a thread of length l. Then the constraint is expressed by the inequality

$$l^2 - (\vec{r}_1 - \vec{r}_2)^2 \ge 0,$$ (31)

and is a unilateral constraint.


Remark. The motion of a system of particles on which a unilateral constraint is imposed may be divided into portions so that in certain portions the constraint is taut and the motion occurs as if the constraint were bilateral, and in other portions the constraint is not taut and the motion occurs as if there were no such constraint.

In other words, in certain portions a unilateral constraint is either replaced by a bilateral constraint or is eliminated altogether.

So we shall henceforth consider only bilateral constraints.

4.4. POSSIBLE AND VIRTUAL DISPLACEMENT

On a material system, let us impose the following d finite constraints:

$$f_1(t, \vec{r}_1, \vec{r}_2, \dots, \vec{r}_N) = 0, \quad f_2(t, \vec{r}_1, \vec{r}_2, \dots, \vec{r}_N) = 0, \quad \dots, \quad f_d(t, \vec{r}_1, \vec{r}_2, \dots, \vec{r}_N) = 0$$ (1)

or

$$f_\alpha(t, \vec{r}_k) = 0 \quad (\alpha = 1, 2, \dots, d)$$ (2)

and the following g differential constraints:

$$\sum_{k=1}^{N} \vec{l}_{\beta k} \cdot \vec{v}_k + D_\beta = 0 \quad (\beta = 1, 2, \dots, g),$$ (3)

or

$$\begin{aligned} \vec{l}_{11} \cdot \vec{v}_1 + \vec{l}_{12} \cdot \vec{v}_2 + \dots + \vec{l}_{1N} \cdot \vec{v}_N + D_1 &= 0 \\ &\ \,\vdots \\ \vec{l}_{g1} \cdot \vec{v}_1 + \vec{l}_{g2} \cdot \vec{v}_2 + \dots + \vec{l}_{gN} \cdot \vec{v}_N + D_g &= 0. \end{aligned}$$ (4)

We replace the finite constraints (2) by differential constraints by differentiating them. We have

$$\sum_{k=1}^{N} \frac{\partial f_\alpha}{\partial \vec{r}_k} \cdot \vec{v}_k + \frac{\partial f_\alpha}{\partial t} = 0 \quad (\alpha = 1, 2, \dots, d).$$ (5)


Definition (Possible velocities). A system of vectors $\vec{v}_k$ is called a system of possible velocities for a certain instant of time t and for a certain possible (at that instant) position of the system if these vectors $\vec{v}_k$ satisfy the above "d + g" linear equations in (3) and (5).

Thus, possible velocities are velocities permitted by the constraints.

For every possible position of the system at time t there exists an infinity of systems of possible velocities. One of these systems of velocities is realized in the actual motion of the system at time t.

Definition (Possible displacements). Consider the system of infinitely small displacements

$$d\vec{r}_1 = \vec{v}_1\, dt, \quad d\vec{r}_2 = \vec{v}_2\, dt, \quad \dots, \quad d\vec{r}_N = \vec{v}_N\, dt$$ (6)

or

$$d\vec{r}_k = \vec{v}_k\, dt \quad (k = 1, 2, \dots, N),$$ (7)

where $\vec{v}_k$ (k = 1, 2,…, N) are the possible velocities. The infinitesimal displacements $d\vec{r}_k$ are called possible infinitesimal displacements or simply possible displacements.

Remark: Multiplying equations (3) and (5) termwise by dt, and then using equation (7), we find

$$\sum_{k=1}^{N} \vec{l}_{\beta k} \cdot d\vec{r}_k + D_\beta\, dt = 0,$$ (8)

for β = 1, 2,…, g, and

$$\sum_{k=1}^{N} \frac{\partial f_\alpha}{\partial \vec{r}_k} \cdot d\vec{r}_k + \frac{\partial f_\alpha}{\partial t}\, dt = 0,$$ (9)

for α = 1, 2,…, d. Equations (8) and (9) determine the possible displacements.

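Virtual Displacements

Let us take two systems of possible displacements at one and the same instant of time and for one and the same position of the system:

$$d\vec{r}_k = \vec{v}_k\, dt \quad \text{and} \quad d'\vec{r}_k = \vec{v}\,'_k\, dt \quad (k = 1, 2, \dots, N).$$ (10)

Here, both possible displacements, $d\vec{r}_k$ and $d'\vec{r}_k$, satisfy equations (8) and (9). Therefore, for α = 1, 2,…, d,

$$\sum_{k=1}^{N} \frac{\partial f_\alpha}{\partial \vec{r}_k} \cdot d\vec{r}_k + \frac{\partial f_\alpha}{\partial t}\, dt = 0,$$ (11)

$$\sum_{k=1}^{N} \frac{\partial f_\alpha}{\partial \vec{r}_k} \cdot d'\vec{r}_k + \frac{\partial f_\alpha}{\partial t}\, dt = 0,$$ (12)

and, for β = 1, 2,…, g,

$$\sum_{k=1}^{N} \vec{l}_{\beta k} \cdot d\vec{r}_k + D_\beta\, dt = 0,$$ (13)

$$\sum_{k=1}^{N} \vec{l}_{\beta k} \cdot d'\vec{r}_k + D_\beta\, dt = 0.$$ (14)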

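Subtracting equation (11) from (12) and (13) from (14), we have

$$\sum_{k=1}^{N} \frac{\partial f_\alpha}{\partial \vec{r}_k} \cdot (d'\vec{r}_k - d\vec{r}_k) = 0,$$ (15)

for α = 1, 2,…, d, and

$$\sum_{k=1}^{N} \vec{l}_{\beta k} \cdot (d'\vec{r}_k - d\vec{r}_k) = 0,$$ (16)

for β = 1, 2,…, g. We denote

$$\delta\vec{r}_k = d'\vec{r}_k - d\vec{r}_k \quad (k = 1, 2, \dots, N).$$ (17)

Then equations (15) and (16) become

$$\sum_{k=1}^{N} \frac{\partial f_\alpha}{\partial \vec{r}_k} \cdot \delta\vec{r}_k = 0,$$ (18)

for α = 1, 2,…, d, and

$$\sum_{k=1}^{N} \vec{l}_{\beta k} \cdot \delta\vec{r}_k = 0,$$ (19)

for β = 1, 2,…, g. The displacements $\delta\vec{r}_k$ that satisfy the homogeneous relations (18) and (19) are called virtual displacements.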


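[Figure: a point P on a surface S, showing the velocity $\vec{u}$ of the surface, the tangent vector $\vec{v}_1$, the possible velocity $\vec{v}$ and the possible displacement $d\vec{r} = \vec{v}\, dt$.]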

Any system of vectors $\delta\vec{r}_k$ satisfying equations (18) and (19) is a system of virtual displacements.

Remark 1: We can say that virtual displacements are displacements of points of a system from one possible position of the system at time t to another infinitely close possible position of the system at the same time t.

Remark 2: In the case of stationary constraints, virtual displacements coincide with possible displacements.

Illustration 1: Consider a particle in motion on a fixed surface. In this case, any vector $\vec{v}$ constructed from the point P and tangent to the surface at P will constitute a possible velocity. The corresponding possible displacement $d\vec{r} = \vec{v}\, dt$ lies in the plane tangent to the given fixed surface. The difference

$$\delta\vec{r} = d'\vec{r} - d\vec{r}$$

of two such tangent vectors is also a vector tangent to the surface at the same point P.

Thus, any vector constructed from P and lying in the tangent plane may be regarded as a certain $d\vec{r}$ and as a certain $\delta\vec{r}$.

Here, the constraint is stationary and the virtual displacements coincide with the possible displacements.

Illustration 2: The constraint is a surface S which is itself in motion (as a rigid body) with a certain velocity $\vec{u}$ relative to the original system of coordinates.

In this case, the possible velocity $\vec{v}$ is obtained from an arbitrary vector $\vec{v}_1$ that is tangent to the surface by adding to it the velocity $\vec{u}$. That is,

$$\vec{v} = \vec{v}_1 + \vec{u}.$$

Therefore,

$$d\vec{r} = \vec{v}\, dt = \vec{v}_1\, dt + \vec{u}\, dt.$$

Similarly, for another possible displacement,

$$d'\vec{r} = \vec{v}\,'_1\, dt + \vec{u}\, dt.$$

So, the virtual displacement is

$$\delta\vec{r} = d'\vec{r} - d\vec{r} = (\vec{v}\,'_1 - \vec{v}_1)\, dt.$$

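Degree of Freedom

The vector $\delta\vec{r}_k$, in Cartesian coordinates, is characterized by its three projections δxk, δyk, δzk (k = 1, 2,…, N) on the axes, and equations (18) and (19), which define the virtual displacements, may be written in the following form:

$$\sum_{k=1}^{N} \left(\frac{\partial f_\alpha}{\partial x_k}\, \delta x_k + \frac{\partial f_\alpha}{\partial y_k}\, \delta y_k + \frac{\partial f_\alpha}{\partial z_k}\, \delta z_k\right) = 0,$$ (20)

for α = 1, 2,…, d, and

$$\sum_{k=1}^{N} (A_{\beta k}\, \delta x_k + B_{\beta k}\, \delta y_k + C_{\beta k}\, \delta z_k) = 0,$$ (21)

for β = 1, 2,…, g.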

If the above d + g equations in (20) and (21) are independent, then out of the 3N virtual increments δxk, δyk, δzk, there will be (3N − d − g) independent virtual increments. Let

n = 3N − d − g; (22)

then n is called the number of degrees of freedom of the given system of particles.

4.5. POSSIBLE ACCELERATION

Let forces $\vec{F}_k$ (k = 1, 2,…, N) be impressed at the corresponding points Pk of the system. Here $\vec{F}_k$ is the resultant of all forces applied directly to the particle Pk (k = 1, 2,…, N). If the constraints were absent, then by Newton's second law we would have the relations

$$m_k \vec{w}_k = \vec{F}_k,$$ (23)

for k = 1, 2,…, N, between the masses mk, the accelerations $\vec{w}_k$ and the forces $\vec{F}_k$.

Given constraints, the accelerations

$$\vec{w}_k = \frac{1}{m_k} \vec{F}_k$$ (24)

(at a given instant of time t, in a given position $\vec{r}_k$ of the particles of the system, and for given velocities $\vec{v}_k$) may prove incompatible with the constraints.

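Differentiating equations (5) and (3) termwise with respect to time, we get

$$\sum_{k=1}^{N} \frac{\partial f_\alpha}{\partial \vec{r}_k} \cdot \vec{w}_k + \sum_{k=1}^{N} \frac{d}{dt}\left(\frac{\partial f_\alpha}{\partial \vec{r}_k}\right) \cdot \vec{v}_k + \frac{d}{dt}\left(\frac{\partial f_\alpha}{\partial t}\right) = 0,$$ (25)

for α = 1, 2,…, d, and

$$\sum_{k=1}^{N} \vec{l}_{\beta k} \cdot \vec{w}_k + \sum_{k=1}^{N} \frac{d\vec{l}_{\beta k}}{dt} \cdot \vec{v}_k + \frac{dD_\beta}{dt} = 0,$$ (26)

for β = 1, 2,…, g.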

The left-hand sides of relations (25) and (26) depend linearly on the accelerations $\vec{w}_k$. They also depend on t, $\vec{r}_k$, $\vec{v}_k$ (k = 1, 2,…, N). Equations (25) and (26) are the analytic expressions of the restrictions imposed by the constraints on the accelerations $\vec{w}_k$ of the particles of the system. The accelerations (24), i.e.,

$$\vec{w}_k = \frac{1}{m_k} \vec{F}_k,$$

may not satisfy relations (25) and (26). Then the materially effected constraints will act on the particles Pk of the system with certain supplementary forces $\vec{R}_k$ (k = 1, 2,…, N). These forces are called the reaction forces of the constraints. The reactions that arise are such that the accelerations determined from the equations

$$m_k \vec{w}_k = \vec{F}_k + \vec{R}_k \quad (k = 1, 2, \dots, N)$$ (27)

are permitted by the constraints. Unlike the reactions $\vec{R}_k$ (k = 1, 2,…, N), the preassigned forces $\vec{F}_k$ (k = 1, 2,…, N) are called effective forces. Effective forces are ordinarily specified as known functions of the time, positions and velocities of the particles of the system, i.e.,

$$\vec{F}_k = \vec{F}_k(t, \vec{r}_k, \vec{v}_k) \quad (k = 1, 2, \dots, N).$$ (28)

Basic Problem: The basic problem of the dynamics of a constrained system consists in the following: given the effective forces $\vec{F}_k = \vec{F}_k(t, \vec{r}_k, \vec{v}_k)$, the initial positions $\vec{r}_k^{\,0}$ and the initial velocities $\vec{v}_k^{\,0}$ of the particles of the system, all compatible with the constraints, it is required to determine the motion of the system and the reactions of the constraints $\vec{R}_k$ (k = 1, 2,…, N).

If nothing is known about the nature of the constraints except the defining equations (2) and (3) and, consequently, nothing is known about the reactions $\vec{R}_k$ produced by these constraints, then the above problem is indeterminate, since the number of scalar quantities (xk, yk, zk, Rkx, Rky, Rkz) that have to be determined is greater than the number of available scalar equations (6N > 3N + d + g).

For the basic problem of dynamics to become determinate, it is necessary to have an additional

6N − (3N + d + g) = 3N − d − g = n (29)

independent relations between the sought-for quantities. These relations can be obtained if we confine ourselves to the following important class of ideal constraints.

Ideal Constraints: Constraints are termed ideal if the sum of the works of the reactions of the constraints on any virtual displacements is equal to zero. That is, for ideal constraints,

$$\vec{R}_1 \cdot \delta\vec{r}_1 + \vec{R}_2 \cdot \delta\vec{r}_2 + \dots + \vec{R}_N \cdot \delta\vec{r}_N = 0$$

or

$$\sum_{k=1}^{N} \vec{R}_k \cdot \delta\vec{r}_k = 0.$$ (30)

In Cartesian coordinates, equation (30) may be rewritten as

(R1x δx1 + R1y δy1 + R1z δz1) + (R2x δx2 + R2y δy2 + R2z δz2) + … + (RNx δxN + RNy δyN + RNz δzN) = 0,

or

$$\sum_{k=1}^{N} (R_{kx}\, \delta x_k + R_{ky}\, \delta y_k + R_{kz}\, \delta z_k) = 0.$$ (31)

Among the 3N quantities δxk, δyk, δzk, there are n independent ones (n = 3N − d − g is the number of degrees of freedom of the system). It is therefore possible in (31) to express the 3N − n dependent increments δxk, δyk, δzk in terms of the n independent increments and to equate to zero the coefficients of these independent increments. We then obtain the n relations still lacking and needed to make determinate the basic problem of dynamics mentioned above.

4.6 LAGRANGE'S EQUATIONS OF THE FIRST KIND

We assume that all constraints imposed on a system of particles are ideal. If mk is the mass of the kth particle, $\vec{w}_k$ is its acceleration, and $\vec{F}_k$ and $\vec{R}_k$ are, respectively, the resultant of the effective forces and the resultant of the forces of reaction acting on this particle (k = 1, 2,…, N), then for the particles of a constrained system we have

$$m_k \vec{w}_k = \vec{F}_k + \vec{R}_k \quad (k = 1, 2, \dots, N).$$ (1)

Since the constraints are ideal, for any position of the system and for any virtual displacements we have

$$\sum_{k=1}^{N} \vec{R}_k \cdot \delta\vec{r}_k = 0.$$ (2)

Eliminating $\vec{R}_k$ from equations (1) and (2), we obtain

$$\sum_{k=1}^{N} (\vec{F}_k - m_k \vec{w}_k) \cdot \delta\vec{r}_k = 0.$$ (3)

This is known as the general equation of dynamics.

It states that, given a system in motion, at any instant of time the sum of the works of the effective forces and the forces of inertia on any virtual displacements is zero.

Thus, the general equation of dynamics always holds for any motion that is compatible with the constraints and that corresponds to the specified effective forces $\vec{F}_k$ (k = 1, 2,…, N).

Derivation of Lagrange's Equations of the First Kind

Let us find the expressions for the reaction forces $\vec{R}_k$ by means of the undetermined multipliers of Lagrange. The relations defining the virtual displacements of the particles of a system are

$$\sum_{k=1}^{N} \frac{\partial f_\alpha}{\partial \vec{r}_k} \cdot \delta\vec{r}_k = 0,$$ (4)

for α = 1, 2,…, d, and

$$\sum_{k=1}^{N} \vec{l}_{\beta k} \cdot \delta\vec{r}_k = 0,$$ (5)

for β = 1, 2,…, g.

Multiplying equations (4) and (5) termwise by arbitrary scalar multipliers (−λα) and (−µβ) and adding the resulting equations termwise to equation (2), we get

$$\sum_{k=1}^{N} \left(\vec{R}_k - \sum_{\alpha=1}^{d} \lambda_\alpha \frac{\partial f_\alpha}{\partial \vec{r}_k} - \sum_{\beta=1}^{g} \mu_\beta \vec{l}_{\beta k}\right) \cdot \delta\vec{r}_k = 0.$$ (6)
In expanded form in Cartesian co-ordinates, we have

kk��

g

1�

d

1� k

��kx

N

1kx�A�

xf

�R ��

��

��−�

∂∂

−�===

+ kk��

g

1�

d

1� k

��ky

N

1ky�A�

yf

�R ��

��

��−�

∂∂

−�===

+ kk��

g

1�

d

1� k

��kz

N

1kz�A�

zf

�R ��

��

��−�

∂∂

−�===

= 0. (7)

The undetermined multipliers λα and µβ may be chosen so that all the scalar coefficients in (7) and, hence, all the vector coefficients in (6) vanish. This gives

$$\vec{R}_k = \sum_{\alpha=1}^{d} \lambda_\alpha \frac{\partial f_\alpha}{\partial \vec{r}_k} + \sum_{\beta=1}^{g} \mu_\beta \vec{l}_{\beta k},$$ (8)

for k = 1, 2,…, N. Expression (8) is a general expression for the reaction forces of ideal constraints in terms of the undetermined multipliers of Lagrange λα, µβ (α = 1, 2,…, d; β = 1, 2,…, g). Putting the expressions for $\vec{R}_k$ into equation (1), we get

$$m_k \vec{w}_k = \vec{F}_k + \sum_{\alpha=1}^{d} \lambda_\alpha \frac{\partial f_\alpha}{\partial \vec{r}_k} + \sum_{\beta=1}^{g} \mu_\beta \vec{l}_{\beta k},$$ (9)

for k = 1, 2,…N. The constraint equations for above equations are

\[ f_\alpha(\vec{r}_k) = 0, \tag{10} \]

for α = 1, 2,…, d, and

\[ \sum_{k=1}^{N} \vec{l}_{\beta k}\cdot\dot{\vec{r}}_k + D_\beta = 0, \tag{11} \]

for β = 1, 2,…, g.

Equations (9) are called Lagrange equations of the first kind.

Remark : By replacing each vector equation in (9) by three scalar equations, equations (9) to (11) constitute a set of (3N + d + g) scalar equations in the (3N + d + g) unknown scalar quantities xk, yk, zk, λα, µβ.

Integrating this set of equations, we get the final equations of motion and, at the same time, from equation (8) we get the magnitude of the reaction forces of constraints. However, integration of such a set of equations is very cumbersome due to the large number of equations. That is why the Lagrange equations of the first kind find little use in actual practice.

Example. Two ponderable particles M1 and M2 of identical mass m = 1 are joined by a rod of invariable length l and negligibly small mass. The system is constrained to move in the vertical plane and only in such a manner that the velocity of the midpoint of the rod is directed along it. Determine the motion of the particles M1 and M2.

Solution. Let (x1, y1) and (x2, y2) be the co-ordinates of the particles M1 and M2. Then, the constraint equations are

\[ \tfrac{1}{2}\big[(x_2 - x_1)^2 + (y_2 - y_1)^2 - l^2\big] = 0, \tag{1} \]

and

\[ (x_2 - x_1)(\dot{y}_2 + \dot{y}_1) - (\dot{x}_2 + \dot{x}_1)(y_2 - y_1) = 0. \tag{2} \]

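The Lagrange equations with undetermined multipliers λ and µ are

\[ m_k \vec{w}_k = \vec{F}_k + \sum_{\alpha=1}^{d}\lambda_\alpha \frac{\partial f_\alpha}{\partial \vec{r}_k} + \sum_{\beta=1}^{g}\mu_\beta\,\vec{l}_{\beta k} \tag{3} \]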

for k = 1, 2,…, N. These equations give

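\[ \ddot{x}_1 = -\lambda(x_2 - x_1) - \mu(y_2 - y_1), \qquad \ddot{y}_1 = -g - \lambda(y_2 - y_1) + \mu(x_2 - x_1), \tag{4} \]

and

\[ \ddot{x}_2 = \lambda(x_2 - x_1) - \mu(y_2 - y_1), \qquad \ddot{y}_2 = -g + \lambda(y_2 - y_1) + \mu(x_2 - x_1). \tag{5} \]

Equations (4) are rewritten as

\[ \lambda(x_2 - x_1) + \mu(y_2 - y_1) + \ddot{x}_1 = 0, \tag{6} \]

\[ \lambda(y_2 - y_1) - \mu(x_2 - x_1) + \ddot{y}_1 + g = 0. \tag{7} \]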

Solving equations (6) and (7) for λ and µ, we get

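\[ \lambda = \frac{-(y_2 - y_1)(\ddot{y}_1 + g) - (x_2 - x_1)\ddot{x}_1}{(x_2 - x_1)^2 + (y_2 - y_1)^2}. \tag{8} \]

Using equation (1), we obtain

\[ \lambda = \frac{1}{l^2}\Big(-g(y_2 - y_1) - \big[(x_2 - x_1)\ddot{x}_1 + (y_2 - y_1)\ddot{y}_1\big]\Big). \tag{9} \]

Similarly, we shall find (left as an exercise)

\[ \mu = \frac{1}{l^2}\Big(g(x_2 - x_1) - \big[(y_2 - y_1)\ddot{x}_1 - (x_2 - x_1)\ddot{y}_1\big]\Big). \tag{10} \]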

It is clear that equations (5) can be obtained from equations (4) if we replace λ by −λ and $\ddot{x}_1, \ddot{y}_1$ by $\ddot{x}_2, \ddot{y}_2$. The values of λ and µ determined from equations (5) are (exercise)

\[ \lambda = \frac{1}{l^2}\Big[g(y_2 - y_1) + (x_2 - x_1)\ddot{x}_2 + (y_2 - y_1)\ddot{y}_2\Big], \tag{11} \]

\[ \mu = \frac{1}{l^2}\Big[g(x_2 - x_1) - (y_2 - y_1)\ddot{x}_2 + (x_2 - x_1)\ddot{y}_2\Big]. \tag{12} \]

Equating the appropriate expressions for µ and λ in the above formulae, we find

\[ (\ddot{x}_2 - \ddot{x}_1)(y_2 - y_1) - (\ddot{y}_2 - \ddot{y}_1)(x_2 - x_1) = 0, \tag{13} \]

\[ (\ddot{x}_2 + \ddot{x}_1)(x_2 - x_1) + (\ddot{y}_2 + \ddot{y}_1)(y_2 - y_1) + 2g(y_2 - y_1) = 0. \tag{14} \]

Next, we introduce the following abbreviated notation :

u = x2−x1,

v = y2 − y1,

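\[ P = \dot{x}_1 + \dot{x}_2, \qquad Q = \dot{y}_1 + \dot{y}_2. \tag{15} \]

Then equations (1), (13), (2) and (14) take the form

\[ u^2 + v^2 = l^2, \tag{16} \]

\[ \ddot{u}\,v - u\,\ddot{v} = 0, \tag{17} \]

\[ Pv - Qu = 0, \tag{18} \]

\[ \dot{P}u + \dot{Q}v + 2gv = 0. \tag{19} \]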

Equations (16) and (17) show that, in the u, v-plane, a particle with coordinates (u, v) moves in a circle of radius l with centre at the origin, and that its acceleration is at all times directed towards the centre. The motion of this particle is therefore uniform. For this reason, we write

\[ u = l\cos\varphi, \qquad v = l\sin\varphi. \tag{20} \]

Since the motion is uniform, φ changes at a constant rate. Let

\[ \dot{\varphi} = \alpha \quad (\text{constant}). \tag{21} \]

Then


φ = αt + β (22)

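According to (18), we may put

\[ P = \frac{u}{l}\,f, \qquad Q = \frac{v}{l}\,f, \tag{23} \]

where f is a function of time still to be determined. Substituting these values in (19), we write

\[ u\dot{u}f + u^2\dot{f} + v\dot{v}f + v^2\dot{f} + 2glv = 0. \tag{24} \]

Using (20) and (21), so that $u\dot{u} + v\dot{v} = 0$ and $u^2 + v^2 = l^2$, we obtain

\[ \dot{f} = -2g\sin\varphi. \tag{25} \]

Then

\[ \frac{df}{d\varphi} = \frac{df/dt}{d\varphi/dt} = \frac{\dot{f}}{\alpha} = -\frac{2g}{\alpha}\sin\varphi. \]

This implies

\[ f = \frac{2g}{\alpha}\cos\varphi + 2\gamma. \tag{26} \]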

Consequently, we get

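\[ P = 2\Big(\gamma + \frac{g}{\alpha}\cos\varphi\Big)\cos\varphi = \dot{x}_1 + \dot{x}_2, \tag{27} \]

and

\[ Q = 2\Big(\gamma + \frac{g}{\alpha}\cos\varphi\Big)\sin\varphi = \dot{y}_1 + \dot{y}_2. \tag{28} \]

Integrating, we get

\[ x_1 + x_2 = \int P\,dt = \frac{1}{\alpha}\int P\,d\varphi = \frac{2\gamma}{\alpha}\sin\varphi + \frac{g}{\alpha^2}\sin\varphi\cos\varphi + \frac{g}{\alpha^2}\varphi + 2\delta, \tag{29} \]

and

\[ y_1 + y_2 = -\frac{2\gamma}{\alpha}\cos\varphi - \frac{g}{\alpha^2}\cos^2\varphi + 2\varepsilon. \tag{30} \]

Finally, using $x_2 - x_1 = l\cos\varphi$ and $y_2 - y_1 = l\sin\varphi$, we obtain

\[ x_1 = \frac{\gamma}{\alpha}\sin\varphi + \frac{g}{2\alpha^2}\sin\varphi\cos\varphi + \frac{g}{2\alpha^2}\varphi - \frac{l}{2}\cos\varphi + \delta, \tag{31} \]

\[ y_1 = -\frac{\gamma}{\alpha}\cos\varphi - \frac{g}{2\alpha^2}\cos^2\varphi - \frac{l}{2}\sin\varphi + \varepsilon, \tag{32} \]

\[ x_2 = \frac{\gamma}{\alpha}\sin\varphi + \frac{g}{2\alpha^2}\sin\varphi\cos\varphi + \frac{g}{2\alpha^2}\varphi + \frac{l}{2}\cos\varphi + \delta, \tag{33} \]

\[ y_2 = -\frac{\gamma}{\alpha}\cos\varphi - \frac{g}{2\alpha^2}\cos^2\varphi + \frac{l}{2}\sin\varphi + \varepsilon, \tag{34} \]

\[ \varphi = \alpha t + \beta, \tag{35} \]

where α, β, γ, δ and ε are arbitrary constants.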
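The closed-form solution (31)-(35) can be checked numerically. The sketch below is only an illustration and not part of the original solution: the function names, the value of g and the sample constants are assumptions. It evaluates the positions, approximates velocities and accelerations by central differences, and returns the residuals of the constraints (1), (2) and of relations (13), (14), all of which should be close to zero.

```python
import math

G = 9.81  # gravitational acceleration (assumed numerical value)

def positions(t, alpha, beta, gamma, delta, eps, l):
    """Closed-form solution (31)-(35); returns (x1, y1, x2, y2)."""
    phi = alpha * t + beta
    s, c = math.sin(phi), math.cos(phi)
    # Coordinates of the midpoint of the rod.
    xm = gamma / alpha * s + G / (2 * alpha**2) * (s * c + phi) + delta
    ym = -gamma / alpha * c - G / (2 * alpha**2) * c * c + eps
    return (xm - 0.5 * l * c, ym - 0.5 * l * s,
            xm + 0.5 * l * c, ym + 0.5 * l * s)

def residuals(t, params, h=1e-3):
    """Residuals of constraints (1), (2) and of relations (13), (14)."""
    p0 = positions(t, *params)
    pp = positions(t + h, *params)
    pm = positions(t - h, *params)
    # Central differences for first and second time derivatives.
    vel = [(a - b) / (2 * h) for a, b in zip(pp, pm)]
    acc = [(a - 2 * b + d) / h**2 for a, b, d in zip(pp, p0, pm)]
    x1, y1, x2, y2 = p0
    u, v = x2 - x1, y2 - y1
    l = params[-1]
    length = u * u + v * v - l * l
    nonholonomic = u * (vel[1] + vel[3]) - v * (vel[0] + vel[2])
    eq13 = (acc[2] - acc[0]) * v - (acc[3] - acc[1]) * u
    eq14 = (acc[0] + acc[2]) * u + (acc[1] + acc[3]) * v + 2 * G * v
    return length, nonholonomic, eq13, eq14

# Sample constants (alpha, beta, gamma, delta, eps, l), chosen arbitrarily.
sample = (1.3, 0.2, 0.5, 0.1, -0.4, 2.0)
print(residuals(0.7, sample))
```

The residuals are not exactly zero because of the finite-difference approximation; they shrink as the step h is reduced (until rounding error dominates).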

4.7 INDEPENDENT COORDINATES AND GENERALIZED FORCES

Let us consider a holonomic system of N particles Pk with radius vectors

\[ \vec{r}_k = x_k\,\vec{i} + y_k\,\vec{j} + z_k\,\vec{k} \tag{1} \]

for k = 1, 2,…, N, and with finite constraints

\[ f_\alpha(t, \vec{r}_k) = 0, \tag{2} \]

with α = 1, 2,…, d. In Cartesian form, equivalently, we write

fα(t, xk, yk, zk) = 0 . (3)


We shall assume that the d functions fα of the 3N arguments xk, yk, zk are independent; here t is regarded as a parameter. We can therefore use equations (3) to express d of the coordinates as functions of the remaining 3N − d coordinates. The time t and these 3N − d coordinates are regarded as independent quantities that define the position of the system at time t. All the 3N Cartesian coordinates may be expressed as functions of

n = 3N − d (4)

independent parameters

q1, q2 …qn (5)

and time t. That is,

xk = φk(t, q1, q2…qn),

yk = ψk(t, q1, q2,…, qn),

zk = χk(t, q1, q2,…, qn), (6)

for k = 1, 2,…, N. When these functions are put in the constraint equations (3), the latter become identities. We will assume that any position of the system that is compatible with the constraints at the given instant of time may be obtained from equations (6) for certain values of the quantities q1, q2,…, qn. In vector form, equations (6) can be written as

\[ \vec{r}_k = \vec{r}_k(t, q_1, q_2, \ldots, q_n) \tag{7} \]

for k = 1, 2,…, N. The scalar functions in (6) and vector functions in (7) both are assumed continuous and differentiable. The minimal number of quantities qi with the aid of which formulas (6) can embrace all possible positions of a holonomic system coincides with the number of degrees of freedom of the system n = 3N − d .

The quantities q1, q2,…, qn in formula (6) or (7) (where n is the number of degrees of freedom) are called the independent generalized coordinates of the system.

For each instant of time t, a one-to-one correspondence is established between the possible positions of the system and the points of a certain region in the n-dimensional coordinate space (q1, q2,…, qn). To each position of the system at time t, there corresponds a point in the space (q1, q2,…, qn) that describes this position of the system. The motion of a point in the coordinate space (q1, q2,…, qn) corresponds to the motion of the system.

If all constraints are stationary, then the time t does not appear explicitly in equations (3). It is then always possible to choose coordinates q1, q2,…, qn such that the time t does not enter equations (6) either.

From now on, it is assumed that for a scleronomic system the independent coordinates q1, q2,…, qn are chosen in precisely that way. Then, for a scleronomic system, formulas (6) and (7) take the form

xk = φk(qi),

yk = ψk(qi),

zk = χk(qi) , (8)

or

\[ \vec{r}_k = \vec{r}_k(q_i), \tag{9} \]

for k = 1, 2,…, N.

Generalized Forces :- To every coordinate qi, there corresponds a generalized force Qi, for i = 1, 2,…, n. The generalized forces are determined as follows.

Consider the elementary work of the effective forces on virtual displacements

\[ \delta A = \sum_{k=1}^{N} \vec{F}_k\cdot\delta\vec{r}_k. \tag{1} \]

But the virtual differentials of the functions $\vec{r}_k(t, q_i)$ are the virtual displacements $\delta\vec{r}_k$ :

\[ \delta\vec{r}_k = \sum_{i=1}^{n} \frac{\partial \vec{r}_k}{\partial q_i}\,\delta q_i, \tag{2} \]

for k = 1, 2,…, N.

Substitute the expressions (2) into the right-hand side of formula (1) and express the elementary work of the effective forces on the virtual displacements in terms of arbitrary elementary increments δqi of the independent coordinates qi , (i = 1, 2,…, n) :

\[ \delta A = \sum_{k=1}^{N} \vec{F}_k\cdot\Big(\sum_{i=1}^{n} \frac{\partial \vec{r}_k}{\partial q_i}\,\delta q_i\Big) = \sum_{i=1}^{n}\Big(\sum_{k=1}^{N} \vec{F}_k\cdot\frac{\partial \vec{r}_k}{\partial q_i}\Big)\delta q_i = \sum_{i=1}^{n} Q_i\,\delta q_i, \tag{3} \]

where

\[ Q_i = \sum_{k=1}^{N} \vec{F}_k\cdot\frac{\partial \vec{r}_k}{\partial q_i}, \tag{4} \]

for i = 1, 2,…, n. The quantities Qi, which are the coefficients of δqi in (3), are called the generalized forces.

In practice, formula (4) is often not the most convenient way to find the quantity Qi. Instead, the system is given an elementary virtual displacement such that only the ith coordinate qi receives a certain increment while the remaining independent coordinates do not change. The work δAi of the effective forces is then calculated on just such a specially chosen displacement. Then

\[ \delta A_i = Q_i\,\delta q_i, \tag{5} \]

or

\[ Q_i = \frac{\delta A_i}{\delta q_i}. \tag{6} \]
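Formula (4) can also be evaluated numerically when the functions $\vec{r}_k(q_i)$ are known. The following sketch is an assumed illustration, not taken from the text: the helper name `generalized_forces`, the finite-difference step and the plane-pendulum test case are all hypothetical choices. It approximates the partial derivatives by central differences and forms the sums in (4).

```python
import math

def generalized_forces(r, forces, q, h=1e-6):
    """Approximate Q_i = sum_k F_k . (d r_k / d q_i), formula (4).

    r(q)   -> list of particle positions (tuples) for coordinates q
    forces -> list of force vectors (tuples), one per particle
    q      -> list of generalized coordinates
    """
    Q = []
    for i in range(len(q)):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        r_plus, r_minus = r(q_plus), r(q_minus)
        Qi = 0.0
        for F, a, b in zip(forces, r_plus, r_minus):
            # Dot product of F_k with the central-difference derivative.
            Qi += sum(Fc * (ac - bc) / (2 * h) for Fc, ac, bc in zip(F, a, b))
        Q.append(Qi)
    return Q

# Assumed test case: a plane pendulum of length l and mass m, with the
# angle theta measured from the downward vertical and gravity along -y.
m, l, g, theta = 2.0, 1.5, 9.81, 0.6
bob = lambda q: [(l * math.sin(q[0]), -l * math.cos(q[0]))]
Q = generalized_forces(bob, [(0.0, -m * g)], [theta])
print(Q[0], -m * g * l * math.sin(theta))
```

For this test case the numerical value agrees with the closed form Q = −mgl sin θ, which is what the displacement rule (5), (6) gives directly.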

Theorem : Prove that the position of a holonomic system is an equilibrium position if and only if all the generalized forces in this position are zero.

Proof. Let a certain position of the system be a position of equilibrium. According to the principle of virtual displacements, this is possible if and only if

δA = 0 , (1)

or

\[ \sum_{i=1}^{n} Q_i\,\delta q_i = 0. \tag{2} \]

Here, Qi are the generalized forces. But the increments δqi of the independent (generalized) coordinates qi are arbitrary. Therefore, equation (2) is equivalent to the following system of equations Qi = 0, (3)

for i = 1, 2,…, n.


This proves the theorem.

Illustration 1. Consider a rigid body which is constrained to move

translationally along the x-axis.

The abscissa x of some point of the given rigid body may be taken as the only independent coordinate. Here n = 1 and δA = X δx ,

where X is the sum of the projections, on the x-axis, of all effective forces acting on the body. So

Q = X ,

is the generalized force for the single independent coordinate x.

Illustration 2. Consider a rigid body which is constrained to rotate about a certain fixed axis, say u. In this problem, the angle of rotation, say φ, may be taken as the only independent coordinate. Then δA = Luδφ ,

where Lu is the total moment of all effective forces about the axis of rotation.

We find

Q = Lu ,

as the generalized force.

Illustration 3. Consider a free rigid body.

It has six degrees of freedom. As independent coordinates we take the three coordinates xA, yA, zA of some point A of the body and the three Eulerian angles ψ, θ, φ that define the rotation. Then

δA = Qx δxA + Qy δyA + Qz δzA + Qψ δψ + Qθ δθ + Qφ δφ . (1)

To determine Qx, we impart to the body an elementary displacement along the x-axis. Then

δyA = δzA = 0, δψ = δθ = δφ = 0. (2)

Equations (1) and (2) imply


δA = Qx δxA. (3)

Let X, Y, Z be the projections, on the stationary axes x, y, z, of the principal vector of all effective forces acting on the body. Then Qx = X. (4)

Similarly

Qy = Y,

Qz = Z. (5)

We now impart to the body an elementary displacement such that only the angle ψ changes, while the other coordinates remain unchanged. Then δA = Qψ δψ. (6)

Let Lψ be the total moment of all effective forces about the Az1-axis (parallel to the Oz-axis), about which the rotation through the angle ψ is performed. Then Qψ = Lψ. (7)

In quite analogous fashion,

Qθ = Lθ,

Qφ = Lφ, (8)

where Lθ and Lϕ are the total moments of the effective forces.

4.8. LAGRANGE’S EQUATIONS OF THE SECOND KIND

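We know that the general equation of dynamics is

\[ \sum_{k=1}^{N} (\vec{F}_k - m_k\vec{w}_k)\cdot\delta\vec{r}_k = 0. \tag{1} \]

The expression for the elementary work of the effective forces is

\[ \delta A = \sum_{k=1}^{N} \vec{F}_k\cdot\delta\vec{r}_k = \sum_{i=1}^{n} Q_i\,\delta q_i, \tag{2} \]

where

\[ Q_i = \sum_{k=1}^{N} \vec{F}_k\cdot\frac{\partial \vec{r}_k}{\partial q_i}, \tag{3} \]

for i = 1, 2,…, n.

The elementary work of the inertial forces $-m_k\vec{w}_k$ (k = 1, 2,…, N) is

\[ \delta B = -\sum_{k=1}^{N} m_k\vec{w}_k\cdot\delta\vec{r}_k = -\sum_{i=1}^{n} Z_i\,\delta q_i, \tag{4} \]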

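where, by analogy with expression (3),

\[
\begin{aligned}
Z_i &= \sum_{k=1}^{N} m_k\vec{w}_k\cdot\frac{\partial \vec{r}_k}{\partial q_i}
     = \sum_{k=1}^{N} m_k\frac{d\dot{\vec{r}}_k}{dt}\cdot\frac{\partial \vec{r}_k}{\partial q_i} \\
    &= \frac{d}{dt}\sum_{k=1}^{N} m_k\dot{\vec{r}}_k\cdot\frac{\partial \vec{r}_k}{\partial q_i}
     - \sum_{k=1}^{N} m_k\dot{\vec{r}}_k\cdot\frac{d}{dt}\Big(\frac{\partial \vec{r}_k}{\partial q_i}\Big),
\end{aligned} \tag{5}
\]

for i = 1, 2,…, n.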

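But the velocity $\dot{\vec{r}}_k$ is also given as

\[ \dot{\vec{r}}_k = \frac{d}{dt}\big[\vec{r}_k(q_1, q_2, \ldots, q_n, t)\big] = \sum_{i=1}^{n} \frac{\partial \vec{r}_k}{\partial q_i}\,\dot{q}_i + \frac{\partial \vec{r}_k}{\partial t}. \tag{6} \]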

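This shows that the velocity $\dot{\vec{r}}_k$ depends linearly on $\dot{q}_1, \dot{q}_2, \ldots, \dot{q}_n$. From formula (6), we find that

\[ \frac{\partial \dot{\vec{r}}_k}{\partial \dot{q}_i} = \frac{\partial \vec{r}_k}{\partial q_i}, \tag{7} \]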

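for i = 1, 2,…, n and k = 1, 2,…, N. On the other hand, from the same equation (6), we obtain

\[ \frac{\partial \dot{\vec{r}}_k}{\partial q_i} = \sum_{j=1}^{n} \frac{\partial^2 \vec{r}_k}{\partial q_i\,\partial q_j}\,\dot{q}_j + \frac{\partial^2 \vec{r}_k}{\partial q_i\,\partial t} = \frac{d}{dt}\Big(\frac{\partial \vec{r}_k}{\partial q_i}\Big), \tag{8} \]

for i = 1, 2,…, n and k = 1, 2,…, N.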

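Using equations (7) and (8) in equation (5), we obtain the following expression for Zi :

\[ Z_i = \frac{d}{dt}\sum_{k=1}^{N} m_k\dot{\vec{r}}_k\cdot\frac{\partial \dot{\vec{r}}_k}{\partial \dot{q}_i} - \sum_{k=1}^{N} m_k\dot{\vec{r}}_k\cdot\frac{\partial \dot{\vec{r}}_k}{\partial q_i}, \tag{9} \]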

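for i = 1, 2,…, n. Let T denote the kinetic energy of the system. Then

\[ T = \frac{1}{2}\sum_{k=1}^{N} m_k\,(\dot{\vec{r}}_k\cdot\dot{\vec{r}}_k), \tag{10} \]

\[ \frac{\partial T}{\partial q_i} = \sum_{k=1}^{N} m_k\dot{\vec{r}}_k\cdot\frac{\partial \dot{\vec{r}}_k}{\partial q_i}, \tag{11} \]

and

\[ \frac{\partial T}{\partial \dot{q}_i} = \sum_{k=1}^{N} m_k\dot{\vec{r}}_k\cdot\frac{\partial \dot{\vec{r}}_k}{\partial \dot{q}_i}. \tag{12} \]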

Equations (9), (11) and (12) imply

\[ Z_i = \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{q}_i}\Big) - \frac{\partial T}{\partial q_i}, \tag{13} \]

for i = 1, 2,…n.

The general equation of dynamics (1) gives

δA + δB = 0. (14)

Equations (14), (2) and (4) imply

\[ \sum_{i=1}^{n} (Q_i - Z_i)\,\delta q_i = 0. \tag{15} \]


Since the qi are independent coordinates, the δqi are absolutely arbitrary increments of the coordinates. It follows that equation (15) can hold when and only when all the coefficients of δqi in equation (15) are equal to zero. Therefore, the general equation of dynamics in the form (15) is equivalent to the following set of equations

Zi = Qi , (16)

for i = 1, 2,…, n. Equations (13) and (16) give

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{q}_i}\Big) - \frac{\partial T}{\partial q_i} = Q_i, \tag{17} \]

for i = 1, 2,…, n. Equations (17) are called the Lagrange equations of the second kind, or Lagrange's equations in independent coordinates.

Definition : The quantities $\dot{q}_i$ (i = 1, 2,…, n) are called generalized velocities.

We note that the velocities of the points of the system, $\vec{v}_k = \dot{\vec{r}}_k$, are expressed in terms of the generalized velocities $(\dot{q}_1, \dot{q}_2, \ldots, \dot{q}_n)$, the independent coordinates (q1, q2,…, qn) and the time t by means of formula (6).

Definition : The quantities $\ddot{q}_i$ (i = 1, 2,…, n) are called generalized accelerations.

Remark 1 : After performing the operation d/dt, the left-hand sides of Lagrange's equations (17) contain the time t, the generalized coordinates qi, the generalized velocities $\dot{q}_i$ and the generalized accelerations $\ddot{q}_i$ (i = 1, 2,…, n). The generalized forces Qi (i = 1, 2,…, n) on the right-hand sides of the Lagrange equations (17) are ordinarily specified as functions of t, qk, $\dot{q}_k$ (k = 1, 2,…, n). That is,

\[ Q_i = Q_i(t, q_k, \dot{q}_k) \]

for i = 1, 2,…, n.

Remark 2. The Lagrange equations (17) form a set of n ordinary differential equations, each of the second order, in the n unknown functions qi of the independent variable t. The order of this system of differential equations is 2n. Note that the set of differential equations determining the motion of a holonomic system with n degrees of freedom cannot be of order less than 2n, since, by virtue of the arbitrariness of the initial values of the quantities qi and $\dot{q}_i$ (i = 1, 2,…, n), the solution of the system must contain at least 2n arbitrary constants. Thus the set of Lagrange equations in independent coordinates has the lowest possible order.

Remark 3. In the case of a constrained system, the reaction forces do not enter into Lagrange's equations. This is an essential advantage of Lagrange's equations. After Lagrange's equations have been integrated and the functions qi(t) found, the quantities $\vec{r}_k = \vec{r}_k(t)$, $\vec{v}_k = \dot{\vec{r}}_k$, $\vec{w}_k = \ddot{\vec{r}}_k$ and $\vec{F}_k(t, \vec{r}_k, \dot{\vec{r}}_k)$ are determined in turn. After that the unknown reaction forces are determined from the formulas

\[ \vec{R}_k = m_k\vec{w}_k - \vec{F}_k, \]

for k = 1, 2,…, N.

Remark 4 : In the case of a free system of particles, Lagrange's equations (17) are a compact notation for the equations of motion in an arbitrary system of coordinates.

Illustration 1 : Consider a rigid body in rotation about a stationary axis u. In this situation, there is one independent coordinate, the angle of rotation φ, to represent the motion. So, we take

q = φ . (1)

The appropriate generalized force Q for the present motion is equal to the moment of rotation Lu , i.e.,

Q = Lu. (2)

The total kinetic energy is given by

\[ T = \frac{1}{2} I_u \dot{\varphi}^2, \tag{3} \]

where Iu is the moment of inertia of the body about the axis of rotation.

The Lagrange equation for the present problem becomes

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{\varphi}}\Big) - \frac{\partial T}{\partial \varphi} = Q. \tag{4} \]

We find

\[ \frac{\partial T}{\partial \dot{\varphi}} = I_u\dot{\varphi} \tag{5} \]

and

\[ \frac{\partial T}{\partial \varphi} = 0. \tag{6} \]

Lagrange's equation (4) now takes the form

\[ I_u\ddot{\varphi} = L_u. \tag{7} \]

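This is the differential equation of the rotation of a rigid body about a stationary axis.

Illustration 2 (Circular Motion). Suppose that a moving particle P describes a circle of constant radius r about the centre O with angular velocity ω. Then

\[ \vec{r} = r(\cos\theta\,\hat{i} + \sin\theta\,\hat{j}) \tag{1} \]

is the position vector of the particle. Let $\hat{i}_r$ and $\hat{i}_\theta$ be the unit vectors in the radial (r increasing) and transversal (θ increasing) directions. Then

\[ \hat{i}_r = \cos\theta\,\hat{i} + \sin\theta\,\hat{j}, \tag{2} \]

\[ \hat{i}_\theta = (-\sin\theta)\,\hat{i} + (\cos\theta)\,\hat{j}. \tag{3} \]

[Figure: a particle P(t) moving with angular velocity ω on a circle of radius r about O, with axes x, y, unit vectors i, j and polar angle θ.]

The velocity and acceleration are

\[ \vec{v} = \frac{d\vec{r}}{dt} = r\dot{\theta}\,\big[(-\sin\theta)\,\hat{i} + \cos\theta\,\hat{j}\big] = r\dot{\theta}\,\hat{i}_\theta \tag{4} \]

and

\[ \vec{f} = \frac{d\vec{v}}{dt} = (-r\dot{\theta}^2)\,\hat{i}_r + (r\ddot{\theta})\,\hat{i}_\theta. \tag{5} \]

If $v_r$, $v_\theta$ and $f_r$, $f_\theta$ are the components in the radial and transversal directions, then

\[ v_r = 0, \qquad v_\theta = r\dot{\theta}, \tag{6} \]

\[ f_r = -r\dot{\theta}^2, \qquad f_\theta = r\ddot{\theta}. \tag{7} \]

The quantity $\omega = \dot{\theta}$ is the instantaneous rate of change of θ and is the angular velocity of the moving point P.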

Illustration 3 : Let us consider a simple pendulum. Assume that a particle of mass m is attached to a massless rod that is free to rotate in a vertical plane about a frictionless pin.

The motion of this single-degree-of-freedom system may be described by the generalized coordinate θ (shown in the figure given below).

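[Figure: simple pendulum, a particle of mass m on a rod of length l making the angle θ with the vertical.]

The kinetic energy of this system is given in terms of the generalized velocity $\dot{\theta}$ as

\[ T = \frac{1}{2}mv^2 = \frac{1}{2}ml^2\dot{\theta}^2. \tag{1} \]

The generalized force associated with the rotational coordinate of the pendulum is

\[ Q_\theta = -mgl\sin\theta. \tag{2} \]

The equation of motion based on the Lagrangian formulation is

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{\theta}}\Big) - \frac{\partial T}{\partial \theta} = Q_\theta. \tag{3} \]

This gives

\[ \ddot{\theta} + \frac{g}{l}\sin\theta = 0. \tag{4} \]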
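Equation (4) has no elementary closed-form solution for large amplitudes, but it is easy to integrate numerically. The sketch below is an assumed illustration (the function names, the classical fourth-order Runge-Kutta scheme and the numerical values are choices made here, not part of the text). As a sanity check it monitors the quantity $\tfrac{1}{2}l^2\dot{\theta}^2 - gl\cos\theta$, which is constant along solutions of (4), as one sees by multiplying (4) by $l^2\dot{\theta}$ and integrating.

```python
import math

def simulate_pendulum(theta0, omega0, g=9.81, l=1.0, dt=1e-3, steps=5000):
    """Integrate theta'' = -(g/l) sin(theta) with the classical RK4 scheme."""
    def deriv(th, om):
        # First-order form: (theta, omega)' = (omega, -(g/l) sin theta).
        return om, -(g / l) * math.sin(th)

    th, om = theta0, omega0
    for _ in range(steps):
        k1 = deriv(th, om)
        k2 = deriv(th + 0.5 * dt * k1[0], om + 0.5 * dt * k1[1])
        k3 = deriv(th + 0.5 * dt * k2[0], om + 0.5 * dt * k2[1])
        k4 = deriv(th + dt * k3[0], om + dt * k3[1])
        th += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        om += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return th, om

def energy_per_mass(th, om, g=9.81, l=1.0):
    """First integral of (4): (1/2) l^2 omega^2 - g l cos(theta)."""
    return 0.5 * l * l * om * om - g * l * math.cos(th)

theta_end, omega_end = simulate_pendulum(1.0, 0.0)
print(energy_per_mass(1.0, 0.0), energy_per_mass(theta_end, omega_end))
```

The two printed values should agree closely; a visible drift would indicate that the time step is too large for the chosen scheme.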

Example : A double simple pendulum is in motion in a vertical plane. Find the Lagrangian equations of motion.

Solution : Let OA and AB be light rods hinged at O and A, making angles φ1 and φ2 with the vertical at any time t. Let the masses carried at A and B be m1 and m2. Let OA = l1 and AB = l2 (see figure below).

[Figure: double pendulum O, A, B with rod lengths l1, l2, angles φ1, φ2 measured from the vertical z-axis, and weights m1g at A and m2g at B.]

Let z1 = l1 cos φ1, (1)

z2 = l1 cos φ1 + l2 cos φ2. (2)

We know that the elementary work is

δA = m1g δz1 + m2g δz2. (3)

From equations (1) and (2), we find

δz1 = −l1 sinφ1 δφ1, (4)

δz2 = −l1 sinφ1 δφ1 − l2 sinφ2 δφ2. (5)

Equations (3) to (5) yield

δA = [−(m1 + m2) gl1 sinφ1] δφ1 + [−m2gl2 sinφ2] δφ2. (6)

This gives

Qφ1 = −(m1 + m2) gl1 sinφ1, (7)

and

Qφ2 = −m2g l2 sinφ2. (8)

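The K.E. of the first pendulum OA is

\[ T_1 = \frac{1}{2}m_1 l_1^2\dot{\varphi}_1^2, \tag{9} \]

and the K.E. of the second pendulum AB is

\[ T_2 = \frac{1}{2}m_2\big(l_1^2\dot{\varphi}_1^2 + l_2^2\dot{\varphi}_2^2 + 2l_1l_2\cos(\varphi_1 - \varphi_2)\,\dot{\varphi}_1\dot{\varphi}_2\big). \tag{10} \]

The total K.E. of the system is

\[ T = T_1 + T_2 = \frac{1}{2}(m_1 + m_2)\,l_1^2\dot{\varphi}_1^2 + m_2l_1l_2\,\dot{\varphi}_1\dot{\varphi}_2\cos(\varphi_1 - \varphi_2) + \frac{1}{2}m_2l_2^2\dot{\varphi}_2^2. \tag{11} \]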

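Then the first Lagrange's equation of motion, for the generalized coordinate φ1, is

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{\varphi}_1}\Big) - \frac{\partial T}{\partial \varphi_1} = Q_{\varphi_1}. \tag{12} \]

On simplification of (12), one gets

\[ \frac{d}{dt}\big[(m_1 + m_2)\,l_1^2\dot{\varphi}_1 + m_2l_1l_2\,\dot{\varphi}_2\cos(\varphi_1 - \varphi_2)\big] + m_2l_1l_2\,\dot{\varphi}_1\dot{\varphi}_2\sin(\varphi_1 - \varphi_2) = -(m_1 + m_2)\,gl_1\sin\varphi_1. \tag{13} \]

The second Lagrange's equation of motion, for the generalized coordinate φ2, is

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{\varphi}_2}\Big) - \frac{\partial T}{\partial \varphi_2} = Q_{\varphi_2}. \tag{14} \]

This gives

\[ \frac{d}{dt}\big[m_2l_1l_2\,\dot{\varphi}_1\cos(\varphi_1 - \varphi_2) + m_2l_2^2\dot{\varphi}_2\big] - m_2l_1l_2\,\dot{\varphi}_1\dot{\varphi}_2\sin(\varphi_1 - \varphi_2) = -m_2gl_2\sin\varphi_2. \tag{15} \]

Special Case : When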

m1 = m2 = m,

l1 = l2 = l . (16)

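After carrying out the time differentiation and simplifying, equations (13) and (15) reduce to

\[ 2\ddot{\varphi}_1 + \ddot{\varphi}_2\cos(\varphi_1 - \varphi_2) + \dot{\varphi}_2^2\sin(\varphi_1 - \varphi_2) + 2\Big(\frac{g}{l}\Big)\sin\varphi_1 = 0, \tag{17} \]

\[ \ddot{\varphi}_2 + \ddot{\varphi}_1\cos(\varphi_1 - \varphi_2) - \dot{\varphi}_1^2\sin(\varphi_1 - \varphi_2) + \Big(\frac{g}{l}\Big)\sin\varphi_2 = 0. \tag{18} \]

Further, for small oscillations,

\[ \sin\varphi_i \approx \varphi_i, \qquad \cos\varphi_i \approx 1 \qquad (i = 1, 2). \tag{19} \]

Neglecting the second and higher order terms, equations (17) and (18) become

\[ 2\ddot{\varphi}_1 + \ddot{\varphi}_2 + 2\Big(\frac{g}{l}\Big)\varphi_1 = 0, \tag{20} \]

\[ \ddot{\varphi}_1 + \ddot{\varphi}_2 + \Big(\frac{g}{l}\Big)\varphi_2 = 0. \tag{21} \]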
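The linearized equations (20) and (21) describe two coupled oscillators. Looking for solutions proportional to cos(ωt), one is led to the condition det(K − ω²M) = 0, where M collects the coefficients of the accelerations and K those of the angles. The following sketch solves this quadratic in ω² directly; the helper name and the numerical value of g/l are assumptions made for illustration.

```python
import math

def squared_frequencies(M, K):
    """Roots w = omega^2 of det(K - w*M) = 0 for symmetric 2x2 matrices."""
    # det(K - w M) = a w^2 + b w + c
    a = M[0][0] * M[1][1] - M[0][1] ** 2
    b = -(K[0][0] * M[1][1] + K[1][1] * M[0][0] - 2 * K[0][1] * M[0][1])
    c = K[0][0] * K[1][1] - K[0][1] ** 2
    disc = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b - disc) / (2 * a), (-b + disc) / (2 * a)))

g_over_l = 9.81 / 1.0                           # assumed value of g/l
M = [[2.0, 1.0], [1.0, 1.0]]                    # acceleration coefficients in (20), (21)
K = [[2.0 * g_over_l, 0.0], [0.0, g_over_l]]    # angle coefficients in (20), (21)
w_low, w_high = squared_frequencies(M, K)
print(w_low / g_over_l, w_high / g_over_l)
```

For equations (20), (21) the quadratic is ω⁴ − 4(g/l)ω² + 2(g/l)² = 0, so the two squared frequencies are (2 − √2) g/l and (2 + √2) g/l.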

4.9. UNIQUENESS OF SOLUTION

We have seen that in order to form the Lagrange equations of motion for a holonomic system, it is necessary first to find the expression for the kinetic energy T as a function of the time t, the generalized coordinates qi and the generalized velocities $\dot{q}_i$ (i = 1, 2,…, n). Let us do this in the general form :

\[ T = \frac{1}{2}\sum_{k=1}^{N} m_k\dot{\vec{r}}_k^{\,2}. \tag{1} \]

We know that

\[ \dot{\vec{r}}_k = \sum_{i=1}^{n} \frac{\partial \vec{r}_k}{\partial q_i}\,\dot{q}_i + \frac{\partial \vec{r}_k}{\partial t}. \tag{2} \]

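Equations (1) and (2) give

\[
\begin{aligned}
T &= \frac{1}{2}\sum_{k=1}^{N} m_k\Big(\sum_{i=1}^{n} \frac{\partial \vec{r}_k}{\partial q_i}\,\dot{q}_i + \frac{\partial \vec{r}_k}{\partial t}\Big)^2 \\
  &= \frac{1}{2}\sum_{k=1}^{N} m_k\Big(\sum_{i=1}^{n} \frac{\partial \vec{r}_k}{\partial q_i}\,\dot{q}_i\Big)^2
   + \frac{1}{2}\sum_{k=1}^{N} m_k\Big(\frac{\partial \vec{r}_k}{\partial t}\Big)^2
   + \sum_{k=1}^{N}\sum_{i=1}^{n} m_k\Big(\frac{\partial \vec{r}_k}{\partial q_i}\cdot\frac{\partial \vec{r}_k}{\partial t}\Big)\dot{q}_i.
\end{aligned} \tag{3}
\]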

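Let $a_{iK}$, $a_i$ and $a_0$ be functions of t, q1, q2,…, qn defined by the following equations

\[ a_{iK} = \sum_{k=1}^{N} m_k\,\frac{\partial \vec{r}_k}{\partial q_i}\cdot\frac{\partial \vec{r}_k}{\partial q_K}, \tag{4} \]

\[ a_i = \sum_{k=1}^{N} m_k\,\frac{\partial \vec{r}_k}{\partial q_i}\cdot\frac{\partial \vec{r}_k}{\partial t}, \tag{5} \]

\[ a_0 = \frac{1}{2}\sum_{k=1}^{N} m_k\Big(\frac{\partial \vec{r}_k}{\partial t}\Big)^2, \tag{6} \]

where i, K = 1, 2,…, n. From equation (4), we find

\[ a_{iK} = a_{Ki}. \tag{7} \]

Using relations (4)−(6) in equation (3), we find

\[ T = T_2 + T_1 + T_0, \tag{8} \]

where

\[ T_2 = \frac{1}{2}\sum_{i,K=1}^{n} a_{iK}\,\dot{q}_i\dot{q}_K, \qquad T_1 = \sum_{i=1}^{n} a_i\dot{q}_i, \qquad T_0 = a_0. \tag{9} \]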
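As an illustration of the decomposition (8), (9), consider a single particle with one generalized coordinate q. The sketch below computes the coefficients (4)-(6) by central differences; the function name and the test system, a bead at distance q along a rod rotating uniformly in a plane with angular velocity ω, are assumptions chosen for illustration. For this rheonomic system one expects T2 = ½m q̇², T1 = 0 and T0 = ½mω²q².

```python
import math

def kinetic_parts(r, m, t, q, qdot, h=1e-6):
    """Return (T2, T1, T0) for one particle with one coordinate q.

    r(t, q) gives the Cartesian position; derivatives are approximated
    by central differences with step h.
    """
    dr_dq = [(a - b) / (2 * h) for a, b in zip(r(t, q + h), r(t, q - h))]
    dr_dt = [(a - b) / (2 * h) for a, b in zip(r(t + h, q), r(t - h, q))]
    a11 = m * sum(x * x for x in dr_dq)                  # formula (4)
    a1 = m * sum(x * y for x, y in zip(dr_dq, dr_dt))    # formula (5)
    a0 = 0.5 * m * sum(y * y for y in dr_dt)             # formula (6)
    return 0.5 * a11 * qdot**2, a1 * qdot, a0

omega = 2.0  # assumed angular velocity of the rod
bead = lambda t, q: (q * math.cos(omega * t), q * math.sin(omega * t))
T2, T1, T0 = kinetic_parts(bead, m=3.0, t=0.4, q=1.5, qdot=0.7)
print(T2, T1, T0)
```

Here T1 vanishes because the radial direction is perpendicular to the velocity produced by the rotation, while T0 does not vanish since the constraint depends on time.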

We also know that, in the case of a scleronomic system, the time does not enter explicitly into the relation between $\vec{r}_k$ and $q_i$. For this reason

\[ \frac{\partial \vec{r}_k}{\partial t} = 0 \tag{10} \]

for k = 1, 2,…, N. Consequently, we get

\[ a_0 = 0, \qquad a_i = 0 \quad (i = 1, 2, \ldots, n), \tag{11} \]

and

\[ T = T_2 = \frac{1}{2}\sum_{i,K=1}^{n} a_{iK}\,\dot{q}_i\dot{q}_K. \tag{12} \]

Thus, we see that the kinetic energy of a scleronomic system is a homogeneous function of the second degree in the generalized velocities.

Remark 1 : In an arbitrary (scleronomic or rheonomic) holonomic system, the homogeneous quadratic form T2 is always nondegenerate. That is, the determinant made up of its coefficients is different from zero, or

\[ \det(a_{iK}) \neq 0, \tag{12} \]

for i, K = 1 to n.

For this, if possible, assume that

\[ \det(a_{iK}) = 0. \tag{13} \]

Then the system of n homogeneous linear equations

\[ \sum_{K=1}^{n} a_{iK}\lambda_K = 0 \tag{14} \]

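for i = 1, 2,…, n has a real non-zero solution for λK. Multiplying equations (14) termwise by λi, summing with respect to i from 1 to n and using formula (4), we get

\[
\begin{aligned}
0 &= \sum_{i,K=1}^{n} a_{iK}\lambda_i\lambda_K
   = \sum_{i,K=1}^{n}\sum_{k=1}^{N} m_k\Big(\frac{\partial \vec{r}_k}{\partial q_i}\cdot\frac{\partial \vec{r}_k}{\partial q_K}\Big)\lambda_i\lambda_K \\
  &= \sum_{k=1}^{N} m_k\Big(\sum_{i=1}^{n}\lambda_i\frac{\partial \vec{r}_k}{\partial q_i}\Big)\cdot\Big(\sum_{K=1}^{n}\lambda_K\frac{\partial \vec{r}_k}{\partial q_K}\Big)
   = \sum_{k=1}^{N} m_k\Big(\sum_{i=1}^{n}\lambda_i\frac{\partial \vec{r}_k}{\partial q_i}\Big)^2.
\end{aligned} \tag{15}
\]

Since every mass mk is positive, this implies

\[ \sum_{i=1}^{n}\lambda_i\frac{\partial \vec{r}_k}{\partial q_i} = 0, \tag{16} \]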

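for k = 1, 2,…, N. These N vector equations may be replaced by 3N scalar equations

\[ \sum_{i=1}^{n}\lambda_i\frac{\partial x_k}{\partial q_i} = 0, \qquad \sum_{i=1}^{n}\lambda_i\frac{\partial y_k}{\partial q_i} = 0, \qquad \sum_{i=1}^{n}\lambda_i\frac{\partial z_k}{\partial q_i} = 0 \tag{17} \]

for k = 1, 2,…, N. These equations involve the following Jacobian functional matrix:

\[
J = \begin{pmatrix}
\dfrac{\partial x_1}{\partial q_1} & \cdots & \dfrac{\partial x_1}{\partial q_n} \\
\dfrac{\partial y_1}{\partial q_1} & \cdots & \dfrac{\partial y_1}{\partial q_n} \\
\dfrac{\partial z_1}{\partial q_1} & \cdots & \dfrac{\partial z_1}{\partial q_n} \\
\vdots & & \vdots \\
\dfrac{\partial x_N}{\partial q_1} & \cdots & \dfrac{\partial x_N}{\partial q_n} \\
\dfrac{\partial y_N}{\partial q_1} & \cdots & \dfrac{\partial y_N}{\partial q_n} \\
\dfrac{\partial z_N}{\partial q_1} & \cdots & \dfrac{\partial z_N}{\partial q_n}
\end{pmatrix}. \tag{18}
\]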

By equations (17), the columns of J are linearly dependent. So the rank, say ρ, of this functional matrix J is less than n. That is, ρ < n. Then among the 3N functions x1, y1, z1, x2, y2, z2,…, xN, yN, zN of the n arguments q1, q2,…, qn (t is regarded as a parameter) there are ρ independent quantities in terms of which all the remaining Cartesian coordinates of the points of the system may be expressed. This contradicts the fact that the minimal number of independent coordinates of the system is equal to the number of degrees of freedom n. This contradiction establishes the claim in (12).

Remark 2. Since

\[ T_2 \geq 0 \tag{19} \]

always, it follows from inequality (12) that the quadratic form

\[ T_2 = \frac{1}{2}\sum_{i,k=1}^{n} a_{ik}\,\dot{q}_i\dot{q}_k \tag{20} \]

is positive definite, and T2 = 0 only when $\dot{q}_i = 0$ for each i.

Therefore, from the theory of matrices, we have

\[
a_{11} > 0, \qquad
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} > 0, \qquad \ldots, \qquad
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix} > 0. \tag{21}
\]
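The inequalities (21) can be checked on a concrete system. The sketch below uses, as an assumed test case, the double pendulum of Section 4.8, whose kinetic energy (11) gives a11 = (m1 + m2) l1², a12 = a21 = m2 l1 l2 cos(φ1 − φ2) and a22 = m2 l2². The helper names and numerical values are hypothetical choices, and the determinant routine is a plain Laplace expansion suited only to small matrices.

```python
import math

def det(A):
    """Determinant by Laplace expansion along the first row (small n only)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def leading_minors(A):
    """Leading principal minors of A, as they appear in (21)."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def double_pendulum_matrix(m1, m2, l1, l2, phi1, phi2):
    """Coefficients a_ik of T2 read off from the kinetic energy (11)."""
    c = math.cos(phi1 - phi2)
    return [[(m1 + m2) * l1**2, m2 * l1 * l2 * c],
            [m2 * l1 * l2 * c, m2 * l2**2]]

A = double_pendulum_matrix(1.0, 2.0, 0.5, 0.8, 0.3, 1.1)
print(leading_minors(A))
```

For this matrix the second minor equals m2 l1² l2² (m1 + m2 sin²(φ1 − φ2)), which is positive for every configuration, in agreement with (21).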

Remark 3 : Putting the expression (8) for the kinetic energy into the Lagrange equations of motion

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{q}_i}\Big) - \frac{\partial T}{\partial q_i} = Q_i \tag{22} \]

for i = 1, 2,…, n, we get

\[ \sum_{k=1}^{n} a_{ik}\ddot{q}_k + (**) = Q_i(t, q_j, \dot{q}_j), \tag{23} \]

for i = 1, 2,…, n, where (**) denotes the sum of the terms not involving second derivatives of the coordinates with respect to time. The right-hand sides likewise do not contain second derivatives, since, in the general case, they are functions of the quantities t, qj, $\dot{q}_j$ (j = 1, 2,…, n). By virtue of (12), it follows that equations (23) may be solved for the second derivatives and represented in the form

\[ \ddot{q}_i = G_i(t, q_k, \dot{q}_k) \tag{24} \]

for i = 1, 2,…, n. But then, as we know from the theory of differential equations, under certain assumptions on the right-hand sides Gi (the functions Gi, 1 ≤ i ≤ n, have continuous first order partial derivatives) there is one and only one solution of the Lagrange equations for arbitrary pre-assigned initial values of qi, $\dot{q}_i$ at t = t0 (i = 1, 2,…, n). Thus, the motion of a holonomic system is uniquely determined by specifying the initial positions $q_i^0$ and the initial velocities $\dot{q}_i^0$.

4.10. THEOREM ON VARIATION OF TOTAL ENERGY EQUATION FOR CONSERVATIVE FIELDS

If the generalized forces do not depend on the generalized velocities, i.e.,

Qi = Qi(t, q1, q2,…, qn) (1)

for i = 1, 2,…, n, and there exists a function U(t, q1, q2,…, qn) such that

\[ Q_i = -\frac{\partial U}{\partial q_i} \tag{2} \]

for i = 1, 2,…, n, then the forces Qi are called potential forces and the function U is the potential of the forces, or the potential energy. We know that the elementary work of the forces Qi is given by

\[ \delta A = \sum_{i=1}^{n} Q_i\,\delta q_i. \tag{3} \]

From equations (2) and (3), we find

δA = −δU . (4)

Let us now consider the general case when, in addition to the potential forces determined by the potential U, the system is acted upon also by non-potential forces

\[ \tilde{Q}_i = \tilde{Q}_i(t, q_j, \dot{q}_j), \tag{5} \]

for i = 1, 2,…, n. Then the total generalized force is

\[ Q_i = -\frac{\partial U}{\partial q_i} + \tilde{Q}_i, \tag{6} \]


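and the Lagrange equations of motion assume the following form :

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{q}_i}\Big) - \frac{\partial T}{\partial q_i} = -\frac{\partial U}{\partial q_i} + \tilde{Q}_i \tag{7} \]

for i = 1, 2,…, n. We now consider the total energy E, which is equal to the sum of the kinetic energy and the potential energy:

E = T + U . (8)

To compute the derivative dE/dt, we first evaluate dT/dt. We find

\[
\begin{aligned}
\frac{dT}{dt} &= \sum_{i=1}^{n}\Big(\frac{\partial T}{\partial q_i}\,\dot{q}_i + \frac{\partial T}{\partial \dot{q}_i}\,\ddot{q}_i\Big) + \frac{\partial T}{\partial t} \\
&= \frac{d}{dt}\Big(\sum_{i=1}^{n}\frac{\partial T}{\partial \dot{q}_i}\,\dot{q}_i\Big) - \sum_{i=1}^{n}\Big[\frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{q}_i}\Big) - \frac{\partial T}{\partial q_i}\Big]\dot{q}_i + \frac{\partial T}{\partial t}.
\end{aligned} \tag{9}
\]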

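Euler's formula for homogeneous functions gives

\[ \sum_{i=1}^{n}\frac{\partial T_2}{\partial \dot{q}_i}\,\dot{q}_i = 2T_2, \tag{10} \]

\[ \sum_{i=1}^{n}\frac{\partial T_1}{\partial \dot{q}_i}\,\dot{q}_i = T_1. \tag{11} \]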

We know that

T = T2 + T1 + T0 (12)

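From equations (9) to (12) and Lagrange's equations of motion (7), we find

\[ \frac{dT}{dt} = \frac{d}{dt}(2T_2 + T_1) + \sum_{i=1}^{n}\Big(\frac{\partial U}{\partial q_i} - \tilde{Q}_i\Big)\dot{q}_i + \frac{\partial T}{\partial t}. \tag{13} \]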

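From (12), we find the value of T2 in terms of T, T1 and T0 :

T2 = T − T1 − T0 . (14)

Using equation (14) in (13), together with the identity $\sum_{i}\frac{\partial U}{\partial q_i}\dot{q}_i = \frac{dU}{dt} - \frac{\partial U}{\partial t}$, we obtain

\[ \frac{dT}{dt} = 2\frac{dT}{dt} - \frac{d}{dt}(T_1 + 2T_0) + \frac{\partial T}{\partial t} + \frac{dU}{dt} - \frac{\partial U}{\partial t} - \sum_{i=1}^{n}\tilde{Q}_i\dot{q}_i. \tag{15} \]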

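From equation (8), we find

\[ \frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt}. \tag{16} \]

Using relation (15) in equation (16), we write

\[ \frac{dE}{dt} = \sum_{i=1}^{n}\tilde{Q}_i\dot{q}_i + \frac{d}{dt}(T_1 + 2T_0) - \frac{\partial T}{\partial t} + \frac{\partial U}{\partial t}. \tag{17} \]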

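Formula (17) determines the rate of change of the total energy of an arbitrary holonomic system.

Further,

\[ \sum_{i=1}^{n}\tilde{Q}_i\dot{q}_i = \sum_{i=1}^{n}\tilde{Q}_i\frac{dq_i}{dt} = \frac{1}{dt}\sum_{i=1}^{n}\tilde{Q}_i\,dq_i = \frac{\delta\tilde{A}}{dt}, \tag{18} \]

where $\delta\tilde{A}$ is the elementary work of the nonpotential forces $\tilde{Q}_i$. The quantity (18) is called the power of the nonpotential forces $\tilde{Q}_i$.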

A system is called conservative if

(i) it is a scleronomic system,

(ii) all forces are potential, and

(iii) the potential energy U is not explicitly dependent on the time.

For such a system T1 = T0 = 0, ∂T/∂t = 0, $\tilde{Q}_i = 0$ and ∂U/∂t = 0. Thus, for a conservative system, equation (17) leads to

\[ \frac{dE}{dt} = 0. \tag{19} \]

This gives

E = constant = h , say, (20)

for a conservative system. This shows that the total energy of a conservative system does not change when the system is in motion.

The Books Recommended for Chapters IV, V, VI and VII

1. F. Gantmacher, Lectures in Analytical Mechanics, MIR Publications, Moscow, 1975.

2. Louis N. Hand and J.D. Finch, Analytical Mechanics, Cambridge University Press, 1998.

3. J.S. Torok, Analytical Mechanics, John Wiley and Sons, 2000.


Chapter-5 Analytical Dynamics-II

5.1. LAGRANGE’S EQUATIONS FOR POTENTIAL FORCES

Let the generalized forces Qi be potential, i.e., there exists a force potential (potential energy)

U = U(t, qi) (1)

such that

\[ Q_i = -\frac{\partial U}{\partial q_i}, \tag{2} \]

for i = 1, 2,…,n.

We define

L = T − U. (3)

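The function L is called the Lagrangian function or the kinetic potential. Since the potential energy U does not depend on the generalized velocities $\dot{q}_i$, we have

\[ \frac{\partial L}{\partial \dot{q}_i} = \frac{\partial T}{\partial \dot{q}_i}, \tag{4} \]

\[ \frac{\partial L}{\partial q_i} = \frac{\partial T}{\partial q_i} - \frac{\partial U}{\partial q_i}, \tag{5} \]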

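for i = 1, 2,…, n. We know that Lagrange's equations of motion in terms of the kinetic energy T are given by

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{q}_i}\Big) - \frac{\partial T}{\partial q_i} = -\frac{\partial U}{\partial q_i}. \tag{6} \]

From equations (4) to (6), we obtain

\[ \frac{d}{dt}\Big(\frac{\partial L}{\partial \dot{q}_i}\Big) - \frac{\partial L}{\partial q_i} = 0 \tag{7} \]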

for i = 1, 2, …, n.

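Let us now consider the case when, in place of the ordinary potential U(t, qi), there exists a generalized potential

\[ V = V(t, q_i, \dot{q}_i) \tag{8} \]

in terms of which the generalized forces Qi are expressed by means of the following formulas

\[ Q_i = \frac{d}{dt}\Big(\frac{\partial V}{\partial \dot{q}_i}\Big) - \frac{\partial V}{\partial q_i} \tag{9} \]

for i = 1, 2,…, n.

In this case, Lagrange's equations

\[ \frac{d}{dt}\Big(\frac{\partial T}{\partial \dot{q}_i}\Big) - \frac{\partial T}{\partial q_i} = \frac{d}{dt}\Big(\frac{\partial V}{\partial \dot{q}_i}\Big) - \frac{\partial V}{\partial q_i} \]

become, with L = T − V,

\[ \frac{d}{dt}\Big(\frac{\partial L}{\partial \dot{q}_i}\Big) - \frac{\partial L}{\partial q_i} = 0. \tag{10} \]

Equations (10) are of the same form as equations (7).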

Remark I: From formulas (9), it follows that

\[ Q_i = \sum_{K=1}^{n}\frac{\partial^2 V}{\partial \dot{q}_i\,\partial \dot{q}_K}\,\ddot{q}_K + (*) \tag{11} \]

for i = 1, 2,…, n, where (*) denotes the sum of the terms that do not contain the generalized accelerations $\ddot{q}_K$ (K = 1, 2,…, n). In mechanics we consider only the case when the generalized forces Qi are not explicitly dependent on the generalized accelerations but depend solely on the time, on the coordinates and on the generalized velocities:

\[ Q_i = Q_i(t, q_K, \dot{q}_K) \tag{12} \]

for i = 1, 2,…, n.

It then follows from formulas (11) that all second order partial derivatives of V with respect to the generalized velocities must be identically equal to zero. This implies that the generalized potential V, in this case, depends linearly on the generalized velocities. Therefore, we can write

\[ V = \sum_{i=1}^{n} U_i\dot{q}_i + U = V_1 + U, \tag{13} \]

where Ui and U are functions of the coordinates q1, q2,…, qn and of the time t.

Substituting expression (13) for V into formula (9), we get

\[ Q_i = \frac{dU_i}{dt} - \frac{\partial}{\partial q_i}\Big(\sum_{K=1}^{n} U_K\dot{q}_K + U\Big) = -\frac{\partial U}{\partial q_i} + \sum_{K=1}^{n}\Big(\frac{\partial U_i}{\partial q_K} - \frac{\partial U_K}{\partial q_i}\Big)\dot{q}_K + \frac{\partial U_i}{\partial t}. \tag{14} \]

Formulas (14) show that when the linear part V1 of the generalized potential does not depend explicitly on the time t, the generalized forces Qi are made up of potential forces

\[ -\frac{\partial U}{\partial q_i} \tag{15} \]

and gyroscopic forces

\[ \tilde{Q}_i = \sum_{K=1}^{n}\gamma_{iK}\dot{q}_K, \tag{16} \]

where

\[ \gamma_{iK} = -\gamma_{Ki} = \frac{\partial U_i}{\partial q_K} - \frac{\partial U_K}{\partial q_i} \tag{17} \]

for i, K = 1, 2,…, n.

5.2 LAGRANGIAN AND HAMILTONIAN VARIABLES

If the kinetic potential or Lagrangian function $L = L(t, q_i, \dot{q}_i)$ is known, then the differential equations of motion of a system can be written. The variables t, qi, $\dot{q}_i$ (i = 1, 2,…, n), in terms of which the Lagrangian function is expressed, are called Lagrangian variables. For the basic variables that characterize the state of a system, Hamilton proposed the quantities t, qi, pi (i = 1, 2,…, n), where pi are the generalized momenta defined by the following equalities

\[ p_i = \frac{\partial L}{\partial \dot{q}_i} \tag{1} \]

for i = 1, 2,…, n. The variables t, qi, pi are called Hamiltonian variables.

Classical systems in which the forces have the ordinary potential U(t, qi) or the generalized potential $V(t, q_i, \dot{q}_i)$ will be called natural. For such systems the Lagrangian function L is a quadratic function of the generalized velocities. For a natural system we have

iKKi

2

Ki

2

aqqT

qqL =

∂∂∂=

∂∂∂

&&&& (2)

for i, K = 1, 2,…,n.

We notice that the Jacobian of the right-hand sides of equations (1) with respect to the variables $\dot q_i$ is the Hessian of the function L with respect to the generalized velocities. We assume that this Hessian is different from zero:

$$\det\left(\frac{\partial^2 L}{\partial \dot q_i\,\partial \dot q_K}\right) \neq 0. \tag{3}$$

It follows that equations (1) can be solved for the generalized velocities $\dot q_i$, and we write

$$\dot q_i = \Phi_i(t, q_K, p_K) \tag{4}$$

for i = 1, 2,…, n.

Thus, Hamilton’s variables may be expressed in terms of the Lagrange variables and vice versa. Consequently, the state of the system may be described both as a system of values of the Lagrange variables and as a system of values of Hamiltonian variables.


We know that for a natural system the Lagrangian function L is a quadratic function of the generalized velocities. By virtue of equations (1), the generalized momenta pi are therefore linear in the generalized velocities:

$$p_i = \sum_{K=1}^{n} a_{iK}\,\dot q_K + c_i \tag{5}$$

for i = 1, 2,…, n. Solving the linear system (5) for the $\dot q_i$, we get linear expressions of the type

$$\dot q_i = \sum_{K=1}^{n} b_{iK}\,p_K + b_i \tag{6}$$

for i = 1, 2,…, n. Here the $b_{iK}$ and $b_i$ are functions of t, q1, q2,…, qn.

If in a natural system the forces Qi have an ordinary potential U(t, qi), it follows from the equation

$$L = T - U \tag{7}$$

that

$$p_i = \frac{\partial T}{\partial \dot q_i}. \tag{8}$$

If the forces Qi have a generalized potential, then we have

$$p_i = \frac{\partial T}{\partial \dot q_i} - U_i. \tag{9}$$

Let

$$F = F(t, q_i, \dot q_i) \tag{10}$$

be any function of the Lagrangian variables. After the expressions (4) or (6) are substituted into (10) in place of the generalized velocities $\dot q_i$, the function (10) is converted into a certain function $\hat F(t, q_i, p_i)$ of the Hamiltonian variables. We call $\hat F(t, q_i, p_i)$ the associated function of $F(t, q_i, \dot q_i)$. Hamilton (1834) introduced the function H(t, qi, pi) defined by the equation

$$H = \sum_{i=1}^{n} p_i\,\hat{\dot q}_i - \hat L, \tag{11}$$

where $\hat L$ is the associated function of L and $\hat{\dot q}_i$ denotes $\dot q_i$ expressed through the Hamiltonian variables, in the sense described above.

The function H is called the Hamiltonian function.

With the help of Hamiltonian function H, the equations of motion may be written in the form of the following system of 2n ordinary differential equations of the first order

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \tag{12a}$$

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \tag{12b}$$

for i = 1, 2,…, n.

These equations (12a, b) are called canonical equations or Hamilton’s

equations.
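The passage from L to H and the canonical equations (12a, b) can be checked numerically on a small case. The sketch below assumes a one-dimensional linear oscillator with $L = \tfrac12 m\dot q^2 - \tfrac12 c q^2$; the values of `m`, `c` and the test state, and all function names, are illustrative assumptions and not part of the text.

```python
# Sketch: Legendre transform L -> H for an assumed 1-D oscillator,
# followed by a finite-difference check of Hamilton's equations.
m, c = 2.0, 3.0  # assumed mass and stiffness


def lagrangian(q, qdot):
    return 0.5 * m * qdot**2 - 0.5 * c * q**2


def partial(f, args, k, eps=1e-6):
    """Central-difference partial derivative of f with respect to args[k]."""
    hi, lo = list(args), list(args)
    hi[k] += eps
    lo[k] -= eps
    return (f(*hi) - f(*lo)) / (2.0 * eps)


def velocity(q, p):
    # Inverse of p = dL/d(qdot) = m*qdot; linear because the system is natural.
    return p / m


def hamiltonian(q, p):
    # H = p*qdot - L with qdot expressed through (q, p).
    qdot = velocity(q, p)
    return p * qdot - lagrangian(q, qdot)


q, qdot = 0.7, -1.3
p = partial(lagrangian, (q, qdot), 1)      # p = dL/d(qdot)
dq_dt = partial(hamiltonian, (q, p), 1)    # equation (12a): dH/dp
dp_dt = -partial(hamiltonian, (q, p), 0)   # equation (12b): -dH/dq

print(p, dq_dt, dp_dt)
```

For this Lagrangian the printed `dq_dt` should reproduce the chosen velocity, and `dp_dt` should agree with the Lagrange equation $m\ddot q = -cq$.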

DONKIN’S THEOREM : The derivation of the canonical equations of Hamilton follows from Donkin’s theorem.

Statement : Let X(x1, x2,…, xn) be a function whose Hessian is different from zero, and let a transformation of the variables be “generated” by the function X :

$$y_i = \frac{\partial X}{\partial x_i}$$

for i = 1, 2,…, n. Then the inverse transformation exists and is likewise generated by some function Y(y1, y2,…, yn) :

$$x_i = \frac{\partial Y}{\partial y_i}$$

for i = 1, 2,…, n. If the function X contains the parameters α1, α2,…, αm , i.e.,

X = X(x1, x2,…, xn ; α1, α2,…, αm),

then Y also contains these parameters, i.e.,

Y = Y(y1, y2,…, yn ; α1, α2,…, αm), and

$$\frac{\partial Y}{\partial \alpha_j} = -\frac{\partial X}{\partial \alpha_j}$$

for j = 1, 2,…, m.

Proof. Let the generating function Y of the inverse transformation be connected with the generating function X of the direct transformation

$$y_i = \frac{\partial X}{\partial x_i} \tag{1}$$

(i = 1, 2,…, n) by the formula (known as the Legendre transformation)

$$Y = \sum_{i=1}^{n} x_i y_i - X. \tag{2}$$

The Hessian of the function X coincides with the Jacobian of the right-hand sides of equations (1), and by the hypothesis of the theorem we have

$$\det\left(\frac{\partial^2 X}{\partial x_i\,\partial x_k}\right) \neq 0. \tag{3}$$

For this reason it is possible to express the variables x1, x2,…, xn in terms of y1, y2,…, yn. We write

$$x_i = f_i(y_1, y_2, \ldots, y_n) \tag{4}$$

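for i = 1, 2,…, n. Now we replace the variables xi appearing in formula (2) by the expressions in (4). Then

$$\frac{\partial Y}{\partial y_i} = \frac{\partial}{\partial y_i}\left(\sum_{K=1}^{n} x_K y_K - X\right) = x_i + \sum_{K=1}^{n} y_K\,\frac{\partial x_K}{\partial y_i} - \sum_{K=1}^{n} \frac{\partial X}{\partial x_K}\,\frac{\partial x_K}{\partial y_i}. \tag{5}$$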

By virtue of equations (1), the two sums on the right-hand side of equation (5) cancel. Hence we obtain

$$\frac{\partial Y}{\partial y_i} = x_i \tag{6}$$

for i = 1, 2,…, n.

This proves the first part of Donkin’s theorem.

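2nd Part : Now let X contain the parameters α1, α2,…, αm in addition to the variables x1, x2,…, xn. Then these parameters occur in the direct transformation (1) and, consequently, in the inverse one as well :

$$x_i = f_i(y_1, y_2, \ldots, y_n ; \alpha_1, \alpha_2, \ldots, \alpha_m) \tag{7}$$

for i = 1, 2,…, n.

The function Y is determined by equation (2), in which the xi are replaced by the functions fi(y1, y2,…, yn ; α1, α2,…, αm), and so (regarding y1, y2,…, yn as constants) we obtain

$$\frac{\partial Y}{\partial \alpha_j} = \frac{\partial}{\partial \alpha_j}\left(\sum_{i=1}^{n} x_i y_i - X\right) = \sum_{i=1}^{n} y_i\,\frac{\partial x_i}{\partial \alpha_j} - \sum_{i=1}^{n} \frac{\partial X}{\partial x_i}\,\frac{\partial x_i}{\partial \alpha_j} - \frac{\partial X}{\partial \alpha_j}$$

$$= \sum_{i=1}^{n} \frac{\partial X}{\partial x_i}\,\frac{\partial x_i}{\partial \alpha_j} - \sum_{i=1}^{n} \frac{\partial X}{\partial x_i}\,\frac{\partial x_i}{\partial \alpha_j} - \frac{\partial X}{\partial \alpha_j} = -\frac{\partial X}{\partial \alpha_j} \tag{8}$$

for j = 1, 2,…, m, where equations (1) have been used to replace $y_i$ by $\partial X/\partial x_i$.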

This completes the proof of Donkin’s theorem.
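A small numerical check can make both parts of the theorem concrete. The sketch below assumes the quadratic generating function $X = \tfrac12\alpha x_1^2 + x_1x_2 + x_2^2$ with a single parameter α; this choice, the sample point and the function names are illustrative assumptions, not taken from the text.

```python
# Sketch: finite-difference check of Donkin's theorem for an assumed
# quadratic X(x1, x2; a) = a*x1^2/2 + x1*x2 + x2^2 (Hessian det = 2a - 1).


def X(x1, x2, a):
    return 0.5 * a * x1**2 + x1 * x2 + x2**2


def forward(x1, x2, a):
    # Direct transformation y_i = dX/dx_i.
    return a * x1 + x2, x1 + 2.0 * x2


def inverse(y1, y2, a):
    # Inverse transformation, obtained by solving the linear system above.
    det = 2.0 * a - 1.0
    return (2.0 * y1 - y2) / det, (a * y2 - y1) / det


def Y(y1, y2, a):
    # Legendre transformation Y = sum x_i y_i - X with x expressed through y.
    x1, x2 = inverse(y1, y2, a)
    return x1 * y1 + x2 * y2 - X(x1, x2, a)


def partial(f, args, k, eps=1e-6):
    hi, lo = list(args), list(args)
    hi[k] += eps
    lo[k] -= eps
    return (f(*hi) - f(*lo)) / (2.0 * eps)


a = 3.0
x1, x2 = 0.4, -1.1
y1, y2 = forward(x1, x2, a)

dY_dy1 = partial(Y, (y1, y2, a), 0)   # should reproduce x1
dY_dy2 = partial(Y, (y1, y2, a), 1)   # should reproduce x2
dY_da = partial(Y, (y1, y2, a), 2)    # should equal -dX/da
dX_da = partial(X, (x1, x2, a), 2)

print(dY_dy1, dY_dy2, dY_da, dX_da)
```

Note that `dY_da` is taken at fixed y, while `dX_da` is taken at fixed x, exactly as in the statement of the theorem.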


5.3 HAMILTON CANONICAL EQUATIONS

To derive these equations, we use Donkin’s theorem to make the transition from the Lagrangian variables to the Hamiltonian variables. For this purpose, in Donkin’s theorem we replace the function X by L, the variables x1, x2,…, xn by $\dot q_1, \dot q_2, \ldots, \dot q_n$, the parameters α1, α2,…, αm by q1, q2,…, qn and t, the variables y1, y2,…, yn by p1, p2,…, pn, and the function

$$Y = \sum_{i=1}^{n} x_i y_i - X \tag{1}$$

by the Hamiltonian function

$$H = \sum_{i=1}^{n} p_i\,\hat{\dot q}_i - \hat L, \tag{2}$$

where

$$p_i = \frac{\partial L}{\partial \dot q_i} \tag{3}$$

for i = 1, 2,…, n.

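Hence, by Donkin’s theorem, we conclude that

$$\dot q_i = \frac{\partial H}{\partial p_i}, \tag{4}$$

$$\frac{\partial L}{\partial q_i} = -\frac{\partial H}{\partial q_i}, \tag{5}$$

and

$$\frac{\partial L}{\partial t} = -\frac{\partial H}{\partial t} \tag{6}$$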

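for i = 1, 2,…, n. The Lagrange equations of motion are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0 \tag{7}$$

for i = 1, 2,…, n. Using (3) in (7), we write

$$\frac{d}{dt}(p_i) = \frac{\partial L}{\partial q_i}. \tag{8}$$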


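Hence, from (4), (5) and (8), we obtain

$$\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = \frac{\partial L}{\partial q_i} = -\frac{\partial H}{\partial q_i},$$

or

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \tag{9}$$

for i = 1, 2,…, n.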

Equations in (9) are the required canonical equations of Hamilton.

5.4 ROUTH’S EQUATIONS

For the basic variables characterizing the state of a system at a time t, Routh proposed taking a part of the Lagrangian variables and a part of the Hamiltonian variables. The Routh variables are the quantities

$$t,\; q_i,\; q_\alpha,\; \dot q_i,\; p_\alpha \tag{1}$$

for i = 1, 2,…, m ; α = m + 1,…, n, where m is an arbitrary fixed number less than n.

The Lagrangian variables can be replaced by the Routh variables if we express all the $\dot q_\alpha$ in terms of the pα by means of the relations

$$p_\alpha = \frac{\partial L}{\partial \dot q_\alpha} \tag{2}$$

for α = m + 1, m + 2,…,n.

Suppose that the Hessian of the function L with respect to the generalized velocities $\dot q_\alpha$ is different from zero. That is,

$$\det\left(\frac{\partial^2 L}{\partial \dot q_\alpha\,\partial \dot q_\beta}\right) \neq 0. \tag{3}$$

Then, by applying Donkin’s theorem, we get a transformation that is inverse to the transformation (2), namely

$$\dot q_\alpha = \frac{\partial R}{\partial p_\alpha} \tag{4}$$

for α = m + 1, m + 2,…, n, where $R = R(t, q_i, q_\alpha, \dot q_i, p_\alpha)$ is the Routh function defined by the equation

$$R = \sum_{\alpha=m+1}^{n} p_\alpha\,\hat{\dot q}_\alpha - \hat L. \tag{5}$$

The hat signifies that all the $\dot q_\alpha$ are expressed in terms of the pα.

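The variables $t, q_i, q_\alpha, \dot q_i$ (i = 1, 2,…, m ; α = m + 1, m + 2,…, n) are now regarded as parameters. Consequently, Donkin’s theorem gives

$$\frac{\partial R}{\partial q_i} = -\frac{\partial L}{\partial q_i}, \qquad \frac{\partial R}{\partial \dot q_i} = -\frac{\partial L}{\partial \dot q_i} \tag{6}$$

for i = 1, 2,…, m ; and

$$\frac{\partial R}{\partial q_\alpha} = -\frac{\partial L}{\partial q_\alpha} \tag{7}$$

for α = m + 1, m + 2,…, n ; and

$$\frac{\partial R}{\partial t} = -\frac{\partial L}{\partial t}. \tag{8}$$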

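We know that the Lagrange equations for the coordinates qi are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0. \tag{9}$$

Using (6), equations (9) may be written as

$$\frac{d}{dt}\left(\frac{\partial R}{\partial \dot q_i}\right) - \frac{\partial R}{\partial q_i} = 0 \tag{10}$$

for i = 1, 2,…, m.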

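The Lagrange equations for the coordinates qα are

$$\frac{dp_\alpha}{dt} = \frac{\partial L}{\partial q_\alpha} \tag{11}$$

for α = m + 1, m + 2,…, n.

Using (4) and (7), equations (11) become

$$\frac{dp_\alpha}{dt} = -\frac{\partial R}{\partial q_\alpha} \tag{12}$$

and

$$\frac{dq_\alpha}{dt} = \frac{\partial R}{\partial p_\alpha} \tag{13}$$

for α = m + 1, m + 2,…, n.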

Equations (10), (12) and (13) together form the following set of Routh equations :

$$\frac{d}{dt}\left(\frac{\partial R}{\partial \dot q_i}\right) - \frac{\partial R}{\partial q_i} = 0, \qquad \frac{dp_\alpha}{dt} = -\frac{\partial R}{\partial q_\alpha}, \qquad \frac{dq_\alpha}{dt} = \frac{\partial R}{\partial p_\alpha} \tag{14}$$

for i = 1, 2,…, m and α = m + 1, m + 2,…, n. These equations consist of m second-order differential equations of the Lagrange type and 2(n−m) first-order differential equations of the Hamiltonian type. In the first group of equations the Routh function plays the role of the Lagrangian function, while in the second group it plays the role of the Hamiltonian function.

5.5 CYCLIC COORDINATES

The Lagrangian of a physical system is generally expected to depend explicitly on all the generalized coordinates qi, all the generalized velocities $\dot q_i$ and the time t, that is,

$$L = L(q_1, q_2, \ldots, q_n, \dot q_1, \dot q_2, \ldots, \dot q_n, t),$$

where n is the total number of generalized coordinates. If some of the generalized coordinates do not appear explicitly in the expression for the Lagrangian, these coordinates are called cyclic (or ignorable). A change in these coordinates does not affect the Lagrangian.

5.6 POISSON BRACKETS

Poisson introduced a special term, the Poisson bracket, for the following expression composed of the partial derivatives of two arbitrary functions φ(t, qi, pi) and ψ(t, qi, pi) :

$$(\varphi\,\psi) = \sum_{i=1}^{n}\left(\frac{\partial \varphi}{\partial q_i}\,\frac{\partial \psi}{\partial p_i} - \frac{\partial \varphi}{\partial p_i}\,\frac{\partial \psi}{\partial q_i}\right). \tag{1}$$

Remark : For functions φ(t, qi, pi), ψ(t, qi, pi), χ(t, qi, pi) and a constant c, the Poisson bracket satisfies the following properties :

(1) (φ ψ) = −(ψ φ)

(2) (c φ ψ) = c(φ ψ)

(3) (φ + ψ χ) = (φ χ) + (ψ χ)

(4) $\dfrac{\partial}{\partial t}(\varphi\,\psi) = \left(\dfrac{\partial \varphi}{\partial t}\,\psi\right) + \left(\varphi\,\dfrac{\partial \psi}{\partial t}\right)$.

POISSON’S IDENTITY : For functions φ(t, qi, pi), ψ(t, qi, pi) and χ(t, qi, pi), the following property holds :

((φ ψ) χ) + ((ψ χ) φ) + ((χ φ) ψ) = 0.

This property is known as Poisson’s identity.


Definition : A function f(t, qi, pi) is called an integral of the equations of motion

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \tag{1}$$

(i = 1, 2,…, n) if, for any motion of the given system, this function retains a constant value, say C :

$$f(t, q_i, p_i) = C. \tag{2}$$

Remark : A necessary and sufficient condition for the function f(t, qi, pi) to be an integral of the equations of motion (1) is that

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + (f\,H) = 0. \tag{3}$$

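5.7 JACOBI-POISSON THEOREM

Statement : If f and g are integrals of the equations of motion, then (f g) is also an integral of these equations.

Proof : Since f and g are integrals of the equations of motion,

$$\frac{\partial f}{\partial t} + (f\,H) = 0, \tag{1}$$

$$\frac{\partial g}{\partial t} + (g\,H) = 0. \tag{2}$$

To prove that (f g) is also an integral of the same equations of motion, we must show that

$$\frac{\partial}{\partial t}(f\,g) + ((f\,g)\,H) = 0. \tag{3}$$

By property (4) of the Poisson bracket,

$$\frac{\partial}{\partial t}(f\,g) = \left(\frac{\partial f}{\partial t}\,g\right) + \left(f\,\frac{\partial g}{\partial t}\right). \tag{4}$$

From equations (1) and (2), we write

$$\frac{\partial f}{\partial t} = -(f\,H), \tag{5}$$

$$\frac{\partial g}{\partial t} = -(g\,H). \tag{6}$$

Using (5) and (6) in (4), we get

$$\frac{\partial}{\partial t}(f\,g) = -((f\,H)\,g) - (f\,(g\,H)) = ((H\,f)\,g) + ((g\,H)\,f). \tag{7}$$

Adding ((f g) H) to both sides of (7) and applying Poisson’s identity, we obtain

$$\frac{\partial}{\partial t}(f\,g) + ((f\,g)\,H) = ((H\,f)\,g) + ((g\,H)\,f) + ((f\,g)\,H) = 0.$$

Thus (3) holds.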

This completes the proof of the theorem.
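The theorem can be illustrated numerically. The sketch below assumes a particle of unit mass in an isotropic potential, for which the components Lx and Ly of the angular momentum are time-independent integrals; it evaluates the Poisson bracket (1) of Section 5.6 by central differences. The Hamiltonian, the sample state and the helper names are illustrative assumptions, not part of the text.

```python
# Sketch: finite-difference Poisson bracket and a check of the
# Jacobi-Poisson theorem for an assumed isotropic oscillator.
# A state is the list [q1, q2, q3, p1, p2, p3].


def poisson(f, g, state, eps=1e-4):
    """(f g) = sum_i (df/dq_i * dg/dp_i - df/dp_i * dg/dq_i)."""
    n = len(state) // 2

    def partial(h, k):
        hi, lo = list(state), list(state)
        hi[k] += eps
        lo[k] -= eps
        return (h(hi) - h(lo)) / (2.0 * eps)

    return sum(
        partial(f, i) * partial(g, n + i) - partial(f, n + i) * partial(g, i)
        for i in range(n)
    )


def H(s):
    x, y, z, px, py, pz = s
    return 0.5 * (px * px + py * py + pz * pz) + 0.5 * (x * x + y * y + z * z)


def Lx(s):
    x, y, z, px, py, pz = s
    return y * pz - z * py


def Ly(s):
    x, y, z, px, py, pz = s
    return z * px - x * pz


def Lz(s):
    x, y, z, px, py, pz = s
    return x * py - y * px


state = [0.3, -1.2, 0.7, 0.5, 0.9, -0.4]

# f = Lx and g = Ly do not depend on t, so the integral condition is (f H) = 0.
lx_h = poisson(Lx, H, state)
ly_h = poisson(Ly, H, state)

# The bracket (Lx Ly) and its own bracket with H.
lx_ly = poisson(Lx, Ly, state)
bracket_h = poisson(lambda s: poisson(Lx, Ly, s), H, state)

print(lx_h, ly_h, lx_ly, Lz(state), bracket_h)
```

With the sign convention of (1), the bracket (Lx Ly) comes out equal to Lz, so the theorem here produces a third integral from two known ones.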


Chapter-6

Analytical Mechanics-III

6.1 HAMILTON’S PRINCIPLE

Definition : We consider an arbitrary holonomic system with independent coordinates q1, q2,…, qn and the Lagrangian function $L = L(t, q_i, \dot q_i)$, 1 ≤ i ≤ n. The integral

$$W = \int_{t_0}^{t_1} L\,dt \tag{1}$$

is called the “Hamilton action” during the time interval (t0, t1). The expression L dt is called the elementary Hamilton action.

Note : Since the function L is of the form

$$L = L(t, q_i, \dot q_i), \tag{2}$$

it is necessary, in order to compute the Hamilton action W, to specify the functions

$$q_i = q_i(t) \tag{3}$$

for i = 1, 2,…, n in the time interval [t0, t1]. This implies that the Hamilton action W is a functional that depends on the motion of the system.

Remark : Let $q_i^0$ be a given initial position of the system at time t = t0 and $q_i'$ a given final position, which it occupies at time t = t1. We fix the initial and terminal instants t0 and t1 and the initial and terminal positions of the system; the motions are otherwise arbitrary.

In the extended (n + 1)-dimensional coordinate space, in which the quantities qi and the time t are the coordinates, each such motion is depicted by a curve, or path. We shall consider all possible paths passing through the two specified points $M_0(t_0, q_i^0)$ and $M_1(t_1, q_i')$ of this space, as shown in the figure below.

[Figure: paths joining the points M0 and M1 in the extended coordinate space; axes t and qi.]

That is, we consider all possible motions that carry the system from the initial position $q_i^0$ to the final position $q_i'$.

Straight path and Circuitous paths

Suppose that among the paths considered above there is a path along which the system can move for the specified function L, i.e., in the given field of force. Such a path is called a “straight path”. In the above figure, the straight path is depicted by a solid line. For a straight path, the functions qi = qi(t) satisfy the Lagrange equations of motion, namely,

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0 \tag{4}$$

for i = 1, 2,…, n. All other paths passing through the points M0 and M1 are termed “circuitous paths”.

Statement of Hamilton’s principle (1834-35)

“The Hamilton action W has a stationary value for the straight path as compared with the circuitous paths”.

Proof of Hamilton’s principle : Let us consider an arbitrary one-parameter family of paths

$$q_i = q_i(t, \alpha), \tag{5}$$

where α is a parameter, −γ ≤ α ≤ γ, t0 ≤ t ≤ t1 and i = 1, 2,…, n. For α = 0 one obtains the given straight path, and for α ≠ 0 the paths are circuitous. Further, let us assume that all the paths in (5) have a common initial point M0 and a common end point M1. That is,

$$q_i(t_0, \alpha) = q_i^0, \tag{6a}$$

$$q_i(t_1, \alpha) = q_i', \tag{6b}$$

for −γ ≤ α ≤ γ and i = 1, 2,…, n. The Hamilton action computed along a path of this one-parameter family is a function of the parameter α. It is denoted by W(α) and defined as

$$W(\alpha) = \int_{t_0}^{t_1} L\big(t, q_i(t, \alpha), \dot q_i(t, \alpha)\big)\,dt. \tag{7}$$

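Now we compute the variation δW of the Hamilton action W. By definition,

$$\delta W = \int_{t_0}^{t_1} \delta L\,dt = \int_{t_0}^{t_1} \sum_{i=1}^{n}\left(\frac{\partial L}{\partial q_i}\,\delta q_i + \frac{\partial L}{\partial \dot q_i}\,\delta \dot q_i\right)dt. \tag{8}$$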

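We note that

$$\delta \dot q_i = \delta\left\{\frac{d}{dt}\,q_i(t, \alpha)\right\} = \frac{\partial}{\partial \alpha}\left\{\frac{d}{dt}\,q_i(t, \alpha)\right\}\delta\alpha = \frac{d}{dt}\left\{\frac{\partial}{\partial \alpha}\,q_i(t, \alpha)\,\delta\alpha\right\} = \frac{d}{dt}(\delta q_i). \tag{9}$$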

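Now

$$\int_{t_0}^{t_1} \sum_{i=1}^{n} \frac{\partial L}{\partial \dot q_i}\,\delta \dot q_i\,dt = \int_{t_0}^{t_1} \sum_{i=1}^{n} \frac{\partial L}{\partial \dot q_i}\,\frac{d}{dt}(\delta q_i)\,dt = \left[\sum_{i=1}^{n} \frac{\partial L}{\partial \dot q_i}\,\delta q_i\right]_{t=t_0}^{t=t_1} - \int_{t_0}^{t_1} \sum_{i=1}^{n} \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right)\delta q_i\,dt$$

$$= -\int_{t_0}^{t_1} \sum_{i=1}^{n} \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right)\delta q_i\,dt, \tag{10}$$

since the variations δqi are zero at t = t0 and t = t1 by virtue of the assumption that the straight path and all the circuitous paths pass through M0 and M1 in the extended coordinate space. Combining equations (8) and (10), one writes

$$\delta W = \int_{t_0}^{t_1} \sum_{i=1}^{n}\left(\frac{\partial L}{\partial q_i} - \frac{d}{dt}\,\frac{\partial L}{\partial \dot q_i}\right)\delta q_i\,dt. \tag{11}$$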


For a straight path (i.e., α = 0), the functions qi = qi(t) satisfy the Lagrange equations of motion, namely,

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0 \tag{12}$$

for i = 1, 2,…, n. From equations (11) and (12), it is concluded that

$$\delta W = 0 \tag{13}$$

for the straight path. This proves that the Hamilton action W has a stationary value for the straight path. Hence the proof of Hamilton’s principle is complete.
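The stationarity of W can be observed numerically on a one-parameter family of the form (5). The sketch below assumes the Lagrangian $L = \tfrac12\dot q^2 - \tfrac12 q^2$ on the interval [0, 1] with q(0) = 0 and q(1) = 1; the straight path is then q = sin t / sin 1, and the family q(t, α) = sin t / sin 1 + α sin πt keeps both end points fixed. The Lagrangian, the family and the quadrature rule are illustrative assumptions, not part of the text.

```python
# Sketch: Hamilton action W(alpha) along an assumed one-parameter family
# of paths with fixed end points, evaluated by the midpoint rule.
import math


def action(alpha, n=2000):
    """Midpoint-rule approximation of W(alpha) for L = qdot^2/2 - q^2/2."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        q = math.sin(t) / math.sin(1.0) + alpha * math.sin(math.pi * t)
        qdot = (math.cos(t) / math.sin(1.0)
                + alpha * math.pi * math.cos(math.pi * t))
        total += (0.5 * qdot**2 - 0.5 * q**2) * h
    return total


eps = 1e-3
# Central-difference estimate of dW/d(alpha) at alpha = 0 (the straight path).
first_variation = (action(eps) - action(-eps)) / (2.0 * eps)
# Change of the action on a circuitous path with alpha = 0.1.
gain = action(0.1) - action(0.0)

print(first_variation, gain)
```

The first variation vanishes to within quadrature error, while the action on the circuitous path differs from that on the straight path only at second order in α.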

Remark 1 : The converse of Hamilton’s principle is true. That is, if

$$\delta W = 0 \tag{14}$$

for some path, then the path is straight.

Remark 2 : Since the Lagrange equations of motion follow from Hamilton’s principle (using equation (11)) and vice versa, Hamilton’s principle may be placed at the foundation of the dynamics of holonomic systems.

Remark 3 : The variational principle characterizes the entire straight path as a whole. It formulates the stationary property of a certain functional (the Hamilton action), and this property distinguishes the straight path from the other kinematically possible paths (circuitous paths). The variational principle has a more surveyable and compact form and is frequently used as a foundation for new non-classical domains of mechanics.

Remark 4 : The value of the Hamilton action W is least for a straight path, provided the time interval t1 − t0 is sufficiently short.

6.2 POINCARE-CARTAN INTEGRAL INVARIANT

We shall now derive a formula for the variation of the Hamilton action

$$W = \int_{t_0}^{t_1} L\,dt \tag{1}$$

in the general case when the initial and terminal instants of time, just like the initial and terminal coordinates, are not fixed but are functions of a parameter, say α, of the type

$$t_0 = t_0(\alpha), \qquad t_1 = t_1(\alpha), \tag{2}$$

and

$$q_i^0 = q_i^0(\alpha), \qquad q_i^1 = q_i^1(\alpha). \tag{3}$$

(In what follows the terminal values are marked either by the superscript 1 or, equivalently, by a prime.)


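Using the Leibnitz rule for differentiation under the integral sign, the differentiation of (1) with respect to the parameter α gives

$$\delta W = \delta \int_{t_0(\alpha)}^{t_1(\alpha)} L\,dt = L_1\,\delta t_1 - L_0\,\delta t_0 + \int_{t_0(\alpha)}^{t_1(\alpha)} \sum_{i=1}^{n}\left(\frac{\partial L}{\partial q_i}\,\delta q_i + \frac{\partial L}{\partial \dot q_i}\,\delta \dot q_i\right)dt. \tag{4}$$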

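Integrating by parts, we write

$$\int_{t_0(\alpha)}^{t_1(\alpha)} \sum_{i=1}^{n} \frac{\partial L}{\partial \dot q_i}\,\delta \dot q_i\,dt = \int_{t_0(\alpha)}^{t_1(\alpha)} \sum_{i=1}^{n} \frac{\partial L}{\partial \dot q_i}\,\frac{d}{dt}(\delta q_i)\,dt = \left[\sum_{i=1}^{n} \frac{\partial L}{\partial \dot q_i}\,\delta q_i\right]_{t_0(\alpha)}^{t_1(\alpha)} - \int_{t_0(\alpha)}^{t_1(\alpha)} \sum_{i=1}^{n} \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right)\delta q_i\,dt$$

$$= \sum_{i=1}^{n} p_i^1\,\{\delta q_i\}_{t=t_1(\alpha)} - \sum_{i=1}^{n} p_i^0\,\{\delta q_i\}_{t=t_0(\alpha)} - \int_{t_0(\alpha)}^{t_1(\alpha)} \sum_{i=1}^{n} \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right)\delta q_i\,dt, \tag{5}$$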

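where we have used the facts that

$$p_i = \frac{\partial L}{\partial \dot q_i} \tag{6a}$$

and

$$\delta \dot q_i = \frac{d}{dt}(\delta q_i) \tag{6b}$$

for i = 1, 2,…, n. Here the pi are the generalized momenta of the system. Further, we have used the notation

$$p_i^\lambda = [\,p_i\,]_{t=t_\lambda(\alpha)} = p_i(t_\lambda(\alpha), \alpha) \tag{7}$$

for λ = 0, 1. We note that

$$\{\delta q_i\}_{t=t_\lambda} = \left[\frac{\partial}{\partial \alpha}\,q_i(t, \alpha)\right]_{t=t_\lambda}\delta\alpha \tag{8}$$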

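for λ = 0, 1. On the other hand, for the variations of the terminal coordinates

$$q_i' = q_i[\,t_1(\alpha), \alpha\,]$$

we have the formulas

$$\delta q_i' = \dot q_i'\,\delta t_1 + \left[\frac{\partial q_i(t, \alpha)}{\partial \alpha}\right]_{t=t_1}\delta\alpha = \dot q_i'\,\delta t_1 + [\,\delta q_i\,]_{t=t_1}.$$

This implies

$$[\,\delta q_i\,]_{t=t_1} = \delta q_i' - \dot q_i'\,\delta t_1. \tag{9}$$

Similarly,

$$[\,\delta q_i\,]_{t=t_0} = \delta q_i^0 - \dot q_i^0\,\delta t_0. \tag{10}$$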

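First, we substitute the expressions for $[\,\delta q_i\,]_{t=t_\lambda}$ from equations (9) and (10) into the right-hand side of equation (5), and then use the resulting value of the left-hand side of (5) in equation (4). This gives the following expression for δW :

$$\delta W = \sum_{i=1}^{n} p_i'\,(\delta q_i' - \dot q_i'\,\delta t_1) + L_1\,\delta t_1 - \sum_{i=1}^{n} p_i^0\,(\delta q_i^0 - \dot q_i^0\,\delta t_0) - L_0\,\delta t_0 + \int_{t_0(\alpha)}^{t_1(\alpha)} \sum_{i=1}^{n}\left(\frac{\partial L}{\partial q_i} - \frac{d}{dt}\,\frac{\partial L}{\partial \dot q_i}\right)\delta q_i\,dt. \tag{11}$$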

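We know that

$$\sum_{i=1}^{n} p_i\,\dot q_i - L = H, \tag{12}$$

so that

$$\sum_{i=1}^{n} p_i^\lambda\,\dot q_i^\lambda - L_\lambda = H_\lambda \tag{13}$$

for λ = 0, 1. Using (13) in (11), we finally obtain

$$\delta W = \left[\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right]_0^1 + \int_{t_0}^{t_1} \sum_{i=1}^{n}\left(\frac{\partial L}{\partial q_i} - \frac{d}{dt}\,\frac{\partial L}{\partial \dot q_i}\right)\delta q_i\,dt, \tag{14}$$

where

$$\left[\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right]_0^1 = \sum_{i=1}^{n} p_i^1\,\delta q_i^1 - H_1\,\delta t_1 - \sum_{i=1}^{n} p_i^0\,\delta q_i^0 + H_0\,\delta t_0. \tag{15}$$

[Figure (for n = 1): closed curves C0 and C1 on a tube of straight paths; axes t, q1 and p1.]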

In place of the (n + 1)-dimensional extended coordinate space, we take the (2n + 1)-dimensional extended phase space, in which the quantities qi, pi and t are the coordinates of a point.

In this space we take an arbitrary closed curve C0 given by the equations

$$q_i = q_i^0(\alpha), \qquad p_i = p_i^0(\alpha), \qquad t = t_0(\alpha) \tag{16}$$

for i = 1, 2,…, n and 0 ≤ α ≤ l. Here α = 0 and α = l give one and the same point of the curve C0. Taking each point of the curve C0 as an initial one, we draw the corresponding straight path. Such a path is uniquely determined by the system of Hamilton’s canonical equations and the initial conditions. We obtain a closed tube of straight paths

$$q_i = q_i(t, \alpha), \qquad p_i = p_i(t, \alpha), \tag{17}$$

$$q_i(t, 0) = q_i(t, l), \qquad p_i(t, 0) = p_i(t, l), \tag{18}$$

for i = 1, 2,…, n and 0 ≤ α ≤ l.


On this tube we choose arbitrarily a second closed curve C1 that encircles the tube and has only one point in common with each generatrix. The equations of the curve C1 may be written in the following form :

$$q_i = q_i^1(\alpha), \qquad p_i = p_i^1(\alpha), \qquad t = t_1(\alpha). \tag{19}$$

Now we shall examine the Hamilton action W along a generatrix of the tube from the curve C0 to the curve C1. In this case, for any α, the generatrix is a straight path, and therefore the Lagrange equations of motion hold along it :

$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) = 0 \tag{20}$$

for i = 1, 2,…, n.

Consequently, by virtue of formula (14), the variation δW of the Hamilton action takes the following simple form in this case :

$$\delta W = \left[\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right]_0^1. \tag{21}$$

This gives

$$W'(\alpha)\,\delta\alpha = \left[\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right]_0^1. \tag{22}$$

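Integrating (22) with respect to α from α = 0 to α = l, and noting that W(l) = W(0) because α = 0 and α = l correspond to the same generatrix, we get

$$0 = W(l) - W(0) = \int_0^l \left[\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right]_0^1 = \int_0^l \left(\sum_{i=1}^{n} p_i'\,\delta q_i' - H_1\,\delta t_1\right) - \int_0^l \left(\sum_{i=1}^{n} p_i^0\,\delta q_i^0 - H_0\,\delta t_0\right)$$

$$= \oint_{C_1}\left(\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right) - \oint_{C_0}\left(\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right).$$

This gives

$$\oint_{C_1}\left(\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right) = \oint_{C_0}\left(\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right). \tag{23}$$

The relation (23) proves that the line integral

$$I = \oint_{C}\left(\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right) \tag{24}$$

taken along any closed contour C does not change its value under an arbitrary displacement of the contour along a tube of straight paths. This implies that the integral I is invariant.

Definition : The integral I, defined above in (24), is called the POINCARE-CARTAN INTEGRAL INVARIANT.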

Remark 1 : The invariance of the Poincare-Cartan integral may be placed at the foundation of mechanics, since from this invariance it follows that the motion of a system obeys Hamilton’s canonical equations.

Remark 2 : The converse proposition is also true. That is, if the Poincare-Cartan integral I is an integral invariant with respect to the paths defined by the following system of first-order differential equations

$$\frac{dq_i}{dt} = Q_i(t, q_k, p_k), \qquad \frac{dp_i}{dt} = P_i(t, q_k, p_k) \tag{25}$$

for i = 1, 2,…, n, then the following relations hold between the function H and the functions Qi, Pi :

$$Q_i = \frac{\partial H}{\partial p_i}, \qquad P_i = -\frac{\partial H}{\partial q_i} \tag{26}$$

for i = 1, 2,…, n.

Remark 3 : In the Poincare-Cartan integral (24), the time t enters on the same footing as the coordinates qi, and the role of the corresponding momentum is played by the quantity −H, i.e., the energy taken with the opposite sign. This is a far-reaching analogy. We change the variables in the integral I by introducing a new variable z connected with the old variables by the relation

$$z = -H(t, q_1, q_2, \ldots, q_n, p_1, p_2, \ldots, p_n). \tag{27}$$

Using the relation (27), we express p1 in terms of the other variables. Let

$$p_1 = -K(t, q_1, q_2, \ldots, q_n ; z ; p_2, p_3, \ldots, p_n). \tag{28}$$

Using the relations (27) and (28), the expression (24) now becomes

$$I = \oint_C \{\,z\,\delta t + p_2\,\delta q_2 + p_3\,\delta q_3 + \ldots + p_n\,\delta q_n - K\,\delta q_1\,\}. \tag{29}$$

Thus, in the new variables, the integral I again has the form of the Poincare-Cartan integral, but the role of the time is now played by the variable q1, and in place of the earlier energy H we have the momentum p1 taken with reversed sign, i.e., K. Hence the motion of the system in the new variables is described by the following Hamiltonian system of differential equations :

$$\frac{dt}{dq_1} = \frac{\partial K}{\partial z}, \qquad \frac{dz}{dq_1} = -\frac{\partial K}{\partial t}, \tag{30}$$

$$\frac{dq_j}{dq_1} = \frac{\partial K}{\partial p_j}, \qquad \frac{dp_j}{dq_1} = -\frac{\partial K}{\partial q_j} \tag{31}$$

for j = 2, 3,…, n, with q1 as the independent variable.

Illustration : Consider a linear oscillator, for which

$$H = \frac{p^2}{2m} + \frac{c\,q^2}{2}. \tag{1}$$

To form the canonical equations with q as the independent variable, we put

$$z = -\left(\frac{p^2}{2m} + \frac{c\,q^2}{2}\right). \tag{2}$$

Equation (2) implies

$$p = \sqrt{-m\,(2z + c\,q^2)}. \tag{3}$$

Thus we have

$$K = -\sqrt{-m\,(2z + c\,q^2)}. \tag{4}$$

The corresponding canonical equations are

$$\frac{dt}{dq} = \frac{\sqrt{m/c}}{\sqrt{-2(z/c) - q^2}}, \tag{5}$$

$$\frac{dz}{dq} = 0. \tag{6}$$

Solving (6), we get

$$z = \text{constant} = -h, \ \text{say}. \tag{7}$$

From equations (5) and (7), we find

$$\sqrt{\frac{c}{m}}\int dt = \int \frac{dq}{\sqrt{(2h/c) - q^2}}.$$

This gives

$$wt + \alpha = \sin^{-1}\left(q\,\sqrt{\frac{c}{2h}}\right), \tag{8}$$

where

$$w = \sqrt{c/m}, \tag{9}$$

and α is the constant of integration. Equation (8) implies

$$q = A\,\sin(wt + \alpha), \tag{10}$$

where $A = \sqrt{2h/c}$.

The solution of the canonical equations (5) and (6) consists of the relations (7) and (10).
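The result of the illustration can be checked directly. The sketch below assumes sample values for m, c, h and α, evaluates the solution (10) at one instant on the branch where dq/dt > 0, and compares it with the energy integral (7) and with the canonical equation (5). All numerical values and names are illustrative assumptions.

```python
# Sketch: check of the oscillator illustration for assumed parameter values.
import math

m, c = 2.0, 8.0        # assumed mass and stiffness
h, alpha = 1.0, 0.3    # assumed energy constant and phase

w = math.sqrt(c / m)          # equation (9)
A = math.sqrt(2.0 * h / c)    # amplitude in equation (10)


def q(t):
    return A * math.sin(w * t + alpha)


def p(t):
    # p = m dq/dt along the solution (10)
    return m * A * w * math.cos(w * t + alpha)


t = 0.2  # w*t + alpha = 0.7, so cos > 0 and dq/dt > 0

energy = p(t) ** 2 / (2.0 * m) + c * q(t) ** 2 / 2.0      # should equal h
dq_dt = A * w * math.cos(w * t + alpha)
dt_dq = math.sqrt(m / c) / math.sqrt(2.0 * h / c - q(t) ** 2)  # (5) with z = -h

print(energy, dq_dt * dt_dq)
```

The product `dq_dt * dt_dq` should be 1, which confirms that (10) satisfies (5) once z = −h is substituted.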

6.3 WHITTAKER’S EQUATIONS

We consider a generalized conservative system, for which the Hamiltonian function H is not explicitly dependent on time, i.e.,

$$H = H(q_i, p_i) \tag{1}$$

and

$$\frac{\partial H}{\partial t} = 0. \tag{2}$$

We know that Hamilton’s canonical equations are

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \tag{3}$$

for i = 1, 2,…, n. Differentiating (1), we write

$$\frac{dH}{dt} = \sum_{i=1}^{n}\left(\frac{\partial H}{\partial q_i}\,\frac{dq_i}{dt} + \frac{\partial H}{\partial p_i}\,\frac{dp_i}{dt}\right) = 0, \tag{4}$$

by virtue of (2) and (3). Integrating (4), we get

$$H(q_i, p_i) = \text{constant} = h, \ \text{say}, \tag{5}$$

during the motion of the system. The function H is called the generalized total energy, and relation (5) is termed the generalized integral of energy.

We consider the ordinary 2n-dimensional phase space, in which the quantities qi and pi, 1 ≤ i ≤ n, are the coordinates of a point. We confine ourselves to only those points of the phase space whose coordinates satisfy equation (5) with a fixed value of the constant h, say h0. In other words, we confine ourselves solely to those states of the system to which the given magnitude of the total energy corresponds :

$$H = H(q_i^0, p_i^0) = h_0. \tag{6}$$

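Since H = h0 is constant on every closed contour formed by such states, the term with δt in the Poincare-Cartan integral vanishes, and the basic integral invariant I for a generalized conservative system is

$$I = \oint \sum_{i=1}^{n} p_i\,\delta q_i. \tag{7}$$

We solve equation (5) for one of the momenta, say p1, to get

$$p_1 = -K(q_1, q_2, \ldots, q_n, p_2, p_3, \ldots, p_n, h_0), \tag{8}$$

and put the expression (8) for p1 into the integral (7). We have

$$I = \oint \left(\sum_{j=2}^{n} p_j\,\delta q_j - K\,\delta q_1\right). \tag{9}$$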

But the integral invariant (9) again has the form of the Poincare-Cartan integral if it is assumed that the basic coordinates and momenta are the quantities qj and pj for j = 2, 3,…, n, that the variable q1 plays the role of the time variable, and that instead of H we have the function K. Consequently, the motion of a generalized conservative system satisfies the following Hamiltonian system of 2n−2 differential equations :

$$\frac{dq_j}{dq_1} = \frac{\partial K}{\partial p_j}, \qquad \frac{dp_j}{dq_1} = -\frac{\partial K}{\partial q_j} \tag{10}$$

for j = 2, 3,…, n. The differential equations (10) were obtained by Whittaker. For this reason, equations (10) are called Whittaker’s equations.
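As a brief worked example, not taken from the text, consider a free particle of mass m moving in the plane, with Cartesian coordinates q1, q2 and with p1 > 0 assumed. The steps above can then be carried out explicitly:

$$H = \frac{p_1^2 + p_2^2}{2m} = h_0 \quad\Longrightarrow\quad p_1 = \sqrt{2mh_0 - p_2^2}, \qquad K = -\sqrt{2mh_0 - p_2^2},$$

$$\frac{dq_2}{dq_1} = \frac{\partial K}{\partial p_2} = \frac{p_2}{\sqrt{2mh_0 - p_2^2}} = \frac{p_2}{p_1}, \qquad \frac{dp_2}{dq_1} = -\frac{\partial K}{\partial q_2} = 0.$$

Thus p2 is constant, the slope dq2/dq1 is constant, and the trajectories are straight lines, as expected for free motion. Whittaker's equations give only the shape of the trajectory; the dependence on time has to be found separately.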


6.4 JACOBI EQUATIONS

We know that for an ordinary conservative system, Whittaker’s equations are

$$\frac{dq_j}{dq_1} = \frac{\partial K}{\partial p_j}, \qquad \frac{dp_j}{dq_1} = -\frac{\partial K}{\partial q_j} \tag{1}$$

for j = 2, 3,…, n, and

$$H = \text{constant} = h. \tag{1a}$$

Integrating the system (1), we find the qj and pj as functions of the variable q1 and of 2n−2 arbitrary constants c1, c2,…, c2n−2.

Moreover, the integrals of Whittaker’s equations will contain an arbitrary constant h0, representing the given magnitude of the total energy of the system. Thus,

$$q_j = \varphi_j(q_1, h_0, c_1, c_2, \ldots, c_{2n-2}), \tag{2}$$

$$p_j = \psi_j(q_1, h_0, c_1, c_2, \ldots, c_{2n-2}) \tag{3}$$

for j = 2, 3,…, n. We know that

$$p_1 = -K(q_1, q_2, \ldots, q_n, p_2, p_3, \ldots, p_n, h_0). \tag{4}$$

Substituting (2) and (3) into equation (4), we find

$$p_1 = \psi_1(q_1, h_0, c_1, c_2, \ldots, c_{2n-2}). \tag{5}$$

The dependence of the coordinates on the time t is obtained from the Hamilton equation

$$\frac{dq_1}{dt} = \frac{\partial H}{\partial p_1}, \tag{6}$$

which gives

$$t = \int \frac{dq_1}{(\partial H/\partial p_1)} + c_{2n-1}, \tag{7}$$

where c2n−1 is the constant of integration, and all the variables in the partial derivative (∂H/∂p1) are expressed in terms of q1 with the help of equations (2) to (4).

We now introduce the notation

$$q_j' = \frac{dq_j}{dq_1} \tag{8}$$

for j = 1, 2,…, n. Then q1′ = 1.

Let

$$P = P(q_1, q_2, \ldots, q_n, q_2', q_3', \ldots, q_n') = \sum_{j=2}^{n} p_j\,q_j' - K. \tag{9}$$

By eliminating K and pj from equations (1), (8) and (9), the Hamiltonian system (1) is seen to be equivalent to the following system of equations of the Lagrangian type, in which q1 is the independent variable :

$$\frac{d}{dq_1}\left(\frac{\partial P}{\partial q_j'}\right) - \frac{\partial P}{\partial q_j} = 0 \tag{10}$$

for j = 2, 3,…, n.

The system (10) contains (n−1) second-order equations.

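Next, we transform the expression for the function P by using (4) for p1. We write

$$P = \sum_{j=2}^{n} p_j\,q_j' + p_1. \tag{11}$$

Now

$$P = p_2\,q_2' + p_3\,q_3' + \ldots + p_n\,q_n' + p_1 = p_2\,\frac{dq_2}{dq_1} + p_3\,\frac{dq_3}{dq_1} + \ldots + p_n\,\frac{dq_n}{dq_1} + p_1$$

$$= \frac{1}{\dot q_1}\,(p_2\,\dot q_2 + p_3\,\dot q_3 + \ldots + p_n\,\dot q_n) + p_1 = \frac{1}{\dot q_1}\sum_{i=1}^{n} p_i\,\dot q_i = \frac{1}{\dot q_1}\,(L + H), \tag{12}$$

because

$$\sum_{i=1}^{n} p_i\,\dot q_i - L = H. \tag{13}$$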


We know that for a conservative system

$$L = T - U, \qquad H = T + U, \tag{14}$$

where T is the kinetic energy and U is the potential energy. The kinetic energy T may be written as

$$T = \frac{1}{2}\sum_{i,k=1}^{n} a_{ik}\,\dot q_i\,\dot q_k. \tag{15}$$

Let

$$G(q_1, q_2, \ldots, q_n, q_2', q_3', \ldots, q_n') = \frac{1}{2}\sum_{i,k=1}^{n} a_{ik}\,q_i'\,q_k'. \tag{16}$$

Equations (12) to (16) yield

$$P = \frac{2T}{\dot q_1}, \tag{17}$$

$$H - U = T = G\,\dot q_1^{\,2}. \tag{18}$$

From equations (1a) and (18), we find

$$\dot q_1 = \sqrt{\frac{h - U}{G}}, \tag{19}$$

and from equations (1a), (17) and (19) we get the following expression for the function P :

$$P = 2\sqrt{G\,(h - U)}. \tag{20}$$

Definition : The differential equations (10), in which the function P is of the form (20) and which therefore belong to ordinary (natural) conservative systems, are referred to as JACOBI EQUATIONS.

Remark : Integrating Jacobi’s equations, we determine all the trajectories in the coordinate space (q1, q2,…, qn) :

$$q_j = \varphi_j(q_1, h, c_1, c_2, \ldots, c_{2n-2}). \tag{21}$$

The relation between the coordinates and the time variable is established from equation (19) by integration. We find

$$t = \int \sqrt{\frac{G}{h - U}}\;dq_1 + c_{2n-1}. \tag{22}$$
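As a short worked example under simple assumptions (a free particle of mass m in the plane, with Cartesian coordinates q1 = x, q2 = y and U = 0; this case is not treated in the text), formulas (16), (20), (10) and (22) give

$$G = \frac{m}{2}\,(1 + y'^2), \qquad P = 2\sqrt{G\,h} = \sqrt{2mh\,(1 + y'^2)},$$

$$\frac{d}{dx}\left(\frac{\partial P}{\partial y'}\right) = \frac{d}{dx}\left(\sqrt{2mh}\,\frac{y'}{\sqrt{1 + y'^2}}\right) = 0 \quad\Longrightarrow\quad y' = \text{constant},$$

$$t = \int \sqrt{\frac{G}{h}}\;dx + c = \sqrt{\frac{m\,(1 + y'^2)}{2h}}\;x + c.$$

The trajectories are straight lines traversed with the constant speed $\sqrt{2h/m}$, in agreement with elementary mechanics.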


6.5 PRINCIPLE OF LEAST ACTION

The Jacobi equations are

$$\frac{d}{dq_1}\left(\frac{\partial P}{\partial q_j'}\right) - \frac{\partial P}{\partial q_j} = 0 \tag{1}$$

for j = 2, 3,…, n, where

$$q_j' = \frac{dq_j}{dq_1}. \tag{2}$$

Here q1 is the independent variable and plays the role of the time. The Lagrange action, denoted by W*, is defined as

$$W^* = \int_{q_1^0}^{q_1'} P\,dq_1. \tag{3}$$

Here we consider all the motions of a generalized conservative system that transfer the system from a given initial position $q_i^0$ to a specified terminal position $q_i'$ (figure below). The instants of time t0 and t1 are not fixed and may vary when passing from the straight path to circuitous paths.

[Figure: paths from the initial position qi0 to the terminal position qi′ in the coordinate space; axes q1, q2 and q3.]

Statement (Principle of least action). The variation of the Lagrange action W* is zero for the straight path.

Proof : We note that Jacobi equations (1) are Lagrangian type equations. So

by Hamilton principle,

δW* = 0

for the straight-line path.


Remark 1 : We have

$$W^* = \int_{t_0}^{t_1} (L + H)\,dt = \int_{t_0}^{t_1} L\,dt + h\int_{t_0}^{t_1} dt = W + h\,(t_1 - t_0).$$

Remark 2 : For an ordinary system, the Lagrange action W* has the following form :

$$W^* = 2\int_{t_0}^{t_1} T\,dt = \int_{t_0}^{t_1} \sum_{k=1}^{N} m_k\,v_k^2\,dt = \sum_{k=1}^{N} \int_{s_k^0}^{s_k'} m_k\,v_k\,ds_k.$$

6.6 LEE HWA-CHUNG’S THEOREM

Poincare introduced, for the first time, the following integral :

$$I_1 = \oint \sum_{i=1}^{n} p_i\,\delta q_i, \tag{1}$$

taken round a contour C consisting of simultaneous states of a system. Poincare’s integral invariant I1 does not change its value if the contour C is displaced along the tube of straight paths to a contour C′ which again consists of simultaneous states.

It is convenient to consider the integral I1 in the ordinary, non-extended 2n-dimensional phase space (q1, p1, q2, p2,…, qn, pn). In this space we take contours D and D′ (figure below) bounding a tube of straight paths.

[Figure: contours D and D′ on a tube of straight paths in the phase space; axes labelled q1, p1 and pn.]

Here,

$$\oint_{D} \sum_{i=1}^{n} p_i\,\delta q_i = \oint_{D'} \sum_{i=1}^{n} p_i\,\delta q_i.$$

The Poincare integral invariant I1 is called the universal integral invariant.

In 1947, the Chinese scientist Lee Hwa-Chung proved the uniqueness of universal integral invariants. He demonstrated that any other universal integral invariant differs from I1 only by a constant factor.

Statement of Lee Hwa-Chung’s Theorem :

If

$$I' = \oint \sum_{i=1}^{n} \big[\,A_i(t, q_k, p_k)\,\delta q_i + B_i(t, q_k, p_k)\,\delta p_i\,\big]$$

is a universal relative integral invariant, then

$$I' = c\,I_1,$$

where c is a constant and I1 is the Poincare integral.

Note : The term ‘relative’ means that the domain of integration is a closed contour.



Chapter-7

Analytical Mechanics-IV

7.1 CANONICAL TRANSFORMATIONS

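Definition :- The transformation of the 2n-dimensional phase space

$$\tilde q_i = \tilde q_i(t, q_k, p_k), \tag{1}$$

$$\tilde p_i = \tilde p_i(t, q_k, p_k), \tag{2}$$

for i = 1, 2, 3,…, n, with the condition that

$$\frac{\partial(\tilde q_1, \tilde p_1, \tilde q_2, \tilde p_2, \ldots, \tilde q_n, \tilde p_n)}{\partial(q_1, p_1, q_2, p_2, \ldots, q_n, p_n)} \neq 0, \tag{3}$$

is called canonical if it carries any Hamiltonian system

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \tag{4}$$

for i = 1, 2,…, n, again into a Hamiltonian system

$$\frac{d\tilde q_i}{dt} = \frac{\partial \tilde H}{\partial \tilde p_i}, \qquad \frac{d\tilde p_i}{dt} = -\frac{\partial \tilde H}{\partial \tilde q_i} \tag{5}$$

for i = 1, 2,…, n.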

Remark 1 :- In the transformation (1), (2), the time variable t is considered as a parameter.

Remark 2 :- In equations (5), $\tilde H$ is another Hamiltonian function.

Remark 3 :- The importance of studying canonical transformations is due to the fact that these transformations permit replacing a given Hamiltonian system (4) by another Hamiltonian system (5) in which the function $\tilde H$ is of a simpler structure than H.

Remark 4 :- Canonical transformations are sometimes also called contact transformations.

Result :- The set of all canonical transformations forms a group. If, in a phase space, we perform two canonical transformations in succession, the resulting transformation will again be canonical. Furthermore, the inverse of a canonical transformation is always canonical. The identity transformation

$\tilde q_i = q_i$,

$\tilde p_i = p_i$,

for i = 1, 2,…,n is canonical. Hence the result.

Example 1 :- Consider the transformation

$\tilde q_i = \alpha q_i$,

$\tilde p_i = \beta p_i$,

for i = 1, 2,…,n, where α ≠ 0 and β ≠ 0 are constants. This transformation is canonical and it transforms the system (4) into the system (5) with

$\tilde H = \alpha\beta H$.

Example 2 :- Consider the transformation

$\tilde q_i = \alpha p_i$,

$\tilde p_i = \beta q_i$,

for i = 1, 2, …,n, where α ≠ 0 and β ≠ 0 are constants. This transformation is canonical and it transforms the system (4) into the system (5) with

$\tilde H = -\alpha\beta H$.

Result :- A necessary and sufficient condition for the transformation

$\tilde q_i = \tilde q_i(t, q_k, p_k)$,

$\tilde p_i = \tilde p_i(t, q_k, p_k)$,

for i = 1, 2,…,n with

$\frac{\partial(\tilde q_1, \tilde p_1, \ldots, \tilde q_n, \tilde p_n)}{\partial(q_1, p_1, \ldots, q_n, p_n)} \neq 0$

to be canonical is the existence of a generating function F and some constant C for which

$\sum_{i=1}^{n} \tilde p_i\,\delta \tilde q_i - \tilde H\,\delta t = C\left(\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right) - \delta F$

is identically satisfied by virtue of the above transformation.

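Note 1 :- The constant C is called the valence of the canonical transformation under consideration. The canonical transformation is called univalent if C = 1 for it.

Note 2 :- $\delta F = \sum_{i=1}^{n}\left(\frac{\partial F}{\partial q_i}\,\delta q_i + \frac{\partial F}{\partial p_i}\,\delta p_i\right) + \frac{\partial F}{\partial t}\,\delta t$.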

Note 3 :- In the literature, only univalent canonical transformations are frequently considered, and many authors erroneously hold that these transformations exhaust all the transformations that carry Hamiltonian systems again into Hamiltonian systems.

7.2 FREE CANONICAL TRANSFORMATION

Definition :- A canonical transformation is called a free canonical transformation if the inequality

$\frac{\partial(\tilde q_1, \tilde q_2, \ldots, \tilde q_n)}{\partial(p_1, p_2, \ldots, p_n)} \neq 0$ …(1)

holds additionally.

Note 1 :- The inequality (1) for a free transformation ensures the independence of the quantities

$t, q_1, q_2, \ldots, q_n, \tilde q_1, \tilde q_2, \ldots, \tilde q_n$,

which can now be taken as the basic variables. Further, the generalized momenta p1, p2,…, pn can now be expressed in terms of the 2n+1 quantities $t, q_i, \tilde q_i$ for i = 1, 2,…, n. Consequently, the generating function F for a free canonical transformation is represented as

$F(t, q_i, p_i) = S(t, q_i, \tilde q_i)$. …(2)

For a univalent (C = 1) free canonical transformation we obtain the following formulas:

$\frac{\partial S}{\partial q_i} = p_i$, …(3)

$\frac{\partial S}{\partial \tilde q_i} = -\tilde p_i$, …(4)

$\tilde H = H + \frac{\partial S}{\partial t}$. …(5)

In equations (3) and (4), i = 1, 2,…, n.

Example :- The canonical transformation

$\tilde q_i = \alpha p_i$,

$\tilde p_i = \beta q_i$,

for i = 1, 2,…,n, with α ≠ 0, β ≠ 0, is free.

Remark :- For a natural system, the coordinates q1, q2,…, qn define the position of the system, and together with the momenta p1, p2, …, pn they define the state of the system, that is, the positions and velocities of its points. This specificity of the coordinates is lost in a general-type canonical transformation. The quantities $\tilde q_1, \tilde q_2, \ldots, \tilde q_n$ no longer define the position of the system, and only together with $\tilde p_1, \tilde p_2, \ldots, \tilde p_n$ do they define the state of the system. The variables $\tilde q_1, \tilde q_2, \ldots, \tilde q_n$ will as before define the position of the system only in the particular case of a point canonical transformation, for which the functions $\tilde q_i(t, q_k, p_k)$ actually do not contain the momenta:

$\tilde q_i = \tilde q_i(t, q_k)$

for i = 1, 2,…,n.

Note that, in what follows, the transformation of an arbitrary Hamiltonian system into a system whose function $\tilde H$ has a simple structure is effected with the aid of a free canonical transformation. But a free canonical transformation is not a point transformation. Thus, non-point canonical transformations play an essential role in the theory of Hamiltonian systems.

7.3 THE HAMILTON−JACOBI EQUATION

Let there be given a holonomic system whose motion obeys the canonical equations of Hamilton :

$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}$, …(1)

$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}$, …(2)

for i = 1, 2, …,n. We shall try to determine a free univalent canonical transformation such that, in the transformed Hamiltonian system

$\frac{d\tilde q_i}{dt} = \frac{\partial \tilde H}{\partial \tilde p_i}$, …(3)

$\frac{d\tilde p_i}{dt} = -\frac{\partial \tilde H}{\partial \tilde q_i}$, …(4)

for i = 1, 2,…,n, the function $\tilde H$ is identically zero, i.e.,

$\tilde H \equiv 0$. …(5)

Using (5), equations (3) and (4) reduce to

$\frac{d\tilde q_i}{dt} = 0, \qquad \frac{d\tilde p_i}{dt} = 0$, …(6)

for i = 1, 2,…,n. Integrating (6), we get

$\tilde q_i = \alpha_i, \qquad \tilde p_i = \beta_i$, …(7)

where αi and βi are 2n arbitrary constants. That is, the new variables are identically constant.

Knowing the canonical transformation, i.e., the relation between qi, pi and $\tilde q_i, \tilde p_i$, we can express all the qi and pi as functions of the time t and of the 2n arbitrary constants αk, βk (k = 1, 2,…,n). That is, we can find the final equations of motion of the given holonomic system completely, i.e., all the solutions of the system (1) and (2).

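We know that for such a univalent free canonical transformation, there exists a generating function $S = S(t, q_i, \tilde q_i)$ for which

$\tilde H = H + \frac{\partial S}{\partial t}$, …(8)

$\frac{\partial S}{\partial q_i} = p_i, \qquad \frac{\partial S}{\partial \tilde q_i} = -\tilde p_i$ …(9)

for i = 1, 2,…,n.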

Using (5), (8) and (9), we write

$\frac{\partial S}{\partial t} + H\left(t, q_i, \frac{\partial S}{\partial q_i}\right) = 0$. …(10)

The partial differential equation (10) is called the Hamilton-Jacobi equation.

The form of this equation is very simple to write down. Given a specific Hamiltonian function H = H(t, qi, pi), the momentum components are formally replaced by the partial derivatives of S, as in equation (9). The result is a first order partial differential equation. By assumption, the new generalised coordinates $\tilde q_i$ are constants, $\tilde q_i = \alpha_i$. Hence, the generating function has the form

S = S(t, q1, q2,…, qn, α1, α2,…, αn), …(11)

dependent on the original coordinates and possibly on time. So S is a function of n + 1 variables and n parameters. Solving the Hamilton-Jacobi equation (10) is equivalent to solving the original canonical equations of motion. Besides the Hamilton-Jacobi equation (10), the condition

$\det\left(\frac{\partial^2 S}{\partial q_i\,\partial \alpha_j}\right) \neq 0$, …(12)

must hold for the generating function. As soon as the generating function S(t, qi, αi) is found, the formulas (9) define the required univalent free canonical transformation.

Definition :- The solution of partial differential equation (10) of Hamilton-Jacobi containing n arbitrary constants α1, α2,…, αn is called the complete integral of this equation if the condition (12) is fulfilled.

In the above, we have proved the following theorem :

Theorem (Jacobi’s Theorem)

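Statement : If S(t, qi, αi) is some complete integral of the Hamilton−Jacobi equation

$\frac{\partial S}{\partial t} + H\left(t, q_i, \frac{\partial S}{\partial q_i}\right) = 0$,

then the final equations of motion of a holonomic system with the given function H may be written in the form

$\frac{\partial S}{\partial q_i} = p_i, \qquad \frac{\partial S}{\partial \alpha_i} = \beta_i$

for i = 1, 2,…, n, where αi and βi are arbitrary constants (the minus sign in the second formula of (9) has been absorbed into the arbitrary constants βi).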

Remark :- A knowledge of the complete integral of the partial differential equation (10) relieves us of the necessity of integrating the system of ordinary differential equations in (1) and (2).
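As an illustration of Jacobi's theorem, the following minimal sketch takes a free particle with H = p²/(2m), which is an assumed example and not one treated in the text. For it, S = αq − α²t/(2m) is a complete integral, since ∂²S/∂q∂α = 1 ≠ 0. The names `M`, `action`, `hj_residual` and `trajectory` are hypothetical helpers introduced only for this check.

```python
M = 2.0  # particle mass; an arbitrary illustrative value


def hamiltonian(q, p):
    """Free particle (assumed example): H = p^2 / (2m)."""
    return p * p / (2.0 * M)


def action(t, q, alpha):
    """Candidate complete integral S(t, q, alpha) = alpha*q - alpha^2 t / (2m)."""
    return alpha * q - alpha * alpha * t / (2.0 * M)


def partial(func, args, k, step=1e-5):
    """Central-difference partial derivative of func with respect to args[k]."""
    up = list(args)
    down = list(args)
    up[k] += step
    down[k] -= step
    return (func(*up) - func(*down)) / (2.0 * step)


def hj_residual(t, q, alpha):
    """Left-hand side of (10): dS/dt + H(q, dS/dq); should vanish."""
    s_t = partial(action, (t, q, alpha), 0)
    s_q = partial(action, (t, q, alpha), 1)
    return s_t + hamiltonian(q, s_q)


def trajectory(t, alpha, beta):
    """Solve dS/dalpha = q - alpha*t/m = beta for q."""
    return beta + alpha * t / M
```

Here the momentum is p = ∂S/∂q = α, and the trajectory q(t) = β + αt/m is the expected uniform motion with velocity ∂H/∂p = α/m.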

Conservative System

When the Hamiltonian function H does not depend explicitly on time, the Hamiltonian itself is a constant of motion:

H(qi, pi) = h, …(1)

in which h is an energy constant. In this case the Hamilton−Jacobi equation reduces to

$\frac{\partial S}{\partial t} + H\left(q_i, \frac{\partial S}{\partial q_i}\right) = 0$, …(2)

in which

S = S(t, q1, q2,…, qn, α1, α2,…, αn) …(3)

is the generating function for a free canonical transformation with n parameters α1, α2,…, αn. From equations (1) and (2),

$\frac{\partial S}{\partial t} + h = 0$. …(4)

We assume the following special form for the time dependence of S:

S = −ht + V(q1, q2,…, qn, α1, α2,…,αn). …(5)

Substituting (5) into equation (2) gives

$H\left(q_i, \frac{\partial V}{\partial q_i}\right) = h$. …(6)

Note 1 : Equation (6) is called the reduced Hamilton−Jacobi equation.

This is a first order partial differential equation in n independent variables.

Note 2: Since the solution of (6) already depends on the energy parameter h, there is no loss of generality in taking one of the parameters, say αn, to be the constant h, that is, αn = h, so

V = V(q1, q2,…, qn; α1, α2,…, αn−1, h). …(7)

Note 3: We have the following final equations of motion of a generalised conservative system:

$\frac{\partial V}{\partial q_i} = p_i$, 1 ≤ i ≤ n, …(8)

$\frac{\partial V}{\partial \alpha_j} = \beta_j$, 1 ≤ j ≤ n−1, …(9)

$\frac{\partial V}{\partial h} = t + \gamma$, …(10)

where αj, βj, h and γ are arbitrary constants.
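To make equation (10) concrete, here is a small sketch for a one-dimensional harmonic oscillator, H = (p² + ω²q²)/2. This system is chosen purely as an illustration and is not taken from the text. The reduced equation gives V(q, h) = ∫ √(2h − ω²q²) dq, and ∂V/∂h = t + γ should reproduce q(t) = (√(2h)/ω) sin ω(t + γ). The names `OMEGA`, `V`, `dV_dh` and `q_of_t` are hypothetical.

```python
import math

OMEGA = 1.5  # oscillator frequency; an illustrative value


def V(q, h, steps=2000):
    """V(q, h) = integral from 0 to q of sqrt(2h - omega^2 x^2) dx (midpoint rule)."""
    dx = q / steps
    total = 0.0
    for k in range(steps):
        x_mid = (k + 0.5) * dx
        total += math.sqrt(2.0 * h - (OMEGA * x_mid) ** 2)
    return total * dx


def dV_dh(q, h, eps=1e-5):
    """Central-difference approximation of the left-hand side of (10)."""
    return (V(q, h + eps) - V(q, h - eps)) / (2.0 * eps)


def q_of_t(t, h, gamma):
    """Motion obtained by solving dV/dh = t + gamma for q."""
    return math.sqrt(2.0 * h) / OMEGA * math.sin(OMEGA * (t + gamma))
```

Evaluating `dV_dh(q_of_t(t, h, gamma), h)` should return approximately t + γ as long as q stays strictly between the turning points.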

7.4 METHOD OF SEPARATION OF VARIABLES

The Hamilton − Jacobi theory offers an elegant method by which the canonical equations can be solved by reducing the dynamical problem to that of finding a solution to a partial differential equation. Unfortunately, there is no general technique for construction of complete solutions to partial differential equations.

The most practical method, which works for certain classes of differential equations, is the method of separation of variables. In this method, we search for solutions in which the independent variables of the given PDE are grouped together so that the original problem converts into a collection of problems involving only ODE.

Under certain circumstances, it is feasible to assume certain convenient forms for the solution to the Hamilton−Jacobi equation. We consider the conservative Hamiltonian for which the generating function S is

S = −h t + V, …(1)

where

$H\left(q_i, \frac{\partial V}{\partial q_i}\right) = h$. …(2)

Let

H(qi, pi) = G[f1(q1, p1),…, fn (qn, pn)] . …(3)

Here, the variables in the expression for the function H are separated. Equations (2) and (3) imply

$G\left[f_1\left(q_1, \frac{\partial V}{\partial q_1}\right), \ldots, f_n\left(q_n, \frac{\partial V}{\partial q_n}\right)\right] = h$, …(4)

since

$p_i = \frac{\partial V}{\partial q_i}$. …(5)

We now put

$f_i\left(q_i, \frac{\partial V}{\partial q_i}\right) = \alpha_i$ …(6)

for i = 1, 2, …, n. The constants in (6) are otherwise arbitrary but, by (4), must satisfy the relation

G(α1, α2,…, αn) = h. …(7)

Solving (6) for $\partial V/\partial q_i$, we find

$\frac{\partial V}{\partial q_i} = F_i(q_i, \alpha_i)$ …(8)

for i = 1, 2,…, n. Consequently, we obtain

$V = \sum_{i=1}^{n} \int F_i(q_i, \alpha_i)\,dq_i$ …(9)

and then

$S = -G(\alpha_1, \alpha_2, \ldots, \alpha_n)\,t + \sum_{i=1}^{n} \int F_i(q_i, \alpha_i)\,dq_i$. …(10)

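Remark 1. We find

$\frac{\partial^2 S}{\partial q_i\,\partial \alpha_i} = \frac{\partial F_i}{\partial \alpha_i}$ …(11)

for i = 1, 2,…, n, and

$\frac{\partial^2 S}{\partial q_i\,\partial \alpha_k} = 0$ …(12)

for i ≠ k and i, k = 1, 2,…, n. Hence, the condition

$\det\left(\frac{\partial^2 S}{\partial q_i\,\partial \alpha_k}\right) \neq 0$ …(13)

reduces (for this method) to

$\prod_{i=1}^{n} \frac{\partial F_i}{\partial \alpha_i} \neq 0$. …(14)

Remark 2.

$\frac{\partial F_i}{\partial \alpha_i} = \left(\frac{\partial f_i}{\partial p_i}\right)^{-1} \neq 0$, …(15)

for i = 1, 2,…,n.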

Remark 3. The formula (10) defines a complete integral of the reduced Hamilton −Jacobi equations for a conservative system.
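The separation procedure can be checked numerically on a simple assumed case: two uncoupled oscillators with f_i(q_i, p_i) = (p_i² + ω_i² q_i²)/2 and G(α1, α2) = α1 + α2. This choice, and the names `OMEGAS`, `f`, `F`, `G`, `hamiltonian` and `dF_dalpha`, are illustrative assumptions and do not come from the text. The sketch verifies relation (7) and the identity of Remark 2.

```python
import math

OMEGAS = (1.0, 2.0)  # frequencies of the two assumed oscillators


def f(i, q, p):
    """Separated one-variable function f_i(q_i, p_i)."""
    return 0.5 * (p * p + (OMEGAS[i] * q) ** 2)


def F(i, q, alpha):
    """Solution of f_i(q, p) = alpha for p (positive branch), as in (8)."""
    return math.sqrt(2.0 * alpha - (OMEGAS[i] * q) ** 2)


def G(alphas):
    """Outer function in H = G[f_1, ..., f_n]; here simply a sum."""
    return sum(alphas)


def hamiltonian(qs, ps):
    """H(q, p) = G[f_1(q_1, p_1), f_2(q_2, p_2)]."""
    return G([f(i, q, p) for i, (q, p) in enumerate(zip(qs, ps))])


def dF_dalpha(i, q, alpha, eps=1e-6):
    """Central-difference derivative of F_i with respect to alpha_i."""
    return (F(i, q, alpha + eps) - F(i, q, alpha - eps)) / (2.0 * eps)
```

For this family ∂f_i/∂p_i = p_i, so Remark 2 predicts ∂F_i/∂α_i = 1/F_i wherever the momentum is non-zero.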

7.5 LAGRANGE BRACKETS

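Let φi and ψi, for i = 1, 2,…,n, be 2n functions of the two variables q and p. We define

$[q\,p] = \sum_{j=1}^{n} \frac{\partial(\varphi_j, \psi_j)}{\partial(q, p)} = \sum_{j=1}^{n}\left(\frac{\partial \varphi_j}{\partial q}\frac{\partial \psi_j}{\partial p} - \frac{\partial \varphi_j}{\partial p}\frac{\partial \psi_j}{\partial q}\right)$. …(1)

The expressions [q p] are called Lagrange brackets.

We find

[q p] = −[p q], …(2)

which is the antisymmetry property.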

Note : Comparing Lagrange brackets with Poisson brackets, we find that there were two functions φ and ψ of 2n variables qi, pi for Poisson brackets, whereas, there are 2n functions φi, ψi of two variables p, q for Lagrange brackets.

Canonical character of a transformation in terms of Lagrange brackets

We shall now derive the necessary and sufficient conditions that must be satisfied by 2n independent functions

$\tilde q_i = \varphi_i(t, q_k, p_k)$,

$\tilde p_i = \psi_i(t, q_k, p_k)$, …(1)

for i, k = 1, 2,…,n, so that the transformation defined in (1) is canonical.

We know that the necessary and sufficient condition for the transformation (1) to be canonical is the existence of a generating function F = F(t, qi, pi) and some constant c for which the following identity holds:

$\sum_{i=1}^{n} \tilde p_i\,\delta \tilde q_i - \tilde H\,\delta t = c\left(\sum_{i=1}^{n} p_i\,\delta q_i - H\,\delta t\right) - \delta F(t, q_i, p_i)$. …(2)

We assume that the transformation (1) is canonical. Then, the identity (2) holds. We take an arbitrary fixed value $t = \bar t$ (so δt = 0) in (2). We write

$\sum_{i=1}^{n} \tilde p_i\,\delta \tilde q_i = c \sum_{i=1}^{n} p_i\,\delta q_i - \delta F(\bar t, q_i, p_i)$. …(3)

But (3) is the defining identity for a transformation that does not contain the time explicitly,

$\tilde q_i = \varphi_i(\bar t, q_k, p_k), \qquad \tilde p_i = \psi_i(\bar t, q_k, p_k)$ …(4)

for i = 1, 2,…, n.

Hence, formulas (4) define a canonical transformation with valence c, which is independent of the chosen value $\bar t$.

Conversely, let it now be given that all transformations obtained from the transformation (1) by replacing the variable t by various fixed values $\bar t$ are canonical, with one and the same valence c. Then, defining the function $\tilde H$ by the equation

$\tilde H = cH + \frac{\partial F}{\partial t} + \sum_{i=1}^{n} \tilde p_i \frac{\partial \tilde q_i}{\partial t}$, …(5)

we get equation (2) from equations (3) and (5). Thus, we find that the transformation (1), which depends on the time t, is canonical.

Hence, for the time-dependent transformation (1) to be canonical it is necessary and sufficient that all the time-independent transformations obtained from the transformation (1) by replacing t with an arbitrary fixed value $\bar t$ be canonical, with one and the same valence c. For this reason, when establishing tests for canonical character, we can confine ourselves to canonical transformations that do not contain the time variable t explicitly :

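$\tilde q_i = \varphi_i(q_k, p_k)$, …(6)

$\tilde p_i = \psi_i(q_k, p_k)$, …(7)

with

$\frac{\partial(\tilde q_1, \tilde q_2, \ldots, \tilde q_n, \tilde p_1, \ldots, \tilde p_n)}{\partial(q_1, q_2, \ldots, q_n; p_1, \ldots, p_n)} \neq 0$, …(8)

for i = 1, 2,…,n.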

For the above canonical transformation (6) to (8), the defining identity (2) is now written as

$\sum_{k=1}^{n} \tilde p_k\,\delta \tilde q_k = C \sum_{k=1}^{n} p_k\,\delta q_k - \delta K(q_k, p_k)$. …(9)

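From (6), we write

$\delta \tilde q_k = \sum_{i=1}^{n}\left(\frac{\partial \tilde q_k}{\partial q_i}\,\delta q_i + \frac{\partial \tilde q_k}{\partial p_i}\,\delta p_i\right)$. …(10)

Let

$\Phi_i = \sum_{k=1}^{n} \tilde p_k \frac{\partial \tilde q_k}{\partial q_i} - C p_i$, …(11)

$\Psi_i = \sum_{k=1}^{n} \tilde p_k \frac{\partial \tilde q_k}{\partial p_i}$, …(12)

for i = 1, 2,…,n. Using (10)−(12) in (9), we obtain

$\sum_{i=1}^{n} (\Phi_i\,\delta q_i + \Psi_i\,\delta p_i) = -\delta K(q_k, p_k)$. …(13)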

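The conditions for the left-hand side of (13) to be an exact differential are

$\frac{\partial \Phi_i}{\partial q_k} = \frac{\partial \Phi_k}{\partial q_i}$, …(14a)

$\frac{\partial \Psi_i}{\partial p_k} = \frac{\partial \Psi_k}{\partial p_i}$, …(14b)

$\frac{\partial \Phi_i}{\partial p_k} = \frac{\partial \Psi_k}{\partial q_i}$, …(14c)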

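where i, k = 1, 2,…,n. Substituting the expressions for Φi and Ψi from equations (11) and (12) into equations (14a−c), we obtain

$\sum_{j=1}^{n}\left(\frac{\partial \tilde q_j}{\partial q_i}\frac{\partial \tilde p_j}{\partial q_k} - \frac{\partial \tilde q_j}{\partial q_k}\frac{\partial \tilde p_j}{\partial q_i}\right) = 0$, …(15a)

$\sum_{j=1}^{n}\left(\frac{\partial \tilde q_j}{\partial p_i}\frac{\partial \tilde p_j}{\partial p_k} - \frac{\partial \tilde q_j}{\partial p_k}\frac{\partial \tilde p_j}{\partial p_i}\right) = 0$, …(15b)

$\sum_{j=1}^{n}\left(\frac{\partial \tilde q_j}{\partial q_i}\frac{\partial \tilde p_j}{\partial p_k} - \frac{\partial \tilde q_j}{\partial p_k}\frac{\partial \tilde p_j}{\partial q_i}\right) = C\,\delta_{ik}$, …(15c)

for i, k = 1, 2,…,n.

Here δik is the Kronecker delta. Using Lagrange brackets, the above conditions in (15a−c) can be written as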

[qi qk] = 0,

[pi pk] = 0,

[qi pk] = C δik …(16)

The equalities (16) express the necessary and sufficient conditions for the transformation (6) and (7) to be canonical.

7.6 JACOBIAN MATRIX OF A CANONICAL TRANSFORMATION

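Let

$Q = \left(\frac{\partial \tilde q_i}{\partial q_k}\right)$ …(1)

be the Jacobian matrix of order n. Let

$P = \left(\frac{\partial \tilde p_i}{\partial p_k}\right), \qquad R = \left(\frac{\partial \tilde q_i}{\partial p_k}\right), \qquad S = \left(\frac{\partial \tilde p_i}{\partial q_k}\right)$ …(2)

be other Jacobian matrices, each of order n. Let

$M = \begin{pmatrix} Q & R \\ S & P \end{pmatrix} = \begin{pmatrix} \frac{\partial \tilde q_1}{\partial q_1} & \cdots & \frac{\partial \tilde q_1}{\partial q_n} & \frac{\partial \tilde q_1}{\partial p_1} & \cdots & \frac{\partial \tilde q_1}{\partial p_n} \\ \vdots & & \vdots & \vdots & & \vdots \\ \frac{\partial \tilde q_n}{\partial q_1} & \cdots & \frac{\partial \tilde q_n}{\partial q_n} & \frac{\partial \tilde q_n}{\partial p_1} & \cdots & \frac{\partial \tilde q_n}{\partial p_n} \\ \frac{\partial \tilde p_1}{\partial q_1} & \cdots & \frac{\partial \tilde p_1}{\partial q_n} & \frac{\partial \tilde p_1}{\partial p_1} & \cdots & \frac{\partial \tilde p_1}{\partial p_n} \\ \vdots & & \vdots & \vdots & & \vdots \\ \frac{\partial \tilde p_n}{\partial q_1} & \cdots & \frac{\partial \tilde p_n}{\partial q_n} & \frac{\partial \tilde p_n}{\partial p_1} & \cdots & \frac{\partial \tilde p_n}{\partial p_n} \end{pmatrix}$ …(3)

be the Jacobian matrix of the canonical transformation

$\tilde q_i = \varphi_i(q_k, p_k)$,

$\tilde p_i = \psi_i(q_k, p_k)$. …(4)

Let E be a unit matrix of order n. Let

$J = \begin{pmatrix} O & E \\ -E & O \end{pmatrix}$ …(5)

be a matrix of order 2n.

Then

det J = 1. …(5A)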


Since the transformation (4) is canonical, the following conditions, in terms of Lagrange brackets, hold :

[qi qk] = 0,

[pi pk] = 0,

[qi pk] = C δik …(6)

Using (6), it can be checked (left as an exercise) that

M′ JM = C J. …(7)

For a univalent canonical transformation (C = 1), equation (7) becomes

M′ JM = J. …(8)

Definition 1: The matrix M that satisfies (7) is called a GENERALIZED SYMPLECTIC matrix.

Definition 2: The matrix M for which (8) holds is called SYMPLECTIC.

For such matrices, taking determinants in (7) and using det J = 1 gives (det M)² = C^{2n}, so

det M = ± C^n.

That is, generalized symplectic matrices (C ≠ 0) are non-singular.

Result :- All generalized symplectic matrices (for C ≠ 0) form a group and

det M = ± C^n.

Note :- In view of the above, the test for the canonical character of the transformations may be stated as

“For a certain transformation

$\tilde q_i = \tilde q_i(t, q_k, p_k), \qquad \tilde p_i = \tilde p_i(t, q_k, p_k)$

to be canonical, it is necessary and sufficient that the Jacobian matrix M corresponding to this transformation should be generalized symplectic with constant valence C”.
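The matrix test M′JM = CJ is easy to try out on constant Jacobian matrices. The sketch below is a plain-Python illustration under the assumption that the rows of M are ordered (q̃1,…,q̃n, p̃1,…,p̃n) and the columns (q1,…,qn, p1,…,pn), with J = [[O, E], [−E, O]]. The helper names `matmul`, `transpose`, `standard_j` and `valence` are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def standard_j(n):
    """J = [[O, E], [-E, O]] of order 2n."""
    j = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        j[i][n + i] = 1.0
        j[n + i][i] = -1.0
    return j


def valence(m, tol=1e-12):
    """Return C if M'JM = C J with C != 0, otherwise None."""
    n = len(m) // 2
    j = standard_j(n)
    lhs = matmul(matmul(transpose(m), j), m)
    c = lhs[0][n]  # the entry that must equal C * 1
    proportional = all(abs(lhs[r][s] - c * j[r][s]) <= tol
                       for r in range(2 * n) for s in range(2 * n))
    return c if proportional and abs(c) > tol else None
```

For n = 1, the scaling q̃ = 2q, p̃ = 3p has M = [[2, 0], [0, 3]] and valence 6, while the exchange q̃ = 2p, p̃ = 3q has M = [[0, 2], [3, 0]] and valence −6. For n = 2, stretching only q1 (M = diag(2, 1, 1, 1)) fails the test.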



7.7 CONDITION OF CANONICAL CHARACTER OF A TRANSFORMATION IN TERMS OF POISSON BRACKETS

We know that the condition of canonicity of a transformation

$\tilde q_i = \tilde q_i(t, q_k, p_k)$,

$\tilde p_i = \tilde p_i(t, q_k, p_k)$ …(1)

(i, k = 1, 2,…, n) is that

M′ JM = C J , …(2)

where C ≠ 0 is the valence of the canonical transformation, M is a 2n×2n generalized symplectic matrix and

$J = \begin{pmatrix} O & E \\ -E & O \end{pmatrix}$ …(3)

in which E is a unit matrix of order n. Matrix M is non-singular. Equation (2) gives

$(M')^{-1} (M' J M) M^{-1} = (M')^{-1} (C J) M^{-1}$

or $J = C\,[(M')^{-1} J M^{-1}]$

or $(M')^{-1} J M^{-1} = \frac{1}{C} J$. …(4)

From equation (3), we find (exercise)

$J^{-1} = -J$. …(5)

Taking the inverse of both sides of (4) and using (5), we write

M J M′ = C J. …(6)

The equality (6) may be considered as obtained from equality (2) on replacing the Jacobian matrix M by its transpose M′. In view of the definition of M,


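$M = \begin{pmatrix} \frac{\partial \tilde q_i}{\partial q_k} & \frac{\partial \tilde q_i}{\partial p_k} \\ \frac{\partial \tilde p_i}{\partial q_k} & \frac{\partial \tilde p_i}{\partial p_k} \end{pmatrix}$, …(7)

the above substitution reduces to replacing the derivatives

$\frac{\partial \tilde q_i}{\partial q_k}, \quad \frac{\partial \tilde q_i}{\partial p_k}, \quad \frac{\partial \tilde p_i}{\partial q_k}, \quad \frac{\partial \tilde p_i}{\partial p_k}$,

respectively, by the derivatives

$\frac{\partial \tilde q_k}{\partial q_i}, \quad \frac{\partial \tilde p_k}{\partial q_i}, \quad \frac{\partial \tilde q_k}{\partial p_i}, \quad \frac{\partial \tilde p_k}{\partial p_i}$.

That is, in each derivative the letters and indices above and below are interchanged. We know that, in terms of Lagrange brackets, the equation (2) is equivalent to the following system of equalities :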

[qi qk] = 0,

[pi pk] = 0,

[qi pk] = C δik …(8)

for i, k = 1, 2,…,n. In view of the discussion in the preceding paragraph, the equation (6) will be equivalent to the following system of equalities :

[qi qk]* = 0,

[pi pk]* = 0,

[qi pk]* = C δik …(9)

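Here, the asterisk (*) indicates that the above mentioned interchange of derivatives is to be performed within Lagrange brackets. That is,

$[q_i\,q_k]^* = \left[\sum_{j=1}^{n}\left(\frac{\partial \tilde q_j}{\partial q_i}\frac{\partial \tilde p_j}{\partial q_k} - \frac{\partial \tilde q_j}{\partial q_k}\frac{\partial \tilde p_j}{\partial q_i}\right)\right]^* = \sum_{j=1}^{n}\left(\frac{\partial \tilde q_i}{\partial q_j}\frac{\partial \tilde q_k}{\partial p_j} - \frac{\partial \tilde q_i}{\partial p_j}\frac{\partial \tilde q_k}{\partial q_j}\right) = (\tilde q_i\,\tilde q_k)$, …(10a)

where $(\tilde q_i\,\tilde q_k)$ is the Poisson bracket of the functions $\tilde q_i$ and $\tilde q_k$ with respect to the independent variables q1, p1, q2, p2,…, qn, pn. Similarly,

$[p_i\,p_k]^* = (\tilde p_i\,\tilde p_k)$,

$[q_i\,p_k]^* = (\tilde q_i\,\tilde p_k)$. …(10b)

Hence, the conditions of canonicity of the transformation (1) may be written in the following form in terms of Poisson brackets:

$(\tilde q_i\,\tilde q_k) = 0$,

$(\tilde p_i\,\tilde p_k) = 0$,

$(\tilde q_i\,\tilde p_k) = C\,\delta_{ik}$ …(11)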

for i, k = 1, 2,…,n.
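The Poisson-bracket test (11) can be evaluated numerically for a concrete map. As an assumed example (not from the text), take n = 1 and the action-angle style transformation q̃ = arctan(q/p), p̃ = (q² + p²)/2, for which a direct calculation gives (q̃ p̃) = 1, so the map is univalent. The names `poisson_bracket`, `q_new` and `p_new` are hypothetical.

```python
import math


def poisson_bracket(f, g, q, p, eps=1e-5):
    """(f g) = sum_j (df/dq_j * dg/dp_j - df/dp_j * dg/dq_j), by central differences."""
    def partial(func, wrt_q, j):
        hi_q, lo_q = list(q), list(q)
        hi_p, lo_p = list(p), list(p)
        if wrt_q:
            hi_q[j] += eps
            lo_q[j] -= eps
        else:
            hi_p[j] += eps
            lo_p[j] -= eps
        return (func(hi_q, hi_p) - func(lo_q, lo_p)) / (2.0 * eps)

    return sum(partial(f, True, j) * partial(g, False, j)
               - partial(f, False, j) * partial(g, True, j)
               for j in range(len(q)))


def q_new(q, p):
    """Assumed example: angle-like new coordinate, tan(q_new) = q / p."""
    return math.atan2(q[0], p[0])


def p_new(q, p):
    """Assumed example: action-like new momentum (q^2 + p^2) / 2."""
    return 0.5 * (q[0] ** 2 + p[0] ** 2)
```

At a sample point such as q = 0.6, p = 0.8 the bracket of `q_new` with `p_new` is approximately 1, and the bracket of either function with itself vanishes.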

7.8 INVARIANCE OF THE POISSON BRACKETS IN A CANONICAL TRANSFORMATION

Consider two functions

φ = φ(t, qi, pi), ψ = ψ(t, qi, pi). …(1)

Let

$\tilde q_k = \tilde q_k(q_i, p_i), \qquad \tilde p_k = \tilde p_k(q_i, p_i)$ …(2)

be a canonical transformation and let its inverse be

$q_i = q_i(\tilde q_k, \tilde p_k), \qquad p_i = p_i(\tilde q_k, \tilde p_k)$. …(3)

Substituting for qi and pi in (1) in terms of $\tilde q_k, \tilde p_k$ with the help of (3), we can regard these same functions φ and ψ as functions of the variables $\tilde q_k, \tilde p_k$. Accordingly, the Poisson bracket of φ and ψ may be evaluated both with respect to the variables qi, pi and with respect to the variables $\tilde q_i, \tilde p_i$.

Let (φ ψ) denote the Poisson bracket with respect to the variables qi, pi and $\widetilde{(\varphi\,\psi)}$ denote the same with respect to the variables $\tilde q_i, \tilde p_i$.

We assert that (left as an exercise to the reader)

$(\varphi\,\psi) = C\,\widetilde{(\varphi\,\psi)}$, …(4)

where C is the valence of the canonical transformation.

The converse of the above assertion also holds. That is, if for any two functions φ and ψ, the identity (4) is fulfilled for one and the same constant C ≠ 0, then the transition from the 2n variables qi, pi to the 2n variables $\tilde q_i, \tilde p_i$ is accomplished by a canonical transformation with valence C.

In particular, for a univalent canonical transformation (C = 1), we have

$(\varphi\,\psi) = \widetilde{(\varphi\,\psi)}$. …(5)

This proves that the Poisson brackets are invariant under univalent canonical transformations.

This property of univalent canonical transformations singles out these transformations from among all possible transformations of phase space.



Chapter-8

Nonlinear First-Order PDE

8.1 INTRODUCTION

We shall study PDE of the form

F(Du, u, x) = 0, …(∗)

where x∈U⊂ Rn, U is open, and

u : U →R ,

is the unknown, u = u(x). The function F is given.

Notation : We write

F = F(p, z, x)

= F(p1, p2,…,pn, z, x1, x2,…, xn)

for

p∈Rn, z∈R, x∈U.

Thus

“p” is the name of the variable for which we substitute the gradient Du(x), and “z” is the variable for which we substitute u(x).

We also assume hereafter that F is smooth, and set

$D_pF = (F_{p_1}, F_{p_2}, \ldots, F_{p_n})$,

$D_zF = F_z$,

$D_xF = (F_{x_1}, F_{x_2}, \ldots, F_{x_n})$.

We are concerned with discovering solutions u of the PDE (∗) in U, usually subject to the boundary condition

u = g on Γ, …(∗∗)


where Γ is some given subset of ∂U and

g : Γ→R …(***)

is prescribed.

8.2 COMPLETE INTEGRALS

Consider the nonlinear first-order PDE

F(Du, u, x) = 0. …(1)

Suppose first A ⊂ Rn is an open set. Assume for each parameter

a = (a1, a2,…, an) ∈A,

we have a C2 solution

u = u(x ; a) …(2)

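of the PDE (1). We write

$(D_a u, D^2_{xa} u) = \begin{pmatrix} u_{a_1} & u_{x_1 a_1} & \cdots & u_{x_n a_1} \\ u_{a_2} & u_{x_1 a_2} & \cdots & u_{x_n a_2} \\ \vdots & \vdots & & \vdots \\ u_{a_n} & u_{x_1 a_n} & \cdots & u_{x_n a_n} \end{pmatrix}$. …(3)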

Definition : A C2 function u = u(x ; a) is called a complete integral in U×A, provided

(i) u(x ; a) solves the PDE (1) for each a∈A ,

(ii) rank $(D_a u, D^2_{xa} u)$ = n ,

for x∈U, a∈A .

Remark : The condition (ii) above ensures that u = u(x ; a) “depends on all the n independent parameters a1, a2,…, an”.

Example 1. Clairaut’s equation is the nonlinear PDE

x⋅Du + f(Du) = u, …(1)

where

f : Rn→R is given.

A complete integral of PDE (1) is

u(x ; a) = a⋅x + f(a) , …(2)

for all x∈U and a∈Rn.

Example 2 : The eikonal equation is the nonlinear PDE

|Du| = 1. …(3)

A complete integral is

u(x ; a, b) = a ⋅ x + b …(4)

for all x∈U ⊂ Rn, a∈∂B(0, 1), b∈R.

Example 3 : The Hamilton−Jacobi equation from mechanics is, in its simplest form,

the PDE (nonlinear)

ut + H(Du) = 0, …(5)

where

H : Rn → R

is given.

In PDE(5), u depends on x = (x1, x2,…, xn) ∈Rn and t∈R.

A complete integral of PDE (5) is

u(x, t; a, b) = a⋅x − t H(a) + b, …(6)

for x∈Rn, t ≥ 0 and a∈Rn, b∈R.
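The complete integral (6) can be checked numerically. The sketch below assumes the illustrative Hamiltonian H(p) = |p|²/2, which the text does not prescribe, and evaluates the residual u_t + H(Du) by central differences. The function names `H`, `u` and `residual` are hypothetical.

```python
def H(p):
    """Assumed example Hamiltonian H(p) = |p|^2 / 2."""
    return 0.5 * sum(c * c for c in p)


def u(x, t, a, b):
    """Complete integral (6): u = a.x - t H(a) + b."""
    return sum(ai * xi for ai, xi in zip(a, x)) - t * H(a) + b


def residual(x, t, a, b, eps=1e-5):
    """u_t + H(Du), with derivatives taken by central differences."""
    u_t = (u(x, t + eps, a, b) - u(x, t - eps, a, b)) / (2.0 * eps)
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((u(hi, t, a, b) - u(lo, t, a, b)) / (2.0 * eps))
    return u_t + H(grad)
```

Because u is affine in (x, t), the finite differences are exact up to rounding, and the residual vanishes for any choice of the parameters a and b.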

New Solutions as envelopes of Complete Integrals

Definition : Let u = u(x; a) be a C1 function of x∈U, a∈A, where U⊂Rn and A⊂Rm are open sets. Consider the vector PDE

Da u (x ; a) = 0, …(1)

for x∈U and a∈A.

Suppose that we can solve (1) for the parameter a as a C1 function of x, of the form

a = φ(x). …(2)

Then

Dau(x; φ(x)) = 0,

for all x∈U. …(3)

We then call

v(x) = u(x; φ(x)),

x∈U …(4)

the envelope of the functions

{u(⋅ ; a)}a∈A.

Result : By forming envelopes of complete integrals (or of other m-parameter families of solutions), we construct new solutions of given nonlinear first order PDE.

Such a solution is called a Singular Integral of the given equation

F(Du, u, x) = 0.

Example : Consider the nonlinear first order PDE

$u^2(1 + |Du|^2) = 1$. …(1)

A complete integral is

$u(x ; a) = \pm(1 - |x-a|^2)^{1/2}$, …(2)

for

|x−a| < 1. …(3)

We compute

$D_a u = \pm\,\frac{x-a}{(1 - |x-a|^2)^{1/2}}$. …(4)

The vector equation

Dau = 0 , …(5)

gives

a = x ≡ φ(x), say. …(6)

Thus

v(x) = u(x ; φ(x))

or

v(x) ≡ ± 1 , …(7)

are singular integrals of the nonlinear PDE (1).
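A one-dimensional (n = 1) numerical check of this example is sketched below. It confirms that a member of the family (2) and the envelope v ≡ 1 both satisfy u²(1 + |Du|²) = 1. The function names `u`, `envelope` and `pde_residual` are hypothetical, and the derivative is approximated by central differences.

```python
import math


def u(x, a, sign=1.0):
    """Member of the complete integral (2) for n = 1, valid for |x - a| < 1."""
    return sign * math.sqrt(1.0 - (x - a) ** 2)


def envelope(x, sign=1.0):
    """Envelope v(x) = u(x; phi(x)) with phi(x) = x."""
    return u(x, x, sign)


def pde_residual(func, x, eps=1e-6):
    """u^2 (1 + u'^2) - 1 for a function of one variable."""
    du = (func(x + eps) - func(x - eps)) / (2.0 * eps)
    return func(x) ** 2 * (1.0 + du * du) - 1.0


def member(x):
    """The family member with parameter a = 0.2 (an arbitrary choice)."""
    return u(x, 0.2)
```

Geometrically, the graphs of the family are unit semicircles centred on the x-axis, and the lines v = ±1 touch each of them at its top or bottom point.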

8.3 CHARACTERISTICS METHOD

Consider basic nonlinear first-order PDE

F(Du, u, x) = 0 in U, …(1)

subject to the boundary condition

u = g on Γ, …(2)

where Γ ⊆ ∂ U and

g : Γ→R …(3)

are given.

We hereafter suppose that F, g are smooth functions.

PLAN :- We develop next the method of characteristics, which solves (1) and (2) by converting the PDE into an appropriate system of first order ODE.

Method

Suppose u solves (1), (2) and fix any point x∈U. We would like to calculate u(x) by finding some curve lying within U, connecting the point x with a point x0∈Γ and along which we can compute u.

Since the boundary condition (2) says that the function u is known on Γ, the value of u at the one end x0 becomes known.

We hope then to be able to calculate u all along the curve, and so in particular at the point x.

Let us suppose that this curve is described parametrically by the function

x (s) = (x1(s), x2(s),…, xn(s)) …(4)

the parameter s lying in some subinterval of R.



Assuming u is a C2 solution of PDE (1), we define also

z(s) = u( x (s)). …(5)

We also set

p (s) = Du( x (s)), …(6)

i.e.,

p (s) = (p1(s), p2(s),…, pn(s)), …(7)

where

$p_i(s) = u_{x_i}(x(s))$, …(8)

for i = 1, 2,…, n.

The function z(⋅) gives the value of u along the curve and p (⋅) determines the values of the gradient Du.

We must choose the function x (⋅) in such a way that we can compute z(⋅) and p (⋅).

For this, we differentiate (8) and write

$\dot p_i(s) = \sum_{j=1}^{n} u_{x_i x_j}(x(s))\,\dot x_j(s)$, …(9)

for i = 1, 2,…,n.

Here, the overdot signifies d/ds. The right side of (9) involves the second derivatives of u. On the other hand, we can also differentiate the PDE (1) with respect to xi, to get

$\sum_{j=1}^{n} \frac{\partial F}{\partial p_j}(Du, u, x)\,u_{x_j x_i} + \frac{\partial F}{\partial z}(Du, u, x)\,u_{x_i} + \frac{\partial F}{\partial x_i}(Du, u, x) = 0$, …(10)

for i = 1,2,…,n.

We are able to employ this identity (10) to get rid of the “dangerous” second derivative terms in (9), provided we first set

$\dot x_j(s) = \frac{\partial F}{\partial p_j}(p(s), z(s), x(s))$, for j = 1, 2,…,n. …(11)

Assuming now (11) holds, we evaluate (10) at x = x(s), obtaining from (5), (6), the identity

$\sum_{j=1}^{n} \frac{\partial F}{\partial p_j}(p(s), z(s), x(s))\,u_{x_i x_j}(x(s)) + \frac{\partial F}{\partial z}(p(s), z(s), x(s))\,p_i(s) + \frac{\partial F}{\partial x_i}(p(s), z(s), x(s)) = 0$, …(12)

for i = 1, 2,…,n.

Substituting from (11) and (12) into equation (9), we write

$\dot p_i(s) = -\frac{\partial F}{\partial x_i}(p(s), z(s), x(s)) - \frac{\partial F}{\partial z}(p(s), z(s), x(s))\,p_i(s)$, …(13)

for i = 1, 2,…,n.

Finally, we differentiate (5) and obtain

$\dot z(s) = \sum_{j=1}^{n} \frac{\partial u}{\partial x_j}(x(s))\,\dot x_j(s) = \sum_{j=1}^{n} p_j(s)\,\frac{\partial F}{\partial p_j}(p(s), z(s), x(s))$, …(14)

by using equations (8) and (11).

We now summarize equations (11), (13) and (14) and rewrite them in vector notation:

$\dot p(s) = -D_xF(p(s), z(s), x(s)) - D_zF(p(s), z(s), x(s))\,p(s)$, …(15a)

$\dot z(s) = D_pF(p(s), z(s), x(s)) \cdot p(s)$, …(15b)

$\dot x(s) = D_pF(p(s), z(s), x(s))$. …(15c)

The system (15) consists of (2n+1) first order ODE. It comprises the characteristic equations of the given nonlinear first order PDE (1).

The functions

$p(\cdot) = (p_1(\cdot), p_2(\cdot), \ldots, p_n(\cdot))$,

$z(\cdot)$,

$x(\cdot) = (x_1(\cdot), x_2(\cdot), \ldots, x_n(\cdot))$

are called the characteristics.

Remark 1 : In the above, we have proved the following theorem regarding the “Structure of Characteristic ODE”.

Statement : Let u∈C2(U) solve the nonlinear first-order PDE (1) in U. Assume x(⋅) solves the ODE (15c), where p(⋅) = Du(x(⋅)) and z(⋅) = u(x(⋅)). Then p(⋅) solves the ODE (15a) and z(⋅) solves the ODE (15b), for those s such that x(s)∈U.

Remark 2 : We still need to discover appropriate initial conditions for the system of ODE (15), in order that this theorem be useful.

Remark 3 : The form of the full characteristic equations can be quite complicated for fully nonlinear first order PDE, but sometimes a remarkable mathematical structure emerges.

Question 1. Derive the characteristic equations for the linear and homogeneous PDE

F(Du, u, x) = b(x) ⋅ Du(x) + c(x) u(x) = 0, …(1)

for x∈U. Hence, solve the problem

$x_1 u_{x_2} - x_2 u_{x_1} = u$ in U,

u = g on Γ,

where U is the quadrant {x1 > 0, x2 > 0} and

Γ = {x1 > 0, x2 = 0} ⊆ ∂U.

Solution Part I.

We write

$p = Du(x) \in R^n$, …(1)

$z = u(x) \in R$. …(2)

Then the given equation can be written as

F(p, z, x) = b(x)⋅p + c(x) z = 0. …(3)

So

DpF = b(x). …(4)

The characteristic equation (15c) of the Article now becomes

$\dot x(s) = b(x(s))$, …(5)

which is a first order ODE involving only the function x(⋅), and can be solved easily.

The characteristic equation (15b) now becomes

$\dot z(s) = b(x(s)) \cdot p(s)$. …(6)

By (3), equation (6) simplifies to

$\dot z(s) = -c(x(s))\,z(s)$. …(7)

This is a first order linear ODE in z(⋅), once the function x(⋅) is known from ODE (5).

Thus, equations (5) and (7) comprise

$\dot x(s) = b(x(s))$, …(8a)

$\dot z(s) = -c(x(s))\,z(s)$, …(8b)

the characteristic equations for the linear first order PDE (3).

Part II : Comparing the given PDE with the standard PDE (3), we find

x = (x1, x2),

b = (−x2, x1),

c = −1. …(9)

Thus, the system (8a, b) for the present problem consists of

$\dot x_1 = -x_2$,

$\dot x_2 = x_1$, …(10)

and

$\dot z = z$. …(11)

Solving the system of two ODE in (10), starting from the point (x0, 0) of Γ, we find (exercise)

$x_1(s) = x_0 \cos s$,

$x_2(s) = x_0 \sin s$. …(12)

The solution of (11) is

$z(s) = z_0 e^{s} = g(x_0)\,e^{s}$, …(13)

where x0 ≥ 0 and 0 ≤ s ≤ π/2 .

Fix a point (x1, x2) ∈U. We select s > 0, x0 > 0 so that

(x1, x2) = (x1(s), x2(s)) .

This gives

$x_0 = (x_1^2 + x_2^2)^{1/2}, \qquad s = \tan^{-1}(x_2/x_1)$. …(14)

Therefore

$u(x_1, x_2) = u(x_1(s), x_2(s)) = z(s) = g(x_0)\,e^{s} = g\big((x_1^2 + x_2^2)^{1/2}\big)\,\exp\left[\tan^{-1}(x_2/x_1)\right]$, …(15)

as the solution of the given boundary-value problem.
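The closed-form solution (15) can be sanity-checked numerically. The sketch below uses an arbitrary illustrative boundary function g(r) = 1 + r² (an assumption, since the text leaves g general) and tests both the PDE x1 u_{x2} − x2 u_{x1} = u and the boundary condition on Γ. The names `g`, `u` and `residual` are hypothetical.

```python
import math


def g(r):
    """Illustrative boundary data on Gamma = {x1 > 0, x2 = 0} (an assumption)."""
    return 1.0 + r * r


def u(x1, x2):
    """Solution (15): g(sqrt(x1^2 + x2^2)) * exp(arctan(x2 / x1)) in the quadrant."""
    return g(math.hypot(x1, x2)) * math.exp(math.atan2(x2, x1))


def residual(x1, x2, eps=1e-6):
    """x1 * u_{x2} - x2 * u_{x1} - u, with central-difference derivatives."""
    u_x1 = (u(x1 + eps, x2) - u(x1 - eps, x2)) / (2.0 * eps)
    u_x2 = (u(x1, x2 + eps) - u(x1, x2 - eps)) / (2.0 * eps)
    return x1 * u_x2 - x2 * u_x1 - u(x1, x2)
```

Along a characteristic (x0 cos s, x0 sin s) the value u equals g(x0) e^s, in agreement with (13).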

Article : Derive the characteristic equations for the quasilinear PDE of the form

F(Du, u, x) = b(x, u(x)) ⋅ Du(x) + c(x, u(x)) = 0.

Hence, solve the boundary-value problem

$u_{x_1} + u_{x_2} = u^2$ in U ,

u = g on Γ,

where U is the half-space {x2 > 0} and Γ = {x2 = 0} = ∂U.

Solution : Part I.

For

$p(s) = Du(x(s)) \in R^n, \qquad z(s) = u(x(s)) \in R$, …(1)

the given quasilinear PDE is

F(p, z, x) = b(x, z)⋅p + c(x, z) = 0. …(2)

We obtain

DpF = b(x, z). …(3)

Hence, the characteristic equation (15c) now becomes

$\dot x(s) = b(x(s), z(s))$. …(4)

The characteristic equation (15b) becomes

$\dot z(s) = b(x(s), z(s)) \cdot p(s) = -c(x(s), z(s))$, …(5)

using (2). So, the characteristic equations for the quasilinear first order PDE consist of the two ODE (4) and (5).

Part-II. In this example,

b = (1, 1),

c = −z²,

x = (x1, x2). …(6)

The ODE (4) and (5) become

$\dot x_1 = 1, \qquad \dot x_2 = 1$, …(7)

and

$\dot z = z^2$. …(8)

Solving (7), we get

x1(s) = x0 + s,

x2(s) = s. …(9)

Solving (8), we get (exercise)

$z(s) = \frac{z_0}{1 - s z_0} = \frac{g(x_0)}{1 - s\,g(x_0)}$, …(10)

where x0∈R, s ≥ 0, provided the denominator is not zero.

For a point (x1, x2) ∈U, we select s > 0 and x0∈R so that

(x1, x2) = (x1(s), x2(s))

= (x0 + s, s) ,

i.e., x0 = x1 − x2,

s = x2. …(11)

Then

$u(x_1, x_2) = u(x_1(s), x_2(s)) = z(s) = \frac{g(x_0)}{1 - s\,g(x_0)} = \frac{g(x_1 - x_2)}{1 - x_2\,g(x_1 - x_2)}$ …(12)

is the solution, provided the denominator is non-zero.
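As a cross-check of (12), the characteristic ODE (7) and (8) can be integrated numerically and compared with the closed form. The sketch below uses a classical fourth-order Runge-Kutta step and an illustrative boundary function g(x) = 1/(1 + x²). Both choices are assumptions made for this example, and the names `g`, `rhs`, `rk4_step`, `solve_along_characteristic` and `u_exact` are hypothetical.

```python
def g(x):
    """Illustrative boundary data on Gamma = {x2 = 0} (an assumption)."""
    return 1.0 / (1.0 + x * x)


def rhs(state):
    """Right-hand sides of (7) and (8) for the state (x1, x2, z)."""
    _, _, z = state
    return (1.0, 1.0, z * z)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))

    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2.0))
    k3 = rhs(shifted(k2, h / 2.0))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def solve_along_characteristic(x0, s_end, steps=200):
    """Integrate from the boundary point (x0, 0) with z(0) = g(x0)."""
    state = (x0, 0.0, g(x0))
    h = s_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


def u_exact(x1, x2):
    """Closed-form solution (12)."""
    gv = g(x1 - x2)
    return gv / (1.0 - x2 * gv)
```

The comparison is meaningful only while 1 − s g(x0) stays positive, since the solution blows up along the characteristic when the denominator reaches zero.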



8.4 CHARACTERISTICS FOR THE HAMILTON−JACOBI EQUATION

The general Hamilton−Jacobi PDE is

G(Du, ut, u, x, t) = ut + H(Du, x) = 0, …(1)

where

$Du = D_x u = (u_{x_1}, u_{x_2}, \ldots, u_{x_n})$,

H : Rn × Rn → R,

x = (x1, x2,…, xn) ∈Rn,

t∈R.

We set

t = xn+1 ,

q = (p, pn+1),

z = u(x, t) ,

y = (x, t). …(2)

We have

G(q, z, y) = pn+1 + H(p, x). …(3)

So, DqG = (DpH(p, x), 1), …(4)

DyG = (DxH(p, x), 0), …(5)

DzG = 0. …(6)

The characteristic equation (15c) of the main article yields

$\dot x_i(s) = \frac{\partial H}{\partial p_i}(p(s), x(s))$,

$\dot x_{n+1}(s) = 1$, …(7)

for i = 1, 2,…, n .

In particular, we can identify the parameter s with the time t.

The characteristic equation (15a) for the Hamilton−Jacobi PDE (1) reads as

$\dot p_i(s) = -\frac{\partial H}{\partial x_i}(p(s), x(s))$,

$\dot p_{n+1}(s) = 0$, …(8)

(1 ≤ i ≤ n).

The characteristic equation (15b) is now

$\dot z(s) = D_pH(p(s), x(s)) \cdot p(s) + p_{n+1}(s) = D_pH(p(s), x(s)) \cdot p(s) - H(p(s), x(s))$, …(9)

using equations (1) and (3).

In summary, the characteristic equations for the given Hamilton−Jacobi PDE (1) are the following set of ODE:

$\dot p(s) = -D_xH(p(s), x(s))$, …(10A)

$\dot z(s) = D_pH(p(s), x(s)) \cdot p(s) - H(p(s), x(s))$, …(10B)

$\dot x(s) = D_pH(p(s), x(s))$, …(10C)

for

$p(\cdot) = (p_1(\cdot), p_2(\cdot), \ldots, p_n(\cdot))$,

$x(\cdot) = (x_1(\cdot), x_2(\cdot), \ldots, x_n(\cdot))$,

and

$z(\cdot)$.

Definition : Equalities (10A) and (10C), i.e.,

$\dot x = D_pH(p, x)$, …(11A)

$\dot p = -D_xH(p, x)$, …(11B)

are called Hamilton’s Equations.

Note 1 : Observe that the ODE (10B) for z(⋅) is trivial, once x(⋅) and p(⋅) have been found by solving Hamilton’s equations (11A, B).

Note 2 : The initial-value problem for the Hamilton−Jacobi equation does not, in general, have a smooth solution u lasting for all times t > 0.
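The characteristic system (10A) to (10C) can be integrated numerically for a concrete Hamiltonian. The following sketch assumes the illustrative choice H(p, x) = (p² + x²)/2 with n = 1 and uses a classical Runge-Kutta step. Neither the Hamiltonian nor the names `H`, `rhs`, `rk4_step` and `integrate` come from the text. With x(0) = 1, p(0) = 0, z(0) = 0 the exact characteristics are x = cos s, p = −sin s, z = −sin(2s)/4.

```python
import math


def H(p, x):
    """Assumed example Hamiltonian H(p, x) = (p^2 + x^2) / 2."""
    return 0.5 * (p * p + x * x)


def rhs(state):
    """Right-hand sides of (10A), (10B), (10C) for the state (p, z, x)."""
    p, _, x = state
    dH_dp, dH_dx = p, x
    return (-dH_dx, dH_dp * p - H(p, x), dH_dp)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))

    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2.0))
    k3 = rhs(shifted(k2, h / 2.0))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(p0, z0, x0, s_end, steps=400):
    """Integrate the characteristic system from s = 0 to s = s_end."""
    state = (p0, z0, x0)
    h = s_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state
```

The run also illustrates Note 1: z is obtained by a plain quadrature once x and p are known, and H stays constant along the characteristic.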

Initial-value problem for the Hamilton−Jacobi Equation

Problem : Consider the initial-value problem for the Hamilton-Jacobi equation :

ut + H(Du) = 0 in Rn ×(0, ∞) …(1)

u = g on Rn × {t = 0}. …(2)

Here

u : Rn × [0, ∞) → R ,

is the unknown, u = u(x, t), and

$$Du = D_x u = (u_{x_1}, u_{x_2}, \dots, u_{x_n}).$$ …(3)

The Hamiltonian

H : Rn → R …(4)

and the initial function

g : Rn → R, …(5)

are given.

Remark : We have derived earlier two Hamilton’s ODE. In the next section, we shall derive them from a variational principle.

8.5 DERIVATION OF HAMILTON’S ODE FROM A VARIATIONAL PRINCIPLE

Assume that

L : Rn × Rn → R , …(1)

is a given smooth function.

We call L the Lagrangian.

We write



L = L(q, x) = L(q1, q2,…, qn, x1, x2,…, xn) …(2)

for q ∈ Rn and x ∈ Rn, and

$$D_q L = (L_{q_1}, L_{q_2}, \dots, L_{q_n}), \qquad D_x L = (L_{x_1}, L_{x_2}, \dots, L_{x_n}).$$ …(3)

Now fix two points x, y ∈ Rn and a time t > 0.

We introduce the action functional

$$I[w(\cdot)] = \int_0^t L(\dot{w}(s), w(s))\,ds,$$ …(4)

in which

$$\dot{w}(s) = \frac{dw(s)}{ds}.$$

Here, the functional (4) is defined for functions

w (⋅) = (w1(⋅), w2(⋅),…, wn(⋅)) , …(5)

belonging to the admissible class

$$A = \{\, w(\cdot) \in C^2([0, t]; \mathbb{R}^n) : w(0) = y,\ w(t) = x \,\}.$$ …(6)

Thus a C2 curve w(⋅) belongs to A if it starts at the point y at time 0 and reaches the point x at time t.

According to the calculus of variations, we shall find a curve x (⋅) ∈A such that

$$I[x(\cdot)] = \min_{w(\cdot) \in A} I[w(\cdot)].$$ …(7)

That is, we seek a curve x(⋅) which minimizes the functional I[⋅], given in equation (4), among all admissible candidates w(⋅) in the class A.

Theorem (Euler−Lagrange equations)

Statement : Prove that any minimizer x(⋅) belonging to the admissible class

A = { w(⋅) ∈ C2([0, t]; Rn) : w(0) = y, w(t) = x } …(1)



of the action functional

$$I[w(\cdot)] = \int_0^t L(\dot{w}(s), w(s))\,ds,$$ …(2)

solves the system

$$-\frac{d}{ds}\bigl\{ D_q L(\dot{x}(s), x(s)) \bigr\} + D_x L(\dot{x}(s), x(s)) = 0 \quad (0 \le s \le t)$$ …(3)

of Euler−Lagrange ordinary differential equations.

Proof : Choose a smooth function

v : [0, t] → Rn …(4)

satisfying

v (0) = v (t) = 0, …(5)

and

v = (v1, v2,…, vn).

For τ∈R, define

$$w(\cdot) = x(\cdot) + \tau v(\cdot).$$ …(6)

Then w(⋅) belongs to the admissible class A and, x(⋅) being the minimizer of the action functional, we have

$$I[x(\cdot)] \le I[w(\cdot)].$$ …(7)

Therefore, the real-valued function

i : R → R …(8)

defined by

$$i(\tau) = I[x(\cdot) + \tau v(\cdot)],$$ …(9)

has a minimum at

τ = 0. …(10)

Consequently,

i′(0) = 0, …(11)



provided i′(0) exists.

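Now, we shall compute this derivative explicitly. We find

$$i(\tau) = \int_0^t L(\dot{x}(s) + \tau \dot{v}(s),\ x(s) + \tau v(s))\,ds,$$

so that

$$i'(\tau) = \int_0^t \sum_{i=1}^n \bigl\{ L_{q_i}(\dot{x} + \tau \dot{v},\ x + \tau v)\,\dot{v}_i + L_{x_i}(\dot{x} + \tau \dot{v},\ x + \tau v)\,v_i \bigr\}\,ds.$$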

Setting τ = 0 and using the relation (11), we find

$$0 = \sum_{i=1}^n \int_0^t \bigl\{ L_{q_i}(\dot{x}, x)\,\dot{v}_i + L_{x_i}(\dot{x}, x)\,v_i \bigr\}\,ds.$$ …(12)

Integrating by parts in the first term inside the integral and using conditions in (5), we find

$$0 = \sum_{i=1}^n \int_0^t \Bigl\{ -\frac{d}{ds}\bigl( L_{q_i}(\dot{x}, x) \bigr) + L_{x_i}(\dot{x}, x) \Bigr\}\,v_i\,ds.$$ …(13)

This identity (13) is valid for all smooth functions v = (v1, v2,…,vn) satisfying the boundary conditions (5), so we must have

$$-\frac{d}{ds}\bigl( L_{q_i}(\dot{x}, x) \bigr) + L_{x_i}(\dot{x}, x) = 0,$$ …(14)

for all i = 1, 2,…,n, and 0 ≤ s ≤ t. Hence, in vector form,

$$-\frac{d}{ds}\bigl( D_q L(\dot{x}(s), x(s)) \bigr) + D_x L(\dot{x}(s), x(s)) = 0,$$ …(15)

for 0 ≤ s ≤ t.

This completes the proof.

Note (1) Equation (15) is a vector equation. It consists of n coupled second-order ODE.

Note (2) It is of course possible that a curve x (⋅) ∈A may solve the EL-equations without necessarily being a minimizer. In such a case, we say that solution x (⋅) is a critical point of the action functional I[⋅].



So every minimizer of a functional is a critical point, but a critical point need not be a minimizer.

Example : If we take

$$L(q, x) = \tfrac{1}{2} m |q|^2 - \varphi(x),$$

where m > 0, the corresponding Euler−Lagrange equation is

$$m\,\ddot{x}(s) = f(x(s))$$

for

$$f = -D\varphi.$$

This is Newton’s law for the motion of a particle of mass m moving in the force field f generated by the potential φ.
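The minimizing property can be checked on a discretized action. The sketch below assumes one space dimension and the illustrative potential φ(x) = m g x (uniform gravity, our choice); the functions `action`, `newton_path` and `perturbed` are hypothetical helpers. It compares the action of the sampled Newton trajectory with that of perturbed curves x + τv, where v vanishes at both endpoints as in (5) of the proof.

```python
import math

def action(path, t, m=1.0, g=9.81):
    """Midpoint-rule approximation of I[w] = integral of L(w'(s), w(s)) ds over [0, t]
    for L(q, x) = m q^2 / 2 - phi(x), with the assumed potential phi(x) = m g x."""
    n = len(path) - 1
    h = t / n
    total = 0.0
    for k in range(n):
        q = (path[k + 1] - path[k]) / h           # velocity on the k-th subinterval
        x_mid = 0.5 * (path[k] + path[k + 1])      # position at the midpoint
        total += h * (0.5 * m * q * q - m * g * x_mid)
    return total

def newton_path(y, x, t, n, g=9.81):
    """Samples of the solution of m x'' = -m g with w(0) = y and w(t) = x."""
    return [y + (x - y) * s / t + 0.5 * g * s * (t - s)
            for s in (t * k / n for k in range(n + 1))]

def perturbed(path, tau):
    """Add tau * v, where v(s) = sin(pi s / t) vanishes at both endpoints."""
    n = len(path) - 1
    return [w + tau * math.sin(math.pi * k / n) for k, w in enumerate(path)]

t, n = 1.0, 200
base = newton_path(0.0, 0.0, t, n)
I0 = action(base, t)
for tau in (-0.5, -0.1, 0.1, 0.5):
    print(tau, action(perturbed(base, tau), t) - I0)   # each difference should be positive
```

For this particular Lagrangian the action is convex in the path, so the critical point is a genuine minimizer; for a general L only criticality is guaranteed.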

8.6 DERIVATION OF HAMILTON’S ODE

Assume that the C2 function x(⋅) is a critical point of the action functional. Thus, it solves the Euler−Lagrange equations (or vector equation)

$$-\frac{d}{ds}\bigl( D_q L(\dot{x}(s), x(s)) \bigr) + D_x L(\dot{x}(s), x(s)) = 0, \quad 0 \le s \le t.$$ …(1)

First we set

$$p(s) = D_q L(\dot{x}(s), x(s)), \quad 0 \le s \le t.$$ …(2)

p(⋅) is called the generalized momentum corresponding to the position x(⋅) and velocity ẋ(⋅).

We now make the following important hypothesis.

Hypothesis : Suppose for all x, p∈Rn that the equation

p = DqL(q, x) , …(3a)

can be uniquely solved for q as a smooth function of p and x,

q = q(p, x). …(3b)

Definition : The Hamiltonian H associated with the Lagrangian L is defined to be



H(p, x) = p⋅q(p, x) − L(q(p, x), x), …(4)

for p, x ∈Rn. The function q(⋅, ⋅) is defined implicitly by (3).

We now convert the Euler−Lagrange equations into Hamilton’s equations.

We rewrite the Euler−Lagrange equations in terms of p (⋅) and x (⋅). For this purpose, we state and prove a theorem.

Statement : The functions x(⋅) and p(⋅) satisfy Hamilton’s equations :

$$\dot{x}(s) = D_p H(p(s), x(s)), \qquad \dot{p}(s) = -D_x H(p(s), x(s)),$$ …(5)

for 0 ≤ s ≤ t.

Furthermore, the mapping

$$s \mapsto H(p(s), x(s))$$ …(6)

is constant.

Proof : Let us hereafter write

$$q(\cdot) = (q_1(\cdot), q_2(\cdot), \dots, q_n(\cdot)).$$ …(7)

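From equation (4), we compute, for 1 ≤ i ≤ n,

$$\begin{aligned} \frac{\partial H}{\partial x_i}(p, x) &= \sum_{k=1}^n \Bigl( p_k \frac{\partial q_k}{\partial x_i}(p, x) - \frac{\partial L}{\partial q_k}(q, x) \frac{\partial q_k}{\partial x_i}(p, x) \Bigr) - \frac{\partial L}{\partial x_i}(q, x) \\ &= -\frac{\partial L}{\partial x_i}(q, x), \end{aligned}$$ …(8)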

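using (3). Also

$$\begin{aligned} \frac{\partial H}{\partial p_i}(p, x) &= q_i(p, x) + \sum_{k=1}^n \Bigl( p_k \frac{\partial q_k}{\partial p_i}(p, x) - \frac{\partial L}{\partial q_k}(q, x) \frac{\partial q_k}{\partial p_i}(p, x) \Bigr) \\ &= q_i(p, x) \end{aligned}$$ …(9)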

by using again (3).

Thus

$$\frac{\partial H}{\partial p_i}(p(s), x(s)) = q_i(p(s), x(s)) = \dot{x}_i(s);$$ …(10)

and likewise

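$$\begin{aligned} \frac{\partial H}{\partial x_i}(p(s), x(s)) &= -\frac{\partial L}{\partial x_i}\bigl(q(p(s), x(s)), x(s)\bigr) \\ &= -\frac{\partial L}{\partial x_i}(\dot{x}(s), x(s)) \\ &= -\frac{d}{ds}\Bigl( \frac{\partial L}{\partial q_i}(\dot{x}(s), x(s)) \Bigr) \\ &= -\dot{p}_i(s). \end{aligned}$$ …(11)

Equations (10) and (11) are the required Hamilton’s ODE.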

These equations comprise a coupled system of 2n first order ODE for xi(⋅) and pi(⋅)

for 1 ≤ i ≤ n.

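Finally, we observe

$$\begin{aligned} \frac{d}{ds} H(p(s), x(s)) &= \sum_{i=1}^n \Bigl( \frac{\partial H}{\partial p_i}\,\dot{p}_i + \frac{\partial H}{\partial x_i}\,\dot{x}_i \Bigr) \\ &= \sum_{i=1}^n \Bigl( \frac{\partial H}{\partial p_i}\Bigl( -\frac{\partial H}{\partial x_i} \Bigr) + \frac{\partial H}{\partial x_i}\,\frac{\partial H}{\partial p_i} \Bigr) \\ &= 0, \end{aligned}$$ …(12)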

using (10) and (11). Equation (12) shows that the mapping (6) is constant.
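The constancy of s ↦ H(p(s), x(s)) is easy to observe numerically. The sketch below assumes the pendulum Lagrangian L(q, x) = q²/2 + cos x (our illustrative choice), for which p = q and H(p, x) = p²/2 − cos x by (2) and (4); the function names are hypothetical. It integrates Hamilton's equations (5) with a Runge-Kutta step and records the largest deviation of H from its initial value.

```python
import math

def hamiltonian(p, x):
    # H(p, x) = p q - L(q, x) with q = p, for the assumed L(q, x) = q^2/2 + cos x
    return 0.5 * p * p - math.cos(x)

def field(x, p):
    """Hamilton's equations (5): x' = D_p H, p' = -D_x H."""
    return p, -math.sin(x)

def rk4(x, p, h):
    """One classical Runge-Kutta step for the pair (x, p)."""
    k1x, k1p = field(x, p)
    k2x, k2p = field(x + 0.5 * h * k1x, p + 0.5 * h * k1p)
    k3x, k3p = field(x + 0.5 * h * k2x, p + 0.5 * h * k2p)
    k4x, k4p = field(x + h * k3x, p + h * k3p)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            p + h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p))

x, p = 1.0, 0.0
h0 = hamiltonian(p, x)
drift = 0.0
for _ in range(10000):                 # integrate up to s = 10
    x, p = rk4(x, p, 1e-3)
    drift = max(drift, abs(hamiltonian(p, x) - h0))
print(drift)                           # expected to be tiny (discretization error only)
```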


8.7 LEGENDRE TRANSFORM

Now we try to find a connection between the Hamilton−Jacobi PDE and the calculus of variations problem of minimizing the action functional.

To simplify further, we also drop the x-dependence in the Hamiltonian so that



H = H(p). ….(1)

We hereafter suppose that the Lagrangian

L : Rn → R ,

satisfies the following conditions :

(a) the mapping

q → L(q) …(2)

is convex and

(b) $$\lim_{|q| \to \infty} \frac{L(q)}{|q|} = +\infty.$$ …(3)

Result : The convexity of the mapping in (2) implies L is continuous.

Definition : The Legendre transform of L is denoted by L*(p) and is defined as

$$L^*(p) = \sup_{q \in \mathbb{R}^n} \{ p \cdot q - L(q) \},$$ …(4)

for p∈Rn.

Theorem : Show that the Hamiltonian H can be obtained from the Lagrangian L, and establish the relation H = L*.

Proof : Suppose that the Lagrangian L satisfies the conditions (2) and (3). Let L* denote the Legendre transform of L, defined in (4).

We note that in view of (3), the “sup” in the definition of L* in (4) is really a “max”. That is, there exists some q*∈Rn for which

L*(p) = p⋅q*−L(q*), …(5)

and the mapping

q → p⋅q − L(q) …(6)

has a maximum at q = q*.

But then

p = DL(q*), …(7)



provided L is differentiable at q*. Hence the equation

p = DL(q)

is solvable for q in terms of p, as

q* = q(p) . …(8)

Therefore, from equations (5) and (8), we write

L*(p) = p⋅q(p) − L(q(p)). …(9)

From the definition of the Hamiltonian associated with the Lagrangian L, we write

H(p) = p⋅ q (p) − L( q (p)), for p∈Rn …(10)

In equation (10), the x-dependence in the Hamiltonian is dropped for simplification, as the variable x is not appearing.

From equations (9) and (10), we write.

H(p) = L*(p) for p∈Rn .

Hence

H = L* . …(11)

This gives the formula to obtain the Hamiltonian H from the Lagrangian L, under certain conditions.
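As a quick numerical illustration of H = L*, the sketch below assumes n = 1 and the Lagrangian L(q) = q⁴/4 (our choice; it is convex and superlinear). Solving p = DL(q) = q³ gives q(p) = p^(1/3) and hence H(p) = (3/4)|p|^(4/3). The hypothetical helper `legendre` approximates the supremum in (4) by a maximum over a grid of q-values.

```python
def L(q):
    # Assumed Lagrangian: convex and superlinear
    return q ** 4 / 4

def H_closed_form(p):
    # p = DL(q) = q^3 gives q(p) = p^(1/3), so p q(p) - L(q(p)) = (3/4) |p|^(4/3)
    return 0.75 * abs(p) ** (4.0 / 3.0)

def legendre(f, p, lo=-5.0, hi=5.0, n=10001):
    """Grid approximation of f*(p) = sup_q { p q - f(q) } over q in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max(p * q - f(q) for q in (lo + k * step for k in range(n)))

for p in (-8.0, -1.0, 0.0, 0.5, 3.0, 8.0):
    print(p, legendre(L, p), H_closed_form(p))
```

The grid must contain the maximizer q(p); here |q(p)| ≤ 2 for |p| ≤ 8, so the interval [−5, 5] suffices.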

Theorem : Prove that L = H*, under certain assumptions.

Proof : This theorem gives us a formula to compute L, when H is given. It states that L is the Legendre transform of H. We have already checked that H is the Legendre transform of L under certain conditions.

Thus, we shall say that H and L are dual convex functions.

To prove the result, we assume that the Lagrangian

L : Rn→R,

L = L(q) for q∈Rn , …(1)

satisfies the conditions

a) the mapping

q → L(q) …(2)



is convex and

b) $$\lim_{|q| \to \infty} \frac{L(q)}{|q|} = +\infty.$$ …(3)

We know that the Legendre transform, L*(p), is defined as

$$L^*(p) = \sup_{q \in \mathbb{R}^n} \{ p \cdot q - L(q) \}$$ …(4)

for p∈Rn. We also know that the Hamiltonian H is given by the formula

H = L*. …(5)

To achieve the desired result, we shall show that

i) the mapping

p →H(p) …(6)

is convex and

ii) $$\lim_{|p| \to \infty} \frac{H(p)}{|p|} = +\infty,$$ …(7)

iii) L = H*. …(8)

For each fixed q, the function

p→p ⋅q −L(q) …(9)

is linear, and consequently, the mapping

$$p \mapsto H(p) = L^*(p) = \sup_{q \in \mathbb{R}^n} \{ p \cdot q - L(q) \},$$ …(10)

is convex, using (4) and (5).

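Indeed, if 0 ≤ τ ≤ 1 and p, p̂ ∈Rn, then

$$\begin{aligned} H(\tau p + (1-\tau)\hat{p}) &= \sup_q \{ (\tau p + (1-\tau)\hat{p}) \cdot q - L(q) \} \\ &\le \tau \sup_q \{ p \cdot q - L(q) \} + (1-\tau) \sup_q \{ \hat{p} \cdot q - L(q) \} \\ &= \tau H(p) + (1-\tau) H(\hat{p}). \end{aligned}$$ …(11)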

This proves part (i) in (6) that the mapping is convex.

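To prove (ii), fix any λ > 0 and p ≠ 0. Then, using (10) (or (4) and (5)),

$$\begin{aligned} H(p) &= \sup_{q \in \mathbb{R}^n} \{ p \cdot q - L(q) \} \\ &\ge \lambda |p| - L\Bigl( \lambda \frac{p}{|p|} \Bigr) \quad \text{on taking } q = \lambda \frac{p}{|p|} \in \mathbb{R}^n \\ &\ge \lambda |p| - \max_{B(0, \lambda)} L. \end{aligned}$$

Thus,

$$\liminf_{|p| \to \infty} \frac{H(p)}{|p|} \ge \lambda \quad \text{for all } \lambda > 0.$$ …(12)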

This proves (ii) in (7).

To prove (iii) in (8), equation (10) gives

H(p) + L(q) ≥ p⋅q , …(13)

for all p, q ∈Rn. Consequently,

$$L(q) \ge \sup_{p \in \mathbb{R}^n} \{ p \cdot q - H(p) \} = H^*(q).$$

This gives

L(q) ≥ H*(q) for all q∈Rn. …(14)

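On the other hand,

$$\begin{aligned} H^*(q) &= \sup_{p \in \mathbb{R}^n} \Bigl\{ p \cdot q - \sup_{r \in \mathbb{R}^n} \{ p \cdot r - L(r) \} \Bigr\} \\ &= \sup_{p \in \mathbb{R}^n} \inf_{r \in \mathbb{R}^n} \{ p \cdot (q - r) + L(r) \}, \end{aligned}$$ …(15)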

by definition of Legendre transform and properties of sup and inf. Since the mapping

q→L(q)

is convex, so there exists s∈Rn such that

L(r) ≥ L(q) + s⋅ (r−q), for r ∈Rn . …(16)

Taking p = s in (15) and using (16), we compute

$$H^*(q) \ge \inf_{r \in \mathbb{R}^n} \{ s \cdot (q - r) + L(r) \} = L(q).$$

This gives

H*(q) ≥ L(q) for all q∈Rn. …(17)

From equations (14) and (17), we find

L(q) = H*(q) for all q∈Rn

Hence

L = H*. …(18)

This proves part (iii) in equation (8). Hence, the proof of the theorem is complete.
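Both the inequality (13) and the duality L = H* can be checked numerically on a concrete pair. The sketch below assumes n = 1 with L(q) = q⁴/4 and H(p) = (3/4)|p|^(4/3) (an illustrative dual pair, not taken from the text); `H_star` is a hypothetical grid approximation of the supremum defining H*.

```python
def L(q):
    return q ** 4 / 4

def H(p):
    # Legendre transform of the assumed L, in closed form
    return 0.75 * abs(p) ** (4.0 / 3.0)

def H_star(q, lo=-10.0, hi=10.0, n=20001):
    """Grid approximation of H*(q) = sup_p { p q - H(p) } over p in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max(p * q - H(p) for p in (lo + k * step for k in range(n)))

# Inequality (13): H(p) + L(q) >= p q, with equality exactly when p = q^3.
samples = [-2.0 + 0.25 * k for k in range(17)]
smallest_gap = min(H(p) + L(q) - p * q for p in samples for q in samples)
print(smallest_gap)

# Duality (18): H*(q) should reproduce L(q).
for q in (-2.0, -0.5, 0.0, 1.0, 1.5):
    print(q, H_star(q), L(q))
```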

8.8 HOPF−LAX FORMULA

Consider the initial-value problem for the Hamilton - Jacobi equation

ut + H(Du) = 0 in Rn × (0, ∞) …(1)

u = g on Rn × {t = 0}. …(2)

We know that the calculus of variations problem with Lagrangian L leads to Hamilton’s ODE for the associated Hamiltonian H. Since these ODE are also the characteristic equations of the Hamilton-Jacobi PDE, we infer there is probably a direct connection between this PDE and the calculus of variations.

Theorem : (Hopf-Lax formula)



Statement : If x∈Rn and t > 0, then prove that the solution u = u(x, t) of the minimization problem

$$u(x, t) = \inf \Bigl\{ \int_0^t L(\dot{w}(s))\,ds + g(y) \ \Big|\ w(0) = y,\ w(t) = x \Bigr\},$$

the infimum taken over all C1 functions

w : [0, t] → Rn

satisfying

w(t) = x,

is

$$u(x, t) = \min_{y \in \mathbb{R}^n} \Bigl\{ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \Bigr\}.$$

Proof : Fix any y ∈Rn and define

$$w(s) = y + \frac{s}{t}(x - y),$$ …(1)

for 0 ≤ s ≤ t. Then

w (0) = y …(2)

and

w (t) = x. …(3)

It is given that

$$u(x, t) = \inf \Bigl\{ \int_0^t L(\dot{w}(s))\,ds + g(y) \ \Big|\ w(0) = y,\ w(t) = x \Bigr\}.$$ …(4)

It implies that

$$u(x, t) \le \int_0^t L(\dot{w}(s))\,ds + g(y),$$ …(5)

by definition of infimum. Equations (1) and (5) yield

$$u(x, t) \le \int_0^t L\Bigl( \frac{x - y}{t} \Bigr) ds + g(y) = t L\Bigl( \frac{x - y}{t} \Bigr) + g(y).$$

This gives

$$u(x, t) \le \inf_{y \in \mathbb{R}^n} \Bigl\{ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \Bigr\}.$$ …(6)

On the other hand, if w(⋅) is any C1 function satisfying the condition

w(t) = x, …(7)

we have, by Jensen’s inequality (exercise),

$$L\Bigl( \frac{1}{t} \int_0^t \dot{w}(s)\,ds \Bigr) \le \frac{1}{t} \int_0^t L(\dot{w}(s))\,ds.$$ …(8)

If we write

w (0) = y, …(9)

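we find

$$t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \le \int_0^t L(\dot{w}(s))\,ds + g(y),$$

and consequently,

$$\inf_{y \in \mathbb{R}^n} \Bigl\{ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \Bigr\} \le \inf_{w} \Bigl\{ \int_0^t L(\dot{w}(s))\,ds + g(y) \Bigr\} = u(x, t),$$ …(10)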

by definition (4). Equations (6) and (10) yield the desired Hopf-Lax formula for the given variational problem stated in the statement of the theorem.

This completes the proof of Hopf-Lax formula.
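The Hopf-Lax formula reduces an infinite-dimensional minimization over curves to a finite-dimensional one over the starting point y, which makes it easy to evaluate approximately. The sketch below assumes n = 1; `hopf_lax` is a hypothetical helper that replaces the minimum over y by a minimum over a uniform grid. It is tested on L(q) = q²/2 with linear initial data g(y) = a y (our choice), for which the minimizer is y = x − t a and u(x, t) = a x − t a²/2.

```python
def hopf_lax(x, t, L, g, radius=10.0, n=20001):
    """Grid approximation of u(x, t) = min_y { t L((x - y)/t) + g(y) },
    searching y in [x - radius, x + radius]."""
    step = 2 * radius / (n - 1)
    return min(t * L((x - y) / t) + g(y)
               for y in (x - radius + k * step for k in range(n)))

a = 2.0
quadratic_L = lambda q: 0.5 * q * q      # assumed Lagrangian, so that H(p) = p^2/2
linear_g = lambda y: a * y               # assumed initial data

u_num = hopf_lax(0.3, 0.5, quadratic_L, linear_g)
u_ref = a * 0.3 - 0.5 * a * a / 2        # a x - t H(a)
print(u_num, u_ref)
```

The search radius has to be large enough to contain the minimizing y; this is a limitation of the grid search, not of the formula.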

Remark : We propose now to investigate the sense in which u so defined above (as a minimization problem) actually solves the initial-value problem for the Hamilton-Jacobi PDE.

ut + H(Du) = 0 in Rn×(0, ∞) …(1)

u = g on Rn ×{t = 0}. …(2)



Recall we are assuming H is

i) smooth , …(3)

ii) convex, and

iii) $$\lim_{|p| \to \infty} \frac{H(p)}{|p|} = +\infty.$$ …(4)

We henceforth suppose also

g : Rn → R , …(5)

is Lipschitz continuous, i.e.,

$$\mathrm{Lip}(g) = \sup_{x, y \in \mathbb{R}^n,\ x \ne y} \Bigl\{ \frac{|g(x) - g(y)|}{|x - y|} \Bigr\} < \infty.$$ …(6)

Our ultimate goal is to show that the Hopf−Lax formula provides a reasonable “weak solution” of the initial-value problem (1), (2) for the Hamilton-Jacobi PDE.

First, we record some preliminary observations/properties of the function u = u(x, t) defined earlier by the Hopf−Lax formula.

Lemma 1 : (known as a functional identity)

Statement : For x∈Rn and 0 ≤ s < t, we have

$$u(x, t) = \min_{y \in \mathbb{R}^n} \Bigl\{ (t - s) L\Bigl( \frac{x - y}{t - s} \Bigr) + u(y, s) \Bigr\}.$$

In other words, to compute u(⋅, t), we can calculate u at time s and then use u(⋅, s) as the initial condition on the remaining time interval [s, t].

Proof of Lemma 1 : Fix y∈Rn and 0 < s < t, and choose z∈Rn so that

$$u(y, s) = s L\Bigl( \frac{y - z}{s} \Bigr) + g(z),$$ …(1)

by virtue of the Hopf-Lax formula.

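Further

$$\frac{x - z}{t} = \Bigl( 1 - \frac{s}{t} \Bigr) \frac{x - y}{t - s} + \frac{s}{t}\,\frac{y - z}{s},$$ …(2)

and 0 < s/t < 1. Since L is convex, we have

$$L\Bigl( \frac{x - z}{t} \Bigr) \le \Bigl( 1 - \frac{s}{t} \Bigr) L\Bigl( \frac{x - y}{t - s} \Bigr) + \frac{s}{t} L\Bigl( \frac{y - z}{s} \Bigr).$$ …(3)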

Thus, combining with the Hopf-Lax formula

$$u(x, t) = \min_{y \in \mathbb{R}^n} \Bigl\{ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \Bigr\},$$ …(4)

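we write

$$\begin{aligned} u(x, t) &\le t L\Bigl( \frac{x - z}{t} \Bigr) + g(z) \\ &\le (t - s) L\Bigl( \frac{x - y}{t - s} \Bigr) + s L\Bigl( \frac{y - z}{s} \Bigr) + g(z) \\ &= (t - s) L\Bigl( \frac{x - y}{t - s} \Bigr) + u(y, s), \end{aligned}$$ …(5)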

using the relation (1). The inequality (5) is true for each y∈Rn. Therefore, relation (5) gives

$$u(x, t) \le \min_{y \in \mathbb{R}^n} \Bigl\{ (t - s) L\Bigl( \frac{x - y}{t - s} \Bigr) + u(y, s) \Bigr\}.$$ …(6)

Now, to complete the proof of the Lemma, it remains to prove that

$$\min_{y \in \mathbb{R}^n} \Bigl\{ (t - s) L\Bigl( \frac{x - y}{t - s} \Bigr) + u(y, s) \Bigr\} \le u(x, t).$$ …(7)

To prove (7), we now choose w such that

$$u(x, t) = t L\Bigl( \frac{x - w}{t} \Bigr) + g(w),$$ …(8)

and set

$$y = \frac{s}{t}\,x + \Bigl( 1 - \frac{s}{t} \Bigr) w.$$ …(9)

Then

$$\frac{x - y}{t - s} = \frac{x - w}{t} = \frac{y - w}{s}.$$ …(10)

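Consequently

$$\begin{aligned} (t - s) L\Bigl( \frac{x - y}{t - s} \Bigr) + u(y, s) &\le (t - s) L\Bigl( \frac{x - w}{t} \Bigr) + s L\Bigl( \frac{y - w}{s} \Bigr) + g(w) \\ &= t L\Bigl( \frac{x - w}{t} \Bigr) + g(w) \\ &= u(x, t), \end{aligned}$$ …(11)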

using (8) and (10). Hence

$$\min_{y \in \mathbb{R}^n} \Bigl\{ (t - s) L\Bigl( \frac{x - y}{t - s} \Bigr) + u(y, s) \Bigr\} \le u(x, t).$$ …(12)

Results (6) and (12) together prove the desired result.

This completes the proof of Lemma
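The functional identity of Lemma 1 can be tested numerically by running the Hopf-Lax minimization in two stages. The sketch below assumes n = 1, L(q) = q²/2 and g(y) = |y| (illustrative choices); `hopf_lax_from` is a hypothetical helper that minimizes over a fixed grid, taking the values of the "initial" function on that grid as input.

```python
def L(q):
    return 0.5 * q * q            # assumed Lagrangian

def g(y):
    return abs(y)                 # assumed Lipschitz initial function

grid = [-4.0 + 0.01 * k for k in range(801)]

def hopf_lax_from(initial, x, dt):
    """min over grid points y of dt * L((x - y)/dt) + initial(y),
    where `initial` lists the starting values on `grid`."""
    return min(dt * L((x - y) / dt) + v for y, v in zip(grid, initial))

g_vals = [g(y) for y in grid]
x, t, s = 0.7, 1.0, 0.4

direct = hopf_lax_from(g_vals, x, t)                    # u(x, t) in one step
u_s = [hopf_lax_from(g_vals, y, s) for y in grid]       # u(., s) on the grid
two_step = hopf_lax_from(u_s, x, t - s)                 # restart from time s
print(direct, two_step)
```

Up to the grid error, both numbers should agree, in accordance with the statement of Lemma 1.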

Lemma 2 : (Lipschitz Continuity)

Statement : The function u is Lipschitz continuous in Rn×[0, ∞) and

u = g on Rn×{t = 0}.

Proof : Fix t > 0 and x, x̂ ∈Rn. Choose y∈Rn such that

$$t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) = u(x, t),$$ …(1)

using the Hopf-Lax formula. Then

$$\begin{aligned} u(\hat{x}, t) - u(x, t) &= \inf_z \Bigl\{ t L\Bigl( \frac{\hat{x} - z}{t} \Bigr) + g(z) \Bigr\} - t L\Bigl( \frac{x - y}{t} \Bigr) - g(y) \\ &\le g(\hat{x} - x + y) - g(y), \quad \text{on taking } z = \hat{x} - x + y, \\ &\le \mathrm{Lip}(g)\,|\hat{x} - x|, \end{aligned}$$

as g is Lipschitz continuous. Hence

u(x̂, t) − u(x, t) ≤ Lip(g) |x̂ − x|. …(2)

Interchanging the roles of x̂ and x, we write

u(x, t) − u(x̂, t) ≤ Lip(g) |x̂ − x|. …(3)

Combining (2) and (3), we write

|u(x̂, t) − u(x, t)| ≤ Lip(g) |x̂ − x|. …(4)

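Next select x∈Rn, t > 0. Choosing y = x in the Hopf-Lax formula, we discover

u(x, t) ≤ t L(0) + g(x). …(5)

Furthermore, by the Hopf-Lax formula,

$$\begin{aligned} u(x, t) &= \min_{y \in \mathbb{R}^n} \Bigl\{ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \Bigr\} \\ &\ge g(x) + \min_{y \in \mathbb{R}^n} \Bigl\{ -\mathrm{Lip}(g)\,|x - y| + t L\Bigl( \frac{x - y}{t} \Bigr) \Bigr\} \\ &= g(x) - t \max_{z \in \mathbb{R}^n} \{ \mathrm{Lip}(g)\,|z| - L(z) \}, \quad \text{on taking } x - y = t z, \\ &= g(x) - t \max_{w \in B(0, \mathrm{Lip}(g))} \max_{z \in \mathbb{R}^n} \{ w \cdot z - L(z) \} \\ &= g(x) - t \max_{B(0, \mathrm{Lip}(g))} H. \end{aligned}$$ …(6)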

Inequalities (5) and (6) together imply

|u(x, t) − g(x)| ≤ C t …(7)

for

$$C = \max \Bigl( |L(0)|,\ \max_{B(0, \mathrm{Lip}(g))} |H| \Bigr).$$ …(8)

Finally select x∈Rn, 0 < t̂ < t. Then

Lip (u(⋅, t)) ≤ Lip (g), …(9)



by virtue of inequality (4) above. Consequently Lemma 1 and calculations like those employed above imply

|u(x, t) − u(x, t̂)| ≤ C |t − t̂|, …(10)

for the constant C defined in (8).

Inequalities (4) and (10) prove that the function u is Lipschitz continuous in Rn×[0, ∞). Moreover, inequality (7) proves that

u = g on Rn ×{t = 0}.

This completes the proof of Lemma 2.

Result : By Rademacher’s theorem (proof out of course), a Lipschitz function is differentiable almost everywhere.

Consequently, by Lemma 2, our function u = u(x, t) defined above by the Hopf-Lax formula is differentiable almost everywhere in Rn×[0, ∞).

The next theorem shows that u, as defined by the Hopf-Lax formula, in fact solves the Hamilton-Jacobi PDE wherever u is differentiable.

Theorem : (Solving the Hamilton-Jacobi equation).

Statement : Suppose x∈Rn, t > 0, and u = u(x, t) defined by the Hopf-Lax formula is differentiable at a point (x, t) ∈ Rn ×(0, ∞).

Then

ut(x, t) + H(Du(x, t)) = 0.

Proof : Fix q ∈Rn, h > 0. Owing to Lemma 1,

$$\begin{aligned} u(x + hq, t + h) &= \min_{y \in \mathbb{R}^n} \Bigl\{ h L\Bigl( \frac{x + hq - y}{h} \Bigr) + u(y, t) \Bigr\} \\ &\le h L(q) + u(x, t). \end{aligned}$$

Hence

$$\frac{u(x + hq, t + h) - u(x, t)}{h} \le L(q).$$

Taking h→0+, we write

q ⋅ Du(x, t) + ut(x, t) ≤ L(q). …(1)



Since

H = L*, …(2)

therefore,

$$u_t(x, t) + H(Du(x, t)) = u_t(x, t) + \max_{q \in \mathbb{R}^n} \{ q \cdot Du(x, t) - L(q) \} \le 0,$$ …(3)

because the inequality (1) is valid for all q∈Rn.

In order to prove the required result, it is now enough to show that

ut(x, t) + H(Du(x, t)) ≥ 0. …(4)

To prove (4), we choose z such that

$$u(x, t) = t L\Bigl( \frac{x - z}{t} \Bigr) + g(z).$$ …(5)

Fix h > 0 and set

$$s = t - h, \qquad y = \frac{s}{t}\,x + \Bigl( 1 - \frac{s}{t} \Bigr) z.$$ …(6)

Then

$$\frac{x - z}{t} = \frac{y - z}{s},$$ …(7)

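and thus

$$\begin{aligned} u(x, t) - u(y, s) &\ge \Bigl[ t L\Bigl( \frac{x - z}{t} \Bigr) + g(z) \Bigr] - \Bigl[ s L\Bigl( \frac{y - z}{s} \Bigr) + g(z) \Bigr] \\ &= (t - s) L\Bigl( \frac{x - z}{t} \Bigr), \end{aligned}$$ …(8)

using (7). This gives

$$\frac{u(x, t) - u\bigl( (1 - \tfrac{h}{t})\,x + \tfrac{h}{t}\,z,\ t - h \bigr)}{h} \ge L\Bigl( \frac{x - z}{t} \Bigr),$$ …(9)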

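using (6). Letting h→0+ in (9), we compute

$$\frac{x - z}{t} \cdot Du(x, t) + u_t(x, t) \ge L\Bigl( \frac{x - z}{t} \Bigr).$$

Consequently

$$\begin{aligned} u_t(x, t) + H(Du(x, t)) &= u_t(x, t) + \max_{q \in \mathbb{R}^n} \{ q \cdot Du(x, t) - L(q) \} \\ &\ge u_t(x, t) + \frac{x - z}{t} \cdot Du(x, t) - L\Bigl( \frac{x - z}{t} \Bigr) \\ &\ge 0. \end{aligned}$$ …(10)

This proves (4) and hence the theorem.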

We summarize the above results in the form of following theorem :

Theorem (Hopf−Lax formula as solution). The function u(x, t) defined by the Hopf-Lax formula is Lipschitz continuous, is differentiable a.e. in Rn×(0, ∞), and solves the initial-value problem

ut + H(Du) = 0 a.e. in Rn×(0, ∞) ,

u = g on Rn×{t = 0} .

8.9 WEAK SOLUTIONS, UNIQUENESS

Semiconcavity

In view of Theorem above it may seem reasonable to define a weak solution of the initial-value problem to be a Lipschitz function which agrees with g on Rn×{t = 0}, and solves the PDE a.e. on Rn×(0, ∞). However, this turns out to be an inadequate definition, as such weak solutions would not in general be unique.

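Example : Consider the initial-value problem

$$\begin{cases} u_t + |u_x|^2 = 0 & \text{in } \mathbb{R} \times (0, \infty), \\ u = 0 & \text{on } \mathbb{R} \times \{t = 0\}. \end{cases}$$ …(1)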

One obvious solution is

u1(x, t) ≡ 0. …(2)

However the function

$$u_2(x, t) = \begin{cases} 0 & \text{if } |x| \ge t, \\ x - t & \text{if } 0 \le x \le t, \\ -x - t & \text{if } -t \le x \le 0, \end{cases}$$ …(3)

is Lipschitz continuous and also solves the PDE a.e. (everywhere, in fact, except on the lines x = 0, ±t). It is easy to see that actually there are infinitely many Lipschitz functions satisfying (1).
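The claim that u2 solves the PDE away from the three lines can be verified with finite differences. In the sketch below, `u2` transcribes the piecewise formula (3), and `residual` is a hypothetical helper returning a central-difference approximation of u_t + |u_x|²; since u2 is piecewise linear, the approximation is exact up to rounding at points whose stencil stays inside one linear piece.

```python
def u2(x, t):
    """The second Lipschitz solution from the example above."""
    if abs(x) >= t:
        return 0.0
    return x - t if x >= 0 else -x - t

def residual(u, x, t, h=1e-6):
    """Central-difference approximation of u_t + |u_x|^2 at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + u_x ** 2

# Sample points away from the lines x = 0 and x = +-t, one in each region.
points = [(0.3, 1.0), (-0.5, 2.0), (3.0, 1.0), (-3.0, 1.0)]
for x, t in points:
    print((x, t), residual(u2, x, t))

# Both u1 = 0 and u2 take the initial value 0 at t = 0.
print([u2(x, 0.0) for x in (-1.0, 0.0, 2.5)])
```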

This example shows we must presumably require more of a weak solution than merely that it satisfy the PDE a.e. We will look to the Hopf-Lax formula for a further clue as to what is needed to ensure uniqueness.

The following lemma demonstrates that u inherits a kind of “one-sided” second-derivative estimate from the initial function g.

Lemma 3: (Semiconcavity). Suppose there exists a constant C such that

g(x + z) −2g(x) + g(x−z) ≤ C|z|2 …(1)

for all x, z ∈Rn. Define u by the Hopf-Lax formula. Then

u(x + z, t) −2u(x, t) + u(x −z, t) ≤ C |z|2 …(2)

for all x, z ∈Rn, t > 0.

Remark. We say g is semiconcave provided (1) holds. It is easy to check (1) is valid if g is C2 and

$$\sup_{\mathbb{R}^n} |D^2 g| < \infty.$$

Note that g is semiconcave if and only if the mapping

$$x \mapsto g(x) - \frac{C}{2} |x|^2$$

is concave for some constant C.

Proof : Choose y∈Rn so that

$$u(x, t) = t L\Bigl( \frac{x - y}{t} \Bigr) + g(y).$$ …(3)

Then, putting y + z and y − z in the Hopf-Lax formulas for u(x + z, t) and u(x − z, t), we find

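$$\begin{aligned} u(x + z, t) - 2u(x, t) + u(x - z, t) &\le \Bigl[ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y + z) \Bigr] - 2 \Bigl[ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \Bigr] + \Bigl[ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y - z) \Bigr] \\ &= g(y + z) - 2 g(y) + g(y - z) \\ &\le C |z|^2, \end{aligned}$$ …(4)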

by (1). This proves the lemma.

Note : As a semiconcavity condition for u(x, t) will turn out to be important, we pause to identify some other circumstances under which it is valid. We will no longer assume g to be semiconcave, but will suppose the Hamiltonian H to be uniformly convex.

Definition : A C2 convex function

H : Rn→R

is called uniformly convex (with constant θ > 0) if

$$\sum_{i, j = 1}^n H_{p_i p_j}(p)\,\xi_i \xi_j \ge \theta |\xi|^2 \quad \text{for all } p, \xi \in \mathbb{R}^n.$$

We now prove that even if g is not semiconcave, the uniform convexity of H forces u(x, t) to become semiconcave for times t > 0. This is a kind of mild regularizing effect for the Hopf-Lax solution of the initial-value problem.

Lemma 4 : Suppose that H is uniformly convex (with constant θ) and u(x, t) is defined by the Hopf-Lax formula. Then

$$u(x + z, t) - 2u(x, t) + u(x - z, t) \le \frac{1}{\theta t} |z|^2$$

for all x, z ∈Rn, t > 0.

Proof : We note first, using Taylor’s formula, that uniform convexity of H implies

$$H\Bigl( \frac{p_1 + p_2}{2} \Bigr) \le \frac{1}{2} H(p_1) + \frac{1}{2} H(p_2) - \frac{\theta}{8} |p_1 - p_2|^2.$$ …(1)

Next we claim that for the Lagrangian L we have the estimate

$$\frac{1}{2} L(q_1) + \frac{1}{2} L(q_2) \le L\Bigl( \frac{q_1 + q_2}{2} \Bigr) + \frac{1}{8\theta} |q_1 - q_2|^2,$$ …(2)

for all q1, q2 ∈Rn. Verification is left as an exercise.

Now choose y so that

$$u(x, t) = t L\Bigl( \frac{x - y}{t} \Bigr) + g(y).$$ …(3)

Then using the same value of y in the Hopf-Lax formulas for u(x + z, t) and u(x −z, t), we calculate

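$$\begin{aligned} u(x + z, t) - 2u(x, t) + u(x - z, t) &\le \Bigl[ t L\Bigl( \frac{x + z - y}{t} \Bigr) + g(y) \Bigr] - 2 \Bigl[ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \Bigr] + \Bigl[ t L\Bigl( \frac{x - z - y}{t} \Bigr) + g(y) \Bigr] \\ &= 2t \Bigl[ \frac{1}{2} L\Bigl( \frac{x + z - y}{t} \Bigr) + \frac{1}{2} L\Bigl( \frac{x - z - y}{t} \Bigr) - L\Bigl( \frac{x - y}{t} \Bigr) \Bigr] \\ &\le 2t \cdot \frac{1}{8\theta} \Bigl| \frac{2z}{t} \Bigr|^2 \\ &= \frac{1}{\theta t} |z|^2, \end{aligned}$$

using (2). Hence the lemma.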

Now we show that semiconcavity conditions of the sort discovered for the Hopf-Lax solution u(x, t) in Lemmas 3 and 4 can be utilized as uniqueness criteria.

Definition : We say that a Lipschitz continuous function

u : Rn × [0, ∞) → R

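is a weak solution of the initial-value problem:

$$\begin{cases} u_t + H(Du) = 0 & \text{in } \mathbb{R}^n \times (0, \infty), \\ u = g & \text{on } \mathbb{R}^n \times \{t = 0\}, \end{cases}$$ …(∗)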

provided

(a) u(x, 0) = g(x) for x∈Rn,

(b) ut(x, t) + H(Du(x, t)) = 0 a.e. for (x, t) ∈Rn ×(0, ∞), and

(c) $$u(x + z, t) - 2u(x, t) + u(x - z, t) \le C \Bigl( 1 + \frac{1}{t} \Bigr) |z|^2$$

for some constant C ≥ 0 and all x, z∈Rn, t > 0.

Next we prove that a weak solution of the above initial-value problem is unique, the key point being that this uniqueness assertion follows from the inequality condition (c) above.

Theorem (Uniqueness of weak solutions). Assume H is C2 and satisfies the condition

$$H \text{ is convex and } \lim_{|p| \to \infty} \frac{H(p)}{|p|} = +\infty,$$ …(**)

and g is Lipschitz continuous. Then there exists at most one weak solution of the initial-value problem (∗).

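Proof : 1. Suppose that u and ũ are two weak solutions of (∗) and write

w = u − ũ. …(1)

Observe now that at any point (y, s) where both u and ũ are differentiable and solve our PDE, we have

$$\begin{aligned} w_t(y, s) &= u_t(y, s) - \tilde{u}_t(y, s) \\ &= -H(Du(y, s)) + H(D\tilde{u}(y, s)) \\ &= -\int_0^1 \frac{d}{dr} H\bigl( r Du(y, s) + (1 - r) D\tilde{u}(y, s) \bigr)\,dr \\ &= -\int_0^1 DH\bigl( r Du(y, s) + (1 - r) D\tilde{u}(y, s) \bigr)\,dr \cdot \bigl( Du(y, s) - D\tilde{u}(y, s) \bigr) \\ &= -b(y, s) \cdot Dw(y, s), \end{aligned}$$

where b(y, s) denotes the integral of DH appearing in the previous line.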

Consequently

wt + b ⋅ Dw = 0 a.e. …(2)

2. Write v = φ(w) ≥ 0, where φ : R→[0, ∞) is a smooth function to be selected later. We multiply (2) by φ′(w) to discover

vt + b ⋅ Dv = 0 a.e. …(3)

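3. Now choose ε > 0 and define

$$u^\varepsilon = \eta_\varepsilon * u, \qquad \tilde{u}^\varepsilon = \eta_\varepsilon * \tilde{u},$$ …(4)

where ηε is the standard mollifier in the x and t variables. Then

$$|Du^\varepsilon| \le \mathrm{Lip}(u), \qquad |D\tilde{u}^\varepsilon| \le \mathrm{Lip}(\tilde{u}),$$ …(5)

and

$$Du^\varepsilon \to Du, \qquad D\tilde{u}^\varepsilon \to D\tilde{u}$$ …(6)

a.e., as ε→0.

Furthermore inequality (c) in the definition of weak solution implies

$$D^2 u^\varepsilon,\ D^2 \tilde{u}^\varepsilon \le C \Bigl( 1 + \frac{1}{s} \Bigr) I$$ …(7)

for an appropriate constant C and all ε > 0, y∈Rn, s > 2ε. Verification is left as an exercise.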

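4. Write

$$b^\varepsilon(y, s) = \int_0^1 DH\bigl( r Du^\varepsilon(y, s) + (1 - r) D\tilde{u}^\varepsilon(y, s) \bigr)\,dr.$$ …(8)

Then (3) becomes

$$v_t + b^\varepsilon \cdot Dv = (b^\varepsilon - b) \cdot Dv \quad \text{a.e.};$$ …(9)

hence

$$v_t + \operatorname{div}(v\,b^\varepsilon) = (\operatorname{div} b^\varepsilon)\,v + (b^\varepsilon - b) \cdot Dv \quad \text{a.e.}$$ …(10)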

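5. Now

$$\begin{aligned} \operatorname{div} b^\varepsilon &= \int_0^1 \sum_{k, l = 1}^n H_{p_k p_l}\bigl( r Du^\varepsilon + (1 - r) D\tilde{u}^\varepsilon \bigr) \bigl( r u^\varepsilon_{x_l x_k} + (1 - r) \tilde{u}^\varepsilon_{x_l x_k} \bigr)\,dr \\ &\le C \Bigl( 1 + \frac{1}{s} \Bigr), \end{aligned}$$ …(11)

for some constant C, in view of (5), (7). Here we note that H convex implies

D2H ≥ 0.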

6. Fix x0∈Rn, t0 > 0, and set

R = max {|DH(p)| | |p| ≤ max (Lip(u), Lip ( u~ ))}. …(12)

Define also the cone

C = {(x, t) | 0 ≤ t ≤ t0, |x − x0| ≤ R(t0 − t)}. …(13)

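Next write

$$e(t) = \int_{B(x_0, R(t_0 - t))} v(x, t)\,dx$$ …(14)

and compute for a.e. t > 0, writing B = B(x0, R(t0 − t)) and ν for the outward unit normal on ∂B:

$$\begin{aligned} \dot{e}(t) &= \int_B v_t\,dx - R \int_{\partial B} v\,dS \\ &= \int_B \bigl[ -\operatorname{div}(v\,b^\varepsilon) + (\operatorname{div} b^\varepsilon)\,v + (b^\varepsilon - b) \cdot Dv \bigr]\,dx - R \int_{\partial B} v\,dS \\ &= -\int_{\partial B} v\,(b^\varepsilon \cdot \nu + R)\,dS + \int_B \bigl[ (\operatorname{div} b^\varepsilon)\,v + (b^\varepsilon - b) \cdot Dv \bigr]\,dx \\ &\le \int_B \bigl[ (\operatorname{div} b^\varepsilon)\,v + (b^\varepsilon - b) \cdot Dv \bigr]\,dx \\ &\le C \Bigl( 1 + \frac{1}{t} \Bigr) e(t) + \int_B (b^\varepsilon - b) \cdot Dv\,dx \end{aligned}$$

by (11). The last term on the right hand side goes to zero as ε→0, for a.e. t0 > 0, according to (5), (6) and the Dominated Convergence Theorem. Thus

$$\dot{e}(t) \le C \Bigl( 1 + \frac{1}{t} \Bigr) e(t) \quad \text{for a.e. } 0 < t < t_0.$$ …(15)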

7. Fix 0 < ε < r < t and choose the function φ(z) to equal zero if

|z| ≤ ε[Lip(u) + Lip( u~ )]

and to be positive otherwise.

Since u = u~ on Rn × {t = 0},

v = φ(w) = φ (u− u~ ) = 0 at {t = ε}.

Thus

e(ε) = 0.

Consequently Gronwall’s inequality and (15) imply

$$e(r) \le e(\varepsilon)\,\exp\Bigl( \int_\varepsilon^r C \Bigl( 1 + \frac{1}{s} \Bigr) ds \Bigr) = 0.$$



Hence

|u− u~ | ≤ ε [Lip(u) + Lip( u~ )] on B(x0, R(t0 −r)).

This inequality is valid for all ε > 0, and so

u ≡ u~ in B(x0, R(t0 −r)).

Therefore, in particular,

u(x0, t0) = u~ (x0, t0).

This completes the proof.

In light of Lemmas 3 and 4 and the Theorem above, we have the following theorem.

Theorem : (Hopf-Lax formula as weak solution). Suppose H is C2 and satisfies (**), and g is Lipschitz continuous. If either g is semiconcave or H is uniformly convex, then

$$u(x, t) = \min_{y \in \mathbb{R}^n} \Bigl\{ t L\Bigl( \frac{x - y}{t} \Bigr) + g(y) \Bigr\}$$

is the unique weak solution of the initial-value problem (∗) for the Hamilton-Jacobi equation.

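Example 1: Consider the initial-value problem :

$$\begin{cases} u_t + \tfrac{1}{2} |Du|^2 = 0 & \text{in } \mathbb{R}^n \times (0, \infty), \\ u = |x| & \text{on } \mathbb{R}^n \times \{t = 0\}. \end{cases}$$ …(1)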

Here

$$H(p) = \tfrac{1}{2} |p|^2, \qquad L(q) = \tfrac{1}{2} |q|^2.$$

The Hopf-Lax formula for the unique, weak solution of (1) is

$$u(x, t) = \min_{y \in \mathbb{R}^n} \Bigl\{ \frac{|x - y|^2}{2t} + |y| \Bigr\}.$$ …(2)

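Assume |x| > t. Then

$$D_y \Bigl( \frac{|x - y|^2}{2t} + |y| \Bigr) = \frac{y - x}{t} + \frac{y}{|y|} \quad (y \ne 0);$$ …(3)

and this expression equals zero if

$$x = y + t\,\frac{y}{|y|}, \qquad \text{i.e.} \quad y = (|x| - t)\,\frac{x}{|x|} \ne 0.$$

Thus

$$u(x, t) = |x| - \frac{t}{2} \quad \text{if } |x| > t.$$

If |x| ≤ t, the minimum in (2) is attained at y = 0. Consequently

$$u(x, t) = \begin{cases} |x| - t/2 & \text{if } |x| \ge t, \\ \dfrac{|x|^2}{2t} & \text{if } |x| \le t. \end{cases}$$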

Observe that the solution becomes semiconcave at time t > 0, even though the initial function g(x) = |x| is not semiconcave. This accords with Lemma 4.
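The closed-form solution of Example 1 and the regularizing effect of Lemma 4 can both be checked numerically. The sketch below assumes n = 1; `u_grid` is a hypothetical grid approximation of the minimum in (2), and the semiconcavity bound uses θ = 1, which is the uniform convexity constant of H(p) = p²/2.

```python
def u_exact(x, t):
    """Closed-form solution of Example 1 in one space dimension."""
    return abs(x) - t / 2 if abs(x) >= t else x * x / (2 * t)

def u_grid(x, t, radius=5.0, n=10001):
    """Grid approximation of min_y { |x - y|^2 / (2t) + |y| }."""
    step = 2 * radius / (n - 1)
    return min((x - y) ** 2 / (2 * t) + abs(y)
               for y in (-radius + k * step for k in range(n)))

def second_difference(f, x, z):
    return f(x + z) - 2 * f(x) + f(x - z)

for x, t in [(0.3, 1.0), (2.0, 0.5), (-1.5, 1.0)]:
    print((x, t), u_grid(x, t), u_exact(x, t))

# Lemma 4 with theta = 1: the second difference of u(., t) is at most z^2 / t.
t = 0.5
worst = max(second_difference(lambda s: u_exact(s, t), x, z) - z * z / t
            for x in (-1.0, -0.2, 0.0, 0.4, 2.0)
            for z in (0.01, 0.1, 0.7))
print(worst)   # should not be positive

# The initial function g(x) = |x| violates any bound C z^2 at x = 0 for small z.
print(second_difference(abs, 0.0, 0.01), 0.01 ** 2)
```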

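Example 2 : We next examine the problem with reversed initial conditions :

$$\begin{cases} u_t + \tfrac{1}{2} |Du|^2 = 0 & \text{in } \mathbb{R}^n \times (0, \infty), \\ u = -|x| & \text{on } \mathbb{R}^n \times \{t = 0\}. \end{cases}$$ …(1)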

Then

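$$u(x, t) = \min_{y \in \mathbb{R}^n} \Bigl\{ \frac{|x - y|^2}{2t} - |y| \Bigr\}.$$

Now

$$D_y \Bigl( \frac{|x - y|^2}{2t} - |y| \Bigr) = \frac{y - x}{t} - \frac{y}{|y|} \quad (y \ne 0),$$

and this equals zero if

$$x = y - t\,\frac{y}{|y|}, \qquad \text{i.e.} \quad y = (|x| + t)\,\frac{x}{|x|}.$$

Thus

$$u(x, t) = -|x| - \frac{t}{2} \quad (x \in \mathbb{R}^n,\ t \ge 0).$$ …(2)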

The initial function g(x) = −|x| is semiconcave, and the solution remains so for times t > 0.

The Books Recommended for Chapter VIII

1. L.C. Evans Partial Differential Equations, Graduate Studies



Chapter-9 Representation of Solutions

In this chapter, we collect together a wide variety of techniques that are sometimes useful for finding certain more-or-less explicit solutions to various partial differential equations, or at least representation formulas for solutions.

9.1 SEPARATION OF VARIABLES

The method of separation of variables tries to construct a solution u to a given partial differential equation as some sort of combination of functions of fewer variables.

In other words, the idea is to guess that u can be written as, say, a sum or product of as yet undetermined constituent functions, to plug this guess into the PDE, and finally to choose the simpler functions to ensure u really is a solution.

This technique is best understood in examples.

Example 1 : Let U⊂ Rn be a bounded, open set with smooth boundary. We consider the initial/boundary-value problem for the heat equation

ut−∆u = 0 in U × (0, ∞),

u = 0 on ∂U × [0, ∞),

u = g on U ×{t = 0}, …(1)

where

g : U→R is given. …(2)

We conjecture there exists a solution having the multiplicative form

u(x, t) = v(t)w(x) (x ∈U, t ≥ 0) . …(3)

That is, we look for a solution of (1) with the variables x = {x1,…,xn} ∈U “separated” from the variable t ∈[0, T].

We compute

ut(x, t) = v′(t)w(x), …(4)


∆u(x, t) = v(t) ∆w(x). …(5)

Hence, equations (1), (4) and (5) imply

v′(t)w(x) − v(t)∆w(x) = 0

or

$$\frac{v'(t)}{v(t)} = \frac{\Delta w(x)}{w(x)},$$ …(6)

for all x∈U and t > 0 such that w(x), v(t) ≠ 0.

Now observe the left-hand side of (6) depends only on t and the right-hand side depends only on x. This is impossible unless each is constant, say

$$\mu = \frac{v'(t)}{v(t)} = \frac{\Delta w(x)}{w(x)} \quad (t \ge 0,\ x \in U).$$ …(7)

Then

v′ = µv, …(8)

and

∆w = µw. …(9)

We must solve these equations (8) and (9) for the unknowns w, v and µ.

Notice first that if µ is known, the solution of (8) is

v (t) = d eµt …(10)

for an arbitrary constant d. Consequently, we need only investigate equation (9).

We say that λ is an eigenvalue of the operator −∆ on U (subject to zero boundary conditions) provided there exists a function w, not identically equal to zero, solving

$$\begin{cases} -\Delta w = \lambda w & \text{in } U, \\ w = 0 & \text{on } \partial U. \end{cases}$$ …(11)

The function w is a corresponding eigenfunction.

If λ is an eigenvalue and w is a related eigenfunction, we set


µ = −λ , …(12)

and find

u = de−λtw , …(13)

solves the problem

ut − ∆u = 0 in U × (0, ∞)

u = 0 on ∂U × [0, ∞), …(14)

with the initial condition

u(⋅, 0) = d w. …(15)

Thus the function u defined by (13) solves problem (1), provided

g = d w. …(16)

More generally, if λ1,…,λm are eigenvalues, w1,…, wm corresponding eigenfunctions, and d1,…,dm are constants, then

$$u = \sum_{k=1}^m d_k e^{-\lambda_k t} w_k$$ …(17)

solves (14), with the initial condition

$$u(\cdot, 0) = \sum_{k=1}^m d_k w_k.$$ …(18)

If we can find m, w1,…, etc. such that

$$\sum_{k=1}^m d_k w_k = g,$$ …(19)

we are done.

We can hope to generalize further by trying to find a countable sequence λ1,… of eigenvalues with corresponding eigenfunctions w1,… so that

$$\sum_{k=1}^\infty d_k w_k = g \quad \text{in } U$$ …(20)

for appropriate constants d1,….

Then presumably

$$u = \sum_{k=1}^\infty d_k e^{-\lambda_k t} w_k$$ …(21)

will be the solution of the initial-value problem (1).

Remark 1 : This is an attractive representation formula for the solution, but depends upon

(a) our being able to find eigenvalues, eigenfunctions and constants satisfying (20) and

(b) our verifying that the series in (21) converges in some appropriate sense.

Remark 2 : Note that our solution (13) is determined by the method of separation of variables. The more complicated forms (17) and (21) depend upon the linearity of the heat equation.
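A concrete instance makes the representation (17) tangible. The sketch below assumes the one-dimensional domain U = (0, π), for which the eigenfunctions of −∆ with zero boundary values are w_k(x) = sin(kx) with eigenvalues λ_k = k²; the coefficients in `d` are an arbitrary illustrative choice, and `heat_residual` is a hypothetical helper that approximates u_t − u_xx by central differences.

```python
import math

def series_solution(d, x, t):
    """u(x, t) = sum_k d_k exp(-k^2 t) sin(k x) on U = (0, pi); d maps k -> d_k."""
    return sum(dk * math.exp(-k * k * t) * math.sin(k * x) for k, dk in d.items())

def heat_residual(d, x, t, h=1e-4):
    """Central-difference approximation of u_t - u_xx at (x, t)."""
    u = lambda xx, tt: series_solution(d, xx, tt)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - u_xx

d = {1: 1.0, 3: 0.5}      # initial data g(x) = sin x + 0.5 sin 3x

print(heat_residual(d, 1.0, 0.2))                                   # close to 0
print(series_solution(d, 0.0, 0.2), series_solution(d, math.pi, 0.2))  # boundary values
print(series_solution(d, 1.0, 0.0), math.sin(1.0) + 0.5 * math.sin(3.0))  # initial condition
```

Because the heat equation is linear, any finite combination of the separated solutions (13) again solves (14), which is what the residual check reflects.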

Example 2 : Let us turn once again to the Hamilton-Jacobi equation

ut + H(Du) = 0 in Rn × (0, ∞) , …(22)

and look for a solution u having the form

u(x, t) = w(x) + v(t) (x ∈Rn, t ≥ 0). …(23)

Then

0 = ut(x, t) + H(Du(x, t))

= v′(t) + H(Dw(x)) ,

if and only if

H(Dw(x)) = µ = −v′(t) (x∈Rn, t > 0) , …(24)

for some constant µ. Consequently if

H(Dw) = µ, …(25)

v′(t) = −µ, …(26)

for some µ∈R, then

u(x, t) = w(x) −µt + b …(27)

will for any constant b solve (22).

In particular, if we choose


w(x) = a⋅x …(28)

for some a∈Rn and set

µ = H(a), …(29)

we discover the solution

u = a ⋅ x −H(a)t + b …(30)

already obtained.

9.2 SIMILARITY SOLUTIONS

When investigating partial differential equations it is often profitable to look for specific solutions u, the form of which reflects various symmetries in the structure of the PDE. We have already seen this idea in our derivation of the fundamental solutions for Laplace’s and the heat equations.

Following are some other applications of this important method.

Plane and Traveling Waves, Solitons

Consider first a partial differential equation involving the two variables x∈R, t∈R. A solution u of the form

u(x, t) = v(x−σt), (x∈R, t∈R) …(1)

is called a traveling wave (with speed σ and profile v).

More generally, a solution u of a PDE in the n + 1 variables x = (x1,…xn) ∈Rn, t∈R having the form

u(x, t) = v(y⋅x−σt), (x∈Rn, t∈R) …(2)

is called a plane wave having wavefront normal to y∈Rn, velocity $\sigma/|y|$, and profile v.

Exponential Solutions

In view of the Fourier transform, it is particularly enlightening when studying linear partial differential equations to consider complex-valued plane wave solutions of the form

$u(x, t) = e^{i(y\cdot x + \omega t)}$, …(3)

where

ω ∈C and y = (y1,…,yn) ∈Rn.


ω being the frequency and $\{y_i\}_{i=1}^{n}$ the wave numbers.

We will next substitute trial solutions of the form (3) into various linear PDE, paying particular attention to the relationship between y and ω forced by the structure of the equation.

Example 1 : (Heat equation).

If u is given by (3), we compute

ut − ∆u = (iω + |y|2) u = 0, …(4)

provided

ω = i|y|2. …(5)

Hence

$u = e^{iy\cdot x - |y|^2 t}$ , …(6)

solves the heat equation for each y ∈Rn.

Taking real and imaginary parts, we discover further that

$u_1 = e^{-|y|^2 t} \cos(y\cdot x)$ , …(7)

and

$u_2 = e^{-|y|^2 t} \sin(y\cdot x)$ , …(8)

are solutions as well.

Notice in this example that since ω is purely imaginary, there results a real, negative exponential term $e^{-|y|^2 t}$ in the formulas, which corresponds to dissipation.

Example 2 : (Wave equation).

Upon our substituting (3) into the wave equation, we discover

$u_{tt} - \Delta u = (-\omega^2 + |y|^2)\,u = 0$, …(9)

provided

$\omega = \pm |y|$. …(10)

Consequently

$u = e^{i(y\cdot x \pm |y| t)}$ , …(11)

solves the wave equation.

The pair of functions

u1 = cos (y⋅x + |y|t) , …(12)

and

u2 = sin(y⋅x + |y|t) , …(13)

also solves the same.

Since ω is real, there are no dissipation effects in these solutions.

Example 3 : (Dispersive equations).

We now let n = 1 and substitute

$u = e^{i(yx + \omega t)}$

into Airy’s equation

ut + uxxx = 0. …(14)

We calculate

ut + uxxx = i(ω −y3) u = 0, …(15)

whenever

ω = y3. …(16)

Thus

$u = e^{i(yx + y^3 t)}$ , …(17)

solves Airy’s equation.

Once again, as ω is real there is no dissipation. Notice however that the velocity of propagation is $y^2$, which depends non-linearly upon the frequency of the initial value $e^{iyx}$.

Thus waves of different frequencies propagate at different velocities: the PDE creates dispersion.


Likewise, if n ≥ 1 and we substitute

$u = e^{i(y\cdot x + \omega t)}$

into Schrodinger’s equation

iut + ∆u = 0, …(18)

we compute

iut + ∆u = −(ω + |y|2)u = 0. …(19)

Consequently

ω = −|y|2, …(20)

and

$u = e^{i(y\cdot x - |y|^2 t)}$. …(21)

Again, the solution displays dispersion.
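The four dispersion relations found above, (5), (10), (16) and (20), can be checked numerically in one space dimension. The following sketch is illustrative only: the helper names, step sizes and sample point are assumptions, and derivatives are approximated by central differences, so each residual is only expected to be small, not exactly zero.

```python
import cmath

def plane_wave(y, omega):
    """Trial solution u(x, t) = exp(i (y x + omega t)) in one space dimension."""
    return lambda x, t: cmath.exp(1j * (y * x + omega * t))

def d_t(u, x, t, h=1e-3):
    return (u(x, t + h) - u(x, t - h)) / (2 * h)

def d_tt(u, x, t, h=1e-3):
    return (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / (h * h)

def d_xx(u, x, t, h=1e-3):
    return (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)

def d_xxx(u, x, t, h=1e-2):
    return (u(x + 2 * h, t) - 2 * u(x + h, t)
            + 2 * u(x - h, t) - u(x - 2 * h, t)) / (2 * h ** 3)

def residual(name, y, x=0.4, t=0.2):
    """PDE residual of the plane wave with the dispersion relation from the text."""
    if name == "heat":         # u_t - u_xx,   omega = i y^2
        u = plane_wave(y, 1j * y * y)
        return d_t(u, x, t) - d_xx(u, x, t)
    if name == "wave":         # u_tt - u_xx,  omega = |y|
        u = plane_wave(y, abs(y))
        return d_tt(u, x, t) - d_xx(u, x, t)
    if name == "airy":         # u_t + u_xxx,  omega = y^3
        u = plane_wave(y, y ** 3)
        return d_t(u, x, t) + d_xxx(u, x, t)
    if name == "schrodinger":  # i u_t + u_xx, omega = -y^2
        u = plane_wave(y, -y * y)
        return 1j * d_t(u, x, t) + d_xx(u, x, t)
    raise ValueError(name)
```

For example, `abs(residual("airy", 1.3))` is small, while replacing ω = y³ by any other value would leave a residual of size |ω − y³|.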

Solitons

We consider next the Korteweg-de Vries (KdV) equation in the form

ut + 6uux + uxxx = 0 in R × (0, ∞), …(22)

this nonlinear dispersive equation being a model for surface waves in water.

We seek a traveling wave solution having the structure

u(x, t) = v(x −σt) (x ∈R, t > 0). …(23)

Then u solves the KdV equation (22), provided v satisfies the ODE

$-\sigma v' + 6vv' + v''' = 0 \quad \left( ' = \frac{d}{ds},\ s = x - \sigma t \right)$. …(24)

We integrate (24) by first noting

−σv + 3v2 + v′′ = a, …(25)

a denoting some constant.

Multiply this equality by v′ to obtain

−σ vv′ + 3v2v′ + v′′ v′ = av′,


and so deduce

$\frac{(v')^2}{2} = -v^3 + \frac{\sigma}{2} v^2 + av + b$ , …(26)

where b is another arbitrary constant.

We investigate (26) by looking now only for solutions v which satisfy

v, v′, v′′ → 0 , as s → ±∞. …(27)

The function u having the form (23), under conditions (27), is called a solitary wave.

Then (25), (26) imply

a = b = 0. …(28)

Equation (26) thereupon simplifies to read

$\frac{(v')^2}{2} = v^2\left(-v + \frac{\sigma}{2}\right).$

Hence

$v' = \pm\, v(\sigma - 2v)^{1/2}$. …(29)

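We take the minus sign above for computational convenience, and obtain then this implicit formula for v:

$s = -\int_{1}^{v(s)} \frac{dz}{z(\sigma - 2z)^{1/2}} + c$, …(30)

for some constant c. Now substitute

$z = \frac{\sigma}{2}\,\mathrm{sech}^2\theta$. …(31)

It follows that

$\frac{dz}{d\theta} = -\sigma\,\mathrm{sech}^2\theta\,\tanh\theta$ , …(32)

and

$z(\sigma - 2z)^{1/2} = \frac{\sigma^{3/2}}{2}\,\mathrm{sech}^2\theta\,\tanh\theta$. …(33)

Hence (30) becomes

$s = \frac{2}{\sqrt{\sigma}}\,\theta + c$, …(34)

where θ is implicitly given by the relation

$\frac{\sigma}{2}\,\mathrm{sech}^2\theta = v(s)$. …(35)

We combine (34) and (35) to compute

$v(s) = \frac{\sigma}{2}\,\mathrm{sech}^2\left(\frac{\sqrt{\sigma}}{2}(s - c)\right)$, (s∈R). …(36)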

Conversely, it is routine to check v so defined actually solves the ODE (24).

The upshot is that

$u(x, t) = \frac{\sigma}{2}\,\mathrm{sech}^2\left(\frac{\sqrt{\sigma}}{2}(x - \sigma t - c)\right)$, (x∈R, t ≥ 0) …(37)

is a solution of the KdV equation for each c∈R, σ > 0.

A solution of this form is called a soliton.

Note the velocity of the solution depends upon its height.
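As a sanity check on the soliton formula (37), the sketch below evaluates the profile (σ/2) sech²((√σ/2)(x − σt − c)) and estimates the KdV residual u_t + 6uu_x + u_xxx with central differences. The function names and the step size are illustrative assumptions, and the finite-difference residual is only approximately zero.

```python
import math

def soliton(x, t, sigma, c=0.0):
    """u = (sigma / 2) * sech^2( sqrt(sigma) / 2 * (x - sigma * t - c) )."""
    s = 0.5 * math.sqrt(sigma) * (x - sigma * t - c)
    return 0.5 * sigma / math.cosh(s) ** 2

def kdv_residual(x, t, sigma, c=0.0, h=2e-3):
    """Central-difference estimate of u_t + 6 u u_x + u_xxx at (x, t)."""
    u = lambda xx, tt: soliton(xx, tt, sigma, c)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_xxx = (u(x + 2 * h, t) - 2 * u(x + h, t)
             + 2 * u(x - h, t) - u(x - 2 * h, t)) / (2 * h ** 3)
    return u_t + 6 * u(x, t) * u_x + u_xxx
```

The peak value `soliton(c + sigma * t, t, sigma, c)` equals σ/2, which makes the link between height and speed explicit: a taller soliton travels faster.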

Remark : The KdV equation is in fact utterly remarkable, in that it is completely integrable, which means that in principle the exact solution can be computed for essentially arbitrary initial data.

Traveling Waves for a Bistable Equation.

Consider next the scalar reaction-diffusion equation

ut − uxx = f(u) in R×(0, ∞), …(38)

where

f : R→R

has a “cubic-like” shape.


Graph of the function f

We assume, more precisely, f is smooth and verifies

(a) f(0) = f(a) = f(1) = 0

(b) f < 0 on (0, a), f > 0 on (a, 1)

(c) f′(0) < 0, f′(1) < 0

(d) $\int_0^1 f(z)\,dz > 0$ …(39)

for some point 0 < a < 1.

We look for a traveling wave solution of the form

u(x, t) = v(x − σ t), …(40)

the profile v and velocity σ to be determined, such that

u→0 as x→ − ∞, u→1 as x→ +∞. …(41)

Now since

f ′ < 0 at z = 0, 1,

the constants 0 and 1 are stable solutions of the PDE (and since f′ ≥ 0 at z = a, the constant a is an unstable solution).

So we want our traveling wave (40) to interpolate between the two stable states z = 0, 1 at x = ∓∞.

Plugging (40) into (38), we see v must satisfy the ordinary differential equation

$v'' + \sigma v' + f(v) = 0 \quad \left( ' = \frac{d}{ds} \right)$, …(42)

subject to the conditions

$\lim_{s\to+\infty} v(s) = 1, \quad \lim_{s\to-\infty} v(s) = 0, \quad \lim_{s\to\pm\infty} v'(s) = 0$. …(43)

We outline now (without complete proofs) a phase plane analysis of the ODE problem (42), (43). We begin by setting

w = v′ . …(44)

Then (42), (43) transform into the autonomous first-order system

v′ = w

w′ = −σw −f(v), …(45)

with

$\lim_{s\to\infty} (v, w) = (1, 0), \quad \lim_{s\to-\infty} (v, w) = (0, 0)$. …(46)

Now (0, 0) and (1, 0) are critical points for the system (45), and the eigenvalues of the corresponding linearizations are

$\lambda_0^{\pm} = \frac{-\sigma \pm (\sigma^2 - 4f'(0))^{1/2}}{2}, \quad \lambda_1^{\pm} = \frac{-\sigma \pm (\sigma^2 - 4f'(1))^{1/2}}{2}$. …(47)

In view of (39c), $\lambda_0^{\pm}, \lambda_1^{\pm}$ are real, with differing sign, and thus (0, 0) and (1, 0) are saddle points for the flow (45).

Consequently an “unstable curve” $W^u$ leaves (0, 0) and a “stable curve” $W^s$ approaches (1, 0), as drawn. Furthermore, by calculating eigenvectors corresponding to (47), we see

$W^u$ is tangent to the line $w = \lambda_0^{+} v$ at (0, 0),

$W^s$ is tangent to the line $w = \lambda_1^{-}(v - 1)$ at (1, 0). …(48)


Stable and unstable curves

Note that $\lambda_0^{\pm}, \lambda_1^{\pm}$, $W^u$ and $W^s$ depend upon the parameter σ.

Our intention is to find σ< 0 so that

Wu = Ws in the region {v > 0, w > 0}. …(49)

Then we will have a solution of (45), (46), whose path in the phase plane is a heteroclinic orbit connecting (0, 0) to (1, 0).

To establish (49), we fix now a small number ε > 0 and let L denote the vertical line through the point (a + ε, 0).

We claim

Ws ∩ L ≠φ,

Wu ∩ L ≠ φ, …(50)

if σ < 0.

To check this assertion, define

$E(v, w) = \frac{w^2}{2} + \int_0^v f(z)\,dz$ (v, w∈R) …(51)

and compute

$\frac{d}{dt} E(v(t), w(t)) = w(t)\,w'(t) + f(v(t))\,v'(t) = -\sigma w^2(t)$. …(52)



As σ < 0, we see that E is nondecreasing along trajectories of the ODE (45). Note also the level sets of E have the shapes illustrated below :

Level curves of E

Consider next the region R, as drawn. The unstable curve enters R from (0, 0) and cannot exit through the bottom, top or left hand side. Using (45), we deduce that Wu must exit R through the line L, at a point (a + ε, w0(σ)). Similarly we argue Ws must hit L at a point (a + ε, w1(σ)). This verifies claim (50).

We next observe

w0(0) < w1(0); …(53)

this follows since trajectories of (45) for σ = 0 are contained in level sets of E.

We assert further that

w0(σ) > w1(σ) …(54)

provided σ < 0 and |σ| is large enough. To see this, fix β > 0 and consider the region S, as drawn.

[Figures: The region R; The region S]

Now along the line segment T = {0 ≤ v ≤ a + ε, w = βv},

we have

$\frac{w'}{v'} = -\sigma - \frac{f(v)}{w} = -\sigma - \frac{f(v)}{\beta v}$. …(55)

Since $\frac{f(v)}{v}$ is bounded for 0 ≤ v ≤ a + ε, we see

$\frac{w'}{v'} \ge -\sigma - \frac{C}{\beta} > \beta$ on T, …(56)

provided σ < 0 and |σ| is large enough.

The calculation (56) shows that Wu cannot exit S through the line segment T, and so

w0(σ) ≥ β(a + ε) if σ = σ(β)

is sufficiently negative.

On the other hand,

w1(σ) ≤ w1(0) for all σ ≤ 0.

Thus we see that (54) will follow once we choose β large enough and then σ sufficiently negative.

Since w0 and w1 depend smoothly on σ, we deduce from (53) and (54) that there exists σ < 0 with

w0(σ) = w1(σ). …(57)

For this velocity σ there consequently exists a solution of the ODE (45), (46).

Hence we have found for our reaction-diffusion PDE (38) a traveling wave of the form (40).
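The existence argument above is not constructive, but for one particular nonlinearity the wave can be written down. The sketch below assumes the cubic f(v) = v(1 − v)(v − a) with 0 < a < 1/2, a choice made here for illustration only (it satisfies (39) when a < 1/2). For this f, direct substitution shows that v(s) = 1/(1 + e^{−s/√2}) solves (42), (43) with speed σ = √2 (a − 1/2) < 0. The code checks the ODE residual by central differences.

```python
import math

A = 0.25  # assumed middle zero a of f, with 0 < a < 1/2 so that (39d) holds

def f(v):
    """Assumed cubic nonlinearity f(v) = v (1 - v) (v - a)."""
    return v * (1.0 - v) * (v - A)

# Wave speed for this particular f; negative because a < 1/2.
SIGMA = math.sqrt(2.0) * (A - 0.5)

def profile(s):
    """Front v(s) = 1 / (1 + exp(-s / sqrt 2)), with v(-inf) = 0 and v(+inf) = 1."""
    return 1.0 / (1.0 + math.exp(-s / math.sqrt(2.0)))

def ode_residual(s, h=1e-3):
    """Central-difference estimate of v'' + sigma v' + f(v) at s."""
    v1 = (profile(s + h) - profile(s - h)) / (2 * h)
    v2 = (profile(s + h) - 2 * profile(s) + profile(s - h)) / (h * h)
    return v2 + SIGMA * v1 + f(profile(s))
```

The sign of `SIGMA` agrees with the requirement σ < 0 in (49); other admissible f generally have no closed-form profile, and the phase plane argument is then the only route.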

9.3 TRANSFORM METHODS

In this section we develop some of the theory of Fourier and Laplace transforms, which provides extremely powerful tools for converting certain


linear partial differential equations into either algebraic equations or else differential equations involving fewer variables.

Fourier Transform

In this section all functions are complex-valued, and a bar (¯) denotes the complex conjugate.

Definitions and Elementary Properties

Definition of Fourier transform on L1.

If u ∈L1(Rn), we define its Fourier transform

$\hat{u}(y) = \frac{1}{(2\pi)^{n/2}} \int_{R^n} e^{-ix\cdot y}\, u(x)\,dx$ (y ∈ Rn) …(1)

and its inverse Fourier transform

$\check{u}(y) = \frac{1}{(2\pi)^{n/2}} \int_{R^n} e^{ix\cdot y}\, u(x)\,dx$ (y∈Rn). …(2)

Since

$|e^{\pm ix\cdot y}| = 1$

and u∈L1(Rn), these integrals converge for each y∈Rn.

We intend now to extend definitions (1), (2) to functions u∈L2(Rn).

Theorem 1: (Plancherel’s Theorem). Assume u ∈ L1(Rn) ∩ L2(Rn).

Then $\hat{u}, \check{u} \in L^2(R^n)$ and

$\|\hat{u}\|_{L^2(R^n)} = \|\check{u}\|_{L^2(R^n)} = \|u\|_{L^2(R^n)}$. …(3)

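Proof. 1. First we note that if v, w ∈L1(Rn), then $\hat{v}, \hat{w} \in L^\infty(R^n)$. Also

$\int_{R^n} v(x)\,\hat{w}(x)\,dx = \int_{R^n} \hat{v}(y)\,w(y)\,dy$ , …(4)

since both expressions equal

$\frac{1}{(2\pi)^{n/2}} \int_{R^n}\int_{R^n} e^{-ix\cdot y}\, v(x)\, w(y)\,dx\,dy.$

Furthermore, (exercise)

$\int_{R^n} e^{-ix\cdot y - t|x|^2}\,dx = \left(\frac{\pi}{t}\right)^{n/2} e^{-\frac{|y|^2}{4t}}$ (t > 0). …(5)

Consequently if ε > 0 and

$v_\varepsilon(x) = e^{-\varepsilon |x|^2}$, …(6)

we have

$\hat{v}_\varepsilon(y) = \frac{e^{-\frac{|y|^2}{4\varepsilon}}}{(2\varepsilon)^{n/2}}$. …(7)

Thus (4) implies for each ε > 0 that

$\int_{R^n} \hat{w}(y)\, e^{-\varepsilon |y|^2}\,dy = \frac{1}{(2\varepsilon)^{n/2}} \int_{R^n} w(x)\, e^{-\frac{|x|^2}{4\varepsilon}}\,dx$. …(8)

2. Now take u ∈ L1(Rn) ∩ L2(Rn) and set

$v(x) = \bar{u}(-x)$. …(9)

Let

$w = u * v \in L^1(R^n) \cap C(R^n)$ , …(10)

and we find that

$\hat{w} = (2\pi)^{n/2}\,\hat{u}\,\hat{v} \in L^\infty(R^n)$. …(11)

But

$\hat{v}(y) = \frac{1}{(2\pi)^{n/2}} \int_{R^n} e^{-ix\cdot y}\, \bar{u}(-x)\,dx = \overline{\hat{u}(y)}$ …(12)

and so

$\hat{w} = (2\pi)^{n/2}\,|\hat{u}|^2$. …(13)

Now w is continuous and thus

$\lim_{\varepsilon\to 0} \frac{1}{(2\varepsilon)^{n/2}} \int_{R^n} w(x)\, e^{-\frac{|x|^2}{4\varepsilon}}\,dx = (2\pi)^{n/2}\, w(0)$. …(14)

Since

$\hat{w} = (2\pi)^{n/2}\,|\hat{u}|^2 \ge 0$,

we deduce upon sending ε→0+ in (8) that $\hat{w}$ is summable, with

$\int_{R^n} \hat{w}(y)\,dy = (2\pi)^{n/2}\, w(0)$. …(15)

Hence

$\int_{R^n} |\hat{u}|^2\,dy = w(0) = \int_{R^n} u(x)\,v(-x)\,dx = \int_{R^n} |u|^2\,dx$. …(16)

The proof for $\check{u}$ is similar.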
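Plancherel's identity (3) can be observed numerically for a concrete function. The sketch below takes n = 1 and the Gaussian u(x) = e^{−x²}, an illustrative choice made here, computes the transform (1) by the trapezoid rule on a truncated interval, and compares the two squared L² norms. The truncation lengths and grid sizes are assumptions chosen so that the neglected tails are negligible.

```python
import cmath
import math

def trapezoid(func, lo, hi, n):
    """Composite trapezoid rule with n subintervals on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * h)
    return total * h

def u(x):
    """Illustrative test function: a Gaussian."""
    return math.exp(-x * x)

def u_hat(y, L=8.0, n=800):
    """(2 pi)^(-1/2) * integral of exp(-i x y) u(x) dx, truncated to [-L, L]."""
    integral = trapezoid(lambda x: cmath.exp(-1j * x * y) * u(x), -L, L, n)
    return integral / math.sqrt(2.0 * math.pi)

norm_u_sq = trapezoid(lambda x: abs(u(x)) ** 2, -8.0, 8.0, 800)
norm_hat_sq = trapezoid(lambda y: abs(u_hat(y)) ** 2, -12.0, 12.0, 240)
```

Both quantities approximate (π/2)^{1/2}, and `u_hat(y)` agrees with formula (7) for ε = 1, namely e^{−y²/4}/√2.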

Definition of Fourier transform on L2.

In view of the equality (3), we can define the Fourier transform of a function u ∈L2(Rn)

as follows.

Choose a sequence $\{u_k\}_{k=1}^{\infty}$ ⊂ L1 (Rn) ∩ L2(Rn) with

uk→ u in L2(Rn).

According to (3),

$\|\hat{u}_k - \hat{u}_j\|_{L^2(R^n)} = \|\widehat{u_k - u_j}\|_{L^2(R^n)} = \|u_k - u_j\|_{L^2(R^n)}$ ,

and thus $\{\hat{u}_k\}_{k=1}^{\infty}$ is a Cauchy sequence in L2(Rn).

This sequence consequently converges to a limit, which we define to be $\hat{u}$:

$\hat{u}_k \to \hat{u}$ in L2(Rn).

The definition of $\hat{u}$ does not depend upon the choice of approximating sequence $\{u_k\}_{k=1}^{\infty}$. We similarly define $\check{u}$.

Next we record some useful formulas in the following theorem.

Theorem 2 : (Properties of Fourier transform).

Assume u, v ∈ L2(Rn). Then

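(i) $\int_{R^n} u\,\bar{v}\,dx = \int_{R^n} \hat{u}\,\bar{\hat{v}}\,dy$, …(17)

(ii) $\widehat{D^\alpha u} = (iy)^\alpha\, \hat{u}$

for each multi-index α such that

$D^\alpha u \in L^2(R^n)$, …(18)

(iii) $\widehat{u * v} = (2\pi)^{n/2}\, \hat{u}\,\hat{v}$ , …(19)

(iv) $u = (\hat{u})^{\vee}$. …(20)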

Applications :

The Fourier transform is an especially powerful technique for studying linear, constant-coefficient partial differential equations.

Example 1 : (Bessel potentials).

We investigate first the PDE

−∆u + u = f in Rn, …(21)

where f ∈ L2(Rn).

To find an explicit formula for u, we take the Fourier transform and use (18) to obtain

$(1 + |y|^2)\,\hat{u}(y) = \hat{f}(y)$ (y ∈Rn). …(22)

The effect of the Fourier transform has been to convert the PDE (21) into the algebraic equation (22). The solution of (22) is

$\hat{u} = \frac{\hat{f}}{1 + |y|^2}$. …(23)

Thus

$u = \left(\frac{\hat{f}}{1 + |y|^2}\right)^{\vee}$ , …(24)

and so the only real problem is to rewrite the right hand side of (24) into a more explicit form.

Using (19), we see

$u = \frac{f * B}{(2\pi)^{n/2}}$, …(25)

where

$\hat{B} = \frac{1}{1 + |y|^2}$. …(26)

We solve for B as follows.

Since

$\frac{1}{a} = \int_0^\infty e^{-ta}\,dt$ , …(27)

for each a > 0, we have

$\frac{1}{1 + |y|^2} = \int_0^\infty e^{-t(1 + |y|^2)}\,dt$. …(28)

Thus

$B = \left(\frac{1}{1 + |y|^2}\right)^{\vee} = \frac{1}{(2\pi)^{n/2}} \int_0^\infty e^{-t} \left( \int_{R^n} e^{ix\cdot y - t|y|^2}\,dy \right) dt$. …(29)

Now if a, b ∈ R, b > 0, and we set

$z = b^{1/2} x - \frac{a}{2b^{1/2}}\, i$, …(30a)

we find

$\int_{-\infty}^{\infty} e^{iax - bx^2}\,dx = \frac{e^{-a^2/4b}}{b^{1/2}} \int_{\Gamma} e^{-z^2}\,dz$, …(30b)

Γ denoting the contour $\left\{ \mathrm{Im}(z) = -\frac{a}{2b^{1/2}} \right\}$ in the complex plane. Deforming Γ into the real axis, we compute

$\int_{\Gamma} e^{-z^2}\,dz = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \pi^{1/2}$; …(31)

and hence

$\int_{-\infty}^{\infty} e^{iax - bx^2}\,dx = e^{-a^2/4b} \left(\frac{\pi}{b}\right)^{1/2}$. …(32)

Thus

$\int_{R^n} e^{ix\cdot y - t|y|^2}\,dy = \prod_{j=1}^{n} \int_{-\infty}^{\infty} e^{i x_j y_j - t y_j^2}\,dy_j = \left(\frac{\pi}{t}\right)^{n/2} e^{-\frac{|x|^2}{4t}}$. …(33)

Consequently, we conclude from (29), (33) that

$B(x) = \frac{1}{2^{n/2}} \int_0^\infty \frac{e^{-t - \frac{|x|^2}{4t}}}{t^{n/2}}\,dt$ , (x ∈Rn). …(34)

B is called a Bessel potential.

Employing (25), we derive then the formula

$u(x) = \frac{1}{(4\pi)^{n/2}} \int_0^\infty \int_{R^n} \frac{e^{-t - \frac{|x - y|^2}{4t}}}{t^{n/2}}\, f(y)\,dy\,dt$ (x ∈Rn) …(35)

for the solution of (21).
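In one dimension the integral (34) can be evaluated in closed form: B(x) = (π/2)^{1/2} e^{−|x|}, so that (25) becomes u(x) = ½ ∫ e^{−|x−y|} f(y) dy, the familiar Green's function representation for −u″ + u = f. The sketch below checks the closed form against a direct quadrature of (34) for n = 1. The substitution t = s², the cutoff `s_max` and the grid size are choices made here for the illustration.

```python
import math

def bessel_potential_1d(x, s_max=8.0, n=4000):
    """B(x) from (34) with n = 1, after the substitution t = s**2:

    B(x) = 2**(-1/2) * integral_0^inf 2 * exp(-s^2 - x^2 / (4 s^2)) ds,
    approximated by the trapezoid rule on [0, s_max].
    """
    def integrand(s):
        if s == 0.0:
            # Limit of the integrand as s -> 0+.
            return 2.0 if x == 0.0 else 0.0
        return 2.0 * math.exp(-s * s - x * x / (4.0 * s * s))

    h = s_max / n
    total = 0.5 * (integrand(0.0) + integrand(s_max))
    for k in range(1, n):
        total += integrand(k * h)
    return total * h / math.sqrt(2.0)

def bessel_potential_1d_exact(x):
    """Closed form for n = 1: sqrt(pi / 2) * exp(-|x|)."""
    return math.sqrt(math.pi / 2.0) * math.exp(-abs(x))
```

The exponential decay of B explains why the solution u at a point depends mainly on the values of f nearby.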

Example 2 : (Fundamental solution of heat equation).

Consider again the initial-value problem for the heat equation

ut − ∆u = 0 in Rn ×(0, ∞)

u = g on Rn×{t = 0}. …(1)

We establish a new method for solving (1) by computing $\hat{u}$, the Fourier transform of u in the spatial variables x only.

Thus

$\hat{u}_t + |y|^2\, \hat{u} = 0$ for t > 0 ,

$\hat{u} = \hat{g}$ for t = 0 . …(2)

Solution of (2) is (exercise)

$\hat{u} = e^{-t|y|^2}\, \hat{g}$. …(3)

Consequently

$u = \left( e^{-t|y|^2}\, \hat{g} \right)^{\vee}$ , …(4)

and therefore

$u = \frac{g * F}{(2\pi)^{n/2}}$, …(5)

where

$\hat{F} = e^{-t|y|^2}$. …(6)

But then

$F = \left( e^{-t|y|^2} \right)^{\vee} = \frac{1}{(2\pi)^{n/2}} \int_{R^n} e^{ix\cdot y - t|y|^2}\,dy = \frac{1}{(2t)^{n/2}}\, e^{-\frac{|x|^2}{4t}}$ , …(7)

by the Gaussian integral (33) of Example 1. Using (5), we compute

$u(x, t) = \frac{1}{(4\pi t)^{n/2}} \int_{R^n} e^{-\frac{|x - y|^2}{4t}}\, g(y)\,dy$ (x∈Rn, t > 0). …(8)

The Fourier transform has provided us with a new derivation of the fundamental solution of the heat equation.

Example 3 : (Fundamental solution of Schrodinger’s equation).

Let us next look at the initial-value problem for Schrodinger’s equation

iut + ∆u = 0 in Rn × (0, ∞)

u = g on Rn × {t = 0}. …(1)

Here u and g are complex-valued.

If we formally replace t by ‘i t’ in the solution of the heat equation, we obtain the formula

$u(x, t) = \frac{1}{(4\pi i t)^{n/2}} \int_{R^n} e^{\frac{i|x - y|^2}{4t}}\, g(y)\,dy$ , (x∈Rn, t > 0), …(2)

where we interpret $i^{1/2}$ as $e^{i\pi/4}$.

This expression clearly makes sense for all times t > 0, provided g∈L1(Rn). Furthermore if |y|2 g∈L1(Rn), we can check by a direct calculation that u, given in (2), solves the differential equation in (1).

Let us next rewrite formula (2) as

$u(x, t) = \frac{e^{\frac{i|x|^2}{4t}}}{(4\pi i t)^{n/2}} \int_{R^n} e^{-\frac{i x\cdot y}{2t}}\, e^{\frac{i|y|^2}{4t}}\, g(y)\,dy$. …(3)

Since

$\left| e^{\frac{i|x|^2}{4t}} \right| = \left| e^{\frac{i|y|^2}{4t}} \right| = 1$,

we can check that if g∈L1(Rn) ∩ L2(Rn), then

$\|u(\cdot, t)\|_{L^2(R^n)} = \|g\|_{L^2(R^n)}$ (t > 0). …(4)

Hence the mapping

g→u(⋅, t)

preserves the L2-norm. Therefore we can extend formula (2) to functions g ∈ L2(Rn), in the same way that we extended the definition of Fourier transform.

Remark. We call

$\Psi(x, t) = \frac{1}{(4\pi i t)^{n/2}}\, e^{\frac{i|x|^2}{4t}}$ (x ∈ Rn, t ≠ 0) …(5)

the fundamental solution of Schrodinger’s equation.

Note that formula (2), u = g ∗ Ψ, makes sense for all time t ≠ 0, even t < 0. Thus we in fact have solved the problem.

iut + ∆u = 0 in Rn × (−∞, ∞)

u = g on Rn × {t = 0}. …(6)

In particular, Schrodinger’s equation is reversible in time, whereas the heat equation is not.

Example 4 : (Wave equation).

We next analyze the initial-value problem for the wave equation

utt − ∆u = 0 in Rn × (0, ∞),

u = g, ut = 0 on Rn × {t = 0}, …(1)

where for simplicity we suppose the initial velocity to be zero.

Take as before $\hat{u}$ to be the Fourier transform of u in the variable x∈Rn. Then (1) gives

$\hat{u}_{tt} + |y|^2\, \hat{u} = 0$ for t > 0,

$\hat{u} = \hat{g},\ \hat{u}_t = 0$ for t = 0. …(2)

This is an ODE for each fixed y ∈Rn.

We look for a solution having the form

$\hat{u} = \beta e^{\gamma t}$ (β, γ ∈ C) …(3)

Plugging into (2) gives

$\gamma^2 + |y|^2 = 0$ , …(4)

and so

$\gamma = \pm i|y|$. …(5)

Remembering the initial conditions from (2), we deduce

$\hat{u} = \frac{\hat{g}}{2} \left( e^{it|y|} + e^{-it|y|} \right)$. …(6)

Inverting, we find

$u(x, t) = \left( \frac{\hat{g}}{2} \left( e^{it|y|} + e^{-it|y|} \right) \right)^{\vee}$. …(7)

Consequently, we get the formula

$u(x, t) = \frac{1}{(2\pi)^{n/2}} \int_{R^n} \frac{\hat{g}(y)}{2} \left( e^{i(x\cdot y + t|y|)} + e^{i(x\cdot y - t|y|)} \right) dy$ , …(8)

for x∈Rn, t ≥ 0.
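As a worked special case (added here for illustration, with n = 1), the Fourier formula collapses to d'Alembert's formula for zero initial velocity. The only facts used are that |y| may be replaced by y inside the symmetric sum of exponentials, and the shift rule for the transform defined in (1) of this section.

```latex
% Case n = 1 of the Fourier representation of the wave equation solution.
% Shift rule for the transform (1): (g(\cdot + a))^{\wedge}(y) = e^{iay}\,\hat g(y).
\begin{aligned}
e^{it|y|} + e^{-it|y|} &= e^{ity} + e^{-ity} \qquad (y \in \mathbb{R}),\\
\left(e^{ity}\,\hat g\right)^{\vee}(x) &= g(x + t), \qquad
\left(e^{-ity}\,\hat g\right)^{\vee}(x) = g(x - t),\\
u(x,t) &= \tfrac12\left[\, g(x + t) + g(x - t) \,\right].
\end{aligned}
```

Each exponential factor in the transform therefore corresponds to a rigid translation of the initial profile, one to the left and one to the right, at unit speed.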

Laplace Transform

Remember that we write R+ = (0, ∞).

Definition : If u ∈L1(R+), we define its Laplace transform to be

$u^{\#}(s) = \int_0^\infty e^{-st}\, u(t)\,dt$ (s ≥ 0). …(∗)

Whereas the Fourier transform is most appropriate for functions defined on all of R (or Rn), the Laplace transform is useful for functions defined only on R+.

In practice this means that for a partial differential equation involving time, it may be useful to perform a Laplace transform in t, holding the space variables x fixed.


Example 1 : (Resolvents and Laplace Transform).

Consider again the heat equation

vt − ∆v = 0 in U×(0, ∞)

v = f on U×{t = 0}, …(1)

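and perform a Laplace transform with respect to time :

$v^{\#}(x, s) = \int_0^\infty e^{-st}\, v(x, t)\,dt$ (s > 0). …(2)

We compute

$\Delta v^{\#}(x, s) = \int_0^\infty e^{-st}\, \Delta v(x, t)\,dt$

$= \int_0^\infty e^{-st}\, v_t(x, t)\,dt$

$= s \int_0^\infty e^{-st}\, v(x, t)\,dt + \left[ e^{-st}\, v \right]_{t=0}^{t=\infty}$

$= s\,v^{\#}(x, s) - f(x)$. …(3)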

Think now of s > 0 being fixed, and write

u(x) = v#(x, s) . …(4)

Then

−∆u + su = f in U . …(5)

Thus the solution of the resolvent equation (5) with right hand side f is the Laplace transform of the solution of the heat equation with initial data f.
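A small numerical illustration of this correspondence, under assumptions made here and not in the text: take U = (0, π) in one dimension and the explicit heat solution v(x, t) = e^{−t} sin x with initial data f(x) = sin x. Its Laplace transform in t should then satisfy −u″ + su = f. The quadrature cutoff and grid are illustrative choices, so the residual is only approximately zero.

```python
import math

def laplace_transform(func, s, t_max=40.0, n=20000):
    """Trapezoid approximation of integral_0^t_max exp(-s t) func(t) dt."""
    h = t_max / n
    total = 0.5 * (func(0.0) + math.exp(-s * t_max) * func(t_max))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * func(t)
    return total * h

def v(x, t):
    """Assumed heat solution on U = (0, pi): v = exp(-t) sin x, so f(x) = sin x."""
    return math.exp(-t) * math.sin(x)

def resolvent_residual(x, s, h=1e-3):
    """-u'' + s u - f for u(x) = v#(x, s), with u'' by central differences."""
    u = lambda xx: laplace_transform(lambda t: v(xx, t), s)
    u_xx = (u(x + h) - 2 * u(x) + u(x - h)) / (h * h)
    return -u_xx + s * u(x) - math.sin(x)
```

For this example the transform is available exactly, v#(x, s) = sin x / (s + 1), which gives a direct check of the quadrature.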

Example 2 : (Wave equation from the heat equation).

Next we employ some Laplace transform ideas to provide a new derivation of the solution for the wave equation, based upon the heat equation.

Suppose u is a bounded, smooth solution of the initial-value problem :


utt − ∆u = 0 in Rn×(0, ∞),

u = g, ut = 0 on Rn×{t = 0}, …(1)

where n is odd and g is smooth, with compact support.

We extend u to negative times by writing

u(x, t) = u(x, −t) if x∈Rn, t < 0. …(2)

Then

utt − ∆u = 0 in Rn × R. …(3)

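Next define

$v(x, t) = \frac{1}{(4\pi t)^{1/2}} \int_{-\infty}^{\infty} e^{-s^2/4t}\, u(x, s)\,ds$ (x ∈Rn, t > 0). …(4)

Hence

$\lim_{t\to 0} v = g$ , …(5)

uniformly on Rn. In addition

$\Delta v(x, t) = \frac{1}{(4\pi t)^{1/2}} \int_{-\infty}^{\infty} e^{-s^2/4t}\, \Delta u(x, s)\,ds$

$= \frac{1}{(4\pi t)^{1/2}} \int_{-\infty}^{\infty} e^{-s^2/4t}\, u_{ss}(x, s)\,ds$

$= \frac{1}{(4\pi t)^{1/2}} \int_{-\infty}^{\infty} \frac{s}{2t}\, e^{-s^2/4t}\, u_s(x, s)\,ds$

$= \frac{1}{(4\pi t)^{1/2}} \int_{-\infty}^{\infty} \left( \frac{s^2}{4t^2} - \frac{1}{2t} \right) e^{-s^2/4t}\, u(x, s)\,ds$

$= v_t(x, t)$. …(6)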

Consequently v solves this initial-value problem for the heat equation :

vt −∆v = 0 in Rn×(0, ∞),


v = g on Rn×{t = 0}. …(7)

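As v is bounded, we deduce that

$v(x, t) = \frac{1}{(4\pi t)^{n/2}} \int_{R^n} e^{-\frac{|x - y|^2}{4t}}\, g(y)\,dy$. …(8)

We equate (4) with (8), recall (2), and set

$\lambda = \frac{1}{4t}$.

We obtain the identity

$\int_0^\infty u(x, s)\, e^{-\lambda s^2}\,ds = \frac{1}{2} \left( \frac{\lambda}{\pi} \right)^{\frac{n-1}{2}} \int_{R^n} e^{-\lambda |x - y|^2}\, g(y)\,dy$.

Thus

$\int_0^\infty u(x, s)\, e^{-\lambda s^2}\,ds = \frac{n\alpha(n)}{2} \left( \frac{\lambda}{\pi} \right)^{\frac{n-1}{2}} \int_0^\infty e^{-\lambda r^2}\, r^{n-1}\, G(x; r)\,dr$, …(9)

for all λ > 0, where

$G(x; r) = ⨍_{\partial B(x, r)} g(y)\,dS(y)$ …(10)

is the average of g over the sphere ∂B(x, r).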

We will solve (9), (10) for u.

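To do so, we write n = 2k + 1 and note

$-\frac{1}{2r} \frac{d}{dr} \left( e^{-\lambda r^2} \right) = \lambda e^{-\lambda r^2}$. …(11)

Hence

$\lambda^{\frac{n-1}{2}} \int_0^\infty e^{-\lambda r^2}\, r^{n-1}\, G(x; r)\,dr = \int_0^\infty \lambda^k e^{-\lambda r^2}\, r^{2k}\, G(x; r)\,dr$

$= \frac{(-1)^k}{2^k} \int_0^\infty \left[ \left( \frac{1}{r} \frac{d}{dr} \right)^k \left( e^{-\lambda r^2} \right) \right] r^{2k}\, G(x; r)\,dr$

$= \frac{1}{2^k} \int_0^\infty r \left[ \left( \frac{1}{r} \frac{\partial}{\partial r} \right)^k \left( r^{2k-1} G(x; r) \right) \right] e^{-\lambda r^2}\,dr$, …(12)

where we integrated by parts k times for the last equality.

Owing to (9) (with r replacing s in the expression on the left), we deduce from (12),

$\int_0^\infty u(x, r)\, e^{-\lambda r^2}\,dr = \frac{n\alpha(n)}{\pi^k 2^{k+1}} \int_0^\infty r \left[ \left( \frac{1}{r} \frac{\partial}{\partial r} \right)^k \left( r^{2k-1} G(x; r) \right) \right] e^{-\lambda r^2}\,dr$. …(13)

Upon substituting τ = r² we see that each side above, taken as a function of λ, is a Laplace transform. As two Laplace transforms agree only if the original functions were identical, we deduce

$u(x, t) = \frac{n\alpha(n)}{\pi^k 2^{k+1}}\; t \left( \frac{1}{t} \frac{\partial}{\partial t} \right)^k \left( t^{2k-1} G(x; t) \right)$. …(14)

Now n = 2k + 1 and

$\alpha(n) = \frac{\pi^{n/2}}{\Gamma\left( \frac{n}{2} + 1 \right)} = \frac{\pi^{k + \frac{1}{2}}}{\Gamma\left( \frac{n}{2} + 1 \right)}$. …(15)

Since

$\Gamma\left( \frac{1}{2} \right) = \pi^{1/2}$ , …(16)

and

Γ(x + 1) = xΓ(x) for x > 0,

we compute

$\frac{n\alpha(n)}{\pi^k 2^{k+1}} = \frac{n\,\pi^{1/2}}{2^{k+1}\, \Gamma\left( \frac{n}{2} + 1 \right)} = \frac{1}{(n-2)(n-4)\cdots 5\cdot 3} = \frac{1}{\gamma_n}$. …(17)

We insert this deduction (17) into (14) and simplify :

$u(x, t) = \frac{1}{\gamma_n} \left( \frac{\partial}{\partial t} \right) \left( \frac{1}{t} \frac{\partial}{\partial t} \right)^{\frac{n-3}{2}} \left( t^{n-2}\, ⨍_{\partial B(x, t)} g\,dS \right)$ (x ∈Rn, t > 0). …(18)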

9.4 CONVERTING NONLINEAR INTO LINEAR PDE

Now we describe several techniques which are sometimes useful for converting certain nonlinear equations into linear equations.

Hopf-Cole Transformation

A parabolic PDE with quadratic nonlinearity.

We consider first of all an initial-value problem for a quasilinear parabolic equation :

ut −a∆u + b|Du|2 = 0 in Rn×(0, ∞)

u = g on Rn ×{t = 0}, …(1)

where a > 0.

This sort of nonlinear PDE arises in stochastic optimal control theory.

Assuming for the moment u is a smooth solution of (1), we set

w = φ(u), …(2)

where

φ : R→R …(3)

is a smooth function, as yet unspecified.

We will try to choose φ so that w solves a linear equation. We have

wt = φ′(u)ut, …(4)

∆w = φ′(u)∆u + φ′′(u)|Du|2; …(5)

and consequently (1) implies


φ′(u)ut = φ′(u) [a∆u−b|Du|2]

=a∆w − [aφ′′(u) + bφ′(u)] |Du|2 ,

or

wt = a∆w, …(6)

provided we choose φ to satisfy

aφ′′ + bφ′ = 0. …(7)

We solve this differential equation (7) by setting

$\varphi = e^{-\frac{bu}{a}}$. …(8)

Thus we see that if u solves (1), then

$w = e^{-\frac{bu}{a}}$ …(9)

solves this initial-value problem for the heat equation (with conductivity a):

$w_t - a\Delta w = 0$ in Rn × (0, ∞),

$w = e^{-\frac{bg}{a}}$ on Rn × {t = 0}. …(10)

Formula (9) is the Hopf-Cole transformation.

Now the unique bounded solution of (10) is

$w(x, t) = \frac{1}{(4\pi a t)^{n/2}} \int_{R^n} e^{-\frac{|x - y|^2}{4at} - \frac{b}{a} g(y)}\,dy$ (x ∈Rn, t > 0); …(11)

and, since (9) implies

$u = -\frac{a}{b} \log w$, …(12)

we obtain thereby the explicit formula

$u(x, t) = -\frac{a}{b} \log \left( \frac{1}{(4\pi a t)^{n/2}} \int_{R^n} e^{-\frac{|x - y|^2}{4at} - \frac{b}{a} g(y)}\,dy \right)$ (x ∈Rn, t > 0) …(13)


for a solution of quasilinear initial-value problem (1).

Burgers’ Equation with Viscosity.

As a further application, we examine now for n = 1 the initial-value problem for the viscous Burgers’ equation:

ut − auxx + uux = 0 in R × (0, ∞)

u = g on R × {t = 0}. …(14)

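If we set

$w(x, t) = \int_{-\infty}^{x} u(y, t)\,dy$ …(15)

and

$h(x) = \int_{-\infty}^{x} g(y)\,dy$, …(16)

we have

$w_t - a w_{xx} + \frac{1}{2} w_x^2 = 0$ in R×(0, ∞)

w = h on R×{t = 0}. …(17)

This is an equation of the form (1) for

n = 1, b = 1/2.

So (13) provides the formula

$w(x, t) = -2a \log \left( \frac{1}{(4\pi a t)^{1/2}} \int_{-\infty}^{\infty} e^{-\frac{|x - y|^2}{4at} - \frac{h(y)}{2a}}\,dy \right)$. …(18)

Since

u = wx,

we find upon differentiating (18) that

$u(x, t) = \frac{\int_{-\infty}^{\infty} \frac{x - y}{t}\, e^{-\frac{|x - y|^2}{4at} - \frac{h(y)}{2a}}\,dy}{\int_{-\infty}^{\infty} e^{-\frac{|x - y|^2}{4at} - \frac{h(y)}{2a}}\,dy}$ (x ∈R, t > 0) …(19)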

is a solution of problem (14), where h is defined by (16).
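Formula (19) is easy to evaluate by quadrature. The sketch below is an illustration under assumptions made here: the test data h(y) = −2a log(1 + e^{ky}) (so that e^{−h/2a} = 1 + e^{ky}), the truncation radius and the grid size. For this data both integrals in (19) are Gaussian integrals, and the quotient is u = −2ak E/(1 + E) with E = e^{kx + ak²t}, which the code uses as a reference value.

```python
import math

def burgers_hopf_cole(x, t, a, h_func, radius=15.0, n=6000):
    """Evaluate formula (19) by the trapezoid rule in z = y - x on [-radius, radius]."""
    step = 2.0 * radius / n
    num = 0.0
    den = 0.0
    for k in range(n + 1):
        z = -radius + k * step
        y = x + z
        weight = 0.5 if k in (0, n) else 1.0
        e = math.exp(-z * z / (4.0 * a * t) - h_func(y) / (2.0 * a))
        num += weight * (-z / t) * e  # (x - y) / t equals -z / t
        den += weight * e
    return num / den

A_VISC = 0.5  # assumed viscosity a
K = 1.0       # assumed steepness parameter k of the test data

def h_test(y):
    """Assumed test data: h(y) = -2 a log(1 + exp(k y)), the integral of g from -inf."""
    return -2.0 * A_VISC * math.log1p(math.exp(K * y))

def exact(x, t):
    """Reference value for this data: u = -2 a k E / (1 + E), E = exp(k x + a k^2 t)."""
    E = math.exp(K * x + A_VISC * K * K * t)
    return -2.0 * A_VISC * K * E / (1.0 + E)
```

The reference solution is a smooth front joining the states 0 and −2ak, moving to the left with speed ak; the viscosity a sets its width.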

Potential Functions

Another technique is to utilize a potential function to convert a nonlinear system of PDE into a single linear PDE.

We consider as an example Euler’s equations for inviscid, incompressible fluid flow :

(a) ut + u⋅Du = −Dp + f in R3×(0, ∞)

(b) div u = 0 in R3×(0, ∞)

(c) u = g on R3×{t = 0}. …(20)

Here the unknowns are the velocity field u = (u1, u2, u3) and the scalar pressure p. The external force f = (f1, f2, f3) and initial velocity g = (g1, g2, g3) are given. Here D as usual denotes the gradient in the spatial variables x = (x1, x2, x3). The vector equation (20)(a) means

$u^i_t + \sum_{j=1}^{3} u^j u^i_{x_j} = -p_{x_i} + f^i$ (i = 1, 2, 3). …(21)

We will assume

div g = 0. …(22)

If furthermore there exists a scalar function

h : R3 × (0, ∞)→R …(23)

such that

f = Dh, …(24)

we say that the external force is derived from the potential h.

We will try to find a solution (u, p) of (20) for which the velocity field u is also derived from a potential, say


u = Dv. …(25)

The flow will then be irrotational as

curl u ≡ 0. …(26)

Now equations (20)(b) and (25) imply

∆v = 0, …(27)

and so v must be harmonic as a function of x, for each time t > 0.

Thus if we can find a smooth function v satisfying (27) and

Dv (⋅, 0) = g, …(28)

we can then recover u from v by (25).

How do we compute the pressure p? Let us observe that because of (25), we have

$u\cdot Du = \frac{1}{2} D(|Dv|^2)$. …(29)

Consequently (20)(a) reads

$D\left( v_t + \frac{1}{2} |Dv|^2 \right) = D(-p + h)$.

Therefore we may take

$v_t + \frac{1}{2} |Dv|^2 + p = h$. …(30)

This is Bernoulli’s law.

But now we can employ (30) to compute p, since v and h are already known.

9.5 HODOGRAPH AND LEGENDRE TRANSFORMS

Hodograph Transform

The hodograph transform is a technique for converting certain quasilinear systems of PDE into linear systems, by reversing the roles of the dependent and independent variables.

As this method is most easily understood by an example, we investigate here the equations of steady, two-dimensional, irrotational fluid flow :


(a) $(\sigma^2(u) - (u^1)^2)\, u^1_{x_1} - u^1 u^2 (u^1_{x_2} + u^2_{x_1}) + (\sigma^2(u) - (u^2)^2)\, u^2_{x_2} = 0$

(b) $u^1_{x_2} - u^2_{x_1} = 0$ , …(31)

in R2.

The unknown is the velocity field $u = (u^1, u^2)$. The function

σ(⋅) : R2→R,

the local sound speed, is given.

The system (31) is quasilinear.

Let us now, however, no longer regard $u^1$ and $u^2$ as functions of $x_1$ and $x_2$:

$u^1 = u^1(x_1, x_2),\ u^2 = u^2(x_1, x_2)$, …(32)

but rather regard $x^1$ and $x^2$ as functions of $u_1$ and $u_2$:

$x^1 = x^1(u_1, u_2),\ x^2 = x^2(u_1, u_2)$. …(33)

We have exchanged sub and superscripts in the notation to emphasize the interchange between independent and dependent variables.

According to the Inverse Function Theorem, we can invert equations (32) to yield (33), provided

$J = \frac{\partial(u^1, u^2)}{\partial(x_1, x_2)} = u^1_{x_1} u^2_{x_2} - u^1_{x_2} u^2_{x_1} \ne 0$ , …(34)

in some region of R2.

Assuming now (34) holds, we calculate

$u^2_{x_2} = J x^1_{u_1}, \quad u^1_{x_1} = J x^2_{u_2},$

$u^1_{x_2} = -J x^1_{u_2}, \quad u^2_{x_1} = -J x^2_{u_1}$. …(35)

We insert (35) into (31), to obtain

(a) $(\sigma^2(u) - u_1^2)\, x^2_{u_2} + u_1 u_2 (x^1_{u_2} + x^2_{u_1}) + (\sigma^2(u) - u_2^2)\, x^1_{u_1} = 0$,

(b) $x^1_{u_2} - x^2_{u_1} = 0$. …(36)

This is linear system for x = (x1, x2), as a function of u = (u1, u2).


Remark : We can utilize the method of potential functions to simplify (36) further. Indeed, equation (36b) suggests that we look for a single function z = z(u) such that

$x^1 = z_{u_1}$,

$x^2 = z_{u_2}$. …(37)

Then (36a) transforms into the linear second-order PDE

$(\sigma^2(u) - u_1^2)\, z_{u_2 u_2} + 2 u_1 u_2\, z_{u_1 u_2} + (\sigma^2(u) - u_2^2)\, z_{u_1 u_1} = 0$. …(38)

Legendre Transform

A technique closely related to the hodograph transform is the classical Legendre transform. The idea is to regard the components of the gradient of a solution as new independent variables.

Once again an example is instructive. We investigate the minimal surface equation

$\mathrm{div} \left( \frac{Du}{(1 + |Du|^2)^{1/2}} \right) = 0$, …(39)

which for n = 2 may be rewritten as

$(1 + u_{x_2}^2)\, u_{x_1 x_1} - 2 u_{x_1} u_{x_2}\, u_{x_1 x_2} + (1 + u_{x_1}^2)\, u_{x_2 x_2} = 0$. …(40)

Let us now assume that at least in some region of R2, we can invert the relations

$p_1 = u_{x_1}(x_1, x_2)$,

$p_2 = u_{x_2}(x_1, x_2)$, …(41)

to solve for

x1 = x1(p1, p2),

x2 = x2(p1, p2). …(42)

The Inverse Function Theorem assures us we can do so in a neighborhood of any point where

J = det D2u ≠ 0. …(43)


Now define

v(p) = x(p)⋅p −u(x(p)), …(44)

where x = (x1, x2) is given by (42) and p = (p1, p2). We find that (exercise)

$u_{x_1 x_1} = J v_{p_2 p_2}$,

$u_{x_1 x_2} = -J v_{p_1 p_2}$,

$u_{x_2 x_2} = J v_{p_1 p_1}$. …(45)

Upon substituting the identities (45) into (40), we derive for v the following linear equation

$(1 + p_2^2)\, v_{p_2 p_2} + 2 p_1 p_2\, v_{p_1 p_2} + (1 + p_1^2)\, v_{p_1 p_1} = 0$. …(46)

Remark. The hodograph and Legendre transform techniques for obtaining linear out of nonlinear PDE are in practice tricky to use, as it is usually not possible to transform given boundary conditions very easily.

The Books Recommended for Chapter IX 1. L.C. Evans Partial Differential Equations, Graduate Studies


ATTRACTION AND POTENTIAL-I


Chapter-10

Attraction and Potential-I

10.1 LAW OF GRAVITATION

This law states that “every particle in the universe attracts every other particle with a force which is directly proportional to the product of the masses of the particles and inversely proportional to the square of the distance between them.”

Thus, if m1, m2 denote the masses of two particles and r their distance apart, then the force of attraction between them is

$\gamma\, \frac{m_1 m_2}{r^2}$,

where γ is known as the gravitation constant.

Remark I :- This law was discovered by Sir Isaac Newton (1642-1727)

Remark II :- Gravitation constant γ measures the attraction of two particles, each of unit mass, at unit distance apart.

Remark III :- To avoid a difficulty in defining the distance between two particles, we may define a material particle as a body so small that, for the purposes of our investigation, the distance between different parts of body may be neglected.

Remark IV : The numerical value of γ is approximately $\frac{1}{15{,}500{,}000}$.

Remark V :- If we choose units such that γ = 1, such units are called astronomical or theoretical units.

Remark VI :- The acceleration f produced by the attraction of a particle of mass m on a particle at a distance r is given by

$f = \gamma\, \frac{m}{r^2}$,


so that γ = 1, when f, m and r are all unity. Hence, the astronomical unit of mass is the mass of a particle which by its attraction produces unit acceleration at unit distance.

We can find the astronomical unit of mass in grammes by taking the above formula for acceleration, which holds good in all systems of units, and putting r = 1 cm, f = 1cm/sec2, m = 15,500,000 grammes.

If P is a particle of unit mass and Q another particle of mass m, then the force of attraction

$F = \gamma\, \frac{m \times 1}{(PQ)^2}$

is called the attraction of Q at P, and acts along the line PQ towards Q.

Field of force :

The attraction of a system of particles at a point external to itself is the force of attraction which the system would exert on a particle of unit mass placed at the point. There must be a definite value for this force at every point at which a particle can be placed. Thus we arrive at the conception of a field of force, or region of space with every point of which there is associated a force which is definite in magnitude and direction.

Remark At the point of equilibrium the definite value of force of attraction is zero.

10.2 ATTRACTION OF A SYSTEM OF PARTICLES

Let particles of masses m1, m2, m3,… be situated at points A1, A2, A3,… whose co-ordinates referred to rectangular axes are (x1, y1, z1), (x2, y2, z2), (x3, y3, z3),… .

[Figure: a particle Q of mass m attracting a particle P of unit mass]

[Figure: the particles A1(x1, y1, z1), A2(x2, y2, z2), A3 and the point P, referred to rectangular axes Ox, Oy, Oz]

Let P (x, y, z) be any point in space. Let (X, Y, Z) denote the components of the attraction of the given system of particles at point P(x, y, z).

Let r1, r2, r3,… denote the distances PA1, PA2, PA3,…,

so that

$r_1^2 = (x_1 - x)^2 + (y_1 - y)^2 + (z_1 - z)^2$,

$r_2^2 = (x_2 - x)^2 + (y_2 - y)^2 + (z_2 - z)^2$,

$r_3^2 = (x_3 - x)^2 + (y_3 - y)^2 + (z_3 - z)^2$,

…

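and the direction cosines of PA1, PA2, PA3,… are

$\left\langle \frac{x_1 - x}{r_1}, \frac{y_1 - y}{r_1}, \frac{z_1 - z}{r_1} \right\rangle$,

$\left\langle \frac{x_2 - x}{r_2}, \frac{y_2 - y}{r_2}, \frac{z_2 - z}{r_2} \right\rangle$,

$\left\langle \frac{x_3 - x}{r_3}, \frac{y_3 - y}{r_3}, \frac{z_3 - z}{r_3} \right\rangle$,

…

respectively.

The attraction at the point P of mass m1 situated at the point A1 is $m_1 / r_1^2$ (on taking γ = 1) and is directed along $\overrightarrow{PA_1}$.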

Therefore, the particle m1 located at A1(x1, y1, z1) exerts a force at P(x, y, z), whose components parallel to the axes are

$X_1 = \frac{m_1}{r_1^2} \left( \frac{x_1 - x}{r_1} \right), \quad Y_1 = \frac{m_1}{r_1^2} \left( \frac{y_1 - y}{r_1} \right), \quad Z_1 = \frac{m_1}{r_1^2} \left( \frac{z_1 - z}{r_1} \right)$.

The other particles m2, m3,… make like contributions for the attraction at P(x, y, z).


The principle of superposition of fields of force states that “the force exerted at a point by a system of particles is the vector sum of the forces exerted by each of the particles separately”.

So, by the principle of superposition of fields of force, total attraction at P(x, y, z) due to the given system of particles is

\[
X = \sum_k \frac{m_k (x_k - x)}{r_k^3},\qquad
Y = \sum_k \frac{m_k (y_k - y)}{r_k^3},\qquad
Z = \sum_k \frac{m_k (z_k - z)}{r_k^3},
\]

where k = 1, 2, 3, … and the summation extends to all the attracting particles.
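The superposition sums for X, Y, Z can be evaluated directly. The following is a minimal Python sketch, assuming γ = 1 as in the text; the function name `attraction` and the sample masses and positions are illustrative choices, not part of the original notes.

```python
import math

def attraction(particles, p):
    """Return (X, Y, Z) at the point p, with gamma = 1.

    particles is a list of (mass, (xk, yk, zk)) pairs (hypothetical layout).
    Each particle contributes m_k (x_k - x) / r_k^3, and similarly for y, z.
    """
    X = Y = Z = 0.0
    for m, (xk, yk, zk) in particles:
        dx, dy, dz = xk - p[0], yk - p[1], zk - p[2]
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        X += m * dx / r**3
        Y += m * dy / r**3
        Z += m * dz / r**3
    return X, Y, Z

# Two equal masses placed symmetrically about the origin (illustrative data).
particles = [(2.0, (1.0, 0.0, 0.0)), (2.0, (-1.0, 0.0, 0.0))]
print(attraction(particles, (0.0, 0.0, 0.0)))  # contributions cancel at the midpoint
print(attraction(particles, (0.0, 2.0, 0.0)))  # resultant points back towards the masses
```

At the midpoint the two contributions cancel, which illustrates the earlier remark that the attraction vanishes at a point of equilibrium.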

10.3 POTENTIAL

Let particles of masses m1, m2, m3,… be situated at points A1, A2, A3,… whose co-ordinates referred to rectangular axes are (x1, y1, z1), (x2, y2, z2),… . Let P (x, y, z) be any point of space. Let r1, r2, r3,… denote the distance PA1, PA2, PA3,……, i.e.,

\[
r_k^2 = (x_k - x)^2 + (y_k - y)^2 + (z_k - z)^2 \tag{1}
\]

for k = 1, 2, 3, ….

Let us now define a function V(x, y, z) by the formula

\[
V(x, y, z) = \sum_k \frac{m_k}{r_k}. \tag{2}
\]

The function V defined in (2) is a function related to a system of attracting particles, having a definite value at every point P of space external to the particles. It is a function of the co-ordinates (x, y, z) of P and is clearly a single-valued function, in the sense that it cannot have more than one value at each point P; for it represents simply the sum of the masses of the separate particles divided by their respective distances from P. Further, V represents a sum which does not depend on the particular system of axes of reference.

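Now, differentiation of equations (1) and (2) with respect to x gives

\[
\frac{\partial V}{\partial x} = \sum_k \left(-\frac{m_k}{r_k^2}\right)\frac{\partial r_k}{\partial x}, \tag{3}
\]

and

\[
r_k \frac{\partial r_k}{\partial x} = -(x_k - x). \tag{4}
\]

Using equation (4) in (3), we obtain

\[
\frac{\partial V}{\partial x} = \sum_k \frac{m_k (x_k - x)}{r_k^3}. \tag{5}
\]

But we know that the component X of the force of attraction at P is given by

\[
X = \sum_k \frac{m_k (x_k - x)}{r_k^3}. \tag{6}
\]

Equations (5) and (6) imply

\[
\frac{\partial V}{\partial x} = X, \tag{7}
\]

and similarly

\[
\frac{\partial V}{\partial y} = Y, \tag{8}
\]

\[
\frac{\partial V}{\partial z} = Z, \tag{9}
\]

where Y and Z are the other components of the force of attraction.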

Definition. The function V defined by (2) is called the potential of the attracting particles, or the potential of the field of force.

Result (1) :- We have proved that the derivatives of the potential V with regard to x, y, z give the components of attraction at P in the directions of the axes.

Result (2) :- Since the directions of the axes can be chosen arbitrarily, it follows that the space derivative of the potential V in any direction gives the component of attraction in that direction.

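For verification of result (2), let ∂/∂s denote differentiation in a direction ds whose direction cosines are $\langle l, m, n\rangle$, that is, $\left\langle \frac{\partial x}{\partial s}, \frac{\partial y}{\partial s}, \frac{\partial z}{\partial s} \right\rangle$. Then, by the chain rule,

\[
\frac{\partial V}{\partial s}
= \frac{\partial V}{\partial x}\frac{\partial x}{\partial s}
+ \frac{\partial V}{\partial y}\frac{\partial y}{\partial s}
+ \frac{\partial V}{\partial z}\frac{\partial z}{\partial s}
= l\,\frac{\partial V}{\partial x} + m\,\frac{\partial V}{\partial y} + n\,\frac{\partial V}{\partial z}
= lX + mY + nZ,
\]

which is the component of the force of attraction in the direction $\langle l, m, n\rangle$.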

Remark 1 :- In the language of vectors, the force of attraction, say $\vec{R}$, is the gradient of the potential V. That is,

\[
\vec{R} = \operatorname{grad} V.
\]

Remark 2 :- If the potential V of any given distribution of matter can be determined, the force of attraction $\vec{R}$ at any point can be found immediately by taking the gradient of the scalar potential V.
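The relation between the attraction and the gradient of V can be checked numerically. The sketch below, with γ = 1, compares a central-difference gradient of V = Σ m/r with the direct sums for (X, Y, Z); the helper names and the sample data are assumptions made for illustration only.

```python
import math

def distance(a, b):
    """Euclidean distance between two points given as 3-tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def potential(particles, p):
    """V = sum of m_k / r_k over the particles (gamma = 1)."""
    return sum(m / distance(a, p) for m, a in particles)

def attraction(particles, p):
    """Direct sums X, Y, Z = sum of m_k (a_k - p) / r_k^3."""
    comps = [0.0, 0.0, 0.0]
    for m, a in particles:
        r = distance(a, p)
        for i in range(3):
            comps[i] += m * (a[i] - p[i]) / r**3
    return tuple(comps)

def grad_potential(particles, p, h=1e-5):
    """Central-difference estimate of (dV/dx, dV/dy, dV/dz) at p."""
    grad = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        grad.append((potential(particles, tuple(hi)) - potential(particles, tuple(lo))) / (2 * h))
    return tuple(grad)

# Illustrative system of three particles and an external point.
particles = [(1.0, (0.0, 0.0, 0.0)), (3.0, (2.0, 1.0, 0.0)), (0.5, (-1.0, 0.0, 2.0))]
point = (1.0, -2.0, 1.5)
print(attraction(particles, point))
print(grad_potential(particles, point))
```

The two printed triples should agree to within the accuracy of the finite-difference step.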

Physical Interpretation of Potential V(x, y, z).

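The total differential of the potential V(x, y, z) is

\[
dV = \frac{\partial V}{\partial x}\,dx + \frac{\partial V}{\partial y}\,dy + \frac{\partial V}{\partial z}\,dz
= X\,dx + Y\,dy + Z\,dz. \tag{1}
\]

Hence, by integrating along any path from the point P to the point Q, we get

\[
V_Q - V_P = \int_P^Q \left( X\frac{\partial x}{\partial s} + Y\frac{\partial y}{\partial s} + Z\frac{\partial z}{\partial s} \right) ds. \tag{2}
\]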

But the integral in R.H.S. of equation (2) represents the work which the forces of attraction would perform upon a particle of unit mass as it moved along this path from P to Q. This gives us a measure of potential V in terms of work per unit mass. The potential at any point Q exceeds the potential at any other point P by the work which the forces of attraction would perform upon a particle of unit mass as it moves along any path from P to Q.

Remark (1) : The addition of a constant to the potential V will not affect the values of the force components.

Remark 2 : We know that the potential V of the attracting particles is defined by

\[
V = \sum_k \frac{m_k}{r_k} \tag{3}
\]

for k = 1, 2, 3, …

From (3) it is clear that the potential V vanishes at an infinite distance from the attracting matter. So, when the potential is determined by integration from known force components (X, Y, Z), the constant of integration may be so chosen as to make the potential vanish at an infinite distance from the attracting matter.

On this hypothesis, we see that the potential at a given point P due to a given attracting system is the work that would be done by the attractions of the system on a particle of unit mass as it moves along any path from an infinite distance up to the point considered. Hence the definition of the potential V as Σ(m/r) leads to this expression for the potential in terms of work done per unit mass.

Result : We have seen that the definition of the potential V as

\[
V = \sum \frac{m}{r}
\]

leads to the expression for V in terms of work done per unit mass.

Now we shall demonstrate the converse.

Let m be the mass of a typical particle of the system situated at the point A. Let

PP′ = ds

be an element of any path from an infinite distance to the point Q, and let

AP = r,

AP′ = r + dr.

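[Figure: a particle of mass m at A and a path from infinity to Q passing through the neighbouring points P and P′, with AP = r.]

Then, so far as the field of force depends upon the particle m at A, its value at P is

\[
\frac{m}{r^2},
\]

and is directed along $\overrightarrow{PA}$. The work done by this force on the unit particle as it moves from P to P′ is

\[
\frac{m}{r^2}\left(-\frac{dr}{ds}\right) ds = -\frac{m\,dr}{r^2}.
\]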

Hence, the total work done by the attraction of the particle A(m) on a unit particle moving from an infinite distance to the point Q is

\[
-\int_{\infty}^{AQ} \frac{m}{r^2}\,dr = \frac{m}{AQ}.
\]

By the principle of superposition of fields, the total work done by the attractions of all the particles of the system is obtained by adding for their separate effects, so that

\[
V = \sum \frac{m}{AQ}
\]

gives the potential at Q. The above formula represents the sum of the masses of the separate particles each divided by its distance from Q. The interchangeability of the two definitions of potential is thus completely established.
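The work interpretation can be illustrated numerically. The sketch below uses the midpoint rule to estimate the work done by the attraction of a single particle on a unit particle moved along a straight segment, and compares it with the difference of the values of m/r at the two ends. The function name `work_along_segment`, the finite starting point (standing in for "infinity") and all numerical values are assumptions for illustration; γ = 1.

```python
import math

def work_along_segment(m, a, start, end, steps=20000):
    """Midpoint-rule estimate of the work done by the attraction of a mass m at a
    on a particle of unit mass moving in a straight line from start to end."""
    step = [(e - s) / steps for s, e in zip(start, end)]
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        p = [s + t * (e - s) for s, e in zip(start, end)]
        d = [ai - pi for ai, pi in zip(a, p)]          # vector from the moving point to A
        r = math.sqrt(sum(c * c for c in d))
        # force components m d / r^3, dotted with the displacement of this step
        total += sum(m * dc / r**3 * sc for dc, sc in zip(d, step))
    return total

m = 5.0
A = (0.0, 0.0, 0.0)
start = (40.0, -30.0, 0.0)   # distance 50 from A
Q = (1.0, 2.0, 2.0)          # distance 3 from A
work = work_along_segment(m, A, start, Q)
print(work, m / 3.0 - m / 50.0)
```

The path used here is not radial, so the agreement also illustrates that the work depends only on the end points. Moving the starting point farther away makes the second term m/50 tend to zero, leaving m/AQ.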

Dimensions : Gravitational potential V and potential energy have different physical dimensions. The dimensions of potential energy are those of work, i.e., $ML^2T^{-2}$, in terms of the fundamental units of mass, length and time. The dimensions of the gravitational potential V are obtained below.

Dimensions of the potential V.

It is important to remember that we are using astronomical units and omitting the gravitational constant γ, though this does not affect the argument when potential is defined as work per unit mass. We now use the formula for the potential given below :

\[
V = \sum \frac{m}{r}. \tag{1}
\]

If we want to find the dimensions of the gravitational potential V, we must restore the constant γ in (1) and write

\[
V = \gamma \sum \frac{m}{r}, \tag{2}
\]

because the constant γ has dimensions. By the definition of the force of attraction, the quantity

\[
F = \gamma\,\frac{m\,m_1}{r^2} \tag{3}
\]

represents a force and therefore has dimensions $MLT^{-2}$. Equation (3) gives

\[
[\gamma] = M^{-1}L^{3}T^{-2}. \tag{4}
\]

Thus, the dimensions of the constant γ are

\[
M^{-1}L^{3}T^{-2}.
\]

Hence, the dimensions of $\gamma\,(m/r)$ are

\[
\left(M^{-1}L^{3}T^{-2}\right)\left(\frac{M}{L}\right) = L^{2}T^{-2} = \left(LT^{-1}\right)^{2}.
\]

NOTE :- (1) The potential energy decreases when the work is done and the gravitational potential increases when the work is done.

(2) The dimensions of gravitational potential V are those of the square of a velocity.
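The dimensional bookkeeping above can be mirrored in a few lines of code. In this sketch a dimension is represented by a tuple of exponents of (M, L, T); the helper names `mul` and `div` are illustrative and not taken from the notes.

```python
# A dimension is a tuple of exponents (M, L, T).
MASS, LENGTH, TIME = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def mul(a, b):
    """Dimensions of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Dimensions of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))

force = div(mul(MASS, LENGTH), mul(TIME, TIME))                 # M L T^-2
gamma = div(mul(force, mul(LENGTH, LENGTH)), mul(MASS, MASS))   # from F = gamma m m1 / r^2
potential = div(mul(gamma, MASS), LENGTH)                       # gamma m / r
velocity_squared = mul(div(LENGTH, TIME), div(LENGTH, TIME))

print(gamma)                          # exponents of M, L, T for gamma
print(potential, velocity_squared)    # the two tuples coincide
```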

10.4 EQUIPOTENTIAL SURFACES AND LINES OF FORCE

Regarding the potential V(x, y, z) of a given attracting system as a function of coordinates x, y, z the equation

V(x, y, z) = constant, …(1)

represents a surface over which the potential V is constant. Such surfaces are called equipotential surfaces. Since the potential V(x, y, z) is single-valued, it is clear that only one equipotential surface passes through any point of space, so that no two equipotential surfaces can intersect. Also, since the potential V(x, y, z) has a constant value over an equipotential surface, no work would be done by the attraction on a particle moving on such a surface. Therefore, at every point the resultant attraction is normal to the equipotential surface through the point. This observation is also obvious from the relation

R = grad V. …(2)

Definition :- A curve such that the tangent at any point of it is in the direction of the resultant attractive force at that point is called a line of force.

Remark : A line of force is at right angles to the equipotential surfaces at all their points of intersection. Conversely, a surface which cuts all lines of force at right angles must be an equipotential surface, because at no point on the surface is there a component of force tangential to the surface, so that no work would be done on a particle moving on the surface and there could therefore be no variation in the potential.

Continuous Bodies

We now pass from the attraction components (X, Y, Z) and potential V(x, y, z) of a system of separate particles to the attraction components (X, Y, Z) and potential V(x, y, z) of distributions of matter regarded as continuous bodies.

By the principle of superposition we can obtain the attraction components and potential V(x, y, z) of such a body, provided that we have a means of summing the contributions of the separate particles.

It is natural to look to integration to effect the summation, but when we consider what the process of integration involves, we find that it does not fit the physical conditions of the problem precisely, and it is only by giving a special interpretation to our conception of 'body' that we can justify the use of integration. Thus, it is usual to represent the potential V of a continuous body by a volume integral, i.e.,

\[
V = \int \frac{\rho\,dv}{r},
\]

where dv denotes an element of volume of the body at the distance r from an external point P and ρ denotes the density of matter in dv.

The process of integration implies that the density of the body is continuous. To justify the use of such an integral it is necessary therefore to suppose that it is applied to a hypothetical continuous distribution of matter occupying the same region as the body and having at each point a suitably chosen density; this density being found by considering a small but finite volume surrounding the point and taking the average through this small volume of the masses of the particles of the real body contained therein.

Attraction of a uniform straight rod

Let m denote the mass per unit length of a uniform rod AB of finite length. It is required to find the components of attraction of the rod AB at an external point, say P.

Let the perpendicular from P to AB meet AB in M, which for simplicity we take on AB produced, and let

\[
MP = p. \tag{1}
\]

Consider an element QQ′ of the rod AB, where

\[
MQ = x,\qquad QQ' = dx. \tag{2}
\]

Let

\[
\angle MPQ = \theta.
\]

Then, from the triangle PMQ, we have

\[
x = p\tan\theta, \tag{3}
\]

\[
dx = p\sec^2\theta\,d\theta. \tag{4}
\]

The mass of the element QQ′ of the rod is

\[
m\,dx = m\,p\sec^2\theta\,d\theta.
\]

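The attraction at P of the element QQ′ of the rod AB is, therefore,

\[
\frac{m\,p\sec^2\theta\,d\theta}{(PQ)^2}\quad\text{along } PQ. \tag{5}
\]

[Figure: the rod AB with the element QQ′ and the foot M of the perpendicular from P; the components X and Y at P; the circle of centre P and radius PM meeting PA, PQ′, PQ, PB in D, R′, R, E.]

From the triangle PMQ, we have

\[
PQ = p\sec\theta. \tag{6}
\]

Combining (5) and (6), the attraction at P of the element QQ′ becomes

\[
\frac{m\,d\theta}{p}\quad\text{along } PQ. \tag{7}
\]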

Let the components of attraction of the rod AB parallel and perpendicular to BA be X and Y.

Let the angles MPA and MPB be α, β.

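Then, we have

\[
X = \int_{\beta}^{\alpha} \frac{m}{p}\sin\theta\,d\theta
= \frac{m}{p}\,(\cos\beta - \cos\alpha)
= \frac{2m}{p}\,\sin\tfrac{1}{2}(\alpha+\beta)\,\sin\tfrac{1}{2}(\alpha-\beta), \tag{8}
\]

and

\[
Y = \int_{\beta}^{\alpha} \frac{m}{p}\cos\theta\,d\theta
= \frac{m}{p}\,(\sin\alpha - \sin\beta)
= \frac{2m}{p}\,\cos\tfrac{1}{2}(\alpha+\beta)\,\sin\tfrac{1}{2}(\alpha-\beta). \tag{9}
\]

Let R be the resultant attraction. Then

\[
R = \sqrt{X^2 + Y^2}
= \frac{2m}{p}\,\sin\left(\frac{\alpha-\beta}{2}\right)
= \frac{2m}{p}\,\sin\left(\tfrac{1}{2}\angle APB\right). \tag{10}
\]

The direction of R, measured from PM, is given by

\[
\tan^{-1}\left(\frac{X}{Y}\right)
= \tan^{-1}\left(\frac{\cos\beta - \cos\alpha}{\sin\alpha - \sin\beta}\right)
= \tan^{-1}\left(\tan\frac{\alpha+\beta}{2}\right)
= \frac{\alpha+\beta}{2}. \tag{11}
\]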

Thus, the resultant attraction R acts along the bisector of the angle APB and makes an angle $\tfrac{1}{2}(\alpha+\beta)$ with PM.

Remark 1 : The component (8), parallel to the rod AB, can also be written as

\[
X = \frac{m}{PB} - \frac{m}{PA} \tag{12}
\]

in the sense parallel to BA.

Remark 2 : We note that if a circle of centre P and radius PM cuts PA, PQ′, PQ, PB in D, R′, R, E, then the attraction at P of the element RR′ of a rod in the form of a circular arc DE of the same line density (mass per unit length) as AB is

\[
\frac{m\,p\,d\theta}{p^2} = \frac{m\,d\theta}{p} = \text{attraction of } QQ'.
\]

Hence the circular arc DE exerts the same attraction as the rod AB.

Corollary : If the rod is infinitely long, the angle APB is two right angles and the resultant attraction is

\[
\frac{2m}{p}
\]

and perpendicular to the rod.

It would appear from this result that if the attracted particle were close to the rod, the attraction would be infinite; but this conclusion is not justified because in the foregoing argument we assumed that every point of an element QQ′ of the rod was at the same distance from the point P, and for this to be true when P is close to the rod it would be necessary for the rod to have no thickness. To find the attraction at a point close to a rod of finite thickness it will be necessary to take account of the cross-section of the rod.
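The closed forms for the rod can be compared with a direct summation over its elements. The sketch below places M at the origin, P at (0, p) and the rod on the x-axis between x_b and x_a (so that M lies on AB produced, as assumed above); the coordinate set-up, the function name `rod_attraction` and the numbers are illustrative assumptions.

```python
import math

def rod_attraction(m, p, x_b, x_a, n=20000):
    """Midpoint-rule sum of the attractions m dx / PQ^2 of the rod elements.

    Returns (X, Y): the components parallel to the rod (sense B to A) and
    perpendicular to it (from P towards the rod).
    """
    dx = (x_a - x_b) / n
    X = Y = 0.0
    for i in range(n):
        x = x_b + (i + 0.5) * dx
        pq = math.hypot(x, p)
        f = m * dx / pq**2       # attraction of the element, directed along PQ
        X += f * x / pq          # sin(theta) = MQ / PQ
        Y += f * p / pq          # cos(theta) = MP / PQ
    return X, Y

m, p, x_b, x_a = 1.5, 2.0, 1.0, 4.0
alpha, beta = math.atan(x_a / p), math.atan(x_b / p)   # angles MPA and MPB
X, Y = rod_attraction(m, p, x_b, x_a)
R = math.hypot(X, Y)
print(X, m / p * (math.cos(beta) - math.cos(alpha)))
print(Y, m / p * (math.sin(alpha) - math.sin(beta)))
print(R, 2 * m / p * math.sin((alpha - beta) / 2))
print(math.atan2(X, Y), (alpha + beta) / 2)            # inclination of R to PM
```

Each printed pair should agree closely, the last pair confirming that the resultant lies along the bisector of the angle APB.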

Potential of a uniform straight rod

Let m denote the mass per unit length of a uniform rod AB of finite length. It is required to find the potential of rod AB at an external point, say, P.

Let the perpendicular from P to AB meet AB in M, which for simplicity we take on AB produced. Let

MP = p.

Consider an element QQ′ of the rod, where

MQ = x, QQ′ = dx.

Let the angle MPQ be θ.

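From the triangle PMQ, we write

\[
x = p\tan\theta,\qquad dx = p\sec^2\theta\,d\theta,\qquad PQ = p\sec\theta.
\]

The mass of the element QQ′ of the rod is

\[
m\,dx = m\,p\sec^2\theta\,d\theta.
\]

[Figure: the rod AB with the element QQ′, the foot M of the perpendicular from P, and the angle MPQ = θ.]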

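The potential at P is given by the formula

\[
V = \int \frac{m\,dx}{PQ}
= \int \frac{m\,p\sec^2\theta\,d\theta}{p\sec\theta}
= m\int \sec\theta\,d\theta
= m\log\left[\cot\left(\frac{\angle PAB}{2}\right)\cot\left(\frac{\angle PBA}{2}\right)\right]. \tag{1}
\]

Let 2l denote the length of the rod AB and

\[
PA = r,\qquad PB = r',\qquad r + r' + 2l = 2s.
\]

Then,

\[
\cot\left(\frac{\angle PAB}{2}\right)\cot\left(\frac{\angle PBA}{2}\right)
= \sqrt{\frac{s\,(s-r')}{(s-r)(s-2l)}}\cdot\sqrt{\frac{s\,(s-r)}{(s-r')(s-2l)}}
= \frac{s}{s-2l}
= \frac{r + r' + 2l}{r + r' - 2l}. \tag{2}
\]

So, from equations (1) and (2), the potential V is expressed as

\[
V = m\log\left(\frac{r + r' + 2l}{r + r' - 2l}\right). \tag{3}
\]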

Remark 1 : If the ends A, B of the rod are the foci of an ellipse passing through P and 2a is its major axis, then r + r′ = 2a and

\[
V = m\log\left(\frac{a + l}{a - l}\right), \tag{4}
\]

or

\[
V = m\log\left(\frac{1 + e}{1 - e}\right), \tag{5}
\]

where e denotes the eccentricity of the ellipse, so that l = ae.

Hence the potential V is constant over any prolate spheroid of which A, B are the foci. That is, the family of confocal prolate spheroids are the equipotential surfaces. Since the normal to an ellipse at any point bisects the angle between the focal distances, and the resultant attraction at a point is normal to the equipotential surface, it follows that the resultant attraction at P bisects the angle APB.
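A short numerical sketch can be used to check the logarithmic formula for the potential of the rod and the statement about confocal ellipses. The rod is placed on the x-axis from A = (−l, 0) to B = (l, 0); this placement, the function names and the sample values are assumptions for illustration.

```python
import math

def rod_potential_numeric(m, l, px, py, n=20000):
    """Midpoint-rule sum of m dx / PQ over the rod from (-l, 0) to (l, 0)."""
    dx = 2 * l / n
    total = 0.0
    for i in range(n):
        x = -l + (i + 0.5) * dx
        total += m * dx / math.hypot(px - x, py)
    return total

def rod_potential_formula(m, l, px, py):
    """V = m log((r + r' + 2l) / (r + r' - 2l)) with r = PA, r' = PB."""
    r = math.hypot(px + l, py)
    r_dash = math.hypot(px - l, py)
    return m * math.log((r + r_dash + 2 * l) / (r + r_dash - 2 * l))

m, l = 2.0, 1.5
print(rod_potential_numeric(m, l, 0.7, 1.3), rod_potential_formula(m, l, 0.7, 1.3))

# Points on the ellipse with foci A, B and semi-major axis a = 2.5 (so e = 0.6, b = 2).
a, b = 2.5, 2.0
for t in (0.4, 1.1, 2.0):
    print(rod_potential_formula(m, l, a * math.cos(t), b * math.sin(t)), m * math.log(4.0))
```

For e = 0.6 the ratio (1 + e)/(1 − e) equals 4, so every point of the ellipse should give m log 4.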


Remark (2) : If the rod AB is of great length and P is in the neighbourhood of its centre, then we may put

\[
r + r' = 2\sqrt{l^2 + p^2}, \tag{6}
\]

where p is small compared with l. Then, from equation (3), we obtain

\[
V = m\log\left(\frac{\sqrt{l^2 + p^2} + l}{\sqrt{l^2 + p^2} - l}\right)
= 2m\log\left(\frac{l + \sqrt{l^2 + p^2}}{p}\right)
= 2m\log\left[\frac{l}{p}\left(1 + \sqrt{1 + \frac{p^2}{l^2}}\right)\right].
\]

Neglecting the term $(p/l)^2$, we write

\[
V = 2m\log 2l - 2m\log p. \tag{7}
\]

By differentiating (7), we get for the attraction of the rod in the direction of p increasing,

\[
\frac{\partial V}{\partial p} = -\frac{2m}{p}.
\]

10.5. THE ATTRACTION AND POTENTIAL OF A UNIFORM LONG ROD WHOSE CROSS-SECTION IS A CIRCLE

Take a cross-section of the long rod about the middle of its length. Let O be the centre of the cross-section and P any point inside it.

[Figure: the circular cross-section with centre O, an internal point P, and the chords QPR, Q′PR′ inclined at a small angle dθ.]

Through the point P draw chords QPR, Q′PR′ making a small angle dθ with one another, and intercepting small arcs QQ′, RR′ on the circle. We can conceive the long rod to be composed of long parallel rods, and take QQ′, RR′ as the cross-sections of two of them. Then, if m denotes the mass per unit area of the given long uniform rod, m·QQ′ and m·RR′ denote the masses per unit length of the two rods.

Let $\angle OQP = \angle ORP = \varphi$.

Using the result for the attraction of a long rod, the attraction at P due to the rod through QQ′ is

\[
\frac{2\,(m\,QQ')}{PQ} = 2m\sec\varphi\,d\theta \tag{1}
\]

and acts along PQ. Similarly, the attraction at P due to the rod through RR′ is

\[
\frac{2\,(m\,RR')}{PR} = 2m\sec\varphi\,d\theta \tag{2}
\]

and acts along PR.

Hence, these two rods exert equal and opposite attractions at P; and by dividing up the whole long rod into similar pairs of rods we obtain that its resultant attraction at any internal point is zero. Consequently, the potential must be constant at all points inside the long rod sufficiently far from its ends, and is therefore equal to the potential at O. But, the potential due to the rod QQ′ at O is

= 2m QQ′ log(2l) − 2m QQ′ log(OQ), …(3)

where 2l is the length of the rod. Therefore, for the whole potential, we write

V = 2M log 2l − 2M log a, …(4)

a being the radius of the cross-section and

M = 2πma. …(5)

Now we consider the case when the attraction and potential are to be determined at an external point. Let P′ be an external point. Let P be its inverse point w.r.t. the circle mentioned above. Then

\[
OP\cdot OP' = a^2. \tag{6}
\]

By the similarity of the triangles OPQ and OQP′,

\[
\angle OP'Q = \angle OQP = \varphi = \angle OP'R. \tag{7}
\]

Then, the attraction at the external point P′ of the rod through QQ′ is

\[
\frac{2m\,QQ'}{P'Q} = \frac{2m\,PQ\sec\varphi\,d\theta}{P'Q} = (2m\,d\theta\sec\varphi)\,\frac{a}{OP'} \tag{8}
\]

and acts along P′Q. But the resultant attraction of the rod at P′ is clearly along P′O, and by resolving the attraction of the rod through QQ′ in this direction, we get

\[
\frac{2ma}{OP'}\,d\theta.
\]

Similarly, other rods give like resultants and the whole attraction at P′ is, therefore, equal to

\[
\frac{4\pi m a}{OP'} = \frac{2M}{OP'}. \tag{9}
\]

We observe that this is the same as if the whole mass of the circular rod were condensed into a rod of equal mass along the axis of the cylinder. To find the potential, we may put

\[
r = OP' \tag{10}
\]

and write

\[
\frac{dV}{dr} = -\frac{2M}{r}, \tag{11}
\]

giving

\[
V = -2M\log r + C. \tag{12}
\]

[Figure: the circular cross-section with centre O, the external point P′, its inverse point P, and the chords QPR, Q′PR′.]

In order that V may take the form (4) when r = a, we must have

\[
C = 2M\log 2l. \tag{13}
\]

So,

\[
V = 2M\log 2l - 2M\log r. \tag{14}
\]
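The two results for the long circular rod (zero attraction inside, 2M/OP′ outside) can be checked by summing the attractions 2λ/d of many thin parallel rods placed round the circle. The sketch works in the plane of the cross-section, takes the long-rod law 2λ/d from the corollary above, and uses illustrative names and values throughout.

```python
import math

def cylinder_attraction(m, a, px, py, n=2000):
    """Sum of the attractions 2*lambda/d of n thin rods round a circle of radius a.

    m is the mass per unit area, so each thin rod has mass per unit length
    lambda = m * a * dphi. Returns the components of the resultant at (px, py).
    """
    dphi = 2 * math.pi / n
    fx = fy = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        qx, qy = a * math.cos(phi), a * math.sin(phi)
        lam = m * a * dphi
        dx, dy = qx - px, qy - py
        d2 = dx * dx + dy * dy
        fx += 2 * lam * dx / d2      # magnitude 2*lam/d, direction (dx, dy)/d
        fy += 2 * lam * dy / d2
    return fx, fy

m, a = 0.5, 1.0
M = 2 * math.pi * m * a                       # mass per unit length of the whole rod
print(cylinder_attraction(m, a, 0.3, 0.2))    # internal point: resultant close to zero
print(cylinder_attraction(m, a, 2.5, 0.0), -2 * M / 2.5)
```

At the external point the x-component is negative, that is, directed towards the axis, with magnitude 2M/r.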

10.6 ATTRACTION AND POTENTIAL OF A UNIFORM CIRCULAR DISC AT A POINT ON ITS AXIS

Let O be the centre of the disc, P a point on its axis Oz at a distance z from O. Let m denote mass per unit area.

Divide the disc into concentric rings. Let OQ = x be the radius and QQ′ = dx the breadth of one of these rings. The mass of the ring is 2πm x dx, and the attraction at P of each element is got by dividing the mass of the element by (PQ)². But the resultant attraction of the ring is in the direction PO, so that its magnitude is

\[
\frac{2\pi m\,x\,dx}{(PQ)^2}\cos\theta,
\]

where θ is the angle OPQ. But

\[
x = z\tan\theta,\qquad dx = z\sec^2\theta\,d\theta.
\]

[Figure: the disc with centre O, the ring element QQ′, a point S on the rim, and the point P on the axis Oz, with the angle OPQ = θ.]


So, if α is the angle which a radius of the disc subtends at P, we have for the whole attraction of the disc

\[
\int_0^{\alpha} \frac{2\pi m\,z^2\tan\theta\,\sec^2\theta\,\cos\theta}{z^2\sec^2\theta}\,d\theta
= 2\pi m\,(1 - \cos\alpha). \tag{1}
\]

Remark (1) : For an infinite plate we may put α = π /2, so that the attraction of an infinite plate is 2πm at right angles to itself.

Potential : The potential at the point P of the ring of radius x is given by

\[
\frac{2\pi m\,x\,dx}{PQ},
\]

so that the potential of the whole disc of radius a at the point P on its axis Oz at a distance z from O is

\[
V = \int_0^a \frac{2\pi m\,x\,dx}{PQ}
= 2\pi m\int_0^a \frac{x\,dx}{\sqrt{z^2 + x^2}}
= 2\pi m\left\{\sqrt{z^2 + a^2} - z\right\}. \tag{2}
\]

NOTE : The formula (2) gives the attraction in the direction PO as

\[
-\frac{dV}{dz} = 2\pi m\left(1 - \frac{z}{\sqrt{z^2 + a^2}}\right). \tag{3}
\]

Remark (1) : The formula (2) is equivalent to

V = 2πm (SP − OP), …(4)

and that this will give the value of V on either side of the disc if, SP, OP denote numerical lengths.
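Both results for the disc can be checked by summing over concentric rings. In the sketch below the function name `disc_numeric` and the values of m, a, z are illustrative; the closed forms compared against are 2πm(1 − cos α) and 2πm(SP − OP), with cos α = z/√(z² + a²).

```python
import math

def disc_numeric(m, a, z, n=20000):
    """Midpoint-rule sums over rings of radius x and breadth dx.

    Returns (attraction along PO, potential) at the axial point at height z.
    """
    dx = a / n
    attraction = potential = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        pq = math.hypot(x, z)
        ring_mass = 2 * math.pi * m * x * dx
        attraction += ring_mass / pq**2 * (z / pq)   # resolved along PO, cos(theta) = z / PQ
        potential += ring_mass / pq
    return attraction, potential

m, a, z = 1.0, 3.0, 4.0
sp = math.hypot(a, z)                    # distance from a rim point S to P
cos_alpha = z / sp
F, V = disc_numeric(m, a, z)
print(F, 2 * math.pi * m * (1 - cos_alpha))
print(V, 2 * math.pi * m * (sp - z))
```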



Solid Angles and Its Use

Definition : The solid angle of a cone is measured by the area intercepted by the cone on the surface of a sphere of unit radius having its centre at the vertex of the cone.

Definition : The solid angle subtended at a point by a surface of any form is measured by the solid angle of the cone whose vertex is at the given point and whose base is the given surface.

Let PP′ be a small element of area dS which subtends a solid angle dω at O.

Let the normal to the area dS make an acute angle γ with OP, and let OP = r. Then the cross-section at P of the cone which dS subtends at O is dS cos γ, and this cross-section and the small area dω intercepted on the unit sphere (having its centre at O) are similar figures, so that

\[
\frac{dS\cos\gamma}{d\omega} = \frac{r^2}{1}. \tag{1}
\]

Then from (1), we find

\[
d\omega = \frac{dS\cos\gamma}{r^2}, \tag{2}
\]

or

\[
dS = r^2\sec\gamma\,d\omega. \tag{3}
\]

Integrating both sides of equation (3), we obtain

\[
S = \int r^2\sec\gamma\,d\omega \tag{4}
\]

with suitable limits of integration. Hence the area of a finite surface can be represented as an integral over a spherical surface.

[Figure: the element of area dS at P, P′, its normal inclined at the angle γ to OP, the cross-section of the cone at P, and the solid angle dω subtended at O.]

10.7 USE OF SOLID ANGLES

There are many applications of the theory of attraction in which calculations are simplified by the use of the solid angle.

Article : Find the component of attraction perpendicular to itself produced by a plane plate of any form.

Let dS be an element of area of the plate at a distance r, from the point O and subtending a solid angle dω at O.

Let m denote the mass per unit area of the plate. Then the attraction of the element dS at the point O is given by

\[
\frac{m\,dS}{r^2}. \tag{5}
\]

Resolving the attraction in (5) at right angles to the plate, we get

\[
\frac{m\,dS\cos\theta}{r^2}, \tag{6}
\]

where θ is the inclination of r to the normal to the plate. From (2) and (6), we conclude that m dω is the contribution of area element dS of the plate to its whole attraction at O at right angles to itself. Further, the attraction of the whole plate in the same direction is m ω, where ω is the solid angle which the plate subtends at O. This completes the article.

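Article : Prove that the potential of a solid of uniform density ρ at an external point P can be represented by a surface integral

\[
\tfrac{1}{2}\,\rho\int \cos\theta\,dS
\]

over the surface of the solid, where θ is the angle between the inward normal to dS and the line joining dS to P.

[Figure: the plate, with the element dS at a distance r from O subtending the solid angle dω, and θ the inclination of r to the normal to the plate.]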

Proof :- Let a cone of small solid angle dω and vertex P cut the surface of the solid in elements of area dS1, dS2 at A, B, where the inward normals make angles θ1, θ2 with the line BAP.

Let AP = r1, BP = r2. The mass of an element of volume of the cone at a distance r from the point P is ρ r² dω dr.

Hence, the mass of the cone between A and B produces at P a potential equal to

\[
\int_{r_1}^{r_2} \frac{\rho\,r^2\,d\omega\,dr}{r}
= \int_{r_1}^{r_2} \rho\,r\,d\omega\,dr
= \tfrac{1}{2}\,\rho\left(r_2^2 - r_1^2\right)d\omega
= \tfrac{1}{2}\,\rho\left[dS_2\cos\theta_2 + dS_1\cos\theta_1\right],
\]

since $r_2^2\,d\omega = dS_2\cos\theta_2$ and $r_1^2\,d\omega = -dS_1\cos\theta_1$, the inward normal at A being directed away from P.

If we take the sum for all cones which intersect the solid, we shall get

\[
\tfrac{1}{2}\,\rho\int \cos\theta\,dS
\]

as the potential at P, the integration being taken over the surface of the given solid.

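[Figure: a cone of solid angle dω with vertex P cutting the surface of the solid at A and B, where the inward normals make angles θ1 and θ2 with the line BAP.]

The Books Recommended for Chapters X and XI

1. A.S. Ramsey, Newtonian Gravitation, ELBS and Cambridge University Press.

Chapter-11

Attraction and Potential-II

11.1 ATTRACTION AND POTENTIAL AT INTERNAL POINTS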

In the previous chapter we confined our attention to the attraction and potential at points external to the attracting matter.

We have now to consider the case of attraction and potential at points inside the attracting matter.

Our definitions of attraction and potential at an external point imply the existence of a separate attracted particle at the point under consideration. Such a particle cannot exist inside a continuous body because two particles cannot occupy the same space simultaneously.

We therefore imagine that there is a small cavity in the body surrounding the attracted particle placed at the chosen point. We assume that we can calculate the attraction and potential at this point by our former rules, since the attracted particle is not in contact with the matter. Then we define the attraction and potential at the same point in the continuous body to be the limits to which the attraction and potential at the point in the cavity tend as the cavity decreases in size and ultimately vanishes.

11.2 ATTRACTION AND POTENTIAL OF A UNIFORM THIN SPHERICAL SHELL

Let m be the mass per unit area of a thin spherical shell of radius a and centre O.

(A) Attraction at an internal point : Let P be any internal point. With P as vertex, construct a cone of small solid angle dω intersecting the surface in elements QQ′, RR′ of areas dS, dS′. The attractions at P of these elements are $m\,dS/(QP)^2$ and $m\,dS'/(RP)^2$ in opposite directions. But

\[
dS = (QP)^2\sec(\angle OQP)\,d\omega,
\]

and

\[
dS' = (RP)^2\sec(\angle ORP)\,d\omega,
\]

and the angles OQP, ORP are equal. So, the elements QQ′, RR′ of the spherical shell exert equal and opposite attractions at the point P. Since the whole shell can be divided into similar pairs of elements by taking cones in all directions round the point P, it follows that the resultant attraction of the spherical shell at the internal point P is zero.

[Figure: the spherical shell with centre O, the internal point P, and the opposite elements QQ′ (area dS) and RR′ (area dS′) cut off by a cone with vertex P.]

(B) Attraction at an external point :

Let P′ be any external point of the spherical shell and P its inverse, so that

OP⋅OP′ = a2

[Figure: the spherical shell with centre O, the external point P′, its inverse point P, and the elements QQ′, RR′.]

The resultant attraction at P′ is, by symmetry, along P′O, and the element QQ′ exerts an attraction $m\,dS/(QP')^2$ along P′Q. Resolving this attraction in the direction P′O, we get

\[
\frac{m\,(QP)^2\sec(\angle OQP)\,d\omega}{(QP')^2}\,\cos(\angle OP'Q).
\]

But the triangles OPQ and OQP′ are similar, so

\[
\frac{QP}{QP'} = \frac{OQ}{OP'},
\]

and the angles OQP, OP′Q are equal. Therefore, the element QQ′ contributes an amount $m a^2\,d\omega/(OP')^2$ to the resultant attraction. By taking cones in all directions round P, we get for the attraction at P′ of the whole shell

\[
\frac{4\pi m a^2}{(OP')^2} = \frac{M}{(OP')^2},
\]

where M denotes the mass of the spherical shell. It follows that the attraction of the shell at external points is the same as if its mass were collected into a particle at its centre.

(C) Potential at an internal point: Since the attraction is zero throughout the interior of the shell, there can be no variation in the potential, or the potential is constant. The potential at every point in the interior is, therefore, the same as the potential at the centre, i.e., M/a, where M denotes the whole mass, since every element of M is at the same distance a from the centre O.

(D) Potential at an external point : Let OP′ = r. Since the force at distance r is M/r² in the direction in which r decreases, therefore,

\[
\frac{dV}{dr} = -\frac{M}{r^2}.
\]

Integrating, we obtain

\[
V = \frac{M}{r} + c,
\]

where c is a constant of integration. But the potential vanishes at an infinite distance, therefore,

\[
c = 0.
\]

Hence,

\[
V = \frac{M}{r}.
\]
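The two values of the potential of the shell (M/a inside, M/r outside) can be checked by dividing the shell into rings about the line joining O to the field point. The function name `shell_potential` and the sample values are illustrative; m is the mass per unit area and M = 4πa²m.

```python
import math

def shell_potential(m, a, r, n=20000):
    """Potential of a thin spherical shell at a point distant r from the centre.

    The shell is divided into rings at polar angle theta, each of mass
    m * 2*pi*a*sin(theta) * a*dtheta; the sum uses the midpoint rule.
    """
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        ring_mass = m * 2 * math.pi * a * math.sin(theta) * a * dtheta
        dist = math.sqrt(a * a + r * r - 2 * a * r * math.cos(theta))
        total += ring_mass / dist
    return total

m, a = 0.25, 1.0
M = 4 * math.pi * a * a * m
print(shell_potential(m, a, 0.4), M / a)     # internal point: constant value M/a
print(shell_potential(m, a, 2.5), M / 2.5)   # external point: M/r
```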

11.3 ATTRACTION OF A THIN UNIFORM SPHERICAL SHELL AT A POINT OF ITSELF

The attraction at a point of itself of a thin layer of matter depends on the shape of the gap in the surface in which the attracted particle is placed. We may define the principal value of the attraction as the limiting value of the attraction at the centre of a circular hole when its radius tends to zero.

Let m be the mass per unit area of the spherical shell, and omitting a small circular element of the shell surrounding a point P, consider the attraction of the rest of the shell at the point P.

Let an element QQ′ of area dS subtend a solid angle dω at P. The attraction of the element dS at P is

\[
\frac{m\,dS}{(PQ)^2}
\]

along PQ. Resolving this attraction along the direction PO, the direction of the resultant attraction, we get

\[
\frac{m\,dS\cos(\angle OPQ)}{(PQ)^2} = \frac{m\,dS\cos(\angle OQP)}{(PQ)^2} = m\,d\omega.
\]

[Figure: the spherical shell with centre O, the point P on the shell, and the element QQ′.]

If now we allow the gap in the spherical shell round P to shrink to vanishing point, we see that we have to take the sum $\int m\,d\omega$ for all cones on one side of the tangent plane at P. So, the resultant attraction is 2πm.

11.4 ATTRACTION AND POTENTIAL OF A UNIFORM SOLID SPHERE

Let a be the radius and ρ the density of the sphere. Such a sphere may be regarded as composed of a series of concentric thin spherical shells, and the required results may be obtained by summation.

(A) Attraction at an internal point : Take a point P at a distance r, 0 < r < a, from the centre. Imagine a thin spherical shell of matter, bounded by the radii r + ∈ and r − ∈, to be removed, and consider the attraction at the point P in this cavity.

The concentric shells external to the cavity exert no attraction at the point P, and those internal to the cavity attract as though their masses were concentrated at the centre O.

Hence, the attraction at P is the limit as ∈ → 0 of

\[
\frac{\tfrac{4}{3}\pi\rho\,(r - \epsilon)^3}{r^2},
\]

which is equal to $\tfrac{4}{3}\pi\rho r$. This shows that the attraction of a uniform solid sphere at an internal point is directly proportional to the distance from the centre.

[Figure: the solid sphere of radius a with centre O and an internal point at distance r.]
(b) Attraction at an external point. Since each of the concentric spherical shells attracts at an external point as though its mass were collected at its centre, the same is true of the solid sphere. The attraction of the solid sphere is, therefore, represented by M/r2, where M is its mass and r the distance of an external point from the centre of the solid sphere.

(c) Potential of a uniform solid sphere at an internal point

Adopting the method of finding the attraction of a solid sphere at an internal point, let R denote the radius of a shell external to the cavity. Its mass is 4πρR² dR, so that the potential it produces at a point inside itself is 4πρR dR. Consequently, the potential at P due to all such external shells is given by

\[
\int_{r+\epsilon}^{a} 4\pi\rho R\,dR = 2\pi\rho\left\{a^2 - (r + \epsilon)^2\right\}.
\]

Also, the shells of radius less than that of the cavity produce the same potential as if their mass were collected at O, i.e.,

\[
\frac{\tfrac{4}{3}\pi\rho\,(r - \epsilon)^3}{r}.
\]

Hence, the whole potential at P is the limit as ∈ → 0 of

\[
\tfrac{4}{3}\pi\rho\left[\frac{(r - \epsilon)^3}{r} + \tfrac{3}{2}\left\{a^2 - (r + \epsilon)^2\right\}\right]
= \tfrac{2}{3}\pi\rho\left(3a^2 - r^2\right).
\]

(d) At an external point. Since each of the concentric shells produces at an external point a potential equal to its mass divided by the distance of the point from the centre, the same is true for the solid sphere. That is, the potential V at an external point P, OP = r, due to a uniform solid sphere with centre O is

V = M/r,

where M is the total mass of the uniform solid sphere.
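The four results for the uniform solid sphere fit together as a piecewise description. The sketch below codes the closed forms and checks that the potential is continuous at r = a and that −dV/dr reproduces the attraction; the function names and sample values are illustrative assumptions.

```python
import math

def sphere_potential(rho, a, r):
    """Potential of a uniform solid sphere of density rho and radius a."""
    if r <= a:
        return (2.0 / 3.0) * math.pi * rho * (3 * a * a - r * r)
    mass = (4.0 / 3.0) * math.pi * rho * a**3
    return mass / r

def sphere_attraction(rho, a, r):
    """Attraction towards the centre at distance r."""
    if r <= a:
        return (4.0 / 3.0) * math.pi * rho * r
    mass = (4.0 / 3.0) * math.pi * rho * a**3
    return mass / r**2

rho, a, h = 1.2, 2.0, 1e-5
for r in (0.5, 1.5, 3.0):
    # central difference for -dV/dr, to be compared with the attraction
    slope = (sphere_potential(rho, a, r - h) - sphere_potential(rho, a, r + h)) / (2 * h)
    print(r, slope, sphere_attraction(rho, a, r))
```

A comparison of this kind is one way of approaching the exercise that follows, in which the potential is deduced from the attraction.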

Exercise : Deduce the expression for the potential of a uniform solid sphere from the attraction.

11.5 WORK DONE BY SELF-ATTRACTING SYSTEMS

Let the component particles be m1, m2,…, and let A1, A2,… be their positions in the given system. Let

rst = distance between ms and mt.


First bring m1 from infinity to its assigned position A1. The work done is zero, for there are no particles of the system near enough to exert attraction on it.

Next, bring the particle m2 from infinity to its position A2. The work done on it

\[
= m_2 \times (\text{potential of } m_1 \text{ at } A_2) = \gamma\,\frac{m_1 m_2}{r_{12}}.
\]

Next, bring the particle m3 from infinity to its position A3. The work done on it

\[
= \gamma\,\frac{m_1 m_3}{r_{13}} + \gamma\,\frac{m_2 m_3}{r_{23}},
\]

and so on for the other particles of the system.

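Hence, the total work done in collecting all the particles from rest at infinite distances from one another to their positions

\[
= \gamma\left[\frac{m_1 m_2}{r_{12}}
+ m_3\left(\frac{m_1}{r_{13}} + \frac{m_2}{r_{23}}\right)
+ m_4\left(\frac{m_1}{r_{14}} + \frac{m_2}{r_{24}} + \frac{m_3}{r_{34}}\right) + \cdots\right]
= \gamma\sum \frac{m_1 m_2}{r_{12}}, \tag{1}
\]

where the summation extends over every pair of particles.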

When the particles of the system have all been brought to their positions, let V1, V2, … be the potentials of the system at the respective points A1, A2, …, An, … Then

\[
\begin{aligned}
V_1 &= \gamma\left(\frac{m_2}{r_{12}} + \frac{m_3}{r_{13}} + \frac{m_4}{r_{14}} + \cdots\right),\\
V_2 &= \gamma\left(\frac{m_1}{r_{21}} + \frac{m_3}{r_{23}} + \frac{m_4}{r_{24}} + \cdots\right),\ \ldots
\end{aligned} \tag{2}
\]

In view of (2), the expression (1) becomes

\[
\tfrac{1}{2}\sum m_1 V_1, \tag{3}
\]

as in the expression (3) any such term $\gamma\,\dfrac{m_s m_t}{r_{st}}$ is repeated twice.

Hence, the work done in bringing the particles of the system from infinity to their respective positions

\[
= \tfrac{1}{2}\sum m_1 V_1. \tag{4}
\]

For continuous masses, we write (4) as

\[
\tfrac{1}{2}\int V\,dm, \tag{5}
\]

where V is the potential of the body A at any element dm of itself, and the integration is taken throughout the configuration A.
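The equality of the sum over pairs and the half-sum of mass times potential can be illustrated for a small system. The sketch below takes γ = 1; the function names and the sample masses and positions are assumptions for illustration.

```python
import math
from itertools import combinations

def distance(p, q):
    """Euclidean distance between two points given as 3-tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def work_by_pairs(masses, points):
    """Sum of m_s m_t / r_st over every pair of particles (gamma = 1)."""
    return sum(masses[s] * masses[t] / distance(points[s], points[t])
               for s, t in combinations(range(len(masses)), 2))

def work_by_potentials(masses, points):
    """Half the sum of m_s V_s, where V_s is the potential at A_s of the other particles."""
    total = 0.0
    for s in range(len(masses)):
        v_s = sum(masses[t] / distance(points[s], points[t])
                  for t in range(len(masses)) if t != s)
        total += masses[s] * v_s
    return 0.5 * total

masses = [1.0, 2.0, 0.5, 3.0]
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 2.0)]
print(work_by_pairs(masses, points), work_by_potentials(masses, points))
```

The factor one half appears because the second function counts each pair twice, once from each of its members.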

11.6 LAPLACE EQUATION FOR THE POTENTIAL

Let V be the potential of a system of attracting particles at a point, say P(x, y, z), which is not in contact with the particles. Let m be the mass of a particle at A1(a, b, c) of the given system. Let

\[
r^2 = (x - a)^2 + (y - b)^2 + (z - c)^2. \tag{1}
\]

We know that V is given by the formula

\[
V = \sum \frac{m}{r}. \tag{2}
\]

Equation (1) gives

��

��

� −=∂∂

rax

xr

��

��

� −=∂∂

rby

yr

��

��

� −=∂∂

rcz

zr

. …(3)

Differentiating (2) partially w.r.t. x, we obtain

��

� −Σ−=��

��

�

∂∂

��

��

�Σ−=∂∂

32 r)ax(m

xr

rm

xV

. …(4)

Differentiating (4) again partially w.r.t. x, we have (left as an exercise)

$$\frac{\partial^2 V}{\partial x^2} = -\sum \frac{m}{r^3} + 3 \sum \frac{m(x-a)^2}{r^5}.$$ …(5)

Similarly, we shall have (exercise)

$$\frac{\partial^2 V}{\partial y^2} = -\sum \frac{m}{r^3} + 3 \sum \frac{m(y-b)^2}{r^5},$$ …(6)

$$\frac{\partial^2 V}{\partial z^2} = -\sum \frac{m}{r^3} + 3 \sum \frac{m(z-c)^2}{r^5}.$$ …(7)

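Adding (5), (6) and (7), and using (1) to replace $(x-a)^2 + (y-b)^2 + (z-c)^2$ by $r^2$, we find (exercise)

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0.$$ …(8)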

This shows that the potential V of a system of attracting particles satisfies the Laplace equation. That is, V is a harmonic function.
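A numerical spot check of this result is sketched below. It assumes a hypothetical set of three point masses and estimates the Laplacian of V = Σ m/r by central differences at a field point that does not coincide with any particle. The masses, positions, step size h and function names are illustrative choices, not part of the text.

```python
import math

# Hypothetical point masses (mass, (a, b, c)); values are illustrative only.
masses = [
    (1.0, (0.0, 0.0, 0.0)),
    (2.5, (1.0, -0.5, 0.3)),
    (0.7, (-0.8, 0.4, 1.1)),
]

def potential(x, y, z):
    """V = sum of m / r over the particles, as in formula (2) of this section."""
    return sum(m / math.dist((x, y, z), pos) for m, pos in masses)

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference estimate of f_xx + f_yy + f_zz."""
    centre = f(x, y, z)
    return (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
        - 6.0 * centre
    ) / h**2

# A field point P that is not in contact with any particle.
P = (2.0, 1.5, -1.0)
lap = laplacian(potential, *P)
print(lap)  # close to zero, up to discretisation and rounding error
```

The estimate is not exactly zero because of the finite step h, but it is many orders of magnitude smaller than the individual second derivatives.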

Continuous Body

Let V be the potential of a continuous body or bodies at a point P(x, y, z) outside the body or bodies. Let ρ be the density of the element of volume dv at (x′, y′, z′), and

$$r^2 = (x-x')^2 + (y-y')^2 + (z-z')^2.$$ …(1)

We know that the potential V is given by

$$V = \int \frac{\rho \, dv}{r}.$$ …(2)

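Differentiating (2) twice under the integral sign, we get (exercise)

$$\frac{\partial^2 V}{\partial x^2} = -\int \left[ \frac{\rho}{r^3} - \frac{3\rho (x-x')^2}{r^5} \right] dv,$$ …(3)

$$\frac{\partial^2 V}{\partial y^2} = -\int \left[ \frac{\rho}{r^3} - \frac{3\rho (y-y')^2}{r^5} \right] dv,$$ …(4)

$$\frac{\partial^2 V}{\partial z^2} = -\int \left[ \frac{\rho}{r^3} - \frac{3\rho (z-z')^2}{r^5} \right] dv.$$ …(5)

[Figure: the field point P(x, y, z) at distance r from the volume element at (x′, y′, z′).]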


Adding (3), (4) and (5), we obtain (exercise)

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0.$$ …(6)

The Laplace equation (6) is satisfied by the potential of an attracting system at every point P(x, y, z) at which there is no matter.

11.7 POISSON’S EQUATION FOR THE POTENTIAL

Now let the point P(x, y, z) be inside the given attracting matter. Describe a sphere of small radius ε and centre (a, b, c) containing the point P, taking ε so small that we may regard the density ρ of the matter in this sphere as uniform.

The matter which produces the potential V at P may now be divided into two parts: the matter outside this small sphere and the matter inside this small sphere. Let V0 and Vi denote their respective contributions to the whole potential V at P.

Since the point P is not in contact with the matter which produces the potential V0,

$$\nabla^2 V_0 = 0.$$ …(1)

Further, Vi being the potential at a point P(x, y, z) inside a small sphere of radius ε, we have

$$V_i = \tfrac{2}{3} \pi \rho \, (3\epsilon^2 - r^2),$$ …(2)

where r is the distance of the point P(x, y, z) from the centre (a, b, c), i.e.,

$$r^2 = (x-a)^2 + (y-b)^2 + (z-c)^2.$$ …(3)

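From equations (2) and (3), we find (exercise)

$$\frac{\partial^2 V_i}{\partial x^2} + \frac{\partial^2 V_i}{\partial y^2} + \frac{\partial^2 V_i}{\partial z^2} = -4\pi\rho.$$ …(4)

Since V = V0 + Vi, equations (1) and (4) together give $\nabla^2 V = -4\pi\rho$. Equation (4) is known as Poisson’s equation.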
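The step left as an exercise can be written out as follows. This is a sketch using only the expression for the inner potential, $V_i = \tfrac{2}{3}\pi\rho(3\epsilon^2 - r^2)$, and the definition of $r^2$ as the squared distance from the centre (a, b, c); no symbols beyond those of the text are introduced.

```latex
% Since r^2 = (x-a)^2 + (y-b)^2 + (z-c)^2, we have d(r^2)/dx = 2(x-a), etc.
\begin{aligned}
V_i &= \tfrac{2}{3}\pi\rho\,\bigl(3\epsilon^2 - r^2\bigr),\\
\frac{\partial V_i}{\partial x} &= -\tfrac{4}{3}\pi\rho\,(x-a), \qquad
\frac{\partial^2 V_i}{\partial x^2} = -\tfrac{4}{3}\pi\rho,\\
\nabla^2 V_i &= 3\left(-\tfrac{4}{3}\pi\rho\right) = -4\pi\rho .
\end{aligned}
```

The y and z derivatives give the same constant, which is why the three second derivatives simply add to three times the first.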

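[Figure: the small sphere of radius ε about P; the matter inside it contributes Vi and the matter outside it contributes V0.]

Remark : The relation

Attraction = grad V

which we have seen to be true at points outside a system of attracting particles, is also true at all points outside or inside a continuous distribution of matter.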

Summary of our results up to this point

(i) The attraction is the gradient of a potential function V both outside and inside the attracting matter.

(ii) In empty space, the potential V satisfies Laplace equation

$\nabla^2 V = 0$.

(iii) At any point at which there is matter of volume density ρ, the potential V satisfies Poisson equation

$\nabla^2 V = -4\pi\rho$.

11.8 GAUSS’S THEOREM (SURFACE INTEGRAL OF NORMAL ATTRACTION OVER ANY CLOSED SURFACE)

Statement : If N be the normal attraction at any point of the element dS of any closed surface, measured positively along the normal outwards, due to any attracting mass, then

$$\int N \, dS = -4\pi\gamma M,$$

where M is the amount of the attracting mass within the surface and the integral is taken over the whole surface.

Proof : Let O be the position of any element m of the attracting mass within the closed surface. Through O, draw a cone of very small vertical angle and let it cut the given surface in the elements PQ and P′Q′, whose areas are dS and dS′.

The normal attractions of the mass m over these elements are

$$\gamma m \, \frac{dS}{OP^2} \cos \angle OPN, \text{ along } PN,$$ …(1)

$$\gamma m \, \frac{dS'}{OP'^2} \cos \angle OP'N', \text{ along } P'N',$$ …(2)

where PN and P′N′ are the outward drawn normals at P and P′, respectively, as shown in the figure.


Through Q and Q′ draw normal sections QM and Q′M′ of this cone. Let dω be the solid angle of the cone. Let θ be the angle between the elements QM and QP. Then

$$\theta + \angle OPN = \pi.$$ …(3)

Now

$$d\omega = \frac{\text{area } QM}{OQ^2} = \frac{dS \cos\theta}{OQ^2} = -\frac{dS \cos(\angle OPN)}{OP^2},$$ …(4)

in the limit when PQ is very small. Similarly,

$$d\omega = -\frac{dS' \cos(\angle OP'N')}{OP'^2}.$$ …(5)

Hence, by (1), (2), (4) and (5), the total normal attractions for the elements dS and dS′ at points P and P′ are each equal to −γm dω. Hence, the total normal attraction for the whole surface is

$$-\gamma m \int d\omega = -\gamma m \, (4\pi),$$

i.e., for a single particle m at O, we have

$$\int N \, dS = -4\pi\gamma m.$$ …(6)

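Similarly for all other particles of the attracting mass inside the surface. Hence, finally, for the whole mass M, we have

[Figure: a cone of small solid angle with vertex O cutting the closed surface in the elements PQ (area dS) and P′Q′ (area dS′); PN and P′N′ are the outward normals, and QM, Q′M′ are normal sections of the cone.]

$$\int N \, dS = -4\pi\gamma M.$$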

This completes the proof of the theorem.
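The theorem can be illustrated numerically for a single particle. The sketch below assumes a unit sphere centred at the origin as the closed surface, a hypothetical particle of mass 2 placed off-centre inside it, and γ = 1; it approximates the surface integral of N by the midpoint rule in the polar angles. The function name, grid sizes and numerical values are illustrative assumptions.

```python
import math

GAMMA = 1.0  # constant of gravitation in arbitrary units (assumed)

def normal_attraction_flux(mass, source, radius=1.0, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the integral of N dS over a sphere.

    The sphere is centred at the origin; `source` is the position of a
    particle of the given mass. N is the component of the attraction
    along the outward normal.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal and the surface point it belongs to.
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            p = tuple(radius * c for c in n)
            # Vector from the surface point towards the attracting particle.
            to_mass = tuple(s - q for s, q in zip(source, p))
            dist = math.sqrt(sum(c * c for c in to_mass))
            # The attraction has magnitude gamma*m/dist^2 and points along to_mass.
            normal_component = GAMMA * mass * sum(
                t * c for t, c in zip(to_mass, n)) / dist**3
            total += normal_component * radius**2 * math.sin(theta) * d_theta * d_phi
    return total

m = 2.0
flux = normal_attraction_flux(m, source=(0.2, -0.1, 0.3))
expected = -4.0 * math.pi * GAMMA * m
print(flux, expected)  # the two values should agree closely
```

The result does not depend on where inside the sphere the particle sits, which is the content of the solid-angle argument in the proof.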

11.9 EQUIPOTENTIAL SURFACES

For any attracting mass M, the potential V at any point P(x, y, z) will be a function of the coordinates x, y, z. Consider the equation

V(x, y, z) = C, …(1)

where C is a constant. The equation (1) represents a surface such that the potential V(x, y, z) at any point of it for the given attracting mass is constant and equal to C. It is hence called an equipotential surface. By giving different values to C, we get a family of equipotential surfaces.

Remark 1 : Only one equipotential surface passes through any point of space, so that no two equipotential surfaces can intersect.

Remark 2 : No work would be done by the attractions on a particle moving on an equipotential surface. Therefore, at every point the resultant attraction is normal to the equipotential surface through the point.

Remark 3 : In the case of a rod AB, the equipotential surfaces are ellipsoids of revolution obtained by rotating confocal ellipses, whose foci are A and B, about AB as axis.

Remark 4 : In the case of the spherical shells and sphere, the equipotential surfaces are concentric spheres.

Distribution for a given potential

When the potential is given at all points of space, we can determine the corresponding distribution. For, the potential V being known, we can find $\nabla^2 V$ for every point of space. The Poisson equation gives

$$\rho = -\frac{1}{4\pi} \nabla^2 V.$$ …(1)

Equation (1) serves to determine the volume density. When

$$\nabla^2 V = 0,$$ …(2)

the corresponding density of the distribution is zero. That is, there is no attracting mass at all such points. Whenever

$$\nabla^2 V \neq 0,$$ …(3)

the corresponding density of the distribution is given by (1).

If the form of the potential function V inside any surface S is different from its form outside, and if there be an abrupt change in the value of the normal derivative $\dfrac{dV}{dn}$ as we pass across this surface, then the surface density σ on S is calculated as below. Let V1 be the potential just inside S and V2 the potential just outside S. Let dn be an element of the outward drawn normal. The surface density σ of S is determined by the relation

$$\sigma = -\frac{1}{4\pi} \left( \frac{\partial V_2}{\partial n} - \frac{\partial V_1}{\partial n} \right),$$

where the direction of the normal ∂n is from 1 to 2.
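As a quick check of this relation, consider a thin spherical shell of radius a and total mass M, in units where γ = 1. The shell potentials used here (the constant M/a inside and M/r outside) are assumed as known results rather than derived at this point, and the normal is taken as the outward radial direction.

```latex
\begin{aligned}
V_1 &= \frac{M}{a}\quad (r<a), \qquad V_2 = \frac{M}{r}\quad (r>a),\\
\sigma &= -\frac{1}{4\pi}\left[\frac{\partial V_2}{\partial r}-\frac{\partial V_1}{\partial r}\right]_{r=a}
= -\frac{1}{4\pi}\left(-\frac{M}{a^2}-0\right)=\frac{M}{4\pi a^2}.
\end{aligned}
```

This is the mass M spread uniformly over the area 4πa² of the shell, as one would expect.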

Question : The potential outside a certain cylindrical boundary is zero and inside it is

$$V = x^3 - 3xy^2 - ax^2 + 3ay^2.$$

Find the distribution of matter.

Solution : First we find the boundary. Since the potential V is continuous across the boundary and zero outside, the boundary must be given by the equation

$$x^3 - 3xy^2 - ax^2 + 3ay^2 = 0.$$ …(1)

Equation (1) can be written as

$$(x-a)(x - \sqrt{3}\,y)(x + \sqrt{3}\,y) = 0.$$ …(2)

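It shows that the section of the given cylindrical boundary is an equilateral triangle OAB of height a.

[Figure: the triangular section OAB with the vertex O at the origin, the side AB on the line x = a, M the midpoint of AB (OM = a), and P a point on AB.]

We find

$$\frac{\partial^2 V}{\partial x^2} = 6x - 2a,$$ …(3)

$$\frac{\partial^2 V}{\partial y^2} = -6x + 6a,$$ …(4)

$$\frac{\partial^2 V}{\partial z^2} = 0.$$ …(5)

Hence

$$\nabla^2 V = \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 4a.$$ …(6)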

We know that inside the given cylindrical region, the volume density ρ is given by the formula

$$\rho = -\frac{1}{4\pi} \nabla^2 V.$$ …(7)

Equations (6) and (7) give

$$\rho = -\frac{a}{\pi}.$$ …(8)

Further, since V is zero outside the region, ρ = 0 outside the region.

On the boundary : When the point P lies on the part AB of the boundary, the surface density is given by

$$\sigma = -\frac{1}{4\pi} \left( \frac{\partial V_2}{\partial n} - \frac{\partial V_1}{\partial n} \right),$$ …(9)

where V1 and V2 are the potentials on the two sides of the boundary. Here

$$V_1 = V, \qquad V_2 = 0,$$ …(10)

and the normal to AB is along the x-direction. Hence, from equations (9) and (10), we have

$$\sigma = \frac{1}{4\pi} \left( 3x^2 - 3y^2 - 2ax \right)_{x=a} = \frac{1}{4\pi} (a^2 - 3y^2) = \frac{3}{4\pi} (MA^2 - MP^2) = \frac{3}{4\pi} \, AP \cdot PB.$$ …(11)

For a point Q on the part AO of the boundary, we have (left as an exercise)

$$\sigma = \frac{3}{4\pi} \, OQ \cdot QA.$$ …(12)

Similarly for the boundary part OB. This shows that a solid cylindrical region (prism) of uniform density $\dfrac{a}{\pi}$ would produce the same external field as a distribution of matter of surface density $\dfrac{3}{4\pi} (AP \cdot PB)$ on each of the faces of the prism. Hence the result.
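The numbers in this solution can be verified with a short script. The sketch below assumes the illustrative value a = 1, takes the inner potential as V = x³ − 3xy² − ax² + 3ay², and uses σ = (1/4π) ∂V/∂n on each face, which is the surface-density relation with V1 = V inside and V2 = 0 outside. The sample points and function names are hypothetical choices made for the check.

```python
import math

a = 1.0  # height of the triangular section (illustrative value)

def V(x, y):
    """Potential inside the prism, as given in the question."""
    return x**3 - 3*x*y**2 - a*x**2 + 3*a*y**2

def grad_V(x, y):
    """Analytic first derivatives (dV/dx, dV/dy)."""
    return (3*x**2 - 3*y**2 - 2*a*x, -6*x*y + 6*a*y)

def laplacian_V(x, y, h=1e-2):
    """Central second differences; exact for a cubic up to rounding."""
    return (V(x + h, y) + V(x - h, y) + V(x, y + h) + V(x, y - h)
            - 4*V(x, y)) / h**2

def sigma(point, outward_normal):
    """sigma = (1/4 pi) dV/dn, using V1 = V inside and V2 = 0 outside."""
    gx, gy = grad_V(*point)
    nx, ny = outward_normal
    return (gx*nx + gy*ny) / (4*math.pi)

# Vertices of the section.
O = (0.0, 0.0)
A = (a, a / math.sqrt(3))
B = (a, -a / math.sqrt(3))

# Volume density at an interior point.
rho = -laplacian_V(0.6, 0.1) / (4*math.pi)

# Point P on AB (outward normal along +x).
P = (a, 0.2)
sigma_AB = sigma(P, (1.0, 0.0))
AP_PB = math.dist(A, P) * math.dist(P, B)

# Point Q on OA at distance s from O (outward normal (-1/2, sqrt(3)/2)).
s = 0.4
Q = (s * math.sqrt(3) / 2, s / 2)
sigma_OA = sigma(Q, (-0.5, math.sqrt(3) / 2))
OQ_QA = math.dist(O, Q) * math.dist(Q, A)

print(rho, -a / math.pi)
print(sigma_AB, 3 * AP_PB / (4*math.pi))
print(sigma_OA, 3 * OQ_QA / (4*math.pi))
```

Each printed pair should agree, which covers the volume density, the face AB, and the face OA that the text leaves as an exercise.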

11.10 SURFACE AND SOLID HARMONIC

We know that the potential V satisfies Laplace’s equation

$$\nabla^2 V = 0$$

or

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0.$$ …(1)

Definition : Any solution of Laplace’s equation which is homogeneous in x, y, z, is called a harmonic function or a spherical harmonic.

Definition : The degree of homogeneity is called the degree of the function.

Remark 1 : We are concerned with the case in which the degree of the function is an integer.

Remark 2 : If V is a harmonic function of degree n, then

$$\frac{\partial^p}{\partial x^p} \frac{\partial^q}{\partial y^q} \frac{\partial^t}{\partial z^t} V$$

is a harmonic function of degree (n−p−q−t).

Surface and Solid Harmonics

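In polar co-ordinates (r, θ, φ), Laplace’s equation (1) may be written as (left as an exercise)

$$\frac{\partial}{\partial r} \left( r^2 \frac{\partial V}{\partial r} \right) + \frac{1}{\sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta \, \frac{\partial V}{\partial \theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2 V}{\partial \phi^2} = 0.$$ …(2)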


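We take

$$V = r^n S_n,$$ …(3)

where $S_n$ is independent of r and $S_n = S_n(\theta, \phi)$. Differentiating equation (3) partially w.r.t. r, we write

$$\frac{\partial V}{\partial r} = n r^{n-1} S_n$$

or

$$r^2 \frac{\partial V}{\partial r} = n r^{n+1} S_n.$$ …(4)

Differentiating both sides of (4) partially w.r.t. r, we obtain

$$\frac{\partial}{\partial r} \left( r^2 \frac{\partial V}{\partial r} \right) = n(n+1) \, r^n S_n.$$ …(5)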

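From equations (2), (3) and (5), we obtain (exercise)

$$\frac{\partial^2 S_n}{\partial \theta^2} + \cot\theta \, \frac{\partial S_n}{\partial \theta} + \frac{1}{\sin^2\theta} \frac{\partial^2 S_n}{\partial \phi^2} + n(n+1) S_n = 0.$$ …(6)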
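The exercise leading to (6) can be sketched as follows. It assumes the spherical polar form of Laplace's equation, namely that the sum of $\partial_r(r^2 \partial_r V)$, $\frac{1}{\sin\theta}\partial_\theta(\sin\theta\,\partial_\theta V)$ and $\frac{1}{\sin^2\theta}\partial_\phi^2 V$ vanishes, together with the substitution $V = r^n S_n$ where $S_n$ does not depend on r.

```latex
% Substitute V = r^n S_n; the radial term is n(n+1) r^n S_n by (5).
\begin{aligned}
n(n+1)\,r^n S_n
+ \frac{r^n}{\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial S_n}{\partial\theta}\right)
+ \frac{r^n}{\sin^2\theta}\frac{\partial^2 S_n}{\partial\phi^2} &= 0,\\
\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial S_n}{\partial\theta}\right)
&= \frac{\partial^2 S_n}{\partial\theta^2} + \cot\theta\,\frac{\partial S_n}{\partial\theta}.
\end{aligned}
```

Dividing the first line by $r^n$ and using the second line to expand the θ-term gives the equation for $S_n$ stated in (6).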

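We put

$$\cos\theta = \mu$$ …(7)

in equation (6) and obtain (exercise)

$$\frac{\partial}{\partial \mu} \left[ (1-\mu^2) \frac{\partial S_n}{\partial \mu} \right] + \frac{1}{1-\mu^2} \frac{\partial^2 S_n}{\partial \phi^2} + n(n+1) S_n = 0.$$ …(8)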

A solution, $S_n$, of equation (6) is known as a Laplace’s function or a surface harmonic of order n. Since n(n+1) remains unchanged when we write −(n+1) for n, there are two solutions of equation (2) of which $S_n$ is a factor, namely $r^n S_n$ and $r^{-n-1} S_n$.

Definition : The functions $r^n S_n$ and $r^{-(n+1)} S_n$ are called solid harmonics of degree n and −(n+1), respectively.

Result 1 : If U is a harmonic function of degree n, then $\dfrac{U}{r^{2n+1}}$ is also a harmonic function.

For example, U = xyz is a harmonic function of degree 3; therefore, $xyz/r^7$ is also a harmonic function.
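The example can be spot-checked numerically. The following sketch estimates the Laplacian of xyz and of xyz/r⁷ by central differences at one hypothetical point away from the origin; the point, the step size h and the function names are illustrative assumptions.

```python
import math

def U(x, y, z):
    """Homogeneous harmonic polynomial of degree n = 3."""
    return x * y * z

def U_over_r7(x, y, z):
    """U / r^(2n+1) with n = 3, i.e. xyz / r^7."""
    r = math.sqrt(x*x + y*y + z*z)
    return U(x, y, z) / r**7

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference estimate of f_xx + f_yy + f_zz."""
    centre = f(x, y, z)
    return (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
        - 6.0 * centre
    ) / h**2

P = (0.9, -1.2, 0.7)  # any point away from the origin
lap_U = laplacian(U, *P)
lap_transformed = laplacian(U_over_r7, *P)
print(lap_U, lap_transformed)  # both close to zero
```

Both estimates vanish up to discretisation error, in line with Result 1 for n = 3.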

Result 2 : If U is a harmonic function of degree −(n+1), then $r^{2n+1} U$ is also a harmonic function.

Note : $x^0 = 1$ and $\theta = \tan^{-1}(y/x)$ are both harmonic functions of degree zero.

Consequently $\dfrac{1}{r}$ and $\dfrac{1}{r} \tan^{-1}(y/x)$ are harmonics of degree −1.

Differentiating these w.r.t. x, y, z, we find $\dfrac{x}{r^3}$, $\dfrac{y}{r^3}$, $\dfrac{z}{r^3}$, $\dfrac{z}{r^3} \tan^{-1}(y/x)$, etc., as harmonics of degree −2, and so on.

11.11. SURFACE DENSITY IN TERMS OF SURFACE HARMONIC

We assume that

$$V_1 = \sum_{n=0}^{\infty} \frac{r^n}{a^{n+1}} U_n, \qquad r < a$$ …(1)

and

$$V_2 = \sum_{n=0}^{\infty} \frac{a^n}{r^{n+1}} U_n, \qquad r > a$$ …(2)

give the potential of a certain distribution of matter. In equations (1) and (2), $U_n$ denotes the sum of a finite number of surface harmonics (one for each particle) and is therefore itself a surface harmonic. Hence

$$\nabla^2 V_1 = 0, \qquad \nabla^2 V_2 = 0.$$ …(3)

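The matter resides on the surface of the sphere, and its surface density σ is given by the formula

$$4\pi\sigma = \left[ \frac{\partial V_1}{\partial r} - \frac{\partial V_2}{\partial r} \right]_{r=a}.$$ …(4)

Equations (1), (2) and (4) give (exercise)

$$\sigma = \sum_{n=0}^{\infty} \frac{2n+1}{4\pi a^2} U_n.$$ …(5)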

It follows that if we accept the physical argument that an arbitrary distribution of surface density on the surface of a sphere produces the same kind of field of potential as an aggregate of particles distributed over the sphere with the potentials given by (1) and (2), then the arbitrary surface density is expressible in the form (5).

Hence the result.
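The passage from the jump condition (4) to the series (5) can be tested on a truncated example. The sketch below assumes three hypothetical zonal surface harmonics $U_n = c_n P_n(\cos\theta)$ for n = 0, 1, 2, where $P_n$ are the Legendre polynomials, with illustrative coefficients and radius; it compares σ computed from one-sided radial differences with σ computed from the series.

```python
import math

a = 1.5  # radius of the sphere (illustrative)

# Hypothetical zonal surface harmonics U_n = c_n * P_n(cos theta), n = 0, 1, 2.
coeffs = [0.8, -0.3, 0.5]

def legendre(n, mu):
    """Legendre polynomials P_0, P_1, P_2."""
    return (1.0, mu, 0.5 * (3.0 * mu * mu - 1.0))[n]

def U(n, theta):
    return coeffs[n] * legendre(n, math.cos(theta))

def V1(r, theta):
    """Inner potential (1): sum of r^n / a^(n+1) * U_n, for r < a."""
    return sum(r**n / a**(n + 1) * U(n, theta) for n in range(len(coeffs)))

def V2(r, theta):
    """Outer potential (2): sum of a^n / r^(n+1) * U_n, for r > a."""
    return sum(a**n / r**(n + 1) * U(n, theta) for n in range(len(coeffs)))

def sigma_from_jump(theta, h=1e-6):
    """Formula (4): 4 pi sigma = dV1/dr - dV2/dr at r = a (one-sided differences)."""
    dV1 = (V1(a, theta) - V1(a - h, theta)) / h
    dV2 = (V2(a + h, theta) - V2(a, theta)) / h
    return (dV1 - dV2) / (4.0 * math.pi)

def sigma_from_series(theta):
    """Formula (5): sigma = sum of (2n + 1) / (4 pi a^2) * U_n."""
    return sum((2 * n + 1) / (4.0 * math.pi * a**2) * U(n, theta)
               for n in range(len(coeffs)))

angles = [0.3, 1.1, 2.4]
for t in angles:
    print(t, sigma_from_jump(t), sigma_from_series(t))
```

The factor 2n + 1 arises as n + (n + 1): the inner series contributes n/a² and the outer series contributes (n + 1)/a² to the jump in the radial derivative.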
