
Geocontinua I
Lecture Notes

Lorenzo Colli

February 26, 2014


Information wants to be free

Stewart Brand

Figure 1.1 by Benjamin D. Esham / Wikimedia Commons / Public Domain
Figure 1.2 by User:Acdx / Wikimedia Commons / Public Domain
Figure 2.1 by Lyudmil Antonov / Wikimedia Commons / GFDL / CC-BY-SA-3.0
Figure 3.1 by User:Pbroks13 / Wikimedia Commons / GFDL / CC-BY-SA-3.0
Figure 3.2 by User:Drz / Wikimedia Commons / GFDL / CC-BY-SA-3.0
Figure 3.3 by User:Inductiveload / Wikimedia Commons / Public Domain
Figure 5.1 by kind courtesy of Zach Weiner / smbc-comics.com
Figure 6.1 by User:Sanpaz / Wikimedia Commons / GFDL / CC-BY-SA-3.0
Figure 6.3 by Berlyn Brixner / Public Domain

As for the other figures, they have been created by me. My figures, as well as the rest of this book, are under copyright and all rights are reserved by me! They are mine, mine only! My precious rights! Gollum! Gollum!
Just joking: I release everything into the public domain. In the remote case that you find my work useful or inspiring and want to use it, just be kind and cite me, ok?

The GFDL licence can be found at the following URL:
http://www.gnu.org/copyleft/fdl.html

CC-BY-SA-3.0 licence can be found at the following URL:

http://creativecommons.org/licenses/by-sa/3.0

Contents

Forewords

Where the real knowledge lies

Notation

1 Physical quantities
  1.1 Intensive and extensive quantities
  1.2 Scalars
  1.3 Vectors
    1.3.1 Dot and cross products
  1.4 Frame of reference and coordinate system
    1.4.1 Orthonormal right-handed coordinate system
  1.5 Invariance of physical laws

2 Linear algebra
  2.1 Basic definitions
  2.2 Vector spaces
    2.2.1 Some examples
    2.2.2 Linear combinations and linear dependence
    2.2.3 Bases
    2.2.4 Functions over vector spaces
  2.3 Change of basis
  2.4 Matrices
    2.4.1 Determinant
    2.4.2 Inverse matrix
    2.4.3 Eigenvalues and eigenvectors
    2.4.4 Diagonalization
  2.5 Geometry
    2.5.1 Metric
    2.5.2 Norm
  2.6 Exercises

3 Derivative
  3.1 General definition
    3.1.1 Higher order derivatives
  3.2 Various notations
  3.3 Basic properties
  3.4 Derivative of a vector valued function
  3.5 Functions of many variables
    3.5.1 Partial derivative
    3.5.2 Gradient
    3.5.3 Derivatives of higher order
  3.6 Vector valued functions of many variables
    3.6.1 Jacobian matrix
    3.6.2 Composition of functions
    3.6.3 Coordinate transformations
    3.6.4 Vector fields
  3.7 Differential operators
  3.8 Exercises

4 Riemann integral
  4.1 General definition
  4.2 Basic properties
  4.3 Fundamental theorem of calculus

5 Integration in Rn
  5.1 Multiple integral
  5.2 Coordinate transformation
    5.2.1 Curves and surfaces
    5.2.2 Gauß’ theorem
    5.2.3 Stokes’ theorem
    5.2.4 Green’s identities
  5.3 Exercises

6 Continuum Mechanics
  6.1 Lagrangian and Eulerian frames of reference
  6.2 Material time derivative
  6.3 Deformation tensors
  6.4 Strain tensors
    6.4.1 Velocity gradient tensor
  6.5 Conservation of mass
  6.6 Balance of momentum
    6.6.1 Cauchy’s stress theorem
    6.6.2 Navier-Stokes equation
  6.7 Constitutive equation and rheology
    6.7.1 Linear isotropic constitutive equation
    6.7.2 Non-Newtonian fluids
  6.8 Similarity and non-dimensionalization
    6.8.1 Buckingham’s Π-Theorem
    6.8.2 Dimensional Analysis
    6.8.3 Non-dimensionalization
    6.8.4 Geometric, kinematic and dynamic similarity
  6.9 Exercises

A Symmetry of the stress tensor

B Solutions
  B.1 Linear algebra
  B.2 Derivative
  B.3 Integration in Rn

Forewords

These are complementary lecture notes for the class "Earth Rotation and Solid Earth Physics" (aka Geocontinua I). The aim of these notes is to provide the students with a derivation of the basic equations of fluid dynamics from fundamental laws of physics, within a solid mathematical framework. The first chapters of these notes are thus intended to give the students a brief and sketchy overview of the mathematical background that is needed to understand the core issues of Earth dynamics and to solve some basic problems of fluid dynamics, while fluid mechanics is dealt with in the final chapter.

In writing these notes I used many different sources, sometimes to draw inspiration from them, sometimes to flat out copy a definition or a figure. I'm very grateful to all the authors of the aforementioned sources, and I'm particularly in debt to the lecture notes of Prof. Carlamaria Maderna for Chapters 3, 4 and 5; to the lecture notes of Prof. Guido Parravicini for Chapter 6; and to the English Wikipedia¹ for many images.

As a final remark, let me point out that this is still a work in progress and probably always will be, evolving year by year with each new generation of master students: it is not guaranteed 100% error free; indeed I exhort all readers to report to me all typos, errors and horrors they may find.

Lorenzo Colli

¹http://en.wikipedia.org/


Where the real knowledge lies

Don’t just read it; fight it!

P. R. Halmos

• Paul R. Halmos, Finite-Dimensional Vector Spaces

• Serge Lang, Analysis I

• Walter Rudin, Principles of Mathematical Analysis

• Terence Tao, Analysis

• George Keith Batchelor, An Introduction to Fluid Dynamics

• D. J. Tritton, Physical Fluid Dynamics

• L. D. Landau and E. M. Lifshitz, Fluid Mechanics

• Giorgio Ranalli, Rheology of the Earth


Notation

Sets

For our present purposes, we can use the naïve definition of set: a set S is a collection of distinct objects, considered as an object in its own right. To state expressly the elements of a set, we will list them in curly brackets: for example S = {♥, ♦, ♣, ♠} is the set of the four card suits. Sometimes the elements of a set will not be explicitly listed; instead a rule, formula or property P will be given that the elements of the set must satisfy. This will be stated in the form S = {x : P(x)}, that is "S is the set of all x such that P(x) is true". For example, S = {x : x/2 ∈ N} is the set of all even natural numbers.

The most important relation involving a set is membership, that is, when an object is an element of the set. If e is an element of a set S, then we can write e ∈ S; otherwise e ∉ S if e is not an element of S.

If every element of a set A is also an element of another set B, then A is said to be a subset of B, denoted by A ⊂ B.

We will indicate the Cartesian product of two sets X and Y with the standard multiplication sign: X × Y = {(x, y) : x ∈ X, y ∈ Y}.
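As a small sketch (not part of the original notes), all of this notation maps directly onto Python's built-in sets; the sample sets below are made up for illustration.

```python
# Set-builder notation S = {x : P(x)}: even naturals below an arbitrary bound
evens = {x for x in range(20) if x % 2 == 0}

# Membership: e ∈ S / e ∉ S
assert 4 in evens
assert 5 not in evens

# Subset: A ⊂ B (Python's <= tests the subset relation)
assert {0, 2, 4} <= evens

# Cartesian product X × Y = {(x, y) : x ∈ X, y ∈ Y}
X, Y = {1, 2}, {"a", "b"}
product = {(x, y) for x in X for y in Y}
assert len(product) == len(X) * len(Y)
```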

Functions

Throughout these notes we will use the notation f : X → Y to indicate a function with domain X and image Y. Sometimes it will be used in a sloppier way, with the actual domain and image of the function being proper subsets of X and Y.

The Kronecker delta is the function δ : Z × Z → {0, 1}, usually denoted by δij, that takes the value 1 if the two integers are equal and 0 if they are different.

A permutation σ is a bijection from a set S into itself. Basically it can be thought of as a function that rearranges the elements of a set in a different order. For example, there are two permutations of the set {1, 2}, namely [1, 2] and [2, 1], and six permutations of the set {1, 2, 3}, namely [1, 2, 3], [1, 3, 2], [2, 3, 1], [2, 1, 3], [3, 1, 2] and [3, 2, 1]. Every permutation can be decomposed into a product (composition) of transpositions, i.e. binary swappings. For each permutation more than one decomposition is possible, but the total number of transpositions entering the product will always be either even or odd, i.e. it is impossible to write a product of an even number of transpositions as the product of an odd number of transpositions. The parity of a permutation is equal to the parity of any of its possible decompositions into a product of transpositions, and is thus a well-defined property of a permutation. The sign of a permutation is denoted sgn(σ) and is equal to +1 if the permutation is even and to −1 if the permutation is odd. Denoting with m the number of transpositions in the decomposition we can write this as sgn(σ) = (−1)^m.

Symbolic logic

∀ is the universal quantification, stating that all the elements of that particular set, none excluded, obey the specified statement. For example, the expression "∀x ∈ N, x < x + 1" reads "for all natural numbers x, x is smaller than its successor", which is a true statement.

∃ is instead the existential quantification, stating that there exists at least one element of that particular set that obeys the specified property. For example ∃x ∈ R : x = x² is a true statement, since there are two real numbers, 0 and 1, that satisfy the condition.


Chapter 1

Physical quantities

Physics is the branch of science devoted to the study of matter and its motion through space and time, together with the causes and the effects thereof. As such, physics deals with physical quantities: the properties of bodies, phenomena or systems that can be quantified by measurement and that are relevant for physical research.

1.1 Intensive and extensive quantities

A first useful distinction can be made between quantities that depend linearly on the size or the amount of matter of the physical system under measurement, called extensive quantities, and quantities that do not depend on the size or the amount of matter of the physical system, called intensive quantities. Examples of extensive quantities are mass, volume and charge, while examples of intensive quantities are mass density and temperature.

The main difference between intensive and extensive quantities lies in the way we can handle them. Extensive quantities can be manipulated in a very naïve way: as an example, the total mass and the total volume of a system made up of two bodies are just the sums of the masses and of the volumes of the two bodies. On the other hand, intensive quantities must be manipulated with more care: in the same situation as the previous example, the mass density of the whole system is given by an average of the mass densities of the two bodies.

Note that the ratio of two extensive quantities that scale in the same way is scale-invariant, and hence an intensive quantity. In physics it is quite common to divide an extensive quantity that scales with the volume (like mass, charge or energy) by the volume itself, thus obtaining a density of the extensive quantity.

Figure 1.1: Graphic representation of vectors as arrows and of the parallelogram rule.
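As a toy numerical illustration of this difference (a Python sketch; all numbers are made up): extensive quantities add plainly, while the density of the combined system is a volume-weighted average of the two densities.

```python
# Two bodies with hypothetical masses (kg) and volumes (m^3):
m1, V1 = 10.0, 2.0
m2, V2 = 30.0, 3.0

rho1, rho2 = m1 / V1, m2 / V2   # densities of the two bodies

# Extensive quantities: totals are plain sums.
m_tot, V_tot = m1 + m2, V1 + V2

# Intensive quantity: the density of the whole system is a
# volume-weighted average of the two densities, NOT their sum.
rho_tot = m_tot / V_tot
assert rho_tot == (V1 * rho1 + V2 * rho2) / (V1 + V2)
assert rho_tot != rho1 + rho2
```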

1.2 Scalars

A scalar is the simplest type of physical quantity, as it needs only a single real number¹ to be fully characterized. This numerical value depends on the unit of measurement, but not on the coordinate system². Some examples of scalars include the mass, charge, pressure, temperature and speed of a body, and the distance between two points in space.

1.3 Vectors

Other physical quantities, such as displacement, velocity, acceleration, force and heat flux, are more complicated than simple scalars: in addition to their magnitude, they also need a direction and a sense to be fully described. These kinds of physical quantities are called (Euclidean) vectors.

Vectors can thus be represented graphically by an arrow pointing in the same direction and sense as the vector and whose length is proportional to the vector's magnitude. Vectors are usually denoted in lowercase italic boldface, like a, or with a small arrow above the letter, like ~a, while the magnitude of a vector is denoted by |a| or ‖a‖.

¹Actually, rational numbers are more than enough to measure scalar quantities, but for a proper mathematical treatment of physical theories we need to regard scalars as real numbers.

²We will say more about this in Section 1.5. Let's point out here that the numerical value of a scalar quantity may, in some cases, depend on the frame of reference, and this fact is what drove Einstein to develop his theory of general relativity.


Vectors can be summed together, using the parallelogram rule (see Figure 1.1). As a consequence, a + a is a vector that has the same direction and sense as a but double magnitude. It thus makes sense to define a scalar multiplication of a vector by a real number, ra, as the multiplication of the magnitude of the vector a by the real number r. Obviously 1a = a, while −1a = −a is the vector that has the same magnitude and direction as a but opposite sense³, and −a + a = 0a = 0 is a vector with zero magnitude. If we multiply a vector by the inverse of its magnitude we obtain a vector of unit length, often called normalized vector or versor, and denoted by a "hat":

(1/‖a‖) a = â.

Notice that, as a consequence of the geometrical properties of the parallelogram rule and of the algebraic properties of real numbers:

• vector summation is associative:

(a+ b) + c = a+ (b+ c);

• scalar multiplication is distributive with respect to both scalar andvector addition:

r(a+ b) = ra+ rb,

(r + s)a = ra+ sa;

• the product between scalars copes well with scalar multiplication:

(rs)a = r(sa).

Given a set of n vectors a1, . . . , an and a set of n scalars r1, . . . , rn, the expression r1a1 + . . . + rnan = b is well defined, and both the expression itself and the resulting vector b are called a linear combination of the vectors a1, . . . , an.

1.3.1 Dot and cross products

Two particularly useful operations on vectors are the dot and the cross products. The dot product between two vectors is defined as:

a · b = ‖a‖ ‖b‖ cos θ = b · a,

³Depending on the convention adopted, it could be the vector with the same direction and sense, but opposite magnitude.


Figure 1.2: Graphic representation of the dot and the cross products.

where θ is the angle from a to b (see Figure 1.2). This is equivalent to projecting one vector onto the direction of the other. As it yields a scalar, the dot product is also called scalar product.

The cross product is instead defined as:

a × b = ‖a‖ ‖b‖ sin θ n = −b × a,

where θ is, as before, the angle from a to b and n is a versor perpendicular to both a and b which completes a right-handed system: with the thumb, index, and middle fingers at right angles to each other, the middle finger points in the direction of n when the thumb represents a and the index finger represents b. Since the result of a cross product is a vector⁴, the cross product is also called vector product.

The magnitude of the cross product is equal to the area of the parallelogram delimited by the two vectors. Combining a cross and a dot product in what is called the scalar triple product we can calculate the volume of the parallelepiped whose edges are determined by three vectors a, b and c. In fact, as can be seen in Figure 1.3, (a × b) · c = ‖a‖ ‖b‖ sin θ ‖c‖ cos φ, which is the well-known geometrical rule to compute the volume of a parallelepiped. Clearly we can take whichever face we want as the base of the parallelepiped, and its volume is not going to change: (a × b) · c = a · (b × c) = (c × a) · b.

⁴Strictly speaking this is not true, as one of the three "vectors" involved in the cross product is actually a pseudovector; but vectors and pseudovectors are similar enough to be all regarded as vectors, at least for the purpose of our present discussion.
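These identities are easy to sanity-check numerically; a pure-Python sketch with arbitrarily chosen vectors:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a, b, c = (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 2.0)

# Antisymmetry: a × b = −(b × a)
assert cross(a, b) == tuple(-x for x in cross(b, a))

# |a × b| is the parallelogram area: here ‖a‖ ‖b‖ sin 45° = 1
area = math.sqrt(dot(cross(a, b), cross(a, b)))
assert math.isclose(area, 1.0)

# Scalar triple product: the same volume whichever face is taken as base
v1 = dot(cross(a, b), c)
v2 = dot(a, cross(b, c))
v3 = dot(cross(c, a), b)
assert math.isclose(v1, v2) and math.isclose(v1, v3)
```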


Figure 1.3: Geometric interpretation of the scalar triple product.

1.4 Frame of reference and coordinate system

We have seen that to specify a vector we need three quantities: magnitude, direction and sense. While for the magnitude we only need a unit of measurement, to specify direction and sense we need not only a unit of measurement (e.g. 30 degrees or π/6 radians), but also some reference point or preferential direction (e.g. 30 degrees north). In fact there is no other way but to specify a direction with respect to something else, called a frame of reference. Since space itself is pretty homogeneous and isotropic it is impossible to discriminate some particular directions, and thus to determine an absolute frame of reference. We are thus forced to choose something less universal and more mundane: an object, or a system of objects, that can be easily used as a reference and that is characterized by an acceptable degree of indeformability and stability. Moreover, among the many possible frames of reference we choose only those that allow us to describe objects that are not subject to forces as moving with a constant velocity. These are called inertial frames of reference.

Once we have chosen a frame of reference, we can identify three principal directions (which must not all lie on the same plane), and we can start to measure all the vectors with respect to these directions. Moreover, we can choose a set of three vectors, directed along the principal directions of the frame of reference, as reference vectors and express every other vector as a linear combination of these three vectors, which are thus called basis vectors. Every vector can now be identified just by the three coefficients of its linear combination. By doing all this we have actually chosen a coordinate system for our frame of reference: we no longer need to indicate vectors with their physical quantities, that is magnitude, direction and sense, but with three numbers, the coefficients of the linear combination of the basis vectors, called coordinates. We will see that this is very, very handy.

Suppose that we have three vectors, say a, b and c, and that we want to calculate (a + b) · c or (a + b) × c. If we do not state a coordinate system, we have to use the parallelogram rule to add the two vectors and then use some more trigonometry to find the angle between (a + b) and c. All this is quite difficult and very cumbersome. Suppose instead that we have chosen a certain basis {v1, v2, v3}, and thus that we can write the three vectors as

a = a1v1 + a2v2 + a3v3,

b = b1v1 + b2v2 + b3v3,

c = c1v1 + c2v2 + c3v3.

Now (a + b) · c can be written as

((a1v1 + a2v2 + a3v3) + (b1v1 + b2v2 + b3v3)) · (c1v1 + c2v2 + c3v3) =

((a1 + b1)v1 + (a2 + b2)v2 + (a3 + b3)v3) · (c1v1 + c2v2 + c3v3) =

(a1 + b1)c1v1 · v1 + (a2 + b2)c1v2 · v1 + (a3 + b3)c1v3 · v1 +

(a1 + b1)c2v1 · v2 + (a2 + b2)c2v2 · v2 + (a3 + b3)c2v3 · v2 +

(a1 + b1)c3v1 · v3 + (a2 + b2)c3v2 · v3 + (a3 + b3)c3v3 · v3.

At first sight this may not seem such a big simplification, but we must consider two things: first of all, since we are not going to change the basis vectors very frequently, we can compute the six scalar products between the basis vectors once and for all; and secondly, we have to deal only with sums and multiplications of the coordinates, which are much easier than trigonometric expressions.
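The bookkeeping above can be sketched in a few lines of Python. The table g[i][j] = vi · vj is computed once and for all; afterwards (a + b) · c needs only sums and products of coordinates. The basis and the coordinates below are made up for illustration.

```python
# A (non-orthonormal) basis of R^3, one vector per row:
basis = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 2.0)]

def dot3(u, v):
    return sum(x * y for x, y in zip(u, v))

# Precompute the table of basis dot products once and for all.
g = [[dot3(vi, vj) for vj in basis] for vi in basis]

def dot_in_basis(a, b):
    """Dot product from the coordinates a, b in `basis`, using only g."""
    return sum(a[i] * b[j] * g[i][j] for i in range(3) for j in range(3))

a, b, c = (1.0, 2.0, 0.0), (0.0, 1.0, 1.0), (2.0, 0.0, 1.0)

# (a + b) · c computed coordinate-wise ...
s = tuple(ai + bi for ai, bi in zip(a, b))
lhs = dot_in_basis(s, c)

# ... agrees with assembling the actual vectors and dotting them directly.
def assemble(coords):
    return tuple(sum(coords[i] * basis[i][k] for i in range(3)) for k in range(3))

rhs = dot3(assemble(s), assemble(c))
assert abs(lhs - rhs) < 1e-12
```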

1.4.1 Orthonormal right-handed coordinate system

Usually, the basis vectors are chosen to be of unit length, so that vi · vi = 1 ∀i, and to be mutually orthogonal, that is vi · vj = 0 if i ≠ j. We can summarize these two properties by writing vi · vj = δij. In this way the scalar product between any two vectors a and b can be written as

a · b = a1b1 + a2b2 + a3b3,


and our previous equation for (a + b) · c takes the simpler form

(a + b) · c = (a1 + b1)c1 + (a2 + b2)c2 + (a3 + b3)c3.

Moreover, the basis vectors are usually ordered so that they constitute a right-handed system:

v1 × v2 = v3

v2 × v3 = v1

v3 × v1 = v2,

so that we can express the cross product between two vectors as

a × b = (a2b3 − a3b2)v1 + (a3b1 − a1b3)v2 + (a1b2 − a2b1)v3.
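These properties can be checked exhaustively for the standard basis of R³; a small Python sketch (the component functions below are illustrative helpers, not from the notes):

```python
# The standard orthonormal right-handed basis of R^3:
e = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Orthonormality: v_i · v_j = δ_ij
for i in range(3):
    for j in range(3):
        assert dot(e[i], e[j]) == (1.0 if i == j else 0.0)

# Right-handedness: v1 × v2 = v3, v2 × v3 = v1, v3 × v1 = v2
assert cross(e[0], e[1]) == e[2]
assert cross(e[1], e[2]) == e[0]
assert cross(e[2], e[0]) == e[1]
```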

1.5 Invariance of physical laws

One of the most basic assumptions in physics is that the laws of nature are constant in time and space⁵: a certain phenomenon — or a certain experiment, from the physicist's point of view — develops in the same way whether it takes place now or in a month, here or at the other side of the galaxy, facing this or that direction. This also means that any two observers examining a phenomenon from different positions and under different directions see indeed the same phenomenon, that is they measure the same physical quantities.

These three assumptions, namely invariance under time translation, space translation, and rotation⁶, form the basis for classical mechanics, and are reflected in the way the laws of physics are written: all the quantities used in a physical law must be invariant under these transformations. For example, the position of an object with respect to the origin of coordinates is not a proper physical quantity, because a different observer using a different coordinate system would measure a different position. The relative position of two objects, instead, is a good physical quantity, since it is the same for all observers.

Moreover, every admissible "manipulation" of a physical quantity in a physical law not only must yield something that is still invariant, but must be itself invariant. The most important class of such "manipulations" are called tensors, and describe linear relations between scalars, vectors, or other tensors. The scalar and vector products are two such linear relations, and can be described using tensors. The differential operators gradient, divergence, curl and Laplacian are tensors too, and we will encounter another tensor when looking for a linear relation between the stress and the strain, or the strain rate, of a continuous body.

⁵Or, seen from the opposite point of view, we assume that time and space themselves are homogeneous and isotropic.

⁶Actually, together with Newton's laws of motion.

Chapter 2

Linear algebra

2.1 Basic definitions

Groups

A group is a set G together with an operation f : G × G → G, called a group law, with the following properties:

• closure: for every x, y ∈ G, z = f(x, y) ∈ G;

• associativity: for every x, y, z ∈ G, f(f(x, y), z) = f(x, f(y, z));

• identity element: there exists an element e ∈ G such that for all x ∈ G, f(e, x) = f(x, e) = x;

• inverse element: for every element x ∈ G there exists an element y ∈ G such that f(x, y) = f(y, x) = e.

The group law is usually written as a product, that is f(x, y) = x × y = x · y = xy, the identity element is denoted by 1, and the inverse of x is denoted by x⁻¹. If the group law is also commutative, that is ∀x, y ∈ G, f(x, y) = f(y, x), the group law is usually written as a sum, that is f(x, y) = x + y, the identity element is denoted by 0, and the inverse of x is denoted by −x. If the group law is commutative we also say that the group is commutative.

The integers Z, together with the addition, are a (commutative) group.
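Beside Z, a finite toy example whose axioms can be checked exhaustively: the integers mod 5 with addition mod 5 (a sketch assuming nothing beyond the definition above).

```python
# Z_5 = {0, 1, 2, 3, 4} with addition mod 5 is a commutative group.
n = 5
G = range(n)
f = lambda x, y: (x + y) % n

# closure
assert all(f(x, y) in G for x in G for y in G)
# associativity
assert all(f(f(x, y), z) == f(x, f(y, z)) for x in G for y in G for z in G)
# identity element: e = 0
assert all(f(0, x) == x == f(x, 0) for x in G)
# inverse element: the inverse of x is (n - x) % n
assert all(f(x, (n - x) % n) == 0 for x in G)
# commutativity
assert all(f(x, y) == f(y, x) for x in G for y in G)
```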

Rings

A ring is a set G together with two operations, called addition and multiplication, satisfying the following properties:

• G is a commutative group with respect to addition;



• The multiplication is closed, associative and has an identity element;

• The distributive law is applicable: ∀x, y, z ∈ G we have

(x + y) · z = (x · z) + (y · z) and z · (x + y) = (z · x) + (z · y).

Fields

A field is a ring for which:

• also multiplication is commutative;

• there exists a multiplicative inverse for every non-zero element (i.e. for every element except the identity element of the addition);

• the additive and the multiplicative identities are distinct.

The sets of all rational and of all real numbers, Q and R, together with addition and multiplication, are common examples of fields.

2.2 Vector spaces

Let F be a field. A vector space over F is a commutative group V together with an operation f : F × V → V, denoted by a product and called scalar multiplication, such that, for all a, b ∈ F and x, y ∈ V, we have that

• scalar multiplication is distributive with respect to vector addition:

a(x+ y) = ax+ ay;

• scalar multiplication is distributive with respect to field addition:

(a+ b)x = ax+ bx;

• scalar multiplication is compatible with field multiplication:

(ab)x = a(bx);

• the identity element of scalar multiplication is the identity element of field multiplication:

∀x ≠ 0, ax = x ⇔ ab = b ∀b ∈ F.

Elements of F are called scalars and elements of V are called vectors.


2.2.1 Some examples

• Given a field F, the set F^n of all the ordered n-tuples of elements of F is a vector space over the field F if, given α ∈ F and x = (µ1, . . . , µn), y = (η1, . . . , ηn) ∈ F^n, we define:

x + y = (µ1 + η1, . . . , µn + ηn),

αx = (αµ1, . . . , αµn).

In particular, let's notice that F^1 = F is a vector space, that is, every field F can be regarded as a vector space over F itself.

• A rectangular array of n × m elements of a field F, that is the set A = {aij} with aij ∈ F ∀i = 1, . . . , n and ∀j = 1, . . . , m, usually written in the form

  a11 . . . a1m
   ⋮    ⋱    ⋮           (2.1)
  an1 . . . anm

is called an n × m matrix. The set M^F_{n×m} of all the n × m matrices over F is a vector space if, given α ∈ F and {aij}, {bij} ∈ M^F_{n×m}, we define:

{aij} + {bij} = {aij + bij},
α{aij} = {αaij}.

• Let P be the set of all polynomials with real coefficients. P can be regarded as a vector space if we interpret vector addition and scalar multiplication as the ordinary addition of two polynomials and the multiplication of a polynomial by a real number. Note that, with the same operations, the sets Pn of all the polynomials of degree < n are also vector spaces.
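The examples above translate directly into code: a sketch with R^n as tuples and polynomials as coefficient lists, the operations defined componentwise (the representation is an illustrative choice, not from the notes).

```python
# F^n (here R^n) as tuples, with componentwise operations:
def vec_add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def vec_scale(alpha, x):
    return tuple(alpha * a for a in x)

assert vec_add((1, 2, 3), (4, 5, 6)) == (5, 7, 9)
assert vec_scale(2, (1, 2, 3)) == (2, 4, 6)

# Polynomials stored low degree first: [c0, c1, c2] means c0 + c1 x + c2 x^2
def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(alpha, p):
    return [alpha * c for c in p]

# (1 + x) + (2 + 3x^2) = 3 + x + 3x^2
assert poly_add([1.0, 1.0], [2.0, 0.0, 3.0]) == [3.0, 1.0, 3.0]
assert poly_scale(2.0, [1.0, 1.0]) == [2.0, 2.0]
```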

2.2.2 Linear combinations and linear dependence

Let V be a vector space and X = {xi : xi ∈ V ∀i = 1, . . . , n} be a finite set of vectors. Given a finite set of scalars A = {ai, i = 1, . . . , n}, a linear combination of the elements of X is the expression

Σ_{i=1}^{n} aixi = a1x1 + . . . + anxn,

and by extension also its value.


The zero vector can always be obtained as a linear combination of any set {xi} of vectors, using only the zero scalar as coefficient: ai = 0 ∀i. This linear combination is called trivial.

Given a set of vectors X of a vector space V, if every vector of V can be expressed as a linear combination of X, then X is a generator of V.

The vectors of a finite set {xi} (and the set itself) are called linearly dependent if at least one of them can be written as a linear combination of the others,

xk = Σ_{i=1,...,n; i≠k} aixi,

or equivalently, by subtracting the vector xk from both sides of the previous equation, if the zero vector can be obtained by means of a non-trivial linear combination of all the vectors. A set of vectors that includes the zero vector is obviously linearly dependent. A set that is not linearly dependent is called linearly independent.

2.2.3 Bases

A basis in a vector space V is a set of linearly independent vectors B that also generates V.

As a corollary, let's notice that this implies that every vector v in V can be written as a linear combination of the basis (because it is a generator), and that this linear combination is unique (otherwise the basis wouldn't be linearly independent). This gives us the opportunity to denote every vector v = v1b1 + . . . + vnbn by the coefficients of the linear combination of the basis vectors, (v1, . . . , vn), that is, by an n-tuple of elements of F. We will see in the following how deep this connection is.

It can be shown that every linearly independent set of vectors X can be extended to a basis, and that every generator of the vector space can be restricted to a basis. More importantly, it can be shown that, given a vector space V, every basis of that vector space has the same cardinality. The cardinality of the basis is called the dimension of the vector space. In the following we will treat mainly finite-dimensional vector spaces.


2.2.4 Functions over vector spaces

Linear maps

Given two vector spaces U and V over the same field F, a linear map or linear function is a function f : U → V such that

f(a1u1 + a2u2) = a1f(u1) + a2f(u2)

for all u1, u2 ∈ U and all a1, a2 ∈ F.

Let M be the set of all linear maps from U to V. Then M can be viewed as a vector space over F when equipped with the following definitions for addition and scalar multiplication:

(f + g)(u) = f(u) + g(u)

(af)(u) = af(u)

for all f, g ∈ M, u ∈ U, a ∈ F. Let's notice in particular that the zero vector is the map f(u) = 0 for all u ∈ U.

Given a basis B = {b1, . . . , bn} of U, for every vector u = η1b1 + . . . + ηnbn we can thus write

f(u) = f(η1b1 + . . . + ηnbn) = η1f(b1) + . . . + ηnf(bn).

This means that, on the one hand, in order to know the action of a linear function over any vector u ∈ U it is enough to know the action of that linear function over the basis vectors bi. On the other hand, given a set of vectors {v1, . . . , vn} ⊂ V, there is one and only one linear map such that f(bi) = vi.

Let’s take a linear map f whose action on the basis vectors is given bythe vectors v1, . . . ,vn. As we have seen, given a vector u = (u1, . . . , un),we have that

z = f(u) = u1v1 + . . .+ unvn.

Let’s now denote the coordinates of the vectors vi as (vi1, vi2, . . . , v

im), and

those of the vector z as (z1, . . . , zm). Note that U is an n-dimensional vectorspace while V is an m-dimensional vector space, and that n and m can bedifferent. We can rewrite the previous equation as

zj =n∑i=1

uivij =

n∑i=1

vjiui.

It is natural now to regard all the vij = vji as a single object, a rectangulararray of m × n elements, that is a matrix. In fact, given a certain basis,


every linear map can be written as a matrix and every matrix defines a linear map. Moreover, the last equation defines the (usual) product between an m × n matrix and an n-dimensional vector. The definitions of sum and scalar multiplication that we have given for the vector space of linear maps are fully compatible with the ones we have given for the vector space of matrices. The name M for the set of all linear maps was indeed not given by chance.
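This correspondence can be sketched numerically with numpy. The map below, its basis images and the input vector are all illustrative values: the matrix of a linear map has the images of the basis vectors as its columns, so applying the map reduces to a matrix–vector product.

```python
import numpy as np

# Hypothetical linear map f: R^3 -> R^2, defined by its (illustrative)
# action on the standard basis vectors.
f_b1 = np.array([1.0, 0.0])
f_b2 = np.array([2.0, 1.0])
f_b3 = np.array([0.0, 3.0])

# The matrix of f has the images of the basis vectors as its columns.
F = np.column_stack([f_b1, f_b2, f_b3])   # shape (2, 3): m x n

u = np.array([1.0, -1.0, 2.0])            # coordinates of u in the basis
# Linearity: f(u) = u1 f(b1) + u2 f(b2) + u3 f(b3) ...
by_linearity = u[0] * f_b1 + u[1] * f_b2 + u[2] * f_b3
# ... which is exactly the matrix-vector product F u.
by_matrix = F @ u
assert np.allclose(by_linearity, by_matrix)
```

Note that m = 2 and n = 3 here, matching the remark that domain and codomain may have different dimensions.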

The special case of a linear map from U to F is called a linear functional. Let’s notice that a linear functional gives rise to a very special kind of matrix, with just one row and as many columns as the dimension of U , that is with as many elements as the number of coordinates of the vectors of U . We’ll see that this is more than just a coincidence.

Given any function f : V → W we can identify two particularly interesting and useful sets. The first is the image of f , that is the set of all the elements in W that are spanned by f when applied to every element of V , in symbols

Im(f) = {w ∈ W : ∃v ∈ V : f(v) = w} .

The second set is the kernel of f , that is the set of all the elements in V that are transformed into the zero element of W , in symbols

ker(f) = {v ∈ V : f(v) = 0} .

In the context of linear algebra, and thus of linear functions, the kernel is often called the null space and the image is often called the range.

If f is a linear function, we know for sure that both the image and the kernel of f are non-empty and that they form a vector subspace of, respectively, W and V : we know that f(0) = 0, so both the image and the kernel must contain at least the zero element (of W and V , respectively). Moreover, if w1 and w2 are two vectors in the image of f , that is if there exist two vectors v1 and v2 such that f(v1) = w1 and f(v2) = w2, then for any linear combination w3 = αw1 + βw2 we have that

w3 = αw1 + βw2 = αf(v1) + βf(v2) = f(αv1 + βv2) = f(v3),

that is, also w3 ∈ Im(f), and thus Im(f) is a vector space. A similar argument can be carried over for the kernel: given two vectors v1 and v2 such that f(v1) = 0 and f(v2) = 0, then for any linear combination αv1 + βv2 we have that

f(αv1 + βv2) = αf(v1) + βf(v2) = 0.

In the special case that ker(f) = {0}, i.e. dim(ker(f)) = 0, every vector of V is mapped into a different vector of W , and the function f is thus injective. Analogously, in the special case that Im(f) = W , i.e. dim(Im(f)) = dim(W ), every vector of W is obtained from some vector v in V , and the function f is thus surjective. A function is invertible (i.e. bijective) if and only if it is both injective and surjective.

In the context of linear algebra the dimension of the kernel is called the nullity and the dimension of the image is called the rank.

Theorem 2.1 (Rank-nullity theorem). For any linear function f : V → W , the sum of the rank and the nullity gives the dimension of the domain:

dim(Im(f)) + dim(ker(f)) = dim(V ).

Proof. Since the kernel is a vector space we can build a basis for it, X = {x1, . . . ,xm}, and since the kernel is a subspace of V we can extend X by adding a second set of vectors Y = {y1, . . . ,yn} so that X ∪ Y is a basis of the whole space V . In particular this means that

dim(V ) = dim(X ∪ Y ) = dim(X) + dim(Y ) = dim(ker(f)) + dim(Y ),

and we are left to prove that dim(Y ) = dim(Im(f)).

Let’s now take any vector v ∈ V , v = α1x1 + . . . + αmxm + β1y1 + . . . + βnyn, and let’s apply f to it:

f(v) = f(α1x1 + . . .+ αmxm + β1y1 + . . .+ βnyn)

= f(α1x1 + . . .+ αmxm) + f(β1y1 + . . .+ βnyn)

= 0 + β1f(y1) + . . .+ βnf(yn).

This means that every vector f(v) in the image of f can be written as a linear combination of the vectors f(yi), i.e. the set {f(y1), . . . , f(yn)} generates the whole image of f . Let’s now prove that these vectors are also linearly independent, i.e. that they form a basis of Im(f): if there exists a linear combination of these vectors that gives zero,

0 = γ1f(y1) + . . .+ γnf(yn),

using the linearity of the function we obtain that

0 = f(γ1y1 + . . .+ γnyn),

which means that γ1y1 + . . . + γnyn is in the kernel of f , and so it can be written as a linear combination of the vectors xi, giving δ1x1 + . . . + δmxm = γ1y1 + . . . + γnyn, or better

δ1x1 + . . . + δmxm − γ1y1 − . . . − γnyn = 0,

but since X ∪ Y is a basis of V , the last equation is true if and only if all the scalars δ1, . . . , δm, γ1, . . . , γn are zero. Thus the vectors f(yi) form a basis of Im(f) and, since they are as many as the vectors yi, it is also true that dim(Im(f)) = dim(Y ).
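The theorem can also be checked numerically. The sketch below (numpy, with an illustrative 4 × 3 matrix whose third column is the sum of the first two) compares the nullity predicted by rank-nullity with the number of zero singular values, whose right-singular vectors span the null space.

```python
import numpy as np

# An illustrative 4x3 matrix, so dim(V) = n = 3; column 3 = column 1 + column 2.
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [0., 1., 1.],
              [1., 3., 4.]])

n = A.shape[1]                        # dim(V), the domain
rank = np.linalg.matrix_rank(A)       # dim(Im(f))
nullity = n - rank                    # dim(ker(f)) predicted by the theorem

# Cross-check: count the (numerically) zero singular values of A.
s = np.linalg.svd(A, compute_uv=False)
null_dim = int(np.sum(s < 1e-10))
assert null_dim == nullity            # rank + nullity = dim(V)
```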


Isomorphism

Two vector spaces U and V over the same field F are isomorphic if there is a linear isomorphism between them, that is if there exists an invertible linear map f : U → V such that also its inverse f−1 : V → U is a linear map. In other words, U and V are isomorphic if there is a one-to-one correspondence between them that preserves the space structure.

Let’s notice that a composition of isomorphisms is again an isomorphism, so if U is isomorphic to V and V is isomorphic to W then U is isomorphic to W .

An isomorphism between U and V transforms a basis of U into a basis of V , thus isomorphic vector spaces have the same dimension. Also the converse is true: all vector spaces over the same field with the same dimension are isomorphic.

Theorem 2.2. Every n-dimensional vector space V over a field F is isomorphic to F n.

Proof. Let x1, . . . ,xn be a basis of V . Each vector v in V can be written in the form a1x1 + . . . + anxn, and the scalars a1, . . . , an are uniquely determined for each v. We consider the one-to-one correspondence

v ↔ (a1, . . . , an)

between vectors of V and n-tuples of F n. If u = b1x1 + . . . + bnxn then

αv + βu = (αa1 + βb1)x1 + . . . + (αan + βbn)xn.

This establishes the desired isomorphism.

This theorem also shows that every basis of a vector space V determines a coordinate system, since it associates to every vector v ∈ V an n-tuple of elements of a field F . In particular, once a basis and a coordinate system are fixed, the linear functionals ei : V → F that associate to every vector v its i-th coordinate vi are called coordinate functions.

Moreover, the coordinate representation is itself a vector space, with the same properties as the original vector space: all the operations on the original vectors can be equally performed on the coordinate representation. The sum of two vectors u = (u1, . . . , un) and v = (v1, . . . , vn) is the vector w = u + v = (u1 + v1, . . . , un + vn) and, given a scalar a, we can perform the scalar multiplication by direct multiplication of the coordinates: au = a(u1, . . . , un) = (au1, . . . , aun).


Dual space

Let V be a vector space over a field F and V ∗ ⊂ M be the set of all linear functionals on V . V ∗ is called the dual space of V and it is a vector space over F when equipped with the same definitions for addition and scalar multiplication as M .

Theorem 2.3 (Uniqueness of the dual basis). If V is an n-dimensional vector space and if B = {b1, . . . , bn} is a basis of V , then there is a uniquely determined basis B∗ = {f 1, . . . ,fn} of V ∗ with the property that f i(bj) = δij, called the dual basis of V . Consequently the dual space of a finite-dimensional vector space is isomorphic to the vector space itself.

Proof. We only have to prove that the set B∗ is indeed a basis for V ∗.

In the first place, B∗ is linearly independent, for if we had a1f 1 + . . . + anfn = 0, in other words, if

(a1f 1 + . . . + anfn)(v) = a1f 1(v) + . . . + anfn(v) = 0

for all v, then we should have in particular, for v = bi,

0 = ∑j ajf j(bi) = ∑j ajδji = ai.

In the second place, every g ∈ V ∗ is a linear combination of B∗: first of all, given the action of g on B, g(bi) = ai, we can write for every vector v = η1b1 + . . . + ηnbn

g(v) = η1a1 + . . . + ηnan.

On the other hand

f i(v) = ∑j ηjf i(bj) = ηi,

so that, substituting in the preceding equation, we get

g(v) = f 1(v)a1 + . . .+ fn(v)an = (a1f 1 + . . .+ anfn)(v).

Consequently g = a1f 1 + . . .+ anfn ∀g ∈ V ∗.

In particular, let’s notice that the vectors of both an n-dimensional vector space over a field F and its dual space can be viewed as vectors of F n, that is n-tuples of elements of F .
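For V = F n this construction can be made concrete. In the sketch below (numpy, illustrative basis) the basis vectors are stored as the columns of a matrix P; the dual basis functionals, viewed as row vectors, are then the rows of P⁻¹, because the defining condition f i(bj) = δij reads exactly P⁻¹P = 1.

```python
import numpy as np

# An illustrative basis of R^3, stored column-wise: columns are b1, b2, b3.
P = np.array([[ 3., 1.,  4.],
              [ 2., 1., -1.],
              [-5., 1.,  2.]])
P_inv = np.linalg.inv(P)

# Row i of P_inv is the dual functional f^i; it acts on a vector by a
# dot product, and satisfies f^i(b_j) = delta_ij.
for i in range(3):
    for j in range(3):
        value = P_inv[i] @ P[:, j]        # f^i(b_j)
        assert np.isclose(value, 1.0 if i == j else 0.0)
```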


Double-dual and reflexivity

We will now introduce a different notation for the action of a linear functional f over a vector v: instead of the usual f(v) we will write 〈f ,v〉. This new notation will become clear in a moment.

Since the dual space V ∗ of a vector space V is itself a vector space, it is natural to take another step and construct the dual space of the dual space, (V ∗)∗, or, looking from the point of view of V , the double-dual space V ∗∗. The verbal description of an element of V ∗∗ is clumsy, and may cause at first some headaches if one tries too hard to get a mental image of it: such an element is a linear functional of linear functionals; but let’s not be baffled by the mathematical jargon: an element of V ∗∗ is indeed just a linear functional, and an element of V ∗ is just a vector.

Now let’s take our brand new notation and give it a closer look: if we consider the symbol 〈f ,v〉 for some fixed f = f 0, we obtain nothing new: 〈f 0,v〉 is just another way of writing the value f 0(v) of the function f 0 at the vector v. If, however, we consider the symbol 〈f ,v〉 for some fixed v = v0, then we observe that this defines a function of the vectors in V ∗, whose value at f is 〈f ,v0〉. This function is also scalar-valued and, thanks to the properties of the vector space V ∗, it happens to be linear. In other words, 〈f ,v0〉 defines a linear functional on V ∗, and, consequently, an element of V ∗∗. The correspondence between vectors of V and vectors of V ∗∗ is always injective and, in the case of finite-dimensional vector spaces, it is also surjective.

Theorem 2.4. If V is a finite-dimensional vector space, then corresponding to every linear functional z0 on V ∗ there is one and only one vector v0 in V such that z0(f) = 〈f ,v0〉 = f(v0) for every f ∈ V ∗. This correspondence between V ∗∗ and V is an isomorphism, and is called the natural correspondence.

Proof. The correspondence is an injective linear map. Its image is therefore a subspace of V ∗∗ of dimension dim(V ) = dim(V ∗) = dim(V ∗∗). From the rank-nullity theorem it follows that it is also surjective.

Moreover, we can view 〈·, ·〉 as a function f : V ∗ × V → F . Since it is linear in both of its arguments, it is called a bilinear form.

Given a basis B = {b1, . . . , bn} of a vector space V and its dual basis B∗ = {b∗1, . . . , b∗n}, we can express any vector v ∈ V by its coordinates (v1, . . . , vn) and any vector u ∈ V ∗ by its coordinates (u1, . . . , un). Then

〈u,v〉 = ∑i,j uib∗i(vjbj) = ∑i,j uivjb∗i(bj) = ∑i,j uivjδij = ∑i uivi = u1v1 + . . . + unvn,

which is the well-known formula for the scalar product of two vectors of Rn. The function 〈·, ·〉 is indeed a good definition of a scalar product, since it is bilinear (more or less by construction), commutative (because a field is commutative with respect to both addition and multiplication) and 〈u,v〉 = 0 if and only if the vectors u and v are perpendicular1.

Since every finite-dimensional vector space V is isomorphic to its dual and double-dual, they are, in a very specific and technical sense, the same vector space and thus, with an abuse of notation, we will now use the same symbols for all three spaces.

2.3 Change of basis

Every vector space has an infinite number of possible bases, and sometimes it will be useful to change from one basis to another. To express the coordinates of a vector in a new basis, given the coordinates in the old basis, is not particularly difficult, but it is certainly cumbersome and, especially at first, it can be confusing. We will thus try to keep the notation as clear as possible.

Let’s take an n-dimensional vector space V together with two different bases B = {b1, . . . , bn} and C = {c1, . . . , cn}. The vectors of C can be written in terms of their coordinates with respect to the basis B, for example c1 = c11b1 + . . . + c1nbn = (c11, c12, . . . , c1n)B , where cji denotes the i-th coordinate of the vector cj and the subscript B denotes explicitly the basis to which the coordinates relate.

Now, given a vector v = v1c1 + . . . + vncn = (v1, . . . , vn)C written using the basis C, we can write it in the other basis B using the expressions for the basis vectors ci with respect to the basis B:

vB = v1 (c11, c12, . . . , c1n)B + v2 (c21, c22, . . . , c2n)B + . . . + vn (cn1, cn2, . . . , cnn)B

   = [ c11 c21 . . . cn1 ]
     [ c12 c22 . . . cn2 ]  (v1, v2, . . . , vn)C = MB←C vC .
     [  .   .  . . .  .  ]
     [ c1n c2n . . . cnn ]B

1Actually, it is the other way round, i.e. the vectors are defined to be perpendicular because the value of the bilinear function is zero. Which directions are perpendicular is determined by the direction of the basis vectors of V .


The matrix MB←C is called the change of basis matrix.

We can obviously develop the same argument using the expression of the basis vectors of B in terms of those of C, obtaining

vC = MC←B vB .

The matrix MC←B is made up, in analogy to MB←C , of the coordinates of the basis vectors of B. Also, it is the inverse of MB←C , as can be seen by the following chain of equalities:

vC = MC←B vB = MC←B MB←C vC , ∀v ∈ V .

The pair of bases B and C thus defines two linear maps, one the inverse of the other, from V to itself.

As an explicit example, let’s take P3 and choose the set B = {b1 = 1, b2 = t, b3 = t2} as a basis. Every vector a0 + a1t + a2t2 can now be denoted by its coordinates (a0, a1, a2), and the three basis vectors are thus given by b1 = (1, 0, 0)B , b2 = (0, 1, 0)B and b3 = (0, 0, 1)B . Let’s now take another basis, say C = {c1 = 3 + 2t − 5t2, c2 = 1 + t + t2, c3 = 4 − t + 2t2} or, in coordinate notation, C = {(3, 2, −5)B , (1, 1, 1)B , (4, −1, 2)B}. We have that

MB←C = [  3  1  4 ]                  [ 3   2  −5 ]
        [  2  1 −1 ]   and   MC←B = (1/38) [ 1  26  11 ] .
        [ −5  1  2 ]                  [ 7  −8   1 ]

2.4 Matrices

In the following we will focus only on square matrices and on the related linear functions from a vector space to itself. We will therefore denote the vectors of the vector space in question with lowercase boldface letters, like u, and linear functions with a non-standard (for functional analysis) non-boldface-letter notation, like f and g, to avoid confusion with the vectors they act upon. Their corresponding matrices will be denoted by uppercase boldface letters, like F and G, and the entries of these matrices by lowercase letters with two indices for the row and the column, like fij and gij.

As we have already seen in section 2.2.4, we can write (and calculate) the vector y = f(z) using matrix notation:

yi = ∑j fijzj.


Let’s now take another linear function g and write the composite function g(f(z)):

xk = ∑i gkiyi = ∑i gki ∑j fijzj = ∑j (∑i gkifij) zj = ∑j hkjzj.

The composition of two linear functions h = g(f) is thus again (as expected) a linear function, and the corresponding matrix H is given by the matrix product GF , with entries hkj = ∑i gkifij. In practice every element hij is given by the scalar product of the i-th row of the first matrix with the j-th column of the second matrix.
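A small numpy sketch (with illustrative 2 × 2 matrices) checks that the matrix of the composite map is the matrix product, and also anticipates the remark below that matrix multiplication is not commutative.

```python
import numpy as np

# Two illustrative linear maps on R^2, given by their matrices.
F = np.array([[1., 2.],
              [0., 1.]])
G = np.array([[0., 1.],
              [3., 0.]])
H = G @ F                                 # matrix of the composite map g(f(.))

z = np.array([2., -1.])
assert np.allclose(H @ z, G @ (F @ z))    # h(z) = g(f(z))

# Entry-wise: h_kj is the scalar product of row k of G with column j of F.
assert np.isclose(H[0, 1], G[0] @ F[:, 1])

# Matrix multiplication is not commutative in general.
assert not np.allclose(F @ G, G @ F)
```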

It’s easy to see that matrix multiplication is associative,

(A(BC))il = ∑j aij (∑k bjkckl) = ∑k (∑j aijbjk) ckl = ((AB)C)il,

and that the matrix 1 whose entries are given by the Kronecker delta δij is its unit element. Unfortunately, since not every linear function is invertible, not every matrix has an inverse2. Let’s notice also that matrix multiplication is not commutative, that is AB ≠ BA in general.

Given a matrix A = (aij), the matrix At = (aji) is called the transpose of A. A matrix A = (aij) such that aij = 0 if i < j is called a lower triangular matrix, while it is called upper triangular if aij = 0 whenever i > j. A matrix that is at the same time upper and lower triangular, that is aij = 0 if i ≠ j, has non-zero entries only on its main diagonal, and is thus called a diagonal matrix. The sum of the diagonal elements of a matrix is called the trace, in symbols tr(A) = ∑i aii.

2.4.1 Determinant

Multilinear alternating functions

Given a vector space V over a field F , a function

f : V × . . . × V → F   (with n factors of V )

is said to be a multilinear function on V if it is linear in each variable, i.e. if for every index i and fixed elements x1, . . . ,xi−1,xi+1, . . . ,xn the function

f i(y) = f(x1, . . . ,xi−1,y,xi+1, . . . ,xn)

is linear.

2Thus the set of all n × n square matrices together with matrix multiplication does not form a group.

Moreover, we say that a multilinear function is alternating if f(x1, . . . ,xn) = 0 whenever there exists an index i, 1 ≤ i ≤ n − 1, such that xi = xi+1, that is when two adjacent arguments are equal.

A multilinear alternating function has the property that, when we transpose two adjacent arguments, its value changes sign. Concentrating, without loss of generality, on the first two arguments of a multilinear alternating function f we have that

0 = f(x + y,x + y, . . .)
  = f(x,x, . . .) + f(x,y, . . .) + f(y,x, . . .) + f(y,y, . . .)
  = f(x,y, . . .) + f(y,x, . . .)

and so, taking one addend to the other side of the equation, we obtain

f(x,y, . . .) = −f(y,x, . . .).

This in particular means that if any two distinct arguments xi and xj are equal then the value of the multilinear alternating function is zero, since bringing these two arguments next to each other will cause at most a change of sign. Moreover, replacing one of its arguments xi by xi + axj does not change the value of a multilinear alternating function.

From now on we will treat only n-linear alternating functions on an n-dimensional vector space. Let’s thus take a multilinear alternating function f(x1, . . . ,xn) and let’s write its arguments as linear combinations of basis vectors bi,

x1 = a11b1 + . . . + a1nbn,
. . .
xn = an1b1 + . . . + annbn,

obtaining

f(x1, . . . ,xn) = f(a11b1 + . . . + a1nbn, . . . , an1b1 + . . . + annbn)
               = ∑δ a1δ(1) . . . anδ(n) f(bδ(1), . . . , bδ(n)),     (2.2)

where we have expanded the function using its multilinearity, and δ thus ranges over all arbitrary functions of {1, . . . , n} into itself. However, if δ is not a bijection (i.e. a permutation), then, for that particular addend, at least two arguments bδ(i) and bδ(j) are equal (with i ≠ j), and thus that particular addend is equal to zero. Thus we can restrict our sum to permutations σ.


When shuffling back each term f(bσ(1), . . . , bσ(n)) to the standard ordering f(b1, . . . , bn) we perform a number m of transpositions (and thus of changes of sign) determined by the permutation σ, obtaining:

f(bσ(1), . . . , bσ(n)) = (−1)m f(b1, . . . , bn) = sgn(σ) f(b1, . . . , bn).

We can now rewrite (2.2) obtaining

f(x1, . . . ,xn) = ∑σ sgn(σ) a1σ(1) . . . anσ(n) f(b1, . . . , bn).     (2.3)

Determinant

The determinant is the function det : MRn×n → R that, when viewed as a function det(A1, . . . , An) of the columns A1, . . . , An of a matrix A, is multilinear alternating, and such that det(1) = 1. It can be shown that such a function is unique.

Theorem 2.5 (Cramer’s rule). Let A1, . . . , An be column vectors of dimension n. Let x1, . . . , xn ∈ R be such that

x1A1 + . . . + xnAn = B

for some column vector B. Then for each i we have

xi det(A1, . . . , An) = det(A1, . . . , Ai−1, B, Ai+1, . . . , An).

Proof. Say i = 1. We expand

det(B, A2, . . . , An) = ∑j xj det(Aj, A2, . . . , An),

and just check that every term on the right-hand side is equal to 0 (because two column vectors are equal) except for j = 1.

Corollary 2.6. Given n column vectors A1, . . . , An of dimension n, they are linearly dependent if and only if det(A1, . . . , An) = 0.

The previous theorem is called Cramer’s rule because it gives a rule to solve linear systems of n equations in n unknowns: given a linear system

a11x1 + a12x2 + . . . + a1nxn = b1
a21x1 + a22x2 + . . . + a2nxn = b2
. . .
an1x1 + an2x2 + . . . + annxn = bn


we can rewrite it in matrix form as Ax = b, where

A = [ a11 a12 . . . a1n ]
    [ a21 a22 . . . a2n ]   x = (x1, x2, . . . , xn),   b = (b1, b2, . . . , bn),
    [  .   .  . . .  .  ]
    [ an1 an2 . . . ann ]

and Cramer’s rule tells us that the system has a unique solution if and only if det(A) ≠ 0, this solution being given by

xi = det(Ai) / det(A),

where Ai is the matrix obtained by substituting the i-th column of the matrix A with the constant term vector b. Moreover, now we also know that a matrix A is invertible if and only if det(A) ≠ 0.

We shall now give two different formulas for computing the determinant of an n × n matrix A. We will then make some remarks on how to use the properties of multilinear alternating functions to ease the calculation of the determinant of a matrix.

Given an n × n matrix A = (A1, . . . , An) = (aij), and the canonical basis B = {b1, . . . , bn} of Rn, we can write the columns of the matrix as linear combinations of the basis vectors

A1 = a11b1 + . . . + an1bn,
. . .
An = a1nb1 + . . . + annbn.

Using now Eq. (2.3) and the definition of the determinant we obtain the following expression:

det(A) = det(A1, . . . , An) = ∑σ sgn(σ) aσ(1)1 . . . aσ(n)n det(b1, . . . , bn)
       = ∑σ sgn(σ) aσ(1)1 . . . aσ(n)n det(1) = ∑σ sgn(σ) aσ(1)1 . . . aσ(n)n,     (2.4)

which is known as the Leibniz formula. It is possible to show, using some properties of permutations, that the determinant is multilinear and alternating also with respect to the rows of matrices, and that in particular

det(A) = det(At) = ∑σ sgn(σ) a1σ(1) . . . anσ(n).


Given an n × n matrix A = (A1, . . . , An) = (aij), we denote by Aij the matrix obtained from A by deleting the i-th row and the j-th column. For a fixed integer k, we have that

det(A) = (−1)k+1ak1 det(Ak1) + . . . + (−1)k+nakn det(Akn)
       = (−1)k+1a1k det(A1k) + . . . + (−1)k+nank det(Ank),     (2.5)

which determines a recursive formula for calculating the determinant once we define, for n = 1, det(A) = a. The terms (−1)k+j det(Akj) = Ckj are called the cofactors of the matrix A.

This second formula, known as the Laplace formula, is equivalent to the Leibniz formula, and the latter can be obtained from the former by rearranging the terms of the sum and using some properties of permutations. However, the Laplace formula is usually preferred for practical use since it is easier to apply.
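A sketch of the recursive Laplace expansion along the first row (illustrative matrix; again exponential in cost, so only useful for small matrices):

```python
import numpy as np

def laplace_det(A):
    """Cofactor expansion along the first row, with base case det(a) = a."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # A_{1j}: delete the first row and the j-th column.
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * laplace_det(minor)
    return total

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 10.]])
assert np.isclose(laplace_det(A), np.linalg.det(A))
```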

When actually calculating the determinant of a matrix it is useful to try to exploit the basic properties of multilinear alternating functions to check whether the determinant is zero and to ease the calculation: first of all we can check if two rows or columns are equal, or if a row or column is clearly given by a linear combination of the others, in which case the determinant is zero; then we can use the property that substituting a column (or a row) Ai of the matrix with Ai + xAj doesn’t change the value of the determinant in order to reduce the matrix to triangular form. This way, applying the Laplace formula to the first row or column, we obtain that the determinant is simply given by the product of the diagonal elements.

2.4.2 Inverse matrix

Here we will present two simple methods to find the inverse of an invertible matrix A:

Gauß-Jordan elimination

Given an n × n matrix A, let’s augment it with the n × n identity matrix, that is let’s juxtapose the identity matrix on the right of our matrix A, obtaining an n × 2n matrix:

[A | 1] = [ a11 a12 . . . a1n   1 0 . . . 0 ]
          [ a21 a22 . . . a2n   0 1 . . . 0 ]
          [  .   .  . . .  .    .  . . . .  ]
          [ an1 an2 . . . ann   0 0 . . . 1 ] .


Then we perform elementary row operations, that is we add to some rows linear combinations of the other rows, in order to transform the first n columns into an identity matrix, obtaining what is called a reduced row-echelon form:

[1 | A−1] = [ 1 0 . . . 0   b11 b12 . . . b1n ]
            [ 0 1 . . . 0   b21 b22 . . . b2n ]
            [ .  . . . . .   .   .  . . .  .  ]
            [ 0 0 . . . 1   bn1 bn2 . . . bnn ] .

We can now split the augmented matrix, obtaining

A−1 = [ b11 b12 . . . b1n ]
      [ b21 b22 . . . b2n ]
      [  .   .  . . .  .  ]
      [ bn1 bn2 . . . bnn ] .

Cofactor equation

The matrix of cofactors of a matrix A is the matrix

C = [ C11 C12 . . . C1n ]
    [ C21 C22 . . . C2n ]
    [  .   .  . . .  .  ]
    [ Cn1 Cn2 . . . Cnn ]

whose entries are the cofactors of the matrix A. The inverse matrix A−1 is then given by the formula

A−1 = (1/det(A)) Ct.

In the case of a 2 × 2 matrix

A = [ a b ]
    [ c d ]

the cofactor equation yields

A−1 = (1/(ad − bc)) [  d −b ]
                    [ −c  a ] .

2.4.3 Eigenvalues and eigenvectors

A scalar λ is an eigenvalue of a matrix A (or equivalently of its associated linear transformation a) if there exists a non-zero vector x such that Ax = λx. If λ is an eigenvalue of A then every vector satisfying the equation Ax = λx is an eigenvector with eigenvalue λ. The zero vector is, by definition, an eigenvector of every eigenvalue, but it cannot be used to determine eigenvalues: it is excluded because the relation A0 = λ0 holds for every possible λ. With the exception of the zero vector, eigenvectors of different eigenvalues are linearly independent.

Let λ be an eigenvalue of A, and let Eλ be the set of all the eigenvectors of A with eigenvalue λ. This set is a vector subspace of the whole vector space V , and is called the eigenspace of λ, while its dimension dim(Eλ) = mλ is the geometric multiplicity of the eigenvalue λ. The set of all the eigenvalues of a matrix A is called the spectrum of A.

We can rewrite the eigenvalue–eigenvector relation as (A − λ1)x = 0. This relation highlights the fact that the eigenspace of λ is the null space of A − λ1, and thus that λ is an eigenvalue if and only if A − λ1 is not invertible, i.e. if det(A − λ1) = 0. Treating λ as an unknown variable, the explicit calculation of the determinant det(A − λ1) yields, up to a factor (−1)n, a polynomial of the form λn + cn−1λn−1 + . . . + c1λ + c0, called the characteristic polynomial of the matrix A. This polynomial has some interesting properties: its constant term c0 is equal to (−1)n times the determinant of the matrix A, the term cn−1 is equal to the opposite of the trace, −tr(A), and the roots of this polynomial are the eigenvalues of the matrix A. Since every real polynomial of degree n has at most n real roots, an n-dimensional square matrix has no more than n eigenvalues, but sometimes none at all. It can happen that k0 roots of the characteristic polynomial coincide, i.e. the characteristic polynomial can be factorized with a factor of the form (λ − λ0)k0 . In this case k0 is called the algebraic multiplicity of the eigenvalue λ0, and it can be shown that 1 ≤ m0 ≤ k0, i.e. the geometric multiplicity can be smaller than the algebraic multiplicity.

For example the n-dimensional zero matrix has eigenvalue 0 with multiplicity n, the n-dimensional unit matrix has eigenvalue 1 with multiplicity n, and matrices representing rotations in three dimensions have eigenvalue 1, with unit multiplicity, corresponding to the rotation axis. Another example is given by simple shear in the plane (see Fig. 2.1), whose associated matrix is

[ 1 k ]
[ 0 1 ]

and its characteristic polynomial is λ2 − 2λ + 1 = (1 − λ)2. The eigenvalue 1 has thus algebraic multiplicity 2, but there exists only one linearly independent eigenvector, namely (1, 0), so it has geometric multiplicity 1.

Figure 2.1: Geometric visualization of horizontal simple shear. The shear angle φ is given by k = cot φ.

To find the eigenvectors of a matrix A we have in the first place to find its eigenvalues, that is the roots λi of the characteristic polynomial given by det(A − λ1), and then we need to find all the linearly independent vectors xi that solve the equation Axi = λixi.

2.4.4 Diagonalization

If the sum of the geometric multiplicities mi of the eigenvalues λi of a matrix A is equal to its dimension, then we can assemble a basis E made up of eigenvectors. Due to the peculiar nature of these basis vectors, the matrix A written with respect to this new basis will be in diagonal form:

[ λ1  0 . . .  0 ]
[  0 λ2 . . .  0 ]
[  .  . . . .  . ]
[  0  0 . . . λn ] ;

for this reason the matrix A and its associated linear function a are called diagonalizable. Vice versa, if a linear function a can be written as a diagonal matrix then it is possible to find a basis made up of eigenvectors of a. The process of finding a corresponding diagonal matrix for a diagonalizable matrix or linear function is called diagonalization.

It can be shown that symmetric matrices, that is matrices for which aij = aji, can always be diagonalized, and moreover that their eigenvectors form an orthogonal basis. Luckily, the stress and strain tensors are almost always symmetric.
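A small numpy sketch of the diagonalization of a symmetric matrix (illustrative values), checking that the eigenvector basis is orthonormal and that A, rewritten in that basis, is diagonal.

```python
import numpy as np

# An illustrative symmetric matrix.
A = np.array([[2., 1.],
              [1., 2.]])

eigvals, E = np.linalg.eigh(A)            # eigh is specialized to symmetric matrices
D = np.diag(eigvals)                      # A in the eigenvector basis

assert np.allclose(E @ D @ E.T, A)        # change of basis back to the original
assert np.allclose(E.T @ E, np.eye(2))    # the eigenvectors are orthonormal
```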


2.5 Geometry

Some geometrical concepts and entities like distance, length and orthogonality, which we usually consider as self-evident and “given” in our physical world and in Euclidean geometry, can actually be generalized within linear algebra:

2.5.1 Metric

Let E be an arbitrary set and d : E × E → R a function with the following properties:

1. Non-negativity: d(x, y) ≥ 0 ∀x, y ∈ E.

2. Identity of indiscernibles: d(x, y) = 0⇔ x = y.

3. Symmetry: d(x, y) = d(y, x) ∀x, y ∈ E.

4. Triangle inequality: d(x, y) ≤ d(x, z) + d(z, y) ∀x, y, z ∈ E.

Then d is called a metric.

A metric is a generalization of the concept of distance, and there are many possible choices for it, even for Rn. For example we have:

• Euclidean metric: d(x,y) = √(∑ni=1 (xi − yi)2).

• Discrete metric: d(x,y) = 0 if x = y, otherwise it is 1.

• ℓ1 or taxicab metric: d(x,y) = ∑ni=1 |xi − yi|.

• ℓ∞ or Chebyshev metric: d(x,y) = maxi |xi − yi|.

• More generally, the ℓm metric is given by d(x,y) = (∑ni=1 |xi − yi|m)1/m. The Euclidean metric is thus equivalent to the ℓ2 metric.
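The metrics listed above can be evaluated directly; a small numpy sketch with two illustrative points of R3:

```python
import numpy as np

x = np.array([1., 2., 3.])
y = np.array([4., 0., 3.])

euclidean = np.sqrt(np.sum((x - y) ** 2))     # l2 metric
taxicab = np.sum(np.abs(x - y))               # l1 metric
chebyshev = np.max(np.abs(x - y))             # l-infinity metric
discrete = 0 if np.array_equal(x, y) else 1   # discrete metric

assert np.isclose(euclidean, np.sqrt(13))     # sqrt(9 + 4 + 0)
assert taxicab == 5 and chebyshev == 3 and discrete == 1
```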

2.5.2 Norm

Let E be a vector space over R and ||·|| : E → R a function with the following properties:

1. ||x|| ≥ 0 ∀x ∈ E, ||x|| = 0⇔ x = 0.

2. ||λx|| = |λ| · ||x|| ∀λ ∈ R, ∀x ∈ E.

3. ||x + y|| ≤ ||x|| + ||y|| ∀x,y ∈ E.


Then ||·|| is called a norm.

A norm is a generalization of the concept of the length of a vector, and also here there are a number of possible choices. Moreover, once a norm ||x|| is fixed, it naturally induces a metric for the vector space as d(x,y) = ||x − y||. All the examples of metrics that we have given before, with the exception of the discrete metric, are in fact induced by norms.

2.6 Exercises

1. Prove that the four vectors

w = (1, 0, 0), x = (0, 1, 0), y = (0, 0, 1), z = (1, 1, 1)

in R3 are linearly dependent, but any three of them are linearly independent.

2. Prove that the four vectors w, x, y, z in P3 defined by w(t) = 1, x(t) = 1 + t, y(t) = t + t2 and z(t) = 15 − 23t + √2 t2 are linearly dependent, but any three of them are linearly independent.

3. Given three linearly independent vectors x, y and z, is it true that the vectors x + y, x + z and y + z are also linearly independent?

4. Under what conditions on the scalar η are the vectors (1 + η, 1 − η) and (1 − η, 1 + η) in R2 linearly dependent?

5. Is it possible to have a set of three linearly independent vectors in R2?

6. Under what conditions on the scalar η do the vectors (1, 1, 1) and (1, η, η2) form a basis of R3?

7. Prove that the four matrices

A = [ 1 0 ]   B = [ 0 1 ]   C = [ 0 0 ]   D = [ 0 0 ]
    [ 0 0 ] ,     [ 0 0 ] ,     [ 1 0 ] ,     [ 0 1 ]

form a basis for MR2×2.

8. Consider the following functions defined on vectors x = (η1, η2, η3) ∈ R3:

w(x) = η1 + η2,
x(x) = η1 − η23,
y(x) = η1 + 1,
z(x) = η1 − 2η2 + 3η3.

Which of these are linear functionals?

9. If f is a non-zero linear functional on a vector space V , and if a is an arbitrary scalar, does there necessarily exist a vector x ∈ V such that 〈f ,x〉 = a?

10. Let a be the function on R3 that, in a certain basis, transforms the vector (x1, x2, x3) into the vector (x2, x3, x1). Prove that a is a linear function and find its corresponding matrix.

11. For which of the following polynomials p, regarded as vectors of the three-dimensional vector space P3 and expressed using the basis B = {1, t, t²}, and matrices A is it true that Ap = 0?

a) p = 1 − t + t²,   A = (1 1 0; 0 1 1; 0 0 1);

b) p = 1 + t − t²,   A = (1 2 3; 2 4 6; 3 6 9);

c) p = 5 + t + 2t²,   A = (1 −3 −1; −2 0 5; 0 −2 1);

d) p = 2 + t + 2t²,   A = (1 1 −1; 3 6 1; 8 −2 7).

12. What happens to the matrix of a linear transformation on a finite-dimensional vector space when the elements of the basis with respect to which the matrix is computed are permuted among themselves (e.g. the first basis vector is switched with the second)?

13. Given the three vectors

x = (1/√7) (√3, 1, √3)^T_N,   y = (1/(2√7)) (4, −√3, −3)^T_N,   z = (1/2) (0, √3, −1)^T_N,

with components given in the natural basis N, and a linear function a whose action on these three vectors is given by the following relations

a(x) = x,   a(y) = (1/(4√7)) (1, 5√3, −6)^T_N,   a(z) = (1/4) (−3, √3, 2)^T_N,


a) check that the vectors x, y and z constitute an orthonormal basis B for R3;

b) write the matrix associated to the linear function a in the natural basis N given by the vectors (1, 0, 0), (0, 1, 0), (0, 0, 1);

c) write the matrix associated to the linear function a in the basis B.

14. Which of the following sets of vectors of R3 are linearly independent?

(a) (0, 1, −1)^T, (2, 0, 1)^T;
(b) (1, 2, 3)^T, (0, 4, 5)^T, (6, 7, 8)^T;
(c) (1, 1, 3)^T, (−1, 2, 1)^T, (0, 6, 8)^T.

15. Prove that the following subset of MR2×2

V = { (a b; c d) ∈ MR2×2 : a + b + c = 0 }

is a vector space over R and find a basis for it.

16. Prove that

A = {(2, 1)^T, (−1, −1)^T},   B = {(−1, −3)^T, (2, 3)^T}   and   C = {(−1, 1)^T, (1, 1)^T}

are bases of R2. Find the coordinates of the vectors

w = (1, 3)^T,   x = (2, −1)^T,   y = (7, −7)^T   and   z = (1, 1)^T

with respect to these bases.

17. Given the matrix

A = (1 1 2; 4 6 10; −3 −4 −7),

compute A³ = AAA.


18. Find for which values of k the matrix

A = (1 0 0 0; 0 1 k −1; 1 1 −1 1; 0 2 0 −1)

19. Given the vectors

v1 = (1, 2, −1)^T,   v2 = (0, 1, 2)^T,   v3 = (1, 2, 0)^T,
w1 = (−1, −2, 1)^T,   w2 = (0, −2, 1)^T,   w3 = (0, −1, 1)^T,

prove that the sets B = {v1, v2, v3} and C = {w1, w2, w3} are bases of R3 and find the change of basis matrix from the basis B to the basis C.

20. Given the three vectors

e1 = (1, 0, 0)^T,   e2 = (0, 1, 0)^T   and   e3 = (0, 0, 1)^T,

prove that the four sets

B1 = {e1 + e3, e1 − 2e2 + e3, e2 − e3},
B2 = {2e1 − 3e2 + 3e3, −2e2, −e1 + 3e2 − 2e3},
B3 = {e1 − e2 + 2e3, −2e2, 3e2 − e3},
B4 = {e1 − e2 + 2e3, −e2 − e3, −e1 + 3e2 − 2e3},

are bases of R3 and find all the matrices of change of basis between these four bases.

21. Given the linear functions

f1(x, y) = (x + y, x + 2y)^T,   f2(x, y) = (x + y, 2y)^T,   f3(x, y) = ((2/3)x − 5y, −(1/9)x + (7/3)y)^T,

find the corresponding matrices with respect to the natural basis.


22. Given P4 = {∑_{i=1}^{4} αi x^{i−1}, αi ∈ R} and the two sets A = {1, x, x², x³} and B = {x³ + x² + x + 1, x³ + x² + x, x³ + x², x³}, check that they form two bases for P4 and find the two matrices associated with the change of basis between A and B. Moreover, find the coordinates of the two vectors p = 3x² + 2x + 5 and q = x³ − 2x² + 4x − 3 with respect to each basis.

Chapter 3

Derivative

In mathematics and physics a very useful concept is the rate at which a dependent variable f changes with respect to the change in the independent variable x or, from a geometrical point of view, the slope of the graph of the function f(x). Let f : (a, b) ⊂ R → R be a continuous and sufficiently smooth function. Newton's difference quotient m of the function f at a point x is defined as

m = [f(x + h) − f(x)] / h = Δf / Δx.

Newton's difference quotient gives the average increase of the function f over a distance h from the point x and, if the function is indeed sufficiently smooth and well behaved, as the one in Figure 3.1, it is close to the instantaneous increase of the function at the point x, i.e. the secant line is a good approximation of the tangent line.

The difference quotient approach can fail for two reasons: one is that the function, and the corresponding graph, is highly oscillatory with respect to the independent increase h we have used to compute it, and the other is that the graph, at the point (x, f(x)), is edgy, angular, or even broken. In the first case we can solve the problem by taking smaller and smaller increments, that is by taking the limit of the difference quotient as the independent increase goes to zero, while for the second case there is no solution: there the very concept of instantaneous infinitesimal increase is not applicable.


Figure 3.1: Newton's difference quotient is closely linked to the line intersecting the points (x, f(x)) and (x + h, f(x + h)).

3.1 General definition

Let f : (a, b) ⊂ R → R be a continuous function, f ∈ C0. If at a point c ∈ (a, b) the limit

f′(c) = lim_{h→0} [f(c + h) − f(c)] / h

exists and is finite, then f is differentiable at c and we call f′(c) the (first) derivative of f at c. Equivalently, the derivative can be defined as the quantity f′(c) such that

lim_{h→0} [f(c + h) − f(c) − f′(c)h] / h = 0. (3.1)

This different approach underlines one of the basic properties of a differentiable function (and of its derivative): the variation of the dependent variable (i.e. the quantity f(c + h) − f(c)) with respect to an increment h of the independent variable can be approximated linearly with the increment, and the proportionality constant is exactly the derivative f′(c).
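The convergence of the difference quotient to the derivative can be sketched numerically (a hedged illustration with the made-up example f(x) = x², whose exact derivative at x = 1 is 2; the quotient equals 2 + h, so it approaches the tangent slope as h shrinks):

```python
def f(x):
    return x * x

def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x+h, f(x+h))."""
    return (f(x + h) - f(x)) / h

# For f(x) = x^2 at x = 1 the quotient is exactly 2 + h.
for h in (0.1, 0.01, 0.001):
    assert abs(difference_quotient(f, 1.0, h) - (2.0 + h)) < 1e-9

# A tiny h gives the derivative to high accuracy.
assert abs(difference_quotient(f, 1.0, 1e-6) - 2.0) < 1e-4
```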


3.1.1 Higher order derivatives

If f is differentiable at every point of an open subset (d, e) of its domain, we can regard its derivative as a function f′ : (d, e) ⊆ (a, b) → R, and try to build the derivative of the derivative function:

f″(c) = lim_{h→0} [f′(c + h) − f′(c)] / h.

If it exists, we’ll call f ′′(c) the second derivative of f at c, and so on.

A function f is said to be continuously differentiable if the derivative f′ exists, and is itself a continuous function. The set of all continuously differentiable functions is denoted by C1. More in general, a function f is said to be of class Cm if its first m derivatives f′, f″, . . . , f^(m) all exist and are continuous.

3.2 Various notations

We recall here the main notations used in the literature to indicate differentiation:

• Lagrange's notation is the one we have used up to now: f′. After the third derivative, the prime marks are substituted by a number in parentheses, like f^(4), to discriminate it from exponentiation.

• Leibniz's notation comes from the idea of the derivative as a quotient of infinitesimal increments: df/dx, dⁿf/dxⁿ.

• Newton's notation, used only for derivatives of physical quantities with respect to time, substitutes the prime mark of Lagrange's notation with a dot over the function name: ḟ, f̈.

3.3 Basic properties

• Linearity: given two differentiable functions f and g and two numbers a, b ∈ R, then (af + bg)′ = af′ + bg′. This also implies that the sets Cm can be regarded as vector spaces.

• Product rule: given two differentiable functions f and g, then theirproduct fg is differentiable and (fg)′ = f ′g + fg′.


• Chain rule: let f(x) = h(g(x)) and let h and g be differentiable on the appropriate domains; then f is differentiable and f′(x) = h′(g(x)) g′(x). This rule can be easily remembered using Leibniz's notation:

df/dx = (dh/dg) (dg/dx).

3.4 Derivative of a vector valued function

Given a function f : (a, b) ⊂ R → Rn, i.e. f(x) = (f1(x), f2(x), . . . , fn(x))^T, it is differentiable if the limit

f′(x) = lim_{h→0} [f(x + h) − f(x)] / h

exists and is finite, i.e. if every coordinate function fi is differentiable.

As an example, let x(t) be the position of a point particle at every time t. Then x′(t) = ẋ(t) = v(t) is the velocity of that point particle and |v(t)| is its speed.

3.5 Functions of many variables

3.5.1 Partial derivative

Let f = f(x1, x2, . . . , xn) be a real valued function of n variables. The partial derivative of f with respect to xi at a given point (c1, . . . , cn) is given by the usual limit

∂f/∂xi (c1, . . . , cn) = lim_{h→0} [f(c1, . . . , ci−1, ci + h, ci+1, . . . , cn) − f(c1, . . . , cn)] / h,

where all other variables but xi are held fixed.

For the sake of brevity, and to keep equations more readable, the partial derivative symbol is frequently contracted to ∂xi.

3.5.2 Gradient

Generalizing the concept of partial derivative, we can regard the collection of independent variables x = (x1, . . . , xn) as a vector of Rn, and try to find a linear approximation for an arbitrary increment h at a point a. If there exists a vector L ∈ Rn such that

lim_{h→0} [f(a + h) − f(a) − L · h] / |h| = 0,


we say that f is differentiable at a, and we call L the gradient of f at a, in symbols ∇f(a) or grad f(a).

It can be shown that if a function f : Rn → R is differentiable at a point a, then all its partial derivatives are well defined at a and the gradient of f is given by the vector made up of the partial derivatives:

∇f(a) = ( ∂f/∂x1 (a), ∂f/∂x2 (a), . . . , ∂f/∂xn (a) )^T.

Conversely, if all the partial derivatives of f exist in a neighborhood of a point a and are continuous at a, then f is also differentiable at a.

Given a versor v, we can easily calculate the directional derivative of f in the direction specified by v as ∇f(a) · v. Notice that if v is a basis vector ei of Rn we correctly obtain that the directional derivative of f in the coordinate direction of xi is the partial derivative ∂xi f.
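A gradient built from partial derivatives, and a directional derivative ∇f(a)·v, can be sketched with central differences (the function f(x, y) = x² + 3y below is an assumed example, with exact gradient (2x, 3)):

```python
import math

def f(p):
    x, y = p
    return x * x + 3 * y

def grad_numeric(f, p, eps=1e-6):
    """Gradient as the vector of partial derivatives (central differences)."""
    g = []
    for i in range(len(p)):
        pp = list(p); pm = list(p)
        pp[i] += eps; pm[i] -= eps
        g.append((f(pp) - f(pm)) / (2 * eps))
    return g

a = (1.0, 2.0)
g = grad_numeric(f, a)
assert abs(g[0] - 2.0) < 1e-4 and abs(g[1] - 3.0) < 1e-4   # exact gradient (2x, 3)

v = (1 / math.sqrt(2), 1 / math.sqrt(2))                   # a versor (unit vector)
directional = sum(gi * vi for gi, vi in zip(g, v))         # grad f(a) . v
assert abs(directional - 5.0 / math.sqrt(2)) < 1e-4
```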

As an example, let T(x) be the temperature field of a room, c a point of the surface Σ of the room (i.e. its walls) and n(c) the versor normal to the surface at c pointing out of the room. Given Fourier's law for the heat flux:

q = −k∇T,

if we want to calculate the heat flux Q conducted through the wall of the room at the point c, we need to calculate the variation of temperature across the room wall, that is the derivative of the temperature at the wall, in the direction perpendicular to the wall:

Q = −k∇T(c) · n(c).

3.5.3 Derivatives of higher order

In perfect analogy with the case of functions of a single variable (see Section 3.1.1), it is possible to define higher order derivatives of functions of many variables: given a differentiable function of many variables f(x) : Rn → R, if the partial derivative with respect to xi of its partial derivative ∂xk f(x) with respect to xk exists at a point a, we write

∂xi ∂xk f(a) = ∂²f/∂xi∂xk (a) = ∂/∂xi ( ∂f/∂xk ) (a)   if i ≠ k,

∂²xi f(a) = ∂²f/∂xi² (a) = ∂/∂xi ( ∂f/∂xi ) (a)   if i = k.

If the second derivative exists at each point of an open set Ω ⊆ Rn, we can iterate the procedure and define third derivatives, and so on.


Higher order derivatives with respect to more than one distinct variable are called mixed. Mixed derivatives with respect to the same variables, but in different order, are in general different:

∂²f/∂xi∂xk ≠ ∂²f/∂xk∂xi;

but, for regular enough functions, we have the following theorem:

Theorem 3.1 (Schwarz' theorem). Let f : Ω ⊆ Rn → R, with Ω an open set and a ∈ Ω. If

∂²f/∂xi∂xk (x)   and   ∂²f/∂xk∂xi (x)

exist in a neighborhood of a and are continuous at a, then

∂²f/∂xi∂xk (a) = ∂²f/∂xk∂xi (a).

In particular, mixed derivatives are equal if f ∈ C2.

Proof. It suffices to apply Stokes’ theorem to the gradient of f .

3.6 Vector valued functions of many variables

3.6.1 Jacobian matrix

Let f : Rn → Rm. The function f is differentiable at a point a ∈ Rn if every coordinate function fj is differentiable:

lim_{h→0} [fj(a + h) − fj(a) − ∇fj(a) · h] / |h| = 0   ∀j = 1, . . . , m.

Using the Jacobian matrix Jf(a), that is the matrix whose rows are the gradients of the coordinate functions of f

Jf = (∇f1, ∇f2, . . . , ∇fm)^T =

( ∂f1/∂x1   ∂f1/∂x2   . . .   ∂f1/∂xn )
( ∂f2/∂x1   ∂f2/∂x2   . . .   ∂f2/∂xn )
(   . . .      . . .    . . .    . . .  )
( ∂fm/∂x1   ∂fm/∂x2   . . .   ∂fm/∂xn ),

we can rewrite the previous set of equations as

lim_{h→0} |f(a + h) − f(a) − Jf(a) · h| / |h| = 0.
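The row-of-gradients structure of the Jacobian can be sketched numerically (a hedged example; the map f(x, y) = (xy, x + y²) is made up for illustration, with analytic Jacobian ((y, x); (1, 2y))):

```python
def f(p):
    x, y = p
    return [x * y, x + y * y]

def jacobian_numeric(f, p, eps=1e-6):
    """Row j holds the gradient of the coordinate function f_j (central differences)."""
    n = len(p); m = len(f(p))
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        pp = list(p); pm = list(p)
        pp[i] += eps; pm[i] -= eps
        fp, fm = f(pp), f(pm)
        for j in range(m):
            J[j][i] = (fp[j] - fm[j]) / (2 * eps)
    return J

J = jacobian_numeric(f, [2.0, 3.0])
# Analytic Jacobian at (2, 3): ((y, x); (1, 2y)) = ((3, 2); (1, 6))
expected = [[3.0, 2.0], [1.0, 6.0]]
for j in range(2):
    for i in range(2):
        assert abs(J[j][i] - expected[j][i]) < 1e-3
```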


3.6.2 Composition of functions

Let f : Rn → Rm and g : Rm → Rp. If f is differentiable at a and g is differentiable at b = f(a) then the composite function h = g(f) is differentiable at a and we have that

Jh(a) = Jg(b) · Jf (a).

If n = m = p = 1 this reduces to the usual chain rule, that is h′(a) = g′(f(a)) f′(a).

3.6.3 Coordinate transformations

Coordinate transformations usually employed in physics are characterized by a higher regularity than simple bijective functions, mainly for topological reasons. In particular, coordinate transformations are usually required to be at least diffeomorphisms.

Let Φ : A ⊂ Rn → B ⊂ Rn be a bijective function. We call Φ a diffeomorphism if both Φ and its inverse Φ−1 are continuously differentiable. Obviously, if Φ is a diffeomorphism, its inverse Φ−1 is also a diffeomorphism.

Since the composite function of a diffeomorphism with its inverse φ = Φ−1(Φ) is the identity function on A, we obtain that

JΦ−1(Φ(a)) · JΦ(a) = I ∀a ∈ A.

This in particular means that JΦ(a) is nonsingular for all a ∈ A, that is

det JΦ(a) ≠ 0.

The determinant of the Jacobian matrix is called Jacobian determinant orsimply Jacobian.

It is possible to demonstrate that the converse is also true: given a function f : A ⊂ Rn → B ⊂ Rn, if:

1. f is a bijective function from A to B;

2. f is continuously differentiable on A;

3. det Jf(a) ≠ 0 ∀a ∈ A;

then f is a diffeomorphism.


Figure 3.2: Visualization of the polar coordinate system.

Polar coordinates

The polar coordinate system identifies (see also Figure 3.2) every point in the plane R2, excluding the origin, by its distance from the origin r and its (counterclockwise) angular distance from the positive x-axis θ. The function

Φ(r, θ) = (r cos θ, r sin θ)^T

is a bijection between A = (0, +∞) × (0, 2π), i.e. an infinitely long “rectangle” without borders, and B = R2 − {(x, y) ∈ R2 : y = 0, x ≥ 0}, i.e. the cartesian plane minus the nonnegative x-axis, with its inverse given by the relations

r = √(x² + y²),   sin θ = y / √(x² + y²),   cos θ = x / √(x² + y²),

and both Φ and its inverse are clearly continuously differentiable.¹ Since it is also true that

det JΦ(r, θ) = det (cos θ  −r sin θ; sin θ  r cos θ) = r(cos²θ + sin²θ) = r,

we know that Φ is a diffeomorphism from A to B.
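The identity det JΦ(r, θ) = r can be checked directly (a minimal sketch; the sample (r, θ) values are arbitrary):

```python
import math

def polar_jacobian_det(r, theta):
    """Determinant of J_Phi = ((cos t, -r sin t); (sin t, r cos t))."""
    a, b = math.cos(theta), -r * math.sin(theta)
    c, d = math.sin(theta),  r * math.cos(theta)
    return a * d - b * c     # = r (cos^2 + sin^2) = r

for r, t in [(1.0, 0.3), (2.5, 2.0), (0.1, 5.9)]:
    assert abs(polar_jacobian_det(r, t) - r) < 1e-12
```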

Spherical coordinates

The spherical coordinate system identifies (see also Figure 3.3) every point in the space R3, excluding the z-axis, by its distance from the origin ρ, its

¹Let's notice here that a bijection can actually be defined between (0, +∞) × (0, 2π] and R2 − {(0, 0)}. However this bijection is not differentiable everywhere, since the plane in polar coordinates is not described by an open set.


Figure 3.3: Visualization of the spherical coordinate system.

longitude θ from the positive x-axis and its colatitude φ from the positive z-axis. The function

Φ(ρ, θ, φ) = (ρ cos θ sin φ, ρ sin θ sin φ, ρ cos φ)^T

is a bijection between A = (0, +∞) × (0, 2π) × (0, π) (i.e. an infinitely long box without boundaries) and B = R3 − {(x, y, z) ∈ R3 : y = 0, x ≥ 0} (i.e. R3 minus a half-plane). Also in this case, since

det JΦ(ρ, θ, φ) = det (cos θ sin φ  −ρ sin θ sin φ  ρ cos θ cos φ; sin θ sin φ  ρ cos θ sin φ  ρ sin θ cos φ; cos φ  0  −ρ sin φ) = −ρ² sin φ,

Φ is a diffeomorphism from A to B.

3.6.4 Vector fields

Let's take a vector valued function of many variables, for instance f : Rn → Rn. It associates to each point of the starting space a vector of the arrival space. The function itself, or equivalently the set of all pairs (x, f(x)), is called a vector field. In physics the concept of vector field is used, for example, to describe the velocity of a moving fluid throughout space, or the strength and direction of some force, such as the magnetic or gravitational force, as it changes from point to point.

44 CHAPTER 3. DERIVATIVE

3.7 Differential operators

Over the space of differentiable functions we can define some useful differential operators that play a major role in fluid dynamics and, more in general, in physics.

Gradient

We have already defined the concept of gradient in Section 3.5.2. Here we can look at it, from another point of view, as the operator that associates to a function of many variables a vector field which points in the direction of the greatest rate of increase of the function, and whose magnitude is the greatest rate of increase.

Divergence

Given a differentiable function f : R3 → R3 its divergence in cartesian coordinates is given by

div f = ∇ · f = ∑_{i=1}^{3} ∂fi/∂xi.

The divergence operator is associated to the concept of source and sink of a vector field, as we will see in section 5.2.2.

Curl

Given a differentiable function f : R3 → R3 its curl (or rotor) in cartesian coordinates is given by

curl f = rot f = ∇ × f = ( ∂f3/∂x2 − ∂f2/∂x3,  ∂f1/∂x3 − ∂f3/∂x1,  ∂f2/∂x1 − ∂f1/∂x2 )^T.

The curl of a vector field can be calculated as the determinant of the following symbolic matrix

( x    y    z  )
( ∂x1  ∂x2  ∂x3 )
( f1   f2   f3 ).

The curl operator is associated to the concept of infinitesimal rotation of a 3-dimensional vector field, as we will see in section 5.2.3.
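Both operators can be checked against finite differences (a hedged sketch; the field F(x, y, z) = (x², yz, xz) is a made-up example, with div F = 3x + z and curl F = (−y, −z, 0)):

```python
def F(p):
    x, y, z = p
    return [x * x, y * z, x * z]

def partial(j, i, p, eps=1e-6):
    """Central-difference approximation of dF_j / dx_i at p."""
    pp = list(p); pm = list(p)
    pp[i] += eps; pm[i] -= eps
    return (F(pp)[j] - F(pm)[j]) / (2 * eps)

p = [1.0, 2.0, 3.0]
div = sum(partial(i, i, p) for i in range(3))
curl = [partial(2, 1, p) - partial(1, 2, p),
        partial(0, 2, p) - partial(2, 0, p),
        partial(1, 0, p) - partial(0, 1, p)]

# Analytic values at (1, 2, 3): div F = 2x + z + x = 6, curl F = (-y, -z, 0) = (-2, -3, 0)
assert abs(div - 6.0) < 1e-3
for got, want in zip(curl, [-2.0, -3.0, 0.0]):
    assert abs(got - want) < 1e-3
```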


Figure 3.4: Graphic visualization of f(x, y) = xy(x² − y²) / (x² + y²).

Laplacian

Given a function f : R3 → R of class C2 its Laplacian in cartesian coordinates is given by

Δf = ∇ · ∇f = ∇²f = ∑_{i=1}^{3} ∂²f/∂xi².

3.8 Exercises

1. Consider the following functions:

f(x, y) = x log(1 + y),

g(x, y) = x + (2xy − 1) / (y + 3),

h(x, y) = xy² / (x² + y²) if (x, y) ≠ (0, 0),   h(x, y) = 0 if (x, y) = (0, 0).

Are they differentiable in (0, 0)?

2. Let g = g(u) : R2 → R be a differentiable function and f(v) = (3v1 − 2v2, −v1 + 5v2) a (linear) coordinate transformation. Calculate the partial derivatives of the composite function ∂v1 g(f(v)) and ∂v2 g(f(v)).


3. Given the function (see Fig. 3.4)

f(x, y) = xy(x² − y²) / (x² + y²) if (x, y) ≠ (0, 0),   f(x, y) = 0 if (x, y) = (0, 0),

calculate ∂x∂y f(0, 0) and ∂y∂x f(0, 0) both using the common rules of derivation and by computing the directional derivatives ∂/∂x ∂y f(x, 0) and ∂/∂y ∂x f(0, y). Comment on the result.

Chapter 4

Riemann integral

4.1 General definition

Given a closed and bounded interval [a, b] ⊂ R, a partition of [a, b] is a finite set P = {x0, x1, x2, . . . , xn} of (n + 1) points such that

a = x0 < x1 < x2 < . . . < xn = b.

The points of the partition split [a, b] into n subintervals.

Let f : [a, b] ⊂ R → R be a bounded function on [a, b]. It is in particular bounded on every subinterval [xi−1, xi]. Let us then consider the lower and upper sums of f with respect to the partition P:

s(P, f) = ∑_{i=1}^{n} mi (xi − xi−1),   S(P, f) = ∑_{i=1}^{n} Mi (xi − xi−1),

where

mi = inf_{x∈(xi−1, xi)} f(x),   Mi = sup_{x∈(xi−1, xi)} f(x).

If the function f is positive valued (as the one pictured in Figure 4.1), we can give a straightforward geometrical interpretation of the lower and upper sums, as they are approximating the area enclosed between the graph of the function and the x-axis with two sums of rectangles. Varying the partition P we get different estimates for the area and, loosely speaking, if we refine the partition reducing the size of the subintervals we get better and better estimates for the area.

We define f to be an integrable function if

sup_P s(P, f) = inf_P S(P, f),


Figure 4.1: Visualization of a partition of a positive valued function. A) and B) represent the result of the lower and the upper sum respectively, while C) is the result of the integration.

and its integral is therefore defined as

∫_a^b f = ∫_a^b f(x) dx = sup_P s(P, f) = inf_P S(P, f).

This basic concept of integral can be extended, with some caution, to include unbounded functions and open and unbounded subsets of R, taking the limit of sequences of integrals on closed and bounded intervals.

4.2 Basic properties

We recall here some well known properties of integrals, without any proof. For a more complete treatment of this subject the reader should refer to specialized texts.

Theorem 4.1 (Conditions of integrability). Let f : [a, b] ⊂ R→ R.

1. If f is monotonic on [a, b] then it is also integrable.

2. If f is continuous on [a, b] except for at most a finite number of points, then it is also integrable.

Theorem 4.2. Let f and g be integrable functions on [a, b] and λ ∈ R. Then

1. (f + g) is integrable and

∫_a^b (f + g) = ∫_a^b f + ∫_a^b g.


2. λf is integrable on [a, b] and

∫_a^b λf = λ ∫_a^b f.

3. f · g is integrable.

4. |f| is integrable and

| ∫_a^b f | ≤ ∫_a^b |f|.

In particular, due to properties 1 and 2, the integral is a linear functional.

Theorem 4.3 (Basic properties of the integrals). Let f and g be integrable functions on [a, b].

1. If f(x) ≥ 0 ∀x ∈ [a, b], then ∫_a^b f ≥ 0.

2. If f(x) ≤ 0 ∀x ∈ [a, b], then ∫_a^b f ≤ 0.

3. If f(x) ≥ g(x) ∀x ∈ [a, b], then ∫_a^b f ≥ ∫_a^b g.

Theorem 4.4. Let f be an integrable function on [a, b] and c ∈ [a, b]. Then f is also integrable on both [a, c] and [c, b], and

∫_a^b f = ∫_a^c f + ∫_c^b f.

Vice versa, if f is integrable on both [a, c] and [c, b], then it is integrable on [a, b].

Theorem 4.5 (Mean value theorem). If f is an integrable function on [a, b] then there exists a number λ such that

∫_a^b f = λ(b − a)

with

inf_{x∈[a,b]} f(x) ≤ λ ≤ sup_{x∈[a,b]} f(x).

Moreover, if f is continuous over [a, b], then there exists at least one point c ∈ [a, b] such that f(c) = λ.


4.3 Fundamental theorem of calculus

Theorem 4.6 (First part). Let f be an integrable function on [a, b] and x0 a fixed point in [a, b]. Then the function

F_{x0}(x) = ∫_{x0}^{x} f

is continuous on [a, b]. Moreover if f is continuous at a point c ∈ [a, b], then F_{x0} is differentiable in c and F′_{x0}(c) = f(c), that is, F_{x0} is an antiderivative of f. Two different antiderivatives F_{x0}(x) and F_{x1}(x) will differ at most by a constant value:

F_{x0}(x) − F_{x1}(x) = ∫_{x0}^{x} f − ∫_{x1}^{x} f = ∫_{x0}^{x1} f.

Theorem 4.7 (Second part). Let f be an integrable function on [a, b], and let φ be one of its antiderivatives on [a, b]. Then

∫_a^b f = φ(b) − φ(a) = [φ(x)]_a^b.

To calculate the integral of an integrable function it is thus enough to know one of its antiderivatives. The value of the integral is then obtained through a mere subtraction.
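The second part of the theorem can be sketched numerically (an assumed example: f = cos with antiderivative φ = sin on [0, π/2]; the midpoint Riemann sum should match φ(b) − φ(a)):

```python
import math

def riemann(f, a, b, n=100000):
    """Midpoint Riemann sum approximating the integral of f on [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

a, b = 0.0, math.pi / 2
numeric = riemann(math.cos, a, b)
exact = math.sin(b) - math.sin(a)   # phi(b) - phi(a) with phi = sin
assert abs(numeric - exact) < 1e-8
```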

Chapter 5

Integration in Rn

Riemann's theory of integration is not very satisfactory when it has to deal with unbounded intervals of integration, with sequences and series of functions, and with integrals of functions of more than one variable. The common choice is to abandon the Riemann integral in favor of the Lebesgue integral. For a rigorous treatment of the topics of this chapter we would therefore need Lebesgue's theory of integration. Unfortunately it is much more complex than Riemann's, and since it also involves a theory of measure, it is far beyond the scope of these notes. Luckily enough, from a practical point of view, there is no big difference between the Riemann and the Lebesgue integral, so we will skip the theory completely and go directly to the practice.

5.1 Multiple integral

Given a function f : D ⊂ Rn → R, we would like to compute the integral

∫_D f(x) dx, (5.1)

interpreting it, at least for non-negative functions, as the (n+1)-dimensional volume of the region between the surface defined by the function and the n-dimensional plane which contains its domain. If the function is Lebesgue integrable then the expression (5.1) is well defined and its value can be obtained by iterating one-dimensional integrals, regarding the other variables as fixed parameters:

∫_D f(x) dx = ∫_{a1}^{b1} ( . . . ( ∫_{an}^{bn} f(x1, . . . , xn) dxn ) . . . ) dx1,


and the order of integration does not matter. Basically we are first integrating a two-dimensional cross-section of the hypervolume, then we integrate this cross-section in another direction, obtaining a three-dimensional cross-section, and so on.

For example, if we take D = [0, 1] × [−1, 1] and f = x + y², then

∫_D f(x) dx = ∫_{−1}^{1} ( ∫_{0}^{1} (x + y²) dx ) dy = ∫_{−1}^{1} [x²/2 + xy²]_{0}^{1} dy

= ∫_{−1}^{1} (1/2 + y²) dy = [y/2 + y³/3]_{−1}^{1} = 5/3.

5.2 Coordinate transformation and integration by substitution

We have seen in section 3.6.3 some common coordinate systems and the associated transformations to and from Cartesian coordinates. If we apply a coordinate transformation x = Φ(y) to our space, all integrals must change accordingly:

∫_D f(x) dx = ∫_{Φ−1(D)} f(Φ(y)) |det JΦ(y)| dy. (5.2)

If f is a constant function equal to one, then our integral reduces to the computation of the size of the domain of integration D, called the measure of D:

mn(D) = ∫_D dx = ∫_{Φ−1(D)} |det JΦ(y)| dy. (5.3)

If the set D is a line, we obtain its length; if it is a surface, we obtain its area.

5.2.1 Curves and surfaces

A regular curve γ in Rn is the image of an injective function Φ : [a, b] → Rn of class C1 with Φ′(t) ≠ 0 ∀t ∈ [a, b]. A regular curve is thus continuous, smooth and does not intersect itself. A closed regular curve is defined in analogy, but with the requirement that Φ be injective on [a, b), while Φ(a) = Φ(b). The length S of the curve γ is then given by the integral

S = ∫_γ ds = ∫_a^b ||Φ′(τ)|| dτ. (5.4)

From a physical point of view, we can easily interpret [a, b] as the time interval taken to cover the length of the curve, and thus Φ is the equation of motion and Φ′ is its corresponding velocity. From this point of view, equation (5.4) is just the integral of speed in time.

For example, let's take the unit circle and parameterize it as Φ(t) = (cos t, sin t) with t ∈ [0, 2π]. Its length S is given by the integral

S = ∫_0^{2π} ||(− sin τ, cos τ)^T|| dτ = ∫_0^{2π} √(sin²τ + cos²τ) dτ = ∫_0^{2π} dτ = 2π.

Given a function F : γ → Rn defined on a curve γ, the integral

∫_γ F(x) ds = ∫_a^b F(Φ(τ)) ||Φ′(τ)|| dτ,

if it exists, is called the line integral of F along the curve γ.

The two-dimensional analogue of a regular curve is a regular surface, that is the image Σ of a differentiable injective function Φ : J = [a, b] × [c, d] → Rn. In the three-dimensional case, n = 3, the surface area A is given by the integral

A = ∫_Σ dσ = ∫_a^b ∫_c^d ||∂x1Φ × ∂x2Φ|| dx1 dx2.

Let's stress here that the entity ∂x1Φ × ∂x2Φ is a vector normal to the surface. Its magnitude and sense depend on the parameterization, but not its direction.

Let's compute the area of the three-dimensional sphere of radius r: we can parameterize it in spherical coordinates

Φ : [0, π] × [0, 2π] → R3,   Φ(θ, φ) = (r sin θ sin φ, r sin θ cos φ, r cos θ)^T,


obtaining

∂θΦ × ∂φΦ = (r cos θ sin φ, r cos θ cos φ, −r sin θ)^T × (r sin θ cos φ, −r sin θ sin φ, 0)^T = −r² (sin²θ sin φ, sin²θ cos φ, sin θ cos θ)^T.

The area of the sphere is thus

A(r) = ∫_0^π ∫_0^{2π} ||∂θΦ × ∂φΦ|| dφ dθ

= r² ∫_0^π ∫_0^{2π} √( (sin²θ sin φ)² + (sin²θ cos φ)² + (sin θ cos θ)² ) dφ dθ

= r² ∫_0^π ∫_0^{2π} √( sin⁴θ + sin²θ cos²θ ) dφ dθ = r² ∫_0^π ∫_0^{2π} sin θ dφ dθ

= r² ∫_0^π sin θ dθ ∫_0^{2π} dφ = 4πr².

In analogy with the line integral, given a function F : Σ → Rn defined on a surface Σ, the integral

∫_Σ F(σ) dσ = ∫_J F(Φ(x1, x2)) ||∂x1Φ × ∂x2Φ|| dx1 dx2

is called a surface integral.

5.2.2 Gauß’ theorem

Given a volume V ⊂ R3 bounded by a regular surface ∂V and a continuously differentiable vector field F, then

∫_V div F dx = ∫_{∂V} F · n dσ, (5.5)

where n is the versor normal to the surface pointing in the outward direction. Given a parameterization Φ for ∂V, then

n = (∂x1Φ × ∂x2Φ) / ||∂x1Φ × ∂x2Φ||.

Equation (5.5) should clarify the link between the divergence and the flux of a vector field.
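Gauß' theorem can be sketched numerically on a simple domain (a hedged illustration: the field F = (x², y², z²) on the unit cube is an assumed example; div F = 2x + 2y + 2z, and the outward flux F·n is 1 on each of the three faces at coordinate 1 and 0 on the faces at coordinate 0):

```python
def divF(x, y, z):
    return 2 * x + 2 * y + 2 * z   # divergence of F = (x^2, y^2, z^2)

n = 50
h = 1.0 / n
# Volume integral of div F over [0,1]^3 by the midpoint rule
# (exact for a linear integrand, up to floating point rounding).
vol = sum(divF((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
          for i in range(n) for j in range(n) for k in range(n)) * h**3

# Outward flux through the six faces: F.n = 1^2 = 1 on each face at x, y or z = 1,
# and 0 on the faces at 0, so the total flux is 3.
flux = 3.0
assert abs(vol - flux) < 1e-9
```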


Figure 5.1: Now everything should be clear to the attentive reader.

5.2.3 Stokes’ theorem

Given a surface S ⊂ R3 bounded by a regular curve ∂S and a continuously differentiable vector field F, then

∫_S (∇ × F) · n dσ = ∫_{∂S} F · t ds, (5.6)

where t is the tangent versor. The meaning of this theorem is elegantly explained in Figure 5.1.

5.2.4 Green’s identities

• First Green's identity: Given two twice continuously differentiable scalar functions φ and ψ defined on a volume V ⊂ R3 bounded by a regular surface ∂V, applying Gauß' theorem to the vector field φ∇ψ yields

∫_V (φΔψ + ∇φ · ∇ψ) dx = ∫_{∂V} φ∇ψ · n dσ. (5.7)

• Second Green's identity: Under the same hypotheses, subtracting from equation (5.7) the analogous identity for the vector field ψ∇φ yields

∫_V (φΔψ − ψΔφ) dx = ∫_{∂V} (φ∇ψ · n − ψ∇φ · n) dσ. (5.8)


5.3 Exercises

1. Given the set E = {(x, y) ∈ R2 : x2 + y2 ≤ 1}, calculate the integral

∫_E xy / (x² + y²) ds.

2. Given the set E = {(x, y) ∈ R2 : y ≥ 0; x + y ≥ 0; x2 + y2 ≤ 3}, calculate the integral

∫_E y ds.

Hint: the domain of integration has a cumbersome definition in cartesian coordinates, even if its shape is not particularly odd. Can something be done about it?

Chapter 6

Continuum Mechanics

Continuum Mechanics is based on the assumption that the body under study fills continuously the region of space it occupies, without any discontinuity: the matter is spread — possibly unevenly, but without gaps — throughout the space the body occupies. A continuous body could thus be sliced an arbitrary, even infinite number of times, and the resulting small, infinitesimal fraction of matter would show the same properties of the whole body. This will allow us to consider infinitesimally small portions of the body, and treat them with the tools of infinitesimal calculus.

We know that, at the submicroscopic level of atoms and molecules, this assumption of continuity is fundamentally wrong for all materials; nevertheless for some of them, and more generally speaking for some physical systems, the peculiar properties of the underlying discontinuous structure “average out” when considering a portion of the material that is large when compared to the dimensions of the atomic constituents, but is still small compared to the dimensions of the macroscopic body. The minimum volume of material for which the discontinuous structure becomes statistically homogeneous and starts to show the macroscopic properties of the material is called the representative elementary volume. Continuum Mechanics can thus be used to study these materials at lengthscales larger than the representative elementary volume. For example, in a 1 µm³ droplet of water there are N = ρV·N_A/m = 1000 kg m⁻³ · 10⁻¹⁸ m³ · 6 × 10²³ mol⁻¹ / 18 × 10⁻³ kg mol⁻¹ ≈ 3 × 10¹⁰ molecules of water, more than enough to ensure statistical homogeneity. When dealing with the solid earth our “molecules” are the crystallographic defects in the minerals and the boundaries between the grains. A rock droplet can thus have a diameter of some meters.¹
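The molecule count quoted above is a one-line computation:

```python
# Number of water molecules in a 1 micrometer^3 droplet: N = rho * V * N_A / m_molar
rho = 1000.0        # density of water, kg / m^3
V = 1e-18           # volume, m^3 (1 micrometer^3)
N_A = 6.022e23      # Avogadro's number, 1 / mol
m_molar = 18e-3     # molar mass of water, kg / mol
N = rho * V * N_A / m_molar
assert 3e10 < N < 4e10   # ~3.3e10 molecules, ample for statistical homogeneity
```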

¹Another good example is our universe: at the smaller lengthscales of a planetary or a stellar system, or even a galaxy group, the universe looks pretty much discontinuous. But at lengthscales of roughly 10¹¹ parsecs, or 10²⁷ meters, the universe can be considered as


6.1 Lagrangian and Eulerian frames of reference

Let B ⊂ R3. Then a mapping Φ(X, t) = Φt(X) : B × R → R3 is called a motion. From the point of view of continuum mechanics, this mapping describes the position of every continuum particle at each time t. Usually some stricter assumptions are made², in order to prevent unphysical events like tearing or overlapping of contiguous parts of the medium: first of all we require Φt(X) to be a family of diffeomorphisms parametrized by the real variable t, that is, for every fixed t0, Φt0(X) is a diffeomorphism. This implies in particular that ∃Φt⁻¹(X) ∀t ∈ R, and thus det ∇XΦt(X) ≠ 0 ∀t ∈ R. Moreover, t = 0 is usually required to be the initial time of observation, that is Φ(X, 0) = X, ∀X ∈ B. In the following we will denote points in B by upper-case letters, and points in Φt(B) by lower-case letters. We can regard B as a “reference state” of the body, and upper-case coordinates as a parametrization of the fluid parcels, to keep track of the various particles, while Φt(B) is the portion of space actually occupied by the body at the time t, and lower-case coordinates are the coordinates of the three-dimensional space.

The mapping V(X, t) = ∂tΦ(X, t) is called the material velocity, while the mapping A(X, t) = ∂tV(X, t) = ∂t²Φ(X, t) is called the material acceleration. The names come from the fact that, in taking these derivatives, we are following the motion of each fluid parcel, of each piece of material. These are the quantities to which the laws of classical mechanics apply.

However, we will often be interested in the velocity or in the acceleration of fluid parcels passing by a fixed point in space. For example, measurements of wind direction and intensity are taken at weather stations, which are fixed points on the Earth's surface, while it is practically impossible to measure them while moving along wind streamlines.

Now, having fixed a point $x$ in space, the mapping $\Phi_t^{-1}(x)$ gives us, for every time $t$, the initial position of the particle that is passing through that point in space at that moment in time. In fact, once a motion $\Phi$ is given, we can link the initial position of a fluid parcel to its current position, and with a little abuse of notation we can write $\Phi_t^{-1}(x) = X(x,t)$; moreover, we can also write $\Phi_t(X) = x(X,t)$, linking the current position of the particle to its initial position. The function $v_t : \Phi_t(B) \to \mathbb{R}^3$, $v(x,t) = V\!\left(\Phi_t^{-1}(x), t\right) = V(X(x,t), t)$, returns the fluid velocity at a certain point in space, and is thus called the spatial velocity. Similarly, the spatial acceleration is given by the function $a(x,t) = A\!\left(\Phi_t^{-1}(x), t\right) = A(X(x,t), t)$.

²We will treat only motions with this higher regularity.


More generally, every physical quantity can be described from a fixed frame of reference, giving its value at fixed points in space, like a speed trap on a highway, or following fluid particles as they move with the flow, like the speedometer of a particular car. In the first case we are using spatial (or Eulerian) quantities to give an Eulerian description of the system, while in the second case we are using material (or Lagrangian) quantities to give a Lagrangian description of the system.

6.2 Material time derivative

Let $Q_t : B \to \mathbb{R}^3$ be a continuously differentiable Lagrangian quantity and $q_t : \Phi_t(B) \to \mathbb{R}^3$ be the corresponding Eulerian quantity, $q_t(x) = Q_t(X(x,t))$ and $Q_t(X) = q_t(x(X,t))$. Taking the time derivative of $Q_t$ we obtain the following relation:

$$\begin{aligned}
\partial_t Q(X,t) &= \frac{d}{dt}\, q(x(X,t), t) \\
&= \sum_{i=1}^{3} \partial_{x_i} q(x(X,t), t)\, \partial_t x_i(X,t) + \partial_t q(x(X,t), t) \\
&= \sum_{i=1}^{3} \partial_{x_i} q(x(X,t), t)\, V_i(X,t) + \partial_t q(x(X,t), t) \\
&= V(X,t) \cdot \nabla q(x(X,t), t) + \frac{\partial}{\partial t} q(x(X,t), t) \\
&= v(x(X,t), t) \cdot \nabla q(x(X,t), t) + \frac{\partial}{\partial t} q(x(X,t), t).
\end{aligned}$$

The most important case is the computation of the material acceleration: taking $Q(X,t) = V(X,t)$, we obtain

$$A(X,t) = \frac{\partial}{\partial t} V(X,t) = v(x(X,t), t) \cdot \nabla v(x(X,t), t) + \frac{\partial}{\partial t} v(x(X,t), t).$$

The notation $v \cdot \nabla v$, sometimes written as $(v \cdot \nabla)v$, is an abuse of notation and may cause some misunderstanding, so we give here its form in index notation:

$$(v \cdot \nabla v)_i = \sum_{j=1}^{3} v_j\, \partial_{x_j} v_i.$$

From its explicit derivation in the previous paragraph, it should be clear that we are actually performing a product of Jacobian matrices $J_v(x) \cdot J_x(t)$, where the second Jacobian matrix is just a column vector.


We have obtained that the time derivative of an Eulerian quantity $q(x,t)$, taken following the fluid particles in their motion described by the vector field $v(x,t)$, is given by the operator

$$\frac{D}{Dt} q(x,t) = \frac{\partial}{\partial t} q(x,t) + v(x,t) \cdot \nabla q(x,t), \tag{6.1}$$

which is called the material³ time derivative. It is a fully-fledged derivative and has all the properties of a derivative.
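As a sanity check, Eq. (6.1) can be verified numerically for a one-dimensional toy flow. The flow $x(X,t) = X e^t$ and the field $q(x,t) = x^2$ below are our own illustrative choices, not taken from the notes:

```python
# Numerical check of Eq. (6.1): D/Dt q = ∂q/∂t + v ∂q/∂x, for a 1-D toy flow.
# Assumed (illustrative) flow: x(X, t) = X * exp(t), so v(x, t) = x,
# and an Eulerian field q(x, t) = x**2.
import math

def x_of(X, t):          # the motion Φ_t(X)
    return X * math.exp(t)

def v(x, t):             # spatial velocity of that flow
    return x

def q(x, t):             # an Eulerian quantity
    return x * x

def material_derivative(x, t, h=1e-6):
    # Eulerian operator: ∂q/∂t at fixed x, plus the advective term v ∂q/∂x
    dq_dt = (q(x, t + h) - q(x, t - h)) / (2 * h)
    dq_dx = (q(x + h, t) - q(x - h, t)) / (2 * h)
    return dq_dt + v(x, t) * dq_dx

def lagrangian_derivative(X, t, h=1e-6):
    # d/dt of Q(X, t) = q(x(X, t), t), following the parcel labelled X
    return (q(x_of(X, t + h), t + h) - q(x_of(X, t - h), t - h)) / (2 * h)

X, t = 0.7, 0.3
lhs = lagrangian_derivative(X, t)           # follow the parcel
rhs = material_derivative(x_of(X, t), t)    # Eulerian operator at the parcel's position
print(lhs, rhs)   # the two agree to finite-difference accuracy
```

Following the parcel and applying the operator of Eq. (6.1) at the parcel's current position give the same number, which is the content of the derivation above.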

In the following we will need the total derivative of a volume integral: given a region of space $U(t) = \Phi_t(U_0)$ occupied by a fluid deforming according to a motion $\Phi_t$, and an extensive Eulerian quantity per unit volume $F(x,t)$, at each moment in time the total amount of the extensive quantity $\mathcal{F}(t)$ contained inside the region $U(t)$ is given by the integral

$$\mathcal{F}(t) = \int_{U(t)} F(x,t)\, dx.$$

In this expression $\mathcal{F}(t)$ is a function of time both because $F(x,t)$ is time-dependent and because we are following the fluid in its motion, so the region of integration is time-dependent too. In the following we will be interested in the total time derivative of such a quantity, for which the following equality holds:

$$\frac{d}{dt}\mathcal{F}(t) = \int_{U(t)} \left[ \frac{D}{Dt} F(x,t) + F(x,t)\, \nabla \cdot v(x,t) \right] dx \tag{6.2}$$

$$= \int_{U(t)} \left[ \frac{\partial}{\partial t} F(x,t) + \nabla \cdot \big(F(x,t)\, v(x,t)\big) \right] dx. \tag{6.3}$$

To prove this equality, let's start from the definition of derivative:

$$\frac{d}{dt}\mathcal{F}(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \left[ \int_{U(t+\Delta t)} F(x(t+\Delta t), t+\Delta t)\, d(x(t+\Delta t)) - \int_{U(t)} F(x(t), t)\, d(x(t)) \right]. \tag{6.4}$$

We can now write $x(t+\Delta t) \equiv x$ as a function of $x(t)$, since $x(t+\Delta t) = x(t) + v(t)\Delta t$ to first order in $\Delta t$, and we can thus change the variable of integration from $x(t+\Delta t)$ to $x(x(t))$ in the first integral⁴:

$$\int_{U(t+\Delta t)} F(x(t+\Delta t), t+\Delta t)\, d(x(t+\Delta t)) = \int_{U(t)} F(x(x(t)), t+\Delta t)\, \left|\det\big(J x(x(t))\big)\right|\, d(x(t)). \tag{6.5}$$

Let's notice that the region of integration has changed, and is now the same as in the second integral.

³Or Lagrangian, or substantial.

Before explicitly adding the two integrals we will massage the integrand a bit: first of all, we write the Jacobian explicitly, retaining only the first-order terms in $\Delta t$, which yields⁵

$$\det\big(J x(x(t))\big) = \det\!\left(\frac{\partial x_i(x(t))}{\partial x_j(t)}\right) = \det\!\left(\frac{\partial (x_i + v_i \Delta t)}{\partial x_j}\right) = \det\!\left(\delta_{ij} + \frac{\partial v_i}{\partial x_j}\Delta t\right) = 1 + \Delta t\, \mathrm{tr}\!\left(\frac{\partial v_i}{\partial x_j}\right) = 1 + \Delta t \sum_i \frac{\partial v_i}{\partial x_i} = 1 + \Delta t\, \nabla \cdot v.$$

Let’s notice that since ∆t is infinitesimal, the Jacobian is positive, and soin this case ||det(J x(x(t)))|| = det(J x(x(t))). Now we use another basicproperty of the material derivative, a consequence of Eq. (3.1), again correctto the first order in ∆t:

F (x(x), t+ ∆t) = F (x(t+ ∆t), t+ ∆t) = F (x(t), t) +D

DtF (x(t), t)∆t.

We substitute this equation and the expression we found for the Jacobian back into Eq. (6.5), and then into Eq. (6.4), obtaining the desired equation:

$$\begin{aligned}
\frac{d}{dt}\mathcal{F}(t) &= \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{U(t)} \left[ \left( F(x,t) + \frac{D}{Dt}F(x,t)\,\Delta t \right) (1 + \Delta t\, \nabla \cdot v) - F(x,t) \right] dx \\
&= \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{U(t)} \Delta t \left[ \frac{D}{Dt}F(x,t) + F(x,t)\, \nabla \cdot v \right] dx = \int_{U(t)} \left[ \frac{D}{Dt}F(x,t) + F(x,t)\, \nabla \cdot v \right] dx.
\end{aligned}$$

⁴See Eq. (5.2). In particular, here we have $x(x(t)) = \Phi(x(t)) = x(t) + v(t)\Delta t$.

⁵Here we use the relation $\det(\mathbf{1} + \varepsilon A) = 1 + \varepsilon\, \mathrm{tr}(A)$, correct to first order in the infinitesimal parameter $\varepsilon$, where $\mathbf{1}$ is the identity matrix and $A$ is an arbitrary matrix.
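The first-order determinant identity of footnote 5 is easy to check numerically; the matrix below is an arbitrary example of ours:

```python
# Sanity check of footnote 5: det(1 + εA) ≈ 1 + ε tr(A) to first order in ε,
# for an arbitrary matrix A (illustrative values).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # an arbitrary 3x3 matrix
eps = 1e-6

exact = np.linalg.det(np.eye(3) + eps * A)
first_order = 1.0 + eps * np.trace(A)

# the two differ only at order ε², i.e. around 1e-12 here
print(exact - first_order)
```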


6.3 Deformation tensors

Given a motion $\Phi_t(X)$, its Jacobian matrix with respect to the material coordinates, $F_t(X) = \nabla_X \Phi_t(X)$, is called the deformation gradient. We can use the deformation gradient to approximate the deformation

$$dx = \Phi_t(X + dX) - \Phi_t(X)$$

of the fluid parcel $X$ with the product $F_t(X)\, dX$. Obviously this linear approximation is good only for small deformations, technically called infinitesimal.

In the approximation of infinitesimal deformations, the squared length of a line segment in the deformed state is

$$\|dx\|^2 = dX^T F_t^T(X)\, F_t(X)\, dX,$$

where $^T$ denotes the transpose. The tensor $C_t(X) = F_t^T(X) F_t(X)$ is called the Green deformation tensor, and is a Lagrangian quantity.

On the other hand, the original squared length of a deformed line segment $dx$ is

$$\|dX\|^2 = dx^T \left( F_t^{-1}\!\left(\Phi_t^{-1}(x)\right) \right)^T F_t^{-1}\!\left(\Phi_t^{-1}(x)\right) dx,$$

and the tensor $c_t(x) = \left( F_t^{-1}\!\left(\Phi_t^{-1}(x)\right) \right)^T F_t^{-1}\!\left(\Phi_t^{-1}(x)\right)$ is called the Cauchy deformation tensor, and is an Eulerian quantity.

Let's notice that, by definition, both the Green and Cauchy tensors are symmetric and positive definite (they are Gramian matrices).

6.4 Strain tensors

Let’s now evaluate the amount of deformation experienced by a line segmentdX during the motion:

||dx||2 − ||dxt=0||2 = ||dx||2 − ||dX||2 = dXTF TF dX − dXTdX

= dXT(F TF − 1

)dX = 2 dXTE dX

= dxTdx− dxT(F−1

)TF−1 dx = 2 dxTe dx.

The Green strain tensor Et(X) is thus defined by

Et(X) =1

2(Ct(X)− 1) ,

6.4. STRAIN TENSORS 63

while the Cauchy strain tensor et(x) is defined by

et(x) =1

2(1− ct(x)) .

Introducing now the displacement $u = x - x_{t=0} = x - X = \Phi_t - \Phi_0$, we have that

$$\nabla_X u = F - \mathbf{1}, \qquad F = \mathbf{1} + \nabla_X u, \tag{6.6}$$

$$\nabla_x u = \mathbf{1} - F^{-1}, \qquad F^{-1} = \mathbf{1} - \nabla_x u, \tag{6.7}$$

and, after some work (neglecting terms quadratic in the displacement gradients), we can write the Green and Cauchy strain tensors explicitly as

$$E_{ij} \approx \frac{1}{2}\left( \frac{\partial u_i}{\partial X_j} + \frac{\partial u_j}{\partial X_i} \right), \tag{6.8}$$

$$e_{ij} \approx \frac{1}{2}\left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right). \tag{6.9}$$
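The quality of the approximation (6.8) can be illustrated numerically. The homogeneous stretch below is our own toy deformation, not one from the notes:

```python
# Compare the exact Green strain E = (FᵀF − 1)/2 with its linearization
# (Eq. 6.8) for a homogeneous stretch x = ((1 + γ)X₁, X₂, X₃).
import numpy as np

def green_strain(F):
    return 0.5 * (F.T @ F - np.eye(3))

def linearized_strain(grad_u):
    return 0.5 * (grad_u + grad_u.T)

gamma = 1e-3                        # small stretch along the first axis
F = np.diag([1 + gamma, 1.0, 1.0])  # deformation gradient
grad_u = F - np.eye(3)              # Eq. (6.6): ∇_X u = F − 1

E_exact = green_strain(F)
E_lin = linearized_strain(grad_u)

# the two differ by the quadratic term γ²/2, about 5e-7 here
print(np.abs(E_exact - E_lin).max())
```

For strains of order 10⁻³, typical of elastic deformations in rocks, the neglected quadratic term is smaller by three orders of magnitude, which is why (6.8) and (6.9) are adequate in that regime.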

As an example, let's take the motion

$$\Phi_t(X) = \begin{pmatrix} (1+t)X_1 \\ (1+t^2)X_2 \\ (1+2t)X_3 \end{pmatrix},$$

defined on the unit ball $B = \{ X \in \mathbb{R}^3 : \|X\| < 1 \}$ for $t \ge 0$. First of all, let's check that the general assumptions on motions are satisfied:

1. Initial time:
$$\Phi_0(X) = \begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} = X.$$

2. Determinant of the gradient:
$$\nabla_X \Phi_t(X) = F_t(X) = \begin{pmatrix} 1+t & 0 & 0 \\ 0 & 1+t^2 & 0 \\ 0 & 0 & 1+2t \end{pmatrix},$$
$$\det F_t = (1+t)(1+t^2)(1+2t) > 0, \quad \forall t \ge 0.$$

3. Inverse:
$$\begin{cases} x_1 = (1+t)X_1 \\ x_2 = (1+t^2)X_2 \\ x_3 = (1+2t)X_3 \end{cases} \quad\Rightarrow\quad \begin{cases} X_1 = \dfrac{x_1}{1+t} \\[4pt] X_2 = \dfrac{x_2}{1+t^2} \\[4pt] X_3 = \dfrac{x_3}{1+2t} \end{cases} \quad\Rightarrow\quad \Phi_t^{-1}(x) = \begin{pmatrix} \frac{1}{1+t}x_1 \\ \frac{1}{1+t^2}x_2 \\ \frac{1}{1+2t}x_3 \end{pmatrix}.$$

All assumptions are thus satisfied. Let's now compute the material velocity and acceleration:

$$V(X,t) = \frac{\partial}{\partial t}\Phi_t(X) = \begin{pmatrix} X_1 \\ 2tX_2 \\ 2X_3 \end{pmatrix}, \qquad A(X,t) = \frac{\partial}{\partial t}V(X,t) = \begin{pmatrix} 0 \\ 2X_2 \\ 0 \end{pmatrix}.$$

The nature of the motion is now clear: the unit ball is dilating. The outer fluid parcels are faster than the inner ones. Every fluid parcel experiences a uniform motion along the first and third coordinate axes, and a motion with constant acceleration along the second coordinate axis.

The corresponding Eulerian quantities are then:

$$v(x,t) = V(\Phi_t^{-1}(x), t) = \begin{pmatrix} \frac{1}{1+t}x_1 \\ \frac{2t}{1+t^2}x_2 \\ \frac{2}{1+2t}x_3 \end{pmatrix}, \qquad a(x,t) = A(\Phi_t^{-1}(x), t) = \begin{pmatrix} 0 \\ \frac{2}{1+t^2}x_2 \\ 0 \end{pmatrix}.$$

Now we are looking at the motion from fixed points in space: we see particles passing by that come each time from a more internal portion of the fluid, and that are thus moving more and more slowly.
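The whole example can be verified symbolically, for instance with SymPy; the script below just re-derives $V$, $A$, $v$ and $a$ from $\Phi_t$:

```python
# Symbolic check of the worked example: compute V, A from Φ_t, invert the
# motion, and recover the Eulerian fields v(x, t) and a(x, t).
import sympy as sp

t = sp.symbols('t', nonnegative=True)
X1, X2, X3 = sp.symbols('X1 X2 X3')
x1, x2, x3 = sp.symbols('x1 x2 x3')

Phi = sp.Matrix([(1 + t) * X1, (1 + t**2) * X2, (1 + 2*t) * X3])

V = Phi.diff(t)                                   # material velocity
A = V.diff(t)                                     # material acceleration
detF = Phi.jacobian([X1, X2, X3]).det()           # = (1+t)(1+t²)(1+2t)

# invert the motion and substitute to get the spatial (Eulerian) fields
inv = {X1: x1 / (1 + t), X2: x2 / (1 + t**2), X3: x3 / (1 + 2*t)}
v = sp.simplify(V.subs(inv))
a = sp.simplify(A.subs(inv))

print(v.T)   # matches v(x, t) in the text
print(a.T)   # matches a(x, t) in the text
```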

6.4.1 Velocity gradient tensor

In this section we will show a decomposition of the deformation gradient tensor and of the velocity gradient tensor that will come in handy when we deal with constitutive equations. This decomposition will give us an alternative derivation of the stress tensor, mathematically less rigorous but, we hope, physically more intuitive.

We will show the explicit calculations only for the velocity gradient tensor $\pi$, defined as

$$\pi_{ij} = \frac{\partial v_i}{\partial x_j},$$

those for the deformation gradient tensor being completely analogous. Let's expand each entry of the tensor in the form

$$\frac{\partial v_i}{\partial x_j} = \frac{1}{2}\left( \frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i} - \frac{2}{3}\, \nabla \cdot v\, \delta_{ij} \right) + \frac{1}{2}\left( \frac{\partial v_i}{\partial x_j} - \frac{\partial v_j}{\partial x_i} \right) + \frac{1}{3}\, \nabla \cdot v\, \delta_{ij}.$$

We can thus decompose the velocity gradient tensor into the sum of a symmetric tensor with null trace $\pi^S$, an antisymmetric tensor $\pi^A$ and an isotropic tensor $\pi^I$:

$$\pi^S_{ij} = \frac{1}{2}\left( \frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i} - \frac{2}{3}\, \nabla \cdot v\, \delta_{ij} \right), \tag{6.10}$$

$$\pi^A_{ij} = \frac{1}{2}\left( \frac{\partial v_i}{\partial x_j} - \frac{\partial v_j}{\partial x_i} \right), \tag{6.11}$$

$$\pi^I_{ij} = \frac{1}{3}\, \nabla \cdot v\, \delta_{ij}. \tag{6.12}$$

A first important consequence of this decomposition is that, since all terms of the decomposition are tensors, they transform separately and independently. Thus, if one of them is null in a particular coordinate system, then it is also null in every other coordinate system.

Moreover, the various terms have a clear physical interpretation. The motion of the fluid parcel is in fact given by the composition of:

• a translatory motion, which doesn't give rise to a velocity gradient;

• a rigid-body rotation, with angular velocity $\omega = \frac{1}{2}\nabla \times v$, associated with the antisymmetric term $\pi^A$;

• an isotropic dilation, with a relative dilation velocity $\frac{1}{V}\frac{dV}{dt} = \nabla \cdot v$, associated with the isotropic term $\pi^I$;

• a motion of pure shear: angles and distances are deformed without variations in volume. It is associated with the symmetric term $\pi^S$.

Since $\pi^A$ describes a rigid-body rotation, it doesn't cause any deformation of the fluid parcel, unlike the other two tensors. The sum of the symmetric and the isotropic parts of the velocity gradient tensor is called the deformation velocity tensor, and it is equal to the Cauchy strain rate tensor:

$$\pi^S_{ij} + \pi^I_{ij} = \frac{1}{2}\left( \frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i} \right) = \dot{e}_{ij}.$$
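The decomposition (6.10)–(6.12) is easy to carry out numerically; the velocity gradient below is an arbitrary example of ours:

```python
# Decompose a velocity gradient tensor π into its traceless-symmetric,
# antisymmetric and isotropic parts, Eqs. (6.10)–(6.12). Illustrative values.
import numpy as np

pi = np.array([[0.5, 1.0, 0.0],
               [0.2, -0.3, 0.4],
               [0.0, 0.1, 0.8]])   # some velocity gradient ∂v_i/∂x_j

pi_I = (np.trace(pi) / 3.0) * np.eye(3)   # isotropic part (π^I)
pi_A = 0.5 * (pi - pi.T)                  # antisymmetric part (π^A)
pi_S = 0.5 * (pi + pi.T) - pi_I           # symmetric, traceless part (π^S)

assert np.allclose(pi_S + pi_A + pi_I, pi)   # the parts sum back to π
assert abs(np.trace(pi_S)) < 1e-12           # π^S has null trace
assert np.allclose(pi_A, -pi_A.T)            # π^A is antisymmetric
```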


6.5 Conservation of mass

Given a motion $\Phi_t$ of a body $B$ with mass $M$, a time-dependent density function $\rho(x,t)$ satisfies the conservation of mass if

$$M = \int_B \rho(x,0)\, dx,$$

and if

$$\frac{d}{dt} M = \frac{d}{dt} \int_{\Phi_t(U)} \rho(x,t)\, dx = 0, \qquad \forall U \subset B,\ \forall t.$$

Taking into account equation (6.3), the derivative of the integral transforms into

$$\frac{d}{dt} M = \int_{\Phi_t(U)} \left[ \frac{\partial}{\partial t}\rho(x,t) + \nabla \cdot \big(\rho(x,t)\, v(x,t)\big) \right] dx = 0.$$

For this integral to be zero for every arbitrary portion of space and for all times, its integrand must be identically zero. We thus obtain the continuity equation:

$$\frac{\partial}{\partial t}\rho(x,t) + \nabla \cdot \big(\rho(x,t)\, v(x,t)\big) = 0.$$

From equation (6.2) we know that we can also write the continuity equation in a different form:

$$\frac{D}{Dt}\rho(x,t) + \rho(x,t)\, \nabla \cdot v(x,t) = 0,$$

which we rewrite as

$$\frac{1}{\rho}\frac{D\rho}{Dt} = -\nabla \cdot v. \tag{6.13}$$

This shows that the relative variation of density of a fluid parcel is given by the divergence of the velocity field. If the flow is divergence-free, then the density of each fluid parcel is constant, and the motion is called incompressible.
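As a check, for the dilating-ball example of the previous section one can build a density that conserves mass, $\rho(t) = \rho_0 / \det F_t$, and verify the continuity equation symbolically:

```python
# Continuity check for the dilating-ball example: with spatially uniform
# ρ(t) = ρ0 / det F_t, one indeed has ∂ρ/∂t + ∇·(ρv) = 0.
import sympy as sp

t, rho0 = sp.symbols('t rho0', positive=True)
x1, x2, x3 = sp.symbols('x1 x2 x3')

v = sp.Matrix([x1 / (1 + t), 2*t*x2 / (1 + t**2), 2*x3 / (1 + 2*t)])
rho = rho0 / ((1 + t) * (1 + t**2) * (1 + 2*t))   # total mass / current volume

div_rho_v = sum(sp.diff(rho * v[i], var) for i, var in enumerate((x1, x2, x3)))
residual = sp.simplify(sp.diff(rho, t) + div_rho_v)
print(residual)   # 0
```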

A motion is called volume preserving if

$$\mathrm{Vol}\,(\Phi_t(U)) = \int_{\Phi_t(U)} dx = \int_U dX = \mathrm{Vol}\,(U), \qquad \forall U,\ \forall t, \tag{6.14}$$


or, equivalently, if

$$\frac{d}{dt}\mathrm{Vol} = 0.$$

Setting $F = 1$ in equation (6.3) we obtain that

$$\frac{d}{dt}\mathrm{Vol} = \int_{\Phi_t(U)} \nabla \cdot v(x,t)\, dx. \tag{6.15}$$

A motion that is volume preserving is also incompressible, and vice versa. We could have reached the same conclusion following a different argument: mass density, that is, mass per unit volume, is the inverse of volume per unit mass,

$$\rho = \frac{1}{\upsilon}.$$

Substituting this into equation (6.13) yields

$$\frac{1}{\upsilon}\frac{D\upsilon}{Dt} = \nabla \cdot v.$$

Theorem 6.1. Let $\Phi_t \in C^2$ be a motion. Then the following statements are equivalent:

a) $\Phi_t$ is volume preserving.

b) $\nabla \cdot v = 0$.

c) $\det J\Phi(X,t) = 1 \quad \forall X, t$.

Proof. From equation (6.15) and its derivation it should be clear that a) ⇔ b).

To show that a) ⇔ c) it is enough to take into account the definition of a volume-preserving motion and perform a coordinate transformation, using the motion itself as the coordinate transformation. Let's stress the fact that we can do this thanks to the high regularity we require from the motion. Taking the left member of equation (6.14) and transforming it using equation (5.3) yields

$$\mathrm{Vol}\,(\Phi_t(U)) = \int_{\Phi_t(U)} dx = \int_{\Phi^{-1}(\Phi_t(U))} |\det J\Phi(X)|\, dX = \int_U |\det J\Phi(X)|\, dX.$$

Comparing this result with the right member of equation (6.14), we see that if the motion is volume preserving then the equality

$$\int_U |\det J\Phi(X)|\, dX = \int_U dX \tag{6.16}$$

holds for every portion $U$ of the continuum and for every time $t$. This implies the equality of the integrands, that is,

$$|\det J\Phi(X)| = 1,$$

and so a) ⇒ c). If, on the contrary, the Jacobian determinant is equal to 1, then we know that equation (6.16) holds for every portion $U$ of the continuum and for every time $t$. We can thus follow the previous argument backwards and prove that the motion is volume preserving.
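Theorem 6.1 can be illustrated on a simple shear motion (our own example, not one from the text):

```python
# Illustrate Theorem 6.1 on the simple shear motion Φ_t(X) = (X1 + tX2, X2, X3):
# det JΦ = 1 for all t, and the associated spatial velocity is divergence free.
import sympy as sp

t = sp.symbols('t')
X1, X2, X3 = sp.symbols('X1 X2 X3')
x1, x2, x3 = sp.symbols('x1 x2 x3')

Phi = sp.Matrix([X1 + t * X2, X2, X3])
detJ = Phi.jacobian([X1, X2, X3]).det()

V = Phi.diff(t)                                   # material velocity: (X2, 0, 0)
v = V.subs({X1: x1 - t * x2, X2: x2, X3: x3})     # spatial velocity: (x2, 0, 0)
div_v = sum(sp.diff(v[i], var) for i, var in enumerate((x1, x2, x3)))

print(detJ, div_v)   # 1 0
```

The motion shears the body arbitrarily far, yet it is volume preserving at every instant, as the theorem requires.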

6.6 Balance of momentum

Let $\Phi$ be a motion of a body $B$, $\rho : \Phi(B) \times \mathbb{R} \to \mathbb{R}$ be a density, $t : \Phi(B) \times \mathbb{R} \times \Omega \to \mathbb{R}^3$ be a force per unit area, or traction⁶, with $\Omega$ being the spherical surface of unit radius, and $b : \Phi(B) \times \mathbb{R} \to \mathbb{R}^3$ be the sum of all the body forces per unit volume. Then, for every portion $U$ of the body under consideration and at any time $t$, Newton's second law of motion $F = \frac{d}{dt}(mv)$ must hold:

$$\int_{\Phi_t(U)} b(x,t)\, dx + \int_{\partial\Phi_t(U)} t(x,t,n)\, d\omega = \frac{d}{dt} \int_{\Phi_t(U)} \rho(x,t)\, v(x,t)\, dx. \tag{6.17}$$

Using equation (6.2) to manipulate the time derivative of the momentum and taking the continuity equation into consideration yields

$$\begin{aligned}
\int_{\Phi_t(U)} b\, dx + \int_{\partial\Phi_t(U)} t\, d\omega &= \int_{\Phi_t(U)} \left( \frac{D}{Dt}(\rho v) + \rho v\, \nabla \cdot v \right) dx \\
&= \int_{\Phi_t(U)} \left( \rho \frac{D}{Dt} v + v \frac{D}{Dt}\rho + v \rho\, \nabla \cdot v \right) dx \\
&= \int_{\Phi_t(U)} \rho \frac{D}{Dt} v\, dx. 
\end{aligned} \tag{6.18}$$

6.6.1 Cauchy’s stress theorem

Before continuing with the manipulation of equation (6.18), we need to demonstrate a fundamental fact about traction, namely that traction is linear in the components of the normal vector:

$$t_i(x, n) = \sigma_{i1}(x)\, n_1 + \sigma_{i2}(x)\, n_2 + \sigma_{i3}(x)\, n_3 = \sum_{j=1}^{3} \sigma_{ij}(x)\, n_j. \tag{6.19}$$

⁶Given a point $x$ on a surface $\Sigma$ with outward unit normal vector $n(x)$, the traction $t(x,t,n)$ is the force per unit area exerted at the point $x$ and at time $t$ by the portion of continuum pointed at by $n$ over the portion of continuum on the other side of the surface. By virtue of Newton's third law of motion (i.e. the mutual forces of action and reaction between two bodies are equal, opposite and collinear), we obtain that $t(x,t,n) = -t(x,t,-n)$.

Figure 6.1: Tetrahedron for Cauchy's stress theorem. Image from Wikipedia.

To prove this let’s consider a tetrahedron of infinitesimal volume dV anddiameter r (see Figure 6.1), with three faces Si orthogonal to the correspond-ing cartesian versor ei in such a way that the outward normal versor is −ei,and the fourth face S with outward normal versor n. For clear geometricalreasons, the areas dAi of the three faces Si are given by the relation

dAi = (n · ei) dA = ni dA.

On the tetrahedron act the traction t(x, n), the three tractions t(x,−ei) =−t(x, ei), and possibly a body force b(x). We can thus write the i-th com-ponent of Newton’s second law of motion as Fi −mai = 0

ti(x, n) dA−∑j

ti(x, ej) dAj + ρ(x)bi(x) dV − ρ(x)ai dV =(ti(x, n)−

∑j

ti(x, ej)nj

)dA+ ρ(x)bi(x) dV − ρ(x)ai dV = 0.

Noticing that dA ∼ r2 while dV ∼ r3, and since infinitesimal quantities ofdifferent order cannot mutually cancel out7, the last equality implies that

ti(x, n)−∑j

ti(x, ej)nj = 0.

7Or, equivalently, dividing both members of the equation by dA and taking the limitfor dV → 0.

70 CHAPTER 6. CONTINUUM MECHANICS

Setting $\sigma_{ij}(x) = t_i(x, e_j)$ we obtain the desired relationship (6.19). The matrix $\sigma_{ij}$ is called the Cauchy stress tensor. Its diagonal entries are the normal stresses, while its off-diagonal entries are the components of the tangential stresses.

For all "ordinary" materials the Cauchy stress tensor happens to be symmetric. Those materials, like ferromagnetic suspensions or some polymeric fluids, that have a non-symmetric stress tensor are called polar. We will only treat non-polar materials, and the order of the indices of the stress tensor will thus not be relevant. This allows us to write equation (6.19) in vector notation as

$$t(x,n) = \sigma(x) \cdot n \tag{6.20}$$

without caring about the ambiguity of this notation.

The stress tensor is usually decomposed into the sum of a mean hydrostatic stress $p = -\frac{1}{3}\sum_i \sigma_{ii}$ and a stress deviator tensor $\sigma'$:

$$\sigma_{ij} = \sigma'_{ij} - p\, \delta_{ij}. \tag{6.21}$$
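Equations (6.20) and (6.21) in action, for an arbitrary symmetric stress tensor of our choosing:

```python
# Compute the traction t = σ·n (Eq. 6.20) on a given surface, and split σ
# into pressure and deviator as in Eq. (6.21). Illustrative values.
import numpy as np

sigma = np.array([[-50.0, 10.0,  0.0],
                  [ 10.0, -80.0, 5.0],
                  [  0.0,  5.0, -30.0]])     # a symmetric stress, e.g. in MPa

n = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0) # unit normal of a surface
t = sigma @ n                                # traction acting on that surface

p = -np.trace(sigma) / 3.0                   # mean (hydrostatic) pressure
sigma_dev = sigma + p * np.eye(3)            # σ' = σ + p·1, from Eq. (6.21)

assert abs(np.trace(sigma_dev)) < 1e-12      # the deviator is traceless
print(p, t)
```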

6.6.2 Navier-Stokes equation

We can now return to equation (6.18) and substitute the traction using equation (6.20), obtaining

$$\int_{\Phi_t(U)} \rho(x,t) \frac{D}{Dt} v(x,t)\, dx = \int_{\Phi_t(U)} b(x,t)\, dx + \int_{\partial\Phi_t(U)} \sigma(x,t) \cdot n\, d\omega = \int_{\Phi_t(U)} b(x,t)\, dx + \int_{\Phi_t(U)} \nabla \cdot \sigma(x,t)\, dx,$$

where we have used the Gauß theorem to transform the surface integral into a volume integral. Since this equation must hold for every arbitrary portion of continuum $U$, it must hold for the integrands too. We thus obtain the Navier-Stokes equation:

$$b(x,t) + \nabla \cdot \sigma(x,t) = \rho(x,t) \frac{D}{Dt} v(x,t) = \rho(x,t) \left( \frac{\partial}{\partial t} v(x,t) + v(x,t) \cdot \nabla v(x,t) \right),$$

which, taking into consideration equation (6.21), we can rewrite as

$$\rho(x,t) \left( \frac{\partial}{\partial t} v(x,t) + v(x,t) \cdot \nabla v(x,t) \right) = b(x,t) - \nabla p(x,t) + \nabla \cdot \sigma'(x,t). \tag{6.22}$$


We can see that, if the momentum of the fluid is constant, the Navier-Stokes equation reduces to a balance between body and surface forces,

$$b(x,t) = \nabla p(x,t) - \nabla \cdot \sigma'(x,t).$$

6.7 Constitutive equation and rheology

In the previous section we have considered tractions as given, known functions. Unfortunately it is more complicated than that: while body forces are given by the long-range interactions between particles of the body, like electromagnetism and gravity, and can be described by robust and relatively simple physical theories using only macroscopic quantities, surface forces are an approximate description of the microscopic short-range interactions between nearby particles (e.g. collisions between molecules, van der Waals interactions, chemical bonds between atoms in a crystal, etc.) when viewed on a macroscopic scale. Surface forces are thus very difficult to describe with a consistent physical theory that takes into account the microphysics of particle interaction but uses only macroscopic quantities. Tractions, and the stress deviator tensor in particular, are linked to macroscopic quantities via a constitutive equation $\sigma'(x,t) = f(x, v, t, p, T, \ldots)$ that, depending on the material, might be very complicated, not really satisfactory, or both.

6.7.1 Linear isotropic constitutive equation

Usually deviatoric stresses are related only to deformations, in the case of an elastic solid, or only to deformation velocities, in the case of a viscous fluid. We will now derive a very simple stress-strain relation assuming that:

1. the constitutive equation is linear;

2. the material is isotropic;

3. the stress deviator tensor is symmetric.

The first assumption yields

$$\sigma'_{ij} = A_{ijhk}\, \alpha_{hk},$$

where $\alpha_{hk}$ can be the strain tensor $e_{ij}$ or the strain rate tensor $\dot{e}_{ij}$, depending on the material. If moreover we assume that the material is isotropic, then the tensor $A_{ijhk}$ must also be isotropic, and can thus be written as⁸

$$A_{ijhk} = a\, \delta_{ij}\delta_{hk} + b\, \delta_{ih}\delta_{jk} + c\, \delta_{ik}\delta_{jh}.$$

⁸This is a known and demonstrable property of an isotropic tensor of order 4. The proof is rather lengthy and not very insightful, so it is not reported here.


Substituting this expression back into the constitutive equation yields

$$\sigma'_{ij} = a\, \mathrm{tr}\,\alpha\, \delta_{ij} + b\, \alpha_{ij} + c\, \alpha_{ji} = (b+c)\, \alpha^S_{ij} + (b-c)\, \alpha^A_{ij} + (b+c+3a)\, \alpha^I_{ij}.$$

Since we are treating non-polar materials, we also require the stress deviator tensor to be symmetric; thus the coefficient $(b-c)$ of $\alpha^A$ must be zero, and the constitutive equation reduces to

$$\sigma'_{ij} = 2b\, \alpha^S_{ij} + (2b+3a)\, \alpha^I_{ij}.$$

If we are treating an elastic material, then $\alpha$ is the strain tensor $e$, and we recognize $b$ as the shear modulus $G$ and $\frac{1}{3}(2b+3a)$ as the bulk modulus $K$. The constitutive equation for a linear isotropic elastic material takes the form

$$\sigma'_{ij} = 2G\, e^S_{ij} + 3K\, e^I_{ij} = 2G\, \frac{1}{2}\left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\frac{\partial u_k}{\partial x_k}\delta_{ij} \right) + 3K\, \frac{1}{3}\frac{\partial u_k}{\partial x_k}\delta_{ij} = 2G\, e_{ij} + \left( K - \frac{2}{3}G \right) e_{kk}\, \delta_{ij} = 2G\, e_{ij} + \lambda\, e_{kk}\, \delta_{ij},$$

where the last equality is written using only the Lamé parameters $G$ and $\lambda$. In the case of a linear viscous fluid, often called a Newtonian fluid, $\alpha$ is the strain rate tensor $\dot{e}$, and thus $b$ is the shear viscosity $\eta$ and $\frac{1}{3}(2b+3a)$ is the volume (or bulk) viscosity $\zeta$. The constitutive equation for a linear viscous fluid is thus

$$\sigma'_{ij} = 2\eta\, \dot{e}^S_{ij} + 3\zeta\, \dot{e}^I_{ij} = 2\eta\, \dot{e}_{ij} + \left( \zeta - \frac{2}{3}\eta \right) \dot{e}_{kk}\, \delta_{ij}.$$
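A minimal numerical sketch of the Newtonian constitutive law, with illustrative values for $\eta$ and $\zeta$ of our choosing:

```python
# Evaluate σ'_ij = 2η ė_ij + (ζ − 2η/3) ė_kk δ_ij for an example strain
# rate tensor. η, ζ and ė are illustrative values, not from the notes.
import numpy as np

eta, zeta = 1.0e3, 0.0          # shear viscosity [Pa·s]; zero bulk viscosity

edot = np.array([[ 2.0, 0.5, 0.0],
                 [ 0.5, -1.0, 0.0],
                 [ 0.0, 0.0, 0.5]])   # strain rate ė [1/s], symmetric

sigma_dev = 2 * eta * edot + (zeta - 2 * eta / 3) * np.trace(edot) * np.eye(3)

# with ζ = 0, tr σ' = 2η tr ė − 2η tr ė = 0: only shear stresses remain
print(np.trace(sigma_dev))
```

Setting $\zeta = 0$ reproduces the "zero bulk viscosity" fluid used in the non-dimensionalization example of Section 6.8.3: the isotropic part of the strain rate then produces no deviatoric stress.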

6.7.2 Non-Newtonian fluids

A number of common fluids, like water, air, petrol, kerosene and many saline water solutions, behave quite linearly, at least at room temperature and pressure. Many others have instead a more complicated behavior, with their stress state being a nonlinear function of present, and sometimes also past, strain and strain rate. Rheology is the science that studies the flow of matter, in particular of those materials that show an anomalous flow, like non-Newtonian fluids and visco-plastic solids, and by extension also their behavior. We will give here a brief list of non-Newtonian fluids, to show that even some quite common fluids have a quite weird behavior:

Shear thickening fluids Also called dilatant, their viscosity increases with increasing shear strain rate. The classic example is a mixture of cornstarch and water, which can be easily deformed if kneaded gently, but becomes basically solid if struck.


Figure 6.2: Tangential shear stress $\tau$ as a function of tangential strain rate $\dot\varepsilon$ for various materials. The instantaneous, effective viscosity is given by the slope of the tangent, $\partial\tau/\partial\dot\varepsilon$. Notice the yield stress $\tau_0$ for the Bingham plastic.

Shear thinning fluids Also called pseudoplastic, their viscosity decreases with increasing shear strain rate. Nowadays many paints and nail polishes are shear thinning fluids, so they can be easily spread without dripping once applied.

Thixotropic fluids Their viscosity depends on both strain rate and time: at constant strain rate the viscosity decreases with time towards an equilibrium viscosity. Some clays (like bentonite) and inks are thixotropic.

Rheopectic fluids Like thixotropic fluids, but their viscosity increases with time.

Bingham plastics A Bingham plastic deforms like a solid at low stress, but flows like a viscous fluid at high stress. The value of stress at which a Bingham plastic starts to flow is called the yield stress. Mud is a Bingham plastic with a yield stress varying from tens to thousands of Pascals, depending on chemical composition and hydration. This can create very dangerous situations: a sloping layer of mud is stable until it thickens so much that the shear stress at its base exceeds the yield stress, at which point it flows down the slope like a solid above a thin lubricating layer, with possibly devastating effects.

Polar fluids Some fluids have a non-zero antisymmetric stress tensor, due to an intrinsic spin angular momentum. In nature, examples of polar fluids are ferrofluids, and possibly also the liquid outer core of the Earth.

Viscoelastic materials Some materials exhibit an elastic response to an impulsive force, but behave like a fluid if the force lasts for a long enough period of time. Many jellies, glasses and crystals are viscoelastic.

Earth’s mantle is viscoelastic and in some regions also shear thinning.Moreover its viscosity depends strongly on temperature and pressure. Mod-eling the Earth as an isoviscous fluid is thus not very accurate, but we stilllack a good constitutive equation for the Earth’s mantle because of boththeoretical and experimental difficulties.

6.8 Similarity and non-dimensionalization

Ideally, a geodynamical model should respect two distinct criteria: it should be sufficiently simple that the essential physics it embodies can be easily understood, yet sufficiently complex and realistic that it can be used to draw inferences about the Earth. It is seldom easy to satisfy both these desiderata in a single model. However, there is a way around this dilemma: to investigate not just a single model, but rather a hierarchical series of models of gradually increasing complexity and realism. The initial study of a highly simplified model provides the physical understanding required to guide the formulation and investigation of more complex models. However, we want the simplified model, be it analytical, numerical or analog, to bear some degree of resemblance to the actual physical system we are aiming at. There is a well-developed branch of physics that studies to what extent and under what conditions a model can be interpreted in order to gain insights into the target physical system.

6.8.1 Buckingham’s Π-Theorem

A general principle of physics states that the validity of physical laws cannot depend on the units in which they are expressed. A first consequence of this principle is the Π-theorem of Buckingham⁹: suppose that a physically meaningful equation

$$f(p_1, \ldots, p_n) = 0$$

relates $n$ physical variables $p_i$ expressed in terms of $k$ different physical units. Then the previous equation is equivalent to a new relation

$$\Phi(\Pi_1, \ldots, \Pi_{n-k}) = 0 \tag{6.23}$$

where the $\Pi_i$ are $n - k$ independent non-dimensional parameters obtained as products of powers of the dimensional parameters $p_i$. While the total number of independent groups $\Pi_i$ is fixed ($= n - k$), the definitions of the individual groups are arbitrary and can be chosen as convenient.

⁹E. Buckingham (1914), On physically similar systems; illustrations of the use of dimensional equations.

This theorem has two main consequences: on the one hand, all possible observables of a certain physical system depend on its non-dimensional parameters $\Pi_i$, rather than on its dimensional parameters $p_i$; on the other hand, if two physical systems happen to have the same values of the non-dimensional parameters $\Pi_i$, then they are governed by the same equations (even in the case where we don't explicitly know them!) and behave similarly.

Most of the time the non-dimensional parameters can be chosen in such a way that they can be regarded as ratios between meaningful physical quantities, like the energies, length- and time-scales, or efficiencies of the different processes partaking in the total dynamics of the system. This way one can obtain a first clue of which processes will be relevant and which will be of secondary importance or even completely negligible.

6.8.2 Dimensional Analysis

Buckingham’s Π-Theorem is also the basis of dimensional analysis: by listingall the relevant parameters of a certain physical system and combining theminto non-dimensional parameters, one may have the possibility to infer someproperties of the system even if the precise functional form of (6.23) is notknown. A very famous example of this is the scaling analysis carried out bySir Geoffrey I. Taylor, who estimated the energy released by the first atombomb, a US military top-secret information at the time, by cleverly usingthe very little information released by the US government.

The first nuclear test explosion took place on July 16th, 1945 in NewMexico. Five years afterwards the US government declassified the photo-graph data of the explosion while keeping other technical data, like the exactpower of the bomb, secret. Life magazine promptly published those pictures:they showed the expansion of the bomb’s blast in time, with a 100-metersscale for reference (see Fig. 6.3). Taylor noted that, combining the relevantphysical parameters, only two independent non-dimensional quantities canbe obtained:

$$\Pi_0 = R \left( \frac{\rho}{E t^2} \right)^{1/5} \qquad \text{and} \qquad \Pi_1 = p \left( \frac{t^6}{E^2 \rho^3} \right)^{1/5}.$$


Figure 6.3: One of the pictures of the nuclear test explosion used by Taylor to estimate the energy released by the blast.

So, in particular, he obtained that

$$R = \left( \frac{E t^2}{\rho} \right)^{1/5} f(\Pi_1).$$

For small times $\Pi_1 \approx 0$, and it is known from the analysis of smaller explosions¹⁰ that $f(0) = 1$. Taylor thus estimated¹¹ the yield of the bomb to be between 16.8 and 23.7 kilotons¹², in remarkable agreement with its actual yield of 20 kilotons.

¹⁰G. I. Taylor, Proc. Roy. Soc. London A 200, pp. 235–247 (1950).
¹¹G. I. Taylor, Proc. Roy. Soc. London A 201, pp. 175–186 (1950).
¹²A kiloton is the energy released by 10⁶ kilograms of TNT, standardized as 4.184 × 10¹² Joules.
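The estimate can be reproduced in a few lines from $\Pi_0 = 1$ (i.e. $f(0) = 1$). The fireball radius, frame time and air density below are illustrative values of ours, not Taylor's measured data, so the result only reproduces the order of magnitude:

```python
# Reproduce Taylor's blast-wave estimate E ≈ ρ R⁵ / t² (Π0 = 1 with f(0) = 1).
# R, t and ρ are illustrative assumed values, not Taylor's data points.
KILOTON = 4.184e12          # J, from footnote 12

rho_air = 1.25              # kg/m³, ambient air density (assumed)
R = 140.0                   # m, fireball radius read off a photo (assumed)
t = 0.025                   # s, time of that frame (assumed)

E = rho_air * R ** 5 / t ** 2   # energy released, in Joules
print(E / KILOTON)              # of the order of tens of kilotons
```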

6.8.3 Non-dimensionalization

When the equations governing the system are known, it is possible to write them in a way that is independent of the absolute size of the system, emphasizing instead the relative size of the various parts and parameters of the system: the equations become dependent only on some non-dimensional numbers, which fully describe the physics of the system.

As an example, let’s take the Navier-Stokes equation for a Newtonianfluid with zero bulk viscosity in a constant gravitational field gk:

ρ

(∂v

∂t(x, t) + (v(x, t) · ∇)v(x, t)

)= −∇p(x, t) + η∇2v(x, t)− δρ(x, t)gk.

We define dimensionless variables using appropriate scales:

Re ≡ ρUL

η=UL

ν; Fr ≡ U√

gL; t ≡ U

Lt; x ≡ 1

Lx;

10G. I. Taylor, Proc. Roy. Soc. London A 200, pp. 235–247 (1950).11G. I. Taylor, Proc. Roy. Soc. London A 201, pp. 175–186 (1950).12A kiloton is the energy released by 106 kilograms of TNT, standardized as 4.184×1012

Joules.

6.8. SIMILARITY AND NON-DIMENSIONALIZATION 77

v(x, t) ≡ 1

Uv(Lx,

L

Ut); p ≡ L(p− p0)

ηU; δρ ≡ 1

ρδρ;

where U is the typical velocity of the fluid, L is a fundamental size of oursystem, p0 is the hydrostatic pressure, Re is the Reynolds number and Fr isthe Froude number. We can then rewrite the N-S equation accordingly:

∂v

∂t(x, t) + (v(x, t) · ∇)v(x, t) =

1

Re(∇2v(x, t)− ∇p(x, t))− 1

Fr2δρ(x, t)k.

We see here explicitly what we have already anticipated as a consequence of Buckingham's Π-Theorem: two systems characterized by the same values of the relevant non-dimensional numbers are governed by the same non-dimensional equations, and are thus characterized by a similar behavior. In particular, the Earth's mantle is characterized by very small Reynolds ($\approx 10^{-20}$) and Froude ($\approx 10^{-13}$) numbers. Since all non-dimensional terms are of order one, this means that the left-hand side of the N-S equation is smaller than the right-hand side by many orders of magnitude, and can be safely set to zero. Dropping the left-hand side of the N-S equation means that inertia plays no role in the dynamics of the system, and the equation of motion reduces to an instantaneous balance of forces.
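With round, assumed values for the mantle's properties one can check the claim that Re and Fr are tiny. With these particular numbers Re comes out around 10⁻¹⁷; sharper choices of scales give values down to the 10⁻²⁰ quoted above, but the conclusion (inertia is negligible) is the same:

```python
# Order-of-magnitude Re and Fr for mantle convection. All parameter values
# are rough assumed numbers, chosen only to set the orders of magnitude.
rho = 4.0e3      # kg/m³, mantle density (assumed)
eta = 1.0e21     # Pa·s, mantle viscosity (assumed)
U = 1.0e-9       # m/s, a few cm per year of plate motion (assumed)
L = 3.0e6        # m, roughly the mantle thickness (assumed)
g = 10.0         # m/s²

Re = rho * U * L / eta
Fr = U / (g * L) ** 0.5

print(Re, Fr)    # both vanishingly small: inertia is negligible
```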

6.8.4 Geometric, kinematic and dynamic similarity

Two systems are geometrically similar if their domains, and thus in particular their boundaries, are geometrically similar in the usual Euclidean sense; that is, if they can be mapped into one another with rotations, translations, reflections and isotropic scalings. Let us notice here that we can scale any two geometrically similar systems into a universal non-dimensional system by taking as scaling factor the distance between any two geometrically equivalent points. The choice of these two points is completely arbitrary and doesn’t influence the geometric validity of the argument. From the point of view of the physics of the system, however, the choice of a fundamental size (the diameter if we’re dealing with the flow in a pipe, the wingspan for a plane, the mantle thickness for a planet, and so on) is usually more sensible.

Two systems are kinematically similar if, on top of being geometrically similar, the velocities measured at geometrically equivalent points in the two systems are proportional to each other. As before, by dividing the velocity field of each system by the absolute value of the velocity at some relevant point, a universal non-dimensional velocity field is obtained whose magnitude is around one.


Two systems are dynamically similar if they are subject to the same non-dimensional equations, that is, if all the relevant non-dimensional numbers take the same values in the two systems.

It is natural to inquire whether two geometrically similar problems that are kinematically similar on the boundaries give rise to kinematically similar systems. The answer is that this is true only if they are also dynamically similar. Moreover, two dynamically similar problems give rise to kinematically similar systems only if they are also geometrically similar and kinematically similar on the boundaries. From the point of view of mathematics, this is a consequence of the fact that the two dimensional problems can be reduced, through a space-time scaling, to the same non-dimensional problem: the non-dimensional solution of the same partial differential equations (dynamic similarity) in the same domain (geometric similarity) with the same boundary conditions (kinematic similarity at the boundary) can be rescaled to the correct dimensional solution via a known space-time scaling. From the point of view of physics, it means that properly tuned numerical and analog experiments are a meaningful way of studying physical systems of a different size, because they are actually the same system, just on different space- and time-scales.

6.9 Exercises

1. Derive the full form of the Cauchy strain tensor, and compare it with the approximation given by eq. (6.9). Is the approximation reasonable? Why?

2. The temperature equation for a viscous fluid in the presence of flow is given by:

ρ [ ∂/∂t (CpT) ] = −ρv · ∇(CpT) + ∇ · [k∇T] + 2ηD : D + h,

where ρ is the density, Cp is the specific heat at constant pressure, T is the temperature, v is the fluid velocity, k is the thermal conductivity, η is the viscosity, D is the deviatoric strain-rate tensor and h is the heat generation per unit volume due to radioactivity. Derive the appropriate simplified equation in the case of small Peclet (Pe = ULρCp/k) and Brinkman (Br = ηU²/(k∆T)) numbers, assuming all parameters are constant throughout the fluid.

Appendix A

Symmetry of the stress tensor

In order not to overwhelm notation, we will use here the Einstein summation convention. We will thus drop the sum symbol ∑, and the summation is implied by repeated indices. Moreover, we will use the Levi-Civita symbol to write vector products in a compact form in component notation. In three dimensions, the Levi-Civita symbol is defined as

εijk = +1 if (i, j, k) is (1, 2, 3), (3, 1, 2) or (2, 3, 1),
       −1 if (i, j, k) is (1, 3, 2), (3, 2, 1) or (2, 1, 3),
        0 otherwise (i = j or j = k or k = i),

i.e. εijk is 1 if (i, j, k) is an even permutation of (1, 2, 3), −1 if it is an odd permutation, and 0 if any index is repeated. We can thus see that, given two vectors a and b, their cross product can be written as

a × b = ∑_{i,j,k=1}^{3} εijk ei aj bk = εijk ei aj bk.

The total angular momentum per unit mass γ of every parcel of continuum is, in the most general case, composed of an orbital angular momentum l and of a spin angular momentum¹ s:

γ = l + s.

The conservation of angular momentum requires that the time derivative of the angular momentum of a parcel of continuum is equal to the sum of all the torques τ acting on the parcel:

γ̇ = l̇ + ṡ = τ. (A.1)

¹Not to be confused with the spin angular momentum of quantum mechanics.


Part of this equation is a direct consequence of Newton’s second law of motion, since we know that the time derivative of the orbital angular momentum L is equal to the moment of the sum of the external forces F:

L̇ = d/dt (x × Mv)
  = (d/dt x) × Mv + x × d/dt (Mv)
  = v × Mv + x × F
  = x × F.

We will be interested in particular in the corresponding relation between the quantities per unit mass, obtained by dividing the last equation by the total mass of the system:

l̇ = x × f. (A.2)

Now, given a parcel of continuum occupying a region R of space bounded by a surface S with outward pointing normal versor n, experiencing a traction field t = σ · n, a body-force field per unit volume b and a couple² field per unit volume c, the total torque acting on that parcel of continuum is

m = ∫_S x × (σ · n) dω + ∫_R x × b dx + ∫_R c dx

mi = ∫_S εijk xj σkl nl dω + ∫_R εijk xj bk dx + ∫_R ci dx
   = ∫_R εijk ∂l (xj σkl) dx + ∫_R εijk xj bk dx + ∫_R ci dx,

where we have used Gauß theorem to transform the surface integral into a volume integral. Since the region of integration is infinitesimally small, we can approximate each integral with the product of the volume V of the region of integration times the integrand evaluated at some arbitrary point inside the region, obtaining

mi = εijk ∂l (xj σkl) V + εijk xj bk V + ci V
   = εijk ∂l (xj) σkl M/ρ + εijk xj ∂l (σkl) M/ρ + εijk xj bk M/ρ + ci M/ρ
   = εijk σkj M/ρ + εijk xj ∂l (σkl) M/ρ + εijk xj bk M/ρ + ci M/ρ,

²A couple is a system of forces with a resultant torque but no resultant force. A common example of a couple is the effect of a magnetic field B on a ferromagnetic suspension.


where in the last equality we have used the fact that ∂l xj = δlj. Dividing the last equation by the total mass M and rearranging the addends, we obtain that the total torque per unit mass acting on the parcel of continuum is

τi = (1/ρ) εijk xj (∂l (σkl) + bk) + (1/ρ) εijk σkj + (1/ρ) ci. (A.3)

This equation shows that the total torque is given by the sum of the moment of the total force (x × f)i = (1/ρ) εijk xj (∂l (σkl) + bk), the moment due to the antisymmetric part of the stress tensor (1/ρ) εijk σkj, and the couple field (1/ρ) ci.

Using equation (A.3) to eliminate τ in equation (A.1) yields

l̇i + ṡi = (1/ρ) εijk xj (∂l (σkl) + bk) + (1/ρ) εijk σkj + (1/ρ) ci,

and using now equation (A.2) to remove that part of the conservation of angular momentum that is a direct consequence of Newton’s second law of motion yields

ṡi = (1/ρ) εijk σkj + (1/ρ) ci.

The right hand side of the equation is independent of the origin of coordinates, so the left hand side must be too. The left hand member was not an orbital angular momentum by construction, and the name of spin angular momentum has indeed been properly given. If the continuum is in equilibrium, ṡi = 0, we obtain that

εijk σkj = −ci,

and the stress tensor is symmetric only if no external couple field is applied. If the continuum is out of equilibrium, the symmetry of the stress tensor corresponds to the absence of external couple fields together with the absence of an intrinsic spin of the continuum.

Appendix B

Solutions

B.1 Linear algebra

1. Since we are dealing with a three-dimensional vector space, any combination of more than three vectors is bound to be linearly dependent. Instead, the determinants of the square matrices obtained by piecing together any three of the four vectors are always non-zero.

2. Using the natural basis B = {1, t, t²} to represent the vectors we obtain: w = (1, 0, 0), x = (1, 1, 0), y = (0, 1, 1) and z = (15, −23, √2). The argument is the same as in the previous exercise.

3. Let’s take three coefficients α, β and γ and let’s suppose that they give a null linear combination:

α(x + y) + β(x + z) + γ(y + z) = 0.

By recombining the terms on the left-hand side we obtain

(α + β)x + (α + γ)y + (β + γ)z = 0.

This linear combination, by hypothesis, can hold if and only if all three combinations (α + β), (α + γ) and (β + γ) are zero, which in turn implies that all three coefficients are separately zero.

4. The determinant of the matrix

( 1 + η   1 − η )
( 1 − η   1 + η )

is (1 + η)² − (1 − η)² = 4η, which is zero if and only if η = 0.


5. No: a basis, i.e. a maximally large set of linearly independent vectors, has a number of vectors that is equal to the dimension of the space.

6. They can never form a basis, since no set of two vectors can be a generator of a three-dimensional space.

7. The solution is analogous to those of ex. 1 and 2.

8. The linear functionals are w and z.

9. Yes, it must exist: since f is non-zero, then by definition there is a vector y ∈ V such that 〈f, y〉 = b, with b ≠ 0. Then, by the linearity of f, it follows that x = (a/b) y is the desired vector.

10.

A =
( 0 1 0 )
( 0 0 1 )
( 1 0 0 )

11. It is true for cases b) and c).

12. The rows and the columns of the matrix are permuted accordingly (e.g. the first column is switched with the second and the first row is switched with the second).

13. a) The three vectors x, y and z are orthonormal if and only if

x · x = y · y = z · z = 1,
x · y = y · z = z · x = 0.

We can check that

x · x = (1/√7)² ((√3)² + 1² + (√3)²) = 1,
y · y = (1/(2√7))² (4² + (−√3)² + (−3)²) = 1,
z · z = (1/2)² (0² + (√3)² + (−1)²) = 1,
x · y = (1/√7) · (1/(2√7)) (√3 · 4 + 1 · (−√3) + √3 · (−3)) = 0,
y · z = (1/(2√7)) · (1/2) (4 · 0 + (−√3) · √3 + (−3) · (−1)) = 0,
z · x = (1/2) · (1/√7) (0 · √3 + √3 · 1 + (−1) · √3) = 0.

The set B = {x, y, z} is thus an orthonormal basis.
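These computations are easy to cross-check numerically; the sketch below stacks the three vectors of the exercise as columns and verifies that BᵀB is the identity:

```python
import numpy as np

s3, s7 = np.sqrt(3.0), np.sqrt(7.0)
x = np.array([s3, 1.0, s3]) / s7
y = np.array([4.0, -s3, -3.0]) / (2.0 * s7)
z = np.array([0.0, s3, -1.0]) / 2.0

# B^T B equals the identity if and only if the columns form an orthonormal set.
B = np.column_stack([x, y, z])
print(np.round(B.T @ B, 12))
```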

b) Unfortunately, we only have the action of the linear function a on the basis B, written with respect to the natural basis

N = { (1, 0, 0), (0, 1, 0), (0, 0, 1) },

thus the juxtaposition of the three vectors a(x)N, a(y)N and a(z)N is a mixed matrix. It acts on vectors whose coordinates are written with respect to the basis B, and yields vectors whose coordinates are written with respect to the natural basis N:

NAB =
( √3/√7    1/(4√7)    −3/4 )
( 1/√7    5√3/(4√7)    √3/4 )
( √3/√7   −3/(2√7)     1/2  )

However, we can readily write the matrix of change of basis from B to N, that is the juxtaposition of the three basis vectors of B:

NMB =
( √3/√7    2/√7        0   )
( 1/√7    −√3/(2√7)   √3/2 )
( √3/√7   −3/(2√7)   −1/2  )

and using Gauß-Jordan elimination we compute its inverse

(NMB)⁻¹ = BMN =
( √3/√7    1/√7       √3/√7   )
( 2/√7    −√3/(2√7)  −3/(2√7) )
( 0        √3/2      −1/2     )

The requested matrix NAN is thus given by the matrix product

NAN = NAB · BMN =
( 1/2    −√3/4    3/4  )
( √3/2    1/4    −√3/4 )
( 0       √3/2    1/2  )

c) From the previous exercise we already have the matrices NMB and BMN. The matrix BAB is thus given by the matrix product

BAB = BMN · NAB =
( 1     0        0      )
( 0     1/8     −3√7/8  )
( 0     3√7/8    1/8    )


Written in this form, it is apparent that the function a rotates the space around the direction given by the vector x counterclockwise by an angle θ = arccos(1/8) = arcsin(3√7/8) ≈ 83°.

14. (a) and (b)

15. Let’s take two elements M1 and M2 of V and two scalars α and β:

M1 = ( a1 b1 ; c1 d1 ),   a1 + b1 + c1 = 0,
M2 = ( a2 b2 ; c2 d2 ),   a2 + b2 + c2 = 0,

and let’s check that αM1 + βM2 is still an element of V:

αM1 + βM2 = ( αa1 + βa2   αb1 + βb2 ; αc1 + βc2   αd1 + βd2 ),

(αa1 + βa2) + (αb1 + βb2) + (αc1 + βc2) = α(a1 + b1 + c1) + β(a2 + b2 + c2) = 0.

A possible basis is

B1 = ( 1 0 ; −1 0 );   B2 = ( 0 1 ; −1 0 );   B3 = ( 0 0 ; 0 1 ).

16.

wA = (−2, −5), xA = (3, 4), yA = (14, 21) and zA = (0, −1);
wB = (−1, 0), xB = (1/3)(8, 7), yB = (1/3)(35, 28) and zB = (1/3)(1, 2);
wC = (1, 2), xC = (1/2)(−3, 1), yC = (−7, 0) and zC = (0, 1).

17. For this particular matrix, A³ = A. Compare this to the solutions of the same equation in the real and complex fields.

18. detA = 3k − 1, thus A is invertible if k ≠ 1/3.

A⁻¹ =
(  1            0           0           0            )
( −k/(3k−1)     1/(3k−1)    k/(3k−1)    (k−1)/(3k−1) )
( −1/(3k−1)     3/(3k−1)    1/(3k−1)   −2/(3k−1)     )
( −2k/(3k−1)    2/(3k−1)    2k/(3k−1)  −(k−1)/(3k−1) )


19. detB = detC = 1, so they both form a basis. Moreover, we have that

v1 = −1w1 + 0w2 + 0w3,
v2 = 0w1 − 3w2 + 5w3,
v3 = −1w1 − 1w2 + 2w3.

So, the change of basis matrix from the basis B to the basis C is

CMB =
( −1    0   −1 )
(  0   −3   −1 )
(  0    5    2 )

20.

B2MB1 = ( 1 1 1 ; 0 1 1 ; 1 1 2 ),   B3MB1 = ( 1 1 0 ; 1 2 1 ; 1 1 1 ),
B4MB1 = ( 2 1 1 ; 1 1 1 ; 1 0 1 ),   B3MB2 = ( 2 0 −1 ; 2 1 −1 ; 1 0 0 ),
B4MB2 = ( 2 −1 0 ; 1 0 0 ; 0 −1 1 ),   B4MB3 = ( 1 −1 2 ; 0 0 1 ; 0 −1 2 ).

21.

F1 = ( 1 1 ; 1 2 ),   F2 = ( 1 1 ; 0 2 ),   F3 = ( 2/3  −5 ; −1/9  7/3 ).

22. We can advantageously consider A as the natural basis. In this case, we can easily find that

AMB =
( 1 0 0 0 )
( 1 1 0 0 )
( 1 1 1 0 )
( 1 1 1 1 )

Computing its inverse gives us

BMA =
(  1   0   0   0 )
( −1   1   0   0 )
(  0  −1   1   0 )
(  0   0  −1   1 )


B.2 Derivative

1. f(x, y) and g(x, y) are differentiable, while h(x, y) is not: let’s consider for example ∂x h(x, y):

∂x h(x, y) = y²/(x² + y²) − 2x²y²/(x² + y²)².

It is not continuous in (0, 0), as can be seen by taking the limit from the two different coordinate directions:

lim_{x=0, y→0} ∂x h(x, y) = lim_{y→0} ( y²/(0² + y²) − 2 · 0² · y²/(0² + y²)² ) = lim_{y→0} (1 − 0) = 1,

lim_{y=0, x→0} ∂x h(x, y) = lim_{x→0} ( 0²/(x² + 0²) − 2x² · 0²/(x² + 0²)² ) = lim_{x→0} (0 − 0) = 0.

2. By applying the chain rule we obtain:

∂g(f(v))/∂v1 = (∂g(f)/∂f1)(∂f1(v)/∂v1) + (∂g(f)/∂f2)(∂f2(v)/∂v1) = (∂g(f)/∂f1)(3) + (∂g(f)/∂f2)(−1),

∂g(f(v))/∂v2 = (∂g(f)/∂f1)(∂f1(v)/∂v2) + (∂g(f)/∂f2)(∂f2(v)/∂v2) = (∂g(f)/∂f1)(−2) + (∂g(f)/∂f2)(5).

3. Computing the mixed second derivatives via the common rules for simple functions yields a symmetric result:

(x − y)(x + y)(x⁴ + 10x²y² + y⁴) / (x² + y²)³.

Explicit computation of the directional derivatives gives us instead:

∂f/∂x |(0,y) = −y   ⇒   ∂²f/∂y∂x |(0,0) = −1,
∂f/∂y |(x,0) = x    ⇒   ∂²f/∂x∂y |(0,0) = 1.

The mixed second derivatives are not continuous at (0, 0), thus Schwarz’ theorem doesn’t hold.


B.3 Integration in Rn

1. A possible form of the iterated integral is:

∫_{−1}^{1} [ ∫_{−√(1−y²)}^{√(1−y²)} xy/(x² + y²) dx ] dy.

The solution of the inner integral is

(1/2) y log(x² + y²) |_{−√(1−y²)}^{√(1−y²)} = 0,

and thus the total integral is zero.

2. Using polar coordinates the domain of integration is simply 0 < r ≤ √3, 0 < θ ≤ (3/4)π. The iterated integral takes the form

∫_{0}^{√3} [ ∫_{0}^{(3/4)π} r sin θ · r dθ ] dr = 3^{3/2} (2 + √2)/6 = √3 + √6/2.