
Basics of space and vectors

Points and distance

One way to describe our position in three-dimensional space is using Cartesian coordinates $(x, y, z)$, where we have fixed three orthogonal directions and we move $x$ units in the first direction, $y$ units in the second direction, and $z$ units in the third direction.

The $x$-axis consists of points of the form $(x, 0, 0)$, the $y$-axis consists of points of the form $(0, y, 0)$ and the $z$-axis consists of points of the form $(0, 0, z)$. The $xy$-plane consists of points of the form $(x, y, 0)$, the $xz$-plane consists of points of the form $(x, 0, z)$ and the $yz$-plane consists of points of the form $(0, y, z)$. You should be able to sketch a picture of three-dimensional space and mark each one of these axes and planes.

Once we can describe position, the next step is to measure (straight-line) distance. In two dimensions we can use the Pythagorean Theorem to get the distance between points $(x_1, y_1)$ and $(x_2, y_2)$ to be

$$D = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$

On the other hand, for three dimensions we will use the Pythagorean Theorem (twice) to get the distance between points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ to be

$$D = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$

A sphere is the set of points which are a fixed distance, $r$, away from a central point, $(h, k, \ell)$. Using the distance formula (where we conveniently square to get rid of the inconvenient square roots) we have that a sphere is the set of points satisfying

$$(x - h)^2 + (y - k)^2 + (z - \ell)^2 = r^2.$$

Sometimes we will not have a sphere given to us in this form, in which case we should rewrite it (e.g., by completing the square). If we want to have all the points in a solid sphere then we have

$$(x - h)^2 + (y - k)^2 + (z - \ell)^2 \le r^2;$$

the volume of this solid sphere is $\frac{4}{3}\pi r^3$.

The midpoint between $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ is the point

$$\left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}, \frac{z_1 + z_2}{2}\right).$$
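As a quick numerical check of the distance and midpoint formulas, here is a minimal Python sketch. The function names `distance` and `midpoint` and the sample points are illustrative choices, not part of the notes.

```python
import math

def distance(p, q):
    """Straight-line distance between two points of the same dimension."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

def midpoint(p, q):
    """Coordinate-wise average of two points."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

# Sample points chosen so the differences are (3, 4, 12).
p, q = (1, 2, 3), (4, 6, 15)
print(distance(p, q))   # 13.0
print(midpoint(p, q))   # (2.5, 4.0, 9.0)
```

The same two functions work unchanged for points in the plane, since they only loop over whatever coordinates are supplied.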

We can describe the motion of a particle by describing its position at each time $t$. This gives us parametric equations $\big(x(t), y(t), z(t)\big)$. We can take a parametric equation and ask how long the curve is. This can be found by splitting the curve into tiny little pieces, using the distance formula on each piece, and adding them back up (i.e., doing integral calculus). The limit of this process gives us

$$\text{length} = \int_a^b \sqrt{\big(x'(t)\big)^2 + \big(y'(t)\big)^2 + \big(z'(t)\big)^2}\,dt.$$
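The "split into tiny pieces and add up" idea can be tried directly. The sketch below approximates the length of one turn of a helix by a polyline and compares it with the value $2\pi\sqrt{2}$ given by the integral. The helix, the helper name `polyline_length`, and the number of pieces are assumptions made for illustration.

```python
import math

def polyline_length(curve, a, b, n=10_000):
    """Approximate arc length by summing n short straight segments."""
    total = 0.0
    prev = curve(a)
    for i in range(1, n + 1):
        t = a + (b - a) * i / n
        cur = curve(t)
        total += math.dist(prev, cur)  # distance formula on one tiny piece
        prev = cur
    return total

def helix(t):
    """Sample curve (cos t, sin t, t); its speed is sqrt(2) for every t."""
    return (math.cos(t), math.sin(t), t)

approx = polyline_length(helix, 0.0, 2 * math.pi)
exact = 2 * math.pi * math.sqrt(2)  # integral of the constant speed sqrt(2)
print(approx, exact)
```

Increasing `n` makes the pieces shorter and the polyline sum closer to the integral.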

Vectors

Sometimes we are interested in quantities that have both a length and a direction (e.g., velocity or force). We will refer to these as vectors. Most of our discussion will be about three-dimensional vectors, but most ideas generalize to all dimensions. We note that vectors are not tied to specific points, i.e., they can be translated to any other location and still be an equivalent vector.

Geometrically a vector is a directed line segment. We can add vectors by chaining them one after another, and we can scale vectors by changing the length (note that scaling by a negative number reverses the direction of the vector; this allows for subtraction of vectors).

For our purposes it will be convenient to work with a vector in terms of quantities, i.e., algebraically. This is done by writing the vector in component form,

$$\vec{u} = \langle a, b, c \rangle,$$

where $a$ is the amount of change in the $x$ direction, $b$ is the amount of change in the $y$ direction and $c$ is the amount of change in the $z$ direction. As an example, the vector going from the point $(x_1, y_1, z_1)$ to the point $(x_2, y_2, z_2)$ is $\langle x_2 - x_1, y_2 - y_1, z_2 - z_1 \rangle$.

In component form we can easily add and scale vectors, working component by component, i.e.,

$$\langle a, b, c \rangle + \langle d, e, f \rangle = \langle a + d, b + e, c + f \rangle, \qquad k\langle a, b, c \rangle = \langle ka, kb, kc \rangle.$$

The magnitude of a vector (i.e., the length) can be found in component form by translating the vector so that the tail is at the origin and looking at the distance between the tip of the vector and the origin. In particular we have

$$\big\|\langle a, b, c \rangle\big\| = \sqrt{a^2 + b^2 + c^2}.$$

A vector is a unit vector if it has length 1. Any vector that is not the zero vector ($\mathbf{0} = \langle 0, 0, 0 \rangle$) can be scaled to a unit vector by dividing by its magnitude, i.e., $\vec{u}/\|\vec{u}\|$. This will be used whenever we want to talk about something happening in a particular direction.

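Three important unit vectors are $\mathbf{i} = \langle 1, 0, 0 \rangle$, $\mathbf{j} = \langle 0, 1, 0 \rangle$ and $\mathbf{k} = \langle 0, 0, 1 \rangle$. These are known as the standard unit vectors and we can rewrite our vectors as combinations of these three vectors, i.e.,

$$\langle a, b, c \rangle = \langle a, 0, 0 \rangle + \langle 0, b, 0 \rangle + \langle 0, 0, c \rangle = a\langle 1, 0, 0 \rangle + b\langle 0, 1, 0 \rangle + c\langle 0, 0, 1 \rangle = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}.$$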
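The component-wise rules for adding, scaling, and normalizing vectors translate almost word for word into code. This is a minimal sketch using plain tuples; the helper names (`add`, `scale`, `magnitude`, `unit`) are hypothetical and chosen only for this illustration.

```python
import math

def add(u, v):
    """Add two vectors component by component."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    """Multiply every component of u by the number k."""
    return tuple(k * a for a in u)

def magnitude(u):
    """Length of u: distance from the origin to the tip."""
    return math.sqrt(sum(a * a for a in u))

def unit(u):
    """Scale a nonzero vector to length 1 by dividing by its magnitude."""
    m = magnitude(u)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1 / m, u)

u = (2, -3, 6)
print(magnitude(u))                   # 7.0
print(add((1, 2, 3), (4, 5, 6)))      # (5, 7, 9)
print(scale(2, (1, 2, 3)))            # (2, 4, 6)
print(unit(u))
```

Subtraction needs no separate helper: `add(u, scale(-1, v))` reverses `v` and chains it onto `u`.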

Dot product

While we can conveniently add and scale vectors there is no convenient way to multiply vectors together. There are two approaches that act like multiplication; we start with the first, called the dot product. We have “(vector) · (vector) = number”. In other words the dot product takes two vectors and produces a number. In two and three dimensions this behaves as follows (this works similarly in other dimensions):

$$\langle a_1, b_1 \rangle \cdot \langle a_2, b_2 \rangle = a_1 a_2 + b_1 b_2, \qquad \langle a_1, b_1, c_1 \rangle \cdot \langle a_2, b_2, c_2 \rangle = a_1 a_2 + b_1 b_2 + c_1 c_2.$$

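With this definition it is easy to establish some basic facts that make it look like multiplication, e.g.,

$$\vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u}, \qquad \vec{u} \cdot \mathbf{0} = 0, \qquad \vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}.$$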

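What makes the dot product useful is the geometric interpretation. Let $\vec{u} = \langle a, b, c \rangle$; then we have

$$\vec{u} \cdot \vec{u} = \langle a, b, c \rangle \cdot \langle a, b, c \rangle = a^2 + b^2 + c^2 = \|\vec{u}\|^2$$

(the same result holds in other dimensions as well). Combining this with the law of cosines we get the following, where $\theta$ is the angle between $\vec{u}$ and $\vec{v}$:

$$\vec{u} \cdot \vec{v} = \|\vec{u}\|\,\|\vec{v}\| \cos\theta \qquad \text{or} \qquad \cos\theta = \frac{\vec{u} \cdot \vec{v}}{\|\vec{u}\|\,\|\vec{v}\|}.$$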

This allows us to find angles between vectors. One special angle between vectors is a right angle ($90^\circ$ or $\frac{\pi}{2}$). For such an angle we have that $\cos\theta = 0$ and this leads us to an easy test of whether two vectors are at right angles to each other. Namely, two vectors are orthogonal or perpendicular if $\vec{u} \cdot \vec{v} = 0$; this follows the convention that the zero vector is perpendicular to every other vector. On a side note, we say that two vectors are parallel if they are scalar multiples of one another (this includes the possibility of reversing direction).

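This can be used to find the projection of one vector onto another, i.e., “$\operatorname{proj}_{\vec{v}}\vec{u}$”, which is the vector $\vec{u}$ projected down onto the vector $\vec{v}$. We have

$$\operatorname{proj}_{\vec{v}}\vec{u} = \left(\frac{\vec{u} \cdot \vec{v}}{\vec{v} \cdot \vec{v}}\right)\vec{v}.$$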

Similarly, given a force vector $\mathbf{F}$ and a displacement vector $\mathbf{D}$ along which the force moves an object, the work is $w = \mathbf{F} \cdot \mathbf{D}$.
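The dot product facts above (angle, orthogonality test, projection, work) can be collected in a few lines of Python. This is a sketch under stated assumptions: vectors are plain tuples, the helper names are made up for the example, and the cosine is clamped to $[-1, 1]$ only to guard against floating-point round-off.

```python
import math

def dot(u, v):
    """Sum of products of matching components."""
    return sum(a * b for a, b in zip(u, v))

def angle(u, v):
    """Angle between two nonzero vectors, from cos(theta) = u.v / (|u||v|)."""
    c = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against round-off

def proj(u, onto):
    """Projection of u onto the nonzero vector `onto`."""
    k = dot(u, onto) / dot(onto, onto)
    return tuple(k * c for c in onto)

print(angle((1, 0, 0), (1, 1, 0)))   # close to pi/4
print(dot((1, 2, 3), (-2, 1, 0)))    # 0, so these two are orthogonal
print(proj((3, 4, 0), (1, 0, 0)))    # (3.0, 0.0, 0.0)
print(dot((2, 0, 1), (3, 4, 5)))     # work done by F = <2,0,1> along D = <3,4,5>: 11
```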

Planes will play an important role in our class. We think of planes as our generalization of lines, and when we found lines we needed a point and a slope, which we can think of as a point and a direction. When we find a plane we will similarly need a point and a direction (which we can use a vector for). But for direction we don’t want to use a vector in the plane because there are many possible directions. Instead we will use a vector that is perpendicular to the plane, what we call the normal vector. In particular the normal vector $\mathbf{n}$ is perpendicular to every vector in the plane.

Given a point $(x_0, y_0, z_0)$ and $\mathbf{n} = \langle a, b, c \rangle$, a point $(x, y, z)$ is in the plane if and only if the vector $\langle x - x_0, y - y_0, z - z_0 \rangle$ (which is a vector in the plane) is orthogonal to $\mathbf{n}$, i.e.,

$$0 = \mathbf{n} \cdot \langle x - x_0, y - y_0, z - z_0 \rangle = a(x - x_0) + b(y - y_0) + c(z - z_0),$$

or rearranging

$$ax + by + cz = \underbrace{ax_0 + by_0 + cz_0}_{=\,d} = d.$$

In this last form it is easy to read off the normal vector to the plane. In general when we are dealing with planes we will be working with normal vectors. So two planes are parallel when the normal vectors are parallel, the angle between planes is the angle between normal vectors, and so on.
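To make the point-and-normal description concrete, the sketch below builds the coefficients of $ax + by + cz = d$ from a point and a normal vector and tests whether other points lie on the plane. The function names and the tolerance are illustrative assumptions.

```python
def plane_through(point, normal):
    """Return (a, b, c, d) for the plane a*x + b*y + c*z = d."""
    a, b, c = normal
    x0, y0, z0 = point
    d = a * x0 + b * y0 + c * z0  # n . <x0, y0, z0>
    return a, b, c, d

def on_plane(plane, p, tol=1e-12):
    """True when p satisfies the plane equation up to a small tolerance."""
    a, b, c, d = plane
    return abs(a * p[0] + b * p[1] + c * p[2] - d) <= tol

plane = plane_through((1, 2, 3), (2, -1, 4))
print(plane)                        # (2, -1, 4, 12)
print(on_plane(plane, (6, 0, 0)))   # True
print(on_plane(plane, (0, 0, 0)))   # False
```

Reading the first three entries of the returned tuple gives back the normal vector, which matches the remark that the normal is easy to read off from this form.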

Cross product

The cross product, denoted with “×”, gives a way to multiply vectors together and get a new vector, i.e., “(vector) × (vector) = (vector)”. But there is a big catch, namely it only works in three dimensions (conveniently though we live in three dimensions so we can tolerate this limitation). We have

$$\langle a_1, a_2, a_3 \rangle \times \langle b_1, b_2, b_3 \rangle = \langle a_2 b_3 - a_3 b_2,\; a_3 b_1 - a_1 b_3,\; a_1 b_2 - a_2 b_1 \rangle.$$

This can be hard to remember; a more convenient form is to write it as a determinant of a particular matrix, namely we have

$$\langle a_1, a_2, a_3 \rangle \times \langle b_1, b_2, b_3 \rangle = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix},$$

i.e., the first row is the standard unit vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$; the second row is the entries of the first vector and the third row is the entries of the second vector. There are several ways to take determinants. One way is to copy the first two columns over again, then multiply along the six diagonals, add the ones where the diagonals go down from left to right, and subtract the ones where the diagonals go up from left to right. Another alternative is to use cofactor expansions, i.e., we have

$$\langle a_1, a_2, a_3 \rangle \times \langle b_1, b_2, b_3 \rangle = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \mathbf{i}\begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} - \mathbf{j}\begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} + \mathbf{k}\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix},$$

where to take the determinant of a two-by-two matrix we have

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - cb.$$

With this rule in place it is easy to check that it agrees with the first definition given above.

The reason that the cross product is useful is that we have a geometric interpretation of what is going on. A vector has two parts, a direction and a magnitude. For the cross product we can describe what is going on for each of these parts. We have the following:

• Direction: $(\vec{u} \times \vec{v}) \perp \vec{u}$ and $(\vec{u} \times \vec{v}) \perp \vec{v}$

• Magnitude: $\|\vec{u} \times \vec{v}\| = \|\vec{u}\|\,\|\vec{v}\| \sin\theta$

The first property tells us that the cross product is perpendicular to the vectors we started with. This is not enough to nail down the direction, so we also insist that it obeys the right-hand rule (not important for us). This is one of the most useful properties of the cross product and we can use it to find normal vectors to planes: the normal vector is perpendicular to the plane, so if we can find two non-parallel vectors in the plane we can take their cross product and get a normal vector.

This also gives us a method to test whether we did the cross product correctly, i.e., we must have

$$\vec{u} \cdot (\vec{u} \times \vec{v}) = 0 \qquad \text{and} \qquad \vec{v} \cdot (\vec{u} \times \vec{v}) = 0.$$

This is fast to compute and helps us identify if we have made a mistake in taking the cross product.

The second property, the magnitude, has a geometric interpretation. Namely, this says that the magnitude of the cross product is the area of the parallelogram formed by the vectors $\vec{u}$ and $\vec{v}$. By cutting the parallelogram in half we can get the area of the triangle with two sides formed by $\vec{u}$ and $\vec{v}$, i.e.,

$$\text{Area of } \Delta = \frac{1}{2}\|\vec{u} \times \vec{v}\|.$$

We can also go one step further and determine that the volume of the parallelepiped formed by the three vectors $\vec{u}$, $\vec{v}$ and $\vec{w}$ is $|\vec{w} \cdot (\vec{u} \times \vec{v})|$.
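The component formula, the two orthogonality checks, and the area and volume interpretations can all be exercised in a short Python sketch. The helper names and sample vectors are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two three-dimensional vectors, by the component formula."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triangle_area(u, v):
    """Half the parallelogram area |u x v|."""
    c = cross(u, v)
    return 0.5 * math.sqrt(dot(c, c))

def parallelepiped_volume(u, v, w):
    """Volume |w . (u x v)| of the box spanned by u, v and w."""
    return abs(dot(w, cross(u, v)))

u, v = (1, 2, 3), (4, 5, 6)
n = cross(u, v)
print(n)                      # (-3, 6, -3)
print(dot(u, n), dot(v, n))   # 0 0, the sanity check for a cross product
print(triangle_area((1, 0, 0), (0, 1, 0)))                     # 0.5
print(parallelepiped_volume((1, 0, 0), (0, 1, 0), (0, 0, 2)))  # 2
```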

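We have (among others) the following rules for the cross product; notice in particular that order makes a difference in the sign:

$$\vec{u} \times \vec{v} = -(\vec{v} \times \vec{u})$$

$$\vec{u} \times (\vec{v} + \vec{w}) = \vec{u} \times \vec{v} + \vec{u} \times \vec{w}$$

$$\vec{u} \times \mathbf{0} = \mathbf{0}, \qquad \vec{u} \times \vec{u} = \mathbf{0}$$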

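Combining the distributive property with how the cross product works for the standard unit vectors also gives a way to compute the cross product, i.e.,

$$\begin{array}{lll}
\mathbf{i} \times \mathbf{i} = \mathbf{0} & \mathbf{j} \times \mathbf{i} = -\mathbf{k} & \mathbf{k} \times \mathbf{i} = \mathbf{j} \\
\mathbf{i} \times \mathbf{j} = \mathbf{k} & \mathbf{j} \times \mathbf{j} = \mathbf{0} & \mathbf{k} \times \mathbf{j} = -\mathbf{i} \\
\mathbf{i} \times \mathbf{k} = -\mathbf{j} & \mathbf{j} \times \mathbf{k} = \mathbf{i} & \mathbf{k} \times \mathbf{k} = \mathbf{0}
\end{array}$$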

Vector valued functions

A vector-valued function is a function which produces a vector for the output. Since a vector can be broken down into its component parts, this function can be described by how it is behaving in each component, i.e.,

$$\mathbf{F}(t) = \langle f(t), g(t), h(t) \rangle = f(t)\mathbf{i} + g(t)\mathbf{j} + h(t)\mathbf{k}.$$

We can ask calculus questions about these functions, i.e., limits, derivatives, antiderivatives and definite integrals. The key is that we will define these items in the same way, and since we know that vector addition and scaling work component by component and calculus in the end boils down to fancy additions and scaling, we conclude that it suffices to work component by component (to be fair this should be carefully checked; but we will leave those details for future courses). More particularly we have the following:

$$\lim_{t \to c} \mathbf{F}(t) = \left\langle \lim_{t \to c} f(t),\; \lim_{t \to c} g(t),\; \lim_{t \to c} h(t) \right\rangle$$

$$\mathbf{F}'(t) = \big\langle f'(t),\, g'(t),\, h'(t) \big\rangle$$

$$\int \mathbf{F}(t)\,dt = \left\langle \int f(t)\,dt,\; \int g(t)\,dt,\; \int h(t)\,dt \right\rangle$$

$$\int_a^b \mathbf{F}(t)\,dt = \left\langle \int_a^b f(t)\,dt,\; \int_a^b g(t)\,dt,\; \int_a^b h(t)\,dt \right\rangle$$

On a side note, when finding the antiderivative of a vector-valued function we will have a separate constant for each component. Of course we can combine these constants into a single vector and conclude that two antiderivatives of a vector-valued function differ by a constant vector. The various derivative rules that we have learned for real-valued functions also generalize as we would expect.

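$$\frac{d}{dt}\big(\mathbf{F}(t) + \mathbf{G}(t)\big) = \mathbf{F}'(t) + \mathbf{G}'(t)$$

$$\frac{d}{dt}\big(\mathbf{F}(p(t))\big) = \mathbf{F}'\big(p(t)\big)\,p'(t)$$

$$\frac{d}{dt}\big(p(t)\mathbf{F}(t)\big) = p'(t)\mathbf{F}(t) + p(t)\mathbf{F}'(t)$$

$$\frac{d}{dt}\big(\mathbf{F}(t) \cdot \mathbf{G}(t)\big) = \mathbf{F}'(t) \cdot \mathbf{G}(t) + \mathbf{F}(t) \cdot \mathbf{G}'(t)$$

$$\frac{d}{dt}\big(\mathbf{F}(t) \times \mathbf{G}(t)\big) = \mathbf{F}'(t) \times \mathbf{G}(t) + \mathbf{F}(t) \times \mathbf{G}'(t)$$

Amazingly all three of the different ways to take products involving vectors (scalar, dot and cross) obey the same product rule as before (for the cross product the order of the factors must be kept as written, since swapping them changes the sign). Woohoo!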

As an application, suppose we have a parametric curve $\big(x(t), y(t), z(t)\big)$. Then we can think of this as a vector-valued function by putting a vector from the origin to the location of the particle at time $t$, i.e., $\mathbf{r}(t) = \big\langle x(t), y(t), z(t) \big\rangle$. Then we have that $\mathbf{v}(t) = \mathbf{r}'(t)$ represents the velocity of the particle at time $t$ and $\mathbf{a}(t) = \mathbf{v}'(t) = \mathbf{r}''(t)$ equals the acceleration of the particle at time $t$. This can be used to describe the behavior of the particle (i.e., location, direction of motion, acceleration, and so on). By taking antiderivatives we can also find position given some information about velocity and/or acceleration.

Note that the magnitude of velocity is speed, and if we want to find the distance traveled by the particle from $t = a$ to $t = b$ we can do this by finding the integral of speed. So we have

$$\text{Length} = \int_a^b \|\mathbf{v}(t)\|\,dt = \int_a^b \|\mathbf{r}'(t)\|\,dt.$$

By expanding this we see we get the same expression that we had before for length.

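We note that $\mathbf{r}(t) \perp \mathbf{r}'(t)$ for all $t$ if and only if $\|\mathbf{r}(t)\|$ is a constant (differentiate $\mathbf{r} \cdot \mathbf{r} = \|\mathbf{r}\|^2$ to get $2\,\mathbf{r} \cdot \mathbf{r}'$); we will use this later.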

Lines

To find a line we will need two items: a point on the line, $(x_0, y_0, z_0)$; and a direction vector for the line, $\langle a, b, c \rangle$. Once we have these then we can write an equation for the line. There are three different ways we have discussed to write such an equation.

The first is vector format, where we find all the points (in vector form), which can be done by starting at the given point and adding some multiple of the direction vector, i.e.,

$$\langle x, y, z \rangle = \langle x_0, y_0, z_0 \rangle + t\langle a, b, c \rangle.$$

Solving for each component we get parametric form, i.e., a parametric equation for the line:

$$x = x_0 + at, \qquad y = y_0 + bt, \qquad z = z_0 + ct.$$

If we solve for $t$ in parametric form we can get an expression for the line only in terms of $x$, $y$ and $z$, known as symmetric form, i.e.,

$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}.$$

This assumes that $a, b, c \neq 0$; if one of them is 0 then we simply write this as a combination of equations. For example if $a = 0$ then the line would be

$$x = x_0 \qquad \text{and} \qquad \frac{y - y_0}{b} = \frac{z - z_0}{c}.$$

In each of the forms given above it is easy to find a point on the line and the direction vector of the line. We note that these equations are not unique for lines, i.e., we can choose a different point or choose a parallel vector and still have the same line.

Given a parametric curve $\big(x(t), y(t), z(t)\big)$ we can transform it to a vector-valued function by drawing a vector from the origin to the current location on the parametric curve, i.e., $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$. Recall that velocity is given by $\mathbf{v}(t) = \mathbf{r}'(t)$. Velocity encodes both the speed of the particle and the current direction in which the particle is moving.

If we imagine the particle as traveling along a track and at some time $t_0$ it derails from the track, then how does the particle move? Well, it starts at the point it derailed at, $\mathbf{r}(t_0)$, and it will move in the direction given by velocity, $\mathbf{r}'(t_0)$. This movement will thus lie on the line containing the point $\mathbf{r}(t_0)$ with direction vector $\mathbf{r}'(t_0)$. This is known as the tangent line.
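The tangent line is just the vector form of a line with point $\mathbf{r}(t_0)$ and direction $\mathbf{r}'(t_0)$. The following sketch builds it for a sample helix; the curve, its hand-computed derivative, and the helper name `tangent_line` are assumptions for illustration.

```python
import math

def tangent_line(r, r_prime, t0):
    """Return a function s -> r(t0) + s * r'(t0), the tangent line at t0."""
    point = r(t0)
    direction = r_prime(t0)

    def line(s):
        return tuple(p + s * d for p, d in zip(point, direction))

    return line

def r(t):
    """Sample curve: a helix."""
    return (math.cos(t), math.sin(t), t)

def r_prime(t):
    """Derivative of the helix, computed by hand component by component."""
    return (-math.sin(t), math.cos(t), 1.0)

line = tangent_line(r, r_prime, 0.0)
print(line(0.0))   # the point of tangency (1.0, 0.0, 0.0)
print(line(2.0))   # two units of parameter along direction <0, 1, 1>: (1.0, 2.0, 2.0)
```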

Decomposing acceleration

We can continue to refine our conversation about motion. If we only care about the direction that the particle is traveling in (and not the speed), then we would naturally want to consider a unit vector in the direction of movement. We have the following.

$$\mathbf{r}(t) = \text{position}, \qquad \mathbf{v}(t) = \mathbf{r}'(t) = \text{velocity}, \qquad \mathbf{a}(t) = \mathbf{r}''(t) = \text{acceleration}$$

$$\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} = \text{unit tangent vector}$$

$$\mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|} = \text{unit normal vector}$$

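The unit tangent vector points in the direction of motion. Since $\|\mathbf{T}\| = 1$ (i.e., it is unit) we have, using a result from last week (a vector-valued function of constant magnitude is perpendicular to its derivative), that $\mathbf{T}$ is perpendicular to $\mathbf{T}'$, and hence $\mathbf{T}$ is perpendicular to $\mathbf{N}$. This indicates that $\mathbf{N}$ is perpendicular to the direction vector $\mathbf{T}$, that is to say perpendicular to the motion of the particle. On the other hand, since $\mathbf{T}$ is related to velocity we would expect that $\mathbf{T}'$ should be related to acceleration, i.e., $\mathbf{N}$ should have something to do with acceleration. This turns out to be the case. In particular we have the following:

$$\mathbf{a} = a_T\mathbf{T} + a_N\mathbf{N}, \qquad a_T = \frac{\mathbf{r}' \cdot \mathbf{r}''}{\|\mathbf{r}'\|} \qquad \text{and} \qquad a_N = \frac{\|\mathbf{r}' \times \mathbf{r}''\|}{\|\mathbf{r}'\|}.$$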

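Which is to say that we can split acceleration into two parts: one part in the direction the particle is currently moving, and another in a direction orthogonal to how the particle is currently moving. The first term is essentially the projection of $\mathbf{r}''$ onto $\mathbf{r}'$, i.e., we have

$$\operatorname{proj}_{\mathbf{r}'}(\mathbf{r}'') = \frac{\mathbf{r}' \cdot \mathbf{r}''}{\mathbf{r}' \cdot \mathbf{r}'}\,\mathbf{r}' = \frac{\mathbf{r}' \cdot \mathbf{r}''}{\|\mathbf{r}'\|}\,\frac{\mathbf{r}'}{\|\mathbf{r}'\|} = \frac{\mathbf{r}' \cdot \mathbf{r}''}{\|\mathbf{r}'\|}\,\mathbf{T} = a_T\mathbf{T}.$$

The other term essentially is a projection onto the orthogonal direction, which can be done by using cross products. Alternatively, we can make the observation that $a_N\mathbf{N} = \mathbf{a} - a_T\mathbf{T}$.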
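A small numerical check of the two component formulas: for a sample curve the tangential and normal components should satisfy $a_T^2 + a_N^2 = \|\mathbf{a}\|^2$, because $\mathbf{T}$ and $\mathbf{N}$ are perpendicular unit vectors. The curve $\mathbf{r}(t) = \langle t, t^2, 0 \rangle$ at $t = 1$ and the helper names are assumptions chosen for this sketch.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def acceleration_components(v, a):
    """Return (a_T, a_N) from velocity r'(t) and acceleration r''(t)."""
    speed = norm(v)
    a_T = dot(v, a) / speed           # tangential component
    a_N = norm(cross(v, a)) / speed   # normal component
    return a_T, a_N

# r(t) = <t, t^2, 0> at t = 1 gives r' = <1, 2, 0> and r'' = <0, 2, 0>.
v, a = (1.0, 2.0, 0.0), (0.0, 2.0, 0.0)
a_T, a_N = acceleration_components(v, a)
print(a_T, a_N)                          # 4/sqrt(5) and 2/sqrt(5)
print(a_T ** 2 + a_N ** 2, dot(a, a))    # both close to 4
```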

From the perspective of the particle, all of the interesting information about the motion (position, velocity and acceleration) is contained in the single plane containing the point and the vectors $\mathbf{T}$ and $\mathbf{N}$. The normal vector to this plane is known as the binormal vector and is denoted $\mathbf{B} = \mathbf{T} \times \mathbf{N}$. This plane is known as the osculating plane (or kissing plane, i.e., it gently kisses the curve).

Curvature

Given a curve we can ask how “bendy” the curve is. Which is to say we want to measure how fast the curve is turning. We are careful here to say that we are not interested in how fast we happen to travel along the curve, i.e., the answer should not depend on the parameterization.

A first approach is to look at how quickly our direction, $\mathbf{T}$, is changing as we change our position along the curve, $s$ (the arc length). So curvature (which is denoted with the Greek letter $\kappa$) is given by

$$\kappa = \left\| \frac{d\mathbf{T}}{ds} \right\|.$$

This is great, but is hard to compute for most curves. So using the chain rule we can rewrite this as

$$\kappa = \frac{\|\mathbf{T}'\|}{\|\mathbf{r}'\|}.$$

This is better but we can make it even more straightforward (i.e., $\mathbf{T}$ tends to have square roots, which are best to avoid when taking derivatives if we can). This is done by recalling that $\|\mathbf{T}'\|$ shows up as part of $\mathbf{N}$ and so is connected to $a_N$. Using this we derive

$$\kappa = \frac{\|\mathbf{r}' \times \mathbf{r}''\|}{\|\mathbf{r}'\|^3}.$$

In the special case of a parameterization $(x(t), y(t))$ this becomes

$$\kappa = \frac{|x'y'' - x''y'|}{\big((x')^2 + (y')^2\big)^{3/2}}.$$
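A standard sanity check for the cross-product curvature formula is a circle of radius $R$, whose curvature should come out as $1/R$ at every point. The sketch below does this numerically; the radius, the parameter value, and the helper names are illustrative assumptions.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def curvature(v, a):
    """kappa = |r' x r''| / |r'|^3, given r'(t) and r''(t)."""
    return norm(cross(v, a)) / norm(v) ** 3

# Circle r(t) = <R cos t, R sin t, 0>, with derivatives taken by hand.
R, t = 2.0, 0.7
v = (-R * math.sin(t), R * math.cos(t), 0.0)   # r'(t)
a = (-R * math.cos(t), -R * math.sin(t), 0.0)  # r''(t)
kappa = curvature(v, a)
print(kappa, 1 / R)   # both close to 0.5
```

A larger circle bends less, which matches the formula: doubling $R$ halves $\kappa$.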

Coordinate systems

There are three main ways to describe a point in space. We have already discussed the Cartesian coordinate system (also known as rectangular coordinates). The other two are generalizations of polar coordinates in the plane, so before jumping into them we briefly recall facts about polar coordinates.

In the plane we can describe our position by starting at the origin looking in the direction of the positive $x$-axis. We then rotate (counter-clockwise) an angle $\theta$ and then move a distance $r$ out. In particular every point can be described by a distance $r \ge 0$ and an angle $0 \le \theta < 2\pi$ (uniquely, apart from the origin, where any angle works). In converting back and forth between these coordinate systems we have the following:

$$x = r\cos\theta, \qquad y = r\sin\theta$$

$$r = \sqrt{x^2 + y^2}, \qquad \tan\theta = y/x$$

It is useful to remember $r^2 = x^2 + y^2$. Using these relationships it is possible (though not always convenient!) to switch from one coordinate system to the other.

On a side note, we do not say $\theta = \arctan(y/x)$ because the inverse tangent only has a range of half of a revolution and not a full revolution. That means we would need to add some correcting term in some cases to get the correct angle.

The first generalization to three dimensions is known as cylindrical coordinates. This can be thought of as polar plus $z$. Every point can be described by $r$, $\theta$ and $z$ (we will assume that $r \ge 0$ and $0 \le \theta < 2\pi$). The way it works is that we use $r$ and $\theta$ as before to move to a point in the plane. Then we move up or down according to $z$. Given what we already know about polar coordinates this makes converting back and forth between Cartesian coordinates and cylindrical coordinates very easy. We have the following:

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z$$

$$r = \sqrt{x^2 + y^2}, \qquad \tan\theta = y/x, \qquad z = z$$

The second generalization to three dimensions is known as spherical coordinates. This is more in the spirit of polar coordinates in that we first start at the origin and then turn to look in the direction of the desired point. We then move a sufficient distance to get to that point. To determine where to look we will need two angles: one tells us how to rotate left/right in the $xy$-plane ($\theta$, this is the exact same $\theta$ as in cylindrical coordinates and we will assume $0 \le \theta < 2\pi$); the second angle, $\varphi$, then tells us how to rotate up/down. The way that $\varphi$ is measured is as the angle off of the positive $z$-axis. So we have that $\varphi = 0$ corresponds to the positive $z$-axis, $\varphi = \frac{1}{2}\pi$ corresponds to points in the $xy$-plane and $\varphi = \pi$ corresponds to the negative $z$-axis, so $0 \le \varphi \le \pi$. Finally the distance we move will be $\rho \ge 0$.

To be able to convert from spherical to Cartesian and vice versa we observe that since $\rho$ is the distance we move, it corresponds to the distance from $(x, y, z)$ to $(0, 0, 0)$ and we have a formula for that. We can also find $z$ by drawing a right triangle from the origin, to our point, to the $z$-axis. By using properties of right triangles we can find $z = \rho\cos\varphi$. To find $x$ and $y$ we note that the length of the other part of the triangle is $\rho\sin\varphi$ and this corresponds to the $r$ in cylindrical/polar coordinates, which can help us solve for $x$ and $y$. (Note: the above dialogue makes much more sense when drawing a picture at the same time.) So we have the following:

$$x = \rho\sin\varphi\cos\theta, \qquad y = \rho\sin\varphi\sin\theta, \qquad z = \rho\cos\varphi$$

and

$$\rho = \sqrt{x^2 + y^2 + z^2}, \qquad \tan\theta = y/x, \qquad \varphi = \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right).$$

It is useful to remember $\rho^2 = x^2 + y^2 + z^2$ and that $z = \rho\cos\varphi$ and $r^2 = x^2 + y^2 = \rho^2\sin^2\varphi$.
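The conversion formulas can be sketched in Python as follows. One assumption beyond the notes: the angle $\theta$ is recovered with the two-argument arctangent `math.atan2`, which looks at the signs of both $x$ and $y$ and so supplies the quadrant correction that a plain $\arctan(y/x)$ would miss; the result is then shifted into $[0, 2\pi)$. The function names are hypothetical.

```python
import math

def cartesian_to_cylindrical(x, y, z):
    """Return (r, theta, z) with r >= 0 and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta, z

def cartesian_to_spherical(x, y, z):
    """Return (rho, theta, phi) with phi measured off the positive z-axis."""
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        return 0.0, 0.0, 0.0  # at the origin the angles are arbitrary
    theta = math.atan2(y, x) % (2 * math.pi)
    phi = math.acos(max(-1.0, min(1.0, z / rho)))  # clamp against round-off
    return rho, theta, phi

def spherical_to_cartesian(rho, theta, phi):
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

# A point in the third quadrant of the xy-plane, where arctan(y/x) alone
# would report an angle in the first quadrant.
rho, theta, phi = cartesian_to_spherical(-1.0, -1.0, math.sqrt(2))
print(rho, theta, phi)                       # about 2, 5*pi/4, pi/4
print(spherical_to_cartesian(rho, theta, phi))
```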

The reason to have multiple coordinate systems is that some surfaces are easy to describe in one coordinate system but very difficult to describe in another. The following are some ways to describe the same surface in various coordinate systems.

• Sphere centered at the origin of radius $m$

Cartesian: $x^2 + y^2 + z^2 = m^2$

Cylindrical: $r^2 + z^2 = m^2$

Spherical: $\rho = m$

• Cylinder with center $z$-axis of radius $m$

Cartesian: $x^2 + y^2 = m^2$

Cylindrical: $r = m$

Spherical: $\rho = m\csc\varphi$

• Cone with angle off the positive $z$-axis of $\alpha$

Cartesian: $z = \cot\alpha\,\sqrt{x^2 + y^2}$

Cylindrical: $z = (\cot\alpha)\,r$

Spherical: $\varphi = \alpha$

In general, given a formula for a surface we can identify the coordinate system used (i.e., if we see $z$ and $r$ we are in cylindrical; if we see $\varphi$ or $\rho$ we are in spherical). Being able to convert between coordinate systems will be useful in a future chapter.

Quadric surfaces

To understand surfaces we will often look at cross sections. These are found by looking at where a plane intersects the surface and examining the resulting curve(s). We will be particularly interested in traces, which correspond to planes of the form $x = c$, $y = c$ or $z = c$. Note that in such planes one of the variables is fixed, so any equation describing a surface reduces to an equation in the two remaining variables.

A special type of surface is a cylinder. These are surfaces which have identical traces in one of the variables, e.g., $x^2 + y^2 = 1$ in three dimensions forms what we normally call a cylinder because for each slice of the form $z = c$ we get a unit circle. These are easy to identify when written out as equations because they are missing a variable.

With our background in understanding conic sections (i.e., ellipses, parabolas, hyperbolas and so forth) we are ready to understand what is going on for cross sections of quadric surfaces. These are surfaces which can be written in the form

$$Ax^2 + By^2 + Cz^2 + Dxy + Exz + Fyz + Gx + Hy + Iz + J = 0$$

where $A, B, \ldots, J$ are constants. But by translation and rotation we only need to consider such surfaces of the form

$$Ax^2 + By^2 + Cz^2 + J = 0 \qquad \text{or} \qquad Ax^2 + By^2 + Iz = 0.$$

These give the following surfaces:

• Ellipsoids: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$

Cross sections: ellipses, empty

• Hyperboloid of one sheet: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1$

Cross sections: ellipses, hyperbolas

• Hyperboloid of two sheets: $\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1$

Cross sections: ellipses, hyperbolas, empty

• Elliptic paraboloid: $z = \dfrac{x^2}{a^2} + \dfrac{y^2}{b^2}$

Cross sections: ellipses, parabolas, empty

• Hyperbolic paraboloid: $z = \dfrac{x^2}{a^2} - \dfrac{y^2}{b^2}$

Cross sections: parabolas, hyperbolas

• Elliptic cone: $z^2 = \dfrac{x^2}{a^2} + \dfrac{y^2}{b^2}$

Cross sections: ellipses, hyperbolas

Multivariable differentiation

Functions of several variables

Up to this point we have looked at functions of a single variable; for example a parametric curve has a single parameter, usually thought of as time. We are now ready to look at functions of several variables, i.e., multi-variable functions. These are functions which take several inputs and produce an output (usually the output will be a single number; at the end of the semester we will look at the case when the output is a vector), e.g., $f(x, y)$ or $g(x, y, z)$.

The domain of a function is the set of all inputs for which we get a valid output. The domain will either be given to us (i.e., we are interested in a restriction of the function) or we will determine the domain based on the function given to us. There are three problems that will possibly occur, namely:

• Division by zero: $\dfrac{*}{0}$

• Square root of a negative: $\sqrt{(< 0)}$

• Log of a non-positive number: $\ln(\le 0)$

Generally speaking, as long as an input avoids these three problems then the point will be in our domain. For a function $f(x, y)$ the domain will be some subset of the plane.

The range of a function is the set of all possible outputs. Generally speaking, the domain is usually very easy to find while the range takes a lot of work!

A function $f(x, y)$ can be represented graphically by plotting all points of the form $\big(x, y, f(x, y)\big)$. The result will be a surface in three dimensions. One way to understand this surface is to look at level curves; these correspond to what we previously called traces with respect to $z$. Namely they are the curves in the plane that correspond to solutions of $f(x, y) = k$. The collection of these curves produces the contour map. By examining the contour map we can understand how our function behaves. The contour map is similar to a topographical map used by hikers. So for example, places where the level curves are bunched together are places where the function changes rapidly. When we move along a level curve we do not change the output. Being able to connect surfaces with contour plots is a useful ability which comes with a lot of practice.

For functions $g(x, y, z)$ we could similarly try to plot all points of the form $\big(x, y, z, g(x, y, z)\big)$; but this is a four-dimensional object which is very hard to draw! However, we can still get useful information by looking at level surfaces, which correspond to the set of points satisfying $g(x, y, z) = c$. By understanding level surfaces we can get some intuition about what is happening with our function.

Partial derivatives

One way to understand a function is to look at slices. Previously we saw that level curves came from looking at slices of the form $z = k$. We could also look at slices of the form $x = x_0$ or $y = y_0$. The surface corresponding to our function when intersected with such a plane gives a curve in that plane. And we love curves! In particular, we can find the slope of the tangent lines to these curves. This leads to the idea of partial derivatives. Given $z = f(x, y)$ we have

$$\frac{\partial f}{\partial x}(x, y) = \frac{\partial z}{\partial x} = f_x(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h},$$

$$\frac{\partial f}{\partial y}(x, y) = \frac{\partial z}{\partial y} = f_y(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.$$

Taking partial derivatives is just like taking derivatives of single variable functions as long as we remember the following rule:

When taking a partial derivative with respect to a variable, treat all the other variables as constants.
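The limit definition suggests a simple numerical check of a hand-computed partial derivative: hold the other variable fixed and use a small step $h$. The sketch below uses a symmetric (central) difference, which is a common variant of the one-sided quotient in the definition, not the definition itself. The test function $f(x, y) = x^2 y + \sin y$ and the step size are assumptions for illustration.

```python
import math

def partial_x(f, x, y, h=1e-5):
    """Estimate f_x(x, y): vary x only, treat y as a constant."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-5):
    """Estimate f_y(x, y): vary y only, treat x as a constant."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def f(x, y):
    return x ** 2 * y + math.sin(y)

# By hand: f_x = 2xy and f_y = x^2 + cos(y).
x0, y0 = 1.0, 2.0
print(partial_x(f, x0, y0), 2 * x0 * y0)
print(partial_y(f, x0, y0), x0 ** 2 + math.cos(y0))
```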

Notationally we use the “∂” symbol (pronounced “partial”). This acts similarly to “d”, i.e., previously we had $\frac{dy}{dx}$ and now we have $\frac{\partial z}{\partial x}$. But both notations are asking the same thing: how does one variable change as we perturb the other variable? We will always use the “d” for functions of a single variable and always use the “∂” for functions of two or more variables.

The other notation, $f_x$, is similar to what was previously $f'$; we need the subscript to help specify which variable we are taking a derivative with respect to.

We can also take higher order partial derivatives, including mixed partial derivatives. The notation helps us to keep track of which derivatives we take and in which order we take them; for example there are four second order partial derivatives,

$$\frac{\partial^2 f}{\partial x^2} = f_{xx}, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = f_{xy}, \qquad \frac{\partial^2 f}{\partial x\,\partial y} = f_{yx}, \qquad \frac{\partial^2 f}{\partial y^2} = f_{yy}.$$

Similar notation works for higher order partial derivatives.

When the function is nice (which will essentially always be the case in our class) then $f_{xy} = f_{yx}$. In other words the order of taking partial derivatives does not matter. This is true whenever $f_{xy}$ and $f_{yx}$ are continuous in a neighborhood of the point.

Finally we note that the same notation and ideas work for functions of three or more variables.

Limits

Often we have to deal with expressions which are ambiguous; the classic example of this is $0/0$. There is no value for this because any value could work.

So if we cannot know what it is, then we can ask the question what should it be. This is where limits come in, namely we look at what is happening to the expression at points nearby, and based on what is happening we can indicate what the value should be or indicate that there is no value that it should be (i.e., if the points nearby are giving ambiguous possibilities). Notationally we have

$$\lim_{(x,y) \to (a,b)} f(x, y) = L,$$

which is read as “the limit as $(x, y)$ approaches $(a, b)$ of $f(x, y)$ is $L$”. Intuitively what this means is that as $(x, y)$ gets close to $(a, b)$ then $f(x, y)$ gets close to $L$.

When we moved to limits of two variables we opened up a whole new can of awesome. Previously we could only talk about approaching a value from either the left or the right. Now we can approach from the left, from the top, from a different angle, along a spiral, along a parabola, along anything we want. If the limit exists then we will always get the same answer. However, if we ever get two different answers when we approach in two different ways then the limit does not exist (DNE for short). As an example

$$\lim_{(x,y) \to (0,0)} \frac{x^2 - y^2}{x^2 + y^2} = \text{DNE}$$

because if we approach along the $x$-axis it appears to approach $1$, but if we approach along the $y$-axis it appears to approach $-1$, which are totally not the same.
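The two-path argument is easy to see numerically: evaluate the function at points closer and closer to the origin along different paths and compare. This sketch samples the $x$-axis, the $y$-axis, and (as an extra, assumed path) the line $y = x$.

```python
def f(x, y):
    """The example function; undefined at the origin itself."""
    return (x ** 2 - y ** 2) / (x ** 2 + y ** 2)

for t in (1e-1, 1e-3, 1e-6):
    along_x_axis = f(t, 0.0)   # points (t, 0)
    along_y_axis = f(0.0, t)   # points (0, t)
    along_diagonal = f(t, t)   # points (t, t)
    print(t, along_x_axis, along_y_axis, along_diagonal)
```

Each path gives a constant value (1, -1 and 0 respectively) no matter how small `t` is, so no single number can serve as the limit.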

A function is continuous when what happens is what we expected to happen, i.e.,

$$\text{Continuous} \iff \lim_{(x,y) \to (a,b)} f(x, y) = f(a, b).$$

Polynomials are continuous, as are $\sin$, $\cos$, $\arctan$, $e^x$ and many others. We can add/subtract/multiply continuous functions and the result will be a continuous function; we can also divide and the result is continuous wherever the denominator is not 0. We can also compose continuous functions (put a function in a function) and get continuous functions.

When dealing with limits we first check to see if the function is continuous at the point and plug in the point to see what we get; if we get a value then we are done, and if we get $\frac{\neq 0}{0}$ then the limit does not exist. If we get $0/0$ then we start looking at different ways to approach the limiting point; if we get two different values we are done (the limit does not exist). If we are still not done then we need to try to rewrite the function or perhaps bound one part; this is nontrivial and there is a whole class dedicated to teaching this (we will mostly avoid this situation!).

For future reference we will need some notation. Given a set $S$, an interior point $x$ is a point where we can put a small ball centered at $x$ completely inside of $S$; a boundary point $y$ is a point where every small ball centered around $y$ contains points both inside and outside of $S$. An open set is a set where every point is an interior point; a closed set is a set which includes all of its boundary points. A bounded set is a set which can be placed inside of a single large ball.

Differentiability

A function is differentiable if locally the function is linear (i.e., flat). Put another way, if we zoom in close enough the function becomes more and more like a plane. More mathematically we have

$$f(x,y) = \underbrace{f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)(x-x_0) + \frac{\partial f}{\partial y}(x_0,y_0)(y-y_0)}_{\text{tangent plane / linear approximation}} + \underbrace{g_1(x,y)(x-x_0) + g_2(x,y)(y-y_0)}_{\text{error}}$$

where $g_1, g_2 \to 0$ as $(x,y) \to (x_0,y_0)$, which is to say that for points (x,y) near $(x_0,y_0)$ the error is tiny.

The choice of the partial derivatives is driven by the need to match up with the cross sections with the planes $x = x_0$ and $y = y_0$. So we need to have partial derivatives at the point we are interested in. If the partial derivatives exist and are continuous in a neighborhood around $(x_0,y_0)$ then the function is differentiable at that point.

Since the partial derivatives will play an important role in what is to follow we have a special vector which consists of the partial derivatives; this is the gradient vector. Given f we denote the gradient of f by ∇f.

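If $z = f(x,y)$ then $\nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle$.

If $w = g(x,y,z)$ then $\nabla g = \left\langle \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}, \frac{\partial g}{\partial z} \right\rangle$.

The gradient will take over many of the roles that were previously played by the derivative. For example the above formulation of differentiability becomes

$$f(x,y) \approx f(x_0,y_0) + \nabla f(x_0,y_0) \cdot \langle x-x_0,\, y-y_0 \rangle.$$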

In a later chapter we will see it is useful to think of “∇” as a vector of partial derivatives, i.e.,

“$\nabla = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle$”,

so that the gradient corresponds to “scaling” this vector by f. We note that the gradient vector satisfies many basic properties we would expect of a derivative including

• ∇(f+ g) = ∇f+∇g, showing ∇ is linear

• ∇(cf) = c∇f, for some constant c

• ∇(fg) = g∇f+ f∇g, the product rule.

Finally, we note that if a function is differentiable ata point then it must also be continuous at that point.

Properties of the gradient

If we look at the individual terms in the gradient we see that $\frac{\partial f}{\partial x}$ is indicating how the function is behaving in the x-direction. That is to say that if we start from an initial point and move in the (positive) x-direction then the initial rate of change of the function is given by this partial derivative. Similarly we have that $\frac{\partial f}{\partial y}$ indicates how the function is behaving in the y-direction. But what if we want to know how the function is behaving in some other direction?

To answer this kind of question we will use directional derivatives,

$$D_{\vec u} f(p) = \lim_{h\to 0} \frac{f(p + h\vec u) - f(p)}{h},$$

where $D_{\vec u} f(p)$ is “the directional derivative of f in the direction $\vec u$ from the point $p = (x_0,y_0)$”, or alternatively “the rate of change of f as we start out from p in the direction $\vec u$”. To indicate direction we will always use unit vectors. If we think of f as elevation and p as indicating our latitude/longitude position and $\vec u$ indicating which direction we want to go, then $D_{\vec u} f(p)$ indicates how steep it is from our current location if we move in the direction indicated by $\vec u$.

We have already mastered doing directional derivatives in the x and y directions (these are our old friends, the partial derivatives). The nice thing is that when our function is differentiable finding directional derivatives in any direction is just as easy. In particular we have the following:

$$D_{\vec u} f(p) = \nabla f(p) \cdot \vec u.$$

(Usually the hardest thing about these problems is to remember to make sure that $\vec u$ is a unit vector.)
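To make the formula concrete, here is a minimal sketch that estimates the gradient with central differences and then dots it with a normalized direction. The function names, the step size `h`, and the sample function $f(x,y) = x^2 + 3xy$ are assumptions chosen for illustration.

```python
import math

def gradient(f, p, h=1e-6):
    """Approximate the gradient of f at p = (x, y) using central differences."""
    x, y = p
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (fx, fy)

def directional_derivative(f, p, v):
    """Return grad f(p) . u, where u is v rescaled to unit length."""
    length = math.hypot(v[0], v[1])
    if length == 0:
        raise ValueError("direction must be a nonzero vector")
    u = (v[0] / length, v[1] / length)   # the step that is easy to forget
    g = gradient(f, p)
    return g[0] * u[0] + g[1] * u[1]

# Illustrative example: f(x, y) = x^2 + 3xy at p = (1, 2), direction <3, 4>.
f = lambda x, y: x * x + 3 * x * y
rate = directional_derivative(f, (1.0, 2.0), (3.0, 4.0))
print(rate)
```

By hand, ∇f(1,2) = ⟨8, 3⟩ and the unit vector is ⟨3/5, 4/5⟩, so the rate is 24/5 + 12/5 = 7.2, which the numerical estimate should match closely. Using ⟨3, 4⟩ without normalizing would give 36 instead, which is the classic mistake.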

Recall that $\vec u \cdot \vec v = \|\vec u\|\|\vec v\| \cos\theta$, and since $\vec u$ is a unit vector we have $D_{\vec u} f(p) = \|\nabla f(p)\| \cos\theta$, where θ is the angle between $\nabla f(p)$ and $\vec u$. Since $-1 \le \cos\theta \le 1$ this gives bounds on how large the directional derivative can be, and in which direction we achieve maximum and minimum values. In particular we have the following:

• ‖∇f(p)‖ is maximum rate of increase at p.

• ∇f(p) points in the direction which gives themaximum rate of increase at p.

• −‖∇f(p)‖ is maximum rate of decrease at p.

• −∇f(p) points in the direction which gives themaximum rate of decrease at p.

In particular the gradient encodes information abouthow to move to achieve maximum rates of increaseand/or decrease.

On the other hand $D_{\vec u} f(p) = 0$ along level curves or level surfaces. This is because on a level curve or level surface the function is constant (i.e., unchanging). So if $\vec u$ is tangent to a level curve we can conclude that $\nabla f(p)$ is perpendicular to $\vec u$. Or put more succinctly:

∇f(p) is perpendicular to level curves/surfaces.

This gives an easy way to find the normal for tangentplanes to a surface, namely given a surface describedby F(p) = k we use ∇F(p) as the normal vector.

Chain rule

The chain rule applies when we have a function inside of a function. For example z = f(x,y), x = x(t) and y = y(t). In this case, z is ultimately a function of t and so it is natural to ask how z varies as we vary t, or in other words what is $\frac{dz}{dt}$. From differentiability we can rearrange the terms to get

$$\Delta z = \frac{\partial z}{\partial x}\Delta x + \frac{\partial z}{\partial y}\Delta y + \text{ERROR}$$

where $\Delta z = f(x,y) - f(x_0,y_0)$, $\Delta x = x - x_0$ and $\Delta y = y - y_0$. Also we note that the error term is very small, much smaller than our other terms, and becomes insignificant near $(x_0,y_0)$. If we now divide both sides by $\Delta t$ and take a limit as $\Delta t \to 0$ we have

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.$$

In the above expression we have used both “dz” and “∂z”. We use the “d” when we are treating the function as depending on a single variable and the “∂” when we are treating the function as depending on two or more variables; however they both are asking us to do the same thing, namely take a derivative.
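The chain rule formula can be checked numerically. The sketch below uses an illustrative choice, $z = x^2 y$ with $x = \cos t$ and $y = \sin t$ (an assumption, not an example from the notes), and compares the chain rule value with a direct finite-difference derivative of z(t).

```python
import math

def dz_dt_chain(t):
    """dz/dt from the chain rule for z = x^2 * y, x = cos t, y = sin t."""
    x, y = math.cos(t), math.sin(t)
    dz_dx = 2 * x * y        # partial of z with respect to x
    dz_dy = x * x            # partial of z with respect to y
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return dz_dx * dx_dt + dz_dy * dy_dt

def dz_dt_numeric(t, h=1e-6):
    """dz/dt by substituting first and differencing z(t) directly."""
    z = lambda s: math.cos(s) ** 2 * math.sin(s)
    return (z(t + h) - z(t - h)) / (2 * h)

t0 = 0.7
print(dz_dt_chain(t0), dz_dt_numeric(t0))
```

The two printed values should agree to many decimal places, since both compute the same derivative by different routes.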

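The same idea works in more complicated situations. For instance we could have that z = f(x,y) and that x = x(s,t) and y = y(s,t) so that z depends on both s and t. In this case we have

$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}$$

$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}$$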

In general whenever we deal with functions withinfunctions (possibly within functions themselves, an“inception” of functions if you will) then we can find

how the variables change with respect to one another.One easy way to keep track is to form a tree showingthe dependencies among the variables. Then to findthe derivative of the top variable with respect to oneof the leaves we simply add up the product of thepartial derivatives of the paths (as discussed in class).

One special case that we have is when we have implicit relationships among the variables. For example, F(x,y) = 0 defines y as a function of x. So F depends on x and y but the relationship also shows that y is a function of x. Taking the derivative of both sides with respect to x using the chain rule we get

$$F_x + F_y \frac{dy}{dx} = 0 \quad\text{or}\quad \frac{dy}{dx} = \frac{-F_x}{F_y}.$$

Similarly we get the following:

$$F(x,y,z) = k \implies \frac{\partial z}{\partial x} = \frac{-F_x}{F_z} \quad\text{and}\quad \frac{\partial z}{\partial y} = \frac{-F_y}{F_z}.$$

Tangent planes and other miscellany

We have seen tangent planes done in two different ways. When we did differentiability for a function z = f(x,y) we said that a function locally looks like a plane along with some possible error; the plane we got was as follows:

$$z = f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)(x-x_0) + \frac{\partial f}{\partial y}(x_0,y_0)(y-y_0).$$

The second way we have seen these tangent planes is when dealing with F(x,y,z) = k, which we can think of as a level surface. In this case for a point $p = (x_0,y_0,z_0)$ we can use the properties of the gradient to note that $\nabla F(p)$ will be our normal vector so that our tangent plane is

$$\nabla F(p) \cdot \langle x-x_0,\, y-y_0,\, z-z_0 \rangle = 0,$$

or if we expand out the above we get the following:

$$\frac{\partial F}{\partial x}(p)(x-x_0) + \frac{\partial F}{\partial y}(p)(y-y_0) + \frac{\partial F}{\partial z}(p)(z-z_0) = 0.$$

It is important to note that these two definitions are compatible, i.e., if we wanted the tangent plane for z = f(x,y) we would get the same plane as if we worked with the function F(x,y,z) = f(x,y) − z = 0.

We can rearrange our terms above to get the following:

$$\Delta z \approx f_x\,\Delta x + f_y\,\Delta y,$$

or in differential form

$$dz = f_x\,dx + f_y\,dy.$$

These types of formula are useful when we wantto approximate the change in output given that weknow the approximate changes of our input. In par-ticular this can be used for error tolerance but wecan also use this to give approximate values for thefunction near a point that we can easily evaluate thefunction.

Tangent planes are trying to mimic the function so that they agree locally with the function in both the value of the function and the first order partial derivatives of the function. We can also try to find a function that matches the value, the first order partial derivatives and the second order partial derivatives. This is done by using Taylor polynomials. The second order Taylor polynomial is shown below (where the function and derivatives are all evaluated at the point $(x_0,y_0)$):

$$z = f + f_x(x-x_0) + f_y(y-y_0) + \tfrac{1}{2} f_{xx}(x-x_0)^2 + f_{xy}(x-x_0)(y-y_0) + \tfrac{1}{2} f_{yy}(y-y_0)^2$$

Optimization

The goal of optimization is to maximize or mini-mize a function. There are two types of maximums.A global maximum is a point where the functionevaluated at that point is at least as large or largerthan the function evaluated anywhere else. A localmaximum is a point where the function evaluated atthat point is at least as large or larger than the func-tion evaluated at points nearby. Similar definitionsapply for minimums.

The nice fact is that optimization works similarlyto single variable calculus, i.e., we generally look forcritical points and then apply a test of some sort. Onenice fact that still holds is that if a function is contin-uous on a closed and bounded set then the functionmust achieve a maximum and a minimum value onthat set.

At a maximum (similarly a minimum) we cannot get bigger, so the gradient cannot be nonzero (otherwise moving in the direction of the gradient allows us to increase and moving in the direction opposite the gradient allows us to decrease). So maximums and minimums will occur at critical points; these include:

• Where ∇f = 0 (i.e., critical points).

• Where ∇f is undefined.

• At boundary.

For a function we find critical points by looking at ∇f and where it is 0, equivalently where the partial derivatives are 0. Once we have the critical points the next step is to determine what type of critical point each one is. In single variable calculus we had the first and second derivative tests; in multi-variable calculus we have the Second Partials Test. This is done by noting that at a critical point the partial derivatives are equal to zero and so nearby using the second order Taylor polynomial we have

$$f(x,y) \approx f(x_0,y_0) + \frac{1}{2} \begin{bmatrix} \Delta x & \Delta y \end{bmatrix} \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}.$$

We can use properties of 2 × 2 matrices (i.e., eigenvalues) to link the local behavior to the determinant of this matrix, i.e.,

$$D = f_{xx} f_{yy} - (f_{xy})^2.$$

We have the following possibilities:

• If $D > 0$ and $f_{xx} < 0$ (or $f_{yy} < 0$) then it is a local maximum.

• If $D > 0$ and $f_{xx} > 0$ (or $f_{yy} > 0$) then it is a local minimum.

• If $D < 0$ then it is a saddle, neither max nor min.

• If $D = 0$ the test is inconclusive.
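The four cases of the Second Partials Test translate directly into a small decision function. This is a minimal sketch; the function name and the convention of passing in the second partials already evaluated at the critical point are assumptions made for illustration.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the Second Partials Test to second partials evaluated at a critical point."""
    d = fxx * fyy - fxy ** 2      # determinant of the matrix of second partials
    if d > 0 and fxx < 0:
        return "local maximum"
    if d > 0 and fxx > 0:
        return "local minimum"
    if d < 0:
        return "saddle"
    return "inconclusive"         # d == 0: the test tells us nothing

# f = x^2 + y^2 at (0, 0): fxx = 2, fyy = 2, fxy = 0
print(classify_critical_point(2, 2, 0))
# f = x^2 - y^2 at (0, 0): fxx = 2, fyy = -2, fxy = 0
print(classify_critical_point(2, -2, 0))
# f = x^4 + y^4 at (0, 0): all second partials vanish
print(classify_critical_point(0, 0, 0))
```

Note that when D > 0 the product $f_{xx} f_{yy}$ must be positive, so $f_{xx}$ and $f_{yy}$ share a sign and neither is zero; that is why checking $f_{xx}$ alone is enough.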

When finding the maximum and/or minimum ona closed and bounded set we know that it must exist,so we find all the places where it could exist and thentest each point. In short we do the following:

1. Find all critical points in the interior using ∇f.

2. Find all critical points on boundary (reduce di-mensions down).

3. Plug in list of critical points into function.

4. Largest number on list is max; smallest is min.

Lagrange multipliers

The technique of Lagrange multipliers is used tosolve optimization problems with a constraint. Theseare usually easy to identify, i.e., there will be twofunctions one that is being maximized and the otherthat is a constraint on the variables (i.e., “given that”or “such that”).

Essentially what it will boil down to is that if thegradients of the function we are optimizing and thefunction that is our constraint are not parallel thenby slightly perturbing along our constraint we canincrease or decrease our value. So we can concludethat if we are at a point where we are maximizing orminimizing the two gradient vectors must be parallel(this includes the possibility that one of them is 0).

So if maximizing the function f(x,y) given thatg(x,y) = k then the method of Lagrange multipliersreduces down to solving the following equations:

∇f(x,y) = λ∇g(x,y) and g(x,y) = k.

This leads to a large system of nonlinear equations(i.e., one for each partial derivative and one for theconstraint).

When in doubt on how to solve such a system the following technique tends to work: solve each equation for λ and set the resulting expressions equal to one another. This gives another relationship between x and y that can be used with g(x,y) = k to solve for the possible points yielding a maximum and/or minimum. Once we have these points we plug into the function; the largest value is the maximum and the smallest value is the minimum.
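As an illustration (the specific problem is an assumption, not one from the notes), consider optimizing $f(x,y) = xy$ subject to $x^2 + y^2 = 2$. Solving $\langle y, x\rangle = \lambda\langle 2x, 2y\rangle$ by hand gives $x = \pm y$, so the candidate points are (±1, ±1). The sketch below cross-checks this by brute force, walking around the constraint circle, and then confirms that the two gradients are parallel at the best point.

```python
import math

def f(x, y):
    """Function being optimized."""
    return x * y

radius = math.sqrt(2.0)     # constraint g(x, y) = x^2 + y^2 = 2 is a circle
n = 3600                    # number of sample points on the constraint
samples = []
for k in range(n):
    theta = 2 * math.pi * k / n
    x, y = radius * math.cos(theta), radius * math.sin(theta)
    samples.append((f(x, y), x, y))

best = max(samples)         # (value, x, y) with the largest value of f
worst = min(samples)        # (value, x, y) with the smallest value of f

# Parallel check at the best point: for grad f = <y, x> and grad g = <2x, 2y>
# the 2D cross product y*(2y) - x*(2x) should be (numerically) zero.
_, bx, by = best
cross = by * (2 * by) - bx * (2 * bx)

print(best, worst, cross)
```

The scan should report a maximum value near 1 at a point with x = y and a minimum near −1 at a point with x = −y, matching the Lagrange multiplier candidates. A scan like this only works because the constraint is easy to parametrize; the multiplier equations are what we rely on in general.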

Multivariable integration

Multivariable integration

Integration is meant to answer the question “howmuch”, depending on the problem and how we setup the integral we can be finding how much volume,how much surface area, how much mass, etc. Thephilosophy of integration boils down to breaking upthe quantity that we are interested in finding intosmall manageable parts, each part of which is easyto find, and then adding them up to get the total.

We want to do this for multivariable functions. So we start by considering a function f(x,y) and a rectangular region that we want to integrate over. In this case we will interpret f(x,y) as a height and we want to find volume. Our region that we will integrate over (i.e., the “base”), which we will denote by R, will consist of the points $a \le x \le b$ and $c \le y \le d$. We subdivide R up into small pieces so that for these pieces the volume becomes essentially that of a tall and skinny box, namely $f(x_k,y_k)\,\Delta A_k$. The point $(x_k,y_k)$ is a point inside of the small subdivision and $\Delta A_k$ is the area of the subdivision. So to find an approximation for the total we now add these all together to get

$$\text{Volume} \approx \sum f(x_k,y_k)\,\Delta A_k.$$

This method gives a way to approximate integralswhen we cannot directly integrate using tools of cal-culus. Also this is essentially a way that computersdo numerical integration, computers just love addinglots of numbers together.
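A minimal sketch of such a sum is below. It uses the midpoint of each small rectangle as the sample point $(x_k, y_k)$; that choice, the function name `riemann_sum`, and the test integrand $f(x,y) = x y^2$ are assumptions for illustration.

```python
def riemann_sum(f, a, b, c, d, nx=100, ny=100):
    """Approximate the double integral of f over a <= x <= b, c <= y <= d."""
    dx = (b - a) / nx
    dy = (d - c) / ny
    total = 0.0
    for i in range(nx):
        xk = a + (i + 0.5) * dx            # midpoint of the i-th strip in x
        for j in range(ny):
            yk = c + (j + 0.5) * dy        # midpoint of the j-th strip in y
            total += f(xk, yk) * dx * dy   # height times area of the small base
    return total

# Illustrative integrand: f(x, y) = x * y^2 over 0 <= x <= 1, 0 <= y <= 2.
approx = riemann_sum(lambda x, y: x * y * y, 0.0, 1.0, 0.0, 2.0)
print(approx)
```

The exact value here is (1/2)(8/3) = 4/3, and the approximation should land close to it; increasing `nx` and `ny` refines the subdivision and tightens the agreement.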

To get a better approximation we take the limit, where here the limiting process is refining our subdivision; notationally this is $\|P\| \to 0$ (this notation is not important!) and we have

$$\text{Volume} = \lim_{\|P\|\to 0} \sum f(x_k,y_k)\,\Delta A_k = \iint_R f(x,y)\,dA.$$

This assumes that the limit exists, fortunately for usif we know that f is continuous on R then the limitexists, or we say “f is integrable on R”. In particularfor the functions that we are interested in the integralwill always exist.

Since integration boils down to adding, and adding behaves nicely, we get the following properties:

• $\iint_R k f(x,y)\,dA = k \iint_R f(x,y)\,dA$

• $\iint_R \big(f(x,y) + g(x,y)\big)\,dA = \iint_R f(x,y)\,dA + \iint_R g(x,y)\,dA$

• If the region that we are working on can be split into two parts which only overlap on their borders, i.e., R can be broken into two pieces $R_1$ and $R_2$, then $\iint_R f(x,y)\,dA = \iint_{R_1} f(x,y)\,dA + \iint_{R_2} f(x,y)\,dA$

• If $f \le g$ on R then $\iint_R f(x,y)\,dA \le \iint_R g(x,y)\,dA$

Iterated integration

Let us continue with trying to integrate f(x,y) over the region with $a \le x \le b$ and $c \le y \le d$. Instead of breaking the rectangle down into ever more refined smaller rectangles (as we did last week), we will find it more convenient to find the volume by slicing the function. For example by taking the volume that we are trying to find and looking at thin strips where we hold y constant. Then the volume of one small strip is $\approx A(y)\,\Delta y$ where A(y) is the area of the cross section. Taking limits we can conclude

$$\text{Volume} = \int_c^d A(y)\,dy.$$

On the other hand we have that our cross section will look like a function of x as x ranges between a and b; in particular it will be the function f(x,y) (remember that in our cross section we are holding y fixed). So we have

$$A(y) = \int_a^b f(x,y)\,dx.$$

Putting these together we have

$$\iint_R f(x,y)\,dA = \int_c^d \int_a^b f(x,y)\,dx\,dy.$$

This is a nested integral or an iterated integral. Whenevaluating this integral we always work from the in-side out, that is we perform the inside integral andthen evaluate the bounds and then we go to the nextintegral.

We could have started this whole conversation by slicing in a different way, i.e., holding x constant. The same ideas carry through and we can conclude that

$$\iint_R f(x,y)\,dA = \int_a^b \int_c^d f(x,y)\,dy\,dx.$$

Notice that we changed both the order on the bounds and the order of the “d” terms. Notation is important. The inside integral goes with the inside d term and then we work our way out step by step. Also, it is useful to keep track of bounds while doing these integrals; for example we could be more specific about the bounds (so that we are less likely to make a mistake). As an example we have

$$\int_a^b \int_c^d f(x,y)\,dy\,dx = \int_{x=a}^{x=b} \int_{y=c}^{y=d} f(x,y)\,dy\,dx.$$

When f(x,y) = g(x)h(y) we can use properties of constants with respect to integration to conclude

$$\int_c^d \int_a^b g(x)h(y)\,dx\,dy = \left( \int_a^b g(x)\,dx \right) \left( \int_c^d h(y)\,dy \right).$$

We note that for some functions (for example ones involving absolute value) it is sometimes more convenient to break the integral up into pieces. On a similar note we can use symmetry in some cases to simplify an integral.

Integration beyond rectangles

While we love our flat things, we will have to deal with things which are not flat; this includes having regions R that are not rectangles. We will deal with two general cases.

• A region R is y-simple when it can be described by $a \le x \le b$ and $\phi_1(x) \le y \le \phi_2(x)$. For such a region it is good for us to hold x constant and take thin vertical strips. Doing this we get

$$\iint_R f(x,y)\,dA = \int_a^b \int_{\phi_1(x)}^{\phi_2(x)} f(x,y)\,dy\,dx.$$

• A region R is x-simple when it can be described by $c \le y \le d$ and $\psi_1(y) \le x \le \psi_2(y)$. For such a region it is good for us to hold y constant and take thin horizontal strips. Doing this we get

$$\iint_R f(x,y)\,dA = \int_c^d \int_{\psi_1(y)}^{\psi_2(y)} f(x,y)\,dx\,dy.$$

Unlike rectangles, changing the order of integra-tion is not as simple as swapping a few symbolsaround. To change the order of integration we needto change the way that we describe our region. Wehave the following general procedure:

1. Write down the current bounds.

2. Draw a picture, clearly indicating the region thatwe are integrating and (ideally) how we are cur-rently integrating.

3. Relabel any bounding curves as needed, i.e.,change y = f(x) to x = f−1(y).

4. Use the picture to determine how to write downthe new bounds. Work from the outside in.

5. Woohoo! Bounds changed.

Changing the bounds can take some previouslyimpossible function to integrate and help us to inte-grate. Setting up and changing the bounds are someof the most important ideas from this chapter andyou can expect to be tested on them.

Note that when we change bounds it might requireus to break our integral up into several pieces (i.e.,whenever a bounding curve changes). Conversely itmight also allow us to consolidate several integrals.

Integration in polar coordinates

We can also integrate in polar coordinates. The important part about this process is how we “chop” our region up. In Cartesian coordinates the idea is to subdivide the region into small rectangles so that the area of each rectangle is dx dy or dy dx. In polar coordinates when we subdivide we break things into small pieces of θ, i.e., dθ, and small pieces of r, i.e., dr. So instead of small rectangles we are looking at pieces of circular wedges. We know how to find the area of wedges (i.e., given a circle of radius r and a central angle of α, the area of the wedge is $\frac{1}{2}\alpha r^2$) and so we can determine that the area of our little piece is $r\,dr\,d\theta$. In summary we have:

$$dA = \begin{cases} dy\,dx & \text{in Cartesian coordinates,} \\ r\,dr\,d\theta & \text{in polar coordinates.} \end{cases}$$

We also need to rewrite the function that we are trying to integrate in terms of r and θ, which can be done using $x = r\cos\theta$ and $y = r\sin\theta$ (another often used fact is $x^2 + y^2 = r^2$), so we have

$$\iint_R f(x,y)\,dA = \iint_R f(r\cos\theta, r\sin\theta)\, r\,dr\,d\theta,$$

where we also need to describe our region in terms of r and θ. While it is possible to have to integrate $r\,d\theta\,dr$ this will happen rarely in practice. So in general to describe a region we do the following:

1. Find the bounds for θ, these will either be givenor are found by looking for the intersection ofcurves.

2. Once the bounds for θ are known pick a typical θ between the bounds and look for how r varies, i.e., from the closest curve in the direction of θ to the farthest curve in the direction of θ. (If there ever is a transition between curves we simply break the integral into pieces.)

The best times to use polar coordinates are in the following situations:

• We are told to do a problem in polar coordinates,or are given curves describing our region in po-lar coordinates.

• Our region can easily be described using polarcoordinates, particularly true when we have cir-cles centered at the origin or on the x or y axis.

• If we have x2 + y2 on the “inside” of some func-tion and we can’t integrate.

• If we cannot make progress in Cartesian coordi-nates.
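A classic case of the third situation is $e^{-(x^2+y^2)}$ over the unit disk, where Cartesian coordinates get stuck but polar coordinates give $\int_0^{2\pi}\int_0^1 e^{-r^2}\, r\,dr\,d\theta = \pi(1 - e^{-1})$. The sketch below checks this numerically; the function name `polar_integral`, the restriction to a full disk centered at the origin, and the midpoint sampling are assumptions for illustration.

```python
import math

def polar_integral(f, r_max, n_r=200, n_theta=200):
    """Approximate the integral of f(x, y) over the disk of radius r_max at the origin."""
    dr = r_max / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += f(x, y) * r * dr * dtheta   # note the extra factor of r
    return total

approx = polar_integral(lambda x, y: math.exp(-(x * x + y * y)), 1.0)
exact = math.pi * (1 - math.exp(-1))
print(approx, exact)
```

Dropping the factor of `r` in the sum is the most common mistake with polar coordinates; with it removed, even the area of the unit disk (integrand 1) would come out as 2π instead of π.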

Applications of integration

There are several questions that we can answerwith integration, for example,

$$\text{Volume} = \iint_R (\text{height})\,dA$$

where the height is usually measured as the distancefrom the surface z = f(x,y) to the xy-plane. But re-membering in this format has the advantage of han-dling more general situations, i.e., finding the vol-ume between two surfaces. In this case we figure outthe region we integrate over and for the height we do“top−bottom”.

We can also use double integration to answer questions about regions in the plane. We can think of a region as corresponding to a “lamina”, i.e., a thin plate which has varying thickness or density which we denote using δ(x,y). The first problem to consider is the mass. If the density is constant the mass is simply found by multiplying the density times the area. When the density varies we can approximate the mass by the following: subdivide the lamina into pieces (small parts of size ∆A); approximate the mass of each piece (i.e., δ(x,y)∆A); add the masses up (i.e., $\sum \delta(x,y)\,\Delta A$). Taking the limits of finer subdivisions this sum becomes an integral and we have

$$\text{Mass} = \iint_R (\text{density})\,dA = \iint_R \delta(x,y)\,dA.$$

If we were to spin this region around a line we could look at various quantities associated with this action. For example the (first) moment measures torsional effects, specifically the turning effect about the line. To find the moment of a particle we take its mass times its distance to the line we rotate around. To find the moment of the lamina we repeat the same idea as above by finding the moment of each small piece of a subdivision and adding. Taking the limits of finer subdivisions this becomes an integral and we have

$$\text{Moment} = \iint_R (\text{distance})(\text{density})\,dA.$$

We let $M_x$ denote the moment with respect to spinning around the x-axis and $M_y$ denote the moment with respect to spinning around the y-axis. Since the distance from (x,y) to the x-axis is y and the distance to the y-axis is x we immediately have

$$M_x = \iint_R y\,\delta(x,y)\,dA \quad\text{and}\quad M_y = \iint_R x\,\delta(x,y)\,dA.$$

Given the moments we can find the center of mass $(\bar x, \bar y)$ (the point at which the lamina balances) by

$$\bar x = \frac{M_y}{M} = \frac{\iint_R x\,\delta(x,y)\,dA}{\iint_R \delta(x,y)\,dA}, \qquad \bar y = \frac{M_x}{M} = \frac{\iint_R y\,\delta(x,y)\,dA}{\iint_R \delta(x,y)\,dA}.$$

Alternatively we can think of $\bar x$ as the “weighted average” of x where we have weighted each x value according to the mass at that point. When finding the center of mass it is convenient to use symmetry. We need symmetry of both the region and the density function.
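The “weighted average” reading is easy to see in a numerical sketch. Below, the lamina is a rectangle chopped into small cells, and each cell contributes its mass and its moments. The function name, the midpoint sampling, and the example density δ(x,y) = x on the unit square are assumptions for illustration.

```python
def center_of_mass(density, a, b, c, d, n=200):
    """Approximate (x_bar, y_bar) for a lamina on a <= x <= b, c <= y <= d."""
    dx = (b - a) / n
    dy = (d - c) / n
    mass = moment_x = moment_y = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            m = density(x, y) * dx * dy   # mass of one small cell
            mass += m
            moment_x += y * m             # M_x uses the distance y to the x-axis
            moment_y += x * m             # M_y uses the distance x to the y-axis
    return (moment_y / mass, moment_x / mass)

# Illustrative lamina: the unit square with density delta(x, y) = x.
x_bar, y_bar = center_of_mass(lambda x, y: x, 0.0, 1.0, 0.0, 1.0)
print(x_bar, y_bar)
```

By hand, M = 1/2, $M_y$ = 1/3 and $M_x$ = 1/4, so the center of mass is (2/3, 1/2). The value $\bar y = 1/2$ also follows from symmetry, since both the square and the density are symmetric about the line y = 1/2, while $\bar x$ is pulled to the right where the plate is heavier.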

We can also find the second moment, or the moment of inertia. This works similarly to the moment, the only difference being that instead of using (distance) we use $(\text{distance})^2$. So we have

$$\text{Inertia} = \iint_R (\text{distance})^2(\text{density})\,dA.$$

We let $I_x$ denote the inertia with respect to spinning around the x-axis, $I_y$ denote the inertia with respect to spinning around the y-axis, and $I_z$ the inertia with respect to spinning around the z-axis (i.e., spinning the plane around the origin). Since the distance from (x,y) to the x-axis is y, the distance to the y-axis is x, and the distance to the origin is $\sqrt{x^2 + y^2}$ we immediately have

$$I_x = \iint_R y^2\,\delta(x,y)\,dA, \qquad I_y = \iint_R x^2\,\delta(x,y)\,dA,$$

$$I_z = \iint_R (x^2 + y^2)\,\delta(x,y)\,dA = I_y + I_x.$$

Changing gears, let us go back to the case when we have z = f(x,y) as a surface. We can then ask the question about how much surface area lies above a particular region R. We approach this as always by subdividing into little pieces, approximating each piece, and adding back up. For example if we subdivide the region R into little rectangles of size ∆x by ∆y we can look at what is happening at the surface above this small rectangle. Because we are looking at a small rectangle the piece above it will be almost flat and so can be well approximated by using a parallelogram. In particular the parallelogram whose two sides are formed by the vectors

$$\langle \Delta x,\, 0,\, \Delta x\, f_x \rangle \quad\text{and}\quad \langle 0,\, \Delta y,\, \Delta y\, f_y \rangle.$$

Recalling that the area of the parallelogram is the magnitude of the cross product we have

$$\big\| \langle \Delta x, 0, \Delta x\, f_x \rangle \times \langle 0, \Delta y, \Delta y\, f_y \rangle \big\| = \big\| \underbrace{\Delta x\,\Delta y}_{=\Delta A} \langle -f_x, -f_y, 1 \rangle \big\| = \sqrt{f_x^2 + f_y^2 + 1}\;\Delta A.$$

Taking finer subdivisions we then get better approximations to the surface area and so we have

$$\text{Surface area} = \iint_R \sqrt{f_x^2 + f_y^2 + 1}\;dA.$$

Generally these integrals are terrible, but we can set them up and in a very few rare cases (e.g., cylinders) we can actually do the integration.

Triple integration

Over our short calculus careers we started with single integration, $\int_I f(x)\,dx$, where we integrated over an interval, and have now learned double integration, $\iint_R f(x,y)\,dA$, where we integrated over some region. We can of course keep going to higher dimensions and we now discuss triple integration, $\iiint_S f(x,y,z)\,dV$, where we integrate over some solid in three dimensions.

Philosophically integration works the same. Namely we take our solid and divide it up into little pieces. On each little piece the function is essentially constant so that the contribution from that piece will be $f(x,y,z)\,\Delta V$ and then we add them all up, so that we have

$$\iiint_S f(x,y,z)\,dV \approx \sum f(x,y,z)\,\Delta V$$

where the approximation gets better as we divide into smaller and smaller pieces (i.e., as we take a limit).

The other nice thing is that we can again use many of the same techniques. For example to evaluate the integral we can use nested integration. There are essentially six different ways we can choose to nest, i.e., dx dy dz, dx dz dy, . . . , dz dy dx; the most natural for most people is to choose dz dy dx as this tends to be how regions are described, and the most complicated part of integration comes down to accurately describing the region we are integrating over. A typical integral using this order of integration will be of the form

$$\int_a^b \int_{\phi_1(x)}^{\phi_2(x)} \int_{\psi_1(x,y)}^{\psi_2(x,y)} f(x,y,z)\,dz\,dy\,dx.$$

In determining the bounds it is usually easiest towork from the outside in, i.e., determine the rangeof values for x, given a typical x determine the rangeof values of y (it is helpful to project the solid down

to the xy-plane for this step), given a typical x and ydetermine the range of values of z.

Triple integration can be used for several applications. For example to find the volume of a region we have $V = \iiint_S dV$ (i.e., chop our region into little pieces, find the volume of each little piece, and add up to get the total), though of course this quickly becomes a double integral.

If our solid has a density function δ(x,y,z) then we can look at several different quantities related to the solid including the mass, center of mass, and moments. We have the following.

• Mass:

$$M = \iiint_S (\text{density})\,dV = \iiint_S \delta(x,y,z)\,dV.$$

• Moment: A moment is found by summing mass times distance (really mass times “distance”, but that is another story), so we have

$$\text{moment} = \iiint_S (\text{distance})(\text{density})\,dV.$$

The moment of a solid is taken with respect to a plane and so the distance is the distance to that particular plane. We are usually concerned with the xy-, xz-, and yz-planes and these moments are respectively:

$$M_{xy} = \iiint_S z\,\delta(x,y,z)\,dV$$

$$M_{xz} = \iiint_S y\,\delta(x,y,z)\,dV$$

$$M_{yz} = \iiint_S x\,\delta(x,y,z)\,dV$$

• Center of mass: The center of mass of a solid, $(\bar x, \bar y, \bar z)$, also corresponds to the “weighted averaging” of the x, y and z values respectively. Once we have the moments and the mass this point is easy to compute and we have

$$\bar x = \frac{M_{yz}}{M}, \qquad \bar y = \frac{M_{xz}}{M}, \qquad \bar z = \frac{M_{xy}}{M}.$$

In finding the center of mass we can often use symmetry to simplify the problem (recall that we need to have symmetry both in the solid and in the density function).

Integration using other coordinates

In double integration we saw that we could in-tegrate either in cartesian coordinates or using po-lar coordinates. The important part about switchingwas to make sure we accounted for everything, i.e.,

bounds, rewriting the function in terms of new vari-ables, and the way we subdivided area. In particularwe had

dA = dydx = r dr dθ.

The extra “r” came from how we now subdividedour region. Changing coordinates is useful when it helpssimplify the description of the region and/or makes thefunction that we are integrating simpler.

We can do similar things in three dimensions.Namely we have learned about two other coordinatesystems and instead of doing triple integration withrespect to cartesian coordinates (the dzdydx that wehave talked about) we can do triple integration in oneof these coordinate systems.

The first coordinate system is cylindrical coordi-nates which we think of as “polar +z”, so that

x = r cos θ, y = r sin θ, z = z.

In this case the important thing is that when we chopup our volume into little pieces we have

dV = r dz dr dθ .

(Again we could integrate in other orders, but this is the most common order for describing our region.) The other thing to remember is to change our function in terms of r, θ, and z; in general we have

$$\iiint_S f(x,y,z)\,dV = \int_*^* \int_*^* \int_*^* f(r\cos\theta, r\sin\theta, z)\, r\,dz\,dr\,d\theta$$

where “∗” are appropriate bounds to describe our region in terms of the variables for cylindrical coordinates. Most of the time we will use cylindrical coordinates if our “base” is easy to describe in polar coordinates and/or our function involves $x^2 + y^2$ terms, as these simplify to $r^2$.

The second coordinate system is spherical coordinates which has a distance ρ and directions φ and θ, i.e.,

x = ρ sinφ cos θ, y = ρ sinφ sin θ, z = ρ cosφ.

In this case the important thing is that when we chopup our volume into little pieces we have

dV = ρ2 sinφdρdφdθ .

(Again we could integrate in other orders, but this is the most common order for describing our region. To remember the correct term just remember the classic song “Rho, rho, sine of phi”.) The other thing to remember is to change our function in terms of ρ, φ, and θ; in general we have

$$\iiint_S f(x,y,z)\,dV = \int_*^* \int_*^* \int_*^* f(\rho\sin\phi\cos\theta,\ \rho\sin\phi\sin\theta,\ \rho\cos\phi)\, \rho^2 \sin\phi\,d\rho\,d\phi\,d\theta$$

where “∗” are appropriate bounds to describe our region in terms of the variables for spherical coordinates. Most of the time we will use spherical coordinates if our region is easy to describe in spherical coordinates and/or our function involves $x^2 + y^2 + z^2$ terms, as these simplify to $\rho^2$.

Jacobian

In both of the above cases we changed our coordinate systems and we also had to take into account how we were dividing up our space, i.e., in spherical we needed to have “$\rho^2 \sin\phi$” when we looked at dV. In general we can look at what happens when we change our variables. So instead of integrating with respect to say x and y we integrate with respect to u and v. There are three things that must be changed. Namely:

1. The function.

2. The bounds.

3. How our new variables correspond to the waywe subdivide, i.e., dA or dV .

So we start with a correspondence between variables u, v and variables x, y. In particular we have a way to take a pair (u,v) to a pair (x,y) and vice versa, i.e.,

$$x = x(u,v), \quad y = y(u,v) \qquad\text{and}\qquad u = u(x,y), \quad v = v(x,y).$$

To change our function then we simply use the above relationship. Similarly we can use these relationships to rewrite the curves that bound our region in terms of our new variables (that is, write our bounding curves as functions involving x and y and replace x by x(u,v) and y by y(u,v) and simplify). This takes care of two out of the three.

To determine the last part, namely what happens to dA or dV, we look at what happens to a small piece in the uv-plane and what it corresponds to in the xy-plane. In particular it will correspond roughly to a parallelogram and we can use the magnitude of a cross product to find the area of the parallelogram. The cross product is a determinant and following through we get the correcting term, known as the Jacobian

$$J(u,v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix}.$$

Taking the magnitude means in this case taking the absolute value of the determinant. In particular we have the following.

\[
\iint_R f(x,y)\,dx\,dy = \iint_S f\big(x(u,v),\,y(u,v)\big)\,|J(u,v)|\,du\,dv
\]
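To see the Jacobian in action, here is a minimal sketch that estimates J(u, v) with central differences. The helper name `jacobian_2d` and the step size are assumptions made for illustration. For polar coordinates (u = r, v = θ) the result should be close to r, and for the linear change x = u + v, y = u − v it should be close to −2, so that |J| = 2.

```python
import math

def jacobian_2d(x, y, u, v, h=1e-6):
    """Central-difference estimate of J(u, v) for the map (u, v) -> (x, y)."""
    x_u = (x(u + h, v) - x(u - h, v)) / (2 * h)
    x_v = (x(u, v + h) - x(u, v - h)) / (2 * h)
    y_u = (y(u + h, v) - y(u - h, v)) / (2 * h)
    y_v = (y(u, v + h) - y(u, v - h)) / (2 * h)
    # Determinant of the 2x2 matrix of partial derivatives.
    return x_u * y_v - x_v * y_u

# Polar coordinates with u = r and v = theta, evaluated at r = 2.
j_polar = jacobian_2d(lambda r, t: r * math.cos(t),
                      lambda r, t: r * math.sin(t), 2.0, 0.7)

# Linear change of variables x = u + v, y = u - v.
j_linear = jacobian_2d(lambda u, v: u + v, lambda u, v: u - v, 0.3, 1.1)

print(j_polar, j_linear, abs(j_linear))
```

The sign of J depends on the order in which we list the variables, which is why only the absolute value appears in the integral.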

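In three dimensions something similar happens; the main difference is now that the Jacobian is a function of three variables, i.e.,

\[
J(u,v,w) =
\begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\[2ex]
\dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v} \\[2ex]
\dfrac{\partial x}{\partial w} & \dfrac{\partial y}{\partial w} & \dfrac{\partial z}{\partial w}
\end{vmatrix}.
\]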

We will use this method when we are instructed to, but also when we have unusual functions on the inside of the function we are integrating and/or the boundary curves are given by unusual functions.

Basic vector calculus

Vector fields

We now turn to the calculus of vector valued functions; such functions are also known as vector fields. In the last few chapters we have dealt with scalar valued functions, i.e., functions which return a number. But we have also seen examples of vector valued functions, i.e., functions which return a vector.

The vector valued function that we are most familiar with is the gradient vector of a function, ∇f. This is an important example: if F = ∇f then we will say that F is conservative and that f is a potential function. (In a vague way this is stating F is the “derivative” of f; in a later section we will want to start with F and find f, which is akin to finding an antiderivative.) Throughout the chapter we will be making extensive use of the “∇” operator (also known as the “del operator”). This can be thought of as the vector of partial derivatives, i.e.,

\[
\nabla = \left\langle \frac{\partial}{\partial x},\ \frac{\partial}{\partial y},\ \frac{\partial}{\partial z} \right\rangle.
\]

Once we have a vector there are three things that we might want to do with it: scale it, dot it, or cross it. Each one of these will play an important role, so let us review them. In what follows we will let f be a scalar valued function and F = ⟨M, N, P⟩ be a vector valued function.

• ∇f: This operation is scaling ∇ by a scalar valued function and the result is a vector valued function.

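• ∇ · F: This operation is dotting ∇ with a vector valued function and the result is a scalar valued function. This is known as the divergence of F, or div F for short.

\[
\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} = \frac{\partial M}{\partial x} + \frac{\partial N}{\partial y} + \frac{\partial P}{\partial z}.
\]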

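• ∇ × F: This operation is crossing ∇ with a vector valued function and the result is a vector valued function. This is known as the curl of F, or curl F for short.

\[
\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\[1ex]
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[2ex]
M & N & P
\end{vmatrix}.
\]

Expanding the above we have

\[
\operatorname{curl}\mathbf{F} = \left\langle \frac{\partial P}{\partial y} - \frac{\partial N}{\partial z},\ \frac{\partial M}{\partial z} - \frac{\partial P}{\partial x},\ \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right\rangle.
\]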
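The formulas for div F and curl F can be tried out numerically. The sketch below approximates the partial derivatives with central differences; the helper names `partial`, `divergence`, and `curl` and the two sample fields are illustrative assumptions.

```python
def partial(g, point, axis, h=1e-5):
    """Central-difference partial derivative of g(x, y, z) at point
    along axis 0 (x), 1 (y), or 2 (z)."""
    up = list(point)
    down = list(point)
    up[axis] += h
    down[axis] -= h
    return (g(*up) - g(*down)) / (2 * h)

def divergence(M, N, P, point):
    return partial(M, point, 0) + partial(N, point, 1) + partial(P, point, 2)

def curl(M, N, P, point):
    return (
        partial(P, point, 1) - partial(N, point, 2),
        partial(M, point, 2) - partial(P, point, 0),
        partial(N, point, 0) - partial(M, point, 1),
    )

p = (1.0, 2.0, 3.0)

# Rotation about the z-axis, F = <-y, x, 0>: no divergence, curl along k.
rot = (lambda x, y, z: -y, lambda x, y, z: x, lambda x, y, z: 0.0)
rot_div = divergence(*rot, p)
rot_curl = curl(*rot, p)

# Radial field, F = <x, y, z>: divergence 3, no curl.
rad = (lambda x, y, z: x, lambda x, y, z: y, lambda x, y, z: z)
rad_div = divergence(*rad, p)
rad_curl = curl(*rad, p)

print(rot_div, rot_curl)
print(rad_div, rad_curl)
```

The rotating field has all of its behavior in the curl and the radial field has all of its behavior in the divergence, which matches the intuition that curl measures rotation and divergence measures spreading out.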

Line integrals

Given a function f(x, y) we have looked at what happens over some 2-dimensional region R in the plane. But we could also ask what is happening with the function directly over some curve C. One way to think of this is that we are “hanging a curtain” where the height of the curtain is given by the function and we want to hang it directly over the curve. In this case we would do our traditional chop into small pieces, determine what is happening in each piece, and add it back up. If we did this we would get the following:

\[
\sum f(x,y)\,\Delta s \approx \int_C f(x,y)\,ds.
\]

The idea is that we take our curve and divide it into small pieces (“∆s” represents a small length of arc), and as we refine our subdivision we approach a value which we call the integral along C.

The actual method to evaluate this integral is to simply find a parameterization of the curve C. So suppose that (x(t), y(t)) is such a parameterization of our curve from t = a to t = b; then we have

\[
\int_C f(x,y)\,ds = \int_a^b f\big(x(t),y(t)\big)\,\underbrace{\sqrt{(x'(t))^2 + (y'(t))^2}\,dt}_{=\,ds}
\]

where the last term is the term for arc length. We get the same answer for any parameterization.
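The claim that any parameterization gives the same answer can be checked numerically. The sketch below (function name `line_integral_ds` and the choice f(x, y) = y on the upper unit semicircle are assumptions for illustration) evaluates the same curtain area with two different parameterizations; both should come out close to 2.

```python
import math

def line_integral_ds(f, x, y, dx, dy, a, b, n=20000):
    """Midpoint-rule estimate of the integral of f ds along
    (x(t), y(t)) for a <= t <= b, given the derivatives dx and dy."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        ds = math.hypot(dx(t), dy(t)) * dt  # arc length of this small piece
        total += f(x(t), y(t)) * ds
    return total

f = lambda x, y: y  # height of the curtain

# Parameterization 1: (cos t, sin t) for 0 <= t <= pi.
first = line_integral_ds(f, math.cos, math.sin,
                         lambda t: -math.sin(t), math.cos,
                         0.0, math.pi)

# Parameterization 2: the same semicircle traced as (cos t^2, sin t^2)
# for 0 <= t <= sqrt(pi), i.e., at a non-constant speed.
second = line_integral_ds(f,
                          lambda t: math.cos(t * t),
                          lambda t: math.sin(t * t),
                          lambda t: -2 * t * math.sin(t * t),
                          lambda t: 2 * t * math.cos(t * t),
                          0.0, math.sqrt(math.pi))

print(first, second)
```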

This can be used to find mass and center of mass for wires using the same techniques as given in the previous chapter. Also we note that we have a similar result in three dimensions where we parameterize our curves by (x(t), y(t), z(t)).

We can also find the amount of work done in moving a particle along a curve. Let F denote the vector valued function corresponding to a force needed at a certain point. Then the amount of work done at a particular point is F · T ds, where T is the unit tangent vector and ds the distance moved. Therefore the total work comes from adding this up. We can also use properties of T to simplify this further to get

\[
\text{Work} = \int_C \mathbf{F}\cdot\mathbf{T}\,ds = \int_C \mathbf{F}\cdot d\mathbf{r}.
\]

Here dr = dx i + dy j + dz k, so letting F = ⟨M, N, P⟩ this last integral becomes

\[
\int_C \mathbf{F}\cdot d\mathbf{r} = \int_C M\,dx + N\,dy + P\,dz.
\]

Here M, N, P are each functions of x, y, z. Remember that while this looks like an integral of several variables, after we throw in the parameterization this reduces to an integral in the single variable t.
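Here is a small sketch of how a work integral reduces to an integral in t once a parameterization is substituted. The function name `work` and both sample problems are assumptions chosen for illustration: F = ⟨−y, x⟩ around the unit circle (where the integrand becomes sin²t + cos²t = 1, so the work is 2π), and F = ⟨y, x²⟩ along the parabola y = x² from (0, 0) to (1, 1) (where the integrand becomes t² + 2t³, so the work is 5/6).

```python
import math

def work(M, N, x, y, dx, dy, a, b, n=20000):
    """Midpoint-rule estimate of the integral of M dx + N dy along
    (x(t), y(t)) for a <= t <= b, given the derivatives dx and dy."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        px, py = x(t), y(t)
        # M dx + N dy with dx = x'(t) dt and dy = y'(t) dt.
        total += (M(px, py) * dx(t) + N(px, py) * dy(t)) * dt
    return total

# F = <-y, x> once counterclockwise around the unit circle.
circle_work = work(lambda x, y: -y, lambda x, y: x,
                   math.cos, math.sin,
                   lambda t: -math.sin(t), math.cos,
                   0.0, 2 * math.pi)

# F = <y, x^2> along the parabola x = t, y = t^2 for 0 <= t <= 1.
parabola_work = work(lambda x, y: y, lambda x, y: x * x,
                     lambda t: t, lambda t: t * t,
                     lambda t: 1.0, lambda t: 2 * t,
                     0.0, 1.0)

print(circle_work, 2 * math.pi)
print(parabola_work, 5 / 6)
```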

Path independence

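Some line integrals are easy to evaluate. In particular, using the fundamental theorem of calculus, if C runs from the point a to the point b then we have

\[
\int_C \nabla f(\mathbf{r})\cdot d\mathbf{r} = f(\mathbf{b}) - f(\mathbf{a}),
\]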

i.e., essentially we are taking the antiderivative of ∇f and evaluating at the endpoints. The amazing thing about this is that on the right hand side there is no reference to C; this shows that the integral does not depend on the path!
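A quick numerical experiment makes path independence concrete. In the sketch below (the function name `line_integral`, the potential f = x²y + y³, and the two paths are all illustrative assumptions) we integrate ∇f = ⟨2xy, x² + 3y²⟩ from (0, 0) to (1, 2) along a straight line and along a curved path; both values should be close to f(1, 2) − f(0, 0) = 10.

```python
import math

def line_integral(M, N, path, n=20000):
    """Estimate the integral of M dx + N dy along path(t), 0 <= t <= 1,
    by treating each small piece of the path as a straight segment."""
    total = 0.0
    x0, y0 = path(0.0)
    for i in range(1, n + 1):
        x1, y1 = path(i / n)
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2  # midpoint of the segment
        total += M(xm, ym) * (x1 - x0) + N(xm, ym) * (y1 - y0)
        x0, y0 = x1, y1
    return total

f = lambda x, y: x * x * y + y**3
M = lambda x, y: 2 * x * y          # partial of f with respect to x
N = lambda x, y: x * x + 3 * y * y  # partial of f with respect to y

straight = lambda t: (t, 2 * t)
curved = lambda t: (t**3, 2 * math.sin(math.pi * t / 2))

along_straight = line_integral(M, N, straight)
along_curved = line_integral(M, N, curved)
endpoint_difference = f(1.0, 2.0) - f(0.0, 0.0)

print(along_straight, along_curved, endpoint_difference)
```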

Just as amazing is that the reverse is also true. Namely, if the value of the integral only ever depends on where the integral starts and stops then it must be the case that we are integrating a function of the form ∇f, or in other words we are working with a conservative function.

Line integrals are easy for conservative functions.

This means that we need to be able to take an integral of the form

\[
\int_C M\,dx + N\,dy
\]

and determine if this integral comes from a conservative function, and if so what the function f is. If it is conservative we have the following:

\[
M = \frac{\partial f}{\partial x} \quad\text{and}\quad N = \frac{\partial f}{\partial y}.
\]

This implies the following two partial derivatives must be equal, since they will both correspond to the same second partial derivative (only where we have changed the order that we take derivatives, and in our class order does not matter).

\[
\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}
\]

This in fact is necessary and sufficient to show that a function is conservative (provided M and N have continuous partial derivatives on a region with no holes). Once we know it is conservative it is a simple process to rebuild our original f (up to a constant). The following steps are one way to methodically solve the problem.

1. Compare the above partial derivatives to see if we are conservative. If not, we have to do something else.

2. $f = \int M\,dx + C(y)$.

(Note, we are integrating with respect to x and so we need to add a constant term that is constant with respect to x, and in our case it will be anything involving only y, which is why we add a function of y.)

3. Take the function from Step 2, take the derivative with respect to y, set it equal to N, and solve for C′(y) (at this point anything with an x should drop out).

4. Integrate to solve for C(y) (there will be a constant).

5. Put C(y) back into Step 2 and we now have f.

6. Woohoo!
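The steps above can be followed on a small example. The field F = ⟨2xy + 3, x² + 4y⟩ is an assumption chosen for illustration; the comments in the sketch carry out Steps 1 through 5 by hand, and the code then checks with central differences (helper names `d_dx` and `d_dy` are hypothetical) that the resulting f really has ∂f/∂x = M and ∂f/∂y = N.

```python
def d_dx(g, x, y, h=1e-5):
    """Central-difference partial derivative with respect to x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    """Central-difference partial derivative with respect to y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

M = lambda x, y: 2 * x * y + 3
N = lambda x, y: x * x + 4 * y

# Step 1: dM/dy = 2x and dN/dx = 2x, so the field is conservative.
# Step 2: f = (integral of M dx) + C(y) = x^2 y + 3x + C(y).
# Step 3: df/dy = x^2 + C'(y) must equal N = x^2 + 4y, so C'(y) = 4y.
# Step 4: C(y) = 2y^2 (plus a constant, which we take to be 0).
# Step 5: put C(y) back into Step 2.
f = lambda x, y: x * x * y + 3 * x + 2 * y * y

samples = [(0.5, -1.0), (2.0, 3.0), (-1.5, 0.25)]
max_error = 0.0
for x, y in samples:
    max_error = max(
        max_error,
        abs(d_dy(M, x, y) - d_dx(N, x, y)),  # Step 1 check
        abs(d_dx(f, x, y) - M(x, y)),        # df/dx = M
        abs(d_dy(f, x, y) - N(x, y)),        # df/dy = N
    )

print(max_error)  # should be tiny
```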

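Similar things work in three dimensions. In that case a function is conservative if the following holds:

\[
\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}, \qquad
\frac{\partial M}{\partial z} = \frac{\partial P}{\partial x}, \qquad
\frac{\partial N}{\partial z} = \frac{\partial P}{\partial y}
\]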

Green’s Theorem

In the above we basically reduce to understanding what is happening at the endpoints, i.e., we integrate by looking at some related function at the boundary. This is a common unifying theme for this chapter. When we have a curve the boundary is the endpoints. When we have a region the boundary is the curve that bounds the region. We can again relate an integral over the interior of the region to an integral over the boundary of the region.

Green’s Theorem: Let C be a piecewise smooth closed curve, oriented “counterclockwise”, bounding a region S in the plane, and let M and N have continuous partial derivatives. Then

\[
\iint_S \left(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\right) dA = \oint_C M\,dx + N\,dy.
\]

(The notation “∮” is used to emphasize that the integral is over a closed oriented curve.)
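Green’s Theorem can be sanity checked on the unit square. In the sketch below the choice M = xy, N = x² and the function names `double_integral` and `boundary_integral` are assumptions for illustration. Here ∂N/∂x − ∂M/∂y = 2x − x = x, so both sides should be close to 1/2.

```python
def double_integral(g, n=400):
    """Midpoint-rule estimate of the integral of g(x, y) over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += g((i + 0.5) * h, (j + 0.5) * h)
    return total * h * h

def boundary_integral(M, N, n=4000):
    """Estimate of the integral of M dx + N dy counterclockwise around
    the boundary of the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += M(t, 0.0) * h  # bottom edge, left to right (dx > 0)
        total += N(1.0, t) * h  # right edge, upward (dy > 0)
        total -= M(t, 1.0) * h  # top edge, right to left (dx < 0)
        total -= N(0.0, t) * h  # left edge, downward (dy < 0)
    return total

M = lambda x, y: x * y
N = lambda x, y: x * x

interior = double_integral(lambda x, y: 2 * x - x)  # N_x - M_y
boundary = boundary_integral(M, N)
print(interior, boundary)
```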

Green’s Theorem is useful because it allows us to take an integral on the interior and make it equivalent to one on the boundary, and vice versa. In particular we will tend to use this theorem when we are dealing with an integral around a closed curve bounding a simple region.

We also note that we can have “holey” regions by combining our curves together. More generally, instead of orienting counterclockwise we will orient so that the region is always on our left hand side as we traverse.

A simple application of this is finding area. For example we have

\[
\oint_C x\,dy = \text{Area enclosed in } C.
\]

This basic idea has been implemented to create devices known as planimeters, which can find the area of a region by tracing the boundary of the region.
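A software version of the planimeter idea is easy to sketch for polygons. On a straight edge x is linear, so the integral of x dy over that edge is the average of the two x-values times the change in y; adding this over all edges traces the boundary. The function name `polygon_area` and the sample shapes are illustrative assumptions.

```python
def polygon_area(vertices):
    """Area enclosed by a polygon whose vertices are listed counterclockwise,
    computed as the sum over edges of the integral of x dy."""
    total = 0.0
    count = len(vertices)
    for i in range(count):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % count]
        # Integral of x dy along a straight edge: (average x) * (change in y).
        total += 0.5 * (x0 + x1) * (y1 - y0)
    return total

rectangle = [(0, 0), (3, 0), (3, 2), (0, 2)]
triangle = [(0, 0), (4, 0), (0, 3)]

print(polygon_area(rectangle))        # 6.0
print(polygon_area(triangle))         # 6.0
print(polygon_area(rectangle[::-1]))  # -6.0, clockwise flips the sign
```

The sign flip for the clockwise listing is the orientation requirement of Green’s Theorem showing up in practice.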

Applications of Green’s Theorem

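Let us suppose that we are starting with a path C and a vector valued function F in the plane. Then as we traverse along C there are two important (unit) vectors, namely T, the unit tangent vector $\left\langle \frac{dx}{ds}, \frac{dy}{ds} \right\rangle$, and n, the unit normal vector $\left\langle \frac{dy}{ds}, -\frac{dx}{ds} \right\rangle$. So we can consider the following integrals.

\[
\int_C \mathbf{F}\cdot\mathbf{T}\,ds \quad\text{and}\quad \int_C \mathbf{F}\cdot\mathbf{n}\,ds.
\]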

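Note that these integrals exist for any C; however, once we add on the condition that C is a closed curve then we can use Green’s Theorem to simplify the integrals and in particular turn these into double integrals over the region S enclosed by C. With F = ⟨M, N⟩ we have the following:

\[
\oint_C \mathbf{F}\cdot\mathbf{n}\,ds = \iint_S \left(\frac{\partial M}{\partial x} + \frac{\partial N}{\partial y}\right) dA
\]

\[
\oint_C \mathbf{F}\cdot\mathbf{T}\,ds = \iint_S \left(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\right) dA
\]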

The first integral is measuring flow through the curve and is also known as flux.

What makes this interesting is that we can express the integrals on the right hand side in terms of various operations we can do to F using the ∇ operator. For example the integrand of the first integral is simply ∇ · F and so is naturally associated with divergence. The integrand of the second integral is a part of ∇ × F (where we add a zero component to make F three dimensional), namely it is the part corresponding to the entry in the z-direction (i.e., k), and we need to pull it off, which can be done with a dot product. Updating we now have

\[
\oint_C \mathbf{F}\cdot\mathbf{n}\,ds = \iint_S (\nabla\cdot\mathbf{F})\,dA
\]

\[
\oint_C \mathbf{F}\cdot\mathbf{T}\,ds = \iint_S (\nabla\times\mathbf{F})\cdot\mathbf{k}\,dA
\]

In particular, we have the following rule of thumb which will come up again later.

“·T” ↔ curl and “·n” ↔ div .
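The rule of thumb can be tested on the unit circle. In this sketch the field F = ⟨x − y, x + 2y⟩ and the function name `boundary_integrals` are assumptions for illustration. This field has ∇ · F = 3 and (∇ × F) · k = 2, both constant, so the double integrals are just these constants times the area π of the disk.

```python
import math

def boundary_integrals(M, N, n=20000):
    """Estimate the flux (F . n) and circulation (F . T) integrals of
    F = <M, N> counterclockwise around the unit circle."""
    dt = 2 * math.pi / n
    flux = 0.0
    circulation = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx_ds, dy_ds = -math.sin(t), math.cos(t)  # T = <dx/ds, dy/ds>, ds = dt
        # n = <dy/ds, -dx/ds> points out of the disk.
        flux += (M(x, y) * dy_ds - N(x, y) * dx_ds) * dt
        circulation += (M(x, y) * dx_ds + N(x, y) * dy_ds) * dt
    return flux, circulation

M = lambda x, y: x - y
N = lambda x, y: x + 2 * y

flux, circulation = boundary_integrals(M, N)
print(flux, 3 * math.pi)         # "dot n" matches divergence times area
print(circulation, 2 * math.pi)  # "dot T" matches curl times area
```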

Much of the rest of this chapter is built upon expanding these two basic ideas; in order to get to that point we will first need to work on doing surface integration.

Surface integrals

We have already seen line integrals, where we have a function that we can restrict to points on the curve and then integrate along that curve. Surface integrals work in the same way. Namely, we have a surface sitting in three-dimensional space and we have a function f(x, y, z) that assigns a value to each point in space, and in particular each point on the surface. We can then integrate this function on the surface by breaking the surface into little tiny pieces “d(SA)”, finding the value of the function on each piece, and hence the total contribution of that piece f(x, y, z) d(SA), and finally adding all of the little pieces up. In particular we represent our surface integral for the surface G in the form

\[
\text{Surface integral} = \iint_G f(x,y,z)\,d(SA).
\]

This is great theory, and intuitive, but we need to see how to accomplish this in practice. We will start with a special case, namely suppose that our surface is formed by a function z = g(x, y) over some region R in the xy-plane. Then we can simply express everything in terms of x and y (including our region we integrate over); the hardest part is the “d(SA)” term, but we developed that in the last chapter. Therefore we have the following.

\[
\iint_G f(x,y,z)\,d(SA) = \iint_R f\big(x,y,g(x,y)\big)\sqrt{(g_x)^2 + (g_y)^2 + 1}\,dA
\]

Not surprisingly these types of integrals are generally unpleasant and best avoided. The problem comes from what is happening under the square root.

Amazingly though there is a special case when this simplifies tremendously, namely when we go back and look at the flux through the surface. This works similarly to before and we have

\[
\text{flux} = \iint_G (\mathbf{F}\cdot\mathbf{n})\,d(SA).
\]

At first glance it would appear that things have gotten worse for us in that the “n” term will also involve a square root. But the amazing thing is that these two square roots exactly cancel out. Woohoo! In particular, if we let F = ⟨M, N, P⟩ and have a surface over a region in the plane as before then we have

\[
\iint_G (\mathbf{F}\cdot\mathbf{n})\,d(SA) = \iint_R \big(-M g_x - N g_y + P\big)\,dA.
\]

(This assumes that we are dealing with “upward pointing normals”.)
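The cancellation of the two square roots can be watched happening numerically. The sketch below computes the flux two ways: once with the unit normal and the d(SA) factor written out (square roots included), and once with the simplified integrand. The surface z = x² + y² over the unit square, the field F = ⟨x, y, z⟩, and the function name `flux_two_ways` are assumptions for illustration; for this choice the simplified integrand is −(x² + y²), so both values should be close to −2/3.

```python
import math

g = lambda x, y: x * x + y * y  # the surface z = g(x, y)
g_x = lambda x, y: 2 * x
g_y = lambda x, y: 2 * y
F = lambda x, y, z: (x, y, z)   # F = <M, N, P>

def flux_two_ways(n=300):
    """Midpoint-rule flux of F through z = g(x, y) over the unit square,
    computed with and without the square roots."""
    h = 1.0 / n
    with_roots = 0.0
    simplified = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            M, N, P = F(x, y, g(x, y))
            root = math.sqrt(g_x(x, y) ** 2 + g_y(x, y) ** 2 + 1)
            # Upward pointing unit normal and d(SA) = root * dA.
            normal = (-g_x(x, y) / root, -g_y(x, y) / root, 1 / root)
            f_dot_n = M * normal[0] + N * normal[1] + P * normal[2]
            with_roots += f_dot_n * root * h * h
            simplified += (-M * g_x(x, y) - N * g_y(x, y) + P) * h * h
    return with_roots, simplified

with_roots, simplified = flux_two_ways()
print(with_roots, simplified, -2 / 3)
```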

More generally surfaces can be thought of as being parameterizations of two dimensional regions (similar to the philosophy that a tangent line is a parameterization of an interval). From this perspective we have

\[
\mathbf{r}(u,v) = \big\langle x(u,v),\ y(u,v),\ z(u,v) \big\rangle.
\]

Using the same approach as used for the Jacobian and surface integrals from the last chapter we have that “$d(SA) = \|\mathbf{r}_u \times \mathbf{r}_v\|\,dA$”. This gives us the following more general result, where we will let S denote the region in the uv-plane:

\[
\iint_G f(x,y,z)\,d(SA) = \iint_S f(\mathbf{r})\,\|\mathbf{r}_u \times \mathbf{r}_v\|\,dA.
\]

(In practice we will not need this more general form for our purposes.)

Divergence Theorem

We can compute the flux of a function F through any surface. Earlier in the plane we noted that if our curve was closed then we could relate the integral of the flux to an integral over the region it encloses. Amazingly a similar result holds for closed surfaces, so similar that except for some minor notation it is the same. Before we dive into the specifics of what we need and the conclusions of the result, the take home message is that the flux can be computed by either an integral on the boundary dotted with the normal or an integral on the interior of the divergence, in both two and three dimensions.

Also before we get too far we should talk about notation. In an effort to be green before it was cool, mathematicians started to reuse symbols. We will be looking at solid shapes and talking about the bounding surfaces; in this case if S is our solid then we will denote the boundary of S (i.e., the surface on the exterior of S) as ∂S. This uses the same “∂” that we saw in partial derivatives. So to be clear, if the “∂” is in front of a shape we read this as the boundary of that shape, and if the “∂” is in front of a function we read this as the partial derivative of that function.

The Divergence Theorem then states that given a solid shape S with a boundary ∂S that is “nice” (i.e., composed of pieces that are smooth, a fancy way of saying we can do calculus with those pieces) and a function F = ⟨M, N, P⟩ with nice partial derivatives, then

\[
\text{Flux} = \iint_{\partial S} (\mathbf{F}\cdot\mathbf{n})\,d(SA) = \iiint_S (\nabla\cdot\mathbf{F})\,dV.
\]

The normal vector n will always point out, i.e., away from the interior. Note that the conclusion of this theorem still holds for cases when the boundary surface is composed of several pieces.

In practice this is useful because we can take a hard surface integral and reduce it to a less-hard integral over a solid shape. In the latter case we can often use some basic tools to compute this integral (e.g., volumes, symmetry and so on).
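As a concrete check of the Divergence Theorem, the sketch below compares both sides on the unit cube, whose boundary is composed of six flat pieces. The field F = ⟨xy, yz, zx⟩ and the function names `cube_flux` and `cube_divergence_integral` are assumptions for illustration. Here ∇ · F = y + z + x, and both sides should be close to 3/2.

```python
def cube_flux(M, N, P, n=200):
    """Midpoint-rule outward flux of F = <M, N, P> through the six faces
    of the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            a, b = (i + 0.5) * h, (j + 0.5) * h
            # Outward normals are +i / -i on the faces x = 1 / x = 0, etc.
            total += (M(1.0, a, b) - M(0.0, a, b)) * h * h
            total += (N(a, 1.0, b) - N(a, 0.0, b)) * h * h
            total += (P(a, b, 1.0) - P(a, b, 0.0)) * h * h
    return total

def cube_divergence_integral(div, n=60):
    """Midpoint-rule integral of the divergence over the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += div((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
    return total * h**3

M = lambda x, y, z: x * y
N = lambda x, y, z: y * z
P = lambda x, y, z: z * x
div_F = lambda x, y, z: y + z + x  # computed by hand from M, N, P

surface_side = cube_flux(M, N, P)
volume_side = cube_divergence_integral(div_F)
print(surface_side, volume_side)
```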

Stokes’s Theorem

The generalization of the other result is known as Stokes’s Theorem. Before diving into it let us first note that while we can talk about normals to both curves and surfaces in a meaningful way, there is no way to talk about a tangent to a surface in a meaningful way. So when we generalize the result relating to the tangent vector we will essentially stay in the same setting where we have a single region (now a surface) with a boundary and we are computing an integral over both the surface and the boundary of the surface. Viewed in this context the result is nearly identical to what we had before.

Given a function F = ⟨M, N, P⟩ with nice partial derivatives and a surface S which is smooth with smooth boundary ∂S, then

\[
\oint_{\partial S} \mathbf{F}\cdot\mathbf{T}\,ds = \iint_S (\nabla\times\mathbf{F})\cdot\mathbf{n}\,d(SA).
\]

The hardest part about applying this is getting the orientation on the curve correct. Given n, imagine that we slice off a thin strip next to ∂S and that we will walk along ∂S on this thin strip standing with our head in the direction of n. Then the direction that we should travel along the curve is the one which will keep the surface to our left hand side. “To the left, to the left, always keep the surface on the side to the left.” Conversely if we have an orientation on our curve, this gives an orientation to the surface, i.e., n, by a similar argument.

This is generally a hard theorem to use for a few reasons. First off, we have a hard time motivating what is going on except to vaguely say that this is somehow measuring rotation. Second off, both sides of the above equation can be very difficult to compute so we do not seem to gain much. But there is a secret that can make many of these problems easier. And once you see the secret then the theorem becomes many times more amazing. Here it is:

Different surfaces S and T can have ∂S = ∂T .

Why is this useful? Because the integral on the left only looks at ∂S, so this says that we can replace S by any surface with the same boundary that better fits our mood. Generally that means we can reduce it to a surface that is flat. For such a surface it is easy to then find n and the problem becomes many times easier! The majority of Stokes’s Theorem problems can be done quickly and easily by using this observation.
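The secret can be seen in a small numerical experiment. The unit circle in the xy-plane bounds both the flat unit disk (with n = k) and the upper unit hemisphere (with n = ⟨x, y, z⟩). In the sketch below the field F = ⟨z − y, x, y⟩, whose curl works out by hand to ⟨1, 1, 2⟩, and all three function names are assumptions for illustration. The circulation and both surface integrals should be close to 2π.

```python
import math

def field(x, y, z):
    return (z - y, x, y)  # F = <M, N, P>

def curl_field(x, y, z):
    return (1.0, 1.0, 2.0)  # curl of F, computed by hand

def circulation(n=2000):
    """Integral of F . T ds counterclockwise around the unit circle in z = 0."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        M, N, P = field(math.cos(t), math.sin(t), 0.0)
        total += (M * -math.sin(t) + N * math.cos(t)) * dt
    return total

def flux_through_disk(n=200):
    """Integral of (curl F) . k over the flat unit disk, in polar coordinates."""
    dr, dt = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            c = curl_field(r * math.cos(t), r * math.sin(t), 0.0)
            total += c[2] * r * dr * dt
    return total

def flux_through_hemisphere(n=200):
    """Integral of (curl F) . n over the upper unit hemisphere."""
    dp, dt = (math.pi / 2) / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dp
        for j in range(n):
            t = (j + 0.5) * dt
            # On the unit sphere the outward normal equals the position vector.
            nx = math.sin(phi) * math.cos(t)
            ny = math.sin(phi) * math.sin(t)
            nz = math.cos(phi)
            c = curl_field(nx, ny, nz)
            total += (c[0] * nx + c[1] * ny + c[2] * nz) * math.sin(phi) * dp * dt
    return total

around = circulation()
flat = flux_through_disk()
curved = flux_through_hemisphere()
print(around, flat, curved, 2 * math.pi)
```

The flat disk needs only the k-component of the curl, which is why switching to a flat surface usually makes the problem shorter.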
