
Chapter 9

Vectors

Euclid’s Elements codified what was known about geometry into a handful of axioms and then showed that all of geometry could be deduced from them. His achievement was impressive, but it does suffer from one drawback: it is not the easiest system to use. Even proving simple results, like Pythagoras’ theorem, takes dozens of intermediate results. It can be compared to a low-level programming language in which everything has to be spelt out. It is therefore a great theoretical achievement, but not so much a practical one.

It was not until the nineteenth century that a practical tool for doing three-dimensional geometry was constructed. On the basis of the work carried out by Hamilton on quaternions (I say a little more about this later), the theory of vectors, the subject of this chapter, was developed by the American Josiah Willard Gibbs and promoted by the English electrical engineer Oliver Heaviside. We introduce their theory in this chapter. It can be compared to a high-level programming language with in-built features that simplify proving results. For example, Pythagoras’ theorem may be proved in a couple of lines. Of course, there is a price to be paid in that it takes longer to learn, but it is subsequently easier to use.

However, what is very important to remember is that the geometry developed by Euclid and the geometry described in this chapter are one and the same. They are just being described in different ways.

I shall also touch on the connection with the work of the previous two chapters. Each linear equation in three unknowns is in fact the equation of a plane in three-dimensional space. This means that the theory of linear equations in three unknowns has a geometrical interpretation. This may be generalized: the theory of matrices combined with a theory of vectors in arbitrary dimensions is known as linear algebra, and is one of the most important branches of algebra.

I have not attempted to develop the subject in this chapter completely rigorously, so I often make appeals to geometric intuition in setting up the algebraic theory of vectors.

9.1 Vectors geometrically

I assume you are familiar with the following ideas in three dimensions:

• The notion of a point.

• The notion of a line and of a line segment.

• The notion of the length of a line segment and the angle between two line segments.

• The notion of a directed line segment.

• The notion of parallel lines.

The notion of a pair of lines being parallel is fundamental to Euclidean geometry. We used it, for example, in proving that the angles in a triangle add up to two right angles.

9.1.1 Addition and scalar multiplication of vectors

Definition of a vector Two directed line segments which are parallel, have the same length, and point in the same direction are said to represent the same vector.

The word ‘vector’ means carrier in Latin and what a vector carries is information about length and direction and nothing else. Because vectors stay the same when they move parallel to themselves, they also preserve information about angles. I shall denote vectors by bold letters a, b, . . . If P and Q are points then the directed line segment from P to Q is written PQ or $\overrightarrow{PQ}$. If P = Q then PQ is just a point. The zero vector 0 is represented by the degenerate line segment PP. Vectors are denoted by arrows: the vector starts at the base of the arrow (where the feathers would be), which we shall call the tail of the vector, and ends at the tip (where the arrowhead is), which we shall call the point or head of the vector. For example, all the directed line segments below represent the same vector.

Let a and b be vectors. Their sum is defined as follows: slide the vectors parallel to themselves so that the point of a touches the tail of b. The directed line segment from the tail of a to the point of b represents the vector a + b.

[Figure: a and b placed head to tail, with a + b running from the tail of a to the point of b.]

If a is a vector, then −a is defined to be the vector with the same length as a but pointing in the opposite direction.

[Figure: the vectors a and −a.]

The following properties should be familiar. They are the same as the properties of addition of real numbers, and the addition of matrices of the same size.


Theorem 9.1.1 (Properties of vector addition).

1. a + (b + c) = (a + b) + c.

2. 0 + a = a = a + 0.

3. a + (−a) = 0 = (−a) + a.

4. a + b = b + a.

Proof. The first thing we have to do is show that the definition of vector addition makes sense. I shall do this informally. Let A, B and C be three points in space, and let A′, B′ and C′ be three other points. Suppose in addition that AB is parallel to and the same length as A′B′, and that BC is the same length and parallel to B′C′. Then I claim that AC is the same length and parallel to A′C′ and both line segments point in the same direction.

[Figure: the triangles ABC and A′B′C′.]

(1) The idea behind the proof of associativity is illustrated in the diagram below. We choose directed line segments such that a = AB, b = BC and c = CD. The directed line segment AD, in gray, represents both (a + b) + c, via blue, and a + (b + c), via red.

[Figure: the points A, B, C, D joined by the vectors a, b, c, with the segment AD in gray.]


(4) The idea behind the proof of the commutativity is illustrated in the diagram below. The directed line segments AB and DC represent a and the directed line segments BC and AD represent b. The directed line segment AC represents both a + b and b + a.

[Figure: the parallelogram ABCD with sides AB and DC labelled a and sides BC and AD labelled b.]

The proofs of (2) and (3) are straightforward.

As usual, we define subtraction in terms of addition

a− b = a + (−b).

Example 9.1.2. Consider the following square, though any other polygon would work, and choose vectors as shown.

[Figure: a square whose consecutive sides are the vectors a, b, c, d.]

Then we have

a + b + c + d = 0.

Let a be a vector. Denote by ‖a‖ the length of the vector. If ‖a‖ = 1 then a is called a unit vector. We have that ‖a‖ ≥ 0, and ‖a‖ = 0 if, and only if, a = 0. By results on triangles we have the triangle inequality

‖a + b‖ ≤ ‖a‖+ ‖b‖ .


We now define multiplication of a vector by a scalar. Let λ be a scalar and a a vector. If λ = 0 then λa = 0. If λ > 0 then λa has the same direction as a and length λ ‖a‖. If λ < 0 then λa has the opposite direction to a and length (−λ) ‖a‖. Observe that in all cases

‖λa‖ = |λ| ‖a‖ .

If a is non-zero then

$$\hat{\mathbf{a}} = \frac{\mathbf{a}}{\|\mathbf{a}\|}$$

is a unit vector in the same direction as a. We call this process normalization. Vectors that differ by a scalar multiple are said to be parallel. The following are all intuitively plausible.

Theorem 9.1.3 (Properties of scalar multiplication).

1. 0a = 0.

2. 1a = a.

3. (−1)a = −a.

4. (λ+ µ)a = λa + µa.

5. λ(a + b) = λa + λb.

6. λ(µa) = (λµ)a.

We can use what we have introduced so far to prove simple geometric theorems.

Example 9.1.4. If the midpoints of the consecutive sides of any quadrilateral are joined by line segments, then the resulting quadrilateral is a parallelogram. We refer to the picture below.

[Figure: a quadrilateral with the midpoints A, B, C, D of its sides joined.]


We choose vectors as shown in the diagram below.

[Figure: the same quadrilateral with its sides labelled by the vectors a, b, c, d.]

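We have that

a + b + c + d = 0.

Now $\overrightarrow{AB} = \tfrac{1}{2}\mathbf{a} + \tfrac{1}{2}\mathbf{b}$ and $\overrightarrow{CD} = \tfrac{1}{2}\mathbf{c} + \tfrac{1}{2}\mathbf{d}$. But a + b = −(c + d). It follows that $\overrightarrow{AB} = -\overrightarrow{CD}$. Hence the line segment AB is parallel to the line segment CD and they have the same length. Similarly, BC is parallel to AD and has the same length.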
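The claim of Example 9.1.4 can be checked numerically for any particular quadrilateral. The following is a minimal Python sketch, assuming points are stored as coordinate triples; the vertex list `P` and the helper names `midpoint`, `sub` and `midpoint_quadrilateral` are illustrative choices and do not come from the text.

```python
def midpoint(p, q):
    """Midpoint of the segment joining the points p and q."""
    return tuple((x + y) / 2 for x, y in zip(p, q))

def sub(p, q):
    """The vector from the point q to the point p."""
    return tuple(x - y for x, y in zip(p, q))

def midpoint_quadrilateral(vertices):
    """Midpoints of the four consecutive sides of a quadrilateral."""
    return [midpoint(vertices[i], vertices[(i + 1) % 4]) for i in range(4)]

# An arbitrary (even non-planar) quadrilateral, chosen only for illustration.
P = [(0, 0, 0), (4, 0, 2), (6, 4, 0), (2, 6, -2)]
A, B, C, D = midpoint_quadrilateral(P)

# Opposite sides of ABCD are represented by the same vector.
print(sub(B, A) == sub(C, D))   # AB equals DC, that is, AB = -CD
print(sub(C, B) == sub(D, A))   # BC equals AD
```

A check of this kind is not a proof, but it is a quick way to catch a sign error in the vector argument.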

9.1.2 Inner products

We now introduce a notion that will enable us to measure angles and lengths. It is a development of the idea of the perpendicular projection of a line onto another line. Let a and b be two vectors. If a, b ≠ 0 then define

a · b = ‖a‖ ‖b‖ cos θ

where θ is the angle between a and b. Note that this angle is always chosen to be 0 ≤ θ ≤ π. If either a or b is zero then a · b is defined to be zero. We call a · b the inner product of a and b. It is important to remember that it is a scalar and not a vector. We say that non-zero vectors a and b are orthogonal if the angle between them is ninety degrees. The key property of the inner product is that for non-zero a and b we have that

a · b = 0 iff a and b are orthogonal.


Theorem 9.1.5 (Properties of the inner product).

1. a · b = b · a.

2. a · a = ‖a‖².

3. λ(a · b) = (λa) · b = a · (λb).

4. a · (b + c) = a · b + a · c.

Proof. The proofs of (1), (2) and (3) are straightforward. We outline the proof of (4). Let x and y be a pair of vectors. Then the component of x in the direction of y, written comp(x, y), is by definition the number ‖x‖ cos θ where θ is the angle between x and y.

[Figure: the vectors x and y with the angle θ between them; comp(x, y) is the length of the projection of x onto y.]

Clearly

x · y = ‖y‖ comp(x, y).

Geometry shows that

comp(b + c, a) = comp(b, a) + comp(c, a).


We therefore have that

(b + c) · a = ‖a‖ comp(b + c, a)

= ‖a‖ comp(b, a) + ‖a‖ comp(c, a)

= b · a + c · a

The inner product a · a is often abbreviated a².

The inner product enables us to prove much more interesting theorems.

Example 9.1.6. The angle in a semicircle is a right angle. Draw a semicircle. Choose any point on the circumference of the semicircle and join it to the points at either end of the diameter of the semicircle. Then the claim is that the resulting triangle is right-angled.

[Figure: a semicircle with diameter BC and a point A on its circumference.]

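We are interested in the angle formed by AB and AC. We choose vectors as shown in the diagram below.

[Figure: the same semicircle, with −a and a running from the centre to B and C, and b running from the centre to A.]

Observe that $\overrightarrow{AB} = -(\mathbf{a} + \mathbf{b})$ and $\overrightarrow{AC} = \mathbf{a} - \mathbf{b}$. Thus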

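$$\begin{aligned}
\overrightarrow{AB} \cdot \overrightarrow{AC} &= -(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) \\
&= -(\mathbf{a}^2 - \mathbf{a} \cdot \mathbf{b} + \mathbf{b} \cdot \mathbf{a} - \mathbf{b}^2) \\
&= -(\mathbf{a}^2 - \mathbf{b}^2) \\
&= 0
\end{aligned}$$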


using the fact that a · b = b · a and ‖a‖ = ‖b‖, because this is just the radius of the semicircle. It follows that the angle BAC is a right angle, as claimed.
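The calculation in Example 9.1.6 can be spot-checked with coordinates. The sketch below is only an illustration in Python, assuming the semicircle is centred at the origin of the plane with its diameter BC along the x axis; the function name `semicircle_inner_product` and the sample values of the radius and the angle are hypothetical.

```python
import math

def semicircle_inner_product(r, t):
    """Inner product of AB and AC, where B = (-r, 0), C = (r, 0) and
    A = (r cos t, r sin t) is a point on the circle of radius r."""
    A = (r * math.cos(t), r * math.sin(t))
    B, C = (-r, 0.0), (r, 0.0)
    AB = (B[0] - A[0], B[1] - A[1])
    AC = (C[0] - A[0], C[1] - A[1])
    return AB[0] * AC[0] + AB[1] * AC[1]

# The inner product vanishes (up to rounding) wherever A sits on the semicircle.
for t in (0.3, 1.0, 2.5):
    print(t, abs(semicircle_inner_product(2.0, t)) < 1e-12)
```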

Example 9.1.7. Pythagoras’ theorem proved using vectors.

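[Figure: a right-angled triangle with sides of lengths a and b and hypotenuse of length c.]

We prove that a² + b² = c². Choose vectors as shown in the diagram below.

[Figure: the same triangle with its sides labelled by the vectors a, b, c.]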

We have that

a + b + c = 0

and so a + b = −c. Now

(a + b)² = (−c) · (−c) = ‖c‖².

But

(a + b)² = ‖a‖² + 2a · b + ‖b‖²

and this is equal to ‖a‖² + ‖b‖² because a · b = 0. It follows that

‖a‖² + ‖b‖² = ‖c‖².

The set of 3-dimensional vectors equipped with the operations of vector addition and scalar multiplication together with the inner product is called three-dimensional Euclidean space E³. This is precisely the space of Euclid’s geometry, but done in a modern way.


9.1.3 Vector products

In three-dimensional space, there is another operation available that is useful in many applications. Let a and b be non-zero vectors. I shall define a unique vector in terms of these two. Any vector is determined by a direction and a length. Direction can be specified by giving a unit vector n. I shall deal with that last of all. First I shall specify the length of our vector. There is no loss of generality in assuming that a and b lie in the plane of this page.

[Figure: the vectors a and b drawn from a common point.]

If the angle between these vectors is non-zero, then they form two sides of a parallelogram.

[Figure: the parallelogram with sides a and b.]

With reference to the diagram below

[Figure: the parallelogram with sides a and b and the angle θ between them.]

the area enclosed is

‖a‖ ‖b‖ sin θ

where θ is the angle between a and b. For our unit vector n, we shall want a vector orthogonal to both a and b. There are only two choices: either we choose the vector pointing out of the page or we choose the vector pointing into the page. We choose the vector pointing into the page. With all this in mind, define

a× b = ‖a‖ ‖b‖ sin θn.

If a or b is zero or θ = 0 then a × b is the zero vector. We call it the vector product of a and b. The key property of the vector product is that for non-zero vectors

a× b = 0 iff a and b are parallel.

Theorem 9.1.8 (Properties of the vector product).

1. a× b = −b× a.

2. λ(a× b) = (λa)× b = a× (λb).

3. a× (b + c) = a× b + a× c.

Proof. The proofs of (1) and (2) are straightforward. We prove (3). We defined the vector product in terms of geometry and so we shall have to prove this property by means of geometry.

We begin with an observation. Let a and b be a pair of vectors. It is convenient to move them so that they are both emanating from the same point P. They determine a plane. In that plane, we can draw a line perpendicular to the vector a and passing through the point P. We project the vector b onto this line and we get a vector b′. Then a × b = a × b′. The proof follows by observing that these two vectors clearly have the same direction and a calculation shows that they have the same length.

We now have the key geometric intuition. Orientate so that the vector a is at right angles to the page and pointing at you, the reader. If x is a vector in the plane of the page, then a × x will also be in the plane of the page. It will be the vector obtained from x by first rotating x by a right angle in an anti-clockwise direction and then multiplying its length by a scalar equal to the length of a.


We project the vectors b and c onto the plane of the page to get the vectors b′ and c′. We shall prove that

a× (b′ + c′) = a× b′ + a× c′.

Let’s see first why this result is enough to prove the theorem. The vectors a and b + c define a plane. As in our observation above, we have that a × (b + c) = a × (b + c)′. Also a × b = a × b′ and a × c = a × c′. As long as (b + c)′ = b′ + c′, our theorem will follow.

We now prove that

a× (b′ + c′) = a× b′ + a× c′.

By the way we have defined our vectors, a × b′ and a × c′ are in the plane of the page and are orthogonal to b′ and c′, respectively. By our key geometric intuition above, the angle between a × b′ and a × c′ is the same as the angle between b′ and c′. It follows that a × b′ + a × c′ is at right angles to b′ + c′. Thus a × b′ + a × c′ and a × (b′ + c′) are vectors pointing in the same direction. We now compare the lengths of these two vectors. We shall use the fact that the triangle formed by the vectors a × b′ and a × c′ is similar to the triangle formed by the vectors b′ and c′. Thus

$$\frac{\|\mathbf{a} \times \mathbf{b}' + \mathbf{a} \times \mathbf{c}'\|}{\|\mathbf{b}' + \mathbf{c}'\|} = \frac{\|\mathbf{a} \times \mathbf{b}'\|}{\|\mathbf{b}'\|}.$$

But this works out to give

‖a× b′ + a× c′‖ = ‖a‖ ‖b′ + c′‖ .

Our claim is now proved.

It is very important to observe that

a × (b × c) ≠ (a × b) × c in general.

In other words, the vector product is not associative. This is shown in the exercises by means of a counter-example.

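Example 9.1.9. We shall prove the law of sines for triangles using the vector product. With reference to the diagram below

[Figure: a triangle with vertices A, B, C and with sides of lengths a, b, c opposite those vertices.]

this states that

$$\frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c}.$$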

Choose vectors as shown

[Figure: the same triangle with its sides labelled by the vectors a, b, c.]

Then

a + b + c = 0.

Hence

a + b = −c.

Take the vector product of both sides of this equation, on the left, with a, b and c in turn. We get

1. a× b = c× a.

2. b× a = c× b.

3. c× a = b× c.

From (1), we get

‖a × b‖ = ‖c × a‖.

Thus

‖b‖ sin C = ‖c‖ sin B

which gives us the second equation in the statement of the result. The remaining results follow similarly.


9.1.4 Scalar triple products

This product is nothing more than a combination of the previous two. However, it is included because, as we shall see, it has an important geometric interpretation. Let a, b and c be three vectors. Then b × c is a vector. Thus a · (b × c) is a scalar. We define

[a,b, c] = a · (b× c).

It is called the scalar triple product. Its properties are determined by the properties of the inner and vector products. What it means geometrically will be described later.

Exercises 9.1

1. Consider the following diagram.

[Figure: a diagram with the points A, B, C in one row and D, E, F in another, together with the vectors a, b and c.]

Now answer the following questions.

(a) Write the vector BD in terms of a and c

(b) Write the vector AE in terms of a and c

(c) What is the vector DE?

(d) What is the vector CF?

(e) What is the vector AC?

(f) What is the vector BF?

2. If a, b, c and d represent the consecutive sides of a quadrilateral, show that the quadrilateral is a parallelogram if, and only if, a + c = 0.

[Figure: a quadrilateral with consecutive sides a, b, c, d.]

3. In the regular pentagon ABCDE, let AB = a, BC = b, CD = c, and DE = d. Express EA, DA, DB, CA, EC, BE in terms of a, b, c, and d.

[Figure: the regular pentagon ABCDE with sides labelled a, b, c, d.]

4. Let a and b represent adjacent sides of a regular hexagon so that the initial point of b is the terminal point of a. Represent the remaining sides by means of vectors expressed in terms of a and b.

[Figure: a regular hexagon with two adjacent sides labelled a and b.]


5. Prove that ‖a‖ b + ‖b‖ a is orthogonal to ‖a‖ b − ‖b‖ a for all vectors a and b.

6. Let a and b be two non-zero vectors. Let

$$\mathbf{u} = \left(\frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}}\right)\mathbf{a}.$$

Show that b− u is orthogonal to a.

7. Simplify (u + v)× (u− v).

8. Let a and b be two unit vectors, the angle between them being π/3. Show that 2b − a and a are orthogonal.

9. Prove that

‖u − v‖² + ‖u + v‖² = 2(‖u‖² + ‖v‖²).

Deduce that the sum of the squares of the diagonals of a parallelogram is equal to the sum of the squares of all four sides.

9.2 Vectors algebraically

The theory I introduced in Section 9.1 is useful for proving general results about geometry, but what if we want to calculate with particular vectors: how do we describe them? To do this we need coordinates.

9.2.1 i, j and k

Set up a cartesian coordinate system consisting of x, y and z axes. We orient the system so that in rotating the x axis clockwise to the y axis, we are looking in the direction of the positive z axis. Let i, j and k be unit vectors parallel to the x, y and z axes respectively (pointing in the positive directions).

[Figure: the unit vectors i, j and k along the coordinate axes.]


Every vector a can be uniquely written in the form

a = a1i + a2j + a3k

for some scalars a1, a2, a3. This is achieved by orthogonal projection of the vector a (moved so that it starts at the origin) onto each of the three coordinate axes. The numbers ai are called the components of a in each of the three directions. The proofs of the following can be deduced from the properties of vector addition and scalar multiplication.

Theorem 9.2.1.

1. If a = a1i + a2j + a3k and b = b1i + b2j + b3k then a = b if, and only if, ai = bi for i = 1, 2, 3. That is, corresponding components are equal.

2. 0 = 0i + 0j + 0k.

3. If a = a1i + a2j + a3k and b = b1i + b2j + b3k then

a + b = (a1 + b1)i + (a2 + b2)j + (a3 + b3)k.

4. If a = a1i + a2j + a3k then λa = λa1i + λa2j + λa3k for any scalar λ.

We may now calculate the form taken by the inner and vector products in co-ordinates.

Theorem 9.2.2 (Inner products). Let a = a1i + a2j + a3k and b = b1i + b2j + b3k. Then

a · b = a1b1 + a2b2 + a3b3.

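Proof. This is proved using part (4) of Theorem 9.1.5 and the following table

$$\begin{array}{c|ccc}
\cdot & \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \hline
\mathbf{i} & 1 & 0 & 0 \\
\mathbf{j} & 0 & 1 & 0 \\
\mathbf{k} & 0 & 0 & 1
\end{array}$$

computed from the definition of the inner product. We have that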

a · b = a · (b1i + b2j + b3k) = b1(a · i) + b2(a · j) + b3(a · k).

We now compute a · i, a · j, and a · k in turn:


• a · i = a1.

• a · j = a2.

• a · k = a3.

Putting everything together we get

a · b = a1b1 + a2b2 + a3b3,

as required.

If a = a1i + a2j + a3k then

$$\|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2 + a_3^2}.$$
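The coordinate formulas for the inner product and the length translate directly into code. The following is a minimal Python sketch, assuming a vector a1i + a2j + a3k is stored as the triple (a1, a2, a3); the function names `dot`, `norm` and `angle` are illustrative choices rather than notation from the text.

```python
import math

def dot(a, b):
    """Inner product a . b = a1*b1 + a2*b2 + a3*b3 (Theorem 9.2.2)."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length of a, using ||a||^2 = a . a."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle in [0, pi] between two non-zero vectors,
    obtained from a . b = ||a|| ||b|| cos(theta)."""
    c = dot(a, b) / (norm(a) * norm(b))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp to guard against rounding

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(dot(i, j))             # 0: i and j are orthogonal
print(norm((2, 3, 6)))       # 7.0, since 4 + 9 + 36 = 49
print(angle(i, (1, 1, 0)))   # close to pi/4
```

The clamp in `angle` is a numerical precaution only: rounding can push the computed cosine slightly outside the interval from −1 to 1.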

Theorem 9.2.3 (Vector products). Let a = a1i + a2j + a3k and b = b1i + b2j + b3k. Then

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$$

It is important to note that this ‘determinant’ can only be expanded along the first row.

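Proof. This follows by part (3) of Theorem 9.1.8 and the following table, in which the entry in a given row and column is the product of the row label with the column label, in that order,

$$\begin{array}{c|ccc}
\times & \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \hline
\mathbf{i} & \mathbf{0} & \mathbf{k} & -\mathbf{j} \\
\mathbf{j} & -\mathbf{k} & \mathbf{0} & \mathbf{i} \\
\mathbf{k} & \mathbf{j} & -\mathbf{i} & \mathbf{0}
\end{array}$$

computed from the definition of the vector product. We have that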

a× b = a× (b1i + b2j + b3k) = b1(a× i) + b2(a× j) + b3(a× k).

We now compute a× i, a× j, and a× k in turn:

• a× i = −a2k + a3j.

• a× j = a1k− a3i.


• a× k = −a1j + a2i.

Putting everything together we get

a× b = (a2b3 − a3b2)i− (a1b3 − a3b1)j + (a1b2 − a2b1)k

which is equal to the given determinant.
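The component formula just obtained is easy to turn into a small function. The sketch below is written in Python and assumes, as an illustrative convention, that a1i + a2j + a3k is stored as the triple (a1, a2, a3); the names `cross` and `dot` and the sample vectors are not from the text.

```python
def cross(a, b):
    """Vector product in components (Theorem 9.2.3),
    i.e. the 'determinant' expanded along its first row."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            -(a1 * b3 - a3 * b1),
            a1 * b2 - a2 * b1)

def dot(a, b):
    """Inner product in components (Theorem 9.2.2)."""
    return sum(x * y for x, y in zip(a, b))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
a, b = (1, 2, 3), (4, 5, 6)
n = cross(a, b)

print(cross(i, j) == k)                       # table entry: i x j = k
print(n)                                      # (-3, 6, -3)
print(dot(n, a), dot(n, b))                   # 0 0: n is orthogonal to a and b
print(cross(b, a) == tuple(-x for x in n))    # a x b = -(b x a)
```

The checks mirror the table of products of i, j, k, the orthogonality of a × b to both factors, and part (1) of Theorem 9.1.8.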

The proof of the following now follows by Theorem 9.2.2 and Theorem 9.2.3.

Theorem 9.2.4 (Scalar triple products and determinants). Let

a = a1i + a2j + a3k, b = b1i + b2j + b3k, c = c1i + c2j + c3k.

Then

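$$[\mathbf{a}, \mathbf{b}, \mathbf{c}] = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$$

Thus the properties of scalar triple products are the same as the properties of 3 × 3 determinants.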

Proof. We calculate a · (b× c). This is equal to

(a1i + a2j + a3k) · [(b2c3 − b3c2)i− (b1c3 − b3c1)j + (b1c2 − b2c1)k].

But this is equal to

a1(b2c3 − b3c2)− a2(b1c3 − b3c1) + a3(b1c2 − b2c1)

which is nothing other than

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$

The second equality follows from the fact that the determinant of the transpose of a matrix and the determinant of the original matrix are the same.
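Theorem 9.2.4 can be illustrated with a few lines of Python. This is a sketch under the assumption that vectors are stored as component triples; the function names `triple` and `det3` and the sample vectors are hypothetical, chosen only so that the arithmetic is easy to follow by hand.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1)

def triple(a, b, c):
    """Scalar triple product [a, b, c] = a . (b x c)."""
    return dot(a, cross(b, c))

def det3(r1, r2, r3):
    """3 x 3 determinant with rows r1, r2, r3, expanded along the first row."""
    (p, q, r), (s, t, u), (v, w, x) = r1, r2, r3
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)

a, b, c = (1, 2, 3), (0, 1, 4), (5, 6, 0)
columns = tuple(zip(a, b, c))   # rows of the transposed matrix

print(triple(a, b, c))          # 1
print(det3(*columns))           # 1: the transpose has the same determinant
print(triple(a, c, b))          # -1: swapping two vectors changes the sign
print(triple(b, c, a))          # 1: a cyclic permutation does not
```

The last two lines show determinant properties carried over to the scalar triple product: interchanging two rows changes the sign, while a cyclic permutation of the rows leaves the value unchanged.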

In Section 9.1, we defined vectors and vector operations geometrically. In Section 9.2, we showed that once we had chosen a co-ordinate system, vectors and vector operations could be described algebraically. The important point to remember in what follows is that the two approaches must give the same answers.

Exercises 9.2

1. Let a = 3i + 4j, b = 2i + 2j− k and c = 3i− 4k.

(a) Find ‖a‖, ‖b‖, and ‖c‖.

(b) Find a + b and a − c.

(c) Determine ‖a− c‖.

2. (a) Let a = 4i + j − 3k and b = i + 2j + 2k. Find a · b. Are a and b orthogonal?

(b) Find the angle between −2(i− j) + k and j− i.

3. The unit cube is determined by the three vectors i, j and k. Find the angle between the long diagonal of the unit cube and one of its edges.

4. Calculate i × (i × k) and (i × i) × k. What do you deduce as a result of this?

5. Calculate u · (v × w) where u = 3i − 2j − 5k, v = i + 4j − 4k, and w = 3j + 2k.

6. If [a,b, c] = 0 what can you deduce?

9.3 Geometry with vectors

There are two kinds of vectors: the free vectors that we have been dealing with up to now and the position vectors we introduce next.

9.3.1 Position vectors

As defined, vectors cannot describe the exact position of a point in space, since they are described by all directed line segments of the same length and the same direction. Our vectors are therefore also called free vectors, since they are free to wander. But we can also use vectors to describe the precise location of points. To do this, we have to choose and fix a point O in space, called an origin. We can then consider all the directed line segments that start at O. Each such segment represents a vector and every vector is thus represented. The tips of the line segments are points in space, and every point thus occurs. It follows that once an origin has been fixed, vectors can be used to describe points. We talk about the position vectors of points. However, we can only talk about position vectors with respect to some fixed point O.

9.3.2 Linear combinations

Let v1, . . . ,vn be n vectors and let λ1, . . . , λn be n scalars. Then the vector

v = λ1v1 + . . .+ λnvn

is called a linear combination of the n vectors. Only two cases of this definition are needed in this chapter. If we are given just one vector v1 then a linear combination is just a scalar multiple of that vector. The other case is where we have two vectors v1 and v2. Linear combinations then look like this

λ1v1 + λ2v2.

Let v be any non-zero vector. Then any vector parallel to this vector has the form λv for some scalar λ. Now let v1 and v2 be two non-zero vectors where neither is a multiple of the other. Then these two vectors determine a plane in space. This plane is not rooted to any point and so, for convenience, we may move it parallel to itself so that it passes through some fixed point that we may treat as an origin. Now let v be any vector which is parallel to this plane. We may move it parallel to itself so that its tail is at the origin. By plane geometry, we may find real numbers λ1 and λ2 such that

v = λ1v1 + λ2v2.

We shall use these ideas in deriving formulae for lines and planes in space in the sections below.

9.3.3 Lines

Intuitively, a line in space is determined by one of the following two pieces of information:

1. Two distinct points each described by a position vector.

2. One point and a direction, where the point is given by a position vector and the direction by a (free) vector.


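Let’s see how we can use vectors to obtain the equation of that line. Let a and b be the position vectors of two distinct points. Then they determine the straight line indicated by the dashed line.

[Figure: the position vectors a and b drawn from the origin O, with the line through their tips dashed.]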

Let r = xi + yj + zk be the position vector of an arbitrary point on this line.

[Figure: the same line, with the position vector r of an arbitrary point on it.]

Observe that the line determined by the two points is parallel to the vector b − a, which therefore gives the direction of the line. The vectors r − a and b − a will be parallel. Thus there is a scalar λ such that r − a = λ(b − a). It follows that

r = a + λ(b− a).


This is called the (vector form of) the parametric equation of the line. The parameter in question is λ.

We now derive the co-ordinate form of the parametric equation. Let

a = a1i + a2j + a3k and b = b1i + b2j + b3k.

Substituting in our vector equation above and equating components we obtain

x = a1 + λ(b1 − a1), y = a2 + λ(b2 − a2), z = a3 + λ(b3 − a3).

For convenience, put ci = bi − ai. Thus the co-ordinate form of the parametric equation for the line is

x = a1 + λc1, y = a2 + λc2, z = a3 + λc3.

If c1, c2, c3 ≠ 0 then we can eliminate the parameter λ from the above equations to get the non-parametric equations of the line:

$$\frac{x - a_1}{c_1} = \frac{y - a_2}{c_2}, \qquad \frac{y - a_2}{c_2} = \frac{z - a_3}{c_3}.$$

It’s worth noting that

• The parametric equation is useful for generating points on the line (by choosing values of the parameter λ).

• The non-parametric equation is useful for checking that given points lie on a given line.

Example 9.3.1. Find the parametric and the non-parametric equations of the line through the point with position vector i + 2j + 3k and parallel to the vector 4i + 5j + 6k. In this question, we are given one point and the direction that the line is parallel to. Thus

r − (i + 2j + 3k)

is parallel to

4i + 5j + 6k.

It follows that

r = i + 2j + 3k + λ(4i + 5j + 6k)

is the vector form of the parametric equation of the line. We now find the cartesian form of the parametric equation. Put

r = xi + yj + zk.

Then

xi + yj + zk = i + 2j + 3k + λ(4i + 5j + 6k).

These two vectors are equal if, and only if, their co-ordinates are equal. Thus we have that

x = 1 + 4λ

y = 2 + 5λ

z = 3 + 6λ

This is the cartesian form of the parametric equation of the line. Finally, we eliminate λ to get the non-parametric equation of the line

$$\frac{x - 1}{4} = \frac{y - 2}{5} \quad \text{and} \quad \frac{y - 2}{5} = \frac{z - 3}{6}.$$

These two equations can be rewritten in the form

5x− 4y = −3 and 6y − 5z = −3.
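The two uses of the equations of a line, generating points and checking points, can be seen side by side in a short Python sketch. It assumes position vectors are stored as coordinate triples; the function name `point_on_line` and the chosen parameter values are illustrative only.

```python
def point_on_line(a, d, lam):
    """The point r = a + lam * d on the line through a with direction d."""
    return tuple(ai + lam * di for ai, di in zip(a, d))

a = (1, 2, 3)   # the point with position vector i + 2j + 3k
d = (4, 5, 6)   # the direction 4i + 5j + 6k

for lam in (-1, 0, 1, 2):
    x, y, z = point_on_line(a, d, lam)
    # Every generated point should satisfy both non-parametric equations.
    print(lam, (x, y, z), 5 * x - 4 * y == -3, 6 * y - 5 * z == -3)
```

For example, λ = 1 gives the point (5, 7, 9), and indeed 25 − 28 = −3 and 42 − 45 = −3.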

9.3.4 Planes

Intuitively, a plane in space is determined by one of the following three pieces of information:

1. Any three points that do not all lie in a straight line. That is, the points form the vertices of a triangle.

2. One point and two non-parallel directions.

3. One point and a direction which is perpendicular or normal to the plane.

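We shall begin by finding the parametric equation of the plane determined by the three points with position vectors a, b and c.

[Figure: the position vectors a, b, c drawn from the origin O, together with the vectors b − a and c − a lying in the plane.]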

The vectors b − a and c − a are both parallel to the plane, but are not parallel to each other. Thus every vector parallel to the plane they determine has the form

λ(b− a) + µ(c− a)

for some scalars λ and µ. Thus if the position vector of an arbitrary point on the plane is r, then r − a = λ(b − a) + µ(c − a). Thus the (vector form of) the parametric equation of the plane is

r = a + λ(b− a) + µ(c− a).

This can easily be written in coordinate form by equating components.

To find the non-parametric equation of a plane, we use the fact that a plane is determined once we know a point on the plane and a vector orthogonal to every vector in the plane. Such a vector is said to be normal to the plane. Let n be a vector normal to our plane, and let a be the position vector of a point in the plane.

[Figure: the plane with normal vector n, the position vectors a and r drawn from the origin O, and the vector r − a lying in the plane.]

Then r− a is orthogonal to n. Thus

(r− a) · n = 0.

This is the (vector form of) the non-parametric equation of the plane. To find the co-ordinate form of the non-parametric equation, let

r = xi + yj + zk, a = a1i + a2j + a3k, n = n1i + n2j + n3k.

From (r − a) · n = 0 we get (x − a1)n1 + (y − a2)n2 + (z − a3)n3 = 0. Thus the non-parametric equation of the plane is

n1x+ n2y + n3z = a1n1 + a2n2 + a3n3.

We have one final question to answer: given the parametric equation of the plane, how do we find the non-parametric equation? The vectors b − a and c − a are parallel to the plane but not parallel to each other. The vector

n = (b− a)× (c− a)

is normal to our plane.


Example 9.3.2. Find the parametric and non-parametric equations of the plane containing the three points with position vectors

a = j− k, b = i + j, c = i + 2j.

We have that

b − a = i + k

and

c − a = i + j + k.

Thus the parametric equation of the plane is

r = j− k + λ(i + k) + µ(i + j + k).

To find the non-parametric equation, we need to find a vector normal to the plane. We calculate

(b− a)× (c− a) = k− i.

Thus

(r − a) · (k − i) = 0.

That is

(xi + (y − 1)j + (z + 1)k) · (k − i) = 0.

This simplifies to

z − x = −1,

the non-parametric equation of the plane. We now check that our three original points satisfy this equation. The point a has co-ordinates (0, 1, −1); the point b has co-ordinates (1, 1, 0); the point c has co-ordinates (1, 2, 0). It is easy to check that each set of co-ordinates satisfies the equation.
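The recipe used in Example 9.3.2 (form b − a and c − a, take their vector product to get a normal n, then write n · r = n · a) can be sketched in Python as follows. The representation of vectors as triples and the function name `plane_through` are assumptions made for this illustration.

```python
def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def plane_through(a, b, c):
    """Return (n, d) such that the plane through the points a, b, c
    is n . r = d, where n = (b - a) x (c - a) and d = n . a."""
    n = cross(sub(b, a), sub(c, a))
    return n, dot(n, a)

a, b, c = (0, 1, -1), (1, 1, 0), (1, 2, 0)
n, d = plane_through(a, b, c)
print(n, d)                              # (-1, 0, 1) -1, that is, z - x = -1
print([dot(n, p) == d for p in (a, b, c)])   # all three points satisfy it
```

If the three points happened to lie on a line, the computed n would be the zero vector, so a more careful version would check for that case before using the result.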

9.3.5 The geometric meaning of linear equations

From the non-parametric equation of the plane derived above, we deduce that the solutions of a linear equation in three unknowns

ax+ by + cz = d

all lie on a plane in general (although there are some degenerate cases where something different from a plane will be obtained). We may now observe that the non-parametric equation of the line in fact describes the line as the intersection of two planes. If we have three equations in three unknowns then, as long as the planes are angled correctly, they will intersect in a point; that is, the equations will have a unique solution. However, there are many cases where either the planes have no points in common (no solution) or have lines or indeed planes in common (infinitely many solutions). Thus the nature of the solutions of a system of linear equations in three unknowns is intimately bound up with the geometry of the planes they determine.
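To make the phrase ‘angled correctly’ concrete, here is a hedged Python sketch. It assumes each plane is given as a pair (n, d) standing for n · r = d, and it uses a standard closed formula that is not derived in this chapter: when [n1, n2, n3] ≠ 0 the common point is (d1 n2 × n3 + d2 n3 × n1 + d3 n1 × n2)/[n1, n2, n3]. Taking the inner product of this expression with n1, n2 and n3 in turn confirms that it satisfies all three equations. The function name `intersect_three_planes` is hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)

def intersect_three_planes(planes):
    """planes: three pairs (n, d), each describing the plane n . r = d.
    Returns the unique common point, or None when [n1, n2, n3] = 0
    (no common point, or infinitely many)."""
    (n1, d1), (n2, d2), (n3, d3) = planes
    D = dot(n1, cross(n2, n3))
    if D == 0:
        return None
    terms = [(d1, cross(n2, n3)), (d2, cross(n3, n1)), (d3, cross(n1, n2))]
    return tuple(sum(Fraction(d) * v[i] for d, v in terms) / D
                 for i in range(3))

# x + y + z = 6, x - y + z = 2, 2x + y - z = 1 meet in the point (1, 2, 3).
p = intersect_three_planes([((1, 1, 1), 6), ((1, -1, 1), 2), ((2, 1, -1), 1)])
print(p == (1, 2, 3))

# Two parallel planes: the triple product of the normals vanishes.
print(intersect_three_planes([((1, 1, 1), 1), ((1, 1, 1), 2), ((1, 0, 0), 0)]))
```

Exact fractions are used so that the test for a vanishing triple product is not affected by rounding.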

9.3.6 The geometric meaning of determinants

In this section, I shall describe the geometric meaning of determinants in one, two and three dimensions. Let’s start with 1 × 1 matrices. The determinant of (a) is just a. The length of a is |a|, the absolute value of the determinant of (a).

Theorem 9.3.3. Let a = ai + cj and b = bi + dj be a pair of plane vectors. Then the area of the parallelogram determined by these vectors is the absolute value of the determinant

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

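Proof. The proof I give will be for the case where both vectors are in the first quadrant. I shall consider two cases.

[Figure: the parallelogram spanned by a and b, drawn inside the rectangle with corners 0, (a + b)i, a + b and (c + d)j; the pieces of the rectangle outside the parallelogram are labelled (1), (2) and (3).]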

(Case 1): b is to the left of a when standing at the origin and looking along a. Let

a = ai + cj and b = bi + dj.

The area of the parallelogram is equal to the area of the rectangle defined by the points

0, (a+ b)i, a + b, (c+ d)j

minus the area of two rectangles of the same size, labelled (1), two triangles of the same size, labelled (2), and another two triangles of the same size, labelled (3). That is

$$(a + b)(c + d) - 2bc - 2\left(\frac{ac}{2}\right) - 2\left(\frac{bd}{2}\right)$$

which is equal to

ac + ad + bc + bd − 2bc − ac − bd = ad − bc.

(Case 2): b is to the right of a when standing at the origin and looking along a. A similar argument shows that the area is bc − ad which is the negative of the determinant.


Putting these two cases together, we see that the area is the absolute value of the determinant.
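The bookkeeping in the proof can be replayed numerically. The following Python sketch assumes plane vectors are stored as pairs, so that a = ai + cj is (a, c) and b = bi + dj is (b, d); the function names and the sample vectors are illustrative, and the second function applies only to the Case 1 configuration described above.

```python
def parallelogram_area(u, v):
    """Area of the parallelogram spanned by u = (a, c) and v = (b, d):
    the absolute value of the determinant ad - bc."""
    (a, c), (b, d) = u, v
    return abs(a * d - b * c)

def area_by_dissection(u, v):
    """Case 1 of the proof: the big rectangle minus two rectangles (1),
    two triangles (2) and two triangles (3)."""
    (a, c), (b, d) = u, v
    return (a + b) * (c + d) - 2 * b * c - 2 * (a * c / 2) - 2 * (b * d / 2)

u, v = (3, 1), (1, 2)   # both in the first quadrant, v to the left of u
print(parallelogram_area(u, v))   # 5
print(area_by_dissection(u, v))   # 5.0
print(parallelogram_area(v, u))   # 5: swapping the vectors (Case 2) only changes the sign of ad - bc
```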

Theorem 9.3.4. Let

a = a1i + a2j + a3k, b = b1i + b2j + b3k, c = c1i + c2j + c3k

be three vectors. Then the volume of the parallelepiped (‘squashed box’) determined by these three vectors is the absolute value of the determinant

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$$

or its transpose.

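Proof. We refer to the diagram below.

[Figure: the parallelepiped with edges a, b and c; θ is the angle between a and b, n is the unit vector normal to the base, and φ is the angle between n and c.]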

Denote by n the unit vector orthogonal to a and b and pointing in the direction of a × b. The volume of the box determined by the vectors a, b, c is equal to the base area times the vertical height. The base area is given by

‖a‖ ‖b‖ sin θ.

The height is equal to the absolute value of

‖c‖ cosφ.

We have to use the absolute value of this expression because cos φ can take negative values if c is below rather than above the plane of a and b as I have drawn it. Thus the volume is equal to the absolute value of

‖a‖ ‖b‖ sin θ ‖c‖ cosφ.


Now

• a × b = ‖a‖ ‖b‖ sin θ n. As expected, this has length equal to the area of the base parallelogram.

• n · c = ‖c‖ cos φ.

Thus

‖a‖ ‖b‖ sin θ ‖c‖ cos φ = (a × b) · c.

By the properties of the inner product

(a× b) · c = c · (a× b) = [c, a,b].

We now use properties of the determinant

[c, a,b] = −[a, c,b] = [a,b, c].

It follows that the volume of the box is the absolute value of

[a,b, c].

It follows from the above theorem and our theorem on scalar triple products that the volume of the parallelepiped determined by the three vectors a, b, and c is the absolute value of the scalar triple product [a, b, c].
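As an illustration, here is a minimal Python sketch that computes the volume in both ways: from the determinant whose columns are a, b, c, and from the scalar triple product (a × b) · c. The helper names and the sample vectors are assumptions of mine, chosen only for illustration.

```python
def cross(u, v):
    """Vector product u x v of two 3-dimensional vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Inner product u . v."""
    return sum(x * y for x, y in zip(u, v))


def det3(a, b, c):
    """Determinant with columns a, b, c, expanded along the first row."""
    return (a[0] * (b[1] * c[2] - c[1] * b[2])
            - b[0] * (a[1] * c[2] - c[1] * a[2])
            + c[0] * (a[1] * b[2] - b[1] * a[2]))


def volume(a, b, c):
    """Volume of the parallelepiped determined by a, b, c."""
    return abs(det3(a, b, c))


a_vec, b_vec = (1, 0, 0), (1, 2, 0)
c_above = (1, 1, 3)     # c above the plane of a and b
c_below = (1, 1, -3)    # c below that plane, so the determinant is negative

print(det3(a_vec, b_vec, c_above), dot(cross(a_vec, b_vec), c_above))
print(det3(a_vec, b_vec, c_below), volume(a_vec, b_vec, c_below))
```

The second sample shows why the absolute value is needed: moving c below the plane of a and b changes the sign of the determinant but not the volume.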

The geometric significance of determinants is that they enable us to measure lengths, areas and volumes. Let’s focus on the case of 3 × 3 determinants for concreteness. We can now see geometrically why such a determinant vanishes when two columns are equal: the box is then squashed flat, so the volume enclosed is clearly zero. If we multiply one of the three vectors (one side of the box) by a factor of λ > 0 then this will change the volume by a factor of λ. The sign of the determinant, which seems an annoyance, is actually an extra feature which plays a role in the further theory of matrices.
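These geometric remarks can be checked on a small example. The sketch below uses a hand-written 3 × 3 determinant; the function name and the sample vectors are illustrative assumptions only.

```python
def det3(a, b, c):
    """Determinant with columns a, b, c, expanded along the first row."""
    return (a[0] * (b[1] * c[2] - c[1] * b[2])
            - b[0] * (a[1] * c[2] - c[1] * a[2])
            + c[0] * (a[1] * b[2] - b[1] * a[2]))


a_vec, b_vec, c_vec = (1, 0, 2), (3, 1, 0), (0, 2, 1)

original = det3(a_vec, b_vec, c_vec)

# Two equal columns: the box is flat, so the volume is zero.
flat = det3(a_vec, a_vec, c_vec)

# Stretching one side by a factor of 3 multiplies the volume by 3.
stretched = det3(tuple(3 * x for x in a_vec), b_vec, c_vec)

# Swapping two columns changes the sign but not the volume.
swapped = det3(b_vec, a_vec, c_vec)

print(original, flat, stretched, swapped)
```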

Exercises 9.3

1. (a) Find the parametric and the non-parametric equations of the line through the two points with position vectors i − j + 2k and 2i + 3j + 4k.


(b) Find the parametric and the non-parametric equations of the plane containing the three points with position vectors i + 3k, i + 2j − k, and 3i − j − 2k.

2. Let c be the position vector of the centre of a sphere with radius R. Let an arbitrary point on the sphere have position vector r. Why is ‖r − c‖ = R? Squaring both sides we get

(r − c) · (r − c) = R².

If r = xi + yj + zk and c = c1i + c2j + c3k, deduce that the equation of the sphere with centre c1i + c2j + c3k and radius R is

(x − c1)² + (y − c2)² + (z − c3)² = R².

(a) Find the equation of the sphere with centre i + j + k and radius 2.

(b) Find the centre and radius of the sphere with equation

x² + y² + z² − 2x − 4y − 6z − 2 = 0.

3. The distance of a point from a line is defined to be the length of the perpendicular from the point to the line. Let the line in question have parametric equation

r = p + λd

and let the position vector of the point be q. Show that the distance of the point from the line is

‖d × (q − p)‖ / ‖d‖.

4. The distance of a point from a plane is defined to be the length of the perpendicular from the point to the plane. Let the position vector of the point be q and the equation of the plane be (r − p) · n = 0. Show that the distance of the point from the plane is

|(q − p) · n| / ‖n‖.
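The two distance formulas in Exercises 3 and 4 can be sanity-checked numerically before you try to prove them. The following sketch compares each formula with the distance to the foot of the perpendicular, found by projection. It is a check on sample data, not a proof; all function names and sample points are my own assumptions.

```python
import math


def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(dot(u, u))


def distance_point_line(q, p, d):
    """Formula of Exercise 3 for the line r = p + lambda d."""
    return norm(cross(d, sub(q, p))) / norm(d)


def distance_point_plane(q, p, n):
    """Formula of Exercise 4 for the plane (r - p) . n = 0."""
    return abs(dot(sub(q, p), n)) / norm(n)


def foot_on_line(q, p, d):
    """Foot of the perpendicular from q to the line, by projection onto d."""
    t = dot(sub(q, p), d) / dot(d, d)
    return tuple(pi + t * di for pi, di in zip(p, d))


def foot_on_plane(q, p, n):
    """Foot of the perpendicular from q to the plane, by projection onto n."""
    t = dot(sub(q, p), n) / dot(n, n)
    return tuple(qi - t * ni for qi, ni in zip(q, n))


q_line, p_line, d_line = (4, 5, 4), (1, 0, 0), (0, 2, 0)
line_formula = distance_point_line(q_line, p_line, d_line)
line_direct = norm(sub(q_line, foot_on_line(q_line, p_line, d_line)))

q_plane, p_plane, n_plane = (3, 4, 6), (0, 0, 1), (0, 0, 2)
plane_formula = distance_point_plane(q_plane, p_plane, n_plane)
plane_direct = norm(sub(q_plane, foot_on_plane(q_plane, p_plane, n_plane)))

print(line_formula, line_direct, plane_formula, plane_direct)
```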


9.4 Quaternions

The set of quaternions, denoted by H, was invented by the Irish mathematician Sir William Rowan Hamilton in 1843. They are 4-dimensional generalisations of the complex numbers. It was from the theory of quaternions that the modern theory of vectors with inner and vector products developed. To describe what they are, I shall reverse history and derive them from vectors. Define the matrices I, X, Y, Z, −I, −X, −Y, −Z as follows

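\[
X = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad
Y = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \quad \text{and} \quad
Z = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}
\]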

where i is the complex number i and I is the 2 × 2 identity matrix. We shall be interested in how any two of the above matrices multiply. The following table contains everything we need: the entry in row A and column B is the product AB.

        X     Y     Z

X      −I     Z    −Y
Y      −Z    −I     X
Z       Y    −X    −I
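The table can be verified by direct computation. The sketch below multiplies the 2 × 2 complex matrices X, Y, Z defined above and compares each product with the corresponding table entry, read as (row) times (column). The nested-tuple representation and the helper names are assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def neg(A):
    """The matrix -A."""
    return tuple(tuple(-entry for entry in row) for row in A)


I = ((1, 0), (0, 1))
X = ((0, 1), (-1, 0))
Y = ((1j, 0), (0, -1j))       # 1j is Python's notation for the complex number i
Z = ((0, -1j), (-1j, 0))

named = {"I": I, "X": X, "Y": Y, "Z": Z,
         "-I": neg(I), "-X": neg(X), "-Y": neg(Y), "-Z": neg(Z)}

# (row, column) -> name of the product row * column, as in the table.
table = {
    ("X", "X"): "-I", ("X", "Y"): "Z",  ("X", "Z"): "-Y",
    ("Y", "X"): "-Z", ("Y", "Y"): "-I", ("Y", "Z"): "X",
    ("Z", "X"): "Y",  ("Z", "Y"): "-X", ("Z", "Z"): "-I",
}

table_holds = all(
    matmul(named[row], named[col]) == named[entry]
    for (row, col), entry in table.items()
)
print(table_holds)
```

Note that the table is not symmetric: for example XY = Z but YX = −Z, so these matrices do not commute.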

We shall now consider matrices of the form

λI + αX + βY + γZ

where λ, α, β, γ ∈ R. We calculate the product of two such matrices using the distributivity and scalar multiplication properties of matrix multiplication and the above multiplication table. The product

(λI + αX + βY + γZ)(µI + α′X + β′Y + γ′Z)

can be written in the form aI + bX + cY + dZ where a, b, c, d ∈ R, although I shall write it in a slightly different form

(λµ− αα′ − ββ′ − γγ′)I +

λ(α′X + β′Y + γ′Z) + µ(αX + βY + γZ) +

(βγ′ − γβ′)X + (γα′ − αγ′)Y + (αβ′ − βα′)Z.

Although this looks complicated there are some familiar things within it: the first term contains what looks like an inner product and the last term contains what looks like a vector product. Note that because this is matrix multiplication this operation is associative.
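The expanded product can be checked against direct matrix multiplication. In the sketch below, a matrix λI + αX + βY + γZ is represented by its four real coefficients; `as_matrix` builds the corresponding 2 × 2 complex matrix from the definitions of X, Y, Z, and `product_coeffs` applies the formula above. Both names, and the sample coefficients, are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def as_matrix(lam, al, be, ga):
    """The matrix lam*I + al*X + be*Y + ga*Z, written out entry by entry."""
    return ((lam + be * 1j, al - ga * 1j),
            (-al - ga * 1j, lam - be * 1j))


def product_coeffs(p, q):
    """Coefficients of I, X, Y, Z in the product, using the expanded formula."""
    lam, al, be, ga = p
    mu, al2, be2, ga2 = q
    return (lam * mu - al * al2 - be * be2 - ga * ga2,
            lam * al2 + mu * al + (be * ga2 - ga * be2),
            lam * be2 + mu * be + (ga * al2 - al * ga2),
            lam * ga2 + mu * ga + (al * be2 - be * al2))


p = (1, 2, 3, 4)    # I + 2X + 3Y + 4Z
q = (5, 6, 7, 8)    # 5I + 6X + 7Y + 8Z

coeffs = product_coeffs(p, q)
direct = matmul(as_matrix(*p), as_matrix(*q))
formula_agrees = direct == as_matrix(*coeffs)
print(coeffs, formula_agrees)
```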

The above calculation motivates the following construction. Let E3 denote the set of all 3-dimensional vectors. Thus a typical element of E3 is αi + βj + γk. Put

H = R× E3.

The elements of H are therefore ordered pairs (λ, a) consisting of a real number λ and a vector a. We define the sum of two elements of H in a very simple way

(λ, a) + (µ, a′) = (λ+ µ, a + a′).

The product is defined in a way that mimics what I did above (you should check this)

(λ, a)(µ, a′) = (λµ− a · a′, λa′ + µa + (a× a′)) .

It follows that this product is associative.

We shall now investigate what we can do with H. I shall only deal with multiplication because addition poses no problems.

• Consider the subset R of H which consists of elements of the form (λ, 0). You can check that (λ, 0)(µ, 0) = (λµ, 0). Thus R mimics the real numbers.

• Consider the subset C of H which consists of the elements of the form (λ, ai). You can check that

(λ, ai)(µ, a′i) = (λµ− aa′, (λa′ + µa)i).

In particular, (0, i)(0, i) = (−1, 0). Thus C mimics the set of complex numbers.

• Consider the subset E of H which consists of elements of the form (0, a).You can check that

(0, a)(0, a′) = (−a · a′, a× a′).

Thus E mimics vectors, the inner product and the vector product.
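The product on H and the three special cases above can be tried out directly. Here is a minimal sketch in which an element (λ, a) of H is a Python pair of a number and a 3-tuple; the function name `qmul` and the sample elements are assumptions for illustration.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def qmul(p, q):
    """(lam, a)(mu, b) = (lam*mu - a.b, lam*b + mu*a + a x b)."""
    lam, a = p
    mu, b = q
    c = cross(a, b)
    vector = tuple(lam * bi + mu * ai + ci for ai, bi, ci in zip(a, b, c))
    return (lam * mu - dot(a, b), vector)


zero = (0, 0, 0)
i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

real_case = qmul((2, zero), (3, zero))              # behaves like 2 * 3
complex_case = qmul((1, (2, 0, 0)), (3, (4, 0, 0))) # behaves like (1 + 2i)(3 + 4i)
i_squared = qmul((0, i), (0, i))                    # (0, i)(0, i)
i_times_j = qmul((0, i), (0, j))                    # inner product 0, vector product k

# Associativity checked on one sample triple (a check, not a proof).
p, q, r = (1, (2, 3, 4)), (5, (6, 7, 8)), (-1, (0, 2, -3))
associative_here = qmul(qmul(p, q), r) == qmul(p, qmul(q, r))

print(real_case, complex_case, i_squared, i_times_j, associative_here)
```

The product of the two sample elements p and q has the same four coefficients as the product of the matrices I + 2X + 3Y + 4Z and 5I + 6X + 7Y + 8Z, which is the sense in which this definition mimics the matrix calculation.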

The set H with the above operations of addition and multiplication is the set of quaternions. This structure pulls together most of the important elements of this course: complex numbers, vectors and matrices.
