
The Arithmetic Mean

An arithmetic mean is a fancy term for what most people call an "average." When someone says the average of 10 and 20 is 15, they are referring to the arithmetic mean. The simplest definition of a mean is the following: Add up all the numbers you want to average, and then divide by the number of items you just added.

For example, if you want to average 10, 20, and 27, first add them together to get 10 + 20 + 27 = 57. Then divide by 3 because we have three values, and we get an arithmetic mean (average) of 19.

Want a formal, mathematical expression of the arithmetic mean? Here it is:

mean = (x1 + x2 + ... + xk) / k

That's just a fancy way to say "the sum of k different numbers divided by k."

Check out a few examples of the arithmetic mean to make sure you understand:

Example:

Find the arithmetic mean (average) of the following numbers: 9, 3, 7, 3, 8, 10, and 2.

Solution:

Add up all the numbers: 9 + 3 + 7 + 3 + 8 + 10 + 2 = 42. Then divide by 7 because there are 7 numbers: 42 / 7 = 6.

Example:

Find the arithmetic mean of -4, 3, 18, 0, 0, and -10.

Solution:

Sum the numbers: -4 + 3 + 18 + 0 + 0 + (-10) = 7. Then divide by 6 because there are 6 numbers.

The answer is 7/6, or about 1.167.
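To make the recipe concrete, here is a minimal C sketch (my own illustration; the function name arithmetic_mean is not from the original) run on the two examples above:

    #include <stdio.h>

    // Arithmetic mean: sum the k values, then divide by k.
    double arithmetic_mean(const double *x, int k) {
        double sum = 0.0;
        for (int i = 0; i < k; i++)
            sum += x[i];
        return sum / k;
    }

    int main(void) {
        double a[] = {9, 3, 7, 3, 8, 10, 2};
        double b[] = {-4, 3, 18, 0, 0, -10};
        printf("%f\n", arithmetic_mean(a, 7));  // prints 6.000000
        printf("%f\n", arithmetic_mean(b, 6));  // prints 1.166667
        return 0;
    }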

Geometric Mean

The geometric mean is NOT the arithmetic mean and it is NOT a simple average. It is the nth root of the product of n numbers. That means you multiply a bunch of numbers together, and then take the nth root, where n is the number of values you just multiplied.

Did that make sense? Here's a quick example:

Example:

What is the geometric mean of 2, 8 and 4?

Solution:

Multiply those numbers together: 2 × 8 × 4 = 64. Then take the third root (cube root) because there are 3 numbers: the cube root of 64 is 4.

Naturally, the geometric mean can get very complicated. Here's a mathematical definition of the geometric mean:

geometric mean = ( ∏ i=1..k xi )^(1/k) = (x1 × x2 × ... × xk)^(1/k)

Remember that the capital pi symbol (∏) means to multiply a series of numbers. That definition says to multiply k numbers and then take the kth root. One thing you should know is that the geometric mean only works with positive numbers. Negative numbers could result in imaginary results depending on how many negative numbers are in a set. Typically this isn't a problem, because most uses of the geometric mean involve real data, such as the length of physical objects or the number of people responding to a survey.

Try a few more examples until you understand the geometric mean.

Example:

What is the geometric mean of 4, 9, 9, and 2?

Solution:

Just multiply the four numbers and take the 4th root: 4 × 9 × 9 × 2 = 648, and the 4th root of 648 is about 5.05.

The geometric mean between two numbers a and b is: √(a × b)

The arithmetic mean between two numbers a and b is: (a + b) / 2

Example: The cut-off frequencies of a phone line are f1 = 300 Hz and f2 = 3300 Hz. What is the center frequency?

The center frequency is f0 = √(300 × 3300) ≈ 995 Hz as geometric mean, and not f0 = (300 + 3300) / 2 = 1800 Hz (arithmetic mean). What a difference!
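Here is a short C sketch of the same idea (the function name geometric_mean is my own; it assumes all inputs are positive, per the note above):

    #include <stdio.h>
    #include <math.h>

    // Geometric mean: multiply the k values, then take the kth root.
    double geometric_mean(const double *x, int k) {
        double product = 1.0;
        for (int i = 0; i < k; i++)
            product *= x[i];
        return pow(product, 1.0 / k);
    }

    int main(void) {
        double freqs[] = {300.0, 3300.0};
        printf("center frequency: %.0f Hz\n", geometric_mean(freqs, 2));  // ~995 Hz
        return 0;
    }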

 

The geometric mean of two numbers is the square root of their product. The geometric mean of three numbers is the cubic root of their product.

The arithmetic mean is the sum of the numbers, divided by the quantity of the numbers. Other names for arithmetic mean: average, mean, arithmetic average.

In general, you can only take the geometric mean of positive numbers.

The geometric mean, by definition, is the nth root of the product of the n units in a data set. For example, the geometric mean of 5, 7, 2, 1 is (5 × 7 × 2 × 1)^(1/4) = 2.893. Alternatively, if you log-transform each of the individual units, the geometric mean will be the exponential of the arithmetic mean of these log-transformed values. So, reusing the example above, exp[(ln(5) + ln(7) + ln(2) + ln(1)) / 4] = 2.893.

Arithmetic Mean vs. Geometric Mean

An arithmetic average is the sum of a series of numbers divided by the count of that series of numbers. 

If you were asked to find the class (arithmetic) average of test scores, you would simply add up all the test scores of the students, and then divide that sum by the number of students. For example, if five students took an exam and their scores were 60%, 70%, 80%, 90% and 100%, the arithmetic class average would be 80%. 

This would be calculated as: (0.6 + 0.7 + 0.8 + 0.9 + 1.0) / 5 = 0.8.

The reason you use an arithmetic average for test scores is that each test score is an independent event. If one student happens to perform poorly on the exam, the next student's chances of doing poorly (or well) on the exam aren't affected. In other words, each student's score is independent of all the other students' scores. However, there are some instances, particularly in the world of finance, where an arithmetic mean is not an appropriate method for calculating an average.

Consider your investment returns, for example. Suppose you have invested your savings in the stock market for five years. If your returns each year were 90%, 10%, 20%, 30% and -90%, what would your average return be during this period? Well, taking the simple arithmetic average, you would get an answer of 12%. Not too shabby, you might think.


However, when it comes to annual investment returns, the numbers are not independent of each other. If you lose a ton of money one year, you have that much less capital to generate returns during the following years, and vice versa. Because of this reality, we need to calculate the geometric average of your investment returns in order to get an accurate measurement of what your actual average annual return over the five-year period is.

To do this, we simply add one to each number (to avoid any problems with negative percentages). Then, multiply all the numbers together, and raise their product to the power of one divided by the count of the numbers in the series. And you're finished - just don't forget to subtract one from the result! 

That's quite a mouthful, but on paper it's actually not that complex. Returning to our example, let's calculate the geometric average: our returns were 90%, 10%, 20%, 30% and -90%, so we plug them into the formula as [(1.9 x 1.1 x 1.2 x 1.3 x 0.1) ^ (1/5)] - 1. This equals a geometric average annual return of -20.08%. That's a heck of a lot worse than the 12% arithmetic average we calculated earlier, and unfortunately it's also the number that represents reality in this case.
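A minimal C sketch of that recipe (my own construction; geometric_avg_return is not a standard name):

    #include <stdio.h>
    #include <math.h>

    // Geometric average return: add 1 to each return, multiply them all,
    // take the nth root, then subtract 1.
    double geometric_avg_return(const double *r, int n) {
        double product = 1.0;
        for (int i = 0; i < n; i++)
            product *= 1.0 + r[i];
        return pow(product, 1.0 / n) - 1.0;
    }

    int main(void) {
        double returns[] = {0.90, 0.10, 0.20, 0.30, -0.90};
        printf("%.2f%%\n", 100.0 * geometric_avg_return(returns, 5));  // about -20.08%
        return 0;
    }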

It may seem confusing as to why geometric average returns are more accurate than arithmetic average returns, but look at it this way: if you lose 100% of your capital in one year, you don't have any hope of making a return on it during the next year. In other words, investment returns are not independent of each other, so they require a geometric average to represent their mean.


A matrix consists of a set of numbers arranged in rows and columns enclosed in brackets.

 

 

The order of a matrix gives the number of rows followed by the number of columns in a matrix. The order of a matrix with 3 rows and 2 columns is 3 × 2 or 3 by 2.

We usually denote a matrix by a capital letter.

C is a matrix of order 2 × 4 (read as ‘2 by 4’)

 

 

Elements In An Array


Each number in the array is called an entry or an element of the matrix. When we need to read out the elements of an array, we read it out row by row.

Each element is defined by its position in the matrix.

In a matrix A, an element in row i and column j is represented by aij

Example: Consider the matrix

A = | 2  4  5 |
    | 7  8  9 |

a11 (read as 'a one one') = 2 (first row, first column)

a12 (read as 'a one two') = 4 (first row, second column)

a13 = 5, a21 = 7, a22 = 8, a23 = 9

 

Matrix Multiplication

There are two matrix operations which we will use in our matrix transformations: multiplying (concatenating) two matrices, and transforming a vector by a matrix. We will now examine the first of these two operations, matrix multiplication.

Matrix multiplication is the operation by which one matrix is transformed by another. A very important thing to remember is that matrix multiplication is not commutative. That is, [a] * [b] != [b] * [a]. For now, it will suffice to say that a matrix multiplication stores the results of the sums of the products of matrix rows and columns. Here is some example code of a matrix multiplication routine which multiplies matrix [a] * matrix [b], then copies the result to matrix a.

void matmult(float a[4][4], float b[4][4])
{
    float temp[4][4];              // temporary matrix for storing result
    int i, j;                      // row and column counters

    for (j = 0; j < 4; j++)        // transform by columns first
        for (i = 0; i < 4; i++)    // then by rows
            temp[i][j] = a[i][0] * b[0][j] +
                         a[i][1] * b[1][j] +
                         a[i][2] * b[2][j] +
                         a[i][3] * b[3][j];

    for (i = 0; i < 4; i++)        // copy result matrix into matrix a
        for (j = 0; j < 4; j++)
            a[i][j] = temp[i][j];
}

I have been informed that there is a faster way of multiplying matrices, which involves taking the dot product of rows and columns. However, I have yet to implement such a method, so I will not discuss it here at this time.

Transforming a Vector by a Matrix

This is the second operation which is required for our matrix transformations. It involves projecting a stationary vector onto transformed axis vectors using the dot product. One dot product is performed for each coordinate axis.

x = x0 * matrix[0][0] + y0 * matrix[1][0] + z0 * matrix[2][0] + w0 * matrix[3][0];

y = x0 * matrix[0][1] + y0 * matrix[1][1] + z0 * matrix[2][1] + w0 * matrix[3][1];

z = x0 * matrix[0][2] + y0 * matrix[1][2] + z0 * matrix[2][2] + w0 * matrix[3][2];

The x0, y0, etc. coordinates are the original object space coordinates for the vector. That is, they never change due to transformation.

"Alright," you say. "Where did all the w coordinates come from???" Good question :) The w coordinates come from what is known as a homogenous coordinate system, which is basically a way to represent 3d space in terms of a 4d matrix. Because we are limiting ourselves to 3d, we pick a constant, nonzero value for w (1.0 is a good choice, since anything * 1.0 = itself). If we use this identity axiom, we can eliminate a multiply from each of the dot products:

x = x0 * matrix[0][0] + y0 * matrix[1][0] + z0 * matrix[2][0] + matrix[3][0];

y = x0 * matrix[0][1] + y0 * matrix[1][1] + z0 * matrix[2][1] + matrix[3][1];

z = x0 * matrix[0][2] + y0 * matrix[1][2] + z0 * matrix[2][2] + matrix[3][2];

These are the formulas you should use to transform a vector by a matrix.
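As a minimal sketch in C (the vec3 type and the transform_point name are mine, not from the original), the three formulas become:

    typedef struct { float x, y, z; } vec3;

    // Transform object-space point p0 by a 4x4 matrix, assuming w0 = 1.0,
    // so the w multiply drops out exactly as described above.
    vec3 transform_point(float m[4][4], vec3 p0)
    {
        vec3 p;
        p.x = p0.x * m[0][0] + p0.y * m[1][0] + p0.z * m[2][0] + m[3][0];
        p.y = p0.x * m[0][1] + p0.y * m[1][1] + p0.z * m[2][1] + m[3][1];
        p.z = p0.x * m[0][2] + p0.y * m[1][2] + p0.z * m[2][2] + m[3][2];
        return p;
    }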

Object Space Transformations

Now that we know how to multiply matrices together, we can implement the actual formulas used in our transformations. There are three operations performed on a vector by a matrix transformation: translation, rotation, and scaling.

Translation can best be described as a linear change in position. This change can be represented by a delta vector [dx, dy, dz], where dx represents the change in the object's x position, dy represents the change in its y position, and dz the change in its z position.

If done correctly, object space translation allows objects to translate forward/backward, left/right, and up/down, relative to the current orientation of the object. Using our airplane as an example, if the nose of the airplane is oriented along the object's local z axis, then translating the airplane in the +z direction will make the airplane move forward (the direction in which its nose is pointing) regardless of the airplane's orientation.

Here is the translation matrix:

| 1    0    0    0 |
| 0    1    0    0 |
| 0    0    1    0 |
| dx   dy   dz   1 |

where [dx, dy, dz] is the displacement vector. After this operation, the object will have moved in its own coordinate system, according to the displacement (translation) vector.

The next operation that is performed by our matrix transformation is rotation. Rotation can be described as circular motion about some axis; in this case, the axis is one of the object's local axes. Since there are three axes in each object, we need to rotate around each of them. Here are the matrices for rotation about each axis:

about the x axis:

| 1    0    0    0 |
| 0    cx   sx   0 |
| 0   -sx   cx   0 |
| 0    0    0    1 |

about the y axis:

| cy   0   -sy   0 |
| 0    1    0    0 |
| sy   0    cy   0 |
| 0    0    0    1 |

about the z axis:

|  cz   sz   0    0 |
| -sz   cz   0    0 |
|  0    0    1    0 |
|  0    0    0    1 |

The cx, sx, cy, sy, cz, and sz variables are the values of the cosines and sines of the angles of rotation about the x, y, and z axes, respectively. Remember that the angles used represent angular displacement, just as the values used in the translation step denote a linear displacement. Correct transformation CANNOT be accomplished with matrix multiplication if you use the cumulative angles of rotation. I have been told that quaternions are able to perform this operation correctly; however, I know nothing of quaternions and how they are implemented.

The incremental angles used here represent rotation from the current object orientation. In other words, by rotating 1 degree about the z axis, you are telling your object "Rotate 1 degree about your z axis, regardless of your current orientation, and regardless of how you got to that orientation." If you think about it a bit, you will realize that this is how the real world operates. In object space, the series of rotations an object undergoes to attain a certain orientation has no effect on the object-space results of any upcoming rotations.

Now that we know the matrix formulas for translation and rotation, we can combine them to transform our objects. The formula for transformations in object space is

[O] = [O] * [T] * [X] * [Y] * [Z]

where O is the object's matrix, T is the translation matrix, and X, Y, and Z are the rotation matrices for their respective axes. Remember that the order of matrix multiplication is very important!

The recursive assignment of O poses a question: What is the original value of the object matrix? To eliminate any terrible errors in transformation, the matrices which store an object's orientation should always be initialized to identity.
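A minimal sketch of that initialization in C (the function name mat_identity is mine):

    // Initialize a 4x4 matrix to the identity: 1s on the diagonal, 0s elsewhere.
    void mat_identity(float m[4][4])
    {
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                m[i][j] = (i == j) ? 1.0f : 0.0f;
    }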

Matrix Multiplication

You probably know what a matrix is already if you are interested in matrix multiplication. However, a quick example won't hurt. A matrix is just a two-dimensional group of numbers. Instead of a list, called a vector, a matrix is a rectangle, like the following:

x = | 1  6 |
    | 3  8 |


You can set a variable to be a matrix just as you can set a variable to be a number. In this case, x is the matrix containing those four numbers (in that particular order). Now, suppose you have two matrices that you need to multiply. Multiplication for numbers is pretty easy, but how do you do it for a matrix?

Here is a key point: You cannot just multiply each number by the corresponding number in the other matrix. Matrix multiplication is not like addition or subtraction. It is more complicated, but the overall process is not hard to learn. Here's an example first, and then I'll explain what I did:

Example: Multiply the following two matrices:

| 1  6 |   | 2  2 |
| 3  8 | x | 9  7 |

Solution:

| 1  6 |   | 2  2 |   | 56  44 |
| 3  8 | x | 9  7 | = | 78  62 |

You're probably wondering how in the world I got that answer. Well you're justified in thinking that. Matrix multiplication is not an easy task to learn, and you do need to pay attention to avoid a careless error or two. Here's the process:


Step 1: Move across the top row of the first matrix, and down the first column of the second matrix:

Step 2: Multiply each number from the top row of the first matrix by the number in the first column of the second matrix. In this case, that means multiplying 1*2 and 6*9. Then, take the sum of those values (2 + 54 = 56):

Step 3: Insert the value you just got into the answer matrix. Since we are multiplying the 1st row and the 1st column, our answer goes into that slot in the answer matrix:

Step 4: Repeat for the other rows and columns. That means you need to walk across the first row of the first matrix and this time down the second column of the second matrix. Then the second row of the first matrix and the first column of the second, and finally the bottom row of the first matrix and the right column of the second matrix:


Step 5: Insert all of those values into the answer matrix. I just showed you how to do the top left and the bottom right. If you work the other two numbers, you will get 1*2+6*7=44 and 3*2+8*9=78. Insert them into the answer matrix in the corresponding positions and you get:

| 56  44 |
| 78  62 |

Now I know what you're thinking. That was really hard!!! Well it will seem that way until you get used to the process. It may help you to write out all your work, and even draw arrows to remember which way you're moving in the rows and columns. Just remember to multiply each row in the first matrix by each column in the second matrix.


What if the matrices aren't squares? Then you have to add another step. In order to multiply two matrices, the matrix on the left must have as many columns as the matrix on the right has rows. That way you can match up each pair while you're multiplying. The size of the final matrix is determined by the rows in the left matrix and the columns in the right. Here's what I do:

I write down the sizes of the matrices. The left matrix has 2 rows and 3 columns, so that's how we write it. Rows, columns, in that order. The other matrix is a 3x1 matrix because it has 3 rows and just 1 column. If the numbers in the middle match up you can multiply. The outside numbers give you the size of the answer. Even if you mess this up you'll figure it out eventually because you won't be able to multiply.
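As a sketch of the size rule in code, here is a general routine in C (the name matmul_general and the demo values are my own; it multiplies an m×n matrix by an n×p matrix into an m×p result, like the 2x3 times 3x1 case just described):

    #include <stdio.h>

    // C = A * B, where A is m x n and B is n x p; C comes out m x p.
    // The inner dimension n must match, exactly as described above.
    void matmul_general(int m, int n, int p,
                        float A[m][n], float B[n][p], float C[m][p])
    {
        for (int i = 0; i < m; i++)
            for (int j = 0; j < p; j++) {
                C[i][j] = 0.0f;
                for (int k = 0; k < n; k++)
                    C[i][j] += A[i][k] * B[k][j];
            }
    }

    int main(void) {
        float A[2][3] = {{1, 2, 3}, {4, 5, 6}};
        float B[3][1] = {{7}, {8}, {9}};
        float C[2][1];
        matmul_general(2, 3, 1, A, B, C);          // 2x3 times 3x1 gives 2x1
        printf("%.0f\n%.0f\n", C[0][0], C[1][0]);  // 50 and 122
        return 0;
    }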

Here's an important reminder: matrix multiplication is not commutative. That means you cannot switch the order and expect the same result! Regular multiplication tells us that 4*3 = 3*4, but this is not multiplication in the usual sense.

Finally, here's an example with uneven matrix sizes to wrap things up:

Example:


 

Lab 1: Matrix Calculation Examples

Given the Following Matrices:

A=

1.00 2.00 3.00

4.00 5.00 6.00

7.00 8.00 9.00


B=

1.00 1.00 1.00

2.00 2.00 2.00

C=

1.00 2.00 1.00

3.00 2.00 2.00

1.00 5.00 3.00

1) Calculate A + C

2) Calculate A - C

3) Calculate A * C (A times C)

4) Calculate B * A (B times A)

5) Calculate A .* C (A element by element multiplication with C)

6) Inverse of C


Matrix Calculation Examples - Answers

A + C=

2.00 4.00 4.00

7.00 7.00 8.00

8.00 13.00 12.00

A - C=

0.00 0.00 2.00

1.00 3.00 4.00

6.00 3.00 6.00

A * C=

10.00 21.00 14.00

25.00 48.00 32.00

40.00 75.00 50.00

B * A=

12.00 15.00 18.00


24.00 30.00 36.00

Element by element multiplication A .* C=

1.00 4.00 3.00

12.00 10.00 12.00

7.00 40.00 27.00

Inverse of C=

0.80 0.20 -0.40

1.40 -0.40 -0.20

-2.60 0.60 0.80
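As a quick check, a minimal C sketch (my own addition) that multiplies C by the inverse above and prints the result, which should be the 3×3 identity up to rounding:

    #include <stdio.h>

    int main(void) {
        double C[3][3]    = {{1, 2, 1}, {3, 2, 2}, {1, 5, 3}};
        double Cinv[3][3] = {{ 0.8,  0.2, -0.4},
                             { 1.4, -0.4, -0.2},
                             {-2.6,  0.6,  0.8}};
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++) {
                double s = 0.0;
                for (int k = 0; k < 3; k++)
                    s += C[i][k] * Cinv[k][j];
                printf("%6.2f ", s);  // 1.00 on the diagonal, 0.00 elsewhere
            }
            printf("\n");
        }
        return 0;
    }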


Matrix Calculation Assignment

Given the Following Matrices:

A=

1.00 2.00 3.00

6.00 5.00 4.00

9.00 8.00 7.00

B=

2.00 2.00 2.00

3.00 3.00 3.00

C=

1.00 2.00 1.00

4.00 3.00 1.00

3.00 4.00 2.00


1) Calculate A + C

2) Calculate A - C

3) Calculate A * C (A times C)

4) Calculate B * A (B times A)

5) Calculate A .* C (A element by element multiplication with C)

Vector product VS dot product in matrix

hi, I don't really understand what's the difference between a vector product and a dot product in matrix form.

for example:

| 1  2 |   | 1  2 |
| 3  4 | X | 3  4 | = ?

so when I take rows multiplied by columns, to get a 2x2 matrix, am I doing a vector product?

so what then is a dot product?

lastly, my notes say |detT| = final area of basic box / initial area of basic box

where detT = (Ti) x (Tj) . (Tk)

so, what's the difference between how I should work out the x (cross) versus the . (dot)?

also, |detT| = magnitude of T, right? so is there a formula I should use to find the magnitude?

so why is |k . k| = 1? thanks

HallsofIvy replied:

Originally posted by quietrain: Hi, I don't really understand what's the difference between a vector product and a dot product in matrix form.

For example:

| 1  2 |   | 1  2 |
| 3  4 | X | 3  4 | = ?

So when I take rows multiplied by columns, to get a 2x2 matrix, am I doing a vector product?

No, you are doing a "matrix product". There are no vectors here.

so what then is a dot product?

With matrices? It isn't anything. The matrix product is the only multiplication defined for matrices. The dot product is defined for vectors, not matrices.

lastly, my notes say |detT| = final area of basic box / initial area of basic box, where detT = (Ti) x (Tj) . (Tk)

Well, we don't have your notes, so we have no idea what "T", "Ti", "Tj", "Tk" are, nor do we know what a "basic box" is.

I do know that if you have a "parallelepiped" with adjacent sides given by the vectors u, v, and w, then the volume (not area) of the parallelepiped is given by the "triple product", (u x v) . w, which can be represented by a determinant having the components of the vectors as rows. That has nothing at all to do with matrices.

so, what's the difference between how I should work out the x (cross) versus the . (dot)?

also, |detT| = magnitude of T, right?

No, "det" applies only to square arrays for which "magnitude" is not defined.

so is there a formula I should use to find the magnitude?

so why is |k . k| = 1? thanks

I guess you mean "k" to be the unit vector in the z direction in a three dimensional coordinate system. If so, then k.k is, by definition, the squared length of k, which is, again by definition of "unit vector", 1.

You seem to be confusing a number of very different concepts. Go back and review.

 

quietrain replied:

oh.. em..

ok, let's say we have

| 1  2 |   | 4  5 |
| 3  4 | x | 6  7 | =

So this is just rows multiplied by columns to get a 2x2 matrix, right? So what is the difference if I replace the x sign with the dot sign now? Do I still get the same? I presume one is the cross (x) product, one is the dot (.) product? Or is it that for matrices there is no such thing as a cross or dot product? That's weird; my tutor tells us to know the difference between the cross and dot matrix product.

So for the case of the parallelepiped, what's the significance of the triple product (u x v) . w? Why do we use x for u and v but . for w?

Is it just to tell us that we have to use sin and cos respectively? But if u, v and w were square matrices, then there won't be any sin and cos to use? So we just multiply as usual, rows by columns?

Oh, by definition. So that means |k.k| = (k)(k)cos(0) = (1)(1)cos(0) = 1, so |i.k| = (1)(1)cos(90) = 0? So if i x k gives us -j by the right hand rule, then does it mean the magnitude, which is |i.k| = 0, is 0? In the direction of -j?? Or are they 2 totally different aspects?

btw, sorry for another question: why is e^(wA), where

A = | 0  -1 |
    | 1   0 |

can be expressed as

| cos w  -sin w |
| sin w   cos w |

which is the anti-clockwise rotation matrix, right?

thanks

HallsofIvy replied:

Originally posted by quietrain: ok, let's say we have

| 1  2 |   | 4  5 |
| 3  4 | x | 6  7 | =

So this is just rows multiplied by columns to get a 2x2 matrix, right? So what is the difference if I replace the x sign with the dot sign now? Do I still get the same?

Page 24: 1561 maths

You can replace it by whatever symbol you like. As long as your multiplication is "matrix multiplication" you will get the same result.

I presume one is the cross (x) product, one is the dot (.) product?

No, just changing the symbol doesn't make it one or the other.

or is it that for matrices there is no such thing as a cross or dot product? that's weird; my tutor tells us to know the difference between the cross and dot matrix product

I suspect your tutor was talking about vectors not matrices.

so for the case of the parallelepiped, what's the significance of the triple product (u x v) . w? why do we use x for u and v but . for w?

Because you are talking about vectors not matrices!

is it just to tell us that we have to use sin and cos respectively? but if u, v and w were square matrices, then there won't be any sin and cos to use? so we just multiply as usual, rows by columns?

They are NOT matrices, they are vectors!!

You can think of vectors as "row matrices" (1 by n) or "column matrices" (n by 1), but they still have properties that matrices in general do not have.

oh, by definition. so that means |k.k| = (k)(k)cos(0) = (1)(1)cos(0) = 1, so |i.k| = (1)(1)cos(90) = 0?

Yes, that is correct.

so if i x k gives us -j by the right hand rule, then does it mean the magnitude, which is |i.k| = 0, is 0? in the direction of -j?? or are they 2 totally different aspects?

No, the length of i x k is NOT |i.k|, it is |i||j|.

In general, the length of u x v is |u| |v| sin(θ), where θ is the angle between u and v.

btw, sorry for another question: why is e^(wA), where

A = | 0  -1 |
    | 1   0 |

can be expressed as

| cos w  -sin w |
| sin w   cos w |

which is the anti-clockwise rotation matrix, right?

thanks


For objects other than numbers, where we have a notion of addition and multiplication, we define higher functions by using their "Taylor series", power series that are equal to the functions. In particular,

e^X = I + X + X^2/2! + X^3/3! + X^4/4! + ...

It should be easy to calculate that

A^2 = | -1   0 | = -I,   A^3 = -A,   A^4 = I
      |  0  -1 |

and, since that is the identity matrix, it all repeats:

A^5 = A,   A^6 = -I,   A^7 = -A,   A^8 = I,   etc.

That gives

e^(wA) = | 1 - w^2/2! + w^4/4! - ...      -(w - w^3/3! + w^5/5! - ...) |
         | w - w^3/3! + w^5/5! - ...       1 - w^2/2! + w^4/4! - ...   |

and you should be able to recognise those as the Taylor series about 0 for cos(w) and sin(w).
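As a numeric sanity check, here is a small C sketch (my own construction, not part of the thread) that sums the series for e^(wA) with A as above and compares it to the rotation matrix:

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double w = 0.7;                        // any angle, in radians
        double A[2][2] = {{0, -1}, {1, 0}};
        double term[2][2] = {{1, 0}, {0, 1}};  // current series term, starts at I
        double sum[2][2]  = {{1, 0}, {0, 1}};  // running sum, starts at I

        for (int n = 1; n <= 20; n++) {        // term <- term * (wA) / n
            double next[2][2];
            for (int i = 0; i < 2; i++)
                for (int j = 0; j < 2; j++)
                    next[i][j] = (term[i][0] * A[0][j] + term[i][1] * A[1][j]) * w / n;
            for (int i = 0; i < 2; i++)
                for (int j = 0; j < 2; j++) {
                    term[i][j] = next[i][j];
                    sum[i][j] += term[i][j];
                }
        }
        printf("series: %f %f / %f %f\n", sum[0][0], sum[0][1], sum[1][0], sum[1][1]);
        printf("exact:  %f %f / %f %f\n", cos(w), -sin(w), sin(w), cos(w));
        return 0;
    }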

quietrain replied:

wow.

ok, I went to check again what my tutor said, and it was "scalar and vector products in terms of matrices". so what does he mean by this?

The scalar product is (A B C) x (D E F)^T (so we can take the transpose of DEF because it is a symmetric matrix? or is it for some other reason?), so rows multiplied by columns again? But what about the vector product?

For the parallelepiped, (u x v).w: so let's say u = (1,1), v = (2,2), w = (3,3), so u x v = (1x2, 1x2)sin(angle between vectors), so .w = (2x3, 2x3)cos(angle)? So if it yields 0, the vector w lies in the plane defined by u and v, but otherwise w doesn't lie in the plane of u and v?

For i x k, why is the length |i||j|? why is j introduced here? shouldn't it be |i||k|sin(90) = 1? oh I see.. so the right hand rule gives the direction but the magnitude for i x k = |i||k|sin(90) = 1?

thanks a ton!

HallsofIvy replied:

Originally posted by quietrain: wow.

ok, I went to check again what my tutor said, and it was "scalar and vector products in terms of matrices". so what does he mean by this?

The scalar product is (A B C) x (D E F)^T (so we can take the transpose of DEF because it is a symmetric matrix? or is it for some other reason?), so rows multiplied by columns again?

Okay, if you think of one vector as a row matrix and the other as a column matrix, then the "dot product" is the matrix product:

            | D |
(A B C)  *  | E |  =  AD + BE + CF
            | F |

But the dot product is commutative, isn't it? Does it really make sense to treat the two vectors as different kinds of matrices? It is really better here to think of this not as the product of two vectors but as a vector in a vector space and a functional in the dual space.

but what about the vector product?

for the parallelepiped, (u x v).w: so let's say u = (1,1), v = (2,2), w = (3,3), so u x v = (1x2, 1x2)sin(angle between vectors), so .w = (2x3, 2x3)cos(angle)? so if it yields 0, the vector w lies in the plane defined by u and v, but otherwise w doesn't lie in the plane of u and v?

for i x k, why is the length |i||j|? why is j introduced here? shouldn't it be |i||k|sin(90) = 1?

Yes, that was a typo. I meant |i||k|.

oh I see.. so the right hand rule gives the direction but the magnitude for i x k = |i||k|sin(90) = 1?

Yes.

thanks a ton!


Matrix (mathematics)

From Wikipedia, the free encyclopedia

Specific entries of a matrix are often referenced by using pairs of subscripts.


In mathematics, a matrix (plural matrices, or less commonly matrixes) is a rectangular array of numbers, such as

| 1   9   13 |
| 20  55   4 |

An item in a matrix is called an entry or an element. The example has entries 1, 9, 13, 20, 55, and 4. Entries are often denoted by a variable with two subscripts, as shown on the right. Matrices of the same size can be added and subtracted entrywise and matrices of compatible sizes can be multiplied. These operations have many of the properties of ordinary arithmetic, except that matrix multiplication is not commutative, that is, AB and BA are not equal in general. Matrices consisting of only one column or row define the components of vectors, while higher-dimensional (e.g., three-dimensional) arrays of numbers define the components of a generalization of a vector called a tensor. Matrices with entries in other fields or rings are also studied.

Matrices are a key tool in linear algebra. One use of matrices is to represent linear transformations, which are higher-dimensional analogs of linear functions of the form f(x) = cx, where c is a constant; matrix multiplication corresponds to composition of linear transformations. Matrices can also keep track of the coefficients in a system of linear equations. For a square matrix, the determinant and inverse matrix (when it exists) govern the behavior of solutions to the corresponding system of linear equations, and eigenvalues and eigenvectors provide insight into the geometry of the associated linear transformation.

Matrices find many applications. Physics makes use of matrices in various domains, for example in geometrical optics and matrix mechanics; the latter led to studying in more detail matrices with an infinite number of rows and columns. Graph theory uses matrices to keep track of distances between pairs of vertices in a graph. Computer graphics uses matrices to project 3-dimensional space onto a 2-dimensional screen. Matrix calculus generalizes classical analytical notions such as derivatives of functions or exponentials to matrices. The latter is a recurring need in solving ordinary differential equations. Serialism and dodecaphonism are musical movements of the 20th century that use a square mathematical matrix to determine the pattern of music intervals.

A major branch of numerical analysis is devoted to the development of efficient algorithms for matrix computations, a subject that is centuries old but still an active area of research. Matrix decomposition methods simplify computations, both theoretically and practically. For sparse matrices, specifically tailored algorithms can provide speedups; such matrices arise in the finite element method, for example.

Definition

A matrix is a rectangular arrangement of numbers.[1] For example,

    | 9   13  5 |
A = | 1   11  7 |
    | 3   9   2 |
    | 6   0   7 |

An alternative notation uses large parentheses instead of box brackets.

The horizontal and vertical lines in a matrix are called rows and columns, respectively. The numbers in the matrix are called its entries or its elements. To specify a matrix's size, a matrix with m rows and n columns is called an m-by-n matrix or m × n matrix, while m and n are called its dimensions. The above is a 4-by-3 matrix.

A matrix with one row (a 1 × n matrix) is called a row vector, and a matrix with one column (an m × 1 matrix) is called a column vector. Any row or column of a matrix determines a row or column vector, obtained by removing all other rows respectively columns from the matrix. For example, the row vector for the third row of the above matrix A is

( 3  9  2 )

When a row or column of a matrix is interpreted as a value, this refers to the corresponding row or column vector. For instance one may say that two different rows of a matrix are equal, meaning they determine the same row vector. In some cases the value of a row or column should be interpreted just as a sequence of values (an element of Rn if entries are real numbers) rather than as a matrix, for instance when saying that the rows of a matrix are equal to the corresponding columns of its transpose matrix.

Page 30: 1561 maths

Most of this article focuses on real and complex matrices, i.e., matrices whose entries are real or complex numbers. More general types of entries are discussed below.

Notation

The specifics of matrix notation vary widely, with some prevailing trends. Matrices are usually denoted using upper-case letters, while the corresponding lower-case letters, with two subscript indices, represent the entries. In addition to using upper-case letters to symbolize matrices, many authors use a special typographical style, commonly boldface upright (non-italic), to further distinguish matrices from other variables. An alternative notation involves the use of a double-underline with the variable name, with or without boldface style.

The entry that lies in the i-th row and the j-th column of a matrix is typically referred to as the i,j, (i,j), or (i,j)th entry of the matrix. For example, the (2,3) entry of the above matrix A is 7. The (i, j)th entry of a matrix A is most commonly written as ai,j. Alternative notations for that entry are A[i,j] or Ai,j.

Sometimes a matrix is referred to by giving a formula for its (i,j)th entry, often with double parenthesis around the formula for the entry, for example, if the (i,j)th entry of A were given by aij, A would be denoted ((aij)).

An asterisk is commonly used to refer to whole rows or columns in a matrix. For example, ai,∗ refers to the ith row of A, and a∗,j refers to the jth column of A. The set of all m-by-n matrices is denoted M(m, n).

A common shorthand is

A = [ai,j]i=1,...,m; j=1,...,n or more briefly A = [ai,j]m×n

to define an m × n matrix A. Usually the entries ai,j are defined separately for all integers 1 ≤ i ≤ m and 1 ≤ j ≤ n. They can however sometimes be given by one formula; for example the 3-by-4 matrix

| 0  -1  -2  -3 |
| 1   0  -1  -2 |
| 2   1   0  -1 |

can alternatively be specified by A = [i − j]i=1,2,3; j=1,...,4, or simply A = ((i-j)), where the size of the matrix is understood.

Some programming languages start the numbering of rows and columns at zero, in which case the entries of an m-by-n matrix are indexed by 0 ≤ i ≤ m − 1 and 0 ≤ j ≤ n − 1.[2] This article follows the more common convention in mathematical writing where enumeration starts from 1.

Basic operations

Main articles: Matrix addition, Scalar multiplication, Transpose, and Row operations

There are a number of operations that can be applied to modify matrices called matrix addition, scalar multiplication and transposition.[3] These form the basic techniques to deal with matrices.


Addition

The sum A+B of two m-by-n matrices A and B is calculated entrywise:

(A + B)i,j = Ai,j + Bi,j, where 1 ≤ i ≤ m and 1 ≤ j ≤ n.

Scalar multiplication

The scalar multiplication cA of a matrix A and a number c (also called a scalar in the parlance of abstract algebra) is given by multiplying every entry of A by c:

(cA)i,j = c · Ai,j.

Transpose

The transpose of an m-by-n matrix A is the n-by-m matrix AT (also denoted Atr or tA) formed by turning rows into columns and vice versa:

(AT)i,j = Aj,i.

Familiar properties of numbers extend to these operations of matrices: for example, addition is commutative, i.e. the matrix sum does not depend on the order of the summands: A + B = B + A.[4] The transpose is compatible with addition and scalar multiplication, as expressed by (cA)T = c(AT) and (A + B)T = AT + BT. Finally, (AT)T = A.

Row operations are ways to change matrices. There are three types of row operations: row switching, that is interchanging two rows of a matrix, row multiplication, multiplying all entries of a row by a non-zero constant and finally row addition which means adding a multiple of a row to another row. These row operations are used in a number of ways including solving linear equations and finding inverses.
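As a small illustration in C (my own sketch, not from the original; COLS is an assumed fixed column count), the three row operations look like:

    #define COLS 4

    // Row switching: interchange rows r1 and r2.
    void row_switch(float a[][COLS], int r1, int r2)
    {
        for (int j = 0; j < COLS; j++) {
            float t = a[r1][j];
            a[r1][j] = a[r2][j];
            a[r2][j] = t;
        }
    }

    // Row multiplication: multiply all entries of row r by a non-zero constant c.
    void row_scale(float a[][COLS], int r, float c)
    {
        for (int j = 0; j < COLS; j++)
            a[r][j] *= c;
    }

    // Row addition: add c times row src to row dst.
    void row_add(float a[][COLS], int dst, int src, float c)
    {
        for (int j = 0; j < COLS; j++)
            a[dst][j] += c * a[src][j];
    }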

Matrix multiplication, linear equations and linear transformations

Main article: Matrix multiplication

Schematic depiction of the matrix product AB of two matrices A and B.

Multiplication of two matrices is defined only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If A is an m-by-n matrix and B is an n-by-p matrix, then their matrix product AB is the m-by-p matrix whose entries are given by the dot product of the corresponding row of A and the corresponding column of B:

(AB)i,j = Ai,1B1,j + Ai,2B2,j + ... + Ai,nBn,j,

where 1 ≤ i ≤ m and 1 ≤ j ≤ p.[5] For example (the underlined entry 1 in the product is calculated as the product 1 · 1 + 0 · 1 + 2 · 0 = 1):

|  1  0  2 |   | 3  1 |   | 5  1 |
| -1  3  1 | x | 2  1 | = | 4  2 |
               | 1  0 |

Matrix multiplication satisfies the rules (AB)C = A(BC) (associativity), and (A+B)C = AC+BC as well as C(A+B) = CA+CB (left and right distributivity), whenever the size of the matrices is such that the various products are defined.[6] The product AB may be defined without BA being defined, namely if A and B are m-by-n and n-by-k matrices, respectively, and m ≠ k. Even if both products are defined, they need not be equal, i.e. generally one has

AB ≠ BA,

i.e., matrix multiplication is not commutative, in marked contrast to (rational, real, or complex) numbers whose product is independent of the order of the factors. An example of two matrices not commuting with each other is:

| 0  1 |   | 0  0 |   | 1  0 |
| 0  0 | x | 1  0 | = | 0  0 |

whereas

| 0  0 |   | 0  1 |   | 0  0 |
| 1  0 | x | 0  0 | = | 0  1 |

The identity matrix In of size n is the n-by-n matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, e.g.

     | 1  0  0 |
I3 = | 0  1  0 |
     | 0  0  1 |

It is called identity matrix because multiplication with it leaves a matrix unchanged: MIn = ImM = M for any m-by-n matrix M.

Besides the ordinary matrix multiplication just described, there exist other less frequently used operations on matrices that can be considered forms of multiplication, such as the Hadamard product and the Kronecker product.[7] They arise in solving matrix equations such as the Sylvester equation.

Page 34: 1561 maths

Linear equations

Main articles: Linear equation and System of linear equations

A particular case of matrix multiplication is tightly linked to linear equations: if x designates a column vector (i.e. n×1-matrix) of n variables x1, x2, ..., xn, and A is an m-by-n matrix, then the matrix equation

Ax = b,

where b is some m×1-column vector, is equivalent to the system of linear equations

A1,1x1 + A1,2x2 + ... + A1,nxn = b1
  ...
Am,1x1 + Am,2x2 + ... + Am,nxn = bm .[8]

This way, matrices can be used to compactly write and deal with multiple linear equations, i.e. systems of linear equations.
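For a concrete instance of the notation (the numbers here are an illustration of my own, not from the original), the system

x1 + 2x2 = 5
3x1 + 4x2 = 6

is Ax = b with

A = | 1  2 |,   x = | x1 |,   b = | 5 |
    | 3  4 |        | x2 |        | 6 |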

Linear transformations

Main articles: Linear transformation and Transformation matrix

Matrices and matrix multiplication reveal their essential features when related to linear transformations, also known as linear maps. A real m-by-n matrix A gives rise to a linear transformation Rn → Rm mapping each vector x in Rn to the (matrix) product Ax, which is a vector in Rm. Conversely, each linear transformation f: Rn → Rm arises from a unique m-by-n matrix A: explicitly, the (i, j)-entry of A is the ith coordinate of f(ej), where ej = (0,...,0,1,0,...,0) is the unit vector with 1 in the jth position and 0 elsewhere. The matrix A is said to represent the linear map f, and A is called the transformation matrix of f.

The following table shows a number of 2-by-2 matrices with the associated linear maps of R2. The blue original is mapped to the green grid and shapes, the origin (0,0) is marked with a black point.

[Pictured: vertical shear with m = 1.25; horizontal flip; squeeze mapping with r = 3/2; scaling by a factor of 3/2; rotation by π/6 (30°).]


Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps:[9] if a k-by-m matrix B represents another linear map g : Rm → Rk, then the composition g ∘ f is represented by BA since

(g ∘ f)(x) = g(f(x)) = g(Ax) = B(Ax) = (BA)x.

The last equality follows from the above-mentioned associativity of matrix multiplication.

The rank of a matrix A is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors.[10] Equivalently it is the dimension of the image of the linear map represented by A.[11] The rank-nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix.[12]

Square matrices

A square matrix is a matrix which has the same number of rows and columns. An n-by-n matrix is known as a square matrix of order n. Any two square matrices of the same order can be added and multiplied. A square matrix A is called invertible or non-singular if there exists a matrix B such that

AB = In.[13]

This is equivalent to BA = In.[14] Moreover, if B exists, it is unique and is called the inverse matrix of A, denoted A−1.

The entries Ai,i form the main diagonal of a matrix. The trace tr(A) of a square matrix A is the sum of its diagonal entries. While, as mentioned above, matrix multiplication is not commutative, the trace of the product of two matrices is independent of the order of the factors: tr(AB) = tr(BA).[15]

If all entries outside the main diagonal are zero, A is called a diagonal matrix. If only all entries above (below) the main diagonal are zero, A is called a lower triangular matrix (upper triangular matrix, respectively). For example, if n = 3, they look like

| d11  0    0   |              | l11  0    0   |          | u11  u12  u13 |
| 0    d22  0   | (diagonal),  | l21  l22  0   | (lower)  | 0    u22  u23 | and (upper triangular matrix).
| 0    0    d33 |              | l31  l32  l33 |          | 0    0    u33 |

Determinant

Main article: Determinant

A linear transformation on R2 given by the indicated matrix. The determinant of this matrix is −1, as the area of the green parallelogram at the right is 1, but the map reverses the orientation, since it turns the counterclockwise orientation of the vectors to a clockwise one.

The determinant det(A) or |A| of a square matrix A is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in R2) or volume (in R3) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved.

The determinant of 2-by-2 matrices is given by

det | a  b | = ad − bc;
    | c  d |

the determinant of 3-by-3 matrices involves 6 terms (rule of Sarrus). The more lengthy Leibniz formula generalises these two formulae to all dimensions.[16]

The determinant of a product of square matrices equals the product of their determinants: det(AB) = det(A) · det(B).[17] Adding a multiple of any row to another row, or a multiple of any column to another column, does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1.[18] Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix. Finally, the Laplace expansion expresses the determinant in terms of minors, i.e., determinants of smaller matrices.[19] This expansion can be used for a recursive definition of determinants (taking as starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), that can be seen to be equivalent to the Leibniz formula. Determinants can be used to solve linear systems using Cramer's rule, where the division of the determinants of two related square matrices equates to the value of each of the system's variables.[20]
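The triangularization method just described translates directly into code. Here is a minimal C sketch (my own construction) that computes a determinant by Gaussian elimination with partial pivoting, tracking the sign flips from row swaps; it reproduces det = −5 for the lab matrix C used earlier in this document:

    #include <stdio.h>
    #include <math.h>

    #define N 3

    // Determinant via row reduction: reduce to upper triangular form,
    // then multiply the diagonal; each row swap flips the sign.
    // Note: this modifies its argument in place.
    double determinant(double a[N][N])
    {
        double det = 1.0;
        for (int col = 0; col < N; col++) {
            int pivot = col;                   // partial pivoting: largest entry
            for (int r = col + 1; r < N; r++)
                if (fabs(a[r][col]) > fabs(a[pivot][col]))
                    pivot = r;
            if (a[pivot][col] == 0.0)
                return 0.0;                    // singular matrix
            if (pivot != col) {                // swap rows, flip sign
                for (int c = 0; c < N; c++) {
                    double t = a[col][c]; a[col][c] = a[pivot][c]; a[pivot][c] = t;
                }
                det = -det;
            }
            for (int r = col + 1; r < N; r++) {  // eliminate below the pivot
                double f = a[r][col] / a[col][col];
                for (int c = col; c < N; c++)
                    a[r][c] -= f * a[col][c];
            }
            det *= a[col][col];
        }
        return det;
    }

    int main(void) {
        double C[N][N] = {{1, 2, 1}, {3, 2, 2}, {1, 5, 3}};
        printf("det = %f\n", determinant(C));  // prints -5.000000
        return 0;
    }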

Eigenvalues and eigenvectors

Main article: Eigenvalues and eigenvectors

A number λ and a non-zero vector v satisfying

Av = λv

are called an eigenvalue and an eigenvector of A, respectively.[nb 1][21] The number λ is an eigenvalue of an n×n-matrix A if and only if A−λIn is not invertible, which is equivalent to

det(A − λIn) = 0.[22]

The function pA(t) = det(A−tI) is called the characteristic polynomial of A, its degree is n. Therefore pA(t) has at most n different roots, i.e., eigenvalues of the matrix.[23] They may be complex even if the entries of A are real. According to the Cayley-Hamilton theorem, pA(A) = 0, that is to say, the characteristic polynomial applied to the matrix itself yields the zero matrix.

Symmetry

A square matrix A that is equal to its transpose, i.e. A = AT, is a symmetric matrix; if it is equal to the negative of its transpose, i.e. A = −AT, then it is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy A∗ = A, where the star or asterisk denotes the conjugate transpose of the matrix, i.e. the transpose of the complex conjugate of A.

By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; i.e., every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real.[24] This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns, see below.

Definiteness

For a matrix A, the associated quadratic form QA(x,y) can be definite or indefinite, and the set of vectors (x,y) such that QA(x,y) = 1 traces a corresponding curve:

QA(x,y) = 1/4 x^2 + y^2: positive definite; the set QA(x,y) = 1 is an ellipse.
QA(x,y) = 1/4 x^2 − 1/4 y^2: indefinite; the set QA(x,y) = 1 is a hyperbola.

A symmetric n×n-matrix is called positive-definite (respectively negative-definite; indefinite), if for all nonzero vectors x ∈ Rn the associated quadratic form given by

Q(x) = xTAx

takes only positive values (respectively only negative values; both some negative and some positive values).[25] If the quadratic form takes only non-negative (respectively only non-positive) values, the symmetric matrix is called positive-semidefinite (respectively negative-semidefinite); hence the matrix is indefinite precisely when it is neither positive-semidefinite nor negative-semidefinite.

A symmetric matrix is positive-definite if and only if all its eigenvalues are positive.[26] The table above shows two possibilities for 2-by-2 matrices.

Allowing as input two different vectors instead yields the bilinear form associated to A:

BA (x, y) = xTAy.[27]

Computational aspects

In addition to theoretical knowledge of properties of matrices and their relation to other fields, it is important for practical purposes to perform matrix calculations effectively and precisely. The domain studying these matters is called numerical linear algebra.[28] As with other numerical situations, two main aspects are the complexity of algorithms and their numerical stability. Many problems can be solved by either direct algorithms or iterative approaches. For example, finding eigenvectors can be done by finding a sequence of vectors xn converging to an eigenvector when n tends to infinity.[29]

Determining the complexity of an algorithm means finding upper bounds or estimates of how many elementary operations such as additions and multiplications of scalars are necessary to perform some algorithm, e.g. multiplication of matrices. For example, calculating the matrix product of two n-by-n matrices using the definition given above needs n^3 multiplications, since for any of the n^2 entries of the product, n multiplications are necessary. The Strassen algorithm outperforms this "naive" algorithm; it needs only n^2.807 multiplications.[30] A refined approach also incorporates specific features of the computing devices.

In many practical situations additional information about the matrices involved is known. An important case is sparse matrices, i.e. matrices most of whose entries are zero. There are specifically adapted algorithms for, say, solving linear systems Ax = b for sparse matrices A, such as the conjugate gradient method.[31]

An algorithm is, roughly speaking, numerically stable if small deviations (such as rounding errors) do not lead to large deviations in the result. For example, calculating the inverse of a matrix via Laplace's formula (Adj(A) denotes the adjugate matrix of A)

A−1 = Adj(A) / det(A)

may lead to significant rounding errors if the determinant of the matrix is very small. The norm of a matrix can be used to capture the conditioning of linear algebraic problems, such as computing a matrix's inverse.[32]

Although most computer languages are not designed with commands or libraries for matrices, as early as the 1970s, some engineering desktop computers such as the HP 9830 had ROM cartridges to add BASIC commands for matrices. Some computer languages such as APL were designed to manipulate matrices, and various mathematical programs can be used to aid computing with matrices.[33]

Matrix decomposition methods

Main articles: Matrix decomposition, Matrix diagonalization, and Gaussian elimination

Page 40: 1561 maths

There are several methods to render matrices into a more easily accessible form. They are generally referred to as matrix transformation or matrix decomposition techniques. The interest of all these decomposition techniques is that they preserve certain properties of the matrices in question, such as determinant, rank or inverse, so that these quantities can be calculated after applying the transformation, or that certain matrix operations are algorithmically easier to carry out for some types of matrices.

The LU decomposition factors matrices as a product of a lower triangular matrix (L) and an upper triangular matrix (U).[34] Once this decomposition is calculated, linear systems can be solved more efficiently, by a simple technique called forward and back substitution. Likewise, inverses of triangular matrices are algorithmically easier to calculate. Gaussian elimination is a similar algorithm; it transforms any matrix to row echelon form.[35] Both methods proceed by multiplying the matrix by suitable elementary matrices, which correspond to permuting rows or columns and adding multiples of one row to another row. Singular value decomposition expresses any matrix A as a product UDV∗, where U and V are unitary matrices and D is a diagonal matrix.

A matrix in Jordan normal form. The grey blocks are called Jordan blocks.

The eigendecomposition or diagonalization expresses A as a product VDV−1, where D is a diagonal matrix and V is a suitable invertible matrix.[36] If A can be written in this form, it is called diagonalizable. More generally, and applicable to all matrices, the Jordan decomposition transforms a matrix into Jordan normal form, that is to say matrices whose only nonzero entries are the eigenvalues λ1 to λn of A, placed on the main diagonal and possibly entries equal to one directly above the main diagonal, as shown at the right.[37] Given the eigendecomposition, the nth power of A (i.e. n-fold iterated matrix multiplication) can be calculated via

A^n = (VDV^(-1))^n = VDV^(-1) VDV^(-1) ... VDV^(-1) = VD^n V^(-1)


and the power of a diagonal matrix can be calculated by taking the corresponding powers of the diagonal entries, which is much easier than doing the exponentiation for A instead. This can be used to compute the matrix exponential eA, a need frequently arising in solving linear differential equations, matrix logarithms and square roots of matrices.[38] To avoid numerically ill-conditioned situations, further algorithms such as the Schur decomposition can be employed.[39]

Abstract algebraic aspects and generalizations

Matrices can be generalized in different ways. Abstract algebra uses matrices with entries in more general fields or even rings, while linear algebra codifies properties of matrices in the notion of linear maps. It is possible to consider matrices with infinitely many columns and rows. Another extension is tensors, which can be seen as higher-dimensional arrays of numbers, as opposed to vectors, which can often be realised as sequences of numbers, while matrices are rectangular or two-dimensional arrays of numbers.[40] Matrices, subject to certain requirements, tend to form groups known as matrix groups.

Matrices with more general entries

This article focuses on matrices whose entries are real or complex numbers. However, matrices can be considered with much more general types of entries than real or complex numbers. As a first step of generalization, any field, i.e. a set where addition, subtraction, multiplication and division operations are defined and well-behaved, may be used instead of R or C, for example rational numbers or finite fields. For example, coding theory makes use of matrices over finite fields. Wherever eigenvalues are considered, as these are roots of a polynomial they may exist only in a larger field than that of the coefficients of the matrix; for instance they may be complex in case of a matrix with real entries. The possibility to reinterpret the entries of a matrix as elements of a larger field (e.g., to view a real matrix as a complex matrix whose entries happen to be all real) then allows considering each square matrix to possess a full set of eigenvalues. Alternatively one can consider only matrices with entries in an algebraically closed field, such as C, from the outset.

More generally, abstract algebra makes great use of matrices with entries in a ring R.[41] Rings are a more general notion than fields in that no division operation exists. The very same addition and multiplication operations of matrices extend to this setting, too. The set M(n, R) of all square n-by-n matrices over R is a ring called matrix ring, isomorphic to the endomorphism ring of the left R-module Rn.[42] If the ring R is commutative, i.e., its multiplication is commutative, then M(n, R) is a unitary noncommutative (unless n = 1) associative algebra over R. The determinant of square matrices over a commutative ring R can still be defined using the Leibniz formula; such a matrix is invertible if and only if its determinant is invertible in R, generalising the situation over a field F, where every nonzero element is invertible.[43] Matrices over superrings are called supermatrices.[44]

Matrices do not always have all their entries in the same ring - or even in any ring at all. One special but common case is block matrices, which may be considered as matrices whose entries themselves are matrices. The entries need not be square matrices, and thus need not be members of any ordinary ring; but their sizes must fulfil certain compatibility conditions.

[edit]Relationship to linear maps

Linear maps Rn → Rm are equivalent to m-by-n matrices, as described above. More generally, any linear map f: V → W between finite-dimensional vector spaces can be described by a matrix A = (aij), after choosing bases v1, ..., vn of V, and w1, ..., wm of W (so n is the dimension of V and m is the dimension of W), which is such that

f(vj) = a1j w1 + a2j w2 + ... + amj wm,  for j = 1, ..., n.

In other words, column j of A expresses the image of vj in terms of the basis vectors wi of W; thus this relation uniquely determines the entries of the matrix A. Note that the matrix depends on the choice of the bases: different choices of bases give rise to different, but equivalent matrices.[45] Many of the above concrete notions can be reinterpreted in this light; for example, the transpose matrix A^T describes the transpose of the linear map given by A, with respect to the dual bases.[46]

Graph theory


An undirected graph with its adjacency matrix.

The adjacency matrix of a finite graph is a basic notion of graph theory.[62] It records which vertices of the graph are connected by an edge. Matrices containing just two different values (0 and 1, meaning for example "yes" and "no") are called logical matrices. The distance (or cost) matrix contains information about the distances along the edges.[63] These concepts can be applied to websites connected by hyperlinks, or cities connected by roads etc., in which case (unless the road network is extremely dense) the matrices tend to be sparse, i.e. contain few nonzero entries. Therefore, specifically tailored matrix algorithms can be used in network theory.
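As a small illustration (the graph here is my own example), an adjacency matrix is easy to build and to compute with:

import numpy as np

# Undirected graph on 4 vertices with edges 0-1, 0-2, 2-3.
edges = [(0, 1), (0, 2), (2, 3)]
A = np.zeros((4, 4), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1                # undirected: the matrix is symmetric

# Entry (i, j) of A^k counts the walks of length k from vertex i to vertex j.
print(np.linalg.matrix_power(A, 2))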

[edit]Analysis and geometry

The Hessian matrix of a differentiable function ƒ: Rn → R consists of the second derivatives of ƒ with respect to the several coordinate directions, i.e.[64]

H(ƒ) = [ ∂²ƒ / (∂xi ∂xj) ].

It encodes information about the local growth behaviour of the function: given a critical point x = (x1, ..., xn), i.e., a point where the first partial derivatives ∂ƒ/∂xi of ƒ vanish, the function has a local minimum if the Hessian matrix is positive definite. Quadratic programming can be used to find global minima or maxima of quadratic functions closely related to the ones attached to matrices (see above).[65]

At the saddle point (x = 0, y = 0) (red) of the function f(x, y) = x² − y², the Hessian matrix [[2, 0], [0, −2]] is indefinite.


Another matrix frequently used in geometrical situations is the Jacobi matrix of a differentiable map f: Rn → Rm. If f1, ..., fm denote the components of f, then the Jacobi matrix is defined as [66]

J(f) = [ ∂fi / ∂xj ],  with 1 ≤ i ≤ m and 1 ≤ j ≤ n.

If n > m, and if the rank of the Jacobi matrix attains its maximal value m, f is locally invertible at that point, by the implicit function theorem.[67]

Partial differential equations can be classified by considering the matrix of coefficients of the highest-order differential operators of the equation. For elliptic partial differential equations this matrix is positive definite, which has decisive influence on the set of possible solutions of the equation in question.[68]

The finite element method is an important numerical method to solve partial differential equations, widely applied in simulating complex physical systems. It attempts to approximate the solution to some equation by piecewise linear functions, where the pieces are chosen with respect to a sufficiently fine grid, which in turn can be recast as a matrix equation.[69]

[edit]Probability theory and statistics

Two different Markov chains. The chart depicts the number of particles (of a total of 1000) in state "2". Both limiting values can be determined from the transition matrices (given in red and black in the original figure).

Stochastic matrices are square matrices whose rows are probability vectors, i.e., whose entries are nonnegative and sum up to one. Stochastic matrices are used to define Markov chains with finitely many states.[70] A row of the stochastic matrix gives the probability distribution for the next position of some particle which is currently in the state corresponding to the row. Properties of the Markov chain like absorbing states, i.e. states that any particle attains eventually, can be read off the eigenvectors of the transition matrices.[71]
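A minimal sketch of such a transition matrix (the numbers are illustrative, not the red/black matrices from the lost figure): each row sums to one, and iterating a distribution converges to the limiting distribution, the left eigenvector for eigenvalue 1.

import numpy as np

P = np.array([[0.9, 0.1],     # next-state probabilities from state 1
              [0.4, 0.6]])    # next-state probabilities from state 2

pi = np.array([1.0, 0.0])     # start with every particle in state 1
for _ in range(1000):
    pi = pi @ P               # one step of the chain
print(pi)                     # approaches [0.8, 0.2], the limiting distribution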

Statistics also makes use of matrices in many different forms.[72] Descriptive statistics is concerned with describing data sets, which can often be represented in matrix form, by reducing the amount of data. The covariance matrix encodes the mutual variance of several random variables.[73] Another technique using matrices is linear least squares, a method that approximates a finite set of pairs (x1, y1), (x2, y2), ..., (xN, yN) by a linear function

yi ≈ axi + b, i = 1, ..., N

which can be formulated in terms of matrices, related to the singular value decomposition of matrices.[74]
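A short sketch of the matrix formulation (made-up data, with numpy's SVD-based solver standing in for the singular value decomposition mentioned above):

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 7.1])          # roughly y = 2x + 1

M = np.column_stack([x, np.ones_like(x)])   # rows are (xi, 1)
(a, b), *_ = np.linalg.lstsq(M, y, rcond=None)
print(a, b)                                 # slope and intercept of the best fit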

Random matrices are matrices whose entries are random numbers, subject to suitable probability distributions, such as matrix normal distribution. Beyond probability theory, they are applied in domains ranging from number theory to physics.[75][76]

[edit]Symmetries and transformations in physics

Further information: Symmetry in physics

Linear transformations and the associated symmetries play a key role in modern physics. For example, elementary particles in quantum field theory are classified as representations of the Lorentz group of special relativity and, more specifically, by their behavior under the spin group. Concrete representations involving the Pauli matrices and more general gamma matrices are an integral part of the physical description of fermions, which behave as spinors.[77] For the three lightest quarks, there is a group-theoretical representation involving the special unitary group SU(3); for their calculations, physicists use a convenient matrix representation known as the Gell-Mann matrices, which are also used for the SU(3) gauge group that forms the basis of the modern description of strong nuclear interactions, quantum chromodynamics. The Cabibbo–Kobayashi–Maskawa matrix, in turn, expresses the fact that the basic quark states that are important for weak interactions are not the same as, but linearly related to the basic quark states that define particles with specific and distinct masses.[78]

[edit]Linear combinations of quantum states

The first model of quantum mechanics (Heisenberg, 1925) represented the theory's operators by infinite-dimensional matrices acting on quantum states.[79] This is also referred to as matrix mechanics. One particular example is the density matrix that characterizes the "mixed" state of a quantum system as a linear combination of elementary, "pure" eigenstates.[80]

Another matrix serves as a key tool for describing the scattering experiments which form the cornerstone of experimental particle physics: collision reactions, such as occur in particle accelerators, where non-interacting particles head towards each other and collide in a small interaction zone, with a new set of non-interacting particles as the result, can be described as the scalar product of outgoing particle states and a linear combination of ingoing particle states. The linear combination is given by a matrix known as the S-matrix, which encodes all information about the possible interactions between particles.[81]

[edit]Normal modes

A general application of matrices in physics is to the description of linearly coupled harmonic systems. The equations of motion of such systems can be described in matrix form, with a mass matrix multiplying a generalized velocity to give the kinetic term, and a force matrix multiplying a displacement vector to characterize the interactions. The best way to obtain solutions is to determine the system's eigenvectors, its normal modes, by diagonalizing the matrix equation. Techniques like this are crucial when it comes to describing the internal dynamics of molecules: the internal vibrations of systems consisting of mutually bound component atoms.[82] They are also needed for describing mechanical vibrations, and oscillations in electrical circuits.[83]

[edit]Geometrical optics

Geometrical optics provides further matrix applications. In this approximative theory, the wave nature of light is neglected. The result is a model in which light rays are indeed geometrical rays. If the deflection of light rays by optical elements is small, the action of a lens or reflective element on a given light ray can be expressed as multiplication of a two-component vector with a two-by-two matrix called a ray transfer matrix: the vector's components are the light ray's slope and its distance from the optical axis, while the matrix encodes the properties of the optical element. There are two kinds of matrices, viz. a refraction matrix describing the refraction at a lens surface, and a translation matrix, describing the translation of the plane of reference to the next refracting surface, where another refraction matrix applies. The optical system consisting of a combination of lenses and/or reflective elements is simply described by the matrix resulting from the product of the components' matrices.[84]


[edit]Electronics

The behaviour of many electronic components can be described using matrices. Let A be a 2-dimensional vector with the component's input voltage v1 and input current i1 as its elements, and let B be a 2-dimensional vector with the component's output voltage v2 and output current i2 as its elements. Then the behaviour of the electronic component can be described by B = H · A, where H is a 2 × 2 matrix containing one impedance element (h12), one admittance element (h21) and two dimensionless elements (h11 and h22). Calculating a circuit now reduces to multiplying matrices.
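A sketch with made-up h-parameters (h11 and h22 dimensionless, h12 in ohms, h21 in siemens, matching the roles described above):

import numpy as np

H = np.array([[0.5, 200.0],     # v2 = h11*v1 + h12*i1
              [0.001, 0.7]])    # i2 = h21*v1 + h22*i1

A = np.array([5.0, 0.01])       # input: v1 = 5 V, i1 = 10 mA
v2, i2 = H @ A
print(v2, i2)                   # output voltage (4.5 V) and current (0.012 A)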

[edit]History

Matrices have a long history of application in solving linear equations. The Chinese text The Nine Chapters on the Mathematical Art (Jiu Zhang Suan Shu), from between 300 BC and AD 200, is the first example of the use of matrix methods to solve simultaneous equations,[85] including the concept of determinants, almost 2000 years before its publication by the Japanese mathematician Seki in 1683 and the German mathematician Leibniz in 1693. Cramer presented Cramer's rule in 1750.

Early matrix theory emphasized determinants more strongly than matrices and an independent matrix concept akin to the modern notion emerged only in 1858, with Cayley's Memoir on the theory of matrices.[86][87] The term "matrix" was coined by Sylvester, who understood a matrix as an object giving rise to a number of determinants today called minors, that is to say, determinants of smaller matrices which derive from the original one by removing columns and rows. Etymologically, matrix derives from Latin mater (mother).[88]

The study of determinants sprang from several sources.[89] Number-theoretical problems led Gauss to relate coefficients of quadratic forms, i.e., expressions such as x² + xy − 2y², and linear maps in three dimensions to matrices. Eisenstein further developed these notions, including the remark that, in modern parlance, matrix products are non-commutative. Cauchy was the first to prove general statements about determinants, using as definition of the determinant of a matrix A = [ai,j] the following: replace the powers aj^k by ajk in the polynomial

a1 a2 ... an Π (aj − ai),  the product taken over i < j,

where Π denotes the product of the indicated terms. He also showed, in 1829, that the eigenvalues of symmetric matrices are real.[90] Jacobi studied "functional determinants"—later called Jacobi determinants by Sylvester—which can be used to describe geometric transformations at a local (or infinitesimal) level,


see above; Kronecker's Vorlesungen über die Theorie der Determinanten[91] and Weierstrass' Zur Determinantentheorie,[92] both published in 1903, first treated determinants axiomatically, as opposed to previous more concrete approaches such as the mentioned formula of Cauchy. At that point, determinants were firmly established.

Many theorems were first established for small matrices only; for example, the Cayley-Hamilton theorem was proved for 2×2 matrices by Cayley in the aforementioned memoir, and by Hamilton for 4×4 matrices. Frobenius, working on bilinear forms, generalized the theorem to all dimensions (1898). Also at the end of the 19th century the Gauss-Jordan elimination (generalizing a special case now known as Gauss elimination) was established by Jordan. In the early 20th century, matrices attained a central role in linear algebra,[93] partially due to their use in the classification of the hypercomplex number systems of the previous century.

The inception of matrix mechanics by Heisenberg, Born and Jordan led to studying matrices with infinitely many rows and columns.[94] Later, von Neumann carried out the mathematical formulation of quantum mechanics, by further developing functional analytic notions such as linear operators on Hilbert spaces, which, very roughly speaking, correspond to Euclidean space, but with an infinity of independent directions.

[edit]Other historical usages of the word "matrix" in mathematics

The word has been used in unusual ways by at least two authors of historical importance.

Bertrand Russell and Alfred North Whitehead in their Principia Mathematica (1910–1913) use the word matrix in the context of their Axiom of reducibility. They proposed this axiom as a means to reduce any function to one of lower type, successively, so that at the "bottom" (0 order) the function will be identical to its extension:

"Let us give the name of matrix to any function, of however many variables, which does not involve any apparent variables. Then any possible function other than a matrix is derived from a matrix by means of generalization, i.e. by considering the proposition which asserts that the function in question is true with all possible values or with some value of one of the arguments, the other argument or arguments remaining undetermined".[95]

For example, a function Φ(x, y) of two variables x and y can be reduced to a collection of functions of a single variable, e.g. y, by "considering" the function for all possible values of "individuals" ai substituted in place of variable x. And then the resulting collection of functions of the single variable y, i.e. ∀ai: Φ(ai, y), can be reduced to a "matrix" of values by "considering" the function for all possible values of "individuals" bj substituted in place of variable y: ∀bj∀ai: Φ(ai, bj).

Alfred Tarski in his 1946 Introduction to Logic used the word "matrix" synonymously with the notion of truth table as used in mathematical logic.[96]


Median

The median is the middle value in a set of numbers. In the set [1,2,3] the median is 2. If the set has an even number of values, the median is the average of the two in the middle. For example, the median of [1,2,3,4] is 2.5, because that is the average of 2 and 3.

The median is often used when analyzing statistical studies, for example the income of a nation. While the arithmetic mean (average) is a simple calculation that most people understand, it is skewed upwards by just a few high values. The average income of a nation might be $20,000, but most people are much poorer: many people with $10,000 incomes are balanced out by just a single person with a $5,000,000 income. Therefore the median is often quoted instead, because it shows a value that half the country makes more than and half makes less than.
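A quick sketch of that effect, with illustrative figures echoing the ones above:

from statistics import mean, median

incomes = [10_000] * 9 + [5_000_000]   # nine modest earners and one outlier
print(mean(incomes))                   # 509000 -- dragged up by the outlier
print(median(incomes))                 # 10000  -- what the typical earner makes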

Exponential Functions

Take a look at x^3. What does it mean?

We have two parts here:
1) Exponent, which is 3.
2) Base, which is x.

x^3 = x times x times x

It's read two ways:
1) x cubed
2) x to the third power

With exponential functions such as 3^x, the roles are reversed: 3 is the base and x is the exponent.

Here's what exponential functions look like:


y = 3^x, f(x) = 1.124^x, etc. In other words, the exponent will be a variable.

The general exponential function looks like this: b^x, where the base b is ANY constant. So, the standard form for ANY exponential function is f(x) = b^x, where b is a real number greater than 0.

Sample: Evaluate f(x) at a chosen x

f(x) = 1.276^x

Here x can be ANY number we select.

Say, x = 1.2.

f(1.2) = 1.276^1.2

NOTE: You must follow your calculator's instructions in terms of exponents. Every calculator is different and thus has different steps.

I will use my TI-36 SOLAR Calculator to find an approximation.

f(1.2) = 1.33974088

Rounding off to two decimal places I get:

f(1.2) = 1.34

We can actually graph our point (1.2, 1.34) on the xy-plane but more on that in future exponential function lessons.

We can use the formula B(t) = 100(1.12^t) to solve bacteria applications. We can use the above formula to find HOW MANY bacteria are present in a given region after a certain amount of time. Of course, in the formula, lower case t = time. The number 100 indicates how many bacteria there were at the start of the LAB experiment. The base 1.12 indicates how fast the bacteria grow: a 12% increase per unit of time.

Sample: How many bacteria are in LAB 3 after 2.9 hours of work?

Okay, t = 2.9 hours.

Replace t with 2.9 hours in the formula above and simplify.

B(2.9) = 100(1.12^2.9)
B(2.9) = 100(1.389096016)
B(2.9) ≈ 138.91

NOTE: An exponent can be ANY real number, positive or negative. For ANY exponential function, the domain will be ALL real numbers.
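A small sketch of the bacteria formula from above, so you can check the worked sample without a TI-36:

def B(t):
    # 100 starting bacteria, growing by a factor of 1.12 per hour
    return 100 * 1.12 ** t

print(B(2.9))   # about 138.91, matching the calculation above
print(B(1.2))   # the exponent can be any real number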


Trig Addition Formulas

The trig addition formulas can be useful to simplify a complicated expression, or perhaps find an exact value when you only have a small table of trig values. For example, if you want the sine of 15 degrees, you can use a subtraction formula to compute sin(15) as sin(45-30).
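For reference, the standard addition and subtraction formulas are:

sin(a ± b) = sin(a)cos(b) ± cos(a)sin(b)
cos(a ± b) = cos(a)cos(b) ∓ sin(a)sin(b)

Working the sin(15) example through the subtraction formula:

sin(15) = sin(45 − 30) = sin(45)cos(30) − cos(45)sin(30)
        = (sqrt(2)/2)(sqrt(3)/2) − (sqrt(2)/2)(1/2)
        = (sqrt(6) − sqrt(2))/4 ≈ 0.2588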

Trigonometry Derivatives

While you may know how to take the derivative of a polynomial, what happens when you need to take the derivative of a trig function? What IS the derivative of a sine?

Luckily, the derivatives of trig functions are simple -- they're other trig functions! For example, the derivative of sine is just cosine:

The rest of the trig functions are also straightforward once you learn them, but they aren't QUITE as easy as the first two.


Derivatives of Trigonometry Functions

sin'(x) = cos(x)

cos'(x) = -sin(x)

tan'(x) = sec^2(x)

sec'(x) = sec(x)tan(x)

cot'(x) = -csc^2(x)

csc'(x) = -csc(x)cot(x)

 

Take a look at this graphic for an illustration of what this means. At the first point (around x = 2π), the cosine isn't changing. You can see that the sine is 0, and since negative sine is the rate of change of cosine, cosine is changing at a rate of −0, i.e. not at all.

At the second point I've illustrated (x = 3π), you can see that the sine is decreasing rapidly. This makes sense because the cosine is negative. Since cosine is the rate of change of sine, a negative cosine means the sine is decreasing.
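You can check the first entry of the table above numerically with a central difference (a small sketch of my own; math works in radians):

import math

x, h = 1.0, 1e-6
approx = (math.sin(x + h) - math.sin(x - h)) / (2 * h)
print(approx, math.cos(x))   # the two values agree to about 10 digits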


Double and Half Angle Formulas

The double and half angle formulas can be used to find the values of unknown trig functions. For example, you might not know the sine of 15 degrees, but by using the half angle formula for sine, you can figure it out based on the common value of sin(30) = 1/2.

They are also useful for certain integration problems where a double or half angle formula may make things much simpler to solve.

Double Angle Formulas:
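For reference, the standard double angle formulas are:

sin(2a) = 2 sin(a) cos(a)
cos(2a) = cos^2(a) − sin^2(a) = 2cos^2(a) − 1 = 1 − 2sin^2(a)
tan(2a) = 2 tan(a) / (1 − tan^2(a))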

You'll notice that there are several listings for the double angle for cosine. That's because you can substitute for either of the squared terms using the basic trig identity sin^2+cos^2=1.

Half Angle Formulas:
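For reference, the standard half angle formulas are:

sin(a/2) = ±sqrt((1 − cos(a)) / 2)
cos(a/2) = ±sqrt((1 + cos(a)) / 2)
tan(a/2) = (1 − cos(a)) / sin(a) = sin(a) / (1 + cos(a))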


These are a little trickier because of the plus or minus. It's not that you can use BOTH; you have to figure out the correct sign on your own. For example, the sine of 30 degrees is positive, as is the sine of 15. However, if you were to start from 200 degrees, you'd find that the sine of 200 degrees is negative, while the sine of 100 degrees is positive. Just remember to look at a graph and figure out the signs and you'll be fine.

The magic identity

Trigonometry is the art of doing algebra over the circle. So it is a mixture of algebra and geometry. The sine and cosine functions are just the coordinates of a point on the unit circle. This implies the most fundamental formula in trigonometry (which we will call here the magic identity):

cos^2(x) + sin^2(x) = 1,

where x is any real number (of course, x measures an angle).

Example. Show that

Answer. By definitions of the trigonometric functions we have


Hence we have

Using the magic identity we get

This completes our proof.

Remark. The above formula is fundamental in many ways. For example, it is very useful in techniques of integration.

Example. Simplify the expression

Answer. We have by definition of the trigonometric functions

Hence

Using the magic identity we get

Putting stuff together we get

This gives


Using the magic identity we get

Therefore we have

Example. Check that

Answer.

Example. Simplify the expression

Answer.

The following identities are very basic to the analysis of trigonometric expressions and functions. They are called the Fundamental Identities.

Reciprocal identities

Pythagorean Identities

Quotient Identities
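For reference, the standard forms of these identities are:

Reciprocal:   csc(x) = 1/sin(x),  sec(x) = 1/cos(x),  cot(x) = 1/tan(x)
Pythagorean:  sin^2(x) + cos^2(x) = 1,  1 + tan^2(x) = sec^2(x),  1 + cot^2(x) = csc^2(x)
Quotient:     tan(x) = sin(x)/cos(x),  cot(x) = cos(x)/sin(x)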


Understanding sine

A teaching guideline/lesson plan when first teaching sine (grades 7-9)

The sine is simply a RATIO of certain sides of a right triangle.  Look at the triangles below.  They all have the same shape.  That means they have the SAME ANGLES but the lengths of the sides may be different.  In other words, they are SIMILAR figures.

Have your child/students measure the sides s1, h1, s2, h2, s3, h3 as accurately as possible (or draw several similar right triangles on their own).

Then let her calculate the following ratios:

s1/h1,  s2/h2,  s3/h3.  What can you note?

Those ratios should all be the same (or close to same due to measuring errors).  That is so because the triangles have the same shape (or are similar), which means their respective parts are PROPORTIONAL.  That is why the ratio of those parts remains the same.  Now ask your child what would happen if we had a fourth triangle with the same shape.  The answer of course is that even in that fourth triangle the ratio  s4/h4  would be the same.

The ratio you calculated remains the same for all the triangles. Why? Because the triangles were similar so their sides were proportional. SO, in all right triangles where the angles are the same, this one ratio is the same too. We associate this ratio with the angle α.  THAT RATIO IS CALLED THE SINE OF THE ANGLE α.

What follows is that if you know the ratio, you can find what the angle α is. Or in other words, if you know the sine of α, you can find α. Or, if you know what α is, you can find this ratio - and when you know this ratio and one side of a right triangle, you can find the other sides.

s1/h1 = s2/h2 = s3/h3 = sin α = 0.57358

In our pictures the angle α is 35 degrees.  So sin 35° = 0.57358 (rounded to five decimals).  We can use this fact when dealing with OTHER right triangles that have a 35° angle.  See, other such triangles are, again, similar to these ones we see here, so the ratio of the opposite side to the hypotenuse, WHICH IS THE SINE OF THE 35° ANGLE, is the same!  So in another such triangle, if you only know the hypotenuse, you can calculate the opposite side since you know the ratio, or vice versa.

Problem

Suppose we have a triangle that has the same shape as the triangles above.  The side opposite to the 35° angle is 5 cm.  How long is the hypotenuse?

SOLUTION:  Let h be that hypotenuse.  Then

5 cm / h = sin 35° ≈ 0.57358

From this equation one can easily solve that  h = 5 cm / 0.57358 ≈ 8.72 cm.


An example

The two triangles are pictured both overlapping and separate.  We can find h3 simply by the fact that these two triangles are similar.  Since the triangles are similar,

3.9 / h3 = 2.6 / 6,  from which  h3 = (6 × 3.9) / 2.6 = 9.

We didn't even need the sine to solve that, but note how closely it ties in with similar triangles.

The triangles have the same angle α.  Sin α of course would be the ratio 2.6/6 or 3.9/9 ≈ 0.4333.

Now we can find the actual angle α from the calculator: since sin α = 0.4333, then α = sin⁻¹(0.4333) ≈ 25.7 degrees.
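The same calculator steps, written as a short sketch (math.asin works in radians, so convert to degrees):

import math

ratio = 2.6 / 6
alpha = math.degrees(math.asin(ratio))
print(alpha)                            # about 25.7 degrees

# The earlier problem: hypotenuse from sin 35 and the 5 cm opposite side.
print(5 / math.sin(math.radians(35)))   # about 8.72 cm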

Test your understanding

1.  Draw a right triangle that has a 40° angle.  Then measure the opposite side and the hypotenuse, and use those measurements to calculate sin 40°.  Check your answer by plugging sin 40 into a calculator (remember the calculator has to be in degrees mode instead of radians mode).

2.  Draw two right triangles that have a 70° angle - but that are of different sizes.  Use the first triangle to find sin 70° (like you did in problem 1).  Then measure the hypotenuse of your second triangle.  Use sin 70° and the measurement of the hypotenuse to find the opposite side in your second triangle.  Check by measuring the opposite side from your triangle.

3.  Draw a right triangle that has a 48° angle.  Measure the hypotenuse.  Then use sin 48° (from a calculator) and your measurement to calculate the length of the opposite side.  Check by measuring the opposite side from your triangle.

 

Someone asked me once, "When I type in sine in my graphing calculator, why does it give me a wave?"  Read my answer where we get the familiar sine wave.

My question is that if someone gives us only the lengths of the sides of a triangle, how can we draw the triangle? - sajjad ahmed shah

This is an easy construction. See Constructing a Triangle and Constructing a triangle when the lengths of 3 sides are known.

If I am in a plane flying at 30000 ft, how many linear miles of ground can I see? Please explain how that answer is generated. Does it have anything to do with right triangles and the Pythagorean theorem?

jim taucher

The image below is NOT to scale - it is just to help in the problem. The angle α is much smaller in reality. Yes, you have a right triangle. r is the Earth's radius. Now, Earth's radius is not constant but varies because Earth is not a perfect sphere. For this problem, I was using the mean radius, 3959.871 miles. This also means our answer will be just approximate. I also converted 30000 feet to 5.6818182 miles. First we calculate α using cosine: the line of sight is tangent to the Earth, so cos α = r / (r + h), where h is the plane's altitude. You should get α ≈ 3.067476 degrees. Then, we use a proportion comparing α to 360 degrees and x to Earth's circumference. You will get x ≈ 212 miles. Even that result might be too 'exact'.
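The same computation as a short sketch (mean radius and altitude conversion as in the answer above):

import math

r = 3959.871                                   # mean Earth radius in miles
h = 30000 / 5280                               # 30000 ft in miles
alpha = math.degrees(math.acos(r / (r + h)))   # angle at Earth's center
x = alpha / 360 * 2 * math.pi * r              # arc length along the ground
print(alpha, x)                                # about 3.07 degrees, about 212 miles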


 

Introduction: The unit circle is used to understand the sines and cosines of angles found in right triangles. Its radius is exactly one. The center of the circle is said to be the origin, and its perimeter comprises the set of all points that are exactly one unit from the center of the circle while placed in the plane. It's just a circle with radius one.

Unit Circle - standard equation: The distance from the origin to a point (x, y) is sqrt(x^2 + y^2), by the Pythagorean Theorem. Here, the radius is one, so the expression becomes sqrt(x^2 + y^2) = 1. Squaring both sides, the equation becomes

x^2 + y^2 = 1

Positive angles are measured counterclockwise from the positive x-axis, and negative angles are measured clockwise.

Unit Circle - sine and cosine


Sine, cosine: You can directly measure sin θ and cos θ, because the radius of the unit circle is one. Using this: when the angle is 0 degrees, cosine = 1, sine = 0 and tangent = 0; when the angle is 90 degrees, cosine = 0, sine = 1 and tangent is undefined.

Calculating 60°, 30°, 45°. Note: the radius of the unit circle is 1.

Take 60°. Consider an equilateral triangle: all sides are equal and all angles are the same. Dropping a perpendicular splits it into right triangles whose x side is 1/2, and the y side follows from

(1/2)^2 + y^2 = 1

Therefore, y = sqrt(3/4) = sqrt(3)/2.

Therefore, cos 60° = 1/2 = 0.5 and sin 60° = sqrt(3)/2 ≈ 0.8660.

Take 30°. Here 30 is just 60 swapped over, so we get cos 30° = sqrt(3/4) ≈ 0.8660 and sin 30° = 1/2 = 0.5.


Take 45°.

Here x = y, so x^2 + x^2 = 1, i.e. 2x^2 = 1 and x = 1/sqrt(2).

Therefore, cos 45° = 1/sqrt(2) ≈ 0.70 and sin 45° = 1/sqrt(2) ≈ 0.70.

The standard definition of sine is usually introduced as

sin A = opposite / hypotenuse,

which is correct, but makes the eyes glaze over and doesn't even hint that the ancient Egyptians used trigonometry to survey their land. The ancient Egyptians noticed that if two different triangles both have the same angles (similar triangles), then no matter what their size, the relationships between the sides are always the same.

The ratio of side a to side b is the same as the ratio of side A to side B. This is true for all combinations of sides – their ratios are always the same. Expressed as fractions we would write: a/b = A/B.

With numbers it might look like this: if side a is 1 unit long and side b is 2 units long, and the larger triangle has sides that are twice as long, then side A is 2 units and side B is 4 units long. Writing the ratios as fractions we see: 1/2 = 2/4.


To determine if two triangles are similar (have the same angles), we have to measure two angles. The angles inside a triangle always add up to 180 degrees, so if you know two of the angles, you can figure out the third.

Later, an insightful Greek realized that if the triangle is a right angle triangle, then we already know one angle - it is 90 degrees - so we don't need to measure it; therefore we only have to measure one of the other angles to determine if two right angle triangles are similar. This insight turned out to be incredibly useful for measuring things.

We can make a bunch of small right angle triangles, measure their sides, and calculate the ratios of the sides (or we could just write them in a book: "A right angle triangle with a measured angle of 1 degree has the following ratios ...").

Knowing all these ratios for different angles, we can then measure things. For example, if you are a landowner and want to measure how long your fields are, you could do the following:

- send a servant with a 3 meter tall staff to the end of the field
- measure the angle to the top of the staff
- consult your table of similar right angle triangles and determine the ratio of the sides (in this case, we would be looking for staff / distance to measure)
- calculate the length of the field using the ratio and the known height of the staff

If the measured angle was 5 degrees, then we know (from a similar triangle) that the ratio of staff to distance-to-measure is 0.0874, or:

staff / distance = 0.0874

Rearranging the equation we get:

distance = staff / 0.0874 = 3 / 0.0874 ≈ 34.3 meters

We call this ratio the tangent of the angle.
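A quick sketch of the landowner's method (3 m staff and a 5 degree angle, as in the example; math.tan expects radians):

import math

staff = 3.0                                        # meters
angle = 5.0                                        # measured angle in degrees
distance = staff / math.tan(math.radians(angle))
print(distance)                                    # about 34.3 meters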


The sine, cosine, tangent, secant, cosecant and cotangent refer to the ratio of a specific pair of sides of the triangle at the given angle A. The problem is that the names aren’t very informative or intuitive. Sine comes from the Latin word sinus which means fold or bend. Looking at our original definition, it now makes a little more sense:

  The sine of angle A is equal to the ratio of the sides at the bend in the triangle as seen from A. Or opposite divided by hypotenuse.

The ratios of the pairs of sides in a right angle triangle are given the following names:
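For reference, the standard names and ratios (for an angle A in a right triangle) are:

sine:      sin(A) = opposite / hypotenuse
cosine:    cos(A) = adjacent / hypotenuse
tangent:   tan(A) = opposite / adjacent
cosecant:  csc(A) = hypotenuse / opposite
secant:    sec(A) = hypotenuse / adjacent
cotangent: cot(A) = adjacent / opposite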

There is no simple way to remember which ratios go with which trigonometric function, although it is easier if you know some of the history behind it. sin, cos, tan, sec, csc, and cot are a shorthand way of referring to the ratio of a specific pair of sides in a right angle triangle.


An Introduction to Trigonometry ... by Brandon Williams

Introduction

Well it is nearly one in the morning and I have tons of work to do and a fabulous idea pops into my head: how about writing an introductory tutorial to trigonometry! I am going to fall so far behind. And once again I did not have the chance to proofread this or check my work, so if you find any mistakes e-mail me.

I'm going to try my best to write this as if the reader has no previous knowledge of math (outside of some basic Algebra at least) and I'll do my best to keep it consistent. There may be flaws or gaps in my logic at which point you can e-mail me and I will do my best to go back over something more specific. So let's begin with a standard definition of trigonometry:

trig - o - nom - e - try n. - a branch of mathematics which deals with relations between sides and angles of triangles 

Basics

Well that may not sound very interesting at the moment, but trigonometry is one of the most interesting forms of math I have come across… and just to let you know, I do not have an extensive background in math. Well, since trigonometry has a lot to do with angles and triangles, let's familiarize ourselves with some fundamentals. First a right triangle:

 

A right triangle is a triangle that has one 90-degree angle. The 90-degree angle is denoted with a little square drawn in the corner. The two sides that are adjacent to the 90-degree angle, 'a' and 'b', are called the legs. The longer side opposite the 90-degree angle, 'c', is called the hypotenuse. The hypotenuse is always longer than the legs. While we are on the subject, let's brush up on the Pythagorean Theorem. The Pythagorean Theorem states that the sum of the two legs squared is equal to the hypotenuse squared. An equation you can use is:

c^2 = a^2 + b^2

So let's say we knew that 'a' equaled 3 and 'b' equaled 4; how would we find the length of 'c'… assuming this is in fact a right triangle? Plug the values that you know into your formula:

c^2 = 3^2 + 4^2


Three squared plus four squared is twenty-five so we now have this:

c^2 = 25 - - - > Take the square root of both sides and you now know that c = 5
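If you'd rather let a computer do the arithmetic, here is the same 3-4-5 check as a one-line sketch:

import math
print(math.hypot(3, 4))   # sqrt(3^2 + 4^2) = 5.0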

So now we are past some of the relatively boring parts. Let's talk about certain types of right triangles. There is the 45-45-90 triangle and the 30-60-90 triangle. We might as well learn these because we'll need them later when we get to the unit circle. Look at this picture and observe a few of the things going on for a 45-45-90 triangle:

In a 45-45-90 triangle you have a 90-degree angle and two 45-degree angles (duh), but also the two legs are equal. Also, if you know the value of 'c' then the legs are simply 'c' multiplied by the square root of two, divided by two. I'd rather not explain that because I would have to draw more pictures… hopefully you will be able to prove it through your own understanding. The 30-60-90 triangle is a little bit harder to get, but I am not going into detail with it… here is a picture:

 

You now have one 30-degree angle, a 60-degree angle, and a 90-degree angle. This time the relationship between the sides is a little different. The shorter side is half of the hypotenuse. The longer side is the hypotenuse times the square root of 3, all divided by two. That's all I'm really going to say on this subject, but make sure you get this before you go on, because it is crucial in understanding the unit circle… which in turn is crucial for understanding trigonometry.

Trigonometric Functions

The entire subject of trigonometry is mostly based on these functions we are about to learn. The three basic ones are sine, cosine, and tangent. First, to clear up any confusion that some might have: these functions mean nothing without a number with them, i.e. sin(20) is something… sin is nothing. Make sure you know that. Now for some quick definitions (these are my own definitions… if you do not get what I am saying, look them up on some other website):

Sine - the ratio of the side opposite of an angle in a right triangle over the hypotenuse. 

Cosine - the ratio of the side adjacent of an angle in a right triangle over the hypotenuse. 

Tangent - the ratio of the side opposite of an angle in a right triangle over the adjacent side.

Now before I go on I should also say that those functions only find ratios and nothing more. It may seem kind of useless now, but they are very powerful functions. Also, I am only going to explain the things that I think are useful in Flash… I could go off on some tangent (no pun intended) on other areas of trigonometry, but I'll try to keep it just to the useful stuff. OK, let's look at a few pictures:

 

Angles are usually denoted with capital letters, so that is what I used. Now let's find all of the trigonometry ratios for angle A:

sin A = 4/5 cos A = 3/5 tan A = 4/3 

Now it would be hard for me to explain more than what I have done, for this at least, so you are just going to have to look at the numbers and see where I got them from. Here are the ratios for angle B: 

sin B = 3/5 cos B = 4/5 tan B = 3/4 

Once again, just look at the numbers and reread the definitions to see where I came up with that stuff. But now that I have shown you one way of thinking of the ratios, like opposite over hypotenuse, there is one more way, which should be easier and will also be discussed more later on. Here is a picture… notice how I am only dealing with one angle:


 

The little symbol in the corner of the triangle is a Greek letter called "theta"… it's usually used to represent an unknown angle. Now with that picture we can think of sine, cosine and tangent in a different way:

sin (theta) = y/r  cos (theta) = x/r  tan (theta) = y/x -- and x <> 0

We will be using that form most of the time. Now although I may have skipped some kind of fundamentally important step (I'm hoping I did not), I can only think of one place to go from here: the unit circle. Becoming familiar with the unit circle will probably take the most work, but make sure you do, because it is very important. First let me tell you about radians, just in case you do not know. Radians are just another way of measuring angles, very similar to degrees. You know that there are 90 degrees in one-quarter of a circle, 180 degrees in one-half of a circle, and 360 degrees in a whole circle, right? Well, if you are dealing with radians, there are 2π radians in a whole circle instead of 360 degrees. The reason that there are 2π radians in a full circle really is not all that important and would only clutter this "tutorial" more… just know that it is and it will stay that way. Now if there are 2π radians in a whole circle, there are also π radians in a half, and π/2 radians in a quarter. Now it's time to think about splitting the circle into more subdivisions than just a half or quarter. Here is a picture to help you out:


 

If at all possible memorize those values. You can always have a picture to look at like this one but it will do you well when you get into the more advanced things later on if you have it memorized. However that is not the only thing you need to memorize. Now you need to know (from memory if you have the will power) the sine and cosine values for every angle measure on that chart. 

OK I think I cut myself short on explaining what the unit circle is when I moved on to explaining radians. For now the only thing we need to know is that it is a circle with a radius of one centered at (0,0). Now the really cool thing about the unit circle is what we are about to discuss. I'm going to just pick some random angle up there on the graph…let's say…45 degrees. Do you see that line going from the center of the circle (on the chart above) to the edge of the circle? That point at which the line intersects the edge of the circle is very important. The "x" coordinate of that point on the edge is the cosine of the angle and the "y" coordinate is the sine of the angle. Very interesting huh? So lets find the sine and cosine of 45 degrees ourselves without any calculator or lookup tables. 

Well if you remember anything that I said at the beginning of this tutorial then you now know why I even mentioned it. In a right triangle if there is an angle with a measure of 45 degrees the third angle is also 45 degrees. And not only that but the two legs of the triangle have the same length. So if we think of that line coming from the center of the circle at a 45-degree angle as a right triangle we can find the x- and y-position of where the line intersects…look at this picture:


If we apply some of the rules we learned about 45-45-90 triangles earlier we can accurately say that:

sin 45 = sqrt(2) / 2

cos 45 = sqrt(2) / 2

Another way to think of sine is that it's the distance from the x-axis to the point on the edge of the circle… you can only think of it that way if you are dealing with a unit circle. You could also think of cosine the same way, except it's the distance from the y-axis to the point on the border of the circle. If you still do not know where I came up with those numbers, look at the beginning of this tutorial for an explanation of 45-45-90 triangles… and while you are there, refresh yourself on 30-60-90 triangles because we need to know those next.

Now let's pick an angle from the unit circle chart like 30 degrees. I'm not going to draw another picture, but you should know how to form a right triangle with a line coming from the center of the circle to one of its edges. Now remember the rules that governed the lengths of the sides of a 30-60-90 triangle… if you do, then you can once again accurately say that:

sin 30 = 1/2

cos 30 = sqrt(3) / 2

I was just about to type out another explanation of why I did this, but it's basically the same as what I did for sine just above. Also, now that I am rereading this, I am seeing some things that may cause confusion, so I thought I would try to clear up a few things. If you look at this picture (it's the same as the one I used at the beginning of all this) I will explain with a little bit more detail how I arrived at those values for sine and cosine of 45 degrees:

 

Our definition of sine states that the sine of an angle would be the opposite side of the triangle divided by the hypotenuse. Well, we know our hypotenuse is one since this is a unit circle, so we can substitute a one in for "c" and get this:

sin 45 = (1 × sqrt(2)/2) / 1

Which even the most basic understanding of Algebra will tell us is the same as:

sin 45 = sqrt(2) / 2

Now if you do not get that, look at it really hard until it comes to you… I'm sure it will hit you sooner or later. And instead of my wasting more time making a complete unit circle with everything on it, I found this great link to one: http://www.infomagic.net/~bright/research/untcrcl.gif . Depending on just how far you want to go into this field of math, as well as others like Calculus, you may want to try and memorize that entire thing. Whatever it takes, just try your best. I always hear people talking about different patterns that they see which help them to memorize the unit circle, and that is fine, but I think it makes it much easier to remember if you know how to come up with those numbers… that's what this whole first part of this tutorial was mostly about.

Also while on the subject I might as well tell you about the reciprocal trigonometric functions. They are as follows:


csc (theta) = r/y  sec (theta) = r/x  cot (theta) = x/y

Those are pronounced cosecant, secant, and cotangent. Just think of them as the same as their matching trigonometric functions except flipped… like this:

sin (theta) = y/r - - - > csc (theta) = r/y cos (theta) = x/r - - - > sec (theta) = r/x tan (theta) = y/x - - - > cot (theta) = x/y 

That makes it a little bit easier to understand doesn't it? 

Well believe it or not that is it for an introduction to trigonometry. From here we can start to go into much more complicated areas. There are many other fundamentals that I would have liked to go over, but this has gotten long and boring enough as it is. I guess I am hoping that you will explore some of these concepts and ideas on your own… you will gain much more knowledge that way as opposed to my sloppy words.

Before I go…

Before I go I want to just give you a taste of what is to come… this may actually turn out to be just as long as the above, so go ahead and make yourself comfortable. First I want to introduce to you trigonometric identities, which are trigonometric equations that are true for all values of the variables for which the expressions in the equation are defined. Now that's probably a little hard to understand and monotonous, but I'll explain. Here is a list of what are known as the "fundamental identities":

Reciprocal Identities

csc (theta) = 1 / sin (theta) ,  sin (theta) <> 0

sec (theta) = 1 / cos (theta) ,  cos (theta) <> 0

cot (theta) = 1 / tan (theta) ,  tan (theta) <> 0

Ratio Identities


tan (theta) = sin (theta) / cos (theta) ,  cos (theta) <> 0

cot (theta) = cos (theta) / sin (theta) ,  sin (theta) <> 0

Pythagorean Identities

sin^2(theta) + cos^2(theta) = 1

1 + cot^2(theta) = csc^2(theta)

1 + tan^2(theta) = sec^2(theta)

Odd-even Identities

sin (-theta) = -sin (theta)

cos (-theta) = cos (theta)

tan (-theta) = -tan (theta)

csc (-theta) = -csc (theta)

sec (-theta) = sec (theta)

cot (-theta) = -cot (theta)

Now proving them… well, that's gonna take a lot of room, but here it goes. I'm only going to prove a few out of each category of identities, so maybe you can figure out the others. Let's start with the reciprocal identities. Well, if the reciprocal of a number is simply one divided by that number, then we can look at cosecant (which is the reciprocal of sine) as:

csc (theta) = 1 / sin (theta) = 1 / (y/r)   < -- y/r is sine (theta), I hope you know

If you multiply the numerator and the denominator by "r" you get:

csc (theta) = r/y   < -- Just like we said before. We just proved an identity... I'll let you do the rest of them...

Now the ratio identities. If you think of tangent as y/x, sine as y/r, and cosine as x/r, then check this out:


tan (theta) = sin (theta) / cos (theta) = (y/r) / (x/r)   --- > Multiply top and bottom by "r" and you're left with y/x

I'm going to save the proof of the Pythagorean Identities for another time. These fundamental identities will help us prove much more complex identities later on. Knowing trigonometric identities will help us understand some of the more abstract things… at least they are abstract to me. Once I am finished with this, I am going to write another tutorial that will go into the somewhat more complex areas that I know of, and these fundamental things I have just talked about are required reading.

I was going to go over some laws that can be very useful but my study plan tells me that I may not have provided enough information for you to understand it…therefore that will be something coming in the next thing I write.

Closing thoughts 

Well this concludes all the things that you will need to know before you start to do more complicated things. I was a bit brief with some things so if you have any questions or if you want me to go back and further explain something I implore you to e-mail me and I will do my best to clear up any confusion. Also I want to reiterate that this is a very basic introduction to trigonometry. I hope you were not expecting to read this and learn all there is to know. Actually I have not really even mentioned Flash or the possibilities yet…and quite honestly there is not really anything to work with yet. However once I do start to mention Flash and the math that it will take to create some of these effects everyone sees it will almost be just like a review. When you sit down and want to write out a script it will be like merely translating everything you learned about trigonometry from a piece of paper into actionscript. 

If you want a little synopsis of what I plan on talking about in the next few things I write here you go:

- Trigonometry curves - More advanced look into trigonometry - Programmatic movement using trigonometry - Orchestrating it all into perfect harmony (pardon the cliché) 

Well that's it for me… until next time.

Definition


The "mean", or "average", or "expected value" is the weighted sum of all possible outcomes.  The roll of two dice, for instance, has a mean of 7.  Multiply 2 by 1/36, the odds of rolling a 2.  Multiply 3 by 2/36, the odds of rolling a 3.  Do this for all outcomes up to 12.  Add them up, and the result is 7.  Toss the dice 100 times and the sum of all those throws is going to be close to 700, i.e. 100 times the expected value of 7.

The mean need not be one of the possible outcomes.  Toss one die, and the mean is 3.5, even though there is no single outcome with value 3.5.  But toss the die 100 times and the sum of all those throws will be close to 350.

Given a continuous density function f(x), the expected value is the integral of x×f(x).  This is the limit of the discrete weighted sum described above.

Let's consider a pathological example.  Let f(x) = 1/x², from 1 to infinity.  This is a valid density function with integral equal to 1.  What is its expected value?  Multiply by x to get 1/x, and integrate to get log(x).  Evaluate log(x) at 1 and infinity, giving an infinite expected value.  Whatever the outcome, you can expect larger outcomes in the future.

Add a constant c to each outcome, and you add c to the expected value.  Prove this for discrete and continuous density functions.

Similarly, scale the output by a constant c, and the mean is multiplied by c.  This is proved using integration by substitution.

The sum of two independent variables adds their means.  This is intuitive, but takes a little effort to prove.  If f and g are the density functions of x and y, then the density function for both variables is f(x)g(y).  Multiply by x+y and take the integral over the xy plane.  Treat it as two integrals:

∫{ f(x)g(y)x } + ∫{ f(x)g(y)y }

The first integral becomes the mean of x times 1, and the second becomes 1 times the mean of y.  Hence the mean of the sum is the sum of the means.

Arithmetic and Geometric Mean


The arithmetic mean is the mean, as described above.  If all values are positive, the geometric mean is computed by taking logs, finding the arithmetic mean, and taking the exponential.  If there are just a few values, the same thing can be accomplished by multiplying them together and taking the nth root.  In the arithmetic mean, you add up and divide by n; in the geometric mean, you multiply up and take the nth root.  The geometric mean of 21, 24, and 147 is 42.

The geometric mean is used when the log of a measurement is a better indicator (for whatever reason) than the measurement itself.  If we wanted to find, for example, the "average" strength of a solar flare, we might use a geometric mean, because the strength can vary by orders of magnitude.  Of course, scientists usually develop logarithmic scales for these phenomena - such as the Richter scale, the decibel scale, and so on.  When logs are already implicit in the measurements we can return to the arithmetic mean.

The Arithmetic Mean Exceeds the Geometric Mean

The average of 2, 5, 8, and 9 is 6, yet the geometric mean is 5.18.  The geometric mean never comes out larger, and is strictly smaller unless all the values are equal.

Let f be a differentiable function that maps the reals, or an unbroken segment of the reals, into the reals.  Let f′ be everywhere positive, and let f′′ be everywhere negative.  Let g be the inverse of f.

Let s be a finite set of real numbers with mean m.  Apply f to s, take the average, and apply g.  The result is less than m, or equal to m if everything in s is equal to m.  When f = log(x), the relationship between the geometric mean and the arithmetic mean is a simple corollary.

Shift f(x) up or down, so that f(m) = 0.  Let v = f′(m).  If x is a value in s less than m, and if f were a straight line with slope v, f(x) would be v×(x-m).  Actually f(x) has to be smaller, else the mean value theorem implies a first derivative ≤ v, and a second derivative ≥ 0.  On the other side, when x is greater than m, similar reasoning shows f(x) is less than v×(x-m).  The entire curve lies below the line with slope v passing through the origin.

If f was a line, f(s) would have a mean of 0.  But for every x ≠ m, f(x) is smaller.  This pulls the mean below 0, and when we apply f inverse, the result lies below m.

If f′′ is everywhere positive then the opposite is true; the mean of the image of s in f pulls back to a value greater than m.

All this can be extended to the average of a continuous function h(x) from a to b.  Choose Riemann nets with regular spacing, and apply the theorem to the resulting Riemann sums.  As the spacing approaches 0, the average remains ≤ m, and in the limit, the average of f(h), pulled back through g, is no larger than the average of h.


If h is nonconstant, the average through f comes out strictly smaller than the average of h.  You'll need uniform continuity, which is assured by continuity across the closed interval [a,b].  The scaled Riemann sums approach the average of f(h), and after a while, the mean, according to each Riemann sum, can be bounded below f(m).  I'll leave the details to you.

Variance and Standard Deviation

If the mean of a random variable is m, the variance is the sum or integral of f(x)(x-m)².  To illustrate, let m = 0.  The variance is now the weighted sum of the outcomes squared.  In other words, how far does the random variable stray from its mean?  If the variance is 0, the outcome is always zero.  Any nonzero outcome produces a positive variance.

Consider the example of throwing two dice.  The average throw produces 7, so subtract 7 from everything.  Ten times out of 36 you get a 6 or an 8, giving 10/36 × 1², or 10/36.  Eight times out of 36 you get a 5 or a 9, so add in 8/36 × 2², or 32/36.  Continue through all possible rolls.  When you're done, the variance is 35/6.

Recall that (x-m)² = x² - 2mx + m².  This lets us compute both mean and variance in one pass, which is helpful if the data set is large.  Add up f(x)×x, and f(x)×x².  The former becomes m, the mean.  The latter is almost the variance, but we must add m² times the sum of f(x) (which is m²), and subtract 2m times the sum of x×f(x) (which becomes 2m²).  Hence the variance is the sum of f(x)x², minus the square of the mean.

The above is also true for continuous variables.  The variance is the integral of f(x)x², minus the square of the mean.  The proof is really the same as above.
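A small sketch of the one-pass trick for a single fair die (my own example): accumulate the weighted sums of x and x², then subtract the squared mean.

from fractions import Fraction

p = Fraction(1, 6)                            # each face is equally likely
m = sum(p * x for x in range(1, 7))           # mean: 7/2
ex2 = sum(p * x * x for x in range(1, 7))     # E[x^2]: 91/6
print(m, ex2 - m * m)                         # mean 7/2, variance 35/12

Doubling 35/12 for two independent dice gives the 35/6 quoted above (see the variance of a sum, below).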

Variance is a bit troublesome however, because the units are wrong.  Let a random variable indicate the height of a human being on earth.  Height is measured in meters, and the mean, the average height of all people, is also measured in meters.  Yet the variance, the variation of height about the mean, seems to be measured in meters squared.  To compensate for this, the standard deviation is the square root of variance.  Now we're back to meters again.  If the average height is 1.7 meters, and the standard deviation is 0.3 meters, we can be pretty sure that a person, chosen at random, will be between 1.4 and 2.0 meters tall.  How sure?  We'll quantify that later on.  For now, the standard deviation gives a rough measure of the spread of a random variable about its mean.

The Variance of the Sum

We showed that the mean of the sum of two random variables is the sum of the individual means.  What about variance?


Assume, without loss of generality, that mean(x) = mean(y) = 0.  If x and y have density functions f and g, the individual variances are the integrals of f(x)x² and g(y)y², respectively.  Taken together, the combined density function is f×g, and we want to know the variance of x+y.  Consider the following double integral.

∫∫ f(x)g(y)(x+y)² =

∫∫{ f(x)g(y)x² } + ∫∫{ 2f(x)g(y)xy } + ∫∫{ f(x)g(y)y² }

The first integral is the variance of x, and the third is the variance of y.  The middle integral is twice the mean of x times the mean of y, which is zero.  Therefore the variance of the sum is the sum of the variances.

Reverse Engineering

If a random variable has a mean of 0 and a variance of 1, what can we say about it?  Not a whole lot.  The outcome could be 0 most of the time, and on rare occasions, a million.  That gives a variance of 1.  But for all practical purposes the "random" variable is always 0.  Alternatively, x could be ±1, like flipping a coin.  This has mean 0 and variance 1, yet the outcome is never 0.  Other functions produce values of 1/3, 0.737, sqrt(½), and so on.  There's really no way to know.

We can however say something about the odds of finding x ≥ c, for c ≥ 1.  Let |x| exceed c with probability p.  The area under the curve beyond c is p.  This portion of the curve contributes at least pc² to the variance.  Since this cannot exceed 1, the probability of finding x beyond c is bounded by 1/c².

Generalize the above proof to a random variable with mean m and standard deviation s.  If c is at least s, x is at least c away from m with probability at most s²/c².
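
This is Chebyshev's bound.  A tiny numeric check, using the coin-flip variable from above (mean 0, standard deviation 1, so |x| is always exactly 1):

# Chebyshev: P(|x - m| >= c) <= (s/c)^2 whenever c >= s.
s = 1.0
for c in [1.0, 1.25, 2.0]:
    p = 1.0 if c <= 1.0 else 0.0   # |x| equals 1 on every flip
    print(c, p, (s / c) ** 2)      # p never exceeds the bound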

The Mean is Your Best Guess

Let a random variable x have a density function f and a mean m.  You would like to predict the value of x, in a manner that minimizes error.  If your prediction is t, the error is defined as (x-t)2, i.e. the square of the difference between your prediction and the actual outcome.  What should you guess to minimize error?

The expected error is the integral of f(x)(x−t)², from −infinity to +infinity.  Write this as three separate integrals:

error = ∫{ f(x)x² } − ∫{ 2f(x)xt } + ∫{ f(x)t² }

The first integral becomes a constant, i.e. it does not depend on t.  The second becomes −2mt, where m is the mean, and the third becomes t².  This gives a quadratic in t.  Find its minimum by setting its first derivative equal to 0.  Thus t = m, and the mean is your best guess.  The expected error is the variance of f.


What is Log?

Date: 26 Feb 1995 22:46:28 -0500
From: charley
Subject: Math questions

Hi,

My name is Yutaka Charley and I'm in the 5th grade at PS150Q in NYC.

What's 4 to the half power?

What does log mean?

Thank you.

Yutaka

Date: 27 Feb 1995 21:54:12 -0500
From: Dr. Ken
Subject: Re: Math questions

Hello there!

I'll address your second question, the one about Logs; and my colleague and buddy Ethan has promised to answer your first question, the one about 4 to the 1/2 power.

Here's the definition of Log:

If a^b = x, then Log_a(x) = b.

When you read that, you say "if a to the b power equals x, then the Log (or Logarithm) to the base a of x equals b."  Log is short for the word Logarithm.  Here are a couple of examples: Since 2^3 = 8, Log_2(8) = 3.

For the rest of this letter we will use ^ to represent exponents - 2^3 means 2 to the third power.

To find out what Log_5(25) is, we'd ask ourselves "what power do you raise 5 to, to get 25?"  Since 5^2 = 25, the answer to this one is 2.  So the Logarithm to the base 5 of 25 is 2.

Whenever you talk about a Logarithm, you have to say what base you're talking about.  For instance, the Logarithm to the base 3 of 81 is 4, but the Logarithm to the base 9 of 81 is 2.

Here are a couple of examples that you can try to figure out: What is the Logarithm to the base 2 of 16?  What is the Logarithm to the base 7 of 343?  How would you express the information, 4^3 = 64, in terms of Logarithms?
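
(If you want to check your answers with a computer, Python's math.log takes the base as a second argument; this aside is mine, not part of the original letter.)

import math

print(math.log(16, 2))    # 4.0: the Log to the base 2 of 16
print(math.log(343, 7))   # 3.0 (up to rounding): the Log to the base 7 of 343
print(math.log(64, 4))    # 3.0: restating 4^3 = 64 as a Logarithm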

_______________

Now that you have done Logarithms I will take over for my buddy Ken and talk about fractional exponents.

To help explain fractional exponents I need to teach you one neat fact about exponents:

3^4 times 3^5 equals 3^(4+5) or 3^9

This will be very important so I will show a few more examples.

4^7 times 4^10 equals 4^17
5^2 times 5^6 equals 5^8

Now let's get to fractional exponents. Let's start with 9^(1/2).

We know from our adding rule that 9^(1/2) times 9^(1/2) is 9^(1/2 + 1/2), which is 9^1; so whatever 9^(1/2) is, we know that it times itself has to equal nine.  But what times itself equals 9?  Well 3, so 9^(1/2) is 3.

All fractional exponents work this way.  Let's look at 8^(1/3).  Again, 8^(1/3) times 8^(1/3) times 8^(1/3) is 8^(1/3 + 1/3 + 1/3), which is 8; so we need to know what number times itself three times is 8.  That is 2.

So now look at your problem, 4^(1/2).  We know from experience that this means: what number times itself is 4?  That is 2, so 4^(1/2) equals 2.
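
The adding rule and these answers are easy to confirm numerically (a small sketch of mine):

# a^m * a^n == a^(m+n), even for fractional exponents.
print(9 ** 0.5)               # 3.0, since 3 times 3 is 9
print(8 ** (1 / 3))           # 2.0 (up to rounding), since 2*2*2 is 8
print(4 ** 0.5)               # 2.0, the answer to the original question
print(9 ** 0.5 * 9 ** 0.5)    # 9.0, i.e. 9^(1/2 + 1/2)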

Geometrical Meaning of Matrix Multiplication

Definitions of 'matrix'

Wordnet 

1. (noun) matrix (mathematics) a rectangular array of quantities or expressions set out by rows and columns; treated as a single element  and manipulated according to rules

2. (noun) matrix (geology) a mass of fine-grained rock in which fossils, crystals, or gems are embedded

3. (noun) matrix an enclosure within which something originates or develops (from the Latin for womb)

4. (noun) matrix, intercellular substance, ground substance the body substance in which tissue cells are embedded

5. (noun) matrix the formative tissue at the base of a nail

6. (noun) matrix mold used in the production of phonograph records, type, or other relief surface

Definitions of 'matrix'

Webster 1913 Dictionary

1. (noun) matrix the womb

2. (noun) matrix hence, that which gives form or origin to anything

3. (noun) matrix the cavity in which anything is formed, and which gives it shape; a die; a mold, as for the face of a type

4. (noun) matrix the earthy or stony substance in which metallic ores or crystallized minerals are found; the gangue

5. (noun) matrix the five simple colors, black, white, blue, red, and yellow, of which all the rest are composed

6. (noun) matrix the lifeless portion of tissue, either animal or vegetable, situated between the cells; the intercellular substance

7. (noun) matrix a rectangular arrangement of symbols in rows and columns. The symbols may express quantities or operations

Definitions of 'matrix'

The New Hacker's Dictionary

1.  matrix [FidoNet]

1. What the Opus BBS software and sysops call FidoNet.

2. Fanciful term for a cyberspace expected to emerge from current networking experiments (see the network). The name of the rather good 1999 cypherpunk movie The Matrix played on this sense, which however had been established for years before.

3. The totality of present-day computer networks (popularized in this sense by John Quarterman; rare outside academic  literature).

Matrix multiplication is a versatile tool for many aspects of scientific or technical methods.  One particular application of matrix multiplication is the transformation of data in n-dimensional space.  Data can be scaled, shifted, rotated, or distorted by a simple matrix multiplication.  In order to achieve all these operations by a single transformation matrix, the original data has to be augmented by an additional constant value (preferably 1).

Example: transformation of two-dimensional points.  Suppose you have seven data points in two dimensions (x and y).  These seven data points have to be submitted to various transformation operations.  Therefore we first augment the data points, denoted by [xi, yi], with a constant value, resulting in the point vectors [xi, yi, 1].

For performing the various transformations, we simply have to adjust the transformation matrix.    

Shift  The coordinates of the data points are shifted by the vector [t1,t2]

Scaling  The points are scaled by the factor s.

Scaling only the y coordinate 

Here, only the y coordinates are scaled according to the factor s.

Rotation 

A rotation of all points around the origin can be accomplished by using the sines and cosines of the rotation angle (remember the negative sign for the first sine term).
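
A sketch of all three transformations with numpy, using augmented row vectors [x, y, 1] (this row-vector convention is my choice; with column vectors each matrix would be transposed, which is where the sign on the sine terms moves):

import numpy as np

pts = np.array([[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1],
                [2, 1, 1], [2, 2, 1], [0, 2, 1]], dtype=float)  # seven points

t1, t2, s, a = 3.0, -1.0, 2.0, np.pi / 2   # shift vector, scale factor, angle

shift = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [t1, t2, 1]])
scale = np.array([[s, 0, 0],
                  [0, s, 0],
                  [0, 0, 1]])
rotate = np.array([[np.cos(a), np.sin(a), 0],
                   [-np.sin(a), np.cos(a), 0],
                   [0, 0, 1]])

print(pts @ shift)    # every point moved by [t1, t2]
print(pts @ scale)    # every point scaled by s
print(pts @ rotate)   # every point rotated about the origin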

The ordinary matrix product is the most often used and the most important way to multiply matrices. It is defined between two matrices only if the width of the first matrix equals the height of the second matrix. Multiplying an m×n matrix with an n×p matrix results in an m×p matrix. If many matrices are multiplied together, and their dimensions are written in a list in order, e.g. m×n, n×p, p×q, q×r, the size of the result is given by the first and the last numbers (m×r), and the values surrounding each comma must match for the result to be defined. The ordinary matrix product is not commutative:

The element x3,4 of the above matrix product is computed as follows

The first coordinate in matrix notation denotes the row and the second the column; this order is used both in indexing and in giving the dimensions.  The element at the intersection of row i and column j of the product matrix is the dot product (or scalar product) of row i of the first matrix and column j of the second matrix.  This explains why the width and the height of the matrices being multiplied must match: otherwise the dot product is not defined.
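
The row-times-column rule translates directly into code; a minimal sketch:

def matmul(A, B):
    # Multiply an m-by-n list-of-lists A by an n-by-p list-of-lists B.
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "width of A must equal height of B"
    # C[i][j] is the dot product of row i of A and column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]                        # 2 x 3
B = [[7, 8], [9, 10], [11, 12]]        # 3 x 2
print(matmul(A, B))                    # 2 x 2: [[58, 64], [139, 154]]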

The figure to the right illustrates the product of two matrices A and B, showing how each intersection in the product matrix corresponds to a row of A and a column of B.  The size of the output matrix is always the largest possible, i.e. for each row of A and for each column of B there are always corresponding intersections in the product matrix.  The product matrix AB consists of all combinations of dot products of rows of A and columns of B.

The values at the intersections marked with circles are:

Formal definition

Formally, for an m×n matrix A = (aij) and an n×p matrix B = (bij) over some field F, the product AB is the m×p matrix whose elements are given by

(AB)ij = ai1b1j + ai2b2j + … + ainbnj

for each pair i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ p. The algebraic system of "matrix units" summarizes the abstract properties of this kind of multiplication.

Relationship with the inner product and the outer product

The Euclidean inner product and outer product are the simplest special cases of the ordinary matrix product.  The inner product of two column vectors A and B is AᵀB, where T denotes the matrix transpose.  More explicitly, AᵀB = a1b1 + a2b2 + … + anbn, a 1 × 1 matrix, i.e. a scalar.

The outer product is ABᵀ, whose (i, j) entry is the product aibj.

Matrix multiplication can be viewed in terms of these two operations by considering how the matrix product works on block matrices.

Decomposing A into row vectors and B into column vectors:

where

The method in the introduction was:

This is an outer product where the real product inside is replaced with the inner product. In general, block matrix multiplication works exactly like ordinary matrix multiplication, but the real product inside is replaced with the matrix product.

An alternative method results when the decomposition is done the other way around (A decomposed into column vectors and B into row vectors):

This method emphasizes the effect of individual column/row pairs on the result, which is a useful point of view with e.g. covariance matrices, where each such pair corresponds to the effect of a single sample point. An example for a small matrix:

One more useful decomposition results when B is decomposed into columns and A is left undecomposed.  Then A is seen to act separately on each column of B, transforming them in parallel.  Conversely, B acts separately on each row of A.

If x is a vector and A is decomposed into columns, then Ax = x1a1 + x2a2 + … + xnan, where the ai are the columns of A.  The column vectors of A give directions and units for coordinate axes and the elements of x are coordinates on the corresponding axes.  Ax is then the vector which has those coordinates in the coordinate system given by the columns of A.

Properties

Matrix multiplication is not generally commutative

If A and B are both n x n matrices, the determinant of their product is independent of the order of the matrices in the product.

If both matrices are diagonal square matrices of the same dimension, their product is commutative.

If A is a matrix representative of a linear transformation L and B is a matrix representative of a linear transformation P, then AB is a matrix representative of the linear transformation P followed by the linear transformation L.

Matrix multiplication is associative:

Matrix multiplication is distributive over matrix addition:


If the matrix is defined over a field (for example, over the Real or Complex fields), then it is compatible with scalar multiplication in that field.

where c is a scalar.

Algorithms for efficient matrix multiplication

The running time of square matrix multiplication, if carried out naively, is O(n³).  The running time for multiplying rectangular matrices (one m×p-matrix with one p×n-matrix) is O(mnp).  But more efficient algorithms do exist.  Strassen's algorithm, devised by Volker Strassen in 1969 and often referred to as "fast matrix multiplication", is based on a clever way of multiplying two 2 × 2 matrices which requires only 7 multiplications (instead of the usual 8), at the expense of several additional addition and subtraction operations.  Applying this trick recursively gives an algorithm with a multiplicative cost of O(n^(log₂ 7)) ≈ O(n^2.807).  Strassen's algorithm is awkward to implement, compared to the naive algorithm, and it lacks numerical stability.  Nevertheless, it is beginning to appear in libraries such as BLAS, where it is computationally interesting for matrices with dimensions n > 100,[1] and is very useful for large matrices over exact domains such as finite fields, where numerical stability is not an issue.
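
The seven products for the 2 × 2 base case can be sketched as follows (this is the textbook form of Strassen's identities, shown here without the recursive step):

def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    # Seven multiplications instead of the usual eight.
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)
    return [[p5 + p4 - p2 + p6, p1 + p2],
            [p3 + p4, p1 + p5 - p3 - p7]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]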

The algorithm with the lowest known exponent k for an O(n^k) running time is the Coppersmith–Winograd algorithm.  It was presented by Don Coppersmith and Shmuel Winograd in 1990 and has an asymptotic complexity of O(n^2.376).  It is similar to Strassen's algorithm: a clever way is devised for multiplying two k × k matrices with fewer than k³ multiplications, and this technique is applied recursively.  However, the constant coefficient hidden by the Big O Notation is so large that the Coppersmith–Winograd algorithm is only worthwhile for matrices that are too large to handle on present-day computers.[2]

Since any algorithm for multiplying two n × n matrices has to process all 2n² entries, there is an asymptotic lower bound of Ω(n²) operations.  Raz (2002) proves a lower bound of Ω(m² log m) for bounded coefficient arithmetic circuits over the real or complex numbers.

Cohn et al. (2003, 2005) put methods such as the Strassen and Coppersmith–Winograd algorithms in an entirely different group-theoretic context.  They show that if families of wreath products of Abelian groups with symmetric groups satisfying certain conditions exist, then there are matrix multiplication algorithms with essentially quadratic complexity.  Most researchers believe that this is indeed the case.[3]  A lengthy attempt at proving this was undertaken by the late Jim Eve.[4]

Because of the nature of matrix operations and the layout of matrices in memory, it is typically possible to realize substantial performance gains through parallelisation and vectorization.  Some algorithms that have lower time complexity on paper may therefore carry hidden costs on real machines.

Relationship to linear transformations

Matrices offer a concise way of representing linear transformations between vector spaces, and (ordinary) matrix multiplication corresponds to the composition of linear transformations. This will be illustrated here by means of an example using three vector spaces of specific dimensions, but the correspondence applies equally to any other choice of dimensions.

Let X, Y, and Z be three vector spaces, with dimensions 4, 2, and 3, respectively, all over the same field, for example the real numbers. The coordinates of a point in X will be denoted as xi, for i = 1 to 4, and analogously for the other two spaces.

Two linear transformations are given: one from Y to X, which can be expressed by the system of linear equations

and one from Z to Y, expressed by the system

These two transformations can be composed to obtain a transformation from Z to X. By substituting, in the first system, the right-hand sides of the equations of the second system for their corresponding left-hand sides, the xi can be expressed in terms of the zk:

These three systems can be written equivalently in matrix–vector notation – thereby reducing each system to a single equation – as follows:

Representing these three equations symbolically and more concisely as

inspection of the entries of matrix C reveals that C = AB.

This can be used to formulate a more abstract definition of matrix multiplication, given the special case of matrix–vector multiplication: the product AB of matrices A and B is the matrix C such that for all vectors z of the appropriate shape Cz = A(Bz).
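
A quick numeric illustration of that defining property (matrices chosen arbitrarily):

import numpy as np

A = np.arange(8.0).reshape(4, 2)   # a map from Y (dim 2) to X (dim 4)
B = np.arange(6.0).reshape(2, 3)   # a map from Z (dim 3) to Y (dim 2)
z = np.array([1.0, -2.0, 3.0])

C = A @ B                          # the composed map from Z to X
print(np.allclose(C @ z, A @ (B @ z)))   # True: Cz = A(Bz)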

Scalar multiplication

The scalar multiplication of a matrix A = (aij) and a scalar r gives a product rA of the same size as A.  The entries of rA are given by

(rA)ij = r·aij

For example, if

then

If we are concerned with matrices over a more general ring, then the above multiplication is the left multiplication of the matrix A with scalar r, while the right multiplication is defined to be

(A·r)ij = aij·r

When the underlying ring is commutative, for example, the real or complex number field, the two multiplications are the same. However, if the ring is not commutative, such as the quaternions, they may be different. For example

Hadamard product

See also: (Function) pointwise product

For two matrices of the same dimensions, we have the Hadamard product (named after French mathematician Jacques Hadamard), also known as the entrywise product and the Schur product.[5]

Formally, for two matrices of the same dimensions, A = (aij) and B = (bij), the Hadamard product A · B is a matrix of the same dimensions with elements given by

(A · B)ij = aij·bij

Note that the Hadamard product is a submatrix of the Kronecker product.

The Hadamard product is commutative.

The Hadamard product appears in lossy compression algorithms such as JPEG.
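
In numpy the entrywise product is simply the * operator on equal-shaped arrays (a quick sketch):

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A * B)                      # [[5, 12], [21, 32]]
print(np.all(A * B == B * A))     # True: the Hadamard product is commutative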

Kronecker product

Main article: Kronecker product

For any two arbitrary matrices A and B, we have the direct product or Kronecker product A ⊗ B, defined as the block matrix obtained by multiplying each entry aij of A by the whole matrix B.

If A is an m-by-n matrix and B is a p-by-q matrix, then their Kronecker product A ⊗ B is an mp-by-nq matrix.

The Kronecker product is not commutative.

For example
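
Here is a small numeric sketch using numpy's np.kron (matrices of my choosing):

import numpy as np

A = np.array([[1, 2]])            # 1 x 2
B = np.array([[3, 4],
              [5, 6]])            # 2 x 2

print(np.kron(A, B))   # 2 x 4: each entry of A multiplies a full copy of B
print(np.kron(B, A))   # also 2 x 4, but different: not commutative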

If A and B represent linear transformations V1 → W1 and V2 → W2, respectively, then A ⊗ B represents the tensor product of the two maps, V1 ⊗ V2 → W1 ⊗ W2.

Common properties

If A, B and C are matrices with appropriate dimensions defined over a field (e.g. the real numbers), where c is a scalar in that field, then for all three types of multiplication:

Matrix multiplication is associative:

Matrix multiplication is distributive:

Matrix multiplication is compatible with scalar multiplication:

Note that matrix multiplication is not commutative:

although, the order of multiplication can be reversed by transposing the matrices:

The Frobenius inner product, sometimes denoted A : B, is the component-wise inner product of two matrices as though they are vectors.  In other words, it is the sum of the entries of the Hadamard product, that is,

A : B = Σi Σj aij bij = trace(AᵀB).

This inner product induces the Frobenius norm.
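
A sketch in numpy (arbitrary matrices):

import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

print((A * B).sum())            # 70.0: sum of the Hadamard product
print(np.trace(A.T @ B))        # 70.0: the equivalent trace form
print(np.sqrt((A * A).sum()))   # Frobenius norm of A, sqrt of A : A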

Square matrices can be multiplied by themselves repeatedly in the same way that ordinary numbers can. This repeated multiplication can be described as a power of the matrix. Using the ordinary notion of matrix multiplication, the following identities hold for an n-by-n matrix A, a positive integer k, and a scalar c:

The naive computation of matrix powers is to multiply the matrix A into the result k times, starting with the identity matrix, just like the scalar case.  This can be improved using the binary representation of k, a method commonly used for scalars.  An even better method is to use the eigenvalue decomposition of A.
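
The binary-representation trick is repeated squaring; a sketch:

import numpy as np

def mat_pow(A, k):
    # Compute A**k with O(log k) matrix multiplications.
    result = np.eye(len(A), dtype=A.dtype)   # start from the identity
    while k > 0:
        if k & 1:                # this bit of k is set
            result = result @ A
        A = A @ A                # square A for the next bit
        k >>= 1
    return result

A = np.array([[1, 1], [1, 0]])   # the Fibonacci matrix
print(mat_pow(A, 10))            # [[89, 55], [55, 34]]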

Calculating high powers of matrices can be very time-consuming, but the complexity of the calculation can be dramatically decreased by using the Cayley-Hamilton theorem, which takes advantage of an identity found using the matrices' characteristic polynomial and gives a much more effective equation for A^k, which instead raises a scalar to the required power, rather than a matrix.

Powers of diagonal matrices

The kth power of a diagonal matrix A is the diagonal matrix whose diagonal entries are the kth powers of the corresponding diagonal entries of the original matrix A.

When raising an arbitrary matrix (not necessarily a diagonal matrix) to a power, it is often helpful to diagonalize the matrix first.

The Weighted Matrix Product

The Weighted Matrix Product, or Weighted Matrix Multiplication, is a generalization of ordinary matrix multiplication, in the following way.

Given a set of Weight Matrices W1, …, Wp, the Weighted Matrix Product of the matrix pair (A, B) is given by:

(A ∗ B)ij = Σk=1..c(A) (Wk)ij · aik · bkj ,

where c(A) is the number of columns of A.

The number of Weight Matrices is:

the number of columns of the left operand = the number of rows of the right operand

The number of rows of the Weight Matrices is:

number of rows of the left operand.

The number of columns of the Weight Matrices is:

the number of columns of the right operand.

The Weighted Matrix Product is defined only if the matrix operands are conformable in the ordinary sense.

The resultant matrix has the number of rows of the left operand, and the number of columns of the right operand.

NOTE:

Ordinary Matrix Multiplication is the special case of Weighted Matrix Multiplication, where all the weight matrix entries are 1s.

Ordinary Matrix Multiplication is Weighted Matrix Multiplication in a default "sea of 1s", the weight matrices formed out of the "sea" as necessary.
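
Reading the definition entrywise (the formula above is reconstructed from the stated shape constraints, and the helper name is mine), a sketch:

def weighted_matmul(A, B, W):
    # (A * B)[i][j] = sum over k of W[k][i][j] * A[i][k] * B[k][j],
    # with one weight matrix per column of A (= per row of B),
    # each shaped like the result.
    m, p, n = len(A), len(B), len(B[0])
    assert len(W) == p
    return [[sum(W[k][i][j] * A[i][k] * B[k][j] for k in range(p))
             for j in range(n)] for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
sea_of_ones = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
print(weighted_matmul(A, B, sea_of_ones))   # ordinary product: [[19, 22], [43, 50]]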

NOTE:

The Weighted Matrix Product is not generally associative.

Weighted matrix multiplication may be expressed in terms of ordinary matrix multiplication, using matrices constructed from the constituent parts, as follows:

for an m×p matrix A, and a p×n matrix B,

define:

then:

The Weighted Matrix product is especially useful in developing matrix bases closed under a (not necessarily associative) product (algebras).

As an example, consider the following developments.  It is convenient (although not necessary) to begin with permutation matrices as the basis, since they are a known basis and about as simple as there is.

The Complex Plane

 is weighted matrix multiplication,

with weights: 

then:

then:

So:

Thus a homomorphism is manifested between this algebra and the complex plane.

Quaternions

 is weighted matrix multiplication,

with weights: 

then

Multiplying a 2 × 3 matrix by a 3 × 4 matrix is possible and it gives a 2 × 4 matrix as the answer.

Multiplying a 7 × 1 matrix by a 1 × 2 matrix is okay; it gives a 7 × 2 matrix

A 4 × 3 matrix times a 2 × 3 matrix is NOT possible.

How to Multiply 2 Matrices

We use letters first to see what is going on.  We'll see a numeric example afterwards.

As an example, let's take a general 2 × 3 matrix multiplied by a 3 × 2 matrix.

The answer will be a 2 × 2 matrix.

We multiply and add the elements as follows. We work across the 1st row of the first matrix, multiplying down the 1st column of the second matrix, element by element. We add the resulting products. Our answer goes in position a11 (top left) of the answer matrix.

We do a similar process for the 1st row of the first matrix and the 2nd column of the second matrix. The result is placed in position a12.

Now for the 2nd row of the first matrix and the 1st column of the second matrix. The result is placed in position a21.

Finally, we do the 2nd row of the first matrix and the 2nd column of the second matrix. The result is placed in position a22.

So the result of multiplying our 2 matrices is as follows:

Now let's see a number example.

Example

Multiply:

Answer

Multiplying 2 × 2 Matrices

The process is the same for any size matrix. We multiply across rows of the first matrix and down columns of the second matrix, element by element. We then add the products:

In this case, we multiply a 2 × 2 matrix by a 2 × 2 matrix and we get a 2 × 2 matrix as the result.

Example

Multiply:

Answer


Matrices and Systems of Simultaneous Linear Equations

We now see how to write a system of linear equations using matrix multiplication.

Example:

The system of equations

can be written as:

Matrices are ideal for computer-driven solutions of problems because computers easily form arrays. We can leave out the algebraic symbols. A computer only requires the first and last matrices to solve the system, as we will see in Matrices and Linear Equations.
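
For instance, with numpy (an arbitrary 2 × 2 system of my choosing):

import numpy as np

# 2x + y = 5
#  x - y = 1
A = np.array([[2., 1.],
              [1., -1.]])    # coefficient matrix
b = np.array([5., 1.])       # right-hand sides

print(np.linalg.solve(A, b))   # [2. 1.], i.e. x = 2, y = 1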

Note 1 - Notation

Take care when writing matrix multiplication.

The following expressions have different meanings:

AB is matrix multiplication

A×B cross product, which returns a vector

A*B used in computer notation, but not on paper

A•B dot product, which returns a scalar.

[See the Vector chapter for more information on vector and scalar quantities.]

Note 2 - Commutativity of Matrix Multiplication

Does AB = BA?

Let's see if it is true using an example.

Example

If

and

find AB and BA.

Answer

In general, when multiplying matrices, the commutative law doesn't hold, i.e. AB ≠ BA. There are two common exceptions to this:

The identity matrix: IA = AI = A.  The inverse of a matrix: A⁻¹A = AA⁻¹ = I.

In the next section we learn how to find the inverse of a matrix.

Example - Multiplying by the Identity Matrix

Given that

find AI.

Answer

Exercises

1. If possible, find BA and AB.

Answer

2. Determine if B = A⁻¹.

Answer

3. In studying the motion of electrons, one of the Pauli spin matrices is

where

Show that s² = I.

[If you have never seen j before, go to the section on complex numbers].

Answer

4. Evaluate the following matrix multiplication which is used in directing the motion of a robotic mechanism.

Answer

Diagonal matrix

From Wikipedia, the free encyclopedia

In linear algebra, a diagonal matrix is a square matrix in which the entries outside the main diagonal (↘) are all zero.  The diagonal entries themselves may or may not be zero.  Thus, the matrix D = (di,j) with n columns and n rows is diagonal if di,j = 0 whenever i ≠ j.

For example, the following matrix is diagonal:

The term diagonal matrix may sometimes refer to a rectangular diagonal matrix, which is an m-by-n matrix with only the entries of the form di,i possibly non-zero; for example,

, or 

However, in the remainder of this article we will consider only square matrices.

Any diagonal matrix is also a symmetric matrix. Also, if the entries come from the field R or C, then it is a normal matrix as well.

Equivalently, we can define a diagonal matrix as a matrix that is both upper- and lower-triangular.

The identity matrix In and any square zero matrix are diagonal.  A 1×1 matrix is always diagonal.

Scalar matrix

A diagonal matrix with all its main diagonal entries equal is a scalar matrix, that is, a scalar multiple λI of the identity matrix I. Its effect on a vector is scalar multiplication by λ. For example, a 3×3 scalar matrix has the form:

The scalar matrices are the center of the algebra of matrices: that is, they are precisely the matrices that commute with all other square matrices of the same size.

For an abstract vector space V (rather than the concrete vector space Kn), or more generally a module M over a ring R, with the endomorphism algebra End(M) (algebra of linear operators on M) replacing the algebra of matrices, the analog of scalar matrices are scalar transformations. Formally,

scalar multiplication is a linear map, inducing a map R → End(M) (send a scalar λ to the corresponding scalar transformation, multiplication by λ) exhibiting End(M) as an R-algebra.  For vector spaces, or more generally free modules, for which the endomorphism algebra is isomorphic to a matrix algebra, the scalar transforms are exactly the center of the endomorphism algebra, and similarly invertible transforms are the center of the general linear group GL(V), where they are denoted by Z(V), following the usual notation for the center.

Matrix operations

The operations of matrix addition and matrix multiplication are especially simple for diagonal matrices.  Write diag(a1,...,an) for a diagonal matrix whose diagonal entries starting in the upper left corner are a1,...,an.  Then, for addition, we have

diag(a1,...,an) + diag(b1,...,bn) = diag(a1+b1,...,an+bn)

and for matrix multiplication,

diag(a1,...,an) · diag(b1,...,bn) = diag(a1b1,...,anbn).

The diagonal matrix diag(a1,...,an) is invertible if and only if the entries a1,...,an are all non-zero. In this case, we have

diag(a1,...,an)⁻¹ = diag(a1⁻¹,...,an⁻¹).

In particular, the diagonal matrices form a subring of the ring of all n-by-n matrices.

Multiplying an n-by-n matrix A from the left with diag(a1,...,an) amounts to multiplying the i-th row of A by ai for all i; multiplying the matrix A from the right with diag(a1,...,an) amounts to multiplying the i-th column of A by ai for all i.

Background

The variance of a random variable or distribution is the expectation, or mean, of the squared deviation of that variable from its expected value or mean. Thus the variance is a measure of the amount of variation within the values of that variable, taking account of all possible values and their probabilities or weightings (not just the extremes which give the range). For example, a perfect die, when thrown, has expected value (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5, expected absolute deviation 1.5 (the mean of the equally likely absolute deviations 3.5 − 1, 3.5 − 2, 3.5 − 3, 4 − 3.5, 5 − 3.5, 6 − 3.5, giving 2.5, 1.5, 0.5, 0.5, 1.5, 2.5), but expected square deviation or variance of 17.5/6 ≈ 2.9 (the mean of the equally likely squared deviations 2.5², 1.5², 0.5², 0.5², 1.5², 2.5²).

As another example, if a coin is tossed twice, the number of heads is: 0 with probability 0.25, 1 with probability 0.5 and 2 with probability 0.25. Thus the variance is 0.25 × (0 − 1)² + 0.5 × (1 − 1)² + 0.25 × (2 − 1)² = 0.25 + 0 + 0.25 = 0.5. (Note that in this case, where tosses of coins are independent, the variance is additive, i.e., if the coin is tossed n times, the variance will be 0.25n.)

Unlike expected deviation, the variance of a variable has units that are the square of the units of the variable itself. For example, a variable measured in inches will have a variance measured in square inches. For this reason, describing data sets via their standard deviation or root mean square deviation is often preferred over variance. In the dice example the standard deviation is √(17.5/6) ≈ 1.7, slightly larger than the expected deviation of 1.5.

The standard deviation and the expected deviation can both be used as an indicator of the "spread" of a distribution. The standard deviation is more amenable to algebraic manipulation, and, together with variance and its generalization covariance is used frequently in theoretical statistics; however the expected deviation tends to be more robust as it is less sensitive to outliers arising from measurement anomalies or an unduly heavy-tailed distribution.

Real-world distributions such as the distribution of yesterday’s rain throughout the day are typically not fully known, unlike the behavior of perfect dice or an ideal distribution such as the normal distribution, because it is impractical to account for every raindrop. Instead one estimates the mean and variance of the whole distribution as the computed mean and variance of n samples drawn suitably randomly from the whole sample space, in this example yesterday’s rainfall.

This method of estimation is close to optimal, with the caveat that it underestimates the variance by a factor of (n−1)/n (when n = 1 the variance of a single sample is obviously zero regardless of the true variance), a bias which should be corrected for when n is small. If the mean is determined in some other way than from the same samples used to estimate the variance then this bias does not arise and the variance can safely be estimated as that of the samples.

The variance of a real-valued random variable is its second central moment, and it also happens to be its second cumulant. Just as some distributions do not have a mean, some do not have a variance. The mean exists whenever the variance exists, but not vice versa.

Definition

If a random variable X has the expected value (mean) μ = E[X], then the variance of X is given by:

Var(X) = E[(X − μ)²].

This definition encompasses random variables that are discrete, continuous, or neither.  It can be expanded as follows:

Var(X) = E[(X − μ)²] = E[X²] − 2μ·E[X] + μ² = E[X²] − μ².

The variance of random variable X is typically designated as Var(X), σX², or simply σ² (pronounced "sigma squared"). If a distribution does not have an expected value, as is the case for the Cauchy distribution, it does not have a variance either. Many other distributions for which the expected value does exist do not have a finite variance because the relevant integral diverges. An example is a Pareto distribution whose index k satisfies 1 < k ≤ 2.

Continuous case

If the random variable X is continuous with probability density function f(x), then

Var(X) = ∫ (x − μ)² f(x) dx,

where

μ = ∫ x f(x) dx,

and where the integrals are definite integrals taken for x ranging over the range of X.

Discrete case

If the random variable X is discrete with probability mass function x1 ↦ p1, ..., xn ↦ pn, then

Var(X) = Σi=1..n pi (xi − μ)²,

where

μ = Σi=1..n pi xi .

(When such a discrete weighted variance is specified by weights whose sum is not 1, then one divides by the sum of the weights.) That is, it is the expected value of the square of the deviation of X from its own mean. In plain language, it can be expressed as “The mean of the square of the deviation of each data point from the average”. It is thus the mean squared deviation.

Examples

Exponential distribution

The exponential distribution with parameter λ is a continuous distribution whose support is the semi-infinite interval [0,∞).  Its probability density function is given by:

f(x) = λe^(−λx),

and it has expected value μ = λ⁻¹.  Therefore the variance is equal to:

E[X²] − μ² = 2λ⁻² − (λ⁻¹)² = λ⁻².

So for an exponentially distributed random variable, σ² = μ².

Fair dice

A six-sided fair die can be modelled with a discrete random variable with outcomes 1 through 6, each with equal probability 1/6.  The expected value is (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.  Therefore the variance can be computed to be:

Var(X) = (1/6) Σi=1..6 (i − 3.5)² = 35/12 ≈ 2.92.

Properties

Variance is non-negative because the squares are positive or zero. The variance of a constant random variable is zero, and the variance of a variable in a data set is 0 if and only if all entries have the same value.

Variance is invariant with respect to changes in a location parameter. That is, if a constant is added to all values of the variable, the variance is unchanged. If all values are scaled by a constant, the variance is scaled by the square of that constant. These two properties can be expressed in the following formula:

The variance of a finite sum of uncorrelated random variables is equal to the sum of their variances. This stems from the identity:

and that for uncorrelated variables covariance is zero.

In general, for the sum of N uncorrelated variables X1, …, XN, we have:

Var(X1 + … + XN) = Var(X1) + … + Var(XN).

Suppose that the observations can be partitioned into equal-sized subgroups according to some second variable. Then the variance of the total group is equal to the mean of the variances of the subgroups plus the variance of the means of the subgroups. This property is known as variance decomposition or the law of total variance and plays an important role in the analysis of variance. For example, suppose that a group consists of a subgroup of men and an equally large subgroup of women. Suppose that the men have a mean body length of 180 and that the variance of their lengths is 100. Suppose that the women have a mean length of 160 and that the variance of their lengths is 50. Then the mean of the variances is (100 + 50) / 2 = 75; the variance of the means is the variance of 180, 160 which is 100. Then, for the total group of men and women combined, the variance of the body lengths will be 75 + 100 = 175. Note that this uses N for the denominator instead of N − 1.

In a more general case, if the subgroups have unequal sizes, then they must be weighted proportionally to their size in the computations of the means and variances. The formula is also valid with more than two groups, and even if the grouping variable is continuous.

This formula implies that the variance of the total group cannot be smaller than the mean of the variances of the subgroups. Note, however, that the total variance is not necessarily larger than the variances of the subgroups. In the above example, when the subgroups are analyzed separately, the variance is influenced only by the man-man differences and the woman-woman differences. If the two groups are combined, however, then the men-women differences enter into the variance also.

Many computational formulas for the variance are based on this equality: The variance is equal to the mean of the squares minus the square of the mean. For example, if we consider the numbers 1, 2, 3, 4 then the mean of the squares is (1 × 1 + 2 × 2 + 3 × 3 + 4 × 4) / 4 = 7.5. The regular mean of all four numbers is 2.5, so the square of the mean is 6.25. Therefore the variance is 7.5 − 6.25 = 1.25, which is indeed the same result obtained earlier with the definition formulas.

Many pocket calculators use an algorithm that is based on this formula and that allows them to compute the variance while the data are entered, without storing all values in memory. The algorithm is to adjust only three variables when a new data value is entered: the number of data entered so far (n), the sum of the values so far (S), and the sum of the squared values so far (SS). For example, if the data are 1, 2, 3, 4, then after entering the first value, the algorithm would have n = 1, S = 1 and SS = 1. After entering the second value (2), it would have n = 2, S = 3 and SS = 5. When all data are entered, it would have n = 4, S = 10 and SS = 30. Next, the mean is computed as M = S / n, and finally the variance is computed as SS / n − M × M. In this example the outcome would be 30 / 4 − 2.5 × 2.5 = 7.5 − 6.25 = 1.25. If the unbiased sample estimate is to be computed, the outcome will be multiplied by n / (n − 1), which yields 1.667 in this example.
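
The three-variable algorithm, written out as a sketch:

def streaming_variance(data, unbiased=False):
    n = S = SS = 0
    for x in data:
        n += 1            # count of values entered so far
        S += x            # running sum
        SS += x * x       # running sum of squares
    mean = S / n
    var = SS / n - mean * mean
    # The unbiased estimate needs n >= 2.
    return var * n / (n - 1) if unbiased else var

print(streaming_variance([1, 2, 3, 4]))                  # 1.25
print(streaming_variance([1, 2, 3, 4], unbiased=True))   # about 1.667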

Sum of uncorrelated variables (Bienaymé formula)

One reason for the use of the variance in preference to other measures of dispersion is that the variance of the sum (or the difference) of uncorrelated random variables is the sum of their variances:

This statement is called the Bienaymé formula[1] and was discovered in 1853.  It is often made with the stronger condition that the variables are independent, but uncorrelatedness suffices.  So if all the variables have the same variance σ², then, since division by n is a linear transformation, this formula immediately implies that the variance of their mean is

Var(x̄) = σ²/n.

That is, the variance of the mean decreases when n increases. This formula for the variance of the mean is used in the definition of the standard error of the sample mean, which is used in the central limit theorem.

Sum of correlated variables

In general, if the variables are correlated, then the variance of their sum is the sum of their covariances:

(Note: This by definition includes the variance of each variable, since Cov(X,X) = Var(X).)

Here Cov is the covariance, which is zero for independent random variables (if it exists). The formula states that the variance of a sum is equal to the sum of all elements in the covariance matrix of the components. This formula is used in the theory of Cronbach's alpha in classical test theory.

So if the variables have equal variance σ² and the average correlation of distinct variables is ρ, then the variance of their mean is

Var(x̄) = σ²/n + ((n − 1)/n)ρσ².

This implies that the variance of the mean increases with the average of the correlations. Moreover, if the variables have unit variance, for example if they are standardized, then this simplifies to

Var(x̄) = 1/n + ((n − 1)/n)ρ.

This formula is used in the Spearman-Brown prediction formula of classical test theory. This converges to ρ if n goes to infinity, provided that the average correlation remains constant or converges too. So for the variance of the mean of standardized variables with equal correlations or converging average correlation we have

lim(n→∞) Var(x̄) = ρ.

Therefore, the variance of the mean of a large number of standardized variables is approximately equal to their average correlation. This makes clear that the sample mean of correlated variables does generally not converge to the population mean, even though the Law of large numbers states that the sample mean will converge for independent variables.

Weighted sum of variables

The scaling property and the Bienaymé formula, along with this property from the covariance page: Cov(aX, bY) = ab·Cov(X, Y), jointly imply that

Var(aX + bY) = a²·Var(X) + b²·Var(Y) + 2ab·Cov(X, Y).

This implies that in a weighted sum of variables, the variable with the largest weight will have a disproportionally large weight in the variance of the total. For example, if X and Y are uncorrelated and the weight of X is two times the weight of Y, then the weight of the variance of X will be four times the weight of the variance of Y.

Decomposition

The general formula for variance decomposition or the law of total variance is: If X and Y are two random variables and the variance of X exists, then

Var(X) = E[Var(X|Y)] + Var(E[X|Y]).

Here, E(X|Y) is the conditional expectation of X given Y, and Var(X|Y) is the conditional variance of X given Y. (A more intuitive explanation is that given a particular value of Y, then X follows a distribution with mean E(X|Y) and variance Var(X|Y). The above formula tells how to find Var(X) based on the distributions of these two quantities when Y is allowed to vary.) This formula is often applied in analysis of variance, where the corresponding formula is

SSTotal = SSBetween + SSWithin.

It is also used in linear regression analysis, where the corresponding formula is

SSTotal = SSRegression + SSResidual.

This can also be derived from the additivity of variances, since the total (observed) score is the sum of the predicted score and the error score, where the latter two are uncorrelated.

Computational formula

Main article: computational formula for the variance

See also: algorithms for calculating variance

The computational formula for the variance follows in a straightforward manner from the linearity of expected values and the above definition:

Var(X) = E[X²] − (E[X])².

This is often used to calculate the variance in practice, although it suffers from catastrophic cancellation if the two components of the equation are similar in magnitude.

Characteristic property

The second moment of a random variable attains the minimum value when taken around the first moment (i.e., mean) of the random variable, i.e. argmin_m E[(X − m)²] = E[X].  Conversely, if a continuous function φ satisfies argmin_m E[φ(X − m)] = E[X] for all random variables X, then it is necessarily of the form φ(x) = ax² + b, where a > 0.  This also holds in the multidimensional case.[2]

Calculation from the CDF

The population variance for a non-negative random variable can be expressed in terms of the cumulative distribution function F using

Var(X) = 2∫0..∞ u·H(u) du − (∫0..∞ H(u) du)²,

where H(u) = 1 − F(u) is the right tail function. This expression can be used to calculate the variance in situations where the CDF, but not the density, can be conveniently expressed.

Approximating the variance of a function

The delta method uses second-order Taylor expansions to approximate the variance of a function of one or more random variables: see Taylor expansions for the moments of functions of random variables.  For example, the approximate variance of a function of one variable is given by

Var[f(X)] ≈ (f′(E[X]))² · Var(X),

provided that f is twice differentiable and that the mean and variance of X are finite.

Population variance and sample variance

In general, the population variance of a finite population of size N is given by

σ² = (1/N) Σi=1..N (xi − μ)²,

where

μ = (1/N) Σi=1..N xi

is the population mean.

In many practical situations, the true variance of a population is not known a priori and must be computed somehow. When dealing with extremely large populations, it is not possible to count every object in the population.

A common task is to estimate the variance of a population from a sample.[3] We take a sample with replacement of n values y1, ..., yn from the population, where n < N, and estimate the variance on the basis of this sample.  There are several good estimators.  Two of them are well known:

sn² = (1/n) Σi=1..n (yi − ȳ)²

and

s² = (1/(n − 1)) Σi=1..n (yi − ȳ)².[4]

Both are referred to as sample variance.  Here, ȳ denotes the sample mean:

ȳ = (1/n) Σi=1..n yi.

The two estimators only differ slightly as can be seen, and for larger values of the sample size n the difference is negligible. While the first one may be seen as the variance of the sample considered as a population, the second one is the unbiased estimator of the population variance, meaning that its expected value E[s2] is equal to the true variance of the sampled random variable; the use of the term n − 1 is called Bessel's correction. The sample variance with n − 1 is a U-statistic for the function ƒ(x1, x2) = (x1 − x2)2/2, meaning that it is obtained by averaging a 2-sample statistic over 2-element subsets of the population.
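
Side by side, as a sketch (data chosen arbitrarily):

def sample_variances(ys):
    n = len(ys)
    ybar = sum(ys) / n
    ss = sum((y - ybar) ** 2 for y in ys)
    return ss / n, ss / (n - 1)   # population-style and unbiased estimates

biased, unbiased = sample_variances([2, 4, 4, 4, 5, 5, 7, 9])
print(biased, unbiased)   # 4.0 and about 4.571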


Distribution of the sample variance

Being a function of random variables, the sample variance is itself a random variable, and it is natural to study its distribution.  In the case that yi are independent observations from a normal distribution, Cochran's theorem shows that s² follows a scaled chi-square distribution:

(n − 1)s²/σ² ~ χ²(n−1).

As a direct consequence, it follows that E(s²) = σ².

If the yi are independent and identically distributed, but not necessarily normally distributed, then

where κ is the kurtosis of the distribution. If the conditions of the law of large numbers hold, s2 is a consistent estimator of σ2.

Generalizations

If X is a vector-valued random variable, with values in Rn, and thought of as a column vector, then the natural generalization of variance is

Var(X) = E[(X − μ)(X − μ)ᵀ],

where μ = E(X) and (X − μ)ᵀ is the transpose of X − μ, and so is a row vector.  This variance is a positive semi-definite square matrix, commonly referred to as the covariance matrix.

If X is a complex-valued random variable, with values in Cn, then its variance is

Var(X) = E[(X − μ)(X − μ)†],

where (X − μ)† is the conjugate transpose of X − μ.  This variance is also a positive semi-definite square matrix.

History

The term variance was first introduced by Ronald Fisher in his 1918 paper The Correlation Between Relatives on the Supposition of Mendelian Inheritance:[5]

The great body of available statistics show us that the deviations of a human measurement from its mean follow very closely the Normal Law of Errors, and, therefore, that the variability may be uniformly measured by the standard deviation corresponding to the square root of the mean square error. When there are two independent causes of variability capable of producing in an otherwise uniform population distributions with standard deviations θ1 and θ2, it is found that the distribution, when both causes act together, has a standard deviation √(θ1² + θ2²). It is therefore desirable in analysing the causes of variability to deal with the square of the standard deviation as the measure of variability. We shall term this quantity the Variance...

Moment of inertia

The variance of a probability distribution is analogous to the moment of inertia in classical mechanics of a corresponding mass distribution along a line, with respect to rotation about its center of mass. It is because of this analogy that such things as the variance are called moments of probability distributions. The covariance matrix is related to the moment of inertia tensor for multivariate distributions. The moment of inertia of a cloud of n points with a covariance matrix of Σ is given by

This difference between moment of inertia in physics and in statistics is clear for points that are gathered along a line.  Suppose many points are close to the x-axis and distributed along it.  The covariance matrix might look like

That is, there is the most variance in the x direction.  However, physicists would consider this to have a low moment about the x-axis, so the moment-of-inertia tensor is

Overview

The moment of inertia of an object about a given axis describes how difficult it is to change its angular motion about that axis. Therefore, it encompasses not just how much mass the object has overall, but how far each bit of mass is from the axis. The farther out the object's mass is, the more rotational inertia the object has, and the more force is required to change its rotation rate. For example, consider two hoops, A and B, made of the same material and of equal mass. Hoop A is larger in diameter but thinner than B. It requires more effort to accelerate hoop A (change its angular velocity) because its mass is distributed farther from its axis of rotation: mass that is farther out from that axis must, for a given angular velocity, move more quickly than mass closer in. So in this case, hoop A has a larger moment of inertia than hoop B.

Divers reducing their moments of inertia to increase their rates of rotation

The moment of inertia of an object can change if its shape changes. A figure skater who begins a spin with arms outstretched provides a striking example. By pulling in her arms, she reduces her moment of inertia, causing her to spin faster (by the conservation of angular momentum).

The moment of inertia has two forms, a scalar form, I, (used when the axis of rotation is specified) and a more general tensor form that does not require the axis of rotation to be specified. The scalar moment of inertia, I, (often called simply the "moment of inertia") allows a succinct analysis of many simple problems in rotational dynamics, such as objects rolling down inclines and the behavior of pulleys. For instance, while a block of any shape will slide down a frictionless incline at the same rate, rolling objects may descend at different rates, depending on their moments of inertia. A hoop will descend more slowly than a solid disk of equal mass and radius because more of its mass is located far from the axis of rotation, and thus needs to move faster if the hoop rolls at the same angular velocity. However, for (more complicated) problems in which the axis of rotation can change, the scalar treatment is inadequate, and the tensor treatment must be used (although shortcuts are possible in special situations). Examples requiring such a treatment include gyroscopes, tops, and even satellites, all objects whose alignment can change.

The moment of inertia is also called the mass moment of inertia (especially by mechanical engineers) to avoid confusion with the second moment of area, which is sometimes called the moment of inertia (especially by structural engineers). The easiest way to differentiate these quantities is through their units (kg·m2 as opposed to m4). In addition, moment of inertia should not be confused with polar moment of inertia, which is a measure of an object's ability to resist torsion (twisting) only.

Scalar moment of inertia

Definition

A simple definition of the moment of inertia (with respect to a given axis of rotation) of any object, be it a point mass or a 3D-structure, is given by

I = ∫ r² dm,

where m is mass and r is the perpendicular distance to the axis of rotation.

Detailed analysis

The (scalar) moment of inertia of a point mass rotating about a known axis is defined by

I = mr².

The moment of inertia is additive.  Thus, for a rigid body consisting of N point masses mi with distances ri to the rotation axis, the total moment of inertia equals the sum of the point-mass moments of inertia:

I = Σi=1..N mi ri².

The mass distribution along the axis of rotation has no effect on the moment of inertia.

For a solid body described by a mass density function, ρ(r), the moment of inertia about a known axis can be calculated by integrating the square of the distance (weighted by the mass density) from a point in the body to the rotation axis:

where

V is the volume occupied by the object,
ρ is the spatial density function of the object, and
r = (r,θ,φ), (x,y,z), or (r,θ,z) is the vector (orthogonal to the axis of rotation) between the axis of rotation and the point in the body.

Diagram for the calculation of a disk's moment of inertia.  Here c is 1/2 and r is the radius used in determining the moment.

Based on dimensional analysis alone, the moment of inertia of a non-point object must take the form

I = c·M·L²,

where

M is the mass,
L is a length dimension taken from the centre of mass (in some cases, the length of the object is used instead), and
c is a dimensionless constant called the inertial constant that varies with the object in consideration.

Inertial constants are used to account for the differences in the placement of the mass from the center of rotation. Examples include:

c = 1, thin ring or thin-walled cylinder around its center,

c = 2/5, solid sphere around its center

c = 1/2, solid cylinder or disk around its center.

When c is 1, the length (L) is called the radius of gyration.

For more examples, see the List of moments of inertia.

Parallel axis theorem

Main article: Parallel axis theorem

Once the moment of inertia has been calculated for rotations about the center of mass of a rigid body, one can conveniently recalculate the moment of inertia for all parallel rotation axes as well, without having to resort to the formal definition.  If the axis of rotation is displaced by a distance r from the center of mass axis of rotation (e.g., spinning a disc about a point on its periphery, rather than through its center), the displaced and center moments of inertia are related as follows:

Idisplaced = Icenter + m·r².

This theorem is also known as the parallel axes rule and is a special case of Steiner's parallel-axis theorem.
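
A quick numeric check for a uniform disk of mass M and radius R spun about a point on its rim (values arbitrary; the disk's center moment uses c = 1/2 from the table above):

M, R = 2.0, 0.5

I_center = 0.5 * M * R ** 2        # solid disk about its own axis
I_rim = I_center + M * R ** 2      # parallel axis, displaced by r = R
print(I_rim, 1.5 * M * R ** 2)     # both 0.75: I_rim = (3/2) M R^2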

Composite bodies

If a body can be decomposed (either physically or conceptually) into several constituent parts, then the moment of inertia of the body about a given axis is obtained by summing the moments of inertia of each constituent part around the same given axis.[2]

Equations involving the moment of inertia

The rotational kinetic energy of a rigid body can be expressed in terms of its moment of inertia.  For a system with N point masses mi moving with speeds vi, the rotational kinetic energy T equals

T = Σ ½ mi vi² = Σ ½ mi ri² ω² = ½ ω² Σ mi ri² = ½ I ω²,

where ω is the common angular velocity (in radians per second).  The final expression, Iω²/2, also holds for a mass density function, with a generalization of the above derivation from a discrete summation to an integration.

In the special case where the angular momentum vector is parallel to the angular velocity vector, one can relate them by the equation

L = Iω,

where L is the angular momentum and ω is the angular velocity. However, this equation does not hold in many cases of interest, such as the torque-free precession of a rotating object, although its more general tensor form is always correct.

When the moment of inertia is constant, one can also relate the torque on an object and its angular acceleration in a similar equation:

τ = Iα,

where τ is the torque and α is the angular acceleration.

Moment of inertia tensor

In three dimensions, if the axis of rotation is not given, we need to be able to generalize the scalar moment of inertia to a quantity that allows us to compute a moment of inertia about arbitrary axes.  This quantity is known as the moment of inertia tensor and can be represented as a symmetric positive semi-definite matrix, I.  This representation elegantly generalizes the scalar case: the angular momentum vector L is related to the rotation velocity vector ω by

L = Iω,

and the kinetic energy is given by

T = ½ ω·Iω,

as compared with

T = ½ Iω²

in the scalar case.

Like the scalar moment of inertia, the moment of inertia tensor may be calculated with respect to any point in space, but for practical purposes, the center of mass is almost always used.

Definition

For a rigid object of N point masses mk, the moment of inertia tensor is given by

Iij = Σk=1..N mk (‖rk‖² δij − rk,i rk,j),

where rk = (rk,1, rk,2, rk,3) is the position vector of mass mk and δij is 1 when i = j and 0 otherwise, and I12 = I21, I13 = I31, and I23 = I32.  (Thus I is a symmetric tensor.)

Here Ixx denotes the moment of inertia around the x-axis when the objects are rotated around the x-axis, Ixy denotes the moment of inertia around the y-axis when the objects are rotated around the x-axis, and so on.
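
A sketch that builds the tensor for point masses and reduces it to a scalar about a chosen axis (using the point-mass formula as reconstructed above; names are mine):

import numpy as np

def inertia_tensor(masses, positions):
    # I = sum over k of m_k * (|r_k|^2 * E3 - outer(r_k, r_k))
    I = np.zeros((3, 3))
    for m, r in zip(masses, np.asarray(positions, dtype=float)):
        I += m * ((r @ r) * np.eye(3) - np.outer(r, r))
    return I

masses = [1.0, 1.0]
positions = [[1, 0, 0], [-1, 0, 0]]   # two unit masses on the x-axis
I = inertia_tensor(masses, positions)
print(I)                              # diag(0, 2, 2)

n = np.array([0.0, 0.0, 1.0])         # rotation axis: z
print(n @ I @ n)                      # 2.0 = sum of m * r^2 about z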

These quantities can be generalized to an object with distributed mass, described by a mass density function, in a similar fashion to the scalar moment of inertia.  One then has

I = ∫V ρ(r) (‖r‖² E3 − r ⊗ r) dV,

where r ⊗ r is the outer product, E3 is the 3 × 3 identity matrix, and V is a region of space completely containing the object.  Alternatively, the equation above can be represented in a component-based method.  Recognizing that, in the above expression, the scalars Iij with i ≠ j are called the products of inertia, a generalized form of the products of inertia can be given as

Iij = ∫V ρ(r) (‖r‖² δij − ri rj) dV.

The diagonal elements of $\mathbf{I}$ are the moments of inertia about the three coordinate axes; in the coordinate system that diagonalizes the tensor (see the principal axes of inertia below), they are called the principal moments of inertia.
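
The discrete definition above translates directly into a few lines of NumPy. This is an illustrative sketch, not code from the original; the function name and sample values are my own:

```python
import numpy as np

def inertia_tensor(masses, positions):
    """Inertia tensor about the origin for point masses.

    Implements I_ij = sum_k m_k (|r_k|^2 delta_ij - x_i^(k) x_j^(k)).
    """
    total = np.zeros((3, 3))
    for m, r in zip(masses, np.asarray(positions, dtype=float)):
        total += m * (np.dot(r, r) * np.eye(3) - np.outer(r, r))
    return total

masses = [1.0, 2.0]                        # hypothetical sample values
positions = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 1.0]]
I = inertia_tensor(masses, positions)
print(I)
assert np.allclose(I, I.T)                 # the tensor is symmetric
```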


Derivation of the tensor components

The distance $r$ of a particle at $\mathbf{x}$ from the axis of rotation passing through the origin in the $\hat{\mathbf{n}}$ direction is $\left\|\mathbf{x} - (\mathbf{x}\cdot\hat{\mathbf{n}})\hat{\mathbf{n}}\right\|$. By using the formula $I = mr^2$ (and some simple vector algebra), it can be seen that the moment of inertia of this particle (about the axis of rotation passing through the origin in the $\hat{\mathbf{n}}$ direction) is

$$I = m\left(\|\mathbf{x}\|^2 - (\mathbf{x}\cdot\hat{\mathbf{n}})^2\right).$$

This is a quadratic form in $\hat{\mathbf{n}}$ and, after a bit more algebra, this leads to a tensor formula for the moment of inertia:

$$I = m\,\hat{\mathbf{n}}^{\mathsf{T}}\left(\|\mathbf{x}\|^2\,\mathbf{E}_3 - \mathbf{x}\otimes\mathbf{x}\right)\hat{\mathbf{n}}.$$

This is exactly the formula given in the definition above for the moment of inertia in the case of a single particle: the matrix in parentheses is the single-particle inertia tensor. For multiple particles, we need only recall that the moment of inertia is additive in order to see that this formula is correct.

Reduction to scalar

For any axis $\hat{\mathbf{n}}$, represented as a column vector with elements $n_i$, the scalar moment of inertia $I$ about that axis can be calculated from the tensor form $\mathbf{I}$ as

$$I = \hat{\mathbf{n}}^{\mathsf{T}}\,\mathbf{I}\,\hat{\mathbf{n}} = \sum_{i=1}^{3}\sum_{j=1}^{3} n_i\, I_{ij}\, n_j.$$

The range of both summations corresponds to the three Cartesian coordinates.

The following equivalent expression avoids the use of transposed vectors, which some maths libraries do not support because a vector and its transpose are stored internally as the same linear array:

$$I = \sum_{j=1}^{3}\sum_{k=1}^{3} I_{jk}\, n_j n_k.$$

Although this equation is mathematically equivalent to the one above for any matrix, inertia tensors are symmetric. This means the sum can be further simplified to

$$I = \sum_{j=1}^{3} I_{jj}\, n_j^2 + 2\sum_{j=1}^{3}\sum_{k>j} I_{jk}\, n_j n_k.$$
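
As a check (again my own sketch, reusing the hypothetical sample masses and positions from the NumPy example above), the quadratic form $\hat{\mathbf{n}}^{\mathsf{T}}\mathbf{I}\,\hat{\mathbf{n}}$ reproduces the direct $\sum m r^2$ computation about the same axis:

```python
import numpy as np

masses = [1.0, 2.0]                        # same hypothetical values as above
positions = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 1.0]])

# Inertia tensor about the origin: I_ij = sum_k m_k (|r|^2 delta_ij - x_i x_j).
I = sum(m * (np.dot(r, r) * np.eye(3) - np.outer(r, r))
        for m, r in zip(masses, positions))

# Scalar moment about the z-axis via the quadratic form n^T I n ...
n_hat = np.array([0.0, 0.0, 1.0])
i_about_z = n_hat @ I @ n_hat

# ... matches the direct sum of m * (distance from the z-axis)^2.
i_direct = sum(m * (r[0]**2 + r[1]**2) for m, r in zip(masses, positions))
assert np.isclose(i_about_z, i_direct)
print(i_about_z)  # 3.0
```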

Principal axes of inertia

By the spectral theorem, since the moment of inertia tensor is real and symmetric, it is possible to find a Cartesian coordinate system in which it is diagonal, having the form

$$\mathbf{I} = \begin{bmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{bmatrix},$$

where the coordinate axes are called the principal axes and the constants $I_1$, $I_2$ and $I_3$ are called the principal moments of inertia. The principal axes of a body, therefore, form a Cartesian coordinate system whose origin is located at the center of mass.[3] The unit vectors along the principal axes are usually denoted ($e_1$, $e_2$, $e_3$). This result was first shown by J. J. Sylvester (1852), and is a form of Sylvester's law of inertia. The principal axis with the highest moment of inertia is sometimes called the figure axis or axis of figure.

When all principal moments of inertia are distinct, the principal axes are uniquely specified. If two principal moments are the same, the rigid body is called a symmetrical top and there is no unique choice for the two corresponding principal axes. If all three principal moments are the same, the rigid body is called a spherical top (although it need not be spherical) and any axis can be considered a principal axis, meaning that the moment of inertia is the same about any axis.
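
Numerically, the principal moments and principal axes are just the eigenvalues and eigenvectors of the tensor; a sketch using NumPy's symmetric eigensolver (the sample tensor is hypothetical):

```python
import numpy as np

# Hypothetical (symmetric) inertia tensor in some body-fixed frame.
I = np.array([[ 2.0, -0.3, 0.0],
              [-0.3,  1.5, 0.0],
              [ 0.0,  0.0, 1.0]])

# eigh is NumPy's eigensolver for symmetric matrices; eigenvalues ascend.
principal_moments, principal_axes = np.linalg.eigh(I)
print(principal_moments)    # the principal moments I1 <= I2 <= I3
print(principal_axes)       # columns are the unit vectors e1, e2, e3

# Repeated eigenvalues would indicate a symmetrical top (two equal)
# or a spherical top (all three equal).
```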

The principal axes are often aligned with the object's symmetry axes. If a rigid body has an axis of symmetry of order m, i.e., is symmetrical under rotations of 360°/m about a given axis, the symmetry axis is a principal axis. When m > 2, the rigid body is a symmetrical top. If a rigid body has at least two symmetry axes that are not parallel or perpendicular to each other, it is a spherical top, e.g., a cube or any other Platonic solid.

The motion of vehicles is often described about these axes with the rotations called yaw, pitch, and roll.

A practical example of this mathematical phenomenon is the routine automotive task of balancing a tire, which basically means adjusting the distribution of mass of a car wheel such that its principal axis of inertia is aligned with the axle so the wheel does not wobble.

Parallel axis theorem

Once the moment of inertia tensor has been calculated for rotations about the center of mass of the rigid body, there is a useful labor-saving method to compute the tensor for rotations offset from the center of mass.

If the axis of rotation is displaced by a vector $\mathbf{R}$ from the center of mass, the new moment of inertia tensor equals

$$\mathbf{I}_{\text{displaced}} = \mathbf{I}_{\text{cm}} + m\left(\|\mathbf{R}\|^2\,\mathbf{E}_3 - \mathbf{R}\otimes\mathbf{R}\right),$$

where $m$ is the total mass of the rigid body, $\mathbf{E}_3$ is the 3 × 3 identity matrix, and $\mathbf{R}\otimes\mathbf{R}$ is the outer product.
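
The tensor version is as mechanical to apply as the scalar one; a hedged NumPy sketch (function name and sample values are my own):

```python
import numpy as np

def shift_inertia_tensor(i_cm, mass, displacement):
    """Tensor parallel axis theorem:
    I_displaced = I_cm + m * (|R|^2 E3 - R (outer) R)."""
    R = np.asarray(displacement, dtype=float)
    return i_cm + mass * (np.dot(R, R) * np.eye(3) - np.outer(R, R))

i_cm = np.diag([1.0, 1.5, 2.0])   # hypothetical center-of-mass tensor
i_new = shift_inertia_tensor(i_cm, mass=2.0, displacement=[0.0, 0.0, 0.5])
print(i_new)   # note I_zz is unchanged: shifting along z adds nothing about z
```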

Rotational symmetry

Using the above equation to express all moments of inertia in terms of integrals of variables either along or perpendicular to the axis of symmetry usually simplifies the calculation of these moments considerably.

Comparison with covariance matrix

Main article: Moment (mathematics)

The moment of inertia tensor about the center of mass of a 3-dimensional rigid body is related to the covariance matrix $\Sigma$ of a trivariate random vector whose probability density function is proportional to the pointwise density of the rigid body by:[citation needed]

$$\mathbf{I} = n\left(\operatorname{tr}(\Sigma)\,\mathbf{E}_3 - \Sigma\right),$$

where n is the number of points.
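
This relation is easy to verify numerically for equal unit point masses; a sketch (random sample points, my own helper code; `np.cov` with `bias=True` gives the population covariance used here):

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.normal(size=(100, 3))
pts -= pts.mean(axis=0)          # work about the center of mass

n = len(pts)
# Inertia tensor of n unit point masses about their center of mass.
I = sum(np.dot(r, r) * np.eye(3) - np.outer(r, r) for r in pts)

# Population covariance of the same points (bias=True divides by n).
sigma = np.cov(pts.T, bias=True)
assert np.allclose(I, n * (np.trace(sigma) * np.eye(3) - sigma))
```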

The structure of the moment-of-inertia tensor comes from the fact that it is to be used as a bilinear form on rotation vectors in the form

$$T = \tfrac{1}{2}\,\boldsymbol{\omega}^{\mathsf{T}}\,\mathbf{I}\,\boldsymbol{\omega}.$$

Each element of mass has a kinetic energy of

$$T_k = \tfrac{1}{2} m_k \|\mathbf{v}_k\|^2.$$

The velocity of each element of mass is $\boldsymbol{\omega}\times\mathbf{r}$, where $\mathbf{r}$ is a vector from the center of rotation to that element of mass. The cross product can be converted to matrix multiplication, so that

$$\boldsymbol{\omega}\times\mathbf{r} = [\boldsymbol{\omega}]_{\times}\,\mathbf{r},$$

and similarly

$$\boldsymbol{\omega}\times\mathbf{r} = -\,\mathbf{r}\times\boldsymbol{\omega} = -[\mathbf{r}]_{\times}\,\boldsymbol{\omega}.$$

Thus,

$$T_k = \tfrac{1}{2} m_k \left([\mathbf{r}]_{\times}\boldsymbol{\omega}\right)^{\mathsf{T}}\left([\mathbf{r}]_{\times}\boldsymbol{\omega}\right) = \tfrac{1}{2} m_k\,\boldsymbol{\omega}^{\mathsf{T}}[\mathbf{r}]_{\times}^{\mathsf{T}}[\mathbf{r}]_{\times}\,\boldsymbol{\omega};$$

plugging in the definition of the skew-symmetric matrix $[\mathbf{r}]_{\times}$, the $m_k\,[\mathbf{r}]_{\times}^{\mathsf{T}}[\mathbf{r}]_{\times}$ term leads directly to the structure of the moment tensor, $m_k\left(\|\mathbf{r}\|^2\,\mathbf{E}_3 - \mathbf{r}\otimes\mathbf{r}\right)$.
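
The key identity behind that last step, $[\mathbf{r}]_{\times}^{\mathsf{T}}[\mathbf{r}]_{\times} = \|\mathbf{r}\|^2\,\mathbf{E}_3 - \mathbf{r}\otimes\mathbf{r}$, can be verified numerically; a small sketch with my own helper:

```python
import numpy as np

def skew(r):
    """Skew-symmetric matrix [r]_x such that skew(r) @ v == np.cross(r, v)."""
    x, y, z = r
    return np.array([[0.0,  -z,   y],
                     [  z, 0.0,  -x],
                     [ -y,   x, 0.0]])

r = np.array([1.0, -2.0, 0.5])
lhs = skew(r).T @ skew(r)
rhs = np.dot(r, r) * np.eye(3) - np.outer(r, r)   # |r|^2 E3 - r (outer) r
assert np.allclose(lhs, rhs)
```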
