TRANSCRIPT
1
Gauss and the Method of Least Squares
Teddy Petrou, Hongxiao Zhu
2
Outline
Who was Gauss?
Why was there controversy in finding the method of least squares?
Gauss' treatment of error
Gauss' derivation of the method of least squares
Gauss' derivation by modern matrix notation
Gauss-Markov theorem
Limitations of the method of least squares
References
3
Johann Carl Friedrich Gauss
Born: April 30, 1777, Brunswick, Germany
Died: February 23, 1855, Göttingen, Germany
At the age of eight, during arithmetic class, he astonished his teachers by instantly finding the sum of the first hundred integers.
4
Facts about Gauss
Attended Brunswick College in 1792, where he discovered many important theorems before even reaching them in his studies
Found a square root in two different ways to fifty decimal places by ingenious expansions and interpolations
Constructed a regular 17-sided polygon, the first advance in this matter in two millennia. He was only 18 when he made the discovery.
5
Ideas of Gauss
Gauss was a mathematical scientist whose interests as a young man spanned many areas, including number theory, algebra, analysis, geometry, probability, and the theory of errors.
His interests grew to include observational astronomy, celestial mechanics, surveying, geodesy, capillarity, geomagnetism, electromagnetism, mechanics, optics, and actuarial science.
6
Intellectual Personality and Controversy
Those who knew Gauss best found him to be cold and uncommunicative.
He published only half of his ideas and found no one with whom to share his most valued thoughts.
In 1805 Adrien-Marie Legendre published a paper on the method of least squares. His treatment, however, lacked a 'formal consideration of probability and its relationship to least squares', making it impossible to determine the accuracy of the method when applied to real observations.
Gauss claimed that he had written to colleagues concerning the use of least squares dating back to 1795.
7
Formal Arrival of Least Squares
• Gauss
– Published 'Theory of the Motion of the Heavenly Bodies' in 1809. He gave a probabilistic justification of the method, which was based on the assumption of a normal distribution of errors. Gauss himself later abandoned the use of the normal error function.
– Published 'Theory of the Combination of Observations Least Subject to Errors' in the 1820s. He substituted the root mean square error for Laplace's mean absolute error.
• Laplace
– Derived the method of least squares (between 1802 and 1820) from the principle that the best estimate should have the smallest 'mean error': the mean of the absolute value of the error.
8
Treatment of Errors
Using probability theory to describe error
Error will be treated as a random variable
Two types of errors:
– Constant: associated with calibration
– Random error
9
Error Assumptions
Gauss began his study by making two assumptions:
– Random errors of measurements of the same type lie within fixed limits
– All errors within these limits are possible, but not necessarily with equal likelihood
10
Density Function
We define the function $\phi(x)$ with the same meaning as a density function, with the following properties:
– Positive and negative errors of the same magnitude are equally likely: $\phi(x) = \phi(-x)$
– Small errors are more likely to occur than large ones
– The probability of an error lying within the interval $(x, x + dx)$ is $\phi(x)\,dx$
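As a quick illustration of these properties (not from the original slides), here is a minimal numerical check in Python, taking the standard normal density as an example error law; the slide's $\phi$ is generic, so this density is only an assumed instance:

```python
import numpy as np

# Example density: standard normal (an assumption for illustration only).
phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

x = np.linspace(-8, 8, 200_001)
dx = x[1] - x[0]

# 1. Symmetry: phi(x) = phi(-x)
assert np.allclose(phi(x), phi(-x))

# 2. Small errors are more likely than large ones: phi decreases in |x|
half = x[x >= 0]
assert np.all(np.diff(phi(half)) <= 0)

# 3. phi(x)dx behaves as a probability: total mass integrates to ~1
print(np.sum(phi(x)) * dx)
```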
11
Mean and Variance
Define the mean error $k = \int x\,\phi(x)\,dx$. In many cases we assume $k = 0$.
Define the mean square error as $m^2 = \int x^2\,\phi(x)\,dx$.
If $k = 0$, then the variance will equal $m^2$.
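A small numerical companion to these definitions (again assuming the standard normal as an illustrative $\phi$), computing $k$ and $m^2$ by simple quadrature:

```python
import numpy as np

# Illustrative density; the slide's phi is generic.
phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

x = np.linspace(-8, 8, 200_001)
dx = x[1] - x[0]

k  = np.sum(x * phi(x)) * dx        # mean error: ~0 by symmetry
m2 = np.sum(x**2 * phi(x)) * dx     # mean square error: ~1 here

print(k, m2)                        # with k = 0, the variance equals m^2
```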
12
Reasons for $m^2$
– $m^2$ is always positive and is simple.
– The function is differentiable and integrable, unlike the absolute value function.
– The function approximates the average value in cases where large numbers of observations are being considered, and is simple to use when considering small numbers of observations.
13
More on Variance
If $k \ne 0$, then the variance equals $m^2 - k^2$. Suppose we have independent random variables $\{e, e', e'', \ldots\}$ with standard deviation 1 and expected value 0. The linear function of the total errors is given by
$$E = \lambda_1 e_1 + \lambda_2 e_2 + \cdots + \lambda_k e_k = \sum_{i=1}^{k} \lambda_i e_i.$$
Now the variance of $E$ is given as
$$M^2 = \sum_{i=1}^{k} \lambda_i^2.$$
This assumes every error falls within fixed limits, a bounded number of standard deviations from the mean.
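A Monte Carlo sketch of this variance formula (the $\lambda_i$ values below are arbitrary illustrative coefficients, not from the slides):

```python
import numpy as np

# Check that Var(E) = sum(lambda_i^2) when the e_i are independent
# with mean 0 and standard deviation 1.
rng = np.random.default_rng(0)
lam = np.array([0.5, -1.2, 2.0, 0.3])

e = rng.standard_normal((1_000_000, lam.size))  # unit-variance errors
E = e @ lam                                     # E = sum_i lambda_i e_i

print(E.var(), (lam**2).sum())                  # both ~ 5.78
```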
14
Gauss’ Derivation of the Method of Least Squares
Suppose a quantity V = f(x), where V and x are unknown. We estimate V by an observation L.
If x is calculated from L by solving L ≈ f(x), error will occur.
But if several quantities V,V’,V’’…depend on the same unknown x and they are determined by inexact observations, then we can recover x by some combinations of the observations.
Similar situations occur when we observe several quantities that depend on several unknowns.
15
Gauss’ Derivation of the Method of Least Squares
Problem:
We want to estimate $V, V', V'', \ldots$ by taking independent observations $L, L', L'', \ldots$, where $V, V', V'', \ldots$ are functions of unknowns $x, y, z, \ldots$:
$$V = f(x, y, z, \ldots), \quad V' = f'(x, y, z, \ldots), \quad V'' = f''(x, y, z, \ldots)$$
Let the errors in the observations be:
$$v = \frac{V - L}{p}, \quad v' = \frac{V' - L'}{p'}, \quad v'' = \frac{V'' - L''}{p''}, \ldots$$
where the $p$'s are the weights given by the 'mean errors' of the observations.
(Note: we scaled the errors so they have the same variance.)
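The effect of this rescaling can be seen numerically. A minimal sketch (the $p$ values are made up for illustration): dividing each raw error $V - L$ by its mean error $p$ yields scaled errors with a common variance.

```python
import numpy as np

rng = np.random.default_rng(5)
p = np.array([0.5, 1.0, 3.0])                  # unequal 'mean errors'

raw = p * rng.standard_normal((100_000, 3))    # raw errors V - L
v = raw / p                                    # scaled errors

print(raw.var(axis=0))   # ~ [0.25, 1.0, 9.0] - unequal variances
print(v.var(axis=0))     # ~ [1.0, 1.0, 1.0]  - equalized
```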
16
Gauss’ Derivation of the Method of Least Squares
Consider the following linear system:
$$v = ax + by + cz - l$$
$$v' = a'x + b'y + c'z - l'$$
$$v'' = a''x + b''y + c''z - l''$$
where $v, v', v'', \ldots$ are written as linear functions of the unknowns $x, y, z, \ldots$, and the coefficients $a, b, c, \ldots$ are known.
Note:
1. This system is 'overdetermined', since there are more equations than unknowns.
2. This system describes a mapping $F: \mathbb{R}^p \to \mathbb{R}^n$, or: parameter space $(x, y, z, \ldots)$ → observation space $(v, v', v'', \ldots)$.
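A toy instance of such an overdetermined system (all numbers below are made up for illustration): with more equations than unknowns, no choice of $(x, y, z)$ makes every $v$ vanish, and least squares makes the residuals as small as possible overall.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.5],
              [0.3, 1.0, 1.0],
              [2.0, 0.1, 0.7],
              [1.5, 1.5, 1.5],
              [0.2, 0.9, 2.0]])      # rows: (a, b, c), (a', b', c'), ...
l = np.array([3.1, 2.0, 2.9, 4.6, 3.2])

# 5 equations in 3 unknowns: the residual vector v = Ax - l cannot be
# driven to zero, only minimized in the least squares sense.
xyz, res, rank, _ = np.linalg.lstsq(A, l, rcond=None)
print(xyz)
print(A @ xyz - l)                   # nonzero residuals
```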
17
Gauss' Derivation of the Method of Least Squares
We can state the problem as: we are looking for a linear mapping $G(v, v', v'', \ldots)$ from $\mathbb{R}^n$ to $\mathbb{R}^p$ such that:
1. $G \circ F$ is the identity on $\mathbb{R}^p$.
2. $G$ satisfies an optimality condition, described below.
Suppose $g_x(v, v', v'', \ldots)$ is the first component of $G$. Then
$$g_x(v, v', v'', \ldots) = \alpha v + \alpha' v' + \alpha'' v'' + \cdots$$
We want $\alpha^2 + \alpha'^2 + \alpha''^2 + \cdots$ to be as small as possible, and we want a similar condition for the other components.
Solve an optimization problem:
$$\min \; \alpha^2 + \alpha'^2 + \alpha''^2 + \cdots$$
s.t. $x = \alpha v + \alpha' v' + \alpha'' v'' + \cdots + k$, etc., where $\alpha, \alpha', \alpha''$ are the coefficients of $v, v', v''$ and $k$ is some constant independent of $x, y, z$.
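A numerical sketch of this optimality condition (the matrix $A$ and perturbation below are illustrative assumptions): among linear maps that invert the system, the Moore-Penrose pseudoinverse has rows of minimal squared coefficient sum.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

E = np.linalg.pinv(A)                   # E @ A = I; rows have minimum norm

# Any other left inverse K = E + N with N @ A = 0 has larger row norms.
# Build such an N from a vector orthogonal to the columns of A:
n = np.array([1.0, -3.0, 3.0, -1.0])    # A.T @ n == 0 for this A
K = E + np.outer(np.array([0.2, 0.1]), n)

print(np.allclose(E @ A, np.eye(2)), np.allclose(K @ A, np.eye(2)))
print((E**2).sum(axis=1))               # row norms of E ...
print((K**2).sum(axis=1))               # ... are smaller than those of K
```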
18
Gauss’ Derivation of the Method of Least Squares
Solutions:
$$x = \xi, \quad y = \xi', \quad z = \xi'', \quad \text{etc.}$$
where all the $\xi$'s denote the coefficients we derived by elimination from the system. From this it is obvious that the sum $v^2 + v'^2 + v''^2 + \text{etc.}$ attains its minimum when $x = \xi$, $y = \xi'$, $z = \xi''$, etc.
It's still not obvious: how do these results relate to least squares estimation?
19
Gauss’ Derivation of the Method of Least Squares
Let
$$\Omega = v^2 + v'^2 + v''^2 + \cdots = \left(\frac{V(x, y, z, \ldots) - L}{p}\right)^2 + \left(\frac{V'(x, y, z, \ldots) - L'}{p'}\right)^2 + \cdots$$
Least squares picks the parameter values that minimize $\Omega$, i.e. the values where all the partials vanish:
$$\frac{\partial \Omega}{\partial x} = 0, \quad \frac{\partial \Omega}{\partial y} = 0, \quad \ldots$$
It can be proved that the minimization of $\Omega$ gives the same results as the optimization problem above.
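As a sanity check (with made-up data), the stationarity conditions — the normal equations $A^TA\,x = A^T l$ — can be compared against a brute-force minimization of $\Omega$:

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
l = np.array([0.9, 2.1, 2.9, 4.2])

# Solving grad Omega = 0 (the normal equations):
x_normal = np.linalg.solve(A.T @ A, A.T @ l)

# Independent check: minimize Omega over a grid.
gx, gy = np.meshgrid(np.linspace(0, 2, 401), np.linspace(0, 2, 401))
omega = ((A[:, 0, None, None] * gx + A[:, 1, None, None] * gy
          - l[:, None, None])**2).sum(axis=0)
i, j = np.unravel_index(omega.argmin(), omega.shape)

print(x_normal, (gx[i, j], gy[i, j]))   # the two minimizers agree
```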
20
Gauss’ derivation by modern matrix notation:
Assume that observable quantities $V_1, V_2, \ldots, V_n$ are linear functions of parameters $x_1, x_2, \ldots, x_p$, such that
$$V_i = b_{i1}x_1 + \cdots + b_{ip}x_p + c_i, \qquad b_{ij}, c_i \in \mathbb{R},$$
and we know the values of all the $b_{ij}$ and $c_i$.
Assume $L_i$ is an observation of $V_i$. We measure the $L_i$ in an attempt to infer the values of the $x_j$.
Switch to a new coordinate system by setting
$$v_i = (V_i - L_i)/p_i.$$
The system becomes:
$$v = Ax - l$$
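In this matrix form the estimate is $\hat{x} = (A^TA)^{-1}A^T l$. A minimal sketch with simulated data (the matrix, true parameters, and noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 8, 3
A = rng.standard_normal((n, p))
x_true = np.array([1.0, -2.0, 0.5])
l = A @ x_true + 0.1 * rng.standard_normal(n)   # noisy observations

x_hat = np.linalg.solve(A.T @ A, A.T @ l)       # (A^T A)^{-1} A^T l
print(x_hat)                                    # close to x_true
print(np.linalg.norm(A @ x_hat - l))            # minimized residual norm
```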
21
Gauss’ derivation by modern matrix notation:
Gauss' results are equivalent to the following lemma:
Suppose $A$ is an $(n \times p)$ matrix of rank $p$. Then there is a $(p \times n)$ matrix $K$ such that the following holds:
$$K(Ax) = x, \quad \forall x \in \mathbb{R}^p,$$
and among all such matrices, the matrix $E = (A^TA)^{-1}A^T$ has rows of minimum norm.
The optimization condition is that $K_{i1}^2 + K_{i2}^2 + \cdots$ should be as small as possible for each row $i$. This is equivalent to demanding that the sum of the diagonal entries of $KK^T$ should be as small as possible.
Proof: $E = (A^TA)^{-1}A^T$ satisfies the first condition.
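A numerical check of the lemma (random illustrative data): $E = (A^TA)^{-1}A^T$ is a left inverse, and it minimizes $\mathrm{trace}(KK^T)$ over other left inverses built by perturbing $E$ within the null space of $A^T$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 2))

E = np.linalg.solve(A.T @ A, A.T)        # E = (A^T A)^{-1} A^T
assert np.allclose(E @ A, np.eye(2))

# Random perturbations N with N @ A = 0 give other left inverses K = E + N.
P = np.eye(6) - A @ E                    # projector onto null(A^T)
for _ in range(5):
    K = E + rng.standard_normal((2, 6)) @ P
    assert np.allclose(K @ A, np.eye(2))
    assert np.trace(K @ K.T) >= np.trace(E @ E.T) - 1e-12
print("E minimizes trace(K K^T) among all tried left inverses")
```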
22
Proof continued:
$A^TA$ is invertible; denote $E = (A^TA)^{-1}A^T$. Thus
$$E(Ax) = (A^TA)^{-1}A^T(Ax) = (A^TA)^{-1}(A^TA)x = x.$$
Also, for any other left inverse we have $K(Ax) = x$ and $E(Ax) = x$. Subtracting, we get $(K - E)(Ax) = 0$ for all $x$; thus $(K - E)A$ is the zero matrix.
Right multiplying by $E^T = A(A^TA)^{-1}$ and noting that $(K - E)A = 0$, we get $(K - E)E^T = 0$.
Finally,
$$KK^T = (E + (K - E))(E + (K - E))^T = EE^T + (K - E)(K - E)^T.$$
This shows that the solution is in fact the optimal one, since if $(K - E)$ has any non-zero entries, $(K - E)(K - E)^T$ will have strictly positive entries on its diagonal.
Returning to our original equation $v = Ax - l$, our lemma shows that $G(v) := E(v + l)$ is the left inverse of the function $F(x) = Ax - l$, since $G(F(x)) = E(Ax - l + l) = E(Ax) = x$, and among all linear left inverses, the non-constant part of $G$ is optimal.
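The key identity in this proof can be verified numerically (random illustrative data): because $(K - E)E^T = 0$, the cross terms drop out of $KK^T$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((7, 3))

E = np.linalg.solve(A.T @ A, A.T)
P = np.eye(7) - A @ E                     # projector onto null(A^T)
K = E + rng.standard_normal((3, 7)) @ P   # another left inverse of A

assert np.allclose((K - E) @ E.T, np.zeros((3, 3)))
assert np.allclose(K @ K.T, E @ E.T + (K - E) @ (K - E).T)
print("K K^T = E E^T + (K - E)(K - E)^T verified")
```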
23
Gauss-Markov theorem
In a linear model $L = Ax + \varepsilon$, where $A$ is an $(n \times p)$ matrix with rank $p$ $(n > p)$, $x$ is an unknown vector, and $\varepsilon$ is the error vector: if $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$, then for any unbiased estimator $\tilde{\theta}$ of $C^T x$, we have $E(C^T\hat{x}_{LS}) = C^T x$ and $\mathrm{Var}(C^T\hat{x}_{LS}) \le \mathrm{Var}(\tilde{\theta})$.
In other words, when the errors have the same variance and are uncorrelated, the least-squares estimator is the best linear unbiased estimator, the one with the smallest variance.
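A simulation sketch of the theorem (the design matrix, parameters, and competing estimator below are illustrative assumptions): both estimators of $c^Tx$ are unbiased, but least squares has the smaller variance.

```python
import numpy as np

rng = np.random.default_rng(4)
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
x = np.array([2.0, -1.0])
c = np.array([1.0, 1.0])

E = np.linalg.solve(A.T @ A, A.T)           # least squares: x_hat = E @ L
P = np.eye(4) - A @ E                       # P @ A = 0
K = E + np.array([[1.0], [-0.5]]) @ P[:1]   # a competing unbiased estimator

L = A @ x + rng.standard_normal((100_000, 4))  # repeated noisy observations
ls, other = L @ E.T @ c, L @ K.T @ c           # c^T x estimates per run

print(ls.mean(), other.mean())   # both ~ c^T x = 1.0 (unbiased)
print(ls.var(), other.var())     # least squares has the smaller variance
```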
24
Limitation of the Method of Least Squares
Nothing is perfect:
This method is very sensitive to the presence of unusual data points. One or two outliers can sometimes seriously skew the results of a least squares analysis.
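A quick illustration of this sensitivity (the data are made up): a single wild observation noticeably pulls a least squares line fit away from the trend of the other points.

```python
import numpy as np

x = np.arange(10.0)
y = 2.0 * x + 1.0                     # exact line: intercept 1, slope 2
y_out = y.copy()
y_out[9] = 60.0                       # one outlier (true value is 19)

A = np.column_stack([np.ones_like(x), x])
clean = np.linalg.lstsq(A, y, rcond=None)[0]
skewed = np.linalg.lstsq(A, y_out, rcond=None)[0]

print(clean)    # [1., 2.] recovered exactly
print(skewed)   # intercept and slope distorted by a single point
```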
25
References
Gauss, Carl Friedrich, Translated by G. W. Stewart. 1995. Theory of the Combination of Observations Least Subject to Errors: Part One, Part Two, Supplement. Philadelphia: Society for Industrial and Applied Mathematics.
Plackett, R. L. 1949. A Historical Note on the Method of Least Squares. Biometrika. 36:458–460.
Stigler, Stephen M. 1981. Gauss and the Invention of Least Squares. The Annals of Statistics. 9(3):465–474.
Plackett, Robin L. 1972. The Discovery of the Method of Least Squares. Biometrika. 59:239–251.
Brand, Belinda B. 2003. Gauss' Method of Least Squares: A Historically-Based Introduction.
http://www.infoplease.com/ce6/people/A0820346.html
http://www.stetson.edu/~efriedma/periodictable/html/Ga.html