TRANSCRIPT
1
Gauss and the Method of Least Squares
Teddy Petrou, Hongxiao Zhu
2
Outline
Who was Gauss?
Why was there controversy in finding the method of least squares?
Gauss' treatment of error
Gauss' derivation of the method of least squares
Gauss' derivation by modern matrix notation
Gauss-Markov theorem
Limitations of the method of least squares
References
3
Johann Carl Friedrich Gauss
Born: April 30, 1777, Brunswick, Germany
Died: February 23, 1855, Göttingen, Germany
At the age of eight, during arithmetic class, he astonished his teachers by instantly finding the sum of the first hundred integers.
4
Facts about Gauss
Attended Brunswick College in 1792, where he discovered many important theorems before even reaching them in his studies
Found a square root in two different ways to fifty decimal places by ingenious expansions and interpolations
Constructed a regular 17-sided polygon, the first advance in this matter in two millennia. He was only 18 when he made the discovery.
5
Ideas of Gauss
Gauss was a mathematical scientist whose interests as a young man spanned many areas, including number theory, algebra, analysis, geometry, probability, and the theory of errors.
His interests grew to include observational astronomy, celestial mechanics, surveying, geodesy, capillarity, geomagnetism, electromagnetism, mechanics, optics, and actuarial science.
6
Intellectual Personality and Controversy
Those who knew Gauss best found him to be cold and uncommunicative.
He published only half of his ideas and found no one with whom to share his most valued thoughts.
In 1805 Adrien-Marie Legendre published a paper on the method of least squares. His treatment, however, lacked a 'formal consideration of probability and its relationship to least squares', making it impossible to determine the accuracy of the method when applied to real observations.
Gauss claimed that he had written to colleagues concerning the use of least squares dating back to 1795.
7
Formal Arrival of Least Squares
• Gauss
– Published 'Theory of the Motion of the Heavenly Bodies' in 1809. He gave a probabilistic justification of the method, which was based on the assumption of a normal distribution of errors. Gauss himself later abandoned the use of the normal error function.
– Published 'Theory of the Combination of Observations Least Subject to Errors' in the 1820s. He substituted the root mean square error for Laplace's mean absolute error.
• Laplace
– Derived the method of least squares (between 1802 and 1820) from the principle that the best estimate should have the smallest 'mean error': the mean of the absolute value of the error.
8
Treatment of Errors
Using probability theory to describe error
Error will be treated as a random variable
Two types of errors:
– Constant: associated with calibration
– Random error
9
Error Assumptions
Gauss began his study by making two assumptions:
– Random errors of measurements of the same type lie within fixed limits
– All errors within these limits are possible, but not necessarily with equal likelihood
10
Density Function
We define the function $\phi(x)$ with the same meaning as a density function, with the following properties:
– Positive and negative errors of the same magnitude are equally likely: $\phi(x) = \phi(-x)$
– Small errors are more likely to occur than large ones
– The probability of an error lying within the interval $(x, x + dx)$ is $\phi(x)\,dx$
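As a quick illustration of these properties (not from the original slides), here is a minimal numerical check in Python, taking the standard normal density as an example error law; the slide's $\phi$ is generic, so this density is only an assumed instance:

```python
import numpy as np

# Example density: standard normal (an assumption for illustration only).
phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

x = np.linspace(-8, 8, 200_001)
dx = x[1] - x[0]

# 1. Symmetry: phi(x) = phi(-x)
assert np.allclose(phi(x), phi(-x))

# 2. Small errors are more likely than large ones: phi decreases in |x|
half = x[x >= 0]
assert np.all(np.diff(phi(half)) <= 0)

# 3. phi(x)dx behaves as a probability: total mass integrates to ~1
print(np.sum(phi(x)) * dx)
```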
11
Mean and Variance
Define the mean error $k = \int x\,\phi(x)\,dx$. In many cases we assume $k = 0$.
Define the mean square error as $m^2 = \int x^2\,\phi(x)\,dx$.
If $k = 0$, then the variance will equal $m^2$.
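A small numerical companion to these definitions (again assuming the standard normal as an illustrative $\phi$), computing $k$ and $m^2$ by simple quadrature:

```python
import numpy as np

# Illustrative density; the slide's phi is generic.
phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

x = np.linspace(-8, 8, 200_001)
dx = x[1] - x[0]

k  = np.sum(x * phi(x)) * dx        # mean error: ~0 by symmetry
m2 = np.sum(x**2 * phi(x)) * dx     # mean square error: ~1 here

print(k, m2)                        # with k = 0, the variance equals m^2
```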
12
Reasons for $m^2$
– $m^2$ is always positive and is simple.
– The function is differentiable and integrable, unlike the absolute value function.
– The function approximates the average value in cases where large numbers of observations are being considered, and is simple to use when considering small numbers of observations.
13
More on Variance
If $k \ne 0$, then the variance equals $m^2 - k^2$. Suppose we have independent random variables $\{e, e', e'', \ldots\}$ with standard deviation 1 and expected value 0. The linear function of the total errors is given by
$$E = \lambda_1 e_1 + \lambda_2 e_2 + \cdots + \lambda_k e_k = \sum_{i=1}^{k} \lambda_i e_i.$$
Now the variance of $E$ is given as
$$M^2 = \sum_{i=1}^{k} \lambda_i^2.$$
This assumes every error falls within fixed limits, a bounded number of standard deviations from the mean.
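A Monte Carlo sketch of this variance formula (the $\lambda_i$ values below are arbitrary illustrative coefficients, not from the slides):

```python
import numpy as np

# Check that Var(E) = sum(lambda_i^2) when the e_i are independent
# with mean 0 and standard deviation 1.
rng = np.random.default_rng(0)
lam = np.array([0.5, -1.2, 2.0, 0.3])

e = rng.standard_normal((1_000_000, lam.size))  # unit-variance errors
E = e @ lam                                     # E = sum_i lambda_i e_i

print(E.var(), (lam**2).sum())                  # both ~ 5.78
```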
14
Gauss’ Derivation of the Method of Least Squares
Suppose a quantity V = f(x), where V and x are unknown. We estimate V by an observation L.
If x is calculated from L by solving L ≈ f(x), error will occur.
But if several quantities V,V’,V’’…depend on the same unknown x and they are determined by inexact observations, then we can recover x by some combinations of the observations.
Similar situations occur when we observe several quantities that depend on several unknowns.
15
Gauss’ Derivation of the Method of Least Squares
Problem:
We want to estimate $V, V', V'', \ldots$ by taking independent observations $L, L', L'', \ldots$, where $V, V', V'', \ldots$ are functions of unknowns $x, y, z, \ldots$:
$$V = f(x, y, z, \ldots), \quad V' = f'(x, y, z, \ldots), \quad V'' = f''(x, y, z, \ldots)$$
Let the errors in the observations be:
$$v = \frac{V - L}{p}, \quad v' = \frac{V' - L'}{p'}, \quad v'' = \frac{V'' - L''}{p''}, \ldots$$
where the $p$'s are the weights given by the 'mean errors' of the observations.
(Note: we scaled the errors so they have the same variance.)
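The effect of this rescaling can be seen numerically. A minimal sketch (the $p$ values are made up for illustration): dividing each raw error $V - L$ by its mean error $p$ yields scaled errors with a common variance.

```python
import numpy as np

rng = np.random.default_rng(5)
p = np.array([0.5, 1.0, 3.0])                  # unequal 'mean errors'

raw = p * rng.standard_normal((100_000, 3))    # raw errors V - L
v = raw / p                                    # scaled errors

print(raw.var(axis=0))   # ~ [0.25, 1.0, 9.0] - unequal variances
print(v.var(axis=0))     # ~ [1.0, 1.0, 1.0]  - equalized
```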
16
Gauss’ Derivation of the Method of Least Squares
Consider the following linear system:
$$v = ax + by + cz - l$$
$$v' = a'x + b'y + c'z - l'$$
$$v'' = a''x + b''y + c''z - l''$$
where $v, v', v'', \ldots$ are written as linear functions of the unknowns $x, y, z, \ldots$, and the coefficients $a, b, c, \ldots$ are known.
Note:
1. This system is 'overdetermined', since there are more equations than unknowns.
2. This system describes a mapping $F: \mathbb{R}^p \to \mathbb{R}^n$, or: parameter space $(x, y, z, \ldots)$ → observation space $(v, v', v'', \ldots)$.
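A toy instance of such an overdetermined system (all numbers below are made up for illustration): with more equations than unknowns, no choice of $(x, y, z)$ makes every $v$ vanish, and least squares makes the residuals as small as possible overall.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.5],
              [0.3, 1.0, 1.0],
              [2.0, 0.1, 0.7],
              [1.5, 1.5, 1.5],
              [0.2, 0.9, 2.0]])      # rows: (a, b, c), (a', b', c'), ...
l = np.array([3.1, 2.0, 2.9, 4.6, 3.2])

# 5 equations in 3 unknowns: the residual vector v = Ax - l cannot be
# driven to zero, only minimized in the least squares sense.
xyz, res, rank, _ = np.linalg.lstsq(A, l, rcond=None)
print(xyz)
print(A @ xyz - l)                   # nonzero residuals
```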
17
Gauss' Derivation of the Method of Least Squares
We can state the problem as: we are looking for a linear mapping $G(v, v', v'', \ldots)$ from $\mathbb{R}^n$ to $\mathbb{R}^p$ such that:
1. $G \circ F$ is the identity on $\mathbb{R}^p$.
2. $G$ satisfies an optimality condition, described below.
Suppose $g_x(v, v', v'', \ldots)$ is the first component of $G$. Then
$$g_x(v, v', v'', \ldots) = \alpha v + \alpha' v' + \alpha'' v'' + \cdots$$
We want $\alpha^2 + \alpha'^2 + \alpha''^2 + \cdots$ to be as small as possible, and we want a similar condition for the other components.
Solve an optimization problem:
$$\min \; \alpha^2 + \alpha'^2 + \alpha''^2 + \cdots$$
s.t. $x = \alpha v + \alpha' v' + \alpha'' v'' + \cdots + k$, etc., where $\alpha, \alpha', \alpha''$ are the coefficients of $v, v', v''$ and $k$ is some constant independent of $x, y, z$.
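A numerical sketch of this optimality condition (the matrix $A$ and perturbation below are illustrative assumptions): among linear maps that invert the system, the Moore-Penrose pseudoinverse has rows of minimal squared coefficient sum.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

E = np.linalg.pinv(A)                   # E @ A = I; rows have minimum norm

# Any other left inverse K = E + N with N @ A = 0 has larger row norms.
# Build such an N from a vector orthogonal to the columns of A:
n = np.array([1.0, -3.0, 3.0, -1.0])    # A.T @ n == 0 for this A
K = E + np.outer(np.array([0.2, 0.1]), n)

print(np.allclose(E @ A, np.eye(2)), np.allclose(K @ A, np.eye(2)))
print((E**2).sum(axis=1))               # row norms of E ...
print((K**2).sum(axis=1))               # ... are smaller than those of K
```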
18
Gauss’ Derivation of the Method of Least Squares
Solutions:
$$x = \xi, \quad y = \xi', \quad z = \xi'', \quad \text{etc.}$$
where all the $\xi$'s denote the coefficients we derived by elimination from the system. From this it is obvious that the sum $v^2 + v'^2 + v''^2 + \text{etc.}$ attains its minimum when $x = \xi$, $y = \xi'$, $z = \xi''$, etc.
It's still not obvious: how do these results relate to least squares estimation?
19
Gauss’ Derivation of the Method of Least Squares
Let
$$\Omega = v^2 + v'^2 + v''^2 + \cdots = \left(\frac{V(x, y, z, \ldots) - L}{p}\right)^2 + \left(\frac{V'(x, y, z, \ldots) - L'}{p'}\right)^2 + \cdots$$
Least squares picks the parameter values that minimize $\Omega$, i.e. the values where all the partials vanish:
$$\frac{\partial \Omega}{\partial x} = 0, \quad \frac{\partial \Omega}{\partial y} = 0, \quad \ldots$$
It can be proved that the minimization of $\Omega$ gives the same results as the optimization problem above.
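As a sanity check (with made-up data), the stationarity conditions — the normal equations $A^TA\,x = A^T l$ — can be compared against a brute-force minimization of $\Omega$:

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
l = np.array([0.9, 2.1, 2.9, 4.2])

# Solving grad Omega = 0 (the normal equations):
x_normal = np.linalg.solve(A.T @ A, A.T @ l)

# Independent check: minimize Omega over a grid.
gx, gy = np.meshgrid(np.linspace(0, 2, 401), np.linspace(0, 2, 401))
omega = ((A[:, 0, None, None] * gx + A[:, 1, None, None] * gy
          - l[:, None, None])**2).sum(axis=0)
i, j = np.unravel_index(omega.argmin(), omega.shape)

print(x_normal, (gx[i, j], gy[i, j]))   # the two minimizers agree
```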
20
Gauss’ derivation by modern matrix notation:
Assume that observable quantities $V_1, V_2, \ldots, V_n$ are linear functions of parameters $x_1, x_2, \ldots, x_p$, such that
$$V_i = b_{i1}x_1 + \cdots + b_{ip}x_p + c_i, \qquad b_{ij}, c_i \in \mathbb{R},$$
and we know the values of all the $b_{ij}$ and $c_i$.
Assume $L_i$ is an observation of $V_i$. We measure the $L_i$ in an attempt to infer the values of the $x_j$.
Switch to a new coordinate system by setting
$$v_i = (V_i - L_i)/p_i.$$
The system becomes:
$$v = Ax - l$$
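In this matrix form the estimate is $\hat{x} = (A^TA)^{-1}A^T l$. A minimal sketch with simulated data (the matrix, true parameters, and noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 8, 3
A = rng.standard_normal((n, p))
x_true = np.array([1.0, -2.0, 0.5])
l = A @ x_true + 0.1 * rng.standard_normal(n)   # noisy observations

x_hat = np.linalg.solve(A.T @ A, A.T @ l)       # (A^T A)^{-1} A^T l
print(x_hat)                                    # close to x_true
print(np.linalg.norm(A @ x_hat - l))            # minimized residual norm
```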
21
Gauss’ derivation by modern matrix notation:
Gauss' results are equivalent to the following lemma:
Suppose $A$ is an $(n \times p)$ matrix of rank $p$. Then there is a $(p \times n)$ matrix $K$ such that the following holds:
$$K(Ax) = x, \quad \forall x \in \mathbb{R}^p,$$
and among all such matrices, the matrix $E = (A^TA)^{-1}A^T$ has rows of minimum norm.
The optimization condition is that $K_{i1}^2 + K_{i2}^2 + \cdots$ should be as small as possible for each row $i$. This is equivalent to demanding that the sum of the diagonal entries of $KK^T$ should be as small as possible.
Proof: $E = (A^TA)^{-1}A^T$ satisfies the first condition.
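A numerical check of the lemma (random illustrative data): $E = (A^TA)^{-1}A^T$ is a left inverse, and it minimizes $\mathrm{trace}(KK^T)$ over other left inverses built by perturbing $E$ within the null space of $A^T$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 2))

E = np.linalg.solve(A.T @ A, A.T)        # E = (A^T A)^{-1} A^T
assert np.allclose(E @ A, np.eye(2))

# Random perturbations N with N @ A = 0 give other left inverses K = E + N.
P = np.eye(6) - A @ E                    # projector onto null(A^T)
for _ in range(5):
    K = E + rng.standard_normal((2, 6)) @ P
    assert np.allclose(K @ A, np.eye(2))
    assert np.trace(K @ K.T) >= np.trace(E @ E.T) - 1e-12
print("E minimizes trace(K K^T) among all tried left inverses")
```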
22
Proof continued:
$A^TA$ is invertible; denote $E = (A^TA)^{-1}A^T$. Thus
$$E(Ax) = (A^TA)^{-1}A^T(Ax) = (A^TA)^{-1}(A^TA)x = x.$$
Also, for any other left inverse we have $K(Ax) = x$ and $E(Ax) = x$. Subtracting, we get $(K - E)(Ax) = 0$ for all $x$; thus $(K - E)A$ is the zero matrix.
Right multiplying by $E^T = A(A^TA)^{-1}$ and noting that $(K - E)A = 0$, we get $(K - E)E^T = 0$.
Finally,
$$KK^T = (E + (K - E))(E + (K - E))^T = EE^T + (K - E)(K - E)^T.$$
This shows that the solution is in fact the optimal one, since if $(K - E)$ has any non-zero entries, $(K - E)(K - E)^T$ will have strictly positive entries on its diagonal.
Returning to our original equation $v = Ax - l$, our lemma shows that $G(v) := E(v + l)$ is the left inverse of the function $F(x) = Ax - l$, since $G(F(x)) = E(Ax - l + l) = E(Ax) = x$, and among all linear left inverses, the non-constant part of $G$ is optimal.
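The key identity in this proof can be verified numerically (random illustrative data): because $(K - E)E^T = 0$, the cross terms drop out of $KK^T$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((7, 3))

E = np.linalg.solve(A.T @ A, A.T)
P = np.eye(7) - A @ E                     # projector onto null(A^T)
K = E + rng.standard_normal((3, 7)) @ P   # another left inverse of A

assert np.allclose((K - E) @ E.T, np.zeros((3, 3)))
assert np.allclose(K @ K.T, E @ E.T + (K - E) @ (K - E).T)
print("K K^T = E E^T + (K - E)(K - E)^T verified")
```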
23
Gauss-Markov theorem
In a linear model $L = Ax + \varepsilon$, where $A$ is an $(n \times p)$ matrix with rank $p$ $(n > p)$, $x$ is an unknown vector, and $\varepsilon$ is the error vector: if $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$, then for any unbiased estimator $\tilde{\theta}$ of $C^T x$, we have $E(C^T\hat{x}_{LS}) = C^T x$ and $\mathrm{Var}(C^T\hat{x}_{LS}) \le \mathrm{Var}(\tilde{\theta})$.
In other words, when the errors have the same variance and are uncorrelated, the least-squares estimator is the best linear unbiased estimator, the one with the smallest variance.
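A simulation sketch of the theorem (the design matrix, parameters, and competing estimator below are illustrative assumptions): both estimators of $c^Tx$ are unbiased, but least squares has the smaller variance.

```python
import numpy as np

rng = np.random.default_rng(4)
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
x = np.array([2.0, -1.0])
c = np.array([1.0, 1.0])

E = np.linalg.solve(A.T @ A, A.T)           # least squares: x_hat = E @ L
P = np.eye(4) - A @ E                       # P @ A = 0
K = E + np.array([[1.0], [-0.5]]) @ P[:1]   # a competing unbiased estimator

L = A @ x + rng.standard_normal((100_000, 4))  # repeated noisy observations
ls, other = L @ E.T @ c, L @ K.T @ c           # c^T x estimates per run

print(ls.mean(), other.mean())   # both ~ c^T x = 1.0 (unbiased)
print(ls.var(), other.var())     # least squares has the smaller variance
```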
24
Limitation of the Method of Least Squares
Nothing is perfect:
This method is very sensitive to the presence of unusual data points. One or two outliers can sometimes seriously skew the results of a least squares analysis.
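A quick illustration of this sensitivity (the data are made up): a single wild observation noticeably pulls a least squares line fit away from the trend of the other points.

```python
import numpy as np

x = np.arange(10.0)
y = 2.0 * x + 1.0                     # exact line: intercept 1, slope 2
y_out = y.copy()
y_out[9] = 60.0                       # one outlier (true value is 19)

A = np.column_stack([np.ones_like(x), x])
clean = np.linalg.lstsq(A, y, rcond=None)[0]
skewed = np.linalg.lstsq(A, y_out, rcond=None)[0]

print(clean)    # [1., 2.] recovered exactly
print(skewed)   # intercept and slope distorted by a single point
```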
25
References
Gauss, Carl Friedrich, Translated by G. W. Stewart. 1995. Theory of the Combination of Observations Least Subject to Errors: Part One, Part Two, Supplement. Philadelphia: Society for Industrial and Applied Mathematics.
Plackett, R. L. 1949. A Historical Note on the Method of Least Squares. Biometrika. 36:458–460.
Stigler, Stephen M. 1981. Gauss and the Invention of Least Squares. The Annals of Statistics. 9(3):465–474.
Plackett, Robin L. 1972. The Discovery of the Method of Least Squares. Biometrika. 59:239–251.
Brand, Belinda B. 2003. Gauss' Method of Least Squares: A Historically-Based Introduction.
http://www.infoplease.com/ce6/people/A0820346.html
http://www.stetson.edu/~efriedma/periodictable/html/Ga.html