copyright © 2006 the mcgraw-hill companies, inc. permission required for reproduction or display. 1...

14
Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Upload: bernard-harmon

Post on 17-Jan-2018

219 views

Category:

Documents


0 download

DESCRIPTION

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 3 In sciences, if several measurements are made of a particular quantity, additional insight can be gained by summarizing the data in one or more well chosen statistics: Arithmetic mean - The sum of the individual data points (y i ) divided by the number of points. Standard deviation – a common measure of spread for a sample or variance Coefficient of variation – quantifies the spread of data (similar to relative error) Simple Statistics

TRANSCRIPT

Page 1: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

1

~ Curve Fitting ~Least Squares Regression

Page 2: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

2

•Fit the best curve to a discrete data set and obtain estimates for other data points

•Two general approaches:–Data exhibit a significant degree

of scatter. Find a single curve that represents the general trend of the data.

–Data is very precise. Pass a curve(s) exactly through each of the points: interpolation.

Curve Fitting

Page 3: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

3

In sciences, if several measurements are made of a particular quantity, additional insight can be gained by summarizing the data in one or more well chosen statistics:

Arithmetic mean - The sum of the individual data points (yi) divided by the number of points.

Standard deviation – a common measure of spread for a sample

or variance

Coefficient of variation – quantifies the spread of data (similar to relative error)

ny

y i

1

2

n

yyS i

y

)(1

22

n

yyS i

y

)(

Simple Statistics

%100.. y

Svc y

Page 4: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

4

• Fitting a straight line to a set of paired observations: (x1, y1), (x2, y2),…,(xn, yn)

yi : measured value ei : error (residual)

yi = a0 + a1 xi + ei

ei = yi - a0 - a1 xi

a1 : slope

a0 : intercept

Linear Regression

ei Error

Line equationy = a0 + a1 x

Page 5: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

5

• The most commonly used strategy is to minimize the sum of the squares of the residuals between the measured-y and the y calculated with the linear model:

• Yields a unique line for a given set of data• Need to compute a0 and a1 such that Sr is minimized!

n

iiir

n

imodelimeasuredi

n

iir

xaayS

yy

eS

1

210

1

2

1

2

)(

)( ,,

ei Error

Criteria For a “Best Fit”

Page 6: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Least-Squares Fit of a Straight Line

0 0)(2

0 0)(2

2101

1

101

iiiiiioir

iiioio

r

xaxaxyxxaayaS

xaayxaayaS

Normal equations which can be solved simultaneously

iiii

ii

xyaxax

yaxna

naa

12

0

10

00

(2)

(1)

Since

n

iii

n

iir xaayeS

1

210

1

2 )( :error Minimize

6

Page 7: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Least-Squares Fit of a Straight Line

xayaa

xxn

yxyxna

ii

iiii

100

221

as expressed becan (1), using

Normal equations which can be solved simultaneously

Mean values

iiii

ii

xyaxax

yaxna

12

0

10

n

iii

n

iir xaayeS

1

210

1

2 )( :error Minimize

ii

i

ii

i

xyy

aa

xxx

1

02

n

7

Page 8: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

• Sr = Sum of the squares of residuals around the regression line• St = total sum of the squares around the mean• (St – Sr) quantifies the improvement or error reduction due to describing

data in terms of a straight line rather than as an average value.

• For a perfect fit Sr=0 and r = r2 = 1 signifies that the line explains 100 % of the variability of the data.

• For r = r2 = 0 Sr=St the fit represents no improvement

t

rt

SSSr

2r : correlation coefficientr2 : coefficient of determination

“Goodness” of our fit

The spread of data

(a) around the mean(b) around the best-fit line

Notice the improvement in the error due to linear regression

2

1

210

)(

)(

yyS

xaayS

it

n

iiir

8

Page 9: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example

Use linear regression to fit the following data to a straight line.

Solution:Solving this system of two equations gives

a0 = 4.56 and a1 = 0.6

y = 4.56 + 0.6 x

x y x2 xy1 5.1 1 5.1

2 5.9 4 11.8

3 6.3 9 18.9

∑ 6 17.3 14 35.8

ii

i

ii

i

xyy

aa

xxx

1

02

n

8.353.17

14663

1

0

aa

9

Page 10: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example

Use linear regression to fit the following data to a straight line.

Solution:Solving this system of two equations gives

a0=5 and a1=4

y = 5 + 4x

x y x2 xy0 5 0 0

1 9 1 9

2 13 4 26

3 17 9 51

4 21 16 84

∑ 10 65 30 170

ii

i

ii

i

xyy

aa

xxx

1

02

n

17065

3010105

1

0

aa

10

Page 11: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Linearization of Nonlinear Relationships

(a) Data that is ill-suited for linear least-squares regression

(b) Indication that a parabola may be more suitable

Exponential Eq.

xey

slope

lnintercept

11

Page 12: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Saturation growth-rate Eq.

Power Eq.

Linearization of Nonlinear Relationships

12

Page 13: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

13

Polynomial Regression• Some engineering data is poorly represented by a straight line. A curve

(polynomial) may be better suited to fit the data. The least squares method can be extended to fit the data to higher order polynomials.

• As an example let us consider a second order polynomial to fit the data points:

n

iiii

n

iir xaxaayeS

1

22210

1

2 )( :error Minimize

21 2

21 2

1

2 21 2

2

2 ( ) 0

2 ( ) 0

2 ( ) 0

ri o i i

o

ri i o i i

ri i o i i

S y a a x a xaS x y a a x a xaS x y a a x a xa

iiiii

iiiii

iii

yxaxaxax

yxaxaxax

yaxaxna

22

41

30

2

23

12

0

22

10

2210 xaxaay

Page 14: Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example: Polynomial Regression

Fit a second-order polynomial to the following data

Solution:Solving this system of three equations gives

a0 = 2.48, a1 = 2.36, a2 = 1.86

y = 2.48 + 2.36 x + 1.86 x2

xi yi xi2 xi

3 xi4 xiyi xi

2yi

0 2.1 0 0 0 0 0

1 7.7 1 1 1 7.7 7.7

2 13.6 4 8 16 27.2 54.4

3 27.2 9 27 81 81.6 244.8

4 40.9 16 64 256 163.6 654.4

5 61.1 25 125 625 305.5 1527.5

∑ 15 152.6 55 225 979 585.6 2488.8

ii

ii

i

iii

iii

ii

yxyx

y

aaa

xxxxxxxx

22

1

0

432

32

2n

8.24886.5856.152

97925555225551555156

2

1

0

aaa

14