Curve Fitting
S.K. Piechnik
Overview
• Introduction
• Linear regression
• Linear-transformable regressions
• Linear pitfalls, options and complications
• Non-linear fitting
• Robust estimation: alternative cost functions and weighting
• Implementation and software
• Comparing and testing models
Introduction
• Frequently, we want to describe a relation within experimental data by an analytical expression between the measured variables.
• The fitted parameters can then be used as summary descriptors of the underlying process, enabling multi-level comparisons between datasets independently of the particular choice of measurement points.
Curve fitting: Definitions
• Curve fitting: a statistical technique used to derive coefficient values for equations that express the value of one variable (the dependent variable) as a function of another (the independent variable).
• Linear regression: curve fitting for relationships that are best approximated by a straight line.
• Non-linear regression: curve fitting for relationships that are best approximated by a curved (i.e. non-linear) equation.
Where does this “best” line come from?
• Gut feeling
• Maximum probability principle
• Least (sum of) squares (of error)
Maximum likelihood
• Probability that our measurements came from the specific line under Gaussian noise.
• Maximize the above probability.
• After taking the log and removing the constants (N, Δy and σ), this is equivalent to minimising the sum of squares:
$$P \sim \prod_{i=1}^{N} \exp\left[-\frac{1}{2}\left(\frac{y_i - y(x_i)}{\sigma}\right)^{2}\right]\Delta y$$

$$\mathrm{Cost} = \sum_{i=1}^{N} \left(y_i - y(x_i)\right)^{2}$$
Where does this “best” line come from?
The Cartoon Guide to Statistics, L. Gonick & W. Smith
Line Fitting
• Exact analytic solution
• Implemented in scientific calculators and in MS Excel
• Can even easily get the errors on the parameters
For the line $y = ax + b$:
Slope: $a = \dfrac{N\sum x_i y_i - \sum x_i \sum y_i}{N\sum x_i^{2} - \left(\sum x_i\right)^{2}}$
Offset: $b = \dfrac{\sum y_i - a\sum x_i}{N}$
Linear fitting of non-linear functions?
Just a contradiction in terms?
Linear regression of (some) nonlinear functions
• The method of least squares is not limited to linear fits (or two-variable fits).
• One can just as readily use the same procedure for Y = ax² + bx + c by minimizing:
$$SS = \sum_{i=1}^{n}\left(y_i - Y_i\right)^{2} = \sum_{i=1}^{n}\left(y_i - a x_i^{2} - b x_i - c\right)^{2}$$
Example: Quadratic Regression
$Y_i = a x_i^{2} + b x_i + c$
Setting each partial derivative of SS to zero gives the normal equations:
$$\frac{\partial SS}{\partial a} = -2\sum_{i=1}^{n} x_i^{2}\left(y_i - a x_i^{2} - b x_i - c\right) = 0 \qquad (1)$$
$$\frac{\partial SS}{\partial b} = -2\sum_{i=1}^{n} x_i\left(y_i - a x_i^{2} - b x_i - c\right) = 0 \qquad (2)$$
$$\frac{\partial SS}{\partial c} = -2\sum_{i=1}^{n} \left(y_i - a x_i^{2} - b x_i - c\right) = 0 \qquad (3)$$
Quadratic Regression (cont’d)
• Solve the linear system of equations:
$$\begin{bmatrix} \sum x_i^{4} & \sum x_i^{3} & \sum x_i^{2} \\ \sum x_i^{3} & \sum x_i^{2} & \sum x_i \\ \sum x_i^{2} & \sum x_i & n \end{bmatrix}\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} \sum x_i^{2} y_i \\ \sum x_i y_i \\ \sum y_i \end{bmatrix}$$
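In practice this 3×3 system is what a library polynomial fit solves internally; a one-line sketch using IDL's POLY_FIT (whose full signature appears in the Implementation section), assuming data vectors X and Y:

coeffs = POLY_FIT(X, Y, 2, YFIT=yfit)   ; returns [c, b, a] for y = c + b*x + a*x^2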
Exponential Fitting
Linearize the equation and apply the fit to a straight line.
Fitted curve: y = 4.2986·e^(0.2668x), R² = 0.9935
[Plots: the same data and fit shown on linear and logarithmic y-axis scales]
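A minimal sketch of this linearisation in IDL (assuming data vectors X and Y with strictly positive Y):

c = LINFIT(X, ALOG(Y))        ; straight-line fit to ln(y) = ln(A) + B*x
A = EXP(c[0])                 ; amplitude
B = c[1]                      ; rate
PRINT, 'y = ', A, ' * exp(', B, ' * x)'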
Logarithmic Fitting
Power Law Fitting
A (far from) exhaustive list of regression transforms:

Function | Transformed Y | Transformed X | Equivalent linear regression
y = A·e^(Bx) | ln(y) | x | ln(y) = ln(A) + B·x
y = A·x^B | ln(y) | ln(x) | ln(y) = ln(A) + B·ln(x)
y = A + B·ln(x) | y | ln(x) | y = A + B·ln(x)
y = 1/(A + B·x) | 1/y | x | 1/y = A + B·x
… | … | … | …
Even Excel…

X | Y
1 | 3
2 | 2
3 | 4
4 | 7
5 | 9

Fit: y = 1.7x - 0.1, R² = 0.85
[Plot: Y (dependent) against X (independent) with the fitted line]
Correlation Coefficient
• Given a relation between y and x, how good is the fit?
• The parameter which conveys this information is the correlation coefficient, usually denoted by r:
$$r = \left(1 - \frac{\sigma_{y,x}^{2}}{\sigma_{y}^{2}}\right)^{1/2}$$
where $\sigma_{y,x}^{2}$ is the variation in the residuals and $\sigma_{y}^{2}$ is the variation in the data.
Eventually…
$$\sigma_{y} = \left(\frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}}{n-1}\right)^{1/2} \qquad \sigma_{y,x} = \left(\frac{\sum_{i=1}^{n}\left(y_i - Y_i\right)^{2}}{n-2}\right)^{1/2}$$
where $\sigma_{y}$ is the variation in Y about its mean and $\sigma_{y,x}$ is the variation in Y about its prediction. For a straight-line fit this works out to:
$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\left(n\sum x_i^{2} - \left(\sum x_i\right)^{2}\right)^{1/2}\left(n\sum y_i^{2} - \left(\sum y_i\right)^{2}\right)^{1/2}}$$
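A quick sketch of the same quantity in IDL (assuming data vectors X and Y); CORRELATE implements the Pearson r, shown alongside the hand-rolled formula above for comparison:

r1 = CORRELATE(X, Y)
n  = N_ELEMENTS(X)
r2 = (n*TOTAL(X*Y) - TOTAL(X)*TOTAL(Y)) / $
     SQRT((n*TOTAL(X^2) - TOTAL(X)^2) * (n*TOTAL(Y^2) - TOTAL(Y)^2))
PRINT, r1, r2, r1^2    ; r by both routes, and r^2 (proportion of explained variability)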
Correlation Coefficient (cont’d)
• The value of r ranges from -1 (perfectly correlated in the negative direction) to +1 (perfectly correlated in the positive direction).
• When r = 0, the two variables are not correlated.
• The “goodness” of fit is usually given by r². This is also known as the proportion of explained variability and is used as the basic descriptor of fit quality.
Fit Quality (STATISTICA ver. 7.0), part of GLM

Test of SS Whole Model vs. SS Residual (Spreadsheet1):
Dependent Variable | Multiple R | Multiple R² | Adjusted R² | SS Model | df Model | MS Model | SS Residual | df Residual
Y | 0.921954 | 0.850000 | 0.800000 | 28.90000 | 1 | 28.90000 | 5.100000 | 3

Univariate Tests of Significance for Y (Spreadsheet1; sigma-restricted parameterization, effective hypothesis decomposition):
Effect | SS | Degr. of Freedom | MS | F | p
Intercept | 0.00909 | 1 | 0.00909 | 0.00535 | 0.946308
X | 28.90000 | 1 | 28.90000 | 17.00000 | 0.025865
Error | 5.10000 | 3 | 1.70000 | |

Parameter Estimates (Spreadsheet1; sigma-restricted parameterization):
Effect | Param. | Std.Err | t | p | -95% Cnf.Lmt | +95% Cnf.Lmt
Intercept | -0.100000 | 1.367479 | -0.073127 | 0.946308 | -4.45193 | 4.251930
X | 1.700000 | 0.412311 | 4.123106 | 0.025865 | 0.38784 | 3.012156
Model/Transform Identification
[Plot: Y against x, for x from 0 to 30]
Automatic Model Identification (SPSS ver. 12)
Independent: x

Mth | Rsq | d.f. | F | Sigf | Upper bound | b0 | b1 | b2 | b3
LIN | .984 | 23 | 1412.48 | .000 | | 7.0775 | 1.0250 | |
LOG | .800 | 23 | 92.22 | .000 | | 7.7379 | 5.5463 | |
INV | .256 | 23 | 7.93 | .010 | | 20.4520 | -1.9434 | |
QUA | .993 | 22 | 1470.38 | .000 | | 5.6967 | 1.3839 | -.0149 |
CUB | .996 | 21 | 1841.49 | .000 | | 4.7333 | 1.9133 | -.0710 | .0016
COM | .818 | 23 | 103.33 | .000 | | 8.0675 | 1.0664 | |
POW | .977 | 23 | 996.62 | .000 | | 7.2018 | .4219 | |
S | .528 | 23 | 25.69 | .000 | | 2.9657 | -.1919 | |
GRO | .818 | 23 | 103.33 | .000 | | 2.0878 | .0643 | |
EXP | .818 | 23 | 103.33 | .000 | | 8.0675 | .0643 | |
LGS | .818 | 23 | 103.33 | .000 | . | .1240 | .9377 | |

Actual function used: Y = 3·exp((0.7x)^0.3)

These 4 plots have the same slopes, intercepts and r values!
Plots are pictures of science, worth thousands of words in boring tables.
Automatic Model Identification (SPSS ver. 12): fitted curves
[Plot: observed data with the Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential and Logistic fits overlaid]
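One way to reproduce this kind of screening by hand; a hedged sketch in IDL that scores a few linearisable candidates by r² on transformed axes (assumes positive X and Y; this is not the SPSS algorithm itself):

r2_lin = CORRELATE(X, Y)^2                  ; y     = b0 + b1*x
r2_log = CORRELATE(ALOG(X), Y)^2            ; y     = b0 + b1*ln(x)
r2_exp = CORRELATE(X, ALOG(Y))^2            ; ln(y) = b0 + b1*x
r2_pow = CORRELATE(ALOG(X), ALOG(Y))^2      ; ln(y) = b0 + b1*ln(x)
PRINT, 'LIN LOG EXP POW: ', r2_lin, r2_log, r2_exp, r2_pow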
Non-linear Transform Exploration (STATISTICA v. 7)
Actual function used: Y = 3·exp((0.7x)^0.3)
Non-linear Transform Exploration in Pictures
Excel (transform & scale) + STATISTICA (categorized plot)
[Categorized scatterplots: Y, Y², Y³, Y^0.5, Y^0.3, log(Y) and exp(Y) plotted against X, X², X³, X^0.5, X^0.3, log(x) and exp(x)]
• Transformations can be very useful when used appropriately.
• But beware, follow these rules:
– You should transform your data when the transformation makes the variability more consistent and more Gaussian.
– You should not transform data when the transformation makes the variability less consistent and less Gaussian.
This is ONE BIG and (not) really MEAN square!
[Plots: Y, Y+noise and the transform-based fit, shown on linear and logarithmic y-axis scales]
Note: this is the result of an exponential fit performed by MS Excel.
Choice of distance and error weighting
[Four panels: the same data fitted with different definitions of the distance to be minimised]
Vertical offsets: $SS = \sum_{i=1}^{M}\left(y_i - a - b x_i\right)^{2}$
Perpendicular offsets: $SS = \sum_{i=1}^{M}\left[\left(y_i - a - b x_i\right)\cos(\arctan b)\right]^{2} = \sum_{i=1}^{M}\frac{\left(y_i - a - b x_i\right)^{2}}{1 + b^{2}}$
Weighted by measurement error in Y: $SS = \sum_{i=1}^{M}\frac{\left(y_i - a - b x_i\right)^{2}}{\sigma_{Y_i}^{2}}$
Errors in both X and Y: $SS = \sum_{i=1}^{M}\frac{\left(y_i - a - b x_i\right)^{2}}{\sigma_{Y_i}^{2} + b^{2}\sigma_{X_i}^{2}}$
Vertical vs Perpendicular offsets
• “In practice, the vertical offsets from a line (polynomial, surface, hyperplane, etc.) are almost always minimized instead of the perpendicular offsets. This provides a much simpler analytic form for the fitting parameters.”
• Minimizing the perpendicular R² for a second- or higher-order polynomial leads to polynomial equations of higher order, so this formulation cannot be extended.
• “In any case, for a reasonable number of noisy data points, the difference between vertical and perpendicular fits is quite small.” [MathWorld]
Regression of data with X & Y errors
[Plot: data with errors in both X and Y, with the fitted line]
$$SS = \sum_{i=1}^{M}\frac{\left(y_i - a - b x_i\right)^{2}}{\sigma_{Y_i}^{2} + b^{2}\sigma_{X_i}^{2}} = \sum_{i=1}^{M}\left(y_i - a - b x_i\right)^{2} \cdot w_i, \qquad w_i = \frac{1}{\sigma_{Y_i}^{2} + b^{2}\sigma_{X_i}^{2}}$$
Offset: $a = \dfrac{\sum_{i=1}^{M} w_i\left(y_i - b x_i\right)}{\sum_{i=1}^{M} w_i}$
Slope: b = ? It can NOT be calculated analytically. It must be optimised and fed back into the calculation of the weights!
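A hedged sketch of one possible iteration scheme in IDL, alternating a weighted fit with a weight update (assumes vectors X, Y and per-point error arrays sigX, sigY; convergence is not guaranteed for all data):

c = LINFIT(X, Y)
b = c[1]                                   ; unweighted starting slope
FOR iter = 0, 99 DO BEGIN
  w  = 1d / (sigY^2 + b^2 * sigX^2)        ; weights depend on the current b
  xm = TOTAL(w*X) / TOTAL(w)
  ym = TOTAL(w*Y) / TOTAL(w)
  bNew = TOTAL(w*(X-xm)*(Y-ym)) / TOTAL(w*(X-xm)^2)
  IF ABS(bNew - b) LT 1e-8 THEN BREAK
  b = bNew
ENDFOR
a = ym - b*xm                              ; offset, as in the formula above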
Non-linear fitting
• Often a linearized approach is not adequate:
– Linearisation is not possible or introduces errors
– Too cumbersome
• An optimal fit with a non-linear function is usually also obtained with least squares.
• The difference lies in the fact that the optimal set of function parameters (a1, …, an) must be found iteratively, by trial and error, until the best combination is found.
• Thus the problem reduces to the minimisation of a function (SS) in multidimensional space.
• Important aspects:
– Choice of optimisation method
– Start parameters
– Convergence threshold and method
Minimising SS
• Treat sum of squares as a continuous function of the m parameters and search the m-dimensional space for the appropriate minimum value
$$SS = \sum_{i=1}^{M}\left(y_i - f(x_i, a_0, a_1, a_2, \ldots, a_m)\right)^{2} \cdot w_i$$
• Grid Search: vary each parameter in turn, minimizing chi-squared with respect to each parameter independently. Many successive iterations are required to locate the minimum of chi-squared unless the parameters are independent.
• Gradient Search: vary all parameters simultaneously, adjusting the relative magnitudes of the variations so that the direction of propagation in parameter space is along the direction of steepest descent of chi-squared.
• Expansion Methods: find an approximate analytical function that describes the chi-squared hypersurface and use this function to locate the minimum. The number of computed points is smaller, but the computations are considerably more complicated.
• Marquardt Method: a combination of the gradient and expansion methods.
From Bevington and Robinson
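As a toy illustration of the simplest of these, a grid search over a two-parameter line model in IDL (the scan ranges and resolution are arbitrary assumptions; real problems use the library minimisers listed under Implementation):

best = !VALUES.D_INFINITY
FOR ia = 0, 99 DO BEGIN
  FOR ib = 0, 99 DO BEGIN
    atry = -5d + ia * 0.1d                 ; scan a over [-5, 5)
    btry = -5d + ib * 0.1d                 ; scan b over [-5, 5)
    ss = TOTAL(w * (Y - (atry + btry*X))^2)
    IF ss LT best THEN BEGIN
      best  = ss
      abest = atry
      bbest = btry
    ENDIF
  ENDFOR
ENDFOR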
Problems to look out for
• Ringing: limit the maximum number of iterations
• High initial SS: use a stop condition, $\frac{\Delta SS}{SS} < \varepsilon$
• Multiple minima
• Select a good starting point
• Inspect the results
[Plot: SS as a function of iteration number]
Real Life Example: MRS estimation of CSF/Brain Partial Volume
Amplitude = Amp*(PV*exp(-TE/tau1) + (1-PV)*exp(-TE/tau2))
[Plots: normalised signal against TE (echo time) for Brain and CSF, on linear and log scales]
Note: a sum of exponentials cannot be linearized!
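A hedged sketch of how this model can be fed to IDL's CURVEFIT (listed under Implementation below); the start parameters reuse the values from the AMOEBA example at the end of these slides, and Weights would come from the noise model discussed next:

PRO biexp, X, A, F, pder
  ; A = [Amp, PV, tau1, tau2]; pder is unused with /NODERIVATIVE
  F = A[0] * (A[1]*EXP(-X/A[2]) + (1-A[1])*EXP(-X/A[3]))
END

A = [5000d, 0.9d, 0.05d, 0.5d]             ; start parameters
yfit = CURVEFIT(TE, Amplitudes, Weights, A, sigma, $
                FUNCTION_NAME='biexp', /NODERIVATIVE, ITMAX=200)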
Proton Signal Amplitudes vs TE (9 cases, 12 TEs)
[Scatterplot: amplitudes (-500 to 5000) against TE (0 to 1.6)]
Signal variation depends on data → estimate weighting
• Pooled data (initial analysis): tenfold SD change
[Plots: amplitude means and standard deviations against TE, and amplitude SD against mean amplitude]
Signal variation on an individual basis (9 cases, 16 repetitions, 12 TEs)
[Plot: SD of signal amplitude against TE [ms], by case ID 0 to 8]
Model of signal variation
SD of signal amplitude modelled as: Function = 100/(x^0.35)
[Plot: per-case SD of signal amplitude against TE [ms], with the model curve]
Inventory
✓ Data
✓ Model
✓ 1/Weights
[Plots: amplitudes against TE on a log scale, and the SD model, Function = 100/(x^0.35)]
STATISTICA ver. 7
Model: Amplitudes = Amp*(PV*exp(-TE/tau1) + (1-PV)*exp(-TE/tau2))
Fitted: y = (1820.94)*((2.47271)*exp(-x/(.095016)) + (1-(2.47271))*exp(-x/(.922e-3)))
[Plot: the fitted curve plunges to the order of -10^97 within the TE range]
[Plot: the same fit shown over the data, amplitudes 0 to 5000 against TE]
Better!
Model: Amplitudes = Amp*(PV*exp(-TE/tau1) + (1-PV)*exp(-TE/tau2))
Fitted: y = (5164.41)*((.889446)*exp(-x/(.061098)) + (1-(.889446))*exp(-x/(.584223)))
[Plot: data and fitted curve, amplitudes -1000 to 6000 against TE]
More checks
[Plot: observed versus predicted values]
Value of Residuals (good fit, case = 1)
[Plot: predicted versus residual values; residuals within about ±60]
Value of Residuals (poor fit, case = 0)
[Plot: predicted versus residual values; residuals reaching -200]
Problem not easy to spot otherwise
[Plot: observed versus predicted values]
Model: Amplitudes = Amp*(PV*exp(-TE/tau1) + (1-PV)*exp(-TE/tau2))
Fitted: y = (5476.15)*((.850048)*exp(-x/(.05812)) + (1-(.850048))*exp(-x/(.551334)))
[Plot: data and fitted curve against TE]
Robust Estimation
• “Insensitive to small departures from the idealized assumptions for which the estimator is optimized.”
• Typical departures:
– Fractionally large deviations for a small number of data points
– Measurement errors that are not normally distributed
• → Use alternative estimates of error.
• The general idea is that the weight given to individual points should first increase with deviation, then decrease.
Numerical Recipes in C, Cambridge University Press, Chapter 15.7
Alternative estimates of error
• M-values
• L-values
• R-values
• A-priori knowledge

Two-sided exponential: $\rho(z) = |z|, \quad \psi(z) = \mathrm{sgn}(z)$
Lorentzian: $\rho(z) = \log\left(1 + \tfrac{1}{2}z^{2}\right), \quad \psi(z) = \dfrac{z}{1 + \tfrac{1}{2}z^{2}}$
Alternative estimates of error (M-estimates)
• Based on maximum-likelihood estimates using a-priori known (assumed) distributions.
– Similar to the sum of squares, but the solutions differ if non-Gaussian distributions are considered.
– In theory, analytic solutions are available from the set of equations for the parameters a_k:
$$\mathrm{Error} = \sum_{i=1}^{N}\rho(z_i), \qquad z_i = \frac{y_i - y(x_i; a_1 \ldots a_M)}{\sigma_i}$$
$$0 = \sum_{i=1}^{N}\frac{1}{\sigma_i}\,\psi(z_i)\,\frac{\partial y(x_i; a_1 \ldots a_M)}{\partial a_k}, \qquad k = 1 \ldots M$$
– In practice they are solved numerically using alternative cost functions:
• Two-sided exponential distribution: minimising the mean absolute deviation.
• Cauchy/Lorentzian distribution.
• Arbitrary functions with arbitrary parameters: Andrew’s sine, Tukey’s biweight.
Alternative estimates of error (L-estimates)
• Linear combinations of order statistics
• “Median” of squares
• Tukey’s trimean of quartiles: Err = SSQ1 + 2*SSQ2 + SSQ3
• Does not require assumptions about the error distribution
• Ordering is time consuming
Idea: on the distribution of (Observed - Predicted)², IGNORE THE LARGEST DEVIATIONS!
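A minimal sketch of the "ignore the largest deviations" idea in IDL (the 75% retention fraction is an arbitrary illustrative choice; Yobs and Ypred are assumed data and model vectors):

dev2  = (Yobs - Ypred)^2
srt   = dev2[SORT(dev2)]                   ; squared deviations, ascending
nkeep = (3 * N_ELEMENTS(srt)) / 4          ; keep the smallest 75%
err   = TOTAL(srt[0 : nkeep-1])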
Alternative estimates of error (R-estimates and a-priori information)
• Rank test statistics: Wilcoxon, Spearman rank-order correlation, Kolmogorov-Smirnov
• A-priori information on data validity
[Plots: per-case SD of amplitudes and SD of frequencies (Hz) against TE]
Example: FEAT Motion Correction Report for FEAT session MRdata/010755_2004_07_08_co2/BOLD_series_2.3+.feat (Tue Jan 18 11:49:45 GMT 2005). Mean (across voxels) voxel displacements: absolute (each time point with respect to the reference image) = 0.73 mm; relative (each time point with respect to the previous time point) = 0.16 mm.
Implementation (STATISTICA)
[STATISTICA dialogs]
Implementation: Other Software
• Statistics software (dialogs/scripting): Excel (Solver module must be installed), SPSS, STATGRAPHICS, STATISTICA, GraphPad, …
• Programming (with libraries!): C, Pascal, …, Matlab, IDL, …
Programmatic Implementation: IDL

Specific fit functions:
• LINFIT( X, Y [, CHISQ=variable] [, COVAR=variable] [, /DOUBLE] [, MEASURE_ERRORS=vector] [, PROB=variable] [, SIGMA=variable] [, YFIT=variable] )
• LADFIT( X, Y [, ABSDEV=variable] [, /DOUBLE] ) ; linear robust
• COMFIT( X, Y, A {, /EXPONENTIAL | , /GEOMETRIC | , /GOMPERTZ | , /HYPERBOLIC | , /LOGISTIC | , /LOGSQUARE} [, SIGMA=variable] [, WEIGHTS=vector] [, YFIT=variable] ) ; common functions
• GAUSSFIT( X, Y [, A] [, CHISQ=variable] [, ESTIMATES=array] [, MEASURE_ERRORS=vector] [, NTERMS=integer{3 to 6}] [, SIGMA=variable] [, YERROR=variable] )
• GAUSS2DFIT( Z, A [, X, Y] [, /NEGATIVE] [, /TILT] ) ; Gaussian surface
• POLY_FIT( X, Y, Degree [, CHISQ=variable] [, COVAR=variable] [, /DOUBLE] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, STATUS=variable] [, YBAND=variable] [, YERROR=variable] [, YFIT=variable] )
• REGRESS( X, Y [, CHISQ=variable] [, CONST=variable] [, CORRELATION=variable] [, /DOUBLE] [, FTEST=variable] [, MCORRELATION=variable] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, STATUS=variable] [, YFIT=variable] )
• SFIT( Data, Degree [, /IRREGULAR, KX=variable, /MAX_DEGREE] ) ; polynomial surface fit

General solvers:
• LMFIT( X, Y, A [, ALPHA=variable] [, CHISQ=variable] [, CONVERGENCE=variable] [, COVAR=variable] [, /DOUBLE] [, FITA=vector] [, FUNCTION_NAME=string] [, ITER=variable] [, ITMAX=value] [, ITMIN=value] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, TOL=value] ) ; Levenberg-Marquardt algorithm
• CURVEFIT( X, Y, Weights, A [, Sigma] [, CHISQ=variable] [, /DOUBLE] [, FITA=vector] [, FUNCTION_NAME=string] [, ITER=variable] [, ITMAX=value] [, /NODERIVATIVE] [, STATUS={0 | 1 | 2}] [, TOL=value] [, YERROR=variable] ) ; gradient-expansion algorithm
• SVDFIT( X, Y [, M] [, A=vector] [, CHISQ=variable] [, COVAR=variable] [, /DOUBLE] [, FUNCTION_NAME=string] [, /LEGENDRE] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, SING_VALUES=variable] [, SINGULAR=variable] [, STATUS=variable] [, TOL=value] [, VARIANCE=variable] [, YFIT=variable] )
• AMOEBA( Ftol [, FUNCTION_NAME=string] [, FUNCTION_VALUE=variable] [, NCALLS=value] [, NMAX=value] [, P0=vector, SCALE=vector | , SIMPLEX=array] ) ; downhill simplex minimisation
• POWELL, P, Xi, Ftol, Fmin, Func [, /DOUBLE] [, ITER=variable] [, ITMAX=value]
Programmatic Implementation: IDL, use of general solvers

; Define a function that will calculate the cost
FUNCTION MyError, Parameters
  COMMON MEcom, Xi, Yi, Weightsi
  Yest = Model(Xi, Parameters)                 ; user-supplied model function
  Error = TOTAL(ABS(Yest - Yi) * Weightsi)     ; e.g. weighted absolute deviation
  RETURN, Error                                ; AMOEBA expects a scalar cost
END ; function MyError

; Define necessary start parameters and call the general solver routine
Epsilon = 1e-5
P0 = [5000, 0.9, 0.05, 0.5]
Scale = P0 / 5.
Result = AMOEBA(Epsilon, SCALE=Scale, P0=P0, $
                FUNCTION_NAME='MyError', NMAX=1000)
Sub-sample variability in model estimates
[Plots]
Impact of the removal of data with high errors of amplitude estimates
[Plots]
Comparing Models
[Plot: the five-point dataset from the Excel example fitted with a straight line, y = 1.7x - 0.1 (R² = 0.85), and with a quartic, y = -6E-14x⁴ - 0.3333x³ + 3.5x² - 9.1667x + 9 (R² = 1)]
• Use both the Sum of Squares and the number of degrees of freedom.
• F ≈ 1 → the models are equivalent.
• Transform through the F-distribution to get the probability that the gain is real:
$$F = \frac{\left(SS_{Model1} - SS_{Model2}\right) / \left(DOF_{Model1} - DOF_{Model2}\right)}{SS_{Model2} / DOF_{Model2}}$$
Model selection based on analysis of residuals
• Runs test:
– Expected number of runs ≈ 1 + N/2 = 21 (= 1 + 2·N₊·N₋/(N₊ + N₋))
– Actual number of runs = 13
– P = 0.0077 → the model needs an upgrade
+ +++++++ ++++++++ + + + ++
----- - - - ----- --------
H. Motulsky & A. Christopoulos, Fitting Models to Biological Data using Linear and Non-linear Regression. www.graphpad.com
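A minimal sketch of counting runs of residual signs in IDL (assumes a residual vector resid with no exact zeros):

s = resid GT 0                       ; 1 where the residual is positive, 0 where negative
n = N_ELEMENTS(s)
runs = 1 + TOTAL(s[1:n-1] NE s[0:n-2])
PRINT, 'Number of runs: ', runs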
Model selection in practice
• Not like the textbook example:
– More noise
– Repeated measures
– Clustered samples
– Unobvious models
[Plot: predicted versus residual values]
Other way: modelling
• Select prime suspects
• Test the impact of:
– CBF inflow
– CSF inflow
– Diffusive mixing between compartments
Modelling
• Test a specific hypothesis, e.g. “Is there really an additional downwards deflection in the data for long TE?”
[Plot: per-case amplitudes against TE, on a log scale]