Curve Fitting
S.K. Piechnik
Overview
• Introduction
• Linear regression
• Linear-transformable regressions
• Linear pitfalls, options and complications
• Non-linear fitting
• Robust estimation: alternative cost functions and weighting
• Implementation and software
• Comparing and testing models
Introduction
• Frequently, we want to describe a relation within experimental data by an analytical expression between the measured variables.
• The fitted parameters can then be used as summary descriptors of the underlying process, enabling multi-level comparisons between datasets independently of the particular choice of measurement points.
Curve fitting: Definitions
• Curve fitting: a statistical technique used to derive coefficient values for equations that express the value of one variable (the dependent variable) as a function of another (the independent variable).
• Linear regression: curve fitting for relationships that are best approximated by a straight line.
• Non-linear regression: curve fitting for relationships that are best approximated by a curved (i.e. non-linear) equation.
Where does this “best” line come from?
• Gut feeling
• Maximum probability principle
• Least (sum of) squares (of error)
Maximum likelihood
• Probability that our measurements came from the specific line under Gaussian noise.
• Maximize the above probability.
• After taking the log and removing the constants (N, Δy and σ), this is equivalent to minimising the sum of squares:
$$P \sim \prod_{i=1}^{N} \exp\left[-\frac{1}{2}\left(\frac{y_i - y(x_i)}{\sigma}\right)^{2}\right]\Delta y$$

$$\mathrm{Cost} = \sum_{i=1}^{N} \left(y_i - y(x_i)\right)^{2}$$
Where does this “best” line come from?
The Cartoon Guide to Statistics, L. Gonick & W. Smith
Line Fitting
• Exact analytic solution
• Implemented in scientific calculators and in MS Excel
• Can even easily get the errors on the parameters
For the line $y = ax + b$:
Slope: $a = \dfrac{N\sum x_i y_i - \sum x_i \sum y_i}{N\sum x_i^{2} - \left(\sum x_i\right)^{2}}$
Offset: $b = \dfrac{\sum y_i - a\sum x_i}{N}$
Linear fitting of non-linear functions?
Just a contradiction in terms?
Linear regression of (some) nonlinear functions
• The method of least squares is not limited to linear fits (or two-variable fits).
• One can just as readily use the same procedure for Y = ax² + bx + c by minimizing:
$$SS = \sum_{i=1}^{n}\left(y_i - Y_i\right)^{2} = \sum_{i=1}^{n}\left(y_i - a x_i^{2} - b x_i - c\right)^{2}$$
Example: Quadratic Regression
$Y_i = a x_i^{2} + b x_i + c$
Setting each partial derivative of SS to zero gives the normal equations:
$$\frac{\partial SS}{\partial a} = -2\sum_{i=1}^{n} x_i^{2}\left(y_i - a x_i^{2} - b x_i - c\right) = 0 \qquad (1)$$
$$\frac{\partial SS}{\partial b} = -2\sum_{i=1}^{n} x_i\left(y_i - a x_i^{2} - b x_i - c\right) = 0 \qquad (2)$$
$$\frac{\partial SS}{\partial c} = -2\sum_{i=1}^{n} \left(y_i - a x_i^{2} - b x_i - c\right) = 0 \qquad (3)$$
Quadratic Regression (cont’d)
• Solve the linear system of equations:
$$\begin{bmatrix} \sum x_i^{4} & \sum x_i^{3} & \sum x_i^{2} \\ \sum x_i^{3} & \sum x_i^{2} & \sum x_i \\ \sum x_i^{2} & \sum x_i & n \end{bmatrix}\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} \sum x_i^{2} y_i \\ \sum x_i y_i \\ \sum y_i \end{bmatrix}$$
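In practice this 3×3 system is what a library polynomial fit solves internally; a one-line sketch using IDL's POLY_FIT (whose full signature appears in the Implementation section), assuming data vectors X and Y:

coeffs = POLY_FIT(X, Y, 2, YFIT=yfit)   ; returns [c, b, a] for y = c + b*x + a*x^2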
Exponential Fitting
Linearize the equation and apply the fit to a straight line.
Fitted curve: y = 4.2986·e^(0.2668x), R² = 0.9935
[Plots: the same data and fit shown on linear and logarithmic y-axis scales]
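A minimal sketch of this linearisation in IDL (assuming data vectors X and Y with strictly positive Y):

c = LINFIT(X, ALOG(Y))        ; straight-line fit to ln(y) = ln(A) + B*x
A = EXP(c[0])                 ; amplitude
B = c[1]                      ; rate
PRINT, 'y = ', A, ' * exp(', B, ' * x)'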
Logarithmic Fitting
Power Law Fitting
A (far from) exhaustive list of regression transforms:

Function | Transformed Y | Transformed X | Equivalent linear regression
y = A·e^(Bx) | ln(y) | x | ln(y) = ln(A) + B·x
y = A·x^B | ln(y) | ln(x) | ln(y) = ln(A) + B·ln(x)
y = A + B·ln(x) | y | ln(x) | y = A + B·ln(x)
y = 1/(A + B·x) | 1/y | x | 1/y = A + B·x
… | … | … | …
Even Excel…

X | Y
1 | 3
2 | 2
3 | 4
4 | 7
5 | 9

Fit: y = 1.7x - 0.1, R² = 0.85
[Plot: Y (dependent) against X (independent) with the fitted line]
Correlation Coefficient
• Given a relation between y and x, how good is the fit?
• The parameter which conveys this information is the correlation coefficient, usually denoted by r:
$$r = \left(1 - \frac{\sigma_{y,x}^{2}}{\sigma_{y}^{2}}\right)^{1/2}$$
where $\sigma_{y,x}^{2}$ is the variation in the residuals and $\sigma_{y}^{2}$ is the variation in the data.
Eventually…
$$\sigma_{y} = \left(\frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}}{n-1}\right)^{1/2} \qquad \sigma_{y,x} = \left(\frac{\sum_{i=1}^{n}\left(y_i - Y_i\right)^{2}}{n-2}\right)^{1/2}$$
where $\sigma_{y}$ is the variation in Y about its mean and $\sigma_{y,x}$ is the variation in Y about its prediction. For a straight-line fit this works out to:
$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\left(n\sum x_i^{2} - \left(\sum x_i\right)^{2}\right)^{1/2}\left(n\sum y_i^{2} - \left(\sum y_i\right)^{2}\right)^{1/2}}$$
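A quick sketch of the same quantity in IDL (assuming data vectors X and Y); CORRELATE implements the Pearson r, shown alongside the hand-rolled formula above for comparison:

r1 = CORRELATE(X, Y)
n  = N_ELEMENTS(X)
r2 = (n*TOTAL(X*Y) - TOTAL(X)*TOTAL(Y)) / $
     SQRT((n*TOTAL(X^2) - TOTAL(X)^2) * (n*TOTAL(Y^2) - TOTAL(Y)^2))
PRINT, r1, r2, r1^2    ; r by both routes, and r^2 (proportion of explained variability)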
Correlation Coefficient (cont’d)
• The value of r ranges from -1 (perfectly correlated in the negative direction) to +1 (perfectly correlated in the positive direction).
• When r = 0, the two variables are not correlated.
• The “goodness” of fit is usually given by r². This is also known as the proportion of explained variability and is used as the basic descriptor of fit quality.
Fit Quality (STATISTICA ver. 7.0), part of GLM

Test of SS Whole Model vs. SS Residual (Spreadsheet1):
Dependent Variable | Multiple R | Multiple R² | Adjusted R² | SS Model | df Model | MS Model | SS Residual | df Residual
Y | 0.921954 | 0.850000 | 0.800000 | 28.90000 | 1 | 28.90000 | 5.100000 | 3

Univariate Tests of Significance for Y (Spreadsheet1; sigma-restricted parameterization, effective hypothesis decomposition):
Effect | SS | Degr. of Freedom | MS | F | p
Intercept | 0.00909 | 1 | 0.00909 | 0.00535 | 0.946308
X | 28.90000 | 1 | 28.90000 | 17.00000 | 0.025865
Error | 5.10000 | 3 | 1.70000 | |

Parameter Estimates (Spreadsheet1; sigma-restricted parameterization):
Effect | Param. | Std.Err | t | p | -95% Cnf.Lmt | +95% Cnf.Lmt
Intercept | -0.100000 | 1.367479 | -0.073127 | 0.946308 | -4.45193 | 4.251930
X | 1.700000 | 0.412311 | 4.123106 | 0.025865 | 0.38784 | 3.012156
Model/Transform Identification
[Plot: Y against x, for x from 0 to 30]
Automatic Model Identification (SPSS ver. 12)
Independent: x

Mth | Rsq | d.f. | F | Sigf | Upper bound | b0 | b1 | b2 | b3
LIN | .984 | 23 | 1412.48 | .000 | | 7.0775 | 1.0250 | |
LOG | .800 | 23 | 92.22 | .000 | | 7.7379 | 5.5463 | |
INV | .256 | 23 | 7.93 | .010 | | 20.4520 | -1.9434 | |
QUA | .993 | 22 | 1470.38 | .000 | | 5.6967 | 1.3839 | -.0149 |
CUB | .996 | 21 | 1841.49 | .000 | | 4.7333 | 1.9133 | -.0710 | .0016
COM | .818 | 23 | 103.33 | .000 | | 8.0675 | 1.0664 | |
POW | .977 | 23 | 996.62 | .000 | | 7.2018 | .4219 | |
S | .528 | 23 | 25.69 | .000 | | 2.9657 | -.1919 | |
GRO | .818 | 23 | 103.33 | .000 | | 2.0878 | .0643 | |
EXP | .818 | 23 | 103.33 | .000 | | 8.0675 | .0643 | |
LGS | .818 | 23 | 103.33 | .000 | . | .1240 | .9377 | |

Actual function used: Y = 3·exp((0.7x)^0.3)

These 4 plots have the same slopes, intercepts and r values!
Plots are pictures of science, worth thousands of words in boring tables.
Automatic Model Identification (SPSS ver. 12): fitted curves
[Plot: observed data with the Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential and Logistic fits overlaid]
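One way to reproduce this kind of screening by hand; a hedged sketch in IDL that scores a few linearisable candidates by r² on transformed axes (assumes positive X and Y; this is not the SPSS algorithm itself):

r2_lin = CORRELATE(X, Y)^2                  ; y     = b0 + b1*x
r2_log = CORRELATE(ALOG(X), Y)^2            ; y     = b0 + b1*ln(x)
r2_exp = CORRELATE(X, ALOG(Y))^2            ; ln(y) = b0 + b1*x
r2_pow = CORRELATE(ALOG(X), ALOG(Y))^2      ; ln(y) = b0 + b1*ln(x)
PRINT, 'LIN LOG EXP POW: ', r2_lin, r2_log, r2_exp, r2_pow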
Non-linear Transform Exploration (STATISTICA v. 7)
Actual function used: Y = 3·exp((0.7x)^0.3)
Non-linear Transform Exploration in Pictures
Excel (transform & scale) + STATISTICA (categorized plot)
[Categorized scatterplots: Y, Y², Y³, Y^0.5, Y^0.3, log(Y) and exp(Y) plotted against X, X², X³, X^0.5, X^0.3, log(x) and exp(x)]
• Transformations can be very useful when used appropriately.
• But beware, follow these rules:
– You should transform your data when the transformation makes the variability more consistent and more Gaussian.
– You should not transform data when the transformation makes the variability less consistent and less Gaussian.
This is ONE BIG and (not) really MEAN square!
[Plots: Y, Y+noise and the transform-based fit, shown on linear and logarithmic y-axis scales]
Note: this is the result of an exponential fit performed by MS Excel.
Choice of distance and error weighting
[Four panels: the same data fitted with different definitions of the distance to be minimised]
Vertical offsets: $SS = \sum_{i=1}^{M}\left(y_i - a - b x_i\right)^{2}$
Perpendicular offsets: $SS = \sum_{i=1}^{M}\left[\left(y_i - a - b x_i\right)\cos(\arctan b)\right]^{2} = \sum_{i=1}^{M}\frac{\left(y_i - a - b x_i\right)^{2}}{1 + b^{2}}$
Weighted by measurement error in Y: $SS = \sum_{i=1}^{M}\frac{\left(y_i - a - b x_i\right)^{2}}{\sigma_{Y_i}^{2}}$
Errors in both X and Y: $SS = \sum_{i=1}^{M}\frac{\left(y_i - a - b x_i\right)^{2}}{\sigma_{Y_i}^{2} + b^{2}\sigma_{X_i}^{2}}$
Vertical vs Perpendicular offsets
• “In practice, the vertical offsets from a line (polynomial, surface, hyperplane, etc.) are almost always minimized instead of the perpendicular offsets. This provides a much simpler analytic form for the fitting parameters.”
• Minimizing the perpendicular R² for a second- or higher-order polynomial leads to polynomial equations of higher order, so this formulation cannot be extended.
• “In any case, for a reasonable number of noisy data points, the difference between vertical and perpendicular fits is quite small.” [MathWorld]
Regression of data with X & Y errors
[Plot: data with errors in both X and Y, with the fitted line]
$$SS = \sum_{i=1}^{M}\frac{\left(y_i - a - b x_i\right)^{2}}{\sigma_{Y_i}^{2} + b^{2}\sigma_{X_i}^{2}} = \sum_{i=1}^{M}\left(y_i - a - b x_i\right)^{2} \cdot w_i, \qquad w_i = \frac{1}{\sigma_{Y_i}^{2} + b^{2}\sigma_{X_i}^{2}}$$
Offset: $a = \dfrac{\sum_{i=1}^{M} w_i\left(y_i - b x_i\right)}{\sum_{i=1}^{M} w_i}$
Slope: b = ? It can NOT be calculated analytically. It must be optimised and fed back into the calculation of the weights!
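A hedged sketch of one possible iteration scheme in IDL, alternating a weighted fit with a weight update (assumes vectors X, Y and per-point error arrays sigX, sigY; convergence is not guaranteed for all data):

c = LINFIT(X, Y)
b = c[1]                                   ; unweighted starting slope
FOR iter = 0, 99 DO BEGIN
  w  = 1d / (sigY^2 + b^2 * sigX^2)        ; weights depend on the current b
  xm = TOTAL(w*X) / TOTAL(w)
  ym = TOTAL(w*Y) / TOTAL(w)
  bNew = TOTAL(w*(X-xm)*(Y-ym)) / TOTAL(w*(X-xm)^2)
  IF ABS(bNew - b) LT 1e-8 THEN BREAK
  b = bNew
ENDFOR
a = ym - b*xm                              ; offset, as in the formula above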
Non-linear fitting
• Often a linearized approach is not adequate:
– Linearisation is not possible or introduces errors
– Too cumbersome
• An optimal fit with a non-linear function is usually also obtained with least squares.
• The difference lies in the fact that the optimal set of function parameters (a1, …, an) must be found iteratively, by trial and error, until the best combination is found.
• Thus the problem reduces to the minimisation of a function (SS) in multidimensional space.
• Important aspects:
– Choice of optimisation method
– Start parameters
– Convergence threshold and method
Minimising SS
• Treat sum of squares as a continuous function of the m parameters and search the m-dimensional space for the appropriate minimum value
$$SS = \sum_{i=1}^{M}\left(y_i - f(x_i, a_0, a_1, a_2, \ldots, a_m)\right)^{2} \cdot w_i$$
• Grid Search: vary each parameter in turn, minimizing chi-squared with respect to each parameter independently. Many successive iterations are required to locate the minimum of chi-squared unless the parameters are independent.
• Gradient Search: vary all parameters simultaneously, adjusting the relative magnitudes of the variations so that the direction of propagation in parameter space is along the direction of steepest descent of chi-squared.
• Expansion Methods: find an approximate analytical function that describes the chi-squared hypersurface and use this function to locate the minimum. The number of computed points is smaller, but the computations are considerably more complicated.
• Marquardt Method: a combination of the gradient and expansion methods.
From Bevington and Robinson
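As a toy illustration of the simplest of these, a grid search over a two-parameter line model in IDL (the scan ranges and resolution are arbitrary assumptions; real problems use the library minimisers listed under Implementation):

best = !VALUES.D_INFINITY
FOR ia = 0, 99 DO BEGIN
  FOR ib = 0, 99 DO BEGIN
    atry = -5d + ia * 0.1d                 ; scan a over [-5, 5)
    btry = -5d + ib * 0.1d                 ; scan b over [-5, 5)
    ss = TOTAL(w * (Y - (atry + btry*X))^2)
    IF ss LT best THEN BEGIN
      best  = ss
      abest = atry
      bbest = btry
    ENDIF
  ENDFOR
ENDFOR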
Problems to look out for
• Ringing: limit the maximum number of iterations
• High initial SS: use a stop condition, $\frac{\Delta SS}{SS} < \varepsilon$
• Multiple minima
• Select a good starting point
• Inspect the results
[Plot: SS as a function of iteration number]
Real Life Example: MRS estimation of CSF/Brain Partial Volume
Amplitude = Amp*(PV*exp(-TE/tau1) + (1-PV)*exp(-TE/tau2))
[Plots: normalised signal against TE (echo time) for Brain and CSF, on linear and log scales]
Note: a sum of exponentials cannot be linearized!
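A hedged sketch of how this model can be fed to IDL's CURVEFIT (listed under Implementation below); the start parameters reuse the values from the AMOEBA example at the end of these slides, and Weights would come from the noise model discussed next:

PRO biexp, X, A, F, pder
  ; A = [Amp, PV, tau1, tau2]; pder is unused with /NODERIVATIVE
  F = A[0] * (A[1]*EXP(-X/A[2]) + (1-A[1])*EXP(-X/A[3]))
END

A = [5000d, 0.9d, 0.05d, 0.5d]             ; start parameters
yfit = CURVEFIT(TE, Amplitudes, Weights, A, sigma, $
                FUNCTION_NAME='biexp', /NODERIVATIVE, ITMAX=200)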
Proton Signal Amplitudes vs TE (9 cases, 12 TEs)
[Scatterplot: amplitudes (-500 to 5000) against TE (0 to 1.6)]
Signal variation depends on data → estimate weighting
• Pooled data (initial analysis): tenfold SD change
[Plots: amplitude means and standard deviations against TE, and amplitude SD against mean amplitude]
Signal variation on an individual basis (9 cases, 16 repetitions, 12 TEs)
[Plot: SD of signal amplitude against TE [ms], by case ID 0 to 8]
Model of signal variation
SD of signal amplitude modelled as: Function = 100/(x^0.35)
[Plot: per-case SD of signal amplitude against TE [ms], with the model curve]
Inventory
✓ Data
✓ Model
✓ 1/Weights
[Plots: amplitudes against TE on a log scale, and the SD model, Function = 100/(x^0.35)]
STATISTICA ver. 7
Model: Amplitudes = Amp*(PV*exp(-TE/tau1) + (1-PV)*exp(-TE/tau2))
Fitted: y = (1820.94)*((2.47271)*exp(-x/(.095016)) + (1-(2.47271))*exp(-x/(.922e-3)))
[Plot: the fitted curve plunges to the order of -10^97 within the TE range]
[Plot: the same fit shown over the data, amplitudes 0 to 5000 against TE]
Better!
Model: Amplitudes = Amp*(PV*exp(-TE/tau1) + (1-PV)*exp(-TE/tau2))
Fitted: y = (5164.41)*((.889446)*exp(-x/(.061098)) + (1-(.889446))*exp(-x/(.584223)))
[Plot: data and fitted curve, amplitudes -1000 to 6000 against TE]
More checks
[Plot: observed versus predicted values]
Value of Residuals (good fit, case = 1)
[Plot: predicted versus residual values; residuals within about ±60]
Value of Residuals (poor fit, case = 0)
[Plot: predicted versus residual values; residuals reaching -200]
Problem not easy to spot otherwise
[Plot: observed versus predicted values]
Model: Amplitudes = Amp*(PV*exp(-TE/tau1) + (1-PV)*exp(-TE/tau2))
Fitted: y = (5476.15)*((.850048)*exp(-x/(.05812)) + (1-(.850048))*exp(-x/(.551334)))
[Plot: data and fitted curve against TE]
Robust Estimation
• “Insensitive to small departures from the idealized assumptions for which the estimator is optimized.”
• Typical departures:
– Fractionally large deviations for a small number of data points
– Measurement errors that are not normally distributed
• → Use alternative estimates of error.
• The general idea is that the weight given to individual points should first increase with deviation, then decrease.
Numerical Recipes in C, Cambridge University Press, Chapter 15.7
Alternative estimates of error
• M-values
• L-values
• R-values
• A-priori knowledge

Two-sided exponential: $\rho(z) = |z|, \quad \psi(z) = \mathrm{sgn}(z)$
Lorentzian: $\rho(z) = \log\left(1 + \tfrac{1}{2}z^{2}\right), \quad \psi(z) = \dfrac{z}{1 + \tfrac{1}{2}z^{2}}$
Alternative estimates of error (M-estimates)
• Based on maximum-likelihood estimates using a-priori known (assumed) distributions.
– Similar to the sum of squares, but the solutions differ if non-Gaussian distributions are considered.
– In theory, analytic solutions are available from the set of equations for the parameters a_k:
$$\mathrm{Error} = \sum_{i=1}^{N}\rho(z_i), \qquad z_i = \frac{y_i - y(x_i; a_1 \ldots a_M)}{\sigma_i}$$
$$0 = \sum_{i=1}^{N}\frac{1}{\sigma_i}\,\psi(z_i)\,\frac{\partial y(x_i; a_1 \ldots a_M)}{\partial a_k}, \qquad k = 1 \ldots M$$
– In practice they are solved numerically using alternative cost functions:
• Two-sided exponential distribution: minimising the mean absolute deviation.
• Cauchy/Lorentzian distribution.
• Arbitrary functions with arbitrary parameters: Andrew’s sine, Tukey’s biweight.
Alternative estimates of error (L-estimates)
• Linear combinations of order statistics
• “Median” of squares
• Tukey’s trimean of quartiles: Err = SSQ1 + 2*SSQ2 + SSQ3
• Does not require assumptions about the error distribution
• Ordering is time consuming
Idea: on the distribution of (Observed - Predicted)², IGNORE THE LARGEST DEVIATIONS!
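A minimal sketch of the "ignore the largest deviations" idea in IDL (the 75% retention fraction is an arbitrary illustrative choice; Yobs and Ypred are assumed data and model vectors):

dev2  = (Yobs - Ypred)^2
srt   = dev2[SORT(dev2)]                   ; squared deviations, ascending
nkeep = (3 * N_ELEMENTS(srt)) / 4          ; keep the smallest 75%
err   = TOTAL(srt[0 : nkeep-1])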
Alternative estimates of error (R-estimates and a-priori information)
• Rank test statistics: Wilcoxon, Spearman rank-order correlation, Kolmogorov-Smirnov
• A-priori information on data validity
[Plots: per-case SD of amplitudes and SD of frequencies (Hz) against TE]
Example: FEAT Motion Correction Report for FEAT session MRdata/010755_2004_07_08_co2/BOLD_series_2.3+.feat (Tue Jan 18 11:49:45 GMT 2005). Mean (across voxels) voxel displacements: absolute (each time point with respect to the reference image) = 0.73 mm; relative (each time point with respect to the previous time point) = 0.16 mm.
Implementation (STATISTICA)
[STATISTICA dialogs]
Implementation: Other Software
• Statistics software (dialogs/scripting): Excel (Solver module must be installed), SPSS, STATGRAPHICS, STATISTICA, GraphPad, …
• Programming (with libraries!): C, Pascal, …, Matlab, IDL, …
Programmatic Implementation: IDL

Specific fit functions:
• LINFIT( X, Y [, CHISQ=variable] [, COVAR=variable] [, /DOUBLE] [, MEASURE_ERRORS=vector] [, PROB=variable] [, SIGMA=variable] [, YFIT=variable] )
• LADFIT( X, Y [, ABSDEV=variable] [, /DOUBLE] ) ; linear robust
• COMFIT( X, Y, A {, /EXPONENTIAL | , /GEOMETRIC | , /GOMPERTZ | , /HYPERBOLIC | , /LOGISTIC | , /LOGSQUARE} [, SIGMA=variable] [, WEIGHTS=vector] [, YFIT=variable] ) ; common functions
• GAUSSFIT( X, Y [, A] [, CHISQ=variable] [, ESTIMATES=array] [, MEASURE_ERRORS=vector] [, NTERMS=integer{3 to 6}] [, SIGMA=variable] [, YERROR=variable] )
• GAUSS2DFIT( Z, A [, X, Y] [, /NEGATIVE] [, /TILT] ) ; Gaussian surface
• POLY_FIT( X, Y, Degree [, CHISQ=variable] [, COVAR=variable] [, /DOUBLE] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, STATUS=variable] [, YBAND=variable] [, YERROR=variable] [, YFIT=variable] )
• REGRESS( X, Y [, CHISQ=variable] [, CONST=variable] [, CORRELATION=variable] [, /DOUBLE] [, FTEST=variable] [, MCORRELATION=variable] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, STATUS=variable] [, YFIT=variable] )
• SFIT( Data, Degree [, /IRREGULAR, KX=variable, /MAX_DEGREE] ) ; polynomial surface fit

General solvers:
• LMFIT( X, Y, A [, ALPHA=variable] [, CHISQ=variable] [, CONVERGENCE=variable] [, COVAR=variable] [, /DOUBLE] [, FITA=vector] [, FUNCTION_NAME=string] [, ITER=variable] [, ITMAX=value] [, ITMIN=value] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, TOL=value] ) ; Levenberg-Marquardt algorithm
• CURVEFIT( X, Y, Weights, A [, Sigma] [, CHISQ=variable] [, /DOUBLE] [, FITA=vector] [, FUNCTION_NAME=string] [, ITER=variable] [, ITMAX=value] [, /NODERIVATIVE] [, STATUS={0 | 1 | 2}] [, TOL=value] [, YERROR=variable] ) ; gradient-expansion algorithm
• SVDFIT( X, Y [, M] [, A=vector] [, CHISQ=variable] [, COVAR=variable] [, /DOUBLE] [, FUNCTION_NAME=string] [, /LEGENDRE] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, SING_VALUES=variable] [, SINGULAR=variable] [, STATUS=variable] [, TOL=value] [, VARIANCE=variable] [, YFIT=variable] )
• AMOEBA( Ftol [, FUNCTION_NAME=string] [, FUNCTION_VALUE=variable] [, NCALLS=value] [, NMAX=value] [, P0=vector, SCALE=vector | , SIMPLEX=array] ) ; downhill simplex minimisation
• POWELL, P, Xi, Ftol, Fmin, Func [, /DOUBLE] [, ITER=variable] [, ITMAX=value]
Programmatic Implementation: IDL, use of general solvers

; Define a function that will calculate the cost
FUNCTION MyError, Parameters
  COMMON MEcom, Xi, Yi, Weightsi
  Yest = Model(Xi, Parameters)                 ; user-supplied model function
  Error = TOTAL(ABS(Yest - Yi) * Weightsi)     ; e.g. weighted absolute deviation
  RETURN, Error                                ; AMOEBA expects a scalar cost
END ; function MyError

; Define necessary start parameters and call the general solver routine
Epsilon = 1e-5
P0 = [5000, 0.9, 0.05, 0.5]
Scale = P0 / 5.
Result = AMOEBA(Epsilon, SCALE=Scale, P0=P0, $
                FUNCTION_NAME='MyError', NMAX=1000)
Sub-sample variability in model estimates
[Plots]
Impact of the removal of data with high errors of amplitude estimates
[Plots]
Comparing Models
[Plot: the five-point dataset from the Excel example fitted with a straight line, y = 1.7x - 0.1 (R² = 0.85), and with a quartic, y = -6E-14x⁴ - 0.3333x³ + 3.5x² - 9.1667x + 9 (R² = 1)]
• Use both the Sum of Squares and the number of degrees of freedom.
• F ≈ 1 → the models are equivalent.
• Transform through the F-distribution to get the probability that the gain is real:
$$F = \frac{\left(SS_{Model1} - SS_{Model2}\right) / \left(DOF_{Model1} - DOF_{Model2}\right)}{SS_{Model2} / DOF_{Model2}}$$
Model selection based on analysis of residuals
• Runs test:
– Expected number of runs ≈ 1 + N/2 = 21 (= 1 + 2·N₊·N₋/(N₊ + N₋))
– Actual number of runs = 13
– P = 0.0077 → the model needs an upgrade
+ +++++++ ++++++++ + + + ++
----- - - - ----- --------
H. Motulsky & A. Christopoulos, Fitting Models to Biological Data using Linear and Non-linear Regression. www.graphpad.com
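A minimal sketch of counting runs of residual signs in IDL (assumes a residual vector resid with no exact zeros):

s = resid GT 0                       ; 1 where the residual is positive, 0 where negative
n = N_ELEMENTS(s)
runs = 1 + TOTAL(s[1:n-1] NE s[0:n-2])
PRINT, 'Number of runs: ', runs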
Model selection in practice
• Not like the textbook example:
– More noise
– Repeated measures
– Clustered samples
– Unobvious models
[Plot: predicted versus residual values]
Other way: modelling
• Select prime suspects
• Test the impact of:
– CBF inflow
– CSF inflow
– Diffusive mixing between compartments
Modelling
• Test a specific hypothesis, e.g. “Is there really an additional downwards deflection in the data for long TE?”
[Plot: per-case amplitudes against TE, on a log scale]