5. Multiway calibration


Upload: havily

Post on 30-Jan-2016


TRANSCRIPT

Page 1: 5. Multiway calibration


5. Multiway calibration

Quimiometria Teórica e Aplicada (Theoretical and Applied Chemometrics)

Instituto de Química - UNICAMP

Page 2: 5. Multiway calibration

Multiway regression problems, e.g. batch reaction monitoring

[Figure: process measurements X (batch × process variable × time) and product quality Y (batch × product quality).]

Page 3: 5. Multiway calibration

Multiway regression problems, e.g. tandem mass spectroscopy

[Figure: MS-MS spectra X1 to X5 forming a three-way array (sample × parent ion m/z × daughter ion m/z) and compound concentrations (sample × compound).]

Page 4: 5. Multiway calibration

Some terminology

• zero-order: univariate calibration (OLS, ordinary least squares). Cannot handle interferents.

• first-order: multivariate calibration (ridge regression, PCR, PLS etc.). Can handle interferents if they are present in the training set.

• second-order: second-order advantage (PARAFAC, restricted Tucker, GRAM, RBL etc.). Can handle unknown interferents (although see work of K. Faber).

• N-PLS(?)

Page 5: 5. Multiway calibration

Multiway calibration methods

• PARAFAC (already discussed on first day)

• (Unfold-PLS)

• Multiway PCR

• N-PLS

• MCovR (multiway covariates regression) (see work of Smilde & Gurden)

• GRAM, NBRA, RBL (see work of Kowalski et al.)

Page 6: 5. Multiway calibration

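Unfold-PLS

• Matricize (or ‘unfold’) the data and use standard two-way PLS.

[Figure: the three-way array X (I × J × K) is unfolded into an I × JK matrix (slabs labelled X1 ... XI on the slide) and regressed against Y (I × M).]

• But if a multiway structure exists in the data, multiway methods have some important advantages!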
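The unfolding step can be sketched in NumPy as below. This is only an illustration: the array sizes and the column ordering (column index = k·J + j) are assumptions, since the slide does not fix a particular ordering.

```python
import numpy as np

# Hypothetical sizes: I samples, J variables in mode 2, K variables in mode 3.
I, J, K = 4, 3, 2
X = np.arange(I * J * K, dtype=float).reshape(I, J, K)

# Unfold to an I x JK matrix: each row holds one sample, with its K slabs
# of J variables placed side by side (column index = k * J + j).
X_unf = X.transpose(0, 2, 1).reshape(I, J * K)

# Refolding reverses the operation and recovers the three-way array.
X_back = X_unf.reshape(I, K, J).transpose(0, 2, 1)
```

Any standard two-way PLS routine could then be applied to `X_unf`; the multiway structure is simply ignored by that model.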

Page 7: 5. Multiway calibration

Two-way PCR

• Standard PCR for X (I × J) and y (I × 1).

1. Calculate PCA model of X:

$X = TP^{T} + E$

2. Use PCA scores for ordinary regression:

$y = Tb + E$

$b = (T^{T}T)^{-1}T^{T}y$

3. Make predictions for new samples:

$T_{new} = X_{new}P$

$y_{new} = T_{new}b$

[Figure: X decomposed into scores T, loadings PT and residuals E; y regressed on T to give b.]
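The three PCR steps can be sketched as follows. This is a minimal illustration, assuming X and y are already mean-centred and that the PCA is computed by SVD; the function names `pcr_fit` and `pcr_predict` and the synthetic data are hypothetical.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Steps 1 and 2: PCA of X, then least-squares regression of y on the scores."""
    # PCA via SVD; the rows of Vt are orthonormal loading vectors.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:n_components].T                 # loadings, J x R
    T = X @ P                               # scores, I x R
    b = np.linalg.solve(T.T @ T, T.T @ y)   # b = (T'T)^-1 T'y
    return P, b

def pcr_predict(X_new, P, b):
    """Step 3: project new samples onto the loadings and apply b."""
    return (X_new @ P) @ b

# Synthetic, noise-free rank-2 data (made up for illustration only).
rng = np.random.default_rng(0)
loadings = rng.normal(size=(5, 2))
scores = rng.normal(size=(20, 2))
beta = np.array([1.0, -2.0])
X = scores @ loadings.T
y = scores @ beta

P, b = pcr_fit(X, y, n_components=2)

scores_new = rng.normal(size=(3, 2))
y_pred = pcr_predict(scores_new @ loadings.T, P, b)
```

With noise-free rank-2 data and two components, `y_pred` reproduces `scores_new @ beta`; with real data the number of components would need to be chosen, for example by cross-validation.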

Page 8: 5. Multiway calibration

Multiway PCR

• Multiway PCR for X (I × J × K) and y (I × 1).

1. Calculate multiway model:

$X = A(C \odot B)^{T} + E$

Here X is the unfolded (I × JK) array and $\odot$ is the column-wise Kronecker (Khatri-Rao) product of the loading matrices C and B.

2. Use scores for regression:

$y = Ab_{PCR} + E$

$b_{PCR} = (A^{T}A)^{-1}A^{T}y$

3. Make predictions for new samples:

$A_{new} = X_{new}P(P^{T}P)^{-1}$, where $P = (C \odot B)$

$y_{new} = A_{new}b_{PCR}$

[Figure: X decomposed into scores A, loadings BT and CT, and residuals E; y regressed on A to give bPCR.]
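The regression and prediction steps can be sketched as below. This is a hedged illustration, not a full multiway PCR: it assumes the loadings B and C have already been estimated (for example by PARAFAC) and, for the demonstration, simply uses the loadings that generated the synthetic data. The helper names `khatri_rao`, `unfold` and `scores_from_loadings` are hypothetical.

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Kronecker product: column r equals kron(C[:, r], B[:, r])."""
    K, R = C.shape
    J = B.shape[0]
    return np.einsum('kr,jr->kjr', C, B).reshape(K * J, R)

def unfold(X):
    """Unfold an (I, J, K) array to (I, J*K) with column index k * J + j."""
    I, J, K = X.shape
    return X.transpose(0, 2, 1).reshape(I, J * K)

def scores_from_loadings(X, P):
    """Scores A = X P (P'P)^-1 for the unfolded X."""
    return np.linalg.solve(P.T @ P, (unfold(X) @ P).T).T

# Synthetic, noise-free two-component trilinear data (illustration only).
rng = np.random.default_rng(1)
A_true = rng.normal(size=(10, 2))
B = rng.normal(size=(4, 2))
C = rng.normal(size=(3, 2))
b_true = np.array([1.5, -0.5])
X = np.einsum('ir,jr,kr->ijk', A_true, B, C)
y = A_true @ b_true

# Step 2: regress y on the scores.
P = khatri_rao(C, B)
A = scores_from_loadings(X, P)
b_pcr = np.linalg.solve(A.T @ A, A.T @ y)

# Step 3: predict new samples from their projected scores.
A_new_true = rng.normal(size=(3, 2))
X_new = np.einsum('ir,jr,kr->ijk', A_new_true, B, C)
y_new = scores_from_loadings(X_new, P) @ b_pcr
```

The unfolding order must match the order of the Khatri-Rao product; here both use column index k·J + j.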

Page 9: 5. Multiway calibration

N-PLS

• N-PLS is a direct extension of standard two-way PLS for N-way arrays.

• The advantages of N-PLS are the same as for any multiway analysis:
– a more parsimonious model
– loadings which are easier to plot and interpret

Page 10: 5. Multiway calibration

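N-PLS

• The standard two-way PLS algorithm (see ‘Multivariate Calibration’ by Martens and Næs):

1. $\max_{w_r} \operatorname{cov}(X_{r-1}w_r,\ y_{r-1})$ with $\|w_r\| = 1$

2. $t_r = X_{r-1}w_r$

3. $X_r = X_{r-1} - t_r w_r^{T}$

4. $y_r = y_0 - Uq_r$

• The N-PLS algorithm (R. Bro) uses PARAFAC-type loadings, but is otherwise very similar:

1. $\max_{w_r, v_r} \operatorname{cov}(X_{r-1}(v_r \otimes w_r),\ y_{r-1})$ with $\|w_r\| = \|v_r\| = 1$

2. $t_r = X_{r-1}(v_r \otimes w_r)$

3. $X_r = X_{r-1} - t_r(v_r \otimes w_r)^{T}$

4. $y_r = y_0 - Uq_r$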
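A rough sketch of the N-PLS steps for a three-way X and a single y follows. Several points are assumptions rather than statements from the slides: the data are taken to be centred; the weights w and v are obtained as the leading singular vectors of the matrix Z with entries Z[j, k] = Σᵢ yᵢ X[i, j, k], which maximises the covariance criterion in step 1; and step 4 is implemented by regressing the original y on all scores found so far. The function names are hypothetical.

```python
import numpy as np

def npls1_fit(X, y, n_components):
    """Sketch of N-PLS for X (I x J x K) and y (I,), with X deflation."""
    Xr = np.array(X, dtype=float)
    y0 = np.array(y, dtype=float)
    yr = y0.copy()
    W, V, scores = [], [], []
    for _ in range(n_components):
        # Step 1: unit vectors w, v maximising w' Z v are the leading
        # left and right singular vectors of Z.
        Z = np.einsum('i,ijk->jk', yr, Xr)
        U_svd, _, Vt = np.linalg.svd(Z)
        w, v = U_svd[:, 0], Vt[0]
        # Step 2: scores t = X (v kron w).
        t = np.einsum('ijk,j,k->i', Xr, w, v)
        # Step 3: deflate X by the rank-one term t (v kron w)'.
        Xr = Xr - np.einsum('i,j,k->ijk', t, w, v)
        # Step 4: regress y0 on all scores so far and update the residual.
        scores.append(t)
        T = np.column_stack(scores)
        q, *_ = np.linalg.lstsq(T, y0, rcond=None)
        yr = y0 - T @ q
        W.append(w)
        V.append(v)
    return W, V, q

def npls1_predict(X_new, W, V, q):
    """Compute scores for new samples with the same deflation, then apply q."""
    Xr = np.array(X_new, dtype=float)
    scores = []
    for w, v in zip(W, V):
        t = np.einsum('ijk,j,k->i', Xr, w, v)
        Xr = Xr - np.einsum('i,j,k->ijk', t, w, v)
        scores.append(t)
    return np.column_stack(scores) @ q

# Synthetic rank-one, noise-free data (illustration only).
rng = np.random.default_rng(2)
a = rng.normal(size=12)
b = rng.normal(size=4)
c = rng.normal(size=3)
X = np.einsum('i,j,k->ijk', a, b, c)
y = 2.0 * a

W, V, q = npls1_fit(X, y, n_components=1)

a_new = rng.normal(size=5)
y_pred = npls1_predict(np.einsum('i,j,k->ijk', a_new, b, c), W, V, q)
```

For this rank-one example a single component is enough to reproduce `2.0 * a_new`; each component stores only J + K weights rather than J·K, which is where the parsimony of the model comes from.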

Page 11: 5. Multiway calibration

N-PLS graphic (taken from R. Bro)

Page 12: 5. Multiway calibration

Other methods

• Multiway covariates regression (MCovR)
– different to PLS-type models
– choice of structure on X (PARAFAC, Tucker, unfold etc.)
– sometimes loadings are easier to interpret
– $\min_{W}\ \alpha\|X - XWP_X^{T}\|^{2} + (1-\alpha)\|Y - XWP_Y^{T}\|^{2}$

• Restricted Tucker, GRAM, RBL, NBRA etc.
– for more specialized use
– second-order advantage, i.e. able to handle unknown interferents

[Figure: restricted loadings A with entries 1 and 0, for a standard (N) and a mixture (N + M).]

Page 13: 5. Multiway calibration

Conclusions

• There are a number of different calibration methods for multiway data.

• N-PLS is an extension of two-way PLS for multiway data.

• All the normal guidelines for multivariate regression still apply!
– watch out for outliers
– don’t apply the model outside of the calibration range

Page 14: 5. Multiway calibration

Outliers (1)

• Outliers are objects which are very different from the rest of the data. These can have a large effect on the regression model and should be removed.

[Figure: two plots of T (°C) against pH, before and after removing an outlier labelled ‘bad experiment’.]
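The effect of a single outlier on a regression line can be illustrated with a small sketch. The numbers below are made up for illustration and are not the data from the slide; the line T = 2 + 3·pH and the position of the 'bad experiment' are assumptions.

```python
import numpy as np

# Hypothetical data lying exactly on the line T = 2 + 3 * pH.
pH = np.linspace(1.0, 4.5, 8)
temp = 2.0 + 3.0 * pH

# Replace the last measurement with a 'bad experiment'.
temp_bad = temp.copy()
temp_bad[-1] = 4.0

# Ordinary least-squares line including the outlier.
slope_bad, intercept_bad = np.polyfit(pH, temp_bad, 1)

# The same fit after removing the outlying point.
keep = np.arange(pH.size) != pH.size - 1
slope_clean, intercept_clean = np.polyfit(pH[keep], temp_bad[keep], 1)
```

One bad point at the edge of the range is enough to pull `slope_bad` well below the slope recovered once the point is removed.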

Page 15: 5. Multiway calibration

Outliers (2)

• Outliers can also be found in the model space or in the residuals.

[Figure: two plots. Scores PC 2 against scores PC 1, and sum-of-squared residuals against time (min).]
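Both kinds of diagnostic can be computed from a PCA model, as in the sketch below. It is only an illustration: the function name `pca_diagnostics`, the variance-scaled score distance and the synthetic data are assumptions, and no statistical limits are computed.

```python
import numpy as np

def pca_diagnostics(X, n_components):
    """Return scores, per-sample sum-of-squared residuals and a score distance."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T
    T = Xc @ P                      # model space: scores
    E = Xc - T @ P.T                # residuals
    q_resid = np.sum(E ** 2, axis=1)
    # Squared scores scaled by their variance (a simple distance in model space).
    score_dist = np.sum(T ** 2 / T.var(axis=0, ddof=1), axis=1)
    return T, q_resid, score_dist

# Synthetic data: two structured variables plus small noise (illustration only).
rng = np.random.default_rng(3)
S = rng.normal(size=(30, 2)) * np.array([5.0, 3.0])
S[5] = 0.0                          # sample 5 is unremarkable in the model space
X = np.zeros((30, 6))
X[:, :2] = S
X += 0.01 * rng.normal(size=X.shape)
X[5, 4] += 4.0                      # but it does not fit the two-component model

T, q_resid, score_dist = pca_diagnostics(X, n_components=2)
worst = int(np.argmax(q_resid))
```

Sample 5 would look unremarkable in a scores plot but stands out in the residuals, which is why both plots are worth inspecting.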

Page 16: 5. Multiway calibration

Model extrapolation...

[Figure: height (cm) against age (months), for ages 18 to 30 months.]

• Univariate example: mean height vs age of a group of young children

• A strong linear relationship between height and age is seen.

• For young children, height and age are correlated.

Moore, D.S. and McCabe, G.P., Introduction to the Practice of Statistics (1989).

Page 17: 5. Multiway calibration

... can be dangerous!

[Figure: height (cm) against age (years), with the linear model extended from 0 to 30 years.]

Linear model was valid for this age range...

...but is not valid for 30 year olds!
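The danger can be made concrete with a short sketch. The heights below are hypothetical values generated from an assumed straight line, not the data from Moore and McCabe.

```python
import numpy as np

# Hypothetical calibration data: ages 18 to 29 months, heights on a straight line.
age_months = np.arange(18, 30)
height_cm = 65.0 + 0.63 * age_months

slope, intercept = np.polyfit(age_months, height_cm, 1)

# Prediction inside the calibration range (24 months).
inside = slope * 24 + intercept

# Prediction far outside the range (30 years = 360 months).
outside = slope * 360 + intercept
```

Inside the calibrated range the prediction is sensible, while the same line predicts a height of nearly three metres for a 30 year old.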