quimiometria teórica e aplicada instituto de química - unicamp

3. The Tucker3 model

Upload: buddy-sims

Post on 17-Jan-2018



TRANSCRIPT

Page 1: 3. The Tucker3 model

Quimiometria Teórica e Aplicada, Instituto de Química - UNICAMP

Page 2: PARAFAC & Tucker3

• The PARAFAC model has a strict trilinear structure:

$$x_{ijk} = \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr} + e_{ijk}$$

• Another generalization of PCA for multiway data is able to use a different number of components for each mode: the Tucker3 model.

[Figure: the data array drawn as a sum of component terms, "= + + etc."]
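The trilinear structure can be checked numerically. The following is a minimal sketch, assuming NumPy and small hypothetical dimensions (`I`, `J`, `K`, `R` and the random loadings are illustrative, not from the slides). It builds the noise-free part of a PARAFAC model in two equivalent ways.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 5, 4, 3, 2            # hypothetical sizes and number of components

A = rng.standard_normal((I, R))    # first-mode loadings  a_ir
B = rng.standard_normal((J, R))    # second-mode loadings b_jr
C = rng.standard_normal((K, R))    # third-mode loadings  c_kr

# x_ijk = sum_r a_ir * b_jr * c_kr  (model part, without the residual e_ijk)
X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)

# The same array written as an explicit sum of R rank-one "triads"
X_sum = sum(
    np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    for r in range(R)
)

print(np.allclose(X_hat, X_sum))
```

Each term of the sum uses the same component index `r` in all three modes, which is what makes the model strictly trilinear.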

Page 3: The Tucker3 model

e.g. Mode I has chemical rank 3, mode J has chemical rank 2 and mode K has chemical rank 4.

[Figure: X (I × J × K) is decomposed into loadings A (I × 3), B (J × 2) and C (K × 4), linked by the core array G (3 × 2 × 4), plus the residual array E (I × J × K).]
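As a rough illustration of the example above, the sketch below builds a noise-free Tucker3 array with 3, 2 and 4 components in the three modes and checks the rank of each unfolding. The array dimensions, the random loadings and the helper `unfold` are assumptions made for the illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K = 6, 5, 7                  # hypothetical array dimensions
R1, R2, R3 = 3, 2, 4               # components per mode, as in the example

A = rng.standard_normal((I, R1))
B = rng.standard_normal((J, R2))
C = rng.standard_normal((K, R3))
G = rng.standard_normal((R1, R2, R3))   # core array

# x_ijk = sum_p sum_q sum_r g_pqr * a_ip * b_jq * c_kr
X_hat = np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

def unfold(X, mode):
    """Matricize a 3-way array so that `mode` indexes the rows."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

# Rank of each unfolding of the noise-free model
ranks = [int(np.linalg.matrix_rank(unfold(X_hat, m))) for m in range(3)]
print(ranks)
```

Unlike PARAFAC, every combination of columns (p, q, r) can contribute, weighted by the corresponding core element.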

Page 4: The Tucker3 formula

$$X = A\,G\,(C \otimes B)^T + E$$

• Loadings
– A (I × R1) describes variation in the first mode
– B (J × R2) describes variation in the second mode
– C (K × R3) describes variation in the third mode

• Core array
– G (R1 × R2 × R3) is matricized into $G_{R_1 \times R_2R_3}$ (R1 × R2R3)
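The matricized formula can be compared with the elementwise definition. This sketch assumes that the product of C and B in the formula is a Kronecker product and that arrays are unfolded with the second-mode index running fastest. The helper name `matricize` and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, K = 4, 3, 5                  # hypothetical array dimensions
R1, R2, R3 = 2, 2, 3               # hypothetical numbers of components

A = rng.standard_normal((I, R1))
B = rng.standard_normal((J, R2))
C = rng.standard_normal((K, R3))
G = rng.standard_normal((R1, R2, R3))

def matricize(T):
    """Unfold a 3-way array to (dim 1) x (dim 2 * dim 3),
    with the second-mode index running fastest along the columns."""
    return T.transpose(0, 2, 1).reshape(T.shape[0], -1)

G_mat = matricize(G)                       # R1 x R2R3
X_mat = A @ G_mat @ np.kron(C, B).T        # I x JK, model part without E

# Elementwise Tucker3 model for comparison
X_hat = np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)
print(np.allclose(X_mat, matricize(X_hat)))
```

The column ordering of the unfolding has to match the ordering implied by `np.kron(C, B)`; a different unfolding convention would need `np.kron(B, C)` instead.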

Page 5: What does the core array mean?

• The core array describes the significance of the interactions between the different loadings, e.g. the Tucker3 (2,2,2) model can be written as

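$$
\begin{aligned}
X ={}& g_{111}\,a_1(c_1 \otimes b_1)^T + g_{112}\,a_1(c_2 \otimes b_1)^T + g_{121}\,a_1(c_1 \otimes b_2)^T + g_{122}\,a_1(c_2 \otimes b_2)^T \\
&+ g_{211}\,a_2(c_1 \otimes b_1)^T + g_{212}\,a_2(c_2 \otimes b_1)^T + g_{221}\,a_2(c_1 \otimes b_2)^T + g_{222}\,a_2(c_2 \otimes b_2)^T \\
&+ E
\end{aligned}
$$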

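[Figure: the 2 × 2 × 2 core array drawn as a cube with axes R1, R2 and R3 and element values 97, 0, 2.1, 26, 27, 3, 41 and 6.]

– g111 = 97: this triad is important
– g211 = 0: this interaction does not exist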
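The eight-term expansion can be reproduced as a sketch. Here g111 = 97 and g211 = 0 are taken from the slide, but the positions of the other six core values are an assumption, since the figure layout is not recoverable. The loadings are random orthonormal columns, also an assumption, chosen so that squared core elements add up to the model sum of squares.

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, K = 4, 3, 5                  # hypothetical array dimensions

# Orthonormal loading columns, two components per mode (assumed for illustration)
A, _ = np.linalg.qr(rng.standard_normal((I, 2)))
B, _ = np.linalg.qr(rng.standard_normal((J, 2)))
C, _ = np.linalg.qr(rng.standard_normal((K, 2)))

# 2 x 2 x 2 core. G[0,0,0] (g111 = 97) and G[1,0,0] (g211 = 0) follow the slide;
# the placement of the remaining six values is assumed.
G = np.array([[[97.0, 2.1], [26.0, 27.0]],
              [[0.0, 3.0], [41.0, 6.0]]])

# Sum of the eight weighted triads g_pqr * (a_p, b_q, c_r)
X_hat = np.zeros((I, J, K))
for p in range(2):
    for q in range(2):
        for r in range(2):
            triad = np.einsum("i,j,k->ijk", A[:, p], B[:, q], C[:, r])
            X_hat += G[p, q, r] * triad

# With orthonormal loadings, g_pqr**2 is the sum of squares carried by each triad
share = G**2 / np.sum(G**2)
print(np.round(share, 3))
```

A zero core element simply removes that combination of loading vectors from the model, while a large one marks a dominant interaction.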

Page 6: How many components to use in each mode?

Unfold along each mode and look at the eigenvalues.

[Figure: three scree plots, "Eigenvalue vs. PC Number" (x-axis: PC Number, y-axis: Eigenvalue), one for each unfolding.]

• First mode, X (I × JK): "knee" here, select 4 PCs for first mode
• Second mode, X (J × KI): select 2 PCs for second mode
• Third mode, X (K × IJ): select 3 PCs for third mode

Try Tucker3 (4,2,3)
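The unfold-and-inspect procedure can be sketched as follows. The data here are synthetic (a Tucker3 (4,2,3) structure plus very small noise), and `largest_gap` is a hypothetical stand-in for reading the knee off a scree plot by eye; it is not a method from the slides.

```python
import numpy as np

def unfold(X, mode):
    """Matricize a 3-way array so that `mode` indexes the rows."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_eigenvalues(X, mode):
    """Eigenvalues of Xm @ Xm.T for the unfolding Xm, largest first."""
    s = np.linalg.svd(unfold(X, mode), compute_uv=False)
    return s**2

def largest_gap(ev):
    """Number of eigenvalues before the largest drop on a log scale
    (a crude, assumed substitute for visual knee detection)."""
    drops = np.log(ev[:-1]) - np.log(ev[1:])
    return int(np.argmax(drops)) + 1

# Synthetic data with 4, 2 and 3 components in the three modes
rng = np.random.default_rng(4)
I, J, K = 20, 12, 15
A = rng.standard_normal((I, 4))
B = rng.standard_normal((J, 2))
C = rng.standard_normal((K, 3))
G = rng.standard_normal((4, 2, 3))
X = np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)
X = X + 1e-5 * rng.standard_normal(X.shape)   # small noise

picks = [largest_gap(mode_eigenvalues(X, m)) for m in range(3)]
print("components suggested per mode:", picks)
```

On real data the knee is rarely this sharp, so the eigenvalue plots are better treated as a starting point for trying a few candidate models.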

Page 7: When to use PARAFAC or Tucker3?

Fluorescence data: X has 5 samples, 201 emission wavelengths and 61 excitation wavelengths. Each mode has rank = 3.

Try 3-component PARAFAC or Tucker3 (3,3,3)

Process data: X has 35 batches, 12 sensors and 24 hours. The modes have different ranks (rank = 4, rank = 3 and rank = 2 in the figure).

Try Tucker3 (4,2,3)

Page 8: PARAFAC as a restricted Tucker model

• PARAFAC is a type of Tucker model for which the core array is a superidentity, I.

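[Figure: X (I × J × K) is decomposed into loadings A (I × R), B (J × R) and C (K × R), linked by the superidentity core array I (R × R × R), plus the residual array E (I × J × K). The core, drawn with axes R1, R2 and R3, has ones on the superdiagonal and zeros elsewhere.]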
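The restriction can be verified with a short sketch: a Tucker3 model whose core is a superidentity gives the same array as a PARAFAC model with the same loadings. Dimensions and random loadings are hypothetical, and `I_core` is just an illustrative name for the superidentity.

```python
import numpy as np

rng = np.random.default_rng(5)
I, J, K, R = 5, 4, 3, 2            # hypothetical sizes

A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

# Superidentity core: ones on the superdiagonal, zeros elsewhere
I_core = np.zeros((R, R, R))
idx = np.arange(R)
I_core[idx, idx, idx] = 1.0

X_tucker = np.einsum("pqr,ip,jq,kr->ijk", I_core, A, B, C)
X_parafac = np.einsum("ir,jr,kr->ijk", A, B, C)

print(np.allclose(X_tucker, X_parafac))
```

All off-superdiagonal interactions are switched off, so only the triads (a_r, b_r, c_r) with matching component index remain.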

Page 9: PARAFAC vs Tucker3

PARAFAC
– Same number of components in each mode. Core array is superidentity, I.
– Solution is unique. Easy to interpret.
– Strict, trilinear model. Good for some types of data.
– Algorithm sometimes slow and problematic.

Tucker3
– Can have different number of components in each mode. Core array, G.
– Rotational freedom. More difficult to interpret.
– Multiway subspace model. Good for exploratory analysis.
– Algorithm fast and robust.
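Rotational freedom in Tucker3 can be illustrated numerically. In this sketch the first-mode loadings are rotated by an arbitrary angle and the core is counter-rotated; the fitted array does not change. The angle, the dimensions and the random model are assumptions for the illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
I, J, K = 5, 4, 3                  # hypothetical array dimensions
R1, R2, R3 = 2, 2, 2

A = rng.standard_normal((I, R1))
B = rng.standard_normal((J, R2))
C = rng.standard_normal((K, R3))
G = rng.standard_normal((R1, R2, R3))
X_hat = np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

# Rotate the first-mode loadings by an arbitrary angle ...
theta = 0.7
S = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A_rot = A @ S

# ... and absorb the inverse rotation into the core array
G_rot = np.einsum("pm,mqr->pqr", np.linalg.inv(S), G)

X_rot = np.einsum("pqr,ip,jq,kr->ijk", G_rot, A_rot, B, C)
print(np.allclose(X_hat, X_rot), np.allclose(A, A_rot))
```

Two different sets of loadings describe the data equally well, which is why Tucker3 loadings need constraints or a chosen rotation before they are interpreted.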

Page 10: Conclusions

• The Tucker3 model is good for
– general exploratory analysis
– multiway data which have modes of different rank

• Like PARAFAC, the Tucker3 model is estimated using ALS, with an extra step for the estimation of G.

• Like PCA, Tucker loadings have rotational freedom, making model interpretation more difficult than for PARAFAC. The use of constraints can help.

• Restricted Tucker3 models have been used for chemical calibration (more about this later...).
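The ALS estimation mentioned above can be sketched as follows. This is one common variant with orthonormal loadings and SVD-based updates, given here as an assumption; it is not necessarily the algorithm used in the course. The function names, the stopping rule and the test data are all hypothetical.

```python
import numpy as np

def unfold(X, mode):
    """Matricize a 3-way array so that `mode` indexes the rows."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def leading_left_vectors(M, r):
    """First r left singular vectors of M."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

def tucker3_als(X, ranks, n_iter=50, tol=1e-10):
    R1, R2, R3 = ranks
    # Initialise each loading matrix from the SVD of the matching unfolding
    A = leading_left_vectors(unfold(X, 0), R1)
    B = leading_left_vectors(unfold(X, 1), R2)
    C = leading_left_vectors(unfold(X, 2), R3)
    prev_fit = -np.inf
    for _ in range(n_iter):
        # Update one mode at a time with the other two held fixed
        A = leading_left_vectors(unfold(np.einsum("ijk,jq,kr->iqr", X, B, C), 0), R1)
        B = leading_left_vectors(unfold(np.einsum("ijk,ip,kr->jpr", X, A, C), 0), R2)
        C = leading_left_vectors(unfold(np.einsum("ijk,ip,jq->kpq", X, A, B), 0), R3)
        # Extra step: core array for the current loadings
        G = np.einsum("ijk,ip,jq,kr->pqr", X, A, B, C)
        fit = np.sum(G**2)          # explained sum of squares (orthonormal loadings)
        if fit - prev_fit < tol * np.sum(X**2):
            break
        prev_fit = fit
    return A, B, C, G

# Noise-free test data with 4, 2 and 3 components in the three modes
rng = np.random.default_rng(7)
A0 = rng.standard_normal((10, 4))
B0 = rng.standard_normal((8, 2))
C0 = rng.standard_normal((9, 3))
G0 = rng.standard_normal((4, 2, 3))
X = np.einsum("pqr,ip,jq,kr->ijk", G0, A0, B0, C0)

A, B, C, G = tucker3_als(X, (4, 2, 3))
X_hat = np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(rel_err)
```

The recovered loadings span the same subspaces as `A0`, `B0` and `C0` but are generally not equal to them, which is the rotational freedom noted above.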