Quimiometria Teórica e Aplicada, Instituto de Química - UNICAMP
1
3. The Tucker3 model
2
PARAFAC & Tucker3
• The PARAFAC model has a strict trilinear structure:
  $x_{ijk} = \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr} + e_{ijk}$
• Another generalization of PCA for multiway data is able to use a different number of components for each mode: the Tucker3 model.
[Figure: X = (first component term) + (second component term) + etc.]
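As a minimal sketch of the trilinear structure above, the systematic part of a PARAFAC model can be built with NumPy. The array sizes, the random loadings and the function name `parafac_reconstruct` are assumptions made for illustration only; they do not come from the slides.

```python
import numpy as np

def parafac_reconstruct(A, B, C):
    """Noise-free PARAFAC array: x_ijk = sum_r a_ir * b_jr * c_kr."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

# Made-up sizes and loadings, for illustration only.
rng = np.random.default_rng(0)
I, J, K, R = 4, 3, 5, 2
A = rng.normal(size=(I, R))
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))

X_hat = parafac_reconstruct(A, B, C)

# The same element written out as an explicit sum over the R components.
i, j, k = 1, 2, 3
x_ijk = sum(A[i, r] * B[j, r] * C[k, r] for r in range(R))
```

Every component r uses column r of all three loading matrices, which is why the number of components is the same in every mode.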
3
The Tucker3 model
e.g. Mode I has chemical rank 3, mode J has chemical rank 2, mode K has chemical rank 4.
[Figure: X (I × J × K) = A (I × 3), core array G (3 × 2 × 4), B^T with B (J × 2), and C^T with C (K × 4), plus E (I × J × K).]
4
The Tucker3 formula
$X = A G (C \otimes B)^T + E$
where X and E are matricized to size I × JK and $\otimes$ is the Kronecker product.
• Loadings
– A (I × R1) describes variation in the first mode
– B (J × R2) describes variation in the second mode
– C (K × R3) describes variation in the third mode
• Core array
– G (R1 × R2 × R3) is matricized into $G_{R_1 \times R_2 R_3}$ (R1 × R2R3)
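The matricized Tucker3 formula can be checked numerically. The sketch below assumes one particular unfolding convention (within each unfolded row, the second-mode index runs fastest), which is the ordering that matches the Kronecker product of C with B; the sizes, random values and helper names are illustrative assumptions.

```python
import numpy as np

def tucker3_reconstruct(A, B, C, G):
    """Element-wise Tucker3 model: x_ijk = sum_pqr g_pqr * a_ip * b_jq * c_kr."""
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

def matricize_mode1(T):
    """Unfold an (I, J, K) array to I x JK, with j running fastest within each k."""
    I, J, K = T.shape
    return T.transpose(0, 2, 1).reshape(I, K * J)

# Made-up sizes: a Tucker3 (3,2,4) model, as in the example slide.
rng = np.random.default_rng(0)
I, J, K = 5, 4, 6
R1, R2, R3 = 3, 2, 4
A = rng.normal(size=(I, R1))
B = rng.normal(size=(J, R2))
C = rng.normal(size=(K, R3))
G = rng.normal(size=(R1, R2, R3))

X_elementwise = matricize_mode1(tucker3_reconstruct(A, B, C, G))
X_matrix_form = A @ matricize_mode1(G) @ np.kron(C, B).T
```

Both routes give the same I × JK matrix, so the formula is just a compact way of writing the triple sum over the core elements.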
5
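What does the core array mean?
• The core array describes the significance of the interactions between the different loadings, e.g. the Tucker3 (2,2,2) model can be written as
$$
\begin{aligned}
X ={} & g_{111} a_1 (c_1 \otimes b_1)^T \\
 & + g_{112} a_1 (c_2 \otimes b_1)^T \\
 & + g_{121} a_1 (c_1 \otimes b_2)^T \\
 & + g_{122} a_1 (c_2 \otimes b_2)^T \\
 & + g_{211} a_2 (c_1 \otimes b_1)^T \\
 & + g_{212} a_2 (c_2 \otimes b_1)^T \\
 & + g_{221} a_2 (c_1 \otimes b_2)^T \\
 & + g_{222} a_2 (c_2 \otimes b_2)^T \\
 & + E
\end{aligned}
$$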
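[Figure: the 2 × 2 × 2 core array G drawn with axes R1, R2 and R3; its eight entries are 97, 0, 2.1, 26, 27, 3, 41 and 6.]
• g111 = 97: this triad is important
• g211 = 0: this interaction does not exist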
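The expansion of a Tucker3 (2,2,2) model into eight weighted triads can be sketched as follows. Only g111 = 97 and g211 = 0 are taken from the slide; the positions assigned to the other six core values, the loadings and the array sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K = 4, 3, 5
A = rng.normal(size=(I, 2))
B = rng.normal(size=(J, 2))
C = rng.normal(size=(K, 2))

# Core indexed as G[p, q, r]. G[0, 0, 0] is g111 and G[1, 0, 0] is g211;
# the placement of the remaining values is an assumption.
G = np.array([[[97.0, 2.1], [26.0, 27.0]],
              [[0.0, 3.0], [41.0, 6.0]]])

# One I x JK term per core element: g_pqr * a_p (c_r kron b_q)^T.
terms = {}
for p in range(2):
    for q in range(2):
        for r in range(2):
            terms[(p, q, r)] = G[p, q, r] * np.outer(A[:, p], np.kron(C[:, r], B[:, q]))

X_sum = sum(terms.values())

# The same model in one step, unfolded to I x JK with j fastest within k.
X_full = np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)
X_mat = X_full.transpose(0, 2, 1).reshape(I, K * J)
```

The term for g211 is identically zero, so a2, b1 and c1 do not interact in this model, whereas the g111 term carries by far the largest weight.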
6
How many components to use in each mode?
Unfold along each mode and look at the eigenvalues.
[Figure: three scree plots, "Eigenvalue vs. PC Number" (x-axis: PC Number, y-axis: Eigenvalue), one for each unfolding.]
• First mode, X (I × JK): ‘knee’ in the curve, select 4 PCs for the first mode
• Second mode, X (J × KI): select 2 PCs for the second mode
• Third mode, X (K × IJ): select 3 PCs for the third mode
Try Tucker3 (4,2,3)
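The unfold-and-inspect procedure can be sketched in a few lines. The example below uses synthetic, noise-free Tucker3 (4,2,3) data, so the eigenvalues drop to numerical zero after the true rank instead of showing a 'knee'; with real data the cut-off is a judgment call. Taking the eigenvalues as squared singular values of the raw unfolding (no centering or scaling), the tolerance and all sizes are assumptions.

```python
import numpy as np

def unfold(X, mode):
    """Unfold a three-way array so that `mode` indexes the rows."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_eigenvalues(X, mode):
    """Eigenvalues of the unfolding's cross-product matrix (squared singular values)."""
    s = np.linalg.svd(unfold(X, mode), compute_uv=False)
    return s ** 2

# Synthetic Tucker3 (4,2,3) data; sizes and values are made up.
rng = np.random.default_rng(0)
I, J, K = 10, 8, 9
A = rng.normal(size=(I, 4))
B = rng.normal(size=(J, 2))
C = rng.normal(size=(K, 3))
G = rng.normal(size=(4, 2, 3))
X = np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

# Count the eigenvalues that are not numerically zero in each mode.
ranks = []
for mode in range(3):
    eig = mode_eigenvalues(X, mode)
    ranks.append(int(np.sum(eig > 1e-10 * eig[0])))
```

Plotting `mode_eigenvalues(X, mode)` against the component number gives the scree plot for that mode.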
7
When to use PARAFAC or Tucker3?
Fluorescence data: X is 5 samples × 201 emission λ's × 61 excitation λ's
• rank = 3 in every mode
• Try 3-component PARAFAC or Tucker3 (3,3,3)
Process data: X is 35 batches × 12 sensors × 24 hours
• the rank differs between modes (rank = 4, rank = 3, rank = 2)
• Try Tucker3 (4,2,3)
8
PARAFAC as a restricted Tucker model
• PARAFAC is a type of Tucker model for which the core array is a superidentity, I.
[Figure: X (I × J × K) = A (I × R), core array I (R × R × R), B^T with B (J × R), and C^T with C (K × R), plus E (I × J × K). The superidentity is drawn with axes R1, R2 and R3: ones on the superdiagonal and zeros elsewhere.]
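The equivalence between PARAFAC and a Tucker model with a superidentity core can be verified numerically. This is a small sketch with made-up sizes and random loadings; the helper name `superidentity` is hypothetical.

```python
import numpy as np

def superidentity(R):
    """R x R x R array with ones on the superdiagonal and zeros elsewhere."""
    core = np.zeros((R, R, R))
    idx = np.arange(R)
    core[idx, idx, idx] = 1.0
    return core

rng = np.random.default_rng(0)
I, J, K, R = 4, 3, 5, 2
A = rng.normal(size=(I, R))
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))

# Tucker model with a superidentity core ...
X_tucker = np.einsum("pqr,ip,jq,kr->ijk", superidentity(R), A, B, C)
# ... and the PARAFAC model built from the same loadings.
X_parafac = np.einsum("ir,jr,kr->ijk", A, B, C)

core_entries = superidentity(2).ravel().tolist()
```

For R = 2 the eight core entries are two ones and six zeros, so only the triads (a1, b1, c1) and (a2, b2, c2) contribute.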
9
PARAFAC vs Tucker3
PARAFAC                                                  | Tucker3
Same number of components in each mode.                  | Can have different number of components in each mode.
Core array is superidentity, I.                          | Core array, G.
Solution is unique. Easy to interpret.                   | Rotational freedom. More difficult to interpret.
Strict, trilinear model. Good for some types of data.    | Multiway subspace model. Good for exploratory analysis.
Algorithm sometimes slow and problematic.                | Algorithm fast and robust.
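The rotational freedom of Tucker3 can be illustrated with a short sketch: rotating the first-mode loadings by any orthogonal matrix and counter-rotating the core leaves the fitted array unchanged. The sizes, random values and the choice of an orthogonal rotation obtained from a QR decomposition are assumptions for illustration.

```python
import numpy as np

def tucker3_reconstruct(A, B, C, G):
    """Element-wise Tucker3 model: x_ijk = sum_pqr g_pqr * a_ip * b_jq * c_kr."""
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

rng = np.random.default_rng(2)
I, J, K = 5, 4, 6
R1, R2, R3 = 3, 2, 2
A = rng.normal(size=(I, R1))
B = rng.normal(size=(J, R2))
C = rng.normal(size=(K, R3))
G = rng.normal(size=(R1, R2, R3))

# A random orthogonal R1 x R1 matrix.
Q, _ = np.linalg.qr(rng.normal(size=(R1, R1)))

A_rot = A @ Q                            # rotated first-mode loadings
G_rot = np.einsum("ps,pqr->sqr", Q, G)   # core counter-rotated by Q^T

X_original = tucker3_reconstruct(A, B, C, G)
X_rotated = tucker3_reconstruct(A_rot, B, C, G_rot)
```

The loadings differ but the model fit is identical, so the data alone cannot single out one set of loadings.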
10
Conclusions
• The Tucker3 model is good for
– general exploratory analysis
– multiway data which have modes of different rank
• Like PARAFAC, the Tucker3 model is estimated using ALS, with an extra step for the estimation of G.
• Like PCA, Tucker loadings have rotational freedom, making model interpretation more difficult than for PARAFAC. The use of constraints can help.
• Restricted Tucker3 models have been used for chemical calibration (more about this later...).
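The ALS estimation mentioned in the conclusions can be sketched as below. The slides do not say which variant is used, so this is one common choice and an assumption: loadings are kept orthonormal, each is updated as the leading left singular vectors of the data projected onto the other two modes, and the core G is then obtained by projecting X onto all three loading matrices. Function names, the fixed iteration count and the test data are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Unfold a three-way array so that `mode` indexes the rows."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def leading_vectors(M, r):
    """First r left singular vectors of M (orthonormal columns)."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

def tucker3_als(X, ranks, n_iter=20):
    R1, R2, R3 = ranks
    # Start from the leading singular vectors of each unfolding.
    A = leading_vectors(unfold(X, 0), R1)
    B = leading_vectors(unfold(X, 1), R2)
    C = leading_vectors(unfold(X, 2), R3)
    for _ in range(n_iter):
        # Update one loading matrix at a time with the other two fixed.
        A = leading_vectors(unfold(np.einsum("ijk,jq,kr->iqr", X, B, C), 0), R1)
        B = leading_vectors(unfold(np.einsum("ijk,ip,kr->pjr", X, A, C), 1), R2)
        C = leading_vectors(unfold(np.einsum("ijk,ip,jq->pqk", X, A, B), 2), R3)
        # Extra step: with orthonormal loadings, the least-squares core
        # is the projection of X onto A, B and C.
        G = np.einsum("ijk,ip,jq,kr->pqr", X, A, B, C)
    return A, B, C, G

# Noise-free synthetic Tucker3 (3,2,4) data; all values are made up.
rng = np.random.default_rng(0)
X = np.einsum("pqr,ip,jq,kr->ijk",
              rng.normal(size=(3, 2, 4)),
              rng.normal(size=(6, 3)),
              rng.normal(size=(5, 2)),
              rng.normal(size=(7, 4)))

A, B, C, G = tucker3_als(X, ranks=(3, 2, 4))
X_fit = np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)
```

A practical implementation would stop on a convergence criterion, such as the change in residual sum of squares, instead of a fixed number of iterations.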