TRANSCRIPT
Tensors and graphical models
Mariya Ishteva with Haesun Park, Le Song
Dept. ELEC, VUB Georgia Tech, USA
INMA Seminar, May 7, 2013, LLN
Outline
Tensors
Random variables and graphical models
Tractable representations
Structure learning
Tensors
A third-order tensor: an element of R^{M×N×P}
Ranks
• Multilinear rank: (R1, R2, R3)
• Rank-1 tensor: outer product of three vectors
• Rank R:

  R = min r, s.t. A = ∑_{i=1}^{r} {rank-1 tensor}_i
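As a sketch of these definitions (numpy is assumed here; the slides show no code), one can build a rank-1 tensor as an outer product of three vectors, and a rank-r tensor as a sum of r such terms:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, P, r = 4, 5, 6, 2

# A rank-1 tensor is the outer product of three vectors.
a, b, c = rng.standard_normal(M), rng.standard_normal(N), rng.standard_normal(P)
rank1 = np.einsum('i,j,k->ijk', a, b, c)

# A rank-r tensor: sum of r rank-1 terms (the CP construction).
A = sum(np.einsum('i,j,k->ijk',
                  rng.standard_normal(M),
                  rng.standard_normal(N),
                  rng.standard_normal(P))
        for _ in range(r))

# Every matrix unfolding of a rank-r tensor has matrix rank <= r.
print(np.linalg.matrix_rank(A.reshape(M, N * P)))  # at most r
```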
Matrix representations of tensors
• Mode-1 unfolding: A(1)
• Mode-2 unfolding: A(2)
• Mode-3 unfolding: A(3)
• Multilinear rank: (rank(A(1)), rank(A(2)), rank(A(3)))
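A small numpy sketch of the unfoldings and the multilinear rank (an illustration, not from the slides; the unfolding below uses one common ordering convention, which affects the column order but not the rank):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: the mode-n fibers become the columns of a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def multilinear_rank(T):
    """(rank(A(1)), rank(A(2)), rank(A(3))) for a 3rd-order tensor."""
    return tuple(int(np.linalg.matrix_rank(unfold(T, m))) for m in range(T.ndim))

# A Tucker-form tensor with a (2, 3, 2) core has multilinear rank (2, 3, 2)
# for generic (random) core and factor matrices.
rng = np.random.default_rng(0)
G = rng.standard_normal((2, 3, 2))
U1, U2, U3 = (rng.standard_normal((s, r)) for s, r in [(5, 2), (6, 3), (7, 2)])
A = np.einsum('abc,ia,jb,kc->ijk', G, U1, U2, U3)
print(multilinear_rank(A))
```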
Tensor-matrix multiplication
• Tensor-matrix product
• Contraction: for A ∈ R^{I×J×M} and B ∈ R^{K×L×M},

  C = ⟨A, B⟩_3,   C(i, j, k, l) = ∑_{m=1}^{M} a_{ijm} b_{klm}

  giving a 4th-order tensor C ∈ R^{I×J×K×L}
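In numpy (an assumption, since the slides show no code), this contraction over the shared third mode is a one-line einsum:

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, L, M = 2, 3, 4, 5, 6
A = rng.standard_normal((I, J, M))
B = rng.standard_normal((K, L, M))

# C(i,j,k,l) = sum_m A(i,j,m) B(k,l,m): contract the shared mode m.
C = np.einsum('ijm,klm->ijkl', A, B)
print(C.shape)  # (2, 3, 4, 5)
```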
Basic decompositions
Singular value decomposition (SVD)
MLSVD / HOSVD
CP / CANDECOMP / PARAFAC
Outline
Tensors
Random variables and graphical models
Tractable representations
Structure learning
Discrete random variables
• Random variable X with states 1, . . . , n and probabilities Px(1), . . . , Px(n);
  the vector Px ∈ R^n has nonnegative entries in [0, 1] summing to 1
• Two variables X1, X2 with joint P(X1, X2): a matrix P12 ∈ R^{n×n},
  with rows indexed by X1 and columns by X2
• Notation: P(x1, x2) := P(X1 = x1, X2 = x2)
2 random variables
• X1, X2 with joint P(X1, X2), stored as P12 ∈ R^{n×n}
• If X1 ⊥ X2: P(x1, x2) = P(x1)P(x2), so P12 is a rank-1 matrix
• If X1 and X2 are linked through a hidden variable H:

  P(x1, x2) = ∑_h P(x1|h) P(x2|h) P(h)

  so P12 is a low-rank matrix (rank-k, k < n)
• Conditional probability tables (CPTs): P(X1|H), P(X2|H)
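To see the low-rank structure numerically, a small sketch (numpy assumed; the sizes n and k are made up for illustration) that builds P12 from random CPTs:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 10, 3  # n observed states, k hidden states

def random_stochastic(rows, cols):
    """Column h is a CPT P(X = . | H = h): nonnegative, summing to 1."""
    P = rng.random((rows, cols))
    return P / P.sum(axis=0)

P1_H = random_stochastic(n, k)       # P(X1 | H)
P2_H = random_stochastic(n, k)       # P(X2 | H)
Ph = rng.random(k); Ph /= Ph.sum()   # P(H)

# P(x1, x2) = sum_h P(x1|h) P(x2|h) P(h)  =>  rank(P12) <= k
P12 = P1_H @ np.diag(Ph) @ P2_H.T
print(np.linalg.matrix_rank(P12))  # at most k
```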
3 random variables
• X1, X2, X3 with joint P(X1, X2, X3), stored as P123 ∈ R^{n×n×n}
• If X1, X2, X3 are independent: P(x1, x2, x3) = P(x1)P(x2)P(x3), a rank-1 tensor
• If they are linked through a hidden variable H:

  P(x1, x2, x3) = ∑_h P(x1|h) P(x2|h) P(x3|h) P(h)

  a rank-k tensor, k < n
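The same construction one order higher: the hidden-variable factorization is exactly a CP decomposition of the probability tensor with k rank-1 terms (a numpy sketch with illustrative sizes, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 10, 3  # n observed states, k hidden states

def random_stochastic(rows, cols):
    """Column h is a CPT P(X = . | H = h): nonnegative, summing to 1."""
    P = rng.random((rows, cols))
    return P / P.sum(axis=0)

P1_H, P2_H, P3_H = (random_stochastic(n, k) for _ in range(3))
Ph = rng.random(k); Ph /= Ph.sum()  # P(H)

# P(x1,x2,x3) = sum_h P(x1|h) P(x2|h) P(x3|h) P(h): a sum of k rank-1
# terms, so every unfolding of P123 has rank at most k.
P123 = np.einsum('ah,bh,ch,h->abc', P1_H, P2_H, P3_H, Ph)
```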
4 random variables
• X1, X2, X3, X4 with joint P(X1, X2, X3, X4), stored as P1234 ∈ R^{n×n×n×n}
• If X1, X2, X3, X4 are independent: P(x1, x2, x3, x4) = P(x1)P(x2)P(x3)P(x4)
• With one hidden variable H:

  P(x1, x2, x3, x4) = ∑_h P(x1|h) P(x2|h) P(x3|h) P(x4|h) P(h)

• The same idea extends to more variables and more hidden variables
Challenges
• 10 variables, 10 states each −→ 10^10 entries
• We need tractable representations
  • Latent variable models / low-rank factors
  • # parameters: exponential −→ polynomial
[diagram: latent tree with hidden variables over the observed X1, . . .]
• Challenges:
  • Choose a good representation ✓
  • Learn the correct structure ✓
  • Estimate the parameters ✗
Outline
Tensors
Random variables and graphical models
Tractable representations
Structure learning
Tensors and graphical models
• CP / CANDECOMP / PARAFAC ←→ one hidden variable H connected to X1, . . . , Xn
• Tensor train ←→ HMM: hidden chain H1, H2, H3, . . . , Hn emitting X1, X2, X3, . . . , Xn
• Hierarchical Tucker ←→ latent tree model
• Tucker / MLSVD, block term decomposition ←→ × (no direct graphical-model counterpart)
Tensor train (TT) decomposition
A(i1, . . . , id) = ∑_{α0,...,αd} G1(α0, i1, α1) G2(α1, i2, α2) · · · Gd(αd−1, id, αd)

[I. V. Oseledets, SIAM J. Scientific Computing, 2011]

• Avoids the curse of dimensionality
• Small number of parameters, compared to the Tucker model
• Slightly more parameters than CP, but more stable
• Gk(αk−1, ik, αk) has dimensions rk−1 × nk × rk, with r0 = rd = 1
• The rk are called compression ranks:

  Ak = Ak(i1, . . . , ik ; ik+1, . . . , id),   rank(Ak) = rk

• Computation based on the SVD, proceeding top → bottom
• Graphical-model counterpart: HMM with hidden chain H1, H2, H3, . . . , Hn and observations X1, X2, X3, . . . , Xn
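To make the chain structure concrete, here is a small sketch (numpy assumed; the core sizes are made up for illustration) that contracts TT cores Gk of size rk−1 × nk × rk back into the full tensor:

```python
import numpy as np

rng = np.random.default_rng(0)

# TT cores G_k of shape (r_{k-1}, n_k, r_k), with boundary ranks r_0 = r_d = 1.
shapes = [(1, 4, 3), (3, 5, 2), (2, 6, 1)]
cores = [rng.standard_normal(s) for s in shapes]

def tt_to_full(cores):
    """Contract the chain of TT cores into the full tensor."""
    T = cores[0]                       # shape (1, n1, r1)
    for G in cores[1:]:
        # Sum over the shared rank index between consecutive cores.
        T = np.einsum('...a,aib->...ib', T, G)
    return T.squeeze(axis=(0, -1))     # drop the boundary ranks r0 = rd = 1

A = tt_to_full(cores)
print(A.shape)  # (4, 5, 6)
```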
Hierarchical Tucker decomposition
[L. Grasedyck, SIMAX, 2010]
• Similar properties to the TT decomposition
• Computation: bottom → top
• Graphical-model counterpart: latent tree model
Potential advantages of tensor approach
• Real data are often multi-way
• Provides higher-level view
• Flexibility: different ranks in each mode: Tucker
• Uniqueness: CP, Block term decomposition
• No curse of dimensionality: Tensor train, hierarch. Tucker
Outline
Tensors
Random variables and graphical models
Tractable representations
Structure learning
Structure learning
• Given: (samples of) observed variables
• Assumption: the variables can be connected via hidden variables in a tree structure in a meaningful way
• Find: the tree / the relationships between the variables
• Additional difficulty: unknown number of hidden states
[diagram: samples of observed variables X1, . . . , X5 −→ unknown latent tree ("?") over them]
Quartet relationships: topologies
The three possible topologies over hidden variables H and G:

• {X1, X2} — H — G — {X3, X4}
• {X1, X3} — H — G — {X2, X4}
• {X1, X4} — H — G — {X2, X3}

For the first topology:

P(x1, x2, x3, x4) = ∑_{h,g} P(x1|h) P(x2|h) P(h, g) P(x3|g) P(x4|g)
Building trees based on quartet relationships
Choose 3 variables and form a tree
Add all other variables, one by one
• Split the current tree into 3 subtrees
• Choose 3 variables from different subtrees
• Resolve the quartet relation with current and chosen variables
• Insert the current variable in a subtree or connect it to the tree
[For simplicity, assume each latent variable has 3 neighbors]
Tensor view of quartets
For the topology {X1, X2} — H — G — {X3, X4}:

P(X1, X2, X3, X4) = [tensor network: P1|H and P2|H attached to I_H, the matrix PHG linking H to G, and P3|G and P4|G attached to I_G]
% n2 = n^2: unfold the 4th-order tensor P into three n2-by-n2 matrices
A = reshape(P, n2, n2);                      % rows (x1,x2), columns (x3,x4)
B = reshape(permute(P, [1,3,2,4]), n2, n2);  % rows (x1,x3), columns (x2,x4)
C = reshape(permute(P, [1,4,2,3]), n2, n2);  % rows (x1,x4), columns (x2,x3)
Notation: P1|H , P2|H , etc. stand for P(X1|H), P(X2|H), etc.
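For readers working in Python rather than MATLAB, an equivalent sketch with numpy (an assumption; the slides only give the MATLAB version). order='F' reproduces MATLAB's column-major reshape, so these match the MATLAB matrices entry for entry:

```python
import numpy as np

def unfoldings(P):
    """The three n^2 x n^2 unfoldings of a 4th-order tensor P of shape (n,n,n,n)."""
    n2 = P.shape[0] * P.shape[1]
    A = P.reshape(n2, n2, order='F')                        # (x1,x2) vs (x3,x4)
    B = P.transpose(0, 2, 1, 3).reshape(n2, n2, order='F')  # (x1,x3) vs (x2,x4)
    C = P.transpose(0, 3, 1, 2).reshape(n2, n2, order='F')  # (x1,x4) vs (x2,x3)
    return A, B, C
```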
Rank properties of matrix representations
A = (P2|H ⊙ P1|H) PHG (P4|G ⊙ P3|G)^⊤

B = (P3|G ⊗ P1|H) diag(PHG(:)) (P4|G ⊗ P2|H)^⊤

(⊙: Khatri–Rao, i.e. column-wise Kronecker, product; ⊗: Kronecker product)

• rank(A) = rank(PHG) = k;   rank(B) = rank(C) = nnz(PHG)

  rank(A) ≪ rank(B) = rank(C)

• Sampling noise −→ nuclear norm relaxation:

  ‖A‖∗ = ∑_{i=1}^{n^2} σi(A)
Resolving quartet relations
Algorithm 1 i∗ = Quartet(X1, X2, X3, X4)
1: Estimate P(X1, X2, X3, X4) from a set of m i.i.d. samples.
2: Unfold P into matrices A, B and C, and compute
   a1 = ‖A‖∗, a2 = ‖B‖∗ and a3 = ‖C‖∗.
3: Return i∗ = arg min_{i∈{1,2,3}} ai.
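A minimal numpy sketch of Algorithm 1 (assuming the joint probability table P has already been estimated from the samples; order='F' mirrors the column-major MATLAB reshapes used earlier):

```python
import numpy as np

def quartet(P):
    """Pick the pairing whose unfolding has the smallest nuclear norm.

    P: estimated joint probability tensor of shape (n, n, n, n).
    Returns i* in {1, 2, 3}, meaning {X1,X2 | X3,X4}, {X1,X3 | X2,X4},
    {X1,X4 | X2,X3} respectively.
    """
    n2 = P.shape[0] * P.shape[1]
    A = P.reshape(n2, n2, order='F')                        # (x1,x2) vs (x3,x4)
    B = P.transpose(0, 2, 1, 3).reshape(n2, n2, order='F')  # (x1,x3) vs (x2,x4)
    C = P.transpose(0, 3, 1, 2).reshape(n2, n2, order='F')  # (x1,x4) vs (x2,x3)
    norms = [np.linalg.norm(M, 'nuc') for M in (A, B, C)]   # a1, a2, a3
    return int(np.argmin(norms)) + 1
```

For a joint that exactly factors over the first pairing, the matching unfolding has rank 1, so its nuclear norm equals the Frobenius norm, the smallest any unfolding can achieve; the other unfoldings are generically of higher rank and have strictly larger nuclear norms.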
• Easy to compute
• Recovery conditions
• Finite sample guarantees
• Agnostic to the number of hidden states
• Compares favorably to alternatives
Example: stock data
Given: stock prices (25 years, discretized into 10 values)
Find: relations between stocks
Finance:
• C (Citigroup)
• JPM (JPMorgan Chase)
• AXP (American Express)
• F (Ford Motor: Automotive and Financial Services)

Retailers:
• TGT (Target)
• WMT (WalMart)
• RSH (RadioShack)
Conclusions
• Tensor decompositions are related to graphical models
• A common goal: tractable representations
• Tensors can be used for structure learning