On the extension of trace norm to tensors Ryota Tomioka 1 , Kohei Hayashi 2 , Hisashi Kashima 1 1 The University of Tokyo 2 Nara Institute of Science and Technology 2010/12/10 NIPS2010 Workshop: Tensors, Kernels, and Machine Learning


Page 1: On the extension of trace norm to tensors (ttic.uchicago.edu/~ryotat/talks/trace10.pdf)

On the extension of trace norm to tensors

Ryota Tomioka1, Kohei Hayashi2, Hisashi Kashima1

1 The University of Tokyo, 2 Nara Institute of Science and Technology

2010/12/10

NIPS2010 Workshop: Tensors, Kernels, and Machine Learning

Page 2:

Convex low-rank tensor completion

[Figure: a Sensors × Time × Features data tensor and its Tucker decomposition: an I1 × I2 × I3 tensor factored into a core C of size r1 × r2 × r3 (the interactions) and factor matrices of sizes I1 × r1, I2 × r2, I3 × r3 (the loadings).]

Page 3:

Conventional formulation (nonconvex)

With Ω the set of observed entries and ×k the mode-k product, the conventional formulation is

$$\min_{C,\, U_1, U_2, U_3} \|\Omega \circ (\mathcal{Y} - C \times_1 U_1 \times_2 U_2 \times_3 U_3)\|_F^2 + \text{regularization},$$

or equivalently

$$\min_{\mathcal{X}} \|\Omega \circ (\mathcal{Y} - \mathcal{X})\|_F^2 \quad \text{s.t.} \quad \operatorname{rank}(\mathcal{X}) \le (r_1, r_2, r_3).$$

• Alternating minimization.
• Have to fix the rank beforehand.

Page 4:

Our approach

Matrix case: estimation of a low-rank matrix (hard) → trace norm minimization (tractable) [Fazel, Hindi, Boyd 01].

Tensor case (our generalization): estimation of a low-rank tensor (hard), with rank defined in the sense of the Tucker decomposition → extended trace norm minimization (tractable).

Page 5:

Trace norm regularization (for matrices)

For a matrix $X \in \mathbb{R}^{R \times C}$ with $m = \min(R, C)$, the trace norm is the linear sum of the singular values:

$$\|X\|_{\mathrm{tr}} = \sum_{j=1}^{m} \sigma_j(X)$$

• Roughly speaking, L1 regularization on the singular values.
• Stronger regularization → more zero singular values → low rank.
• Not obvious for tensors (there are no singular values for tensors).
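The trace norm defined above is easy to compute numerically. A minimal NumPy sketch (illustrative, not code from the talk):

```python
import numpy as np

def trace_norm(X):
    """Trace (nuclear) norm: the sum of the singular values of a matrix."""
    return np.linalg.svd(X, compute_uv=False).sum()

# A rank-2 matrix has only 2 nonzero singular values,
# so its trace norm equals the sum of the top two.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))
s = np.linalg.svd(X, compute_uv=False)
assert abs(trace_norm(X) - s[:2].sum()) < 1e-8
```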

Page 6:

Mode-k unfolding (matricization)

[Figure: an I1 × I2 × I3 tensor sliced and laid out side by side. Mode-1 unfolding gives the I1 × (I2·I3) matrix X(1); mode-2 unfolding gives the I2 × (I3·I1) matrix X(2).]
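Mode-k unfolding can be sketched in a few lines of NumPy. Note this is an assumption about conventions: the ordering of the columns (which of the remaining modes varies fastest) differs across the literature, and this sketch simply picks NumPy's default:

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding: move axis k to the front, flatten the rest.
    For X of shape (I1, ..., IK), returns an Ik x (product of the
    other dimensions) matrix."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

X = np.arange(24).reshape(2, 3, 4)   # I1=2, I2=3, I3=4
assert unfold(X, 0).shape == (2, 12)
assert unfold(X, 1).shape == (3, 8)
assert unfold(X, 2).shape == (4, 6)
```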

Page 7:

Elementary facts about Tucker decomposition

$$\mathcal{X} = C \times_1 U_1 \times_2 U_2 \times_3 U_3,$$

with core $C \in \mathbb{R}^{r_1 \times r_2 \times r_3}$ and factors $U_1 \in \mathbb{R}^{I_1 \times r_1}$, $U_2 \in \mathbb{R}^{I_2 \times r_2}$, $U_3 \in \mathbb{R}^{I_3 \times r_3}$. The unfoldings factorize as

$$X_{(1)} = U_1 C_{(1)} (U_3 \otimes U_2)^\top \qquad (\operatorname{rank} \le r_1),$$
$$X_{(2)} = U_2 C_{(2)} (U_1 \otimes U_3)^\top \qquad (\operatorname{rank} \le r_2),$$
$$X_{(3)} = U_3 C_{(3)} (U_2 \otimes U_1)^\top \qquad (\operatorname{rank} \le r_3).$$

The rank of X(k) is no more than the rank of C(k)
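The rank bound on the unfoldings can be checked numerically. This is an illustrative sketch; `mode_product` is a helper defined here for the example, not code from the talk:

```python
import numpy as np

def mode_product(C, U, k):
    """Mode-k product C x_k U: contract U's columns with axis k of C."""
    return np.moveaxis(np.tensordot(U, C, axes=(1, k)), 0, k)

rng = np.random.default_rng(1)
r, I = (2, 3, 4), (10, 11, 12)
C = rng.standard_normal(r)                                  # core
U = [rng.standard_normal((I[k], r[k])) for k in range(3)]   # factors

X = C
for k in range(3):            # X = C x_1 U1 x_2 U2 x_3 U3
    X = mode_product(X, U[k], k)

# The rank of each mode-k unfolding is bounded by the Tucker rank r_k.
for k in range(3):
    Xk = np.moveaxis(X, k, 0).reshape(I[k], -1)
    assert np.linalg.matrix_rank(Xk) <= r[k]
```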

Page 8:

What it means

• We can use the trace norm of an unfolding of a 

tensor X to learn low‐rank X.

Tensor X is low-rank∃k, rk<Ik

Unfolding Xk is a low-rank

matrix

Matricization

Tensorization

Page 9:

Approach 1: As a matrix

• Pick a mode k, and hope that the tensor to be learned is low-rank in mode k.

$$\min_{\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_K}} \frac{1}{2\lambda} \|\Omega \circ (\mathcal{Y} - \mathcal{X})\|_F^2 + \|X_{(k)}\|_*$$

Pro: basically a matrix problem → theoretical guarantee (Candès & Recht 09).
Con: you have to be lucky to pick the right mode.
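Trace-norm-regularized problems like this one are commonly solved with proximal methods whose key step is soft-thresholding of the singular values (singular value thresholding). The talk's actual solver is described in the technical report; the sketch below shows only that generic proximal step, not the authors' algorithm:

```python
import numpy as np

def svt(X, tau):
    """Proximal operator of tau * trace norm:
    soft-threshold the singular values of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(2)
X = rng.standard_normal((6, 5))
Y = svt(X, 0.5)

# Each singular value is shrunk by 0.5 and clipped at zero,
# so small singular values vanish and the result is low(er) rank.
s_in = np.linalg.svd(X, compute_uv=False)
s_out = np.linalg.svd(Y, compute_uv=False)
assert np.allclose(s_out, np.maximum(s_in - 0.5, 0.0))
```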

Page 10:

Approach 2: Constrained optimization

• Constrain X so that each of its unfoldings is simultaneously low-rank.

$$\min_{\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_K}} \frac{1}{2\lambda} \|\Omega \circ (\mathcal{Y} - \mathcal{X})\|_F^2 + \sum_{k=1}^{K} \gamma_k \|X_{(k)}\|_*$$

γk: tuning parameter, usually set to 1.

Pro: jointly regularizes every mode.
Con: a strong constraint.

See also Marco Signoretto et al., 2010.

Page 11:

Approach 3: Mixture of low-rank tensors

• Each mixture component Zk is regularized to be 

low‐rank only in mode‐k.

minimizeZ1,...,ZK

12!

!!!!!! !"Y "

#K

k=1Zk

$!!!!!

2

F

+K#

k=1

"k#Zk(k)#!,

Pro: Each Zk takes care of each modeCon: Sum is not low-rank

Page 12:

Numerical experiment

• True tensor: Size 50x50x20, rank 7x8x9. No noise (λ=0).

• Random train/test split.

[Figure: generalization error (log scale, 10⁻⁴ to 10²) vs. fraction of observed elements (0 to 1). Curves: As a Matrix (mode 1), As a Matrix (mode 2), As a Matrix (mode 3), Constraint, Mixture, Tucker (large), Tucker (exact), and the optimization tolerance. Tucker = EM algorithm (nonconvex).]

Page 13:

Computation time

• The convex formulation is also fast.

[Figure: computation time (s), 0 to 50, vs. fraction of observed elements (0 to 1). Curves: As a Matrix, Constraint, Mixture, Tucker (large), Tucker (exact).]

Page 14:

Phase transition behaviour

[Figure: fraction of observed elements required for perfect reconstruction (0 to 1) vs. sum of true ranks (0 to 80); each point is labeled with its true rank triple, e.g. [7 8 9], [10 12 13], [20 3 2], [40 20 6].]

• Sum of true ranks = min(r1, r2·r3) + min(r2, r3·r1) + min(r3, r1·r2)
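The x-axis quantity is simple arithmetic; for the rank-(7, 8, 9) tensor from the experiment it comes out to 24:

```python
# "Sum of true ranks", the x-axis of the phase-transition plot.
def sum_of_true_ranks(r1, r2, r3):
    return min(r1, r2 * r3) + min(r2, r3 * r1) + min(r3, r1 * r2)

assert sum_of_true_ranks(7, 8, 9) == 24    # the experiment's true rank
assert sum_of_true_ranks(20, 3, 2) == 11   # min(20,6) + min(3,40) + min(2,60)
```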

Page 15:

Summary

• Low-rank tensor completion can be computed as a convex optimization problem using the trace norm of the unfoldings.
  - No need to specify the rank beforehand.
• The convex formulation is more accurate and faster than the conventional EM-based Tucker decomposition.
• A curious "phase transition" was found → a compressive-sensing-type analysis is ongoing work.
• Technical report: arXiv:1010.0789 (including the optimization).
• Code:
  - http://www.ibis.t.u-tokyo.ac.jp/RyotaTomioka/Softwares/Tensor

Page 16:

Acknowledgment

• This work was supported in part by MEXT KAKENHI 22700138, 22700289, and NTT Communication Science Laboratories.