Three right directions and three wrong directions for tensor research

Michael W. Mahoney

Stanford University

(For more info, see: http://cs.stanford.edu/people/mmahoney/ or Google on “Michael Mahoney”)

Post on 11-Aug-2020

Lots and lots of large data!

• High energy physics experimental data

• Hyperspectral medical and astronomical image data

• DNA microarray data and DNA SNP data

• Medical literature analysis data

• Collaboration and citation networks

• Internet networks and web graph data

• Advertiser-bidded phrase data

• Static and dynamic social network data

“Scientific” and “Internet” data

[Figure: genotype matrix with individuals as rows and SNPs as columns]

… AG AG AG AG AA CC GG AG CG AC CC AA CC AA GG TT AG CT CG CG CG AT CT CT AG CT …

… AA AG AG AG AA CC AG GG CC AC CC AA CG AA GG TT AG CT CG CG CG AT CT CT AG CT …

… AA GG GG GG AA CT GG AA CC AC CG AA CC AA GG TT GG CC CG CG CG AT CT CT AG CT …

… AG AG AG AG AA CT GG AG CC CC CG AA CC AA GT TT AG CT CG CG CG AT CT CT AG CT …

… AA AG AG AG AA CC AG AG CG AA CC AA CG AA GG TT AA TT GG GG GG TT TT CC GG TT …

Algorithmic vs. Statistical Perspectives

Computer Scientists

• Data: are a record of everything that happened.

• Goal: process the data to find interesting patterns and associations.

• Methodology: Develop approximation algorithms under different models of data access since the goal is typically computationally hard.

Statisticians

• Data: are a particular random instantiation of an underlying process describing unobserved patterns in the world.

• Goal: is to extract information about the world from noisy data.

• Methodology: Make inferences (perhaps about unseen events) by positing a model that describes the random variability of the data around the deterministic model.

Lambert (2000)

Matrices and Data

Matrices provide simple representations of data:

• Aij = 0 or 1 (perhaps then weighted), depending on whether word i appears in document j

• Aij = -1, 0, +1, if homozygous for the major allele, heterozygous, or homozygous for the minor allele

Can take advantage of “nice” properties of vector spaces:

• structural properties: SVD, Euclidean geometry

• algorithmic properties: “everything” is O(n^3)

• statistical properties: PCA, regularization, etc.

[Figure: genotype matrix with individuals as rows and SNPs as columns; entries are genotypes such as AG, AA, CC.]
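The -1/0/+1 genotype encoding described on this slide can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the function name `encode_genotypes`, the toy genotypes, and the rule for picking the major allele (the most frequent allele in each column) are all assumptions made for the example.

```python
from collections import Counter

import numpy as np


def encode_genotypes(rows):
    """Encode an individuals x SNPs table of two-letter genotypes as -1/0/+1.

    -1: homozygous for the major allele, 0: heterozygous,
    +1: homozygous for the minor allele.
    Assumption: the major allele is the most frequent allele in the column.
    """
    n, m = len(rows), len(rows[0])
    A = np.zeros((n, m), dtype=int)
    for j in range(m):
        counts = Counter("".join(row[j] for row in rows))
        major = counts.most_common(1)[0][0]
        for i, row in enumerate(rows):
            minor_copies = sum(1 for allele in row[j] if allele != major)
            A[i, j] = minor_copies - 1
    return A


# Hypothetical toy data: 4 individuals x 3 SNPs.
genotypes = [
    ["AG", "AA", "CC"],
    ["AA", "AA", "CG"],
    ["AA", "AG", "CC"],
    ["GG", "AA", "CC"],
]
A = encode_genotypes(genotypes)

# Once the data are a matrix, vector-space tools such as the SVD apply directly.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
```

The point of the sketch is only that, after encoding, the usual structural tools (SVD, PCA) are available at no extra modeling cost.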

Graphs and Data

Interaction graph model of networks:

• Nodes represent entities

• Edges represent interaction between pairs of entities

Common variations include:

• Directed graphs

• Weighted graphs

• Bipartite graphs

Why model data as graphs and matrices?

Graphs and matrices -

• provide natural mathematical structures that provide algorithmic, statistical, and geometric benefits

• provide nice tradeoff between rich descriptive framework and sufficient algorithmic structure

• provide regularization due to geometry, either explicitly due to R^n or implicitly due to approximation algorithms

What if graphs/matrices don’t work?

Employ more general mathematical structures:

• Hypergraphs

• Attributes associated with nodes

• “Kernelize” the data using, e.g., a similarity notion

• Generalized linear or hierarchical models

• Tensors!!

These structures provide greater descriptive flexibility that typically comes at a (moderate or severe) computational cost.

What is a tensor? (1 of 3)

See L.H. Lim’s tutorial on tensors at MMDS 2006.

What is a tensor? (2 of 3)

What is a tensor? (3 of 3)

IMPORTANT: This is similar to NLA --- but, there is no reason to expect the “subscript manipulation” methods, so useful in NLA, to yield anything meaningful for more general algebraic structures.

Tensor ranks and data analysis (1 of 3)

Tensor ranks and data analysis (2 of 3)

IMPORTANT: These ill-posedness results are NOT pathological---they are ubiquitous and essential properties of tensors.

Tensor ranks and data analysis (3 of 3)

THAT IS: To get a “simple” or “low-rank” tensor approximation, we focus on exceptions to fundamental ill-posedness properties of tensors (i.e., rank-1 tensors and 2-mode tensors).
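The equations from these slides did not survive in the transcript, so the following is only a sketch of one standard example of the ill-posedness being discussed, not necessarily the one shown in the talk. The tensor T = a⊗a⊗b + a⊗b⊗a + b⊗a⊗a has rank 3, yet it is the limit of the rank-2 tensors n(a + b/n)⊗(a + b/n)⊗(a + b/n) − n a⊗a⊗a, so T has no best rank-2 approximation. The helper names `outer3` and `rank2_approx` are assumptions made for the example.

```python
import numpy as np


def outer3(x, y, z):
    """Rank-1 three-mode tensor x (outer) y (outer) z."""
    return np.einsum("i,j,k->ijk", x, y, z)


a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

# A rank-3 tensor that lies on the boundary of the set of rank-2 tensors.
T = outer3(a, a, b) + outer3(a, b, a) + outer3(b, a, a)


def rank2_approx(n):
    """Sum of two rank-1 terms whose distance to T shrinks as n grows."""
    c = a + b / n
    return n * outer3(c, c, c) - n * outer3(a, a, a)


# The approximation error decreases toward 0 while the two rank-1
# terms themselves blow up in norm (each is of size about n).
errors = [np.linalg.norm(T - rank2_approx(n)) for n in (1, 10, 100, 1000)]
```

The error can be made arbitrarily small but never zero, which is why "find the best rank-2 approximation" has no solution for this T; nothing analogous happens for matrices.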

Historical Perspective on NLA

• NLA grew out of statistics (among other areas) (40s and 50s)

• NLA focuses on numerical issues (60s, 70s, and 80s)

• Large-scale data generation increasingly common (90s and 00s)

• NLA has suffered due to the success of PageRank and HITS.

• Large-scale scientific and Internet data problems invite us to take a broader perspective on traditional NLA:

revisit algorithmic basis of common NLA matrix algorithms

revisit statistical underpinnings of NLA

expand traditional NLA view of tensors

The gap between NLA and TCS

Matrix factorizations:

• in NLA and scientific computing - used to express a problem s.t. it can be solved more easily.

• in TCS and statistical data analysis - used to represent structure that may be present in a matrix obtained from object-feature observations.

MMDS06, MMDS08, … were designed to “bridge the gap” between NLA, TCS, and data applications.

NLA:

• emphasis on optimal conditioning,

• backward error analysis issues,

• is running time a large or small constant multiplied by n^2 or n^3.

TCS:

• motivated by large data applications

• space-constrained or pass-efficient models

• over-sampling and randomness as computational resources.

How to “bridge the gap” (Lessons from MMDS)

• In a vector space, “everything is easy,”

multi-linear captures the inherent intractability of NP-hardness.

• Convexity is an appropriate generalization of linear

nice algorithmic framework, as with kernels in Machine Learning

• Randomness, over-sampling, approximation ... are powerful algorithmic resources

but you need to have a clear objective you are solving

• Geometry of combinatorial objects (e.g., graphs, tensors, etc.) has positive algorithmic, statistical, and conceptual benefits

• Approximate computation induces implicit statistical regularization
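As one illustration of randomness and over-sampling used as algorithmic resources, here is a minimal sketch of a randomized low-rank approximation based on a Gaussian random projection. It is a generic textbook-style scheme, not a method taken from the talk; the function name `randomized_low_rank`, the `oversample` parameter, and the test matrix are assumptions made for the example.

```python
import numpy as np


def randomized_low_rank(A, k, oversample=5, seed=0):
    """Approximate rank-k SVD of A via a random projection.

    Sketch the range of A with k + oversample random directions,
    orthonormalize the sketch, then solve a small SVD problem.
    """
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)   # orthonormal basis for the sketched range
    B = Q.T @ A                      # small (k + oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]


# Hypothetical test matrix of exact rank 3.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 50))

U, s, Vt = randomized_low_rank(A, k=3)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

The objective here is explicit (a good rank-k approximation in Frobenius norm), which is what makes it possible to say whether the randomized procedure succeeded.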

Examples of “tensor data”

• Chemistry: model fluorescence excitation-emission data in food science: Aijk is samples x emission x excitation.

• Neuroscience: EEG data as patients, doses, conditions, etc. varied: Aijk is time samples x frequency x electrodes.

• Social network and Web analysis: to discover hidden structures: Aijk is webpages x webpages x anchor text. Aijk is users x queries x webpages. Aijk is advertisers x bidded-phrases x time.

• Computer Vision: image compression and face recognition: Aijk is pixel x illumination x expression x viewpoint x person.

• Quantum mechanics, large-scale computation, hyperspectral data, climate data, ICA, nonnegative data, blind source separation, NP-hard problems, …

“Tensor-based data are particularly challenging due to their size and since many data analysis tools based on graph theory and linear algebra do not easily generalize.” -- MMDS06

(Acar and Yener 2008)
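To make the "users x queries x webpages" example concrete, the sketch below builds a small three-mode count array from a log of (user, query, webpage) records. The records, the helper `index_of`, and all identifiers are hypothetical and exist only for illustration.

```python
import numpy as np

# Hypothetical click log: one (user, query, webpage) triple per event.
records = [
    ("u1", "q1", "p1"),
    ("u1", "q1", "p2"),
    ("u2", "q1", "p1"),
    ("u1", "q1", "p1"),
]


def index_of(values):
    """Map each distinct label to a row/column/slice index."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


users = index_of(r[0] for r in records)
queries = index_of(r[1] for r in records)
pages = index_of(r[2] for r in records)

# A[i, j, k] counts how often user i issued query j and clicked webpage k.
A = np.zeros((len(users), len(queries), len(pages)))
for u, q, p in records:
    A[users[u], queries[q], pages[p]] += 1
```

Real data of this kind are extremely sparse, so a dense array like this one is only workable at toy scale.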

Four! Right Directions

1. Understand statistical and algorithmic assumptions s.t. tensor methods work. (NOT just independence.)

2. Understand the geometry of tensors. (NOT of vector spaces you unfold to.)

3. Understand WHY tensors work in physical applications and what this says about less structured data applications (and vice-versa, which has been very fruitful for matrices*.)

4. Understand “unfolding” as a process of defining features. (Since this puts you in a nice algorithmic place.)

*(E.g., low-rank off-diagonal blocks are common in matrices -- since the world is 3D, which is not true in less structured applications -- this has significant algorithmic implications.)
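For readers unfamiliar with "unfolding", the sketch below shows one common way to flatten a three-mode array into a matrix along a chosen mode, so that each index along that mode becomes a row whose columns act as features. The function name `unfold` and the column ordering (NumPy's default row-major order) are assumptions; other references order the columns differently, which permutes columns but does not change the rank of the unfolding.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows index the chosen mode, columns the rest.

    Assumption: columns follow NumPy's row-major ordering of the other modes.
    """
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


A = np.arange(24).reshape(2, 3, 4)
shapes = [unfold(A, m).shape for m in range(3)]

# Every unfolding of a rank-1 tensor is a rank-1 matrix.
x = np.array([1.0, 2.0])
y = np.array([1.0, 0.0, 1.0])
z = np.array([2.0, 1.0, 1.0, 3.0])
R = np.einsum("i,j,k->ijk", x, y, z)
ranks = [int(np.linalg.matrix_rank(unfold(R, m))) for m in range(3)]
```

After unfolding, matrix tools apply, which is the "nice algorithmic place"; the cost is that the geometry being analyzed is that of the unfolded matrix rather than of the tensor itself.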

Three Wrong directions

1. Viewing tensors as matrices with additional subscripts. (That may be true, but it hampers you, since R^n is so nice.)

2. Using methods that damage geometry and enhance sparsity. (BTW, you will do this if you don’t understand the underlying geometric and sparsity structure.)

3. Doing “Applied Ramsey Theory”:

Theorem: Given a large enough universe of data, then for any algorithm there exists a data set s.t. it performs well.

(Show me where your method fails AND where it succeeds! Otherwise, is your result about your data or your method? Of course, this applies more generally in data analysis.)

Conclusions

• Large-scale data applications have been the main driver for a lot of the interest in tensors.

• Tensors are tricky to deal with, both algorithmically and statistically.

• Let’s use this meeting to refine my directions in light of motivating data applications.