novel multiscale representations of data sets for...

19
Novel Multiscale Representations of Data Sets for Interactive Learning Mauro Maggioni, Eric Monson, R. Brady Dept. of Mathematics and Computer Science Duke University FODAVA Annual Review 2010, Georgia Tech 12/9/2010 Joint work with: G. Chen. Partial support: NSF/DHS, ONR Friday, December 10, 2010

Upload: others

Post on 25-Mar-2020

21 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Novel Multiscale Representations of Data Sets for Interactive Learning

Mauro Maggioni, Eric Monson, R. BradyDept. of Mathematics and Computer Science

Duke University

FODAVA Annual Review 2010, Georgia Tech12/9/2010

Joint work with: G. Chen.Partial support: NSF/DHS, ONR

Friday, December 10, 2010

Page 2: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Ongoing efforts in several directions

• Using diffusion processes on graphs for (inter)active learning.

• Perform multiscale analysis on graphs: construction of graph-adaptivemultiscale analysis, for graph visualization and exploration, and (inter)activelearning.

• Sparse learning w.r.t. multiscale dictionaries on graphs.

• Construct data-adaptive dictionaries for data-modeling and exploration.

Friday, December 10, 2010

Page 3: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Ongoing efforts in several directions

• Using diffusion processes on graphs for (inter)active learning.

• Perform multiscale analysis on graphs: construction of graph-adaptivemultiscale analysis, for graph visualization and exploration, and (inter)activelearning.

• Sparse learning w.r.t. multiscale dictionaries on graphs.

• Construct data-adaptive dictionaries for data-modeling and exploration.

Friday, December 10, 2010

Page 4: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Random walks on data & graphs

• One may connect data points to form a graph, with edges weighted

by the similarity of data points.

• One can then construct a random on the data points, which may be

used for a variety of tasks:

– construct local and global embeddings of the data in low dimen-

sions,

– perform learning tasks such as clustering, classification, regres-

sion, etc..

– diffuse information (e.g. labels) on data

– study geometric properties of data

Friday, December 10, 2010

Page 5: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Start with few labelled

points

Predict (diffusion)

Label a ``well-chosen’’ new point

Active LearningWith E. Monson and R.

Brady [C.S.]

Given: full data set (e.g. a body of text documents).

Goal: learn a categorization of the data (e.g. topics of the text documents).Cost for every label we obtain from an expert. Large scale here means: “labelsare very expensive compared to very large amount of data available”.

Find points whose labels maximize the gain in prediction accuracy. Naturalcandidates: points with highly uncertain predictions + well-distributed on thedata (standard idea) (our contribution) Points actually proceed in a multiscalefashion.

Friday, December 10, 2010

Page 6: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

20 30 40 50 60 70 800.5

0.55

0.6

0.65

0.7

0.75

With E. Monson and R. Brady [C.S.]Active Learning

1147 Science News articles, 8 categories

Cost: # of labeled points

Accuracy of predictions

Friday, December 10, 2010

Page 7: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Ongoing efforts in several directions

• Using diffusion processes on graphs for (inter)active learning.

• Perform multiscale analysis on graphs: construction of graph-adaptivemultiscale analysis, for graph visualization and exploration, and (inter)activelearning.

• Sparse learning w.r.t. multiscale dictionaries on graphs.

• Construct data-adaptive dictionaries for data-modeling and exploration.

Friday, December 10, 2010

Page 8: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Ongoing efforts in several directions

• Using diffusion processes on graphs for (inter)active learning.

• Perform multiscale analysis on graphs: construction of graph-adaptivemultiscale analysis, for graph visualization and exploration, and (inter)activelearning.

• Sparse learning w.r.t. multiscale dictionaries on graphs.

• Construct data-adaptive dictionaries for data-modeling and exploration.

Friday, December 10, 2010

Page 9: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

X is N × D, N documents in RD, compute multiscale dictionary Φ (D ×M)on the D words. If f maps documents to their topic, write f = XΦβ + η andfind β by

argminβ ||f −XΦβ||22 + λ||{2−jγβj,k}||1 ,

which is a form of sparse regression. (λ, γ) are determined by cross-validation.

Example: text documentsWith S. Mukherjee

and J. Guinney

Friday, December 10, 2010

Page 10: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

X is N × D, N documents in RD, compute multiscale dictionary Φ (D ×M)on the D words. If f maps documents to their topic, write f = XΦβ + η andfind β by

argminβ ||f −XΦβ||22 + λ||{2−jγβj,k}||1 ,

which is a form of sparse regression. (λ, γ) are determined by cross-validation.

Example: text documentsWith S. Mukherjee

and J. Guinney

Friday, December 10, 2010

Page 11: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Source of data: Nakagawa T, Kollmeyer T, Morlan B, Anderson, S, Bergstralh E, et al, (2008) A tissue biomarker panel predicting systemic progression after PSA recurrence post-definitive prostate cancer therapy, Plos One 3:e2318.

X is N ×D, N patients with D genes (here N ∼ 400 and D ∼ 1000).

With S. Mukherjee and J. Guinney

Example: gene microarray data

Friday, December 10, 2010

Page 12: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Source of data: Nakagawa T, Kollmeyer T, Morlan B, Anderson, S, Bergstralh E, et al, (2008) A tissue biomarker panel predicting systemic progression after PSA recurrence post-definitive prostate cancer therapy, Plos One 3:e2318.

X is N ×D, N patients with D genes (here N ∼ 400 and D ∼ 1000).

With S. Mukherjee and J. Guinney

Example: gene microarray data

Friday, December 10, 2010

Page 13: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Source of data: Nakagawa T, Kollmeyer T, Morlan B, Anderson, S, Bergstralh E, et al, (2008) A tissue biomarker panel predicting systemic progression after PSA recurrence post-definitive prostate cancer therapy, Plos One 3:e2318.

Added advantage: the multiscale genes we construct are much interpretable than eigengenes, several of them match important pathways, and moreover both small scale and large scale genelets seem relevant.

X is N ×D, N patients with D genes (here N ∼ 400 and D ∼ 1000).

With S. Mukherjee and J. Guinney

Example: gene microarray data

Friday, December 10, 2010

Page 14: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Ongoing efforts in several directions

• Using diffusion processes on graphs for (inter)active learning.

• Perform multiscale analysis on graphs: construction of graph-adaptivemultiscale analysis, for graph visualization and exploration, and (inter)activelearning.

• Sparse learning w.r.t. multiscale dictionaries on graphs.

• Construct data-adaptive dictionaries for data-modeling and exploration.

Friday, December 10, 2010

Page 15: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Ongoing efforts in several directions

• Using diffusion processes on graphs for (inter)active learning.

• Perform multiscale analysis on graphs: construction of graph-adaptivemultiscale analysis, for graph visualization and exploration, and (inter)activelearning.

• Sparse learning w.r.t. multiscale dictionaries on graphs.

• Construct data-adaptive dictionaries for data-modeling and exploration.

Friday, December 10, 2010

Page 16: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Geometric Wavelets: Multiscale Data-Adaptive Dictionaries• Many constructions of “general-purpose” dictionaries [Fourier, wavelets,

curvelets, ...], especially for low-dimensional signals (sounds, images,...).

Motivation: pretend we have rather good tractable models (e.g. functionspaces), construct good dictionaries by hand.

Goals: compression, signal processing tasks (e.g. denoising), etc...

• Recently, many constructions of data-adaptive dictionaries [K-SVD, K-planes, ...].

Motivation: we do not have tractable good models, need to adapt to data.

Goals: as before, albeit hopes for more general types of high-dimensionaldata.

• Important role of sparsity in statistics, learning, design of measurements,...: seek dictionaries that yield sparse representations of the data.

Friday, December 10, 2010

Page 17: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Dictionary Learning

!j

xj

xj+"cj+!

cj+!

cj+"

cj+"

cj

x

!j+"

#j+"

Scalingfunctions

Pt approxat scale j

Finerpt approxat scale j+!

Originaldata pointWavelets

Waveletcorrection

Treestructure

Centers!Graphics by E. Monson

With G. Chen, E. Monson, R. Brady

Friday, December 10, 2010

Page 18: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

UI for Geometric WaveletsWith E. Monson and R.

Brady [C.S.]

Friday, December 10, 2010

Page 19: Novel Multiscale Representations of Data Sets for ...fodava.gatech.edu/files/review2010/Maggioni_FODAVA_2010.pdf · Novel Multiscale Representations of Data Sets for Interactive Learning

Open problems & future dir.’s

• Geometric wavelets meet interactive learning.• Multiscale analysis on graphs meets interactive learning.• Better visualization of multiscale analysis of graphs [E. Monson, R. Brady]

• Towards a toolbox of highly robust geometric analysis tools for data sets [A. Little, G. Chen].

• Dynamic graphs [J. Lee].• Wrap up toolboxes; scale part of the code.

Collaborators: E. Monson, R. Brady (Duke C.S.); A. V. Little, K. Balachandrian (Math grad, Duke), J. Lee (Math undergrad, Duke); L. Rosasco (CS, MIT and Universita’ di Genova).Funding: NSF, ONR, Sloan Foundation, Duke.

www.math.duke.edu/~mauro

Friday, December 10, 2010