

TUTORIALS AT ISIT’15

T-PM-2: INFORMATION THEORY AND MACHINE LEARNING
by Emmanuel Abbe and Martin Wainwright

We are in the midst of a data deluge, with an explosion in the volume and richness of data sets in fields including social networks, biology, natural language processing, and computer vision, among others. In all of these areas, machine learning has been extraordinarily successful in providing tools and practical algorithms for extracting information from massive data sets (e.g., genetics, multi-spectral imaging, Google and Facebook). Despite this tremendous practical success, relatively little attention has been paid to fundamental limits and tradeoffs, and information theory has a crucial role to play in this context.

The goal of this tutorial is to demonstrate how information-theoretic techniques and concepts can be applied to machine learning problems in unorthodox and fruitful ways. We discuss how any learning problem can be formalized in a Shannon-theoretic sense, albeit one that involves non-traditional notions of codewords and channels. This perspective allows information-theoretic tools, including information measures, Fano’s inequality, and random coding arguments, to be brought to bear on learning problems.
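To make the reduction concrete, here is a minimal sketch (the uniform-prior setup and the notation below are introduced here for illustration, not taken from the abstract): suppose a learner must identify one of M candidate models from data Y, with the true model θ drawn uniformly at random among them. Fano’s inequality lower-bounds the error probability of any learning algorithm, viewed as a decoder that outputs an estimate of θ from Y:

\[
  \mathbb{P}\bigl[\hat{\theta}(Y) \neq \theta\bigr] \;\ge\; 1 - \frac{I(\theta; Y) + \log 2}{\log M}.
\]

Here the M candidate models play the role of codewords, the data-generating distribution acts as the channel, and the learner is the decoder; combining a count of the candidate models with an upper bound on the mutual information I(θ; Y) then yields a lower bound on the amount of data needed for reliable recovery.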

We illustrate this broad perspective with discussions of several learning problems, including sparse approximation, dimensionality reduction, graph recovery, clustering, and community detection. We emphasize recent results establishing the fundamental limits of graphical model learning and community detection. We also discuss the distinction between the learning-theoretic capacity when arbitrary “decoding” algorithms are allowed and notions of computationally-constrained capacity. Finally, we discuss a number of open problems and conjectures at the interface of information theory and machine learning.
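As a rough illustration of the community detection setting (a toy sketch, not the capacity-achieving algorithms mentioned in this abstract; the graph size and edge probabilities are arbitrary assumptions), the following Python snippet samples a two-community stochastic block model and recovers the communities with a simple spectral “decoder” that thresholds the second eigenvector of the adjacency matrix:

import numpy as np

def sbm_two_communities(n, p_in, p_out, rng):
    """Sample a symmetric two-community stochastic block model adjacency matrix."""
    labels = np.repeat([1, -1], n // 2)               # ground-truth community labels
    P = np.where(np.equal.outer(labels, labels), p_in, p_out)
    upper = rng.random((n, n)) < P                    # Bernoulli edge indicators
    A = np.triu(upper, k=1)
    A = (A + A.T).astype(float)                       # symmetric, no self-loops
    return A, labels

def spectral_partition(A):
    """Threshold the eigenvector of the second-largest adjacency eigenvalue."""
    _, eigvecs = np.linalg.eigh(A)                    # eigenvalues in ascending order
    v2 = eigvecs[:, -2]
    return np.where(v2 >= 0, 1, -1)

rng = np.random.default_rng(0)
A, labels = sbm_two_communities(n=1000, p_in=0.05, p_out=0.01, rng=rng)
guess = spectral_partition(A)
# Agreement up to relabeling of the two communities
agreement = max(np.mean(guess == labels), np.mean(-guess == labels))
print(f"fraction of nodes correctly classified: {agreement:.3f}")

The sign-threshold decoder is only a convenient heuristic; contrasting what such computationally efficient decoders can achieve with the information-theoretic limits is exactly the distinction between constrained and unconstrained capacity discussed above.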

Emmanuel Abbe is an assistant professor at Princeton University, with a joint appointment between the EE department and the Program in Applied and Computational Mathematics. His research interests lie at the boundaries between information theory, machine learning, and networks, in particular on the interplay between coding theory and community detection. He recently received the Bell Labs Prize (’14) for his work establishing the fundamental limits of community recovery in statistical network models, and for the development of capacity-achieving clustering algorithms.

Martin Wainwright is a professor at UC Berkeley, with a joint appointment between the Departments of Statistics and EECS. His research interests lie at the boundaries between machine learning, statistics, and information theory, and he has worked on various topics in these areas, including sparse approximation, graphical model recovery, non-parametric estimation, randomized algorithms, and spectral methods. Together with Michael Jordan, he co-authored a monograph on graphical models (’08), and he is currently writing a book on high-dimensional statistics. He has given a number of short courses on graphical models and high-dimensional statistics, and recently received the Presidents’ Award (’14) from the Committee of Presidents of Statistical Societies (COPSS).