statistical machine learning from data - introduction to


Page 1: Statistical Machine Learning from Data - Introduction to

Statistical Machine Learning from Data

Introduction to Machine Learning

Samy Bengio

IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland

[email protected]

http://www.idiap.ch/~bengio

November 30, 2005

Samy Bengio Statistical Machine Learning from Data 1

Page 2: Statistical Machine Learning from Data - Introduction to

1 What is Machine Learning?

2 Types of Problems and Situations

3 Content of the Course

4 Documentation

Page 4: Statistical Machine Learning from Data - Introduction to

What is Machine Learning? (Graphical View)

Page 5: Statistical Machine Learning from Data - Introduction to

What is Machine Learning?

Learning is an essential human property

Learning means changing in order to be better (according to a given criterion) when a similar situation arises

Learning IS NOT learning by heart

Any computer can learn by heart; the difficulty is to generalize a behavior to a novel situation
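The contrast between learning by heart and generalizing can be sketched in a few lines of Python. This is only an illustration, not material from the slides: the training pairs, the underlying relation y = 2x + 1, and the function names are all assumptions chosen for the example.

```python
# Illustrative sketch: rote memorization vs. generalization.
# The data and the relation y = 2x + 1 are assumptions made for this example.

train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# "Learning by heart": store every training pair in a lookup table.
table = dict(train)


def rote_predict(x):
    """Return the stored answer, or None for an input never seen in training."""
    return table.get(x)


def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b to a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


a, b = fit_line(train)


def line_predict(x):
    """Prediction of the fitted line, defined for any real input."""
    return a * x + b


novel_x = 10.0
print(rote_predict(1.0))       # an input seen in training
print(rote_predict(novel_x))   # a novel input: the table has no answer
print(line_predict(novel_x))   # the fitted line still produces a prediction
```

The lookup table is perfect on the training set and useless elsewhere, while the fitted line answers for inputs it has never seen.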

Page 6: Statistical Machine Learning from Data - Introduction to

Why Learning is Difficult?

Given a finite amount of training data, you have to derive a relation for an infinite domain

In fact, there is an infinite number of such relations

How should we draw the relation?

Page 7: Statistical Machine Learning from Data - Introduction to

Why Learning is Difficult? (2)

Which relation is the most appropriate?

Page 8: Statistical Machine Learning from Data - Introduction to

Why Learning is Difficult? (3)

... the hidden test points...
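A small Python sketch can make the "infinite number of relations" concrete. It is an illustration under assumed data, not the figure from the slides: the three training points, the base relation y = x^2, and the constants c are all hypothetical. Adding any multiple of a term that vanishes at every training input gives another relation with zero training error.

```python
# Illustrative sketch: many relations fit the same finite training set.
# The training points and the constants c are assumptions for this example.

train = [(-1.0, 1.0), (0.0, 0.0), (1.0, 1.0)]


def base(x):
    """One relation that passes through every training point: y = x^2."""
    return x * x


def vanishing(x):
    """Product of (x - x_i): zero at every training input, nonzero elsewhere."""
    result = 1.0
    for x_i, _ in train:
        result *= (x - x_i)
    return result


def make_relation(c):
    """For any real c, this relation also fits the training set exactly."""
    return lambda x: base(x) + c * vanishing(x)


candidates = [make_relation(c) for c in (-2.0, 0.0, 0.5, 3.0)]

# Every candidate has zero error on the training set...
for f in candidates:
    assert all(f(x) == y for x, y in train)

# ...yet they disagree on a hidden test point.
x_test = 2.0
predictions = [f(x_test) for f in candidates]
print(predictions)  # [-8.0, 4.0, 7.0, 22.0]
```

Since c can be any real number, the training set alone cannot say which of these relations is the right one.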

Page 9: Statistical Machine Learning from Data - Introduction to

Occam’s Razor Principle

William of Occam: monk living in the 14th century

Principle of Parsimony:

One should not increase, beyond what is necessary, the number of entities required to explain anything

When many solutions are available for a given problem, we should select the simplest one

But what do we mean by simple?

We will use prior knowledge of the problem at hand to define what a simple solution is

Example of a prior: smoothness
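One way to turn the smoothness prior into a selection rule is sketched below in Python. Everything here is an assumption made for illustration: the training points, the family of candidate curves, and the choice of the integrated squared second derivative as the roughness measure. The slides do not prescribe any of them.

```python
# Illustrative sketch: choosing the "simplest" solution with a smoothness prior.
# Data, candidate family and roughness measure are assumptions for this example.

train = [(-1.0, 1.0), (0.0, 0.0), (1.0, 1.0)]


def make_candidate(c):
    """y = x^2 + c * (x + 1) * x * (x - 1): fits the training set for any c."""
    return lambda x: x * x + c * (x + 1.0) * x * (x - 1.0)


def roughness(f, lo=-1.0, hi=1.0, steps=200):
    """Approximate the integral of f''(x)^2 over [lo, hi].

    The second derivative is estimated with central finite differences;
    a larger value means a wigglier curve.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(1, steps):
        x = lo + i * h
        second = (f(x - h) - 2.0 * f(x) + f(x + h)) / (h * h)
        total += second * second * h
    return total


cs = [-2.0, 0.0, 0.5, 3.0]
scores = {c: roughness(make_candidate(c)) for c in cs}

# All candidates explain the training data equally well, so the prior decides.
simplest = min(scores, key=scores.get)
print(simplest)  # 0.0, the plain parabola
```

A different prior (for example, preferring small coefficients) could rank the same candidates differently, which is why "simple" has to be defined from knowledge of the problem.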

Page 10: Statistical Machine Learning from Data - Introduction to

Learning as a Search Problem

(Figure labels: set of solutions chosen a priori; initial solution; optimal solution; set of solutions compatible with training set)
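The search view can be sketched with gradient descent in Python. This is a minimal illustration under stated assumptions, not the procedure shown in the figure: the set of solutions chosen a priori is taken to be the lines y = w * x, the initial solution is w = 0, and the data, step size and iteration count are arbitrary choices.

```python
# Illustrative sketch: learning as a search through a set of solutions.
# Hypothesis set, data and step size are assumptions for this example.

train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated from y = 2x


def training_error(w):
    """Mean squared error of the hypothesis y = w * x on the training set."""
    return sum((w * x - y) ** 2 for x, y in train) / len(train)


def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in train) / len(train)


w = 0.0               # initial solution
learning_rate = 0.05  # step size, chosen by hand
for step in range(100):
    # Move to a neighboring solution with lower training error.
    w -= learning_rate * gradient(w)

print(round(w, 6), round(training_error(w), 6))
```

Here the search ends at the single solution in the chosen set that is compatible with the training data; with a richer set there would be many such solutions, and the search path would determine which one is reached.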

Page 14: Statistical Machine Learning from Data - Introduction to

Types of Problems

There are 3 kinds of problems:

regression, classification, density estimation

(Figure: density estimation, with axes X and P(X))
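The three kinds of problems differ mainly in what the learned function returns. The Python sketch below gives one tiny instance of each; the specific models (a line through the origin, a nearest-mean rule, a single Gaussian) and the toy data are assumptions picked for brevity, not the models used in the course.

```python
# Illustrative sketch: one toy instance of each kind of problem.
# Models and data are assumptions for this example.
import math
import statistics


# Regression: map an input x to a real-valued output y.
def fit_slope(pairs):
    """Least-squares slope for the model y = w * x (no intercept)."""
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)


# Classification: map an input x to a discrete class label.
def nearest_mean_classifier(labelled):
    """Return a function that assigns x to the class with the closest mean."""
    means = {}
    for label in {lab for _, lab in labelled}:
        means[label] = statistics.mean(x for x, lab in labelled if lab == label)
    return lambda x: min(means, key=lambda lab: abs(x - means[lab]))


# Density estimation: model P(X) itself from unlabelled samples.
def fit_gaussian(samples):
    """Maximum-likelihood Gaussian fit; returns the density function p(x)."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)

    def density(x):
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    return density


w = fit_slope([(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)])
classify = nearest_mean_classifier([(0.1, "a"), (0.3, "a"), (2.0, "b"), (2.4, "b")])
p = fit_gaussian([1.0, 2.0, 3.0, 4.0, 5.0])

print(w)              # a real number
print(classify(0.5))  # a class label
print(p(3.0))         # a density value
```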

Page 15: Statistical Machine Learning from Data - Introduction to

Types of Learning

Supervised learning:

The training data contains the desired behavior (desired class, outcome, etc.)

Reinforcement learning:

The training data contains partial targets (for instance, simply whether the machine did well or not)

Unsupervised learning:

The training data is raw; no class or target is given

There is often a hidden goal in that task (compression, maximum likelihood, etc.)
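The three situations can be told apart by the shape of the training data alone. The Python sketch below builds all three from the same inputs; the hidden rule, the placeholder policy and the variable names are hypothetical and serve only to show what information reaches the learner in each case.

```python
# Illustrative sketch: what the training data looks like in each situation.
# The hidden rule and the placeholder policy are assumptions for this example.
import random

rng = random.Random(0)
inputs = [rng.uniform(-1.0, 1.0) for _ in range(5)]


def hidden_rule(x):
    """The behavior we would like the machine to acquire (unknown to it)."""
    return 1 if x >= 0.0 else 0


# Supervised: every training example carries the desired output.
supervised_data = [(x, hidden_rule(x)) for x in inputs]


def act(x):
    """A placeholder policy that always answers 1."""
    return 1


# Reinforcement: the machine acts, then only learns whether it did well.
reinforcement_data = [(x, act(x), act(x) == hidden_rule(x)) for x in inputs]

# Unsupervised: raw inputs only; no class or target is attached.
unsupervised_data = list(inputs)

print(supervised_data[0])
print(reinforcement_data[0])
print(unsupervised_data[0])
```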

Page 16: Statistical Machine Learning from Data - Introduction to

Applications

Vision Processing

Face detection/verification

Handwriting recognition

Speech Processing

Phoneme/Word/Sentence/Person recognition

Others

Indexing: Google, text mining, information retrieval

Finance: asset prediction, portfolio and risk management

Telecom: traffic prediction

Data mining: make use of huge datasets kept by large corporations

Games: Backgammon, Go

Control: robots

... and plenty of others of course!

Page 18: Statistical Machine Learning from Data - Introduction to

Content of the Course

Theoretical Issues

What are the theoretical foundations for statistical learning?

How can we measure the expected performance of a model?

Modeling Issues

Models specialized for classification, regression, distributions, sequences, images, etc.

For each model, we need to devise a training algorithm

Others

Other practical issues, such as feature selection, parameter sharing, etc.

Laboratories

About one third of the course will consist of practical laboratories, using the Python programming language

Page 20: Statistical Machine Learning from Data - Introduction to

Journals and Conferences

Journals:

Journal of Machine Learning Research

Neural Computation

IEEE Transactions on Neural Networks

IEEE Transactions on Pattern Analysis and Machine Intelligence

Conferences:

NIPS: Neural Information Processing Systems

COLT: Computational Learning Theory

ICML: International Conference on Machine Learning

Page 21: Statistical Machine Learning from Data - Introduction to

Books and Lecture Notes

Books:

C. Bishop. Neural Networks for Pattern Recognition, 1995.

V. Vapnik. The Nature of Statistical Learning Theory, 1995.

T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning, 2001.

B. Schölkopf, A. J. Smola. Learning with Kernels, 2002.

Other lecture notes (some are in French):

Bengio, Y.: http://www.iro.umontreal.ca/~pift6266/A03/

Kegl, B.: http://www.iro.umontreal.ca/~kegl/ift6266/

Jordan, M.: http://www.cs.berkeley.edu/~jordan/courses/281A-fall04/

LeCun, Y.: http://www.cs.nyu.edu/~yann/2005f-G22-2565-001/

Page 22: Statistical Machine Learning from Data - Introduction to

Electronic Resources

Search engines:

NIPS online: http://nips.djvuzone.org

Citeseer: http://citeseer.ist.psu.edu/

Google Scholar: http://scholar.google.com/

Machine learning libraries:

Torch: http://www.Torch.ch

Lush: http://lush.sf.net
