Online Learning Rong Jin


Post on 17-Jan-2016


TRANSCRIPT

Page 1: Online Learning Rong Jin. Batch Learning Given a collection of training examples D Learning a classification model from D What if training examples are

Online Learning

Rong Jin


Batch Learning

• Given a collection of training examples D

• Learn a classification model from D
• What if training examples are received one at a time?


Online Learning

For t = 1, 2, …, T
• Receive an instance $x_t$
• Predict its class label $\hat{y}_t$
• Receive the true class label $y_t$
• Incur loss $\ell(\hat{y}_t, y_t)$
• Update the classification model
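The protocol above can be sketched as a small driver loop. This is only an illustration: `run_online`, `LastLabelLearner`, and `zero_one` are hypothetical names that do not come from the slides, and the toy learner simply repeats the last label it saw.

```python
def zero_one(y_hat, y):
    """Zero-one loss: 1 for a wrong prediction, 0 otherwise."""
    return 1.0 if y_hat != y else 0.0


class LastLabelLearner:
    """Toy learner (hypothetical): predicts the most recent true label."""

    def __init__(self):
        self.last = 1  # arbitrary initial guess

    def predict(self, x):
        return self.last

    def update(self, x, y, y_hat):
        self.last = y


def run_online(stream, learner, loss):
    """Run the receive / predict / reveal / suffer-loss / update loop."""
    total = 0.0
    for x_t, y_t in stream:
        y_hat = learner.predict(x_t)      # predict before seeing the label
        total += loss(y_hat, y_t)         # loss is charged on that prediction
        learner.update(x_t, y_t, y_hat)   # then the model may change
    return total


stream = [(0, 1), (0, 1), (0, -1), (0, -1)]
total_loss = run_online(stream, LastLabelLearner(), zero_one)
```

The point of the sketch is the ordering: the label is revealed only after the prediction is committed, so the total loss counts mistakes made "in flight".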


Objective

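• Minimize the total loss $\sum_{t=1}^{T} \ell(\hat{y}_t, y_t)$
• Loss function
• Zero-One loss: $\ell(\hat{y}_t, y_t) = \mathbb{1}[\hat{y}_t \neq y_t]$
• Hinge loss: $\ell(w; (x_t, y_t)) = \max(0,\ 1 - y_t\, w^\top x_t)$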
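As a small illustration of the two losses, here is a sketch that assumes the standard definitions (zero-one loss on the predicted label, hinge loss $\max(0, 1 - y \cdot \text{score})$ on the real-valued score); the function names are hypothetical.

```python
def zero_one_loss(y_hat, y):
    """1 if the predicted label differs from the true label, else 0."""
    return 1.0 if y_hat != y else 0.0


def hinge_loss(score, y):
    """Hinge loss on a real-valued score; zero once the margin y*score reaches 1."""
    return max(0.0, 1.0 - y * score)


def sign(score):
    """Label predicted from a score (ties broken towards +1, an assumption)."""
    return 1 if score >= 0 else -1


# Whenever the sign of the score disagrees with y, the margin is at most 0,
# so the hinge loss is at least 1 and upper-bounds the zero-one loss.
scores = [-2.0, -0.5, 0.5, 2.0]
pairs = [(zero_one_loss(sign(s), 1), hinge_loss(s, 1)) for s in scores]
```

The hinge loss is convex in the score, which is why it is usually the quantity that appears in the bounds, while the zero-one loss is what counts mistakes.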


Loss Functions

[Figure: Zero-One Loss and Hinge Loss curves, with tick marks at 1]


Linear Classifiers

• Restrict our discussion to linear classifiers
• Prediction: $\hat{y} = \mathrm{sign}(w^\top x)$
• Confidence: $|w^\top x|$
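A minimal sketch of prediction and confidence for a linear classifier, assuming the usual convention that the label is the sign of $w^\top x$ and the confidence is its magnitude. The helper names are hypothetical, and breaking ties towards +1 is an arbitrary choice.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def predict(w, x):
    """Predicted label in {-1, +1}; a score of exactly 0 maps to +1 here."""
    return 1 if dot(w, x) >= 0 else -1


def confidence(w, x):
    """Magnitude of the score: larger means further from the decision boundary."""
    return abs(dot(w, x))


w = [1.0, -2.0]
label = predict(w, [3.0, 1.0])
conf = confidence(w, [3.0, 1.0])
```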


Separable Set


Inseparable Sets


Why Online Learning?

• Fast
• Memory efficient: processes one example at a time
• Simple to implement
• Formal guarantees: regret/mistake bounds
• Online-to-batch conversions
• No statistical assumptions
• Adaptive

Not as good as a well-designed batch algorithm


Update Rules

• Online algorithms are based on an update rule that defines $w_{t+1}$ from $w_t$ (and possibly other information)
• Linear classifiers: find $w_{t+1}$ from $w_t$ based on the input $(x_t, y_t)$

Some update rules:
– Perceptron (Rosenblatt)
– ALMA (Gentile)
– ROMMA (Li & Long)
– NORMA (Kivinen et al.)
– MIRA (Crammer & Singer)
– EG (Littlestone and Warmuth)
– Bregman-based (Warmuth)


Perceptron

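Initialize $w_1 = 0$
For t = 1, 2, …, T
• Receive an instance $x_t$
• Predict its class label $\hat{y}_t = \mathrm{sign}(w_t^\top x_t)$
• Receive the true class label $y_t$
• If $y_t\, w_t^\top x_t \le 0$ then $w_{t+1} = w_t + y_t x_t$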
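The steps above can be sketched in a few lines. This assumes the classical Perceptron rule (start from the zero vector, add $y_t x_t$ whenever $y_t\, w_t^\top x_t \le 0$); the function name and the toy stream are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def perceptron(stream, dim):
    """Run one pass of the Perceptron; return the final weights and mistake count."""
    w = [0.0] * dim
    mistakes = 0
    for x, y in stream:
        # A non-positive margin counts as a mistake (this includes a zero score).
        if y * dot(w, x) <= 0:
            w = [wi + y * xi for wi, xi in zip(w, x)]
            mistakes += 1
    return w, mistakes


stream = [
    ([1.0, 1.0], 1),
    ([-1.0, -1.0], -1),
    ([1.0, 0.5], 1),
    ([-0.5, -1.0], -1),
]
w, mistakes = perceptron(stream, dim=2)
```

On this toy stream only the first example triggers an update, because the zero initial vector gives a zero margin; after that every example already has a positive margin.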


Geometrical Interpretation


Mistake Bound: Separable Case

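• Assume the data set D is linearly separable with margin $\gamma$, i.e., there exists $u$ with $\|u\|_2 = 1$ such that $y_t\, u^\top x_t \ge \gamma$ for every example $(x_t, y_t)$ in D
• Assume $\|x_t\|_2 \le R$ for all t
• Then the maximum number of mistakes made by the Perceptron algorithm is bounded by $R^2 / \gamma^2$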
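To make the statement concrete, the sketch below checks the classical bound $R^2/\gamma^2$ numerically. Everything in it is an assumption for illustration: the separator `u`, the margin filter of 0.2, the random data, and the number of passes. The bound itself holds for any ordering of the examples, including repeated passes.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def count_perceptron_mistakes(examples, dim, epochs=20):
    """Cycle through the data several times and count every Perceptron update."""
    w = [0.0] * dim
    mistakes = 0
    for _ in range(epochs):
        for x, y in examples:
            if y * dot(w, x) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
    return mistakes


rng = random.Random(0)
u = [0.6, 0.8]  # hypothetical unit-norm separator
examples = []
while len(examples) < 200:
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    score = dot(u, x)
    if abs(score) >= 0.2:  # keep only points at least 0.2 away from the boundary
        examples.append((x, 1 if score > 0 else -1))

gamma = min(y * dot(u, x) for x, y in examples)     # margin achieved by u
R = max(math.sqrt(dot(x, x)) for x, _ in examples)  # radius of the data
bound = R ** 2 / gamma ** 2
mistakes = count_perceptron_mistakes(examples, dim=2)
```

In practice `mistakes` tends to sit well below `bound`, which is consistent with the remark that the bound is usually not tight on real data.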


Mistake Bound: Separable Case
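The derivation on this slide did not survive extraction. The standard argument (Novikoff's proof) is sketched below under the usual assumptions: a unit vector $u$ separates the data with margin $\gamma$, every instance satisfies $\|x_t\|_2 \le R$, the Perceptron starts from $w_1 = 0$, and $M$ denotes the number of mistakes. The symbols are assumptions where the transcript leaves them blank.

```latex
% Progress along u: every mistake adds at least gamma.
u^\top w_{t+1} = u^\top w_t + y_t\, u^\top x_t \;\ge\; u^\top w_t + \gamma
\quad\Longrightarrow\quad u^\top w_{T+1} \;\ge\; M\gamma .

% Norm growth: an update happens only when y_t w_t^\top x_t \le 0.
\|w_{t+1}\|_2^2 = \|w_t\|_2^2 + 2\, y_t\, w_t^\top x_t + \|x_t\|_2^2
\;\le\; \|w_t\|_2^2 + R^2
\quad\Longrightarrow\quad \|w_{T+1}\|_2^2 \;\le\; M R^2 .

% Combine with Cauchy-Schwarz and \|u\|_2 = 1.
M\gamma \;\le\; u^\top w_{T+1} \;\le\; \|w_{T+1}\|_2 \;\le\; \sqrt{M}\, R
\quad\Longrightarrow\quad M \;\le\; \frac{R^2}{\gamma^2} .
```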


Mistake Bound: Inseparable Case

• Let $u$ be the best linear classifier
• We measure our progress by [expression lost in extraction]
• Consider the case where we make a mistake on $(x_t, y_t)$


Mistake Bound: Inseparable Case

• Result 1:


Mistake Bound: Inseparable Case

• Result 2


Perceptron with Projection

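Initialize $w_1 = 0$
For t = 1, 2, …, T
• Receive an instance $x_t$
• Predict its class label $\hat{y}_t = \mathrm{sign}(w_t^\top x_t)$
• Receive the true class label $y_t$
• If $y_t\, w_t^\top x_t \le 0$ then $w_{t+1} = w_t + y_t x_t$
• If $\|w_{t+1}\|_2$ exceeds the allowed norm then project $w_{t+1}$ back onto that ball by rescaling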
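A sketch of one plausible reading of this variant: after the usual Perceptron update, the weight vector is rescaled onto an L2 ball whenever it leaves it. The two conditions did not survive in the transcript, so the `radius` parameter and the choice of an L2 ball are assumptions.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def projected_perceptron(stream, dim, radius):
    """Perceptron whose weights are kept inside an L2 ball of the given radius."""
    w = [0.0] * dim
    for x, y in stream:
        # Step 1: standard mistake-driven update.
        if y * dot(w, x) <= 0:
            w = [wi + y * xi for wi, xi in zip(w, x)]
        # Step 2: Euclidean projection onto the ball, i.e. rescale if too long.
        norm = math.sqrt(dot(w, w))
        if norm > radius:
            w = [wi * radius / norm for wi in w]
    return w


w = projected_perceptron([([3.0, 4.0], 1)], dim=2, radius=1.0)
```

Rescaling does not change the sign of $w^\top x$, so predictions are unaffected by the projection itself; what changes is how strongly later updates move the classifier.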


Remarks

• Mistake bound is measured for a sequence of classifiers

• Bound does not depend on dimension of the feature vector

• The bound holds for all sequences (no i.i.d. assumption).

• It is not tight for most real-world data, but it cannot be improved in general.


Perceptron

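Initialize $w_1 = 0$
For t = 1, 2, …, T
• Receive an instance $x_t$
• Predict its class label $\hat{y}_t = \mathrm{sign}(w_t^\top x_t)$
• Receive the true class label $y_t$
• If $y_t\, w_t^\top x_t \le 0$ then $w_{t+1} = w_t + y_t x_t$

Conservative: updates the classifier only when it misclassifies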


Aggressive Perceptron

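Initialize $w_1 = 0$
For t = 1, 2, …, T
• Receive an instance $x_t$
• Predict its class label $\hat{y}_t = \mathrm{sign}(w_t^\top x_t)$
• Receive the true class label $y_t$
• If $y_t\, w_t^\top x_t \le \gamma$ (a margin threshold) then $w_{t+1} = w_t + y_t x_t$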
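The update condition on this slide was lost, so the following is only a sketch of the usual meaning of "aggressive": the classifier also updates on correctly classified examples whose margin is small. The threshold parameter `margin` is a hypothetical name.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def aggressive_perceptron(stream, dim, margin):
    """Update whenever the margin y * w.x is at most `margin`, not only on mistakes."""
    w = [0.0] * dim
    updates = 0
    for x, y in stream:
        if y * dot(w, x) <= margin:
            w = [wi + y * xi for wi, xi in zip(w, x)]
            updates += 1
    return w, updates


stream = [([1.0, 0.0], 1), ([1.0, 0.0], 1)]
w_aggressive, n_aggressive = aggressive_perceptron(stream, dim=2, margin=1.5)
w_conservative, n_conservative = aggressive_perceptron(stream, dim=2, margin=0.0)
```

With `margin=0.0` the rule reduces to the conservative Perceptron; on the toy stream the second example is classified correctly with margin 1, so only the aggressive setting updates on it.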


Regret Bound


Learning a Classifier

• The evaluation (mistake bound or regret bound) concerns a sequence of classifiers

• But at the end of the day, which classifier should be used? The last one? One chosen by cross-validation?
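One common answer, offered here as an assumption rather than as the lecture's own recommendation, is to average the whole sequence of classifiers (the averaged Perceptron, a simple online-to-batch conversion). The sketch returns both the last and the averaged weight vector so they can be compared on held-out data.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def averaged_perceptron(stream, dim):
    """Return (last weights, average of the weights held after each round)."""
    w = [0.0] * dim
    w_sum = [0.0] * dim
    rounds = 0
    for x, y in stream:
        if y * dot(w, x) <= 0:
            w = [wi + y * xi for wi, xi in zip(w, x)]
        # Accumulate the classifier in force after this round.
        w_sum = [si + wi for si, wi in zip(w_sum, w)]
        rounds += 1
    w_avg = [si / rounds for si in w_sum] if rounds else list(w)
    return w, w_avg


w_last, w_avg = averaged_perceptron([([1.0, 0.0], 1), ([0.0, 1.0], 1)], dim=2)
```

Averaging damps the influence of the final few updates, which matters when the last example seen happens to be noisy.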


Learning with Expert Advice

• Learning to combine the predictions from multiple experts

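• An ensemble of d experts: $h_1, \dots, h_d$
• Combination weights: $w = (w_1, \dots, w_d)$
• Combined classifier: $\hat{y} = \mathrm{sign}\big(\sum_{i=1}^{d} w_i h_i(x)\big)$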


Hedge

Simple case
• There exists one expert, denoted by $h^*$, who can perfectly classify all the training examples
• What is your learning strategy?

Difficult case
• What if we don't have such a perfect expert?
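For the simple case, one natural strategy (the slide leaves the question open, so this is an assumed answer) is the halving algorithm: follow the majority vote of the experts that have never erred, and discard every expert as soon as it makes a mistake. With a perfect expert among d, this makes at most $\log_2 d$ mistakes. The function name and the toy rounds are hypothetical.

```python
import math


def halving(rounds, d):
    """rounds: list of (expert_predictions, true_label) with labels in {-1, +1}."""
    alive = set(range(d))  # experts that have been right on every round so far
    mistakes = 0
    for preds, y in rounds:
        vote = sum(preds[i] for i in alive)
        y_hat = 1 if vote >= 0 else -1  # majority of surviving experts
        if y_hat != y:
            mistakes += 1
        # Drop every surviving expert that was wrong on this round.
        alive = {i for i in alive if preds[i] == y}
    return mistakes, alive


# Expert 0 is perfect on this toy sequence; the others are not.
rounds = [([1, -1, -1, -1], 1), ([-1, 1, 1, 1], -1)]
mistakes, alive = halving(rounds, d=4)
```

Each mistake means at least half of the surviving experts were wrong, so the surviving set at least halves; that is where the logarithmic bound comes from. Without a perfect expert the surviving set can become empty, which is the motivation for down-weighting instead of discarding.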


Hedge Algorithm

[Figure: example expert predictions +1, -1, +1, +1]


Hedge Algorithm

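Initialize all d weights equally, e.g. $w_{1,i} = 1$
For t = 1, 2, …, T
• Receive a training example $(x_t, y_t)$
• Prediction: $\hat{y}_t = \mathrm{sign}\big(\sum_{i=1}^{d} w_{t,i}\, h_i(x_t)\big)$
• If $\hat{y}_t \neq y_t$ then
  For i = 1, 2, …, d
  • If $h_i(x_t) \neq y_t$ then shrink that expert's weight: $w_{t+1,i} = \beta\, w_{t,i}$ with $0 < \beta < 1$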
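The loop above has the shape of the weighted-majority scheme, sketched here under stated assumptions: weights start at 1, the combined prediction is the sign of the weighted vote, and on a combined mistake each wrong expert's weight is multiplied by `beta`. The multiplier and its default value are hypothetical, since the transcript lost the exact update.

```python
def weighted_majority(rounds, d, beta=0.5):
    """rounds: list of (expert_predictions, true_label) with labels in {-1, +1}."""
    w = [1.0] * d
    mistakes = 0
    for preds, y in rounds:
        vote = sum(wi * p for wi, p in zip(w, preds))
        y_hat = 1 if vote >= 0 else -1
        if y_hat != y:
            mistakes += 1
            # Only the experts that were wrong on this round are penalised.
            w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
    return w, mistakes


rounds = [([1, -1, 1, 1], -1), ([1, -1, -1, 1], -1)]
weights, mistakes = weighted_majority(rounds, d=4)
```

On the toy rounds the first vote is wrong, so the three wrong experts are halved; that is already enough for the second weighted vote to come out correct.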


Mistake Bound


Mistake Bound

• Measure the progress
• Lower bound


Mistake Bound

• Upper bound


Mistake Bound

• Upper bound


Mistake Bound
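The formulas on these slides were lost. For reference, the classical weighted-majority argument follows the same outline (measure progress, lower bound, upper bound). It is sketched here under assumptions that may differ from the lecture: weights start at 1, wrong experts are multiplied by $\beta \in (0,1)$ only on rounds where the combined prediction errs, $M$ is the number of combined mistakes, and $m^*$ is the number of mistakes of the best expert $i^*$.

```latex
% Progress measure: total weight.
W_t = \sum_{i=1}^{d} w_{t,i}, \qquad W_1 = d .

% Lower bound: the best expert is penalised at most m^* times.
W_{T+1} \;\ge\; w_{T+1,\,i^*} \;\ge\; \beta^{\,m^*} .

% Upper bound: on a combined mistake, the wrong experts hold at least half the weight.
W_{t+1} \;\le\; \tfrac{1}{2} W_t + \tfrac{\beta}{2} W_t = \frac{1+\beta}{2}\, W_t
\quad\Longrightarrow\quad
W_{T+1} \;\le\; d \left( \frac{1+\beta}{2} \right)^{M} .

% Combine the two bounds and take logarithms.
\beta^{\,m^*} \;\le\; d \left( \frac{1+\beta}{2} \right)^{M}
\quad\Longrightarrow\quad
M \;\le\; \frac{m^* \ln(1/\beta) + \ln d}{\ln\!\big( 2/(1+\beta) \big)} .
```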