unsupervised domain adaptation: from practice to theory john blitzer texpoint fonts used in emf....

58
Unsupervised Domain Adaptation: From Practice to Theory John Blitzer

Upload: britney-matthews

Post on 24-Dec-2015

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Unsupervised Domain Adaptation: From Practice to Theory

John Blitzer

John
PhD, visiting researcher, PostdocAll on correspondencesSay: Correspondences for Domain Adaptation Correspondences for Machine Translation
Page 2: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

...

...

??

?

Unsupervised Domain Adaptation

Running with Scissors

Title: Horrible book, horrible.

This book was horrible. I read half,

suffering from a headache the entire

time, and eventually i lit it on fire. 1 less

copy in the world. Don't waste your

money. I wish i had the time spent

reading this book back. It wasted my

life

...

...

Avante Deep Fryer; Black

Title: lid does not work well...

I love the way the Tefal deep fryer

cooks, however, I am returning my

second one due to a defective lid

closure. The lid may close initially,

but after a few uses it no longer

stays closed. I won’t be buying this

one again.

Source Target

Page 3: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

...

...

??

?

Target-Specific Features

Running with Scissors

Title: Horrible book, horrible.

This book was horrible. I read half,

suffering from a headache the entire

time, and eventually i lit it on fire. 1 less

copy in the world. Don't waste your

money. I wish i had the time spent

reading this book back. It wasted my

life

...

...

Avante Deep Fryer; Black

Title: lid does not work well...

I love the way the Tefal deep fryer

cooks, however, I am returning my

second one due to a defective lid

closure. The lid may close initially,

but after a few uses it no longer

stays closed. I won’t be buying this

one again.

...

...

Source Target

Page 4: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Learning Shared Representations

fascinatingboring

read halfcouldn’t put it down

defectivesturdyleakinglike a charm

fantastichighly recommended

waste of moneyhorrible

Sourc

e

Target

John
Say "The pivots can be chosen automatically based on labeled source & unlabeled target data"
Page 5: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Shared Representations: A Quick Review

Blitzer et al. (2006, 2007). Shared CCA.Tasks: Part of speech tagging, sentiment.

Xue et al. (2008). Probabilistic LSA Task: Cross-lingual document classification.

Guo et al. (2009). Latent Dirichlet AllocationTask: Named entity recognition

Huang et al. (2009). Hidden Markov Models Task: Part of Speech Tagging

John
Say "The pivots can be chosen automatically based on labeled source & unlabeled target data"
Page 6: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

What do you mean, theory?

Page 7: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

What do you mean, theory?

Statistical Learning Theory:

Page 8: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

What do you mean, theory?

Statistical Learning Theory:

Page 9: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

What do you mean, theory?

Classical Learning Theory:

Page 10: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

What do you mean, theory?

Adaptation Learning Theory:

Page 11: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Goals for Domain Adaptation Theory

1. A computable (source) sample bound on target error

2. A formal description of empirical phenomena• Why do shared representations algorithms work?

3. Suggestions for future research

Page 12: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Talk Outline

1. Target Generalization Bounds using Discrepancy Distance

[BBCKPW 2009]

[Mansour et al. 2009]

2. Coupled Subspace Learning

[BFK 2010]

Page 13: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Formalizing Domain Adaptation

Source labeled data

Source distribution Target distribution

Target unlabeled data

Page 14: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Formalizing Domain Adaptation

Source labeled data

Source distribution Target distribution

Semi-supervised adaptation

Target unlabeled data

Some target labels

Page 15: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Formalizing Domain Adaptation

Source labeled data

Source distribution Target distribution

Semi-supervised adaptation

Target unlabeled data

Not in this talk

Page 16: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

A Generalization Bound

Page 17: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

A new adaptation bound

Bound from [MMR09]

Page 18: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Discrepancy Distance

When good source models go bad

Page 19: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Binary Hypothesis Error Regions

+

Page 20: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Binary Hypothesis Error Regions

+

+

Page 21: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Binary Hypothesis Error Regions

+

+

Page 22: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Binary Hypothesis Error Regions

Page 23: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Discrepancy Distance

lowlow high

When good source models go bad

Page 24: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Computing Discrepancy Distance

Learn pairs of hypotheses to discriminate source from target

Page 25: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Computing Discrepancy Distance

Learn pairs of hypotheses to discriminate source from target

Page 26: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Computing Discrepancy Distance

Learn pairs of hypotheses to discriminate source from target

Page 27: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Hypothesis Classes & Representations

Linear Hypothesis Class:

Induced classes from projections

3

0

1

001

...

Goals for

1)

2)

Page 28: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

A Proxy for the Best Model

Linear Hypothesis Class:

Induced classes from projections

3

0

1

001

...

Goals for

1)

2)

Page 29: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Problems with the Proxy

3

0

1

00

...

1) 2)

Page 30: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Goals

1. A computable bound

2. Description of shared representations

3. Suggestions for future research

Page 31: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Talk Outline

1. Target Generalization Bounds using Discrepancy Distance

[BBCKPW 2009]

[Mansour et al. 2009]

2. Coupled Subspace Learning

[BFK 2010]

Page 32: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Assumption: Single Linear Predictor

target-specific can’t be estimated from source alone. . . yet

source-specific target-specificshared

Page 33: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Visualizing Single Linear Predictorfa

scin

atin

g

works well

don’t buy

source

target

shared

Page 34: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Visualizing Single Linear Predictorfa

scin

atin

g

works well

don’t buy

Page 35: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Visualizing Single Linear Predictorfa

scin

atin

g

works well

don’t buy

Page 36: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Visualizing Single Linear Predictorfa

scin

atin

g

works well

don’t buy

Page 37: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Dimensionality Reduction Assumption

Page 38: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Visualizing Dimensionality Reductionfa

scin

atin

g

works well

don’t buy

Page 39: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Visualizing Dimensionality Reductionfa

scin

atin

g

works well

don’t buy

Page 40: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Representation Soundnessfa

scin

atin

g

works well

don’t buy

Page 41: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Representation Soundnessfa

scin

atin

g

works well

don’t buy

Page 42: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Representation Soundnessfa

scin

atin

g

works well

don’t buy

Page 43: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Perfect Adaptationfa

scin

atin

g

works well

don’t buy

Page 44: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Algorithm

Page 45: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Generalization

Page 46: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Generalization

Page 47: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Canonical Correlation Analysis (CCA) [Hotelling 1935]

1) Divide feature space into disjoint views

Do not buy the Shark portable steamer. The trigger mechanism is defective.

not buy

trigger

defective

mechanism

2) Find maximally correlating projections

not buy

defectivetrigger

mechanism

Page 48: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Canonical Correlation Analysis (CCA) [Hotelling 1935]

1) Divide feature space into disjoint views

Do not buy the Shark portable steamer. The trigger mechanism is defective.

not buy

trigger

defective

mechanism

2) Find maximally correlating projections

not buy

defectivetrigger

mechanism

Ando and Zhang (ACL 2005)

Kakade and Foster (COLT 2006)

Page 49: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Square Loss: Kitchen Appliances

Books DVDs Electronics1.1

1.3

1.5

1.7

Naïve

Source Domain

Squa

re lo

ss (

s)

Page 50: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Square Loss: Kitchen Appliances

Books DVDs Electronics1.1

1.3

1.5

1.7

Naïve

Coupled

In Domain

Source Domain

Squa

re lo

ss (

s)

Page 51: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Using Target-Specific Features

mush bad quality warranty evenly

super easy great product

dishwasher

books kitchen

trite

the publisher

the author introduction to

illustrations

good reference

kitchen bookscritique

Page 52: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Comparing Discrepancy & Coupled Bounds

Target: DVDs

Squa

re L

oss

Source Instances

true error

coupled bound

Page 53: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Comparing Discrepancy & Coupled Bounds

Target: DVDs

Squa

re L

oss

Source Instances

Page 54: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Comparing Discrepancy & Coupled Bounds

Target: DVDs

Squa

re L

oss

Source Instances

Page 55: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Idea: Active Learning

Piyush Rai et al. (2010)

Page 56: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Goals

1. A computable bound

2. Description of shared representations

3. Suggestions for future research

Page 57: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Conclusion

1. Theory can help us understand domain adaptation better

2. Good theory suggests new directions for future research

3. There’s still a lot left to do• Connecting supervised and unsupervised adaptation• Unsupervised adaptation for problems with structure

Page 58: Unsupervised Domain Adaptation: From Practice to Theory John Blitzer TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:

Thanks

Collaborators

Shai Ben-DavidKoby CrammerDean FosterSham Kakade

Alex KuleszaFernando PereiraJenn Wortman

References

Ben-David et al. A Theory of Learning from Different Domains. Machine Learning 2009.

Mansour et al. Domain Adaptation: Learning Bounds and Algorithms. COLT 2009.