adapted deep embeddings: a synthesis of methods for -shot ...04-15-30)-04...(ustinova&...

16
Adapted Deep Embeddings: A Synthesis of Methods for !-Shot Inductive Transfer Learning Tyler R. Scott 1,2 , Karl Ridgeway 1,2 , Michael C. Mozer 1,3 1 University of Colorado, Boulder 2 Sensory Inc. 3 Presently at Google Brain

Upload: others

Post on 15-Mar-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Adapted Deep Embeddings: A Synthesis of Methods for !-Shot

Inductive Transfer Learning

Tyler R. Scott1,2, Karl Ridgeway1,2, Michael C. Mozer1,3

1 University of Colorado, Boulder2 Sensory Inc.

3 Presently at Google Brain

Page 2: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Inductive Transfer Learning

ModelTarget Domain Input

Target Domain Prediction

SourceDomain

Data

Target Domain

Data

Page 3: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Weight Transfer

Retrain output

Adapt weightsto target domain

Inductive Transfer Learning

Yosinski et al. (2014)

Source Domain

Target Domain

Page 4: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Weight TransferDeep Metric Learning

Inductive Transfer Learning

Source Domain

Source & TargetDomain Embedding Histogram loss

(Ustinova & Lempitsky, 2016)

Distance

Within class

Betweenclass

Page 5: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Weight TransferDeep Metric LearningFew-Shot Learning

Inductive Transfer Learning

Source Domain

Source & TargetDomain Embedding

Prototypical nets(Snell et al., 2017)

Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Tyler Scott
Page 6: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Weight TransferDeep Metric LearningFew-Shot Learning

Inductive Transfer Learning

Adapted DeepEmbeddings

1. Train network usingembedding loss• Histogram loss,

Prototypical nets2. Adapt weights using

limited target-domaindata

Source Domain

Target Domain

Page 7: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Why hasn’t a comparison been explored?

Inductive Transfer Learning

# labeled examples per target class

(k)Weight Transfer > 100Deep Metric Learning agnosticFew-Shot Learning < 20

Page 8: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Target Domain

k labeled examples per

class

Source Domain

2200 labeled examples per

class

MNIST

Page 9: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

1 5 10 50 100 500 1000k

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0Te

st A

ccur

acy

MNIST, n = 5

Baseline

Targ

et D

omai

nTe

st A

ccur

acy

Labeled Examples in Target Domain (k)

MNIST

Page 10: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

1 5 10 50 100 500 1000k

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0Te

st A

ccur

acy

MNIST, n = 5

Weight AdaptationBaseline

Targ

et D

omai

nTe

st A

ccur

acy

Labeled Examples in Target Domain (k)

MNIST

Page 11: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

1 5 10 50 100 500 1000k

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0Te

st A

ccur

acy

MNIST, n = 5

Prototypical NetWeight AdaptationBaseline

Targ

et D

omai

nTe

st A

ccur

acy

Labeled Examples in Target Domain (k)

MNIST

Page 12: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

1 5 10 50 100 500 1000k

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0Te

st A

ccur

acy

MNIST, n = 5

Histogram LossPrototypical NetWeight AdaptationBaseline

Targ

et D

omai

nTe

st A

ccur

acy

Labeled Examples in Target Domain (k)

MNIST

Page 13: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

1 5 10 50 100 500 1000k

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0Te

st A

ccur

acy

MNIST, n = 5

Adapted Histogram LossAdapted Prototypical NetHistogram LossPrototypical NetWeight AdaptationBaseline

Targ

et D

omai

nTe

st A

ccur

acy

Labeled Examples in Target Domain (k)

MNIST

Page 14: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

1 10 50 100 200k

0.20.30.40.50.60.70.80.91.0

Isolet, n = 5

1 10 50 100 200k

0.20.30.40.50.60.70.80.91.0

Isolet, n = 10

5 10 100 1000n

0.0

0.1

0.2

0.3

0.4

0.5

0.6

Test

Acc

urac

y

Omniglot, k = 1

5 10 100 1000n

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9Omniglot, k = 5

5 10 100 1000n

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9Omniglot, k = 10

1 10 50 100 300k

0.2

0.3

0.4

0.5

0.6

0.7

Test

Acc

urac

ytinyImageNet, n = 5

1 10 50 100 300k

0.1

0.2

0.3

0.4

0.5

0.6tinyImageNet, n = 10

1 10 50 100 300k

0.0

0.1

0.2tinyImageNet, n = 50

1 10 50 100 200k

0.20.30.40.50.60.70.80.91.0

Isolet, n = 5

1 10 50 100 200k

0.20.30.40.50.60.70.80.91.0

Isolet, n = 10

1 5 10 50 100 500 1000k

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

Test

Acc

urac

y

MNIST, n = 5

Adapted Histogram LossAdapted Prototypical NetHistogram LossPrototypical NetWeight AdaptationBaseline

Page 15: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Conclusion

• Weight transfer is the least effective method for inductive transfer learning

• Histogram loss is robust regardless of the amount of labeled data in the target domain

• Adapted embeddings outperform every static embedding method previously proposed

Page 16: Adapted Deep Embeddings: A Synthesis of Methods for -Shot ...04-15-30)-04...(Ustinova& Lempitsky, 2016) Distance Within class Between class Weight Transfer Deep Metric Learning Few-Shot

Poster #167Room 210 & 230 AB

Today, 5:00 - 7:00 PM