Transfer Learning
Pei-Hao (Eddy) Su1 and Yingzhen Li2
1Dialogue Systems Group and 2Machine Learning Group
January 29, 2015
Outline
1 Motivation
2 Historical points
3 Definition
4 Case studies
Standard Supervised Learning Task
Most ML tasks assume the training/test data are drawn from the same data space and the same distribution
NLP tasks: POS, NER, Category labelling
Modified from Gao et al.’s presentation in KDD ’08
Combining the sources gives a better result
Modified from Gao et al.’s presentation in KDD ’08
Motivation
Traditional ML tasks assume the training/test data are drawn from the same data space and the same distribution

Insufficient labelled data results in poor prediction performance

Lots of (un-)related existing data are available from various sources

Starting from scratch is always time-consuming

Transferring knowledge from other sources may help!
Motivation (Taylor et al., JMLR '09)
Psychology and Education
In 1901, Thorndike and Woodworth explored how individuals transfer similar characteristics shared by different contexts

In 1992, Perkins and Salomon published "Transfer of Learning", which defined different types of transfer

Examples:
- Skill learning: C/C++ → Python
- Language acquisition: German → English
Machine Learning
Explanation-Based Neural Network Learning: A Lifelong Learning Approach [Thrun PhD '95, NIPS '96]

Multitask Learning [Caruana ICML '93 & '96, PhD '97]

Workshops:
- Learning to Learn: Knowledge Consolidation and Transfer in Inductive Systems [NIPS '95]
- Inductive Transfer: 10 Years Later [NIPS '05]
- Structural Knowledge Transfer for Machine Learning [ICML '06]
- Transfer Learning for Complex Tasks [AAAI '08]
- Lifelong Learning [AAAI '11]
- Theoretically Grounded Transfer Learning [ICML '13]
- Second Workshop on Transfer and Multi-Task Learning: Theory meets Practice [NIPS '14]
- ...
Definition
Notations
Domain D:
1. Data space X
2. Marginal distribution P(X), where X ∈ X

Task T (given D = {X, P(X)}):
1. Label space Y
2. Learn f : X → Y to approximate the underlying P(Y|X), where X ∈ X and Y ∈ Y
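To make the notation concrete, here is a minimal numpy sketch (toy data, not from the slides) of two domains that share the same data space X = R² but differ in their marginal distributions P(X), which is exactly the situation transfer learning targets:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two domains over the same data space X = R^2, but with different
# marginal distributions P_S(X) and P_T(X) (toy Gaussians).
x_source = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))  # ~ P_S(X)
x_target = rng.normal(loc=3.0, scale=0.5, size=(1000, 2))  # ~ P_T(X)

# Same data space: identical dimensionality.
assert x_source.shape[1] == x_target.shape[1]

# Different marginals: the empirical means are far apart, so a model
# fit on source samples sees a very different input region at test time.
print(np.linalg.norm(x_source.mean(axis=0) - x_target.mean(axis=0)))
```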
Definition

Assume we have only one source S and one target T:
Transfer Learning (TL): Given a source domain DS and learning task TS, and a target domain DT and learning task TT, transfer learning aims to improve the learning of the target predictive function fT(·) in DT using the knowledge in DS and TS, where

DS ≠ DT (either XS ≠ XT or PS(X) ≠ PT(X))

or TS ≠ TT (either YS ≠ YT or P(YS|XS) ≠ P(YT|XT))
Example: Category labelling
ML vs. TL (Langley '06, Yang et al. '13)
Transfer in practice
The rest of the talk will give you an intuition, with examples, on:
when to transfer
what to transfer
and how to transfer
When to transfer: Domain relatedness
Transfer learning is applicable when there exists relatedness between the domains

Standard machine learning assumes source = target

Transferring knowledge from an unrelated domain can be harmful: negative transfer [Rosenstein et al., NIPS '05 Workshop]

Ben-David et al. proposed a bound on the target domain error

Reference:
Ben-David et al. Analysis of Representations for Domain Adaptation. NIPS '06
When to transfer (Ben-David et al.)
In a standard binary classification supervised learning task:

Given X, Y = {0, 1} and samples from P(x, y), we aim to learn f : X → [0, 1] which captures P(y|x)

Often we decompose the problem into:
1. determine a feature mapping Φ : X → Z
2. learn a hypothesis h : Z → {0, 1} on the dataset {Φ(x), y}
In the transfer learning scenario:

Theorem (simplified version of Thm. 1 & 2)

Given X = XS = XT and PS(x), PT(x) the distributions of the source and target domain, let Φ : X → Z be a fixed mapping function and H a hypothesis space. For any hypothesis h ∈ H trained on the source domain:

εT(h) ≤ εS(h) + dH(P̃S, P̃T) + εS(h*) + εT(h*)

where P̃S, P̃T are the induced distributions on Z w.r.t. PS and PT, and h* = argmin_{h∈H} (εS(h) + εT(h)) is the best hypothesis obtained by joint training.
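The divergence term dH(P̃S, P̃T) can be estimated from unlabelled samples via the proxy A-distance used in this line of work: train a classifier to tell source from target and set d_A = 2(1 − 2·err), where err is its held-out error. A minimal numpy sketch on toy Gaussian domains (the data, learning rate and iteration count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy domains in the same space X = R^2 with shifted marginals.
xs = rng.normal(loc=0.0, scale=1.0, size=(500, 2))   # source samples
xt = rng.normal(loc=2.0, scale=1.0, size=(500, 2))   # target samples

# Label each point by domain (0 = source, 1 = target) and train a
# logistic-regression domain classifier by plain gradient descent.
x = np.vstack([xs, xt])
d = np.concatenate([np.zeros(500), np.ones(500)])
idx = rng.permutation(len(x))
train, test = idx[:800], idx[800:]

w, b = np.zeros(2), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(x[train] @ w + b)))
    grad = p - d[train]
    w -= 0.1 * (x[train].T @ grad) / len(train)
    b -= 0.1 * grad.mean()

err = np.mean(((x[test] @ w + b) > 0).astype(float) != d[test])

# Proxy A-distance: near 2 means easily separable (very different)
# domains, so transfer is risky; near 0 means the domains look alike.
d_a = 2.0 * (1.0 - 2.0 * err)
print(round(d_a, 2))
```

The EA++ experiments later in the talk report exactly this quantity for their domain pairs.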
Domain adaptation
Approach 1: mixture of general & specific components

Can we learn hypotheses for both the general and the domain-specific components?

Reference:
Daume III. Frustratingly Easy Domain Adaptation. ACL '07
Daume III et al. Co-regularization Based Semi-supervised Domain Adaptation. NIPS '10
EasyAdapt (Daume III)
Binary classification problem:

XS = XT ⊂ R^d, YS = YT = {−1, +1}

Goal: obtain a classifier fT : XT → YT (in the SVM context: learn a hypothesis hT ∈ R^d)

However:
- too little training data is available on (XT, YT) for robust training
- also P(xS) ≠ P(xT) and P(xS, yS) ≠ P(xT, yT)
- ...so directly applying a hypothesis hS trained on the source returns bad results

How can we use xS, yS ∼ P(xS, yS) to improve the learning of hT?
EasyAdapt algorithm
Define two mappings ΦS, ΦT : R^d → R^{3d}:

ΦS(xS) = (xS, xS, 0),  ΦT(xT) = (xT, 0, xT)

Training: learn a hypothesis h = (wg, ws, wt) ∈ R^{3d} on the transformed dataset {(ΦS(xS), yS)} ∪ {(ΦT(xT), yT)}

Test: apply hT = wg + wt on xT (and hS = wg + ws on xS)
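The augmentation itself is only a few lines; a numpy sketch (the weight vector here is an arbitrary placeholder, not a trained hypothesis):

```python
import numpy as np

def phi_s(x):
    """EasyAdapt source mapping: (general copy, source copy, zeros)."""
    return np.concatenate([x, x, np.zeros_like(x)])

def phi_t(x):
    """EasyAdapt target mapping: (general copy, zeros, target copy)."""
    return np.concatenate([x, np.zeros_like(x), x])

# d = 3: a source point and a target point in R^3 become points in R^9.
xs = np.array([1.0, 2.0, 3.0])
xt = np.array([4.0, 5.0, 6.0])

print(phi_s(xs))   # [1. 2. 3. 1. 2. 3. 0. 0. 0.]
print(phi_t(xt))   # [4. 5. 6. 0. 0. 0. 4. 5. 6.]

# Any linear learner h = (w_g, w_s, w_t) on the augmented space shares
# w_g across domains; at test time h_T = w_g + w_t acts on x_T:
w = np.arange(9, dtype=float)        # placeholder "learned" weights
w_g, w_s, w_t = w[:3], w[3:6], w[6:]
assert np.isclose(w @ phi_t(xt), (w_g + w_t) @ xt)
```

Any off-the-shelf linear classifier (e.g. an SVM) can then be run unchanged on the augmented features.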
EA++ (Daume III et al.)
Use unlabelled data to improve training:

We want hS and hT to agree on unlabelled data xU:

hS · xU = hT · xU ⇔ ws · xU = wt · xU ⇔ h · (0, xU, −xU) = 0

So we define a mapping ΦU : R^d → R^{3d} for unlabelled data:

ΦU(xU) = (0, xU, −xU)

and train the hypothesis h on the augmented and transformed dataset {(ΦS(xS), yS)} ∪ {(ΦT(xT), yT)} ∪ {(ΦU(xU), 0)}
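The extra mapping is equally small; a numpy sketch (the weight vector is a hand-picked illustration in which the source and target parts already agree):

```python
import numpy as np

def phi_u(x):
    """EA++ mapping for unlabelled target data: (zeros, x, -x)."""
    return np.concatenate([np.zeros_like(x), x, -x])

# With h = (w_g, w_s, w_t), the dot product h . phi_u(x) equals
# (w_s - w_t) . x, so pushing it towards the pseudo-label 0 encourages
# the source and target hypotheses to agree on x.
w = np.array([0.5, -1.0, 2.0,   1.0, 0.0, 3.0,   1.0, 0.0, 3.0])
w_g, w_s, w_t = w[:3], w[3:6], w[6:]
x_u = np.array([7.0, 8.0, 9.0])

assert np.isclose(w @ phi_u(x_u), (w_s - w_t) @ x_u)
# Here w_s == w_t, so the agreement constraint is satisfied exactly:
assert np.isclose(w @ phi_u(x_u), 0.0)
```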
EA++ (Daume III et al.)
(a) DVD → BOOKS (proxy A-distance = 0.7616); (b) KITCHEN → APPAREL (proxy A-distance = 0.0459)

SOURCEONLY / TARGETONLY(-FULL): trained on source / target (full) labelled samples

ALL: trained on the combined labelled samples

EA / EA++: trained in the augmented feature space (EA++ also uses unlabelled target data)
Feature transfer
Approach 2: shared lower-level features

A DNN's first layer learns Gabor filters or colour blobs when trained on images

Do instances in the source and target domains share the same lower-level features?

Reference:
Yosinski et al. How transferable are features in deep neural networks? NIPS '14
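The transfer recipe studied by Yosinski et al. (train lower layers on task A, copy and freeze them, retrain the top on task B) can be sketched in numpy. Here W1 is a random stand-in for features learned on a source task, and the data and model sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(z):
    return np.maximum(z, 0.0)

# Stand-in for a first layer learned on task A (random here, for brevity).
W1 = rng.normal(size=(16, 4))

# Task B data: the targets depend nonlinearly on the inputs.
x = rng.normal(size=(200, 4))
y = np.sin(x[:, 0]) + x[:, 1] ** 2

# Transfer: keep W1 frozen, compute its features, and fit only a new
# linear top layer on task B (closed-form least squares, plus a bias).
h = np.hstack([relu(x @ W1.T), np.ones((len(x), 1))])
w2, *_ = np.linalg.lstsq(h, y, rcond=None)

mse = np.mean((h @ w2 - y) ** 2)
baseline = np.mean((y - y.mean()) ** 2)   # predict-the-mean baseline
assert mse < baseline  # frozen transferred features still reduce error
```

Yosinski et al.'s actual experiments fine-tune deep convolutional networks on ImageNet splits; this only illustrates the freeze-and-retrain pattern.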
Feature transfer

Lee et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. ICML '09

(Slide adapted from Ruslan Salakhutdinov's tutorial at MLSS '14, Beijing)
Feature transfer (Yosinski et al.)
Test 1 (similar datasets): random A/B splits of the ImageNet dataset
(similar source and target domain training/testing instances)
Test 2 (very different datasets): man-made/natural object split
(dissimilar source and target domain training/testing instances)
Joint representation
Approach 3: joint feature representation

Data has many domain-specific characteristics

but the domains might still be related at a higher level

(our brain might work like this as well)

Reference:
Srivastava and Salakhutdinov. Multimodal Learning with Deep Boltzmann Machines. NIPS '12, JMLR 15 (2014)
Joint representation (Srivastava et al.)
MIR Flickr Dataset http://press.liacs.nl/mirflickr/
For images:
- 1M datapoints; 25K labelled instances in 38 classes, with 10K for training, 5K for validation and 10K for testing
- inputs are the concatenation of PHOW and MPEG-7 features

For texts:
- word count vectors over the 2K most frequently used tags (very sparse)
- 18% of the training images have missing text
For images: a 2-layer deep Boltzmann machine (DBM) with Gaussian input units (vmi ∈ R; abbreviating Wm^(k)(i, j) as Wij^(k)):

P(vm, hm^(1), hm^(2)) ∝ exp( −Σ_i (vmi − bi)² / 2σi² + Σ_{i,j} (vmi/σi) Wij^(1) hmj^(1) + Σ_{j,l} hmj^(1) Wjl^(2) hml^(2) )
For texts: a 2-layer DBM with a replicated softmax input model (vti counts the occurrences of word i; abbreviating Wt^(k)(i, j) as Wij^(k)):

P(vt, ht^(1), ht^(2)) ∝ exp( Σ_i vti bi + Σ_{i,j} vti Wij^(1) htj^(1) + Σ_{j,l} htj^(1) Wjl^(2) htl^(2) )
Combining the domain-specific models into a multimodal DBM:

P(vm, vt, h; θ) ∝ exp( −E(hm^(2), ht^(2), h^(3)) − E(vm, hm^(1), hm^(2)) − E(vt, ht^(1), ht^(2)) )
First pre-train the domain-specific DBMs with contrastive divergence (CD), then co-train the joint model with persistent contrastive divergence (PCD)

Use a mean-field variational approximation when computing the data-driven moments of the hidden units
Results:
Figure: Classification with data from both the image and text domains

Figure: Classification with data from the image domain only
Figure: Retrieval results for multi/image domain queries
Conclusions
In this talk, we showed that:
- transfer learning adapts knowledge from other sources to improve target-task performance
- domains can be related to each other in different ways

In the future:
- manage large-scale data that do not lack in size but may lack in quality
- manage data that may continuously change over time
Open Questions
- What are the limits of existing multi-task learning methods when the number of tasks grows while each task is described by only a small number of samples ("big T, small n")?
- What is the right way to leverage noisy data gathered from the Internet as reference for a new task?
- How can an automatic system process a continuous stream of information in time and progressively adapt for life-long learning?
- Can deep learning help to learn the right representation (e.g., a task similarity matrix) in kernel-based transfer and multi-task learning?
- How can similarities across languages help us adapt to different domains in natural language processing tasks?
- ...
(From nips.cc/Conferences/2014/Program/event.php?ID=4282)
Thank you
Reference
1. Pan and Yang. A Survey on Transfer Learning. IEEE TKDE 2010
2. Pan and Yang. Transfer Learning. MLSS 2011
3. Taylor et al. Transfer Learning for Reinforcement Learning Domains: A Survey. JMLR 2010
4. Langley. Transfer of Learning in Cognitive Systems. ICML 2006
5. Perkins et al. Transfer of Learning. IEE 1992
6. Thrun. Explanation-Based Neural Network Learning: A Lifelong Learning Approach. PhD thesis 1995
7. Caruana. Multitask Learning. PhD thesis 1997
8. Ben-David et al. Analysis of Representations for Domain Adaptation. NIPS 2006
9. Daume III. Frustratingly Easy Domain Adaptation. ACL 2007
10. Daume III et al. Co-regularization Based Semi-supervised Domain Adaptation. NIPS 2010
11. Yosinski et al. How transferable are features in deep neural networks? NIPS 2014
12. Lee et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. ICML 2009
13. Srivastava and Salakhutdinov. Multimodal Learning with Deep Boltzmann Machines. NIPS 2012, JMLR 2014