lifelong learning - stanford...

CS 330 Lifelong Learning


Page 1: Lifelong Learning - Stanford Universityweb.stanford.edu/.../cs330/slides/cs330_lifelonglearning.pdf · 2019-11-12 · What do you want from your lifelong learning algorithm? minimal

CS 330

Lifelong Learning

Page 2:

Logistics

Project milestone due Wednesday.

Two guest lectures next week!

Jeff Clune, Sergey Levine

Page 3:

Plan for Today

The lifelong learning problem statement

Basic approaches to lifelong learning

Can we do better than the basics?

Revisiting the problem statement from the meta-learning perspective

Page 4:

A brief review of problem statements.

Multi-Task Learning: learn to solve a set of tasks. (learn tasks, perform tasks)

Meta-Learning: given an i.i.d. task distribution, learn a new task efficiently. (learn to learn tasks, quickly learn new task)

Page 5:

In contrast, many real-world settings look like:

[Figure: timeline contrasting multi-task learning (learn tasks, perform tasks) and meta-learning (learn to learn tasks, quickly learn new task); axis: time]

Some examples:

- a student learning concepts in school
- a deployed image classification system learning from a stream of images from users
- a robot acquiring an increasingly large set of skills in different environments
- a virtual assistant learning to help different users with different tasks at different points in time
- a doctor's assistant aiding in medical decision-making

Our agents may not be given a large batch of data/tasks right off the bat!

Page 6:

Some Terminology

Sequential learning settings: online learning, lifelong learning, continual learning, incremental learning, streaming data

These are distinct from sequence data and sequential decision-making.

Page 7:

What is the lifelong learning problem statement?

Exercise:

1. Pick an example setting.

2. Discuss the problem statement with your neighbor:

(a) how would you set up an experiment to develop & test your algorithm?

(b) what are desirable/required properties of the algorithm?

(c) how do you evaluate such a system?

Example settings:

A. a student learning concepts in school
B. a deployed image classification system learning from a stream of images from users
C. a robot acquiring an increasingly large set of skills in different environments
D. a virtual assistant learning to help different users with different tasks at different points in time
E. a doctor's assistant aiding in medical decision-making

Page 8:

What is the lifelong learning problem statement?

Problem variations:

- task/data order: i.i.d. vs. predictable vs. curriculum vs. adversarial
- discrete task boundaries vs. continuous shifts (vs. both)
- known task boundaries/shifts vs. unknown

Some considerations:

- model performance
- data efficiency
- computational resources
- memory
- others: privacy, interpretability, fairness, test time compute & memory

Substantial variety in problem statement!

Page 9:

What is the lifelong learning problem statement?

General [supervised] online learning problem:

for t = 1, …, n

    observe x_t   (if task boundaries are observable: observe x_t, z_t)

    predict \hat{y}_t

    observe label y_t

i.i.d. setting: x_t \sim p(x), y_t \sim p(y | x)   (p is not a function of t)

otherwise: x_t \sim p_t(x), y_t \sim p_t(y | x)

streaming setting: cannot store (x_t, y_t)

- lack of memory
- lack of computational resources
- privacy considerations
- want to study neural memory mechanisms

The streaming restriction is true in some cases, but not in many cases! (recall: replay buffers)
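A minimal Python sketch of the observe / predict / observe-label loop above. The 1-D linear model, the squared loss, the learning rate, and the synthetic i.i.d. stream are all illustrative assumptions, not part of the slides.

```python
import random


def online_learning(stream, predict, update):
    """Run the online protocol and return the per-step losses."""
    losses = []
    for x_t, y_t in stream:
        y_hat = predict(x_t)                 # predict before the label is revealed
        losses.append((y_hat - y_t) ** 2)    # squared loss (assumed for illustration)
        update(x_t, y_t)                     # only now may the learner use (x_t, y_t)
    return losses


class ScalarSGD:
    """Hypothetical learner: y = w * x, one SGD step per observed example."""

    def __init__(self, lr=0.1):
        self.w = 0.0
        self.lr = lr

    def predict(self, x):
        return self.w * x

    def update(self, x, y):
        grad = 2.0 * (self.w * x - y) * x
        self.w -= self.lr * grad


rng = random.Random(0)
# i.i.d. setting: x_t ~ p(x), y_t = 3 * x_t (the slope 3 is made up)
stream = [(x, 3.0 * x) for x in (rng.uniform(-1, 1) for _ in range(200))]
learner = ScalarSGD()
losses = online_learning(stream, learner.predict, learner.update)
```

In the streaming setting the `update` call is the only place the learner touches `(x_t, y_t)`; a replay-buffer learner would additionally append the pair to a stored list there.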

Page 10:

What do you want from your lifelong learning algorithm?

minimal regret (that grows slowly with t)

regret: cumulative loss of learner minus cumulative loss of best learner in hindsight

\mathrm{Regret}_T := \sum_{t=1}^{T} \mathcal{L}_t(\theta_t) - \min_{\theta} \sum_{t=1}^{T} \mathcal{L}_t(\theta)

(cannot be evaluated in practice, useful for analysis)

Regret that grows linearly in t is trivial. Why?
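One way to see why linear regret is trivial: assuming per-step losses are bounded, a learner that never updates already has a per-step gap to the best fixed model that is at most a constant, so its regret is at most proportional to T. The sketch below computes the regret defined above for two hypothetical learners on quadratic losses \mathcal{L}_t(\theta) = (\theta - c_t)^2; the losses, targets, and learners are assumptions chosen for illustration.

```python
def static_regret(thetas, targets):
    """Regret_T for losses L_t(theta) = (theta - c_t)^2."""
    learner_loss = sum((th - c) ** 2 for th, c in zip(thetas, targets))
    best_theta = sum(targets) / len(targets)   # minimizer of the summed quadratic loss
    best_loss = sum((best_theta - c) ** 2 for c in targets)
    return learner_loss - best_loss


targets = [1.0, -1.0] * 50          # T = 100 alternating targets (made up)

# Learner A never updates: theta_t = 2 at every step.
stuck = [2.0] * len(targets)

# Learner B plays the running mean of the targets seen so far.
running_mean, total = [], 0.0
for t, c in enumerate(targets):
    running_mean.append(total / t if t > 0 else 0.0)
    total += c

regret_stuck = static_regret(stuck, targets)          # grows like 4 * T here
regret_mean = static_regret(running_mean, targets)    # grows much more slowly
```

Here the non-learning baseline accumulates a fixed gap of 4 per step, while the running-mean learner's gap per step shrinks over time.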

Page 11:

What do you want from your lifelong learning algorithm?

positive & negative transfer

positive forward transfer: previous tasks cause you to do better on future tasks, compared to learning future tasks from scratch

positive backward transfer: current tasks cause you to do better on previous tasks, compared to learning past tasks from scratch

negative transfer: in the definitions above, replace positive with negative and better with worse

Page 12:

Plan for Today

The lifelong learning problem statement

Basic approaches to lifelong learning

Can we do better than the basics?

Revisiting the problem statement from the meta-learning perspective

Page 13:

Approaches

Store all the data you've seen so far, and train on it. -> follow the leader algorithm

+ will achieve very strong performance
- computation intensive -> Continuous fine-tuning can help.
- can be memory intensive [depends on the application]

Take a gradient step on the datapoint you observe. -> stochastic gradient descent

+ computationally cheap
+ requires 0 memory
- subject to negative backward transfer ("forgetting", sometimes referred to as catastrophic forgetting)
- slow learning

Can we do better?
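A toy Python sketch contrasting the two basic approaches above. Everything concrete here is an assumption for illustration: a 2-D linear model, two tasks that share one true weight vector but see inputs along different directions, and a follow-the-leader learner that refits once at the end of the stream rather than after every example.

```python
def sgd_step(w, x, y, lr):
    """One gradient step on the squared error of the linear model w . x."""
    err = w[0] * x[0] + w[1] * x[1] - y
    return [w[0] - lr * 2 * err * x[0], w[1] - lr * 2 * err * x[1]]


def fit_all(data):
    """Follow the leader: exact 2-D least squares on every stored example."""
    a = sum(x[0] * x[0] for x, _ in data)
    b = sum(x[0] * x[1] for x, _ in data)
    d = sum(x[1] * x[1] for x, _ in data)
    p = sum(x[0] * y for x, y in data)
    q = sum(x[1] * y for x, y in data)
    det = a * d - b * b
    return [(d * p - b * q) / det, (a * q - b * p) / det]


def mse(w, data):
    return sum((w[0] * x[0] + w[1] * x[1] - y) ** 2 for x, y in data) / len(data)


true_w = (1.0, 1.0)                                   # made-up ground truth


def label(x):
    return true_w[0] * x[0] + true_w[1] * x[1]


scales = (0.5, 1.0, -1.0, 1.5)
task_a = [((s, 0.0), label((s, 0.0))) for s in scales] * 25   # inputs along (1, 0)
task_b = [((s, s), label((s, s))) for s in scales] * 25       # inputs along (1, 1)

w_sgd, memory = [0.0, 0.0], []
for x, y in task_a + task_b:              # task A arrives first, then task B
    w_sgd = sgd_step(w_sgd, x, y, lr=0.05)   # SGD: nothing is stored
    memory.append((x, y))                    # FTL: everything is stored
w_ftl = fit_all(memory)
```

After the stream, `mse(w_sgd, task_a)` is noticeably above zero even though SGD had fit task A before task B arrived, which is the negative backward transfer listed above, while `w_ftl` fits both tasks at the cost of keeping all the data.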

Page 14:

Plan for Today

The lifelong learning problem statement

Basic approaches to lifelong learning

Can we do better than the basics?

Revisiting the problem statement from the meta-learning perspective

Page 15:

Case Study: Can we use meta-learning to accelerate online learning?

Page 16:

Recall: model-based meta-RL

[Figure: timeline of experience with a gradual terrain change and a motor malfunction; axis: time]

online adaptation = few-shot learning

tasks are temporal slices of experience

Page 17:

Example online learning problem

[Figure: timeline with gradual terrain change, motor malfunction, and icy terrain; axis: time]

k time steps not sufficient to learn entirely new terrain

Continue to run SGD?

+ will be fast with MAML initialization
- what if ice goes away? (subject to forgetting)

Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR '19

Page 18:

Online inference problem: infer latent "task" variable at each time step

Mixture of neural networks over task variable T, adapted continually.

Alternate between:

E-step: Estimate latent "task" variable at each time step given data

P(T_t = T_i \mid x_t, y_t) \propto p_{\theta(T_i)}(y_t \mid x_t, T_t = T_i) \, P(T_t = T_i)

Here p_{\theta(T_i)}(y_t \mid x_t, T_t = T_i) is the likelihood of the data under task T_i, and P(T_t = T_i) is the prior.

M-step: Update mixture of network parameters (gradient step on each mixture element, weighted by task probability)

Note: If the neural net is randomly initialized, this procedure would be too slow.

Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR '19
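A small Python sketch of one E-step / M-step alternation as described above. The Gaussian likelihood, the fixed uniform prior, the 1-D linear "networks", and all numeric values are assumptions made for illustration; the paper's actual models and prior are not reproduced here.

```python
import math


def e_step(x, y, slopes, prior, sigma=0.5):
    """Posterior over tasks: likelihood of (x, y) under each task times the prior."""
    lik = [math.exp(-0.5 * ((y - w * x) / sigma) ** 2) for w in slopes]
    unnorm = [l * p for l, p in zip(lik, prior)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def m_step(x, y, slopes, post, lr=0.1):
    """Gradient step on each mixture element, weighted by its task probability."""
    return [w - lr * r * 2.0 * (w * x - y) * x for w, r in zip(slopes, post)]


slopes = [1.0, -1.0]        # two mixture elements (hypothetical models y = w * x)
prior = [0.5, 0.5]          # assumed fixed prior P(T_t = T_i)
x_t, y_t = 1.0, 1.2         # an observation that resembles the first task

post = e_step(x_t, y_t, slopes, prior)
slopes = m_step(x_t, y_t, slopes, post)
```

Because the update is weighted by the posterior, the element that explains the data moves toward it while the other element is left almost untouched, which is how the mixture can avoid overwriting a task that is not currently active.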

Page 19:

Does it work?

Crawler with crippled legs

[Figure: results comparing the methods below; plot annotation: no meta-learning]

- online learning w. MAML initialization
- SGD w. MAML initialization
- MAML (always reset to prior + 1 grad step)
- model-based, no adaptation
- model-based, grad steps

Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR '19

Page 20:

Does it work?

Crawler with crippled legs

[Figure: latent task distribution during online learning]

Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR '19

Page 21:

Case Study: Can we modify vanilla SGD to avoid negative backward transfer? (from scratch)

Page 22:

Idea:

(1) store small amount of data per task in memory

(2) when making updates for new tasks, ensure that they don't unlearn previous tasks

How do we accomplish (2)?

learning predictor y_t = f_\theta(x_t, z_t)

memory: \mathcal{M}_k for task z_k

For t = 0, ..., T

    minimize \mathcal{L}(f_\theta(\cdot, z_t), (x_t, y_t))

    subject to \mathcal{L}(f_\theta, \mathcal{M}_k) \le \mathcal{L}(f_\theta^{t-1}, \mathcal{M}_k) for all z_k < z_t

(i.e. s.t. loss on previous tasks doesn't get worse)

Assume local linearity:

\langle g_t, g_k \rangle := \left\langle \frac{\partial \mathcal{L}(f_\theta, (x_t, y_t))}{\partial \theta}, \frac{\partial \mathcal{L}(f_\theta, \mathcal{M}_k)}{\partial \theta} \right\rangle \ge 0 for all z_k < z_t

Can formulate & solve as a QP.

Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS '17
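A Python sketch of the inner-product constraint above for the special case of a single memory gradient g_k. In that case the QP (find the vector closest to g_t that satisfies the constraint) reduces to a projection onto a half-space, which has a closed form. With several previous tasks the full QP is needed; this single-constraint simplification and the example vectors are assumptions for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(g, g_mem):
    """Return the vector closest to g (in L2) whose inner product with g_mem is >= 0."""
    inner = dot(g, g_mem)
    if inner >= 0:
        return list(g)                      # constraint already holds: keep g
    scale = inner / dot(g_mem, g_mem)       # remove the component that opposes g_mem
    return [a - scale * b for a, b in zip(g, g_mem)]


g_t = [1.0, -2.0]        # gradient on the current example (made up)
g_k = [0.0, 1.0]         # gradient on the memory of a previous task (made up)

g_tilde = project(g_t, g_k)
```

The update then uses `g_tilde` in place of `g_t`, so that, under the local linearity assumption, a small step does not increase the loss on the stored memory.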

Page 23:

Experiments

Problems:

- MNIST permutations
- MNIST rotations
- CIFAR-100 (5 new classes/task)

Total memory size: 5012 examples

BWT: backward transfer, FWT: forward transfer

If we take a step back… do these experimental domains make sense?

Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS '17
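A Python sketch of one way BWT and FWT can be computed from a matrix of test accuracies. It assumes the accuracy-matrix convention associated with the GEM experiments: R[i][j] is the accuracy on task j after training through task i, and b[j] is the accuracy of a randomly initialized model on task j. The exact normalization should be treated as an assumption, and the numbers below are made up.

```python
def transfer_metrics(R, b):
    """Return (average accuracy, backward transfer, forward transfer)."""
    T = len(R)
    acc = sum(R[T - 1][j] for j in range(T)) / T
    # BWT: change on earlier tasks between just after learning them and the end.
    bwt = sum(R[T - 1][j] - R[j][j] for j in range(T - 1)) / (T - 1)
    # FWT: accuracy on a task just before training on it, relative to random init.
    fwt = sum(R[j - 1][j] - b[j] for j in range(1, T)) / (T - 1)
    return acc, bwt, fwt


# Hypothetical accuracies for three tasks (rows: after training through task i).
R = [
    [0.90, 0.20, 0.10],
    [0.80, 0.90, 0.30],
    [0.70, 0.80, 0.90],
]
b = [0.10, 0.10, 0.10]

acc, bwt, fwt = transfer_metrics(R, b)
```

A negative `bwt` corresponds to forgetting (negative backward transfer); a positive `fwt` means earlier tasks helped on a task before any of its data was seen.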

Page 24:

Can we meta-learn how to avoid negative backward transfer?

Javed & White. Meta-Learning Representations for Continual Learning. NeurIPS '19

Page 25:

Plan for Today

The lifelong learning problem statement

Basic approaches to lifelong learning

Can we do better than the basics?

Revisiting the problem statement from the meta-learning perspective

Page 26:

What might be wrong with the online learning formulation?

Online Learning (Hannan '57, Zinkevich '03)

Perform sequence of tasks while minimizing static regret.

[Figure: timeline of repeated "perform" steps, labeled zero-shot performance; axis: time]

More realistically:

[Figure: timeline of repeated "learn" steps, moving from slow learning to rapid learning; axis: time]

Page 27:

What might be wrong with the online learning formulation?

Online Learning (Hannan '57, Zinkevich '03)

Perform sequence of tasks while minimizing static regret.

[Figure: timeline of repeated "perform" steps, labeled zero-shot performance; axis: time]

Online Meta-Learning (Finn*, Rajeswaran*, Kakade, Levine ICML '18)

Efficiently learn a sequence of tasks from a non-stationary distribution.

[Figure: timeline of repeated "learn" steps; axis: time]

evaluate performance after seeing a small amount of data

Primarily a difference in evaluation, rather than the data stream.

Page 28:

The Online Meta-Learning Setting

for task t = 1, …, n

    observe \mathcal{D}^{tr}_t

    use update procedure \Phi(\theta_t, \mathcal{D}^{tr}_t) to produce parameters \phi_t

    observe x_t

    predict \hat{y}_t = f_{\phi_t}(x_t)

    observe label y_t

(The last three steps are the standard online learning setting.)

Goal: Learning algorithm with sub-linear regret

\mathrm{Regret}_T := \sum_{t=1}^{T} \ell_t(\Phi_t(\theta_t)) - \min_{\theta \in \Theta} \sum_{t=1}^{T} \ell_t(\Phi_t(\theta))

The first sum is the loss of the algorithm; the second is the loss of the best algorithm in hindsight. Here \Phi_t(\theta) denotes \Phi(\theta, \mathcal{D}^{tr}_t).

(Finn*, Rajeswaran*, Kakade, Levine ICML '18)

Page 29:

Can we apply meta-learning in lifelong learning settings?

Recall the follow the leader (FTL) algorithm:

Store all the data you've seen so far, and train on it.

Follow the meta-leader (FTML) algorithm:

Store all the data you've seen so far, and meta-train on it.

Run update procedure on the current task.

Deploy model on current task.

What meta-learning algorithms are well-suited for FTML?

What if p_t(\mathcal{T}) is non-stationary?
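A toy Python sketch of the follow-the-meta-leader loop described above: store every task seen so far, meta-train on the stored tasks, run the update procedure on the current task, then deploy. The concrete choices are all assumptions for illustration: 1-D regression tasks y = a * x with slopes drawn from a made-up range, a MAML-style one-gradient-step update procedure, and the step sizes and iteration counts.

```python
import random

ALPHA = 0.1   # inner-loop step size (assumed)
BETA = 0.05   # meta step size (assumed)


def grad(w, data):
    """Gradient of the mean squared error of f(x) = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def update(theta, d_tr):
    """Update procedure Phi(theta, D_tr): one gradient step on D_tr."""
    return theta - ALPHA * grad(theta, d_tr)


def meta_grad(theta, d_tr, d_val):
    """d/d theta of loss(update(theta, d_tr), d_val), via the chain rule."""
    mean_xx = sum(x * x for x, _ in d_tr) / len(d_tr)
    return grad(update(theta, d_tr), d_val) * (1.0 - 2.0 * ALPHA * mean_xx)


def make_task(slope, rng, n=20):
    pts = [(x, slope * x) for x in (rng.uniform(-1, 1) for _ in range(n))]
    return pts[: n // 2], pts[n // 2:]          # (D_tr, D_val)


rng = random.Random(0)
theta, buffer, post_update_losses = 0.0, [], []
for t in range(30):
    d_tr, d_val = make_task(rng.uniform(4.5, 5.5), rng)
    phi = update(theta, d_tr)                      # run update procedure on current task
    post_update_losses.append(loss(phi, d_val))    # deploy: loss after adaptation
    buffer.append((d_tr, d_val))                   # store all the data seen so far
    for _ in range(20):                            # meta-train on everything stored
        tr, val = rng.choice(buffer)
        theta -= BETA * meta_grad(theta, tr, val)
```

In this toy setup the post-update loss drops over the task sequence because the meta-learned initialization `theta` moves toward the region where the task slopes lie, so a single gradient step on a new task goes further.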

Page 30:

Experiments

Experiment with sequences of tasks:

- Colored, rotated, scaled MNIST
- 3D object pose prediction
- CIFAR-100 classification

Example pose prediction tasks: plane, car, chair

Page 31:

Experiments

[Figure: learning efficiency (# datapoints) vs. task index; panels: Rainbow MNIST, Pose Prediction]

[Figure: learning proficiency (error) vs. task index; panels: Rainbow MNIST, Pose Prediction]

Comparisons:

- TOE (train on everything): train on all data so far
- FTL (follow the leader): train on all data so far, fine-tune on current task
- From Scratch: train from scratch on each task

Follow The Meta-Leader learns each new task faster & with greater proficiency, and approaches the few-shot learning regime.

Page 32:

Takeaways

Many flavors of lifelong learning, all under the same name.

Defining the problem statement is often the hardest part.

Meta-learning can be viewed as a slice of the lifelong learning problem.

A very open area of research.

Page 33:

Reminders

Project milestone due Wednesday.

Two guest lectures next week!

Jeff Clune, Sergey Levine