unc, chapel hill haotan, licheng, [email protected] hao tan, licheng yu, mohit...

52
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng, [email protected] UNC, Chapel Hill 1

Upload: others

Post on 11-Jul-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout

Hao Tan, Licheng Yu, Mohit Bansalhaotan, licheng, [email protected]

UNC, Chapel Hill1

Page 2: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

Instruction Bird-View

2

Vision-and-Language Navigation Task

Page 3: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

Instruction Bird-View

3

Vision-and-Language Navigation Task

Agent’s Start Location

Page 4: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

Instruction Bird-View

4

Vision-and-Language Navigation Task

Page 5: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

Instruction Bird-View

5

Vision-and-Language Navigation Task

Page 6: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

Instruction Bird-View

6

Vision-and-Language Navigation Task

Page 7: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

Instruction Bird-View

7

Vision-and-Language Navigation Task

Page 8: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

Instruction Bird-View

8

Vision-and-Language Navigation Task

Agent’s Target Location

Page 9: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

agent

Instruction Bird-View

AgentActions

agent

9

Vision-and-Language Navigation Task

Page 10: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Vision-and-Language Navigation Task

10

Agent

Go to the bedroom, and go through the door, continue forward until you can climb three steps to your right…

Page 11: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation

11

1.Want to learn: En → Fr English French

Improving Neural Machine Translation Models with Monolingual Data, Senrrich et.al., 2015 Iterative Back-Translation for Neural Machine Translation, Hoang et.al., 2018

Style Transfer Through Back-Translation, Prabhumoye et.al., 2018Understanding Back-Translation at Scale, Edunov et.al., 2018

Page 12: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation

12

1.Want to learn: En → Fr

2.Have unpaired Fr corpus

English French

French

Improving Neural Machine Translation Models with Monolingual Data, Senrrich et.al., 2015 Iterative Back-Translation for Neural Machine Translation, Hoang et.al., 2018

Style Transfer Through Back-Translation, Prabhumoye et.al., 2018Understanding Back-Translation at Scale, Edunov et.al., 2018

Page 13: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation

13

1.Want to learn: En → Fr

2.Have unpaired Fr corpus

3.Train Fr → En and use it to translate unpaired Fr corpus.

English French

French

English FrenchUse Fr→ En

Improving Neural Machine Translation Models with Monolingual Data, Senrrich et.al., 2015 Iterative Back-Translation for Neural Machine Translation, Hoang et.al., 2018

Style Transfer Through Back-Translation, Prabhumoye et.al., 2018Understanding Back-Translation at Scale, Edunov et.al., 2018

Page 14: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Train En→ Fr

Back Translation

14

1.Want to learn: En → Fr.

2.Have unpaired Fr corpus.

3.Train Fr → En and use it to translate unpaired Fr corpus.

4.Use reversed pairs as additional data for En → Fr.

English French

French

English FrenchUse Fr→ En

English French

Improving Neural Machine Translation Models with Monolingual Data, Senrrich et.al., 2015 Iterative Back-Translation for Neural Machine Translation, Hoang et.al., 2018

Style Transfer Through Back-Translation, Prabhumoye et.al., 2018Understanding Back-Translation at Scale, Edunov et.al., 2018

Page 15: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation: Preliminary Setup

15

Environment: A set ofroutes. Some routeshave instructions; Some do not.

Speaker-Follower Models for Vision-and-Language Navigation, Fried et.al., 2018

Page 16: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation: Preliminary Setup

16

Speaker: A pre-trained neural model which generates instructions from routes.

Walk past the hall, turn Left, ...

Speaker

Environment: A set ofroutes. Some routeshave instructions; Some do not.

Speaker-Follower Models for Vision-and-Language Navigation, Fried et.al., 2018

Page 17: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation: Step 1

1. New routes from existing environments.

17

Existing Environment

Page 18: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation: Step 2

Walk past the hall, turn Left, ...

Speaker

1. New routes from existing environments.

2. New instructions by pre-trained speaker.

18

Existing Environment

Page 19: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation: Step 3

Agent

Walk past the hall, turn Left, ...

Speaker

1. New routes from existing environments.

2. New instructions by pre-trained speaker.

3. Train agent on new routes, new instructions,existing environments.

19

Existing Environment

Page 20: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Back Translation:

Agent

Walk past the hall, turn Left, ...

Speaker

1. New routes from existing environments.

2. New instructions by pre-trained speaker.

3. Train agent on new routes, new instructions,existing environments.

20

Existing Environment

Limited Envs?

Page 21: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

New Environment

1. New routes from New environments.

2. New instructions by pre-trained speaker.

3. Train agent on New routes, New instructions,New environments.

Back Translation:

Agent

Walk past the hall, turn Left, ...

Speaker

21

New Envs!!

Page 22: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

How to get new environments?

Captured from new houses?

22

Page 23: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

How to get new environments?

Is very expensive…

23

Captured from new houses?

Page 24: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

How to get new environments?

Generate new environments?

24

Page 25: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

How to get new environments?

Generate new environments?

Not so easy…

25

Page 26: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

How to get new environments?

26

Let’s modify the existing environments!!

Page 27: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Vie

wpo

ints

Views

t

t+1

Illustration: Random Removal

27

RGB-imageviews

Page 28: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Vie

wpo

ints

Views

t

t+1

Illustration: Random Removal

28

Remove objects(Marked in blue)

Page 29: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Vie

wpo

ints

Views

t

t+1

Illustration: Random Removal (Two Issues)

29

Incomplete Removal:The chair is still visible from other views.

Page 30: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Vie

wpo

ints

Views

t

t+1

Illustration: Random Removal (Two Issues)

30

Incomplete Removal:The chair is still visible from other views.

Inconsistent Removal:The same chair disappears in the next viewpoint.

Page 31: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Illustration: Environmental Removal / Dropout

Vie

wpo

ints

Views

t

t+1

31

Solution:Remove all thechairs!!

Page 32: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

32

Environmental Dropout: Image-Level Implementation

RGB Image Semantic View

Object-level annotation is noisy.

Page 33: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

33

Environmental Dropout: Image-Level

Rendering is slow for training agents.

Render

Page 34: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Environmental Dropout: Feature-Level

Views Feat D

ims

Vie

wpo

ints

Views Feat D

ims

Vie

wpo

ints

Random Feature Dropout Environmental Dropout34

Page 35: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Environmental Dropout: Full Pipeline

35Agent

Walk past the hall, turn Left, ...

Speaker

Existing Environment

Env Drop

New Environment

Page 36: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

36

Results Comparison

Metric: Success Rate

Evaluated in Unseen Environments

Training Environments

Testing Environments

Agent

Page 37: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Agent Training

AgentWalk past the hall, turn Left, ...

Results Comparison

46.5%37

Page 38: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Results Comparison

Agent

Walk past the hall, turn Left, ...

Speaker

Agent Training Back Translation

Existing Environment

AgentWalk past the hall, turn Left, ...

46.5% 48.2% (+1.7%)38

Page 39: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Results Comparison

Agent

Walk past the hall, turn Left, ...

Speaker

Agent Training Back Translation

Back Translationw/ Random Dropout

New Environment

Agent

Walk past the hall, turn Left, ...

Speaker

Existing Environment

Existing Environment

Random Drop

AgentWalk past the hall, turn Left, ...

46.5% 48.2% (+1.7%) 48.4% (+1.9%)39

Page 40: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Results Comparison

Agent

Walk past the hall, turn Left, ...

Speaker

Agent Training Back Translation

Back Translationw/ Env Dropout

New Environment

Agent

Walk past the hall, turn Left, ...

Speaker

Existing Environment

Existing Environment

Env Drop

AgentWalk past the hall, turn Left, ...

46.5% 48.2% (+1.7%) 52.2% (+5.7%)40

Page 41: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Leaderboard Results

41

Beam SearchGreedy Decoding

Previous Best: 48.0%

Ours: 51.5% (+3.5%)

Previous Best: 63.0%

Ours: 68.9% (+6.9%)

Self-Monitoring Navigation Agent via Auxiliary Progress Estimation, Ma et.al., 2019Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation, Wang et.al., 2019

Page 42: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

42

Sufficient

If we use “new” environment, the result is better.

Page 43: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

43

Sufficient and Necessary

If we use “new” environment, the result is better.

If we do not use “new” environment, the result would not be better.

Page 44: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

44

Upper Bound of Back Translation (on Existing Envs)

Labeled Data Unlabeled Data

Existing Environments

Page 45: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

45

Upper Bound of Back Translation (on Existing Envs)

Labeled Data Unlabeled Data

Labeled Data Pseudo-labeled Data

BackTranslation

Page 46: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

46

Upper Bound of Back Translation (on Existing Envs)

Labeled Data Unlabeled Data

Labeled Data Pseudo-labeled Data

Labeled Data **Labeled** Data

BackTranslation

“WeakerThan”

Assumption

Page 47: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

47

Upper Bound of Back Translation (on Existing Envs)

Labeled Data Unlabeled Data

Labeled Data Pseudo-labeled Data

Labeled Data **Labeled** Data

BackTranslation

Upper Bound

Existing Environments

Page 48: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

48

Upper Bound of Back Translation (on Existing Envs)

Labeled Data **Labeled** DataUpper Bound

How to calculate (approximate) this upper bound?

Existing Environments

Page 49: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

49

Upper Bound of Back Translation (on Existing Envs)

Labeled Data **Labeled** Data

How to calculate (approximate) this upper bound?

“Result Extrapolation” Approximation

Existing Environments

Upper Bound

Page 50: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

50

“Result Extrapolation” Approximation

Labeled Data **Labeled** Data **52%**

26% Training data

73% Training data

100% Training data 46%

45%

42%

Predict

Page 51: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

51

Reinforcement Learning + Imitation Learning

Agent Agent Agent

Teacher Actions<BOS>

Agent Agent Agent

<BOS>

Sampling Sampling

RL:

IL:

Walk past the shelves and out of the garage.Stop in ...

Rewards

Page 52: UNC, Chapel Hill haotan, licheng, mbansal@cs.unc.edu Hao Tan, Licheng Yu, Mohit …airsplay/Hao_NAACL2019_slide.pdf · 2019-06-09 · Hao Tan, Licheng Yu, Mohit Bansal haotan, licheng,

Code released at:https://github.com/airsplay/R2R-EnvDrop

Hao Tan, Licheng Yu, Mohit Bansal

Thank you!

52

UNC Chapel Hill

Supported by ARO-YIP, ONR, Google, Facebook, Adobe, Baidu, and Salesforce.