Page 1: Enabling Language Models to Fill in the Blanks · minalee/pdf/acl2020-infilling-slides.pdf

Enabling Language Models to Fill in the Blanks

Chris Donahue, Mina Lee, Percy Liang

Paper https://arxiv.org/abs/2005.05339 Code https://github.com/chrisdonahue/ilm Demo https://chrisdonahue.com/ilm

Page 2:

Why filling in the blanks?

Hi Chris, Thanks for updating the draft.

Can you revert the wording of the task definition?

Editing and revising

Page 3:

Why filling in the blanks?

Hi Chris, Thanks for updating the draft. The modifications look

Can you revert the wording of the task definition?

Editing and revising

Page 4:

Why filling in the blanks?

Hi Chris, Thanks for updating the draft. The modifications look great to me.

Can you revert the wording of the task definition?

Editing and revising

Page 5:

Why filling in the blanks?

Hi Chris, Thanks for updating the draft. The modifications look good with one exception.

Can you revert the wording of the task definition?

Editing and revising

Page 6:

Why filling in the blanks?

We were lost in the dark forest. Suddenly,

Connecting ideas

Page 7:

Why filling in the blanks?

We were lost in the dark forest. Suddenly, a bear emerged from the trees!

Connecting ideas

Page 8:

Why filling in the blanks?

We were lost in the dark forest. Suddenly, [blank] A wave of relief washed over us and we ran over to greet the other traveler.

Connecting ideas

Page 9:

Why filling in the blanks?

We were lost in the dark forest. Suddenly, we saw a flashlight in the distance. A wave of relief washed over us and we ran over to greet the other traveler.

Connecting ideas

Page 10:

Text infilling

Given incomplete text with [blank]s, predict complete text

Input: She ate [blank] for [blank].

Output: She ate leftover pasta for lunch.

Arbitrary number of blanks

Variable length spans (e.g. word, sentence, paragraph)
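To make the task definition concrete, here is a minimal sketch (not from the slides) that checks whether a completed text is a valid infilling of a template and recovers the span behind each [blank]. The function name `extract_infills` is hypothetical, and greedy regex matching is an assumption that can pick an unintended split when the surrounding context repeats.

```python
import re

BLANK = "[blank]"

def extract_infills(template, completed):
    """Return the spans filling each [blank], or None if `completed`
    is not a valid infilling of `template`."""
    # Escape the literal text between blanks; capture one non-empty span per blank.
    pieces = [re.escape(piece) for piece in template.split(BLANK)]
    pattern = "(.+)".join(pieces)
    # DOTALL lets a span cover several lines (e.g. a paragraph-length blank).
    match = re.fullmatch(pattern, completed, flags=re.DOTALL)
    return list(match.groups()) if match else None

print(extract_infills("She ate [blank] for [blank].",
                      "She ate leftover pasta for lunch."))
# ['leftover pasta', 'lunch']
```

The two recovered spans have different lengths (two words and one word), which is the variable-length property the slide lists.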

Page 11:

Previous work on text infilling

Input: She ate [blank] for [blank].

Output: She ate leftover pasta for lunch.

General-purpose models

GPT-3 (Brown et al., 2020): Cannot consider future context

Page 12:

Previous work on text infilling

Input: She ate [mask] [mask] for [mask].

Output: She ate leftover pasta for lunch.

General-purpose models

GPT-3 (Brown et al., 2020): Cannot consider future context

BERT (Devlin et al., 2019): Must know exact number of tokens
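The BERT limitation can be illustrated with a small sketch (not from the slides): turning a [blank] template into a [mask] template requires knowing, in advance, how many tokens each answer has. The helper name `blanks_to_masks` is hypothetical, and counting whitespace-separated words is an assumption standing in for BERT's actual subword tokenization.

```python
BLANK = "[blank]"
MASK = "[mask]"

def blanks_to_masks(template, answers):
    """Expand each [blank] into one [mask] per word of its known answer."""
    parts = template.split(BLANK)
    if len(parts) - 1 != len(answers):
        raise ValueError("need exactly one answer per blank")
    out = parts[0]
    for answer, rest in zip(answers, parts[1:]):
        # The mask count depends on the answer length, which an infilling
        # system does not know before it has produced the answer.
        n_tokens = len(answer.split())
        out += " ".join([MASK] * n_tokens) + rest
    return out

print(blanks_to_masks("She ate [blank] for [blank].", ["leftover pasta", "lunch"]))
# She ate [mask] [mask] for [mask].
```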

Page 13:

Previous work on text infilling

Input: She ate [blank] for [blank].

Output: She ate leftover pasta for lunch.

General-purpose models

GPT-3 (Brown et al., 2020): Cannot consider future context

BERT (Devlin et al., 2019): Must know exact number of tokens

Task-specific models

SA (Zhu et al., 2019): Cannot leverage pre-trained language models

Page 14:

Our Idea: Infilling by Language Modeling (ILM)

1. Download your favorite language model (LM)

Language Model

Page 15:

Our Idea: Infilling by Language Modeling (ILM)

1. Download your favorite language model (LM)

2. Fine-tune the model on infilling examples

She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]

Language Model

Page 16:

Our Idea: Infilling by Language Modeling (ILM)

Training time

1. Manufacture infilling examples

Page 17:

Our Idea: Infilling by Language Modeling (ILM)

Training time

1. Manufacture infilling examples

Data: She ate leftover pasta for lunch.

She ate [blank] for [blank].

leftover pasta [answer] lunch [answer]

Page 18:

Our Idea: Infilling by Language Modeling (ILM)

Training time

1. Manufacture infilling examples

Data: She ate leftover pasta for lunch.

Input: She ate [blank] for [blank].

leftover pasta [answer] lunch [answer]

Page 19:

Our Idea: Infilling by Language Modeling (ILM)

Training time

1. Manufacture infilling examples

Data: She ate leftover pasta for lunch.

Input: She ate [blank] for [blank].

Target: leftover pasta [answer] lunch [answer]

Page 20:

Our Idea: Infilling by Language Modeling (ILM)

Training time

1. Manufacture infilling examples

Data: She ate leftover pasta for lunch.

New data: She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]
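A minimal sketch of how one such training example could be manufactured from a complete sentence: the masked text comes first, then [sep], then each removed span followed by [answer]. The function name `make_ilm_example` and the use of character offsets are assumptions for illustration. The slide does not say how spans are chosen, so here they are passed in explicitly.

```python
BLANK, SEP, ANSWER = "[blank]", "[sep]", "[answer]"

def make_ilm_example(text, spans):
    """Build one infilling example from `text`.

    `spans` is a list of non-overlapping (start, end) character offsets
    marking the pieces of `text` to replace with [blank].
    """
    context, answers, cursor = [], [], 0
    for start, end in sorted(spans):
        context.append(text[cursor:start])   # text kept before this span
        context.append(BLANK)                # placeholder for the span
        answers.append(text[start:end] + " " + ANSWER)
        cursor = end
    context.append(text[cursor:])            # text kept after the last span
    return "".join(context) + " " + SEP + " " + " ".join(answers)

text = "She ate leftover pasta for lunch."
spans = [(text.index(s), text.index(s) + len(s)) for s in ("leftover pasta", "lunch")]
print(make_ilm_example(text, spans))
# She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]
```

Because the result is a single flat string, it can be fed to an ordinary left-to-right LM without changing the model.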

Page 21:

Our Idea: Infilling by Language Modeling (ILM)

Training time

2. Download pre-trained left-to-right LM

Language Model

She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]

Page 22:

Our Idea: Infilling by Language Modeling (ILM)

Training time

3. Fine-tune LM on infilling examples

Language Model

She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]

Page 24:

Our Idea: Infilling by Language Modeling (ILM)

Test time

Use fine-tuned LM to infill

Language Model

Input: He drinks [blank] after [blank]. [sep]

Page 25:

Our Idea: Infilling by Language Modeling (ILM)

Test time

Use fine-tuned LM to infill

Language Model

Input: He drinks [blank] after [blank]. [sep]

Target: water [answer] running [answer]

Page 26:

Our Idea: Infilling by Language Modeling (ILM)

Test time

Use fine-tuned LM to infill

Input: He drinks [blank] after [blank]. [sep]

Target: water [answer] running [answer]

Output: He drinks water after running.
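The last step, turning the generated target back into complete text, can be sketched as follows. This is an illustrative helper, not the authors' code: the name `merge_infills` is hypothetical, and it assumes the model emits exactly one [answer]-terminated span per [blank].

```python
BLANK, SEP, ANSWER = "[blank]", "[sep]", "[answer]"

def merge_infills(context, generated):
    """Substitute each generated answer into the matching [blank]."""
    context = context.strip()
    if context.endswith(SEP):
        # Drop the trailing [sep] that prompted the model to start answering.
        context = context[: -len(SEP)].rstrip()
    # Text after the final [answer] is not a completed span, so discard it.
    answers = [a.strip() for a in generated.split(ANSWER)[:-1]]
    parts = context.split(BLANK)
    if len(answers) != len(parts) - 1:
        raise ValueError("number of answers does not match number of blanks")
    out = parts[0]
    for answer, rest in zip(answers, parts[1:]):
        out += answer + rest
    return out

print(merge_infills("He drinks [blank] after [blank]. [sep]",
                    "water [answer] running [answer]"))
# He drinks water after running.
```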

Page 27:

Experimental setup

Data: Stories (Mostafazadeh et al., 2016), Abstracts, Lyrics

Model: BERT, SA (Zhu et al., 2019), LM, ILM (ours)

Metric:

1. Human evaluation: Score

2. Quantitative evaluation: Perplexity

Page 28:

1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends.

Page 30:

1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends.

[blank]

Page 31:

1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends.

[blank]

ILM Patty knew her friends wanted pizza.

Page 32:

1. Human evaluation: Turing test

Identify one of the five sentences generated by machine.

Patty was excited about having her friends over. She had been working hard preparing the food. She also had the place looking spotless. All of her friends arrived and were seated at the table. Patty had a great time with her friends.

[blank]

Model   Infilled sentence                                 Score
BERT    favoritea ", Mary brightly said.                  20%
SA      She wasn't sure she had to go to the store.       29%
LM      She went to check the tv.                         41%
ILM     Patty knew her friends wanted pizza.              45%

Page 33:

2. Quantitative evaluation

Model   Stories   Abstracts   Lyrics
LM      18.3      27.9        27.7
ILM     15.6      22.4        22.6

Perplexity on the sentence infilling task
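For readers unfamiliar with the metric, perplexity is conventionally the exponential of the average negative log-likelihood per token, so lower is better. The sketch below uses that standard definition. Which tokens the authors score for the infilling task is not stated on the slide, so the input here is simply an assumed list of per-token natural-log probabilities.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities of each scored token."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# A model that assigns probability 0.25 to every token has perplexity 4.
print(perplexity([math.log(0.25)] * 4))
```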

Take advantage of bidirectional context despite using unidirectional models

Please refer to the paper for more experiments and detailed analysis

Page 34:

Takeaways

Conceptual simplicity

Minimal change to standard LM training

Model-agnostic framework

Leverage massively pre-trained LMs

Input: Thank [blank] for [blank]!

Output: Thank you for watching!

Paper https://arxiv.org/abs/2005.05339 Code https://github.com/chrisdonahue/ilm Demo https://chrisdonahue.com/ilm