from simple to complex qa - nus · – multiple-choice question answering: race, mctest – answer...

55
From Simple to Complex QA Eduard Hovy CMU Language Technologies Institute www.cs.cmu.edu/~hovy

Upload: others

Post on 11-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

FromSimpletoComplexQA

EduardHovyCMULanguageTechnologiesInstitute

www.cs.cmu.edu/~hovy

Page 2: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

WebclopediaQA,2003

•  Wherearezebrasmostlikelyfound? —inthedictionary

•  Wheredolobstersliketolive? —onthetable

•  HowmanypeopleliveinChile? —nine

Webclopedia(Hovyetal.2001)

•  Whatisaninvertebrate? —Dukakis 1

Page 3: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

BasicsimplefactoidQA

•  IdentifykeywordsfromQ•  Build(Boolean)queryforIR•  RetrievetextsusingIR•  Ranktexts/passages

•  FindspecifiedQtype•  MoveApatternsovertextand

scoreeachposition•  Rankwindows;returntopN

Alist

InputQ

Corpus:30%

+Web:add10%

1Mdocuments3000sentences

50candidates5answers

…Xwasbornin<YEAR>……Xwasbornon<DATE>……X(<YEAR>–<YEAR>)…

2

Page 4: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

WhereistheAnswer?—Progresssince2003?

TypicalQAformat:

EithertheQcontextprovidestheA1.  nearby(=n-wordwindow)context 2.  distant(=doc-level)context

Ornotatall…soyouhavetousebackgroundinfo3.  fromthetrainingdata4.  fromlogicalderivation/reasoningrules/procedure 3

Question:QContext:“wwwwww…w”A:

Page 5: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Either…

WhenallinfoneededtogettheAispresentintheQcontext

…thensomeformofsurfaceandsimpletypematching+sub-Acompositionisenough

—>Ultimately,justdo[nested]simpleQA

Or…

ButwhengettingtheArequiresinformationnotintheQcontext(likebackgroundinfo,calculation,etc.)

…thenyouareintrouble:thisisnotstandardized,henceimpossibletoevaluate

—>NocomplexQA!?4

Page 6: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Outline

1.  Ainthenearbycontext2.  Ainthedistantcontext3.  Ahiddeninthetrainingdata4.  Aonlybyreasoning

5

Page 7: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Option1:Ainnearbycontext

•  BuildanduseshortpatternsorarichLM•  Tonsofworksince2000onpatternlearningandgeneralization,QAtypologies,etc.

•  NumerousQAdatasets(TREC,SQuAD,CNN…)•  ManyQAcompetitions(SEMEVAL…)

6

Sowhere’sthelimit?

Page 8: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

YoucandoaLOTwithpatternsDidyouknowyouareanexpertonthe

PanamaCanal?

BlahPanamaCanalblahblahPanamablahPres.RooseveltblahUSAblahblahblahblahblah10yearsblahuntil1914blahblahblahblah51milesblahblahblahblahblahblahblahblahblahblahblahblahblah8to10hoursblahblahblahblahGatunLakeblah

WhenwasthePanamaCanalcompleted?HowlongisthePanamaCanal?HowlongdidittaketobuildthePanamaCanal?HowlongdoesittaketocrossthePanamaCanal?WhatisthelakeinthePanamaCanalcalled?WhichUSPresidentenabledthePanamaCanal?WhichoceansdoesthePanamaCanalconnect?Inyourtrainingdata,youhavesurelyseen“PanamaCanal”withonlytwooceannames…

7Sowhere’sthelimit?

Page 9: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Acorpustotestthepowerofngram/patternQAmodels

•  CLOTH(Xie,Lai,Dai,Hovy,EMNLP2018)–  Large-scaleClozetestdataset–  CreatedbyEnglishteachersinChinaforEnglishexams(MiddleandHighschoollevels)

– Aftercleanup:7kpassages;99kquestions(2/3removed)

– Droppedwordsandwordoptionscarefullycreatedbyteachers:highlynuancedalternatives

–  Testsknowledgeofgrammar,vocabulary,reasoning•  Howwelldostate-of-the-artcomputationalmodelsdocomparedtohumans?– Wetestusinga1-billion-wordlanguagemodel

8

Page 10: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

•  Tense,voice,preps•  Localcontentwords

•  Copy/paraphrasewords•  Contentwords,long-distancedependencies

Percentagesoftestexamples,Middle/Highschoollevels

(Xieetal.2018)

9

Page 11: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

QAsystemresults

•  Evena1B-LMstilllagsbehindhumanperformance•  Increasingthecontextlengthfor1B-LMdoesnothelp•  However:human-createdquestionsaredifferent:

(Xieetal.2018)

10

(AR:AttentionReader)

Thiswaspre-BERT!)

Page 12: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Conclusionforoption1

ForfactoidQAtypesthatobeypatterns,iftheAiscloseenough,andyouhaveenoughtrainingdata……youwillalwayslearngoodenoughwordcombinationpatternstoconnectQparameters<–>Qcontextmaterial<–>A

(Ifyouhaven’tseenthenecessarywordcombinations,youwon’teverbeabletoanswertheQ)

11

Page 13: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Option2:Aindistantcontext

•  StillusesomeformofmatchingQandA•  Needamore-sophisticatedandlonger-distancetypeof‘pattern’

12

Page 14: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Makingmatchingmorecomplex:RACE:Abettertestbed

•  RACE:ReAdingComprehensiondatasetfromExaminations(Lai,Xie,Liu,Yang,Hovy,EMNLP2018)

•  CollectedfromChinesemiddleandhighschoolexamsthatevaluatehumanstudents’Englishreadingcomprehensionability– Designedbyhumanexperts:Ensuresqualityandbroadtopiccoverage

–  SubstantiallymoredifficultthanexistingQAdatasets(butRACE-MeasierthanRACE-H)

– About4/5ofsourcematerialfilteredouttoremoveduplicates,incorrectformat,etc.

•  Aftercleaning:27,933passages;97,687questions

(Laietal.2018)

13

Page 15: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

14

Page 16: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Toward‘reasoning’:typesofmore-complexmatching

•  ParaphrasingQs:testlanguageability•  DetailQs:identifyandmatchdetailsofathing

•  AttitudeQs:findopinions/attitudesoftheauthortowardssomething(’sentiment’)

•  Whole-pictureQs:understandtheentirestory(multi-sentence)

•  SummarizationQs:understandthepoint(multi-sentence)

(Laietal.2018)

15

Increasingreasoning

Page 17: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

ComparisonwithotherQAdatasets•  Reasoningquestions:59.2%ofRACE;20.5%ofSQuAD•  Processingtypes:

–  Wordmatching:exactmatch–  Paraphrasing:paraphraseorentailment–  Single-sentreasoning:incompleteinfoorconceptualoverlap

–  Multi-sentreasoning:synthesizinginformationfrommultiplesentences

–  Insufficient/Ambiguous:noA,orAisnotunique

(Laietal.2018)

16

Page 18: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

ComparingQAalgorithms

•  Baselines:–  SlidingWindow:TF-IDFbasedmatchingalgorithm–  StanfordAttentionReader(AR)andGatedAttentionReader(early-2018state-of-the-artneuralmodels)

•  RACEhasmore‘semantics’(=requiresmore‘reasoning’)thanothercorpora:–  higherhumanceiling–  harderforneuralmodels

(Laietal.2018)

17

Page 19: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Matchingtypeperformance

•  TurkersandSlidingWindowaregoodatsimplematchingquestions

•  Surprisingly,StanfordARdoesnothavebetterperformanceonmatchingquestions

(Laietal.2018)

18

Page 20: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Conclusionforoption2

WhentheAisdistant,orrequiresmore-sophisticatedmatching/’reasoning’(notjustsimpleword-string/languagemodel),

thenattention-basedneuralmodelscandosomeofit,butstillfailwiththeharderparts

19

Page 21: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Option3:A‘hidden’intrainingdata

SometimestheQcontextdoesnotcontaintheAatall…butyoucanSTILLgettherightA!(AndevengetitwithouttheQitself!)CorruptedngramsandotherSQuADperturbations(JiaandLiang,EMNLP2017)

NecessityofQcontextorevenofQitself(KaushikandLipton,EMNLP2018,BestShortPaperaward)

20

Page 22: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Example:QonlyQuestion:shinkanemaru,thegravel-voicedback-roombosswhodiedonthursdayaged81,goesdowninhistoryasjapan’smostcorruptpost-warpoliticianafter___________Passage:...glynisbc-nj-zimmer-profile-2takes-nytrahanefumioyasuhirodragnealhadonbjorkman/max...seventh-largestembarrasedjeopardyhilariouslymasahisahaibarabajram8-to-24duke/meredithacceding...koiduiraqs2:32:21//www.ironmanlive.com/sagawakyubindeaninternatinoal90-meterkakueitanakaseven-paragraph577,610wendovergolf-lpga-jpnpartner,un-appointeduemazzeicanada-u.s.Answer:kakueitanaka

(KaushikandLipton,EMNLP2018)

21

Page 23: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Doyouactuallyneedthecontext?

•  Researchgoal:–  HowstrongaremodelsthatseetheQonly?– WhataboutmodelsthatseetheQcontextpassageonly?–  Howdoweknowmodelsarereally“reading”thewholepassage?

•  Question-onlysetting:–  IftheQAsystemneedsthepassage,randomizeitswordsfirst–  IfjustcandidateAsneeded,placetheminrandomspots,fillinterveningtextwithgibberish

•  Passage-onlysetting:–  ‘Ignore’theQs:assigneachQtosomerandompassage

22

(KaushikandLiptonEMNLP2018)

Page 24: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Experiments•  Datasets/tests:

–  Spanselection:SQuAD,TriviaQA–  Clozequeries:ChildrensBookTest(CBT),CNN,CLOTH,Who-did-What,DailyMail

–  Multi-classclassification(implicit):bAbI(20tasks)–  Multiple-choicequestionanswering:RACE,MCTest–  Answergeneration:MSMARCO

•  Algorithms:–  Key-ValueMemoryNetworks:

Milleretal.2016:Key-ValueMemoryNetworksforDirectlyReadingDocuments.ProceedingsofEMNLP

–  GatedAttentionReaders:Dhingraetal.2017:Gated-AttentionReadersforTextComprehension.ProceedingsofACL

–  QANet:Yuetal.2018:QANet:CombiningLocalConvolutionwithGlobalSelf-AttentionforReadingComprehension.ProceedingsofICLR

(KaushikandLiptonEMNLP2018)

23

Page 25: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Someresults

SQuAD,usingQANet

bAbI,usingKey-ValueMemNets

(KaushikandLiptonEMNLP2018)

24

Page 26: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Who-did-What,usingGated-AttentionReaders

CBT,usingGated-AttentionReaders

(KaushikandLiptonEMNLP2018)

25

Page 27: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Why?What’sgoingon??Question:shinkanemaru,thegravel-voicedback-roombosswhodiedonthursdayaged81,goesdowninhistoryasjapan’smostcorruptpost-warpoliticianafter___________Passage:...glynisbc-nj-zimmer-profile-2takes-nytrahanefumioyasuhirodragnealhadonbjorkman/max...seventh-largestembarrasedjeopardyhilariouslymasahisahaibarabajram8-to-24duke/meredithacceding...koiduiraqs2:32:21//www.ironmanlive.com/sagawakyubindeaninternatinoal90-meterkakueitanakaseven-paragraph577,610wendovergolf-lpga-jpnpartner,un-appointeduemazzeicanada-u.s.Answer:kakueitanaka

Transportationcompany

Kanemaru’ssecretary

Long-termpolitician

NamenotinGoogle

26

Page 28: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Conclusionforoption3

•  Don’ttrustQAdatasets!•  Don’ttrustQAsystemclaims!•  First,checkif

– anypre-existing(=trainingdata)dependenciesamongtheQandcandidateAs?

–  fullcontextpredictstheAwithouteventheQ?

27

Page 29: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Option4:Aonlythroughreasoning

FortrulycomplexQA:1.  Identifytheindividualsteps/piecesneeded

toderivetheA2.  Figureouthowtocompute/findthem

– FromtheQcontextand/orfromelsewhere

3.  Compose(andcheck?)them– BuildanAfinding‘script’

28

Page 30: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Possiblesourcesofthisknowledge•  Externalsearch:

–  Querysomethinglikethewebandhopetobelucky

•  Entailments:“sentence”–>“sentence”–  Operateatsurfaceform(inRTEformulation)–  Allowonesurfaceformtobestatedwhenanotherisgiven–  NewsurfaceformmayprovideAnswer–  Need:entailmentrules+entailmentapplier

•  Axioms:A∨B–>C–  Operateatdeeperlevel–  Connectrepresentationsubgraphs,evenprovidingnewnodes–  ExpandedgraphmayprovideAnswer–  Need:axioms/compositionrules+theoremprover

29

Page 31: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Type1:Apopulartasktoday:QAoverstructureddata

•  Data:database,table,etc.•  Task:askQsthatrequire(1)findingvariousbitsofdataand(2)composingthemtomaketheA

•  Themissinginformationisthescriptgoverningthesequenceofaccessandcomposition

•  Research:howto[learnto]buildthisscript?•  Evaluation:didthesystemproducetherightA?•  Examples:

– U.S.geographydatabaseof800facts(Zelle&Mooney,1996)– Wikitablequestions(PasupatandLiang,2015;Dasigi2018)– Otherdomains’tables(severalAI2projects)

30

Page 32: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Wikitabledataset

31

Athlete Nation Olympics Medals

Gillis Grafström

Sweden (SWE) 1920–1932 4

Kim Soo-Nyung

South Korea (KOR) 1988-200 6

Evgeni Plushenko Russia (RUS) 2002–2014 4

Kim Yu-na South Korea (KOR) 2010–2014 2

Patrick Chan Canada (CAN) 2014 2

Question:WhichathletewasfromSouthKoreaaftertheyear2010?

Answer:KimYu-Na

Reasoning:1)  GetrowswhereNationcolumn

containsSouthKorea2)  FilterrowswhereOlympicshas

avaluegreaterthan2010.3)  GetvaluefromAthletecolumn

fromfilteredrows.

Program:((reverseathlete)(and (nationsouth_korea) (year((reversedate)

(>=2010-mm-dd)))WikiTableQuestions,PasupatandLiang,2015

(DasigiLTIPhDthesis,2018)

Page 33: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Example:Dasigi•  Approachforlearningtobuildaccessroutines:

1.  ParseQ,builddependencytree2.  ConvertintoLogicalForm3.  Translateintocandidatetableaccessroutine4.  (tryallkindsofmappingsfromwordstoqueryoperators/structure)5.  Testcompositionbyrepeatedtrialanderror

•  Essentially,learningisasearchin‘operatorcombinationspace’tobuildthelogicalform

•  Weaksupervisionisnotenough.Speedupthelearning/searchby:–  Learningtoassociatetableaccessparameterswithpartsofthetree(Q

variables)–  Learningtoassociatenestingandaccessoperatorswithpartsofthetree

(‘operator’words:“themost”,“last”,etc.)–  Predefiningsomelexicon-to-operationmappings–  Payingattentiontogrammaticalconstructionofthetree–  Implementingheuristicstoguideexploration(‘shortQsfirst’)

32

(DasigiLTIPhDthesis,2018)

Page 34: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Dasigiapproach•  Strategies:

–  Incorporateknowledgeofgrammaticalconstraints–  ‘Lucky’examples:removerightAwithwrongquerylogic–  Questioncoverage:howmanyQwordsmapped?–  Complexqueries(denotation):howlargeisthequery?–  Doiterativesearch,fromsimplertomorecomplexQs

•  CombineintosingleObjective:Minimizeexpectedvalueofcost(Goodman,1996;GoelandByrne,2000;SmithandEisner,2005)

withalinearcombinationofcoverageanddenotationcosts

33

x: NL term y: script term d: denotation

Page 35: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

EmpiricalcomparisononWikiTableQuestions

●  Requiresapproximatesetoflogicalformsduringtraining

●  UsedoutputfromDynamicProgrammingonDenotations(PasupatandLiang,2016)

●  Variousmodels:strings,trees,etc.

●  Efficientsearchfollowedbypruningusinghumanannotations

34

(Krishnamurthy, Dasigi and Gardner, 2017)

Page 36: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Dasigiresultsusingiterativesearch

35

●  Similar trend in 2 domains ●  Used functional query language (Liang et al., 2018)

(Dasigi,Gardner,Murty,Zettlemoyer,Hovy2018)

NLVR WikiTableQuestions

Page 37: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Conclusionforoption4.1

Interestingideato‘operationalize’theQandtestits‘truth’byrunningthescriptfortheA

ButworksonlywithstructuredAsourceswheresuchoperationalizationispossible

36

Canwe‘operationalize’other,typicalkindsofQs?

Page 38: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Type2:AnewQAtask:Multi-domainknowledge

Q:WhatisthelargestcapitalcitysouthofSantiagodeChile?

– Geographicknowledge(lat-long,population)– Numericalability(sorting,etc.)

Q:WhichoftheleadersoftheXYZenterprisearewell-liked,andwhy?

– Discoveryofsocialrolebyactions– Sentimentjudgmentsattachedtoactions

37

Page 39: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Multi-domainknowledge

•  DefineNself-containedstandardized‘domainspecialists’(KBs+reasoners)thatanyQAenginecanrun

•  Atrun-time,analyzetheQ,buildtheAscript,activatethespecialistsasneeded,computetheA

Arithmetic

GeographyPsych:goals

SocialcustomsPhysics

38

Page 40: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Researchneeded

•  Foreachdomainspecialist:– Defineits‘knowledgeservice’–  Createtheunderlyingknowledge– DefinetheI/OAPIsfortheQAenginetouse–  Buildthespecialist

•  ForeachQAengine:– AnalyzetheQ—>determineparametersandneed– Decomposeneed,buildascriptofspecialistqueriesplustheirresultcomposition

–  Execute39

Page 41: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Somespecialistareaswearecurrentlyworkingoninmygroup

1.  Arithmetic/numericalreasoningforentailment(Ravichander,Naik,Rosé,Hovy,CoNLL2019,ACL2019)

2.  Psychgoalsforsentimentjustification(OtaniandHovy,ACL2019)

3.  Socialrolesforgroupactivitysupport(Yang,Kraut,Hov,yEMNLP,HCI,andothers2017–18)

40

Page 42: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Topic1.Numericalcalculation

•  Task:Entailmentproblem•  Input:clausescontainingnumbers•  Output:entailed/not-entailed

•  Results:–  EQUATEdatasetextractedfrom~8existingQAandEntailmentresources,withAsadded

–  Baselinenumericalreasonerscoresonthedataset

(Ravichander,Naik,Rose,Hovy,2019)

P:AbombinaHebrewUniversitycafeteriakilledfiveAmericansandfourIsraelisH:AbombingatHebrewUniversityinJerusalemkilledninepeople,includingfiveAmericans

41

Page 43: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

EQUATEcorpusDataset Size Clas

sesSynthetic

DataSource

AnnotationSource

QuantitativePhenomena

StressTest 7500 3 ✓ AQuA-RAT Automatic Quantifiers

RTE-Quant 166 2 ✗ RTE2-RTE4 Expert Arithmetic,Worldknowledge,Ranges,Quantifiers

AwpNLI 722 2 ✓ ArithmeticWordProblems

Automatic Arithmetic

NewsNLI 1000 2 ✗ CNN Crowd-sourced

Ordinals,Quantifiers,Arithmetic,WorldKnowledge,Magnitude,Ratios

RedditNLI 250 3 ✗ Reddit Expert Range,Arithmetic,Approximation,Verbal

(Ravichanderetal.,2019)

42

Page 44: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Baselines(SOTAmethods)•  MajorityClass(MAJ):Simplebaselinealwayspredictsthemajorityclassintestset.•  Hypothesis-Only(HYP):FastTextclassifiertrainedononlyhypothesestopredictthe

entailmentrelation(Gururanganetal.2018)•  ALIGN:Abag-of-wordsalignmentmodelinspiredbyMacCartney(2009)•  NB(NieandBansal2017):SentenceencoderconsistingofstackedBiLSTM-RNNs

withshortcutconnectionsandfine-tuningofembeddings.Achievestopnon-ensembleresultintheRepEval-2017sharedtask

•  CH(Chenetal.2017):SentenceencoderconsistingofstackedBiLSTM-RNNswithshortcutconnections,character-compositionwordembeddingslearnedviaCNNs,intra-sentencegatedattentionandensembling.AchievesbestoverallresultintheRepEval-2017sharedtask

•  RC(Balazsetal.2017):Single-layerBiLSTMwithmeanpoolingandintra-sentenceattention

•  IS(Conneauetal.2017):Single-layerBiLSTM-RNNwithmax-pooling,showntolearnrobustuniversalsentencerepresentationsthattransferwellacrossinferencetasks

•  BiLSTM:WereimplementthesimpleBiLSTMbaselinemodelofNangiaetal.(2017).OurreimplementationachievesslightlybetterresultsontheMultiNLIdevset

•  CBOW:Bag-of-wordssentencerepresentationfromwordembeddingspassedthroughatanhnon-linearityandasoftmaxlayerforclassification.

(Ravichanderetal.,2019)

43

Page 45: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Constructingentailmentinferences•  Generateareport

foreachpremise-hypothesispair,consistingof:–  ExtractedNUMSETSforpremiseandhypothesis

–  WhichNUMSETSwerecombinedandbywhatoperation

–  WhichNUMSETSwerejustifiedandwhichweren’t

•  Combinesneuralandsymbolicprograms–  Somesubmodulesareneural;overallframeworkissymbolic–  Lightweightsupervision

(Ravichanderetal.,2019)

44

Page 46: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Topic2.Humangoals

•  ComplexQAdomain:humangoalforsentiment–  Ilovedthehotel’spricebuttheroomwasnoisy—>[price+][room-]

•  Task:sentimentjustification:WHYdoestheHolderhavethesentimentvalueforthefacet?

•  Approach:Classifyeachclauseintoalistofhuman(psychologicalandsocial)goals–  Initialset:Maslowhierarchy–  Currently:~110humangoalsfromUSC(Talevichetal.)

•  Data:Crowdsourced;κ≈0.55

(OtaniandHovy,2019)

45

Page 47: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

(Talevichetal.2017)

46

Page 48: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Topic3.Socialroles

•  ComplexQAdomain:Humaninteractionsingroups

•  Task:Automatedsocialrolediscovery–  Input:Discussionsinasocialmediaplatform– Output:Rolelist,andassignmentforeachuser

•  Data:– Wikipediaeditors:ourroletaxonomyconformstoWikipedia’sinternalset

– CancerSurvivorNetworkdiscussiongroups

(Yang,Kraut,Hovy,2018)

47

Page 49: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

User edit history Role assignments

Information_insertion 0.4 Reference_insertion 0.2 ….

Roles

Grammar 0.2 Markup_deletion 0.1 Rephrase 0.1 ….

Wikilink_insertion 0.2 Wikilink_deletion 0.1 ….

48

(Yang,Kraut,Hovy,2018)

Page 50: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

LatentrolemodelinWikipedia

Role: distribution of edit actions

Role proportions

Role assignment for user u and word n

Edit actions

(Yang,Kraut,Hovy,2018)

49

Page 51: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Discoverededitorroles(namingbyexpert)Expert’s role name Discovered representative behavior

Substantive Expert Information insertion, wikilink insertion, reference insertion

Social Networker Main talk namespace, user namespace

Vandal Fighter Reverting, user talk namespace

Quality Assurance Wikilink insertion, wikipedia namespace, template namespace

Fact Checker Information deletion, wikilink deletion, reference deletion

Cleanup Worker Wikilink modification, template insertion, markup modification

Fact Updater Template modification, reference modification

Copy Editor Grammar, paraphrase, relocation 50

(Yang,Kraut,Hovy,2018)

Page 52: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Topics4–.Otherinferencespecialists

•  GeographyandTime… (see(Allen,CACM1983)and(Davis,JAIR2017))– E.g.:north-of,area-included-in-region…

•  Physics,Biology… (seetheHALOproject)– RecentworkonaspectsofPhysicsatAI2(Clarketal.)

•  Emotions

51

Page 53: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Physics:noun-nouncompounds

Whereis…

•  …thekitchentable•  …thecoffeetable•  …thewoodtable•  …theteacher’stable•  …thedatatable

•  Needtoknowtherelationandthenountypestoinferadditionalinfo:

•  LOC•  FUNCTIONèLOC•  MATERIAL•  ?FUNCTIONèLOC?•  TYPESèCONTENTèLOC?

52

Page 54: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

Conclusionforoption4.2WherenextwithComplexQA?

•  Identifyandbuildthemostusefuldomainspecialists–  Findbasicknowledgeprimitives– Developreasoninglogics,models,andimplementations

– Develop/findQAdatasetsthatexercisethissortofspecialistknowledgeandreasoning

•  Greatoverviewin(Davis,JAIR2018)•  Createacommonlibraryforalltoshare•  EvaluatecorrectnessANDAnswerproductionscripts(traces,as‘explanation’) 53

Open-sourceandgeneral-purpose(notjustscientific/political)versionofWolframAlpha

Page 55: From Simple to Complex QA - NUS · – Multiple-choice question answering: RACE, MCTest – Answer generation: MS MARCO • Algorithms: – Key-Value Memory Networks: Miller et al

THANKYOU

54