Lexical Semantics, Distributions, Predicate-Argument Structure, and Frame Semantic Parsing
11-711 Algorithms for NLP
5 December 2017
(With thanks to Noah Smith and Lori Levin)
11-711 Course Context
• Previous semantics lectures discussed composing meanings of parts to produce the correct global sentence meaning – The mailman bit my dog.
• The “atomic units” of meaning have come from the lexical entries for words
• The meanings of words have been overly simplified (as in FOL): atomic objects in a set-theoretic model
Word Sense
• Instead, a bank can hold the investments in a custodial account in the client’s name.
• But as agriculture burgeons on the east bank, the river will shrink even more.
• While some banks furnish sperm only to married women, others are much less restrictive.
• The bank is near the corner of Forbes and Murray.
Four Meanings of “Bank”
• Synonyms:
 – bank1 = “financial institution”
 – bank2 = “sloping mound”
 – bank3 = “biological repository”
 – bank4 = “building where a bank1 does its business”
• The connections between these different senses vary from practically none (homonymy) to related (polysemy).
 – The relationship between the senses bank4 and bank1 is called metonymy.
Antonyms
• White/black, tall/short, skinny/American, …
• But different dimensions possible:
 – White/Black vs. White/Colorful
 – Often culturally determined
• Partly interesting because automatic methods have trouble separating these from synonyms
 – Same semantic field
How Many Senses?
• This is a hard question, due to vagueness.
Ambiguity vs. Vagueness
• Lexical ambiguity: My wife has two kids (children or goats?)
• vs. vagueness: 1 sense, but indefinite: horse (mare, colt, filly, stallion, …) vs. kid:
 – I have two horses and George has three
 – I have two kids and George has three
• Verbs too: I ran last year and George did too
• vs. reference: I, here, the dog not considered ambiguous in the same way
How Many Senses?
• This is a hard question, due to vagueness.
• Considerations:
 – Truth conditions (serve meat / serve time)
 – Syntactic behavior (serve meat / serve as senator)
 – Zeugma test:
  • #Does United serve breakfast and Pittsburgh?
  • ??She poaches elephants and pears.
Related Phenomena
• Homophones (would/wood, two/too/to)
 – Mary, merry, marry in some dialects, not others
• Homographs (bass/bass)
Word Senses and Dictionaries
Ontologies
• For NLP, databases of word senses are typically organized by lexical relations such as hypernym (IS-A) into a DAG
• This has been worked on for quite a while
• Aristotle’s classes (about 330 BC):
 – substance (physical objects)
 – quantity (e.g., numbers)
 – quality (e.g., being red)
 – Others: relation, place, time, position, state, action, affection
Word senses in WordNet 3.0
Synsets
• (bass6, bass-voice1, basso2)
• (bass1, deep6) (Adjective)
• (chump1, fool2, gull1, mark9, patsy1, fall guy1, sucker1, soft touch1, mug2)
“Rough” Synonymy
• Jonathan Safran Foer’s Everything is Illuminated
Noun relations in WordNet 3.0
Is a hamburger food?
Verb relations in WordNet 3.0
• Not nearly as much information as nouns
Frame-based Knowledge Rep.
• Organize relations around concepts
• Equivalent to (or weaker than) FOPC
 – Image from futurehumanevolution.com
Still no “real” semantics?
• Semantic primitives:
 Kill(x,y) = CAUSE(x, BECOME(NOT(ALIVE(y))))
 Open(x,y) = CAUSE(x, BECOME(OPEN(y)))
• Conceptual Dependency: PTRANS, ATRANS, …
 The waiter brought Mary the check:
 PTRANS(x) ∧ ACTOR(x, Waiter) ∧ OBJECT(x, Check) ∧ TO(x, Mary)
 ∧ ATRANS(y) ∧ ACTOR(y, Waiter) ∧ OBJECT(y, Check) ∧ TO(y, Mary)
Word similarity
• Human language words seem to have real-valued semantic distance (vs. logical objects)
• Two main approaches:
 – Thesaurus-based methods
  • E.g., WordNet-based
 – Distributional methods
  • Distributional “semantics”, vector “semantics”
• More empirical, but affected by more than semantic similarity (“word relatedness”)
Human-subject Word Associations
Stimulus: wall
Number of different answers: 39
Total count of all answers: 98
BRICK 16 (0.16), STONE 9 (0.09), PAPER 7 (0.07), GAME 5 (0.05), BLANK 4 (0.04), BRICKS 4 (0.04), FENCE 4 (0.04), FLOWER 4 (0.04), BERLIN 3 (0.03), CEILING 3 (0.03), HIGH 3 (0.03), STREET 3 (0.03), ...
Stimulus: giraffe
Number of different answers: 26
Total count of all answers: 98
NECK 33 (0.34), ANIMAL 9 (0.09), ZOO 9 (0.09), LONG 7 (0.07), TALL 7 (0.07), SPOTS 5 (0.05), LONG NECK 4 (0.04), AFRICA 3 (0.03), ELEPHANT 2 (0.02), HIPPOPOTAMUS 2 (0.02), LEGS 2 (0.02), ...
From the Edinburgh Word Association Thesaurus, http://www.eat.rl.ac.uk/
Thesaurus-based Word Similarity
• Simplest approach: path length
• Better approach: weighted links
 – Use corpus stats to get probabilities of nodes
• Refinement: use info content of LCS:
 2 · log P(g.f.) / (log P(hill) + log P(coast)) = 0.59
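The refinement above is Lin’s (1998) information-content similarity, where “g.f.” is presumably the least common subsumer of hill and coast (geological formation). A minimal sketch; the probability estimates are illustrative assumptions chosen to reproduce the slide’s 0.59, not values from the slides themselves:

```python
import math

def lin_similarity(p_lcs, p_a, p_b):
    # Lin's info-content similarity: 2 * log P(LCS) / (log P(a) + log P(b)).
    # All three probabilities are corpus estimates of concept frequency.
    return 2 * math.log(p_lcs) / (math.log(p_a) + math.log(p_b))

# Illustrative (assumed) probability estimates:
p_geological_formation = 0.00176
p_hill = 0.0000189
p_coast = 0.0000216
print(round(lin_similarity(p_geological_formation, p_hill, p_coast), 2))  # 0.59
```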
Distributional Word Similarity
• Determine similarity of words by their distribution in a corpus
 – “You shall know a word by the company it keeps!” (Firth, 1957)
• E.g.: 100k-dimension vector, “1” if word occurs within “2 lines”:
• “Who is my neighbor?” Which functions?
Who is my neighbor?
• Linear window? 1–500 words wide. Or whole document. Remove stopwords?
• Use dependency-parse relations? More expensive, but maybe better relatedness.
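A minimal sketch of the linear-window option: for each word, count the words seen within ±2 tokens. The toy corpus and function name are invented for illustration:

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    # Map each word to a Counter over words seen within +/-window tokens:
    # a toy, count-valued stand-in for the slide's huge binary vectors.
    vecs = defaultdict(Counter)
    for tokens in sentences:
        for i, w in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vecs[w][tokens[j]] += 1
    return vecs

corpus = [["the", "cat", "drinks", "milk"],
          ["the", "dog", "drinks", "water"]]
vecs = cooccurrence_vectors(corpus)
print(vecs["drinks"]["milk"])  # 1: "milk" occurs once within 2 tokens of "drinks"
```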
Weights vs. just counting
• Weight the counts by the a priori chance of co-occurrence
• Pointwise Mutual Information (PMI)
• Objects of drink:
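PMI compares observed co-occurrence to what independence would predict: PMI(w, c) = log2 [P(w, c) / (P(w) · P(c))]. A minimal sketch with invented counts (the slide’s “objects of drink” table is a figure and is not reproduced here):

```python
import math

def pmi(count_wc, count_w, count_c, total):
    # Pointwise mutual information estimated from co-occurrence counts:
    # log2 of how much more often w and c co-occur than chance predicts.
    p_wc = count_wc / total
    p_w = count_w / total
    p_c = count_c / total
    return math.log2(p_wc / (p_w * p_c))

# Toy counts (invented): the pair co-occurs 40x more than chance.
print(round(pmi(count_wc=8, count_w=20, count_c=10, total=1000), 2))  # 5.32
```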
Distance between vectors
• Compare sparse high-dimensional vectors
 – Normalize for vector length
• Just use vector cosine?
• Several other functions come from the IR community
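Cosine with length normalization, sketched for sparse vectors stored as dicts; the toy feature weights are invented:

```python
import math

def cosine(u, v):
    # Cosine of two sparse vectors stored as {feature: weight} dicts.
    # Dividing by the vector lengths keeps frequent words from dominating.
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

beer = {"drink": 3.0, "pub": 2.0}
wine = {"drink": 2.0, "glass": 1.0}
print(round(cosine(beer, wine), 2))  # 0.74
```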
Lots of functions to choose from
Distributionally Similar Words
Rum: vodka, cognac, brandy, whisky, liquor, detergent, cola, gin, lemonade, cocoa, chocolate, scotch, noodle, tequila, juice
Write: read, speak, present, receive, call, release, sign, offer, know, accept, decide, issue, prepare, consider, publish
Ancient: old, modern, traditional, medieval, historic, famous, original, entire, main, indian, various, single, african, japanese, giant
Mathematics: physics, biology, geology, sociology, psychology, anthropology, astronomy, arithmetic, geography, theology, hebrew, economics, chemistry, scripture, biotechnology
(From an implementation of the method described in Lin, 1998, Automatic Retrieval and Clustering of Similar Words, COLING-ACL. Trained on newswire text.)
Recent events (2013-now)
• RNNs (Recurrent Neural Networks) as another way to get feature vectors
 – Hidden weights accumulate fuzzy info on words in the neighborhood
 – The set of hidden weights is used as the vector!
RNNs
From openi.nlm.nih.gov
Recent events (2013-now)
• Composition by multiplying (etc.)
 – Mikolov et al. (2013): “king – man + woman = queen” (!?)
 – CCG with vectors as NP semantics, matrices as verb semantics (!?)
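The vector-arithmetic claim can be illustrated with tiny hand-made vectors. The 2-d “embeddings” below are invented purely for illustration, not real word2vec output, which has hundreds of dimensions:

```python
import math

def cos(u, v):
    # Cosine similarity of two dense vectors given as lists of floats.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest(target, vocab, exclude):
    # The vocabulary word whose vector is most cosine-similar to target,
    # skipping the query words themselves (standard analogy evaluation).
    return max((w for w in vocab if w not in exclude),
               key=lambda w: cos(vocab[w], target))

# Toy 2-d "embeddings": axis 0 ~ gender, axis 1 ~ royalty (invented values).
emb = {"king": [1.0, 1.0], "queen": [-1.0, 1.0],
       "man": [1.0, 0.0], "woman": [-1.0, 0.0],
       "prince": [1.0, 1.0]}
target = [k - m + w for k, m, w in
          zip(emb["king"], emb["man"], emb["woman"])]
print(nearest(target, emb, exclude={"king", "man", "woman"}))  # queen
```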
Semantic Processing [2]
Semantic Cases / Thematic Roles
• Developed in the late 1960s and 1970s
• Postulate a limited set of abstract semantic relationships between a verb & its arguments: thematic roles or case roles
• In some sense, part of the verb’s semantics
Thematic Role example
• John broke the window with the hammer
 – John: AGENT role; window: THEME role; hammer: INSTRUMENT role
• Extend LF notation to use semantic roles
Thematic Roles
• Is there a precise way to define the meaning of AGENT, THEME, etc.?
• By definition:
 – “The AGENT is an instigator of the action described by the sentence.”
• Testing via sentence rewrite:
 – John intentionally broke the window
 – *The hammer intentionally broke the window
Thematic Roles [2]
• THEME
 – Describes the primary object undergoing some change or being acted upon
 – For transitive verb X, “what was Xed?”
 – The gray eagle saw the mouse: “What was seen?” (A: the mouse)
Breaking, Eating, Opening
• John broke the window. • The window broke. • John is always breaking things.
• We ate dinner. • We already ate. • The pies were eaten up quickly.
• Open up! • Someone left the door open. • John opens the window at night.
breaker, broken thing, breaking frequency?
eater, eaten thing, eating speed?
opener, opened thing, opening time?
Can We Generalize?
• Thematic roles describe general patterns of participants in generic events.
• This gives us a kind of shallow, partial semantic representation.
• First proposed by Panini, before 400 BC!
Thematic Roles
Role: definition (example)
• Agent: volitional causer of the event (The waiter spilled the soup.)
• Force: non-volitional causer of the event (The wind blew the leaves around.)
• Experiencer: experiencer of the event (Mary has a headache.)
• Theme: most directly affected participant (Mary swallowed the pill.)
• Result: end-product of an event (We constructed a new building.)
• Content: proposition of a propositional event (Mary knows you hate her.)
• Instrument: instrument used in the event (You shot her with a pistol.)
• Beneficiary: beneficiary of the event (I made you a reservation.)
• Source: origin of a transferred thing (I flew in from Pittsburgh.)
• Goal: destination of a transferred thing (Go to hell!)
Thematic Grid or Case Frame
• Example: break
 – The child broke the vase. <agent theme> (agent = subj, theme = obj)
 – The child broke the vase with a hammer. <agent theme instr> (agent = subj, theme = obj, instr = PP)
 – The hammer broke the vase. <theme instr> (theme = obj, instr = subj)
 – The vase broke. <theme> (theme = subj)
The Thematic Grid or Case Frame shows:
• How many arguments the verb has
• What roles the arguments have
• Where to find each argument
• For example, you can find the agent in the subject position
Diathesis Alternation: a change in the number of arguments or the grammatical relations associated with each argument
• Chris gave a book to Dana. <agent theme goal> (agent = subj, theme = obj, goal = PP)
• A book was given to Dana by Chris. <agent theme goal> (agent = PP, theme = subj, goal = PP)
• Chris gave Dana a book. <agent theme goal> (agent = subj, theme = obj2, goal = obj)
• Dana was given a book by Chris. <agent theme goal> (agent = PP, theme = obj, goal = subj)
The Trouble With Thematic Roles
• They are not formally defined.
• They are overly general.
• “agent verb theme with instrument” and “instrument verb theme” ...
 – The cook opened the jar with the new gadget. → The new gadget opened the jar.
 – Susan ate the sliced banana with a fork. → #The fork ate the sliced banana.
Two Datasets
• Proposition Bank (PropBank): verb-specific thematic roles
• FrameNet: “frame”-specific thematic roles
• These are lexicons containing case frames / thematic grids for each verb.
Proposition Bank (PropBank)
• A set of verb-sense-specific “frames” with informal English glosses describing the roles
• Conventions for labeling optional modifier roles
• The Penn Treebank is labeled with those verb-sense-specific semantic roles.
“Agree” in PropBank
• arg0: agreer
• arg1: proposition
• arg2: other entity agreeing
• The group agreed it wouldn’t make an offer.
• Usually John agrees with Mary on everything.
“Fall (move downward)” in PropBank
• arg1: logical subject, patient, thing falling
• arg2: extent, amount fallen
• arg3: starting point
• arg4: ending point
• argM-loc: medium
• Sales fell to $251.2 million from $278.8 million.
• The average junk bond fell by 4.2%.
• The meteor fell through the atmosphere, crashing into Cambridge.
FrameNet
• FrameNet is similar, but abstracts from specific verbs, so that semantic frames are first-class citizens.
• For example, there is a single frame called change_position_on_a_scale.
change_position_on_a_scale
Oil rose in price by 2%.
It has increased to having them 1 day a month.
Microsoft shares fell to 7 5/8.
Colon cancer incidence fell by 50% among men.
Many words, not just verbs, share the same frame:
Verbs: advance, climb, decline, decrease, diminish, dip, double, drop, dwindle, edge, explode, fall, fluctuate, gain, grow, increase, jump, move, mushroom, plummet, reach, rise, rocket, shift, skyrocket, slide, soar, swell, swing, triple, tumble
Nouns: decline, decrease, escalation, explosion, fall, fluctuation, gain, growth, hike, increase, rise, shift, tumble
Adverb: increasingly
Conversely, one word has many frames. Example: rise
• Change-position-on-a-scale: Oil ROSE in price by two percent.
• Change-posture: a Protagonist changes the overall position or posture of a body.
 – Source: starting point of the change of posture.
 – Charles ROSE from his armchair.
• Get-up: A Protagonist leaves the place where they have slept, their Bed, to begin or resume domestic, professional, or other activities. Getting up is distinct from Waking up, which is concerned only with the transition from the sleeping state to a wakeful state.
 – I ROSE from bed, threw on a pair of camouflage shorts and drove my little Toyota Corolla to a construction clearing a few miles away.
• Motion-directional: In this frame a Theme moves in a certain Direction which is often determined by gravity or other natural, physical forces. The Theme is not necessarily a self-mover.
 – The balloon ROSE upward.
• Sidereal-appearance: An Astronomical_entity comes into view above the horizon as part of a regular, periodic process of (apparent) motion of the Astronomical_entity across the sky. In the case of the sun, the appearance begins the day.
 – At the time of the new moon, the moon RISES at about the same time the sun rises, and it sets at about the same time the sun sets.
 – Each day the sun's RISE offers us a new day.
FrameNet
• Frames are not just for verbs! (The verb, noun, and adverb lists above all share one frame.)
FrameNet
• Includes inheritance and causation relationships among frames.
• Examples included, but little fully-annotated corpus data.
SemLink
• It would be really useful if these different resources were interconnected in a useful way.
• The SemLink project is (was?) trying to do that
• The Unified Verb Index (UVI) connects:
 – PropBank
 – VerbNet
 – FrameNet
 – WordNet/OntoNotes
Semantic Role Labeling
• Input: sentence
• Output: for each predicate*, labeled spans identifying each of its arguments.
• Example: [agent The batter] hit [patient the ball] [time yesterday]
• Somewhere between syntactic parsing and full-fledged compositional semantics.
* Predicates are sometimes identified in the input, sometimes not.
But wait. How is this different from dependency parsing?
• Semantic role labeling
 – [agent The batter] hit [patient the ball] [time yesterday]
• Dependency parsing
 – [subj The batter] hit [obj the ball] [mod yesterday]
1. These are not the same task.
2. Semantic role labeling is much harder.
Subject vs. agent
• Subject is a grammatical relation
• Agent is a semantic role
• In English, a subject has these properties:
 – It comes before the verb
 – If it is a pronoun, it is in nominative case (in a finite clause)
  • I/he/she/we/they hit the ball.
  • *Me/him/her/us/them hit the ball.
 – If the verb is in present tense, it agrees with the subject
  • She/he/it hits the ball.
  • I/we/they hit the ball.
  • *She/he/it hit the ball.
  • *I/we/they hits the ball.
  • I hit the ball.
  • I hit the balls.
Subject vs. agent
• In the most typical sentences (for some definition of “typical”), the agent is the subject:
 – The batter hit the ball.
 – Chris opened the door.
 – The teacher gave books to the students.
• Sometimes the agent is not the subject:
 – The ball was hit by the batter.
 – The balls were hit by the batter.
• Sometimes the subject is not the agent:
 – The door opened.
 – The key opened the door.
 – The students were given books.
 – Books were given to the students.
Semantic Role Labeling
• Input: sentence
• Output: segmentation into roles, with labels
• Example from book:
 [arg0 The Examiner] issued [arg1 a special edition] [argM-tmp yesterday]
Semantic Role Labeling: How It Works
• First, parse.
• For each predicate word in the parse:
 – For each node in the parse:
  • Classify the node with respect to the predicate.
Yet Another Classification Problem!
• As before, there are many techniques (e.g., Naïve Bayes)
• Key: what features?
Features for Semantic Role Labeling
• What is the predicate?
• Phrase type of the constituent
• Head word of the constituent, its POS
• Path in the parse tree from the constituent to the predicate
• Active or passive
• Is the phrase before or after the predicate?
• Subcategorization (≈ grammar rule) of the predicate
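A minimal sketch of gathering these features for one (predicate, constituent) pair. The dict-based stand-ins for parse-tree nodes (node, tree, and their keys) are invented for illustration, not the API of any real parser:

```python
def srl_features(predicate, constituent, parse):
    # Collect the slide's feature set for one (predicate, constituent) pair.
    # `constituent` and `parse` are hypothetical dicts standing in for
    # real parse-tree structures.
    return {
        "predicate": predicate,
        "phrase_type": constituent["label"],   # e.g. NP
        "head_word": constituent["head"],
        "head_pos": constituent["head_pos"],
        "path": parse["path"],                 # e.g. NP up S down VP down VBD
        "voice": parse["voice"],               # active / passive
        "position": ("before" if constituent["start"] < parse["pred_index"]
                     else "after"),
        "subcat": parse["subcat"],             # e.g. VP -> VBD NP PP
    }

node = {"label": "NP", "head": "Examiner", "head_pos": "NNP", "start": 0}
tree = {"path": "NP^S vVP vVBD", "voice": "active",
        "pred_index": 2, "subcat": "VP -> VBD NP PP"}
feats = srl_features("issued", node, tree)
print(feats["position"])  # before
```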
Feature example
• Example sentence:
 [arg0 The Examiner] issued [arg1 a special edition] [argM-tmp yesterday]
• Arg0 features: issued, NP, Examiner, NNP, path, active, before, VP → VBD NP PP
Example
Figure 20.16: Parse tree for a PropBank sentence, showing the PropBank argument labels. The dotted line shows the path feature NP ↑ S ↓ VP ↓ VBD for ARG0, the NP-SBJ constituent The San Francisco Examiner.
Additional Issues
• Initial filtering of non-arguments
• Using chunking or partial parsing instead of full parsing
• Enforcing consistency (e.g., non-overlap, only one arg0)
• Phrasal verbs, support verbs / light verbs
 – take a nap: verb take is syntactic head of VP, but predicate is napping, not taking
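The consistency constraints above can be enforced with a simple greedy decoder over scored candidate spans. This is only one plausible scheme, sketched with invented scores:

```python
def enforce_consistency(candidates):
    # Greedy decoding: accept the highest-scoring role spans subject to
    # the slide's constraints (no overlapping spans, at most one arg0).
    # Each candidate is (score, start, end, role) over token indices.
    chosen, used = [], set()
    have_arg0 = False
    for score, start, end, role in sorted(candidates, reverse=True):
        span = set(range(start, end))
        if span & used:
            continue                      # overlaps an already-accepted span
        if role == "arg0" and have_arg0:
            continue                      # a second arg0 is not allowed
        chosen.append((start, end, role))
        used |= span
        have_arg0 = have_arg0 or role == "arg0"
    return sorted(chosen)

# Invented candidate spans with classifier scores:
cands = [(0.9, 0, 2, "arg0"), (0.8, 3, 5, "arg1"),
         (0.7, 1, 3, "arg0"), (0.6, 5, 6, "argM-tmp")]
print(enforce_consistency(cands))  # [(0, 2, 'arg0'), (3, 5, 'arg1'), (5, 6, 'argM-tmp')]
```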
Two datasets, two systems
• Example from book uses PropBank
• Locally-developed system SEMAFOR works on the SemEval problem, based on FrameNet
PropBank vs. FrameNet
Shallow approaches to deep problems
• For many problems:
 – Shallow approaches much easier to develop
  • As in, possible at all for unlimited vocabularies
 – Not wonderful performance yet
  • Sometimes claimed to help a particular system, but often doesn’t seem to help
 – Definitions are not crisp
  • There clearly is something there, but the granularity of the distinctions is very problematic
• Deep Learning will fix everything?
Questions?