Issues in Computational Linguistics: Semantics
Post on 12-Jan-2016
Issues in Computational Linguistics: Semantics
Dick Crouch & Tracy King
Overview
 What is semantics?
– Aims & challenges of syntax-semantics interface
 Introduction to Glue Semantics
– Linear logic for meaning assembly
 Topics in Glue
– The glue logic
– Quantified NPs
– Type raising & intensional verbs
– Coordination
– Control
– Skeletons and modifiers
What is Semantics?
 Traditional Definition:
– Study of logical relations between sentences
 Formal Semantics:
– Map sentences onto logical representations, making relations explicit
 Computational Semantics:
– Algorithms for inference/knowledge-based applications
All men are mortal
Socrates is a man
Socrates is mortal

∀x. man(x) → mortal(x)
man(socrates)
mortal(socrates)
Logical & Collocational Semantics
 Logical Semantics
– Map sentences to logical representations of meaning
– Enables inference & reasoning
 Collocational semantics
– Represent word meanings as feature vectors
– Typically obtained by statistical corpus analysis
– Good for indexing, classification, language modeling, word sense disambiguation
– Currently does not enable inference
Complementary, not conflicting, approaches
What does semantics have that f-structure doesn't?
 Repackaged information, e.g:
– Logical formulas instead of AVMs
– Adjuncts wrap around modifiees
Extra information, e.g:– Aspectual decomposition of events
break(e,x,y) & functional(y,start(e)) & functional(y,end(e)) – Argument role assignments
break(e) & cause_of_change(e,x) & object_of_change(e,y)
 Extra ambiguity, e.g:
– Scope
– Modification of semantic event decompositions
e.g. Ed was observed putting up a deckchair for 5 minutes
Semantics (logical form):
 ∃w. wire(w) & w=part25 &
 ∃t. interval(t) & t<now &
 ∃e. break_event(e) & occurs_during(e,t) & object_of_change(e,w) &
 ∃c. cause_of_change(e,c)
Example Semantic Representation
F-structure gives basic predicate-argument structure, but lacks:
– Standard logical machinery (variables, connectives, etc)
– Implicit arguments (events, causes)
– Contextual dependencies (the wire = part25)
Mapping from f-structure to logical form is systematic, but can introduce ambiguity (not illustrated here)
The wire broke
Syntax (f-structure):
 [ PRED   break<SUBJ>
   SUBJ   [ PRED wire, SPEC def, NUM sg ]
   TENSE  past ]
Mapping sentences to logical forms
Borrow ideas from compositional compilation of programming languages (with adaptations)
 Computer Program —(parse, compile)→ Object Code → Execution
 NL Utterance —(parse, interpret)→ Logical Form → Inference
The Challenge to Compositionality: Ambiguity & context dependence
 Strict compositionality (e.g. Montague)
– Meaning is a function of (a) syntactic structure, (b) lexical choice, and (c) nothing else
– Implies that there should be no ambiguity in absence of syntactic or lexical ambiguity
 Counter-examples? (no syntactic or lexical ambiguity)
– Contextual ambiguity
  » John came in. He sat down. So did Bill.
– Semantic ambiguity
  » Every man loves a woman.
  » Put up a deckchair for 5 minutes
  » Pets must be carried on escalator
  » Clothes must be worn in public
Semantic Ambiguity
 Syntactic & lexical ambiguity in formal languages
– Practical problem for program compilation
  » Picking the intended interpretation
– But not a theoretical problem
  » Strict compositionality generates alternate meanings
 Semantic ambiguity a theoretical problem, leading to
– Ad hoc additions to syntax (e.g. Chomskyan LF)
– Ad hoc additions to semantics (e.g. underspecification)
– Ad hoc additions to interface (e.g. quantifier storage)
Weak Compositionality
 Weak compositionality
– Meaning of the whole is a function of (a) the meaning of its parts, and (b) the way those parts are combined
– But (a) and (b) are not completely fixed by lexical choice and syntactic structure, e.g.
  » Pronouns: incomplete lexical meanings
  » Quantifier scope: combination not fixed by syntax
 Glue semantics
– Gives formally precise account of weak compositionality
Modular Syntax-Semantics Interfaces
Different grammatical formalisms – LFG, HPSG, Categorial grammar, TAG, minimalism, …
Different semantic formalisms– DRT, Situation semantics, Intensional logic, …
Need for modular syntax-semantics interface– Pair different grammatical & semantic formalisms
 Possible modular frameworks
– Montague's use of lambda-calculus
– Unification-based semantics
– Glue semantics (interpretation as deduction)
Some Claims
 Glue is a general approach to the syntax-semantics interface
– Alternative to unification-based semantics, Montagovian λ-calculus
Glue addresses semantic ambiguity/weak compositionality
Glue addresses syntactic & semantic modularity
(Glue may address context dependence & update)
Glue Semantics (Dalrymple, Lamping & Saraswat 1993 and subsequently)
Syntax-semantics mapping as linear logic inference
 Two logics in semantics:
– Meaning Logic (target semantic representation): any suitable semantic representation
– Glue Logic (deductively assembles target meaning): fragment of linear logic
Syntactic analysis produces lexical glue premises
Semantic interpretation uses deduction to assemble final meaning from these premises
Linear Logic
 Influential development in theoretical computer science (Girard 87)
 Premises are resources consumed in inference
 (Traditional logic: premises are non-resourced)
• Linguistic processing typically resource sensitive: words used exactly once

 Traditional                          Linear
 A, A→B |= B                          A, A –o B |= B
 A, A→B |= A & B    (A re-used)       A, A –o B ⊭ A ⊗ B    (A consumed)
 A, B |= B          (A discarded)     A, B ⊭ B             (cannot discard A)
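The resource discipline in the table above can be sketched in a few lines of Python. This is an illustrative toy, not code from the slides; the name `linear_mp` and the pair-encoding of implications are my own assumptions. Premises form a multiset, and linear modus ponens consumes both the implication and its argument.

```python
from collections import Counter

def linear_mp(premises, a, b):
    """Linear modus ponens: consume one A and one A -o B (encoded as
    the pair (a, b)) from the premise multiset, producing one B."""
    bag = Counter(premises)
    if bag[a] < 1 or bag[(a, b)] < 1:
        raise ValueError("missing resource: linear premises cannot be re-used")
    bag[a] -= 1        # the argument is consumed, not re-used
    bag[(a, b)] -= 1   # the implication is consumed too
    bag[b] += 1        # the conclusion is the only new resource
    return list(bag.elements())

# 'g' and g -o f derive 'f', and neither premise survives:
print(linear_mp(['g', ('g', 'f')], 'g', 'f'))   # -> ['f']
```

Contrast with traditional logic, where the premise bag would be unchanged and `f` merely added: here re-running the same step fails, because the resources are gone.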
Glue Interpretation (Outline)
 Parsing sentence instantiates lexical entries to produce lexical glue premises
 Example lexical premise (verb "saw" in "John saw Fred"):
   see : g –o (h –o f)
   (Meaning Term : Glue Formula)
   see: 2-place predicate; g, h, f: constituents in parse
   "consume meanings of g and h to produce meaning of f"
• Glue derivation: premise1, …, premisen |= M : f
• Consume all lexical premises to produce meaning, M, for entire sentence, f
Glue Interpretation: Getting the premises
Syntactic Analysis:

 [S [NP John] [VP [V saw] [NP Fred]]]

 f: [ PRED  see
      SUBJ  g: [ PRED John ]
      OBJ   h: [ PRED Fred ] ]

Lexicon:
 John  NP  john: ↑
 Fred  NP  fred: ↑
 saw   V   see: (↑ SUBJ) –o ((↑ OBJ) –o ↑)

Premises:
 john: g   fred: h   see: g –o (h –o f)
Glue Interpretation: Deduction with premises
 Premises
  john: g   fred: h   see: g –o (h –o f)

 Linear Logic Derivation (using linear modus ponens)
  g –o (h –o f)   g
  ⇒ h –o f   h
  ⇒ f

 Derivation with Meaning Terms
  see: g –o (h –o f)   john: g
  ⇒ see(john): h –o f   fred: h
  ⇒ see(john)(fred): f

 Linear modus ponens = function application
  Fun: g –o f   Arg: g
  ⇒ Fun(Arg): f
Modus Ponens = Function Application: The Curry-Howard Isomorphism
 Curry-Howard Isomorphism: pairs LL inference rules with operations on meaning terms
 Propositional linear logic inference constructs meanings
 LL inference completely independent of meaning language (modularity of meaning representation)
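The Curry-Howard pairing for the "John saw Fred" derivation can be sketched as follows (an illustrative toy, not the authors' implementation; `apply_premise` and the tuple encoding are my assumptions). Glue formulas are strings or pairs; meaning terms are Python functions, and modus ponens on formulas is literally function application on meanings.

```python
def apply_premise(fun_premise, arg_premise):
    """(M : A -o B) applied to (N : A) yields (M(N) : B).
    A premise is a pair (meaning, glue_formula); an implication
    formula A -o B is encoded as the pair (A, B)."""
    fun, (a, b) = fun_premise
    arg, a2 = arg_premise
    assert a == a2, "glue formulas must match for modus ponens"
    return (fun(arg), b)

# Lexical premises for "John saw Fred":
see = lambda x: lambda y: f"see({x},{y})"
p_see = (see, ("g", ("h", "f")))     # see  : g -o (h -o f)
p_john = ("john", "g")               # john : g
p_fred = ("fred", "h")               # fred : h

step1 = apply_premise(p_see, p_john)   # see(john) : h -o f
step2 = apply_premise(step1, p_fred)   # see(john)(fred) : f
print(step2)                           # -> ('see(john,fred)', 'f')
```

The meaning side never inspects the glue side, mirroring the modularity point above: any meaning language with function application would do.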
Semantic Ambiguity: Multiple derivations from single set of premises
Alleged criminal from London

 f: [ PRED  criminal
      ADJS  { alleged, from London } ]

Premises
 criminal: f
 alleged: f -o f
 from-London: f -o f
Two distinct derivations:
1. from-London(alleged(criminal))
2. alleged(from-London(criminal))
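The two derivations fall out of permuting the modifier applications. A small sketch (illustrative only; the string encoding of meanings is an assumption) enumerates them:

```python
from itertools import permutations

skeleton = "criminal"                     # criminal : f
modifiers = ["alleged", "from-London"]    # each of glue type f -o f

readings = set()
for order in permutations(modifiers):
    meaning = skeleton
    for m in order:                       # apply the f -o f modifiers in this order
        meaning = f"{m}({meaning})"
    readings.add(meaning)

for r in sorted(readings):
    print(r)
# -> alleged(from-London(criminal))
# -> from-London(alleged(criminal))
```

With N modifiers of type f -o f this loop visits N! orders, which is exactly the combinatorial explosion the packing slides below address.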
Quantifier Scope Ambiguity
 Every cable is attached to a base-plate
– Has 2 distinct readings
  ∀x. cable(x) → ∃y. plate(y) & attached(x,y)
  ∃y. plate(y) & ∀x. cable(x) → attached(x,y)
 Quantifier scope ambiguity accounted for by mechanism just shown
– Multiple derivations from single set of premises
– More on this later
Semantic Ambiguity & Modifiers
 Multiple derivations from single premise set
– Arises through different ways of permuting α –o α modifiers around an α skeleton
 Modifiers given formal representation in glue as α –o α logical identities
– E.g. an adjective is a noun –o noun modifier
 Modifiers prevalent in natural language, and lead to combinatorial explosion
– Given N α –o α modifiers, N! ways of permuting them around an α skeleton
Packing & Ambiguity Management
 Exploit explicit skeleton-modifier structure of glue derivations to implement efficient theorem provers that manage combinatorial explosion
– Packing of N! analyses
  » Represent all N! analyses in polynomial space
  » Compute representation in polynomial time
  » Read off any given analysis in linear time
– Packing through structure re-use
  » N! analyses through combinations of N sub-analyses
  » Compute each sub-analysis once, and re-use
Combine with packed output from XLE
Summary
 Glue: semantic interpretation as (linear logic) deduction
– Syntactic analysis yields lexical glue premises
– Standard inference combines premises to construct sentence meaning
 Resource sensitivity of linear logic reflects resource sensitivity of semantic interpretation
 Gives modular & general syntax-semantics interface
 Models semantic ambiguity / weak compositionality
 Leads to efficient implementations
Topics in Glue
 The glue logic
 Quantified NPs and scope ambiguity
 Type raising and intensionality
 Coordination
 Control
 Why glue is a good computational theory
Two Rules of Inference
 Modus ponens / –o elimination:
   a   a –o b
   ⇒ b
  With meaning terms:
   A: a   F: a –o b
   ⇒ F(A): b
  F is a function of type a –o b that takes arguments of type a to give results of type b

 Hypothetical reasoning / –o introduction:
   [a] … b
   ⇒ a –o b   (discharging the assumption)
  Assume a and thus prove b; conclude a implies b
  With meaning terms:
   [x: a] … F(x): b
   ⇒ λx.F(x): a –o b
  Have shown that there is some function taking arguments, x, of type a, to give results, F(x), of type b. Call this function λx.F(x), of type a –o b
λ-terms describe propositional proofs
 Intimate relation between λ-calculus and propositional inference (Curry-Howard)
– λ-terms are descriptions of proofs
– Equivalent λ-terms mean equivalent proofs

 A roundabout proof of f from g –o f and g:
  [x: g]   F: g –o f
  ⇒ F(x): f
  ⇒ λx.F(x): g –o f   (discharging [x: g])
  A: g
  ⇒ (λx.F(x))(A): f

 A direct proof of f from g –o f and g:
  A: g   F: g –o f
  ⇒ F(A): f

 By λ-reduction: (λx.F(x))(A) = F(A)
Digression: Structured Meanings
 Glue proofs as an intermediate level of structure in semantic theory
– Identity conditions given by λ-equivalence
– Used to explore notions of semantic parallelism (Asudeh & Crouch)
 Unlike Montague semantics
– MS allows nothing between syntax and model theory
– Logical formulas are not linguistic structures; cannot build theories off arbitrary aspects of their notation
 Unlike Minimal Recursion Semantics
– MRS uses partial descriptions of logical formulas
– A theory built off aspects of logical notation
Two kinds of semantic resource
 Some nodes, n, in f-structure give rise to entity-denoting semantic resources, e(n)
– e(n) is a proposition stating that n has an entity-denoting resource
 Other nodes, n, give rise to proposition/truth-value denoting semantic resources, t(n)
– t(n) is a proposition stating that n has a truth-denoting resource
 Notational convenience:
– Write e(n) as ne, or just n (when kind of resource is unimportant)
– Write t(n) as nt, or just n (when kind of resource is unimportant)
Variables over f-structure nodes
 The glue logic allows universal quantification over f-structure nodes, e.g. ∀N. (e(g) –o t(N)) –o t(N)
– Important for dealing with quantified NPs
 But the logic is still essentially propositional
– Quantification allows matching of variable propositions with atomic propositions, e.g. t(N) with t(f)
 Notational Convenience:
– Drop explicit quantifiers, and write variables over nodes as upper case letters, e.g. (ge –o Nt) –o Nt
Non-Quantified and Quantified NPs
John sleeps:

 f: [ PRED  sleep
      SUBJ  g: [ PRED John ] ]

 Premises: sleep: ge –o ft   john: ge
 Derivation:
  john: g   sleep: g –o f
  ⇒ sleep(john): f

Everyone sleeps:

 f: [ PRED  sleep
      SUBJ  g: [ PRED everyone, QUANT + ] ]

 Premises: sleep: ge –o ft   everyone: (ge –o Xt) –o Xt
 Derivation:
  sleep: ge –o ft   everyone: (ge –o Xt) –o Xt
  ⇒ everyone(sleep): ft

 everyone = λP.∀x. person(x) → P(x)
 everyone(sleep) = λP.∀x. person(x) → P(x) [sleep]
                 = ∀x. person(x) → sleep(x)
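Over a finite domain the type-raised quantifier meaning can be run directly. A sketch (the domain and the extensions of person/sleep are illustrative assumptions, not from the slides):

```python
DOMAIN = ["john", "mary", "rex"]   # toy universe of individuals
person = {"john", "mary"}          # rex is not a person
sleeps = {"john", "mary", "rex"}

# everyone = λP.∀x. person(x) → P(x), as a function over the finite domain
everyone = lambda P: all((x not in person) or P(x) for x in DOMAIN)

sleep = lambda x: x in sleeps              # sleep : ge -o ft
print(everyone(sleep))                     # -> True
print(everyone(lambda x: x == "mary"))     # -> False (john is a person, but not mary)
```

Note the direction of application: just as in the derivation above, the quantifier consumes the verb meaning, not the other way round.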
Quantifier Scope Ambiguity: Two derivations

 f: [ PRED  see
      SUBJ  g: [ PRED everyone ]
      OBJ   h: [ PRED someone ] ]

 Premises:
  see: g –o h –o f
  everyone: (g –o X) –o X
  someone: (h –o Y) –o Y

 Common core (hypothetical reasoning with [x: g] and [y: h]):
  see: g –o h –o f   [x: g]
  ⇒ see(x): h –o f   [y: h]
  ⇒ see(x,y): f

 Derivation 1 (glue formulas only): discharge g and apply (g –o X) –o X, then h:
  g –o f   (g –o X) –o X  ⇒  f
  h –o f   (h –o Y) –o Y  ⇒  f

 Derivation 2: discharge h and apply (h –o Y) –o Y, then g:
  h –o f   (h –o Y) –o Y  ⇒  f
  g –o f   (g –o X) –o X  ⇒  f
Quantifier Scope Ambiguity: Two derivations

 f: [ PRED see, SUBJ g: [ PRED everyone ], OBJ h: [ PRED someone ] ]
 Premises: see: g –o h –o f   everyone: (g –o X) –o X   someone: (h –o Y) –o Y

 Common core:
  see: g –o h –o f   [x: g]   ⇒   see(x): h –o f   [y: h]   ⇒   see(x,y): f

 Derivation 1:
  see(x,y): f
  λx.see(x,y): g –o f   everyone: (g –o X) –o X
  ⇒ everyone(λx.see(x,y)): f
  λy.everyone(λx.see(x,y)): h –o f   someone: (h –o Y) –o Y
  ⇒ someone(λy.everyone(λx.see(x,y))): f

 Derivation 2:
  see(x,y): f
  λy.see(x,y): h –o f   someone: (h –o Y) –o Y
  ⇒ someone(λy.see(x,y)): f
  λx.someone(λy.see(x,y)): g –o f   everyone: (g –o X) –o X
  ⇒ everyone(λx.someone(λy.see(x,y))): f
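That the two derivations yield genuinely different readings can be checked on a small model (the domain and the `saw` relation are illustrative assumptions):

```python
DOMAIN = [1, 2, 3]                  # toy individuals
saw = {(1, 2), (2, 2), (3, 3)}      # pairs (x, y) meaning "x saw y"

# everyone(λx.someone(λy.see(x,y))):  ∀x ∃y. see(x,y)
everyone_wide = all(any((x, y) in saw for y in DOMAIN) for x in DOMAIN)
# someone(λy.everyone(λx.see(x,y))):  ∃y ∀x. see(x,y)
someone_wide = any(all((x, y) in saw for x in DOMAIN) for y in DOMAIN)

print(everyone_wide, someone_wide)   # -> True False
```

Everyone saw someone in this model (1 and 2 saw 2, 3 saw 3), but no single individual was seen by everyone, so exactly one of the two scopings is true.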
No Additional Scoping Machinery
Scope ambiguities arise simply through application of the two standard rules of inference for implication
Glue theorem prover automatically finds all possible derivations / scopings
Very simple and elegant account of scope variation.
Type Raising and Intensionality
 Intensional verbs (seek, want, dream about)
– Do not take entities as arguments
  * ∃x. unicorn(x) & seek(ed, x)
– But rather quantified NP denotations
  seek(ed, λP.∃x. unicorn(x) & P(x))
 Glue lexical entry for seek
  λxλQ. seek(x,Q): (↑ SUBJ) –o (((↑ OBJ) –o Nt) –o Nt) –o ↑
  (x: subject entity; Q: object quantifier; ↑: clause meaning)
Ed seeks a unicorn

 f: [ PRED  seek
      SUBJ  g: [ PRED Ed ]
      OBJ   h: [ PRED a unicorn ] ]

 Premises:
  ed: g
  λP.∃x. unicorn(x) & P(x) : (h –o X) –o X
  λxλQ. seek(x,Q): g –o ((h –o Y) –o Y) –o f

 Derivation (without meanings):
  g   g –o ((h –o Y) –o Y) –o f
  ⇒ ((h –o Y) –o Y) –o f   (h –o X) –o X
  ⇒ f

 Derivation (with meanings):
  ed: g   λxλQ.seek(x,Q): g –o ((h –o Y) –o Y) –o f
  ⇒ λQ.seek(ed,Q): ((h –o Y) –o Y) –o f   λP.∃x. unicorn(x) & P(x): (h –o X) –o X
  ⇒ seek(ed, λP.∃x. unicorn(x) & P(x)): f
Ed seeks Santa Claus

 f: [ PRED seek, SUBJ g: [ PRED Ed ], OBJ h: [ PRED Santa ] ]
 Premises: ed: g   santa: h   λxλQ. seek(x,Q): g –o ((h –o Y) –o Y) –o f

 Looks problematic
– "seek" expects a quantifier from its object
– But we only have a proper name
 Traditional solution (Montague)
– Uniformly give all proper names a more complicated, type-raised, quantifier-like semantics
  λP.P(santa) : (h –o X) –o X
 Glue doesn't force you to do this
– Or rather, it does it for you
Type Raising in Glue
 Propositional tautology: h |– (h –o X) –o X
  h   [h –o X]
  ⇒ X
  ⇒ (h –o X) –o X

 With meaning terms:
  santa: h   [P: h –o X]
  ⇒ P(santa): X
  ⇒ λP. P(santa): (h –o X) –o X
Ed seeks Santa Claus

 f: [ PRED seek, SUBJ g: [ PRED Ed ], OBJ h: [ PRED Santa ] ]
 Premises: ed: g   santa: h   λxλQ. seek(x,Q): g –o ((h –o Y) –o Y) –o f

 Type raising sub-derivation:
  santa: h   [P: h –o X]
  ⇒ P(santa): X
  ⇒ λP. P(santa): (h –o X) –o X

 Main derivation:
  g   g –o ((h –o Y) –o Y) –o f
  ⇒ ((h –o Y) –o Y) –o f   λP. P(santa): (h –o X) –o X
  ⇒ seek(ed, λP. P(santa)): f

 Glue derivations will automatically type raise, when needed
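The type-raising step has a one-line computational content: from an entity meaning we can always build the quantifier-like meaning λP.P(santa). A sketch (the function name `type_raise` is illustrative):

```python
def type_raise(entity):
    """From santa : h, derive λP. P(santa) : (h -o X) -o X
    by hypothesizing [P : h -o X], applying it, and discharging."""
    return lambda P: P(entity)

raised_santa = type_raise("santa")                    # λP. P(santa)
# Feed it to any consumer expecting a quantifier, e.g. a property:
print(raised_santa(lambda x: f"property-of({x})"))    # -> property-of(santa)
```

This is exactly what lets the "seek" entry above consume a proper-name object without a special lexical stipulation.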
Coordination: Incorrect Treatment
 f1: [ PRED eat, SUBJ g: Ed ]   f2: [ PRED drink, SUBJ g ]

 Premises:
  ed: g
  eat: g –o f1
  drink: g –o f2
  and: f1 –o f2 –o f

 Resource deficit: There aren't enough g's to go round
Coordination: Correct Treatment
 f1: [ PRED eat, SUBJ g: Ed ]   f2: [ PRED drink, SUBJ g ]

 Premises:
  ed: g
  eat: g –o f1
  drink: g –o f2
  λP1λP2λx. P1(x) & P2(x): (g –o f1) –o (g –o f2) –o (g –o f)

 Derivation:
  λP1λP2λx. P1(x) & P2(x): (g –o f1) –o (g –o f2) –o (g –o f)   eat: g –o f1
  ⇒ λP2λx. eat(x) & P2(x): (g –o f2) –o (g –o f)   drink: g –o f2
  ⇒ λx. eat(x) & drink(x): g –o f   ed: g
  ⇒ eat(ed) & drink(ed): f
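The coordination meaning term runs as ordinary curried functions. A sketch (the name `conj` and the string encoding of meanings are illustrative assumptions):

```python
# and : (g -o f1) -o (g -o f2) -o (g -o f)
# It consumes the two consumers of g and returns a single consumer of g.
conj = lambda P1: lambda P2: lambda x: f"{P1(x)} & {P2(x)}"

eat = lambda x: f"eat({x})"        # eat   : g -o f1
drink = lambda x: f"drink({x})"    # drink : g -o f2

combined = conj(eat)(drink)        # g -o f: only one consumer of g remains
print(combined("ed"))              # -> eat(ed) & drink(ed)
```

The single resource `ed : g` is now consumed exactly once, by `combined`, resolving the deficit.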
Resolving Apparent Resource Deficits
 Deficit:
– Multiple consumers for some resource g
– But only one instance of g
 Resolution:
– Consume the consumers of g, until there is only one
Applies to coordination, and also control
Control: Apparent resource deficit
 w: [ PRED  want<SUBJ, XCOMP>
      SUBJ   e: [ PRED Ed ]
      XCOMP  s: [ PRED sleep<SUBJ>, SUBJ e ] ]

 Premises:
  want: e –o s –o w
  sleep: e –o s
  ed: e

 Resource Deficit: Not enough e's to go round
 Resolve in same way as for coordination
Control: Deficit resolved
 w: [ PRED  want<SUBJ, XCOMP>
      SUBJ   e: [ PRED Ed ]
      XCOMP  s: [ PRED sleep<SUBJ>, SUBJ e ] ]

 Premises:
  want: e –o (e –o s) –o w
  sleep: e –o s
  ed: e

 Derivation:
  ed: e   want: e –o (e –o s) –o w
  ⇒ want(ed): (e –o s) –o w   sleep: e –o s
  ⇒ want(ed, sleep): w
Does this commit you to a property analysis of control? i.e. want takes a property as its second argument
Property and/or Propositional Control
 Property Control
  λxλP. want(x,P): (↑ SUBJ) –o ((↑ SUBJ) –o (↑ XCOMP)) –o ↑
  ed: e   λxλP.want(x,P): e –o (e –o s) –o w
  ⇒ λP.want(ed,P): (e –o s) –o w   sleep: e –o s
  ⇒ want(ed, sleep): w

 Propositional Control
  λxλP. want(x, P(x)): (↑ SUBJ) –o ((↑ SUBJ) –o (↑ XCOMP)) –o ↑
  ed: e   λxλP.want(x,P(x)): e –o (e –o s) –o w
  ⇒ λP.want(ed,P(ed)): (e –o s) –o w   sleep: e –o s
  ⇒ want(ed, sleep(ed)): w
Lexical Variation in Control
Glue does not commit you to either a propositional or a property-based analysis of controlled XCOMPs (Asudeh)
The type of analysis can be lexically specified– Some verbs get property control– Some verbs get propositional control
Why Glue Makes Computational Sense
 The backbone of glue is the construction of propositional linear logic derivations
– This can be done efficiently
 Combinations of lexical meanings determined solely by this propositional backbone
– Algorithms can factor out idiosyncrasies of meaning expressions
 Search for propositional backbone can further factor out skeleton (α) from modifier (α –o α) contributions, leading to efficient free choice packing of scope ambiguities
– Work still in progress