Unification Grammars
Shuly Wintner
Department of Computer Science, University of Haifa
Haifa, Israel
L-Universita ta’ Malta, October 2008
Introduction Overview
Introduction
Grammatical formalisms: a formal, mathematical and computational model for (the structure of) natural languages
Unification grammars: a general formalism underlying various linguistic theories
Lexical Functional Grammar (LFG)
Head-driven Phrase Structure Grammar (HPSG)
some variants of CCG, TAG, ...
This course: linguistic motivation; mathematical infrastructure; linguistic applications
This is an introductory course...
The Book (Francez and Wintner, Forthcoming)
© Shuly Wintner (University of Haifa), Unification Grammars. Copyrighted material.
Plan
Syntax: the structure of natural languages
Linguistic formalisms
Constituency
Some syntactic phenomena
Context-free grammars
Basic definitions: grammars, forms, derivations, languages...
Derivation trees
Structural ambiguity
Generative capacity
CFGs and natural languages
Plan
Feature structures
Motivation
Feature graphs
Subsumption
Feature structures
Attribute-value matrices
Unification
Feature structure unification
Feature graph unification
Generalization
Plan
Extending feature structures
Multi-rooted feature graphs
Multi-rooted feature structures
Multi-AVMs
Unification grammars
Unification in context
Forms and grammar rules
Derivations
Languages
Derivation trees
Plan
Linguistic applications
Agreement
Case control
Subcategorization
Long-distance dependencies
Control
Coordination
Plan
Computational aspects
The expressivity of unification grammars
Extensions and open problems
Restricted versions of unification grammars
Typed unification grammars
Development of large-scale grammars
Grammar engineering
Syntax Linguistic formalisms
Linguistic formalisms
Syntax is the field of linguistics that studies the structure of natural languages.
Why should there be any mathematics involved with linguistic theories?
A linguistic formalism is a (formal) language, with which claims about (natural, but also formal) languages can be made.
Syntax Structure
Syntax
The underlying assumption is that languages have structure: not all sequences of words over the given alphabet are valid; and when a sequence of words is valid (grammatical), a natural structure can be induced on it.
It is useful to think of this structure as a tree (although we shall see other structures later).
Given a sentence in some language, not all possible trees define the structure that native speakers of the language intuitively recognize.
Natural languages have structure
Even though I klaw through the valley of the shadow of death, I will raef no evil
Even though I walk through the valley of the shadow of death, I will fear no evil
Even though I ordinary through the valley of the shadow of death, I will slowly no evil
Even though I slowly gaze through the valley of the shadow of death, I will unsurprisingly do no evil
Even though I walk through the valley of the shadow of death, I will fear no evil
Natural languages have structure
Natural languages are infinite:
The water put out the fire
The water put out the fire, that burned the stick
The water put out the fire, that burned the stick, that hit the dog
The water put out the fire, that burned the stick, that hit the dog,
that chased the cat
But it is possible to characterize an infinite set with finite expressions.
Natural languages have structure
Intuitively, words combine to form phrases:
(Jacob (served (seven years) (for Rachel))),
and (they seemed to him but a few days
(because of ((the love) (he had for her)))).
but not:
(Jacob served) seven (years for) Rachel,
and they (seemed to) him but
(a few days because) of the love he had for her.
Phrases which correspond to our native speaker intuitions are called constituents.
Syntax Constituency
Determining constituents
The criteria for defining constituents are sometimes fuzzy.
The main criterion is equivalent distribution: if two word sequences are mutually interchangeable in every context, preserving grammaticality, then both are constituents and both have the same grammatical category.
Determining constituents
Certain grammatical operations apply only to constituents:
Topicalization:
For Rachel, Jacob served seven years
Cleft:
It was for Rachel that Jacob served seven years
Interjection:
Jacob served seven years, the Bible tells us, for Rachel
Determining constituents
Certain grammatical operations apply only to constituents:
Question formation:
How long did Jacob serve for Rachel?
Coordination:
Jacob served seven years for Rachel,
and they seemed to him but a few days
Anaphors refer to constituents:
... and for Leah, too
Types of constituents
Inducing structure on a grammatical string is done recursively, starting with the words.
To this end, words are classified into categories according to their distribution.
In many languages, words are classified into substantial and functional categories.
substantial: table, dogs, walked, purple, quickly
functional: the, in, or
Another classification is according to whether the category is open or closed.
Types of constituents
Word categories (parts of speech):
N Noun table, dogs, justice, oak
V Verb run, climb, love, ignore
ADJ Adjective green, fast, mild, imaginary
ADV Adverb quickly, well, alone
P Preposition in, to, of, after, in spite of
D Determiner a, the, all, some
Pron Pronoun I, you, she, theirs, our
PropN Proper Noun Jacob, IBM, Haifa
Constituents
Phrases are projections of word categories:
Noun phrases are headed by nouns:
table → round table → the round table
→ the round table in the corner
→ the round table in the corner that we sat at yesterday
Verb phrases are headed by verbs:
climbed → climbed a tree → climbed a tree yesterday
→ recklessly climbed a tree yesterday
Adjectival phrases are headed by adjectives:
high → rather high / higher than me / high as a tree
Constituents
Phrases consist of a head and additional complements and adjuncts. The phrase is a projection of its head.
Complements are required by the head, and are mandatory. Adjuncts are optional, and can be iterated.
Example: John drinks a cup of milk every morning
Syntax Syntactic phenomena
Syntactic phenomena
Agreement
Subcategorization
Case assignment
Unbounded dependencies
Subject/object control
Coordination
A gradual description of language fragments
E0 is a small fragment of English consisting of very simple sentences, constructed with only intransitive and transitive (but no ditransitive) verbs, common nouns, proper names, pronouns and determiners.
Typical sentences are:
A sheep drinks
Rachel herds the sheep
Jacob loves her
A gradual description of language fragments
Similar strings are not E0- (and, hence, English-) sentences:
∗Rachel feed the sheep
∗Rachel feeds herds the sheep
∗The shepherds feeds the sheep
∗Rachel feeds
∗Jacob loves she
∗Jacob loves Rachel she
∗Them herd the sheep
A gradual description of language fragments
There are constraints on the combination of phrases in E0:
The subject and the predicate must agree on number and person: if the subject is a third person singular, so must the verb be.
Objects complement only – and all – the transitive verbs.
When a pronoun is used, it is in the nominative case if it is in the subject position, and in the accusative case if it is an object.
Subcategorization
E1 is a fragment of English, based on E0, in which verbs are classified into subclasses according to the complements they “require”:
Laban gave Jacob his daughter
Jacob promised Laban to marry Leah
Laban persuaded Jacob to promise him to marry Leah
Similar strings that violate this constraint are:
∗Rachel feeds Jacob the sheep
∗Jacob saw to marry Leah
Control
With the addition of infinitival complements to E1, the fragment E2 can capture constraints of argument control in English:
Jacob promised Laban to work seven years
Laban persuaded Jacob to work seven years
Long distance dependencies
Another extension of E1 is E3, typical sentences of which are:
The shepherd wondered whom Jacob loved ⌣.
The shepherd wondered whom Laban thought Jacob loved ⌣.
The shepherd wondered whom Laban thought Rachel claimed
Jacob loved ⌣.
An attempt to replace the gap with an explicit noun phrase results in ungrammaticality:
∗The shepherd wondered who Jacob loved Rachel.
Long distance dependencies
The gap need not be in the object position:
Jacob wondered who ⌣ loved Leah
Jacob wondered who Laban believed ⌣ loved Leah
Again, an explicit noun phrase filling the gap results in ungrammaticality:
∗Jacob wondered who the shepherd loved Leah
Long distance dependencies
More than one gap may be present in a sentence (and, hence, more than one filler):
This is the well which Jacob is likely to ⌣ draw water from ⌣
It was Leah that Jacob worked for ⌣ without loving ⌣
In some languages (e.g., Norwegian) there is no (principled) bound on the number of gaps that can occur in a single clause.
Long distance dependencies
There are other fragments of English in which long distance dependencies are manifested in other forms.
Topicalization:
Rachel, Jacob loved ⌣
Rachel, every shepherd knew Jacob loved ⌣
Another example is interrogative sentences:
Who did Jacob love ⌣?
Who did Laban believe Jacob loved ⌣?
Coordination
Coordination is accounted for in the language fragment E4:
No man lift up his [hand] or [foot] in all the land of Egypt
Jacob saw [Rachel] and [the sheep of Laban]
Jacob [went on his journey] and [came to the land of the people of the east]
Jacob [went near], and [rolled the stone from the well’s mouth], and [watered the flock of Laban his mother’s brother].
every [speckled] and [spotted] sheep
Leah was [tender eyed] but [not beautiful]
[Leah had four sons], but [Rachel was barren]
She said to Jacob, “[Give me children], or [I shall die]!”
The goals of syntactic analysis
Given a natural language sentence, syntactic analysis provides a structural description of the sentence.
To do so, one must have a model of the structure of the language.
Syntax is concerned with a formulation of the structure of natural languages. An example of a syntactic formalism is context-free grammars.
In CFGs, the structure of sentences is modeled by derivation trees.
Context-free grammars Basic definitions
Context-free grammars
Definition (Context-free grammars)
A context-free grammar (CFG) is a four-tuple 〈Σ, V, S, P〉, where:
Σ is a finite, non-empty set of terminals, the alphabet;
V is a finite, non-empty set of grammar variables (categories, or non-terminal symbols), such that Σ ∩ V = ∅;
S ∈ V is the start symbol;
P is a finite set of production rules, each of the form A → α, where A ∈ V and α ∈ (V ∪ Σ)∗.
For a rule A → α, A is the rule’s head and α is its body.
Context-free grammars
Example (CFG example)
Σ = {the, cat, in, hat}
V = {D, N, P, NP, PP}
The start symbol is NP
The rules:
D → the      NP → D N
N → cat      PP → P NP
N → hat      NP → NP PP
P → in
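As a minimal sketch (not part of the slides; the names SIGMA, V, START and RULES are my own), this grammar can be written down directly as Python data:

```python
# The example CFG as plain Python data: terminals, non-terminals,
# a start symbol, and rules given as (head, body) pairs.
SIGMA = {"the", "cat", "in", "hat"}   # Σ, the alphabet
V = {"D", "N", "P", "NP", "PP"}       # non-terminal symbols
START = "NP"                          # S, the start symbol
RULES = [
    ("D", ["the"]), ("N", ["cat"]), ("N", ["hat"]), ("P", ["in"]),
    ("NP", ["D", "N"]), ("NP", ["NP", "PP"]), ("PP", ["P", "NP"]),
]

# The definition requires Σ ∩ V = ∅ and S ∈ V:
assert SIGMA.isdisjoint(V) and START in V
```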
Context-free grammars: language
Each non-terminal symbol in a grammar denotes a language.
A rule such as N → cat implies that the language denoted by the non-terminal N includes the alphabet symbol cat.
The symbol cat here is a single, atomic alphabet symbol, and not a string of symbols: the alphabet of this example consists of natural language words, not of natural language letters.
For a more complex rule such as NP → D N, the language denoted by NP contains the concatenation of the language denoted by D with that denoted by N: L(NP) ⊇ L(D) · L(N).
Matters become more complicated when we consider recursive rules such as NP → NP PP.
Context-free grammars: derivation
Given a grammar G = 〈Σ, V, S, P〉, we define the set of forms to be (V ∪ Σ)∗: the set of all sequences of terminal and non-terminal symbols.
Derivation is a relation that holds between two forms, each a sequence of grammar symbols.
Definition (Derivation)
A form α derives a form β, denoted by α ⇒ β, if and only if α = γl A γr and β = γl γc γr and A → γc is a rule in P.
A is called the selected symbol. The rule A → γc is said to be applicable to α.
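This definition can be sketched in Python (the helper name derive_step is mine): one derivation step replaces one selected non-terminal occurrence with the body of a matching rule, keeping the contexts γl and γr intact.

```python
def derive_step(form, rules):
    """Yield every form β with α ⇒ β: for each occurrence of a
    non-terminal A in `form` and each rule A → γc, replace that
    occurrence by γc, preserving the surrounding context."""
    for i, symbol in enumerate(form):
        for head, body in rules:
            if head == symbol:
                yield form[:i] + body + form[i + 1:]

RULES = [("NP", ["D", "N"]), ("D", ["the"]), ("N", ["cat"]), ("N", ["hat"])]
print(list(derive_step(["NP"], RULES)))        # [['D', 'N']]
print(list(derive_step(["the", "N"], RULES)))  # [['the', 'cat'], ['the', 'hat']]
```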
Derivation
Example (Forms)
The set of non-terminals of G is V = {D, N, P, NP, PP} and the set of terminals is Σ = {the, cat, in, hat}.
The set of forms therefore contains all the (infinitely many) sequences of elements from V and Σ, such as 〈〉, 〈NP〉, 〈D cat P D hat〉, 〈D N〉, 〈the cat in the hat〉, etc.
Derivation
Example (Derivation)
Let us start with a simple form, 〈NP〉. Observe that it can be written as γl NP γr, where both γl and γr are empty. Observe also that NP is the head of some grammar rule: the rule NP → D N. Therefore, the form is a good candidate for derivation: if we replace the selected symbol NP with the body of the rule, while preserving its environment, we get γl D N γr = D N. Therefore, 〈NP〉 ⇒ 〈D N〉.
Derivation
Example (Derivation)
We now apply the same process to 〈D N〉. This time the selected symbol is D (we could have selected N, of course). The left context is again empty, while the right context is γr = N. As there exists a grammar rule whose head is D, namely D → the, we can replace the rule’s head by its body, preserving the context, and obtain the form 〈the N〉. Hence 〈D N〉 ⇒ 〈the N〉.
Derivation
Example (Derivation)
Given the form 〈the N〉, there is exactly one non-terminal that we can select, namely N. However, there are two rules that are headed by N: N → cat and N → hat. We can select either of these rules to show that both 〈the N〉 ⇒ 〈the cat〉 and 〈the N〉 ⇒ 〈the hat〉.
Since the form 〈the cat〉 consists of terminal symbols only, no non-terminal can be selected and hence it derives no form.
Extended derivation
α k⇒G β if α derives β in k steps: α ⇒G α1 ⇒G α2 ⇒G . . . ⇒G αk and αk = β.
The reflexive-transitive closure of ‘⇒G’ is ‘∗⇒G’: α ∗⇒G β if α k⇒G β for some k ≥ 0.
A G-derivation is a sequence of forms α1, . . . , αn, such that for every i, 1 ≤ i < n, αi ⇒G αi+1.
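The closure ‘∗⇒’ can be explored mechanically. The sketch below (my own helper; the max_steps bound is there only to keep the search finite for illustration) checks α ∗⇒ β by breadth-first search over single derivation steps:

```python
from collections import deque

RULES = [("D", ("the",)), ("N", ("cat",)), ("N", ("hat",)), ("P", ("in",)),
         ("NP", ("D", "N")), ("NP", ("NP", "PP")), ("PP", ("P", "NP"))]

def derives_star(alpha, beta, rules, max_steps=6):
    """True if alpha ⇒* beta in at most max_steps derivation steps."""
    seen = {tuple(alpha)}
    queue = deque([(tuple(alpha), 0)])
    while queue:
        form, k = queue.popleft()
        if list(form) == list(beta):
            return True                       # k = 0 covers reflexivity
        if k == max_steps:
            continue
        for i, sym in enumerate(form):        # try every single step α ⇒ β
            for head, body in rules:
                if head == sym:
                    new = form[:i] + tuple(body) + form[i + 1:]
                    if new not in seen:
                        seen.add(new)
                        queue.append((new, k + 1))
    return False

print(derives_star(["NP"], ["the", "cat"], RULES))  # True (3 steps)
print(derives_star(["D"], ["cat"], RULES))          # False
```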
Extended derivation: example
Example (Derivation)
(1) 〈NP〉 ⇒ 〈D N〉
(2) 〈D N〉 ⇒ 〈the N〉
(3) 〈the N〉 ⇒ 〈the cat〉
Extended derivation: example
Example (Derivation)
Therefore, we trivially have:
(4) 〈NP〉 ∗⇒ 〈D N〉
(5) 〈D N〉 ∗⇒ 〈the N〉
(6) 〈the N〉 ∗⇒ 〈the cat〉
From (2) and (6) we get
(7) 〈D N〉 ∗⇒ 〈the cat〉
and from (1) and (7) we get
(8) 〈NP〉 ∗⇒ 〈the cat〉
Languages
Definition (Sentential forms)
A form α is a sentential form of a grammar G iff S ∗⇒G α, i.e., it can be derived in G from the start symbol.
Definition (Language)
The (formal) language generated by a grammar G with respect to a category name (non-terminal) A is LA(G) = {w | A ∗⇒ w}. The language generated by the grammar is L(G) = LS(G).
Definition (Context-free languages)
A language that can be generated by some CFG is a context-free language, and the class of context-free languages is the set of languages every member of which can be generated by some CFG. If no CFG can generate a language L, L is said to be trans-context-free.
Language of a grammar
Example (Language)
For the example grammar (with NP the start symbol):
D → the      NP → D N
N → cat      PP → P NP
N → hat      NP → NP PP
P → in
it is fairly easy to see that L(D) = {the}.
Similarly, L(P) = {in} and L(N) = {cat, hat}.
Language of a grammar
Example (Language)
It is more difficult to define the languages denoted by the non-terminals NP and PP, although it should be straightforward that the latter is obtained by concatenating {in} with the former.
Proposition: L(NP) is the denotation of the regular expression
the · (cat + hat) · (in · the · (cat + hat))∗
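The proposition can be spot-checked with Python’s re module; treating words as space-separated tokens, ‘·’ becomes a space, ‘+’ an alternation, and ‘∗’ the Kleene star (this encoding is mine, not the slides’):

```python
import re

# the · (cat + hat) · (in · the · (cat + hat))∗, over space-separated words
NP_RE = re.compile(r"the (?:cat|hat)(?: in the (?:cat|hat))*")

assert NP_RE.fullmatch("the cat")
assert NP_RE.fullmatch("the cat in the hat")
assert NP_RE.fullmatch("the hat in the cat in the hat")
assert not NP_RE.fullmatch("cat in the hat")   # missing determiner
assert not NP_RE.fullmatch("the cat in hat")   # PP must contain a full NP
print("all strings checked")
```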
Language: a formal example Ge
Example (Language)
S → Va S Vb
S → ε
Va → a
Vb → b
L(Ge) = {a^n b^n | n ≥ 0}.
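A quick sketch (my own) simulating a Ge-derivation of a^n b^n: apply S → Va S Vb n times, erase S with S → ε, then rewrite every Va and Vb to its terminal:

```python
def derive_anbn(n):
    """Simulate a Ge-derivation of the word a^n b^n."""
    form = ["S"]
    for _ in range(n):                 # apply S → Va S Vb, n times
        i = form.index("S")
        form[i:i + 1] = ["Va", "S", "Vb"]
    form.remove("S")                   # apply S → ε
    # apply Va → a and Vb → b to every remaining symbol
    return "".join("a" if s == "Va" else "b" for s in form)

print(derive_anbn(3))  # aaabbb
print(derive_anbn(0))  # (the empty word)
```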
Recursion
The language L(Ge) is infinite: it includes an infinite number of words; yet Ge is a finite grammar.
To be able to produce infinitely many words with a finite number of rules, a grammar must be recursive: there must be at least one rule whose body contains a symbol, from which the head of the rule can be derived.
Put formally, a grammar 〈Σ, V, S, P〉 is recursive if there exists a chain of rules, p1, . . . , pn ∈ P, such that for every 1 ≤ i < n, the head of pi+1 occurs in the body of pi, and the head of p1 occurs in the body of pn.
In Ge, the recursion is simple: the chain consists of a single rule, since S → Va S Vb is in itself recursive.
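This condition is easy to test mechanically. The sketch below (function names are mine) builds the relation “B occurs in the body of a rule headed by A” and looks for a non-terminal that can reach itself:

```python
from collections import defaultdict

def is_recursive(rules):
    """True iff some chain of rules leads from a non-terminal back to itself."""
    heads = {head for head, _ in rules}
    edges = defaultdict(set)   # A -> {B : B occurs in a body of a rule headed A}
    for head, body in rules:
        edges[head] |= {s for s in body if s in heads}
    def reaches(start, target, seen=frozenset()):
        return any(nxt == target or
                   (nxt not in seen and reaches(nxt, target, seen | {nxt}))
                   for nxt in edges[start])
    return any(reaches(h, h) for h in heads)

GE = [("S", ["Va", "S", "Vb"]), ("S", []), ("Va", ["a"]), ("Vb", ["b"])]
FINITE = [("NP", ["D", "N"]), ("D", ["the"]), ("N", ["cat"])]
print(is_recursive(GE))      # True: S → Va S Vb is in itself recursive
print(is_recursive(FINITE))  # False: this grammar generates a finite language
```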
Context-free grammars Derivation trees
Derivation tree
Sometimes derivations provide more information than is actually needed. In particular, sometimes two derivations of the same string differ not in the rules that were applied but only in the order in which they were applied.
Starting with the form 〈NP〉 it is possible to derive the string the cat in two ways:
(1) 〈NP〉 ⇒ 〈D N〉 ⇒ 〈D cat〉 ⇒ 〈the cat〉
(2) 〈NP〉 ⇒ 〈D N〉 ⇒ 〈the N〉 ⇒ 〈the cat〉
Since both derivations use the same rules to derive the same string, it is sometimes useful to collapse such “equivalent” derivations into one. To this end the notion of derivation trees is introduced.
Derivation tree
A derivation tree (sometimes called parse tree, or simply tree) is a visual aid in depicting derivations, and a means for imposing structure on a grammatical string.
Trees consist of vertices and branches; a designated vertex, the root of the tree, is depicted on the top. Then, branches are simply connections between two vertices.
Intuitively, trees are depicted “upside down”, since their root is at the top and their leaves are at the bottom.
Derivation tree
Example (Derivation tree)
An example of a derivation tree for the string the cat in the hat, shown in bracketed form:
[NP [NP [D the] [N cat]] [PP [P in] [NP [D the] [N hat]]]]
Derivation tree
Formally, a tree consists of a finite set of vertices and a finite set of branches (or arcs), each of which is an ordered pair of vertices.
In addition, a tree has a designated vertex, the root, which has two properties: it is not the target of any arc, and every other vertex is accessible from it (by following one or more branches).
When talking about trees we sometimes use family notation: if a vertex v has a branch leaving it which leads to some vertex u, then we say that v is the mother of u and u is the daughter, or child, of v. If u has two daughters, we refer to them as sisters.
Derivation trees
Derivation trees are defined with respect to some grammar G, and must obey the following conditions:
1 every vertex has a label, which is either a terminal symbol, a non-terminal symbol or ε;
2 the label of the root is the start symbol;
3 if a vertex v has an outgoing branch, its label must be a non-terminal symbol, the head of some grammar rule; and the elements in the body of the same rule must be the labels of the children of v, in the same order;
4 if a vertex is labeled ε, it is the only child of its mother.
Derivation trees
A leaf is a vertex with no outgoing branches.
A tree induces a natural “left-to-right” order on its leaves; when read from left to right, the sequence of leaves is called the frontier, or yield, of the tree.
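Representing a tree as a (label, children) pair (a representation I choose here for illustration), the yield is just the left-to-right sequence of leaf labels:

```python
def tree_yield(tree):
    """Frontier (yield) of a derivation tree given as (label, children);
    a leaf is (label, [])."""
    label, children = tree
    if not children:
        return [label]
    return [leaf for child in children for leaf in tree_yield(child)]

# The derivation tree for "the cat in the hat" shown above:
T = ("NP", [("NP", [("D", [("the", [])]), ("N", [("cat", [])])]),
            ("PP", [("P", [("in", [])]),
                    ("NP", [("D", [("the", [])]), ("N", [("hat", [])])])])])
print(tree_yield(T))  # ['the', 'cat', 'in', 'the', 'hat']
```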
Correspondence between trees and derivations
Derivation trees correspond very closely to derivations.
For a form α, a non-terminal symbol A derives α if and only if α is the yield of some parse tree whose root is A.
Sometimes there exist different derivations of the same string that correspond to a single tree. In fact, the tree representation collapses exactly those derivations that differ from each other only in the order in which rules are applied.
Correspondence between trees and derivations
[NP [NP [D the] [N cat]] [PP [P in] [NP [D the] [N hat]]]]
Each non-leaf vertex in the tree corresponds to some grammar rule (since it must be labeled by the head of some rule, and its children must be labeled by the body of the same rule).
Correspondence between trees and derivations
This tree represents the following derivations (among others):
(1) NP ⇒ NP PP ⇒ D N PP ⇒ D N P NP ⇒ D N P D N ⇒ the N P D N ⇒ the cat P D N ⇒ the cat in D N ⇒ the cat in the N ⇒ the cat in the hat
(2) NP ⇒ NP PP ⇒ D N PP ⇒ the N PP ⇒ the cat PP ⇒ the cat P NP ⇒ the cat in NP ⇒ the cat in D N ⇒ the cat in the N ⇒ the cat in the hat
(3) NP ⇒ NP PP ⇒ NP P NP ⇒ NP P D N ⇒ NP P D hat ⇒ NP P the hat ⇒ NP in the hat ⇒ D N in the hat ⇒ D cat in the hat ⇒ the cat in the hat
Correspondence between trees and derivations
While exactly the same rules are applied in each derivation (the rules are uniquely determined by the tree), they are applied in different orders.
In particular, derivation (2) is a leftmost derivation: in every step the leftmost non-terminal symbol of the form is expanded.
Similarly, derivation (3) is rightmost.
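Reading the leftmost derivation off a tree can be sketched as follows (my own helper, again using nested (label, children) pairs): at each step the leftmost vertex that still has children is expanded into its daughters.

```python
def leftmost_derivation(tree):
    """Return the forms of the leftmost derivation encoded by `tree`
    (nested (label, children) pairs; leaves have children == [])."""
    forms, form = [], [tree]
    while True:
        forms.append([node[0] for node in form])   # record labels only
        for i, node in enumerate(form):            # leftmost expandable vertex
            if node[1]:
                form = form[:i] + node[1] + form[i + 1:]
                break
        else:
            return forms                           # all leaves: derivation done

T = ("NP", [("D", [("the", [])]), ("N", [("cat", [])])])
print(leftmost_derivation(T))
# [['NP'], ['D', 'N'], ['the', 'N'], ['the', 'cat']]
```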
Context-free grammars Ambiguity
Ambiguity
Sometimes, however, different derivations (of the same string!) correspond to different trees.
This can happen only when the derivations differ in the rules which they apply.
When more than one tree exists for some string, we say that the string is ambiguous.
Ambiguity is a major problem when grammars are used for certain formal languages, in particular programming languages. But for natural languages, ambiguity is unavoidable as it corresponds to properties of the natural language itself.
Ambiguity: example
Consider again the example grammar and the following string:
the cat in the hat in the hat
Intuitively, there can be (at least) two readings for this string: one in which a certain cat wears a hat-in-a-hat, and one in which a certain cat-in-a-hat is inside a hat:
((the cat in the hat) in the hat)
(the cat in (the hat in the hat))
This distinction in intuitive meaning is reflected in the grammar, and hence two different derivation trees, corresponding to the two readings, are available for this string:
Ambiguity: example
[NP [NP [NP [D the] [N cat]] [PP [P in] [NP [D the] [N hat]]]] [PP [P in] [NP [D the] [N hat]]]]
Ambiguity: example
[NP [NP [D the] [N cat]] [PP [P in] [NP [NP [D the] [N hat]] [PP [P in] [NP [D the] [N hat]]]]]]
Ambiguity: example
Using linguistic terminology, in the first tree the second occurrence of the prepositional phrase in the hat modifies the noun phrase the cat in the hat, whereas in the second tree it only modifies the (first occurrence of) the noun phrase the hat.
This situation is known as syntactic or structural ambiguity.
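The ambiguity can be verified mechanically. This sketch (my own, a memoized CKY-style counter; it assumes rule bodies are either a single terminal or two non-terminals, which holds for the example grammar) counts derivation trees per span:

```python
from functools import lru_cache

RULES = [("D", ("the",)), ("N", ("cat",)), ("N", ("hat",)), ("P", ("in",)),
         ("NP", ("D", "N")), ("NP", ("NP", "PP")), ("PP", ("P", "NP"))]
WORDS = tuple("the cat in the hat in the hat".split())

@lru_cache(maxsize=None)
def count_trees(sym, i, j):
    """Number of derivation trees rooted in `sym` with yield WORDS[i:j]."""
    total = 0
    for head, body in RULES:
        if head != sym:
            continue
        if len(body) == 1 and j - i == 1 and body[0] == WORDS[i]:
            total += 1                    # terminal rule covering one word
        elif len(body) == 2:
            left, right = body
            for k in range(i + 1, j):     # split the span at position k
                total += count_trees(left, i, k) * count_trees(right, k, j)
    return total

print(count_trees("NP", 0, len(WORDS)))  # 2
```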
Context-free grammars Generative capacity
Grammar equivalence
It is common in formal language theory to relate different grammars that generate the same language by an equivalence relation:
Two grammars G1 and G2 (over the same alphabet Σ) are equivalent (denoted G1 ≡ G2) iff L(G1) = L(G2).
We refer to this relation as weak equivalence, as it only relates the generated languages. Equivalent grammars may attribute totally different syntactic structures to members of their (common) languages.
Grammar equivalence
Example (Equivalent grammars, different trees)
Following are two different tree structures that are attributed to the string aabb by the grammars Ge and Gf, respectively.
For Ge: [S [Va a] [S [Va a] [S ε] [Vb b]] [Vb b]]
(The corresponding Gf tree, also with yield a a ε b b, is not recoverable from the transcript.)
Grammar equivalence
Example (Structural ambiguity)
A grammar, Garith, for simple arithmetic expressions:
S → a | b | c | S + S | S ∗ S
Two different trees can be associated by Garith with the string a + b ∗ c:
[S [S a] + [S [S b] ∗ [S c]]]
[S [S [S a] + [S b]] ∗ [S c]]
Grammar equivalence
The weak equivalence relation is stated in terms of the generated language.
Consequently, equivalent grammars do not have to be described in the same formalism for them to be equivalent.
We will later see how grammars, specified in different formalisms, can be compared.
Context-free grammars CFGs and natural languages
Normal form
It is convenient to divide grammar rules into two classes: one that contains only phrasal rules of the form A → α, where α ∈ V∗, and another that contains only terminal rules of the form B → σ, where σ ∈ Σ.
It turns out that every CFG is equivalent to some CFG of this form.
Normal form
A grammar G is in phrasal/terminal normal form iff for every production A → α of G, either α ∈ V∗ or α ∈ Σ.
Productions of the form A → σ are called terminal rules, and A is said to be a pre-terminal category, the lexical entry of σ.
Productions of the form A → α, where α ∈ V∗, are called phrasal rules.
Furthermore, every category is either pre-terminal or phrasal, but not both.
For a phrasal rule with α = A1 · · · An, if w = w1 · · · wn, w ∈ LA(G) and wi ∈ LAi(G) for i = 1, . . . , n, we say that w is a phrase of category A, and each wi is a sub-phrase (of w) of category Ai.
A sub-phrase wi of w is also called a constituent of w.
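The normal-form condition itself can be sketched as a small predicate over (head, body) rules (names are mine):

```python
def in_normal_form(rules, V, Sigma):
    """True iff every rule body is either a string of non-terminals
    (a phrasal rule) or a single terminal (a terminal rule)."""
    return all(
        all(s in V for s in body) or (len(body) == 1 and body[0] in Sigma)
        for _, body in rules
    )

V = {"D", "N", "P", "NP", "PP"}
SIGMA = {"the", "cat", "in", "hat"}
RULES = [("D", ["the"]), ("N", ["cat"]), ("N", ["hat"]), ("P", ["in"]),
         ("NP", ["D", "N"]), ("NP", ["NP", "PP"]), ("PP", ["P", "NP"])]
print(in_normal_form(RULES, V, SIGMA))                           # True
print(in_normal_form(RULES + [("NP", ["D", "cat"])], V, SIGMA))  # False
```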
Context-free grammars for natural languages
Context-free grammars can be used for a variety of syntactic constructions, including some non-trivial phenomena such as unbounded dependencies, extraction, extraposition, etc.
However, some (formal) languages are not context-free, and therefore there are certain sets of strings that cannot be generated by context-free grammars.
The interesting question, of course, involves natural languages: are there natural languages that are not context-free? Are context-free grammars sufficient for generating every natural language?
A context-free grammar, G0, for E0
Example (A context-free grammar, G0, for E0)
S → NP VP
VP → V
VP → V NP
NP → D N
NP → Pron
NP → PropN
D → the, a, two, every, . . .
N → sheep, lamb, lambs, shepherd, water . . .
V → sleep, sleeps, love, loves, feed, feeds, herd, herds, . . .
Pron → I, me, you, he, him, she, her, it, we, us, they, them
PropN → Rachel, Jacob, . . .
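G0 can be encoded directly as a Python dictionary mapping each category to its alternatives; a tiny leftmost expansion that always picks the first alternative then yields one sentence of E0. This is an illustrative sketch (the names and the helper are ours, not from the book):

```python
# G0 as a dictionary: category -> list of right-hand sides.
G0 = {
    'S':  [['NP', 'VP']],
    'VP': [['V'], ['V', 'NP']],
    'NP': [['D', 'N'], ['Pron'], ['PropN']],
    'D':  [['the'], ['a'], ['two'], ['every']],
    'N':  [['sheep'], ['lamb'], ['lambs'], ['shepherd']],
    'V':  [['sleep'], ['sleeps'], ['love'], ['loves'], ['feed'], ['feeds']],
    'Pron':  [['I'], ['me'], ['you'], ['he'], ['him'], ['she'], ['her']],
    'PropN': [['Rachel'], ['Jacob']],
}

def expand(symbol):
    """Expand a category by always taking its first production."""
    if symbol not in G0:           # terminal word
        return [symbol]
    first_rhs = G0[symbol][0]
    return [word for part in first_rhs for word in expand(part)]

print(' '.join(expand('S')))  # -> the sheep sleep
```

Picking other alternatives generates other sentences of E0, including, as discussed next, some that should be excluded.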
There are two major problems with this grammar.
1 It ignores the valence of verbs: there is no distinction among subcategories of verbs, so an intransitive verb such as sleep might occur with a noun-phrase complement, while a transitive verb such as love might occur without one. In such a case we say that the grammar overgenerates: it generates strings that are not in the intended language.
2 There is no treatment of subject–verb agreement, so a singular subject such as the cat might be followed by a plural form of a verb such as smile. This is another case of overgeneration.
Both problems are easy to solve.
Problems of G0
Over-generation (agreement constraints are not imposed):
∗Rachel feed the sheep
∗The shepherds feeds the sheep
∗Rachel feeds
∗Jacob loves she
∗Them herd the sheep
Over-generation (subcategorization constraints are not imposed):
the lambs sleep
Jacob loves Rachel
∗the lambs sleep the sheep
∗Jacob loves
Example (Over-generation)
[S [NP [D the] [N lambs]] [VP [V sleeps] [NP [Pron they]]]]
(A derivation tree for ∗the lambs sleeps they, which G0 wrongly admits.)
Verb valence
To account for valence, we can replace the non-terminal symbol V by a set of symbols: Vtrans, Vintrans, Vditrans, etc.
We must also change the grammar rules accordingly:
Example
VP → Vintrans Vintrans → sleep, sleeps
VP → Vtrans NP Vtrans → love, loves
VP → Vditrans NP NP Vditrans → give, gives
Agreement
To account for agreement, we can again extend the set of non-terminal symbols so that categories that must agree reflect, in the non-terminal assigned to them, the features on which they agree.
In the very simple case of English, it is sufficient to multiply the set of “nominal” and “verbal” categories, so that we get Dsg, Dpl, Nsg, Npl, NPsg, NPpl, Vsg, Vpl, VPsg, VPpl, etc. We must also change the set of rules accordingly:
Example
Nsg → lamb Npl → lambs
Nsg → sheep Npl → sheep
Vsg → sleeps Vpl → sleep
Vsg → smiles Vpl → smile
Vsg → loves Vpl → love
Vsg → saw Vpl → saw
Dsg → a Dpl → two
Example
S → NPsg VPsg
S → NPpl VPpl
NPsg → Dsg Nsg
NPpl → Dpl Npl
VPsg → Vsg
VPpl → Vpl
VPsg → VPsg NP
VPpl → VPpl NP
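The agreement encoding multiplies categories by number; the doubling can be derived mechanically, which makes the blowup explicit. A sketch (names are ours; only three base rules are shown, and S is left unsplit, as in the example above):

```python
# Base (number-free) rules: category -> right-hand side.
base = [('S', ('NP', 'VP')),
        ('NP', ('D', 'N')),
        ('VP', ('V',))]

rules = []
for num in ('sg', 'pl'):
    for lhs, rhs in base:
        # S itself is not split in the example; the nominal/verbal
        # categories carry the number suffix.
        new_lhs = lhs if lhs == 'S' else lhs + num
        rules.append((new_lhs, tuple(c + num for c in rhs)))

for lhs, rhs in rules:
    print(lhs, '->', ' '.join(rhs))
# Each of the 3 base rules yields an sg and a pl variant: 6 rules.
```

With more agreement features (person, gender, case), the category set and rule set grow multiplicatively; this is precisely the redundancy that feature structures are introduced to avoid.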
Methodological properties of the CFG formalism
1 Concatenation is the only string combination operation
2 Phrase structure is the only syntactic relationship
3 The terminal symbols have no properties
4 Non-terminal symbols (grammar variables) are atomic
5 Most of the information encoded in a grammar lies in the production rules
6 Any attempt to extend the grammar with semantics requires extra means.
Alternative methodological properties
1 Concatenation is not necessarily the only way by which phrases may be combined to yield other phrases.
2 Even if concatenation is the sole string operation, other syntactic relationships have been put forward.
3 Modern computational formalisms for expressing grammars adhere to an approach called lexicalism.
4 Some formalisms do not retain any context-free backbone. However, if one is present, its categories are not atomic.
5 The expressive power added to these formalisms also allows a certain way of representing semantic information.
Feature structures Introduction
Feature structures
Motivated by the violations of the context-free grammar G0, we would like to extend the CFG formalism with additional mechanisms that will facilitate the expression of the information that is missing in G0, in a uniform and compact way.
The core idea is to incorporate into the grammar properties of symbols, in terms of which the violations of G0 were stated.
Properties are represented by means of feature structures.
Overview
An overview of feature structures, motivating their use as a representation of linguistic information
Four different views of these entities:
feature graphs
feature structures
abstract feature structures
attribute-value matrices (AVMs)
Feature structures in a broader context.
Feature structures Motivation
Motivation
Words in natural languages have properties
We want to model these properties in the lexicon
We would like to associate with words not just atomic symbols, as in CFGs, but rather structural information that reflects their properties.
A simple lexicon
Example (A simple lexicon)
lamb: [num: sg, pers: third]
lambs: [num: pl, pers: third]
I: [num: sg, pers: first]
sheep: [num: [ ], pers: third]
dreams: [num: sg, pers: third]
dreams: [num: sg, pers: third]
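One plausible machine encoding of such AVMs is nested Python dictionaries; this is a sketch of ours, not the book's notation, with an empty dict playing the role of the unconstrained value [ ]:

```python
# The example lexicon as nested dictionaries: word -> AVM.
lexicon = {
    'lamb':  {'num': 'sg', 'pers': 'third'},
    'lambs': {'num': 'pl', 'pers': 'third'},
    'I':     {'num': 'sg', 'pers': 'first'},
    'sheep': {'num': {},   'pers': 'third'},  # num left unconstrained
}

print(lexicon['sheep']['num'])  # -> {}
```

Looking up a word returns its feature structure, and looking up a feature in that structure returns its value, atomic or complex.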
Feature structures
Feature structures map features to values, which are themselves feature structures.
A special case of feature structures are atoms, which represent structureless values.
For example, to deal with number (and impose its agreement), we use a feature num, and a set of atomic feature structures {sg, pl} as its values, representing singularity and plurality, respectively.
When a value is not atomic, it is complex.
A complex value is, recursively, a feature structure consisting of features and values.
A complex feature structure
Example (A complex feature structure)
loves: [vtype: transitive, agr: [num: sg, pers: third]]
Grouping features
Deciding how to group features is up to the grammar designer, and is intended to capture syntactic generalizations.
If number and person ‘go together’ in formulating restrictions, it is more appropriate to group them as in this example.
Moreover, such a grouping might be beneficial when feature structures are being modified.
Processes of derivation and parsing (the application of grammar rules) are able to manipulate feature structures to reflect application of such constraints.
When the properties of some feature structure are changed, it is possible to change the value of only one feature, namely agr, rather than specify two separate changes for each subfeature.
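The benefit of grouping can be seen in the dictionary encoding sketched earlier (again our names, not the book's): a single update replaces the whole agreement bundle at once.

```python
# The complex entry for "loves", with num and pers grouped under agr.
loves = {'vtype': 'transitive',
         'agr': {'num': 'sg', 'pers': 'third'}}

# One change to the grouped feature, instead of two separate changes
# to num and pers individually:
plural_agr = {'num': 'pl', 'pers': 'third'}
loves['agr'] = plural_agr

print(loves['agr']['num'])  # -> pl
```

Had num and pers been kept at the top level, every process touching agreement would have to update both features separately.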
In the example lexicon, the lexical ambiguity of sheep is represented by an empty feature structure as the value of the num feature.
This is interpreted as the value of this feature being unconstrained.
However, it would have been useful to be able to state that the only possible values for this feature are, say, sg and pl.
There are at least two different ways to specify such information:
by listing a set of values for the feature;
or by restricting its value to a certain “type” of permissible values.
We do not explore the former solution here.
The latter solution is employed by typed feature structure formalisms.
Adding features to phrases
Words are not the only linguistic entities that have properties; words are combined into phrases, and those also have properties which can be modeled by feature:value pairs.
For example, the noun phrase a sheep has the value sg for the num feature, while two sheep has the value pl for num.
Consequently, grammar non-terminals, too, must be decorated with features, representing the endowment of phrases of this category with that feature.
Feature structures Feature graphs
Feature graphs
The informal discussion of feature structures above depicted them using a representation, called attribute-value matrices (AVMs), which is common in the linguistic literature.
We begin the discussion of feature structures by defining the concept of feature graphs, using well-known concepts of graph theory.
A graph view of feature structures facilitates computational processing, both because so many properties of graphs are well understood and because graphs lend themselves to efficient processing.
We will return to AVMs and discuss their correspondence with feature graphs later on.
Definitions
Feature graphs are defined over a signature consisting of non-empty, finite, disjoint sets Feats of features and Atoms of atoms.
Features are used to encode properties of (linguistic) objects, such as number, gender, etc.
Atoms are used for the (atomic) values of such features, as in plural, feminine, etc.
We use a convention of depicting features in small capitals and atoms in italics.
Signature
Definition (Signature)
A signature is a structure S = 〈Atoms, Feats〉, where Atoms is a finite set of atoms and Feats is a finite set of features.
We assume some fixed signature throughout this presentation.
Meta-variables f , g (with or without subscripts or superscripts) range over features, and a, b, etc. over atoms.
We usually assume that both Feats and Atoms are non-empty (and sometimes even assume that they include more than one element each).
Feature graphs
Definition (Feature graphs)
A feature graph A = 〈QA, qA, δA, θA〉 is a finite, directed, connected, labeled graph consisting of a finite, nonempty set of nodes QA (such that QA ∩ Feats = QA ∩ Atoms = ∅), a root qA ∈ QA, a partial function δA : QA × Feats → QA specifying the arcs, such that every node q ∈ QA is accessible from qA, and a partial function marking some of the sinks: θA : QS → Atoms, where QS = {q ∈ QA | δA(q, f )↑ for every f }.
Given a signature of features Feats and atoms Atoms, let G(Feats, Atoms) be the set of all feature graphs over the signature.
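As a sketch of how the definition can be encoded directly (the class and its names are ours, not part of the course material), the following stores Q, the root, and the partial functions δ and θ as dictionaries, and recovers the sink set QS:

```python
class FeatureGraph:
    """A feature graph <Q, q, delta, theta> over some signature."""

    def __init__(self, nodes, root, delta, theta):
        self.nodes = set(nodes)    # Q
        self.root = root           # the root q
        self.delta = dict(delta)   # partial: (node, feature) -> node
        self.theta = dict(theta)   # partial: sink node -> atom

    def sinks(self):
        """QS: the nodes with no outgoing arcs."""
        with_out = {q for (q, f) in self.delta}
        return self.nodes - with_out

# The graph from the example below: agr -> [num: pl, pers: third].
A = FeatureGraph(
    nodes={'q0', 'q1', 'q2', 'q3'},
    root='q0',
    delta={('q0', 'agr'): 'q1',
           ('q1', 'num'): 'q2',
           ('q1', 'pers'): 'q3'},
    theta={'q2': 'pl', 'q3': 'third'})

print(sorted(A.sinks()))  # -> ['q2', 'q3']
```

Note that θ is defined only on sinks, exactly as the definition requires; non-sink nodes carry no marking.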
Example (Feature graphs)
The graph displayed below is 〈Q, q, δ, θ〉, where Q = {q0, q1, q2, q3}, q = q0, δ(q0, agr) = q1, δ(q1, num) = q2, δ(q1, pers) = q3, QS = {q2, q3}, θ(q2) = pl, θ(q3) = third.
[Graph omitted: q0 —agr→ q1, q1 —num→ q2 (marked pl), q1 —pers→ q3 (marked third).]
The arcs of a feature graph are thus labeled by features.
The root is a designated node from which all other nodes are accessible (through δ); note that nothing prevents the root from having incoming arcs.
Sink nodes (nodes with no outgoing edges) can be marked by an atom, but can also be unmarked.
We use meta-variables A, B (with or without subscripts) to refer to feature graphs.
We use Q, q, δ, θ to refer to constituents of feature graphs.
When displaying feature graphs, the root is depicted as a grey-colored node, usually at the top or the left side of the graph.
The identities of the nodes are arbitrary, and we use generic names such as q0, q1, etc. to refer to them.
Example (Feature graphs)
In the following graph, the leaves q2 and q3 bear no marking; in other words, the marking function θ is undefined for the two sinks in its domain.
[Graph omitted: q0 —agr→ q1, q1 —num→ q2, q1 —pers→ q3; q2 and q3 are unmarked.]
The graph displayed above is 〈Q, q, δ, θ〉, where Q = {q0, q1, q2, q3}, q = q0, δ(q0, agr) = q1, δ(q1, num) = q2, δ(q1, pers) = q3, QS = {q2, q3}, and θ is undefined for its entire domain.
A feature graph is empty if it consists of a single unmarked node with no arcs.
A feature graph is atomic if it consists of a single marked node with no arcs.
Example (Empty and atomic feature graphs)
A, an empty feature graph: the single unmarked node q0
B, an atomic feature graph: the single node q0, marked pl
Paths
The concept of paths is natural when graphs are concerned.
A path (over Feats) is a finite sequence of features, and the set Paths = Feats∗ is the collection of all paths.
Meta-variables π, α (with or without subscripts) range over paths.
ǫ is the empty path, denoted also by ‘〈〉’.
The length of a path π is denoted |π|.
For example, if Feats = {a, b} then Paths includes ǫ, 〈a〉, 〈b〉, 〈a, b, a〉, 〈b, b, b, b, a, b〉, etc.
While a path is a purely syntactic notion (every sequence of features constitutes a path), interesting paths are those that can be interpreted as actual paths in some graph, leading from the root to some node.
The definition of δ is therefore extended to paths: given a feature graph A = 〈QA, qA, δA, θA〉, define δ̂A : QA × Paths → QA as follows:
δ̂A(q, ǫ) = q
δ̂A(q, f π) = δ̂A(δA(q, f ), π) (defined only if δA(q, f )↓)
Since for every node q ∈ QA and every feature f ∈ Feats, δA(q, f ) = δ̂A(q, 〈f 〉), we identify δ̂ with δ in what follows and use only the latter. When the index (A) is clear from the context, it is omitted. When δA(q, π) = q′ we say that π leads (in A) from q to q′.
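The path extension of δ can be sketched directly, with δ as a dictionary and a helper (our name) that follows a path feature by feature, returning nothing when the path is undefined:

```python
# The example graph's delta: (node, feature) -> node.
delta = {('q0', 'agr'): 'q1',
         ('q1', 'num'): 'q2',
         ('q1', 'pers'): 'q3'}

def walk(q, path):
    """delta extended to paths: follow each feature in turn from q.

    Returns the node the path leads to, or None if the path is
    undefined at some point (the partiality of delta).
    """
    for f in path:
        if (q, f) not in delta:
            return None
        q = delta[(q, f)]
    return q

print(walk('q0', ('agr', 'num')))          # -> q2
print(walk('q0', ('agr', 'pers', 'num')))  # -> None (undefined)
```

The empty path returns its starting node, matching the base clause δ(q, ǫ) = q.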
Definition (Paths)
The paths of a feature graph A are Π(A) = {π ∈ Paths | δA(qA, π)↓}.
Example (Paths)
Consider the following feature graph, A:
[Graph omitted: q0 —agr→ q1, q1 —num→ q2 (marked pl), q1 —pers→ q3 (marked third).]
Its paths are Π(A) = {ǫ, 〈agr〉, 〈agr num〉, 〈agr pers〉}
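For an acyclic graph, Π(A) is finite and can be enumerated by traversal; a sketch over the example graph (helper name is ours):

```python
# The example graph's delta: (node, feature) -> node.
delta = {('q0', 'agr'): 'q1',
         ('q1', 'num'): 'q2',
         ('q1', 'pers'): 'q3'}

def all_paths(root):
    """Pi(A): every path defined from the root (graph assumed acyclic)."""
    out, frontier = [()], [((), root)]
    while frontier:
        path, q = frontier.pop()
        for (p, f), q2 in delta.items():
            if p == q:
                out.append(path + (f,))
                frontier.append((path + (f,), q2))
    return sorted(out)

print(all_paths('q0'))
# -> [(), ('agr',), ('agr', 'num'), ('agr', 'pers')]
```

For a cyclic graph (introduced later), Π(A) is infinite, so such an enumeration would not terminate without a bound on path length.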
Path values
Of particular interest are paths which lead from the root of a feature graph to some node in the graph.
For such paths we define the notion of a value, which is the sub-graph whose root is the node at the end of the path.
It would have been possible to define as value the node itself, rather than the sub-graph it induces; the choice is a matter of taste, as moving from one view of values to the other is trivial.
Definition (Path value)
For a feature graph A = 〈QA, qA, δA, θA〉 and a path π ∈ Π(A), the value valA(π) of π in A is a feature graph B = 〈QB, qB, δB, θB〉, over the same signature as A, where:
qB = δA(qA, π)
QB = {q′ ∈ QA | for some π′, δA(qB, π′) = q′} (QB is the set of nodes reachable from qB)
for every feature f and for every q′ ∈ QB, δB(q′, f ) = δA(q′, f ) (δB is the restriction of δA to QB)
for every q′ ∈ QB, θB(q′) = θA(q′) (θB is the restriction of θA to QB)
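A direct reading of this definition can be sketched in code: follow the path, collect the reachable nodes, and restrict δ and θ to them (helper names are ours):

```python
# The example graph: agr -> [num: pl, pers: third].
delta = {('q0', 'agr'): 'q1', ('q1', 'num'): 'q2', ('q1', 'pers'): 'q3'}
theta = {'q2': 'pl', 'q3': 'third'}

def reachable(q):
    """All nodes reachable from q via delta (including q itself)."""
    seen, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for (p2, f), q2 in delta.items():
            if p2 == p and q2 not in seen:
                seen.add(q2)
                stack.append(q2)
    return seen

def val(root, path):
    """val_A(path): the sub-graph rooted where the path leads."""
    q = root
    for f in path:
        q = delta[(q, f)]        # KeyError if the path is undefined
    qs = reachable(q)
    sub_delta = {(p, f): q2 for (p, f), q2 in delta.items() if p in qs}
    sub_theta = {p: a for p, a in theta.items() if p in qs}
    return qs, q, sub_delta, sub_theta

Q, r, d, t = val('q0', ('agr',))
print(r, sorted(Q))  # -> q1 ['q1', 'q2', 'q3']
```

The value of 〈agr num〉 is the atomic sub-graph at q2; an undefined path such as 〈agr pers num〉 raises an error, mirroring the partiality of val.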
Example (Path values)
The value of the path 〈agr〉 in A is valA(〈agr〉), the sub-graph rooted at q1:
[Graph omitted: q1 —num→ q2 (marked pl), q1 —pers→ q3 (marked third).]
The value of the path 〈agr num〉 in A is valA(〈agr num〉), the atomic graph consisting of q2 alone, marked pl.
Note that, for example, the value of 〈agr pers num〉 in A is undefined.
Reentrancy
The definition of path values raises the question of when two paths have equal values.
We distinguish between paths which lead to one and the same node, and those whose values are isomorphic but not identical.
The former case is called reentrancy.
Definition (Reentrancy)
Let A = 〈Q, q, δ, θ〉 be a feature graph. Two paths π1, π2 ∈ Π(A) are reentrant in A, denoted π1 !A π2, iff δ(q, π1) = δ(q, π2), implying valA(π1) = valA(π2). A feature graph A is reentrant iff there exist two distinct paths π1, π2 ∈ Π(A) such that π1 !A π2.
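Reentrancy is node identity, not mere value equality; this is easy to check by following both paths and comparing the nodes they reach. A sketch over the reentrant example graph below (helper names are ours):

```python
# A graph where agr and subj agr share one node:
# q0 -agr-> q1, q0 -subj-> q4, q4 -agr-> q1,
# q1 -num-> q2 (pl), q1 -pers-> q3 (third).
delta = {('q0', 'agr'): 'q1', ('q0', 'subj'): 'q4',
         ('q4', 'agr'): 'q1',
         ('q1', 'num'): 'q2', ('q1', 'pers'): 'q3'}

def walk(q, path):
    """Follow a path from q; None if undefined somewhere."""
    for f in path:
        q = delta.get((q, f))
        if q is None:
            return None
    return q

def reentrant(p1, p2, root='q0'):
    """True iff both paths are defined and lead to the same node."""
    n1, n2 = walk(root, p1), walk(root, p2)
    return n1 is not None and n1 == n2

print(reentrant(('agr',), ('subj', 'agr')))  # -> True
```

By contrast, 〈agr num〉 and 〈agr pers〉 reach distinct nodes, so they are not reentrant even though both are defined.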
Example (A reentrant feature graph)
This feature graph, A, is reentrant because δA(q0, 〈agr〉) = δA(q0, 〈subj agr〉):
[Graph omitted: q0 —agr→ q1, q0 —subj→ q4, q4 —agr→ q1, q1 —num→ q2 (marked pl), q1 —pers→ q3 (marked third).]
The (single) value of the (different) paths 〈agr〉 and 〈subj agr〉 in A is:
[Graph omitted: q1 —num→ q2 (marked pl), q1 —pers→ q3 (marked third).]
The notion of reentrancy touches on the issue of the distinction between type- and token-identity.
Two feature graphs are token-identical if their components (i.e., their sets of nodes, roots, transition functions and atom marking functions) are identical.
They are type-identical if they are isomorphic, not necessarily requiring their nodes to be identical.
We will discuss feature graph isomorphism later.
Cycles
Early feature structure based formalisms used to employ only acyclic feature graphs.
However, modern ones usually allow (or even require) feature structures to be possibly cyclic.
While the linguistic motivation for cyclic feature structures is limited, there is good practical motivation for allowing them: when implementing a system for manipulating feature graphs, it is usually easier to support cycles than to guarantee that all the graphs in a system are acyclic.
The reason is that unification, which is the major operation defined on feature graphs, can yield a cyclic graph even when its operands are acyclic.
Definition (Cycles)
A feature graph A = 〈QA, qA, δA, θA〉 is cyclic if two paths π1, π2 ∈ Π(A), where π1 is a proper subsequence of π2, are reentrant: π1 !A π2. A is acyclic otherwise.
Note that cyclicity is a special case of reentrancy (every cyclic feature graph is reentrant, but not vice versa).
A corollary of the definition is that when a feature graph is cyclic, it has at least one node q such that δ(q, α) = q for some non-empty path α.
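The corollary suggests a direct test: a graph is cyclic iff some node is δ-reachable from itself by a non-empty path. A sketch over the cyclic example graph below (helper names are ours; depth-first search with a visited set):

```python
# The cyclic example graph C: q0 -f-> q1, q1 -h-> q1, q1 -g-> q2 (a).
delta = {('q0', 'f'): 'q1', ('q1', 'h'): 'q1', ('q1', 'g'): 'q2'}

def cyclic(nodes):
    """True iff some node reaches itself via a non-empty path."""
    def reaches(src, tgt, seen):
        for (p, f), q in delta.items():
            if p == src:
                if q == tgt:
                    return True
                if q not in seen:
                    seen.add(q)
                    if reaches(q, tgt, seen):
                        return True
        return False
    return any(reaches(q, q, set()) for q in nodes)

print(cyclic({'q0', 'q1', 'q2'}))  # -> True (q1 -h-> q1)
```

Dropping the arc ('q1', 'h') from delta would make the same test return False, since no node would reach itself.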
Example (A cyclic feature graph)
Following is a cyclic feature graph, C:
[Graph omitted: q0 —f→ q1, q1 —h→ q1 (a cycle), q1 —g→ q2 (marked a).]
The value of the path 〈f〉 in C, as well as the values of the (infinitely many) paths 〈f hn〉, for n ≥ 0, is the same feature graph:
[Graph omitted: q1 —h→ q1, q1 —g→ q2 (marked a).]
Feature structures Feature graph subsumption
Feature graph isomorphism
Since feature graphs are just a special case of directed, labeled graphs, we can adapt the well-defined notion of graph isomorphism to feature graphs.
Informally, two graphs are isomorphic when they have the same structure; the identities of their nodes may differ without affecting the structure.
In our case, we require also that the labels of sink nodes be identical in order for two graphs to be considered isomorphic.
Definition (Feature graph isomorphism)
Two feature graphs A = 〈QA, qA, δA, θA〉 and B = 〈QB, qB, δB, θB〉 are isomorphic, denoted A ∼ B, iff there exists a one-to-one and onto mapping i : QA → QB, called an isomorphism, such that:
i(qA) = qB;
for all q1, q2 ∈ QA and f ∈ Feats, δA(q1, f ) = q2 iff δB(i(q1), f ) = i(q2); and
for all q ∈ QA, θA(q) = θB(i(q)) (either both are undefined, or both are defined and equal).
Feature graph subsumption
Definition (Subsumption)
Let A1 = 〈Q1, q1, δ1, θ1〉 and A2 = 〈Q2, q2, δ2, θ2〉 be two feature graphs. A1 subsumes A2 (denoted by A1 ⊑ A2) iff there exists a total function h : Q1 → Q2, called a subsumption morphism, such that:
h(q1) = q2
for every q ∈ Q1 and for every f such that δ1(q, f )↓, h(δ1(q, f )) = δ2(h(q), f )
for every q ∈ Q1, if θ1(q)↓ then θ1(q) = θ2(h(q)).
If A1 ⊑ A2 then A1 is said to subsume, or be more general than, A2; A2 is subsumed by, or is more specific than, A1.
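For tiny graphs, the existence of a subsumption morphism can be checked by brute force, trying every total function h : Q1 → Q2 against the three clauses of the definition. A sketch (function names and the two test graphs are ours):

```python
from itertools import product

def subsumes(G1, G2):
    """True iff a subsumption morphism h : Q1 -> Q2 exists.

    A graph is a tuple (nodes, root, delta, theta); delta maps
    (node, feature) -> node, theta maps sinks to atoms.
    Exponential search: feasible only for very small graphs.
    """
    Q1, r1, d1, t1 = G1
    Q2, r2, d2, t2 = G2
    Q1, Q2 = sorted(Q1), sorted(Q2)
    for images in product(Q2, repeat=len(Q1)):
        h = dict(zip(Q1, images))
        if h[r1] != r2:                                  # clause 1
            continue
        if any(h[q2] != d2.get((h[q], f))                # clause 2
               for (q, f), q2 in d1.items()):
            continue
        if any(t2.get(h[q]) != a for q, a in t1.items()):  # clause 3
            continue
        return True
    return False

A = ({'q0', 'q1'}, 'q0', {('q0', 'num'): 'q1'}, {})            # [num: [ ]]
B = ({'p0', 'p1'}, 'p0', {('p0', 'num'): 'p1'}, {'p1': 'sg'})  # [num: sg]
print(subsumes(A, B), subsumes(B, A))  # -> True False
```

The asymmetry is the expected one: the unmarked sink of A may be mapped onto the marked sink of B, but not vice versa, since a marked sink must map to a node bearing the same atom.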
Subsumption
The morphism h associates with every node in Q1 a node in Q2; if an arc labeled f connects q with q′, then such an arc connects h(q) with h(q′).
In other words, δ and h commute: following an f-arc and then h leads to the same node as following h and then the f-arc.
[Diagram omitted: δ-arcs depicted using solid lines, h-mappings using dashed lines.]
In addition, if a node q ∈ Q1 is marked by an atom, then its image h(q) must be marked by the same atom (recall that only sinks can be thus marked).
Note that if a sink in Q1 is not marked, there is no constraint on its image (in particular, it can be a non-sink).
Subsumption morphism
Example (Subsumption morphism)
[Diagram omitted: the nodes q and q′ of A1, connected by an f-arc, are mapped by h (dashed arrows) to the nodes h(q) and h(q′) of A2, connected by a matching f-arc.]
Example (Subsumption)
A ⊑ B, where:
A: [Graph omitted: qA0 —agr→ qA1, qA1 —num→ qA2 (unmarked), qA1 —pers→ qA3 (marked third).]
B: [Graph omitted: qB0 —agr→ qB1, qB0 —subj→ qB4, qB4 —agr→ qB1, qB1 —num→ qB2 (marked pl), qB1 —pers→ qB3 (marked third).]
Indeed, B can (and does) have nodes that do not correspond to nodes in A: such is qB4 in the example.
In addition, while the sink qA2 is not marked by an atom (that is, it is a variable), its image in B, qB2, is marked as pl.
Notice that no subsumption morphism can be defined from QB to QA, since there is no node into which qB4 can be mapped.
In particular, it cannot be mapped to the root of A, since this would necessitate an arc from qA0 to itself (as the root of A would be the image of both qB4 and qB0).
Trying to take h−1 as an inverse subsumption morphism will fail, both because of qB4 and because it would map qB2 to qA2, violating the last clause of the subsumption relation (a marked sink must be mapped to a sink with the same mark).
We conclude that B ⋢ A.
Given a feature structure, what modifications can be made to it in order for it to become more specific? Three different kinds of modifications are possible:
1 Adding arcs;
2 Adding reentrancies;
3 Marking unmarked sinks by some atom.
Example (Subsumption as an order on information)
[ ] ⊑ [num: pl] (adding arcs)
[num: [ ]] ⊑ [num: pl] (adding atomic marks)
[num: sg] ⊑ [num: sg, pers: third] (adding arcs)
[num1: sg, num2: sg] ⊑ [num1: [1] sg, num2: [1]] (adding reentrancies)
Lemma
If A ⊑ B then Π(A) ⊆ Π(B).
Lemma
If A ⊑ B then for each π ∈ Π(A), if θA(δA(qA, π))↓ then θB(δB(qB, π))↓ and θA(δA(qA, π)) = θB(δB(qB, π)).
Lemma
If A ⊑ B and π1, π2 are reentrant in A (that is, π1 !A π2), then π1, π2 are reentrant in B (that is, π1 !B π2).
Corollary
If A ⊑ B, then:
Π(A) ⊆ Π(B)
for each π ∈ Π(A), if θA(δA(qA, π))↓ then θB(δB(qB, π))↓ and θA(δA(qA, π)) = θB(δB(qB, π))
for each π1, π2 ∈ Π(A), if π1 !A π2 then π1 !B π2 (and, therefore, if A is reentrant/cyclic then so is B).
Theorem
If A is an atomic feature graph and A ⊑ B, then A ∼ B.
Theorem
Subsumption has a least element: there exists a feature graph A such that for every feature graph B, A ⊑ B.
Proof.
Consider the (empty) feature graph A = 〈{q0}, q0, δ, θ〉, where δ and θ are undefined for their entire domains. For every feature graph B, A ⊑ B, by mapping (through h) the root q0 to the root of B, qB. The remaining clauses of the definition of subsumption hold vacuously.
Theorem
Subsumption is reflexive: for every feature graph A, A ⊑ A.
Proof.
Take h to be the identity function that maps every node in A to itself.
Theorem
Subsumption is transitive: if A ⊑ B and B ⊑ C then A ⊑ C.
Theorem
Subsumption is not antisymmetric: if A ⊑ B and B ⊑ A then not necessarily A = B.
Proof.
Consider the feature graphs A = 〈{qA}, qA, δ, θ〉 and B = 〈{qB}, qB, δ, θ〉, where δ and θ are undefined for their entire domains, and where qA ≠ qB. Trivially, both A ⊑ B and B ⊑ A, but A ≠ B.
Thus, feature graph subsumption forms a partial pre-order on feature graphs.
It is a pre-order since it is not antisymmetric; it is partial as there are feature graphs that are incomparable with respect to subsumption.
Example (Feature graph subsumption is a partial relation)
Feature graphs can be incomparable due to inconsistency (contradicting information) or to complementary information:
sg ⋢ pl and pl ⋢ sg
[num: sg] ⋢ [num: pl] and [num: pl] ⋢ [num: sg]
[num: [ ]] ⋢ [pers: [ ]] and [pers: [ ]] ⋢ [num: [ ]]
There is a clear connection between feature graph isomorphism and feature graph subsumption:
Theorem
A ∼ B iff A ⊑ B and B ⊑ A.
Feature structures Feature structures
Feature structures
Feature graphs are a useful notation but they are too discriminating.
Usually, the identities of the nodes in a graph are less important than the structure of the graph (including the labels on its nodes and arcs).
It is therefore beneficial to collapse feature graphs which only differ in the identities of their nodes into an equivalence class.
The definition of feature structures as equivalence classes of isomorphic feature graphs facilitates a view which emphasizes the structure and ignores the irrelevant information encoded in the nodes.
Definition (Feature structures)
Given a signature of features Feats and atoms Atoms, let FS = G|∼ be the collection of equivalence classes in G(Feats, Atoms) with respect to feature graph isomorphism. A feature structure is any member of FS.
We use meta-variables fs to range over feature structures.
Theorem
Let fs be a feature structure and let A ∈ fs, B ∈ fs be two feature graphs in fs. Then:

Π(A) = Π(B);

for each π ∈ Π(A), θA(δA(qA, π))↓ iff θB(δB(qB, π))↓, and θA(δA(qA, π)) = θB(δB(qB, π));

for each π1, π2 ∈ Π(A), π1 and π2 are reentrant in A iff they are reentrant in B (and, therefore, A is reentrant/cyclic iff B is reentrant/cyclic).
Definition
Let fs be a feature structure. Then the paths of fs are defined as Π(fs) = Π(A) for some A ∈ fs.
From now on, we will usually refer to feature structures through some feature graph representative, taking care that all definitions are representative-independent.
As an example, we can lift the definition of reentrancy from feature graphs to feature structures in the natural way:
Definition (Feature structure reentrancy)
Two paths π1, π2 are reentrant in a feature structure fs if they are reentrant in some A ∈ fs. fs is reentrant if two distinct paths π1 ≠ π2 are reentrant in it.
The definition is independent of the representative A.
Feature structure cyclicity is defined in a similar way.
As another example, we lift the definition of subsumption from feature graphs to feature structures:
Definition (Feature structure subsumption)
If fs1 and fs2 are feature structures, fs1 subsumes fs2, denoted fs1 ⊑ fs2, iff for some A ∈ fs1 and some B ∈ fs2, A ⊑ B.

Since feature structure subsumption is defined in terms of a representative, we must show that the definition is representative-independent.
Lemma
The definition of feature structure subsumption is independent of the representative: if A ∼ A′ and B ∼ B′ then A ⊑ B iff A′ ⊑ B′.

Proof.

Assume that A ∼ A′ through an isomorphism iA : QA → QA′ and B ∼ B′ through an isomorphism iB : QB → QB′. If A ⊑ B, there exists a subsumption morphism h : QA → QB. Then h′ = iB ◦ h ◦ iA⁻¹ is a subsumption morphism mapping QA′ to QB′ (the proof is left as an exercise), and hence fs(A′) ⊑ fs(B′). The other direction (if fs(A′) ⊑ fs(B′) then fs(A) ⊑ fs(B)) is completely symmetric.
Corollary
If fsA and fsB are feature structures, fsA ⊑ fsB iff for every A ∈ fsA and every B ∈ fsB, A ⊑ B.
Like feature graph subsumption, feature structure subsumption is reflexive and transitive; these properties can be easily established from their counterparts in the feature graph case.

However, unlike feature graph subsumption, feature structure subsumption is antisymmetric:

Theorem

If fs1 ⊑ fs2 and fs2 ⊑ fs1, then fs1 = fs2.

Therefore, subsumption is a partial order on feature structures.

In the sequel we will sometimes use the ‘⊑’ symbol to denote both feature graph and feature structure subsumption, when the type of the arguments of the relation is clear.
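Since subsumption is defined via morphisms, it can be checked mechanically for finite graphs. Below is a minimal brute-force sketch (the tuple encoding of graphs is my own assumption, not the book's notation): a graph is (root, delta, theta, nodes), with delta mapping (node, feature) pairs to nodes and theta mapping nodes to atoms.

```python
# Brute-force subsumption check for finite feature graphs -- a sketch,
# not the book's algorithm.
from itertools import product

def subsumes(A, B):
    """True iff a subsumption morphism h : nodes(A) -> nodes(B) exists."""
    qA, deltaA, thetaA, nodesA = A
    qB, deltaB, thetaB, nodesB = B
    nA = sorted(nodesA)
    for image in product(sorted(nodesB), repeat=len(nA)):
        h = dict(zip(nA, image))
        if h[qA] != qB:
            continue                        # h must map root to root
        respects_delta = all(
            deltaB.get((h[p], f)) == h[t]   # h must commute with delta
            for (p, f), t in deltaA.items()
        )
        respects_theta = all(
            thetaB.get(h[p]) == a           # h must preserve atoms
            for p, a in thetaA.items()
        )
        if respects_delta and respects_theta:
            return True
    return False

# A: num -> sg; B: num -> sg and pers -> third.
A = (0, {(0, 'num'): 1}, {1: 'sg'}, {0, 1})
B = (0, {(0, 'num'): 1, (0, 'pers'): 2}, {1: 'sg', 2: 'third'}, {0, 1, 2})
```

With these two graphs, subsumes(A, B) holds while subsumes(B, A) does not, mirroring the reading of ⊑ as "carries less information than".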
Feature graphs and feature structures
Example (Feature graphs and feature structures)
A feature graph A1 determines the feature structure fs1 = [A1]∼ to which it belongs; two isomorphic feature graphs A2 ∼ A′2 determine one and the same feature structure fs2 = [A2]∼. Subsumption between representatives, A1 ⊑ A2, corresponds to subsumption between the feature structures, fs1 ⊑ fs2.
Feature structures Attribute-value matrices
AVMs
We now return to attribute-value matrices (AVMs).
This is the view that we will adopt for depicting feature structures (and grammars based on them), both because they are easy to present on paper and because of their centrality in the existing literature.
Like feature graphs, AVMs are defined over a signature of features and atoms, which we fix below.
In addition, AVMs make use of variables, also called tags below. Meta-variables X, Y, Z, etc. range over variables.
Variables are used to encode sharing of values, as will become clear presently.
Where AVMs are concerned, we follow the convention of the linguistic literature by which variables are natural numbers, depicted in boxes, e.g., 3 .
Definition (AVMs)
Given a signature S, the set Avms(S) of AVMs over S is the least set satisfying the following two clauses:

1. M = Xa ∈ Avms(S) for any a ∈ Atoms and X ∈ Tags; M is said to be atomic and X is the tag of M, denoted tag(M) = X.

2. M = X[f1 : M1, . . . , fn : Mn] ∈ Avms(S) for n ≥ 0, X ∈ Tags, f1, . . . , fn ∈ Feats and M1, . . . , Mn ∈ Avms(S), where fi ≠ fj if i ≠ j. M is said to be complex, and X is the tag of M, denoted tag(M) = X. If n = 0, M = X [ ] is an empty AVM.

Note that two AVMs which differ only in their tag are distinct: if X ≠ Y, then X [· · ·] ≠ Y [· · ·]. In particular, there is no unique empty AVM. Note also that the same variable can be used more than once in an AVM.
Example (AVMs)
Consider a signature consisting of Atoms = {a} and Feats = {f, g}. Then M1 = 4 a is an AVM by the first clause of the definition, M2 = 2 [ ] is an empty AVM by the second clause, M3 = 3 [f : 4 a] is an AVM by the second clause (using M1 as the value of f, so that fval(M3, f) = M1), and

M4 = 2 [g : 3 [f : 4 a], f : 2 [ ]]

is an AVM by the second clause, as is

M5 = 4 [g : 3 [f : 4 a], f : 2 [ ]]
Meta-variables M, with or without subscripts, range over Avms; the parameter S is omitted when it is clear from the context.
The domain of an AVM M, denoted dom(M), is undefined when M is atomic, and {f1, . . . , fn} when M is complex (hence, dom(M) is empty for an empty AVM).
The value of some feature f ∈ Feats in M, denoted fval(M, f), is defined if f = fi ∈ dom(M), in which case it is Mi, and undefined otherwise.
Sub-AVMs
Definition (Sub-AVMs)
Given an AVM M, its sub-AVMs are SubAVM(M), defined as:

1. SubAVM(Xa) = {Xa}

2. SubAVM(X[f1 : M1, . . . , fn : Mn]) = {X[f1 : M1, . . . , fn : Mn]} ∪ ⋃1≤i≤n SubAVM(Mi)
Definition (Tags)
Given an AVM M, its tags Tags(M) are defined as:
1. Tags(Xa) = {X}

2. Tags(X[f1 : M1, . . . , fn : Mn]) = {X} ∪ ⋃1≤i≤n Tags(Mi)
Definition (Tagset)
The tagset of an AVM M and a tag X ∈ Tags(M) is the set of sub-AVMs of M (including M itself) which are tagged by X: TagSet(M, X) = {M′ ∈ SubAVM(M) | tag(M′) = X}.
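These recursive definitions translate directly into code. In the sketch below I assume a tuple encoding of AVMs, (tag, value), where value is an atom string for atomic AVMs and a dict from features to sub-AVMs for complex ones (so {} encodes the empty AVM); the encoding is mine, not the book's.

```python
def sub_avms(M):
    """Yield M and all of its sub-AVMs, following the recursive definition."""
    yield M
    _, val = M
    if isinstance(val, dict):          # complex AVM: recurse into the values
        for sub in val.values():
            yield from sub_avms(sub)

def tags(M):
    """Tags(M): the set of variables occurring in M."""
    return {tag for tag, _ in sub_avms(M)}

def tagset(M, X):
    """TagSet(M, X): the sub-AVMs of M (including M itself) tagged by X."""
    return [m for m in sub_avms(M) if m[0] == X]

# M4 = 2 [g : 3 [f : 4 a], f : 2 [ ]] from the running example:
M4 = (2, {'g': (3, {'f': (4, 'a')}), 'f': (2, {})})
```

Here tags(M4) is {2, 3, 4}, and tagset(M4, 2) contains both M4 itself and the empty AVM tagged 2.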
Example (AVMs)
Let:

M4 = 2 [g : 3 [f : 4 a], f : 2 [ ]]

fval(M4, f) = 2 [ ]. Observe that Tags(M4) = { 2 , 3 , 4 }. Also, TagSet(M4, 4 ) is { 4 a}, TagSet(M4, 3 ) is { 3 [f : 4 a]} and TagSet(M4, 2 ) is {M4, 2 [ ]}. Trivially, tag(M4) = 2 .
Example (AVMs)
Let:

M5 = 4 [g : 3 [f : 4 a], f : 2 [ ]]

Similarly, fval(M5, f) = 2 [ ], whereas fval(M5, g) = 3 [f : 4 a]. Observe that Tags(M5) = { 2 , 3 , 4 }. Also TagSet(M5, 2 ) = { 2 [ ]}, TagSet(M5, 3 ) = { 3 [f : 4 a]} and TagSet(M5, 4 ) = {M5, 4 a}. Trivially, tag(M5) = 4 .
Example (AVMs)
As another example, consider the AVM

M6 = 1 [f : 1 [f : 1 [f : 1 [ ]]]]

Here, Tags(M6) = { 1 }, and TagSet(M6, 1 ) is:

{M6, 1 [f : 1 [f : 1 [ ]]], 1 [f : 1 [ ]], 1 [ ]}

Of course, tag(M6) = 1 .
Well-formed AVMs
Consider some AVM

M = 1 [f1 : 2 M1, f2 : 2 M2]

where M1 ≠ M2.

Both M1 and M2 are sub-AVMs of M, and both have the same tag, although they are different.

In other words, the recursive definition of AVMs allows two different, contradicting AVMs to be in the TagSet of the same variable.

To eliminate such cases, we define well-formed AVMs as follows:

Definition (Well-formed AVMs)

An AVM M is well-formed iff for every variable X ∈ Tags(M), TagSet(M, X) includes at most one non-empty AVM.
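Well-formedness is then a simple check over the sub-AVMs. The sketch below reuses a tuple encoding of AVMs that I assume for illustration ((tag, atom_string) or (tag, {feature: sub_avm}), with {} for the empty AVM); it treats two occurrences of a variable as conflicting only when they tag distinct non-empty AVMs.

```python
def sub_avms(M):
    """Yield M and all of its sub-AVMs."""
    yield M
    _, val = M
    if isinstance(val, dict):
        for sub in val.values():
            yield from sub_avms(sub)

def is_well_formed(M):
    """Each variable may tag at most one distinct non-empty (sub-)AVM."""
    seen = {}                      # tag -> the non-empty AVM it labels
    for m in sub_avms(M):
        tag, val = m
        if val == {}:              # empty AVMs never conflict
            continue
        if tag in seen and seen[tag] != m:
            return False
        seen[tag] = m
    return True

good = (2, {'g': (3, {'f': (4, 'a')}), 'f': (2, {})})
bad = (1, {'f1': (2, 'a'), 'f2': (2, 'b')})  # tag 2 on two different AVMs
```

On these inputs, is_well_formed accepts `good` (the running example M4) and rejects `bad`, which tags two contradicting atomic AVMs with the same variable.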
Variable associations
Henceforth, we only consider well-formed AVMs.
This allows us to provide a concise interpretation of shared values in AVMs: we wish to make explicit the special role that multiple occurrences of the same variable in a single AVM play.
To this end, we would like to say that the association of a variable X ∈ Tags(M) in an AVM M, written assoc(M, X), is the AVM which is tagged by X; if, in a given AVM M, a variable X occurs exactly once, then assoc(M, X) is a single, unique value.
If, however, X occurs more than once in M, special care is required. Recall that for well-formed AVMs, at most one of these multiple occurrences is associated with a non-empty AVM.
Definition (Variable association)
For a variable X ∈ Tags(M), the association of X in M, denoted assoc(M, X), is the single non-empty AVM in TagSet(M, X); if only X [ ] is a member of TagSet(M, X), then assoc(M, X) = X [ ].

Note that assoc assigns exactly one sub-AVM of M to each variable occurring in M, independently of the number of occurrences of the variable in M or the size of TagSet(M, X).
Example (Variable association)
Consider the well-formed AVM

M = 2 [g : 3 [f : 4 a], f : 2 [ ]]

Observe that assoc(M, 2 ) = M, assoc(M, 3 ) = 3 [f : 4 a] and assoc(M, 4 ) = 4 a. The two occurrences of the variable 2 have one and the same association. For M′ = 4 [f : 4 [ ]], assoc(M′, 4 ) = M′.
AVM paths
Definition (AVM paths)
Let M be an AVM. Let Arcs(M) be defined as: Arcs(M) = {⟨X, f, Y⟩ | X, Y ∈ Tags(M), f ∈ dom(assoc(M, X)) and tag(fval(assoc(M, X), f)) = Y}. Let Arcs* be the extension of Arcs to paths, defined (recursively) by:

for all X ∈ Tags(M), ⟨X, ε, X⟩ ∈ Arcs*(M);

if ⟨X, f, Y⟩ ∈ Arcs(M) then ⟨X, f, Y⟩ ∈ Arcs*(M);

if ⟨X, f, Y⟩ ∈ Arcs(M) and ⟨Y, π, Z⟩ ∈ Arcs*(M) then ⟨X, f · π, Z⟩ ∈ Arcs*(M).

The paths of M, denoted Π(M), are the set {π | X = tag(M) and for some variable Y ∈ Tags(M), ⟨X, π, Y⟩ ∈ Arcs*(M)}.
Paths
Example (Paths)
Consider again the AVM

M = 2 [g : 3 [f : 4 a], f : 2 [ ]]

Observe that Arcs(M) = {⟨ 2 , g, 3 ⟩, ⟨ 2 , f, 2 ⟩, ⟨ 3 , f, 4 ⟩}. Therefore, Arcs*(M) includes, in addition to the elements of Arcs(M), also ⟨ 2 , ε, 2 ⟩, ⟨ 2 , ⟨g f⟩, 4 ⟩ and, due to the multiple occurrence of 2 , the infinitely many triples ⟨ 2 , f^i · ⟨g, f⟩, 4 ⟩ for any i ≥ 0.
Path values
Definition (Path values)
The value of a path π in an AVM M, denoted pval(M, π), is assoc(M, Y), where Y is such that ⟨tag(M), π, Y⟩ ∈ Arcs*(M). This is well defined since Arcs* is functional. Similarly, pval is partial since Arcs* is partial.
Example (Path values)
In the AVM

M = 2 [g : 3 [f : 4 a], f : 2 [ ]]

pval(M, ε) = M; pval(M, ⟨g⟩) = 3 [f : 4 a]; and pval(M, ⟨f⟩) = pval(M, ⟨f f⟩) = pval(M, ⟨f f f⟩) = M. pval(M, ⟨g g⟩) is undefined.
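Path values can be computed by resolving, at each step, the current tag to its association before following the next feature; this also handles cyclic AVMs such as the one above. The sketch below again assumes a tuple encoding of AVMs ((tag, atom) or (tag, {feature: sub_avm})) that is mine, not the book's.

```python
def sub_avms(M):
    """Yield M and all of its sub-AVMs."""
    yield M
    _, val = M
    if isinstance(val, dict):
        for sub in val.values():
            yield from sub_avms(sub)

def assoc(M, X):
    """The single non-empty AVM tagged X in M, else the empty X [ ]."""
    members = [m for m in sub_avms(M) if m[0] == X]
    for m in members:
        if m[1] != {}:
            return m
    return members[0]

def pval(M, path):
    """pval(M, path), or None where the path value is undefined."""
    cur = M
    for f in path:
        _, val = assoc(M, cur[0])  # resolve shared tags before stepping
        if not isinstance(val, dict) or f not in val:
            return None            # pval is a partial function
        cur = val[f]
    return assoc(M, cur[0])

# M = 2 [g : 3 [f : 4 a], f : 2 [ ]] from the example:
M = (2, {'g': (3, {'f': (4, 'a')}), 'f': (2, {})})
```

On this M, pval(M, ()) and pval(M, ('f', 'f', 'f')) both return M, reproducing the cyclic behaviour just illustrated, while pval(M, ('g', 'g')) is undefined.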
Reentrancy
Definition (Reentrancy)

Two paths π1 and π2 are reentrant in an AVM M if pval(M, π1) = pval(M, π2). An AVM M is reentrant if there exist two distinct paths π1, π2 which are reentrant in it.

In the AVM M of the previous example, ε and ⟨f⟩ are reentrant because pval(M, ε) = pval(M, ⟨f⟩) = M.

Definition (Cyclic AVMs)

An AVM M is cyclic if two paths π1, π2 ∈ Π(M), where π1 is a proper subsequence of π2, are reentrant.

M of the previous example is therefore cyclic, e.g., by the paths ε and ⟨f⟩.
Example (A reentrant AVM)
The following AVM is reentrant but not cyclic:

0 [agr : 1 [num : 2 pl, pers : 3 third], subj : 4 [agr : 1 ]]
Conventions
We introduce three conventions regarding the depiction of well-formed AVMs, motivated by the fact that variables are used primarily to indicate value sharing.
If a variable occurs more than once, then its value is explicated only once; where this value is explicated (i.e., next to which occurrence of the variable) is immaterial.
Variables which occur only once can be omitted.
The empty AVM is sometimes omitted when it is associated with a variable.
The first convention is crucial in the case of cyclic AVMs: there is no finite representation of cyclic AVMs unless this convention is adopted.
Example (Shorthand notation for AVMs)
Consider the following AVM:

6 [f : 3 [ ], g : 4 [h : 3 a], h : 2 [ ]]

Notice that it is well-formed, since the only variable occurring more than once ( 3 ) is associated with a non-empty value (a) only once.

We can therefore leave only one occurrence of the value explicit.

The tag 2 is associated with the empty feature structure, which can be omitted.

Finally, the tags 4 and 6 occur only once, so they can be omitted.

This is the conventional form of the AVM.
AVM subsumption
Definition (AVM subsumption)
Let M1, M2 be AVMs over the same signature. M1 subsumes M2, denoted M1 ⪯ M2, if there exists a total function h : Tags(M1) → Tags(M2) such that:

1. h(tag(M1)) = tag(M2);

2. for every ⟨X, f, Y⟩ ∈ Arcs(M1), ⟨h(X), f, h(Y)⟩ ∈ Arcs(M2);

3. for every X ∈ Tags(M1), if assoc(M1, X) is atomic then assoc(M2, h(X)) is atomic, with the same atom.
Lemma
If M1 ⪯ M2 through h and ⟨X, π, Y⟩ ∈ Arcs*(M1), then ⟨h(X), π, h(Y)⟩ ∈ Arcs*(M2).
Corollary
If M1 ⪯ M2 then Π(M1) ⊆ Π(M2), and if π1 and π2 are reentrant in M1, then they are reentrant in M2.
AVM isomorphism
When two AVMs are identical up to the variables which occur in them, one AVM can be obtained from the other by a systematic renaming of the variables;
we say that the two AVMs are isomorphic.
Example (Isomorphic AVMs)
Let

M1 = 2 [g : 3 [f : 4 a], f : 2 [ ]]

M2 = 22 [g : 23 [f : 24 a], f : 22 [ ]]

Then M2 can be obtained from M1 by systematically substituting 22 for 2 , 23 for 3 and 24 for 4 .
Renaming
Of course, one must be careful when renaming variables, especially when the same variable may occur in both AVMs.
For example, if M = 2 [f : 1 a], then renaming 1 to 2 will result in M = 2 [f : 2 a], which is not even well-formed.
Theorem
If M1 and M2 are isomorphic AVMs, then both M1 ⪯ M2 and M2 ⪯ M1.
AVM equivalence
Another case of AVM equivalence is induced by the convention by which, if a variable occurs more than once in an AVM, then its value is explicated only once.
A consequence of this convention is that two AVMs which differ only with respect to where the (single) value of some multiply occurring variable is explicated subsume each other, as they induce the same set of Arcs.
Example (AVM equivalence)
M1 and M2 differ only in the instance of 1 whose value is explicated:

M1 = 0 [agr : 1 [num : 2 pl, pers : 3 third], subj : 4 [agr : 1 ]]

M2 = 0 [agr : 1 , subj : 4 [agr : 1 [num : 2 pl, pers : 3 third]]]

Then M1 ⪯ M2 and M2 ⪯ M1.
Theorem
Let M and M′ be two AVMs such that X ∈ Tags(M) ∩ Tags(M′), and assume that X occurs twice in M and in M′ (that is, |TagSet(M, X)| > 1 and |TagSet(M′, X)| > 1). If M and M′ are identical up to the choice of which instance of X in them is explicated, then M ⪯ M′ and M′ ⪯ M.
Definition (Renaming)
Let M1 and M2 be two AVMs. M2 is a renaming of M1, denoted M1 ≃ M2, iff M1 ⪯ M2 and M2 ⪯ M1.
Example (AVM renamings)
The following two AVMs are renamings of each other:

M1 = 0 [agr : 1 [num : 2 pl, pers : 3 third], subj : 4 [agr : 1 ]]

M2 = 10 [agr : 11 , subj : 14 [agr : 11 [num : 12 pl, pers : 13 third]]]
Feature structures The correspondence between feature graphs and AVMs
The correspondence between feature graphs and AVMs
AVMs are the entities that the linguistic literature employs to depict feature structures;
feature graphs are well-understood mathematical entities to which various results of graph theory can be applied.
We define the relationship between these two views.
From AVMs to feature graphs
We formalize the correspondence between AVMs and feature graphs by presenting a mapping, φ, which embodies the relation between an AVM and its feature graph image.
Informally, a given AVM M is mapped to a concrete graph whose nodes are the variables occurring in the AVM, Tags(M).
The root of the graph is the variable tagging the entire AVM, and the arcs are determined using the function fval.
Atomic AVMs are mapped to single nodes, labeled by the atom, with no outgoing arcs.
Empty AVMs are mapped to a graph having just one node, bearing no label and having no outgoing arcs.
Complex AVMs are mapped to graphs whose nodes, including the root, may have outgoing arcs, where the arcs' labels correspond to features.
Definition (AVM to graph mapping)
Let M be a well-formed AVM. The feature graph image of M is φ(M) = ⟨Q, q, δ, θ⟩, where:

Q = Tags(M);

q = tag(M);

for all X ∈ Tags(M) and f ∈ Feats, δ(X, f) = Y iff ⟨X, f, Y⟩ ∈ Arcs(M); and

for all X ∈ Tags(M) and a ∈ Atoms, θ(X) = a iff assoc(M, X) is the atomic AVM Xa, and θ(X) is undefined otherwise.

Note that if M1 and M2 are two AVMs which differ only in the order of the “rows” of feature–value pairs, they will be mapped by φ to exactly the same feature graph.
Example (AVM to graph mapping)
Let

M = 3 [f : 1 [f1 : 7 a], g : 2 [g1 : 9 a, g2 : 1 [ ]]]

M is well-formed. The associations of the variables of M are:

Variable  Association
1         1 [f1 : 7 a]
2         2 [g1 : 9 a, g2 : 1 [ ]]
3         M itself
7         7 a
9         9 a
Example (AVM to graph mapping)
The feature graph image of M is φ(M) = ⟨Q, q, δ, θ⟩ where Q = { 3 , 1 , 7 , 2 , 9 }, q = 3 , θ( 7 ) = θ( 9 ) = a (and θ is undefined elsewhere), and δ is given by: δ( 3 , f) = 1 , δ( 3 , g) = 2 , δ( 1 , f1) = 7 , δ( 2 , g1) = 9 , δ( 2 , g2) = 1 , and δ is undefined elsewhere.

[Depiction of φ(M): the root 3 has an f-arc to 1 and a g-arc to 2 ; 1 has an f1-arc to the a-labeled node 7 ; 2 has a g1-arc to the a-labeled node 9 and a g2-arc back to 1 .]
Example (AVM to graph mapping)
A reentrant AVM and its feature graph image:

M = 0 [agr : 1 [num : 2 pl, pers : 3 third], subj : 4 [agr : 1 ]]

[Depiction of φ(M): the root 0 has an agr-arc to 1 and a subj-arc to 4 ; 4 has an agr-arc to the same node 1 ; 1 has a num-arc to the pl-labeled node 2 and a pers-arc to the third-labeled node 3 .]
Example (AVM to graph mapping in the face of cycles)
Let M be the (cyclic) AVM

M = 3 [f : 3 [ ]]

where Tags(M) = { 3 }. Observe that M is well-formed, as the only variable that occurs more than once in M, namely 3 , has only one non-empty AVM associated with it: M itself. The graph φ(M) will therefore be ⟨Q, q, δ, θ⟩, where Q = { 3 }, q = 3 , θ( 3 ) is undefined, δ( 3 , f) = 3 , and δ is undefined elsewhere. This graph consists of the single node 3 with an f-arc from the node to itself.
Lemma
If M is an AVM and A = φ(M) is its feature graph image, then for all X, Y ∈ Tags(M) and π ∈ Paths, ⟨X, π, Y⟩ ∈ Arcs*(M) iff δA(X, π) = Y.
Corollary
If M is an AVM and A = φ(M) is its feature graph image, then Π(M) = Π(A), and for all π1, π2 ∈ Paths, π1 and π2 are reentrant in M iff they are reentrant in φ(M).
Theorem
For all AVMs M1, M2, M1 ⪯ M2 iff φ(M1) ⊑ φ(M2).
Corollary
For all AVMs M1, M2, M1 ≃ M2 iff φ(M1) ∼ φ(M2).

This concludes the first direction of the correspondence between AVMs and feature graphs.
From feature graphs to AVMs
For the reverse direction, we define a mapping, η, from feature graphs to AVMs.
As above, there should be a correspondence between nodes in the graph and variables in the AVM.
But note that while the nodes of a feature graph are part of the definition of the graph, AVMs are defined over a universal set of variables.
We must therefore pre-define a set of variables, called V below, for each AVM M, to serve as Tags(M).
In addition, AVMs exhibit a degree of freedom which is not present in feature graphs;
this is due to the fact that multiple occurrences of the same variable can be explicated along with any of the instances of the variable.
To overcome this difficulty, we first introduce the notion of an arborescence.
Arborescence
Definition
Given a feature graph A = ⟨Q, q, δ, θ⟩, a tree τ = ⟨Q, E⟩, where E ⊆ δ, is an arborescence of A if τ is a minimum spanning directed tree of A, rooted in q.
Informally, an arborescence of a given feature graph is a tree consisting of the nodes of the graph and the minimum number of arcs required for defining some shortest possible path from the root to each of the nodes in the graph.
Since feature graphs are connected and each node is accessible from the root, such a tree always exists, but it is not necessarily unique.
A simple algorithm for producing an arborescence scans the graph, from the root, in some order, and marks each node with the length of the shortest path from the root to that node, additionally marking the incoming arcs of the node that are parts of minimum-length paths.
Then, for each node with in-degree greater than 1, only a single marked arc is retained.
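The marking algorithm just described is essentially a breadth-first search that keeps, for each node, the first arc that reaches it. A minimal sketch (the δ-dict graph encoding is my assumption, not the book's):

```python
from collections import deque

def arborescence(root, delta):
    """Return E, a subset of delta forming a spanning directed tree
    rooted at root, reaching each node along a shortest path."""
    E = {}
    seen = {root}
    frontier = deque([root])
    while frontier:
        p = frontier.popleft()
        for (src, f), tgt in sorted(delta.items()):
            if src == p and tgt not in seen:   # first arc into tgt wins
                seen.add(tgt)
                E[(src, f)] = tgt
                frontier.append(tgt)
    return E

# The cyclic graph q0 -f-> q1 -g-> q0: its unique arborescence keeps f.
delta = {('q0', 'f'): 'q1', ('q1', 'g'): 'q0'}
```

Here arborescence('q0', delta) returns {('q0', 'f'): 'q1'}, dropping the back arc labeled g. When several shortest incoming arcs exist, the tie-break (here, sorted order) picks one of them, which is why arborescences need not be unique.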
Example (Arborescence)
Let A be the graph with nodes q1, q2, q3, q7, q9, root q3 and θ(q7) = a, whose arcs are δ(q3, f) = q1, δ(q3, g) = q2, δ(q1, f1) = q7, δ(q2, g1) = q9 and δ(q2, g2) = q1.

Then the following trees are arborescences of A: the tree retaining the arcs labeled f, f1, g and g1 (reaching q1 through f), and the tree retaining the arcs labeled f1, g, g1 and g2 (reaching q1 through g · g2).
Definition (Feature graph to AVM mapping)
Let A = ⟨Q, q, δ, θ⟩ be a feature graph and let τ = ⟨Q, E⟩ be an arborescence of A. Let V ⊆ Tags be a set of |Q| variables and I : Q → V be a one-to-one mapping. For each node q ∈ Q, define MτI(q) as:

if δ(q, f)↑ for all f ∈ Feats and θ(q)↑, then MτI(q) = I(q) [ ];

if δ(q, f)↑ for all f ∈ Feats and θ(q) = a, then MτI(q) = I(q) a;

if δ(q, fi) = qi for 1 ≤ i ≤ n, where n is the out-degree of q, then MτI(q) = I(q) [f1 : α1, . . . , fn : αn], where αi = MτI(qi) if ⟨q, fi, qi⟩ ∈ E, and αi = I(qi) otherwise.

The AVM expression of A with respect to an arborescence τ is ητI(A) = MτI(q).
Feature graph to AVM mapping
Example (Feature graph to AVM mapping)
Let A be the graph with nodes q1, q2, q3, q7, q9, root q3 and θ(q7) = a, whose arcs are δ(q3, f) = q1, δ(q3, g) = q2, δ(q1, f1) = q7, δ(q2, g1) = q9 and δ(q2, g2) = q1.

Let τ = ⟨Q, E⟩ be the arborescence of A that retains the arcs labeled f, f1, g and g1 (dropping g2).
Example (Feature graph to AVM mapping)
Since Q = {q1, q2, q3, q7, q9} we select a set of five variables from Tags; say, V = { 1 , 2 , 3 , 7 , 9 }. We define a one-to-one mapping I from Q to V; here, the function which maps qi to i .
To compute the AVM expression of A (with respect to τ and I) we start with the sinks of the graph: nodes with no outgoing edges. There are two such nodes in A, namely q7 and q9. By the definition, MτI(q9) = I(q9) [ ] = 9 [ ], and MτI(q7) = I(q7) a = 7 a. Then,

MτI(q1) = I(q1) [f1 : MτI(q7)] = 1 [f1 : 7 a].
Example (Feature graph to AVM mapping)
More interestingly,

MτI(q2) = I(q2) [g1 : MτI(q9), g2 : I(q1)] = 2 [g1 : 9 [ ], g2 : 1 ].

Note how the value of 1 is not explicated, as the arc ⟨q2, g2, q1⟩ is not included in τ. Finally,

M = MτI(q3) = I(q3) [f : MτI(q1), g : MτI(q2)] = 3 [f : 1 [f1 : 7 a], g : 2 [g1 : 9 [ ], g2 : 1 ]]

Observe that the result is a well-formed AVM, and that the reentrancy in A is reflected in M.
Example (Feature graph to AVM mapping)
Had we chosen the other arborescence of A, the resulting AVM would have been:

3 [f : 1 , g : 2 [g1 : 9 [ ], g2 : 1 [f1 : 7 a]]]
Example (Feature graph to AVM mapping in the face of cycles)
Let A be the graph with two nodes q0 and q1, root q0, δ(q0, f) = q1 and δ(q1, g) = q0; its unique arborescence τ retains only the f arc. Define V = { 0 , 1 } and let I map qi to i . Then MτI(q1) = 1 [g : 0 ], and hence

M = MτI(q0) = 0 [f : MτI(q1)] = 0 [f : 1 [g : 0 ]].
Recall that the function η, mapping a feature graph to an AVM, is dependent on τ, the arborescence chosen for the graph.
When a given feature graph A has several different arborescences, it has several different AVM expressions.
However, these expressions are not arbitrarily different; in fact, they are all renamings of each other.
Lemma
Let A = ⟨Q, q, δ, θ⟩ be a feature graph and let τ1 = ⟨Q1, E1⟩, τ2 = ⟨Q2, E2⟩ be two arborescences of A. Let V1, V2 ⊆ Tags be two sets of |Q| variables and let I1 : Q → V1, I2 : Q → V2 be two one-to-one mappings. Then ητ1I1(A) and ητ2I2(A) are renamings of each other.
Lemma
If A = ⟨Q, q, δ, θ⟩ is a feature graph and M = ητI(A) is any one of its AVM expressions, then for all q1, q2 ∈ Q and f ∈ Feats, δA(q1, f) = q2 iff ⟨I(q1), f, I(q2)⟩ ∈ Arcs(M).
Corollary
If A is a feature graph and M = ητ,I(A) is any one of its AVM expressions, then:

Π(A) = Π(M);

for every path π, pval(M, π) is an atomic AVM with the atom a iff valA(π) is the graph 〈{q}, q, δ, θ〉 for some node q, where δ is undefined and θ(q) = a; and

for every π1, π2, π1 and π2 are reentrant in A iff they are reentrant in M.
Corollary
If A is a feature graph and M1 = ητ1,I1(A), M2 = ητ2,I2(A) are two of its AVM expressions, then M1 ≃ M2.
Theorem
For all feature graphs A1 = 〈Q1, q̄1, δ1, θ1〉 and A2 = 〈Q2, q̄2, δ2, θ2〉, A1 ⊑ A2 iff for all arborescences τ1, τ2 of A1 and A2, respectively, and mappings I1, I2, ητ1,I1(A1) ⪯ ητ2,I2(A2).
Corollary
For all feature graphs A1,A2, A1 ∼ A2 iff η(A1) ≃ η(A2).
AVMs, feature graphs, feature structures and AFSs
Example (AVMs, feature graphs, feature structures and AFSs)
AVM        | Feature Graph | Feature Structure | AFS
M1         | A1            | fs1 = [A1]∼       | F1 = Abs(A1)
M2 ≃ M′2   | A2 ∼ A′2      | fs2 = [A2]∼       | F2 = Abs(A2)

[Diagram: η (with an arborescence τ or τ′) and φ map between AVMs and feature graphs; [·]∼ maps feature graphs to feature structures; Abs and Conc map between feature structures and AFSs. Subsumption corresponds across the four views (⪯ for AVMs and AFSs, ⊑ for feature graphs and feature structures).]
Unification Motivation
Unification
We presented different views of feature structures, with correspondences among them.

For each of the views, a subsumption relation was defined in a natural way.

We now define the operation of unification for these views.

The subsumption relation compares the information content of feature structures.

Unification combines the information that is contained in two (compatible) feature structures.

We use the term ‘unification’ to refer to both the operation and its result. Whenever two feature structures are related, they are assumed to be over the same signature.
The mathematical interpretation of “combining” two members of a partially ordered set is to take the least upper bound of the two operands with respect to the partial order; in our case, subsumption.

Indeed, feature structure unification is exactly that.

However, since subsumption is antisymmetric for feature structures and AFSs but not for feature graphs and AVMs, a unique least upper bound cannot be guaranteed for all four views.
Unification Feature structure unification
Feature structure unification
Definition (Feature structure unification)
Two feature structures fs1 and fs2 are consistent if they have an upper bound (with respect to subsumption), and inconsistent otherwise. If fs1 and fs2 are consistent, their unification, denoted fs1 ⊔ fs2, is their least upper bound with respect to subsumption.

If two feature structures have an upper bound, they have a (unique) least upper bound.
Unification Feature graph unification
Feature graph unification
While the definition of unification as least upper bound is useful mathematically, it does not tell us how to compute the unification of two given feature structures.

To this end, we provide a constructive definition in terms of feature graphs, which induces an algorithm for computing unification.

For reasons that will become clear presently, we require that the two feature graphs be node-disjoint.
Definition
Let A = 〈QA, qA, δA, θA〉 and B = 〈QB, qB, δB, θB〉, with QA ∩ QB = ∅, be two feature graphs. Let ‘u≈’ be the least equivalence relation on QA ∪ QB such that:

qA u≈ qB;

for every q1, q2 ∈ QA ∪ QB and f ∈ Feats, if q1 u≈ q2, (δA ∪ δB)(q1, f)↓ and (δA ∪ δB)(q2, f)↓, then (δA ∪ δB)(q1, f) u≈ (δA ∪ δB)(q2, f).
The ‘u≈’ relation partitions the nodes of QA ∪ QB into equivalence classes such that both roots are in the same class, and if some feature is defined for two nodes in one class, then the two nodes this feature leads to are also in one (possibly different) class.

Clearly, the number of equivalence classes (called the index of ‘u≈’) is finite.

The requirement that QA and QB be disjoint is essential here: we would want two nodes to be in the same equivalence class with respect to ‘u≈’ only if they comply with the above definition; if we allowed a non-empty intersection of nodes, ‘u≈’ could have been a different relation.
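The closure can be computed directly from the definition by a naive fixpoint over node pairs. The sketch below is our own illustration, not the book's code: each graph's δ is encoded as a dictionary from (node, feature) pairs to nodes, and the two node sets are assumed disjoint.

```python
def u_approx_classes(delta_a, root_a, delta_b, root_b):
    """Compute the equivalence classes of the least relation 'u≈'."""
    delta = {**delta_a, **delta_b}      # (δA ∪ δB); well defined since QA ∩ QB = ∅
    nodes = {q for (q, _) in delta} | set(delta.values()) | {root_a, root_b}
    feats = {f for (_, f) in delta}
    parent = {q: q for q in nodes}      # union-find forest over QA ∪ QB

    def find(q):
        while parent[q] != q:
            q = parent[q]
        return q

    def union(p, q):
        parent[find(p)] = find(q)

    union(root_a, root_b)               # first condition: qA u≈ qB
    changed = True
    while changed:                      # close under the second condition
        changed = False
        for q1 in nodes:
            for q2 in nodes:
                if find(q1) != find(q2):
                    continue
                for f in feats:
                    t1, t2 = delta.get((q1, f)), delta.get((q2, f))
                    if t1 is not None and t2 is not None and find(t1) != find(t2):
                        union(t1, t2)
                        changed = True
    classes = {}
    for q in nodes:
        classes.setdefault(find(q), set()).add(q)
    return {frozenset(c) for c in classes.values()}
```

For instance, with A consisting of a0 −f→ a1 and B of b0 −f→ b1, the roots are merged and, consequently, so are their f-targets.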
The u≈ relation

[Figure: two node-disjoint feature graphs. A: qA0 −f→ qA1, qA0 −g→ qA1, qA1 −num→ qA2 with θ(qA2) = sg. B: qB0 −f→ qB1, qB1 −pers→ qB2 with θ(qB2) = 3rd.]
The u≈ relation

[Figure: a second example. A has nodes qA0, qA1, qA2 with arcs labeled f, g and h; B has nodes qB0, qB1 with arcs labeled f and g.]
Type-respecting relation
Definition
A binary relation ‘≈’ over the nodes QA ∪ QB of two feature graphs is said to be type respecting iff for every node q ∈ QA ∪ QB, if (θA ∪ θB)(q)↓ and (θA ∪ θB)(q) = a, then for every node q′ such that q ≈ q′, q′ is a sink and either (θA ∪ θB)(q′)↑ or (θA ∪ θB)(q′) = a.
When is ‘u≈’ not type respecting?

The above condition constrains a node q ∈ QA ∪ QB only if (θA ∪ θB)(q)↓; that is, only if q is a marked sink in either A or B.

The type-respecting condition requires that all nodes that are equivalent to q be sinks, either unmarked or marked by the same atom.

Since this is the only requirement, the relation is not type respecting iff it maps two nodes, one of which is a marked sink and the other of which is either a non-sink or a sink with a different label, to the same equivalence class.

A non-type-respecting ‘u≈’ is the only source of unification failure.
Type respecting u≈ relation

[Figure: the graphs A (arcs f, g, num; atom sg) and B (arcs f, pers; atom 3rd) of the earlier example; here ‘u≈’ is type respecting, since no equivalence class contains a marked sink together with a non-sink or a differently marked sink.]
Lemma
If A and B have a common upper bound C, such that A ⊑ C through the morphism hA and B ⊑ C through the morphism hB, and if qA ∈ QA and qB ∈ QB are such that qA u≈ qB, then hA(qA) = hB(qB).
Definition (Feature graph unification)
Let A and B be two feature graphs such that QA and QB are disjoint. The unification of A and B, denoted A ⊔ B, is defined only if ‘u≈’ is type respecting, in which case it is the feature graph 〈Q, q̄, δ, θ〉, where:

Q = { [q]u≈ | q ∈ QA ∪ QB }

q̄ = [qA]u≈ (= [qB]u≈)

δ([q]u≈, f) = [q′′]u≈ if there exists q′ ∈ [q]u≈ s.t. (δA ∪ δB)(q′, f) = q′′; undefined if (δA ∪ δB)(q′, f)↑ for all q′ ∈ [q]u≈

θ([q]u≈) = (θA ∪ θB)(q′) if there exists q′ ∈ [q]u≈ s.t. (θA ∪ θB)(q′)↓; undefined if (θA ∪ θB)(q′)↑ for all q′ ∈ [q]u≈

If ‘u≈’ is not type respecting, A and B are inconsistent.
[Figure: unifying the graphs A (qA0 with f and g arcs to qA1, and num from qA1 to sg) and B (qB0 with an f arc to qB1, and pers from qB1 to 3rd) yields a graph whose root has f and g arcs into a single merged node carrying both num (to sg) and pers (to 3rd).]
Unification
To see that the result of unification is indeed a feature graph, observe that:

〈Q, q̄, δ, θ〉 is connected because both A and B are connected;

it is finite since both A and B are (and hence the number of equivalence classes is finite);

and θ labels only sinks, since ‘u≈’ is type respecting.
Example (Unification combines information)
[Figure: q0 −num→ q1 (sg) unified with q3 −pers→ q5 (3rd) yields q6 with q6 −num→ q7 (sg) and q6 −pers→ q8 (3rd): the information of both arguments is combined.]
Example (Unification is absorbing)
[Figure: q0 −num→ q1 (sg) unified with q3 −num→ q4 (sg), q3 −pers→ q5 (3rd) yields q6 −num→ q7 (sg), q6 −pers→ q8 (3rd): the result coincides with the more specific argument.]
Unification with reentrancies
[Figure: unifying a graph whose subj arc leads to a node with num : sg and whose obj arc leads to a node with pers : 3rd, with a graph in which subj and obj are reentrant, yields a graph in which subj and obj share a single node carrying both num : sg and pers : 3rd.]
Theorem
If A and B are inconsistent, they do not have a common upper bound. Otherwise, C = A ⊔ B is a minimal upper bound of A and B with respect to (feature graph) subsumption.
The previous theorem connects feature graph unification with feature structure unification.

In order to compute fs = fs1 ⊔ fs2, simply compute A = A1 ⊔ A2, where A1 ∈ fs1 and A2 ∈ fs2, and take fs = [A]∼.
Theorem
For all feature graphs A1,A2, if A = A1 ⊔ A2 then [A]∼ = [A1]∼⊔[A2]∼.
Unification as a computational process
Unification, as defined above, turns out to be very efficient to implement.
Several algorithms for feature structure unification have beenproposed.
We present a simple algorithm, based directly on the definition, forunifying two feature graphs.
The algorithm uses two operations, known as union and find, tomanipulate equivalence classes.
Feature graphs are implemented using the following data structure: each node q is a record (or structure) with the fields:

label, specifying θ(q) (if defined); and

feats, specifying δ, which is a list of pairs consisting of features and pointers to nodes.

A node q is a sink if and only if q.feats is empty, and only such nodes are labeled. If θ(q)↑, the field label is nil.

The functions is_labeled and is_sink receive a node record and return true if and only if the node is labeled or has no outgoing edges, respectively.
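In Python, such a record might be sketched as a dataclass. The field names follow the slides; `class_` (introduced below for union-find) is our substitute for the reserved word `class`, and a dictionary replaces the list of pairs:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    label: Optional[str] = None                             # θ(q); None encodes nil, i.e. θ(q)↑
    feats: Dict[str, "Node"] = field(default_factory=dict)  # δ: feature -> target node

def is_labeled(q: Node) -> bool:
    return q.label is not None

def is_sink(q: Node) -> bool:
    return len(q.feats) == 0
```

A labeled node is then necessarily created with empty feats, matching the invariant that only sinks are labeled.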
To implement the union-find algorithm, an additional field, class, isadded to nodes.
It is used to point to the equivalence class of the node.
Upon initialization of the algorithm, for every node q, q.class points back to q, indicating that each node is a separate equivalence class.
Example (Internal representation of a feature graph)
[Figure: a graph with nodes qA0, qA1, qA2, arcs labeled f and g leaving qA0 and h leaving qA1, and θ(qA2) = sg. Each node is a record: the first with label nil and feats 〈f, g〉; the second with label nil and feats 〈h〉; the third with label sg and empty feats. Each record's class field points back to the record itself.]
The find operation receives a node and returns a unique, canonical representative of its equivalence class.
Also, union receives two (representatives of) classes and merges themby setting the equivalence class of all members of the second class tothat of the first.
Unification algorithm
Example (Unification algorithm)
input: two feature graphs A and B
output: if fs(A) and fs(B) are unifiable, then a representative of fs(A) ⊔ fs(B); else fail.
S ← {〈qA, qB〉}
while S ≠ ∅
    select a pair 〈q1, q2〉 ∈ S; S ← S \ {〈q1, q2〉}
    q1 ← find(q1); q2 ← find(q2)
    if q1 ≠ q2 then
        if (is_labeled(q1) and not is_sink(q2)), or
           (is_labeled(q2) and not is_sink(q1)), or
           (is_labeled(q1) and is_labeled(q2) and q1.label ≠ q2.label) then fail
        else
            union(q1, q2)
            if (is_sink(q1) and is_sink(q2) and is_labeled(q2)) then q1.label ← q2.label
            for each 〈f, q〉 ∈ q2.feats
                if there is some 〈f, p〉 ∈ q1.feats then
                    S ← S ∪ {〈p, q〉}
                else
                    add 〈f, q〉 to q1.feats
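The pseudocode translates almost line by line into Python. The sketch below is our own rendering, not the book's code: it carries its own node record and a minimal union-find (without path compression), and, like the original, it is destructive.

```python
from typing import Dict, Optional

class Node:
    """A feature graph node: label implements θ, feats implements δ."""
    def __init__(self, label: Optional[str] = None,
                 feats: Optional[Dict[str, "Node"]] = None):
        self.label = label
        self.feats = feats if feats is not None else {}
        self.class_ = self              # initially, each node is its own class

def find(q: "Node") -> "Node":
    while q.class_ is not q:            # follow class pointers to the representative
        q = q.class_
    return q

def union(q1: "Node", q2: "Node") -> None:
    find(q2).class_ = find(q1)          # merge q2's class into q1's; q1 stays representative

def is_sink(q: "Node") -> bool:
    return not q.feats

def is_labeled(q: "Node") -> bool:
    return q.label is not None

class UnificationFailure(Exception):
    pass

def unify(root_a: "Node", root_b: "Node") -> "Node":
    """Destructively unify two node-disjoint feature graphs; return the new root."""
    agenda = [(root_a, root_b)]                    # the set S of the pseudocode
    while agenda:
        q1, q2 = agenda.pop()
        q1, q2 = find(q1), find(q2)
        if q1 is q2:
            continue
        if ((is_labeled(q1) and not is_sink(q2)) or
                (is_labeled(q2) and not is_sink(q1)) or
                (is_labeled(q1) and is_labeled(q2) and q1.label != q2.label)):
            raise UnificationFailure()             # 'u≈' would not be type respecting
        union(q1, q2)
        if is_sink(q1) and is_sink(q2) and is_labeled(q2):
            q1.label = q2.label
        for f, q in list(q2.feats.items()):        # merge q2's arcs into q1
            if f in q1.feats:
                agenda.append((q1.feats[f], q))    # feature defined in both: schedule targets
            else:
                q1.feats[f] = q
    return find(root_a)
```

For example, unifying a graph for [num : sg] with one for [pers : 3rd] yields a root carrying both features, while unifying [num : sg] with [num : pl] raises UnificationFailure at the differently labeled sinks.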
Upon termination of the algorithm, the original inputs are modified.
The result is obtained by considering the equivalence class of the root qA as a root, and computing the graph that is accessible from it.

Such algorithms, which modify their inputs, are called destructive.
Unification algorithm: correctness
Lemma
The unification algorithm terminates.
Lemma
The unification algorithm computes A ⊔ B.
Unification algorithm: complexity
Every iteration of the loop merges two (different) equivalence classes into one.

If the inputs are feature graphs consisting of fewer than n nodes, the number of equivalence classes in the result is bounded by 2n.

Thus, union can be executed at most 2n times.

There are two calls to find in each iteration, so the number of find operations is bounded by 4n.

With the appropriate data structures for implementing equivalence classes, it can be proven that O(n) operations of union and find can be done in O(c(n) × n) time, where c(n) is the inverse Ackermann function, which can be considered a constant for realistic values of n.

Therefore, the unification algorithm is quasi-linear.
The algorithm is destructive: the input feature graphs are modified.

This might pose a problem: the inputs might be necessary for further uses; even worse, when the unification fails, the inputs might be lost.

To overcome the problem, the inputs to unification must be copied before they are unified, and copying of graphs is an expensive operation.

As an alternative solution, there exist non-destructive unification algorithms whose theoretical complexity (and actual run-time) is not worse than that of the algorithm we have presented.
Unification Generalization
Generalization
Unification is an information-combining operator: when two feature structures are compatible, their unification can informally be seen as a union of the information both structures encode.

Sometimes, however, a dual operation is useful, analogous to the intersection of the information encoded in feature structures.

This operation, which is much less frequently used in computational linguistics, is referred to as anti-unification, or generalization.
Defined over pairs of feature structures, generalization (denoted ⊓) is the operation that returns the most specific (or least general) feature structure that is still more general than both arguments.

In terms of the subsumption ordering, generalization is the greatest lower bound (glb) of two feature structures.
Unlike unification, generalization can never fail.
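For AVMs without reentrancies, the glb computation can be sketched over plain nested dictionaries (atoms as strings). This simplification, and all names in it, are ours; the full definition must additionally track shared values:

```python
def generalize(m1, m2):
    """Greatest lower bound of two simple AVMs (nested dicts; atoms as strings).

    Reentrancies are ignored in this sketch, so shared values may be lost.
    """
    if isinstance(m1, str) and isinstance(m2, str):
        return m1 if m1 == m2 else {}   # distinct atoms generalize to [ ]
    if isinstance(m1, dict) and isinstance(m2, dict):
        # keep only the features on which both AVMs carry information, recursively
        return {f: generalize(m1[f], m2[f]) for f in m1 if f in m2}
    return {}                            # atom vs. complex value: no common information
```

On the examples below, [num : sg] ⊓ [pers : third] yields [ ], and [num : sg] ⊓ [num : pl] yields [num : [ ]].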
Definition (Generalization)
The generalization (or anti-unification) of two feature structures fs1 and fs2, denoted fs1 ⊓ fs2, is the greatest lower bound of fs1 and fs2.
Example (Generalization)
Generalization reduces information:

  [ num : sg ] ⊓ [ pers : third ] = [ ]

Different atoms are inconsistent:

  [ num : sg ] ⊓ [ num : pl ] = [ num : [ ] ]

Generalization is restricting:

  [ num : sg ] ⊓ [ num : sg, pers : third ] = [ num : sg ]
Example (Generalization)
Empty feature structures are zero elements:

  [ ] ⊓ [ agr : [ num : sg ] ] = [ ]

Reentrancies can be lost:

  [ f : 1 [ num : sg ], g : 1 ] ⊓ [ f : [ num : sg ], g : [ num : sg ] ]
    = [ f : [ num : sg ], g : [ num : sg ] ]
Unification grammars Introduction
Unification grammars
Feature structures are the building blocks with which unification grammars are built, as they serve as the counterpart of the terminal and non-terminal symbols in CFGs.

In order to define grammars and derivations, one needs some extension of feature structures to sequences thereof.

Multi-rooted feature structures are aimed at capturing complex, ordered information and are used for representing rules and sentential forms of unification grammars. We discuss:

Multi-rooted feature graphs, a natural extension of feature graphs

Multi-rooted feature structures, which are equivalence classes of isomorphic multi-rooted feature graphs

Multi-AVMs, which are an extension of AVMs, and how they correspond to multi-rooted graphs
Unification in context
Forms and grammar rules
Derivation
Languages
Derivation trees
Unification grammars Motivation
Motivation
A naïve attempt to augment context-free rules with feature structures could have been to add to each rule a sequence of feature structures, with an element for each element in the CF skeleton.

However, rules cannot be thought of simply as sequences of feature structures.

The reason is possible reentrancies among elements of such sequences, or, in other words, among different categories in a single rule.
Example (Rule)
As a motivating example, consider a rule intended to account for agreement on number between the subject and the verb of English sentences:

  [ cat : s ] → [ cat : np, num : 4 ] [ cat : vp, num : 4 ]
The difficulty in extending feature structures to sequences thereof is the possible sharing of information among different elements of the intended sequence.

This sharing takes different forms across the various views:

In the case of multi-AVMs, the scope of variables (i.e., tags) is extended from a single AVM to a sequence

In multi-rooted feature graphs, this is expressed by the possibility of two paths, leaving two different roots, leading to the same node

In the case of abstract multi-rooted structures, the reentrancy relation must account for possible reentrancies across elements
Sequences
Two methods for representing rules (and sentential forms based upon them):

One approach is to use (single) feature structures for representing “sequences” of feature structures. Dedicated features are used to encode substructures of a feature structure, and the order among them:

Example

  [ 1 : [ cat : s ]
    2 : [ cat : np, num : 6 ]
    3 : [ cat : vp, num : 6 ] ]
In this example, the special features 1, 2 and 3 are used to encode the left-hand side, the first element and the last element of the right-hand side of the rule, respectively.

The main advantage of this approach is that the existing apparatus of feature structures suffices for representing rules as well.

However, there are several drawbacks to this solution:

The signature must be augmented to include additional atoms for representing categories, and special features to encode positions

Dedicated features (e.g., 1, 2 and 3) are required to have a special meaning

The set Feats must be considered an ordered set in order for it to be mapped to the ordered substructure of such feature structures

The number of such dedicated features that are needed is unbounded, in contradiction to our assumption that the set Feats is finite

When feature structures are typed, this results in an unbounded number of types, too
A different solution to this problem can be based on the observationthat feature structures can be used to represent lists.
A list can be simply represented as a feature structure having twofeatures, named, say, first and rest:
Example (Feature structure encoding of a list)

  [ first : 1
    rest : [ first : 2
             rest : [ first : 3
                      rest : elist ] ] ]
A representation of the motivating example with lists could be:

Example (List representation of a rule)

  [ first : [ cat : s ]
    rest : [ first : [ cat : np, num : 6 ]
             rest : [ first : [ cat : vp, num : 6 ]
                      rest : elist ] ] ]
Similar problems arise with the list representation: the features first and rest acquire a special, irregular meaning.

There is no “direct access” to the elements of a rule.
Unification grammars Multi-rooted feature graphs
Multi-rooted feature graphs
We extend feature graphs to multi-rooted feature graphs (MRGs).
Multi-rooted feature graphs are defined over the same signature (Feats and Atoms), which is assumed to be fixed.
Definition (Multi-rooted feature graphs)
A multi-rooted feature graph (MRG) is a pair 〈R, G〉 where G = 〈Q, δ, θ〉 is a finite, directed, labeled graph consisting of a non-empty, finite set Q of nodes (disjoint from Feats and Atoms), a partial function δ : Q × Feats → Q specifying the arcs, and a labeling function θ marking some of the sinks, and where R is an ordered list of distinguished nodes in Q called roots. G is not necessarily connected, but the union of all the nodes reachable from all the roots in R is required to yield exactly Q. The length of an MRG is the number of its roots, |R|. λ denotes the empty MRG, where Q = ∅.
Example (Multi-rooted feature graphs)
The following is an MRG, in which the shaded nodes (ordered from left to right) constitute the list of roots, R:

[Figure: three roots q1, q2, q3 with cat arcs to the sinks q4 (s), q5 (np) and q6 (vp), respectively; q2 and q3 both have an agr arc leading to a shared node q7.]
A multi-rooted feature graph is a directed, not necessarily connected, labeled graph with a designated sequence of nodes called roots.

It is a natural extension of feature graphs, the only difference being that the single root of a feature graph is extended here to a list, in order to model the required structured information.
Meta-variables ~A range over MRGs, and Q, δ, θ and R – over theirconstituents
We do not distinguish between an MRG of length 1 and a featuregraph
Natural relations can be defined between MRGs and feature graphs
First, note that if ~A = 〈R, G〉 is an MRG and qi is a root in R, then qi naturally induces a feature graph ~A|i = 〈Qi, qi, δi, θi〉, where:

Qi is the set of nodes reachable from qi

δi = δ|Qi (the restriction of δ to Qi)

θi = θ|Qi (the restriction of θ to Qi)
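Computing ~A|i is a plain reachability traversal from the i-th root. The encoding below (a (node, feature) → node dictionary for δ and a node → atom dictionary for θ) is our own illustration:

```python
def restrict_to_root(roots, delta, theta, i):
    """Return the feature graph induced by the i-th root (1-based) of an MRG.

    delta: dict mapping (node, feature) -> node; theta: dict mapping node -> atom.
    Returns (Qi, root, delta_i, theta_i), the components of A|i.
    """
    root = roots[i - 1]
    reachable = {root}
    frontier = [root]
    while frontier:                           # traverse δ-arcs from the root
        q = frontier.pop()
        for (p, f), target in delta.items():
            if p == q and target not in reachable:
                reachable.add(target)
                frontier.append(target)
    delta_i = {(p, f): t for (p, f), t in delta.items() if p in reachable}
    theta_i = {q: a for q, a in theta.items() if q in reachable}
    return reachable, root, delta_i, theta_i
```

Note that arcs leaving nodes unreachable from the chosen root (including other roots) are dropped, which is exactly the restriction δ|Qi.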
One can view an MRG ~A = 〈R, G〉 as an ordered sequence 〈A1, . . . , An〉 of (not necessarily disjoint) feature graphs, where Ai = ~A|i for 1 ≤ i ≤ n.

Note that such an ordered list of feature structures is not a sequence in the mathematical sense: removing a node accessible from one root can result in this node being removed from the graph accessible from some other root.
Subgraphs
Although MRGs are not element-disjoint sequences, it is possible to define substructures of them.

The roots of an MRG form a sequence of nodes.

Taking just a subsequence of the roots, and considering only the subgraph they induce (that is, the nodes that are accessible from these roots), a notion of substructure is naturally obtained.
Definition (Induced subgraphs)
The subgraph of a non-empty MRG ~A = 〈R, G〉 of length n, induced by j, k and denoted ~Aj...k, is defined only if 1 ≤ j ≤ k ≤ n, in which case it is the MRG 〈R′, G′〉 where R′ = 〈qj, . . . , qk〉, G′ = 〈Q′, δ′, θ′〉 and:

Q′ = { q | δ(q̄, π) = q for some q̄ ∈ R′ and some path π }

δ′(q, f) = δ(q, f) for every q ∈ Q′

θ′(q) = θ(q) for every q ∈ Q′

When the sequence is of length 1 we write ~Ai for ~Ai...i. As we identify a feature graph with an MRG of length 1, ~Ai = ~A|i.
MRGs
Since MRGs are a natural extension of feature graphs, many of the concepts defined for the latter can be extended to the former:

The transition function δ is extended from single features to paths

The set of paths of an MRG

The function val, associating a value with each path in a feature graph, is extended to MRGs

Reentrancy and cyclicity

Isomorphism and subsumption
MRG paths
Definition (MRG paths)
The paths of a multi-rooted feature graph ~A are

  Π(~A) = { 〈i, π〉 | π ∈ Paths and δ(qi, π)↓ }
MRG path values
Definition (Path value)
The value of a path 〈i, π〉 in an MRG ~A, denoted val~A(〈i, π〉), is defined if and only if δ~A(qi, π)↓, in which case it is the feature graph val~A|i(π).

Note that the value of a path in an MRG is a (single-rooted) feature graph, not an MRG. In particular, val~A(〈i, π〉) may include nodes which are roots in ~A but are not the root of the resulting feature graph. Clearly, an MRG may have two paths 〈i1, π1〉 and 〈i2, π2〉 where π1 = π2 even though i1 ≠ i2.
Example (Path value)
~A, where R = 〈q0, q1, q2〉, and val~A(〈2, 〈f〉〉):

[Figure: an MRG with roots q0, q1, q2, each with an f arc (to q3, q4 and q5, respectively), and g and h arcs leading to the sinks q6 (a) and q7 (b). Alongside it, the feature graph val~A(〈2, 〈f〉〉), rooted at q4, with g and h arcs to q6 (a) and q7 (b).]
MRG reentrancy
Two MRG paths are reentrant, denoted 〈i, π1〉 ↭~A 〈j, π2〉, if they share the same value: δ~A(qi, π1) = δ~A(qj, π2).

A multi-rooted feature graph is reentrant if it has two distinct paths (possibly leaving different roots) that are reentrant.

An MRG ~A is cyclic if two paths 〈i, π1〉, 〈i, π2〉 ∈ Π(~A), where π1 is a proper subsequence of π2, are reentrant: 〈i, π1〉 ↭~A 〈i, π2〉.

Here, the two paths must have the same index i, although they may “pass through” elements of ~A other than the i-th one.
A cyclic MRG
Example (A cyclic MRG)
The following MRG ~A = 〈R, G〉, where R = 〈q0, q1, q2〉, is cyclic:

[Figure: roots q0, q1, q2 with f arcs into q3, q4, q5; g and h arcs among q3–q7 form a cycle that a path from one of the roots re-enters.]
Multi-rooted feature graph isomorphism
Definition (Multi-rooted feature graph isomorphism)
Two MRGs ~A1 = 〈R1, G1〉 and ~A2 = 〈R2, G2〉 are isomorphic, denoted ~A1 ~∼ ~A2, iff they are of the same length, n, and there exists a one-to-one mapping i : Q1 → Q2, called an isomorphism, such that:
i(q1j ) = q2j for all 1 ≤ j ≤ n;
for all q1, q2 ∈ Q1 and f ∈ Feats, δ1(q1, f ) = q2 iffδ2(i(q1), f ) = i(q2); and
for all q ∈ Q1, θ1(q) = θ2(i(q)) (either both are undefined, or bothare defined and equal).
Subsumption of multi-rooted feature graphs
Definition (Subsumption of multi-rooted feature graphs)
An MRG ~A = 〈R, G〉 subsumes an MRG ~A′ = 〈R′, G′〉, denoted ~A ~⊑ ~A′, if |R| = |R′| and there exists a total function h : Q → Q′ such that:
for every root qi ∈ R, h(qi ) = q′i
for every q ∈ Q and f ∈ Feats, if δ(q, f )↓ thenh(δ(q, f )) = δ′(h(q), f )
for every q ∈ Q, if θ(q)↓ then θ(q) = θ′(h(q))
The only difference from feature graph subsumption is that h is required to map each of the roots in R to its corresponding root in R′. Notice that in order for two MRGs to be related by subsumption, they must be of the same length.
Example (MRG subsumption)
Feature graph subsumption can have three different effects: if A ⊑ B ,then B can have additional arcs, additional reentrancies or more markedatoms. The same holds for MRGs, with the observation that additionalreentrancies can now occur among paths that originate at different roots:
(figure: two MRGs of length 2 with arcs labeled f and g; the left one ~⊑-subsumes the right, which adds a reentrancy between paths rooted at different roots, but not vice versa)
Unification grammars Multi-rooted feature graphs
Subsumption of multi-rooted feature graphs
Example (MRG subsumption)
Let ~A and ~A′ be the following two MRGs. Then ~A~⊑~A′ but not ~A′~⊑~A.
~A (in AVM notation):  [cat: np, agr: 1[num: sg, pers: 3rd]]   [cat: vp, agr: 1]   [cat: np, agr: [num: sg, pers: 3rd]]

~A′:  [cat: np, agr: 1[num: sg, pers: 3rd]]   [cat: vp, agr: 1]   [cat: np, agr: 1]

In ~A′ the agr values of all three elements are shared, whereas in ~A the third element has its own agr value.
Unification grammars Multi-rooted feature graphs
Multi-rooted feature structures
Since MRG isomorphism is an equivalence relation, the notion ofmulti-rooted feature structures is well defined:
Definition (Multi-rooted feature structures)
Given a signature of features Feats and atoms Atoms, let ~G(Feats, Atoms) be the set of all multi-rooted feature graphs over the signature. Let ~G/∼ be the collection of equivalence classes in ~G(Feats, Atoms) with respect to MRG isomorphism. A multi-rooted feature structure (MRS) is a member of ~G/∼. We use meta-variables mrs to range over MRSs.
Unification grammars Multi-AVMs
Multi-AVMs
Definition
Given a signature S, a multi-AVM (MAVM) of length n ≥ 0 is asequence 〈M1, . . . ,Mn〉 such that for each i , 1 ≤ i ≤ n, Mi is an AVMover the signature.
Unification grammars Multi-AVMs
Multi-AVMs
Meta-variables ~M range over multi-AVMs
The sub-AVMs of ~M are SubAVM(~M) = ⋃_{1≤i≤n} SubAVM(Mi)

Similarly to what we did for AVMs, we define the set of tags occurring in a multi-AVM ~M as Tags(~M)

Note that if ~M = 〈M1, . . . , Mn〉 then Tags(~M) = ⋃_{1≤i≤n} Tags(Mi) (where the union is not necessarily disjoint)

Also, the set of sub-AVMs of ~M (including ~M itself) which are tagged by the same variable X is TagSet(~M, X)

Here, too, TagSet(~M, X) = ⋃_{1≤i≤n} TagSet(Mi, X)
We usually do not distinguish between a multi-AVM of length 1 andan AVM
When depicting MAVMs graphically, we sometimes suppress theangular brackets which enclose the sequence of AVMs.
Unification grammars Multi-AVMs
Multi-AVMs
Well-formedness and variable association are extended from AVMs toMAVMs in the natural way:
Definition (Well-formed MAVMs)
A multi-AVM ~M is well-formed iff for every variable X ∈ Tags( ~M),TagSet(~M,X ) includes at most one non-empty AVM.
Definition (Variable association)
The association of a variable X in ~M, denoted assoc( ~M,X ), is the singlenon-empty AVM in TagSet(~M,X ); if all the members of TagSet(~M,X ) areempty, then assoc(~M,X ) = X [ ].
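As a toy illustration of these two definitions, one can model AVMs as nested Python dicts (the empty dict playing the role of X [ ]) and a variable's TagSet as a list of the sub-AVMs it tags. The names well_formed and assoc, and the whole encoding, are ours:

```python
# Sketch only (our encoding, not the book's): tagsets maps each variable
# to the list of sub-AVMs it tags; {} stands for the empty AVM X [ ].
def well_formed(tagsets):
    """Well-formed: at most one non-empty AVM per variable."""
    return all(sum(1 for avm in s if avm) <= 1 for s in tagsets.values())

def assoc(tagsets, x):
    """The single non-empty AVM tagged x, or the empty AVM if all
    members of the tag set are empty."""
    nonempty = [avm for avm in tagsets[x] if avm]
    return nonempty[0] if nonempty else {}

# Variable 1 tags one empty and one non-empty sub-AVM: still well formed;
# variable 2 tags only empty sub-AVMs, so its association is empty.
tagsets = {1: [{}, {"f": "a"}], 2: [{}, {}]}
```

A variable tagging two distinct non-empty AVMs, e.g. {1: [{"f": "a"}, {"g": "b"}]}, violates well-formedness, since its association would be ambiguous.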
Unification grammars Multi-AVMs
Multi-AVMs
Example (Multi-AVMs)
Consider the following multi-AVM ~M , whose length is 3:
~M = 〈 2[f: 9[h: 1[ ]]],   1[f: 8[g: 7 a, h: 2[ ]]],   6[f: 5[h: 2[ ]]] 〉

Tags(~M) = {1, 2, 5, 6, 7, 8, 9}. ~M is well-formed:

TagSet(~M, 1) = { 1[ ],  1[f: 8[g: 7 a, h: 2[ ]]] }
TagSet(~M, 2) = { 2[ ],  2[f: 9[h: 1[ ]]] }

Therefore,

assoc(~M, 1) = 1[f: 8[g: 7 a, h: 2[ ]]],   assoc(~M, 2) = 2[f: 9[h: 1[ ]]]
Unification grammars Multi-AVMs
Multi-AVMs
The same variable can tag different sub-AVMs of different elements inthe sequence
In other words, the scope of variables is extended from single AVMsto multi-AVMs
This leads to an interpretation of variables (in multi-AVMs) whichhampers the view of multi-AVMs as sequences of AVMs
Recall that we interpret multiple occurrences of the same variable within a single AVM as denoting value sharing; hence the definition of well-formed AVMs, and the convention that when a variable occurs more than once in an AVM, its association can be stipulated next to any of its occurrences
As in the other views, when multi-AVMs are concerned, thisconvention implies that removing an element from a multi-AVM canaffect other elements, in contradiction to the usual concept ofsequences
Unification grammars Multi-AVMs
MAVM arcs
The sets Arcs and Arcs* are naturally extended from AVMs toMAVMs
Crucially, an arc can connect two tags which occur in differentmembers of the MAVM.
Example (Multi-AVM arcs)
In the MAVM ~M of the example, the set of arcs includes:
{〈2, f, 9〉, 〈1, f, 8〉, 〈8, h, 2〉} ⊂ Arcs(~M)

Hence, in particular, 〈1, 〈f h f〉, 9〉 ∈ Arcs*(~M)
Unification grammars Multi-AVMs
MAVM paths
When defining the paths of a multi-AVM, some caution is required
For an AVM M, Π(M) is defined as {π | X = tag(M) and for somevariable Y ∈ Tags(M), 〈X , π,Y 〉 ∈ Arcs*(M)}
In case of MAVMs, there are several elements from which X can bechosen
Hence, we define the set of MAVM paths relative to an additionalparameter, the index of the element in the MAVM from which thepath leaves.
Unification grammars Multi-AVMs
MAVM paths
Definition (Multi-AVM paths)
If ~M = 〈M1, . . . ,Mn〉 is an MAVM of length n, then its paths are the setΠ( ~M) = {〈i , π〉 | 1 ≤ i ≤ n, X = tag(Mi) and for some variableY ∈ Tags( ~M), 〈X , π,Y 〉 ∈ Arcs*( ~M)}. If n = 0, Π(~M) = ∅.
Example (Multi-AVM paths)
In the MAVM ~M of the example, the set of paths includes 〈2, 〈f g〉〉 but not 〈1, 〈f g〉〉. More interestingly, {〈i, (f h)^k〉 | k ≥ 0 and 1 ≤ i ≤ 3} ⊂ Π(~M).
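Since Π(~M) is infinite for cyclic structures, as the (f h)^k paths above show, a concrete enumeration must be bounded. The following sketch, over our own graph encoding of the MRG image (strings for tags, a dict for δ), collects the paths up to a length bound:

```python
# Sketch (our encoding): enumerate the paths <i, pi> of the MRG image of
# a multi-AVM up to a length bound, since cyclic structures have
# infinitely many paths.
def paths(roots, delta, max_len=4):
    result = set()
    for i, root in enumerate(roots, start=1):   # paths carry the element index
        frontier = [(root, ())]
        while frontier:
            node, pi = frontier.pop()
            result.add((i, pi))
            if len(pi) < max_len:
                for (p, f), q in delta.items():
                    if p == node:
                        frontier.append((q, pi + (f,)))
    return result

# The running example M: roots <2, 1, 6> and arcs as in the text.
roots = ["2", "1", "6"]
delta = {("2", "f"): "9", ("9", "h"): "1", ("1", "f"): "8",
         ("8", "g"): "7", ("8", "h"): "2", ("6", "f"): "5", ("5", "h"): "2"}
P = paths(roots, delta)
```

As in the example, 〈2, 〈f g〉〉 is a path but 〈1, 〈f g〉〉 is not, and 〈i, 〈f h〉〉 is a path for every element index i.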
Unification grammars Multi-AVMs
MAVM path values
Definition (Path values)
The value of a path 〈i, π〉 in a multi-AVM ~M, denoted pval(~M, 〈i, π〉), is assoc(~M, Y), where Y is such that 〈tag(Mi), π, Y〉 ∈ Arcs*(~M).

Of course, one path can have several values when it leaves different elements of a multi-AVM: in general, pval(~M, 〈i, π〉) ≠ pval(~M, 〈j, π〉) if i ≠ j.
Unification grammars Multi-AVMs
MAVM path values
Example (Path values)
Consider again ~M of the example. Examples of path values include pval(~M, 〈2, 〈f g〉〉) = 7 a and pval(~M, 〈1, 〈f h f g〉〉) = 7 a. Observe that in order to fully stipulate the value of some paths, one must combine sub-AVMs of more than one element of the multi-AVM. For example,

pval(~M, 〈1, 〈f h〉〉) = 1[f: 8[g: 7 a, h: 2[f: 9[h: 1[ ]]]]]
Unification grammars Multi-AVMs
MAVM reentrancy
A multi-AVM is reentrant if it has two distinct paths which share thesame value; these two paths may well be “rooted” in two differentelements of the MAVM
An MAVM ~M is cyclic if two paths 〈i, π1〉, 〈i, π2〉 ∈ Π(~M), where π1 is a proper prefix of π2, are reentrant: 〈i, π1〉 !~M 〈i, π2〉

Here, the two paths must have the same index i, although they may “pass through” elements of ~M other than the i-th one.
Unification grammars Multi-AVMs
MAVM reentrancy
Example (Multi-AVM reentrancy)
Consider again the MAVM ~M of the example. It is reentrant, since pval(~M, 〈1, 〈f h〉〉) = pval(~M, 〈2, ε〉). Furthermore, it is cyclic since pval(~M, 〈1, 〈f h f h〉〉) = pval(~M, 〈1, ε〉).
Unification grammars Multi-AVMs
MAVM subsumption
Definition (Multi-AVM subsumption)
Let ~M, ~M′ be two MAVMs of the same length n and over the same signature. ~M subsumes ~M′, denoted ~M ~⊑ ~M′, if the following conditions hold:

1. for all i, 1 ≤ i ≤ n, Mi ⊑ M′i;
2. if 〈i, π1〉 !~M 〈j, π2〉 then 〈i, π1〉 !~M′ 〈j, π2〉.
Unification grammars Multi-AVMs
MAVM subsumption
Example (MAVM subsumption)
Let ~M and ~M′ be the following two MAVMs (of length 3):

~M:   1[cat: np, agr: 4]   2[cat: vp, agr: 4[num: sg, pers: 3rd]]   3[cat: np, agr: 6[num: sg, pers: 3rd]]

~M′:  1[cat: np, agr: 4]   2[cat: vp, agr: 4[num: sg, pers: 3rd]]   3[cat: np, agr: 4]

Then ~M ~⊑ ~M′ but not ~M′ ~⊑ ~M.
Unification grammars Multi-AVMs
MAVM subsumption
The second clause of the definition may seem redundant: if for all i, 1 ≤ i ≤ n, Mi ⊑ M′i, then in particular all the reentrancies of Mi are also reentrancies in M′i; why then is the second clause necessary?

The answer lies in the possibility of reentrancies across elements in multi-AVMs

Such reentrancies are a “global” property of multi-AVMs, which is not reflected in any of the elements in isolation
Unification grammars Multi-AVMs
MAVM Renaming
Definition (Renaming)
Let ~M1 and ~M2 be two MAVMs. ~M2 is a renaming of ~M1, denoted ~M1 ~≃ ~M2, iff ~M1 ~⊑ ~M2 and ~M2 ~⊑ ~M1.
Unification grammars Multi-AVMs
Multi-AVM to MRG mapping
Definition (Multi-AVM to MRG mapping)
Let ~M = 〈M1, . . . , Mn〉 be a well-formed multi-AVM of length n. The MRG image of ~M is ϕ(~M) = 〈R, G〉, with R = 〈q1, . . . , qn〉 and G = 〈Q, δ, θ〉, where:

Q = Tags(~M)

qi = tag(Mi) for 1 ≤ i ≤ n

for all X ∈ Tags(~M) and f ∈ Feats, δ(X, f) = Y if 〈X, f, Y〉 ∈ Arcs(~M); and

for all X ∈ Tags(~M) and a ∈ Atoms, θ(X) = a if assoc(~M, X) is the atomic AVM X a, and is undefined otherwise.
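The definition translates almost line by line into Python. The encoding below (tags as integers, arcs as triples, atomic associations as a dict) is our own, for illustration only:

```python
# Direct transcription of the definition (our encoding): given the tags
# of a well-formed multi-AVM, the tag of each element, its arcs and its
# atomic associations, build the MRG image phi(M).
def phi(tags, element_tags, arcs, atoms):
    Q = set(tags)                              # nodes are the variables
    roots = list(element_tags)                 # q_i = tag(M_i)
    delta = {(x, f): y for (x, f, y) in arcs}  # delta(X, f) = Y iff <X, f, Y> in Arcs
    theta = dict(atoms)                        # theta(X) = a iff assoc is atomic X a
    return roots, (Q, delta, theta)

# The running example: <2[f: 9[h: 1[]]], 1[f: 8[g: 7 a, h: 2[]]], 6[f: 5[h: 2[]]]>
roots, (Q, delta, theta) = phi(
    tags={1, 2, 5, 6, 7, 8, 9},
    element_tags=[2, 1, 6],
    arcs={(2, "f", 9), (9, "h", 1), (1, "f", 8), (8, "g", 7),
          (8, "h", 2), (6, "f", 5), (5, "h", 2)},
    atoms={7: "a"})
```

The mapping is essentially a change of view: the variables of the multi-AVM become the nodes, and no new structure is invented.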
Unification grammars Multi-AVMs
Multi-AVM to MRG mapping
Example (Multi-AVM to multi-rooted feature graph mapping)
Consider the following multi-AVM ~M:
2[f: 9[h: 1[ ]]]   1[f: 8[g: 7 a, h: 2[ ]]]   6[f: 5[h: 2[ ]]]
Observe that it is well-formed, as the variables that occur more than once( 1 and 2 ) have only one non-empty occurrence each. The set of variablesof ~M is Tags( ~M) = { 1 , 2 , 5 , 6 , 7 , 8 , 9}, which will also be the set ofnodes Q in ϕ(~M). The sequence of roots R is the sequence of variablestagging the AVM elements of ~M, namely 〈 2 , 1 , 6 〉.
Unification grammars Multi-AVMs
Multi-AVM to MRG mapping
Example (Multi-AVM to multi-rooted feature graph mapping)
The obtained graph is:
(in words: the roots are 〈2, 1, 6〉; the arcs are f from 2 to 9, from 1 to 8 and from 6 to 5; h from 9 to 1, from 8 to 2 and from 5 to 2; g from 8 to 7; and θ(7) = a)
Unification grammars Multi-AVMs
Multi-AVM to MRG mapping
Proposition
Let ~M, ~M1, ~M2 be well-formed multi-AVMs. Then:

Π(~M) = Π(ϕ(~M));

〈i, π1〉 !~M 〈j, π2〉 iff 〈i, π1〉 !ϕ(~M) 〈j, π2〉;

~M1 ~⊑ ~M2 iff ϕ(~M1) ~⊑ ϕ(~M2).
Unification grammars Unification revisited
Unification revisited
We defined the unification operation for feature structures
We now extend the definition to multi-rooted structures; we definetwo variants of the operation:
one which unifies two same-length structures and produces their leastupper bound with respect to subsumptionunification in context, which combines the information in two featurestructures, each of which may be an element in a larger structure
Unification grammars Unification revisited
Two AMRS unification operations
Example (Two AMRS unification operations)
Same-length AMRS unification: two AMRSs of the same length are unified element by element, yielding a single AMRS.

Unification in context: a single element of one AMRS is unified with a single element of another, each within its own context.
Unification grammars Unification revisited
MRS unification
Example (MRS unification)
Let

σ = [cat: d, num: 4]   [cat: n, num: 4, case: nom]   [cat: v, num: 4]

ρ = [cat: d, num: pl]   [cat: n, num: pl, case: [ ]]   [cat: v, num: pl]

Then

σ ⊔ ρ = [cat: d, num: 4 pl]   [cat: n, num: 4, case: nom]   [cat: v, num: 4]
Unification grammars Unification revisited
Unification in context
Example (Unification in context)
Let

σ = [f: 1 a, g: 2[ ]]  [h: 2],    ρ = [f: 3[ ], g: 4 b]  [h: 3]

Unifying the first element in σ with the first element in ρ in the contexts of σ and ρ, we obtain (σ, 1) ⊔ (ρ, 1) = (σ′, ρ′):

σ′ = [f: 1 a, g: 2 b]  [h: 2],    ρ′ = [f: 3 a, g: 4 b]  [h: 3]

Note that both operands of the unification are modified.
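The way unification in context modifies both operands can be sketched with the standard destructive graph-unification technique, where each merged node forwards to a representative. The Node class and the whole encoding are ours, not the book's formal construction:

```python
# Sketch of destructive graph unification (our encoding): each node may
# carry an atom and outgoing arcs; merged nodes forward to a
# representative via `rep`.
class Node:
    def __init__(self, atom=None):
        self.atom, self.arcs, self.rep = atom, {}, None
    def find(self):                       # follow forwarding pointers
        n = self
        while n.rep is not None:
            n = n.rep
        return n

def unify(a, b):
    a, b = a.find(), b.find()
    if a is b:
        return a
    if a.atom is not None and b.atom is not None and a.atom != b.atom:
        raise ValueError("atom clash")    # unification fails
    b.rep = a                             # merge b into a
    if a.atom is None:
        a.atom = b.atom
    for f, t in b.arcs.items():
        if f in a.arcs:
            unify(a.arcs[f], t)           # recursively merge shared features
        else:
            a.arcs[f] = t
    return a

# sigma = [f: 1 a, g: 2[ ]] [h: 2]    rho = [f: 3[ ], g: 4 b] [h: 3]
n1, n2 = Node("a"), Node()                # tags 1 and 2 of sigma
s1 = Node(); s1.arcs = {"f": n1, "g": n2}
s2 = Node(); s2.arcs = {"h": n2}
m1, m2 = Node(), Node("b")                # tags 3 and 4 of rho
r1 = Node(); r1.arcs = {"f": m1, "g": m2}
r2 = Node(); r2.arcs = {"h": m1}

unify(s1, r1)                             # unify the first elements in context
```

After the call, σ's second element [h: 2] sees the atom b and ρ's second element [h: 3] sees the atom a, mirroring the example: the shared tags propagate the new information to both contexts.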
Unification grammars Unification revisited
Unification in context
Theorem
If 〈σ′, ρ′〉 = (σ, i) ⊔ (ρ, j) then σ′i = ρ′j = σi ⊔ ρj .
Unification grammars Unification revisited
Unification in context
Theorem
Let σ, ρ be two AMRSs and i, j be indexes such that i ≤ len(σ) and j ≤ len(ρ). Then 〈σ′, ρ′〉 = (σ, i) ⊔ (ρ, j) iff

σ′ = min~⊑ {σ′′ | σ ~⊑ σ′′ and ρj ⊑ σ′′i}; and

ρ′ = min~⊑ {ρ′′ | ρ ~⊑ ρ′′ and σi ⊑ ρ′′j}.
Unification grammars Rules and grammars
Rules and grammars
Like context free grammars, unification grammars are defined over analphabet
As the grammars that are of most interest to us are grammars of natural languages, and since sentences in natural languages are strings of words rather than strings of arbitrary symbols, we add to the signature an alphabet: a fixed set Words of words (in addition to the fixed sets Feats and Atoms)
Meta-variables wi ,wj etc. are used to refer to elements of Words, wto refer to strings over Words.
Unification grammars Rules and grammars
Rules and grammars
We also adopt here the distinction between phrasal and terminal rules
The former cannot have elements of Words in their bodies; thelatter have only a single word as their body
We refer to the collection of terminal rules as the lexicon: itassociates with terminals, members of Words, (abstract) featurestructures that are their categories
For every word wi ∈ Words the lexicon specifies a finite set ofabstract feature structures L(wi )
If L(wi ) is a singleton then wi is unambiguous, and if it is empty thenwi is not a member of the language defined by the lexicon.
Unification grammars Rules and grammars
Lexicon
Definition (Lexicon)
Given a signature of features Feats and atoms Atoms, and a set Words of terminal symbols, a lexicon is a finite-range function L : Words → 2^AFS(Feats,Atoms).
Unification grammars Rules and grammars
Lexicon
Example (Lexicon)
Following is a lexicon L over a signature consisting ofFeats = {cat,num,case}, Atoms = {d, n, v, sg, pl}, andWords = {two, sheep, sleep}:
L(two) = { [cat: d, num: pl] }

L(sheep) = { [cat: n, num: [ ], case: [ ]] }

L(sleep) = { [cat: v, num: pl] }
Unification grammars Rules and grammars
Lexicon
Example (Lexicon)
As an alternative to the previous lexical entry of sheep above, the grammar writer may prefer the following lexical entry:

L(sheep) = { [cat: n, num: sg, case: [ ]],  [cat: n, num: pl, case: [ ]] }
Unification grammars Rules and grammars
Lexicon
Example (Lexicon, rule-format)
To depict the lexicon specification above, we usually use the followingnotation:
sheep → [cat: n, num: sg, case: [ ]]

sheep → [cat: n, num: pl, case: [ ]]
Unification grammars Rules and grammars
Lexicon
When a string of words w is given, it is possible to construct anAMRS σw for the lexical entries of the words in w , such that no twoelements of σw share paths
Such an AMRS is simply the concatenation of the lexical entries ofthe words in w
In general, there may be several such AMRSs, as each word in w canhave multiple elements in its category
The set of such AMRSs is the pre-terminals of w
Unification grammars Rules and grammars
Pre-terminals
Definition (Pre-terminals)
Let w = w1 . . . wn ∈ Words+. PTw(j, k) is defined iff 1 ≤ j, k ≤ n, in which case it is the set of AMRSs {〈Aj · Aj+1 · · · Ak〉 | Ai ∈ L(wi) for j ≤ i ≤ k}. If j > k (i.e., the spanned substring is ε), then PTw(j, k) = {λ}. The subscript w is omitted when it is clear from the context.
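Computing PTw(j, k) is simply a Cartesian product of the words' lexical entries. In the sketch below, lexical categories are abbreviated to plain strings for readability (real entries are feature structures); the lexicon is the ambiguous-sheep lexicon of the examples:

```python
from itertools import product

# Direct rendering of the definition (our simplified encoding): each word
# maps to a list of category labels standing in for feature structures.
def PT(lexicon, w, j, k):
    if j > k:                                 # empty span: PT(j, k) = {lambda}
        return [()]
    return list(product(*(lexicon[wi] for wi in w[j - 1:k])))

lexicon = {"two": ["d.pl"], "sheep": ["n.sg", "n.pl"], "sleep": ["v.pl"]}
w = ["two", "sheep", "sleep"]
pts = PT(lexicon, w, 1, 3)
```

Since sheep is two-ways ambiguous, PT(1, 3) has exactly two members, one per reading.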
Unification grammars Rules and grammars
Pre-terminals
Example (Pre-terminals)
Consider the string of words w = two sheep sleep and the lexicon of the previous example. There is exactly one element in PTw(1, 3); this is the AMRS

[cat: d, num: pl]   [cat: n, num: [ ], case: [ ]]   [cat: v, num: pl]

Notice that there is no sharing of variables among different feature structures in this AMRS. As AMRSs are depicted using multi-AVMs here, the variables in the above multi-AVM are chosen such that unintended reentrancies are avoided.
Unification grammars Rules and grammars
Pre-terminals
Example (Pre-terminals)
Now assume that the word sheep is represented as an ambiguous word: its category contains two feature structures, namely

L(sheep) = { [cat: n, num: sg, case: [ ]],  [cat: n, num: pl, case: [ ]] }

Then PTw(1, 3) has two members:

[cat: d, num: pl]   [cat: n, num: sg, case: [ ]]   [cat: v, num: pl]

[cat: d, num: pl]   [cat: n, num: pl, case: [ ]]   [cat: v, num: pl]
Unification grammars Rules and grammars
Rules
Definition (Rules)
A (phrasal) rule is an AMRS of length n > 0 with a distinguished firstelement. If σ is a rule then σ1 is its head and σ2..n is its body. We adopta convention of depicting rules with an arrow (→) separating the headfrom the body.
Since a rule is simply an AMRS, there can be reentrancies among itselements: both between the head and (some element of) the bodyand among elements in its body.
Notice that the definition supports ε-rules, i.e., rules with empty bodies
Unification grammars Rules and grammars
Rules
Example (Rules as AMRSs)
As every AMRS can be interpreted as a rule, so can the following:
[cat: s] → [cat: np, agr: 4]   [cat: v, agr: 4]
Unification grammars Rules and grammars
Rules
Example (Rules as AMRSs)
Rules can also propagate information between the mother and any of thedaughters using reentrancies between paths originating in the head of therule and paths originating from one of the body elements, as below.
[cat: s, subj: 1] → 1[cat: np, agr: 2]   [cat: v, agr: 2]
Unification grammars Rules and grammars
Rules
The rules of the example employ feature structures that include thefeature cat, encoding the major part-of-speech category of phrases
While this is useful and natural, it is by no means obligatory
Unification rules can encode such information in other ways (e.g., viaa different feature, or as a collection of features); or they may notencode it at all
In the general case, a unification rule is not required to have acontext-free skeleton, a feature whose values constitute a context-freebackbone that drives the derivation
Some unification-based grammar theories do indeed maintain acontext-free skeleton (LFG is a notable example), while others (likeHPSG) do not
Unification grammars Rules and grammars
Rules
We introduce a shorthand notation in the presentation of grammars:
When two rules have the same head, we list the head only once andseparate the bodies of the different rules with ‘|’ (following theconvention of context-free grammars)
Note, however, that the scope of variables is still limited to a singlerule, so that multiple occurrences of the same variable within thebodies of two different rules are unrelated
Additionally, we may use the same variable (e.g., 4 ) in several rules
It should be clear by now that these multiple uses are unrelated toeach other, as the scope of variables is limited to a single rule
Unification grammars Rules and grammars
Unification grammars
Definition (Unification grammars)
A unification grammar (UG) G = (L,R,As) over a signature Atoms ofatoms and Feats of features consists of a lexicon L, a finite set of rulesR and a start symbol As that is an abstract feature structure.
Unification grammars Rules and grammars
Unification grammars
Example (Gu, a unification grammar)
[cat: s] → [cat: np, num: 4, case: nom]   [cat: v, num: 4]

[cat: np, num: 4, case: 2] → [cat: d, num: 4]   [cat: n, num: 4, case: 2]

[cat: np, num: 4, case: 2] → [cat: pron, num: 4, case: 2]
Unification grammars Rules and grammars
Unification grammars
Example (Gu, a unification grammar)
sleep → [cat: v, num: pl]

sleeps → [cat: v, num: sg]

lamb → [cat: n, num: sg, case: [ ]]

lambs → [cat: n, num: pl, case: [ ]]

she → [cat: pron, num: sg, case: nom]

her → [cat: pron, num: sg, case: acc]

a → [cat: d, num: sg]

two → [cat: d, num: pl]
Unification grammars Derivations
Derivations
The language generated by UGs is defined in a parallel way to thedefinition of languages generated by context-free grammars:
first, we define derivations, analogously to the context-free derivations
The reflexive transitive closure of the derivation relation is the basisfor the definition of languages
For the following discussion fix a particular grammar G = (L,R,As)
Unification grammars Derivations
Derivations
Derivation is a relation that holds between two forms, σ1 and σ2,each of which is an AMRS
To define it formally, two concepts have to be taken care of:
An element of σ1 has to be matched against the head of somegrammar rule, ρ
The body of ρ must replace the selected element in σ1, thus producingσ2
Matching involves unification, and unification must be computed incontext: that is, when the selected element of σ1 is unified with thehead of ρ, other elements in σ1 or in ρ may be affected due toreentrancy
This possibility must be taken care of when replacing the selectedelement with the body of ρ
Unification grammars Derivations
Derivations
Definition (Derivation)
An AMRS σ1 of length k derives an AMRS σ2 (denoted σ1 ⇒ σ2) iff for some j ≤ k and some rule ρ ∈ R of length n,

(σ1, j) ⊔ (ρ, 1) = (σ′1, ρ′), and

σ2 is the replacement of the j-th element of σ′1 with the body of ρ′ (details suppressed)

The reflexive transitive closure of ‘⇒’ is ‘∗⇒’. We write σ ⇒^l ρ when σ derives ρ in l steps.
Unification grammars Derivations
Derivation step
Example (Derivation step)
Suppose that
σ1 = [cat: np, num: 1, case: nom]   [cat: v, num: 1]

is a (sentential) form and that

ρ = [cat: np, num: 2, case: 3] → [cat: d, num: 2]   [cat: n, num: 2, case: 3]

is a rule. Assume further that the selected element j in σ1 is the first one. Applying the rule ρ to the form σ1, it is possible to construct a derivation σ1 ⇒ σ2.
Unification grammars Derivations
Derivation step
Example (Derivation step)
First, compute (σ1, 1) ⊔ (ρ, 1) = (σ′1, ρ′):

σ′1 = [cat: np, num: 1, case: nom]   [cat: v, num: 1]

ρ′ = [cat: np, num: 2, case: 3 nom] → [cat: d, num: 2]   [cat: n, num: 2, case: 3]
Unification grammars Derivations
Derivation step
Example (Derivation step)
Now, the first element of σ′1 is replaced by the body of ρ′. This operation results in a new AMRS, σ2, of length 3: the first two elements are the body of ρ′, and the last element is the remainder of σ′1, after its first element has been eliminated; that is, the last element of σ′1. A simple replacement would have resulted in the following AMRS:

[cat: d, num: 2]   [cat: n, num: 2, case: 3 nom]   [cat: v, num: 1]
Obviously, this is not the expected result!
Unification grammars Derivations
Derivation step
Example (Derivation step)
Since the path 〈1, num〉 in σ1 is reentrant with 〈2, num〉 (indicated by the tag 1), and since the path 〈1, num〉 in the rule ρ is reentrant with the paths 〈2, num〉 and 〈3, num〉 (the tag 2), one would expect that the sharing between the num values of the noun phrase and the verb phrase in σ1 would manifest itself as a sharing between this feature's values of the determiner, the noun and the verb phrase in σ2.

This is what the last clause in the definition of derivation guarantees. The result is:

σ2 = [cat: d, num: 4]   [cat: n, num: 4, case: 5 nom]   [cat: v, num: 4]
Unification grammars Derivations
Derivation
Example (Derivation)
Consider the grammar Gu. A derivation with Gu can start with a form of length 1, consisting of

σ1 = [cat: s]

The single element of this AMRS unifies with the head of the first rule in the grammar, trivially. Substitution is again trivial, and the next form in the derivation is the body of the first rule:

σ2 = [cat: np, num: 1, case: nom]   [cat: v, num: 1]
Unification grammars Derivations
Derivation
Example (Derivation)
Since the rule ρ of that example is indeed in Gu, a form derivable from σ2 is:

σ3 = [cat: d, num: 4]   [cat: n, num: 4, case: nom]   [cat: v, num: 4]

Thus, we obtain σ1 ⇒ σ2 ⇒ σ3, and hence σ1 ∗⇒ σ3.
Unification grammars Derivations
Derivation
Example (Derivation)
Consider the form σ3 and one of the AMRSs in PTw(1, 3):

σ3 = [cat: d, num: 4]   [cat: n, num: 4, case: nom]   [cat: v, num: 4]

σ = [cat: d, num: pl]   [cat: n, num: pl, case: [ ]]   [cat: v, num: pl]

The former contains information that is accumulated during derivations; the latter reflects information from the lexical entries of the words in w.

σ3 ⊔ σ = [cat: d, num: 4 pl]   [cat: n, num: 4, case: nom]   [cat: v, num: 4]
Unification grammars Derivations
Language
Definition (Language)
The language of a unification grammar G is L(G) = {w ∈ Words∗ | w = w1 · · · wn and there exist an AMRS σ such that As ∗⇒ σ and an AMRS ρ ∈ PTw(1, n) such that σ ⊔ ρ is defined}.
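The condition "σ ⊔ ρ is defined" can be illustrated with a toy unifier for flat AVMs only, where values are atoms or variables (strings starting with "?"). This drastic simplification of AMRS unification is ours; the point is that one substitution is shared across all elements, so tags shared across elements stay shared:

```python
# Toy definedness check (our simplification): flat AVMs are dicts whose
# values are atoms or variables; one substitution covers the whole AMRS.
def unify_flat(a, b, subst):
    def deref(v):
        while isinstance(v, str) and v.startswith("?") and v in subst:
            v = subst[v]
        return v
    for f in set(a) & set(b):                 # shared features must agree
        x, y = deref(a[f]), deref(b[f])
        if x == y:
            continue
        if isinstance(x, str) and x.startswith("?"):
            subst[x] = y
        elif isinstance(y, str) and y.startswith("?"):
            subst[y] = x
        else:
            return None                       # atom clash
    return subst

def unifiable(seq1, seq2):
    """Elementwise unification under a single shared substitution."""
    subst = {}
    return all(unify_flat(a, b, subst) is not None
               for a, b in zip(seq1, seq2))

# sigma3 as derived from Gu, with the shared num tag written "?4"
sigma3 = [{"cat": "d", "num": "?4"},
          {"cat": "n", "num": "?4", "case": "nom"},
          {"cat": "v", "num": "?4"}]
# pre-terminals of "two lambs sleep" ...
pt = [{"cat": "d", "num": "pl"},
      {"cat": "n", "num": "pl", "case": "?c"},
      {"cat": "v", "num": "pl"}]
# ... and of the ill-formed "two lambs sleeps" (number clash via ?4)
bad = [{"cat": "d", "num": "pl"},
       {"cat": "n", "num": "pl", "case": "?c"},
       {"cat": "v", "num": "sg"}]
```

Because "?4" is bound to pl at the determiner, the verb of the bad string cannot carry sg: the clash is detected through the shared tag, exactly the agreement effect the grammar is designed to enforce.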
Unification grammars Derivations
Language
Example (Language)
Consider the grammar Gu and the string two lambs sleep. The form σ3 is derivable from the start symbol of the grammar. This form is unifiable with one of the members of PTw(1, 3). Hence the string two lambs sleep is a member of L(Gu).
Unification grammars Derivation trees
Derivation trees
In order to depict derivations graphically we extend the notion ofderivation trees, defined for context-free grammars, to unificationgrammars
Informally, we would like a tree to be a structure whose elements arefeature structures
However, care must be taken when the scope of reentrancies in a treeis concerned: in order for information to be shared among all nodes ina tree, this scope is extended to the entire tree
Unification grammars Derivation trees
Derivation trees
Rather than define a new mathematical entity, corresponding to atree whose nodes are feature structures with the scope of reentranciesextended to the entire structure, we reuse in the following definitionthe concept of multi-rooted structures (more precisely, AMRSs)
In order to impose a tree structure on AMRSs we simply pair themwith a tree whose nodes are integers, such that each node in the treeserves as an index into the AMRS
In this way, all the existing definitions which refer to AMRSs can benaturally used when reasoning about trees
Unification grammars Derivation trees
Derivation trees
Definition (Unification trees)
Given a signature S = 〈Atoms,Feats〉, a unification tree is an orderedtree whose nodes are AVMs over S, where the scope of reentrancies isextended to the entire tree. A subtree is a particular node of the tree,along with all its descendants (and the edges connecting them). Formally,a unification tree is a pair 〈σ, τ 〉, where σ is an AMRS over S, say oflength l for some l ∈ N, and τ is a tree over the nodes {1, 2, . . . , l}.
Unification grammars Derivation trees
Derivation trees
Example (Unification tree)
Following is a unification tree, depicted as a tree of AVMs:
[cat: s]
  ├─ [cat: np, num: 4, case: 2 nom]
  │    ├─ [cat: d, num: 4]
  │    └─ [cat: n, num: 4, case: 2]
  └─ [cat: v, num: 4]
Unification grammars Derivation trees
Derivation trees
Example (Unification tree)
Formally, this tree is a pair 〈σ, τ〉, where τ is a tree over {1, 2, 3, 4, 5} and σ is an AMRS of length 5:

τ: node 1 has children 2 and 5; node 2 has children 3 and 4 (the frontier is 3, 4, 5)

σ = [cat: s]   [cat: np, num: 4, case: 2 nom]   [cat: d, num: 4]   [cat: n, num: 4, case: 2]   [cat: v, num: 4]
Unification grammars Derivation trees
Unification derivation trees
Definition (Unification derivation trees)
A unification derivation tree induced by a unification grammar G = (L, R, As) is a unification tree defined recursively as follows:
〈As , τ〉 is a unification derivation tree, where τ is the tree consistingof the single node {1};
if 〈σ, τ 〉 is a unification derivation tree and 〈σ′, τ ′〉 extends 〈σ, τ 〉,then 〈σ′, τ ′〉 is also a unification derivation tree.
Unification grammars Derivation trees
Unification derivation trees
Example (Unification derivation trees)
A unification derivation tree with the grammar Gu can be built incrementally as follows. The start symbol of the grammar is [cat: s]; therefore, an initial derivation tree would be 〈σ1, {1}〉, the start symbol itself.

Then, by using the first grammar rule, the following tree, 〈σ2, τ2〉, can be obtained:

[cat: s]
  ├─ [cat: np, num: 4, case: nom]
  └─ [cat: v, num: 4]
c©Shuly Wintner (University of Haifa) Unification Grammars c©Copyrighted material 335 / 420
Unification grammars Derivation trees
Unification derivation trees
Example (Unification derivation trees)
Next, by applying the second grammar rule to the leftmost node on the frontier of 〈σ2, τ2〉, the following tree, 〈σ3, τ3〉, is obtained:

[cat: s, num: ④]
├── [cat: np, num: ④, case: ② nom]
│   ├── [cat: d, num: ④]
│   └── [cat: n, num: ④, case: ②]
└── [cat: v, num: ④]
Unification grammars Derivation trees
Complete derivation trees
As in the context-free case, the frontier of unification derivation trees does not have to correspond to any lexical item

Of course, in order for trees to represent complete derivations, we are particularly interested in such trees whose frontier is unifiable with a sequence of pre-terminals
Unification grammars Derivation trees
Complete derivation trees
Definition (Complete derivation trees)
A unification derivation tree 〈σ, τ〉 is complete if the frontier of τ is j1, . . . , jn and there exists a word w ∈ Words∗ of length n and an AMRS ρ ∈ PTw(1, n) such that ρ ⊔ 〈σj1 , . . . , σjn〉 is defined.

Note that there may be more than one qualifying AMRS in PTw(1, n); the definition only requires one. Of course, different AMRSs in PTw(1, n) will correspond to different interpretations of the input string (resulting from ambiguous lexical entries of the words)
Unification grammars Derivation trees
Complete derivation trees
Example (Complete derivation trees)
Consider the grammar Gu and the string w = two lambs sleep. The tree of the previous example is complete. Its frontier is unifiable with the following AMRS:

〈 [cat: d, num: pl], [cat: n, num: pl, case: ②], [cat: v, num: pl] 〉 ∈ PTw(1, 3)
Unification grammars Derivation trees
Lexicalized derivation trees
It is sometimes useful to depict a tree whose leaves already reflect the additional information obtained by actually unifying the frontier of a complete derivation tree with PTw

We call such trees lexicalized

It is easy to see that for every lexicalized tree 〈σ, τ〉 there exists a complete derivation tree 〈σ′, τ′〉 such that τ′ = τ and σ′ ⊑ σ
Unification grammars Derivation trees
Lexicalized derivation trees
Definition (Lexicalized derivation trees)
Let 〈σ, τ〉 be a complete derivation tree induced by a unification grammar G = (R, As) and let w, ρ be as in the definition of complete trees. A lexicalized derivation tree induced by G on w is the unification tree 〈σ′, τ〉, where σ′ is obtained from σ by unifying the frontier of σ with ρ.
Unification grammars Derivation trees
Lexicalized derivation trees
Example (Lexicalized derivation tree)
A tree induced by the grammar Gu on the string two lambs sleep:

[cat: s]
├── [cat: np, num: ④, case: ② nom]
│   ├── [cat: d, num: ④ pl]            two
│   └── [cat: n, num: ④, case: ② nom]  lambs
└── [cat: v, num: ④]                    sleep
Linguistic applications Introduction
Linguistic applications
We now put the theory to use, by accounting for several of the linguistic phenomena that motivated UGs

Unification grammars facilitate the expression of linguistic generalizations

This is mediated through two main mechanisms:

The notion of grammatical category is expressed via feature structures, thereby allowing for complex categories as first-class citizens of the grammatical theory

Reentrancy provides a concise machinery for expressing “movement”, or more generally, relations that hold at a deeper level than a phrase-structure tree
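As a rough illustration of reentrancy (our own sketch, not the book's formal definitions), a tag can be modeled as one shared, mutable cell reachable from several features; constraining it in one place constrains it everywhere:

```python
class Tag:
    """A reentrancy tag: one shared cell, initially unconstrained ([ ])."""
    def __init__(self):
        self.value = None   # None plays the role of the empty AVM [ ]

def deref(x):
    """Follow bound tags to the value they currently stand for."""
    while isinstance(x, Tag) and x.value is not None:
        x = x.value
    return x

# [cat: np, num: [4]]  [cat: vp, num: [4]] -- one tag, two occurrences
n4 = Tag()
np = {'cat': 'np', 'num': n4}
vp = {'cat': 'vp', 'num': n4}

# binding the tag once (say, after unifying the VP with a plural verb)
# constrains every occurrence: the NP's num is now pl as well
n4.value = 'pl'
print(deref(np['num']))   # pl
```

The point of the shared cell is exactly the “deeper than the tree” relation: no path in the phrase structure connects the two occurrences, yet they denote one value.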
Linguistic applications Introduction
Phenomena
Agreement
Case control
Subcategorization
Long-distance dependencies
Control
Coordination
Linguistic applications A basic grammar
A basic grammar
Example (A context-free grammar G0:)
S → NP VP
VP → V | V NP
NP → D N | Pron | PropN
D → the, a, two, every, . . .
N → sheep, lamb, lambs, shepherd, water . . .
V → sleep, sleeps, love, loves, feed, feeds, herd, herds, . . .
Pron → I, me, you, he, him, she, her, it, we, us, they, them
PropN → Rachel, Jacob, . . .
Linguistic applications A basic grammar
Every CFG is a UG
Observe that any context-free grammar is a special case of a unification grammar

The non-terminal symbols of the CFG can be modeled by atoms

A more general view of G0 as a unification grammar can encode the fact that the non-terminal symbols represent grammatical categories

This can be done using a single feature, e.g., cat, whose values are the non-terminals of G0
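As a sketch of this lifting (the dict encoding and the `lift` helper are ours, not the book's), every non-terminal X becomes the one-feature AVM [cat: x]:

```python
# the rules of G0, written as (lhs, rhs) pairs over non-terminal symbols
g0 = [('S', ['NP', 'VP']),
      ('VP', ['V']),
      ('VP', ['V', 'NP']),
      ('NP', ['D', 'N'])]

def lift(lhs, rhs):
    """Encode one CFG rule as a unification rule over {cat: ...} AVMs."""
    wrap = lambda symbol: {'cat': symbol.lower()}
    return wrap(lhs), [wrap(symbol) for symbol in rhs]

lifted = [lift(lhs, rhs) for lhs, rhs in g0]
print(lifted[0])   # ({'cat': 's'}, [{'cat': 'np'}, {'cat': 'vp'}])
```

Since the AVMs carry only the atomic cat value, unification here degenerates to symbol equality, which is exactly why the CFG is a special case.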
Linguistic applications A basic grammar
G ′0, a basic unification grammar
Example (G ′0, a basic unification grammar)
Following is a unification grammar, G′0, over a signature 〈Feats, Atoms〉 where Feats = {cat} and Atoms = {s, np, vp, v, d, n, pron, propn}:

1    [cat: s] → [cat: np] [cat: vp]
2    [cat: vp] → [cat: v]
3    [cat: vp] → [cat: v] [cat: np]
4    [cat: np] → [cat: d] [cat: n]
5, 6 [cat: np] → [cat: pron] | [cat: propn]
Linguistic applications A basic grammar
G ′0, a basic unification grammar
Example (G ′0, a basic unification grammar)
sleep → [cat: v]        give → [cat: v]        love → [cat: v]
tell → [cat: v]         feed → [cat: v]        feeds → [cat: v]
lamb → [cat: n]         lambs → [cat: n]
she → [cat: pron]       her → [cat: pron]
they → [cat: pron]      them → [cat: pron]
Rachel → [cat: propn]   Jacob → [cat: propn]
a → [cat: d]            two → [cat: d]
Linguistic applications A basic grammar
Derivation trees induced by G ′0
Example (Derivation trees induced by G ′0)
The grammar G′0 induces the following tree on the string the sheep love her:

[cat: s]
├── [cat: np]
│   ├── [cat: d]        the
│   └── [cat: n]        sheep
└── [cat: vp]
    ├── [cat: v]        love
    └── [cat: np]
        └── [cat: pron] her
Linguistic applications A basic grammar
Derivation trees induced by G ′0
Example (Derivation trees induced by G ′0)
Not surprisingly, an isomorphic derivation tree is induced by the grammar on the ungrammatical string ∗the lambs sleeps they:

[cat: s]
├── [cat: np]
│   ├── [cat: d]        the
│   └── [cat: n]        lambs
└── [cat: vp]
    ├── [cat: v]        sleeps
    └── [cat: np]
        └── [cat: pron] they
Linguistic applications Imposing agreement
Gagr, accounting for agreement on number
Example (Gagr, accounting for agreement on number)
1    [cat: s] → [cat: np, num: ④] [cat: vp, num: ④]
2    [cat: vp, num: ④] → [cat: v, num: ④]
3    [cat: vp, num: ④] → [cat: v, num: ④] [cat: np]
4    [cat: np, num: ④] → [cat: d, num: ④] [cat: n, num: ④]
5, 6 [cat: np, num: ④] → [cat: pron, num: ④] | [cat: propn, num: ④]
Linguistic applications Imposing agreement
Gagr, accounting for agreement on number
Example (Gagr, accounting for agreement on number)
sleep → [cat: v, num: pl]    give → [cat: v, num: pl]    love → [cat: v, num: pl]
tell → [cat: v, num: pl]     feed → [cat: v, num: pl]    feeds → [cat: v, num: sg]
Linguistic applications Imposing agreement
Gagr, accounting for agreement on number
Example (Gagr, accounting for agreement on number)
lamb → [cat: n, num: sg]         lambs → [cat: n, num: pl]
she → [cat: pron, num: sg]       her → [cat: pron, num: sg]
they → [cat: pron, num: pl]      them → [cat: pron, num: pl]
Rachel → [cat: propn, num: sg]   Jacob → [cat: propn, num: sg]
a → [cat: d, num: sg]            two → [cat: d, num: pl]
Linguistic applications Imposing agreement
Gagr generates a CF language
Example (A context-free grammar G1)
S → Ssg | Spl
Ssg → NPsg VPsg Spl → NPpl VPpl
NPsg → Dsg Nsg NPpl → Dpl Npl
NPsg → Pronsg | PropNsg NPpl → Pronpl | PropNpl
VPsg → Vsg VPpl → Vpl
VPsg → Vsg NPsg | Vsg NPpl VPpl → Vpl NPsg | Vpl NPpl
Dsg → a Dpl → two
Nsg → lamb | sheep | · · · Npl → lambs | sheep | · · ·
Pronsg → she | her | · · · PropNsg → Rachel | Jacob | · · ·
Vsg → sleeps | · · · Vpl → sleep | · · ·
Linguistic applications Imposing agreement
Gagr generates a CF language
While Gagr is a unification grammar, the language it generates is context-free

But the equivalent CFG is inferior to the unification grammar:

The linguistic description is distorted: information regarding number, which is determined by the words themselves, is encoded in G1 by the way they are derived (in other words, G1 accounts for lexical knowledge by means of phrase-structure rules)

Several linguistic generalizations are lost: the context-free grammar induces two different trees on a lamb sleeps and two lambs sleep
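To make the contrast concrete, here is a toy, non-reentrant unification over flat AVMs modeled as Python dicts (all names are ours, not the book's): with it, the single NP rule of Gagr accepts a lamb and two lambs but rejects ∗a lambs, with no duplicated rules.

```python
def unify(f, g):
    """Toy unification of flat AVMs: merge dicts, fail (None) on a clash."""
    out = dict(f)
    for feat, val in g.items():
        if feat in out and out[feat] != val:
            return None
        out[feat] = val
    return out

def np_rule(d, n):
    """Rule 4 of Gagr: NP -> D N, sharing num among D, N and the mother."""
    shared = unify({'num': d['num']}, {'num': n['num']})
    if shared is None:
        return None                      # agreement violation
    return unify({'cat': 'np'}, shared)

a     = {'cat': 'd', 'num': 'sg'}
two   = {'cat': 'd', 'num': 'pl'}
lamb  = {'cat': 'n', 'num': 'sg'}
lambs = {'cat': 'n', 'num': 'pl'}

print(np_rule(a, lamb))    # {'cat': 'np', 'num': 'sg'}
print(np_rule(a, lambs))   # None -- *a lambs is blocked by unification
```

The failure case is exactly where G1 needs disjoint sg/pl rule families.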
Linguistic applications Imposing agreement
UG and linguistic generalizations
One natural notion of ‘linguistic generalization’ emerges: the ability to formulate a linguistic restriction by means of a single rule, instead of by a collection of “similar” rules

In this sense, Gagr captures the agreement generalization, while G1 does not

Multiplying out all the possible values of a particular feature, and converting a unification grammar to an equivalent context-free grammar in this way, is not always possible
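The cost of multiplying out can be seen mechanically (a hypothetical sketch; the two-feature inventory is illustrative, not the book's): the number of specialized CFG rules grows as the product of the value-set sizes.

```python
from itertools import product

# value sets for two toy features
features = {'num': ['sg', 'pl'], 'case': ['nom', 'acc']}

# the single unification rule NP -> D N, multiplied out into CFG rules:
# one specialized rule per combination of feature values
cfg_rules = [(f'NP_{n}_{c}', [f'D_{n}', f'N_{n}_{c}'])
             for n, c in product(features['num'], features['case'])]

print(len(cfg_rules))   # 4 rules instead of 1; growth is multiplicative
```

With unboundedly many values (e.g. subcategorization lists below), this product is infinite, which is one reason the conversion is not always possible.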
Linguistic applications Imposing case control
Imposing case control
Add a feature, case, to the signature; it is added to the feature structures associated with nominal categories: nouns, pronouns, proper names and noun phrases

The lexical entries of pronouns must specify their case, which is overt and explicit: we use the value nom for nominative case, whereas acc stands for accusative

As for proper names and nouns, their lexical entries are simply underspecified with respect to case

Use the values of the case feature in the grammar to impose constraints of case control
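A hedged sketch of the intended behaviour with a toy dict-based unification (function and lexicon names are ours): pronouns fix their case, nouns leave it underspecified, and the S rule imposes nom on the subject.

```python
def unify(f, g):
    """Toy unification of flat AVMs: merge dicts, fail (None) on a clash."""
    out = dict(f)
    for feat, val in g.items():
        if feat in out and out[feat] != val:
            return None
        out[feat] = val
    return out

her  = {'cat': 'pron', 'num': 'sg', 'case': 'acc'}
she  = {'cat': 'pron', 'num': 'sg', 'case': 'nom'}
lamb = {'cat': 'n',    'num': 'sg'}          # case underspecified

subject = {'case': 'nom'}   # constraint imposed on the subject NP by rule 1

print(unify(she,  subject))  # succeeds: she is a legitimate subject
print(unify(her,  subject))  # None: *her sleeps is ruled out
print(unify(lamb, subject))  # succeeds: underspecified case is instantiated
```

Underspecification does the work here: lamb accepts either case value, so no extra lexical entries are needed.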
Linguistic applications Imposing case control
Gcase, accounting for case control
Example (Gcase, accounting for case control)
1    [cat: s] → [cat: np, num: ④, case: nom] [cat: vp, num: ④]
2    [cat: vp, num: ④] → [cat: v, num: ④]
3    [cat: vp, num: ④] → [cat: v, num: ④] [cat: np, num: ③, case: acc]
4    [cat: np, num: ④, case: ②] → [cat: d, num: ④] [cat: n, num: ④, case: ②]
5, 6 [cat: np, num: ④, case: ②] → [cat: pron, num: ④, case: ②] | [cat: propn, num: ④, case: ②]
Linguistic applications Imposing case control
Gcase, accounting for case control
Example (Gcase, accounting for case control)
sleep → [cat: v, num: pl]    sleeps → [cat: v, num: sg]
feed → [cat: v, num: pl]     feeds → [cat: v, num: sg]
Linguistic applications Imposing case control
Gcase, accounting for case control
Example (Gcase, accounting for case control)
lamb → [cat: n, num: sg, case: [ ]]      lambs → [cat: n, num: pl, case: [ ]]
she → [cat: pron, num: sg, case: nom]    her → [cat: pron, num: sg, case: acc]
they → [cat: pron, num: pl, case: nom]   them → [cat: pron, num: pl, case: acc]
Rachel → [cat: propn, num: sg]           Jacob → [cat: propn, num: sg]
a → [cat: d, num: sg]                    two → [cat: d, num: pl]
Linguistic applications Imposing case control
Derivation tree with case control
Example (Derivation tree with case control)
[cat: s]
├── [cat: np, num: ④, case: ③ nom]
│   ├── [cat: d, num: ④ pl]                  the
│   └── [cat: n, num: ④, case: ③]            shepherds
└── [cat: vp, num: ④]
    ├── [cat: v, num: ④]                     feed
    └── [cat: np, num: ②, case: ⑤ acc]
        └── [cat: pron, num: ② pl, case: ⑤]  them
Linguistic applications Imposing case control
Derivation tree with case control
Example (Derivation tree with case control)
This tree represents a derivation which starts with the initial symbol, [cat: s], and ends with the multi-AVM σ′, where

σ′ =  the: [num: ④]   shepherds: [num: ④, case: nom]   feed: [num: ④]   them: [num: ②, case: acc]

This multi-AVM is unifiable with (but not identical to!) the sequence of lexical entries of the words in the sentence, which is:

σ =  the: [num: [ ]]   shepherds: [num: pl, case: [ ]]   feed: [num: pl]   them: [num: pl, case: acc]

Hence the sentence is in the language generated by the grammar.
Linguistic applications Imposing subcategorization constraints
Imposing subcategorization constraints
A naïve solution to the subcategorization problem:

intransitive verbs (with no object): sleep, walk, run, laugh, . . .

transitive verbs (with a nominal object): feed, love, eat, . . .

Lexical entries of verbs are extended such that their subcategorization is specified

The rules that involve verbs and verb phrases are extended
Linguistic applications Imposing subcategorization constraints
Gsubcat, a naïve account of verb subcategorization

Example (Gsubcat, a naïve account of verb subcategorization)

1    [cat: s] → [cat: np, num: ④, case: nom] [cat: vp, num: ④]
2    [cat: vp, num: ④] → [cat: v, num: ④, subcat: intrans]
3    [cat: vp, num: ④] → [cat: v, num: ④, subcat: trans] [cat: np, num: ④, case: acc]
4    [cat: np, num: ④, case: ②] → [cat: d, num: ④] [cat: n, num: ④, case: ②]
5, 6 [cat: np, num: ④, case: ②] → [cat: pron, num: ④, case: ②] | [cat: propn, num: ④, case: ②]
Linguistic applications Imposing subcategorization constraints
Gsubcat, a naïve account of verb subcategorization

Example (Gsubcat, a naïve account of verb subcategorization)

sleep → [cat: v, num: pl, subcat: intrans]    sleeps → [cat: v, num: sg, subcat: intrans]
feed → [cat: v, num: pl, subcat: trans]       feeds → [cat: v, num: sg, subcat: trans]
Linguistic applications Imposing subcategorization constraints
Gsubcat, a naïve account of verb subcategorization

Example (Gsubcat, a naïve account of verb subcategorization)

lamb → [cat: n, num: sg, case: [ ]]      lambs → [cat: n, num: pl, case: [ ]]
she → [cat: pron, num: sg, case: nom]    her → [cat: pron, num: sg, case: acc]
they → [cat: pron, num: pl, case: nom]   them → [cat: pron, num: pl, case: acc]
Rachel → [cat: propn, num: sg]           Jacob → [cat: propn, num: sg]
a → [cat: d, num: sg]                    two → [cat: d, num: pl]
Linguistic applications Subcategorization lists
Subcategorization lists
The previous account of subcategorization is naïve

Different verbs subcategorize for different kinds of complements: noun phrases, infinitival verb phrases, sentences etc.

Some verbs require more than one complement

The idea is to store in the lexical entry of each verb not an atomic feature indicating its subcategory, but rather a list of categories, indicating the appropriate complements of the verb
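Such lists can be pictured with the features first and rest, ending in the atom elist; the helper below (our illustrative sketch, not the book's notation) builds that encoding:

```python
def subcat(*cats):
    """Build a first/rest-encoded subcategorization list of categories."""
    lst = 'elist'                      # the empty list atom
    for cat in reversed(cats):
        lst = {'first': {'cat': cat}, 'rest': lst}
    return lst

sleep = {'cat': 'v', 'subcat': subcat()}             # intransitive
love  = {'cat': 'v', 'subcat': subcat('np')}         # transitive
give  = {'cat': 'v', 'subcat': subcat('np', 'np')}   # ditransitive
tell  = {'cat': 'v', 'subcat': subcat('np', 's')}    # NP + sentential object

print(give['subcat'])
# {'first': {'cat': 'np'}, 'rest': {'first': {'cat': 'np'}, 'rest': 'elist'}}
```

Because the lists are just feature structures, no change to the formalism is needed to accommodate them.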
Linguistic applications Subcategorization lists
Lexical entries of some verbs using subcategorization lists
Example (Lexical entries of some verbs using subcategorization lists)
sleep → [cat: v, subcat: elist, num: pl]

love → [cat: v, subcat: 〈[cat: np]〉, num: pl]

give → [cat: v, subcat: 〈[cat: np], [cat: np]〉, num: pl]

tell → [cat: v, subcat: 〈[cat: np], [cat: s]〉, num: pl]
Linguistic applications Subcategorization lists
Subcategorization lists
The grammar rules must be modified to reflect the additional wealth of information in the lexical entries

Due to this wealth there can be a dramatic reduction in the number of grammar rules necessary for handling verbs
Linguistic applications Subcategorization lists
VP rules using subcategorization lists
Example (VP rules using subcategorization lists)
[cat: s] → [cat: np] [cat: v, subcat: elist]

[cat: v, subcat: ②] → [cat: v, subcat: [first: [cat: ④], rest: ②]] [cat: ④]
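A toy rendering of the second rule (our sketch over plain dicts, not the book's machinery): each application matches and consumes the first element of subcat, so a single rule covers every valency.

```python
def consume_complement(v, complement_cat):
    """One application of the VP rule: match the next subcat item."""
    sc = v['subcat']
    if sc == 'elist':
        return None                       # nothing left to combine with
    if sc['first']['cat'] != complement_cat:
        return None                       # wrong category of complement
    return {'cat': 'v', 'subcat': sc['rest']}

# gave, with a two-element first/rest subcat list
gave = {'cat': 'v',
        'subcat': {'first': {'cat': 'np'},
                   'rest': {'first': {'cat': 'np'}, 'rest': 'elist'}}}

v1 = consume_complement(gave, 'np')   # gave the sheep
v2 = consume_complement(v1, 'np')     # gave the sheep water
print(v2)   # {'cat': 'v', 'subcat': 'elist'} -- ready for the S rule
```

The saturated result (subcat: elist) is exactly what the S rule above demands of its verbal daughter.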
Linguistic applications Subcategorization lists
A derivation tree
Example (A derivation tree)
[cat: s]
├── [cat: np]                                       Rachel
└── [cat: v, subcat: 〈〉]
    ├── [cat: v, subcat: 〈[cat: ②]〉]
    │   ├── [cat: v, subcat: 〈[cat: ①], [cat: ②]〉]  gave
    │   └── [cat: ① np]                              the sheep
    └── [cat: ② np]                                  water
Linguistic applications Subcategorization lists
A derivation tree
Example (A derivation tree)
(In this tree, c abbreviates cat and sc abbreviates subcat.)

[c: s]
├── [c: np]                                Jacob
└── [c: v, sc: 〈〉]
    ├── [c: v, sc: 〈[c: ②]〉]
    │   ├── [c: v, sc: 〈[c: ①], [c: ②]〉]  told
    │   └── [c: ① np]                      Laban
    └── [c: ② s]
        ├── [c: np]                        he
        └── [c: v, sc: 〈〉]
            ├── [c: v, sc: 〈[c: ③]〉]      loved
            └── [c: ③ np]                  Rachel
Linguistic applications Subcategorization lists
Subcategorization imposes case constraints
In the above grammar, categories on subcategorization lists are represented as atomic symbols

This is a simplification; the method outlined here can be used with more complex encodings of categories

For example, the lexical entry of the German verb geben (‘to give’) can state that the first complement must be in the dative case, whereas the second must be accusative
Linguistic applications Subcategorization lists
Subcategorization imposes case constraints
Example (Subcategorization imposes case constraints)
Ich gebe dem Hund den Knochen
I give the(dat) dog the(acc) bone
‘I give the dog the bone’

∗Ich gebe den Hund den Knochen
I give the(acc) dog the(acc) bone

∗Ich gebe dem Hund dem Knochen
I give the(dat) dog the(dat) bone
Linguistic applications Subcategorization lists
Subcategorization imposes case constraints
Example (Subcategorization imposes case constraints)
The lexical entry of gebe, then, could be:

L(gebe) = [cat: v, subcat: 〈[cat: np, case: dat], [cat: np, case: acc]〉, num: sg]
Linguistic applications Subcategorization lists
Subcategorization imposes case constraints
In order to account for subcategorization of complex information (rather than of atomic category symbols), the VP rule which manipulates subcategorization lists has to be slightly modified

The revised rule reflects the fact that the subcategorized information is not the value of the cat feature, but rather the entire verb complement:

[cat: v, subcat: ②] → [cat: v, subcat: [first: ③, rest: ②]] ③[ ]
Linguistic applications Subcategorization lists
G3, a complete E2-grammar
Example (G3, a complete E2-grammar)
[cat: s] → [cat: np, num: ④, case: nom] [cat: v, num: ④, subcat: elist]

[cat: v, num: ④, subcat: ②] → [cat: v, num: ④, subcat: [first: ③, rest: ②]] ③[ ]

[cat: np, num: ④, case: ②] → [cat: d, num: ④] [cat: n, num: ④, case: ②]

[cat: np, num: ④, case: ②] → [cat: pron, num: ④, case: ②] | [cat: propn, num: ④, case: ②]
Linguistic applications Subcategorization lists
G3, a complete E2-grammar
Example (G3, a complete E2-grammar)
sleep → [cat: v, subcat: elist, num: pl]

give → [cat: v, subcat: 〈[cat: np, case: acc], [cat: np]〉, num: pl]

love → [cat: v, subcat: 〈[cat: np, case: acc]〉, num: pl]

tell → [cat: v, subcat: 〈[cat: np, case: acc], [cat: s]〉, num: pl]
Linguistic applications Subcategorization lists
G3, a complete E2-grammar
Example (G3, a complete E2-grammar)
lamb → [cat: n, num: sg, case: ②]        lambs → [cat: n, num: pl, case: ②]
she → [cat: pron, num: sg, case: nom]    her → [cat: pron, num: sg, case: acc]
Rachel → [cat: propn, num: sg]           Jacob → [cat: propn, num: sg]
a → [cat: d, num: sg]                    two → [cat: d, num: pl]
Linguistic applications Long distance dependencies
Long distance dependencies
Encoding grammatical categories as feature structures is very useful in the treatment of unbounded dependencies

Such phenomena involve a “missing” constituent that is realized outside the clause from which it is missing, as in:

The shepherd wondered whom Jacob loved ⌣.
Linguistic applications Long distance dependencies
Long distance dependencies
Phrases such as whom Jacob loved ⌣ or who ⌣ loved Rachel are sentences, with a constituent which is “moved” from its default position and realized as a wh-pronoun in front of the phrase

We represent such phrases by using the category s

But to distinguish them from declarative sentences we add a feature, que, to the category

The value of que is ‘+’ in sentences with an interrogative pronoun realizing a transposed constituent
Linguistic applications Long distance dependencies
Long distance dependencies
We also add a lexical entry for the pronoun whom:

whom → [cat: pron, case: acc, que: +]

Finally, we update the rule that derives pronouns such that it propagates the value of que from the lexicon to higher projections of the pronoun:

[cat: np, num: ①, case: ③, que: ⑤] → [cat: pron, num: ①, case: ③, que: ⑤]
Linguistic applications Long distance dependencies
Long-distance dependencies
We extend G3 with two additional rules, based on the first two rules of G3:

(3) [cat: s, slash: ④] → [cat: np, num: ①, case: nom] [cat: v, num: ①, subcat: elist, slash: ④]

(4) [cat: v, num: ①, subcat: ②, slash: ④] → [cat: v, num: ①, subcat: [first: ④, rest: ②]]
Linguistic applications Long distance dependencies
A derivation tree for Jacob loved ⌣
Example (A derivation tree for Jacob loved ⌣)

[cat: s, slash: ④]
├── [cat: np, num: ①, case: ②]
│   └── [cat: propn, num: ① sg, case: ② nom]   Jacob
└── [cat: v, num: ①, slash: ④, subcat: ⑧]
    └── [cat: v, num: ①, subcat: [first: ④ [cat: np, case: acc], rest: ⑧ elist]]   loved
Linguistic applications Long distance dependencies
Long-distance dependencies
A rule for creating “complete” sentences by combining the missing category with a “slashed” sentence

The rule does not commit as to the category of the dislocated element; it simply combines any category with a sentence in which this very same category is missing, provided that this category is marked as ‘que +’

The value of que is propagated to the mother to indicate that the sentence is interrogative rather than declarative:

(5) [cat: s, que: ⑤] → ④[que: ⑤ +] [cat: s, slash: ④]
Linguistic applications Long distance dependencies
A derivation tree for whom Jacob loved ⌣
Example (A derivation tree for whom Jacob loved ⌣)

[cat: s, que: ⑤]
├── ④[cat: np, case: ③, que: ⑤]
│   └── [cat: pron, case: ③ acc, que: ⑤ +]        whom
└── [cat: s, slash: ④]
    ├── [cat: np, num: ①, case: ②]
    │   └── [cat: propn, num: ① sg, case: ② nom]  Jacob
    └── [cat: v, num: ①, slash: ④, subcat: elist]
        └── [cat: v, num: ①, subcat: 〈④〉]         loved
Linguistic applications Long distance dependencies
Long-distance dependencies
In order to derive the full sentence Rachel wondered whom Jacob loved ⌣ we need a lexical entry for the verb wondered

It is a verb, so its category is v, and as it subcategorizes for an interrogative sentence, its subcategory is a list of a single member, a sentence whose que feature is ‘+’:

wondered → [cat: v, num: [ ], subcat: 〈[cat: s, que: +]〉]
Linguistic applications Long distance dependencies
A derivation tree for Rachel wondered whom Jacob loved ⌣
Example (A derivation tree for Rachel wondered whom Jacob loved ⌣)
[cat: s]
├── [cat: np, num: ③, case: ④ nom]
│   └── [cat: propn, num: ③ sg, case: ④]   Rachel
└── [cat: v, num: ③, subcat: elist]
    ├── [cat: v, num: ③, subcat: 〈①〉]      wondered
    └── ①[cat: s, que: +]                   whom Jacob loved ⌣
Linguistic applications Long distance dependencies
Long-distance dependencies
In the previous example the filler of the gap is realized immediately to the left of the clause in which the gap occurs

This need not always be the case: unbounded dependencies can hold across several clause boundaries

Typical examples are:

The shepherd wondered whom Jacob loved ⌣.
The shepherd wondered whom Laban thought Jacob loved ⌣.
The shepherd wondered whom Laban thought Leah claimed Jacob loved ⌣.
Linguistic applications Long distance dependencies
Long-distance dependencies
Also, the dislocated constituent does not have to be an object:

The shepherd wondered who ⌣ loved Rachel.
The shepherd wondered who Laban thought ⌣ loved Rachel.
The shepherd wondered who Laban thought Leah claimed ⌣ loved Rachel.
Linguistic applications Long distance dependencies
Long-distance dependencies
The solution we proposed for the simple case of unbounded dependencies can be easily extended to the more complex examples

The solution amounts to three components:

A slash introduction rule
Slash propagation rules
A gap filler rule
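The three components can be mimicked in a toy gap-threading pass (all functions below are our illustrative inventions over plain dicts, not the grammar's actual rules):

```python
def introduce(verb):
    """Slash introduction: the object is missing; record its category."""
    missing = verb['subcat']['first']
    return {'cat': 's', 'slash': missing, 'que': '-'}

def propagate(clause):
    """Slash propagation: an embedding clause inherits the slash unchanged."""
    return {'cat': 's', 'slash': clause['slash'], 'que': '-'}

def fill(filler, clause):
    """Gap filler: a que+ phrase discharges a matching slash."""
    if filler['cat'] != clause['slash']['cat'] or filler['que'] != '+':
        return None
    return {'cat': 's', 'que': '+'}

loved = {'cat': 'v', 'subcat': {'first': {'cat': 'np'}, 'rest': 'elist'}}
whom  = {'cat': 'np', 'que': '+'}

s = introduce(loved)    # Jacob loved __
s = propagate(s)        # Laban thought Jacob loved __
print(fill(whom, s))    # {'cat': 's', 'que': '+'}
```

Because propagation copies the slash unchanged, any number of intervening clause boundaries is handled by repeating the same step.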
Linguistic applications Long distance dependencies
Long-distance dependencies
In order to account for filler–gap relations that hold across several clauses, all that needs to be done is to add more slash propagation rules

For example, in

The shepherd wondered whom Laban thought Jacob loved ⌣.

the slash is introduced by the verb phrase loved ⌣, and is propagated to the sentence Jacob loved ⌣ by rule (3)

This sentence is the object of the verb thought; therefore, we need a rule that propagates the value of slash from a sentential object to the verb phrase of which it is an object
Linguistic applications Long distance dependencies
Long-distance dependencies
Example (Long-distance dependencies)
(6) [cat: v, num: ①, subcat: ⑫, slash: ④] → [cat: v, num: ①, subcat: [first: ⑧, rest: ⑫]] ⑧[slash: ④]
Linguistic applications Long distance dependencies
Long-distance dependencies
Example (Long-distance dependencies)
Then, the slash is propagated from the verb phrase thought Jacob loved ⌣ to the sentence Laban thought Jacob loved ⌣:

(7) [cat: s, slash: ④] → [cat: np, num: ⑤, case: nom] [cat: v, num: ⑤, subcat: elist, slash: ④]
Linguistic applications Long distance dependencies
Long-distance dependencies
Example (A derivation tree for whom Laban thought Jacob loved ⌣; sc abbreviates subcat)

[cat: s, que: ⑥]
├── ④[cat: np, case: ③, que: ⑥]
│   └── [cat: pron, case: ③ acc, que: ⑥ +]               whom
└── [cat: s, slash: ④]
    ├── [cat: np, num: ⑤, case: ⑨]
    │   └── [cat: propn, num: ⑤ sg, case: ⑨ nom]          Laban
    └── [cat: v, num: ⑤, slash: ④, sc: ⑫ elist]
        ├── [cat: v, num: ⑤, sc: [first: ⑧, rest: ⑫]]     thought
        └── ⑧[cat: s, slash: ④]
            ├── [cat: np, num: ①, case: ②]
            │   └── [cat: propn, num: ① sg, case: ② nom]  Jacob
            └── [cat: v, num: ①, slash: ④, sc: elist]
                └── [cat: v, num: ①, sc: 〈④〉]             loved
Linguistic applications Long distance dependencies
Long-distance dependencies
Example (Long-distance dependencies)
Finally, to account for gaps in the subject position, all that is needed is anadditional slash introduction rule:
(8)
cat : s
slash :
cat : npnum : 1
case : nom
→
cat : vnum : 1
subcat : elist
c©Shuly Wintner (University of Haifa) Unification Grammars c©Copyrighted material 396 / 420
Linguistic applications Long distance dependencies
Long-distance dependencies
Example (A derivation tree for who ⌣ loved Rachel)

[cat: s, que: ⑥]
├── ④[cat: np, case: ③ nom, que: ⑥]
│   └── [cat: pron, case: ③ nom, que: ⑥ +]       who
└── [cat: s, num: ①, slash: ④]
    └── [cat: v, num: ①, subcat: elist]
        ├── [cat: v, num: ① sg, subcat: 〈⑧〉]     loved
        └── ⑧[cat: np, case: ②]
            └── [cat: propn, num: sg, case: ② acc]   Rachel
Linguistic applications Subject and object control
Subject and object control
Differences between the ‘understood’ subjects of the infinitive verb phrase to work seven years in the following sentences:

Jacob promised Laban to work seven years
Laban persuaded Jacob to work seven years

The differences between the two example sentences stem from differences in the matrix verbs:

promise is a subject control verb; persuade is object control
Linguistic applications Subject and object control
G4: explicit subj values
Example (G4: explicit subj values)
[cat : s] → 1 [cat : np, case : nom, num : 7] [cat : v, num : 7, subcat : elist, subj : 1]

[cat : v, num : 7, subcat : 4, subj : 1] → [cat : v, num : 7, subcat : [first : 2, rest : 4], subj : 1] 2 [ ]

[cat : np, num : 7, case : 6] → [cat : d, num : 7] [cat : n, num : 7, case : 6]

[cat : np, num : 7, case : 6] → [cat : pron, num : 7, case : 6] | [cat : propn, num : 7, case : 6]
sleep → [cat : v, subcat : elist, subj : [cat : np, case : nom, num : pl]]

love → [cat : v, subcat : ⟨ [cat : np, case : acc] ⟩, subj : [cat : np, case : nom, num : pl]]

give → [cat : v, subcat : ⟨ [cat : np, case : acc], [cat : np] ⟩, subj : [cat : np, case : nom, num : pl]]
lamb → [cat : n, num : sg, case : [ ]]

lambs → [cat : n, num : pl, case : [ ]]

she → [cat : pron, num : sg, case : nom]

her → [cat : pron, num : sg, case : acc]

Rachel → [cat : propn, num : sg]

Jacob → [cat : propn, num : sg]

a → [cat : d, num : sg]

two → [cat : d, num : pl]
Infinitival verb phrases
The next step is to account for infinitival verb phrases

This can easily be done by adding a new feature, vform, to verbal projections

The values of this feature represent the form of the verb: fin for finite verbs and inf for infinitival ones
to work → [cat : v, vform : inf, subcat : elist, subj : [cat : np]]
The lexical entry of promise
Example (The lexical entry of promise)
promised → [cat : v, vform : fin, subcat : ⟨ [cat : np, case : acc], [cat : v, vform : inf, subj : 1] ⟩, subj : 1 [cat : np, case : nom], num : [ ]]
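The reentrancy tag 1 — token identity between the matrix subj and the infinitival complement's subj — is exactly what makes promise subject control (and, by relocating the tag, persuade object control). A minimal Python sketch, modelling sharing as two features holding the very same dict object; all names and structures are illustrative:

```python
# Sketch: reentrancy as object identity. In promise's entry, the matrix
# subject and the infinitival complement's subject are the SAME structure
# (tag 1); in persuade's entry, the infinitive's subject is instead
# identical to the matrix object.

def entry_promise():
    subj = {"cat": "np", "case": "nom"}                # matrix subject, tag 1
    inf = {"cat": "v", "vform": "inf", "subj": subj}   # shares that subject
    return {"cat": "v", "vform": "fin",
            "subcat": [{"cat": "np", "case": "acc"}, inf],
            "subj": subj}

def entry_persuade():
    obj = {"cat": "np", "case": "acc"}                 # matrix object, tag 1
    inf = {"cat": "v", "vform": "inf", "subj": obj}    # shares that object
    return {"cat": "v", "vform": "fin",
            "subcat": [obj, inf],
            "subj": {"cat": "np", "case": "nom"}}

promised = entry_promise()
persuaded = entry_persuade()
# Subject control: the infinitive's subj is token-identical to the matrix subj.
# Object control: the infinitive's subj is token-identical to the matrix object.
```

The `is` identity of Python objects plays the role of the boxed tag: updating the shared structure in one place is visible in the other, just as unification with a reentrant AVM would be.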
A derivation tree for Jacob promised Laban to work
Example (A derivation tree for Jacob promised Laban to work)

(Tree flattened: node AVMs top-down, yield below.)

[cat : s]
[cat : v, vform : fin, subj : 1, subcat : elist]
[cat : v, vform : fin, subj : 1, subcat : ⟨ 3 ⟩]
1 [cat : np, case : 6 nom]
[cat : v, vform : fin, subj : 1, subcat : ⟨ 2, 3 ⟩]
2 [cat : np, case : 7 acc]
[cat : propn, case : 6]
[cat : propn, case : 7]
3 [cat : v, vform : inf, subj : 1]

Jacob promised Laban to work
The lexical entry of persuade
Example (The lexical entry of persuade)
persuaded → [cat : v, vform : fin, subcat : ⟨ 1 [cat : np, case : acc], [cat : v, vform : inf, subj : 1] ⟩, subj : [cat : np, case : nom], num : [ ]]
Linguistic applications Constituent coordination
Constituent coordination
N: no man lift up his [hand] or [foot] in all the land of Egypt
NP: Jacob saw [Rachel] and [the sheep of Laban]
VP: Jacob [went on his journey] and
[came to the land of the people of the east]
VP: Jacob [went near], and
[rolled the stone from the well’s mouth], and
[watered the flock of Laban his mother’s brother].
ADJ: every [speckled] and [spotted] sheep
ADJP: Leah was [tender eyed] but [not beautiful]
S: [Leah had four sons], but [Rachel was barren]
S: she said to Jacob, “[Give me children], or [I shall die]!”
Coordination in CFG
Example (Coordination in CFG)
S → S Conj S
NP → NP Conj NP
VP → VP Conj VP
...
Conj → and, or, but, . . .
Coordination in UG
Example (Coordination in UG)

[cat : 1] → [cat : 1] [cat : conj] [cat : 1]
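A rough sketch of what this single schematic rule buys over the CFG version: the shared tag 1 can be modelled as a required identity between the daughters' cat values, so one function licenses NP, VP and S coordination alike. The function name and dict encoding below are illustrative assumptions, not part of the formalism.

```python
# Sketch: the single UG rule [cat: 1] -> [cat: 1] [cat: conj] [cat: 1]
# replaces one CFG rule per category. The shared tag 1 is modelled by
# requiring the two conjuncts' cat values to be identical.

def coordinate(left, conj, right):
    """Return the mother of a coordinate structure, or None if ill-formed."""
    if conj.get("cat") != "conj":
        return None
    if left.get("cat") is None or left.get("cat") != right.get("cat"):
        return None        # the shared tag 1: both conjuncts' cat must unify
    return {"cat": left["cat"]}

# The same rule licenses coordination of any category:
coordinate({"cat": "np"}, {"cat": "conj"}, {"cat": "np"})   # -> {'cat': 'np'}
coordinate({"cat": "s"},  {"cat": "conj"}, {"cat": "s"})    # -> {'cat': 's'}
coordinate({"cat": "np"}, {"cat": "conj"}, {"cat": "v"})    # -> None
```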
Example (Coordination)
(Tree flattened: node AVMs top-down, yield below.)

[cat : 1 v]
[cat : 1, num : [ ], sc : elist]
[cat : 1, num : [ ], sc : elist]
[cat : v, num : [ ], sc : ⟨ 2 ⟩]
2 [cat : np, num : sg]
[cat : conj]
[cat : v, num : [ ], sc : ⟨ 3 ⟩]
3 [cat : np, num : [ ]]

rolled the stone and watered the sheep
Tough issues in coordination
Coordination of conjunctions
Properties of the conjoined phrases
Coordination of unlikes
Non-constituent coordination
Coordination
Example (Ruling out coordination in UG)

[cat : 1, conj : −] → [cat : 1, conj : +] [cat : conj] [cat : 1, conj : +]
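A sketch of the effect of the conj feature: a coordinate mother comes out conj : −, while each conjunct must be conj : +, so an already-coordinated phrase can never serve as a conjunct again. The Python encoding below (strings "+" and "-", illustrative function and variable names) is an assumption for exposition only.

```python
# Sketch: a boolean conj feature blocks coordination of coordinations.
# Simple phrases carry conj: '+'; the rule's mother carries conj: '-',
# and each daughter conjunct is required to be conj: '+'.

def coordinate(left, conj, right):
    """Coordinate two conjuncts; the result itself refuses to be a conjunct."""
    if conj.get("cat") != "conj":
        return None
    if left.get("conj") != "+" or right.get("conj") != "+":
        return None          # a coordinate structure (conj: '-') cannot recur
    if left.get("cat") != right.get("cat"):
        return None
    return {"cat": left["cat"], "conj": "-"}

np1 = {"cat": "np", "conj": "+"}                  # simple phrases are conj: +
np2 = {"cat": "np", "conj": "+"}
coord = coordinate(np1, {"cat": "conj"}, np2)     # -> {'cat': 'np', 'conj': '-'}
nested = coordinate(coord, {"cat": "conj"}, np1)  # -> None: coord is conj: '-'
```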
Example (Properties of the conjoined phrases)

(Tree flattened: node AVMs top-down, yield below.)

[cat : 1 np, num : ??, pers : ??, gen : ??]
[cat : 1, num : 4, pers : 2, gen : 8]
[cat : 1, num : 6, pers : 3, gen : 7]
[cat : pron, num : 4, pers : 2 second, gen : 8]
[cat : conj]
[cat : d, num : 6]
[cat : n, num : 6 sg, pers : 3 third, gen : 7]

you and a lamb
Example (Coordination of unlikes)
Joseph became wealthy
Joseph became a minister
Joseph became [wealthy and a minister]
Joseph grew wealthy
∗Joseph grew a minister
∗Joseph grew [wealthy and a minister]
Example (Coordination of unlikes)
[cat : 1 ⊓ 2] → [cat : 1] [cat : conj] [cat : 2]

where ‘⊓’ is the generalization operator
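Generalization can be sketched concretely. Below, feature structures are plain nested Python dicts (ignoring reentrancy, which the full definition must also track); the result keeps only the information common to both arguments, making it the dual of unification. All names and example structures are illustrative.

```python
# Sketch of generalization over plain nested dicts: the most specific
# structure subsuming both arguments.

DROP = object()   # sentinel: the two values share no information

def generalize(fs1, fs2):
    """Most specific feature structure subsuming both arguments."""
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        return fs1 if fs1 == fs2 else DROP   # unequal atoms contribute nothing
    result = {}
    for feat in fs1.keys() & fs2.keys():     # features present in BOTH
        sub = generalize(fs1[feat], fs2[feat])
        if sub is not DROP:
            result[feat] = sub
    return result

adj = {"cat": {"v": "+", "n": "+"}}   # wealthy: adjectival
nom = {"cat": {"v": "-", "n": "+"}}   # a minister: nominal

mother = generalize(adj, nom)
# -> {'cat': {'n': '+'}}: the conflicting v values drop out, and only the
# shared n: '+' survives, mirroring the [cat : [n : +]] mother.
```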
Example (Coordination of unlikes)

(Tree flattened: node AVMs top-down, yield below.)

[cat : [v : +]]
[subcat : [n : +], cat : [v : +]]
[cat : [n : +]]
[cat : [v : +, n : +]]
[cat : conj]
[cat : [v : −, n : +]]

became wealthy and a minister
Example (Coordination of unlikes)

(Tree flattened: node AVMs top-down, yield below; c abbreviates cat and sc abbreviates subcat.)

[c : [v : +]]
[c : [v : +], sc : [n : +]]
[c : [n : +]]
[c : [v : +], sc : [v : +, n : +]]
[c : c]
[c : [v : +], sc : [n : +]]
[c : [v : +, n : +]]
[c : c]
[c : [v : −, n : +]]

grew and remained wealthy and a minister
Example (Non-constituent coordination)
Rachel gave the sheep [grass] and [water]
Rachel gave [the sheep grass] and [the lambs water]
Rachel [kissed] and Jacob [hugged] Binyamin
Linguistic applications Unification grammars facilitate linguistic generalizations
Unification grammars facilitate linguistic generalizations
Compared with context-free grammars, unification grammars provide much better means for expressing linguistic generalizations

Verb subcategorization
Coordination

Unification grammars also provide much more informative structures than CFGs

Agreement
Subject/object control

Unification grammars provide a very powerful tool for expressing what other linguistic theories would call “movement”

Gap–filler constructions
Unbounded dependencies
Summary Extensions and open problems
Extensions and open problems
Restricted versions of unification grammars

Off-line parsability
Context-free and mildly context-sensitive unification grammars
Polynomially-parsable unification grammars

Typed unification grammars

Type hierarchies
Appropriateness specification
Type inference

Development of large-scale grammars

Grammar engineering
Modularity, information encapsulation, separate compilation, ...
Thank you