context-free grammars for natural languages


Post on 22-Jan-2022


Context-free grammars Basic definitions

Context-free grammars for natural languages

Context-free grammars can be used for a variety of syntactic constructions, including some non-trivial phenomena such as unbounded dependencies, extraction, extraposition, etc.

However, some (formal) languages are not context-free, and therefore there are certain sets of strings that cannot be generated by context-free grammars.

The interesting question, of course, involves natural languages: are there natural languages that are not context-free? Are context-free grammars sufficient for generating every natural language?

© Shuly Wintner (University of Haifa), Unification Grammars. © Copyrighted material.


A context-free grammar, G0, for E0

Example (A context-free grammar, G0, for E0)

S → NP VP
VP → V
VP → V NP
NP → D N
NP → Pron
NP → PropN
D → the, a, two, every, ...
N → sheep, lamb, lambs, shepherd, water, ...
V → sleep, sleeps, love, loves, feed, feeds, herd, herds, ...
Pron → I, me, you, he, him, she, her, it, we, us, they, them
PropN → Rachel, Jacob, ...


Context-free grammars for natural languages

There are two major problems with this grammar.

1 It ignores the valence of verbs: there is no distinction among subcategories of verbs, so an intransitive verb such as sleep might occur with a noun phrase complement, while a transitive verb such as love might occur without one. In such a case we say that the grammar overgenerates: it generates strings that are not in the intended language.

2 There is no treatment of subject–verb agreement, so a singular subject such as the cat might be followed by a plural verb form such as smile. This is another case of overgeneration.

Both problems are easy to solve.


Problems of G0

Over-generation (agreement constraints are not imposed):

∗Rachel feed the sheep

∗The shepherds feeds the sheep

∗Rachel feeds

∗Jacob loves she

∗Them herd the sheep


Problems of G0

Over-generation (subcategorization constraints are not imposed):

the lambs sleep

Jacob loves Rachel

∗the lambs sleep the sheep

∗Jacob loves


Problems of G0

Example (Over-generation)

[S [NP [D the] [N lambs]] [VP [V sleeps] [NP [Pron they]]]]
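The over-generation of G0 can be checked mechanically. The following is a minimal sketch, not part of the original slides: it encodes G0 as Python dictionaries (the names RULES, LEXICON, derives, matches and accepts are illustrative) and uses a naive recognizer that tries every way of splitting the input among the symbols of a rule.

```python
from functools import lru_cache

# G0, with the open-ended word lists truncated to the words shown on the slide.
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V",), ("V", "NP")],
    "NP": [("D", "N"), ("Pron",), ("PropN",)],
}
LEXICON = {
    "D": {"the", "a", "two", "every"},
    "N": {"sheep", "lamb", "lambs", "shepherd", "water"},
    "V": {"sleep", "sleeps", "love", "loves", "feed", "feeds", "herd", "herds"},
    "Pron": {"I", "me", "you", "he", "him", "she", "her", "it", "we", "us", "they", "them"},
    "PropN": {"Rachel", "Jacob"},
}


@lru_cache(maxsize=None)
def derives(symbol, words):
    """Return True if `symbol` derives the tuple of words `words`."""
    if symbol in LEXICON:
        return len(words) == 1 and words[0] in LEXICON[symbol]
    return any(matches(rhs, words) for rhs in RULES[symbol])


def matches(rhs, words):
    """Return True if the sequence of symbols `rhs` derives `words`."""
    if not rhs:
        return not words
    head, rest = rhs[0], rhs[1:]
    # Every symbol derives at least one word, so try each non-empty prefix
    # that leaves at least one word for each remaining symbol.
    return any(
        derives(head, words[:i]) and matches(rest, words[i:])
        for i in range(1, len(words) - len(rest) + 1)
    )


def accepts(sentence):
    """Return True if G0 generates the whitespace-separated sentence."""
    return derives("S", tuple(sentence.split()))


for s in ["the lambs sleep", "the lambs sleeps they", "Jacob loves"]:
    print(s, "->", accepts(s))
```

Under these assumptions the recognizer accepts the grammatical string as well as the two ungrammatical ones, which is exactly the over-generation described above.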


Verb valence

To account for valence, we can replace the non-terminal symbol V by a set of symbols: Vtrans, Vintrans, Vditrans, etc.

We must also change the grammar rules accordingly:

Example

VP → Vintrans          Vintrans → sleep, sleeps
VP → Vtrans NP         Vtrans → love, loves
VP → Vditrans NP NP    Vditrans → give, gives


Agreement

To account for agreement, we can again extend the set of non-terminal symbols, so that the non-terminal assigned to each category that must agree reflects the features on which it agrees.

In the very simple case of English, it is sufficient to multiply the set of “nominal” and “verbal” categories, so that we get Dsg, Dpl, Nsg, Npl, NPsg, NPpl, Vsg, Vpl, VPsg, VPpl, etc. We must also change the set of rules accordingly:


Agreement

Example

Nsg → lamb Npl → lambs

Nsg → sheep Npl → sheep

Vsg → sleeps Vpl → sleep

Vsg → smiles Vpl → smile

Vsg → loves Vpl → love

Vsg → saw Vpl → saw

Dsg → a Dpl → two


Agreement

Example

S → NPsg VPsg       S → NPpl VPpl
NPsg → Dsg Nsg      NPpl → Dpl Npl
VPsg → Vsg          VPpl → Vpl
VPsg → Vsg NP       VPpl → Vpl NP
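The cost of this solution is that every rule has to be written once per agreement value. A small sketch of that multiplication follows; the template notation with `*` and the function name expand are illustrative assumptions, not part of the slides, and the last template assumes the transitive rule combines a number-marked verb with an NP.

```python
def expand(templates, values=("sg", "pl")):
    """Instantiate the placeholder '*' in every rule template with each value."""
    rules = []
    for lhs, rhs in templates:
        for v in values:
            rule = (lhs.replace("*", v), tuple(sym.replace("*", v) for sym in rhs))
            if rule not in rules:  # a template without '*' would otherwise repeat
                rules.append(rule)
    return rules


# '*' marks the positions that must carry the same agreement value.
TEMPLATES = [
    ("S", ("NP*", "VP*")),
    ("NP*", ("D*", "N*")),
    ("VP*", ("V*",)),
    ("VP*", ("V*", "NP")),
]

RULES = expand(TEMPLATES)
for lhs, rhs in RULES:
    print(lhs, "→", " ".join(rhs))
print(len(RULES), "rules")
```

Passing six combined person-and-number values instead of two would triple the rule count again, which hints at why atomic category symbols scale poorly.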


Methodological properties of the CFG formalism

1 Concatenation is the only string combination operation

2 Phrase structure is the only syntactic relationship

3 The terminal symbols have no properties

4 Non-terminal symbols (grammar variables) are atomic

5 Most of the information encoded in a grammar lies in the production rules

6 Any attempt to extend the grammar with a semantics requires extra means.


Alternative methodological properties

1 Concatenation is not necessarily the only way by which phrases may be combined to yield other phrases.

2 Even if concatenation is the sole string operation, other syntactic relationships are being put forward.

3 Modern computational formalisms for expressing grammars adhere to an approach called lexicalism.

4 Some formalisms do not retain any context-free backbone. However, if one is present, its categories are not atomic.

5 The expressive power added to the formalisms also allows a certain way of representing semantic information.


Feature structures Introduction

Feature structures

Motivated by the violations of the context-free grammar G0, we would like to extend the CFG formalism with additional mechanisms that will facilitate the expression of information that is missing in G0 in a uniform and compact way.

The core idea is to incorporate into the grammar properties of symbols, in terms of which the violations of G0 were stated.

Properties are represented by means of feature structures.


Overview

An overview of feature structures, motivating their use as a representation of linguistic information

Four different views of these entities:

feature graphs
feature structures
attribute-value matrices (AVMs)

Feature structures in a broader context.


Feature structures Motivation

Motivation

Words in natural languages have properties

We want to model these properties in the lexicon

We would like to associate with words not just atomic symbols, as in CFGs, but rather structural information that reflects their properties.


A simple lexicon

Example (A simple lexicon)

lamb: [num : sg, pers : third]

lambs: [num : pl, pers : third]

I: [num : sg, pers : first]

sheep: [num : [ ], pers : third]

dreams: [num : sg, pers : third]
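As a rough illustration, the lexicon above can be written as nested Python dictionaries. This is only a sketch: the name LEXICON, the use of None for the empty value [ ], and the helper compatible are assumptions made here, and the helper is a simplified agreement check rather than an operation defined in the slides.

```python
# Each word maps to a flat feature structure; None stands in for the empty
# (unconstrained) value [ ] used in the entry for "sheep".
LEXICON = {
    "lamb":   {"num": "sg", "pers": "third"},
    "lambs":  {"num": "pl", "pers": "third"},
    "I":      {"num": "sg", "pers": "first"},
    "sheep":  {"num": None, "pers": "third"},
    "dreams": {"num": "sg", "pers": "third"},
}


def compatible(a, b):
    """Return True if no feature has two different constrained values in a and b."""
    for feat in a.keys() & b.keys():
        if a[feat] is not None and b[feat] is not None and a[feat] != b[feat]:
            return False
    return True


for subject in ["lamb", "lambs", "I", "sheep"]:
    print(subject, "dreams:", compatible(LEXICON[subject], LEXICON["dreams"]))
```

The unconstrained num value lets sheep combine with dreams, while lambs (wrong number) and I (wrong person) are rejected.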


Feature structures

Feature structures map features into values, which are themselves feature structures.

A special case of feature structures are atoms, which represent structureless values.

For example, to deal with number (and impose its agreement), we use a feature num, and a set of atomic feature structures {sg, pl} as its values, representing singularity and plurality, respectively.

When a value is not atomic, it is complex.

A complex value is, recursively, a feature structure consisting of features and values.


A complex feature structure

Example (A complex feature structure)

loves: [vtype : transitive, agr : [num : sg, pers : third]]


Grouping features

Deciding how to group features is up to the grammar designer, and is intended to capture syntactic generalizations.

If number and person ‘go together’ in formulating restrictions, it is more appropriate to group them as in this example.

Moreover, such a grouping might be beneficial when feature structures are being modified.

Processes of derivation and parsing (the application of grammar rules) are able to manipulate feature structures to reflect application of such constraints.

When the properties of some feature structure are changed, it is possible to change the value of only one feature, namely agr, rather than specify separate changes, one for each subfeature.


Grouping features

In the example lexicon, the lexical ambiguity of sheep is represented by an empty feature structure as the value of the num feature.

This is interpreted as the value of this feature being unconstrained.

However, it would have been useful to be able to state that the only possible values for this feature are, say, sg and pl.

There are at least two different ways to specify such information:

by listing a set of values for the feature;
or by restricting its value to a certain “type” of permissible values.

We do not explore the former solution here.

The latter solution is employed by typed feature structure formalisms.


Adding features to phrases

Words are not the only linguistic entities that have properties; words are combined into phrases, and those also have properties which can be modeled by feature:value pairs.

For example, the noun phrase a sheep has the value sg for the num feature, while two sheep has the value pl for num.

Consequently, grammar non-terminals, too, must be decorated with features, representing the endowment of phrases of this category with that feature.


Feature structures Feature graphs

Feature graphs

The informal discussion of feature structures above depicted them using a representation, called attribute-value matrices (AVMs), which is common in the linguistic literature.

We begin the discussion of feature structures by defining the concept of feature graphs, using well-known concepts of graph theory.

A graph view of feature structures facilitates computational processing because so many properties of graphs are well understood and because graphs lend themselves to efficient processing.

We will return to AVMs and discuss their correspondence with feature graphs later on.


Definitions

Feature graphs are defined over a signature consisting of non-empty, finite, disjoint sets Feats of features and Atoms of atoms.

Features are used to encode properties of (linguistic) objects, such as number, gender, etc.

Atoms are used for the (atomic) values of such features, as in plural, feminine, etc.

We use a convention of depicting features in small capitals and atoms in italics.


Signature

Definition (Signature)

A signature is a structure S = 〈Atoms, Feats〉, where Atoms is a finite set of atoms and Feats is a finite set of features.

We assume some fixed signature throughout this presentation.

Meta-variables f, g (with or without subscripts or superscripts) range over features, and a, b, etc. over atoms.

We usually assume that both Feats and Atoms are non-empty (and sometimes even assume that they include more than one element each).


Feature graphs

Definition (Feature graphs)

A feature graph A = 〈QA, qA, δA, θA〉 is a finite, directed, connected, labeled graph consisting of a finite, nonempty set of nodes QA (such that QA ∩ Feats = QA ∩ Atoms = ∅), a root qA ∈ QA, a partial function δA : QA × Feats → QA specifying the arcs such that every node q ∈ QA is accessible from qA, and a partial function marking some of the sinks: θA : QS → Atoms, where QS = {q ∈ QA | δA(q, f)↑ for every f}.

Given a signature of features Feats and atoms Atoms, let G(Feats, Atoms) be the set of all feature graphs over the signature.


Feature graphs

Example (Feature graphs)

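The graph displayed below is 〈Q, q, δ, θ〉, where Q = {q0, q1, q2, q3}, q = q0, δ(q0, agr) = q1, δ(q1, num) = q2, δ(q1, pers) = q3, QS = {q2, q3}, θ(q2) = pl, θ(q3) = third.

[Figure: feature graph with root q0 and arcs q0 -agr-> q1, q1 -num-> q2, q1 -pers-> q3; the sinks q2 and q3 are marked pl and third, respectively.]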
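The four components of a feature graph map naturally onto a small data structure. The sketch below is one possible encoding, not the author's: the class name FeatureGraph and its methods are hypothetical, δ is stored as a dictionary keyed by (node, feature) pairs, and θ as a dictionary from marked sinks to atoms. The method is_well_formed checks the conditions of the definition that can be tested on a finite graph.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureGraph:
    nodes: set
    root: str
    delta: dict                                  # (node, feature) -> node
    theta: dict = field(default_factory=dict)    # marked sink -> atom

    def sinks(self):
        """Nodes with no outgoing arcs (the set QS of the definition)."""
        sources = {src for (src, _feat) in self.delta}
        return self.nodes - sources

    def reachable(self):
        """Nodes accessible from the root through delta."""
        seen, stack = {self.root}, [self.root]
        while stack:
            q = stack.pop()
            for (src, _feat), dst in self.delta.items():
                if src == q and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    def is_well_formed(self):
        """Root is a node, arcs stay inside the node set, every node is
        accessible from the root, and only sinks are marked."""
        arcs_ok = all(s in self.nodes and d in self.nodes
                      for (s, _f), d in self.delta.items())
        return (self.root in self.nodes
                and arcs_ok
                and self.reachable() == self.nodes
                and set(self.theta) <= self.sinks())


# The example graph from the slide.
A = FeatureGraph(
    nodes={"q0", "q1", "q2", "q3"},
    root="q0",
    delta={("q0", "agr"): "q1", ("q1", "num"): "q2", ("q1", "pers"): "q3"},
    theta={"q2": "pl", "q3": "third"},
)
print(A.is_well_formed(), sorted(A.sinks()))
```

Because δ is a dictionary keyed by (node, feature), it is automatically a partial function: a node cannot have two outgoing arcs with the same label.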


Feature graphs

The arcs of a feature graph are thus labeled by features.

The root is a designated node from which all other nodes are accessible (through δ); note that nothing prevents the root from having incoming arcs.

Sink nodes (nodes with no outgoing edges) can be marked by an atom, but can also be unmarked.


Feature graphs

We use meta-variables A, B (with or without subscripts) to refer to feature graphs.

We use Q, q, δ, θ to refer to constituents of feature graphs.

When displaying feature graphs, the root is depicted as a grey-colored node, usually at the top or the left side of the graph.

The identities of the nodes are arbitrary, and we use generic names such as q0, q1, etc. to refer to them.


Feature graphs

Example (Feature graphs)

In the following graph, the leaves q2 and q3 bear no marking; in other words, the marking function θ is undefined for the two sinks in its domain.

[Figure: feature graph with root q0 and arcs q0 -agr-> q1, q1 -num-> q2, q1 -pers-> q3; the sinks q2 and q3 are unmarked.]

The graph displayed above is 〈Q, q, δ, θ〉, where Q = {q0, q1, q2, q3}, q = q0, δ(q0, agr) = q1, δ(q1, num) = q2, δ(q1, pers) = q3, QS = {q2, q3}, and θ is undefined for its entire domain.


Feature graphs

A feature graph is empty if it consists of a single unmarked node with no arcs.

A feature graph is atomic if it consists of a single marked node with no arcs.


Empty and atomic feature graphs

Example (Empty and atomic feature graphs)

A, an empty feature graph: a single unmarked node q0.

B, an atomic feature graph: a single node q0, marked pl.


Paths

The concept of paths is natural when graphs are concerned.

A path (over Feats) is a finite sequence of features, and the set Paths = Feats∗ is the collection of all paths.

Meta-variables π, α (with or without subscripts) range over paths.

ε is the empty path, denoted also by ‘〈〉’.

The length of a path π is denoted |π|.

For example, if Feats = {a, b} then Paths includes ε, 〈a〉, 〈b〉, 〈a, b, a〉, 〈b, b, b, b, a, b〉, etc.

While a path is a purely syntactic notion (every sequence of features constitutes a path), interesting paths are those that can be interpreted as actual paths in some graph, leading from the root to some node.


Paths

The definition of δ is therefore extended to paths: given a feature graph A = 〈QA, qA, δA, θA〉, define δ̂A : QA × Paths → QA as follows:

δ̂A(q, ε) = q

δ̂A(q, f π) = δ̂A(δA(q, f), π) (defined only if δA(q, f)↓)

Since for every node q ∈ QA and every feature f ∈ Feats, δA(q, f) = δ̂A(q, 〈f〉), we identify δ̂ with δ in the future and use only the latter. When the index (A) is clear from the context, it is omitted. When δA(q, π) = q′ we say that π leads (in A) from q to q′.
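The extension of δ to paths amounts to following the arcs one feature at a time. A minimal sketch, assuming δ is stored as a dictionary keyed by (node, feature) pairs and using None to signal an undefined result (the names delta_hat and DELTA are illustrative):

```python
def delta_hat(delta, q, path):
    """Follow `path` (a sequence of features) from node `q`.

    Returns the node reached, or None if some arc along the way is undefined.
    """
    for feat in path:
        if (q, feat) not in delta:
            return None
        q = delta[(q, feat)]
    return q


# Arcs of the example graph: q0 -agr-> q1, q1 -num-> q2, q1 -pers-> q3.
DELTA = {("q0", "agr"): "q1", ("q1", "num"): "q2", ("q1", "pers"): "q3"}

print(delta_hat(DELTA, "q0", ()))                       # the empty path
print(delta_hat(DELTA, "q0", ("agr", "num")))
print(delta_hat(DELTA, "q0", ("agr", "pers", "num")))   # undefined
```

The loop is the iterative form of the two recursive clauses: the empty path returns q itself, and a path f π first takes the f-arc and then continues with π.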


Paths

Definition (Paths)

The paths of a feature graph A are Π(A) = {π ∈ Paths | δA(qA, π)↓}.


Paths

Example (Paths)

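Consider the following feature graph, A:

[Figure: feature graph A with root q0 and arcs q0 -agr-> q1, q1 -num-> q2, q1 -pers-> q3; the sinks q2 and q3 are marked pl and third, respectively.]

Its paths are

Π(A) = {ε, 〈agr〉, 〈agr num〉, 〈agr pers〉}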
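The set Π(A) can be enumerated by a breadth-first walk from the root. The sketch below is illustrative only: the function name paths and the max_len bound are assumptions introduced here, the bound being needed because a cyclic graph has infinitely many paths.

```python
def paths(delta, root, max_len):
    """Return all paths of length <= max_len that are defined from `root`.

    Paths are tuples of features; the empty tuple is the empty path.
    """
    result = [()]
    frontier = [((), root)]
    for _ in range(max_len):
        next_frontier = []
        for path, q in frontier:
            for (src, feat), dst in sorted(delta.items()):
                if src == q:
                    next_frontier.append((path + (feat,), dst))
        result.extend(p for p, _node in next_frontier)
        frontier = next_frontier
    return result


DELTA = {("q0", "agr"): "q1", ("q1", "num"): "q2", ("q1", "pers"): "q3"}
for p in paths(DELTA, "q0", 3):
    print(p)
```

For the acyclic example graph, any bound of 2 or more yields the four paths listed above.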


Path values

Of particular interest are paths which lead from the root of a feature graph to some node in the graph.

For such paths we define the notion of a value, which is the sub-graph whose root is the node at the end of the path.

It would have been possible to define as value the node itself, rather than the sub-graph it induces; the choice is a matter of taste, as moving from one view of values to the other is trivial.


Path values

Definition (Path value)

For a feature graph A = 〈QA, qA, δA, θA〉 and a path π ∈ Π(A), the value valA(π) of π in A is a feature graph B = 〈QB, qB, δB, θB〉, over the same signature as A, where:

qB = δA(qA, π)

QB = {q′ ∈ QA | for some π′, δA(qB, π′) = q′} (QB is the set of nodes reachable from qB)

for every feature f and for every q′ ∈ QB, δB(q′, f) = δA(q′, f) (δB is the restriction of δA to QB)

for every q′ ∈ QB, θB(q′) = θA(q′) (θB is the restriction of θA to QB)


Path values

Example (Path values)

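The value of the path 〈agr〉 in A is:

valA(〈agr〉) = [Figure: feature graph with root q1 and arcs q1 -num-> q2, q1 -pers-> q3; the sinks q2 and q3 are marked pl and third, respectively.]

and the value of the path 〈agr num〉 in A is:

valA(〈agr num〉) = [Figure: the single node q2, marked pl.]

Note that, for example, the value of 〈agr pers num〉 in A is undefined.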
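The four clauses of the path value definition translate almost directly into code. In this sketch a feature graph is assumed to be a plain tuple (nodes, root, delta, theta), with delta keyed by (node, feature) pairs; the function name val and the use of None for an undefined value are choices made here for illustration.

```python
def val(graph, path):
    """Return the value of `path` in `graph`, or None if the path is not defined."""
    _nodes, root, delta, theta = graph
    # Clause 1: the new root is the node at the end of the path.
    q = root
    for feat in path:
        if (q, feat) not in delta:
            return None
        q = delta[(q, feat)]
    # Clause 2: keep exactly the nodes reachable from the new root.
    reach, stack = {q}, [q]
    while stack:
        cur = stack.pop()
        for (src, _feat), dst in delta.items():
            if src == cur and dst not in reach:
                reach.add(dst)
                stack.append(dst)
    # Clauses 3 and 4: restrict delta and theta to the reachable nodes.
    sub_delta = {(s, f): d for (s, f), d in delta.items() if s in reach}
    sub_theta = {n: a for n, a in theta.items() if n in reach}
    return (reach, q, sub_delta, sub_theta)


A = (
    {"q0", "q1", "q2", "q3"},
    "q0",
    {("q0", "agr"): "q1", ("q1", "num"): "q2", ("q1", "pers"): "q3"},
    {"q2": "pl", "q3": "third"},
)
print(val(A, ("agr",)))
print(val(A, ("agr", "num")))
print(val(A, ("agr", "pers", "num")))
```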


Reentrancy

The definition of path values raises the question of when two paths have equal values.

We distinguish between paths which lead to one and the same node, and those whose values are isomorphic but not identical.

The former case is called reentrancy.


Reentrancy

Definition (Reentrancy)

Let A = 〈Q, q, δ, θ〉 be a feature graph. Two paths π1, π2 ∈ Π(A) are reentrant in A, denoted π1 ↭^A π2, iff δ(q, π1) = δ(q, π2), implying valA(π1) = valA(π2). A feature graph A is reentrant iff there exist two distinct paths π1, π2 ∈ Π(A) such that π1 ↭^A π2.


Reentrancy

Example (A reentrant feature graph)

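This feature graph, A, is reentrant because δA(q0, 〈agr〉) = δA(q0, 〈subj, agr〉):

[Figure: feature graph A with root q0 and arcs q0 -agr-> q1, q0 -subj-> q4, q4 -agr-> q1, q1 -num-> q2, q1 -pers-> q3; the sinks q2 and q3 are marked pl and third, respectively.]

The (single) value of the (different) paths 〈agr〉 and 〈subj agr〉 in A is:

[Figure: feature graph with root q1 and arcs q1 -num-> q2, q1 -pers-> q3; the sinks q2 and q3 are marked pl and third, respectively.]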
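Reentrancy of two given paths is a node-identity test, and reentrancy of a whole graph can be decided by counting incoming arcs. The sketch below assumes δ is a dictionary keyed by (node, feature) pairs and that every node is accessible from the root, as the definition of feature graphs requires; the function names are illustrative.

```python
def follow(delta, q, path):
    """Node reached from `q` along `path`, or None if undefined."""
    for feat in path:
        if (q, feat) not in delta:
            return None
        q = delta[(q, feat)]
    return q


def reentrant(delta, root, p1, p2):
    """True if both paths are defined and lead to one and the same node."""
    n1, n2 = follow(delta, root, p1), follow(delta, root, p2)
    return n1 is not None and n1 == n2


def is_reentrant(delta, root):
    """True if some node is reached by two distinct paths.

    With all nodes accessible from the root, this holds exactly when a node
    has two incoming arcs, or the root (reached by the empty path) has one.
    """
    incoming = {root: 1}  # the empty path already leads to the root
    for dst in delta.values():
        incoming[dst] = incoming.get(dst, 0) + 1
    return any(count > 1 for count in incoming.values())


# The reentrant example graph: agr and subj.agr share the node q1.
DELTA = {
    ("q0", "agr"): "q1", ("q0", "subj"): "q4", ("q4", "agr"): "q1",
    ("q1", "num"): "q2", ("q1", "pers"): "q3",
}
print(reentrant(DELTA, "q0", ("agr",), ("subj", "agr")))
print(is_reentrant(DELTA, "q0"))
```

Note that the test compares node identities, not sub-graphs: two paths whose values merely look alike are not reentrant.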


Reentrancy

The notion of reentrancy touches on the issue of the distinction between type- and token-identity.

Two feature graphs are token-identical if their components (i.e., their sets of nodes, roots, transition functions and atom marking functions) are identical.

They are type-identical if they are isomorphic, not necessarily requiring their nodes to be identical.

We will discuss feature graph isomorphism later.


Cycles

Early feature structure based formalisms employed only acyclic feature graphs.

However, modern ones usually allow (or even require) feature structures to be possibly cyclic.

While the linguistic motivation for cyclic feature structures is limited, there is good practical motivation for allowing them: when implementing a system for manipulating feature graphs, it is usually easier to support cycles than to guarantee that all the graphs in a system are acyclic.

The reason is that unification, which is the major operation defined on feature graphs, can yield a cyclic graph even when its operands are acyclic.


Cycles

Definition (Cycles)

A feature graph A = 〈QA, qA, δA, θA〉 is cyclic if two paths π1, π2 ∈ Π(A), where π1 is a proper subsequence of π2, are reentrant: π1 ↭^A π2. A is acyclic otherwise.

Note that cyclicity is a special case of reentrancy (every cyclic feature graph is reentrant, but not vice versa).

A corollary of the definition is that when a feature graph is cyclic, it has at least one node q such that δ(q, α) = q for some non-empty path α.


Cycles

Example (A cyclic feature graph)

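Following is a cyclic feature graph, C:

[Figure: feature graph C with root q0 and arcs q0 -f-> q1, q1 -h-> q1, q1 -g-> q2; the sink q2 is marked a.]

The value of the path 〈f〉 in C, as well as the values of the (infinitely many) paths 〈f h^n〉, for n ≥ 0, is the same feature graph:

[Figure: feature graph with root q1 and arcs q1 -h-> q1, q1 -g-> q2; the sink q2 is marked a.]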
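By the corollary above, a feature graph is cyclic when some node can reach itself through a non-empty path, which a depth-first search can detect. This is a sketch under the assumption that δ is a dictionary keyed by (node, feature) pairs; the function name is_cyclic and the arcs of C as encoded below follow the reading of the figure given here.

```python
def is_cyclic(delta, root):
    """Return True if some node reachable from `root` lies on a cycle."""
    successors = {}
    for (src, _feat), dst in delta.items():
        successors.setdefault(src, []).append(dst)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS branch, finished
    color = {}

    def visit(q):
        color[q] = GREY
        for nxt in successors.get(q, []):
            state = color.get(nxt, WHITE)
            # An arc back into the current branch closes a cycle.
            if state == GREY or (state == WHITE and visit(nxt)):
                return True
        color[q] = BLACK
        return False

    return visit(root)


# The cyclic graph C: q0 -f-> q1, q1 -h-> q1, q1 -g-> q2.
C = {("q0", "f"): "q1", ("q1", "h"): "q1", ("q1", "g"): "q2"}
print(is_cyclic(C, "q0"))
```

The recursive search is adequate for small graphs such as those on the slides; an implementation meant for large graphs would use an explicit stack.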


Feature structures Feature graph subsumption

Feature graph isomorphism

Since feature graphs are just a special case of directed, labeled graphs, we can adapt the well-defined notion of graph isomorphism to feature graphs.

Informally, two graphs are isomorphic when they have the same structure; the identities of their nodes may differ without affecting the structure.

In our case, we require also that the labels of sink nodes be identical in order for two graphs to be considered isomorphic.


Feature graph isomorphism

Definition (Feature graph isomorphism)

Two feature graphs A = 〈QA, qA, δA, θA〉 and B = 〈QB, qB, δB, θB〉 are isomorphic, denoted A ∼ B, iff there exists a one-to-one and onto mapping i : QA → QB, called an isomorphism, such that:

i(qA) = qB;

for all q1, q2 ∈ QA and f ∈ Feats, δA(q1, f) = q2 iff δB(i(q1), f) = i(q2); and

for all q ∈ QA, θA(q) = θB(i(q)) (either both are undefined, or both are defined and equal).
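Unlike general graph isomorphism, the feature graph case is easy to decide: the roots must correspond, δ is a function, and every node is accessible from the root, so the mapping i is forced once it exists. The sketch below builds it by walking both graphs in parallel. It assumes each graph is a tuple (root, delta, theta) with delta keyed by (node, feature) pairs; the function name isomorphic is illustrative.

```python
def isomorphic(a, b):
    """Return True if feature graphs a and b are isomorphic."""
    (root_a, delta_a, theta_a), (root_b, delta_b, theta_b) = a, b

    def arcs(delta, q):
        return {f: d for (s, f), d in delta.items() if s == q}

    fwd, bwd = {root_a: root_b}, {root_b: root_a}   # i and its inverse
    stack = [root_a]
    while stack:
        qa = stack.pop()
        qb = fwd[qa]
        # Marks must agree: both undefined, or both defined and equal.
        if theta_a.get(qa) != theta_b.get(qb):
            return False
        out_a, out_b = arcs(delta_a, qa), arcs(delta_b, qb)
        if out_a.keys() != out_b.keys():
            return False
        for feat, da in out_a.items():
            db = out_b[feat]
            if da in fwd or db in bwd:
                # Already paired: the pairing must be consistent in both
                # directions, which keeps the mapping one-to-one.
                if fwd.get(da) != db or bwd.get(db) != da:
                    return False
            else:
                fwd[da], bwd[db] = db, da
                stack.append(da)
    return True


A = ("q0", {("q0", "agr"): "q1", ("q1", "num"): "q2"}, {"q2": "pl"})
B = ("p7", {("p7", "agr"): "p8", ("p8", "num"): "p9"}, {"p9": "pl"})
print(isomorphic(A, B))
```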


Feature graph subsumption

Definition (Subsumption)

Let A1 = 〈Q1, q1, δ1, θ1〉 and A2 = 〈Q2, q2, δ2, θ2〉 be two feature graphs. A1 subsumes A2 (denoted by A1 ⊑ A2) iff there exists a total function h : Q1 → Q2, called a subsumption morphism, such that

h(q1) = q2

for every q ∈ Q1 and for every f such that δ1(q, f)↓, h(δ1(q, f)) = δ2(h(q), f)

for every q ∈ Q1, if θ1(q)↓ then θ1(q) = θ2(h(q)).

If A1 ⊑ A2 then A1 is said to subsume, or be more general than, A2; A2 is subsumed by, or is more specific than, A1.


Subsumption

The morphism h associates with every node in Q1 a node in Q2; if an arc labeled f connects q with q′, then such an arc connects h(q) with h(q′).

In other words, δ and h commute, as depicted in the following diagram, where δ-arcs are depicted using solid lines, whereas h-mappings are depicted using dashed lines:

[Diagram: a commuting square in which two solid arcs labeled f (δ) are connected by two dashed h-mappings.]


Subsumption

In addition, if a node q ∈ Q1 is marked by an atom, then its image h(q) must be marked by the same atom (recall that only sinks can be thus marked).

Note that if a sink in Q1 is not marked, there is no constraint on its image (in particular, it can be a non-sink).


Subsumption morphism

Example (Subsumption morphism)

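[Figure: two feature graphs, A1 and A2. In A1 an arc labeled f leads from q to q′; in A2 an arc labeled f leads from h(q) to h(q′). Dashed h-arrows map each node of A1 to its image in A2.]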


Subsumption morphism

Example (Subsumption)

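[Figure: two feature graphs. A has root qA0 and arcs qA0 -agr-> qA1, qA1 -num-> qA2, qA1 -pers-> qA3; the sink qA2 is unmarked and qA3 is marked third. B has root qB0 and arcs qB0 -agr-> qB1, qB0 -subj-> qB4, qB4 -agr-> qB1, qB1 -num-> qB2, qB1 -pers-> qB3; the sinks qB2 and qB3 are marked pl and third, respectively.]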


Subsumption morphism

Indeed, B can (and does) have nodes that do not correspond to nodes in A: such is qB4 in the example.

In addition, while the sink qA2 is not marked by an atom (that is, it is a variable), its image in B, qB2, is marked as pl.

Notice that no subsumption morphism can be defined from QB to QA, since there is no node into which qB4 can be mapped.

In particular, it cannot be mapped to the root of A since this would necessitate an arc from qA0 to itself (as the root of A would be the image of both qB4 and qB0).

Trying to take h⁻¹ as an inverse subsumption morphism will fail both because of qB4 and because it would map qB2 to qA2, violating the last clause of the subsumption relation (a marked sink must be mapped to a sink with the same mark).

We conclude that B ⋢ A.
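The argument above can be replayed in code. Because h must send root to root and commute with δ, and every node of the subsuming graph is accessible from its root, h is fully determined if it exists; the sketch below simply tries to construct it. It assumes each graph is a tuple (root, delta, theta) with delta keyed by (node, feature) pairs; the function name subsumes and the node names are illustrative.

```python
def subsumes(a, b):
    """Return True if feature graph a subsumes feature graph b (a ⊑ b)."""
    (root_a, delta_a, theta_a), (root_b, delta_b, theta_b) = a, b
    h = {root_a: root_b}            # clause 1: root maps to root
    stack = [root_a]
    while stack:
        qa = stack.pop()
        qb = h[qa]
        # Clause 3: a marked node must map to a node with the same mark.
        if qa in theta_a and theta_a[qa] != theta_b.get(qb):
            return False
        # Clause 2: every arc of a must be matched by an arc of b.
        for (src, feat), da in delta_a.items():
            if src != qa:
                continue
            if (qb, feat) not in delta_b:
                return False
            db = delta_b[(qb, feat)]
            if da in h:
                if h[da] != db:     # h must be a function
                    return False
            else:
                h[da] = db
                stack.append(da)
    return True


A = ("qA0",
     {("qA0", "agr"): "qA1", ("qA1", "num"): "qA2", ("qA1", "pers"): "qA3"},
     {"qA3": "third"})
B = ("qB0",
     {("qB0", "agr"): "qB1", ("qB0", "subj"): "qB4", ("qB4", "agr"): "qB1",
      ("qB1", "num"): "qB2", ("qB1", "pers"): "qB3"},
     {"qB2": "pl", "qB3": "third"})
print(subsumes(A, B), subsumes(B, A))
```

Unlike an isomorphism, h need not be one-to-one or onto, so no inverse mapping is maintained.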


Subsumption

Given a feature structure, what modifications can be made to it in order for it to become more specific? Three different kinds of modifications are possible:

1 Adding arcs;
2 Adding reentrancies;
3 Marking unmarked sinks by some atom.


Subsumption

Example (Subsumption as an order on information)

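[Figure: four pairs of feature graphs, each related by ⊑:
a single unmarked node ⊑ a node with a num arc to a sink marked pl (adding arcs);
a node with a num arc to an unmarked sink ⊑ a node with a num arc to a sink marked pl (adding atomic marks);
a node with a num arc to a sink marked sg ⊑ a node with a num arc to a sink marked sg and a per arc to a sink marked third (adding arcs);
a node with num1 and num2 arcs to two distinct sinks, each marked sg ⊑ a node whose num1 and num2 arcs lead to one shared sink marked sg (adding reentrancies).]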


Subsumption

Lemma

If A ⊑ B then Π(A) ⊆ Π(B).


Subsumption

Lemma

If A ⊑ B then for each π ∈ Π(A), if θA(δA(qA, π))↓ then θB(δB(qB, π))↓ and θA(δA(qA, π)) = θB(δB(qB, π)).


Subsumption

Lemma

If A ⊑ B and π1, π2 are reentrant in A (that is, π1 ↭^A π2) then π1, π2 are reentrant in B (that is, π1 ↭^B π2).


Subsumption

Corollary

If A ⊑ B, then:

Π(A) ⊆ Π(B)

for each π ∈ Π(A), if θA(δA(qA, π))↓ then θB(δB(qB, π))↓ and θA(δA(qA, π)) = θB(δB(qB, π))

for each π1, π2 ∈ Π(A), if π1 ↭^A π2 then π1 ↭^B π2 (and, therefore, if A is reentrant/cyclic then so is B).


Subsumption

Theorem

If A is an atomic feature graph and A ⊑ B, then A ∼ B.


Subsumption

Theorem

Subsumption has a least element: there exists a feature graph A such that for every feature graph B, A ⊑ B.

Proof.

Consider the (empty) feature graph A = 〈{q0}, q0, δ, θ〉, where δ and θ are undefined for their entire domains. For every feature graph B, A ⊑ B by mapping (through h) the root q0 to the root of B, qB. The first clause of the definition of subsumption holds by this choice of h, and the other two clauses hold vacuously.


Subsumption

Theorem

Subsumption is reflexive: for every feature graph A, A ⊑ A.

Proof.

Take h to be the identity function that maps every node in A to itself.


Subsumption

Theorem

Subsumption is transitive: if A ⊑ B and B ⊑ C then A ⊑ C.


Subsumption

Theorem

Subsumption is not antisymmetric: if A ⊑ B and B ⊑ A then not necessarily A = B.

Proof.

Consider the feature graphs A = 〈{qA}, qA, δ, θ〉 and B = 〈{qB}, qB, δ, θ〉, where δ and θ are undefined for their entire domains, and where qA ≠ qB. Trivially, both A ⊑ B and B ⊑ A, but A ≠ B.


Subsumption

Thus, feature graph subsumption forms a partial pre-order on feature graphs.

It is a pre-order since it is not antisymmetric; it is partial as there are feature graphs that are incomparable with respect to subsumption.


Subsumption

Example (Feature graph subsumption is a partial relation)

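Feature graphs can be incomparable due to inconsistency (contradicting information) or to complementary information.

[Figure: three pairs of feature graphs, each incomparable (⋢ and ⋣):
a single node marked sg and a single node marked pl;
a node with a num arc to a sink marked sg and a node with a num arc to a sink marked pl;
a node with a num arc to an unmarked sink and a node with a pers arc to an unmarked sink.]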


Subsumption

There is a clear connection between feature graph isomorphism and feature graph subsumption:

Theorem

A ∼ B iff A ⊑ B and B ⊑ A.


Feature structures Attribute-value matrices

AVMs

We now return to attribute-value matrices (AVMs).

This is the view that we will adopt for depicting feature structures (and grammars based on them), both because they are easy to present on paper and because of their centrality in existing literature.

Like feature graphs, AVMs are defined over a signature of features and atoms, which we fix below.

In addition, AVMs make use of variables, also called tags below. Meta-variables X, Y, Z, etc. range over variables.

Variables are used to encode sharing of values, as will be clear presently.

When AVMs are concerned, we follow the convention of the linguistic literature by which variables are natural numbers, depicted in boxes, e.g., 3.


Definition (AVMs)

Given a signature S, the set Avms(S) of AVMs over S is the least set satisfying the following two clauses:

1 M = Xa ∈ Avms(S) for any a ∈ Atoms and X ∈ Tags; M is said to be atomic and X is the tag of M, denoted tag(M) = X.

2 M = X[f1 : M1, . . . , fn : Mn] ∈ Avms(S) for n ≥ 0, X ∈ Tags, f1, . . . , fn ∈ Feats and M1, . . . , Mn ∈ Avms(S), where fi ≠ fj if i ≠ j. M is said to be complex, and X is the tag of M, denoted tag(M) = X. If n = 0, M = X[ ] is an empty AVM.

Note that two AVMs which differ only in their tag are distinct: if X ≠ Y, then X[· · ·] ≠ Y[· · ·]. In particular, there is no unique empty AVM. Note also that the same variable can be used more than once in an AVM.


Example (AVMs)

Consider a signature consisting of Atoms = {a} and Feats = {f, g}. Then M1 = 4a is an AVM by the first clause of the definition, M2 = 2[ ] is an empty AVM by the second clause, M3 = 3[f : 4a] is an AVM by the second clause (using M1 as the value of f, so that fval(M3, f) = M1), and

M4 = 2[g : 3[f : 4a], f : 2[ ]]

is an AVM by the second clause, as is

M5 = 4[g : 3[f : 4a], f : 2[ ]]


Meta-variables M, with or without subscripts, range over Avms; the parameter S is omitted when it is clear from the context.

The domain of an AVM M, denoted dom(M), is undefined when M is atomic, and {f1, . . . , fn} when M is complex (hence, dom(M) is empty for an empty AVM).

The value of some feature f ∈ Feats in M, denoted fval(M, f), is defined if f = fi ∈ dom(M), in which case it is Mi, and undefined otherwise.
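The definitions of dom and fval can be tried out on the example AVMs M1 to M5. The following is a minimal sketch, not part of the original slides: the class `AVM`, the helpers `atomic` and `complex_avm`, and the use of `None` for "undefined" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AVM:
    tag: int                                   # the variable (tag) of the AVM
    atom: Optional[str] = None                 # set only for atomic AVMs
    feats: Tuple[Tuple[str, "AVM"], ...] = ()  # (feature, value) pairs of a complex AVM


def atomic(tag, atom):
    return AVM(tag, atom=atom)


def complex_avm(tag, **feats):
    # Keyword arguments guarantee that the features are pairwise distinct;
    # sorting makes the representation independent of argument order.
    return AVM(tag, feats=tuple(sorted(feats.items())))


def dom(m):
    """dom(M): None (undefined) for an atomic AVM, else the set of its features."""
    if m.atom is not None:
        return None
    return {f for f, _ in m.feats}


def fval(m, f):
    """fval(M, f): the value of f in M, or None where it is undefined."""
    for g, value in m.feats:
        if g == f:
            return value
    return None


m1 = atomic(4, "a")
m2 = complex_avm(2)
m3 = complex_avm(3, f=m1)
m4 = complex_avm(2, g=m3, f=m2)
m5 = complex_avm(4, g=m3, f=m2)
```

Because the tag is part of the value, `m4` and `m5` compare as different objects, mirroring the remark that AVMs differing only in their tag are distinct.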


Sub-AVMs

Definition (Sub-AVMs)

Given an AVM M, its sub-AVMs are SubAVM(M), defined as:

1 SubAVM(Xa) = {Xa}

2 SubAVM(X[f1 : M1, . . . , fn : Mn]) = {X[f1 : M1, . . . , fn : Mn]} ∪ ⋃_{1≤i≤n} SubAVM(Mi)


AVMs

Definition (Tags)

Given an AVM M, its tags Tags(M) are defined as:

1 Tags(Xa) = {X}

2 Tags(X[f1 : M1, . . . , fn : Mn]) = {X} ∪ ⋃_{1≤i≤n} Tags(Mi)

Definition (Tagset)

The tagset of an AVM M and a tag X ∈ Tags(M) is the set of sub-AVMs of M (including M itself) which are tagged by X: TagSet(M, X) = {M′ ∈ SubAVM(M) | tag(M′) = X}.
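The three definitions (SubAVM, Tags, TagSet) translate almost literally into code. The sketch below is illustrative only: the `AVM` class and the function names `sub_avms`, `tags` and `tag_set` are assumptions, and tags are represented as plain integers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AVM:
    tag: int
    atom: Optional[str] = None                 # set only for atomic AVMs
    feats: Tuple[Tuple[str, "AVM"], ...] = ()  # (feature, value) pairs


def sub_avms(m):
    """SubAVM(M): M itself together with the sub-AVMs of all its values."""
    result = {m}
    for _, value in m.feats:
        result |= sub_avms(value)
    return result


def tags(m):
    """Tags(M): the tags of all sub-AVMs of M."""
    return {s.tag for s in sub_avms(m)}


def tag_set(m, x):
    """TagSet(M, X): the sub-AVMs of M that are tagged by X."""
    return {s for s in sub_avms(m) if s.tag == x}


# M4 = 2[g : 3[f : 4a], f : 2[ ]]
m1 = AVM(4, atom="a")
m3 = AVM(3, feats=(("f", m1),))
m4 = AVM(2, feats=(("g", m3), ("f", AVM(2))))

# M6 = 1[f : 1[f : 1[f : 1[ ]]]]
m6 = AVM(1)
for _ in range(3):
    m6 = AVM(1, feats=(("f", m6),))
```

Computing `tags` from `sub_avms` rather than by a second recursion is a design shortcut; both give the same set, since every tag in M is the tag of some sub-AVM of M.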


Example (AVMs)

Let:

M4 = 2[g : 3[f : 4a], f : 2[ ]]

fval(M4, f) = 2[ ]. Observe that Tags(M4) = {2, 3, 4}. Also, TagSet(M4, 4) is {4a}, TagSet(M4, 3) is {3[f : 4a]} and TagSet(M4, 2) is {M4, 2[ ]}. Trivially, tag(M4) = 2.


Example (AVMs)

Let:

M5 = 4[g : 3[f : 4a], f : 2[ ]]

Similarly, fval(M5, f) = 2[ ], whereas fval(M5, g) = 3[f : 4a]. Observe that Tags(M5) = {2, 3, 4}. Also TagSet(M5, 2) = {2[ ]}, TagSet(M5, 3) = {3[f : 4a]} and TagSet(M5, 4) = {M5, 4a}. Trivially, tag(M5) = 4.


Example (AVMs)

As another example, consider the AVM

M6 = 1[f : 1[f : 1[f : 1[ ]]]]

Here, Tags(M6) = {1}, and TagSet(M6, 1) is:

{M6, 1[f : 1[f : 1[ ]]], 1[f : 1[ ]], 1[ ]}

Of course, tag(M6) = 1.


Well-formed AVMs

Consider some AVM M = 1[f1 : 2M1, f2 : 2M2] where M1 ≠ M2.

Both M1 and M2 are sub-AVMs of M, and both have the same tag, although they are different.

In other words, the recursive definition of AVMs allows two different, contradicting AVMs to be in the TagSet of the same variable.

To eliminate such cases, we define well-formed AVMs as follows:

Definition (Well-formed AVMs)

An AVM M is well-formed iff for every variable X ∈ Tags(M), TagSet(M, X) includes at most one non-empty AVM.
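The well-formedness condition can be checked by grouping sub-AVMs by tag and counting the non-empty ones. This is a sketch under assumed names (`AVM`, `sub_avms`, `is_empty`, `well_formed`); the two sample AVMs are hypothetical instances chosen to exercise both outcomes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AVM:
    tag: int
    atom: Optional[str] = None                 # set only for atomic AVMs
    feats: Tuple[Tuple[str, "AVM"], ...] = ()  # (feature, value) pairs


def sub_avms(m):
    result = {m}
    for _, value in m.feats:
        result |= sub_avms(value)
    return result


def is_empty(m):
    """An empty AVM is a complex AVM with no features."""
    return m.atom is None and not m.feats


def well_formed(m):
    """True iff every tag has at most one non-empty AVM in its tagset."""
    subs = sub_avms(m)
    for x in {s.tag for s in subs}:
        non_empty = {s for s in subs if s.tag == x and not is_empty(s)}
        if len(non_empty) > 1:
            return False
    return True


# 1[f1 : 2a, f2 : 2b]: tag 2 carries two different non-empty AVMs.
clash = AVM(1, feats=(("f1", AVM(2, atom="a")), ("f2", AVM(2, atom="b"))))

# 6[f : 3[ ], g : 4[h : 3a], h : 2[ ]]: tag 3 has one empty and one non-empty occurrence.
shared = AVM(6, feats=(("f", AVM(3)),
                       ("g", AVM(4, feats=(("h", AVM(3, atom="a")),))),
                       ("h", AVM(2))))
```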


Reentrancy

Example (A reentrant AVM)

The following AVM is reentrant but not cyclic:

0[agr : 1[num : 2pl, pers : 3third], subj : 4[agr : 1]]


Conventions

We introduce three conventions regarding the depiction of well-formed AVMs, motivated by the fact that variables are used primarily to indicate value sharing.

If a variable occurs more than once then its value is explicated only once; where this value is explicated (i.e., next to which occurrence of the variable) is immaterial.

Variables which occur only once can be omitted.

The empty AVM is sometimes omitted when it is associated with a variable.

The first convention is crucial in the case of cyclic AVMs: there is no finite representation of cyclic AVMs unless this convention is adopted.


Example (Shorthand notation for AVMs)

Consider the following AVM:

6[f : 3[ ], g : 4[h : 3a], h : 2[ ]]

Notice that it is well-formed, since the only variable occurring more than once (3) is associated with a non-empty value (a) only once.

We can therefore leave only one occurrence of the value explicit.

The tag 2 is associated with the empty feature structure, which can be omitted.

Finally, the tags 4 and 6 occur only once, so they can be omitted.

This is the conventional form of the AVM.


AVM equivalence

Example (AVM equivalence)

M1 and M2 differ only in the instance of 1 whose value is explicated:

M1 = 0[agr : 1[num : 2pl, pers : 3third], subj : 4[agr : 1]]

M2 = 0[agr : 1, subj : 4[agr : 1[num : 2pl, pers : 3third]]]

Then M1 subsumes M2 and M2 subsumes M1.


Example (AVM renamings)

The following two AVMs are renamings of each other:

M1 = 0[agr : 1[num : 2pl, pers : 3third], subj : 4[agr : 1]]

M2 = 10[agr : 11, subj : 14[agr : 11[num : 12pl, pers : 13third]]]


Feature structures The correspondence between feature graphs and AVMs

The correspondence between feature graphs and AVMs

AVMs are the entities that the linguistic literature employs to depict feature structures; feature graphs are well-understood mathematical entities to which various results of graph theory can be applied.

We define the relationship between these two views.


From AVMs to feature graphs

Example (AVM to graph mapping)

A reentrant AVM and its feature graph image:

M = 0[agr : 1[num : 2pl, pers : 3third], subj : 4[agr : 1]]

[Figure: the feature graph φ(M), with nodes 0, 1, 2 (marked pl), 3 (marked third) and 4, and arcs labeled agr, num, pers, subj and agr.]


Feature structures Feature structures in a broader context

Feature structures in a broader context

Feature structures are utilized by many grammatical formalisms to encode different kinds of linguistic information: they serve in representing phonological, morphological, syntactic and semantic knowledge.

But the use of feature structures is not limited to computational linguistics; indeed, they are present in other areas of computer science as well.


A somewhat degenerate form of feature structures is utilized by many programming languages: records (as in Pascal, known as structures in C).

There are some major differences between records and feature structures.

The notion of sharing that is central to feature structures is less significant for records.

The values of record fields are not necessarily other records – different data types can be freely used; hence transfer of values is mediated through explicit assignments, not unifications.

Unification-based formalisms usually do not allow such a diversity of operations to apply to feature structures as programming languages allow to records.

In particular, arithmetic operations are usually not applicable to feature structures’ values, while they are very natural to numeric records’ fields.


Logic programming languages such as Prolog manipulate first-order terms (FOTs), which might be viewed as a special case of feature structures.

There are some important differences between feature structures and FOTs.

FOTs are essentially trees, with possibly shared leaves, whereas feature structures allow reentrancies to occur in every level of the structure.

Feature structures can be cyclic, in contrast to (ordinary) FOTs.

FOTs use positional encoding of argument structures, with no features.

Two FOTs are unifiable only if they have the same functor and the same arity, while two feature structures might be unifiable even if they have a different number of features.


Unification Motivation

Unification

The subsumption relation compares the information content of feature structures.

Unification combines the information that is contained in two (compatible) feature structures.

We use the term ‘unification’ to refer to both the operation and its result. Whenever two feature structures are related, they are assumed to be over the same signature.


The mathematical interpretation of “combining” two members of a partially ordered set is to take the least upper bound of the two operands with respect to the partial order; in our case, subsumption.

Indeed, feature structure unification is exactly that.

However, since subsumption is antisymmetric for feature structures and AFSs but not for feature graphs and AVMs, a unique least upper bound cannot be guaranteed for all four views.


Unification Feature structure unification

Feature structure unification

Definition (Feature structure unification)

Two feature structures fs1 and fs2 are consistent if they have an upper bound (with respect to subsumption), and inconsistent otherwise. If fs1 and fs2 are consistent, their unification, denoted fs1 ⊔ fs2, is their least upper bound with respect to subsumption.

If two feature structures have an upper bound, they have a (unique) least upper bound.


Unification Feature graph unification

Feature graph unification

While the definition of unification as least upper bound is useful mathematically, it does not tell us how to compute the unification of two given feature structures.

To this end, we provide a constructive definition in terms of feature graphs, which induces an algorithm for computing unification.

For reasons that will be clear presently, we require that the two feature graphs be node-disjoint.


Definition

Let A = 〈QA, qA, δA, θA〉 and B = 〈QB, qB, δB, θB〉 with QA ∩ QB = ∅ be two feature graphs. Let ‘$\overset{u}{\approx}$’ be the least equivalence relation on QA ∪ QB such that:

$q_A \overset{u}{\approx} q_B$

for every q1, q2 ∈ QA ∪ QB and f ∈ Feats, if $q_1 \overset{u}{\approx} q_2$, (δA ∪ δB)(q1, f)↓ and (δA ∪ δB)(q2, f)↓, then $(\delta_A \cup \delta_B)(q_1, f) \overset{u}{\approx} (\delta_A \cup \delta_B)(q_2, f)$


The ‘$\overset{u}{\approx}$’ relation partitions the nodes of QA ∪ QB into equivalence classes such that both roots are in the same class, and if some feature is defined for two nodes in one class, then the two nodes this feature leads to are also in one (possibly different) class.

Clearly, the number of equivalence classes (called the index of $\overset{u}{\approx}$) is finite.

The requirement that QA and QB be disjoint is essential here: we would want two nodes to be in the same equivalence class with respect to ‘$\overset{u}{\approx}$’ only if they comply with the above definition; if we allowed a non-empty intersection of nodes, ‘$\overset{u}{\approx}$’ could have been a different relation.
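One common way to compute such a least equivalence relation is a union-find structure driven by a worklist of node pairs that must be merged. The slides do not prescribe this method; the sketch below is one possible realization, and the function name `u_equiv`, the dictionary encoding of δA ∪ δB and the sample graphs are all assumptions.

```python
def u_equiv(root_a, root_b, delta):
    """Return a function mapping each node to the representative of its class.

    delta encodes the union of both transition functions as a dictionary
    (node, feature) -> node; the node sets of the two graphs are assumed disjoint.
    """
    parent = {}

    def find(q):
        parent.setdefault(q, q)
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path halving
            q = parent[q]
        return q

    # arcs[r] maps every feature leaving the class of r to one target node.
    arcs = {}
    for (q, f), target in delta.items():
        arcs.setdefault(q, {})[f] = target

    pending = [(root_a, root_b)]           # first clause: the roots are equivalent
    while pending:
        x, y = pending.pop()
        rx, ry = find(x), find(y)
        if rx == ry:
            continue
        parent[ry] = rx
        merged = arcs.setdefault(rx, {})
        for f, target in arcs.pop(ry, {}).items():
            if f in merged:
                # second clause: both nodes define f, so the targets are equivalent
                pending.append((merged[f], target))
            else:
                merged[f] = target
    return find


# Hypothetical graphs: A has arcs a0 -f-> a1, a0 -g-> a2, a1 -h-> a3;
# B has arcs b0 -f-> b1 and b0 -g-> b1 (a reentrancy).
find = u_equiv("a0", "b0", {
    ("a0", "f"): "a1", ("a0", "g"): "a2", ("a1", "h"): "a3",
    ("b0", "f"): "b1", ("b0", "g"): "b1",
})
```

In this example the reentrancy in B forces a1 and a2 into one class, while a3 remains in a class of its own.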


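The $\overset{u}{\approx}$ relation

[Figure: two feature graphs, A with nodes qA0, qA1, qA2 and B with nodes qB0, qB1, qB2, illustrating the relation; the arcs are labeled f, g, num and pers, and two sinks are marked sg and 3rd.]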


[Figure: two feature graphs, A with nodes qA0, qA1, qA2 and B with nodes qB0, qB1; the arcs are labeled f, g and h.]


Type-respecting relation

Definition

A binary relation ‘≈’ over the nodes QA ∪ QB of two feature graphs is said to be type respecting iff for every node q ∈ QA ∪ QB, if (θA ∪ θB)(q)↓ and (θA ∪ θB)(q) = a, then for every node q′ such that q ≈ q′, q′ is a sink and either (θA ∪ θB)(q′)↑ or (θA ∪ θB)(q′) = a.


When is ‘$\overset{u}{\approx}$’ not type respecting?

The above condition can hold for a node q ∈ QA ∪ QB only if (θA ∪ θB)(q)↓; that is, q must be a sink in either A or B.

The type respecting condition requires that all nodes that are equivalent to q be sinks, either unmarked or marked by the same atom.

Since this is the only requirement, the relation is not type respecting if it maps two nodes, one of which is a marked sink and the other of which is either a non-sink or a sink with a different label, to the same equivalence class.

A non-type respecting ‘$\overset{u}{\approx}$’ is the only source for unification failure.


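Type respecting $\overset{u}{\approx}$ relation

[Figure: two feature graphs, A with nodes qA0, qA1, qA2 and B with nodes qB0, qB1, qB2; the arcs are labeled f, g, num and pers, and two sinks are marked sg and 3rd.]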


Feature graph unification

Lemma

If A and B have a common upper bound C, such that A ⊑ C through the morphism hA and B ⊑ C through the morphism hB, and if qA ∈ QA and qB ∈ QB are such that $q_A \overset{u}{\approx} q_B$, then hA(qA) = hB(qB).


Definition (Feature graph unification)

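Let A and B be two feature graphs such that QA and QB are disjoint. The unification of A and B, denoted A ⊔ B, is defined only if ‘$\overset{u}{\approx}$’ is type respecting, in which case it is the feature graph 〈Q, q, δ, θ〉, where:

\[ Q = \{ [q]_{\overset{u}{\approx}} \mid q \in Q_A \cup Q_B \} \]

\[ q = [q_A]_{\overset{u}{\approx}} \; (= [q_B]_{\overset{u}{\approx}}) \]

\[ \delta([q]_{\overset{u}{\approx}}, f) = \begin{cases} [q'']_{\overset{u}{\approx}} & \text{if there exists } q' \in [q]_{\overset{u}{\approx}} \text{ s.t. } (\delta_A \cup \delta_B)(q', f) = q'' \\ \text{undefined} & \text{if } (\delta_A \cup \delta_B)(q', f)\uparrow \text{ for all } q' \in [q]_{\overset{u}{\approx}} \end{cases} \]

\[ \theta([q]_{\overset{u}{\approx}}) = \begin{cases} (\theta_A \cup \theta_B)(q') & \text{if there exists } q' \in [q]_{\overset{u}{\approx}} \text{ s.t. } (\theta_A \cup \theta_B)(q')\downarrow \\ \text{undefined} & \text{if } (\theta_A \cup \theta_B)(q')\uparrow \text{ for all } q' \in [q]_{\overset{u}{\approx}} \end{cases} \]

Here qA and qB are the roots of A and B, respectively. If $\overset{u}{\approx}$ is not type respecting, A and B are inconsistent.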
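The constructive definition can be turned into a short procedure: compute the equivalence classes, reject the result if the relation is not type respecting, and otherwise read δ and θ off the classes. The sketch below is an illustration under stated assumptions, not the slides' own algorithm: a feature graph is encoded as a hypothetical triple `(root, delta, theta)` of a root node and two dictionaries, each equivalence class is represented by one of its members, and `None` signals inconsistency.

```python
def unify(a, b):
    """Return the triple for A ⊔ B, or None if A and B are inconsistent."""
    (root_a, delta_a, theta_a), (root_b, delta_b, theta_b) = a, b
    delta = {**delta_a, **delta_b}   # δA ∪ δB (node sets assumed disjoint)
    theta = {**theta_a, **theta_b}   # θA ∪ θB
    parent = {}

    def find(q):
        parent.setdefault(q, q)
        while parent[q] != q:
            parent[q] = parent[parent[q]]
            q = parent[q]
        return q

    arcs = {}                        # class representative -> {feature: target}
    for (q, f), target in delta.items():
        arcs.setdefault(q, {})[f] = target

    pending = [(root_a, root_b)]
    while pending:
        x, y = pending.pop()
        rx, ry = find(x), find(y)
        if rx == ry:
            continue
        parent[ry] = rx
        merged = arcs.setdefault(rx, {})
        for f, target in arcs.pop(ry, {}).items():
            if f in merged:
                pending.append((merged[f], target))
            else:
                merged[f] = target

    # Type-respecting check: one atom per class, and marked classes are sinks.
    atoms = {}
    for q, atom in theta.items():
        if atoms.setdefault(find(q), atom) != atom:
            return None
    if any(arcs.get(r) for r in atoms):
        return None

    new_delta = {(find(q), f): find(t) for (q, f), t in delta.items()}
    return find(root_a), new_delta, atoms


num_sg = ("q0", {("q0", "num"): "q1"}, {"q1": "sg"})
pers_3rd = ("q3", {("q3", "pers"): "q5"}, {"q5": "3rd"})
num_pl = ("p0", {("p0", "num"): "p1"}, {"p1": "pl"})

combined = unify(num_sg, pers_3rd)
```

Unifying `num_sg` with `num_pl` returns `None`, since the two num targets fall into one class marked by different atoms.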


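[Figure: the unification of two feature graphs, A (nodes qA0, qA1, qA2) and B (nodes qB0, qB1, qB2); the arcs are labeled f, g, num and pers, and the marked sinks are sg and 3rd.]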


Unification

To see that the result of unification is indeed a feature graph, observe that

〈Q, q, δ, θ〉 is connected because both A and B are connected;

it is finite since both A and B are (and hence the number of equivalence classes is finite);

and θ labels only sinks, since $\overset{u}{\approx}$ is type respecting.


Example (Unification combines information)

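[Figure: the graph with root q0 and an arc num to q1 (marked sg), unified with the graph with root q3 and an arc pers to q5 (marked 3rd), yields the graph with root q6 and arcs num to q7 (marked sg) and pers to q8 (marked 3rd).]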


Example (Unification is absorbing)

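[Figure: the graph with root q0 and an arc num to q1 (marked sg), unified with the graph with root q3 and arcs num to q4 (marked sg) and pers to q5 (marked 3rd), yields the graph with root q6 and arcs num to q7 (marked sg) and pers to q8 (marked 3rd).]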


Unification with reentrancies

[Figure: unification of two feature graphs with reentrancies; the arcs are labeled subj, obj, num and pers, and the marked sinks are sg and 3rd.]


Unification

Theorem

If A and B are inconsistent, they do not have a common upper bound. Otherwise, C = A ⊔ B is a minimal upper bound of A and B with respect to (feature graph) subsumption.


The previous theorem connects feature graph unification with feature structure unification.

In order to compute fs = fs1 ⊔ fs2, simply compute A = A1 ⊔ A2, where A1 ∈ fs1 and A2 ∈ fs2, and take fs = [A]∼.

Theorem

For all feature graphs A1, A2, if A = A1 ⊔ A2 then [A]∼ = [A1]∼ ⊔ [A2]∼.


Unification Generalization

Generalization

Unification is an information-combining operator: when two feature structures are compatible, their unification can be informally seen as a union of the information both structures encode.

Sometimes, however, a dual operation is useful, analogous to the intersection of the information encoded in feature structures.

This operation, which is much less frequently used in computational linguistics, is referred to as anti-unification, or generalization.


Defined over pairs of feature structures, generalization (denoted ⊓) is the operation that returns the most specific (or least general) feature structure that is still more general than both arguments.

In terms of the subsumption ordering, generalization is the greatest lower bound (glb) of two feature structures.

Unlike unification, generalization can never fail.


Definition (Generalization)

The generalization (or anti-unification) of two feature structures fs1 and fs2, denoted fs1 ⊓ fs2, is the greatest lower bound of fs1 and fs2.
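The slides define generalization only as a greatest lower bound and give no procedure for it. One standard way to obtain such a bound on feature graphs is a product construction: walk both graphs in parallel from their roots, keep an arc only when both graphs define it, and keep an atom only when both nodes carry the same one. The sketch below follows that idea; the triple encoding `(root, delta, theta)` and the name `generalize` are assumptions, and the construction itself is offered as an illustration rather than as the author's method.

```python
def generalize(a, b):
    """Return a feature graph triple whose nodes are pairs (node of A, node of B)."""
    (root_a, delta_a, theta_a), (root_b, delta_b, theta_b) = a, b
    root = (root_a, root_b)
    delta, theta = {}, {}
    seen, stack = {root}, [root]
    while stack:
        qa, qb = stack.pop()
        # Keep an atom only if both nodes are marked by the same atom.
        if qa in theta_a and theta_a[qa] == theta_b.get(qb):
            theta[(qa, qb)] = theta_a[qa]
        # Keep an arc only if the feature is defined at both nodes.
        for (src, f), target_a in delta_a.items():
            if src == qa and (qb, f) in delta_b:
                target = (target_a, delta_b[(qb, f)])
                delta[((qa, qb), f)] = target
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
    return root, delta, theta


num_sg = ("a0", {("a0", "num"): "a1"}, {"a1": "sg"})
pers_third = ("b0", {("b0", "pers"): "b1"}, {"b1": "third"})
num_pl = ("c0", {("c0", "num"): "c1"}, {"c1": "pl"})
```

Because the walk never fails, the sketch is consistent with the remark that generalization, unlike unification, always yields a result.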


Example (Generalization)

Generalization reduces information:

[num : sg] ⊓ [pers : third] = [ ]

Different atoms are inconsistent:

[num : sg] ⊓ [num : pl] = [num : [ ]]

Generalization is restricting:

[num : sg] ⊓ [num : sg, pers : third] = [num : sg]


Example (Generalization)

Empty feature structures are zero elements:

[ ] ⊓ [agr : [num : sg]] = [ ]

Reentrancies can be lost:

[f : 1[num : sg], g : 1] ⊓ [f : [num : sg], g : [num : sg]] = [f : [num : sg], g : [num : sg]]


Unification grammars Introduction

Unification grammars

Feature structures are the building blocks with which unification grammars are built, as they serve as the counterpart of the terminal and non-terminal symbols in CFGs.

In order to define grammars and derivations, one needs some extension of feature structures to sequences thereof.

Multi-rooted feature structures are aimed at capturing complex, ordered information and are used for representing rules and sentential forms of unification grammars.

We introduce:

Multi-rooted feature graphs, a natural extension of feature graphs

Multi-rooted feature structures, which are equivalence classes of isomorphic multi-rooted feature graphs

Multi-AVMs, which are an extension of AVMs, and their correspondence to multi-rooted graphs.


Unification in context

Forms and grammar rules

Derivation

Languages

Derivation trees


Unification grammars Multi-rooted feature graphs

Multi-rooted feature graphs

We extend feature graphs to multi-rooted feature graphs (MRGs).

Multi-rooted feature graphs are defined over the same signature (Feats and Atoms), which is assumed to be fixed

Definition (Multi-rooted feature graphs)

A multi-rooted feature graph (MRG) is a pair 〈R, G〉 where G = 〈Q, δ, θ〉 is a finite, directed, labeled graph consisting of a non-empty, finite set Q of nodes (disjoint from Feats and Atoms), a partial function δ : Q × Feats → Q specifying the arcs and a labeling function θ marking some of the sinks, and where R is an ordered list of distinguished nodes in Q called roots. G is not necessarily connected, but the union of all the nodes reachable from all the roots in R is required to yield exactly Q. The length of an MRG is the number of its roots, |R|. λ denotes the empty MRG, where Q = ∅.


Example (Multi-rooted feature graphs)

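The following is an MRG, in which the shaded nodes (ordered from left to right) constitute the list of roots, R

[Figure: an MRG with roots q1, q2, q3 and further nodes q4 (marked s), q5 (marked np), q6 (marked vp) and q7; three arcs are labeled cat and two are labeled agr.]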


A multi-rooted feature graph is a directed, not necessarily connected, labeled graph with a designated sequence of nodes called roots

It is a natural extension of feature graphs, the only difference being that the single root of a feature graph is extended here to a list in order to model the required structured information

Meta-variables $\vec{A}$ range over MRGs, and Q, δ, θ and R over their constituents

We do not distinguish between an MRG of length 1 and a feature graph


Natural relations can be defined between MRGs and feature graphs

First, note that if $\vec{A} = \langle R, G \rangle$ is an MRG and qi is a root in R then qi naturally induces a feature graph $\vec{A}|_i = \langle Q_i, q_i, \delta_i, \theta_i \rangle$, where:

Qi is the set of nodes reachable from qi

δi = δ|Qi (the restriction of δ to Qi)

θi = θ|Qi (the restriction of θ to Qi).


One can view an MRG $\vec{A} = \langle R, G \rangle$ as an ordered sequence 〈A1, . . . , An〉 of (not necessarily disjoint) feature graphs, where $A_i = \vec{A}|_i$ for 1 ≤ i ≤ n

Note that such an ordered list of feature structures is not a sequence in the mathematical sense:

removing a node accessible from one root can result in this node being removed from the graph accessible from some other root


Subgraphs

Although MRGs are not element-disjoint sequences, it is possible to define substructures of them

The roots of an MRG form a sequence of nodes

Taking just a subsequence of the roots, and considering only the subgraph they induce (that is, the nodes that are accessible from these roots), a notion of substructure is naturally obtained


Definition (Induced subgraphs)

The subgraph of a non-empty MRG $\vec{A} = \langle R, G \rangle$ of length n, induced by j, k and denoted $\vec{A}^{j \ldots k}$, is defined only if 1 ≤ j ≤ k ≤ n, in which case it is the MRG 〈R′, G′〉 where R′ = 〈qj, . . . , qk〉, G′ = 〈Q′, δ′, θ′〉 and

Q′ = {q | δ(q′, π) = q for some q′ ∈ R′ and some π}

δ′(q, f) = δ(q, f) for every q ∈ Q′

θ′(q) = θ(q) for every q ∈ Q′

When the sequence is of length 1 we write $\vec{A}^{i}$ for $\vec{A}^{i \ldots i}$. As we identify a feature graph with an MRG of length 1, $\vec{A}^{i} = \vec{A}|_i$.
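Computing an induced subgraph amounts to a reachability search from the selected roots, followed by restricting δ and θ to the nodes found. In this sketch an MRG is encoded as a list of roots plus two dictionaries; that encoding, the function name `induced_subgraph` and the sample MRG (loosely modelled on a rule with cat and agr arcs) are assumptions.

```python
def induced_subgraph(roots, delta, theta, j, k):
    """Return (roots', delta', theta') for roots j..k (1-based), or None if undefined."""
    if not (j >= 1 and k >= j and len(roots) >= k):
        return None
    new_roots = roots[j - 1:k]

    # Index the arcs by source node for the traversal.
    out = {}
    for (q, f), target in delta.items():
        out.setdefault(q, []).append(target)

    # Q' is the set of nodes reachable from the selected roots.
    reachable, stack = set(new_roots), list(new_roots)
    while stack:
        q = stack.pop()
        for target in out.get(q, []):
            if target not in reachable:
                reachable.add(target)
                stack.append(target)

    new_delta = {(q, f): t for (q, f), t in delta.items() if q in reachable}
    new_theta = {q: a for q, a in theta.items() if q in reachable}
    return new_roots, new_delta, new_theta


# Hypothetical MRG of length 3; the second and third roots share their agr value.
roots = ["q1", "q2", "q3"]
delta = {("q1", "cat"): "q4", ("q2", "cat"): "q5", ("q2", "agr"): "q7",
         ("q3", "cat"): "q6", ("q3", "agr"): "q7"}
theta = {"q4": "s", "q5": "np", "q6": "vp"}
```

With j = k = i the function returns the feature graph induced by the i-th root alone.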


MRGs

Since MRGs are a natural extension of feature graphs, many of the concepts defined for the latter can be extended to the former

The transition function δ is extended from single features to paths

The set of paths of an MRG

The function val, associating a value with each path in a feature graph, is extended to MRGs.

Reentrancy and cyclicity

Isomorphism and subsumption


MRG paths

Definition (MRG paths)

The paths of a multi-rooted feature graph $\vec{A}$ are

$\Pi(\vec{A}) = \{\langle i, \pi \rangle \mid \pi \in \mathit{Paths} \text{ and } \delta(q_i, \pi)\downarrow\}$


MRG path values

Definition (Path value)

The value of a path 〈i, π〉 in an MRG $\vec{A}$, denoted by $val_{\vec{A}}(\langle i, \pi \rangle)$, is defined if and only if $\delta_{\vec{A}}(q_i, \pi)\downarrow$, in which case it is the feature graph $val_{\vec{A}|_i}(\pi)$.

Note that the value of a path in an MRG is a (single-rooted) feature graph, not an MRG. In particular, $val_{\vec{A}}(\langle i, \pi \rangle)$ may include nodes which are roots in $\vec{A}$ but are not the root of the resulting feature graph. Clearly, an MRG may have two paths 〈i1, π1〉 and 〈i2, π2〉 where π1 = π2 even though i1 ≠ i2.


Example (Path value)

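[Figure: an MRG $\vec{A}$ with R = 〈q0, q1, q2〉 and further nodes q3, q4, q5, q6 (marked a) and q7 (marked b), whose arcs are labeled f, g and h, shown next to the feature graph $val_{\vec{A}}(\langle 2, \langle f \rangle\rangle)$, which consists of the nodes q4, q6 (marked a) and q7 (marked b) with arcs labeled g and h.]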


MRG reentrancy

Two MRG paths are reentrant, denoted $\langle i, \pi_1 \rangle \overset{\vec{A}}{\leftrightsquigarrow} \langle j, \pi_2 \rangle$, if they share the same value: $\delta_{\vec{A}}(q_i, \pi_1) = \delta_{\vec{A}}(q_j, \pi_2)$

A multi-rooted feature graph is reentrant if it has two distinct paths (possibly leaving different roots) that are reentrant

An MRG $\vec{A}$ is cyclic if two paths $\langle i, \pi_1 \rangle, \langle i, \pi_2 \rangle \in \Pi(\vec{A})$, where π1 is a proper subsequence of π2, are reentrant: $\langle i, \pi_1 \rangle \overset{\vec{A}}{\leftrightsquigarrow} \langle i, \pi_2 \rangle$

Here, the two paths must have the same index i, although they may “pass through” elements of $\vec{A}$ other than the i-th one


A cyclic MRG

Example (A cyclic MRG)

The following MRG $\vec{A} = \langle R, G \rangle$, where R = 〈q0, q1, q2〉, is cyclic:

[Figure: an MRG with roots q0, q1, q2 and further nodes q3, q4, q5, q6 and q7; the arcs are labeled f, g and h.]


Multi-rooted feature graph isomorphism

Definition (Multi-rooted feature graph isomorphism)

Two MRGs $\vec{A}_1 = \langle R_1, G_1 \rangle$ and $\vec{A}_2 = \langle R_2, G_2 \rangle$ are isomorphic, denoted $\vec{A}_1 \mathrel{\vec{\sim}} \vec{A}_2$, iff they are of the same length, n, and there exists a one-to-one mapping i : Q1 → Q2, called an isomorphism, such that:

$i(q^1_j) = q^2_j$ for all 1 ≤ j ≤ n, where $q^1_j$ and $q^2_j$ are the j-th roots of R1 and R2;

for all q1, q2 ∈ Q1 and f ∈ Feats, δ1(q1, f) = q2 iff δ2(i(q1), f) = i(q2); and

for all q ∈ Q1, θ1(q) = θ2(i(q)) (either both are undefined, or both are defined and equal).


Subsumption of multi-rooted feature graphs

Definition (Subsumption of multi-rooted feature graphs)

An MRG $\vec{A} = \langle R, G \rangle$ subsumes an MRG $\vec{A}' = \langle R', G' \rangle$, denoted $\vec{A} \mathrel{\vec{\sqsubseteq}} \vec{A}'$, if |R| = |R′| and there exists a total function h : Q → Q′ such that:

for every root qi ∈ R, h(qi) = q′i

for every q ∈ Q and f ∈ Feats, if δ(q, f)↓ then h(δ(q, f)) = δ′(h(q), f)

for every q ∈ Q, if θ(q)↓ then θ(q) = θ′(h(q))

The only difference from feature graph subsumption is that h is required to map each of the roots in R to its corresponding root in R′. Notice that in order for two MRGs to be related by subsumption they must be of the same length.
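Since every node of an MRG is reachable from some root, the function h, if it exists, is fully determined by the root correspondence and can be built by a traversal. The following sketch checks the three clauses along the way; the triple encoding `(roots, delta, theta)`, the name `mrg_subsumes` and the two sample MRGs are assumptions made for illustration.

```python
def mrg_subsumes(a, b):
    """Return True iff the MRG a subsumes the MRG b."""
    (roots_a, delta_a, theta_a), (roots_b, delta_b, theta_b) = a, b
    if len(roots_a) != len(roots_b):
        return False

    # First clause: each root is mapped to the corresponding root.
    h, stack = {}, []
    for qa, qb in zip(roots_a, roots_b):
        if h.setdefault(qa, qb) != qb:
            return False
        stack.append(qa)

    out = {}
    for (q, f), target in delta_a.items():
        out.setdefault(q, []).append((f, target))

    while stack:
        q = stack.pop()
        # Third clause: atoms must be preserved.
        if q in theta_a and theta_b.get(h[q]) != theta_a[q]:
            return False
        # Second clause: arcs must be preserved.
        for f, target in out.get(q, []):
            image = delta_b.get((h[q], f))
            if image is None:
                return False
            if target not in h:
                h[target] = image
                stack.append(target)
            elif h[target] != image:
                return False
    return True


# Two MRGs of length 2: in `apart` the f and g arcs lead to distinct nodes,
# in `shared` they lead to one node (a reentrancy across roots).
apart = (["a1", "a2"], {("a1", "f"): "x", ("a2", "g"): "y"}, {})
shared = (["b1", "b2"], {("b1", "f"): "z", ("b2", "g"): "z"}, {})
```

For a single-rooted input (a root list of length 1) the same function checks ordinary feature graph subsumption.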


Example (MRG subsumption)

Feature graph subsumption can have three different effects: if A ⊑ B, then B can have additional arcs, additional reentrancies or more marked atoms. The same holds for MRGs, with the observation that additional reentrancies can now occur among paths that originate at different roots:

[Figure: two MRGs, each with arcs labeled f and g; the first subsumes the second but not vice versa.]


Subsumption of multi-rooted feature graphs

Example (MRG subsumption)

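Let ~A and ~A′ be the following two MRGs. Then ~A ~⊑ ~A′ but not ~A′ ~⊑ ~A.

[Figure: ~A is an MRG with three roots whose cat values are np, vp and np. Two of the roots share a single agr value, with num : sg and pers : 3rd; the remaining root has a separate agr value, also with num : sg and pers : 3rd. ~A′ has the same roots and cat values, but the agr arcs of all three roots lead to one shared node with num : sg and pers : 3rd.]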


Unification grammars Multi-AVMs

Multi-AVMs

Definition

Given a signature S, a multi-AVM (MAVM) of length n ≥ 0 is a sequence 〈M1, . . . , Mn〉 such that for each i, 1 ≤ i ≤ n, Mi is an AVM over the signature.


Multi-AVMs

Meta-variables ~M range over multi-AVMs

The sub-AVMs of ~M are SubAVM(~M) = ⋃_{1≤i≤n} SubAVM(Mi)

Similarly to what we did for AVMs, we define the set of tags occurring in a multi-AVM ~M as Tags(~M)

Note that if ~M = 〈M1, . . . , Mn〉 then Tags(~M) = ⋃_{1≤i≤n} Tags(Mi) (where the union is not necessarily disjoint)

Also, the set of sub-AVMs of ~M (including ~M itself) which are tagged by the same variable X is TagSet(~M, X)

Here, too, TagSet(~M, X) = ⋃_{1≤i≤n} TagSet(Mi, X)

We usually do not distinguish between a multi-AVM of length 1 and an AVM

When depicting MAVMs graphically, we sometimes suppress the angular brackets which enclose the sequence of AVMs.


Multi-AVMs

Well-formedness and variable association are extended from AVMs to MAVMs in the natural way:

Definition (Well-formed MAVMs)

A multi-AVM ~M is well-formed iff for every variable X ∈ Tags(~M), TagSet(~M, X) includes at most one non-empty AVM.

Definition (Variable association)

The association of a variable X in ~M, denoted assoc(~M, X), is the single non-empty AVM in TagSet(~M, X); if all the members of TagSet(~M, X) are empty, then assoc(~M, X) = X[ ].
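
The two definitions translate almost directly into code. The sketch below is illustrative only and rests on an assumed encoding that is not in the original: an AVM is a pair (tag, body), where body is None for an empty AVM, a string for an atom, or a dict from features to AVMs; a multi-AVM is a list of such pairs. The function names are hypothetical.

```python
def sub_avms(avm):
    """Yield avm itself and every AVM nested inside it."""
    yield avm
    _, body = avm
    if isinstance(body, dict):
        for value in body.values():
            yield from sub_avms(value)


def tags(mavm):
    """Tags(M): all variables occurring anywhere in the multi-AVM."""
    return {a[0] for m in mavm for a in sub_avms(m)}


def tag_set(mavm, x):
    """TagSet(M, X): the distinct sub-AVMs tagged by x, across all elements."""
    found = []
    for m in mavm:
        for a in sub_avms(m):
            if a[0] == x and a not in found:
                found.append(a)
    return found


def is_well_formed(mavm):
    """At most one non-empty AVM per variable."""
    return all(
        sum(a[1] is not None for a in tag_set(mavm, x)) <= 1
        for x in tags(mavm)
    )


def assoc(mavm, x):
    """The single non-empty AVM tagged by x, or the empty AVM x[ ]."""
    non_empty = [a for a in tag_set(mavm, x) if a[1] is not None]
    return non_empty[0] if non_empty else (x, None)


# A multi-AVM of length 3 in which tags 1 and 2 occur in several elements.
M = [
    (2, {"f": (9, {"h": (1, None)})}),
    (1, {"f": (8, {"g": (7, "a"), "h": (2, None)})}),
    (6, {"f": (5, {"h": (2, None)})}),
]
```

With this encoding, `assoc(M, 1)` is the second element of M and `assoc(M, 2)` is the first, even though both tags also occur, empty, inside other elements.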


Multi-AVMs

Example (Multi-AVMs)

Consider the following multi-AVM ~M, whose length is 3:

~M = 〈 ②[f : ⑨[h : ①[ ]]] , ①[f : ⑧[g : ⑦ a, h : ②[ ]]] , ⑥[f : ⑤[h : ②[ ]]] 〉

Tags(~M) = {①, ②, ⑤, ⑥, ⑦, ⑧, ⑨}. ~M is well-formed:

TagSet(~M, ①) = { ①[ ] , ①[f : ⑧[g : ⑦ a, h : ②[ ]]] }

TagSet(~M, ②) = { ②[ ] , ②[f : ⑨[h : ①[ ]]] }

Therefore,

assoc(~M, ①) = ①[f : ⑧[g : ⑦ a, h : ②[ ]]] , assoc(~M, ②) = ②[f : ⑨[h : ①[ ]]]


Multi-AVMs

The same variable can tag different sub-AVMs of different elements in the sequence

In other words, the scope of variables is extended from single AVMs to multi-AVMs

This leads to an interpretation of variables (in multi-AVMs) which hampers the view of multi-AVMs as sequences of AVMs

Recall that we interpret multiple occurrences of the same variable within a single AVM as denoting value sharing; hence the definition of well-formed AVMs, and the convention that when a variable occurs more than once in an AVM, its association can be stipulated next to any of its occurrences

As in the other views, when multi-AVMs are concerned, this convention implies that removing an element from a multi-AVM can affect other elements, in contradiction to the usual concept of sequences


MAVM subsumption

Definition (Multi-AVM subsumption)

Let ~M, ~M′ be two MAVMs of the same length n and over the same signature. ~M subsumes ~M′, denoted ~M ~⪯ ~M′, if the following conditions hold:

1 for all i, 1 ≤ i ≤ n, Mi ⪯ M′i;

2 if 〈i, π1〉 ↭ 〈j, π2〉 holds in ~M (that is, the two paths are reentrant in ~M), then 〈i, π1〉 ↭ 〈j, π2〉 holds in ~M′.


MAVM subsumption

Example (MAVM subsumption)

Let ~M and ~M′ be the following two MAVMs (of length 3):

~M : 〈 ①[cat : np, agr : ④] , ②[cat : vp, agr : ④[num : sg, pers : 3rd]] , ③[cat : np, agr : ⑥[num : sg, pers : 3rd]] 〉

~M′ : 〈 ①[cat : np, agr : ④] , ②[cat : vp, agr : ④[num : sg, pers : 3rd]] , ③[cat : np, agr : ④] 〉

Then ~M ~⪯ ~M′ but not ~M′ ~⪯ ~M.


MAVM subsumption

The second clause of the definition may seem redundant: if for all i, 1 ≤ i ≤ n, Mi ⪯ M′i, then in particular all the reentrancies of Mi are also reentrancies in M′i; why then is the second clause necessary?

The answer lies in the possibility of reentrancies across elements in multi-AVMs

Such reentrancies are a “global” property of multi-AVMs, which is not reflected in any of the elements in isolation


MAVM Renaming

Definition (Renaming)

Let ~M1 and ~M2 be two MAVMs. ~M2 is a renaming of ~M1, denoted ~M1 ~≃ ~M2, iff ~M1 ~⪯ ~M2 and ~M2 ~⪯ ~M1.


Multi-AVM to MRG mapping

Definition (Multi-AVM to MRG mapping)

Let ~M = 〈M1, . . . , Mn〉 be a well-formed multi-AVM of length n. The MRG image of ~M is ϕ(~M) = 〈R, G〉, with R = 〈q1, . . . , qn〉 and G = 〈Q, δ, θ〉, where:

Q = Tags(~M)

qi = tag(Mi) for 1 ≤ i ≤ n

for all X ∈ Tags(~M) and f ∈ Feats, δ(X, f) = Y if 〈X, f, Y〉 ∈ Arcs(~M), and

for all X ∈ Tags(~M) and a ∈ Atoms, θ(X) = a if assoc(~M, X) is the atomic AVM X(a), and is undefined otherwise.
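
The mapping ϕ can be sketched in a few lines of Python. This is an illustration under assumptions that are not in the original: an AVM is encoded as a pair (tag, body), where body is None (empty), a string (atom) or a dict from features to AVMs, and the input is taken to be well-formed. The name `mavm_to_mrg` is hypothetical.

```python
def mavm_to_mrg(mavm):
    """Map a well-formed multi-AVM to its MRG image (roots, Q, delta, theta)."""
    roots = [tag for tag, _ in mavm]           # q_i = tag(M_i)
    nodes, delta, theta = set(), {}, {}
    stack = list(mavm)
    while stack:
        tag, body = stack.pop()
        nodes.add(tag)                         # Q = Tags(M)
        if isinstance(body, dict):
            for feature, value in body.items():
                delta[(tag, feature)] = value[0]   # arc <X, f, Y>
                stack.append(value)
        elif body is not None:
            theta[tag] = body                  # atomic AVM X(a)
    return {"roots": roots, "nodes": nodes, "delta": delta, "theta": theta}


M = [
    (2, {"f": (9, {"h": (1, None)})}),
    (1, {"f": (8, {"g": (7, "a"), "h": (2, None)})}),
    (6, {"f": (5, {"h": (2, None)})}),
]
G = mavm_to_mrg(M)
```

Because tags become nodes, two occurrences of the same tag in different elements end up as one shared node, which is how reentrancies across elements are represented in the graph.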


Multi-AVM to MRG mapping

Example (Multi-AVM to multi-rooted feature graph mapping)

Consider the following multi-AVM ~M:

~M = 〈 ②[f : ⑨[h : ①[ ]]] , ①[f : ⑧[g : ⑦ a, h : ②[ ]]] , ⑥[f : ⑤[h : ②[ ]]] 〉

Observe that it is well-formed, as the variables that occur more than once (① and ②) have only one non-empty occurrence each. The set of variables of ~M is Tags(~M) = {①, ②, ⑤, ⑥, ⑦, ⑧, ⑨}, which will also be the set of nodes Q in ϕ(~M). The sequence of roots R is the sequence of variables tagging the AVM elements of ~M, namely 〈②, ①, ⑥〉.


Multi-AVM to MRG mapping

Example (Multi-AVM to multi-rooted feature graph mapping)

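The obtained graph is:

[Figure: an MRG with roots ②, ①, ⑥ and further nodes ⑨, ⑧, ⑤ and ⑦, where ⑦ is marked by the atom a. Each root has an f-arc, to ⑨, ⑧ and ⑤, respectively; ⑨ has an h-arc to ①; ⑧ has a g-arc to ⑦ and an h-arc to ②; ⑤ has an h-arc to ②.]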


Multi-AVM to MRG mapping

Proposition

Let ~M, ~M1, ~M2 be multi-AVMs. Then:

Π(~M) = Π(ϕ(~M))

〈i, π1〉 ↭ 〈j, π2〉 holds in ~M iff 〈i, π1〉 ↭ 〈j, π2〉 holds in ϕ(~M)

~M1 ~⪯ ~M2 iff ϕ(~M1) ~⊑ ϕ(~M2)


Unification grammars Unification revisited

Unification revisited

We defined the unification operation for feature structures

We now extend the definition to multi-rooted structures; we define two variants of the operation:

one which unifies two same-length structures and produces their least upper bound with respect to subsumption

unification in context, which combines the information in two feature structures, each of which may be an element in a larger structure


Two AMRS unification operations

Example (Two AMRS unification operations)

[Figure: two schematic panels. "Same-length AMRS unification" shows two AMRSs of equal length combined into one AMRS of that length. "Unification in context" shows a single element of one AMRS unified with a single element of another AMRS.]


MRS unification

Example (MRS unification)

Let

σ = 〈 [cat : d, num : ④] , [cat : n, num : ④, case : nom] , [cat : v, num : ④] 〉

ρ = 〈 [cat : d, num : pl] , [cat : n, num : pl, case : [ ]] , [cat : v, num : pl] 〉

Then

σ ⊔ ρ = 〈 [cat : d, num : ④ pl] , [cat : n, num : ④, case : nom] , [cat : v, num : ④] 〉


Unification in context

Example (Unification in context)

Let

σ = 〈 [f : ① a, g : ②[ ]] , [h : ②] 〉 , ρ = 〈 [f : ③[ ], g : ④ b] , [h : ③] 〉

Unifying the first element in σ with the first element in ρ in the contexts of σ and ρ, we obtain (σ, 1) ⊔ (ρ, 1) = (σ′, ρ′):

σ′ = 〈 [f : ① a, g : ② b] , [h : ②] 〉 , ρ′ = 〈 [f : ③ a, g : ④ b] , [h : ③] 〉

Note that both operands of the unification are modified.
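
One way to picture how both operands change is to unify over graphs with a union-find structure. The following Python sketch is an illustration, not the definition used in the text: it assumes a hypothetical dict encoding of multi-rooted graphs ("roots", "delta", "theta"), uses 0-based element indices, returns None on failure, and does not handle every corner case of the formal operation.

```python
def unify_in_context(sigma, i, rho, j):
    """Unify element i of sigma with element j of rho; return (sigma', rho')."""
    parent, out, atom = {}, {}, {}

    def find(x):
        while x in parent:
            x = parent[x]
        return x

    # Put both graphs side by side; nodes are tagged with their origin.
    for side, g in (("s", sigma), ("r", rho)):
        for (q, f), t in g["delta"].items():
            out.setdefault((side, q), {})[f] = (side, t)
        for q, a in g["theta"].items():
            atom[(side, q)] = a

    pending = [(("s", sigma["roots"][i]), ("r", rho["roots"][j]))]
    while pending:
        x, y = pending.pop()
        x, y = find(x), find(y)
        if x == y:
            continue
        arcs_x, arcs_y = out.setdefault(x, {}), out.pop(y, {})
        atom_x, atom_y = atom.get(x), atom.pop(y, None)
        if atom_x is not None and atom_y is not None and atom_x != atom_y:
            return None                      # two different atoms clash
        merged = atom_x if atom_x is not None else atom_y
        if merged is not None and (arcs_x or arcs_y):
            return None                      # an atom cannot have features
        parent[y] = x                        # merge the two nodes
        if merged is not None:
            atom[x] = merged
        for f, t in arcs_y.items():
            if f in arcs_x:
                pending.append((arcs_x[f], t))   # shared feature: unify values
            else:
                arcs_x[f] = t

    def extract(side, g):
        roots = [find((side, q)) for q in g["roots"]]
        delta, theta, seen, stack = {}, {}, set(), list(roots)
        while stack:
            q = stack.pop()
            if q in seen:
                continue
            seen.add(q)
            if q in atom:
                theta[q] = atom[q]
            for f, t in out.get(q, {}).items():
                delta[(q, f)] = find(t)
                stack.append(find(t))
        return {"roots": roots, "delta": delta, "theta": theta}

    return extract("s", sigma), extract("r", rho)


# sigma = <[f: a, g: #2[]], [h: #2]>,  rho = <[f: #3[], g: b], [h: #3]>
sigma = {"roots": [0, 3], "delta": {(0, "f"): 1, (0, "g"): 2, (3, "h"): 2}, "theta": {1: "a"}}
rho = {"roots": [0, 3], "delta": {(0, "f"): 1, (0, "g"): 2, (3, "h"): 1}, "theta": {2: "b"}}
sigma2, rho2 = unify_in_context(sigma, 0, rho, 0)
```

In the result, the h value of the second element of `sigma2` carries the atom b and the h value of the second element of `rho2` carries the atom a, although neither second element took part in the unification directly.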


Unification in context

Theorem

If 〈σ′, ρ′〉 = (σ, i) ⊔ (ρ, j) then σ′i = ρ′j = σi ⊔ ρj .


Unification in context

Theorem

Let σ, ρ be two AMRSs and i, j be indexes such that i ≤ len(σ) and j ≤ len(ρ). Then 〈σ′, ρ′〉 = (σ, i) ⊔ (ρ, j) iff

σ′ = min~⊑{σ′′ | σ ~⊑ σ′′ and ρj ⪯ σ′′i} and

ρ′ = min~⊑{ρ′′ | ρ ~⊑ ρ′′ and σi ⪯ ρ′′j}.


Unification grammars Rules and grammars

Rules and grammars

Like context-free grammars, unification grammars are defined over an alphabet

As the grammars that are of most interest to us are of natural languages, and since sentences in natural languages are not just strings of symbols, but rather strings of words, we add to the signature an alphabet, a fixed set Words of words (in addition to the fixed sets Feats and Atoms)

Meta-variables wi, wj etc. are used to refer to elements of Words, and w to refer to strings over Words.


Rules and grammars

We also adopt here the distinction between phrasal and terminal rules

The former cannot have elements of Words in their bodies; the latter have only a single word as their body

We refer to the collection of terminal rules as the lexicon: it associates with terminals, members of Words, (abstract) feature structures that are their categories

For every word wi ∈ Words the lexicon specifies a finite set of abstract feature structures L(wi)

If L(wi) is a singleton then wi is unambiguous, and if it is empty then wi is not a member of the language defined by the lexicon.
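
As a small illustration of the last two points, a lexicon can be modeled as a finite mapping from words to lists of feature structures. The sketch below is not from the original; the dict encoding of feature structures, the sample entries and the function name `classify` are assumptions made for the example.

```python
def classify(lexicon, word):
    """Classify a word by the size of its category L(word)."""
    entries = lexicon.get(word, [])
    if not entries:
        return "unknown"          # L(word) is empty
    return "unambiguous" if len(entries) == 1 else "ambiguous"


# A toy lexicon; None stands for an empty (unconstrained) value.
LEXICON = {
    "two": [{"cat": "d", "num": "pl"}],
    "sheep": [
        {"cat": "n", "num": "sg", "case": None},
        {"cat": "n", "num": "pl", "case": None},
    ],
    "sleep": [{"cat": "v", "num": "pl"}],
}
```

Listing two entries for a word, as done here for sheep, is one of two options; a single underspecified entry is the other.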


Lexicon

Definition (Lexicon)

Given a signature of features Feats and atoms Atoms, and a set Words of terminal symbols, a lexicon is a finite-range function L : Words → 2^AFS(Feats, Atoms).


Lexicon

Example (Lexicon)

Following is a lexicon L over a signature consisting of Feats = {cat, num, case}, Atoms = {d, n, v, sg, pl}, and Words = {two, sheep, sleep}:

L(two) = { [cat : d, num : pl] }

L(sheep) = { [cat : n, num : [ ], case : [ ]] }

L(sleep) = { [cat : v, num : pl] }


Lexicon

Example (Lexicon)

As an alternative to the previous lexical entry of sheep, the grammar writer may prefer the following lexical entry:

L(sheep) = { [cat : n, num : sg, case : [ ]] , [cat : n, num : pl, case : [ ]] }


Lexicon

Example (Lexicon, rule-format)

To depict the lexicon specification above, we usually use the following notation:

sheep → [cat : n, num : sg, case : [ ]]

sheep → [cat : n, num : pl, case : [ ]]


Lexicon

When a string of words w is given, it is possible to construct an AMRS σw for the lexical entries of the words in w, such that no two elements of σw share paths

Such an AMRS is simply the concatenation of the lexical entries of the words in w

In general, there may be several such AMRSs, as each word in w can have multiple elements in its category

The set of such AMRSs is the pre-terminals of w


Pre-terminals

Definition (Pre-terminals)

Let w = w1 . . . wn ∈ Words+. PTw(j, k) is defined iff 1 ≤ j, k ≤ n, in which case it is the set of AMRSs {〈Aj · Aj+1 · · · Ak〉 | Ai ∈ L(wi) for j ≤ i ≤ k}. If j > k (i.e., w = ε), then PTw(j, k) = {λ}. The subscript w is omitted when it is clear from the context.
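
Computationally, PTw(j, k) is a Cartesian product over the categories of the words wj, . . . , wk. The Python sketch below illustrates this under assumptions that are not in the original: feature structures are plain dicts, the lexicon is a dict from words to lists of such dicts, and the name `pre_terminals` is hypothetical. Each entry is copied so that no two elements of a result share structure.

```python
from copy import deepcopy
from itertools import product


def pre_terminals(lexicon, words, j, k):
    """Return PT_w(j, k) as a list of tuples, with 1-based indices j and k."""
    n = len(words)
    if j > k:
        return [()]                 # only the empty AMRS
    if not (1 <= j <= n and 1 <= k <= n):
        raise ValueError("PT_w(j, k) is undefined for these indices")
    choices = [lexicon.get(w, []) for w in words[j - 1:k]]
    # One AMRS per way of picking a lexical entry for each word.
    return [tuple(deepcopy(a) for a in combo) for combo in product(*choices)]


LEXICON = {
    "two": [{"cat": "d", "num": "pl"}],
    "sheep": [
        {"cat": "n", "num": "sg", "case": None},
        {"cat": "n", "num": "pl", "case": None},
    ],
    "sleep": [{"cat": "v", "num": "pl"}],
}
PT = pre_terminals(LEXICON, ["two", "sheep", "sleep"], 1, 3)
```

With the ambiguous entry for sheep, `PT` has two members, one per choice of num for the noun. A word with an empty category makes the product empty, so no pre-terminal sequence exists for a string that contains it.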


Pre-terminals

Example (Pre-terminals)

Consider the string of words w = two sheep sleep and the lexicon of the previous example. There is exactly one element in PTw(1, 3); this is the AMRS

〈 [cat : d, num : pl] , [cat : n, num : [ ], case : [ ]] , [cat : v, num : pl] 〉

Notice that there is no sharing of variables among different feature structures in this AMRS. As AMRSs are depicted using multi-AVMs here, the variables in the above multi-AVM are chosen such that unintended reentrancies are avoided.


Pre-terminals

Example (Pre-terminals)

Now assume that the word sheep is represented as an ambiguous word: its category contains two feature structures, namely

L(sheep) = { [cat : n, num : sg, case : [ ]] , [cat : n, num : pl, case : [ ]] }

Then PTw(1, 3) has two members:

〈 [cat : d, num : pl] , [cat : n, num : sg, case : [ ]] , [cat : v, num : pl] 〉 ,

〈 [cat : d, num : pl] , [cat : n, num : pl, case : [ ]] , [cat : v, num : pl] 〉


Rules

Definition (Rules)

A (phrasal) rule is an AMRS of length n > 0 with a distinguished first element. If σ is a rule then σ1 is its head and σ2..n is its body. We adopt a convention of depicting rules with an arrow (→) separating the head from the body.

Since a rule is simply an AMRS, there can be reentrancies among its elements: both between the head and (some element of) the body and among elements in its body.

Notice that the definition supports ε-rules, i.e., rules with null bodies


Rules

Example (Rules as AMRSs)

As every AMRS can be interpreted as a rule, so can the following:

〈 [cat : s] , [cat : np, agr : ④] , [cat : v, agr : ④] 〉


Rules

Example (Rules as AMRSs)

Rules can also propagate information between the mother and any of the daughters using reentrancies between paths originating in the head of the rule and paths originating from one of the body elements, as below.

[cat : s, subj : ①] → ①[cat : np, agr : ②]  [cat : v, agr : ②]


Rules

The rules of the example employ feature structures that include the feature cat, encoding the major part-of-speech category of phrases

While this is useful and natural, it is by no means obligatory

Unification rules can encode such information in other ways (e.g., via a different feature, or as a collection of features); or they may not encode it at all

In the general case, a unification rule is not required to have a context-free skeleton, a feature whose values constitute a context-free backbone that drives the derivation

Some unification-based grammar theories do indeed maintain a context-free skeleton (LFG is a notable example), while others (like HPSG) do not


Rules

We introduce a shorthand notation in the presentation of grammars:

When two rules have the same head, we list the head only once and separate the bodies of the different rules with ‘|’ (following the convention of context-free grammars)

Note, however, that the scope of variables is still limited to a single rule, so that multiple occurrences of the same variable within the bodies of two different rules are unrelated

Additionally, we may use the same variable (e.g., ④) in several rules

It should be clear by now that these multiple uses are unrelated to each other, as the scope of variables is limited to a single rule


Unification grammars

Definition (Unification grammars)

A unification grammar (UG) G = (L, R, As) over a signature Atoms of atoms and Feats of features consists of a lexicon L, a finite set of rules R and a start symbol As that is an abstract feature structure.


Unification grammars

Example (Gu, a unification grammar)

[cat : s] → [cat : np, num : ④, case : nom]  [cat : v, num : ④]

[cat : np, num : ④, case : ②] → [cat : d, num : ④]  [cat : n, num : ④, case : ②]

[cat : np, num : ④, case : ②] → [cat : pron, num : ④, case : ②]


Unification grammars

Example (Gu, a unification grammar)

sleep → [cat : v, num : pl]

sleeps → [cat : v, num : sg]

lamb → [cat : n, num : sg, case : [ ]]

lambs → [cat : n, num : pl, case : [ ]]

she → [cat : pron, num : sg, case : nom]

her → [cat : pron, num : sg, case : acc]

a → [cat : d, num : sg]

two → [cat : d, num : pl]


Unification grammars Derivations

Derivations

The language generated by UGs is defined in a parallel way to the definition of languages generated by context-free grammars:

first, we define derivations, analogously to the context-free derivations

The reflexive transitive closure of the derivation relation is the basisfor the definition of languages

For the following discussion fix a particular grammar G = (L,R,As)


Derivations

Derivation is a relation that holds between two forms, σ1 and σ2, each of which is an AMRS

To define it formally, two concepts have to be taken care of:

An element of σ1 has to be matched against the head of some grammar rule, ρ

The body of ρ must replace the selected element in σ1, thus producing σ2

Matching involves unification, and unification must be computed in context: that is, when the selected element of σ1 is unified with the head of ρ, other elements in σ1 or in ρ may be affected due to reentrancy

This possibility must be taken care of when replacing the selected element with the body of ρ


Derivations

Definition (Derivation)

An AMRS σ1 of length k derives an AMRS σ2 (denoted σ1 ⇒ σ2) iff for some j ≤ k and some rule ρ ∈ R of length n,

(σ1, j) ⊔ (ρ, 1) = (σ′1, ρ′), and

σ2 is the replacement of the j-th element of σ1 with the body of ρ (details suppressed)

The reflexive transitive closure of ‘⇒’ is ‘⇒*’. We write σ ⇒^l ρ when σ derives ρ in l steps.


Derivation step

Example (Derivation step)

Suppose that

σ1 = 〈 [cat : np, num : ①, case : nom] , [cat : v, num : ①] 〉

is a (sentential) form and that

ρ = [cat : np, num : ②, case : ③] → [cat : d, num : ②]  [cat : n, num : ②, case : ③]

is a rule. Assume further that the selected element j in σ1 is the first one. Applying the rule ρ to the form σ1, it is possible to construct a derivation σ1 ⇒ σ2.


Derivation step

Example (Derivation step)

First, compute (σ1, 1) ⊔ (ρ, 1) = (σ′1, ρ′):

σ′1 = 〈 [cat : np, num : ①, case : nom] , [cat : v, num : ①] 〉

ρ′ = [cat : np, num : ②, case : ③ nom] → [cat : d, num : ②]  [cat : n, num : ②, case : ③]


Derivation step

Example (Derivation step)

Now, the first element of σ′1 is replaced by the body of ρ′. This operation results in a new AMRS, σ2, of length 3: the first two elements are the body of ρ′, and the last element is the remainder of σ′1, after its first element has been eliminated; that is, the last element of σ′1. A simple replacement would have resulted in the following AMRS:

〈 [cat : d, num : ②] , [cat : n, num : ②, case : ③ nom] , [cat : v, num : ①] 〉

Obviously, this is not the expected result!


Derivation step

Example (Derivation step)

Since the path (1, num) in σ1 is reentrant with (2, num) (indicated by the tag ①), and since the path (1, num) in the rule ρ is reentrant with the paths (2, num) and (3, num) (the tag ②), one would expect that the sharing between the num values of the noun phrase and the verb phrase in σ1 would manifest itself as a sharing between this feature’s values of the determiner, the noun and the verb phrase in σ2.

This is what the last clause in the definition of derivation guarantees. The result is:

σ2 = 〈 [cat : d, num : ④] , [cat : n, num : ④, case : ⑤ nom] , [cat : v, num : ④] 〉


Derivation

Example (Derivation)

Consider the grammar Gu. A derivation with Gu can start with a form of length 1, consisting of

σ1 = [cat : s]

The single element of this AMRS unifies with the head of the first rule in the grammar, trivially. Substitution is again trivial, and the next form in the derivation is the body of the first rule:

σ2 = 〈 [cat : np, num : ①, case : nom] , [cat : v, num : ①] 〉


Derivation

Example (Derivation)

Since the rule ρ of that example is indeed in Gu, a derivable form from σ2 is:

σ3 = 〈 [cat : d, num : ④] , [cat : n, num : ④, case : nom] , [cat : v, num : ④] 〉

Thus, we obtain σ1 ⇒ σ2 ⇒ σ3, and hence σ1 ⇒* σ3.


Derivation

Example (Derivation)

Consider the form σ3 and one of the AMRSs in PTw(1, 3):

σ3 = 〈 [cat : d, num : ④] , [cat : n, num : ④, case : nom] , [cat : v, num : ④] 〉

σ = 〈 [cat : d, num : pl] , [cat : n, num : pl, case : [ ]] , [cat : v, num : pl] 〉

The former contains information that is accumulated during derivations; the latter reflects information from the lexical entries of the words in w.

σ3 ⊔ σ = 〈 [cat : d, num : ④ pl] , [cat : n, num : ④, case : nom] , [cat : v, num : ④] 〉


Language

Definition (Language)

The language of a unification grammar G is L(G) = {w ∈ Words∗ | w = w1 · · · wn and there exist an AMRS σ such that As ⇒* σ and an AMRS ρ ∈ PTw(1, n) such that σ ⊔ ρ is defined}.


Language

Example (Language)

Consider the grammar Gu and the string w = two sheep sleep. The form σ3 is derivable from the start symbol of the grammar. This form is unifiable with one of the members of PTw(1, 3). Hence the string two sheep sleep is a member of L(Gu).


Unification grammars Derivation trees

Derivation trees

In order to depict derivations graphically we extend the notion of derivation trees, defined for context-free grammars, to unification grammars

Informally, we would like a tree to be a structure whose elements are feature structures

However, care must be taken when the scope of reentrancies in a tree is concerned: in order for information to be shared among all nodes in a tree, this scope is extended to the entire tree


Derivation trees

Rather than define a new mathematical entity, corresponding to a tree whose nodes are feature structures with the scope of reentrancies extended to the entire structure, we reuse in the following definition the concept of multi-rooted structures (more precisely, AMRSs)

In order to impose a tree structure on AMRSs we simply pair them with a tree whose nodes are integers, such that each node in the tree serves as an index into the AMRS

In this way, all the existing definitions which refer to AMRSs can be naturally used when reasoning about trees


Derivation trees

Definition (Unification trees)

Given a signature S = 〈Atoms, Feats〉, a unification tree is an ordered tree whose nodes are AVMs over S, where the scope of reentrancies is extended to the entire tree. A subtree is a particular node of the tree, along with all its descendants (and the edges connecting them). Formally, a unification tree is a pair 〈σ, τ〉, where σ is an AMRS over S, say of length l for some l ∈ N, and τ is a tree over the nodes {1, 2, . . . , l}.
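
The pairing of an AMRS with a tree over integer indices is easy to mirror in code. The following Python sketch is illustrative only; the encoding of τ as a dict from a node to its ordered list of daughters, the flat dict encoding of feature structures and the name `frontier` are assumptions that are not part of the definition.

```python
def frontier(tau, root=1):
    """Return the leaves of the ordered tree tau, from left to right."""
    daughters = tau.get(root, [])
    if not daughters:
        return [root]
    leaves = []
    for d in daughters:
        leaves.extend(frontier(tau, d))
    return leaves


# sigma is an AMRS of length 5; node i of tau indexes sigma[i - 1].
sigma = [
    {"cat": "s"},
    {"cat": "np"},
    {"cat": "d"},
    {"cat": "n"},
    {"cat": "v"},
]
# Node 1 dominates nodes 2 and 5; node 2 dominates nodes 3 and 4.
tau = {1: [2, 5], 2: [3, 4]}

leaves = frontier(tau)
leaf_cats = [sigma[i - 1]["cat"] for i in leaves]
```

Keeping the feature structures in one flat sequence means that everything defined for AMRSs, such as reentrancy between elements, applies to the tree without change; the tree itself only records dominance and order.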


Derivation trees

Example (Unification tree)

Following is a unification tree, depicted as a tree of AVMs (daughters are indented under their mother):

[cat : s]
    [cat : np, num : ④, case : ② nom]
        [cat : d, num : ④]
        [cat : n, num : ④, case : ② nom]
    [cat : v, num : ④]


Derivation trees

Example (Unification tree)

Formally, this tree is a pair 〈σ, τ〉, where τ is a tree over {1, 2, 3, 4, 5} and σ is an AMRS of length 5:

τ: node 1 is the root, with daughters 2 and 5; node 2 has daughters 3 and 4

σ = 〈 [cat : s] , [cat : np, num : ④, case : ② nom] , [cat : d, num : ④] , [cat : n, num : ④, case : ② nom] , [cat : v, num : ④] 〉


Unification derivation trees

Definition (Unification derivation trees)

A unification derivation tree induced by a unification grammar G = (L, R, As) is a unification tree defined recursively as follows:

〈As, τ〉 is a unification derivation tree, where τ is the tree consisting of the single node {1};

if 〈σ, τ〉 is a unification derivation tree and 〈σ′, τ′〉 extends 〈σ, τ〉, then 〈σ′, τ′〉 is also a unification derivation tree.


Unification derivation trees

Example (Unification derivation trees)

A unification derivation tree with the grammar Gu can be built incrementally as follows. The start symbol of the grammar is [cat : s]; therefore, an initial derivation tree would be 〈σ1, {1}〉, the start symbol itself.

Then, by using the first grammar rule, the following tree, 〈σ2, τ2〉, can be obtained:

[cat : s, num : ④]
    [cat : np, num : ④, case : nom]
    [cat : v, num : ④]


Unification derivation trees

Example (Unification derivation trees)

Next, by applying the second grammar rule to the leftmost node on the frontier of 〈σ2, τ2〉, the following tree, 〈σ3, τ3〉, is obtained:

[cat : s, num : ④]
    [cat : np, num : ④ sg, case : nom]
        [cat : d, num : ④]
        [cat : n, num : ④, case : ②]
    [cat : v, num : ④]


Complete derivation trees

As in the context-free case, the frontier of unification derivation trees does not have to correspond to any lexical item

Of course, in order for trees to represent complete derivations, we are particularly interested in such trees whose frontier is unifiable with a sequence of pre-terminals


Complete derivation trees

Definition (Complete derivation trees)

A unification derivation tree 〈σ, τ〉 is complete if the frontier of τ is j1, . . . , jn and there exist a word w ∈ Words∗ of length n and an AMRS ρ ∈ PTw(1, n) such that ρ ⊔ 〈σj1, . . . , σjn〉 is defined.

Note that there may be more than one qualifying AMRS in PTw(1, n); the definition only requires one. Of course, different AMRSs in PTw(1, n) will correspond to different interpretations of the input string (resulting from ambiguous lexical entries of the words)


Complete derivation trees

Example (Complete derivation trees)

Consider the grammar Gu and the string w = two lambs sleep. The tree of the previous example is complete. Its frontier is unifiable with the following AMRS:

〈 [cat : d, num : pl] , [cat : n, num : pl, case : ②] , [cat : v, num : pl] 〉 ∈ PTw(1, 3)


Lexicalized derivation trees

It is sometimes useful to depict a tree whose leaves already reflect the additional information obtained by actually unifying the frontier of a complete derivation tree with PTw

We call such trees lexicalized

It is easy to see that for every lexicalized tree 〈σ, τ〉 there exists a complete derivation tree 〈σ′, τ′〉 such that τ′ = τ and σ′ ~⊑ σ


Lexicalized derivation trees

Definition (Lexicalized derivation trees)

Let 〈σ, τ〉 be a complete derivation tree induced by a unification grammar G = (L, R, As) and let w, ρ be as in the definition of complete trees. A lexicalized derivation tree induced by G on w is the unification tree 〈σ′, τ〉, where σ′ is obtained from σ by unifying the frontier of σ with ρ.


Lexicalized derivation trees

Example (Lexicalized derivation tree)

A tree induced by the grammar Gu on the string two lambs sleep:

[cat : s]
    [cat : np, num : ④, case : ② nom]
        [cat : d, num : ④ pl]
            two
        [cat : n, num : ④, case : ② nom]
            sheep
    [cat : v, num : ④]
        sleep

Linguistic applications Introduction

Linguistic applications

We now put the theory to use, by accounting for several of the linguistic phenomena that motivated UGs

Unification grammars facilitate the expression of linguistic generalizations

This is mediated through two main mechanisms:

The notion of grammatical category is expressed via feature structures, thereby allowing for complex categories as first-class citizens of the grammatical theory

Reentrancy provides a concise machinery for expressing “movement”, or more generally, relations that hold at a deeper level than a phrase-structure tree

Phenomena

Agreement

Case control

Subcategorization

Long-distance dependencies

Control

Coordination


Linguistic applications A basic grammar

A basic grammar

Example (A context-free grammar G0)

S → NP VP
VP → V | V NP
NP → D N | Pron | PropN
D → the, a, two, every, . . .
N → sheep, lamb, lambs, shepherd, water . . .
V → sleep, sleeps, love, loves, feed, feeds, herd, herds, . . .
Pron → I, me, you, he, him, she, her, it, we, us, they, them
PropN → Rachel, Jacob, . . .

Every CFG is a UG

Observe that any context-free grammar is a special case of a unification grammar

The non-terminal symbols of the CFG can be modeled by atoms

A more general view of G0 as a unification grammar can encode the fact that the non-terminal symbols represent grammatical categories

This can be done using a single feature, e.g., cat, whose values are the non-terminals of G0
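
The translation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: feature structures are modeled as plain dictionaries, and the names CFG_RULES, as_avm and cfg_to_ug are hypothetical helpers, not part of the original material.

```python
# Phrase-structure rules of G0 as (head, body) pairs over non-terminal names.
CFG_RULES = [
    ("S", ["NP", "VP"]),
    ("VP", ["V"]),
    ("VP", ["V", "NP"]),
    ("NP", ["D", "N"]),
    ("NP", ["Pron"]),
    ("NP", ["PropN"]),
]


def as_avm(nonterminal):
    """Wrap a CFG non-terminal as a one-feature structure [cat : x]."""
    return {"cat": nonterminal.lower()}


def cfg_to_ug(rules):
    """Translate every CFG rule into a rule over feature structures."""
    return [(as_avm(head), [as_avm(sym) for sym in body]) for head, body in rules]


UG_RULES = cfg_to_ug(CFG_RULES)

# The atoms of the resulting signature are exactly the values of cat.
ATOMS = {avm["cat"] for head, body in UG_RULES for avm in [head, *body]}
```

Each rule keeps its shape; only the symbols change from atomic names to structures with a single cat feature.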


G′0, a basic unification grammar

Example (G′0, a basic unification grammar)

Following is a unification grammar, G′0, over a signature 〈Feats, Atoms〉 where Feats = {cat} and Atoms = {s, np, vp, v, d, n, pron, propn}:

(1) [cat : s] → [cat : np] [cat : vp]
(2) [cat : vp] → [cat : v]
(3) [cat : vp] → [cat : v] [cat : np]
(4) [cat : np] → [cat : d] [cat : n]
(5, 6) [cat : np] → [cat : pron] | [cat : propn]

G′0, a basic unification grammar

Example (G′0, a basic unification grammar)

sleep → [cat : v]
give → [cat : v]
love → [cat : v]
tell → [cat : v]
feed → [cat : v]
feeds → [cat : v]
lamb → [cat : n]
lambs → [cat : n]
she → [cat : pron]
her → [cat : pron]
they → [cat : pron]
them → [cat : pron]
Rachel → [cat : propn]
Jacob → [cat : propn]
a → [cat : d]
two → [cat : d]

Derivation trees induced by G′0

Example (Derivation trees induced by G′0)

The grammar G′0 induces the following tree on the string the sheep love her:

[cat : s]
    [cat : np]
        [cat : d]
            the
        [cat : n]
            sheep
    [cat : vp]
        [cat : v]
            love
        [cat : np]
            [cat : pron]
                her

Derivation trees induced by G′0

Example (Derivation trees induced by G′0)

Not surprisingly, an isomorphic derivation tree is induced by the grammar on the ungrammatical string ∗the lambs sleeps they:

[cat : s]
    [cat : np]
        [cat : d]
            the
        [cat : n]
            lambs
    [cat : vp]
        [cat : v]
            sleeps
        [cat : np]
            [cat : pron]
                they

Linguistic applications Imposing agreement

Gagr, accounting for agreement on number

Example (Gagr, accounting for agreement on number)

(1) [cat : s] → [cat : np, num : ④] [cat : vp, num : ④]
(2) [cat : vp, num : ④] → [cat : v, num : ④]
(3) [cat : vp, num : ④] → [cat : v, num : ④] [cat : np]
(4) [cat : np, num : ④] → [cat : d, num : ④] [cat : n, num : ④]
(5, 6) [cat : np, num : ④] → [cat : pron, num : ④] | [cat : propn, num : ④]

Gagr, accounting for agreement on number

Example (Gagr, accounting for agreement on number)

sleep → [cat : v, num : pl]
give → [cat : v, num : pl]
love → [cat : v, num : pl]
tell → [cat : v, num : pl]
feed → [cat : v, num : pl]
feeds → [cat : v, num : sg]

Gagr, accounting for agreement on number

Example (Gagr, accounting for agreement on number)

lamb → [cat : n, num : sg]
lambs → [cat : n, num : pl]
she → [cat : pron, num : sg]
her → [cat : pron, num : sg]
they → [cat : pron, num : pl]
them → [cat : pron, num : pl]
Rachel → [cat : propn, num : sg]
Jacob → [cat : propn, num : sg]
a → [cat : d, num : sg]
two → [cat : d, num : pl]

Gagr generates a CF language

Example (A context-free grammar G1)

S → Ssg | Spl

Ssg → NPsg VPsg Spl → NPpl VPpl

NPsg → Dsg Nsg NPpl → Dpl Npl

NPsg → Pronsg | PropNsg NPpl → Pronpl | PropNpl

VPsg → Vsg VPpl → Vpl

VPsg → Vsg NPsg | Vsg NPpl VPpl → Vpl NPsg | Vpl NPpl

Dsg → a Dpl → two

Nsg → lamb | sheep | · · · Npl → lambs | sheep | · · ·

Pronsg → she | her | · · · PropNsg → Rachel | Jacob | · · ·

Vsg → sleeps | · · · Vpl → sleep | · · ·


Gagr generates a CF language

While Gagr is a unification grammar, the language it generates is context-free

But the equivalent CFG is inferior to the unification grammar:

The linguistic description is distorted: information regarding number, which is determined by the words themselves, is encoded in G1 by the way they are derived (in other words, G1 accounts for lexical knowledge by means of phrase-structure rules)

Several linguistic generalizations are lost: the context-free grammar induces two different trees on a lamb sleeps and two lambs sleep

UG and linguistic generalizations

One natural notion of ‘linguistic generalization’ emerges: the ability to formulate a linguistic restriction by means of a single rule, instead of by a collection of “similar” rules

In this sense, Gagr captures the agreement generalization, while G1 does not

Multiplying out all the possible values of a particular feature, and converting a unification grammar to an equivalent context-free grammar in this way, is not always possible
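
For a feature with finitely many atomic values, such as num, the multiplying-out step can be sketched as follows. This is a rough illustration, not the construction used in the source: the rule encoding (pairs of a category and a variable name), the names GAGR_RULES and multiply_out, and the subscript spelling NP_sg are all assumptions made for the example. The resulting rules are equivalent to G1 but not identical to it (for instance, S is expanded directly instead of going through Ssg and Spl).

```python
from itertools import product

NUM_VALUES = ("sg", "pl")

# Rules of Gagr: each symbol is (category, variable). A shared variable name
# plays the role of a reentrancy tag; None means the symbol has no num feature.
# The object NP of rule 3 gets its own variable "y" because its number is free.
GAGR_RULES = [
    (("s", None), [("np", "x"), ("vp", "x")]),
    (("vp", "x"), [("v", "x")]),
    (("vp", "x"), [("v", "x"), ("np", "y")]),
    (("np", "x"), [("d", "x"), ("n", "x")]),
    (("np", "x"), [("pron", "x")]),
    (("np", "x"), [("propn", "x")]),
]


def multiply_out(rules, values=NUM_VALUES):
    """Replace every rule by one context-free rule per assignment of values."""
    cfg_rules = []
    for head, body in rules:
        variables = sorted({var for _, var in [head, *body] if var is not None})
        for combo in product(values, repeat=len(variables)):
            binding = dict(zip(variables, combo))

            def name(symbol):
                cat, var = symbol
                return cat.upper() if var is None else f"{cat.upper()}_{binding[var]}"

            cfg_rules.append((name(head), [name(s) for s in body]))
    return cfg_rules


G1_LIKE = multiply_out(GAGR_RULES)
```

Six unification rules become fourteen context-free ones. The approach relies on the set of values being finite; it has no counterpart when a feature can take unboundedly many values, as with the subcategorization lists introduced later.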


Linguistic applications Imposing case control

Imposing case control

Add a feature, case, to the signature and to the feature structures associated with nominal categories: nouns, pronouns, proper names and noun phrases

The lexical entries of pronouns must specify their case, which is overt and explicit: we use the value nom for nominative case, whereas acc stands for accusative

As for proper names and nouns, their lexical entries are simply underspecified with respect to case

Use the values of the case feature in the grammar to impose constraints of case control

Gcase, accounting for case control

Example (Gcase, accounting for case control)

(1) [cat : s] → [cat : np, num : ④, case : nom] [cat : vp, num : ④]
(2) [cat : vp, num : ④] → [cat : v, num : ④]
(3) [cat : vp, num : ④] → [cat : v, num : ④] [cat : np, num : ③, case : acc]
(4) [cat : np, num : ④, case : ②] → [cat : d, num : ④] [cat : n, num : ④, case : ②]
(5, 6) [cat : np, num : ④, case : ②] → [cat : pron, num : ④, case : ②] | [cat : propn, num : ④, case : ②]

Gcase, accounting for case control

Example (Gcase, accounting for case control)

sleep → [cat : v, num : pl]
sleeps → [cat : v, num : sg]
feed → [cat : v, num : pl]
feeds → [cat : v, num : sg]

Gcase, accounting for case control

Example (Gcase, accounting for case control)

lamb → [cat : n, num : sg, case : [ ]]
lambs → [cat : n, num : pl, case : [ ]]
she → [cat : pron, num : sg, case : nom]
her → [cat : pron, num : sg, case : acc]
they → [cat : pron, num : pl, case : nom]
them → [cat : pron, num : pl, case : acc]
Rachel → [cat : propn, num : sg]
Jacob → [cat : propn, num : sg]
a → [cat : d, num : sg]
two → [cat : d, num : pl]

Derivation tree with case control

Example (Derivation tree with case control)

[cat : s]
    [cat : np, num : ④, case : ③ nom]
        [cat : d, num : ④ pl]
            the
        [cat : n, num : ④, case : ③]
            shepherds
    [cat : vp, num : ④]
        [cat : v, num : ④]
            feed
        [cat : np, num : ②, case : ⑤ acc]
            [cat : pron, num : ② pl, case : ⑤]
                them

Derivation tree with case control

Example (Derivation tree with case control)

This tree represents a derivation which starts with the initial symbol, [cat : s], and ends with the multi-AVM σ′, where σ′ is:

the: [num : ④]
shepherds: [num : ④, case : nom]
feed: [num : ④]
them: [num : ②, case : acc]

This multi-AVM is unifiable with (but not identical to!) the sequence of lexical entries of the words in the sentence, σ, which is:

the: [num : [ ]]
shepherds: [num : pl, case : [ ]]
feed: [num : pl]
them: [num : pl, case : acc]

Hence the sentence is in the language generated by the grammar.
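
The unifiability check between σ′ and σ can be illustrated with a small Python sketch. It is a deliberately simplified model, not a general unification algorithm: feature structures are flat dictionaries, reentrancy tags are represented by integers, the empty AVM [ ] by None, and only the derived sequence may contain tags. The function name unify_with_lexicon and the constants are hypothetical.

```python
def unify_with_lexicon(derived, lexical):
    """Unify a derived multi-AVM with a sequence of lexical entries.

    Values are atoms (str), reentrancy tags (int) or None for the empty AVM.
    Returns the unified sequence with tags resolved, or None on a clash.
    """
    binding = {}  # tag -> atom bound so far (None while still unconstrained)
    merged = []
    for d_avm, l_avm in zip(derived, lexical):
        out = {}
        for feat in sorted(set(d_avm) | set(l_avm)):
            d, l = d_avm.get(feat), l_avm.get(feat)
            if isinstance(d, int):  # tagged value: consult the shared binding
                bound = binding.get(d)
                if bound is not None and l is not None and bound != l:
                    return None  # two different atoms would share one tag
                binding[d] = bound if bound is not None else l
                out[feat] = d
            else:
                if d is not None and l is not None and d != l:
                    return None  # plain atom clash
                out[feat] = d if d is not None else l
        merged.append(out)
    # Replace every tag by the atom it ended up bound to.
    return [{f: binding[v] if isinstance(v, int) else v for f, v in avm.items()}
            for avm in merged]


# sigma' (frontier of the tree) and sigma (lexical entries) for
# "the shepherds feed them"
SIGMA_PRIME = [{"num": 4}, {"num": 4, "case": "nom"}, {"num": 4},
               {"num": 2, "case": "acc"}]
SIGMA = [{"num": None}, {"num": "pl", "case": None}, {"num": "pl"},
         {"num": "pl", "case": "acc"}]

RESULT = unify_with_lexicon(SIGMA_PRIME, SIGMA)

# Swapping in the entry of "feeds" (num : sg) makes tag 4 inconsistent.
CLASH = unify_with_lexicon(SIGMA_PRIME, SIGMA[:2] + [{"num": "sg"}] + SIGMA[3:])
```

In this sketch the unification succeeds for the shepherds feed them, with tag 4 resolved to pl, and fails once feeds is substituted for feed.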


Linguistic applications Imposing subcategorization constraints

Imposing subcategorization constraints

A naïve solution to the subcategorization problem

intransitive verbs (with no object): sleep, walk, run, laugh, . . .

transitive verbs (with a nominal object): feed, love, eat, . . .

Lexical entries of verbs are extended such that their subcategorization is specified

The rules that involve verbs and verb phrases are extended

Gsubcat, a naïve account of verb subcategorization

Example (Gsubcat, a naïve account of verb subcategorization)

(1) [cat : s] → [cat : np, num : ④, case : nom] [cat : vp, num : ④]
(2) [cat : vp, num : ④] → [cat : v, num : ④, subcat : intrans]
(3) [cat : vp, num : ④] → [cat : v, num : ④, subcat : trans] [cat : np, num : ④, case : acc]
(4) [cat : np, num : ④, case : ②] → [cat : d, num : ④] [cat : n, num : ④, case : ②]
(5, 6) [cat : np, num : ④, case : ②] → [cat : pron, num : ④, case : ②] | [cat : propn, num : ④, case : ②]

Gsubcat, a naïve account of verb subcategorization

Example (Gsubcat, a naïve account of verb subcategorization)

sleep → [cat : v, num : pl, subcat : intrans]
sleeps → [cat : v, num : sg, subcat : intrans]
feed → [cat : v, num : pl, subcat : trans]
feeds → [cat : v, num : sg, subcat : trans]

Gsubcat, a naïve account of verb subcategorization

Example (Gsubcat, a naïve account of verb subcategorization)

lamb → [cat : n, num : sg, case : [ ]]
lambs → [cat : n, num : pl, case : [ ]]
she → [cat : pron, num : sg, case : nom]
her → [cat : pron, num : sg, case : acc]
they → [cat : pron, num : pl, case : nom]
them → [cat : pron, num : pl, case : acc]
Rachel → [cat : propn, num : sg]
Jacob → [cat : propn, num : sg]
a → [cat : d, num : sg]
two → [cat : d, num : pl]

Linguistic applications Subcategorization lists

Subcategorization lists

The previous account of subcategorization is naïve

Different verbs subcategorize for different kinds of complements: noun phrases, infinitival verb phrases, sentences etc.

Some verbs require more than one complement

The idea is to store in the lexical entry of each verb not an atomic feature indicating its subcategory, but rather a list of categories, indicating the appropriate complements of the verb

Lexical entries of some verbs using subcategorization lists

Example (Lexical entries of some verbs using subcategorization lists)

sleep → [cat : v, subcat : elist, num : pl]
love → [cat : v, subcat : 〈[cat : np]〉, num : pl]
give → [cat : v, subcat : 〈[cat : np], [cat : np]〉, num : pl]
tell → [cat : v, subcat : 〈[cat : np], [cat : s]〉, num : pl]

Subcategorization lists

The grammar rules must be modified to reflect the additional wealth of information in the lexical entries

Due to this wealth there can be a dramatic reduction in the number of grammar rules necessary for handling verbs

VP rules using subcategorization lists

Example (VP rules using subcategorization lists)

[cat : s] → [cat : np] [cat : v, subcat : elist]

[cat : v, subcat : ②] → [cat : v, subcat : [first : [cat : ④], rest : ②]] [cat : ④]
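
The effect of the second rule, which consumes one complement at a time from the front of the subcategorization list, can be sketched in Python. The sketch makes simplifying assumptions: categories are plain strings, number and case are ignored, and the names LEXICON, saturate and is_sentence are hypothetical.

```python
# Subcategorization lists of the verbs above, with categories as atoms.
LEXICON = {
    "sleep": [],
    "love": ["np"],
    "give": ["np", "np"],
    "tell": ["np", "s"],
}


def saturate(subcat, complements):
    """Apply the list-consuming VP rule once per complement.

    Each step requires the next complement to match the first element of the
    list and continues with the rest. Returns the remaining list, or None if
    some complement does not match.
    """
    remaining = list(subcat)
    for comp in complements:
        if not remaining or remaining[0] != comp:
            return None
        remaining = remaining[1:]
    return remaining


def is_sentence(subject, verb, complements):
    """The sentence rule: an np followed by a verb whose list is empty."""
    return subject == "np" and saturate(LEXICON[verb], complements) == []
```

A single pair of rules covers intransitive, transitive and ditransitive verbs alike, because the number and kind of complements is read off the lexical entry rather than encoded in separate rules.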


A derivation tree

Example (A derivation tree)

[cat : s]
    [cat : np]
        Rachel
    [cat : v, subcat : 〈〉]
        [cat : v, subcat : 〈[cat : ②]〉]
            [cat : v, subcat : 〈[cat : ①], [cat : ②]〉]
                gave
            [cat : ① np]
                the sheep
        [cat : ② np]
            water

A derivation tree

Example (A derivation tree)

In this tree, c abbreviates cat and sc abbreviates subcat.

[c : s]
    [c : np]
        Jacob
    [c : v, sc : 〈〉]
        [c : v, sc : 〈[c : ②]〉]
            [c : v, sc : 〈[c : ①], [c : ②]〉]
                told
            [c : ① np]
                Laban
        [c : ② s]
            [c : np]
                he
            [c : v, sc : 〈〉]
                [c : v, sc : 〈[c : ③]〉]
                    loved
                [c : ③ np]
                    Rachel

Subcategorization imposes case constraints

In the above grammar, categories on subcategorization lists are represented as atomic symbols

This is a simplification; the method outlined here can be used with more complex encodings of categories

For example, the lexical entry of the German verb geben (to give) can state that the first complement must be in the dative case, whereas the second must be accusative

Subcategorization imposes case constraints

Example (Subcategorization imposes case constraints)

Ich gebe dem Hund den Knochen
I give the(dat) dog the(acc) bone
I give the dog the bone

∗Ich gebe den Hund den Knochen
I give the(acc) dog the(acc) bone

∗Ich gebe dem Hund dem Knochen
I give the(dat) dog the(dat) bone

Subcategorization imposes case constraints

Example (Subcategorization imposes case constraints)

The lexical entry of gebe, then, could be:

L(gebe) = [cat : v, subcat : 〈[cat : np, case : dat], [cat : np, case : acc]〉, num : sg]

Subcategorization imposes case constraints

In order to account for subcategorization of complex information (rather than of atomic category symbols), the VP rule which manipulates subcategorization lists has to be slightly modified

The revised rule reflects the fact that the subcategorized information is not the value of the cat feature, but rather the entire verb complement:

[cat : v, subcat : ②] → [cat : v, subcat : [first : ③, rest : ②]] ③[ ]

G3, a complete E2-grammar

Example (G3, a complete E2-grammar)

[cat : s] → [cat : np, num : ④, case : nom] [cat : v, num : ④, subcat : elist]

[cat : v, num : ④, subcat : ②] → [cat : v, num : ④, subcat : [first : ③, rest : ②]] ③[ ]

[cat : np, num : ④, case : ②] → [cat : d, num : ④] [cat : n, num : ④, case : ②]

[cat : np, num : ④, case : ②] → [cat : pron, num : ④, case : ②] | [cat : propn, num : ④, case : ②]

G3, a complete E2-grammar

Example (G3, a complete E2-grammar)

sleep → [cat : v, subcat : elist, num : pl]
give → [cat : v, subcat : 〈[cat : np, case : acc], [cat : np]〉, num : pl]
love → [cat : v, subcat : 〈[cat : np, case : acc]〉, num : pl]
tell → [cat : v, subcat : 〈[cat : np, case : acc], [cat : s]〉, num : pl]

G3, a complete E2-grammar

Example (G3, a complete E2-grammar)

lamb → [cat : n, num : sg, case : ②]
lambs → [cat : n, num : pl, case : ②]
she → [cat : pron, num : sg, case : nom]
her → [cat : pron, num : sg, case : acc]
Rachel → [cat : propn, num : sg]
Jacob → [cat : propn, num : sg]
a → [cat : d, num : sg]
two → [cat : d, num : pl]

Linguistic applications Long distance dependencies

Long distance dependencies

Encoding grammatical categories as feature structures is very useful in the treatment of unbounded dependencies

Such phenomena involve a “missing” constituent that is realized outside the clause from which it is missing, as in:

The shepherd wondered whom Jacob loved ⌣.

Long distance dependencies

Phrases such as whom Jacob loved ⌣ or who ⌣ loved Rachel are sentences, with a constituent which is “moved” from its default position and realized as a wh-pronoun in front of the phrase

We represent such phrases by using the category s

But to distinguish them from declarative sentences we add a feature, que, to the category

The value of que is ‘+’ in sentences with an interrogative pronoun realizing a transposed constituent

Long distance dependencies

We also add a lexical entry for the pronoun whom:

whom → [cat : pron, case : acc, que : +]

Finally, we update the rule that derives pronouns such that it propagates the value of que from the lexicon to higher projections of the pronoun:

[cat : np, num : ①, case : ③, que : ⑤] → [cat : pron, num : ①, case : ③, que : ⑤]

Long-distance dependencies

We extend G3 with two additional rules, based on the first two rules of G3:

(3) [cat : s, slash : ④] → [cat : np, num : ①, case : nom] [cat : v, num : ①, subcat : elist, slash : ④]

(4) [cat : v, num : ①, subcat : ②, slash : ④] → [cat : v, num : ①, subcat : [first : ④, rest : ②]]

A derivation tree for Jacob loved ⌣

Example (A derivation tree for Jacob loved ⌣)

[cat : s, slash : ④]
    [cat : np, num : ①, case : ②]
        [cat : propn, num : ① sg, case : ② nom]
            Jacob
    [cat : v, num : ①, slash : ④, subcat : ⑧]
        [cat : v, num : ①, subcat : [first : ④[cat : np, case : acc], rest : ⑧ elist]]
            loved
        ⌣

Long-distance dependencies

A rule for creating “complete” sentences by combining the missing category with a “slashed” sentence

The rule does not commit as to the category of the dislocated element; it simply combines any category with a sentence in which this very same category is missing, provided that this category is marked as ‘que +’

The value of que is propagated to the mother to indicate that the sentence is interrogative rather than declarative:

(5) [cat : s, que : ⑤] → ④[que : ⑤ +] [cat : s, slash : ④]

A derivation tree for whom Jacob loved ⌣

Example (A derivation tree for whom Jacob loved ⌣)

[cat : s, que : ⑤]
    ④[cat : np, case : ③, que : ⑤]
        [cat : pron, case : ③ acc, que : ⑤ +]
            whom
    [cat : s, slash : ④]
        [cat : np, num : ①, case : ②]
            [cat : propn, num : ① sg, case : ② nom]
                Jacob
        [cat : v, num : ①, slash : ④, subcat : elist]
            [cat : v, num : ①, subcat : 〈④〉]
                loved
            ⌣

Long-distance dependencies

In order to derive the full sentence Rachel wondered whom Jacob loved ⌣ we need a lexical entry for the verb wondered

It is a verb, so its category is v, and as it subcategorizes for an interrogative sentence, its subcategory is a list of a single member, a sentence whose que feature is ‘+’:

wondered → [cat : v, num : [ ], subcat : 〈[cat : s, que : +]〉]

A derivation tree for Rachel wondered whom Jacob loved ⌣

Example (A derivation tree for Rachel wondered whom Jacob loved ⌣)

[cat : s]
    [cat : np, num : ③, case : ④ nom]
        [cat : propn, num : ③ sg, case : ④]
            Rachel
    [cat : v, num : ③, subcat : elist]
        [cat : v, num : ③, subcat : 〈①〉]
            wondered
        ①[cat : s, que : +]
            whom Jacob loved ⌣

Long-distance dependencies

In the previous example the filler of the gap is realized immediately to the left of the clause in which the gap occurs

This need not always be the case: unbounded dependencies can hold across several clause boundaries

Typical examples are:

The shepherd wondered whom Jacob loved ⌣.

The shepherd wondered whom Laban thought Jacob loved ⌣.

The shepherd wondered whom Laban thought Leah claimed Jacob loved ⌣.

Long-distance dependencies

Also, the dislocated constituent does not have to be an object:

The shepherd wondered who ⌣ loved Rachel.

The shepherd wondered who Laban thought ⌣ loved Rachel.

The shepherd wondered who Laban thought Leah claimed ⌣ loved Rachel.

Long-distance dependencies

The solution we proposed for the simple case of unbounded dependencies can be easily extended to the more complex examples

The solution amounts to three components:

A slash introduction rule

Slash propagation rules

A gap filler rule

Long-distance dependencies

In order to account for filler-gap relations that hold across several clauses, all that needs to be done is to add more slash propagation rules

For example, in

The shepherd wondered whom Laban thought Jacob loved ⌣.

the slash is introduced by the verb phrase loved ⌣, and is propagated to the sentence Jacob loved ⌣ by rule (3)

This sentence is the object of the verb thought; therefore, we need a rule that propagates the value of slash from a sentential object to the verb phrase of which it is an object

Long-distance dependencies

Example (Long-distance dependencies)

(6) [cat : v, num : ①, subcat : ⑫, slash : ④] → [cat : v, num : ①, subcat : [first : ⑧, rest : ⑫]] ⑧[slash : ④]

Long-distance dependencies

Example (Long-distance dependencies)

Then, the slash is propagated from the verb phrase thought Jacob loved ⌣ to the sentence Laban thought Jacob loved ⌣:

(7) [cat : s, slash : ④] → [cat : np, num : ⑤, case : nom] [cat : v, num : ⑤, subcat : elist, slash : ④]

Long-distance dependencies

Example (A derivation tree for whom Laban thought Jacob loved ⌣)

In this tree, sc abbreviates subcat.

[cat : s, que : ⑥]
    ④[cat : np, case : ③, que : ⑥]
        [cat : pron, case : ③ acc, que : ⑥ +]
            whom
    [cat : s, slash : ④]
        [cat : np, num : ⑤, case : ⑨]
            [cat : propn, num : ⑤ sg, case : ⑨ nom]
                Laban
        [cat : v, num : ⑤, slash : ④, sc : ⑫ elist]
            [cat : v, num : ⑤, sc : [first : ⑧, rest : ⑫]]
                thought
            ⑧[cat : s, slash : ④]
                [cat : np, num : ①, case : ②]
                    [cat : propn, num : ① sg, case : ② nom]
                        Jacob
                [cat : v, num : ①, slash : ④, sc : elist]
                    [cat : v, num : ①, sc : 〈④〉]
                        loved
                    ⌣

Long-distance dependencies

Example (Long-distance dependencies)

Finally, to account for gaps in the subject position, all that is needed is an additional slash introduction rule:

(8) [cat : s, slash : [cat : np, num : ①, case : nom]] → [cat : v, num : ①, subcat : elist]

Long-distance dependencies

Example (A derivation tree for who ⌣ loved Rachel)

[cat : s, que : ⑥]
    ④[cat : np, case : ③ nom, que : ⑥]
        [cat : pron, case : ③ nom, que : ⑥]
            who
    [cat : s, num : ①, slash : ④]
        ⌣
        [cat : v, num : ①, subcat : elist]
            [cat : v, num : ① sg, subcat : 〈⑧〉]
                loved
            ⑧[cat : np, case : ②]
                [cat : propn, num : ⑥ sg, case : ② acc]
                    Rachel

Linguistic applications Subject and object control

Subject and object control

Differences between the ‘understood’ subjects of the infinitive verb phrase to work seven years in the following sentences:

Jacob promised Laban to work seven years

Laban persuaded Jacob to work seven years

The differences between the two example sentences stem from differences in the matrix verbs:

promise is a subject control verb

persuade is an object control verb

G4: explicit subj values

Example (G4: explicit subj values)

[cat : s] → ①[cat : np, case : nom, num : ⑦] [cat : v, num : ⑦, subcat : elist, subj : ①]

[cat : v, num : ⑦, subcat : ④, subj : ①] → [cat : v, num : ⑦, subcat : [first : ②, rest : ④], subj : ①] ②[ ]

[cat : np, num : ⑦, case : ⑥] → [cat : d, num : ⑦] [cat : n, num : ⑦, case : ⑥]

[cat : np, num : ⑦, case : ⑥] → [cat : pron, num : ⑦, case : ⑥] | [cat : propn, num : ⑦, case : ⑥]

G4: explicit subj values

Example (G4: explicit subj values)

sleep → [cat : v, subcat : elist, subj : [cat : np, case : nom], num : pl]
love → [cat : v, subcat : 〈[cat : np, case : acc]〉, subj : [cat : np, case : nom], num : pl]
give → [cat : v, subcat : 〈[cat : np, case : acc], [cat : np]〉, subj : [cat : np, case : nom], num : pl]

G4: explicit subj values

Example (G4: explicit subj values)

lamb → [cat : n, num : sg, case : ⑥]
lambs → [cat : n, num : pl, case : ⑥]
she → [cat : pron, num : sg, case : nom]
her → [cat : pron, num : sg, case : acc]
Rachel → [cat : propn, num : sg]
Jacob → [cat : propn, num : sg]
a → [cat : d, num : sg]
two → [cat : d, num : pl]

Infinitival verb phrases

The next step is to account for infinitival verb phrases

This can be easily done by adding a new feature, vform, to verbal projections

The values of this feature can represent the form of the verb: fin for finite verbs and inf for infinitival ones

to work → [cat : v, vform : inf, subcat : elist, subj : [cat : np]]


The lexical entry of promise

promised → [cat : v, vform : fin, subcat : 〈[cat : np, case : acc], [cat : v, vform : inf, subj : ①]〉, subj : ① [cat : np, case : nom], num : [ ]]


A derivation tree for Jacob promised Laban to work

[cat : s]
  ① [cat : np, case : ⑥ nom]
    [cat : propn, case : ⑥]
      Jacob
  [cat : v, vform : fin, subj : ①, subcat : elist]
    [cat : v, vform : fin, subj : ①, subcat : 〈③〉]
      [cat : v, vform : fin, subj : ①, subcat : 〈②, ③〉]
        promised
      ② [cat : np, case : ⑦ acc]
        [cat : propn, case : ⑦]
          Laban
    ③ [cat : v, vform : inf, subj : ①]
      to work


The lexical entry of persuade

persuaded → [cat : v, vform : fin, subcat : 〈① [cat : np, case : acc], [cat : v, vform : inf, subj : ①]〉, subj : [cat : np, case : nom], num : [ ]]
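The difference between subject control (promise) and object control (persuade) is only in which noun phrase the tag on the infinitival complement's subj points to. A minimal Python sketch, assuming reentrancy is modelled by sharing one dict object, and using a hypothetical feature name purely to make the sharing visible:

```python
def promised():
    """Subject control: the complement's subj is the matrix subject."""
    subj = {"cat": "np", "case": "nom"}                 # tagged 1
    obj = {"cat": "np", "case": "acc"}
    inf = {"cat": "v", "vform": "inf", "subj": subj}    # subj : 1 (same object)
    return {"cat": "v", "vform": "fin", "subcat": [obj, inf], "subj": subj}


def persuaded():
    """Object control: the complement's subj is the matrix object."""
    subj = {"cat": "np", "case": "nom"}
    obj = {"cat": "np", "case": "acc"}                  # tagged 1
    inf = {"cat": "v", "vform": "inf", "subj": obj}     # subj : 1 (same object)
    return {"cat": "v", "vform": "fin", "subcat": [obj, inf], "subj": subj}


def understood_subject(verb, subject, obj):
    """Fill in the matrix subject and object, then read off who 'works'.

    The 'name' feature is an assumption made for this illustration only.
    """
    verb["subj"]["name"] = subject
    verb["subcat"][0]["name"] = obj
    return verb["subcat"][1]["subj"]["name"]


print(understood_subject(promised(), "Jacob", "Laban"))
print(understood_subject(persuaded(), "Jacob", "Laban"))
```

Because the shared value is one object, any information added to the controller is immediately visible as the subject of the infinitival verb phrase.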


Linguistic applications Constituent coordination

Constituent coordination

N: no man lift up his [hand] or [foot] in all the land of Egypt

NP: Jacob saw [Rachel] and [the sheep of Laban]

VP: Jacob [went on his journey] and [came to the land of the people of the east]

VP: Jacob [went near], and [rolled the stone from the well’s mouth], and [watered the flock of Laban his mother’s brother].

ADJ: every [speckled] and [spotted] sheep

ADJP: Leah was [tender eyed] but [not beautiful]

S: [Leah had four sons], but [Rachel was barren]

S: she said to Jacob, “[Give me children], or [I shall die]!”


Coordination in CFG

S → S Conj S
NP → NP Conj NP
VP → VP Conj VP
...
Conj → and, or, but, . . .


Coordination in UG

[cat : ①] → [cat : ①] [cat : conj] [cat : ①]


Example (Coordination)

[cat : ① v]
  [cat : ①, num : [ ], sc : elist]
    [cat : v, num : [ ], sc : 〈②〉]
      rolled
    ② [cat : np, num : sg]
      the stone
  [cat : conj]
    and
  [cat : ①, num : [ ], sc : elist]
    [cat : v, num : [ ], sc : 〈③〉]
      watered
    ③ [cat : np, num : [ ]]
      the sheep


Tough issues in coordination

Coordination of conjunctions

Properties of the conjoined phrases

Coordination of unlikes

Non-constituent coordination


Coordination

Example (Ruling out coordination in UG)

[cat : ①, conj : −] → [cat : ①, conj : +] [cat : conj] [cat : ①, conj : +]


Example (Properties of the conjoined phrases)

[cat : ① np, num : ??, pers : ??, gen : ??]
  [cat : ①, num : ④, pers : ②, gen : ⑧]
    [cat : pron, num : ④, pers : ② second, gen : ⑧]
      you
  [cat : conj]
    and
  [cat : ①, num : ⑥, pers : ③, gen : ⑦]
    [cat : d, num : ⑥]
      a
    [cat : n, num : ⑥ sg, pers : ③ third, gen : ⑦]
      lamb


Example (Coordination of unlikes)

Joseph became wealthy
Joseph became a minister
Joseph became [wealthy and a minister]
Joseph grew wealthy
∗Joseph grew a minister
∗Joseph grew [wealthy and a minister]


[cat : ① ⊓ ②] → [cat : ①] [cat : conj] [cat : ②]

where ‘⊓’ is the generalization operator
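Generalization keeps only the information on which two feature structures agree. A minimal Python sketch, assuming feature structures are nested dicts without reentrancy; as a simplification, features whose values clash are dropped here instead of being kept with an empty value:

```python
def generalize(a, b):
    """Return the information shared by a and b, or None if none is shared.

    a and b are nested dicts (feature structures) or atomic strings.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        shared = {}
        for feat in sorted(a.keys() & b.keys()):
            g = generalize(a[feat], b[feat])
            if g is not None:
                shared[feat] = g      # keep only what both sides agree on
        return shared
    return a if a == b else None


# Categories decomposed into the binary features v and n.
WEALTHY = {"v": "+", "n": "+"}        # adjective
A_MINISTER = {"v": "-", "n": "+"}     # noun phrase

print(generalize(WEALTHY, A_MINISTER))   # only the shared n feature survives
```

With categories decomposed in this way, a verb that requires only [n : +] of its complement accepts the coordinated phrase, while a verb that requires both [v : +] and [n : +] does not.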


[cat : [v : +]]
  [subcat : [n : +], cat : [v : +]]
    became
  [cat : [n : +]]
    [cat : [v : +, n : +]]
      wealthy
    [cat : conj]
      and
    [cat : [v : −, n : +]]
      a minister


[c : [v : +]]
  [c : [v : +], sc : [n : +]]
    [c : [v : +], sc : [v : +, n : +]]
      grew
    [c : c]
      and
    [c : [v : +], sc : [n : +]]
      remained
  [c : [n : +]]
    [c : [v : +, n : +]]
      wealthy
    [c : c]
      and
    [c : [v : −, n : +]]
      a minister


Example (Non-constituent coordination)

Rachel gave the sheep [grass] and [water]
Rachel gave [the sheep grass] and [the lambs water]
Rachel [kissed] and Jacob [hugged] Binyamin


Linguistic applications Unification grammars facilitate linguistic generalizations

Unification grammars facilitate linguistic generalizations

Compared with context-free grammars, unification grammars provide much better means for expressing linguistic generalizations

  Verb subcategorization
  Coordination

Unification grammars also provide much more informative structures than CFGs

  Agreement
  Subject/object control

Unification grammars provide a very powerful tool for expressing what other linguistic theories would call “movement”

  Gap–filler constructions
  Unbounded dependencies


Expressiveness of unification grammars Expressiveness of unification grammars

Expressiveness of unification grammars

We hinted above that unification grammars are more expressive than CFGs

Unification grammars are strictly more powerful than CFGs, even where weak generative capacity is concerned

We show two unification grammars for formal languages that are known to be trans-context-free


Unification grammars are more expressive than CFGs

G_abc generates the language L = {a^n b^n c^n | n > 0}

The signature of the grammar consists of the features cat and t and the atoms s, ap, bp, cp, at, bt, ct and end

The terminal symbols are, of course, a, b and c

The start symbol is the left-hand side of the first rule


Feature structures in this example have two features: cat, which stands for category, and t, which “counts” the length of sequences of a-s, b-s and c-s

The “category” is ap for strings of a-s, bp for b-s and cp for c-s

The categories at, bt and ct are pre-terminal categories of the words a, b and c, respectively

“Counting” is done in unary base: a string of length n is derived by an AFS (that is, an AMRS of length 1) whose depth is n

For example, the string bbb is derived by the following feature structure:

[cat : bp, t : [t : [t : end]]]
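The unary encoding can be sketched in Python as follows, assuming feature structures are nested dicts; the function names unary, block_fs and length are illustrative assumptions:

```python
def unary(n):
    """Value of the feature t for a block of n >= 1 identical symbols."""
    if n < 1:
        raise ValueError("blocks are non-empty")
    fs = "end"                   # one symbol: t : end
    for _ in range(n - 1):       # every further symbol adds one level of t
        fs = {"t": fs}
    return fs


def block_fs(cat, n):
    """Feature structure deriving n symbols of category cat (ap, bp or cp)."""
    return {"cat": cat, "t": unary(n)}


def length(t):
    """Decode a unary t value back into the length it encodes."""
    n = 1
    while t != "end":
        t = t["t"]
        n += 1
    return n


print(block_fs("bp", 3))
print(length(unary(3)))
```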


A unification grammar G_abc for {a^n b^n c^n | n > 0}

ρ1 : [cat : s] → [cat : ap, t : ③] [cat : bp, t : ③] [cat : cp, t : ③]

ρ2 : [cat : ap, t : [t : ④]] → [cat : at] [cat : ap, t : ④]

ρ3 : [cat : ap, t : end] → [cat : at]

ρ4 : [cat : bp, t : [t : ④]] → [cat : bt] [cat : bp, t : ④]


ρ5 : [cat : bp, t : end] → [cat : bt]

ρ6 : [cat : cp, t : [t : ④]] → [cat : ct] [cat : cp, t : ④]

ρ7 : [cat : cp, t : end] → [cat : ct]

[cat : at] → a

[cat : bt] → b

[cat : ct] → c
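A minimal Python sketch of how the rules ρ1 to ρ7 recognize the language: each block is reduced bottom-up to its t value, and ρ1 applies only if the three values are identical. This is a brute-force illustration under the assumption that feature structures are nested dicts, not a general unification parser; the function names are hypothetical:

```python
def derive_block(symbols, terminal):
    """Return the t value of a block of identical terminals, or None."""
    if not symbols or any(ch != terminal for ch in symbols):
        return None
    t = "end"                    # rho3 / rho5 / rho7: the last symbol
    for _ in symbols[1:]:        # rho2 / rho4 / rho6: one more symbol
        t = {"t": t}
    return t


def in_L_abc(word):
    """True iff word can be split into a-, b- and c-blocks with equal t."""
    for i in range(1, len(word)):
        for j in range(i + 1, len(word)):
            ts = (derive_block(word[:i], "a"),
                  derive_block(word[i:j], "b"),
                  derive_block(word[j:], "c"))
            # rho1: the shared tag forces the three t values to be equal
            if None not in ts and ts[0] == ts[1] == ts[2]:
                return True
    return False


print(in_L_abc("aabbcc"))
print(in_L_abc("aabbc"))
```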


Example (Derivation tree of a^2 b^2 c^2)

[cat : s]
  [cat : ap, t : ③ [t : ④ end]]
    [cat : at]
      a
    [cat : ap, t : ④ end]
      [cat : at]
        a
  [cat : bp, t : ③ [t : ④ end]]
    [cat : bt]
      b
    [cat : bp, t : ④ end]
      [cat : bt]
        b
  [cat : cp, t : ③ [t : ④ end]]
    [cat : ct]
      c
    [cat : cp, t : ④ end]
      [cat : ct]
        c


Corollary

The grammar G_abc generates the language L = {a^n b^n c^n | n > 0}.


A unification grammar G_ww for {ww | w ∈ {a, b}^+}

[cat : s] → [first : ④, rest : ②] [first : ④, rest : ②]

[first : ap, rest : [first : ④, rest : ②]] → [cat : at] [first : ④, rest : ②]

[first : bp, rest : [first : ④, rest : ②]] → [cat : bt] [first : ④, rest : ②]

[first : ap, rest : elist] → [cat : at]

[first : bp, rest : elist] → [cat : bt]

[cat : at] → a

[cat : bt] → b
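A minimal Python sketch of the idea behind G_ww: each half of the string is encoded as a first/rest list, and the start rule applies only if both halves yield the same list. It assumes feature structures are nested dicts; the function names are hypothetical:

```python
def list_fs(symbols):
    """Encode a string over {a, b} as a first/rest list of ap/bp atoms."""
    cat = {"a": "ap", "b": "bp"}
    fs = "elist"
    for ch in reversed(symbols):
        fs = {"first": cat[ch], "rest": fs}
    return fs


def in_L_ww(word):
    """True iff word = ww for some non-empty w over {a, b}."""
    if any(ch not in "ab" for ch in word):
        return False
    for i in range(1, len(word)):
        # the start rule shares both first and rest between its daughters,
        # so the two halves must be encoded by identical lists
        if list_fs(word[:i]) == list_fs(word[i:]):
            return True
    return False


print(in_L_ww("aabaab"))
print(in_L_ww("abba"))
```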


Example (A derivation tree for the string aabaab)

[cat : s]
  〈a, a, b〉
    a
    〈a, b〉
      a
      〈b〉
        b
  〈a, a, b〉
    a
    〈a, b〉
      a
      〈b〉
        b


Expressiveness of unification grammars Unification grammars and Turing machines

Unification grammars and Turing machines

How expressive are unification grammars?

They are equivalent in their weak generative power to unrestricted rewriting systems

Unification grammars are equivalent to Turing machines in their generative capacity

The languages generated by unification grammars are exactly the recursively enumerable languages

The universal recognition problem for unification grammars is undecidable: given an arbitrary unification grammar G and a string w, no algorithm can determine whether w ∈ L(G)


Turing machines

Definition (Turing machines)

A (deterministic) Turing machine (Q, Σ, ♭, δ, s, h) is a tuple such that:

Q is a finite set of states

Σ is an alphabet, not containing the symbols L, R and elist

♭ ∈ Σ is the blank symbol

s ∈ Q is the initial state

h ∈ Q is the final state

δ : (Q \ {h}) × Σ → Q × (Σ ∪ {L, R}) is a total function specifying transitions.


A configuration of a Turing machine consists of the state, the contents of the tape and the position of the head on the tape

A configuration is depicted as a quadruple (q, w_l, σ, w_r) where q ∈ Q, w_l, w_r ∈ Σ∗ and σ ∈ Σ

The contents of the tape is ♭^ω · w_l · σ · w_r · ♭^ω, and the head is positioned on the σ symbol.


A given configuration yields a next configuration, determined by the transition function δ, the current state and the character on the tape that the head points to.

The next configuration of a configuration (q, w_l, σ, w_r) is defined iff q ≠ h, in which case it is:

(p, w_l, σ′, w_r) if δ(q, σ) = (p, σ′) for σ′ ∈ Σ
(p, w_l σ, first(w_r), but-first(w_r)) if δ(q, σ) = (p, R)
(p, but-last(w_l), last(w_l), σ w_r) if δ(q, σ) = (p, L)


where:

first(σ_1 ⋯ σ_n) = σ_1 if n > 0, and ♭ if n = 0

but-first(σ_1 ⋯ σ_n) = σ_2 ⋯ σ_n if n > 1, and ε if n ≤ 1

last(σ_1 ⋯ σ_n) = σ_n if n > 0, and ♭ if n = 0

but-last(σ_1 ⋯ σ_n) = σ_1 ⋯ σ_{n−1} if n > 1, and ε if n ≤ 1
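The definition of the next configuration, together with the four helper functions, can be sketched in Python as follows. The sketch assumes that tape symbols are single characters, that the blank is written "_", and that δ is given as a dict from (state, symbol) to (state, action); these representation choices are assumptions of the example:

```python
BLANK = "_"   # stands for the blank symbol


def first(w):
    return w[0] if w else BLANK


def but_first(w):
    return w[1:]          # empty string when len(w) <= 1


def last(w):
    return w[-1] if w else BLANK


def but_last(w):
    return w[:-1]         # empty string when len(w) <= 1


def next_configuration(delta, final_state, config):
    """Return the next configuration, or None if the state is final."""
    q, wl, sigma, wr = config
    if q == final_state:
        return None
    p, action = delta[(q, sigma)]
    if action == "R":
        return (p, wl + sigma, first(wr), but_first(wr))
    if action == "L":
        return (p, but_last(wl), last(wl), sigma + wr)
    return (p, wl, action, wr)            # action is the symbol to write


# A two-step machine: write a, then move right and halt.
DELTA = {("s", BLANK): ("q", "a"), ("q", "a"): ("h", "R")}
config = ("s", "", BLANK, "")
while config is not None:
    print(config)
    config = next_configuration(DELTA, "h", config)
```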


A next configuration is only defined for configurations in which the state is not the final state, h

Since δ is a total function, every configuration whose state is not h has a unique next configuration

A configuration c_1 yields the configuration c_2, denoted c_1 ⊢ c_2, iff c_2 is the next configuration of c_1


Unification grammars and Turing machines: program

Define a unification grammar G_M for every Turing machine M such that the grammar generates the word halt if and only if the machine terminates on the empty input string:

L(G_M) = {halt} if M terminates on the empty input, and ∅ if M does not terminate on the empty input

If there were a decision procedure to determine whether w ∈ L(G) for an arbitrary unification grammar G, then in particular such a procedure could determine membership in the language of G_M, which simulates the Turing machine M.

Such a procedure, when applied to the problem halt ∈ L(G_M), would determine whether M terminates on the empty input, which is known to be undecidable.


Unification grammars and Turing machines

Feature structures will have three features: curr, representing the character under the head; right, representing the tape contents to the right of the head (as a list); and left, representing the tape contents to the left of the head, in reversed order

All the rules in the grammar are unit rules, and the only terminal symbol is halt. Therefore, the language generated by the grammar is necessarily either the singleton {halt} or the empty set


Unification grammars and Turing machines: signature

Let M = (Q, Σ, ♭, δ, s, h) be a Turing machine. Define a unification grammar G_M as follows:

Feats = {cat, left, right, curr, first, rest}
Atoms = Σ ∪ {start, elist}
The start symbol is [cat : start]
The only terminal symbol is halt


Unification grammars and Turing machines: rules

Two rules are defined for every Turing machine:

[cat : start] → [cat : s, curr : ♭, right : elist, left : elist]

h → halt


For every q, σ such that δ(q, σ) = (p, σ′) and σ′ ∈ Σ, the following rule is defined:

[cat : q, curr : σ, right : ④, left : ②] → [cat : p, curr : σ′, right : ④, left : ②]


For every q, σ such that δ(q, σ) = (p, R) we define two rules:

[cat : q, curr : σ, right : elist, left : ④] → [cat : p, curr : ♭, right : elist, left : [first : σ, rest : ④]]

[cat : q, curr : σ, right : [first : ④, rest : ②], left : ⑤] → [cat : p, curr : ④, right : ②, left : [first : σ, rest : ⑤]]


For every q, σ such that δ(q, σ) = (p, L) we define two rules:

[cat : q, curr : σ, right : ④, left : elist] → [cat : p, curr : ♭, right : [first : σ, rest : ④], left : elist]

[cat : q, curr : σ, right : ④, left : [first : ②, rest : ⑤]] → [cat : p, curr : ②, right : [first : σ, rest : ④], left : ⑤]
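The rule schemas above can be read as a rewriting step on a single feature structure. A minimal Python sketch, assuming feature structures are nested dicts, the blank is written "_", and δ is a dict from (state, symbol) to (state, action). Since the halting problem is undecidable, the search is cut off after a step bound, which is an assumption of the example and not part of the construction:

```python
BLANK = "_"   # stands for the blank symbol


def apply_rule(delta, fs):
    """Rewrite a configuration-encoding feature structure by one rule."""
    p, action = delta[(fs["cat"], fs["curr"])]
    sigma, left, right = fs["curr"], fs["left"], fs["right"]
    if action == "R":
        new_left = {"first": sigma, "rest": left}
        if right == "elist":
            return {"cat": p, "curr": BLANK, "right": "elist", "left": new_left}
        return {"cat": p, "curr": right["first"], "right": right["rest"],
                "left": new_left}
    if action == "L":
        new_right = {"first": sigma, "rest": right}
        if left == "elist":
            return {"cat": p, "curr": BLANK, "right": new_right, "left": "elist"}
        return {"cat": p, "curr": left["first"], "right": new_right,
                "left": left["rest"]}
    # write rule: only curr changes
    return {"cat": p, "curr": action, "right": right, "left": left}


def derives_halt(delta, start_state, final_state, max_steps):
    """True if halt is derived within max_steps rule applications, else None."""
    fs = {"cat": start_state, "curr": BLANK, "right": "elist", "left": "elist"}
    for _ in range(max_steps + 1):
        if fs["cat"] == final_state:
            return True
        fs = apply_rule(delta, fs)
    return None            # unknown: the bound was reached


DELTA = {("s", BLANK): ("q", "a"), ("q", "a"): ("h", "R")}
print(derives_halt(DELTA, "s", "h", 10))
```

A result of None only says that no derivation was found within the bound; it does not show that the language of the grammar is empty.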


Unification grammars and Turing machines: results

Lemma

Let c_1, c_2 be configurations of a Turing machine M, and A_1, A_2 be AFSs encoding these configurations, viewed as AMRSs of length 1. Then c_1 ⊢ c_2 iff A_1 ⇒ A_2 in G_M.

Theorem

A Turing machine M halts on the empty input iff halt ∈ L(G_M).

Corollary

The universal recognition problem for unification grammars is undecidable.


Unification grammars and Turing machines

Unification grammars are indeed a model of computation: every recursively enumerable set can be computed as the language generated by some unification grammar

Consider again the simulation of a Turing machine M by a unification grammar G_M

Feature structures manipulated by the grammar encode the contents of the Turing machine tape

By the end of the derivation, the pre-terminal of the terminal halt is a feature structure which encodes, in its right and left features, the contents of the tape when the Turing machine halts


w ∈ L(M) iff there exists a terminating computation of M where w is the contents of the tape

It is therefore possible to define, for each Turing machine M, a grammar G′_M such that L(G′_M) = L(M), in the following way

First, G′_M is constructed in a similar way to G_M, simulating the operation of M until it terminates (or indefinitely, in case it does not terminate)

Then, additional rules distinguish G′_M from G_M

Such rules should first copy the contents of the left feature to the beginning of the right list

Then, additional rules should pop the contents of the right list, one by one, and generate a pre-terminal for each of the list’s elements


h → [cat : shift]

[cat : shift, right : ④, left : elist] → [cat : print, right : ④]

[cat : shift, right : ②, left : [first : σ, rest : ④]] → [cat : shift, right : [first : σ, rest : ②], left : ④]

[cat : print, right : elist] → ε

[cat : print, right : [first : σ, rest : ④]] → σ [cat : print, right : ④]
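The effect of the shift and print rules can be sketched in Python as two loops over first/rest lists. The sketch assumes lists are nested dicts ending in the atom elist; like the rules as given, it ignores the curr feature. The function names are hypothetical:

```python
def to_list(symbols):
    """Encode a sequence of symbols as a first/rest list."""
    fs = "elist"
    for ch in reversed(symbols):
        fs = {"first": ch, "rest": fs}
    return fs


def shift_and_print(left, right):
    """Apply the shift rules until left is empty, then the print rules.

    left holds the tape to the left of the head in reversed order,
    right holds the tape to the right of the head.
    Returns the list of terminal symbols that is generated.
    """
    while left != "elist":                       # shift rules
        right = {"first": left["first"], "rest": right}
        left = left["rest"]
    output = []
    while right != "elist":                      # print rules
        output.append(right["first"])
        right = right["rest"]
    return output


# Tape "ab" to the left of the head (stored reversed) and "cd" to its right.
print(shift_and_print(to_list("ba"), to_list("cd")))
```

Storing the left part of the tape in reversed order is what lets the shift rules restore the original left-to-right order by simple pushes onto the right list.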


Summary Extensions and open problems

Extensions and open problems

Restricted versions of unification grammars

  Off-line parsability
  Context-free and mildly context-sensitive unification grammars
  Polynomially parsable unification grammars

Typed unification grammars

  Type hierarchies
  Appropriateness specification
  Type inference

Development of large-scale grammars

  Grammar engineering
  Modularity, information encapsulation, separate compilation, ...
