Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing
Andreas Maletti
Institute of Computer Science
Universität Leipzig, Germany
Krippen — March 28, 2019
Final Round of Jeopardy!
Scores
IBM Watson: $36,681
Brad Rutter: $5,400
Ken Jennings: $2,400
Final Category: U.S. cities
Answer: Its largest airport was named for a World War II hero; its second largest, for a World War II battle.
Watson (bet $947): What is Toronto????? (confidence ≈ 30%)
Brad (bet $5,000): What is Chicago?
Ken (bet $2,400): What is Chicago?
Information Retrieval
Wikipedia page of Toronto Pearson International Airport
Lester B. Pearson International Airport [...] is the primary international airport serving Toronto, its metropolitan area, and surrounding region known as the Golden Horseshoe in the province of Ontario, Canada.
It is the largest and busiest airport in Canada, the second-busiest international air passenger gateway in the Americas, and the 31st-busiest airport in the world [...].
The airport is named in honour of Lester B. Pearson, Nobel Peace Prize laureate and 14th Prime Minister of Canada.
Mention Detection is the task of identifying all instances of the same entity
Mention Detection
1. Identify all mentions (= noun phrases)
2. Identify which mentions reference the same entity
Parse tree (bracketed notation):
(S (NP (PRP It))
   (VP (VBZ is)
       (NP (NP (NP (DT the) (ADJP (JJS largest) (CC and) (JJS busiest)) (NN airport))
               (PP (IN in) (NP (NNP Canada))))
           (, ,)
           (NP (NP (DT the) (JJ second-busiest) (JJ international) (NN air) (NN passenger) (NN gateway))
               (PP (IN in) (NP (DT the) (NNPS Americas))))
           (, ,)
           (CC and)
           (NP (NP (DT the) (JJ 31st-busiest) (NN airport))
               (PP (IN in) (NP (DT the) (NN world)))))))
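Step 1 above (mentions = noun phrases) can be illustrated with a small sketch. It assumes the parse is available in bracketed notation and uses only the first conjunct of the slide's sentence to stay short; the helper names `parse`, `words`, and `mentions` are hypothetical and not part of the talk.

```python
# Shortened version of the slide's tree (first conjunct only), in bracketed notation.
TREE = ("(S (NP (PRP It)) (VP (VBZ is) (NP (NP (DT the) (ADJP (JJS largest) "
        "(CC and) (JJS busiest)) (NN airport)) (PP (IN in) (NP (NNP Canada))))))")


def parse(text):
    """Parse a bracketed tree into nested lists [label, child, ...]; words are strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])   # a word at a leaf
                pos += 1
        pos += 1                      # skip ")"
        return [label, *children]

    return node()


def words(tree):
    """Left-to-right list of the words below a node."""
    out = []
    for child in tree[1:]:
        out.extend([child] if isinstance(child, str) else words(child))
    return out


def mentions(tree):
    """All NP-labeled subtrees, returned as strings, in pre-order."""
    found = [" ".join(words(tree))] if tree[0] == "NP" else []
    for child in tree[1:]:
        if not isinstance(child, str):
            found.extend(mentions(child))
    return found


for mention in mentions(parse(TREE)):
    print(mention)
```

Deciding which of these mentions refer to the same entity (step 2) is a separate problem that this sketch does not address.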
Parsing
Parsing: determining the syntactic structure of a sentence subject to a given theory of syntax (encoded in the training data)
- constituent syntax
- dependency syntax
- ...
Parse tree (bracketed notation):
(S (NP-SBJ (NP (NNP Mr.) (NNP Hahn))
           (, ,)
           (NP (NP (DT the)
                   (ADJP (NP (CD 62) (HYPH -) (NN year)) (HYPH -) (JJ old))
                   (NML (NML (NN chairman)) (CC and) (NML (JJ chief) (JJ executive) (NN officer))))
               (PP (IN of) (NP (NNP Georgia) (HYPH -) (NNP Pacific) (NNP Corp.)))))
   (VP (VBZ is)
       (VP (VBG leading)
           (NP (NP (NP (DT the) (NML (NN forest) (HYPH -) (NN product)) (NN concern) (POS ’s))
                   (JJ unsolicited)
                   (NML (QP ($ $) (CD 3.19) (CD billion)) (-NONE- *U*))
                   (NN bid))
               (PP (IN for) (NP (NNP Great) (NNP Northern) (NNP Nekoosa) (NNP Corp))))))
   (. .))
Constituent Parsing
Example: We must bear in mind the Community as a whole
(S (NP (PRP We))
   (VP (MD must)
       (VP (VB bear)
           (PP (IN in) (NP (NN mind)))
           (NP (NP (DT the) (NN Community))
               (PP (IN as) (NP (DT a) (NN whole)))))))
POS-tag: part-of-speech tag, “class” of a word
Constituent Parsing
today: Linear-time dependency models; optimized by neural networks
2016: Subcategorization
  manual: Collins (1999), Stanford (2003), BLLIP (2005)
  automatic, e.g. Berkeley (2007)
2000: Statistical approach (cheap, automatically trained)
  Penn and WSJ tree bank (1M and 30M words)
  automatically obtained weighted CFG
1990: Chomskyan approach (perfect analysis, poor coverage)
  hand-crafted CFG, TAG (refined via POS tags)
  corrections and selection by human annotators
Constituent Parsing
All models use weights for disambiguation:
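Reading 1 ("her duck" as a noun phrase):
(S (NP (PRP We)) (VP (VBD saw) (NP (PRP$ her) (NN duck))))
Reading 2 ("her duck" as an embedded clause):
(S (NP (PRP We)) (VP (VBD saw) (S-BAR (S (NP (PRP her)) (VP (VBP duck))))))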
Weight Structure
Definition (Commutative semiring). An algebraic structure $(S, +, \cdot, 0, 1)$ is a commutative semiring if
- $(S, +, 0)$ and $(S, \cdot, 1)$ are commutative monoids
- $\cdot$ distributes over finite sums ($s \cdot 0 = 0$ for all $s \in S$)
Examples
- semiring $(\mathbb{N}, +, \cdot, 0, 1)$ of nonnegative integers
- semiring $(\mathbb{Q}, +, \cdot, 0, 1)$ of rational numbers
- Viterbi semiring $([0, 1], \max, \cdot, 0, 1)$ of probabilities
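The three example semirings can be written down as data, and the axioms spot-checked on a few sample values. This is only a sketch: the `Semiring` class, the constant names, and `check_axioms` are hypothetical, and testing finitely many samples does not prove the axioms. Exact fractions are used so that equality tests are not disturbed by floating-point rounding.

```python
from dataclasses import dataclass
from fractions import Fraction
from functools import reduce
from itertools import product
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]
    zero: Any
    one: Any

    def sum(self, values):
        """Finite sum with respect to `add` (the empty sum is `zero`)."""
        return reduce(self.add, values, self.zero)

    def product(self, values):
        """Finite product with respect to `mul` (the empty product is `one`)."""
        return reduce(self.mul, values, self.one)


NATURALS = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)
RATIONALS = Semiring(lambda a, b: a + b, lambda a, b: a * b, Fraction(0), Fraction(1))
VITERBI = Semiring(max, lambda a, b: a * b, Fraction(0), Fraction(1))


def check_axioms(sr, samples):
    """Spot-check the commutative-semiring axioms on all triples of samples."""
    for a, b, c in product(samples, repeat=3):
        ok = (
            sr.add(a, b) == sr.add(b, a)
            and sr.mul(a, b) == sr.mul(b, a)
            and sr.add(sr.add(a, b), c) == sr.add(a, sr.add(b, c))
            and sr.mul(sr.mul(a, b), c) == sr.mul(a, sr.mul(b, c))
            and sr.add(a, sr.zero) == a
            and sr.mul(a, sr.one) == a
            and sr.mul(a, sr.zero) == sr.zero
            and sr.mul(a, sr.add(b, c)) == sr.add(sr.mul(a, b), sr.mul(a, c))
        )
        if not ok:
            return False
    return True


probabilities = [Fraction(0), Fraction(1, 5), Fraction(1, 2), Fraction(1)]
print(check_axioms(NATURALS, [0, 1, 2, 3]))
print(check_axioms(VITERBI, probabilities))
```

In the Viterbi semiring, "summing" the weights of competing analyses keeps only the best one, which is what disambiguation by weight needs.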
Weighted Tree Automaton
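Fix a commutative semiring $(S, +, \cdot, 0, 1)$
Definition (Weighted tree automaton [Berstel, Reutenauer 1982]). A tuple $(Q, \Sigma, I, R)$ is a weighted tree automaton if it consists of
- a finite set $Q$ of states
- a finite set $\Sigma$ of terminals
- initial states $I \subseteq Q$
- a finite set $R$ of weighted rules of the form $q \xrightarrow{s} \sigma(q_1, \dots, q_k)$ ($s \in S$, $\sigma \in \Sigma$, $k \geq 0$, $q, q_1, \dots, q_k \in Q$)
Example rules
$q_4 \xrightarrow{0.4} \mathrm{VP}(q_5, q_2, q_3)$
$q_0 \xrightarrow{0.8} \mathrm{S}(q_1, q_4)$
$q_0 \xrightarrow{0.2} \mathrm{S}(q_6, q_2)$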
Weighted Tree Automaton
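Definition (Derivation semantics and recognized tree language). Let $(Q, \Sigma, I, R)$ be a weighted tree automaton.
- Derivation step, for the state $q \in Q$ labeling the leftmost state-labeled leaf position and every rule $q \xrightarrow{s} r \in R$: $q \overset{s}{\Longrightarrow}_G r$
- A derivation $D\colon t \overset{s_1}{\Longrightarrow} \cdots \overset{s_n}{\Longrightarrow} u$ from $t$ to $u$ has weight $\prod_{i=1}^{n} s_i$
- Recognized weighted tree language $L$:
$$L(t) = \sum_{q \in I} \Bigl( \sum_{D \text{: derivation from } q \text{ to } t} \mathrm{weight}(D) \Bigr)$$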
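The sum over derivations can be evaluated recursively over the tree. The sketch below assumes that each leftmost derivation corresponds to exactly one choice of rule per tree node, so that distributivity lets the sum be pushed inside the products. The automaton is a hypothetical toy example (the rules on the slides contain no nullary rules and therefore derive no complete tree), and `RULES`, `INITIAL`, `weight`, and `language` are illustrative names.

```python
from fractions import Fraction as F

# Hypothetical toy automaton: (state, weight, symbol, child states)
RULES = [
    ("q0", F(4, 5), "S", ("q1", "q2")),
    ("q0", F(1, 5), "S", ("q2", "q1")),
    ("q1", F(1, 2), "a", ()),
    ("q2", F(2, 5), "a", ()),
]
INITIAL = {"q0"}


def weight(state, tree, add, zero):
    """Combine (with `add`) the weights of all derivations from `state` to `tree`."""
    symbol, *subtrees = tree
    total = zero
    for q, s, sigma, child_states in RULES:
        if q == state and sigma == symbol and len(child_states) == len(subtrees):
            term = s
            for child_state, subtree in zip(child_states, subtrees):
                term *= weight(child_state, subtree, add, zero)
            total = add(total, term)
    return total


def language(tree, add=lambda a, b: a + b, zero=F(0)):
    """L(tree): combine the derivation weights over all initial states."""
    total = zero
    for q in INITIAL:
        total = add(total, weight(q, tree, add, zero))
    return total


tree = ("S", ("a",), ("a",))
print(language(tree))            # ordinary sum over both derivations
print(language(tree, add=max))   # Viterbi semiring: weight of the best derivation
```

Swapping `add` from `+` to `max` changes the semiring from the rationals to the Viterbi semiring without touching the recursion, which is the point of fixing the weight structure abstractly.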
Constituent Parsing
F1-score by grammar (columns: |w| ≤ 40, full; rows with a single value list it as extracted)
CFG: 62.7
TSG [Post, Gildea, 2009]: 82.6
TSG [Cohn et al., 2010]: 85.4, 84.7
CFGsub [Collins, 1999]: 88.6, 88.2
CFGsub [Petrov, Klein, 2007]: 90.6, 90.1
CFGsub [Petrov, 2010]: 91.8
TSGsub [Shindo et al., 2012]: 92.9, 92.4
Local Tree Languages
CFG (in tree-generation mode) = Local tree grammar
Definition (Local tree grammar [Gécseg, Steinby 1984]). A local tree grammar $(\Sigma, I, P)$ is a finite set $P$ of weighted branchings $\sigma \xrightarrow{s} w$ (with $\sigma \in \Sigma$, $s \in S$, and $w \in \Sigma^k$) together with a set $I \subseteq \Sigma$ of root labels
Local Tree Languages
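Example (with root label S)
$\mathrm{S} \xrightarrow{0.3} \mathrm{NP}_1\ \mathrm{VP}_2$
$\mathrm{VP}_2 \xrightarrow{0.6} \mathrm{MD}\ \mathrm{VP}_3$
$\mathrm{NP}_2 \xrightarrow{0.4} \mathrm{NP}_2\ \mathrm{PP}$
$\mathrm{VP}_3 \xrightarrow{0.2} \mathrm{VB}\ \mathrm{PP}\ \mathrm{NP}_2$
$\mathrm{MD} \xrightarrow{0.1} \text{must}$
...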
Generated tree (bracketed notation):
(S (NP1 (PRP We))
   (VP2 (MD must)
        (VP3 (VB bear)
             (PP (IN in) (NP1 (NN mind)))
             (NP2 (NP2 (DT the) (NN Community))
                  (PP (IN as) (NP2 (DT a) (NN whole)))))))
Local Tree Languages
not closed under union
these singletons are local
(S (NP2 (PRP$ My) (NN dog)) (VP1 (VBZ sleeps)))
(S (NP2 (DT The) (NN candidates)) (VP2 (VBD scored) (ADVP (RB well))))
but their union cannot be local
(as we also generate these trees: overgeneralization)
these trees are generated as well, although neither belongs to the union
(S (NP2 (DT The) (NN candidates)) (VP1 (VBZ sleeps)))
(S (NP2 (PRP$ My) (NN dog)) (VP2 (VBD scored) (ADVP (RB well))))
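The overgeneralization can be reproduced mechanically. The sketch below collects the branchings of the two singleton trees (weights ignored) and enumerates every tree those branchings license from root S. The encoding of trees as nested tuples and the names `branchings` and `generate` are assumptions made for this illustration; the enumeration only terminates because these particular branchings are not recursive.

```python
from itertools import product

# Trees as nested tuples: (label, child, ...); a word is a tuple with no children.
t1 = ("S", ("NP2", ("PRP$", ("My",)), ("NN", ("dog",))),
      ("VP1", ("VBZ", ("sleeps",))))
t2 = ("S", ("NP2", ("DT", ("The",)), ("NN", ("candidates",))),
      ("VP2", ("VBD", ("scored",)), ("ADVP", ("RB", ("well",)))))


def branchings(tree):
    """All (label, child labels) pairs occurring in the tree."""
    label, *children = tree
    result = {(label, tuple(child[0] for child in children))}
    for child in children:
        result |= branchings(child)
    return result


def generate(label, rules):
    """All trees rooted in `label` that use only the given branchings."""
    trees = []
    for lhs, rhs in rules:
        if lhs == label:
            options = [generate(child, rules) for child in rhs]
            trees.extend((label, *combo) for combo in product(*options))
    return trees


rules = branchings(t1) | branchings(t2)
language = set(generate("S", rules))
extra = language - {t1, t2}
print(len(language), len(extra))
```

Besides the two swapped trees shown above, the enumeration also mixes the nouns (for example "My candidates"), because both NN branchings are available under either determiner.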
Local Tree Languages
Question
Given a regular tree language L, determine whether L is local
Answer
decidable
Tree Substitution Languages
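Definition (Tree substitution grammar [Joshi, Schabes 1997]). A tree substitution grammar $(\Sigma, I, P)$ is a finite set $P$ of weighted fragments $\mathrm{root}(t) \xrightarrow{s} t$ (with $s \in S$ and $t \in T_\Sigma$) together with a set of root labels $I \subseteq \Sigma$
Typical fragments [Post 2011]
$\mathrm{VP} \xrightarrow{0.2} \mathrm{VP}(\mathrm{VBD}, \mathrm{NP}(\mathrm{CD}), \mathrm{PP})$
$\mathrm{S} \xrightarrow{0.4} \mathrm{S}(\mathrm{NP}(\mathrm{PRP}), \mathrm{VP})$
$\mathrm{S} \xrightarrow{0.6} \mathrm{S}(\mathrm{NP}, \mathrm{VP}(\mathrm{TO}, \mathrm{VP}))$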
Tree Substitution Languages
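(Leftmost) derivation step $\xi \overset{s}{\Rightarrow}_G \zeta$: $\xi = c[\mathrm{root}(t)]$ and $\zeta = c[t]$ for some context $c$ and $\mathrm{root}(t) \xrightarrow{s} t \in P$
for each fragment $t \in P$ with root label $\sigma$: $\sigma \overset{s}{\Longrightarrow}_G t$
recognized weighted tree language $L$:
$$L(t) = \begin{cases} \sum_{D \text{: derivation from } \mathrm{root}(t) \text{ to } t} \mathrm{weight}(D) & \text{if } \mathrm{root}(t) \in I \\ 0 & \text{otherwise} \end{cases}$$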
Tree Substitution Languages
Fragments
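$\mathrm{S} \xrightarrow{0.3} \mathrm{S}(\mathrm{NP}_1(\mathrm{PRP}), \mathrm{VP}_2)$
$\mathrm{PRP} \xrightarrow{0.5} \mathrm{PRP}(\text{We})$
$\mathrm{VP}_2 \xrightarrow{0.1} \mathrm{VP}_2(\mathrm{MD}, \mathrm{VP}_3(\mathrm{VB}, \mathrm{PP}, \mathrm{NP}_2))$
$\mathrm{MD} \xrightarrow{0.2} \mathrm{MD}(\text{must})$
Derivation (tree reached after applying the four fragments; VB, PP, and NP2 are still open):
(S (NP1 (PRP We))
   (VP2 (MD must)
        (VP3 VB PP NP2)))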
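The derivation can be replayed step by step. The sketch below assumes that each root label has exactly one fragment (true for the four fragments listed, not in general) and reads the left-hand side of the third fragment as VP2, matching the root of its tree. `FRAGMENTS`, `rewrite_leftmost`, and `derive` are hypothetical names, and the weights are stored as exact fractions.

```python
from fractions import Fraction

# root label -> (weight, fragment); trees are nested tuples (label, child, ...)
FRAGMENTS = {
    "S": (Fraction(3, 10), ("S", ("NP1", ("PRP",)), ("VP2",))),
    "PRP": (Fraction(1, 2), ("PRP", ("We",))),
    "VP2": (Fraction(1, 10), ("VP2", ("MD",), ("VP3", ("VB",), ("PP",), ("NP2",)))),
    "MD": (Fraction(1, 5), ("MD", ("must",))),
}


def rewrite_leftmost(tree):
    """Substitute a fragment at the leftmost leaf that has one; None if no leaf does."""
    label, *children = tree
    if not children:
        if label in FRAGMENTS:
            weight, fragment = FRAGMENTS[label]
            return fragment, weight
        return None
    for i, child in enumerate(children):
        step = rewrite_leftmost(child)
        if step is not None:
            new_child, weight = step
            return (label, *children[:i], new_child, *children[i + 1:]), weight
    return None


def derive(start):
    """Apply leftmost steps until none applies; return tree, total weight, step weights."""
    tree, total, steps = (start,), Fraction(1), []
    while (step := rewrite_leftmost(tree)) is not None:
        tree, weight = step
        total *= weight
        steps.append(weight)
    return tree, total, steps


tree, total, steps = derive("S")
print(steps, total)
```

The derivation stops with VB, PP, and NP2 still open because the slide elides the remaining fragments; the product of the four step weights is the weight of this partial derivation only.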
Tree Substitution Languages
not closed under union
these languages are tree substitution languages individually
$S(C(C(C(a))), a) \in L_1$
$S(C(C(C(b))), b) \in L_2$
$L_1 = \{S(C^n(a), a) \mid n \in \mathbb{N}\}$
$L_2 = \{S(C^n(b), b) \mid n \in \mathbb{N}\}$
but their union is not
(exchange subtrees below the indicated cuts)
Tree Substitution Languages
not closed under intersection
these languages L1 and L2 are tree substitution languages individually
for n ≥ 1 and arbitrary x1, …, xn ∈ {a, b}:
S′(x1, S(x1, S(x2, S(x2, S(x3, … S(xn−1, S(xn, S(xn, c))) …))))) ∈ L1
S′(x1, S(x2, S(x2, S(x3, S(x3, … S(xn−1, S(xn−1, S(xn, c))) …))))) ∈ L2
but their intersection only contains trees with x1 = x2 = · · · = xn and is not a tree substitution language
![Page 58: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/58.jpg)
Tree Substitution Languages
not closed under complement
this language L is a tree substitution language:
S(A(A(A′(A′(a))))) ∈ L
S(B(B(B′(B′(b))))) ∈ L
S(A(A(A′(A′(b))))) ∉ L
S(B(B(A′(A′(a))))) ∉ L
but its complement is not
(exchange as indicated in red)
![Page 59: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/59.jpg)
Tree Substitution Languages
Question
Given a regular tree language L, is it a tree substitution language?
Answer
???
![Page 61: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/61.jpg)
Subcategorization
Tags: official tags often conservative
- English: ≈ 50 tags
- German: more than 200 tags, e.g. ADJA-Sup-Dat-Sg-Fem
all modern parsers use refined tags → subcategorization
but return parse over official tags → relabeling
S(NP(PRP$(My), NN(dog)), VP(VBZ(sleeps)))
Comparison: rule of subcategorized CFG vs. corresponding rule of tree automaton
S-1 → ADJP-2 S-1        S-1 → S(ADJP-2, S-1)
![Page 62: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/62.jpg)
Subcategorization
Parse over refined tags (before relabeling):
S-1(NP-4(PRP$-3(My), NN-2(dog)), VP-5(VBZ-7(sleeps)))
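The relabeling step can be sketched as follows. The sketch assumes that a refined tag is an official tag followed by a hyphen and a number, as in the example parse; the function name `relabel` and the tuple encoding of trees are hypothetical.

```python
import re

def relabel(tree):
    """Map every refined label such as 'NP-4' back to its official tag 'NP'."""
    if isinstance(tree, str):  # a word at a leaf stays unchanged
        return tree
    label, *children = tree
    # Only a trailing numeric refinement is stripped, so a tag such as
    # 'ADJA-Sup-Dat-Sg-Fem' would be left untouched.
    return (re.sub(r"-\d+$", "", label), *[relabel(c) for c in children])

refined = ("S-1",
           ("NP-4", ("PRP$-3", "My"), ("NN-2", "dog")),
           ("VP-5", ("VBZ-7", "sleeps")))
official = relabel(refined)
print(official)
```

The relabeling is deterministic and purely local: each node label is rewritten on its own, without looking at the rest of the tree.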
![Page 64: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/64.jpg)
Constituent Parsing
F1-score by grammar (columns: |w| ≤ 40, full)
CFG: 62.7
TSG [Post, Gildea, 2009]: 82.6
TSG [Cohn et al., 2010]: 85.4, 84.7
CFGsub [Collins, 1999]: 88.6, 88.2
CFGsub [Petrov, Klein, 2007]: 90.6, 90.1
CFGsub [Petrov, 2010]: 91.8
TSGsub [Shindo et al., 2012]: 92.9, 92.4
[Diagram: hierarchy of TA, TSGsub, TSG, CFGsub, CFG]
Hence:
subcategorization = finite-state
all modern models equivalent to tree automata in expressive power
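For orientation, a labelled-bracket F1-score can be computed as in the sketch below. It is a simplified illustration and not the evaluation protocol behind the cited numbers: it counts every node above the words (including tag nodes) and does not treat punctuation specially. The names `brackets` and `f1` and the second example tree are hypothetical.

```python
from collections import Counter

def brackets(tree, start=0):
    """Collect labelled spans (label, start, end) of all nodes above the words."""
    if isinstance(tree, str):  # a word covers exactly one position
        return Counter(), start + 1
    label, *children = tree
    found, end = Counter(), start
    for child in children:
        sub, end = brackets(child, end)
        found += sub
    found[(label, start, end)] += 1
    return found, end

def f1(gold, predicted):
    """Harmonic mean of bracket precision and recall."""
    g, _ = brackets(gold)
    p, _ = brackets(predicted)
    matched = sum((g & p).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(p.values())
    recall = matched / sum(g.values())
    return 2 * precision * recall / (precision + recall)

gold = ("S", ("NP", ("PRP$", "My"), ("NN", "dog")), ("VP", ("VBZ", "sleeps")))
# Hypothetical wrong parse that attaches "dog" to the verb phrase.
predicted = ("S", ("NP", ("PRP$", "My")), ("VP", ("NN", "dog"), ("VBZ", "sleeps")))
print(f1(gold, gold), f1(gold, predicted))
```

In the example, four of the six brackets of the wrong parse also occur in the gold parse, so precision and recall are both 4/6.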
![Page 67: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/67.jpg)
Parsing
Parsing: determining the syntactic structure of a given sentence subject to a given theory of syntax (encoded in the training data)
- constituent syntax
- dependency syntax
- . . .
> John saw a dog yesterday which was a Yorkshire Terrier
Practical results:
- linear-time statistical parsers
- Google’s “Parsey McParseface” [Andor et al., 2016]
- 94% F1-score; linguists achieve 96–97%
![Page 69: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/69.jpg)
Combinatory Categorial Grammars
Combinators (Compositions)
Composition rules of degree k are
ax/c, cy → axy (forward rule)
cy, ax\c → axy (backward rule)
with y = |1 c1 |2 c2 · · · |k ck, where each |i is a slash / or \
Examples
C, D/E/D\C → D/E/D (degree 0)
D/E/D, D/E\C → D/E/E\C (degree 2)
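The two rule schemes can be written down directly. The sketch below encodes a category as a target atom plus a list of (slash, argument) pairs, read left to right as on the slide. The names `Cat`, `parse`, `show`, `compose`, `forward` and `backward` are hypothetical, and `parse` only handles categories whose arguments are atomic.

```python
import re
from typing import NamedTuple, Optional, Tuple

class Cat(NamedTuple):
    target: str
    args: Tuple[Tuple[str, "Cat"], ...] = ()

def parse(text: str) -> Cat:
    """Parse a flat category such as 'D/E/D\\C' (atomic arguments only)."""
    parts = re.split(r"([/\\])", text)
    args = tuple((parts[i], Cat(parts[i + 1])) for i in range(1, len(parts), 2))
    return Cat(parts[0], args)

def show(cat: Cat) -> str:
    out = cat.target
    for slash, arg in cat.args:
        out += slash + (f"({show(arg)})" if arg.args else arg.target)
    return out

def compose(functor: Cat, argument: Cat, slash: str) -> Optional[Tuple[Cat, int]]:
    """functor = ax|c and argument = cy yield (axy, degree k), else None."""
    if not functor.args or functor.args[-1][0] != slash:
        return None
    c = functor.args[-1][1]
    n = len(c.args)
    if argument.target != c.target or argument.args[:n] != c.args:
        return None
    y = argument.args[n:]  # the k arguments carried over from cy
    return Cat(functor.target, functor.args[:-1] + y), len(y)

def forward(left: Cat, right: Cat):
    """ax/c, cy -> axy"""
    return compose(left, right, "/")

def backward(left: Cat, right: Cat):
    """cy, ax\\c -> axy"""
    return compose(right, left, "\\")

result, degree = backward(parse("C"), parse("D/E/D\\C"))
print(show(result), degree)
result, degree = forward(parse("D/E/D"), parse("D/E\\C"))
print(show(result), degree)
```

The two calls reproduce the examples above: the first yields D/E/D with degree 0, the second D/E/E\C with degree 2. A k-CCG would reject any combination whose degree exceeds k.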
![Page 71: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/71.jpg)
Combinatory Categorial Grammars
Combinatory Categorial Grammar (CCG): (Σ, A, k, I, L)
- terminal alphabet Σ and atomic categories A
- maximal degree k ∈ N ∪ {∞} of composition rules
- initial categories I ⊆ A
- lexicon L ⊆ Σ × C(A), where C(A) denotes the categories over A
Notes:
- all rules up to the given degree k are always allowed
- k-CCG = CCG using all composition rules up to degree k
![Page 73: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/73.jpg)
Combinatory Categorial Grammars
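Derivation of the input c c d d e e:
lexical entries: c: C, c: C, d: D/E/D\C, d: D/E\C, e: E, e: E
C, D/E/D\C → D/E/D (second c, first d)
D/E/D, D/E\C → D/E/E\C (second d)
C, D/E/E\C → D/E/E (first c)
D/E/E, E → D/E (first e)
D/E, E → D (second e)
2-CCG generates string language L with L ∩ c+d+e+ = {c^i d^i e^i | i ≥ 1} for initial categories {D}
L(c) = {C}
L(d) = {D/E\C, D/E/D\C}
L(e) = {E}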
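The example grammar is small enough to check by machine. The following CKY-style recognizer is a sketch under the assumption that all lexicon arguments are atomic, which holds for this lexicon; the names `parse`, `combine` and `recognize` are hypothetical.

```python
import re
from itertools import product

def parse(text):
    """Parse a flat category such as 'D/E\\C' into (target, ((slash, atom), ...))."""
    parts = re.split(r"([/\\])", text)
    return parts[0], tuple(zip(parts[1::2], parts[2::2]))

def combine(left, right, k):
    """All results of forward and backward composition of degree <= k."""
    results = set()
    for functor, argument, slash in ((left, right, "/"), (right, left, "\\")):
        (a, x), (c, y) = functor, argument
        if x and x[-1] == (slash, c) and len(y) <= k:
            results.add((a, x[:-1] + y))
    return results

def recognize(word, lexicon, initial, k):
    """CKY recognition: chart[(i, j)] holds the categories derivable for word[i:j]."""
    n = len(word)
    chart = {(i, i + 1): {parse(cat) for cat in lexicon.get(symbol, ())}
             for i, symbol in enumerate(word)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for m in range(i + 1, j):
                for l, r in product(chart[(i, m)], chart[(m, j)]):
                    cell |= combine(l, r, k)
            chart[(i, j)] = cell
    return any((cat, ()) in chart.get((0, n), set()) for cat in initial)

LEXICON = {"c": ["C"], "d": ["D/E\\C", "D/E/D\\C"], "e": ["E"]}
for word in ("cde", "ccddee", "cdde", "ccdee"):
    print(word, recognize(word, LEXICON, {"D"}, k=2))
```

With k = 2 the sketch accepts cde and ccddee and rejects cdde and ccdee, in line with the stated intersection with c+d+e+.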
![Page 74: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/74.jpg)
Combinatory Categorial Grammars
allow (deterministic) relabeling (to allow arbitrary labels)
Question
Which tree languages arise as the legal proof trees of a CCG?
![Page 75: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/75.jpg)
Machine Translation
Review translation [by Translate]
1 The room it is not narrowly was a simple, bathtub was also attached.
2 Wi-Fi, TV and I was available.
3 Church looked When morning awake open the curtain.
4 But was a little cold, morning walks was good.
Original [Japanese, © ]
1 部屋もシンプルでしたが狭くなく、バスタブもついていました。
2 Wi-Fi、テレビも利用出来ました。
3 朝起きてカーテンを開けると教会が見えました。
4 ちょっと寒かったけれど、朝の散策はグッドでしたよ。
![Page 77: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/77.jpg)
Machine Translation
Review translation [by Translate after 2016]
1 The room was simple, but it was not small, and the bathtub was also attached.
2 Wi-Fi, TV was also available.
3 When I woke up in the morning and opened the curtain, I saw the church.
4 It was a bit cold, but walking in the morning was good.
Original [Japanese, © ]
1 部屋もシンプルでしたが狭くなく、バスタブもついていました。
2 Wi-Fi、テレビも利用出来ました。
3 朝起きてカーテンを開けると教会が見えました。
4 ちょっと寒かったけれど、朝の散策はグッドでしたよ。
![Page 78: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/78.jpg)
Machine Translation
Short History:
today
  Neural networks
2016
  Reformation: phrase-based and syntax-based systems; statistical approach (cheap, automatically trained)
1991
  Dark age: rule-based systems (e.g., ); Chomskyan approach (perfect translation, poor coverage)
1960
![Page 79: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/79.jpg)
Machine Translation
[Figure: Vauquois triangle with the levels phrase, syntax and semantics between foreign and English]
Translation model: string-to-string
![Page 80: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/80.jpg)
Machine Translation
Translation model: string-to-tree
![Page 81: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/81.jpg)
Machine Translation
Translation model: tree-to-tree
![Page 82: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/82.jpg)
Machine Translation
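parallel corpus, word alignments, parse tree
English: I would like your advice about Rule 143 concerning inadmissibility
German: Könnten Sie mir eine Auskunft zu Artikel 143 im Zusammenhang mit der Unzulässigkeit geben
POS tags: KOUS PPER PPER ART NN APPR NN CD AART NN APPR ART NN VV
[Figure: word alignments between the two sentences and a parse tree over the tagged sentence with nodes PP, PP, PP, NP, S]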
![Page 83: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/83.jpg)
Machine Translation
word alignments via GIZA++ [Och, Ney: A systematic comparison of various statistical alignment models. Computational Linguistics 29(1), 2003]
![Page 84: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/84.jpg)
Machine Translation
parse tree via Berkeley parser [Petrov, Barrett, Thibaux, Klein: Learning accurate, compact, and interpretable tree annotation. Proc. ACL, 2006]
![Page 85: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/85.jpg)
Weighted Synchronous Grammars
Synchronous tree substitution grammar: productions N → (r, r1)
- nonterminal N
- right-hand side r of context-free grammar production
- right-hand side r1 of tree substitution grammar production
- (bijective) synchronization of nonterminals
[Figure: example production for S. String side r: PPER would like PPER advice PP. Tree side r1: root S with inner node NP over KOUS(Könnten) PPER PPER ART(eine) NN(Auskunft) PP VV(geben).]
variant of [M., Graehl, Hopkins, Knight: The power of extended top-down tree transducers. SIAM Journal on Computing 39(2), 2009]
![Page 87: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/87.jpg)
Synchronous Grammars
[Figure: synchronous sentential form derived from S; the linked nonterminal KOUS is selected on the English side (PPER KOUS PPER advice PP) and in the German tree]
Production application:
1 Selection of synchronous nonterminals
2 Selection of suitable production
3 Replacement on both sides
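The three steps can be sketched on a toy encoding. Here a linked nonterminal is a `Link` object that occurs once in the string side and once in the tree side. The class `Link`, the functions `replace_string`, `replace_tree` and `apply`, the link numbering, and the exact shape of the German tree (in particular NP covering ART, NN and PP) are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    """A nonterminal occurrence; equal Links on both sides are synchronized."""
    label: str
    ident: int

def replace_string(items, link, rhs):
    """Replace the linked nonterminal in the string side by the string rhs."""
    out = []
    for item in items:
        out.extend(rhs if item == link else [item])
    return out

def replace_tree(tree, link, rhs):
    """Replace the linked nonterminal in the tree side by the tree rhs."""
    if tree == link:
        return rhs
    if isinstance(tree, tuple):
        label, *children = tree
        return (label, *[replace_tree(c, link, rhs) for c in children])
    return tree

def apply(form, link, production):
    """Step 3: replacement on both sides of the synchronous form."""
    string, tree = form
    rhs_string, rhs_tree = production
    return replace_string(string, link, rhs_string), replace_tree(tree, link, rhs_tree)

# Step 1: the synchronized nonterminals (hypothetical link ids).
P1, K, P2, PP = Link("PPER", 1), Link("KOUS", 2), Link("PPER", 3), Link("PP", 4)
form = (
    [P1, K, P2, "advice", PP],
    ("S", K, P1, P2, ("NP", ("ART", "eine"), ("NN", "Auskunft"), PP), ("VV", "geben")),
)
# Step 2: a suitable production for KOUS.
production = (["would", "like"], ("KOUS", "Könnten"))
string, tree = apply(form, K, production)
print(string)
print(tree)
```

Because the replacement happens on both sides at once, the string and the tree can never get out of step, which is the point of the bijective synchronization.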
![Page 89: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/89.jpg)
Synchronous Grammars
Production application, step 2 (selection of suitable production):
KOUS → (would like, KOUS(Könnten))
![Page 90: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/90.jpg)
Synchronous Grammars
Production application, step 3 (replacement on both sides):
English side: PPER would like PPER advice PP
German side: the selected KOUS becomes KOUS(Könnten)
![Page 91: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/91.jpg)
Synchronous Grammars
[Figure: synchronous sentential form; the linked nonterminal PP is selected on the English side (PPER would like PPER advice PP) and in the German tree]
Production application:
1 synchronous nonterminals
2 suitable production
3 replacement
![Page 93: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/93.jpg)
Synchronous Grammars
Production application, step 2 (suitable production):
PP → (APPR NN CD PP, PP(APPR, NN, CD, PP))
![Page 94: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/94.jpg)
Synchronous Grammars
Production application, step 3 (replacement):
English side: PPER would like PPER advice APPR NN CD PP
German side: the selected PP becomes PP(APPR, NN, CD, PP), giving the frontier KOUS PPER PPER ART NN APPR NN CD PP VV
![Page 95: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/95.jpg)
Production Extraction
(extractable productions marked in red)
English: I would like your advice about Rule 143 concerning inadmissibility
German: Könnten Sie mir eine Auskunft zu Artikel 143 im Zusammenhang mit der Unzulässigkeit geben
POS tags: KOUS PPER PPER ART NN APPR NN CD AART NN APPR ART NN VV
[Figure: word alignments and parse tree with nodes PP, PP, PP, NP, S]
following [Galley, Hopkins, Knight, Marcu: What’s in a translation rule? Proc. NAACL, 2004]
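Whether a pair of spans can be cut out as a production depends on the word alignment. The sketch below shows a simplified consistency test: a source span and a target span fit together if no alignment link leaves the block they form. This is only the core idea and not the full procedure of the cited paper; the function name `consistent` and the toy alignment are hypothetical and do not reproduce the alignment on the slide.

```python
def consistent(src_span, tgt_span, alignment):
    """True if every link is inside both spans or outside both, and at least one is inside."""
    (s_lo, s_hi), (t_lo, t_hi) = src_span, tgt_span
    linked = False
    for s, t in alignment:
        inside_src = s_lo <= s <= s_hi
        inside_tgt = t_lo <= t <= t_hi
        if inside_src != inside_tgt:
            return False  # a link connects the inside with the outside
        linked = linked or inside_src
    return linked

# Toy alignment of (source position, target position) pairs with one crossing.
alignment = {(0, 1), (1, 0), (2, 2)}
print(consistent((0, 1), (0, 1), alignment))  # the crossing pair as one block
print(consistent((0, 0), (0, 0), alignment))  # cuts through the crossing
print(consistent((2, 2), (2, 2), alignment))  # the monotone link on its own
```

In the toy example the first and third span pairs are consistent, while the second is not because the link (0, 1) leaves the block. For tree-based extraction the target span additionally has to be covered by a single node of the parse tree.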
![Page 100: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/100.jpg)
Production Extraction
Removal of extractable production:
[Figure: the full word-aligned sentence pair and parse tree, before removal]
English: I would like your advice about Rule 143 concerning inadmissibility
German: Könnten Sie mir eine Auskunft zu Artikel 143 im Zusammenhang mit der Unzulässigkeit geben
![Page 101: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/101.jpg)
Production Extraction
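Removal of extractable production:
[Figure: sentence pair and parse tree after removal; the extracted parts are replaced by nonterminals]
English side: PPER would like your advice about Rule 143 PP
German side: Könnten Sie eine Auskunft zu Artikel 143 geben, with tags KOUS PPER PPER ART NN APPR NN CD VV and tree nodes PP, PP, NP, S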
![Page 102: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/102.jpg)
Production Extraction
Repeated production extraction:
(extractable productions marked in red)
[Figure: the reduced sentence pair and parse tree from the previous step]
English side: PPER would like your advice about Rule 143 PP
German side: Könnten Sie eine Auskunft zu Artikel 143 geben
![Page 108: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/108.jpg)
Synchronous Tree Substitution Grammars
Advantages:
very simple
implemented in framework ‘Moses’ [Koehn et al.: Moses — Open source toolkit for statistical machine translation. Proc. ACL, 2007]
“context-free”
Disadvantages:
problems with discontinuities
composition and binarization not possible [M., Graehl, Hopkins, Knight: The power of extended top-down tree transducers. SIAM Journal on Computing 39(2), 2009] [Zhang, Huang, Gildea, Knight: Synchronous Binarization for Machine Translation. Proc. NAACL, 2006]
“context-free”
![Page 110: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/110.jpg)
Evaluation
English → German translation task: (higher BLEU is better)
| Type | System | BLEU (vanilla) | BLEU (WMT 2013) | BLEU (WMT 2015) |
|---|---|---|---|---|
| string-to-string | FST | 16.8 | 20.3 | 25.2 |
| string-to-tree | STSG | 15.2 | 19.4 | 24.5 |
| tree-to-tree | STSG | 14.5 | — | 15.3 |
STSG = synchronous tree substitution grammar
Observations:
syntax-based systems competitive with manual adjustments
much less so for vanilla systems
very unfortunate situation (more supervision yields lower scores)
from [Seemann, Braune, M.: A systematic evaluation of MBOT in statistical machine translation. Proc. MT-Summit, 2015] and [Bojar et al.: Findings of the 2013 workshop on statistical machine translation. Proc. WMT, 2013] and [Bojar et al.: Findings of the 2015 workshop on statistical machine translation. Proc. WMT, 2015]
![Page 112: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/112.jpg)
Production Extraction
PPER would like your advice about Rule 143
Könnten Sie eine Auskunft zu Artikel 143 geben
KOUS PPER PPER ART NN APPR NN CD VV
PP
PP
NP
S
PP
very specific production
every production for ‘advice’ contains sentence structure (syntax “in the way”)
![Page 113: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/113.jpg)
Synchronous Grammars
Synchronous multi tree substitution grammar: $N \to (r, \langle r_1, \dots, r_n \rangle)$
variant of [M.: Why synchronous tree substitution grammars? Proc. NAACL, 2010]
nonterminal $N$
right-hand side $r$ of a context-free grammar production
right-hand sides $r_1, \dots, r_n$ of regular tree grammar productions
synchronization via a map from $\mathrm{NT}(r_1, \dots, r_n)$ to $\mathrm{NT}(r)$
![Page 114: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/114.jpg)
Synchronous Grammars
ART-NN-VV → (advice, ⟨ART(eine), NN(Auskunft), VV(geben)⟩)
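A minimal sketch of how such a production could be stored, assuming trees are nested `(label, children)` tuples with words as plain strings. The names `Production` and `tree_yield` are hypothetical and are not taken from the talk or from Moses.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# A tree is either a word (str) or a (label, children) pair.
Tree = Union[str, Tuple[str, tuple]]


@dataclass(frozen=True)
class Production:
    """Hypothetical container for one production N -> (r, <r1, ..., rn>)."""
    lhs: str                    # nonterminal N
    source: Tuple[str, ...]     # r: symbols of the string (input) side
    targets: Tuple[Tree, ...]   # r1, ..., rn: tree fragments of the output side


def tree_yield(tree: Tree) -> list:
    """Read the words of a tree fragment from left to right."""
    if isinstance(tree, str):
        return [tree]
    _label, children = tree
    words = []
    for child in children:
        words.extend(tree_yield(child))
    return words


# The production shown on the slide: one English word, three German fragments.
advice = Production(
    lhs="ART-NN-VV",
    source=("advice",),
    targets=(("ART", ("eine",)), ("NN", ("Auskunft",)), ("VV", ("geben",))),
)

target_words = [tree_yield(t) for t in advice.targets]
print(advice.source, target_words)
```

In this example the nonterminal name lists the root labels of the output fragments. The sketch relies on that only as an observation about the slide, not as a general requirement.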
![Page 116: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/116.jpg)
Synchronous Grammars
NP-VV→
ART-NN-VV about Rule 143 PP
zu Artikel 143
ART NN APPR NN CD PP VV
PP
NP
![Page 118: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/118.jpg)
Synchronous Grammars
NP-VV→
ART-NN-VV about Rule 143 PP
zu Artikel 143
ART NN APPR NN CD PP VV
PP
NP
Production application:
1. synchronous nonterminals
2. suitable production
3. replacement
![Page 121: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/121.jpg)
Synchronous Grammars
NP-VV→
advice about Rule 143 PP
eine Auskunft zu Artikel 143 geben
ART NN APPR NN CD PP VV
PP
NP
Production application:
1. synchronous nonterminals
2. suitable production
3. replacement
ART-NN-VV → (advice, ⟨ART(eine), NN(Auskunft), VV(geben)⟩)
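The three steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the Moses implementation: the classes `NT`, `Slot`, `Node` and `Production` and the function `apply_production` are hypothetical, and the exact nesting of the NP fragment is a guess reconstructed from the flattened slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NT:
    """Nonterminal occurrence on the string side."""
    name: str


@dataclass(frozen=True)
class Slot:
    """Nonterminal leaf on the tree side: component `index` of nonterminal `name`."""
    name: str
    index: int


@dataclass(frozen=True)
class Node:
    label: str
    children: tuple


@dataclass(frozen=True)
class Production:
    lhs: str
    source: tuple    # words (str) and NT objects
    targets: tuple   # trees built from Node, Slot and words (str)


def substitute(tree, name, fragments):
    """Step 3 on the tree side: replace every Slot linked to `name`."""
    if isinstance(tree, Slot):
        return fragments[tree.index] if tree.name == name else tree
    if isinstance(tree, Node):
        return Node(tree.label, tuple(substitute(c, name, fragments) for c in tree.children))
    return tree  # a word


def apply_production(outer, inner):
    """Replace the synchronous nonterminal inner.lhs everywhere in outer."""
    if NT(inner.lhs) not in outer.source:        # step 1: locate the nonterminal
        raise ValueError(f"{inner.lhs} does not occur in {outer.lhs}")
    source = []
    for symbol in outer.source:                  # step 3 on the string side
        source.extend(inner.source if symbol == NT(inner.lhs) else (symbol,))
    targets = tuple(substitute(t, inner.lhs, inner.targets) for t in outer.targets)
    return Production(outer.lhs, tuple(source), targets)


def leaves(tree):
    """Left-to-right leaves; open nonterminal leaves are shown by name."""
    if isinstance(tree, Slot):
        return [tree.name]
    if isinstance(tree, Node):
        return [w for c in tree.children for w in leaves(c)]
    return [tree]


# Assumed shape of the NP-VV production (the nesting of NP is a reconstruction).
np_vv = Production(
    "NP-VV",
    (NT("ART-NN-VV"), "about", "Rule", "143", NT("PP")),
    (Node("NP", (Slot("ART-NN-VV", 0), Slot("ART-NN-VV", 1),
                 Node("PP", (Node("APPR", ("zu",)), Node("NN", ("Artikel",)),
                             Node("CD", ("143",)))),
                 Slot("PP", 0))),
     Slot("ART-NN-VV", 2)),
)
# Step 2: the suitable production for ART-NN-VV.
advice = Production(
    "ART-NN-VV", ("advice",),
    (Node("ART", ("eine",)), Node("NN", ("Auskunft",)), Node("VV", ("geben",))),
)

result = apply_production(np_vv, advice)
print(result.source, [leaves(t) for t in result.targets])
```

One application fills three output positions at once, two inside the NP fragment and one in the separate VV fragment. This is how a single English word can be paired with the discontinuous German "eine Auskunft ... geben".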
![Page 122: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/122.jpg)
Production Extraction
(extractable productions marked in red)
PPER would like your advice about Rule 143
Könnten Sie eine Auskunft zu Artikel 143 geben
KOUS PPER PPER ART NN APPR NN CD VV
PP
PP
NP
S
PP
variant of [M.: How to train your multi bottom-up tree transducer. Proc. ACL, 2011]
![Page 126: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/126.jpg)
Synchronous Multi Tree Substitution Grammars
Advantages:
complicated discontinuities
implemented in framework ‘Moses’ [Braune, Seemann, Quernheim, M.: Shallow local multi bottom-up tree transducers in SMT. Proc. ACL, 2013]
binarizable, composable
Disadvantages:
output non-regular (tree-level) or non-context-free (string-level) (in fact, output is captured by MRTG = MCFTG without variables)
not symmetric (input context-free; output not)
![Page 128: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/128.jpg)
Evaluation
| Task | BLEU (STSG) | BLEU (SMTSG) | Productions (STSG) | Productions (SMTSG) |
|---|---|---|---|---|
| English → German | 15.0 | *15.5 | 14M | 144M |
| English → Arabic | 48.2 | *49.1 | 55M | 491M |
| English → Chinese | 17.7 | *18.4 | 17M | 162M |
| English → Polish | 21.3 | *23.4 | — | — |
| English → Russian | 24.7 | *26.1 | — | — |

STSG = synchronous tree substitution grammar
SMTSG = synchronous multi tree substitution grammar
Observations:
consistent improvements
roughly one order of magnitude more productions
SMTSGs alleviate some of the problems of syntax-based systems
from [Seemann, Braune, M.: A systematic evaluation of MBOT in statistical machine translation. Proc. MT-Summit, 2015] and [Seemann, M.: Discontinuous statistical machine translation with target-side dependency syntax. Proc. WMT, 2015]
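The observations can be checked against the reported numbers with a few lines of Python. The figures are copied from the table above; the dictionary layout and the function name `summarize` are only illustrative.

```python
# (BLEU STSG, BLEU SMTSG, productions STSG in millions, productions SMTSG in millions)
rows = {
    "English-German": (15.0, 15.5, 14, 144),
    "English-Arabic": (48.2, 49.1, 55, 491),
    "English-Chinese": (17.7, 18.4, 17, 162),
    "English-Polish": (21.3, 23.4, None, None),   # production counts not reported
    "English-Russian": (24.7, 26.1, None, None),  # production counts not reported
}


def summarize(rows):
    """Return {task: (BLEU gain, production ratio or None)}."""
    summary = {}
    for task, (bleu_stsg, bleu_smtsg, prod_stsg, prod_smtsg) in rows.items():
        gain = round(bleu_smtsg - bleu_stsg, 1)
        ratio = None if prod_stsg is None else round(prod_smtsg / prod_stsg, 1)
        summary[task] = (gain, ratio)
    return summary


summary = summarize(rows)
for task, (gain, ratio) in summary.items():
    print(f"{task}: +{gain} BLEU, production ratio {ratio}")
```

Every task shows a positive BLEU gain (between 0.5 and 2.1 points), and where production counts are reported the SMTSG has about 9 to 10 times as many productions as the STSG.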
![Page 131: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/131.jpg)
Synchronous Grammars
Evaluation properties:
rotations implementable? (for arbitrary $t_1, t_2, t_3$): $\sigma(\sigma(t_1, t_2), t_3) \mapsto \sigma(t_1, \sigma(t_2, t_3))$
symmetric?
domain regular?
range regular?
closed under composition?
following [Knight: Capturing practical natural language transformations. Machine Translation 21(2), 2007] and [May, Knight, Vogler: Efficient inference through cascades of weighted tree transducers. Proc. ACL, 2010]
Icons by interactivemania (http://www.interactivemania.com/) and UN Office for the Coordination of Humanitarian Affairs
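The rotation itself is easy to state as a function on trees. The sketch below assumes trees are `(label, (children...))` tuples with strings as leaves, and `rotate_right` is a hypothetical name. It shows only the transformation a model is asked to implement, not how a particular transducer class would realize it.

```python
def rotate_right(tree):
    """Map sigma(sigma(t1, t2), t3) to sigma(t1, sigma(t2, t3))."""
    if isinstance(tree, str):
        raise ValueError("a leaf cannot be rotated")
    label, children = tree
    if len(children) != 2:
        raise ValueError("expected a binary root")
    left, t3 = children
    if isinstance(left, str) or left[0] != label or len(left[1]) != 2:
        raise ValueError("left child must be a binary node with the same label")
    t1, t2 = left[1]
    return (label, (t1, (label, (t2, t3))))


def leaves(tree):
    """Left-to-right leaves of a tree."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1] for leaf in leaves(child)]


before = ("sigma", (("sigma", ("t1", "t2")), "t3"))
after = rotate_right(before)
print(after)
```

The subtrees `t1`, `t2`, `t3` are arbitrary, so a model that implements rotations has to move whole subtrees across a node boundary while keeping their left-to-right order.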
![Page 132: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/132.jpg)
Synchronous Grammars
Illustration of rotation:
S(NP(NNP(Alice)), VP(VBD(carries), NP(NNP(Bob))))
↦
S(NP(NNP(Bob)), VP(VBZ(is), VP(VBN(carried), PP(IN(by), NP(NNP(Alice))))))
![Page 133: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/133.jpg)
Top-down Tree Transducer
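Hasse diagram (composition closure in subscript):
$\mathrm{TOP}^{R}_{1}$
$\mathrm{TOP}_{2}$, $\text{s-TOP}^{R}_{1}$
$\text{s-TOP}_{2}$, $\text{n-TOP}^{(R)}_{1}$
$\text{ns-TOP}^{(R)}_{1}$

| Model | rotations | symmetric | domain regular | range regular | closed under composition (closure) |
|---|---|---|---|---|---|
| ns-TOP | ✗ | ✗ | ✓ | ✓ | ✓ |
| n-TOP | ✗ | ✗ | ✓ | ✓ | ✓ |
| s-TOP | ✗ | ✗ | ✓ | ✓ | ✗ (2) |
| s-TOP^R | ✗ | ✗ | ✓ | ✓ | ✓ |
| TOP | ✗ | ✗ | ✓ | ✓ | ✗ (2) |
| TOP^R | ✗ | ✗ | ✓ | ✓ | ✓ |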
![Page 134: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/134.jpg)
Synchronous Tree Substitution Grammars
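Hasse diagram (composition closure in subscript):
$\mathrm{STSG}^{R}_{3}$
$\mathrm{STSG}_{4}$
$\text{n-STSG}^{(R)}_{\infty}$, $\text{s-STSG}^{R}_{2}$
$\text{s-STSG}_{2}$, $\mathrm{TOP}^{R}_{1}$
$\text{ns-STSG}^{(R)}_{2}$
$\mathrm{TOP}_{2}$
$\text{n-TOP}^{(R)}_{1}$, $\text{s-TOP}^{R}_{1}$
$\text{s-TOP}_{2}$
$\text{ns-TOP}^{(R)}_{1}$

| Model | rotations | symmetric | domain regular | range regular | closed under composition (closure) |
|---|---|---|---|---|---|
| n-TOP | ✗ | ✗ | ✓ | ✓ | ✓ |
| TOP | ✗ | ✗ | ✓ | ✓ | ✗ (2) |
| TOP^R | ✗ | ✗ | ✓ | ✓ | ✓ |
| ns-STSG | ✓ | ✓ | ✓ | ✓ | ✗ (2) |
| n-STSG | ✓ | ✗ | ✓ | ✓ | ✗ (∞) |
| s-STSG^(R) | ✓ | ✗ | ✓ | ✓ | ✗ (2) |
| STSG | ✓ | ✗ | ✓ | ✓ | ✗ (4) |
| STSG^R | ✓ | ✗ | ✓ | ✓ | ✗ (3) |

composition closures by [Engelfriet, Fülöp, M.: Composition closure of linear extended top-down tree transducers. Theory of Computing Systems, to appear 2016]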
![Page 135: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/135.jpg)
Synchronous Multi Tree Substitution Grammars
Advantages of SMTSG:
always have regular look-ahead
can always be made nondeleting & shallow
closed under composition
Disadvantages of SMTSG:
non-regular range (theoretically interesting?)
[Engelfriet, Lilin, M.: Extended multi bottom-up tree transducers — composition and decomposition. Acta Informatica 46(8), 2009]
![Page 137: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/137.jpg)
Synchronous Multi Tree Substitution Grammars
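Hasse diagram (composition closure in subscript):
$\text{(n)-SMTSG}^{(R)}_{1}$
$\text{(n)s-SMTSG}^{(R)}_{1}$
$\mathrm{STSG}^{R}_{3}$
$\mathrm{STSG}_{4}$
$\text{n-STSG}^{(R)}_{\infty}$, $\text{s-STSG}^{R}_{2}$
$\text{s-STSG}_{2}$, $\mathrm{TOP}^{R}_{1}$
$\text{ns-STSG}^{(R)}_{2}$
$\mathrm{TOP}_{2}$
$\text{n-TOP}^{(R)}_{1}$, $\text{s-TOP}^{R}_{1}$
$\text{s-TOP}_{2}$
$\text{ns-TOP}^{(R)}_{1}$

| Model | rotations | symmetric | domain regular | range regular | closed under composition (closure) |
|---|---|---|---|---|---|
| n-TOP | ✗ | ✗ | ✓ | ✓ | ✓ |
| TOP | ✗ | ✗ | ✓ | ✓ | ✗ (2) |
| TOP^R | ✗ | ✗ | ✓ | ✓ | ✓ |
| ns-STSG | ✓ | ✓ | ✓ | ✓ | ✗ (2) |
| n-STSG | ✓ | ✗ | ✓ | ✓ | ✗ (∞) |
| s-STSG^(R) | ✓ | ✗ | ✓ | ✓ | ✗ (2) |
| STSG | ✓ | ✗ | ✓ | ✓ | ✗ (4) |
| STSG^R | ✓ | ✗ | ✓ | ✓ | ✗ (3) |
| SMTSG | ✓ | ✗ | ✓ | ✗ | ✓ |
| SMTSG, reg. range | ✓ | ✗ | ✓ | ✓ | ✓ |
| SMTSG, symmetric | ✓ | ✓ | ✓ | ✓ | ✓ |

(string-level) range characterization by [Gildea: On the string translations produced by multi bottom-up tree transducers. Computational Linguistics 38(3), 2012]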
![Page 138: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/138.jpg)
Synchronous Multi Tree Substitution Grammars
Theorem: $(\mathrm{STSG}^{R})^{3} \subsetneq$ reg.-range SMTSG
[Figure: proof illustration relating the trees $u_0$, $u_1$, $u_2$, $u_3$ through the steps (1), (2), (3); $u_0$ and $u_3$ are built from $\delta$-nodes over the subtrees $t_1, \dots, t_n$]
[M.: The power of weighted regularity-preserving multi bottom-up tree transducers. Int. J. Found. Comput. Sci. 26(7), 2015]
![Page 139: Applications of Weighted Tree Automata and Tree ... · Applications of Weighted Tree Automata and Tree Transducers in Natural Language Processing Andreas Maletti Institute of Computer](https://reader033.vdocument.in/reader033/viewer/2022050513/5f9d82541163632fa12b1895/html5/thumbnails/139.jpg)
Summary
Parsing:
tree automata = CFG with subcategorization (which are the state-of-the-art models for many languages)
wealth of open problems for non-constituent parsing (alternative theories seem to be on the rise; “Parsey McParseface”)
Machine translation:
all major (non-neural) translation models in use are grammar-based (and their expressive power is often ill-understood)
combination of parser and translation model challenging (although that is typically just a regular domain restriction)
evaluation of theoretically well-behaved models (in practice)
Thank you for your attention.