context free grammar
MT Study Group - Context Free Grammar -
2015/06/18 AHCLab M1 Makoto Morishita
You can download my slides from http://goo.gl/uO2NVU
Why we use Tree-to-String Machine Translation
Phrase-Based Machine Translation
English ↔ French → Similar word order & vocabulary
English ↔ Japanese → Different word order & vocabulary
Why we use Tree-to-String Machine Translation
Tree-to-String Machine Translation
English ↔ French → Similar word order & vocabulary
English ↔ Japanese → Different word order & vocabulary
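How Tree-to-String works
[Figure: the source string 友達とご飯を食べた is segmented as 友達 と ご飯 を 食べ た and parsed into a tree (tags N, P, V, SUF; phrases PP and VP). Translation rules attached to the tree nodes, such as "a friend", "a meal" and "x1 with x0", produce the target string "I ate a meal with a friend".]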
How Tree-to-Tree works
[Figure: the source string "he visited the white house" with its tree (S, NP, VP; tags PRP, VBD, DT, NNP, NNP) is paired with the target string 彼 は ホワイト ハウス を 訪問 した and its tree (S, NP, PP, VP; tags N, P, V).]
Tree-to-String Pros and Cons
Pros
• Works well between languages with different word order & vocabulary
Cons
• Translation takes a lot of time
• Translation accuracy depends on the parser's output
You can try Tree-to-String Machine Translation (Travatar) http://ahclab.naist.jp/travatar/translator/
A CFG is composed of…
• Terminal characters ➡ X
• Non-terminal characters ➡ N
• Start variable ➡ S
• Rules ➡ R
• Weight of rules ➡ A
CFG Example
Rules
• S → NP VP | VP
• VP → NP V | PP V | VP NP
• NP → PP NP | N P
• PP → N P
• P → が | は | の | に | を
• V → 開けた | 座った
• N → 犬 | ドア | 本
Non-Terminal
• S, VP, NP, PP, P, V, N
Terminal
• が, は, の, に, を (particles), 開けた ("opened"), 座った ("sat"), 犬 ("dog"), ドア ("door"), 本 ("book")
Start Variable
• S
Derivation
S ⇒ NP VP
⇒ N P VP
⇒ 犬 P VP
⇒ 犬 が VP
⇒ 犬 が NP V
⇒ 犬 が N P V
⇒ 犬 が ドア P V
⇒ 犬 が ドア を V
⇒ 犬 が ドア を 開けた
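The derivation above can be replayed mechanically. The following is a minimal sketch, not part of the original slides: the `RULES` dictionary and the `derive` helper are names chosen here for illustration, and each step rewrites the leftmost non-terminal with the rule alternative selected by index.

```python
# Rules from the CFG example, written as {left-hand side: [right-hand sides]}.
RULES = {
    "S": [["NP", "VP"], ["VP"]],
    "VP": [["NP", "V"], ["PP", "V"], ["VP", "NP"]],
    "NP": [["PP", "NP"], ["N", "P"]],
    "PP": [["N", "P"]],
    "P": [["が"], ["は"], ["の"], ["に"], ["を"]],
    "V": [["開けた"], ["座った"]],
    "N": [["犬"], ["ドア"], ["本"]],
}


def derive(start, choices):
    """Rewrite the leftmost non-terminal once per entry in `choices`.

    Each entry is the index of the right-hand side to use.
    Returns every sentential form, beginning with [start].
    """
    form = [start]
    steps = [list(form)]
    for choice in choices:
        # Leftmost symbol that still has rules, i.e. a non-terminal.
        pos = next(i for i, sym in enumerate(form) if sym in RULES)
        form = form[:pos] + RULES[form[pos]][choice] + form[pos + 1:]
        steps.append(list(form))
    return steps


# S => NP VP => N P VP => 犬 P VP => 犬 が VP => 犬 が NP V => ...
steps = derive("S", [0, 1, 0, 0, 0, 1, 1, 4, 0])
for form in steps:
    print(" ".join(form))
```

The last printed line is the sentence 犬 が ドア を 開けた; changing the choice indices yields other sentences of the same grammar.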
Derivation Tree
[S [NP [N 犬] [P が]] [VP [NP [N ドア] [P を]] [V 開けた]]]
Parsing
CKY Algorithm (a.k.a. CYK Algorithm)
• One of the best-known parsing algorithms
• The grammar is assumed to be in Chomsky Normal Form (CNF)
Chomsky Normal Form (CNF)
• The right-hand side of every rule has either 2 non-terminal characters or 1 terminal character
In CNF: S → NP VP, S → PRP VP, VP → VBD NP, PRP → “I”, VBD → “saw”, DT → “a”
Not in CNF: VP → VBD NP PP, NP → NN, NP → PRP
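The CNF condition is easy to check rule by rule. This is a small sketch under stated assumptions: the function name `is_cnf_rhs` and the `NONTERMINALS` set are introduced here for illustration, and the rules are the nine examples from the slide.

```python
def is_cnf_rhs(rhs, nonterminals):
    """True if the right-hand side is two non-terminals or one terminal."""
    if len(rhs) == 2:
        return all(sym in nonterminals for sym in rhs)
    if len(rhs) == 1:
        return rhs[0] not in nonterminals
    return False


# Assumed non-terminal inventory for the example rules.
NONTERMINALS = {"S", "NP", "VP", "PP", "PRP", "VBD", "DT", "NN"}

EXAMPLES = [
    ("S", ["NP", "VP"]), ("S", ["PRP", "VP"]), ("VP", ["VBD", "NP"]),
    ("PRP", ["I"]), ("VBD", ["saw"]), ("DT", ["a"]),
    ("VP", ["VBD", "NP", "PP"]), ("NP", ["NN"]), ("NP", ["PRP"]),
]

results = [is_cnf_rhs(rhs, NONTERMINALS) for _, rhs in EXAMPLES]
for (lhs, rhs), ok in zip(EXAMPLES, results):
    print(lhs, "→", " ".join(rhs), "CNF" if ok else "not CNF")
```

The first six rules pass; the last three fail because they have three symbols or a single non-terminal on the right-hand side.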
CKY Algorithm Example
• Expand terminal characters with scores
Sentence: I saw him
Span 0,1 (I): PRP 1.0, NP 0.5
Span 1,2 (saw): VP 3.2, VBD 1.4
Span 2,3 (him): PRP 2.4, NP 2.6
CKY Algorithm Example
• Expand all nodes of 0,2
Span 0,2 (I saw): SBAR 5.3, S 4.7
CKY Algorithm Example
• Expand all nodes of 1,3
Span 1,3 (saw him): VP 5.0
CKY Algorithm Example
• Expand all nodes of 0,3
Span 0,3 (I saw him): SBAR 6.1, S 5.9
CKY Algorithm Example
• Find the S that covers the whole sentence, and expand all of its edges
[Figure: the chart for "I saw him" built in the previous steps; the tree is read off by following the edges down from S0,3.]
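The chart-filling steps can be sketched in a few lines. This is an illustrative sketch only: the grammar below is a hypothetical strict-CNF grammar for "I saw him" (no unary rules, unlike the chart on the slides), the costs are made-up negative log probabilities where lower is better, and `cky` and `build_tree` are names chosen here.

```python
from collections import defaultdict

# Hypothetical CNF grammar; costs are assumed values, lower is better.
LEXICAL = {            # word -> [(non-terminal, cost)]
    "I": [("NP", 0.5)],
    "saw": [("VBD", 1.4)],
    "him": [("NP", 2.6)],
}
BINARY = {             # (left child, right child) -> [(parent, cost)]
    ("VBD", "NP"): [("VP", 1.0)],
    ("NP", "VP"): [("S", 0.4)],
}


def cky(words):
    """best[(i, j)][X] is the lowest cost of X over words[i:j]; back stores how."""
    n = len(words)
    best = defaultdict(dict)
    back = {}
    # Width-1 spans: expand the terminals.
    for i, word in enumerate(words):
        for sym, cost in LEXICAL.get(word, []):
            best[(i, i + 1)][sym] = cost
            back[(i, i + 1, sym)] = word
    # Wider spans: combine two adjacent smaller spans at every split point k.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left, lcost in best[(i, k)].items():
                    for right, rcost in best[(k, j)].items():
                        for parent, rule_cost in BINARY.get((left, right), []):
                            total = lcost + rcost + rule_cost
                            if total < best[(i, j)].get(parent, float("inf")):
                                best[(i, j)][parent] = total
                                back[(i, j, parent)] = (k, left, right)
    return best, back


def build_tree(back, i, j, sym):
    """Follow back-pointers from (i, j, sym) and return a bracketed tree."""
    entry = back[(i, j, sym)]
    if isinstance(entry, str):
        return f"[{sym} {entry}]"
    k, left, right = entry
    return f"[{sym} {build_tree(back, i, k, left)} {build_tree(back, k, j, right)}]"


best, back = cky("I saw him".split())
print(build_tree(back, 0, 3, "S"))
```

The three nested loops over width, start position and split point are what give CKY its cubic running time in the sentence length.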
Another expression of tree
• Using a deduction system, the tree is expressed as the following items
犬 [N, 0, 1]
が [P, 1, 2]
[NP, 0, 2]
ドア [N, 2, 3]
を [P, 3, 4]
[NP, 2, 4]
開けた [V, 4, 5]
[VP, 2, 5]
[S, 0, 5]
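Each item [X, i, j] says that X covers the words from position i to position j, and a larger item is deduced from two adjacent smaller ones. A minimal sketch of that deduction step, with `combine` as a helper name assumed here:

```python
def combine(parent, left, right):
    """Deduction step: from [A, i, k] and [B, k, j] conclude [parent, i, j]."""
    _, i, k = left
    _, k2, j = right
    if k != k2:
        raise ValueError("items are not adjacent")
    return (parent, i, j)


# Axioms: one item per word of 犬 が ドア を 開けた.
n1, p1 = ("N", 0, 1), ("P", 1, 2)
n2, p2 = ("N", 2, 3), ("P", 3, 4)
v = ("V", 4, 5)

np1 = combine("NP", n1, p1)   # covers 犬 が
np2 = combine("NP", n2, p2)   # covers ドア を
vp = combine("VP", np2, v)    # covers ドア を 開けた
s = combine("S", np1, vp)     # covers the whole sentence
print(np1, np2, vp, s)
```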
Hyper-Graph
Suppose that there are 2 trees
Tree 1: [S [NP [N 犬] [P が]] [VP [NP [N ドア] [P を]] [V 開けた]]]
Tree 2: [S [NP [N 犬] [P が]] [VP [PP [N ドア] [P を]] [V 開けた]]]
Hyper-Graph
Almost the same
Tree 1 and Tree 2 differ only in the node above ドア を: it is NP in Tree 1 and PP in Tree 2.
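Hyper-Graph
Add the edges that exist in Tree 1 and Tree 2
[Figure: a single graph over 犬 が ドア を 開けた that contains both an NP node and a PP node above ドア を. The blue edges go through NP and the orange edges go through PP.]
If we select blue, the tree will be Tree 1. If we select orange, the tree will be Tree 2.
This is also called a Parse Forest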
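Merging trees into a forest amounts to taking the union of their edges, with nodes identified by label and span. The sketch below is illustrative: the nested-tuple tree format and the names `collect_edges` and `make_tree` are assumptions made here, not notation from the slides.

```python
def collect_edges(tree, start=0):
    """Return (node, edges) for a tree given as (label, word) or (label, [children]).

    A node is (label, i, j); an edge is (head node, tuple of tail nodes).
    """
    label, body = tree
    if isinstance(body, str):                 # pre-terminal over one word
        return (label, start, start + 1), set()
    edges, tails, pos = set(), [], start
    for child in body:
        node, child_edges = collect_edges(child, pos)
        tails.append(node)
        edges |= child_edges
        pos = node[2]                          # next child starts where this one ends
    head = (label, start, pos)
    edges.add((head, tuple(tails)))
    return head, edges


def make_tree(object_label):
    """Tree 1 uses NP above ドア を, Tree 2 uses PP."""
    return ("S", [
        ("NP", [("N", "犬"), ("P", "が")]),
        ("VP", [(object_label, [("N", "ドア"), ("P", "を")]), ("V", "開けた")]),
    ])


_, edges1 = collect_edges(make_tree("NP"))
_, edges2 = collect_edges(make_tree("PP"))
forest = edges1 | edges2                       # shared edges are stored once

into_vp = [e for e in forest if e[0] == ("VP", 2, 5)]
print(len(edges1), len(edges2), len(forest), len(into_vp))
```

Each tree contributes four edges, two of which are shared, so the forest holds six; VP2,5 ends up with two incoming edges, one per tree.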
A Weighted Acyclic Directed Hyper-Graph is composed of…
• Vertices ➡ V
• Hyper-edges ➡ E
• Target Vertex ➡ t
• Weight of edges ➡ A
Hyper-edge
$e \in E = \langle \mathrm{tails}(e), \mathrm{head}(e), \omega(e) \rangle$
• tails(e) ➡ list of start points
• head(e) ➡ end point
• ω(e) ➡ weight of e
Hyper-edge
• in(v) ➡ set of hyper-edges which go to v: $\{e \in E \mid \mathrm{head}(e) = v\}$
• out(v) ➡ set of hyper-edges which go from v: $\{e \in E \mid v \in \mathrm{tails}(e)\}$
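The definitions above map directly onto a small data structure. This is a sketch with assumed names (`Hyperedge`, `incoming`, `outgoing`) and made-up weights; the vertices are the ones from the parse forest example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    tails: tuple    # list of start points
    head: str       # end point
    weight: float   # ω(e)


def incoming(edges, v):
    """in(v): hyper-edges whose head is v."""
    return [e for e in edges if e.head == v]


def outgoing(edges, v):
    """out(v): hyper-edges that have v among their tails."""
    return [e for e in edges if v in e.tails]


# Part of the parse forest; the weights are illustrative values only.
EDGES = [
    Hyperedge(("N2,3", "P3,4"), "NP2,4", -0.1),
    Hyperedge(("N2,3", "P3,4"), "PP2,4", -0.2),
    Hyperedge(("NP2,4", "V4,5"), "VP2,5", -0.3),
    Hyperedge(("PP2,4", "V4,5"), "VP2,5", -0.4),
]

print(len(incoming(EDGES, "VP2,5")), len(outgoing(EDGES, "N2,3")))
```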
Another expression of Hyper-Graph
A Directed Hyper-Graph can be expressed as a Directed and/or Graph
• every vertex → or-vertex
• every hyper-edge → and-vertex
Another expression of Hyper-Graph
[Figure: the parse forest (S0,5; NP0,2; VP2,5; NP2,4 and PP2,4; N0,1, P1,2, N2,3, P3,4, V4,5 over 犬0,1 が1,2 ドア2,3 を3,4 開けた4,5) redrawn as an and/or graph. Each vertex is marked ∨ (for example S∨0,5, VP∨2,5, NP∨2,4, PP∨2,4) and each hyper-edge becomes a ∧ node.]
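The conversion to an and/or graph can be sketched as follows. The function name `to_and_or_graph` and the tuple encoding of nodes are assumptions made for this illustration: every vertex becomes an OR node, every hyper-edge becomes its own AND node, and only ordinary two-ended arcs remain.

```python
def to_and_or_graph(edges):
    """edges: list of (head, tails). Returns ordinary arcs (from_node, to_node)."""
    arcs = []
    for n, (head, tails) in enumerate(edges):
        and_node = ("AND", n)                  # one AND node per hyper-edge
        for tail in tails:
            arcs.append((("OR", tail), and_node))
        arcs.append((and_node, ("OR", head)))
    return arcs


# The two competing hyper-edges into VP2,5.
HYPEREDGES = [
    ("VP2,5", ("NP2,4", "V4,5")),
    ("VP2,5", ("PP2,4", "V4,5")),
]

arcs = to_and_or_graph(HYPEREDGES)
for arc in arcs:
    print(arc)
```

Choosing a derivation then means picking one AND node below each OR node and taking all children of every chosen AND node.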
Semiring
Examples of semiring
Semiring   A             ⊕            ⊗            0           1
Boolean    {0, 1}        ∨            ∧            0           1
Real       …             +            ×            0           1
Tropical   …             max          +            -∞          0
LogProb    …             logsumexp    +            -∞          0
LogReal    {-, +} × …    +_LogReal    ×_LogReal    <+, -∞>     <+, 0>
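Four of the table rows can be written down as plain Python values. This is a sketch under assumptions: the `Semiring` tuple and its field names are introduced here, Boolean values are represented with `False`/`True`, and the LogReal row is left out.

```python
import math
from typing import Any, Callable, NamedTuple


class Semiring(NamedTuple):
    plus: Callable[[Any, Any], Any]    # ⊕
    times: Callable[[Any, Any], Any]   # ⊗
    zero: Any                          # identity of ⊕
    one: Any                           # identity of ⊗


def logsumexp(a, b):
    """log(exp(a) + exp(b)), computed stably; -inf is the identity."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
REAL = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
TROPICAL = Semiring(max, lambda a, b: a + b, -math.inf, 0.0)
LOGPROB = Semiring(logsumexp, lambda a, b: a + b, -math.inf, 0.0)

# zero is neutral for ⊕ and one is neutral for ⊗ in every row.
for semiring, x in [(BOOLEAN, True), (REAL, 0.25), (TROPICAL, -1.5), (LOGPROB, -1.5)]:
    print(semiring.plus(semiring.zero, x) == x, semiring.times(semiring.one, x) == x)
```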
Semiring Parsing
• $D(G)$ : all derivations of hyper-graph G
• $d \in D(G)$ : set of hyper-edges
• $\omega(d) = \bigotimes_{e \in d} \omega(e)$ : weight of d
• $\omega(G) = \bigoplus_{d \in D(G)} \bigotimes_{e \in d} \omega(e) = \bigoplus_{d \in D(G)} \omega(d)$ : sum of weights of G
Real, LogReal Semiring → sum of weights
Tropical Semiring → weight of Viterbi derivation
Semiring Parsing
• Enumerating all derivations is difficult. ➡ inside-outside algorithm
$\omega(d) = \bigotimes_{e \in d} \omega(e)$
$\omega(G) = \bigoplus_{d \in D(G)} \bigotimes_{e \in d} \omega(e) = \bigoplus_{d \in D(G)} \omega(d)$
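The inside half of that algorithm computes ω(G) bottom-up without listing derivations. Below is a minimal sketch: the function name `inside`, the edge encoding `(head, tails, weight)` and the probabilities are all assumptions made for illustration, and vertices without incoming edges are treated as leaves with weight 1.

```python
import math
from functools import reduce


def inside(edges, root, plus, times, zero, one):
    """ω(G): ⊕ over all derivations of the ⊗ of their edge weights."""
    incoming = {}
    for head, tails, weight in edges:
        incoming.setdefault(head, []).append((tails, weight))
    memo = {}

    def visit(v):
        if v not in memo:
            if v not in incoming:              # leaf vertex: empty product
                memo[v] = one
            else:
                total = zero
                for tails, weight in incoming[v]:
                    # ω(e) ⊗ inside(u) for every tail u of the edge.
                    product = reduce(times, (visit(u) for u in tails), weight)
                    total = plus(total, product)
                memo[v] = total
        return memo[v]

    return visit(root)


# The parse forest with made-up probabilities on the two competing VP edges.
EDGES = [
    ("S0,5", ("NP0,2", "VP2,5"), 1.0),
    ("NP0,2", ("N0,1", "P1,2"), 1.0),
    ("VP2,5", ("NP2,4", "V4,5"), 0.5),
    ("VP2,5", ("PP2,4", "V4,5"), 0.25),
    ("NP2,4", ("N2,3", "P3,4"), 1.0),
    ("PP2,4", ("N2,3", "P3,4"), 1.0),
]
LOG_EDGES = [(head, tails, math.log(w)) for head, tails, w in EDGES]

# Real semiring: sum of the weights of both derivations.
total = inside(EDGES, "S0,5", lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
# Tropical semiring on log weights: weight of the Viterbi derivation.
viterbi = inside(LOG_EDGES, "S0,5", max, lambda a, b: a + b, -math.inf, 0.0)
print(total, viterbi)
```

The same function serves both purposes; only the four semiring arguments change.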
K-Best (a.k.a. N-Best)
Parse Forest (with weight)
What is the best tree? (1-best)
What is the second-best tree? (2-best)
What is the k-th best tree? (k-best)
[Figure: parse forest with vertices S0,5; NP0,2; VP2,5; N0,1; P1,2; N1,2; NP2,4; PP2,4; V4,5.]
K-Best
Left-side: {<<{N0,1, P1,2}, NP0,2>, (1,1)>, <<{N0,1, N1,2}, NP0,2>, (1,1)>, …}
(the first element is the 1-best, the second is the 2-best)
Right-side: {<<{NP2,4, V4,5}, VP2,5>, (1,1)>, <<{PP2,4, V4,5}, VP2,5>, (1,1)>, …}
(the first element is the 1-best, the second is the 2-best)
Derivation <<{NP0,2, VP2,5}, S0,5>, (1,2)>
This derivation takes the 1-best of NP0,2 (N0,1 P1,2) and the 2-best of VP2,5 (PP2,4 V4,5).
[Figure: the parse forest next to the tree selected by this derivation.]
K-Best
• D(v) = {D1(v),…,Dk(v)} : K-Best list of v
• Finding D(root(G)) is difficult
  ‣ We need to find the k-best of every node, and sort…
  ‣ It’s a heavy calculation…
➡ There is an efficient algorithm!
K-Best
• D(NP0,2) = {-0.8, -1.6, -2.4, -3.2}
• D(VP2,5) = {-0.5, -1.5, -2.5}
• ω(<{NP0,2,VP2,5}, S0,5>) = -0.3
Each cell is the D(NP0,2) entry plus the D(VP2,5) entry plus ω:
D(NP0,2) \ D(VP2,5)    -0.5    -1.5    -2.5
-0.8                   -1.6    -2.6    -3.6
-1.6                   -2.4    -3.4    -4.4
-2.4                   -3.2    -4.2    -5.2
-3.2                   -4.0    -5.0    -6.0
K-Best
1-best: -1.6 is taken, and its neighbours become cand(v)
D(NP0,2) \ D(VP2,5)    -0.5    -1.5
-0.8                   -1.6    -2.6
-1.6                   -2.4
2-best: -2.4 is taken, and its neighbours are added to cand(v)
D(NP0,2) \ D(VP2,5)    -0.5    -1.5
-0.8                   -1.6    -2.6
-1.6                   -2.4    -3.4
-2.4                   -3.2
3-best: -2.6 is taken, and its neighbours are added to cand(v)
D(NP0,2) \ D(VP2,5)    -0.5    -1.5    -2.5
-0.8                   -1.6    -2.6    -3.6
-1.6                   -2.4    -3.4
-2.4                   -3.2
We can skip most of the calculation
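The frontier search over the grid can be sketched with a priority queue. This is an illustration rather than the exact algorithm from the slides: `k_best_combinations` is a name chosen here, and it handles a single hyper-edge with two tails whose k-best lists are already sorted best-first.

```python
import heapq


def k_best_combinations(left, right, edge_weight, k):
    """Lazily enumerate the k best values of left[i] + right[j] + edge_weight.

    `left` and `right` must be sorted best-first (largest score first).
    Returns (score, (rank in left, rank in right)) pairs with 1-based ranks.
    """
    # heapq is a min-heap, so scores are stored negated.
    heap = [(-(left[0] + right[0] + edge_weight), 0, 0)]
    seen = {(0, 0)}
    result = []
    while heap and len(result) < k:
        neg_score, i, j = heapq.heappop(heap)
        result.append((-neg_score, (i + 1, j + 1)))
        # Only the two neighbours of the popped cell join cand(v).
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                score = left[ni] + right[nj] + edge_weight
                heapq.heappush(heap, (-score, ni, nj))
    return result


D_NP = [-0.8, -1.6, -2.4, -3.2]   # D(NP0,2)
D_VP = [-0.5, -1.5, -2.5]         # D(VP2,5)

top3 = k_best_combinations(D_NP, D_VP, -0.3, 3)
for score, ranks in top3:
    print(round(score, 6), ranks)
```

For the 3-best only six of the twelve cells are ever computed, which is the saving the grids illustrate.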