600.465 - Intro to NLP - J. Eisner 1
Finite-State Methods
600.465 - Intro to NLP - J. Eisner 2
Finite state acceptors (FSAs)
Things you may know about FSAs:
  Equivalence to regexps
  Union, Kleene *, concat, intersect, complement, reversal
  Determinization, minimization
  Pumping, Myhill-Nerode
[Figure: FSA with arcs labeled a and c]
Defines the language a? c* = {a, ac, acc, accc, …, ε, c, cc, ccc, …}
600.465 - Intro to NLP - J. Eisner 3
n-gram models not good enough
Want to model grammaticality
A “training” sentence known to be grammatical:
  BOS mouse traps catch mouse traps EOS
Resulting trigram model has to overgeneralize:
  allows sentences with 0 verbs
    BOS mouse traps EOS
  allows sentences with 2 or more verbs
    BOS mouse traps catch mouse traps catch mouse traps catch mouse traps EOS
Can’t remember whether it’s in subject or object (i.e., whether it’s gotten to the verb yet)
[Callout: trigram model must allow these trigrams]
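The overgeneralization can be checked mechanically. The sketch below assumes an unsmoothed trigram model that allows a sentence exactly when every one of its trigrams occurred in the training sentence; the helper names trigrams and allowed are made up for this example.

```python
def trigrams(tokens):
    """Return the set of all length-3 windows over a token list."""
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}


# The single training sentence from the slide.
TRAINING = "BOS mouse traps catch mouse traps EOS".split()
SEEN = trigrams(TRAINING)


def allowed(sentence):
    """True if every trigram of the sentence was seen in training."""
    return trigrams(sentence.split()) <= SEEN
```

Both the 0-verb sentence and the 3-verb sentence pass this check, since each of their trigrams also appears in the training sentence.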
600.465 - Intro to NLP - J. Eisner 4
Finite-state models can “get it”
Want to model grammaticality
  BOS mouse traps catch mouse traps EOS
Finite-state can capture the generalization here:
  Noun+ Verb Noun+
[Figure: FSA for Noun+ Verb Noun+. The preverbal states (still need a verb to reach final state) loop on Noun; a Verb arc leads to the postverbal states (verbs no longer allowed), which also loop on Noun.]
Allows arbitrarily long NPs (just keep looping around for another Noun modifier).
Still, never forgets whether it’s preverbal or postverbal! (Unlike 50-gram model)
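A minimal sketch of the Noun+ Verb Noun+ acceptor over part-of-speech tags. The state names ("start", "preverbal", "need_object", "postverbal") are hypothetical labels chosen to mirror the preverbal/postverbal distinction on the slide.

```python
# Hypothetical state names; only "postverbal" is final.
TRANSITIONS = {
    ("start", "Noun"): "preverbal",
    ("preverbal", "Noun"): "preverbal",     # loop: arbitrarily long subject NP
    ("preverbal", "Verb"): "need_object",   # the one and only verb
    ("need_object", "Noun"): "postverbal",
    ("postverbal", "Noun"): "postverbal",   # loop: arbitrarily long object NP
}
FINAL = {"postverbal"}


def grammatical(tags):
    """Accept exactly the tag sequences matching Noun+ Verb Noun+."""
    state = "start"
    for tag in tags:
        state = TRANSITIONS.get((state, tag))
        if state is None:
            return False
    return state in FINAL
```

However long the Noun loops run, the current state still records whether the verb has been seen, which is the property the trigram model lacks.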
600.465 - Intro to NLP - J. Eisner 5
How powerful are regexps / FSAs?
More powerful than n-gram models
  The hidden state may “remember” arbitrary past context
  With k states, can remember which of k “types” of context it’s in
Equivalent to HMMs
  In both cases, you observe a sequence and it is “explained” by a hidden path of states.
  The FSA states are like HMM tags.
Appropriate for phonology and morphology
  Word = Syllable+
       = (Onset Nucleus Coda?)+
       = (C+ V+ C*)+
       = ( (b|d|f|…)+ (a|e|i|o|u)+ (b|d|f|…)* )+
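The syllable pattern (C+ V+ C*)+ can be tried out directly with Python's re module. This is only a sketch: the consonant class below (every lowercase letter other than a, e, i, o, u, so y counts as a consonant) is an assumption that fills in the slide's elided (b|d|f|…) list, and it works on spelling rather than on real phonemes.

```python
import re

# Assumption: consonants are all lowercase letters except a, e, i, o, u.
CONSONANT = "[b-df-hj-np-tv-z]"
VOWEL = "[aeiou]"

# Syllable = Onset Nucleus Coda? = C+ V+ C*
SYLLABLE = f"(?:{CONSONANT}+{VOWEL}+{CONSONANT}*)"
WORD = re.compile(f"{SYLLABLE}+")


def is_word(string):
    """True if the whole string matches (C+ V+ C*)+."""
    return WORD.fullmatch(string) is not None
```

Under this pattern "banana" and "strength" are words, while "apple" is not, because the pattern as written requires every syllable to start with at least one consonant.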
600.465 - Intro to NLP - J. Eisner 6
How powerful are regexps / FSAs?
But less powerful than CFGs / pushdown automata
  Can’t do recursive center-embedding
  Hmm, humans have trouble processing those constructions too …
Finite-state can handle this pattern (can you write the regexp?):
  This is the rat that ate the malt.
  This is the cat that bit the rat that ate the malt.
  This is the dog that chased the cat that bit the rat that ate the malt.
But not this pattern, which requires a CFG:
  This is the malt that the rat ate.
  This is the malt that the rat that the cat bit ate.
  This is the malt that [the rat that [the cat that [the dog chased] bit] ate].
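One possible regexp for the right-branching pattern, as a sketch. The NOUN and VERB word lists are assumptions taken from the example sentences only, and the names are made up for illustration.

```python
import re

# Assumed vocabularies, read off the example sentences.
NOUN = "(?:rat|cat|dog|malt)"
VERB = "(?:ate|bit|chased)"

# "This is the N" followed by any number of "that V the N" tails.
RIGHT_BRANCHING = re.compile(
    rf"This is the {NOUN}(?: that {VERB} the {NOUN})*\."
)


def is_right_branching(sentence):
    """True if the whole sentence fits the right-branching pattern."""
    return RIGHT_BRANCHING.fullmatch(sentence) is not None
```

Each "that V the N" tail is complete before the next one starts, so a simple loop suffices. The center-embedded sentences need each noun phrase matched with a verb that arrives later, in nested order, which a finite-state pattern cannot do for unbounded depth.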
600.465 - Intro to NLP - J. Eisner 7
How powerful are regexps / FSAs?
But less powerful than CFGs / pushdown automata
More important: Less explanatory than CFGs
  A CFG without recursive center-embedding can be converted into an equivalent FSA, but the FSA will usually be far larger
  Because FSAs can’t reuse the same phrase type in different places
[Figure: S = an FSA for Noun+ Verb Noun+, in which the Noun loop appears twice (duplicated structure). More elegant: NP = an FSA for Noun+, and S = NP Verb NP. Using nonterminals like this is equivalent to a CFG; converting to FSA copies the NP twice.]
600.465 - Intro to NLP - J. Eisner 8
Strings vs. String Pairs
FSA = “finite-state acceptor”
  Describes a language (which strings are grammatical?)
FST = “finite-state transducer”
  Describes a relation (which pairs of strings are related?)
    underlying form, surface form
    sentence, translation
    original, edited
    …
600.465 - Intro to NLP - J. Eisner 9
Example: Edit Distance
[Figure: edit-distance lattice. Horizontal axis: position in upper string (0 to 5), with upper symbols c l a r a. Vertical axis: position in lower string (0 to 4), with lower symbols c a c a. Horizontal arcs delete an upper symbol (c:ε, l:ε, a:ε, r:ε, a:ε), vertical arcs insert a lower symbol (ε:c or ε:a), and diagonal arcs pair an upper symbol with a lower symbol (e.g. c:c, l:c, a:a, r:a).]
Cost of best path relating these two strings?
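The best-path question can be answered by dynamic programming over the same grid of (upper position, lower position) states. The sketch below assumes unit costs for insertion, deletion and substitution and zero cost for copying; the slide does not state arc costs. It also assumes the upper string is "clara" and the lower string is "caca", as read off the arc labels.

```python
def edit_distance(upper, lower, insert_cost=1, delete_cost=1, subst_cost=1):
    """Cost of the cheapest lattice path relating upper to lower.

    cost[i][j] is the best cost of relating the first j upper symbols
    to the first i lower symbols. The default costs are assumptions.
    """
    rows, cols = len(lower) + 1, len(upper) + 1
    cost = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):              # top row: delete upper symbols
        cost[0][j] = cost[0][j - 1] + delete_cost
    for i in range(1, rows):              # left column: insert lower symbols
        cost[i][0] = cost[i - 1][0] + insert_cost
    for i in range(1, rows):
        for j in range(1, cols):
            same = upper[j - 1] == lower[i - 1]
            cost[i][j] = min(
                cost[i][j - 1] + delete_cost,                     # x:ε arc
                cost[i - 1][j] + insert_cost,                     # ε:y arc
                cost[i - 1][j - 1] + (0 if same else subst_cost),  # x:y arc
            )
    return cost[-1][-1]
```

With these assumed costs, edit_distance("clara", "caca") is 2: copy c, delete l, copy a, substitute r with c, copy a.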
600.465 - Intro to NLP - J. Eisner 10
Example: Morphology
[Figure: parse-tree fragment. VP [head=vouloir,...] dominates V [head=vouloir, tense=Present, num=SG, person=P3], which dominates the word veut.]
600.465 - Intro to NLP - J. Eisner 11
Example: Unweighted transducer
[Figure: a finite-state transducer relates the canonical form plus inflection codes (vouloir +Pres +Sing +P3) to the inflected form (veut). The relevant path pairs the upper symbols v o u l o i r +Pres +Sing +P3 with the lower symbols v e u t. Beside it, the parse-tree fragment: VP [head=vouloir,...] over V [head=vouloir, tense=Present, num=SG, person=P3] over veut.]
slide courtesy of L. Karttunen (modified)
600.465 - Intro to NLP - J. Eisner 12
Example: Unweighted transducer
[Figure: the same finite-state transducer, relating the canonical form plus inflection codes (vouloir +Pres +Sing +P3) to the inflected form (veut). The relevant path pairs v o u l o i r +Pres +Sing +P3 with v e u t.]
Bidirectional: generation or analysis
Compact and fast
Xerox sells for about 20 languages including English, German, Dutch, French, Italian, Spanish, Portuguese, Finnish, Russian, Turkish, Japanese, ...
Research systems for many other languages, including Arabic, Malay
slide courtesy of L. Karttunen
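To make "bidirectional" concrete, a transducer path can be written as a list of (upper, lower) symbol pairs and read off on either side. This is a toy sketch, not the Xerox implementation: the symbol-by-symbol alignment in PATH is a hypothetical choice (the slide shows only the two sides), and the list PATHS stands in for the full set of paths through a real transducer.

```python
EPS = ""  # the empty string stands for an epsilon on one side of an arc

# Hypothetical alignment of the relevant path; only the two sides
# ("vouloir+Pres+Sing+P3" and "veut") come from the slide.
PATH = [
    ("v", "v"), ("o", "e"), ("u", "u"), ("l", "t"),
    ("o", EPS), ("i", EPS), ("r", EPS),
    ("+Pres", EPS), ("+Sing", EPS), ("+P3", EPS),
]
PATHS = [PATH]  # a real transducer encodes very many paths compactly


def upper_side(path):
    return "".join(upper for upper, _ in path)


def lower_side(path):
    return "".join(lower for _, lower in path)


def generate(canonical):
    """Upper to lower: canonical form plus codes to inflected forms."""
    return [lower_side(p) for p in PATHS if upper_side(p) == canonical]


def analyze(inflected):
    """Lower to upper: inflected form to canonical form plus codes."""
    return [upper_side(p) for p in PATHS if lower_side(p) == inflected]
```

The same PATHS data serves both directions: generate("vouloir+Pres+Sing+P3") gives ["veut"] and analyze("veut") gives ["vouloir+Pres+Sing+P3"].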
600.465 - Intro to NLP - J. Eisner 13
Regular Relation (of strings)
Relation: like a function, but multiple outputs ok
Regular: finite-state
Transducer: automaton w/ outputs
  b → ?
  a → ?
  aaaaa → ?
[Figure: transducer with arcs b:b, a:a, a:ε, a:c, b:ε, b:b, ?:c, ?:a, ?:b]
  b → {b}
  a → {}
  aaaaa → {ac, aca, acab, acabc}
Invertible? Closed under composition?
600.465 - Intro to NLP - J. Eisner 14
Regular Relation (of strings)
Can weight the arcs
  b → {b}
  a → {}
  aaaaa → {ac, aca, acab, acabc}
How to find best outputs?
  For aaaaa?
  For all inputs at once?
[Figure: the same transducer, with arcs b:b, a:a, a:ε, a:c, b:ε, b:b, ?:c, ?:a, ?:b]
600.465 - Intro to NLP - J. Eisner 15
Function from strings to ...
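              Acceptors (FSAs)     Transducers (FSTs)
Unweighted    {false, true}        strings
Weighted      numbers              (string, num) pairs
[Figure: one example machine per cell. Unweighted acceptor: arcs a, ε, c. Unweighted transducer: arcs a:x, ε:y, c:z. Weighted acceptor: arcs a/.5, ε/.5, c/.7, plus a weight .3. Weighted transducer: arcs a:x/.5, ε:y/.5, c:z/.7, plus a weight .3.]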
600.465 - Intro to NLP - J. Eisner 16
Sample functions
              Acceptors (FSAs)                      Transducers (FSTs)
Unweighted    {false, true}                         strings
              Grammatical?                          Markup, Correction, Translation
Weighted      numbers                               (string, num) pairs
              How grammatical? Better, how likely?  Good markups, Good corrections, Good translations
600.465 - Intro to NLP - J. Eisner 17
Terminology (acceptors)
[Diagram: String, Regexp, FSA, and Regular language, connected by arrows labeled: accepts (or generates), matches, compiles into, implements, defines, recognizes.]
600.465 - Intro to NLP - J. Eisner 18
Terminology (transducers)
[Diagram: String pair, Regexp, FST, and Regular relation, connected by arrows labeled: accepts (or generates), (or, transduces one string of the pair into the other), matches, compiles into, implements, defines, recognizes, ??.]
600.465 - Intro to NLP - J. Eisner 19
Perspectives on a Transducer
Remember these CFG perspectives:
  3 views of a context-free rule
    generation (production): S → NP VP (randsent)
    parsing (comprehension): S ← NP VP (parse)
    verification (checking): S = NP VP
Similarly, 3 views of a transducer:
  Given 0 strings, generate a new string pair (by picking a path)
  Given one string (upper or lower), transduce it to the other kind
  Given two strings (upper & lower), decide whether to accept the pair
[Figure: the path pairing v o u l o i r +Pres +Sing +P3 with v e u t]
FST just defines the regular relation (mathematical object: set of pairs).
What’s “input” and “output” depends on what one asks about the relation.
The 0, 1, or 2 given string(s) constrain which paths you can use.
600.465 - Intro to NLP - J. Eisner 20
Functions
[Figure: function f maps ab?d to abcd; function g is then applied to abcd]
600.465 - Intro to NLP - J. Eisner 21
Functions
Function composition: f ∘ g
[first f, then g: intuitive notation, but opposite of the traditional math notation]
[Figure: ab?d is mapped in one step by the composed function]
600.465 - Intro to NLP - J. Eisner 22
From Functions to Relations
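[Figure: relation f maps ab?d to several outputs, abcd, abed, abjd, ..., on arcs weighted 3, 2, 6; relation g maps these onward on arcs weighted 4, 2, 8]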
600.465 - Intro to NLP - J. Eisner 23
From Functions to Relations
Relation composition: f ∘ g
[Figure: paths from ab?d through f (weights 3, 2, 6) and then g (weights 4, 2, 8), ...]
600.465 - Intro to NLP - J. Eisner 24
From Functions to Relations
Relation composition: f ∘ g
[Figure: the composed relation maps ab?d directly to the final outputs, with summed weights 3+4, 2+2, 6+8, ...]
600.465 - Intro to NLP - J. Eisner 25
From Functions to Relations
Often in NLP, all of the functions or relations involved can be described as finite-state machines, and manipulated using standard algorithms.
Pick min-cost or max-prob output
[Figure: of the composed paths from ab?d, the one with cost 2+2 is picked]
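A small sketch of weighted relation composition, with relations stored as dictionaries from (input, output) pairs to costs. The weights 3, 2, 6 and 4, 2, 8 come from the slides; the final output names "out1", "out2", "out3" are hypothetical, since the slides elide them, and the pairing of weights with strings follows the order in which they are listed. Real toolkits compose the machines themselves rather than enumerating pairs.

```python
# f relates ab?d to three intermediate strings, with costs.
F = {("ab?d", "abcd"): 3, ("ab?d", "abed"): 2, ("ab?d", "abjd"): 6}
# g relates each intermediate string to a (hypothetical) final output.
G = {("abcd", "out1"): 4, ("abed", "out2"): 2, ("abjd", "out3"): 8}


def compose(f, g):
    """First f, then g: add costs along a path, keep the cheapest path."""
    result = {}
    for (x, y), cost_f in f.items():
        for (y2, z), cost_g in g.items():
            if y == y2:
                total = cost_f + cost_g
                if total < result.get((x, z), float("inf")):
                    result[(x, z)] = total
    return result


def best_output(relation, x):
    """Return the (output, cost) pair with minimum cost for input x."""
    candidates = {z: cost for (x2, z), cost in relation.items() if x2 == x}
    return min(candidates.items(), key=lambda item: item[1])
```

Here compose(F, G) assigns ab?d the costs 3+4, 2+2 and 6+8, and best_output picks the 2+2 path.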
600.465 - Intro to NLP - J. Eisner 26
Building a lexical transducer
[Diagram: a Regular Expression Lexicon (big | clear | clever | ear | fat | ...) goes through the Compiler to give a Lexicon FSA; Regular Expressions for Rules go through the Compiler to give Composed Rule FSTs; composition of the two yields the Lexical Transducer (a single FST). One path of the lexical transducer pairs b i g +Adj +Comp with b i g g e r.]
slide courtesy of L. Karttunen (modified)
600.465 - Intro to NLP - J. Eisner 27
Building a lexical transducer
Actually, the lexicon must contain elements like big +Adj +Comp
So write it as a more complicated expression:
    (big | clear | clever | fat | ...) +Adj (ε | +Comp | +Sup)    adjectives
  | (ear | father | ...) +Noun (+Sing | +Pl)                      nouns
  | ...                                                            ...
Q: Why do we need a lexicon at all?
[Diagram: the Regular Expression Lexicon (big | clear | clever | ear | fat | ...) is compiled into the Lexicon FSA]
slide courtesy of L. Karttunen (modified)
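To see what the more complicated lexicon expression denotes, the sketch below enumerates its strings for the words actually listed on the slide. The elided "..." entries are left out rather than guessed, and the function name lexicon is an assumption for illustration. A real lexicon compiler would build an FSA instead of a Python set.

```python
from itertools import product

# Only the stems named on the slide; the "..." entries are omitted.
ADJECTIVES = ["big", "clear", "clever", "fat"]
NOUNS = ["ear", "father"]


def lexicon():
    """Enumerate the upper-side strings the lexicon expression allows."""
    entries = set()
    # (big | clear | ...) +Adj (ε | +Comp | +Sup)
    for stem, degree in product(ADJECTIVES, ["", " +Comp", " +Sup"]):
        entries.add(f"{stem} +Adj{degree}")
    # (ear | father | ...) +Noun (+Sing | +Pl)
    for stem, number in product(NOUNS, ["+Sing", "+Pl"]):
        entries.add(f"{stem} +Noun {number}")
    return entries
```

With these six stems the set has 4 × 3 + 2 × 2 = 16 entries, including "big +Adj +Comp" but not "ear +Adj".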
600.465 - Intro to NLP - J. Eisner 28
Inverting Relations
[Figure: relation f maps ab?d to abcd, abed, abjd, ..., on arcs weighted 3, 2, 6; relation g maps these onward on arcs weighted 4, 2, 8]
600.465 - Intro to NLP - J. Eisner 29
Inverting Relations
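[Figure: the same diagram with the relations inverted: g^-1 (weights 4, 2, 8) leads back to abcd, abed, abjd, ..., and f^-1 (weights 3, 2, 6) leads back to ab?d]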
600.465 - Intro to NLP - J. Eisner 30
Inverting Relations
(f ∘ g)^-1 = g^-1 ∘ f^-1
[Figure: the inverted composed relation leads back to ab?d, with weights 3+4, 2+2, 6+8, ...]
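The inversion identity can be checked on small unweighted relations stored as sets of pairs. As before, composition is written "first f, then g", and the output names "out1", "out2", "out3" are hypothetical stand-ins for the outputs the slides elide.

```python
def invert(relation):
    """Swap the two sides of every pair."""
    return {(y, x) for (x, y) in relation}


def compose(f, g):
    """First f, then g: relate x to z when some y links them."""
    return {(x, z) for (x, y) in f for (y2, z) in g if y == y2}


F = {("ab?d", "abcd"), ("ab?d", "abed"), ("ab?d", "abjd")}
G = {("abcd", "out1"), ("abed", "out2"), ("abjd", "out3")}

# Inverting the composition equals composing the inverses in reverse order.
left = invert(compose(F, G))
right = compose(invert(G), invert(F))
```

Both left and right contain exactly the pairs (out1, ab?d), (out2, ab?d) and (out3, ab?d).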
600.465 - Intro to NLP - J. Eisner 31
Weighted French Transducer
Weighted version of transducer: Assigns a weight to each string pair
“upper language”:
  payer+IndP+SG+P1
  suivre+Imp+SG+P2
  suivre+IndP+SG+P2
  suivre+IndP+SG+P1
  être+IndP+SG+P1
“lower language”:
  paie
  paye
  suis
[Figure: arcs between upper and lower forms carry weights, shown as 419, 20, 50, 3, 12]
slide courtesy of L. Karttunen (modified)
600.465 - Intro to NLP - J. Eisner 32
Composition Cascades
You can build fancy noisy-channel models by composing transducers …
Examples:
  Phonological/morphological rewrite rules?
  English orthography → English phonology → Japanese phonology → Japanese orthography
    e.g. ??? → goruhubaggu
  Information extraction
600.465 - Intro to NLP - J. Eisner 33
FASTUS: Information Extraction
Appelt et al, 1992-?
Input: Bridgestone Sports Co. said Friday it has set up a joint venture in Taiwan with a local concern and a Japanese trading house to produce golf clubs to be shipped to Japan. The joint venture, Bridgestone Sports Taiwan Co., capitalized at 20 million new Taiwan dollars, will start production in January 1990 with …
Output:
  Relationship: TIE-UP
  Entities: “Bridgestone Sports Co.”
            “A local concern”
            “A Japanese trading house”
  Joint Venture Company: “Bridgestone Sports Taiwan Co.”
  Amount: NT$20000000
600.465 - Intro to NLP - J. Eisner 34
FASTUS: Successive Markups (details on subsequent slides)
Tokenization
  .o.
Multiwords
  .o.
Basic phrases (noun groups, verb groups …)
  .o.
Complex phrases
  .o.
Semantic Patterns
  .o.
Merging different references
600.465 - Intro to NLP - J. Eisner 35
FASTUS: Tokenization
Spaces, hyphens, etc.
  wouldn’t → would not
  their → them ’s
  company. → company .
  but Co. → Co.
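The tokenization rewrites can be sketched as ordinary string rules. This is plain Python rather than a compiled transducer, and it is not FASTUS's actual rule set: the ABBREVIATIONS set and the rule order are assumptions that cover only the four examples on the slide.

```python
# Assumed abbreviation list: tokens whose final period is kept attached.
ABBREVIATIONS = {"Co."}


def tokenize(text):
    """Split on spaces, then apply the slide's example rewrites."""
    tokens = []
    for word in text.split():
        if word in ABBREVIATIONS:
            tokens.append(word)                 # Co. stays Co.
        elif word.endswith(("n't", "n’t")):
            tokens.extend([word[:-3], "not"])   # wouldn't -> would not
        elif word == "their":
            tokens.extend(["them", "'s"])       # their -> them 's
        elif len(word) > 1 and word.endswith("."):
            tokens.extend([word[:-1], "."])     # company. -> company .
        else:
            tokens.append(word)
    return tokens
```

The contraction rule is deliberately naive (it would turn "can't" into "ca not"); a fuller rule set would list such exceptions, much as the abbreviation set does for periods.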
600.465 - Intro to NLP - J. Eisner 36
FASTUS: Multiwords
“set up”
“joint venture”
“San Francisco Symphony Orchestra,” “Canadian Opera Company”
  … use a specialized regexp to match musical groups.
  ... what kind of regexp would match company names?
600.465 - Intro to NLP - J. Eisner 37
FASTUS: Basic phrases
Output looks like this (no nested brackets!):
  … [NG it] [VG had set_up] [NP a joint_venture] [Prep in] …
Company Name: Bridgestone Sports Co.
Verb Group: said
Noun Group: Friday
Noun Group: it
Verb Group: had set up
Noun Group: a joint venture
Preposition: in
Location: Taiwan
Preposition: with
Noun Group: a local concern
600.465 - Intro to NLP - J. Eisner 38
FASTUS: Noun Groups
Build FSA to recognize phrases like
  approximately 5 kg
  more than 30 people
  the newly elected president
  the largest leftist political force
  a government and commercial project
Use the FSA for left-to-right longest-match markup
What does FSA look like? See next slide …
600.465 - Intro to NLP - J. Eisner 39
FASTUS: Noun Groups
Described with a kind of non-recursive CFG … (a regexp can include names that stand for other regexps)
NG → Pronoun | Time-NP | Date-NP
NG → (Det) (Adjs) HeadNouns
…
Adjs → sequence of adjectives maybe with commas, conjunctions, adverbs
…
Det → DetNP | DetNonNP
DetNP → detailed expression to match “the only five, another three, this, many, hers, all, the most …”
…
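A sketch of how names that stand for other regexps can be expanded into one flat regexp, which is possible precisely because the grammar is non-recursive. The rule bodies below are toy stand-ins, not FASTUS's actual definitions; only the rule shape NG → (Det) (Adjs) HeadNouns follows the slide, and the {Name} placeholder syntax is an assumption of this example.

```python
import re

# Toy, hypothetical definitions. {Name} refers to another rule.
RULES = {
    "Det": r"(?:the|a|an)",
    "Adj": r"(?:largest|leftist|political|new)",
    "Adjs": r"(?:{Adj}(?: {Adj})*)",
    "HeadNouns": r"(?:force|president|project)",
    "NG": r"(?:(?:{Det} )?(?:{Adjs} )?{HeadNouns})",
}


def expand(name):
    """Substitute rule bodies for names until only a plain regexp remains.

    This terminates because no rule refers, even indirectly, to itself.
    """
    return re.sub(r"\{(\w+)\}", lambda m: expand(m.group(1)), RULES[name])


NOUN_GROUP = re.compile(expand("NG"))
```

NOUN_GROUP.fullmatch("the largest leftist political force") succeeds. Each use of a name is replaced by its own copy of the body, so the expanded regexp can be much longer than the rule set.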
600.465 - Intro to NLP - J. Eisner 40
FASTUS: Semantic patterns
BusinessRelationship = NounGroup(Company/ies) VerbGroup(Set-up) NounGroup(JointVenture) with NounGroup(Company/ies) | …
ProductionActivity = VerbGroup(Produce) NounGroup(Product)
NounGroup(Company/ies) NounGroup & … is made easy by the processing done at a previous level
Use this for spotting references to put in the database.
600.465 - Intro to NLP - J. Eisner 41
Composition Cascades
You can build fancy noisy-channel models by composing transducers …
… now let’s turn to how you might build the individual transducers in the cascade.
We’ll use a variety of operators that combine simpler transducers and acceptors into more complex ones.
Composition is just one example.