12: constraints on strings - johns hopkins universityjason/325/pdfslides/12rational.pdf · 2011. 5....



Page 1: 12: Constraints on Strings - Johns Hopkins Universityjason/325/PDFSlides/12rational.pdf · 2011. 5. 13. · build an FSA that efficiently determines whether a given string satisfies

5/13/11

1

600.325/425 Declarative Methods - J. Eisner 1

Constraints on Strings

600.325/425 Declarative Methods - J. Eisner 2

What’s a constraint, again?

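[Figure: a unary constraint on X, drawn over the values 0 to 5, and a binary constraint on X and Y]

Unary constraint: a set of allowed values

Binary constraint: a set of allowed value pairs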

Infinite sets? Sure …

Infinite subsets of (pairs of)

integers, reals, … How about soft constraints?

600.325/425 Declarative Methods - J. Eisner 3

What’s a constraint on strings?

Hard constraint:

Does string S match pattern P? (Is it in the set?)

A description of a set of strings

Like a constraint … how?

S is a variable whose domain is set of all strings!

So P can be regarded as a unary constraint: let’s write P(S).

Soft constraint:

How well does string S fit pattern P?

A function mapping each string to a score / weight / cost.

Like a soft constraint …

600.325/425 Declarative Methods - J. Eisner 4

What is a pattern?

What operations would you expect for combining these string constraints?

If P is a pattern, then so is ~P

~P matches exactly the strings that P doesn’t

If P and Q are both patterns, then so is P & Q

If P and Q are both patterns, then so is P | Q

Wow, we can build up boolean formulas! Does this allow us to encode SAT? How?
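The three boolean operators above can be sketched in a few lines. This is only an illustration under a simplifying assumption: a pattern is modeled as an arbitrary Python predicate on strings (the hypothetical type `Pattern`), which is more general than the regular patterns the slides go on to discuss. The names `complement`, `intersect`, `union`, `starts_with_a` and `even_length` are made up for the example.

```python
from typing import Callable

# A hard unary constraint P(S): True iff string S is in the set.
Pattern = Callable[[str], bool]

def complement(p: Pattern) -> Pattern:
    """~P: matches exactly the strings that P doesn't."""
    return lambda s: not p(s)

def intersect(p: Pattern, q: Pattern) -> Pattern:
    """P & Q: matches strings accepted by both patterns."""
    return lambda s: p(s) and q(s)

def union(p: Pattern, q: Pattern) -> Pattern:
    """P | Q: matches strings accepted by at least one pattern."""
    return lambda s: p(s) or q(s)

# Two hypothetical simple patterns.
starts_with_a: Pattern = lambda s: s.startswith("a")
even_length: Pattern = lambda s: len(s) % 2 == 0

# A boolean formula over patterns: starts with "a" AND NOT even length.
odd_a_string = intersect(starts_with_a, complement(even_length))
```

For instance, `odd_a_string("abc")` holds while `odd_a_string("ab")` does not. Predicates make the boolean structure obvious but give up the finite-state representation that makes questions like emptiness decidable.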

600.325/425 Declarative Methods - J. Eisner 5

More about the relation to constraints

By building complicated patterns from simple ones, we are building up complicated constraints!

That is also allowed in ECLiPSe:

alldiff3(X,Y,Z) :- X #\= Y, Y #\= Z, X #\= Z.

between(X,Y,Z) :- X #< Y, Y #< Z. % either this

between(X,Y,Z) :- X #> Y, Y #> Z. % ... or this

Now we can use “alldiff3” and “between” as new constraints

Hang on, patterns are only unary constraints. Generalize?

between(X,Y,Z) :- (X #< Y, Y #< Z)

or (X #> Y, Y #> Z).

600.325/425 Declarative Methods - J. Eisner 6

What is a pattern?

Binary constraint (relation): What are all the possible translations of string S?

A description of a set of string pairs (S,T)

Like a binary constraint: let’s write P(S,T)

We can also do n-ary constraints more generally, but most current solvers don’t allow them

Fuzzy case: How strongly is string S related to each T? Which one is it most strongly related to?

Ok, so what’s new here? Why does it matter that they’re string variables?


600.325/425 Declarative Methods - J. Eisner 7

Some Pattern Operators

~    complementation          ~P

&    intersection             P & Q

|    union                    P | Q

     concatenation            P Q

*    iteration (0 or more)    P*

+    iteration (1 or more)    P+

-    difference               P - Q

\    char complement          \P (equiv. to ?-P)

Which of these can be treated as syntactic sugar? That is, which of these can we get rid of?

600.325/425 Declarative Methods - J. Eisner 8

More Pattern Operators

.x.  crossproduct               P .x. Q

.o.  composition                P .o. Q

.u   upper (input) language     P.u   “domain”

.l   lower (output) language    P.l   “range”

600.325/425 Declarative Methods - J. Eisner 9

The language of “regular expressions”

A variable S has infinitely many possible values if its type is “string” or “real”

So to specify a constraint on S, it is not enough to list the possible values

Language for simple constraints on reals: linear equations

Language for simple constraints on strings: regular expressions

Regular expression language

You probably know the standard form of regular expressions

Standard regexp is a unary constraint (“X must match a*b(c|d)*”)

Basic operators: union “|”, concatenation, closure “*”

But the language has been extended in various ways:

soft constraints (specifies costs)

binary constraints (over pairs of string variables)

n-ary constraints (over n string variables)

600.325/425 Declarative Methods - J. Eisner 10

Regular expressions ⇔ finite-state automata

1. Given a regexp that specifies a constraint, you can build an FSA that efficiently determines whether a given string satisfies the constraint.

2. Given an FSA, you can find an equivalent regexp.

So the “compiled” form of the little language can be converted back to the source form.

Conclusion: Anything you can do with regexps, you can do with FSAs, and vice-versa.
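Point 1 above (regexp to FSA) can be sketched with a Thompson-style construction, which is one standard way to do it; the slides do not say which construction they have in mind. The sketch assumes the regexp is already given as a parse tree built from the hypothetical helper functions `sym`, `concat`, `union` and `closure`, and it represents a machine as a `(start, final, arcs)` triple where an arc label of `None` means ε. The result is nondeterministic and is simulated directly rather than determinized.

```python
import itertools

_ids = itertools.count()  # source of fresh state numbers

def sym(c):
    """Machine accepting exactly the one-character string c."""
    s, f = next(_ids), next(_ids)
    return s, f, [(s, c, f)]

def concat(m1, m2):
    """Glue m1's final state to m2's start state with an epsilon arc."""
    s1, f1, a1 = m1
    s2, f2, a2 = m2
    return s1, f2, a1 + a2 + [(f1, None, s2)]

def union(m1, m2):
    """New start and final states with epsilon arcs into and out of both machines."""
    s, f = next(_ids), next(_ids)
    s1, f1, a1 = m1
    s2, f2, a2 = m2
    extra = [(s, None, s1), (s, None, s2), (f1, None, f), (f2, None, f)]
    return s, f, a1 + a2 + extra

def closure(m):
    """Zero or more repetitions of m."""
    s, f = next(_ids), next(_ids)
    s1, f1, a1 = m
    extra = [(s, None, s1), (f1, None, s1), (f1, None, f), (s, None, f)]
    return s, f, a1 + extra

def eps_closure(states, arcs):
    """All states reachable from `states` using only epsilon arcs."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in arcs:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(machine, string):
    """Simulate the nondeterministic machine on `string`."""
    start, final, arcs = machine
    current = eps_closure({start}, arcs)
    for ch in string:
        moved = {dst for src, label, dst in arcs if src in current and label == ch}
        current = eps_closure(moved, arcs)
    return final in current

# Parse tree for (ab|c)*(bb*a), built bottom-up.
machine = concat(
    closure(union(concat(sym("a"), sym("b")), sym("c"))),
    concat(concat(sym("b"), closure(sym("b"))), sym("a")),
)
```

With this machine, `accepts(machine, "abcbba")` holds and `accepts(machine, "abc")` does not. Determinizing the result would give the "efficient" membership test the slide mentions; that step is omitted here.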

600.325/425 Declarative Methods - J. Eisner 11

Given a regular expression …

1. Make a parse tree for it

2. Build up the FSA from the bottom up

Example: (ab|c)*(bb*a)

[Figure: parse tree for (ab|c)*(bb*a)]

concat( closure( union( concat(a, b), c ) ), concat( concat( b, closure(b) ), a ) )

600.325/425 Declarative Methods - J. Eisner 12

Concatenation (of soft constraints)

example thanks to M. Mohri


600.325/425 Declarative Methods - J. Eisner 13

Union

example thanks to M. Mohri

600.325/425 Declarative Methods - J. Eisner 14

Union

example thanks to M. Mohri

eps/0

eps/0.3

eps/0.8

600.325/425 Declarative Methods - J. Eisner 15

Closure (also illustrates binary constraints)

example thanks to M. Mohri

600.325/425 Declarative Methods - J. Eisner 16

Complementation

M represents a constraint on strings

We’d like to represent ~M (i.e., a constraint that says that the string must not be accepted by M)

Just change M’s final states to non-final and vice-versa

Only works if every string takes you to exactly one state in M (final or non-final). So M must be both deterministic and complete. Any M can be put in this form.
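The recipe above can be sketched for an unweighted machine. The representation is an assumption made for the example: a DFA is an alphabet, a set of states, a start state, a set of final states, and a dict `delta` mapping `(state, symbol)` to the next state. The helper names `make_complete`, `complement` and `accepts`, and the sink state name `"sink"`, are hypothetical; the sketch assumes the input is already deterministic and only takes care of completeness.

```python
def make_complete(alphabet, states, delta, sink="sink"):
    """Send every missing transition to a non-final sink state.

    Assumes `sink` is not already used as a state name.
    """
    full = dict(delta)
    all_states = set(states) | {sink}
    for q in all_states:
        for a in alphabet:
            full.setdefault((q, a), sink)
    return all_states, full

def complement(alphabet, states, start, finals, delta):
    """Complete the DFA, then swap final and non-final states."""
    all_states, full = make_complete(alphabet, states, delta)
    return all_states, start, all_states - set(finals), full

def accepts(start, finals, delta, string):
    """Run a complete DFA; every string ends in exactly one state."""
    q = start
    for ch in string:
        q = delta[(q, ch)]
    return q in finals

# M accepts a*b over the alphabet {a, b}; it is deterministic but not complete.
alphabet = {"a", "b"}
states = {0, 1}
delta = {(0, "a"): 0, (0, "b"): 1}
finals = {1}

comp_states, comp_start, comp_finals, comp_delta = complement(
    alphabet, states, 0, finals, delta
)
```

Here `accepts(comp_start, comp_finals, comp_delta, "ba")` holds, because "ba" falls into the sink, which is final in ~M. Skipping the completion step would silently lose exactly those strings.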

example thanks to M. Mohri

600.325/425 Declarative Methods - J. Eisner 17

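Intersection

[Figure: two weighted acceptors and their intersection; arc weights and final weights add]

Machine 1: 0 --fat/0.5--> 0, 0 --pig/0.3--> 1, 1 --eats/0--> 2, 1 --sleeps/0.6--> 2; final state 2/0.8

Machine 2: 0 --fat/0.2--> 1, 1 --pig/0.4--> 1, 1 --eats/0.6--> 0, 1 --sleeps/1.3--> 2; final states 0 and 2/0.5

Intersection: 0,0 --fat/0.7--> 0,1; 0,1 --pig/0.7--> 1,1; 1,1 --eats/0.6--> 2,0; 1,1 --sleeps/1.9--> 2,2; final states 2,0/0.8 and 2,2/1.3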

example adapted from M. Mohri

600.325/425 Declarative Methods - J. Eisner 18

Intersection

[Figure: the same two machines and their intersection, with the paths for “fat pig eats” highlighted]

Paths 0012 and 0110 both accept “fat pig eats”.

So must the new machine: along path 0,0 0,1 1,1 2,0

example adapted from M. Mohri


600.325/425 Declarative Methods - J. Eisner 19

Intersection

[Figure: the same two machines, highlighting the arcs fat/0.5 and fat/0.2 and the new arc fat/0.7]

Paths 00 and 01 both accept “fat”.

So must the new machine: along path 0,0 0,1

600.325/425 Declarative Methods - J. Eisner 20

Intersection

[Figure: the same two machines, highlighting the arcs pig/0.3 and pig/0.4 and the new arc pig/0.7]

Paths 01 and 11 both accept “pig”.

So must the new machine: along path 0,1 1,1

600.325/425 Declarative Methods - J. Eisner 21

Intersection

[Figure: the same two machines, highlighting the arcs sleeps/0.6 and sleeps/1.3 and the new arc sleeps/1.9 into final state 2,2/1.3]

Paths 12 and 12 both accept “sleeps”.

So must the new machine: along path 1,1 2,2

600.325/425 Declarative Methods - J. Eisner 22

Intersection

[Figure: the same two machines, highlighting the arcs eats/0 and eats/0.6 and the new arc eats/0.6; the completed intersection has arcs fat/0.7, pig/0.7, eats/0.6, sleeps/1.9 and final states 2,0/0.8 and 2,2/1.3]

600.325/425 Declarative Methods - J. Eisner 23

Intersection

Why is intersection guaranteed to terminate?

How big a machine might be produced by

intersection?
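The step-by-step construction on the preceding slides can be sketched as a search over pairs of states. The machine topology below is reconstructed from the path captions on those slides, so treat it as an assumption; in particular, machine 2's state 0 is taken to be final with weight 0 because the pair 2,0 is final with weight 0.8. A machine is represented as `(start, finals, arcs)` with `finals` mapping state to final weight and arcs written `(source, label, weight, target)`; `intersect` and `cost` are hypothetical helper names. Weights are costs that add, as in the figures.

```python
from collections import deque

def intersect(m1, m2):
    """Product construction: explore only state pairs reachable from the start pair."""
    start1, finals1, arcs1 = m1
    start2, finals2, arcs2 = m2
    start = (start1, start2)
    arcs, finals = [], {}
    seen, queue = {start}, deque([start])
    while queue:
        q1, q2 = queue.popleft()
        if q1 in finals1 and q2 in finals2:
            finals[(q1, q2)] = finals1[q1] + finals2[q2]
        for s1, a, w1, d1 in arcs1:
            if s1 != q1:
                continue
            for s2, b, w2, d2 in arcs2:
                if s2 == q2 and a == b:  # both machines must read the same word
                    dst = (d1, d2)
                    arcs.append(((q1, q2), a, w1 + w2, dst))
                    if dst not in seen:
                        seen.add(dst)
                        queue.append(dst)
    return start, finals, arcs, seen

def cost(start, finals, arcs, words):
    """Total cost of `words` in a deterministic machine, or None if rejected."""
    q, total = start, 0.0
    for w in words:
        matches = [(wt, d) for s, a, wt, d in arcs if s == q and a == w]
        if not matches:
            return None
        wt, q = matches[0]
        total += wt
    return total + finals[q] if q in finals else None

m1 = (0, {2: 0.8}, [(0, "fat", 0.5, 0), (0, "pig", 0.3, 1),
                    (1, "eats", 0.0, 2), (1, "sleeps", 0.6, 2)])
m2 = (0, {0: 0.0, 2: 0.5}, [(0, "fat", 0.2, 1), (1, "pig", 0.4, 1),
                            (1, "eats", 0.6, 0), (1, "sleeps", 1.3, 2)])

start, finals, arcs, states = intersect(m1, m2)
```

On the two questions above: each state pair enters the queue at most once, so the loop terminates, and the result has at most |Q1| × |Q2| states (here 5 of the 9 possible pairs are reachable).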

600.325/425 Declarative Methods - J. Eisner 24

Given a regular expression …

1. Make a parse tree for it

2. Build up the FSA from the bottom up

Example: (ab|c)*(bb*a)

[Figure: parse tree for (ab|c)*(bb*a)]

concat( closure( union( concat(a, b), c ) ), concat( concat( b, closure(b) ), a ) )


600.325/425 Declarative Methods - J. Eisner 25

Given an FSA …

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: FSA with states 1 to 5]

Paths from 1 to 5:

e12 ((e23 e33* e35) | e24 e45)

600.325/425 Declarative Methods - J. Eisner 26

Given an FSA …

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: FSA with states 1 to 5]

Paths from 1 to 5:

e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)

600.325/425 Declarative Methods - J. Eisner 27

Given an FSA …

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: FSA with states 1 to 5]

Paths from 1 to 5:

e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45)) | (e24 (e43 e33* e34)* (e45 | e43 e35)) )

600.325/425 Declarative Methods - J. Eisner 28

Given an FSA … Find a regular expression

describing all paths from

initial state 1 to final state 5.

1

2

3

4

Paths from 1 to 5:

???

5

>

600.325/425 Declarative Methods - J. Eisner 29

Does there exist any path from

initial state 1 to final state 5?

Let’s do a simpler variant first …

1 2 3

4

5 >

If there’s a way to get

from 1 to 3 and from

3 to 5, then there's a

way to get from 1 to 5.

slide thanks to R. Tamassia & M. Goodrich (modified)

More generally, transitive closure problem:

For each A, B, does there exist any path

from A to B?

600.325/425 Declarative Methods - J. Eisner 30

If there’s a way to get

from 1 to 3 and from

3 to 5, then there's a

way to get from 1 to 5.

Does there exist any path from

initial state 1 to final state 5?

Let’s do a simpler variant first …

Hmm … should I look for a 1→3 path first, in hopes of using it to build a 1→5 path? Or vice-versa?

More generally, transitive closure problem:

For each A, B, does there exist any path

from A to B?

1

2

3

4 5

>

1 2 3 5 >

1 2 5 3 >


600.325/425 Declarative Methods - J. Eisner 31

If there’s a way to get

from 1 to 3 and from

3 to 5, then there's a

way to get from 1 to 5.

Let’s do a simpler variant first …

Hmm … should I look for a 1→3 path first, in hopes of using it to build a 1→5 path? Or vice-versa?

1 2 3 5 >

1 2 5 3 >

Option #1: Gradually build up longer paths (length-1, length-2, length-3 …)

How do we deal with cycles?

Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.

Both have O(n^3) runtime.

But option #2 allows more flexible handling of cycles. We’ll need that when we return to our FSA problem.

600.325/425 Declarative Methods - J. Eisner 32

If there’s a way to get

from 1 to 3 and from

3 to 5, then there's a

way to get from 1 to 5.

Floyd-Warshall transitive closure algorithm

Hmm … should I look for a 1→3 path first, in hopes of using it to build a 1→5 path? Or vice-versa?

1 2 5 3 >

Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.

What are the paths of order 0?

What are the paths of order 1?

What are the paths of order 2?

How big can a path’s order be?

What are the paths of order 5?

600.325/425 Declarative Methods - J. Eisner 33

If there’s a way to get

from 1 to 3 and from

3 to 5, then there's a

way to get from 1 to 5.

Floyd-Warshall transitive closure algorithm

Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.

Definition: $p^k_{ij}$ = true iff there is an i→j path of order at most k.

1. Define $p^0$: for each i, j, set $p^0_{ij}$ = true iff there is an i→j edge.

2. For k = 1, 2, …, n, define $p^k$:

1

2

3

4 5

>

600.325/425 Declarative Methods - J. Eisner 34

If there’s a way to get

from 1 to 3 and from

3 to 5, then there's a

way to get from 1 to 5.

Floyd-Warshall transitive closure algorithm

Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.

Definition: $p^k_{ij}$ = true iff there is an i→j path of order at most k.

1. Define $p^0$: for each i, j, set $p^0_{ij}$ = true iff there is an i→j edge.

2. For k = 1, 2, …, n, define $p^k$: for each i, j, set

$p^k_{ij} = p^{k-1}_{ij} \lor \left( p^{k-1}_{ik} \land p^{k-1}_{kj} \right)$

3. Return $p^n$ (e.g., what is $p^n_{1n}$?)

[Figure: an i→j path through k. The i→k part uses only vertices numbered 1,…,k-1, and so does the k→j part. New: the combined i→j path still uses only vertices numbered 1,…,k]

parts of slide thanks to R. Tamassia & M. Goodrich
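The three steps above can be sketched directly. The function name `transitive_closure` is hypothetical, vertices are assumed to be numbered 1 to n, and the example edge list is read off the path expression e12 ((e23 e33* e35) | e24 e45) from the earlier FSA slide. The table is updated in place, which is the usual space-saving variant of the recurrence and gives the same final answer.

```python
def transitive_closure(n, edges):
    """Return p with p[i][j] True iff some i->j path exists (vertices 1..n)."""
    # Step 1: order-0 paths are single edges. Row and column 0 are unused padding.
    p = [[False] * (n + 1) for _ in range(n + 1)]
    for i, j in edges:
        p[i][j] = True
    # Step 2: allow vertex k as an intermediate stop, for k = 1, 2, ..., n.
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                p[i][j] = p[i][j] or (p[i][k] and p[k][j])
    # Step 3: after k = n, every vertex may be used as an intermediate.
    return p

# Edges of the small FSA: 1->2, 2->3, 3->3, 3->5, 2->4, 4->5.
edges = [(1, 2), (2, 3), (3, 3), (3, 5), (2, 4), (4, 5)]
reach = transitive_closure(5, edges)
```

`reach[1][5]` answers the slide's question about getting from the initial state 1 to the final state 5. The three nested loops over n vertices give the O(n^3) runtime.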

600.325/425 Declarative Methods - J. Eisner 35

Floyd-Warshall Example

[Figure: the example graph, with vertices labeled v1, v2, …, v6]

slide thanks to R. Tamassia & M. Goodrich (modified)

600.325/425 Declarative Methods - J. Eisner 36

Floyd-Warshall: k=1 (computes p^1 from p^0)

[Figure: the example graph, with vertices labeled v1, v2, …, v6]

slide thanks to R. Tamassia & M. Goodrich (modified)


600.325/425 Declarative Methods - J. Eisner 37

Floyd-Warshall: k=2 (computes p^2 from p^1)

[Figure: the example graph, with vertices labeled v1, v2, …, v6]

slide thanks to R. Tamassia & M. Goodrich (modified)

600.325/425 Declarative Methods - J. Eisner 38

Floyd-Warshall: k=3 (computes p^3 from p^2)

[Figure: the example graph, with vertices labeled v1, v2, …, v6]

slide thanks to R. Tamassia & M. Goodrich (modified)

600.325/425 Declarative Methods - J. Eisner 39

Floyd-Warshall: k=4 (computes p^4 from p^3)

[Figure: the example graph, with vertices labeled v1, v2, …, v6]

slide thanks to R. Tamassia & M. Goodrich (modified)

600.325/425 Declarative Methods - J. Eisner 40

Floyd-Warshall: k=5 (computes p^5 from p^4)

[Figure: the example graph, with vertices labeled v1, v2, …, v6]

slide thanks to R. Tamassia & M. Goodrich (modified)

600.325/425 Declarative Methods - J. Eisner 41

Floyd-Warshall: k=6 (computes p^6 from p^5)

[Figure: the example graph, with vertices labeled v1, v2, …, v6]

slide thanks to R. Tamassia & M. Goodrich (modified)

600.325/425 Declarative Methods - J. Eisner 42

Floyd-Warshall: k=7 (computes p^7 from p^6)

[Figure: the example graph, with vertices labeled v1, v2, …, v6]

slide thanks to R. Tamassia & M. Goodrich (modified)


600.325/425 Declarative Methods - J. Eisner 43

Regular expression version (Kleene/Tarjan)

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: FSA with states 1 to 5]

Paths from 1 to 5:

e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45)) | (e24 (e43 e33* e34)* (e45 | e43 e35)) )

600.325/425 Declarative Methods - J. Eisner 44

Regular expression version (Kleene/Tarjan)

Find a regular expression

describing all paths from

initial state 1 to final state 5.

1

2

3

4

Paths from 1 to 5:

???

5

>

600.325/425 Declarative Methods - J. Eisner 45

If there’s a way to get

from 1 to 3 and from

3 to 5, then there's a

way to get from 1 to 5.

Regular expression version (Kleene/Tarjan)

Definition: $p^k_{ij}$ = regular expression describing all i→j paths that have order at most k.

1. Define $p^0$: for each i, j, set $p^0_{ij} = e_{ij}$ if that edge exists, else $\emptyset$.

2. For k = 1, 2, …, n, define $p^k$: for each i, j, set

$p^k_{ij} = p^{k-1}_{ij} \mid \left( p^{k-1}_{ik} \, (p^{k-1}_{kk})^{*} \, p^{k-1}_{kj} \right)$

(a regexp using all three of union, concat, closure!)

3. Return $p^n$ (e.g., what is $p^n_{1n}$?)

[Figure: an i→j path through k. The i→k part uses only vertices numbered 1,…,k-1, and so does the k→j part. New: the combined i→j path still uses only vertices numbered 1,…,k]
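The regular-expression recurrence can be sketched with plain strings standing in for regexps. The conventions are assumptions made for the example: `None` stands for the empty set, the empty string `""` stands for ε, and each edge i→j is named `e{i}{j}` as on the slides (which only works for single-digit vertex numbers). The helper names `union`, `concat`, `star`, `path_regexp` and `matches` are hypothetical, and no simplification of the output is attempted.

```python
import re

def union(r, s):
    """r | s, where None is the empty set."""
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"({r}|{s})"

def concat(*parts):
    """Concatenation; anything concatenated with the empty set is empty."""
    if any(p is None for p in parts):
        return None
    return " ".join(p for p in parts if p != "")

def star(r):
    """Closure; the closure of the empty set (or of epsilon) is epsilon."""
    if r is None or r == "":
        return ""
    return f"({r})*"

def path_regexp(n, edges, source, target):
    """Regexp over edge names describing all source->target paths."""
    p = [[None] * (n + 1) for _ in range(n + 1)]
    for i, j in edges:
        p[i][j] = union(p[i][j], f"e{i}{j}")
    for k in range(1, n + 1):
        new = [row[:] for row in p]  # build p^k from p^(k-1)
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                via_k = concat(p[i][k], star(p[k][k]), p[k][j])
                new[i][j] = union(p[i][j], via_k)
        p = new
    return p[source][target]

def matches(regexp, edge_names):
    """Check a concrete path (list of edge names) against the regexp."""
    return re.fullmatch(regexp.replace(" ", ""), "".join(edge_names)) is not None

edges = [(1, 2), (2, 3), (3, 3), (3, 5), (2, 4), (4, 5)]
paths_1_to_5 = path_regexp(5, edges, 1, 5)
```

For this edge list the result describes the same set of paths as the hand-written e12 ((e23 e33* e35) | e24 e45), although the string itself may be parenthesized differently. Outputs of this procedure can grow quickly, since each round can multiply the size of the expressions.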

parts of slide thanks to R. Tamassia & M. Goodrich 600.325/425 Declarative Methods - J. Eisner 46

Regular expression version (Kleene/Tarjan)

What if the arcs have labels?

[Figure: FSA with states 1 to 5 whose arcs carry labels such as a, b, c, aa]

Paths from 1 to 5:

e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45)) | (e24 (e43 e33* e34)* (e45 | e43 e35)) )

600.325/425 Declarative Methods - J. Eisner 47

Regular expression version (Kleene/Tarjan)

What if the arcs have labels?

Just substitute them in:

[Figure: the same labeled FSA; in the path expression each edge name is replaced by its arc label (a, b, c, aa)]

Paths from 1 to 5:

e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45)) | (e24 (e43 e33* e34)* (e45 | e43 e35)) )

600.325/425 Declarative Methods - J. Eisner 48

Regular languages as points in a high-dimensional space

Regular expression → formal power series:

abc → abc

abc:2 → 2abc (weighted)

ab|ac → ab + ac

a(b|c) → ab + ac

a(b|(c:2)) → ab + 2ac

ab*c → ac + abc + abbc + abbbc + …

a(b:2)*c → ac + 2abc + 4abbc + 8abbbc + …

Instead of dimensions x^2, y^2, xy, etc., every possible string is a dimension and its coefficient is the coordinate (often 0)


600.325/425 Declarative Methods - J. Eisner 49

Regular languages as points in a high-dimensional space

Suppose P, Q are two regular languages represented as these “formal power series.”

What is the sum P + Q? Union! (We double-count …)

What is the product PQ? Concatenation!

What is the Hadamard product P ⊙ Q? (i.e., the dot product before you sum: x ⊙ y = (x_1 y_1, x_2 y_2, …)) Intersection!

What is 1/(1-P)? Closure P*! (1/(1-P) = 1 + P + PP + PPP + …)

Could we use these techniques to classify strings using kernel SVMs?
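The correspondence above can be checked on small cases by storing a power series as a dict from string to coefficient. This is only a sketch: the helper names `add`, `multiply`, `hadamard` and `closure` are hypothetical, and since 1/(1-P) is an infinite series, `closure` truncates it to strings of at most `max_len` characters and assumes P gives no weight to the empty string.

```python
def add(p, q):
    """P + Q: union, with coefficients of shared strings summed (double-counting)."""
    out = dict(p)
    for s, w in q.items():
        out[s] = out.get(s, 0) + w
    return out

def multiply(p, q):
    """PQ: concatenation; coefficients multiply."""
    out = {}
    for s, w in p.items():
        for t, v in q.items():
            out[s + t] = out.get(s + t, 0) + w * v
    return out

def hadamard(p, q):
    """Coordinate-wise product: only strings present in both series survive."""
    return {s: p[s] * q[s] for s in p if s in q}

def closure(p, max_len):
    """1/(1-P) = 1 + P + PP + ..., truncated to strings of length <= max_len."""
    if "" in p:
        raise ValueError("this sketch assumes P gives no weight to the empty string")
    result = {"": 1}
    power = {"": 1}
    while True:
        power = {s: w for s, w in multiply(power, p).items() if len(s) <= max_len}
        if not power:
            return result
        result = add(result, power)

# a(b:2)*c, truncated to at most three b's.
series = multiply(multiply({"a": 1}, closure({"b": 2}, 3)), {"c": 1})
```

The result reproduces the first four terms of ac + 2abc + 4abbc + 8abbbc + … from the previous slide.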

600.325/425 Declarative Methods

- J. Eisner 50

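Function from strings to ...

Acceptors (FSAs), unweighted: {false, true} (example arcs: a, c, ε)

Transducers (FSTs), unweighted: strings (example arcs: a:x, c:z, ε:y)

Acceptors (FSAs), weighted: numbers (example arcs: a/.5, c/.7, ε/.5, plus the weight .3 shown in the figure)

Transducers (FSTs), weighted: (string, num) pairs (example arcs: a:x/.5, c:z/.7, ε:y/.5, plus the weight .3 shown in the figure)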

600.325/425 Declarative Methods

- J. Eisner 51

Sample functions

Acceptors (FSAs), unweighted ({false, true}): Grammatical?

Acceptors (FSAs), weighted (numbers): How grammatical? Better, how likely?

Transducers (FSTs), unweighted (strings): Markup, Correction, Translation

Transducers (FSTs), weighted ((string, num) pairs): Good markups, Good corrections, Good translations

600.325/425 Declarative Methods

- J. Eisner 52

Sample data, encoded same way

Acceptors (FSAs), unweighted ({false, true}): Input string, Corpus, Dictionary

Acceptors (FSAs), weighted (numbers): Input lattice, Reweighted corpus, Weighted dictionary

Transducers (FSTs), unweighted (strings): Bilingual corpus, Bilingual lexicon, Database (WordNet)

Transducers (FSTs), weighted ((string, num) pairs): Prob. bilingual lexicon, Weighted database

[Figure: example machines encoding data, including the string b a n a n a]

600.325/425 Declarative Methods

- J. Eisner 53

Some Applications

Prediction, classification, generation of text

More generally, “filling in the blanks” (probabilistic reconstruction of hidden data)

Speech recognition

Machine translation, OCR, other noisy-channel models

Sequence alignment / Edit distance / Computational biology

Text normalization, segmentation, categorization

Information extraction

Stochastic phonology/morphology, including lexicon

Tagging, chunking, finite-state parsing

Syntactic transformations (smoothing PCFG rulesets)

600.325/425 Declarative Methods

- J. Eisner 54

Finite-state “programming”

Programming Langs | Finite-State World

Function | Function on strings

Source code (written by programmer) | Regular expression (written by programmer), e.g., a?c*

Object code (produced by compiler) | Finite-state machine (produced by regexp compiler)

Better object code (produced by optimizer) | Better object code (produced by determinization, minimization, pruning)


600.325/425 Declarative Methods

- J. Eisner 55

Finite-state “programming”

Programming Langs | Finite-State World

Function composition | FST/WFST composition

Function inversion (available in Prolog) | FST inversion

Higher-order functions ... | Finite-state operators ...

Small modular cooperating functions (structured programming) | Small modular regexps, combined via operators

600.325/425 Declarative Methods

- J. Eisner 56

Finite-state “programming”

Programming Langs Finite-State World

More features you wish other languages had!

600.325/425 Declarative Methods

- J. Eisner 57

Finite-State Operations

Projection GIVES YOU marginal distribution

p(x) = domain( p(x,y) )

p(y) = range( p(x,y) )

[Figure: arcs labeled a : b / 0.3]

600.325/425 Declarative Methods

- J. Eisner 58

Finite-State Operations

Probabilistic union GIVES YOU mixture model

$0.3\,p(x) + 0.7\,q(x) = p(x) +_{0.3} q(x)$

[Figure: a choice between machine p(x), taken with weight 0.3, and machine q(x), taken with weight 0.7]

600.325/425 Declarative Methods

- J. Eisner 59

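Finite-State Operations

Probabilistic union GIVES YOU mixture model

$\lambda\,p(x) + (1-\lambda)\,q(x) = p(x) +_{\lambda} q(x)$

[Figure: a choice between machine p(x), taken with weight λ, and machine q(x), taken with weight 1-λ]

Learn the mixture parameter λ!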

600.325/425 Declarative Methods

- J. Eisner 60

Finite-State Operations

Composition GIVES YOU chain rule

p(x|z) = p(x|y) o p(y|z)

p(x,z) = p(x|y) o p(y|z) o z

The most popular statistical FSM operation

Cross-product construction


600.325/425 Declarative Methods

- J. Eisner 61

Finite-State Operations

Concatenation, probabilistic closure HANDLE unsegmented text

p(x) q(x)

p(x) p(x) q(x) *0.3

0.3

0.7

p(x)

Just glue together machines for the different segments, and let them figure out how to align with the text

600.325/425 Declarative Methods

- J. Eisner 62

Finite-State Operations

Directed replacement MODELS noise or postprocessing

p(x, noisy y) = p(x,y) o D

where D is the noise model defined by dir. replacement

Resulting machine compensates for noise or postprocessing

600.325/425 Declarative Methods

- J. Eisner 63

Finite-State Operations

Intersection GIVES YOU product models, e.g., exponential / maxent, perceptron, Naïve Bayes, …

p(x)*q(x) = p(x) & q(x)

pNB(y | x) ∝ p(y) & p(A(x)|y) & p(B(x)|y) & …

Cross-product construction (like composition)

Need a normalization op too: computes Σ_x f(x), the “pathsum” or “partition function”

600.325/425 Declarative Methods

- J. Eisner 64

Finite-State Operations

Conditionalization (new operation)

p(y | x) = condit( p(x,y) )

Construction: reciprocal(determinize(domain( p(x,y) ))) o p(x,y)

not possible for all weighted FSAs

Resulting machine can be composed with other distributions: p(y | x) * q(x)

600.325/425 Declarative Methods

- J. Eisner 65

Other Useful Finite-State

Constructions

Complete graphs YIELD n-gram models

Other graphs YIELD fancy language models (skips, caching, etc.)

Compilation from other formalism → FSM:

Wordlist (cf. trie), pronunciation dictionary ...

Speech hypothesis lattice

Decision tree (Sproat & Riley)

Weighted rewrite rules (Mohri & Sproat)

TBL or probabilistic TBL (Roche & Schabes)

PCFG (approximation!) (e.g., Mohri & Nederhof)

Optimality theory grammars (e.g., Eisner)

Logical description of set (Vaillette; Klarlund)

600.325/425 Declarative Methods

- J. Eisner 66

Regular Expression Calculus as a Programming Language

Programming Langs | Finite-State World

Function | Function on strings

Source code (written by programmer) | Regular expression (written by programmer), e.g., a?c*

Object code (produced by compiler) | Finite-state machine (produced by regexp compiler)

Better object code (produced by optimizer) | Better object code (produced by determinization, minimization, pruning)


600.325/425 Declarative Methods

- J. Eisner 67

Regular Expression Calculus

as a Modelling Language

Oops! Statistical FSMs still done “in assembly language”!

Build machines by manipulating arcs and states

For training, get the weights by some exogenous procedure and patch them onto arcs

you may need extra training data for this

you may need to devise and implement a new variant of EM

Would rather build models declaratively

((a*.7 b) +.5 (ab*.6)) ° repl.9((a:(b +.3 ε))*,L,R)

600.325/425 Declarative Methods

- J. Eisner 68

A Simple Example: Segmentation

tapirseatgrass

tapirs eat grass?

tapir seat grass?

tap irse at grass?

...

Strategy: build a finite-state model of p(spaced text, spaceless text)

Then maximize p(???, tapirseatgrass)

Start with a distribution p(English word) → a machine D (for dictionary)

Construct p(spaced text) → (D space)*0.99 D

Compose with p(spaceless | spaced) → ((¬space)+(space:ε))*
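The strategy above can be sketched without building any machines, by running the dynamic program that a best-path search over the composed model would perform. Several things here are assumptions for illustration only: the word probabilities in `toy_dictionary` are invented, the function name `segment` is hypothetical, and the model is simplified to a product of independent word probabilities (the 0.99 weight on the closure is left out).

```python
import math

def segment(text, word_prob):
    """Most probable split of `text` into dictionary words, or None if there is none."""
    # best[i] = (log-probability of the best segmentation of text[:i], start of its last word)
    best = [(-math.inf, None)] * (len(text) + 1)
    best[0] = (0.0, None)
    for end in range(1, len(text) + 1):
        for start in range(end):
            word = text[start:end]
            if word in word_prob and best[start][0] > -math.inf:
                score = best[start][0] + math.log(word_prob[word])
                if score > best[end][0]:
                    best[end] = (score, start)
    if best[-1][0] == -math.inf:
        return None
    # Follow the backpointers from the end of the text.
    words, end = [], len(text)
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]

# Invented probabilities standing in for the dictionary machine D.
toy_dictionary = {
    "tapir": 0.01, "tapirs": 0.01, "eat": 0.05, "seat": 0.02,
    "grass": 0.03, "tap": 0.02, "at": 0.1, "irse": 0.000001,
}

best_split = segment("tapirseatgrass", toy_dictionary)
```

With these made-up numbers, "tapirs eat grass" scores 0.01 × 0.05 × 0.03 and beats both "tapir seat grass" and "tap irse at grass". The finite-state formulation buys more than this loop does: swapping in a richer D or a noisier channel changes the machines, not the search code.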

600.325/425 Declarative Methods

- J. Eisner 69

A Simple Example: Segmentation

Strategy: build a finite-state model of p(spaced text, spaceless text)

Then maximize p(???, tapirseatgrass)

Start with a distribution p(English word) → a machine D (for dictionary)

Construct p(spaced text) → (D space)*0.99 D

Compose with p(spaceless | spaced) → ((¬space)+(space:ε))*

train on spaced or spaceless text

D should include novel words:

D = KnownWord +0.99 (Letter*0.85 Suffix)

Could improve to consider letter n-grams, morphology ...

Noisy channel could do more than just delete spaces: vowel deletion (Semitic); OCR garbling (cl→d, ri→n, rn→m ...)

600.325/425 Declarative Methods - J. Eisner 70