finite-state automata 2 day 13

35
Finite-state automata 2 Day 13 LING 681.02 Computational Linguistics Harry Howard Tulane University

Upload: step

Post on 15-Jan-2016

30 views

Category:

Documents


0 download

DESCRIPTION

Finite-state automata 2 Day 13. LING 681.02 Computational Linguistics Harry Howard Tulane University. Course organization. http://www.tulane.edu/~ling/NLP/ NLTK is installed on the computers in this room! How would you like to use the Provost's $150?. SLP §2.2 Finite-state automata. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Finite-state automata 2 Day 13

Finite-state automata 2

Day 13

LING 681.02Computational Linguistics

Harry HowardTulane University

Page 2: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

2

Course organization

http://www.tulane.edu/~ling/NLP/NLTK is installed on the computers in this

room!How would you like to use the Provost's

$150?

Page 3: Finite-state automata 2 Day 13

SLP §2.2 Finite-state automata

2.2.1 Sheeptalk

Page 4: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

4

Find your files

>>> import sys>>>sys.path.append("/Users/harryhow/Documents/Work/Research/Sims/NLTK")

Page 5: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

5

Run program

>>> import fsaproc>>> test = 'baaa!'>>> test = 'baaa!$'>>> fsaproc.machine(test)

Page 6: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

6

Go over print-out

Page 7: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

7

Key points

D-recognize is a simple table-driven interpreter.The algorithm is universal for all unambiguous

regular languages.To change the machine, you simply change the table.

Crudely therefore… matching strings with regular expressions (ala Perl, grep, etc.) is a matter of:translating the regular expression into a machine (a

table) and passing the table and the string to an interpreter.

Page 8: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

8

Recognition as search

You can view this algorithm as a kind of state-space search.

States are pairings of tape positions and state numbers.

The goal state is a pairing with the end of tape position and a final accept state.

Page 9: Finite-state automata 2 Day 13

SLP §2.2 Finite-state automata

2.2.2 Formal languages

Page 10: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

10

Generative Formalisms

Formal Languages are sets of strings composed of symbols from a finite set of symbols.

Finite-state automata define formal languages (without having to enumerate all the strings in the language).

The term Generative is based on the view that you can run the machine as a generator to get strings from the language.

Page 11: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

11

Generative Formalisms

A FSA can be viewed from two perspectives, as:an acceptor that can tell you if a string is in the

language.a generators to produce all and only the strings

in the language.

Page 12: Finite-state automata 2 Day 13

SLP §2.2 Finite-state automata

2.2.4 Determinism

Page 13: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

13

Determinism

A deterministic FSA has one unique thing to do at each point in processing.i.e. there are no choices

Page 14: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

14

Non-determinism

Page 15: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

15

Non-determinism cont.

Epsilon transitionsAn arc has no symbol on it, represented as .Such a transition does not examine or advance

the tape during recognition:

Page 16: Finite-state automata 2 Day 13

SLP §2.2 Finite-state automata

2.2.5 Use of a nFSA to accept strings

Page 17: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

17

Read on your own

pp. 33-5

Page 18: Finite-state automata 2 Day 13

SLP §2.2 Finite-state automata

2.2.6 Recognition as search

Page 19: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

19

Non-deterministic recognition: Search

In a ND FSA there is at least one path through the machine for a string that is in the language defined by the machine.

But not all paths directed through the machine for an accept string lead to an accept state.

No paths through the machine lead to an accept state for a string not in the language.

Page 20: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

20

Non-deterministic recognition

So success in non-deterministic recognition occurs when a path is found through the machine that ends in an accept.

Failure occurs when all of the possible paths for a given string lead to failure.

Page 21: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

21

Example

b a a a ! \

q0 q1 q2 q2 q3 q4

Page 22: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

22

Example

Page 23: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

23

Example

Page 24: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

24

Example

Page 25: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

25

Example

Page 26: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

26

Example

Page 27: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

27

Example

Page 28: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

28

Example

Page 29: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

29

Example

Page 30: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

30

Key points

States in the search space are pairings of tape positions and states in the machine.

By keeping track of as yet unexplored states, a recognizer can systematically explore all the paths through the machine given an input.

Page 31: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

31

Ordering of states

But how do you keep track?Depth-first/last in first out (LIFO)/stack

Unexplored states are added to the front of the agenda, and they are explored by going to the most recent.

Breadth-first/first in first out (FIFO)/queueUnexplored states are added to the back of the

agenda, and they are explored by going to the most recent.

Page 32: Finite-state automata 2 Day 13

SLP §2.2 Finite-state automata

2.2.7 Comparison

Page 33: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

33

Equivalence

Non-deterministic machines can be converted to deterministic ones with a fairly simple construction.

That means that they have the same power:non-deterministic machines are not more

powerful than deterministic ones in terms of the languages they can accept.

Page 34: Finite-state automata 2 Day 13

23-Sept-2009 LING 681.02, Prof. Howard, Tulane University

34

Why bother?

Non-determinism doesn’t get us more formal power and it causes headaches, so why bother?

More natural (understandable) solutions.

Page 35: Finite-state automata 2 Day 13

Next time

SLP §2.3 briefly

SLP §3