Lexical Analysis

Regular Expressions & DFA

Spring 2016, CSCI 565 - Compiler Design

Pedro Diniz, pedro@isi.edu

Copyright 2016, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies of these materials for their personal use.

Outline

• What is a Lexical Analyzer?

• Regular Expressions

• Matching regular expressions using Nondeterministic Finite Automata (NFA)

• Transforming an NFA to a DFA

What is a Lexical Analyzer?

• Example of Tokens
– Operators: = + - > ( { := == <>
– Keywords: if while for int double
– Numeric literals: 43 5.65 -3.6e10 0x13F3A
– Character literals: 'a' '~' '\''
– String literals: "565" "Fall 10" "\"\" = empty"

• Example of non-tokens
– White space: space (' '), tab ('\t'), end-of-line ('\n')
– Comments: /*this is not a token*/

Source Program Text → Tokens

Lexical Analyzer in Action

Input characters: for var1 = 10 var1 <=

Output tokens: for_key ID("var1") eq_op Num(10) ID("var1") leq_op

Lexical Analyzer Needs To...

• Partition the Input Program Text into Subsequences of Characters Corresponding to Tokens

• Attach the Corresponding Attributes to the Tokens

• Eliminate White Space and Comments
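The three duties listed above can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the token names are borrowed from the "Lexical Analyzer in Action" slide, while `TOKEN_SPEC`, `MASTER` and `tokenize` are hypothetical names, and the tiny token set is an assumption chosen to cover that one example.

```python
import re

# Hypothetical token specification (an assumption for illustration).
# Order matters: earlier entries win, so the keyword comes before ID
# and "<=" comes before "=".
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),              # non-token, dropped
    ("WS",      r"[ \t\n]+"),               # non-token, dropped
    ("for_key", r"for\b"),
    ("Num",     r"[0-9]+"),
    ("ID",      r"[A-Za-z_][A-Za-z0-9_]*"),
    ("leq_op",  r"<="),
    ("eq_op",   r"="),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC), re.DOTALL
)

def tokenize(text):
    """Partition text into (token type, attribute) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = m.lastgroup
        if kind == "Num":
            tokens.append((kind, int(m.group())))   # attribute: numeric value
        elif kind == "ID":
            tokens.append((kind, m.group()))        # attribute: identifier name
        elif kind not in ("WS", "COMMENT"):
            tokens.append((kind, None))             # operators and keywords carry no attribute
        pos = m.end()
    return tokens
```

Calling `tokenize("for var1 = 10 var1 <=")` yields the same token sequence as the slide: a `for_key`, an `ID` carrying `"var1"`, an `eq_op`, a `Num` carrying `10`, and so on, with white space silently discarded.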

• Precisely identify the type of token that matches the input string
– 603 → Num(603)
– CSCI565 → ID("CSCI565")

• Precisely describe different types of tokens
– FORTRAN: DO I=1,10
– C++: for(int i=1; i <= 10; i++)
– C-shell: foreach i (1 2 3 4 5 6 7 8 9 10)

• Use Regular Expressions to precisely describe which strings each type of token matches

Examples of Regular Expressions

Regular Expression                 Strings Matched
a                                  "a"
a · b                              "ab"
a | b                              "a" "b"
ε                                  ""
a*                                 "" "a" "aa" "aaa" …
(a | ε) · b                        "ab" "b"
num = 0|1|2|3|4|5|6|7|8|9          "0", "1", …
posint = num · num*                "8" "6035" …
int = (ε | -) · posint             "-42" "1024" …
real = int · (ε | (. · posint))    "-12.56" "12" "1.414" …
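To experiment with these definitions, they can be transcribed into Python's `re` syntax. This is a rough sketch under two assumptions: concatenation (·) is written by juxtaposition, and the literal dot of the slide notation is escaped as `\.` because an unescaped `.` means "any character" in `re`. The names `NUM`, `POSINT`, `INT`, `REAL` and `matches` are illustrative only.

```python
import re

NUM = r"[0-9]"                        # num = 0|1|2|3|4|5|6|7|8|9
POSINT = rf"{NUM}{NUM}*"              # posint = num · num*
INT = rf"-?{POSINT}"                  # int = (ε | -) · posint
REAL = rf"{INT}(?:\.{POSINT})?"       # real = int · (ε | (. · posint))

def matches(pattern, s):
    """True when the whole string s belongs to the language of pattern."""
    return re.fullmatch(pattern, s) is not None
```

`re.fullmatch` is used rather than `re.match` so that the entire string has to be in the language, which mirrors the "Strings Matched" column.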

Definition: Formal Languages

• Alphabet Σ = finite set of symbols
– Σ = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }

• String s = finite sequence of symbols from the alphabet
– s = 6004

• Empty string ε = special string of length zero

• Language L = set of strings over an alphabet
– L = { 6001, 6002, 6003, 6004, 6035, 6891, … }

• For a regular expression r, the language L(r) = { all the strings that match r }
– L((a | ε) · b) = { "ab", "b" }

• Suppose r and s are Regular Expressions denoting languages L(r) and L(s)
– L(r | s) = L(r) ∪ L(s)
– L(r · s) = { xy | x ∈ L(r) and y ∈ L(s) }
– L(r*) = { x1 x2 ... xk | xi ∈ L(r) and k >= 0 }
– L(ε) = { "" }, the set containing only the empty string (not the empty set)
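The set definitions above can be tried out directly on small finite languages. The sketch below is an illustration with hypothetical helper names (`union`, `concat`, `star`); because L(r*) is infinite in general, `star` is truncated to at most `max_k` repetitions, which is an assumption made only to keep the result finite.

```python
EPS = {""}   # L(ε): the language holding only the empty string

def union(A, B):
    """L(r | s) = L(r) ∪ L(s)."""
    return A | B

def concat(A, B):
    """L(r · s) = { xy | x in L(r) and y in L(s) }."""
    return {x + y for x in A for y in B}

def star(A, max_k):
    """Kleene closure of A, truncated to at most max_k repetitions."""
    result = {""}          # k = 0 contributes the empty string
    layer = {""}
    for _ in range(max_k):
        layer = concat(layer, A)
        result |= layer
    return result
```

For example, `concat(union({"a"}, EPS), {"b"})` gives `{"ab", "b"}`, matching L((a | ε) · b). Concatenating with the empty set instead of `EPS` gives the empty set, which is why L(ε) has to contain the empty string rather than be empty.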

More Regular Expressions

• We know:
– L(r | s) is the union of L(r) and L(s)
– L(r · s) is the concatenation of L(r) and L(s)
– L(r*) is the Kleene closure of L(r): "zero or more occurrences of"

• A few additional ones:
– "one or more occurrences of": r+ = r · r*
– "zero or one occurrence of": r? = r | ε

Question

• What regular expression best identifies USC course numbers?

num = 0|1|2|3|4|5|6|7|8|9

1) class = num · num*

2) class = num · . · num*

3) class = num | . | num*

4) class = (num · . · num)*
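One way to reason about the four candidates is to transcribe them into Python `re` patterns and try a few sample strings. The sketch below assumes the `.` in the candidates is a literal dot (escaped as `\.`); the sample strings and the names `CANDIDATES` and `accepted_by` are illustrative choices, and the slide itself does not state what a course number looks like.

```python
import re

CANDIDATES = {
    1: r"[0-9][0-9]*",          # num · num*
    2: r"[0-9]\.[0-9]*",        # num · . · num*
    3: r"[0-9]|\.|[0-9]*",      # num | . | num*
    4: r"(?:[0-9]\.[0-9])*",    # (num · . · num)*
}

def accepted_by(sample):
    """Return the numbers of the candidates whose language contains sample."""
    return [k for k, pat in CANDIDATES.items() if re.fullmatch(pat, sample)]
```

Trying `accepted_by` on strings such as `"565"`, `""`, `"."` and `"5.6"` shows which candidates accept strings that should be rejected, not just which ones accept a plausible course number.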

Reg. Expression to NFA Construction

• Base cases: an NFA for a single symbol a, and an NFA for ε

• If r and s are regular expressions with the NFAs for r and s, then NFAs are built (by adding ε-transitions) for:
– r · s
– r | s
– r*
– r?
– r+

[Figures: NFA diagrams for a, ε, r · s, r | s, r*, r? and r+]
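The construction rules can be sketched in Python as follows. The wiring used here is the common Thompson-style one (fresh start and end states joined by ε-edges) and is an assumption: the exact edges drawn in the slide figures may differ. `Builder`, `EPS` and `accepts` are hypothetical names, and each fragment is meant to be used exactly once when composing larger ones.

```python
from itertools import count

EPS = None  # label used for ε-transitions

class Builder:
    """Builds NFA fragments; a fragment is a (start, accept) pair of state ids."""

    def __init__(self):
        self._ids = count(1)
        self.edges = []  # (source, label, target) triples

    def _new(self):
        return next(self._ids)

    def symbol(self, a):
        s, t = self._new(), self._new()
        self.edges.append((s, a, t))
        return s, t

    def epsilon(self):
        return self.symbol(EPS)

    def concat(self, r, s):          # r · s
        self.edges.append((r[1], EPS, s[0]))
        return r[0], s[1]

    def alt(self, r, s):             # r | s
        start, end = self._new(), self._new()
        for frag in (r, s):
            self.edges.append((start, EPS, frag[0]))
            self.edges.append((frag[1], EPS, end))
        return start, end

    def star(self, r):               # r*: skip r entirely, or loop through it
        start, end = self._new(), self._new()
        self.edges += [(start, EPS, r[0]), (r[1], EPS, end),
                       (start, EPS, end), (r[1], EPS, r[0])]
        return start, end

    def plus(self, r):               # r+: like r* but without the skip edge
        start, end = self._new(), self._new()
        self.edges += [(start, EPS, r[0]), (r[1], EPS, end), (r[1], EPS, r[0])]
        return start, end

    def opt(self, r):                # r?: like r* but without the loop edge
        start, end = self._new(), self._new()
        self.edges += [(start, EPS, r[0]), (r[1], EPS, end), (start, EPS, end)]
        return start, end

def accepts(edges, frag, text):
    """Check a fragment by tracking every state the NFA could be in."""
    def closure(states):
        todo, seen = list(states), set(states)
        while todo:
            s = todo.pop()
            for src, label, dst in edges:
                if src == s and label is EPS and dst not in seen:
                    seen.add(dst)
                    todo.append(dst)
        return seen

    current = closure({frag[0]})
    for ch in text:
        current = closure({dst for src, label, dst in edges
                           if src in current and label == ch})
    return frag[1] in current
```

For instance, `b = Builder(); frag = b.concat(b.alt(b.symbol("a"), b.epsilon()), b.symbol("b"))` builds an NFA for (a | ε) · b, and `accepts(b.edges, frag, "ab")` and `accepts(b.edges, frag, "b")` both hold.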

Construction Example

(-| ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

[Figure sequence: the NFA for this expression is assembled step by step, starting with the (-| ε) fragment, then the digit transitions and ε-edges for (0|1|…|9)+, then the "." transition and the second group of digit transitions for the optional part (. · (0|1|…|9)*)?.]

String Matching

Input string: - 1 2 . 8

[Figure sequence: the NFA from the construction example is traced on the input string -12.8.]
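A direct way to match a string against an NFA is to try its choices one at a time and back up when a choice fails. The sketch below does this for the number expression from the construction example. The eight-state transition table is an assumption: it was reconstructed so that it agrees with the ε-closure and DFAedge values quoted later in the slides, but the figures may wire the states differently. `EPS_EDGES`, `SYM_EDGES` and `accepts` are hypothetical names.

```python
DIGITS = "0123456789"

# Assumed NFA for (-|ε) · digit+ · (. · digit*)?  with states 1..8.
EPS_EDGES = {1: {2}, 3: {2, 4}, 4: {8}, 5: {6, 8}, 7: {6, 8}}
SYM_EDGES = {(1, "-"): {2}, (4, "."): {5}}
for d in DIGITS:
    SYM_EDGES[(2, d)] = {3}   # integer digits
    SYM_EDGES[(6, d)] = {7}   # fractional digits
START, ACCEPT = 1, 8

def accepts(text, state=START, i=0):
    """Backtracking search: try every ε-move and every move on text[i]."""
    if state == ACCEPT and i == len(text):
        return True
    for nxt in EPS_EDGES.get(state, ()):          # moves that consume no input
        if accepts(text, nxt, i):
            return True
    if i < len(text):
        for nxt in SYM_EDGES.get((state, text[i]), ()):
            if accepts(text, nxt, i + 1):
                return True
    return False
```

The recursion terminates only because this particular table has no cycle made purely of ε-edges. The need to explore and abandon paths is what makes a direct NFA implementation awkward.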

Implementing a Lexical Analyzer

• Need to find which strings match a Regular Expression
• Create an NFA to match the Regular Expression
• Unfortunately, an NFA does not have a simple implementation
• Need to create a Deterministic Finite Automaton (DFA) from the NFA

Constructing a DFA from a NFA

• Why do we need a DFA?
– Easy to implement
– Current state + input symbol uniquely identifies the next state

• How do you construct a DFA from a NFA?
– DFA keeps track of which states the NFA would be in
– Each state of the DFA is in fact a subset of the states of the NFA

[Figure: a small NFA with states 1 to 5 and edges labeled a and ε, shown next to the corresponding DFA with states A and B.]
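The "easy to implement" claim can be made concrete with a table-driven loop: one dictionary lookup per input character and no backtracking. The DFA below is hand-written for the number expression (-| ε) · (0|…|9)+ · (. · (0|…|9)*)? used in the construction example. The state names and the table are assumptions made for illustration and are not taken from the slides.

```python
DIGITS = "0123456789"

def char_class(c):
    """Collapse the ten digit characters into one table column."""
    return "digit" if c in DIGITS else c

# (current state, input class) -> next state; a missing entry means "reject".
TRANS = {
    ("start", "-"): "sign",
    ("start", "digit"): "int",
    ("sign", "digit"): "int",
    ("int", "digit"): "int",
    ("int", "."): "dot",
    ("dot", "digit"): "frac",
    ("frac", "digit"): "frac",
}
ACCEPTING = {"int", "dot", "frac"}

def dfa_accepts(text):
    state = "start"
    for c in text:
        state = TRANS.get((state, char_class(c)))
        if state is None:        # no transition: the string is rejected
            return False
    return state in ACCEPTING
```

Each step depends only on the current state and the current character, so the running time is linear in the length of the input.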

State ε-closure

• The ε-closure of a state s is the set of states that can be reached from that state without consuming any of the input
– ε-closure(S) is the smallest set T such that

$T = S \cup \left( \bigcup_{s \in T} \mathrm{edge}(s, \varepsilon) \right)$

• Algorithm (fixed-point)

T ← S
repeat
    T' ← T
    T ← T' ∪ ( ⋃_{s ∈ T'} edge(s, ε) )
until T = T'
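The fixed-point algorithm translates almost line for line into Python. The ε-edge table below is an assumption: it is a reconstruction of the eight-state NFA for the number expression, chosen so that the results agree with the closures the slides quote (ε-closure({1}) = {1, 2} and ε-closure({3}) = {2, 3, 4, 8}). `EPS_EDGES`, `edge_eps` and `eps_closure` are hypothetical names.

```python
# Assumed ε-edges of the example NFA: state -> states reachable by one ε-move.
EPS_EDGES = {1: {2}, 3: {2, 4}, 4: {8}, 5: {6, 8}, 7: {6, 8}}

def edge_eps(s):
    """edge(s, ε): states reachable from s by a single ε-transition."""
    return EPS_EDGES.get(s, set())

def eps_closure(S):
    """Smallest T with T = S ∪ (union of edge(s, ε) for s in T)."""
    T = set(S)                      # T ← S
    while True:
        T_prev = set(T)             # T' ← T
        for s in T_prev:
            T |= edge_eps(s)        # T ← T' ∪ edge(s, ε) for each s in T'
        if T == T_prev:             # until T = T'
            return T
```

The loop stops because T can only grow and the NFA has finitely many states.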

ε-closure({1})

(-| ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

[Figure sequence: the NFA for this expression, with states numbered 1 to 8, shown next to the algorithm as the values of S, T and T' change.]

Step                        S      T         T'
start                       {1}    {}        {}
T ← S                       {1}    {1}       {}
T' ← T                      {1}    {1}       {1}
T ← T' ∪ edge(s, ε)         {1}    {1, 2}    {1}
T' ← T                      {1}    {1, 2}    {1, 2}
T ← T' ∪ edge(s, ε)         {1}    {1, 2}    {1, 2}
until T = T'                stop: T = T' = {1, 2}

What is ε-closure({3})?

(-| ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

S = {3}
T = {2, 3, 4, 8}

DFAedge

• Given symbol c and a set of states S, what states can you reach?
– First find the states you can reach on the symbol c
– Then, compute the ε-closure to determine what other states are reachable from each new state by following ε-transitions.

$\mathrm{DFAedge}(S, c) = \varepsilon\text{-closure}\left( \bigcup_{s \in S} \mathrm{edge}(s, c) \right)$
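A minimal Python sketch of DFAedge follows. As before, the transition tables are an assumed reconstruction of the eight-state example NFA, picked to agree with the values the slides give (for example DFAedge({4}, .) = {5, 6, 8}), and all function and table names are hypothetical.

```python
DIGITS = "0123456789"

# Assumed example NFA: ε-edges and symbol edges for states 1..8.
EPS_EDGES = {1: {2}, 3: {2, 4}, 4: {8}, 5: {6, 8}, 7: {6, 8}}
SYM_EDGES = {(1, "-"): {2}, (4, "."): {5}}
for d in DIGITS:
    SYM_EDGES[(2, d)] = {3}
    SYM_EDGES[(6, d)] = {7}

def eps_closure(S):
    """Fixed-point ε-closure of a set of states."""
    T = set(S)
    while True:
        T_prev = set(T)
        for s in T_prev:
            T |= EPS_EDGES.get(s, set())
        if T == T_prev:
            return T

def edge(s, c):
    """States reachable from s by one transition on symbol c."""
    return SYM_EDGES.get((s, c), set())

def dfa_edge(S, c):
    """DFAedge(S, c) = ε-closure of the union of edge(s, c) over s in S."""
    targets = set()
    for s in S:
        targets |= edge(s, c)
    return eps_closure(targets)
```

Note that the closure is applied after the move on c, not before: the set S passed in is expected to be ε-closed already.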

What is DFAedge({1}, 3)?

S = {1}
edge({1}, 3) = {}
DFAedge({1}, 3) = {}

What is DFAedge({2}, 3)?

S = {2}
edge({2}, 3) = {3}
DFAedge({2}, 3) = ε-closure({3}) = {2, 3, 4, 8}

What is DFAedge({4}, .)?

S = {4}
DFAedge({4}, .) = {5, 6, 8}

NFA to DFA: the Subset Construction

• Approach
– Use Subset Construction
– Mimics the set of states the NFA could be in as it operates non-deterministically
– Label the DFA states as accepting if they contain at least one of the accepting states of the NFA

states[0] = {}    (the empty set, which acts as the error state)
states[1] = ε-closure({s1})
p = 1
j = 0
while (j ≤ p) do
    foreach c ∈ Σ do
        e = DFAedge(states[j], c)
        if (e = states[i] for some i ≤ p) then
            trans[j, c] = i
        else
            p = p + 1
            states[p] = e
            trans[j, c] = p
        end if
    end foreach
    j = j + 1
end while
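The worklist loop can be sketched in Python as below. Two assumptions are made: the NFA tables are a reconstruction of the eight-state example NFA that agrees with the closures quoted in the slides, and index 0 is reserved for the empty set as a dead state, with index 1 as the DFA start state. All names are hypothetical.

```python
DIGITS = "0123456789"
ALPHABET = DIGITS + "-."

# Assumed example NFA (states 1..8, start 1, accepting 8).
EPS_EDGES = {1: {2}, 3: {2, 4}, 4: {8}, 5: {6, 8}, 7: {6, 8}}
SYM_EDGES = {(1, "-"): {2}, (4, "."): {5}}
for d in DIGITS:
    SYM_EDGES[(2, d)] = {3}
    SYM_EDGES[(6, d)] = {7}

def eps_closure(S):
    T = set(S)
    while True:
        T_prev = set(T)
        for s in T_prev:
            T |= EPS_EDGES.get(s, set())
        if T == T_prev:
            return T

def dfa_edge(S, c):
    targets = set()
    for s in S:
        targets |= SYM_EDGES.get((s, c), set())
    return eps_closure(targets)

def subset_construction(start=1, accept=8):
    # states[0] is the empty (dead) state, states[1] the DFA start state.
    states = [frozenset(), frozenset(eps_closure({start}))]
    trans = {}
    j = 0
    while j < len(states):
        for c in ALPHABET:
            e = frozenset(dfa_edge(states[j], c))
            if e not in states:          # a subset not seen before
                states.append(e)
            trans[(j, c)] = states.index(e)
        j += 1                           # advance once per processed state
    accepting = {i for i, S in enumerate(states) if accept in S}
    return states, trans, accepting

def run(trans, accepting, text):
    state = 1
    for c in text:
        state = trans.get((state, c), 0)   # unknown characters go to the dead state
    return state in accepting
```

With these assumed tables the construction yields six DFA states including the dead state, and `run` then matches a string with one table lookup per character.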

Page 71: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •

71

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

71

ε

-ε . εε

ε

1 4 5 8

ε0

123

4567

8

9

32

ε0

123

4567

8

9

76

ε

RE = (-| ε) ·(0|1|2|3|4|5|6|7|8|9)+ · (. ·(0|1|2|3|4|5|6|7|8|9)*)?

Page 72: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •

72

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

72

ε

-ε . εε

ε

1 4 5 8

ε0

123

4567

8

9

32

ε0

123

4567

8

9

76

ε

RE = (-| ε) ·(0|1|2|3|4|5|6|7|8|9)+ · (. ·(0|1|2|3|4|5|6|7|8|9)*)?

Let's simplify the diagram first...

Page 73: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •

73

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

73

ε

-ε . εε

ε

1 4 5 8

ε

0...9

32

ε

76

ε

RE = (-| ε) ·(0|1|2|3|4|5|6|7|8|9)+ · (. ·(0|1|2|3|4|5|6|7|8|9)*)?

0...9 just means any character in the range from 0 to 9

0...9

ε-closure(1) = ?

ε-closure(1) = {1, 2}

This corresponds to the first state in the DFA...
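
The ε-closure computation is a plain graph reachability search over ε edges. A minimal sketch, assuming the ε edges below (they are inferred from the worked table in these slides, not read off the diagram):

```python
# Assumed ε edges of the example NFA: state -> states reachable by one ε edge.
EPS = {1: {2}, 3: {2, 4}, 4: {8}, 5: {6, 8}, 7: {6, 8}}


def eps_closure(states):
    """Return every NFA state reachable from `states` via zero or more ε edges."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in EPS.get(s, ()):
            if t not in closure:    # visit each state at most once
                closure.add(t)
                stack.append(t)
    return closure


print(sorted(eps_closure({1})))  # [1, 2]
```

The closure always contains the starting states themselves, since zero ε edges is allowed.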

Simplest approach for subset construction is to build a table

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}

From NFA {1, 2}, if input is 0...9

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}
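
Filling one cell of the table is what the pseudocode calls DFAedge: follow the input symbol from every NFA state in the set, then take the ε-closure of the result. A sketch under the assumption that the NFA has the edges below (reconstructed from the table, with illustrative names):

```python
# Assumed NFA edges (inferred from the table; not copied from the diagram).
EPS = {1: {2}, 3: {2, 4}, 4: {8}, 5: {6, 8}, 7: {6, 8}}
MOVE = {(1, "-"): {2}, (2, "0...9"): {3}, (4, "."): {5}, (6, "0...9"): {7}}


def eps_closure(states):
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in EPS.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def dfa_edge(states, symbol):
    """One table cell: move on `symbol` from each state, then ε-closure."""
    targets = set()
    for s in states:
        targets |= MOVE.get((s, symbol), set())
    return eps_closure(targets)


print(sorted(dfa_edge({1, 2}, "0...9")))  # [2, 3, 4, 8]
print(sorted(dfa_edge({1, 2}, "-")))      # [2]
print(sorted(dfa_edge({1, 2}, ".")))      # [] (an "error" cell)
```

An empty result means no NFA state in the set has an edge on that symbol, which the table records as error.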

NFA {2, 3, 4, 8} is a new combination, so add a DFA state

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}
1           {2, 3, 4, 8}

From NFA {1, 2}, if input is −

The − edge leads to NFA state 2, and its ε-closure doesn't lead anywhere else, so the result is {2}

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}
1           {2, 3, 4, 8}

NFA {2} is a new combination, so add a DFA state

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}
1           {2, 3, 4, 8}
2           {2}

From NFA {1, 2}, if input is .

No NFA state in {1, 2} has a transition on ., so this cell is an error

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}
2           {2}

From NFA {2, 3, 4, 8}, if input is 0...9

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}
2           {2}

From NFA {2, 3, 4, 8}, if input is −

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error
2           {2}

From NFA {2, 3, 4, 8}, if input is .

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}

NFA {5, 6, 8} is a new combination, so add a DFA state

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}
3           {5, 6, 8}

From NFA {2}, if input is 0...9

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}
3           {5, 6, 8}

From NFA {2}, if input is −

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error
3           {5, 6, 8}

From NFA {2}, if input is .

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error   error
3           {5, 6, 8}

From NFA {5, 6, 8}, if input is 0...9

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error   error
3           {5, 6, 8}       {6, 7, 8}

NFA {6, 7, 8} is a new combination, so add a DFA state

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error   error
3           {5, 6, 8}       {6, 7, 8}
4           {6, 7, 8}

From NFA {5, 6, 8}, if input is −

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error   error
3           {5, 6, 8}       {6, 7, 8}       error
4           {6, 7, 8}

From NFA {5, 6, 8}, if input is .

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error   error
3           {5, 6, 8}       {6, 7, 8}       error   error
4           {6, 7, 8}

From NFA {6, 7, 8}, if input is 0...9

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error   error
3           {5, 6, 8}       {6, 7, 8}       error   error
4           {6, 7, 8}       {6, 7, 8}

The last two cells (inputs − and .) are errors...

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1           {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error   error
3           {5, 6, 8}       {6, 7, 8}       error   error
4           {6, 7, 8}       {6, 7, 8}       error   error

No cells left to fill: DONE!

In this case, NFA state 8 is an accepting state, so any DFA state which contains NFA state 8 should also be accepting.

DFA state   NFA states      ε-closure after transition on...
                            0...9           −       .
0           {1, 2}          {2, 3, 4, 8}    {2}     error
1*          {2, 3, 4, 8}    {2, 3, 4, 8}    error   {5, 6, 8}
2           {2}             {2, 3, 4, 8}    error   error
3*          {5, 6, 8}       {6, 7, 8}       error   error
4*          {6, 7, 8}       {6, 7, 8}       error   error

(* marks an accepting DFA state)
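
Marking the accepting states is a one-line set test once the table exists. A small sketch that takes the DFA-state-to-NFA-set column from the table; the variable names are illustrative:

```python
# "NFA states" column of the table, keyed by DFA state number.
DFA_STATES = {
    0: {1, 2},
    1: {2, 3, 4, 8},
    2: {2},
    3: {5, 6, 8},
    4: {6, 7, 8},
}
NFA_ACCEPTING = {8}

# A DFA state accepts if its set shares at least one state with NFA_ACCEPTING.
accepting = {d for d, nfa_set in DFA_STATES.items() if nfa_set & NFA_ACCEPTING}

print(sorted(accepting))  # [1, 3, 4]
```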

Now use the table as a guide to construct the DFA diagram

[Figure: DFA diagram with states 0, 1, 2, 3 and 4; the edges follow the non-error transitions in the table]

[Figure: the resulting DFA (states 0 to 4) shown next to the original NFA (states 1 to 8) for the same regular expression]
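
Once the DFA exists, matching is a single table lookup per input character. A minimal sketch that encodes the non-error cells of the table as a dictionary; the helper names symbol_class and accepts are assumptions for illustration:

```python
# Non-error transitions copied from the table: (DFA state, edge label) -> DFA state.
TRANS = {
    (0, "0...9"): 1, (0, "-"): 2,
    (1, "0...9"): 1, (1, "."): 3,
    (2, "0...9"): 1,
    (3, "0...9"): 4,
    (4, "0...9"): 4,
}
ACCEPTING = {1, 3, 4}


def symbol_class(ch):
    """Map a character to its edge label, or None if it is outside the alphabet."""
    if "0" <= ch <= "9":
        return "0...9"
    return ch if ch in ("-", ".") else None


def accepts(text):
    state = 0                                  # DFA start state
    for ch in text:
        state = TRANS.get((state, symbol_class(ch)))
        if state is None:                      # missing entry = "error" cell
            return False
    return state in ACCEPTING


for sample in ["42", "-12.5", "3.", "-", ".5", ""]:
    print(repr(sample), accepts(sample))
```

Note that "3." is accepted, because the regular expression allows zero digits after the decimal point, while "-" alone and ".5" are rejected.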

NFA vs DFA: Complexity

• Matching time and space depend on the length of the regular expression, |r|, and the length of the input string, |x|

• NFA matching time is O(|r| × |x|) and space used is O(|r|)

• DFA matching time is O(|x|) and space used is O(2^|r|)
  – The number of states may grow exponentially (cf. subset construction)
  – Example: (a|b)*a (a|b) (a|b) … (a|b)

• With lazy transition evaluation, only the states actually reached in practice are computed
  – This optimization overcomes or mitigates the space problem
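
Both effects can be observed on the family of expressions above. The sketch below is an illustration under stated assumptions: it hard-codes an NFA for (a|b)*a followed by n-1 copies of (a|b), with states 0 to n and no ε edges, counts the DFA states reachable by the subset construction, and then matches lazily with a transition cache. The function names are hypothetical.

```python
def nfa_step(states, ch, n):
    """One NFA step: state 0 loops on a/b and goes to 1 on 'a'; i goes to i+1."""
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)
            if ch == "a":
                nxt.add(1)
        elif s < n:
            nxt.add(s + 1)
    return frozenset(nxt)


def count_dfa_states(n):
    """Eager subset construction: number of reachable DFA states."""
    start = frozenset({0})
    seen = {start}
    work = [start]
    while work:
        cur = work.pop()
        for ch in "ab":
            nxt = nfa_step(cur, ch, n)
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return len(seen)


def lazy_match(text, n):
    """Compute DFA transitions only when the input actually needs them."""
    cache = {}
    state = frozenset({0})
    for ch in text:
        key = (state, ch)
        if key not in cache:
            cache[key] = nfa_step(state, ch, n)
        state = cache[key]
    return n in state, len(cache)     # (accepted?, transitions materialized)


print([count_dfa_states(n) for n in range(1, 6)])
print(lazy_match("abba", 3))
```

The eager count doubles with each extra (a|b), while the lazy matcher never materializes more transitions than there are input characters.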

Summary

• A lexical analyzer creates tokens out of a text stream

• Tokens are defined using regular expressions (REs)

• Regular expressions can be mapped to nondeterministic finite automata (NFA)
  – by the simple Thompson's construction

• An NFA is transformed to a DFA
  – Transformation algorithm: the subset construction
  – Executing a DFA is straightforward