regular expressions

Post on 18-Nov-2014


Regular Expressions

Regular Expression

• A regular expression (RE) is defined inductively:

a = an ordinary character from the alphabet Σ
ε = the empty string

Regular Expression

R|S = either R or S
RS = R followed by S (concatenation)
R* = concatenation of R zero or more times (R* = ε|R|RR|RRR...)

RE Extensions

R? = ε | R (zero or one R)
R+ = RR* (one or more R)

RE Extensions

[abc] = a|b|c (any of listed)
[a-z] = a|b|....|z (range)
[^ab] = c|d|... (anything but ‘a’ or ‘b’)
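The extension identities above can be spot-checked with Python's `re` module, which uses the same `?`, `+`, and `[...]` notation. This is only a sketch: the helper `same_language` is a hypothetical name introduced here, and it compares two patterns on all strings up to a small length over a small alphabet, so it gives evidence rather than a proof.

```python
import re
from itertools import product

def same_language(p1, p2, alphabet="abcd", max_len=4):
    """Return True if p1 and p2 accept the same strings of length <= max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            # fullmatch succeeds only if the whole string is in the language
            if bool(re.fullmatch(p1, s)) != bool(re.fullmatch(p2, s)):
                return False
    return True

print(same_language(r"a?", r"(|a)"))      # R? = ε | R  (ε written as an empty alternative)
print(same_language(r"a+", r"aa*"))       # R+ = RR*
print(same_language(r"[abc]", r"a|b|c"))  # any of listed
print(same_language(r"[^ab]", r"c|d"))    # anything but 'a' or 'b', over the alphabet {a,b,c,d}
```

All four comparisons should report agreement on the tested strings.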

Regular Expression

RE          Strings in L(R)
a           “a”
ab          “ab”
a|b         “a” “b”
(ab)*       “” “ab” “abab” ...
(a|ε)b      “ab” “b”
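The table can be checked mechanically. The sketch below assumes Python's `re` syntax as a stand-in for the slide notation; since that syntax has no literal for ε, `(a|ε)b` is written with an empty alternative as `(a|)b`. The dictionary name `table` is introduced here for illustration.

```python
import re

# Each RE from the table (in Python syntax) with the sample strings listed for it.
table = {
    r"a": ["a"],
    r"ab": ["ab"],
    r"a|b": ["a", "b"],
    r"(ab)*": ["", "ab", "abab"],
    r"(a|)b": ["ab", "b"],
}

for pattern, strings in table.items():
    for s in strings:
        # fullmatch requires the whole string to match, i.e. s is in L(R)
        assert re.fullmatch(pattern, s) is not None
    print(pattern, "accepts", strings)

# Two strings that are not in the corresponding languages
print(re.fullmatch(r"(ab)*", "aba"))  # None
print(re.fullmatch(r"(a|)b", "aab"))  # None
```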

Example: integers

• integer: a non-empty string of digits
• digit = ‘0’|‘1’|‘2’|‘3’|‘4’|‘5’|‘6’|‘7’|‘8’|‘9’
• integer = digit digit*
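As a small illustration, the two definitions translate almost literally into Python's `re` syntax. The names `DIGIT`, `INTEGER`, and `is_integer` are made up for this sketch.

```python
import re

DIGIT = r"[0-9]"               # digit = '0'|'1'|...|'9'
INTEGER = DIGIT + DIGIT + "*"  # integer = digit digit*

def is_integer(s):
    """True if s is a non-empty string of digits."""
    return re.fullmatch(INTEGER, s) is not None

print(is_integer("2014"))  # True
print(is_integer(""))      # False: the empty string is excluded
print(is_integer("12a"))   # False
```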

Example: identifiers

• identifier: a string of letters or digits starting with a letter
• C identifier: [a-zA-Z_][a-zA-Z0-9_]*
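The C identifier pattern is already valid Python `re` syntax, so it can be tried directly. In this sketch `is_c_identifier` is a hypothetical helper; note that the pattern alone does not exclude reserved words such as `int`, which a real lexer would handle separately.

```python
import re

C_IDENTIFIER = r"[a-zA-Z_][a-zA-Z0-9_]*"

def is_c_identifier(s):
    """True if the whole string s has the shape of a C identifier."""
    return re.fullmatch(C_IDENTIFIER, s) is not None

print(is_c_identifier("_tmp1"))  # True
print(is_c_identifier("x"))      # True
print(is_c_identifier("9lives")) # False: starts with a digit
print(is_c_identifier(""))       # False: at least one character is required
```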


Regular Definitions

• Writing a regular expression for some languages can be difficult, because the expression can become quite complex. In those cases, we may use regular definitions.

• We can give names to regular expressions, and we can use these names as symbols to define other regular expressions.

• A regular definition is a sequence of definitions of the form:

d1 → r1
d2 → r2
...
dn → rn

where each di is a distinct name and each ri is a regular expression over the symbols in Σ ∪ {d1,d2,...,di-1}.

Specification of Patterns for Tokens: Regular Definitions

• Example:

letter → A|B|…|Z|a|b|…|z
digit → 0|1|…|9
id → letter ( letter | digit )*
digits → digit digit*

Regular Definitions (cont.)

• Ex: Identifiers in Pascal

letter → A | B | ... | Z | a | b | ... | z
digit → 0 | 1 | ... | 9
id → letter (letter | digit)*

– If we try to write the regular expression representing identifiers without using regular definitions, that regular expression will be complex:

(A|...|Z|a|...|z) ( (A|...|Z|a|...|z) | (0|...|9) )*

• Ex: Unsigned numbers in Pascal

digit → 0 | 1 | ... | 9
digits → digit+
opt-fraction → ( . digits )?
opt-exponent → ( E (+|-)? digits )?
unsigned-num → digits opt-fraction opt-exponent
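Regular definitions map naturally onto named pattern strings that are composed by substitution. The sketch below follows the unsigned-number definitions one by one in Python's `re` syntax; the variable names mirror the definition names, and `is_unsigned_num` is a helper invented for the example.

```python
import re

# Each regular definition becomes a named pattern built from the earlier ones.
digit = r"[0-9]"
digits = rf"{digit}+"
opt_fraction = rf"(\.{digits})?"      # ( . digits )?
opt_exponent = rf"(E[+-]?{digits})?"  # ( E (+|-)? digits )?
unsigned_num = digits + opt_fraction + opt_exponent

def is_unsigned_num(s):
    """True if the whole string s matches unsigned-num."""
    return re.fullmatch(unsigned_num, s) is not None

print(is_unsigned_num("123"))      # True
print(is_unsigned_num("6.02E23"))  # True
print(is_unsigned_num("1E-5"))     # True
print(is_unsigned_num("1."))       # False: a fraction needs digits after the dot
print(is_unsigned_num(".5"))       # False: digits are required before the dot
```

Expanding `unsigned_num` by hand gives one long pattern, which is exactly the complexity the named definitions are meant to hide.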

Specification of Patterns for Tokens: Notational Shorthand

• The following shorthands are often used:
– + : one or more instances
– ? : zero or one instance

r+ = rr*
r? = r | ε
[a-z] = a|b|c|…|z

• Examples:

digit → [0-9]
num → digit+ (. digit+)? ( E (+|-)? digit+ )?

Definition

• For primitive regular expressions:

L(Ø) = Ø
L(ε) = {ε}
L(a) = {a}

Definition (continued)

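• For regular expressions r1 and r2 (here + denotes alternation, written | on the earlier slides):

L(r1 + r2) = L(r1) ∪ L(r2)
L(r1 · r2) = L(r1) L(r2)
L(r1*) = (L(r1))*
L((r1)) = L(r1)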

Concatenation of Languages

• If L1 and L2 are languages, we can define the concatenation L1L2 = {w | w=xy, x∈L1, y∈L2}

• Examples:
– {ab, ba}{cd, dc} = {abcd, abdc, bacd, badc}
– Ø{ab} = Ø
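For finite languages, the definition of concatenation is a one-line set comprehension. This sketch represents a language as a Python set of strings; the function name `concat` is chosen here for illustration.

```python
def concat(L1, L2):
    """Concatenation L1L2 = {xy | x in L1, y in L2} of two finite languages."""
    return {x + y for x in L1 for y in L2}

print(sorted(concat({"ab", "ba"}, {"cd", "dc"})))  # ['abcd', 'abdc', 'bacd', 'badc']
print(concat(set(), {"ab"}))                        # set(): concatenating with Ø gives Ø
```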

Kleene Closure

• L* = ⋃_{i≥0} L^i = L^0 ∪ L^1 ∪ L^2 ∪ …

• Examples:
– {ab, ba}* = {ε, ab, ba, abab, abba, …}
– Ø* = {ε}
– {ε}* = {ε}
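Since L* is usually infinite, a program can only enumerate a bounded part of it. The sketch below computes L^0 ∪ L^1 ∪ … ∪ L^k for a finite language, with ε represented by the empty string; `kleene_up_to` and the bound `k` are assumptions of this example, not part of the definition.

```python
def concat(L1, L2):
    """Concatenation of two finite languages given as sets of strings."""
    return {x + y for x in L1 for y in L2}

def kleene_up_to(L, k):
    """Return L^0 ∪ L^1 ∪ ... ∪ L^k for a finite language L."""
    result = set()
    power = {""}  # L^0 = {ε}
    for _ in range(k + 1):
        result |= power
        power = concat(power, L)  # next power of L
    return result

print(sorted(kleene_up_to({"ab", "ba"}, 2), key=lambda w: (len(w), w)))
# ['', 'ab', 'ba', 'abab', 'abba', 'baab', 'baba']
print(kleene_up_to(set(), 5))  # {''}, matching Ø* = {ε}
print(kleene_up_to({""}, 5))   # {''}, matching {ε}* = {ε}
```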

Example

• Regular expression r = (0+1)* 00 (0+1)*

L(r) = { all strings with at least two consecutive 0 }

Example

• Regular expression r = (1+01)* (0+ε)

L(r) = { all strings without two consecutive 0 }

Equivalent Regular Expressions

• Definition: Regular expressions r1 and r2 are equivalent if L(r1) = L(r2)

Example

• L = { all strings without two consecutive 0 }

r1 = (1+01)* (0+ε)
r2 = (1*011*)* (0+ε) + 1* (0+ε)

L(r1) = L(r2) = L, so r1 and r2 are equivalent regular expressions.
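The claimed equivalence can be sanity-checked by brute force on short strings. The sketch below rewrites r1 and r2 in Python's `re` syntax (`|` for +, an empty alternative for ε) and compares both against the direct test "does not contain 00". The names `R1`, `R2`, and `agree_up_to` are invented here, and agreement up to a length bound is evidence, not a proof.

```python
import re
from itertools import product

R1 = r"(1|01)*(0|)"           # r1 = (1+01)*(0+ε)
R2 = r"(1*011*)*(0|)|1*(0|)"  # r2 = (1*011*)*(0+ε) + 1*(0+ε)

def strings_up_to(n, alphabet="01"):
    """Yield every string over the alphabet of length 0..n."""
    for k in range(n + 1):
        for chars in product(alphabet, repeat=k):
            yield "".join(chars)

def agree_up_to(n):
    """True if r1, r2 and the property 'no two consecutive 0' agree on all strings of length <= n."""
    for w in strings_up_to(n):
        in_r1 = re.fullmatch(R1, w) is not None
        in_r2 = re.fullmatch(R2, w) is not None
        in_L = "00" not in w
        if not (in_r1 == in_r2 == in_L):
            return False
    return True

print(agree_up_to(10))  # True
```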

Assignment

• Σ = {0, 1}

• What is the language for
– 0*1*

• What is the regular expression for
– {w | w has at least one 1}
– {w | w starts and ends with same symbol}
– {w | |w| 5}
– {w | every 3rd position of w is 1}
– L+ = L^1 ∪ L^2 ∪ …
– L? (means an optional L)

Regular Expressions and Regular Languages

Theorem

Languages Generated by Regular Expressions = Regular Languages

Standard Representations of Regular Languages

Regular Languages can be represented by:

– FAs
– NFAs
– Regular Expressions

Elementary Questions about Regular Languages

Membership Question

Question: Given regular language L and string w, how can we check if w ∈ L?

Answer: Take the DFA that accepts L and check if w is accepted.
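The membership test amounts to running the DFA on w. In this sketch a DFA is a plain dictionary with keys `start`, `finals`, and `delta`, a representation chosen only for the example; `NO_00` is a hypothetical DFA for the language of strings without two consecutive 0.

```python
def accepts(dfa, w):
    """Run the DFA on w and report whether it stops in a final state."""
    state = dfa["start"]
    for symbol in w:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["finals"]

# Hypothetical example DFA over {0, 1} for
# L = { all strings without two consecutive 0 }
NO_00 = {
    "start": "q0",
    "finals": {"q0", "q1"},
    "delta": {
        ("q0", "1"): "q0", ("q0", "0"): "q1",        # q1: the last symbol was 0
        ("q1", "1"): "q0", ("q1", "0"): "dead",      # a second 0 in a row is fatal
        ("dead", "0"): "dead", ("dead", "1"): "dead",
    },
}

print(accepts(NO_00, "0101"))  # True
print(accepts(NO_00, "1001"))  # False
```

The check takes one transition per input symbol, so it is linear in the length of w.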

[Figure: two DFA diagrams, one for the case w ∈ L and one for the case w ∉ L]

Question: Given regular language L, how can we check if L is empty (L = Ø)?

Answer: Take the DFA that accepts L. Check if there is any path from the initial state to a final state. L is empty exactly when no such path exists.
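The emptiness test is a graph reachability search. The sketch below uses breadth-first search over a DFA stored as a dictionary with keys `start`, `finals`, and `delta` (an assumed representation); the two example automata are made up for illustration.

```python
from collections import deque

def is_empty(dfa):
    """L(dfa) is empty iff no final state is reachable from the start state."""
    seen = {dfa["start"]}
    queue = deque([dfa["start"]])
    while queue:
        state = queue.popleft()
        if state in dfa["finals"]:
            return False  # found a path from the initial state to a final state
        for (src, _symbol), dst in dfa["delta"].items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return True

# Hypothetical DFA whose final state b is reachable from a
NONEMPTY = {
    "start": "a", "finals": {"b"},
    "delta": {("a", "0"): "b", ("a", "1"): "a", ("b", "0"): "b", ("b", "1"): "b"},
}
# Hypothetical DFA whose final state c cannot be reached from a
EMPTY = {
    "start": "a", "finals": {"c"},
    "delta": {("a", "0"): "a", ("a", "1"): "a", ("c", "0"): "c", ("c", "1"): "a"},
}

print(is_empty(NONEMPTY))  # False
print(is_empty(EMPTY))     # True
```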

[Figure: two DFA diagrams, one with L ≠ Ø and one with L = Ø]

Question: Given regular language L, how can we check if L is finite?

Answer: Take the DFA that accepts L. Check if there is a walk with a cycle from the initial state to a final state. L is infinite exactly when such a walk exists.
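One way to carry out this check is to keep only the states that lie on some walk from the initial state to a final state and then look for a cycle among them. The sketch below does that with a removal procedure in the style of topological sorting, which is a choice made for this example rather than something the slides prescribe; the dictionary representation of a DFA and both example automata are assumptions.

```python
def reachable(sources, edges):
    """All states reachable from sources; edges maps a state to a set of successors."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        s = stack.pop()
        for t in edges.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def is_finite(dfa):
    forward, backward = {}, {}
    for (src, _symbol), dst in dfa["delta"].items():
        forward.setdefault(src, set()).add(dst)
        backward.setdefault(dst, set()).add(src)
    # States on some walk from the initial state to a final state
    useful = reachable({dfa["start"]}, forward) & reachable(dfa["finals"], backward)
    # Count incoming edges inside the useful part of the graph
    indegree = {s: 0 for s in useful}
    for s in useful:
        for t in forward.get(s, ()):
            if t in useful:
                indegree[t] += 1
    # Repeatedly remove states with no incoming edge; states on a cycle are never removed
    ready = [s for s in useful if indegree[s] == 0]
    removed = 0
    while ready:
        s = ready.pop()
        removed += 1
        for t in forward.get(s, ()):
            if t in useful:
                indegree[t] -= 1
                if indegree[t] == 0:
                    ready.append(t)
    return removed == len(useful)  # everything removed means no cycle, so L is finite

# Hypothetical DFA accepting exactly the strings of length 1
LENGTH_ONE = {
    "start": "a", "finals": {"b"},
    "delta": {("a", "0"): "b", ("a", "1"): "b",
              ("b", "0"): "dead", ("b", "1"): "dead",
              ("dead", "0"): "dead", ("dead", "1"): "dead"},
}
# Hypothetical DFA accepting all strings without two consecutive 0
NO_00 = {
    "start": "q0", "finals": {"q0", "q1"},
    "delta": {("q0", "1"): "q0", ("q0", "0"): "q1",
              ("q1", "1"): "q0", ("q1", "0"): "dead",
              ("dead", "0"): "dead", ("dead", "1"): "dead"},
}

print(is_finite(LENGTH_ONE))  # True
print(is_finite(NO_00))       # False
```

Note that the cycle on the `dead` state of `LENGTH_ONE` does not count, because no final state can be reached from it.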

[Figure: two DFA diagrams, one where L is infinite and one where L is finite]

From RE to ε-NFA

• For every regular expression R, we can construct an ε-NFA A, s.t. L(A) = L(R).

• Proof by structural induction:

[Figure: base-case ε-NFAs for Ø, ε, and a single symbol a]

From RE to ε-NFA

[Figure: inductive-case ε-NFAs for R+S, RS, and R*, built from the ε-NFAs for R and S]
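The structural induction can be turned into a small program. The sketch below is a Thompson-style construction, which may differ in detail from the diagrams on the slides: each case creates a fresh start and final state and glues sub-automata together with ε-moves. Regular expressions are given as nested tuples rather than parsed from text, and all names (`build`, `eps_closure`, `accepts`, the node kinds) are assumptions made for this example.

```python
from itertools import count

_ids = count()

def new_state():
    return next(_ids)

# An ε-NFA is a triple (start, final, edges); edges is a list of (src, label, dst),
# where label None stands for an ε-move.

def build(r):
    """Build an ε-NFA from a regex tree:
    ("empty",), ("eps",), ("sym", a), ("alt", R, S), ("cat", R, S), ("star", R)."""
    kind = r[0]
    s, f = new_state(), new_state()
    if kind == "empty":   # Ø: the final state is unreachable
        return s, f, []
    if kind == "eps":     # ε: a single ε-move
        return s, f, [(s, None, f)]
    if kind == "sym":     # a: a single move on symbol a
        return s, f, [(s, r[1], f)]
    if kind == "star":    # R*: skip R, or run it and loop back
        rs, rf, edges = build(r[1])
        return s, f, edges + [(s, None, rs), (rf, None, rs), (rf, None, f), (s, None, f)]
    if kind in ("alt", "cat"):
        ls, lf, left = build(r[1])
        rs, rf, right = build(r[2])
        if kind == "alt":  # R+S: branch into either automaton
            glue = [(s, None, ls), (s, None, rs), (lf, None, f), (rf, None, f)]
        else:              # RS: run R, then S
            glue = [(s, None, ls), (lf, None, rs), (rf, None, f)]
        return s, f, left + right + glue
    raise ValueError(f"unknown node kind: {kind}")

def eps_closure(states, edges):
    """All states reachable from the given states using only ε-moves."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, w):
    """Simulate the ε-NFA on w by tracking the set of possible states."""
    start, final, edges = nfa
    current = eps_closure({start}, edges)
    for a in w:
        moved = {dst for src, label, dst in edges if src in current and label == a}
        current = eps_closure(moved, edges)
    return final in current

# (0+1)*1(0+1): strings whose second-to-last symbol is 1
zero_or_one = ("alt", ("sym", "0"), ("sym", "1"))
example = ("cat", ("star", zero_or_one), ("cat", ("sym", "1"), zero_or_one))
nfa = build(example)

print(accepts(nfa, "0010"))  # True
print(accepts(nfa, "0101"))  # False
```

The automata produced this way contain more states and ε-moves than a hand-drawn one would, which is the usual price of a purely mechanical construction.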

Example: (0+1)*1(0+1)

[Figure: ε-NFA for (0+1)*1(0+1)]

Example : (a+b)*aba
