regular expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · grep search files and...

53
Regular Expressions and grep and sed and more?

Upload: others

Post on 15-Oct-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Regular Expressionsand grepand sedand more?

Page 2: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Regular Expressions

Page 3: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Globs = Regular Expressions

Differences

Page 4: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Regular Expressions

● Patterns that match against certain strings

● Different from globs

● Compatible with many applications

● But why are they called regular expressions?○ For interesting theoretical reasons○ That you will learn later

Page 5: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Example: Phone numbers

● Multiple possible strings○ 123-456-7890○ 1234567890○ 456-789-1234

● But the formats follow a few patterns○ ###-###-###○ ##########

Page 6: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Solution: Regular expressions

● Create a pattern that specifies which strings to match

● (\d{3}-?){2}\d{4} – matches a phone number

Page 7: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Examples

● gpi – matches "gpi"

● [hjkl] – matches "h", "j", "k", and "l"

● 07-?131 – matches "07131" and "07-131"

● item[1-3] – matches "item1", "item2", "item3"

● codes* – matches "code", "codes", "codess", "codesssss", etc.

Page 8: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Parts of a regular expression● Normal characters

○ gpi – matches "gpi"

● Quantifiers○ repeating* – matches "repeatin", "repeating", " repeatingggg", etc.

○ ab{1,3} – matches "ab", "abb", or "abbb"

● Character classes○ [hjkl] – matches "h", "j", "k", "l"

○ \d – matches and digit

○ . – matches any character

● Use parentheses for grouping

Page 9: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

QuantifiersQuantifier

Zero or one

Zero or more

One or more

Exactly 3

3 or more

Between 3 and 6

Matches

a?

a*

a+

a{3}

a{3,}

a{3,6}

Page 10: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Character classesClass

a or b or c

not any of a, b, c

A lowercase letter

Not a letter

Whitespace

Digit

Any single character

Matches

[abc]

[^abc]

[a-z]

[^A-Za-z]

\s

\d

.

Page 11: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Example

(\d{3}-?){2}\d{4}Matches any digit

Page 12: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Example

(\d{3}-?){2}\d{4}Matches any 3 digits

Page 13: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Example

(\d{3}-?){2}\d{4}Matches an optional hyphen

Page 14: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Example

(\d{3}-?){2}\d{4}Matches 2 groups of 3 digits

Ex:123-456-123456-123456

Page 15: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Example

(\d{3}-?){2}\d{4}Matches 2 groups of 3 digits,

then 4 more digits

Page 16: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Special sequences

● $ - End of string

● ^ - Start of string

● Parentheses for grouping

Page 17: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Cheat sheet● a* – Matches zero or more times

● a? – Matches one or zero times

● a{3} – Matches three times

● . – Matches any single character

● [a-z0-9] – Matches a digit or lowercase character

● [^xy] – Matches anything other than x and y.

● ^ - Matches start of string

● $ - Matches end of string

Page 18: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Quiz!Matches Regex

abab(ab)? or (ab){2,3}

(ab)*

[A-Za-z]\d

[a-z]*\.com

ababab or abab

ab any number of times

[any letter][any number] ex: A4

example.com website.com etc.

Page 19: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Regex vs Globs and ranges

Regex Glob/Range equivalent

?

file{1..7}.txt

*

Not possible

.

file[1-7]\.txt

.*

(ab)*

Page 20: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Grep

● Search files and directories using regular expressions!

● Prints lines that include a match

● Name comes from g/re/p command in the UNIX text editor ed

● $ grep 'evidence' largefile.txt

○ Searches largefile.txt for "evidence".

● $ grep -r 'secrets' path/to/directory

○ Searches recursively for "secrets".

Page 21: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Sed (look familiar????)

● Stands for "stream editor"

● Can perform find and replace on a file

● sed 's/find/replace/g' path/to/file○ Prints result of replacement to the command line, leaving input

untouched

● sed -i 's/find/replace/g' path/to/file○ "In place"

○ Edits the file

Page 22: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

How does grep work?

● It seems like some guessing is necessary○ Imagine matching "abc" against a?b?c?abc

○ Lots of guessing would be exponential time.

● But grep is fast○ For deep theoretical reasons. Involving finite state machines.

Page 23: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

<Extra Content>

Page 24: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Super Stylish TA Merch

FgrUeNatCprTa

ctIicOaNliSdA

eaRsfEoBrcoAm

puSteHrsSciCe

nRtIiPsTtSs<3

:%s/[A-Z]//g

Page 25: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Super Stylish TA Merch

FgrUeNatCprTa

ctIicOaNliSdA

eaRsfEoBrcoAm

puSteHrsSciCe

nRtIiPsTtSs<3

:%s/[A-Z]//g

Page 26: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Super Stylish TA Merch

greatpra

cticalid

easforcom

puterscie

ntists<3

:%s/[A-Z]//g

Page 27: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Super Stylish TA Merch

great practical

ideas for

computer scientists

<3

Page 28: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Super Stylish TA Merch

FgrUeNatCprTa

ctIicOaNliSdA

eaRsfEoBrcoAm

puSteHrsSciCe

nRtIiPsTtSs<3

:%s/[a-z]//g

Page 29: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Super Stylish TA Merch

FgrUeNatCprTa

ctIicOaNliSdA

eaRsfEoBrcoAm

puSteHrsSciCe

nRtIiPsTtSs<3

:%s/[a-z]//g

Page 30: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Super Stylish TA Merch

FUNCT

IONSA

REBA

SHSC

RIPTS<3

:%s/[a-z]//g

Page 31: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Super Stylish TA Merch

FUNCTIONS

ARE

BASH

SCRIPTS

<3

Page 32: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA = DeterministicFinite-stateAutomaton

Page 33: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =

Page 34: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

Page 35: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

Page 36: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

Page 37: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

0100

Page 38: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

0101

Page 39: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

0100

Page 40: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

0100

Page 41: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

1001

Page 42: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

1001

Page 43: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

1001

Page 44: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

DFA =q1

1

0

0

1

0

0

11

q2 q3

q4 q5

0

1

1001

Page 45: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Regex Efficiently Converted NFA

Page 46: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

NFA = NondeterministicFinite-stateAutomaton

Page 47: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

NFA =

Page 48: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

NFA =• Accept if there exists any path to an

accepting state

• Computed by updating the set of reachable states

Page 49: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Regex NFA

grep

Evaluate NFA on string

Page 50: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

</Extra Content>

If we call the states “slides” and the transitions “hyperlinks”…

Just one more thing…

Page 51: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Cheat sheet● a* – Matches zero or more times

● a? – Matches one or zero times

● a{3} – Matches three times

● . – Matches any single character

● [a-z0-9] – Matches a digit or lowercase character

● [^xy] – Matches anything other than x and y.

● ^ - Matches start of string

● $ - Matches end of string

Page 52: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Bash scripting summary

● Bash scripts end in a .sh extension

● Always start with a shebang○ #!/usr/bin/env bash

● Add permissions with chmod +x script.sh

Page 53: Regular Expressions07131/f19/topics/readings/week-8/... · 2019. 11. 23. · Grep Search files and directories using regular expressions! Prints lines that include a match Name comes

Lab pro tips

● Lab is up! zombielab

● Be careful with escaping correctly. Both bash and regex have characters that must be escaped

● Don't forget to do chmod +x script.sh and add #!/usr/bin/env bash

● scottylabs.org/wdw/register