Transcript
Page 1: Beneath the Surface: Regular Expressions in Ruby

Photo by Mr. Christopher Thomas, Creative Commons Attribution-ShareAlike 2.0 Generic License

Beneath the Surface

Embracing the True Power of Regular Expressions in Ruby

@nellshamrell

Page 2: Beneath the Surface: Regular Expressions in Ruby

^4[0-9]{12}(?:[0-9]{3})?$

Source: regular-expressions.info
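
A hedged sketch of what the pattern above accepts. The source site describes it as a check for Visa card numbers (a leading 4, then 13 or 16 digits in total). The constant name VISA_PATTERN and the sample inputs are assumptions made for this illustration.

```ruby
# VISA_PATTERN is a hypothetical name for the regex shown on the slide.
VISA_PATTERN = /^4[0-9]{12}(?:[0-9]{3})?$/

# Made-up sample inputs, built programmatically so the digit counts are explicit.
sixteen_digits = "4" + "1" * 15   # leading 4, 16 digits total
thirteen_digits = "4" + "1" * 12  # leading 4, 13 digits total
wrong_prefix = "5" + "1" * 15     # 16 digits, but does not start with 4
too_short = "4" + "1" * 4         # only 5 digits

[sixteen_digits, thirteen_digits, wrong_prefix, too_short].each do |number|
  # match? returns true or false without building a MatchData object
  puts "#{number}: #{VISA_PATTERN.match?(number)}"
end
```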

Page 3: Beneath the Surface: Regular Expressions in Ruby

We fear what we do not understand

Page 4: Beneath the Surface: Regular Expressions in Ruby
Page 5: Beneath the Surface: Regular Expressions in Ruby

Regular Expressions

+ Ruby

Photo by Shayan, Creative Commons Attribution-ShareAlike 2.0 Generic License

Page 6: Beneath the Surface: Regular Expressions in Ruby

Regex Matching in Ruby

Ruby Methods

Onigmo

Page 7: Beneath the Surface: Regular Expressions in Ruby

Onigmo

Page 8: Beneath the Surface: Regular Expressions in Ruby

Oniguruma

Fork

Onigmo

Page 9: Beneath the Surface: Regular Expressions in Ruby

Onigmo

Reads Regex

Page 10: Beneath the Surface: Regular Expressions in Ruby

Onigmo

Reads Regex

Parses Into

Abstract Syntax Tree

Page 11: Beneath the Surface: Regular Expressions in Ruby

Onigmo

Reads Regex

Parses Into

Abstract Syntax Tree

Compiles Into

Series of Instructions
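
The syntax tree and the instruction series are internal to the engine, so the sketch below only observes the pipeline from the outside: a pattern that cannot be parsed is rejected when the Regexp is created, before any string is matched. The helper name compiles? is hypothetical and not part of Ruby or Onigmo.

```ruby
# Hypothetical helper: reports whether a pattern string can be
# turned into a Regexp at all.
def compiles?(pattern)
  Regexp.new(pattern)
  true
rescue RegexpError
  # Raised while the pattern is being read, not while matching
  false
end

puts compiles?("Y(olk|oda)")  # well-formed pattern
puts compiles?("Y(olk|oda")   # unbalanced parenthesis
```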

Page 13: Beneath the Surface: Regular Expressions in Ruby

A Finite State Machine Shows How

Something Works

Page 14: Beneath the Surface: Regular Expressions in Ruby

Annie the Dog

Page 15: Beneath the Surface: Regular Expressions in Ruby

In the House

Out of House

Annie the Dog

Page 16: Beneath the Surface: Regular Expressions in Ruby

In the House

Out of House

Annie the Dog

Door

Page 17: Beneath the Surface: Regular Expressions in Ruby

In the House

Out of House

Annie the Dog

Door

Door
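
The Annie diagram can be sketched as a two-state machine in which the door is the only transition. This is a minimal illustration; the names TRANSITIONS and next_state and the symbols used for states are assumptions, not code from the talk.

```ruby
# Hypothetical model of the diagram: two states, one event (the door).
TRANSITIONS = {
  in_the_house: { door: :out_of_house },
  out_of_house: { door: :in_the_house }
}.freeze

# Looks up where the given event leads from the current state.
# fetch raises KeyError for a state or event the machine does not know.
def next_state(state, event)
  TRANSITIONS.fetch(state).fetch(event)
end

state = :in_the_house
state = next_state(state, :door)  # Annie goes out
puts state
state = next_state(state, :door)  # Annie comes back in
puts state
```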

Page 18: Beneath the Surface: Regular Expressions in Ruby

Finite State Machine

Page 21: Beneath the Surface: Regular Expressions in Ruby

Multiple States

Page 22: Beneath the Surface: Regular Expressions in Ruby

/force/

Page 23: Beneath the Surface: Regular Expressions in Ruby

re = /force/
string = "Use the force"
re.match(string)

Page 24: Beneath the Surface: Regular Expressions in Ruby

f o r c e

/force/

“Use the force”

Path Doesn’t Match

Page 25: Beneath the Surface: Regular Expressions in Ruby

f o r c e

/force/

“Use the force”

Still Doesn’t Match

Page 26: Beneath the Surface: Regular Expressions in Ruby

f o r c e

/force/

“Use the force”

Path Matches!

(Fast Forward)

Page 27: Beneath the Surface: Regular Expressions in Ruby

f o r c e

/force/

“Use the force”

Page 31: Beneath the Surface: Regular Expressions in Ruby

f o r c e

/force/

“Use the force”

We Have A Match!

Page 32: Beneath the Surface: Regular Expressions in Ruby

re = /force/
string = "Use the force"
re.match(string)
=> #<MatchData "force">
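
A rough sketch of the walk shown on the preceding slides: try the path f, o, r, c, e from each starting character until one start lets the whole path succeed. The method first_matching_start is a hypothetical simplification for literal patterns only; Onigmo's real matcher works differently. The MatchData calls at the end are standard Ruby.

```ruby
# Hypothetical, simplified walk for a literal pattern.
def first_matching_start(pattern, string)
  (0..string.length - pattern.length).find do |start|
    # Follow the path one character at a time from this start position
    pattern.each_char.with_index.all? do |char, offset|
      string[start + offset] == char
    end
  end
end

re = /force/
string = "Use the force"
match = re.match(string)

puts first_matching_start("force", string)  # start index found by the sketch
puts match.begin(0)                         # start index reported by Ruby
puts match.pre_match.inspect                # text passed over before the match
```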

Page 34: Beneath the Surface: Regular Expressions in Ruby

/Y(olk|oda)/

Pipe

Page 35: Beneath the Surface: Regular Expressions in Ruby

re = /Y(olk|oda)/
string = "Yoda"
re.match(string)

Page 36: Beneath the Surface: Regular Expressions in Ruby

Y oo

l k

d a

/Y(olk|oda)/

“Yoda”

Page 37: Beneath the Surface: Regular Expressions in Ruby

Y oo

l k

d a

/Y(olk|oda)/

Which To Choose?

“Yoda”

Page 38: Beneath the Surface: Regular Expressions in Ruby

Y oo

l k

d a

/Y(olk|oda)/

“Yoda”

Saves To Backtrack Stack

Page 39: Beneath the Surface: Regular Expressions in Ruby

Y oo

l k

d a

/Y(olk|oda)/

“Yoda”

Uh Oh, No Match

Page 40: Beneath the Surface: Regular Expressions in Ruby

Y oo

l k

d a

/Y(olk|oda)/

“Yoda”

Backtracks To Here

Page 41: Beneath the Surface: Regular Expressions in Ruby

Y oo

l k

d a

/Y(olk|oda)/

“Yoda”

Page 43: Beneath the Surface: Regular Expressions in Ruby

Y oo

l k

d a

/Y(olk|oda)/

“Yoda”

We Have A Match!

Page 44: Beneath the Surface: Regular Expressions in Ruby

re = /Y(olk|oda)/
string = "Yoda"
re.match(string)
=> #<MatchData "Yoda" 1:"oda">
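
A short sketch of what the alternation leaves behind once the engine has backtracked onto the second branch. The extra inputs "Yolk" and "Yeti" are assumptions added for contrast.

```ruby
re = /Y(olk|oda)/

yoda = re.match("Yoda")
puts yoda[0]  # the whole match
puts yoda[1]  # the branch of the alternation that succeeded

yolk = re.match("Yolk")
puts yolk[1]  # the first branch succeeds here, so no backtracking is needed

# Neither branch can continue after "Y", so the match fails
puts re.match("Yeti").inspect
```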

Page 46: Beneath the Surface: Regular Expressions in Ruby

/No+/

Plus Quantifier

Page 47: Beneath the Surface: Regular Expressions in Ruby

re = /No+/
string = "Noooo"
re.match(string)

Page 48: Beneath the Surface: Regular Expressions in Ruby

N o

o

/No+/

“Noooo”

Page 50: Beneath the Surface: Regular Expressions in Ruby

N o

o

/No+/

“Noooo”

Return Match? Or Keep Looping?

Page 51: Beneath the Surface: Regular Expressions in Ruby

N o

o

/No+/

“Noooo”

Greedy Quantifier

Keeps Looping

Page 52: Beneath the Surface: Regular Expressions in Ruby

Greedy quantifiers match as much as possible

Page 53: Beneath the Surface: Regular Expressions in Ruby

Greedy quantifiers use maximum effort for maximum return

Page 54: Beneath the Surface: Regular Expressions in Ruby

N o

o

/No+/

“Noooo”

Page 56: Beneath the Surface: Regular Expressions in Ruby

N o

o

/No+/

“Noooo”

We Have A Match!

Page 57: Beneath the Surface: Regular Expressions in Ruby

re = /No+/
string = "Noooo"
re.match(string)
=> #<MatchData "Noooo">

Page 58: Beneath the Surface: Regular Expressions in Ruby

Lazy Quantifiers

Page 59: Beneath the Surface: Regular Expressions in Ruby

Lazy quantifiers match as little as possible

Page 60: Beneath the Surface: Regular Expressions in Ruby

Lazy quantifiers use minimum effort for minimum return

Page 61: Beneath the Surface: Regular Expressions in Ruby

/No+?/

Makes Quantifier Lazy

Page 62: Beneath the Surface: Regular Expressions in Ruby

re = /No+?/
string = "Noooo"
re.match(string)

Page 63: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+?/

Page 65: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+?/

Return Match? Or Keep Looping?

Page 66: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+?/

We Have A Match!

Page 67: Beneath the Surface: Regular Expressions in Ruby

re = /No+?/
string = "Noooo"
re.match(string)
=> #<MatchData "No">
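
Side by side, the greedy and lazy forms from the slides behave as follows. The star variants are an addition for contrast, not something shown in the talk.

```ruby
string = "Noooo"

greedy_plus = string[/No+/]   # keeps looping while another "o" is available
lazy_plus   = string[/No+?/]  # returns as soon as one "o" has matched
greedy_star = string[/No*/]   # zero or more, as many as possible
lazy_star   = string[/No*?/]  # zero or more, as few as possible

puts greedy_plus
puts lazy_plus
puts greedy_star
puts lazy_star
```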

Page 68: Beneath the Surface: Regular Expressions in Ruby

Greedy quantifiers are greedy but reasonable

Page 69: Beneath the Surface: Regular Expressions in Ruby

/.*moon/

Star Quantifier

Page 70: Beneath the Surface: Regular Expressions in Ruby

re = /.*moon/
string = "That's no moon"
re.match(string)

Page 71: Beneath the Surface: Regular Expressions in Ruby

. m o o n

.

/.*moon/

“That’s no moon”

Page 72: Beneath the Surface: Regular Expressions in Ruby

. m o o n

.

“That’s no moon”

/.*moon/

Page 73: Beneath the Surface: Regular Expressions in Ruby

. m o o n

.

“That’s no moon”

Loops

/.*moon/

Page 74: Beneath the Surface: Regular Expressions in Ruby

. m o o n

.

Which To Match?

(Fast Forward)

“That’s no moon”

/.*moon/

Page 75: Beneath the Surface: Regular Expressions in Ruby

. m o o n

.

“That’s no moon”

Keeps Looping

/.*moon/

Page 78: Beneath the Surface: Regular Expressions in Ruby

. m o o n

“That’s no moon”

No More Characters?

.

/.*moon/

Page 79: Beneath the Surface: Regular Expressions in Ruby

. m o o n

“That’s no moon”

Backtrack or Fail?

.

/.*moon/

Page 80: Beneath the Surface: Regular Expressions in Ruby

. m o o n

“That’s no moon”

Backtracks

.

/.*moon/

Page 83: Beneath the Surface: Regular Expressions in Ruby

. m o o n

“That’s no moon”

Backtracks

Huzzah!

.

/.*moon/

Page 84: Beneath the Surface: Regular Expressions in Ruby

. m o o n

“That’s no moon”

.

/.*moon/

Page 87: Beneath the Surface: Regular Expressions in Ruby

. m o o n

“That’s no moon”

.

We Have A Match!

/.*moon/

Page 88: Beneath the Surface: Regular Expressions in Ruby

re = /.*moon/
string = "That's no moon"
re.match(string)
=> #<MatchData "That's no moon">
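
The give-back behaviour is easier to see with an input that contains the word twice. The greedy star runs to the end of the string and backtracks only as far as the last "moon", while a lazy star stops at the first one. The second input string is an assumption added for this sketch.

```ruby
string = "That's no moon"
puts string[/.*moon/]       # the star gives back just enough for "moon" to match

twice = "no moon, no moon"
greedy = twice[/.*moon/]    # backtracks from the end to the last "moon"
lazy   = twice[/.*?moon/]   # expands one character at a time to the first "moon"

puts greedy
puts lazy
```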

Page 89: Beneath the Surface: Regular Expressions in Ruby

Backtracking = Slow

Page 90: Beneath the Surface: Regular Expressions in Ruby

/No+w+/

Page 91: Beneath the Surface: Regular Expressions in Ruby

re = /No+w+/
string = "Noooo"
re.match(string)

Page 92: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+w+/

w

w

Page 94: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+w+/

w

w

Loops

Page 97: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+w+/

w

w

Uh Oh

Page 98: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+w+/

w

w

Uh Oh

Backtrack or Fail?

Page 99: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+w+/

w

w

Backtracks

Page 102: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

/No+w+/

w

w

Match FAILS

Page 103: Beneath the Surface: Regular Expressions in Ruby

Possessive Quantifiers

Page 104: Beneath the Surface: Regular Expressions in Ruby

Possessive quantifiers do not backtrack

Page 105: Beneath the Surface: Regular Expressions in Ruby

Makes Quantifier Possessive

/No++w+/

Page 106: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

w

w

/No++w+/

Page 108: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

w

w

Loops

/No++w+/

Page 111: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

w

w

/No++w+/

Page 112: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

w

w

Loops

Uh Oh

Backtrack or Fail?

/No++w+/

Page 113: Beneath the Surface: Regular Expressions in Ruby

N o

o

“Noooo”

w

w

Match FAILS

/No++w+/

Page 114: Beneath the Surface: Regular Expressions in Ruby

Possessive quantifiers fail faster by controlling backtracking
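
Both patterns from the slides fail on "Noooo", so the difference is in how much work happens before the failure, which the results alone do not show. The sketch below adds a contrasting pair (an assumption, not from the talk) where refusing to backtrack changes the outcome, plus the atomic group spelling that behaves the same way.

```ruby
string = "Noooo"

# From the slides: both fail, the possessive one without giving characters back
backtracking = /No+w+/.match(string)
possessive   = /No++w+/.match(string)

# Contrast: the pattern needs one "o" left over after the quantifier
gives_back  = /No+o/.match(string)     # greedy + hands back one "o"
keeps_all   = /No++o/.match(string)    # possessive ++ keeps every "o"
atomic_form = /N(?>o+)o/.match(string) # atomic group, same refusal to backtrack

puts backtracking.inspect
puts possessive.inspect
puts gives_back.inspect
puts keeps_all.inspect
puts atomic_form.inspect
```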

Page 115: Beneath the Surface: Regular Expressions in Ruby
Page 117: Beneath the Surface: Regular Expressions in Ruby
Page 118: Beneath the Surface: Regular Expressions in Ruby

snake_case to CamelCase

Page 119: Beneath the Surface: Regular Expressions in Ruby

Find first letter of string and capitalize it

snake_case to CamelCase

Page 120: Beneath the Surface: Regular Expressions in Ruby

Find first letter of string and capitalize it

Find any character that follows an underscore and capitalize it

snake_case to CamelCase

Page 121: Beneath the Surface: Regular Expressions in Ruby

Find first letter of string and capitalize it

Find any character that follows an underscore and capitalize it

Remove underscores

snake_case to CamelCase

Page 122: Beneath the Surface: Regular Expressions in Ruby

Find first letter of string and capitalize it

snake_case to CamelCase

Page 123: Beneath the Surface: Regular Expressions in Ruby

case_converter_spec.rb

before(:each) do
  @case_converter = CaseConverter.new
end

it "capitalizes the first letter" do
  result = @case_converter.upcase_chars("method")
  result.should == "Method"
end

Page 126: Beneath the Surface: Regular Expressions in Ruby

/^/

Anchors Match To Beginning Of String

Page 127: Beneath the Surface: Regular Expressions in Ruby

/^\w/

Matches Any Word Character

Page 128: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def upcase_chars(string)
  re = /^\w/
  string.gsub(re) { |char| char.upcase }
end

Page 130: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def upcase_chars(string)
  re = /^\w/
  string.gsub(re) { |char| char.upcase }
end

Spec Passes!

Page 131: Beneath the Surface: Regular Expressions in Ruby

case_converter_spec.rb

it "capitalizes the first letter" do
  result = @case_converter.upcase_chars("_method")
  result.should == "_Method"
end

Page 133: Beneath the Surface: Regular Expressions in Ruby

case_converter_spec.rb

it "capitalizes the first letter" do
  result = @case_converter.upcase_chars("_method")
  result.should == "_Method"
end

Spec Fails!

Page 134: Beneath the Surface: Regular Expressions in Ruby

Spec Failure:

Expected: "_Method"
Got: "_method"

Page 135: Beneath the Surface: Regular Expressions in Ruby

Problem: Matches Letters AND Underscores

/^\w/
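
A quick check of why the spec fails: \w covers letters, digits, and the underscore, so the leading underscore is the character that gets "upcased". The local variable names are assumptions for this sketch.

```ruby
first_word_char = /^\w/

# The underscore counts as a word character but not as a lowercase letter
puts "_".match?(/\w/)
puts "_".match?(/[a-z]/)

# The anchor pins the match to the first character, which here is "_"
plain      = "method".gsub(first_word_char) { |char| char.upcase }
underscore = "_method".gsub(first_word_char) { |char| char.upcase }

puts plain
puts underscore
```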

Page 136: Beneath the Surface: Regular Expressions in Ruby

/^[a-z]/

Matches Only Lowercase Letters

Page 137: Beneath the Surface: Regular Expressions in Ruby

/^[^a-z][a-z]/

Matches everything BUT lowercase letters

Page 138: Beneath the Surface: Regular Expressions in Ruby

/^[^a-z]?[a-z]/

Makes Character Class Optional

Page 139: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def upcase_chars(string)
  re = /^[^a-z]?[a-z]/
  string.gsub(re) { |char| char.upcase }
end

Page 140: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def upcase_chars(string)
  re = /^[^a-z]?[a-z]/
  string.gsub(re) { |char| char.upcase }
end

Spec Passes!

Page 141: Beneath the Surface: Regular Expressions in Ruby

Find any character that follows an underscore and capitalize it

snake_case to CamelCase

Page 142: Beneath the Surface: Regular Expressions in Ruby

case_converter_spec.rb

it "capitalizes letters after an underscore" do
  result = @case_converter.upcase_chars("some_method")
  result.should == "Some_Method"
end

Page 144: Beneath the Surface: Regular Expressions in Ruby

/^[^a-z]?[a-z]/

Page 145: Beneath the Surface: Regular Expressions in Ruby

/^[^a-z]?[a-z]|[a-z]/

Pipe For Alternation

Page 146: Beneath the Surface: Regular Expressions in Ruby

/^[^a-z]?[a-z]|(?<=_)[a-z]/

Look Behind

Page 147: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def upcase_chars(string)
  re = /^[^a-z]?[a-z]|(?<=_)[a-z]/
  string.gsub(re) { |char| char.upcase }
end

Page 148: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def upcase_chars(string)
  re = /^[^a-z]?[a-z]|(?<=_)[a-z]/
  string.gsub(re) { |char| char.upcase }
end

Spec Passes!
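
The same pattern can be spelled out with the x flag, which lets each piece carry its own comment. The constant name UPCASE_TARGETS is hypothetical. One caveat worth knowing: in Ruby, ^ matches at the start of every line, while \A matches only at the start of the string. For the single-line method names used in the talk the two behave the same; the last two lines show where they differ.

```ruby
# Hypothetical name; the pattern is the one built up on the slides.
UPCASE_TARGETS = /
  ^[^a-z]?[a-z]   # first lowercase letter, optionally after one other character
  |               # or
  (?<=_)[a-z]     # a lowercase letter directly preceded by an underscore
/x

def upcase_chars(string)
  string.gsub(UPCASE_TARGETS) { |chars| chars.upcase }
end

puts upcase_chars("method")
puts upcase_chars("_method")
puts upcase_chars("some_method")

# Line anchor versus string anchor on a two-line input
puts "a\nb".gsub(/^[a-z]/) { |char| char.upcase }.inspect
puts "a\nb".gsub(/\A[a-z]/) { |char| char.upcase }.inspect
```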

Page 149: Beneath the Surface: Regular Expressions in Ruby

Remove underscores

snake_case to CamelCase

Page 150: Beneath the Surface: Regular Expressions in Ruby

case_converter_spec.rb

it "removes underscores" do
  result = @case_converter.rmv_underscores("some_method")
  result.should == "somemethod"
end

Page 153: Beneath the Surface: Regular Expressions in Ruby

/_/

Matches An Underscore

Page 154: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def rmv_underscores(string)
  re = /_/
  string.gsub(re, "")
end

Page 156: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def rmv_underscores(string)
  re = /_/
  string.gsub(re, "")
end

Spec Passes!

Page 157: Beneath the Surface: Regular Expressions in Ruby

Combine results of two methods

snake_case to CamelCase

Page 158: Beneath the Surface: Regular Expressions in Ruby

case_converter_spec.rb

it "converts snake_case to CamelCase" do
  result = @case_converter.snake_to_camel("some_method")
  result.should == "SomeMethod"
end

Page 161: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def snake_to_camel(string)
  upcase_chars(string)
end

Page 162: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def snake_to_camel(string)
  rmv_underscores(upcase_chars(string))
end

Page 163: Beneath the Surface: Regular Expressions in Ruby

case_converter.rb

def snake_to_camel(string)
  rmv_underscores(upcase_chars(string))
end

Spec Passes!
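
Put together, the three methods might look like the class below. This is a reconstruction from the slide fragments rather than the speaker's exact file. The final line shows a regex-free alternative that is not from the talk, included only for comparison.

```ruby
# Reconstructed sketch of case_converter.rb, assembled from the slides.
class CaseConverter
  # Upcases the first lowercase letter and any lowercase letter after "_"
  def upcase_chars(string)
    re = /^[^a-z]?[a-z]|(?<=_)[a-z]/
    string.gsub(re) { |char| char.upcase }
  end

  # Deletes every underscore
  def rmv_underscores(string)
    re = /_/
    string.gsub(re, "")
  end

  # Capitalizes first, then strips the underscores the lookbehind relied on
  def snake_to_camel(string)
    rmv_underscores(upcase_chars(string))
  end
end

converter = CaseConverter.new
puts converter.snake_to_camel("some_method")

# Alternative without regular expressions (an assumption, not the talk's approach)
puts "some_method".split("_").map(&:capitalize).join
```

The order matters: removing underscores first would leave the lookbehind nothing to anchor on, so only the first letter would be capitalized.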

Page 166: Beneath the Surface: Regular Expressions in Ruby

Develop regular expressions in small pieces
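
One possible way to keep the pieces small in code as well: name each piece, check it alone, then join the pieces with Regexp.union. The constant names are hypothetical, and the talk itself built the pattern up inside a single literal.

```ruby
# Hypothetical names for the two pieces developed in the talk.
FIRST_LETTER     = /^[^a-z]?[a-z]/
AFTER_UNDERSCORE = /(?<=_)[a-z]/

# Each piece can be checked on its own before it is combined
puts "_method"[FIRST_LETTER].inspect
puts "some_method"[AFTER_UNDERSCORE].inspect

# Regexp.union joins the pieces with alternation
COMBINED = Regexp.union(FIRST_LETTER, AFTER_UNDERSCORE)
puts "some_method".gsub(COMBINED) { |chars| chars.upcase }
```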

Page 167: Beneath the Surface: Regular Expressions in Ruby
Page 168: Beneath the Surface: Regular Expressions in Ruby

If you write code, you can write regular expressions

Page 169: Beneath the Surface: Regular Expressions in Ruby

Move beyond the fear

