Matchete: Paths through the Pattern Matching Jungle. Martin Hirzel, Nate Nystrom, Bard Bloom, Jan Vitek. 7+8 January 2008, PADL.



Matchete
Paths through the Pattern Matching Jungle

Martin Hirzel
Nate Nystrom
Bard Bloom
Jan Vitek

7+8 January 2008 PADL



What is Pattern Matching?

Examples:
– Switch in C/Java
– Exception handlers
– ML-style patterns
– Regular expressions
– XPath patterns
– Bit masks

Selection
– If match, then execute handler
– E.g. is this a float? 22.341

Bindings
– Give names to parts
– E.g. integral part: 22, fractional part: 341
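As a rough illustration of selection and bindings in plain Java (not Matchete syntax), the sketch below uses java.util.regex to decide whether a string looks like a float and, if so, to name its parts. The class name FloatMatch, the method matchFloat, and the exact regular expression are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FloatMatch {
    // Assumed shape of a "float" for this sketch: digits, a dot, digits.
    private static final Pattern FLOAT = Pattern.compile("([0-9]+)\\.([0-9]+)");

    /** Returns {integral, fractional} if text matches, or null if it does not. */
    static String[] matchFloat(String text) {
        Matcher m = FLOAT.matcher(text);
        if (!m.matches()) {
            return null;                                  // selection failed: no handler runs
        }
        return new String[] { m.group(1), m.group(2) };   // bindings: names for the parts
    }

    public static void main(String[] args) {
        String[] parts = matchFloat("22.341");
        if (parts != null) {
            System.out.println("integral part: " + parts[0] + ", fractional part: " + parts[1]);
        }
    }
}
```

The match decision and the extraction of parts are separate steps here; a pattern-matching construct folds them into one.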


Example: Lists

-- list multiplication
mult([3, -1, 0, 4])
= 3 * mult([-1, 0, 4])
= 3 * -1 * mult([0, 4])
= 3 * -1 * 0 * mult([4])
= 3 * -1 * 0 * 4 * mult(nil)
= 3 * -1 * 0 * 4 * 1
= 0

-- list construction
cons(3, cons(-1, cons(0, cons(4, null))))
= [3, -1, 0, 4]


Matching Structured Terms

int mult(List ls) {
  match(ls) {
    cons~(0, _): return 0;
    cons~(int h, List t): return h * mult(t);
    null: return 1;
  }
  return 1;
}

Selection

Bindings

Central feature of ML, Haskell


Hardly a jungle!
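For comparison, here is roughly what the same function looks like in plain Java without a match statement. This is a sketch, not output of the Matchete compiler; the class name IntList and its package-private fields are assumptions chosen to keep the example short.

```java
class IntList {
    final int head;
    final IntList tail;   // null marks the empty list, as on the slide

    IntList(int head, IntList tail) {
        this.head = head;
        this.tail = tail;
    }

    static int mult(IntList ls) {
        if (ls == null) return 1;         // plays the role of the null case
        if (ls.head == 0) return 0;       // plays the role of cons~(0, _)
        return ls.head * mult(ls.tail);   // plays the role of cons~(int h, List t)
    }

    public static void main(String[] args) {
        IntList ls = new IntList(3, new IntList(-1, new IntList(0, new IntList(4, null))));
        System.out.println(mult(ls));
    }
}
```

Each branch tests a condition and then reads fields by hand, which is the selection and binding work that the patterns express directly.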


Less Structured Data

Data      Pattern             Language
Strings   Regular expression  Perl
XML       XPath               XSLT
Raw bits  Binary pattern      Erlang

Major factor in success of practical languages!


Why Unify?

• Given list of strings: "sue 10", "bob 15", "ann 11"
• Given String variable: name
• Find name, extract int age
• Match list deconstructor pattern
    cons~(…, …)
• Match string nested RegExp
    /([a-z]+) ([0-9]+)/(name, int age)
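Without a unified pattern language, the task above takes two separate mechanisms in plain Java: list traversal and a regular expression. The sketch below is one way to write it; the class AgeLookup and the method findAge are hypothetical names, and iterating over the whole list is an assumption about how the search proceeds.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AgeLookup {
    private static final Pattern ENTRY = Pattern.compile("([a-z]+) ([0-9]+)");

    /** Finds the first entry whose name part equals name and returns its age. */
    static OptionalInt findAge(List<String> entries, String name) {
        for (String entry : entries) {                       // list traversal by hand
            Matcher m = ENTRY.matcher(entry);                // string matching by regex
            if (m.matches() && m.group(1).equals(name)) {
                try {
                    return OptionalInt.of(Integer.parseInt(m.group(2)));
                } catch (NumberFormatException e) {
                    // Digits that overflow an int: treat this entry as a non-match.
                }
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList("sue 10", "bob 15", "ann 11");
        System.out.println(findAge(entries, "bob"));
    }
}
```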


Matchete (Java Extension)

• Integrates pattern sublanguages

• Common set of primitive patterns

• Nesting composite patterns

• Simple uniform semantics


Primitive Patterns

Name      Examples
Wildcard  _
Value     22.341
          x
          tiger.stripes + spider.legs
Binder    int x
          ScaryAnimal python


Composite Patterns

Name           Examples
Deconstructor  cons~(0, _)
Parameterized  re("([0-9]+)")~(int i)
Array          int[]{1, x, int y}
XPath          <bib/book>(NodeList n)
RegExp         /([a-z]) ([0-9]+)/(chr,int f)
BitLevel       [[(0x2cf9:16) 01 (int x:14)]]
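To give a feel for the BitLevel row, the following plain-Java sketch does the corresponding work with shifts and masks. It assumes a reading of the pattern that the slide does not spell out: a 32-bit word whose top 16 bits must equal 0x2cf9, followed by the two literal bits 01, followed by a 14-bit field bound to x. The class BitMatch and the method matchWord are hypothetical.

```java
import java.util.OptionalInt;

class BitMatch {
    /** Returns the 14-bit field x if the word has the assumed layout, else empty. */
    static OptionalInt matchWord(int word) {
        int tag = (word >>> 16) & 0xFFFF;   // top 16 bits
        int two = (word >>> 14) & 0x3;      // next 2 bits
        int x = word & 0x3FFF;              // low 14 bits
        if (tag == 0x2cf9 && two == 0b01) {
            return OptionalInt.of(x);
        }
        return OptionalInt.empty();         // failed match, not an error
    }

    public static void main(String[] args) {
        int word = (0x2cf9 << 16) | (0b01 << 14) | 1234;
        System.out.println(matchWord(word));
    }
}
```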


Deconstructor Definition

class List {
  // Fields
  private int head;
  private List tail;

  // Constructor
  public List(int h, List t) { head = h; tail = t; }

  // Deconstructor
  public cons~(int h, List t) { h = head; t = tail; }
}

Match on receiver object
Out parameters = subjects for nested patterns
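Plain Java has no out parameters, so one way to approximate a deconstructor is a method that returns a small holder object. The sketch below is only an emulation under that assumption; the names ConsList, Parts, cons(), and sum are hypothetical and do not come from the Matchete runtime.

```java
class ConsList {
    private final int head;
    private final ConsList tail;

    ConsList(int h, ConsList t) { head = h; tail = t; }

    /** Holder standing in for the two out parameters h and t. */
    static final class Parts {
        final int h;
        final ConsList t;
        Parts(int h, ConsList t) { this.h = h; this.t = t; }
    }

    /** Stand-in for cons~(int h, List t): reads the fields of the receiver. */
    Parts cons() { return new Parts(head, tail); }

    /** Hypothetical client: each extracted part is examined or passed on in turn. */
    static int sum(ConsList ls) {
        if (ls == null) return 0;   // no receiver, so there is nothing to deconstruct
        Parts p = ls.cons();
        return p.h + sum(p.t);
    }

    public static void main(String[] args) {
        ConsList ls = new ConsList(3, new ConsList(-1, new ConsList(0, new ConsList(4, null))));
        System.out.println(sum(ls));
    }
}
```

The fields stay private; clients see only what the deconstructor chooses to expose, mirroring how a constructor hides how the fields are set.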


Nesting

cons~(/([a-z]+) ([0-9]+)/(name, int age), _)

Deconstructor  cons
  RegExp       /([a-z]+) ([0-9]+)/
    Value      name
    Binder     int age
  Wildcard     _

Subject: "sue 10", "bob 15", "ann 11"


Subjects flow to children

Deconstructor  cons                   subject: "sue 10", "bob 15", "ann 11"
  RegExp       /([a-z]+) ([0-9]+)/    subject: "sue 10"
    Value      name                   subject: sue
    Binder     int age                subject: 10
  Wildcard     _                      subject: "bob 15", "ann 11"


Decisions and bindings flow to textual successor

1. Deconstructor  cons
2. RegExp         /([a-z]+) ([0-9]+)/
3. Value          name
4. Binder         int age
5. Wildcard       _
6. Handler        print(age)
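The two flows described on the last slides (subjects down to children, decisions and bindings on to the textual successor) can be mimicked with a small pattern-combinator sketch in plain Java. This is an illustration of the idea only, not how the Matchete compiler works; the class PatternTree, the interface Pat, and the use of a Map as the binding environment are all assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternTree {
    /** A pattern gets a subject plus the bindings made so far and reports its decision. */
    interface Pat {
        boolean match(Object subject, Map<String, Object> env);
    }

    static Pat wildcard() { return (s, env) -> true; }

    /** Value pattern: succeeds if the subject equals the value already bound to name. */
    static Pat value(String name) {
        return (s, env) -> env.containsKey(name) && Objects.equals(env.get(name), s);
    }

    /** Binder pattern: binds the subject as an int; a parse failure is a failed match. */
    static Pat intBinder(String name) {
        return (s, env) -> {
            try {
                env.put(name, Integer.parseInt(String.valueOf(s)));
                return true;
            } catch (NumberFormatException e) {
                return false;
            }
        };
    }

    /** RegExp pattern: capture group i becomes the subject of child i. */
    static Pat regExp(String regex, Pat... children) {
        Pattern compiled = Pattern.compile(regex);
        return (s, env) -> {
            if (!(s instanceof String)) return false;
            Matcher m = compiled.matcher((String) s);
            if (!m.matches() || m.groupCount() != children.length) return false;
            for (int i = 0; i < children.length; i++) {      // children in textual order
                if (!children[i].match(m.group(i + 1), env)) return false;
            }
            return true;
        };
    }

    /** Deconstructor on a java.util.List: head flows to headPat, the rest to tailPat. */
    static Pat cons(Pat headPat, Pat tailPat) {
        return (s, env) -> {
            if (!(s instanceof List) || ((List<?>) s).isEmpty()) return false;
            List<?> list = (List<?>) s;
            return headPat.match(list.get(0), env)
                && tailPat.match(list.subList(1, list.size()), env);
        };
    }

    public static void main(String[] args) {
        Pat pattern = cons(regExp("([a-z]+) ([0-9]+)", value("name"), intBinder("age")), wildcard());
        Map<String, Object> env = new HashMap<>();
        env.put("name", "sue");                               // the given String variable
        if (pattern.match(Arrays.asList("sue 10", "bob 15", "ann 11"), env)) {
            System.out.println(env.get("age"));               // handler: print(age)
        }
    }
}
```

One simplification to note: bindings made before a later sub-pattern fails stay in the map, so a caller that retries with another pattern should start from a fresh environment.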


Compilation

[Figure: compilation pipeline. Boxes shown: Matchete source code; Matchete compiler (built on Rats! parser generator); generated Java source; debugging information; other Java source; runtime library; Java compiler; postprocessor; Java class files.]


Implemented Examples

• Balance red-black tree

• Process TCP/IP network packet

• Pretty-print XML bibliography

• … + smaller regression tests


Discussion: Typing

Matchete uses strong dynamic typing
– No runtime errors, just failed matches
– If Matchete compiler gives no error, then Java compiler gives no error either

Why not (more) static typing?
– Data formats mismatch
– Test bed for a new scripting language
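The "no runtime errors, just failed matches" policy can be illustrated in plain Java by a binder that checks the subject's dynamic type and reports failure instead of throwing. This is a sketch of the policy only; which conversions Matchete itself performs is not stated on the slide, so accepting both Integer and numeric String subjects is an assumption, as are the names SafeBinder and bindInt.

```java
import java.util.OptionalInt;

class SafeBinder {
    /** Tries to bind subject as an int; any mismatch is an empty result, never an exception. */
    static OptionalInt bindInt(Object subject) {
        if (subject instanceof Integer) {
            return OptionalInt.of((Integer) subject);
        }
        if (subject instanceof String) {
            try {
                return OptionalInt.of(Integer.parseInt((String) subject));
            } catch (NumberFormatException e) {
                return OptionalInt.empty();   // e.g. "sue": failed match
            }
        }
        return OptionalInt.empty();           // any other type: failed match
    }

    public static void main(String[] args) {
        System.out.println(bindInt("15").isPresent());
        System.out.println(bindInt("sue").isPresent());
    }
}
```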


Discussion: Integration

Choice             Example               Advantage
Loose integration  /a(b)c(d)/ (p,q)      Sublanguage reuse
Tight integration  /a(p:b)c(q:d)/        No need to count
No integration     re("a(b)c(d)")~(p,q)  Simpler language

Matchete chooses tight integration for BitLevel, loose integration for RegExp and XPath, no integration for XML as terms
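The "no need to count" advantage has a loose analogue in the standard java.util.regex library, which supports both positional and named capture groups. The sketch below contrasts the two; it uses Java's own regex syntax, not Matchete's, and the class GroupNaming and its methods are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNaming {
    /** Positional groups: the caller counts parentheses to know which group is p and which is q. */
    static String[] byPosition(String text) {
        Matcher m = Pattern.compile("a(b)c(d)").matcher(text);
        return m.matches() ? new String[] { m.group(1), m.group(2) } : null;
    }

    /** Named groups: each name sits next to the sub-expression it captures. */
    static String[] byName(String text) {
        Matcher m = Pattern.compile("a(?<p>b)c(?<q>d)").matcher(text);
        return m.matches() ? new String[] { m.group("p"), m.group("q") } : null;
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", byPosition("abcd")));
        System.out.println(String.join(",", byName("abcd")));
    }
}
```

The positional form keeps the regular expression untouched, which is the reuse argument for loose integration; the named form changes the sublanguage but removes the counting.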


Related Work

• Structured terms
  – Algebraic types: ML, Haskell, …
  – Objects: Tom, OOMatch, JMatch, …
  – Letting users define patterns: F#, Scala
• Strings: Perl; SNOBOL
• Bit-level data: Erlang; DataScript; PADS
• XML:
  – As trees: XSLT, XJ (XPath)
  – As terms: XDuce, HydroJ, …


Conclusions

• Pattern matching applies to terms, strings, XML, and raw bits

• Matchete offers path to unification