global dsl workshop slides

30
Domain-specific language features 1 Ted Kaminksi and Eric Van Wyk University of Minnesota Global DSL 2013, July 2, 2013, Montpellier 1 This work is partially supported by NSF Awards No. 0905581 and 1047961. c Eric Van Wyk 1

Upload: ericupnorth

Post on 02-Jul-2015

5.546 views

Category:

Technology


0 download

DESCRIPTION

Slides from my talk at the Global DSL workshop. See http://gemoc.org/globaldsl13/

TRANSCRIPT

Page 1: Global DSL workshop slides

Domain-specific language features1

Ted Kaminksi and Eric Van Wyk

University of Minnesota

Global DSL 2013, July 2, 2013, Montpellier

1This work is partially supported by NSF Awards No. 0905581 and1047961.

c© Eric Van Wyk 1

Page 2: Global DSL workshop slides

DSLs are bad, or at least have some challenges...

I ExternalI “No recourse to the law.”I domain-specific syntax, analysis, and optimization

I Internal/EmbeddedI “just libraries”, caveat: staging, DeliteI composableI general purpose host language available

Maybe there are alternatives ...

c© Eric Van Wyk 2

Page 3: Global DSL workshop slides

Extensible Languages (Frameworks)

I pluggable domain-specific language extensions

I domain-specific syntax, analysis, and optimization

I composable

I general purpose host language available

I extension features translate down to host language

c© Eric Van Wyk 3

Page 4: Global DSL workshop slides

class Demo {int demoMethod ( ) {List<List<Integer>> dlist ;

Regex ws = regex/\n\t / ;

int SELECT ;

connection c "jdbc:derby:/home/derby/db/testdb"

with table person [ person id INTEGER,

first name VARCHAR,

last name VARCHAR ] ,

table details [ person id INTEGER,

age INTEGER ] ;

Integer limit = 18 ;

ResultSet rs = using c query {SELECT age, gender, last name

FROM person , details

WHERE person.person id = details.person id

AND details.age > limit } ;

Integer = rs.getInteger("age");

String gender = rs.getString("gender");

boolean b ;

b = table ( age > 40 : T * ,

gender == "M" : T F ) ;

}}

• natural syntax• semantic analysis• composable extensions

• Regexs, different whitespace

• SQL queries

• non-null pointeranalysis

• tabular booleanexpressions

c© Eric Van Wyk 4

Page 5: Global DSL workshop slides

#include <sdtio.h>

int main() {

... bits of SAC ...

... stencil specifications ...

... computational geometry optimizations

robustness transformations ...

}

c© Eric Van Wyk 5

Page 6: Global DSL workshop slides

Roles people play ...

1. host language designer

2. language extension designer

3. programmerI language extension userI has no language design and implementation knowledge

c© Eric Van Wyk 6

Page 7: Global DSL workshop slides

c© Eric Van Wyk 7

Page 8: Global DSL workshop slides

An analog

I We can write type safe programs in assembly, Python,Ruby, Scheme, etc.

I We are guaranteed to write type safe program in Haskell,ML, etc.

I Some of use (happily) trade some expressibility for thissafety.

I Can we build extensible languages and composablelanguage extensions?

I Are there any guarantees of composability of languageextensions?

c© Eric Van Wyk 8

Page 9: Global DSL workshop slides

c© Eric Van Wyk 9

Page 10: Global DSL workshop slides

Developing language extensionsTwo primary challenges:

1. composable syntax — enables building a parserI context-aware scanning [GPCE’07]I modular determinism analysis [PLDI’09]I Copper

2. composable semantics — analysis and translations

I attribute grammars with forwarding, collections andhigher-order attributes

I set union of specification componentsI sets of productions, non-terminals, attributesI sets of attribute defining equations, on a productionI sets of equations contributing values to a single attribute

I modular well-definedness analysis [SLE’12a]I modular termination analysis [SLE’12b, KrishnanPhD]I Silver

c© Eric Van Wyk 10

Page 11: Global DSL workshop slides

Challenges in scanning

I Keywords in embedded languages may be identifiers inhost language:

int SELECT ;

...

rs = using c query { SELECT last name

FROM person WHERE ...

I Different extensions use same keyword

connection c "jdbc:derby:./derby/db/testdb"

with table person [ person id INTEGER,

first name VARCHAR ];

...

b = table ( c1 : T F ,

c2 : F * ) ;

c© Eric Van Wyk 11

Page 12: Global DSL workshop slides

Challenges in scanning

I Operators with different precedence specifications:

x = 3 + y * z ;

...

str = /[a-z][a-z0-9]*\.java/I Terminals that are prefixes of others

List<List<Integer>> dlist ;

...

x = y >> 4 ;

or

aspect ... before ... call( o.get*() )

...

x = get*3 ;

[from Bravenboer, Tanter & Visser’s OOPSLA 06 paper on parsing AspectJ]

c© Eric Van Wyk 12

Page 13: Global DSL workshop slides

Need for context

I Problem: the scanner cannot match strings (lexemes) toproper terminal symbols in the grammar.

“SELECT” – Select kwd or Identifier“table” – SQL table kwd or DNF table kwd“>>” – RightBitShift op or GT op, GT op

I Neither terminal has global precedence over the other.I maximal munch gives “>>” precedence over “>”I thus grammars are written to accommodate this

c© Eric Van Wyk 13

Page 14: Global DSL workshop slides

Need for context

I Traditionally, parser and scanner are disjoint.

Scanner → Parser → Semantic Analysis

I In context aware scanning, they communicate

Scanner � Parser → Semantic Analysis

c© Eric Van Wyk 14

Page 15: Global DSL workshop slides

Context aware scanning

I Scanner recognizes only tokens valid for current “context”

I keeps embedded sub-languages, in a sense, separate

I Consider:I chan in, out;

for i in a { a[i] = i*i ; }

I Two terminal symbols that match “in”.I terminal IN ’in’ ;I terminal ID /[a-zA-Z ][a-zA-Z 0-9]*/

submits to {promela kwd };

I terminal FOR ’for’ lexer classes {promela kwd

};

I example is part of AbleP [SPIN’11]

c© Eric Van Wyk 15

Page 16: Global DSL workshop slides

Allows parsing of embedded C code in host

Promelac_decl {

typedef struct Coord {

int x, y; } Coord; }

c_state "Coord pt" "Global" /* goes in state vector */

int z = 3; /* standard global decl */

active proctype example()

{ c_code { now.pt.x = now.pt.y = 0; };

do :: c_expr { now.pt.x == now.pt.y }

-> c_code { now.pt.y++; }

:: else -> break

od;

c_code { printf("values %d: %d, %d,%d\n",

Pexample->_pid, now.z, now.pt.x, now.pt.y);

};

}

// assert(false) /* trigger an error trail */

//}

c© Eric Van Wyk 16

Page 17: Global DSL workshop slides

Context aware scanning

I We use a slightly modified LR parser and a context-awarescanner.

I Context is based on the LR-parser state.

I Parser passes a set of valid look-ahead terminals toscanner.

I This is the set of terminals with shift, reduce, or acceptentries in parse table for current parse state.

I Scanner only returns tokens from the valid look-ahead set.

I Scanner uses this set to disambiguate terminals.I e.g. SQL table kwd and DNF table kwdI e.g. Select kwd and Identifier

c© Eric Van Wyk 17

Page 18: Global DSL workshop slides

Context aware scanning

I This scanning algorithm subordinates thedisambiguation principle of maximal munchto the principle ofdisambiguation by context.

I It will return a shorter valid match before a longer invalidmatch.

I In List<List<Integer>> before “>”,“>” in valid lookahead but “>>” is not.

I A context aware scanner is essentially an implicitly-modedscanner.

I There is no explicit specification of valid look ahead.I It is generated from standard grammars and terminal

regexs.

c© Eric Van Wyk 18

Page 19: Global DSL workshop slides

I With a smarter scanner, LALR(1) is not so brittle.

I We can build syntactically composable languageextensions.

I Context aware scanning makes composable syntax “morelikely”

I But it does not give a guarantee of composability.

c© Eric Van Wyk 19

Page 20: Global DSL workshop slides

Building a parser from composed specifications.

... CFGH ∪∗ {CFGE1, ...,CFGEn}

∀i ∈ [1, n].isComposable(CFGH ,CFGEi )∧conflictFree(CFGH ∪ CFGEi )

⇒ ⇒ conflictFree(CFGH ∪ {CFGE1, ...,CFGEn})

I Monolithic analysis - not too hard, but not too useful.

I Modular analysis - harder, but required [PLDI’09].

I Non-commutative composition of restricted LALR(1)grammars.

c© Eric Van Wyk 20

Page 21: Global DSL workshop slides

Building an attribute grammar evaluator from composedspecifications.

... AGH ∪∗ {AGE1, ...,AGEn}

∀i ∈ [1, n].modComplete(AGH ,AGEi )

⇒ ⇒ complete(AGH ∪ {AGE1 , ...,AGE

n })

I Monolithic analysis - not too hard, but not too useful.

I Modular analysis - harder, but required [SLE’12a].

c© Eric Van Wyk 21

Page 22: Global DSL workshop slides

c© Eric Van Wyk 22

Page 23: Global DSL workshop slides

I ModularI determinism analysis — scanning, parsingI well-definedness — attribute grammarsI termination — attribute grammars

are language independent analyses.

I Next: language specific analyses.Sebastian Erweg at ICFP’13 - modular type soundness

c© Eric Van Wyk 23

Page 24: Global DSL workshop slides

Expressiveness versus safe composition

Compare to

I other parser generators

I libraries

The modular compositionality analysis does not requirecontext aware scanning.But, context aware scanning makes it practical.

c© Eric Van Wyk 24

Page 25: Global DSL workshop slides

Tool Support

Copper – context-aware parser and scanner generator

I implements context-aware scanning for a LALR(1) parser

I lexical precedence

I parser attributes and disambiguation functions whendisambiguation by context and lexical precedence is notenough

I currently integrated into Silver [SCP’10], an extensibleattribute grammar system.

c© Eric Van Wyk 25

Page 26: Global DSL workshop slides

Thanks for your attention.

Questions?

http://melt.cs.umn.edu

[email protected]

c© Eric Van Wyk 26

Page 27: Global DSL workshop slides

Ted Kaminski and Eric Van Wyk.Modular well-definedness analysis for attribute grammars.In Proc. of Intl. Conf. on Software Language Engineering(SLE), volume 7745 of LNCS, pages 352–371.Springer-Verlag, September 2012.

Lijesh Krishnan and Eric Van Wyk.Termination analysis for higher-order attribute grammars.In Proceedings of the 5th International Conference onSoftware Language Engineering (SLE 2012), volume 7745of LNCS, pages 44–63. Springer-Verlag, September 2012.

Lijesh Krishnan.Composable Semantics Using Higher-Order AttributeGrammars.PhD thesis, University of Minnesota, Minnesota, USA,2012.http://melt.cs.umn.edu/pubs/krishnan2012PhD/.

c© Eric Van Wyk 26

Page 28: Global DSL workshop slides

Yogesh Mali and Eric Van Wyk.Building extensible specifications and implementations ofpromela with AbleP.In Proc. of Intl. SPIN Workshop on Model Checking ofSoftware, volume 6823 of LNCS, pages 108–125.Springer-Verlag, July 2011.

August Schwerdfeger and Eric Van Wyk.Verifiable composition of deterministic grammars.In Proc. of Conf. on Programming Language Design andImplementation (PLDI), pages 199–210. ACM, June 2009.

A. Schwerdfeger and E. Van Wyk.Verifiable parse table composition for deterministicparsing.In 2nd International Conference on Software LanguageEngineering, volume 5969 of LNCS, pages 184–203.Springer-Verlag, 2010.

c© Eric Van Wyk 26

Page 29: Global DSL workshop slides

E. Van Wyk, O. de Moor, K. Backhouse, andP. Kwiatkowski.Forwarding in attribute grammars for modular languagedesign.In 11th Conf. on Compiler Construction (CC), volume2304 of LNCS, pages 128–142. Springer-Verlag, 2002.

Eric Van Wyk, Lijesh Krishnan, August Schwerdfeger, andDerek Bodin.Attribute grammar-based language extensions for Java.In Proc. of European Conf. on Object Oriented Prog.(ECOOP), volume 4609 of LNCS, pages 575–599.Springer-Verlag, 2007.

E. Van Wyk and A. Schwerdfeger.Context-aware scanning for parsing extensible languages.In Intl. Conf. on Generative Programming and ComponentEngineering, (GPCE). ACM Press, October 2007.

c© Eric Van Wyk 26

Page 30: Global DSL workshop slides

Eric Van Wyk, Derek Bodin, Jimin Gao, and LijeshKrishnan.Silver: an extensible attribute grammar system.Science of Computer Programming, 75(1–2):39–54,January 2010.

c© Eric Van Wyk 26