TI1220 Lecture 14: Domain-Specific Languages
TI1220 2012-2013: Concepts of Programming Languages
Eelco Visser / TU Delft
Lecture 14: Domain-Specific Languages
• Syntax and Semantics
• Names, Bindings, and Scopes
• Storage
• Data Types
• Functional Programming
• First-class Functions
• Polymorphism
• Type Parameterization
• Parsing and Interpretation
• Data Abstraction / Modular Programming
• Functional Programming Redux
• Concurrency
• Concurrent Programming
• Domain-Specific Languages
Quarter 3
Quarter 4
Basics of Scala, JavaScript, C
Linguistic Abstraction
Formalizing Design Patterns
Problem Domain
Solution Domain
implement
validate
Software Engineering
Software Reuse: Don’t Repeat Yourself
software reuse patterns
• copy and paste
• libraries
• frameworks
• service APIs
• design patterns
Linguistic Abstraction
identify pattern
use new abstraction
language A → (design abstraction) → language B
From Instructions to Expressions
  mov &a, &c
  add &b, &c
  mov &a, &t1
  sub &b, &t1
  and &t1, &c
Source: http://sites.google.com/site/arch1utep/home/course_outline/translating-complex-expressions-into-assembly-language-using-expression-trees
  c = a
  c += b
  t1 = a
  t1 -= b
  c &= t1
c = (a + b) & (a - b)
From Calling Conventions to Procedures
calc:
  push eBP            ; save old frame pointer
  mov eBP,eSP         ; get new frame pointer
  sub eSP,localsize   ; reserve place for locals
  .
  .                   ; perform calculations, leave result in AX
  .
  mov eSP,eBP         ; free space for locals
  pop eBP             ; restore old frame pointer
  ret paramsize       ; free parameter space and return
f(e1,e2,...,en)
  push eAX            ; pass some register result
  push byte[eBP+20]   ; pass some memory variable (FASM/TASM syntax)
  push 3              ; pass some constant
  call calc           ; the returned result is now in eAX
f(x) { ... }
http://en.wikipedia.org/wiki/Calling_convention
A structure is a collection of one or more variables, possibly of different types, grouped together under a single name for convenient handling. (Structures are called "records" in some languages, notably Pascal.)
  struct point {
    int x;
    int y;
  };
member
structure tag
Structures in C: Abstract from Memory Layout
Malloc/Free to Automatic Memory Management
  /* Allocate space for an array with ten elements of type int. */
  int *ptr = (int*)malloc(10 * sizeof (int));
  if (ptr == NULL) {
    /* Memory could not be allocated, the program should handle the error here as appropriate. */
  } else {
    /* Allocation succeeded. Do something. */
    free(ptr);  /* We are done with the int objects, and free the associated pointer. */
    ptr = NULL; /* The pointer must not be used again, unless re-assigned to using malloc again. */
  }
http://en.wikipedia.org/wiki/Malloc
  int[] a = new int[10];
  /* use it; gc will clean up (hopefully) */
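  typedef struct Base {
    void (**vtable)();   /* points to the table of method pointers */
    int x;
  } Base;

  void Base_print(Base* obj);   /* the method implementation, defined elsewhere */

  void (*Base_Vtable[])() = { &Base_print };

  Base* newBase(int v) {
    Base* obj = (Base*)malloc(sizeof(Base));
    obj->vtable = Base_Vtable;
    obj->x = v;
    return obj;
  }

  void print(Base* obj) {
    obj->vtable[0](obj);   /* dispatch through the vtable */
  }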
  class Base {
    Integer x;
    public Base(Integer v) {
      x = v;
    }
    public void print() {
      System.out.println("Base: " + x);
    }
  }
  class Child extends Base {
    Integer y;
    public Child(Integer v1, Integer v2) {
      super(v1);
      y = v2;
    }
    public void print() {
      System.out.println("Child: (" + x + "," + y + ")");
    }
  }
Dynamic Dispatch
Polymorphic Higher-Order Functions
  def map[A,B](f: A => B, xs: List[A]): List[B] = {
    xs match {
      case Nil() => Nil()
      case Cons(y, ys) => Cons(f(y), map(f, ys))
    }
  }
  def incList(xs: IntList): IntList = xs match {
    case Nil() => Nil()
    case Cons(y, ys) => Cons(y + 1, incList(ys))
  }
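The step from incList to map can be made concrete with a small sketch. It assumes the standard library List (with Nil and ::) instead of the course's own Cons/Nil() classes; the object name MapExample and the helpers incListViaMap and lengths are illustrative only.

```scala
object MapExample {
  // Specialized function: the traversal and the "+ 1" are tangled together.
  def incList(xs: List[Int]): List[Int] = xs match {
    case Nil     => Nil
    case y :: ys => (y + 1) :: incList(ys)
  }

  // Polymorphic higher-order function: the traversal pattern is written once,
  // the element-wise operation is passed in as the parameter f.
  def map[A, B](f: A => B, xs: List[A]): List[B] = xs match {
    case Nil     => Nil
    case y :: ys => f(y) :: map(f, ys)
  }

  // incList becomes a one-line instance of the abstraction ...
  def incListViaMap(xs: List[Int]): List[Int] = map((y: Int) => y + 1, xs)

  // ... and the same abstraction works for other element types.
  def lengths(xs: List[String]): List[Int] = map((s: String) => s.length, xs)
}
```

Both versions of incList should agree on every input; only the amount of repeated traversal code differs.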
Abstractions in Programming Languages
❖ Structured control-flow
★ if-then-else, while
❖ Procedural abstraction
★ procedures, first-class functions (closures)
❖ Memory management
★ garbage collection
❖ Data abstraction
★ abstract data types, objects
❖ Modules
★ inheritance, traits, mixins
“A programming language is low level when its programs require attention to the irrelevant”
Alan J. Perlis. Epigrams on Programming. SIGPLAN Notices, 17(9):7-13, 1982.
Do HLLs eliminate all irrelevant details?
What about
❖ data persistence
❖ data services
❖ concurrency
❖ distribution
❖ access control
❖ data invariants
❖ workflow
❖ ...
many of these concerns require
programmatic encodings
What is the Next Level of Abstraction?
Problem Domain → HLL → Machine
Domain-Specific Languages
Problem Domain → DSL → HLL → Machine
Example: Encoding Units
input → compiler → computer → output
  input distance : Float;
  input duration : Float;
  output speed : Float := duration / distance;
error: speed is computed as duration / distance; every quantity is a Float, so the compiler accepts the program and the computer produces wrong output
Impact of Software Errors
Mars Climate Orbiter
Unit mismatch: Orbiter variables in Newtons, Ground control software in Pound-force.
Damage: ~350 M$
error in program → compiler → computer → wrong output
Example: Explicit Representation of Units
  input distance : Meter;
  input duration : Second;
  output speed : Meter/Second := duration / distance;
compiler reports the error before the program reaches the computer
formalize knowledge of application area (domain) in language
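The same idea can be approximated in a general-purpose language by giving each unit its own type. The sketch below is an illustration in Scala, not the notation of the slide; the names Units, Meter, Second, MeterPerSecond and speed are assumptions made for the example.

```scala
object Units {
  // Each unit gets its own type, so quantities cannot be mixed up silently.
  final case class MeterPerSecond(value: Double)

  final case class Second(value: Double) // deliberately has no `/` taking a Meter

  final case class Meter(value: Double) {
    // distance / duration is the only division defined here; it yields a speed.
    def /(d: Second): MeterPerSecond = MeterPerSecond(value / d.value)
  }

  // Writing `duration / distance` in the body would be rejected by the compiler,
  // because Second defines no `/` method.
  def speed(distance: Meter, duration: Second): MeterPerSecond =
    distance / duration
}
```

With this encoding the mistake from the Float version becomes a compile-time error, although the error message is phrased in terms of Scala methods rather than units.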
DSLs Provide Domain-Specific ...
Abstractions
★ directly represent domain concepts
Concrete syntax
★ natural notation
Optimization
★ based on domain assumptions
Error checking
★ report errors in terms of domain concepts
Tool support
★ interpreter, compiler, code generator, IDE
Internal DSL
Library in HLL
★ Haskell, Scala, Ruby, ...
★ API is language
★ language features for ‘linguistic abstraction’
Advantages
★ host language = implementation language
Disadvantages
★ host language = implementation language (encoding)
★ no portability
★ no domain-specific errors, analysis, optimization
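A minimal sketch of an internal DSL in Scala, assuming a toy domain of boolean access rules: the "language" is just a library whose operators build a syntax tree. All names (RuleDsl, Var, And, Or, eval, rule) are hypothetical and chosen for this illustration.

```scala
object RuleDsl {
  // The API is the language: && and || are ordinary methods that build a tree.
  sealed trait Exp {
    def &&(that: Exp): Exp = And(this, that)
    def ||(that: Exp): Exp = Or(this, that)
  }
  final case class Var(name: String) extends Exp
  final case class And(l: Exp, r: Exp) extends Exp
  final case class Or(l: Exp, r: Exp) extends Exp

  // Interpreter for rules; an unknown variable fails with a plain Scala
  // exception (NoSuchElementException), not with a domain-specific error.
  def eval(e: Exp, env: Map[String, Boolean]): Boolean = e match {
    case Var(n)    => env(n)
    case And(l, r) => eval(l, env) && eval(r, env)
    case Or(l, r)  => eval(l, env) || eval(r, env)
  }

  val isPublic = Var("isPublic")
  val isDraft  = Var("isDraft")
  val isAuthor = Var("isAuthor")

  // Parsed with Scala's operator precedence: && binds tighter than ||.
  val rule: Exp = isPublic || isDraft && isAuthor
}
```

The sketch shows both sides of the trade-off: the host language provides parsing and precedence for free, but syntax, precedence and error messages are those of the host language and cannot be tuned to the domain.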
External DSL
Dedicated language
★ independent of host/target language (portable)
★ implementation with interpreter or compiler
Advantages
★ language tuned to domain
★ domain-specific errors, analysis, optimizations
Disadvantages
★ cost of learning new language
★ cost of maintaining language
Example DSLs (1)
Spreadsheet
★ formulas, macros
Querying
★ SQL, XQuery, XPath
Graph layout
★ GraphViz
Web
★ HTML, CSS, RSS, XML, XSLT
★ Ruby/Rails, JSP, ASP, JSF, WebDSL
Example DSLs (2)
Games
★ Lua, UnrealScript
Modeling
★ UML, OCL, QVT
Language engineering
★ YACC, LEX, RegExp, ANTLR, SDF
★ TXL, ASF+SDF, Stratego
Example: Linguistic Integration in WebDSL
browser server database
web app
Web Programming
browser: HTML, JS, CSS
server: Java
database: SQL
Web Programming = Distributed Programming
Concerns in Web Programming
Data Persistence
Access Control
Injection Attacks
Search
XSS
Data Validation
Data Binding
Routing
... ...
Zef Hemel, Danny M. Groenewegen, Lennart C. L. Kats, Eelco Visser. Static consistency checking of web applications with WebDSL. Journal of Symbolic Computation, 46(2):150-182, 2011.
Late Failure Detection in Web Applications
Complexity in Web Programming:
Multiple Languages x Multiple Concerns
Consistency not statically checked
Eelco Visser. WebDSL: A Case Study in Domain-Specific Language Engineering. In Ralf Lämmel, Joost Visser, João Saraiva, editors, Generative and Transformational Techniques in Software Engineering II, International Summer School, GTTSE 2007. Volume 5235 of Lecture Notes in Computer Science, pages 291-373, Springer, Braga, Portugal, 2007.
Separation of Concerns & Linguistic Integration
Formalizing Navigation Logic
http://eelcovisser.org/blog/post/42/dsl-engineering
  "http://" + "eelcovisser.org"   // host domain
  + "blog"                         // application name
  + "post"                         // page
  + "42"                           // identity
  + "dsl-engineering"              // title
  page blog(b: Blog, index: Int) {
    main(b){
      for(p: Post in b.recentPosts(index,5)) {
        section{
          header{ navigate post(p) { output(p.title) } }
          par{ output(p.content) }
          par{ output(p.created.format("MMMM d, yyyy")) }
        }
      }
    }
  }
  page post(p: Post) { ... }
Statically checked navigation
  entity Blog {
    key :: String (id)
    title :: String (name)
    posts -> Set<Post> (inverse=Post.blog)
    function recentPosts(index: Int, n: Int): List<Post> {
      var i := max(1,index) - 1;
      return [p | p: Post in posts order by p.created desc limit n offset i*n].list();
    }
  }
  entity Post {
    key :: String (id)
    title :: String (name, searchable)
    content :: WikiText (searchable)
    blog -> Blog
  }
Persistent Data Models
Generation of queries: no injection attacks
  entity Assignment {
    key :: String (id)
    title :: String (name, searchable)
    shortTitle :: String
    description :: WikiText (searchable)
    course -> CourseEdition (searchable)
    weight :: Float (default=1.0)
    deadline :: DateTime (default=null)
    // ...
  }
  page assignment(assign: Assignment, tab: String) {
    main{
      progress(assign, tab)
      pageHeader{
        output(assign.title)
        breadcrumbs(assign)
      }
      // ...
    }
  }
Persistent variables in WebDSL
http://department.st.ewi.tudelft.nl/weblab/assignment/752
objects are automatically persisted in database
  page post(p: Post) { ... }
  page editpost(p: Post) {
    action save() {
      return post(p);
    }
    main(p.blog){
      form{
        formEntry("Title"){ input(p.title) }
        formEntry("Content") { input(p.content) }
        formEntry("Posted") { input(p.created) }
        submit save() { "Save" }
      }
    }
  }
Forms & Data Binding
No separate controller!
access control rules
  principal is User with credentials username, password
  rule page blog(b: Blog, index: Int) { true }
  rule page post(p: Post) { p.public || p.author == principal }
  rule page editpost(p: Post) { principal == p.author }

  extend entity User {
    password :: Secret
  }
  extend entity Blog {
    owner -> User
  }
  extend entity Post {
    public :: Bool
  }
Declarative Access Control Rules
Linguistically Integrated
Persistent data model
Logic
Templates (UI, Email, Service)
Data binding
Access control
Data validation
Faceted search
Collaborative filtering
DSL Summary
software reuse through linguistic abstraction
• capture understanding of design patterns in language concepts
• abstract from accidental complexity
• program in terms of domain concepts
• automatically generate implementation
When to Use/Create DSLs?
Hierarchy of abstractions
• first understand how to program it
• make variations by copy, paste, adapt
• (avoid over-engineering)
• make library of frequently used patterns
• find existing (internal) DSLs for the domain
Time for a DSL?
• large class of applications using same design patterns
• design patterns cannot be captured in PL
• lack of checking / optimization for DSL abstractions
Language Engineering
  object ExpParser extends JavaTokenParsers with PackratParsers {
    lazy val exp: PackratParser[Exp] =
      (exp <~ "+") ~ exp1 ^^ { case lhs~rhs => Add(lhs, rhs) } |
      exp1

    lazy val exp1: PackratParser[Exp] =
      (exp1 ~ exp0) ^^ { case lhs~rhs => App(lhs, rhs) } |
      exp0

    lazy val exp0: PackratParser[Exp] =
      number | identifier | function | letBinding |
      "(" ~> exp <~ ")"

    // ...

    def parse(text: String) = parseAll(exp, text)
  }
syntax through parsers
  sealed abstract class Value
  case class numV(n: Int) extends Value
  case class closureV(param: Symbol, body: Exp, env: Env) extends Value

  def eval(exp: Exp, env: Env): Value = exp match {
    case Num(v) => numV(v)
    case Add(l, r) => plus(eval(l, env), eval(r, env))
    case Id(name) => lookup(name, env)
    case Let(name, e1, e2) => eval(e2, bind(name, eval(e1, env), env))
    case Fun(name, body) => closureV(name, body, env)
    case App(fun, arg) => eval(fun, env) match {
      case closureV(name, body, env2) => eval(body, bind(name, eval(arg, env), env2))
      case _ => sys.error("Closure expected")
    }
  }
semantics through interpreter
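The interpreter relies on definitions that the slide does not show. The following self-contained sketch fills them in under stated assumptions: the Exp case classes, the Env type (a Map), and the helpers plus, lookup and bind are guesses at what the lecture code looks like, and names are plain Strings instead of Symbols to keep the example portable across Scala versions.

```scala
object MiniInterp {
  // Abstract syntax (assumed shape, mirroring the constructors used in eval).
  sealed abstract class Exp
  case class Num(n: Int) extends Exp
  case class Add(l: Exp, r: Exp) extends Exp
  case class Id(name: String) extends Exp
  case class Let(name: String, e1: Exp, e2: Exp) extends Exp
  case class Fun(param: String, body: Exp) extends Exp
  case class App(fun: Exp, arg: Exp) extends Exp

  sealed abstract class Value
  case class numV(n: Int) extends Value
  case class closureV(param: String, body: Exp, env: Env) extends Value

  // Environment: immutable map from names to values (an assumption).
  type Env = Map[String, Value]

  def plus(a: Value, b: Value): Value = (a, b) match {
    case (numV(x), numV(y)) => numV(x + y)
    case _                  => sys.error("Numbers expected")
  }
  def lookup(name: String, env: Env): Value =
    env.getOrElse(name, sys.error("Unbound identifier: " + name))
  def bind(name: String, v: Value, env: Env): Env = env + (name -> v)

  def eval(exp: Exp, env: Env): Value = exp match {
    case Num(v)            => numV(v)
    case Add(l, r)         => plus(eval(l, env), eval(r, env))
    case Id(name)          => lookup(name, env)
    case Let(name, e1, e2) => eval(e2, bind(name, eval(e1, env), env))
    case Fun(param, body)  => closureV(param, body, env) // capture defining env
    case App(fun, arg) => eval(fun, env) match {
      case closureV(param, body, env2) =>
        eval(body, bind(param, eval(arg, env), env2))
      case _ => sys.error("Closure expected")
    }
  }

  // let inc = fun x -> x + 1 in inc 41
  val program: Exp =
    Let("inc", Fun("x", Add(Id("x"), Num(1))), App(Id("inc"), Num(41)))
  val result: Value = eval(program, Map.empty)
}
```

Because closures store the environment in which they were defined, function bodies are evaluated with static (lexical) scoping.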
Traditional Compilers
  ls
  Course.java
  javac -verbose Course.java
  [parsing started Course.java]
  [parsing completed 8ms]
  [loading java/lang/Object.class(java/lang:Object.class)]
  [checking university.Course]
  [wrote Course.class]
  [total 411ms]
  ls
  Course.class Course.java
Language Processors
syntax analysis
• parsing
• AST construction
static analysis
• name analysis
• type analysis
semantics
• generation
• interpretation
Integrated Development Environments (IDE)
Modern Compilers in IDEs
syntactic editor services
• syntax checking
• syntax highlighting
• outline view
• code folding
• bracket matching
semantic editor services
• error checking
• reference resolving
• hover help
• content completion
• refactoring
Eclipse Platform
runtime platform
• composition
• integration
development platform
• complex APIs
• abstractions for Eclipse IDEs
• concepts: editors, views, label provider, label provider factory, …
• tedious, boring, frustrating
Spoofax Language Workbench
declarative meta-languages
• syntax definition
• editor services
• term rewriting
implementation
• generic integration into Eclipse and IMP
• compilation & interpretation of language definitions
agile
• Spoofax & IDE under development in same Eclipse instance
• support for test-driven development
A Taste of Language Engineering with Spoofax
• abstract syntax trees
• declarative syntax definition
• name binding and scope
• transformation by term rewriting
EnFun: Entities with Functions
  module blog
    entity String {
      function plus(that:String): String
    }
    entity Bool { }
    entity Set<T> {
      function add(x: T)
      function remove(x: T)
      function member(x: T): Bool
    }
    entity Blog {
      posts : Set<Post>
      function newPost(): Post {
        var p : Post := Post.new();
        posts.add(p);
      }
    }
    entity Post {
      title : String
    }
Structure: Abstract Syntax
Signature & Terms
  constructors
    Module  : ID * List(Definition) -> Module
    Imports : ID -> Definition

  Module(
    "application",
    [Imports("library"),
     Imports("users"),
     Imports("frontend")])
Entities & Properties
  constructors
    Entity     : ID * List(Property) -> Definition
    Type       : ID -> Type
    New        : Type -> Exp
    Property   : ID * Type -> Property
    This       : Exp
    PropAccess : Exp * ID -> Exp

  Module("users",
    [ Imports("library")
    , Entity("User"
      , [ Property("email", Type("String"))
        , Property("password", Type("String"))
        , Property("isAdmin", Type("Bool"))])])
Parsing: From Text to Structure
Declarative Syntax Definition
  entity User {
    first : String
    last : String
  }

  Entity("User", [
    Property("first", Type("String")),
    Property("last", Type("String"))])

  signature constructors
    Entity   : ID * List(Property) -> Definition
    Type     : ID -> Type
    Property : ID * Type -> Property

  context-free syntax
    "entity" ID "{" Property* "}" -> Definition {"Entity"}
    ID                            -> Type {"Type"}
    ID ":" Type                   -> Property {"Property"}
Prototyping Syntax Definition
Context-free Syntax
  constructors
    True  : Exp
    False : Exp
    Not   : Exp -> Exp
    And   : Exp * Exp -> Exp
    Or    : Exp * Exp -> Exp

  context-free syntax
    "true"       -> Exp {"True"}
    "false"      -> Exp {"False"}
    "!" Exp      -> Exp {"Not"}
    Exp "&&" Exp -> Exp {"And"}
    Exp "||" Exp -> Exp {"Or"}
Lexical Syntax
  lexical syntax
    [a-zA-Z][a-zA-Z0-9]* -> ID
    "-"? [0-9]+          -> INT
    [\ \t\n\r]           -> LAYOUT

  constructors
    : String -> ID
    : String -> INT
scannerless generalized (LR) parsing
form of tokens (words, lexemes)
Ambiguity
isPublic || isDraft && (author == principal())
  amb([
    And(Or(Var("isPublic"), Var("isDraft")),
        Eq(Var("author"), ThisCall("principal", []))),
    Or(Var("isPublic"),
       And(Var("isDraft"), Eq(Var("author"), ThisCall("principal", []))))])
Disambiguation by Encoding Precedence
  context-free syntax
    "(" Exp ")"    -> Exp0 {bracket}
    "true"         -> Exp0 {"True"}
    "false"        -> Exp0 {"False"}
    Exp0           -> Exp1
    "!" Exp0       -> Exp1 {"Not"}
    Exp1           -> Exp2
    Exp1 "&&" Exp2 -> Exp2 {"And"}
    Exp2           -> Exp3
    Exp2 "||" Exp3 -> Exp3 {"Or"}
Declarative Disambiguation
  context-free syntax
    "true"       -> Exp {"True"}
    "false"      -> Exp {"False"}
    "!" Exp      -> Exp {"Not"}
    Exp "&&" Exp -> Exp {"And", left}
    Exp "||" Exp -> Exp {"Or", left}
    "(" Exp ")"  -> Exp {bracket}
  context-free priorities
    {left: Exp.Not} >
    {left: Exp.Mul} >
    {left: Exp.Plus Exp.Minus} >
    {left: Exp.And} >
    {non-assoc: Exp.Eq Exp.Lt Exp.Gt Exp.Leq Exp.Geq}
isPublic || isDraft && (author == principal())
Or(Var("isPublic"), And(Var("isDraft"), Eq(Var("author"), ThisCall("principal", []))))
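To see how precedence resolves the ambiguity, here is a hand-written recursive-descent parser that follows the stratified encoding (Exp0 to Exp3), with one function per precedence level. It is a sketch for illustration, not how SDF or Spoofax parse: the tokenizer is simplistic (it silently skips unknown characters), and the comparison author == principal() is replaced by a plain variable isAuthor.

```scala
object BoolParser {
  sealed trait Exp
  case object True extends Exp
  case object False extends Exp
  final case class Var(name: String) extends Exp
  final case class Not(e: Exp) extends Exp
  final case class And(l: Exp, r: Exp) extends Exp
  final case class Or(l: Exp, r: Exp) extends Exp

  // Tokens: "&&", "||", "!", "(", ")" and identifiers.
  private val token = """\s*(&&|\|\||!|\(|\)|[a-zA-Z][a-zA-Z0-9]*)""".r
  def tokenize(s: String): List[String] =
    token.findAllMatchIn(s).map(_.group(1)).toList

  // Exp0 ::= "(" Exp3 ")" | "true" | "false" | ID
  def exp0(ts: List[String]): (Exp, List[String]) = ts match {
    case "(" :: rest =>
      val (e, rest1) = exp3(rest)
      rest1 match {
        case ")" :: rest2 => (e, rest2)
        case _            => sys.error("expected )")
      }
    case "true" :: rest                    => (True, rest)
    case "false" :: rest                   => (False, rest)
    case id :: rest if id.head.isLetter    => (Var(id), rest)
    case _ => sys.error("unexpected input: " + ts.mkString(" "))
  }

  // Exp1 ::= "!" Exp0 | Exp0
  def exp1(ts: List[String]): (Exp, List[String]) = ts match {
    case "!" :: rest =>
      val (e, rest1) = exp0(rest)
      (Not(e), rest1)
    case _ => exp0(ts)
  }

  // Exp2 ::= Exp1 "&&" Exp2 | Exp1
  def exp2(ts: List[String]): (Exp, List[String]) = {
    val (l, rest) = exp1(ts)
    rest match {
      case "&&" :: rest1 =>
        val (r, rest2) = exp2(rest1)
        (And(l, r), rest2)
      case _ => (l, rest)
    }
  }

  // Exp3 ::= Exp2 "||" Exp3 | Exp2
  def exp3(ts: List[String]): (Exp, List[String]) = {
    val (l, rest) = exp2(ts)
    rest match {
      case "||" :: rest1 =>
        val (r, rest2) = exp3(rest1)
        (Or(l, r), rest2)
      case _ => (l, rest)
    }
  }

  def parse(s: String): Exp = {
    val (e, rest) = exp3(tokenize(s))
    if (rest.nonEmpty) sys.error("trailing input: " + rest.mkString(" "))
    e
  }
}
```

Since "&&" is handled one level below "||", an input such as isPublic || isDraft && (isAuthor) can only be read with the conjunction nested inside the disjunction; the alternative reading needs explicit parentheses.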
Analysis: Name Resolution
Definitions and References
module test
entity String { }
entity User { first : String last : String }
definition
reference
Name Binding in IDE
From Tree to Graph
  Module(
    "test",
    [ Entity("String", [])
    , Entity(
        "User"
      , [ Property("first", → Entity "String")
        , Property("last", → Entity "String")
        ]
      )
    ])
NaBL: Name Binding Language
module names
  imports include/Cam
  namespaces Type Property Function Variable
  rules
    Entity(name, None(), None(), _):
      defines Type name of type Type(name, [])
      scopes Type, Function, Property, Variable

    Type(name, _):
      refers to Type name
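What the two NaBL rules express (entities define Type names, type annotations refer to them) can be spelled out by hand. The sketch below uses a simplified AST with hypothetical names (NameResolution, unresolved) and only handles one flat scope; it is not what Spoofax generates from NaBL.

```scala
object NameResolution {
  final case class Type(name: String)
  final case class Property(name: String, tpe: Type)
  final case class Entity(name: String, props: List[Property])
  final case class Module(name: String, defs: List[Entity])

  // Definitions: every entity defines a Type name.
  // References: every property type refers to a Type name.
  // Result: one message per reference without a matching definition.
  def unresolved(m: Module): List[String] = {
    val defined = m.defs.map(_.name).toSet
    for {
      e <- m.defs
      p <- e.props
      if !defined.contains(p.tpe.name)
    } yield s"${e.name}.${p.name}: unknown type ${p.tpe.name}"
  }

  // The module from the slides, with a deliberate typo in the second property.
  val test: Module = Module("test", List(
    Entity("String", Nil),
    Entity("User", List(
      Property("first", Type("String")),
      Property("last", Type("Strin"))))))
}
```

A declarative specification lets the same definition/reference information drive error checking, reference resolving and content completion in the editor, which a hand-written checker like this one would have to reimplement separately.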
Transformation
Transformation by Strategic Rewriting
  rules
    desugar :
      Plus(e1, e2) -> MethCall(e1, "plus", [e2])

    desugar :
      Or(e1, e2) -> MethCall(e1, "or", [e2])

    desugar :
      VarDeclInit(x, t, e) -> Seq([VarDecl(x, t), Assign(Var(x), e)])

  strategies
    desugar-all = topdown(repeat(desugar))
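The strategy topdown(repeat(desugar)) separates the rewrite rules from the traversal that applies them. A rough Scala analogue is sketched below under simplifying assumptions: the AST is a small expression type invented for the example, rule failure is modelled with Option, and the generic traversal all is written by hand for this one type (Stratego provides it for every term).

```scala
object Desugar {
  sealed trait Exp
  final case class Num(n: Int) extends Exp
  final case class Var(x: String) extends Exp
  final case class Plus(l: Exp, r: Exp) extends Exp
  final case class Or(l: Exp, r: Exp) extends Exp
  final case class MethCall(recv: Exp, name: String, args: List[Exp]) extends Exp

  // Rewrite rules: None means "the rule does not apply" (failure).
  def desugar(e: Exp): Option[Exp] = e match {
    case Plus(e1, e2) => Some(MethCall(e1, "plus", List(e2)))
    case Or(e1, e2)   => Some(MethCall(e1, "or", List(e2)))
    case _            => None
  }

  // repeat(s): apply s until it fails, then return the last result.
  def repeat(s: Exp => Option[Exp])(e: Exp): Exp = s(e) match {
    case Some(e2) => repeat(s)(e2)
    case None     => e
  }

  // all(f): apply f to each direct subterm.
  def all(f: Exp => Exp)(e: Exp): Exp = e match {
    case Plus(l, r)            => Plus(f(l), f(r))
    case Or(l, r)              => Or(f(l), f(r))
    case MethCall(r, n, args)  => MethCall(f(r), n, args.map(f))
    case leaf                  => leaf
  }

  // topdown(f): transform the node first, then recurse into its children.
  def topdown(f: Exp => Exp)(e: Exp): Exp = all(topdown(f))(f(e))

  val desugarAll: Exp => Exp = topdown(repeat(desugar))
}
```

For example, desugarAll applied to Plus(Var("a"), Or(Var("b"), Num(1))) rewrites the outer Plus first and then the nested Or inside the argument list of the resulting method call.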
Return-Lifting Applied
  function fact(n: Int): Int {
    var res: Int;
    if(n == 0) {
      res := 1;
    } else {
      res := this * fact(n - 1);
    }
    return res;
  }

  function fact(n: Int): Int {
    if(n == 0) {
      return 1;
    } else {
      return this * fact(n - 1);
    }
  }
Return-Lifting Rules
  rules
    lift-return-all =
      alltd(lift-return; normalize-all)

    lift-return :
      FunDef(x, arg*, Some(t), stat1*) ->
      FunDef(x, arg*, Some(t), Seq([
        VarDecl(y, t),
        Seq(stat2*),
        Return(Var(y))
      ]))
      where y := <new>;
            stat2* := <alltd(replace-return(|y))> stat1*

    replace-return(|y) :
      Return(e) -> Assign(y, e)
Language Engineering Summary
apply linguistic abstraction to language engineering
• declarative languages for language definition
• automatic derivation of efficient compilers
• automatic derivation of IDEs
Research Agenda
Example: Explicit Representation of Units
  input distance : Meter;
  input duration : Second;
  output speed : Meter/Second := duration / distance;
compiler reports the error before the program reaches the computer
formalize knowledge of application area (domain) in language
Problem: Correctness of Language Definitions
computer
compiler
Can we trust the compiler?
wrong output
input
program
type soundness: well-typed programs don’t go wrong
compiler
error
Challenge: Automatic Verification of Correctness
computer
compiler
wrong output
program
type soundness: well-typed programs don’t go wrong
type checker
code generator
input
Correctness Proof
Language Workbench
State-of-the-Art: Language Engineering
Syntax Checker
Name Resolver
Type Checker
Code Generator
focus on implementation; not suitable for verification
Compiler
Editor (IDE)
Tests
Formal Language Specification
State-of-the-Art: Semantics Engineering
Abstract Syntax
Type System
Dynamic Semantics
Transforms
focus on (only semi-automatic) verification; not suitable for implementation
Correctness Proof
Tests
Compiler
Editor (IDE)
Declarative Language Definition
My Approach: Multi-Purpose Language Definitions
Syntax Definition
Name Binding
Type System
Dynamic Semantics
Transforms
Compiler
Editor (IDE)
Correctness Proof
Tests
bridging the gap between language engineering and semantics engineering
Software Development on the Web
revisiting the architecture of the IDE
Exam
• Syntax and Semantics
• Names, Bindings, and Scopes
• Storage
• Data Types
• Functional Programming
• First-class Functions
• Polymorphism
• Type Parameterization
• Parsing and Interpretation
• Data Abstraction / Modular Programming
• Functional Programming Redux
• Concurrency
• Concurrent Programming
• Domain-Specific Languages
Quarter 3
Quarter 4
Basics of Scala, JavaScript, C
Material for exam
Slides from lectures
Tutorial exercises
Graded assignments
Sebesta: Chapters 1-13, 15
Programming in Scala: Chapters 1, 4-16, 19, 32-33
K&R C: Chapters 1-6
JavaScript Good Parts: Chapters 1-4
Content of exam
10% multiple choice questions about concepts
50% Scala programming (functional programming)
20% C programming (structures and pointers)
20% JavaScript programming (objects and prototypes)
Registration for Exam is Required
http://department.st.ewi.tudelft.nl/weblab/assignment/761 -> your submission
Good Luck!