itu - mdd - xtext

L0081 - 2010-11-08

Redistribution and other use of this material requires written permission from The RCP Company.

ITU - MDD – XText

This presentation describes the use of XText.

This presentation assumes a good knowledge of Data Modeling and Grammars as previously presented.

This presentation was developed for the MDD 2010 course at ITU, Denmark.


Abstract Syntax Trees

Parse trees are very detailed: every step in a derivation is a node.

After the parsing phase is done, the details of the derivation are not needed for later phases.

The semantic analyzer removes intermediate productions to create an (abstract) syntax tree, known as an AST.

Parse Tree: expr -> term -> factor -> ID: x

Abstract Syntax Tree: expr -> ID: x
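The reduction from parse tree to AST can be sketched in a few lines of Java. This is only an illustration under assumptions: the Node class and the simplify method are hypothetical names, not part of XText, and the rule "skip every node that has exactly one child" is a simplification of what a real semantic analyzer does.

```java
import java.util.ArrayList;
import java.util.List;

final class ParseTreeToAst {
    /** A generic tree node: a label plus ordered children (hypothetical helper type). */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }

        @Override
        public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder(label).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Keeps the root, but skips every descendant that merely forwards to a single child. */
    static Node simplify(Node n) {
        Node out = new Node(n.label);
        for (Node child : n.children) {
            // A node with exactly one child is an intermediate production: step over it.
            while (child.children.size() == 1) child = child.children.get(0);
            out.children.add(simplify(child));
        }
        return out;
    }

    public static void main(String[] args) {
        Node parseTree = new Node("expr", new Node("term", new Node("factor", new Node("ID: x"))));
        System.out.println(parseTree);           // prints expr(term(factor(ID: x)))
        System.out.println(simplify(parseTree)); // prints expr(ID: x)
    }
}
```

The root is kept on purpose so that the result matches the expr -> ID: x tree of the slide.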


Parsing an Expression

assignment => ID "=" expression;

expression => expression "+" term
            | expression "-" term
            | term

term => term "*" factor
      | term "/" factor
      | factor

factor => "(" expression ")"
        | ID
        | NUMBER

Parse tree:

assignment ~ y = (2*x + 5)*x - 7
  ID ~ y
  expression ~ (2*x + 5)*x - 7
    expression ~ (2*x + 5)*x
      term ~ (2*x + 5)*x
        term ~ (2*x + 5)
          factor ~ (2*x + 5)
            expression ~ 2*x + 5
              expression ~ 2*x
                term ~ 2*x
                  term ~ 2
                    factor ~ 2
                      NUMBER ~ 2
                  factor ~ x
                    ID ~ x
              term ~ 5
                factor ~ 5
                  NUMBER ~ 5
        factor ~ x
          ID ~ x
    term ~ 7
      factor ~ 7
        NUMBER ~ 7


Building an AST

The grammar and the parse tree of y = (2*x + 5)*x - 7 are the same as for Parsing an Expression. Dropping the intermediate expression, term and factor nodes leaves the AST:

Assignment(y)
  Expression(-)
    Expression(*)
      Expression(+)
        Expression(*)
          NUMBER(2)
          ID(x)
        NUMBER(5)
      ID(x)
    NUMBER(7)
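As a rough sketch of how such an AST can be built directly while parsing, here is a small hand-written recursive-descent parser in Java for the expression grammar. It is not what XText generates. The class and method names are hypothetical, the left-recursive rules are rewritten as loops (recursive descent cannot handle left recursion directly), the AST is rendered as a prefix string instead of real node objects, and identifiers are assumed to consist of letters only.

```java
final class ExprAstBuilder {
    private final String src;
    private int pos = 0;

    private ExprAstBuilder(String src) {
        this.src = src.replace(" ", ""); // blanks carry no meaning in this grammar
    }

    /** Parses: assignment => ID "=" expression, and returns the AST in prefix notation. */
    static String parse(String input) {
        ExprAstBuilder p = new ExprAstBuilder(input);
        String id = p.identifier();
        p.expect('=');
        String ast = "Assignment(" + id + ", " + p.expression() + ")";
        if (p.pos != p.src.length()) throw new IllegalArgumentException("Unexpected input at " + p.pos);
        return ast;
    }

    // expression => expression ("+" | "-") term | term, with the left recursion turned into a loop
    private String expression() {
        String left = term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            left = "Expression(" + op + ", " + left + ", " + term() + ")";
        }
        return left;
    }

    // term => term ("*" | "/") factor | factor
    private String term() {
        String left = factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            left = "Expression(" + op + ", " + left + ", " + factor() + ")";
        }
        return left;
    }

    // factor => "(" expression ")" | ID | NUMBER
    private String factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            String inner = expression();
            expect(')');
            return inner; // parentheses only group; they leave no node in the AST
        }
        if (Character.isDigit(c)) {
            int start = pos;
            while (Character.isDigit(peek())) pos++;
            return "NUMBER(" + src.substring(start, pos) + ")";
        }
        return "ID(" + identifier() + ")";
    }

    private String identifier() {
        int start = pos;
        while (Character.isLetter(peek())) pos++;
        if (start == pos) throw new IllegalArgumentException("Identifier expected at " + pos);
        return src.substring(start, pos);
    }

    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) throw new IllegalArgumentException("Expected '" + c + "' at " + pos);
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(parse("y = (2*x + 5)*x - 7"));
    }
}
```

Note that no term or factor nodes are ever created: each method returns the node of its child unchanged when no operator follows, which is exactly the reduction from parse tree to AST.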


Level of Details

How detailed should a grammar be?

Should the grammar be rich, i.e. contain all details?

Or should the grammar be as thin as possible?

In general, use a thin grammar to avoid too many keywords.

For XText, use a rich grammar, as the grammar is used to provide content assist.

Other similar questions: Should you model dates? How about ranges?

Rich grammar:

Flight : 'flight' ID '{' ( ('from'|'to') '=' STRING ';' )* '}'

Thin grammar:

Flight : 'flight' ID '{' ( ID '=' STRING ';' )* '}'


Line feed or Explicit Terminators

Should you have an explicit “statement” terminator or use line feeds?

Statement terminator such as “;” or “.”
  Pro: much easier to detect errors in the input
  Con: normally not natural for the user

Statement terminator such as line feed
  Pro: much easier to detect errors in the input
  Con: cannot divide a logical line over multiple physical lines

No statement terminators
  Pro: very natural for many users
  Con: difficult to detect errors in the input

In general, use a statement terminator, not a statement separator!

In our scenario, it probably makes good sense to use line feeds, and indentation for scoping.


What is XText exactly

Xtext is a complete environment for development of textual programming languages and domain-specific languages.

It is implemented in Java and is based on Eclipse, EMF, and Antlr.

The basic idea?
  Augment an EBNF grammar
  Roll it through XText
  Have an Eclipse editor

The XText tool bench
  The grammar is defined in an EBNF-like format in the Xtext editor.
  The editor provides code completion and constraint checking for the grammars themselves.
  A grammar is a collection of rules.
  Rules start with their name followed by “:” and end with “;”.


Getting Started with XText

Create a new XText project with the New… wizard


The XText File Structure

Three projects!

You can make changes to the files in the src folders, but not to the files in the src-gen folders.

In the project tree you will find:
  Your grammar rules
  Workflow description used to create the editor
  Generated ECore model
  Generated ECore classes


An XText Model File (.xtext)

grammar org.xtext.example.MyDsl with org.eclipse.xtext.common.Terminals

generate myDsl "http://www.xtext.org/example/MyDsl"

Model : Type*;

…

grammar org.xtext.example.MyDsl: identity of the model

with org.eclipse.xtext.common.Terminals: include of base declarations and terminals

generate myDsl: directive to create the model

"http://www.xtext.org/example/MyDsl": NS URI of the model

It is also possible to import an existing model.


Generating Files from a Model

To generate an Eclipse editor from a model (.xtext):
  Select the workflow file (.mwe2)
  Use “Run As…” and then “MWE2 Workflow”


The Basic XText Concepts

The AST is an ECore model.

The rules of the input grammar are used to create the AST (more or less automatically) or to interface to an existing imported model (= AST).

So we must identify:
  The entities of the model
  The attributes of the model
  The relations (containment and references) of the model

The input language of XText is an augmented EBNF (almost).


Example

Grammar for a (very) simple type system:

Model : Type*;

Type: SimpleType | Entity;

SimpleType: 'datatype' ID;

Entity : 'entity' ID ('extends' ID)? '{'
  Property*
'}';

Property: 'property' ID ':' ID ('[]')?;

Sample input:

datatype A
datatype B
entity E {
  property a : A
  property b : B[]
}
entity F extends E {
  property c : A
}

Resulting model (class diagram): Model contains * Type. SimpleType and Entity are subclasses of Type. Entity contains * Property. Entity has a Super-type reference.


Different Kinds of XText Rules

XText rules are EBNF plus type information.

Type Rules
  For each rule XText creates an entity in the logical model.
  Each rule property results in a property in the logical model.

String Rules
  String rules are parsed to be a string.
  These are in effect custom lexer rules, as they recognize string patterns.

Enum Rules
  Limited set of alternatives
  Mapped to an enumeration in the logical model

Native Rules
  A lexer rule that is mapped directly to an ANTLR rule
  Mapped to a String


Abstract Type Rules

The class of a type rule defaults to the name of the rule.
  This can be overridden via the “{ClassName}” construct.

A type rule can be abstract.
  An abstract type rule is basically a collection of OR-ed alternatives: R1 | R2 | R3
  It is mapped to an abstract metaclass.
  The OR-ed alternatives become concrete subclasses.
  Common properties of the alternatives are lifted into the abstract superclass.

Model : Type*;

Type: SimpleType | Entity;

SimpleType: 'datatype' ID;

Entity : 'entity' ID ('extends' ID)? '{'
  Property*
'}';

Property: 'property' ID ':' ID ('[]')?;
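In plain Java terms, the mapping of an abstract type rule could look roughly like the sketch below. This is a hand-written stand-in, not the code XText generates (which is EMF-based), and it assumes that both alternatives assign a feature called name, as in the grammars used later in this presentation. The describe helper is hypothetical and only there to show the classes in use.

```java
final class AbstractTypeRuleSketch {
    /** Type: SimpleType | Entity;  maps to an abstract superclass. */
    static abstract class Type {
        // Assumption: 'name' is assigned in every alternative, so it is lifted up here.
        String name;
    }

    /** First alternative: a concrete subclass without features of its own. */
    static final class SimpleType extends Type {
    }

    /** Second alternative: a concrete subclass with an extra, entity-specific feature. */
    static final class Entity extends Type {
        Entity superType; // null when the entity extends nothing
    }

    /** Works on the abstract type only, so it accepts either alternative. */
    static String describe(Type t) {
        return t.getClass().getSimpleName() + " " + t.name;
    }

    public static void main(String[] args) {
        SimpleType a = new SimpleType();
        a.name = "A";
        Entity e = new Entity();
        e.name = "E";
        System.out.println(describe(a)); // prints SimpleType A
        System.out.println(describe(e)); // prints Entity E
    }
}
```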



Example with Type Rules

Every rule corresponds to an entity.

“Or” rules become abstract classes with the entities of the sub-rules as child classes.

Here we have Type as an abstract superclass for SimpleType and Entity.

Model : Type*;

Type: SimpleType | Entity;

SimpleType: 'type' ID;

Entity : 'entity' ID ('extends' ID)? '{'
  Property*
'}';

Property: 'property' ID ':' ID ('[]')?;



Some built-in String Rules (Terminals)

The definition of the central terminals

Really part of lexical analysis!

Specific terminals can be “hidden” in the grammar or for specific rules, meaning they are recognized but ignored during parsing…

Very useful if you want a line-oriented grammar

terminal ID : '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;

terminal INT returns ecore::EInt: ('0'..'9')+;

terminal STRING:
  '"' ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|"'"|'\\') | !('\\'|'"') )* '"' |
  "'" ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|"'"|'\\') | !('\\'|"'") )* "'"
;

terminal ML_COMMENT: '/*' -> '*/';

terminal SL_COMMENT : '//' !('\n'|'\r')* ('\r'? '\n')?;

terminal WS : (' '|'\t'|'\r'|'\n')+;


Properties (Attributes and Relations) in Type Rules

A type rule has a rule composition, made of a number of elements.
  It may contain keywords (using string literal syntax).
  It also contains properties, which result in properties of the type class.
  The property type is derived from the called rule.

There are different kinds of properties:
  =  (single assign)
  += (multiple assign/add)
  ?= (boolean assign)

There are different property cardinalities:
  ? (0..1)
  * (0..n)
  + (1..n)
  nothing (1..1)

Entity :
  'entity' name=ID ('extends' extends=ID)? '{'
    properties+=Property*
  '}';

Property:
  'property' name=ID ':' type=ID (many?='[]')?;

name=ID results in a simple attribute “String name”.

properties+=Property* results in a containment relation “List<Property> properties”.
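To make the three assignment kinds concrete, here is a plain-Java picture of the classes that the Entity and Property rules describe. It is a sketch under assumptions: XText produces an EMF model rather than hand-written classes like these, the field superName is a hypothetical stand-in for the extends feature (extends is a reserved word in Java), and the sample method is only there to build a small instance by hand.

```java
import java.util.ArrayList;
import java.util.List;

final class PropertyModelSketch {
    /** Mirrors: Property: 'property' name=ID ':' type=ID (many?='[]')?; */
    static final class Property {
        String name;   // name=ID     (single assign): simple attribute
        String type;   // type=ID     (single assign): simple attribute
        boolean many;  // many?='[]'  (boolean assign): true only if '[]' was present

        Property(String name, String type, boolean many) {
            this.name = name;
            this.type = type;
            this.many = many;
        }
    }

    /** Mirrors: Entity: 'entity' name=ID ('extends' extends=ID)? '{' properties+=Property* '}'; */
    static final class Entity {
        String name;       // name=ID (single assign)
        String superName;  // extends=ID inside (...)?: stays null when the clause is absent
        final List<Property> properties = new ArrayList<>(); // properties+=Property*: containment list

        Entity(String name, String superName) {
            this.name = name;
            this.superName = superName;
        }
    }

    /** Builds by hand what parsing "entity E { property a : A  property b : B[] }" would yield. */
    static Entity sample() {
        Entity e = new Entity("E", null);
        e.properties.add(new Property("a", "A", false));
        e.properties.add(new Property("b", "B", true));
        return e;
    }

    public static void main(String[] args) {
        Entity e = sample();
        System.out.println(e.name + " has " + e.properties.size() + " properties"); // prints E has 2 properties
    }
}
```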


Example of Properties

Simple property: name of SimpleType

“Many” reference (containment): elements of Model, properties of Entity

Boolean property: many of Property

Model :
  (elements+=Type)*;

Type:
  SimpleType | Entity;

SimpleType:
  'type' name=ID;

Entity :
  'entity' name=ID ('extends' extends=ID)? '{'
    properties+=Property*
  '}';

Property:
  'property' name=ID ':' type=ID (many?='[]')?;


Reference Relations

By default, rules result in a logical model with a tree of entities via containment.

You can reference other elements via a reference rule.
  In textual languages a reference has to be by name.
  During linking, Xtext “dereferences” these by-name references.

Why references?
  Used in the logical model for references
  XText also uses references for content assist

Entity :
  'entity' name=ID ('extends' extends=[Entity])? '{'
    properties+=Property*
  '}';

properties+=Property* results in a containment relation “List<Property> properties”.

extends=[Entity] results in a reference relation “Entity extends”.
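The idea of resolving by-name references in a separate linking step can be sketched as follows. This is a deliberately simplified illustration, not XText's linker: it assumes a single flat namespace, and the names LinkingSketch, superName, superType and link are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LinkingSketch {
    static final class Entity {
        final String name;
        final String superName; // the name as written after 'extends', or null
        Entity superType;       // the resolved reference, filled in by link()

        Entity(String name, String superName) {
            this.name = name;
            this.superName = superName;
        }
    }

    /** Resolves every by-name reference and returns the names that could not be resolved. */
    static List<String> link(List<Entity> entities) {
        // Pass 1: index all declared entities by name.
        Map<String, Entity> byName = new HashMap<>();
        for (Entity e : entities) byName.put(e.name, e);

        // Pass 2: replace each textual name with the element it denotes.
        List<String> unresolved = new ArrayList<>();
        for (Entity e : entities) {
            if (e.superName == null) continue;
            e.superType = byName.get(e.superName);
            if (e.superType == null) unresolved.add(e.superName);
        }
        return unresolved;
    }

    public static void main(String[] args) {
        Entity e = new Entity("E", null);
        Entity f = new Entity("F", "E");
        System.out.println(link(List.of(e, f)));  // prints []
        System.out.println(f.superType == e);     // prints true
    }
}
```

Two passes are used so that an entity may refer to one that is declared later in the file. The same name index is also what an editor would consult to propose candidates after the extends keyword.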


Example of References

References: extends of Entity, type of Property

Model :
  (elements+=Type)*;

Type:
  SimpleType | Entity;

SimpleType:
  'type' name=ID;

Entity :
  'entity' name=ID ('extends' extends=[Entity])? '{'
    properties+=Property*
  '}';

Property:
  'property' name=ID ':' type=[Type] (many?='[]')?;


Enumeration Rules

An enum rule is used to define a limited set of defined alternatives.

It is mapped to an enumeration in the logical model.

It is declared via the enum keyword and contains enum literals.
  An enum literal has a token name and a string representation.

It can be used just like any other rule.
  Properties will get the enumeration type.

enum Color:
  RED="red" | GREEN="green" | BLUE="blue" ;

Shape:
  name=ID ('color' color=Color)? … ;
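The split between token name and string representation has a close Java analogue, sketched below. The enum mirrors the Color rule, but the code is an illustration only: the literal field and the fromLiteral helper are hypothetical names and not what XText generates.

```java
final class ColorEnumSketch {
    /** Each constant is the token name; 'literal' is the text the user writes in the DSL. */
    enum Color {
        RED("red"), GREEN("green"), BLUE("blue");

        final String literal;

        Color(String literal) {
            this.literal = literal;
        }

        /** Maps the text found in the input to the enum constant, or fails for unknown text. */
        static Color fromLiteral(String text) {
            for (Color c : values()) {
                if (c.literal.equals(text)) return c;
            }
            throw new IllegalArgumentException("Not a Color literal: " + text);
        }
    }

    public static void main(String[] args) {
        System.out.println(Color.fromLiteral("green")); // prints GREEN
    }
}
```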


Native Rules

A native rule contains a string which is passed to ANTLR without further processing.

It is typically used to define lexer rules that cannot be expressed using Xtext syntax.
  E.g. whitespace-aware lexer rules, such as a custom comment syntax


Importing an Existing Model

To import a model instead of generating it:

Simple change to the xtext grammar file:

grammar com.rcpcompany.mdd2010.DSL with org.eclipse.xtext.common.Terminals

import "platform:/resource/com.rcpcompany.mdd2010.model/model/Travel.ecore"
import "http://www.eclipse.org/emf/2002/Ecore" as ecore

…

Depending on the starting point, a simple change to the MWE2 file:

fragment = org.eclipse.xtext.generator.ecore.EcoreGeneratorFragment {
  genModels = "platform:/resource/com.rcpcompany.mdd2010.model/model/Travel.genmodel"
}

Not so with the generated MWE2 file. It is easier to start with the wizard “XText Project from existing ECore Models”.


More Information

XText and friends

“Build your own textual DSL with Tools from the Eclipse Modeling Project”
http://www.eclipse.org/articles/article.php?file=Article-BuildYourOwnDSL/index.html
Older, slightly outdated article on how to create your first XText project

Documentation for XText
http://www.eclipse.org/Xtext/documentation/




Exercise 1

Make sure your grammar is in proper EBNF form.
  Do you think you can make an interesting editor from the grammar?

Create an XText project if not already done.

Convert your EBNF grammar into an XText model.

What can you do to your grammar to make it more suitable for XText?
