lecture 11: describing the syntax

53
Chair of Software Engineering Einführung in die Programmierung Introduction to Programming Prof. Dr. Bertrand Meyer Michela Pedroni Lecture 11: Describing the Syntax

Upload: rico

Post on 23-Feb-2016

33 views

Category:

Documents


0 download

DESCRIPTION

Einführung in die Programmierung Introduction to Programming Prof. Dr. Bertrand Meyer Michela Pedroni. Lecture 11: Describing the Syntax. Goals of today’s lecture. Learn about languages that describes other languages Read and understand the syntax description for Eiffel - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Lecture 11:  Describing the Syntax

Chair of Software Engineering

Einführung in die ProgrammierungIntroduction to Programming

Prof. Dr. Bertrand Meyer

Michela Pedroni

Lecture 11: Describing the Syntax

Page 2: Lecture 11:  Describing the Syntax

2

Goals of today’s lectureLearn about languages that describes other languages

Read and understand the syntax description for Eiffel

Write simple syntax descriptions

Page 3: Lecture 11:  Describing the Syntax

3

Syntax: ConditionalA conditional instruction consists, in order, of:An “If part”, of the form if condition.A “Then part” of the form then compound.Zero or more “Else if parts”, each of the formelseif condition then compound.Zero or one “Else part” of the form else compoundThe keyword end.

Here each condition is a boolean expression, and each compound is a compound instruction.

Page 4: Lecture 11:  Describing the Syntax

4

Why describe the syntax formally?

We know syntax descriptions for human languages:

e.g. grammar for German, French, …

Expressed in natural language

Good enough for human use

Ambiguous, like human languages themselves

Page 5: Lecture 11:  Describing the Syntax

5

The power of human mindI cdnoult blvelee taht I cluod aulacity uesdnatnrd waht I was rdgnieg. The Paomnnehal Pweor of the Hmuan Mnid Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, is deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit any porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Ptrety Amzanig Huh?

Page 6: Lecture 11:  Describing the Syntax

6

Why describe the syntax formally?

Programming languages need better descriptions:

More precise: must tell us unambiguously whether given program text is legal or not

Use formalism similar to mathematics

Can be fed into compilers for automatic processing of programs

Page 7: Lecture 11:  Describing the Syntax

7

Why describe the syntax formally?Compilers use algorithms to

Determine if input is correct program text (parser)

Analyze program text to extract specimens for abstract syntax tree

Translate program text to machine instructions

Compilers need strict formal definition of programming language

Page 8: Lecture 11:  Describing the Syntax

8

Formal Description of SyntaxUse formal language to describe programming languages.

Languages used to describe other languages are called Meta-Languages

Meta-Language used to describe Eiffel:BNF-E (Variant of the Backus-Naur-Form, BNF)

Page 9: Lecture 11:  Describing the Syntax

9

History

1954 FORTRAN: First widely recognized programming language (developed by John Backus et Al.) 1958 ALGOL 58: Joint work of European and American groups1960 ALGOL 60: Preparation showed a need for a formal description John Backus (member of ALGOL team) proposed Backus-Normal-Form (BNF)1964: Donald Knuth suggested acknowledging Peter Naur for his contribution Backus-Naur-FormMany variants since then, e.g. graphical variant by Niklaus Wirth

Page 10: Lecture 11:  Describing the Syntax

10

Formal description of a languageBNF lets us describe syntactical properties of a language

Remember: Description of a programming language also includes lexical and semantic properties other tools

Page 11: Lecture 11:  Describing the Syntax

11

Formal Description of Syntax

A language is a set of phrases

A phrase is a finite sequence of tokens from a certain “vocabulary”

Not every possible sequence is a phrase of the language

A grammar specifies which sequences are phrases and which are not

BNF is used to define a grammar for a programming language

Page 12: Lecture 11:  Describing the Syntax

12

Example of phrases

class PERSONfeature

age: INTEGER-- Age

end

class age: INTEGER

-- Ageend PERSONfeature

Page 13: Lecture 11:  Describing the Syntax

13

Grammar

DefinitionA Grammar for a language is a finite set of rules

forproducing phrases, such that:

1. Any sequence obtained by a finite number of applications of rules from the grammar is a phrase of the language.

2. Any phrase of the language can be obtained by a finite number of applications of rules from the grammar.

Page 14: Lecture 11:  Describing the Syntax

14

Elements of a grammar: Terminals

Terminals

Tokens of the language that are not defined by a production of the grammar.E.g. keywords from Eiffel such as if, then, endor symbols such as the semicolon “;” or the

assignment “:=”

Page 15: Lecture 11:  Describing the Syntax

15

Elements of a grammar: Nonterminals

Nonterminals

Names of syntactical structures or substructures used to build phrases.

Page 16: Lecture 11:  Describing the Syntax

16

Elements of a grammar: Productions

Productions

Rules that define nonterminals of the grammar using a combination of terminals and (other) nonterminals

Page 17: Lecture 11:  Describing the Syntax

17

An example production

Terminal

NonterminalProduction

Conditional:

if

else

endthenCondition

Instruction

Instruction

Page 18: Lecture 11:  Describing the Syntax

18

BNF Elements: ConcatenationGraphical representation:

BNF: A B

Meaning: A followed by B

A B

Page 19: Lecture 11:  Describing the Syntax

19

Graphical representation:

BNF: [ A ]

Meaning: A or nothing

BNF Elements: Optional

A

Page 20: Lecture 11:  Describing the Syntax

20

Graphical representation:

BNF: A | B

Meaning: either A or B

BNF Elements: Choice

A

B

Page 21: Lecture 11:  Describing the Syntax

21

Graphical representation:

BNF: { A }*

Meaning: sequence of zero or more A

BNF Elements: Repetition

A

Page 22: Lecture 11:  Describing the Syntax

22

Graphical representation:

BNF: { A }+

Meaning: sequence of one or more A

BNF Elements: Repetition, once or more

A

Page 23: Lecture 11:  Describing the Syntax

23

BNF elements: Overview

ARepetition (at least once): { A }+

Repetition (zero or more): { A }* A

Choice: A | BA

B

AOptional: [ A ]

A BConcatenation: A B

Page 24: Lecture 11:  Describing the Syntax

24

A simple example

digitdigit

float_number:digit:

Example phrases:.76-.761.5612.845-1.3413.0

Translate it to written form!

0123456789

-.

Page 25: Lecture 11:  Describing the Syntax

25

A simple example

written in BNF:

[ ] { }* { }+

0

float_number

digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

- digit . digit

=

=

Page 26: Lecture 11:  Describing the Syntax

26

BNF Elements Combined

written in BNF:

Conditional:

if

else

endthencondition

instruction

instruction

[ instructioninstruction else endif thencondition

]

Conditional =

Page 27: Lecture 11:  Describing the Syntax

27

BNF: Conditional with elseif

Conditional

Then_part_list

Else_partThen_part_listif end[ ]

Then_part elseif }*{ Then_part

Then_part Boolean_expression then Compound

Else_part else Compound

=

=

=

=

Page 28: Lecture 11:  Describing the Syntax

28

Different Grammar for Conditional

Conditional

If_part

Then_part

Else_list

Elseif_part

Boolean_expressionif

If_part Then_part Else_list end

Compoundthen

Boolean_expression Then_partelseif

Elseif_part Compound{ ]else}* [

=

=

=

=

=

Page 29: Lecture 11:  Describing the Syntax

29

Simple BNF example

Sentence I [ don’t ] Verb Names QuantNames Name {and Name}*Name tomatoes | shoes | books | footballVerb like | hate Quant a lot | a little

Which of the following phrases are correct?I like tomatoes and footballI don’t like tomatoes a littleI hate football a lotI like shoes and tomatoes a littleI don’t hate tomatoes, football and books a lot

Rewrite the BNF to include the incorrect phrases

==

===

Page 30: Lecture 11:  Describing the Syntax

30

Simple BNF example (Solution)

Which of the following phrases are correct? - I like tomatoes and football I don’t like tomatoes a little I hate football a lot I like shoes and tomatoes a little - I don’t hate tomatoes, football and books a

lot

Rewrite the BNF to include the incorrect phrasesSentence I [ don’t ] Verb Names [ Quant ]Names Name [{, Name}* and Name]Name tomatoes | shoes | books | footballVerb like | hate Quant a lot | a little

====

=

Page 31: Lecture 11:  Describing the Syntax

31

BNF-EUsed in official description of Eiffel.Every Production is one of

ConcatenationA B C [ D ]

ChoiceA B | C | D

RepetitionA { B terminal ... }*

(also with +)

=

=

=

A [ B { terminal B }* ]=

Page 32: Lecture 11:  Describing the Syntax

32

BNF-E Rules

Every nonterminal must appear on the left-hand side of exactly one production, called its defining production

Every production must be of one kind: Concatenation, Choice or Repetition

Page 33: Lecture 11:  Describing the Syntax

33

Conditional with elseif (BNF)

Conditional

Then_part_list

Else_partThen_part_listif end[ ]

Then_part elseif }*{ Then_part

Then_part Boolean_expression then Compound

Else_part else Compound

=

=

=

=

Page 34: Lecture 11:  Describing the Syntax

34

BNF-E: Conditional

Conditional

Then_part_list

Else_partThen_part_listif end[ ]

Then_part Boolean_expression then Compound

Else_part else Compound

elseif }+{ Then_part ...

=

=

=

=

Page 35: Lecture 11:  Describing the Syntax

35

Recursive grammars

Constructs may be nested

Express this in BNF with recursive grammars

Recursion: circular dependency of productions

Page 36: Lecture 11:  Describing the Syntax

36

Conditionals can be nested within conditionals:

Recursive grammars

Else_part else Compound

Compound

Instruction

Instruction …}*{

...CallConditional Loop| | |

;

=

=

=

Page 37: Lecture 11:  Describing the Syntax

37

Production name can be used in its own definition

Definition of Then_part_list with repetition:

Recursive definition of Then_part_list:

Recursive grammars

Then_part_list …}*{ Then_part

Then_part_list Then_part elseif ][ Then_part_list

elseif=

=

Page 38: Lecture 11:  Describing the Syntax

38

Conditional

if a = b thena := a - 1b := b + 1

elseif a > b thena := a + 1

elseb := b + 1

end

Conditional

Then_part_list

Else_partThen_part_listif end[ ]

Then_part Boolean_expression then Compound

Else_part else Compound

elseif }+{ Then_part ...

====

Page 39: Lecture 11:  Describing the Syntax

39

BNF for simple arithmetic expressions

Expr Term { Add_op Term }*Term Factor { Mult_op Factor}*Factor Number | Variable | Nested Nested ( Expr )Add_op + | –Mult_op * | /

Assume Number is defined as a positive integer number, and Variable is a one-letter alphabetical word.

Is this a recursive grammar?How would the same grammar in BNF-E look like?Which of the following phrases are correct?aa + b-a + ba * 7 + b7 / (3 * 12) – 7(3 * 7)(5 + a ( 7 * b))

======

Page 40: Lecture 11:  Describing the Syntax

40

BNF for simple arithmetic expressions (Solution)

Expr {Term Add_op …}+

Term { Factor Mult_op …}+

Factor Number | Variable | Nested Nested ( Expr )Add_op + | –Mult_op * | /

Is this a recursive grammar? Yes (see Nested)How would the same grammar in BNF-E look like?(see yellow box below) Which of the following phrases are correct?a a + b -a + b -a * 7 + b 7 / (3 * 12) – 7 (3 * 7) (5 + a ( 7 * b)) -

======

Page 41: Lecture 11:  Describing the Syntax

41

Guidelines for GrammarsKeep productions short.

easier to readbetter assessment of language size

Conditional if Boolean_expression then Compound{ elseif Boolean_expression then Compound }*

[ else Compound ] end

=

Page 42: Lecture 11:  Describing the Syntax

42

Guidelines for GrammarsTreat lexical constructs like terminalsIdentifiersConstant values

Identifier Letter {Letter | Digit | "_”}*Integer_constant [-]{Digit}+

Floating_point [-] {Digit}* “." {Digit}+

Letter "A" | "B" | ... | "Z" | "a" | ... | "z"Digit "0" | "1" | ... | "9“

==

=

==

Page 43: Lecture 11:  Describing the Syntax

43

Guidelines for GrammarsUse unambiguous productions.Applicable production can be found by looking at one lexical element at a time

Conditional if Then_part_list [ Else_part ] end

Compound { Instruction }*

Instruction Conditional | Loop | Call | ...

=

==

Page 44: Lecture 11:  Describing the Syntax

44

One more exerciseDefine a recursive grammar in BNF-E for boolean expressions with the following description:Simple expressions limited to the variable identifiers x, y, or z, that contain operations of unary not, and binary and, or, and implies together with parentheses.

Valid phrases would include not x and not y (x or y implies z) y or (z)(x)

Page 45: Lecture 11:  Describing the Syntax

45

Solution binary expressionsB_expr ::= With_par | ExprWith_par ::= “(” Expr “)”Expr ::= Not_term | Bin_term | VariableBin_term ::= B_expr Bin_op B_exprBin_op ::= “implies”| “or” | “and” Not_term ::= “not” B_exprVariable ::= “x” | “y” | “z”

Note: Here we are using ::= instead of “x” to denote terminals=

Page 46: Lecture 11:  Describing the Syntax

46

Writing a ParserOne feature per Production

Concatenation:Sequence of feature calls for Nonterminals, checks for Terminals

Choice:Conditional with Compound per alternative

Repetition:Loop

Page 47: Lecture 11:  Describing the Syntax

47

Writing a Parser: EiffelParseAutomatic generation of abstract syntax tree for phrase

Based on BNF-E

One class per production

Classes inherit from predefined classes AGGREGATE, CHOICE, REPETITION, TERMINAL

Feature production defines Production

Page 48: Lecture 11:  Describing the Syntax

48

Writing a Parser: ToolsYooc:

Translates BNF-E to EiffelParse classes

Yacc / Bison:

Translates BNF to C parser

Page 49: Lecture 11:  Describing the Syntax

49

BNF alike syntax descriptionsDTD: Description of XML documents

<!ELEMENT collection (recipe*)><!ELEMENT recipe (title, ingredient*,preparation)><!ELEMENT title (#PCDATA)><!ELEMENT ingredient (ingredient*,preparation)?><!ATTLIST ingredient name CDATA #REQUIRED amount CDATA #IMPLIED unit CDATA #IMPLIED><!ELEMENT preparation (step*)><!ELEMENT step (#PCDATA)>

Page 50: Lecture 11:  Describing the Syntax

50

BNF alike syntax descriptionsUnix/Linux: Synopsis of commandsSYNOPSISman [-acdfFhkKtwW] [--path] [-m system] [-p string] [-C config_file] [-M pathlist] [-P pager] [-S section_list] [section] name ...

Page 51: Lecture 11:  Describing the Syntax

51

Eiffel syntax

http://se.ethz.ch/teaching/2007-F/eprog-0001/slides/eiffel_syntax.pdf

http://www.gobosoft.com/eiffel/syntax/

Page 52: Lecture 11:  Describing the Syntax

52

What we have seen

A way to describe syntax: BNF

Three variants: BNF, BNF-E, graphical

A glimpse into recursion

Page 53: Lecture 11:  Describing the Syntax

53

Exercise

Describe BNF-E with the help of BNF-E. Assume that the lexical constructs Keyword and Symbol (for Terminals) and Identifier (for Nonterminals) are given:

Terminal ::= Keyword | Symbol Nonterminal ::= Identifier