s01 intro ml

Upload: ashok-k

Post on 30-May-2018

  • 8/14/2019 S01 Intro ML

    1/35

    Cse321, Programming Languages and Compilers

    108/21/09

    Lecture #1, Jan. 9, 2007

    Course Mechanics

    Text Book

    Downloading SML

    Syllabus - Course Overview

    Entrance Exam

    Standard ML

    This week's assignment

    Top to bottom example

    Lexical issues

    Parsing and syntax issues

    Translation issues

    Acknowledgements

    The material taught in this course was made possible by many people. Here is a partial list:

    Andrew Tolmach

    Nathan Linger

    Harry Porter

    Jinke Lee

    Class Web Page

    The CS321 class web page can be found at: www.cs.pdx.edu/~sheard/course/Cs321

    Contents of the page:

    Course Syllabus

    Link to the ML home page

    Copies of the PowerPoint slides used in lectures

    Copies of the assignments

    Project Description

    Copies of the SML code illustrated in the lectures

    The web page will be updated after each lecture.

    Today's Assignments

    Reading: Engineering a Compiler

    Available in the PSU bookstore

    Chapter 1, pp 1-26

    There will be a 5 minute quiz on the reading Wednesday.

    Search: Find the class webpage

    1 page programming assignment: Due Wednesday, Jan 10, 2007. In Just 2 Days!!

    Log in to some SML system. See how the system operates. Type in solutions (in a file) to the programming problems (In Class exercises 1 and 2 in this handout), load them into SML. Get them running, print them out, and turn them in on Wednesday. What matters here is that you try out the SML system, not that you get them perfect.

    Course Information

    CS321 - Languages and Compiler Design

    Time: Monday & Wednesday 18:00-19:50

    Place: PCAT 138

    Instructor: Tim Sheard

    office: room 115, CS Dept, 4th Ave Building, Portland State Univ.

    phone: 503-725-2410 (work) 503-649-7242 (home)

    office hours: Before class in my office (5:00-5:50), or by Appt.

    Assignments:

    Reading from text and handouts (quizzes on reading)

    Daily, 1 page programming assignments

    3 part programming project

    Grading:

    Midterm exam (25%)

    3 parts of project (30%)

    Daily 1 page assignments and quizzes (15%)

    Final exam (30%)

    Examinations

    Entrance Exam.

    Do you know your REs and CFGs?

    Quizzes on Reading Material. There is a possible quiz on every reading assignment

    There will be a quiz on Wednesday!

    Mid Term exam Wed. Feb 14, 2007. Time: in class.

    Final exam Monday, Mar. 19, 2007. Time: 6:00-7:50.

    Text Book

    Text: Engineering a Compiler

    Keith D. Cooper and Linda Torczon

    Other Reference Materials

    Auxiliary Material

    Elements of Functional Programming (SML book)

    by Chris Reade, Addison Wesley, ISBN 0-201-12915-9

    Using the SML/NJ System: http://www.cs.cmu.edu/~petel/smlguide/smlnj.htm

    Class Handouts

    Each class, a copy of that day's slides will be available as a handout.

    I will post files that contain the example programs used in each lecture on the class web page: www.cs.pdx.edu/~sheard/course/Cs321

    I will post assignments there as well.

    http://www.cs.pdx.edu/~sheard/course/funprog

    Labs

    Whenever you learn a new language it's great to have someone looking over your shoulder.

    In this spirit I have scheduled some lab times where people can work on learning ML while I am there to help.

    FAB INTEL Lab (FAB 55-17) downstairs by the Engineering and Technology Management departmental offices

    Friday Jan. 12, 2007. 4:00-5:30 PM

    Tuesday Jan. 16, 2007. 4:00-5:30 PM

    Friday Jan. 19, 2007. 4:00-5:30 PM

    Labs are not required, but attending at least one is highly recommended!

    Installing SML

    Software can be obtained at: http://www.smlnj.org/

    I am using the most recent version 110.60 but it displays the version 110.57 when it runs

    Browse the Documentation and Literature section of the SML web page. Find some resources that you can use.

    SML also runs on the PSU Linux and Intel labs.

    Linux:

    usepkg sml

    then log out, or start a new shell

    type: sml

    Intel: in a command window

    p:\programs\smlnj\addpkg.cmd

    then log out, or start a new command window

    then just type:

    N:\>sml

    Entrance Exam

    CS321 has some pretty serious prerequisites.

    1. Write a regular expression for the set of strings that begin with an a, which is followed by an arbitrary number of b's or c's, and end with a d.

    e.g. ad, abbbd, abcbcbcd, etc.

    2. Transform your regular expression into a DFA.

    3. Write a context free grammar that recognizes the same set of strings as your RE.

    4. Transform your CFG into a CFG that is left-recursion free.

    Academic Integrity

    Students are expected to be honest in their academic dealings. Dishonesty is dealt with severely.

    Homework. Pass in only your own work.

    Program assignments. Program independently.

    Examinations. Notes and such, only as each instructor allows.

    OK to discuss how to solve problems with other students, but each student should write up, debug, and turn in his own solution.

    Course Thesis

    This course is about programming languages. We study languages in two ways:

    From the perspective of the user

    From the perspective of the implementer (compiler writer)

    We will learn about some languages you may never have heard of. We will learn to program in one of them (Standard ML). It's good to learn a new language in depth.

    This course is also about programming. There will be extensive programming assignments in SML. If you don't do them, you won't learn. You're deluding yourself if you think you can learn the material without doing the exercises!

    We will write a compiler for a Java subset. It's good to understand the implementation details of a language you already know.

    This course is all about programming

    What makes a good program?

    Write at least 3 things on a piece of paper.

    Standard ML

    In this course we will use an implementation of the language Standard ML.

    The SML/NJ Homepage has lots of useful information: http://www.smlnj.org/

    You can get a version to install on your own machine there.

    I will use version 110.57 or 110.60 of SML. Earlier versions will probably work as well. I don't foresee any problems with other versions, but if you want to use the identical version that I use in class then this is the one.

    Characteristics of SML

    Applicative style

    input output description of problem.

    First class functions

    pass as parameters

    return as value of a function

    store in data-structures

    Less importantly:

    Automatic memory management (G.C., no new or malloc)

    Use of a strong type system which uses type inference, i.e. no declarations but still strongly typed.
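
    The three uses of first class functions listed above can be sketched in a few lines. This is only an illustration; the names twice, addN, steps, and applyAll are made up for this example and do not come from the lecture code. No type declarations appear, yet SML infers a type for every definition.

```sml
(* Pass a function as a parameter: apply f twice to x. *)
fun twice f x = f (f x);

(* Return a function as the value of a function. *)
fun addN n = fn x => x + n;

(* Store functions in a data structure (a list). *)
val steps = [addN 1, twice (addN 10), fn x => x * 2];

(* Apply each stored function in turn, left to right. *)
fun applyAll [] x = x
  | applyAll (f::fs) x = applyAll fs (f x);

val result = applyAll steps 0;   (* ((0 + 1) + 20) * 2 = 42 *)
```

    Loading this into SML reports inferred types such as twice : ('a -> 'a) -> 'a -> 'a, which shows type inference at work without any annotations.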

    Syntactic Elements

    Identifiers start with a letter followed by digits or other letters or primes or underscores.

    Valid Examples: a a3 ab aF

    Invalid Examples: 12A

    Identifiers can also be constructed with a sequence of operators like: !@#$%^&*+~

    Reserved words include

    fun val datatype if then else

    if of let in end type
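
    A small sketch of both kinds of identifier. The names a', a_F, and +++ are hypothetical, chosen only to show a prime, an underscore, and a symbolic identifier; declaring +++ infix is an extra assumption made so it can be written between its arguments.

```sml
(* Alphanumeric identifiers: a letter, then letters, digits, primes, or underscores. *)
val a = 1;
val a3 = 2;
val a' = 3;
val a_F = 4;

(* A symbolic identifier built from operator characters, declared infix
   (left associative, precedence 6) for this example. *)
infix 6 +++;
fun x +++ y = x + y + 1;

val total = a +++ a3 +++ a' +++ a_F;   (* ((1 +++ 2) +++ 3) +++ 4 = 13 *)
```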

    Interacting

    The normal style for interaction is to start SML, and then type definitions into the window.

    Types of commands

    4 + 5;

    val x = 34;

    fun f x = x + 1;

    Here are two commands you might find useful.

    val pwd = OS.FileSys.getDir;

    val cd = OS.FileSys.chDir;

    To load a file that has an SML program, type

    use "file.sml";

    The SML Read-Typecheck-Eval-Print Loop

    Standard ML of New Jersey v110.57 [built: Mon Nov 21 21:46:28 2005]

    - 3+5;

    val it = 8 : int

    - print "Hi there\n";

    Hi there

    val it = () : unit

    - val x = 22;

    val x = 22 : int

    - x+ 5;

    val it = 27 : int

    - val pwd = OS.FileSys.getDir;

    val pwd = fn : unit -> string

    - val cd = OS.FileSys.chDir;

    val cd = fn : string -> unit

    -

    Note the semicolon when you're ready to evaluate. Otherwise commands can spread across several lines.

    In Class Exercise 1

    Define prefix and lastone in terms of head, tail, and reverse.

    First make a file S01code.sml

    Start sml

    Change directory to where the file resides

    Load the file ( use "S01code.html" )

    Test the function

    fun lastone x = hd (rev x)

    fun prefix x = rev (tl (rev x))

    Standard ML of New Jersey v110.57

    - val cd = OS.FileSys.chDir;

    val cd = fn : string -> unit

    - cd "D:/work/sheard/courses/PsuCs321/web/notes";

    - use "S01code.html";

    [opening S01code.html]

    val lastone = fn : 'a list -> 'a

    val prefix = fn : 'a list -> 'a list

    val it = () : unit

    val it = () : unit

    - lastone [1,2,3,4];

    val it = 4 : int
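
    A few more checks one might run on the two functions, offered as a sketch rather than as part of the assignment. The names l, p, and safe are made up here. The last line relies on the fact that hd and tl raise the exception Empty on an empty list, so both functions are partial.

```sml
fun lastone x = hd (rev x);
fun prefix x = rev (tl (rev x));

val l = lastone [1,2,3,4];        (* 4 *)
val p = prefix [1,2,3,4];         (* [1,2,3] *)

(* On the empty list, rev returns [] and hd raises Empty;
   here the exception is caught and replaced by ~1. *)
val safe = (lastone ([] : int list)) handle Empty => ~1;
```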

    In Class Exercise 2

    Define map and filter functions

    mymap f [1,2,3] = [f 1, f 2, f 3]

    filter even [1,2,3,4,5] = [2,4]

    fun mymap f [] = []

      | mymap f (x::xs) = (f x)::(mymap f xs);

    fun filter p [] = []

      | filter p (x::xs) =

          if (p x) then x::(filter p xs) else (filter p xs);

    Sample Session

    - mymap plusone [2,3,4]

    [3, 4, 5]

    - filter even [1,2,3,4,5,6]

    [2, 4, 6]
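
    The sample session uses plusone and even without showing their definitions. The sketch below supplies plausible versions (an assumption; the slide does not give them) so the session can be reproduced from a single file.

```sml
fun mymap f [] = []
  | mymap f (x::xs) = (f x)::(mymap f xs);

fun filter p [] = []
  | filter p (x::xs) =
      if (p x) then x::(filter p xs) else (filter p xs);

(* Assumed helper definitions; the slide does not show them. *)
fun plusone x = x + 1;
fun even n = n mod 2 = 0;

val r1 = mymap plusone [2,3,4];        (* [3,4,5] *)
val r2 = filter even [1,2,3,4,5,6];    (* [2,4,6] *)
```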

    Course topics

    Programming Languages

    Types of languages

    Data types and languages

    Types and languages

    Compilers

    Lexical analysis

    Parsing

    Translation to abstract syntax using modern parser generator technology.

    Type checking

    Identifiers and symbol table organization

    Next quarter, in the second class of the sequence

    Intermediate representations

    Backend analysis

    Transformations and optimizations for a number of different kinds of languages

    Multi Pass Compilers

    Passes

    text

    tokens

    syntax trees

    intermediate forms (three address code, CPS code, etc)

    assembly code

    machine code

    Each phase is from one form to another, OR from one form to the same form, which is often called a source to source transformation.

    The Top to Bottom Example

    text:

    z = x + pi * 12.0

    tokens:

    id(z) eql id(x) plus id(pi) times float(12.0)

    syntax tree:

              =
             / \
        Id(z)   +
               / \
          Id(x)   *
                 / \
           Id(pi)   float(12.0)
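
    One way the tokens and the syntax tree for z = x + pi * 12.0 might be written down as SML data. This is a sketch under assumptions: the datatype and constructor names (token, exp, stmt, Var, Num, Add, Mul, Assign) and the helper vars are invented for illustration and are not the representations used later in the course.

```sml
(* Tokens: some carry an attribute, such as an identifier's name. *)
datatype token = Id of string | Eql | Plus | Times | Float of real;

val tokens = [Id "z", Eql, Id "x", Plus, Id "pi", Times, Float 12.0];

(* Syntax trees for expressions and assignment statements. *)
datatype exp
  = Var of string
  | Num of real
  | Add of exp * exp
  | Mul of exp * exp;

datatype stmt = Assign of string * exp;

(* Multiplication sits below addition because * binds tighter than +. *)
val tree = Assign ("z", Add (Var "x", Mul (Var "pi", Num 12.0)));

(* Collect the variables an expression mentions, left to right. *)
fun vars (Var v) = [v]
  | vars (Num _) = []
  | vars (Add (a, b)) = vars a @ vars b
  | vars (Mul (a, b)) = vars a @ vars b;

val Assign (target, rhs) = tree;
val used = vars rhs;    (* ["x", "pi"] *)
```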

    Passes (cont)

    Three address code:

    temp1 := pi * 12.0

    z := x + temp1

    Assembly level code:

    ld r1,pi

    ldi r2,12.0

    mul r1,r2

    ld r2,x

    add r1,r2

    st r1,z
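
    A minimal sketch of how three address code for the assignment z = x + pi * 12.0 could be produced from a syntax tree. Everything here is an assumption made for illustration: the exp datatype, keeping constants as strings (Const "12.0"), the temporary naming scheme, and the functions fresh, emit, gen, and assign. It is not the course's code generator.

```sml
datatype exp
  = Var of string
  | Const of string            (* constant kept as its source text *)
  | Add of exp * exp
  | Mul of exp * exp;

(* Supply of fresh temporary names: temp1, temp2, ... *)
val counter = ref 0;
fun fresh () = (counter := !counter + 1; "temp" ^ Int.toString (!counter));

(* emit dest opr a b: code that computes a and b, then "dest := a opr b".
   gen e: (instructions, name of the operand holding e's value). *)
fun emit dest opr a b =
  let
    val (ca, ra) = gen a
    val (cb, rb) = gen b
  in
    (ca @ cb @ [dest ^ " := " ^ ra ^ " " ^ opr ^ " " ^ rb], dest)
  end
and gen (Var v) = ([], v)
  | gen (Const c) = ([], c)
  | gen (Add (a, b)) = emit (fresh ()) "+" a b
  | gen (Mul (a, b)) = emit (fresh ()) "*" a b;

(* The root operation writes straight into the assignment target. *)
fun assign (z, Add (a, b)) = #1 (emit z "+" a b)
  | assign (z, Mul (a, b)) = #1 (emit z "*" a b)
  | assign (z, e) = [z ^ " := " ^ #2 (gen e)];

val code = assign ("z", Add (Var "x", Mul (Var "pi", Const "12.0")));
(* ["temp1 := pi * 12.0", "z := x + temp1"] *)
```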

    Lexical Analysis

    Produces tokens and deals with:

    white space

    comments

    reserved word identification

    symbol table interface

    Tokens are the terminals of grammars.

    Lexical analysis reads the whole program, character by character, thus it needs to be efficient. This implies fancy buffering techniques etc. Modern lexical generators handle these problems so we will ignore them.

    Tokens, Patterns & Lexemes

    Many strings from the input may produce the same TOKEN, e.g. identifiers, integer constants, floats.

    A PATTERN is a rule which describes which strings are assigned to a token.

    A LEXEME is the exact sequence of input characters matched by a PATTERN.

    Examples

    lexeme    pattern    token

    x         *          Id "x"

    abc       *          Id "abc"

    152       +          Constant(152)

    then      then       ThenKeyword

    Many lexemes map to the same token, e.g. x and abc.

    Note, some lexemes might match many patterns, e.g. "then" above. Need to resolve ambiguity.

    Since tokens are terminals, they must be "produced" by the lexical phase with synthesized attributes in place (e.g. name of an identifier), e.g. id(x) and constant(152).
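
    A hand-written lexer sketch that shows lexemes, tokens with attributes, and one common way to resolve the "then" ambiguity: scan the longest identifier-shaped lexeme, then check it against the reserved words. The token constructors Plus, Times, and Eql and the functions classify, span, and lex are assumptions for this example; the course itself relies on lexer generators rather than code like this.

```sml
datatype token
  = Id of string
  | Constant of int
  | ThenKeyword
  | Plus
  | Times
  | Eql;

(* "then" matches the identifier pattern too; reserved words win. *)
fun classify "then" = ThenKeyword
  | classify s = Id s;

(* Split off the longest prefix of a character list satisfying p. *)
fun span p [] = ([], [])
  | span p (c::cs) =
      if p c
      then let val (xs, rest) = span p cs in (c::xs, rest) end
      else ([], c::cs);

fun lex [] = []
  | lex (c::cs) =
      if Char.isSpace c then lex cs                      (* skip white space *)
      else if Char.isAlpha c then
        let val (lexeme, rest) = span Char.isAlphaNum (c::cs)
        in classify (implode lexeme) :: lex rest end
      else if Char.isDigit c then
        let val (lexeme, rest) = span Char.isDigit (c::cs)
        in Constant (valOf (Int.fromString (implode lexeme))) :: lex rest end
      else case c of
             #"+" => Plus :: lex cs
           | #"*" => Times :: lex cs
           | #"=" => Eql :: lex cs
           | _ => raise Fail ("unexpected character " ^ str c);

val toks = lex (explode "abc = x + 152 * then");
(* [Id "abc", Eql, Id "x", Plus, Constant 152, Times, ThenKeyword] *)
```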

    Syntax, Parse Trees & Grammars

    Syntax (the physical layout of the program)

    Grammars describe precisely the syntax of a language. Two kinds of grammars which compiler writers use a lot are: regular, and context free.

    Informal Definitions of:

    Regular: concatenation, union, star

    Context Free: only one symbol on the lhs of a production

    Example Grammar

    Sentence ::= Subject Verb Object

    Subject ::= Proper-noun

    Object ::= Article Adjective Noun

    Verb ::= ate | saw | called

    Noun ::= cat | ball | dish

    Article ::= the | a

    Adjective ::= big | bad | pretty

    Proper-noun ::= tim | mary

    Start Symbol = Sentence

    Example sentence: tim ate the big ball

    Recursive Grammar Examples

    Recursive Grammars describe infinite languages

    list ::= [ num morenum ]

    morenum ::= , num morenum

    |

    derives [2], [2,4], [2,4,6] ...

    Exp ::= id

    | Exp + Exp

    | Exp * Exp

    | ( Exp )

    derives x, x+x
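
    The Exp grammar maps naturally onto a recursive SML datatype, with one constructor per production. The sketch below is illustrative only; the names exp, Id, Plus, Times, Paren, show, and grow are assumptions. Because grow can be called with any n, it produces ever longer strings of the language, which is the sense in which a recursive grammar describes an infinite language.

```sml
(* One constructor per production of Exp. *)
datatype exp
  = Id of string
  | Plus of exp * exp
  | Times of exp * exp
  | Paren of exp;

(* Write a tree back out as a string of the language. *)
fun show (Id x) = x
  | show (Plus (a, b)) = show a ^ "+" ^ show b
  | show (Times (a, b)) = show a ^ "*" ^ show b
  | show (Paren e) = "(" ^ show e ^ ")";

(* grow n builds x, x+x, x+x+x, ... with n plus signs. *)
fun grow 0 = Id "x"
  | grow n = Plus (grow (n - 1), Id "x");

val examples = map (show o grow) [0, 1, 2];   (* ["x", "x+x", "x+x+x"] *)
```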

    Parse Trees

    Each nonterminal on the lhs of a production "roots" a tree:

          Exp
         / | \
      Exp  +  Exp
       |       |
      Id      Id

    Each node in a tree with all its immediate children is derived from a single production of the grammar.

    We desire a program which constructs a parse tree from a string. Such programs are different for every grammar; we sometimes use tools to construct such programs (yacc).

    Syntax Directed Translations

    A syntax directed translation traverses a syntax tree and builds a translation in the process.

    Considerations

    Tree traversal orders

    Left to right?

    Right to left?

    In-order, pre-order, or post-order

    Where does the information about what to do in the traversal come from?

    Attribute grammars

    Inherited attributes

    Synthesized attributes

    Example Translation Process

    Translation as an abstract syntax to abstract syntax transformer

    We represent this as a grammar with actions { ... }. The action is performed when that production is reduced.

    Exp ::= Term terms

    terms ::= + Term { print "+" } terms

    |

    Term ::= Factor factors

    factors ::= * Factor { print "*" } factors

    |

    Factor ::= id { print id.name }
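
    The grammar with actions can be sketched as a recursive descent translator, one function per nonterminal. Two assumptions are made for this sketch: the actions append to an output list instead of calling print, and the token type and the function names (factor, factors, term, terms, expr, translate) are invented here. Each action runs at the point where it appears in its production, so the result is the postfix form of the input.

```sml
datatype token = Id of string | Plus | Times;

(* Each function consumes the tokens of one nonterminal and returns
   (remaining tokens, output emitted so far). *)

(* Factor ::= id { print id.name } *)
fun factor (Id name :: ts, out) = (ts, out @ [name])
  | factor _ = raise Fail "expected an identifier";

(* factors ::= * Factor { print "*" } factors | empty *)
fun factors (Times :: ts, out) =
      let val (ts', out') = factor (ts, out)
      in factors (ts', out' @ ["*"]) end
  | factors state = state;

(* Term ::= Factor factors *)
fun term state = factors (factor state);

(* terms ::= + Term { print "+" } terms | empty *)
fun terms (Plus :: ts, out) =
      let val (ts', out') = term (ts, out)
      in terms (ts', out' @ ["+"]) end
  | terms state = state;

(* Exp ::= Term terms *)
fun expr state = terms (term state);

fun translate ts =
  case expr (ts, []) of
      ([], out) => String.concatWith " " out
    | _ => raise Fail "unexpected trailing tokens";

val postfix = translate [Id "x", Plus, Id "pi", Times, Id "y"];
(* "x pi y * +" *)
```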

    Semantics

    How do we know what to translate the syntax tree into?

    How do we know if it is correct?

    Semantics

    denotational semantics

    operational semantics

    interpreters

    Very useful in writing compilers since they give a reference when trying to decide what the compiler should do in particular cases.
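
    A tiny interpreter can serve as such a reference: it says what an expression means, so compiled code can be checked against it. The sketch below is an illustration under assumptions; the exp datatype, the list-based environment, and the names lookup and eval are made up, and the bindings for x and pi are arbitrary sample values (not the real value of pi).

```sml
datatype exp
  = Var of string
  | Num of real
  | Add of exp * exp
  | Mul of exp * exp;

(* An environment maps variable names to values. *)
type env = (string * real) list;

fun lookup (x, []) = raise Fail ("unbound variable " ^ x)
  | lookup (x, (y, v) :: rest) = if x = y then v else lookup (x, rest);

(* The meaning of an expression is the number it evaluates to. *)
fun eval (env : env) (Var x) = lookup (x, env)
  | eval env (Num r) = r
  | eval env (Add (a, b)) = eval env a + eval env b
  | eval env (Mul (a, b)) = eval env a * eval env b;

(* x + pi * 12.0 with sample bindings x = 1.0 and pi = 3.0 *)
val z = eval [("x", 1.0), ("pi", 3.0)] (Add (Var "x", Mul (Var "pi", Num 12.0)));
(* 1.0 + 3.0 * 12.0 = 37.0 *)
```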

    Overview

    Compilation is a large process

    It is often broken into stages

    The theories of computer science guide us in writing programs at each stage.

    We must understand what a program means if we are to translate it correctly.

    Many phases of the compiler try to optimize by translating one form into a better (more efficient?) form.

    Most of compiling is about pattern matching; languages and tools that support pattern matching are very useful.