Lecture 5: Syntax Analysis


  • COMP 144 Programming Language Concepts, Lecture 5: Syntax Analysis

    January 18, 2002

    Felix Hernandez-Campos


    Lecture 5: Syntax Analysis

    COMP 144 Programming Language Concepts, Spring 2002

    Felix Hernandez-Campos

    Jan 18

    The University of North Carolina at Chapel Hill


    Review: Compilation/Interpretation

    [Figure: "Compiler or Interpreter" diagram with the labels Source Code, Translation, Target Code, Execution, and Interpretation]


    Review: Syntax Analysis

    [Figure: "Compiler or Interpreter" diagram with the labels Source Code, Translation, Execution, and Target Code]

    • Specifying the form of a programming language
      – Tokens
        » Regular expressions
      – Syntax
        » Context-free grammars


    Phases of Compilation


    Syntax Analysis

    • Syntax:
      – Webster’s definition: 1 a : the way in which linguistic elements (as words) are put together to form constituents (as phrases or clauses)
    • The syntax of a programming language
      – Describes its form
        » Organization of tokens (elements)
        » Context Free Grammars (CFGs)
      – Must be recognizable by compilers and interpreters
        » Parsing
        » LL and LR parsers


    Context Free Grammars

    • CFGs
      – Add recursion to regular expressions
        » Nested constructions
      – Notation

        expression → identifier | number | - expression | ( expression ) | expression operator expression

        operator → + | - | * | /

        » Terminal symbols
        » Non-terminal symbols
        » Production rule (i.e. substitution rule)

          non-terminal symbol → terminal and non-terminal symbols
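
    The notation above can be written down as plain data. The following is a minimal sketch, not part of the lecture: the names GRAMMAR, apply_leftmost, and derive are hypothetical, and the grammar is the expression grammar from this slide. Each production replaces one non-terminal with a sequence of terminals and non-terminals, and a derivation is a sequence of such replacements.

```python
# Hypothetical encoding of the slide's expression grammar: each non-terminal
# maps to a list of alternatives, and each alternative is a list of symbols.
GRAMMAR = {
    "expression": [
        ["identifier"],
        ["number"],
        ["-", "expression"],
        ["(", "expression", ")"],
        ["expression", "operator", "expression"],
    ],
    "operator": [["+"], ["-"], ["*"], ["/"]],
}

# Non-terminals are the left-hand sides; every other symbol is a terminal.
NON_TERMINALS = set(GRAMMAR)
TERMINALS = {
    symbol
    for alternatives in GRAMMAR.values()
    for alternative in alternatives
    for symbol in alternative
    if symbol not in NON_TERMINALS
}


def apply_leftmost(sentential_form, alternative_index):
    """Replace the leftmost non-terminal using the chosen alternative."""
    for position, symbol in enumerate(sentential_form):
        if symbol in NON_TERMINALS:
            replacement = GRAMMAR[symbol][alternative_index]
            return (
                sentential_form[:position]
                + replacement
                + sentential_form[position + 1:]
            )
    raise ValueError("no non-terminal left to expand")


def derive(start, choices):
    """Apply a sequence of alternative indices and return every step."""
    steps = [[start]]
    for choice in choices:
        steps.append(apply_leftmost(steps[-1], choice))
    return steps


# Leftmost derivation of: identifier * ( number )
steps = derive("expression", [4, 0, 2, 3, 1])
for step in steps:
    print(" ".join(step))
```

    Each printed line is one sentential form; the last line contains only terminal symbols.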


    Parsers

    • Scanners
      – Task: recognize language tokens
      – Implementation: Deterministic Finite Automaton
        » Transition based on the next character
    • Parsers
      – Task: recognize language syntax (organization of tokens)
      – Implementation:
        » Top-down parsing
        » Bottom-up parsing
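
    To make the scanner half of this slide concrete, here is a small table-driven DFA sketch. It is an illustration under assumptions, not the course's scanner: the state names, the TRANSITIONS table, and the three token kinds (id, comma, semicolon) are all hypothetical. The automaton moves from state to state based only on the next character.

```python
# Hypothetical DFA states for three token kinds: id, comma, semicolon.
START, IN_ID, DONE_COMMA, DONE_SEMI = "start", "in_id", "comma", "semicolon"

# Accepting states and the token kind each one produces.
ACCEPTING = {IN_ID: "id", DONE_COMMA: ",", DONE_SEMI: ";"}


def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return ch


# (current state, character class) -> next state
TRANSITIONS = {
    (START, "letter"): IN_ID,
    (IN_ID, "letter"): IN_ID,
    (IN_ID, "digit"): IN_ID,
    (START, ","): DONE_COMMA,
    (START, ";"): DONE_SEMI,
}


def scan(text):
    """Split text into (kind, lexeme) tokens by running the DFA repeatedly."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace separates tokens and is discarded
            continue
        state, start = START, pos
        # Follow transitions for as long as the next character allows one.
        while pos < len(text) and (state, char_class(text[pos])) in TRANSITIONS:
            state = TRANSITIONS[(state, char_class(text[pos]))]
            pos += 1
        if state not in ACCEPTING:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append((ACCEPTING[state], text[start:pos]))
    return tokens


print(scan("A, B, C;"))
```

    The parser then works on the token kinds (id, comma, semicolon) rather than on individual characters.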


    Parse Trees

    • A parse tree is a graphical representation of a derivation
    • Example
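
    Since the slide's figure is not preserved here, the following sketch shows one possible parse tree in code. It assumes the expression grammar from the Context Free Grammars slide; the Node class and the helper names are hypothetical. Interior nodes are non-terminals, leaves are terminals, and reading the leaves left to right gives back the derived string.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One parse tree node: a grammar symbol plus its children, if any."""
    symbol: str
    children: list = field(default_factory=list)


def leaves(node):
    """Return the terminal symbols at the leaves, left to right."""
    if not node.children:
        return [node.symbol]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def render(node, depth=0):
    """Return the tree as indented lines, one symbol per line."""
    lines = ["  " * depth + node.symbol]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Parse tree for "identifier + number" using the production
# expression -> expression operator expression at the root.
tree = Node("expression", [
    Node("expression", [Node("identifier")]),
    Node("operator", [Node("+")]),
    Node("expression", [Node("number")]),
])

print("\n".join(render(tree)))
print(" ".join(leaves(tree)))
```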


    Parsing example

    • Example: comma-separated list of identifiers
      – CFG

        id_list → id id_list_tail
        id_list_tail → , id id_list_tail
        id_list_tail → ;

      – Parsing

        A, B, C;


    Top-down derivation of A, B, C;

    [Figure: panel labeled "CFG"]

    Left-to-right, left-most derivation

    LL(1) parsing
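
    The derivation steps from the slide's figure can be reproduced with a small predictive parser. This is a sketch under assumptions: the TABLE layout and the function name are hypothetical, the input is assumed to be already scanned into the tokens id , id , id ; and the tail production is assumed to read id_list_tail → , id id_list_tail. The parser reads tokens left to right and always expands the leftmost non-terminal, choosing the production from a single token of lookahead.

```python
# Hypothetical LL(1) table: (non-terminal, lookahead token) -> right-hand side.
TABLE = {
    ("id_list", "id"): ["id", "id_list_tail"],
    ("id_list_tail", ","): [",", "id", "id_list_tail"],
    ("id_list_tail", ";"): [";"],
}
NON_TERMINALS = {"id_list", "id_list_tail"}


def ll1_derivation(tokens):
    """Return the sentential forms of the leftmost derivation of tokens."""
    stack = ["id_list"]      # symbols still to be matched, top at index 0
    matched = []             # terminals already consumed from the input
    forms = [["id_list"]]
    pos = 0
    while stack:
        top = stack.pop(0)
        lookahead = tokens[pos] if pos < len(tokens) else "$end"
        if top in NON_TERMINALS:
            # Predict: one token of lookahead selects the production.
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"cannot expand {top} on {lookahead!r}")
            stack = rhs + stack
            forms.append(matched + stack)
        elif top == lookahead:
            # Match: the predicted terminal agrees with the input.
            matched.append(top)
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    if pos != len(tokens):
        raise SyntaxError("input continues after a complete id_list")
    return forms


# Token kinds for the input  A, B, C;
for form in ll1_derivation(["id", ",", "id", ",", "id", ";"]):
    print(" ".join(form))
```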


    Top-down derivation of A, B, C;

    [Figure: panel labeled "CFG"]


    Bottom-up parsing of A, B, C;

    [Figure: panel labeled "CFG"]

    Left-to-right, right-most derivation

    LR(1) parsing
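
    The bottom-up strategy can be sketched as a shift-reduce loop. This is deliberately naive and is not a table-driven LR(1) parser: it reduces whenever the top of the stack matches a right-hand side, trying longer right-hand sides first, which happens to be enough for this particular grammar. The names PRODUCTIONS, try_reduce, and shift_reduce are hypothetical, and the tail production is assumed to read id_list_tail → , id id_list_tail.

```python
# Productions as (left-hand side, right-hand side), longest right side first
# so that ", id id_list_tail" is preferred over "id id_list_tail".
PRODUCTIONS = [
    ("id_list_tail", [",", "id", "id_list_tail"]),
    ("id_list", ["id", "id_list_tail"]),
    ("id_list_tail", [";"]),
]


def try_reduce(stack):
    """Reduce the top of the stack in place; return the action or None."""
    for lhs, rhs in PRODUCTIONS:
        if stack[-len(rhs):] == rhs:
            del stack[-len(rhs):]
            stack.append(lhs)
            return f"reduce {lhs} -> {' '.join(rhs)}"
    return None


def shift_reduce(tokens):
    """Return the list of shift and reduce actions that accept tokens."""
    stack, actions = [], []
    for token in tokens:
        stack.append(token)
        actions.append(f"shift {token}")
        # After each shift, reduce for as long as a right-hand side matches.
        while True:
            action = try_reduce(stack)
            if action is None:
                break
            actions.append(action)
    if stack != ["id_list"]:
        raise SyntaxError(f"input rejected, stack is {stack}")
    return actions


# Token kinds for the input  A, B, C;
for action in shift_reduce(["id", ",", "id", ",", "id", ";"]):
    print(action)
```

    Read from last to first, the reductions form a right-most derivation: all tokens are shifted before the first reduction, and the id_list node is built last.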


    Bottom-up parsing of A, B, C;

    [Figure: panel labeled "CFG"]


    Bottom-up parsing of A, B, C;

    [Figure: panel labeled "CFG"]


    Parsing

    • Parsing an arbitrary Context Free Grammar
      – O(n^3)
      – Too slow for large programs
    • Linear-time parsing
      – LL parsers
        » Recognize LL grammars
        » Use a top-down strategy
      – LR parsers
        » Recognize LR grammars
        » Use a bottom-up strategy
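
    The slide does not name an algorithm for the cubic bound. One well-known cubic-time method is CYK, sketched below as an illustration only. It assumes the grammar has been rewritten in Chomsky normal form; the non-terminal names LIST, TAIL, REST, ID, and COMMA are hypothetical and encode the comma-separated identifier list used in this lecture. The three nested loops over substring length, start position, and split point are where the n^3 comes from.

```python
def cyk_accepts(tokens, unit_rules, pair_rules, start):
    """CYK recognizer for a grammar in Chomsky normal form.

    unit_rules: list of (lhs, terminal)
    pair_rules: list of (lhs, (left_non_terminal, right_non_terminal))
    """
    n = len(tokens)
    if n == 0:
        return False
    # table[length - 1][i] holds the non-terminals that derive
    # tokens[i : i + length].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, token in enumerate(tokens):
        table[0][i] = {lhs for lhs, terminal in unit_rules if terminal == token}
    for length in range(2, n + 1):              # substring length
        for i in range(n - length + 1):         # substring start
            for split in range(1, length):      # size of the left part
                left = table[split - 1][i]
                right = table[length - split - 1][i + split]
                for lhs, (b, c) in pair_rules:
                    if b in left and c in right:
                        table[length - 1][i].add(lhs)
    return start in table[n - 1][0]


# Hypothetical Chomsky-normal-form version of the id_list grammar.
UNIT_RULES = [("ID", "id"), ("COMMA", ","), ("TAIL", ";")]
PAIR_RULES = [
    ("LIST", ("ID", "TAIL")),
    ("TAIL", ("COMMA", "REST")),
    ("REST", ("ID", "TAIL")),
]

print(cyk_accepts(["id", ",", "id", ";"], UNIT_RULES, PAIR_RULES, "LIST"))
```

    LL and LR parsers avoid this cost by restricting the grammar so that each decision can be made from a bounded amount of lookahead.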


    Hierarchy of Linear Parsers

    • Basic containment relationship
      – Only a subset of all the CFGs can be recognized by LR parsers
      – Only a subset of those can be recognized by LL parsers: every LL grammar is also an LR grammar, but not the reverse

    [Figure: nested regions labeled LL parsing, LR parsing, and CFGs]


    Recursive Descent Parser Example

    • LL(1) grammar


    Recursive Descent Parser Example

    • Outline of recursive parser
      – This parser only verifies syntax
      – match is the scanner
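
    The grammar and parser outline from the slide's figure are not preserved in this transcript, so the sketch below uses the comma-separated identifier list grammar from earlier in the lecture instead. It is an assumption-laden illustration: the Parser class, the accepts helper, and the "$end" marker are hypothetical, and match here consumes from a list of tokens that has already been scanned. There is one method per non-terminal, and like the parser on the slide it only verifies syntax.

```python
class Parser:
    """Recursive descent recognizer for:
        id_list      -> id id_list_tail
        id_list_tail -> , id id_list_tail
        id_list_tail -> ;
    """

    def __init__(self, tokens):
        # "$end" is a hypothetical end-of-input marker.
        self.tokens = list(tokens) + ["$end"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Consume the next token if it is the expected one."""
        if self.lookahead != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def id_list(self):
        self.match("id")
        self.id_list_tail()

    def id_list_tail(self):
        # One token of lookahead decides which production applies.
        if self.lookahead == ",":
            self.match(",")
            self.match("id")
            self.id_list_tail()
        elif self.lookahead == ";":
            self.match(";")
        else:
            raise SyntaxError(f"unexpected token {self.lookahead!r}")


def accepts(tokens):
    """Return True if tokens form a complete id_list, False otherwise."""
    parser = Parser(tokens)
    try:
        parser.id_list()
        parser.match("$end")
    except SyntaxError:
        return False
    return True


print(accepts(["id", ",", "id", ",", "id", ";"]))
```

    The call structure mirrors the grammar: each non-terminal on a right-hand side becomes a method call, and each terminal becomes a call to match.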


    Recursive Descent Parser Example


    Recursive Descent Parser Example


    Recursive Descent Parser Example


    Semantic Analysis

    [Figure: "Compiler or Interpreter" diagram with the labels Source Code, Translation, Execution, and Target Code]

    • Specifying the meaning of a programming language
      – Attribute Grammars


    Reading Assignment

    • Scott’s Chapter 2
      – Section 2.2.2
      – Section 2.2.3