ANTLR4 and its testing


Upload: knoldus-software-llp

Post on 16-Apr-2017


ANTLR4 and its testing

Sahil Sawhney, Software Consultant, Knoldus Software LLP

Agenda

Understanding Grammar

Parse tree

Process Of Parsing

Knowing ANTLR

Who uses ANTLR

Testing in ANTLR

Demonstration

What is grammar?

The set of rules that explains how words are used in a language

-Merriam Webster

For example, the rule for using "an" in English grammar

Why grammar?

To bring order out of chaos

And

we adore chaos because we love to produce order.

What type of grammar?

Context-free grammar (CFG)

It consists of a finite set of grammar rules in the form of a quadruple (N, T, S, P) where

N is a set of non-terminal symbols (placeholders).

T is a set of terminal symbols, where N ∩ T = ∅.

S is the start symbol (must be a non-terminal).

P is a set of production rules, P: N → (N ∪ T)*
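The quadruple above can be sketched as a small data structure. This is only an illustration: the names Grammar, is_well_formed and PALINDROME are hypothetical, and the sample grammar is the palindrome grammar used in this deck's example.

```python
from typing import NamedTuple


class Grammar(NamedTuple):
    """A context-free grammar as the quadruple (N, T, S, P)."""
    nonterminals: frozenset  # N
    terminals: frozenset     # T
    start: str               # S
    productions: dict        # P: non-terminal -> list of right-hand sides


def is_well_formed(g: Grammar) -> bool:
    """Check the side conditions stated in the definition."""
    symbols = g.nonterminals | g.terminals
    if g.nonterminals & g.terminals:      # N and T must be disjoint
        return False
    if g.start not in g.nonterminals:     # S must be a non-terminal
        return False
    for head, bodies in g.productions.items():
        if head not in g.nonterminals:    # left-hand side is a single non-terminal
            return False
        for body in bodies:               # right-hand side is drawn from (N ∪ T)*
            if any(sym not in symbols for sym in body):
                return False
    return True


# S → aSa | bSb | a | b | ε  (the empty tuple stands for ε)
PALINDROME = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    start="S",
    productions={"S": [("a", "S", "a"), ("b", "S", "b"), ("a",), ("b",), ()]},
)
```

Calling is_well_formed(PALINDROME) returns True, while a grammar whose start symbol is a terminal is rejected.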

An Example

Consider the production rule for palindrome with alphabet {a,b}.

S → aSa | bSb | a | b | ε

here,

S is the start symbol and the only non-terminal symbol

{a,b} is the set of terminal symbols

Example Cont...

Consider the string ababa

Corresponding parse tree (read the leaves from the leftmost terminal to the rightmost terminal):

        S
      / | \
     a  S  a
      / | \
     b  S  b
        |
        a

What is a parse tree?

A parse tree for a grammar G is a tree where

The root is the start symbol for G

The interior nodes are the nonterminals of G

The leaf nodes are the terminal symbols of G

A string of terminals is valid with respect to a grammar only if at least one parse tree of that grammar has exactly that string as its leaves, read left to right.
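As a rough sketch of this definition, the hand-written function below builds a parse tree for the palindrome grammar S → aSa | bSb | a | b | ε and reads its leaves back. The names parse_palindrome and leaves are hypothetical, and the nested-tuple tree format is an assumption made for the example; it is not what ANTLR generates.

```python
def parse_palindrome(s):
    """Return a parse tree for s as nested tuples, or None if s is not valid."""
    if s == "":
        return ("S",)                      # S → ε
    if s in ("a", "b"):
        return ("S", s)                    # S → a | b
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "ab":
        inner = parse_palindrome(s[1:-1])  # S → aSa | bSb
        if inner is not None:
            return ("S", s[0], inner, s[-1])
    return None


def leaves(tree):
    """Concatenate the terminal leaves of a tree from left to right."""
    out = []
    for child in tree[1:]:                 # tree[0] is the non-terminal label
        if isinstance(child, tuple):
            out.append(leaves(child))
        else:
            out.append(child)
    return "".join(out)


tree = parse_palindrome("ababa")
# The root is S, interior nodes are S, and the leaves spell the input again.
assert leaves(tree) == "ababa"
```

A string such as "ab" has no parse tree under this grammar, so parse_palindrome returns None for it.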

Finally, what is ANTLR?

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

From a grammar, ANTLR generates a parser that can build and walk parse trees (data structure representing how a grammar matches the input).

Now what is this parser?

A parser is a program that takes input in the form of a sequence of tokens or program instructions and usually builds a data structure in the form of a parse tree or an abstract syntax tree.

[Figure: the input string ababa and the grammar rules (S → aSa | bSb | a | b | ε) go into the parser, which produces the parse tree for ababa.]

3 stages of parsing

Lexical analysis: produces tokens from the input character stream.

Syntactic analysis: checks whether the generated tokens form a grammatically correct expression.

Semantic analysis: if the expression is valid, a meaning is associated with it and the necessary actions are taken.
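The three stages can be sketched by hand for a tiny arithmetic language with integers, +, * and parentheses. This is not ANTLR output; the function names lex, parse_expr and evaluate and the tuple-based tree are assumptions made for the illustration.

```python
import re


def lex(text):
    """Stage 1, lexical analysis: turn characters into (kind, value) tokens."""
    tokens = []
    for lexeme in re.findall(r"\d+|\S", text):
        if lexeme.isdigit():
            tokens.append(("NUM", int(lexeme)))
        elif lexeme in "+*()":
            tokens.append((lexeme, lexeme))
        else:
            raise SyntaxError(f"unexpected character {lexeme!r}")
    return tokens


def parse_expr(tokens):
    """Stage 2, syntactic analysis: check the token order and build a tree.

    Grammar: expr → term ('+' term)*, term → factor ('*' factor)*,
             factor → NUM | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def take(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind}, found {peek()}")
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        node = term()
        while peek() == "+":
            take("+")
            node = ("+", node, term())
        return node

    def term():
        node = factor()
        while peek() == "*":
            take("*")
            node = ("*", node, factor())
        return node

    def factor():
        if peek() == "(":
            take("(")
            node = expr()
            take(")")
            return node
        return ("NUM", take("NUM")[1])

    tree = expr()
    if peek() != "EOF":
        raise SyntaxError(f"unexpected token {peek()}")
    return tree


def evaluate(tree):
    """Stage 3, semantic analysis: attach a meaning (here, a value) to the tree."""
    if tree[0] == "NUM":
        return tree[1]
    op, left, right = tree
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)


assert evaluate(parse_expr(lex("2 + 3 * (4 + 1)"))) == 17
```

Each stage only consumes the output of the previous one, which is the same separation ANTLR keeps between its generated lexer, its generated parser and the user's listener or visitor code.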

ANTLR Cont...

In a nutshell,

the ANTLR tool converts grammars into programs (Java programs by default) that recognize sentences in the language described by the grammar.

For example, given a grammar for JSON, the ANTLR tool generates a program that recognizes JSON input using some support classes from the ANTLR runtime library.

Here, ANTLR 4 acts on the grammar file MyGrammar.g4 and generates the corresponding Java classes and token files:

MyGrammar.tokens

MyGrammarLexer.tokens

MyGrammarLexer

MyGrammarParser

MyGrammarListener

MyGrammarBaseListener

MyGrammarVisitor

MyGrammarBaseVisitor

But why ANTLR?

ANTLR generates recursive-descent parsers (a type of top-down parser) and has good error reporting.

The parser generated by ANTLR is more or less readable. This helps in debugging.

ANTLR is open source and has a large number of users worldwide, so there is a reasonable chance that bugs will be identified and corrected.

When to use ANTLR4?

When building a DSL (domain-specific language)

Does anyone care about ANTLR?

The following say "Yes, we do":

Twitter search uses ANTLR for query parsing, with more than 2 billion queries a day

The NetBeans IDE parses C++ with ANTLR

Oracle uses ANTLR within the SQL Developer IDE and its migration tools

Knoldus uses ANTLR in their projects to meet DSL (domain-specific language) requirements

Any Alternatives?

The list is long. Some examples are:

CL-Yacc (Common Lisp)

Gold (C#, Java, Python, Visual Basic etc.)

Hime Parser Generator (C#, Java)

Coco/R (Ada, Pascal, Oberon, Ruby)

Yecc (Erlang)

etc.

Testing ANTLR4

ANTLR provides a flexible testing tool in the runtime library called TestRig.

It can display lots of information about how a recognizer (the auto-generated Java classes) matches input from a file or standard input.

TestRig uses Java reflection to invoke compiled recognizers.
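The reflection idea can be sketched in Python: given only a grammar name and a start-rule name as strings, look up the matching parser class and call the rule method by name. This is an analogy, not TestRig's code. HelloParser, its greeting rule, the compiled registry and run_rule are all hypothetical stand-ins for compiled recognizer classes.

```python
class HelloParser:
    """Hypothetical stand-in for a generated parser class."""

    def __init__(self, text):
        self.text = text

    def greeting(self):
        # A real generated rule method would return a parse tree; this
        # stand-in just returns the rule name and the split input.
        return ("greeting", self.text.split())


def run_rule(registry, grammar_name, rule_name, text):
    """Find '<grammar_name>Parser' by name and invoke the rule by name."""
    parser_class = registry[grammar_name + "Parser"]  # class looked up from a string
    parser = parser_class(text)
    rule = getattr(parser, rule_name)                 # method looked up from a string
    return rule()


# The registry plays the role of the classpath holding compiled recognizers.
compiled = {"HelloParser": HelloParser}
result = run_rule(compiled, "Hello", "greeting", "hello world")
assert result == ("greeting", ["hello", "world"])
```

Because the lookup is driven by strings, the same test tool works with any grammar without being recompiled, which is the benefit the slide attributes to reflection.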

References

https://github.com/antlr/antlr4/blob/master/doc/getting-started.md

https://www.javacodegeeks.com/2012/06/antlr-getting-started.html

https://en.wikipedia.org/wiki/Context-free_grammar

https://blog.knoldus.com/2016/04/29/testing-grammar-using-antlr4-testrig-grun/

Any questions?

Thank you!