supported by elte ikkk, ericsson hungary, in cooperation with university of kent erlang refactoring...

27
Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors: Zoltán Horváth and Simon Thompson Project members: László Lövei, Tamás Kozsik

Upload: lynne-hicks

Post on 03-Jan-2016

218 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent

Erlang refactoring with relational database

Anikó Víg and Tamás NagySupervisors:

Zoltán Horváth and Simon ThompsonProject members: László Lövei, Tamás Kozsik

Page 2: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Refactor

Refactoring is restructuring program code without altering its external behaviour

Aims: restructuring of code, code quality improvement, coding conventions, optimization, migrating to new API

In general not available for functional languages, only: HaRe (AST) and Clean refactoring (relational database)

Page 3: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Erlang

Functional programming language and runtime environment developed by Ericsson

Designed to build distributed, reliable, soft realtime concurrent systems (telecommunication)

Highly dynamic nature Lightweight processes and message passing Messages can be sent to a process ID or registered

name,which are bound to function code at runtime. Possibility of running dynamically created code (eval,

hot code replacement).

Page 4: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Erlang

Function id-s are atoms, atoms may be constructed runtime (and passed as arguments to apply,spawn). The syntactical category of an atom may be a function or ”string”, etc.

Side effects are restricted to message passing and built-in functions

Modules with explicit interface definitions and static export and import lists

No static type system Variables are assigned a value only once in their life Variables are not typed statically, they can have a

value of any data type.

Page 5: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

The refactor tool

Erlang node

Source codeIn Emacs

AST of thesource

Source in thedatabase

MySQL

Distel

Emacs

ODBC (?)

Page 6: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Code , AST

Page 7: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Storing the code in database

Every node in a tree has a unique identifier. Every module has a unique module identifier. Almost every syntax-tree node type has an

own database table The records in the tables contain the

identifiers of the current nodes and their children’s ones.

We store the positions, node types and names in separate tables.

Page 8: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Semantic informations

We store the semantic informations in separate tables: identical variables, function definitions and their calling expressions, scope of the nodes, hierarhy of scopes

AST + semantic informations = graph, not tree. + Searching and using a graph is more efficient

with relational database as traversing the tree.- The storing-recovering of the code and

communication with the database is more expensive

Page 9: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Our own tables: visibility of variables

Page 10: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Our own tables: function calls

Page 11: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Our own tables: visibility of scopes

Page 12: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Problems during the building up

The Erlang prepocessor substitutes the macro definitions using epp_dodger instead of epp

Every node has only the line number of position informations using erl_scan1 (modified by Huiqing) instead of erl_scan: it can give back the column information too

Page 13: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Problems during the building up

In Erlang language there are many types of comments: we have not only comments, but pre- and postcomments too, which can be list of comment nodes.

The erl_comment_scan:file collects the comments from the file, and we have to put them too with the correct position information into the database.

There is the same problem with the column infomation so we use erl_recomment1 instead of erl_recomment.

Page 14: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

The algorithms

We use postorder traverse on the syntax tree to give identifiers and put the nodes into the correct table.

We use preorder traverse on the syntax tree to get the visibility information.

The other information come from the database (collected by separate processes), for example the information of the function callings.

Page 15: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Analysis for Refactoring Steps

Syntax analysis (AST) Static semantics (Annotated AST or relational

database): scope, visibility, binding structure, type information.

Side-condition analysis Compensations Dynamic function calls (apply, spawn, etc.) Syntactical, semantical and library

coverage

Page 16: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Rename variable

Definition: Find every occurrence of the variable (i.e. the variables with the same name in the visibility range of the variable) and replace every occurrence with the new name.

Precondition: The new variable name is not visible at any occurrence of the variable

Limited to one module (no global variables)

Page 17: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Rename function

Definition: The refactoring rely on finding the definition and every place of call for a given function and substitute it with a new name.

Preconditions: No name clash in the current module (existing

functions, import list) No name clash in other modules, if the

function is exported

Page 18: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Reorder arguments

Definition: Change the order of arguments in the same way at the definition and every place of call for a given function.

Preconditions: No side effect of the parameters (just planned)

Tricky implicit function calls delete, create subtree (the same problem will be at the tuple arguments refactor step too)

Page 19: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Implicit function example

Page 20: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Tuple arguments

Definition: Change the way of using some arguments at the definition and at every place of call for a given function by grouping some arguments into one tuple argument.

Preconditions: The given position must be within a formal argument of

a function definition The function must be a declared function, not a fun-

expression The given number must not be too large No name clash if the arity is changing (not only in the

current module if the function is exported)

Page 21: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Eliminate variable

Definition: All instances of a variable are replaced with its bound value in that region where the variable is visible. The variable can be left out where its value is not used.

Preconditions: It has exactly one binding occurrence on the left hand

side of a pattern matching expression, and not a part of a compound pattern.

The expression bound to the variable has no side effects.

Every variable of the expression is visible (that is, not shadowed) at every occurrence of the variable to be eliminated.

Page 22: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Eliminate variable cont.

Decide if an occurrence is needed (remove or replace). Remove if: Not at the end of block expression Not at the end of clause body Not at the end of the recieve expression action Not at the end of try expression body, after branch and

handler Replicate subtree, because we need unique id-s in

the new subtrees. The subtree can contain every node type, we had to implement the most of the syntax tool module for database representation.

Page 23: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Planned refactor steps (short term)

Merge subexpression duplicates: All instances of the same subexpressions are stored in a variable that the user gives, then all instances of the original subexpression are changed to the variable.

Extract function: An alternative of a function definition might contain a sequence of expressions which can be considered as a logical unit, hence a function definition can be created from it. The extracted function is lifted to the module level, and it is parameterised with the variables that the expressions depend on. The sequence of expressions is replaced with a function call expression.

Page 24: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Planned refactor steps (middle term)

Tuple to record Specialisation of functions Generalisation of functions Fusion of functions Modification of data structures

Page 25: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Fusion and comparison

We are working together with the University of Kent. They released the Wrangler Erlang refactorer in January 2007. Their approach is working with annotated abstract syntax trees without database. We plan to make a common version with more refactorings.

The two tools have only two common refactorings (rename variable, rename function). The two tools gave the same result on bigger testbases too.

Page 26: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Testing

We tested the tool with more than 200 little test cases, which are made to cover the possible branches of the refactorings. The tool gave the same result as the original result files.

We tested the tool on a real bigger codebase. Our version was slower at the moment as the

AAST approach at big multi module systems and huge source files.

Page 27: Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Future work

We plan to make the fusion of the two tools at first just „under a common umbrella” and later with a common interface level

Compare the two approaches with more complicated refactorings (generalisation)

We plan to eliminate the ODBC connection and call the MySQL database directly from the Erlang node (a mysql module was released for Erlang in the middle of January)

Expand the number of the refactorings