aspect mining - an emerging research domain



Post on 28-Nov-2014


DESCRIPTION

Presentation on Aspect Mining for GroepT in Leuven, Belgium

TRANSCRIPT

Page 1: Aspect Mining - An emerging research domain

Prof. Kim Mens

Département d’Ingénierie Informatique (INGI)

Université catholique de Louvain (UCL)

http://www.info.ucl.ac.be/~km

Aspect Mining

An emerging research domain


Aspect Mining: an emerging research domain

• Motivation
• Preliminaries
  • formal concept analysis
  • fan-in
• Three aspect mining techniques
  • identifier analysis
  • dynamic analysis
  • fan-in analysis
  • comparison
• Conclusion


Need for aspect mining

• Aspects offer a better separation of concerns by solving the problems of “scattering” and “tangling”
• But what if we have (non-AO) legacy code? How can we migrate it to an AOP solution?
• Need for aspect mining
  • aspect identification: how to find all code relevant to some crosscutting concern
  • aspect refactoring: and turn it into an aspect
  • with the help of automated software tools


Three open research problems in AOP

“Aspect Mining”


Formal Concept Analysis (FCA)

• Starts from
  • a set of elements
  • a set of properties of those elements
• Determines concepts
  • Maximal groups of elements and properties
  • Group:
    • Every element of the concept has those properties
    • Every property of the concept holds for those elements
  • Maximal:
    • No other element (outside the concept) has those same properties
    • No other property (outside the concept) is shared by all elements


FCA : Elements and Properties

            object-oriented  functional  logic  static typing  dynamic typing
C++         X                -           -      X              -
Java        X                -           -      X              -
Smalltalk   X                -           -      -              X
Scheme      -                X           -      -              X
Prolog      -                -           X      -              X
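The definition of a concept as a maximal group can be tried out directly on the language table. The following is a minimal sketch, not the algorithm used by the tools in this talk: it closes every subset of properties by brute force, which is only reasonable for a context this small. The class and method names (FcaSketch, extent, intent, concepts) are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FcaSketch {
    static final List<String> PROPERTIES = List.of(
        "object-oriented", "functional", "logic", "static typing", "dynamic typing");

    // The formal context from the table: element -> properties it has.
    static Map<String, Set<String>> languageContext() {
        Map<String, Set<String>> ctx = new LinkedHashMap<>();
        ctx.put("C++", Set.of("object-oriented", "static typing"));
        ctx.put("Java", Set.of("object-oriented", "static typing"));
        ctx.put("Smalltalk", Set.of("object-oriented", "dynamic typing"));
        ctx.put("Scheme", Set.of("functional", "dynamic typing"));
        ctx.put("Prolog", Set.of("logic", "dynamic typing"));
        return ctx;
    }

    // Extent: all elements that have every property in props.
    static Set<String> extent(Map<String, Set<String>> ctx, Set<String> props) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : ctx.entrySet()) {
            if (e.getValue().containsAll(props)) result.add(e.getKey());
        }
        return result;
    }

    // Intent: all properties shared by every element in elems.
    static Set<String> intent(Map<String, Set<String>> ctx, Set<String> elems) {
        Set<String> result = new TreeSet<>(PROPERTIES);
        for (String element : elems) result.retainAll(ctx.get(element));
        return result;
    }

    // All concepts, as a map from intent (properties) to extent (elements).
    static Map<Set<String>, Set<String>> concepts(Map<String, Set<String>> ctx) {
        Map<Set<String>, Set<String>> found = new LinkedHashMap<>();
        int n = PROPERTIES.size();
        for (int mask = 0; mask < (1 << n); mask++) {
            Set<String> props = new TreeSet<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) props.add(PROPERTIES.get(i));
            }
            Set<String> ext = extent(ctx, props);
            // Closing the property set makes the pair maximal on both sides.
            found.put(intent(ctx, ext), ext);
        }
        return found;
    }

    public static void main(String[] args) {
        concepts(languageContext()).forEach(
            (in, ex) -> System.out.println(ex + " share " + in));
    }
}
```

On this table the sketch finds eight concepts, including the two degenerate ones (all languages with no shared property, and no language with all properties).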

FCA : Concepts

[Slides 10 to 16 repeat the table above.]

FCA : Concept Lattice

Concept hierarchy based on the containment relation between concepts.


“Mined” Concepts

• Properties shared by all languages (none)
• Languages having all properties (none)
• OO languages
• Languages with dynamic typing
• Dynamically typed OO languages
• Statically typed OO languages
• Dynamically typed functional languages
• Dynamically typed logic languages


Concept Lattice with sparse labeling

[Figure: concept lattice with sparse labels: object-oriented; dynamic typing; static typing with Java, C++; Smalltalk; functional with Scheme; logic with Prolog.]

The labeling algorithm detects for each concept its most specific elements and properties.
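Sparse labeling can be sketched without building the whole lattice: an element labels the most specific concept that contains it, and a property labels the most general concept that carries it. The code below is an illustrative sketch on the language table only; the class and method names are hypothetical, and it assumes every property holds for at least one element.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SparseLabels {
    // The language table: element -> properties it has.
    static final Map<String, Set<String>> CONTEXT = new LinkedHashMap<>();
    static {
        CONTEXT.put("C++", Set.of("object-oriented", "static typing"));
        CONTEXT.put("Java", Set.of("object-oriented", "static typing"));
        CONTEXT.put("Smalltalk", Set.of("object-oriented", "dynamic typing"));
        CONTEXT.put("Scheme", Set.of("functional", "dynamic typing"));
        CONTEXT.put("Prolog", Set.of("logic", "dynamic typing"));
    }

    // Element labels, grouped by the intent of the concept they are attached to.
    // An element's own property set is exactly the intent of its most specific concept.
    static Map<Set<String>, Set<String>> elementLabels() {
        Map<Set<String>, Set<String>> labels = new LinkedHashMap<>();
        CONTEXT.forEach((element, props) ->
            labels.computeIfAbsent(new TreeSet<>(props), k -> new TreeSet<>()).add(element));
        return labels;
    }

    // Intent of the concept that a property labels: everything shared by
    // all elements that have this property.
    static Set<String> conceptLabelledBy(String property) {
        Set<String> shared = null;
        for (Set<String> props : CONTEXT.values()) {
            if (!props.contains(property)) continue;
            if (shared == null) shared = new TreeSet<>(props);
            else shared.retainAll(props);
        }
        if (shared == null) {
            throw new IllegalArgumentException("no element has property " + property);
        }
        return shared;
    }

    public static void main(String[] args) {
        for (String p : new String[] {"object-oriented", "functional", "logic",
                                      "static typing", "dynamic typing"}) {
            System.out.println(p + " labels the concept with intent " + conceptLabelledBy(p));
        }
        elementLabels().forEach(
            (in, els) -> System.out.println(els + " label the concept with intent " + in));
    }
}
```

For this table, "static typing" and the elements Java and C++ end up on the same concept, which is why they appear together in the sparsely labeled lattice.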


Fan-in

• Fan-in metric [Henderson-Sellers] counts the number of locations from which control is passed into a module
• Fan-in metric for OOP
  • applied to method M
  • number of distinct method bodies that can invoke M
• What about polymorphism?
  • one call site can affect the fan-in of several methods
  • a call to M contributes to the fan-in of M but also of all its overriding methods as well as all methods it overrides
  • this interpretation corresponds to the standard behavior of the “search for references” feature of the Eclipse IDE

module /method


Fan-In Example

Fan-in of a method m = the number of distinct method bodies that can invoke m

interface A {
    public void m();
}
class B implements A {
    public void m() {}
}
class C1 extends B {
    public void m() {}
}
class C2 extends B {
    public void m() { super.m(); }
}
class D {
    void f1(A a) { a.m(); }
    void f2(B b) { b.m(); }
    void f3(C1 c) { c.m(); }
}

Method  Caller set                  Fan-in
A.m     {D.f1, D.f2, D.f3}          3
B.m     {D.f1, D.f2, D.f3, C2.m}    4
C1.m    {D.f1, D.f2, D.f3}          3
C2.m    {D.f1, D.f2}                2
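The caller sets in the table can be reproduced with a small model of the example. This is a sketch under two assumptions: the hierarchy and call sites are hard-coded rather than extracted from source code, and a super call is taken to bind statically, so it counts only for the method it names (this is read off the table, where C2.m appears only in the caller set of B.m). All names in the code are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class FanInSketch {
    // Direct supertype of each type that declares m(), mirroring the example.
    static final Map<String, String> PARENT = Map.of("B", "A", "C1", "B", "C2", "B");
    static final List<String> TYPES = List.of("A", "B", "C1", "C2");

    // One call site: calling method body, static receiver type, super call or not.
    static final class Call {
        final String caller;
        final String receiverType;
        final boolean superCall;

        Call(String caller, String receiverType, boolean superCall) {
            this.caller = caller;
            this.receiverType = receiverType;
            this.superCall = superCall;
        }
    }

    static final List<Call> CALLS = List.of(
        new Call("D.f1", "A", false),
        new Call("D.f2", "B", false),
        new Call("D.f3", "C1", false),
        new Call("C2.m", "B", true));  // super.m() in C2 targets B.m only

    static boolean isAncestor(String ancestor, String type) {
        for (String t = PARENT.get(type); t != null; t = PARENT.get(t)) {
            if (t.equals(ancestor)) return true;
        }
        return false;
    }

    // Caller set of T.m for every type T in the hierarchy.
    static Map<String, Set<String>> callerSets() {
        Map<String, Set<String>> callers = new TreeMap<>();
        for (String t : TYPES) callers.put(t + ".m", new TreeSet<>());
        for (Call call : CALLS) {
            for (String t : TYPES) {
                boolean same = t.equals(call.receiverType);
                // Overridden methods (ancestors) and overriding methods (descendants).
                boolean related = isAncestor(t, call.receiverType)
                               || isAncestor(call.receiverType, t);
                if (same || (!call.superCall && related)) {
                    callers.get(t + ".m").add(call.caller);
                }
            }
        }
        return callers;
    }

    public static void main(String[] args) {
        callerSets().forEach((method, set) ->
            System.out.println(method + " " + set + " fan-in " + set.size()));
    }
}
```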


Aspect Mining Techniques

• Aspect mining is an emerging research domain
• Several aspect mining techniques are being proposed
  • Based on pattern matching, clone detection, logic reasoning, concept analysis, clustering, fan-in analysis, program slicing...
• We will focus on three specific techniques
  • Two based on formal concept analysis:
    • Identifier analysis
    • Dynamic analysis
  • One based on the fan-in metric:
    • Fan-in analysis


Some references

• A qualitative comparison of three aspect mining techniques

M. Ceccato, M. Marin, K. Mens, L. Moonen, P. Tonella, T. Tourwé

Int’l Working Conference on Program Comprehension 2005

• Mining aspectual views using formal concept analysis

T. Tourwé, K. Mens

Int’l workshop on Source-Code Analysis and Manipulation 2004

• Aspect mining through the formal concept analysis of execution traces

P. Tonella, M. Ceccato

IEEE Working Conference on Reverse Engineering 2004

• Identifying aspects using fan-in analysis

M. Marin, A. van Deursen, L. Moonen

IEEE Working Conference on Reverse Engineering 2004


3 Aspect Mining Techniques

• Identifier analysis
  • Approach: Use FCA to group classes/methods with similar names
  • Motivation: In the absence of real AOP support, (OO) developers often rely on naming conventions
• Dynamic analysis
  • Approach: Use FCA to relate methods to the use-case scenarios in which they appear
  • Motivation: Methods used in different scenarios may represent a crosscutting concern
• Fan-in analysis
  • Approach: Look for methods with a high fan-in value
  • Motivation: Methods that are invoked from “all over the place” indicate a kind of scattering


Case study

• Applied each technique to the same case study
• JHotDraw
  • Framework for 2D graphics, ~18,000 NCLOC
  • Open source (jhotdraw.org)
  • Rather well designed (design patterns)
  • shows relevance of aspect mining even for well-designed cases
• Qualitative comparison of identified aspects


“Identifier analysis”

• Idea: Use FCA to group program entities with similar names
  • Elements: classes and methods
  • Properties: substrings of the elements’ names
  • Only considering crosscutting groups
• Approach relies on naming conventions
  • Primary means to associate related but distant program entities, in the absence of designated AOP constructs
  • Especially for object-oriented software
    • polymorphism, intention-revealing names, design patterns, ...
• Joint work with Dr. Tom Tourwé (CWI)


Identifier analysis approach

1. Generate the formal context
   Elements, properties & incidence relation
2. Concept analysis
   Calculate the formal concepts (& organise them into a concept lattice)
3. Filtering
   Remove irrelevant concepts
   • too small
   • not scattered
4. Manually inspect concepts
   Are they really an aspect or crosscutting concern?


1. Generate formal context

• Elements
  • all classes and methods in the analyzed program
  • except test classes, accessor methods (produce too much noise)
• Properties
  • all “relevant” substrings of the elements’ names
  • Based on where uppercase letters occur in an element’s name
    • createUndoActivity → { create, undo, activity }
  • Filter substrings that produce too much noise
  • Uses a stemming algorithm to map substrings to the same ‘stem’
• Incidence relation: an element has a property if it has the substring in its name
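The splitting step and the incidence relation can be sketched as follows. This is an illustration only, not the DelfSTof implementation: the regular expression is one possible way to split at uppercase letters, and the noise filter and the stemming step are left out. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class IdentifierSplitter {
    // Split an identifier where a lowercase letter or digit is followed by
    // an uppercase letter, then lowercase each part.
    static List<String> substrings(String identifier) {
        List<String> parts = new ArrayList<>();
        for (String part : identifier.split("(?<=[a-z0-9])(?=[A-Z])")) {
            if (!part.isEmpty()) parts.add(part.toLowerCase(Locale.ROOT));
        }
        return parts;
    }

    // Incidence relation: each element (name) has its substrings as properties.
    static Map<String, Set<String>> formalContext(List<String> names) {
        Map<String, Set<String>> context = new LinkedHashMap<>();
        for (String name : names) {
            context.put(name, new LinkedHashSet<>(substrings(name)));
        }
        return context;
    }

    public static void main(String[] args) {
        System.out.println(substrings("createUndoActivity"));
        System.out.println(formalContext(
            List.of("figureRequestRemove", "figureRequestUpdate", "drawingRequestUpdate")));
    }
}
```

The resulting map has the same shape as the element-to-properties table used for FCA, so it can be fed to any concept analysis step.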


2. Concept Analysis

groups entities with similar identifiers

                                              figure  drawing  request  remove  update  change  event  …
drawingRequestUpdate(DrawingChangeEvent e)    -       X        X        -       X       -       -      …
figureRequestRemove(FigureChangeEvent e)      X       -        X        X       -       -       -      …
figureRequestUpdate(FigureChangeEvent e)      X       -        X        -       X       -       -      …
figureRequestRemove(FigureChangeEvent e)      X       -        X        X       -       -       -      …
figureRequestUpdate(FigureChangeEvent e)      X       -        X        -       X       -       -      …
…                                             …       …        X        …       …       …       …      …


3. Filtering

• Irrelevant elements and properties already filtered
  • substrings with little meaning or that are too small
  • test classes and methods, accessor methods
• Extra filtering
  • Drop top & bottom concept when empty
  • Drop concepts with too few elements (fewer than 4)
  • Drop concepts where classes and methods are not ‘scattered’
    • should be in at least 2 different unrelated class hierarchies
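The size and scattering filters can be sketched as a single predicate over a concept's elements. The input format is an assumption made for this example: elements are written as "Class.method", and a hypothetical map gives the root of the class hierarchy each class belongs to. The class names in the usage example are made up.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ConceptFilter {
    static final int MIN_ELEMENTS = 4;     // threshold mentioned in the text
    static final int MIN_HIERARCHIES = 2;  // "at least 2 different unrelated class hierarchies"

    // Returns true when the concept survives both filters.
    // elements: "Class.method" names; hierarchyRoot: class -> root of its hierarchy.
    static boolean keep(Set<String> elements, Map<String, String> hierarchyRoot) {
        if (elements.size() < MIN_ELEMENTS) return false;  // too small (also drops empty concepts)
        Set<String> roots = new HashSet<>();
        for (String element : elements) {
            int dot = element.indexOf('.');
            String owner = dot >= 0 ? element.substring(0, dot) : element;
            // A class without a known root is treated as its own hierarchy.
            roots.add(hierarchyRoot.getOrDefault(owner, owner));
        }
        return roots.size() >= MIN_HIERARCHIES;            // scattered enough
    }

    public static void main(String[] args) {
        Map<String, String> roots = Map.of(
            "RectangleFigure", "Figure", "EllipseFigure", "Figure", "UndoCommand", "Command");
        Set<String> concept = Set.of(
            "RectangleFigure.draw", "EllipseFigure.draw",
            "UndoCommand.draw", "UndoCommand.execute");
        System.out.println(keep(concept, roots));
    }
}
```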


4. Manually inspect concepts

• Use DelfSTof, our Conceptual Code Mining tool
• Browse code of concept elements
  • Does a discovered concept really represent an aspect or crosscutting concern?
  • Is the code really similar?
  • Or do the elements ‘accidentally’ have a similar name?
• Group concepts that seem to address a similar concern
  • Persistence: file / storable / load / register


DelfSTof, our Conceptual Code Mining tool


Case study : JHotDraw

• 2193 elements and 507 properties
• 230 concepts were discovered in 31 seconds
  • when using a threshold of 4 for the minimum number of elements
  • with threshold 10: 100 concepts; similar execution time
• 41 crosscutting concerns identified
  • after (laborious) manual analysis of the concepts
  • three kinds:
    • traditional aspects (observer, undo, persistence)
    • crosscutting business logic (drawing figures, moving figures)
    • Java-specific concerns (iterating over collections)


Case study : JHotDraw

Selection of results of identifier analysis experiment

Crosscutting concern | Concept(s) | #elements | Some elements
Observer | change / check / listener / release | 67 / 14 / 65 / 12 | figureChanged(e) / checkDamage() / createDesktopListener()
Undo | undo / redo | 53 / 14 | createUndoActivity() / redo()
Visitor | visit | 12 | visit(FigureVisitor)
Persistence | file / storable / load / register | 15 / 5 / 8 / 7 | registerFileFilters(c) / readStorable() / loadRegisteredImages
Drawing | draw | 112 | draw(g)
Moving figures | move | 36 | moveBy(x,y) / moveSelection(dx,dy)
Iterating | iterator | 5 | iterator() / listIterator()


“Dynamic analysis”

Aspect Mining through Formal Concept Analysis of Execution Traces

Based on a presentation by Mariano Ceccato & Paolo Tonella (ITC-irst)


Why execution traces?

• We are interested in mining those crosscutting concerns that are associated with system requirements that are not well modularized
• Use cases specify system requirements
• Execution traces are the result of use-case executions

[Figure: running the application on Scenario 1 (feature a), Scenario 2 (feature b), ..., Scenario N (feature z) produces Trace 1, Trace 2, ..., Trace N.]


Rough trace analysis output

• Dynamic analysis produces a relation between:
  • Elements = methods (computational units)
  • Properties = scenarios (use cases)
  • Relation R = in scenario s the method m is executed
• The table can be too big to be manually inspected or analyzed
• We need a way to extract knowledge from it
• Use FCA (with sparse labeling)

Scenarios \ Computational units: method1, method2, method3, method4
scenario1   x x x
scenario2   x x x
scenario3   x x


Dynamic Analysis

• The concept specific to a given feature is labeled by the corresponding scenario.
• The most specific methods for a concept are the ones in its label.
• “Dynamic analysis” focuses on those concepts that have both scenarios and methods in their labels.

[Figure: FCA turns the table of scenarios (scen1 to scen3) against methods (meth1 to meth4) into a concept lattice with sparse labeling, with nodes Top, C0, C1, C2, C3 and Bottom.]


Interpretation of the lattice

• A potential aspect is detected when the existing modularity fails to separate the requirements
• It happens when:
  • The specific methods for a use case come from different classes (scattering).
  • The same class defines specific methods for more than one use case (tangling).
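The two conditions can be checked mechanically once the methods specific to each use case are known. The sketch below simplifies the lattice-based definition: a method counts as specific to a scenario when it is executed in that scenario only. The trace data in main is invented for illustration (the method names are borrowed from the Dijkstra example, but which scenario executes them is an assumption), and all class and method names of the sketch itself are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TraceAnalysis {
    // traces: scenario -> methods ("Class.method()") executed when running it.
    // Result: scenario -> methods executed in that scenario only.
    static Map<String, Set<String>> specificMethods(Map<String, Set<String>> traces) {
        Map<String, Integer> scenarioCount = new HashMap<>();
        for (Set<String> methods : traces.values()) {
            for (String m : methods) scenarioCount.merge(m, 1, Integer::sum);
        }
        Map<String, Set<String>> specific = new TreeMap<>();
        traces.forEach((scenario, methods) -> {
            Set<String> own = new TreeSet<>();
            for (String m : methods) {
                if (scenarioCount.get(m) == 1) own.add(m);
            }
            specific.put(scenario, own);
        });
        return specific;
    }

    static String classOf(String method) {
        return method.substring(0, method.indexOf('.'));
    }

    // Scattering: the specific methods of a scenario come from different classes.
    static Set<String> scatteredScenarios(Map<String, Set<String>> specific) {
        Set<String> result = new TreeSet<>();
        specific.forEach((scenario, methods) -> {
            Set<String> classes = new TreeSet<>();
            for (String m : methods) classes.add(classOf(m));
            if (classes.size() > 1) result.add(scenario);
        });
        return result;
    }

    // Tangling: one class defines specific methods for more than one scenario.
    static Set<String> tangledClasses(Map<String, Set<String>> specific) {
        Map<String, Set<String>> scenariosPerClass = new TreeMap<>();
        specific.forEach((scenario, methods) -> {
            for (String m : methods) {
                scenariosPerClass.computeIfAbsent(classOf(m), k -> new TreeSet<>()).add(scenario);
            }
        });
        Set<String> result = new TreeSet<>();
        scenariosPerClass.forEach((cls, scenarios) -> {
            if (scenarios.size() > 1) result.add(cls);
        });
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> traces = new TreeMap<>();  // hypothetical traces
        traces.put("lock", Set.of("GraphCanvas.lock()", "Options.lock()", "GraphCanvas.run()"));
        traces.put("run", Set.of("GraphCanvas.runalg()", "GraphCanvas.run()"));
        Map<String, Set<String>> specific = specificMethods(traces);
        System.out.println("scattered: " + scatteredScenarios(specific));
        System.out.println("tangled:   " + tangledClasses(specific));
    }
}
```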

[Figure labels: Documentation, DocOption, Algorithm, GraphCanvas, Options, GraphAlgorithm, Draw.]

A small case study : Dijkstra algorithm

• Small size application (1068 LOC) easy to analyze

• Many features: interesting case study


GraphAlgorithm.unlock()
GraphAlgorithm.lock()
Options.unlock()
Options.lock()
GraphCanvas.lock()
GraphCanvas.unlock()
GraphCanvas.runalg()
GraphCanvas.detailsDijkstra(Graphics,int,int)
GraphCanvas.endstepalg(Graphics)
GraphCanvas.detailsalg(Graphics,int,int)
GraphCanvas.endstepDijkstra(Graphics)
GraphCanvas.stepalg()
GraphCanvas.reset()
GraphCanvas.nextstep()
GraphCanvas.clear()
GraphCanvas.showexample()
GraphCanvas.initalg()
GraphCanvas.run()

Discovered aspect : “locking”
• tangled
• scattered
• can be associated with a well-defined functionality
• but is not the main functionality (“algorithm”)


Case study : JHotDraw

• 27 elements (use cases)
  • draw a rectangle, draw a line with the scribble tool, create a connector between two figures, ...
• 1262 properties
  • JHotDraw methods executed by running the scenarios
• concept lattice with 1514 nodes
  • 11 were classified as use-case specific aspects
  • 56 as generic aspects
• these were revisited manually to determine plausible aspects
  • that can be associated with a single well-defined functionality
  • that is not the main functionality of the involved classes


Case study : JHotDraw

Summary of results of dynamic analysis experiment

Aspect                           Concepts  Methods
Undo                             2         36
Bring to front                   1         3
Send to back                     1         3
Connect text                     1         18
Persistence                      1         30
Manage Handles                   4         60
Move figure                      1         7
Command executability            1         25
Connect figures                  1         55
Figure observer                  4         11
Add text                         1         26
Add URL to figure                1         10
Manage figures outside drawing   1         2
Get attribute                    1         2
Set attribute                    1         2
Manage view rectangle            1         2
Visitor                          1         6

CH.ifa.draw.figures:
  EllipseFigure.basicMoveBy(int,int)
  PolyLineFigure.basicMoveBy(int,int)
  RectangleFigure.basicMoveBy(int,int)
  RoundRectangleFigure.basicMoveBy(int,int)
  TextFigure.moveBy(int,int)

CH.ifa.draw.standard:
  AbstractFigure.moveBy(int,int)
  DecoratorFigure.moveBy(int,int)


“Fan-in analysis”

Identifying Aspects using Fan-in Analysis

Based on a presentation by Marius Marin, Leon Moonen & Arie van Deursen (TUDelft)


Why fan-in analysis?

Fan-in analysis can help us to identify :

• Scattered code relying on some common functionality, e.g., persistence

• Tangled code, needed in various places, e.g. logging

• Some design patterns generate high fan-in values, e.g. Observer, Visitor

[Figure: Write to a storable output in JHotDraw. Labels: write(StorableOutput) implementations; StorableOutput.writeStorable(Storable); calls.]


Identification steps

1. Automatic computation of the fan-in metric for all methods in the analysed application
2. Filtering the results
   • Methods with fan-in < 10
   • Getters & setters (name get*/set* and returns/sets a reference)
   • Utility methods
3. Largely manual analysis
   • Call sites
   • Naming conventions used
   • Implementation
   • Comments in source code

Eclipse plug-in
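The filtering step can be sketched as a simple pass over precomputed fan-in values. This is an illustrative sketch, not the Eclipse plug-in: the accessor check looks at the name prefix only (it does not verify that the method returns or sets a reference), the set of utility methods is assumed to be supplied by hand, and the method names and fan-in values in main are invented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FanInFilter {
    static final int THRESHOLD = 10;  // methods with fan-in below this are dropped

    // Name-based approximation of the getter/setter filter.
    static boolean looksLikeAccessor(String simpleName) {
        return simpleName.startsWith("get") || simpleName.startsWith("set");
    }

    // fanIn: "Class.method" -> fan-in value; utilityMethods: names to exclude.
    static List<String> candidateSeeds(Map<String, Integer> fanIn, Set<String> utilityMethods) {
        List<String> seeds = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : fanIn.entrySet()) {
            String qualified = entry.getKey();
            String simple = qualified.substring(qualified.lastIndexOf('.') + 1);
            if (entry.getValue() < THRESHOLD) continue;       // low fan-in
            if (looksLikeAccessor(simple)) continue;          // getter or setter
            if (utilityMethods.contains(qualified)) continue; // utility method
            seeds.add(qualified);
        }
        Collections.sort(seeds);
        return seeds;  // what remains still needs manual inspection
    }

    public static void main(String[] args) {
        Map<String, Integer> fanIn = Map.of(  // hypothetical values
            "Figure.changed", 25,
            "Figure.getAttribute", 40,
            "Geom.distance", 15,
            "Tool.activate", 3);
        System.out.println(candidateSeeds(fanIn, Set.of("Geom.distance")));
    }
}
```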


Case study : JHotDraw

• Threshold fan-in: 10
• 7% of total # methods kept
• other filters removed another 50%
  • getters / setters
  • utility methods
• 52% of remaining methods were manually classified as aspect seeds


Case study : JHotDraw

Concern type | # | Seed’s description
Consistent behavior | 4 | Methods implementing the consistent behavior shared by different callers, such as checking/refreshing figures/views affected by executing a command.
Contract enforcement | 4 | Method implementing a contract that needs to be enforced, such as checking the reference to the editor’s active view before executing a command.
Undo | 1 | Methods checking whether a command is undoable/redoable + undo method in the superclass, which is invoked from the overriding methods in subclasses.
Persistence & resurrection | 1 | Methods implementing functionality common to persistent elements, such as read/write operations for primitive type wrappers (like Double, Integer) which are referenced by the scattered implementations of persistence/resurrection.
Command design pattern | 1 | The execute method in the command classes and command constructors.
Observer design pattern | 1 | The observers’ manipulation methods and notify methods in classes acting as subject.
Composite design pattern | 2 | The composite’s methods for manipulating child components, such as adding a new child.
Decorator design pattern | 1 | Methods in the decorator that pass the calls on to the decorated components.
Adapter design pattern | 1 | Methods that manipulate the reference from the adapter (Handle) to the adaptee (Figure).


Comparing the techniques

• Case study : JHotDraw

• Qualitative comparison of identified aspects
  • Identified aspects: discovered / discarded / missed
  • Quality and level of detail of discovered information
  • Weaknesses and limitations of each of the techniques
  • Complementarity and opportunities for combination


Results of the comparison (1)

• A selection of detected concerns in JHotDraw

Concern                                      Fan-in analysis  Identifier analysis  Dynamic analysis
Observer                                     +                +                    +
Consistent Behavior / Contract Enforcement   +                -                    -
Command Execution                            +                +                    -
Bring to front / Send to back                -                -                    +
Manage Handles                               -                +                    +
Move Figures                                 + (discarded)    +                    +


Results of the comparison (2)

• Observer, Undo, Persistence
  • concerns reported by all 3 techniques
  • correspond to well-known aspects or functionality for which AOP is a natural solution
  • surprisingly few concerns were detected by all techniques
    => need for a combined technique
• Bring to front / Send to back
  • not detected by fan-in analysis (fan-in value too low)
  • not detected by identifier analysis (#elements < threshold)
  • detected by dynamic analysis (corresponds to a specific use-case scenario)


Results of the comparison (3)

• Contract Enforcement / Consistent Behavior
  • E.g., common functionality for checking preconditions
  • Found by fan-in analysis because there are many calls to ‘check’ methods
  • Identifier analysis misses cases where there is no common naming scheme
  • Also missed by dynamic analysis
• Command execution
  • could be seen as a particular case of Contract Enforcement
  • all execute methods need to check that an active view exists
  • found by identifier analysis (and fan-in analysis)


Results of the comparison (4)

• Manage Handles
  • Partly detected by identifier analysis
    • methods with identifier ‘handle’ appearing in their name
    • missed specific methods north(), south(), east(), west()
  • Partly detected by dynamic analysis
    • detected specific methods
    • and some (not all) of the handle methods
  • Missed by fan-in analysis: calls too specific
    • similar but distinct calls instead of one single called method with high fan-in


Limitations of the techniques

• Identifier analysis
  • fails in the absence of good naming conventions
  • too many and too detailed results
    => better grouping / filtering needed
• Dynamic analysis
  • fails for functionality present in all execution traces
  • completeness depends on coverage by scenarios
• Fan-in analysis
  • only finds crosscutting with a large extent
  • false negatives due to filtering
• All require quite some manual work


Combining the techniques

• Techniques rely on orthogonal properties
  • suggests possibility of useful combinations, to
• Increase coverage
  • by taking the union of discovered results (fan-in + dynamic)
• Complete the discovered aspect “seeds”
  • with more methods relevant to the aspect (<= identifier analysis)
• Provide more coarse-grained aspects
  • e.g., grouping of identifier analysis concepts (<= fan-in / dynamic)
• Discard irrelevant concepts


Future work

• More detailed comparison of quality of discovered aspects
• Comparison with other aspect mining techniques
• Multi-technique tool
  • fan-in, FCA, clone detection, slicing, ...
  • to obtain a higher degree of automation
  • and better quality of results
• JHotDraw as common benchmark
• Aspect refactoring


Aspect mining...

• is a promising new research area

• can be (partly) automated with fairly simple techniques and combinations thereof

• is only the first step...

But most of all... it’s fun :-)
