Mining Version Histories to Guide Software Changes
Thomas Zimmermann, Peter Weißgerber, Stephan Diehl, Andreas Zeller


TRANSCRIPT

Page 1

Mining Version Histories to Guide Software Changes
Thomas Zimmermann, Peter Weißgerber, Stephan Diehl, Andreas Zeller

Page 2

“In this paper, we apply data mining to version histories: 'Programmers who changed these functions also changed....' Just like the Amazon.com feature helps the customer browsing along related items, our ROSE tool guides the programmer along related changes...”

Page 3

Agenda

ROSE Overview
CVS to ROSE
Data Analysis
Evaluation
Paper Critique

Page 4

ROSE Overview

Aims: Suggest and predict likely changes. Suppose a programmer has just made a change. What else does she have to change?

Prevent errors due to incomplete changes. If a programmer wants to commit changes, but has missed a related change, ROSE issues a warning.

Detect coupling undetectable by program analysis. As ROSE operates exclusively on the version history, it is able to detect coupling between items that cannot be detected by program analysis.

Page 5

ROSE Overview (2)

Page 6

CVS to ROSE

ROSE works in terms of changes to entities, ex: changes to directories, files, classes, methods, variables.

Every entity is a triple (c, i, p), where c is the syntactic category, i is the identifier, and p is the parent entity:

ex: (method, initDefaults(), (class, Comp, ...))

Every change is expressed using one of the predicates alter(e), add_to(e), del_from(e).

Each transaction from CVS is converted to a list of such changes.
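
To make the notation concrete, here is a minimal Python sketch (the class names and the sample transaction are invented for illustration; this is not ROSE's actual code) of entities as (category, identifier, parent) triples and of one CVS transaction as a list of change predicates:

    from dataclasses import dataclass
    from typing import List, Optional

    # Illustrative data model only -- not ROSE's implementation.

    @dataclass(frozen=True)
    class Entity:
        category: str                 # syntactic category, e.g. "method", "class", "file"
        identifier: str               # e.g. "initDefaults()"
        parent: Optional["Entity"]    # enclosing entity, or None at the root

    @dataclass(frozen=True)
    class Change:
        predicate: str                # one of "alter", "add_to", "del_from"
        entity: Entity

    Transaction = List[Change]        # one CVS commit = all changes checked in together

    comp = Entity("class", "Comp", None)
    fkeys = Entity("field", "fKeys[]", comp)
    init_defaults = Entity("method", "initDefaults()", comp)   # (method, initDefaults(), (class, Comp, ...))

    tx: Transaction = [Change("alter", fkeys), Change("alter", init_defaults)]
    print(tx)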

Page 7

Data Analysis

ROSE aims to mine rules from those alterations, ex: alter(field, fKeys[], ...) is possibly followed by:
alter(method, initDefaults(), ...)
alter(file, plug.properties, ...)

The strength of a rule is measured by two values: Support count. The number of transactions the rule has been derived from.

Confidence. The fraction of transactions containing the antecedent that also contain the consequent.

ex: suppose fKeys[] was altered in 11 transactions. 10 of those also alter()'ed initDefaults() and plug.properties. 10 is the support count, and 10/11 (or 0.909) is the confidence.
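
Both measures are easy to compute over a transaction history. The sketch below (a toy illustration, not ROSE's implementation) reproduces the fKeys[] example, treating each transaction as a set of changed entities:

    from typing import FrozenSet, List

    ItemSet = FrozenSet[str]

    def support_count(history: List[ItemSet], antecedent: ItemSet, consequent: ItemSet) -> int:
        # Number of transactions that contain both sides of the rule.
        return sum(1 for tx in history if antecedent <= tx and consequent <= tx)

    def confidence(history: List[ItemSet], antecedent: ItemSet, consequent: ItemSet) -> float:
        # Among transactions containing the antecedent, the fraction that
        # also contain the consequent.
        with_antecedent = [tx for tx in history if antecedent <= tx]
        if not with_antecedent:
            return 0.0
        return sum(1 for tx in with_antecedent if consequent <= tx) / len(with_antecedent)

    # Toy history matching the slide's numbers: fKeys[] altered in 11 transactions,
    # 10 of which also touch initDefaults() and plug.properties.
    history = [frozenset({"fKeys[]", "initDefaults()", "plug.properties"})] * 10 \
            + [frozenset({"fKeys[]"})]

    rule_lhs = frozenset({"fKeys[]"})
    rule_rhs = frozenset({"initDefaults()", "plug.properties"})
    print(support_count(history, rule_lhs, rule_rhs))          # 10
    print(round(confidence(history, rule_lhs, rule_rhs), 3))   # 0.909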

Page 8

Data Analysis (2)

Other features: add_to() and del_from() allow an abstraction from the name of an added entity to the name of the surrounding entity.

The notion of entities allows mining at varying granularities. Fine-granular mining. For source code of C-like languages, alter() is used for fields, functions, etc.; add_to() is used for file entities.

Coarse-granular mining. Regardless of file type, only alter() is used for file entities; add_to() and del_from() capture when a file has been added or deleted.

Coarse-granular rules have higher support counts and usually return more results. However, they are less precise in location and are of limited use for guiding programmers.
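
As a toy illustration of the difference (the file and function names below are invented), the same commit yields several fine-granular changes but only one coarse-granular change:

    # Hypothetical commit that edits two functions inside the same file.
    edited = [("fkeys.c", "set_keys()"), ("fkeys.c", "init_defaults()")]

    # Fine-granular: one alter() per function entity, with the file as its parent.
    fine = [f"alter(function, {func}, (file, {path}, ...))" for path, func in edited]

    # Coarse-granular: only the file entity is recorded, so both edits collapse
    # into a single change -- which is why coarse rules gain support count but
    # lose precision about where to change.
    coarse = sorted({f"alter(file, {path}, ...)" for path, _ in edited})

    print(fine)     # two fine-granular changes
    print(coarse)   # ['alter(file, fkeys.c, ...)']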

Page 9

Evaluation

Usage Scenarios: Navigation through source code. Given a change, can ROSE point to other entities that should typically be changed too?

Error prevention. If a programmer has changed many entities but missed a related one, does ROSE find the missing one?

Closure. Once a transaction is finished, how often does ROSE erroneously suggest (as in the error-prevention scenario) that a change is still missing?

Evaluation on eight large open-source projects: ECLIPSE, GCC, GIMP, JBOSS, JEDIT, KOFFICE, POSTGRES, PYTHON

Page 10

Evaluation (2)

Summary: One can have precise suggestions or many suggestions, but not both.

When given an initial item, ROSE makes predictions in 66 percent of all queries. On average, the predictions of ROSE contain 33 percent of all items changed later in the same transaction. For those queries for which ROSE makes recommendations, a correct location is within ROSE's topmost three suggestions in 70 percent of the cases.

In 3 percent of the queries where one item is missing, ROSE issues a correct warning. An issued warning predicts on average 75 percent of the items that need to be considered.

ROSE's warnings about missing items should be taken seriously: Only 2 percent of all transactions cause a false alarm. In other words: ROSE does not stand in the way.

ROSE has its best predictive power for changes to existing entities. ROSE learns quickly: a few weeks after a project starts, it already makes useful suggestions.

Page 11

Critique

Likes: The tool was applied to and evaluated on eight projects, and conclusions were drawn that take their varying natures into account.

It is relevant to our assignment, so it was easy to follow.

Dislikes: There is research value, but there is reason to be skeptical that the recall of such tools will reach practical levels (for the navigation scenario). Intuitively, recommendations might break things if followed blindly, regardless of whether each individual recommendation is correct. That is, there is little practical value if the recommendations are incomplete, which is more likely for exactly the complex applications where such guidance matters most.

I still don't know what ROSE stands for. :p