![Page 1: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/1.jpg)
Source Code Comprehension on Evolving Software: A Literature Survey
Yida Tao
Supervisor: Sunghun Kim
![Page 2: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/2.jpg)
Motivation
Code Change Comprehension
Tao et al., FSE’12
Code change comprehension is
• Frequently required
• In major development activities, in particular the code-review process
References
• How do software engineers understand code changes? An exploratory study in industry. Tao et al., FSE’12
• Expectations, outcomes, and challenges of modern code review. Bacchelli and Bird, ICSE’13
Bacchelli & Bird, ICSE’13
• “…review and understand code they have not seen before may be more common than a developer working on new code”
• “From interviews, no other code review challenge emerged as clearly as understanding the submitted change”
![Page 3: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/3.jpg)
Outline
Code Change Comprehension
• Program Differencing: describing code changes
• Code Change Summarization: explaining code changes
• Querying and Filtering: customization
![Page 4: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/4.jpg)
Program Differencing
• Text Differencing
• Syntactic Differencing
• Semantic Differencing
![Page 5: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/5.jpg)
Text Differencing
• Flat representation of a program
  • Sequence of strings
• Unix diff
  • Only outputs added/deleted lines; cannot detect modified lines
  • Hard to determine when a code fragment is moved upward or downward
• Ldiff (Canfora et al., ICSE’09)
  • An enhanced line differencing tool
• Limitations
  • Changes to *characters*
  • No syntactic-structure information
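The line-based behavior described above can be illustrated with a small sketch. It uses Python's standard `difflib` module as a stand-in for Unix diff (an assumption made for illustration; the two code fragments are invented). A one-character edit to a loop condition is reported as one deleted line plus one added line, with no notion of "modified".

```python
import difflib

# Two versions of a fragment, each treated as a flat sequence of strings.
old = [
    "int total = 0;",
    "for (int i = 0; i < n; i++) {",
    "    total += a[i];",
    "}",
]
new = [
    "int total = 0;",
    "for (int i = 0; i <= n; i++) {",
    "    total += a[i];",
    "}",
]

# n=0 drops context lines; the first two output lines are the file headers.
diff = list(difflib.unified_diff(old, new, lineterm="", n=0))
body = diff[2:]

removed = [line[1:] for line in body if line.startswith("-")]
added = [line[1:] for line in body if line.startswith("+")]

print("deleted:", removed)
print("added:  ", added)
```

The diff says nothing about which token changed or about the enclosing loop construct, which is the limitation the slide points out.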
![Page 6: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/6.jpg)
Syntactic Differencing
• Structured representation of a program
  • Abstract syntax tree; XML
• ChangeDistiller (Fluri et al., TSE’07)
  • Tree differencing
  • Node: bigram string similarity
  • Control structure: subtree similarity
  • Output: tree edit script (insert, delete, move, update)
• XML differencing
  • srcML (Maletic & Collard, ICSM’04): embeds abstract syntax and structure within the source code
  • diffX (Al-Ekram et al., CASCON ’05)
• Limitations
  • Cannot describe how the behavior of a program is changed
  • Still reports differences for behavior-preserving changes
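As a rough illustration of the "bigram string similarity" used to match nodes, here is a minimal sketch of a Dice coefficient over character bigrams. The function names and the threshold value are assumptions chosen for this example, not taken from the ChangeDistiller implementation.

```python
def bigrams(text):
    """Return the list of adjacent character pairs in text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def bigram_similarity(a, b):
    """Dice coefficient over character bigrams, a value in [0, 1]."""
    ba, bb = bigrams(a), bigrams(b)
    if not ba and not bb:
        # Strings too short to have bigrams: fall back to exact equality.
        return 1.0 if a == b else 0.0
    remaining = list(bb)
    shared = 0
    for gram in ba:
        if gram in remaining:
            remaining.remove(gram)  # each bigram of b is matched at most once
            shared += 1
    return 2.0 * shared / (len(ba) + len(bb))


# Hypothetical matching threshold, for illustration only.
THRESHOLD = 0.6

old_leaf = "i < n"
new_leaf = "i <= n"
is_match = bigram_similarity(old_leaf, new_leaf) >= THRESHOLD
print(is_match)
```

Two leaves that pass the threshold would be treated as the same node with an update, rather than as an unrelated delete and insert.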
![Page 7: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/7.jpg)
Semantic Differencing
• Semantic diff (Jackson and Ladd, ICSM’94)
  • Method-level
  • Variable dependencies comparison
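The idea of comparing variable dependencies can be sketched as follows. This is a simplified illustration, not the Jackson and Ladd algorithm: it assumes straight-line code made only of simple assignments, uses Python's `ast` module, and the three code fragments are invented.

```python
import ast


def dependencies(source):
    """Map each assigned variable to the set of input variables it depends on.

    Assumption: the source is straight-line code made of `name = expression`
    statements only.
    """
    deps = {}
    for stmt in ast.parse(source).body:
        if not (isinstance(stmt, ast.Assign)
                and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            raise ValueError("only simple assignments are supported")
        used = {n.id for n in ast.walk(stmt.value) if isinstance(n, ast.Name)}
        resolved = set()
        for name in used:
            # A previously assigned variable is replaced by its own inputs.
            resolved |= deps.get(name, {name})
        deps[stmt.targets[0].id] = resolved
    return deps


old = "t = a + b\nresult = t * c\n"
refactored = "result = (a + b) * c\n"      # temporary inlined, same behavior
changed = "t = a + b\nresult = t * d\n"    # result now depends on d

print(dependencies(old)["result"] == dependencies(refactored)["result"])
print(dependencies(old)["result"] == dependencies(changed)["result"])
```

A text or syntax diff reports a change in both cases; comparing the dependencies of `result` flags only the second one.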
![Page 8: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/8.jpg)
Semantic Differencing (cont.)
• JDiff (Apiwattanapong et al., ASE’04, ’06)
  • Extended control-flow graph (ECFG)
  • Dynamic binding, class hierarchy, exception handling, etc.
![Page 9: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/9.jpg)
Semantic Differencing (cont.)
• Differential symbolic execution (Person et al., FSE’08)
  • “Executing” a program using symbolic values
![Page 10: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/10.jpg)
Outline
Program Differencing
Text Differencing
Syntactic differencing
Semantic differencing
Code Change Comprehension
Code Change Summarization
Explaining code changes
Querying and Filtering
Customization
![Page 11: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/11.jpg)
Code Change Summarization
• LSdiff (Kim and Notkin, ICSE’09)
  • Group related changes
  • Detect potential inconsistencies in a code change
![Page 12: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/12.jpg)
Code Change Summarization (cont.)
• DeltaDoc (Buse and Weimer, ASE’10)
  • Symbolic execution: obtain path predicates for each statement in both versions
  • Identify statements that are added, deleted, or have a changed predicate
  • Summarization
![Page 13: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/13.jpg)
Code Change Summarization (cont.)
• Multi-document summarization (Rastkar and Murphy, ICSE’13)
  • Linking evolutionary documents (commit logs, issue tracking entries)
  • Finding the most informative sentences to extract to form a summary
    • Similarity between a sentence and the title of the enclosing document
    • Overlap between a sentence and the adjacent document
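The two sentence features listed above can be sketched with a simple word-overlap score. This is an illustration only: Jaccard overlap and an unweighted sum are swapped in as assumptions, and the issue title, commit log, and sentences are invented; it is not the scoring used by Rastkar and Murphy.

```python
import re


def words(text):
    """Lower-cased set of alphanumeric words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a, b):
    """Jaccard overlap between the word sets of two texts."""
    wa, wb = words(a), words(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def score(sentence, title, adjacent_doc):
    """Sum of the two features: title similarity and adjacent-document overlap."""
    return overlap(sentence, title) + overlap(sentence, adjacent_doc)


def most_informative(sentences, title, adjacent_doc):
    """Pick the sentence with the highest combined score."""
    return max(sentences, key=lambda s: score(s, title, adjacent_doc))


# Hypothetical issue report (enclosing document) linked to a commit log.
issue_title = "Crash when saving chart with null dataset"
commit_log = "Fix null dataset crash in chart save"
issue_sentences = [
    "Thanks for the quick reply.",
    "Saving a chart crashes when the dataset is null.",
    "I will attach a screenshot later.",
]

print(most_informative(issue_sentences, issue_title, commit_log))
```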
![Page 14: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/14.jpg)
Code Change Summarization (cont.)
• Challenges
  • Evolutionary documents
    • Linkage might not be found (Bachmann et al., FSE’10; Wu et al., FSE’11)
    • Human-written documents may be unavailable or uninformative (Buse and Weimer, ASE’10; Tao et al., FSE’12)
  • Automatically generated documents
    • Verbosity
    • Uninteresting changes are identified, e.g., “all types that declared toString() added constructors” (Kim and Notkin, ICSE’09)
(Figure labels: LSdiff, DeltaDoc)
![Page 15: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/15.jpg)
Outline
Program Differencing
Text Differencing
Syntactic differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Code Change Comprehension
Querying and Filtering
Customization
![Page 16: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/16.jpg)
Querying and Filtering
• Specifying and detecting meaningful changes (Yu et al., ASE’11)
  • Normalize the program (user-specified) before differencing
  • Non-trivial to construct the query
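A minimal sketch of "normalize before differencing", assuming Python sources and one user-chosen normalization (dropping docstrings). Parsing with the standard `ast` module already discards comments, layout, and redundant parentheses. The function names are hypothetical, and this is not the tool by Yu et al.

```python
import ast


def normalize(source):
    """Parse source, drop docstrings, and return a canonical dump of the tree."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Module, ast.FunctionDef, ast.ClassDef)):
            body = node.body
            if (body
                    and isinstance(body[0], ast.Expr)
                    and isinstance(body[0].value, ast.Constant)
                    and isinstance(body[0].value.value, str)):
                # Keep the body non-empty so the tree stays well-formed.
                node.body = body[1:] or [ast.Pass()]
    return ast.dump(tree)  # line/column attributes are excluded by default


def meaningfully_changed(old_source, new_source):
    """True if the two versions still differ after normalization."""
    return normalize(old_source) != normalize(new_source)


v1 = "def area(w, h):\n    return w * h\n"
v2 = 'def area(w, h):\n    """Rectangle area."""\n    # reformatted\n    return (w * h)\n'
v3 = "def area(w, h):\n    return w + h\n"

print(meaningfully_changed(v1, v2))
print(meaningfully_changed(v1, v3))
```

Even in this small form, the user has to decide and encode what counts as "not meaningful", which hints at why constructing the query is non-trivial.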
![Page 17: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/17.jpg)
Querying and Filtering (cont.)
• Filtering non-essential changes (Kawrykow and Robillard, ICSE’11)
  • Non-essential changes: rename-induced modifications, local variable extraction, trivial keyword modification, whitespace and documentation updates
  • ChangeDistiller (Fluri et al., TSE’07) + Partial program analysis (Dagenais and Robillard, ICSE’08)
  • Goal: improving mining and recommendation accuracy instead of developers’ comprehension
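Two of the non-essential categories above, whitespace updates and rename-induced modifications, can be sketched with a token-level comparison. This is a simplified illustration with a hypothetical function name and a hand-supplied rename map; the approach cited on the slide works on syntax trees with partial program analysis rather than on raw tokens.

```python
import re


def is_non_essential(old_line, new_line, renames):
    """True if the lines differ only by whitespace or by the given renames.

    `renames` maps old identifiers to new ones, e.g. {"cnt": "count"}.
    """
    def canonical(line, mapping):
        # Tokens are identifier/number runs or single punctuation characters;
        # whitespace is dropped entirely.
        tokens = re.findall(r"\w+|[^\w\s]", line)
        return [mapping.get(token, token) for token in tokens]

    return canonical(old_line, renames) == canonical(new_line, {})


renames = {"cnt": "count"}

print(is_non_essential("total = cnt+1;", "total = count + 1;", renames))
print(is_non_essential("total = cnt + 1;", "total = count + 2;", renames))
```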
![Page 18: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/18.jpg)
Outline
Program Differencing
Text Differencing
Syntactic differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Code Change Comprehension
![Page 19: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/19.jpg)
Research Directions
Program Differencing
Text Differencing
Syntactic differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Source Code Changes
Work-item-based changes?
![Page 20: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/20.jpg)
Work-item-based Changes
• Multiple work-items in a single code change (e.g., a bug fix + code cleanup + a new feature)
  • Very difficult to understand (Tao et al., FSE’12)
(Figure: JFreeChart revision 1083, annotated with: trivial keyword removal, bug fix, formatting)
![Page 21: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/21.jpg)
Work-item-based Change Detection
• Multiple work-items in a single code change (e.g., a bug fix + code cleanup + a new feature)
  • Very difficult to understand (Tao et al., FSE’12)
  • Change decomposition
    • Program slicing (entity dependencies)
    • Pattern matching (similarities)
• A single work-item spreads across multiple code changes (e.g., 5 changes to finally fix a bug completely)
  • Change aggregation
    • Linkage to the same issue
    • Heuristics like time duration, commit authors, program dependencies, etc.
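The "linkage to the same issue" heuristic for change aggregation could look roughly like the sketch below. The issue-key format, the function name, and the commit data are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical issue-key format such as "PROJ-12"; real trackers differ.
ISSUE_KEY = re.compile(r"\b[A-Z]+-\d+\b")


def aggregate_by_issue(commits):
    """Group commit ids by the issue keys mentioned in their log messages.

    `commits` is an iterable of (commit_id, message) pairs in history order.
    """
    groups = defaultdict(list)
    for commit_id, message in commits:
        for key in sorted(set(ISSUE_KEY.findall(message))):
            groups[key].append(commit_id)
    return dict(groups)


# Invented history: the fix for PROJ-12 is spread over two commits.
history = [
    ("r101", "PROJ-12: fix null check"),
    ("r102", "Formatting only"),
    ("r105", "PROJ-12: handle empty dataset too"),
    ("r107", "PROJ-30 add export option"),
]

print(aggregate_by_issue(history))
```

Commits without an issue key, such as `r102`, are left ungrouped; this is where the other heuristics on the slide (time duration, commit authors, program dependencies) would be needed.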
![Page 22: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/22.jpg)
Research Directions
Program Differencing
Text Differencing
Syntax differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Code Change Comprehension
Work-item change detection
Change decomposition
Change aggregation
![Page 23: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/23.jpg)
Research Directions
Program Differencing
Text Differencing
Syntax differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Work-item-specific changes
Code Change Comprehension
Work-item change detection
Change decomposition
Change aggregation
![Page 24: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/24.jpg)
Research Directions
Program Differencing
Text Differencing
Syntax differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Work-item-specific changes
Code Change Comprehension
Concrete Execution
Work-item change detection
Change decomposition
Change aggregation
![Page 25: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/25.jpg)
Explaining code changes with executions of co-changed test cases
• Test cases
  • Best documentation for source code
• Test cases co-changed with source code
  • Documentation for code changes?
  • Mostly synchronous co-evolution of production and test code (Zaidman et al., Empirical Software Engineering’11)
• Differential test executions
  • Co-changed test cases T
  • Executing T on the old version P and the new version P’
  • Comparing the executions to explain changed behaviors
From StackExchange: http://programmers.stackexchange.com/questions/154439/quality-of-code-in-unit-tests?newsletter=1&nlcode=67628%7c1a35
• “Unit tests are one of the best sources of documentation for your system, and arguably the most reliable form”
• “Unit tests are often the first thing you look at when trying to grasp what some piece of code does”
• “They can also serve as a starting point for people new to the code base”
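The differential test execution idea (run the co-changed tests T on the old version P and the new version P’, then compare) can be sketched as below. Everything here is hypothetical: the two `mean` functions stand in for P and P’, and only return values and exception types are compared, which is far coarser than comparing full executions.

```python
def run_tests(program, tests):
    """Run each test input on program; record the result or the exception type."""
    outcomes = {}
    for name, args in tests.items():
        try:
            outcomes[name] = ("ok", program(*args))
        except Exception as exc:
            outcomes[name] = ("raised", type(exc).__name__)
    return outcomes


def differential_run(old_program, new_program, tests):
    """Return the tests whose observable outcome differs between the versions."""
    old = run_tests(old_program, tests)
    new = run_tests(new_program, tests)
    return {name: (old[name], new[name]) for name in tests if old[name] != new[name]}


def mean_old(values):  # stands in for the old version P
    return sum(values) / len(values)


def mean_new(values):  # stands in for the new version P'
    return sum(values) / len(values) if values else 0.0


# Stand-in for the co-changed test cases T: test name -> argument tuple.
co_changed_tests = {
    "test_typical": ([2, 4],),
    "test_empty": ([],),
}

behavior_diff = differential_run(mean_old, mean_new, co_changed_tests)
print(behavior_diff)
```

The one differing test shows what the change did: empty input used to raise an error and now yields a default value.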
![Page 26: Source code comprehension on evolving software](https://reader033.vdocument.in/reader033/viewer/2022052901/55658b2fd8b42a2b6d8b4c5c/html5/thumbnails/26.jpg)
Research Directions
Program Differencing
Text Differencing
Syntax differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Work-item-specific changes
Code Change Comprehension
Concrete Execution
• Co-changed test cases
• Differential test execution
Work-item change detection
Change decomposition
Change aggregation