Empirical Results on Cloning and Clone Detection
Stefan Wagner @prof_wagnerst
Alpen-Adria-Universität Klagenfurt, 1 June 2015
www.uni-stuttgart.de
You can copy, share and change, film and photograph, blog, live-blog and tweet this presentation given that you attribute it to its author and respect the rights and licences of its parts.
Based on templates by @SMEasterbrook and @ethanwhite
Technische Universität München
Class A Class B
Often 20%–30% redundancy
We need to detect and remove clones reliably and automatically.
Types of Clones
Type 1 an exact copy without modifications (except for whitespace and comments)
Type 2 a syntactically identical copy; only variable, type, or function identifiers have been changed
Type 3 a copy with further modifications; statements have been changed, added, or removed
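The three clone types can be illustrated with a small sketch. It is not the detector used in the studies: it assumes Python source, uses the standard `tokenize` module, and the function names and the similarity threshold of 0.7 for type 3 are hypothetical choices made for illustration only.

```python
import difflib
import io
import keyword
import tokenize

IGNORED = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}
LAYOUT = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}


def token_stream(source, normalise_identifiers=False):
    """Tokenise Python source, dropping comments and blank lines."""
    stream = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in IGNORED:
            continue
        if tok.type in LAYOUT:
            # Keep only the kind of layout token, not the exact whitespace.
            stream.append(tokenize.tok_name[tok.type])
        elif (normalise_identifiers and tok.type == tokenize.NAME
              and not keyword.iskeyword(tok.string)):
            stream.append("ID")  # every identifier becomes the same placeholder
        else:
            stream.append(tok.string)
    return stream


def clone_type(a, b, threshold=0.7):
    """Return 1, 2 or 3 for the clone type, or None if the fragments differ too much."""
    if token_stream(a) == token_stream(b):
        return 1
    norm_a, norm_b = token_stream(a, True), token_stream(b, True)
    if norm_a == norm_b:
        return 2
    # Assumed similarity measure for type 3; the threshold is arbitrary.
    similarity = difflib.SequenceMatcher(None, norm_a, norm_b).ratio()
    return 3 if similarity >= threshold else None


original = "total = 0\nfor x in items:\n    total += x\n"
commented = "total = 0  # sum\nfor x in items:\n    total += x\n"
renamed = "acc = 0\nfor v in values:\n    acc += v\n"
extended = "acc = 0\nfor v in values:\n    if v > 0:\n        acc += v\n"

print(clone_type(original, commented))  # type 1
print(clone_type(original, renamed))    # type 2
print(clone_type(original, extended))   # type 3
```

Normalising identifiers before comparing is what turns a type-2 clone into an exact match; type-3 clones need some notion of similarity on top of that.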
Clone detection: Processing steps
Storage → load → tokenise & normalise → find duplicates → extract clones → visualise
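The processing steps can be sketched in a few lines. This is a minimal line-based illustration under simplifying assumptions, not the tool behind the slides: files are given as lists of lines, comments are assumed to start with `#`, and the names `normalise`, `find_clone_groups` and the minimal clone length `min_len` are hypothetical.

```python
from collections import defaultdict


def normalise(lines):
    """Return (line_number, text) pairs without comments, blank lines or extra spaces."""
    statements = []
    for number, line in enumerate(lines, start=1):
        # Naive comment removal: assumes '#' never occurs inside a string literal.
        text = " ".join(line.split("#", 1)[0].split())
        if text:
            statements.append((number, text))
    return statements


def find_clone_groups(files, min_len=3):
    """Group the start positions of identical windows of min_len normalised lines."""
    index = defaultdict(list)
    for name, lines in files.items():
        statements = normalise(lines)
        for i in range(len(statements) - min_len + 1):
            window = tuple(text for _, text in statements[i:i + min_len])
            index[window].append((name, statements[i][0]))
    # A clone group needs at least two instances.
    return [locations for locations in index.values() if len(locations) > 1]


files = {
    "a.py": ["x = 1", "y = x + 1", "print(y)", "z = 0"],
    "b.py": ["# copy", "x = 1", "y  =  x + 1", "print(y)   # show"],
}

for group in find_clone_groups(files):
    print(group)
```

The sketch reports every matching window separately. A real detector would merge overlapping windows into maximal clones and work on tokens or statements rather than lines.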
Measures for cloning
• Number of clone groups/clone instances
• Size of largest clone/cardinality of most frequent clone
• Cloned Statements: number of statements in the system being part of at least one clone
• Clone Coverage
– #Cloned Statements / #Statements
– Probability of a randomly chosen statement to be part of a clone
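Clone coverage follows directly from these definitions. The sketch below assumes clone instances are given as inclusive statement ranges per file; the data layout and the example numbers are made up for illustration.

```python
def clone_coverage(clone_instances, total_statements):
    """Return (#cloned statements, clone coverage).

    clone_instances: iterable of (file, first_statement, last_statement), inclusive.
    """
    cloned = set()
    for file, first, last in clone_instances:
        # A statement counts once, even if it is part of several clones.
        cloned.update((file, i) for i in range(first, last + 1))
    return len(cloned), len(cloned) / total_statements


# Hypothetical system with 20 statements and three clone instances.
instances = [("A.java", 1, 5), ("A.java", 4, 8), ("B.java", 1, 5)]
cloned, coverage = clone_coverage(instances, total_statements=20)
print(cloned, coverage)  # 13 0.65
```

Using a set makes overlapping clone instances harmless: statements 4 and 5 of `A.java` appear in two instances but are counted once.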
Visualisation of clone detection results
Compare View (~20 LOC)
Seesoft View (~400 LOC)
Tree Maps (>1,000,000 LOC)
Trends over Time
1 Code Clones
Inconsistencies
Can you spot the difference?
How problematic are these inconsistencies (and clones)?
Indicating harmfulness
[Lague97]: inconsistent evolution of clones in industrial telecom. SW.
[Monden02]: higher revision number for files with clones in legacy SW.
[Kim05]: substantial amount of coupled changes to code clones.
[Li06], [SuChiu07] and [Aversano07], [Bakota07]: discovery of bugs through search for inconsistent clones or clone evolution analysis.
Doubting harmfulness
[Krinke07]: inconsistent clones hardly ever become consistent later.
[Geiger06]: failure to statistically verify impact of clones on change couplings.
[Lozano08]: failure to statistically verify impact of clones on changeability.
[Göde11]: most changes intentionally inconsistent.
[Rahman12]: no statistically significant impacts on faults.
Our First Study at ICSE 2009
• Manual inspection of inconsistent clones by system developers
No indirect measures of consequences of cloning
• Both industrial and open source software analysed
• Quantitative data
Deissenboeck, Juergens, Hummel, Wagner, ICSE, 2009
Research Questions
RQ1: Are clones changed inconsistently?
|IC| / |C|
RQ2: Are inconsistent clones created unintentionally?
|UIC| / |IC|
RQ3: Can inconsistent clones be indicators for faults in real systems?
|F| / |IC|, |F| / |UIC|
Clone groups C (exact and incons.)
Inconsistent clone groups IC
Unintentionally incons. clone groups UIC
Faulty clone groups F
Study Design
Clone group candidate detection → CC
• Novel algorithm
• Tailored to target program
False positive removal → C, IC
• Manual inspection of all inconsistent and ¼ exact CCs
• Performed by researchers
Assessment of inconsistencies → UIC, F
• All inconsistent clone groups inspected
• Performed by developers
Tool detected clone group candidates CC
Clone groups C (exact and incons.)
Inconsistent clone groups IC
Unintentionally inconsistent clone groups UIC
Faulty clone groups F
Study Objects
Munich Re: International reinsurance company, 37,000 employees
LV 1871: Munich-based life-insurance company, 400 employees
Sysiphus: Open source collaboration environment for distributed SW development. Developed at TUM.
System    Organization  Language  Age (years)  Size (kLoC)
A         Munich Re     C#        6            317
B         Munich Re     C#        4            454
C         Munich Re     C#        2            495
D         LV 1871       Cobol     17           197
Sysiphus  TUM           Java      8            281
Results
Project                 A    B    C    D    Sys.  Sum
Clone groups |C|        286  160  326  352  303   1427
Inconsistent CGs |IC|   159  89   179  151  146   724
Unint. incons. |UIC|    51   29   66   15   42    203
Faulty CGs |F|          19   18   42   5    23    107
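The ratios behind RQ1 to RQ3 can be recomputed from the Sum column of the table. The short calculation below uses only those four totals; the dictionary layout is just one convenient way to write it down.

```python
# Totals over all five systems, taken from the Sum column of the results table.
counts = {"C": 1427, "IC": 724, "UIC": 203, "F": 107}

ratios = {
    "RQ1 |IC| / |C|": counts["IC"] / counts["C"],
    "RQ2 |UIC| / |IC|": counts["UIC"] / counts["IC"],
    "RQ3 |F| / |IC|": counts["F"] / counts["IC"],
    "RQ3 |F| / |UIC|": counts["F"] / counts["UIC"],
}

for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

In words: about half of the clone groups are inconsistent, roughly 28 % of the inconsistent groups are unintentionally so, and about 15 % of the inconsistent and 53 % of the unintentionally inconsistent groups contain a fault.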
Threats to Validity
Threat → Mitigation
Construct
• Analysis of latest version instead of evolution → All inconsistencies of interest, independent of creation time.
Internal
• Developer review error → Conservative strategy only makes positive answers harder
• Clone detector configuration → Validated during pre-study
External
• System selection not random (impact on transferability) → 5 different dev. organisations, 3 different languages, technically different
Our Second Study
• Investigating evolution of type-3 clones
• Relationship with documented faults from issue tracker
• Industrial systems
Research Questions
RQ1: Do software systems contain type-3 clones?
|CT3| / |C|
RQ2: Do type-3 clones contain documented faults?
|CT3F| / |CT3|
RQ3: Are developers aware of type-3 clones?
|IMS| / |IM|, |Cx| / |CT3F|, |CT2F → CT3NF|
Clone groups C (exact and incons.)
Inconsistent clone groups CT3
Faulty clone groups CT3F
Data Collection and Analysis
Tool Support
Quality Model Editor, HTML Dashboard
Code, documentation, inspection and test results
v1 v2 v3
Extract
Analyse
Query for relationships and evolution
Extract
Study Objects
The results of the SQL queries are required for the analysis of the inconsistent clones for faults.
After executing all the SQL queries, all the data is ready for analysis. To determine the faulty code in one version, the version of a file in which an error was found is recorded. As the repository of a project is at an older version, the pull function in Mercurial adds the new changes to the revision history; each change is incorporated with a commit message and thus obtains a ChangesetID. In Tortoise, it is possible to see the revision history of each file. Thus, for each file of an inconsistent clone class, the entire revision history is considered and the development is checked. Furthermore, it can be determined whether an error was corrected during development in the inconsistent clone files. Conclusions about the faultiness of the inconsistent clone classes can therefore be drawn from these retrieved results.
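The per-file history inspection described above was done manually in Tortoise. As a rough sketch of how the same information could be retrieved with the Mercurial command line, the following uses `hg log` with a template. The helper names, the tab-separated output layout and the issue-ID pattern `#\d+` are assumptions for illustration and not part of the study.

```python
import re
import subprocess

# One changeset per line: short ChangesetID, ISO date, first line of the commit message.
TEMPLATE = "{node|short}\t{date|isodate}\t{desc|firstline}\n"


def hg_log_command(path):
    """Build the hg command that lists the revision history of one file."""
    return ["hg", "log", "--follow", "--template", TEMPLATE, path]


def parse_hg_log(output):
    """Turn the templated hg output into a list of dictionaries."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        changeset, date, message = line.split("\t", 2)
        entries.append({"changeset": changeset, "date": date, "message": message})
    return entries


def file_history(path, repo="."):
    """Run hg in the given repository (requires Mercurial to be installed)."""
    result = subprocess.run(hg_log_command(path), cwd=repo,
                            capture_output=True, text=True, check=True)
    return parse_hg_log(result.stdout)


def changes_mentioning_issues(entries, issue_pattern=r"#\d+"):
    """Keep changesets whose message references an issue ID (pattern is hypothetical)."""
    return [e for e in entries if re.search(issue_pattern, e["message"])]


sample = ("a1b2c3d4e5f6\t2014-03-01 10:15 +0100\tFix #42: wrong unit conversion\n"
          "0f9e8d7c6b5a\t2014-02-11 09:00 +0100\tRefactor sensor handling\n")
history = parse_hg_log(sample)
print(changes_mentioning_issues(history))
```

Only `parse_hg_log` and `changes_mentioning_issues` are exercised on the made-up sample; `file_history` needs a real repository to run.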
H. Validity Procedure
1) Construct validity: The development history of the three systems was analyzed to determine whether the inconsistent clones were introduced by changes to a system. The problem is that the code fragments were inserted by copying and modifying in a single commit. Therefore, the entire revision history of the industrial systems was manually processed to check all changes to a code fragment. Another threat to construct validity is that only the clone cases with a bug in the issue-tracking system were used as the basis for faulty code fragments in each system.
Other threats to construct validity should be added here
2) Internal validity:
3) External Validity:
IV. RESULTS (ASIM, STEFAN)
A. Case Description
TABLE I. SUMMARY OF THE STUDY OBJECTS
System  Domain      Lang.  Size (KLOC)  Revision  Age (Years)  Developers
A       Automotive  Java   253          2470      4            10
B       Automotive  Java   332          1622      5            5
C       Automotive  Java   454          2181      4            10
B. Share of Type-3 Clones (RQ 1)
Table III contains the quantitative results for all research questions in detail. We found a mean share of type-3 clones in all clones and all three systems of 52 %. Yet, it varied quite strongly from 23 % in system B to 79 % in system C. Nevertheless, in all three systems, there is a considerable share of type-3 clones and, hence, it is useful to investigate their relationship with faults.
Should we include the liberal/conservative detection approach stuff here?
Answer to RQ 1: On average, every second clone class is a type-3 clone class. Therefore, they are a substantial part of all clones.
TABLE IV. SUMMARY OF RESULTS
Project                                        A     B     C     Total
Clone classes |C|                              37    88    82    207
Type-3 clone classes |CT3|                     21    21    65    107
RQ 1: |CT3|/|C|                                0.56  0.23  0.79  0.52
Faulty clone classes |CF|                      16    5     37    58
Faulty type-3 clone classes |CT3F|             7     1     2     10
RQ 2: |CT3F|/|CT3|                             0.33  0.05  0.03  0.17
Type-3 clones |I|                              46    43    146   235
Modified type-3 clones |IM|                    24    19    67    110
Simultaneously modified type-3 clones |IMS|    14    17    62    93
RQ 3.1: |IMS|/|IM|                             0.58  0.89  0.92  0.85
Fixed type-3 clone classes |CX|                4     1     0     5
RQ 3.2: |CX|/|CT3F|                            0.57  1.00  0     0.5
Faulty type-2 clone classes |CT2F|             9     4     35    48
Non-faulty type-3 clone classes |CT3NF|        14    20    63    97
|CT2F → CT3NF|                                 9     4     35    48
RQ 3.3: |CT2F → CT3NF|/|CT2F|                  1     1     1     1
Mean length of type-3 clones                   60    62    78
Mean length of faulty type-3 clones            50    39    83
RQ 4:                                          ?     ?     ?     ?
C. Type-3 Clones with Documented Faults (RQ 2)
We found documented faults in 58 clone classes overall. Ten of these were type-3 clone classes. Interestingly, while system C had the most faulty clone classes (37), system A had the most faulty type-3 clone classes (7). Hence, in our small sample the relationship from faulty clone classes to faulty type-3 clone classes is not linear. This could be an indication that some other factors, such as the developers' awareness of clones, play a role.
The ratio of faulty type-3 clone classes in relation to all type-3 clone classes is on average 17 %. Because of the discussed imbalance between faulty clone classes and faulty type-3 clone classes, this ratio varies strongly from 3 % in system C to 33 % in system A. Again, this could be an indication that the developers of systems B and C were more aware of the clones and, hence, introduced fewer inconsistencies which represent faults.
Potentially, the ratio of faulty type-3 clone classes could be higher, because we only analysed documented faults. There still might be several faults in the code not detected so far (as observed by Juergens et al. [2]). Yet, we wanted to concentrate on faults that had an actual effect and led to failures.
Answer to RQ 2: On average, 17 % of all type-3 clone classes contained a documented fault. The range is from 3 % to 33 %. Therefore, type-3 clones do contain documented faults but not a high ratio of them. Developer awareness of the clones might play a role in the variance of the results.
D. Developers’ Awareness of Type-3 Clones (RQ 3)
In RQ 1, we established that type-3 clones are interesting to investigate. In RQ 2, we found that type-3 clones contain documented faults as well as an indication that developer awareness of these clones plays a role. Therefore, we investigate this awareness more closely. We analysed three different indications of this awareness which we discuss in the following.
Conclusions
• About half of all clone classes are type-3 clones.
• Rate of faulty type-3 clones is about 17 %.
• There is a high awareness of clones and inconsistencies.
• This awareness seems to impact how many faults are related to type-3 clones.
• Further studies should take this into account.
• Making developers aware of clones seems still to be worthwhile.
2 Model Clones
Why not analyse generated code?
_1 = In * I; _2 = In * P; _3 = In * D; _4 = _1 + I-Delay; _5 = _3 - D-Delay; Out = _4 + _2 +_5; I-Delay = _4; D-Delay = _3;
Clone Detection Pipeline
Simulink Files → Simulink Parser → Simulink Models → Normalisation → Flat Labeled Graph → Detection → Clone Pairs → Clustering → Clone Groups → Visualisation
Data Flow Clones
▪ Basic Criteria:
– Functionally independent (Abstracted elements)
– Reusable (Connected)
– Functionally complex (Size of functional elements)
– General (Number of instances)
▪ Additional Criteria: Relevance
– Application domain specific
– Intellectual Property
▪ Typical Examples
– Sensor Data Plausibilisation
– Error Management
Deissenboeck, Wagner et al., ICSE'08
37 % of relevant blocks are part of at least one clone group.
Conclusions
• It is possible to formulate a useful understanding of clones in models.
• Graph-based models need different detection algorithms.
• If models are used for code generation, they will contain clones similar to those in source code.
3 Requirements Clones
"Redundancy [in requirements specifications] causes good engineers to suffer and the resulting systems will probably suffer, too."
–Matthias Weber, Joachim Weisbrod
Modifiability generally requires a requirements specification to […] not be redundant.
–IEEE 830-1998
Terms
Requirements specification
“specification for a particular software product, program, or set of programs that performs certain functions in a specific environment.” [IEEE 830-1998]
Clone
• Duplicated specification text of at least 20 words
• Small differences (e.g., declension) are tolerated
• Must refer to specified system
• False positives: e.g., page footers with copyright information
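The clone definition for requirements text can be turned into a simple word-window search. The sketch below is an assumption-laden illustration, not the detector used in the study: it lowercases the text and ignores punctuation, but it does not tolerate inflection differences, and the names `words` and `cloned_windows` are hypothetical.

```python
import re
from collections import defaultdict


def words(text):
    """Lowercase word list; punctuation and layout are ignored."""
    return re.findall(r"\w+", text.lower())


def cloned_windows(specs, min_words=20):
    """Map each window of min_words words to the places where it occurs more than once."""
    index = defaultdict(set)
    for name, text in specs.items():
        tokens = words(text)
        for i in range(len(tokens) - min_words + 1):
            index[tuple(tokens[i:i + min_words])].add((name, i))
    return {window: places for window, places in index.items() if len(places) > 1}


# Two made-up specifications sharing a passage of 25 words.
shared = " ".join(f"w{i}" for i in range(25))
specs = {"spec1": "Intro. " + shared + " End.", "spec2": shared.upper()}

clones = cloned_windows(specs)
print(len(clones))  # number of distinct 20-word windows that occur twice
```

A shared passage of 25 words contains six different windows of 20 words, so six entries are reported; merging adjacent windows would yield one clone of 25 words.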
Research questions
1. How much cloning do real-world requirements specifications contain?
2. What kind of information is cloned in requirements specifications?
3. What consequences does cloning in requirements specifications have?
4. Can cloning in requirements specifications be detected accurately using existing clone detectors?
Study design
Random assignment of specifications
→ Detection tool execution
→ Inspection of detected clones
→ False positives? Yes: adding of filters, then detection tool execution again. No: categorisation of clones
→ Categorisation of clones
→ Independent re-categorisation
→ Analysis of corresp. source code
→ Data analysis & interpretation
Adding of filters
• Regular expressions
• Removal of clones
• Improvement in precision
• Categorisation of the types of false positives
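Filtering with regular expressions can be pictured as follows. The patterns are invented examples of typical false-positive sources such as page footers with copyright notices; the actual filters used in the study are not given in the slides.

```python
import re

# Hypothetical filters; each one describes text that should not count as a clone.
FILTERS = [
    re.compile(r"copyright|©", re.IGNORECASE),
    re.compile(r"^\s*page \d+ of \d+\s*$", re.IGNORECASE),
]


def apply_filters(lines, filters=FILTERS):
    """Drop every line that matches at least one filter before clone detection runs."""
    return [line for line in lines if not any(f.search(line) for f in filters)]


spec_lines = [
    "The system shall store all contracts.",
    "Copyright 2009 Example Corp. All rights reserved.",
    "Page 3 of 120",
    "The system shall log every change.",
]

print(apply_filters(spec_lines))
```

After adding a filter, the detector is run again and the remaining clone groups are inspected, which is how precision improves step by step.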
Categorisation of clones
• Qualitative analysis: content analysis
• Sample is categorised
• Mix of theory-based and Grounded Theory
• 4+8 categories
• Documentation of additional information (mostly inconsistencies between clones)
Independent re-categorisation
• 2 raters
• Sample: 5 specifications
• Sample: 5 clone groups
• Analysis of inter rater agreement
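The slides do not say which agreement measure was used. One common choice for two raters is Cohen's kappa, sketched here as an assumption; the category labels in the example are taken from the clone categories, but the ratings themselves are made up.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters who each assign one category per item."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must rate the same, non-empty set of items")
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Agreement expected by chance, from each rater's category frequencies.
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


rater1 = ["Use case step", "Use case step", "Reference", "Reference"]
rater2 = ["Use case step", "Use case step", "Reference", "Use case step"]
print(cohens_kappa(rater1, rater2))  # 0.5
```

Here the raters agree on three of four items (0.75), chance agreement is 0.5, and kappa is therefore (0.75 - 0.5) / (1 - 0.5) = 0.5.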
Study objects
• 28 specifications
• 11 organisations
• 8,667 pages
• over 1.2 million words
• English & German
Domains: automotive, avionics, finance, telecommunication, transport
Typical Clones
• Entire use cases copied
• Similar combinations of pre and post conditions copied
• Descriptions of terms or roles copied
Example*
42 instances (61 words, 13 instances with > 100 words)
*Translated from German
“The contracts with the clients describe the conditions regarding obligatory liabilities that the clients have agreed on with X. The liabilities are calculated from the exposures from Y and the contract conditions from X. The liability-relevant parts of the contracts thus need to be managed in system Z.”
“The contracts with the clients describe the conditions regarding obligatory liabilities that the clients have agreed on with X. The liabilities are calculated from the exposures from Y and the contract conditions from X. The liability-relevant parts of the contracts thus need to be managed in system Z.”
…
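1. How much cloning do real-world requirements specifications contain?
[Bar chart: clone coverage in percent for each of the 28 specifications, labelled H, F, A, G, Y, Z, L, C, K, U, X, AB, V, B, D, N, AC, I, P, W, O, S, M, J, E, R, Q, T]
Values in ascending order: 0, 0, 0.7, 0.9, 1, 1.2, 1.6, 1.9, 2, 5.4, 5.5, 5.8, 8.1, 8.2, 8.9, 11.2, 12.1, 12.4, 15.5, 18.1, 18.5, 19.6, 20.5, 21.9, 22.1, 35, 51.1, 71.6
Mean 13.6%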
2. What kind of information is cloned?
Percentage of clones, more than one category possible
Use case step            24
Reference                15
UI                       15
Domain knowledge         14
Interface description    13
Precondition             10
Side condition           7
Configuration            6
Feature                  5
Techn. domain knowledge  3
Postcondition            3
Rationale                1
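3. What consequences does cloning have?
[Bar chart: additional effort in hours per inspector for each specification, labelled AB, H, L, A, Y, B, V, N, U, F, AC, D, C, Z, G, X, K, W, M, S, I, P, O, E, R, J, Q, T]
Values in ascending order: 0, 0, 0, 0.1, 0.3, 0.3, 0.3, 0.3, 0.4, 0.5, 0.6, 1.2, 2.1, 2.8, 2.9, 3.2, 4.1, 4.2, 4.8, 7, 8.2, 10.3, 11.1, 12.7, 17, 17.5, 18.5, 36.7
Mean 6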
Modification
• Multiple inconsistent specification clones identified
• Differences suspected to be unintentional
⇒ Indication that inconsistent updates happen in practice
Implementation
Traced specification clone groups to implementation. 3 cases:
• Shared abstraction
• Cloned code
• Independent reimplementation of similar functionality
⇒ Indication that spec. cloning causes redundancy in implementation
4. Can cloning be detected accurately using existing clone detectors?
[Bar chart: precision in percent before and after tailoring for each specification, labelled E, F, G, J, N, S, W, Z, Y, X, I, V, B, L, AC, P, C, M, R, AB, A, O, D, H, K, U]
After tailoring: 85, 96, 97 and 99 for four specifications, 100 for all others. Before tailoring, precision is considerably lower for many specifications.
Threats to validity
Internal
• Pairs of researchers to reduce errors during manual steps
• Reading speeds for cloned vs non-cloned text? Assumed similar. Further research required
• Recall unclear. But: does not affect study results
External
• Substantial differences between requirements specifications (format, organisation, language, …)
But: large number of study objects from different companies, domains
Conclusion
Lessons Learned
• Many specs contain cloning
• Negative impact on reading and inspection effort
• Indication for corresponding redundancy in source code
• Cloning not necessary – many specs contain none
• Tailoring required but feasible: effort small w.r.t. inspection overhead
Future Work
• How can cloning be avoided or removed?
• What are the causes for cloning? Different than for code clones?
• Further studies on consequences for implementation
Outlook
• Other artefacts - test cases
• Effects and costs of cloning
• Functionally similar code detector
We need to detect and remove clones reliably and automatically.
Pictures Used in this Slide Deck
“Mercurial Logo” by Mackall (http://www.selenic.com/hg-logo/)