1
Tracing regression bugs. Presented by Dor Nir.
2
Outline Problem introduction. Regression bug – definition. Industry tools. Proposed solution. Experimental results. Future work.
3
Micronose corp. A big company founded by Nataly Noseman. 1998 – version 1 of Nosepad (great success).
4
Nosepad version 1

class Nosepad
{
  bool bDirty;
  void AddNose(){
    ...
    bDirty = true;
  }
  void DeleteNose(){
    ...
    bDirty = true;
  }
  bool IsDirty(){
    return bDirty;
  }
  void Save(){
    ...
    bDirty = false;
  }
  void Exit(){
    if(IsDirty()) {
      if(MsgBox("Save your noses?"))
        Save();
    }
    CloseWindow();
  }
}
5
Nosepad version 2. New features require an Undo/Redo mechanism, …
Micronose expanding: promotions, new recruitment.
6
Undo/Redo Design. Undo stack – each operation is pushed onto the undo stack. Redo stack – when an operation is undone, it moves to the redo stack.
[Diagram: key "a" and key "b" operations moving between the Undo and Redo stacks.]
7
Nosepad version 2

class Nosepad
{
  …
  Stack undoStack;
  Stack redoStack;
  void Undo(){
    undoStack.Top().Operate(false);
    redoStack.Push(undoStack.Pop());
  }
  void Redo(){
    redoStack.Top().Operate(true);
    undoStack.Push(redoStack.Pop());
  }
  void AddNose(){
    ...
    undoStack.Push(AddNoseOp);
    redoStack.Clear();
  }
  void DeleteNose(){
    ...
    undoStack.Push(DelNoseOp);
    redoStack.Clear();
  }
  ...
}
9
Nosepad version 2 correction

class Nosepad
{
  …
  Stack undoStack;
  Stack redoStack;
  void Undo(){
    undoStack.Top().Operate(false);
    redoStack.Push(undoStack.Pop());
  }
  void Redo(){
    redoStack.Top().Operate(true);
    undoStack.Push(redoStack.Pop());
  }
  void AddNose(){
    ...
    undoStack.Push(AddNoseOp);
    redoStack.Clear();
  }
  void DeleteNose(){
    ...
    undoStack.Push(DelNoseOp);
    redoStack.Clear();
  }
  bool IsDirty(){
    return bDirty && !undoStack.IsEmpty();
  }
}
11
Regression bug observations. The second bug is a regression bug: the same test succeeds on version 1 and fails on version 2. The specifications for version 1 haven't changed in version 2 (only additions).
12
Version 1 specifications: 1. X  2. Y  3. Z
Version 2 specifications: 1. X  2. Y  3. Z  4. A  5. B
Changes in code that break X, Y, or Z → regression bug. Changes that break only A or B → a bug, but no regression.
13
Regression bug definition. Regression bug – changes in existing code that change the behavior of the application so that it no longer meets a specification it previously met.
14
How to avoid regression bugs? Prevent inserting regression bugs into the code: simple design, programming language, good programming methodology (test-driven development, code review, communication).
Find regression bugs before the release of the product: extensive testing, white-box \ black-box testing.
16
Where is it? What was the cause of the regression bug? What was the change that caused it?
18
Problem definition. Given a checkpoint C that failed and the source code S of the AUT (application under test), we want to find the places (changes) in S that cause C to fail. We want to do this independently of the source-code language or technology. We know that at a time T (prior to the failure) the checkpoint passed.
Output: the relevant changes P1, P2, …, Pn.
20
Solution 1 – map
[Diagram: tests ("Check text in message box", ""SELECT NAMES from Table1" is not empty", "File t.xml was created successfully") mapped to the source code they exercise (Windows.cpp, errorMessages.cpp, File.cpp, IO.cs, C:\code\files, DB project).]
21
Solution 1: much work has to be done for each new test; maintenance is hard; we end up with a lot of code to analyze. Could use automatic tools (profilers).
22
Solution 2
[Diagram: the same tests ("Check text in message box", ""SELECT NAMES from Table1" is not empty", "File t.xml was created successfully") mapped to the source code (Windows.cpp, errorMessages.cpp, File.cpp, IO.cs, C:\code\files, DB project) – but only to the changes.]
23
Source Control. Version-control tool. Database of source code. Check-in \ check-out operations. History of versions. Differences between versions. Very common in software development. Currently on the market: VSS, StarTeam, ClearCase, CVS, and many more.
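The "differences between versions" operation the tool needs from source control can be sketched with Python's standard difflib (a stand-in for the version-control tool's own diff):

```python
import difflib

def changed_lines(old_version, new_version):
    """Return the lines added (+) or removed (-) between two versions of a file."""
    diff = difflib.unified_diff(old_version.splitlines(),
                                new_version.splitlines(), lineterm="")
    # Keep only the actual +/- change lines, dropping the "---"/"+++" headers.
    return [line for line in diff
            if line[:1] in "+-" and not line.startswith(("+++", "---"))]
```

These extracted lines are exactly the "changes" that the heuristics below rank against a checkpoint.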
24
Check point to code tool
[Diagram – finding a regression bug. First phase: input – a checkpoint and the source code; a failed checkpoint is detected. Second phase: the source control tool supplies Change A, Change B, …; heuristics rank them. Output: relevant changes – 1. Change X  2. Change Y  3. Change Z  …]
25
Heuristics (second phase). Rank changes. Each heuristic gets a different weight. Two kinds of heuristics: technology dependent and non-technology dependent.
26
Non-technology heuristics. Do not depend on the technology of the code. Textually driven; no semantics.
27
Code lines affinity
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".
28
Check-in comment affinity
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".
Check-in comment: Go to the next waiter when the next-item event is raised.
29
File affinity
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".
Words histogram in file Clerk.cpp:
Waiter – 186
Waiters – 15
Next – 26
Number – 174
…
30
File name affinity
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".
ClerkDlg.cpp
31
More possible non-technology heuristics. Programmers' history: reliable vs. "prone to error" programmers; experience in the specific module. Time of change: late at night; close to the release deadline.
32
Technology heuristics Depend on the source code language. Take advantage of known keywords. Use the semantics.
33
Function\Class\Namespace affinity
Check point: Select "clerk 1" from the clerk tree (clerk number 2). Go to the next clerk. The next clerk is "clerk 3".
34
Code complexity. Nesting depth, number of branches. The first snippet is more complex than the second:

if(b1 && b2)
{
  if(c2 && d1)
    c1 = true;
  else
  {
    if((c2 && d2) || e1)
      c1 = false;
  }
}

>

if(b1 && b2 && c2 && d1)
  c1 = true;
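One way to make this heuristic concrete is a crude complexity proxy that counts branch tokens and brace-nesting depth (an illustrative sketch, not the talk's actual metric):

```python
def complexity(code):
    """Crude complexity proxy: branch tokens plus maximum brace-nesting depth."""
    branch_tokens = ("if", "else", "for", "while", "case", "&&", "||")
    # Pad parentheses so "if(b1" tokenizes as "if", "(", "b1".
    tokens = code.replace("(", " ( ").replace(")", " ) ").split()
    branches = sum(tokens.count(t) for t in branch_tokens)
    depth = max_depth = 0
    for ch in code:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return branches + max_depth
```

Applied to the two snippets above, the nested version scores higher, matching the ">" comparison on the slide.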
36
Words affinity problem (cont.)
Affinity({red, flower, white, black, cloud}, {rain, green, red, coat}) > Affinity({red, flower, white, black, cloud}, {train, table, love})
38
How can we measure affinity? "Vector space model of information retrieval" – Wong S.K.M., Raghavan: similarity of documents. "Improving web search results using affinity graph" – Benyu Zhang, Hua Li, Lei Ji, Wensi Xi, Weiguo Fan: similarity of documents; diversity vs. information richness of documents.
39
Affinity definition. Synonym(a) – the group of words that are synonyms of a or similar in meaning to a.
Synonym(choose) = {chosen, picked out, choice, superior, prime, discriminating, choosy, picky, select, selection}
40
Words affinity definition (cont.)

ShallowAffinity(a, b) = 1 if a == b, 0 otherwise.
Affinity(a, b) = 1 if a == b, otherwise ShallowAffinity(Synonym(a), Synonym(b)).

Affinity of groups of words, for A = {a1, a2, a3, …, an} and B = {b1, b2, b3, …, bm}:

ShallowAffinity(A, B) = ( Σ i=1..n, j=1..m ShallowAffinity(ai, bj) ) / ( |A| · |B| )

AsymmetricAffinity(A, B) = ( Σ i=1..n max{ Affinity(ai, b1), …, Affinity(ai, bm) } ) / |A|

Affinity(A, B) = ( AsymmetricAffinity(A, B) + AsymmetricAffinity(B, A) ) / 2
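The group-affinity definitions can be sketched directly in Python. The synonym table here is a toy stand-in for WordNet, and word-level affinity is collapsed to a 0/1 indicator (equal words, or overlapping synonym sets) rather than the fractional synonym-set overlap of the full definition:

```python
# Toy synonym table standing in for WordNet (the real tool queries WordNet).
SYNONYMS = {
    "choose": {"choose", "select", "pick"},
    "select": {"choose", "select", "pick"},
    "pick":   {"choose", "select", "pick"},
    "clerk":  {"clerk", "waiter"},
    "waiter": {"clerk", "waiter"},
}

def synonym(word):
    return SYNONYMS.get(word, {word})

def affinity(a, b):
    """Word affinity: 1 if equal or the synonym sets intersect, else 0."""
    return 1.0 if a == b or synonym(a) & synonym(b) else 0.0

def asymmetric_affinity(A, B):
    # For each word of A take its best match in B, then average over A.
    return sum(max(affinity(a, b) for b in B) for a in A) / len(A)

def group_affinity(A, B):
    """Symmetrized group affinity, as in the slides' definition."""
    return (asymmetric_affinity(A, B) + asymmetric_affinity(B, A)) / 2
```

For example, the checkpoint words {clerk, next} score high against code words {waiter, next, table} because "clerk" and "waiter" are synonyms.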
42
Using affinity in the tool

Rank(C, P) = Affinity(Words(C), Words(P))

Words(C) = the group of words in the description of the checkpoint C.
Words(P) = the group of words in the source code / check-in / file, etc.
43
Using affinity in heuristics. Code line affinity:
Words(P, L) = the group of words in the source code located L lines from the change P. β – a coefficient that gives different weight to lines inside the change.
Check-in comment affinity:

Rank2(C, P) = Affinity(Words(C), Words(checkin(P)))
44
Using affinity in heuristics (cont.). File affinity: P is a change in file F with histogram map.

Rank3(C, P) = FileRank(C, F)
FileRank(C, F) = HstgrmAffinity(Words(C), Words(F), Hstgrm(F))
HstgrmAffinity(A, B, map) = ( Σ i=1..n max{ Affinity(ai, b1), …, Affinity(ai, bm) } · map[ai] ) / ( Σ i=1..n map[ai] )
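A sketch of the histogram-weighted affinity (the transcript's formula is partly garbled, so this assumes the histogram weights the words of the file, i.e. A is the file's word list; the synonym table is again a WordNet stand-in):

```python
# Minimal 0/1 word affinity with a toy synonym table (WordNet stand-in).
SYNONYMS = {"clerk": {"waiter"}, "waiter": {"clerk"}}

def affinity(a, b):
    return 1.0 if a == b or b in SYNONYMS.get(a, ()) else 0.0

def histogram_affinity(A, B, hist):
    """Each word of A is weighted by its histogram count; best match taken in B."""
    weights = [hist.get(a, 0) for a in A]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * max(affinity(a, b) for b in B)
               for a, w in zip(A, weights)) / total
```

With the Clerk.cpp histogram from the earlier slide, frequent words like "waiter" (186 occurrences) dominate the score against the checkpoint's words.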
45
Using affinity in heuristics (cont.). File name affinity:

Rank4(C, P) = Affinity(Words(C), Words(FileName(P)))

Code elements affinity:

Rank5(C, P) = 1/2 · Affinity(Words(C), Words(FunctionName(P)))
            + 3/8 · Affinity(Words(C), Words(ClassName(P)))
            + 1/8 · Affinity(Words(C), Words(Namespace(P)))
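To see how the 1/2, 3/8, 1/8 weighting of function, class, and namespace names plays out, here is a sketch; `group_aff` uses plain Jaccard overlap as a simple stand-in for the group Affinity:

```python
def group_aff(A, B):
    """Jaccard overlap as a simple stand-in for the slides' group Affinity."""
    A, B = set(A), set(B)
    return len(A & B) / len(A | B) if A | B else 0.0

def rank5(cp_words, function_words, class_words, namespace_words):
    """Code-element affinity: 1/2 function name + 3/8 class name + 1/8 namespace."""
    return (0.5   * group_aff(cp_words, function_words)
          + 0.375 * group_aff(cp_words, class_words)
          + 0.125 * group_aff(cp_words, namespace_words))
```

The weighting reflects the intuition that a matching function name is stronger evidence than a matching class, which in turn is stronger than a matching namespace.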
46
Algorithm
Input: C – checkpoint. T – the last time checkpoint C passed.
1. Get the latest version of the source code for C from the source control tool.
2. Get file versions from the source control tool one by one until the version check-in time is earlier than T. For each file version:
   1. Get the change between the two sequential versions.
   2. Analyze and rank the change with respect to the checkpoint C (Rank(C, P)).
   3. Add the rank to the DB.
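The steps above can be sketched as a single loop; `file_versions` is a hypothetical stand-in for the source-control history API, and `rank` is any Rank(C, P) heuristic:

```python
def trace_regression(cp_words, last_pass_time, file_versions, rank):
    """Rank every change checked in after last_pass_time against the checkpoint.

    file_versions: list of (checkin_time, old_text, new_text), newest first
    (a stand-in for the source-control history API); rank: a Rank(C, P) function.
    """
    ranked = []
    for checkin_time, old_text, new_text in file_versions:
        if checkin_time < last_pass_time:
            break  # older than the last time the checkpoint passed
        # The change P: lines present in the new version but not in the old one.
        change = [l for l in new_text.splitlines()
                  if l not in old_text.splitlines()]
        ranked.append((rank(cp_words, change), change))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked
```

The sorted result corresponds to the tool's output list of relevant changes.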
47
Observations. Ranks produced by different heuristics are not directly comparable: Ranki(C, P1) vs. Rankj(C, P2) for i ≠ j and P1 ≠ P2. Better affinity → better results. The project is not always in a valid state.
49
WordNet. Developed at Princeton University. A large lexical database of English. English nouns, verbs, adjectives, and adverbs are organized into synonym sets, each representing one underlying lexicalized concept. Different relations link the synonym sets.
53
Experimental Results. Source code: C++, MFC framework. 891 files in 29 folders. 3 million lines of code. 3984 check-ins.
55
Challenges
Time: cache; filtering by one heuristic.
Word equality: source-code vocabulary (example – m_CountItemInTable); additional synonyms (Clerk ≈ Waiter).
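The vocabulary challenge is that identifiers like m_CountItemInTable are not dictionary words; a small splitter can break them into words before computing affinity (an illustrative sketch):

```python
import re

def split_identifier(name):
    """Split a source-code identifier into lowercase dictionary words,
    handling underscores and CamelCase (e.g. m_CountItemInTable)."""
    words = []
    for part in re.split(r"[_\W]+", name):
        # Match CamelCase runs, all-caps acronyms, and digit groups.
        words += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part)
    return [w.lower() for w in words]
```

The resulting words ("count", "item", "table", …) can then be looked up in WordNet like ordinary text.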
56
Future work. Add more heuristics. Learning mechanism – automatic tuning of heuristic weights. Why? Finding out more about the sources of regression bugs: bad programmer, deadline, technology, design.