Transferring Software Testing and Analytics Tools to Practice
Tao Xie, University of Illinois at Urbana-Champaign
Part of the research work described in this talk was done in collaboration with the Pex team and Software Analytics Group @Microsoft Research, students @Illinois ASE, and other collaborators
Main Types of Impact
• Research impact: inspiring/impactful ideas/directions/subareas… for researchers
  – Example: model checking
• Practice impact: practice adoption of tools/systems/technologies… for practitioners
  – Some examples discussed in this talk
• Societal impact: inspiring/impactful ideas/thinking/awareness… for the general public
  – Example: computational thinking, privacy, medical-device security, MOOCs, …
Research Dissemination
Publishing research results → technologies there adopted by companies, e.g.,
• ICSE 00 Daikon paper by Ernst et al. → Agitar Agitator
  https://homes.cs.washington.edu/~mernst/pubs/invariants-relevance-icse2000.pdf
• ASE 04 Rostra paper by Xie et al. → Parasoft Jtest improvement
  http://taoxie.cs.illinois.edu/publications/ase04.pdf
• PLDI/FSE 05 DART/CUTE papers by Sen et al. → MSR SAGE, Pex
  http://srl.cs.berkeley.edu/~ksen/papers/dart.pdf
  http://srl.cs.berkeley.edu/~ksen/papers/C159-sen.pdf
• …
Research Commercialization
Commercializing research results in startup → tools/products used by companies, e.g.,
• Reactis®
• …
Industrial Lab Tech Transfer
Transferring research results to product groups → tools/products used inside company or outside, e.g.,
• SAGE
• Flash Fill
• XIAO
• StackMine
• SAS
• CloudBuild
• Tools for Software Engineers
• Fakes
• …
Tool Building Contributed by Academic/Industrial Communities
Release open source infrastructures or libraries to engage academic/industrial communities to use and contribute, e.g.,
▪ MPI/PETSc by Bill Gropp et al.
▪ Charm++ by Laxmikant (Sanjay) Kale et al.
▪ LLVM by Vikram Adve, Chris Lattner, et al.
  “The openness of the LLVM technology and the quality of its architecture and engineering design are key factors in understanding the success it has had both in academia and industry.”
KLEE? JPF? FindBugs? Shipshape? Soot? WALA? …
"Are Automated Debugging [Research] Techniques Actually Helping Programmers?" 50 years of automated debugging research
Of N papers, only 5 were evaluated with actual programmers [Parnin & Orso ISSTA’11]
http://dl.acm.org/citation.cfm?id=2001445
(Automated) Test Generation
• Human: expensive, incomplete, …
• Brute force: pairwise, predefined data, etc…
• Tool automation!!
State-of-the-Art/Practice Test Generation Tools
Running Symbolic PathFinder ...
…
====================================================== results
no errors detected
====================================================== statistics
elapsed time: 0:00:02
states: new=4, visited=0, backtracked=4, end=2
search: maxDepth=3, constraints=0
choice generators: thread=1, data=2
heap: gc=3, new=271, free=22
instructions: 2875
max memory: 81MB
loaded code: classes=71, methods=884
…
Successful Case of MSR Testing Tool: Pex & Relatives
• Pex (released in May 2008)
  – Shipped with Visual Studio 2015 as IntelliTest
  – 30,388 downloads (20 months, Feb 08-Oct 09)
  – 22,466 downloads (10 months, Apr 13-Jan 14): Code Digger
  – Active user community: 1,436 forum posts during ~3 years (Oct 08-Nov 11)
• Moles (released in Sept 2009)
  – Shipped with Visual Studio 2012 as Fakes
  – “Provide Microsoft Fakes w/ all Visual Studio editions” got 1,457 community votes
VisualStudio.UserVoice
https://visualstudio.uservoice.com/forums/121579-visual-studio-2015/suggestions/6773265-make-intellitest-available-to-visual-studio-profes
https://visualstudio.uservoice.com/forums/121579-visual-studio-2015/suggestions/2216195-include-pex-and-moles-with-all-visual-studio-editi
Pex4Fun
1,753,594 clicked 'Ask Pex!'
http://pex4fun.com/
Code Hunt: Redesigned as Game
https://www.codehunt.com/
2014-2015 Beauty of Programming Contest
Code Hunt can identify top coders
http://programming2015.cstnet.cn/
Behind the Scene of Code Hunt
Secret Implementation
class Secret {
  public static int Puzzle(int x) {
    if (x <= 0) return 1;
    return x * Puzzle(x-1);
  }
}

Player Implementation
class Player {
  public static int Puzzle(int x) {
    return x;
  }
}

class Test {
  public static void Driver(int x) {
    if (Secret.Puzzle(x) != Player.Puzzle(x))
      throw new Exception("Mismatch");
  }
}

behavior: Secret Impl == Player Impl
Example User Feedback on Pex4Fun
“It really got me *excited*. The part that got me most is about spreading interest in teaching CS: I do think that it’s REALLY great for teaching | learning!”
“I used to love the first person shooters and the satisfaction of blowing away a whole team of Noobies playing Rainbow Six, but this is far more fun.”
“I’m afraid I’ll have to constrain myself to spend just an hour or so a day on this really exciting stuff, as I’m really stuffed with work.”
What Is Behind Pex
• NOT Random: cheap, fast, “It passed a thousand tests” feeling
• …But Dynamic Symbolic Execution, e.g., Pex, CUTE, EXE
  – White box
  – Constraint solving
Dynamic Symbolic Execution
Code to generate inputs for:
void CoverMe(int[] a) {
  if (a == null) return;
  if (a.Length > 0)
    if (a[0] == 1234567890)
      throw new Exception("bug");
}

Loop: Choose next path → Solve → Execute & Monitor

Constraints to solve | Data | Observed constraints
(none) | null | a==null
a!=null | {} | a!=null && !(a.Length>0)
a!=null && a.Length>0 | {0} | a!=null && a.Length>0 && a[0]!=1234567890
a!=null && a.Length>0 && a[0]==1234567890 | {123…} | a!=null && a.Length>0 && a[0]==1234567890

Negated condition: each new set of constraints to solve negates the last condition of a previously observed path.
[Figure: execution tree with branch nodes a==null, a.Length>0, a[0]==123…, each with T/F edges]
Done: There is no path left.
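The four inputs on this slide can be replayed against a hand-instrumented copy of CoverMe to check that each one drives a different path. This is only a sketch of the "Execute & Monitor" step: the class name, the `Observe` method, and the `bugHit` flag are hypothetical, and no constraint solver is involved (Pex derives the inputs itself).

```csharp
using System;
using System.Collections.Generic;

public static class CoverMeReplay
{
    // Hand-instrumented copy of CoverMe (an assumption standing in for Pex's
    // monitoring): returns the branch conditions observed on the executed path.
    public static string Observe(int[] a, out bool bugHit)
    {
        bugHit = false;
        var path = new List<string>();
        if (a == null)
        {
            path.Add("a==null");
        }
        else
        {
            path.Add("a!=null");
            if (a.Length > 0)
            {
                path.Add("a.Length>0");
                if (a[0] == 1234567890)
                {
                    path.Add("a[0]==1234567890");
                    bugHit = true; // the original throws new Exception("bug") here
                }
                else
                {
                    path.Add("a[0]!=1234567890");
                }
            }
            else
            {
                path.Add("!(a.Length>0)");
            }
        }
        return string.Join(" && ", path);
    }
}
```

Calling `CoverMeReplay.Observe` with `null`, `new int[0]`, `new[] { 0 }`, and `new[] { 1234567890 }` yields the four rows of the "Observed constraints" column in order; only the last call sets `bugHit`.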
Explosion of Search Space
There are decision procedures for individual path conditions, but…
• Number of potential paths grows exponentially with number of branches
• Reachable code not known initially
• Without guidance, same loop might be unfolded forever
Fitnex search strategy [Xie et al. DSN 09]
http://taoxie.cs.illinois.edu/publications/dsn09-fitnex.pdf
DSE Example
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Test input: TestLoop(0, {0})
Path condition: !(x == 90)
↓
New path condition: (x == 90)
↓
New test input: TestLoop(90, {0})
DSE Example
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Test input: TestLoop(90, {0})
Path condition: (x == 90) && !(y[0] == 15) && !(x == 110)
↓
New path condition: (x == 90) && (y[0] == 15)
↓
New test input: TestLoop(90, {15})
Challenge in DSE
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Test input: TestLoop(90, {15})
Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110)
↓
New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110)
↓
New test input: No solution!?
A Closer Look
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Test input: TestLoop(90, {15})
Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110)
↓
New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length)
→ Expand array size
A Closer Look
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Test input: TestLoop(90, {15})
We can have infinitely many paths!
Manual analysis shows that at least 20 loop iterations are needed to cover the target branch.
Exploring all paths up to 20 loop iterations is infeasible: 2^20 paths
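To make these numbers concrete: x starts at 90 and is incremented once per array element equal to 15, so reaching x == 110 takes exactly 20 such elements, and every loop iteration has two outcomes for y[i] == 15. The sketch below checks both facts; the class name and the `Fifteens` and `BranchOutcomes` helpers are hypothetical.

```csharp
using System;
using System.Linq;

public static class TestLoopDemo
{
    // Same logic as TestLoop on the slides, made static so it can be called directly.
    public static bool TestLoop(int x, int[] y)
    {
        if (x == 90)
        {
            for (int i = 0; i < y.Length; i++)
                if (y[i] == 15)
                    x++;
            if (x == 110)
                return true;
        }
        return false;
    }

    // Hypothetical helper: an array holding n copies of 15.
    public static int[] Fifteens(int n) => Enumerable.Repeat(15, n).ToArray();

    // Two outcomes of (y[i] == 15) per iteration give 2^iterations combinations.
    public static long BranchOutcomes(int iterations) => 1L << iterations;
}
```

`TestLoopDemo.TestLoop(90, TestLoopDemo.Fifteens(20))` reaches the target branch, while 19 elements fall one increment short; `TestLoopDemo.BranchOutcomes(20)` is 1,048,576.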
Fitnex: Fitness-Guided Exploration
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Key observations: with respect to the coverage target
• not all paths are equally promising for branch-node flipping
• not all branch nodes are equally promising to flip
Our solution:
– Prefer to flip branch nodes on the most promising paths
– Prefer to flip the most promising branch nodes on paths
– Fitness function to measure “promising” extents

TestLoop(90, {15, 0})
TestLoop(90, {15, 15})
[Xie et al. DSN 2009]
http://taoxie.cs.illinois.edu/publications/dsn09-fitnex.pdf
Fitness Function
• FF computes fitness value (distance between the current state and the goal state)
• Search tries to minimize fitness value
[Tracey et al. 98, Liu et al. 05, …]
Fitness Function for (x == 110)
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Fitness function: |110 - x|
Compute Fitness Values for Paths
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Fitness function: |110 - x|

(x, y) | Fitness Value
(90, {0}) | 20
(90, {15}) | 19
(90, {15, 0}) | 19
(90, {15, 15}) | 18
(90, {15, 15, 0}) | 18
(90, {15, 15, 15}) | 17
(90, {15, 15, 15, 0}) | 17
(90, {15, 15, 15, 15}) | 16
(90, {15, 15, 15, 15, 0}) | 16
(90, {15, 15, 15, 15, 15}) | 15
…

Give preference to flipping paths with better fitness values.
We still need to address which branch node to flip on paths …
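The fitness values in this table can be reproduced by evaluating |110 - x| at the point where TestLoop checks x == 110. A minimal sketch follows; the class and method names are hypothetical, and returning `null` when the check is never reached (x != 90) is an assumption, since the slides do not define a fitness value for that case.

```csharp
using System;

public static class FitnessDemo
{
    // Runs the TestLoop logic and returns |110 - x| as observed at the
    // (x == 110) check, or null if that check is not reached.
    public static int? Fitness(int x, int[] y)
    {
        if (x != 90) return null;          // assumption: no fitness off the target path
        for (int i = 0; i < y.Length; i++)
            if (y[i] == 15)
                x++;
        return Math.Abs(110 - x);          // distance to the goal state x == 110
    }
}
```

For example, `FitnessDemo.Fitness(90, new[] { 15, 15, 0 })` evaluates to 18, matching the corresponding table row.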
Compute Fitness Gains for Branches
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Fitness function: |110 - x|
Branch b1: i < y.Length
Branch b2: i >= y.Length
Branch b3: y[i] == 15
Branch b4: y[i] != 15

(x, y) | Flipped branch | Fitness Value
(90, {0}) | | 20
(90, {15}) | flip b4 | 19
(90, {15, 0}) | flip b2 | 19
(90, {15, 15}) | flip b4 | 18
(90, {15, 15, 0}) | flip b2 | 18
(90, {15, 15, 15}) | flip b4 | 17
(90, {15, 15, 15, 0}) | flip b2 | 17
(90, {15, 15, 15, 15}) | flip b4 | 16
(90, {15, 15, 15, 15, 0}) | flip b2 | 16
(90, {15, 15, 15, 15, 15}) | flip b4 | 15
…

• Flipping branch b4 (b3) gives us average 1 (-1) fitness gain (loss)
• Flipping branch b2 (b1) gives us average 0 fitness gain (loss)
Compute Fitness Gain for Branches cont.
For a flipped node leading to F_new, find out the old fitness value F_old before flipping
• Assign Fitness Gain (F_old - F_new) for the branch of the flipped node
• Assign Fitness Gain (F_new - F_old) for the other branch of the branch of the flipped node
Compute the average fitness gain for each branch over time
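The bookkeeping described here can be sketched as a small tracker that records, for every flip, the gain F_old - F_new for the flipped node's branch and the opposite value for its sibling branch, then averages per branch. The `GainTracker` class and its method names are hypothetical; the branch names b1..b4 and the sample flips come from the preceding table.

```csharp
using System;
using System.Collections.Generic;

public sealed class GainTracker
{
    // Sibling of each branch, using the slide's names (b1/b2 and b3/b4 are pairs).
    static readonly Dictionary<string, string> Other = new Dictionary<string, string>
    {
        { "b1", "b2" }, { "b2", "b1" }, { "b3", "b4" }, { "b4", "b3" }
    };

    readonly Dictionary<string, int> sum = new Dictionary<string, int>();
    readonly Dictionary<string, int> count = new Dictionary<string, int>();

    // Record one flip of a node that took `branch`, with fitness before and after.
    public void RecordFlip(string branch, int fOld, int fNew)
    {
        Add(branch, fOld - fNew);         // gain for the branch of the flipped node
        Add(Other[branch], fNew - fOld);  // opposite gain for the other branch
    }

    void Add(string branch, int gain)
    {
        sum.TryGetValue(branch, out int s);
        count.TryGetValue(branch, out int c);
        sum[branch] = s + gain;
        count[branch] = c + 1;
    }

    // Average gain observed so far; 0 for a branch with no observations (an assumption).
    public double Average(string branch) =>
        count.TryGetValue(branch, out int c) && c > 0 ? (double)sum[branch] / c : 0.0;
}
```

Feeding it the first flips of the table (b4: 20 to 19, b2: 19 to 19, b4: 19 to 18, b2: 18 to 18) gives averages of 1 for b4, -1 for b3, and 0 for b2 and b1, in line with the slide's summary.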
Search Frontier
• Each branch node candidate for being flipped is prioritized based on its composite fitness value:
  – (Fitness value of node - Fitness gain of its branch)
• Select first the one with the best composite fitness value
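A minimal sketch of this prioritization, assuming that "best" means lowest (the search minimizes fitness) and that a node's branch gain is the average gain of the branch it currently takes. The `Candidate` type, its fields, and `SelectNext` are hypothetical and are not Pex's actual data structures.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FrontierDemo
{
    // One branch node that could be flipped next.
    public sealed class Candidate
    {
        public string Name;        // label for illustration only
        public int PathFitness;    // fitness value of the path the node lies on
        public double BranchGain;  // average fitness gain of the node's branch
        public double Composite => PathFitness - BranchGain;
    }

    // Pick the candidate with the lowest composite fitness value.
    public static Candidate SelectNext(IEnumerable<Candidate> frontier) =>
        frontier.OrderBy(c => c.Composite).First();
}
```

With candidates such as a b2 node on path (90, {15}) (fitness 19, gain 0) and a b4 node on path (90, {15, 0}) (fitness 19, gain 1), the b4 node is selected because its composite value is 18 rather than 19.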
Successful Case of MSR Testing Tool: Pex & Relatives
How to achieve such a successful case?
Lesson 1. Started as (Evolved) Dream
void TestAdd(ArrayList a, object o) {
  Assume.IsTrue(a != null);
  int i = a.Count;
  a.Add(o);
  Assert.IsTrue(a[i] == o);
}
Parameterized Unit Tests Supported by Pex
• Surrounding (Moles/Fakes)
• Simplifying (Code Digger)
• Retargeting (Pex4Fun/Code Hunt)
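The parameterized unit test TestAdd can be read as a property checked for any argument values that satisfy its assumption. The sketch below runs it on hand-picked inputs, without Pex: the `Assume`/`Assert` helpers, the `AssumptionViolated` exception, and the `Run` wrapper are hypothetical stand-ins, not the Pex framework API.

```csharp
using System;
using System.Collections;

public static class PutSketch
{
    // Hypothetical marker: the input does not satisfy the test's assumption.
    public sealed class AssumptionViolated : Exception { }

    static void Assume(bool condition) { if (!condition) throw new AssumptionViolated(); }
    static void Assert(bool condition) { if (!condition) throw new Exception("assertion failed"); }

    // The parameterized unit test from the slide, using the stand-in helpers.
    public static void TestAdd(ArrayList a, object o)
    {
        Assume(a != null);
        int i = a.Count;
        a.Add(o);
        Assert(a[i] == o);   // reference comparison, as in the original
    }

    // Runs the test on one concrete input: "pass", "skip" (assumption not met) or "fail".
    public static string Run(ArrayList a, object o)
    {
        try { TestAdd(a, o); return "pass"; }
        catch (AssumptionViolated) { return "skip"; }
        catch (Exception) { return "fail"; }
    }
}
```

`PutSketch.Run(new ArrayList(), "x")` passes, while `PutSketch.Run(null, "x")` is skipped because the assumption rules that input out. A test generator's job is to choose such argument values automatically.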
Lesson 2. Chicken and Egg
Developer/manager: “Who is using your tool?”
Pex team: “Do you want to be the first?”
Developer/manager: “I love your tool but no.”
Tool Adoption by (Mass) Target Users
Tool Shipping with Visual Studio
Macro Perspective
Micro Perspective
Lesson 3. Human Factors – Generated Data Consumed by Human
Developer: “Code digger generates a lot of “\0” strings as input. I can’t find a way to create such a string via my own C# code. Could any one show me a C# snippet? I meant zero terminated string.”
Pex team: “In C#, a \0 in a string does not mean zero-termination. It’s just yet another character in the string (a very simple character where all bits are zero), and you can create as Pex shows the value: “\0”.”
Developer: “Your tool generated “\0””
Pex team: “What did you expect?”
Developer: “Marc.”
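The Pex team's point about "\0" can be checked directly in C#: the escape produces an ordinary one-character string, and a NUL character in the middle of a string does not end it. The class and method names below are hypothetical.

```csharp
using System;

public static class NulCharDemo
{
    // The string the tool reported: a single character whose code is zero.
    public static string NulString() => "\0";

    // A NUL in the middle does not terminate a C# string.
    public static string Embedded() => "a\0b";
}
```

`NulCharDemo.NulString().Length` is 1, its only character equals `(char)0`, and `NulCharDemo.Embedded().Length` is 3. The same value can also be built with `new string('\0', 1)`.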
Lesson 3. Human Factors – Generated Name Consumed by Human
Developer: “Your tool generated a test called Foo001. I don’t like it.”
Pex team: “What did you expect?”
Developer: “Foo_Should_Fail_When_Bar_Is_Negative.”
Lesson 3. Human Factors – Generated Results Consumed by Human
Object Creation messages suppressed (related to Covana by Xiao et al. [ICSE’11])
• Exception Tree View
• Exploration Tree View
• Exploration Results View
http://taoxie.cs.illinois.edu/publications/icse11-covana.pdf
Lesson 4. Best vs. Worst Cases
public bool TestLoop(int x, int[] y) {
  if (x == 90) {
    for (int i = 0; i < y.Length; i++)
      if (y[i] == 15)
        x++;
    if (x == 110)
      return true;
  }
  return false;
}

Key observations: with respect to the coverage target
• not all paths are equally promising for branch-node flipping
• not all branch nodes are equally promising to flip
Our solution:
– Prefer to flip branch nodes on the most promising paths
– Prefer to flip the most promising branch nodes on paths
– Fitness function to measure “promising” extents

Fitnex by Xie et al. [DSN’09]
To avoid local optima or biases, the fitness-guided strategy is integrated with Pex’s fairness search strategies
http://taoxie.cs.illinois.edu/publications/dsn09-fitnex.pdf
Lesson 5. Tool Users’ Stereotypical Mindset or Habits
• “Simply one mouse click and then everything would work just perfectly”
  – Often need environment isolation w/ Moles/Fakes or factory methods, …
• “One mouse click, a test generation tool would detect all or most kinds of faults in the code under test”
  – Developer: “Your tool only finds null references.”
  – Pex team: “Did you write any assertions?”
  – Developer: “Assertion???”
• “I do not need test generation; I already practice unit testing (and/or TDD). Test generation does not fit into the TDD process”
Lesson 6. Practitioners’ Voice
• Gathered feedback from target tool users
  – Directly, e.g., via MSDN Pex forum, tech support, outreach to MS engineers and .NET user groups
  – Indirectly, e.g., via interactions with MS Visual Studio team (a tool vendor to its huge user base)
• Motivations of Moles
  – Refactoring to address testability issues faced resistance in practice
  – Observation at Agile 2008: high attention on mock objects and tool support
Lesson 7. Collaboration w/ Academia
• Win-win collaboration model
  – Win (Ind Lab): longer-term research innovation, manpower, research impacts, …
  – Win (Univ): powerful infrastructure, relevant/important problems in practice, both research and industry impacts, …
• Industry-located Collaborations
  – Faculty visits, e.g., Fitnex, Pex4Fun
  – Student internships, e.g., FloPSy, DyGen, state cov
• Academia-located Collaborations
  – Immediate indirect impacts, e.g.,
      Reggae [ASE’09s] → Rex
      MSeqGen [FSE’09] → DyGen
      Guided Cov [ICSM’10] → state coverage
  – Long-term indirect impacts, e.g.,
      DySy by Csallner et al. [ICSE’08]
      Seeker [OOPSLA’11]
      Covana [ICSE’11]
Summary
• Pex practice impacts
  – Moles/Fakes, Code Digger, Pex4Fun/Code Hunt
• Lessons in transferring tools
  – Started as (Evolved) Dream
  – Chicken and Egg
  – Human Factors
  – Best vs. Worst Cases
  – Tool Users’ Stereotypical Mindset or Habits
  – Practitioners’ Voice
  – Collaboration w/ Academia
Summary: (How) Can A University Group Do It?
• Start a startup
  – but desirable to have the right people (e.g., former students) to start
• Release free tools/libraries to aim for adoption
  – but a lot of effort must be invested in “non-researchy” work
• Collaborate with industrial research labs
  – but many research lab projects may look like univ. projects
• Collaborate with industrial product groups
  – but many problems faced by product groups may not be “researchy”
Experience Reports on Successful Tool Transfer
Nikolai Tillmann, Jonathan de Halleux, and Tao Xie. Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes and Code Digger. In Proceedings of ASE 2014, Experience Papers. http://taoxie.cs.illinois.edu/publications/ase14-pexexperiences.pdf
Jian-Guang Lou, Qingwei Lin, Rui Ding, Qiang Fu, Dongmei Zhang, and Tao Xie. Software Analytics for Incident Management of Online Services: An Experience Report. In Proceedings of ASE 2013, Experience Papers. http://taoxie.cs.illinois.edu/publications/ase13-sas.pdf
Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, and Tao Xie. Software Analytics in Practice. IEEE Software, Special Issue on the Many Faces of Software Analytics, 2013. http://taoxie.cs.illinois.edu/publications/ieeesoft13-softanalytics.pdf
Yingnong Dang, Dongmei Zhang, Song Ge, Chengyun Chu, Yingjun Qiu, and Tao Xie. XIAO: Tuning Code Clones at Hands of Engineers in Practice. In Proceedings of ACSAC 2012. http://taoxie.cs.illinois.edu/publications/acsac12-xiao.pdf
Thank you! Questions?
https://sites.google.com/site/asergrp/
http://research.microsoft.com/pex
http://research.microsoft.com/sa/