Conquering the Largest Challenge of Software Testing: Too Much to Test & Not Enough Time to Test It All

Post on 12-May-2015


Category:

Technology


DESCRIPTION

Justin Hunter’s presentation at the 12th annual international software testing conference in Bangalore, India, December 2012. A link to the video of this 30-minute presentation is at: http://hexawise.tv/2013/02/too-much-to-test-and-not-enough-time-to-test-it-all/

The techniques discussed focus on how to reduce the time spent selecting and documenting test scripts, reduce the number of tests needed for execution by creating unusually powerful tests, and thus increase the thoroughness of software test suites. The talk explored the significant risks, for users, companies, and employees, of failing to catch software application failures before release (for example, looking at the release of Apple Maps with significant failures), and discussed how combinatorial (also orthogonal array or pairwise) software testing can be used to create test plans that test a large number of parameters/factors quickly.

Related presentations:
- Combinatorial Testing Beyond Pairwise Testing: http://www.slideshare.net/JustinHunter/combinatorial-software-testdesignbeyondpairwisetesting
- Detailed Example for Creating Pairwise Test Plans Using Hexawise
- New Features of Hexawise "2.0" (Nov 2012): http://www.slideshare.net/JustinXHunter/introduction-to-hexawise-new-features-20121109
- How to Think about Test Inputs in Software Testing: http://hexawise.tv/2012/10/how-to-think-about-test-inputs-in-software-testing/

Also see: http://hexawise.com (a pairwise / orthogonal array test design tool with powerful additional test design features).

TRANSCRIPT

1

Conquering the Single Largest Challenge Facing

Today's Testers

Justin Hunter, CEO of Hexawise

Tuesday, March 5, 13

2

“Bad News”: Many slides in this presentation might not make sense on their own.

“Good News”: The YouTube video taken of this presentation is available here.

Photo credit on opening slide: it’s my own photo (taken in the fantastic Ranakpur Jain temple in Rajasthan) http://www.flickr.com/photos/82153534@N05/8515803041/sizes/o/in/set-72157632883997160/


3

Single Biggest Challenge

“There’s too much to test and not enough time to test it all.”

According to a recent survey conducted by Robert Sabourin, this is the single largest challenge facing test managers today.


4

I. “Maps Mayhem”

1. What Happened?
2. Avoidable?
3. Practical Implications


[Charts: “Most Careers” vs. “Scott’s Career” plotted over time — most careers follow a steady trajectory, while Scott’s ends abruptly “here”. Even worse...]

9

2nd to Go (He’s also Amazing)


Nightmare Worsens

123,000

CEO’s Apology Letter

“We are extremely sorry...” “While we’re improving Maps, you can try alternatives... like Bing, MapQuest and Waze, or use Google or Nokia maps...”

12

Everyday Fails (Cont.)


13
http://www.itsagadget.com/2012/09/apple-google-maps-ios-6.html

Missing Details

14

Squiggly Roads

http://www.fastcompany.com/3003446/apple-reportedly-fires-their-maps-man


15

http://news.yahoo.com/blogs/technology-blog/apple-ceo-apologizes-maps-recommends-google-instead-182143889.html

On the water


16
http://machineslikeus.com/news/get-lost-apple-maps-road-nowhere

In the water

17

Missing water


18
http://theamazingios6maps.tumblr.com/page/6

Water Turned into Beaches

19
http://theamazingios6maps.tumblr.com/page/4

Melted Streets

20

http://www.crowdsourcing.org/images/resized//editorial_19902_780x0_proportion.jpg?1349379876

Social Media Mockery...


21

http://blogs.telegraph.co.uk/technology/micwright/100007771/apple-moronic-new-maps-this-is-turning-into-a-disaster/

Even Mocked by These Guys!


22

http://www.businessinsider.com/google-maps-apple-maps-2012-10

Impact to Sales?


23

In Fairness to those Involved

- Extreme Complexity
- Unimaginably Large Scope
- Highly Visible Mistakes
- Google Had a Huge Head Start

http://img.photobucket.com/albums/v40/Dragonrider1227/chainsawsonfire.jpg


24

Could this have been Avoided?

Imminent Disaster
Sometimes it’s just better to grab a beer and watch...


25

I. More Smart Testers

This man, Harry Robinson, is a genius.

He helped lead testing for Google Maps.

IMO, he’d be a bargain to Apple at $1 million / year.

http://model-based-testing.info/2012/03/12/interview-with-harry-robinson/


26

II. Using Smart Test Design

This single web page could be tested with 72,477,573,120 possible tests:

6 browser choices × 3 options × 2 options × 2 options × 2 options × 4 options × 2 options × 3 options × 2 options × 2 options = 13,824 possible tests...

...13,824 possible tests × 4 options × 4 options × 4 options = 884,736 possible tests...

...884,736 possible tests × 5 options × 2 options × 2 options × 2 options × 2 options × 4 options × 2 options × 2 options × 2 options × 4 options × 2 options × 2 options × 2 options = 72,477,573,120 possible tests.

25
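The multiplication above can be checked directly; here is a quick sketch, with the per-parameter option counts transcribed from the slide:

```python
from math import prod

# Option counts for each parameter on the page, as listed on the slide
first = [6, 3, 2, 2, 2, 4, 2, 3, 2, 2]
second = first + [4, 4, 4]
full = second + [5, 2, 2, 2, 2, 4, 2, 2, 2, 4, 2, 2, 2]

print(prod(first))    # 13824 possible tests
print(prod(second))   # 884736 possible tests
print(prod(full))     # 72477573120 possible tests
```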


27

II. Using Pairwise Test Design


First, users input details of an application to be tested...


28


Next, users create tests that will cover interactions of every valid pair of values in as few tests as possible.

(1) Browser = “Opera” tested with (2) View = “Satellite”? Covered.
(1) Mode of Transport = “Walk” tested with (2) Show Photos = “Y”? Covered.
(1) Avoid Toll Roads = “Y” tested with (2) Show Traffic = “Y (Live)”? Covered.

(1) Browser = IE6 tested with (2) Distance in = KM and (3) Zoom in = “Y” ? That is a 3-way interaction. It might not be covered in these 35 tests. See next page.
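The pair-coverage idea described above is easy to check mechanically. Below is a minimal, hypothetical sketch (not Hexawise’s actual algorithm) that reports which pairs of values a set of tests has not yet exercised together:

```python
from itertools import combinations

def uncovered_pairs(parameters, tests):
    """Return the (parameter, value) pairs no test has covered together.

    parameters: dict mapping parameter name -> list of its values
    tests: list of dicts mapping each parameter name -> one chosen value
    """
    names = sorted(parameters)
    # Every valid pair of values drawn from two different parameters...
    needed = {((a, va), (b, vb))
              for a, b in combinations(names, 2)
              for va in parameters[a]
              for vb in parameters[b]}
    # ...minus every pair some test exercises together.
    for t in tests:
        for a, b in combinations(names, 2):
            needed.discard(((a, t[a]), (b, t[b])))
    return needed

# Three two-valued parameters: 2 x 2 x 2 = 8 exhaustive tests,
# but these 4 tests already cover all 12 pairs of values.
params = {"Browser": ["IE6", "Opera"],
          "View": ["Map", "Satellite"],
          "Transport": ["Walk", "Drive"]}
tests = [
    {"Browser": "IE6",   "View": "Map",       "Transport": "Walk"},
    {"Browser": "IE6",   "View": "Satellite", "Transport": "Drive"},
    {"Browser": "Opera", "View": "Map",       "Transport": "Drive"},
    {"Browser": "Opera", "View": "Satellite", "Transport": "Walk"},
]
print(len(uncovered_pairs(params, tests)))   # 0 -> all pairs covered
```

The parameter names here are illustrative, not the actual plan from the slides; the same check scales to any number of parameters.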

29


The third screen provides objective coverage data which is useful in determining when to stop testing.

[Chart: “% Coverage by Number of Tests” — the percentage of pairs covered after each test, rising steeply through the first few tests and leveling off as more tests are added.]

Every test plan has a finite number of valid combinations of parameter values (involving, in this case, 2 parameter values). The chart shows, at each point in the test plan, what percentage of the total possible number of relevant combinations have been covered.

In this set of test cases, as in most, there is a significant decreasing marginal return.
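The shape of that curve can be reproduced on a toy example. The sketch below (an illustration only, not Hexawise’s chart data) walks through candidate tests for three 3-valued parameters and records cumulative pair coverage; the early tests contribute far more new pairs than the later ones:

```python
from itertools import combinations, product

values = {"A": [0, 1, 2], "B": [0, 1, 2], "C": [0, 1, 2]}
names = sorted(values)

# All valid pairs of values from two different parameters: 3 * 9 = 27
all_pairs = {((a, va), (b, vb))
             for a, b in combinations(names, 2)
             for va in values[a]
             for vb in values[b]}

covered, curve = set(), []
for combo in product(*(values[n] for n in names)):   # all 27 possible tests
    test = dict(zip(names, combo))
    for a, b in combinations(names, 2):
        covered.add(((a, test[a]), (b, test[b])))
    curve.append(round(100 * len(covered) / len(all_pairs)))

print(curve)   # coverage climbs fast early, then flattens out at 100%
```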

30


Testing each feature to “see if it works” is not enough. Does it work in combination with every other feature?


32

Every pair of test inputs gets tested in at least one test!


Prioritization

How many test inputs are needed to trigger defects in production?

1 input: 51%
2 inputs (“pairwise”): 33%
3 inputs: 11%
4, 5, or 6 inputs: 5%

Sources:
- Medical Devices: D.R. Wallace, D.R. Kuhn, Failure Modes in Medical Device Software: an Analysis of 15 Years of Recall Data, International Journal of Reliability, Quality, and Safety Engineering, Vol. 8, No. 4, 2001.
- Browser, Server: D.R. Kuhn, M.J. Reilly, An Investigation of the Applicability of Design of Experiments to Software Testing, 27th NASA/IEEE Software Engineering Workshop, NASA Goddard SFC, 4-6 December 2002.
- NASA database: D.R. Kuhn, D.R. Wallace, A.J. Gallo, Jr., Software Fault Interactions and Implications for Software Testing, IEEE Trans. on Software Engineering, vol. 30, no. 6, June 2004.
- Network Security: K.Z. Bell, Optimizing Effectiveness and Efficiency of Software Testing: a Hybrid Approach, PhD Dissertation, North Carolina State University, 2006.
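Read cumulatively, these figures suggest how much defect detection each interaction strength buys. A back-of-the-envelope sketch (it assumes the slide’s percentages pair with the input counts in the order shown, which is an interpretation of the original pie chart):

```python
# % of production defects triggered by interactions of N test inputs,
# as reported on the slide ("4, 5, or 6" lumped under 6 here)
defects_by_inputs = [(1, 51), (2, 33), (3, 11), (6, 5)]

def caught_by(strength):
    """% of these defects a suite covering every combination of up to
    `strength` interacting inputs would trigger."""
    return sum(pct for n, pct in defects_by_inputs if n <= strength)

print(caught_by(2))   # 84  -> pairwise testing
print(caught_by(3))   # 95
print(caught_by(6))   # 100
```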


Banking / Capital Markets Case Study

34

This part of the plan involved the following 6 parameters, each of which had between 2 and 4 values:

[Matrix: a grid of the six parameters’ values (A, 09, I, C, A, V, B, 10, R, E, B, I, C, L, C, B, A); the same grid repeats on the following slides as pair coverage fills in.]

After 5 Hexawise Tests

[Matrix: the pairs of values covered by the first 5 tests are filled in.]

After 10 Hexawise Tests

[Matrix: coverage after 10 tests.]

After 13 Hexawise Tests

[Matrix: after 13 tests, every pair of values has been covered.]

Now let’s compare coverage achieved by manual test case selection.

We’ll skip the “after 5 tests” and “after 10 tests” views and go directly to “after 13 tests” (i.e., when Hexawise had already achieved 100% coverage of all pairs of values).

Manual test case selection


After 13 Manual Tests

[Matrix: coverage after 13 manually-selected tests.]

Manual test case selection

After 13 Manual Tests

This is worse than it might look at first...

Manual test case selection

In other words, when Hexawise tests had not only tested each test input but tested each test input in combination with every other test input in the plan at least once...

Manual test case selection


After 13 Manual Tests

[Matrix: many pairs of values highlighted in red as untested.]

There were many, many pairs of values (in red) that the 13 manual tests had not tested together.

Manual test case selection

Even after the original plan’s 126 manually-created tests (or almost ten times more test cases than the Hexawise test plan required)...

Manual test case selection

After 126 Manual Tests

[Matrix: four pairs of values marked “Never tested”.]

...there were four pairs of values that were never tested in the much longer original plan...

Manual test case selection

After 126 Manual Tests

[Matrix: pairs marked “Tested 42 times”, “Tested 22 times”, “Tested 22 times”, and “Tested 15 times”.]

...and there were four pairs of values that were wastefully tested 15 or more times each.

Manual test case selection


After 126 Manual Tests

[Matrix: the same grid showing both the “Never tested” pairs and the pairs tested 15 or more times.]

These characteristics (wasteful repetition combined with gaps in coverage) are found in almost all manually-selected test cases.

Manual test case selection


Without Hexawise: 126 tests, incomplete coverage, wasteful repetition.
With Hexawise: 13 tests, complete coverage, variation (not repetition).


49

Faster Test Creation

Source: Conservatively interpreted data from several dozen recent pilot projects. Time savings are often significantly larger than 40% and will almost always exceed 30%.

50

More Defects Found / Hour

Source: Empirical study of the average benefits of 10 software testing projects, published in IEEE Computer magazine in 2009: “Combinatorial Software Testing,” by Rick Kuhn, Raghu Kacker, Yu Lei, and Justin Hunter. Results of individual projects will differ.

51

More Defects Found

Source: Empirical study of the average benefits of 10 software testing projects, published in IEEE Computer magazine in 2009: “Combinatorial Software Testing,” by Rick Kuhn, Raghu Kacker, Yu Lei, and Justin Hunter. Results of individual projects will differ.

52

Three Final Implications


53

1. Bad software quality can bring disaster to anyone.

54

2. Smart, skilled, empowered testers are essential.

55

3. Pairwise and combinatorial testing helps test systems BOTH more thoroughly AND more quickly.

57

For more info, Google / Bing:

1) Harry Robinson testing

2) Pairwise testing case studies


58

Thank You!

(BTW did this topic make your “Top 3” list?)

