SeConf 2011: Database Driven Combinatorial Testing
DESCRIPTION
Describes how to leverage combinatorial testing to reduce the number of Selenium test cases while still maintaining the desired code coverage, as well as the benefits of keeping inputs, metadata, and outputs relating to the tests and their runs in the database.
TRANSCRIPT
Get More For Less! Database Driven Combinatorial Testing
Aaron Silverman
Lead Engineer
Applied Predictive Technologies
Background
APT produces business analytic software used by large organizations to make decisions worth millions of dollars. It is critically important that:
• Our software behaves as expected
• The numbers we display are correct
Agenda
• Front-end testing can only be made so fast
• An alternative way to reduce testing time is to reduce the number of tests, but you do not want to reduce testing coverage
• Combinatorial test selection techniques allow selective test case creation and intelligent test case reduction
• Moving test case inputs and expected outputs out of the code and into a testing framework database makes leveraging combinatorial testing easy
• The database also makes test case maintenance and debugging transparent and easy

We aim to share the lessons we have learned.
Front End Testing Is Slow
End-to-end Selenium tests most accurately simulate an actual user using the product, making them slower than other automated tests.

March 2011 APT Continuous Integration Automated Testing Stats:

Test Type    Test Cases Run    Average Time Per Test Case
Unit         434,016           6 seconds
Functional   91,912            21 seconds
Selenium     13,652            983 seconds
Test Cases Cannot Always Be Shortened
Despite extensive parallelization, some test cases can only be so fast without deviating from real-world use cases:
Large Datasets + Complex Computations = Slow Tests
Number of Tests Can Often Be Reduced
However, testing time can still be reduced by running fewer tests.
Note: No turtles were harmed in the making of this presentation
"Hey! Are you sure about this? Run fewer tests? Doesn't that mean we are testing less?"
"Not if we pick our test cases using combinatorial testing!"
Intelligently Pick Test Cases
Combinatorial testing focuses on testing specific relevant combinations of inputs. There is a plethora of research and resources about combinatorial testing, especially 2-way, or pairwise, testing:
• Hundreds of papers
• Dozens of tools
APT uses Microsoft's freely available Pairwise Independent Combinatorial Testing tool, also known as PICT, as well as some tools we developed ourselves.
Download PICT at: http://msdn.microsoft.com/en-us/testing/bb980925.aspx
Simplified Example
Suppose we had an attendee survey for this conference…
Attendee Survey
  Name:
  Do you like Selenium?  Yes / No
  Rate the conference: (Scale 1-10)
…and if you do not like Selenium, it asks why:
Attendee Survey
  Name:
  Do you like Selenium?  Yes / No, because (check all that apply):
    I have too many slow tests
    The devs keep changing the UI
    My favorite browser is Lynx
    I like to complain a lot
  Rate the conference: (Scale 1-10)
Simplified Example
There is lots to test!
• Name: no name entered, name entered
• Do you like Selenium: yes, no
• 16 permutations of checked selections when "No" is selected
• Rating: valid and favorable, valid and unfavorable, invalid, non-integer
Simplified Example
Exhaustive testing would result in 2 × (1 + 16) × 4 = 136 permutations of inputs.
Clearly this is absurd!
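That count can be checked with a little arithmetic: two name states, times the seventeen checkbox combinations (one when "Yes" is selected, 2^4 = 16 when "No" is selected), times four rating classes. A minimal sketch of the calculation (our own illustration, not part of the original slides):

```java
public class PermutationCount {
    public static void main(String[] args) {
        int nameStates = 2;            // blank, filled
        int yesCheckboxCombos = 1;     // "Yes": every reason checkbox must stay unchecked
        int noCheckboxCombos = 1 << 4; // "No": 2^4 = 16 checkbox permutations
        int ratingClasses = 4;         // favorable, unfavorable, invalid, non-integer

        int total = nameStates * (yesCheckboxCombos + noCheckboxCombos) * ratingClasses;
        System.out.println(total); // prints 136
    }
}
```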
Test Case Creation Tools
Using pairwise tools we can reduce our test cases to a much smaller set that maintains coverage quality.

pict_inputs.txt (specify input names and their possibilities, then exclude impossible combinations):

name: blank, filled
likeSelenium: yes, no
slowTests: checked, unchecked
changingUI: checked, unchecked
usesLynx: checked, unchecked
complainsALot: checked, unchecked
rating: favorable, unfavorable, invalid, non-integer

# if the user likes Selenium, no reasons can be checked
IF [likeSelenium] = "yes" THEN [slowTests] = "unchecked"
  AND [changingUI] = "unchecked" AND [usesLynx] = "unchecked"
  AND [complainsALot] = "unchecked";

Note: PICT is far more powerful than the options shown here
Test Case Creation Tools
PICT then generates a small set of test cases covering every possible input pair that meets our specified conditions:
C:\Program Files (x86)\PICT>pict.exe pict_inputs.txt > pict_outputs.xls
pict_outputs.xls
name likeSelenium slowTests changingUI usesLynx complainsALot rating
blank no checked checked checked checked favorable
blank no checked checked unchecked checked unfavorable
blank no unchecked checked unchecked checked non-integer
blank yes unchecked unchecked unchecked unchecked favorable
blank yes unchecked unchecked unchecked unchecked invalid
filled no checked checked checked checked invalid
filled no checked unchecked checked unchecked non-integer
filled no unchecked checked unchecked unchecked favorable
filled no unchecked unchecked checked checked favorable
filled no unchecked unchecked checked checked unfavorable
filled yes unchecked unchecked unchecked unchecked non-integer
filled yes unchecked unchecked unchecked unchecked unfavorable
Note: PICT is far more powerful than the options shown here
"Where did all my friends go?"
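To see what "covering all possible input pairs" means concretely, the generated table can be checked mechanically: for every two parameters, every feasible combination of their values must appear in at least one row. A small sketch of such a checker (our own illustration, not one of APT's tools; it hard-codes the twelve rows from this slide and the single PICT constraint):

```java
import java.util.ArrayList;
import java.util.List;

public class PairwiseCoverageCheck {
    static final String[] PARAMS = {"name", "likeSelenium", "slowTests",
            "changingUI", "usesLynx", "complainsALot", "rating"};

    static final String[][] VALUES = {
            {"blank", "filled"},
            {"yes", "no"},
            {"checked", "unchecked"},
            {"checked", "unchecked"},
            {"checked", "unchecked"},
            {"checked", "unchecked"},
            {"favorable", "unfavorable", "invalid", "non-integer"}
    };

    // The twelve rows of pict_outputs.xls above, in column order.
    static final String[][] ROWS = {
            {"blank","no","checked","checked","checked","checked","favorable"},
            {"blank","no","checked","checked","unchecked","checked","unfavorable"},
            {"blank","no","unchecked","checked","unchecked","checked","non-integer"},
            {"blank","yes","unchecked","unchecked","unchecked","unchecked","favorable"},
            {"blank","yes","unchecked","unchecked","unchecked","unchecked","invalid"},
            {"filled","no","checked","checked","checked","checked","invalid"},
            {"filled","no","checked","unchecked","checked","unchecked","non-integer"},
            {"filled","no","unchecked","checked","unchecked","unchecked","favorable"},
            {"filled","no","unchecked","unchecked","checked","checked","favorable"},
            {"filled","no","unchecked","unchecked","checked","checked","unfavorable"},
            {"filled","yes","unchecked","unchecked","unchecked","unchecked","non-integer"},
            {"filled","yes","unchecked","unchecked","unchecked","unchecked","unfavorable"}
    };

    // The PICT constraint excludes pairing likeSelenium=yes with any checked
    // reason checkbox (parameter indices 2..5). Since i < j below, the
    // likeSelenium parameter (index 1) is always the first of such a pair.
    static boolean infeasible(int pi, String a, int pj, String b) {
        return pi == 1 && a.equals("yes")
                && pj >= 2 && pj <= 5 && b.equals("checked");
    }

    static List<String> missingPairs() {
        List<String> missing = new ArrayList<>();
        for (int i = 0; i < PARAMS.length; i++)
            for (int j = i + 1; j < PARAMS.length; j++)
                for (String a : VALUES[i])
                    for (String b : VALUES[j]) {
                        if (infeasible(i, a, j, b)) continue;
                        boolean covered = false;
                        for (String[] row : ROWS)
                            if (row[i].equals(a) && row[j].equals(b)) { covered = true; break; }
                        if (!covered)
                            missing.add(PARAMS[i] + "=" + a + " / " + PARAMS[j] + "=" + b);
                    }
        return missing;
    }

    public static void main(String[] args) {
        // Twelve rows suffice to cover every feasible pair.
        System.out.println("uncovered feasible pairs: " + missingPairs().size());
    }
}
```

The same loop structure is the basis for coverage reporting: counting covered versus total feasible pairs tells you how close a hand-edited set of test cases is to full pairwise coverage.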
We should test outputs this way too
Our survey also has a few different outputs to validate:
• Rating >= 5: "Glad you liked the conference!"
• Rating < 5: "Sorry you did not like the conference."
• "No" selected, a rating of 1, and all checkboxes selected: "You should consider a career change."
• Any invalid input: "You seem to be having trouble following directions, please try again!"
Complete Coverage Is Not Always Complete
The outputs are related to the inputs but are not inputs themselves, so we treat them as "metadata". Even if the unfavorable rating is made a 1, we are still not testing the "career change" result screen.
pict_outputs.xls
name likeSelenium slowTests changingUI usesLynx complainsALot rating metadata - result
blank no checked checked checked checked favorable try again
blank no checked checked unchecked checked unfavorable try again
blank no unchecked checked unchecked checked non-integer try again
blank yes unchecked unchecked unchecked unchecked favorable try again
blank yes unchecked unchecked unchecked unchecked invalid try again
filled no checked checked checked checked invalid try again
filled no checked unchecked checked unchecked non-integer try again
filled no unchecked checked unchecked unchecked favorable glad
filled no unchecked unchecked checked checked favorable glad
filled no unchecked unchecked checked checked unfavorable sorry
filled yes unchecked unchecked unchecked unchecked non-integer try again
filled yes unchecked unchecked unchecked unchecked unfavorable sorry
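One way to keep that metadata honest is to derive it in code, since the result screen is a deterministic function of the inputs. A hypothetical sketch of that mapping (the class and method names are ours, not APT's; we assume a blank name counts as invalid input, as the table above implies, and that the concrete rating used for the "unfavorable" class is 1, as in the career-change case):

```java
public class SurveyResultMetadata {
    // Hypothetical derivation of the "metadata - result" column from the inputs.
    static String expectedResult(String name, boolean likesSelenium,
                                 int reasonsChecked, String ratingClass) {
        // Blank names and malformed ratings both land on the "try again" screen.
        if (name.isEmpty() || ratingClass.equals("invalid")
                || ratingClass.equals("non-integer")) {
            return "try again";
        }
        int rating = ratingClass.equals("favorable") ? 5 : 1; // representative values
        if (!likesSelenium && reasonsChecked == 4 && rating == 1) {
            return "career change";
        }
        return rating >= 5 ? "glad" : "sorry";
    }

    public static void main(String[] args) {
        System.out.println(expectedResult("Aaron", true, 0, "favorable"));    // glad
        System.out.println(expectedResult("Aaron", false, 2, "unfavorable")); // sorry
        System.out.println(expectedResult("Aaron", false, 4, "unfavorable")); // career change
        System.out.println(expectedResult("", false, 1, "favorable"));        // try again
    }
}
```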
Seeding Helps Achieve Desired Coverage
Using seeding we can intelligently adjust our set of inputs. We know we want at least one test that gets the "career change" result. One way to do this is through seeding:

pict_seeds.txt
name   likeSelenium  slowTests  changingUI  usesLynx  complainsALot  rating
filled no            checked    checked     checked   checked        unfavorable

C:\Program Files (x86)\PICT>pict.exe pict_inputs.txt /e:pict_seeds.txt > pict_outputs.xls

Now we are forcing PICT to include the set of combinations that will generate the career change result.
Note: PICT is far more powerful than the options shown here
Seeding Helps Achieve Desired Coverage
With our seed in place, PICT generates a new set of combinations.
Excellent! We can test everything!
pict_outputs.xls
name likeSelenium slowTests changingUI usesLynx complainsALot rating metadata - result
blank no checked checked unchecked checked unfavorable try again
blank no checked checked unchecked unchecked favorable try again
blank no checked unchecked checked checked favorable try again
blank no unchecked checked unchecked checked non-integer try again
blank no unchecked unchecked checked unchecked unfavorable try again
blank yes unchecked unchecked unchecked unchecked invalid try again
blank yes unchecked unchecked unchecked unchecked non-integer try again
filled no checked checked checked checked invalid try again
filled no checked checked checked checked unfavorable career change
filled no checked unchecked checked unchecked non-integer try again
filled yes unchecked unchecked unchecked unchecked favorable glad
filled yes unchecked unchecked unchecked unchecked unfavorable sorry
Too bad the real world isn’t this simple…
Real World Products Are Incredibly Complex
In the real world, test cases are extremely complicated!
• Store location with custom drawn trade area
• Map modification
• Competition layering
• Boundary highlighting
• Enabling heat map
• Map detail
• Shading attribute selection
• Map controls
• Product navigation
Practical Test Case Selection
The approach we have found works best revolves around coverage of both inputs and metadata:
• When there is not much metadata to consider, a first pass using combinatorial tools works great for test case generation
• When there is lots of metadata, we generally start with intentionally too many tests and then trim down using our coverage tools
• Using the code to programmatically populate metadata values during test case execution helps keep inputs and metadata in sync as we adjust our test cases
• Seeding is a great way to force in the test cases that ensure the coverage we want
• Evaluation of testing coverage is an iterative process during test case development
Using The Test Cases
Writing many test cases, where only the inputs and outputs change, results in a lot of code duplication:
@Test
public void test_likesSelenium_likesConference() throws Exception {
surveyPage.enterName("Aaron");
surveyPage.selectLikesSelenium(true);
surveyPage.enterRating("5");
surveyPage.submit();
resultPage.synchronize();
Assert.assertEquals("Glad you liked the conference!", resultPage.getResponseMessage());
}
@Test
public void test_dislikes_Selenium_hatesConference() throws Exception {
surveyPage.enterName("Aaron");
surveyPage.selectLikesSelenium(false);
surveyPage.checkDislikeReason(1);
surveyPage.checkDislikeReason(3);
surveyPage.enterRating("1");
surveyPage.submit();
resultPage.synchronize();
Assert.assertEquals("Sorry you did not like the conference.", resultPage.getResponseMessage());
}
DRY! (Note: DRY == Don't Repeat Yourself)
Database Backed Frameworks Avoid Duplication
At APT, we store inputs and outputs in the database and access them through simple helper methods:
@Test
public void test_conferenceSurvey(int testCaseId) throws Exception {
setTestCaseId(testCaseId);
surveyPage.enterName(getInputValue("name"));
surveyPage.selectLikesSelenium(getInputValue("likesSelenium").equals("yes"));
String[] reasons = getInputValue("reasonList").split(",");
for (String s : reasons) {
surveyPage.checkDislikeReason(Integer.parseInt(s));
}
surveyPage.enterRating(getInputValue("rating"));
surveyPage.submit();
resultPage.synchronize();
compareAgainstExpectedOutput("Result Page Message", resultPage.getResponseMessage());
}
"Like me, this function can handle everything!"
• Variable-length inputs stored as lists
• Comparisons usually non-fatal
• Caching used to avoid continuously querying the database
• Test cases have names and comments associated with them
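A minimal sketch of what such a helper might look like, with the caching bullet made concrete (the class and method names are ours; the real framework queries SQL tables, which we fake here with an in-memory map so the sketch runs standalone):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a database-backed test-data helper with per-test-case caching.
public class TestDataHelper {
    private final Map<String, String> backingStore; // stand-in for the database
    private final Map<String, String> cache = new HashMap<>();
    int queryCount = 0; // exposed so the usage example can show the cache working

    TestDataHelper(Map<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    String getInputValue(String inputName) {
        // Cache hit: no round trip to the (simulated) database.
        String cached = cache.get(inputName);
        if (cached != null) return cached;
        queryCount++; // each miss costs one query
        String value = backingStore.get(inputName);
        cache.put(inputName, value);
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        db.put("name", "Aaron");
        db.put("likesSelenium", "yes");
        TestDataHelper helper = new TestDataHelper(db);
        helper.getInputValue("name");
        helper.getInputValue("name"); // second read served from the cache
        System.out.println(helper.queryCount); // prints 1
    }
}
```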
The Database Also Improves Testing Overall
Storing inputs and outputs in the database opens up many possibilities for test case creation, evaluation, and debugging:
• Creating new test cases is easier
  • A database-backed, web-based test case editor is often easier to use than keeping inputs in code, Excel, a wiki, etc.
• Developing tools to evaluate coverage is easier
  • No need to parse files, just query the database
  • Easy to specify impossible pairs
  • Easy to specify values not tested
• Debugging is easier for engineers
  • History of inputs and outputs (linked to screenshots) can be tracked
  • Easy to store reasons for a change of input or expected output
Creating Test Cases Is Transparent and Easy
Following import from pairwise selection tools, database driven test cases can now be easily edited and configured by anyone.
Test cases are easily evaluated and adjusted by:
• Testers
• Engineers
• Product Management
Note: Yes, we know our internal tools look like they are from 1990, but they work great!
Evaluating Coverage Is Transparent and Easy
With complex tests, coverage reports make identifying both extraneous and missing test cases easy. For this one test:
• 38 test cases currently cover 3132 of 7917 pairs
• 9 test cases are not adding any additional coverage
• It is still easy to identify gaps in testing: look for the red!
These test cases need some tuning!
Debugging Results Is Transparent and Easy
Simple reporting pages can make debugging test failures and updating test cases much easier. This test case checked 657 outputs; 255 of them failed. The report shows:
• Identifiers easily recognized by engineers
• The history of output changes
• The ability to accept new actual outputs as the expected outputs going forward
• The actual value, with a link to a screenshot of when it was set
• The expected value, with a link to a screenshot of the failure
• The date the output was set, and a link to information about the test run where it was set
Debugging Results Is Transparent and Easy
Screenshots can be stored in the database and associated with specific test case executions. It is easy to view all screenshots for this run of a test case, the last time the test case passed, etc.
Easy To Integrate With Continuous Integration
With lots of information in the database, it is also easy to aggregate up to the level desired for continuous integration. At APT we aggregate the results and then format them into JUnit-style XML files to allow easy integration with Jenkins. However, each test case still provides a link to drill into the reports shown in the previous slides, which contain screenshots, inputs/outputs, logs, etc.
Note: Jenkins was formerly known as Hudson
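A toy sketch of that aggregation step (our own illustration; the element and attribute names follow the common JUnit report format that Jenkins consumes, and the drill-down URL is hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JUnitXmlReport {
    // Aggregate per-test-case pass/fail results into a JUnit-style XML string.
    static String toJUnitXml(String suiteName, Map<String, Boolean> results) {
        long failures = results.values().stream().filter(passed -> !passed).count();
        StringBuilder xml = new StringBuilder();
        xml.append(String.format("<testsuite name=\"%s\" tests=\"%d\" failures=\"%d\">%n",
                suiteName, results.size(), failures));
        for (Map.Entry<String, Boolean> e : results.entrySet()) {
            xml.append(String.format("  <testcase name=\"%s\">%n", e.getKey()));
            if (!e.getValue()) {
                // Link back into the detailed database-backed report (hypothetical URL).
                xml.append("    <failure>see http://testdb.example/report?case=")
                   .append(e.getKey()).append("</failure>\n");
            }
            xml.append("  </testcase>\n");
        }
        xml.append("</testsuite>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("conferenceSurvey_case_1", true);
        results.put("conferenceSurvey_case_2", false);
        System.out.print(toJUnitXml("selenium", results));
    }
}
```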
The Big Picture
Combinatorial testing is just one part of a well-balanced testing diet:
• Unit tests, functional tests, and Selenium tests
• Code coverage tools let us know which lines of code we are and are not testing
• Combinatorial coverage tools let us know which input combinations we are and are not testing
• Manual tests: exploratory testing alerts us to new important inputs and interactions to consider for our automated test cases
The Big Picture
Database driven combinatorial testing lets you get more for less!
• End-to-end Selenium tests can be complicated and slow
• Combinatorial testing is a great way to reduce the number of these slow tests while maximizing testing coverage
• Test metadata must be considered along with inputs when evaluating coverage
• For complex tests, tools for test case creation, evaluating coverage, and debugging failures are essential
• Storing inputs, outputs, and metadata in a database makes developing test case management and coverage reporting tools easy
Questions
"Whew, that was a lot! Do you think the audience has any questions?"
"Only one way to find out! Look for hands!"
Questions?