software testing and reliability test assessment and enhancement

1. Software Testing and Reliability Test Assessment and Enhancement Aditya P. Mathur Purdue University August 12-16 @ Guidant Corporation Minneapolis/St Paul, MN Last update: August 13, 2002 Graduate Assistants : Ramkumar Natarajan Baskar Sridharan

2. References

A Data Flow Oriented Program Testing Strategy , Janusz Laski and Bogden Korel, IEEE Transactions on Software Engineering, Vol. SE-9, pp. 347-354, May 1983.

Hints on Test Data Selection: Help for the Practicing Programmer , Richard A. DeMillo, Richard J. Lipton, and Frederick G. Sayward, IEEE Computer, pp 34-41, April 1978.

Mutation Testing , Aditya P. Mathur, in Encyclopedia of Software Engineering, Ed. John Marcianiak, Wiley Interscience, Vol 1pp. 707-712.

3. Learning Objectives

To understand the relevance and importance of test assessment.

To learn the fundamental principle underlying test assessment.

To learn various methods and tools for test assessment.

To understand the relative strengths/weaknesses of test assessment methods.

To learn how to improve tests based on a test assessment procedure.

4. What is Test Assessment?

Once a test set T, a collection of test inputs, has been developed, we ask:

How good is T?

It is the measurement of the goodness of T which is known as test assessment .

Test assessment is carried out based on one or more criteria .

5. Test Assessment (contd.)

These criteria are known as test adequacy criteria.

Test assessment is also known as test adequacy assessment .

6. Test assessment (contd.)

Test assessment provides the following information:

A metric, also known as the adequacy score or coverage , usually between 0 and 1.

A list of all the weaknesses found in T, which when removed, will raise the score to 1.

The weaknesses depend on the criteria used for assessment.

7. Test assessment (contd.)

Once the coverage has been computed, and the weaknesses identified, one can improve T.

Improvement of T is done by examining one or more weaknesses and constructing new test requirements designed to overcome the weakness(es).

The new test requirements lead to new test specifications and to further testing of the program.


This is continued until all weaknesses are overcome, i.e. the adequacy criterion is satisfied (coverage=1) .

In some instances it may not be possible to satisfy the adequacy criteria for one or more of the following reasons:

Lack of sufficient manpower

Weaknesses that cannot be removed because they are infeasible.


The cost of removing the weaknesses is not justified.

While improving T by removing its weaknesses, one usually tests the program more thoroughly than it has been tested so far.

This additional testing is likely to result in the discovery of remaining errors .


Test assessment and improvement is applicable throughout the testing process and during all stages of software development.

Hence we say that test assessment and improvement helps in the improvement of software reliability .

11. Test Assessment Procedure Yes Improve T No Measure adequacy of T w.r.t. C. 2 Is T adequate? 3 Yes 4 More testing is warranted ? 5 Select an adequacy criterion C. 1 Develop T 0 No Done 6

12. Principle Underlying Test Assessment

There is a uniform principle that underlies test assessment throughout the testing process.

This principle is referred to as the coverage principle .

It has come about as a result of intensive research at Purdue and other research groups in software testing .

13. The Coverage Principle

To formulate and understand the coverage principle, we need to understand:

coverage domains

coverage elements

A coverage domain is a finite domain , related to the program under test, that we want to cover . Coverage elements are the individual elements of this domain

14. The Coverage Principle (contd.) Coverage Domains Coverage Elements Requirements Classes Functions Interface mutations Exceptions

15. The Coverage Principle (contd.)

Measuring test adequacy and improving a test set against a sequence of well defined, increasingly strong, coverage domains leads to improved confidence in the reliability of the system under test.


Note the following properties of a coverage domain:

It is related to the program under test.

It is finite .

It may come from program requirements, related to the inputs and outputs .


It may come from program code . Can you think of a coverage domain that comes from the program code?

It aids in measuring test adequacy as well as the progress made in testing. How ?


Example:

It is required to write a program that takes in the name of a person as a string and searches for the name in a file of names. The program must output the record ID which matches the given name. In case of no match a -1 is returned.

What coverage domains can be identified from this requirement?


As we learned earlier, improving coverage improves our confidence in the correct functioning of the program under test.

Given a program P and a test T suppose that T is adequate w.r.t. a coverage criterion C.

Does this mean that P is error free?

Obviously???

20. Test Effort

There are several measures of test effort .

One measure is the size of T. By this measure a test set with a larger number of test cases corresponds to higher effort than one with a lesser number of test cases.

21. Error Detection Effectiveness

Each coverage criterion has its error detection ability. This is also known as the error detection effectiveness or simply effectiveness of the criterion.

One measure of the effectiveness of criterion C is the fraction of faults guaranteed to be revealed by a test T that satisfies C.

22. Effectiveness (contd.)

Another measure is the probability that at least fraction f of the faults in P will be revealed by test T that satisfies C.

Unfortunately there is no absolute measure of the effectiveness of any given coverage criterion for a general class of programs and for arbitrary test sets.

23. Effectiveness (contd.)

One coverage criterion results in an exception to this rule: What is it?

Empirical studies conducted by researchers give us an idea of the relative goodness of various coverage criteria.

Thus, for a variety of criteria we can make a statement like: Criterion C1 is definitely better than criterion C2.

24. Effectiveness-continued

In some cases we may be able to say: Criterion C1 is probably better than criterion C2 .

Such information allows us to construct a hierarchy of coverage criteria.

This hierarchy is helpful in organizing and managing testing. How ?

25. Strength of a coverage criterion

The effectiveness of a coverage criterion is also referred to as its strength .

Strength is a measure of the criterions ability to reveal faults in a program.

Criterion C1 is considered stronger than criterion C2 if C1 is is capable of revealing more faults than C2.

26. The Saturation Effect

The rate at which new faults are discovered reduces as test adequacy with respect to a finite coverage domain increases ; it reduces to zero when the coverage domain has been exhausted.

coverage 0 1

27. Saturation Effect: Fault View Testing Effort Remaining Faults 0 N Functional t f s t f e t d s M t df e t m e

28. Saturation Effect: Reliability View Functional, Decision, Dataflow, and Mutation tsting provide various test assessment criteria. True reliability (R) Estimated reliability (R) Saturation region Reliability Testing Effort R f R d R df R m Functional R f t f s t f e Decision R d t d s t d e Dataflow R df t df s t df e Mutation R m t m s t f e

29. Coverage principle-discussion

Discuss:

How will you use the knowledge of coverage principle and the saturation effect in organizing and managing testing ?

Can you think of any other uses of the coverage principle and the saturation effect?

30. Control flow graph

Control flow graph (CFG) of a program is a representation of the flow of execution within the program.

More formally, a CFG G is:

G=(N,A)

where N: set of nodes and A: set of arcs

There is a unique entry node e n in N.

There is a unique exit node ex in N. A node represents a single statement or a block .

A block is a single-entry-single-exit sequence of instructions that are always executed in a sequence without any diversion of path except at the end of the block.

31. Control flow graph (contd.)

Every statement in a block, except possibly the first one, has exactly one predecessor .

Similarly, every statement in the block, except possibly the last one, has exactly one successor .

An arc a in A is a pair ( n,m ) of nodes from N which represent transfer of control from node n to node m .

A path of length k in G is an ordered sequence of arcs, from A such that:

32. Control flow graph (contd.)

The first node a 1 is en

The last node a k is ex

For any two adjacent arcs a i = ( n,m ) and a j = ( p,q ), m=p .

A path is considered executable or feasible if there exists a test case which causes this path to be traversed during program execution , otherwise the path is unexecutable or infeasible .

33. Control flow graph-example

Exercise:

Draw a CFG for the following program and identify all paths. :

1. scanf (x,y); if (y

software testing and reliability test assessment and enhancement

Documents