research article path-wise test data generation based on

20
Research Article Path-Wise Test Data Generation Based on Heuristic Look-Ahead Methods Ying Xing, 1,2 Yun-Zhan Gong, 1 Ya-Wen Wang, 1,3 and Xu-Zhou Zhang 1 1 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China 2 School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China 3 State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China Correspondence should be addressed to Ying Xing; [email protected] Received 5 July 2013; Revised 1 December 2013; Accepted 2 December 2013; Published 14 May 2014 Academic Editor: John Gunnar Carlsson Copyright © 2014 Ying Xing et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Path-wise test data generation is generally considered an important problem in the automation of soſtware testing. In essence, it is a constraint optimization problem, which is oſten solved by search methods such as backtracking algorithms. In this paper, the backtracking algorithm branch and bound and state space search in artificial intelligence are introduced to tackle the problem of path-wise test data generation. e former is utilized to explore the space of potential solutions and the latter is adopted to construct the search tree dynamically. Heuristics are employed in the look-ahead stage of the search. Dynamic variable ordering is presented with a heuristic rule to break ties, values of a variable are determined by the monotonicity analysis on branching conditions, and maintaining path consistency is achieved through analysis on the result of interval arithmetic. An optimization method is also proposed to reduce the search space. e results of empirical experiments show that the search is conducted in a basically backtrack-free manner, which ensures both test data generation with promising performance and its excellence over some currently existing static and dynamic methods in terms of coverage. e results also demonstrate that the proposed method is applicable in engineering. 1. Introduction Soſtware testing plays an irreplaceable role in the process of soſtware development, as it is an important stage to guarantee soſtware reliability [1], which is a significant soſtware quality feature [2]. It is estimated that testing cost has accounted for almost 50 percent of the entire development cost [3], if not more, but manual testing is time-consuming and error-prone with low efficiency and is even impracticable for large-scale programs such as a Windows project with millions of lines of codes (LOC) [4]. erefore, the automation of testing is an urgent issue. Furthermore, as a basic problem in soſtware testing, path-wise test data generation (denoted as Q) is of particular importance because path-wise testing can detect almost 65 percent of the faults in the program under test (PUT) [5] and many problems in soſtware testing can be transformed into Q. e methods of solving Q can be categorized as static and dynamic. e static methods utilize techniques including symbolic execution [6, 7] and interval arithmetic [8, 9] to analyze the PUT without executing it. e process of generating test data is definite with relatively less cost. ey abstract the constraints to be satisfied and propagate and solve these constraints to obtain the test data. Due to their precision in generating test data and the ability to prove that some paths are infeasible, the static methods have been widely studied by many researchers. DeMillo and Offutt [10] proposed a fault-based technique that used algebraic constraints to describe test data designed to find particular types of faults. Gotlieb et al. [11] introduced “static single assignment” into a constraint system and solved the system. Cadar et al. from Stanford University proposed a symbolic execution tool named KLEE [12] and employed a variety of constraint solving optimizations. ey represented program Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2014, Article ID 642630, 19 pages http://dx.doi.org/10.1155/2014/642630

Upload: others

Post on 05-Apr-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Research ArticlePath-Wise Test Data Generation Based on HeuristicLook-Ahead Methods

Ying Xing12 Yun-Zhan Gong1 Ya-Wen Wang13 and Xu-Zhou Zhang1

1 State Key Laboratory of Networking and Switching Technology Beijing University of Posts and TelecommunicationsBeijing 100876 China

2 School of Electronic and Information Engineering Liaoning Technical University Huludao 125105 China3 State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of SciencesBeijing 100190 China

Correspondence should be addressed to Ying Xing faithyingxinggmailcom

Received 5 July 2013 Revised 1 December 2013 Accepted 2 December 2013 Published 14 May 2014

Academic Editor John Gunnar Carlsson

Copyright copy 2014 Ying Xing et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Path-wise test data generation is generally considered an important problem in the automation of software testing In essence itis a constraint optimization problem which is often solved by search methods such as backtracking algorithms In this paper thebacktracking algorithm branch and bound and state space search in artificial intelligence are introduced to tackle the problemof path-wise test data generation The former is utilized to explore the space of potential solutions and the latter is adopted toconstruct the search tree dynamically Heuristics are employed in the look-ahead stage of the search Dynamic variable orderingis presented with a heuristic rule to break ties values of a variable are determined by the monotonicity analysis on branchingconditions and maintaining path consistency is achieved through analysis on the result of interval arithmetic An optimizationmethod is also proposed to reduce the search space The results of empirical experiments show that the search is conducted ina basically backtrack-free manner which ensures both test data generation with promising performance and its excellence oversome currently existing static and dynamic methods in terms of coverage The results also demonstrate that the proposed methodis applicable in engineering

1 Introduction

Software testing plays an irreplaceable role in the process ofsoftware development as it is an important stage to guaranteesoftware reliability [1] which is a significant software qualityfeature [2] It is estimated that testing cost has accounted foralmost 50 percent of the entire development cost [3] if notmore but manual testing is time-consuming and error-pronewith low efficiency and is even impracticable for large-scaleprograms such as a Windows project with millions of linesof codes (LOC) [4] Therefore the automation of testing isan urgent issue Furthermore as a basic problem in softwaretesting path-wise test data generation (denoted as Q) is ofparticular importance because path-wise testing can detectalmost 65 percent of the faults in the program under test(PUT) [5] and many problems in software testing can betransformed into Q

The methods of solving Q can be categorized as staticand dynamicThe staticmethods utilize techniques includingsymbolic execution [6 7] and interval arithmetic [8 9]to analyze the PUT without executing it The process ofgenerating test data is definite with relatively less cost Theyabstract the constraints to be satisfied and propagate andsolve these constraints to obtain the test data Due to theirprecision in generating test data and the ability to provethat some paths are infeasible the static methods have beenwidely studied by many researchers DeMillo and Offutt[10] proposed a fault-based technique that used algebraicconstraints to describe test data designed to find particulartypes of faults Gotlieb et al [11] introduced ldquostatic singleassignmentrdquo into a constraint system and solved the systemCadar et al from Stanford University proposed a symbolicexecution tool named KLEE [12] and employed a variety ofconstraint solving optimizations They represented program

Hindawi Publishing CorporationMathematical Problems in EngineeringVolume 2014 Article ID 642630 19 pageshttpdxdoiorg1011552014642630

2 Mathematical Problems in Engineering

states compactly and used searching heuristics to reach highcode coverage In 2013 Yawen et al [13] proposed an intervalanalysis algorithm using forward data-flow analysis Butno matter what techniques are adopted the static methodsrequire a strong constraint solver

The dynamic methods including metaheuristic search(MHS) algorithms [18] such as genetic algorithms [19] antcolony optimization [15] and simulated annealing [20] allrequire the actual execution of the PUT They select a groupof test data (usually randomly) in advance and execute itto observe whether the goal is reached that is coveragecriteria are satisfied or faults are detected and if not they spotthe problem and alter the values of some input variables tomake the PUT execute in the expected way They are flexiblemethods as they can easily change the input in the testingprocess but they are sensitive to the search-space size thediversity of initial population the effectiveness of evolutionoperators and the quality of fitness function [21] Therepeated exploration requires a large number of iterationssometimes even causing iteration exceptionThe randomnessof initial values is also a big problem because it bringsuncertainty to the search result [22]

In this paper considering the drawbacks of the dynamicmethods mentioned above and the demand for static meth-ods we propose a new static test data generation methodbased on Code Test System (CTS) (httpctstestingcn)which is a practical tool to test codes written in C program-ming languageOur contribution is threefold First path-wisetest data generation is defined as a constraint optimizationproblem (COP) Two techniques (state space search andbranch and bound) in artificial intelligence are integrated totackle the COP Second heuristics are adopted in the look-ahead stage of the search to improve the search efficiencyThird an optimization method is proposed to reduce thesearch space We try to evaluate the performance of ourmethod and the relationship between look-ahead and look-back techniques through the experimental results

The rest of this paper is organized as follows Section 2provides the background underlying our research The prob-lem Q is reformulated as a COP and the solution is pre-sented in Section 3 Section 4 illustrates the proposed searchstrategies in detail and describes the optimization methodused to reduce the search space The heuristic look-aheadtechniques with a case study are given in Section 5 Section 6focuses on experimental analyses and empirical evaluationsof the proposed approach as well as coverage comparisonswith some existing test data generation methods Section 7concludes this paper and highlights directions for futureresearch

2 Background

State space search [23 24] is a process in which successivestates of an instance are considered with the goal of findinga final state with a desired property Problems are normallymodeled as a state space a set of states that a problem canbe in The set of states forms a graph where two states areconnected if there is an operation which can be performed

to transform the first state into the second State space searchcharacterizes problem solving as the process of finding asolution path from an initial state to a final state In statespace search the nodes of the search tree are correspondingto partial problem solution and the arcs are corresponding tosteps in a problem-solving process State space search differsfrom traditional search methods because the state space isimplicit the typical state space is too large to generate andstore in memory Instead nodes are generated as they areexplored and typically discarded thereafter

Branch and bound (BB) [25 26] is an efficient back-tracking algorithm for searching the solution space of aproblem as well as a common search technique to solveoptimization problems The advantage of the BB strategy liesin alternating branching and bounding operations on the setof active and extensive nodes of a search tree Branching refersto partitioning of the solution space (generating the childnodes) bounding refers to lowering bounds used to constructa proof of feasibility without exhaustive search (evaluatingthe cost of new child nodes) The techniques for improvingBB are categorized as look-ahead and look-back methodsLook-ahead methods [27] are invoked whenever the searchis preparing to extend the current partial solution and theyconcern the following problems (1) how to select the nextvariable to be instantiated or to be assigned a value (2) howto select a value to instantiate a variable (3) how to reducethe search space by maintaining a certain level of consis-tency Look-back methods are invoked whenever the searchencounters a dead end and is preparing for the backtrackingstep and they can be classified into chronological backtrack-ing and backjumping

An important static testing technique adopted in thispaper is interval arithmetic [8 9 28ndash30] which representseach value as a range of possibilities An interval is acontinuous range in the form of [minmax] while a domainis a set of intervals For example if an integer variable119909 rangesfrom minus3 to 6 but it cannot be equal to 0 then its domain isrepresented as [minus3 minus1] cup [1 6] which is composed of twointervals Interval arithmetic has a set of arithmetic rulesdefined on intervals It analyzes and calculates the rangesof variables starting from the entrance of the program andprovides precise information for further program analysisefficiently and reliably

3 Reformulation of Path-Wise TestData Generation

This section addresses the reformulation of path-wise testdata generation Problem definition and its solution are pre-sented in Sections 31 and 32 respectively

31 Problem Definition Many forms of test data generationmake references to the control flow graph (CFG) of theprogram in question [31] In this paper a CFG for a program119875 is a directed graph 119866 = (119873 119864 119894 119900) where 119873 is a set ofnodes 119864 is a set of edges and 119894 and 119900 are respective uniqueentry and exit nodes to the graph Each node 119899 isin 119873 is astatement in the program with each edge 119890 = (119899

119903 119899119905) isin 119864

Mathematical Problems in Engineering 3

0

87

9

10

11

12

T 1

T 3

T 5

F 2

F 4

F 6

entry 0

if head 1

stmt 2 if head 3

if out 8

exit 10

stmt 4

if out 9

if head 5

stmt 6

if out 7

void test(int x1 int x2 int x3)

1 if (x1-x2lt=0)

2 printf(ldquoPath1rdquo)

3 else if(x3-x2lt=0)

4 printf(ldquoPath2rdquo)

5

6 printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

Figure 1 Program test and its corresponding CFG

representing a transfer of control from node 119899119903to node

119899119905 Nodes corresponding to decision statements such as if

statements are branching nodes Outgoing edges from thesenodes are referred to as branches A path through a CFGis a sequence 119901 = (119899

1 1198992 119899

119902) such that for all 119903 1 le

119903 lt 119902 (119899119903 119899119903+1) isin 119864 A path 119901 is regarded as feasible

if there exists a program input for which 119901 is traversedotherwise 119901 is regarded as infeasible Then the problem Qcan be reformulated as a COP [32 33] as follows 119883 is a setof variables 119909

1 1199092 119909

119899 119863 = 119863

1 1198632 119863

119899 is a set

of domains and 119863119894isin 119863 (119894 = 1 2 119899) is a finite set of

possible values for 119909119894 For each path 119863 is defined based on

the variablesrsquo acceptable ranges One solution to the problemis a set of values to instantiate each variable inside its domaindenoted as 119881 = 119881

1 1198812 119881

119899 119881119894isin 119863119894to make path

119901 feasible Particularly each constraint defined by the PUTalong 119901 should be met to make it feasible

An example with a program test and its correspondingCFG is shown in Figure 1 where if out 7 if out 8 if out 9and exit 10 are dummy nodes Adopting branch coveragethere are four paths to be traversed namely Path1 0 rarr 1 rarr 2rarr 9 rarr 10 Path2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 10 Path3 0rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10 and Path4 0 rarr 1rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10Thenumbers along the pathsdenote nodes rather than edges of the CFG Assuming thatPath3 is the path to be traversed as shown in bold our workis to select 119881 = 119881

1 1198812 1198813 from 119863

1 1198632 1198633 for 119909

1 1199092 and

1199093 so that when executing test using 119881

1 1198812 1198813 as an input

the path traversed is Path3 There are three branching nodesif head 1 if head 3 and if head 5 along Path3 and threecorresponding branches F 2 F 4 and T 5 that contain theconstraints to be met

32 Solution to the Problem A COP is generally solved bysearch strategies among which backtracking algorithms [34]are widely used In this paper state space search and thebacktracking algorithm BB are introduced to solve the COP

mentioned aboveThe process of exploring the solution spaceis represented as state space search This representation willfacilitate the implementation of BB In classical BB searchnodes are always fully expanded that is for a given leaf nodeall child nodes are immediately added to the so-called openlist However considering that one solution is enough forpath-wise test data generation best-first-search is our firstchoice To find the best ordering of variables is requiredfor branching to prune the branches stretching out fromunneeded variables In addition as the domain of a variable isa finite set of possible valueswhichmay be quite large bound-ing is necessary to cut the unneeded or infeasible solutions InBB frame bisection [35] is often used to help prune unneededpart of the solution space Employing bisection this paperproposes best-first-search branch and bound (BFS-BB)to automatically generate the test data

It has been observed empirically that the enhancement oflook-ahead methods is sometimes counterproductive to theeffects of look-back methods [36] As for BFS-BB heuristicsare adopted in the look-ahead search Particularly they areused in the dynamic ordering of variables the selection of thevalues to assign a variable and the judgment of the feasibilityof the path after the assignment to a variable and the reduc-tion of the search space Chronological backtracking is usedfor look-back And from the results of the experiments wetry to seek out the relationship between look-ahead and look-back methods

During the search process variables are divided intothree sets past variables (short for PV already instantiated)current variable (now being instantiated) and future vari-ables (short for FV not yet instantiated) All the variablesinvolved in this paper are symbolic variables In the interestof simplicity the transformation from input variables tosymbolic variables and the inverse transformation are beyondthe scope of this paper In addition although the experimentswere carried out on benchmarks in the literature or industrialprograms of different variable types integer variables are usedfor brevity in the following algorithms

4 The Proposed Search Strategies

This section proposes the framework of the search strate-gies Particularly the representation of state space search isdescribed in detail in Section 41 which is followed by thesearch algorithmBFS-BB in Section 42 And an optimizationmethod in BFS-BB is explained in Section 43

41 The Representation of State Space Search A state is atuple (Precursor Variable Domain Value Type and Queue)Precursor provides a link to the previous state 119881119886119903119894119886119887119897119890 =119909119894isin 119883 (119894 = 1 2 119899) is the current variable 119863119900119898119886119894119899 =

119863119894119895sube 119863119894isin 119863 (119894 = 1 2 119899 119895 = 1 2 119898 119898 is the

branching factor or the threshold used to control the breadthof the search tree) in the form of [minmax] is the set ofpossible values to be selected to instantiate Variable 119881119886119897119906119890 =119881119894119895isin 119863119894119895is a value selected fromDomainTypemarks the type

of state active extensive or inactive Queue is a sequence ofvariables corresponding to the state in question

4 Mathematical Problems in Engineering

Best-first-search branch and boundState space searchInitialization

IVR

Initial state Final state

Heuristic look-ahead methods

Bisection

Path Test data

Heuristic look-ahead methods

Backtracking

MPC IDC MPC DVO IDCDVO PTC

⟨x1 V1⟩

⟨x2 V2⟩

⟨x3 V3⟩

⟨xn Vn⟩

Figure 2 Program test and its counterpart with branching conditions decomposed into basic functions

State space is a quadruple (119878 119860 119868 119865) where 119878 is a setof states 119860 is a set of connections between the states inaccordancewith the search operations 119868 is a nonempty subsetof 119878 denoting the initial state of the problem and 119865 is a non-empty subset of 119878 denoting the final state of the problem

State space search is all about finding one final state ina state space (which may be extremely large) Final meansthat every variable has been instantiated with a definite valuesuccessfully At the start of the search Precursor is null andwhen Queue is null the search ends The path made up of allthe extensive nodes in the search treemakes the solution pathThe state space needs to be searched to find a solution pathfrom an initial state to a final state

42 The Search Algorithm BFS-BB The idea of the searchalgorithm BFS-BB is to extend partial solutions At each stepa variable in FV is selected and assigned a value from itsdomain to extend the current partial solution It is checkedwhether such an extension may lead to a possible solutionof the COP and the subtrees containing no solutions basedon the current partial solution are pruned Some concepts inBFS-BB are explained as follows

Irrelevant variable removal (IVR) identifies variablesrelevant to the path to be traversed and removes thoseirrelevant Dynamic variable ordering (DVO) permutatesFV and returns a queue Path tendency calculation (PTC)calculates the path tendencies of all relevant variables alongthe path whichwill be used to calculate the domains inwhichtheir initial values are selected

Initial domain calculation (IDC) calculates the domain ofa variable in which its initial value is selected according toits path tendency calculated by PTC Bisection reduces thedomain of the current variable when its value just assignedfails to satisfy a constraint on the path Maintaining pathconsistency (MPC) utilizes interval arithmetic to determinewhether the domains of all variables satisfy the constraintsalong the path

The overview of our approach can be seen from Figure 2The path to be traversed is shown in the left part where thered circles represent nodes and the arrows represent edgesof the CFG The path contains the constraints to be metthe set of input variables and the domains corresponding tothe variables The first stage is to perform the initializationoperations At first IVR (see Section 43) is called to reducethe search space by removing irrelevant variables and leavingonly those relevant to the path Then four heuristic look-ahead methods take effect MPC (see Section 53) is used to

partially reduce the input domains of all variables and findinfeasible paths on occasion All the relevant variables in FVare permutated by DVO (see Section 51) to form a queue 119876

1

and its head 1199091is determined the best or the first variable

to be instantiated Next PTC (see Section 52) calculates pathtendency of each variable and IDC (see Section 52) reducesthe domain 119863

11in which the initial value 119881

11is selected

for 1199091 With all these the initial state is constructed as

(119899119906119897119897 1199091 11986311 11988111 119886119888119905119894V119890 and 119876

1) which is also the current

state 119878119888119906119903

shown as the red ringThe second stage implements state space search Four

heuristic look-ahead methods work in this stage To eachactive state MPC is carried out to determine the direction ofthe next search step If MPC succeeds Type becomes exten-sive the variables in FV will be permutated by DVO to get119876119906119890119906119890 = 119876

119894 119878119888119906119903

becomes Precursor and the head of 119876119894(119909119894)

will be Variable of next state Then IDC is used to calculatethe domain119863

1198941in which the initial value119881

1198941is selected for 119909

119894

With all these a new state (119875119903119890 119909119894 1198631198941 1198811198941 119886119888119905119894V119890 and 119876

119894) is

constructed for which the MPC check continues If after asuccessful MPC check no variable needs to be permutatedthen all the relevant variables have been assigned the rightvalues to make 119901 feasible The final state is reached shownas the red double ring Finally giving the irrelevant variablesrandom values fulfills the generation of the test data which isthe output of BFS-BB as shown in the right part of Figure 2If a MPC check fails Type remains active bisection (seeSection 52) is conducted to reduce the domain of Variableusing the information from the failed MPC check and Valueis reselected from the reduced domain all of which indicatethat the search will expand to a state with a different value forthe same variable 119909

119894 If all the values within its domain for 119909

119894

are tried out or the number of MPC checks has reached theupperbound119898 then 119909

119894is moved out of PV andType becomes

inactive In this case the search will have to backtrack toPrecursor at the higher level of the search tree as shown by thebidirectional arrow between backtracking and the heuristiclook-ahead methodsThe above-mentioned search process isdescribed by pseudocodes as Algorithm 1

43 Irrelevant Variable Removal As mentioned above 119883 =

1199091 1199092 119909

119899 is the set of input variables for a program 119875

The search space needs to involve every 119909119894(119894 = 1 2 119899)

in 119883 However it is possible that not every variable will beresponsible for determining whether every path in 119875 will betraversed or not Therefore when attempting to generate testdata for a particular path 119901 the search effort on the value

Mathematical Problems in Engineering 5

Input 119901 the path to be traversedOutput result ⟨119881119886119903119894119886119887119897119890 119881119886119897119906119890⟩ the test data making 119901 feasibleStage 1 Initialization(1) call Algorithm Irrelevant variable removal(2) resultlarr null(3) call Algorithm Maintaining path consistency(4) call Algorithm Dynamic variable ordering(5) call Algorithm Path tendency calculation(6) 119909

1larr head (119876

1)

(7) call Algorithm Initial domain calculation(8) 119881

11larr select (119863

11)

(9) initial statelarr (null 1199091 11986311 11988111 active 119876

1)

(10) 119878cur larr initial stateStage 2 State space searchBegin(11) for (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894) (119894 rarr 1 119899)

(12) path consistentlarr false(13) call Algorithm Maintaining path consistency(14) if (path consistent = true)(15) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 extensive 119876

119894)

(16) resultlarr result cup ⟨119909119894 119881119894119895⟩

(17) FVlarr FV minus 119909119894

(18) PVlarr PV + 119909119894

(19) call Algorithm Dynamic variable ordering(20) if (119876

119894= null)

(21) 119878cur larr final state(22) foreach 119909lowast isin 119883irrel(23) resultlarr result cup ⟨119909lowast 119881random⟩(24) else Prelarr 119878cur(25) 119909

119894larr head (119876

119894)

(26) call Algorithm Initial domain calculation(27) 119881

1198941larr select (119863

1198941)

(28) 119878cur larr (Pre 119909119894 1198631198941 1198811198941 active 119876

119894)

(29) else if (1003816100381610038161003816100381611986311989411989510038161003816100381610038161003816gt 1 ampamp 119895 lt 119898)

(30) call Algorithm Bisection(31) 119881

119894119895larr select (119863

119894119895)

(32) 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 active 119876

119894)

(33) else 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 inactive 119876

119894)

(34) 119875119903119890 larr 119878cur(35) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894)

(36) PVlarr PV minus 119909119894

(37) return resultEnd

Algorithm 1 Best-first-search branch and bound

of a variable which is not relevant to 119901 is wasted since itcannot influence the traversal of 119901Thus removing irrelevantvariables from the search space and only concentrating on thevariables relevant to the path of interest may improve the per-formance of the search Hence we propose an optimizationmethod irrelevant variable removal (IVR) Relevant variableand irrelevant variable are defined as follows

Definition 1 A relevant variable is an input variable that caninfluence whether a particular path 119901will be traversed or notTo put it more precisely for all the input variables 119909

119894| 119909119894isin

119883 119894 = 1 2 119899 there exists a corresponding set of values119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 with which 119901 is not traversed

But when the value of a particular variable is changed forexample when the value of 119909

119892(119881119892) is changed into 1198811015840

119892 119901 is

traversed with the input 1198811 1198812 119881

1015840

119892 119881

119899 Then 119909

119892is a

relevant variable to path 119901

Definition 2 An irrelevant variable is an input variable thatis not capable of influencing whether a particular path 119901 willbe traversed or not To put it more precisely for all the sets119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 of the search space of path 119901

with which 119901 is not traversed if 119901 is still not traversed withthe input 119881

1 1198812 119881

1015840

119892 119881

119899 when the value of a certain

variable 119909119892(119881119892) is changed into 1198811015840

119892 then 119909

119892is an irrelevant

variable to path 119901

6 Mathematical Problems in Engineering

Input Br(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the path119883 = 119909

1 1199092 119909

119899 the set of input variables

Output119883rel the set of relevant variables to the path119883irrel the set of irrelevant variables to the path

(1)119883rel larr Oslash(2)119883irrel larr Oslash(3) foreach Br(119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(4) if (119883rel = 119883)

(5) break(6) else if (119886

119895= 0)

(7) 119883rel larr 119883rel cup 119909119895(8)119883irrel larr 119883 minus119883rel(9) return 119883rel 119883irrel

Algorithm 2 Irrelevant variable removal

Generally for a particular path whether an input variableis relevant or irrelevant cannot be completely decided dueto the complex structure of programs But we can makeconservative estimate of irrelevancy with static control flowtechnique We give the most common condition in PUTsAssume that there are 119896 branches along a path each branch(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) needs to be traversed to find theset of relevant variables The removal of irrelevant variablesinvolves the judgment of whether a variable appears on eachbranch so we give the definition below which is utilizedby Algorithm 2 And considering the relation between thecomplexity of BFS-BB and the number of variables we giveProposition 4 about the effectiveness of IVR

Definition 3 The branching condition Br(119899119902119886 119899119902119886+1

) is theconstraint on the branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) and it canbe represented as

Br (119899119902119886 119899119902119886+1

) =

119899

sum119895=1

119886119895119909119895R119888 (1)

where R is a relational operator and 119886119895(119895 isin [1 119899]) and 119888 are

constants

Proposition 4 IVR may result in test data being searchedout with fewer MPC checks for a particular path 119901 than if allvariables are considered

Proof Thealgorithmbisection involves the search steps takenfor a certain variable under the same condition of othervariables which move in breadth (119898) until a value is foundto make MPC succeed Then 119898 is the base of the complexityof BFS-BB and the number of variables is the exponent Let119883rel denote the set of relevant variables to path119901 and let119883irrelbe the set of irrelevant variables onemore element in119883rel willinvolve more MPC checks on an exponential basis If all theirrelevant variables are removed from the search space thecomplexity will be reduced by119898|119883irrel| |119883irrel| is the cardinalityof the set of irrelevant variables

We conduct IVR for all the paths in Figure 1 and theprocess is shown in Table 1 The position where a variable isjudged relevant to the path of interest is highlighted in bold

5 The Heuristic Look-Ahead Methods

In this section the heuristic look-ahead methods in BFS-BBare explained in detail in Sections 51 52 and 53 respec-tively And Section 54 provides a case study to illustrate thesemethods

51 Heuristics in Variable Ordering In practice the chief goalin designing variable ordering heuristics is to reduce the sizeof the overall search tree In our method the next variableto be instantiated is selected to be the one with the minimalremaining domain size (the size of the domain after removingthe values judged to be infeasible) because this canminimizethe size of the overall search treeThe technique to break ties isimportant as there are often variables with the same domainsize We use variablesrsquo ranks to break ties In case of a tie thevariable with the higher rank is selected This method givessubstantially better performance than picking one of the tyingvariables at random Rank is defined as follows

Definition 5 The rank of a branch (119899119902119886 119899119902119886+1

) (119886 isin [1 119896])

marks its level in the sequence of the branches along a pathdenoted as rank (119899

119902119886 119899119902119886+1

)The rank of the first branch is 1 the rank of the second

one is 2 and the ranks of those following can be obtainedanalogously The variables appearing on a branch enjoy thesame rank as the branch The rank of a variable on a branchwhere it does not appear is supposed to be infinity As avariable may appear on more than one branch it may havedifferent ranks The rule to break ties according to the ranksof variables is based on the heuristics from interval arithmeticthat the earlier a variable appears on a path the greaterinfluence it has on the result of interval arithmetic along thepath Therefore if the ordering by rank is taken between avariable that appears on the branch (119899

119902119886 119899119902119886+1

) and a variablethat does not then the former has a higher rank That isbecause on the branch (119899

119902119886 119899119902119886+1

) the former has rank 119886

Mathematical Problems in Engineering 7

Table 1 IVR process for each path of test in Figure 1

Path Branching condition 1198861

1198862

1198863

119883rel 119883irrel

Path 1 0 rarr 1 rarr 2 rarr 9 rarr 10 1199091 minus 1199092 le 0 1 minus1 0 1199091 1199092 1199093

Path 2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 101199091 minus 1199092 gt 0 1 minus1 0

1199091 1199092 1199093 Oslash1199093 minus 1199092 le 0 0 minus1 1

Path 3 0 rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 ge minus5 mdash mdash mdash

Path 4 0 rarr 1 rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 lt minus5 mdash mdash mdash

Input FV the set of future variables119863119894 the domain of 119909

119894(119909119894isin FV)

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branches along the pathOutput 119876

119894 a queue of FV

Begin(1) 119876119894larr quicksort (FV 1003816100381610038161003816119863119894

1003816100381610038161003816)(2) for 119894 rarr 1 1003816100381610038161003816119876119894

1003816100381610038161003816(3) if (1003816100381610038161003816119863119894

1003816100381610038161003816 =10038161003816100381610038161003816119863119895

10038161003816100381610038161003816) (119895 gt 119894 119909

119894 119909119895isin 119876119894)

(4) break(5) else for (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(6) if (rank(119899

119902119886 119899119902119886+1

)(119909119894) = rank(119899

119902119886 119899119902119886+1

)(119909119895))

(7) 119886++(8) else permutate 119909

119894 119909119895by rank(119899

119902119886 119899119902119886+1

)(9) break(10) return 119876

119894

End

Algorithm 3 Dynamic variable ordering

while the latter has rank infinity The comparison between119886 and infinity determines the ordering The algorithm isdescribed by pseudocodes in Algorithm 3

Quicksort is utilized when permutating variables accord-ing to remaining domain size and returns 119876

119894as a result If

no variables have the same domain size then DVO finishesBut if there are variables whose domain sizes are the same asthat of the head of119876

119894 then the ordering by rank is under way

which will terminate as soon as different ranks appear

52 Heuristics in Value Selection DVO determines the nextvariable to be instantiated and then the value selectionstrategies are employed Considering the difference betweenthe variable in question (eg 119909

119894) and other variables the

branching condition defined by formula (1) can be furtherrepresented as a function of 119909

119894

Br (119899119902119886 119899119902119886+1

) (119909119894) 119863119894997888rarr 119861 = (119886

119894119909119894+ sum119895 = 119894

119886119895119909119895)R119888 (2)

where 119863119894is the domain of 119909

119894and 119861 is a set of Boolean

values 119905119903119906119890 119891119886119897119904119890 sum119895 = 119894119886119895119909119895is the linear combination of

the variables except 119909119894and is regarded as a constant Then

we can design the value selection strategies starting from themonotonic relation between the branching condition and 119909

119894

Monotonicity describes the behavior of a function in relationto the change of the input It gives an indication whether theoutput of the function moves in the same direction as theinput or in the reverse direction If a branching condition isa function whose monotonicity is known the direction inwhich the input needs to be moved to make the functiontrue can be determined The following proposition gives anattribute of a function composed of piecewise monotonicfunctions

Proposition 6 Assume that 1198911 1198831rarr 119884

1 1198912 1198832rarr

1198842 119891

119898 119883119898rarr 119884119898is a family of piecewise monotonic

functions with 119884119894sube 119883119894+1

Let 119865119898 1198831rarr 119884119898be a composed

function 119891119898∘ 119891119898minus1

∘ sdot sdot sdot ∘ 1198911 On this assumption 119865

119898is also

piecewise monotonic

Proof Mathematical induction is used to prove the proposi-tion

(i) Case 1198651= 1198911 Function 119891

1is piecewise monotonic

by assumption 1198651is equal to 119891

1 so it has the same

attribute(ii) Case 119865

119894+1= 119891119894+1∘ 119865119894 The composed function 119865

119894is

piecewise monotonic by the induction assumptionlet 119868 be a subset of its domainrsquos partition and let 119909and 1199091015840 be two arbitrary elements in 119868 with 119909le

1198831199091015840

then one of themonotonicity conditions holds that iseither 119865

119894(119909) le119884119894119865119894(1199091015840

) or 119865119894(119909) ge119884119894119865119894(1199091015840

) For simplic-ity we denote it as 119865

119894(119909)R119865

119894(1199091015840

) where R isin le geFunction 119891

119894+1is piecewise monotonic by assumption

The monotonicity condition is satisfied by 119865119894(119909) and

119865119894(1199091015840

) if both lie in the same subset 1198681015840of its domainrsquospartition Then 119891

119894+1(119909)R119891

119894+1(1199091015840

) holds and 119891119894+1

isalso monotonic on 1198681015840

After decomposing a branching condition into its basicfunctions its monotonicity can be utilized in the selectionof the initial value as well as other values of the variable inquestion

521 Initial Value Selection Initial values of variables are ofgreat importance to a search algorithm On the one hand in a

8 Mathematical Problems in Engineering

backtrack-free search the initial value of a variable is almostpart of the solution On the other hand the selection of initialvalues affectswhether the searchwill be backtrack-free Initialvalues are often selected at random in MHS methods whichreturn different test data each time allowing diversity butrandomness without any heuristics is a kind of blind searchand causes too many iterations sometimes even exceptionMeanwhile midvalues are selected in methods using bisec-tion so it is obvious that sometimes the same result may bereturned since the same initial value is always selected In ourmethod the above twomethods are combined and the initialvalue of a variable is determined based on its path tendencywhich is defined and calculated as follows

Definition 7 Path tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is anattribute of a variable on a path which is in favor of thesatisfaction of all the branching conditions along the pathAnd it provides the information about where to select itsinitial value Positive implies that a larger initial value willwork better while negative implies that a smaller initial valueis better

The calculation of the path tendency of a variable119909119894involves the calculation of its weight on each branch

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) and its path weight denoted as119908119894(119899119902119886 119899119902119886+1

) and 119901119908119894 which are calculated as (3)

119908119894(119899119902119886 119899119902119886+1

)

=

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing

minus

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

119901119908119894=

119896

sum119886=1

119908119894(119899119902119886 119899119902119886+1

)

(3)

Path tendency calculation (PTC) gleans the path ten-dency of each variable with 119901119908

119894 Subsequently initial domain

calculation (IDC) works on the result of PTC In this waythe initial value selection allows for both diversity andheuristics The algorithms are expressed by pseudo-codes inAlgorithms 4 and 5

522 Bisection by Tendency Bisection functions only whena value (including the initial value) assigned to the currentvariable 119909

119894is judged to be infeasible and the conflicted branch

(119899119902119886 119899119902119886+1

)with the false branching condition is locatedThenthe tendency of 119909

119894is used by bisection defined as follows

Definition 8 Tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is an attributeof a variable at a branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) determinedby the analysis on the monotonicity of the correspondingbranching condition and it provides the information aboutwhere to select a value to better satisfy the branching

condition Positive implies that a larger value will work betterwhile negative implies that a smaller value is better It iscalculated according to the following formula

119879119890119899119889119890119899119888119910 (119909119894)

=

119901119900119904119894119905119894V119890 if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing119899119890119892119886119905119894V119890 if Br (119909

119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

(4)

Each branch holds a tendencymap ⟨119881119886119903119894119886119887119897119890 119879119890119899119889119890119899119888119910⟩which includes the variables appearing on the branch andtheir corresponding tendencies With the tendency mapbisection can be applied to reduce the domain of 119909

119894(119863119894119895)

leading the branching condition to be true as presented bypseudo-codes in Algorithm 6

For example if the conflicted branch is the first branchof Path3 in Figure 1 then the corresponding branchingcondition is 1199091 minus 1199092 gt 0 which has different monotonicrelations with 1199091 and 1199092 respectively Table 2 shows how touse bisection to reduce the domains of variables If the currentvariable is 1199091 then retrieval of tendency map returns positiveindicating that a larger value will help satisfy the branchingcondition sowe reduce its domain to the larger part But if thecurrent variable is 1199092 bisection will function in the oppositeway due to the opposite monotonic relation

53 Heuristics in Maintaining Path Consistency As men-tioned in Section 42 MPC can be used in both stages ofBFS-BB In this part the focus is on the state space searchstage A value assigned to the current variable 119909

119894 no matter

it is the initial value or another value selected after bisectionshould be examined by interval arithmetic to see whether it ispart of the solution Path consistency is a prerequisite for thesuccess of interval arithmetic In the implementation of BFS-BB interval arithmetic is enhanced to provide more preciseinterval information The enhancement is to make clear howthe value of the branching condition defined by formula (2)is calculated as shown in formula (5) Here we use 119863119886 todenote the domain of all variables before calculating the 119886thbranching condition Besides a library of inverse functions isadded in case of the occurrences of library functions in thePUT Consider

Br (119899119902119886 119899119902119886+1

) (119909119894)

=

119905119903119906119890 if (119899119902119886 119899119902119886+1

) is traversedwith 119863119886 (119881

119894119895isin 119863119886

)

119891119886119897119904119890 otherwise

(5)

Hence for 119896 branching nodes along path 119901 all the119896 branching conditions should be true to maintain pathconsistency MPC receives the value of the current variable 119909

119894

(119881119894119895) which is part of the domain of all variables denoted as

1198631

(119881119894119895= [119881119894119895 119881119894119895] isin 1198631

) and evaluates the branching condi-tion corresponding to the branch (119899

1199021 1198991199021+1

) where 1198991199021is the

first branching node The branching condition Br(1198991199021 1198991199021+1

)

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

2 Mathematical Problems in Engineering

states compactly and used searching heuristics to reach highcode coverage In 2013 Yawen et al [13] proposed an intervalanalysis algorithm using forward data-flow analysis Butno matter what techniques are adopted the static methodsrequire a strong constraint solver

The dynamic methods including metaheuristic search(MHS) algorithms [18] such as genetic algorithms [19] antcolony optimization [15] and simulated annealing [20] allrequire the actual execution of the PUT They select a groupof test data (usually randomly) in advance and execute itto observe whether the goal is reached that is coveragecriteria are satisfied or faults are detected and if not they spotthe problem and alter the values of some input variables tomake the PUT execute in the expected way They are flexiblemethods as they can easily change the input in the testingprocess but they are sensitive to the search-space size thediversity of initial population the effectiveness of evolutionoperators and the quality of fitness function [21] Therepeated exploration requires a large number of iterationssometimes even causing iteration exceptionThe randomnessof initial values is also a big problem because it bringsuncertainty to the search result [22]

In this paper considering the drawbacks of the dynamicmethods mentioned above and the demand for static meth-ods we propose a new static test data generation methodbased on Code Test System (CTS) (httpctstestingcn)which is a practical tool to test codes written in C program-ming languageOur contribution is threefold First path-wisetest data generation is defined as a constraint optimizationproblem (COP) Two techniques (state space search andbranch and bound) in artificial intelligence are integrated totackle the COP Second heuristics are adopted in the look-ahead stage of the search to improve the search efficiencyThird an optimization method is proposed to reduce thesearch space We try to evaluate the performance of ourmethod and the relationship between look-ahead and look-back techniques through the experimental results

The rest of this paper is organized as follows Section 2provides the background underlying our research The prob-lem Q is reformulated as a COP and the solution is pre-sented in Section 3 Section 4 illustrates the proposed searchstrategies in detail and describes the optimization methodused to reduce the search space The heuristic look-aheadtechniques with a case study are given in Section 5 Section 6focuses on experimental analyses and empirical evaluationsof the proposed approach as well as coverage comparisonswith some existing test data generation methods Section 7concludes this paper and highlights directions for futureresearch

2 Background

State space search [23 24] is a process in which successivestates of an instance are considered with the goal of findinga final state with a desired property Problems are normallymodeled as a state space a set of states that a problem canbe in The set of states forms a graph where two states areconnected if there is an operation which can be performed

to transform the first state into the second State space searchcharacterizes problem solving as the process of finding asolution path from an initial state to a final state In statespace search the nodes of the search tree are correspondingto partial problem solution and the arcs are corresponding tosteps in a problem-solving process State space search differsfrom traditional search methods because the state space isimplicit the typical state space is too large to generate andstore in memory Instead nodes are generated as they areexplored and typically discarded thereafter

Branch and bound (BB) [25 26] is an efficient back-tracking algorithm for searching the solution space of aproblem as well as a common search technique to solveoptimization problems The advantage of the BB strategy liesin alternating branching and bounding operations on the setof active and extensive nodes of a search tree Branching refersto partitioning of the solution space (generating the childnodes) bounding refers to lowering bounds used to constructa proof of feasibility without exhaustive search (evaluatingthe cost of new child nodes) The techniques for improvingBB are categorized as look-ahead and look-back methodsLook-ahead methods [27] are invoked whenever the searchis preparing to extend the current partial solution and theyconcern the following problems (1) how to select the nextvariable to be instantiated or to be assigned a value (2) howto select a value to instantiate a variable (3) how to reducethe search space by maintaining a certain level of consis-tency Look-back methods are invoked whenever the searchencounters a dead end and is preparing for the backtrackingstep and they can be classified into chronological backtrack-ing and backjumping

An important static testing technique adopted in thispaper is interval arithmetic [8 9 28ndash30] which representseach value as a range of possibilities An interval is acontinuous range in the form of [minmax] while a domainis a set of intervals For example if an integer variable119909 rangesfrom minus3 to 6 but it cannot be equal to 0 then its domain isrepresented as [minus3 minus1] cup [1 6] which is composed of twointervals Interval arithmetic has a set of arithmetic rulesdefined on intervals It analyzes and calculates the rangesof variables starting from the entrance of the program andprovides precise information for further program analysisefficiently and reliably

3 Reformulation of Path-Wise TestData Generation

This section addresses the reformulation of path-wise testdata generation Problem definition and its solution are pre-sented in Sections 31 and 32 respectively

31 Problem Definition Many forms of test data generationmake references to the control flow graph (CFG) of theprogram in question [31] In this paper a CFG for a program119875 is a directed graph 119866 = (119873 119864 119894 119900) where 119873 is a set ofnodes 119864 is a set of edges and 119894 and 119900 are respective uniqueentry and exit nodes to the graph Each node 119899 isin 119873 is astatement in the program with each edge 119890 = (119899

119903 119899119905) isin 119864

Mathematical Problems in Engineering 3

0

87

9

10

11

12

T 1

T 3

T 5

F 2

F 4

F 6

entry 0

if head 1

stmt 2 if head 3

if out 8

exit 10

stmt 4

if out 9

if head 5

stmt 6

if out 7

void test(int x1 int x2 int x3)

1 if (x1-x2lt=0)

2 printf(ldquoPath1rdquo)

3 else if(x3-x2lt=0)

4 printf(ldquoPath2rdquo)

5

6 printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

Figure 1 Program test and its corresponding CFG

representing a transfer of control from node 119899119903to node

119899119905 Nodes corresponding to decision statements such as if

statements are branching nodes Outgoing edges from thesenodes are referred to as branches A path through a CFGis a sequence 119901 = (119899

1 1198992 119899

119902) such that for all 119903 1 le

119903 lt 119902 (119899119903 119899119903+1) isin 119864 A path 119901 is regarded as feasible

if there exists a program input for which 119901 is traversedotherwise 119901 is regarded as infeasible Then the problem Qcan be reformulated as a COP [32 33] as follows 119883 is a setof variables 119909

1 1199092 119909

119899 119863 = 119863

1 1198632 119863

119899 is a set

of domains and 119863119894isin 119863 (119894 = 1 2 119899) is a finite set of

possible values for 119909119894 For each path 119863 is defined based on

the variablesrsquo acceptable ranges One solution to the problemis a set of values to instantiate each variable inside its domaindenoted as 119881 = 119881

1 1198812 119881

119899 119881119894isin 119863119894to make path

119901 feasible Particularly each constraint defined by the PUTalong 119901 should be met to make it feasible

An example with a program test and its correspondingCFG is shown in Figure 1 where if out 7 if out 8 if out 9and exit 10 are dummy nodes Adopting branch coveragethere are four paths to be traversed namely Path1 0 rarr 1 rarr 2rarr 9 rarr 10 Path2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 10 Path3 0rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10 and Path4 0 rarr 1rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10Thenumbers along the pathsdenote nodes rather than edges of the CFG Assuming thatPath3 is the path to be traversed as shown in bold our workis to select 119881 = 119881

1 1198812 1198813 from 119863

1 1198632 1198633 for 119909

1 1199092 and

1199093 so that when executing test using 119881

1 1198812 1198813 as an input

the path traversed is Path3 There are three branching nodesif head 1 if head 3 and if head 5 along Path3 and threecorresponding branches F 2 F 4 and T 5 that contain theconstraints to be met

32 Solution to the Problem A COP is generally solved bysearch strategies among which backtracking algorithms [34]are widely used In this paper state space search and thebacktracking algorithm BB are introduced to solve the COP

mentioned aboveThe process of exploring the solution spaceis represented as state space search This representation willfacilitate the implementation of BB In classical BB searchnodes are always fully expanded that is for a given leaf nodeall child nodes are immediately added to the so-called openlist However considering that one solution is enough forpath-wise test data generation best-first-search is our firstchoice To find the best ordering of variables is requiredfor branching to prune the branches stretching out fromunneeded variables In addition as the domain of a variable isa finite set of possible valueswhichmay be quite large bound-ing is necessary to cut the unneeded or infeasible solutions InBB frame bisection [35] is often used to help prune unneededpart of the solution space Employing bisection this paperproposes best-first-search branch and bound (BFS-BB)to automatically generate the test data

It has been observed empirically that the enhancement oflook-ahead methods is sometimes counterproductive to theeffects of look-back methods [36] As for BFS-BB heuristicsare adopted in the look-ahead search Particularly they areused in the dynamic ordering of variables the selection of thevalues to assign a variable and the judgment of the feasibilityof the path after the assignment to a variable and the reduc-tion of the search space Chronological backtracking is usedfor look-back And from the results of the experiments wetry to seek out the relationship between look-ahead and look-back methods

During the search process variables are divided intothree sets past variables (short for PV already instantiated)current variable (now being instantiated) and future vari-ables (short for FV not yet instantiated) All the variablesinvolved in this paper are symbolic variables In the interestof simplicity the transformation from input variables tosymbolic variables and the inverse transformation are beyondthe scope of this paper In addition although the experimentswere carried out on benchmarks in the literature or industrialprograms of different variable types integer variables are usedfor brevity in the following algorithms

4 The Proposed Search Strategies

This section proposes the framework of the search strate-gies Particularly the representation of state space search isdescribed in detail in Section 41 which is followed by thesearch algorithmBFS-BB in Section 42 And an optimizationmethod in BFS-BB is explained in Section 43

41 The Representation of State Space Search A state is atuple (Precursor Variable Domain Value Type and Queue)Precursor provides a link to the previous state 119881119886119903119894119886119887119897119890 =119909119894isin 119883 (119894 = 1 2 119899) is the current variable 119863119900119898119886119894119899 =

119863119894119895sube 119863119894isin 119863 (119894 = 1 2 119899 119895 = 1 2 119898 119898 is the

branching factor or the threshold used to control the breadthof the search tree) in the form of [minmax] is the set ofpossible values to be selected to instantiate Variable 119881119886119897119906119890 =119881119894119895isin 119863119894119895is a value selected fromDomainTypemarks the type

of state active extensive or inactive Queue is a sequence ofvariables corresponding to the state in question

4 Mathematical Problems in Engineering

Best-first-search branch and boundState space searchInitialization

IVR

Initial state Final state

Heuristic look-ahead methods

Bisection

Path Test data

Heuristic look-ahead methods

Backtracking

MPC IDC MPC DVO IDCDVO PTC

⟨x1 V1⟩

⟨x2 V2⟩

⟨x3 V3⟩

⟨xn Vn⟩

Figure 2 Program test and its counterpart with branching conditions decomposed into basic functions

State space is a quadruple (119878 119860 119868 119865) where 119878 is a setof states 119860 is a set of connections between the states inaccordancewith the search operations 119868 is a nonempty subsetof 119878 denoting the initial state of the problem and 119865 is a non-empty subset of 119878 denoting the final state of the problem

State space search is all about finding one final state ina state space (which may be extremely large) Final meansthat every variable has been instantiated with a definite valuesuccessfully At the start of the search Precursor is null andwhen Queue is null the search ends The path made up of allthe extensive nodes in the search treemakes the solution pathThe state space needs to be searched to find a solution pathfrom an initial state to a final state

42 The Search Algorithm BFS-BB The idea of the searchalgorithm BFS-BB is to extend partial solutions At each stepa variable in FV is selected and assigned a value from itsdomain to extend the current partial solution It is checkedwhether such an extension may lead to a possible solutionof the COP and the subtrees containing no solutions basedon the current partial solution are pruned Some concepts inBFS-BB are explained as follows

Irrelevant variable removal (IVR) identifies variablesrelevant to the path to be traversed and removes thoseirrelevant Dynamic variable ordering (DVO) permutatesFV and returns a queue Path tendency calculation (PTC)calculates the path tendencies of all relevant variables alongthe path whichwill be used to calculate the domains inwhichtheir initial values are selected

Initial domain calculation (IDC) calculates the domain ofa variable in which its initial value is selected according toits path tendency calculated by PTC Bisection reduces thedomain of the current variable when its value just assignedfails to satisfy a constraint on the path Maintaining pathconsistency (MPC) utilizes interval arithmetic to determinewhether the domains of all variables satisfy the constraintsalong the path

The overview of our approach can be seen from Figure 2The path to be traversed is shown in the left part where thered circles represent nodes and the arrows represent edgesof the CFG The path contains the constraints to be metthe set of input variables and the domains corresponding tothe variables The first stage is to perform the initializationoperations At first IVR (see Section 43) is called to reducethe search space by removing irrelevant variables and leavingonly those relevant to the path Then four heuristic look-ahead methods take effect MPC (see Section 53) is used to

partially reduce the input domains of all variables and findinfeasible paths on occasion All the relevant variables in FVare permutated by DVO (see Section 51) to form a queue 119876

1

and its head 1199091is determined the best or the first variable

to be instantiated Next PTC (see Section 52) calculates pathtendency of each variable and IDC (see Section 52) reducesthe domain 119863

11in which the initial value 119881

11is selected

for 1199091 With all these the initial state is constructed as

(119899119906119897119897 1199091 11986311 11988111 119886119888119905119894V119890 and 119876

1) which is also the current

state 119878119888119906119903

shown as the red ringThe second stage implements state space search Four

heuristic look-ahead methods work in this stage To eachactive state MPC is carried out to determine the direction ofthe next search step If MPC succeeds Type becomes exten-sive the variables in FV will be permutated by DVO to get119876119906119890119906119890 = 119876

119894 119878119888119906119903

becomes Precursor and the head of 119876119894(119909119894)

will be Variable of next state Then IDC is used to calculatethe domain119863

1198941in which the initial value119881

1198941is selected for 119909

119894

With all these a new state (119875119903119890 119909119894 1198631198941 1198811198941 119886119888119905119894V119890 and 119876

119894) is

constructed for which the MPC check continues If after asuccessful MPC check no variable needs to be permutatedthen all the relevant variables have been assigned the rightvalues to make 119901 feasible The final state is reached shownas the red double ring Finally giving the irrelevant variablesrandom values fulfills the generation of the test data which isthe output of BFS-BB as shown in the right part of Figure 2If a MPC check fails Type remains active bisection (seeSection 52) is conducted to reduce the domain of Variableusing the information from the failed MPC check and Valueis reselected from the reduced domain all of which indicatethat the search will expand to a state with a different value forthe same variable 119909

119894 If all the values within its domain for 119909

119894

are tried out or the number of MPC checks has reached theupperbound119898 then 119909

119894is moved out of PV andType becomes

inactive In this case the search will have to backtrack toPrecursor at the higher level of the search tree as shown by thebidirectional arrow between backtracking and the heuristiclook-ahead methodsThe above-mentioned search process isdescribed by pseudocodes as Algorithm 1

43 Irrelevant Variable Removal As mentioned above 119883 =

1199091 1199092 119909

119899 is the set of input variables for a program 119875

The search space needs to involve every 119909119894(119894 = 1 2 119899)

in 119883 However it is possible that not every variable will beresponsible for determining whether every path in 119875 will betraversed or not Therefore when attempting to generate testdata for a particular path 119901 the search effort on the value

Mathematical Problems in Engineering 5

Input 119901 the path to be traversedOutput result ⟨119881119886119903119894119886119887119897119890 119881119886119897119906119890⟩ the test data making 119901 feasibleStage 1 Initialization(1) call Algorithm Irrelevant variable removal(2) resultlarr null(3) call Algorithm Maintaining path consistency(4) call Algorithm Dynamic variable ordering(5) call Algorithm Path tendency calculation(6) 119909

1larr head (119876

1)

(7) call Algorithm Initial domain calculation(8) 119881

11larr select (119863

11)

(9) initial statelarr (null 1199091 11986311 11988111 active 119876

1)

(10) 119878cur larr initial stateStage 2 State space searchBegin(11) for (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894) (119894 rarr 1 119899)

(12) path consistentlarr false(13) call Algorithm Maintaining path consistency(14) if (path consistent = true)(15) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 extensive 119876

119894)

(16) resultlarr result cup ⟨119909119894 119881119894119895⟩

(17) FVlarr FV minus 119909119894

(18) PVlarr PV + 119909119894

(19) call Algorithm Dynamic variable ordering(20) if (119876

119894= null)

(21) 119878cur larr final state(22) foreach 119909lowast isin 119883irrel(23) resultlarr result cup ⟨119909lowast 119881random⟩(24) else Prelarr 119878cur(25) 119909

119894larr head (119876

119894)

(26) call Algorithm Initial domain calculation(27) 119881

1198941larr select (119863

1198941)

(28) 119878cur larr (Pre 119909119894 1198631198941 1198811198941 active 119876

119894)

(29) else if (1003816100381610038161003816100381611986311989411989510038161003816100381610038161003816gt 1 ampamp 119895 lt 119898)

(30) call Algorithm Bisection(31) 119881

119894119895larr select (119863

119894119895)

(32) 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 active 119876

119894)

(33) else 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 inactive 119876

119894)

(34) 119875119903119890 larr 119878cur(35) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894)

(36) PVlarr PV minus 119909119894

(37) return resultEnd

Algorithm 1 Best-first-search branch and bound

of a variable which is not relevant to 119901 is wasted since itcannot influence the traversal of 119901Thus removing irrelevantvariables from the search space and only concentrating on thevariables relevant to the path of interest may improve the per-formance of the search Hence we propose an optimizationmethod irrelevant variable removal (IVR) Relevant variableand irrelevant variable are defined as follows

Definition 1 A relevant variable is an input variable that caninfluence whether a particular path 119901will be traversed or notTo put it more precisely for all the input variables 119909

119894| 119909119894isin

119883 119894 = 1 2 119899 there exists a corresponding set of values119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 with which 119901 is not traversed

But when the value of a particular variable is changed forexample when the value of 119909

119892(119881119892) is changed into 1198811015840

119892 119901 is

traversed with the input 1198811 1198812 119881

1015840

119892 119881

119899 Then 119909

119892is a

relevant variable to path 119901

Definition 2 An irrelevant variable is an input variable thatis not capable of influencing whether a particular path 119901 willbe traversed or not To put it more precisely for all the sets119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 of the search space of path 119901

with which 119901 is not traversed if 119901 is still not traversed withthe input 119881

1 1198812 119881

1015840

119892 119881

119899 when the value of a certain

variable 119909119892(119881119892) is changed into 1198811015840

119892 then 119909

119892is an irrelevant

variable to path 119901

6 Mathematical Problems in Engineering

Input Br(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the path119883 = 119909

1 1199092 119909

119899 the set of input variables

Output119883rel the set of relevant variables to the path119883irrel the set of irrelevant variables to the path

(1)119883rel larr Oslash(2)119883irrel larr Oslash(3) foreach Br(119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(4) if (119883rel = 119883)

(5) break(6) else if (119886

119895= 0)

(7) 119883rel larr 119883rel cup 119909119895(8)119883irrel larr 119883 minus119883rel(9) return 119883rel 119883irrel

Algorithm 2 Irrelevant variable removal

Generally for a particular path whether an input variableis relevant or irrelevant cannot be completely decided dueto the complex structure of programs But we can makeconservative estimate of irrelevancy with static control flowtechnique We give the most common condition in PUTsAssume that there are 119896 branches along a path each branch(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) needs to be traversed to find theset of relevant variables The removal of irrelevant variablesinvolves the judgment of whether a variable appears on eachbranch so we give the definition below which is utilizedby Algorithm 2 And considering the relation between thecomplexity of BFS-BB and the number of variables we giveProposition 4 about the effectiveness of IVR

Definition 3 The branching condition Br(119899119902119886 119899119902119886+1

) is theconstraint on the branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) and it canbe represented as

Br (119899119902119886 119899119902119886+1

) =

119899

sum119895=1

119886119895119909119895R119888 (1)

where R is a relational operator and 119886119895(119895 isin [1 119899]) and 119888 are

constants

Proposition 4 IVR may result in test data being searchedout with fewer MPC checks for a particular path 119901 than if allvariables are considered

Proof Thealgorithmbisection involves the search steps takenfor a certain variable under the same condition of othervariables which move in breadth (119898) until a value is foundto make MPC succeed Then 119898 is the base of the complexityof BFS-BB and the number of variables is the exponent Let119883rel denote the set of relevant variables to path119901 and let119883irrelbe the set of irrelevant variables onemore element in119883rel willinvolve more MPC checks on an exponential basis If all theirrelevant variables are removed from the search space thecomplexity will be reduced by119898|119883irrel| |119883irrel| is the cardinalityof the set of irrelevant variables

We conduct IVR for all the paths in Figure 1 and theprocess is shown in Table 1 The position where a variable isjudged relevant to the path of interest is highlighted in bold

5 The Heuristic Look-Ahead Methods

In this section the heuristic look-ahead methods in BFS-BBare explained in detail in Sections 51 52 and 53 respec-tively And Section 54 provides a case study to illustrate thesemethods

51 Heuristics in Variable Ordering In practice the chief goalin designing variable ordering heuristics is to reduce the sizeof the overall search tree In our method the next variableto be instantiated is selected to be the one with the minimalremaining domain size (the size of the domain after removingthe values judged to be infeasible) because this canminimizethe size of the overall search treeThe technique to break ties isimportant as there are often variables with the same domainsize We use variablesrsquo ranks to break ties In case of a tie thevariable with the higher rank is selected This method givessubstantially better performance than picking one of the tyingvariables at random Rank is defined as follows

Definition 5 The rank of a branch (119899119902119886 119899119902119886+1

) (119886 isin [1 119896])

marks its level in the sequence of the branches along a pathdenoted as rank (119899

119902119886 119899119902119886+1

)The rank of the first branch is 1 the rank of the second

one is 2 and the ranks of those following can be obtainedanalogously The variables appearing on a branch enjoy thesame rank as the branch The rank of a variable on a branchwhere it does not appear is supposed to be infinity As avariable may appear on more than one branch it may havedifferent ranks The rule to break ties according to the ranksof variables is based on the heuristics from interval arithmeticthat the earlier a variable appears on a path the greaterinfluence it has on the result of interval arithmetic along thepath Therefore if the ordering by rank is taken between avariable that appears on the branch (119899

119902119886 119899119902119886+1

) and a variablethat does not then the former has a higher rank That isbecause on the branch (119899

119902119886 119899119902119886+1

) the former has rank 119886

Mathematical Problems in Engineering 7

Table 1 IVR process for each path of test in Figure 1

Path Branching condition 1198861

1198862

1198863

119883rel 119883irrel

Path 1 0 rarr 1 rarr 2 rarr 9 rarr 10 1199091 minus 1199092 le 0 1 minus1 0 1199091 1199092 1199093

Path 2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 101199091 minus 1199092 gt 0 1 minus1 0

1199091 1199092 1199093 Oslash1199093 minus 1199092 le 0 0 minus1 1

Path 3 0 rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 ge minus5 mdash mdash mdash

Path 4 0 rarr 1 rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 lt minus5 mdash mdash mdash

Input FV the set of future variables119863119894 the domain of 119909

119894(119909119894isin FV)

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branches along the pathOutput 119876

119894 a queue of FV

Begin(1) 119876119894larr quicksort (FV 1003816100381610038161003816119863119894

1003816100381610038161003816)(2) for 119894 rarr 1 1003816100381610038161003816119876119894

1003816100381610038161003816(3) if (1003816100381610038161003816119863119894

1003816100381610038161003816 =10038161003816100381610038161003816119863119895

10038161003816100381610038161003816) (119895 gt 119894 119909

119894 119909119895isin 119876119894)

(4) break(5) else for (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(6) if (rank(119899

119902119886 119899119902119886+1

)(119909119894) = rank(119899

119902119886 119899119902119886+1

)(119909119895))

(7) 119886++(8) else permutate 119909

119894 119909119895by rank(119899

119902119886 119899119902119886+1

)(9) break(10) return 119876

119894

End

Algorithm 3 Dynamic variable ordering

while the latter has rank infinity The comparison between119886 and infinity determines the ordering The algorithm isdescribed by pseudocodes in Algorithm 3

Quicksort is utilized when permutating variables accord-ing to remaining domain size and returns 119876

119894as a result If

no variables have the same domain size then DVO finishesBut if there are variables whose domain sizes are the same asthat of the head of119876

119894 then the ordering by rank is under way

which will terminate as soon as different ranks appear

52 Heuristics in Value Selection DVO determines the nextvariable to be instantiated and then the value selectionstrategies are employed Considering the difference betweenthe variable in question (eg 119909

119894) and other variables the

branching condition defined by formula (1) can be furtherrepresented as a function of 119909

119894

Br (119899119902119886 119899119902119886+1

) (119909119894) 119863119894997888rarr 119861 = (119886

119894119909119894+ sum119895 = 119894

119886119895119909119895)R119888 (2)

where 119863119894is the domain of 119909

119894and 119861 is a set of Boolean

values 119905119903119906119890 119891119886119897119904119890 sum119895 = 119894119886119895119909119895is the linear combination of

the variables except 119909119894and is regarded as a constant Then

we can design the value selection strategies starting from themonotonic relation between the branching condition and 119909

119894

Monotonicity describes the behavior of a function in relationto the change of the input It gives an indication whether theoutput of the function moves in the same direction as theinput or in the reverse direction If a branching condition isa function whose monotonicity is known the direction inwhich the input needs to be moved to make the functiontrue can be determined The following proposition gives anattribute of a function composed of piecewise monotonicfunctions

Proposition 6 Assume that 1198911 1198831rarr 119884

1 1198912 1198832rarr

1198842 119891

119898 119883119898rarr 119884119898is a family of piecewise monotonic

functions with 119884119894sube 119883119894+1

Let 119865119898 1198831rarr 119884119898be a composed

function 119891119898∘ 119891119898minus1

∘ sdot sdot sdot ∘ 1198911 On this assumption 119865

119898is also

piecewise monotonic

Proof Mathematical induction is used to prove the proposi-tion

(i) Case 1198651= 1198911 Function 119891

1is piecewise monotonic

by assumption 1198651is equal to 119891

1 so it has the same

attribute(ii) Case 119865

119894+1= 119891119894+1∘ 119865119894 The composed function 119865

119894is

piecewise monotonic by the induction assumptionlet 119868 be a subset of its domainrsquos partition and let 119909and 1199091015840 be two arbitrary elements in 119868 with 119909le

1198831199091015840

then one of themonotonicity conditions holds that iseither 119865

119894(119909) le119884119894119865119894(1199091015840

) or 119865119894(119909) ge119884119894119865119894(1199091015840

) For simplic-ity we denote it as 119865

119894(119909)R119865

119894(1199091015840

) where R isin le geFunction 119891

119894+1is piecewise monotonic by assumption

The monotonicity condition is satisfied by 119865119894(119909) and

119865119894(1199091015840

) if both lie in the same subset 1198681015840of its domainrsquospartition Then 119891

119894+1(119909)R119891

119894+1(1199091015840

) holds and 119891119894+1

isalso monotonic on 1198681015840

After decomposing a branching condition into its basicfunctions its monotonicity can be utilized in the selectionof the initial value as well as other values of the variable inquestion

521 Initial Value Selection Initial values of variables are ofgreat importance to a search algorithm On the one hand in a

8 Mathematical Problems in Engineering

backtrack-free search the initial value of a variable is almostpart of the solution On the other hand the selection of initialvalues affectswhether the searchwill be backtrack-free Initialvalues are often selected at random in MHS methods whichreturn different test data each time allowing diversity butrandomness without any heuristics is a kind of blind searchand causes too many iterations sometimes even exceptionMeanwhile midvalues are selected in methods using bisec-tion so it is obvious that sometimes the same result may bereturned since the same initial value is always selected In ourmethod the above twomethods are combined and the initialvalue of a variable is determined based on its path tendencywhich is defined and calculated as follows

Definition 7 Path tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is anattribute of a variable on a path which is in favor of thesatisfaction of all the branching conditions along the pathAnd it provides the information about where to select itsinitial value Positive implies that a larger initial value willwork better while negative implies that a smaller initial valueis better

The calculation of the path tendency of a variable119909119894involves the calculation of its weight on each branch

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) and its path weight denoted as119908119894(119899119902119886 119899119902119886+1

) and 119901119908119894 which are calculated as (3)

119908119894(119899119902119886 119899119902119886+1

)

=

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing

minus

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

119901119908119894=

119896

sum119886=1

119908119894(119899119902119886 119899119902119886+1

)

(3)

Path tendency calculation (PTC) gleans the path ten-dency of each variable with 119901119908

119894 Subsequently initial domain

calculation (IDC) works on the result of PTC In this waythe initial value selection allows for both diversity andheuristics The algorithms are expressed by pseudo-codes inAlgorithms 4 and 5

522 Bisection by Tendency Bisection functions only whena value (including the initial value) assigned to the currentvariable 119909

119894is judged to be infeasible and the conflicted branch

(119899119902119886 119899119902119886+1

)with the false branching condition is locatedThenthe tendency of 119909

119894is used by bisection defined as follows

Definition 8 Tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is an attributeof a variable at a branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) determinedby the analysis on the monotonicity of the correspondingbranching condition and it provides the information aboutwhere to select a value to better satisfy the branching

condition Positive implies that a larger value will work betterwhile negative implies that a smaller value is better It iscalculated according to the following formula

119879119890119899119889119890119899119888119910 (119909119894)

=

119901119900119904119894119905119894V119890 if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing119899119890119892119886119905119894V119890 if Br (119909

119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

(4)

Each branch holds a tendencymap ⟨119881119886119903119894119886119887119897119890 119879119890119899119889119890119899119888119910⟩which includes the variables appearing on the branch andtheir corresponding tendencies With the tendency mapbisection can be applied to reduce the domain of 119909

119894(119863119894119895)

leading the branching condition to be true as presented bypseudo-codes in Algorithm 6

For example if the conflicted branch is the first branchof Path3 in Figure 1 then the corresponding branchingcondition is 1199091 minus 1199092 gt 0 which has different monotonicrelations with 1199091 and 1199092 respectively Table 2 shows how touse bisection to reduce the domains of variables If the currentvariable is 1199091 then retrieval of tendency map returns positiveindicating that a larger value will help satisfy the branchingcondition sowe reduce its domain to the larger part But if thecurrent variable is 1199092 bisection will function in the oppositeway due to the opposite monotonic relation

53 Heuristics in Maintaining Path Consistency As men-tioned in Section 42 MPC can be used in both stages ofBFS-BB In this part the focus is on the state space searchstage A value assigned to the current variable 119909

119894 no matter

it is the initial value or another value selected after bisectionshould be examined by interval arithmetic to see whether it ispart of the solution Path consistency is a prerequisite for thesuccess of interval arithmetic In the implementation of BFS-BB interval arithmetic is enhanced to provide more preciseinterval information The enhancement is to make clear howthe value of the branching condition defined by formula (2)is calculated as shown in formula (5) Here we use 119863119886 todenote the domain of all variables before calculating the 119886thbranching condition Besides a library of inverse functions isadded in case of the occurrences of library functions in thePUT Consider

Br (119899119902119886 119899119902119886+1

) (119909119894)

=

119905119903119906119890 if (119899119902119886 119899119902119886+1

) is traversedwith 119863119886 (119881

119894119895isin 119863119886

)

119891119886119897119904119890 otherwise

(5)

Hence for 119896 branching nodes along path 119901 all the119896 branching conditions should be true to maintain pathconsistency MPC receives the value of the current variable 119909

119894

(119881119894119895) which is part of the domain of all variables denoted as

1198631

(119881119894119895= [119881119894119895 119881119894119895] isin 1198631

) and evaluates the branching condi-tion corresponding to the branch (119899

1199021 1198991199021+1

) where 1198991199021is the

first branching node The branching condition Br(1198991199021 1198991199021+1

)

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 3

0

87

9

10

11

12

T 1

T 3

T 5

F 2

F 4

F 6

entry 0

if head 1

stmt 2 if head 3

if out 8

exit 10

stmt 4

if out 9

if head 5

stmt 6

if out 7

void test(int x1 int x2 int x3)

1 if (x1-x2lt=0)

2 printf(ldquoPath1rdquo)

3 else if(x3-x2lt=0)

4 printf(ldquoPath2rdquo)

5

6 printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

Figure 1 Program test and its corresponding CFG

representing a transfer of control from node 119899119903to node

119899119905 Nodes corresponding to decision statements such as if

statements are branching nodes Outgoing edges from thesenodes are referred to as branches A path through a CFGis a sequence 119901 = (119899

1 1198992 119899

119902) such that for all 119903 1 le

119903 lt 119902 (119899119903 119899119903+1) isin 119864 A path 119901 is regarded as feasible

if there exists a program input for which 119901 is traversedotherwise 119901 is regarded as infeasible Then the problem Qcan be reformulated as a COP [32 33] as follows 119883 is a setof variables 119909

1 1199092 119909

119899 119863 = 119863

1 1198632 119863

119899 is a set

of domains and 119863119894isin 119863 (119894 = 1 2 119899) is a finite set of

possible values for 119909119894 For each path 119863 is defined based on

the variablesrsquo acceptable ranges One solution to the problemis a set of values to instantiate each variable inside its domaindenoted as 119881 = 119881

1 1198812 119881

119899 119881119894isin 119863119894to make path

119901 feasible Particularly each constraint defined by the PUTalong 119901 should be met to make it feasible

An example with a program test and its correspondingCFG is shown in Figure 1 where if out 7 if out 8 if out 9and exit 10 are dummy nodes Adopting branch coveragethere are four paths to be traversed namely Path1 0 rarr 1 rarr 2rarr 9 rarr 10 Path2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 10 Path3 0rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10 and Path4 0 rarr 1rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10Thenumbers along the pathsdenote nodes rather than edges of the CFG Assuming thatPath3 is the path to be traversed as shown in bold our workis to select 119881 = 119881

1 1198812 1198813 from 119863

1 1198632 1198633 for 119909

1 1199092 and

1199093 so that when executing test using 119881

1 1198812 1198813 as an input

the path traversed is Path3 There are three branching nodesif head 1 if head 3 and if head 5 along Path3 and threecorresponding branches F 2 F 4 and T 5 that contain theconstraints to be met

32 Solution to the Problem A COP is generally solved bysearch strategies among which backtracking algorithms [34]are widely used In this paper state space search and thebacktracking algorithm BB are introduced to solve the COP

mentioned aboveThe process of exploring the solution spaceis represented as state space search This representation willfacilitate the implementation of BB In classical BB searchnodes are always fully expanded that is for a given leaf nodeall child nodes are immediately added to the so-called openlist However considering that one solution is enough forpath-wise test data generation best-first-search is our firstchoice To find the best ordering of variables is requiredfor branching to prune the branches stretching out fromunneeded variables In addition as the domain of a variable isa finite set of possible valueswhichmay be quite large bound-ing is necessary to cut the unneeded or infeasible solutions InBB frame bisection [35] is often used to help prune unneededpart of the solution space Employing bisection this paperproposes best-first-search branch and bound (BFS-BB)to automatically generate the test data

It has been observed empirically that the enhancement oflook-ahead methods is sometimes counterproductive to theeffects of look-back methods [36] As for BFS-BB heuristicsare adopted in the look-ahead search Particularly they areused in the dynamic ordering of variables the selection of thevalues to assign a variable and the judgment of the feasibilityof the path after the assignment to a variable and the reduc-tion of the search space Chronological backtracking is usedfor look-back And from the results of the experiments wetry to seek out the relationship between look-ahead and look-back methods

During the search process variables are divided intothree sets past variables (short for PV already instantiated)current variable (now being instantiated) and future vari-ables (short for FV not yet instantiated) All the variablesinvolved in this paper are symbolic variables In the interestof simplicity the transformation from input variables tosymbolic variables and the inverse transformation are beyondthe scope of this paper In addition although the experimentswere carried out on benchmarks in the literature or industrialprograms of different variable types integer variables are usedfor brevity in the following algorithms

4 The Proposed Search Strategies

This section proposes the framework of the search strate-gies Particularly the representation of state space search isdescribed in detail in Section 41 which is followed by thesearch algorithmBFS-BB in Section 42 And an optimizationmethod in BFS-BB is explained in Section 43

41 The Representation of State Space Search A state is atuple (Precursor Variable Domain Value Type and Queue)Precursor provides a link to the previous state 119881119886119903119894119886119887119897119890 =119909119894isin 119883 (119894 = 1 2 119899) is the current variable 119863119900119898119886119894119899 =

119863119894119895sube 119863119894isin 119863 (119894 = 1 2 119899 119895 = 1 2 119898 119898 is the

branching factor or the threshold used to control the breadthof the search tree) in the form of [minmax] is the set ofpossible values to be selected to instantiate Variable 119881119886119897119906119890 =119881119894119895isin 119863119894119895is a value selected fromDomainTypemarks the type

of state active extensive or inactive Queue is a sequence ofvariables corresponding to the state in question

4 Mathematical Problems in Engineering

Best-first-search branch and boundState space searchInitialization

IVR

Initial state Final state

Heuristic look-ahead methods

Bisection

Path Test data

Heuristic look-ahead methods

Backtracking

MPC IDC MPC DVO IDCDVO PTC

⟨x1 V1⟩

⟨x2 V2⟩

⟨x3 V3⟩

⟨xn Vn⟩

Figure 2 Program test and its counterpart with branching conditions decomposed into basic functions

State space is a quadruple (119878 119860 119868 119865) where 119878 is a setof states 119860 is a set of connections between the states inaccordancewith the search operations 119868 is a nonempty subsetof 119878 denoting the initial state of the problem and 119865 is a non-empty subset of 119878 denoting the final state of the problem

State space search is all about finding one final state ina state space (which may be extremely large) Final meansthat every variable has been instantiated with a definite valuesuccessfully At the start of the search Precursor is null andwhen Queue is null the search ends The path made up of allthe extensive nodes in the search treemakes the solution pathThe state space needs to be searched to find a solution pathfrom an initial state to a final state

42 The Search Algorithm BFS-BB The idea of the searchalgorithm BFS-BB is to extend partial solutions At each stepa variable in FV is selected and assigned a value from itsdomain to extend the current partial solution It is checkedwhether such an extension may lead to a possible solutionof the COP and the subtrees containing no solutions basedon the current partial solution are pruned Some concepts inBFS-BB are explained as follows

Irrelevant variable removal (IVR) identifies variablesrelevant to the path to be traversed and removes thoseirrelevant Dynamic variable ordering (DVO) permutatesFV and returns a queue Path tendency calculation (PTC)calculates the path tendencies of all relevant variables alongthe path whichwill be used to calculate the domains inwhichtheir initial values are selected

Initial domain calculation (IDC) calculates the domain ofa variable in which its initial value is selected according toits path tendency calculated by PTC Bisection reduces thedomain of the current variable when its value just assignedfails to satisfy a constraint on the path Maintaining pathconsistency (MPC) utilizes interval arithmetic to determinewhether the domains of all variables satisfy the constraintsalong the path

The overview of our approach can be seen from Figure 2The path to be traversed is shown in the left part where thered circles represent nodes and the arrows represent edgesof the CFG The path contains the constraints to be metthe set of input variables and the domains corresponding tothe variables The first stage is to perform the initializationoperations At first IVR (see Section 43) is called to reducethe search space by removing irrelevant variables and leavingonly those relevant to the path Then four heuristic look-ahead methods take effect MPC (see Section 53) is used to

partially reduce the input domains of all variables and findinfeasible paths on occasion All the relevant variables in FVare permutated by DVO (see Section 51) to form a queue 119876

1

and its head 1199091is determined the best or the first variable

to be instantiated Next PTC (see Section 52) calculates pathtendency of each variable and IDC (see Section 52) reducesthe domain 119863

11in which the initial value 119881

11is selected

for 1199091 With all these the initial state is constructed as

(119899119906119897119897 1199091 11986311 11988111 119886119888119905119894V119890 and 119876

1) which is also the current

state 119878119888119906119903

shown as the red ringThe second stage implements state space search Four

heuristic look-ahead methods work in this stage To eachactive state MPC is carried out to determine the direction ofthe next search step If MPC succeeds Type becomes exten-sive the variables in FV will be permutated by DVO to get119876119906119890119906119890 = 119876

119894 119878119888119906119903

becomes Precursor and the head of 119876119894(119909119894)

will be Variable of next state Then IDC is used to calculatethe domain119863

1198941in which the initial value119881

1198941is selected for 119909

119894

With all these a new state (119875119903119890 119909119894 1198631198941 1198811198941 119886119888119905119894V119890 and 119876

119894) is

constructed for which the MPC check continues If after asuccessful MPC check no variable needs to be permutatedthen all the relevant variables have been assigned the rightvalues to make 119901 feasible The final state is reached shownas the red double ring Finally giving the irrelevant variablesrandom values fulfills the generation of the test data which isthe output of BFS-BB as shown in the right part of Figure 2If a MPC check fails Type remains active bisection (seeSection 52) is conducted to reduce the domain of Variableusing the information from the failed MPC check and Valueis reselected from the reduced domain all of which indicatethat the search will expand to a state with a different value forthe same variable 119909

119894 If all the values within its domain for 119909

119894

are tried out or the number of MPC checks has reached theupperbound119898 then 119909

119894is moved out of PV andType becomes

inactive In this case the search will have to backtrack toPrecursor at the higher level of the search tree as shown by thebidirectional arrow between backtracking and the heuristiclook-ahead methodsThe above-mentioned search process isdescribed by pseudocodes as Algorithm 1

43 Irrelevant Variable Removal As mentioned above 119883 =

1199091 1199092 119909

119899 is the set of input variables for a program 119875

The search space needs to involve every 119909119894(119894 = 1 2 119899)

in 119883 However it is possible that not every variable will beresponsible for determining whether every path in 119875 will betraversed or not Therefore when attempting to generate testdata for a particular path 119901 the search effort on the value

Mathematical Problems in Engineering 5

Input 119901 the path to be traversedOutput result ⟨119881119886119903119894119886119887119897119890 119881119886119897119906119890⟩ the test data making 119901 feasibleStage 1 Initialization(1) call Algorithm Irrelevant variable removal(2) resultlarr null(3) call Algorithm Maintaining path consistency(4) call Algorithm Dynamic variable ordering(5) call Algorithm Path tendency calculation(6) 119909

1larr head (119876

1)

(7) call Algorithm Initial domain calculation(8) 119881

11larr select (119863

11)

(9) initial statelarr (null 1199091 11986311 11988111 active 119876

1)

(10) 119878cur larr initial stateStage 2 State space searchBegin(11) for (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894) (119894 rarr 1 119899)

(12) path consistentlarr false(13) call Algorithm Maintaining path consistency(14) if (path consistent = true)(15) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 extensive 119876

119894)

(16) resultlarr result cup ⟨119909119894 119881119894119895⟩

(17) FVlarr FV minus 119909119894

(18) PVlarr PV + 119909119894

(19) call Algorithm Dynamic variable ordering(20) if (119876

119894= null)

(21) 119878cur larr final state(22) foreach 119909lowast isin 119883irrel(23) resultlarr result cup ⟨119909lowast 119881random⟩(24) else Prelarr 119878cur(25) 119909

119894larr head (119876

119894)

(26) call Algorithm Initial domain calculation(27) 119881

1198941larr select (119863

1198941)

(28) 119878cur larr (Pre 119909119894 1198631198941 1198811198941 active 119876

119894)

(29) else if (1003816100381610038161003816100381611986311989411989510038161003816100381610038161003816gt 1 ampamp 119895 lt 119898)

(30) call Algorithm Bisection(31) 119881

119894119895larr select (119863

119894119895)

(32) 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 active 119876

119894)

(33) else 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 inactive 119876

119894)

(34) 119875119903119890 larr 119878cur(35) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894)

(36) PVlarr PV minus 119909119894

(37) return resultEnd

Algorithm 1 Best-first-search branch and bound

of a variable which is not relevant to 119901 is wasted since itcannot influence the traversal of 119901Thus removing irrelevantvariables from the search space and only concentrating on thevariables relevant to the path of interest may improve the per-formance of the search Hence we propose an optimizationmethod irrelevant variable removal (IVR) Relevant variableand irrelevant variable are defined as follows

Definition 1 A relevant variable is an input variable that caninfluence whether a particular path 119901will be traversed or notTo put it more precisely for all the input variables 119909

119894| 119909119894isin

119883 119894 = 1 2 119899 there exists a corresponding set of values119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 with which 119901 is not traversed

But when the value of a particular variable is changed forexample when the value of 119909

119892(119881119892) is changed into 1198811015840

119892 119901 is

traversed with the input 1198811 1198812 119881

1015840

119892 119881

119899 Then 119909

119892is a

relevant variable to path 119901

Definition 2 An irrelevant variable is an input variable thatis not capable of influencing whether a particular path 119901 willbe traversed or not To put it more precisely for all the sets119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 of the search space of path 119901

with which 119901 is not traversed if 119901 is still not traversed withthe input 119881

1 1198812 119881

1015840

119892 119881

119899 when the value of a certain

variable 119909119892(119881119892) is changed into 1198811015840

119892 then 119909

119892is an irrelevant

variable to path 119901

6 Mathematical Problems in Engineering

Input Br(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the path119883 = 119909

1 1199092 119909

119899 the set of input variables

Output119883rel the set of relevant variables to the path119883irrel the set of irrelevant variables to the path

(1)119883rel larr Oslash(2)119883irrel larr Oslash(3) foreach Br(119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(4) if (119883rel = 119883)

(5) break(6) else if (119886

119895= 0)

(7) 119883rel larr 119883rel cup 119909119895(8)119883irrel larr 119883 minus119883rel(9) return 119883rel 119883irrel

Algorithm 2 Irrelevant variable removal

Generally for a particular path whether an input variableis relevant or irrelevant cannot be completely decided dueto the complex structure of programs But we can makeconservative estimate of irrelevancy with static control flowtechnique We give the most common condition in PUTsAssume that there are 119896 branches along a path each branch(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) needs to be traversed to find theset of relevant variables The removal of irrelevant variablesinvolves the judgment of whether a variable appears on eachbranch so we give the definition below which is utilizedby Algorithm 2 And considering the relation between thecomplexity of BFS-BB and the number of variables we giveProposition 4 about the effectiveness of IVR

Definition 3 The branching condition Br(119899119902119886 119899119902119886+1

) is theconstraint on the branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) and it canbe represented as

Br (119899119902119886 119899119902119886+1

) =

119899

sum119895=1

119886119895119909119895R119888 (1)

where R is a relational operator and 119886119895(119895 isin [1 119899]) and 119888 are

constants

Proposition 4 IVR may result in test data being searchedout with fewer MPC checks for a particular path 119901 than if allvariables are considered

Proof Thealgorithmbisection involves the search steps takenfor a certain variable under the same condition of othervariables which move in breadth (119898) until a value is foundto make MPC succeed Then 119898 is the base of the complexityof BFS-BB and the number of variables is the exponent Let119883rel denote the set of relevant variables to path119901 and let119883irrelbe the set of irrelevant variables onemore element in119883rel willinvolve more MPC checks on an exponential basis If all theirrelevant variables are removed from the search space thecomplexity will be reduced by119898|119883irrel| |119883irrel| is the cardinalityof the set of irrelevant variables

We conduct IVR for all the paths in Figure 1 and theprocess is shown in Table 1 The position where a variable isjudged relevant to the path of interest is highlighted in bold

5 The Heuristic Look-Ahead Methods

In this section the heuristic look-ahead methods in BFS-BBare explained in detail in Sections 51 52 and 53 respec-tively And Section 54 provides a case study to illustrate thesemethods

51 Heuristics in Variable Ordering In practice the chief goalin designing variable ordering heuristics is to reduce the sizeof the overall search tree In our method the next variableto be instantiated is selected to be the one with the minimalremaining domain size (the size of the domain after removingthe values judged to be infeasible) because this canminimizethe size of the overall search treeThe technique to break ties isimportant as there are often variables with the same domainsize We use variablesrsquo ranks to break ties In case of a tie thevariable with the higher rank is selected This method givessubstantially better performance than picking one of the tyingvariables at random Rank is defined as follows

Definition 5 The rank of a branch (119899119902119886 119899119902119886+1

) (119886 isin [1 119896])

marks its level in the sequence of the branches along a pathdenoted as rank (119899

119902119886 119899119902119886+1

)The rank of the first branch is 1 the rank of the second

one is 2 and the ranks of those following can be obtainedanalogously The variables appearing on a branch enjoy thesame rank as the branch The rank of a variable on a branchwhere it does not appear is supposed to be infinity As avariable may appear on more than one branch it may havedifferent ranks The rule to break ties according to the ranksof variables is based on the heuristics from interval arithmeticthat the earlier a variable appears on a path the greaterinfluence it has on the result of interval arithmetic along thepath Therefore if the ordering by rank is taken between avariable that appears on the branch (119899

119902119886 119899119902119886+1

) and a variablethat does not then the former has a higher rank That isbecause on the branch (119899

119902119886 119899119902119886+1

) the former has rank 119886

Mathematical Problems in Engineering 7

Table 1 IVR process for each path of test in Figure 1

Path Branching condition 1198861

1198862

1198863

119883rel 119883irrel

Path 1 0 rarr 1 rarr 2 rarr 9 rarr 10 1199091 minus 1199092 le 0 1 minus1 0 1199091 1199092 1199093

Path 2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 101199091 minus 1199092 gt 0 1 minus1 0

1199091 1199092 1199093 Oslash1199093 minus 1199092 le 0 0 minus1 1

Path 3 0 rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 ge minus5 mdash mdash mdash

Path 4 0 rarr 1 rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 lt minus5 mdash mdash mdash

Input FV the set of future variables119863119894 the domain of 119909

119894(119909119894isin FV)

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branches along the pathOutput 119876

119894 a queue of FV

Begin(1) 119876119894larr quicksort (FV 1003816100381610038161003816119863119894

1003816100381610038161003816)(2) for 119894 rarr 1 1003816100381610038161003816119876119894

1003816100381610038161003816(3) if (1003816100381610038161003816119863119894

1003816100381610038161003816 =10038161003816100381610038161003816119863119895

10038161003816100381610038161003816) (119895 gt 119894 119909

119894 119909119895isin 119876119894)

(4) break(5) else for (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(6) if (rank(119899

119902119886 119899119902119886+1

)(119909119894) = rank(119899

119902119886 119899119902119886+1

)(119909119895))

(7) 119886++(8) else permutate 119909

119894 119909119895by rank(119899

119902119886 119899119902119886+1

)(9) break(10) return 119876

119894

End

Algorithm 3 Dynamic variable ordering

while the latter has rank infinity The comparison between119886 and infinity determines the ordering The algorithm isdescribed by pseudocodes in Algorithm 3

Quicksort is utilized when permutating variables accord-ing to remaining domain size and returns 119876

119894as a result If

no variables have the same domain size then DVO finishesBut if there are variables whose domain sizes are the same asthat of the head of119876

119894 then the ordering by rank is under way

which will terminate as soon as different ranks appear

52 Heuristics in Value Selection DVO determines the nextvariable to be instantiated and then the value selectionstrategies are employed Considering the difference betweenthe variable in question (eg 119909

119894) and other variables the

branching condition defined by formula (1) can be furtherrepresented as a function of 119909

119894

Br (119899119902119886 119899119902119886+1

) (119909119894) 119863119894997888rarr 119861 = (119886

119894119909119894+ sum119895 = 119894

119886119895119909119895)R119888 (2)

where 119863119894is the domain of 119909

119894and 119861 is a set of Boolean

values 119905119903119906119890 119891119886119897119904119890 sum119895 = 119894119886119895119909119895is the linear combination of

the variables except 119909119894and is regarded as a constant Then

we can design the value selection strategies starting from themonotonic relation between the branching condition and 119909

119894

Monotonicity describes the behavior of a function in relationto the change of the input It gives an indication whether theoutput of the function moves in the same direction as theinput or in the reverse direction If a branching condition isa function whose monotonicity is known the direction inwhich the input needs to be moved to make the functiontrue can be determined The following proposition gives anattribute of a function composed of piecewise monotonicfunctions

Proposition 6 Assume that 1198911 1198831rarr 119884

1 1198912 1198832rarr

1198842 119891

119898 119883119898rarr 119884119898is a family of piecewise monotonic

functions with 119884119894sube 119883119894+1

Let 119865119898 1198831rarr 119884119898be a composed

function 119891119898∘ 119891119898minus1

∘ sdot sdot sdot ∘ 1198911 On this assumption 119865

119898is also

piecewise monotonic

Proof Mathematical induction is used to prove the proposi-tion

(i) Case 1198651= 1198911 Function 119891

1is piecewise monotonic

by assumption 1198651is equal to 119891

1 so it has the same

attribute(ii) Case 119865

119894+1= 119891119894+1∘ 119865119894 The composed function 119865

119894is

piecewise monotonic by the induction assumptionlet 119868 be a subset of its domainrsquos partition and let 119909and 1199091015840 be two arbitrary elements in 119868 with 119909le

1198831199091015840

then one of themonotonicity conditions holds that iseither 119865

119894(119909) le119884119894119865119894(1199091015840

) or 119865119894(119909) ge119884119894119865119894(1199091015840

) For simplic-ity we denote it as 119865

119894(119909)R119865

119894(1199091015840

) where R isin le geFunction 119891

119894+1is piecewise monotonic by assumption

The monotonicity condition is satisfied by 119865119894(119909) and

119865119894(1199091015840

) if both lie in the same subset 1198681015840of its domainrsquospartition Then 119891

119894+1(119909)R119891

119894+1(1199091015840

) holds and 119891119894+1

isalso monotonic on 1198681015840

After decomposing a branching condition into its basicfunctions its monotonicity can be utilized in the selectionof the initial value as well as other values of the variable inquestion

521 Initial Value Selection Initial values of variables are ofgreat importance to a search algorithm On the one hand in a

8 Mathematical Problems in Engineering

backtrack-free search the initial value of a variable is almostpart of the solution On the other hand the selection of initialvalues affectswhether the searchwill be backtrack-free Initialvalues are often selected at random in MHS methods whichreturn different test data each time allowing diversity butrandomness without any heuristics is a kind of blind searchand causes too many iterations sometimes even exceptionMeanwhile midvalues are selected in methods using bisec-tion so it is obvious that sometimes the same result may bereturned since the same initial value is always selected In ourmethod the above twomethods are combined and the initialvalue of a variable is determined based on its path tendencywhich is defined and calculated as follows

Definition 7 Path tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is anattribute of a variable on a path which is in favor of thesatisfaction of all the branching conditions along the pathAnd it provides the information about where to select itsinitial value Positive implies that a larger initial value willwork better while negative implies that a smaller initial valueis better

The calculation of the path tendency of a variable119909119894involves the calculation of its weight on each branch

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) and its path weight denoted as119908119894(119899119902119886 119899119902119886+1

) and 119901119908119894 which are calculated as (3)

119908119894(119899119902119886 119899119902119886+1

)

=

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing

minus

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

119901119908119894=

119896

sum119886=1

119908119894(119899119902119886 119899119902119886+1

)

(3)

Path tendency calculation (PTC) gleans the path ten-dency of each variable with 119901119908

119894 Subsequently initial domain

calculation (IDC) works on the result of PTC In this waythe initial value selection allows for both diversity andheuristics The algorithms are expressed by pseudo-codes inAlgorithms 4 and 5

522 Bisection by Tendency Bisection functions only whena value (including the initial value) assigned to the currentvariable 119909

119894is judged to be infeasible and the conflicted branch

(119899119902119886 119899119902119886+1

)with the false branching condition is locatedThenthe tendency of 119909

119894is used by bisection defined as follows

Definition 8 Tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is an attributeof a variable at a branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) determinedby the analysis on the monotonicity of the correspondingbranching condition and it provides the information aboutwhere to select a value to better satisfy the branching

condition Positive implies that a larger value will work betterwhile negative implies that a smaller value is better It iscalculated according to the following formula

119879119890119899119889119890119899119888119910 (119909119894)

=

119901119900119904119894119905119894V119890 if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing119899119890119892119886119905119894V119890 if Br (119909

119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

(4)

Each branch holds a tendencymap ⟨119881119886119903119894119886119887119897119890 119879119890119899119889119890119899119888119910⟩which includes the variables appearing on the branch andtheir corresponding tendencies With the tendency mapbisection can be applied to reduce the domain of 119909

119894(119863119894119895)

leading the branching condition to be true as presented bypseudo-codes in Algorithm 6

For example if the conflicted branch is the first branchof Path3 in Figure 1 then the corresponding branchingcondition is 1199091 minus 1199092 gt 0 which has different monotonicrelations with 1199091 and 1199092 respectively Table 2 shows how touse bisection to reduce the domains of variables If the currentvariable is 1199091 then retrieval of tendency map returns positiveindicating that a larger value will help satisfy the branchingcondition sowe reduce its domain to the larger part But if thecurrent variable is 1199092 bisection will function in the oppositeway due to the opposite monotonic relation

53 Heuristics in Maintaining Path Consistency As men-tioned in Section 42 MPC can be used in both stages ofBFS-BB In this part the focus is on the state space searchstage A value assigned to the current variable 119909

119894 no matter

it is the initial value or another value selected after bisectionshould be examined by interval arithmetic to see whether it ispart of the solution Path consistency is a prerequisite for thesuccess of interval arithmetic In the implementation of BFS-BB interval arithmetic is enhanced to provide more preciseinterval information The enhancement is to make clear howthe value of the branching condition defined by formula (2)is calculated as shown in formula (5) Here we use 119863119886 todenote the domain of all variables before calculating the 119886thbranching condition Besides a library of inverse functions isadded in case of the occurrences of library functions in thePUT Consider

Br (119899119902119886 119899119902119886+1

) (119909119894)

=

119905119903119906119890 if (119899119902119886 119899119902119886+1

) is traversedwith 119863119886 (119881

119894119895isin 119863119886

)

119891119886119897119904119890 otherwise

(5)

Hence for 119896 branching nodes along path 119901 all the119896 branching conditions should be true to maintain pathconsistency MPC receives the value of the current variable 119909

119894

(119881119894119895) which is part of the domain of all variables denoted as

1198631

(119881119894119895= [119881119894119895 119881119894119895] isin 1198631

) and evaluates the branching condi-tion corresponding to the branch (119899

1199021 1198991199021+1

) where 1198991199021is the

first branching node The branching condition Br(1198991199021 1198991199021+1

)

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

4 Mathematical Problems in Engineering

Best-first-search branch and boundState space searchInitialization

IVR

Initial state Final state

Heuristic look-ahead methods

Bisection

Path Test data

Heuristic look-ahead methods

Backtracking

MPC IDC MPC DVO IDCDVO PTC

⟨x1 V1⟩

⟨x2 V2⟩

⟨x3 V3⟩

⟨xn Vn⟩

Figure 2 Program test and its counterpart with branching conditions decomposed into basic functions

State space is a quadruple (119878 119860 119868 119865) where 119878 is a setof states 119860 is a set of connections between the states inaccordancewith the search operations 119868 is a nonempty subsetof 119878 denoting the initial state of the problem and 119865 is a non-empty subset of 119878 denoting the final state of the problem

State space search is all about finding one final state ina state space (which may be extremely large) Final meansthat every variable has been instantiated with a definite valuesuccessfully At the start of the search Precursor is null andwhen Queue is null the search ends The path made up of allthe extensive nodes in the search treemakes the solution pathThe state space needs to be searched to find a solution pathfrom an initial state to a final state

42 The Search Algorithm BFS-BB The idea of the searchalgorithm BFS-BB is to extend partial solutions At each stepa variable in FV is selected and assigned a value from itsdomain to extend the current partial solution It is checkedwhether such an extension may lead to a possible solutionof the COP and the subtrees containing no solutions basedon the current partial solution are pruned Some concepts inBFS-BB are explained as follows

Irrelevant variable removal (IVR) identifies variablesrelevant to the path to be traversed and removes thoseirrelevant Dynamic variable ordering (DVO) permutatesFV and returns a queue Path tendency calculation (PTC)calculates the path tendencies of all relevant variables alongthe path whichwill be used to calculate the domains inwhichtheir initial values are selected

Initial domain calculation (IDC) calculates the domain ofa variable in which its initial value is selected according toits path tendency calculated by PTC Bisection reduces thedomain of the current variable when its value just assignedfails to satisfy a constraint on the path Maintaining pathconsistency (MPC) utilizes interval arithmetic to determinewhether the domains of all variables satisfy the constraintsalong the path

The overview of our approach can be seen from Figure 2The path to be traversed is shown in the left part where thered circles represent nodes and the arrows represent edgesof the CFG The path contains the constraints to be metthe set of input variables and the domains corresponding tothe variables The first stage is to perform the initializationoperations At first IVR (see Section 43) is called to reducethe search space by removing irrelevant variables and leavingonly those relevant to the path Then four heuristic look-ahead methods take effect MPC (see Section 53) is used to

partially reduce the input domains of all variables and findinfeasible paths on occasion All the relevant variables in FVare permutated by DVO (see Section 51) to form a queue 119876

1

and its head 1199091is determined the best or the first variable

to be instantiated Next PTC (see Section 52) calculates pathtendency of each variable and IDC (see Section 52) reducesthe domain 119863

11in which the initial value 119881

11is selected

for 1199091 With all these the initial state is constructed as

(119899119906119897119897 1199091 11986311 11988111 119886119888119905119894V119890 and 119876

1) which is also the current

state 119878119888119906119903

shown as the red ringThe second stage implements state space search Four

heuristic look-ahead methods work in this stage To eachactive state MPC is carried out to determine the direction ofthe next search step If MPC succeeds Type becomes exten-sive the variables in FV will be permutated by DVO to get119876119906119890119906119890 = 119876

119894 119878119888119906119903

becomes Precursor and the head of 119876119894(119909119894)

will be Variable of next state Then IDC is used to calculatethe domain119863

1198941in which the initial value119881

1198941is selected for 119909

119894

With all these a new state (119875119903119890 119909119894 1198631198941 1198811198941 119886119888119905119894V119890 and 119876

119894) is

constructed for which the MPC check continues If after asuccessful MPC check no variable needs to be permutatedthen all the relevant variables have been assigned the rightvalues to make 119901 feasible The final state is reached shownas the red double ring Finally giving the irrelevant variablesrandom values fulfills the generation of the test data which isthe output of BFS-BB as shown in the right part of Figure 2If a MPC check fails Type remains active bisection (seeSection 52) is conducted to reduce the domain of Variableusing the information from the failed MPC check and Valueis reselected from the reduced domain all of which indicatethat the search will expand to a state with a different value forthe same variable 119909

119894 If all the values within its domain for 119909

119894

are tried out or the number of MPC checks has reached theupperbound119898 then 119909

119894is moved out of PV andType becomes

inactive In this case the search will have to backtrack toPrecursor at the higher level of the search tree as shown by thebidirectional arrow between backtracking and the heuristiclook-ahead methodsThe above-mentioned search process isdescribed by pseudocodes as Algorithm 1

43 Irrelevant Variable Removal As mentioned above 119883 =

1199091 1199092 119909

119899 is the set of input variables for a program 119875

The search space needs to involve every 119909119894(119894 = 1 2 119899)

in 119883 However it is possible that not every variable will beresponsible for determining whether every path in 119875 will betraversed or not Therefore when attempting to generate testdata for a particular path 119901 the search effort on the value

Mathematical Problems in Engineering 5

Input 119901 the path to be traversedOutput result ⟨119881119886119903119894119886119887119897119890 119881119886119897119906119890⟩ the test data making 119901 feasibleStage 1 Initialization(1) call Algorithm Irrelevant variable removal(2) resultlarr null(3) call Algorithm Maintaining path consistency(4) call Algorithm Dynamic variable ordering(5) call Algorithm Path tendency calculation(6) 119909

1larr head (119876

1)

(7) call Algorithm Initial domain calculation(8) 119881

11larr select (119863

11)

(9) initial statelarr (null 1199091 11986311 11988111 active 119876

1)

(10) 119878cur larr initial stateStage 2 State space searchBegin(11) for (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894) (119894 rarr 1 119899)

(12) path consistentlarr false(13) call Algorithm Maintaining path consistency(14) if (path consistent = true)(15) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 extensive 119876

119894)

(16) resultlarr result cup ⟨119909119894 119881119894119895⟩

(17) FVlarr FV minus 119909119894

(18) PVlarr PV + 119909119894

(19) call Algorithm Dynamic variable ordering(20) if (119876

119894= null)

(21) 119878cur larr final state(22) foreach 119909lowast isin 119883irrel(23) resultlarr result cup ⟨119909lowast 119881random⟩(24) else Prelarr 119878cur(25) 119909

119894larr head (119876

119894)

(26) call Algorithm Initial domain calculation(27) 119881

1198941larr select (119863

1198941)

(28) 119878cur larr (Pre 119909119894 1198631198941 1198811198941 active 119876

119894)

(29) else if (1003816100381610038161003816100381611986311989411989510038161003816100381610038161003816gt 1 ampamp 119895 lt 119898)

(30) call Algorithm Bisection(31) 119881

119894119895larr select (119863

119894119895)

(32) 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 active 119876

119894)

(33) else 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 inactive 119876

119894)

(34) 119875119903119890 larr 119878cur(35) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894)

(36) PVlarr PV minus 119909119894

(37) return resultEnd

Algorithm 1 Best-first-search branch and bound

of a variable which is not relevant to 119901 is wasted since itcannot influence the traversal of 119901Thus removing irrelevantvariables from the search space and only concentrating on thevariables relevant to the path of interest may improve the per-formance of the search Hence we propose an optimizationmethod irrelevant variable removal (IVR) Relevant variableand irrelevant variable are defined as follows

Definition 1 A relevant variable is an input variable that caninfluence whether a particular path 119901will be traversed or notTo put it more precisely for all the input variables 119909

119894| 119909119894isin

119883 119894 = 1 2 119899 there exists a corresponding set of values119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 with which 119901 is not traversed

But when the value of a particular variable is changed forexample when the value of 119909

119892(119881119892) is changed into 1198811015840

119892 119901 is

traversed with the input 1198811 1198812 119881

1015840

119892 119881

119899 Then 119909

119892is a

relevant variable to path 119901

Definition 2 An irrelevant variable is an input variable thatis not capable of influencing whether a particular path 119901 willbe traversed or not To put it more precisely for all the sets119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 of the search space of path 119901

with which 119901 is not traversed if 119901 is still not traversed withthe input 119881

1 1198812 119881

1015840

119892 119881

119899 when the value of a certain

variable 119909119892(119881119892) is changed into 1198811015840

119892 then 119909

119892is an irrelevant

variable to path 119901

6 Mathematical Problems in Engineering

Input Br(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the path119883 = 119909

1 1199092 119909

119899 the set of input variables

Output119883rel the set of relevant variables to the path119883irrel the set of irrelevant variables to the path

(1)119883rel larr Oslash(2)119883irrel larr Oslash(3) foreach Br(119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(4) if (119883rel = 119883)

(5) break(6) else if (119886

119895= 0)

(7) 119883rel larr 119883rel cup 119909119895(8)119883irrel larr 119883 minus119883rel(9) return 119883rel 119883irrel

Algorithm 2 Irrelevant variable removal

Generally for a particular path whether an input variableis relevant or irrelevant cannot be completely decided dueto the complex structure of programs But we can makeconservative estimate of irrelevancy with static control flowtechnique We give the most common condition in PUTsAssume that there are 119896 branches along a path each branch(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) needs to be traversed to find theset of relevant variables The removal of irrelevant variablesinvolves the judgment of whether a variable appears on eachbranch so we give the definition below which is utilizedby Algorithm 2 And considering the relation between thecomplexity of BFS-BB and the number of variables we giveProposition 4 about the effectiveness of IVR

Definition 3 The branching condition Br(119899119902119886 119899119902119886+1

) is theconstraint on the branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) and it canbe represented as

Br (119899119902119886 119899119902119886+1

) =

119899

sum119895=1

119886119895119909119895R119888 (1)

where R is a relational operator and 119886119895(119895 isin [1 119899]) and 119888 are

constants

Proposition 4 IVR may result in test data being searchedout with fewer MPC checks for a particular path 119901 than if allvariables are considered

Proof Thealgorithmbisection involves the search steps takenfor a certain variable under the same condition of othervariables which move in breadth (119898) until a value is foundto make MPC succeed Then 119898 is the base of the complexityof BFS-BB and the number of variables is the exponent Let119883rel denote the set of relevant variables to path119901 and let119883irrelbe the set of irrelevant variables onemore element in119883rel willinvolve more MPC checks on an exponential basis If all theirrelevant variables are removed from the search space thecomplexity will be reduced by119898|119883irrel| |119883irrel| is the cardinalityof the set of irrelevant variables

We conduct IVR for all the paths in Figure 1 and theprocess is shown in Table 1 The position where a variable isjudged relevant to the path of interest is highlighted in bold

5 The Heuristic Look-Ahead Methods

In this section the heuristic look-ahead methods in BFS-BBare explained in detail in Sections 51 52 and 53 respec-tively And Section 54 provides a case study to illustrate thesemethods

51 Heuristics in Variable Ordering In practice the chief goalin designing variable ordering heuristics is to reduce the sizeof the overall search tree In our method the next variableto be instantiated is selected to be the one with the minimalremaining domain size (the size of the domain after removingthe values judged to be infeasible) because this canminimizethe size of the overall search treeThe technique to break ties isimportant as there are often variables with the same domainsize We use variablesrsquo ranks to break ties In case of a tie thevariable with the higher rank is selected This method givessubstantially better performance than picking one of the tyingvariables at random Rank is defined as follows

Definition 5 The rank of a branch (119899119902119886 119899119902119886+1

) (119886 isin [1 119896])

marks its level in the sequence of the branches along a pathdenoted as rank (119899

119902119886 119899119902119886+1

)The rank of the first branch is 1 the rank of the second

one is 2 and the ranks of those following can be obtainedanalogously The variables appearing on a branch enjoy thesame rank as the branch The rank of a variable on a branchwhere it does not appear is supposed to be infinity As avariable may appear on more than one branch it may havedifferent ranks The rule to break ties according to the ranksof variables is based on the heuristics from interval arithmeticthat the earlier a variable appears on a path the greaterinfluence it has on the result of interval arithmetic along thepath Therefore if the ordering by rank is taken between avariable that appears on the branch (119899

119902119886 119899119902119886+1

) and a variablethat does not then the former has a higher rank That isbecause on the branch (119899

119902119886 119899119902119886+1

) the former has rank 119886

Mathematical Problems in Engineering 7

Table 1 IVR process for each path of test in Figure 1

Path Branching condition 1198861

1198862

1198863

119883rel 119883irrel

Path 1 0 rarr 1 rarr 2 rarr 9 rarr 10 1199091 minus 1199092 le 0 1 minus1 0 1199091 1199092 1199093

Path 2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 101199091 minus 1199092 gt 0 1 minus1 0

1199091 1199092 1199093 Oslash1199093 minus 1199092 le 0 0 minus1 1

Path 3 0 rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 ge minus5 mdash mdash mdash

Path 4 0 rarr 1 rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 lt minus5 mdash mdash mdash

Input FV the set of future variables119863119894 the domain of 119909

119894(119909119894isin FV)

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branches along the pathOutput 119876

119894 a queue of FV

Begin(1) 119876119894larr quicksort (FV 1003816100381610038161003816119863119894

1003816100381610038161003816)(2) for 119894 rarr 1 1003816100381610038161003816119876119894

1003816100381610038161003816(3) if (1003816100381610038161003816119863119894

1003816100381610038161003816 =10038161003816100381610038161003816119863119895

10038161003816100381610038161003816) (119895 gt 119894 119909

119894 119909119895isin 119876119894)

(4) break(5) else for (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(6) if (rank(119899

119902119886 119899119902119886+1

)(119909119894) = rank(119899

119902119886 119899119902119886+1

)(119909119895))

(7) 119886++(8) else permutate 119909

119894 119909119895by rank(119899

119902119886 119899119902119886+1

)(9) break(10) return 119876

119894

End

Algorithm 3 Dynamic variable ordering

while the latter has rank infinity The comparison between119886 and infinity determines the ordering The algorithm isdescribed by pseudocodes in Algorithm 3

Quicksort is utilized when permutating variables accord-ing to remaining domain size and returns 119876

119894as a result If

no variables have the same domain size then DVO finishesBut if there are variables whose domain sizes are the same asthat of the head of119876

119894 then the ordering by rank is under way

which will terminate as soon as different ranks appear

52 Heuristics in Value Selection DVO determines the nextvariable to be instantiated and then the value selectionstrategies are employed Considering the difference betweenthe variable in question (eg 119909

119894) and other variables the

branching condition defined by formula (1) can be furtherrepresented as a function of 119909

119894

Br (119899119902119886 119899119902119886+1

) (119909119894) 119863119894997888rarr 119861 = (119886

119894119909119894+ sum119895 = 119894

119886119895119909119895)R119888 (2)

where 119863119894is the domain of 119909

119894and 119861 is a set of Boolean

values 119905119903119906119890 119891119886119897119904119890 sum119895 = 119894119886119895119909119895is the linear combination of

the variables except 119909119894and is regarded as a constant Then

we can design the value selection strategies starting from themonotonic relation between the branching condition and 119909

119894

Monotonicity describes the behavior of a function in relationto the change of the input It gives an indication whether theoutput of the function moves in the same direction as theinput or in the reverse direction If a branching condition isa function whose monotonicity is known the direction inwhich the input needs to be moved to make the functiontrue can be determined The following proposition gives anattribute of a function composed of piecewise monotonicfunctions

Proposition 6 Assume that 1198911 1198831rarr 119884

1 1198912 1198832rarr

1198842 119891

119898 119883119898rarr 119884119898is a family of piecewise monotonic

functions with 119884119894sube 119883119894+1

Let 119865119898 1198831rarr 119884119898be a composed

function 119891119898∘ 119891119898minus1

∘ sdot sdot sdot ∘ 1198911 On this assumption 119865

119898is also

piecewise monotonic

Proof Mathematical induction is used to prove the proposi-tion

(i) Case 1198651= 1198911 Function 119891

1is piecewise monotonic

by assumption 1198651is equal to 119891

1 so it has the same

attribute(ii) Case 119865

119894+1= 119891119894+1∘ 119865119894 The composed function 119865

119894is

piecewise monotonic by the induction assumptionlet 119868 be a subset of its domainrsquos partition and let 119909and 1199091015840 be two arbitrary elements in 119868 with 119909le

1198831199091015840

then one of themonotonicity conditions holds that iseither 119865

119894(119909) le119884119894119865119894(1199091015840

) or 119865119894(119909) ge119884119894119865119894(1199091015840

) For simplic-ity we denote it as 119865

119894(119909)R119865

119894(1199091015840

) where R isin le geFunction 119891

119894+1is piecewise monotonic by assumption

The monotonicity condition is satisfied by 119865119894(119909) and

119865119894(1199091015840

) if both lie in the same subset 1198681015840of its domainrsquospartition Then 119891

119894+1(119909)R119891

119894+1(1199091015840

) holds and 119891119894+1

isalso monotonic on 1198681015840

After decomposing a branching condition into its basicfunctions its monotonicity can be utilized in the selectionof the initial value as well as other values of the variable inquestion

521 Initial Value Selection Initial values of variables are ofgreat importance to a search algorithm On the one hand in a

8 Mathematical Problems in Engineering

backtrack-free search the initial value of a variable is almostpart of the solution On the other hand the selection of initialvalues affectswhether the searchwill be backtrack-free Initialvalues are often selected at random in MHS methods whichreturn different test data each time allowing diversity butrandomness without any heuristics is a kind of blind searchand causes too many iterations sometimes even exceptionMeanwhile midvalues are selected in methods using bisec-tion so it is obvious that sometimes the same result may bereturned since the same initial value is always selected In ourmethod the above twomethods are combined and the initialvalue of a variable is determined based on its path tendencywhich is defined and calculated as follows

Definition 7 Path tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is anattribute of a variable on a path which is in favor of thesatisfaction of all the branching conditions along the pathAnd it provides the information about where to select itsinitial value Positive implies that a larger initial value willwork better while negative implies that a smaller initial valueis better

The calculation of the path tendency of a variable119909119894involves the calculation of its weight on each branch

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) and its path weight denoted as119908119894(119899119902119886 119899119902119886+1

) and 119901119908119894 which are calculated as (3)

119908119894(119899119902119886 119899119902119886+1

)

=

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing

minus

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

119901119908119894=

119896

sum119886=1

119908119894(119899119902119886 119899119902119886+1

)

(3)

Path tendency calculation (PTC) gleans the path ten-dency of each variable with 119901119908

119894 Subsequently initial domain

calculation (IDC) works on the result of PTC In this waythe initial value selection allows for both diversity andheuristics The algorithms are expressed by pseudo-codes inAlgorithms 4 and 5

522 Bisection by Tendency Bisection functions only whena value (including the initial value) assigned to the currentvariable 119909

119894is judged to be infeasible and the conflicted branch

(119899119902119886 119899119902119886+1

)with the false branching condition is locatedThenthe tendency of 119909

119894is used by bisection defined as follows

Definition 8 Tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is an attributeof a variable at a branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) determinedby the analysis on the monotonicity of the correspondingbranching condition and it provides the information aboutwhere to select a value to better satisfy the branching

condition Positive implies that a larger value will work betterwhile negative implies that a smaller value is better It iscalculated according to the following formula

119879119890119899119889119890119899119888119910 (119909119894)

=

119901119900119904119894119905119894V119890 if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing119899119890119892119886119905119894V119890 if Br (119909

119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

(4)

Each branch holds a tendencymap ⟨119881119886119903119894119886119887119897119890 119879119890119899119889119890119899119888119910⟩which includes the variables appearing on the branch andtheir corresponding tendencies With the tendency mapbisection can be applied to reduce the domain of 119909

119894(119863119894119895)

leading the branching condition to be true as presented bypseudo-codes in Algorithm 6

For example if the conflicted branch is the first branchof Path3 in Figure 1 then the corresponding branchingcondition is 1199091 minus 1199092 gt 0 which has different monotonicrelations with 1199091 and 1199092 respectively Table 2 shows how touse bisection to reduce the domains of variables If the currentvariable is 1199091 then retrieval of tendency map returns positiveindicating that a larger value will help satisfy the branchingcondition sowe reduce its domain to the larger part But if thecurrent variable is 1199092 bisection will function in the oppositeway due to the opposite monotonic relation

53 Heuristics in Maintaining Path Consistency As men-tioned in Section 42 MPC can be used in both stages ofBFS-BB In this part the focus is on the state space searchstage A value assigned to the current variable 119909

119894 no matter

it is the initial value or another value selected after bisectionshould be examined by interval arithmetic to see whether it ispart of the solution Path consistency is a prerequisite for thesuccess of interval arithmetic In the implementation of BFS-BB interval arithmetic is enhanced to provide more preciseinterval information The enhancement is to make clear howthe value of the branching condition defined by formula (2)is calculated as shown in formula (5) Here we use 119863119886 todenote the domain of all variables before calculating the 119886thbranching condition Besides a library of inverse functions isadded in case of the occurrences of library functions in thePUT Consider

Br (119899119902119886 119899119902119886+1

) (119909119894)

=

119905119903119906119890 if (119899119902119886 119899119902119886+1

) is traversedwith 119863119886 (119881

119894119895isin 119863119886

)

119891119886119897119904119890 otherwise

(5)

Hence for 119896 branching nodes along path 119901 all the119896 branching conditions should be true to maintain pathconsistency MPC receives the value of the current variable 119909

119894

(119881119894119895) which is part of the domain of all variables denoted as

1198631

(119881119894119895= [119881119894119895 119881119894119895] isin 1198631

) and evaluates the branching condi-tion corresponding to the branch (119899

1199021 1198991199021+1

) where 1198991199021is the

first branching node The branching condition Br(1198991199021 1198991199021+1

)

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 5

Input 119901 the path to be traversedOutput result ⟨119881119886119903119894119886119887119897119890 119881119886119897119906119890⟩ the test data making 119901 feasibleStage 1 Initialization(1) call Algorithm Irrelevant variable removal(2) resultlarr null(3) call Algorithm Maintaining path consistency(4) call Algorithm Dynamic variable ordering(5) call Algorithm Path tendency calculation(6) 119909

1larr head (119876

1)

(7) call Algorithm Initial domain calculation(8) 119881

11larr select (119863

11)

(9) initial statelarr (null 1199091 11986311 11988111 active 119876

1)

(10) 119878cur larr initial stateStage 2 State space searchBegin(11) for (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894) (119894 rarr 1 119899)

(12) path consistentlarr false(13) call Algorithm Maintaining path consistency(14) if (path consistent = true)(15) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 extensive 119876

119894)

(16) resultlarr result cup ⟨119909119894 119881119894119895⟩

(17) FVlarr FV minus 119909119894

(18) PVlarr PV + 119909119894

(19) call Algorithm Dynamic variable ordering(20) if (119876

119894= null)

(21) 119878cur larr final state(22) foreach 119909lowast isin 119883irrel(23) resultlarr result cup ⟨119909lowast 119881random⟩(24) else Prelarr 119878cur(25) 119909

119894larr head (119876

119894)

(26) call Algorithm Initial domain calculation(27) 119881

1198941larr select (119863

1198941)

(28) 119878cur larr (Pre 119909119894 1198631198941 1198811198941 active 119876

119894)

(29) else if (1003816100381610038161003816100381611986311989411989510038161003816100381610038161003816gt 1 ampamp 119895 lt 119898)

(30) call Algorithm Bisection(31) 119881

119894119895larr select (119863

119894119895)

(32) 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 active 119876

119894)

(33) else 119878cur larr (Pre 119909119894 119863119894119895 119881119894119895 inactive 119876

119894)

(34) 119875119903119890 larr 119878cur(35) 119878cur larr (Pre 119909

119894 119863119894119895 119881119894119895 active 119876

119894)

(36) PVlarr PV minus 119909119894

(37) return resultEnd

Algorithm 1 Best-first-search branch and bound

of a variable which is not relevant to 119901 is wasted since itcannot influence the traversal of 119901Thus removing irrelevantvariables from the search space and only concentrating on thevariables relevant to the path of interest may improve the per-formance of the search Hence we propose an optimizationmethod irrelevant variable removal (IVR) Relevant variableand irrelevant variable are defined as follows

Definition 1 A relevant variable is an input variable that caninfluence whether a particular path 119901will be traversed or notTo put it more precisely for all the input variables 119909

119894| 119909119894isin

119883 119894 = 1 2 119899 there exists a corresponding set of values119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 with which 119901 is not traversed

But when the value of a particular variable is changed forexample when the value of 119909

119892(119881119892) is changed into 1198811015840

119892 119901 is

traversed with the input 1198811 1198812 119881

1015840

119892 119881

119899 Then 119909

119892is a

relevant variable to path 119901

Definition 2 An irrelevant variable is an input variable thatis not capable of influencing whether a particular path 119901 willbe traversed or not To put it more precisely for all the sets119881119894| 119881119894isin 119863119894 119894 = 1 2 119899 of the search space of path 119901

with which 119901 is not traversed if 119901 is still not traversed withthe input 119881

1 1198812 119881

1015840

119892 119881

119899 when the value of a certain

variable 119909119892(119881119892) is changed into 1198811015840

119892 then 119909

119892is an irrelevant

variable to path 119901

6 Mathematical Problems in Engineering

Input Br(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the path119883 = 119909

1 1199092 119909

119899 the set of input variables

Output119883rel the set of relevant variables to the path119883irrel the set of irrelevant variables to the path

(1)119883rel larr Oslash(2)119883irrel larr Oslash(3) foreach Br(119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(4) if (119883rel = 119883)

(5) break(6) else if (119886

119895= 0)

(7) 119883rel larr 119883rel cup 119909119895(8)119883irrel larr 119883 minus119883rel(9) return 119883rel 119883irrel

Algorithm 2 Irrelevant variable removal

Generally for a particular path whether an input variableis relevant or irrelevant cannot be completely decided dueto the complex structure of programs But we can makeconservative estimate of irrelevancy with static control flowtechnique We give the most common condition in PUTsAssume that there are 119896 branches along a path each branch(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) needs to be traversed to find theset of relevant variables The removal of irrelevant variablesinvolves the judgment of whether a variable appears on eachbranch so we give the definition below which is utilizedby Algorithm 2 And considering the relation between thecomplexity of BFS-BB and the number of variables we giveProposition 4 about the effectiveness of IVR

Definition 3 The branching condition Br(119899119902119886 119899119902119886+1

) is theconstraint on the branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) and it canbe represented as

Br (119899119902119886 119899119902119886+1

) =

119899

sum119895=1

119886119895119909119895R119888 (1)

where R is a relational operator and 119886119895(119895 isin [1 119899]) and 119888 are

constants

Proposition 4 IVR may result in test data being searchedout with fewer MPC checks for a particular path 119901 than if allvariables are considered

Proof Thealgorithmbisection involves the search steps takenfor a certain variable under the same condition of othervariables which move in breadth (119898) until a value is foundto make MPC succeed Then 119898 is the base of the complexityof BFS-BB and the number of variables is the exponent Let119883rel denote the set of relevant variables to path119901 and let119883irrelbe the set of irrelevant variables onemore element in119883rel willinvolve more MPC checks on an exponential basis If all theirrelevant variables are removed from the search space thecomplexity will be reduced by119898|119883irrel| |119883irrel| is the cardinalityof the set of irrelevant variables

We conduct IVR for all the paths in Figure 1 and theprocess is shown in Table 1 The position where a variable isjudged relevant to the path of interest is highlighted in bold

5 The Heuristic Look-Ahead Methods

In this section the heuristic look-ahead methods in BFS-BBare explained in detail in Sections 51 52 and 53 respec-tively And Section 54 provides a case study to illustrate thesemethods

51 Heuristics in Variable Ordering In practice the chief goalin designing variable ordering heuristics is to reduce the sizeof the overall search tree In our method the next variableto be instantiated is selected to be the one with the minimalremaining domain size (the size of the domain after removingthe values judged to be infeasible) because this canminimizethe size of the overall search treeThe technique to break ties isimportant as there are often variables with the same domainsize We use variablesrsquo ranks to break ties In case of a tie thevariable with the higher rank is selected This method givessubstantially better performance than picking one of the tyingvariables at random Rank is defined as follows

Definition 5 The rank of a branch (119899119902119886 119899119902119886+1

) (119886 isin [1 119896])

marks its level in the sequence of the branches along a pathdenoted as rank (119899

119902119886 119899119902119886+1

)The rank of the first branch is 1 the rank of the second

one is 2 and the ranks of those following can be obtainedanalogously The variables appearing on a branch enjoy thesame rank as the branch The rank of a variable on a branchwhere it does not appear is supposed to be infinity As avariable may appear on more than one branch it may havedifferent ranks The rule to break ties according to the ranksof variables is based on the heuristics from interval arithmeticthat the earlier a variable appears on a path the greaterinfluence it has on the result of interval arithmetic along thepath Therefore if the ordering by rank is taken between avariable that appears on the branch (119899

119902119886 119899119902119886+1

) and a variablethat does not then the former has a higher rank That isbecause on the branch (119899

119902119886 119899119902119886+1

) the former has rank 119886

Mathematical Problems in Engineering 7

Table 1 IVR process for each path of test in Figure 1

Path Branching condition 1198861

1198862

1198863

119883rel 119883irrel

Path 1 0 rarr 1 rarr 2 rarr 9 rarr 10 1199091 minus 1199092 le 0 1 minus1 0 1199091 1199092 1199093

Path 2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 101199091 minus 1199092 gt 0 1 minus1 0

1199091 1199092 1199093 Oslash1199093 minus 1199092 le 0 0 minus1 1

Path 3 0 rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 ge minus5 mdash mdash mdash

Path 4 0 rarr 1 rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 lt minus5 mdash mdash mdash

Input FV the set of future variables119863119894 the domain of 119909

119894(119909119894isin FV)

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branches along the pathOutput 119876

119894 a queue of FV

Begin(1) 119876119894larr quicksort (FV 1003816100381610038161003816119863119894

1003816100381610038161003816)(2) for 119894 rarr 1 1003816100381610038161003816119876119894

1003816100381610038161003816(3) if (1003816100381610038161003816119863119894

1003816100381610038161003816 =10038161003816100381610038161003816119863119895

10038161003816100381610038161003816) (119895 gt 119894 119909

119894 119909119895isin 119876119894)

(4) break(5) else for (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(6) if (rank(119899

119902119886 119899119902119886+1

)(119909119894) = rank(119899

119902119886 119899119902119886+1

)(119909119895))

(7) 119886++(8) else permutate 119909

119894 119909119895by rank(119899

119902119886 119899119902119886+1

)(9) break(10) return 119876

119894

End

Algorithm 3 Dynamic variable ordering

while the latter has rank infinity The comparison between119886 and infinity determines the ordering The algorithm isdescribed by pseudocodes in Algorithm 3

Quicksort is utilized when permutating variables accord-ing to remaining domain size and returns 119876

119894as a result If

no variables have the same domain size then DVO finishesBut if there are variables whose domain sizes are the same asthat of the head of119876

119894 then the ordering by rank is under way

which will terminate as soon as different ranks appear

52 Heuristics in Value Selection DVO determines the nextvariable to be instantiated and then the value selectionstrategies are employed Considering the difference betweenthe variable in question (eg 119909

119894) and other variables the

branching condition defined by formula (1) can be furtherrepresented as a function of 119909

119894

Br (119899119902119886 119899119902119886+1

) (119909119894) 119863119894997888rarr 119861 = (119886

119894119909119894+ sum119895 = 119894

119886119895119909119895)R119888 (2)

where 119863119894is the domain of 119909

119894and 119861 is a set of Boolean

values 119905119903119906119890 119891119886119897119904119890 sum119895 = 119894119886119895119909119895is the linear combination of

the variables except 119909119894and is regarded as a constant Then

we can design the value selection strategies starting from themonotonic relation between the branching condition and 119909

119894

Monotonicity describes the behavior of a function in relationto the change of the input It gives an indication whether theoutput of the function moves in the same direction as theinput or in the reverse direction If a branching condition isa function whose monotonicity is known the direction inwhich the input needs to be moved to make the functiontrue can be determined The following proposition gives anattribute of a function composed of piecewise monotonicfunctions

Proposition 6 Assume that 1198911 1198831rarr 119884

1 1198912 1198832rarr

1198842 119891

119898 119883119898rarr 119884119898is a family of piecewise monotonic

functions with 119884119894sube 119883119894+1

Let 119865119898 1198831rarr 119884119898be a composed

function 119891119898∘ 119891119898minus1

∘ sdot sdot sdot ∘ 1198911 On this assumption 119865

119898is also

piecewise monotonic

Proof Mathematical induction is used to prove the proposi-tion

(i) Case 1198651= 1198911 Function 119891

1is piecewise monotonic

by assumption 1198651is equal to 119891

1 so it has the same

attribute(ii) Case 119865

119894+1= 119891119894+1∘ 119865119894 The composed function 119865

119894is

piecewise monotonic by the induction assumptionlet 119868 be a subset of its domainrsquos partition and let 119909and 1199091015840 be two arbitrary elements in 119868 with 119909le

1198831199091015840

then one of themonotonicity conditions holds that iseither 119865

119894(119909) le119884119894119865119894(1199091015840

) or 119865119894(119909) ge119884119894119865119894(1199091015840

) For simplic-ity we denote it as 119865

119894(119909)R119865

119894(1199091015840

) where R isin le geFunction 119891

119894+1is piecewise monotonic by assumption

The monotonicity condition is satisfied by 119865119894(119909) and

119865119894(1199091015840

) if both lie in the same subset 1198681015840of its domainrsquospartition Then 119891

119894+1(119909)R119891

119894+1(1199091015840

) holds and 119891119894+1

isalso monotonic on 1198681015840

After decomposing a branching condition into its basicfunctions its monotonicity can be utilized in the selectionof the initial value as well as other values of the variable inquestion

521 Initial Value Selection Initial values of variables are ofgreat importance to a search algorithm On the one hand in a

8 Mathematical Problems in Engineering

backtrack-free search the initial value of a variable is almostpart of the solution On the other hand the selection of initialvalues affectswhether the searchwill be backtrack-free Initialvalues are often selected at random in MHS methods whichreturn different test data each time allowing diversity butrandomness without any heuristics is a kind of blind searchand causes too many iterations sometimes even exceptionMeanwhile midvalues are selected in methods using bisec-tion so it is obvious that sometimes the same result may bereturned since the same initial value is always selected In ourmethod the above twomethods are combined and the initialvalue of a variable is determined based on its path tendencywhich is defined and calculated as follows

Definition 7 Path tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is anattribute of a variable on a path which is in favor of thesatisfaction of all the branching conditions along the pathAnd it provides the information about where to select itsinitial value Positive implies that a larger initial value willwork better while negative implies that a smaller initial valueis better

The calculation of the path tendency of a variable119909119894involves the calculation of its weight on each branch

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) and its path weight denoted as119908119894(119899119902119886 119899119902119886+1

) and 119901119908119894 which are calculated as (3)

119908119894(119899119902119886 119899119902119886+1

)

=

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing

minus

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

119901119908119894=

119896

sum119886=1

119908119894(119899119902119886 119899119902119886+1

)

(3)

Path tendency calculation (PTC) gleans the path ten-dency of each variable with 119901119908

119894 Subsequently initial domain

calculation (IDC) works on the result of PTC In this waythe initial value selection allows for both diversity andheuristics The algorithms are expressed by pseudo-codes inAlgorithms 4 and 5

522 Bisection by Tendency Bisection functions only whena value (including the initial value) assigned to the currentvariable 119909

119894is judged to be infeasible and the conflicted branch

(119899119902119886 119899119902119886+1

)with the false branching condition is locatedThenthe tendency of 119909

119894is used by bisection defined as follows

Definition 8 Tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is an attributeof a variable at a branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) determinedby the analysis on the monotonicity of the correspondingbranching condition and it provides the information aboutwhere to select a value to better satisfy the branching

condition Positive implies that a larger value will work betterwhile negative implies that a smaller value is better It iscalculated according to the following formula

119879119890119899119889119890119899119888119910 (119909119894)

=

119901119900119904119894119905119894V119890 if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing119899119890119892119886119905119894V119890 if Br (119909

119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

(4)

Each branch holds a tendencymap ⟨119881119886119903119894119886119887119897119890 119879119890119899119889119890119899119888119910⟩which includes the variables appearing on the branch andtheir corresponding tendencies With the tendency mapbisection can be applied to reduce the domain of 119909

119894(119863119894119895)

leading the branching condition to be true as presented bypseudo-codes in Algorithm 6

For example if the conflicted branch is the first branchof Path3 in Figure 1 then the corresponding branchingcondition is 1199091 minus 1199092 gt 0 which has different monotonicrelations with 1199091 and 1199092 respectively Table 2 shows how touse bisection to reduce the domains of variables If the currentvariable is 1199091 then retrieval of tendency map returns positiveindicating that a larger value will help satisfy the branchingcondition sowe reduce its domain to the larger part But if thecurrent variable is 1199092 bisection will function in the oppositeway due to the opposite monotonic relation

53 Heuristics in Maintaining Path Consistency As men-tioned in Section 42 MPC can be used in both stages ofBFS-BB In this part the focus is on the state space searchstage A value assigned to the current variable 119909

119894 no matter

it is the initial value or another value selected after bisectionshould be examined by interval arithmetic to see whether it ispart of the solution Path consistency is a prerequisite for thesuccess of interval arithmetic In the implementation of BFS-BB interval arithmetic is enhanced to provide more preciseinterval information The enhancement is to make clear howthe value of the branching condition defined by formula (2)is calculated as shown in formula (5) Here we use 119863119886 todenote the domain of all variables before calculating the 119886thbranching condition Besides a library of inverse functions isadded in case of the occurrences of library functions in thePUT Consider

Br (119899119902119886 119899119902119886+1

) (119909119894)

=

119905119903119906119890 if (119899119902119886 119899119902119886+1

) is traversedwith 119863119886 (119881

119894119895isin 119863119886

)

119891119886119897119904119890 otherwise

(5)

Hence for 119896 branching nodes along path 119901 all the119896 branching conditions should be true to maintain pathconsistency MPC receives the value of the current variable 119909

119894

(119881119894119895) which is part of the domain of all variables denoted as

1198631

(119881119894119895= [119881119894119895 119881119894119895] isin 1198631

) and evaluates the branching condi-tion corresponding to the branch (119899

1199021 1198991199021+1

) where 1198991199021is the

first branching node The branching condition Br(1198991199021 1198991199021+1

)

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

6 Mathematical Problems in Engineering

Input Br(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the path119883 = 119909

1 1199092 119909

119899 the set of input variables

Output119883rel the set of relevant variables to the path119883irrel the set of irrelevant variables to the path

(1)119883rel larr Oslash(2)119883irrel larr Oslash(3) foreach Br(119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(4) if (119883rel = 119883)

(5) break(6) else if (119886

119895= 0)

(7) 119883rel larr 119883rel cup 119909119895(8)119883irrel larr 119883 minus119883rel(9) return 119883rel 119883irrel

Algorithm 2 Irrelevant variable removal

Generally for a particular path whether an input variableis relevant or irrelevant cannot be completely decided dueto the complex structure of programs But we can makeconservative estimate of irrelevancy with static control flowtechnique We give the most common condition in PUTsAssume that there are 119896 branches along a path each branch(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) needs to be traversed to find theset of relevant variables The removal of irrelevant variablesinvolves the judgment of whether a variable appears on eachbranch so we give the definition below which is utilizedby Algorithm 2 And considering the relation between thecomplexity of BFS-BB and the number of variables we giveProposition 4 about the effectiveness of IVR

Definition 3 The branching condition Br(119899119902119886 119899119902119886+1

) is theconstraint on the branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) and it canbe represented as

Br (119899119902119886 119899119902119886+1

) =

119899

sum119895=1

119886119895119909119895R119888 (1)

where R is a relational operator and 119886119895(119895 isin [1 119899]) and 119888 are

constants

Proposition 4 IVR may result in test data being searchedout with fewer MPC checks for a particular path 119901 than if allvariables are considered

Proof Thealgorithmbisection involves the search steps takenfor a certain variable under the same condition of othervariables which move in breadth (119898) until a value is foundto make MPC succeed Then 119898 is the base of the complexityof BFS-BB and the number of variables is the exponent Let119883rel denote the set of relevant variables to path119901 and let119883irrelbe the set of irrelevant variables onemore element in119883rel willinvolve more MPC checks on an exponential basis If all theirrelevant variables are removed from the search space thecomplexity will be reduced by119898|119883irrel| |119883irrel| is the cardinalityof the set of irrelevant variables

We conduct IVR for all the paths in Figure 1 and theprocess is shown in Table 1 The position where a variable isjudged relevant to the path of interest is highlighted in bold

5 The Heuristic Look-Ahead Methods

In this section the heuristic look-ahead methods in BFS-BBare explained in detail in Sections 51 52 and 53 respec-tively And Section 54 provides a case study to illustrate thesemethods

51 Heuristics in Variable Ordering In practice the chief goalin designing variable ordering heuristics is to reduce the sizeof the overall search tree In our method the next variableto be instantiated is selected to be the one with the minimalremaining domain size (the size of the domain after removingthe values judged to be infeasible) because this canminimizethe size of the overall search treeThe technique to break ties isimportant as there are often variables with the same domainsize We use variablesrsquo ranks to break ties In case of a tie thevariable with the higher rank is selected This method givessubstantially better performance than picking one of the tyingvariables at random Rank is defined as follows

Definition 5 The rank of a branch (119899119902119886 119899119902119886+1

) (119886 isin [1 119896])

marks its level in the sequence of the branches along a pathdenoted as rank (119899

119902119886 119899119902119886+1

)The rank of the first branch is 1 the rank of the second

one is 2 and the ranks of those following can be obtainedanalogously The variables appearing on a branch enjoy thesame rank as the branch The rank of a variable on a branchwhere it does not appear is supposed to be infinity As avariable may appear on more than one branch it may havedifferent ranks The rule to break ties according to the ranksof variables is based on the heuristics from interval arithmeticthat the earlier a variable appears on a path the greaterinfluence it has on the result of interval arithmetic along thepath Therefore if the ordering by rank is taken between avariable that appears on the branch (119899

119902119886 119899119902119886+1

) and a variablethat does not then the former has a higher rank That isbecause on the branch (119899

119902119886 119899119902119886+1

) the former has rank 119886

Mathematical Problems in Engineering 7

Table 1 IVR process for each path of test in Figure 1

Path Branching condition 1198861

1198862

1198863

119883rel 119883irrel

Path 1 0 rarr 1 rarr 2 rarr 9 rarr 10 1199091 minus 1199092 le 0 1 minus1 0 1199091 1199092 1199093

Path 2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 101199091 minus 1199092 gt 0 1 minus1 0

1199091 1199092 1199093 Oslash1199093 minus 1199092 le 0 0 minus1 1

Path 3 0 rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 ge minus5 mdash mdash mdash

Path 4 0 rarr 1 rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 lt minus5 mdash mdash mdash

Input FV the set of future variables119863119894 the domain of 119909

119894(119909119894isin FV)

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branches along the pathOutput 119876

119894 a queue of FV

Begin(1) 119876119894larr quicksort (FV 1003816100381610038161003816119863119894

1003816100381610038161003816)(2) for 119894 rarr 1 1003816100381610038161003816119876119894

1003816100381610038161003816(3) if (1003816100381610038161003816119863119894

1003816100381610038161003816 =10038161003816100381610038161003816119863119895

10038161003816100381610038161003816) (119895 gt 119894 119909

119894 119909119895isin 119876119894)

(4) break(5) else for (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(6) if (rank(119899

119902119886 119899119902119886+1

)(119909119894) = rank(119899

119902119886 119899119902119886+1

)(119909119895))

(7) 119886++(8) else permutate 119909

119894 119909119895by rank(119899

119902119886 119899119902119886+1

)(9) break(10) return 119876

119894

End

Algorithm 3 Dynamic variable ordering

while the latter has rank infinity The comparison between119886 and infinity determines the ordering The algorithm isdescribed by pseudocodes in Algorithm 3

Quicksort is utilized when permutating variables accord-ing to remaining domain size and returns 119876

119894as a result If

no variables have the same domain size then DVO finishesBut if there are variables whose domain sizes are the same asthat of the head of119876

119894 then the ordering by rank is under way

which will terminate as soon as different ranks appear

52 Heuristics in Value Selection DVO determines the nextvariable to be instantiated and then the value selectionstrategies are employed Considering the difference betweenthe variable in question (eg 119909

119894) and other variables the

branching condition defined by formula (1) can be furtherrepresented as a function of 119909

119894

Br (119899119902119886 119899119902119886+1

) (119909119894) 119863119894997888rarr 119861 = (119886

119894119909119894+ sum119895 = 119894

119886119895119909119895)R119888 (2)

where 119863119894is the domain of 119909

119894and 119861 is a set of Boolean

values 119905119903119906119890 119891119886119897119904119890 sum119895 = 119894119886119895119909119895is the linear combination of

the variables except 119909119894and is regarded as a constant Then

we can design the value selection strategies starting from themonotonic relation between the branching condition and 119909

119894

Monotonicity describes the behavior of a function in relationto the change of the input It gives an indication whether theoutput of the function moves in the same direction as theinput or in the reverse direction If a branching condition isa function whose monotonicity is known the direction inwhich the input needs to be moved to make the functiontrue can be determined The following proposition gives anattribute of a function composed of piecewise monotonicfunctions

Proposition 6 Assume that 1198911 1198831rarr 119884

1 1198912 1198832rarr

1198842 119891

119898 119883119898rarr 119884119898is a family of piecewise monotonic

functions with 119884119894sube 119883119894+1

Let 119865119898 1198831rarr 119884119898be a composed

function 119891119898∘ 119891119898minus1

∘ sdot sdot sdot ∘ 1198911 On this assumption 119865

119898is also

piecewise monotonic

Proof Mathematical induction is used to prove the proposi-tion

(i) Case 1198651= 1198911 Function 119891

1is piecewise monotonic

by assumption 1198651is equal to 119891

1 so it has the same

attribute(ii) Case 119865

119894+1= 119891119894+1∘ 119865119894 The composed function 119865

119894is

piecewise monotonic by the induction assumptionlet 119868 be a subset of its domainrsquos partition and let 119909and 1199091015840 be two arbitrary elements in 119868 with 119909le

1198831199091015840

then one of themonotonicity conditions holds that iseither 119865

119894(119909) le119884119894119865119894(1199091015840

) or 119865119894(119909) ge119884119894119865119894(1199091015840

) For simplic-ity we denote it as 119865

119894(119909)R119865

119894(1199091015840

) where R isin le geFunction 119891

119894+1is piecewise monotonic by assumption

The monotonicity condition is satisfied by 119865119894(119909) and

119865119894(1199091015840

) if both lie in the same subset 1198681015840of its domainrsquospartition Then 119891

119894+1(119909)R119891

119894+1(1199091015840

) holds and 119891119894+1

isalso monotonic on 1198681015840

After decomposing a branching condition into its basicfunctions its monotonicity can be utilized in the selectionof the initial value as well as other values of the variable inquestion

521 Initial Value Selection Initial values of variables are ofgreat importance to a search algorithm On the one hand in a

8 Mathematical Problems in Engineering

backtrack-free search the initial value of a variable is almostpart of the solution On the other hand the selection of initialvalues affectswhether the searchwill be backtrack-free Initialvalues are often selected at random in MHS methods whichreturn different test data each time allowing diversity butrandomness without any heuristics is a kind of blind searchand causes too many iterations sometimes even exceptionMeanwhile midvalues are selected in methods using bisec-tion so it is obvious that sometimes the same result may bereturned since the same initial value is always selected In ourmethod the above twomethods are combined and the initialvalue of a variable is determined based on its path tendencywhich is defined and calculated as follows

Definition 7 Path tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is anattribute of a variable on a path which is in favor of thesatisfaction of all the branching conditions along the pathAnd it provides the information about where to select itsinitial value Positive implies that a larger initial value willwork better while negative implies that a smaller initial valueis better

The calculation of the path tendency of a variable119909119894involves the calculation of its weight on each branch

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) and its path weight denoted as119908119894(119899119902119886 119899119902119886+1

) and 119901119908119894 which are calculated as (3)

119908119894(119899119902119886 119899119902119886+1

)

=

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing

minus

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

119901119908119894=

119896

sum119886=1

119908119894(119899119902119886 119899119902119886+1

)

(3)

Path tendency calculation (PTC) gleans the path ten-dency of each variable with 119901119908

119894 Subsequently initial domain

calculation (IDC) works on the result of PTC In this waythe initial value selection allows for both diversity andheuristics The algorithms are expressed by pseudo-codes inAlgorithms 4 and 5

522 Bisection by Tendency Bisection functions only whena value (including the initial value) assigned to the currentvariable 119909

119894is judged to be infeasible and the conflicted branch

(119899119902119886 119899119902119886+1

)with the false branching condition is locatedThenthe tendency of 119909

119894is used by bisection defined as follows

Definition 8 Tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is an attributeof a variable at a branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) determinedby the analysis on the monotonicity of the correspondingbranching condition and it provides the information aboutwhere to select a value to better satisfy the branching

condition Positive implies that a larger value will work betterwhile negative implies that a smaller value is better It iscalculated according to the following formula

119879119890119899119889119890119899119888119910 (119909119894)

=

119901119900119904119894119905119894V119890 if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing119899119890119892119886119905119894V119890 if Br (119909

119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

(4)

Each branch holds a tendencymap ⟨119881119886119903119894119886119887119897119890 119879119890119899119889119890119899119888119910⟩which includes the variables appearing on the branch andtheir corresponding tendencies With the tendency mapbisection can be applied to reduce the domain of 119909

119894(119863119894119895)

leading the branching condition to be true as presented bypseudo-codes in Algorithm 6

For example if the conflicted branch is the first branchof Path3 in Figure 1 then the corresponding branchingcondition is 1199091 minus 1199092 gt 0 which has different monotonicrelations with 1199091 and 1199092 respectively Table 2 shows how touse bisection to reduce the domains of variables If the currentvariable is 1199091 then retrieval of tendency map returns positiveindicating that a larger value will help satisfy the branchingcondition sowe reduce its domain to the larger part But if thecurrent variable is 1199092 bisection will function in the oppositeway due to the opposite monotonic relation

53 Heuristics in Maintaining Path Consistency As men-tioned in Section 42 MPC can be used in both stages ofBFS-BB In this part the focus is on the state space searchstage A value assigned to the current variable 119909

119894 no matter

it is the initial value or another value selected after bisectionshould be examined by interval arithmetic to see whether it ispart of the solution Path consistency is a prerequisite for thesuccess of interval arithmetic In the implementation of BFS-BB interval arithmetic is enhanced to provide more preciseinterval information The enhancement is to make clear howthe value of the branching condition defined by formula (2)is calculated as shown in formula (5) Here we use 119863119886 todenote the domain of all variables before calculating the 119886thbranching condition Besides a library of inverse functions isadded in case of the occurrences of library functions in thePUT Consider

Br (119899119902119886 119899119902119886+1

) (119909119894)

=

119905119903119906119890 if (119899119902119886 119899119902119886+1

) is traversedwith 119863119886 (119881

119894119895isin 119863119886

)

119891119886119897119904119890 otherwise

(5)

Hence for 119896 branching nodes along path 119901 all the119896 branching conditions should be true to maintain pathconsistency MPC receives the value of the current variable 119909

119894

(119881119894119895) which is part of the domain of all variables denoted as

1198631

(119881119894119895= [119881119894119895 119881119894119895] isin 1198631

) and evaluates the branching condi-tion corresponding to the branch (119899

1199021 1198991199021+1

) where 1198991199021is the

first branching node The branching condition Br(1198991199021 1198991199021+1

)

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 7

Table 1 IVR process for each path of test in Figure 1

Path Branching condition 1198861

1198862

1198863

119883rel 119883irrel

Path 1 0 rarr 1 rarr 2 rarr 9 rarr 10 1199091 minus 1199092 le 0 1 minus1 0 1199091 1199092 1199093

Path 2 0 rarr 1 rarr 3 rarr 4 rarr 8 rarr 9 rarr 101199091 minus 1199092 gt 0 1 minus1 0

1199091 1199092 1199093 Oslash1199093 minus 1199092 le 0 0 minus1 1

Path 3 0 rarr 1 rarr 3 rarr 5 rarr 6 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 ge minus5 mdash mdash mdash

Path 4 0 rarr 1 rarr 3 rarr 5 rarr 7 rarr 8 rarr 9 rarr 10

1199091 minus 1199092 gt 0 1 minus1 01199091 1199092 1199093 Oslash1199093 minus 1199092 gt 0 0 minus1 1

31199093 lt minus5 mdash mdash mdash

Input FV the set of future variables119863119894 the domain of 119909

119894(119909119894isin FV)

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branches along the pathOutput 119876

119894 a queue of FV

Begin(1) 119876119894larr quicksort (FV 1003816100381610038161003816119863119894

1003816100381610038161003816)(2) for 119894 rarr 1 1003816100381610038161003816119876119894

1003816100381610038161003816(3) if (1003816100381610038161003816119863119894

1003816100381610038161003816 =10038161003816100381610038161003816119863119895

10038161003816100381610038161003816) (119895 gt 119894 119909

119894 119909119895isin 119876119894)

(4) break(5) else for (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896])(6) if (rank(119899

119902119886 119899119902119886+1

)(119909119894) = rank(119899

119902119886 119899119902119886+1

)(119909119895))

(7) 119886++(8) else permutate 119909

119894 119909119895by rank(119899

119902119886 119899119902119886+1

)(9) break(10) return 119876

119894

End

Algorithm 3 Dynamic variable ordering

while the latter has rank infinity The comparison between119886 and infinity determines the ordering The algorithm isdescribed by pseudocodes in Algorithm 3

Quicksort is utilized when permutating variables accord-ing to remaining domain size and returns 119876

119894as a result If

no variables have the same domain size then DVO finishesBut if there are variables whose domain sizes are the same asthat of the head of119876

119894 then the ordering by rank is under way

which will terminate as soon as different ranks appear

52 Heuristics in Value Selection DVO determines the nextvariable to be instantiated and then the value selectionstrategies are employed Considering the difference betweenthe variable in question (eg 119909

119894) and other variables the

branching condition defined by formula (1) can be furtherrepresented as a function of 119909

119894

Br (119899119902119886 119899119902119886+1

) (119909119894) 119863119894997888rarr 119861 = (119886

119894119909119894+ sum119895 = 119894

119886119895119909119895)R119888 (2)

where 119863119894is the domain of 119909

119894and 119861 is a set of Boolean

values 119905119903119906119890 119891119886119897119904119890 sum119895 = 119894119886119895119909119895is the linear combination of

the variables except 119909119894and is regarded as a constant Then

we can design the value selection strategies starting from themonotonic relation between the branching condition and 119909

119894

Monotonicity describes the behavior of a function in relationto the change of the input It gives an indication whether theoutput of the function moves in the same direction as theinput or in the reverse direction If a branching condition isa function whose monotonicity is known the direction inwhich the input needs to be moved to make the functiontrue can be determined The following proposition gives anattribute of a function composed of piecewise monotonicfunctions

Proposition 6 Assume that 1198911 1198831rarr 119884

1 1198912 1198832rarr

1198842 119891

119898 119883119898rarr 119884119898is a family of piecewise monotonic

functions with 119884119894sube 119883119894+1

Let 119865119898 1198831rarr 119884119898be a composed

function 119891119898∘ 119891119898minus1

∘ sdot sdot sdot ∘ 1198911 On this assumption 119865

119898is also

piecewise monotonic

Proof Mathematical induction is used to prove the proposi-tion

(i) Case 1198651= 1198911 Function 119891

1is piecewise monotonic

by assumption 1198651is equal to 119891

1 so it has the same

attribute(ii) Case 119865

119894+1= 119891119894+1∘ 119865119894 The composed function 119865

119894is

piecewise monotonic by the induction assumptionlet 119868 be a subset of its domainrsquos partition and let 119909and 1199091015840 be two arbitrary elements in 119868 with 119909le

1198831199091015840

then one of themonotonicity conditions holds that iseither 119865

119894(119909) le119884119894119865119894(1199091015840

) or 119865119894(119909) ge119884119894119865119894(1199091015840

) For simplic-ity we denote it as 119865

119894(119909)R119865

119894(1199091015840

) where R isin le geFunction 119891

119894+1is piecewise monotonic by assumption

The monotonicity condition is satisfied by 119865119894(119909) and

119865119894(1199091015840

) if both lie in the same subset 1198681015840of its domainrsquospartition Then 119891

119894+1(119909)R119891

119894+1(1199091015840

) holds and 119891119894+1

isalso monotonic on 1198681015840

After decomposing a branching condition into its basicfunctions its monotonicity can be utilized in the selectionof the initial value as well as other values of the variable inquestion

521 Initial Value Selection Initial values of variables are ofgreat importance to a search algorithm On the one hand in a

8 Mathematical Problems in Engineering

backtrack-free search the initial value of a variable is almostpart of the solution On the other hand the selection of initialvalues affectswhether the searchwill be backtrack-free Initialvalues are often selected at random in MHS methods whichreturn different test data each time allowing diversity butrandomness without any heuristics is a kind of blind searchand causes too many iterations sometimes even exceptionMeanwhile midvalues are selected in methods using bisec-tion so it is obvious that sometimes the same result may bereturned since the same initial value is always selected In ourmethod the above twomethods are combined and the initialvalue of a variable is determined based on its path tendencywhich is defined and calculated as follows

Definition 7 Path tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is anattribute of a variable on a path which is in favor of thesatisfaction of all the branching conditions along the pathAnd it provides the information about where to select itsinitial value Positive implies that a larger initial value willwork better while negative implies that a smaller initial valueis better

The calculation of the path tendency of a variable119909119894involves the calculation of its weight on each branch

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) and its path weight denoted as119908119894(119899119902119886 119899119902119886+1

) and 119901119908119894 which are calculated as (3)

119908119894(119899119902119886 119899119902119886+1

)

=

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing

minus

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

119901119908119894=

119896

sum119886=1

119908119894(119899119902119886 119899119902119886+1

)

(3)

Path tendency calculation (PTC) gleans the path ten-dency of each variable with 119901119908

119894 Subsequently initial domain

calculation (IDC) works on the result of PTC In this waythe initial value selection allows for both diversity andheuristics The algorithms are expressed by pseudo-codes inAlgorithms 4 and 5

522 Bisection by Tendency Bisection functions only whena value (including the initial value) assigned to the currentvariable 119909

119894is judged to be infeasible and the conflicted branch

(119899119902119886 119899119902119886+1

)with the false branching condition is locatedThenthe tendency of 119909

119894is used by bisection defined as follows

Definition 8 Tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is an attributeof a variable at a branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) determinedby the analysis on the monotonicity of the correspondingbranching condition and it provides the information aboutwhere to select a value to better satisfy the branching

condition Positive implies that a larger value will work betterwhile negative implies that a smaller value is better It iscalculated according to the following formula

119879119890119899119889119890119899119888119910 (119909119894)

=

119901119900119904119894119905119894V119890 if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing119899119890119892119886119905119894V119890 if Br (119909

119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

(4)

Each branch holds a tendencymap ⟨119881119886119903119894119886119887119897119890 119879119890119899119889119890119899119888119910⟩which includes the variables appearing on the branch andtheir corresponding tendencies With the tendency mapbisection can be applied to reduce the domain of 119909

119894(119863119894119895)

leading the branching condition to be true as presented bypseudo-codes in Algorithm 6

For example if the conflicted branch is the first branchof Path3 in Figure 1 then the corresponding branchingcondition is 1199091 minus 1199092 gt 0 which has different monotonicrelations with 1199091 and 1199092 respectively Table 2 shows how touse bisection to reduce the domains of variables If the currentvariable is 1199091 then retrieval of tendency map returns positiveindicating that a larger value will help satisfy the branchingcondition sowe reduce its domain to the larger part But if thecurrent variable is 1199092 bisection will function in the oppositeway due to the opposite monotonic relation

53 Heuristics in Maintaining Path Consistency As men-tioned in Section 42 MPC can be used in both stages ofBFS-BB In this part the focus is on the state space searchstage A value assigned to the current variable 119909

119894 no matter

it is the initial value or another value selected after bisectionshould be examined by interval arithmetic to see whether it ispart of the solution Path consistency is a prerequisite for thesuccess of interval arithmetic In the implementation of BFS-BB interval arithmetic is enhanced to provide more preciseinterval information The enhancement is to make clear howthe value of the branching condition defined by formula (2)is calculated as shown in formula (5) Here we use 119863119886 todenote the domain of all variables before calculating the 119886thbranching condition Besides a library of inverse functions isadded in case of the occurrences of library functions in thePUT Consider

Br (119899119902119886 119899119902119886+1

) (119909119894)

=

119905119903119906119890 if (119899119902119886 119899119902119886+1

) is traversedwith 119863119886 (119881

119894119895isin 119863119886

)

119891119886119897119904119890 otherwise

(5)

Hence for 119896 branching nodes along path 119901 all the119896 branching conditions should be true to maintain pathconsistency MPC receives the value of the current variable 119909

119894

(119881119894119895) which is part of the domain of all variables denoted as

1198631

(119881119894119895= [119881119894119895 119881119894119895] isin 1198631

) and evaluates the branching condi-tion corresponding to the branch (119899

1199021 1198991199021+1

) where 1198991199021is the

first branching node The branching condition Br(1198991199021 1198991199021+1

)

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

8 Mathematical Problems in Engineering

backtrack-free search the initial value of a variable is almostpart of the solution On the other hand the selection of initialvalues affectswhether the searchwill be backtrack-free Initialvalues are often selected at random in MHS methods whichreturn different test data each time allowing diversity butrandomness without any heuristics is a kind of blind searchand causes too many iterations sometimes even exceptionMeanwhile midvalues are selected in methods using bisec-tion so it is obvious that sometimes the same result may bereturned since the same initial value is always selected In ourmethod the above twomethods are combined and the initialvalue of a variable is determined based on its path tendencywhich is defined and calculated as follows

Definition 7 Path tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is anattribute of a variable on a path which is in favor of thesatisfaction of all the branching conditions along the pathAnd it provides the information about where to select itsinitial value Positive implies that a larger initial value willwork better while negative implies that a smaller initial valueis better

The calculation of the path tendency of a variable119909119894involves the calculation of its weight on each branch

(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) and its path weight denoted as119908119894(119899119902119886 119899119902119886+1

) and 119901119908119894 which are calculated as (3)

119908119894(119899119902119886 119899119902119886+1

)

=

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing

minus

10038161003816100381610038161198861198941003816100381610038161003816

10038161003816100381610038161198861198941003816100381610038161003816 + sum119895 = 119894

10038161003816100381610038161003816119886119895

10038161003816100381610038161003816

if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

119901119908119894=

119896

sum119886=1

119908119894(119899119902119886 119899119902119886+1

)

(3)

Path tendency calculation (PTC) gleans the path ten-dency of each variable with 119901119908

119894 Subsequently initial domain

calculation (IDC) works on the result of PTC In this waythe initial value selection allows for both diversity andheuristics The algorithms are expressed by pseudo-codes inAlgorithms 4 and 5

522 Bisection by Tendency Bisection functions only whena value (including the initial value) assigned to the currentvariable 119909

119894is judged to be infeasible and the conflicted branch

(119899119902119886 119899119902119886+1

)with the false branching condition is locatedThenthe tendency of 119909

119894is used by bisection defined as follows

Definition 8 Tendency isin 119901119900119904119894119905119894V119890 119899119890119892119886119905119894V119890 is an attributeof a variable at a branch (119899

119902119886 119899119902119886+1

) (119886 isin [1 119896]) determinedby the analysis on the monotonicity of the correspondingbranching condition and it provides the information aboutwhere to select a value to better satisfy the branching

condition Positive implies that a larger value will work betterwhile negative implies that a smaller value is better It iscalculated according to the following formula

119879119890119899119889119890119899119888119910 (119909119894)

=

119901119900119904119894119905119894V119890 if Br (119909119894) (119899119902119886 119899119902119886+1

)

is monotonically increasing119899119890119892119886119905119894V119890 if Br (119909

119894) (119899119902119886 119899119902119886+1

)

is monotonically decreasing

(4)

Each branch holds a tendencymap ⟨119881119886119903119894119886119887119897119890 119879119890119899119889119890119899119888119910⟩which includes the variables appearing on the branch andtheir corresponding tendencies With the tendency mapbisection can be applied to reduce the domain of 119909

119894(119863119894119895)

leading the branching condition to be true as presented bypseudo-codes in Algorithm 6

For example if the conflicted branch is the first branchof Path3 in Figure 1 then the corresponding branchingcondition is 1199091 minus 1199092 gt 0 which has different monotonicrelations with 1199091 and 1199092 respectively Table 2 shows how touse bisection to reduce the domains of variables If the currentvariable is 1199091 then retrieval of tendency map returns positiveindicating that a larger value will help satisfy the branchingcondition sowe reduce its domain to the larger part But if thecurrent variable is 1199092 bisection will function in the oppositeway due to the opposite monotonic relation

53 Heuristics in Maintaining Path Consistency As men-tioned in Section 42 MPC can be used in both stages ofBFS-BB In this part the focus is on the state space searchstage A value assigned to the current variable 119909

119894 no matter

it is the initial value or another value selected after bisectionshould be examined by interval arithmetic to see whether it ispart of the solution Path consistency is a prerequisite for thesuccess of interval arithmetic In the implementation of BFS-BB interval arithmetic is enhanced to provide more preciseinterval information The enhancement is to make clear howthe value of the branching condition defined by formula (2)is calculated as shown in formula (5) Here we use 119863119886 todenote the domain of all variables before calculating the 119886thbranching condition Besides a library of inverse functions isadded in case of the occurrences of library functions in thePUT Consider

Br (119899119902119886 119899119902119886+1

) (119909119894)

=

119905119903119906119890 if (119899119902119886 119899119902119886+1

) is traversedwith 119863119886 (119881

119894119895isin 119863119886

)

119891119886119897119904119890 otherwise

(5)

Hence for 119896 branching nodes along path 119901 all the119896 branching conditions should be true to maintain pathconsistency MPC receives the value of the current variable 119909

119894

(119881119894119895) which is part of the domain of all variables denoted as

1198631

(119881119894119895= [119881119894119895 119881119894119895] isin 1198631

) and evaluates the branching condi-tion corresponding to the branch (119899

1199021 1198991199021+1

) where 1198991199021is the

first branching node The branching condition Br(1198991199021 1198991199021+1

)

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 9

Input119883rel the set of relevant variables to the path119901119908119894 the path weight of variable 119909

119894(119909119894isin 119883rel)

Output Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relBegin(1) Path-Tendencylarr null(2) foreach 119909

119894isin 119883rel

(3) if (119901119908119894gt 0)

(4) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119901119900119904119894119905119894V119890⟩

(5) else if (119901119908119894lt 0)

(6) Path-Tendencylarr Path-Tendency cup ⟨119909119894 119899119890119892119886119905119894V119890⟩

(7) return Path-TendencyEnd

Algorithm 4 Path tendency calculation

Input 119863119894= [min max] the domain of 119909

119894

Path-Tendency ⟨119881119886119903119894119886119887119897119890 119875119886119905ℎ119879119890119899119889119890119899119888119910⟩ a map used to store the path tendency of each variable in119883relOutput 119863

1198941 the domain of 119909

119894in which its initial value is selected

Begin(1) PathTendency(119909

119894)larr retrieval of Path-Tendency

(2) if (PathTendency(119909119894) = positive)

(3) 1198631198941larr997888 [

(min+max)2

max](4) else if (PathTendency(119909

119894) = negative)

(5) 1198631198941larr997888 [min (min+max)

2]

(6) return 1198631198941

End

Algorithm 5 Initial domain calculation

Input 119863119894119895= [min max] the current domain of 119909

119894

119881119894119895 the current value of 119909

119894that causes Br(119909

119894)(119899119902119886 119899119902119886+1

) to be false(119899119902119886 119899119902119886+1

) the conflicted branchOutput 119863

119894119895 the reduced domain of 119909

119894

Begin(1) 1198811015840 larr 119881

119894119895

(2) Tendency(119909119894)larr retrieval of tendency map held by (119899

119902119886 119899119902119886+1

)(3) 119895++(4) if (Tendency(119909

119894) = positive)

(5) 119863119894119895larr [1198811015840 + 1 max]

(6) else if (Tendency(119909119894) = negative)

(7) 119863119894119895larr [min 1198811015840 minus 1]

(8) return 119863119894119895

End

Algorithm 6 Bisection

Table 2 An example of bisection

Current variable Monotonicity Tendency Current value Domain before bisection Domain after bisection1199091 Increasing Positive 1198811 [min1 max1] [1198811+1 max1]1199092 Decreasing Negative 1198812 [min2 max2] [min2 1198812minus1]

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

10 Mathematical Problems in Engineering

Input 1198631 the domain of all variables before checking path consistencyBr(119899119902119886 119899119902119886+1

) (119886 isin [1 119896]) 119896 branching conditions along the pathOutput 119863119896+1 the reduced domain of all variables after a successful path consistency check

(119899119902119886 119899119902119886+1

) the conflicted branch spotted by path consistency checkBegin(1) for 119886 rarr 1 119896(2) calculate Br(119899

119902119886 119899119902119886+1

) with119863119886(3) if (Br(119899

119902119886 119899119902119886+1

) = true)(4) 119863

119886+1

⊑ 119863119886

(5) else return (119899119902119886 119899119902119886+1

)(6) path consistentlarr true(7) return 119863119896+1End

Algorithm 7 Maintaining path consistency

is generally not satisfied for all the values in1198631 but for valuesin a certain subset 1198632 sube 119863

1 ensuring the traversal of the

branch (1198991199021 1198991199021+1

) that is 1198631119861119903(1198991199021 1198991199021+1)

997888997888997888997888997888997888997888997888997888rarr 1198632 Next the

branching condition Br(1198991199022 1198991199022+1

) is evaluated given that thedomain of all variables is1198632 Again generally Br(119899

1199022 1198991199022+1

) isonly satisfied by a subset 1198633 sube 1198632 This procedure continuesalong 119901 until all the branching conditions are satisfied tomaintain path consistency and119863119896+1 is returned as the domainof all variables The process of maintaining path consistencyis the propagation of the branching conditions along p in the

form of 1198631Br(1198991199021 1198991199021+1)997888997888997888997888997888997888997888997888997888rarr 119863

2

Br(11989911990221198991199022+1)997888997888997888997888997888997888997888997888997888rarr 119863

3

sdot sdot sdot 119863119896

Br(119899119902119896119899119902119896+1)997888997888997888997888997888997888997888997888997888rarr

119863119896+1 where 1198631 supe 119863

2

supe 1198633

sdot sdot sdot supe 119863119896

supe 119863119896+1 But if in

this procedure Br(119899119902ℎ 119899119902ℎ+1

) = 119891119886119897119904119890(1 le ℎ le 119896) whichmeans a conflict is detected then MPC is terminated andbisection will function according to the result of MPC atthe conflicted branch (119899

119902ℎ 119899119902ℎ+1

) The process of checkingwhether path consistency is maintained is shown by pseudo-codes in Algorithm 7

54 Case Study In this part the problem mentioned inSection 31 is used as an example to explain how BFS-BBworks especially the heuristic look-aheadmethods proposedahead The input is Path3 as shown in bold in Figure 3where each branching condition is decomposed into its basicfunctions in the right The IVR process has been illustratedin detail in Table 1 and all the three variables are determinedrelevant to Path3 For simplicity the input domains of allvariables are set [minus2 2] with the size 5 In the initializationstage MPC check reduces their domains to 1199091 [minus1 2] 1199092[minus2 1] and 1199093 [minus1 2] The path tendency of each variableis calculated by PTC as shown in Table 3 DVO serves todetermine the first variable to be instantiated as shown inTable 4 with the head of the queue (1199092) highlighted in boldOn determining 1199092 to be the current variable an initialvalue needs to be selected from[minus2 1] The retrieval of pathtendency map by IDC returns negative for 1199092 indicating thata smaller value will perform better and minus1 is selected

MPC checks the domains of all variables which are 1199091[minus1 2] 1199092 [minus1 1] and 1199093 [minus1 2] It succeeds and reduces

the domains of 1199091 and 1199093 to [0 2] and [0 2] respectivelyThen DVO determines the next variable to be instantiated asshown in Table 5 with the head of the queue (1199091) highlightedin bold

1 is selected for 1199091 after IDC MPC checks whether1199091 [1 1] 1199092 [minus1 minus1] and 1199093 [0 2] works It suc-ceeds and in the same manner 1199093 is assigned 1 Finally⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is checked by MPC to be suitablefor Path3 No variable needs to be permutated and BFS-BBsucceeds with the test data ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ Table 6shows how the domains of variables are changed duringthe search process The changed domains are highlighted inbold The changes listed in the fourth column are owing tovariable assignments according to the results of IDC andthe changes listed in the fifth column are owing to domainreduction by MPC checks The process of generating the testdata ⟨1199091 1⟩ ⟨1199092 minus1⟩ ⟨1199093 1⟩ is presented as the search treein Figure 4 It is a backtrack-free search that accounts for anextremely large proportion in the implementation of BFS-BBEach variable consumes one MPC check in the state spacesearch stage and the initial values of each variable make thesolution The solution path is shown by the bold arrows

6 Experimental Results and Discussion

To observe the effectiveness of BFS-BB we carried out a largenumber of experiments in CTS Within the CTS frameworkthe PUT is automatically analyzed and its basic informationis abstracted to generate its CFG According to the specifiedcoverage criteria the paths to be traversed are generated andprovided for BFS-BB as input The generated test data willbe used for mutation testing that requires a high coverageideally 100 [37] This is a challenge for test data generation

The experiments were performed in the environment ofMSWindows 7with 32 bits Pentium4with 28GHz and 2GBmemory The algorithms were implemented in Java and runon the platformof eclipseThe experiments include two partsSection 61 presents the performance evaluation of BFS-BBand Section 62 tests the capability of BFS-BB to generate testdata in terms of coverage and makes comparisons with somecurrently existing static and dynamic methods

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 11

Table 3 PTC process for 1199091 1199092 and 1199093

Branchingcondition

Basic functions andcorresponding monotonicity

Monotonicity ofbranching conditions Weight Path weight Path tendency

1199091 minus 1199092 gt 0

119891(1199091) = 1199091 minus 1199092 increasing119891(1199092) = 1199091 minus 1199092 decreasing119891(1198871) = 1198871 gt 0 increasing

Br(1199091) increasingBr(1199092) decreasing

1199081 = 05

1199082 = minus051199011199081 = 05

1199011199082 = minus1

1199011199083 = 15

⟨1199091 positive⟩⟨1199092 negative⟩⟨1199093 positive⟩1199093 minus 1199092 gt 0

119891(1199092) = 1199093 minus 1199092 decreasing119891(1199093) = 1199093 minus 1199092 increasing119891(1198872) = 1198872 gt 0 increasing

Br(1199092) decreasingBr(1199093) increasing

1199082 = minus05

1199083 = 05

3 lowast 1199093 ge minus5119891(1199093) = 3 lowast 1199093 increasing119891(1198873) = 1198873 ge minus5 increasing Br(1199093) increasing 1199083 = 1

Table 4 DVO process for 1199091 1199092 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 4 |1198632| = 4 |1198633| = 4 Yes (all three have the same domain size)

x2 rarr 1199091 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199092) = 1 Rank 1(1199093) =infin Yes (1199091 and 1199092 both have Rank 1)Rank 2 Rank 2(1199091) =infin Rank 2(1199092) = 2 No (1199092 has Rank 2 while 1199091 has infinity)

Table 5 DVO process for 1199091 and 1199093

Ordering rule Condition for each variable Tie encountered Ordering resultDomain size |1198631| = 3 |1198633| = 3 Yes (both have the same domain size) x1 rarr 1199093Rank 1 Rank 1(1199091) = 1 Rank 1(1199093) =infin No (1199091 has Rank 1 while 1199093 has infinity)

Table 6 Domain changes in the search process

Stage Function Before IDC After IDC and before MPC After MPC

Initialization Initial domain reduction mdash 1199091 [minus2 2] 1199092 [minus2 2]1199093 [minus2 2]

x1 [minus1 2] x2 [minus2 1]x3 [minus1 2]

State space search

MPC check when 1199092 isassigned minus1

1199091 [minus1 2] 1199092 [minus2 1] 1199093[minus1 2]

1199091 [minus1 2] x2 [minus1 minus1]1199093 [minus1 2]

x1 [0 2] 1199092 [minus1 minus1]x3 [0 2]

MPC check when 1199091 isassigned 1

1199091 [0 2] 1199092 [minus1 minus1] 1199093[0 2]

x1 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [0 2]

MPC check when 1199093 isassigned 1

1199091 [1 1] 1199092 [minus1 minus1] 1199093[0 2]

1199091 [1 1] 1199092 [minus1 minus1]x3 [1 1]

1199091 [1 1] 1199092 [minus1 minus1]1199093 [1 1]

void test(int x1 int x2 int x3) void test(int x1 int x2 int x3)

if (x1-x2lt=0)

printf(ldquoPath1rdquo)

else if(x3-x2lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

else if(3lowastx3+5gt=0)

int b1=x1-x2

printf(ldquoPath1rdquo)

if (b1lt=0)

printf(ldquoPath2rdquo)

printf(ldquoPath3rdquo)

if(b2lt=0)

if(b3gt=-5)

else int b2=x-3x2

elseint b3=3lowastx3

Figure 3 Overview of our approach for searching the test data

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

12 Mathematical Problems in Engineering

0 1 2

DVO

MPC

0 1 2MPC

0 1 2MPC

DVO

DVOIDC

PTC

MPC

MPC

IDC

IDC

minus1

minus1

minus1

minus2

minus2

minus2

x1 minus x2 gt 0x3 minus x2 gt 03x3 + 5 ge 0

x1 x2 x3 isin minus2 minus1 0 1 2

x2x2x2x2

x3x3x3

x1 x1 x1

radic

Figure 4 The search tree of generating the test data for Path3 using BFS-BB

61 Performance Evaluation The number of relevant vari-ables is an important factor that affects the performanceof BFS-BB so in this part experiments were carried out toevaluate the performance of BFS-BB for varying numbers ofinput variables To be specific our major concern is (1) therelationship between the number of MPC checks (exclusiveof the one taken in the initialization stage) and the numberof relevant variables (2) the relationship between genera-tion time and the number of relevant variables This wasaccomplished by repeatedly running BFS-BB on generatedtest programs having input variables 119909

1 1199092 119909

119899where 119899

varied from 1 to 50 Adopting statement coverage in eachtest the program contained five if statements (equivalent tofive branching conditions along the path for MPC check)and there was only one path to be traversed of fixed lengthwhich was the one consisting of entirely true branches(TTTTT) that is all the branching conditions are the sameas the corresponding predicates Considering the relationshipbetween variables experiments involving two situations wereconducted that (1) the variables are all independent of eachother and (2) the variables are linearly related in the tightestmanner Generation time varied greatly in these two cases sothe axes of generation time of both cases are normalized forsimplicity

611 Variables Are All Independent of Each Other The pred-icate of each if statement is an expression in the form of

11988611199091relop

1const [1] and 119886

21199092relop

2119888119900119899119904119905 [2]

and sdot sdot sdot and 119886119899119909119899relop

119899119888119900119899119904119905 [119899]

(6)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop119894(119894 = 1 2 119899) isin gt ge lt le =

= and const is an array of randomly generated constantsThe randomly generated 119886

119894and 119888119900119899119904119905[119894] should be selected

to make the path feasible This arrangement constructs arelationship that all the variables are independent of eachother but all of them are relevant to the path The programsfor various values of n ranging from 1 to 50 were each tested50 times and the number of MPC checks and time required

to generate the data for each test were recorded The resultscan be seen from Figures 5 and 6

Figure 5 shows the relationship between the number ofMPC checks and the number of variables (119899) for variablesthat are all independent of each other and from (a) to (d)represent four different situations marked by the ordinatesIt can be seen that since the relation in formula (6) is thesimplest one between variables the number of MPC checksremains linearly increasing with the number of variables nomatter in which situation from (a) to (d) 119910 = 119909means thatfor this kind of constraint one relevant variable requires onlyone MPC check It also can be seen that 1198772 = 1 in all thefour situations and the number of MPC checks increasescompletely linearly with the number of variables

Figure 6 shows the relationship between generation timeand the number of variables (119899) for variables that are allindependent of each other and from (a) to (d) represent fourdifferent situations marked by the ordinates It can be seenthat generation time increases approximately linearlywith thenumber of variables and the linear correlation relationship issignificant at 95 confidence level with 119875 value far less than005 By the increase of the number of variables generationtime increases at an even speed The minimum value can becommendably represented as a straight line showing that itis the most ideal in the four situations with a larger value of1198772 Variations between tests with the same values of 119899 were

attributed to the randomness in the difference in the selectionof the initial values

612 Variables Are Linearly Related in the Tightest MannerThe predicate of each if statement is a linear combination ofall the 119899 variables in the form of

[1198861 1198862 119886

119899] [1199091 1199092 119909

119899]1015840relop119888119900119899119904119905 [119888] (7)

where 1198861 1198862 119886

119899are randomly generated numbers either

positive or negative relop isin gt ge lt le = = and 119888119900119899119904119905[119888](119888 isin 1 2 3 4 5) is an array of randomly generatedconstants The randomly generated 119886

119894and 119888119900119899119904119905[119888] should be

selected to make the path feasible This arrangement con-structs the tightest linear relation between the variables all

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 13

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = x

R2 = 1

(a)

Number of variables0 20 40 60

0

20

40

60

Aver

age n

umbe

r of M

PC ch

ecks

y = x

R2 = 1

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(c)

Number of variables0 20 40 60

0

20

40

60

Min

imum

num

ber o

f MPC

chec

ks y = x

R2 = 1

(d)

Figure 5 Relationship between the number of MPC checks and the number of variables for variables that are all independent of each other

of which are relevant to the path The programs for variousvalues of 119899 ranging from 1 to 50 were each tested 50 times andthe number ofMPC checks and time required to generate thedata for each test were recordedThe results can be seen fromFigures 7 and 8

Figure 7 shows the relationship between the number ofMPC checks and the number of variables (119899) for variables thatare the tightest linearly related and from (a) to (d) representfour different situations marked by the ordinates It can beseen that the number of MPC checks remains approximatelylinearly increasing with the number of variables and thefitting curves are all near 119910 = 119909 The linear correlationrelationship is significant at 95 confidence level with 119875-value far less than 005 The general average and maximumnumbers ofMPC checks are all larger than those in the exper-iment for variables that are all independent of each otherbecause the relation in formula (7) is the tightest linear onebetween variablesTheminimumnumber ofMPC checks canbe completely represented as 119910 = 119909 with 1198772 = 1 whichmeans that theminimumnumber is themost ideal in the foursituations

Figure 8 shows the relationship between generation timeand the number of variables (119899) for variables that are

the tightest linearly related and from (a) to (d) represent fourdifferent situationsmarked by the ordinates It is clear that therelation between generation time and the number of variablescan be commendably represented as a quadratic curve andthe quadratic correlation relationship is significant at 95confidence level with 119875-value far less than 005 The betterfitting curves of average and minimum generation timesshow that average generation time is perfectly stable andminimum generation time is still the most ideal Variationsbetween tests with the same values of n were attributed to therandomness in (1) the difference in the selection of the initialvalues and (2) the difference in the expressions along the path(an equality relational operator will generally require morecalculation than an inequality relational operator) Besidesgeneration time increases at a uniformly accelerative speed bythe increase of the number of variables Take (b) for examplethe differentiation of average generation time indicates thatits increase rate rises by 119910 = 9994119909 minus 7734 as the number ofvariables increases We can roughly draw the conclusion thatgeneration time is very close for 119899 ranging from 1 to 8 whileit begins to increase when 119899 is larger than 8

The above cases are both completely backtrack-freesearch owing to the linear correlation relationship between

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

14 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 1471x + 3161

R2 = 0899

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 1977x minus 3127

R2 = 0987

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time

y = 1636x + 1175

R2 = 0862

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 2073x minus 3655

R2 = 0997

(d)

Figure 6 Relationship between generation time and the number of variables for variables that are all independent of each other

the number of MPC checks and the number of relevantvariables Surely they cannot include all the relations betweenvariables in engineering so the analyses in this part arejust from the theoretic perspective The real-world PUTs aremuch more complex Whatrsquos more 50 tests were conductedfor each case of n ranging from 1 to 50 So the results from thesamples can only approximate the actual situation But it canbe concluded that BFS-BB functions are stably given a PUTof regular structure which lays a solid foundation for itsapplication in engineering

62 Coverage Evaluation To evaluate the capability of BFS-BB to generate test data in terms of coverage four experi-ments were carried out The first involves the testing with abenchmark used in CTS the second aims at generating testdata for a project in engineering the third compares BFS-BBwith a static method and the last compares it with dynamicmethods

621 Testing a Benchmark in CTS In this part test datawere automatically generated to meet three coverage criteriawhich were statement branch and MCDC The test bedwas branch boundc a benchmark in CTS with 402 LOC

29 input variables and complex structure trying to includemore content that might appear in engineering119898 was set 10for each variable as the upperbound of the number of MPCchecks so it can be estimated that the simplest backtrackingwill consume at least 11 MPC checks for the variable inquestion

The result is shown in Table 7 The numbers of paths wasdifferent owing to different coverage criteria adopted BFS-BB was able to generate test data for all the feasible pathsno matter which coverage criterion was taken The MCDCcoverage did not reach 100 because it is relatively strictand difficult to meet and subsumes statement and branchcoverage [38] But tolerable coverage was achieved withintolerable timeThere exists a trade-off between efficiency andsuccess rate IVR had no significant influence on coveragebut it did on generation time Generation time after IVRwas much less than that without IVR Note that the amountof generation time reduced by IVR is determined by the struc-ture of the PUT The numbers of generation time reductionin Table 7 are only related to the program branch boundcOur following analyses all concern BFS-BB with IVR Aver-age number of MPC checks per relevant variable adoptingstatement coverage was larger than the results in Section 61

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 15

0 20 40 600

20

40

60

Number of variables

Num

ber o

f MPC

chec

ks

y = 0994x + 0221

R2 = 0997

(a)

0 20 40 600

20

40

60

Number of variables

Aver

age n

umbe

r of M

PC ch

ecks

y = 0996x + 0309

R2 = 0999

(b)

0 20 40 600

20

40

60

Number of variables

Max

imum

num

ber o

f MPC

chec

ks y = 1014x + 2030

R2 = 0985

(c)

0 20 40 600

20

40

60

Number of variables

Min

imum

num

ber o

f MPC

chec

ks

y = x

R2 = 1

(d)

Figure 7 Relationship between the number of MPC checks and the number of variables for variables that are linearly related in the tightestmanner

Table 7 Coverage achieved by BFS-BB on branch boundc

Coverage criterion Number of paths Average coverage Generation time reducedby IVR

Average MPC checks perrelevant variable

Statement 61 100 34 134Branch 119 100 37 173MCDC 125 98 42 229

because there are some library functions as well as nonlinearconstraints in the PUT which require moreMPC checks Butfrom all the average values of the number of MPC checks itcan be concluded that the tests for all the three coverage crite-ria were basically backtrack-free and not all the tests includethe bisection operation

622 Testing Programs from a Project in Engineering In thispart seven programs were selected from the project de118i-2 at httpwwwmoshiernet as the test beds each of whichcontains several functions or loops Three coverage criteriawere adopted which were statement branch and MCDCThe information of the programs and the coverage results areshown in Table 8

From Table 8 it can be seen that BFS-BB performedencouragingly for programs with complex structure in engi-neering Some of the tests adoptingMCDC coverage did notreach 100 resulting from the fact thatMCDC is a relativelystrict criterion But all the tests were basically backtrack-freewith a high coverage proving its effectiveness in engineering

623 Comparison with a Static Method This part presentsthe results from an empirical comparison of BFS-BB withthe static method [13] (denoted as ldquomethod 1rdquo to avoidverbose description) which was implemented in CTS priorto BFS-BB The details of the test beds are shown in Table 9The comparison adopted three coverage criteria statementbranch and MCDC And the result is shown in Table 10

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

16 Mathematical Problems in Engineering

0 20 40 600

5000

10000

15000

Number of variables

Gen

erat

ion

time

y = 4318x2 minus 6684x + 3655

R2 = 0975

(a)

0 20 40 600

5000

10000

15000

Number of variables

Aver

age g

ener

atio

n tim

e

y = 4997x2 minus 7734x + 4134

R2 = 0991

(b)

0 20 40 600

5000

10000

15000

Number of variables

Max

imum

gen

erat

ion

time y = 4151x2 minus 2923x + 7107

R2 = 0907

(c)

0 20 40 600

5000

10000

15000

Number of variables

Min

imum

gen

erat

ion

time

y = 5455x2 minus 9565x + 5030

R2 = 0994

(d)

Figure 8 Relationship between generation time and the number of variables for variables that are linearly related in the tightest manner

Table 8 Test result achieved by BFS-BB on programs from de118i-2

Program LOC Number ofvariables

Number offunctions

Number ofloops Statement coverage Branch coverage MCDC coverage

rungec 1626 8 2 8 100 100 89oblatec 581 14 4 4 100 100 94jplmpc 996 6 5 5 100 100 88floorlc 432 7 5 2 100 100 92atanlc 267 3 2 0 100 100 100adams4c 767 28 10 12 100 100 85tanlc 256 4 3 0 100 100 100

Table 9 Programs used for comparison with method 1

Program LOC Number of branches Number of variables Description Source

Bonus 29 10 1 To calculate bonus according toprofit [13]

Days 33 17 3 To calculate the day numberwithin a year [13]

Statistics 21 8 5 To count the number of everykind of characters [13]

gcd 38 5 2 To calculate greatest commondenominator [14]

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 17

Table 10 Comparison result with method 1 using three coverage criteria

Program Coverage criterion Number of paths Average coverage bymethod 1

Average coverage byBFS-BB

BonusStatement 6 25 100Branch 6 37 100MCDC 6 30 100

DaysStatement 17 100 100Branch 17 100 100MCDC 14 94 100

StatisticsStatement 4 100 100Branch 5 100 100MCDC 11 82 100

gcdStatement 3 85 100Branch 3 77 100MCDC 5 75 100

Table 11 Programs used for comparison with GA and SA

Program LOC Number of branches Number of variables Description SourceTriangle type 31 3 5 To classify type of a triangle [15]

Valid date 59 16 3 To check whether a date is validor not [15]

Cal day 72 11 3 To calculate the day of the week [16]

Cal 53 18 5To calculate the number of daysbetween the two given days in

the same year[17]

Since interval arithmetic has been improved in CTS forthe sake of validity test data were generated for all the testbeds on the same foundation of interval arithmetic It can beseen that BFS-BB reached 100 coverage for all the test bedsusing three coverage criteria while method 1 did not That islargely due to the heuristicmethods utilized in BFS-BBThereare modulus operations in the program days which had beendifficult for the interval arithmetic in method 1 to handle Butthe functionality of interval arithmetic has been improved byadding a library of inverse functions so it was not so difficulteven for method 1

624 Comparison with DynamicMethods This part presentsresults from an empirical comparison of BFS-BB with twodynamic methods which are genetic algorithm (GA) andsimulated annealing (SA) on four different benchmark pro-grams using branch coverage The details of the benchmarkprograms are shown in Table 11 In order to obtain unbiasedexperimental results we made a number of experiments withdifferent parameter settings and chose the one that GA andSA performed the best as shown in Table 12 Test data wereautomatically generated for each program using GA SA andBFS-BB with each tested 100 timesThe average coverage wasrecorded to make the comparison and the result was pre-sented in Table 13

It can be seen that BFS-BB reached 100 coverage onall the four benchmark programs which are rather simpleprograms for BFS-BB and it outperformed the algorithms

in comparison The better performance of BFS-BB is due tothree factorsThe first is that the initial values of variables areselected by heuristics on the path so BFS-BB reaches a rela-tively high coverage for the first round of the searchThe sec-ond is that MHS crashes on several occasions due to the iter-ation exception while the probability of aborting is quite lowfor BFS-BB because it has no demand for iteration The thirdis thatMPC is checked not only in the state space search stagebut also in the initialization stage which reduces the domainsof the variables to ensure a relatively small search spacethat follows

7 Conclusion

This paper presents an intelligent algorithm best-first-searchbranch and bound (BFS-BB) for path-wise test data gener-ation (Q) The problem Q is reformulated as a constraintoptimization problem (COP) and two techniques from arti-ficial intelligence are introduced to tackle the COP which arestate space search and branch and bound (BB) The formeris used to construct the search tree dynamically and thelatter is used as the search method Heuristics are adoptedin the look-ahead stage Dynamic variable ordering (DVO)is presented with a heuristic rule to break ties Maintainingpath consistency (MPC) is achieved through analysis on theresult of interval arithmetic The monotonicity analysis onbranching conditions is applied both in the selection of initialvalues by path tendency calculation (PTC) and initial domain

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

18 Mathematical Problems in Engineering

Table 12 Parameter setting for GA and SA

Item Parameter Value

Common issues Population size 30Number of max generations 100

GASelection strategy Roulette wheel

Crossover probability 090Mutation probability 005

SA Initial temperature 100Cooling coefficient 095

Table 13 Comparison result with GA and SA using branch coverage

Program Number of paths Average coverage by GA Average coverage by SA Average coverage byBFS-BB

Triangle type 6 95 9988 100Valid date 5 9995 9821 100Cal day 20 9631 9997 100Cal 7 9902 9927 100

calculation (IDC) and in the selection of other values bybisection when MPC encounters a conflict An optimizationmethod irrelevant variable removal (IVR) is also proposedto reduce the search space Empirical experiments wereconducted to evaluate the performance of BFS-BBThe resultsshow that it searches in a basically backtrack-free mannergenerates test data on programs of complex structure withpromising performance and outperforms some currentlyexisting static and dynamic methods in terms of coverageThe application of BFS-BB in engineering proves its effective-ness From the perspective of BFS-BB the heuristics used inthe look-ahead search are counterproductive to the look-backsearch

Our future research will involve how to generate testdata to reach high coverage with more types of constraintsso as to give scalability to BFS-BB We will also study howcoverage criteria generation approach and system structurejointly influence test effectiveness The effectiveness of thegeneration approach continues to be our primary workParticularly theMCDCcoverage criterionwill be givenmoreemphases

Conflict of Interests

The authors declare no competing financial interests

Acknowledgments

This work was supported by the National Grand Funda-mental Research 863 Program of China (no 2012AA011201)the National Natural Science Foundation of China (no61202080) the Major Program of the National NaturalScience Foundation of China (no 91318301) and the OpenFunding of State Key Laboratory of Computer Architecture(no CARCH201201)

References

[1] M R Lyu S Rangarajan and A P A Van Moorsel ldquoOptimalallocation of test resources for software reliability growthmodeling in software developmentrdquo IEEE Transactions onReliability vol 51 no 2 pp 183ndash192 2002

[2] Z Xiaonan Y Junfeng D Siliang and H Shudong ldquoA newmethod on software reliability predictionrdquoMathematical Prob-lems in Engineering vol 2013 Article ID 385372 8 pages 2013

[3] B Beizer Software Testing Techniques Van Nostrand ReinholdNew York NY USA 2nd edition 1990

[4] E J Weyuker ldquoEvaluation techniques for improving the qualityof very large software systems in a cost-effective wayrdquo Journal ofSystems and Software vol 47 no 2 pp 97ndash103 1999

[5] B W Kernighan and P J PlaugerThe Elements of ProgrammingStyle McGraw-Hill New York NY USA 1982

[6] J C King ldquoSymbolic execution and program testingrdquo Commu-nications of the Association for ComputingMachinery vol 19 no7 pp 385ndash394 1976

[7] S Person G Yang N Rungta and S Khurshid ldquoDirected incre-mental symbolic executionrdquo in Proceedings of the 32nd ACMSIGPLAN Conference on Programming Language Design andImplementation (PLDI rsquo11) vol 47 no 6 pp 504ndash515 NewYorkNY USA June 2011

[8] T Hickey Q Ju and M H Van Emden ldquoInterval arithmeticfromprinciples to implementationrdquo Journal of the ACM vol 48no 5 pp 1038ndash1068 2001

[9] R E Moore R B Kearfott and M J Cloud Introduction toInterval Analysis Society for Industrial and AppliedMathemat-ics Philadelphia Pa USA 2009

[10] R A DeMillo and A J Offutt ldquoConstraint-based automatic testdata generationrdquo IEEE Transactions on Software Engineeringvol 17 no 9 pp 900ndash910 1991

[11] A Gotlieb B Botella and M Rueher ldquoAutomatic test data gen-eration using constraint solving techniquesrdquo in Proceedings ofthe ACMSIGSOFT International Symposiumon Software Testingand Analysis pp 53ndash62 1998

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 19

[12] C Cadar D Dunbar and D R Engler ldquoKLEE unassistedand automatic generation of high-coverage tests for complexsystems programsrdquo in Proceedings of USENIX Symposium onOperating SystemsDesign and Implementation (OSDI rsquo08) vol 8pp 209ndash224

[13] W Yawen G Yunzhan andXQing ldquoAmethod of test case gen-eration based on necessary interval setrdquo Journal of Computer-Aided Design amp Computer Graphics vol 25 no 4 pp 550ndash556 2013

[14] A Bouchachia ldquoAn immune genetic algorithm for softwaretest data generationrdquo in Proceedings of the 7th InternationalConference on Hybrid Intelligent Systems (HIS rsquo07) pp 84ndash89September 2007

[15] M Chengying Y Xinxin and C Jifu ldquoGenerating test casefor structural testing based on ant colony optimizationrdquo inProceedings of the 12th International Conference on QualitySoftware (QSIC rsquo12) pp 98ndash101 2012

[16] E Alba and F Chicano ldquoObservations in using parallel andsequential evolutionary algorithms for automatic software test-ingrdquo Computers and Operations Research vol 35 no 10 pp3161ndash3183 2008

[17] P Ammann and J Offutt Introduction to Software TestingCambridge University Press New York NY USA 2008

[18] S Ali L C Briand H Hemmati and R K Panesar-WalawegeldquoA systematic review of the application and empirical investiga-tion of search-based test case generationrdquo IEEE Transactions onSoftware Engineering vol 36 no 6 pp 742ndash762 2010

[19] S Chayanika S Sabharwal and R Sibal ldquoA survey on softwaretesting techniques using genetic algorithmrdquo International Jour-nal of Computer Science Issues vol 10 no 1 pp 381ndash393 2013

[20] N Tracey J Clark and K Mander ldquoAutomated program flawfinding using simulated annealingrdquo in Proceedings of the ACMSIGSOFT International Symposium on Software Testing andAnalysis (ISSTA rsquo98) vol 23 no 2 pp 73ndash81 1998

[21] A Sakti Y G Gueheneuc and G Pesant ldquoCSBT constrainedsearch-based test data generation for softwarerdquo in Proceedingsof the 18th International Conference on Principles and Practice ofConstraint Programming (CP rsquo12) p 55 2012

[22] M J Gallagher and V L Narasimhan ldquoAdtest a test datageneration suite for ada software systemsrdquo IEEE Transactionson Software Engineering vol 23 no 8 pp 473ndash484 1997

[23] L-L Wang and W-H Tsai ldquoOptimal assignment of task mod-uleswith precedence for distributed processing by graphmatch-ing and state-space searchrdquo BIT Numerical Mathematics vol28 no 1 pp 54ndash68 1988

[24] K L McMillan and D K Probst ldquoA technique of state spacesearch based on unfoldingrdquo Formal Methods in System Designvol 6 no 1 pp 45ndash65 1995

[25] L Gao S K Mishra and J Shi ldquoAn extension of branch-and-bound algorithm for solving sum-of-nonlinear-ratios problemrdquoOptimization Letters vol 6 no 2 pp 221ndash230 2012

[26] E I Goldberg L P Carloni T Villa and R K BraytonldquoNegative thinking in branch-and-bound the case of unatecoveringrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 19 no 3 pp 281ndash294 2000

[27] D Frost and R Dechter ldquoLook-ahead value ordering for con-straint satisfaction problemsrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 1 pp 572ndash5781995

[28] R EMoore Interval Arithmetic and Automatic Error analysis indigital computing [PhD thesis] Stanford University 1962

[29] R E Moore Interval Analysis Prentice-Hall Englewood CliffsNJ USA 1966

[30] R EMooreMethods andApplications of Interval Analysis vol 2of Society for Industrial and Applied Mathematics PhiladelphiaPa USA 1979

[31] P McMinn ldquoSearch-based software test data generation a sur-veyrdquo Software Testing Verification and Reliability vol 14 no 2pp 105ndash156 2004

[32] A Petcu and B Faltings ldquoA scalable method for multiagentconstraint optimizationrdquo in Proceedings of the InternationalJoint Conference on Artificial Intelligence vol 5 pp 266ndash2712005

[33] P J Modi W-M Shen M Tambe and M Yokoo ldquoAdoptasynchronous distributed constraint optimization with qualityguaranteesrdquo Artificial Intelligence vol 161 no 1-2 pp 149ndash1802005

[34] D Szer F Charpillet and S Zilberstein ldquoMAAlowast a heuris-tic search algorithm for solving decentralized POMDPsrdquo inProceedings of the 21st Conference on Uncertainty in ArtificialIntelligence (UAI rsquo05) pp 576ndash583 July 2005

[35] DDelling A VGoldberg I Razenshteyn andR F FWerneckldquoExact Combinatorial Branch-and-Bound forGraph Bisectionrdquoin Proceedings of the Workshop on Algorithm Engineering andExperiments (ALENEX rsquo12) pp 30ndash44 2012

[36] X Chen and P van Beek ldquoConflict-directed backjumpingrevisitedrdquo Journal of Artificial Intelligence Research vol 14 pp53ndash81 2001

[37] R Baker and I Habli ldquoAn empirical evaluation of mutationtesting for improving the test quality of safety-critical softwarerdquoIEEE Transactions on Software Engineering vol 39 no 6 pp787ndash805 2013

[38] A RajanMWWhalen andM P E Heimdahl ldquoDistinguishedpaper the effect of program and model structure on MCDCtest adequacy coveragerdquo in Proceedings of the 30th InternationalConference on Software Engineering (ICSE rsquo08) pp 161ndash170May2008

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of