Compiler Based Approach To Estimate Software Reliability Using Metrics

VIPIN KUMAR K S
Dept. of Computer Science & Engg
Rajiv Gandhi Institute of Technology
Kottayam, Kerala, India
[email protected]

ALBIN ABRAHAM ITTY
Dept. of Computer Science & Engg
Rajiv Gandhi Institute of Technology
Kottayam, Kerala, India
[email protected]

ARJUNLAL B
Dept. of Computer Science & Engg
Rajiv Gandhi Institute of Technology
Kottayam, Kerala, India
[email protected]

Dr. SHEENA MATHEW
School of Engineering, CUSAT
Cochin, Kerala, India

Abstract—This paper discusses a compiler-based approach to estimating the reliability of object-oriented programs. In this approach, automated analysis of the code increases the efficiency of analysis and of complexity estimation. Different aspects of an object-oriented program that contribute to the complexity of the developed software, and thereby reflect upon its reliability, are estimated from the developed program. A purely code-based analysis, on the other hand, would require parsing the program several times, once each time some information is required. In contrast to other approaches, this work uses a compiler-based approach for the estimation of reliability.

Keywords-reliability, analysis, object oriented, metrics

I. INTRODUCTION

Software now controls banking systems, all forms of telecommunications, process control in nuclear plants and factories, as well as defense systems. Society has developed an extraordinary dependence on software. There are many well-known cases of tragic consequences of software failures. Even in popular software packages used every day, a very high degree of reliability is needed, because the enormous investment of the software developer is at stake.

Software reliability [1] is the probability of failure-free software operation for a specified period of time in a specified environment. Software reliability is also an important factor affecting system reliability. All programs must be tested and debugged until sufficiently high software reliability is achieved. Software reliability differs from hardware reliability in that it reflects design perfection rather than manufacturing perfection. The high complexity of software is the major contributing factor to software reliability problems.

Software must be released at some point in time; further delay will cause unacceptable loss of revenue and market share. The developer must take a calculated risk and must have a strategy for achieving the required reliability by the target release date. In the recent past, enough data has become available to develop and evaluate methods for achieving high reliability. Developing reliable software has become an engineering discipline rather than an art.

Software failures may be due to errors, ambiguities, oversights or misinterpretation of the specification that the software is supposed to satisfy, carelessness or incompetence in writing code, inadequate testing, incorrect or unexpected usage of the software, or other unforeseen problems. While it is tempting to draw an analogy between software reliability and hardware reliability, software and hardware have basic differences in their failure mechanisms. Hardware faults are mostly physical faults, while software faults are design faults, which are harder to visualize, classify, detect, and correct. Design faults are closely related to fuzzy human factors and the design process, of which we do not have a solid understanding. In software, there is hardly a strict counterpart to the hardware manufacturing process, unless the simple action of uploading software modules into place counts. Therefore, the quality of software does not change once it is uploaded into storage and starts running. Trying to achieve higher reliability by simply duplicating the same software modules will not work, because design faults cannot be masked off by voting.

Failure: a departure of the system behavior from user requirements during execution.

Defect (or fault): an error in system implementation that can cause a failure during execution.

A defect will cause a failure only when the erroneous code is executed, and the effect is propagated to the output. The testability of a defect is defined as the probability of detecting it with a randomly chosen input. Defects with very low testability can be very difficult to detect and this drastically affects reliability.

Some mathematical concepts are applicable to both software and hardware reliability. Hardware faults often occur due to aging. Combined with manufacturing variation in the quality of identical hardware components, the reliability variation can be characterized as an exponential decay with time. On the other hand, software reliability improves during testing as bugs are found and removed. Once released, the software reliability is fixed. The software will fail from time to time during operational use when it cannot respond correctly to an input. Reliability of hardware components is often estimated by collecting failure data for a large number of identical units. For a software system, its own past behavior is often a good indicator of its reliability, even though data from other similar software systems can be used for making projections. In this proposed work, we analyze the reliability of programs written in an object-oriented language. For the implementation of this system, we chose C++ programs as input.

II. SOFTWARE METRIC BASED APPROACH FOR RELIABILITY CALCULATION

No good way has been devised so far to quantify software reliability, and hence the hunt for quantifiers that measure software reliability still goes on. The difficulty in measuring software reliability lies in the fact that we do not have lucid definitions of the various aspects that constitute the reliability of a piece of software.

As we cannot measure software reliability directly, various software reliability metrics[2], that reflect the characteristics of software reliability, have been formulated. The values of these metrics for a given program can be used to estimate its reliability.

Two types of metrics can be used for software development: product metrics and process metrics. Product metrics enumerate the characteristics of the software, whereas process metrics enumerate the characteristics of the process used to develop the software. The use of metrics can be considered a key factor in determining the success of a software engineering strategy and its management.

Since we are analyzing the reliability of an object-oriented program, we make use of a more refined version of software metrics: object-oriented metrics [3].

A single metric by itself is not adequate to provide enough information on the reliability aspects of a piece of software. Hence, we decided to include 11 different metrics to account for the reliability measure of the software. They are detailed below.

A. Cyclomatic Complexity

This metric is used to find out the complexity of an algorithm. It is a count of the total number of different paths that need to be considered to test the method completely. The metric, proposed by McCabe, is a graph-theoretic concept. For a graph G with n nodes, e edges and p connected components, the cyclomatic number is

V(G) = e - n + 2p

Cyclomatic complexity can also be measured as “number of decision nodes+1”. Low Cyclomatic complexity implies more reliability. Generally, Cyclomatic Complexity for a method should be less than 10, which indicates that decisions are deferred through message passing.
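
As a minimal, hypothetical C++ illustration (the function below is invented for this sketch and is not one of the evaluated programs), the decision nodes of a method can be counted directly from its control constructs:

    #include <vector>
    #include <cstddef>

    // Decision nodes: the 'for' condition, the 'if' condition and the
    // 'else if' condition, so V(G) = 3 + 1 = 4, well below the limit of 10.
    int countBalance(const std::vector<int>& values) {
        int positives = 0, negatives = 0;
        for (std::size_t i = 0; i < values.size(); ++i) {   // decision 1
            if (values[i] > 0) {                            // decision 2
                ++positives;
            } else if (values[i] < 0) {                     // decision 3
                ++negatives;
            }
        }
        return positives - negatives;
    }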

B. Depth of Inheritance Tree (DIT)

The depth of a class within the inheritance hierarchy is the maximum length from the class node to the root of the tree, measured by the number of ancestral classes. As the depth increases, it becomes more and more complex to predict the behaviour of the class. It is preferable that DIT is less than or equal to 4.
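
A small hypothetical C++ hierarchy, given only to show how DIT is counted (the class names are invented for illustration):

    class Shape { };                        // DIT(Shape)     = 0 (root of the tree)
    class Polygon : public Shape { };       // DIT(Polygon)   = 1
    class Rectangle : public Polygon { };   // DIT(Rectangle) = 2
    class Square : public Rectangle { };    // DIT(Square)    = 3, within the preferred limit of 4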

C. Lack of Cohesion of Methods (LCOM)

It gives an account of the dissimilarity between the methods of a class. There are several LCOM metrics. LCOM takes its values in the range [0, 1]. LCOM HS (HS stands for Henderson-Sellers) takes its values in the range [0, 2]. An LCOM HS value higher than 1 should be considered alarming. The following formulas are used to compute the LCOM metrics:

LCOM = 1 - sum(MF)/(M*F)
LCOM HS = (M - sum(MF)/F)/(M - 1)

where M is the number of methods in the class (both static and instance methods are counted, including constructors, property getters/setters and event add/remove methods), F is the number of instance fields in the class, MF is the number of methods of the class accessing a particular instance field, and sum(MF) is the sum of MF over all instance fields of the class. In our implementation, we have chosen LCOM and not LCOM HS. A high value of LCOM (i.e., closer to 1) indicates poor cohesion and hence lower reliability. A value closer to 0 indicates good cohesion and hence greater reliability.
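
A hypothetical worked example of the LCOM formula on a small C++ class invented for this illustration:

    #include <string>

    class Account {                                // M = 3 methods, F = 2 instance fields
        double balance;
        std::string owner;
    public:
        void deposit(double amount)        { balance += amount; }  // uses balance
        void rename(const std::string& n)  { owner = n; }          // uses owner
        double getBalance() const          { return balance; }     // uses balance
    };
    // MF(balance) = 2 and MF(owner) = 1, so sum(MF) = 3.
    // LCOM = 1 - sum(MF)/(M*F) = 1 - 3/(3*2) = 0.5, indicating moderate cohesion.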

D. Coupling Between Object Classes (CBO)

CBO for a class is defined as the number of other classes to which it is coupled. Two classes are coupled when methods declared in one class use methods or instance variables of the other class. Excessive coupling is detrimental to modular design and prevents reuse. Coupling increases the complexity of the software and hence degrades its reliability.
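
A hypothetical C++ fragment showing how coupling is counted (all class names and bodies are invented for this sketch):

    #include <string>

    class Logger {
    public:
        void log(const std::string& message) { /* write the message to a log sink */ }
    };

    class Database {
    public:
        bool save(const std::string& record) { return !record.empty(); }  // stub body
    };

    // OrderProcessor uses methods of Logger and Database,
    // so CBO(OrderProcessor) = 2.
    class OrderProcessor {
        Logger logger;
        Database db;
    public:
        void process(const std::string& order) {
            if (db.save(order))
                logger.log("saved " + order);
        }
    };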

E. Methods per Class

It is the average number of methods per object class. It is calculated as:

Methods per class = Total number of methods/Total number of object classes.

A large number of methods per object class complicates testing due to the increased size and complexity. If the number of methods per object class gets too large, extensibility becomes difficult.

F. Executability Factor

This metric calculates the ratio of the number of executable lines of code in the program to the total number of lines of code. A low value of executability indicates that there is an excess of variable declarations or other non-operational statements.


G. Method Hiding Factor (MHF)

It is defined as the ratio of the sum of the invisibilities of all methods defined in all classes to the total number of methods defined in the system under consideration. The invisibility of a method is the percentage of the total classes from which this method is not visible.
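
As a hypothetical worked example of this definition (all numbers are invented): suppose the system has 4 classes and 10 methods in total, of which 6 are private and 4 are public. A private method is visible only inside its own class, so it is invisible from 3 of the 4 classes (invisibility 0.75); a public method is visible everywhere (invisibility 0). Then

MHF = sum of invisibilities / total number of methods = (6 * 0.75 + 4 * 0) / 10 = 0.45

A higher MHF indicates better information hiding, which Table I below rewards with "Higher the better".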

H. Attribute Hiding Factor (AHF)

It is defined as the ratio of the sum of the inherited attributes in all classes of the system under consideration to the total number of available attributes (locally defined plus inherited) for all classes.

I. Number of Catch Blocks per Class

It is defined as the ratio of the number of catch blocks in a class to the total number of classes in the program. This helps to analyze whether all the possible exceptions and their corresponding catch handlers are defined in the program. The more exceptions handled, the higher the reliability.

J. Metric on Methods (NbVariables)

It is the number of variables declared in the body of a method. Methods where NbVariables is higher than 8 are hard to understand and maintain. Methods where NbVariables is higher than 15 are extremely complex and should be split into smaller methods.

K. Metric on Methods (NbParameters)

It is the number of parameters of a method. Methods where NbParameters is higher than 5 might be painful to call and might degrade performance and reliability.

III. METRIC BASED ESTIMATION OF RELIABILITY

The software metric values are explicitly calculated, and an approximate optimal range is fixed for each one of them. Calculation of the metric values is implemented using the required data structures. We developed a tool that analyzes the code from the reliability aspect, along with syntax analysis. For this purpose, ANTLR, a compiler construction tool, was used. The software metric values and the percentage of reliability are obtained as output.

The input program is verified for its syntactical correctness using this tool, and the calculation of metric values is carried out simultaneously. The percentage of reliability is calculated as follows:

Reliability = Σ (score of each metric)

Since we did not target any specific domain or scenario, we gave equal importance to each metric. The metric score is calculated as the difference between the optimal value of that metric and the obtained value, converted to a scale of ten. Since all metrics were given equal importance, the percentage of reliability is calculated as the sum of all the metric scores, on a scale of 100.
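
The paper states this scoring rule only in prose; the following C++ sketch is one plausible reading of it, under the assumption that each of the 11 metrics contributes a score between 0 and 10 (10 when the measured value lies inside its optimal range, decreasing with the distance from that range) and that the sum is then normalized to 100. The structure, the names and the exact fall-off are assumptions made for illustration, not the tool's actual code.

    #include <algorithm>
    #include <vector>

    struct MetricResult {
        double value;   // value measured for the input program
        double low;     // lower bound of the optimal range
        double high;    // upper bound of the optimal range
    };

    // Full score of 10 inside the optimal range, decreasing linearly
    // with the distance from the nearest bound of that range.
    double score(const MetricResult& m) {
        if (m.value >= m.low && m.value <= m.high)
            return 10.0;
        double distance = (m.value < m.low) ? (m.low - m.value) : (m.value - m.high);
        double span = (m.high - m.low > 0.0) ? (m.high - m.low) : 1.0;
        return std::max(0.0, 10.0 * (1.0 - distance / span));
    }

    // Reliability percentage: sum of the per-metric scores, normalized to 100.
    double reliabilityPercent(const std::vector<MetricResult>& metrics) {
        double total = 0.0;
        for (const MetricResult& m : metrics)
            total += score(m);
        return metrics.empty() ? 0.0 : 100.0 * total / (10.0 * metrics.size());
    }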

ANTLR also produces a parse tree, which helps to identify any undesired paths taken during parsing.

The normal compilation of the object-oriented programs can be done using any common compiler (such as GCC) and bugs can be removed. The purpose of our tool is to test object-oriented programs and reveal the areas for further improvement. This can be determined by analyzing the metric values output by the compiler developed using ANTLR. We also obtain the reliability as a percentage, which tells us how reliable the program is with regard to the object-oriented aspects and how well it follows OO concepts.

We performed the testing and analysis of the object-oriented programs (C++ programs) using the tool we have built. It outputs the values of 11 different software reliability metrics for the program and a reliability percentage based on those values. We can improve the program to reach the desired reliability level by analyzing each metric.
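
The paper does not list its parser actions; the sketch below only illustrates, with invented names, the general idea of accumulating metric counters while the input is parsed. In the actual tool such counters would be updated from the semantic actions of the ANTLR grammar.

    #include <map>
    #include <string>

    // Hypothetical accumulator updated from the parser's semantic actions.
    class MetricCollector {
        std::map<std::string, int> counters;   // e.g. "decisionNodes", "catchBlocks"
    public:
        void increment(const std::string& name) { ++counters[name]; }

        // Called, for example, every time an 'if', 'for', 'while' or 'case'
        // construct is recognized while parsing a method body.
        void onDecisionNode() { increment("decisionNodes"); }

        // Cyclomatic complexity of the method parsed so far: decisions + 1.
        int cyclomaticComplexity() const {
            auto it = counters.find("decisionNodes");
            return (it == counters.end() ? 0 : it->second) + 1;
        }
    };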

The optimum values for the metrics are as follows:

TABLE I. OPTIMAL VALUES FOR METRICS

Name of the Metric                  Optimal Value
Cyclomatic Complexity               1 - 10
Depth of Inheritance Tree           0 - 4
Lack of Cohesion of Methods         Lesser the better
Coupling between object classes     0 - 4
Methods per class                   Lesser the better
Executability                       Higher the better
Method Hiding Factor                Higher the better
Attribute Hiding Factor             Higher the better
Number of catch blocks per class    Higher the better
NbVariables                         0 - 8
NbParameters                        0 - 5


The optimal values of some metrics are fixed on a scientific basis, while others were fixed for this evaluation. If the values obtained by a program for all the metrics fall within the optimal ranges, we say that the program is sufficiently reliable; clearly, its reliability percentage will then be high.

The percent of reliability can be improved by analyzing the metric values and improving the program accordingly.

A. Evaluation of a program

We will now see what our proposed compiler outputs and how the analysis of the result is done.

The compiler code is run in the ANTLR IDE. The program whose reliability is to be found out is given as input to the compiler.


Input can be given as shown here:

Figure 3.1 Sample input to the compiler

The compiler performs lexical and basic syntax analysis and produces a complete parse tree of the given program, provided the program conforms to the given grammar. Once the whole input program is parsed, we can see the parsed input program and the output of the compiler in two windows, as shown below.

Figure 3.2 Parsed Input

Figure 3.3 Output Obtained

Figure 3.4 Parse Tree

B. Analyzing the obtained results

1. The output obtained shows the different metric values and the percentage of reliability, which is 49.16 here.

2. This implies that the program should be bound more closely to object-oriented concepts, and should be reformulated so as to satisfy the optimal ranges of the above-mentioned concepts, which we have expressed in the form of metrics. Doing so, we can improve the program and thus make it more reliable.

IV. RELATED WORK

P. N. Mishra [4] proposed the use of failure data from a space shuttle software project to predict the number of failures likely during a mission. This method depends on previously acquired data, and it does not help in improving the software before it is actually implemented.

Mitsuru Ohba [5] proposed the use of reliability data obtained from test reports for quality assessment. That work proposes utilizing technical knowledge about a program, its testing and the test data to select an appropriate reliability analysis model, which helps in the quality assessment. This approach also depended largely on previously acquired data and was not a generalized concept.

Martin L. Shooman [6] proposed a micro model for reliability analysis based on the structure of the program. This approach assumes that the program is written in structured or modular form, so that decomposition into its constituent parts is simple. It does not deal with the implementation of object-oriented concepts, which is a must in today's scenario.

V. CONCLUSION

The metric values are generated by parsing the input and analyzing the various aspects of the input program, such as the use of data structures and the programming methodologies adopted. Our method analyzes the object-oriented concepts and the program-level, class-level and method-level aspects of a program to predict its reliability. This type of evaluation helps to bring out the real aspects of reliability in terms of metrics, which is evidently a more advanced approach than code reading, measuring execution time, or making approximations based on previously acquired data. The proposed method helps in determining the exact effort required, and the various metrics calculated reflect each programming aspect of the input program. The programmer can improve the program by modifying the code so that all the metric values tend toward their optimal ranges. This, we believe, is by far the best known way to calculate reliability from the software itself.

REFERENCES

[1] A.L. Goel, "Software Reliability Models: Assumptions, Limitations, and Applicability," IEEE Transactions on Software Engineering, vol. 11, no. 12, pp. 1411-1423, Dec. 1985, doi:10.1109/TSE.1985.232177

[2] S. R. Chidamber and C. F. Kemerer, "A metrics suite for object oriented design," IEEE Transactions on Software Engineering, vol. 20, no. 6, pp. 476-493, June 1994.

[3] S. M. Jamali, "Object Oriented Metrics (A Survey Approach)," Tehran, Iran, January 2006.

[4] P. N. Mishra, "Software Reliability Analysis," vol. 22, no. 3, pp. 262-270, September 1983.

[5] M. Ohba, "Software Reliability Analysis Models," IBM Journal of Research and Development, vol. 28, no. 4, pp. 428-443, July 1984.

[6] M. L. Shooman, "Structural Models for Software Reliability Prediction," in Proceedings of the 2nd International Conference on Software Engineering (ICSE '76), 1976.