Helsinki University of Technology
System Analysis Laboratory
Mat-2.108 Independent Research Projects in Applied Mathematics
A Pragmatic Study on Cost Estimation in Software
Platform Based Product Development Model
Jukka Lehmusvirta
In Salo, October 31, 2004
Supervisor: Pertti Laininen, Dr.Tech., Helsinki University of Technology
A Pragmatic Study on Cost Estimation in Software Platform Based Product Development Model Jukka Lehmusvirta, Mat-2.108 Independent Research Projects in Applied Mathematics, HUT, SAL 2004.
Contact Information: Jukka Lehmusvirta Nokia Enterprise Solutions Joensuunkatu 7, FIN-24100 SALO P.O.BOX 86 Mobile +358504821010 Fax +358718044610 Email: [email protected]
Table of Contents
TABLE OF CONTENTS
DEFINITIONS
1. INTRODUCTION
1.1. CASE ORGANIZATION
1.2. BACKGROUND AND MOTIVATION FOR THE STUDY
1.3. RESEARCH OBJECTIVES
1.4. SCOPE OF THE STUDY
2. PRODUCT-LINE BASED TERMINAL SOFTWARE DEVELOPMENT
2.1. CHANGE IN SOFTWARE DEVELOPMENT AND DELIVERY MODEL
2.2. CONCLUSION: ATTRIBUTABLE CHANGES IN SOFTWARE DEVELOPMENT
3. APPROACHES FOR SOFTWARE COST ESTIMATION
3.1. SOFTWARE COST ESTIMATION IN GENERAL
3.2. EXPERT-BASED APPROACHES
3.3. ALGORITHMIC APPROACHES
4. ESTIMATING ERROR AMOUNTS FOR RESOURCE ALLOCATION
4.1. ANALYSIS OF FINDINGS FROM LITERATURE
4.2. ESTIMATING ERROR USING REQUIREMENTS AND CHANGE ORDERS
4.3. DATA REFINEMENT AND CLASSIFICATION
4.4. RESULTS
5. SUMMARY AND CONCLUSIONS
6. REFERENCES
6.1. BOOKS
6.2. PAPERS AND ARTICLES
6.3. INTERNET
6.4. INTERVIEWEES
Definitions
The definitions presented below apply only for the purposes of this study; in the literature their
meanings may differ. Counterparts specific to the case organization are used in the text for
readability. The list is ordered by order of appearance in the text.
Product program
(Product engineering unit1)
An organizational unit responsible for new product development.
Encompasses the management team, interdisciplinary units and persons
responsible for interfacing with different functional organizations.
Software Platform
Organization
An organizational unit that owns a certain software platform (software
product-line) and is responsible for component and subsystem
deliveries to software streams or directly to product programs.
Software Streams An organizational unit that is responsible for final integration and
testing of software components, and their delivery to product programs.
Software Platform
(Software Product-Line1)
Consists of a product-line architecture and a set of reusable components
and subsystems that are designed for incorporation into the product-line
architecture.
Product architecture A product architecture is derived from the product-line architecture by
pruning unwanted features out of it, extending it with product-specific
features and resolving conflicting features.
Software fault A fault is an encoding of a human error; that is, a fault occurs when a
human error results in a mistake in some software product.
Software failure A failure is the departure of a system from its required behavior. A fault
can lead to a failure.
Software error In the case organization, errors are work items used to manage
corrective actions for faults that have resulted in failures when software
was verified against specifications or user expectations. That is, here an
error may be understood as a detected fault.
1 Notation used by Bosch (2000)
1. Introduction
1.1. Case Organization
This study was done for Enterprise Solutions (ES), one of the four business groups within Nokia
Corporation. According to IDC (2004), during the third quarter of 2004 Nokia had a 31.3% share
of the global mobile terminal market and shipped around 51.4 million terminals worldwide.
1.2. Background and Motivation for the Study
As time-to-market (TTM) is now increasingly the main driver for new product development
also in the mobile terminal business, predictability - the ability to anticipate the probable chain of
events - is a desired capability to possess. In mobile terminal development, predictability is
utilized e.g. in resource allocation for software implementation and testing.
In a software product-line2 based product development model, a product program may not
implement or even test the terminal software by itself, but rather outsources the
implementation and maintenance (i.e. error correction and evolution) of software technology
components to software platform organizations (see Figure 2), and component integration and
system verification to software streams. In this kind of development model, a product program's
responsibilities include creating development work items, i.e. requirements, change orders,
error reports etc., and managing the outsourced development by monitoring and prioritizing these.
Because of this model, the software effort estimation techniques presented in literature are
generally inadequate or inapplicable for modeling development costs, as argued also by Bosch
(1999). Furthermore, data gathering for model population may be costly as well as problematic,
as development activities cross organizational borders, even though they are mostly intra-organizational.
An issue requiring predictability is the cumulative amount of software errors induced during
the project execution phase that is attributable to a product program. This study tries to help the
persons responsible for cost estimation in the case organization by reviewing a set of approaches
for effort/defect estimation in software engineering. The main objective of the study is to present
a pragmatic approach for assessing resource allocation for error correction in software platform
organizations.
2 See Bosch (2000) about software product lines.
1.3. Research Objectives
The research problem is defined as follows:
• How can the cumulative amount of errors in product-line software attributable to a product
program be estimated in a software product-line based development model?
The research problem is divided into the following questions:
• What different approaches to cost estimation are there in general, and what kinds of
techniques have been applied for cost estimation in software engineering?
• What makes cost estimation in software product-lines different from cost estimation in a
traditional one-at-a-time project?
The research objective is therefore the following:
• To present a pragmatic approach for error estimation in a constituent organization.
1.4. Scope of the Study
This study does not try to be a comprehensive presentation of the software cost estimation
techniques and models suggested in literature, but rather a pragmatic, reviewing study on the
strengths, weaknesses and opportunities of different software cost estimation approaches and
the requirements they place on data. It is left for further research to analyze the usability of
the different approaches/techniques in the case organization and how they should be adjusted to
fit the requirements that a product-line based development model introduces.
Software cost estimation research focusing on defect/error estimation has tended to
concentrate on three perspectives, namely i) predicting the number of errors a system will
reveal in testing or operation, ii) estimating the reliability of the system in terms of time to
failure, and iii) understanding the impact of design and testing processes on error counts and
failure densities. This study concentrates on research done in the first one, but from a resource
allocation rather than a quality point of view.
The study is explorative in nature and has a pragmatic grip on issues; estimation and
measurement theories as such are out of the scope of this study.
2. Product-line Based Terminal Software Development
This chapter briefly goes over terminal software development in a software product-line based
development model. The chapter ends with results from interviews focusing on the differences
between product-line and one-at-a-time software development models from a cost-modeling
point of view.
2.1. Change in Software Development and Delivery Model
2.1.1. Software Development in Product Programs
Before the mobile telecommunication boom of the late nineties, the number of product programs
developing mobile terminals in the case organization was modest. Product programs were almost
entirely autarchic in their software development.
The disutility from developing almost identical software in parallel progressing product programs
was, however, not significant enough to cause a change in the way software development
was done. Alleviating reasons were the relatively small amount of software needed for products
and the lack of modularity in contemporary software architectures.
[Figure content: product programs A-G on a time axis t, copying software (SW) from one another and routing REQs/CRs/errors to independent software projects.]
Figure 1: Product oriented Software Development by Programs
Subsequent product programs merely copied (solid arrows in Figure 1) terminal software from
previous products and modified it to match renewed requirements. Programs implemented new
requirements partly by themselves and partly by outsourcing development to autonomous, in-house
software projects (dashed lines). The benefits of product-wise development emerge from
its simplicity: there is no need to manage conflicting demands for development resources, as
each program owns its resources.
2.1.2. Software Delivery from Software Platform Organization
Product-wise software development can be efficient while the number of products in the company's
product portfolio is small enough. However, as the pressure to differentiate products according to
more granular market segmentation widens the portfolio, and as the introduction of new
functionality increases the amount of software needed for products, the disadvantages of product-program-wise
software development may outweigh the benefits.
In product-wise product development the amount of duplicated work grows in proportion to the
number of programs. A lot of development effort is lost, as the same solutions with minor
alterations could be utilized simultaneously in more than one product. Notably, work is duplicated
not only in software implementation but also e.g. in testing and error correction.
Moreover, copying code across product programs arguably erodes it over a long period of time if
there is no central authority to control, define and document the changes made to it. This can
lead to a situation where the understandability, performance and reusability of the software
become a significant cost factor in using it in new products.
[Figure content: software platforms delivering components via software projects and a software stream to product programs A-H on a time axis t.]
Figure 2: Platform oriented Software Development for Programs
Product-line oriented software development centralizes software development and software
architecture management. Products belonging to the same product-line can exploit software
components adhering to the product-line architecture, and moreover utilize shared resources
sited in software streams.
2.2. Conclusion: Attributable Changes in Software Development
A few interviews were conducted in order to understand how cost estimation differs in a
software product-line based development setting. The following issues were brought up in the
interviews.
2.2.1. Continuously Evolving Software Asset
The core idea of product-platform oriented product development is to branch a new product-line
architecture from the previous version with minimized effort. Branching preserves the previous
version of the product-line architecture for further exploitation, while new functionality can be
developed on top of the new one by extending/modifying present functionality or implementing new.
Designing reusable assets that support branching is a multi-variable decision problem, and this
sometimes confounds designers, as pointed out also by Bosch (1999). This is arguably a cause of
volatility in estimates.
2.2.2. Built-in Variability of Software Components
For components adhering to a certain product-line architecture to be truly valuable, they
need to encompass the required level of variability to adapt to specific product architectures with
relative ease. That is, the built-in variability of components is utilized to accommodate a component
into a specific hardware environment, and to include needed component functionality and
exclude overhead.
In product-wise software development there is no true need to insert variability into software
components, as each software component is explicitly tailored with a particular product in mind.
That is, all the functionality and interfaces in a component are necessary and sufficient.
However, in a component adhering to a certain product-line architecture there can exist a
considerable amount of functionality and interfaces that are not needed in a certain product
architecture. Cost-modeling these variation points is difficult, as also pointed out by Bosch (ibid.).
As an implication of the aforementioned, not all defects attributable to a certain software
component are relevant for a product program whose product architecture uses the functionality
of that component only partially.
2.2.3. Distributed integration of software components
Software components are regularly integrated in order to verify their compliance with other
components at the system level and in the target hardware. In product-wise software development,
component implementation and integration are often done by the same person. Furthermore,
components are designed to be integrated with a certain known set of time-invariant components.
This may not be the case in software product-line based development. Components are possibly
developed in dispersion, depending on the work decomposition inside the software platform
organization. Regular integration, again, may be done at multiple levels by separate teams
specialized in certain product families or products. As components are developed not against
certain components but rather to certain product-line architectural principles, a new breed of
integration-based defects emerges.
2.2.4. Lack of Inter and Intra Organizational Visibility and Control
Software development projects responsible for the development of certain components adhering to
the product-line architecture are rather independent in their internal operations. There exist no
common policies on reporting, data collection or metric definitions.
Fenton (1997) supports this observation by recognizing that the first and most important method of
improving cost estimation in a particular environment is to use size and effort measures that are
defined consistently across the environment; the measures must be understandable by all who
must supply or use them. Lack of organizational visibility as well as managerial control makes
this very difficult in a dispersed organization.
3. Approaches for Software Cost Estimation
In the following, the words 'approach', 'technique' and 'model' are used. For readability, some
definitions are therefore given. Here 'approach' refers to a high-level means of tackling the
problem of estimating effort/defects. 'Technique' and 'model', again, refer to more specific
means, i.e. to a formal procedure or to a mathematical function producing an estimate.
In his book, Boehm (1981) gives the following listing of estimation approaches: algorithmic
models, expert judgment, analogy, Parkinson, price-to-win, top-down and bottom-up. However,
most of the techniques reported in various research papers illustrate the fact that more and
more of them embody aspects of two or more of the approaches given by Boehm.
In the following, as different estimation approaches are discussed, the focus is kept on a very high level.
Acknowledging the diversity of ways that estimation techniques could be classified according
to the objective of the task, techniques are here classified into expert-based and algorithmic
approaches. This emphasizes the techniques' relationship to data and its source.
3.1. Software Cost Estimation in General
Software cost estimation is a task aiming to provide a manager responsible e.g. for software
technology creation or software project management with information about the costs involved in
software development. Here the focus is on effort estimation in terms of defects/errors.
3.1.1. About the Accuracy of Estimates
It can be argued that software cost modeling cannot, even at its very best, provide estimates that
can be considered more than reasonably accurate. There are many reasons why estimation
techniques fail to perform in software engineering. Boehm (1981, p. 32) lists a few of them:
1. source instructions3 are not a uniform commodity, nor are they the essence of the
desired product;
2. software requires the creativity and cooperation of human beings, whose individual
and group behavior is generally hard to predict;
3 Boehm refers to an internal product attribute, size, that can be measured e.g. as the number of instructions.
3. and software has a much smaller base of relevant qualitative historical experience, and
it is hard to add to the base by performing small controlled experiments.
The acknowledged inaccuracy of estimates in the field of software engineering is well
illustrated by the categorization given by Murthi (2004). His three-level categorization of the
accuracy of estimates is the following: ballpark/order-of-magnitude, rough, and fair estimates. Fair
estimates - ideally about 25% to 50% off the actual value - are possible when one is very
familiar with what needs to be done and has done it many times before, e.g. while adding well-understood
functionality that has been implemented before; rough estimates - ideally about 50% to
100% off the actual value - are possible when working with well-understood needs and one is
familiar with the domain and technology issues; order-of-magnitude estimates, again, would fall
within two or three times the actual value and are often the ones started with.
3.1.2. About Subjectivity and Estimates
Looking at Boehm's list of causes above, it is evident that software cost estimation is
characterized by some level of subjectivity. According to Gray et al. (1999, p. 216), not even
purely mathematical models are totally immune to subjectivity. They argue that when effort
estimates are made, whatever the technique used, some subjectivity is involved - either in making
the estimates themselves (as in most cases they are in fact simply guesstimates based on
subjective opinion) or in calibrating some inputs into the model. For instance, in expert-based
approaches the inclusion of subjectivity, intentional or inadvertent, is straightforward; estimates
depend on the knowledge, capability and objectivity of the estimator, who may be biased,
optimistic, pessimistic, or unfamiliar with key aspects of the project.
However, subjectivity - although often seen as a threat to the reliability of estimates - can also
be an opportunity. Biffl (2000) reported that subjective defect estimation models (DEMs)
generally estimated defect amounts in software documents more accurately than objective
DEMs. This kind of result can possibly be attributed to expert judgment's ability to factor in
the differences between past project experiences and the new techniques, architectures, or
applications involved in the future project - a list given by Boehm (1981, p. 333).
Gray et al. (1999, p. 217) further point out that the inclusion of subjective elements into models,
in cases where the data sets are small and the estimation process is limited to empirical models,
allows for a great reduction in the number of variables as well as accounting for factors that are
difficult to measure. It can therefore be cautiously concluded that subjectivity coming from
experts may, in proper circumstances, be used as an adjustment factor for estimation models.
3.2. Expert-based Approaches
3.2.1. About Expert Opinion based Techniques
Expert opinion based techniques involve consulting with one or more experts, who use their
experience and understanding of the proposed project/product to arrive at an estimate of its
cost. That is, the parameters of the project/product are described to mature developers who use
their personal experience to turn them into effort predictions.
Since estimation is done by humans, the estimates embody a relatively high level of
subjectivity. Reflecting this, the expert opinion techniques proposed in literature often
aim to offer means to reduce the level of subjectivity by making estimates more transparent,
preventing important aspects from being overlooked, and eliciting underlying assumptions and
possible threats to the estimates (Passing and Shepperd, 2003, p. 120).
3.2.2. About Estimation by Analogies
The idea of analogy-based estimation is to identify the completed projects/products that are the
most similar to a new project/product. The key activities are the identification of a problem as a
new case, the retrieval of similar cases from the repository, the reuse of knowledge derived
from previous cases and the suggestion of a solution for the new case. Shepperd (1997)
suggests that 10-12 cases are required in order to provide a stable basis for estimation.
The approach has two main issues to deal with: how to characterize cases and how to retrieve
similar cases, i.e. how to measure similarity. Shepperd (ibid.) notes that codifying a repository
of projects is a major challenge that may require the use of an expert to establish those features of
a case that are believed to be significant in determining the similarity and differences of cases.
Kolodner (1993) lists a number of approaches to measure similarity between cases: nearest
neighbor algorithms, manual guided induction, template retrieval, goal directed preference,
specificity preference, frequency preference, recency preference, and fuzzy similarity.
Once the analogous projects have been found, the known effort of similar cases can be utilized
in a variety of ways. While Shepperd (1997) used both weighted and unweighted averages of up to
three analogies, Angelis (2000) suggested that the bootstrap method should be used to choose
and configure i) the distance measure used to evaluate the similarity between projects,
ii) the number of analogies to take into account in the effort estimation, and iii) the statistic that
will be used in calculating the unknown effort from the efforts of the similar projects/products.
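As a concrete illustration of the retrieval-and-reuse steps above, the sketch below implements a minimal nearest-neighbour analogy estimator. The repository contents, feature vectors and the choice of an unweighted mean over three analogies are illustrative assumptions, not data from this study.

```python
import math

# Hypothetical repository of completed projects: feature vector + known effort.
# In practice features would be normalized and possibly weighted by an expert.
repository = [
    {"features": [10.0, 3.0, 1.0], "effort": 120.0},
    {"features": [12.0, 4.0, 1.0], "effort": 150.0},
    {"features": [40.0, 8.0, 3.0], "effort": 600.0},
    {"features": [38.0, 7.0, 3.0], "effort": 560.0},
    {"features": [20.0, 5.0, 2.0], "effort": 260.0},
]

def euclidean(a, b):
    """Unweighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def estimate_by_analogy(new_features, repo, k=3):
    """Average the known effort of the k most similar completed cases."""
    ranked = sorted(repo, key=lambda case: euclidean(case["features"], new_features))
    nearest = ranked[:k]
    return sum(case["effort"] for case in nearest) / len(nearest)

# Estimate effort for a new project characterized by a hypothetical feature vector.
estimate = estimate_by_analogy([11.0, 3.5, 1.0], repository, k=3)
```

The unweighted mean over the nearest cases corresponds to the simplest of the statistics Angelis (2000) would have the bootstrap select among.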
3.3. Algorithmic Approaches
Algorithmic models provide one or more mathematical algorithms which produce an estimate
as a function of variables considered to be the major cost drivers. The most common forms of
algorithms used for software cost estimation are presented below with some examples from
literature. The categorization below is by Boehm (1981, p. 330).
1. Linear models: A linear model for cost estimation has the general form
Effort = a0 + a1 x1 + a2 x2 + ... + an xn , (1)
where x1, ..., xn are the cost driver variables, and a0, ..., an the set of coefficients
chosen to provide the best fit to the set of observed data points. An example of a simple
linear model predicting errors is the one by Akiyama (1971),
D = 4.86 + 0.018 L , (2)
where D is the number of defects in a program and L is the number of code lines.
2. Multiplicative models: The general form of a multiplicative cost-estimating model is
Effort = a0 · a1^x1 · a2^x2 · ... · an^xn , (3)
where x1, ..., xn are the cost driver variables, and a0, ..., an the set of coefficients
chosen to provide the best fit to the set of observed data points.
3. Analytic models: An analytic model takes the more general mathematical form
Effort = f(x1, x2, ..., xn) , (5)
where x1, ..., xn are the cost driver variables, and f is some mathematical function.
Lipow's (1982) model,
D/L = A0 + A1 ln L + A2 (ln L)^2 , (6)
where D and L are as previously stated and each of the Ai is dependent on the average
number of usages of operators and operands per LOC for a particular language, defines
a relationship between fault density and size, and also takes the differences among
programming languages into account.
4. Tabular models: Tabular models contain a number of tables, which relate the values of
cost driver variables to portions of the software development effort.
5. Composite algorithmic models: Composite models incorporate a combination of
linear, multiplicative, analytic and tabular functions to estimate software effort as a
function of cost driver variables. In the constructive cost model (COCOMO) by Boehm
(1981), the effort estimation model takes the general form
Effort = a S^b F , (4)
where a is a productivity parameter, S is size measured in thousands of delivered source
instructions (KDSI), b captures economies or diseconomies of scale, and F is a
convolution of cost factors (see Boehm for a list).
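The composite form (4) can be sketched with the well-known basic-mode COCOMO parameters from Boehm (1981); in this simplification the cost-factor convolution F is fixed to 1.0 and the project sizes are made up.

```python
# Basic-mode COCOMO parameters (a, b) per development mode, from Boehm (1981).
# Effort comes out in person-months for a size S given in KDSI; the cost-factor
# convolution F is taken as 1.0 in this sketch.
MODES = {
    "organic":      (2.4, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded":     (3.6, 1.20),
}

def cocomo_effort(kdsi, mode="organic", f=1.0):
    """Effort = a * S^b * F for the chosen development mode."""
    a, b = MODES[mode]
    return a * kdsi ** b * f

effort_small = cocomo_effort(8.0)                 # small organic project
effort_large = cocomo_effort(128.0, "embedded")   # large embedded project
```

The exponent b > 1 means effort grows faster than size, i.e. the model assumes diseconomies of scale in all three modes.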
The example models presented above have a common denominator, i.e. they try to predict an external4
(product) quality attribute by an internal5 (product) attribute (i.e. measure) that
is measurable or derivable from plans, specifications or designs relatively early in development.
The internal product attribute used in the examples considers the size of a software product from
the aspect of length (e.g. lines of code). In addition to length - the physical size of the software
product - Fenton (1997, p. 245) suggests that software size can be described with functionality,
which measures the functions supplied by the product to the user, and complexity, which can be
interpreted in different ways depending on perspective: computational, algorithmic, structural and
cognitive complexity. Fenton (ibid.) further acknowledges that the degree of reuse is also
important to consider when size is provided as input to effort, cost and productivity models.
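The coefficients in the linear form (1) are "chosen to provide the best fit" to observed data; for a single cost driver this reduces to ordinary least squares, sketched below with invented (lines-of-code, defect-count) observations. Akiyama's published coefficients are evaluated alongside for comparison.

```python
# Minimal closed-form least-squares fit of D = a0 + a1*L for one cost driver
# (lines of code). The observations below are invented for illustration.

def fit_linear(xs, ys):
    """Return (a0, a1) minimizing the squared error of y = a0 + a1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    a0 = mean_y - a1 * mean_x
    return a0, a1

loc =     [1000.0, 2000.0, 4000.0, 8000.0]   # module sizes (invented)
defects = [  22.0,   41.0,   80.0,  148.0]   # observed defect counts (invented)

a0, a1 = fit_linear(loc, defects)
predicted = a0 + a1 * 3000.0                  # fitted model, 3000-LOC module
akiyama = 4.86 + 0.018 * 3000.0               # Akiyama (1971), same module size
```

Note that the multiplicative form (3) can be fitted with the same machinery after a log transform, since ln Effort is linear in the cost drivers.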
Many software engineers argue that instead of length, the amount of functionality inherent in a
product gives a better basis for estimation. As a distinct attribute, functionality captures an
intuitive notion of the amount of function contained in a delivered product or in a description of
what the product is supposed to be. There have been several attempts to define measures for
functionality: function points by Albrecht, object points by Boehm, specification weights by
DeMarco, Mark II function points by Symons and feature points by Jones.
4 External attributes of a product, process or resource are those that can be measured only with respect to how the product, process or resource relates to its environment. (Fenton, 1997, p. 74)
5 Internal attributes of a product, process or resource are those that can be measured purely in terms of the product, process or resource itself. (Fenton, 1997, p. 74)
Chapter 4: Estimating Error Amounts for Resource Allocation
A Pragmatic Study on Cost Estimation in Software Product-Line Based Product Development Model Jukka Lehmusvirta, Mat-2.108 Independent Research Projects in Applied Mathematics, HUT, SAL 2004.
12
4. Estimating Error Amounts for Resource Allocation
In the literature part of the study several techniques were reviewed. The techniques are collected and
classified in Table 1. In the following synthesis the qualitative analysis is very general, as the scope
of this study is rather limited.
Table 1: Some reviewed estimation techniques
Approach Some techniques applied in literature
Expert opinion based techniques
• Software project size and effort estimation using checklists (see Tausworthe, 1980) and the group consensus technique Wideband Delphi (see Boehm and Farquhar, 1974). (Passing and Shepperd, 2003)
• Subjective team estimation models based on inspection data. (Biffl, 2000)
Analogy-based techniques
• Effort estimation using reasoning by analogy, fuzzy logic and linguistic quantifiers. (Idri et al., 2002)
• Effort estimation using analogies and comparison to regression models. (Shepperd and Schofield, 1997)
Algorithmic techniques
• Defect estimation using non-linear regression models and complexity & other internal variables as measures. (Khoshgoftaar et al., 1990; Khoshgoftaar et al., 1992)
Other • Bayesian Belief Networks. (Fenton and Neil, 1999)
• Software defect estimation using capture-recapture and curve fitting. (Petersson and Wohlin, 1999)
4.1. Analysis of Findings from Literature
A study by Moløkken and Jørgensen (2003), reviewing surveys on software cost estimation in the
software industry, identified that expert-based estimation techniques are by far the preferred
approach for software cost estimation.
There may be many reasons for this. Boehm (1981) notes that expert-based estimation has the
ability to factor in the differences between past project experiences and the new techniques,
architectures, or applications involved in the future project, and to factor in exceptional
personnel characteristics and interactions, or other unique project considerations. That is,
Boehm gives a lot of credit to its robustness and ability to cope with unique situations, which is
a much emphasized characteristic of software projects in software engineering textbooks.
However, expert-based techniques are claimed not to give any better estimates than the
expertise and objectivity of the estimator allow, and they are subject to a number of causes of
bias (see Salo, 2004). Gray et al. (1999) give the following causes for frequent misestimation:
1. changes in technology that are not fully understood in terms of their effect on effort
(e.g. the effect that a new tool can have on software development);
2. the difficulty of assessing factors like levels of personnel experience and skills;
1. changes in technology that are not fully understood in terms of their effect on effort
(e.g. effect that a new tool can have for software development);
2. difficulty to assess factors like levels of personnel experience and skills;
3. a lack of understanding of the module/system characteristics (i.e. some feature may
be seen as simple by a particular manager when in fact it may require considerable
development effort);
4. other influences due to the estimator’s background;
5. any number of other related reasons (including political and motivational goals, etc.).
No literature reference was found discussing software defect estimation techniques based on
expert opinions. Passing and Shepperd (2003), however, explored two techniques frequently
suggested as support for software project effort and software size estimation: checklists6 and
the group discussion technique Wideband Delphi7.
They concluded that both checklists and group discussions improved the accuracy of the lines-of-code8
estimates and increased the confidence the estimators had in their estimates. Checklists
also improved the consistency and transparency of the estimates (LOC and effort) and increased
the size estimates, while group discussion did not have any additional influence on these.
However, as Passing and Shepperd (ibid.) concluded, students acted as estimators in the
experiment design, so it remains unclear to what extent these results can be generalized to
experienced estimators and outside the controlled experiment environment.
No literature reference was found either discussing software defect estimation based on
analogies. All reviewed papers seemed to approach estimation from a comparative-analysis point
of view. That is, the papers focused on comparing analogy-based estimation models with
algorithmic or other formal models presented elsewhere, using external data sets. Metrics like
Pred9, adjusted R squared or the mean magnitude of relative error (MMRE) were used to measure
the accuracy of the estimates.
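The two metrics defined above are simple to compute; a minimal sketch, with invented example values:

```python
# MMRE: the mean magnitude of relative error |actual - estimate| / actual.
# Pred(25): the share of predictions within 25% of the actual value.

def mmre(actuals, estimates):
    return sum(abs(a - e) / a for a, e in zip(actuals, estimates)) / len(actuals)

def pred(actuals, estimates, level=0.25):
    hits = sum(1 for a, e in zip(actuals, estimates) if abs(a - e) / a <= level)
    return hits / len(actuals)

actual = [100, 200, 400]     # invented actual efforts
estimate = [110, 150, 390]   # invented estimates
print(round(mmre(actual, estimate), 3))  # (0.10 + 0.25 + 0.025) / 3 = 0.125
print(pred(actual, estimate))            # all three within 25% -> 1.0
```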
Shepperd and Schofield (1997), for instance, compared estimates produced by an analogy-based
technique building on another analogy technique, namely case-based reasoning
6 For more see Tausworthe, 1980.
7 A technique modified by Boehm (see Boehm, 1981) from original Delphi technique by Helmer (1966).
8 LOC is an acronym for lines of code.
9 E.g. Pred(25) is the percentage of predictions that fall within 25 percent of the actual value.
(CBR)10, to estimates obtained through stepwise regression models using nine different data sets,
each with different attributes and a varying number of attributes. To determine the
most similar projects they used the unweighted Euclidean distance between project attributes as a
similarity measure; they criticized the similarity measures proposed by Kolodner (see above) as
suffering from computational complexity, being intolerant of noise and irrelevant attributes,
being incapable of handling symbolic or categorical attributes, and being weak at higher-order
attribute relationships (i.e. relationships which can be derived from the structure of the data).
Shepperd and Schofield (ibid.) reported that the analogy-based estimation technique resulted in a
better MMRE11 on all data sets, and on seven out of nine when Pred was used as a measure. They
further concluded that estimation by analogy is able to generate estimates where algorithmic
models were not, because of the categorical nature of the data and the lack of statistical
relationships. They also argued that, as this type of situation may be quite common particularly
at a very early stage in a project, analogy could be an attractive method for producing very
early estimates.
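The core of the approach described above, finding the nearest past project by unweighted Euclidean distance and reusing its actual effort, can be sketched as follows. The attribute vectors and effort figures are invented, and the attributes are assumed to be pre-normalized numeric values.

```python
import math

# Analogy-based estimation in the spirit of Shepperd and Schofield:
# pick the most similar completed project by unweighted Euclidean
# distance over numeric attributes and reuse its actual effort.

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def estimate_by_analogy(new_project, past_projects):
    """past_projects: list of (attribute_vector, actual_effort) pairs."""
    best = min(past_projects, key=lambda pe: euclidean(new_project, pe[0]))
    return best[1]

# Invented, pre-normalized attribute vectors with actual efforts:
past = [((0.2, 0.5), 120), ((0.8, 0.3), 300), ((0.4, 0.6), 150)]
print(estimate_by_analogy((0.35, 0.55), past))  # nearest is (0.4, 0.6) -> 150
```

Real implementations typically average over the k nearest analogies and normalize each attribute to a common range first.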
Idri et al. (2002) report results similar to Shepperd’s while using fuzzy sets and linguistic
quantifiers to include both numerical and linguistic (categorical) values such as ‘very low’,
‘low’ and ‘high’ in analogy-based cost estimation models. However, for instance Briand et al.
(1999) report opposite results when comparing common software cost modeling techniques
using data from the European Space Agency. They also concluded that the analogy-based models
used in the experiment setting did not seem to be as robust when using data external to the
organization for which the model was built.
Recently, algorithmic approaches that use internal variables like size and complexity to predict
defects have received a lot of criticism. In their paper Fenton and Neil (1999) point out that
the correlations the variables demonstrate in empirical investigations are not necessarily
causal, i.e. a third variable may be responsible for the correlation. Zuse (1998, p. 422) points
out that in order to solve this third-variable problem, a proper hypothesis of reality may be
necessary.
In addition to several points about problems in statistical methods and the quality of data used
in model building and testing, Fenton and Neil (ibid.) list the following observations about the
way complexity metrics are misleadingly used to predict defect counts:
10 For more see Aamodt and Plaza, 1994.
11 MMRE is an acronym for mean magnitude of relative error.
1. the models ignore causal effects of programmers and designers;
2. overly complex programs are themselves a consequence of poor design ability or
problem difficulty;
3. defects may be introduced at the design stage because of the overcomplexity of the
design already produced.
To overcome the shortcomings they attribute to defect prediction models using just internal
variables as input, Fenton and Neil (ibid.) introduce a technique based on Bayesian belief
networks (BBN). The model and discussion presented by Fenton and Neil (ibid.) again approach
defect estimation from a software quality point of view but, arguably, offer a method to
quantify causal relationships between variables that can be assumed to be relevant in a
particular environment and problem.
4.2. Estimating Error Using Requirements and Change Orders
4.2.1. Description and Grounds for Chosen Approach
One conclusion from the above analysis, rather intuitive, is that there does not exist a ‘best’
estimation technique outperforming all others in every situation; this is supported e.g. by Idri
et al. (2002, p. 21). It can be argued that in reality there are a number of pragmatic
constraints and challenges that inhibit and complicate the use of estimation techniques,
whatever they might be. For instance, the inter- as well as intra-organizational borders that
are consequences of the organizational structure in a software product line make applying a
given method impossible without time and management support.
The approach presented here is simple. In order to evaluate the costs exerted by a new product
program on one of the software platform organizations in terms of errors, the possible
interdependency between error and requirement amounts is analyzed by forming a model using the
notation of Zuse (1998). This model gives the conditions under which the suggested measures can
be meaningfully analyzed using, among other things, statistical analysis.
The approach is not entirely new in experimental software engineering research. There are many
suggestions on how one should measure the size of software using the functionality it
encompasses, and how to use this to produce estimates of external variables. However, few
have considered the conditions under which the suggested measures can be used for these
purposes.
4.2.2. Requirement Engineering
In contemporary software development life cycle models, requirements engineering (RE) is a
continuous activity aiming to assure that the product will meet customer expectations.
In the project definition phase, RE concentrates on analyzing the large quantity of requirement
requests emerging from a number of sources. Requests that are considered valid for the particular
business case are sent to the software platform organization. After refinement together with the
product program, the software platform organization then decides whether or not it can deliver a
software implementation fulfilling the request on the schedule requested by the product program.
Accepted requirement requests are included in the software platform organization’s roadmap,
broken down into their platform-side counterparts, i.e. features and subfeatures (see below), and
these are assigned to one or more software projects, depending on the work decomposition inside
the software platform organization.
As the time from project definition to first sales release in mobile terminal development usually
ranges from one to five years, a product program would in all likelihood find itself holding an
outdated product if RE were limited to the project definition phase only. Therefore, a product
program continuously evaluates whether its requirement set is competitive or whether it needs to
be changed somehow.
Change orders are requirement requests sent to the software platform organization during the
project execution phase, after the specification freeze that ends the project definition phase.
Change orders, if accepted after a decision-making process similar to that for original
requirement requests, can lead either to changes to existing feature(s), i.e. to feature change
orders (see below), or result in an entirely new feature.
4.2.3. Description of the Data
Data consisting of the work entities presented below was collected from four past projects. Each
work entity type has several attributes, including e.g. report title, unique identification code,
date detected and date closed.
1. Requirement request (REQ). A work order from a product program to the software
platform organization. Requirement requests illustrate the functionality to be implemented
as seen by the product program. An example content of a requirement request is that the
terminal software should implement the Bluetooth (BT) protocol stack and a certain set of
relevant user profiles.
2. Change orders (CR). A new requirement request, or a request for a change to terminal
software functionality, entered during the project execution phase. Again the product
program’s interpretation of the required functionality.
3. Features (FEA) and subfeatures (SUB). Features and subfeatures represent the
functionality to be implemented as seen by the software platform organization.
4. Feature change orders (FCO) and subfeature change orders (SCO).
Feature/subfeature change order represents a change or an enhancement of a
feature/subfeature to be implemented as seen by the software platform organization.
5. Error reports. An error report is generated when a deviation from the specification or expected
behavior is detected while the software is being tested. Like requirement requests and
change orders, error reports are work orders for the software platform organization; as
mentioned, the software platform organization is responsible for software maintenance
throughout the product’s life cycle, i.e. from the project definition phase to maintenance
releases for products already on the market.
Thus, requirement requests and change orders represent the customer side’s, i.e. the product
programs’, understanding of the need; they seldom consider the technical details of the
functionality requested. Therefore, on the software platform organization’s side they are mapped
to features (FEA) and these again to subfeatures, and change orders correspondingly to feature
change orders (FCO) and subfeature change orders.
The mapping is however not bijective. The relationship between these work items is that one
requirement (or change order) can bring about several features, and one feature can be common to
one or more requirements. Moreover, one requirement (or change order) can bring about several
feature change orders, and one feature change order can be common to one or more requirements
(or change orders).
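The many-to-many mapping described above can be sketched with sets; the identifiers below are invented for illustration.

```python
# Invented requirement-to-feature mapping: one requirement can bring
# about several features, and one feature can serve several requirements.

req_to_fea = {
    "REQ-1": {"FEA-10", "FEA-11"},  # one requirement, several features
    "REQ-2": {"FEA-11"},            # FEA-11 is shared by REQ-1 and REQ-2
}

def features_of(req_ids):
    """All features attributable to a set of requirement requests."""
    out = set()
    for r in req_ids:
        out |= req_to_fea.get(r, set())
    return out

print(sorted(features_of({"REQ-1", "REQ-2"})))  # FEA-11 counted only once
```

Because the mapping is not bijective, feature counts cannot simply be summed over requirements; the set union above avoids double counting.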
4.2.4. Measurement Theoretic Model and Work Assumptions
Measurement is the mapping of empirical properties to numbers, which thereafter express the
meaning of these empirical properties12. For instance complexity measure by McCabe
12 Preservation of all relations between the empirical and numerical statement is called homomorphism. That is, a scale is a homomorphism between two relational systems.
quantifies the complexity of a software design presented as a flowchart into a number. One can
then compare the complexity of one software design to another in this one respect by using
McCabe’s number.
According to Zuse (1998, p. 419), when one attempts to predict an external quality attribute by
an internal quality attribute, three things are needed:
1. Criteria – empirical conditions/properties – to describe the behavior of the internal and
external software quality attributes
2. The scale levels of the internal and external quality attributes
3. An empirically validated function between the internal and external attribute
Considering the empirical properties of objects allows one to decide whether a measure of an
internal attribute is appropriate for predicting the external variable. To illustrate the
empirical properties of the objects under scrutiny, let us assume that F1, F2 ∈ F, where F is the
set of functionality objects. Thereafter the statement
F1 •> F2 , (7)
where •>13 is an empirical ‘more difficult to develop than’ relation, is interpreted as:
functionality F1 is more difficult to develop than functionality F2. If the number of
requirements14 (NOR) is then chosen as a measure for this, it then has to hold
F1 •> F2 ⇔ NOR(F1) > NOR(F2) , (8)
for all F1, F2 ∈ F, if we assume homomorphism between the empirical and numerical relational
systems of the measure. Thereafter, using the previous notation, the following statement about
the number of errors (NOE) can be written:
NOE(F1) > NOE(F2) , (9)
where F1, F2 ∈ F as above. By this it is said that functionality F1, being more difficult to
develop than F2, will induce more errors than functionality F2.
13 An empirical relation is defined here as •> ⊆ F × F, where F is a set of functionality objects.
14 The number of requirements can also be considered in terms of features, subfeatures, etc., since there is a defined mapping policy between requirements and measures.
To use the measure NOR in a prediction model, which here can be written as
NOE(F) = f(NOR(F)) , (10)
where F ∈ F, F is the set of functionality objects and f is the prediction function, both
measures should have corresponding properties (Zuse, 1998, p. 449). Thus, here the following
needs to hold:
1. NOR assumes an extensive structure15 and can be used as an additive ratio scale;
2. NOE assumes an extensive structure and can be used as an (additive or non-additive)
ratio scale, if conditions given in Table 2 apply;
3. and the ranking order (i.e. weak order) of the measure NOR should correspond with the
ranking order of the external variable NOE.
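The third requirement, correspondence of the ranking orders, can be checked directly on data. The function and the data values below are invented for illustration, assuming NOR and NOE each yield one number per functionality object.

```python
# Check that the weak order induced by the measure NOR corresponds with
# the weak order of the external variable NOE over a set of
# functionality objects (requirement 3 above). Data values are invented.

def rankings_correspond(nor, noe):
    """nor, noe: dicts mapping functionality id -> measured value.
    True iff NOR(F1) > NOR(F2) always implies NOE(F1) > NOE(F2)."""
    items = list(nor)
    for f1 in items:
        for f2 in items:
            if nor[f1] > nor[f2] and not noe[f1] > noe[f2]:
                return False
    return True

nor = {"F1": 12, "F2": 7, "F3": 3}   # invented requirement counts
noe = {"F1": 40, "F2": 25, "F3": 9}  # invented error counts
print(rankings_correspond(nor, noe))  # True: the orders agree
```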
Table 2: Twelve conditions for the external variable given by Zuse (1998)
Condition Definition
Weak order NOR(F1) ≥ NOR(F2) <=> NOE(F1) ≥ NOE(F2)
Positivity NOE(F1 ∘ F2) > NOE(F1)
Condition C1 NOE(F1)=NOE(F2) => NOE(F1 ∘ F)=NOE(F2 ∘ F) and NOE(F ∘ F1)=NOE(F ∘ F2)
Condition C2 NOE(F1)=NOE(F2) <=> NOE(F1 ∘ F)=NOE(F2 ∘ F) <=> NOE(F ∘ F1)=NOE(F ∘ F2)
Substitution Property NOE(F1 ∘ F2) > NOE(F1 ∘ F2’) <=> NOE(F2) > NOE(F2’), or vice versa
Package Depiction Forecast of the system from its components
Weak commutativity NOE(F1 ∘ F2) = NOE(F2 ∘ F1)
Weak monotonicity NOE(F1) ≥ NOE(F2) => NOE(F1 ∘ F) ≥ NOE(F2 ∘ F)
Monotonicity NOE(F1) ≥ NOE(F2) <=> NOE(F1 ∘ F) ≥ NOE(F2 ∘ F)
Archimedean Property NOE(F3) > NOE(F4) => NOE(F1 ∘ F3 ∘ F3 …) > NOE(F2 ∘ F3 ∘ F3 …)
Wholeness NOE(F1 ∘ F2) > NOE(F1) + NOE(F2)
Additivity NOE(F1 ∘ F2) = NOE(F1) + NOE(F2)
In the field of software engineering it is not as clear as, for instance, in the measurement of
length whether the axioms behind the assumed extensive structure hold in reality. For instance,
the axiom of monotonicity (see the table above) will surely hold for length measurement, but
this is not self-evident for a software measure. Although no prediction function will be given in
this study, the conditions above are listed on purpose, to make explicit the assumptions under
which the results presented in chapter 4.4 should be reviewed (see the conclusion in chapter 5).
15 See Zuse (1997, p. 126) for definition of extensive structure by Krantz et al.
4.3. Data Refinement and Classification
4.3.1. Data Filtering and Refinement
As mentioned earlier, each work entity type has a unique identification code. However, possibly
due to the nature of database replication16, a situation can and often does occur in which more
than one instance of a work order is stored in the database repository. While instances other
than the valid one could have been picked out and disregarded by human judgment, the large
number of these cases and a cost-versus-benefit comparison made it necessary to automate this
process. Thus, simple heuristics17 were applied. The validity of the data was not seen to be
compromised, as the relative number of such cases was small.
The filtered data was refined further. This was seen as necessary since there is no enforcement
in place that would assure that the given policies on how to fill in work orders are followed
across the organization. For instance, several missing dates were filled in on the basis of the
previous or next life cycle state dates, as these represent the earliest possible time for the
state change.
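The two refinement steps above can be sketched as follows. The field names and the tie-breaking rule (keep the most recently modified instance per identification code) are invented for illustration; the real heuristics were more involved.

```python
# Sketch of the data refinement: drop duplicate instances of a work
# order and fill a missing date from an adjacent life-cycle date.
# Field names and the keep-latest heuristic are illustrative only.

def deduplicate(records):
    """records: list of dicts with 'id' and 'modified' fields.
    Keeps the most recently modified instance per id."""
    latest = {}
    for rec in records:
        cur = latest.get(rec["id"])
        if cur is None or rec["modified"] > cur["modified"]:
            latest[rec["id"]] = rec
    return list(latest.values())

def fill_detected(rec):
    """Fill a missing 'detected' date from the next life-cycle date,
    which is the earliest possible time for the state change."""
    if rec.get("detected") is None:
        rec["detected"] = rec["closed"]
    return rec

rows = [
    {"id": "E-1", "modified": 2, "detected": None, "closed": 5},
    {"id": "E-1", "modified": 1, "detected": 3, "closed": 5},
]
clean = [fill_detected(r) for r in deduplicate(rows)]
print(clean[0]["detected"])  # 5: later instance kept, missing date filled
```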
4.3.2. Classification of Data
Each entity type has a life cycle that corresponds to the underlying task. Without going into
details, the following diagram (Figure 3 below) gives an overall description of this life cycle.
[Diagram: life cycle states Detected, Assigned to Platform, Corrected by Platform, Verified by Customer and Closed, with side states Postponed and Ignored / Duplicate]
Figure 3: Life cycle of a work item according to state of corresponding task
To judge whether a work item is attributable either to the number of requirements or to the
number of errors, the diagram was divided into three parts:
16 A regularly performed activity by database administration in which content data between different instances of the same databases situated on geographically dislocated servers is synchronized.
17 These heuristics helped to disregard 95% of all instances, while the rest were disregarded manually.
1. Finished: included all work items in ‘verified’ and ‘closed’ states in the case of
errors, and ‘released’ in the case of features18.
2. Disregarded: included all work items in ‘ignored’ and ‘duplicate’ states.
3. Not yet realized: included all work items before the ‘verified’ state (excluding
work items in the disregarded group).
That is, this high-level classification of the data assured that only features, subfeatures, etc.
that have been released to testing can contribute to the error amounts, and similarly that only
error reports that truly are errors (i.e. can be traced to a failure or a difference from the
specification) are counted.
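The three-way classification above can be sketched as a simple state lookup. The state keywords follow the figure and the list, although, as noted in footnote 18, the real databases use varying keywords.

```python
# Three-way classification of work items by life-cycle state, following
# the list above; state keywords are illustrative.

FINISHED = {"verified", "closed", "released"}
DISREGARDED = {"ignored", "duplicate"}

def classify(state):
    s = state.lower()
    if s in DISREGARDED:
        return "disregarded"
    if s in FINISHED:
        return "finished"
    return "not yet realized"

print(classify("Closed"))      # finished
print(classify("Duplicate"))   # disregarded
print(classify("Assigned"))    # not yet realized
```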
4.4. Results
4.4.1. Visualization of Data
As noted, error correction is a considerable cost item for the case organization. While analyzing
it, one should remember that time and resources are spent not only on correction and testing
activities in the platform organization, but also on control and management activities in
customer projects like product programs and software streams. The more errors are found at
developers’ workstations instead of in the integration and testing activities of software streams
and product programs, the bigger the resulting cost savings for the case organization are,
directly as fewer resources are required and indirectly through shorter time-to-market.
As time allocated to error correction is time away from the implementation of new functionality,
it should also be considered when planning and allocating resources in the platform organization.
Neglecting to do this may arguably result in a situation in which a peak in the error amount
attributable to a certain platform manifests itself as a peak in the opposite direction in the
implementation of new functionality. Moreover, according to studies done in the case
organization, work-in-progress (WIP) will amplify these effects.
Figure 4 below illustrates the error load placed on a software platform by the four product
programs under scrutiny. The quick rise in error load is partially artificial, as predecessor
programs are not included in the figure.
18 Keywords used for life cycle states have different meaning across the databases.
[Graph: error load from 01.12.2002 to 01.09.2004, y-axis 0 to 3000, with series Program A, Program B, Program C, Program D and Other]
Figure 4: Error load attributable to a certain software platform
The profile formed by the programs looks quite interesting; the load has stayed relatively steady
during the whole observation window. This can be a sign of high utilization of resources or of an
overload of the system. However, the questioned interdependency between the amount of
functionality, measured in the number of feature work orders, and software development costs,
measured in terms of induced errors, remains the key interest here.
It was noticed during data collection that the content of requirement requests and change orders
varies considerably per product program. This was quite expected, since each program has a
different way of working as far as requirements engineering is concerned. Also the technical
experience, which is arguably needed when feeding a requirement request into the system, varies
between programs, resulting in fewer but bigger requirement requests.
On the platform side, requirement request and change order handling can however be considered
more homogeneous. Requests coming from product programs are divided into features etc. according
to technical details and the software platform organization’s internal work decomposition.
Therefore only feature-related functionality measures (namely the number of features, the number
of subfeatures, and the sum of subfeatures and subfeature change orders) were included in this
analysis. These measures are illustrated program-wise in the scatter plot diagrams below in
Figure 5, with the corresponding error amounts.
[Three scatter plots, each with the error amount (0 to 12000) on the y-axis: Number of FEAs (0 to 90), Number of SUBs (0 to 600), and Sum of SUBs and SCOs (0 to 1200); each plot shows program-only and program-plus-platform series with fitted exponential (Expon.) trendlines]
Figure 5: Scatter plots of feature related measures
The small sample size makes it difficult to find possible interdependence patterns in the figures.
Assuming that the interdependence is super-linear in nature, exponential curves were fitted to
the data using the ordinary least squares method. The assumption of super-linearity is not
unreasonable when one considers the wholeness condition given for the NOE measure in Table 2.
That is, it is intuitive that coupling two functionalities would in all likelihood result in a
larger number of opportunities to make errors in functionality integration.
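One common way to fit such an exponential trendline by ordinary least squares is to regress the logarithm of the response on the predictor; a sketch with synthetic data (the data points are invented, not the study's):

```python
import math

# Fit y = A * exp(B * x) by ordinary least squares on the
# log-transformed model ln y = ln A + B * x.

def fit_exponential(xs, ys):
    n = len(xs)
    ls = [math.log(y) for y in ys]
    mx = sum(xs) / n
    ml = sum(ls) / n
    b = sum((x - mx) * (l - ml) for x, l in zip(xs, ls)) / \
        sum((x - mx) ** 2 for x in xs)
    a = math.exp(ml - b * mx)
    return a, b

# Synthetic data generated from y = 50 * exp(0.05 * x):
xs = [10, 20, 40, 80]
ys = [50 * math.exp(0.05 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(round(a, 1), round(b, 3))  # recovers 50.0 and 0.05
```

Note that least squares on the log scale weights relative rather than absolute errors, which matters when error counts span an order of magnitude as in Figure 5.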
Two observations can be made about the data. First, the values of the measures in program C, the
points at the top of each scatter plot, could be considered outliers19 if no confirmation of the
data’s correctness were available. However, as they really are legitimate observations, they
should not be ignored. One underlying reason for the high error numbers of program C could
arguably be the immaturity of the software platform at the time the program was ongoing.
19 An outlier is an observation in the data set at either extreme of a sample, which is so far removed from the main body of the data that the appropriateness of including it in the sample is questionable. (Milton and Arnold, 1990)
Secondly, the data points in the lower right corners belong to product program A, that is, to a
program that is still ongoing. While in the other programs on average 6% of errors and 8% of
features were in the ‘Not yet realized’ state, the corresponding figures in program A were 30%
and 16%, respectively. This demonstrates that the work item amounts are not yet final, which
should be taken into account when correlation analysis is performed.
4.4.2. Correlation Analysis
In the external validation of a measure, a correlation coefficient is used to condense the
equivalent empirical and numerical relational systems into one number (Zuse, 1998, p. 467). Zuse
however acknowledges that for the calculation and analysis of correlation coefficients to be
meaningful, both the software measure and the external variable have to assume an extensive
structure (the conditions given above).
Then, depending on the measurement scale, a valid correlation coefficient can be determined. If
both variables can be used on an ordinal scale, Spearman’s20 correlation coefficient can be used
to compare the weak orders of both variables. If both variables can be used on a ratio scale,
Pearson’s21 correlation coefficient can also be used correspondingly. Both correlation
coefficients are given below in Table 3, with and without program A (for the reasons given above).
Table 3: Measures and correlation coefficients
Measure Pearson Spearman Pearson (A excluded) Spearman (A excluded)
FEA (attributable to program) 0.26 0.76 0.38 0.79
FEA (+ platform features attributable to program) 0.28 0.77 0.47 0.82
SUB (attributable to program) 0.10 0.71 0.50 0.82
SUB (+ platform subfeatures attributable to program) 0.38 0.79 0.82 0.92
SUB + SCO (attributable to program) 0.22 0.72 0.87 0.94
SUB + SCO (+ platform … attributable to program) 0.34 0.77 0.93 0.96
20 Spearman’s correlation coefficient: C_SPEARMAN = Σ xᵢyᵢ / √(Σ xᵢ² · Σ yᵢ²), where xᵢ and yᵢ are rank deviations.
21 Pearson’s correlation coefficient: C_PEARSON = Σ (xᵢ − x̄)(yᵢ − ȳ) / √(Σ (xᵢ − x̄)² · Σ (yᵢ − ȳ)²).
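Both coefficients are straightforward to compute; a minimal sketch, with invented data and a Spearman implementation that assumes no tied ranks:

```python
# Pearson's coefficient on raw values; Spearman's as Pearson's applied
# to the ranks of the values (valid when there are no ties).

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) *
           sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def spearman(xs, ys):
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    return pearson(ranks(xs), ranks(ys))

x = [10, 20, 30, 40]   # invented measure values
y = [12, 25, 29, 50]   # invented error counts
print(round(pearson(x, y), 3))
print(spearman(x, y))  # monotone data: rank correlation is 1.0
```

The example also shows why the Spearman values in Table 3 exceed the Pearson values: any monotone relationship gives a rank correlation of 1.0 even when the raw values are far from linear.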
Next, the correlation coefficients were statistically tested to check whether the calculated
values differ significantly from zero at a specified level of significance. Milton and Arnold
(1990) give the following test statistic for this:
Tₙ₋₂ = R √(n − 2) / √(1 − R²) . (11)
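Under the null hypothesis of zero correlation this statistic follows a t-distribution with n − 2 degrees of freedom; the two-tailed p-values in Table 4 would then come from that distribution (e.g. via scipy.stats.t.sf). A sketch computing the statistic itself with the standard library, using one coefficient from Table 3 as input:

```python
import math

# Test statistic of equation (11) for a sample correlation R over n
# observations; compare against a t-distribution with n - 2 degrees
# of freedom (p-values not computed here).

def t_statistic(r, n):
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# e.g. r = 0.93 (strongest Pearson coefficient with Program A excluded)
# over the n = 3 remaining programs:
print(round(t_statistic(0.93, 3), 3))  # 2.53
```

The result differs slightly from the 2.437 reported in Table 4, presumably because the tabulated r = 0.93 is itself rounded.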
The observed values of the test statistic, along with the two-tailed probability values from the
t-distribution, are presented in Table 4 below.
Table 4: Test statistics from significance testing
Measure Test statistic / P-value Test statistic / P-value (Program A excluded)
FEA (attributable to program) 0.384 / 0.738 0.406 / 0.755
FEA (+ platform features attributable to program) 0.417 / 0.717 0.539 / 0.685
SUB (attributable to program) 0.148 / 0.896 0.576 / 0.667
SUB (+ platform subfeatures attributable to program) 0.577 / 0.622 1.412 / 0.392
SUB + SCO (attributable to program) 0.312 / 0.785 1.786 / 0.325
SUB + SCO (+ platform … attributable to program) 0.517 / 0.657 2.437 / 0.248
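The test statistic of equation (11) and its two-tailed p-value can be reproduced numerically. The following Python sketch (an illustration, not part of the original study) integrates the Student's t density to obtain the tail probability:

```python
import math

def t_statistic(r, n):
    # Equation (11): T = R * sqrt(n - 2) / sqrt(1 - R^2). Under the null
    # hypothesis of zero correlation it follows a t-distribution with
    # n - 2 degrees of freedom.
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    # Density of Student's t-distribution with df degrees of freedom.
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=20000, upper=1000.0):
    # Two-tailed p-value: integrate the density over [|t|, upper] with the
    # trapezoid rule and double the tail area; the mass beyond 'upper' is
    # negligible for practical purposes.
    a = abs(t)
    h = (upper - a) / steps
    area = 0.5 * (t_pdf(a, df) + t_pdf(upper, df)) * h
    area += sum(t_pdf(a + i * h, df) for i in range(1, steps)) * h
    return 2.0 * area
```

With the three programs remaining when A is excluded (n = 3, i.e. one degree of freedom), a coefficient around 0.93 yields a statistic near 2.5 and a two-tailed p-value of roughly 0.25, the same magnitude as the last row of Table 4.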
As expected, the small sample size, together with the relatively low correlation coefficient values, makes it statistically uncertain whether any dependence (linear, sub-linear or super-linear in nature) exists between the variables.
4.4.3. Conclusions and Criticism
The scatter plots and the correlation analysis show interesting results as far as the interdependency between the proposed measures and the number of errors is concerned. However, nothing conclusive can be said about the nature and strength of this interdependency because of the relatively small sample size. For the same reason, further statistical analysis is meaningless.
Some criticism of the applied statistical methodology can be made. For instance, the use of Pearson's correlation coefficient assumes at least an interval scale, which implies a linear relationship between the attributes. Furthermore, it assumes that the attribute values are normally distributed. However, Fenton (1997, p. 209) notes that most software measurements are not normally distributed and usually contain atypical values. The data set could have been tested for normality using e.g. the Shapiro-Wilk test, but considering the size of the data set such testing was skipped.
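For completeness, such a normality check is straightforward with SciPy's implementation of the Shapiro-Wilk test. The data below are made up for illustration, not taken from the study:

```python
# Illustrative sketch: checking a small data set for normality with the
# Shapiro-Wilk test as implemented in SciPy.
from scipy.stats import shapiro

# Hypothetical error counts per release (made-up values only).
errors = [12.0, 15.0, 14.0, 13.0, 16.0, 12.5, 14.5, 13.5, 15.5, 14.0]

# shapiro() returns the W statistic and a p-value; a small p-value
# (e.g. below 0.05) would reject the hypothesis that the data come from
# a normal distribution. With very small samples, as in this study, the
# test has little power either way.
stat, p_value = shapiro(errors)
print(stat, p_value)
```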
Spearman's correlation coefficient is thus more robust than Pearson's, as it considers the ranks of the attributes instead of the raw values. Zuse (1998, p. 466) notes that to avoid the prerequisites of the Pearson correlation coefficient, a ranking correlation coefficient such as Spearman's or Kendall's tau should additionally be used. Zuse further points out that a high correlation does not necessarily mean that there is a causal relationship between the variables; a third variable can cause the relationship between them.
[Figure omitted: cumulative number of subfeatures (0–70) entered per month from January 2002 to October 2004, plotted separately for Programs A–D; for each program a milestone (MS) and the point at which 80% of the final subfeatures had been entered are marked.]
Figure 6: Subfeature by date entered and program
A relevant point when assessing NOR's capability to help its user determine NOE is the point in time at which the data about features become available. In Figure 6, subfeatures have been plotted per program by the date they were entered into the database.
In the figure, two points in time are highlighted for each program with two vertical lines connected by a horizontal line. The lower vertical line, ending in an arrow, represents the end of the project definition phase, at which time the specifications are expected to be frozen. The upper line marks the point in time when 80% of the final subfeatures had been inserted into the requirement management system. The horizontal line illustrates the lead time between these two points. Although 80% is a purely arbitrary percentage, the figure shows that anyone trying to estimate NOE using NOR should acknowledge that the estimate would, even in the best case, be based on only 80% of what the final feature amount could in reality be.
5. Summary and Conclusions
The importance of software engineering has grown, and the strategic intent of software has changed, in companies designing and manufacturing wireless information devices. Investments made in software engineering typify investments in intellectual capital. Software cost modeling aims to provide managers responsible for software technology creation with information about the related costs, in order to enable better management of software-related processes and assets.
This study focused on software defect estimation from a resource allocation point of view rather than from the quality point of view that is more often the concern. A short review of software cost estimation techniques was conducted. In addition, some relevant observations on how software cost estimation in a product-line based development model differs from that in a traditional one-at-a-time project were gathered through interviews.
As the scope of this study was limited, the analysis part concentrated on how different approaches and techniques have in general been used to approach software defect and effort estimation. Based on the analysis, simple measures were suggested for estimating the total number of errors, in order to assess resource allocation in software streams and software platforms. A measurement-theoretic model illustrating the conditions under which the proposed measures could be used for prediction was then constructed using the notation given by Zuse (1998).
No definite statement could be reached on whether there exists a significant correlation between the proposed measures and the number of errors, due to the insufficient sample size. And even if such a correlation had been found, one should have considered the point made by Fenton and Neil (1999, p. 680) in their paper. They concluded that the relationship between internal variables such as complexity and size, and defects, is not entirely causal, i.e. the correlation that an empirical investigation may demonstrate could be attributable to a causal dependency on a third variable.
Obviously, explaining the number of errors by the number of features is a definite oversimplification. For instance, there are 17 cost drivers (e.g. required reusability, personnel continuity, etc.) in Boehm's effort estimation model to be taken into account besides the size of the software product. Provided one understands the conditions stated in the measurement model, i.e. why the suggested measures and the technique used are in general poor for estimating the number of errors, the results of this study can be used to determine a roughly right amount of software-related errors after a certain time in a product program's life cycle.
One should, however, make no arguments or statements about the reliability of software based on the context of this study. As Fenton and Neil (1997) point out, even knowing the error count, or the exact number of residual defects in a system, one has to be extremely wary about making definitive statements about how the system will operate in practice. The variability in the way systems are used by different users makes it difficult to predict which faults are likely to lead to failures, and how serious these will be as seen by the user.
This study does not in any way help people responsible for software creation and error management to improve the relevant software processes, i.e. to change the underlying factors that causally relate greater implemented functionality to a greater amount of work due to a greater amount of induced errors. Therefore, if one wishes to understand the dynamics of fault inducement better, the causal dependencies between development activities, resources, and errors could be modeled using e.g. Bayesian belief network oriented techniques, as proposed by Fenton and Neil (1999). Future research could also focus on how to estimate the number of work items even earlier in the product development life cycle. This could arguably be possible using techniques based on analogy, as proposed by Shepperd and Schofield (1997).
6. References
6.1. Books
Boehm, B.W. (1981). Software Engineering Economics. Englewood Cliffs, New Jersey, USA:
Prentice Hall. 768 p. ISBN 0-138-22122-7.
Bosch, J. (2000). Design & Use of Software Architectures: Adopting and Evolving a Product-
Line Approach. Oxford, United Kingdom: ACM Press. 354 p. ISBN 0-201-67494-7.
Fenton, N.E. and Pfleeger, S.L. (1997). Software Metrics – A Rigorous and Practical Approach.
2nd Edition. Boston, USA: PWS Publishing Company. 638 p. ISBN 0-534-95425-1.
Milton, J. S. and Arnold, J. C. (1990). Introduction to Probability and Statistics: Principles and Applications for Engineering and the Computing Sciences. 2nd Edition. McGraw-Hill Publishing Company. 700 p. ISBN 0-07-042353-9.
Zuse, H. (1998). A Framework of Software Measurement. Berlin, Germany: Walter de Gruyter.
755p. ISBN 3-11-015587-7.
6.2. Papers and articles
Aamodt, A. and Plaza, E. (1994). Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. In: AI Communications, Vol. 7, No. 1, 1994, pp. 39-59.
Angelis, L. and Stamelos, I. (2000). A Simulation Tool for Efficient Analogy Based Cost Estimation. In: Empirical Software Engineering, Vol. 5, No. 1, 2000, pp. 35-68.
Biffl, S. (2000). Using Inspection Data for Defect Estimation. In: IEEE Software,
November/December, Vol. 17, No. 6, 2000, pp. 36-43.
Bosch, J. (1999). Product-Line Architectures in Industry: A Case Study. In: ICSE 1999: Proceedings of the 21st International Conference on Software Engineering, held in Los Angeles, California, USA, 16-22 May 1999. pp. 544–554.
Briand, L., El Emam, K., Surmann, D., Wieczorek, I. and Maxwell, K. (1999). An Assessment and Comparison of Common Software Cost Estimation Modeling Techniques. In: Proceedings of the 1999 International Conference on Software Engineering, held in Los Angeles, California, USA, 16-22 May 1999. pp. 313–323.
Fenton, N. E. and Neil, M. (1999). A Critique of Software Defect Prediction Models. In: IEEE
Transactions on Software Engineering, Vol. 25, No. 5, September/October 1999. pp. 675–688.
Gray, A. R., MacDonell, S. G. and Shepperd, M. J. (1999). Factors Systematically Associated with Errors in Subjective Estimates of Software Development Effort: The Stability of Expert Judgment. In: Sixth IEEE International Symposium on Software Metrics, held in Boca Raton, Florida, USA, 4–6 November 1999. pp. 216–229.
Idri, A., Abran, A. and Khoshgoftaar, T. M. (2002). Estimating Software Project Effort by Analogy Based on Linguistic Values. In: Proceedings of the 8th IEEE Symposium on Software Metrics, held in Ottawa, Canada, 4–7 June 2002. pp. 21-30.
Khoshgoftaar, T. M., Bhattacharyya, B. B. and Richardson, G. D. (1992). Predicting Software Errors, During Development, Using Nonlinear Regression Models: A Comparative Study. In: IEEE Transactions on Reliability, Vol. 41, No. 3, September 1992, pp. 390-395.
Khoshgoftaar, T. M. and Munson, J. C. (1990). Predicting Software Development Errors Using Software Complexity Metrics. In: IEEE Journal on Selected Areas in Communications, Vol. 8, No. 2, February 1990. pp. 253-261.
Moløkken, K. and Jørgensen, M. (2003). A Review of Surveys on Software Effort Estimation. In: Proceedings of the 2003 International Symposium on Empirical Software Engineering, held in Rome, Italy, 30 Sept.–1 Oct. 2003. pp. 223-230.
Passing, U. and Shepperd, M. (2003). An Experiment on Software Project Size and Effort Estimation. In: Proceedings of the 2003 International Symposium on Empirical Software Engineering, held in Rome, Italy, 30 Sept.–1 Oct. 2003. pp. 120-129.
Petersson, H. and Wohlin, C. (1999). An Empirical Study of Experience-Based Software Defect Content Estimation Methods. In: Proceedings of the 10th International Symposium on Software Reliability Engineering, held in Boca Raton, Florida, USA, 1–4 November 1999. pp. 126–135.
Shepperd, M. and Schofield, C. (1997). Estimating Software Project Effort Using Analogies.
In: IEEE Transactions on Software Engineering, Vol. 23, No. 12, November 1997, pp. 726-743.
Tausworthe, R. (1980). The work breakdown structure in software project management. In:
Journal of Systems and Software, Vol. 1, No. 8, pp. 181-186.
6.3. Internet
IDC – Press Release (2004). Worldwide Mobile Phone Market Grows 23%, Nokia Re-Enters
30% Share Bracket. By: IDC’s Worldwide Mobile Phone QView program.
http://www.idc.com/getdoc.jsp?containerId=pr2004_11_02_153412, referred 15-Nov-2004.
Salo, A. (2004). Material for the course 'Mat-2.134 Decision making and problem solving' in Applied Mathematics, held at Helsinki University of Technology, autumn 2004. http://www.sal.tkk.fi/Opinnot/Mat-2.134/luennot04/Uncertainty_and_risk1.pdf, referred 15-Nov-2004.
Murthi, S. (2002). Useful Estimation Techniques for Software Projects. http://www.developer.com/mgmt/article.php/1463281, referred 15-Nov-2004.
6.4. Interviewees
Elvang, Morten, Nokia Multimedia
Ruotsalainen, Reijo, Nokia Enterprise Solutions
Saikku, Kirsti, Nokia Research Center
Takala, Petri, Nokia Customer and Market Operation