Helsinki University of Technology
System Analysis Laboratory
Mat-2.108 Independent Research Projects in Applied Mathematics
A Pragmatic Study on Cost Estimation in Software
Platform Based Product Development Model
Jukka Lehmusvirta
In Salo, October 31, 2004
Supervisor: Pertti Laininen, Dr.Tech., Helsinki University of Technology
A Pragmatic Study on Cost Estimation in Software Platform Based Product Development Model Jukka Lehmusvirta, Mat-2.108 Independent Research Projects in Applied Mathematics, HUT, SAL 2004.
Contact Information: Jukka Lehmusvirta Nokia Enterprise Solutions Joensuunkatu 7, FIN-24100 SALO P.O.BOX 86 Mobile +358504821010 Fax +358718044610 Email: [email protected]
Table of Contents
TABLE OF CONTENTS
DEFINITIONS
1. INTRODUCTION
1.1. CASE ORGANIZATION
1.2. BACKGROUND AND MOTIVATION FOR THE STUDY
1.3. RESEARCH OBJECTIVES
1.4. SCOPE OF THE STUDY
2. PRODUCT-LINE BASED TERMINAL SOFTWARE DEVELOPMENT
2.1. CHANGE IN SOFTWARE DEVELOPMENT AND DELIVERY MODEL
2.2. CONCLUSION: ATTRIBUTABLE CHANGES IN SOFTWARE DEVELOPMENT
3. APPROACHES FOR SOFTWARE COST ESTIMATION
3.1. SOFTWARE COST ESTIMATION IN GENERAL
3.2. EXPERT-BASED APPROACHES
3.3. ALGORITHMIC APPROACHES
4. ESTIMATING ERROR AMOUNTS FOR RESOURCE ALLOCATION
4.1. ANALYSIS OF FINDINGS FROM LITERATURE
4.2. ESTIMATING ERROR USING REQUIREMENTS AND CHANGE ORDERS
4.3. DATA REFINEMENT AND CLASSIFICATION
4.4. RESULTS
5. SUMMARY AND CONCLUSIONS
6. REFERENCES
6.1. BOOKS
6.2. PAPERS AND ARTICLES
6.3. INTERNET
6.4. INTERVIEWEES
Definitions
The definitions presented below apply only for the purposes of this study; in the literature their
meanings may differ. Counterparts specific to the case organization are used in the text for
readability. The list is ordered by order of appearance in the text.
Product program
(Product engineering unit1)
An organizational unit responsible for new product development.
Encompasses the management team, interdisciplinary units and persons
responsible for interfacing with different functional organizations.
Software Platform
Organization
An organizational unit that owns a certain software platform (software
product-line) and is responsible for component and subsystem
deliveries to software streams or directly to product programs.
Software Streams An organizational unit that is responsible for final integration and
testing of software components, and their delivery to product programs.
Software Platform
(Software Product-Line1)
Consists of a product-line architecture and a set of reusable components
and subsystems that are designed for incorporation into the product-line
architecture.
Product architecture A product architecture is derived from the product-line architecture by
pruning unwanted features out of it, extending it with product-specific
features and resolving conflicting features.
Software fault A fault is an encoding of a human error; that is, a fault occurs when a
human error results in a mistake in some software product.
Software failure A failure is the departure of a system from its required behavior. A fault
can lead to a failure.
Software error In the case organization, errors are work items used to manage
corrective actions for faults that have resulted in failures when software
was verified against specifications or user expectations. That is, here an
error may be understood as a detected fault.
1 Notation used by Bosch (2000)
1. Introduction
1.1. Case Organization
This study was done for Enterprise Solutions (ES), one of the four business groups within Nokia
Corporation. According to IDC (2004), during the third quarter of 2004 Nokia had a 31.3% share
of the global mobile terminal market and shipped around 51.4 million terminals worldwide.
1.2. Background and Motivation for the Study
As time-to-market (TTM) is now increasingly the main driver for new product development
also in the mobile terminal business, predictability - the ability to anticipate the probable chain of
events - is a desired capability to possess. In mobile terminal development, predictability is
utilized e.g. in resource allocation for software implementation and testing.
In a software product-line2 based product development model, a product program may not
implement or even test the terminal software by itself, but rather outsources the
implementation and maintenance (i.e. error correction and evolution) of software technology
components to software platform organizations (see Figure 2), and component integration and
system verification to software streams. In this kind of development model, a product program's
responsibilities include creating development work items, i.e. requirements, change orders,
error reports etc., and managing the outsourced development by monitoring and prioritizing these.
Because of this model, the software effort estimation techniques presented in literature are
generally inadequate or inapplicable for modeling development costs, as argued also by Bosch
(1999). Furthermore, data gathering for model population may be costly as well as problematic,
as development activities cross organizational borders, even though they are mostly intra-organizational.
An issue requiring predictability is the cumulative amount of software errors induced during
the project execution phase that is attributable to a product program. This study tries to help the
persons responsible for cost estimation in the case organization by reviewing a set of approaches
for effort/defect estimation in software engineering. The main objective of the study is to present
a pragmatic approach for assessing resource allocation for error correction in software platform
organizations.
2 See Bosch (2000) about software product lines.
1.3. Research Objectives
The research problem is defined as follows:
• How can the cumulative amount of errors in product-line software attributable to a product
program be estimated in a software product-line based development model?
The research problem is divided into the following questions:
• What different approaches to cost estimation are there in general, and what kinds of
techniques have been applied for cost estimation in software engineering?
• What makes cost estimation in software product-lines different from cost estimation in a
traditional one-at-a-time project?
The research objective is therefore the following:
• To present a pragmatic approach for error estimation in a constituent organization.
1.4. Scope of the Study
This study does not try to be a comprehensive presentation of the software cost estimation
techniques and models suggested in literature, but rather a pragmatic, reviewing study on the
strengths, weaknesses and opportunities of different software cost estimation approaches and
the requirements they place on data. It is left for further research to analyze the usability of
the different approaches/techniques in the case organization and how they should be adjusted to
fit the requirements that a product-line based development model introduces.
Software cost estimation research focusing on defect/error estimation has tended to
concentrate on three perspectives, namely i) predicting the number of errors a system will
reveal in testing or operation, ii) estimating the reliability of the system in terms of time to
failure, and iii) understanding the impact of design and testing processes on error counts and
failure densities. This study concentrates on research done in the first one, but from a resource
allocation rather than a quality point of view.
The study is explorative in nature and has a pragmatic grip on issues; estimation and
measurement theories as such are out of the scope of this study.
2. Product-line Based Terminal Software Development
This chapter briefly goes over terminal software development in a software product-line based
development model. The chapter ends with results from interviews focusing on the differences
between product-line and one-at-a-time software development models from a cost-modeling
point of view.
2.1. Change in Software Development and Delivery Model
2.1.1. Software Development in Product Programs
Before the mobile telecommunication boom of the late nineties, the number of product programs
developing mobile terminals in the case organization was modest. Product programs were almost
entirely autarchic in their software development.
The disutility from developing almost identical software in parallel progressing product programs
was, however, not significant enough to cause a change in the way software development
was done. Alleviating reasons were the relatively small amount of software needed for products
and the lack of modularity in contemporary software architectures.
[Figure content: product programs A-G on a time axis t, copying software (SW) from one another and routing REQs/CRs/errors to independent software projects.]
Figure 1: Product oriented Software Development by Programs
Subsequent product programs merely copied (solid arrows in Figure 1) terminal software from
previous products and modified it to match renewed requirements. Programs implemented new
requirements partly by themselves and partly by outsourcing development to autonomous, in-house
software projects (dashed lines). The benefits of product-wise development emerge from
its simplicity: there is no need to manage conflicting demands for development resources, as
each program owns its resources.
2.1.2. Software Delivery from Software Platform Organization
Product-wise software development can be efficient while the number of products in the company's
product portfolio is small enough. However, as the pressure to differentiate products according to
more granular market segmentation widens the portfolio, and as the introduction of new
functionality increases the amount of software needed for products, the disadvantages of product-program-wise
software development may outweigh the benefits.
In product-wise product development the amount of duplicated work grows in proportion to the
number of programs. A lot of development effort is lost, as the same solutions with minor
alterations could be utilized simultaneously in more than one product. Notably, work is duplicated
not only in software implementation but also e.g. in testing and error correction.
Moreover, copying code across product programs arguably erodes it over a long period of time if
there is no central authority to control, define and document the changes made to it. This can
lead to a situation where the understandability, performance and reusability of the software
become a significant cost factor in using it in new products.
[Figure content: software platforms delivering components via software projects and a software stream to product programs A-H on a time axis t.]
Figure 2: Platform oriented Software Development for Programs
Product-line oriented software development centralizes software development and software
architecture management. Products belonging to the same product-line can exploit software
components adhering to the product-line architecture, and moreover utilize shared resources
sited in software streams.
2.2. Conclusion: Attributable Changes in Software Development
A few interviews were conducted in order to understand how cost estimation differs in a
software product-line based development setting. The following issues were brought up in the
interviews.
2.2.1. Continuously Evolving Software Asset
The core idea of product-platform oriented product development is to branch a new product-line
architecture from the previous version with minimized effort. Branching preserves the previous
version of the product-line architecture for further exploitation, while new functionality can be
developed on top of the new one by extending/modifying present functionality or implementing new.
Designing reusable assets that support branching is a multi-variable decision problem, and this
sometimes confounds designers, as pointed out also by Bosch (1999). This is arguably a cause of
volatility in estimates.
2.2.2. Built-in Variability of Software Components
For components adhering to a certain product-line architecture to be truly valuable, they
need to encompass the required level of variability to adapt to specific product architectures with
relative ease. That is, the built-in variability of components is utilized to accommodate a component
into a specific hardware environment, and to include needed component functionality and
exclude overhead.
In product-wise software development there is no true need to insert variability into software
components, as each software component is explicitly tailored with a particular product in mind.
That is, all the functionality and interfaces in a component are necessary and sufficient.
However, in a component adhering to a certain product-line architecture there can exist a
considerable amount of functionality and interfaces that are not needed in a certain product
architecture. Cost-modeling these variation points is difficult, as also pointed out by Bosch (ibid.).
As an implication of the aforementioned, not all defects attributable to a certain software
component are relevant for a product program whose product architecture uses the functionality
of that component only partially.
2.2.3. Distributed integration of software components
Software components are regularly integrated in order to verify their compliance with other
components at the system level and in the target hardware. In product-wise software development,
component implementation and integration are often done by the same person. Furthermore,
components are designed to be integrated with a certain known set of time-invariant components.
This may not be the case in software product-line based development. Components are possibly
developed in dispersion, depending on the work decomposition inside the software platform
organization. Regular integration, again, may be done at multiple levels by separate teams
specialized in certain product families or products. As components are developed not against
certain components but rather to certain product-line architectural principles, a new breed of
integration-based defects emerges.
2.2.4. Lack of Inter and Intra Organizational Visibility and Control
Software development projects responsible for the development of certain components adhering to
the product-line architecture are rather independent in their internal operations. There exist no
common policies on reporting, data collection or metric definitions.
Fenton (1997) supports this observation by recognizing that the first and most important method of
improving cost estimation in a particular environment is to use size and effort measures that are
defined consistently across the environment; the measures must be understandable by all who
must supply or use them. Lack of organizational visibility as well as managerial control makes
this very difficult in a dispersed organization.
3. Approaches for Software Cost Estimation
In the following, the words 'approach', 'technique' and 'model' are used. For readability, some
definitions are therefore given. Here 'approach' refers to a high-level means of tackling the
problem of estimating effort/defects. 'Technique' and 'model', again, refer to more specific
means, i.e. to a formal procedure or to a mathematical function producing an estimate.
In his book, Boehm (1981) gives the following listing of estimation approaches: algorithmic
models, expert judgment, analogy, Parkinson, price-to-win, top-down and bottom-up. However,
most of the techniques reported in various research papers illustrate the fact that more and
more of them embody aspects of two or more of the approaches given by Boehm.
In the following, as different estimation approaches are discussed, the focus is kept on a very high level.
Acknowledging the diversity of ways that estimation techniques could be classified according
to the objective of the task, techniques are here classified into expert-based and algorithmic
approaches. This emphasizes the techniques' relationship to data and its source.
3.1. Software Cost Estimation in General
Software cost estimation is a task aiming to provide a manager responsible e.g. for software
technology creation or software project management with information about the costs involved in
software development. Here the focus is on effort estimation in terms of defects/errors.
3.1.1. About the Accuracy of Estimates
It can be argued that software cost modeling cannot, even at its very best, provide estimates that
can be considered more than reasonably accurate. There are many reasons why estimation
techniques fail to perform in software engineering. Boehm (1981, p. 32) lists a few of them:
1. source instructions3 are not a uniform commodity, nor are they the essence of the
desired product;
2. software requires the creativity and cooperation of human beings, whose individual
and group behavior is generally hard to predict;
3 Boehm refers to an internal product attribute, size, that can be measured e.g. as the number of instructions.
3. and software has a much smaller base of relevant qualitative historical experience, and
it is hard to add to the base by performing small controlled experiments.
The acknowledged inaccuracy of estimates in the field of software engineering is well
illustrated by the categorization given by Murthi (2004). His three-level categorization of the
accuracy of estimates is the following: ballpark/order-of-magnitude, rough, and fair estimates. Fair
estimates - ideally about 25% to 50% off the actual value - are possible when one is very
familiar with what needs to be done and has done it many times before, e.g. while adding well-understood
functionality that has been implemented before; rough estimates - ideally about 50% to
100% off the actual value - are possible when working with well-understood needs and one is
familiar with the domain and technology issues; order-of-magnitude estimates, again, would fall
within two or three times the actual value and are often the ones started with.
3.1.2. About Subjectivity and Estimates
Looking at Boehm's list of causes above, it is evident that software cost estimation is
characterized by some level of subjectivity. According to Gray et al. (1999, p. 216), not even
purely mathematical models are totally immune to subjectivity. They argue that when effort
estimates are made, whatever the technique used, some subjectivity is involved - either in making
the estimates themselves (as in most cases they are in fact simply guesstimates based on
subjective opinion) or in calibrating some inputs into the model. For instance, in expert-based
approaches the inclusion of subjectivity, intentional or inadvertent, is straightforward; estimates
depend on the knowledge, capability and objectivity of the estimator, who may be biased,
optimistic, pessimistic, or unfamiliar with key aspects of the project.
However, subjectivity - although often seen as a threat to the reliability of estimates - can also
be an opportunity. Biffl (2000) reported that subjective defect estimation models (DEMs)
generally estimated defect amounts in software documents more accurately than objective
DEMs. This kind of result can possibly be attributed to expert judgment's ability to factor in
the differences between past project experiences and the new techniques, architectures, or
applications involved in the future project - a list given by Boehm (1981, p. 333).
Gray et al. (1999, p. 217) further point out that the inclusion of subjective elements into models,
in cases where the data sets are small and the estimation process is limited to empirical models,
allows for a great reduction in the number of variables as well as accounting for factors that are
difficult to measure. It can therefore be cautiously concluded that subjectivity coming from
experts may, in proper circumstances, be used as an adjustment factor for estimation models.
3.2. Expert-based Approaches
3.2.1. About Expert Opinion based Techniques
Expert opinion based techniques involve consulting with one or more experts, who use their
experience and understanding of the proposed project/product to arrive at an estimate of its
cost. That is, the parameters of the project/product are described to mature developers who use
their personal experience to turn them into effort predictions.
Since estimation is done by humans, the estimates embody a relatively high level of
subjectivity. Reflecting this, the expert opinion techniques proposed in literature often
aim to offer means to reduce the level of subjectivity by making estimates more transparent,
preventing important aspects from being overlooked, and eliciting underlying assumptions and
possible threats to the estimates (Passing and Shepperd, 2003, p. 120).
3.2.2. About Estimation by Analogies
The idea of analogy-based estimation is to identify the completed projects/products that are the
most similar to a new project/product. The key activities are the identification of a problem as a
new case, the retrieval of similar cases from the repository, the reuse of knowledge derived
from previous cases and the suggestion of a solution for the new case. Shepperd (1997)
suggests that 10-12 cases are required in order to provide a stable basis for estimation.
The approach has two main issues to deal with: how to characterize cases and how to retrieve
similar cases, i.e. how to measure similarity. Shepperd (ibid.) notes that codifying a repository
of projects is a major challenge that may require the use of an expert to establish those features of
a case that are believed to be significant in determining the similarity and differences of cases.
Kolodner (1993) lists a number of approaches to measure similarity between cases: nearest
neighbor algorithms, manual guided induction, template retrieval, goal directed preference,
specificity preference, frequency preference, recency preference, and fuzzy similarity.
Once the analogous projects have been found, the known effort of similar cases can be utilized
in a variety of ways. While Shepperd (1997) used both weighted and unweighted averages of up to
three analogies, Angelis (2000) suggested that the bootstrap method should be used to choose
and configure i) the distance measure used to evaluate the similarity between projects,
ii) the number of analogies to take into account in the effort estimation, and iii) the statistic that
will be used in calculating the unknown effort from the efforts of the similar projects/products.
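As a concrete illustration of the retrieval-and-reuse steps above, the sketch below implements a minimal nearest-neighbour analogy estimator. The repository contents, feature vectors and the choice of an unweighted mean over three analogies are illustrative assumptions, not data from this study.

```python
import math

# Hypothetical repository of completed projects: feature vector + known effort.
# In practice features would be normalized and possibly weighted by an expert.
repository = [
    {"features": [10.0, 3.0, 1.0], "effort": 120.0},
    {"features": [12.0, 4.0, 1.0], "effort": 150.0},
    {"features": [40.0, 8.0, 3.0], "effort": 600.0},
    {"features": [38.0, 7.0, 3.0], "effort": 560.0},
    {"features": [20.0, 5.0, 2.0], "effort": 260.0},
]

def euclidean(a, b):
    """Unweighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def estimate_by_analogy(new_features, repo, k=3):
    """Average the known effort of the k most similar completed cases."""
    ranked = sorted(repo, key=lambda case: euclidean(case["features"], new_features))
    nearest = ranked[:k]
    return sum(case["effort"] for case in nearest) / len(nearest)

# Estimate effort for a new project characterized by a hypothetical feature vector.
estimate = estimate_by_analogy([11.0, 3.5, 1.0], repository, k=3)
```

The unweighted mean over the nearest cases corresponds to the simplest of the statistics Angelis (2000) would have the bootstrap select among.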
3.3. Algorithmic Approaches
Algorithmic models provide one or more mathematical algorithms which produce an estimate
as a function of variables considered to be the major cost drivers. The most common forms of
algorithms used for software cost estimation are presented below with some examples from
literature. The categorization below is by Boehm (1981, p. 330).
1. Linear models: A linear model for cost estimation has the general form
Effort = a0 + a1 x1 + a2 x2 + ... + an xn , (1)
where x1, ..., xn are the cost driver variables, and a0, ..., an the set of coefficients
chosen to provide the best fit to the set of observed data points. An example of a simple
linear model predicting errors is the one by Akiyama (1971),
D = 4.86 + 0.018 L , (2)
where D is the number of defects in a program and L is the number of code lines.
2. Multiplicative models: The general form of a multiplicative cost-estimating model is
Effort = a0 · a1^x1 · a2^x2 · ... · an^xn , (3)
where x1, ..., xn are the cost driver variables, and a0, ..., an the set of coefficients
chosen to provide the best fit to the set of observed data points.
3. Analytic models: An analytic model takes the more general mathematical form
Effort = f(x1, x2, ..., xn) , (5)
where x1, ..., xn are the cost driver variables, and f is some mathematical function.
Lipow's (1982) model,
D/L = A0 + A1 ln L + A2 (ln L)^2 , (6)
where D and L are as previously stated and each of the Ai is dependent on the average
number of usages of operators and operands per LOC for a particular language, defines
a relationship between fault density and size, and also takes the differences among
programming languages into account.
4. Tabular models: Tabular models contain a number of tables, which relate the values of
cost driver variables to portions of the software development effort.
5. Composite algorithmic models: Composite models incorporate a combination of
linear, multiplicative, analytic and tabular functions to estimate software effort as a
function of cost driver variables. In the constructive cost model (COCOMO) by Boehm
(1981), the effort estimation model takes the general form
Effort = a S^b F , (4)
where a is a productivity parameter, S is size measured in thousands of delivered source
instructions (KDSI), b captures economies or diseconomies of scale, and F is a
convolution of cost factors (see Boehm for a list).
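The composite form (4) can be sketched with the well-known basic-mode COCOMO parameters from Boehm (1981); in this simplification the cost-factor convolution F is fixed to 1.0 and the project sizes are made up.

```python
# Basic-mode COCOMO parameters (a, b) per development mode, from Boehm (1981).
# Effort comes out in person-months for a size S given in KDSI; the cost-factor
# convolution F is taken as 1.0 in this sketch.
MODES = {
    "organic":      (2.4, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded":     (3.6, 1.20),
}

def cocomo_effort(kdsi, mode="organic", f=1.0):
    """Effort = a * S^b * F for the chosen development mode."""
    a, b = MODES[mode]
    return a * kdsi ** b * f

effort_small = cocomo_effort(8.0)                 # small organic project
effort_large = cocomo_effort(128.0, "embedded")   # large embedded project
```

The exponent b > 1 means effort grows faster than size, i.e. the model assumes diseconomies of scale in all three modes.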
The example models presented above have a common denominator, i.e. they try to predict an external4
(product) quality attribute by an internal5 (product) attribute (i.e. measure) that
is measurable or derivable from plans, specifications or designs relatively early in development.
The internal product attribute used in the examples considers the size of a software product from
the aspect of length (e.g. lines of code). In addition to length - the physical size of the software
product - Fenton (1997, p. 245) suggests that software size can be described with functionality,
which measures the functions supplied by the product to the user, and complexity, which can be
interpreted in different ways depending on perspective: computational, algorithmic, structural and
cognitive complexity. Fenton (ibid.) further acknowledges that the degree of reuse is also
important to consider when size is provided as input to effort, cost and productivity models.
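The coefficients in the linear form (1) are "chosen to provide the best fit" to observed data; for a single cost driver this reduces to ordinary least squares, sketched below with invented (lines-of-code, defect-count) observations. Akiyama's published coefficients are evaluated alongside for comparison.

```python
# Minimal closed-form least-squares fit of D = a0 + a1*L for one cost driver
# (lines of code). The observations below are invented for illustration.

def fit_linear(xs, ys):
    """Return (a0, a1) minimizing the squared error of y = a0 + a1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    a0 = mean_y - a1 * mean_x
    return a0, a1

loc =     [1000.0, 2000.0, 4000.0, 8000.0]   # module sizes (invented)
defects = [  22.0,   41.0,   80.0,  148.0]   # observed defect counts (invented)

a0, a1 = fit_linear(loc, defects)
predicted = a0 + a1 * 3000.0                  # fitted model, 3000-LOC module
akiyama = 4.86 + 0.018 * 3000.0               # Akiyama (1971), same module size
```

Note that the multiplicative form (3) can be fitted with the same machinery after a log transform, since ln Effort is linear in the cost drivers.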
Many software engineers argue that instead of length, the amount of functionality inherent in a
product gives a better basis for estimation. As a distinct attribute, functionality captures an
intuitive notion of the amount of function contained in a delivered product or in a description of
what the product is supposed to be. There have been several attempts to define measures for
functionality: function points by Albrecht, object points by Boehm, specification weights by
DeMarco, Mark II function points by Symons and feature points by Jones.
4 External attributes of a product, process or resource are those that can be measured only with respect to how the product, process or resource relates to its environment. (Fenton, 1997, p. 74)
5 Internal attributes of a product, process or resource are those that can be measured purely in terms of the product, process or resource itself. (Fenton, 1997, p. 74)
Chapter 4: Estimating Error Amounts for Resource Allocation
A Pragmatic Study on Cost Estimation in Software Product-Line Based Product Development Model Jukka Lehmusvirta, Mat-2.108 Independent Research Projects in Applied Mathematics, HUT, SAL 2004.
12
4. Estimating Error Amounts for Resource Allocation
In the literature part of the study several techniques were reviewed. The techniques are collected and
classified in Table 1. In the following synthesis the qualitative analysis is very general, as the scope
of this study is rather limited.
Table 1: Some reviewed estimation techniques
Approach Some techniques applied in literature
Expert opinion based techniques
• Software project size and effort estimation using checklists (see Tausworthe, 1980) and the group consensus technique Wideband Delphi (see Boehm and Farquhar, 1974). (Passing and Shepperd, 2003)
• Subjective team estimation models based on inspection data. (Biffl, 2000)
Analogy-based techniques
• Effort estimation using reasoning by analogy, fuzzy logic and linguistic quantifiers. (Idri et al., 2002)
• Effort estimation using analogies and comparison to regression models. (Shepperd and Schofield, 1997)
Algorithmic techniques
• Defect estimation using non-linear regression models and complexity & other internal variables as measures. (Khoshgoftaar et al., 1990; Khoshgoftaar et al., 1992)
Other • Bayesian Belief Networks. (Fenton and Neil, 1999)
• Software defect estimation using capture-recapture and curve fitting. (Petersson and Wohlin, 1999)
4.1. Analysis of Findings from Literature
A study by Moløkken and Jørgensen (2003), reviewing surveys on software cost estimation in the
software industry, identified that expert-based estimation techniques are by far the preferred
approach for software cost estimation.
There may be many reasons for this. Boehm (1981) notes that expert-based estimation has the
ability to factor in the differences between past project experiences and the new techniques,
architectures, or applications involved in the future project, and to factor in exceptional
personnel characteristics and interactions, or other unique project considerations. That is,
Boehm gives a lot of credit to its robustness and ability to cope with unique situations, which is
a much emphasized characteristic of software projects in software engineering textbooks.
However, expert-based techniques are claimed not to give any better estimates than the
expertise and objectivity of the estimator allow, and they are subject to a number of causes of
bias (see Salo, 2004). Gray et al. (1999) give the following causes for frequent misestimation:
1. changes in technology that are not fully understood in terms of their effect on effort
(e.g. the effect that a new tool can have on software development);
2. the difficulty of assessing factors like levels of personnel experience and skills;
1. changes in technology that are not fully understood in terms of their effect on effort
(e.g. effect that a new tool can have for software development);
2. difficulty to assess factors like levels of personnel experience and skills;
3. a lack of understanding of the module/system characteristics (i.e. some feature may
be seen as simple by a particular manager when in fact it may require considerable
development effort);
4. other influences due to the estimator’s background;
5. any number of other related reasons (including political and motivational goals, etc.).
No literature reference was found discussing software defect estimation techniques based on
expert opinions. Passing and Shepperd (2003), however, explored two techniques frequently
suggested as support for software project effort and software size estimation: checklists6 and
the group discussion technique Wideband Delphi7.
They concluded that both checklists and group discussions improved the accuracy of the lines-of-code8
estimates and increased the confidence the estimators had in their estimates. Checklists
also improved the consistency and transparency of the estimates (LOC and effort) and increased
the size estimates, while group discussion did not have any additional influence on these.
However, as Passing and Shepperd (ibid.) concluded, students acted as estimators in the
experiment design, so it remains unclear to what extent these results can be generalized to
experienced estimators and outside the controlled experiment environment.
No literature reference was found either discussing software defect estimation based on
analogies. All reviewed papers seemed to approach estimation from a comparative-analysis point
of view. That is, the papers focused on comparing analogy-based estimation models with
algorithmic or other formal models presented elsewhere, using external data sets. Metrics like
Pred9, adjusted R squared or the mean magnitude of relative error (MMRE) were used to measure
the accuracy of the estimates.
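The two metrics defined above are simple to compute; a minimal sketch, with invented example values:

```python
# MMRE: the mean magnitude of relative error |actual - estimate| / actual.
# Pred(25): the share of predictions within 25% of the actual value.

def mmre(actuals, estimates):
    return sum(abs(a - e) / a for a, e in zip(actuals, estimates)) / len(actuals)

def pred(actuals, estimates, level=0.25):
    hits = sum(1 for a, e in zip(actuals, estimates) if abs(a - e) / a <= level)
    return hits / len(actuals)

actual = [100, 200, 400]     # invented actual efforts
estimate = [110, 150, 390]   # invented estimates
print(round(mmre(actual, estimate), 3))  # (0.10 + 0.25 + 0.025) / 3 = 0.125
print(pred(actual, estimate))            # all three within 25% -> 1.0
```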
Shepperd and Schofield (1997), for instance, compared estimates produced by an analogy-based
technique building on another analogy technique, namely case-based reasoning
6 For more see Tausworthe, 1980.
7 A technique modified by Boehm (see Boehm, 1981) from original Delphi technique by Helmer (1966).
8 LOC is an acronym for lines of code.
9 E.g. Pred(25) is the percentage of predictions that fall within 25 percent of the actual value.
(CBR)10, to estimates obtained through stepwise regression models using nine different data sets,
each with different attributes and a varying number of attributes. To determine the
most similar projects they used the unweighted Euclidean distance between project attributes as a
similarity measure; they criticized the similarity measures proposed by Kolodner (see above) as
suffering from computational complexity, being intolerant of noise and irrelevant attributes,
being incapable of handling symbolic or categorical attributes, and being weak at higher-order
attribute relationships (i.e. relationships which can be derived from the structure of the data).
Shepperd and Schofield (ibid.) reported that the analogy-based estimation technique resulted in a
better MMRE11 on all data sets, and on seven out of nine when Pred was used as a measure. They
further concluded that estimation by analogy is able to generate estimates where algorithmic
models were not, because of the categorical nature of the data and the lack of statistical
relationships. They also argued that, as this type of situation may be quite common particularly
at a very early stage in a project, analogy could be an attractive method for producing very
early estimates.
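The core of the approach described above, finding the nearest past project by unweighted Euclidean distance and reusing its actual effort, can be sketched as follows. The attribute vectors and effort figures are invented, and the attributes are assumed to be pre-normalized numeric values.

```python
import math

# Analogy-based estimation in the spirit of Shepperd and Schofield:
# pick the most similar completed project by unweighted Euclidean
# distance over numeric attributes and reuse its actual effort.

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def estimate_by_analogy(new_project, past_projects):
    """past_projects: list of (attribute_vector, actual_effort) pairs."""
    best = min(past_projects, key=lambda pe: euclidean(new_project, pe[0]))
    return best[1]

# Invented, pre-normalized attribute vectors with actual efforts:
past = [((0.2, 0.5), 120), ((0.8, 0.3), 300), ((0.4, 0.6), 150)]
print(estimate_by_analogy((0.35, 0.55), past))  # nearest is (0.4, 0.6) -> 150
```

Real implementations typically average over the k nearest analogies and normalize each attribute to a common range first.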
Idri et al. (2002) report results similar to Shepperd’s while using fuzzy sets and linguistic
quantifiers to include both numerical and linguistic (categorical) values such as ‘very low’,
‘low’ and ‘high’ in analogy-based cost estimation models. However, for instance Briand et al.
(1999) report opposite results when comparing common software cost modeling techniques
using data from the European Space Agency. They also concluded that the analogy-based models
used in the experiment setting did not seem to be as robust when using data external to the
organization for which the model was built.
Recently, algorithmic approaches that use internal variables like size and complexity to predict
defects have received a lot of criticism. In their paper Fenton and Neil (1999) point out that
the correlations the variables demonstrate in empirical investigations are not necessarily
causal, i.e. a third variable may be responsible for the correlation. Zuse (1998, p. 422) points
out that in order to solve this third-variable problem, a proper hypothesis of reality may be
necessary.
In addition to several points about problems in statistical methods and the quality of data used
in model building and testing, Fenton and Neil (ibid.) list the following observations about the
way complexity metrics are misleadingly used to predict defect counts:
10 For more see Aamodt and Plaza, 1994.
11 MMRE is an acronym for mean magnitude of relative error.
1. the models ignore causal effects of programmers and designers;
2. overly complex programs are themselves a consequence of poor design ability or
problem difficulty;
3. defects may be introduced at the design stage because of the overcomplexity of the
design already produced.
To overcome the shortcomings they attribute to defect prediction models using just internal
variables as input, Fenton and Neil (ibid.) introduce a technique based on Bayesian belief
networks (BBN). The model and discussion presented by Fenton and Neil (ibid.) again approach
defect estimation from a software quality point of view but, arguably, offer a method to
quantify causal relationships between variables that can be assumed to be relevant in a
particular environment and problem.
4.2. Estimating Error Using Requirements and Change Orders
4.2.1. Description and Grounds for Chosen Approach
One conclusion from the above analysis, rather intuitive, is that there does not exist a ‘best’
estimation technique outperforming all others in every situation; this is supported e.g. by Idri
et al. (2002, p. 21). It can be argued that in reality there are a number of pragmatic
constraints and challenges that inhibit and complicate the use of estimation techniques,
whatever they might be. For instance, the inter- as well as intra-organizational borders that
are consequences of the organizational structure in a software product line make applying a
given method impossible without time and management support.
The approach presented here is simple. In order to evaluate the costs exerted by a new product
program on one of the software platform organizations in terms of errors, the possible
interdependency between error and requirement amounts is analyzed by forming a model using the
notation of Zuse (1998). This model gives the conditions under which the suggested measures can
be meaningfully analyzed using, among other things, statistical analysis.
The approach is not entirely new in experimental software engineering research. There are many
suggestions on how one should measure the size of software using the functionality it
encompasses, and how to use this to produce estimates of external variables. However, few
have considered the conditions under which the suggested measures can be used for these
purposes.
4.2.2. Requirement Engineering
In contemporary software development life cycle models, requirements engineering (RE) is a
continuous activity aiming to assure that the product will meet customer expectations.
In the project definition phase, RE concentrates on analyzing the large quantity of requirement
requests emerging from a number of sources. Requests that are considered valid for the particular
business case are sent to the software platform organization. After refinement together with the
product program, the software platform organization then decides whether or not it can deliver a
software implementation fulfilling the request on the schedule requested by the product program.
Accepted requirement requests are included in the software platform organization’s roadmap,
broken down into their platform-side counterparts, i.e. features and subfeatures (see below), and
these are assigned to one or more software projects, depending on the work decomposition inside
the software platform organization.
As the time from project definition to first sales release in mobile terminal development usually
ranges from one to five years, a product program would in all likelihood find itself holding an
outdated product if RE were limited to the project definition phase only. Therefore, a product
program continuously evaluates whether its requirement set is competitive or whether it needs to
be changed somehow.
Change orders are requirement requests sent to the software platform organization during the
project execution phase, after the specification freeze that ends the project definition phase.
Change orders, if accepted after a decision-making process similar to that for original
requirement requests, can lead either to changes to existing feature(s), i.e. to feature change
orders (see below), or result in an entirely new feature.
4.2.3. Description of the Data
Data consisting of the work entities presented below was collected from four past projects. Each
work entity type has several attributes, including e.g. report title, unique identification code,
date detected and date closed.
1. Requirement request (REQ). A work order from a product program to the software
platform organization. Requirement requests illustrate the functionality to be implemented
as seen by the product program. An example content of a requirement request is that the
terminal software should implement the Bluetooth (BT) protocol stack and a certain set of
relevant user profiles.
2. Change orders (CR). A new requirement request, or a request for a change to terminal
software functionality, entered during the project execution phase. Again the product
program’s interpretation of the required functionality.
3. Features (FEA) and subfeatures (SUB). Features and subfeatures represent the
functionality to be implemented as seen by the software platform organization.
4. Feature change orders (FCO) and subfeature change orders (SCO).
Feature/subfeature change order represents a change or an enhancement of a
feature/subfeature to be implemented as seen by the software platform organization.
5. Error reports. An error report is generated when a deviation from the specification or expected
behavior is detected while the software is being tested. Like requirement requests and
change orders, error reports are work orders for the software platform organization; as
mentioned, the software platform organization is responsible for software maintenance
throughout the product’s life cycle, i.e. from the project definition phase to maintenance
releases for products already on the market.
Thus, requirement requests and change orders represent the customer side’s, i.e. the product
programs’, understanding of the need; they seldom consider the technical details of the
functionality requested. Therefore, on the software platform organization’s side they are mapped
to features (FEA) and these again to subfeatures, and change orders correspondingly to feature
change orders (FCO) and subfeature change orders.
The mapping is however not bijective. The relationship between these work items is that one
requirement (or change order) can bring about several features, and one feature can be common to
one or more requirements. Moreover, one requirement (or change order) can bring about several
feature change orders, and one feature change order can be common to one or more requirements
(or change orders).
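The many-to-many mapping described above can be sketched with sets; the identifiers below are invented for illustration.

```python
# Invented requirement-to-feature mapping: one requirement can bring
# about several features, and one feature can serve several requirements.

req_to_fea = {
    "REQ-1": {"FEA-10", "FEA-11"},  # one requirement, several features
    "REQ-2": {"FEA-11"},            # FEA-11 is shared by REQ-1 and REQ-2
}

def features_of(req_ids):
    """All features attributable to a set of requirement requests."""
    out = set()
    for r in req_ids:
        out |= req_to_fea.get(r, set())
    return out

print(sorted(features_of({"REQ-1", "REQ-2"})))  # FEA-11 counted only once
```

Because the mapping is not bijective, feature counts cannot simply be summed over requirements; the set union above avoids double counting.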
4.2.4. Measurement Theoretic Model and Work Assumptions
Measurement is the mapping of empirical properties to numbers, which thereafter express the
meaning of these empirical properties12. For instance complexity measure by McCabe
12 Preservation of all relations between the empirical and numerical statement is called homomorphism. That is, a scale is a homomorphism between two relational systems.
quantifies the complexity of a software design presented as a flowchart into a number. One can
then compare the complexity of one software design to another in this one respect by using
McCabe’s number.
According to Zuse (1998, p. 419), when one attempts to predict an external quality attribute by
an internal quality attribute, three things are needed:
1. Criteria – empirical conditions/properties – to describe the behavior of the internal and
external software quality attributes
2. The scale levels of the internal and external quality attributes
3. An empirically validated function between the internal and external attribute
Considering the empirical properties of objects allows one to decide whether a measure of an
internal attribute is appropriate for predicting the external variable. To illustrate the
empirical properties of the objects under scrutiny, let us assume that F1, F2 ∈ F, where F is the
set of functionality objects. Thereafter the statement
F1 •> F2 , (7)
where •>13 is an empirical ‘more difficult to develop than’ relation, is interpreted as:
functionality F1 is more difficult to develop than functionality F2. If the number of
requirements14 (NOR) is then chosen as a measure for this, it then has to hold
F1 •> F2 ⇔ NOR(F1) > NOR(F2) , (8)
for all F1, F2 ∈ F, if we assume homomorphism between the empirical and numerical relational
systems of the measure. Thereafter, using the previous notation, the following statement about
the number of errors (NOE) can be written:
NOE(F1) > NOE(F2) , (9)
where F1, F2 ∈ F as above. By this it is said that functionality F1, being more difficult to
develop than F2, will induce more errors than functionality F2.
13 An empirical relation is defined here as •> ⊆ F × F, where F is a set of functionality objects.
14 The number of requirements can also be considered in terms of features, subfeatures, etc., since there is a defined mapping policy between requirements and measures.
To use the measure NOR in a prediction model, which here can be written as
NOE(F) = f(NOR(F)) , (10)
where F ∈ F, F is the set of functionality objects and f is the prediction function, both
measures should have corresponding properties (Zuse, 1998, p. 449). Thus, here the following
needs to hold:
1. NOR assumes an extensive structure15 and can be used as an additive ratio scale;
2. NOE assumes an extensive structure and can be used as an (additive or non-additive)
ratio scale, if conditions given in Table 2 apply;
3. and the ranking order (i.e. weak order) of the measure NOR should correspond with the
ranking order of the external variable NOE.
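The third requirement, correspondence of the ranking orders, can be checked directly on data. The function and the data values below are invented for illustration, assuming NOR and NOE each yield one number per functionality object.

```python
# Check that the weak order induced by the measure NOR corresponds with
# the weak order of the external variable NOE over a set of
# functionality objects (requirement 3 above). Data values are invented.

def rankings_correspond(nor, noe):
    """nor, noe: dicts mapping functionality id -> measured value.
    True iff NOR(F1) > NOR(F2) always implies NOE(F1) > NOE(F2)."""
    items = list(nor)
    for f1 in items:
        for f2 in items:
            if nor[f1] > nor[f2] and not noe[f1] > noe[f2]:
                return False
    return True

nor = {"F1": 12, "F2": 7, "F3": 3}   # invented requirement counts
noe = {"F1": 40, "F2": 25, "F3": 9}  # invented error counts
print(rankings_correspond(nor, noe))  # True: the orders agree
```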
Table 2: Twelve conditions for the external variable given by Zuse (1998)
Condition Definition
Weak order NOR(F1) ≥ NOR(F2) <=> NOE(F1) ≥ NOE(F2)
Positivity NOE(F1 ∘ F2) > NOE(F1)
Condition C1 NOE(F1)=NOE(F2) => NOE(F1 ∘ F)=NOE(F2 ∘ F) and NOE(F ∘ F1)=NOE(F ∘ F2)
Condition C2 NOE(F1)=NOE(F2) <=> NOE(F1 ∘ F)=NOE(F2 ∘ F) <=> NOE(F ∘ F1)=NOE(F ∘ F2)
Substitution Property NOE(F1 ∘ F2) > NOE(F1 ∘ F2’) <=> NOE(F2) > NOE(F2’), or vice versa
Package Depiction Forecast of the system from its components
Weak commutativity NOE(F1 ∘ F2) = NOE(F2 ∘ F1)
Weak monotonicity NOE(F1) ≥ NOE(F2) => NOE(F1 ∘ F) ≥ NOE(F2 ∘ F)
Monotonicity NOE(F1) ≥ NOE(F2) <=> NOE(F1 ∘ F) ≥ NOE(F2 ∘ F)
Archimedean Property NOE(F3) > NOE(F4) => NOE(F1 ∘ F3 ∘ F3 …) > NOE(F2 ∘ F3 ∘ F3 …)
Wholeness NOE(F1 ∘ F2) > NOE(F1) + NOE(F2)
Additivity NOE(F1 ∘ F2) = NOE(F1) + NOE(F2)
In the field of software engineering it is not as clear as, for instance, in the measurement of
length whether the axioms behind the assumed extensive structure hold in reality. For instance,
the axiom of monotonicity (see the table above) will surely hold for length measurement, but
this is not self-evident for a software measure. Although no prediction function will be given in
this study, the conditions above are listed on purpose, to make explicit the assumptions under
which the results presented in chapter 4.4 should be reviewed (see the conclusion in chapter 5).
15 See Zuse (1997, p. 126) for definition of extensive structure by Krantz et al.
4.3. Data Refinement and Classification
4.3.1. Data Filtering and Refinement
As mentioned earlier, each work entity type has a unique identification code. However, possibly
due to the nature of database replication16, a situation can and often does occur in which more
than one instance of a work order is stored in the database repository. While instances other
than the valid one could have been picked out and disregarded by human judgment, the large
number of these cases and a cost-versus-benefit comparison made it necessary to automate this
process. Thus, simple heuristics17 were applied. The validity of the data was not seen to be
compromised, as the relative number of such cases was small.
The filtered data was refined further. This was seen as necessary since there is no enforcement
in place that would assure that the given policies on how to fill in work orders are followed
across the organization. For instance, several missing dates were filled in on the basis of the
previous or next life cycle state dates, as these represent the earliest possible time for the
state change.
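The two refinement steps above can be sketched as follows. The field names and the tie-breaking rule (keep the most recently modified instance per identification code) are invented for illustration; the real heuristics were more involved.

```python
# Sketch of the data refinement: drop duplicate instances of a work
# order and fill a missing date from an adjacent life-cycle date.
# Field names and the keep-latest heuristic are illustrative only.

def deduplicate(records):
    """records: list of dicts with 'id' and 'modified' fields.
    Keeps the most recently modified instance per id."""
    latest = {}
    for rec in records:
        cur = latest.get(rec["id"])
        if cur is None or rec["modified"] > cur["modified"]:
            latest[rec["id"]] = rec
    return list(latest.values())

def fill_detected(rec):
    """Fill a missing 'detected' date from the next life-cycle date,
    which is the earliest possible time for the state change."""
    if rec.get("detected") is None:
        rec["detected"] = rec["closed"]
    return rec

rows = [
    {"id": "E-1", "modified": 2, "detected": None, "closed": 5},
    {"id": "E-1", "modified": 1, "detected": 3, "closed": 5},
]
clean = [fill_detected(r) for r in deduplicate(rows)]
print(clean[0]["detected"])  # 5: later instance kept, missing date filled
```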
4.3.2. Classification of Data
Each entity type has a life cycle that corresponds to the underlying task. Without going into
details, the following diagram (Figure 3 below) gives an overall description of this life cycle.
[Diagram: life cycle states Detected, Assigned to Platform, Corrected by Platform, Verified by Customer and Closed, with side states Postponed and Ignored / Duplicate]
Figure 3: Life cycle of a work item according to state of corresponding task
To judge whether a work item is attributable either to the number of requirements or to the
number of errors, the diagram was divided into three parts:
16 A regularly performed activity by database administration in which content data between different instances of the same databases situated on geographically dislocated servers is synchronized.
17 These heuristics helped to disregard 95% of all instances, while the rest were disregarded manually.
1. Finished: included all work items in ‘verified’ and ‘closed’ states in the case of
errors, and ‘released’ in the case of features18.
2. Disregarded: included all work items in ‘ignored’ and ‘duplicate’ states.
3. Not yet realized: included all work items before the ‘verified’ state (excluding
work items in the disregarded group).
That is, this high-level classification of the data assured that only features, subfeatures, etc.
that have been released to testing can contribute to the error amounts, and similarly that only
error reports that truly are errors (i.e. can be traced to a failure or a difference from the
specification) are counted.
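The three-way classification above can be sketched as a simple state lookup. The state keywords follow the figure and the list, although, as noted in footnote 18, the real databases use varying keywords.

```python
# Three-way classification of work items by life-cycle state, following
# the list above; state keywords are illustrative.

FINISHED = {"verified", "closed", "released"}
DISREGARDED = {"ignored", "duplicate"}

def classify(state):
    s = state.lower()
    if s in DISREGARDED:
        return "disregarded"
    if s in FINISHED:
        return "finished"
    return "not yet realized"

print(classify("Closed"))      # finished
print(classify("Duplicate"))   # disregarded
print(classify("Assigned"))    # not yet realized
```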
4.4. Results
4.4.1. Visualization of Data
As noted, error correction is a considerable cost item for the case organization. While analyzing
it, one should remember that time and resources are spent not only on correction and testing
activities in the platform organization, but also on control and management activities in
customer projects like product programs and software streams. The more errors are found at
developers’ workstations instead of in the integration and testing activities of software streams
and product programs, the bigger the resulting cost savings for the case organization are,
directly as fewer resources are required and indirectly through shorter time-to-market.
As time allocated to error correction is time away from the implementation of new functionality,
it should also be considered when planning and allocating resources in the platform organization.
Neglecting to do this may arguably result in a situation in which a peak in the error amount
attributable to a certain platform manifests itself as a peak in the opposite direction in the
implementation of new functionality. Moreover, according to studies done in the case
organization, work-in-progress (WIP) will amplify these effects.
Figure 4 below illustrates the error load placed on a software platform by the four product
programs under scrutiny. The quick rise in error load is partially artificial, as predecessor
programs are not included in the figure.
18 Keywords used for life cycle states have different meaning across the databases.
[Graph: error load from 01.12.2002 to 01.09.2004, y-axis 0 to 3000, with series Program A, Program B, Program C, Program D and Other]
Figure 4: Error load attributable to a certain software platform
The profile formed by the programs looks quite interesting; the load has stayed relatively steady
during the whole observation window. This can be a sign of high utilization of resources or of an
overload of the system. However, the questioned interdependency between the amount of
functionality, measured in the number of feature work orders, and software development costs,
measured in terms of induced errors, remains the key interest here.
It was noticed during data collection that the content of requirement requests and change orders
varies considerably per product program. This was quite expected, since each program has a
different way of working as far as requirements engineering is concerned. Also the technical
experience, which is arguably needed when feeding a requirement request into the system, varies
between programs, resulting in fewer but bigger requirement requests.
On the platform side, requirement request and change order handling can however be considered
more homogeneous. Requests coming from product programs are divided into features etc. according
to technical details and the software platform organization’s internal work decomposition.
Therefore only feature-related functionality measures (namely the number of features, the number
of subfeatures, and the sum of subfeatures and subfeature change orders) were included in this
analysis. These measures are illustrated program-wise in the scatter plot diagrams below in
Figure 5, with the corresponding error amounts.
[Three scatter plots, each with the error amount (0 to 12000) on the y-axis: Number of FEAs (0 to 90), Number of SUBs (0 to 600), and Sum of SUBs and SCOs (0 to 1200); each plot shows program-only and program-plus-platform series with fitted exponential (Expon.) trendlines]
Figure 5: Scatter plots of feature related measures
The small sample size makes it difficult to find possible interdependence patterns in the figures.
Assuming that the interdependence is super-linear in nature, exponential curves were fitted to
the data using the ordinary least squares method. The assumption of super-linearity is not
unreasonable when one considers the wholeness condition given for the NOE measure in Table 2.
That is, it is intuitive that coupling two functionalities would in all likelihood result in a
larger number of opportunities to make errors in functionality integration.
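One common way to fit such an exponential trendline by ordinary least squares is to regress the logarithm of the response on the predictor; a sketch with synthetic data (the data points are invented, not the study's):

```python
import math

# Fit y = A * exp(B * x) by ordinary least squares on the
# log-transformed model ln y = ln A + B * x.

def fit_exponential(xs, ys):
    n = len(xs)
    ls = [math.log(y) for y in ys]
    mx = sum(xs) / n
    ml = sum(ls) / n
    b = sum((x - mx) * (l - ml) for x, l in zip(xs, ls)) / \
        sum((x - mx) ** 2 for x in xs)
    a = math.exp(ml - b * mx)
    return a, b

# Synthetic data generated from y = 50 * exp(0.05 * x):
xs = [10, 20, 40, 80]
ys = [50 * math.exp(0.05 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(round(a, 1), round(b, 3))  # recovers 50.0 and 0.05
```

Note that least squares on the log scale weights relative rather than absolute errors, which matters when error counts span an order of magnitude as in Figure 5.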
Two observations can be made about the data. First, the values of the measures in program C, the
points at the top of each scatter plot, could be considered outliers19 if no confirmation of the
data’s correctness were available. However, as they really are legitimate observations, they
should not be ignored. One underlying reason for the high error numbers of program C could
arguably be the immaturity of the software platform at the time the program was ongoing.
19 An outlier is an observation in the data set at either extreme of a sample, which is so far removed from the main body of the data that the appropriateness of including it in the sample is questionable. (Milton and Arnold, 1990)
Secondly, the data points in the lower right corners belong to product program A, that is, to a
program that is still ongoing. While in the other programs on average 6% of errors and 8% of
features were in the ‘Not yet realized’ state, the corresponding figures in program A were 30%
and 16%, respectively. This demonstrates that the work item amounts are not yet final, which
should be taken into account when correlation analysis is performed.
4.4.2. Correlation Analysis
In the external validation of a measure, a correlation coefficient is used to condense the
equivalent empirical and numerical relational systems into one number (Zuse, 1998, p. 467). Zuse
however acknowledges that for the calculation and analysis of correlation coefficients to be
meaningful, both the software measure and the external variable have to assume an extensive
structure (the conditions given above).
Then, depending on the measurement scale, a valid correlation coefficient can be determined. If
both variables can be used on an ordinal scale, Spearman’s20 correlation coefficient can be used
to compare the weak orders of both variables. If both variables can be used on a ratio scale,
Pearson’s21 correlation coefficient can also be used correspondingly. Both correlation
coefficients are given below in Table 3, with and without program A (for the reasons given above).
Table 3: Measures and correlation coefficients
Measure Pearson Spearman Pearson (A excluded) Spearman (A excluded)
FEA (attributable to program) 0.26 0.76 0.38 0.79
FEA (+ platform features attributable to program) 0.28 0.77 0.47 0.82
SUB (attributable to program) 0.10 0.71 0.50 0.82
SUB (+ platform subfeatures attributable to program) 0.38 0.79 0.82 0.92
SUB + SCO (attributable to program) 0.22 0.72 0.87 0.94
SUB + SCO (+ platform … attributable to program) 0.34 0.77 0.93 0.96
20 Spearman’s correlation coefficient: C_SPEARMAN = Σ xᵢyᵢ / √(Σ xᵢ² · Σ yᵢ²), where xᵢ and yᵢ are rank deviations.
21 Pearson’s correlation coefficient: C_PEARSON = Σ (xᵢ − x̄)(yᵢ − ȳ) / √(Σ (xᵢ − x̄)² · Σ (yᵢ − ȳ)²).
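Both coefficients are straightforward to compute; a minimal sketch, with invented data and a Spearman implementation that assumes no tied ranks:

```python
# Pearson's coefficient on raw values; Spearman's as Pearson's applied
# to the ranks of the values (valid when there are no ties).

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) *
           sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def spearman(xs, ys):
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    return pearson(ranks(xs), ranks(ys))

x = [10, 20, 30, 40]   # invented measure values
y = [12, 25, 29, 50]   # invented error counts
print(round(pearson(x, y), 3))
print(spearman(x, y))  # monotone data: rank correlation is 1.0
```

The example also shows why the Spearman values in Table 3 exceed the Pearson values: any monotone relationship gives a rank correlation of 1.0 even when the raw values are far from linear.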
Next, the correlation coefficients were statistically tested to check whether the calculated
values differ significantly from zero at a specified level of significance. Milton and Arnold
(1990) give the following test statistic for this:
Tₙ₋₂ = R √(n − 2) / √(1 − R²) . (11)
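Under the null hypothesis of zero correlation this statistic follows a t-distribution with n − 2 degrees of freedom; the two-tailed p-values in Table 4 would then come from that distribution (e.g. via scipy.stats.t.sf). A sketch computing the statistic itself with the standard library, using one coefficient from Table 3 as input:

```python
import math

# Test statistic of equation (11) for a sample correlation R over n
# observations; compare against a t-distribution with n - 2 degrees
# of freedom (p-values not computed here).

def t_statistic(r, n):
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# e.g. r = 0.93 (strongest Pearson coefficient with Program A excluded)
# over the n = 3 remaining programs:
print(round(t_statistic(0.93, 3), 3))  # 2.53
```

The result differs slightly from the 2.437 reported in Table 4, presumably because the tabulated r = 0.93 is itself rounded.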
The observed values of the test statistic, along with the two-tailed probability values from the
t-distribution, are presented in Table 4 below.
Table 4: Test statistics from significance testing
Measure Test statistic / P-value Test statistic / P-value (Program A excluded)
FEA (attributable to program) 0.384 / 0.738 0.406 / 0.755
FEA (+ platform features attributable to program) 0.417 / 0.717 0.539 / 0.685
SUB (attributable to program) 0.148 / 0.896 0.576 / 0.667
SUB (+ platform subfeatures attributable to program) 0.577 / 0.622 1.412 / 0.392
SUB + SCO (attributable to program) 0.312 / 0.785 1.786 / 0.325
SUB + SCO (+ platform … attributable to program) 0.517 / 0.657 2.437 / 0.248
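The test statistic of equation (11) and its two-tailed p-value can be reproduced numerically. The following Python sketch (an illustration, not part of the original study) integrates the Student's t density to obtain the tail probability:

```python
import math

def t_statistic(r, n):
    # Equation (11): T = R * sqrt(n - 2) / sqrt(1 - R^2). Under the null
    # hypothesis of zero correlation it follows a t-distribution with
    # n - 2 degrees of freedom.
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    # Density of Student's t-distribution with df degrees of freedom.
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=20000, upper=1000.0):
    # Two-tailed p-value: integrate the density over [|t|, upper] with the
    # trapezoid rule and double the tail area; the mass beyond 'upper' is
    # negligible for practical purposes.
    a = abs(t)
    h = (upper - a) / steps
    area = 0.5 * (t_pdf(a, df) + t_pdf(upper, df)) * h
    area += sum(t_pdf(a + i * h, df) for i in range(1, steps)) * h
    return 2.0 * area
```

With the three programs remaining when A is excluded (n = 3, i.e. one degree of freedom), a coefficient around 0.93 yields a statistic near 2.5 and a two-tailed p-value of roughly 0.25, the same magnitude as the last row of Table 4.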
As expected, the small sample size, together with the relatively low correlation coefficient values, makes it statistically uncertain whether any dependence (linear, sub-linear or super-linear in nature) exists between the variables.
4.4.3. Conclusions and Criticism
The scatter plots and the correlation analysis show interesting results as far as the interdependency between the proposed measures and the number of errors is concerned. However, nothing conclusive can be said about the nature and strength of this interdependency because of the relatively small sample size. For the same reason, further statistical analysis is meaningless.
Some criticism of the applied statistical methodology can be made. For instance, the use of Pearson's correlation coefficient assumes at least an interval scale, which implies a linear relationship between the attributes. Furthermore, it assumes that the attribute values are normally distributed. However, Fenton (1997, p. 209) notes that most software measurements are not normally distributed and usually contain atypical values. The data set could have been tested for normality using e.g. the Shapiro-Wilk test, but considering the size of the data set such testing was skipped.
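For completeness, such a normality check is straightforward with SciPy's implementation of the Shapiro-Wilk test. The data below are made up for illustration, not taken from the study:

```python
# Illustrative sketch: checking a small data set for normality with the
# Shapiro-Wilk test as implemented in SciPy.
from scipy.stats import shapiro

# Hypothetical error counts per release (made-up values only).
errors = [12.0, 15.0, 14.0, 13.0, 16.0, 12.5, 14.5, 13.5, 15.5, 14.0]

# shapiro() returns the W statistic and a p-value; a small p-value
# (e.g. below 0.05) would reject the hypothesis that the data come from
# a normal distribution. With very small samples, as in this study, the
# test has little power either way.
stat, p_value = shapiro(errors)
print(stat, p_value)
```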
Spearman's correlation coefficient is thus more robust than Pearson's, as it considers the ranks of the attributes instead of the raw values. Zuse (1998, p. 466) notes that to avoid the prerequisites of the Pearson correlation coefficient, a ranking correlation coefficient such as Spearman's or Kendall's tau should additionally be used. Zuse further points out that a high correlation does not necessarily mean that there is a causal relationship between the variables; a third variable can cause the relationship between them.
[Figure omitted: cumulative number of subfeatures (0–70) entered per month from January 2002 to October 2004, plotted separately for Programs A–D; for each program a milestone (MS) and the point at which 80% of the final subfeatures had been entered are marked.]
Figure 6: Subfeature by date entered and program
A relevant point when assessing NOR's capability to help its user determine NOE is the point in time at which the data about features become available. In Figure 6, subfeatures have been plotted per program by the date they were entered into the database.
In the figure, two points in time are highlighted for each program with two vertical lines connected by a horizontal line. The lower vertical line, ending in an arrow, represents the end of the project definition phase, at which time the specifications are expected to be frozen. The upper line marks the point in time when 80% of the final subfeatures had been inserted into the requirement management system. The horizontal line illustrates the lead time between these two points. Although 80% is a purely arbitrary percentage, the figure shows that anyone trying to estimate NOE using NOR should acknowledge that the estimate would, even in the best case, be based on only 80% of what the final feature amount could in reality be.
5. Summary and Conclusions
The importance of software engineering has grown, and the strategic intent of software has changed, in companies designing and manufacturing wireless information devices. Investments made in software engineering typify investments in intellectual capital. Software cost modeling aims to provide managers responsible for software technology creation with information about the related costs, in order to enable better management of software-related processes and assets.
This study focused on software defect estimation from a resource allocation point of view rather than from the quality point of view that is more often the concern. A short review of software cost estimation techniques was conducted. In addition, some relevant observations on how software cost estimation in a product-line based development model differs from that in a traditional one-at-a-time project were gathered through interviews.
As the scope of this study was limited, the analysis part concentrated on how different approaches and techniques have in general been used to approach software defect and effort estimation. Based on the analysis, simple measures were suggested for estimating the total number of errors, in order to assess resource allocation in software streams and software platforms. A measurement-theoretic model illustrating the conditions under which the proposed measures could be used for prediction was then constructed using the notation given by Zuse (1998).
No definite statement could be reached on whether there exists a significant correlation between the proposed measures and the number of errors, due to the insufficient sample size. And even if such a correlation had been found, one should have considered the point made by Fenton and Neil (1999, p. 680) in their paper. They concluded that the relationship between internal variables such as complexity and size, and defects, is not entirely causal, i.e. the correlation that an empirical investigation may demonstrate could be attributable to a causal dependency on a third variable.
Obviously, explaining the number of errors by the number of features is a definite oversimplification. For instance, there are 17 cost drivers (e.g. required reusability, personnel continuity, etc.) in Boehm's effort estimation model to be taken into account besides the size of the software product. Provided one understands the conditions stated in the measurement model, i.e. why the suggested measures and the technique used are in general poor for estimating the number of errors, the results of this study can be used to determine a roughly right amount of software-related errors after a certain time in a product program's life cycle.
One should, however, make no arguments or statements about the reliability of software based on the context of this study. As Fenton and Neil (1997) point out, even knowing the error count, or the exact number of residual defects in a system, one has to be extremely wary about making definitive statements about how the system will operate in practice. The variability in the way systems are used by different users makes it difficult to predict which faults are likely to lead to failures, and how serious these will be as seen by the user.
This study does not in any way help people responsible for software creation and error management to improve the relevant software processes, i.e. to change the underlying factors that causally relate greater implemented functionality to a greater amount of work due to a greater amount of induced errors. Therefore, if one wishes to understand the dynamics of fault inducement better, the causal dependencies between development activities, resources, and errors could be modeled using e.g. Bayesian belief network oriented techniques, as proposed by Fenton and Neil (1999). Future research could also focus on how to estimate the number of work items even earlier in the product development life cycle. This could arguably be possible using techniques based on analogy, as proposed by Shepperd and Schofield (1997).
6. References
6.1. Books
Boehm, B.W. (1981). Software Engineering Economics. Englewood Cliffs, New Jersey, USA:
Prentice Hall. 768 p. ISBN 0-138-22122-7.
Bosch, J. (2000). Design & Use of Software Architectures: Adopting and Evolving a Product-
Line Approach. Oxford, United Kingdom: ACM Press. 354 p. ISBN 0-201-67494-7.
Fenton, N.E. and Pfleeger, S.L. (1997). Software Metrics – A Rigorous and Practical Approach.
2nd Edition. Boston, USA: PWS Publishing Company. 638 p. ISBN 0-534-95425-1.
Milton, J. S. and Arnold, J. C. (1990). Introduction to Probability and Statistics: Principles and Applications for Engineering and the Computing Sciences. 2nd Edition. McGraw-Hill Publishing Company. 700 p. ISBN 0-07-042353-9.
Zuse, H. (1998). A Framework of Software Measurement. Berlin, Germany: Walter de Gruyter.
755p. ISBN 3-11-015587-7.
6.2. Papers and articles
Aamodt, A. and Plaza, E. (1994). Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. In: AI Communications, Vol. 7, No. 1, 1994, pp. 39-59.
Angelis, L. and Stamelos, I. (2000). A Simulation Tool for Efficient Analogy Based Cost Estimation. In: Empirical Software Engineering, Vol. 5, No. 1, 2000, pp. 35-68.
Biffl, S. (2000). Using Inspection Data for Defect Estimation. In: IEEE Software,
November/December, Vol. 17, No. 6, 2000, pp. 36-43.
Bosch, J. (1999). Product-Line Architectures in Industry: A Case Study. In: ICSE 1999: Proceedings of the 21st International Conference on Software Engineering, held in Los Angeles, California, USA, 16-22 May 1999. pp. 544–554.
Briand, L., El Emam, K., Surmann, D., Wieczorek, I. and Maxwell, K. (1999). An Assessment and Comparison of Common Software Cost Estimation Modeling Techniques. In: Proceedings of the 1999 International Conference on Software Engineering, held in Los Angeles, California, USA, 16-22 May 1999. pp. 313–323.
Fenton, N. E. and Neil, M. (1999). A Critique of Software Defect Prediction Models. In: IEEE
Transactions on Software Engineering, Vol. 25, No. 5, September/October 1999. pp. 675–688.
Gray, A. R., MacDonell, S. G. and Shepperd, M. J. (1999). Factors Systematically Associated with Errors in Subjective Estimates of Software Development Effort: The Stability of Expert Judgment. In: Sixth IEEE International Symposium on Software Metrics, held in Boca Raton, Florida, USA, 4–6 November 1999. pp. 216–229.
Idri, A., Abran, A. and Khoshgoftaar, T. M. (2002). Estimating Software Project Effort by Analogy Based on Linguistic Values. In: Proceedings of the 8th IEEE Symposium on Software Metrics, held in Ottawa, Canada, 4–7 June 2002. pp. 21-30.
Khoshgoftaar, T. M., Bhattacharyya, B. B. and Richardson, G. D. (1992). Predicting Software Errors, During Development, Using Nonlinear Regression Models: A Comparative Study. In: IEEE Transactions on Reliability, Vol. 41, No. 3, September 1992, pp. 390-395.
Khoshgoftaar, T. M. and Munson, J. C. (1990). Predicting Software Development Errors Using Software Complexity Metrics. In: IEEE Journal on Selected Areas in Communications, Vol. 8, No. 2, February 1990. pp. 253-261.
Moløkken, K. and Jørgensen, M. (2003). A Review of Surveys on Software Effort Estimation. In: Proceedings of the 2003 International Symposium on Empirical Software Engineering, held in Rome, Italy, 30 Sept.–1 Oct. 2003. pp. 223-230.
Passing, U. and Shepperd, M. (2003). An Experiment on Software Project Size and Effort Estimation. In: Proceedings of the 2003 International Symposium on Empirical Software Engineering, held in Rome, Italy, 30 Sept.–1 Oct. 2003. pp. 120-129.
Petersson, H. and Wohlin, C. (1999). An Empirical Study of Experience-Based Software Defect Content Estimation Methods. In: Proceedings of the 10th International Symposium on Software Reliability Engineering, held in Boca Raton, Florida, USA, 1–4 November 1999. pp. 126–135.
Shepperd, M. and Schofield, C. (1997). Estimating Software Project Effort Using Analogies.
In: IEEE Transactions on Software Engineering, Vol. 23, No. 12, November 1997, pp. 726-743.
Tausworthe, R. (1980). The work breakdown structure in software project management. In:
Journal of Systems and Software, Vol. 1, No. 8, pp. 181-186.
6.3. Internet
IDC – Press Release (2004). Worldwide Mobile Phone Market Grows 23%, Nokia Re-Enters
30% Share Bracket. By: IDC’s Worldwide Mobile Phone QView program.
http://www.idc.com/getdoc.jsp?containerId=pr2004_11_02_153412, referred 15-Nov-2004.
Salo, A. (2004). Material for the course 'Mat-2.134 Decision making and problem solving' in Applied Mathematics, held at Helsinki University of Technology, autumn 2004. http://www.sal.tkk.fi/Opinnot/Mat-2.134/luennot04/Uncertainty_and_risk1.pdf, referred 15-Nov-2004.
Murthi, S. (2002). Useful Estimation Techniques for Software Projects. http://www.developer.com/mgmt/article.php/1463281, referred 15-Nov-2004.
6.4. Interviewees
Elvang, Morten, Nokia Multimedia
Ruotsalainen, Reijo, Nokia Enterprise Solutions
Saikku, Kirsti, Nokia Research Center
Takala, Petri, Nokia Customer and Market Operation