Persistence in Poor Estimating in Software Engineering: Whys and Hows. Çiğdem Gencel, Assist. Prof., Free University of Bolzano/Bozen (Italy), Faculty of Computer Science. Oxford University, UK, 11 June 2014


Persistence in Poor Estimating in Software Engineering:

Whys and Hows

Çiğdem Gencel, Assist. Prof.

Free University of Bolzano/Bozen (Italy)

Faculty of Computer Science

Oxford University, UK

11 June 2014

Agenda

Introduction

The Estimating Problem

WHY? Fundamental Issues

HOW?

What are the Basic Needs?

Conclusions

Open Discussion

What is CS and SE?

Software Engineering: The application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software [IEEE Std 610.12-1990]

Computer Science: Study of information and computation, and of practical techniques for using machines to process information and perform computation

Subjective opinions

Objective truth

Why measure?

We measure to understand, to predict, to control and to improve

What is Measurement?

Entity | Attribute | Measure (Metric)
Code | Length | 10,000 Lines of Code

Example of the entity "Code":

    If A>B
    then
    begin
      A - B
    end
    else
    begin
      A + B
    end;

“The process by which numbers and symbols are assigned to attributes of entities in the real world so as to describe them according to clearly defined rules.” - Fenton, 1991
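Fenton's definition can be made concrete with a small sketch: the entity is a piece of code, the attribute is its length, and the number is assigned by an explicit rule. The rule used here (count non-blank physical lines) and the function name `count_loc` are assumptions for illustration only; other counting rules would give other numbers for the same code, which is exactly why the rule has to be stated.

```python
def count_loc(source: str) -> int:
    """Assign a length to a piece of code using one clearly defined rule:
    count physical lines that are not blank (an assumed rule, not a standard)."""
    return sum(1 for line in source.splitlines() if line.strip())


# The code fragment shown on the slide, laid out one token group per line.
snippet = """\
If A>B
then
begin
  A - B
end
else
begin
  A + B
end;
"""

print("Entity: code | Attribute: length | Measure:", count_loc(snippet), "lines of code")
```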

Empirical Investigations in SE (I)

Empirical investigation in software engineering consists of exploratory and confirmatory cycles that are iterative in nature [1]

[Figure: Exploratory Cycle, Confirmatory Cycle]

[1] Schalken, J., and van Vliet, H., "Measuring where it matters: Determining starting points for metrics collection", Journal of Systems and Software, 81(5), May 2008, pp. 603-615

Folk Proverbs for Weather Forecast

UK

“Red sky at night, sailor's delight; Red sky at morning, sailors take warning”

ITALY

IT: “Rosso tramonto, bianco mattino”

EN: Red sunset, white morning

IT: “Rosso di mattina, il mal tempo s'avvicina”

EN: Red in the morning, bad weather is coming

Italian proverbs source: http://www.italyrevisited.org/photo/Folk_Sayings_on_Nature

Photo source: http://www.wikihow.com/Predict-the-Weather-Without-a-Forecast

The exploratory cycle usually starts with unstructured observations

Folk Proverbs for Weather Forecast

UK

“Circle around the moon, rain or snow soon”

ITALY

IT: Quannu la luna e pallita, chiovi; quannu e russa, fa ventu e quann'e chiara fa sirinita.

EN: When the moon is pale, it will rain; when it is reddish, it will be windy and when it is clear the weather will be pleasant

Italian proverbs source: http://www.italyrevisited.org/photo/Folk_Sayings_on_Nature

Photo source: http://www.wikihow.com/Predict-the-Weather-Without-a-Forecast

Preliminary insights lead to hypothesis generation and more structured observations

Empirical Investigations in SE (II)

Operational measures are selected/defined to test the hypothesis in the confirmatory cycle [1]

[Figure: Exploratory Cycle, Confirmatory Cycle]

[1] Schalken, J., and van Vliet, H., "Measuring where it matters: Determining starting points for metrics collection", Journal of Systems and Software, 81(5), May 2008, pp. 603-615

Controlled Experiments

Surveys

Case Studies

Interviews

Measurement is necessary for collecting evidence during empirical inquiries

A sundial on a church at North Lake Garda (Italy). As the sun moves across the sky, shadows change in direction and length, so a sundial can measure the length of a day with respect to different times of the year

Various measures and measurement instruments have been developed throughout history

What is Estimation?

y = f(parameter_1, parameter_2, ..., parameter_n)

MEASUREMENT : NOW

E.g. Temperature, Pressure, etc.

ESTIMATION: FUTURE

E.g. Simple or sophisticated weather forecast models
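The distinction between measuring now and estimating the future can be sketched as follows: past measurements of a parameter x and an outcome y are used to fit a function f, and f is then applied to a newly measured x to estimate a y that has not happened yet. This is a minimal sketch with a single parameter and a straight-line f fitted by ordinary least squares; the data values and the names `fit_line` and `estimate` are hypothetical and not taken from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one parameter x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def estimate(a, b, x):
    """Apply the fitted function f to a newly measured parameter value."""
    return a + b * x


# MEASUREMENT (now / past): hypothetical observed pairs (x, y).
measured_x = [100, 200, 300, 400]
measured_y = [12, 22, 32, 42]

# ESTIMATION (future): predict y for a value of x not seen before.
a, b = fit_line(measured_x, measured_y)
print("estimated y for x = 250:", estimate(a, b, 250))
```

A model with more parameters follows the same pattern; the point is only that an estimate is never better than the measurements the function was built from.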

History of Base Measures and Instruments for Weather Predictions

Humidity Measurement

1400s – da Vinci: First primitive hygrometer

1664 – Folli: First practical hygrometer

1820 – Daniell: First dew point hygrometer

Wind Measurement

1450 – Alberti: First anemometer

1805 – Beaufort: Beaufort Scale to visually estimate wind speed

1846 – Robinson: First four-cup anemometer

Temperature Measurement

1593 – Galileo: First water thermometer

1714 – Fahrenheit: Mercury thermometer with Fahrenheit scale

1743 – Celsius: Mercury thermometer with Celsius scale

1848 – Kelvin: Kelvin scale (with absolute zero at -273 °C)

Pressure Measurement

1644 – Torricelli: Torricelli tube

1843 – Vidie: Metallic barometer

“Measure what can be measured, and make measurable what cannot be measured.” - Galileo Galilei

A Wind/Barometer Table used by Sailors

Sometimes all we need is a simple prediction method!

Modern Weather Forecast Models

In other cases, we might need more accuracy and therefore, more sophisticated models

How about Measurements & Estimations in Software Engineering?

Significance of the Problem

Annual cost of failures and over-runs:

• US market (Standish) ~100 Billion US$

• European market ~100 Billion €

Study Country | No. of Projects | Cost | Over-runs/Write-offs
UK Public Sect. | 105 | £29B | £9B (31%)
Mostly US | 1471 | $246B | $66B (27%)

Software industry records show that projects are often delivered late and/or over budget [2]

[2] Symons, C., Gencel, C., "From Requirements to Project Effort Estimates – Work in Progress (Still?)", REFSQ Annual Conference, Industry Track Keynote, Germany, 2013
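The percentages in the table above appear to be the over-runs/write-offs as a share of the total cost of the studied projects. A quick check of that reading, using only the figures from the table (the function name `overrun_share` is an assumption for illustration):

```python
def overrun_share(total_cost_billion, overrun_billion):
    """Over-runs/write-offs as a whole-number percentage of total cost."""
    return round(100 * overrun_billion / total_cost_billion)


# (total cost, over-runs/write-offs), in billions of the study's currency.
studies = {
    "UK Public Sect.": (29, 9),
    "Mostly US": (246, 66),
}

for name, (cost, overrun) in studies.items():
    print(f"{name}: {overrun_share(cost, overrun)}%")
```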

Three major shifts in SE

Shift 1: Agility

Shift 2: GSE

Shift 3: Scale

The shift towards agility in development, the distribution of tasks across borders, and the increase in scale have created more challenges [3]

[3] Gencel, C., Petersen, K., Opening presentation of the 1st Intern. Workshop on Estimations in the 21st Century Software Engineering (EstSE21), The Agile Conference (XP 2014), Rome, Italy, 2014

An Example from UK (I)

Over 20 years ago there was a lot of interest in software metrics (Norman Fenton wrote his book, the Government adopted metrics, UKSMA started)

Then there was a lot of outsourcing to the big international software houses, who moved a lot of work off-shore to low-cost countries.

This had two consequences:

◦ there were big cost savings, so why bother to measure supplier performance?

◦ the customers lost all their knowledge of measurement to the suppliers (with the staff that they passed over to the suppliers)

Source of Information: Charles Symons, President of the Common Software Measurement International Consortium (COSMIC)

An Example from UK (II)

More recently, off-shore costs have risen, so software development work is starting to come back to low-cost regions of the UK

Simultaneously there is more interest in Agile development

◦ Agile requires small cohesive teams, which is difficult to achieve when e.g. the team is spread over the US, the UK and, say, India. So quality concerns have arisen

Currently, there are signs of growing interest in metrics again to be able to manage these situations.

Source of Information: Charles Symons, President of the Common Software Measurement International Consortium (COSMIC)

WHY Poor Estimations?

I. Lack of well-established taxonomies/categories

II. Ill-defined attributes / measures

III. Lack of standardization

I. Lack of Well-established Taxonomies/Categories

Product categories

Building

Apartment

Low rise

Mid rise

High rise

Airport

Hospital

Bridge

Motorway

Highway

Parameters measured with different metrics

Site work (m² of site area)

Foundations and columns (m²)

Conveying system (# of floor stops)

…

Measurement of Engineering Products

Various parametric systems exist for different types of civil engineering projects
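A parametric system of this kind can be sketched as a sum of measured quantities multiplied by a rate per unit, with each parameter measured in its own unit. The parameter names below come from the slide; every quantity and unit rate is a hypothetical placeholder, and the names `Parameter` and `parametric_estimate` are assumptions, not part of any real civil engineering cost system.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    name: str
    quantity: float
    unit: str
    unit_rate: float  # hypothetical cost per unit, for illustration only


def parametric_estimate(parameters):
    """Total estimate = sum over parameters of quantity * rate per unit."""
    return sum(p.quantity * p.unit_rate for p in parameters)


# Hypothetical mid-rise apartment building.
building = [
    Parameter("Site work", 2000, "m2 of site area", 50.0),
    Parameter("Foundations and columns", 800, "m2", 300.0),
    Parameter("Conveying system", 10, "floor stops", 15000.0),
]

for p in building:
    print(f"{p.name}: {p.quantity} {p.unit} -> {p.quantity * p.unit_rate}")
print("Total estimate:", parametric_estimate(building))
```

Such a table of rates is only meaningful within one product category, which is why the taxonomy matters: rates calibrated on apartments would not transfer to bridges.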

Types of Software Systems

In software engineering, there is no commonly agreed classification of software types

ISO TR 14143-5 CHAR Method - Functional Domain Types | ISO 12182 Software Types
Pure Data Handling System | (no corresponding type)
Information System | Management Information System (Business transaction processing), Decision Support
Data Processing System | Word Processing, Geographic Information System
Controlling Information System | (no corresponding type)
Controlling Data System | Automated Teller Banking
Complex Controlling Information System | Business (Business Enterprise)
Non-Specific (Complex) System | Military Command and Control
Simple Control System | Real Time: Embedded, Device Driver
Control System | (no corresponding type)
Complex Control System | Real Time: Embedded, Avionics, Message router
Data Driven Control System | E-mail, Emergency dispatch call/receipt, Oper. Syst.
Complex Data Driven Control System | Process Control (Control System)
Pure Calculation System | Scientific, Standard math/Trig. Algorithms
Controlling Calculation System | Engineering
Scientific Information System | Self-learning (Expert or Artificial Intelligence), Statistical, Spreadsheet, Secure Systems, Actuarial
Scientific Controlling Data Processing System | Safety Critical

Inconsistent Classifications in SE

Each software benchmark dataset has its own attributes

Categories are not well-established and not orthogonal

Application Types in an Example Dataset

Customer billing/relationship management; Business;

Customer billing/relationship management; Document management; Trading;

Customer billing/relationship management; CRM;

Customer billing/relationship management; Document management; Trading;

Customer billing/relationship management; Financial transaction process/accounting; Online analysis and reporting; Trading; Workflow support & management; Process Control; Electronic Data Interchange;

Customer billing/relationship management; Logistic or supply planning & control;

Customer billing/relationship management; Other;

Customer billing/relationship management; Stock control & order processing.

[4] Gencel, C., Buglione, L., Abran, A., “Improvement Opportunities and Suggestions for Benchmarking”, Intern. Workshop on Software Measurement and Mensura Joint Conference, 2009

II. Ill-defined Attributes & Measures

Which building is larger?

Floor area (m2)

Height (m)

Size of a building

In civil engineering, different size measures are defined to measure the size of buildings

◦ Floor area (length × width of the floor) (m²) and height (m)

◦ Volume of a building (length × width × height) (m³)

The selection depends on the needs of the engineers or managers!
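The question "Which building is larger?" has no single answer until a size measure is chosen. The sketch below compares two buildings by floor area, height, and volume; the building names and dimensions are hypothetical, and `Building` and `larger` are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class Building:
    name: str
    length_m: float
    width_m: float
    height_m: float

    @property
    def floor_area_m2(self):
        # Footprint only: length x width of the floor.
        return self.length_m * self.width_m

    @property
    def volume_m3(self):
        return self.length_m * self.width_m * self.height_m


def larger(a, b, measure):
    """Return the name of the building that is larger under the given measure."""
    return max((a, b), key=measure).name


hall = Building("hall", length_m=50, width_m=40, height_m=10)
tower = Building("tower", length_m=20, width_m=20, height_m=100)

print("By floor area:", larger(hall, tower, lambda b: b.floor_area_m2))
print("By height:    ", larger(hall, tower, lambda b: b.height_m))
print("By volume:    ", larger(hall, tower, lambda b: b.volume_m3))
```

The same two entities rank differently under different attributes, which is the situation the next slides raise for software size.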

How about Size of Software?

Information processing amount

It is common for companies to use a ‘one size fits all’ approach!

III. Lack of Standardization

Measurement in Physical Sciences

bit?

Base Measure | SI unit | Symbol
length | meter | m
mass | kilogram | kg
time | second | s
electric current | ampere | A
thermodynamic temperature | kelvin | K
amount of substance | mole | mol
luminous intensity | candela | cd

• 7 base units were defined to measure physical quantities and

• 22 measures defined in terms of the base quantities via a system of quantity equations

Source: NIST website: http://physics.nist.gov/cuu/Units/units.html

The foundation for the International System of Units (SI) was laid during the French Revolution (1799)

Measurement in Social Sciences

In social sciences, there are no standard units of measurement

The theory and practice of measurement is studied in psychometrics

Measurement in Computer Science

Factor | Name | Symbol | Origin | Derivation
2^10 | kibi | Ki | kilobinary: (2^10)^1 | kilo: (10^3)^1
2^20 | mebi | Mi | megabinary: (2^10)^2 | mega: (10^3)^2
2^30 | gibi | Gi | gigabinary: (2^10)^3 | giga: (10^3)^3
2^40 | tebi | Ti | terabinary: (2^10)^4 | tera: (10^3)^4
2^50 | pebi | Pi | petabinary: (2^10)^5 | peta: (10^3)^5
2^60 | exbi | Ei | exabinary: (2^10)^6 | exa: (10^3)^6

Source: NIST website: http://physics.nist.gov/cuu/Units/units.html

In 1998, the IEC approved prefixes for binary multiples for use in the fields of data processing and data transmission
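The derivation column of the prefix table can be reproduced directly: each binary prefix is (2^10)^n and its decimal namesake is (10^3)^n, so the gap between the two grows with n. A minimal sketch (the dictionary and function names are illustrative):

```python
# Exponent n for each prefix pair: binary symbol -> n, decimal symbol -> n.
BINARY_PREFIXES = {"Ki": 1, "Mi": 2, "Gi": 3, "Ti": 4, "Pi": 5, "Ei": 6}
DECIMAL_PREFIXES = {"k": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}


def binary_factor(symbol):
    """Factor of a binary prefix: (2^10)^n."""
    return (2 ** 10) ** BINARY_PREFIXES[symbol]


def decimal_factor(symbol):
    """Factor of an SI decimal prefix: (10^3)^n."""
    return (10 ** 3) ** DECIMAL_PREFIXES[symbol]


for bin_sym, dec_sym in zip(BINARY_PREFIXES, DECIMAL_PREFIXES):
    ratio = binary_factor(bin_sym) / decimal_factor(dec_sym)
    print(f"1 {bin_sym}B = {binary_factor(bin_sym)} B, "
          f"1 {dec_sym}B = {decimal_factor(dec_sym)} B, ratio = {ratio:.4f}")
```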

Recent Attempts for Standardization: A Standard on Functional Size Measurement [5]

Part 1 (1998): Functional Size Measurement Concepts

◦ IEEE Std. 14143-1 (2000) adopted ISO/IEC 14143-1:1998

Part 2 (2002): Conformity evaluation of software size measurement methods to ISO/IEC 14143-1:1998

Part 3 (2003): Verification of Functional Size Measurement Methods

Part 4 (2002): FSM – Reference Model

Part 5 (2004): Determination of Functional Domains for use with Functional Size Measurement

Part 6 (2005): Guide for Use of ISO 14143 Series and Related International Standards

5 ISO/IEC 14143: Information Technology – Software Measurement – Functional Size Measurement

In 1998, ISO established a working group to define the base concepts of functional size measurement

Standardized Functional Size Measurement (FSM) Methods

IFPUG Function Point Analysis (ISO/IEC 20926)

Mark II Function Point Analysis (ISO/IEC 20968)

NESMA FSM Method (ISO/IEC 24570)

COSMIC Function Points (ISO/IEC 19761)

FISMA FSM (ISO/IEC 29881)

Among the five standardized FSM methods, only COSMIC was designed to measure ‘pure functional size’, whereas the others are actually designed to estimate ‘relative effort’ [6]

[6] Gencel, C., Symons, C., “From performance measurement to project estimating using COSMIC functional sizing”, in the Proc. of Software Measurement European Forum (SMEF), Rome, 2009
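To give a feel for what a 'pure functional size' looks like, here is a heavily simplified sketch of COSMIC-style counting. It relies on one fact about the method that is not stated on the slides: COSMIC sizes software by counting data movements of four types (Entry, Exit, Read, Write), each worth one COSMIC Function Point (CFP). The functional processes below are hypothetical, and the measurement rules in ISO/IEC 19761 for identifying functional processes and data groups are not modelled here.

```python
MOVEMENT_TYPES = {"Entry", "Exit", "Read", "Write"}


def cosmic_size(functional_processes):
    """Sum the data movements of all functional processes (1 CFP each).

    functional_processes maps a process name to its list of data movements.
    """
    total = 0
    for name, movements in functional_processes.items():
        unknown = set(movements) - MOVEMENT_TYPES
        if unknown:
            raise ValueError(f"{name}: unknown movement types {sorted(unknown)}")
        total += len(movements)
    return total


# Hypothetical functional processes, for illustration only.
processes = {
    "Create customer": ["Entry", "Read", "Write", "Exit"],
    "View customer": ["Entry", "Read", "Exit"],
}

print("Functional size:", cosmic_size(processes), "CFP")
```

Nothing in the count depends on team, technology, or effort, which is the sense in which the size is 'pure'.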

HOW to Improve?

These issues (missing taxonomies, ill-defined measures, lack of standardization) make it difficult to investigate relationships and rules among different attributes

Persistence of Poor Predictions!

Accurate Predictions

Well-Defined & Standard measures / instruments

Good Categories/Taxonomies

Linus: I guess it is wrong always to be worrying about tomorrow. Maybe we should think only about today

Charlie Brown: No, that’s giving up. I am still hoping that yesterday will get better.