
Page 1:

SWENG 505 Lecture 4: Properties of Software and Their Measurement
By Dr. Phil Laplante, PE

Page 2:

Today’s topics

Metrics for project estimation
Metrics for project insight and control
How to use metrics in software project management
Objections to metrics

Page 3:

Metrics for size estimation*

Lines of code
Function points and related techniques
Estimation by analogy
Estimation by rule of thumb
Expert/Delphi estimation

* Always use two or more of these techniques

Page 4:

Lines of code

KLOC, NCSS, or DSI – counts the number of executable program instructions, excluding comments, header files, formatting statements, macros, and anything that does not appear as executable code after compilation or cause allocation of memory.

SLOC – a single (logical) source line of code may span several physical lines.

Pros: better than nothing, and other metrics are derived from it.

Cons: LOC is very misleading, particularly with 4GLs, graphical programming, and toolsets.

Page 5:

Function Points

Introduced in the late 1970s as an alternative to metrics based on simple source-line counts.

The basis is that, as more powerful programming languages were developed, the number of source lines necessary to implement a given function decreased.

Paradoxically, the cost/LOC measure then indicated a reduction in productivity, because the fixed costs of software production were largely unchanged.

FPs are based on five externally visible aspects of software that can be enumerated accurately, with an empirical weighting factor for each aspect reflecting its relative difficulty of implementation.

Page 6:

The function point size measure

[Diagram: a system with its five countable aspects – inputs, outputs, files, queries, and interfaces – feeding the count #FPs = f(I, O, F, Q, In), which maps to code.]

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 7:

A function point example

An estimate is derived from externally observable properties of the software to be developed or modified, i.e., the requirements and design.

Calculation of function points:
– number of input FPs: 25
– number of output FPs: 35
– number of file FPs: 20
– number of inquiry FPs: 15
– number of interface FPs: 5
– total function points: 100

Adjustment factor: 1.2
Adjusted FPs: 120

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009
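The arithmetic above is simple enough to script. A minimal sketch in Python, using only the counts and the 1.2 adjustment factor from the example (the variable names are mine, not Fairley's):

    # Unadjusted FP counts from the example slide.
    fp_counts = {"inputs": 25, "outputs": 35, "files": 20,
                 "inquiries": 15, "interfaces": 5}

    unadjusted = sum(fp_counts.values())   # 100
    adjusted = unadjusted * 1.2            # adjustment factor from the slide
    print(unadjusted, adjusted)            # 100 120.0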

Page 8:

Function point estimation

Suppose we estimate 120 adjusted function points based on the requirements.

Also suppose our past experience indicates that we can build this kind and size of system at a rate of 10 function points per staff-month (10 FP/SM)
– we will need 120 / 10 = 12 staff-months

2 people for 6 months? 3 people for 4 months? 4 people for 3 months?
but not 12 people for 1 month, and probably not 1 person for 12 months

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009
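Continuing the example, a sketch of the effort calculation and the candidate team/duration splits (the productivity rate is the slide's assumed 10 FP/SM):

    adjusted_fps = 120
    fp_per_staff_month = 10                         # from past experience, per the slide
    effort_sm = adjusted_fps / fp_per_staff_month   # 12 staff-months

    # Candidate splits; the slide rules out the extreme team sizes.
    for team in (2, 3, 4):
        print(f"{team} people for {effort_sm / team:g} months")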

Page 9:

Conversion of function points to lines of code

The relationship between function points and lines of code is language dependent:

For example:

– MDD tools: 6 LOC/FP

– Java: 50 LOC/FP

– C++: 75 LOC/FP

– Visual Basic: 100 LOC/FP

– C: 125 LOC/FP

– Assembly language: 300 LOC/FP

Note: the conversion factor is application- and context-dependent; it should be developed locally.
Q: why would you develop a local conversion factor?
Q: how would you develop a local conversion factor?

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009
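A sketch of the conversion, with the slide's example factors; in practice the table should hold your locally calibrated values:

    # LOC per function point, by language (example factors from the slide).
    loc_per_fp = {"MDD tools": 6, "Java": 50, "C++": 75,
                  "Visual Basic": 100, "C": 125, "Assembly": 300}

    adjusted_fps = 120
    for language, factor in loc_per_fp.items():
        print(f"{language}: {adjusted_fps * factor} LOC")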

Page 10:

Lines of code and productivity ratios for the example

Using the conversion factors on the previous slide for 120 FPs:
– MDD: 6 × 120 = 720 LOC
– C++: 75 × 120 = 9,000 LOC
– Assembly language: 300 × 120 = 36,000 LOC

Productivity leverage factors:
– AL / MDD = 36,000 / 720 = 50:1 (application generator/MDD tools relative to assembly language)
– C++ / MDD = 9,000 / 720 = 12.5:1 (application generator/MDD tools relative to C++)

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 11:

External size measures

Function points are an example of an external size measure (ESM)
– factors in the environment of the software to be written are counted
– these factors are applied to past projects to determine conversion factors from ESM to factors of interest, e.g., ESM/SM (SM: staff-month)
– the conversion factors are used, along with the ESM count for the future system, to develop estimates for attributes of interest
  i.e., SM = ESM / (ESM/SM), e.g., SM = 120 / 10 = 12 staff-months

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 12:

Examples of external size measures

Type of system     ESM factors counted
Data processing    Inputs, outputs, interfaces, queries, files
Process control    Sensors, valves, actuators
Embedded systems   Interrupts, signals, priority levels
User interfaces    Windows, menus, items per menu
Object oriented    Classes, associations, methods

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 13:

The ESM conjecture

It is always possible to find an External Size Measure that can be used, along with historical data and adjustment factors, to develop estimates for project attributes of interest

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 14:

Developing external size measures

To estimate effort, schedule, defect levels, etc., use factors in the environment of the software to be implemented

[Diagram: the system of interest sits within its environment; stimuli cross the system interfaces inward and responses flow outward.]

Page 15:

An example

Suppose we estimate a new User Interface will have
– 5 user windows,
– 6 menus,
– 30 items on the 6 menus, and
– 7 buttons.

Also suppose we have developed the following relationship from past project data:
  Effort = 4.2 × #screens + 3 × #menus + 1.2 × #items + 0.5 × #buttons

Then the UI will require 4.2×5 + 3×6 + 1.2×30 + 0.5×7 = 78.5 staff-hours of effort, i.e., 1 person for 2 FTE weeks or 2 people for 1 FTE week.

We may include some adjustment factors to account for factors that make the future project different from past projects of the same size.

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009
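The relationship above is just a linear model calibrated from past projects. A sketch (the function name is mine; the coefficients are the slide's):

    # Effort model calibrated from past UI projects (staff-hours).
    def ui_effort_hours(screens, menus, items, buttons):
        return 4.2 * screens + 3 * menus + 1.2 * items + 0.5 * buttons

    print(ui_effort_hours(screens=5, menus=6, items=30, buttons=7))  # 78.5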

Page 16:

Feature points

Feature points are an extension of function points developed by Software Productivity Research, Inc. in 1986.

SPR realized that FPs had been developed for classical Management Information Systems and were, therefore, not particularly applicable to many other systems, such as:
– real-time software
– embedded systems
– communications systems
– process control software

Feature points are very similar to function points, with a sixth parameter added (a count of algorithms).

Page 17:

Use case points (UCP)

Allow the estimation of an application’s size and effort from its use cases.

Based on the number of actors, scenarios, and various technical and environmental factors in the use case diagram.

The use case point equation is based on:
– Technical Complexity Factor (TCF)
– Environment Complexity Factor (ECF)
– Unadjusted Use Case Points (UUCP)
– Productivity Factor (PF)

UCP = TCF × ECF × UUCP × PF

Use case points are a relatively new estimation technique.
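As the slide defines it, UCP is a straight product of the four factors. A sketch with invented illustrative values (in practice TCF and ECF come from weighted checklists, and PF converts points into effort, e.g., hours per UCP):

    # UCP as defined on the slide: the product of the four factors.
    def use_case_points(uucp, tcf, ecf, pf):
        return uucp * tcf * ecf * pf

    # Illustrative values only.
    print(use_case_points(uucp=180, tcf=1.05, ecf=0.9, pf=20))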

Page 18:

Estimation by analogy (1)

Analogy is one of the most widely used estimation techniques.

Simple analogy: based on the requirements, it appears that a similar job took 5 people 6 months to complete; a few of the requirements are different, so we will plan the project for 5 people, 8 months.

Questions:
1. what are the future product attributes?
2. what is the historical data?
3. what are the adjustment factors?

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 19:

Estimation by analogy (2)

Estimation by analogy can be quite sophisticated (a sketch follows below):
– describe the attributes of your project and product
– search for similar projects in your historical database of past projects
– use the similar projects as the basis of estimation
– make adjustments as necessary

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009
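A toy sketch of that search step. The attributes, weights, and historical records below are all invented for illustration; a real database and similarity measure would be calibrated locally:

    # Hypothetical historical database of past projects.
    history = [
        {"name": "A", "kloc": 40, "team": 5, "domain": "MIS", "effort_sm": 90},
        {"name": "B", "kloc": 55, "team": 6, "domain": "MIS", "effort_sm": 130},
        {"name": "C", "kloc": 12, "team": 3, "domain": "embedded", "effort_sm": 60},
    ]

    def distance(planned, past):
        # Made-up similarity: size, team size, and a domain-mismatch penalty.
        d = abs(planned["kloc"] - past["kloc"]) / 10
        d += abs(planned["team"] - past["team"])
        return d + (0 if planned["domain"] == past["domain"] else 5)

    planned = {"kloc": 50, "team": 5, "domain": "MIS"}
    basis = min(history, key=lambda past: distance(planned, past))
    print(basis["name"], basis["effort_sm"])  # nearest analog; adjust from here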

Page 20:

Using analogy from published data

Based on a study by Donald Reifer.
Taken from a database of 1,800 projects.
In some cases function points were converted to SLOC for apples-to-apples comparison.

Page 21:

Software productivity by application domains (Reifer 2004)

SM = staff-month; ESLOC = equivalent source lines of code

[Productivity table not reproduced in the transcript.]

Page 22:

Productivity comparisons between US, Europe and Asia by application domain (Reifer 2004)

SM = staff-month; ESLOC = equivalent source lines of code

[Productivity table not reproduced in the transcript.]

Page 23:

Software cost ($/SLOC) by predominant language level by application domain (Reifer 2004)

3GL: high-level procedural language; 4GL: near-human language (e.g., Perl, SQL); 5GL: includes knowledge-based systems, etc.

[Cost table not reproduced in the transcript.]

Page 24:

Waterfall paradigm effort and duration allocations (Reifer 2004)

PDR: preliminary design review
SAR: software acceptance review
SDR: system design review
SRR: software requirements review
STR: software test review
UTC: unit test complete

[Allocation chart not reproduced in the transcript.]

Page 25:

Rational unified process effort and duration allocation (Reifer 2004)

IRR: internal requirements review
LCO: life cycle objectives review
LCA: life cycle architecture review
IOC: initial operational capability
PRR: product readiness review

[Allocation chart not reproduced in the transcript.]

Page 26:

Estimation by rule of thumb (1)

Rules of thumb can be based on industry averages or on local data for your kind of projects.

A typical productivity rule of thumb:
– our productivity is typically 500 LOC/SM

Typical quality rules of thumb:
– our defect rate during development is typically 20 defects per KLOC
– our defect capture rate is typically 90%

A typical schedule rule of thumb:
– it generally takes about one month to do final system testing

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 27:

Estimation by rule of thumb (2)

Suppose we estimate our next system will contain 50,000 LOC
– perhaps determined by analogy

If our productivity ROT is 500 LOC/SM, we will need 50,000 / 500 = 100 SM of effort.

Using the square-root ROT, we might plan the project as 10 people for 10 months.

If our defect ROT is 20 defects per KLOC, with a 90% capture rate, we should expect to inject 20 × 50 = 1,000 defects, with
– 900 detected prior to product release
– 100 reported by users in the first 12 months of use

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009
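The whole chain of rules of thumb on this slide fits in a few lines. A sketch using the slide's numbers:

    import math

    loc = 50_000
    productivity = 500                      # LOC per staff-month (ROT)
    effort_sm = loc / productivity          # 100 SM

    staff = round(math.sqrt(effort_sm))     # square-root ROT: 10 people
    months = effort_sm / staff              # 10 months

    defects = 20 * (loc / 1000)             # 20 defects/KLOC -> 1000 injected
    caught = 0.9 * defects                  # 900 found before release
    escaped = defects - caught              # 100 reach users
    print(effort_sm, staff, months, defects, caught, escaped)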

Page 28:

Estimation by rule of thumb (3)

Some questions to consider:
1. what scope of effort is included in the productivity ROT (500 LOC/SM)?
2. what kind of work hours are included in the productivity ROT?
3. is the productivity ROT for FTEs?
4. what types of defects are considered in the defect ROT (20 per 1000 LOC)?
5. what ROT should we use for allocation of the effort and schedule?

allocation: % of effort and time to be allocated for requirements, design, coding, testing, CM, QA, project management, etc.
FTE: full-time-equivalent software developers

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 29:

Estimation by rule of thumb (4)

Suppose your ROT for effort and schedule allocation is:

Activity                        percent of effort   percent of schedule
requirements                           10%                 20%
design                                 10%                 20%
implementation                         40%                 30%
integration & system testing           30%                 20%
acceptance & delivery                  10%                 10%

NOTE: the amounts of effort and schedule for the various work activities may be intermixed in an iterative manner

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 30:

Estimation by rule of thumb (5)

For our 10-person, 10-month project we should plan:
– requirements: 10 SM / 2 months (5 people)
– design: 10 SM / 2 months (5 people)
– implementation: 40 SM / 3 months (~13 people)
– integration & system testing: 30 SM / 2 months (15 people)
– acceptance & delivery: 10 SM / 1 month (10 people)

NOTE: 10 people is the average FTE staffing level, which can be used to estimate cost.

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009
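A sketch that derives the staffing plan above from the allocation table and the 100 SM / 10 month totals (the activity names and percentages are the slide's):

    total_sm, total_months = 100, 10
    allocation = {                                   # (effort %, schedule %)
        "requirements":                  (0.10, 0.20),
        "design":                        (0.10, 0.20),
        "implementation":                (0.40, 0.30),
        "integration & system testing":  (0.30, 0.20),
        "acceptance & delivery":         (0.10, 0.10),
    }
    for activity, (e, s) in allocation.items():
        sm, months = total_sm * e, total_months * s
        print(f"{activity}: {sm:.0f} SM over {months:g} months "
              f"(~{sm / months:.0f} people)")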

Page 31:

Estimation by expert judgment

Expert judgment is typically combined with “estimation by parts”.

Estimation by parts:
– the high-level product structure (i.e., the ADV) is determined
– requirements and interfaces are allocated to each element of the product
– experts are asked to estimate effort and schedule for each part in which they are experts
– the individual estimates are combined to produce an overall estimate

Two cautions:
1. experts will estimate what it would take them to do the job
2. you must add in effort estimates for other tasks, such as project management, integration and test, documentation, CM, QA, V&V, etc. (perhaps another 50%)

Managing and Leading Software Projects, by R. Fairley, © Wiley, 2009

Page 32:

Expert judgment / Delphi estimation

1. A coordinator gives each expert the information on which to base the estimate.
2. Experts work alone and submit their estimates and rationales to the coordinator.
3. The coordinator prepares an anonymous report containing each estimator’s estimate and rationale, gives it to each estimator, and asks each to submit a second estimate.
4. The procedure continues until the estimates stabilize
– usually after 3 or 4 rounds
– if there is small disparity in the stabilized estimates, they can be used as the range of estimates
– if there is wide disparity in the stabilized estimates, the estimators meet to discuss and resolve their disagreements

Page 33:

Estimation for reuse

Reuse of existing code may require:
– effort to locate candidate code
– effort to evaluate candidate code
– effort to modify and test chosen code
– effort to integrate chosen code

The COCOMO II estimation model (covered later) includes an equation for estimating the effort required to reuse existing software.

Page 34:

Using metrics for project management

Metrics:
– should be used in cost estimation
– can be used to compare systems (baseline)
– can be used to track progress and rate of development (e.g., KLOC/person/day)
– can be used for testing, debugging, and benchmarking

Page 35:

Why measure?

To measure the desirable properties of the software and set limits on the bounds of those criteria.
To achieve software quality by controlling the software development environment.
To institutionalize the process of learning from past mistakes.
To institutionalize the process of learning from past successes.
To identify progress with respect to some software life cycle model.
To estimate and control resource utilization.
Measurement can also be used during the testing phase and for debugging purposes, to help focus on likely sources of errors.

Page 36:

Two types of measurements

Primitive measurements
– presuppose no others
– are atomic
– measure a single identifiable attribute of an entity
– e.g., lines of code

Derived measurements
– obtained from primitive measures through numerical calculation or possibly indirect observation
– e.g., McCabe’s metric, Halstead’s metrics, function points

Page 37:

Some musings on metrics

Correlation versus causality
Metrics classification
Lines of code
McCabe’s metric
Halstead’s metrics
Function points
Object-oriented metrics
Royce’s core metrics

Page 38:

Correlation and causality

Churches and bars
Motorcycle deaths and helmets
Red tee shirts and defects detected
Metrics and software qualities

Page 39:

Metric classes

Process metrics
Product metrics
Environment metrics

Page 40:

Process metrics (primitive)

Rate of programmer errors
Rate of software fault introduction
Rate of software failure events
Number of software change requests
Number of pending trouble reports
Measures of developer productivity
Cost information
Software process improvement costs
Return on investment

Page 41:

Product metrics (primitive)

Attributes of:
– the software requirements specification
– the high-level design
– the low-level design
– the source code
– the test cases
– the documentation

Page 42:

Environment metrics

Attributes of:
– operating system
– development environment
– operating environment
– administrative stability (staff turnover)
– machine (software) stability
– office interruptibility
– office privacy
– library facilities
– rendezvous facilities

Page 43:

McCabe’s metric

Introduced by McCabe in 1976, this is a graph-theoretic complexity measure that highlights program complexity (lack of modularization).

It is based on the number of linearly independent paths through a program module: complexity increases with this number, and reliability decreases.

It has two primary uses:
– to indicate escalating complexity in a module as it is coded, helping coders determine the “size” of their modules
– to determine an upper bound on the number of tests that must be designed and executed

Page 44:

McCabe’s metric

Consider the flow graph for some piece of fictitious code.

First count the graph elements:
– edges (E) = 9
– nodes (N) = 6
– regions = 5

Then calculate C = V(G) = E − N + 2 = 9 − 6 + 2 = 5

Page 45:

McCabe’s metric

Test cases are determined using the basis path method. In our example, C = 5, so there are 5 independent paths:
1. a, c, f
2. a, d, c, f
3. a, b, e, f
4. a, b, e, a, …
5. a, b, e, b, e, …

[Flow graph with nodes a–f not reproduced in the transcript.]
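The calculation itself is one line; a sketch using the slide's edge and node counts:

    # Cyclomatic complexity from a flow graph's edge and node counts.
    def cyclomatic_complexity(edges, nodes):
        return edges - nodes + 2        # V(G) = E - N + 2

    print(cyclomatic_complexity(edges=9, nodes=6))   # 5, as on the slide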

Page 46:

Halstead’s Metrics

Based on information content:
– count the number of distinct operators in the program (n1)
– count the number of distinct operands, e.g., variables and constants (n2)
– count the total number of occurrences of operators in the program (N1)
– count the total number of occurrences of operands in the program (N2)
– form relatively simple mathematical relationships from these counts that lead to “Effort” and “Level” measures

Page 47:

Halstead’s metrics

The Level is a measure of program abstraction. It is believed that increasing this number increases system reliability.

Effort measures the amount of mental effort required to develop the code. Decreasing the effort is believed to increase reliability as well as ease of implementation.

In principle, the program length can be estimated, which makes it useful in cost and schedule estimation. The length is also a measure of the “complexity” of the program in terms of language usage, and can therefore be used to estimate defect rates.

Halstead’s metrics, though more than 30 years old, are still widely used, and tools are available to compute them.
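A sketch of the standard formulations of Halstead's measures; the four counts below are invented for illustration:

    import math

    # n1, n2: distinct operators/operands; N1, N2: total occurrences.
    def halstead(n1, n2, N1, N2):
        vocabulary = n1 + n2
        length = N1 + N2
        volume = length * math.log2(vocabulary)   # V = N log2(n)
        level = (2 / n1) * (n2 / N2)              # program level (abstraction)
        effort = volume / level                   # mental effort E = V / L
        return volume, level, effort

    # Invented counts for a small routine.
    print(halstead(n1=10, n2=7, N1=22, N2=15))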

Page 48:

Goal/Question/Metric (GQM)

An analysis technique that helps in the selection of an appropriate metric.

It follows three simple rules:
1. State the goals of the measurement, that is, “what is the organization trying to achieve?”
2. Derive from each goal the questions that must be answered to determine whether the goals are being met.
3. Decide what must be measured in order to be able to answer the questions.

Page 49:

GQM Example

Suppose the goal is to evaluate the effectiveness of a coding standard.

Some of the associated questions you might ask to assess whether this goal has been achieved are:
– Who is using the standard?
– What is coder productivity?
– What is the code quality?

For these questions, the corresponding metrics might be:
– the proportion of coders using the standard
– how the number of lines of code or function points generated per day per coder has changed
– how appropriate measures of quality for the code have changed

Appropriate quality measures might be errors found per line of code, or cyclomatic complexity.

Now that this framework has been established, appropriate steps can be taken to collect, analyze, and disseminate data for the study.
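The GQM framework for this example can also be written down directly as data; a sketch, with wording condensed from the slide:

    gqm = {
        "goal": "Evaluate the effectiveness of the coding standard",
        "questions": {
            "Who is using the standard?":
                ["proportion of coders using the standard"],
            "What is coder productivity?":
                ["LOC or FPs generated per day per coder, over time"],
            "What is the code quality?":
                ["errors found per LOC", "cyclomatic complexity"],
        },
    }
    for question, metrics in gqm["questions"].items():
        print(question, "->", "; ".join(metrics))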

Page 50:

Questions to consider when choosing metric (Kaner 2002)

What is the purpose of the measure?
What is the scope of the measure?
What attribute are you trying to measure?
What is the attribute’s natural scale?
What is the attribute’s natural variability?
What instrument are you using?
What is the instrument’s natural scale?
What is the measurement error?
What is the attribute’s relationship to the instrument?
What are the foreseeable side effects of using this instrument?

Page 51:

Objections to metrics

They are a costly and unnecessary distraction.
Errors made in metric calibration and data collection can lead to bad decisions.
Negligent metric misuse (you don’t know what you are doing):
– measuring the correlation effects of a metric without clearly understanding the causality is unscientific and dangerous
Malicious metric misuse:
– using metrics to “prove” a point
– using metrics as a weapon

Page 52:

Some management data from “metrics” that could be used improperly

In a study of project staff, the following was determined about how they spent their time:
– 45 percent surfing the Web
– 12 percent reading and responding to personal e-mail
– 5 percent using chat rooms
– 5 percent on rest room breaks
– 8 percent discussing the current political situation
– 10 percent in staff meetings
– 5 percent in design reviews
– 10 percent coding

Page 53:

Characteristics of highly successful measurement programs (Dekkers 2002)

Set solid measurement objectives and plans.
Embed measurement into the process.
Gain a thorough understanding of measurement.
Focus on cultural issues.
Create a safe environment to collect and report true data.
Cultivate a predisposition to change.
Develop a complementary suite of measures.

Page 54:

Metrics in your post mortem discussion

From your experience on a software project you were involved in, discuss the following:
– Briefly describe the situation (software product, environment, customer, etc.).
– Was measurement used? How was it used?
– Relate how uncertainty affected the measurement(s). If measurement was not used, how did its absence highlight the uncertainties?
– How could more measurement have helped eliminate uncertainty? What uncertainties were invulnerable to measurement?
– Was there any metric mismanagement? If so, describe it.

Page 55:

References

Bill Agresti, “Lightweight Software Metrics: The P-10 Framework,” IT Professional, September/October 2006, pp. 12–16.

Carol Dekkers and Patricia McQuaid, “The Dangers of Using Software Metrics to (Mis)Manage Software,” IT Professional, March/April 2002.

Cem Kaner et al., Testing Computer Software, 3rd Edition, John Wiley, 2002.

Phillip Laplante, Software Engineering for Image Processing Systems, CRC Press, 2003.

John Munson, Software Engineering Measurement, Auerbach, 2003.

Donald J. Reifer, “Industry Software Cost, Quality and Productivity Benchmarks,” Software Technology Newsletter, DACS, 2004, pp. 3–8, 16.

Walker Royce, Software Project Management: A Unified Framework, Addison-Wesley, 1998.