sim5102 software evaluation measuring internal product attribute

SIM5102 Software Evaluation

Measuring Internal Product Attribute

PRODUCT SIZE

Product size is a favourite thing to measure:

Productivity → do “more” in the same time

Improve quality → “larger” is more likely to have problems

Improve prediction → “larger” will have a different prediction than “smaller”

Reduce risk → “larger” is more risky

PRODUCT SIZE

Products are physical entities → size seems meaningful

But many “obvious” size measurements don’t take into account:

Redundancy and complexity, and so aren’t useful for understanding effort

True functionality and effort, and so aren’t useful for understanding productivity

Complexity and reuse, and so aren’t useful for understanding cost

CODE EXAMPLE

Comment

Statements

Many statements separated by terminators in one line

Statements with comments on the same line

....

CODE EXAMPLE

/**
 * Create a rational number
 **/
public Rational(int num, int den) {
    if (den == 0) {
        throw new
            RuntimeException("invalid rational: denominator=0");
    }
    // canonical form requires the denominator > 0
    int sign = num * den;
    num = ((sign < 0) ? -1 : 1) * Math.abs(num);
    den = Math.abs(den);
    int gcd = gcd(Math.abs(num), den); // local method call
    _numerator = num / gcd;
    _denominator = den / gcd;
}
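The constructor above is a fragment: it relies on two fields and a gcd method that the slide does not show. The following is a minimal sketch of a class it could live in, so that the example can be compiled and measured. The `gcd` helper (Euclid's algorithm), the accessors, and `toString` are assumptions added for illustration and are not part of the slide.

```java
/** Sketch of a rational-number class around the slide's constructor. */
final class Rational {
    private final int _numerator;
    private final int _denominator;

    /** Create a rational number in canonical form (denominator > 0, lowest terms). */
    Rational(int num, int den) {
        if (den == 0) {
            throw new RuntimeException("invalid rational: denominator=0");
        }
        // Same sign logic as the slide; note that num * den can overflow for large inputs.
        int sign = num * den;
        num = ((sign < 0) ? -1 : 1) * Math.abs(num);
        den = Math.abs(den);
        int gcd = gcd(Math.abs(num), den);
        _numerator = num / gcd;
        _denominator = den / gcd;
    }

    /** Hypothetical helper: Euclid's algorithm on non-negative inputs. */
    private static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    int numerator() { return _numerator; }

    int denominator() { return _denominator; }

    @Override
    public String toString() { return _numerator + "/" + _denominator; }
}
```

With this sketch, `new Rational(2, -4)` is stored as -1/2: the sign moves to the numerator and both parts are divided by their gcd.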

CODE SIZE

Bytes – spaces? lengths of variable names?

Lines of code (LOC) – blank lines? comments?

Non-commented lines of code (NCLOC) – non-executable lines?

Executable lines – code lines longer than one line?

Statements – some statements are bigger than others

GQM & CODE SIZE

Goal : Predict test suite size

Goal : Predict amount of disk needed

Goal : Compute effort

Goal : Predict number of defects

Goal : Compute quality of documentation

COMPLICATING FACTORS

Code libraries, frameworks (reusing executable code)

Code generation

Copy & paste

Tool support

Programming language

Difficulty of the problem (“complexity”)

PROGRAMMING LANGUAGE

Java

Perl script

ASP

....

ATTRIBUTES OF SOFTWARE SIZE

Need to determine fundamental attributes of software size that capture key aspects:

Length – physical size

Functionality – what functions are supplied to the user

Complexity – how “difficult”

SIZE : SOFTWARE ENGINEERING STATE-OF-THE-ART

Length – some consensus on how to measure length of code, but not of other artifacts

Functionality – some ongoing work measuring functionality, especially for specifications

Complexity – little work other than computational complexity

CODE LENGTH

NCLOC (non-commented lines of code)

Common measure of “length”

Not a valid measure of total program length (adding more comments doesn’t change the measurement)

Measure comment lines (CLOC) separately, and then combine: total length (LOC) = NCLOC + CLOC

Allows useful indirect measures such as: comment density = CLOC / LOC

Executable statements (ES) – excludes headers, data declarations

Delivered Source Instructions (DSI) – excludes code created during development but not delivered, such as “scaffolding” code, prototypes, testing
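The relations LOC = NCLOC + CLOC and comment density = CLOC / LOC can be sketched as a small counter. This is only one possible counting convention, not a standard: the class name `LineCounts` is hypothetical, blank lines are assumed to count toward neither measure, a line holding both code and a trailing comment is assumed to count as code, and only Java-style `//` and `/* */` comments are recognised.

```java
import java.util.List;

/** Sketch of a line counter for Java-style comments. */
final class LineCounts {
    final int ncloc; // non-commented lines of code
    final int cloc;  // comment lines

    LineCounts(int ncloc, int cloc) {
        this.ncloc = ncloc;
        this.cloc = cloc;
    }

    /** Total length: LOC = NCLOC + CLOC. */
    int loc() { return ncloc + cloc; }

    /** Comment density: CLOC / LOC (0 for empty input). */
    double commentDensity() {
        return loc() == 0 ? 0.0 : (double) cloc / loc();
    }

    static LineCounts count(List<String> lines) {
        int ncloc = 0;
        int cloc = 0;
        boolean inBlockComment = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // assumption: blank lines count toward neither measure
            }
            if (inBlockComment) {
                cloc++;
                if (line.contains("*/")) {
                    inBlockComment = false;
                }
            } else if (line.startsWith("//")) {
                cloc++;
            } else if (line.startsWith("/*")) {
                cloc++;
                // stay in comment mode unless the block closes on the same line
                inBlockComment = line.indexOf("*/", 2) < 0;
            } else {
                ncloc++; // assumption: code with a trailing comment counts as code
            }
        }
        return new LineCounts(ncloc, cloc);
    }
}
```

Changing either assumption (for example, counting mixed lines in both categories) gives different numbers for the same file, which is why a size measure needs its counting rules stated alongside it.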

MORE CODE LENGTH MEASUREMENT

Number of functions

Number of files

Number of classes

Number of methods

Number of objects?

Number of method calls?

HALSTEAD’S SOFTWARE SCIENCE

µ1 = number of unique operators, µ2 = number of unique operands, N1 = total number of operators, N2 = total number of operands

Length N = N1 + N2, vocabulary µ = µ1 + µ2

Volume V = N × log2 µ

Program level L = V* / V, where V* is the volume of the minimal-size implementation

Difficulty D = 1 / L

Estimated level L̂ = 1 / D = (2 / µ1) × (µ2 / N2)

Estimated length N̂ = µ1 × log2 µ1 + µ2 × log2 µ2

Estimated effort (“mental discriminations”) E = V / L̂ = (µ1 × N2 × N × log2 µ) / (2 × µ2)

Programming time T = E / 18 seconds
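The formulas above can be checked with a short calculation. The sketch below takes the four counts as given (deciding what counts as an operator or an operand in a real language is the hard part and is not attempted here). The class and method names are hypothetical, and difficulty is computed from the estimated level, since V* is normally unknown.

```java
/** Sketch of Halstead's measures computed from the four basic counts. */
final class Halstead {
    final int mu1; // unique operators
    final int mu2; // unique operands
    final int n1;  // total operators
    final int n2;  // total operands

    Halstead(int mu1, int mu2, int n1, int n2) {
        this.mu1 = mu1;
        this.mu2 = mu2;
        this.n1 = n1;
        this.n2 = n2;
    }

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    int length() { return n1 + n2; }                                  // N = N1 + N2

    int vocabulary() { return mu1 + mu2; }                            // mu = mu1 + mu2

    double volume() { return length() * log2(vocabulary()); }         // V = N log2 mu

    double estimatedLevel() {                                         // (2 / mu1) (mu2 / N2)
        return (2.0 / mu1) * ((double) mu2 / n2);
    }

    double difficulty() { return 1.0 / estimatedLevel(); }            // D = 1 / L (estimated)

    double estimatedLength() {                                        // mu1 log2 mu1 + mu2 log2 mu2
        return mu1 * log2(mu1) + mu2 * log2(mu2);
    }

    double effort() { return volume() / estimatedLevel(); }           // E = V / estimated level

    double timeSeconds() { return effort() / 18.0; }                  // T = E / 18
}
```

For the made-up counts µ1 = 4, µ2 = 4, N1 = 8, N2 = 8, this gives N = 16, µ = 8, V = 16 × 3 = 48, estimated level 0.25, E = 192 and T of roughly 10.7 seconds, which agrees with the closed form (4 × 8 × 16 × 3) / (2 × 4) = 192.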

PROBLEMS WITH SOFTWARE SCIENCE

What is the relationship between the empirical relational system and the model (ex : volume, difficulty)?

Does software science define measures or prediction systems?

Seems to require an implementation before it can be used?

What scale is effort?

USING CODE LENGTH FOR PREDICTION

Too late for predicting cost/effort for the current project

Only applies to future projects that are in the same environment, language, team, etc.

Can be used for model validation

SIZE OF OTHER PRODUCTS

Requirements – ex : number of pages, number of use cases

Specification – ex : number of Z rules

Design – ex : number of classes, number of diagrams

“Functionality”

FUNCTION POINTS & RELATED MEASURES

A measure of “functionality”

Generally can be computed “early” in the development process

Albrecht’s function points

Symons’ Mark 2 function points

COCOMO 2.0 object points

DeMarco’s specification weights

Web objects

ALBRECHT’S FUNCTION POINTS

Count the components in each category:

External inputs – items provided by the user to describe distinct application data

External outputs – items provided to the user to describe distinct application data

External inquiries – interactive inputs requiring a response

External (interface) files – machine-readable interfaces to other systems

Internal (master) files – logical master files in the system

Determine the complexity of each component (simple, average, complex)

Component     Simple   Average   Complex
Input            3        4         6
Output           4        5         7
Inquiry          3        4         6
Master file      7       10        15
Interface        5        7        10

Compute unadjusted function points :

UFP = Cinput × input + Coutput × output + Cinquiry × inquiry + CMFile × master file + Cinterface × interface

Determine the technical complexity factor

TCF = 0.65 + 0.01 × Σ Fi (summed over i = 1 to 14), where 0 ≤ Fi ≤ 5



F1 Reliable back-up & directory F2 Data communication

F3 Distributed function F4 Performance

F5 Heavily used configuration F6 Online data entry

F7 Operational ease F8 Online update

F9 Complex interface F10 Complex processing

F11 Reusability F12 Installing ease

F13 Multiple sites F14 Facilitate change

Compute the adjusted function points as FP = UFP × TCF

EXAMPLE

Spell-checker description : the checker accepts as input a document file and an optional personal dictionary file. The checker lists all words in the document file not contained in either the built-in dictionary or the personal dictionary. The user can query the number of words processed and the number of spelling errors found at any stage during processing.

External inputs : document file name, personal dictionary filename

External outputs : misspelled word report, number-of-words-processed message, number-of-errors-so-far message

Internal (master) files : dictionary

External (interface) files : document file, personal dictionary

External inquiries : words processed, errors so far
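The spell-checker counts can be turned into a function point figure as a worked example. The sketch below makes two assumptions that the slide does not state: every component is rated “average” (weights 4, 5, 4, 10, 7 for input, output, inquiry, master file, interface), and all fourteen technical factors are rated 3. It uses the usual Albrecht formulation, in which the adjusted count is the unadjusted count multiplied by the technical complexity factor. The class and method names are hypothetical.

```java
/** Sketch of an Albrecht function point calculation with average weights. */
final class FunctionPoints {
    // Average-complexity weights (assumption: every component is "average").
    static final int W_INPUT = 4;
    static final int W_OUTPUT = 5;
    static final int W_INQUIRY = 4;
    static final int W_MASTER_FILE = 10;
    static final int W_INTERFACE = 7;

    /** Unadjusted function points: weighted sum of the component counts. */
    static int unadjusted(int inputs, int outputs, int inquiries,
                          int masterFiles, int interfaces) {
        return W_INPUT * inputs
             + W_OUTPUT * outputs
             + W_INQUIRY * inquiries
             + W_MASTER_FILE * masterFiles
             + W_INTERFACE * interfaces;
    }

    /** Technical complexity factor: 0.65 + 0.01 * sum of 14 ratings, each 0..5. */
    static double technicalComplexityFactor(int[] ratings) {
        if (ratings.length != 14) {
            throw new IllegalArgumentException("expected 14 ratings");
        }
        int sum = 0;
        for (int f : ratings) {
            if (f < 0 || f > 5) {
                throw new IllegalArgumentException("rating out of range: " + f);
            }
            sum += f;
        }
        return 0.65 + 0.01 * sum;
    }

    /** Adjusted function points: unadjusted count times the technical factor. */
    static double adjusted(int ufp, double tcf) {
        return ufp * tcf;
    }
}
```

For the spell-checker (2 inputs, 3 outputs, 2 inquiries, 1 master file, 2 interface files) the unadjusted count is 8 + 15 + 8 + 10 + 14 = 55. With all ratings at 3 the factor is 0.65 + 0.42 = 1.07, giving about 58.85 adjusted function points. Since the factor ranges from 0.65 to 1.35, the adjustment can move the result by at most 35% either way.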

PROBLEMS WITH FUNCTION POINTS

Subjectivity & accuracy of TCF

Need lots of detail to compute

Problems with double counting

Problems with measurement theory

Class Exercise

A simple information system required by a bank:

It will enable new customers to be added and deleted from a customer file. The system must also support paying in and withdrawal transactions, and will display a warning message if the borrower has an excessive overdraft.

Customers can query their account balance via a terminal.

A report of overdrawn customers can be requested.

WHAT IS COMPLEXITY?

Need to distinguish problem complexity from solution complexity

Problem (computational) complexity – resources required for an optimal solution

Solution complexity – resources actually used for a given solution

Several kinds of resources, typically time and space

Which is important depends on goals, ex :

Buildability – what is the cost in creating a useful solution in the first place

Maintainability – what is the cost of keeping the solution useful after it has been built

KINDS OF COMPLEXITY

Problem (computational) complexity – easy problems are quicker to solve than hard ones

Algorithmic complexity (efficiency) – what resources does the solution need

Structural complexity – some code is more complicated (more complex structure) than other code that does the same thing

Cognitive complexity – some ways of doing things are harder to understand than others
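As an illustration of structural complexity, the two methods below compute the same result with different structure. The leap-year rule and all names are chosen only for the example and do not come from the slides.

```java
/** Two functionally equivalent methods with different structure. */
final class LeapYear {
    /** Deeply nested version: one decision per level. */
    static boolean nested(int year) {
        if (year % 4 == 0) {
            if (year % 100 == 0) {
                if (year % 400 == 0) {
                    return true;
                } else {
                    return false;
                }
            } else {
                return true;
            }
        } else {
            return false;
        }
    }

    /** Flat version: the same rule as a single boolean expression. */
    static boolean flat(int year) {
        return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
    }
}
```

Both methods agree on every input, yet they differ in length, nesting depth, and number of statements. Which one is “more complex” depends on the measure chosen: the flat version has fewer lines and no nesting, but it still contains three conditions that a reader (or a tool that counts short-circuit operators as decisions) has to evaluate.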

STRUCTURE COMPLEXITY : INTUITION

The more “complex” the structure of the code,

…the harder it is to understand

…the more likely it will have a defect

…the harder it will be to change

…the longer it will take to produce

…the more difficult it will be to reuse

WHAT IS STRUCTURE?

Control-flow – the sequence of instructions that are executed

Data-flow – the creation and movement of data between “components” of the code

Data organization – the relationship of data items to each other

DIRECTED GRAPHS

A mathematical structure (with an appealing visual representation) for representing things that are related

Consists of vertices (or nodes, or points), connected by edges (or line segments, or arcs).

Vertices: A, B, C, D, E, F

Edges: (A,C), (A,C), (A,E), (B,C), (B,F), (C,D), (D,A), (E,F)

Also : undirected graphs, rules restricting edges between vertices, classification of vertices

Graph theory – the study of properties of graphs

[Figure: directed graph on the vertices A, B, C, D, E, F]
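A directed graph like the one listed above is commonly stored as an adjacency list: each vertex maps to the list of vertices its outgoing edges point to. The sketch below is one such representation. The class and method names are hypothetical, and the edge list is copied exactly as given on the slide, including the repeated (A,C).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch of a directed graph stored as an adjacency list. */
final class DirectedGraph {
    // Each vertex maps to the targets of its outgoing edges; a List keeps repeated edges.
    private final Map<String, List<String>> successors = new TreeMap<>();

    void addVertex(String v) {
        successors.putIfAbsent(v, new ArrayList<>());
    }

    void addEdge(String from, String to) {
        addVertex(from);
        addVertex(to);
        successors.get(from).add(to);
    }

    int vertexCount() {
        return successors.size();
    }

    int edgeCount() {
        int count = 0;
        for (List<String> out : successors.values()) {
            count += out.size();
        }
        return count;
    }

    List<String> successorsOf(String v) {
        return successors.getOrDefault(v, Collections.<String>emptyList());
    }

    /** Builds the graph from the slide's vertex and edge lists, as given. */
    static DirectedGraph slideExample() {
        DirectedGraph g = new DirectedGraph();
        String[][] edges = {
            {"A", "C"}, {"A", "C"}, {"A", "E"}, {"B", "C"},
            {"B", "F"}, {"C", "D"}, {"D", "A"}, {"E", "F"}
        };
        for (String[] e : edges) {
            g.addEdge(e[0], e[1]);
        }
        return g;
    }
}
```

Built this way, the slide's graph has 6 vertices and 8 edges, and the successors of D are just A, so the path A → C → D → A forms a cycle.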

FLOWGRAPHS

Directed graphs (flow graphs) can be used to model control flow of a program

Vertex = statement, edge = (A,B) if control flows from statement A to B

Properties of flow graphs may provide information about properties of the code

McCABE’S CYCLOMATIC COMPLEXITY NUMBER

Measures the number of linearly independent paths through the flow graph

v(F) = e – n + 2, where F is the flow graph of the code, n the number of vertices, and e the number of edges

Intuition – the larger the CCN, the “more complex” the code

Various sources recommend a CCN of no more than 10–15

Example : CCN = 8 – 7 + 2 = 3
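The calculation can be sketched directly from an edge list. The flow graph used below is hypothetical (the slide's figure is not reproduced here): it was chosen to have 7 vertices and 8 edges so that it matches the arithmetic of the example. The sketch assumes a single connected flow graph in which every vertex appears in at least one edge.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch: cyclomatic complexity v(F) = e - n + 2 from a list of edges. */
final class Cyclomatic {
    /** Each edge is a pair {from, to} of vertex numbers. */
    static int complexity(int[][] edges) {
        Set<Integer> vertices = new HashSet<>();
        for (int[] edge : edges) {
            vertices.add(edge[0]);
            vertices.add(edge[1]);
        }
        int e = edges.length;
        int n = vertices.size();
        return e - n + 2;
    }

    /** Hypothetical flow graph: two decisions (at vertices 2 and 5). */
    static int[][] exampleFlowGraph() {
        return new int[][] {
            {1, 2},          // entry
            {2, 3}, {2, 4},  // first decision: two branches
            {3, 5}, {4, 5},  // branches rejoin
            {5, 6}, {5, 7},  // second decision: optional step or exit
            {6, 7}           // optional step falls through to exit
        };
    }
}
```

For this graph the result is 8 – 7 + 2 = 3, which is also the number of decision vertices plus one. That shortcut holds when every decision is a two-way branch.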

EVALUATION

Good formal definition

Satisfies properties of metrics for “number of linearly independent paths”

Control flow only

Some studies seem to indicate correlation between CCN and some costs

Doesn’t match intuition of “complexity”

Is usable where “number of linearly independent paths” is useful for answering questions for some goal
