sim5102 software evaluation measuring internal product attribute
Post on 18-Dec-2015
PRODUCT SIZE
Product size is a favourite thing to measure
  Productivity → do “more” in same time
  Improve quality → “larger” more likely to have problems
  Improve prediction → “larger” will have a different prediction than “smaller”
  Reduce risk → “larger” is more risky
PRODUCT SIZE
Products are physical entities → size seems meaningful
But many “obvious” size measurements don’t take into account
  Redundancy and complexity, and so aren’t useful for understanding effort
  True functionality and effort, and so aren’t useful for understanding productivity
  Complexity and reuse, and so aren’t useful for understanding cost
CODE EXAMPLE
Comment
Statements
Many statements separated by terminators in one line
Statements with comments on the same line
....
CODE EXAMPLE
/**
 * Create a rational number
 **/
public Rational(int num, int den) {
    if (den == 0) {
        throw new
            RuntimeException("invalid rational: denominator=0");
    }
    // canonical form requires the denominator > 0
    int sign = num * den;
    num = ((sign < 0) ? -1 : 1) * Math.abs(num);
    den = Math.abs(den);
    int gcd = gcd(Math.abs(num), den); // local method call
    _numerator = num / gcd;
    _denominator = den / gcd;
}
CODE SIZE
Bytes – spaces?, lengths of variable names?
Lines of code (LOC) – blank lines? Comments?
Non-commented lines of code (NCLOC) – non-executable lines?
Executable lines – code lines longer than one line?
Statements – some statements are bigger than others
GQM & CODE SIZE
Goal : Predict test suite size
Goal : Predict amount of disk needed
Goal : Compute effort
Goal : Predict number of defects
Goal : Compute quality of documentation
COMPLICATING FACTORS
Code libraries, frameworks (reusing executable code)
Code generation
Copy & paste
Tool support
Programming language
Difficulty of the problem (“complexity”)
ATTRIBUTES OF SOFTWARE SIZE
Need to determine fundamental attributes of software size that capture key aspects
  Length – physical size
  Functionality – what functions are supplied to the user
  Complexity – how “difficult”
SIZE : SOFTWARE ENGINEERING STATE-OF-THE-ART
Length – some consensus on how to measure length of code, but not of other artifacts
Functionality – some ongoing work measuring functionality, especially for specifications
Complexity – little work other than computational complexity
CODE LENGTH
NCLOC (non-commented lines of code)
  Common measure of “length”
  Not a valid measure for total program length (adding more comments doesn’t change the measurement)
  Measure comment lines (CLOC) separately, and then combine:
    total length (LOC) = NCLOC + CLOC
  Allows useful indirect measures such as: comment density = CLOC / LOC
Executable statements (ES) – excludes headers, data declarations
Delivered Source Instruction (DSI) – excludes code created during development but not delivered, such as “scaffolding” code, prototypes, testing
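The NCLOC, CLOC and comment-density definitions above can be sketched as a small line classifier. This is a minimal sketch, not a standard counting tool: the class name LineCounts and its rules are assumptions made for illustration (blank lines are ignored, a line with code and a trailing comment counts as code, and a line that starts with a comment marker counts as a comment even if code follows it).

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: the classification rules below are assumptions, not a standard.
class LineCounts {
    final int ncloc; // non-commented lines of code
    final int cloc;  // comment-only lines

    LineCounts(int ncloc, int cloc) {
        this.ncloc = ncloc;
        this.cloc = cloc;
    }

    // total length (LOC) = NCLOC + CLOC
    int loc() {
        return ncloc + cloc;
    }

    // comment density = CLOC / LOC (defined as 0 for empty input)
    double commentDensity() {
        return loc() == 0 ? 0.0 : (double) cloc / loc();
    }

    static LineCounts count(List<String> lines) {
        int ncloc = 0;
        int cloc = 0;
        boolean inBlockComment = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (inBlockComment) {
                cloc++;
                if (line.contains("*/")) {
                    inBlockComment = false;
                }
            } else if (line.isEmpty()) {
                // blank lines are not counted in this sketch
            } else if (line.startsWith("//")) {
                cloc++;
            } else if (line.startsWith("/*")) {
                cloc++;
                if (!line.contains("*/")) {
                    inBlockComment = true;
                }
            } else {
                ncloc++; // code, possibly with a trailing comment
            }
        }
        return new LineCounts(ncloc, cloc);
    }

    public static void main(String[] args) {
        LineCounts c = count(Arrays.asList(
            "/** Create a rational number **/",
            "public Rational(int num, int den) {",
            "    den = Math.abs(den); // trailing comment",
            "",
            "}"));
        System.out.println("NCLOC=" + c.ncloc + " CLOC=" + c.cloc
            + " density=" + c.commentDensity());
    }
}
```

Changing any of the assumed rules (for example, counting blank lines, or counting mixed lines as both code and comment) changes the numbers, which is the point of the questions raised on the CODE SIZE slide.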
MORE CODE LENGTH MEASUREMENT
Number of functions
Number of files
Number of classes
Number of methods
Number of objects?
Number of method calls?
HALSTEAD’S SOFTWARE SCIENCE
$\mu_1$ = number of unique operators, $\mu_2$ = number of unique operands, $N_1$ = total number of operators, $N_2$ = total number of operands
Length $N = N_1 + N_2$, vocabulary $\mu = \mu_1 + \mu_2$
Volume $V = N \log_2 \mu$
Program level $L = V^* / V$, where $V^*$ is the volume of the minimal-size implementation
Difficulty $D = 1 / L$
Estimated level $\hat{L} = 1 / D = \frac{2}{\mu_1} \times \frac{\mu_2}{N_2}$
Estimated length $\hat{N} = \mu_1 \log_2 \mu_1 + \mu_2 \log_2 \mu_2$
Estimated effort (“mental discrimination”) $E = V / \hat{L} = \frac{\mu_1 N_2 N \log_2 \mu}{2 \mu_2}$
Programming time $T = E / 18$
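The formulas above can be evaluated mechanically once the four basic counts are known. The following is a minimal sketch: the class name Halstead and its method names are illustrative assumptions, and the hard part (deciding what counts as an operator or an operand in a given language) is left out entirely. The time is in seconds, following Halstead's constant of 18.

```java
// Sketch: Halstead's measures computed from the four basic counts.
// How operators and operands are identified is NOT covered here.
class Halstead {
    final int mu1; // unique operators
    final int mu2; // unique operands
    final int n1;  // total operators (N1)
    final int n2;  // total operands (N2)

    Halstead(int mu1, int mu2, int n1, int n2) {
        this.mu1 = mu1;
        this.mu2 = mu2;
        this.n1 = n1;
        this.n2 = n2;
    }

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    int length() {                 // N = N1 + N2
        return n1 + n2;
    }

    int vocabulary() {             // mu = mu1 + mu2
        return mu1 + mu2;
    }

    double volume() {              // V = N * log2(mu)
        return length() * log2(vocabulary());
    }

    double estimatedLevel() {      // L^ = (2 / mu1) * (mu2 / N2)
        return (2.0 / mu1) * ((double) mu2 / n2);
    }

    double estimatedLength() {     // N^ = mu1 * log2(mu1) + mu2 * log2(mu2)
        return mu1 * log2(mu1) + mu2 * log2(mu2);
    }

    double effort() {              // E = V / L^
        return volume() / estimatedLevel();
    }

    double timeSeconds() {         // T = E / 18
        return effort() / 18.0;
    }

    public static void main(String[] args) {
        // Hypothetical counts, chosen only to make the arithmetic easy to follow.
        Halstead h = new Halstead(4, 4, 8, 8);
        System.out.println("N=" + h.length() + " V=" + h.volume()
            + " E=" + h.effort() + " T=" + h.timeSeconds());
    }
}
```

With the hypothetical counts in main, N = 16 and µ = 8, so V = 16 × 3 = 48, the estimated level is 0.25, and E = 48 / 0.25 = 192, which agrees with the closed form µ1 N2 N log2 µ / (2 µ2).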
PROBLEMS WITH SOFTWARE SCIENCE
What is the relationship between the empirical relational system and the model (e.g. volume, difficulty)?
Does software science define measures or a prediction system?
Seems to require an implementation before it can be used?
What scale is effort?
USING CODE LENGTH FOR PREDICTION
Too late for predicting cost/effort for current project
Only applies to future projects that are in the same environment, language, team, etc.
Can be used for model validation
SIZE OF OTHER PRODUCTS
Requirement – e.g. number of pages, number of use cases
Specification – e.g. number of Z rules
Design – e.g. number of classes, number of diagrams
“Functionality”
FUNCTION POINT & RELATED MEASURES
A measure of “functionality”
Generally can be computed “early” in the development process
  Albrecht’s function points
  Symons’ Mark 2 function points
  COCOMO 2.0 object points
  DeMarco’s specification weights
  Web objects
ALBRECHT’S FUNCTION POINTS
Count the components in each category
  External inputs – items provided by the user to describe distinct application data
  External outputs – items provided to the user to describe distinct application data
  External inquiries – interactive inputs requiring a response
  External (interface) files – machine-readable interfaces to other systems
  Internal (master) files – logical master files in the system
Determine the complexity of each component (simple, average, complex)
ALBRECHT’S FUNCTION POINTS
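Compute unadjusted function points :
$\mathit{UFP} = C_{\text{input}} \times \text{input} + C_{\text{output}} \times \text{output} + C_{\text{inquiry}} \times \text{inquiry} + C_{\text{MFile}} \times \text{master file} + C_{\text{interface}} \times \text{interface}$
where each weight C depends on the complexity of the component:
Component     Simple  Average  Complex
Input         3       4        6
Output        4       5        7
Inquiry       3       4        6
Master file   7       10       15
Interface     5       7        10
Determine the technical complexity factor
$\mathit{TCF} = 0.65 + 0.01 \sum_{i=1}^{14} F_i$, where $0 \le F_i \le 5$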
ALBRECHT’S FUNCTION POINTS
F1 Reliable back-up & recovery F2 Data communication
F3 Distributed function F4 Performance
F5 Heavily used configuration F6 Online data entry
F7 Operational ease F8 Online update
F9 Complex interface F10 Complex processing
F11 Reusability F12 Installation ease
F13 Multiple sites F14 Facilitate change
Compute the adjusted function points as FP = UFP × TCF
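The three steps (weighted count, technical complexity factor, adjustment) can be sketched as follows. This is a minimal sketch under stated assumptions: the class FunctionPoints and its method names are made up for illustration, the weights are those of the simple/average/complex table on the previous slide, and the adjusted value is taken as the product FP = UFP × TCF, which is the usual form of Albrecht's method.

```java
// Sketch: Albrecht-style function point arithmetic.
// Row order: input, output, inquiry, master file, interface.
// Column order: simple, average, complex.
class FunctionPoints {
    static final int[][] WEIGHTS = {
        {3, 4, 6},    // input
        {4, 5, 7},    // output
        {3, 4, 6},    // inquiry
        {7, 10, 15},  // master file
        {5, 7, 10}    // interface
    };

    // counts[component][complexity] = number of components of that kind
    static int unadjusted(int[][] counts) {
        int ufp = 0;
        for (int c = 0; c < WEIGHTS.length; c++) {
            for (int k = 0; k < 3; k++) {
                ufp += WEIGHTS[c][k] * counts[c][k];
            }
        }
        return ufp;
    }

    // TCF = 0.65 + 0.01 * sum(F1..F14), each Fi rated 0..5
    static double technicalComplexityFactor(int[] f) {
        if (f.length != 14) {
            throw new IllegalArgumentException("expected 14 factors");
        }
        int sum = 0;
        for (int fi : f) {
            if (fi < 0 || fi > 5) {
                throw new IllegalArgumentException("factor out of range 0..5");
            }
            sum += fi;
        }
        return 0.65 + 0.01 * sum;
    }

    // FP = UFP * TCF
    static double adjusted(int[][] counts, int[] f) {
        return unadjusted(counts) * technicalComplexityFactor(f);
    }

    public static void main(String[] args) {
        // Hypothetical system: one component of each kind, all rated simple.
        int[][] counts = {{1, 0, 0}, {1, 0, 0}, {1, 0, 0}, {1, 0, 0}, {1, 0, 0}};
        int[] factors = new int[14]; // all factors rated 0
        System.out.println("UFP=" + unadjusted(counts)
            + " FP=" + adjusted(counts, factors));
    }
}
```

Because each Fi lies between 0 and 5, the TCF can only range from 0.65 to 1.35, so the adjustment moves the unadjusted count by at most 35% in either direction.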
EXAMPLE
Spell-checker description : the checker accepts as input a document file and an optional personal dictionary file. The checker lists all words in the document file not contained in either the built-in dictionary or the personal dictionary. The user can query the number of words processed and the number of spelling errors found at any stage during processing.
External inputs : document file name, personal dictionary filename
External outputs : misspelled word report, number-of-words-processed message, number-of-errors-so-far message
Internal (master) files : dictionary
External (interface) files : document file, personal dictionary
External inquiries : words processed, errors so far
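The counts listed for the spell-checker can be turned into an unadjusted function point total. The slides do not rate the components, so the arithmetic below assumes every component is of average complexity and uses the average weights from the component table (input 4, output 5, inquiry 4, master file 10, interface 7). The technical complexity factor is not given for this example either, so only the range it allows is shown.

```latex
\begin{align*}
\mathit{UFP} &= 4 \times 2 \;(\text{inputs}) + 5 \times 3 \;(\text{outputs}) + 4 \times 2 \;(\text{inquiries})
              + 10 \times 1 \;(\text{master file}) + 7 \times 2 \;(\text{interface files}) \\
             &= 8 + 15 + 8 + 10 + 14 = 55 \\
0.65 \le \mathit{TCF} \le 1.35 \quad &\Rightarrow \quad 35.75 \le \mathit{FP} = \mathit{UFP} \times \mathit{TCF} \le 74.25
\end{align*}
```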
PROBLEMS WITH FUNCTION POINTS
Subjectivity & accuracy of TCF
Need lots of detail to compute
Problems with double counting
Problems with measurement theory
Class Exercise
A simple information system required by a bank:
It will enable new customers to be added to, and existing customers to be deleted from, a customer file. The system must also support paying-in and withdrawal transactions, and will display a warning message if the borrower has an excessive overdraft.
Customers can query their account balance via a terminal.
A report of overdrawn customers can be requested.
WHAT IS COMPLEXITY?
Need to distinguish problem complexity from solution complexity
Problem (computational) complexity – resources required for an optimal solution
Solution complexity – resources actually used for a given solution
Several kinds of resources, typically time and space
Which is important depends on goals, e.g.:
  Buildability – what is the cost in creating a useful solution in the first place
  Maintainability – what is the cost of keeping the solution useful after it has been built
KINDS OF COMPLEXITY
Problem (computational) complexity – easy problems are quicker to solve than hard ones
Algorithmic complexity (efficiency) – what resources does the solution need
Structural complexity – some code is more complicated (more complex structure) than other code that does the same thing
Cognitive complexity – some ways of doing things are harder to understand than others
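Structural complexity can be illustrated with two methods that compute the same result but differ in control structure. This is a hypothetical example (the class and method names are not from the slides): the first version has two decision points, the second has none in its own source and delegates to the standard library method Integer.signum.

```java
// Sketch: two equivalent methods with different control-flow structure.
class SignExamples {
    // Nested branching: two decision points.
    static int signNested(int x) {
        if (x >= 0) {
            if (x == 0) {
                return 0;
            } else {
                return 1;
            }
        } else {
            return -1;
        }
    }

    // Same behaviour with no branching in this method's own source.
    static int signFlat(int x) {
        return Integer.signum(x);
    }

    public static void main(String[] args) {
        for (int x : new int[] {-5, 0, 7}) {
            System.out.println(x + " -> " + signNested(x) + " / " + signFlat(x));
        }
    }
}
```

Both methods solve the same problem with the same algorithmic cost, so any difference a measure reports between them is a difference in structure only.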
STRUCTURE COMPLEXITY : INTUITION The more “complex” the structure code ;
…the harder it is to understand …the more likely it will have a detect …the harder it will be to change …the longer it will take to produces …the more difficult it will be to reuse
WHAT IS STRUCTURE?
Control-flow – the sequence of instructions that are executed
Data-flow – the creation and movement of data between “components” of the code
Data organization – the relationship of data items to each other
DIRECTED GRAPHS
A mathematical structure (with an appealing visual representation) for representing things that are related
Consists of vertices (or nodes, or points), connected by edges (or line segments, or arcs)
Example – vertices: A, B, C, D, E, F; edges: (A,C), (A,C), (A,E), (B,C), (B,F), (C,D), (D,A), (E,F)
Also : undirected graphs, rules restricting edges between vertices, classification of vertices
Graph theory – the study of properties of graphs
[Figure: directed graph with vertices A, B, C, D, E, F]
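A directed graph like the one described above is commonly stored as an adjacency list: each vertex maps to the list of vertices its outgoing edges point to. The sketch below is illustrative only. The class DirectedGraph and its methods are assumptions, and the edge list is entered exactly as printed on the slide, so the pair (A,C), which appears twice there, is kept as two parallel edges. Whether that repetition was intended is not clear from the transcript.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch: adjacency-list representation of a directed graph.
class DirectedGraph {
    private final Map<String, List<String>> successors = new TreeMap<>();

    void addVertex(String v) {
        successors.putIfAbsent(v, new ArrayList<>());
    }

    // Adds the directed edge (from, to); parallel edges are allowed.
    void addEdge(String from, String to) {
        addVertex(from);
        addVertex(to);
        successors.get(from).add(to);
    }

    int vertexCount() {
        return successors.size();
    }

    int edgeCount() {
        int total = 0;
        for (List<String> targets : successors.values()) {
            total += targets.size();
        }
        return total;
    }

    List<String> successorsOf(String v) {
        return successors.getOrDefault(v, new ArrayList<>());
    }

    // Builds the graph from the edge list as printed on the slide.
    static DirectedGraph slideExample() {
        DirectedGraph g = new DirectedGraph();
        String[][] edges = {
            {"A", "C"}, {"A", "C"}, {"A", "E"}, {"B", "C"},
            {"B", "F"}, {"C", "D"}, {"D", "A"}, {"E", "F"}
        };
        for (String[] e : edges) {
            g.addEdge(e[0], e[1]);
        }
        return g;
    }

    public static void main(String[] args) {
        DirectedGraph g = slideExample();
        System.out.println(g.vertexCount() + " vertices, " + g.edgeCount() + " edges");
        System.out.println("successors of B: " + g.successorsOf("B"));
    }
}
```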
FLOWGRAPHS
Directed graphs (flow graphs) can be used to model control flow of a program
Vertex = statement, edge = (A,B) if control flows from statement A to B
Properties of flow graphs may provide information about properties of the code
McCABE’S CYCLOMATIC COMPLEXITY NUMBER
Measures the number of linearly independent paths through the flow graph
v(F) = e – n + 2, where F is the flow graph of the code, n the number of vertices, and e the number of edges
Intuition – the larger the CCN the “more complex” the code
Various sources recommended a CCN of no more than 10-15
Example : CCN = 8 – 7 + 2 = 3
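The computation e – n + 2 can be sketched directly from an edge list. The flow graph behind the example above is not reproduced in this transcript, so the graph in main is a hypothetical one with the same counts (7 vertices, 8 edges): an if/else followed by a loop. The class name Cyclomatic is an assumption, and the sketch assumes a single connected flow graph in which every vertex appears in at least one edge.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: cyclomatic complexity from a flow graph given as {from, to} pairs.
class Cyclomatic {
    static int complexity(int[][] edges) {
        Set<Integer> vertices = new HashSet<>();
        for (int[] edge : edges) {
            vertices.add(edge[0]);
            vertices.add(edge[1]);
        }
        // v(F) = e - n + 2
        return edges.length - vertices.size() + 2;
    }

    public static void main(String[] args) {
        // Hypothetical flow graph: vertex 2 is an if/else, vertex 5 a loop test.
        int[][] edges = {
            {1, 2},
            {2, 3}, {2, 4},   // two branches of the if/else
            {3, 5}, {4, 5},   // branches rejoin
            {5, 6}, {6, 5},   // loop body and back edge
            {5, 7}            // loop exit
        };
        System.out.println("CCN = " + complexity(edges));
    }
}
```

The hypothetical graph has two decision vertices, and the result 8 – 7 + 2 = 3 equals the number of decisions plus one, which is a common shortcut for structured code.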
EVALUATION
Good formal definition
Satisfies properties of metrics for “number of linearly independent paths”
Control flow only
Some studies seem to indicate correlation between CCN and some costs
Doesn’t match intuition of “complexity”
Is usable where “number of linearly independent paths” is useful for answering questions for some goal