8/2/2019 Metrics and Optimization
Metrics and Optimization
Team Tango & Victor
-
What is it?
A metric is a rule for quantifying a characteristic or attribute of a program.
Metrics are an important part of the software development process.
They help developers find improvements in their code.
-
Common Code Metrics
Code Coverage
Program Load Time
Cohesion
Coupling
Code Density
Source Lines of Code or Program Length
Bugs Per Line of Code
Number of Classes and Interfaces
Execution Time
-
Code Coverage
Definition: A measurement of how many lines of code are executed during automated testing.
It is a structural testing technique.
The code coverage tool gives a percentage of how much of the code has been exercised.
Used to develop a set of rigorous and manageable regression tests.
-
Types of Coverage
The main types of coverage are:
Function Coverage: reports whether each function or procedure is invoked.
Statement Coverage: reports whether each executable statement is encountered.
Decision Coverage: reports whether each Boolean expression tested in control structures evaluates to both true and false.
Condition/Decision Coverage: requires both the decision and the condition requirements to be satisfied.
-
Continued
Condition Coverage: reports whether each Boolean sub-expression evaluates to both true and false.
Path Coverage: reports whether each path in each function has been followed. Also known as predicate coverage.
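A minimal sketch of how decision coverage and condition coverage can differ. The function `grant_access` and the `seen` bookkeeping are hypothetical names made up for illustration; a real coverage tool instruments the code automatically instead.

```python
# Hand-rolled bookkeeping for illustration only.
seen = {"decision": set(), "is_admin": set(), "is_owner": set()}


def grant_access(is_admin: bool, is_owner: bool) -> bool:
    """Decision under test: is_admin or is_owner."""
    seen["is_admin"].add(is_admin)
    if not is_admin:
        # With short-circuit evaluation, is_owner is only evaluated here.
        seen["is_owner"].add(is_owner)
    decision = is_admin or is_owner
    seen["decision"].add(decision)
    return decision


def fully_covered(name: str) -> bool:
    """True when the named expression has been both True and False."""
    return seen[name] == {True, False}


# Two tests are enough for decision coverage of this function...
grant_access(True, False)
grant_access(False, False)
decision_covered = fully_covered("decision")
owner_covered_before = fully_covered("is_owner")

# ...but condition coverage also needs is_owner to be True at least once.
grant_access(False, True)
owner_covered_after = fully_covered("is_owner")
```

After the first two calls the decision has been both true and false, yet the `is_owner` sub-expression has only ever been false; the third call closes that gap.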
-
Advantage
Allows the developers and QA to look at parts of a system that are rarely accessed under normal conditions, such as error handling.
Testers can use the results to develop more test cases that increase the overall code coverage.
-
Disadvantages
Time Consuming
High Cost
-
Program Loading Time
Definition: How long it takes for a program to load before the user can interact with it.
Loading starts with the OS reading the contents of the executable into memory and carrying out other preparatory tasks.
Once loading is done, the OS starts the program by passing control to the loaded program.
-
Cohesion and Coupling
(Invented by Larry Constantine)
-
Cohesion
What is Cohesion?
Cohesion is a measure of how strongly related each piece of functionality expressed by the source code of a software module is.
-
Which one is better?
High Cohesion or Low Cohesion
-
Disadvantages of Low Cohesion
Increased difficulty in understanding modules
Increased difficulty in maintaining a system
Increased difficulty in reusing a module
-
Types of Cohesion
Coincidental cohesion (worst): parts of a module are grouped arbitrarily
Logical cohesion: parts of a module are grouped because they are logically categorized as doing the same thing
Temporal cohesion: parts of a module are grouped by when they are processed
Procedural cohesion: parts of a module are grouped because they always follow a certain sequence of execution
-
Types of Cohesion
Communicational cohesion: parts of a module are grouped because they operate on the same data
Sequential cohesion: parts of a module are grouped because the output from one part is the input to another part
Functional cohesion (best): parts of a module are grouped because they all contribute to a single well-defined task of the module
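To make the two ends of this scale concrete, here is a small sketch contrasting coincidental and functional cohesion. The classes `Misc` and `Circle` are hypothetical and exist only for this illustration.

```python
import math


class Misc:
    """Coincidental cohesion: unrelated helpers that merely share a home."""

    @staticmethod
    def circle_area(radius: float) -> float:
        return math.pi * radius ** 2

    @staticmethod
    def shout(text: str) -> str:
        return text.upper() + "!"


class Circle:
    """Functional cohesion: every member contributes to describing a circle."""

    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2

    def circumference(self) -> float:
        return 2 * math.pi * self.radius
```

A reader of `Circle` can guess what else belongs in it; nothing about `Misc` suggests what should or should not be added next.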
-
Coupling (or Dependency)
Coupling is the degree to which each program module relies on each of the other modules.
-
Which one is better?
Tight Coupling or Loose Coupling
-
Disadvantages of Tight Coupling
A change in one module usually forces a ripple effect of changes in other modules.
Assembly of modules might require more effort and/or time due to the increased inter-module dependency.
A particular module might be harder to reuse and/or test because dependent modules must be included.
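The sketch below illustrates the reuse and testing point under simple assumptions. All class names (`MemoryStore`, `TightReport`, `LooseReport`) are hypothetical: the tightly coupled report builds its own store and reaches into its internals, while the loosely coupled one only relies on a `save` method.

```python
from typing import Protocol


class Store(Protocol):
    """Anything with a save method can act as a store."""

    def save(self, text: str) -> None: ...


class MemoryStore:
    def __init__(self) -> None:
        self.saved = []

    def save(self, text: str) -> None:
        self.saved.append(text)


class TightReport:
    """Tightly coupled: hard-wires one concrete store and its internals."""

    def __init__(self) -> None:
        self.store = MemoryStore()

    def publish(self, text: str) -> None:
        # Depends on the store keeping a list called `saved`.
        self.store.saved.append(text)


class LooseReport:
    """Loosely coupled: works with any object that offers save()."""

    def __init__(self, store: Store) -> None:
        self.store = store

    def publish(self, text: str) -> None:
        self.store.save(text)


test_store = MemoryStore()
LooseReport(test_store).publish("q1 results")
```

Renaming `saved` inside `MemoryStore` would break `TightReport` but not `LooseReport`, which is the ripple effect described above in miniature.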
-
Types of Coupling
-
Cohesion vs. Coupling
Low Coupling often correlates with high cohesion
High Coupling often correlates with low cohesion
-
Comment Density
A measure of meaningful comments per logical line of code
The content of comments is important
Too many comments can be cumbersome
-
Source Lines of Code (SLOC)
Software metric used to measure the size of a program
Useful for predicting the general amount of effort required to complete a similar program
First used when FORTRAN and assembler were the main languages
-
SLOC
Physical SLOC (LOC)
Actual number of lines in source code
Easier to write tools to measure SLOC
Subject to logically irrelevant formatting conventions
Logical SLOC (LLOC)
Number of "statements" in source code
Does not count formatting conventions
-
SLOC Example
example 1:
for (i = 0; i < 100; i += 1) printf ("hello");
example 2:
/* Now how many lines of code is this? */
for (i = 0; i < 100; i += 1)
{
    printf("hello");
}
How many physical lines of code? How many logical lines of code?
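One way to answer the question is to count mechanically. The sketch below is a deliberately naive counter for C-like snippets and rests on assumptions of its own: physical lines are non-blank lines after `/* */` comments are removed, and logical lines are semicolons outside parentheses plus control keywords. It ignores string literals, `//` comments and `do`/`while`, so treat it as an illustration, not a measuring tool.

```python
import re

CONTROL_KEYWORDS = re.compile(r"\b(for|while|if|switch)\b")

EXAMPLE_1 = 'for (i = 0; i < 100; i += 1) printf("hello");'

EXAMPLE_2 = '''/* Now how many lines of code is this? */
for (i = 0; i < 100; i += 1)
{
    printf("hello");
}'''


def strip_comments(source: str) -> str:
    """Remove /* ... */ comments (the only comment style handled here)."""
    return re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)


def physical_sloc(source: str) -> int:
    """Count non-blank lines once comments are removed."""
    code = strip_comments(source)
    return sum(1 for line in code.splitlines() if line.strip())


def logical_sloc(source: str) -> int:
    """Count statements: top-level semicolons plus control keywords."""
    code = strip_comments(source)
    depth = 0
    statements = 0
    for ch in code:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == ";" and depth == 0:
            statements += 1
    return statements + len(CONTROL_KEYWORDS.findall(code))


counts = {
    "example 1": (physical_sloc(EXAMPLE_1), logical_sloc(EXAMPLE_1)),
    "example 2": (physical_sloc(EXAMPLE_2), logical_sloc(EXAMPLE_2)),
}
```

Under these rules example 1 has 1 physical and 2 logical lines, and example 2 has 4 physical and 2 logical lines: the formatting changed the physical count but not the logical one.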
-
Importance of SLOC
Advantages:
Intuitive software size measuring metric
Easy for new programmers to understand
Can estimate number of bugs per chunk of code
SLOC per staff hour
-
Importance of SLOC
Disadvantages:
Coding accounts for a small chunk of the entire software creation process
Software often uses more than one language
Often a cause of unnecessarily verbose code
GUI tools achieve a high level of functionality with very little work from the programmer
-
Types of SLOC
KLOC - 1 000 lines
KDLOC - 1 000 delivered lines
KSLOC - 1 000 source lines
MLOC - 1 000 000 lines
GLOC - 1 000 000 000 lines
Does having a lower SLOC count mean having a better program?
-
Similarly
Number of classes and interfaces
Number of lines of customer requirements
-
Bugs Per Line of Code
On average:
In the software industry there are 15 to 50 errors per 1000 lines of delivered code.
Microsoft applications had about 0.5 defects per 1000 lines of delivered code (in 1992).
At the upper end of the industry range, expect about 500 bugs per 10 KLOC.
You should spend 50% of the time debugging.
Count bugs and log errors to improve code quality.
-
How can we reduce this?
Cleanroom Development
Averages 3 defects per 1000 lines during testing and 0.1 defects per 1000 lines of delivered code.
Focuses on defect prevention rather than removal.
An example of software written using this method is the Space Shuttle software, which achieved 1 defect in 400,000 lines of code using formal development methods, peer reviews, and statistical testing.
The downside: this came at a cost of $1000 (taxpayers' money) per line of code.
-
Execution Time
What is it?
The time during which a program is running or executing.
Things that can influence execution time:
Type checking
Storage allocation
Code optimization
Run time of algorithms
How to improve?
Try to push most tasks to compile time rather than run time
Multithread when possible
Design better and faster algorithms
-
Number of Classes and Interfaces
The number of classes and interfaces, independent of the number of lines of code, is a good way to measure the size of the program
If you know the number of classes and interfaces before implementing the code, then you can use this number to estimate completion time
Example: A program with 5 classes is smaller than a program with 500 classes.
-
Tools for Software Metrics
There are plenty of tools used to measure software, such as:
Analyst4j - An Eclipse plug-in or stand-alone tool to measure Java programs
OOMeter - Measures software using cohesion, complexity, or coupling
-
Summary
"You can't control what you can't measure."
To prevent bugs, worry about them while coding rather than fixing them afterwards
Try to minimize execution time by pushing most tasks to compile time, designing efficient algorithms, and using your processor to its potential
Measuring your program will give you a better estimate of how much work is remaining and helps you understand where code could be improved
Use good methods of measurement, because you could cause more harm than good using a naive approach
-
Performance Tuning
"Machine independent code has machine independent performance." Greg Wilson
-
Performance Tuning
Moore's Law, as popularly stated, tells us that chips double in speed every 18 months
Proebsting's Law tells us that compiler optimizations double program speed every 18 years
After 18 years, an upgrade to the latest hardware would constitute a 4096x improvement in speed, compared with the optimization's 2x.
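The 4096x figure follows from counting doublings: 18 years contain twelve 18-month periods, and 2 to the 12th power is 4096. A short check, where the helper name `speedup` is made up for this illustration:

```python
def speedup(years: float, doubling_period_years: float) -> float:
    """Speed multiplier after `years` if speed doubles every period."""
    return 2 ** (years / doubling_period_years)


hardware = speedup(18, 1.5)  # twelve doublings
compiler = speedup(18, 18)   # one doubling
```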
-
What could possibly go wrong?
Optimization almost always fails when performed prematurely.
The system is constantly evolving early on.
Effects are unpredictable on complex modern systems.
bool ready = true;
while (ready) {
    // act
}
Optimized =>
while (true) {}
-
Considerations before you optimize
Why is it behaving that way?
Is that behaviour really necessary?
Read the documentation!
Add documentation yourself
Can predict the effects of minor changes.
//TODO: Use this later
bool ready = true;
while (ready) {
    // changeReady();
}
-
Bentley's Rules for Optimization
Complete list at http://www.imaging.robarts.ca/~kwang/OriginTuning/sgi_html/apa.html
Excerpts from Writing Efficient Programs
25 years old
Some rules still worth knowing
Key points of interest:
1. Data Structure Augmentation
2. Storing Computed Results
3. Lazy Evaluation
4. Packing
5. Interpreters
6. Code Motion Out of Loops
7. Combining Tests
8. Loop Unrolling & Fusion
9. Exploit Algebraic Identities
10. Short-Circuiting
11. Precomputation
12. Iteration Over Recursion
13. Recycling Objects
-
Data Structure Augmentation
"The time required for common operations on data can often be reduced by augmenting the structure with additional information, or by changing the information within the structure so it can be accessed more easily." (Bentley)
-
Augmentation High-Level Example
string quoteText = @"The time required for common operations on data can often be reduced by augmenting the structure with additional information";
Quote quote = new Quote(quoteText, "Bentley");
quote.Append(@"or by changing the information within the structure so it can be accessed more easily.");
quote.Owner = "Michael Scott";
-
Store Precomputed Results
"The cost of recomputing an expensive function can be reduced by computing the function only once and storing the results. Subsequent requests for the function are handled by table lookup." (Bentley)
-
Storing Precomputed Results
Local variables within functions and methods
Static variables within classes
Collections and hashing when there are multiple results
Cache database requests
Worthwhile to do this when function calls are expensive
Get in the habit of doing this for short functions as well.
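A minimal sketch of storing computed results with the standard library's `functools.lru_cache`. The function `slow_square` and the `calls` counter are hypothetical stand-ins for an expensive computation.

```python
from functools import lru_cache

calls = {"count": 0}  # tracks how often the function body actually runs


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Stand-in for an expensive function; results are cached per argument."""
    calls["count"] += 1
    return n * n


results = [slow_square(4) for _ in range(3)]
```

The body runs once for the argument 4; the two later requests are answered from the cache, which is the table lookup Bentley describes.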
-
Lazy Evaluation
"We'll do it live!"
-
Lazy Evaluation
"Never evaluate an item until it is needed."
-
Examples of Lazy Evaluation
Good:
fibonacci_results = {}

def fibonacci(i):
    # Evaluate Fibonacci numbers lazily: compute a value only when it is
    # first requested, then reuse the stored result. (dict.setdefault would
    # evaluate its default argument on every call, defeating the table.)
    if i not in fibonacci_results:
        fibonacci_results[i] = i if i in (0, 1) else fibonacci(i - 1) + fibonacci(i - 2)
    return fibonacci_results[i]

Populate the table with only the values that are actually requested, when they are requested.
Bad:
N = 50  # table size, an arbitrary value chosen for illustration

def fibonacci(i):
    if i not in fibonacci_results:
        # Precompute at least the first N values, whether or not they are needed.
        fibonacci_results[0] = 0
        fibonacci_results[1] = 1
        for j in range(2, max(i, N) + 1):
            fibonacci_results[j] = fibonacci_results[j - 1] + fibonacci_results[j - 2]
    return fibonacci_results[i]
-
Packing
Time-for-Space Rules
Dense storage representations can decrease storage costs by increasing the time to store and retrieve data.
Example: Storing RGB component values (0-255).
struct big_rgb {
    int red;
    int green;
    int blue;
};
/* Assuming a 4-byte int, this wastes
 * 3 bytes (24 bits) per component.
 * Bad for high resolution images. */
struct small_rgb {
    unsigned int red:8;
    unsigned int green:8;
    unsigned int blue:8;
};
/* Compilers typically pack the above bit
 * fields into a single unsigned int; the
 * exact layout is implementation-defined.
 * Could use unsigned char instead. */
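The same trade can be sketched with Python's standard `struct` module. The `=` prefix selects standard sizes with no padding, so the byte counts below do not depend on the platform; the names `WIDE` and `PACKED` are illustrative.

```python
import struct

WIDE = struct.Struct("=iii")    # three 4-byte signed ints
PACKED = struct.Struct("=BBB")  # three 1-byte unsigned values

pixel = (255, 128, 0)

wide_bytes = WIDE.pack(*pixel)
packed_bytes = PACKED.pack(*pixel)

# Reading a packed pixel back costs an explicit unpack step:
# that is the time paid for the space saved.
restored = PACKED.unpack(packed_bytes)
```

Here the wide form takes 12 bytes per pixel and the packed form takes 3, at the price of a pack and unpack call around every access.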
-
Interpreters
Time-for-Space Rules
The space required to represent a program can often be decreased by the use of interpreters in which common sequences of operations are represented compactly.
Example: regular expressions encoded as FSAs
However, state jumping confuses compilers.
-
Code Motion Out of Loops
Loop Rules
An expression whose value does not depend on the loop variable should be calculated once, outside the loop, rather than on every iteration.
Compilers are good at recognizing invariant expressions. Place expressions where it is most natural to read or write them, and let the compiler move them for you.
Example: for (i=0; i < n ;i++) { if (x[i]
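A minimal sketch of the idea in Python, with hypothetical function names. Unlike an optimizing C compiler, CPython does not generally hoist invariant expressions for you, so here the manual version does make a difference.

```python
import math


def scale_naive(values, angle):
    """Recomputes the loop-invariant cosine on every iteration."""
    out = []
    for v in values:
        out.append(v * math.cos(angle))
    return out


def scale_hoisted(values, angle):
    """Computes the invariant once, outside the loop."""
    factor = math.cos(angle)
    out = []
    for v in values:
        out.append(v * factor)
    return out
```

Both functions return the same list; the second simply calls `math.cos` once instead of once per element.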
-
Loop Unrolling
Loop Rules
A large part of the cost of some short loops is in modifying loop indexes. That cost can often be reduced by unrolling the loop.
The goal of loop unrolling is to increase a program's speed by reducing (or eliminating) instructions that control the loop.
Example:
/* Original loop */
int x;
for (x = 0; x < 100; x++) {
    delete(x);
}
/* Unrolled by a factor of 5 */
int x;
for (x = 0; x < 100; x += 5) {
    delete(x);
    delete(x + 1);
    delete(x + 2);
    delete(x + 3);
    delete(x + 4);
}
-
Loop Fusion
Loop Rules
If two nearby loops operate on the same set of elements, combine their operational parts and use only one set of loop-control operations.
However, this contradicts modularity principles.
Example:
//BAD!!
for (i=0;i
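A small sketch of loop fusion, using hypothetical functions that compute a sum and a maximum over the same non-empty list.

```python
def separate(values):
    """Two passes over the same elements, one per result."""
    total = 0
    for v in values:
        total += v
    largest = values[0]
    for v in values:
        if v > largest:
            largest = v
    return total, largest


def fused(values):
    """One pass: both bodies share a single set of loop controls."""
    total = 0
    largest = values[0]
    for v in values:
        total += v
        if v > largest:
            largest = v
    return total, largest
```

The fused version does half the loop bookkeeping, but the two concerns can no longer be read, tested, or reused separately, which is the modularity cost noted above.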
-
Exploit Algebraic Identities
Logic Rules
In a conditional expression, replace a costly expression with an algebraically equivalent expression that is cheaper to evaluate.
Compilers can sometimes do this.
Example: Not (sqr(X) > 1) but (X > 1), which is equivalent only when X is non-negative. Not (!(A) && !(B)) but !(A || B) (De Morgan's Law).
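Before swapping one expression for another, it is worth checking that the two really agree on the inputs you care about. A sketch of such a check, reading `sqr` as squaring and using made-up function names:

```python
from itertools import product


def costly(a: bool, b: bool) -> bool:
    return (not a) and (not b)


def cheap(a: bool, b: bool) -> bool:
    return not (a or b)


# De Morgan's Law holds for every combination of inputs.
de_morgan_holds = all(
    costly(a, b) == cheap(a, b) for a, b in product([False, True], repeat=2)
)


def square_test(x: float) -> bool:
    return x * x > 1


def plain_test(x: float) -> bool:
    return x > 1


# The squaring identity holds for non-negative x...
agree_non_negative = all(
    square_test(x) == plain_test(x) for x in [0, 0.5, 1, 1.5, 2]
)
# ...but not for negative x: (-2) squared is greater than 1, while -2 is not.
disagree_negative = square_test(-2) != plain_test(-2)
```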
-
Short-Circuit Monotone Functions
Logic Rules
Take advantage of the short-circuit behavior of Boolean expressions by evaluating cheap conditions first.
Compilers don't usually trust themselves to rearrange the order of evaluation.
Example:
//BAD: May get a run time error.
if (((1/x) < 1) && x != 0)
vs.
//GOOD: Avoid dividing by zero.
if (x != 0 && ((1/x) < 1))
-
Precompute Logical Functions
Hard-code a function as a table instead.
Pros:
Table look-ups are very fast.
Cons:
Consumes more memory. A lot more on large domains.
Takes more time to write and change.
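As an illustration, the sketch below precomputes a logical function over a small domain: whether a byte has an odd number of set bits. The names `ODD_PARITY`, `odd_parity_computed` and `odd_parity_lookup` are hypothetical.

```python
def odd_parity_computed(value: int) -> bool:
    """Walk the bits one by one and report whether an odd number are set."""
    odd = False
    while value:
        odd ^= bool(value & 1)
        value >>= 1
    return odd


# The whole function, precomputed for every byte value (256 entries).
ODD_PARITY = [odd_parity_computed(value) for value in range(256)]


def odd_parity_lookup(value: int) -> bool:
    """Answer by table look-up; valid only for 0 <= value <= 255."""
    return ODD_PARITY[value]
```

The table costs 256 entries for a one-byte domain; a four-byte domain would need over four billion, which is the memory concern listed under the cons.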
-
Iteration vs. Recursion
Iteration is always at least as good as recursion.
In many cases, recursion is worse due to stack-frame allocation.
Some languages will optimize tail recursion to avoid excessive stack-frame allocation.
Good:
for (i = 0; i < 10; ++i) {
    // DO STUFF
}
Bad:
void stuff(int i) {
    // DO STUFF
    i = i + 1;
    if (i < 10)
        stuff(i);  // i was already incremented above
}
-
Recycling Objects
Important to free unused memory for future use.
Manual is most efficient, but prone to errors:
Too early, too late, not at all, too many times.
Automatic garbage collection is more reliable, but expensive.
"Mark-and-Sweep": mark all reachable references, sweep away all unmarked memory.
"Stop-and-Copy": copy all reachable references to a new section of memory, free everything left behind.
"Generational": use one of the above methods on recent memory (faster, but doesn't catch everything).
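A toy sketch of mark-and-sweep over a hand-built object graph. The heap is modelled as a dictionary of named objects and `references` lists which objects each one points to; both are assumptions made for illustration and bear no relation to how a real runtime lays out memory.

```python
def mark(roots, references):
    """Return the set of object names reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name not in marked:
            marked.add(name)
            stack.extend(references.get(name, []))
    return marked


def sweep(heap, marked):
    """Keep only the marked objects; everything else is reclaimed."""
    return {name: obj for name, obj in heap.items() if name in marked}


heap = {"a": "A", "b": "B", "c": "C", "d": "D"}
# a points to b; c and d point to each other but nothing reaches them.
references = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}

live = sweep(heap, mark(["a"], references))
```

The unreachable cycle between `c` and `d` is collected because marking starts from the roots instead of counting references.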
-
More Tips
Vectors are the most efficient lists.
Hash tables are the most efficient maps.
Don't synchronize "just in case".
Acquire lock, then loop/recurse, then release.
Instance variables are initialized per-object, class variables just once.
Inner classes can use private methods of their containers, but at a cost.
-
Questions?