Choosing the right curve - CERIAS
TRANSCRIPT
1
by Dr. Ronald W. Ritchey
Choosing the right curve
2
Questions?
Do software products tend toward increasing complexity over time?
Is complexity correlated with software faults?
Can we manage complexity for large projects?
Should development teams explicitly limit complexity to what they have demonstrated they can manage?
3
How hard can it be to write good software?
Largest bug-free program I can write
#include <stdio.h>
int main() {
    printf("Hello Werld\n");
}
The real world challenge is much harder
int main(int n,char**N) { S->O=0; L=n>1?*N[1]-85?1:6:0; i=L&1?atol(N[1]):123;#define i (int)(81.0*(i=1103515245*i+12345&0x7fffffff)/2147483648.0) for(l=C=0;l<81;) { I=L&1?0:getchar()^48; i; I=I-30?I:0; if(I<10) { #define S S[O if(C<22) z[C++]^=13; N(I,)NO } }
for(;;) { l0: Su(row,col,box) C=l0=0; f(l,81) if(!(s]>>10&&++l0)) { o=s]&1022; for(I=0;~o&1&&(o/=2);I++); o-1||(s]|=I<<10,C++); } if(l0==l) { if(O&&L&2) { O--; goto l0; } goto O; } for(l0=1;10>l0;l0++) { Su(,,) }
if(!C) { l=(o=S].O)?S].I:0; I=o?S].l%9+1:(S].O=i%9+1); for(;l<81;l++,I=S].O,o=0) if(!(s]>>10)) { for(;;I=I%9+1,o=1) { l0=0; if(o&&I==S].O) goto O; if(s]>>I&1) { S].l=I; S++].I=l; S]=S-1]; N(I,); O>w&&(w=O); goto lO; } } } }
4
Complexity is often cited as a reason for poor security
“Complexity is the Enemy” – Dr. Daniel Geer
“Every time the number of lines of code is doubled, a company adds four times as many security problems” – Paul Kocher, President of Cryptography Research
“Complex systems can have backdoors and Trojan code implanted that is more difficult to find because of complexity. Complex code tends to be bigger… more opportunity for accidents, omissions, and manifestation of code error” – Prof. Eugene Spafford, Purdue University
5
The research is not so clear. Some say yes…
– Shin & Williams: is complex code less secure? Investigated the JavaScript Engine (JSE) in Mozilla
• Found some, but not strong, correlation between complexity and security
– scan.coverity.com, Open Source Report 2008: analysis of 55 MLOC across 250 projects
• Found strong linear correlation between SLOC and the number of faults
6
Others no…
Ozment and Schechter (MIT Lincoln Labs)
– Analysis of OpenBSD
– Found no correlation between SLOC and number of vulnerabilities
Jeff Jones, Windows Vista: One Year Vulnerability Report
– Argued and quantified that SLOC in Windows XP vs. Vista cannot be correlated with vulnerability counts
Michael Howard
– Vista’s SLOC is higher than XP’s, yet we are experiencing a 50% reduction in vulnerability count
– Attributed to the Security Development Lifecycle (SDL)
7
It’s likely that SLOC is a weak measure of complexity. There are certainly others…
Structural
– Size
– Cyclomatic
– Halstead’s
– Information flow
– System in terms of maintainability
– Object-oriented
Conceptual
– Difficulty in understanding and maintaining
Computational
– Space and time
Architectural - Class Level
– Number of Methods per Class (NMC)
– Weighted Methods per Class (WMC)
– Response for a Class (RFC)
– Lack of Cohesion of Methods (LCOM)
– Coupling Between Objects (CBO)
Structural - Tree
– Depth of Inheritance Tree (DIT)
– Number of Children (NOC)
More research is needed into how well each of these metrics assesses complexity!
8
What is very clear is that programs are becoming larger
Product                          SLOC (millions)
Windows XP                       40
Windows Vista                    60
Mac OS X 10.4                    86
Red Hat Linux Enterprise 7.1     20
Internet Explorer 6              7
Mozilla Firefox 1.2              5
Linux Kernel 2.6.0               5.2
OpenSolaris                      9.7
Debian 4.0                       283
Ubuntu                           121
9
Most of these products doubled in SLOC between versions
Product              Previous version: SLOC (millions)    Newer version: SLOC (millions)
Windows              NT4: 11-12                           XP: 40
Mac OS X             v10.4: 86                            v10.5: 90
Red Hat Linux        v6.2: 17                             v7.1: 30
Debian               v3.0: 104                            v4.0: 283
Internet Explorer    v6: 7                                v7: 10
Mozilla Firefox      v1.0: 3-4                            v1.5: 6
Linux Kernel         v2.5: 3-4                            v2.6: 5.2
What do you think? Did vulnerability rates increase or decrease in this period?
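As a rough check on the growth between versions, the figures in the table above can be turned into growth factors. The sketch below is illustrative only: the dictionary name and labels are made up for this example, and ranges such as "11-12" are replaced by their midpoint, which is an assumption. The ratio does not depend on the unit of the SLOC figures.

```python
# SLOC figures transcribed from the table above (previous version, newer version).
# Ranges such as "11-12" are replaced by their midpoint; that is an assumption.
SLOC_BY_PRODUCT = {
    "Windows (NT4 -> XP)": (11.5, 40.0),
    "Mac OS X (10.4 -> 10.5)": (86.0, 90.0),
    "Red Hat Linux (6.2 -> 7.1)": (17.0, 30.0),
    "Debian (3.0 -> 4.0)": (104.0, 283.0),
    "Internet Explorer (6 -> 7)": (7.0, 10.0),
    "Mozilla Firefox (1.0 -> 1.5)": (3.5, 6.0),
    "Linux Kernel (2.5 -> 2.6)": (3.5, 5.2),
}


def growth_factor(previous, newer):
    """Return how many times larger the newer version is than the previous one."""
    return newer / previous


def growth_table(data):
    """Map each product to its growth factor, rounded to two decimals."""
    return {
        name: round(growth_factor(previous, newer), 2)
        for name, (previous, newer) in data.items()
    }


if __name__ == "__main__":
    for name, factor in growth_table(SLOC_BY_PRODUCT).items():
        print(f"{name}: x{factor}")
```

A factor of 2.0 or more means the product at least doubled between the two listed versions.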
10
A Case Study!
Analyzed MS Windows products
– NT 3.5
– NT 4.0
– 2K
– XP
– Vista
Data acquired from NVD
– Analysis conducted on data collected over a 13-year period
11
Vulnerability Discovery Over Time
12
Cumulative Vulnerability Count Over Time
13
Cumulative Vulnerability Count vs. Complexity
14
Findings from this data: vulnerability rates appear to be increasing, especially for newly introduced foundational code
Results suggest that
– Vulnerability count does correlate with the size of the code base (SLOC), reaffirming the generally held hypothesis
– There is an initial spike in the number of vulnerabilities after the release of foundational code
– Fewer vulnerabilities are discovered in foundational code over time
15
Need to be careful in how we judge the numbers
[Figure: two diagrams, each labeled Foundational Code and New Code]
Foundational code seems to have a much higher vulnerability rate, but how much of that exists in Ver 2.0?
At what level of change does added code take on the character of foundational code?
16
Another problem: what is the total population of vulnerabilities in a given sample of code?
17
A few eyes may find some vulnerabilities
18
Popular products draw much more scrutiny
19
Other factors also affect the quality of vulnerability data
Not all discovered vulnerabilities are published
– This is increasingly driven by the operational and economic value of zero-day vulnerabilities
Vulnerability reports can often be duplicates
– It is occasionally difficult to detect that the same underlying issue has been reported repeatedly based upon different circumstances of use
Death dates are based on published dates
– Can contribute significant noise
20
Some industries have a good track record at managing complexity even for large projects
Product                                SLOC
NASA Space Shuttle flight control      420K (shuttle) + 1.4 million (ground)
Boeing 777 (2004)                      2 – 3 million
Red Hat Linux 7.1 (2007)               17 million
Windows Vista (2008)                   60 million
Safety critical development seems ahead of the curve for managing complex development, but cost can be high
21
Managing complexity is difficult
Requires a cultural paradigm shift in software development processes
– Avoid faults
  Code inspection (static & dynamic)
  Code walk-through,…
– Remove faults
  Functional Integration testing,…
– Tolerate some faults
  N-Version Programming Schema (NVPS),
  Recovery Block Scheme (RBS),…
– Predict faults
  Leverage machine learning capabilities (supervised, unsupervised, semi-supervised, etc.)
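To make the "tolerate some faults" item concrete, here is a minimal sketch of the voting idea behind N-version programming: several independently written versions of the same function are run and the majority answer wins. The function names and the three toy "versions" are hypothetical and only illustrate the scheme. It is not taken from the talk.

```python
from collections import Counter


def n_version_vote(versions, *args):
    """Run every version on the same input and return the majority result.

    A version that raises an exception simply loses its vote.
    Results must be hashable so that they can be counted.
    """
    results = []
    for version in versions:
        try:
            results.append(version(*args))
        except Exception:
            continue  # tolerate a crashing version
    if results:
        value, count = Counter(results).most_common(1)[0]
        if count > len(versions) // 2:  # strict majority of all versions
            return value
    raise RuntimeError("no majority among versions")


# Three hypothetical implementations of the same specification (absolute value).
def abs_v1(x):
    return x if x >= 0 else -x


def abs_v2(x):
    return max(x, -x)


def abs_v3_faulty(x):
    return x  # deliberate fault: wrong for negative input


if __name__ == "__main__":
    print(n_version_vote([abs_v1, abs_v2, abs_v3_faulty], -3))
```

The faulty version is outvoted for negative inputs. The price is writing and running every version, which is one reason fault tolerance of this kind is costly.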
22
At a minimum, you must know how much complexity your development team can handle
[Figure: vulnerability introduction rate (y-axis) vs. complexity (SLOC, McCabe, Halstead, etc.) (x-axis)]
Curve labels: 93 vulns in 1st year per MLOC (notional); 10 vulns in 1st year per MLOC (notional)
Point labels: Are you here? Or here?
The curve you are on should dictate your acceptable complexity tolerance!
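One way to read the two notional rates is as a size budget: given how many first-year vulnerabilities a team is willing to accept, how much code can it ship? The sketch below assumes that vulnerabilities scale linearly with code size, which is a simplification. The function name and the budget of 100 are hypothetical.

```python
def max_code_size_mloc(vuln_budget, vulns_per_mloc):
    """Largest code size (in MLOC) that stays within a first-year vulnerability budget.

    Assumes a constant vulnerability rate per MLOC (a simplifying assumption).
    """
    if vulns_per_mloc <= 0:
        raise ValueError("vulns_per_mloc must be positive")
    return vuln_budget / vulns_per_mloc


if __name__ == "__main__":
    budget = 100  # hypothetical tolerance: 100 vulnerabilities in the first year
    for rate in (93, 10):  # the two notional rates from the slide
        size = max_code_size_mloc(budget, rate)
        print(f"{rate} vulns/MLOC -> at most {size:.2f} MLOC")
```

Under these assumptions, a team on the lower curve can carry roughly nine times as much code for the same vulnerability budget.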
23
Conclusions
Initial code release (foundational code) contributed the majority of the vulnerabilities reported
Complexity does impact vulnerability
Complexity of foundational code is increasing
– Appears to be doubling every five to eight years
Our ability to prevent vulnerability rates from increasing at the same rate is tied to our ability to either limit complexity or improve our ability to handle it
– The easy, though unpalatable, solution is to limit functionality
– Improvements in software development processes, testing, technologies, and skillsets may allow flat vulnerability growth
– More investment needed!
– Many more skilled practitioners needed?
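The "doubling every five to eight years" estimate can be restated as a compound annual growth rate: a quantity that doubles every T years grows by a factor of 2^(1/T) per year. The sketch below works that out. The function names are illustrative, and smooth exponential growth is an assumption.

```python
def annual_growth_rate(doubling_years):
    """Compound annual growth rate implied by a given doubling period."""
    return 2 ** (1 / doubling_years) - 1


def size_after(initial_size, years, doubling_years):
    """Projected size after `years`, assuming smooth exponential growth."""
    return initial_size * 2 ** (years / doubling_years)


if __name__ == "__main__":
    for period in (5, 8):
        rate = annual_growth_rate(period)
        print(f"doubling every {period} years = {rate:.1%} growth per year")
```

Doubling every five years corresponds to roughly 15% growth per year, and doubling every eight years to roughly 9%.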
24
Backup
25
McCabe’s (1976) Cyclomatic Complexity (CC)
For the given graph
– CC = V(G) = e – n + 2p
  e = number of edges = 13
  n = number of nodes
  p = number of unconnected parts of the graph
The rule of thumb: CC <= 10, otherwise it is too complex
[Figure: graph with nodes numbered 1 to 11]
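The formula CC = e – n + 2p is easy to evaluate from an edge list. The sketch below counts the parts p with a small union-find and applies the formula to a hypothetical four-node if/else graph, not the graph from the slide. For the slide's graph, if it has 11 nodes (as the node labels suggest) and is a single connected part, the formula would give 13 – 11 + 2 = 4, but that reading is an assumption.

```python
def connected_components(nodes, edges):
    """Count the connected parts of the graph, treating edges as undirected."""
    parent = {node: node for node in nodes}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(node) for node in nodes})


def cyclomatic_complexity(nodes, edges):
    """McCabe's V(G) = e - n + 2p for a control-flow graph."""
    e = len(edges)
    n = len(nodes)
    p = connected_components(nodes, edges)
    return e - n + 2 * p


# Hypothetical control-flow graph of a single if/else:
# 1 = condition, 2 = then-branch, 3 = else-branch, 4 = join
IF_ELSE_NODES = [1, 2, 3, 4]
IF_ELSE_EDGES = [(1, 2), (1, 3), (2, 4), (3, 4)]

if __name__ == "__main__":
    print(cyclomatic_complexity(IF_ELSE_NODES, IF_ELSE_EDGES))
```

A single decision point gives a complexity of 2, matching the intuition that there are two independent paths through an if/else.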
26
Halstead’s Metrics (1977)
Counting Operators and Operands
– $n_1$: number of unique operators
– $n_2$: number of unique operands
– $N_1$: total number of operators
– $N_2$: total number of operands
Vocabulary: $n = n_1 + n_2$
Program Length: $N = N_1 + N_2$
Volume: $V = N \log_2 n$
Level of abstraction: $L = V^* / V$
• $V^*$ = volume of function prototype. For sort(x), $V^* = 2 \log_2 2$. For main(), L is high.
• Approximation: $L' = (2/n_1)(n_2/N_2)$
Programming Effort: $E = V / L$
Estimated Programming Time: $T' = E / 18$
Estimate of N: $N' = n_1 \log_2 n_1 + n_2 \log_2 n_2$
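The Halstead formulas above can be evaluated directly from the four counts. The sketch below is one possible arrangement: the class and field names are made up for this example, and it uses the approximation L' in place of L, since V* is usually not known. The counts in the usage line are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class HalsteadMetrics:
    vocabulary: int         # n  = n1 + n2
    length: int             # N  = N1 + N2
    volume: float           # V  = N * log2(n)
    level_estimate: float   # L' = (2 / n1) * (n2 / N2)
    effort: float           # E  = V / L'  (L' used as a stand-in for L)
    time_estimate: float    # T' = E / 18
    length_estimate: float  # N' = n1*log2(n1) + n2*log2(n2)


def halstead(n1, n2, N1, N2):
    """Compute Halstead's metrics from operator and operand counts."""
    n = n1 + n2
    N = N1 + N2
    volume = N * math.log2(n)
    level = (2 / n1) * (n2 / N2)
    effort = volume / level
    return HalsteadMetrics(
        vocabulary=n,
        length=N,
        volume=volume,
        level_estimate=level,
        effort=effort,
        time_estimate=effort / 18,
        length_estimate=n1 * math.log2(n1) + n2 * math.log2(n2),
    )


if __name__ == "__main__":
    # Hypothetical counts: 4 unique operators, 4 unique operands,
    # 8 operator occurrences, 8 operand occurrences.
    print(halstead(4, 4, 8, 8))
```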