Role of Software Readability on
Software Development Cost
Emilio Collar, Jr., Ph.D.
Western Connecticut State University
Ancell School of Business
181 White Street
Danbury, CT
[email protected]

Ricardo Valerdi, Ph.D.
Massachusetts Institute of Technology
Lean Aerospace Initiative
77 Vassar Street, Bldg #41, Rm #205
Cambridge, MA
[email protected]

21st International Forum on COCOMO and Software Cost Modeling
November 9, 2006
Presentation Outline
1. The High Cost of Software Maintenance
2. RUSE cost driver in COCOMO II
3. Linking RUSE to readability
4. A Cognitive Approach to Text Readability
5. Concepts in Software Readability
6. Programming Code Textbase Readability Model (PCTRM)
7. Summary of Key Interpretations
8. Implications for Software Cost Estimation
The High Cost of Software Maintenance
Typically, 70% of the life cycle cost of software is in the maintenance phase (Agresti 1982)
The high cost of software maintenance is linked to the difficulty of reading and understanding programming code, particularly code written by someone else
Code reading has been estimated to account for more than 50% of the effort expended in software maintenance (von Mayrhauser and Vans 1995)
Agresti, W. W. (1982). "Managing program maintenance." Journal of Systems Management 33(2): 34-37.
von Mayrhauser, A. and A. M. Vans (1995). Program Understanding: Models and Experiments. Advances in Computers. M.C. Yovits and M. Zelkowitz (Eds.) San Diego, CA, Academic Press. 40: 2-36.
RUSE Cost Driver in COCOMO II
Early Design Model              Post Architecture Model
Required Reuse (RUSE)           Required Reuse (RUSE)
Personnel Experience (PREX)     Platform Experience (PEXP)
                                Applications Experience (AEXP)
                                Programming Language Experience (LEXP)
Developing programming code that is more easily read by other programmers would reduce the cost of maintaining that code (Basili 1997)
This phenomenon is captured in the COCOMO II cost drivers (Boehm, Abts, et al. 2000)
Boehm, B., C. Abts, et al. (2000). Software Cost Estimation with COCOMO II. New York, Prentice Hall.
Basili, V. R. (1997). "Evolving and packaging reading technologies." Journal of Systems and Software, 38: 3-12.
Required Reusability (RUSE)
This cost driver accounts for the additional effort needed to construct components intended for reuse on the current or future projects. This effort is consumed with creating more generic design of software, more elaborate documentation, and more extensive testing to ensure components are ready for use in other applications.
[Figure: RUSE rating levels VL, L, H, VH, XH; axis values 0.75, 1.0, 1.25, 1.5]
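As a rough illustration of how a cost driver such as RUSE enters an estimate, the Python sketch below scales a nominal effort by a RUSE effort multiplier. The multiplier values are assumed to be those of the COCOMO II.2000 calibration and should be checked against Boehm, Abts, et al. (2000); the 100 person-month nominal effort and the function name are hypothetical.

```python
# RUSE effort multipliers by rating level.
# Assumption: values follow the COCOMO II.2000 calibration; verify against
# the model manual before relying on them.
RUSE_MULTIPLIER = {"L": 0.95, "N": 1.00, "H": 1.07, "VH": 1.15, "XH": 1.24}


def effort_with_ruse(nominal_pm: float, rating: str) -> float:
    """Scale a nominal effort (person-months) by the RUSE multiplier."""
    return nominal_pm * RUSE_MULTIPLIER[rating]


if __name__ == "__main__":
    # Hypothetical nominal effort of 100 person-months.
    for rating in RUSE_MULTIPLIER:
        print(rating, round(effort_with_ruse(100.0, rating), 2))
```

A higher RUSE rating raises development effort for the component being built; the argument in this presentation is that part of that extra effort buys readability that later lowers maintenance cost.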
Linking RUSE to Code Readability
Key question: What affects code readability?
[Diagram: Code Readability, Code Comprehension, and Reusability, connected by links labeled "influences" and "is affected by"; two elements are marked "Quantifiable"]
A Cognitive Approach to Text Readability
Reading and educational measurement
• Text comprehension, vocabulary difficulty
Linguistics/discourse processing
• Text-reader interaction
Psychology/natural language
• Cognitive Readability theory (Kintsch and Vipond 1979)
• Text comprehension proceeds cognitively across three textual levels:
1. Verbatim representation: the text as written code (orthography)
2. Textbase representation: the text as decontextualized organization of literal meaning (an ordered set of propositions)
3. Situation model representation: the text understood as embedded in a situation or context
[Slide callout: Current focus]
Kintsch, W. and D. Vipond (1979). Reading comprehension and readability in educational practice and psychological theory. Perspectives on memory research. L.-G. Nilsson (Ed.). Hillsdale, NJ, Erlbaum: 329-365.
Concepts in Software Readability
Key concept: proposition as the unit of analysis
Programming languages are like natural languages
• They have a grammatical structure
• They have a linguistic structure
• They give rise to propositions containing predicates (relational terms in a string of words) and arguments (associated entities)
But they are value-neutral
• Mathematical, logical, and intentional aspects dominate
• Social and moral aspects are secondary due to limitations of the domain (i.e., computer-programmer interaction)
Text must be evaluated in terms of propositions (Kintsch and Vipond 1979) rather than sentences (Chomsky 1956)
Backus, J. (1960). The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM conference. Zurich ACM-GAMM conference, Paris, UNESCO.
Naur, P. (1960). "Report on the algorithmic language ALGOL 60." Communications of the ACM 3(5): 299-314.
Chomsky, N. (1956). "Three models for the description of language." IRE Transactions on Information Theory IT-2(3): 113-124.
Code Readability Example
Example #2a
z = ((3*x^2) + (4*x) - 5) - ((2*y^2) - (7*y) + 11) / ((3*x^2) + (4*x) - 5)
vs.
Example #2b
a = ((3*x^2) + (4*x) - 5)
b = ((2*y^2) - (7*y) + 11)
z = (a - b) / a
“Although both examples are comprehensible, example 2b is comprehensible with greater ease (i.e., more readable) than example 2a.” (Collar 2005, p. 120)
Collar, E. (2005). An Investigation of Programming Code Readability Based on a Cognitive Readability Model - Volume I: Manuscript. Leeds School of Business. Boulder, CO, University of Colorado at Boulder: 403.
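One way to see what Example #2b offers the reader is to run both forms side by side. The Python sketch below is an illustration, not code from Collar (2005): it writes `^` as `**`, the sample values for x and y are arbitrary, and the function names are hypothetical. One assumption is worth stating: read with conventional operator precedence, the single-line form divides only its second term, so the sketch adds an outer pair of parentheses around the numerator so that both forms compute the same value.

```python
def z_single_expression(x: float, y: float) -> float:
    # Example #2a style: everything in one expression.
    # The outer parentheses around the numerator are an added assumption so
    # that this matches Example #2b under standard operator precedence.
    return (((3*x**2) + (4*x) - 5) - ((2*y**2) - (7*y) + 11)) / ((3*x**2) + (4*x) - 5)


def z_named_subexpressions(x: float, y: float) -> float:
    # Example #2b style: the repeated subexpressions get names.
    a = (3*x**2) + (4*x) - 5
    b = (2*y**2) - (7*y) + 11
    return (a - b) / a


if __name__ == "__main__":
    x, y = 2.0, 3.0  # arbitrary sample inputs
    print(z_single_expression(x, y), z_named_subexpressions(x, y))
```

In the named form the grouping is visible at a glance, whereas the one-line form leaves the reader to resolve precedence and to notice that one subexpression occurs twice.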
Components of the PCTRM*
Propositional Density (PD)
• Greater PD requires greater processing effort on the part of the reader; this effort makes the code more difficult to read.
Number of New Arguments (NA)
• Greater numbers of NA require the reader to manage more concepts in memory; this effort makes the code more difficult to read.
Number of Repeated Arguments (RA)
• Greater numbers of RA within and between logical lines of code render the program more coherent and, therefore, easier to read.
Number of Branching Reinstatements (BR)
• The reader performs a BR whenever integration into representational memory of a concept in a current proposition requires reference to a concept in a proposition found elsewhere in the code; greater numbers of BR increase cognitive load, making the code more difficult to read.
[Diagram: Programming Code Textbase and its four components: New Arguments, Repeated Arguments, Propositional Density, Branching Reinstatements]
*Programming Code Textbase Readability Model
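To make the argument counts more concrete, the sketch below tallies new and repeated arguments over a few lines of code. It is a toy operationalization, not the measurement procedure of the PCTRM: it assumes every identifier occurrence counts as one argument, that a first occurrence is "new" and any later one is "repeated", and it leaves out propositional density and branching reinstatements entirely. The function name and the regular expression are illustrative choices.

```python
import re

# Assumption: an "argument" is approximated by an identifier token.
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def count_arguments(lines):
    """Return (new, repeated) identifier-occurrence counts over code lines."""
    seen = set()
    new = repeated = 0
    for line in lines:
        for name in IDENTIFIER.findall(line):
            if name in seen:
                repeated += 1   # identifier already introduced earlier
            else:
                seen.add(name)  # first time this identifier appears
                new += 1
    return new, repeated


if __name__ == "__main__":
    sample = [
        "a = ((3*x^2) + (4*x) - 5)",
        "b = ((2*y^2) - (7*y) + 11)",
        "z = (a - b) / a",
    ]
    print(count_arguments(sample))
```

Under these assumptions, naming a subexpression adds new arguments (the names) but also raises the number of repeated arguments in later lines, which is the trade-off the NA and RA components describe.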
Other PCTRM Constructs
Reading Ability: A reader’s level of skill in effectively using cognitive reading processes while reading.
Programming Domain Knowledge: A reader’s knowledge, both conceptual and experiential, about programming and programming languages.
Programming Code Textbase Comprehension: Textbase comprehension arises from the effects of textbase readability interacting with the cognitive reading processes of the reader.
Programming Code Textbase Readability: Textbase readability arises from the effects of specific propositional features of the programming code interacting with the cognitive reading processes of the reader.
[Diagram labels: Programming Code Textbase Readability, Programming Code Textbase Comprehension, Cognitive Reading Processes, Programming Domain Knowledge, Reading Ability]
Programming Code Textbase Readability Model (PCTRM)
[Figure: PCTRM path diagram. The elements recoverable from the slide are listed below; the assignment of coefficients to individual links is not recoverable.]
Latent variables (constructs):
• Reading Ability (RA)
• Programming Domain Knowledge (PDK)
• Programming Code Textbase (PCT)
• Program Code Textbase Readability (PCTR); R2 = 0.045
• Program Code Textbase Comprehension (PCTC); R2 = 0.294
• Interaction latent variable: Programming Domain Knowledge with Programming Code Textbase Readability (PDK*PCTR)
Observed variables (indicators):
• Eng. Test Read Time; Eng. Test Score
• PCTC Test Read Time; PCTC Test Score
• Branching Reinstatements (BR); Propositional Density (PD); New Arguments (NA); Repeated Arguments (RA)
• Non-Visual Basic-Specific Programming Language Experience (ESF1)
• Visual Basic Programming Language Experience (ESF2)
• Visual Basic Training (ESF3)
• Perceived Propositional Density (PRFS1)
• Perceived Presence of New Arguments (PRFS2)
• Perceived Presence of Branching Reinstatements (PRFS3)
• Perceived Readability (PRFS4)
• INTF1: [Perceived Propositional Density, Presence of New Arguments and Perceived Readability] with [All Programming Language Experience: ESF1, ESF2]
• INTF2: [Perceived Presence of Branching Reinstatements] with [All Forms of Programming Experience: ESF1, ESF2, ESF3]
• INTF3: [Perceived Propositional Density, Presence of New Arguments and Perceived Readability] with [Visual Basic Training: ESF3]
Coefficients shown on the diagram: 0.366, 0.201, 0.896****, -0.054, 0.910****, 0.857****, 0.728****, 0.673****, 0.859****, 0.790****, 0.845****, 0.117**, -0.174****, -0.247**, 0.301****, -0.584****, -0.955****, -0.961****, -0.381****, 0.719****, 0.631****, 0.503****, 0.761****
Legend:
* 0.05 significance level
** 0.01 significance level
*** 0.001 significance level
**** 0.0001 significance level
Link between indicator and construct: outer model relationship
Link between constructs: inner model relationship
Summary of Key Interpretations
As Non-Visual Basic-Specific Programming Language Experience (ESF1) increases, perceived readability (PRFS4) increases.
As Visual Basic Programming Language Experience (ESF2) increases, perceived readability (PRFS4) increases.
As Visual Basic Training (ESF3) increases, perceived readability (PRFS4) increases (i.e., code becomes easier to read).
As perceived readability (PRFS4) increases, the time spent reading the code (PT) decreases.
Implications for Software Cost Estimation (1)
For 13 KSLOC Project Described in the COCOMO II User Manual (Chapter 7)

COCOMO II Parameters
• 779 Person-Months (P-M) of software development
• $5,500 per P-M (all phases)

SDLC Parameters
• 12% Design Phase
• 6% Coding Phase
• 12% Testing Phase
• 70% Maintenance Phase

Code Reading Parameters
• 50% Time spent reading code during maintenance phase
Implications for Software Cost Estimation (2)
Conditions without Readability Considerations

Person-Month Allocation (normal)

Phase          P-M Allocated   Cost per P-M   Total P-M Cost
Design              93.48        $5,500.00      $514,140.00
Coding              46.74        $5,500.00      $257,070.00
Testing             93.48        $5,500.00      $514,140.00
Maintenance        545.30        $5,500.00    $2,999,150.00

TOTAL PROJECT COST WITHOUT READABILITY CONSIDERATIONS: $4,284,500
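The baseline figures follow directly from the stated parameters (779 P-M, $5,500 per P-M, and the 12/6/12/70 phase split). A minimal Python sketch of that arithmetic, with hypothetical variable and function names:

```python
TOTAL_PM = 779        # person-months of development, from the slides
COST_PER_PM = 5_500   # dollars per person-month, all phases
PHASE_SHARE = {"design": 0.12, "coding": 0.06, "testing": 0.12, "maintenance": 0.70}


def baseline_costs(total_pm=TOTAL_PM, cost_per_pm=COST_PER_PM, shares=PHASE_SHARE):
    """Return {phase: (person_months, cost)} for the given phase shares."""
    return {
        phase: (total_pm * share, total_pm * share * cost_per_pm)
        for phase, share in shares.items()
    }


if __name__ == "__main__":
    costs = baseline_costs()
    for phase, (pm, cost) in costs.items():
        print(f"{phase:12s} {pm:8.2f} P-M  ${cost:,.2f}")
    print(f"total        ${sum(cost for _, cost in costs.values()):,.2f}")
```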
Implications for Software Cost Estimation (3)
Conditions with Readability Enhancement

Phase          P-M Allocated   Cost per P-M   Total P-M Cost
Design             116.85        $5,500         $642,675.00
Coding              58.43        $5,500         $321,337.50
Testing             93.48        $5,500         $514,140.00
Maintenance        408.98        $5,500       $2,249,362.50

TOTAL PROJECT COST WITH READABILITY ENHANCEMENTS: $3,727,515

TOTAL $$ MAINTENANCE SAVINGS    $749,787.50
TOTAL % MAINTENANCE SAVINGS     25%
TOTAL $$ PROJECT SAVINGS        $556,985.00
TOTAL % PROJECT SAVINGS         13%
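The figures on this slide are consistent with a 25% increase in design and coding effort and a 25% reduction in maintenance effort relative to the 779 P-M baseline. Those factors are inferred from the slide's numbers (for example, 116.85 / 93.48 = 1.25) rather than stated in the source, and the names below are hypothetical. Under that assumption, the savings can be reproduced as:

```python
TOTAL_PM = 779        # person-months of development, from the slides
COST_PER_PM = 5_500   # dollars per person-month, all phases
PHASE_SHARE = {"design": 0.12, "coding": 0.06, "testing": 0.12, "maintenance": 0.70}

# Assumption: per-phase effort factors inferred from the slide's figures.
EFFORT_FACTOR = {"design": 1.25, "coding": 1.25, "testing": 1.00, "maintenance": 0.75}


def phase_costs(factors=None):
    """Return {phase: cost}; with factors=None this is the baseline."""
    costs = {}
    for phase, share in PHASE_SHARE.items():
        factor = 1.0 if factors is None else factors[phase]
        costs[phase] = TOTAL_PM * share * factor * COST_PER_PM
    return costs


def savings():
    """Return (maintenance savings, project savings, project savings fraction)."""
    base = phase_costs()
    enhanced = phase_costs(EFFORT_FACTOR)
    maintenance = base["maintenance"] - enhanced["maintenance"]
    project = sum(base.values()) - sum(enhanced.values())
    return maintenance, project, project / sum(base.values())


if __name__ == "__main__":
    maintenance, project, fraction = savings()
    print(f"maintenance savings ${maintenance:,.2f}")
    print(f"project savings     ${project:,.2f} ({fraction:.0%})")
```

The extra up-front effort in design and coding is more than offset because maintenance dominates the life cycle cost, which is why the net project saving remains positive.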
The International Group of e-Systems Research & Applications Presents...
The International Conference on Computing & e-Systems: March 12-15, 2007, Hammamet Beach, Tunisia
www.tigera.org