REASONING ABOUT FLAWS IN SOFTWARE DESIGN: DIAGNOSIS AND RECOVERY
TAMARA LOPEZ
Supervisors: Marian Petre, Charles Haley and Bashar Nuseibeh
Department: Computing
Status: Full-time
Probation Viva: Before
Starting Date: February 2010
Since its diagnosis at the 1960s NATO conferences as one of the key problems in
computing [1, 2], the provision of reliable software has been a core theme in software
engineering research. One strand of this research analyzes software that fails, while
a second develops and tests techniques for ensuring software success.
Despite these efforts, the threat of failure and the quest for a multivalent yet
comprehensive “sense” of quality [2] remain powerful drivers for research and
provocative tropes in anecdotal accounts of computing [3]. However, current analytical
approaches tend to result in overly broad accounts of why software fails, or in
overly narrow views of what is required to make software succeed. This suggests
a need for a different approach to the study of failure, one that can address
the complexities of large-scale “systems-of-systems” [4, 5] while accounting for the
effects and trajectories of specific choices made within software initiatives.
To address this gap, this research asks: How does failure manifest in actual soft-
ware development practice? What constitutes a flaw, and what are the conditions
surrounding its occurrence and correction? What can adopting a situational ori-
entation tell us more generally about why some software fails and other software
succeeds?
Background
Within computing literature, failure analysis typically takes two perspectives:
2010 CRC PhD Student Conference
• Systemic analyses identify weak elements in complex organizational,
operational and software systems. Within these systems, individual or
multiple faults become active at a moment in time, or within a clearly
bounded interval of time, and result in catastrophic or spectacular
operational failure [6, 7]. Alternatively, software deemed “good enough” is
released into production with significant problems that require costly
maintenance, redesign and redevelopment [5, 8].
• Means analyses treat smaller aspects or attributes of software
engineering as they contribute to the goal of creating dependable software [4].
These studies develop new techniques, or test existing ones, to strengthen all
stages of development, including requirements engineering [9], architectural
structuring [10], testing and maintenance [11], and verification and validation [12].
Systemic analyses produce case studies and often do not conclude with specific,
precise reasons for failure. Instead, they retrospectively identify the system or
subsystem that failed and provide general recommendations for improvement.
Even when they do isolate weaknesses in the processes of software creation
or in particular software components, they do not produce general frameworks or
models that can be extended to improve software engineering practice.
Means analyses employ a range of methods, including statistical analysis, program
analysis, case study development, formal mathematical modeling and systems analysis.
Frequently, they examine a single part of the development process, with a
corresponding focus on achieving a single dependability means [4]. The studies are often
experimental, applying a set of controlled techniques to existing bodies of
software in an effort to prove, verify and validate that the software meets a
quantifiable, pre-determined degree of “correctness”.
Methodology
This research will produce an analysis of the phenomenon of failure that lies
somewhere between the broad, behavioral parameters of systemic analyses and the
narrowly focused goals of means analyses. To do this, it will draw upon recent
software engineering research that combines the socially oriented qualitative
approaches of computer supported cooperative work (CSCW) [13] with existing software
analysis techniques to provide new understandings of longstanding problems in
software engineering. In one such group of studies, de Souza and collaborators
have expanded the notion of dependency beyond its technical emphasis on the
ways in which software components rely on one another, demonstrating that hu-
man and organizational factors are also coupled to and expressed within software
source code [14, 15]. In a study published in 2009, Aranda and Venolia made a case
for developing rich bug histories using qualitative analyses in order to reveal the
complex interdependencies of social, organizational and technical knowledge that
influence and inform software maintenance [16].
In the manner of these and other studies of cooperative and human aspects of
software engineering (CHASE), the research described here will apply a combination of
analytic and qualitative methods to examine the role of failure in the software
development process as it unfolds. Studies will be designed to allow for the
analysis and examination of flaws within a heterogeneous universe of artifacts,
with particular emphasis given to the interconnections between technical workers
and artifacts. Ethnographically informed techniques will be used to deepen
understanding of how the selected environments operate, and of how notions of
failure and recovery operate within the development processes under investigation.
References
[1] P. Naur and B. Randell, “Software engineering: Report on a conference sponsored by the
NATO Science Committee Garmisch, Germany, 7th to 11th October 1968,” NATO Science
Committee, Scientific Affairs Division NATO Brussels 39 Belgium, Tech. Rep., January
1969. [Online]. Available: http://homepages.cs.ncl.ac.uk/brian.randell/NATO/
[2] J. Buxton and B. Randell, “Software engineering techniques: Report on a conference
sponsored by the NATO Science Committee Rome, Italy, 27th to 31st October 1969,” NATO
Science Committee, Scientific Affairs Division NATO Brussels 39 Belgium, Tech. Rep.,
April 1970. [Online]. Available: http://homepages.cs.ncl.ac.uk/brian.randell/NATO/
[3] R. Charette, “Why software fails,” IEEE Spectrum, vol. 42, no. 9, pp. 42–49, 2005.
[4] B. Randell, “Dependability-A unifying concept,” in Proceedings of the Conference on Com-
puter Security, Dependability, and Assurance: From Needs to Solutions. IEEE Computer
Society Washington, DC, USA, 1998.
[5] ——, “A computer scientist’s reactions to NPfIT,” Journal of Information Technology,
vol. 22, no. 3, pp. 222–234, 2007.
[6] N. G. Leveson and C. S. Turner, “Investigation of the Therac-25 accidents,” IEEE Computer,
vol. 26, no. 7, pp. 18–41, 1993.
[7] B. Nuseibeh, “Ariane 5: Who dunnit?” IEEE Software, vol. 14, pp. 15–16, 1997.
[8] D. Ince, “Victoria Climbié, Baby P and the technological shackling of British children’s social
work,” Open University, Tech. Rep. 2010/01, 2010.
[9] T. T. Tun, M. Jackson, R. Laney, B. Nuseibeh, and Y. Yu, “Are your lights off? Using
problem frames to diagnose system failures,” in Proceedings of the IEEE International
Requirements Engineering Conference, 2009, pp. v–ix.
[10] H. Sozer, B. Tekinerdogan, and M. Aksit, “FLORA: A framework for decomposing software
architecture to introduce local recovery,” Software: Practice and Experience, vol. 39, no. 10,
pp. 869–889, 2009. [Online]. Available: http://dx.doi.org/10.1002/spe.916
[11] F.-Z. Zou, “A change-point perspective on the software failure process,” Software
Testing, Verification and Reliability, vol. 13, no. 2, pp. 85–93, 2003. [Online]. Available:
http://dx.doi.org/10.1002/stvr.268
[12] A. Bertolino and L. Strigini, “Assessing the risk due to software faults: Estimates of
failure rate versus evidence of perfection,” Software Testing, Verification and Reliability,
vol. 8, no. 3, pp. 155–166, 1998. [Online]. Available: http://dx.doi.org/10.1002/(SICI)1099-
1689(1998090)8:3<155::AID-STVR163>3.0.CO;2-B
[13] Y. Dittrich, D. W. Randall, and J. Singer, “Software engineering as cooperative work,”
Computer Supported Cooperative Work, vol. 18, no. 5-6, pp. 393–399, 2009.
[14] C. R. B. de Souza, D. Redmiles, L.-T. Cheng, D. Millen, and J. Patterson, “Sometimes
you need to see through walls: A field study of application programming interfaces,” in
CSCW ’04: Proceedings of the 2004 ACM conference on Computer supported cooperative
work. New York, NY, USA: ACM, 2004, pp. 63–71.
[15] C. de Souza, J. Froehlich, and P. Dourish, “Seeking the source: Software source code as a
social and technical artifact,” in GROUP ’05: Proceedings of the 2005 international ACM
SIGGROUP conference on Supporting group work. New York, NY, USA: ACM, 2005, pp.
197–206.
[16] J. Aranda and G. Venolia, “The secret life of bugs: Going past the errors and omissions in
software repositories,” in Proceedings of the 2009 IEEE 31st International Conference on
Software Engineering. IEEE Computer Society, 2009, pp. 298–308.