Lionel Briand ICSM 2011 Keynote
DESCRIPTION
Abstract: Though in essence an engineering discipline, software engineering research has always struggled to demonstrate impact. This is reflected in part by the funding challenges that the discipline faces in many countries, the difficulties we have attracting industrial participants to our conferences, and the scarcity of papers reporting industrial case studies.
There are clear historical reasons for this but we nevertheless need, as a community, to question our research paradigms and peer evaluation processes in order to improve the situation. From a personal standpoint, relevance and impact are concerns that I have been struggling with for a long time, which eventually led me to leave a comfortable academic position and a research chair to work in industry-driven research.
I will use some concrete research project examples to argue why we need more inductive research, that is, research working from specific observations in real settings to broader generalizations and theories. Among other things, the examples will show how a more thorough understanding of practice and closer interactions with practitioners can profoundly influence the definition of research problems, and the development and evaluation of solutions to these problems. Furthermore, these examples will illustrate why, to a large extent, useful research is necessarily multidisciplinary. I will also address issues regarding the implementation of such a research paradigm and show how our own bias as a research community worsens the situation and undermines our very own interests.
On a more humorous note, the title hints at the fact that being a scientist in software engineering and aiming at having impact on practice often entails leading two parallel careers and impersonating different roles to different peers and partners.
Bio: Lionel Briand is heading the Certus center on software verification and validation at Simula Research Laboratory, where he is leading research projects with industrial partners. He is also a professor at the University of Oslo (Norway). Before that, he was on the faculty of the department of Systems and Computer Engineering, Carleton University, Ottawa, Canada, where he was full professor and held the Canada Research Chair (Tier I) in Software Quality Engineering. He is the coeditor-in-chief of Empirical Software Engineering (Springer) and is a member of the editorial boards of Systems and Software Modeling (Springer) and Software Testing, Verification, and Reliability (Wiley). He was on the board of IEEE Transactions on Software Engineering from 2000 to 2004. Lionel was elevated to the grade of IEEE Fellow for his work on the testing of object-oriented systems. His research interests include: model-driven development, testing and verification, search-based software engineering, and empirical software engineering.
TRANSCRIPT
Useful Software Engineering Research: Leading a Double-Agent Life
Lionel Briand
Certus Center for Software Verification and Validation
Simula Research Laboratory & University of Oslo, Norway
Software Everywhere
Failures Everywhere …
Software Engineering Research Funding & Relevance
➜ Software Engineering (SE) research should be a top priority in most countries (except Sweden)
➜ But that is not the case anymore. Hard numbers are hard to get, and in some cases well protected
➜ Symptoms
  ➥ Listed priorities by research councils, funding
  ➥ University hiring
  ➥ Large centers or institutes being established or closed down
➜ May be partly related to lack of relevance?
  ➥ Industry participation in leading SE conferences
  ➥ Application/industry tracks not first class citizens
  ➥ A very small percentage of research work ever used and assessed on real industrial software
Basili’s and Meyer’s Take
➜ Many of the advances in software engineering have come out of non-university sources
➜ “Academic research has had its part, honorable but limited.” (Meyer)
➜ Large scale labs don’t get funded, like they do in other engineering and scientific disciplines (Basili, Meyer)
➜ Software Engineering is “big science” (Basili)
➜ One significant difference though is that we cannot entirely recreate the phenomena we study within four walls – This, as discussed later, has significant consequences
➜ Question: What is our responsibility in all this?
Engineering Research
➜ “Engineering: The application of scientific and mathematical principles to practical ends such as the design, manufacture, and operation of efficient and economical structures, machines, processes, and systems.” (American Heritage Dictionary)
➜ Engineering research:
  ➥ Problem driven
  ➥ Real world requirements
  ➥ Scalability
  ➥ Human factors, where it matters
  ➥ Economic tradeoffs and cost-benefit analysis
  ➥ Actually doing it on real artifacts, not just talking about it
A Representative Example
➜ Parnin and Orso (ISSTA, 2011) looked at automated debugging techniques
➜ 50 years of automated debugging research
➜ Only 5 papers have evaluated automated debugging techniques with actual programmers
➜ Focus since ~2001: dozens of papers ranking program statements according to their likelihood of containing a fault
➜ Experiment
  ➥ How do programmers use the ranking?
  ➥ Do they see the bugs?
  ➥ Is the ranking important?
Results from Parnin and Orso’s Study
• Only low performers strictly followed the ranking
• Only one out of 10 programmers who checked a buggy statement stopped the investigation
• Automated support did not speed up debugging
• Developers wanted explanations rather than recommendations
• We cannot abstract the human away in our research
• “… we must steer research towards more promising directions that take into account the way programmers actually debug in real scenarios.”
What Happened?
➜ How people debug and what information they need is poorly understood
  ➥ Probably varies a great deal according to context and skills
➜ Researchers focused on providing a solution that was a mismatch for the actual problem
➜ That line of research became fashionable: a lot of (cool) ideas could be easily applied and compared, without involving human participants
➜ Resulted in many, many papers …
➜ Many other examples in SE, e.g., Clone detection?
Other Examples
➜ Adaptive Random Testing: Many papers since 2004
  ➥ Mostly simulations and small artifacts, unrealistic failure rates. Arcuri and Briand (2011), ISSTA
➜ Regression testing: Arguably the most studied testing problem, perhaps most studied software engineering problem …
  ➥ “However, empirical evaluation and application of regression testing techniques at industrial level seems to remain limited. Out of the 159 papers …, only 12 papers consider industrial software artefacts as a subject of the associated empirical studies. This suggests that a large-scale industrial uptake of these techniques has yet to occur.” Yoo and Harman (2011)
  ➥ Possible reason: Strong focus on white-box, not black-box regression testing? Scalability and Practicality?
Industry-Driven Research
➜ Let’s now take an entirely different angle …
➜ Research driven by industry needs
➜ Simula’s motto: “The industry is our lab”
➜ Go through recent and successful projects (mostly @ Simula) with industry partners
➜ Summarize what happened, our experience
➜ Draw conclusions and lessons learned
  ➥ Patterns for successful research
  ➥ Challenges and possible solutions
Mode of Collaboration
[Figure: collaboration model between industrial partners and the research center, with the steps: problem identification, problem formulation, study of the state of the art, candidate solutions, initial validation, realistic validation, release of the solution. Source: Gorschek et al., IEEE Software 2006]
Project Example 1
➜ Context: Testing in communication systems (Cisco)
➜ Original scientific problem: Modeling and model-based test case generation, oracle, coverage strategy
➜ Practical observation: Access to test network infrastructure limited (emulate network traffic, etc.). Models get too large and complex.
➜ Modified research objectives: (1) How to select an optimal subset of test cases matching the time budget, (2) Modeling cross-cutting concerns
➜ References: Hemmati et al. (2010), Ali et al. (2011)
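As an illustration of modified objective (1) in Project Example 1, here is a minimal sketch of selecting test cases under a time budget. It is not the approach of the cited papers: it uses a simple greedy "new coverage per minute" heuristic, which is not guaranteed to be optimal. Every name in it (Candidate, select_tests, the example tests and their timings) is a hypothetical assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate test case (hypothetical structure, for illustration only)."""
    name: str
    minutes: float      # estimated execution time, assumed to be > 0
    covered: frozenset  # model elements (e.g. transitions) the test exercises


def select_tests(tests, budget_minutes):
    """Greedy heuristic: repeatedly pick the affordable test that adds the
    most not-yet-covered elements per minute. Not guaranteed to be optimal."""
    remaining = list(tests)
    selected, covered, used = [], set(), 0.0
    while remaining:
        # Keep only tests that still fit in the remaining budget
        affordable = [t for t in remaining if used + t.minutes <= budget_minutes]
        # Score each affordable test by newly covered elements per minute
        scored = [(len(t.covered - covered) / t.minutes, t) for t in affordable]
        scored = [s for s in scored if s[0] > 0]
        if not scored:
            break  # nothing affordable adds new coverage
        best = max(scored, key=lambda s: s[0])[1]
        selected.append(best)
        covered |= best.covered
        used += best.minutes
        remaining.remove(best)
    return selected, covered, used


# Made-up example data: four tests and a 10-minute budget
tests = [
    Candidate("t1", 10, frozenset({"a", "b", "c"})),
    Candidate("t2", 4, frozenset({"a", "b"})),
    Candidate("t3", 5, frozenset({"c", "d"})),
    Candidate("t4", 8, frozenset({"e"})),
]
selected, covered, used = select_tests(tests, budget_minutes=10)
print([t.name for t in selected])  # ['t2', 't3']
```

The greedy rule is only one possible choice. In an industrial setting the selection criterion (coverage, diversity, fault history) and the cost model would have to come from the practical constraints observed on site.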
Project Example 2
➜ Context: Testing image segmentation algorithms for medical applications (Siemens)
➜ Original scientific problem: Define specific test strategies for segmentation algorithms
➜ Practical observations: Algorithms are validated by using highly specialized medical experts. Expensive and slow. No obvious test oracle
➜ Modified research objective: Learning oracles for image segmentation algorithms in medical applications. Machine learning.
➜ Reference: Frounchi et al. (2011)
Project Example 3
➜ Context: Subsea integrated control systems (FMC)
➜ Original scientific problem: Architecture-driven integration in systems of systems
➜ Practical observations: Each subsea installation is unique (variant), the software configuration is extremely complex (hundreds of interrelated variation points in software and hardware)
➜ Modified research objective: Product Line architectures in integrated control systems to support the configuration process
➜ Note: Despite decades of research in PLA, we could not find a methodology fitting our requirements
➜ Reference: Behjati et al. (2011)
Project Example 4
➜ Context: safety-critical embedded systems in the energy and maritime sectors, e.g., fire and gas monitoring, process shutdown, dynamic positioning (Kongsberg Maritime)
➜ Original scientific problem: Model-driven engineering for failure-mode and effect analysis
➜ Practical observations: Certification meetings with third-party certifiers. Certification is lengthy, expensive, etc. Traceability in large complex systems a priority.
➜ Modified research objective: Traceability between safety requirements and system design decisions. Solution based on SysML and a simple traceability language along with model slicing.
➜ Reference: Sabetzadeh et al. (2011)
Project Example 5
➜ Context: Technology qualification (TQ) in maritime sector (DNV)
➜ Original scientific problem: Model-based quantitative safety analysis
➜ Practical observations: TQ is not a purely objective, quantitative argument. Great complexity (e.g., sources of information) and expert judgment. Many stakeholders.
➜ Modified research objective: Modeling safety arguments to support quantitative reasoning and decision making by several stakeholders
➜ Reference: Sabetzadeh et al. (2011)
Two Other Examples on the ICSM Program
➜ Erik Rogstad et al., “Industrial Experiences with Automated Regression Testing of a Legacy Database Application”
➜ Amir Reza Yazdanshenas and Leon Moonen, “Crossing the Boundaries While Analyzing Heterogeneous Component-Based Software Systems”
Successful Research Patterns
➜ Successful: Innovative and high impact
➜ Inductive research: Working from specific observations in real settings to broader generalizations and theories
  ➥ Field studies and replications, analyze commonalities
➜ Scalability and practicality considerations must be part of the initial research problem definition
➜ Researching by doing: Hands-on research. Apply what exists in well defined, realistic context, with clear objectives. The observed limitations become the research objectives.
➜ Multidisciplinary: other CS, Engineering, or non-technical domains
So What?
➜ Making a conscious effort to understand the problem first
  ➥ Precisely identify the requirements for an applicable solution
  ➥ More papers focused on understanding the problems
  ➥ Making industry tracks first class citizens in SE conferences
➜ Better relationships between academia and industry
  ➥ Different models, e.g., Research-based innovation centers in Norway
  ➥ Common labs (e.g., NASA SEL Lab)
  ➥ Exposing PhD students to industry practice: Ethical considerations (Fixing the PhD, Nature)
➜ Playing an active role in solving the problem, e.g., action research-like
So What?
➜ Work on end-to-end solutions: Pieces of solutions are interdependent. Necessary for impact.
➜ Beyond professors and students
  ➥ Labs with interdisciplinary teams of professional scientists and engineers within or collaborating with universities
  ➥ Used to be the case with corporate research labs: Bell Labs, Xerox PARC, HP labs, NASA SEL, etc.
  ➥ Now: Fraunhofer (Germany), Simula (Norway), Microsoft Research (US), SEI (US), SnT (Luxembourg)
  ➥ Corporate labs versus publicly supported ones?
  ➥ Key point: The level of basic funding must allow high risk research, performed by professional scientists, focused on impact in society
The NASA SEL Experience Factory Model
[Figure: the Experience Factory model. Source: Basili et al. (NASA SEL)]
Academic Challenges
➜ Our CS legacy … emancipating ourselves as an engineering discipline
  ➥ Systems engineering departments?
➜ How cool is it? SE research is more driven by “fashion” than needs, a quest for silver bullets
  ➥ We can only blame ourselves
➜ Counting papers and how the JSS ranking does not help
  ➥ We are pressuring ourselves into irrelevance
➜ Taking academic tenure and promotion seriously
  ➥ What about rewarding impact?
➜ One’s research must cover a broader ground and be somewhat opportunistic – this pushes us out of our comfort zone
➜ Resources to support industry collaborations
  ➥ Large lab infrastructure, engineers, time
Industrial Challenges
➜ From a discussion with Bran Selic …
➜ Short term versus longer term goals (next quarter’s forecast is the priority)
➜ Industrial research groups are often disconnected from their own business units and external researchers may be perceived as competitors
➜ Company’s intellectual property regulations may conflict with those of the research institution
➜ Complexity of industrial systems and technology
  ➥ Cannot be transplanted in artificial settings for research - Need studies in real settings
  ➥ Substantial domain knowledge is required
A Double-Agent Life
[Figure labels: “Scientist, trying to be discreet, but inquisitive”; “Warning: No research here”; “A new idea (as initially perceived by our partners)”; “Practitioner (sanitized)”]
Conclusions
➜ Software engineering is obviously important in all aspects of society, but academic software engineering research is not perceived the same way
➜ The academic community, at various levels, is partly responsible for this
➜ How we take up the challenge of increasing our impact will determine the future of the profession
➜ There are solutions, but no silver bullet
➜ We all have a role to play in this, as deans, department chairs, professors, scientists, reviewers, conference organizers, journal editors, etc. We can all be double-agents …
Empirical Software Engineering
➜ Springer, 6 issues a year
➜ Both research papers and industry experience reports
➜ 2nd highest impact factor among SE research journals
➜ “Applied software engineering research with a significant empirical component”
References
➜ Ali, Briand, Hemmati, “Modeling Robustness Behavior Using Aspect-Oriented Modeling to Support Robustness Testing of Industrial Systems”, Journal of Software and Systems Modeling, forthcoming, 2011.
➜ Arcuri and Briand, “Adaptive Random Testing: An Illusion of Effectiveness”, ISSTA, 2011.
➜ Basili, “Learning Through Application: The Maturing of the QIP in the SEL”, in Making Software: What Really Works, and Why We Believe It, edited by Andy Oram and Greg Wilson, O’Reilly, 2011, pp. 65-78.
➜ Behjati, Yue, Briand and Selic, “SimPL: A Product-Line Modeling Methodology for Families of Integrated Control Systems”, Simula Technical Report 2011-14 (V. 2), submitted.
➜ Hemmati, Briand, Arcuri, Ali, “An Enhanced Test Case Selection Approach for Model-Based Testing: An Industrial Case Study”, FSE, 2010.
➜ Frounchi, Briand, Labiche, Grady, and Subramanyan, “Automating Image Segmentation Verification and Validation by Learning Test Oracles”, forthcoming in Information and Software Technology (Elsevier), 2011.
References II
➜ Sabetzadeh, Nejati, Briand, Evensen Mills, “Using SysML for Modeling of Safety-Critical Software–Hardware Interfaces: Guidelines and Industry Experience”, HASE, 2011.
➜ Parnin and Orso, “Are Automated Debugging Techniques Actually Helping Programmers?”, ISSTA, 2011.
➜ Sabetzadeh et al., “Combining Goal Models, Expert Elicitation, and Probabilistic Simulation for Qualification of New Technology”, HASE, 2011.
➜ Yoo and Harman, “Regression testing minimization, selection and prioritization: a survey”, STVR, Wiley, forthcoming.
➜ Bertrand Meyer’s blog: http://bertrandmeyer.com/2010/04/25/the-other-impediment-to-software-engineering-research/