Lionel Briand, ICSM 2011 Keynote

Useful Software Engineering Research: Leading a Double-Agent Life

Lionel Briand

Certus Center for Software Verification and Validation

Simula Research Laboratory & University of Oslo, Norway

Software Everywhere

Failures Everywhere …

Software Engineering Research Funding & Relevance

➜ Software Engineering (SE) research should be a top priority in most countries (except Sweden)

➜ But that is not the case anymore. Hard numbers are hard to get, and in some cases well protected

➜ Symptoms
  ➥ Listed priorities by research councils, funding
  ➥ University hiring
  ➥ Large centers or institutes being established or closed down

➜ May be partly related to lack of relevance?
  ➥ Industry participation in leading SE conferences
  ➥ Application/industry tracks not first class citizens
  ➥ A very small percentage of research work ever used and assessed on real industrial software

Basili’s and Meyer’s Take

➜ Many of the advances in software engineering have come out of non-university sources

➜ “Academic research has had its part, honorable but limited.” (Meyer)

➜ Large-scale labs don’t get funded the way they do in other engineering and scientific disciplines (Basili, Meyer)

➜ Software Engineering is “big science” (Basili)

➜ One significant difference though is that we cannot entirely recreate the phenomena we study within four walls. This, as discussed later, has significant consequences

➜ Question: What is our responsibility in all this?

Engineering Research

➜ “Engineering: The application of scientific and mathematical principles to practical ends such as the design, manufacture, and operation of efficient and economical structures, machines, processes, and systems.” (American Heritage Dictionary)

➜ Engineering research:
  ➥ Problem driven
  ➥ Real world requirements
  ➥ Scalability
  ➥ Human factors, where it matters
  ➥ Economic tradeoffs and cost-benefit analysis
  ➥ Actually doing it on real artifacts, not just talking about it

A Representative Example

➜ Parnin and Orso (ISSTA, 2011) looked at automated debugging techniques

➜ 50 years of automated debugging research

➜ Only 5 papers have evaluated automated debugging techniques with actual programmers

➜ Focus since ~2001: dozens of papers ranking program statements according to their likelihood of containing a fault

➜ Experiment
  ➥ How do programmers use the ranking?
  ➥ Do they see the bugs?
  ➥ Is the ranking important?

Results from Parnin and Orso’s Study

• Only low performers strictly followed the ranking

• Only one out of 10 programmers who checked a buggy statement stopped the investigation

• Automated support did not speed up debugging

• Developers wanted explanations rather than recommendations

• We cannot abstract the human away in our research

• “… we must steer research towards more promising directions that take into account the way programmers actually debug in real scenarios.”

What Happened?

➜ How people debug and what information they need is poorly understood
  ➥ Probably varies a great deal according to context and skills

➜ Researchers focused on providing a solution that was a mismatch for the actual problem

➜ That line of research became fashionable: a lot of (cool) ideas could be easily applied and compared, without involving human participants

➜ Resulted in many, many papers …

➜ Many other examples in SE, e.g., clone detection?

Other Examples

➜ Adaptive Random Testing: Many papers since 2004
  ➥ Mostly simulations and small artifacts, unrealistic failure rates. Arcuri and Briand (2011), ISSTA

➜ Regression testing: Arguably the most studied testing problem, perhaps the most studied software engineering problem …
  ➥ “However, empirical evaluation and application of regression testing techniques at industrial level seems to remain limited. Out of the 159 papers …, only 12 papers consider industrial software artefacts as a subject of the associated empirical studies. This suggests that a large-scale industrial uptake of these techniques has yet to occur.” Yoo and Harman (2011)
  ➥ Possible reason: Strong focus on white-box, not black-box regression testing? Scalability and practicality?

Industry-Driven Research

➜ Let’s now take an entirely different angle …

➜ Research driven by industry needs

➜ Simula’s motto: “The industry is our lab”

➜ Go through recent and successful projects (mostly @ Simula) with industry partners

➜ Summarize what happened, our experience

➜ Draw conclusions and lessons learned
  ➥ Patterns for successful research
  ➥ Challenges and possible solutions

Mode of Collaboration

[Figure: collaboration process between industrial partners and the research center. Steps shown: problem identification, problem formulation, study of the state of the art, candidate solutions, initial validation, realistic validation, release of the solution. Source: Gorschek et al., IEEE Software 2006]

Project Example 1

➜ Context: Testing in communication systems (Cisco)

➜ Original scientific problem: Modeling and model-based test case generation, oracle, coverage strategy

➜ Practical observation: Access to test network infrastructure limited (emulate network traffic, etc.). Models get too large and complex.

➜ Modified research objectives: (1) How to select an optimal subset of test cases matching the time budget, (2) Modeling cross-cutting concerns

➜ References: Hemmati et al. (2010), Ali et al. (2011)

Project Example 2

➜ Context: Testing image segmentation algorithms for medical applications (Siemens)

➜ Original scientific problem: Define specific test strategies for segmentation algorithms

➜ Practical observations: Algorithms are validated by highly specialized medical experts. Expensive and slow. No obvious test oracle.

➜ Modified research objective: Learning oracles for image segmentation algorithms in medical applications. Machine learning.

➜ Reference: Frounchi et al. (2011)

Project Example 3

➜ Context: Subsea integrated control systems (FMC)

➜ Original scientific problem: Architecture-driven integration in systems of systems

➜ Practical observations: Each subsea installation is unique (variant), the software configuration is extremely complex (hundreds of interrelated variation points in software and hardware)

➜ Modified research objective: Product Line architectures in integrated control systems to support the configuration process

➜ Note: Despite decades of research in PLA, we could not find a methodology fitting our requirements

➜ Reference: Behjati et al. (2011)

Project Example 4

➜ Context: Safety-critical embedded systems in the energy and maritime sectors, e.g., fire and gas monitoring, process shutdown, dynamic positioning (Kongsberg Maritime)

➜ Original scientific problem: Model-driven engineering for failure-mode and effect analysis

➜ Practical observations: Certification meetings with third-party certifiers. Certification is lengthy, expensive, etc. Traceability in large complex systems a priority.

➜ Modified research objective: Traceability between safety requirements and system design decisions. Solution based on SysML and a simple traceability language along with model slicing.

➜ Reference: Sabetzadeh et al. (2011)

Project Example 5

➜ Context: Technology qualification (TQ) in maritime sector (DNV)

➜ Original scientific problem: Model-based quantitative safety analysis

➜ Practical observations: TQ is not a purely objective, quantitative argument. Great complexity (e.g., sources of information) and expert judgment. Many stakeholders.

➜ Modified research objective: Modeling safety arguments to support quantitative reasoning and decision making by several stakeholders

➜ Reference: Sabetzadeh et al. (2011)

Two Other Examples on the ICSM Program

➜ Erik Rogstad et al., “Industrial Experiences with Automated Regression Testing of a Legacy Database Application”

➜ Amir Reza Yazdanshenas and Leon Moonen, “Crossing the Boundaries While Analyzing Heterogeneous Component-Based Software Systems”

Successful Research Patterns

➜ Successful: Innovative and high impact

➜ Inductive research: Working from specific observations in real settings to broader generalizations and theories
  ➥ Field studies and replications, analyze commonalities

➜ Scalability and practicality considerations must be part of the initial research problem definition

➜ Researching by doing: Hands-on research. Apply what exists in well defined, realistic context, with clear objectives. The observed limitations become the research objectives.

➜ Multidisciplinary: other CS, Engineering, or non-technical domains

So What?

➜ Making a conscious effort to understand the problem first
  ➥ Precisely identify the requirements for an applicable solution
  ➥ More papers focused on understanding the problems
  ➥ Making industry tracks first class citizens in SE conferences

➜ Better relationships between academia and industry
  ➥ Different models, e.g., Research-based innovation centers in Norway
  ➥ Common labs (e.g., NASA SEL Lab)
  ➥ Exposing PhD students to industry practice: Ethical considerations (Fixing the PhD, Nature)

➜ Playing an active role in solving the problem, e.g., action research-like

So What?

➜ Work on end-to-end solutions: Pieces of solutions are interdependent. Necessary for impact.

➜ Beyond professors and students
  ➥ Labs with interdisciplinary teams of professional scientists and engineers within or collaborating with universities
  ➥ Used to be the case with corporate research labs: Bell Labs, Xerox PARC, HP Labs, NASA SEL, etc.
  ➥ Now: Fraunhofer (Germany), Simula (Norway), Microsoft Research (US), SEI (US), SnT (Luxembourg)
  ➥ Corporate labs versus publicly supported ones?
  ➥ Key point: The level of basic funding must allow high risk research, performed by professional scientists, focused on impact in society

The NASA SEL Experience Factory Model

[Figure: the Experience Factory model. Source: Basili et al. (NASA SEL)]

Academic Challenges

➜ Our CS legacy … emancipating ourselves as an engineering discipline
  ➥ Systems engineering departments?

➜ How cool is it? SE research is more driven by “fashion” than needs, a quest for silver bullets
  ➥ We can only blame ourselves

➜ Counting papers does not help, and neither does the JSS ranking
  ➥ We are pressuring ourselves into irrelevance

➜ Taking academic tenure and promotion seriously
  ➥ What about rewarding impact?

➜ One’s research must cover a broader ground and be somewhat opportunistic; this pushes us out of our comfort zone

➜ Resources to support industry collaborations
  ➥ Large lab infrastructure, engineers, time

Industrial Challenges

➜ From a discussion with Bran Selic …

➜ Short term versus longer term goals (next quarter’s forecast is the priority)

➜ Industrial research groups are often disconnected from their own business units and external researchers may be perceived as competitors

➜ Company’s intellectual property regulations may conflict with those of the research institution

➜ Complexity of industrial systems and technology
  ➥ Cannot be transplanted in artificial settings for research. Need studies in real settings
  ➥ Substantial domain knowledge is required

A Double-Agent Life

[Figure: cartoon panels; only the captions survive]

➜ Scientist, trying to be discreet, but inquisitive

➜ Warning: No research here

➜ A new idea (as initially perceived by our partners)

➜ Practitioner (sanitized)

Conclusions

➜ Software engineering is obviously important in all aspects of society, but academic software engineering research is not perceived the same way

➜ The academic community, at various levels, is partly responsible for this

➜ How we take up the challenge of increasing our impact will determine the future of the profession

➜ There are solutions, but no silver bullet

➜ We all have a role to play in this, as deans, department chairs, professors, scientists, reviewers, conference organizers, journal editors, etc. We can all be double-agents …

Empirical Software Engineering

➜ Springer, 6 issues a year

➜ Both research papers and industry experience reports

➜ 2nd highest impact factor among SE research journals

➜ “Applied software engineering research with a significant empirical component”

References

➜ Ali, Briand, Hemmati, “Modeling Robustness Behavior Using Aspect-Oriented Modeling to Support Robustness Testing of Industrial Systems”, Journal of Software and Systems Modeling, forthcoming, 2011.

➜ Arcuri and Briand, “Adaptive Random Testing: An Illusion of Effectiveness”, ISSTA, 2011.

➜ Basili, “Learning Through Application: The Maturing of the QIP in the SEL”, in Making Software: What Really Works, and Why We Believe It, edited by Andy Oram and Greg Wilson, O’Reilly, 2011, pp. 65-78.

➜ Behjati, Yue, Briand, and Selic, “SimPL: A Product-Line Modeling Methodology for Families of Integrated Control Systems”, Simula Technical Report 2011-14 (V. 2), submitted.

➜ Frounchi, Briand, Labiche, Grady, and Subramanyan, “Automating Image Segmentation Verification and Validation by Learning Test Oracles”, forthcoming in Information and Software Technology (Elsevier), 2011.

➜ Hemmati, Briand, Arcuri, Ali, “An Enhanced Test Case Selection Approach for Model-Based Testing: An Industrial Case Study”, FSE, 2010.

References II

➜ Sabetzadeh, Nejati, Briand, Evensen Mills, “Using SysML for Modeling of Safety-Critical Software–Hardware Interfaces: Guidelines and Industry Experience”, HASE, 2011.

➜ Parnin and Orso, “Are Automated Debugging Techniques Actually Helping Programmers?”, ISSTA, 2011.

➜ Sabetzadeh et al., “Combining Goal Models, Expert Elicitation, and Probabilistic Simulation for Qualification of New Technology”, HASE, 2011.

➜ Yoo and Harman, “Regression Testing Minimization, Selection and Prioritization: A Survey”, STVR, Wiley, forthcoming.

➜ Bertrand Meyer’s blog: http://bertrandmeyer.com/2010/04/25/the-other-impediment-to-software-engineering-research/
