keynote sbst 2014 - search-based testing

Search-Based Software Testing in Industry: Research Collaborations and Lessons Learned

Lionel Briand
Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT)
University of Luxembourg, Luxembourg
SBST, Hyderabad, 2014

SnT Software Verification and Validation Lab

•  SnT centre, Est. 2009: Interdisciplinary, ICT security-reliability-trust

•  200 scientists and Ph.D. candidates, 20 industry partners

•  SVV Lab: Established January 2012, www.svv.lu

•  25 scientists (Research scientists, associates, and PhD candidates)

•  Industry-relevant research on system dependability: security, safety, reliability

•  Six partners: Cetrel, CTIE, Delphi, SES, IEE, Hitec …

•  And we are always hiring!

An Effective, Collaborative Model of Research and Innovation

[Figure labels: Basic Research, Applied Research, Innovation & Development]

•  Basic and applied research take place in a rich context

•  Basic Research is also driven by problems raised by applied research, which is itself fed by innovation and development

•  Publishable research results and focused practical solutions that serve an existing market.

Shneiderman, 2013

Collaboration in Practice

•  Well-defined problems in context

•  Realistic evaluation

•  Long-term industrial collaborations

[Figure: collaboration process between Industry Partners and Research Groups in eight numbered steps, including Problem Identification, Problem Formulation, State of the Art Review, Candidate Solution(s), Initial Validation, Training, Realistic Validation, and Solution Release]

Outline

•  Four projects:
   –  Testing PID controllers in the automotive industry (Delphi)
   –  Robustness testing of a video conference system (Cisco)
   –  Environment-based testing of a seismic acquisition system (WesternGeco)
   –  Schedulability analysis and stress testing of safety-critical drivers in the oil & gas industry (Kongsberg)

•  Lessons learned, patterns, discussions

•  Meant to be an interactive talk: I am also here to learn

Acknowledgements

PhD students:
•  Marwa Shousha
•  Shaukat Ali
•  Zohaib Iqbal
•  Hadi Hemmati
•  Reza Matinnejad
•  Stefano Di Alesio

Research Associates/Scientists, former colleagues:
•  Shiva Nejati
•  Andrea Arcuri
•  Arnaud Gotlieb
•  Yvan Labiche

Testing PID Controllers (Delphi)

References:

•  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, “MiL Testing of Highly Configurable Continuous Controllers: Scalable Search Using Surrogate Models”, submitted (2014)

•  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, C. Poull, “Search-Based Automated Testing of Continuous Controllers: Framework, Tool Support, and Case Studies”, forthcoming in Information and Software Technology (2014)

•  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, C. Poull, “Automated Model-in-the-Loop Testing of Continuous Controllers using Search”, in 5th Symposium on Search-Based Software Engineering (SSBSE 2013), Springer Lecture Notes in Computer Science (2013, August)

Dynamic continuous controllers are present in many embedded systems


Development Process

[Figure: development process in three stages]

•  Model-in-the-Loop Stage: Simulink Modeling, Generic Functional Model, MiL Testing

•  Software-in-the-Loop Stage: Code Generation and Integration, Software Running on ECU, SiL Testing

•  Hardware-in-the-Loop Stage: Software Release, HiL Testing

Controllers at MIL

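[Figure: closed-loop PID controller connected to a Plant Model]

$e(t) = desired(t) - actual(t)$

$output(t) = K_P \, e(t) + K_I \int e(t)\,dt + K_D \, \frac{de(t)}{dt}$

•  P, I, D: the three terms summed to produce output(t)

•  Inputs: Time-dependent variables

•  Configuration Parameters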

Inputs, Outputs, Test Objectives

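[Figure: Desired Value (input) and Actual Value (output) over time, with the time axis marked at T/2 and T. The desired value moves from Initial Desired (ID) to Final Desired (FD). Annotated test objectives: Smoothness, Responsiveness, Stability.]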
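To make the controller structure and the desired/actual signals concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Delphi models or the objective functions of the cited papers: the plant (a pure integrator), the gains, the change of the desired value at T/2, and the simplified `responsiveness` and `smoothness` measures are all hypothetical.

```python
def simulate_pid(kp, ki, kd, initial_desired, final_desired, steps=2000, dt=0.01):
    """Simulate a discrete PID controller on a hypothetical integrator plant.

    Assumption: the desired value switches from ID to FD halfway through (T/2).
    Returns a list of (time, desired, actual) samples.
    """
    actual = initial_desired  # start settled at the initial desired value
    integral = 0.0
    prev_error = 0.0
    trace = []
    for step in range(steps):
        desired = initial_desired if step < steps // 2 else final_desired
        error = desired - actual                  # e(t) = desired(t) - actual(t)
        integral += error * dt                    # running integral of e(t)
        derivative = (error - prev_error) / dt    # finite-difference de(t)/dt
        output = kp * error + ki * integral + kd * derivative
        actual += dt * output                     # hypothetical plant: actual' = output
        prev_error = error
        trace.append((step * dt, desired, actual))
    return trace


def objectives(trace, final_desired, tolerance=0.01):
    """Simplified, illustrative measures computed on the second half of the trace."""
    after = trace[len(trace) // 2:]
    t_change = after[0][0]
    # Index of the first sample that comes within `tolerance` of the final desired value.
    reach = next((i for i, (_, _, a) in enumerate(after)
                  if abs(a - final_desired) <= tolerance), None)
    if reach is None:
        return {"responsiveness": float("inf"), "smoothness": float("inf")}
    responsiveness = after[reach][0] - t_change   # time needed to get close
    smoothness = max(abs(a - final_desired) for _, _, a in after[reach:])  # worst deviation afterwards
    return {"responsiveness": responsiveness, "smoothness": smoothness}


trace = simulate_pid(kp=2.0, ki=0.5, kd=0.01, initial_desired=0.2, final_desired=0.7)
print(objectives(trace, final_desired=0.7))
```

A search-based tester would treat `initial_desired` and `final_desired` as the search variables and look for the pair that makes such a measure as bad as possible.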

Process and Technology

[Figure: Continuous Controller Tester. Inputs: controller-plant model and objective functions based on requirements. Step 1, Exploration, yields a HeatMap Diagram, which the domain expert uses to identify a list of critical regions. Step 2, Single-State Search, yields worst-case scenarios. Example heatmaps: (a) Liveness, (b) Smoothness.]

Testing in the Configuration Space

•  MIL testing for all feasible configurations

•  The search space is much larger

•  The search is much slower (Simulations of Simulink models are expensive)

•  Not all configuration parameters matter for all objective functions

•  Results are harder to visualize


Modified Process and Technology

[Figure: modified process. Inputs: controller model (Simulink) and objective functions. Step 1, Exploration with Dimensionality Reduction, yields a regression tree, which the domain expert uses to identify a list of critical partitions. Step 2, Search with Surrogate Modeling, yields worst-case scenarios.]

•  Visualization of the 8-dimension space using regression trees

•  Dimensionality reduction to identify the significant variables

•  Surrogate modeling to predict the objective function and speed up the search

Dimensionality Reduction

•  Sensitivity Analysis: Elementary Effect Analysis (EEA)

•  Identify non-influential inputs in computationally costly mathematical models

•  Requires fewer data points than other techniques

•  Observations are simulations generated during the Exploration step

•  Compute sample mean and standard deviation for each dimension of the distribution of elementary effects
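The elementary-effect computation summarized above can be sketched as follows. This is a simplified one-at-a-time variant, not the tooling used in the Delphi study: the function names, the toy model, and the sampling scheme are assumptions made for illustration, and the toy model stands in for an expensive simulation.

```python
import random
import statistics


def elementary_effects(model, num_inputs, num_base_points=20, delta=0.1, seed=0):
    """Return (sample mean, sample standard deviation) of the elementary effects per input.

    For each random base point x in [0, 1 - delta]^k and each input i, the elementary
    effect is (model(x + delta * e_i) - model(x)) / delta.
    """
    rng = random.Random(seed)
    effects = [[] for _ in range(num_inputs)]
    for _ in range(num_base_points):
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(num_inputs)]
        base_value = model(base)
        for i in range(num_inputs):
            shifted = list(base)
            shifted[i] += delta  # perturb one input at a time
            effects[i].append((model(shifted) - base_value) / delta)
    return [(statistics.mean(e), statistics.stdev(e)) for e in effects]


def toy_model(x):
    # Hypothetical stand-in for a simulation: x[0] acts linearly,
    # x[1] and x[2] interact, and x[3] has no influence at all.
    return 3 * x[0] + x[1] * x[2]


for index, (mean, std_dev) in enumerate(elementary_effects(toy_model, num_inputs=4)):
    print(f"input {index}: mean={mean:.3f} std_dev={std_dev:.3f}")
```

An input whose mean and standard deviation are both close to zero, like `x[3]` here, is a candidate for removal from the search space. A non-zero standard deviation points to non-linear or interaction effects.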

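[Figure: scatter plot of the elementary effects of each input (ID, FD, Cal1 to Cal6, with Cal1 and Cal2 labelled together). x-axis: sample mean ($\times 10^{-2}$), ticks from -0.6 to 0.2. y-axis: sample standard deviation ($\times 10^{-2}$), ticks from 0.0 to 0.6.]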

Visualization in Inputs & Configuration Space

Regression Tree (each node lists Count, Mean, Std Dev):

All Points: 1000, 0.007822, 0.0049497
   FD>=0.43306: 574, 0.0059513, 0.0040003
      ID>=0.64679: 373, 0.0047594, 0.0034346
      ID<0.64679: 201, 0.0081631, 0.0040422
         Cal5>=0.014827: 70, 0.0106795, 0.0052045
         Cal5<0.014827: 131, 0.0068185, 0.0023515
   FD<0.43306: 426, 0.0103425, 0.0049919
      Cal5>=0.020847: 182, 0.0134555, 0.0052883
      Cal5<0.020847: 244, 0.0080206, 0.0031751

Surrogate Modeling


•  Any supervised learning or statistical technique providing fitness predictions with confidence intervals

1.  Predict higher fitness with high confidence: Move to new position, no simulation

2.  Predict lower fitness with high confidence: Do not move to new position, no simulation

3.  Low confidence in prediction: Simulation
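The three-way rule above can be sketched in Python as follows. This is a hedged illustration, not the tool from the talk: `decide`, `surrogate_search`, the toy fitness function, and the toy surrogate (which returns a prediction with a fixed half-width for its confidence interval) are all hypothetical.

```python
import random


def decide(current_fitness, predicted, half_width):
    """Compare the interval [predicted - half_width, predicted + half_width] with the current fitness."""
    if predicted - half_width > current_fitness:
        return "move"       # confidently higher: move, no simulation
    if predicted + half_width < current_fitness:
        return "stay"       # confidently lower: do not move, no simulation
    return "simulate"       # low confidence: run the real simulation


def surrogate_search(simulate, surrogate, start, iterations=200, step=0.05, seed=0):
    """Hill climbing on [0, 1] that only simulates when the surrogate is unsure."""
    rng = random.Random(seed)
    current, current_fitness = start, simulate(start)
    simulations = 1
    for _ in range(iterations):
        candidate = min(1.0, max(0.0, current + rng.uniform(-step, step)))
        predicted, half_width = surrogate(candidate)
        action = decide(current_fitness, predicted, half_width)
        if action == "move":
            # Accept on the prediction alone; the stored fitness is an estimate.
            current, current_fitness = candidate, predicted
        elif action == "simulate":
            fitness = simulate(candidate)
            simulations += 1
            if fitness > current_fitness:
                current, current_fitness = candidate, fitness
    return current, current_fitness, simulations


def toy_simulation(x):
    # Hypothetical expensive fitness function with its maximum at x = 0.7.
    return -(x - 0.7) ** 2


def toy_surrogate(x):
    # Hypothetical surrogate: slightly biased prediction with a fixed interval half-width.
    return toy_simulation(x) + 0.001, 0.002


best, fitness, simulations = surrogate_search(toy_simulation, toy_surrogate, start=0.1)
print(best, fitness, simulations)
```

The saving comes from the `simulations` counter growing more slowly than the number of iterations. How much is saved depends on how tight the surrogate's intervals are.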

[Figure: fitness as a function of x, comparing the Surrogate Model with the Real Function]

Results

•  Search yielded worst-case scenarios that were much worse than known and expected scenarios

•  Surrogate modeling: Polynomial regression yielded best fit and predictive power so far

•  Dimensionality reduction helps generate better surrogate models

•  Surrogate modeling can yield up to an eight-fold increase in search speed

•  Surrogate modeling can help find more critical requirements violations

•  By accounting for variations in configurations, we found more critical requirements violations than just with the HIL configuration


Robustness Testing of a Video Conference System (Cisco)

References:

•  S. Ali, L. Briand, H. Hemmati, “Modeling Robustness Behavior Using Aspect-Oriented Modeling to Support Robustness Testing of Industrial Systems”, Journal of Software and Systems Modeling (Springer), 2011

•  S. Ali, M. Z. Iqbal, A. Arcuri, L. Briand, “Generating Test Data from OCL Constraints with Search Techniques”, IEEE Transactions on Software Engineering, 2013

Video Conference System


Core Functionality

[Figure: endpoints EP1, EP2, and EP3 connected in a Call. Labels: Outgoing channel, Incoming channel, Audio Channel, Video Channel, Presentation Channel.]

Robustness

•  Robustness is the degree to which a software component functions correctly in the presence of exceptional inputs or stressful environmental conditions (IEEE Std 610.12-1990)

•  Significant additional complexity lies in handling the robustness properties
   –  Network communication faults
   –  Media quality faults in media streams
   –  Faults in the endpoints

Cross-Cutting Concern

[Figure: state machine of the base model with a cross-cutting robustness concern]

Base model:
•  States: Idle [#calls=0], NotFull [0<#calls<max], Full [#calls=max]
•  Transitions: dial(), dial() [#calls<max-1], dial() [#calls=max-1], disconnect(), disconnect() [#calls>1], disconnect() [#calls=1]

Cross-cutting concern:
•  State: Recovery […]
•  Transitions: After(time), DisconnectAll()
•  Guards:
   –  PL>0 or PacketDelay>0 or ReorderDelay>0 or corrupt>0 or Duplicate>0
   –  PL=0 && PacketDelay=0 && ReorderDelay=0 && corrupt=0 && Duplicate=0

Model-Based Testing (MBT)

•  Goals: Scalability, complete automation

•  Model-based Testing (MBT) uses models of the system for test case and oracle generation
   –  The models typically describe some aspects of the system under test
   –  Increasingly used for complete test automation, e.g., aerospace, automotive, banking

•  Often using well-established standards for modeling and their extensions: UML (profiles), OCL, etc.

•  Requirements:
   –  Test-ready models
   –  Appropriate test strategies, e.g., path selection
   –  Test data generation
   –  Oracles

Model-Based Testing: Process and Technology


Test Data Generation for MBT

•  Test data is needed to execute program paths as required by a coverage criterion during testing

•  For MBT, test data is typically an instance of a class diagram

•  Instances must fulfill invariants

•  Paths in state machines carry constraints (guards) on conditions

•  To generate test data for UML/OCL models, we need to solve OCL constraints written on the models

context Student inv ageConstraint: self.age > 15 and self.age < 80

Example OCL expression in VC Model

context Saturn inv synchronizationConstraint:
    self.systemUnit.NumberOfActiveCalls > 1 and
    self.systemUnit.NumberOfActiveCalls <= self.systemUnit.MaximumNumberOfActiveCalls
    and self.media.synchronizationMismatch.unit = TimeUnitKind::s
    and (self.media.synchronizationMismatch.value >= 0 and
         self.media.synchronizationMismatch.value <= self.media.synchronizationMismatchThreshold.value)
    and self.conference.PresentationMode = Mode::Off
    and self.conference.call->select(call |
        call.incomingPresentationChannel.Protocol <> VideoProtocol::Off)->size() = 2
    and self.conference.call->select(call |
        call.outgoingPresentationChannel.Protocol <> VideoProtocol::Off)->size() = 2

OCL Constraint Solvers

•  A number of approaches exist for OCL constraint solving

•  Not complete
   –  Support only a subset of OCL

•  Lack of proper tool support
   –  A number of approaches are not automated

•  Not scalable
   –  Often based on translation (e.g., to CSP)
   –  Combinatorial explosion

A Search Problem

•  We used an alternative approach, applying search-based testing (SBT) concepts to solve OCL constraints

•  The process of generating test data can be seen as a search process
   –  There is a huge number of possible instances that can be generated for a particular model
   –  We need to select instances that solve the constraint

•  Fitness defined as a distance function d()
   –  d() returns 0 if the constraint is solved
   –  otherwise a value that heuristically estimates how far the constraint was from being evaluated as true
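As a rough illustration of search guided by a distance function, the sketch below solves the Student age invariant with a simple hill climber. The distance rules (a constant `K` added when a comparison fails, `and` as a sum) follow common branch-distance conventions and are assumptions here; the exact rules and search algorithms in the cited work may differ, and `solve` is a hypothetical helper.

```python
import random

K = 1.0  # assumed constant added whenever a comparison is not satisfied


def d_greater(a, b):
    """Distance for a > b: 0 when satisfied, otherwise grows with the gap."""
    return 0.0 if a > b else (b - a) + K


def d_less(a, b):
    """Distance for a < b."""
    return 0.0 if a < b else (a - b) + K


def d_and(*distances):
    """Distance for a conjunction: every operand must reach 0."""
    return sum(distances)


def age_distance(age):
    # context Student inv ageConstraint: self.age > 15 and self.age < 80
    return d_and(d_greater(age, 15), d_less(age, 80))


def solve(distance, low, high, max_evaluations=10_000, seed=0):
    """Hill climbing over integers in [low, high], minimizing the distance."""
    rng = random.Random(seed)
    value = rng.randint(low, high)
    best = distance(value)
    evaluations = 1
    max_step = max(1, (high - low) // 10)
    while best > 0 and evaluations < max_evaluations:
        step = rng.choice([-1, 1]) * rng.randint(1, max_step)
        candidate = min(high, max(low, value + step))
        score = distance(candidate)
        evaluations += 1
        if score <= best:  # accept moves that do not increase the distance
            value, best = candidate, score
    return value, best


value, best = solve(age_distance, low=-1000, high=1000)
print(value, best)
```

Because the distance shrinks as the value approaches the satisfying range, the search gets guidance even when no satisfying instance has been seen yet, which a plain true/false check would not provide.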


Challenges

•  Primitive Types, Boolean Operators

•  Operations on Collections, Iterators

•  Fine-grained fitness functions for iterators using size, oclInState

•  Consider a collection C = {1, 2, 3} and a constraint C->forAll(x | x = 0)

$$d(C\text{->forAll}(x \mid x = 0)) = \frac{\sum_{i} d(C.\text{at}(i) = 0)}{C\text{->size}()} = \frac{d(1 = 0) + d(2 = 0) + d(3 = 0)}{3} = \frac{2 + 3 + 4}{3} = 3$$

•  Many complex rules for the computation of fitness functions based on OCL expressions

•  Fine-grained heuristics -> maximum guidance
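The forAll example can be reproduced with a few lines of Python. The numbers on the slide are consistent with an equality distance of |a - b| + k with k = 1; that reading, and the helper names below, are assumptions for illustration.

```python
K = 1  # assumed constant, consistent with d(1 = 0) = 2 in the example


def d_equal(a, b):
    """Distance for a = b: 0 when equal, otherwise |a - b| + K."""
    return 0 if a == b else abs(a - b) + K


def d_for_all(collection, element_distance):
    """Distance for forAll: average of the element distances over the collection."""
    if not collection:
        return 0  # forAll over an empty collection holds
    return sum(element_distance(x) for x in collection) / len(collection)


print(d_for_all([1, 2, 3], lambda x: d_equal(x, 0)))  # prints 3.0
```

Averaging over the elements means that fixing any single element lowers the distance, so the search is rewarded for partial progress.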

VC Model and Results

•  UML Class diagram, state machines, OCL

•  20 subsystems, on average 5 states and 11 transitions (largest: 22 states, 63 transitions)

•  OCL: 144 constraints as guards, 100 invariants, and 57 change events

•  Results:
   –  All constraints were resolved
   –  Maximum time: ~2 minutes on a laptop

Environment-Based Testing of a Seismic Acquisition System (WesternGeco)

References:


•  Z. Iqbal, A. Arcuri, L. Briand, “Empirical Investigation of Search Algorithms for Environment Model-Based Testing of Real-Time Embedded Software”, ACM ISSTA, 2012

•  Z. Iqbal, A. Arcuri, L. Briand, “Environment Modeling and Simulation for Automated Testing of Soft Real-Time Embedded Software”, Software and System Modeling (Springer), 2014

Objectives

•  Model-based system testing
   –  Black-box
   –  Environment models

[Figure labels: Environment Models, Environment Simulator, Test cases, Test oracle]

Environment: “Domain” Model


Environment: “Behavioral” Model


Test Case Generation

•  Test objectives: Reach “error” states (critical environment states)

•  Test Case: (1) Environment and (2) Simulation Configuration
   –  (1) Number of instances for each component in the domain model, e.g., number of items on the conveyor belt
   –  (2) Setting non-deterministic properties of the environment, e.g., speed of sorter’s left and right arms

•  Oracle: Reaching an “error” state

•  SBST: Heuristics
   –  Distance from error state
   –  Distance from satisfying OCL guards
   –  Time distance
   –  Time in “risky” states
   –  …

Schedulability Analysis and Stress Testing of Safety-Critical Drivers (Kongsberg Maritime)

References:


•  L. Briand, Y. Labiche, and M. Shousha, “Using genetic algorithms for early schedulability analysis and stress testing in real-time systems”, Genetic Programming and Evolvable Machines, vol. 7 no. 2, pp. 145-170, 2006

•  S. Nejati, S. Di Alesio, M. Sabetzadeh, and L. Briand, “Modeling and analysis of cpu usage in safety-critical embedded systems to support stress testing,” in Model Driven Engineering Languages and Systems. Springer, 2012, pp. 759–775.

•  S. Di Alesio, S. Nejati, L. Briand, A. Gotlieb, “Stress Testing of Task Deadlines: A Constraint Programming Approach”, ISSRE 2013, San Jose, USA

•  S. Di Alesio, S. Nejati, L. Briand, A. Gotlieb, “Worst-Case Scheduling of Software Tasks – A Constraint Optimization Model to Support Performance Testing”, Constraint Programming (CP), 2014

Fire/Gas Detection and Emergency Shutdown

Monitors gas leaks and fire on oil extraction platforms

[Figure labels: Control Modules; Drivers (Software-Hardware Interface); Alarm Devices (Hardware); Real Time Operating System; Multicore Architecture]

Performance Requirements are Hard to Verify

They constrain the entire system’s behavior and thus can’t be checked locally

They depend on the environment the software interacts with (hardware devices)

They depend on the computing platform on which the software runs

Schedulability Analysis and Testing

•  Real-time embedded systems (RTES) have concurrent, interdependent tasks that must finish before their deadlines

•  Each task has a deadline (i.e., latest finishing time) w.r.t. its arrival time

•  Some task properties depend on the environment, some are design choices

•  Tasks can trigger other tasks, and can share computational resources with other tasks

•  Schedulability analysis encompasses techniques that try to predict whether all (critical) tasks are schedulable, i.e., meet their deadlines

•  Stress testing runs carefully selected test cases that have a high probability of leading to deadline misses


Arrival Times Determine Deadline Misses

$j_0, j_1, j_2$ arrive at $at_0, at_1, at_2$ and must finish before $dl_0, dl_1, dl_2$

$j_1$ can miss its deadline $dl_1$ depending on when $at_2$ occurs!

[Figure: two timelines over time units 0 to 9, each showing $j_0$, $j_1$, $j_2$ with their arrival times $at_0$, $at_1$, $at_2$, their deadlines $dl_0$, $dl_1$, $dl_2$, and the time horizon $T$]

Search-Based Approaches

•  This problem can be tackled as a search problem in the space of arrival times for aperiodic tasks

•  Identify worst-case scenarios for testing

•  No assumptions

•  Genetic algorithms: Briand et al., 2003-2006

•  Constraint Programming (e.g., OPL, ILOG CP Optimizer)
   –  Nejati et al., 2012
   –  Di Alesio et al., 2013-2014
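A minimal sketch of searching the space of arrival times is given below. It is neither the GA nor the CP model of the cited papers: it uses a plain stochastic hill climber, a single-core fixed-priority preemptive scheduler simulated in unit time steps, and a hypothetical task set (`TASKS`, `HORIZON`, and all parameter values are made up for illustration).

```python
import random

# Hypothetical aperiodic tasks: (name, priority, duration, relative deadline).
# A higher priority value preempts a lower one.
TASKS = [("j0", 1, 3, 9), ("j1", 2, 2, 3), ("j2", 3, 2, 3)]
HORIZON = 10  # latest allowed arrival time


def max_lateness(arrival_times):
    """Simulate the schedule and return max(finish time - absolute deadline).

    A positive value means at least one task misses its deadline.
    """
    remaining = [duration for _, _, duration, _ in TASKS]
    finish = [None] * len(TASKS)
    time = 0
    while any(f is None for f in finish):
        ready = [i for i in range(len(TASKS))
                 if arrival_times[i] <= time and remaining[i] > 0]
        if ready:
            running = max(ready, key=lambda i: TASKS[i][1])  # highest priority runs
            remaining[running] -= 1
            if remaining[running] == 0:
                finish[running] = time + 1
        time += 1
    return max(finish[i] - (arrival_times[i] + TASKS[i][3])
               for i in range(len(TASKS)))


def stress_search(iterations=500, seed=0):
    """Look for arrival times that maximize lateness (the stress test case)."""
    rng = random.Random(seed)
    best = [rng.randint(0, HORIZON) for _ in TASKS]
    best_fitness = max_lateness(best)
    for _ in range(iterations):
        candidate = list(best)
        candidate[rng.randrange(len(TASKS))] = rng.randint(0, HORIZON)  # mutate one arrival
        fitness = max_lateness(candidate)
        if fitness >= best_fitness:  # accept ties to move across plateaus
            best, best_fitness = candidate, fitness
    return best, best_fitness


arrivals, lateness = stress_search()
print(arrivals, lateness)
```

In this toy task set, `j1` misses its deadline only when `j2` arrives at the same time or one time unit later, which is the kind of arrival pattern the search is meant to uncover.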


Constraint Optimization

Constraint Optimization Problem:

•  Static Properties of Tasks (Constants)

•  Dynamic Properties of Tasks (Variables)

•  OS Scheduler Behaviour (Constraints)

•  Performance Requirement (Objective Function)

Process and Technologies

[Figure: process overview]

•  INPUT: System Design and System Platform, captured through UML Modeling as a Design Model (Time and Concurrency Information)

•  Optimization Problem: Find arrival times that maximize the chance of deadline misses

•  Automated Search: Genetic Algorithms (GA), Constraint Programming (CP)

•  OUTPUT: Solutions (task arrival times likely to lead to deadline misses), e.g., $at_0 = 1$, $at_1 = 3$, $at_2 = 4$

•  Solutions feed Deadline Misses Analysis and Stress Test Cases

Results and Current Work

•  GA tends to be more efficient but less effective than CP
   –  More efficient: Finds deadline misses quicker
   –  More effective: Finds worse deadline misses

•  CP is deterministic, evolutionary search is randomized

•  For testing we want a diverse set of stress test cases

•  Combining GA and CP (Di Alesio’s dissertation):
   –  Achieve an efficiency close to GA and an effectiveness close to CP
   –  Use GA first and improve the worst solutions found by GA by performing a complete CP search in the neighborhood of solutions
   –  Results on five case studies are very encouraging

SBST in Industry: Discussion

•  Scalability

•  Applicability

•  Variety of heuristics as a function of test objectives, available information, assumptions, etc.

•  Search as a piece of the solution: multidisciplinarity

•  Combining search with other techniques: Likely candidates


Scalability

•  Search spaces are huge in practice

•  Fitness computation is often computationally intensive

•  Test execution can be expensive
   –  Web applications or phone apps versus embedded systems with HIL
   –  Models, simulation to guide the search

•  Simulation is always expensive
   –  Simulink models, e.g., 31 s for a 2 s simulation
   –  Surrogate modeling?

•  In many situations, models of the system can help guide the search

Applicability

•  Many academic solutions are not applicable in practice

•  Context matters

•  Scalability -> applicability

•  But also inputs required for guiding the search

•  Integration with the rest of the development process
   –  E.g., design models, WCET analysis, Simulink development

A Large Variety of Heuristics

•  Test objectives differ a great deal depending on context
   –  Performance, robustness, critical environment states …

•  Available information also differs, both for guiding test generation and oracles
   –  Purely black-box testing
   –  Design information, e.g., through models

•  Working assumptions
   –  About process, technology, …
   –  E.g., availability of plant/environment models in Simulink

•  In a given context, some degree of tailoring is usually required for applying SBST

Multidisciplinarity

•  Typically, meta-heuristic search is only part of a solution to a testing problem

•  Dedicated system or environment modeling, e.g., in Cisco and WesternGeco studies

•  Machine learning, e.g., regression trees in Delphi study

•  Statistical analysis, e.g., EEA and non-linear regression in Delphi study

•  Constraint programming, e.g., in Kongsberg study


SVV lab: svv.lu

SnT: www.securityandtrust.lu