Testing Dynamic Behavior in Executable Software Models - Making Cyber-Physical Systems Testable
TRANSCRIPT
Testing Dynamic Behavior in Executable Software Models
Making Cyber-Physical Systems Testable

Lionel Briand
July 18, ISSTA 2016
Acknowledgements
• Shiva Nejati
• Reza Matinnejad
• Raja Ben Abdessalem
2
Cyber-Physical Systems
• Increasingly complex and critical systems
• Complex environment
• Complex requirements, e.g., temporal, timing, resource usage
• Dynamic behavior
• Uncertainty, e.g., about the environment
• Testing is expensive and difficult, e.g., HW in the loop
3
[Diagram: the cyber space (information, networks) is coupled to the real space (object domain) through sensing and actuation]
Dynamic Behavior
• Common when dealing with physical entities
• Inputs and outputs are variables evolving over time (signals)
• Properties to be verified consider change over time, for individual outputs or sets of outputs
4
[Plot: a time-continuous, magnitude-continuous signal, i.e., a value evolving over time]
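In practice, such a signal is discretized for simulation. A minimal sketch; the horizon, step size, and example input below are illustrative, not taken from the talk:

```python
import numpy as np

# A test input "signal" is a function of time; for simulation it is
# discretized over the horizon at a fixed step (values are illustrative).
T, step = 2.0, 0.001                   # simulation horizon (s) and time step
t = np.arange(0.0, T, step)            # time axis
u = 0.5 * (1 + np.sin(2 * np.pi * t))  # an example magnitude-continuous input
```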
MiL Testing
[Pipeline diagram:
• Model-in-the-Loop (MiL) stage: Simulink modeling yields a generic functional model, exercised by MiL testing. Both the system and its environment are modeled.
• Software-in-the-Loop (SiL) stage: code generation and integration yield software running on the ECU, exercised by SiL testing.
• Hardware-in-the-Loop (HiL) stage: the software release is exercised by HiL testing. HiL testing is highly expensive and time consuming.]
5
Simulink Models - Simulation
• Simulation models
  • are heterogeneous, e.g., a time-continuous Simulink model, a hardware model, a network model
  • have continuous behavior
  • are used for algorithm design testing and for comparing design options
6
MiL Test Cases
7
[Diagram: each test case assigns input signals S1(t), S2(t), S3(t); model simulation produces the output signal(s). Test Case 1 and Test Case 2 differ in their input signals.]
MiL Testing Challenges
• Space of test input signals is extremely large.
• Model execution, especially when involving physical modeling, is extremely expensive.
• Oracles are not simple Boolean properties: they involve analyzing changes in value over time (e.g., signal patterns) and assessing levels of risk.
8
MiL Testing Challenges (2)
• Simulable model of the (physical) environment is required for test automation, but not always available.
• Effectiveness of test coverage strategies is questionable, e.g., model coverage.
• No equivalence classes on input signal domains, no combinatorial approaches.
9
We need novel, automated, and cost-effective MiL
testing strategies for CPS
10
Industrial Examples
11
Advanced Driver Assistance Systems (ADAS)
Decisions are made over time based on sensor data
12
[Diagram: sensors feed data to the software]
Pedestrian Detection System (PeVi)
13
• The PeVi system is a camera-based assistance system providing improved vision
Challenges
• Simulation/testing is performed using physics-based simulation environments
• Challenge 1: a large number of simulation scenarios
  • more than 2000 configuration variables
• Challenge 2: simulations are computationally expensive
[Diagram: a simulation scenario configures weather, road, sensors, humans, and vehicles]
14
Approach
15
[Process diagram:
(1) Development of requirements and domain models: specification documents (simulation environment and PeVi system) are turned into a domain model and a requirements model.
(2) Generation of test case specifications: static [ranges/values/resolution] and dynamic [ranges/resolution].]
16
- intensity: RealSceneLight
DynamicObject
1- weatherType: Condition
Weather
- fog- rain- snow- normal
«enumeration»Condition
Output Trajectory
- field of view: Real
Camera Sensor
RoadSide Object
- roadType: RTRoad
1 - curved- straight- ramped
«enumeration»RT
- vc: RealVehicle
- x0: Real
- y0: Real
- θ: Real- vp: Real
Pedestrian
- x: Real- y: Real
Position
1
*
1
*
11
- state: BooleanCollision
Parked Cars
Trees- simulationTime: Real- timeStep: Real
Test Scenario
PeVi
- state: BooleanDetection
11
11
11
11
«positioned»
«uses»1 1
Requirements Model
17
[Requirements diagram, traced to the domain model: an AWACar/Motor/Truck/Bus has a sensor and an Acute Warning Area (AWA) delimited by posx1, posx2, posy1, posy2; a human appears with a trajectory composed of a speed profile and a path of slots/path segments; a warning relates the sensor to the human in the AWA.]
The PeVi system shall detect any person located in the Acute Warning Area (AWA) of a vehicle.
Test Generation Overview
18
[Diagram: a multi-objective meta-heuristic search generates scenarios for the simulator + PeVi. Environment settings (roads, weather, vehicle type, etc.) are fixed during the search; the human simulator (initial position, speed, orientation) and the car simulator (speed) are manipulated by the search. Each simulation reports whether the pedestrian was detected and whether a collision occurred.]
Multi-Objective Search
• The search algorithm needs objective (fitness) functions for guidance
• In our case, several independent functions are of interest (heuristics):
  • Distance between the car and the pedestrian
  • Distance between the pedestrian and the AWA
  • Time to collision
[Sketch: the AWA as a region delimited by posx1, posx2, posy1, posy2 ahead of the vehicle]
19
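These three heuristics can be computed from the simulation trace. A minimal sketch; the rectangular AWA model and all function names are illustrative assumptions, not the talk's actual implementation:

```python
import math

def dist_car_pedestrian(car_xy, ped_xy):
    """Euclidean distance between car and pedestrian (to be minimized)."""
    return math.dist(car_xy, ped_xy)

def dist_pedestrian_awa(ped_xy, awa):
    """Distance from the pedestrian to the AWA, modelled here as an
    axis-aligned rectangle (posx1, posx2, posy1, posy2); 0 if inside."""
    x, y = ped_xy
    posx1, posx2, posy1, posy2 = awa
    dx = max(posx1 - x, 0.0, x - posx2)
    dy = max(posy1 - y, 0.0, y - posy2)
    return math.hypot(dx, dy)

def time_to_collision(rel_distance, closing_speed):
    """TTC heuristic: time until the car reaches the pedestrian at the
    current closing speed; infinite when they are not approaching."""
    return rel_distance / closing_speed if closing_speed > 0 else math.inf
```

During the search, each function would be minimized over the time steps of a simulation, so that scenarios close to a missed detection or a collision score best.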
Pareto Front
20
• Individual A Pareto-dominates individual B if A is at least as good as B in every objective and better than B in at least one objective.
[Plot: objective space (O1, O2) showing a Pareto front and the region dominated by a point x]
• A multi-objective optimization algorithm must:
  • Guide the search towards the globally Pareto-optimal front.
  • Maintain solution diversity along the Pareto-optimal front.
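The dominance relation itself is a one-liner. A minimal sketch, assuming all objectives are to be minimized and solutions are equal-length vectors of objective values:

```python
def dominates(a, b):
    """True if `a` Pareto-dominates `b`: at least as good everywhere
    (<=, objectives minimized) and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and \
           any(x < y for x, y in zip(a, b))
```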
MO Search with NSGA-II
21
• Based on a genetic algorithm
• N: archive and population size
• Non-dominated sorting: solutions are ranked according to how far they are from the Pareto front; fitness is based on rank.
• Crowding distance: individuals in the archive are spread more evenly across the front (forcing diversity).
• Each generation, parents and offspring (size 2N) are non-dominated sorted, and selection based on rank and crowding distance retains N solutions.
• Runs simulations for close to N new solutions per generation.
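A sketch of one NSGA-II generation; `fast_nondominated_sort` and `crowding_distance` are assumed helpers (the `dominates` function above can underpin the sort), and `variation` stands for selection, crossover, and mutation:

```python
def nsga2_generation(population, variation, n):
    """One NSGA-II generation (sketch): parents + offspring (size 2N)
    are non-dominated sorted, then N survivors are kept by rank,
    splitting the last front by crowding distance."""
    union = population + variation(population)   # size 2N
    survivors = []
    for front in fast_nondominated_sort(union):  # rank 0 is best
        if len(survivors) + len(front) <= n:
            survivors.extend(front)              # whole front fits
        else:                                    # split the last front
            dist = {id(s): crowding_distance(s, front) for s in front}
            front.sort(key=lambda s: dist[id(s)], reverse=True)
            survivors.extend(front[: n - len(survivors)])
            break
    return survivors
```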
Pareto Front Results
22
[Plot: the resulting Pareto front in the time-to-collision (TTC) vs. distance(pedestrian, AWA) objective space]
23
Simulation Scenario Execution
• https://sites.google.com/site/testingpevi/
24
Improving Time Performance
• Individual simulations take on average more than 1 min
• It takes 10 hours to run our search-based test generation (≈ 500 simulations)
→ We use surrogate modeling to improve the search
• Goal: predict fitness based on the dynamic variables
• Neural networks
25
Multi-Objective Search with Surrogate Models
26
[Same selection scheme as before: non-dominated sorting, then selection based on rank and crowding distance, from size 2N down to size N]
• Original algorithm: runs simulations for all new solutions
• New algorithm: uses prediction values and prediction errors to run simulations only for the solutions that might be selected
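A sketch of that filtering step, assuming fitness is minimized and the surrogate (e.g., the neural network) returns a prediction together with an error estimate; `surrogate.predict`, `simulate`, and the threshold are illustrative names:

```python
def evaluate_with_surrogate(candidates, surrogate, simulate, threshold):
    """Simulate only solutions that might survive selection (sketch).
    `surrogate.predict(c)` is assumed to return a predicted fitness
    and an error bound for candidate c; fitness is minimized."""
    results = []
    for c in candidates:
        pred, err = surrogate.predict(c)
        if pred - err > threshold:            # even optimistically too poor:
            results.append((c, pred))         # keep the cheap prediction
        else:                                 # might be selected:
            results.append((c, simulate(c)))  # pay for a real simulation
    return results
```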
Results – Surrogate Modeling
27
[Plots of hypervolume (HV) vs. time in minutes, over 10-150 min: (a) mean HV of NSGAII vs. NSGAII-SM; (b) mean HV of random search (RS) vs. NSGAII-SM; (c) HV of the worst runs of NSGAII, NSGAII-SM, and RS]
Results – Worst Runs
28
[Same HV-vs-time figure as above; panel (c) compares the worst runs of NSGAII, NSGAII-SM, and RS]
Results – Random Search
29
[Same HV-vs-time figure as above; panel (b) compares random search (RS) against NSGAII-SM]
Conclusion
• A general testing approach for ADAS, and for many CP systems
• Formulated the generation of critical test cases as a multi-objective search problem using the NSGA-II algorithm
• Improved search performance with surrogate models based on neural networks
• Generated critical scenarios: no detection in the AWA, collision and no detection
• No clear-cut oracle: a failure to detect may be deemed an acceptable risk
30
Dynamic Continuous Controllers
31
• Supercharger bypass flap controller
  ✓ Flap position is bounded within [0..1]
  ✓ Implemented in MATLAB/Simulink
  ✓ 34 (sub-)blocks decomposed into 6 abstraction levels
  ✓ Simulation time T = 2 seconds
[Pictures: the supercharger bypass flap; flap position = 0 (open), flap position = 1 (closed)]
Simple Example
32
[Plots: the test input is a step in the desired value, from an initial desired value to a final desired value at T/2; the test output is the actual value tracking it over [0, T].]
[Block diagram: the controller (SUT) receives the error between the desired value and the actual value and drives the plant model; the system output feeds back as the actual value.]
MiL Testing of Controllers
33
Configurable Controllers at MiL
[Block diagram: a PID controller in closed loop with the plant model. The error e(t) = desired(t) - actual(t) feeds three configurable terms whose sum is the output:]

output(t) = K_P e(t) + K_I ∫ e(t) dt + K_D de(t)/dt

• Time-dependent variables: desired(t), actual(t), e(t), output(t)
• Configuration parameters: K_P, K_I, K_D
34
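A discretized step of this control law, as a minimal sketch (the gains, step size, and state dictionary are illustrative):

```python
def pid_step(desired, actual, state, kp, ki, kd, dt):
    """One discrete step of output = Kp*e + Ki*integral(e) + Kd*de/dt.
    `state` carries the running integral and the previous error."""
    error = desired - actual
    state["integral"] += error * dt
    derivative = (error - state["prev_error"]) / dt
    state["prev_error"] = error
    return kp * error + ki * state["integral"] + kd * derivative
```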
Requirements and Test Objectives
[Plot: the desired value (input) steps from Initial Desired (ID) to Final Desired (FD) at T/2; the actual value (output) must track it. The annotated test objectives are smoothness, responsiveness, and stability.]
35
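Each objective can be turned into a measurable function of the output signal. A minimal sketch over the post-step portion of the simulated output; the tolerance and the particular formulas are illustrative assumptions:

```python
import numpy as np

def smoothness(actual, desired_final):
    """Largest over/undershoot: maximum deviation of the post-step
    output from the final desired value (illustrative)."""
    return float(np.max(np.abs(actual - desired_final)))

def responsiveness(t, actual, desired_final, tol=0.02):
    """Response time: first instant after which the output stays
    within `tol` of the final desired value (illustrative)."""
    inside = np.abs(actual - desired_final) <= tol
    for i in range(len(t)):
        if inside[i:].all():
            return float(t[i])
    return float("inf")

def stability(actual):
    """Residual oscillation: total variation of the output over the
    tail of the simulation (illustrative)."""
    tail = actual[len(actual) // 2:]
    return float(np.sum(np.abs(np.diff(tail))))
```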
A Search-Based Test Approach
[Heatmap over the (Initial Desired, Final Desired) input space: where are the worst cases?]
• Continuous behavior
• The controller's behavior can be complex
• Meta-heuristic search in the (large) input space: finding worst-case inputs
• Possible because of an automated oracle (the feedback loop)
• Different worst cases for different requirements
• Worst cases may or may not violate requirements
36
Initial Solution
[Process diagram: (1) exploration of the input space produces a HeatMap diagram over Initial Desired x Final Desired (each axis in [0, 1]); a domain expert selects a list of critical regions; (2) a single-state search within those regions, guided by objective functions based on the requirements and driven by the controller-plant model, returns worst-case scenarios, shown as desired vs. actual value over time.]
37
Results
• We found much worse scenarios during MiL testing than our partner had found so far
• These scenarios are also run at the HiL level, where testing is much more expensive: MiL results -> test selection for HiL
• But further research was needed:
• Simulations are expensive
• Configuration parameters
38
Final Solution
[Process diagram: (1) exploration with dimensionality reduction produces a regression tree; a domain expert selects a list of critical partitions; (2) a search with surrogate modeling, guided by objective functions and the controller model (Simulink), returns worst-case scenarios.]
• Visualization of the 8-dimensional space using regression trees
• Dimensionality reduction to identify the significant variables (elementary effect analysis)
• Surrogate modeling to predict the objective function and speed up the search (machine learning)
39
Regression Tree
All points: count 1000, mean 0.007822, std dev 0.0049497
├─ FD < 0.43306: count 574, mean 0.0059513, std dev 0.0040003
│  ├─ ID < 0.64679: count 373, mean 0.0047594, std dev 0.0034346
│  └─ ID >= 0.64679: count 201, mean 0.0081631, std dev 0.0040422
│     ├─ Cal5 < 0.014827: count 131, mean 0.0068185, std dev 0.0023515
│     └─ Cal5 >= 0.014827: count 70, mean 0.0106795, std dev 0.0052045
└─ FD >= 0.43306: count 426, mean 0.0103425, std dev 0.0049919
   ├─ Cal5 < 0.020847: count 244, mean 0.0080206, std dev 0.0031751
   └─ Cal5 >= 0.020847: count 182, mean 0.0134555, std dev 0.0052883
40
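Such a tree can be fitted with any off-the-shelf regression-tree learner. A sketch with scikit-learn; the random data stands in for the sampled (input point, objective value) pairs, and the feature names mirror the slide:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
X = rng.random((1000, 8))  # stand-in for sampled points of the 8-D input space
y = rng.random(1000)       # stand-in for the simulated objective values

features = ["ID", "FD", "Cal1", "Cal2", "Cal3", "Cal4", "Cal5", "Cal6"]
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=50).fit(X, y)
print(export_text(tree, feature_names=features))  # partitions ~ critical regions
```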
Surrogate Modeling
• Any supervised learning or statistical technique providing fitness predictions with confidence intervals
1. Higher fitness predicted with high confidence: move to the new position, no simulation
2. Lower fitness predicted with high confidence: do not move to the new position, no simulation
3. Low confidence in the prediction: simulate
[Plot: the surrogate model approximating the real fitness function over the input x]
41
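The three rules map directly onto one step of the single-state search. A minimal sketch, assuming fitness is maximized and that `surrogate.predict` returns a mean and a confidence-interval half-width (all names illustrative):

```python
def surrogate_move(current, current_fit, candidate, surrogate, simulate):
    """Decide whether to move to `candidate` using the three rules."""
    mean, ci = surrogate.predict(candidate)
    if mean - ci > current_fit:   # 1. confidently better: move, no simulation
        return candidate, mean
    if mean + ci < current_fit:   # 2. confidently worse: stay, no simulation
        return current, current_fit
    real = simulate(candidate)    # 3. uncertain: run the real simulation
    return (candidate, real) if real > current_fit else (current, current_fit)
```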
Results
✓ Our approach identifies critical violations of the controller requirements that were found neither by our earlier work nor by manual testing.

                   MiL testing,              MiL testing,           Manual
                   different configurations  fixed configurations   MiL testing
Stability          2.2% deviation            -                      -
Smoothness         24% over/undershoot       20% over/undershoot    5% over/undershoot
Responsiveness     170 ms response time      80 ms response time    50 ms response time
42
Open Loop Controllers
[Diagram: an on/off control signal, CtrlSig]
• Mixed discrete-continuous behavior: Simulink Stateflows
• No plant model: much quicker simulation time
• No feedback loop -> no automated oracle
• The main testing cost is the manual analysis of output signals
• Goal: minimize test suites
• Challenge: test selection
• An entirely different approach to testing
[Excerpt from the accompanying paper, shown on the slide:]
... respectively. In addition, we adapt the whitebox coverage and the blackbox output diversity selection criteria to Stateflows, and evaluate their fault revealing power for continuous behaviours. Coverage criteria are prevalent in software testing and have been considered in many studies of test suite effectiveness in different application domains [?]. In our work, we consider state and transition coverage criteria [?] for Stateflows. Our output diversity criterion is based on the recent output uniqueness criterion [?] that has been studied for web applications and has been shown to be a useful surrogate for whitebox selection techniques. We consider this criterion in our work because Stateflows have complex internal structures consisting of differential equations, making them less amenable to whitebox techniques, while they have rich time-continuous outputs.

In this paper, we make the following contributions:
• We focus on the problem of testing Stateflows with mixed discrete-continuous behaviours. We propose two new test case selection criteria, output stability and output continuity, with the goal of selecting test inputs that are likely to produce continuous outputs exhibiting instability and discontinuity failures, respectively.
• We adapt the whitebox coverage and the blackbox output diversity selection criteria to Stateflows, and evaluate their fault revealing power for continuous behaviours. The former is defined based on traditional state and transition coverage for state machines, and the latter is defined based on the recent output uniqueness criterion [?].
• We evaluate the effectiveness of our newly proposed and the adapted selection criteria by applying them to three Stateflow case study models: two industrial and one public domain. Our results show that RESULT.

Organization of the paper.

2. BACKGROUND AND MOTIVATION
Motivating example. We motivate our work using a simplified Stateflow from the automotive domain which controls a supercharger clutch and is referred to as the Supercharger Clutch Controller (SCC). Figure 1(a) represents the discrete behaviour of SCC, specifying that the supercharger clutch can be in two quiescent states [?]: engaged or disengaged. The clutch moves from the disengaged to the engaged state whenever both the engine speed engspd and the engine coolant temperature tmp fall inside the specified ranges [smin..smax] and [tmin..tmax], respectively. The clutch moves back from the engaged to the disengaged state whenever either the speed or the temperature falls outside its respective range. The variable ctrlSig in Figure 1(a) indicates the sign and magnitude of the voltage applied to the DC motor of the clutch to physically move the clutch between the engaged and disengaged positions. Assigning 1.0 to ctrlSig moves the clutch to the engaged position, and assigning -1.0 to ctrlSig moves it back to the disengaged position. To avoid clutter in our figures, we use engageReq to refer to the condition on the Disengaged -> Engaged transition, and disengageReq to refer to the condition on the Engaged -> Disengaged transition.

The discrete transition system in Figure 1(a) assumes that the clutch movement takes no time and, further, provides no insight into the quality of the movement of the clutch. Figure 1(b) extends it by adding a timer variable, time, to make the passage of time in the SCC behaviour explicit. The new transition system includes two transient states [?], engaging and disengaging, specifying that moving from the engaged to the disengaged state and vice versa takes six milliseconds. Since this model is simplified, it does not show the handling of alterations of the clutch state during the transient states. In addition, we note that the variable ctrlSig, which controls the physical movement of the clutch, cannot abruptly jump from 1.0 to -1.0, or vice versa. To ensure safe and smooth movement of the clutch, ctrlSig has to move gradually between 1.0 and -1.0 and be described as a function over time, i.e., a signal. To express the evolution of the ctrlSig signal over time, we decompose the transient states engaging and disengaging into sub-state machines. Figure 1(c) shows the sub-state machine for the engaging state; the one for the disengaging state is similar. At the beginning (in state OnMoving), ctrlSig follows a steep ramp (function f) to move the stationary clutch from the disengaged position and accelerate it to a certain speed in about two milliseconds. Afterwards (in state OnSlipping), ctrlSig reduces the speed of the clutch following the gradual function g until about four milliseconds, ensuring that the clutch slows down as it gets closer to the crank shaft of the car. Finally, in state OnCompleted, ctrlSig reaches the value 1.0 and remains constant, causing the clutch to become engaged in about one millisecond. When the car is stationary, i.e., vehspd is 0, the clutch moves following the steep ramp f for three milliseconds and skips the OnSlipping phase before reaching the crank shaft in state OnCompleted.

[Figure 1: Supercharger Clutch Controller (SCC) Stateflow.
(a) Discrete behaviour: Disengaged -> Engaged on [(engspd > smin ∧ engspd < smax) ∧ (tmp > tmin ∧ tmp < tmax)] / ctrlSig := 1; Engaged -> Disengaged on [¬(engspd > smin ∧ engspd < smax) ∨ ¬(tmp > tmin ∧ tmp < tmax)] / ctrlSig := -1.
(b) Timed behaviour: transient states Engaging and Disengaging, entered on [engageReq] / time := 0 and [disengageReq] / time := 0 respectively, each incrementing time and exiting on [time > 5].
(c) Engaging state, mixed discrete-continuous behaviour: OnMoving (time++; ctrlSig := f(time)) -> OnSlipping (time++; ctrlSig := g(time)) on [¬(vehspd = 0) ∧ time > 2]; OnMoving -> OnCompleted on [(vehspd = 0) ∧ time > 3]; OnSlipping -> OnCompleted (time++; ctrlSig := 1.0) on [time > 4].]

Input and output. The Stateflow inputs and outputs are signals (functions over time). Each input/output signal has a data type, e.g., boolean, enum, or float, specifying the range of the signal. For example, Figure 2 shows an example input (dashed line) and output (solid line) signal for SCC. The input signal is related to engageReq and is boolean, while the output signal is related to ...
43
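The engaging sub-machine's output can be read directly off Figure 1(c). A minimal sketch, with f and g standing for the (unspecified) steep and gradual ramp functions:

```python
def engaging_ctrl_sig(time_ms, vehspd, f, g):
    """ctrlSig in the Engaging state, per Figure 1(c) (sketch).
    Guards follow the figure; f and g are assumed ramp functions."""
    if vehspd == 0:                  # stationary car: f until time > 3,
        return f(time_ms) if time_ms <= 3 else 1.0   # then OnCompleted
    if time_ms <= 2:
        return f(time_ms)            # OnMoving: steep ramp
    if time_ms <= 4:
        return g(time_ms)            # OnSlipping: gradual ramp
    return 1.0                       # OnCompleted: hold 1.0
```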
Selection Strategies Based on Search
• Input signal diversity
• White-box structural coverage
  • State coverage
  • Transition coverage
• Output signal diversity
• Failure-based selection criteria
  • Domain-specific failure patterns
  • Output stability
  • Output continuity
44
Output Diversity - Vector-Based
45
[Plot: output signal 1 and output signal 2 over time; vector-based diversity measures the distance between the signals]
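A minimal sketch of the vector-based measure: signals sampled at the same time points are compared as vectors, and selection would then maximize pairwise distances (the normalization choice is an assumption):

```python
import numpy as np

def vector_diversity(sig_a, sig_b):
    """Normalized Euclidean distance between two equally sampled
    output signals; larger means more diverse outputs (sketch)."""
    a, b = np.asarray(sig_a, float), np.asarray(sig_b, float)
    return float(np.linalg.norm(a - b) / np.sqrt(len(a)))
```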
Output Diversity - Feature-Based
46
[Taxonomy of signal features:
• value features: instant-value (v), constant (n), constant-value (n, v), increasing (n), decreasing (n)
• derivative features: sign-derivative (s, n), extreme-derivatives
• second-derivative features: discontinuity, 1-sided discontinuity, discontinuity with strict local optimum, 1-sided continuity with strict local optimum
Example signals A, B, C illustrate increasing segments and discontinuities.]
Failure-based Test Generation
47
• Search: maximizing the likelihood of the presence of specific failure patterns in output signals
• Domain-specific failure patterns elicited from engineers
[Plots of the CtrlSig output over time (0 to 2 s): an instability pattern, oscillating rapidly between -1.0 and 1.0, and a discontinuity pattern, a sudden jump in an otherwise continuous signal]
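Both patterns can be scored with simple signal-level fitness functions that the search then maximizes. A minimal sketch; the exact formulas used in the work may differ:

```python
import numpy as np

def instability_fitness(output):
    """Total up-and-down movement of the output signal; high values
    indicate oscillation, i.e., candidate instability (sketch)."""
    return float(np.sum(np.abs(np.diff(output))))

def discontinuity_fitness(output, dt):
    """Steepest one-step change; high values indicate a jump rather
    than a continuous evolution of the output (sketch)."""
    return float(np.max(np.abs(np.diff(output))) / dt)
```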
Results
• The test cases resulting from state/transition coverage algorithms cover the faulty parts of the models
• However, they fail to generate output signals that are sufficiently distinct from expectations, hence yielding a low fault revealing rate
• Output-based algorithms are much more effective
• Existing commercial tools: not effective at finding faults, not applicable to entire Simulink models
48
Reflecting
49
Commonalities
• Large input spaces
  • Combinatorial approaches not applicable
  • Coverage?
• Expensive testing: test execution time, oracle analysis effort
• Complex oracles (dynamic behavior)
• Testing is driven by risk
• Search-based solutions: highest-risk scenarios
50
Differences
• Model execution time, e.g., with or without a plant model
• Fitness function: exact or heuristic
• Single or multiple objectives
• Automated oracle or not
• Other techniques involved to achieve scalability: regression trees, neural networks, sensitivity analysis, ...
51
Related Work
52
Constraint Solving
• Test data generation via constraint solving is not feasible in the presence of:
  • Continuous mathematical models, e.g., differential equations
  • Library functions in binary code
  • Complex operations
• Constraints capturing (discretized) dynamic properties tend not to be scalable
53
Search-Based Testing
• Largely focused on unit or function testing, where the goal is to maximize model coverage, check temporal properties (state transitions) …
• To address CPS, we need more work on system-level testing, targeting dynamic properties in complex input spaces capturing the behavior of physical entities.
54
Future: A more general methodological and automation framework, targeting more complex and heterogeneous CPS models
55
Future Work
• Shifting the bulk of testing from implemented systems to models of such systems and their environments requires:
  • Heterogeneous modeling and co-simulation
  • Modeling dynamic properties and risk
  • Uncertainty modeling enabling probabilistic test oracles
  • Executable models at a level of precision appropriate for testing
• Use the results to make the best of the available time and resources for testing the implemented system with hardware in the loop
• Focus on high-risk test scenarios within budget and time constraints
56
References
• R. Ben Abdessalem et al., "Testing Advanced Driver Assistance Systems Using Multi-Objective Search and Neural Networks", ASE 2016
• R. Matinnejad et al., "Automated Test Suite Generation for Time-continuous Simulink Models", ICSE 2016
• R. Matinnejad et al., "Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers", ESEC/FSE 2015 (Distinguished Paper Award)
• R. Matinnejad et al., "MiL Testing of Highly Configurable Continuous Controllers: Scalable Search Using Surrogate Models", ASE 2014 (Distinguished Paper Award)
• R. Matinnejad et al., "Search-Based Automated Testing of Continuous Controllers: Framework, Tool Support, and Case Studies", Information and Software Technology, Elsevier (2014)
57
Testing Dynamic Behavior in Executable Software Models
Lionel Briand
** WE HIRE! **
July 18, ISSTA 2016