
A Methodology for Automating Assurance Case Generation

Shreyas Ramakrishna∗, Charles Hartsell∗, Abhishek Dubey∗, Partha Pal†, Gabor Karsai∗
∗Institute for Software Integrated Systems, Vanderbilt University, Nashville, TN, USA
†Raytheon BBN Technologies, Cambridge, MA 02138

Abstract—Safety Case has become an integral component for safety certification in various Cyber Physical System domains including automotive, aviation, medical devices, and military. The certification processes for these systems are stringent and require robust safety assurance arguments and substantial evidence backing. Despite the strict requirements, current practices still rely on manual methods that are brittle and lack a systematic approach or thorough consideration of sound arguments. In addition, stringent certification requirements and ever-increasing system complexity make ad-hoc, manual assurance case generation (ACG) inefficient, time consuming, and expensive. To improve the current state of practice, we introduce a structured ACG tool which uses system design artifacts, accumulated evidence, and developer expertise to construct a safety case and evaluate it in an automated manner. We also illustrate the applicability of the ACG tool on a remote-control car testbed case study.

Index Terms—Assurance Case, Safety Case, Goal Structuring Notation, Automated Generation.

ABBREVIATIONS
AC    Assurance Case
ACG   Assurance Case Generation
AEBS  Automatic Emergency Braking System
ALC   Assurance-based Learning-enabled CPS
CPS   Cyber Physical System
GSN   Goal Structuring Notation
SC    Safety Case

I. INTRODUCTION

Design of an Assurance Case (AC) for safety-critical CPSs has become an important industrial requirement. An assurance case [1], [2] is a structured argument which composes different pieces of evidence to show that system-level goals have been satisfied. The goals (or claims) refer to the properties of the system being monitored, such as safety, reliability, and security, and evidences are facts about a component of the system that are accumulated through prior research or experiments. An argument links relevant evidences to the claims, and can be deterministic, probabilistic, or qualitative. A Safety Case (SC) is a specialized assurance case widely used for assuring system-level safety in CPS domains like aviation [3], military [4], and automotive [5].

Despite its importance, construction of a safety case is often done manually by developers without a systematic approach or thorough consideration of safety arguments. Conventional safety case reports are long textual arguments which typically try to communicate and argue about the safety of a system. However, unclear and poorly structured textual language (mostly English) has always been a problem (explained in [6]) in communicating safety arguments among the different designers or operators involved. To overcome the irregularities with conventional textual safety reports and to make the documentation of safety cases easy to read, graphical structures such as Claims-Argument-Evidence (CAE) [7] and Goal Structuring Notation (GSN) [6] were introduced.

GSN is the more widely used of the two. It is a graphical way of constructing a safety case using individual elements of claims, sub-claims, assumptions, arguments, and evidences. The idea behind the goal structure is to show how the top-level claim or goal can be decomposed into multiple sub-goals connected with some argument structure. This process is repeated to further decompose sub-goals until it reaches the leaf goals, which can be directly supported by low-level component evidences. This graphical structuring of arguments has certainly simplified and improved the comprehension of assurance arguments across all stakeholders, thus encouraging its use in various safety-critical industries (listed in [6]). However, as explained in [8], GSN has only provided a simplified means of expressing arguments but has not improved the quality of the argument structure itself. Also, the design of the GSN is still performed manually by human experts, which makes it time consuming and error prone.

There are several tools [9]–[12] that support development of CAE or GSN for safety cases. These tools provide excellent visual aids and editors for rapid prototyping and management of large safety cases. However, they do not provide a mechanism to automate the construction of the CAE or GSN. These tools still require a human expert to manually generate the safety arguments that connect the goals to the low-level evidences. Also, as discussed in [13], these tools lack a formal basis, which has limited their automation capability. Additionally, most of these tools do not provide a mechanism to quantify the confidence of the developed safety case, which is an important functionality. These limitations motivate the need for an automated tool that can develop and evaluate a safety case with little human involvement.

Our Contributions: In this work, we introduce an ACG tool that uses design artifacts of the target system along with the


Fig. 1: DeepNNCar is a resource-constrained autonomous RC car that uses a camera, LIDAR, IR-Optocoupler, and Raspberry Pi.

evidences aggregated by human experts to generate a safety case GSN starting from a given certification criteria (CC). Our goal with the ACG tool is to provide a structured method for the construction and evaluation of an assurance argument in support of a given certification criteria while automating the process to reduce human involvement. To realize these goals, the tool uses a classic divide-and-conquer strategy leveraging iterative search to decompose complex GSN goals and to find evidences that match them. We follow a rapid prototyping approach to realize the components of the ACG tool. Specifically, our contributions are as follows:
• We elaborate the three-step methodology of the ACG tool that automates the safety case generation process.
• We discuss the three core steps of Link-Seal-Expand for iterative goal decomposition and automated GSN construction.
• We discuss a method using a confidence score to evaluate the credibility of the generated safety case.
We evaluate the ACG tool on an illustrative example of an AEBS for an RC car testbed called DeepNNCar [14]. We hypothesize this automated safety assurance construction method would reduce the cost of assuring CPS, provide a robust process that minimizes human involvement, and accelerate the entire process of safety case generation.

Outline: In Section II, we discuss a few background concepts required to understand this work. Section III sets up an example of an AEBS, which is used throughout the paper to explain the ACG concepts. Section IV describes the core components of the ACG tool. In Section V, we illustrate the utility of the ACG tool in the context of the AEBS example. Section VI discusses a safety case evaluation scheme using a confidence score as a metric. Section VII discusses the related work. In Section VIII, we briefly discuss the capabilities of the ACG tool and list a few possible future enhancements. Finally, in Section IX we present our conclusion.

II. BACKGROUND

This section provides an overview of the safety case, GSN, and the ALC Toolchain, which are referred to extensively throughout this paper.

A. Safety Case (SC)

A Safety Case has become an important requirement to prove the functional safety of safety-critical CPS, and is a widely adopted standard in the automotive, aerospace, and military industries. A safety case is a structured argument made using evidences to support the different claims made about the properties of the system. For every component or subsystem, a claim can be made about its safety, reliability, availability, security, etc. Then either deterministic, probabilistic, or qualitative arguments can be used along with evidences to prove whether the claim about the system holds. The elements of a safety case, as explained in [1], are:

• Claim: The property of a system that requires assurance (e.g. safety, reliability, availability, security, etc.)
• Evidence: Facts, sub-goals, assumptions, functions, and sub-arguments which provide conclusive support to prove the claims made about the system.
• Arguments: A linking structure between the claim and the supporting evidence, used to show how the claims are backed by the evidence available for the system.
• Inference/Operators: Provides rules or mechanisms to transform the arguments.

Despite the importance of designing a robust safety case for safety-critical systems, there have been cases (explained in [6]) like the Clapham Rail Disaster [15] and the Piper Alpha offshore oil and gas platform disaster [16] where substandard safety documentation did not conform to any standards and lacked any systematic approach by designers. These incidents illustrate the hazards of not clearly understanding, communicating, and documenting a safety report. The main problem seen in these incidents is the use of free text (e.g., English) in the safety reports. The ambiguity of free text hinders designers and operators at different levels from generating a robust safety report. This problem led to the requirement of a structured notation to express claims, arguments, and evidences, which in turn led to the introduction of Claims-Argument-Evidence (CAE) and Goal Structuring Notation (GSN).

B. Goal Structuring Notation (GSN)

Goal Structuring Notation (GSN) [6] was introduced by Tim Kelly at the University of York. It is a graphical argument notation which explicitly represents the elements of a safety case (claims, sub-claims, requirements, context, assumptions, and evidence) as a node structure and maps the relationships that exist between these nodes. The purpose of this graphical goal structure is to show an iterative breakdown of the goal into sub-goals, with a mention of the assumptions and context under which the decomposition can be made. The iterative decomposition continues until no further decomposition is possible or there is sufficient evidence to support the parent claim.

The GSN, with its graphical representation, clearly removes the ambiguities involved in defining a safety case using textual language, but this does not qualitatively imply anything about the argument itself (as explained in [8]). The arguments, sub-goals, and evidences constituting a safety case can be imperfect, and the conventional GSN has no justification (or rationale) to indicate if the sub-goals or the evidence is


Fig. 2: (a) The operation of the DeepNNCar is split into sensing, state estimation, and control tasks. F represents the functionality of the components in the system, HW represents the hardware component responsible for the function, and SW represents the software component responsible for the function. The arrows between the tasks represent the interdependence of the different components and their functionality. (b) Function-component mapping of DeepNNCar operations, showing the different hardware and software components that contribute to each function.

sufficient to support the safety case. Uncertainty in a sub-goal's supporting evidence can lower the assurance of the entire safety case.

To overcome this uncertainty problem in the conventional GSN, Hawkins et al. [17] proposed a new structure of safety arguments called assured safety arguments, which extends the conventional safety argument by decomposing it into two separate arguments: (1) a safety argument, which performs the decomposition of the safety goal and presents a strategy to explain the reasoning behind it, and (2) a confidence argument, which holds the justification about the sufficiency of the confidence in the safety argument being made. The extension with confidence arguments provides a mechanism to backtrack the branching decisions of the GSN while making it robust. We are currently working on extending our GSN with a rationale block which holds a justification about the evidence used while selecting a (sub)goal.

C. ALC Toolchain

The ALC Toolchain [18] provides an integrated set of tools for the development of CPS, with a particular focus on systems using Learning Enabled Components (LECs). The toolchain supports GSN safety cases and allows individual nodes within a safety case to contain links to other artifacts including system architectural models, testing results and analysis, formal verification results, etc. Within a GSN argument, these linked artifacts can be interpreted as needed based on the desired task, such as justification of the argument structure, supporting evidence blocks, or system functional decomposition, among others. The ALC Toolchain is the future implementation platform for the automated ACG method presented in this paper and is referenced for implementation details in the remaining sections.


Fig. 3: The high-level approach of the Assurance Case Generation tool, which uses curated evidences and the target system's design artifacts to design a safety case GSN from the certification criteria.

Fig. 4: The AEBS case study for the DeepNNCar platoon, where the follower car (C2) is required to maintain a minimum safe distance dmin from the leader car (C1).

III. AN ILLUSTRATIVE EXAMPLE

To illustrate our proposed ACG tool, we introduce a realistic application of an Automatic Emergency Braking System (AEBS) using the RC car testbed DeepNNCar. AEBS is a feature that automatically overrides any driver input and applies maximum braking to immediately stop the car in the case of an imminent collision. To motivate the AEBS safety case, we consider the example of a car platoon as shown in Fig. 4. Throughout this text, the front car will be referred to as the "Leader" and the second car as the "Follower".

These cars move around an indoor race track with varying speed (Vt) and steering controls (St), and are equipped with several sensors including a 2D LIDAR, an IR Opto-coupler, and Decawave positioning [19]. The sensing, perception, and control operations of these cars are shown in Fig. 2. At each time step t, the sensors of the cars capture different observations Ot, which include the current position pt = (xt, yt), current speed (Vt), current steering (St), and the distance to the car in front (dt). These observations are used by different software components in the perception and state estimation block to generate information (it), which includes the lanes detected (Lt), object detected (Objt), and slip status (slipt). These values are then passed to the different controllers which calculate the required actuation commands (at), including the steering and speed Pulse Width Modulation (PWM) values.

Certification Criteria: To prove the AEBS reliably works, we need to assure that the follower car always maintains a minimum safe separation distance (dmin) from the leader car. This follows from the reasoning that, for dt ≥ dmin, the AEBS system has enough time to compute and apply the reverse PWM required to slow or stop the car from its current speed. Mathematically, the problem can be expressed as:

dt ≥ dmin ∧ (obj)     (1)

So, if there is a car detected (obj = 1) by the perception subsystem, and if the distance measurement from the LIDAR dt is always greater than the minimum safe threshold distance dmin, then we can prove in the safety case that the cars will always avoid collision under a set of assumptions. From Eq. (1), we see the safety case requires a conjunctive claim proof that the object (car) is detected and that the distance from the LIDAR dt is always greater than dmin.
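To make the criterion concrete, the following minimal sketch (not part of the paper's tooling) transcribes the conjunction in Eq. (1) as a predicate and evaluates it over a logged trace; the variable names and the value of D_MIN are illustrative assumptions.

```python
# Sketch: Eq. (1) as a predicate over (LIDAR distance, object detected) samples.
# The threshold and field names are illustrative, not DeepNNCar software values.

D_MIN = 0.5  # assumed minimum safe separation distance, in meters


def aebs_claim(distance: float, detected: bool, d_min: float = D_MIN) -> bool:
    """Eq. (1): the leader car is detected and the LIDAR distance is >= d_min."""
    return detected and distance >= d_min


# Example trace of (LIDAR distance, object detected) samples.
trace = [(1.2, True), (0.8, True), (0.6, True)]
print(all(aebs_claim(d, obj) for d, obj in trace))  # True if the claim holds throughout
```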

Further, the first step of decomposing the certification criteria for the AEBS is performed based on the states (stopped, moving) of the cars (we assume that we know the states and positions of both cars). For the two-car platoon, there are


Fig. 5: Phases involved in automated safety case generation using the ACG tool.

four possible operating modes: mode 1 - Leader stationary & Follower moving, mode 2 - Leader moving & Follower moving, mode 3 - Leader moving & Follower stationary, and mode 4 - Leader stationary & Follower stationary. For the AEBS case study, we are particularly interested in modes 1 and 2, where the follower car is moving. A similar iterative decomposition is performed using the proposed ACG tool until the lowest-level leaf node with conclusive evidence is reached (illustrated in Section V).

IV. THE ACG TOOL

The Assurance Case Generation tool (Fig. 3) is responsible for automatically generating the safety case for a given certification criteria and evaluating that safety case with a confidence score (discussed in Section VI). The generated safety case is expressed as a GSN. Fig. 5 illustrates the three phases involved in the ACG process:
1) Pre-processing involves accumulating system design artifacts and curating an evidence store about the components of the target system.
2) Structuring Claim involves construction of a structured claim from the given certification criteria.
3) Iterative Goal Decomposition involves applying the decomposition operators (Link-Seal-Expand) to decompose the GSN until the leaf nodes are reached.

The steps involved in each phase are elaborated in greater detail in the subsections below.

A. Design Artifacts

The proposed ACG tool will systematically leverage the system design information to automate the decomposition of the certification criteria into smaller sub-claims that can eventually be mapped and sealed using appropriate evidences. One way of using the entire system information is by breaking it into different graphical representations. These graphs can then be used to find supporting evidences and relational operators among sub-claims during the safety case construction process. Some of the graphs that can be used are listed below (illustrative graphs for the DeepNNCar AEBS example are referenced where available; a sketch of one such representation follows the list):
• System Functional Breakdown (SFD) is a logical breakdown of the system functionality that is involved in or responsible for a specific mission. (Fig. 2-b)
• System Physical Decomposition is a logical breakdown of the system into sub-systems, functions, and components. (Fig. 2-a)
• Interconnectivity Graph is a graph with the interconnectivity among the different hardware and software components. (Fig. 2-a)
• Behavioral Graph captures the activities associated with a system/component.
• Mapping Diagrams are graphs which indicate the mapping of functions to components, software blocks to hardware blocks, or activities to hardware or software, etc. (Fig. 2-a)
• Ontology Graph uses a common, domain-specific set of terms and relationships to support mapping of concepts across the other graph-of-graph elements.
• System Architectural Model captures all components in the system and the interconnections between them. An architectural model constructed in the ALC Toolchain for the DeepNNCar is shown as a "Design Artifact" in Fig. 3.
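As an illustration, the fragment below sketches how the function-component mapping of Fig. 2-b and a slice of the component dependencies described in Section V could be captured as plain dependency dictionaries; the data-structure choice and helper function are assumptions, not the ALC Toolchain's actual artifact format.

```python
# Sketch: the function-component mapping of Fig. 2-b and a fragment of the
# component dependency graph as plain dictionaries. Illustrative only.

FUNCTION_COMPONENT_MAP = {
    "Sensing":          ["Camera", "LIDAR", "IR-Optocoupler"],
    "Detection":        ["Lane Detection", "Object Detection"],
    "State Estimation": ["Current Speed", "Current Position", "Obstacle Distance", "Slip Status"],
    "Driving":          ["LEC Steer", "OpenCV Steer", "Driving Manager", "PWM Applicator"],
    "Braking":          ["Braking Manager"],
}

# A fragment of the component-level dependencies used during goal decomposition.
COMPONENT_DEPENDENCIES = {
    "Obstacle Distance": ["LIDAR"],
    "Object Detection":  ["Camera"],
    "Slip Status":       ["Current Speed"],
    "Current Speed":     ["IR-Optocoupler"],
}


def components_for(function: str) -> list[str]:
    """Return the components that realize a given system function."""
    return FUNCTION_COMPONENT_MAP.get(function, [])


print(components_for("Braking"))  # ['Braking Manager']
```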

B. Curated Evidence Store

The Evidence Store is a table consisting of information about the various components and functions, the evidence artifacts supporting their correct operation, and all assumptions required for the components to work correctly. Evidence artifacts can take any form (e.g., statistical analysis of test data, analytical analysis, etc.), but must provide enough proof that the corresponding component works reliably under the stated assumptions. It is up to the developer to determine when the available evidence is sufficient for use in the safety case.

Curated evidence for the different components of DeepNNCar's operation is shown in Table I. The supporting evidences for the components were gathered from a number of randomized hardware and software tests (shown in Fig. 2) under different operating environments to evaluate when the components succeed and fail. For example, (1) the camera module was tested under different lighting conditions and was found to work best in environments with evenly distributed lighting of 800-1000 lumens, and (2) the LIDAR module was tested under different indoor and outdoor operating environments with different obstacles and was found to work reliably and accurately in smaller indoor rooms (due to its operating range of 12 m). Similar information about the other components of DeepNNCar was accumulated as evidence.
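A minimal sketch of how one curated evidence entry (row G2 of Table I) might be represented in memory is shown below; the dataclass fields are illustrative assumptions rather than the toolchain's schema.

```python
# Sketch: a possible in-memory representation of one curated evidence store
# entry (row G2 of Table I). Field names are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class EvidenceEntry:
    goal_id: str                              # e.g. "G2"
    claim: str                                # claim supported by the evidence
    evidence_type: str                        # e.g. "H/W & S/W Testing"
    function: str                             # system function (Fig. 2-b)
    component: str                            # component/subsystem providing it
    assumptions: list[str] = field(default_factory=list)
    artifact_uri: str = ""                    # link to test results, analysis, etc.


lidar_evidence = EvidenceEntry(
    goal_id="G2",
    claim="LIDAR Module provides distance of the obstacles in range (0,12) m and (0°,359°).",
    evidence_type="H/W & S/W Testing",
    function="Sensing",
    component="LIDAR",
    assumptions=["Leader car is within scan range.", "LIDAR scan motor is working.", "Is powered."],
)
print(lidar_evidence.component)  # LIDAR
```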

Once we have accumulated evidence for all the system's components and compiled the design artifacts required to understand the alternatives and dependencies in the system,


we can start the safety case generation process (illustrated in Fig. 5) that includes (1) structuring claims for the given certification criteria, and (2) iterative goal decomposition.

C. Structuring Claims for GSN Root Node

The first step of generating the safety case is to design a structured claim for the root node of the GSN from the certification criteria. After this step, we expect the certification criteria given in natural language to be expressed as a structured claim statement referring to elements in the design artifacts. For the AEBS case study, the structured claim for the certification criteria (maintain a safe distance and avoid collision) is expressed in Eq. (1). This mapping of the certification criteria to the claim was manually performed by a human expert based on the states of the cars (explained in Section III) in which the AEBS is important. Once we have a claim for the root node of the GSN, we can start an iterative decomposition process using some basic operators (Link-Seal-Expand) and logical connectives.

D. Core Steps for Decomposing GSN Goal

After obtaining a structured claim for the root node of the GSN, it must be further decomposed into sub-goals (and sub-claims) until a leaf node is reached. In this work, we define leaf nodes to be component-level nodes which have direct supporting evidence in the curated evidence store. The decomposition in the ACG tool is handled by three primary operations, Link, Seal, and Expand, along with basic logical connectives.

1) Link: To avoid duplicating evidence for similar GSN branches, we use the Link operation to link evidence nodes that have been previously used in different branches of the GSN. Effectively, if the Link operation has seen a relationship proved previously among the different sub-goals, then this information can be used in the future to stop the exploration for evidence. This feature of Link is helpful in reducing the required number of iterations (explained for the DeepNNCar example in Section V).

For this, we use Link as a background step which involves creating an evidence repository with modular evidences for various claims which can be reused in different GSN branches or safety cases. As discussed above, the construction and management of the curated evidence store itself can be partially automated. However, manual input from the developer is still required to link each source of evidence to the relevant system components and safety claims.
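The sketch below illustrates the Link idea as a simple cache of already-sealed GSN fragments keyed by their claim text; the class and method names are hypothetical and only meant to show how reuse avoids re-exploring a branch.

```python
# Sketch: the Link step as a background cache of already-sealed GSN fragments.
# Reusing a cached fragment avoids re-exploring a branch argued elsewhere.
# Names and structure are illustrative assumptions.

class FragmentCache:
    def __init__(self):
        self._fragments = {}  # claim text -> sealed GSN fragment (any object)

    def store(self, claim: str, fragment) -> None:
        """Record a fully sealed GSN fragment for later reuse."""
        self._fragments[claim] = fragment

    def lookup(self, claim: str):
        """Return a previously sealed fragment for this claim, or None."""
        return self._fragments.get(claim)


cache = FragmentCache()
cache.store("LIDAR Module provides distance of the obstacles in range (0,12) m.", {"sealed": True})
print(cache.lookup("LIDAR Module provides distance of the obstacles in range (0,12) m.") is not None)
```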

2) Seal: The Seal operation queries the curated evidence store to look for supporting evidence for a (sub)goal, and then decides whether the available evidence is sufficient to stop the iteration or further evidence is required. Evidence is said to be sufficient if the claim in the (sub)goal cannot be further decomposed and there is evidence available to directly support the claim. Some (sub)goals may have evidence to support the claim, and in such cases the iteration looking for supporting evidence can be stopped. However, in cases involving higher-level goals, the available evidence may not be sufficient to directly support a claim. In this case, it is necessary to query the evidence store repeatedly until sufficient evidence is available. For goals with supporting evidence from the store, the evidence with maximum confidence (explained in Section VI) that satisfies the goal under the mutual satisfaction of the assumptions is selected. It is also possible that a sub-goal may not have sufficient evidence; in such cases we declare the node to be orphaned. Whenever an orphaned node is found, we can stop the search, since the safety claim cannot be argued until new evidence to support the node is available.

If the Seal operation has found linking evidence (from the evidence store or the Link operation), then it seals off the sub-goal without allowing further exploration. A component may have multiple supporting evidences based on the operating conditions and its functionality. In such cases the Seal operation has to select one evidence artifact from the available options, and this selection process is discussed in Section VI.
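A sketch of the Seal operation as a query over the evidence store is given below, selecting the compatible evidence with the highest confidence and returning None for an orphaned goal; the dictionary-based data shapes are assumptions made for illustration.

```python
# Sketch: Seal as a query over the evidence store. Evidence is compatible when
# its claim matches the goal and its assumptions are a subset of the goal's
# assumptions; the highest-confidence match is selected. Illustrative shapes.

def seal(goal, evidence_store, goal_assumptions):
    """Return the best matching evidence for a goal, or None (orphaned goal)."""
    candidates = [
        ev for ev in evidence_store
        if ev["claim"] == goal and set(ev["assumptions"]) <= set(goal_assumptions)
    ]
    if not candidates:
        return None  # no direct evidence: the goal must be expanded or declared orphaned
    return max(candidates, key=lambda ev: ev["confidence"])


store = [
    {"claim": "G2", "assumptions": ["Leader car is within scan range."], "confidence": 0.7},
    {"claim": "G2", "assumptions": ["Leader car is within scan range."], "confidence": 0.9},
]
print(seal("G2", store, ["Leader car is within scan range.", "Is powered."])["confidence"])  # 0.9
```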

3) Expand: When no evidence directly supporting a goal is found during the Seal phase, the goal must be decomposed into multiple sub-goals with corresponding assumption, context, and strategy nodes. The Expand operation drives this decomposition of goals into sub-goals using the available design artifacts. Each goal in the generated safety case corresponds to a system function which may require inputs from other components in order to operate correctly. The system architectural model in Fig. 2 shows these dependencies between components and is one design artifact used to drive goal decomposition. The strategy which connects the sub-goals can be formalized using different logical combination functions. This step refines the argument strategy of the decomposed goal node to find an appropriate logic gating function.

Logical Connectives (gating functions): The decomposition of the GSN results in the goal node being split into sub-goals that can be connected using different gating functions, including AND and OR. For the AEBS example, the claim resolution step results in two scenarios, Mode 1 and Mode 2, which can be connected by an OR operator as shown in Fig. 7. This decomposition is performed based on the states of the cars (explained in Section III). A similar illustration of using the gating functions to combine sub-goals is shown in Fig. 7 (explained in Section V).
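The following sketch shows the Expand operation as a lookup in a component dependency table (a fragment based on Fig. 2), producing sub-goals joined by a gating function; the representation of the result is an illustrative assumption.

```python
# Sketch: Expand decomposes a goal into sub-goals for the components it depends
# on and records the gating function (AND/OR) used by the strategy node.
# The dependency fragment and result format are illustrative.

DEPENDS_ON = {
    "AEBS":              ["Object Detection", "Obstacle Distance", "Braking Manager"],
    "Obstacle Distance": ["LIDAR"],
    "Object Detection":  ["Camera"],
}


def expand(goal: str, gate: str = "AND"):
    """Decompose a goal into sub-goals joined by a logical gating function."""
    children = DEPENDS_ON.get(goal, [])
    return {"goal": goal, "gate": gate, "sub_goals": children}


print(expand("AEBS"))
# {'goal': 'AEBS', 'gate': 'AND', 'sub_goals': ['Object Detection', 'Obstacle Distance', 'Braking Manager']}
```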

E. Implementation and Automation

The ALC Toolchain introduced in Section II-C will be used to implement the ACG tool, including automation where appropriate. The following paragraphs explain how each step in the ACG workflow, shown in Fig. 5, will be automated.

The Pre-Processing step consists of accumulating the available design artifacts and constructing an evidence store. For systems designed using the ALC Toolchain, all design artifacts are automatically cataloged in a version-controlled database and may be cross-referenced from other models as needed, effectively eliminating the need for a developer to collect artifacts manually. Systems designed outside of the toolchain may upload design artifacts to be added to the


G1 (Camera, Sensing): Camera Module captures image of Leader car in range (0.1, 1) m. Evidence type: H/W & S/W Testing. Assumptions: 1) Light intensity above 100 lumens. 2) Leader car is within range (0.1, 1) m. 3) Is powered.
G2 (LIDAR, Sensing): LIDAR Module provides distance of the obstacles in range (0, 12) m and (0°, 359°). Evidence type: H/W & S/W Testing. Assumptions: 1) Leader car is within scan range. 2) LIDAR scan motor is working. 3) Is powered.
G3 (IR-Optocoupler, Sensing): IR Opto-coupler Module provides RPM information in range (16, 160). Evidence type: H/W & S/W Testing. Assumptions: 1) It is mounted on the chassis. 2) It can be occluded by the plastic piece on the wheel. 3) Is powered.
G4 (Lane Detection, Detection): Lane Detection Module detects lanes and orientation of track segment. Evidence type: S/W Testing. Assumptions: 1) Received frame from camera. 2) LD algorithm parameters are correct. 3) No light glare on track.
G5 (Object Detection, Detection): Obstacle Detection Module detects images of car in range (0.1, 1) m. Evidence type: S/W Testing. Assumptions: 1) Received frame from the camera. 2) MobileNet V2 model weights are correct.
G6 (Current Speed, State Estimation): Current Speed Module information is in range (0, 1) m/s. Evidence type: S/W Testing. Assumptions: RPM-speed conversion is correct.
G7 (Current Position, State Estimation): Current Position Module updates the car's position on the track. Evidence type: S/W Testing. Assumptions: Working on indoor track with lanes.
G8 (Obstacle Distance, State Estimation): Obstacle Distance Module provides distance of the obstacles in range (0, 12) m and (150°, 180°). Evidence type: S/W Testing. Assumptions: Obstacle is within the scanning range (0, 12) m.
G9 (Slip Status, State Estimation): Slip Status Module identifies wheel slip. Evidence type: S/W Testing. Assumptions: 1) Track surface is known. 2) Opto-coupler module is working correctly.
G10 (LEC Steer, Driving): Steer LEC Module provides steer in range (-30°, 30°). Evidence type: S/W Testing. Assumptions: 1) Receives frames from camera. 2) Trained deep-learning model weights are correct. 3) Trained model has seen the track before.
G11 (OpenCV Steer, Driving): OpenCV Steer Module provides steer in range (-30°, 30°). Evidence type: S/W Testing. Assumptions: 1) Receives frame from camera. 2) No light glares on track. 3) Lane-steer conversion is correct.
G12 (Braking Manager, Braking): Braking Manager Module provides reverse polarity RPWM to brake the car. Evidence type: S/W Testing. Assumptions: Track surface allows braking.
G13 (Driving Manager, Driving): Driving Manager Module provides steer PWM in range (10, 20) and speed PWM in range (15.58, 15.62). Evidence type: S/W Testing. Assumptions: Receives updated sensor and processed data (not stale).
G14 (PWM Applicator, Driving): PWM Applicator Module applies PWM at 100 Hz. Evidence type: H/W & S/W Testing. Assumptions: 1) Wiring to motors and GPIO done correctly. 2) Raspberry Pi is powered.

TABLE I: The sub-goals of the different functions, shown with their claims, evidence types, components, and the assumptions under which they hold.

catalog. Construction of the curated evidence store itself can be partially automated with use of this catalog. All artifacts in the catalog which represent sources of evidence for the safety case (e.g., results from system testing, formal verification, or any user-defined analysis methods) can be used to automatically populate the available evidence in the evidence store. However, the developer must manually link each piece of evidence to the corresponding component in the system architectural model, determine when the available evidence is sufficient to support a claim, and define any assumptions required for the evidence to be valid.

Next, the root node of the safety case must be derived from the certification criteria during the Structuring Claim step. This step requires translating a claim about the system specified in natural language into a formal assurance claim. While there is a significant body of research on automation for such tasks, it is outside the scope of the ACG tool presented in this paper. Instead, this step is not automated and must be completed manually by the developer.

The Iterative Goal Decomposition step interprets the available artifacts to drive the Link, Seal, and Expand operations. The Link step can be fully automated as a background task which is executed after each iteration of the Seal and Expand operations. When a branch of the safety case has been fully decomposed into sealed leaf nodes, this completed section can be added to the evidence store as a GSN fragment. This way, if the fragment appears again in another branch of the safety case, it can be immediately retrieved from the evidence store instead of repeating the iterative decomposition process. The Seal operation may be automated as a straightforward query of both the evidence store and any GSN fragments found by the Link operation. The Expand operation may also be automated with an appropriate model interpreter. Since each goal is linked to a component in the system architecture model, a graph-traversal algorithm can determine all components required for a goal to function correctly. This information, combined with the assumptions listed in the evidence store, is sufficient to decompose each goal into progressively smaller sub-goals. Finally, the iterative decomposition process using the Seal and Expand operations can be automated with an appropriate workflow.
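Putting the pieces together, the sketch below shows one way the iterative workflow of Fig. 5 could be automated as a worklist loop over hypothetical seal(), expand(), and lookup_fragment() callables; it is an illustration of the control flow, not the toolchain's implementation.

```python
# Sketch: the iterative decomposition workflow of Fig. 5 as a worklist loop.
# seal(), expand(), and lookup_fragment() are hypothetical callables supplied
# by the caller; their concrete realizations may differ in the ALC Toolchain.

def generate_safety_case(root_goal, seal, expand, lookup_fragment):
    sealed, orphaned = {}, []
    worklist = [root_goal]
    while worklist:
        goal = worklist.pop()
        fragment = lookup_fragment(goal) or seal(goal)  # Link lookup, then Seal
        if fragment is not None:
            sealed[goal] = fragment          # branch closed with evidence
            continue
        sub_goals = expand(goal)             # Expand into sub-goals
        if not sub_goals:
            orphaned.append(goal)            # no evidence and no further decomposition
        else:
            worklist.extend(sub_goals)
    return sealed, orphaned


# Tiny demo with toy callables: G expands to {G1, G2}; both have direct evidence.
demo_seal = {"G1": "E1", "G2": "E2"}.get
demo_expand = {"G": ["G1", "G2"]}.get
print(generate_safety_case("G", demo_seal, lambda g: demo_expand(g) or [], lambda g: None))
```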

Once a complete safety case has been constructed, automated evaluation and correctness checks can be implemented with an appropriate formal specification language (e.g., FORMULA [20]).


Fig. 6: A GSN fragment of the DeepNNCar's AEBS case study.

V. ACG FOR THE AEBS CASE STUDY

In this section, we first manually design a GSN for the AEBS example to outline the complexity of the graphing and to illustrate the dependence on manual effort in the design process. We then apply the ACG tool to the same example and iterate through its building blocks to generate the GSN automatically.

A. Manual GSN tree generation

A GSN fragment for the DeepNNCar's AEBS example is shown in Fig. 6. We manually designed the GSN using the evidence store (Table I) and the design artifacts (Fig. 2-a and Fig. 2-b). To prove the AEBS reliably works, we need to assure the claim has sufficient evidences in both Mode 1 and Mode 2, so we perform a parallel decomposition of the two modes in Fig. 6. The conventional GSN uses different shapes for representing the blocks. In this hierarchical decomposition, the top goal is referred to as the parent node, and the decomposed sub-goals are referred to as child nodes. The principal symbols used in the GSN construction are:

• Blue blocks represent the goals/sub-goals, e.g. the follower car does not collide with the leader car. Each blue goal is


decomposed until there is direct supporting evidence from the evidence store.

• Gray blocks represent the various assumptions under which the goals are satisfied. For the AEBS example, the top goal is satisfied under assumptions including that the track surface allows for permissible operation of the car, the lighting is sufficient for the sensors to work, etc. The assumptions of the parent goal are a superset of the assumptions made by the child nodes.
• Green blocks represent the strategy that will be used to prove that the goal holds.
• Orange blocks represent the evidences used to seal a goal. These evidences can be results of various randomized hardware or software tests.
• Purple blocks represent the sealed goals, i.e. goals which have been directly supported by evidence and for which further exploration is not possible or necessary.

In Fig. 6, the different decomposition levels are separated using dotted lines and are termed iterations. We can also see that a few sub-goals are sealed with evidences (represented by purple blocks), while some still require further exploration (vertical dotted-line extensions from the sub-goals in Iteration 3).

Designing this GSN for a complex system like DeepNNCar is time consuming and required us to iterate through the different goals and sub-goals. To avoid the hassle of manually graphing the large GSN structure, which spread across multiple pages, we applied the ACG tool introduced in Section IV to generate the GSN for the same AEBS example.

B. GSN generation using ACG tool

The safety case generation for the DeepNNCar's AEBS example using the ACG tool is shown in Fig. 7. For each of the modes identified earlier, we generate a safety case by iterating through the Link-Seal-Expand steps discussed in Section IV. The design artifacts (Fig. 2-a and Fig. 2-b) are used along with the evidence store (Table I) to generate the GSN.

Iteration 1: As the first iteration of the GSN generation, the evidence store is curated by the ALC Toolchain using design artifacts and tests on the components of the system. Once the evidence store is designed, the claim for the root node of the GSN is structured from the AEBS certification criteria by a human expert. Then, based on the states of the cars, the claim is decomposed in parallel into Mode 1 and Mode 2 as shown in Fig. 7.

Iteration 2: Seal – The ACG tool looks for supporting evidence for the goal (AEBS) in the evidence store. Since no supporting evidence can be found, the Expand operation is performed to decompose the goal. The AEBS node makes use of the functional complementary pattern, which shows that maintaining a safe distance, including stopping when the obstacle distance is less than dmin, requires the conjunction of three functions: obstacle detection, measuring obstacle distance, and the braking manager providing the appropriate PWM signal. This results in the three sub-goals {G8, G5, G12}, logically connected to the parent AEBS goal using the AND operator. Since there is no concluding evidence from the

Fig. 7: The graph generated by continuously iterating through the Link-Seal-Expand steps of the ACG tool.

current leaf nodes {G8, G5, G12}, the ACG tool iterates this step again to decompose these sub-goals further.

Iteration 3: Seal – The ACG tool looks for supporting evidences for the sub-goals {G8, G5, G12} in the evidence store. Since no supporting evidence can be found, the Expand operation is performed to decompose the goals. Looking at the evidence table and the functional breakdown graph (Fig. 2-b), the ACG tool performs the following goal decompositions: (1) the obstacle distance node (G8) depends on the LIDAR node (G2), (2) the obstacle detection node (G5) depends on the camera node (G1), and (3) the braking manager node (G12) depends on the slip status (G9), IR-Optocoupler (G3), obstacle distance (G8), and current position (G7) nodes (the latter only for the case when the two cars are moving). The sub-goals of the braking manager are logically connected using the AND operator. Again, since no evidence was found for the three sub-goals, the ACG tool iterates further.

Iteration 4: Seal – Again, the ACG tool looks for evidences for the sub-goals {G2, G1, G8, G9, G6, G7} in the evidence store to seal off the branches. The tool finds conclusive evidences for G1 and G2 and seals them. In this work we advocate the selection of the evidence with the maximum confidence score; however, most of the component sub-goals in this example have only one piece of supporting evidence, so we directly select the only available evidence to seal the nodes. All the other sub-goals, G6, G7, G8, and G9, do not have supporting evidence and must be further expanded. Expand – the obstacle distance node (G8) depends on the evidence from the


Fig. 8: The evidence evaluation tree for the LIDAR module of DeepNNCar, decomposing the confidence score into hardware component factors (scan frequency, measurement range), software component factors (software tools, programming language), and external factors (software and hardware development experience), each with associated weights and leaf-level assessments.

LIDAR node (G2), the slip status node (G9) depends on evidence from the current speed node (G6), which in turn depends on evidence from the IR-Optocoupler node (G3), and the current position node (G7) depends on evidence from the lane detection node (G4) (only for Mode 2, as in this case both cars are moving and the braking manager requires constant updates of the positions of the cars, which is not as important in Mode 1 where the leader car is stationary). Since none of these nodes are sealed, a further iteration is performed by the tool.

Iteration 5: Seal – The ACG tool again looks for evidences for the sub-goals {G6, G3, G4 (for Mode 2)} in the evidence store and finds that node G3 has supporting evidence. Also, the current speed node (G6) depends on G3, for which evidence has already been found in the evidence store. So, using the evidence of G3 as linking evidence, the tool seals both sub-goal nodes. However, G4 has no supporting evidence and hence is further expanded. Expand – the lane detection node (G4) is further decomposed into a single node, G1, and a further iteration is performed to find evidence for it.

Iteration 6: Seal – The ACG tool looks for evidence for the camera node (G1) to seal it off. Since it is a component node, direct evidence is available from the evidence store. Since every branch of the GSN is now sealed off with evidence, the iteration is finished and the safety case is complete.

VI. SAFETY CASE EVALUATION

Safety case evaluation is important for quantifying the confidence of the generated safety case [21]. For evaluating the credibility of the generated safety case, we compute a confidence score for the top goal node associated with the certification criteria. Our mechanism for identifying this score is based on a bottom-up approach, which propagates the confidence associated with the evidence used to seal a claim. The safety case evaluation in the ACG tool is currently performed manually by a human expert at design time, but we are working on automating it.

A. Confidence Score Estimation

Based on the operating context and the expert's assessments regarding the component module, every evidence node in the

Attribute                          Assessment                            Score
Scan Frequency                     precise at 2 Hz                       0.4
                                   precise at 4 Hz                       0.6
Measurement Range                  precise in range 0-3 m                0.8
                                   precise in range 3-6 m                0.6
                                   precise in range 6-12 m               0.5
Software Tools                     –                                     –
Programming Language               Python                                0.5
                                   C++                                   0.8
Software Development Experience    Programming knowledge                 0.4
                                   Communication protocols knowledge     0.7
Hardware Development Experience    PWM knowledge                         0.8
                                   Motor knowledge                       0.6
                                   UART knowledge                        0.7

TABLE II: The assessment scores for attributes of the LIDAR component of DeepNNCar.

GSN structure gets a confidence score, and the ones with the higher score will be chosen as the supporting evidence. The confidence score can be computed using the approach described in [8], which performs confidence evaluation using Evidential Reasoning (ER) [22]. ER is a concept of assimilating multiple attributes of a piece of evidence into a single coherent assessment. Specifically, the evidences of a claim are decomposed into various attributes {e1, e2, ..., en}, which are then further decomposed into sub-attributes, and this process is repeated until further breakdown is not possible. We refer to this structure as the evidence evaluation tree, and we have designed one such structure for the LIDAR module of DeepNNCar (see Fig. 8). The attributes and sub-attributes of the evidence tree vary according to the class of evidence (software module, hardware module) and the context of operation.

At the lowest level of the attribute decomposition, the designer can provide a score for the different assessments {a1, a2, ..., an} regarding the attribute. These assessments vary for different classes of evidences (software, hardware) and the context of their operation. A sample attribute assessment for the LIDAR module is shown in Table II. Every attribute can have an assessment and a score, which is statistically computed by a human expert based on their experience of how the component works under different scenarios. In Table II we have a few assessment scores for different attributes of a


LIDAR module, assigned based on various tests of the component. We found the LIDAR to most precisely measure distances of objects in the range 0-3 meters, so a score of 0.8 is assigned; similarly, the LIDAR's precision degrades when the object is in the range 6-12 meters, so a score of 0.5 is assigned. A similar assessment score was evaluated for each of the other attributes. These assessment scores are also referred to as belief functions.

Once we have the belief functions for all the leaf-level sub-attributes, and if the importance (weights) of the attributes toward the claim is available, then the ER algorithm [22] can be used to assimilate them. The algorithm is developed based on a multi-attribute evaluation framework and the evidence combination rule of Dempster-Shafer (D-S) theory of evidence [23]. The three steps involved in the algorithm (as explained in [24]) are: (1) weighting the belief distribution – weights are assigned to the belief distribution based on the importance of the attribute toward the top-goal safety claim, (2) aggregation – all the assessments of the basic sub-attributes are combined, and (3) generation of the combined belief degree – after aggregating the assessments for all the basic attributes, the combined belief degree is computed for the entire piece of evidence.

The weights of the attributes are important in computing the confidence score, and they can either be assigned directly by the designer based on each attribute's importance towards the goal, or computed using more elaborate methods such as pairwise comparison of attributes [25] (in this work the weights were chosen by the expert based on their intuition about the component). The beliefs are then propagated from the leaf nodes, combined with the weights, and summed with the scores of the sibling sub-attributes to compute an assessment score. This assessment score represents the overall confidence in the evidence. A similar evaluation can be performed for every evidence node in the GSN structure.
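The leaf-to-root propagation can be pictured with the following minimal sketch, which uses a plain weighted sum in place of the full ER combination; the tree shape, the weights, and the leaf scores (taken loosely from Table II) are illustrative assumptions rather than the tool's actual configuration.

```python
# Evidence evaluation tree as nested tuples: ("leaf", score) or
# ("branch", [(weight, child), ...]) with the children's weights summing to 1.

def evidence_confidence(node):
    kind, payload = node
    if kind == "leaf":
        return payload
    return sum(weight * evidence_confidence(child) for weight, child in payload)

lidar_evidence = (
    "branch", [
        (0.6, ("branch", [(0.7, ("leaf", 0.8)),     # precise in range 0-3 m
                          (0.3, ("leaf", 0.6))])),  # precise at 4 Hz
        (0.4, ("branch", [(0.5, ("leaf", 0.8)),     # PWM knowledge
                          (0.5, ("leaf", 0.7))])),  # UART knowledge
    ],
)
print(round(evidence_confidence(lidar_evidence), 3))  # overall confidence: 0.744
```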

Once a confidence score is available for all the sealed evidence nodes, we use compound semantics based upon the logical operators used in the strategy combining the sub-claims. Similar composition using logical operators has been applied in the literature [26], [27] for reliability estimation in CPS. The composition works as follows:

• The AND operator propagates the minimum confidence score.

• The OR operator propagates the maximum confidence score from all the available branches.

The safety case evaluation scheme then assigns a confidence score to the top goal node associated with the certification criteria.
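A minimal sketch of this compound semantics is shown below; it assumes the GSN goal/strategy structure is available as a small tree whose strategy nodes are labeled AND or OR and whose leaves carry the confidence scores of sealed evidence nodes. The particular structure and scores are hypothetical.

```python
def propagate(node):
    """node is ("evidence", score) or (operator, [children]) with
    operator in {"AND", "OR"}; AND keeps the minimum, OR the maximum."""
    kind, payload = node
    if kind == "evidence":
        return payload
    scores = [propagate(child) for child in payload]
    return min(scores) if kind == "AND" else max(scores)

# Hypothetical top goal: AND over one evidence node and an OR strategy
# offering two alternative pieces of evidence.
top_goal = ("AND", [
    ("evidence", 0.74),
    ("OR", [("evidence", 0.5), ("evidence", 0.8)]),
])
print(propagate(top_goal))  # 0.74 becomes the top goal's confidence score
```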

B. Evidence Coverage

We are also currently working on integrating the metric of evidence coverage as one of the attributes used in evidential reasoning. From the GSN structure we can infer that the claims supported by the evidence should always be a superset of the claims made in the goal node. Also, the assumptions made by the evidence should be a subset of the assumptions made in the context of the goal node which we are trying to seal with the evidence. We use this containment relationship among the GSN blocks at different hierarchical levels to evaluate the quality of the supporting evidence, and we term this method of evaluating evidence based on its containment relation to the higher-level goal "Evidence Coverage". It is based on our hypothesis that a higher score should be given to evidence that provides the largest margin between the assumptions. Unlike the confidence score, which is quantitative, this metric is qualitative.
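The containment checks behind evidence coverage can be illustrated with the short sketch below, which represents claims and assumptions as plain string sets; the particular claims, assumptions, and the margin measure are illustrative assumptions, not the tool's internal representation.

```python
def covers(goal_claims, goal_assumptions, ev_claims, ev_assumptions):
    """Evidence is acceptable for a goal when its claims are a superset of
    the goal's claims and its assumptions are a subset of the goal's context."""
    return ev_claims >= goal_claims and ev_assumptions <= goal_assumptions

def assumption_margin(goal_assumptions, ev_assumptions):
    """The fewer assumptions the evidence needs relative to the goal's
    context, the larger the margin (and the better the coverage)."""
    return len(goal_assumptions - ev_assumptions)

goal_claims = {"stops within 1 m of an obstacle"}
goal_assumptions = {"dry track", "speed below 0.5 m/s", "LIDAR operational"}
ev_claims = {"stops within 1 m of an obstacle", "stops within 2 m of an obstacle"}
ev_assumptions = {"dry track", "LIDAR operational"}

print(covers(goal_claims, goal_assumptions, ev_claims, ev_assumptions))  # True
print(assumption_margin(goal_assumptions, ev_assumptions))               # 1
```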

VII. RELATED WORK

As explained in [28], current safety case construction practices can be divided into one of three categories: Prescriptive, where standards explicitly define the required development processes and procedures; Goal-Oriented, where high-level safety goals are specified but the process for achieving them is flexible and left to the system developer; or Blended, which uses aspects from both of the other categories. Rinehart et al. examine the processes used in various industries and show that prescriptive techniques tend to be used in industries with well-understood technologies and a history of safe operation. However, they note there is a general trend toward goal-oriented approaches, similar to the ACG tool proposed in this paper, such as the Risk-Informed Safety Case [29] from NASA. Goal-oriented approaches appear to be a suitable option for CPSs which operate in highly uncertain environments.

There are several commonly encountered pitfalls in the construction of safety cases. Leveson [30] identifies a variety of these pitfalls and provides suggestions for avoiding them. For one, safety case construction and system safety analysis should be an ongoing process started early in the design cycle, as opposed to a discrete activity performed near the end of system development for the purpose of certification. Leveson also shows that safety cases are prone to confirmation bias and argues that developers should instead attempt to show when a system can become unsafe. "The Nimrod Review" [31] provides examples of both of these fallacies: the Nimrod aircraft was inherently assumed to be safe due to a history of safe operation, and the resulting safety case did not provide any real improvement in the safe operation of the aircraft. While some of the lessons learned from these examples require human input and understanding to address, other issues can be mitigated with the use of appropriate assurance case tools. These include enforcing the use of sound safety case patterns, promoting early and continuous development by reducing the time required for construction, and tightly integrating system assurance with the relevant system models and documentation, among others.

To simplify the ACG process, several commercial and research tools have been developed. Maksimov et al. [32] provide a comprehensive survey of the assurance and safety case tools developed in the last two decades. Their survey reports 46 assurance case tools and evaluates them based on their capability to generate, maintain, assess, and report safety cases. For comparison we list a few commercial and research tools. Commercial tools mainly focus on providing a platform for developing and managing assurance cases. The Assurance Case Construction and Evaluation Support System (ACCESS) [9] is a tool based on Microsoft Visio that aids in the creation and maintenance of safety cases; it provides a platform for rapid prototyping, node creation, and node coloring of the GSN argument structures it creates. The CertWare workbench [12] is an Eclipse-based tool that provides functionalities like multi-user safety case editing, change tracking, standard safety case templates, and cheat sheets for simple and fast safety case development. Similarly, the Assurance and Safety Case Environment (ASCE) [10] provides an environment for simple safety case creation and management, and allows for simple and low-cost generation of safety case reports. The D-Case editor from DEOS [11] is an open-source platform implemented as an Eclipse plug-in to generate and manage GSN argument structures.

In addition to these, several research tools have made significant improvements in automating the safety case generation process. Gacek et al. [33] introduce Resolute, a tool which generates safety cases from system architecture models specified in AADL [34] along with formal claims and rules specified in an appropriate domain-specific modeling language. Resolute can also automatically propagate updates from the architectural model to the safety case and check for any assumption violations, but manual effort is still required to construct the formal claims and rules. Similarly, Calinescu et al. [35] apply ACG techniques to self-adaptive systems with the ENTRUST methodology. This approach generates dynamic assurance cases which adapt along with the system to remain valid after system reconfiguration. Additionally, Denney et al. [13] examine several such tools and provide an introduction to their AdvoCATE toolset. AdvoCATE introduces a methodology for automated generation of safety cases and provides functionality for argument analysis and improvement, evidence selection, and claim definition and composition.

VIII. DISCUSSION

In this section, we first discuss the functionalities of the ACG tool, which were motivated by the limitations in the existing ACG process and tools. We then discuss future enhancements to the automation of our ACG tool.

A. Reflections

As discussed before, large efforts are being made in designing safety case reports for safety-critical systems, which has recently become a mandate in many industries. Despite significant efforts and improvements, developing these reports is typically a manual process requiring human involvement at several steps, which has made the ACG process slow and error prone. Overall, we feel less attention is being paid to the methods by which these safety case reports are developed. This was the primary motivation behind the proposed ACG tool, which significantly reduces the cost of assuring CPSs, provides a robust process that minimizes human involvement, and accelerates the entire process of safety case generation. Specifically, the functionalities of the ACG tool that were motivated by the limitations in the existing ACG process and tools are:

• Automated artifact management (Section IV-E) and partially automated evidence store generation from the system architecture models using the ALC Toolchain (Section II-C).

• Automated safety case GSN construction (Section V-B) using domain artifacts and the curated evidence store.

• Linked atomic evidence nodes (Section IV-D1) to minimize redundancy and promote reusability for other safety cases.

• Safety case evaluation (Section VI) to determine the credibility of the generated safety case using a confidence score.

Though we have not comprehensively discussed the specific research methods (e.g., FORMULA) that have been used in the proposed tool, we believe Section IV provides enough description of the research methods used to achieve automation. To the best of our knowledge, several functionalities (e.g., safety case evaluation) provided by our ACG tool are either missing or primitive in most of the tools discussed in Section VII.

B. Future Work

As future steps, we plan to improve the existing components and enhance the automation capability of the ACG tool. Some possible extensions and improvements are listed below:

• Automating Claim Structuring: Currently, the conversion of the informal certification criteria into the GSN root node goal is performed by a human expert. As an extension, we would like to use a natural language processing technique such as keyword matching [36] to automatically extract and map the informal certification requirements to the goal of the GSN root node (a toy sketch of this idea is shown after this list).

• Safety Case Evaluation: Currently, a confidence score is used to evaluate the generated safety case. However, confidence as a metric is probably not sufficient to evaluate the credibility of the generated safety case. To strengthen this, we may also need to evaluate the ACG in terms of soundness and stability.

• Automating the Seal operation: As discussed in Section IV-E, the steps of linking each piece of evidence to the corresponding component in the system architectural model, determining when the available evidence is sufficient to support a claim, and defining any assumptions required for the evidence to be valid are all performed manually by the developer. We want to automate this process.

• Extending the GSN notation: The existing automated GSN notation does not have a means to capture the justification that indicates whether the sub-goals or evidence are sufficient. This is a vital piece of argument justifying the GSN branching and supporting evidence. We are working to add a justification node to the existing GSN.

• Extending logical connectives: Currently, the tool only supports the AND and OR logical connectives. We are working on extending it to other logical connectives like XOR.
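As a toy illustration of the keyword-matching idea in the first bullet above, the sketch below maps an informal certification criterion to a root-goal template using simple keyword hits; the keyword map and goal templates are made up for the example, and a real implementation would rely on an NLP tool such as ARSENAL [36].

```python
# Hypothetical keyword-to-goal-template map; the templates and keywords are
# illustrative only.
GOAL_TEMPLATES = {
    ("collision", "brake", "stop"): "G1: The AEBS shall stop the car before it collides with an obstacle",
    ("lane", "follow", "track"): "G1: The controller shall keep the car within its lane",
}

def criteria_to_root_goal(criteria: str) -> str:
    """Pick the goal template whose keywords best match the criteria text."""
    words = set(criteria.lower().split())
    best_template, best_hits = None, 0
    for keywords, template in GOAL_TEMPLATES.items():
        hits = sum(1 for keyword in keywords if keyword in words)
        if hits > best_hits:
            best_template, best_hits = template, hits
    if best_template is None:
        raise ValueError("no goal template matches the certification criteria")
    return best_template

print(criteria_to_root_goal("The car must brake and stop to avoid a collision"))
```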


IX. CONCLUSION

Safety Case has become a part of the regulatory certification process in different CPS domains. Despite its importance, little effort has been made to improve the existing ACG process, which is still mostly manual and ad-hoc, without a systematic approach. There are tools designed to support ACG; however, they still facilitate only manual generation of safety cases. To address this, we have proposed an ACG tool that considers the certification criteria along with the system's design artifacts and evidence accumulated by human experts to generate a fully decomposed GSN in an automated manner. Specifically, the ACG tool, along with the ALC Toolchain, can automatically generate design artifacts from the system architecture model and populate an evidence store that is required for the safety case generation process. In addition, it can iteratively decompose the root node goal (or certification criteria) of the GSN to automatically construct a GSN using the Link-Seal-Expand steps. This automated GSN construction significantly reduces time and human effort. Additionally, the ACG tool has the capability to evaluate the generated safety case using a confidence score. This evaluation mechanism is novel and extends existing popular tools like ASCE [10] and AdvoCATE [13].

We also envision that our tool will reduce the time and cost of the certification process, and reduce the ambiguity in a safety case that is otherwise introduced by extensive involvement of human experts. Further, we have illustrated the proposed ACG tool on an AEBS case study using an RC car testbed. Currently, the ACG tool is not fully functional for online validation. We are working on integrating the different components together so that it can be validated with other CPS testbeds. We eventually want to integrate the ACG tool into the ALC Toolchain to build a single comprehensive toolchain for offline design, development, and safety case generation of CPS applications.

ACKNOWLEDGEMENTS

This work was supported in part by DARPA's Assured Autonomy project and the Air Force Research Laboratory. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA or AFRL.

REFERENCES

[1] P. Bishop and R. Bloomfield, "A methodology for safety case development," in Safety and Reliability, vol. 20, no. 1. Taylor & Francis, 2000, pp. 34-42.

[2] T. Chowdhury, C.-W. Lin, B. Kim, M. Lawford, S. Shiraishi, and A. Wassyng, "Principles for systematic development of an assurance case template from ISO 26262," in 2017 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW). IEEE, 2017, pp. 69-72.

[3] E. W. Denney, G. J. Pai, and J. M. Pohl, "Automating the generation of heterogeneous aviation safety cases," 2012.

[4] D. Kritzinger, Aircraft System Safety: Military and Civil Aeronautical Applications. Woodhead Publishing, 2006.

[5] R. Palin and I. Habli, "Assurance of automotive safety - a safety case approach," in International Conference on Computer Safety, Reliability, and Security. Springer, 2010, pp. 82-96.

[6] T. Kelly, "A systematic approach to safety case management," SAE Technical Paper, Tech. Rep., 2004.

[7] R. Bloomfield, P. Bishop, C. Jones, and P. Froome, "ASCAD: Adelard safety case development manual," Adelard, ISBN 0-9533771-0, vol. 5, 1998.

[8] S. Nair, N. Walkinshaw, T. Kelly, and J. L. de la Vara, "An evidential reasoning approach for assessing confidence in safety evidence," in 2015 IEEE 26th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 2015, pp. 541-552.

[9] P. Steele, K. Collins, and J. Knight, "ACCESS: A toolset for safety case creation and management," in Proc. 29th Intl. Systems Safety Conf., August 2011.

[10] Adelard, "Assurance and safety case environment (ASCE)," 2011.

[11] Y. Matsuno, "D-Case Editor: A typed assurance case editor," University of Tokyo, 2011.

[12] M. R. Barry, "CertWare: A workbench for safety case production and analysis," in 2011 Aerospace Conference. IEEE, 2011, pp. 1-10.

[13] E. Denney and G. Pai, "Tool support for assurance case development," Automated Software Engineering, vol. 25, no. 3, pp. 435-499, 2018.

[14] S. Ramakrishna, A. Dubey, M. P. Burruss, C. Hartsell, N. Mahadevan, S. Nannapaneni, A. Laszka, and G. Karsai, "Augmenting learning components for safety in resource constrained autonomous robots," arXiv preprint arXiv:1902.02432, 2019.

[15] C. Edwards, "Railway safety cases," in Safety and Reliability of Software Based Systems. Springer, 1997, pp. 317-322.

[16] L. W. D. Cullen, "The public inquiry into the Piper Alpha disaster," Drilling Contractor (United States), vol. 49, no. 4, 1993.

[17] R. Hawkins, T. Kelly, J. Knight, and P. Graydon, "A new approach to creating clear safety arguments," in Advances in Systems Safety. Springer, 2011, pp. 3-23.

[18] C. Hartsell, N. Mahadevan, S. Ramakrishna, A. Dubey, T. Bapty, T. Johnson, X. Koutsoukos, J. Sztipanovits, and G. Karsai, "Model-based design for CPS with learning-enabled components," in Proceedings of the Workshop on Design Automation for CPS and IoT. ACM, 2019, pp. 1-9.

[19] "Ciholas Archimedes system," https://cuwb.io/docs/v2.0/overview/.

[20] E. K. Jackson and W. Schulte, "FORMULA 2.0: A language for formal specifications," in Unifying Theories of Programming and Formal Engineering Methods. Springer, 2013, pp. 156-206.

[21] E. Denney, G. Pai, and I. Habli, "Towards measurement of confidence in safety cases," in 2011 International Symposium on Empirical Software Engineering and Measurement. IEEE, 2011, pp. 380-383.

[22] J.-B. Yang and D.-L. Xu, "On the evidential reasoning algorithm for multiple attribute decision analysis under uncertainty," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 32, no. 3, pp. 289-304, 2002.

[23] P. J. Krause, "Approximate reasoning models by Ramon Lopez de Mantaras, Ellis Horwood, Chichester, 1990, pp 109. - Search, inference and dependencies in artificial intelligence by Murray Shanahan and Richard Southwick, Ellis Horwood, Chichester, 1989, pp 140," The Knowledge Engineering Review, vol. 6, no. 3, pp. 239-242, 1991.

[24] L. Jiao and X. Geng, "Analysis and extension of the evidential reasoning algorithm for multiple attribute decision analysis with uncertainty," arXiv preprint arXiv:1903.11857, 2019.

[25] G.-H. Tzeng and J.-J. Huang, Multiple Attribute Decision Making: Methods and Applications. Chapman and Hall/CRC, 2011.

[26] S. Pradhan, A. Dubey, T. Levendovszky, P. S. Kumar, W. A. Emfinger, D. Balasubramanian, W. Otte, and G. Karsai, "Achieving resilience in distributed software systems via self-reconfiguration," Journal of Systems and Software, vol. 122, pp. 344-363, 2016.

[27] S. Nannapaneni, A. Dubey, S. Abdelwahed, S. Mahadevan, S. Neema, and T. Bapty, "Mission-based reliability prediction in component-based systems," International Journal of Prognostics and Health Management, vol. 7, no. 001, 2016.

[28] D. J. Rinehart, J. C. Knight, and J. Rowanhill, "Current practices in constructing and evaluating assurance cases with applications to aviation," 2015.

[29] H. Dezfuli, A. Benjamin, C. Everett, C. Smith, M. Stamatelatos, and R. Youngblood, "NASA system safety handbook. Volume 1: System safety framework and concepts for implementation," 2011.

[30] N. G. Leveson, "The use of safety cases in certification and regulation," 2011.

[31] C. Haddon-Cave QC, "The Nimrod Review: An independent review into the broader issues surrounding the loss of the RAF Nimrod MR2 aircraft XV230 in Afghanistan in 2006. HC 1025," 2009.


[32] M. Maksimov, N. L. Fung, S. Kokaly, and M. Chechik, "Two decades of assurance case tools: A survey," in International Conference on Computer Safety, Reliability, and Security. Springer, 2018, pp. 49-59.

[33] A. Gacek, J. Backes, D. Cofer, K. Slind, and M. Whalen, "Resolute: An assurance case language for architecture models," ACM SIGAda Ada Letters, vol. 34, no. 3, pp. 19-28, 2014.

[34] P. H. Feiler, D. P. Gluch, and J. J. Hudak, "The Architecture Analysis & Design Language (AADL): An introduction," Carnegie Mellon Univ., Pittsburgh, PA, Software Engineering Inst., Tech. Rep., 2006.

[35] R. Calinescu, D. Weyns, S. Gerasimou, M. U. Iftikhar, I. Habli, and T. Kelly, "Engineering trustworthy self-adaptive software with dynamic assurance cases," IEEE Transactions on Software Engineering, vol. 44, no. 11, pp. 1039-1069, 2017.

[36] S. Ghosh, D. Elenius, W. Li, P. Lincoln, N. Shankar, and W. Steiner, "ARSENAL: Automatic requirements specification extraction from natural language," in NASA Formal Methods Symposium. Springer, 2016, pp. 41-46.
