Using an IEC 61508-Certified RTOS Kernel for Safety-Critical Systems
Bob Monkman, Director, Business Development, QNX Software Systems
FTF China, August 2011
The Standards
IEC 61508
Accreditation and Auditing Bodies
Derived Standards
Certification
The Plan
The Practice
Conclusion
© 2011 QNX Software Systems, GmbH & Co. KG,
a subsidiary of Research In Motion Ltd.
IEC 61508
IEC 61508 Functional safety of electrical/electronic/programmable electronic safety-related systems
First edition (1998-2000)
Second edition (April 2010) — significant additions, especially concerning software
Summary
Part 0: Functional Safety and IEC 61508
Part 1: General Requirements
Part 2: System Requirements
Part 3: Software Requirements
Part 4: Definitions and Abbreviations
Part 5: Examples of Methods
Part 6: Guidelines for the application of Parts 2 and 3
Part 7: Overview of Techniques and Measures
Accreditation and Auditing Bodies
A member of the International Accreditation Forum accredits a certification organization.
A certification organization certifies a process or a product.
Derived standards
EN 5012n: European railway standards
EN 50126: reliability, availability, maintainability and safety
EN 50128: communications, signalling and processing systems (software for railway control and protection systems)
EN 50129: communications, signalling and processing systems (safety-related electronic systems for signalling)
IEC 62304: medical software and software life cycle processes
ISO 26262: functional safety for road vehicles (in development)
The Certification Challenge
The Standards
The Plan
Functional Safety System
Safety Claim
Safety Case and Supporting Evidence
The Practice
Conclusion
An Example of a Functional Safety System
A chainsaw
The Claims
Context of the claims
Probability of dangerous failure
Level of dependability – availability and reliability
Sufficient dependability
Functional Safety Requirements
Safety Manual
The Claims: The Infamous “Five-Nines Availability”
Failures per year    Duration of each failure
1                    5 minutes 16 seconds        (potentially catastrophic)
10                   32 seconds
100                  3.2 seconds
1,000                316 milliseconds
10,000               32 milliseconds
100,000              3.2 milliseconds
1,000,000            316 microseconds            (possibly benign)
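The figures in the table can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes an average year of 365.25 days and that the yearly downtime budget is split evenly across the failures. The function names are hypothetical, not taken from the presentation.

```python
# Assumption: an average year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def downtime_budget_seconds(availability: float) -> float:
    """Total downtime per year permitted by an availability figure."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def outage_duration_seconds(availability: float, failures_per_year: int) -> float:
    """Length of each outage if the budget is split evenly across failures."""
    return downtime_budget_seconds(availability) / failures_per_year


if __name__ == "__main__":
    for n in (1, 10, 100, 1_000, 10_000, 100_000, 1_000_000):
        seconds = outage_duration_seconds(0.99999, n)
        print(f"{n:>9} failures/year: {seconds:.6g} s per failure")
```

The same availability claim therefore covers one outage of more than five minutes and a million outages of a few hundred microseconds, which is why the claim needs further precision.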
Five-nines availability sounds good, but …
Would you fly in a plane with a flight control system that makes this claim, with no further precision?
The Evidence Pyramid
The Foundation - Quality management system
Without these basic procedures, you can go no further
Quality management system
ISO 9000
ISO 15504
Capability Maturity Model Integration (CMMI)
Source control
Revision/version/source control
Defect tracking
Defects found by customers as well as through verification
Defect classification (for fault analysis)
Design Artefacts
Records from software life cycle
Design documentation
Project plan
Quality plan
Architectural design
Detailed design
Test plans
Test results
Other validation methods: plans and results
Traceability matrix
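A traceability matrix links each requirement to the design elements that implement it and the tests that verify it. The sketch below is a minimal illustration; every identifier in it (REQ-1, DES-A, TST-10 and so on) is hypothetical, and real projects would normally generate this data from their requirements and test management tools.

```python
# Hypothetical identifiers, for illustration only.
requirements = ["REQ-1", "REQ-2", "REQ-3"]
design_links = {"REQ-1": ["DES-A"], "REQ-2": ["DES-A", "DES-B"], "REQ-3": []}
test_links = {"REQ-1": ["TST-10"], "REQ-2": [], "REQ-3": ["TST-30"]}


def build_matrix(reqs, design, tests):
    """Map each requirement to its design elements and its tests."""
    return {r: {"design": design.get(r, []), "tests": tests.get(r, [])} for r in reqs}


def untraced(matrix):
    """Requirements that lack a design element or a verifying test."""
    return sorted(
        r for r, links in matrix.items() if not links["design"] or not links["tests"]
    )


if __name__ == "__main__":
    matrix = build_matrix(requirements, design_links, test_links)
    for req in untraced(matrix):
        print(f"{req} is not fully traced: {matrix[req]}")
```

Gaps reported by a check like this are the kind of finding an auditor would expect to see closed before the evidence is accepted.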
Static Analysis
Syntax checking
Check that coding standards are being applied
Compiler is a syntax checker
Checking with semantics knowledge
Targeted module analysis
Common fault scanning
Assertion checking
Symbolic execution
Detect logical inconsistencies
Pros: helps catch design errors early
Cons: false positives
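As a small illustration of common fault scanning, the sketch below uses Python's standard `ast` module to parse source code and flag two well-known fault patterns. It is a toy, not a substitute for a qualified static analysis tool; the class name `FaultScanner`, the two rules and the sample code are assumptions made for this example.

```python
import ast


class FaultScanner(ast.NodeVisitor):
    """Collects (line, message) findings for two common fault patterns."""

    def __init__(self):
        self.findings = []

    def visit_ExceptHandler(self, node):
        if node.type is None:
            self.findings.append((node.lineno, "bare 'except:' hides unexpected errors"))
        self.generic_visit(node)

    def visit_Assert(self, node):
        # An assert whose test is a non-empty tuple is always true.
        if isinstance(node.test, ast.Tuple) and node.test.elts:
            self.findings.append((node.lineno, "assert on a tuple never fails"))
        self.generic_visit(node)


def scan(source):
    """Parse the source (the parser is the first syntax checker), then scan it."""
    tree = ast.parse(source)  # raises SyntaxError on malformed input
    scanner = FaultScanner()
    scanner.visit(tree)
    return sorted(scanner.findings)


# Hypothetical code under analysis.
SAMPLE = '''\
def read_sensor(port):
    try:
        return port.read()
    except:
        return None

def check(level):
    assert (level < 100, "level out of range")
'''

if __name__ == "__main__":
    for line, message in scan(SAMPLE):
        print(f"line {line}: {message}")
```

A rule such as the bare-except check will also flag handlers that are intentional, which is a small instance of the false-positive cost noted above.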
Proven-In-Use Data
Particularly important for retrofitting
In-field usage data are invaluable
Build the gathering of this data into your business model
The more in-use data available, the stronger the evidence
In-use data only meaningful when scrutinized with fault analysis
QNX used proven-in-use data to support its safety case for the QNX® Neutrino® RTOS Safe Kernel.
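One common way to quantify why more in-use data gives stronger evidence is the zero-failure bound for a constant failure rate: after T failure-free operating hours, the rate is below -ln(1 - C) / T with confidence C. The sketch below applies that formula. It is an illustration under stated assumptions (constant failure rate, no dangerous failures observed, a usage profile representative of the target application) and is not presented as the method QNX used; the function names are hypothetical.

```python
import math


def failure_rate_upper_bound(failure_free_hours, confidence=0.95):
    """Upper bound on a constant failure rate (per hour) after zero observed failures."""
    if failure_free_hours <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need positive hours and a confidence strictly between 0 and 1")
    return -math.log(1.0 - confidence) / failure_free_hours


def hours_needed(target_rate_per_hour, confidence=0.95):
    """Failure-free operating hours needed to support a target failure rate."""
    if target_rate_per_hour <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need a positive rate and a confidence strictly between 0 and 1")
    return -math.log(1.0 - confidence) / target_rate_per_hour


if __name__ == "__main__":
    for hours in (1e5, 1e6, 1e7):
        print(f"{hours:.0e} h failure-free: rate below {failure_rate_upper_bound(hours):.2e}/h")
    print(f"hours needed for 1e-7/h: {hours_needed(1e-7):.2e}")
```

The bound shrinks in direct proportion to the operating hours gathered, which is the quantitative reason to build data collection into the business model.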
Fault Tree Analysis
Structured analysis
Easier for auditor
Easier for audited
Example: Bayesian Belief Networks
Tool for incorporating and providing quantitative results from:
Hard and soft evidence
A priori (cause to effect) and a posteriori (effect to cause) evidence
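The two directions of reasoning can be shown on the smallest possible belief network, a single cause with a single observable effect. The sketch below is illustrative only; the probabilities are hypothetical and the function names are not from any particular Bayesian network tool.

```python
def predict_effect(prior, p_effect_given_cause, p_effect_given_no_cause):
    """A priori reasoning (cause to effect): probability of observing the effect."""
    return p_effect_given_cause * prior + p_effect_given_no_cause * (1.0 - prior)


def posterior_cause(prior, p_effect_given_cause, p_effect_given_no_cause):
    """A posteriori reasoning (effect to cause): Bayes' rule after seeing the effect."""
    evidence = predict_effect(prior, p_effect_given_cause, p_effect_given_no_cause)
    return p_effect_given_cause * prior / evidence


if __name__ == "__main__":
    # Hypothetical numbers: 1% of builds contain the fault; the test fails for
    # 90% of faulty builds and for 5% of fault-free builds.
    prior, hit, false_alarm = 0.01, 0.90, 0.05
    print(f"P(test fails)         = {predict_effect(prior, hit, false_alarm):.4f}")
    print(f"P(fault | test fails) = {posterior_cause(prior, hit, false_alarm):.4f}")
```

A full belief network chains many such nodes, but each update follows the same rule.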
Fault tree
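A fault tree can be evaluated quantitatively by combining basic-event probabilities through AND and OR gates. The sketch below is a minimal illustration that assumes the basic events are independent and that no event appears more than once in the tree; the event names and probabilities are hypothetical.

```python
from math import prod


def probability(node, basic_events):
    """Probability of a node: a basic-event name or a (gate, [children]) tuple."""
    if isinstance(node, str):
        return basic_events[node]
    gate, children = node
    probs = [probability(child, basic_events) for child in children]
    if gate == "AND":  # all children must occur
        return prod(probs)
    if gate == "OR":  # at least one child occurs
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical tree: the top event occurs if both redundant sensors fail,
# or if the controller fails.
TOP_EVENT = ("OR", [("AND", ["sensor_a_fails", "sensor_b_fails"]), "controller_fails"])
BASIC_EVENTS = {"sensor_a_fails": 0.01, "sensor_b_fails": 0.01, "controller_fails": 0.001}

if __name__ == "__main__":
    print(f"P(top event) = {probability(TOP_EVENT, BASIC_EVENTS):.7f}")
```

In this example the single controller dominates the result, which is the kind of insight the structured analysis is meant to expose.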
Design Verification
Model checking
Could be applied before or after design
Powerful tool for retrofitting
SPIN — Simple Promela (Process Meta Language) Interpreter
NuSMV — New Symbolic Model Checker
Formal analysis
Less effective for retrofitting, but may be needed for SIL 4
For example: VDM (Vienna Development Method)
Z
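To make the idea behind model checkers such as SPIN and NuSMV concrete, the sketch below explores every reachable state of a tiny model and reports a shortest trace that violates an invariant. It is a hand-written explicit-state search for illustration, not SPIN or NuSMV, and the model (two processes sharing one lock flag) is a hypothetical example.

```python
from collections import deque


def make_successors(atomic):
    """Transition relation for two processes sharing one lock flag.

    A state is (pc0, pc1, lock); pc 0 = idle, 1 = saw the lock free, 2 = critical.
    """
    def successors(state):
        for i in (0, 1):
            pcs, lock = list(state[:2]), state[2]
            if pcs[i] == 0 and lock == 0:
                # Atomic test-and-set enters directly; otherwise only the test happens.
                pcs[i], lock = (2, 1) if atomic else (1, 0)
            elif pcs[i] == 1:
                pcs[i], lock = 2, 1  # set the flag and enter the critical section
            elif pcs[i] == 2:
                pcs[i], lock = 0, 0  # leave and release
            else:
                continue  # blocked: the lock is held
            yield (pcs[0], pcs[1], lock)
    return successors


def check(initial, successors, invariant):
    """Breadth-first search of reachable states; return a shortest counterexample or None."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def mutual_exclusion(state):
    """Invariant: the two processes are never both in the critical section."""
    return not (state[0] == 2 and state[1] == 2)


if __name__ == "__main__":
    print("non-atomic:", check((0, 0, 0), make_successors(False), mutual_exclusion))
    print("atomic:    ", check((0, 0, 0), make_successors(True), mutual_exclusion))
```

The non-atomic design yields a counterexample trace, while the atomic variant satisfies the invariant in every reachable state. This kind of check can be run on a model of an existing design as well as a new one.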
A Closer Look at Building Functional Safety
The Standards
The Plan
The Practice
Reason’s Model
Preventing the introduction of faults
Preventing faults from causing errors
Preventing errors from causing failures
Minimizing the effect of failures
Conclusion
Reason’s Model
Fault — a mistake in the code, which may or may not cause undesired behaviour.
Error — undesired behaviour caused by a fault in the code.
Failure — a system failure caused by an uncontained error.
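The three terms can be illustrated with a few lines of code. In the sketch below, which is a hypothetical example rather than anything from the presentation, the fault is a missing guard that stays dormant for most inputs, the error is the exception it raises for one input, and the failure is what happens when nothing contains that error.

```python
def average_speed(distance_m, elapsed_s):
    # FAULT: no guard against elapsed_s == 0. Dormant for every other input.
    return distance_m / elapsed_s


def controller_uncontained(distance_m, elapsed_s):
    """The ERROR (ZeroDivisionError) escapes, so the controller itself FAILS."""
    return f"speed={average_speed(distance_m, elapsed_s):.1f}"


def controller_contained(distance_m, elapsed_s):
    """The ERROR is contained: the controller reports a safe state instead of failing."""
    try:
        return f"speed={average_speed(distance_m, elapsed_s):.1f}"
    except ZeroDivisionError:
        return "SAFE_STATE"


if __name__ == "__main__":
    print(controller_contained(10, 2))  # fault present, but no error occurs
    print(controller_contained(10, 0))  # error occurs and is contained
```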
Preventing the Introduction of Faults (cont’d)
System engineering
Formal languages
VDM (Vienna Development Method)
Z Notation
Language choices
Loose/Strong typing
Dynamic/Static typing
Exception handling
Design techniques
Test-driven design
Preventing Faults from Causing Errors
Assertions
Static code analysis (“automatic code inspection”)
Code inspections
Fault injection
Test fault detection and recovery
Estimate number of Heisenbugs
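Fault injection and assertions can be combined in a small test harness. The sketch below is illustrative only: `FlakySensor` is a hypothetical test double that raises an injected fault on chosen calls, and `read_with_recovery` is a hypothetical function under test that retries and then falls back to a safe value.

```python
class FlakySensor:
    """Test double that injects a fault on the chosen call numbers."""

    def __init__(self, value, fail_on):
        self.value = value
        self.fail_on = set(fail_on)
        self.calls = 0

    def read(self):
        self.calls += 1
        if self.calls in self.fail_on:
            raise IOError(f"injected fault on call {self.calls}")
        return self.value


def read_with_recovery(sensor, retries=2, safe_value=0.0):
    """Return (value, status): retry on faults, then fall back to a safe value."""
    for attempt in range(1 + retries):
        try:
            value = sensor.read()
        except IOError:
            continue  # fault detected: try again
        # Assertion documenting an assumed contract of the sensor.
        assert value >= 0.0, "sensor reading must be non-negative"
        return value, ("ok" if attempt == 0 else "recovered")
    return safe_value, "degraded"


if __name__ == "__main__":
    for faults in ([], [1], [1, 2, 3]):
        print(faults, read_with_recovery(FlakySensor(21.5, faults)))
```

Each injected scenario exercises a different path: normal operation, detection with recovery, and degradation to a safe value.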
Preventing Errors from Causing Failures
Coherent exception handling
Fundamental technique
Throw the exception: transfer control from the point of exception to another location where it can be handled appropriately
Programming by contract
Rejuvenation (or reset)
Replication (redundancy/recovery)
Consistency vs. performance and availability
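Replication and contract checking can be sketched together with a majority voter over redundant results. The example below is a minimal illustration under the assumption of three replicas and a quorum of two; the function name `vote` and the sample readings are hypothetical.

```python
from collections import Counter


def vote(readings, quorum=2):
    """Return the value reported by at least `quorum` replicas.

    Precondition (contract): at least `quorum` readings are supplied.
    Raises RuntimeError when no value reaches the quorum, so that the caller
    can handle the disagreement in one place.
    """
    if len(readings) < quorum:
        raise ValueError("precondition violated: not enough readings to reach a quorum")
    value, count = Counter(readings).most_common(1)[0]
    if count < quorum:
        raise RuntimeError(f"no majority among readings {list(readings)}")
    return value


if __name__ == "__main__":
    print(vote([42, 42, 41]))  # one replica disagrees and is outvoted
    try:
        vote([40, 41, 42])
    except RuntimeError as exc:
        print("voter refused:", exc)
```

Waiting for all replicas before voting favours consistency, while answering as soon as a quorum agrees favours performance and availability, which is the trade-off named above.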
Minimizing the Effects of Failures
Architecture
Microkernel
Partitioning
Fault Isolation
Fault Detection & Recovery
Clean crash
Crash-Only Software
Rapid restart may be required
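The crash-and-restart idea can be sketched with a supervisor that discards a crashed worker, together with its state, and starts a fresh one. The example below is a single-process simulation for illustration only; in a real system the worker would be a separate process or partition, and the names `supervise` and `make_worker` are hypothetical.

```python
def make_worker():
    """Create a worker with fresh state. It crashes after three jobs to simulate ageing."""
    handled = 0

    def worker(job):
        nonlocal handled
        if handled >= 3:
            raise RuntimeError("worker state exhausted")  # clean crash, no partial result
        handled += 1
        return job * 2

    return worker


def supervise(make_worker, jobs, max_restarts=3):
    """Run the jobs, restarting the worker from a clean state whenever it crashes."""
    worker = make_worker()
    restarts = 0
    results = []
    for job in jobs:
        while True:
            try:
                results.append(worker(job))
                break
            except Exception:
                if restarts == max_restarts:
                    raise  # restart budget exhausted: escalate
                restarts += 1
                worker = make_worker()  # discard the old state and start clean
    return results, restarts


if __name__ == "__main__":
    print(supervise(make_worker, range(7)))
```

The restart budget keeps a deterministic crash from looping forever, and the time taken by `make_worker` is what the rapid-restart requirement constrains.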
A simple elevator system with a failure. What techniques could we have used to find the fault? Is recovery possible?
Example: Adaptive Partitioning
QNX Adaptive Partitioning
Provides minimum CPU time guarantees to partitions (sets of processes or threads)
Allows partitions to exceed their time budgets when spare processing cycles are available
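The two properties above (a guaranteed minimum, plus use of spare cycles) can be illustrated with a simplified allocation over one averaging window. The sketch below is not the QNX scheduler algorithm: it is a hypothetical model in which budgets and demands are percentages of CPU time and spare time is shared among partitions that still want more, in proportion to their budgets.

```python
def allocate(budgets, demands):
    """Allocate CPU percentages: guaranteed budgets first, then share the spare time."""
    if any(b <= 0 for b in budgets.values()) or sum(budgets.values()) > 100:
        raise ValueError("budgets must be positive and sum to at most 100")
    # Step 1: every partition gets what it asks for, up to its guaranteed budget.
    alloc = {p: min(demands[p], budgets[p]) for p in budgets}
    spare = 100.0 - sum(alloc.values())
    hungry = {p for p in budgets if demands[p] > alloc[p]}
    # Step 2: hand out spare cycles until none are left or nobody wants more.
    while spare > 1e-9 and hungry:
        weight = sum(budgets[p] for p in hungry)
        granted = 0.0
        for p in sorted(hungry):
            extra = min(spare * budgets[p] / weight, demands[p] - alloc[p])
            alloc[p] += extra
            granted += extra
        spare -= granted
        hungry = {p for p in hungry if demands[p] - alloc[p] > 1e-9}
    return alloc


if __name__ == "__main__":
    budgets = {"control": 50, "ui": 30, "logging": 20}  # hypothetical partitions
    print(allocate(budgets, {"control": 80, "ui": 10, "logging": 40}))
    print(allocate(budgets, {"control": 100, "ui": 100, "logging": 100}))
```

When the UI partition is mostly idle, the other partitions exceed their budgets; when every partition is overloaded, each one still receives exactly its guaranteed share.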
How can QNX help?
QNX Certified Platform
Architected for reliability and self-healing
IEC 61508 Certification Statement
Safety Manual
Device-specific Assurance Case report plug-in
Neutrino RTOS Safety Assurance Case
“Proven in Use” data
Safe design training courses
On-site audit (regulatory body participation possible)
Subject Matter Expert consultancy time (hours)
The Standards
The Plan
The Practice
Conclusion
To Summarize
Summary
Functional safety certification has no “Short Cut”
Process and quality management are essential
A proven OS architecture that ensures reliability/safety
Gather in-field usage data
Engage the auditor from the beginning and throughout the process
Consider Pre-Audit Services
Design and build for safety certification:
Fault, error, failure, recovery
Thank you!
Bob Monkman
www.qnx.com