improving the risk matrixpsas.scripts.mit.edu/home/wp-content/uploads/2019/... · scenario 1: the...

Post on 04-Apr-2020

Improving the Risk Matrix

Nancy Leveson

MIT

A Standard Version of the Risk Matrix

•  Used throughout the lifecycle
•  Assumes Risk = f(severity, likelihood)

Severity

•  Defined as a set of categories, such as
   Catastrophic: multiple deaths
   Critical: one death or multiple severe injuries
   Marginal: one severe injury or multiple minor injuries
   Negligible: one minor injury

•  Relatively straightforward, but
   –  Worst case? Most likely? Credible? Predefined common events?
   –  How to define credible? (blurs with likelihood)
   –  Design basis? (nuclear energy)

•  ARP4761 example:
   –  Loss of deceleration capability
      •  Not annunciated during taxi: Major (Crew unable to stop a/c, resulting in slow-speed contact with terminal, aircraft, or vehicles)
      •  Annunciated during taxi: No safety effect (Crew steers a/c clear of any obstacles and calls for a tug or portable stairs)

"Improved" Disembarkation Method

Likelihood

•  Example:
   Frequent: likely to occur frequently
   Probable: will occur several times in the system’s life
   Occasional: likely to occur sometime in the system’s life
   Remote: unlikely to occur in the system’s life, but possible
   Improbable: extremely unlikely to occur
   Impossible: equal to a probability of zero

•  More problematic than severity
   –  Historic events may not apply
      •  System, environment, or way used may change
      •  Software “failure” is always 1
   –  Sometimes associated with probability levels (can this be determined?)
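The standard matrix described above, Risk = f(severity, likelihood), can be sketched as a simple lookup. This is a minimal illustration only: the category names come from the slides, but the numeric ranks, the product rule, and the "high/medium/low" thresholds are assumptions made for this example and are not taken from any standard or from the talk.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Category names from the slides; the numeric ranks are assumed.
    NEGLIGIBLE = 1
    MARGINAL = 2
    CRITICAL = 3
    CATASTROPHIC = 4


class Likelihood(IntEnum):
    # Category names from the slides; the numeric ranks are assumed.
    IMPOSSIBLE = 0
    IMPROBABLE = 1
    REMOTE = 2
    OCCASIONAL = 3
    PROBABLE = 4
    FREQUENT = 5


def risk_level(severity: Severity, likelihood: Likelihood) -> str:
    """Hypothetical f(severity, likelihood): bin the product of the two ranks."""
    if likelihood is Likelihood.IMPOSSIBLE:
        return "none"
    score = int(severity) * int(likelihood)  # assumed combination rule
    if score >= 12:  # assumed threshold
        return "high"
    if score >= 6:  # assumed threshold
        return "medium"
    return "low"
```

Under these assumed thresholds, `risk_level(Severity.CATASTROPHIC, Likelihood.IMPROBABLE)` falls in the "low" bin, which shows how strongly the result depends on the likelihood estimate.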

How Accurate is the Risk Matrix?

•  Almost no scientific evaluation
   –  Two studies I know about, both had poor results (orders of magnitude different evaluations by experts)

•  Empirical (from practical use)

•  General technical limitations
   –  Mathematical and theoretical
   –  Heuristic biases

Empirical Evaluations and Practical Limitations

Caveats
•  Nothing available, so only our own evaluations on real systems

•  Not criticizing individual engineers or companies
   –  They were following standard practices
   –  Our goal was to figure out how to improve what is done today
   –  Same flaws in hundreds of these I have seen in my career

Empirical Evaluations (2)

•  Common problem: Assess risk of failures, not hazards
   –  Loss of external communication or breaking piston nuts vs. aircraft instability or violation of min separation from terrain
   –  Reliability, not safety
   –  What about non-failures?
   –  Individual failures but not combinations of low-ranked failures (and usually assumptions that pilot will behave appropriately)
      •  Infeasible to consider all combinations
      •  Assumption of independence
      •  Affects accuracy of results
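The effect of the independence assumption on a combination of two low-ranked failures can be shown with a small calculation. The probabilities below are hypothetical numbers chosen only for illustration. The point is the standard identity P(A and B) = P(A) · P(B | A), which reduces to P(A) · P(B) only when A and B really are independent.

```python
def joint_if_independent(p_a: float, p_b: float) -> float:
    """P(A and B) under the assumption that A and B are independent."""
    return p_a * p_b


def joint_with_dependence(p_a: float, p_b_given_a: float) -> float:
    """P(A and B) = P(A) * P(B | A), valid whether or not A and B are independent."""
    return p_a * p_b_given_a


# Hypothetical values, not taken from any real assessment.
p_a = 1e-4          # probability of failure A
p_b = 1e-4          # marginal probability of failure B
p_b_given_a = 1e-1  # B becomes much more likely once A has occurred

independent = joint_if_independent(p_a, p_b)
dependent = joint_with_dependence(p_a, p_b_given_a)
ratio = dependent / independent  # how far off the independence assumption is
```

With these assumed numbers the joint probability is underestimated by about three orders of magnitude when independence is assumed.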

Empirical Evaluations (3)

•  Assumptions about correct pilot reaction to failures (then blame them for the accidents)
   –  Pilot mental model is critical. Where is this in the risk assessment?

•  Unrealistic assumptions about hardware and software
   –  Redundancy as a mitigation:
      •  Doesn’t work for software or for design errors in hardware
      •  Software ONLY has design errors
   –  Virtually all software-related accidents stem from requirements errors, not implementation errors
      •  Redundancy and rigor of software development will not help here

Empirical Evaluations (4)

•  We found items categorized as
      Severity = Catastrophic
      Likelihood = Low
   that had been involved in multiple accidents for those systems

•  Only improbable if ignore software requirements flaws, human behavior aspects, etc.

•  STPA found non-failure scenarios leading to catastrophic events that were omitted from the official risk assessment

•  STPA identified realistic and relatively likely scenarios leading to all of the specific failures dismissed as improbable in the official risk assessment.

•  Likelihood can differ significantly depending on the external environment and operations in which a failure occurs.

Technical Limitations

•  The use of the risk matrix itself has been shown to have mathematical and other limitations (see paper)

•  Most important stem from heuristic biases (Kahneman, Tversky, Slovic)
   –  Psychologists who studied how people actually do risk evaluations
   –  Humans, it turns out, are terrible at estimating risk

Heuristic Biases (Tversky, Slovic, and Kahneman)

•  Confirmation bias (look for data that supports our beliefs)

•  Construct simple causal scenarios
   –  If none comes to mind, assume impossible

•  Tend to identify simple, dramatic events rather than events that are chronic or cumulative

•  Incomplete search for causes
   –  Once one cause identified and not compelling, then stop search

•  Defensive avoidance
   –  Downgrade accuracy or don’t take seriously
   –  Avoid topic that is stressful or conflicts with other goals

Heuristic Biases

Can avoid by: Providing those responsible with better information, obtained through a structured process to generate scenarios.

That goal can be accomplished using more powerful hazard analysis techniques, such as STPA

Potential Alternatives to the Risk Matrix

1.  Use hazards (not failures) and better information about potential causal scenarios

2.  Change basic definition of risk and how it is assessed (not covered in this talk)

Use Hazards Rather than Failures

•  Relationship between individual failures and losses is not obvious.
   –  Assessing hazards is a more direct path to the ultimate goal
   –  Component reliability is not equivalent to system safety
   –  Using hazards is traditional in system safety

Example: Why We Should Use Hazards

•  Helicopter Deice Function
•  Final SAR included a failure of the APU resulting from chafing.
   –  Important because the APU is used when loss of one generator occurs during blade deicing
   –  But also another scenario identified by using STPA that could occur when the APU has not failed

UCA: The flight crew does not switch the APU (Auxiliary Power Unit) generator power ON when either GEN1 or GEN2 is not supplying power to the helicopter and the blade de-ice system is required to prevent icing.

   –  Several causal scenarios and factors, but they are not in the official SAR
   –  Need to be factored into any risk assessment

Change Being Recommended

•  Start from a prioritized list of stakeholder-identified accidents or system losses.

•  Identify high-level system hazards leading to these losses

•  Assess severity and likelihood of hazards

•  Only consider failures that can lead to hazards (identified by STPA) along with the non-failure scenarios (again, STPA can identify them)

•  Consistent with MIL-STD-882 and most other safety standards
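The recommended flow (losses, then hazards, then the failure and non-failure scenarios that lead to each hazard) can be captured in a small traceability structure. The class and field names below are hypothetical, and the loss description is an assumed placeholder. The hazard and scenario text paraphrases examples used elsewhere in this talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scenario:
    description: str
    involves_failure: bool  # False for non-failure scenarios


@dataclass
class Hazard:
    description: str
    losses: List[str]  # stakeholder-identified losses this hazard can lead to
    scenarios: List[Scenario] = field(default_factory=list)


def non_failure_scenarios(hazard: Hazard) -> List[Scenario]:
    """Scenarios that a failure-only assessment would leave out."""
    return [s for s in hazard.scenarios if not s.involves_failure]


# Illustrative content only; the loss text is an assumed placeholder.
hazard = Hazard(
    description="Loss of control of the aircraft",
    losses=["Loss of aircraft or life"],
    scenarios=[
        Scenario("Flight instruments malfunction and give incorrect feedback", True),
        Scenario("Flight crew is confused about the current automation mode", False),
    ],
)
missed = non_failure_scenarios(hazard)
```

Keeping severity attached to the loss and scenarios attached to the hazard is one way to make every assessed item traceable back to a stakeholder-identified loss.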

Likelihood as Strength of Potential Controls

•  Severity is now easy because it can be traced directly to the list of accidents or mishaps

•  Heuristic biases lead to poor estimates of likelihood

•  Following a rigorous STPA will result in
   –  Reducing shortcuts and biases
   –  More full consideration of potential causal scenarios

•  Can be done early in development to identify where to place development effort

•  Maybe focus on component behavior because historical failure information exists

Example 1: Pilot’s use of flight controls

•  UCA: The Flight Crew does not deflect pedals sufficiently to counter torque from the main rotor, resulting in the Flight Crew losing control of the aircraft and coming into contact with an obstacle in the environment or the terrain.

One of the causal scenarios:
•  Scenario 1: The Flight Crew is unaware that the pedals have not been deflected sufficiently to counter the torque from the main rotor.
•  The Flight Crew could have this flawed process model because:
   –  a) The flight instruments are malfunctioning and providing incorrect or insufficient feedback to the crew about the aircraft state during degraded visual conditions.
   –  b) The flight instruments are operating as intended, but providing insufficient feedback to the crew to apply the proper pedal inputs to control heading of the aircraft to avoid obstacles during degraded visual conditions.
   –  c) The Flight Crew has an incorrect mental model of how the FCS will execute their control inputs to control the aircraft and how the engine will respond to the environmental conditions.
   –  d) The Flight Crew is confused about the current mode of the aircraft automation and is thus unaware of the actual control laws that are governing the aircraft at this time.
   –  e) There is incorrect or insufficient control feedback.

Example 1: Pilot’s use of flight controls (Con’t)

•  Each causal factor is used to generate requirements and design features to reduce its likelihood of occurring

•  Likelihood can be based on strength of potential controls
   –  Interface design (evaluated by human factors expert)
   –  Redundancy and fault tolerant design
   –  Training
   –  System design (hardware, software, interactions)
   –  Design of feedback

•  Still need a way to link these to likelihood (will come back to that)

Example 2: Software

•  What we do now (rigor of development) makes no sense technically

UCA: One or more of the FCCs (flight control computers) command collective input to the hydraulic servos too long, resulting in an undesirable rotor RPM condition and potentially leading to the hazard of violating minimum separation from terrain or the hazard of losing control of the aircraft.

•  At least 5 causal scenarios why the FCCs might do this

Example (2): Software

Scenario 1: The FCCs are unaware that the desired state has been achieved and continue to supply collective input.
   a) The FCCs are not receiving accurate position feedback from the main rotor servos.
   b) The FCCs are not receiving input from the ICUs to stop supplying swashplate input.

Scenario 2: The FCCs do not send the appropriate response to the aircraft for particular control inputs. This could happen if:
   a) The control logic does not follow intuitive guidelines that have been implemented in earlier aircraft, perhaps because requirements to do so were not included in the software requirements specification.
   b) The hardware on which the FCCs are implemented has failed or is operating in a degraded state.

Scenario 3: The FCCs do not provide feedback to the pilots to stop commanding collective increase when needed because the FADEC (engine controller) is supplying incorrect cues to the FCCs regarding engine conditions.

Scenario 4: The FCCs do not provide feedback to the pilots to stop commanding collective increase when needed because the FCCs are receiving inaccurate NR (rotor rpm) sensor information from the main rotor.

Scenario 5: The FCCs provide incorrect tactile cueing to the ICUs (inceptor control units) to properly place the collective to prevent low rotor RPM conditions.

Example 2: Software (con’t)

•  Scenarios used to identify appropriate FCC requirements and design constraints.

•  For example, for Scenario 1:
   –  1. The FCCs must perform median testing to determine if feedback received from the main rotor servos is inaccurate.
   –  2. The PR SVO FAULT caution must be presented to the Flight Crew if the FCCs lose communication with a main rotor servo.
   –  3. The EICAS must alert the Flight Crew if the FCCs do not get input from the ICU every x seconds.

•  Translate these into “likelihood” (final piece of puzzle)
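Requirements 1 and 3 above can be read as two simple checks, sketched here. The slides do not define "median testing" or give a value for x, so both readings are assumptions: the median check flags any redundant reading that differs from the median by more than a tolerance, and the timeout check compares elapsed time against a caller-supplied interval. All function and parameter names are hypothetical.

```python
from statistics import median
from typing import List, Sequence


def flag_inaccurate_feedback(readings: Sequence[float], tolerance: float) -> List[int]:
    """Return indices of readings farther than `tolerance` from the median.

    Assumed interpretation of "median testing"; the tolerance is a design
    parameter that the slides do not specify.
    """
    mid = median(readings)
    return [i for i, value in enumerate(readings) if abs(value - mid) > tolerance]


def input_timed_out(last_input_time: float, now: float, max_interval: float) -> bool:
    """True if no input has arrived within `max_interval` seconds.

    `max_interval` stands in for the unspecified "x seconds" in requirement 3.
    """
    return (now - last_input_time) > max_interval


# Hypothetical servo position feedback from three redundant channels.
suspect_channels = flag_inaccurate_feedback([10.0, 10.2, 14.5], tolerance=1.0)
alert_needed = input_timed_out(last_input_time=0.0, now=0.5, max_interval=0.2)
```

Checks like these only detect and mitigate a causal factor; they do not eliminate it, which matters when the strength of the control is later translated into a likelihood.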

Translating Strength of Controls into Likelihood

Qualitative ranking such as
1.  The causal factor can be eliminated through design and high assurance.
2.  The occurrence of the causal factor can be reduced or controlled through system design.
3.  The causal factor can be detected and mitigated if it does occur through system design or through operational procedures.
4.  The only potential controls involve training and procedures.

Maybe too simplistic?
–  Could include how thoroughly the causal factor has been handled within each category
–  Combinations of possible controls?
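The four-level qualitative ranking can be written down as an ordered scale. The enum below mirrors the four categories from the slide. The rule for combining several controls (take the strongest one) is purely an assumption for illustration, since the slide leaves the question of combinations open.

```python
from enum import IntEnum
from typing import Iterable


class ControlStrength(IntEnum):
    # Lower number = stronger control = lower assessed likelihood.
    ELIMINATED = 1              # eliminated through design and high assurance
    REDUCED = 2                 # occurrence reduced or controlled by system design
    DETECTED_AND_MITIGATED = 3  # detected and mitigated if it does occur
    TRAINING_ONLY = 4           # only training and procedures are available


def likelihood_rank(controls: Iterable[ControlStrength]) -> ControlStrength:
    """Rank a causal factor by its strongest control (assumed combination rule)."""
    controls = list(controls)
    if not controls:
        raise ValueError("a causal factor with no controls cannot be ranked")
    return min(controls)


# Hypothetical causal factor covered by both training and a design change.
rank = likelihood_rank([ControlStrength.TRAINING_ONLY, ControlStrength.REDUCED])
```

A finer-grained scheme could also record how thoroughly each control handles the causal factor, as the slide suggests.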

Translating Strength of Controls into Likelihood (2)

•  May be able to come up with more sophisticated procedures for specific types of systems.

•  Examples in paper on this topic at: http://sunnyday.mit.edu/Risk-Matrix.pdf
   Architectural trade study for space exploration
   Air Traffic Control enhancements

Additional Considerations

•  Risk also affected by factors during manufacturing and operations:
   –  Manufacturing controls
   –  Designed maintainability and maintenance errors
   –  Training programs
   –  Changes over time in usage environment
   –  Consistency and rigor of management and oversight
   –  Assumptions during development about operational environment: how well they are communicated to users and how rigorously they are enforced during operations
   –  etc.

Additional Considerations (2)

•  Including these factors will improve risk assessment

•  Should also track factors and improve risk assessment over time
   –  Risk assessment process need not stop at deployment
   –  Risk-based decisions needed throughout lifecycle
      –  Castilho: Active STPA

•  Identify leading indicators of increasing risk during operations

Conclusions

•  Can provide improved risk matrix processes

•  Start from hazards, not failures, to get more realistic assessments of risk

•  STPA and better causal analysis can greatly improve likelihood estimates

•  Suggestions were provided and other people should be able to create even better processes

•  But limited by the use of the Risk Matrix and current definition of risk
   –  Alternative is to improve definition of risk and its evaluation
   –  Suggestions for this goal will follow (soon)
