TRANSCRIPT
The Limits of Foresight in an Uncertain World
IET Conference on System Safety and Cyber Security 2014
Matthew Squair
Jacobs Australia
15 October 2014
1 M.Squair (SSCS 2014):
1 Overview
2 Risk, and its limits
3 Dealing with risk differently
Fukushima - 11 March 2011
A tsunami overtops the sea wall, leading to 3 reactor meltdowns [3]
- Historical events not considered
- Height exceeded design basis
- Progressive flooding
- Loss of standby generators
- ECCS & UHS became inoperable
- Loss of offsite power
- Severe damage to external infrastructure crippled recovery efforts (EPA/TEPCO 2011)
”But one could hardly imagine that such an event would recur, nor the greater event would happen in the land of the living.”
Yoshimitsu Okada, President, Japanese NRI for Earth Science and Disaster Prevention, 25th March 2011
”Research results are out, but there is no mention of that [tsunami] here... why?”
Yukinobu Okamura, NISA seismologist, June 2009
Blayais NPP - 27 December 1999
A combination of high tide, storm surge, wind-driven waves and river flooding at the Blayais NPP overtopped the sea wall and flooded the plant
- Combination of events not considered
- Height exceeded design basis
- Progressive flooding
- Damaged safety systems
- Loss of offsite power
- Damage to external infrastructure hampered recovery (EDF)
Why?
To understand why such events recur we need to start at the very beginning, with the history of risk
Blaise Pascal and the theory of parallel worlds
Pascal’s problem: how to fairly apportion the stakes of an interrupted card game
His answer? Expectation...
Imagine a series of parallel worlds made up of all the possible falls of the cards.
Applying set theory we get expectation (amount × probability of winning)
De Moivre - Risk is the reverse of expectation [6]
Total risk = sum of individual risks (a probability weighted mean)
This is the heart of classical risk theory
This Pascalian logic of probability has come to dominate how we think about uncertainty; it is the dominant paradigm in the Kuhnian sense
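The classical calculation can be sketched in a few lines of Python. The stake of 64 pistoles and the 3-in-4 winning chance are the figures usually quoted for the Pascal–Fermat correspondence; the loss scenarios in the second half are invented purely for illustration.

```python
# Pascal's solution to the problem of points: split the stake by each
# player's probability of winning had the interrupted game continued.
# Classic figures: 64 pistoles staked, player A winning 3 of the 4
# equally likely continuations of the game.
stake = 64.0
p_win_a = 3 / 4
expectation_a = stake * p_win_a  # A's fair share: 48 pistoles

# De Moivre: total risk as the probability-weighted sum of losses.
# The (probability, severity) pairs below are illustrative only.
scenarios = [(0.10, 100.0), (0.01, 1_000.0), (0.001, 10_000.0)]
total_risk = sum(p * c for p, c in scenarios)  # 10 + 10 + 10 = 30
```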
To apply risk theory in the world
We need to make some assumptions...
That we are certain about the probabilities
That the world is ’normal’
That we live in an ergodic world
That we are not going to see a Black Swan
That we actually apply it (normative)
Risk incorporates uncertainty
To return to Blaise Pascal’s problem, we can easily determine a player’s risk because the problem space is closed
But the real world is most certainly not closed
If the real world were a game of cards, it’s one of uncertain rules; only as you play the game can you start to infer the rules
As Aven [1] points out, real-world risk contains inherent uncertainty about both severity and probability
Examples: tsunami height modelling, WASH1400’s neglect of spent fuel rod risks, climate change impact, Follensbee’s list [5]
The world is not normal
To make risk work we must assume some bound on possible severity
When someone says, ’worst credible’ that’s the game they’re playing...
If the distribution of extreme events is Gaussian (normal) or exponential this is a fairly safe (low-risk) assumption; if not?
The problem is on what basis do you judge the shape of the tail?
We know there are events in the real world that have heavy tails
Wars
Electrical power network outages
Stock market crashes
Even if the tail is truncated, extreme events will still dominate risk
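A quick simulation (toy distributions, not real loss data) shows why this matters: under a heavy tail the single worst event carries a visible share of the total loss, while under a thin tail it is negligible.

```python
import random

random.seed(0)
n = 100_000

# Heavy tail: Pareto with shape alpha = 1.5 (infinite variance)
heavy = [random.paretovariate(1.5) for _ in range(n)]
# Thin tail: exponential with rate 1, for comparison
thin = [random.expovariate(1.0) for _ in range(n)]

# What share of the total "loss" does the single worst event carry?
heavy_share = max(heavy) / sum(heavy)
thin_share = max(thin) / sum(thin)
# Heavy tail: one event dominates; thin tail: the worst event vanishes
# into the bulk -- the extreme events drive the risk.
```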
Our systems may not be normal either
Highly Optimised Tolerance (HOT) theory is a plausible explanation for heavy-tailed behaviour in biological and technological systems [2]
Based on the premise that designing systems is an optimisation-under-constraints problem
For HOT systems we will see a high tolerance for expected events
But also a vulnerability to rare events or combinations of events
Such behaviour is reflected in heavy tail relationships
Note the performative nature of the optimisation model; in effect it dictates the sort of risks we will see with HOT systems
The FAR 121 system - A HOT system?
FAR 121 Aviation accidents 1962-2013
Tail dependence and the mean excess heuristic
If we are dealing with heavy tails there are some nasty problems:
Tail dependence
- The tendency of dependence between two random variables to concentrate in the extreme values
- Examples: wave height/storm surge (Blayais) [4], tsunami/earthquake (Fukushima), unreliable airspeed events
Mean excess heuristic
- If the mean excess value increases then suspect a heavy tail
- The next worst will be much worse than the current worst case
- Example: in air safety, should we consider 9/11 to be the next worst case after Tenerife?
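The heuristic can be sketched empirically (again with toy distributions, not accident data): estimate the mean excess at increasing thresholds and watch whether it grows.

```python
import random

random.seed(42)

def mean_excess(sample, u):
    """Mean amount by which observations above threshold u exceed it."""
    tail = [x - u for x in sample if x > u]
    return sum(tail) / len(tail)

n = 300_000
pareto = [random.paretovariate(3.0) for _ in range(n)]  # heavy tail
expo = [random.expovariate(1.0) for _ in range(n)]      # memoryless

thresholds = (2.0, 4.0, 8.0)
# Pareto: the mean excess grows with the threshold -- worse is much worse
me_pareto = [mean_excess(pareto, u) for u in thresholds]
# Exponential: the mean excess stays flat at 1/lambda at every threshold
me_expo = [mean_excess(expo, u) for u in thresholds]
```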
Mean excess heuristic, or worse is much worse
FAR 121 Top 20 most severe accidents and mean excess (red)
The irreversibility of time
Ergodic? What?
An ergodic system is one in which time averages and ensemble averages are the same
Ergodicity is fundamental to Pascal’s parallel worlds solution
A small wager...
I’ll toss a coin: if it’s heads I’ll pay you 5 times your current worth, but if it’s tails I get to take all you own: house, car, even your socks. I’m also exceedingly rich, so payment is not a problem
Would you take the bet?
The irreversibility of time
If we calculate the expectation (0.5 × 5 × CW − 0.5 × CW = 2 × CW) it looks good, but would you really?
If you’re uneasy, you’ve just experienced the St Petersburg Paradox
The paradox. A game of chance is arranged such that the expected value is infinite. The paradox arises in that no one is willing to pay an infinite amount to play
The reason. We don’t live in a parallel world
Of course when we use classical risk theory to argue the acceptability of extreme risks that’s exactly what we’re assuming...
And it’s wrong
Conclusion: Classical risk fails in the region of extreme consequences
The use of classical risk theory for extreme-consequence risks, where a loss would wipe us out, chronically underestimates the risk [7]
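The wager can be simulated as a toy model (a win is taken as ending with six times current worth, consistent with the 0.5 × 5 × CW − 0.5 × CW expectation above): the ensemble average across many parallel worlds says play, but the time average for one player who keeps playing is ruin.

```python
import random

random.seed(7)

WIN = 6.0   # heads: you keep your worth plus five times more
LOSE = 0.0  # tails: everything gone

# Ensemble average: many parallel worlds, each playing a single round
n_worlds = 100_000
outcomes = [WIN if random.random() < 0.5 else LOSE for _ in range(n_worlds)]
ensemble_avg = sum(outcomes) / n_worlds  # ~3x per round: expectation says play

# Time average: one player in one world, staking everything repeatedly
wealth = 1.0
for _ in range(200):
    wealth *= WIN if random.random() < 0.5 else LOSE
# A single tails (all but certain inside 200 rounds) ends the game for good
```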
About those Black Swans (and the Ludic fallacy)
Of course we can only evaluate those risks that we know about
But risks that we haven’t identified still exist don’t they?
If we have a heavy-tailed distribution but have assumed a thin tail, these risks may be severe and unexpected (Nassim Taleb’s Black Swans)
So how to identify and address these risks?
Clarke’s rule
The only way of discovering the limits of the possible is to venture a little way past them into the impossible
Reducing uncertainty sometimes means taking risks
The Apollo LLTV was used for both flight control development and astronaut training, but almost killed Neil Armstrong
Risk as a prescription
Risk is also a normative statement: if there is a finite acceptable risk, then when the event occurs, that’s acceptable
But in practice?
No one has ever argued the law of large numbers after a string of major accidents as a justification for inaction
After a major accident we almost always try and figure out the why to reduce the risk...
err, but wasn’t it acceptable before?
Conclusion: We don’t follow risk theory in practice
We may use the tools of classical risk theory, but we don’t act as if we really believe them [7]
Managing risk in the real world
USN aviation Class A accident rates (manned) vs USAF drones
A different model: Risk as lack of knowledge
Secretary Rumsfeld’s view
”...there are known knowns; there are things we know we know. We also know there are known unknowns; ...we know there are some things we do not know. But there are also unknown unknowns, the ones we don’t know we don’t know...”
A different model: The 4 quadrants of uncertainty & risk
4 quadrant risk diagram (courtesy Mike Clayton)
Handling risk types
Unknown knowns - Knowledge governance/risk communication
Known knowns (aleatory) - Classic reliability techniques, design margins
Known unknowns (epistemic) - Robustness, research, model sensitivity analysis
Unknown unknowns (ontological) - Constant horizon scanning,exploration, severity reduction measures, resilience
Conclusion
All types of risk need to be addressed for risk management to be effective
A different model: An uncertainty budget for systems
Our initial budget should cover both the known knowns and a risk reserve to cover what we don’t know about
As we identify risks, the unknown risk reserve will transfer to the known budget
Operating the system will reduce the unknown reserve gradually
Reducing the severity of losses can reduce both the known and unknown risk components
But change the system and we increase our unknown component again, even if we are decreasing our known risks
Note that this is not a ‘memoryless’ process; risk depends on what has come before, there’s no bathtub assumption here
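The bookkeeping can be sketched as a toy model (the quantities, rates, and function names are all invented for illustration):

```python
# A known budget and an unknown reserve, in arbitrary risk units.
budget = {"known": 0.0, "unknown": 1.0}

def identify_risk(b, amount):
    """An identified risk transfers from the unknown reserve to the known budget."""
    moved = min(amount, b["unknown"])
    b["unknown"] -= moved
    b["known"] += moved

def operate(b, burn=0.1):
    """Operating experience gradually shrinks the unknown reserve."""
    b["unknown"] *= 1.0 - burn

def modify_system(b, amount):
    """Changing the system re-introduces unknowns, even as known risks fall."""
    b["unknown"] += amount

identify_risk(budget, 0.3)   # known: 0.3, unknown: 0.7
operate(budget)              # unknown: 0.63
modify_system(budget, 0.2)   # unknown: 0.83 -- history matters; not memoryless
```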
Summing up
We should probably stop fretting about 10E-9 type numbers; we can’t prove them up front (if at all) and the failure models they represent are not representative of system risks
Classical risk theory is inappropriate for truly catastrophic events (ask the Japanese); use something else, like Kelly’s theorem
The greater the consequences, and the smaller the required probability, the more our risk will be dominated by uncertainty of an epistemic or ontological persuasion
Be prepared for heavy tail behaviour in your system
Be prepared to be surprised
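As one sketch of the Kelly alternative mentioned above (the bet-sizing criterion is standard; the simulation parameters are illustrative), applied to the earlier coin wager: size the stake to maximise time-average growth rather than expectation, and never stake everything.

```python
import random

random.seed(3)

# Kelly criterion: stake the fraction f* = p - q/b of current wealth,
# where p is the win probability, q = 1 - p, and b the net odds on a win.
p, b = 0.5, 5.0           # the earlier coin wager: even chances, a win pays 5 to 1
f_star = p - (1 - p) / b  # 0.4 -- stake 40% of wealth, never all of it

def play(fraction, rounds=200):
    """Compound wealth by betting a fixed fraction of it every round."""
    wealth = 1.0
    for _ in range(rounds):
        stake = fraction * wealth
        wealth += stake * b if random.random() < p else -stake
    return wealth

kelly_wealth = play(f_star)  # positive time-average growth
all_in_wealth = play(1.0)    # staking everything: ruin on the first tails
```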
References and acknowledgements I
Thanks to Mike Clayton for the use of his four quadrants of uncertainty diagram.
[1] T. Aven. “Misconceptions of Risk”, John Wiley & Sons Ltd, Chichester, United Kingdom (2009).
[2] J.M. Carlson, J. Doyle, “HOT: A mechanism for power laws in designed systems”, Phys. Rev. E 60, 1412–1427 (1999).
[3] M. Fackler. “Japan Weighed Evacuating Tokyo in Nuclear Crisis”, New York Times, 27 February (2012).
[4] E. de Fraguier. “Lessons Learned from 1999 Blayais Flood: Overview of the EDF Flood Risk Management Plan”, NRC Regulatory Information Conference, Bethesda, Maryland (2010).
[5] R.E. Follensbee, “Six Recent Accidents/Incidents where the Probability of Occurrence Calculates to Less than 10−9”, Sunnyday.mit.edu, (1993). [online] URL: http://sunnyday.mit.edu/16.863/follensbee.html [Accessed 14 Sep. 2014].
References and acknowledgements II
[6] A. Hald, A. de Moivre and B. McClintock, “A. de Moivre: ‘De Mensura Sortis’ or ‘On the Measurement of Chance’”, International Statistical Review, Vol. 52, No. 3 (Dec. 1984), pp. 229–262, International Statistical Institute (ISI), (1984), URL: http://www.jstor.org/stable/1403045.
[7] O. Peters, “On Time and Risk”, The Santa Fe Institute Bulletin, pp. 36–41 (2009).