statistics of past errors as a source of safety factors...

20
St~tistics of Past Y7rrors as a Source of Safety Factors for Current Models Alexander 1. Shlyakhter Departnunt of Physics Md NonIseust Regio~l Center for Global Environmc~al Change Harvard University, Cambridge, MA 02138 USA Abstract Resltlts of a comparative Malysis of actual vs. estintattd mnaa'nty in several dma sets dcrivdftm mural ad m'd sciences are pre~twd h sets include: i ) time #re& in the sequential masuremetus of the same physical q u m i v ; ii) e nviro~al mearurems of ~rMium in soil; iii) mional popularion pro- jectiom; iv) projccti~~tls for the United States' encrgy sector. Probobilitics of large &irotbtw3fom riz t l ~ ~ vahs ate garametW by M apme~tid distribu- tion with the slope &&nnincd by the dma. One can kdgc againn wpctrcd uncertainties by i@ui~g repned 1wtcertahty range by a &fcuolt j4fctyfoctor ~minedfrorn the rchm Nsro&ddolcr sets. N empirical approach can be UKd in the uncertainty analysis of the low probabilityfigk wttseqwwe events, such as risk of global wam'ng. Science policy often hinges on ~h&h assessment of the ~lllcceMty in predictions derived from various models. Ar~dysis of events with low probability but high consequences (such as probabil- ity of extxeme global warming) is meN fot Wion making. However it is well known that there is a strong tendency for researchers to underestimate uncertainties in results, thus dtxmdg their reliability md increasing the probability of "swprises".lb The history of natural and social sci- ences contains a wealth of data about reliability of models and estimates of parmetric unlmaiarty. methods of building confidence intervals around point estimates an used in the weather, population, and economic f~ncasting.~-ll They rely on the assumption that the distaibution of errors in htwe forecasts is the me as the distribution of these errors in the past forecasa. In many other fields, however (such as physical measurements and environmental risk as:~ssment),

Upload: others

Post on 27-Sep-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

St~tistics of Past Y7rrors as a Source of Safety Factors for Current Models

Alexander 1. Shlyakhter

Departnunt of Physics Md NonIseust R e g i o ~ l Center for Global Environmc~al Change Harvard University, Cambridge, MA 02138 USA

Abstract

Resltlts of a comparative Malysis of actual vs. estintattd mnaa'nty in several dma sets dcrivdftm mural a d m'd sciences are pre~twd h sets include: i ) time #re& in the sequential masuremetus of the same physical q u m i v ; ii) e n v i r o ~ a l mearurems of ~rMium in soil; iii) mional popularion pro- jectiom; iv ) projccti~~tls for the United States' encrgy sector. Probobilitics of large &irotbtw3fom r i z t l ~ ~ v a h s ate garametW by M apme~tid distribu- tion with the slope &&nnincd by the dma. One can kdgc againn wpctrcd uncertainties by i@ui~g repned 1wtcertahty range by a &fcuolt j4fctyfoctor ~ m i n e d f r o r n the rchm Nsro&ddolcr sets. N empirical approach can be UKd in the uncertainty analysis of the low probabilityfigk wttseqwwe events, such as risk of global wam'ng.

Science policy often hinges on ~ h & h assessment of the ~lllcceMty in predictions derived from various models. Ar~dysis of events with low probability but high consequences (such as probabil- ity of extxeme global warming) is m e N fot W i o n making. However it is well known that there is a strong tendency for researchers to underestimate uncertainties in results, thus d t x m d g their reliability md increasing the probability of "swprises".lb The history of natural and social sci- ences contains a wealth of data about reliability of models and estimates of parmetric unlmaiarty.

methods of building confidence intervals around point estimates an used in the weather, population, and economic f~ncasting.~-ll They rely on the assumption that the distaibution of errors in htwe forecasts is the m e as the distribution of these errors in the past forecasa. In many other fields, however (such as physical measurements and environmental risk as:~ssment),

Page 2: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

uppzr confidence limits for health risks are often calculated theoretically without er~~pirical valida- tion. In this paper, I present the results of a systematic analysis of actual errors in the datasets derived from nuclear physics, environmertal measurements, energy and population projec- tions.12-w The goal is to show how historic data on past overconfidence can be used to develop the empirical safety factors that can k applied to uncertainty estimates in ctlinnt models.

Error in a physical measurement is an additive function of three types of errors: rnndoln errors. systematic errors, and blunders. Random errors come from stz~tistical fluctuations of tEc me:: vd- ues obtained with a finial: number of trials. Systematic errors are errors that have the same sign in different trials, for example, errors caused by unknown thermal expansion of a measuring 1.0.'.

Blunders are gross errors usilaily caused by trivial mistakes (such as errors in entering tie data). Routine scientific data sets contain 5- 10% of blunders.3

If the total uncertainty is dominated by random errrjis, then by the Cenval Limit Theorem (CLT) the distribution of the aridmetic mean of rnany observations around the me value is asymptoticdily normal. Uncertainty is usudFj leportcd as an average of many trial rneasmlnents (corrected for d recognized systeruauc effects), A, and an associated standard deviation, A. If the actual vaiue of this quantity is a the2 the normalized deviation x = (a - A)lA follows the stmdard normal distribu- tion. In !!!at distribution, the range A A f ,964 has a 95 percent probability of i~tcluding a.

The presence of systematic errors violam the assumptions n- for use of the CL'I'. If most of the uncert.ainty comes fmm systematic errors, the usual jirsafication for normal distribution does not apply. Despite this fact, the normal distribution remains implicit when resachers report ma- s d values arrd the companding ancertaincies.

In Sections 2 and 3, I present the distribution of the actual errors in several datasets derived from physical and environmental measurements. In Sections 4,5, and 6, I present the methodoloey and the results of the analysis of the distribution of errors in population and energy pr~jxrions. II! all cases the observed distributions 61' emrs have long tails that can be approrknately described by the exponential distribution.

In Section 7, I show how urn&fes~ati,on of uncertainty cen be quantified by the ratio of the total uncertainty to the reported standard deviation, A. Assuming that this ratio is a normally distributed random variable with the standard deviation u, I derive a compound distribution which is normal for u = 0 and is exponential for u 2 1. Itl Section 8, I show how one can use the exponential dis- tribution to hedge against unsuspected wrcertainties in estimating the risk of global wanning.

2. TRENDS IN PHYSICAL MEASUREMENTS

The first attempts to quantify overcofi&..r.~ in physical measuremenu come from the work of B u k h v o s m ~ ~ ~ and Henrion and Fihoff.4 They compared elementary particle pisperks and fun- damental constants ic w l y compilations -4th much more accurate values take2 fmm a more recent

Page 3: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

compilation. A convenie~lt m a s u r e cf the deviation of "new" values from the "old" values is the normalized deviation n =: (a - A)/& with Q the new value, A the old value, and A the old standard devistkn.

Shlyakhter el al 12-'" extended ~hese original studies by following trends in data sets derived from nt~r!zz partic!e physics: masses and lifetimes of elementary particles, magnetic moments and Ilrztinnes of the excited nuclear states, and neutron scattering lengths (see Figure 1). All data sets were first converted into a standard fomart, Successive measurements of the same quantity com- prised a block of data; r data set typically consisted of several hundred such blocks. In order to limit the effects of "noise" in the d a ~ on final results, two selection criteria were applied: i) new s ~ i x i uncertainty had to be much smaller than the old one: &&.&, 2 4; and ii) only cases where deviation from the true value did not exceed ten standard deviatis~ls were included in the analysis (in this way most blunders were excluded). Criterion (i) ensures that the new measured value, a, is close to the unkrmwn true value. Neglecting &, introduces a small bias into the calculated x values. However, since the errors are combined in quadrature, this bias is only about 3%.

0 1 2 3 4 5 6 7 Ixl

Fig. 1. b b b i l i t y sf onexpected msults in physh9 mcmuremenb. The plots s h ~ w the cumuhtive pmbbility, S ( x ) = rp( t )dt tbac new measurements ((1) will be st least irl smdard deviations ( A ) away from the old results (A); x = (i - A)ld as defmed in the text. 'Ibe cumulative pFobability distributions of M iae shown for the three dara m: particle data (elementary heavy solid he); magnetic moments of exciciwl nuclear s w s Z 2 (dotted line); aeumn scattering (heavy dashed line). Also ploaed is cumulative normal distribution, erfc(xlrl&), (chi0 solid Line witb markm), and compound expmentid distribution with parameter u = 1 from Figure 5 (thin solid he) .

Page 4: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

The results confirm the earlier findisfigs that a normal distribution grossly underestimates probability of large deviations from the expected values. A new finding is that the pattern of o v e r c a ~ d e n a is similar In different h d s of measaernens.

3. TIWNDS IN ENVIRONMENTAL MEASUREMENTS

Environmental measurements are rarely repeated with the same samples and it is hard to estimate how widespread ate the tuwcounted errors in routinely collected data. An oppohty to address this issue is provided by the data on the excess uranium in the soil around the former Feed Materials Production Ccnter at Femald, Ohio, which had bsen a key uranium metals fabrication facility for U.S. defense projects mtil it was closed in 1989.z4 The data set was requalified at the request of the legal counsel to establish its credibility for litigation purposes. About 200 pairs of measurements of the radioactivity of three uranium isotopes (234U, a5U, and a8U) in the rme samples have been provided t me by K.A. Stcve~mn (U.S. Department of Energy Environmental Measurements Labomtory).

Fig. 2. Distribution of errow 3c wwsuramcnt of excess inniurn in SOU. Prejentation is similar to Flgm 1. The c m W v e probability dhtribudcm of Lrl are shown for (he three uranium isotopes (234~, heavy dotled W; Z ~ U , heavy dashed ~iae; aad 23% h v y solid line). ~ l s o PIOW are ~ 3 m p ~ n d expentiai ~ t r i - butions fmm Figure 5: u = 0 (Gaussian, (bin solid Linc witb marluas) and u = 1 (thin solid Line).

Page 5: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

The distribution of deviations between the original and requalified measurements normalized by the reported standard enor of the old measurement is shown in Figure 2. The distribution is similar to the dietxibuPiuii of amrs in ohe mesmments in nuclear physics shown in Figure 1. This similar- ity indicates that the fraction of unsuspected uncertainties may be similar in a wide variety of rou- tine mwmments.

4. UNCERTAINTY FUTUBE PROJECTIONS

Uncertainty in fu tm projections is ddefmd less formally than uncertainty in physical measure- ments. In this section m dgt;brithm for analysis of uncertainty in historical projections is presented One can estimate the s~m&sl deviation A of an equivalent normal distribution and then draw the empirid p b a b i l i ~ distributions of the deviations of the old projections from the true values nor- malized by A. Experts may not ~iecg&y imply the normally distributed enor terms. However, the users of the results tend to base their decisions on the assumption that deviations exceeding scved uncerhhty mges are hprobatle. Coinparkon of errors in historical ddta sets with those predicted by the normal dis~xibaarion provib rp wful measm of the credibility of curent unmr- &€y atimates. Uncertainty in the projection. is uwi3y presented in the form of "seference," Mlowor't and "upper" esthn8t.s (R, L, and (I yt:s,ptive1y) that ore obWed by running a inodel with diftemnt sets of cxogcnous parameters (cg. the annual rate of growth). The range of scatter around the reference value R does not forai;lly define a &~ussilrn standard deviation because the fundamental uncertain- ties involved (e.g. tht: rate of fuim economic growth) arr! fqllently not stoclnastic. However, it is reimnabie to assurne that the mrigc of pmwneter variation presented! by a forecaster represents a subjective judgment about gte probability that the true value Z'E [L,U. Generally, lower and upper b a ~ & pmnrt what is believed to be sr, "envelope" most likely tn bc8cht the true value and include the m~~jority of possible outcorns.

Note that bounded distributions (such 8s the tdarigulm distribution) which are sometimes used to describe the probabilities of various xenarlos asign zero probability to the outliers. This incor- tectly implies that L and kl m real bounds rather than the estimates of plausible lange. Histohcal data presented below, however, suggests that deviations exceeding the expected uncertainty range are not uncommon. Therefore, using a normal (unbounded) distribution as a frame of reference ~m&res~mtes trite svemnfickncx.

The standaid deviation of the equivalent normal distribution is calculated as follows:

a) Spec@ the subjective probabilty a that the me value vfiJ1 lie between the low (L) and high ( L o estimates. I assme a = 6896; larger values of u increase the discrepancy between the G a u h model and h t czlculaced by this method.

b) h.w m equivalent normal di~ibution that would have a spxif'id cur-ulative probaSilit-y fx bctwexm L and W. For a = 68% the stmchrd deviation of the quivalent nomaJ di:;eribu-

Page 6: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

tion is (U - L)/2 so that x = 2-(T - R)I(U - L). Therefore this choice of a cornsponds to the usual practice of splitting the un@e~~in ty m g e in half and using it as a smgate for the standard deviation.

c) If the refe~nce value (R) is not in the middle of the (&,U) iintervd use two separate normal distributions tnmcated lit zero: "left half" for (La) interval and "right half' for (R,U) interval each having oln as the cumulative probability.

5. POPULATION YROJEC918NS

The history of population projections provides an opportunity to test the mliaility of uncertainty mhates in demographic models. Shlyakhter md Kammenl2-'4 andyad United Nations poptda- tion projections, made in 1972, for the year 1985 that can serve as the set of "exact" values, a.

'Ihe population data base includes projections from 164 notions with population exmeding 100,000 presented in the form of "high" and "medium" and "low" variants for each nation.26 Data for 31 countries were excluded due to extreme errors resulting for example from unanticipated intema- tional migration; such cases can be considered as the use of a wrong model. Data for 133 nations satisfying the criteriA M < 10 were included in the analysis.

Tke results are shown in Figure 3. Because all the ppulation estimates come from an authoritative soruce, namely the United Nations, it might be oxpxted that systematic crors would be small, representing a well-calibrated modd. The unsuspacted uncertainty, however is very large. Data for 37 indusaiaUwl countrim (when! data we generally more reliable) show little less surprise, but probability of large emrs is still grossly underudmatcd by the normal distribution.

6, ENERGY PRO,JEglTl[OPJS

Forecilsts of future energy consumption are a prerequisite for many major economic and policy dedsions such as how best to reduce cmhn dioxide ernisions to deviate global warning, or how best to stimulate the pace of development of alternate sources of energy. An analysis of credibility of uncertainty estimates was perfomed in Refs. 16 and 17 using the largest coherent set of US erErgy projections for the year 1990, the Annual Energy Outlook (AE0).27

AEO projections .'or 1990 made in 1983, 1985, and 1987 consist of 182, 185, and 177 energy producing or conslming sectors of the U.S. economy respectively. The variation in the number of sacer~ss resulted because the low md high projections coincided in some cases, and no cornspond- ing uncertainty rmge could be derived.

Page 7: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

Fig. 3. Popubtton pmjectiom. Tbe plots depict cbc cumulative probrldiity. S ( x ) = rp(t)dl. Ihat me valw (T) will be at leaat lxl slandanl &viadomi (A) away h m the refet~=11~1: value of old projections (R) . The popWicm data base is desdW la iba text. The d v e pmbaM& dis(ributim8 of I*I am &own for the total dataset of 133 amtrim (heavy solid linc) aod for or asubaa of 37 Muatriallzed countries (heavy Whed line). Also &own i s cbe ~ a r n a l distrlbulton (W wUd li~llc with mrkm) nnrd (be eamp~uod d ~ b u ~ witb u 1 3 fm

In 47, 50, a d 47 caws respectively, the values of 1*I (calculated as described in Section 3) exceeded 100; such cases were omitted as they were outside any reasonable range of parametric uncertainty of the AEO model. For dl remaining cases the x values were calculated and the fie- quency distributions analyzed. The distribution of signed x values is approximately symmetric with respect to zero. There is no large systematic bias (e.g. a gross underestimation of energy consumption in all or many sectors) and no sou~ng trends in the sca#~;rpms of x values; this indi- cates that the projections are generally independeat.

figure 4 shows the cumulative probability distributions of M for the projections made for 1990 in 1983. 1985, nlld 1987 together with the Gaussian and exponential distributions. The tilree empiri- ciJ distributions nsr, s~%&~gly similar. Although the absolute error in projections made in 1987 for 1990 is somewhat smaller ban made in 1983 for 1990, the range of unc~rtahty is also smaller so cb%f probability of "large" deviations relative to the obsemd ~ ~ ~ c c r t a i n i y is roughly the same as for i-: othrs two ~ C R T S . OIle would expect that energy projec'iinns for a.ggregated mtcrs of economy

Page 8: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

would be more reliable than projections for individual sectors. However, this appears not to be the case 4, heavy dashed line).

Pig. 4. Amml b e g Chtlook projection#. 'Ibe pnsenlalion is as in Figure 3: 1983 to 1990 (heavy solid b); 198s to 1990 (beavy dssbed b); 1987 to 1990 (havy &tW Ltns), aggregefsd Seaas of a m c m y (vcry heavy &shed line): nomd dlatrlbttoa (Ulin miid Mnc wtlb marlren); canpoMd d&i&ulba witb u = 3.4 (thin mlid line).

7. EXPONENTIAL PARAMETMZATION OF OBSERVED DISTRIBWflIONS

Bukhvostov2o and ShlyaUIter and Kammenl4 suggested simple heuristic arguments to describe how an exponential distribution of erron might arise. Let us assume that the estimate of the mean. A, is u a b i ~ s d but that the estimate of the true standard deviation, A' is randomly biased with a distribution f(r) where t = A'IA. Here A is the estimated standard deviation. In other words, I assume chat the &viatiofis nomalized by A', xO= (a - A)lA' follow the s t w b d normal distribution while the deviations normalized by A, x = (a - A)lA, follow a normal distribution with a randomly chosen st~naclaxll deviation t:

hr:, ig over all values of I gives a corupound clistxibution:

Page 9: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

If f(t) has a sharp peak near t = 1, Eq. (2) reduces to the normal distribution. If f(t) iu broad, how- ever, the result is different. For simplicity, let us assume chat for large t, f(t) follows the Gaussian distribution with standard deviation u: f(t) oc exp[-t*/(2u)]. The main contribution to the integral in Eq.(2) comes from the vicinity of the saddle point where the exponential term xaches a maximum (for z = I,,: tz,, - ulrl). It is straightforward to show that, for large values of x, the probability distribution p(x) is not Gaussian but exponential: p(x) oc exp(-lrllu ). In order to 1 3 f l ~ t the fat that experts are mostly overconfi&nt (Ae 2 A), I use a huncawl normal form of f(t):

Integrating Eq. (2) with f(t) from Q. (3) gives the ccmuulative probability S(x) of deviations ex- ceeding Lxl:

The normal (rr = 0) and exponential distributions (cr > 1) are members of a shglepmmetcr family of curves (Figure 5) . For quick estimates for u 2 1, x 2 3. one can use the approximation e-hif(0*7u*0.j). In this framework the parametric uncertainty can be quantified by aaalyzing thc record of prior projections and estimating the value of r. Data presented iin Figures 1-4 sllow that u - l for physical and environmental measurements and u - 3 for population and energy pmjec ticns.

Parameterization with the one-parameter family of con~pound exponential distributions described above is not the oilly one possible. Tryiqg to find the standard deviation that best fits the data while keeping the parametric form of the normal distribution produces a very poor fit. A good- ness-of-fit analysis of the data sets presented in Figure 1 (Sakharov, private communication, February 1994) gave a good two-parameter f i t when the tlistributions of negative and positive deviations V J ~ mdyzed separately (with two different u values). Another good two-parameter fit was obmind with a sum of an inflated Gaussian distribution and a constant background term. Although increasing tile number of free parameters improves the quality of fit to the data, i t also cornlplicetes cornpaxisons betwefin datasets. Safety factors for uncei t:jinty cstimaks should c1r:y~cnd only on the general form of the dis~ibution of errors, and for this p u p = a simplc one-paramekr e.ri;ni *;Eial pi?r;alne&~iz~~t=ion is s:~ffibicient

Page 10: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

Flgum 5. Ow-parameter frxnlly of compound distributions. Parmeter u is Qfiaed in Eq. (3); it is a masure of mcatah~ty in che slandard & M m A). The value8 of u rare iMlicated in the figure. The curves demon- strate tbc coatinurn of gn,baWty dlRtTlb~IIG118. IlOm C3msh.n (~4) to expmcntlBJ. (U > 1).

There exists a broad s p t r u r n of uncertainties extending from the negligible parametric uncertain- ties to a crucial uncertainty which of several models to choose. An analyst usually has a scale in mind for "important" uncertainties: smaller uncertainties are assumed to be negligible and are excluded from the detailed analysis. For example, in measurements of the excess uranium in soil, radioactive decay of U5U introduces a negligible error becaw the characteristic time scale far this decay is billion years. "Importmt" uncertainties (such as detector efficiency) are carefully evalu- ated and combined to produce the published estimate of the combined standard u n c e w l t y . However, errors of larger scales, particularly those arising from the use of a wmng model, are also possible. For example an important uncenainty that was overlooked in the old measurements of excess uranium was the uncertainty in the overall chemical yield of the procedure used to leach uranium from the soil.24 This heuristi,: model is an attempt t describe the human thought pro- cxsses that are mpnsible for the obxwd pattern of overconfidence. It can be also fomulated as a h c i a l model with tlsc distribution of errors described by Levy stable distribution (a long-tail gcn- L liz~tiotl of tb2 normal distributi~n).~~

Page 11: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

$. APPLICATION: RISK OF GLOBAL WARMING

Cl~oosing appropriate safety factors as a hedge against unsuspected errors is particularly important in the uncertainty analysis of many situations in public policy. These describe events with low probability but high consequences that are deteranined by the he of the probability distributions. I illustrate possible applications using the c m n t debate about possible rise in global temperature in mgome to the emissions of greenhouse gases. For applications of the inflated confidence inter- vals to population a d energy projections, sea-level rise, estimates of health risks, and epidernio- logic studies see Refs. 14- 19.

The causal sequence leading to sea-level rise is as follows: population + energy production -+ C02 emissions -+ greenhouse warming. One can pnsent the temperature rise as a product of four roughly independent facurs:

The f i t factor is the world population; the second factor is energy production per capita; the third factor is C02 emissions per unit energ2 production; the founh factor is temperam increase ATper unit rise in C02. For each factor there are uncertainies in its respective model. In particular, the last factor includes a key scientific uncertainty of climate lndels, the global-mean swfiace tempera- ture increase, AT,, that would follow from a doubling of C 4 concentrations.

If the Earth was dry, the global-mean surface temperature would inciease by A& = 1.2'C and this estimate is quite reliable. However, numerous interactive feedbacks from water (most importantly, water vapor, snow-ice albedo, and clouds), introduce considerable uncertainties into AT, esti- mates. The value of AT, is roughly related to ATd by the formula ATs = ATd/(l - fi, where f denotes the sum of all feedbacks. The water vapor feedback is relatively simple: warmer ahno- sphere contains more water vapor which is itself a greenhouse gas. This results in a positive feed- back as an increase in one greenhouse gas, C4, induces an increase in another greenhouse gas, water vapor. On the other hand, cloud feedback is the difference between the warming caused by the reduced emission of infrared radiation rrom the Eanh into outer space and by the cooling through reduced absorption of solar radiation. The net effect is determined by cloud amount, alti- tude, and cbsad water content. As a result-, the values of AT, from different models vary from AT, = 1.9"C to ATs 5.2OC.28

Note dlat two models with comparable ATs values can have different strength of the feedback mecha_nisms. For example two models (labelled GFDL and GISS) show an unequal temperature incrmse as clouds are included (from 1.7"C md 2.0°C to 20°C and 3.2'C rcspectively).28 The effcc~s of ice dbcdo are different betni,;ca rn-dels but oppositely, so that the results converge (4.0°C v~:-~-sus 4.2OC, respcxtively). Therefore, agavxrnent beewen models may be spuriorrs, andl both COUICI be. v~roi3g.

Page 12: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

The causal sequence leading to sea-level rise is as follows: population -+ energy production + C02 emissions -t p n h o u s e warming. One can present the temperature rise as a product of four roughly independent factors:

The f i s t factor is the world population; the second factor is energy production per capita; the third factor is C02 emissions per unit e n e r c production; the fourth factor is temperature inc- AT per unit rise in @02. For each factor there are u n m d e s in its respective model. In particular, the last factor includes a key scientific u n ~ M t y of climate models, the global-mean surface tempeta- ture increase, AT8, that would follow f'rom a doubling of C 4 concentrations.

If the Earth was dry, the global-mean surface temperature would increase by ATd = 1.2.C and this estimate is quite reliable. However, numerous interactive feedbacks from water (most importantly, water vapor, snow-ice albedo, and clouFs), introduce considerable uncertainties into ATs esri- mates. The value of AT, is roughly related to ATd by the fomlula ATs = ATdl(l - j), wlrere f demotes the sum of all feedbacks. The water vapor feedback is relatively simple: warmer atmo- sphere contains more water vapor which is itself a greenhouse gas. This results in a positive feed- back as an increase in one greenhouse gas, COz, induces an increase in another greenhouse gas, water vapor. On the other hand, cloud feedback is the difference between the warming caused by the reduced emission of i n f d radiation itom the Earth into outer space and by the cooling through reduced absorption of solar radiation. The net effect is determined by cloud amount, alti- tude, and cloud water content As a result, the vdws of AT8 from different models vary from AT, = 1.9OC to AT, = 5.2OC.28

Note that two models with comparable AT, values can have different strength of the feedback mechanisms. For example two models (labelled GFDL and GISS) show an unequal temperature increase as clouds are included (from 1.7'C and 2.0°C to 2.0°C and 3.2'C n?spcctively).2* The effects of ice albedo are different b e t w ~ n models but oppositely, so that the results converge (4.0°C versus 4.2'C, nespectively). Therefore, agreement between models may be, spurious, and both could be wrong.

I i

I Tlae ~ilnlyst has thus been1 placed in a position for which he or she h a not been tmineti, docs not dcsirc. a ~ d should not be asked to take on. The opinion was expreswd h a t becruse there i s no

$->;: to ltnow with c~rrpinty v~!i!letl~er a reeralatory requirement b a e d on a probability has b r ~ n mcr, srrcil ~cgl~Iarion migirt be reg~rded a s meaningless. I~leaUy, regularion.; axe si-qed in tmms of concinnt~mis ii! .[lxicb both fi?e ~-egt~la_'@x' ~ n ( i cdtc regulated hd iv idud or entie: c m !=:now whether the , I !E i ~ ~ d i v i r i ~ ~ z i oi. cni'ity is BPI v o m p b a ~ ~ r ~ , .

Page 13: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

One can view the collection of all ATs predictions as a random sample derivcd from the population of all possible modds. We do not kn~w whether c m i ~ t models cover aP1 possible values of ATs. I assume that with probability a the true value is within the range of reported values. Let us assume a = 99%; the standard deviation is then 292.575 times less than the reported range. Note that in estimating u values for measurements and projections I used a = 68% for the reported uncertainty range. Themfore I assume that the collection of current climate models almost certainly covers the true value of ATs. Had I assumed a = 99% for the old forecasts, the derived standard deviations would be smaller and all x values would be larger. The resulting u values and the corre- sponding inflation factors would be also larger than the ones I used.

Since AT. is determined by the value of the sum of all feedbacks, f, I assume the normal distribu- tion for f and convert the range of AT, values into the range off values. For example, ATs = 1.9'C gives f = 0.37 and ATs = 5.2% gives f = 0.77. This range off values can be used to estimate the standard deviation of the equivalen normal distribution in the same way as for the population and energy projections above.

The corresponding distribution of ATs values is shown in Figure 6 together with the exponential distribution for u = 1 and the distribution of ATs from 21 global circula~tion models compiled in Ref. 28. By using the exponential distribution with u = 1, I assume that the fraction of unsuspected errors in climate models is similar to the fraction of unsuspected errors in physical measurements. With the normal distribution, here is a f % chance that the true value of AT, exceeds 5'C while with the exponential distribution this same probability corresponds to a catastrophic increase of more than 10'C. In a simple feedback description, f = 1 would m u l t in a catastrophic runaway warming. Although the true picture will be much more complex, and negative feedbacks will ultimately h i t the warming, the pcssibility of C02 atmospheric buildup that could lead to a m a w a y warming and a switch to a different c h i a t e equilibrium must be avoided by all means. In my view, prudent policy decisions should be bawd on the exponential as the "default" distribution, rather than the normal distribution. Any policy based on a more optimistic view of future temperature rise would have to be justifid.

9. SUMMARY

St;p&ctical analysis of the distributions of the actual parametric errors and frequency of model mis- specification in natural and sociai sciences can provide valuable infomation about the credibility of the estimates of parametric uncertah:y in the current models. Data sets for such analysis can be derived, for example, from time trends in sequential meswments of the same physical quantity or the same sample (for parametric uncertainty of the models used in natural sciences), or from the comparison of energy and population projections with actual values that became available later (for models of social parameters). For all data sets analyzed so far, distributions of errors show the same pattern: they have long tails that do not follow a normal distribution but can be pragmatically pa-ametrized by an exponential distribution with the slope &termined by the dala

Page 14: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

rhreshold rise, AT ("C)

Fig. 6. Estimates of the lmcan-surface global tempratare r k in response to danblling CO2 c o m e c n h h . aBe pubability of a t@anpmtm aise, ATs, gmfa thaa a given ttmdmld is plotted for the aKwmal Wbutim of mxxmhty in tbc: fadback W,P f (heavy solid line); the ex.pomntial Wbut ion with u = 1 (thin sow be); cbc dblxihtioll of ATs values horn 21 climate modelsB (heavy Qtted line).

The additional component of uncertainty derived from such analyses can be viewed as a safety factor accounting for o v e ~ ~ ~ o ~ & n c e of the experts. it therefore incorporates the possibility of human error into the h e w o r k of uncef.tahty analysis. Although data on past misunderstanding of a given situation cannot prevent our current misunderstanding of a significantly different sitwltion, statistical end;ysis of the frequency of past undemthates of uncertainty can provide useful clues to the choice of the appropriate safety factors.

A Legithate concern about the use of the default W o n factors for the confidence intervals is that this procedure ignores the specifics of particular studies. Some of the studies may be of much higl,cr quality than an raverage study in the data set from which the inflation coefficient was derived. Unfortunately, elicitation of expert opinions about all s igmfi i t uncefiainh is rarely feasible. The user can hedge against unstspected uncertainties multiplying the reported uncertainty range by a safety factor. Such factor should be clearly speafkd aad applied only afer the uncer- tainty has been &te&eIQ by a standard method, so that the o p t i o n may be easily reversxi.29

InteiSx.sestingly, the is values derived from our data sets cover a rather narrow interval: u = 1 for p11y:;icasl constants and M = 3 for cmr l t models of population growth and energy projections. d U b ~ ~ & &erg are nmny different scie~bfic models with s c fxu-ces sf mma-@hQ'? it may @C

posibBe to combine &elm in several distinct groups according to reliability of past unertflinty

Page 15: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

estimates. This would allow to use default inflation factors when no historical data sets are available.

This research was partially supported by the U.S. Department of Energy through the Nonheast Center of the National Institute for Global Environmental Change (NIGEC) and by the Biological Effects Bmch of the U.S. Department of Air Force through the contract F 3 3 6 1 5 - 9 2 - 0 2 . I am &ply grateful to D. Hattis, P. Stone, and R. Wilson for discussions and comments and to KA. Stevenson for providing numerical data on excess uranium in the soil at Femald site. Electronic mail address: [email protected].

1 . L.G. Parrat, Probability mid hkperimentul Errors in Science, John Wiley, New York, 1861.

2. J.R Macd~nald, Are the data worth owning?, Science , 176: 1377 (1972).

3. S. Lichterrstein, B. Fischoff, and L.D. Phillips, Calibration of probabilities: the state of the art to 1980, in J d g m n t Undcr UnceMinry, D. Kahneman, P. Slovic, and A. Tversky, eds., Cambridge University Ress, Cambridge, 1982.

4. M. Henrion and B. Fischoff, Assessing uncertainty in physical constants, Amer. J. of Physics ,54:791 (1886).

5 . M.G. Morgan and M. Henrion, Uncertainty: A Guide to Dealing with t h x ? ~ i n t y in Qmtitative Risk und Policy Analysis, Cambridge University Press, New Y ork, 1990.

6 . R.M.Cooke, Experts in Uncerroinry: Opinion md Subjective Probability in Science, Oxford University Ress, 199 1.

7. A.H. Murphy and R.L.WinBder, Reliability of subjective probability forecasts of precipi- tation and tefnpemm, 9. Royal Saat Soc., Ser.C, 26:41 (1977).

8. W.W. Williams and M.L. Goodman. A simple method for the construction of empirical confidence limits for economic f o m a s ~ . J. ~ r ~ a e r . Ss(at* Assoc., 66:752 (197 1).

9. M.A. Stoto, The accuracy of population projections, J. Amer. Stat. Assoc-, 78: 13 (1983).

Page 16: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

N.W. Keilman, "Uncertainty in National Population Forecasting: Issues, Backgrounds, Analyses, Recommendations," Amsterdam, Swets and Zeitlinger, hblications of the Netherlands Interdisciplinary Demographic Institute (NIDI) and the Population and Family Study Centre (CBGS), vol. 20. (1990).

V. b o w i tz, Business Cycles: Theory, History, Indicators, and Forecasting, The University of Chicago Ress, 1992.

kl. Shlyakkter and DM. Kamlrzn, Sea-level rise or fall?, Nature ,357:25 (1992).

AX. Shlyakhter and D.M. hnmen, Estimating the range of uncertainty in future develop- ment from trends in physical constants and predictions of global change, CSIA discussion paper 92-06, Kenned j School of Government, Haward University, July 1992.

AX. Shlyakhter and D.M. Kammen, Uncertainties in modeling low probabilityfhigh conse- quence events: application to population projections and models of sea-level rise, Proc. ISUM '93: 2nd Intl. Symp. Uncertainty Modeling and Analysis, University of Maryland, College Park, Maryland, April 25-28, IEEE Computer Soc. Press, Los Alamitos, California, 246-253 (1993).

AI. Shlyakhkr, et d., Estimating uncertainty in physical measurements and observational studies: lessons from trends ir. nuclear data, ibid p. 3 10-3 17.

D.M. h m e r r , ep al., Non-Gaussian uncertainty distributions: historical trends and fore- casts of the United States Energy sector, 1983-2010, ibid p. 1 12- 1 19.

A.I. Shly&terl et A, Quantifying the credibility of energy projections from mnds in past data: the U.S. energy sector," Energy 1'Policy, 22: 1 19 (1 994).

A.I. Shly&terr, Imp& framework for uncertainty analysis: accounting for unsuspected errors, Risk Analysis, in press.

AL Shly&hter4 Uncertainty atmates in scientific models: lessons from trends in physical I measuremeats, population and energy projections, in Uncenainty Modeling cndAIxz1ysis: Theory Md Apl?lications, B.M.Ayuub and M.M.Gupta, &, North Holland, in press.

t I i f.

AP. Bukhvostov, On the stat%t-;cal meaning of experimental mults, Eenirn.pd Nuclear 1 Physics htitub:: BfepPiunb LWX-45 ( I'973). (in Russian) i

LBL (199 1) Daut file of elementary particle masses and lifetimes maintained by Nuclear Data Center, Lawrence Berkeley National Laboratory.

Page 17: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

22. h4.P. Avotina, I.A. Kondurov, and 0.N. Sbitneva, "Tables Of Nuclear. Moments and Dcfonnation Parmeters of Atorriic Nuclei," Lerlingmd N~1c1ca-r Physics Institute Report, 11982.

23. I.. Koester, H. Rauch, and E. Seymann, Neutron scattering lengths: a survey of experi- mental data and methods, Atomic Dada Qnd Nuclear Data Tables, 4965 (1991).

24. KA. Stevenson and E.P. Hardy, Esmate of excess uranium in surface soil sl~rrounding the feed materials production center using a requalified data base, Health Physics. 65:283 ( 1 993).

25. P.R. Hampel, et al., Robust Stati$tic.s: The Approach Based On Influence Functions, John Wilcy and Sons, New York, 1986.

26. United Nations. World Population Prospects. Population Studies Paper No. 1 20 (1 99 1).

27. Annual Energy 00urlook with Projections to 2010, Energy Momation Administration, US Department of Energy, Washington, JX 20585, DOE/EIA-0383, 1992.

1

28. U. Cubasch and R.D. Cess in "Climate Change", The Intergovemrnen~l Panel on Climate Chmge (IPCC) Scientific Assessment. &port h p a r e d for K C by Working Group 1. Edited by I.T. Boughton, G.J. Oenkins and JJ. Ephraums. Cambridge University Press, 1990, p.75-91.

29. International Organization for SuinCadization (ISO), "Guide to the Expression of Uncefinty in Measurement," Geneva, Switzerland, 1993.

Page 18: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

i $.P. Avotina, 1.A. Kondurov, atld 0.N. Sbitneva, "Table? Of Nuclear T~io~ncnts md Dcfoi-taadon P a r u ~ ~ e ~ r s of Atomic Nuclei," Lexlingrad N ~ ~ l a u Physics blslieute Report, 1982.

I.. ICoester, GI. R a u ~ h , and E. Seyrnmn, Neutron scatrering lengths: a survey of experi- manzal data ;and rncthods, Atomic Dam landlVwlear D m Tables, 49:65 (1991).

KA. Stevenson and E.P. Hardy, Eathate of excess uranium in s u r f a ~ soil sun-oua~ding, the feed raaterials production center using a requmed data base, Health Physics. 65:283 (1993).

F.R. Hampel. et al.. Robust S ~ ~ ~ t i s t i c s : ne Approach Based On IItfluence Fwctions. John Wiley and Sons, New Yo&, 1986.

United Nations, World Popuhtion Prospecrs, Population Studies Paper No. 120 (199 1 ).

Annual Energy Outlook with Projections to 2010, Energy Information Adminis~ation. US Department of Energy, Washington, DC 20585, DOEIEIA-0383, 1992.

f

U. Cubasch and R.D. Cess in "Climate Change". The Intergovernmenfa1 Panel on Climate Change (IPCC) Scientific Assessment. Repon Prepand for K C by Working Group 1. Edited by J.T. Woughton, G.J. Genldns and J.J. Ephraurns. Cambridge University Press, 1990, p.75-91.

International Organization for Stmdadization (ISO), "Guide to the Expression of Uncertainty in Measurement," Geneva., Switzerland, 1993.

Page 19: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

Wiodel Uncertainty: Xes ~haracterizatit~n and Quantification

Held at Governor Calvert House and Conference Center Annapolis, Maryland October 20-22, 3.993

EditeJ by A. Moslcl.1, Univ. MD, N. Sia, ENEL, @. Srnidis, Lrniv. MD, C . hi, r1PC

Sga~rnsomd By UJ.S, Nuclrsr RqpBntory Commission E;t;:G&kG Idaho, %nc, University of Mnpia l~d

Page 20: Statistics of Past Errors as a Source of Safety Factors ...people.csail.mit.edu/ilya_shl/alex/93e_statistics... · errors in htwe forecasts is the me as the distribution of these

r 51 +< \; ~.g-A,

- li QIAL A jced %pits in Risk and

Model Uneertaintty: Its ~haracterizatibn and Quantification

Held at Govcrnor Calvert House, and Conference Center Annapolis, Maryland October 20-22, 1.953

Editc,j by A. ~Amich, Univ. MD, N. Sit;, INEL, C. Srnidi:;, Univ. MP, C. hi, F'EC

Spt~rrsod by 'BJ.S, 1Y'~uclear Regralutory Commission F~GikG Idaho, Hnc, U~tiversity 0% Maqf and

P F ~ P a? red for U.S. T4vrclenr Regulatory Commission