Verification of ensemble systems
Chiara Marsigli
ARPA-SIMC
Deterministic forecasts
Event E
e.g.: the precipitation cumulated over 24 hours at a given location (raingauge, radar pixel, hydrological basin, area) exceeds 20 mm
the event is observed with frequency o(E)
yes: o(E) = 1
no: o(E) = 0
the event is forecast with probability p(E)
yes: p(E) = 1
no: p(E) = 0
Probabilistic forecasts
Event E
e.g.: the precipitation cumulated over 24 hours at a given location (raingauge, radar pixel, hydrological basin, area) exceeds 20 mm
the event is observed with frequency o(E)
yes: o(E) = 1
no: o(E) = 0
the event is forecast with probability p(E)
p(E) ∈ [0,1]
Ensemble forecasts
Event E
e.g.: the precipitation cumulated over 24 hours at a given location (raingauge, radar pixel, hydrological basin, area) exceeds 20 mm
the event is observed with frequency o(E)
yes: o(E) = 1
no: o(E) = 0
M member ensemble
the event is forecast with probability p(E) = k/M, where k is the number of members forecasting the event
no member: p(E) = 0
all members: p(E) = 1
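As a minimal sketch of the ensemble probability p(E) = k/M, the function below counts the members that exceed the threshold. The function name and the sample precipitation values are hypothetical, chosen only for illustration.

```python
def ensemble_probability(members, threshold=20.0):
    """Return p(E) = k/M for the event 'forecast value exceeds threshold'."""
    m = len(members)
    if m == 0:
        raise ValueError("the ensemble must have at least one member")
    # k = number of members forecasting the event
    k = sum(1 for value in members if value > threshold)
    return k / m


# Hypothetical 24-hour precipitation forecasts (mm) from a 5-member ensemble.
members = [12.0, 25.5, 31.0, 8.2, 22.1]
p = ensemble_probability(members)  # 3 of the 5 members exceed 20 mm
print(p)
```

With no member above the threshold the function returns 0, and with all members above it returns 1, matching the two limiting cases.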
Probabilistic forecasts
An accurate probability forecast system has:
reliability - agreement between forecast probability and mean observed frequency
sharpness - tendency to forecast probabilities near 0 or 1, as opposed to values clustered around the mean
resolution - ability of the forecast to resolve the set of sample events into subsets with characteristically different outcomes
Brier Score
Scalar summary measure for the assessment of the forecast performance: the mean squared error of the probability forecast
$$BS = \frac{1}{n} \sum_{k=1}^{n} (f_k - o_k)^2$$
• n = number of points in the “domain” (spatio-temporal)
• $o_k$ = 1 if the event occurs, = 0 if the event does not occur
• $f_k$ is the probability of occurrence according to the forecast system (e.g. the fraction of ensemble members forecasting the event)
• BS can take on values in the range [0,1], a perfect forecast having BS = 0
Sensitive to the climatological frequency of the event: the rarer an event, the easier it is to get a good BS without having any real skill
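The Brier Score can be sketched in a few lines of Python. The function name and the sample data are assumptions made for illustration. The second example shows the sensitivity to rare events: a forecast that never predicts the event still scores close to 0.

```python
def brier_score(forecast_probs, observations):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    n = len(forecast_probs)
    if n == 0 or n != len(observations):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((f - o) ** 2 for f, o in zip(forecast_probs, observations)) / n


# Hypothetical forecasts and outcomes at four points.
bs = brier_score([0.8, 0.2, 0.0, 1.0], [1, 0, 0, 1])

# Rare event (1 occurrence in 100 points), forecast probability always 0:
# the score looks good although the forecast has no skill.
rare_observations = [0] * 99 + [1]
bs_rare = brier_score([0.0] * 100, rare_observations)
print(bs, bs_rare)
```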
Brier Score decomposition (Murphy, 1973)
$$BS = \frac{1}{N} \sum_{k=0}^{M} N_k (f_k - \bar{o}_k)^2 - \frac{1}{N} \sum_{k=0}^{M} N_k (\bar{o}_k - \bar{o})^2 + \bar{o}(1 - \bar{o})$$
The three terms are, in order: reliability, resolution, uncertainty
M = ensemble size
k = 0, …, M = number of ensemble members forecasting the event (probability classes)
N = total number of points in the verification domain
$N_k$ = number of points where the event is forecast by k members
$\bar{o}_k = \frac{1}{N_k} \sum_{i=1}^{N_k} o_i$ = frequency of the event in the sub-sample $N_k$
$\bar{o}$ = total frequency of the event (sample climatology)
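A sketch of the decomposition for an M-member ensemble follows. It assumes the forecast probability in class k is k/M; the function name and the sample data are hypothetical. Under that assumption reliability minus resolution plus uncertainty reproduces the Brier Score computed directly.

```python
def brier_decomposition(member_counts, observations, ensemble_size):
    """Return (reliability, resolution, uncertainty).

    member_counts[i] is k, the number of members forecasting the event at
    point i; observations[i] is 1 if the event occurred there, else 0.
    """
    n = len(observations)
    o_bar = sum(observations) / n  # sample climatology
    reliability = 0.0
    resolution = 0.0
    for k in range(ensemble_size + 1):
        obs_k = [o for c, o in zip(member_counts, observations) if c == k]
        n_k = len(obs_k)
        if n_k == 0:
            continue  # empty probability class contributes nothing
        f_k = k / ensemble_size      # forecast probability of class k (assumed k/M)
        o_bar_k = sum(obs_k) / n_k   # observed frequency in the sub-sample
        reliability += n_k * (f_k - o_bar_k) ** 2
        resolution += n_k * (o_bar_k - o_bar) ** 2
    return reliability / n, resolution / n, o_bar * (1.0 - o_bar)


# Hypothetical sample: 10 points, 4-member ensemble.
counts = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
obs = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
rel, res, unc = brier_decomposition(counts, obs, ensemble_size=4)
bs_direct = sum((c / 4 - o) ** 2 for c, o in zip(counts, obs)) / len(obs)
print(rel, res, unc, rel - res + unc, bs_direct)
```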
Brier Skill Score
Measures the improvement of the probabilistic forecast relative to a reference forecast (e.g. sample climatology)
$$BSS = \frac{BS_{cli} - BS}{BS_{cli}}$$
$$BS_{cli} = \bar{o}(1 - \bar{o})$$
$\bar{o}$ = total frequency of the event (sample climatology)
The forecast system has predictive skill if BSS is positive, a perfect system having BSS = 1.
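The skill score relative to sample climatology can be sketched as below. The function name and test data are illustrative assumptions; the reference score is the climatological Brier Score, the sample frequency of the event times one minus that frequency.

```python
def brier_skill_score(forecast_probs, observations):
    """Skill relative to sample climatology: (BS_cli - BS) / BS_cli."""
    n = len(observations)
    if n == 0 or n != len(forecast_probs):
        raise ValueError("need two non-empty sequences of equal length")
    bs = sum((f - o) ** 2 for f, o in zip(forecast_probs, observations)) / n
    o_bar = sum(observations) / n
    bs_cli = o_bar * (1.0 - o_bar)
    if bs_cli == 0.0:
        raise ValueError("event always or never observed: BS_cli is zero")
    return (bs_cli - bs) / bs_cli


obs = [1, 0, 0, 1]
bss_perfect = brier_skill_score([1.0, 0.0, 0.0, 1.0], obs)      # perfect forecast
bss_climatology = brier_skill_score([0.5, 0.5, 0.5, 0.5], obs)  # always forecasts the sample frequency
print(bss_perfect, bss_climatology)
```

A forecast that always issues the sample frequency scores exactly 0, which is why positive values indicate predictive skill.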
Ranked Probability Score
Extension of the Brier Score to multi-event situations. The squared errors are computed with respect to the cumulative probabilities in the forecast and observation vectors.
$$RPS = \frac{1}{M-1} \sum_{m=1}^{M} \left[ \sum_{k=1}^{m} f_k - \sum_{k=1}^{m} o_k \right]^2$$
• M = number of forecast categories
• $o_k$ = 1 if the event occurs in category k, = 0 if the event does not occur in category k
• $f_k$ is the probability of occurrence in category k according to the forecast system (e.g. the fraction of ensemble members forecasting the event)
• RPS takes on values in the range [0,1], a perfect forecast having RPS = 0
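A minimal sketch of the Ranked Probability Score for a single forecast is given below. It assumes ordered categories, a forecast probability for each category, and the index of the category that was observed; the function name and the example values are hypothetical.

```python
def ranked_probability_score(forecast_probs, observed_category):
    """RPS for one forecast over M ordered categories.

    forecast_probs[k] is the forecast probability of category k;
    observed_category is the (0-based) index of the category that occurred.
    """
    m = len(forecast_probs)
    if m < 2:
        raise ValueError("need at least two categories")
    cum_f = 0.0  # cumulative forecast probability
    cum_o = 0.0  # cumulative observation (0 until the observed category, then 1)
    total = 0.0
    for k in range(m):
        cum_f += forecast_probs[k]
        cum_o += 1.0 if k == observed_category else 0.0
        total += (cum_f - cum_o) ** 2
    return total / (m - 1)


rps = ranked_probability_score([0.2, 0.5, 0.3], observed_category=1)
rps_perfect = ranked_probability_score([0.0, 1.0, 0.0], observed_category=1)
rps_worst = ranked_probability_score([1.0, 0.0, 0.0], observed_category=2)
print(rps, rps_perfect, rps_worst)
```

Because the errors are taken on cumulative probabilities, a forecast that puts its weight far from the observed category is penalised more than one that misses by a single category.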
Contingency table
                Observed
                Yes   No
Forecast  Yes    a     b
          No     c     d
A contingency table can be built for each probability class (a probability class can be defined as the % of ensemble elements which actually forecast a given event)
Hit Rate
$$H = \frac{a}{a+c} = \frac{\text{number of correct forecasts of the event}}{\text{total number of occurrences of the event}}$$
False Alarm Rate
$$F = \frac{b}{b+d} = \frac{\text{number of non-correct forecasts of the event}}{\text{total number of non-occurrences of the event}}$$
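The contingency table and the two rates can be sketched as follows, for yes/no forecasts and observations coded as 1/0. The function names and the sample data are assumptions made for illustration.

```python
def contingency_table(forecasts, observations):
    """Return (a, b, c, d): hits, false alarms, misses, correct rejections."""
    a = b = c = d = 0
    for f, o in zip(forecasts, observations):
        if f and o:
            a += 1  # forecast yes, observed yes
        elif f and not o:
            b += 1  # forecast yes, observed no
        elif not f and o:
            c += 1  # forecast no, observed yes
        else:
            d += 1  # forecast no, observed no
    return a, b, c, d


def hit_rate(a, c):
    return a / (a + c)


def false_alarm_rate(b, d):
    return b / (b + d)


# Hypothetical yes/no forecasts and observations at eight points.
forecasts = [1, 1, 0, 0, 1, 0, 0, 1]
observations = [1, 0, 0, 1, 1, 0, 0, 0]
a, b, c, d = contingency_table(forecasts, observations)
H = hit_rate(a, c)
F = false_alarm_rate(b, d)
print((a, b, c, d), H, F)
```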
ROC Curves (Relative Operating Characteristics, Mason and Graham 1999)
For the k-th probability class:
$$H_k = \sum_{i=k}^{M} H_i \qquad F_k = \sum_{i=k}^{M} F_i$$
Hit rates are plotted against the corresponding false alarm rates to generate the ROC Curve. The area under the ROC curve is used as a statistical measure of forecast usefulness. A value of 0.5 indicates that the forecast system has no skill.
ROC Curve
k-th probability class: E is forecast if it is forecast by at least k ensemble members => a warning can be issued when the forecast probability for the predefined event exceeds some threshold
[Figure: ROC curve, hit rate plotted against false alarm rate with one point per probability class; the end points are labelled “At least 0 members” (always) and “At least M+1 members” (never)]
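The construction of the ROC curve from the "at least k members" rule can be sketched as below. The function names and the sample data are hypothetical, and the area is computed with the trapezoidal rule, which is one common choice rather than necessarily the one used for the original figure.

```python
def roc_points(member_counts, observations, ensemble_size):
    """Return (F, H) pairs for thresholds 'at least k members', k = 0 .. M+1."""
    n_events = sum(observations)
    n_non_events = len(observations) - n_events
    if n_events == 0 or n_non_events == 0:
        raise ValueError("need both occurrences and non-occurrences")
    points = []
    for k in range(ensemble_size + 2):
        pairs = list(zip(member_counts, observations))
        hits = sum(1 for c, o in pairs if c >= k and o == 1)
        false_alarms = sum(1 for c, o in pairs if c >= k and o == 0)
        points.append((false_alarms / n_non_events, hits / n_events))
    return points


def roc_area(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    area = 0.0
    for (f0, h0), (f1, h1) in zip(pts, pts[1:]):
        area += (f1 - f0) * (h0 + h1) / 2.0
    return area


# Hypothetical sample: 10 points, 4-member ensemble.
counts = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
obs = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
points = roc_points(counts, obs, ensemble_size=4)
area = roc_area(points)
print(points[0], points[-1], area)
```

The first point, "at least 0 members", always issues the warning and sits at (1, 1); the last, "at least M+1 members", never issues it and sits at (0, 0).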
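Cost-loss Analysis
Contingency table (a, b, c, d as relative frequencies)
                Observed
                Yes   No
Forecast  Yes    a     b
          No     c     d
$\bar{o} = a + c$ is the sample climatology (the observed frequency)
With a deterministic forecast system, the mean expense for unit loss is:
$$ME = \frac{c L + (a + b) C}{L} = F \frac{C}{L} (1 - \bar{o}) - H \bar{o} \left(1 - \frac{C}{L}\right) + \bar{o}$$
Value
$$V_k = \frac{ME_{cli} - ME_{f_k}}{ME_{cli} - ME_{p}}$$
Gain obtained using the system instead of the climatological information, as a percentage of the gain obtained using a perfect system ($ME_{cli}$: mean expense using climatology, $ME_{p}$: using a perfect system, $ME_{f_k}$: using the forecast with probability threshold k)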
Decisional model
                      E happens
                      yes   no
U takes action  yes    C     C
                no     L     0
If the forecast system is probabilistic, the user has to fix a probability threshold k.
When this threshold is exceeded, the user takes protective action.
Cost-loss Analysis
Curves of Vk as a function of C/L, a curve for each probability threshold. The area under the envelope of the curves is the cost-loss area.
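The value curves can be sketched from the hit rate H and false alarm rate F of each probability threshold. The expressions used here for the climatological and perfect-forecast expenses, min(C/L, sample climatology) and sample climatology times C/L, are assumptions taken from the standard cost-loss model (see Richardson, 2000, in the bibliography); the function names and the sample numbers are hypothetical.

```python
def mean_expense(hit_rate, false_alarm_rate, o_bar, cost_loss_ratio):
    """Mean expense per unit loss for a yes/no forecast system."""
    return (false_alarm_rate * cost_loss_ratio * (1.0 - o_bar)
            - hit_rate * o_bar * (1.0 - cost_loss_ratio)
            + o_bar)


def relative_value(hit_rate, false_alarm_rate, o_bar, cost_loss_ratio):
    """Value V: gain over climatology as a fraction of the perfect-system gain."""
    if not (0.0 < o_bar < 1.0 and 0.0 < cost_loss_ratio < 1.0):
        raise ValueError("o_bar and C/L must lie strictly between 0 and 1")
    me_forecast = mean_expense(hit_rate, false_alarm_rate, o_bar, cost_loss_ratio)
    # Assumed reference expenses (standard cost-loss model):
    me_cli = min(cost_loss_ratio, o_bar)  # always act or never act, whichever is cheaper
    me_perfect = o_bar * cost_loss_ratio  # act only when the event occurs
    return (me_cli - me_forecast) / (me_cli - me_perfect)


# Hypothetical (H, F) pairs, one per probability threshold.
thresholds = [(0.9, 0.4), (0.6, 0.1)]
o_bar, ratio = 0.3, 0.2
values = [relative_value(h, f, o_bar, ratio) for h, f in thresholds]
envelope = max(values)  # best threshold for a user with this C/L
print(values, envelope)
```

Repeating the last three lines over a range of C/L values gives one curve per threshold and the envelope whose area is the cost-loss area.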
Reliability Diagram
o(p) is plotted against p for some finite binning of width dp
In a perfectly reliable system o(p) = p and the graph is a straight line oriented at 45° to the axes
If the curve lies below the 45° line, the probabilities are overestimated
If the curve lies above the 45° line, the probabilities are underestimated
Sharpness
Refers to the spread of the probability distributions.
It is expressed as the capability of the system to forecast extreme values, or values close to 0 or 1. The frequency of forecasts in each probability bin (shown in the histogram) shows the sharpness of the forecast.
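The points of a reliability diagram and the sharpness histogram can be computed together, as in the sketch below. The equal-width binning, the function name and the sample data are assumptions made for illustration.

```python
def reliability_curve(forecast_probs, observations, n_bins=5):
    """Return one (mean forecast p, observed frequency o(p), count) per non-empty bin."""
    counts = [0] * n_bins
    prob_sums = [0.0] * n_bins
    event_sums = [0] * n_bins
    for p, o in zip(forecast_probs, observations):
        b = min(int(p * n_bins), n_bins - 1)  # p = 1 falls in the last bin
        counts[b] += 1
        prob_sums[b] += p
        event_sums[b] += o
    curve = []
    for b in range(n_bins):
        if counts[b] > 0:
            curve.append((prob_sums[b] / counts[b],
                          event_sums[b] / counts[b],
                          counts[b]))
    return curve


# Hypothetical forecasts and outcomes at eight points.
probs = [0.1, 0.1, 0.3, 0.5, 0.7, 0.9, 0.9, 0.9]
obs = [0, 0, 0, 1, 1, 1, 1, 0]
curve = reliability_curve(probs, obs)
for mean_p, obs_freq, count in curve:
    print(round(mean_p, 2), round(obs_freq, 2), count)
```

Plotting the second column against the first gives the reliability curve; the third column is the histogram of forecast frequencies that shows the sharpness.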
Rank histogram (Talagrand Diagram)
Rank histogram of the distribution of the values forecast by an ensemble
[Figure: the forecast values V1, V2, V3, V4, V5 of a five-member ensemble divide the range of forecast values into the intervals I, II, III, IV; the two end classes are “Outliers below the minimum” and “Outliers above the maximum”]
Percentage of Outliers: percentage of points where the observed value lies outside the range of forecast values.
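A rank histogram and the percentage of outliers can be sketched as follows. The rank of an observation is taken as the number of members below it, and ties between members and observation are ignored for simplicity; the function names and the sample data are hypothetical.

```python
def rank_histogram(ensembles, observations):
    """Count, for each rank 0 .. M, how often the observation takes that rank.

    Rank 0 means the observation is below all members (outlier below the
    minimum); rank M means it is above all members (outlier above the maximum).
    """
    m = len(ensembles[0])
    counts = [0] * (m + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for value in members if value < obs)
        counts[rank] += 1
    return counts


def outlier_percentage(counts):
    """Percentage of points where the observation lies outside the ensemble range."""
    return 100.0 * (counts[0] + counts[-1]) / sum(counts)


# Hypothetical five-member ensembles at four points.
ensembles = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0, 9.0],
]
observations = [0.5, 5.0, 4.5, 7.5]
counts = rank_histogram(ensembles, observations)
outliers = outlier_percentage(counts)
print(counts, outliers)
```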
Bibliography
www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.html
www.ecmwf.int
Bougeault, P., 2003. WGNE recommendations on verification methods for numerical prediction of weather elements and severe weather events. CAS/JSC WGNE Report No. 18.
Jolliffe, I.T. and D.B. Stephenson, 2003. Forecast Verification: A Practitioner’s Guide in Atmospheric Science. Wiley.
Nurmi, P., 2003. Recommendations on the verification of local weather forecasts. ECMWF Technical Memorandum n. 430.
Stanski, H.R., L.J. Wilson and W.R. Burrows, 1989. Survey of Common Verification Methods in Meteorology. WMO Research Report No. 89-5.
Wilks, D.S., 1995. Statistical Methods in the Atmospheric Sciences. Academic Press, New York, 467 pp.
Hamill, T.M., 1999. Hypothesis tests for evaluating numerical precipitation forecasts. Wea. Forecasting, 14, 155-167.
Mason, S.J. and N.E. Graham, 1999. Conditional probabilities, relative operating characteristics and relative operating levels. Wea. Forecasting, 14, 713-725.
Murphy, A.H., 1973. A new vector partition of the probability score. J. Appl. Meteor., 12, 595-600.
Richardson, D.S., 2000. Skill and relative economic value of the ECMWF ensemble prediction system. Quart. J. Roy. Meteor. Soc., 126, 649-667.
Talagrand, O., R. Vautard and B. Strauss, 1997. Evaluation of probabilistic prediction systems. Proceedings, ECMWF Workshop on Predictability.