assessment of the statistical properties of cosmo-i7 qpf as a methodology to evaluate its...
Assessment of the statistical properties of COSMO-I7 QPF as a methodology to evaluate its predictable spatial scales and optimize the operational use for weather forecasting
Carlo Cacciamani, Maria Stefania Tesini, Federico Grazzini
(many thanks to Chiara Marsigli)
9th COSMO General Meeting - WG5 parallel session, Athens 18-21 Sep. 2007
Open questions:
• How can we use HR-LAM QPF?
• Does COSMO-LM-7 (and 2.8) provide a real added value (with respect to coarser GCMs or LAMs) to the forecasters (for example, for issuing alerts when intense precipitation occurs)?
QPF of HR LAMs: Problems we know…
Precipitation has a large space-time variability. HR LAMs have problems reproducing mesoscale details (Zepeda-Arce et al., 2000). In detail:
• HR LAMs are able to forecast QPF maxima but may be wrong in their space-time localisation (Theis et al., 2005; Ebert & McBride, 2000; Bernardet et al., 2000).
• An HR LAM gives a much more realistic simulation of QPF than a GCM, but may have the same low quality (double penalty). Conversely, a GCM may have a better quality but be completely unrealistic.
Some examples
COSMO LM-I28 6h forecast
Observed rain rate from SPC radar
REALISTIC performance but error in time and space precipitation localisation!
The added value of limited area models
Patterns are more realistic…
Totally non-realistic but better score! Realistic but bad QPF scores!
Very high CAPE, ~3000 J/kg. Unrealistic updraft: 4/5 grid points with w at 3500 m (model level 24) greater than 2 m/s
Example of big QPF errors! A case of upscaling of convection
Problem: TIME INCONSISTENCY and error in amplitude!
Precipitation at one point: PORRETTA (Emilia-Romagna Apennines)
[Figure: time series from 01/09/2005 to 30/11/2005, with axes from 0 to 60 and from 0 to 500. Series: oss-porretta, pg-1, cumulata osservata porretta (observed cumulated precipitation at Porretta), cum-pg-1.]
Large time shift!
Problem: Spatial INCONSISTENCY
[Figure: time series from 01/09/2005 to 30/11/2005 of precipitation (axis from 0 to 60) and precipitation anomaly (axis from -35 to 15). Series: pg1-pg2, pg1-pg3, oss-porretta, pg-1.]
Very (very!) often, when QPF is large, the difference between the QPF at the grid points nearest to a selected observation is larger than the QPF (or the observed precipitation) itself!
Below a defined spatial scale (what?), linked to the LAM resolution, QPF cannot be reproduced in a deterministic way.
How can we manage QPF (and HR QPF)?
• In order to increase details it is necessary to increase LAM resolution...
• Alternative/pragmatic approach: forget the “dream” of forecasting QPF details and use a “fuzzy-probabilistic approach” instead (i.e. use quantities inferred from a “pdf” of QPF: median, time-space averages, maxima, percentiles, probability of occurrence of large amplitude events...)
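The quantities listed for the fuzzy-probabilistic approach can be sketched in a few lines of Python. This is only an illustration, not the method used in the study: the function names (`percentile`, `qpf_summary`), the sample values and the choice of "at or above the threshold" as the event definition are all assumptions, and the percentile uses simple linear interpolation between sorted values.

```python
import statistics

def percentile(values, q):
    """Percentile q (0-100), linear interpolation between sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def qpf_summary(qpf_values, threshold_mm):
    """Summarise the QPF values (mm) of all grid points in one area."""
    return {
        "mean": statistics.fmean(qpf_values),
        "median": statistics.median(qpf_values),
        "max": max(qpf_values),
        "p90": percentile(qpf_values, 90),
        # Fraction of grid points at or above the threshold (assumed event definition)
        "prob_exceed": sum(v >= threshold_mm for v in qpf_values) / len(qpf_values),
    }

# Hypothetical 24h QPF (mm) at the grid points of one area
area_qpf = [0.0, 2.0, 5.0, 8.0, 12.0, 20.0, 35.0, 3.0, 1.0, 14.0]
summary = qpf_summary(area_qpf, threshold_mm=10.0)
print(summary)
```

Instead of one deterministic value per grid point, the forecaster would read a handful of numbers per area, which is the spirit of the approach described above.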
What can we learn from all that?
From Athens COSMO meeting in 2002
First simple approach: averaging QPF
• In boxes of different size (what is the best size?)
• Over alert warning areas (Emilia-Romagna)
[Figure: TS (single point vs. alert-area average) against precipitation threshold (5, 10, 15, 20, 25 mm), on an axis from 0 to 0.6. Series: PORRETTA, MEDIA AREA C (area C average).]
TS increases with spatial averaging!
Benefit of precipitation aggregation: use of BOX
LAMI 7Km grid-points distribution over verification boxes of 0.5°x0.5°
observation distribution over verification boxes of 0.5°x0.5°
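A minimal sketch of the box aggregation, assuming that forecast or observed values come as (latitude, longitude, value) tuples and that boxes are aligned on multiples of the box size. The function names and sample values are hypothetical; the real verification boxes may be positioned differently.

```python
import math
from collections import defaultdict

def box_index(lat, lon, size_deg=0.5):
    """Index (row, column) of the size_deg x size_deg box containing a point."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))

def aggregate_by_box(points, size_deg=0.5):
    """Group (lat, lon, value) tuples by box; return mean, max and count per box."""
    groups = defaultdict(list)
    for lat, lon, value in points:
        groups[box_index(lat, lon, size_deg)].append(value)
    return {
        box: {"mean": sum(vals) / len(vals), "max": max(vals), "n": len(vals)}
        for box, vals in groups.items()
    }

# Hypothetical forecast values (lat, lon, mm/24h)
forecast = [
    (44.10, 11.00, 4.0),
    (44.20, 11.10, 10.0),
    (44.40, 11.40, 22.0),
    (44.60, 11.20, 1.0),
]
boxes = aggregate_by_box(forecast)
for box, stats in sorted(boxes.items()):
    print(box, stats)
```

The same function can be applied to the station observations, so that forecast and observed box means (or maxima) are compared box by box.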
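Contingency tables and scores
With a = hits, b = false alarms and c = misses:
$$\mathrm{BS} = \frac{a+b}{a+c}, \qquad \mathrm{POD} = \frac{a}{a+c}, \qquad \mathrm{FAR} = \frac{b}{a+b}, \qquad \mathrm{TS} = \frac{a}{a+b+c}$$
Perfect forecast: $\mathrm{BS} = \mathrm{TS} = \mathrm{POD} = 1$, $\mathrm{FAR} = 0$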
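As a worked example of these scores, the sketch below builds the contingency counts from paired forecast and observed values and evaluates BS, POD, FAR and TS. It assumes that an event means "value at or above the threshold" and that a, b and c are hits, false alarms and misses; the sample numbers are invented.

```python
def contingency(forecasts, observations, threshold):
    """Count hits (a), false alarms (b) and misses (c) for one threshold."""
    a = b = c = 0
    for f, o in zip(forecasts, observations):
        f_yes, o_yes = f >= threshold, o >= threshold  # assumed event definition
        if f_yes and o_yes:
            a += 1
        elif f_yes:
            b += 1
        elif o_yes:
            c += 1
    return a, b, c

def ratio(num, den):
    """Return num/den, or NaN when the score is undefined."""
    return num / den if den else float("nan")

def scores(a, b, c):
    """Bias score, probability of detection, false alarm rate, threat score."""
    return {
        "BS": ratio(a + b, a + c),
        "POD": ratio(a, a + c),
        "FAR": ratio(b, a + b),
        "TS": ratio(a, a + b + c),
    }

# Hypothetical box-averaged values (mm/24h) on six days
fc = [12.0, 3.0, 25.0, 0.0, 8.0, 15.0]
ob = [10.0, 6.0, 30.0, 0.0, 2.0, 4.0]
a, b, c = contingency(fc, ob, threshold=5.0)
result = scores(a, b, c)
print(a, b, c, result)
```

Here two events are forecast and observed, two are forecast but not observed, and one is observed but not forecast, so the forecast is biased high (BS greater than 1).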
Verification of 24h accumulated precipitation aggregated over boxes of 0.5°x0.5°. Sensitivity to precip. threshold and variable: QPF averaged value vs QPF maximum value
[Figure panels. Mean value in the box: 5 mm/24h (N° obs ~950), 20 mm/24h (N° obs ~300). Maximum value in the box: 5 mm/24h (N° obs ~1300), 20 mm/24h (N° obs ~550).]
The gain of HRLam with respect to GCMs is greater for high thresholds and for precipitation maxima
Verification of 6h and 24h cumulated precipitation aggregated over boxes of 0.4°x0.4° (sensitivity to time aggregation)
[Figure panels: mean value in the box and maximum value in the box. Threshold 10: N° obs ~750 and ~400; threshold 20: N° obs ~380 and ~150.]
Much better results when increasing the time averaging!
Sensitivity to box size and precipitation threshold
[Figures: threat score, false alarm rate and probability of detection for precipitation cumulated at +24h, against box size (0.5 x 0.5 and 0.4 x 0.4), for thresholds of 1, 5, 10 and 20 mm/24h.]
Positive impact of larger box is more visible at higher precipitation thresholds
Sensitivity to box size and precipitation threshold
[Figures: POD (QPF: +24) and TS (QPF: +24) against threshold (1, 5, 10, 20, 50 mm/24h). Series: box-0.3, box-0.4, box-0.5.]
Best result box = 0.5 deg? (7 * 7 grid points …)
Some preliminary conclusions
• QPF spatial averaging over boxes or alert areas produces a more usable QPF field for applications. Space-time localisation errors are minimised
• Boxes or alert areas with a size of 5-6 times the grid resolution give the best results
• Positive impact of larger box is more visible at higher precipitation thresholds
• The gain of HRLam with respect to GCMs is greater for high thresholds and for precipitation maxima
• Better results when increasing the time averaging (problems with the 6-hour accumulation period, much better with the 24-hour cumulated period!)
Application to the entire Italian territory
Study of the QPF pdf
Motivation
• The predicted state of interest has a substantial amount of intrinsic uncertainty, depending on forecast lead time, spatial scale, terrain, flow and forcing
• The relation between spatial scale and prediction skill has become evident in a multitude of theoretical and experimental studies
• The output of a LAM must not be regarded as purely deterministic, but rather as an outcome of a random experiment that behaves according to a given probability density function (pdf)
(Theis et al., 2005)
Study of QPF pdf and its “moments” within the box
Aim:
• To investigate the spatial characteristics of precipitation (observed and predicted) within the box
• To use the pdf agreement between observed and predicted precipitation to judge the quality and the realism of the HR LAM (COSMO-LMI7)
Test period: 2005/09/01-2005/11/30. It will be repeated for other periods (work in progress)
DATASET
• 00-24 cumulated precipitation
• RUN 00 UTC COSMO-LAMI (+24h) [~7 km hor. res.]
• RUN 12 UTC ECMWF-IFS (+12h/+36h) [~50 km hor. res.]
• ~800 stations in total, obtained from the regional network of raingauges centralised at the National Department of Civil Protection (DPCN)
• [20 – 100] stations for each box (dimension 1.0°x1.0°)
The choice of the box
1. As many station points as possible in the area
2. Geographical and orographic homogeneity
3. Possibility to divide non-homogeneous boxes into homogeneous sub-boxes
…about model grid points
COSMO-LAMI: ~7 km horizontal resolution, 170-220 points in a box
ECMWF: ~50 km horizontal resolution, 4 points in a box (max 9!)
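A rough arithmetic check of these counts, under assumptions that are not stated in the slides: about 111.2 km per degree of latitude, a regular grid with square cells, and a hypothetical box centred at 44°N. The function name is illustrative only, and the actual model grids are not reproduced here.

```python
import math

KM_PER_DEG_LAT = 111.2  # approximate length of one degree of latitude (assumption)

def approx_points_in_box(box_deg, grid_km, lat_deg):
    """Estimate how many grid cells of size grid_km fit in a box_deg x box_deg box."""
    height_km = box_deg * KM_PER_DEG_LAT
    # One degree of longitude shrinks with the cosine of the latitude
    width_km = box_deg * KM_PER_DEG_LAT * math.cos(math.radians(lat_deg))
    return height_km * width_km / grid_km ** 2

n_lam = approx_points_in_box(1.0, 7.0, 44.0)   # ~7 km grid
n_gcm = approx_points_in_box(1.0, 50.0, 44.0)  # ~50 km grid
print(round(n_lam), round(n_gcm))
```

Under these assumptions the estimate is about 180 points for the 7 km grid and about 4 for the 50 km grid, which is of the same order as the counts quoted above.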
Selected Box
Number of model grid points and stations per box, by terrain class
         COSMO LAMI                   ECMWF                        STATIONS
BOX      TOTAL PLAIN  HILL MOUNTAIN   TOTAL PLAIN  HILL MOUNTAIN   TOTAL PLAIN  HILL MOUNTAIN
A_01_01    192    75    91       26       4     2     2        0     100    23    51       26
F_01_01    212   108    96        8       4     3     1        0      74    13    56        5
K_01_01    202     5   168       29       4     0     4        0      63     8    43       12
D_01_01    176    79    57       40       6     2     3        1      61     3    26       32
G_01_01    208     1   203        4       4     0     4        0      47     1    43        3
M_01_01    182    85    40       57       4     1     1        2      44    11    25        8
L_01_01    220   120    75       25       4     1     3        0      44    11    23       10
J_01_01    216    90    80       46       4     2     2        0      40     9    19       12
C_01_01    192   128    57        7       6     3     3        0      36     8    15       13
R_01_01    216    76    86       54       4     0     3        1      32     5    18        9
P_01_01    171     0    29      142       4     0     1        3      26     0     9       17
Q_01_01    192    21   162        9       4     1     3        0      24     2    19        3
T_01_01    202    65   135        2       4     1     3        0      24     4    20        0
U_01_01    212    49   120       43       4     0     2        2      24     6    12        6
V_01_01    206    65    98       43       9     3     3        3      23     4    11        8
H_01_01    208   115    92        1       4     3     1        0      23    11    12        0
S_01_01    181   119    34       28       4     2     2        0      20     9     6        5
W_01_01    198     3    49      146       4     0     3        1      20     1    11        8
X_01_01    210    30   156       24       4     0     3        1      15     5     7        3
Y_01_01    210    75    84       51       6     4     2        0      14     9     5        0
B_01_01    191   183     8        0       6     6     0        0       6     4     2        0
N_01_01    203    92    98       13       4     1     3        0       6     0     4        2
I_01_01    206     6   179       21       4     0     4        0      46     6    32        8
A: Comparison of observed and predicted precipitation pdf
Use of a graphical approach to compare observed and forecast distribution:
• Boxplot
• Quantile-Quantile plot
Box A: North Tyrrhenian area
COSMO-LM I7 pdf is realistic, even in the tail of the distribution
ECMWF pdf doesn’t reproduce high values. Overestimation at low values
Box C: Central Po Valley area
COSMO-LM I7 underestimates the pdf. Good reproduction of the maximum percentile
Box H: Central Tyrrhenian area
Realistic COSMO LM I7 pdf, even for the tail
ECMWF does not reproduce the tail
Box V: Central Adriatic area
COSMO-LAMI underestimates precipitation, as does ECMWF
What happens for smaller boxes?
Box A, whole box:
BOX   OBSV  COSMO-I7  ECMWF
A      114       192      4
Box A divided into 4 sub-boxes:
BOX   OBSV  COSMO-I7
A1       7        48
A2      49        48
A3      21        48
A4      26        54
Box A divided into 9 sub-boxes:
BOX   OBSV  COSMO-I7
A1       0        20
A2      12        20
A3      27        24
A4       4        20
A5      15        20
A6      22        24
A7       6        20
A8       8        21
A9      13        23
Box A divided into 16 sub-boxes:
BOX   OBSV  COSMO-I7
A1       0        12
A2       3        12
A3       5        12
A4      18        12
A5       0        12
A6       8        12
A7      15        12
A8      20        12
A9       3        12
A10      8        12
A11      2        12
A12     12        12
A13      4        12
A14      8        12
A15      1        12
A16      7        12
[Figures: quantile-quantile plots for Box A.]
“Climatological behaviour”
• COSMO-LM-I7 pdf is very often realistic
• High values (right tail of the pdf) are statistically reproduced. That is not the case for the ECMWF model.
• Results depend on the geographical positioning of the area
• Strong dependence of the results of the pdf intercomparison on the size of the box
Day-by-day behaviour of statistical moments deduced from observed and predicted precipitation
Aim: Investigate the day-by-day reproduction of the observed pdf and of some indices (IQR, 90th percentile, etc.)
• Boxplot time series
• Quantile time series
• Scatterplot of the forecast error versus observation
• Mean Error and Mean Absolute Error
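The day-by-day comparison can be sketched as follows: for each day an index (here the 90th percentile, with the IQR as a second example) is computed separately from the observed and the forecast values inside the box, and the Mean Error and Mean Absolute Error are then taken over the resulting daily series. The data, the function names and the linear-interpolation percentile are assumptions made for illustration.

```python
def percentile(values, q):
    """Percentile q (0-100), linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def iqr(values):
    """Inter-quartile range: 75th minus 25th percentile."""
    return percentile(values, 75) - percentile(values, 25)

def mean_error(forecasts, observations):
    return sum(f - o for f, o in zip(forecasts, observations)) / len(forecasts)

def mean_absolute_error(forecasts, observations):
    return sum(abs(f - o) for f, o in zip(forecasts, observations)) / len(forecasts)

# Hypothetical values inside one box (mm/24h); one inner list per day.
# Stations and grid points need not be equally numerous.
obs_days = [[0, 2, 4, 6, 8], [10, 20, 30, 40, 50], [0, 0, 0, 0, 0]]
fc_days = [[1, 3, 5, 7, 9, 11], [5, 10, 15, 20, 25, 30], [0, 0, 2, 0, 0, 0]]

obs_p90 = [percentile(day, 90) for day in obs_days]
fc_p90 = [percentile(day, 90) for day in fc_days]
me = mean_error(fc_p90, obs_p90)
mae = mean_absolute_error(fc_p90, obs_p90)
print(obs_p90, fc_p90, me, mae)
```

A negative ME with a much larger MAE, as in this invented sample, would indicate that errors of both signs occur from day to day and partly cancel in the mean.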
Box A: North Tyrrhenian area
COSMO-LM-I7 realistically reproduces the day-by-day spread. Sometimes big errors are evident.
ECMWF always underestimates the spread
Box C: Central Po Valley area
Box H: Central Tyrrhenian area
Box V: Central Adriatic area
Box A: North Tyrrhenian area
Box C: Central Po Valley area
Box H: Central Tyrrhenian area
Box V: Central Adriatic area
Day-by-day behaviour
• Results are different in the different areas
• Common elements: the observed “spread” is well reproduced by COSMO-LM-I7, even if false alarms and misses are evident
• ECMWF is not able to simulate the observed variability, i.e. it is less realistic than COSMO-LM-I7
Threat Score. Variable used: mean value of the pdf
Threat Score. Variable used: 90th percentile of the pdf
Strong dependence on geographical positioning. When the 90th percentile exceeds large precipitation values, COSMO-LM-I7 is much better than ECMWF
Scores: conclusive considerations
• As regards mean precipitation inside the box, COSMO-LM-I7 performs worse than ECMWF for low QPF thresholds; for larger thresholds COSMO-LM-I7 is better than or similar to ECMWF
• Considering the tail of the pdf (90th percentile), COSMO-LMI7 reproduces well strong precipitation events (more than 20 mm/24h) affecting a limited number of points (10%) in the domain
• There is an evident strong dependence of the results on the geographical positioning of the areas, strongly correlated with the orography and the direction of the incident flow (for example, the different behaviour between areas A and C, positioned upwind and downwind of the Apennine chain when the mean flow is south-westerly)
Example of dependence on the geographical positioning – 90th percentile
• In the Adriatic areas (areas V and W - black) COSMO-LM-I7 underestimates the precipitation
• In the Tyrrhenian areas (areas H, Y and F - red) COSMO-LM-I7 overestimates precipitation
The end, thank you for the attention
Additional slides
Boxplot
• The median for each dataset is indicated by the black center line
• The first and third quartiles are the edges of the yellow area, which is known as the inter-quartile range (IQR)
• The extreme values (within 1.5 times the IQR from the upper or lower quartile) are the ends of the lines extending from the IQR
• Points farther than 1.5 times the IQR from the nearest quartile are plotted individually as circles. These points represent the outliers
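The elements of the boxplot can be computed directly, as in this minimal sketch. Quartiles are obtained here by linear interpolation, which is only one of several conventions, so a plotting package may give slightly different edges; the function names and the sample are hypothetical.

```python
def percentile(values, q):
    """Percentile q (0-100), linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def boxplot_stats(values):
    """Median, quartiles, whisker ends and outliers (1.5 x IQR rule)."""
    ordered = sorted(values)
    q1, med, q3 = (percentile(ordered, q) for q in (25, 50, 75))
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    return {
        "median": med, "q1": q1, "q3": q3,
        # Whiskers end at the most extreme values still within the fences
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [v for v in ordered if v < low_fence or v > high_fence],
    }

# Hypothetical precipitation values (mm/24h) inside one box
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
stats = boxplot_stats(sample)
print(stats)
```

In this sample the value 30 lies beyond the upper fence (7 + 1.5 x 4 = 13) and would be drawn as an individual circle.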
Quantile-Quantile Plot
• The quantile-quantile (q-q) plot is a graphical technique for determining if two data sets come from populations with a common distribution
• A q-q plot is a plot of the quantiles of the first data set against the quantiles of the second data set
• By a quantile, we mean the value below which a given fraction (or percent) of points falls. That is, the 0.3 (or 30%) quantile is the point at which 30 percent of the data fall below and 70 percent fall above that value
• A 45-degree reference line is also plotted. If the two sets come from a population with the same distribution, the points should fall approximately along this reference line
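The points of a q-q plot can be produced without any plotting library, as sketched below for two samples of different size (for instance station observations and model grid points in a box). The percentile levels, sample values and function names are assumptions; the quantiles use linear interpolation.

```python
def percentile(values, q):
    """Percentile q (0-100), linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def qq_pairs(sample_x, sample_y, levels=(10, 25, 50, 75, 90)):
    """Matching quantiles of two samples; each pair is one point of the q-q plot."""
    return [(percentile(sample_x, q), percentile(sample_y, q)) for q in levels]

# Hypothetical samples (mm/24h)
observed = [0, 2, 4, 6, 8]
model_a = [0, 1, 2, 3, 4, 5, 6, 7, 8]  # same spread as the observations
model_b = [1, 2, 3, 3, 4]              # too few high values

pairs_a = qq_pairs(observed, model_a)
pairs_b = qq_pairs(observed, model_b)
print(pairs_a)
print(pairs_b)
```

The pairs for `model_a` lie on the 45-degree line, while for `model_b` the upper quantiles fall below it, which is the signature of a model that does not reproduce the tail of the observed distribution.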