Ensembling Medium Range Forecast MOS Guidance

By Richard H. Grumm

National Weather Service State College PA 16803

and Robert Hart

The Florida State University

Introduction

• Model Output Statistics (MOS)
  – Regression equations that use parameters from models to make a forecast for a point.
  – The statistical equations are specifically tailored to each location, taking into account factors such as local climate.
• MOS equations are based on output from a single model
  – Models have bias and errors.
  – These affect the MOS output, though the statistics correct for some of them.
  – Timing errors and intensity errors will affect the outcome.

Introduction-II

• Verification issues
  – Baseline the extended MOS and the ensemble of extended MOS against more widely used products
    • FWC: NGM-based MOS
    • MET: Eta-based MOS
    • MAV: GFS-based MOS
  – Limited at this time to temperatures only
    • Average error (bias or AVG), mean absolute error (MAE), and root-mean-square error (RMSE).
• The primary goal is to evaluate the longer-term MOS guidance and an ensemble
  – But the shorter-term MOS helps illustrate the point.
  – It helps us learn more about ensembles and ensemble strategies.

Medium Range Forecast MOS

• Produced from data from the Global Forecast System (GFS) at the highest resolution of the model
  – Known as the Extended Range GFS MOS (MEX)
  – Message contains routine variables and, at some locations, a climate range of data at the end of the message
  – Most widely used extended-range MOS product
  – Based on the highest-resolution forecast model

MEX Ensemble MOS

• NCEP runs the GFS and has a GFS ensemble prediction system (EPS)
  – MOS is generated for each GFS EPS member
  – The control run has slightly coarser resolution than the operational GFS
  – The control is used to produce perturbed members
    • There are 5 positively perturbed members (P1-P5)
    • and 5 negatively perturbed members (N1-N5) based on the control run
    • With the operational run and the control, this provides 12 GFS runs to produce MEX guidance
• The operational MRF MOS prediction equations are applied to the output from each of the ensemble runs

Why Ensemble MOS?

• The operational GFS and its MOS
  – are expected to be more accurate, due to the higher resolution
  – but may pay for timing and intensity errors, as finer-scale systems typically correlate less in space.
• The ensemble of the MOS
  – will show a range, and times of high disagreement ~ UNCERTAINTY

Ensembles help us: a quick review of ensemble concepts

• Deal with uncertainties in data (initial conditions)
  – the ability to properly resolve the feature
• Deal with uncertainties in data versus resolution of the model
  – 6Δx rule: we may undersample a system.
• Deal with uncertainties in physics
  – Every member of the current GFS system uses the same model
  – The only variation is in the initial conditions

GFS EPS with varied initial conditions

[Figure: forecasts initialized at the most recent data time, plotted against forecast length; the spread of solutions at a single time forms the envelope of solutions.]

Displaying uncertainties in forecasts

• In model output:
  – spaghetti plots and probability charts (the most likely outcome)
  – consensus forecast charts, the middle ground, with dispersion (standard deviation about the mean)
  – visualizing these shows the limits of any single solution.
• In MOS output:
  – Unless shown on plan-view maps, it does not lend itself well to spaghetti plots
  – In text bulletins, the dispersion about the mean may show uncertainty
  – Consensus and the range of possibilities are good candidates for display
• Why ensemble MOS output?

Consider this

• Would you want to shoot one arrow at the bullseye, or a quiver full of arrows? Or: pick the MEX and maybe miss the mark, or use the ensembles and have a better chance of approximating something.

Producing Ensemble MOS

• Assume all members are of equal skill
  – Any single member may be the most correct at any single time frame
  – No a priori knowledge of which member would be best on any given day or forecast length
• Decode each product
  – Compute sums, sums of squares, etc.
• Text variables are assigned numbers
  – Clouds: CL = 0, SC = 33, BK = 66, and OV = 100
  – A Java object can translate number to letter or vice versa
  – Get the range
  – Produce the consensus and the dispersion of forecasts about the mean.
• For select parameters show:
  – MEX value, consensus value, and the highest and lowest member values
  – The current system has no weights applied.
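The equal-weight summary described above could be sketched as follows. This is an illustration in Python, not the Java code the slides mention. The function names, the input layout, and the use of the population standard deviation are assumptions, and the member values are made up.

```python
from statistics import mean, pstdev

# Cloud categories mapped to numbers, as listed on the slide.
CLOUD_TO_NUM = {"CL": 0, "SC": 33, "BK": 66, "OV": 100}


def num_to_cloud(value):
    """Return the cloud category whose number is closest to value."""
    return min(CLOUD_TO_NUM, key=lambda k: abs(CLOUD_TO_NUM[k] - value))


def summarize(members, operational):
    """Equal-weight summary of one element at one forecast projection.

    members: values from every run (hypothetical input layout).
    operational: the value from the operational MEX run.
    """
    return {
        "OPRNL": operational,
        "MEAN": mean(members),    # consensus, no weights applied
        "STDV": pstdev(members),  # dispersion (population form is an assumption)
        "HI": max(members),       # highest member
        "LO": min(members),       # lowest member
    }


# Made-up maximum temperatures for 12 runs; the first is the operational run.
temps = [57, 59, 55, 65, 60, 58, 61, 56, 59, 62, 57, 59]
summary = summarize(temps, operational=temps[0])

# Cloud consensus: average the numbers, then translate back to a category.
clouds = ["CL", "SC", "SC", "BK"]
cloud_consensus = num_to_cloud(mean([CLOUD_TO_NUM[c] for c in clouds]))
```

Averaging the cloud numbers and mapping back to the nearest category is one plausible way to form a text consensus; the slides do not say how ties or intermediate values are handled.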

Extended MOS (formatting is lost from the web)

KAOO   GFSX MOS GUIDANCE   4/15/2004  0000 UTC
THU 15| FRI 16| SAT 17| SUN 18| MON 19| TUE 20| WED 21| THU 22 CLIMO
FHR      24| 36  48| 60  72| 84  96|108 120|132 144|156 168|180 192|
MX/MN
  OPRNL  57| 33  61| 47  73| 54  75| 53  73| 49  65| 42  61| 41  59|
  MEAN   59| 33  65| 47  73| 54  76| 53  71| 49  64| 44  63| 43  63|
  STDV    3|  3   2|  2   2|  1   2|  0   3|  3   3|  4   3|  3   3|
  HI     65| 40  68| 50  76| 56  79| 54  75| 53  68| 50  69| 49  66|
  LO     55| 30  61| 43  71| 53  74| 53  66| 44  57| 38  59| 39  58|
TMP      51| 40  58| 53  66| 58  67| 58  63| 53  56| 48  56| 47  56|
DWPT     28| 29  39| 45  49| 51  51| 50  49| 46  43| 41  40| 40  40|
CLD      CL| CL  PC| PC  PC| PC  OV| PC  OV| OV  OV| PC  OV| OV  OV|
P12
  OPRNL   4|  1   1| 25   2|  8  14| 12  18| 28  30| 20  24| 24  28|
  MEAN    1|  4  14| 21  17| 19  20| 21  25| 30  26| 20  20| 22  23|
  STDV    1|  3   7|  5   6|  7   7|  7   6|  6   7|  8   9|  7   3|
  HI      4| 12  23| 35  26| 32  40| 35  37| 49  35| 31  34| 32  28|
  LO      0|  0   0| 12   2|  8  14| 12  18| 21   9|  8   5| 11  17|
P24
  OPRNL    |      1|     25|     17|     22|     51|     31|     41|
  MEAN     |     14|     27|     30|     34|     45|     32|     36|
  STDV     |      7|      7|     11|      6|      7|     10|      7|
  HI       |     23|     46|     54|     45|     57|     46|     43|
  LO       |      0|     16|     17|     22|     30|     17|     18|
Q12
  OPRNL   0|  0   0|  0   0|  0   0|  0   0|  0   0|      0|       |
  MEAN    0|  0   0|  0   0|  0   0|  0   0|  0   0|      0|       |
  HI      0|  0   0|  0   0|  0   1|  1   1|  3   1|      1|       |
  LO      0|  0   0|  0   0|  0   0|  0   0|  0   0|      0|       |
Q24
  OPRNL    |      1|      2|     10|      7|     13|      6|      9|
  MEAN     |      0|      0|      1|      0|      2|       |       |
  HI       |      1|      2|     10|      7|     13|      6|      9|
  LO       |      0|      0|      0|      0|      0|       |       |
TYP       0|  0   0|  0   0|  0   0|  0   0|  0   0|  0   0|  0   0|
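Decoding a bulletin like this one might start with a row parser along the following lines. It is only a sketch: it assumes each row arrives as a single line with a one-word label and `|` separating the days, which is how the text appears here rather than the fixed-width layout of the operational message. The names `parse_row` and `flatten_ints` are hypothetical.

```python
def parse_row(line):
    """Split one row into its label and a list of per-day value groups."""
    label, _, rest = line.strip().partition(" ")
    groups = [cell.split() for cell in rest.split("|")]
    if groups and not groups[-1]:
        groups.pop()  # drop the empty piece after the final '|'
    return label, groups


def flatten_ints(groups):
    """Flatten the groups of a numeric row into one list of integers."""
    return [int(value) for group in groups for value in group]


label, groups = parse_row("OPRNL 57| 33 61| 47 73| 54 75| 53 73| 49 65| 42 61| 41 59|")
values = flatten_ints(groups)

# A 24-hour row has an empty first cell, which is kept as an empty group.
p24_label, p24_groups = parse_row("OPRNL | 1| 25| 17| 22| 51| 31| 41|")
```

Rows holding text, such as CLD, would keep their string values and go through the cloud-to-number mapping instead of `flatten_ints`.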

Short Term Ensemble MOS: KAOO

KAOO ENSEMBLE MOS GUIDANCE 4/15/2004 0000 UTC
DT /APR 15 /APR 16 /APR 17 /APR 18
FHR 06 09 12 15 18 21 00 03 06 09 12 15 18 21 00 03 06 09 12 18 00
XN 57 33 63 48
TMP 39 37 39 48 54 56 49 42 39 36 39 52 59 62 57 54 52 51 54
DPT 25 26 27 28 27 26 27 27 28 27 30 32 34 36 39 41 42 42 46
CLD CL CL CL CL CL CL CL CL CL CL CL CL CL SC SC OV OV OV BK SC CL
WDR 22 21 21 22 22 22 22 01 03 00 00 13 13 13 12 13 14 15 15 40 40
WSP 13 08 10 13 13 13 06 03 02 00 00 02 05 09 06 04 04 04 03 10 08
P06 04 02 01 00 01 00 00 04 08
P12 02 01 01 13 34

Verifying Ensemble MOS

• At select locations compute
  – Mean error (bias)
  – Mean absolute error (MAE), and
  – Root-mean-square error (RMSE)

• How to view data with 12 members per site and 150 sites verified for this study?

• Lots of graphs, currently at single sites.
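The three scores can be written down in a few lines. The sketch below assumes the error is defined as forecast minus observation, so a negative bias means a cold forecast; the function name `verify` and the sample numbers are illustrative, not data from this study.

```python
import math


def verify(forecasts, observations):
    """Return (bias, MAE, RMSE) for paired forecasts and observations."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)
    bias = sum(errors) / n                            # mean error
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root-mean-square error
    return bias, mae, rmse


# Made-up forecast and observed temperatures for four days.
fcst = [57, 61, 73, 75]
obs = [59, 60, 70, 79]
bias, mae, rmse = verify(fcst, obs)
```

Because RMSE squares the errors before averaging, it penalizes the occasional large miss more than MAE does, which is why the two are shown side by side.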

Reference values

• CTP 1-4 day forecasts, Dec 2003

MOS   BIAS   MAE
MAV   0.74   2.96
ETA   0.78   3.73
FWC   0.14   3.58
CCF   0.68   3.33

Short Term MOS Verification

• Examine MAV, ETA, FWC
  – 06 and 18 UTC MAV provide 18, 30, 42, 54, and 66 hour forecasts
  – ETA, FWC, and MAV at 0000 and 1200 UTC provide
    • 24, 36, 48, 60, and, for MAV, 72 hour forecasts
  – So, 6-hour forecast intervals for MAV
• Not to belabor the point, but each MAV update clearly improves on the previous one, as skill decreases with time. The belief that the 06 and 18 UTC MAV is inferior is unfounded.

• Select sites are shown though data exists for about 100 sites

• No scoring of ensemble products is accomplished here

KMDT AVG-MAE

• FWC cold bias
• MAV lower MAE

KMDT RMSE

• MAV lower RMSE

KIPT AVG-MAE

• FWC cold bias
• MAV lower MAE

KIPT RMSE

• MAV lower RMSE

KBOS AVG-MAE

• FWC cold bias
• The ETA and FWC biases are opposite ~ ensemble potential?
• MAV slightly lower MAE

KBOS RMSE

• MAV lower RMSE, but Eta is close

Short Term MOS findings

At sites examined (3 shown):
  – MAV is the most skillful temperature forecast MOS
  – FWC has a cold bias at many (most) sites
  – Eta MOS has some regional/local variation and is more competitive with MAV at some sites, with a bias opposite to that of FWC; this may lend itself to ensembling.
  – Clear skill differences at some sites where MAV is far superior. This may limit ensembling without weighting at these sites, as a straight blend would weaken the impact of the more accurate member.

Medium Range MOS

• Similar display concepts
• Same observational data sets used
• Plot all 12 members
  – P members are RED, N members are BLUE
  – Focus on the 3 best members:
    • MEX, the forecast with the finest detail (GREEN)
    • Ensemble mean (BLACK)
    • Control run (YELLOW)

KMDT BIAS, Jan-Feb 2004

• Chart annotations: consensus; MEX-Control

KMDT MAE, Jan-Feb 2004

• Early on MEX is far superior!

KMDT RMSE, Jan-Feb 2004

Allentown MAE

Atlantic City MAE

Boston MAE

LGA MAE

KABE MAE

BFD MAE

Some findings

• RMSE at 24-48 hours is comparable to that found in other MOS products: 3 to 4 degrees
  – May be a bit better than MAV at 24 hours at KMDT
• MEX clearly outperforms all members and the consensus at most sites at 24-60 hours
  – Ensembling unskillful members is not helpful?
• MEX and consensus are both very good at 60+ hours
  – MEX is more skillful at some sites than the consensus, but the consensus is more skillful at others
  – The consensus, treated as a single forecast member, is quite a skillful member at all locations after 60 hours
• Weighting the consensus toward the more skillful members might improve the forecasts
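One way such a weighting could look is sketched below. The scheme (weights proportional to the inverse of each member's past MAE) is an assumption chosen for illustration; the current system applies no weights, and the slides do not propose a specific formula. All names and numbers are hypothetical.

```python
def weighted_consensus(forecasts, past_mae):
    """Blend member forecasts with weights proportional to 1 / past MAE."""
    weights = {name: 1.0 / past_mae[name] for name in forecasts}
    total = sum(weights.values())
    return sum(weights[name] * forecasts[name] for name in forecasts) / total


# Made-up forecasts and past MAE values for three members.
forecasts = {"MEX": 60.0, "CTL": 64.0, "P1": 66.0}
past_mae = {"MEX": 2.0, "CTL": 4.0, "P1": 4.0}

blend = weighted_consensus(forecasts, past_mae)   # pulled toward the MEX
equal = sum(forecasts.values()) / len(forecasts)  # straight, unweighted blend
```

With these numbers the MEX carries half the total weight, so the blend sits closer to the MEX value than the straight average does.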

Ensemble Envelope

• Useful to know how often the warmest and coldest members captured the range of solutions
• At mid-range forecasts
  – 50-60% of the time the observed temperature is within the forecast range of all 12 members
  – 60% of the time at longer ranges, the observed temperature is outside the range of the ensemble
  – Artifact of untuned MOS for GFS EPS members?
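The envelope statistic is a simple count: for each verified forecast, check whether the observation lies between the lowest and highest member. A minimal sketch, with a hypothetical function name and made-up values:

```python
def capture_rate(member_forecasts, observations):
    """Fraction of cases where the observation lies within [LO, HI]."""
    hits = 0
    for members, obs in zip(member_forecasts, observations):
        if min(members) <= obs <= max(members):
            hits += 1
    return hits / len(observations)


# Made-up cases: each inner list holds the member forecasts for one case.
cases = [[55, 59, 65], [30, 33, 40], [61, 65, 68], [43, 47, 50]]
obs = [60, 29, 66, 52]
rate = capture_rate(cases, obs)  # two of the four observations fall inside
```

A low capture rate means the ensemble range is too narrow or is shifted away from the observations, which fits the question above about untuned MOS equations for the EPS members.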

KMDT when observed temperature within range of ensemble MOS forecasts

KBFD when observed temperature within range of ensemble MOS forecasts

KBOS when observed temperature within range of ensemble MOS forecasts

A few more items

• Examining just January
  – The control run was slightly more skillful than the MEX at many sites
  – The consensus was typically more skillful than the MEX
  – A pattern change likely contributed to this; however, over 2 months this problem was mitigated.

January Potential Case Study

Cold snap: caused timing errors in the MEX. The CTL run benefited from the changes, as did the consensus.


Conclusions

• Short Term MOS findings:
  – MAV is the most skillful temperature forecast MOS
  – FWC has a cold bias at many (most) sites
  – Eta MOS has some regional/local variation and is more competitive with MAV at some sites, with a bias opposite to that of FWC
  – Clear skill differences at some sites where MAV is far superior
• Medium Range Ensemble MOS was generated
  – From collections of MOS forecasts from
    • Various NCEP EPS model runs
    • And NCEP short range forecast models
    • Each model run had different initial conditions
  – Ensemble mean (consensus) temperature forecasts were skillful and competitive with the MEX forecasts at all sites after 60 hours

Conclusions-II

• Limitations
  – Ideally, the ensemble MOS would beat the MEX at all sites after the initial time periods.
  – It does not, implying that over the long haul:
    • The MEX is more skillful than the other members
      – Especially at 24 to 60 hours!
    • The MEX equations are tuned to the operational GFS and are not tuned to the GFS EPS members. Tuned equations for each member might improve the MOS guidance for each member and the ensemble MOS system in general.
  – Observed temperatures often fall outside the ensemble envelope
    • 60% of the time at longer ranges
    • Is this a problem?
  – What is the value, and what are the limitations, of ensembling unequally skillful members?

Conclusions-III

• Operational applications
  – Consensus and the dispersion about the mean show times of large uncertainty
  – Forecasters need to apply knowledge of this uncertainty in forecasts
  – And this information needs to be conveyed to the users of these forecasts
  – Times of uncertainty are times when the ensemble provides more value to the forecast process.
• Plans
  – Apply the same technique to verify the POP forecasts from these data
  – Experiment with weights to improve the consensus forecasts.
  – Improve the verification methods and software

NCEP EPS Breeding: N seeds give 2*N perturbations

[Figure: breeding cycle. Labels: initial random seed; scaled positive perturbation; opposite sign is the negative perturbation; adjust magnitude to typical analysis errors; 12-h forecast; control (CTL); complete cycle forecast.]
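The breeding cycle can be illustrated with a toy model. The sketch below is not the NCEP system: it uses the Lorenz-63 equations as a stand-in for the forecast model, treats the control forecast as the next analysis (no data assimilation), and rescales to a fixed amplitude `size` standing in for typical analysis error. All function names and parameters are hypothetical.

```python
import math
import random


def step(state, dt=0.01):
    """One forward-Euler step of the Lorenz-63 toy model."""
    x, y, z = state
    return (x + dt * 10.0 * (y - x),
            y + dt * (x * (28.0 - z) - y),
            z + dt * (x * y - (8.0 / 3.0) * z))


def forecast(state, nsteps):
    """Run the toy model forward; stands in for the short cycle forecast."""
    for _ in range(nsteps):
        state = step(state)
    return state


def rescale(vec, size):
    """Adjust the magnitude of vec to the chosen perturbation size."""
    norm = math.sqrt(sum(v * v for v in vec))
    return tuple(v * size / norm for v in vec)


def breed(control, n_seeds, size=0.1, cycles=5, nsteps=50, seed=0):
    """Return the final control state and 2 * n_seeds perturbed states."""
    rng = random.Random(seed)
    # Initial random seeds, scaled to the chosen size.
    perts = [rescale([rng.gauss(0, 1) for _ in control], size)
             for _ in range(n_seeds)]
    for _ in range(cycles):
        new_control = forecast(control, nsteps)
        bred = []
        for p in perts:
            perturbed = forecast(tuple(c + d for c, d in zip(control, p)), nsteps)
            diff = tuple(a - b for a, b in zip(perturbed, new_control))
            bred.append(rescale(diff, size))  # grown difference, scaled back
        control, perts = new_control, bred
    members = []
    for p in perts:
        members.append(tuple(c + d for c, d in zip(control, p)))  # positive
        members.append(tuple(c - d for c, d in zip(control, p)))  # negative
    return control, members


control, members = breed((1.0, 1.0, 1.0), n_seeds=5)  # 5 seeds give 10 members
```

Each seed yields one positive and one negative member placed symmetrically about the control, which is how 5 seeds produce the P1-P5 and N1-N5 pairs.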
