Impacts of Phased-Array Radar Data on Forecaster Performance during Severe Hail and Wind Events
KATIE A. BOWDEN
Cooperative Institute for Mesoscale Meteorological Studies, University of Oklahoma, Norman, Oklahoma
PAMELA L. HEINSELMAN
NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma
DARREL M. KINGFIELD
Cooperative Institute for Mesoscale Meteorological Studies, University of Oklahoma, and NOAA/OAR/National
Severe Storms Laboratory, Norman, Oklahoma
RICK P. THOMAS
School of Psychology, Georgia Institute of Technology, Atlanta, Georgia
(Manuscript received 29 August 2014, in final form 8 December 2014)
ABSTRACT
The ongoing Phased Array Radar Innovative Sensing Experiment (PARISE) investigates the impacts of
higher-temporal-resolution radar data on the warning decision process of NWS forecasters. Twelve NWS
forecasters participated in the 2013 PARISE and were assigned to either a control (5-min updates) or an
experimental (1-min updates) group. Participants worked two case studies in simulated real time. The first
case presented a marginally severe hail event, and the second case presented a severe hail and wind event.
While working each event, participants made decisions regarding the detection, identification, and re-
identification of severe weather. These three levels compose what has now been termed the compound
warning decision process. Decisions were verified with respect to the three levels of the compound warning
decision process and the experimental group obtained a lower mean false alarm ratio than the control group
throughout both cases. The experimental group also obtained a higher mean probability of detection than the
control group throughout the first case and at the detection level in the second case. Statistical significance
(p value = 0.0252) was established for the difference in median lead times obtained by the experimental
(21.5 min) and control (17.3 min) groups. A confidence-based assessment was used to categorize decisions
into four types: doubtful, uninformed, misinformed, and mastery. Although mastery (i.e., confident and
correct) decisions formed the largest category in both groups, the experimental group had a larger proportion
of mastery decisions, possibly because of their enhanced ability to observe and track individual storm char-
acteristics through the use of 1-min updates.
1. Introduction
During warning operations, weather forecasters rely
heavily on radar technology to observe and monitor
potentially severe thunderstorms (Andra et al. 2002).
The National Weather Service (NWS) currently utilizes
a network of 158 Weather Surveillance Radar-1988
Dopplers (WSR-88Ds) that are located across the
United States (Whiton et al. 1998). Given that the WSR-
88D was initially designed with a projected lifetime of
20 yr (Zrnić et al. 2007), continuous upgrades are re-
quired to maintain its functionality (e.g., Saffle et al.
2009; Crum et al. 2013). However, eventually the WSR-
88D network will have to be replaced. A replacement
candidate under consideration is phased-array radar
(PAR; Zrnić et al. 2007). To explore the suitability of
PAR for weather observation, a phased-array antenna
was loaned to the NOAA/National Severe Storms
Corresponding author address: Katie Bowden, 120 David L.
Boren Blvd., Norman, OK 73072.
E-mail: [email protected]
APRIL 2015 BOWDEN ET AL. 389
DOI: 10.1175/WAF-D-14-00101.1
© 2015 American Meteorological Society
Laboratory (Forsyth et al. 2005) in Norman, Oklahoma
by the U.S. Navy. A key characteristic of this PAR is its
capability to provide volume updates in less than 1 min
(Heinselman and Torres 2011).
When exploring future replacement technologies to
the WSR-88D, an important consideration is forecaster
needs. In a survey conducted by LaDue et al. (2010),
forecasters expressed a need for higher-temporal-
resolution radar data during rapidly evolving weather
events. In particular, forecasters reported that the 4–6-min
updates provided by the WSR-88D are insufficient for
observing radar precursor signatures of thunderstorms
such as downbursts (LaDue et al. 2010). Fujita and
Wakimoto (1983) define a downburst as, ‘‘A strong
downdraft which induces an outburst of damaging winds
on or near the ground.’’ Radar precursor signatures,
such as a descending high-reflectivity core and strong
midlevel convergence, can be used to identify storms
capable of producing a downburst (e.g., Roberts and
Wilson 1989; Campbell and Isaminger 1990). Such pre-
cursor signatures, however, can evolve too quickly for
trends to be sampled sufficiently by the WSR-88D. Such
limitations may result in delayed warnings and therefore
reduced lead time or, worse, missed events. These lim-
itations are of concern because downbursts can produce
damaging winds at the surface, presenting a threat to life
and property. Therefore, for improvement in warning
operations, a future radar system should be capable of
sampling the atmosphere on a shorter time scale, which
PAR can provide.
Heinselman et al. (2008) examined the weather sur-
veillance capabilities of the PAR during severe weather
events. In particular, microburst precursor signatures
observed by the PAR were compared to those observed
by the WSR-88D. During a 13-min observation period
when a storm was sampled by both radars, the PAR and
WSR-88D collected 23 and 3.5 volume scans, respec-
tively. The considerably faster PAR sampling resulted in
an improved ability to observe and track microburst
precursor signatures, prior to the detection of divergent
outflow at the lowest scans. Additionally, Heinselman
et al. (2008) analyzed a hailstorm observed by PAR.
Although a comparison to the WSR-88D was not
available, the development of radar features indicative
of a hail threat (e.g., bounded weak-echo region and
three-body scatter spike) were clearly visible in PAR
data as the storm quickly evolved. These findings by
Heinselman et al. (2008) suggest that the use of PAR
data could provide forecasters with the ability to detect
impending severe weather earlier, which in turn may
provide the public with longer warning lead times.
The Phased Array Radar Innovative Sensing Exper-
iment (PARISE) was designed to assess the impacts of
higher-temporal-resolution radar data on the warning
decision process of forecasters (Heinselman et al. 2012;
Heinselman and LaDue 2013). The work of PARISE is
critical to ensuring that the implementation of PAR
technology would be beneficial to the NWS. The 2010
and 2012 PARISE focused on low-end tornado events
(Heinselman et al. 2012; Heinselman and LaDue 2013).
Both experiments reported enhanced forecaster per-
formance with the use of 1-min radar updates compared
to forecasters using traditional 5-min radar updates, as
demonstrated through warnings issued with longer tor-
nado lead times. The purpose of this study was to extend
the work of PARISE to include severe hail and wind
events, with a focus on downbursts (see section 3b for
the NWS definition of severe). Based on the findings of
Heinselman et al. (2012) and Heinselman and LaDue
(2013), we hypothesized that during such events, rapidly
updating radar data would positively impact the warning
decision process of NWS forecasters. To assess this hy-
pothesis, data collection focused on both quantitative
and qualitative aspects of the forecaster warning de-
cision process. In particular, details of warning products
were recorded so that forecaster performance could be
assessed from a verification standpoint. The data col-
lected revealed that the warning decision process com-
prised three key decision stages. For this reason,
verification was assessed with regard to what has been
termed the compound warning decision process, which
recognizes that forecasters detect, identify, and re-
identify severe weather (see section 3a). Additionally,
confidence ratings were obtained each time a forecaster
made a key decision, along with reasoning for each
confidence rating. Through the use of a confidence-
based assessment, these ratings were analyzed to ad-
dress the question of whether increasing the temporal
availability of radar data leads to better decisions. Spe-
cifically, decisions were categorized into four types:
doubtful, uninformed, misinformed, and mastery. The
reasoning for each confidence rating provides insight
into why each decision type occurred, and whether the
temporal resolution of radar data played a role.
2. Methods
a. Experimental design
From two NWS Weather Forecast Offices (WFOs),
12 forecasters were recruited to participate in the 2013
PARISE. The two WFOs were located in the NWS’s
Southern and Eastern Regions, and therefore given the
climatology of these regions, the 12 forecasters would
have experienced working severe hail and wind events
(Kelly et al. 1985). During each of the six experiment
weeks, one forecaster from each WFO visited Norman,
390 WEATHER AND FORECASTING VOLUME 30
Oklahoma. The experiment adopted a two-independent-
group design, where each week forecasters were as-
signed to either a control or an experimental group.
The volume update time acted as the independent var-
iable, where the control group received 5-min updates
from temporally degraded PAR data, and the experi-
mental group received 1-min updates from full-
temporal-resolution PAR data.
To ensure balanced groups in terms of knowledge and
experience, matched random assignment was in-
corporated into the experiment design. Matching was
accomplished through an online survey that was issued
to participants prior to the experiment. Participants’
experience was measured by the number of years they
had worked in the NWS (Table 1, columns 1 and 3).
Although experience is important with respect to the
amount of exposure one has had in their work envi-
ronment, experience does not imply expertise. As de-
scribed by Jacoby et al. (1986), experience and expertise
are ‘‘conceptually orthogonal,’’ with a distinguishing
factor being that expertise is achieved through acquiring
a ‘‘qualitatively higher level of either knowledge or
skill.’’ Therefore, to assess aspects of forecaster exper-
tise relevant to this study, knowledge was measured
through four questions regarding familiarity (Table 1,
columns 2 and 3), understanding, knowledge of pre-
cursors, and training with respect to downburst events
(Table 1, columns 4–7). For knowledge, the three
questions requiring qualitative responses were com-
pared to criteria that were based on downburst con-
ceptual models (e.g., Atkins and Wakimoto 1991).
Based on their survey responses, all participants were
assigned an experience and knowledge score ranging
between 1 and 5 (Fig. 1). The experience score was
based on the single experience question, whereas the
knowledge score was generated by averaging the points
obtained from the four knowledge questions. Among
the participants, experience was spread fairly evenly,
and knowledge was clustered around the medium range
TABLE 1. Criteria for points assigned to questions from the preexperimental online survey. Columns 1 and 3 refer to how experience scores were assigned, and columns 2–7 refer to how knowledge scores were assigned. In column 2, a scale from 1 to 10 is used (where 1 indicates no familiarity and 10 indicates extensive familiarity). For the understanding, precursor, and training questions, one point is assigned for each topic discussed within the question category, for a total of five points per category.

Points | Experience (yr) | Familiarity | Understanding of a downburst | Precursors for forecasting a downburst | Training
1 | ≤5 | 1 and 2 | Definition | Suspended core | Distance Learning Operations Course and Advanced Warning Operations Course
2 | ≤10 | 3 and 4 | Wet and dry variety recognized | Midaltitude radial convergence | Seasonal familiarization training
3 | ≤15 | 5 and 6 | Description of soundings | Storm-top divergence | Other courses (e.g., online/workshops)
4 | ≤20 | 7 and 8 | Thermodynamic and dynamic mechanisms | Environment assessment | Exposure to literature/current forecasting techniques
5 | >20 | 9 and 10 | Demonstration of an understanding beyond that of a typical responder | Demonstration of an understanding beyond that of a typical responder | Personal experience (e.g., storm chasing)
FIG. 1. Experience and knowledge scores for each participant are
given. The group assignment of each participant was based on the
control and experimental group combinations that yielded the
smallest Mahalanobis distance. Participants assigned to the control
or experimental groups are assigned open or filled circles,
respectively.
(Fig. 1). For all possible group combinations, the
Mahalanobis distance was computed to assess the sim-
ilarity between groups by using experience and knowl-
edge scores as variables (McLachlan 1999). The smallest
distance represented the greatest similarity between
groups, which therefore determined the group assign-
ment for each participant (Fig. 1).
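The matched assignment described above can be sketched in code. This is an illustrative implementation, not the authors' actual script: the paper does not state which covariance estimate was used for the Mahalanobis distance, so the pooled covariance of all participants' scores is assumed here, and the exhaustive search mirrors the "all possible group combinations" step.

```python
import itertools

import numpy as np

def mahalanobis_between_groups(scores, group_a, group_b):
    """Mahalanobis distance between the mean score vectors of two groups.

    scores: (n_participants, 2) array of [experience, knowledge] scores.
    The pooled covariance of all participants is used -- one plausible
    choice; the paper does not specify the covariance estimate.
    """
    cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
    diff = scores[list(group_a)].mean(axis=0) - scores[list(group_b)].mean(axis=0)
    return float(np.sqrt(diff @ cov_inv @ diff))

def best_split(scores):
    """Exhaustively evaluate all 6/6 splits of 12 participants and return
    the (distance, group_a, group_b) triple with the smallest distance,
    i.e., the most similar pair of groups."""
    n = len(scores)
    best = None
    for group_a in itertools.combinations(range(n), n // 2):
        group_b = tuple(i for i in range(n) if i not in group_a)
        d = mahalanobis_between_groups(scores, group_a, group_b)
        if best is None or d < best[0]:
            best = (d, group_a, group_b)
    return best
```

With 12 participants there are only 924 candidate splits, so brute force is cheap; each split is evaluated twice (once per labeling), which does not change the minimum.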
Although efforts were made to match groups, the
limitations associated with the applied methodology
should be acknowledged. A limitation that arose fol-
lowing the distribution of the survey was that partici-
pants may not have always interpreted the questions
correctly, leading to discussions on tangential topics. For
example, participants were asked to explain their un-
derstanding of a downburst. Although most participants
perceived this question as intended (Table 1, column 4),
some responses focused on the type of damage observed
from downbursts. In addition, the amount of time and
effort that participants invested into the survey was
likely variable. For these reasons, it is possible that
survey responses did not provide a complete represen-
tation of participants’ knowledge. However, despite
this possibility, the consistent assessment of survey re-
sponses and the use of a similarity metric provided
a means to objectively match groups.
b. Case studies
The National Weather Radar Testbed located in
Norman, Oklahoma, is home to an S-band PAR that is
being evaluated and tested for weather applications.
Given that the PAR is a single flat-panel array, data
collection is limited to a 90° sector at any one time.
PAR's electronic beam steering means that it operates
with a nonconformal beamwidth increasing from 1.5° to
2.1° as the beam is steered from boresight to ±45° (Zrnić
et al. 2007). Additionally, the electronic beam steering
allows the atmosphere to be scanned noncontiguously,
enabling weather-focused observations, which further
reduce the volume update time to less than 1 min
(Heinselman and Torres 2011; Torres et al. 2012).
Based on the following criteria, two cases from ar-
chived PAR data were selected for the 2013 PARISE
(Table 2). First, the cases needed to be long enough to
allow participants to settle into their roles and demon-
strate their warning decision processes as the weather
evolved. Second, severe hail and/or wind reports needed
to be associated with the event, preferably toward the
end of the case to give participants an opportunity to
interrogate the storms beforehand and make warning
decisions as necessary. Third, for consistent low-level
sampling of the weather event, the PAR data needed to
be uninterrupted and within a range of 100 km from
the radar.
Case 1 presented multicell clusters of storms that oc-
curred at 0134–0210 UTC 20 April 2012 (Figs. 2a,b;
Table 2). This marginally severe (i.e., at or slightly
greater than the severe criteria) hail event was observed
by the PAR using an enhanced volume coverage pattern
(VCP) 12 strategy. Specifically, this VCP scanned 19
elevation angles ranging between 0.51° and 52.90°. Al-
though only one severe hail report occurred during case
time, an additional six hail reports were associated with
the same storm 1 h after case end time.
Case 2 included multicellular storms with some rota-
tion that were sampled by PAR at 2053–2139 UTC 16
July 2009 (Figs. 2c,d; Table 2). PAR collected data using
a VCP that was composed of 14 elevation angles ranging
between 0.51° and 38.80°. Both severe hail and wind
events were reported and associated with a downburst
event that occurred in central Oklahoma. During case
time, there was one severe wind and two severe hail
reports. Within the hour after case end time, an addi-
tional 16 reports of severe hail and wind events were
associated with the same storm.
All storm reports were obtained from Storm Data, which
is logged in the NWS Performance Management System
(https://verification.nws.noaa.gov/). Because the spatial
and temporal accuracy of Storm Data is limited (e.g., Witt
et al. 1998; Trapp et al. 2006), it was important to ensure
consistency between the location and timing of storm re-
ports with the radar data. Additionally, weather reports
obtained during the Severe Hazards Analysis and Verifi-
cation Experiment (SHAVE; Ortega et al. 2009) were
examined to validate confidence in the storms that did not
produce severe weather. Both SHAVE and Storm Data
were in agreement with storms classified as null events.
The occurrence of both severe and nonsevere storms
during the cases provided a realistic scenario whereby
participants were challenged to differentiate between
storms that would and would not produce severe weather.
c. Working the cases
Before working each case, participants viewed
a weather briefing video that was prepared by J. LaDue
TABLE 2. Descriptions of cases 1 and 2.

 | Case 1 | Case 2
Time and date | 0134–0210 UTC 20 Apr 2012 | 2053–2139 UTC 16 Jul 2009
Event type | Multicell, severe hail | Multicell, severe hail and wind
Storm reports | 0209 UTC, 1-in. hail | 2135 UTC, 1.75-in. hail; 2135 UTC, estimated 56-kt wind gust; 2138 UTC, 1.75-in. hail
VCP | 19 elevations, 0.51°–52.90° | 14 elevations, 0.51°–38.80°
of the Warning Decision Training Branch. This video
provided participants with an overview of the envi-
ronmental conditions associated with the case, along
with satellite and radar imagery leading up to the case
start time. The weather briefing gave all participants
the same information from which they could form ex-
pectations. Participants were then told that they had
just come on shift, that no warnings were in progress,
and that it was their job to determine whether a warn-
ing was required for the storms they would encounter.
All participants worked independently in separate
rooms. They were reminded that the data collected
from their participation would remain anonymous, and
participants were encouraged to work as they would in
their usual WFOs. In this study, participants are
referred to as P1–P6 for the control group and P7–P12
for the experimental group.
Cases were played in simulated real time using the
next-generation Advanced Weather Interactive Pro-
cessing System-2 (AWIPS-2). Given that during the
summer of 2013 participants were using AWIPS-1
within their WFOs, a short familiarization session with
the newer software prior to working events was pro-
vided to increase the participants’ comfort level using
AWIPS-2 as their forecasting tool. Therein, participants
were able to view base velocity, reflectivity, and spec-
trum width products from the PAR. During the case,
participants received verbal information of storm re-
ports that were timed according to the details provided
in Storm Data. All warning products (e.g., special
FIG. 2. The (left) 0.51° reflectivity and (right) velocity for (a),(b) case 1 at 0140 UTC 20 Apr 2012 and (c),(d) case 2
at 2111 UTC 16 Jul 2009. Times were chosen to illustrate the variety of storms that participants encountered during
the cases. The storms that were later associated with severe weather reports from Storm Data are identified by the
white circles.
weather statements, severe thunderstorm warnings, and
severe weather statements) were issued using the
Warning Generation (WARNGEN) software. When-
ever participants issued a product, they were asked to
indicate their level of confidence on a scale that ranged
from not sure (0%), to partially sure (50%), to sure
(100%; Fig. 3). Following the case, participants were
asked a set of probing questions that targeted the rea-
sons for each decision and the decision maker’s associ-
ated confidence level.
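The confidence ratings and probing questions feed the confidence-based assessment introduced in section 1, which sorts decisions into doubtful, uninformed, misinformed, and mastery types. A small sketch of that categorization follows. Only mastery (confident and correct) is defined explicitly in this paper; the mapping of the other three labels to confidence/correctness quadrants, and the 50% confidence threshold, follow the usual confidence-based assessment convention and are assumptions here.

```python
def classify_decision(confidence_pct, correct, threshold=50):
    """Map a (confidence, correctness) pair to one of the four decision types.

    Mastery = confident and correct is stated in the paper; the other three
    quadrant labels and the 50% threshold are assumptions, following the
    common confidence-based assessment convention.
    """
    confident = confidence_pct > threshold
    if confident and correct:
        return "mastery"
    if confident:
        return "misinformed"
    if correct:
        return "doubtful"
    return "uninformed"
```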
3. Forecaster performance
a. The compound warning decision process
Decisions are oftentimes not a one-step procedure.
Rather, decision makers can find themselves in a com-
pound decision environment that consists of multiple
decision elements. For example, search and rescue op-
erations require locating a target followed by identifying
that the correct target has been recovered (Duncan
2006), and medical diagnoses can involve first detecting
an abnormality, and then correctly localizing the ab-
normality for treatment (Obuchowski et al. 2000). Ob-
servations of participants during the 2013 PARISE
revealed that weather forecasters also encounter mul-
tiple problems when working toward a solution. In
particular, these problems are focused on warning de-
cisions and are recognized as detection, identification,
and reidentification, together forming the compound
warning decision process (Fig. 4). Detection relates to
the decision to warn; a forecaster perceives and com-
prehends information that leads to the belief that severe
weather will occur. The decision to issue a warning
prompts the forecaster to open the WARNGEN soft-
ware, at which point the forecaster progresses to the
identification stage. For instance, when issuing a severe
thunderstorm warning, the forecaster must identify the
expected weather threats (i.e., hail and/or wind) from
the storm in question. Once the warning is issued, the
forecaster continues to monitor the storm’s evolution
and updates the warning by issuing severe weather
statements. It is through these updates that the
forecaster reidentifies the weather threats; the threat
may be maintained, changed in magnitude, changed in
type, or canceled.
Distinguishing severe hail and wind events from one
another is a challenge that NWS forecasters regularly
encounter during warning operations. Currently,
though, the NWS only assesses forecaster performance
at the detection level. The compound warning decision
process, however, allows for a more comprehensive as-
sessment of warning decisions. A correct decision at the
detection level does not necessarily mean that the
forecaster has accurately comprehended information
regarding the storm’s potential. For example, while
working case 2, P3 (control participant) issued a severe
thunderstorm warning, identifying only wind as the
weather threat. Although at the detection level P3 made
a correct decision, P3 had missed the hail threat during
identification. The participant maintained this threat
expectation through the issuance of two warnings, only
realizing after the first hail report at 2135 UTC 16 July
2009 that hail was also a threat. At this point, P3 issued
a severe weather statement to reidentify both hail and
wind as weather threats, but unfortunately had not
communicated the hail threat until after the event had
occurred. This example demonstrates that to fully un-
derstand the quality of a forecast, a more intricate
analysis of warning decisions is required.
FIG. 3. Tool used to indicate confidence related to the issuance of a warning, severe weather
statement (SVS), or special weather statement (SPS) product.
FIG. 4. The compound warning decision process is composed of
three decision stages: detection, identification, and reidentification.
b. Verification
To measure forecaster performance at the three levels
of the compound warning decision process, forecaster
decisions were verified using the NWS severe criteria. In
operations, a severe thunderstorm warning is verified by
the occurrence of 50 knots (kt; 1 kt = 0.51 m s−1) or
higher wind and/or hail of at least 1-in. diameter,
whereas a tornado warning is verified by reports of
a tornado within the spatiotemporal limits of the warn-
ing polygon (NOAA 2011). Storm reports associated
with the severe weather events in cases 1 and 2 were
treated as instantaneous events (Table 2). Additionally,
since participants worked only a portion of a severe
weather event, there were occasions where warnings
were verified only by severe hail and/or wind events
after the case had ended. In these instances, storm re-
ports recorded 1h after case end time were used for
verification purposes. For detection, individual warnings
were verified by assessing whether the warning encom-
passed an event both spatially and temporally. Each
event that was not warned for was recorded as a miss.
For identification, weather threats were first considered
individually. For example, in case 1, only hail reports
were associated with the severe storm. If both hail and
wind were identified as threats in the warning, then hail
was a hit, and wind was a false alarm. The results from
each weather threat were then combined for overall
identification statistics. Reidentification was verified in
a similar manner to identification, but this time for the
updated warning information detailed in severe weather
statements.
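The spatiotemporal check described above, with reports treated as instantaneous events, can be illustrated with a minimal sketch. This is not the NWS verification system's actual algorithm; the polygon test, the `WarningProduct` container, and its field names are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class WarningProduct:
    polygon: list      # [(x, y), ...] vertices of the warning polygon (hypothetical container)
    start: datetime    # beginning of the warning's valid period
    end: datetime      # expiration of the warning

def point_in_polygon(point, polygon):
    """Ray-casting containment test (illustrative only; operational
    verification is performed through the NWS verification system)."""
    x, y = point
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def warning_verifies(warning, report_point, report_time):
    """A warning verifies if an instantaneous report falls inside the
    warning polygon and within the warning's valid period."""
    return (warning.start <= report_time <= warning.end
            and point_in_polygon(report_point, warning.polygon))
```

An event with no warning satisfying this check would be recorded as a miss, matching the detection-level bookkeeping above.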
Performance measures were calculated for detection,
identification, and reidentification. The NWS commonly
assesses forecaster performance using the probability of
detection (POD) and false alarm ratio (FAR). Whereas the
POD represents the proportion of events that occurred
and were successfully warned for, the FAR represents
the proportion of warnings issued that were false alarms.
The POD and FAR can be calculated as follows (Wilks
2006):
POD = a/(a + c) and (1)

FAR = b/(a + b), (2)
where a, b, and c are the numbers of hits, false alarms,
and misses, respectively. For cases 1 and 2, group mean
POD and FAR scores were calculated for detection,
identification, and reidentification. With the exception
of two instances in case 2, the experimental group ob-
tained superior mean POD and FAR scores compared
to the control group (Table 3). Participants’ individual
results are plotted in Figs. 5 and 6, which illustrate the
underlying distribution in performance that led to the
different group averages.
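Equations (1) and (2) can be sketched directly in code; the example values mirror the case 1 scenario, described below, of a participant with one verified and one unverified warning and no missed events.

```python
def pod(hits, misses):
    """Probability of detection, Eq. (1): fraction of events warned for."""
    return hits / (hits + misses)

def far(hits, false_alarms):
    """False alarm ratio, Eq. (2): fraction of warnings that did not verify."""
    return false_alarms / (hits + false_alarms)

# One warning on the severe storm (hit) and one on a nonsevere storm
# (false alarm), with no missed events:
print(pod(hits=1, misses=0))        # 1.0
print(far(hits=1, false_alarms=1))  # 0.5
```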
1) CASE 1
All but one participant (control) successfully detected
the severe hail event in case 1, which resulted in POD
scores of 0.83 and 1 for the control and experimental
groups, respectively (Fig. 5a). All control and five ex-
perimental participants also decided to issue warnings
on a storm that was not associated with severe weather.
Although at the detection level the experimental
group’s performance scores were more variable, the
overall performance of the experimental group resulted
in a lower FAR score (0.45) than the control group (0.58;
Fig. 5a). Five control and two experimental participants
obtained FAR scores of 0.5 by warning once on the se-
vere storm and once on a nonsevere storm. Two warn-
ings were also issued by P5, but neither verified (FAR = 1).
Three warnings were issued by P7 and P12, of which
only one verified at the detection level (FAR = 0.67).
Three warnings were also issued by P10, but two of his
warnings verified (FAR = 0.33). The only experimental
participant who did not incorrectly detect severe
weather (FAR = 0) was P11.
Following detection, participants identified the
weather threat associated with the storm that was being
warned. All participants identified hail in each of their
warnings, which only verified for warnings that were
successful at the detection level. Therefore, the POD
scores for identification match those for detection
(Fig. 5b). Participants were also assigned FAR scores
because of incorrect identifications (Fig. 5b). Incorrect
identifications were a result of two reasons: 1) a weather
threat was identified for a warning that did not verify at
the detection level and 2) incorrect identification of
TABLE 3. Mean POD and FAR statistics for the control and experimental groups for detection, identification, and reidentification.

 | Case 1 mean control | Case 1 mean experimental | Case 2 mean control | Case 2 mean experimental
Detection POD | 0.83 | 1.00 | 0.95 | 1.00
Detection FAR | 0.58 | 0.45 | 0.33 | 0.25
Identification (overall threat) POD | 0.83 | 1.00 | 0.88 | 0.88
Identification (overall threat) FAR | 0.79 | 0.70 | 0.36 | 0.22
Reidentification (overall threat) POD | 0.60 | 0.83 | 1.00 | 0.90
Reidentification (overall threat) FAR | 0.87 | 0.69 | 0.23 | 0.19
a wind threat was made on the severe storm that was
associated with only severe hail events. Both groups’
FAR scores increased from detection to identification,
though the experimental group continued to achieve
FIG. 5. POD (black circle) and FAR (open circle) scores for
(a) detection, (b) identification, and (c) reidentification in case 1.
The vertical dashed line separates the (left) control and the (right)
experimental participants, and the horizontal dashed line marks
the 0.5 values for POD and FAR scores.
FIG. 6. As in Fig. 5, but for case 2.
a lower FAR score than the control group (Table 3). We
surmise that the increase in FAR scores from the de-
tection to identification level is due to the added chal-
lenge of having to discern between potential weather
threats.
Reidentifications of weather threats during case 1 were
made while updating a warning. No updates were issued
by P6, and therefore statistics were not calculated for
this participant. Hail and wind threats were only re-
identified on warnings that were false alarms at the de-
tection level by P4, P5, and P7. These participants
therefore received POD and FAR scores of 0 and 1,
respectively (Fig. 5c). The remaining participants re-
identified a hail threat on the correct storm at least once,
achieving POD scores of 1. The variable FAR scores at
the reidentification level resulted from 1) what storms
participants decided to update (i.e., the severe storm or
the nonsevere storm) and 2) whether participants were
able to correctly reidentify hail as the only threat.
Whereas the mean FAR score from identification to re-
identification remained nearly steady for the experimen-
tal group, the control group’s mean FAR score increased
(Table 3). Overall, the experimental group was more
successful at reidentifying the correct weather threat on
the correct (i.e., severe) storm than the control group.
2) CASE 2
During case 2, all participants except P5 successfully
detected the three severe weather events (Fig. 6a). This
participant missed one event, which resulted in the
control group achieving a slightly lower mean POD
score of 0.95 compared to the experimental group’s
mean POD score of 1 (Table 3). In comparison to POD
scores, FAR scores were more variable among partici-
pants (Fig. 6a). For participants obtaining FAR scores of
0, each warning that was issued encompassed the severe
storm and therefore was verified with respect to de-
tection. Participants obtaining FAR scores of 0.5 typi-
cally issued two warnings, of which only one was verified
by severe weather, while the other targeted a storm to
the north that was not associated with severe weather
reports. Three warnings were issued by both P4 and P6.
For P4, one of these warnings verified (FAR = 0.67),
whereas for P6, two warnings verified (FAR = 0.33).
Overall, the experimental group had fewer false alarms,
as demonstrated by their lower mean FAR score of 0.25
compared to 0.33 for the control group (Table 3).
Unlike in case 1, case 2 presented a storm that pro-
duced both severe hail and wind. Of these events, all
participants identified the wind event successfully, and
four participants in each group identified the hail events
(Fig. 6b). This similar performance between groups led
to matching mean POD scores at the identification level
of 0.88 (Table 3). The experimental group, though,
performed better than the control group regarding false
alarms. In case 2, false alarms at the identification level
occurred mostly within warnings that did not verify at
the detection level. While all control participants in-
correctly identified weather threats within these warn-
ings, three experimental participants achieved an FAR
score of 0. Additionally, two control participants in-
correctly identified a tornado threat. The resulting mean
FAR scores for identification were 0.36 for the control
group and 0.22 for the experimental group (Table 3).
When participants began to reidentify weather
threats, group POD scores increased and the FAR
scores decreased (Table 3). As the severe storm evolved
over time, participants realized that the southern storm
had more potential than the storm to the north, which
was beginning to dissipate. The wind threat associated
with the severe storm was correctly reidentified by all
participants, while all control and four experimental
participants also correctly reidentified the hail threat
(Fig. 6c). Some participants in both groups also in-
correctly reidentified weather threats. Whereas the ex-
perimental group’s FAR score decreased slightly from
identification to reidentification, the control group’s
FAR score decreased more substantially (Table 3).
However, the control group’s accuracy during re-
identification only improved to a level similar to that
demonstrated by the experimental group during the
identification stage.
c. Lead time
The lead time was calculated as the time of the severe
hail or wind event minus the time of warning issuance.
For events that were unwarned, a lead time of 0 min was
assigned. On occasions where multiple warnings en-
compassed one event, the earliest issued warning was
used to calculate lead time. Lead time was calculated for
all 12 participants for one event in case 1, and three
events in case 2.
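The lead time rule just described can be sketched as follows; the times in the example are hypothetical, not values from the study:

```python
from datetime import datetime, timedelta

def lead_time(event_time: datetime, warning_times: list[datetime]) -> timedelta:
    """Lead time = event time minus the earliest warning that encompassed
    the event; unwarned events are assigned a lead time of 0 min."""
    if not warning_times:
        return timedelta(0)
    return event_time - min(warning_times)

# Hypothetical example: warnings issued at 2000 and 2010 UTC, hail at 2022 UTC.
lt = lead_time(datetime(2013, 5, 20, 20, 22),
               [datetime(2013, 5, 20, 20, 0), datetime(2013, 5, 20, 20, 10)])
print(lt)  # 0:22:00
```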
Participants’ lead times during case 1 ranged from 0 to
30 min (Fig. 7a). The experimental group, however,
demonstrated a tendency toward longer lead times.
With the exception of P7, all experimental participants
achieved a lead time of at least 20 min, compared to just
half of the control participants. Group mean lead times
were 16.4 and 22.0 min for the control and experimental
groups, respectively (Table 4). For case 2, lead time was
calculated for three events that occurred close to one
another both spatially and temporally. Therefore, one
warning often verified the three events. Within the ex-
perimental group, four participants achieved a lead time
of at least 20 min for all three events, compared to just
one control participant (Fig. 7b). Group mean lead
times for case 2 were 16.4 and 21.8 min for the control
and experimental groups, respectively.
APRIL 2015 BOWDEN ET AL. 397
Unauthenticated | Downloaded 02/17/22 08:19 AM UTC
Combining the lead time results of both cases, we find
that the control group’s mean lead time was 16.4 min
compared to 21.9 min for the experimental group.
Therefore, the experimental group’s mean lead time
exceeded that of the control group by 5.5 min. While this
difference in mean lead time is similar to the temporal
resolution provided to the control group, the variability
in lead times within each group suggests that factors in
addition to temporal resolution may be important for
explaining participant performance. Additionally, the
Wilcoxon rank sum nonparametric test (Wilks 2006) was
used to assess the difference between the median lead
times of the control (17.3 min) and experimental
(21.5 min) groups. The test yielded a p value of 0.0252,
indicating that the difference in median lead times was
statistically significant at the 95% confidence level.
Although the results from this study cannot be
generalized because of the small sample size, the per-
formance of the experimental group is encouraging, and
the lead time results favor the use of higher-
temporal-resolution radar data.
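The study does not state whether an exact or approximate form of the Wilcoxon rank sum test was applied; the sketch below uses the large-sample normal approximation (without tie handling) and purely illustrative samples, not the study’s per-participant lead times:

```python
import math

def rank_sum_p(x: list[float], y: list[float]) -> float:
    """Two-sided Wilcoxon rank sum (Mann-Whitney) p value via the normal
    approximation. Assumes no tied values and applies no continuity correction."""
    combined = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(combined)}  # rank of each value
    w = sum(rank[v] for v in x)                        # rank sum of sample x
    n1, n2 = len(x), len(y)
    mu = n1 * (n1 + n2 + 1) / 2                        # mean of W under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)    # std dev of W under H0
    z = (w - mu) / sigma
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Illustrative lead-time samples only (minutes); not the study's data.
control = [10.0, 14.0, 16.0, 18.0, 20.0, 21.0]
experimental = [17.0, 19.0, 22.0, 24.0, 26.0, 30.0]
print(round(rank_sum_p(control, experimental), 3))
```

In practice one would reach for a library routine (e.g., a SciPy rank sum implementation) that also corrects for ties.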
4. Decision types
a. Confidence-based assessment
The increased flux of information provided by PAR
raises the question of how rapidly updating radar data
will impact forecaster confidence, and what the result-
ing effects will be on the decisions that are made. To
investigate this question, the relationship between con-
fidence and correctness was assessed using a two-
dimensional testing method (Bruno 1993). In this
method, referred to as the confidence-based assessment
behavioral model, a decision maker indicates the con-
fidence associated with each decision on a scale ranging
from ‘‘not sure’’ (0%) through ‘‘partially sure’’ (50%) to
‘‘sure’’ (100%; Fig. 3). In particular, confidence-based
assessment can identify three states of mind: confidence,
doubt, and ignorance (i.e., lacking knowledge). It can
also help categorize decisions into four types: doubtful,
uninformed, misinformed, and mastery (Fig. 8; Bruno
et al. 2006; Adams and Ewen 2009). According to Bruno
et al. (2006), doubtful decisions, although correct, lack
confidence and are made with hesitance. Decisions that
are both incorrect and made without confidence are
uninformed. Decisions that are incorrect yet made with
confidence are misinformed, and these are perhaps the
riskiest type of decision. The most desirable type of
decision is mastery, which arises from smart and informed
choices that are both confident and correct.
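The four decision types form a simple 2 × 2 of confidence against correctness (Fig. 8), which can be expressed directly:

```python
def decision_type(confident: bool, correct: bool) -> str:
    """Map a confidence/correctness combination to one of the four
    decision types of the confidence-based assessment model (Fig. 8)."""
    if confident:
        return "mastery" if correct else "misinformed"
    return "doubtful" if correct else "uninformed"

print(decision_type(confident=True, correct=True))    # mastery
print(decision_type(confident=True, correct=False))   # misinformed
print(decision_type(confident=False, correct=True))   # doubtful
print(decision_type(confident=False, correct=False))  # uninformed
```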
FIG. 7. (a) Case 1 and (b) case 2 warning lead times for each
participant. The vertical dashed line separates the (left) control and
(right) experimental participants.
TABLE 4. Mean lead times for control and experimental groups
for cases 1 and 2, along with the group differences in mean lead
time.

                                    Case 1: Mean       Case 2: Mean
                                    lead time (min)    lead time (min)
Control                             16.4               16.4
Experimental                        22.0               21.8
Δ lead time = experimental
  lead time − control lead time     5.6                5.4
398 WEATHER AND FORECASTING VOLUME 30
Unauthenticated | Downloaded 02/17/22 08:19 AM UTC
b. Categorizing decisions
When participants made a key decision (i.e., decision
to issue or update a warning), a corresponding confi-
dence rating was assigned. Since there was variability in
their confidence baselines, results (which ranged from
26% to 100%) were normalized by linear trans-
formation onto a new scale ranging from 0 to 7. Ratings
of at least 5 were considered confident decisions, since
this value indicated that the decision was closer to sure
than partially sure (i.e., ≥75%). The key decisions made
during cases 1 and 2 were combined, yielding a total of
N = 53 and N = 54 key decisions for the control and exper-
imental groups, respectively. Decisions were classified
as correct if the decision to issue or maintain a warning
corresponded with the occurrence of severe weather.
Similarly, decisions to not issue or to cancel a warning
were correct for instances when severe weather did not
occur.
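The normalization and confidence threshold described above can be sketched as follows. The raw-scale endpoints (26% and 100%) are taken from the text, but whether the linear transformation was applied per participant or over the pooled ratings is an assumption here; the sketch uses the pooled range:

```python
def normalize(rating: float, lo: float = 26.0, hi: float = 100.0) -> float:
    """Linearly map a raw confidence rating (percent) onto the 0-7 scale."""
    return 7.0 * (rating - lo) / (hi - lo)

def is_confident(rating: float) -> bool:
    """Normalized ratings of at least 5 are treated as confident decisions."""
    return normalize(rating) >= 5.0

print(round(normalize(26.0), 1), round(normalize(100.0), 1))  # 0.0 7.0
print(is_confident(90.0))  # True
```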
Of these key decisions, a larger proportion of the
decisions made by the experimental group were classi-
fied as mastery (63%) compared to those of the control
group (51%). Individual participants in the experimen-
tal group made a higher number of mastery decisions
and a lower number of uninformed and misinformed
decisions compared to individual participants in the
control group (Fig. 9a). The majority of the key de-
cisions in both groups were categorized as misinformed
and mastery. This result is unsurprising since one may
expect for decisions to be made more frequently when
a decision maker is confident rather than unsure. The
Wilcoxon rank sum nonparametric test (Wilks 2006) was
used to assess the difference in the median number of
decisions made by the control and experimental groups
for all four decision types. The p values yielded were
0.862, 0.673, 0.802, and 0.325 for the doubtful, un-
informed, misinformed, and mastery decision types, re-
spectively. Although statistical significance was not
established, these p values indicate that of the four
decision types, the control and experimental groups
differed most with respect to mastery decisions.
c. Explanations for decision types
Following each case, participants were questioned on
the reasons for the confidence ratings that they had
provided. The qualitative data collected from this
questioning give insight into why doubtful, uninformed,
misinformed, and mastery decisions were made. Al-
though the reasoning provided by participants varied
somewhat, common themes also emerged.
The control and experimental groups made five and
four decisions, respectively, that were correct but made
without confidence (i.e., doubtful; Fig. 9b). Of these
hesitant decisions, the majority were made during case
2, with just one doubtful decision being recorded during
case 1 for both groups. Three control participants ex-
plained that their hesitation was due to their warning
criteria not being fully satisfied. For example, in case 1,
P3 said that she was ‘‘flirting with the criteria’’ since the
storm appeared ‘‘more marginal,’’ and P4 went ahead
with issuing a tornado warning in case 2 despite being
‘‘not sure [that the] environment was conducive’’ for
tornadogenesis. Similarly, some experimental partici-
pants found themselves making warning decisions
without confidence. During case 2, P10 questioned the
severe potential of a storm on which he had decided to
warn. His doubt arose because despite seeing that the
storm had a ‘‘good’’ and ‘‘healthy’’ core, he was ‘‘just not
sure [whether] the environment’’ was supportive of se-
vere storms. For P12 and P8, though, conflict arose as
a result of earlier warnings not being verified. For ex-
ample, P12 explained that during case 1 she wanted ‘‘any
kind of determination on previous storms.’’ Addition-
ally, P8 lacked confidence in case 2 after observing
a ‘‘downward trend in reflectivity and velocity data’’
while also having ‘‘not received reports at that time.’’
The absence of reports on storms that were already
warned on resulted in P12 and P8 being hesitant in their
subsequent warning decisions.
Decisions categorized as uninformed were made on
eight occasions in the control group and five occasions in
the experimental group (Fig. 9b). Participants that did
not make incorrect decisions without confidence (i.e.,
uninformed) also did not make incorrect decisions with
FIG. 8. The four types of decisions based on the relationship
between confidence and correctness. [Adapted from Adams and
Ewen (2009).]
APRIL 2015 BOWDEN ET AL . 399
Unauthenticated | Downloaded 02/17/22 08:19 AM UTC
confidence (i.e., doubtful). These participants are iden-
tified as P5 and P6 of the control group, and P7, P9, and
P11 of the experimental group (Fig. 9b). Of the eight
uninformed decisions recorded in the control group,
three control participants explained that they did not
have sufficient data to make a confident and informed
decision. In particular, P1 described going ‘‘off [his] gut’’
when he decided to warn during case 1, P4 projected that
a storm in case 2 would ‘‘continue to grow’’ despite
‘‘[not] having a lot of information,’’ and P2 decided to
issue a tornado warning in case 2 because she thought
that if she had waited for more information, it would
have been ‘‘too late.’’ Control participants made in-
correct decisions without confidence for other reasons
also, including the warning decision being the ‘‘first one
of the day’’ (P3; case 2), feeling that a warning could not
be canceled despite it ‘‘[not] look[ing] severe anymore’’
(P4; case 1), and maintaining a tornado warning because
it was ‘‘approaching a major interstate’’ despite having
‘‘reservations about [the] tornado aspects’’ of the storm
(P4; case 2). Experimental participants’ reasoning for
their lack of confidence varied, but unlike control par-
ticipants, their reasons were not associated with the
amount of radar data they had available. Furthermore,
all uninformed decisions made by experimental partic-
ipants were made during case 1. For P12, not having
‘‘reports of ground truth’’ led to an incorrect decision
being made without confidence on two occasions. Both
P8 and P10 reported that their lack of confidence in case
1 was due to the storm of interest appearing weaker than
a storm that they had already warned on.

FIG. 9. The distribution of doubtful (yellow), uninformed (orange), misinformed (red), and
mastery (green) decisions made by (a) the control and experimental groups and (b) individual
participants for both cases 1 and 2. The sample size of each decision type is given for both
groups.

It was ex-
plained by P10 that, ‘‘reflectivity-wise, it did not seem as
robust as the southern storm.’’ Similarly, P8 noted that
the storm was not ‘‘as strong as the southern storm.’’ A
second decision was made by P8 in case 1 without con-
fidence after observing an apparent weakening
in a storm of interest, evidenced by the ‘‘lowering
hail core to less than 20 kft.’’
Misinformed decisions, which were incorrect but
made with confidence, made up the second largest
decision-type category for both the control and experi-
mental groups. Whereas all control participants made at
least one incorrect decision with confidence, only four
experimental participants did so (Fig. 9b). No key de-
cisions made by P10 or P12 were categorized as mis-
informed. Across the two cases, the experimental
group’s misinformed decisions were distributed evenly,
whereas the control group’s occurred predominantly
during case 1. Most incorrect yet confident decisions
made by the control (N = 11 of 13) and experimental
(N = 10 of 11) groups were made with the belief that
severe weather was a threat. Typically within the forecast
office, warning criteria are established based on experi-
ence and climatology. Many participants applied their
usual warning criteria to the storms they encountered
during these cases. For example, in case 1, P7 reported
seeing ‘‘60 dBZ above 20 kft,’’ which she explained ‘‘fit
[her] conceptual model for [severe] hail.’’ This warning
criterion was common among participants, because, as P9
explained during case 2, ‘‘hail is very predictable when
the core is that high.’’ However, given that warning cri-
teria are established with respect to a certain location,
participants’ warning criteria may not have been as
suited to the environment in Oklahoma, ultimately
leading them to make incorrect decisions with confidence.
Misinformed decisions were also recorded twice in the
control group and once in the experimental group for
participants who had decided to trim the warning polygon
since the storm had moved ‘‘out of the county’’ (P1).
Although confidence was associated with the decision to
cancel a threat in some location, these three participants
chose to incorrectly maintain the severe threat elsewhere
in the warning polygon, resulting in false alarms at the
reidentification level.
More than half of the decisions made by both groups
fell into the mastery decision category. In total, the
control and experimental groups made 27 and 34 con-
fident and correct decisions, respectively (Fig. 9b). At
least three key decisions made by each participant were
categorized as mastery. A maximum of eight key de-
cisions were categorized as mastery for one participant
in each group (P1 and P11; Fig. 9b). Mastery decisions
were common in both cases, with approximately 40%
occurring during case 1 and 60% during case 2. Expla-
nations for confidence that was associated with correct
decisions revolved around two reasons. The first reason
was that participants compared storm characteristics on
radar. For example, in case 1, P4 noted that the severe
storm had a ‘‘much larger and deeper high-reflectivity
core’’ than other storms, and P8 described the severe
storm as being the ‘‘most intense’’ on radar. Similar to
these observations, in case 2, P6 explained that the se-
vere storm was ‘‘more impressive’’ than the storm to the
north that he had already warned on. Making compar-
ative observations of storms provided participants with
confidence in their warning decisions. This type of rea-
soning was provided on 12 occasions by the control
group, but only 4 occasions by the experimental group.
The second reason for mastery decisions was based on
perceived severe radar signatures of specific storms.
Participants observed features and trends of individual
storms that justified their warning decisions. The ex-
perimental group made confident and correct decisions
using this reasoning on 30 occasions compared to 15
occasions by the control group. One possible explanation
for the experimental group providing this reasoning twice
as often as the control group is that the use of rapidly
updating radar data aided experimental participants in
obtaining more detailed observations of storms. For
example, in case 2, P7 saw that the severe storm was
‘‘increasing in intensity aloft,’’ leading to concern that
there was ‘‘precipitation loading producing high winds
near the ground.’’ Another example of specific storm
interrogation was when P8 observed that the hail core
was ‘‘[continuing] to grow on upper-level reflectivity,’’
while the ‘‘midlevel convergence signature [was] getting
stronger and stronger.’’ Examples such as these dem-
onstrate the experimental group’s ability to track in-
dividual characteristics of storms, which was sufficient
for developing an understanding of the storm dynamics
and correctly projecting the occurrence of severe
weather.
5. Discussion and summary
The purpose of the 2013 PARISE was to extend the
work of earlier experiments (Heinselman et al. 2012;
Heinselman and LaDue 2013) to investigate whether the
use of higher-temporal-resolution radar data during se-
vere hail and wind events would be beneficial to the
warning decision process of NWS forecasters. The ex-
periment design allowed for a comparison between
control and experimental participants that utilized PAR
data with temporal updates of 5 and 1 min, respectively.
While working two severe hail and/or wind case studies
in simulated real time, all participants exhibited a de-
cision process that was formed of multiple components.
Observing participants detecting, identifying, and re-
identifying severe weather led to the designation of the
compound warning decision process. This process in-
troduced a new verification approach, where the accu-
racy of warning decisions was considered with respect to
the detection, identification, and reidentification of se-
vere weather. This verification approach was important
for fully understanding and comparing each group’s
performance, since warning decisions and perceived
severe weather threats changed as storms evolved with
time. Given that the elements that compose the com-
pound warning decision process are a part of real-time
warning operations, we suggest that evaluating fore-
caster warning decisions beyond the detection level may
provide a more thorough assessment of forecaster per-
formance for the duration of a severe weather event,
rather than for the initial warning decision only.
The POD and FAR statistics were calculated for all
three stages of the compound warning decision process
(Table 3). Overall, the experimental group made more
accurate warning decisions than did the control group.
The experimental group also issued more timely
warnings (Table 4), as demonstrated by the significantly
longer (p value = 0.0252) median lead time obtained by
the experimental group (21.5 min) compared to the
control group (17.3 min). The finding that the experimental group
made more accurate and timely warning decisions was
not necessarily expected, since earlier studies have
shown that the skill of operational meteorologists did
not increase notably with increased information
(Stewart et al. 1992; Heideman et al. 1993). Research has
also shown that increasing the amount of information
a decision maker receives may increase confidence and
satisfaction, yet decrease actual performance (O’Reilly
1980). This effect was not observed during the 2013
PARISE. Rather, the experimental group outperformed
the control group through the use of 1-min radar
updates.
The findings from each experiment support the use of
higher-temporal-resolution radar data during warning
operations, with increased lead time being a consistent
result across all three experiments (Heinselman et al.
2012; Heinselman and LaDue 2013). However, a limi-
tation of the 2013 PARISE, along with the 2010
and 2012 PARISE, is the sample size. Given that a to-
tal of 12 participants were recruited for each experi-
ment, and that each PARISE focused on a particular
weather threat, the generalizability of the results to the
wider forecasting community is questionable. Future
experiments should include a wider variety of cases that
together are more representative of what forecasters
encounter in the real world.
Participants’ decisions were also assessed with respect
to both confidence and correctness. Rather than simply
identifying decisions as right or wrong, the goal of this
confidence-based assessment was to categorize de-
cisions into four different types, namely, doubtful, un-
informed, misinformed, and mastery (Bruno et al. 2006;
Adams and Ewen 2009). Both groups made decisions
that fell into each category. However, while the control
group made slightly more doubtful, uninformed, and
misinformed decisions than the experimental group, the
experimental group made more mastery decisions than
the control group. Qualitative reasoning for each con-
fidence rating was important for understanding the
factors that led to each decision type. The reasons
leading to uninformed and misinformed decisions
highlight some of the limitations associated with work-
ing in a simulated environment. Not having available
radar data prior to the case start time resulted in control
participants making incorrect decisions without confi-
dence, while a change in geographic location and
therefore unsuitable application of warning criteria re-
sulted in both groups making incorrect decisions with
confidence. Avoiding limitations such as these could be
accomplished by experimenting with the use of PAR
data during real-time operations in the local forecast
office. Mastery decisions resulted from participants ei-
ther making a comparison between storms or observing
and tracking individual storm characteristics. While
both reasons explained the confident and correct de-
cisions made by the control group, the mastery decisions
in the experimental group were predominantly ex-
plained by the latter reason. As discussed previously,
LaDue et al. (2010) reported that forecasters expressed
a need for faster radar updates in order to observe
rapidly evolving weather. The qualitative reasoning
provided for mastery decisions suggests that the exper-
imental group’s ability to observe storm evolution on
a finer temporal scale was enhanced through the use of
1-min radar updates.
Acknowledgments. Thank you to the 12 NWS fore-
casters for participating in this study, to the participating
WFOs’ MICs for supporting recruitment, and to
Michael Scotten for participating in the pilot experiment.
We also thank A/V specialist James Murnan, software
expert Eddie Forren, and GIS expert Ami Arthur. Ad-
vice from committee members Robert Palmer and David
Parsons, along with insightful discussion with Harold
Brooks and Lans Rothfusz, aided the development of
this study. We are grateful to Kurt Hondl, Michael
Scotten, and the two anonymous reviewers for providing
comments on this paper. Funding was provided by
NOAA/Office of Oceanic and Atmospheric Research
under NOAA–University of Oklahoma Cooperative
Agreement NA11OAR4320072, U.S. Department of
Commerce.
REFERENCES
Adams, T. M., and G. W. Ewen, 2009: The importance of confi-
dence in improving educational outcomes. 25th Annual Conf.
on Distance Teaching and Learning, Madison, WI, University
of Wisconsin–Madison, 1–5.
Andra, D. L., E. M. Quoetone, and W. F. Bunting, 2002: Warning
decision making: The relative roles of conceptual models, tech-
nology, strategy, and forecaster expertise on 3 May 1999. Wea.
Forecasting, 17, 559–566, doi:10.1175/1520-0434(2002)017<0559:
WDMTRR>2.0.CO;2.
Atkins, N. T., and R. M. Wakimoto, 1991: Wet microburst activity
over the southeastern United States: Implications for
forecasting. Wea. Forecasting, 6, 470–482, doi:10.1175/
1520-0434(1991)006<0470:WMAOTS>2.0.CO;2.
Bruno, J. E., 1993: Using testing to provide feedback to support
instruction: A reexamination of the role of assessment in ed-
ucational organizations. Item Banking: Interactive Testing and
Self-Assessments, D. A. Leclercq and J. E. Bruno, Eds.,
Springer-Verlag, 190–209.
——, C. J. Smith, P. G. Engstrom, T. M. Adams, K. D. Warr, M. J.
Cushman, B. D. Webster, and F. M. Bollin, 2006: Method and
system for knowledge assessment using confidence-based
measurement. U.S. Patent 2006/0029920 A1, filed 23 July
2005, issued 9 February 2006.
Campbell, S. D., and M. A. Isaminger, 1990: A prototype micro-
burst prediction product for the Terminal Doppler Weather
Radar. Preprints, 16thConf. on Severe Local Storms,Kananaskis
Park, AB, Canada, Amer. Meteor. Soc., 393–396.
Crum, T., S. D. Smith, J. N. Chrisman, R. E. Saffle, R. W. Hall, and
R. J. Vogt, 2013: WSR-88D radar projects–Update 2013. Proc.
29th Conf. on Environmental Information Processing Tech-
nologies, Austin, TX, Amer. Meteor. Soc., 8.1. [Available
online at https://ams.confex.com/ams/93Annual/webprogram/
Paper221461.html.]
Duncan, M., 2006: A signal detection model of compound decision
tasks. Defense Research and Development Canada Tech.
Rep. TR2006–256, 56 pp.
Forsyth, D. E., and Coauthors, 2005: The National Weather Radar
Testbed (phased array). Preprints, 32nd Conf. on Radar Me-
teorology, Albuquerque, NM, Amer. Meteor. Soc., 12R.3.
[Available online at https://ams.confex.com/ams/pdfpapers/
96377.pdf.]
Fujita, T. T., and R. Wakimoto, 1983: Microbursts in JAWS de-
picted by Doppler radars, PAM and aerial photographs. Pre-
prints, 21st Conf. on Radar Meteorology, Edmonton, AB,
Canada, Amer. Meteor. Soc., 19–23.
Heideman, K. F., T. R. Stewart, W. R. Moninger, and P. Reagan-
Cirincione, 1993: The Weather Information and Skill Exper-
iment (WISE): The effect of varying levels of information on
forecast skill. Wea. Forecasting, 8, 25–36, doi:10.1175/1520-
0434(1993)008<0025:TWIASE>2.0.CO;2.
Heinselman, P. L., and S. M. Torres, 2011: High-temporal-
resolution capabilities of the National Weather Radar Testbed
Phased-Array Radar. J. Appl. Meteor. Climatol., 50, 579–593,
doi:10.1175/2010JAMC2588.1.
——, and D. S. LaDue, 2013: Supercell storm evolution observed
by forecasters using PAR data. Proc. 36th Conf. on Radar
Meteorology, Breckenridge, CO, Amer. Meteor. Soc., 3B.4.
[Available online at https://ams.confex.com/ams/36Radar/
webprogram/Paper228747.html.]
——, D. L. Priegnitz, K. L. Manross, T. M. Smith, and R. W.
Adams, 2008: Rapid sampling of severe storms by the National
Weather Radar Testbed Phased Array Radar. Wea. Fore-
casting, 23, 808–824, doi:10.1175/2008WAF2007071.1.
——, D. S. LaDue, and H. Lazrus, 2012: Exploring impacts of
rapid-scan radar data on NWS warning decisions. Wea. Fore-
casting, 27, 1031–1044, doi:10.1175/WAF-D-11-00145.1.
Jacoby, J., T. Troutman, A. Kuss, and D. Mazursky, 1986: Expe-
rience and expertise in complex decision making. Adv. Con-
sum. Res., 13, 469–472.
Kelly, D. L., J. T. Schaefer, and C. A. Doswell III, 1985: Clima-
tology of nontornadic severe thunderstorm events in the
United States. Mon. Wea. Rev., 113, 1997–2014, doi:10.1175/
1520-0493(1985)113<1997:CONSTE>2.0.CO;2.
LaDue, D. S., P. L. Heinselman, and J. F. Newman, 2010: Strengths
and limitations of current radar systems for two stakeholder
groups in the southern plains. Bull. Amer. Meteor. Soc., 91,
899–910, doi:10.1175/2009BAMS2830.1.
McLachlan, G. J., 1999: Mahalanobis distance. Resonance, 4, 20–26, doi:10.1007/BF02834632.
NOAA, 2011: Verification. NWS Rep. NWSI 10-51601, 100 pp.
[Available online at http://www.nws.noaa.gov/directives/sym/
pd01016001curr.pdf.]
Obuchowski, N. A., M. L. Lieber, and K. A. Powell, 2000: Data
analysis for detection and localization of multiple abnormali-
ties with application to mammography. Acad. Radiol., 7, 516–
525, doi:10.1016/S1076-6332(00)80324-4.
O’Reilly, C. A., 1980: Individuals and information overload in or-
ganizations: Is more necessarily better? Acad. Manage. J., 23,
684–696, doi:10.2307/255556.
Ortega, K. L., T. M. Smith, K. L. Manross, K. A. Scharfenberg,
A. Witt, A. G. Kolodziej, and J. J. Gourley, 2009: The Severe
Hazards Analysis and Verification Experiment. Bull. Amer.
Meteor. Soc., 90, 1519–1530, doi:10.1175/2009BAMS2815.1.
Roberts, R. D., and J. W. Wilson, 1989: A proposed microburst
nowcasting procedure using single-Doppler radar. J. Appl.
Meteor., 28, 285–303, doi:10.1175/1520-0450(1989)028<0285:
APMNPU>2.0.CO;2.
Saffle, R. E., M. J. Istok, and G. Cate, 2009: NEXRAD product
improvement—Update 2009. 25th Conf. on Interactive In-
formation and Processing Systems (IIPS) for Meteorology,
Oceanography, and Hydrology, Phoenix, AZ, Amer. Meteor.
Soc., 10B.1. [Available online at https://ams.confex.com/ams/
pdfpapers/147971.pdf.]
Stewart, R. T., K. F. Heideman, W. R. Moninger, and P. Reagan-
Cirincione, 1992: Effects of improved information on
the components of skill in weather forecasting. Organ.
Behav. Hum. Decis. Processes, 53, 107–134, doi:10.1016/
0749-5978(92)90058-F.
Torres, S. M., and Coauthors, 2012: ADAPTS Implementation:
Can we exploit phased-array radar’s electronic beam steering
capabilities to reduce update time? Extended Abstract, 28th
Conf. on Interactive Information and Processing Systems (IIPS)
for Meteorology, Oceanography, and Hydrology, New Orleans,
LA, Amer. Meteor. Soc., 6B.3. [Available online at https://ams.
confex.com/ams/92Annual/webprogram/Paper196416.html.]
Trapp, R. J., D. M. Wheatley, N. T. Atkins, R. W. Przybylinski,
and R. Wolf, 2006: Buyer beware: Some words of caution on
the use of severe wind reports in postevent assessment and
research. Wea. Forecasting, 21, 408–415, doi:10.1175/
WAF925.1.
Whiton, R. C., P. L. Smith, S. G. Bigler, K. E. Wilk, and A. C.
Harbuck, 1998: History of operational use of weather radar by
U.S. weather services. Part II: Development of operational
Doppler weather radars. Wea. Forecasting, 13, 244–252,
doi:10.1175/1520-0434(1998)013<0244:HOOUOW>2.0.CO;2.
Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences.
2nd ed. Academic Press, 467 pp.
Witt, A., M. D. Eilts, G. J. Stumpf, E. D. Mitchell, J. T. Johnson,
and K. W. Thomas, 1998: Evaluating the performance of
WSR-88D severe storm detection algorithms. Wea. Fore-
casting, 13, 513–518, doi:10.1175/1520-0434(1998)013<0513:
ETPOWS>2.0.CO;2.
Zrnić, D. S., and Coauthors, 2007: Agile beam phased array radar
for weather observations. Bull. Amer. Meteor. Soc., 88, 1753–
1766, doi:10.1175/BAMS-88-11-1753.