Modeling Ground Delay Program Incidence using Convective and Local Weather Information Presenter: Yi Liu (Amazon) Co-authors (UC Berkeley): Mark Hansen, Danqing Zhang, Yulin Liu, Alexey Pozdnukhov Funded by NASA under Award No NNX14AJ79A ATM Seminar 2017


Post on 21-Jul-2020

  • Modeling Ground Delay Program Incidence using Convective and Local Weather Information

    Presenter: Yi Liu (Amazon)

    Co-authors (UC Berkeley): Mark Hansen, Danqing Zhang, Yulin Liu, Alexey Pozdnukhov

    Funded by NASA under Award No NNX14AJ79A

    ATM Seminar 2017

  • Outline

    • Background
    • Data
    • Methodology
    • Results
    • Conclusions

  • Background

    • Similar Days is a NASA-funded project
      – 3 years, starting from 9/14
      – Team members: UCB, UMD, and ATAC

    • Project goal: support traffic management initiative (TMI) decision making by finding and presenting pertinent information about historical days that are “similar” to a given reference day

    • The Similar Days tool identifies the similar days and presents information about daily weather, capacity, TMIs, and performance

    • Applications include
      – Post-operational analysis
      – Day-of-operations decision making
      – Research

  • Objective in this research

    • Predict the incidence of GDPs based on weather conditions
      – Help flight operators predict GDPs, which can significantly impact their operations
      – Generate analytics for “significant” convective weather, used for assessing similarity between convective weather conditions

  • Contribution to the literature

    • Previously, researchers have investigated the links between weather and GDP occurrence (Wang & Kulkarni 2011; Mukherjee et al. 2014; Bloem and Bambos 2015; Kuhn 2016)

    • Our analysis is unique in its attention to the spatial patterns of convective weather


  • Scope of this analysis

    • Our work focuses on the relation between current realized weather and GDP occurrence

    • Use of our models with weather forecasts, rather than realized weather, will reduce their reliability to some extent, but we will not address this problem in this research


  • Outline

    • Background
    • Data
    • Methodology
    • Results
    • Conclusions
    • Future Research

  • Data: hourly GDP label

    • Dependent variable, taken from FAA’s National Traffic Management Log

  • Data: airport local weather

    • Aviation System Performance Metrics
      – visibility (statute miles)
      – ceiling (feet)
      – wind magnitude (knots)
      – wind direction
      – instrument meteorological conditions (IMC) indicator

    • Wind magnitude and direction are expressed as tailwind/headwind and crosswind components relative to a reference runway

    • 5 airports: EWR, LGA, JFK, ATL, PHL, with reference runways 22, 31, 31, 27, 27, respectively
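The tailwind/headwind and crosswind variables can be derived from wind magnitude and direction. The following is a minimal sketch, not the presenters' code. It assumes that each reference runway heading is the runway number times 10 degrees and that wind direction is reported as the direction the wind blows from. The names `RUNWAY_HEADING_DEG` and `wind_components` are hypothetical.

```python
import math

# Assumption: runway heading in degrees = runway number x 10.
RUNWAY_HEADING_DEG = {"EWR": 220, "LGA": 310, "JFK": 310, "ATL": 270, "PHL": 270}


def wind_components(speed_kt, wind_from_deg, runway_heading_deg):
    """Split a wind report into along-runway and across-runway components.

    Returns (headwind, crosswind) in knots. A negative headwind is a tailwind.
    Crosswind is positive when the wind comes from the right of the runway heading.
    """
    angle = math.radians(wind_from_deg - runway_heading_deg)
    headwind = speed_kt * math.cos(angle)
    crosswind = speed_kt * math.sin(angle)
    return headwind, crosswind


# Example: 20 kt wind from 250 degrees at EWR (reference runway 22).
head, cross = wind_components(20.0, 250.0, RUNWAY_HEADING_DEG["EWR"])
print(round(head, 2), round(cross, 2))
```

Splitting the signed along-runway component into separate headwind and tailwind variables (for example, keeping only the positive or negative part) would be a further modeling choice that the slides do not spell out.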

  • Data: convective weather

    • National Convective Weather Forecast (NCWF), providing coverage at the national airspace system level

    Source: https://www.aviationweather.gov/products/ncwf/info

  • Data: convective weather

    • Polygonal representations of convective areas provided in NCWF are discretized into geo-referenced images spanning the areas around the five selected airports

    • A range of area sizes and resolutions were evaluated experimentally, resulting in a square area of 200 x 200 miles centered at the airport of interest, at a resolution of 200 x 200 pixels (one pixel per square mile)
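The discretization step can be illustrated with a small rasterizer. This is a sketch under stated assumptions, not the presenters' pipeline: polygon vertices are assumed to be already projected into local coordinates in miles (x east, y north) with the airport at (100, 100), and a pixel is flagged when its center falls inside a polygon. The functions `point_in_polygon` and `rasterize` and the example cell are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges that cross the horizontal ray extending right from (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygons, size=200):
    """Return a size x size grid of 0/1 flags, one pixel per square mile."""
    grid = [[0] * size for _ in range(size)]
    for polygon in polygons:
        xs = [p[0] for p in polygon]
        ys = [p[1] for p in polygon]
        # Only scan pixels inside the polygon's bounding box, clipped to the grid.
        col_lo, col_hi = max(0, int(min(xs))), min(size, int(max(xs)) + 1)
        row_lo, row_hi = max(0, int(min(ys))), min(size, int(max(ys)) + 1)
        for row in range(row_lo, row_hi):
            for col in range(col_lo, col_hi):
                if point_in_polygon(col + 0.5, row + 0.5, polygon):
                    grid[row][col] = 1
    return grid


# Hypothetical convective cell: a 20 x 10 mile rectangle northeast of the airport.
cell = [(120.0, 130.0), (140.0, 130.0), (140.0, 140.0), (120.0, 140.0)]
image = rasterize([cell])
features = [pixel for row in image for pixel in row]  # flattened 40,000-element vector
print(len(features), sum(features))
```

Flattening the grid gives one binary feature per pixel, which is the form a classifier over the image would consume.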

  • Data range

    • The data cover 2012 to 2014, with occasional gaps due to missing or incomplete data in one of the data sources for any of the airports

    Hourly observations by GDP status (share of airport total in parentheses):

              EWR             JFK             LGA             PHL             ATL
    No GDP    20305 (82.1%)   23160 (91.9%)   21588 (86.6%)   22521 (90.3%)   24355 (98.6%)
    GDP       4418 (17.9%)    2055 (8.1%)     3354 (13.4%)    2426 (9.7%)     355 (1.4%)

  • Outline

    • Background
    • Data
    • Methodology
    • Results
    • Conclusions
    • Future Research

  • Data mining techniques

    • We use two data mining techniques:
      – Support Vector Machine (SVM): to understand and quantify the impact of convective weather on GDP incidence, where convective weather is represented as a geo-referenced image
      – Logistic regression: to model GDP incidence using the SVM output, airport local weather variables, and an airport busy-hour indicator variable (7 am to 10 pm; Gorripaty et al. 2016)

  • SVM

    [Diagram: SVM with the 200x200 convective weather image as input, GDP initiation as the label, and regularization parameter (C)]
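To make the role of C concrete, here is a minimal soft-margin linear SVM trained by batch subgradient descent on a toy problem. It is an illustrative sketch only: the slides do not say which solver or kernel was used, and the two-pixel "images", the function names, and the learning-rate settings are all assumptions.

```python
def train_linear_svm(X, y, C=1.0, epochs=2000, lr=0.01):
    """Minimize 0.5*||w||^2 + C * sum(hinge losses) by batch subgradient descent.

    X: list of feature vectors; y: labels in {-1, +1}.
    """
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin: hinge loss is active
                for j in range(dim):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision_score(w, b, x):
    """Signed distance-like score; positive values predict a GDP."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


# Toy data: pixel 0 is near the airport, pixel 1 is far away.
# GDP (+1) occurs exactly when pixel 0 is covered by convective weather.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y, C=1.0)
print([round(wj, 2) for wj in w], round(b, 2))
```

For a linear model, the learned weight vector has one entry per pixel, so it can be reshaped back onto the image grid to show where convective weather matters most. A smaller C penalizes margin violations less, which yields a smoother, more heavily regularized weight pattern.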

  • Tuning parameters

    • Performance metrics: AUC, F1-score, and precision

    • The GDP and no-GDP samples can overlap heavily in the feature space, which favors lower values of C and hence models that tolerate misclassification

    [Figure: average area under the ROC curve over 5-fold cross-validation for EWR]
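The tuning metric, average AUC over 5-fold cross-validation, can be sketched as follows. This is an assumption-laden illustration rather than the presenters' code: folds are taken as contiguous blocks, AUC is computed with the pairwise rank identity, and `fit` and `score` are hypothetical callables standing in for the model being tuned.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for GDP, 0 for no GDP. Tied scores count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def mean_cv_auc(features, labels, fit, score, k=5):
    """Average held-out AUC over k folds.

    fit(train_X, train_y) returns a model; score(model, x) returns a float.
    """
    aucs = []
    for fold in k_fold_indices(len(labels), k):
        held_out = set(fold)
        train_X = [x for i, x in enumerate(features) if i not in held_out]
        train_y = [l for i, l in enumerate(labels) if i not in held_out]
        model = fit(train_X, train_y)
        fold_scores = [score(model, features[i]) for i in fold]
        aucs.append(roc_auc([labels[i] for i in fold], fold_scores))
    return sum(aucs) / len(aucs)


# Toy check with a trivial "model" that scores a sample by its single feature.
toy_X = [[0.9], [0.1], [0.8], [0.3], [0.7], [0.2], [0.6], [0.4], [0.95], [0.05]]
toy_y = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
mean_auc = mean_cv_auc(toy_X, toy_y, fit=lambda X, y: None, score=lambda m, x: x[0])
print(mean_auc)
```

Repeating `mean_cv_auc` over a grid of C values gives the kind of curve used to pick C. The pairwise AUC is quadratic in sample size, so a sort-based version would be preferable for tens of thousands of hourly records.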

  • Tuning parameters

    • Performance metrics: AUC, F1-score, and precision

    [Figure: average area under the ROC curve over 5-fold cross-validation for EWR]

    [Figure: map overlay of the convective weather weight vector w for the EWR model; spatial pattern of weights at C = 1e-4]

  • Tuning parameters

    [Figures: spatial pattern of weights at C = 1 and at C = 1e-4]

  • Methodology (II)

    • Estimate a binary logit model of GDP occurrence
      – Dependent variable: GDP in effect in a given hour
      – Explanatory variables
        • Ceiling, visibility, IMC dummy
        • Tailwind and crosswind
        • Busy-hour dummy (= 1 if the local departure hour is between 7 am and 10 pm)
        • Convective weather score from the SVM

    • Why not model all features in the SVM?
      – It would lead to an imbalance in the dimensionality of the feature space
      – It would complicate the usability and interpretability of the SVM weights
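The second stage can be sketched as a standard binary logit, in which the SVM score enters as a single explanatory variable alongside the local weather variables. The coefficient values below are made up for illustration and are not the estimated ones; the variable names, the function names, and the treatment of the busy-hour boundary (hours 7 through 21) are assumptions.

```python
import math


def busy_hour(local_hour):
    """Busy-hour dummy, assuming 'between 7 am and 10 pm' means hours 7..21."""
    return 1 if 7 <= local_hour < 22 else 0


def gdp_probability(coefficients, intercept, features):
    """Binary logit: P(GDP) = 1 / (1 + exp(-(intercept + sum(beta_k * x_k))))."""
    utility = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-utility))


# Hypothetical coefficients, chosen only to show the mechanics.
coefs = {
    "convective_score": 3.0,
    "imc": 1.0,
    "ceiling": -0.3,
    "visibility": -0.1,
    "tailwind": 0.02,
    "crosswind": 0.05,
    "busy_hour": 2.0,
}
intercept = -1.0

clear_hour = {"convective_score": -1.0, "imc": 0, "ceiling": 5.0, "visibility": 10.0,
              "tailwind": 0.0, "crosswind": 5.0, "busy_hour": busy_hour(14)}
stormy_hour = {"convective_score": 1.0, "imc": 1, "ceiling": 0.5, "visibility": 2.0,
               "tailwind": 0.0, "crosswind": 5.0, "busy_hour": busy_hour(14)}

p_clear = gdp_probability(coefs, intercept, clear_hour)
p_storm = gdp_probability(coefs, intercept, stormy_hour)
print(round(p_clear, 3), round(p_storm, 3))
```

Collapsing the 40,000 pixels into one score keeps the logit small enough that each coefficient can be read directly as an effect on the log-odds of a GDP.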

  • Outline

    • Background
    • Data
    • Methodology
    • Results
    • Conclusions

  • SVM output— convective weather heatmap

    • One can observe a spatially contiguous pattern centered on EWR and spread along the main East Coast corridors.

    • The observed pattern is consistent across all the airports of the New York metroplex.

    • While the comparison confirms that weather closer to the airport (located at the center of the figure) is more important, it also reveals that the area of high impact is elongated along the northeast/northwest axis. This belies the assumption that weather impact depends only on the distance from the airport.

  • Logit Results—Dependent Variable: GDP in Effect

    • For the three airports within the New York area, the magnitude of the convective weather coefficient is comparable

    • ATL has the largest coefficient for the convective weather variable, indicating a strong impact of convective weather on GDPs at ATL

    Estimates, with standard errors in parentheses:

    Variable              EWR         JFK         LGA         PHL         ATL
    Convective weather    3.893***    4.511***    3.369***    2.773***    9.660***
                          (0.12)      (0.13)      (0.11)      (0.10)      (0.41)
    IMC                   0.789***    0.798***    1.366***    1.526***    0.23
                          (0.07)      (0.09)      (0.06)      (0.08)      (0.20)
    Ceiling               -0.280***   -0.750***   -0.716***   -1.627***   -0.903***
                          (0.05)      (0.08)      (0.07)      (0.09)      (0.23)
    Visibility            -0.104***   -0.044***   -0.090***   -0.178***   0.061
                          (0.01)      (0.01)      (0.01)      (0.01)      (0.03)
    Tailwind              0.017***    0.096***    0.024***    -0.034***   -0.058**
                          (0.01)      (0.01)      (0.01)      (0.01)      (0.02)
    Headwind              0.046***    0.065***    0.056***    0.044***    -0.038*
                          (0.00)      (0.01)      (0.00)      (0.01)      (0.02)
    Crosswind             0.088***    0.033***    0.054***    0.071***    -0.047**
                          (0.00)      (0.01)      (0.01)      (0.01)      (0.02)
    Busy hour             2.284***    2.267***    2.466***    3.767***    2.546***
                          (0.07)      (0.12)      (0.08)      (0.16)      (0.37)
    Constant              -0.942***   -1.354***   -1.746***   -2.489***   2.399***
                          (0.14)      (0.20)      (0.15)      (0.20)      (0.60)
    Observations          24723       25215       24942       24947       24710

  • Logit Results—Dependent Variable: GDP in Effect

    • Increased ceiling and visibility reduce the likelihood of GDP incidence, while the presence of IMC increases the likelihood of a GDP


  • Logit Results—Dependent Variable: GDP in Effect

    • Tailwind has mixed effects on GDP likelihood across airports, but headwind and crosswind both increase the likelihood of a GDP, except at ATL


  • Factor contribution

    [Figure: Factor Contributions, bar chart from 0% to 100% for JFK, LGA, EWR, PHL, and ATL; factors: Wx, IMC, C, Vis, TW, HW, CW]

    • We construct a counterfactual scenario in which each factor is set to the value in the dataset that would minimize the GDP probability, and use the logit model to predict the corresponding GDP probability for each record

    • Contribution: % change between the predicted and observed GDP probability

    • Convective weather information is indispensable to predicting GDPs
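One way to read the counterfactual calculation is sketched below. It is an interpretation, not the presenters' code. The baseline is taken as the mean predicted probability on the unmodified records; for a logit with an intercept fitted by maximum likelihood, this equals the observed GDP share in the estimation data. The coefficients, records, and function names are hypothetical.

```python
import math


def predict(coefs, intercept, record):
    """Logit probability of a GDP for one hourly record."""
    utility = intercept + sum(coefs[k] * record[k] for k in coefs)
    return 1.0 / (1.0 + math.exp(-utility))


def factor_contribution(coefs, intercept, records, factor):
    """Relative drop in mean GDP probability when `factor` is set,
    for every record, to its most benign value observed in the data."""
    values = [r[factor] for r in records]
    # A positive coefficient means lower values reduce P(GDP), and vice versa.
    benign = min(values) if coefs[factor] > 0 else max(values)
    baseline = sum(predict(coefs, intercept, r) for r in records) / len(records)
    counterfactual = sum(
        predict(coefs, intercept, {**r, factor: benign}) for r in records
    ) / len(records)
    return (baseline - counterfactual) / baseline


# Hypothetical two-factor model and three hourly records.
coefs = {"wx": 3.0, "visibility": -0.1}
intercept = -2.0
records = [
    {"wx": 0.0, "visibility": 10.0},
    {"wx": 1.0, "visibility": 2.0},
    {"wx": 0.5, "visibility": 10.0},
]

c_wx = factor_contribution(coefs, intercept, records, "wx")
c_vis = factor_contribution(coefs, intercept, records, "visibility")
print(round(c_wx, 3), round(c_vis, 3))
```

Because the logit is nonlinear, contributions computed one factor at a time need not sum to 100%, so a chart of shares implies some normalization that the slide does not describe.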

  • Conclusions

    • We have modeled GDP occurrence using convective weather in the region surrounding the airport and local airport weather conditions

    • We find that the SVM, when properly tuned, provides a reasonable and spatially coherent picture of where convective weather is important

    • The importance of convective weather depends on both its distance and direction from the airport

    • Local and regional convective weather both have a significant impact on GDP probability, but the latter is more important
