investigation of an automated approach to threshold selection for generalized pareto · 2015. 4....


Investigation of an Automated Approach to Threshold Selection for Generalized Pareto

Kate R. Saunders

Supervisors: Peter Taylor & David Karoly

University of Melbourne

April 8, 2015

Outline

1 Extreme Value Theory

2 Fitting Models

Problem

What are the climate processes that drive extreme rainfall? (El Nino Southern Oscillation, Interdecadal Pacific Oscillation)

How do these drivers differ at different timescales: sub-daily, daily, consecutive day totals?

Extreme Value Theory

Block Maxima

Let $X_1, X_2, \dots, X_n$ be a sequence of i.i.d. random variables with distribution function $F$. Define $M_n = \max\{X_1, X_2, \dots, X_n\}$.

($X_i$ might be daily rainfall observations and $M_{365}$ the annual maximum rainfall.)

\[
\begin{aligned}
\Pr(M_n \le x) &= \Pr(X_1 \le x, \dots, X_n \le x) \\
&= \Pr(X_1 \le x) \times \cdots \times \Pr(X_n \le x) \\
&= F(x)^n.
\end{aligned}
\]

As $n \to \infty$, the distribution of the suitably normalised $M_n$ converges to a generalised extreme value distribution.
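The identity $\Pr(M_n \le x) = F(x)^n$ can be checked numerically. The following is a minimal sketch, assuming unit-rate exponential observations purely for illustration; the function names `max_cdf` and `empirical_max_cdf` are hypothetical and do not come from the talk.

```python
import math
import random


def max_cdf(F_x, n):
    """Pr(M_n <= x) for n i.i.d. variables, given F_x = Pr(X_i <= x)."""
    return F_x ** n


def empirical_max_cdf(x, n, n_rep=20000, seed=0):
    """Monte Carlo estimate of Pr(M_n <= x) for i.i.d. Exponential(1) draws.

    The exponential distribution is an illustrative assumption only.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        block_max = max(rng.expovariate(1.0) for _ in range(n))
        if block_max <= x:
            hits += 1
    return hits / n_rep


if __name__ == "__main__":
    x, n = 3.0, 10
    exact = max_cdf(1.0 - math.exp(-x), n)
    approx = empirical_max_cdf(x, n)
    print(f"F(x)^n = {exact:.4f}, simulated = {approx:.4f}")
```

The two numbers should agree up to Monte Carlo error, which shrinks as `n_rep` grows.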

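Generalized Extreme Value Theorem (Fisher-Tippett-Gnedenko)

If there exist sequences of constants $\{a_n > 0\}$ and $\{b_n\}$ such that

\[
\Pr\left\{ \frac{M_n - b_n}{a_n} \le z \right\} \to G(z) \quad \text{as } n \to \infty
\]

for a non-degenerate distribution function $G$, then $G$ is a member of the Generalized Extreme Value family

\[
G(z) = \exp\left\{ -\left[ 1 + \xi \left( \frac{z - \mu}{\sigma} \right) \right]^{-1/\xi} \right\},
\]

defined on $\{z : 1 + \xi(z - \mu)/\sigma > 0\}$, where $-\infty < \mu < \infty$, $\sigma > 0$ and $-\infty < \xi < \infty$.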
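As a hedged illustration, the GEV distribution function can be evaluated directly from the formula. The sketch below is not from the talk: the name `gev_cdf` is hypothetical, and the case $\xi = 0$ is handled with the Gumbel limit $\exp\{-\exp[-(z-\mu)/\sigma]\}$, which the slide does not write out.

```python
import math


def gev_cdf(z, mu=0.0, sigma=1.0, xi=0.0):
    """GEV distribution function G(z) with location mu, scale sigma, shape xi."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    s = (z - mu) / sigma
    if xi == 0.0:
        # Gumbel limit of the GEV family as xi -> 0 (assumption: not on the slide).
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support {z : 1 + xi (z - mu) / sigma > 0}:
        # below the lower endpoint when xi > 0, above the upper endpoint when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))


if __name__ == "__main__":
    for shape in (-0.5, 0.0, 0.5):
        print(shape, gev_cdf(1.0, mu=0.0, sigma=1.0, xi=shape))
```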

Leveraging more data

Generalized Pareto Distribution

Let $X_1, X_2, \dots, X_n$ be a sequence of i.i.d. random variables with marginal distribution function $F$.

\[
\Pr\{X > u + y \mid X > u\} = \frac{1 - F(u + y)}{1 - F(u)}, \quad y > 0.
\]

If $F$ satisfies the Generalized Extreme Value Theorem then, for a large enough threshold $u$, the distribution function of $(X - u)$ conditional on $X > u$ is approximately the GPD.

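Generalized Pareto Distribution - Pickands (1975)

\[
H(y) = 1 - \left( 1 + \frac{\xi y}{\tilde{\sigma}} \right)^{-1/\xi},
\]

defined on $\{y : y > 0 \text{ and } 1 + \xi y/\tilde{\sigma} > 0\}$, where $\tilde{\sigma} = \sigma + \xi(u - \mu)$.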
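A minimal sketch of the GPD distribution function and of the threshold-dependent scale $\sigma + \xi(u - \mu)$ follows. The names `gpd_scale` and `gpd_cdf` are hypothetical, and the $\xi = 0$ branch uses the exponential limit, which is an addition not shown on the slide.

```python
import math


def gpd_scale(sigma, xi, u, mu):
    """Scale of the GPD at threshold u, from GEV parameters (mu, sigma, xi)."""
    return sigma + xi * (u - mu)


def gpd_cdf(y, sigma_u, xi):
    """GPD distribution function H(y) for an excess y over the threshold."""
    if sigma_u <= 0:
        raise ValueError("scale must be positive")
    if y <= 0:
        return 0.0
    if xi == 0.0:
        # Exponential limit as xi -> 0 (assumption: not on the slide).
        return 1.0 - math.exp(-y / sigma_u)
    t = 1.0 + xi * y / sigma_u
    if t <= 0.0:
        # Only possible when xi < 0: y lies beyond the upper endpoint.
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


if __name__ == "__main__":
    scale = gpd_scale(sigma=1.0, xi=0.5, u=3.0, mu=1.0)
    print(scale, gpd_cdf(2.0, scale, 0.5))
```

One design note: raising the threshold by $v$ only changes the scale to $\tilde{\sigma} + \xi v$ while the shape stays fixed, which is the property that the threshold diagnostics later in the talk rely on.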

Dependence

Rainfall observations are dependent

Heavy rainfall yesterday affects the probability of heavy rain today

Heavy rainfall a year ago doesn't

Extreme Value Theory extends to stationary series with weak long range dependence

However, for processes with short range dependence extremes occur in clusters

Clusters

Dependent Series

Let $\{X_i\}_{i \ge 1}$ be a stationary series and $\{X^*_i\}_{i \ge 1}$ be an independent series of variables with the same marginal distribution.

Define $M_n = \max\{X_1, \dots, X_n\}$ and $M^*_n = \max\{X^*_1, \dots, X^*_n\}$. Under suitable regularity conditions,

\[
\Pr\left\{ \frac{M^*_n - b_n}{a_n} \le z \right\} \to G(z),
\]

as $n \to \infty$ for normalizing sequences $\{a_n > 0\}$ and $\{b_n\}$, where $G$ is a non-degenerate distribution function, if and only if

\[
\Pr\left\{ \frac{M_n - b_n}{a_n} \le z \right\} \to G^{\theta}(z),
\]

for a constant $\theta$ such that $0 < \theta \le 1$.

Extremal Index

$\theta = \{\text{limiting mean cluster size}\}^{-1} \in (0, 1]$

θ = 0.5⇒ 2 observations per cluster on average.

Fitting Models

Select a threshold

Decluster the data for independent observations

Declustering

Blocks: Partition the observation sequence into blocks of length b. Assume extreme observations within the same block belong to the same cluster.

Runs: Specify a run length K. Assume extreme observations with an inter-exceedance time of less than K belong to the same cluster.
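The two declustering rules can be sketched in a few lines. This is an illustrative implementation under the definitions above, not code from the talk; the function names and the choice to return clusters as lists of observation indices are assumptions.

```python
def runs_decluster(x, u, K):
    """Group exceedances of u into clusters using the runs rule.

    Two consecutive exceedances share a cluster when their
    inter-exceedance time (index difference) is less than K.
    """
    clusters = []
    for i, value in enumerate(x):
        if value <= u:
            continue
        if clusters and i - clusters[-1][-1] < K:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters


def blocks_decluster(x, u, b):
    """Group exceedances of u into clusters using fixed blocks of length b."""
    by_block = {}
    for i, value in enumerate(x):
        if value > u:
            by_block.setdefault(i // b, []).append(i)
    return [by_block[k] for k in sorted(by_block)]


def cluster_maxima(x, clusters):
    """Largest observation in each cluster, one value per cluster."""
    return [max(x[i] for i in cluster) for cluster in clusters]


if __name__ == "__main__":
    rain = [0, 5, 6, 0, 0, 0, 7, 0, 8, 0, 0, 0, 0, 9]
    runs = runs_decluster(rain, u=4, K=3)
    print(runs, cluster_maxima(rain, runs))
    print(blocks_decluster(rain, u=4, b=4))
```

The cluster maxima are the approximately independent values that would then be passed to the GPD fit.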

Intervals

The limiting process of exceedance times is compound Poisson for stationary series (Hsing et al. 1988).

Ferro and Segers (2003) showed the limiting distribution of inter-exceedance times is a mixture distribution with weight $\theta$,

\[
T_\theta(t) = (1 - \theta)\,\varepsilon_0 + \theta \cdot \theta \exp(-\theta t),
\]

where $\varepsilon_0$ is a degenerate distribution (a point mass at zero) and $T_\theta$ is the distribution of arrival times of exceedances at threshold $u$.

By equating moments a non-parametric estimator can be found for θ.

The largest $\theta(N - 1)$ inter-exceedance times can be interpreted as between cluster arrivals.
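The slides do not write out the moment-based estimator. The sketch below uses the form usually quoted from Ferro and Segers (2003), which should be checked against the paper before being relied on. The function names are hypothetical, and rounding $\theta(N-1)$ down to an integer is an assumption.

```python
def intervals_estimator(exceedance_times):
    """Intervals estimator of the extremal index from exceedance times.

    exceedance_times: increasing integer times at which u is exceeded.
    Uses the moment-based form attributed to Ferro and Segers (2003).
    """
    gaps = [b - a for a, b in zip(exceedance_times, exceedance_times[1:])]
    m = len(gaps)  # N - 1 inter-exceedance times
    if m == 0:
        raise ValueError("need at least two exceedances")
    if max(gaps) <= 2:
        numerator = 2.0 * sum(gaps) ** 2
        denominator = m * sum(t * t for t in gaps)
    else:
        numerator = 2.0 * sum(t - 1 for t in gaps) ** 2
        denominator = m * sum((t - 1) * (t - 2) for t in gaps)
    return min(1.0, numerator / denominator)


def between_cluster_gaps(exceedance_times):
    """The largest floor(theta * (N - 1)) gaps, read as between-cluster times."""
    gaps = [b - a for a, b in zip(exceedance_times, exceedance_times[1:])]
    theta = intervals_estimator(exceedance_times)
    n_between = int(theta * len(gaps))  # assumption: round down
    return sorted(gaps, reverse=True)[:n_between]


if __name__ == "__main__":
    times = [1, 2, 3, 10, 11, 12, 30]
    print(intervals_estimator(times), between_cluster_gaps(times))
```

Ties among the gaps are ignored here; a careful implementation would need a rule for them.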

Fitting Models

→ Select a threshold

Decluster the data for independent observations

Mean Residual Life Plots

For sufficiently high thresholds, the expected exceedance above the threshold should vary linearly with the threshold.
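The quantity plotted in a mean residual life plot is the sample mean excess at each candidate threshold. A minimal sketch follows, with hypothetical function names; under a GPD with shape $\xi < 1$ the theoretical mean excess is $(\tilde{\sigma} + \xi v)/(1 - \xi)$ at a threshold raised by $v$, which is why linearity is the diagnostic.

```python
import random


def mean_excess(x, u):
    """Sample mean of (x_i - u) over the observations exceeding u."""
    excesses = [value - u for value in x if value > u]
    if not excesses:
        raise ValueError("no observations exceed the threshold")
    return sum(excesses) / len(excesses)


def mean_residual_life(x, thresholds):
    """Pairs (u, mean excess at u), i.e. the points of the plot."""
    return [(u, mean_excess(x, u)) for u in thresholds]


if __name__ == "__main__":
    # Illustrative data only: exponential draws have shape xi = 0,
    # so the mean excess should be roughly flat in u.
    rng = random.Random(0)
    sample = [rng.expovariate(1.0) for _ in range(100000)]
    for u, e in mean_residual_life(sample, [0.5, 1.0, 1.5, 2.0]):
        print(f"u = {u:.1f}  mean excess = {e:.3f}")
```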

Parameter Stability Plots

Parameter estimates of (modified) scale and shape parameters should be constant for the range of valid thresholds.
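The numbers behind a parameter stability plot can be sketched as follows. Two assumptions are made loudly here: the GPD is fitted by the method of moments rather than by maximum likelihood (swapped in only to keep the example dependency-free, and only sensible for $\xi < 1/2$), and the modified scale is taken to be the usual reparameterisation $\sigma^* = \sigma_u - \xi u$. All function names are hypothetical.

```python
import random


def gpd_moment_fit(excesses):
    """Method-of-moments GPD fit: returns (scale, shape).

    Uses mean = sigma / (1 - xi) and var = sigma^2 / ((1 - xi)^2 (1 - 2 xi)).
    This is a stand-in for maximum likelihood, not the talk's method.
    """
    n = len(excesses)
    if n < 2:
        raise ValueError("need at least two excesses")
    mean = sum(excesses) / n
    var = sum((e - mean) ** 2 for e in excesses) / (n - 1)
    ratio = mean * mean / var
    xi = 0.5 * (1.0 - ratio)
    sigma_u = 0.5 * mean * (1.0 + ratio)
    return sigma_u, xi


def stability_table(x, thresholds):
    """Rows (u, modified scale, shape) for each candidate threshold u."""
    rows = []
    for u in thresholds:
        excesses = [value - u for value in x if value > u]
        sigma_u, xi = gpd_moment_fit(excesses)
        rows.append((u, sigma_u - xi * u, xi))
    return rows


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.expovariate(1.0) for _ in range(200000)]
    for u, mod_scale, xi in stability_table(sample, [0.5, 1.0, 1.5]):
        print(f"u = {u:.1f}  modified scale = {mod_scale:.3f}  shape = {xi:.3f}")
```

For exponential data both columns should stay close to constant (modified scale near 1, shape near 0) across thresholds.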

Alternative

Set the threshold according to a high quantile of non-zero observations, e.g. the 90th percentile.
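A quantile-based threshold of this kind might be computed as below. This is a sketch only: the function name is hypothetical and linear interpolation between order statistics is one of several common quantile conventions.

```python
def wet_day_threshold(observations, q=0.90):
    """Quantile q of the strictly positive observations.

    Uses linear interpolation between order statistics (an assumed convention).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    wet = sorted(value for value in observations if value > 0)
    if not wet:
        raise ValueError("no non-zero observations")
    position = q * (len(wet) - 1)
    lower = int(position)
    upper = min(lower + 1, len(wet) - 1)
    fraction = position - lower
    return wet[lower] + fraction * (wet[upper] - wet[lower])


if __name__ == "__main__":
    daily_rain = [0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0]
    print(wet_day_threshold(daily_rain, q=0.90))
```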

Is this an appropriate threshold? Is our model misspecified?

The approach suggested by Suveges and Davison (2010) is to test the threshold, u, and run parameter, K, pair for model misspecification.

Log-Likelihood

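Limiting distribution of inter-exceedance times:

\[
T_\theta(t) = (1 - \theta)\,\varepsilon_0 + \theta^2 \exp(-\theta t),
\]

Log-Likelihood (strictly positive inter-exceedance times):

\[
\begin{aligned}
&\sum_{i=1}^{N-1} \log\left( (1 - \theta)^{I(t_i = 0)} \left( \theta^2 \exp(-\theta t_i) \right)^{I(t_i > 0)} \right) \\
&= \sum_{i=1}^{N-1} \left[ 2 I(t_i > 0) \log(\theta) - \theta t_i \right],
\end{aligned}
\]

where $t_i = N T_i / n$, $T_i$ is the $i$-th observed inter-exceedance time, $n$ is the total number of observations and $N$ is the number of exceedances.

However, as $n$ gets large our estimate, $\hat{\theta}$, tends to 1, suggesting independence.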

Log-Likelihood

Adjustment of the inter-exceedance times using the run parameter K :

\[
c_i = \max\{t_i - K, 0\}
\]

Log-likelihood:

\[
\ell(\theta; c_i) = \sum_{i=1}^{N-1} \left[ I(c_i = 0) \log(1 - \theta) + 2 I(c_i > 0) \log(\theta) - \theta c_i \right]
\]

Approach used in Fukutome et al. (2014) and Suveges and Davison (2010).

Test combinations of threshold, u, and run parameter, K, for misspecification of the likelihood function. Select the (u, K) pair that maximizes the number of independent clusters.
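The adjusted log-likelihood has a closed-form maximiser, which the sketch below uses. Setting the derivative of $n_0 \log(1-\theta) + 2 n_1 \log\theta - \theta S$ to zero, with $n_0$ and $n_1$ the counts of zero and positive $c_i$ and $S = \sum c_i$, gives the quadratic $S\theta^2 - (n_0 + 2n_1 + S)\theta + 2n_1 = 0$, whose smaller root lies in $(0, 1]$. The function names are hypothetical and this derivation is an addition to the slides.

```python
import math


def k_gaps(t, K):
    """Adjusted inter-exceedance times c_i = max(t_i - K, 0)."""
    return [max(ti - K, 0.0) for ti in t]


def k_gaps_loglik(theta, c):
    """Log-likelihood of theta given the adjusted gaps c."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    n0 = sum(1 for ci in c if ci == 0)
    n1 = len(c) - n0
    if theta == 1.0:
        # log(1 - theta) is -inf, so any zero gap rules out theta = 1.
        return float("-inf") if n0 > 0 else -sum(c)
    return n0 * math.log(1.0 - theta) + 2.0 * n1 * math.log(theta) - theta * sum(c)


def k_gaps_mle(c):
    """Closed-form maximiser: smaller root of the score quadratic."""
    n0 = sum(1 for ci in c if ci == 0)
    n1 = len(c) - n0
    if n1 == 0:
        raise ValueError("all gaps are zero; the likelihood has no interior maximum")
    s = sum(c)
    b = n0 + 2.0 * n1 + s
    return (b - math.sqrt(b * b - 8.0 * n1 * s)) / (2.0 * s)


if __name__ == "__main__":
    gaps = k_gaps([1, 1, 4, 5], K=3)
    theta_hat = k_gaps_mle(gaps)
    print(gaps, theta_hat, k_gaps_loglik(theta_hat, gaps))
```

With no zero gaps the root reduces to $\min(1, 2n_1/S)$, so the estimate can sit at 1, which is consistent with the behaviour noted for the unadjusted likelihood.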

Model Misspecification

If a parametric model is misspecified then there is no $\theta$ such that $g = f(\theta)$, where $g$ is the true model and $f$ is the misspecified parametric model.

For a well specified model, the Fisher's information matrix, $I(\theta) = -E\{\ell''(\theta; c_j)\}$, is equal to the variance of the score vector, $J(\theta) = \mathrm{Var}\{\ell'(\theta; c_j)\}$.

Test the hypothesis on

\[
D(\theta) = J(\theta) - I(\theta),
\]

where $H_0 : D(\theta) = 0$ and $H_1 : D(\theta) \neq 0$.

Empirically:

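\[
\begin{aligned}
I_{N-1}(\theta) &= \frac{-1}{N-1} \sum_{j=1}^{N-1} \ell''(\theta; c_j), \\
J_{N-1}(\theta) &= \frac{1}{N-1} \sum_{j=1}^{N-1} \ell'(\theta; c_j)^2, \\
D_{N-1}(\theta) &= J_{N-1}(\theta) - I_{N-1}(\theta), \\
V_{N-1}(\theta) &= \frac{1}{N-1} \sum_{j=1}^{N-1} \left[ d_j(\theta; c_j) - D'_{N-1}(\theta)\, I_{N-1}(\theta)^{-1}\, \ell'(\theta; c_j) \right]^2,
\end{aligned}
\]

where $d_j(\theta; c_j) = \ell'(\theta; c_j)^2 + \ell''(\theta; c_j)$ is the contribution of observation $j$ to $D_{N-1}(\theta)$, and $V_{N-1}(\theta)$ is the sample variance of $D_{N-1}(\theta)$.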

Model Misspecification

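Theorem: (Information Matrix Test - White 1982) If the assumed model $\ell(\theta; c_i)$ contains the true model for some $\theta = \theta_0$, then as $n \to \infty$,

(i) $\sqrt{N-1}\, D_{N-1}(\hat{\theta}_{N-1}) \xrightarrow{w} N(0, V(\theta_0))$,

(ii) $V_{N-1}(\hat{\theta}_{N-1}) \xrightarrow{a.s.} V(\theta_0)$, and $V_{N-1}(\theta)$ is non-singular for sufficiently large $N$,

(iii) the Information Matrix Test statistic, $(N-1)\, D_{N-1}(\hat{\theta}_{N-1})'\, V_{N-1}(\hat{\theta}_{N-1})^{-1}\, D_{N-1}(\hat{\theta}_{N-1})$, is asymptotically $\chi^2_1$ distributed.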
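For the scalar parameter $\theta$ the test statistic reduces to $(N-1) D^2 / V$. The sketch below computes it for the adjusted inter-exceedance likelihood under several stated assumptions: $d_j = \ell'^2 + \ell''$ is the per-observation contribution to $D$, the prime on $D$ denotes the derivative with respect to $\theta$, the bracket in $V$ is squared, and the $\chi^2_1$ tail probability is obtained from the identity $\Pr(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$. The function names are hypothetical.

```python
import math


def imt_statistic(theta, c):
    """Information matrix test statistic for the adjusted-gap likelihood.

    theta: value at which to evaluate (normally the maximum likelihood estimate).
    c: adjusted inter-exceedance times c_j >= 0.
    Per-observation log-likelihood: log(1 - theta) if c_j == 0,
    else 2 log(theta) - theta * c_j.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    m = len(c)
    l1, l2, l3 = [], [], []  # first, second, third derivatives in theta
    for cj in c:
        if cj == 0:
            l1.append(-1.0 / (1.0 - theta))
            l2.append(-1.0 / (1.0 - theta) ** 2)
            l3.append(-2.0 / (1.0 - theta) ** 3)
        else:
            l1.append(2.0 / theta - cj)
            l2.append(-2.0 / theta ** 2)
            l3.append(4.0 / theta ** 3)
    info = -sum(l2) / m
    score_var = sum(a * a for a in l1) / m
    d = [a * a + b for a, b in zip(l1, l2)]
    D = score_var - info
    D_prime = sum(2.0 * a * b + t for a, b, t in zip(l1, l2, l3)) / m
    V = sum((dj - D_prime / info * a) ** 2 for dj, a in zip(d, l1)) / m
    if V == 0.0:
        raise ValueError("degenerate sample: V is zero")
    return m * D * D / V


def chi2_1_pvalue(statistic):
    """Upper tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    gaps = [0, 0, 1.0, 2.0, 0, 0.5, 3.0, 0, 1.5, 0.2]
    stat = imt_statistic(0.5, gaps)
    print(stat, chi2_1_pvalue(stat))
```

A small p-value would count as evidence that the chosen (u, K) pair is misspecified.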

Example: AR(2)

$Y_i = 0.95 Y_{i-1} - 0.89 Y_{i-2} + Z_i$ where $Z_i \sim \mathrm{GP}(1, 1/2)$ and $n = 8000$.

100 simulations
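One way to generate a series like the one in this example is sketched below. It assumes that GP(1, 1/2) means scale 1 and shape 1/2, draws the innovations by inverting the GPD distribution function, and discards a burn-in period; the burn-in length, seed and function name are all illustrative choices rather than details from the talk.

```python
import random


def simulate_ar2_gp(n=8000, phi1=0.95, phi2=-0.89, scale=1.0, shape=0.5,
                    burn_in=500, seed=0):
    """Simulate Y_i = phi1 Y_{i-1} + phi2 Y_{i-2} + Z_i with GPD innovations.

    Assumes shape != 0. Innovations use the inverse GPD distribution function
    Z = scale / shape * ((1 - U)^(-shape) - 1) with U uniform on [0, 1).
    """
    rng = random.Random(seed)
    y_prev1 = 0.0  # Y_{i-1}
    y_prev2 = 0.0  # Y_{i-2}
    series = []
    for i in range(n + burn_in):
        u = rng.random()
        z = scale / shape * ((1.0 - u) ** (-shape) - 1.0)
        y = phi1 * y_prev1 + phi2 * y_prev2 + z
        if i >= burn_in:
            series.append(y)
        y_prev2, y_prev1 = y_prev1, y
    return series


if __name__ == "__main__":
    # 100 replicate series, as in the example, each with its own seed.
    replicates = [simulate_ar2_gp(seed=s) for s in range(100)]
    print(len(replicates), len(replicates[0]))
```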

Adjusting inter-exceedance times

It is common to enforce seasonal blocking so that stationarity can be assumed.

Collapse inter-exceedance times across seasonal blocks using the memoryless property of the exponential for fitting.

Results: Gatton, South East Queensland

Results: Oenpelli, Northern Territory

Summary

Shown how to check if the threshold and run parameter selected violate the assumptions of the model

Given confidence to threshold selection in the absence of a hard and fast rule and in the presence of subjectivity

References

Ferro, C. and Segers, J. (2003). Inference for clusters of extreme values. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(2), pp.545-556.

Fukutome, S., Liniger, M. and Suveges, M. (2014). Automatic threshold and run parameter selection: a climatology for extreme hourly precipitation in Switzerland. Theoretical and Applied Climatology.

Hsing, T., Husler, J. and Leadbetter, M. (1988). On the exceedance point process for a stationary sequence. Probability Theory and Related Fields, 78(1), pp.97-112.

Suveges, M. and Davison, A. (2010). Model misspecification in peaks over threshold analysis. The Annals of Applied Statistics, 4(1), pp.203-221.

White, H. (1982). Maximum Likelihood Estimation of Misspecified Models. Econometrica, 50(1), p.1.

ANZAPW 2015: Barossa Valley, South Australia

This work has been supported by the ARC through the Laureate Fellowship FL130100039.

Questions?

Results: Kalamia, Far North Queensland

Results: Yamba, New South Wales
