Approximation of Heavy-tailed Distributions via Infinite-dimensional Phase–type Distributions

Leonardo Rojas-Nandayapa
The University of Queensland

ANZAPW, March 2015. Barossa Valley, SA, Australia.
Summary of Mogens’ talk...
Phase–type distributions
Distributions of first passage times associated with a Markov jump process with a finite number of states.

Facts
1. A dense class in the nonnegative distributions.
2. Mathematically tractable.
3. However, it is a subclass of the light–tailed distributions.
Key problem
For most applications, a phase–type approximation will suffice. However, for certain important problems this fails.

A partial solution
Consider Markov jump processes with an infinite number of states. This class contains heavy–tailed distributions, but also many other degenerate cases.
A more refined approach: infinite mixtures of PH
Consider a random variable X with a phase–type distribution, and let S be a nonnegative discrete random variable. Then the distribution of the random variable

Y := S · X

is an infinite mixture of phase–type distributions. Such a class is mathematically tractable and obviously dense in the nonnegative distributions.
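As a quick sanity check of the construction, the scale mixture Y = S · X can be sampled directly; a minimal sketch, assuming S geometric on {1, 2, …} and X exponential (the simplest phase–type law) — both are illustrative choices, not the ones used later in the talk:

```python
import random
import statistics

def sample_scale_mixture(n, lam=1.0, p=0.5, seed=7):
    """Draw n samples of Y = S * X with S ~ Geometric(p) on {1, 2, ...}
    and X ~ Exp(lam); Exp(lam) is the simplest (1-state) phase-type law."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        s = 1
        while rng.random() > p:      # geometric number of trials
            s += 1
        samples.append(s * rng.expovariate(lam))
    return samples

ys = sample_scale_mixture(200_000)
# By independence, E[Y] = E[S] E[X] = (1/p)(1/lam), i.e. 2 for these defaults.
mean_y = statistics.fmean(ys)
```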
Let F, G and H be the cdfs of Y, S and X, respectively. Then

F(y) = ∫_0^∞ G(y/s) dH(s),   y ≥ 0.

The right-hand side is called the Mellin–Stieltjes transform.

If G is phase–type, then F is absolutely continuous with density

f(y) = ∫_0^∞ g(y/s) dH(s),   y ≥ 0.
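For a discrete S the transform reduces to a sum; conditioning on S instead of X gives the symmetric form F(y) = Σ_k P(S = s_k) H(y/s_k). A minimal sketch with X ∼ Exp(1) (an illustrative choice):

```python
import math

def mixture_cdf(y, weights, supports):
    """F(y) = sum_k w_k (1 - exp(-y/s_k)): the cdf of Y = S*X when X ~ Exp(1)
    and P(S = s_k) = w_k, i.e. the discrete form of the transform above."""
    return sum(w * (1.0 - math.exp(-y / s)) for w, s in zip(weights, supports))

# A degenerate S (all mass at 1) must recover the Exp(1) cdf exactly.
exact = 1.0 - math.exp(-1.0)
approx = mixture_cdf(1.0, [1.0], [1.0])
```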
Some interesting questions related to this class of distributions:
1. What are the possible tail behaviours in this class?
2. How to approximate a “theoretical” heavy–tailed distribution?
3. Given a sample y1, …, yn, how to estimate the parameters of a distribution in this class?
Heavy Tails
Recall that...

Heavy–tailed distributions
A nonnegative random variable X has a heavy–tailed distribution iff

E[e^{θX}] = ∞, ∀θ > 0.

Equivalently,

limsup_{x→∞} P(X > x)/e^{−θx} = ∞, ∀θ > 0.

Otherwise we say X is light–tailed.
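For instance, a Pareto tail P(X > x) = x^{−α} satisfies the second criterion for every θ > 0, since polynomial decay is eventually beaten by any e^{−θx}. A small numerical illustration (α and θ are arbitrary choices):

```python
import math

def tail_ratio(x, alpha=2.0, theta=0.1):
    """P(X > x) / e^{-theta x} for the Pareto tail P(X > x) = x^{-alpha}."""
    return x ** (-alpha) * math.exp(theta * x)

ratios = [tail_ratio(x) for x in (50.0, 100.0, 200.0, 400.0)]
# the ratio blows up along this sequence, so this Pareto law is heavy-tailed
```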
Characterization via Convolutions

Let X1, X2 be nonnegative iid random variables with unbounded support.
1. It holds that liminf_{x→∞} P(X1 + X2 > x)/P(X1 > x) ≥ 2.
2. Moreover, X1 is heavy–tailed iff liminf_{x→∞} P(X1 + X2 > x)/P(X1 > x) = 2.
3. Furthermore, X1 is subexponential iff lim_{x→∞} P(X1 + X2 > x)/P(X1 > x) = 2.
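Both regimes can be checked concretely. For iid Exp(λ) the ratio is exactly 1 + λx (the sum is Erlang), which diverges, consistent with light tails; for iid Pareto with tail 1/x on [1, ∞), which is subexponential, the ratio tends to 2. A sketch, with the Pareto convolution evaluated by Simpson's rule (the quadrature parameters are arbitrary choices):

```python
import math

def exp_ratio(x, lam=1.0):
    """Exact P(X1+X2 > x)/P(X1 > x) for iid Exp(lam): the sum is
    Erlang(2, lam) with tail e^{-lam x}(1 + lam x)."""
    return 1.0 + lam * x

def pareto_ratio(x, n=200_000):
    """Numerical P(X1+X2 > x)/P(X1 > x) for iid Pareto on [1, inf) with
    density u^{-2}: P(X1+X2 <= x) = int_1^{x-1} u^{-2} (1 - 1/(x-u)) du,
    evaluated with Simpson's rule (n must be even)."""
    a, b = 1.0, x - 1.0
    h = (b - a) / n
    acc = 0.0
    for i in range(n + 1):
        u = a + i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        acc += w * u ** (-2.0) * (1.0 - 1.0 / (x - u))
    cdf = acc * h / 3.0
    return (1.0 - cdf) * x          # divide by P(X1 > x) = 1/x

light = [exp_ratio(x) for x in (10.0, 100.0)]
heavy = pareto_ratio(1000.0)        # close to the subexponential limit 2
```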
Heavy–tailed random variables arise as:
• exponential transformations of some light–tailed random variables (normal → lognormal, exponential → Pareto);
• weak limits of normalized maxima (also normalized sums);
• reciprocals of certain random variables;
• random sums of light–tailed distributions;
• products of light–tailed distributions.
Problem 1: What are the possible tail behaviours in this class?
Theorem
Let X have a phase–type distribution and let S be a nonnegative random variable. Define

Y := S · X.

Then Y is heavy–tailed iff S has unbounded support.
Example
Let X ∼ Exp(λ) and let S have any nonnegative discrete distribution supported on a countable unbounded set {s1, s2, …}. Then

lim_{y→∞} P(S·X > y)/e^{−θy} = lim_{y→∞} [Σ_{k=1}^∞ e^{−λy/s_k} P(S = s_k)] / e^{−θy} = ∞

(choose s_k large enough that λ/s_k < θ).
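The sum can be evaluated numerically; a sketch with λ = 1, θ = 0.5 and P(S = k) = 2^{−k} on the unbounded support {1, 2, …} (all illustrative choices), working in log space to avoid overflow:

```python
import math

def ratio(y, lam=1.0, theta=0.5, kmax=500):
    """[sum_k e^{-lam y / k} P(S = k)] / e^{-theta y} with P(S = k) = 2^{-k};
    truncating at kmax is harmless since 2^{-k} decays geometrically."""
    total = 0.0
    for k in range(1, kmax + 1):
        total += math.exp(theta * y - lam * y / k - k * math.log(2.0))
    return total

vals = [ratio(y) for y in (10.0, 20.0, 50.0)]
# the ratio grows without bound, so S*X is heavy-tailed
```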
Theorem (generalization)
Let S ∼ H1 and X ∼ H2, and let Y := S · X. Write H̄i for the tail of Hi.

1. If there exist θ > 0 and a nonnegative function ξ(x) such that

limsup_{x→∞} e^{θx} ( H̄1(x/ξ(x)) + H̄2(ξ(x)) ) = 0,   (1)

then Y is light–tailed.

2. If there exists a nonnegative function ξ(x) such that for all θ > 0 it holds that

limsup_{x→∞} e^{θx} H̄1(x/ξ(x)) · H̄2(ξ(x)) = ∞,   (2)

then Y is heavy–tailed.
Example
Let S ∼ Weibull(λ, p) and X ∼ Weibull(β, q), and let Y := S · X. Then Y is light–tailed iff

1/p + 1/q ≤ 1.

Otherwise it is heavy–tailed.
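This can be probed numerically via P(S·X > x) = ∫_0^∞ h1(s) e^{−(x/s)^q} ds, where h1 is the Weibull(1, p) density (unit scales are an illustrative simplification): for p = q = 1, so 1/p + 1/q = 2 > 1, the ratio P(S·X > x)/e^{−θx} blows up, while for p = q = 2, so 1/p + 1/q = 1, it vanishes. A sketch with Simpson's rule:

```python
import math

def product_tail(x, p, q, n=40_000, s_max=200.0):
    """P(S*X > x) = int_0^inf p s^{p-1} e^{-s^p} e^{-(x/s)^q} ds for
    S ~ Weibull(1, p), X ~ Weibull(1, q), by Simpson's rule (n even)."""
    a, b = 1e-6, s_max
    h = (b - a) / n
    acc = 0.0
    for i in range(n + 1):
        s = a + i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        acc += w * p * s ** (p - 1.0) * math.exp(-s ** p - (x / s) ** q)
    return acc * h / 3.0

theta = 0.2
heavy = [math.exp(theta * x) * product_tail(x, 1, 1) for x in (100.0, 400.0)]
light = [math.exp(theta * x) * product_tail(x, 2, 2) for x in (10.0, 20.0)]
```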
What are the possible maximum domains of attraction?
Fréchet domain of attraction

Breiman’s Lemma
If S is an α–regularly varying random variable and there exists δ > 0 such that E[X^{α+δ}] < ∞, then

Y := S · X

has an α–regularly varying distribution. Moreover,

P(Y > x) ∼ E[X^α] P(S > x).
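The limiting constant E[X^α] can be verified in closed form in at least one case: S Pareto with tail x^{−2} on [1, ∞), so α = 2, and X ∼ Exp(1). Then P(SX > x) = E[e^{−x/S}] = 2(1 − (1 + x)e^{−x})/x², so the ratio to P(S > x) = x^{−2} tends to E[X²] = 2 rather than E[X] = 1. A one-line check:

```python
import math

def breiman_ratio(x):
    """P(SX > x)/P(S > x) for S ~ Pareto(2) on [1, inf) and X ~ Exp(1):
    equals 2(1 - (1 + x) e^{-x}), which tends to E[X^2] = 2."""
    return 2.0 * (1.0 - (1.0 + x) * math.exp(-x))

limit = breiman_ratio(40.0)   # already indistinguishable from 2
```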
Breiman’s Lemma
If L_{1/S} is an α–regularly varying function, then

Y := S · X

has an α–regularly varying distribution. Moreover,

P(Y > x) ∼ E[X^α] P(S > x).
Gumbel domain of attraction and subexponentiality

Let V(x) = (−1)^{η−1} L_{1/S}^{(η−1)}(x), where η is the largest dimension among the Jordan blocks associated with the largest eigenvalue of the sub–intensity matrix.

Theorem
If V(·) is a von Mises function, then F ∈ MDA(Λ). Moreover, if

liminf_{x→∞} V(tx)V′(x) / (V′(tx)V(x)) > 1, ∀t > 1,

then F is subexponential.
Problem 2: How to approximate theoretical heavy–tailed distributions?
An approach for approximating theoretical distributions

• Take a sequence of Erlang random variables X_n ∼ Erlang(n, n).
• Consider increasing sequences S_n := {s_{n,0}, s_{n,1}, s_{n,2}, …}, n ∈ ℕ, such that s_{n,0} = 0, s_{n,i} → ∞ as i → ∞, and

Δ_n := sup{s_{n,i+1} − s_{n,i} : i ∈ ℕ} → 0 as n → ∞.

• For each n, define the cdfs

F_n(x) = F(s_{n,i+1}), x ∈ [s_{n,i}, s_{n,i+1}).
23 / 36
![Page 24: Leonardo Rojas-Nandayapa](https://reader034.vdocument.in/reader034/viewer/2022051405/58a2d0cb1a28ab2e458b66d0/html5/thumbnails/24.jpg)
Construct sequences of random variables

S_n ∼ F_n, X_n ∼ Erlang(n, n).

Since X_n → 1 in probability, Slutsky’s theorem implies that

S_n · X_n → X ∼ F in distribution.

Note: the construction can be modified so that it also works for light–tailed distributions.
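The construction can be exercised end-to-end without simulation, since the cdf of S_n · X_n is a finite mixture of rescaled Erlang cdfs. A sketch with a Pareto(1.5) target, a uniform grid s_{n,i} = i/n truncated at s_max (the target, the grid, and the truncation are all assumptions made for computability):

```python
import math

def erlang_cdf(t, n):
    """P(X_n <= t) for X_n ~ Erlang(n, n): 1 - e^{-nt} sum_{k<n} (nt)^k/k!."""
    if t <= 0.0:
        return 0.0
    term, acc = 1.0, 1.0
    for k in range(1, n):
        term *= n * t / k
        acc += term
    return 1.0 - math.exp(-n * t) * acc

def F(x):
    """Target cdf: Pareto with tail x^{-1.5} on [1, inf)."""
    return 0.0 if x < 1.0 else 1.0 - x ** (-1.5)

def approx_cdf(y, n, s_max=60.0):
    """cdf of S_n * X_n with P(S_n = i/n) = F((i+1)/n) - F(i/n);
    the tail mass of F beyond s_max (about 0.002 here) is dropped."""
    total = 0.0
    for i in range(int(s_max * n)):
        p = F((i + 1) / n) - F(i / n)
        if p > 0.0:
            s = i / n
            total += p * (1.0 if s == 0.0 else erlang_cdf(y / s, n))
    return total

grid = (1.2, 2.0, 5.0)
errs = {n: max(abs(approx_cdf(y, n) - F(y)) for y in grid) for n in (4, 40)}
```

The sup-distance over the checkpoints shrinks as n grows, as the Slutsky argument predicts.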
Ruin problem

Definition
Let F and G be two distribution functions with nonnegative support. The relative error from F to G from 0 to s is defined by

sup_{x∈[0,s]} |F(x) − G(x)| / |F(x)|.
Theorem
Fix u > 0. Let F_I denote the integrated tail of F, let G be an approximation of F with integrated tail G_I, and assume that F dominates G stochastically. If

sup_{x∈[0,u]} |F_I(x) − G_I(x)| / |F̄_I(x)| ≤ ε(u),

where ε : ℝ₊ → ℝ₊ is some non-decreasing function, then

|ψ_F(u) − ψ̂_G(u)| ≤ ε(u) E[Number of down-crossings; A],

where A := {ruin happens at the last down-crossing}.
Problem 3: Statistical inference?
Key idea
View the first passage times as an incomplete data set from the evolution of the Markov jump process, and employ the EM algorithm to estimate the parameters.
Density of an infinite mixture of PH

f_A(y) = Σ_{i=1}^∞ π_i(θ) α e^{T y/s_i} t / s_i,   y ≥ 0.
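As a sanity check that f_A is a genuine density, take the 1-state PH α = (1), T = (−1), t = (1), i.e. Exp(1), with geometric weights π_i = (1 − ρ)ρ^{i−1} and s_i = i — all illustrative choices — and integrate numerically:

```python
import math

def f_A(y, rho=0.5, imax=40):
    """f_A(y) = sum_i pi_i exp(-y/s_i)/s_i for the 1-state PH Exp(1),
    with pi_i = (1 - rho) rho^(i-1) and s_i = i (series truncated at imax)."""
    return sum((1.0 - rho) * rho ** (i - 1) * math.exp(-y / i) / i
               for i in range(1, imax + 1))

# trapezoidal mass on [0, 300]; the mixture components have means s_i <= 40,
# so the mass beyond 300 is negligible and the total should be close to 1
h, top = 0.02, 300.0
steps = int(top / h)
mass = h * (0.5 * f_A(0.0) + sum(f_A(k * h) for k in range(1, steps)) + 0.5 * f_A(top))
```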
Complete Data Likelihood

ℓ_c(θ, α, T) = Σ_{i=1}^∞ L_i log π_i(θ) + Σ_{i=1}^∞ Σ_{k=1}^p B_k^i log α_k + Σ_{i=1}^∞ Σ_{k≠ℓ} N_{kℓ}^i log(T_{kℓ}/s_i)
− Σ_{i=1}^∞ Σ_{k≠ℓ} (T_{kℓ}/s_i) Z_k^i + Σ_{i=1}^∞ Σ_{k=1}^p A_k^i log(t_k/s_i) − Σ_{i=1}^∞ Σ_{k=1}^p (t_k/s_i) Z_k^i.
EM estimators
Assume (θ, α, T) is the current set of parameters in the EM algorithm. The one-step EM estimators are then given by

θ̂ = argmax_Θ Σ_{i=1}^∞ (α U_i t) · log π_i(Θ),
α̂ = diag(α) U t / N,
T̂_{kℓ} = (diag(R)^{−1} T ∘ R)_{kℓ}, k ≠ ℓ,
t̂ = α U diag(t) diag(R)^{−1},

where

R = R(θ, α, T) = Σ_{i=1}^∞ Σ_{j=1}^N [π_i(θ)/s_i] · J(y_j/s_i; α, T, t) / f_A(y_j; θ, α, T),
U_i = U_i(θ, α, T) = Σ_{j=1}^N [π_i(θ)/s_i] · exp(T y_j/s_i) / f_A(y_j; θ, α, T),
U = U(θ, α, T) = Σ_{i=1}^∞ U_i.
In the above,

J(y) := ∫_0^y e^{T(y−u)} t α e^{Tu} du.

Proposition
Let c = max{−T_kk : k = 1, …, p} and set K := c^{−1}T + I. Define

D(s) := D(s; α, T) = (1/c) Σ_{r=0}^s K^r t α K^{s−r}.

Then

J(y; θ, α, T) = Σ_{s=0}^∞ [(cy)^{s+1} e^{−cy} / (s+1)!] D(s).
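The series can be validated against direct numerical integration of J; a self-contained sketch for a hypothetical 2-phase example (T and α are arbitrary choices), with a truncated-Taylor matrix exponential that is adequate for these small, well-scaled matrices:

```python
import math

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_scale(A, c):
    return [[c * v for v in row] for row in A]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def expm(A, terms=30):
    """Truncated Taylor series for e^A (fine for these small matrices)."""
    out = [[1.0, 0.0], [0.0, 1.0]]
    P = [row[:] for row in out]
    for k in range(1, terms):
        P = mat_scale(mat_mul(P, A), 1.0 / k)
        out = mat_add(out, P)
    return out

# hypothetical 2-phase example: sub-intensity T, exit vector t = -T*1 = (1, 3),
# alpha = (1, 0); t_alpha is the outer product t alpha^T
T = [[-2.0, 1.0], [0.0, -3.0]]
t_alpha = [[1.0, 0.0], [3.0, 0.0]]

def J_direct(y, n=400):
    """Simpson's rule for J(y) = int_0^y e^{T(y-u)} t alpha e^{Tu} du."""
    h = y / n
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(n + 1):
        u = i * h
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        M = mat_mul(mat_mul(expm(mat_scale(T, y - u)), t_alpha), expm(mat_scale(T, u)))
        acc = mat_add(acc, mat_scale(M, w))
    return mat_scale(acc, h / 3.0)

def J_unif(y, smax=60):
    """The uniformization series with c = max(-T_kk) and K = I + T/c."""
    c = max(-T[k][k] for k in range(2))
    K = mat_add([[1.0, 0.0], [0.0, 1.0]], mat_scale(T, 1.0 / c))
    Kpow = [[[1.0, 0.0], [0.0, 1.0]]]
    for _ in range(smax):
        Kpow.append(mat_mul(Kpow[-1], K))
    out = [[0.0, 0.0], [0.0, 0.0]]
    for s in range(smax):
        D = [[0.0, 0.0], [0.0, 0.0]]
        for r in range(s + 1):
            D = mat_add(D, mat_mul(mat_mul(Kpow[r], t_alpha), Kpow[s - r]))
        coef = (c * y) ** (s + 1) * math.exp(-c * y) / math.factorial(s + 1) / c
        out = mat_add(out, mat_scale(D, coef))
    return out

Jd, Ju = J_direct(1.0), J_unif(1.0)
```

Both routes agree to high accuracy, which is the point of the uniformization identity: it trades the matrix-exponential integral for a rapidly converging Poisson-weighted series.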
An alternative model
Let r ∈ (0, 1) and define

f_B(y) = r · α e^{Ty} t + (1 − r) Σ_{i=1}^∞ π_i(θ) λ_i^q y^{q−1} e^{−λ_i y} / Γ(q).
Proposition
The EM estimators take the form

r̂ = r α U t / N,
α̂ = diag(α) U t / (α U t),
T̂_{kℓ} = (diag(R)^{−1} T ∘ R)_{kℓ}, k ≠ ℓ,
t̂ = α U diag(t) · diag(R)^{−1},
θ̂ = argmax_Θ Σ_{i=1}^∞ w_i log π_i(Θ),
λ̂ = Σ_{i=1}^∞ w_i / Σ_{i=1}^∞ (v_i/s_i).
where

R = R(α, T, λ, θ) = Σ_{j=1}^N J(y_j; α, T, t) / f_B(y_j; r, α, T, θ, λ),
U = U(α, T, λ, θ) = Σ_{j=1}^N exp(T y_j) / f_B(y_j; r, α, T, θ, λ),
w_i = w_i(α, T, λ, θ) = π_i(θ) Σ_{j=1}^N g(y_j; q, λ/s_i) / f_B(y_j; r, α, T, θ, λ),
v_i = v_i(α, T, λ, θ) = π_i(θ) Σ_{j=1}^N y_j · g(y_j; q, λ/s_i) / f_B(y_j; r, α, T, θ, λ),

with g(·; q, λ/s_i) the Gamma density with shape q and rate λ/s_i.
Advantages
• Fast.
• Recovers the parameters of simulated data.
• Works reasonably well with real data.
• Converges to the MLE estimators most of the time.
• Provides a better approximation of the tails.

Disadvantages
• There are some (minor) numerical issues yet to be resolved.
• MLE estimators are not appropriate for estimating “tail parameters”; EVT methods could be implemented.
• Sensitive to the tail parameters.
• Kullback–Leibler does not work properly, though we now have a first, simpler approach.