expectile and quantile regression and other...

26
2453-14 School on Modelling Tools and Capacity Building in Climate and Public Health KAZEMBE Lawrence 15 - 26 April 2013 University of Malawi Chancellor College Faculty of Science Department of Mathematical Sciences 18 Chirunga Road, 0000 Zomba MALAWI Expectile and Quantile Regression and Other Extensions

Upload: others

Post on 05-Jun-2020

22 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

2453-14

School on Modelling Tools and Capacity Building in Climate and Public Health

KAZEMBE Lawrence

15 - 26 April 2013

University of Malawi Chancellor College

Faculty of Science Department of Mathematical Sciences 18 Chirunga Road, 0000 Zomba

MALAWI

Expectile and Quantile Regression and Other Extensions

Page 2: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Expectile and Quantile Regression

and Other Extensions

Lawrence Kazembe

University of Namibia

Windhoek, Namibia

A presentation atSchool on Modelling Tools and Capacity Building in

Climate and Public Health

ICTP, Trieste, Italy

1

Page 3: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Objectives and Questions

• Objective:Introduce other regression beyond the mean

• Are there median regression models

• What about regression at the mode?

• Can we fit regression model at any other locationstoo?

• What of the variance, skewness and kurtosis?

2

Page 4: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Prelimaries

• Given a r.v. y ∼ f(·) then

-E(Y ) = μ defines the mean;

-V ar(Y ) = σ2 is the variance.

• General assumption: two measures completely de-

fine the distribution (stationarity concept)

• Classical regression is often characterized by relat-

ing to the mean

-E(y|x) = βx if x is the set of covariates.

-y ∼ N(μ, σ2), then μ = βx

3

Page 5: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

-y ∼ Bern(π), then log(

π1−π

)= βx

-y ∼ Pois(λ), then log(λ) = βx.

• Statisticians are mean lovers (Friedman, Fried-man & Amoo)

• It goes like:We are ”mean” lovers. Deviation is considered normal. We are right 95%

of the time. Statisticians do it discretely and continuously. We can legally

comment on someone’s posterior distribution.

• Why the mean?-Mean regression reduces complexity-However, the mean is not sufficient to describe adistribution.

Page 6: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Examples Plot (Rent in Munich, Kneib et al)

4

Page 7: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Examples Quantile Plot (Rent in Munich, Kneib

et al)

5

Page 8: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Example (Malaria vs Altitude , Kazembe et al)

6

Page 9: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Example (Botswana data)

7

Page 10: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Motivating Examples

• Epidemiology and Public Health:

-Investigating height, weight and body mass index

as a function of different covariates e.g. age (Wei

et al. 2006)

-Exploring stunting curves in African and Indian

children (Yue et al. 2012)

-Generating age-specific centile charts (Chitty et al.

1994).

• Economics:

-Study of determinants of wages (Koenker, 2005).

8

Page 11: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

• Education:

-The performance of students in public schools (Koenker

and Hallock, 2001).

• Climate data:

- Spatiotemporal analysis of Boston temperature

(Reich 2012)

Page 12: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Double GLM

• Classical regression assumes a homogeneous vari-

ance

y|x, ε ∼ N(βx, σ2)

ε ∼ N(0, σ2)

• However variance heterogeneity (heteroscedasticity)

is an order in real-life problems

• The variance of the response may depend on the

covariates

9

Page 13: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

• Normal regression example:

-Regression for mean and variance of a normal dis-

tribution

yi = ηi1 + exp(η2i)εi, εi ∼ N(0,1)

such that

E(yi|xi) = η1i

V ar(yi|xi) = exp(η2i)2

Page 14: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Regression for location, scale and shape

• In general: Specify a distribution for the response

on all parameters and relate to the predictors.

• The generalized additive model location, scale and

shape (GAMLSS) is a statistical model developed

by Rigby and Stasinopoulos (2005).

• For a probability (density) function f(yi|μi, σi, νi, τi)conditional on (μi, σi, νi, τi) a vector of four distri-

bution parameters, each of which can be a function

10

Page 15: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

to the explanatory variables (X)

g1(μ) = η1 = β1X1 +J1∑j=1

hj1(xj1) (1)

g2(σ) = η2 = β2X2 +J2∑j=1

hj2(xj2) (2)

g3(ν) = η3 = β3X3 +J3∑j=1

hj3(xj3) (3)

g4(τ) = η4 = β4X4 +J4∑j=1

hj4(xj4) (4)

Page 16: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

• GAMLSS assumes different models for each param-

eter

-Model 1: Mean regression

-Model 2: Dispersion regression

-Model 3: Skewness regression

-Model 4: Kurtosis regression

• Other summary measures are also permissible

Page 17: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Quantile Regression

• Quantile, Centile, and Percentile are terms that canbe used interchangeably-A 0.5 quantile ≡ 50 percentile, which is a median.

• Related terms are quartiles, quintiles and deciles:divides distribution into 4, 5, and 10 parts

• Quantiles are related to the median.

• Suppose Y has a cumulative distribution F (y) =P (Y ≤ τ). Then τth quantile of Y is defined to be

Q(τ) =∫ τ

−∞f(y)dy = inf{y : τ ≤ F (y)}

11

Page 18: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

for 0 < τ < 1.

• The quantile regression drops the parametric as-sumption for the error/ response distribution.

• Fit separate models for different asymmetries τ ∈[0,1]:

yi = ηiτ + εiτ

• Instead of assuming E(yi ≤ ηiτ = 0, one assumes

P (εiτ ≤ 0) = τ

or

Fyi(ηiτ) = P (εiτ ≤ 0) = τ

Page 19: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

• This gives a set of regression function at any as-

sumed quantile value.

• Assumptions:

-it is distribution-free since it does not make any

specific assumption on the type of errors

-it does not even require i.i.d errors

-it allows for heterogeneity.

Page 20: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Expectile Regression

• Expectiles are related to the mean, as are quantiles

related to the median.

∑ni=1 |yi − ηi| → min

∑ni=1wτ(yi, ηiτ)|yi − ηiτ | → min

median regression quantile regression

∑ni=1 |yi − ηi|2 → min

∑ni=1wτ(yi, ηiτ)|yi − ηiτ |2 → min

mean regression expectile regression

where wτ is the check function defined by

12

Page 21: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

wτ =

⎧⎨⎩τ if yi > μ(τ)

1− τ if yi ≤ μ(τ)

for some population expectile μ(τ) for different val-ues of an asymmetric parameter 0 < τ < 1.

• Expectiles are obtained by solving

τ =

∫ eτ−∞ |y − eτ |fy(y)dy∫∞−∞ |y − eτ |fy(y)dy=

Gy(eτ)− eτFy(eτ)

2(Gy(eτ)− eτFy(eτ)) + (eτ − μ)

where-fy(·) and Fy(·) denote the density and cumulativedistribution function of y.-Gy(e) =

∫ e−∞ yfy(y)dy is the partial moment func-tion of y and-Gy(∞) = μ is the expectation of y.

Page 22: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Worked example - binomial data

13

Page 23: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

500 1500

−1

01

2

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:1

500 1500

−0.

40.

00.

20.

40.

60.

8

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:2

500 1500

−0.

20.

00.

10.

2

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:3500 1500

−0.

10−

0.05

0.00

0.05

0.10

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:4

500 1500

−0.

3−

0.2

−0.

10.

00.

1

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:5

500 1500

−0.

3−

0.2

−0.

10.

00.

1ALT

s(A

LT, d

f = c

(2, 4

, 2))

:6

Page 24: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

8 10 14

−1.

0−

0.5

0.0

0.5

1.0

1.5

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):1

8 10 14

−0.

40.

00.

20.

40.

60.

8

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):2

8 10 14

−0.

10.

00.

10.

20.

30.

4

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):3

8 10 14

−0.

2−

0.1

0.0

0.1

0.2

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):4

8 10 14

−0.

5−

0.3

−0.

10.

1

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):5

8 10 14

−0.

5−

0.3

−0.

10.

1TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):6

14

Page 25: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

25 30 35

−2

−1

01

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:1

25 30 35

−0.

8−

0.4

0.0

0.4

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:2

25 30 35

−0.

20.

00.

20.

4

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:325 30 35

−0.

100.

000.

100.

20

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:4

25 30 35

−0.

2−

0.1

0.0

0.1

0.2

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:5

25 30 35

−0.

050.

000.

050.

100.

15TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:615

Page 26: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Reference

• Hudson, I. L., Rea, A., Dalrymple, M. L., Eilers, P. H. C(2008). Climate impacts on sudden infant death syndrome:a GAMLSS approach. Proceedings of the 23rd internationalworkshop on statistical modelling pp. 277280.

• Beyerlein, A., Fahrmeir, L., Mansmann, U., Toschke., A. M(2008). Alternative regression models to assess increase inchildhood BM. BMC Medical Research Methodology, 8(59).

• Smyth, G. K (1989). Generalized linear models with varyingdispersion. J. R. Statist. Soc. B, 51, 4760.

• Sobotka, F., and T. Kneib, 2010. Geoadditive Expectile Re-gression. Computational Statistics and Data Analysis, doi:10.1016/j.csda.2010.11.015.

16