SAS® Global Forum 2014, March 23-26, Washington, DC

Got Randomness? SAS™ for Mixed and Generalized Linear Mixed Models

David A. Dickey, NC State University

SAS and its products are the registered trademarks of SAS Institute, Cary, NC



Data: Crab mating patterns (X rated)

Data: Typists (Poisson with random effects)

(Poisson Regression, ZIP model, Negative Binomial)

Data: Challenger (Binomial with random effects)

Data: Flu samples (Binomial with random effects)

Data: Ships (Poisson with offset)


“Generalized”: non-normal distribution

Binary for probabilities: Y = 0 or 1. Mean $E\{Y\} = p$, variance $p(1-p)$, $\Pr\{Y=j\} = p^{j}(1-p)^{1-j}$.

Link: $L = \ln(p/(1-p))$ = “Logit”. Range (over all L): $0 < p < 1$.

Poisson for counts: Y in {0, 1, 2, 3, 4, ...}. Mean count $\lambda$, variance $\lambda$, $\Pr\{Y=j\} = e^{-\lambda}\lambda^{j}/j!$.

Link: $L = \ln(\lambda)$. Range (over all L): $\lambda > 0$.

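[Figure: bar chart for a binary Like/Dislike response with Pr{Like} = 2/3 and Pr{Dislike} = 1/3.]

[Figure: Poisson probabilities for counts 0 to 5 with λ = 0.4055: 2/3, .27, .055, .007, .001, ...]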
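The Poisson bar heights can be reproduced with a short DATA step. This is a minimal sketch: the data set name `poisson_probs` and the variable names are illustrative, and λ = ln(1.5) = 0.4055 is chosen so that Pr{Y=0} = exp(-λ) = 2/3, matching the binary example.

```sas
/* Poisson probabilities Pr{Y=j} = exp(-lambda)*lambda**j / j! for j = 0..5 */
data poisson_probs;
  lambda = log(1.5);                    /* 0.4055, so exp(-lambda) = 2/3 */
  do j = 0 to 5;
    prob = pdf('POISSON', j, lambda);   /* built-in Poisson probability */
    output;
  end;
run;

proc print data=poisson_probs noobs;
  var j prob;
  format prob 6.3;
run;
```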


Mixed (not generalized) Models:

Fixed Effects and Random Effects


Fixed? or Random?

                 Fixed                    Random
Replication:     Same levels              Different levels
Inference for:   Only observed levels     Population of levels
Levels:          Picked on purpose        Picked at random
Inference on:    Means                    Variances
Example:         Only these drugs         All doctors, all clinics
Example:         Only these fertilizers   All farms, all fields


Generalized (not mixed) linear models.

Use link L = g(E{Y}), e.g. ln(p/(1-p)) = ln(E{Y}/(1-E{Y})). Assume L is a linear model in the inputs with fixed effects.

Estimate the model for L, e.g. L = g(E{Y}) = b0 + b1 X. Use maximum likelihood.

Example: L = -1 + .18*dose. At dose = 10, L = 0.8 and p = exp(0.8)/(1+exp(0.8)) = “inverse link” = 0.69.
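The inverse-link arithmetic is easy to check in a DATA step. A minimal sketch, using the example's coefficients (-1 and .18) and dose = 10; the data set name is illustrative. It writes L and p to the log, with p coming out at about 0.69.

```sas
/* Inverse logit link: p = exp(L) / (1 + exp(L)) */
data _null_;
  dose = 10;
  L = -1 + 0.18*dose;           /* linear predictor on the logit scale: 0.8 */
  p = exp(L) / (1 + exp(L));    /* back-transform to a probability */
  put L= p=;
run;
```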


Challenger was mission 24. From 23 previous launches we have:
6 O-rings per mission
Y=0 no damage, Y=1 erosion or blowby
p = Pr{Y=1} = f(mission, launch temperature)

Features: Random mission effects

Logistic link for p

proc glimmix data=O_ring;
  class mission;
  model fail = temp / dist=binomial s;   * s prints the fixed-effect solutions;
  random mission;                        * mission variance component;
run;

Generalized Mixed


Estimated G matrix is not positive definite.

Covariance Parameter Estimates

Cov Parm   Estimate   Standard Error
mission    2.25E-18   .

Solutions for Fixed Effects

Effect      Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept   5.0850     3.0525           21    1.67      0.1106
temp        -0.1156    0.04702          115   -2.46     0.0154

We “hit the boundary”
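With the mission variance estimated at essentially zero, the fitted probabilities come from the fixed effects alone. The sketch below plugs the reported estimates (5.0850 and -0.1156) into the inverse logit over a grid of temperatures. The grid (30 to 80 in steps of 10) and the data set name `oring_pred` are assumptions for illustration, not values from the talk.

```sas
/* Fitted Pr{O-ring damage} from the fixed-effect estimates */
data oring_pred;
  b0 = 5.0850;                   /* intercept estimate from the output */
  b1 = -0.1156;                  /* temp slope estimate from the output */
  do temp = 30 to 80 by 10;      /* illustrative grid (assumption) */
    L = b0 + b1*temp;            /* logit scale */
    p = exp(L) / (1 + exp(L));   /* probability scale */
    output;
  end;
run;

proc print data=oring_pred noobs;
  var temp L p;
run;
```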


Likelihood [figure: likelihood plots; axis labels not recoverable]


Just logistic regression – no mission variance component


Flu Data: CDC Active Flu Virus Weekly Data, % positive

data FLU;
  input fluseasn year t week pos specimens;
  pct_pos = 100*pos/specimens;
  logit = log((pct_pos/100)/(1-(pct_pos/100)));   * empirical logit, ln(p/(1-p));
  label pos = "# positive specimens";
  label pct_pos = "% positive specimens";
  label t = "Week into flu season (first = week 40)";
  label week = "Actual week of year";
  label fluseasn = "Year flu season started";

[Figures: Empirical Logit; % positive]


“Sinusoids”: S(j) = sin(2πjt/52), C(j) = cos(2πjt/52)

PROC GLM DATA=FLU;
  class fluseasn;
  model logit = s1 c1 fluseasn*s1 fluseasn*c1 fluseasn*s2 fluseasn*c2
                fluseasn*s3 fluseasn*c3 fluseasn*s4 fluseasn*c4;
  output out=out1 p=p;
data out1;
  set out1;
  P_hat = exp(p)/(1+exp(p));              * back-transform the predicted logit;
  label P_hat = "Pr{pos. sample} (est.)";
run;

(1) GLM all effects fixed (harmonic main effects insignificant)
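The harmonic regressors used in the model statements are not created in the code shown. One way they might be built from the week index t is sketched below; it assumes the names s1-s4 and c1-c4 correspond to S(j) and C(j) for j = 1 to 4, which the slides imply but do not state.

```sas
/* Build sine/cosine harmonics with a 52-week period from the week index t */
data FLU;
  set FLU;
  array s{4} s1-s4;
  array c{4} c1-c4;
  twopi = 2*constant('pi');
  do j = 1 to 4;
    s{j} = sin(twopi*j*t/52);   /* S(j) */
    c{j} = cos(twopi*j*t/52);   /* C(j) */
  end;
  drop j twopi;
run;
```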


(2) MIXED analysis on logits. Random harmonics. Normality assumed

PROC MIXED DATA=FLU method=ml;   ** reduced model;
  class fluseasn;
  model logit = s1 c1 / outp=outp outpm=outpm ddfm=kr;
  random intercept / subject=fluseasn;
  random s1 c1 / subject=fluseasn type=toep(1);
  random s2 c2 / subject=fluseasn type=toep(1);
  random s3 c3 / subject=fluseasn type=toep(1);
  random s4 c4 / subject=fluseasn type=toep(1);
run;

Toeplitz(1) covariance for each harmonic pair j:

$$\begin{pmatrix} \sigma_j^2 & 0 \\ 0 & \sigma_j^2 \end{pmatrix}$$


(3) GLIMMIX analysis. Random harmonics. Binomial assumed (overdispersed: lab effects?)

PROC GLIMMIX DATA=FLU;
  title2 "GLIMMIX Analysis";
  class fluseasn;
  model pos/specimens = s1 c1 ;   * s2 c2 s3 c3 s4 c4;
  random intercept / subject=fluseasn;
  random s1 c1 / subject=fluseasn type=toep(1);
  random s2 c2 / subject=fluseasn;   ** Toep(1) - no converge;
  random s3 c3 / subject=fluseasn type=toep(1);
  random s4 c4 / subject=fluseasn type=toep(1);
  random _residual_;
  covtest glm;
  output out=out2 pred(ilink blup)=pblup pred(ilink noblup)=overall pearson=p_resid;
run;


  output out=out2 pred(ilink blup)=pblup pred(ilink noblup)=overall pearson=p_resid;
run;

Pearson residuals: used to check fit when using the default (pseudo-likelihood). Variance should be near 1.

proc means mean var;
  var p_resid;
run;

Without random _residual_; variance 3.63.
With random _residual_; variance 0.83.

Fit Statistics
-2 Res Log Pseudo-Likelihood    341.46
Generalized Chi-Square         1707.29
Gener. Chi-Square / DF            4.59

Or… use method=quad


Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
S1       1        8        34.93     0.0004
c1       1        8        25.49     0.0010

Tests of Covariance Parameters Based on the Residual Pseudo-Likelihood

Label          DF   -2 Res Log P-Like   ChiSq    Pr > ChiSq   Note
Independence   6    1312.34             970.88   <.0001       MI

MI: P-value based on a mixture of chi-squares

Output due to covtest glm;

random _residual_ does not affect the fit (just standard errors)


Could try 2 parameter Beta distribution instead:

PROC GLIMMIX DATA=FLU;
  title2 "GLIMMIX Analysis";
  class fluseasn;
  model f = s1 c1 / dist=beta link=logit s;
  random intercept / subject=fluseasn;
  random s1 c1 / subject=fluseasn type=toep(1);
  random s2 c2 / subject=fluseasn type=toep(1);
  random s3 c3 / subject=fluseasn type=toep(1);
  random s4 c4 / subject=fluseasn type=toep(1);
  output out=out3 pred(ilink blup)=pblup pred(ilink noblup)=overall pearson=p_residbeta;
run;


Binomial Assumption: without BLUPs and with BLUPs [figures]


Beta Assumption: without BLUPs and with BLUPs [figures]


Poisson Example: Wave induced damage incidents in 40 ships (ship groups)

Variables:

Factorial effects (fixed, classificatory):
  Ship Type: 5 levels (A, B, C, D, E)
  Year Constructed: 4 levels
  Years of Operation: 2 levels

Covariate (“offset”, continuous) = Time in service (“Aggregate months”)

Incidents (dependent, counts)

Source: McCullagh & Nelder (but I ignore cases where year constructed > period of operation)

             Constructed
Operated     60-64    65-69    70-74    75-79
60-74        ABCDE    ABCDE    ABCDE    -X-
75-79        ABCDE    ABCDE    ABCDE    ABCDE


proc glimmix data=ships;
  title "Ignoring Ship Variance";
  class operation construct shiptype;
  model incidents = operation construct shiptype / dist=poisson s offset=log_service;
run;

-2 Log Likelihood            136.56
(more fit statistics)
Pearson Chi-Square            42.28
Pearson Chi-Square / DF        1.69   (>1: ship variance?)

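Poisson: $\ln(\lambda) - \ln(\text{service}) = b_0 + b_1(\text{operation}) + b_2(\text{construct}) + b_3(\text{ship\_type})$

The left side is $\ln(\lambda/\text{service})$, the log incident rate per unit of service; $\ln(\text{service})$ is the offset.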
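The offset variable has to exist in the data set before the model is fit. A minimal sketch, assuming the aggregate months in service are stored in a variable named `service` (the slides name only `log_service`):

```sas
/* Create the log-scale offset used in OFFSET=log_service */
data ships;
  set ships;
  if service > 0 then log_service = log(service);   /* ln(aggregate months in service) */
run;
```

Because the offset enters the linear predictor with its coefficient fixed at 1, the fitted mean is service * exp(b0 + ...), so the remaining coefficients describe incidents per month of service.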


Type III Tests of Fixed Effects

Effect      Num DF   Den DF   F Value   Pr > F
Operation   1        25       10.57     0.0033
Construct   3        25       9.72      0.0002
Shiptype    4        25       6.50      0.0010

Without ship variance component:


PROC GLIMMIX data=ships method=quad;
  class operation construct shiptype ship;
  model incidents = operation construct shiptype / dist=poisson s offset=log_service;
  covtest "no ship effect" glm;
  * random ship;
  random intercept / subject=operation*construct*shiptype;
run;

Covariance Parameter Estimates

Cov Parm    Subject                Estimate   Standard Error
Intercept   Operat*Constr*Shipty   0          .


Fit Statistics
-2 Log Likelihood   136.56

Fit Statistics for Conditional Distribution
-2 log L(incidents | r. effects)   136.56
Pearson Chi-Square                  42.27
Pearson Chi-Square / DF              1.24

Type III Tests of Fixed Effects (no changes)

Effect      Num DF   Den DF   F Value   Pr > F
Operation   1        25       10.57     0.0033
Construct   3        25       9.72      0.0002
shiptype    4        25       6.50      0.0010

Tests of Covariance Parameters Based on the Likelihood (output due to covtest "no ship effect" glm;)

Label            DF   -2 Log Like   ChiSq   Pr > ChiSq   Note
no ship effect   1    136.56        .       1.0000       MI

MI: P-value based on a mixture of chi-squares.


To nest

Horseshoe Crab study (reference: SAS GLIMMIX course notes):
Female nests have “satellite” males
Count data, Poisson? (Generalized Linear)

Features (predictors): Carapace Width, Weight, Color, Spine condition
Random Effect: Site (Mixed Model)

Go State


proc glimmix data=crab;
  class site;
  model satellites = weight width / dist=poi solution ddfm=kr;
  random int / subject=site;
  output out=overdisp pearson=pearson;
run;

proc means data=overdisp n mean var;
  var pearson;
run;

proc univariate data=crab normal plot;
  var satellites;
run;

N      Mean         Variance
173    -0.0258264   2.6737114

Fit Statistics
Gener. Chi-Square / DF   2.77

Cov Parm    Subject   Estimate
Intercept   site      0.1625

Effect      Estimate   Pr > |t|
Intercept   -1.1019    0.2527
weight      0.5042     0.0035
width       0.0318     0.5229

     Histogram                           #   Boxplot
15.5+*                                   1      0
    .*                                   1      0
    .
12.5+*                                   1      |
    .*                                   1      |
    .**                                  3      |
 9.5+**                                  3      |
    .***                                 6      |
    .**                                  4      |
 6.5+*******                            13      |
    .********                           15   +-----+
    .**********                         19   |     |
 3.5+**********                         19   |     |
    .*****                               9   *--+--*
    .********                           16   |     |
 0.5+*******************************    62   +-----+
     ----+----+----+----+----+----+-
     * may represent up to 2 counts

Zero Inflated ?


Zero Inflated Poisson (ZIP)

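$$\Pr\{Y=j\} = \begin{cases} p_0 + (1-p_0)\,e^{-\lambda}\lambda^{0}/0! = p_0 + (1-p_0)\,e^{-\lambda}, & j = 0 \\ (1-p_0)\,e^{-\lambda}\lambda^{j}/j!, & j > 0 \end{cases}$$

$$\mu = E\{Y\} = (1-p_0)\lambda$$

Q: Can zero inflation cause overdispersion ($\sigma^2 > \mu$)? Recall: in Poisson, $\sigma^2 = \mu = \lambda$.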


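$$\mu = E\{Y\} = (1-p_0)\lambda$$

$$E\{Y(Y-1)\} = \sum_{j=1}^{\infty} j(j-1)(1-p_0)\,e^{-\lambda}\lambda^{j}/j! = (1-p_0)\lambda^2 \sum_{j=2}^{\infty} e^{-\lambda}\lambda^{j-2}/(j-2)! = (1-p_0)\lambda^2 = E\{Y^2\} - \mu$$

$$E\{Y^2\} = (1-p_0)\lambda^2 + (1-p_0)\lambda$$

$$\sigma^2 = E\{Y^2\} - (E\{Y\})^2 = (1-p_0)(\lambda^2 + \lambda) - (1-p_0)^2\lambda^2 = (1-p_0)\lambda + p_0(1-p_0)\lambda^2 = \mu + p_0(1-p_0)\lambda^2 > \mu$$

A: yes!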
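The result can be checked numerically. The sketch below sums the zero-inflated Poisson probabilities directly and compares the variance with the closed form (1-p0)λ + p0(1-p0)λ². The values p0 = 0.3 and λ = 2 are arbitrary illustrations, not estimates from the talk; for them the mean is 1.4 and the variance is 2.24, so the variance exceeds the mean.

```sas
/* Numerical check: ZIP mean and variance by direct summation */
data _null_;
  p0 = 0.3;                      /* illustrative zero-inflation probability */
  lambda = 2;                    /* illustrative Poisson mean */
  mu_sum = 0;
  ey2 = 0;
  do j = 0 to 100;               /* Poisson(2) tail beyond 100 is negligible */
    pr = (1 - p0)*pdf('POISSON', j, lambda);
    if j = 0 then pr = pr + p0;  /* extra mass at zero */
    mu_sum = mu_sum + j*pr;
    ey2 = ey2 + j*j*pr;
  end;
  var_sum = ey2 - mu_sum**2;
  var_formula = (1 - p0)*lambda + p0*(1 - p0)*lambda**2;
  put mu_sum= var_sum= var_formula=;
run;
```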


Nice job Grandpa.

That proof just about put everyone to sleep


Zero Inflated Poisson - (ZIP code )

proc nlmixed data=crab;
  parms b0=0 bwidth=0 bweight=0 c0=-2 c1=0 s2u1=1 s2u2=1;
  x = c0 + c1*width + u1;
  p0 = exp(x)/(1+exp(x));   * width affects p0;
  eta = b0 + bwidth*width + bweight*weight + u2;
  lambda = exp(eta);
  if satellites=0 then loglike = log(p0 + (1-p0)*exp(-lambda));
  else loglike = log(1-p0) + satellites*log(lambda) - lambda - lgamma(satellites+1);
  expected = (1-p0)*lambda;
  id p0 expected lambda;
  model satellites ~ general(loglike);
  random U1 U2 ~ N([0,0],[s2u1,0,s2u2]) subject=site;
  predict p0+(1-p0)*exp(-lambda) out=out1;
run;

[Image: envelope addressed to Dickey, NCSU, SAS Global Forum, Washington DC 20745]


Parameter Estimates

Parameter   Estimate   t       Pr>|t|   Lower     Upper
b0          2.7897     2.55    0.0268   0.3853    5.1942
bwidth      -0.0944    -1.65   0.1267   -0.2202   0.0314
bweight     0.4649     2.38    0.0366   0.0347    0.8952
c0          13.3739    4.42    0.0010   6.7078    20.0401
c1          -0.5447    -4.61   0.0008   -0.8049   -0.2844
s2u1        0.5114     1.12    0.2852   -0.4905   1.5133
s2u2        0.1054     1.67    0.1239   -0.0339   0.2447

c1: width affects p0

bweight: weight affects λ

s2u1, s2u2: variances for p0 and λ


From fixed part of model, compute Pr{count=j} and plot (3D) versus Weight, Carapace width


Your talk seems better now Grandpa!


Another possibility: Negative binomial

Number of failures until kth success ( p=Prob{success} )

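$$\Pr\{Y=j\} = \binom{k+j-1}{k-1} p^{k-1}(1-p)^{j} \cdot p = \binom{k+j-1}{k-1} p^{k}(1-p)^{j}$$

$$\mu = \frac{k(1-p)}{p}, \qquad \sigma^2 = \frac{k(1-p)}{p^2} = \mu + \frac{\mu^2}{k}$$

$$1/k \to 0 \text{ as } k \to \infty$$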
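The variance identity can be verified in a few lines. This is a worked check under the slide's parameterization (Y = number of failures before the kth success, p = Pr{success}), using the standard negative binomial mean $\mu = k(1-p)/p$ and variance $\sigma^2 = k(1-p)/p^2$:

```latex
\begin{aligned}
\mu + \frac{\mu^2}{k}
  &= \frac{k(1-p)}{p} + \frac{k(1-p)^2}{p^2} \\
  &= \frac{k(1-p)\,\bigl[p + (1-p)\bigr]}{p^2} \\
  &= \frac{k(1-p)}{p^2} = \sigma^2 .
\end{aligned}
```

So the variance always exceeds the mean by $\mu^2/k$, and for a fixed mean the excess shrinks toward the Poisson case as $k$ grows.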


Negative binomial: In SAS, k (scale) is our 1/k

proc glimmix data=crab;
  class site;
  model satellites = weight width / dist=nb solution ddfm=kr;
  random int / subject=site;
run;

Fit Statistics
-2 Res Log Pseudo-Likelihood   539.06
Generalized Chi-Square         174.83
Gener. Chi-Square / DF           1.03

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Std. Error
Intercept   site      0.09527    0.07979
Scale                 0.7659     0.1349

Effect      Estimate   Standard Error   DF      t Value   Pr > |t|
Intercept   -1.2022    1.6883           168.5   -0.71     0.4774
weight      0.6759     0.3239           156.6   2.09      0.0386
width       0.01905    0.08943          166.2   0.21      0.8316


Population average model vs. Individual Specific Model

8 typists. Y = error counts (Poisson distributed)

$\ln(\lambda_i)$ = ln(mean of Poisson) = $\mu + U_i$ for typist i, so $\lambda_i = e^{\mu + U_i}$

Conditional (individual specific) distributions for Y, with U ~ N(0,1) and μ = 1

$\lambda = e^{\mu} = e^{1} = 2.7183$ = mean for “typical” typist (typist with U = 0)


Population average model: expectation of the individual distributions, averaged across the population of all typists.

Run the same simulation for 8000 typists and compute the mean of the conditional population means, exp(μ+U).

The MEANS Procedure

Variable   N      Mean        Std Dev     Std Error
lambda     8000   4.4280478   6.0083831   0.067175

Z = (4.428 - 2.7183)/0.06718 = 25.45 !! Population mean is not $e^{\mu}$

Conditional means, exp(μ+U), are lognormal: log(λ) ~ N(1,1), so E{λ} = exp(μ + 0.5σ²) = $e^{1.5}$ = 4.4817
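A simulation along these lines might look like the sketch below. The seed and the data set and variable names are assumptions, so the sample mean will not reproduce 4.428 exactly, but it should land near $e^{1.5}$ = 4.4817 rather than $e^{1}$ = 2.7183.

```sas
/* Simulate 8000 typists: lambda_i = exp(mu + U_i), U_i ~ N(0,1), mu = 1 */
data typists;
  call streaminit(2014);           /* arbitrary seed (assumption) */
  mu = 1;
  do typist = 1 to 8000;
    U = rand('NORMAL', 0, 1);      /* typist-specific random effect */
    lambda = exp(mu + U);          /* conditional (typist-specific) Poisson mean */
    Y = rand('POISSON', lambda);   /* one error count for this typist */
    output;
  end;
run;

/* Mean of the conditional means estimates the population-average mean */
proc means data=typists n mean std stderr;
  var lambda;
run;
```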


Main points:

1. Generalized linear models with random effects are subject specific models.

2. Subject specific models have fixed effects that represent an individual with random effects 0 (individual at the random effect distributional means).

3. Subject specific models when averaged over the subjects do not give the model fixed effects.

4. Models with only fixed effects do give the fixed effect part of the model when averaged over subjects and are thus called population average models.