logistic regression - advanced...

Chapter 21

Logistic Regression - Advanced Topics

Contents21.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137121.2 Sacrificial pseudo-replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137121.3 Example: Fox-proofing mice colonies - dealing with sacrificial pseudo replication . 1372

21.3.1 Using the simple proportions as data . . . . . . . . . . . . . . . . . . . . . . 137321.3.2 Logistic regression using overdispersion . . . . . . . . . . . . . . . . . . . . 137421.3.3 GLIMM modeling the random effect of colony . . . . . . . . . . . . . . . . . 1376

21.4 Example: Over-dispersed Seed Germination Data . . . . . . . . . . . . . . . . . . 137921.4.1 Using the simple proportions as data . . . . . . . . . . . . . . . . . . . . . . 138321.4.2 Logistic regression using overdispersion . . . . . . . . . . . . . . . . . . . . 138521.4.3 GLIMM modeling the random effect of pots . . . . . . . . . . . . . . . . . . 1386

21.5 Example: Are mosquitos choosy? A preference experiment. . . . . . . . . . . . . 138821.5.1 Using the simple proportions as data . . . . . . . . . . . . . . . . . . . . . . 139321.5.2 Using the empirical logits as data . . . . . . . . . . . . . . . . . . . . . . . . 139421.5.3 Logistic regression using overdispersion . . . . . . . . . . . . . . . . . . . . 139621.5.4 GLIMM modeling the random effect of replicates . . . . . . . . . . . . . . . 139721.5.5 Recommendation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1398

21.6 Example: Reprise: Are mosquitos choosy? A preference experiment with com-plete blocks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139821.6.1 Using the simple proportions as data . . . . . . . . . . . . . . . . . . . . . . 140421.6.2 Using the empirical logits as data . . . . . . . . . . . . . . . . . . . . . . . . 140921.6.3 Logistic regression using overdispersion . . . . . . . . . . . . . . . . . . . . 141321.6.4 GLIMM modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141621.6.5 Recommendation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1418

21.7 Example: Reprise: Are mosquitos choosy? A preference experiment with IN-COMPLETE blocks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141921.7.1 Using the simple proportions as data . . . . . . . . . . . . . . . . . . . . . . 142221.7.2 Using the empirical logits as data . . . . . . . . . . . . . . . . . . . . . . . . 142321.7.3 Logistic regression using overdispersion . . . . . . . . . . . . . . . . . . . . 142521.7.4 GLIMM modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142821.7.5 Recommendation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1430

The suggested citation for this chapter of notes is:

1370

CHAPTER 21. LOGISTIC REGRESSION - ADVANCED TOPICS

Schwarz, C. J. (2019). Logistic Regression - Advanced Topics.In Course Notes for Beginning and Intermediate Statistics.Available at http://www.stat.sfu.ca/~cschwarz/CourseNotes. Retrieved2019-11-04.

21.1 Introduction

The previous chapters on chi-square tests, logistic regression, and logistic ANOVA only considered thesimplest of experiment designs where the data were collected under a completely randomized design,i.e. every observation is independent of every other observation with complete randomization over ex-perimental units and treatments.

It is possible to extend logistic regression and logistic ANOVA to more complex experimental de-signs. My course notes in a graduate course Stat-805 http://www.stat.sfu.ca/~cschwarz/Stat-805 have some details on these more advanced topics.

It is only recently that software has become readily available to analyze these types of experiments.In this chapter some variations from the simple CRD will be discussed.

21.2 Sacrificial pseudo-replication

In many experiments, the experimental unit is a collection of individuals, but measurements take placeon the individual.

Hurlbert (1984) cites the example of an experiment to investigate the effect of fox predation uponthe sex ratio of mice. Four colonies of mice are established. Two of the colonies are randomly chosenand a fox-proof fence is erected around the plots. The other two colonies serve as controls with out anyfencing.

Here are the data (Table 6 of Hurlbert (1984)):

Colony % Males Number males Number females

Foxes A1 63% 22 13

A2 56% 9 7

No foxes B1 60% 15 10

B2 43% 97 130

This data has the characteristics of a chi-square test or logistic ANOVA. The factor (type of fencing)is categorical. The response, the sex of the mouse, is also categorical. Many researchers would simplypool over the replicates to give the pooled table:


Foxes A1 +A2 61% 31 20

No foxes B1 +B2 44% 112 140

If a χ2 test is applied to the pooled data, the p-value is less than 5% indicating there is evidence that

c©2019 Carl James Schwarz 1371 2019-11-04

http://www.stat.sfu.ca/~cschwarz/CourseNotes

http://www.stat.sfu.ca/~cschwarz/Stat-805

http://www.stat.sfu.ca/~cschwarz/Stat-805


the sex ratio is not independent of the presence of foxes.

This “pooled analysis” is INCORRECT. According to Hurlbert (1984), the major problem is thatindividual units (the mice) are treated as independent objects, when in fact, there are not. Experimentersoften pool experimental units from disparate sets of observations in order to do simple chi-square testsor logistic ANOVA. He specifically labels this pooling as sacrificial pseudo-replication.

Hurlbert (1984) identifies at least 4 reasons why the pooling is not valid:

• non-independence of observation. The 35 mice caught in A1 can be regarded as 35 observationsall subject to a common cause, as can the 16 mice in A2, as each group were subject to a commoninfluence in the patches. Consequently, the pooled mice are NOT independent; they represent twosets of interdependent or correlated observations. The pooled data set violates the fundamentalassumption of independent observations.

• throws away some information. The pooling throws out the information on the variability amongreplicate plots. Without such information there is no proper way to assess the significance of thedifferences between treatments. Note that in previous cases of ordinary pseudo-replication (e.g.multiple fish within a tank), this information is also discarded but is not needed - what is needed isthe variation among tanks, not among fish. In the latter case, averaging over the pseudo-replicatescauses no problems.

• confusion of experimental and observational units. If one carries out a test on the pooled data,one is implicitly redefining the experimental unit to be individual mice and not the field plots. Theenclosures (treatments) are applied at the plot level and not the mouse level. This is similar to theproblem of multiple fish within a tank that is subject to a treatment.

• unequal weighting. Pooling weights the replicate plots differentially. For example, suppose thatone enclosure had 1000 mice with 90% being male; and a second enclosure has 10 mice with 10%being male. The pooled data would have 1000 + 10 mice with 900 + 1 being male for an overallmale ratio of 90%. Had the two enclosures been given equal weight, the average male percentagewould be (90%+10%)/2=50%. In the above example, the number of mice captured in the plotsvaries from 16 to over 200; the plot with over 200 mice essentially drives the results.

There are multiple ways to analyze this data that avoid the problem that render the pooled analysisinvalid.

21.3 Example: Fox-proofing mice colonies - dealing with sacrificialpseudo replication

Hurlbert (1984) cites the example of an experiment to investigate the effect of fox predation upon thesex ratio of mice. Four colonies of mice are established. Two of the colonies are randomly chosen anda fox-proof fence is erected around the plots. The other two colonies serve as controls with out anyfencing.

Here are the data (Table 6 of Hurlbert (1984)):




Foxes A1 63% 22 13

A2 56% 9 7

No foxes B1 60% 15 10

B2 43% 97 130

The data are imported into R in the usual fashion:

mice <- read.csv("foxes.csv", header=TRUE, as.is=TRUE, strip.white=TRUE)mice$colony <- as.factor(mice$colony)mice$sex <- as.factor(mice$sex)mice$treatment <- as.factor(mice$treatment)mice

Don’t forget to declare the categorical variables as factors.

Part of the raw data and the structure of the data frame are shown below:

colony treatment sex count1 a1 foxes m 222 a1 foxes f 133 a2 foxes m 94 a2 foxes f 75 b1 no.foxes m 156 b1 no.foxes f 107 b2 no.foxes m 978 b2 no.foxes f 130

21.3.1 Using the simple proportions as data

Hurlbert (1984) suggests the proper way to analyze the above experiment is to essentially compute asingle number for each plot and then do a two-sample t-test on the percentages. [This is equivalent tothe ordinary averaging process that takes place in ordinary pseudo-replication or sub-sampling.]

We first need to convert from long to wide format using the dcast() function from the reshape2package. Following this we do the simple t-test.

# we first reshape from long to wide formatmice.wide <- dcast(mice, value.var="count", colony + treatment ~ sex)mice.wide$totalmice <- mice.wide$m + mice.wide$fmice.wide$p.males <- mice.wide$m/mice.wide$totalmicemice.widemice.t.test <- t.test(p.males ~ treatment, data=mice.wide)mice.t.testmice.t.test$diff <- sum(c(1,-1)*mice.t.test$estimate)mice.t.test$diff



mice.t.test$se.diff <- mice.t.test$diff / mice.t.test$statisticmice.t.test$se.diff

giving

colony treatment f m totalmice p.males1 a1 foxes 13 22 35 0.62857142 a2 foxes 7 9 16 0.56250003 b1 no.foxes 10 15 25 0.60000004 b2 no.foxes 130 97 227 0.4273128

Welch Two Sample t-test

data: p.males by treatmentt = 0.88568, df = 1.2866, p-value = 0.5102alternative hypothesis: true difference in means is not equal to 095 percent confidence interval:-0.6238191 0.7875778sample estimates:

mean in group foxes mean in group no.foxes0.5955357 0.5136564

[1] 0.08187933t

0.0924477

The estimated difference in the sex ratio between colonies that are subject to fox predation and coloniesnot subject to fox predation is .082 (SE .092) with p-values of .47 (pooled t-test)1 and .51 (unpooledt-test) respectively. As the p-values are quite large, there is NO evidence of a predation effect.

With only two replicates (the colonies), this experiment is likely to have very poor power to detectanything but gross differences.

The above analysis is not entirely satisfactory. The proportion of males have different variabilitiesbecause they are based on different number of total mice. As well, there may be over dispersion amongcolonies under the same treatment, i.e. the variation in the proportion of males may be larger among thetwo colonies under the same treatment than expected.

21.3.2 Logistic regression using overdispersion

Another “approximate” method to deal with the potential overdispersion among the colonies within thesame treatment group (the colony effect) is to use a standard logistic regression but use the goodness-of-fit test to estimate an overdispersion effect. This overdispersion is then used to adjust the standard errorsof estimates and the test statistics for hypothesis tests. Please consult the chapter on Logistic Regressionfor more details.

# glm() fit ignoring overdispersion. Reported SE are too small;# and reported p-values are too small

1Not computed by R.



mice.glm <- glm( p.males ~ treatment, weight=totalmice,data=mice.wide, family=binomial)

mice.glm

# glm() with overdispersion. Reported SE and pvalues# are corrected for overdispersionmice.qglm <- glm( p.males ~ treatment, weight=totalmice,

data=mice.wide, family=quasibinomial)mice.qglmanova(mice.qglm, test=’Chisq’)mice.qglm.emmo <- emmeans::emmeans(mice.qglm, ~treatment)summary(pairs(mice.qglm.emmo), infer=TRUE)

giving

Call: glm(formula = p.males ~ treatment, family = binomial, data = mice.wide,weights = totalmice)

Coefficients:(Intercept) treatmentno.foxes

0.4383 -0.6614

Degrees of Freedom: 3 Total (i.e. Null); 2 ResidualNull Deviance: 7.458Residual Deviance: 2.903 AIC: 23.61

Call: glm(formula = p.males ~ treatment, family = quasibinomial, data = mice.wide,weights = totalmice)

Coefficients:(Intercept) treatmentno.foxes

0.4383 -0.6614

Degrees of Freedom: 3 Total (i.e. Null); 2 ResidualNull Deviance: 7.458Residual Deviance: 2.903 AIC: NAAnalysis of Deviance Table

Model: quasibinomial, link: logit

Response: p.males

Terms added sequentially (first to last)

Df Deviance Resid. Df Resid. Dev Pr(>Chi)NULL 3 7.4580treatment 1 4.5545 2 2.9035 0.0774 .---Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1



contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.valuefoxes - no.foxes 0.661 0.379 Inf -0.0813 1.4 1.745 0.0809

Results are given on the log odds ratio (not the response) scale.Confidence level used: 0.95

The quasi binomial argument in the glm() function indicates that extra binomial variation is to be ac-counted for. This is common done by dividing the residual deviance (2.903) by the residual degrees offreedom (2) giving an over dispersion factor of 1.45. This implies that se from an ordinary glm() fit needto be inflated by a factor of sqrt1.45 = 1.20. Similarly, test statistics need to be deflated by a factor of1.20.

The test for no treatment effect has a p-value of 0.0774. CAUTION. Recall thatR’s anova() function fits incremental (Type I) tests ratherthan marginal (Type III tests) Fortunately, in the case of a single factor (such asthis experiment), the incremental and marginal tests are identical and so the results are fine.

The emmeans() function from the emmeans package again gives estimates of the treatment effectalong with standard errors and (asymptotic) confidence limits. These have been corrected for over dis-persion.

The estimated difference of .66 is on the log-odds scale, and implies that the odds-ratio of the proportionof males between colonies with foxes and without foxes is exp(.66) = 1.94x but the 95% confidenceinterval for the odds ratio is from exp(−.0791) = .92 to exp(1.40) = 4.05 which includes the value of1 (indicating no difference in the odds of males between treatments.).

The use a simple overdispersion factor is not completely satisfactory. It assumes a single correctionfactor for all of the estimates and again estimates the different amount of mice in each colony.

21.3.3 GLIMM modeling the random effect of colony

A more “refined” analysis is now available using Generalized Linear Mixed Models (GLIMM) whichhave been implemented in SASand R.

GLIMM allow the specification of random effects in much the same way as in advanced ANOVAmodels. This is a very general treatment and now allows us to analyze data from very complex experi-mental designs.

In this model, the model would be specified as:

logit(pmales) = Treatment Colony(Treatment)(R)

where the Colony(Treatment) would be the random effect of the experimental units (the colonies).Again, it is good practice to use unique labels for nested effects to make the model syntax a little easierto digest:

logit(pmales) = Treatment Colony(R)

A logistic type model is used.

There are many ways to fit these GLIMM and a discussion of the pros and cons of the variousmethods is beyond the scope of this course. The key problem is that an integration over the random



effects must be performed to evaluate the likelihood. In standard linear mixed models based on normaltheory, this can be done relatively painlessly because of the normality assumption. This is not true inGLIMMs and is an active area of research and so the results that are presented here may change overtime.

We illustrate using both the glmer() and glmmPQL() functions which differ in the way the likelihoodis approximated.

# fit using glmer(). Estimates 0 for variance componentmice.glim <- glmer(sex ~ treatment + (1 | colony), weights=count,

data=mice, family=binomial(link="logit"), nAGQ=25)mice.glim

summary(mice.glim)anova(mice.glim, test="Chisq")mice.glim.emmo <- emmeans::emmeans(mice.glim, ~treatment)summary(pairs(mice.glim.emmo), infer=TRUE)

# fit using glmmPQL(). Estimates a positive variance componentlibrary(MASS)mice.pql <- glmmPQL( fixed=p.males ~ treatment, random=~1|colony,

family=binomial, data=mice.wide, weight=totalmice)summary(mice.pql)VarCorr(mice.pql)#anova(mice.pql) Not availablemice.pql.emmo <- emmeans::emmeans(mice.pql, ~treatment)summary(pairs(mice.pql.emmo), infer=TRUE)

giving the extensive output:

Generalized linear mixed model fit by maximum likelihood (AdaptiveGauss-Hermite Quadrature, nAGQ = 25) [glmerMod]

Family: binomial ( logit )Formula: sex ~ treatment + (1 | colony)

Data: miceWeights: count

AIC BIC logLik deviance df.resid420.5384 420.7767 -207.2692 414.5384 5Random effects:Groups Name Std.Dev.colony (Intercept) 0Number of obs: 8, groups: colony, 4Fixed Effects:

(Intercept) treatmentno.foxes0.4383 -0.6614

convergence code 0; 1 optimizer warnings; 0 lme4 warningsGeneralized linear mixed model fit by maximum likelihood (Adaptive

Gauss-Hermite Quadrature, nAGQ = 25) [glmerMod]Family: binomial ( logit )Formula: sex ~ treatment + (1 | colony)

Data: mice



Weights: count

AIC BIC logLik deviance df.resid420.5 420.8 -207.3 414.5 5

Scaled residuals:Min 1Q Median 3Q Max

-10.1980 -3.5927 -0.2094 3.9081 11.0114

Random effects:Groups Name Variance Std.Dev.colony (Intercept) 0 0Number of obs: 8, groups: colony, 4

Fixed effects:Estimate Std. Error z value Pr(>|z|)

(Intercept) 0.4383 0.2868 1.528 0.1265treatmentno.foxes -0.6614 0.3136 -2.109 0.0349 *---Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Correlation of Fixed Effects:(Intr)

trtmntn.fxs -0.915convergence code: 0boundary (singular) fit: see ?isSingular

Analysis of Variance TableDf Sum Sq Mean Sq F value

treatment 1 4.4488 4.4488 4.4488contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.valuefoxes - no.foxes 0.661 0.314 Inf 0.0468 1.28 2.109 0.0349

Results are given on the log odds ratio (not the response) scale.Confidence level used: 0.95Linear mixed-effects model fit by maximum likelihoodData: mice.wideAIC BIC logLikNA NA NA

Random effects:Formula: ~1 | colony

(Intercept) ResidualStdDev: 0.2653088 1.583492e-05

Variance function:Structure: fixed weightsFormula: ~invwtFixed effects: p.males ~ treatment

Value Std.Error DF t-value p-value(Intercept) 0.3887038 0.2653088 2 1.4650992 0.2805treatmentno.foxes -0.3323829 0.3752034 2 -0.8858741 0.4691Correlation:

(Intr)



treatmentno.foxes -0.707

Standardized Within-Group Residuals:a1 a2 b1 b2

1.081228e-05 -1.557597e-05 3.206578e-05 -1.053836e-05attr(,"label")[1] "Standardized residuals"

Number of Observations: 4Number of Groups: 4colony = pdLogChol(1)

Variance StdDev(Intercept) 7.038878e-02 2.653088e-01Residual 2.507446e-10 1.583492e-05contrast estimate SE df lower.CL upper.CL t.ratio p.valuefoxes - no.foxes 0.332 0.375 2 -1.28 1.95 0.886 0.4691

Results are given on the log odds ratio (not the response) scale.Confidence level used: 0.95

Notice that when the glmer() function is used, the estimated variance component for the colony effectis zero, but not so when the glmmPQL() function is used. This has a major impact on the estimated size ofthe treatment effect, the estimated se for the treatment effect, and the p-value for the test of no treatmenteffect!

When the glmer() function is used, the estimated treatment effect (on the log-odds scale) is 0.66 (se.32; p-value=.035) while when the glmmPQL() function is used, the estimated treatment effect (on thelog-odds scale) is 0.33 (se .38; p-value=.47). The two functions give completely contradictory results!

This just shows you how difficult it is to work with these type of data given limited data.

In this case, given the limited sample sizes, I would have likely gone with a Bayesian analysis wherethe random effects can be modeled directly and there is no need to try and approximate the likelihood.

21.4 Example: Over-dispersed Seed Germination Data

This data is from the SAS manual.

In a seed germination test, seeds of two cultivars were planted in pots of two soil conditions. Thefollowing data contains the observed proportion of seeds that germinated for various combinations ofcultivar and soil condition. Variable n represents the number of seeds planted in a pot, and r representsthe number germinated. CULT and SOIL are indicator variables, representing the cultivar and soilcondition, respectively.



Pot n r Cult Soil

1 16 8 0 0

2 51 26 0 0

3 45 23 0 0

4 39 10 0 0

5 36 9 0 0

6 81 23 1 0

7 30 10 1 0

8 39 17 1 0

9 28 8 1 0

10 62 23 1 0

11 51 32 0 1

12 72 55 0 1

13 41 22 0 1

14 12 3 0 1

15 13 10 0 1

16 79 46 1 1

17 30 15 1 1

18 51 32 1 1

19 74 53 1 1

20 56 12 1 1

The data is available in the file germination.csv in the Sample Program Library at: http://www.stat.sfu.ca/~cschwarz/Stat-Ecology-Datasets. It is read in the usual ways in mostcomputer packages.

Notice that the experimental unit is the pot (i.e. soil and cult were applied to the pot level), but theobservational unit (what is actually measured) is the individual seed. The response variable for eachindividual seed is the either yes or no depending if it germinated or not. The fact that the data have beensummarized to the pot level doesn’t change the observational variable.

Consequently, this is an example of pseudo-replication (experiment units not equal to the observa-tional unit) and any analysis must account for this mismatch. As was seen in the previous section, thereare four ways to analyze this data:

• Compute a response variable at the pot level. This is often the “average” response when theresponse variable is continuous. In this case, the “average” will be the empirical proportion thatgerminated in each pot. You would now have 20 values to be analyzed. The analysis would be atwo-factor (cultivar and soil) completely randomized design (cultivar and soil were randomized tothe pots) with the empirical proportion that hatched as the response variable.

• Perform a generalized-linear model but adjust the results for over dispersion. This is known as aquasi-binomial or quasi-logit analysis. The goodness-of-fit statistics can be used to estimate theover dispersion factor which is then used to adjust the standard errors, test statistics, and p-values.

• Perform a generalized linear mixed model analysis where the ordinary logistic regression is aug-mented with random effects. The key problem here is the need to integrate over the random effectswhen computing the likelihood and this can be computationally challenging.


http://www.stat.sfu.ca/~cschwarz/Stat-Ecology-Datasets



• Perform a Bayesian analysis. This analysis uses MCMC sampling to do the numerical integrationover the likelihood. This analysis is beyond the scope of these notes.

Is there evidence of over dispersion? This would mean that the variation in pot-to-pot results is largerthan expected under simple binomial variation. For example, here is plot of the 95% confidence intervalfor the proportion hatched compared to the average proportion hatched for each combination of soil andcultivar.2 This was computed and plotted using

# Some preliminary plots to investigate overdispersion# Estimate the average p for each cultivar-soil combinationreport <- ddply(germ, c("cult","soil","trt"), summarize,

tot.n = sum(n),tot.r = sum(r),avg.phat = tot.r/tot.n)

report# Merge back and estimate the range of variability in the estimatesgerm2 <- merge(germ, report)

germ2$lcl <- germ2$phat - qnorm(.975)*sqrt(germ2$phat*(1-germ2$phat)/germ2$n)germ2$ucl <- germ2$phat + qnorm(.975)*sqrt(germ2$phat*(1-germ2$phat)/germ2$n)head(germ2)

overdplot <- ggplot(data=germ2, aes(x=trt, y=phat, group=pot))+ggtitle("Germination rates vs. expected average for eacht treatment")+geom_point(position=position_dodge(width=0.2))+ylab("P(germination) with 95% c.i.")+xlab("Cultivar.Soil combinations")+geom_errorbar(aes(ymin=lcl, ymax=ucl), width=.2,

position=position_dodge(width=0.2))+geom_point(aes(y=avg.phat), shape="X", color="blue", size=4,

position=position_dodge(width=0.2))overdplot

2This plot was prepared by R.



Notice that several of the confidence intervals from the individual pots do not cover the average ger-mination proportion for each treatment. This indicates that there is some (random) pot effect that isinfluencing all of the seeds within the pot simultaneously (e.g. pot location). One way to estimate thiswould be to compare the variation of p among pots within the same soil-cultivar combination with thetheoretical variation based on binomial sampling within each pot. In order to account for the differingsample sizes in each pot, we will compute a “standardized normal” variable for pot i within soil-cultivarcombination j as:

zij =pij − pj√

pij(1− pij)/nij

where pj =∑i rijnij

is the average germination rate for the soil-cultivar combination j. This is computedusing this R code:

# Empirical estimate of the pot random effect.# We compute a z-score for each pot being the deviation of the observed# p-hat from the theoretical average.germ2$zscore <- (germ2$phat - germ2$avg.phat)/

sqrt(germ2$phat*(1-germ2$phat)/germ2$n)stdz <- sd(germ2$zscore)cat("Empirical estimate of SD of pot effect ", round(stdz,2),

"; or a variance of ", round(stdz**2,2), "\n")

giving

Empirical estimate of SD of pot effect 2.11 ; or a variance of 4.46

If the additional pot-to-pot random variation was negligible, then Z should have an approximate



standard normal distribution with a variance of 1. The actual variance ofZ was found to be 4.5 indicatingthat the pot-to-pot variation in p was about 4× larger than expected from a simple binomial variation.

Because of this extra-binomial variation, it is not proper to simply “ignore” the pot and pool over thefive pots for each cultivar-soil combination. This would be an example of sacrificial pseudo-replication asoutlined by Hurlbert (1984). As you will below, the pot-to-pot variation in the proportion that germinateis more than can be explained by the simple binomial variation, i.e. there is a large random effect of potsthat must be incorporated.


As suggested by Hurlbert (1984), a simple analysis could proceed by finding the proportion of seeds thatgerminated in each pot (e.g. for pot 1, p = 8/16 = 0.50) and then doing a two-factor CRD analysis onthese proportions using the model:

p = Soil Cult Soil ∗ Cult

This is not completely satisfactory because the number of seeds in each pot (n) varies considerably frompot-to-pot, and hence the variance of p also varies3. A weighted analysis could be performed whichwould partially solve this problem.

The lm() function is used on the empirical germination probabilities

fit.lm <- lm(phat ~ cultF + soilF + cultF:soilF, data=germ,contrast=(list(cultF="contr.sum", soilF=’contr.sum’)))

# better is a weighted analysisfit.lm <- lm(phat ~ cultF + soilF + cultF:soilF, data=germ, weight=n,

contrast=(list(cultF="contr.sum", soilF=’contr.sum’)))

cat("ANOVA at the pot level\n")Anova(fit.lm, type=’III’)

CAUTION: As always when using the lm() function, be aware that the default analyze is a TypeI (incremental) rather than a Type III (marginal) analysis and so the function call must be modified asexplained elsewhere in the notes.

This give an estimate of the residual variance of:

Residual Variance from lm() fit is 1.005709

Here some care must be taken in the interpretation depending if the analysis was a weighted orunweighted analysis. If the analysis was an unweighted analysis, then the residual variance is a com-bination of pot-to-pot variance and the variability of the p in each pot. As a rough guess, the averagegermination rate is around 0.5 with an average sample size of around 45. This would give a binomialvariance of .5(1 − .5)/45 = .005. Hence the pot-to-pot variance is about .026 − .005 = .020 which isabout 4× that of the binomial variance which we saw earlier. If the analysis is a weighted analysis (as

3The binomial variance of each p for a pot would be found as√p(1− p)/n



was done), the residual variance is the individual seed variation plus the pot-to-pot variation (standard-ized to the individual seeds). As a rough guess, the average germination rate is around 0.5. This wouldgive a binomial variance for individual seeds of .5(1− .5) = .25. Hence the pot-to-pot variance is about1.00− .25 = .75 which is about 3× that of the individual binomial variance.

The following results for the tests for no main effects and interactions:

ANOVA at the pot levelAnova Table (Type III tests)

Response: phatSum Sq Df F value Pr(>F)

(Intercept) 204.964 1 203.8002 1.601e-10 ***cultF 1.576 1 1.5667 0.228674soilF 10.918 1 10.8556 0.004569 **cultF:soilF 0.055 1 0.0549 0.817654Residuals 16.091 16---Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Hence the naive analysis find no evidence of an interaction effect of soil and cultivar, no evidenceof a main effect of cultivar, but strong evidence of a main effect of soil upon the germination rate.

The marginal estimates and differences can be found in the usual ways. In SAS, the emmeans state-ment provides estimates of marginal means and differences in marginal means. The emmeans() functionin the emmeans package can be used in the usual way.

cat("Estimate of soil effect from analysis at pot level\n")fit.lm.emmo.soil <- emmeans::emmeans(fit.lm, ~soilF)summary(fit.lm.emmo.soil)summary(pairs(fit.lm.emmo.soil), infer=TRUE)

Here are the estimates of the marginal means and difference in the germination rates for the effectsof soil. Similar tables can be produced for the effects of the other factor and their interaction, but are notshown.

Estimate of soil effect from analysis at pot levelsoilF emmean SE df lower.CL upper.CL0 0.372 0.0489 16 0.268 0.4761 0.595 0.0469 16 0.496 0.695

Results are averaged over the levels of: cultFConfidence level used: 0.95contrast estimate SE df lower.CL upper.CL t.ratio p.value0 - 1 -0.223 0.0677 16 -0.367 -0.0796 -3.295 0.0046

Results are averaged over the levels of: cultFConfidence level used: 0.95



The estimated marginal mean germination rates are relatively precise. The standard errors are notequal because of the differing sample sizes in the pots in the various soil-cultivar combinations.


The pots serve as “cluster” in this experiment, so the ad hoc methods of correcting for overdispersioncaused by “cluster” effects can also be done, i.e. estimating the overdispersion factor (c) and multiplyingthe standard errors by the

√c. This is known as a quasi-binomial or quasi-logit analysis.

This is done in R using the glm() function but with the quasibinomial family selected:

fit.glm <- glm(phat ~ cultF+soilF+cultF:soilF, data=germ, weight=n,family=quasibinomial(link="logit"),contrast=(list(cultF="contr.sum", soilF=’contr.sum’)))

cat("ANOVA from quasi-binomial\n")Anova(fit.glm, type="III")

CAUTION: As always when using the lm() function, be aware that the default analyze is a TypeI (incremental) rather than a Type III (marginal) analysis and so the function call must be modified asexplained elsewhere in the notes,

This will use the deviance to degrees of freedom to estimate the overdispersion factor. This gives thefollowing estimate for the overdispersion factor:

Estimated overdispersion factor is 4.172622

We see that the overall variation is over 4× larger than expected under the binomial model. It is notpossible to obtain an explicit estimate of the actual pot-to-pot variance.

The tests for no effects of soil or cultivar or their interaction on the germination rate are:

ANOVA from quasi-binomialAnalysis of Deviance Table (Type III tests)

Response: phatLR Chisq Df Pr(>Chisq)

cultF 1.5823 1 0.208434soilF 10.6410 1 0.001106 **cultF:soilF 0.0473 1 0.827850---Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

The results are similar to the naive analysis seen earlier.

The estimated marginal “means” for the soil effect are now on the logit scale:



Estimate of soil effect from quasi-binomialsoilF emmean SE df asymp.LCL asymp.UCL0 -0.527 0.206 Inf -0.93098 -0.1221 0.390 0.197 Inf 0.00436 0.775

Results are averaged over the levels of: cultFResults are given on the logit (not the response) scale.Confidence level used: 0.95soilF prob SE df asymp.LCL asymp.UCL0 0.371 0.0482 Inf 0.283 0.4691 0.596 0.0473 Inf 0.501 0.685

Results are averaged over the levels of: cultFConfidence level used: 0.95Intervals are back-transformed from the logit scalecontrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value0 - 1 -0.916 0.285 Inf -1.47 -0.358 -3.215 0.0013

Results are averaged over the levels of: cultFResults are given on the log odds ratio (not the response) scale.Confidence level used: 0.95contrast odds.ratio SE df asymp.LCL asymp.UCL z.ratio p.value0 / 1 0.4 0.114 Inf 0.229 0.699 -3.215 0.0013

Results are averaged over the levels of: cultFConfidence level used: 0.95Intervals are back-transformed from the log odds ratio scaleTests are performed on the log odds ratio scale

These can be converted back to the regular scale4using the inverse transformation:

psoil=0 = expit(−.5266) = .37

which is similar to the previous results. Note that the se must be converted using the delta-method andnot simply using the expit transformation and is found to be

se(psoil=0) = se(logit(psoil=0))(psoil=0)(1− psoil=0) = .2063(.37)(1− .37) = .047

which is again pretty close to results from the simple analysis. The comparison in germination rates canalso be done. The raw estimates are on the log-odds scale and need to be converted back to the odds-scaleas illustrated elsewhere in these notes.

21.4.3 GLIMM modeling the random effect of pots

Finally, a generalized linear mixed model logistic ANOVA can be done that explicitly models the randomeffects of pots directly. This is done using the glmer() function in the lme4 package:

fit.glmm <- glmer(phat ~ cultF+soilF+cultF:soilF +(1|pot), data=germ,

4In R use the type=’response’ in the summary() function to do the conversion sutomatically



weight=n, family=binomial(link="logit"),contrast=(list(cultF="contr.sum", soilF=’contr.sum’)))

cat("ANOVA from binomial with random effects\n")Anova(fit.glmm, type=’III’)

This corresponds to the model in the shorthand notation:

logit(p) = cult soil cult ∗ soil pots(cult ∗ soil)−R

or the generalized linear model:

rij ∼Binomial(nij , hpij)θij =logit(pij)

θij =cult soil cult ∗ soil pots(cult ∗ soil)−R

Note that pots are nested within each cultivar-soil combination.

The estimated pot-to-pot variance (on the logit scale) is found as:

Groups Name Std.Dev.pot (Intercept) 0.48686

There is no easy way to convert this to an estimate of the pot-to-pot variation on the regular scale.

Tests for no main effects and interactions are:

ANOVA from binomial with random effectsAnalysis of Deviance Table (Type III Wald chisquare tests)

Response: phatChisq Df Pr(>Chisq)

(Intercept) 0.7984 1 0.371558cultF 1.3213 1 0.250350soilF 10.0952 1 0.001487 **cultF:soilF 0.0367 1 0.848068---Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

These results are similar to those seen in the previous analyses.

Similarly, estimates of marginal effects (on the logit scale) of soil are found to be:

Estimate of soil effect from binomial with random effectssoilF emmean SE df asymp.LCL asymp.UCL0 -0.543 0.188 Inf -0.9126 -0.1741 0.305 0.189 Inf -0.0665 0.676



Results are averaged over the levels of: cultFResults are given on the logit (not the response) scale.Confidence level used: 0.95soilF prob SE df asymp.LCL asymp.UCL0 0.367 0.0438 Inf 0.286 0.4571 0.576 0.0463 Inf 0.483 0.663

Results are averaged over the levels of: cultFConfidence level used: 0.95Intervals are back-transformed from the logit scalecontrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value0 - 1 -0.848 0.267 Inf -1.37 -0.325 -3.177 0.0015

Results are averaged over the levels of: cultFResults are given on the log odds ratio (not the response) scale.Confidence level used: 0.95contrast odds.ratio SE df asymp.LCL asymp.UCL z.ratio p.value0 / 1 0.428 0.114 Inf 0.254 0.723 -3.177 0.0015

Results are averaged over the levels of: cultFConfidence level used: 0.95Intervals are back-transformed from the log odds ratio scaleTests are performed on the log odds ratio scale

These are again comparable to the previous results and can be converted back to the original scale in theusual way. For example, the estimated difference (on the logit scale) between the two marginal meansof the soil combinations is −.83 (SE .30). This can be converted to an odds ratio using the methods seenearlier, i.e.

ORsoil 0:1 = exp(−.8257) = .44

which implies that the odds of germination in soil=0 is only 44% of the odds of germination in soil=1.The 95% confidence interval for the odds-ratio is from (.23→ .83).

One of the the advantage of the using generalized linear mixed model procedures is the availabilityof model diagnostic plots. For example, the residual plot panel:

Not currently available in R.

indicates the potential presence of at least one outlier with a germination rate (on the logit scale) is wellbelow that predicted. The normal-probability plot (on the logit scale) also identifies this one potentialoutlier.

In this simple experimental design, there is no obvious advantage to using logistic ANOVA withrandom effects; however, in more complex designs such as split-plot designs, this is the only way toproceed.

21.5 Example: Are mosquitos choosy? A preference experiment.

This (fictitious) data is based on experiment conducted by Dan Peach in the Department of BiologicalSciences, Simon Fraser University.



Both sexes of mosquitoes consume sugar from flowers and from honeydew (no, not the fruit5).Volatile chemicals emitted by flowers may be used by foraging mosquitoes to locate a source of sugar.In this experiment, certain oils from a plant were extracted and applied to delta traps within an enclosure(see below) with one trap having the extracted oils (the treatment) and the other delta trap serving as acontrol.

Then approximately 60 mosquitos were released in the enclosure. After 2 hours, the traps wereremoved and the mosquitos in each trap counted. Not all mosquitos choose one of the traps. Thesemosquitos can be ignored as they are non-informative about preferences. Each experiment took one day,i.e. each replicate is done on a separate day over a three-week period.

The data is available in the file mosquitos.csv in the Sample Program Library at: http://www.stat.sfu.ca/~cschwarz/Stat-Ecology-Datasets. Here is the raw (fictitious) data

5Visit https://en.wikipedia.org/wiki/Honeydew_(secretion) if you are curious.




https://en.wikipedia.org/wiki/Honeydew_(secretion)


Replicate Treatment Control

1 32 23

2 28 7

3 26 16

4 18 5

5 18 12

6 28 21

7 30 23

8 28 35

9 15 12

10 13 8

11 23 17

12 28 22

13 7 3

14 43 44

15 34 23

The data are read in the usual fashion:

# Read the raw datamos <- read.csv("mosquitos.csv", header=TRUE, strip.white=TRUE, as.is=TRUE)mos$total.mos <- mos$Treatment + mos$Controlmos$p.treatment <- mos$Treatment/mos$total.mosmos$logit.treatment <- log(mos$p.treatment/(1-mos$p.treatment))mos$Replicate <- factor(mos$Replicate)head(mos)

and derived variables computed. Note that we declare the Replicate as a factor.

Notice that the experimental unit is the trap, but the observational unit (what is actually measured)is the individual mosquito. The response variable for each individual mosquito is the either Treatmentor Control trap. The fact that the data have been summarized to the trap level doesn’t change theobservational unit. It is very tempting to think that the counts in each trap are the response variable, butthat is incorrect – these are simply a summary of the data.

Consequently, this is yet another example of pseudo-replication (experiment units not equal to theobservational unit) and any analysis must account for this mismatch. As was seen in the previous section,there are several ways to analyze this data:

• Compute a response variable at the trap level. This is often the “average” response when theresponse variable is continuous. In this case, the “average” will be the empirical proportion thatchoose the Treatment trap in each replicate. You would now have 15 values to be analyzed. Theanalysis would be a simple t-test on this empirical proportion to see if there is evidence that theproportion that choose the Treatment is different from 0.5.

• Similarly, compute the logit of the proportion that chose the treatment, and do a one-sample t-test on the empirical logits to see if there is evidence that the mean logit is different from 0 =logit(0.5).



• Fit a generalized-linear model but adjust the results for over dispersion. This is known as a quasi-binomial or quasi-logit analysis. The goodness-of-fit statistics can be used to estimate the overdispersion factor which is then used to adjust the standard errors, test statistics, and p-values.



On the surface, the data looks like a paired design with two counts (corresponding to the two treat-ments) in each replicate. Many students would like to analyze the difference in the counts, i.e. in replicate1, the difference is 9 = 32− 23. This is not a good choice of analysis for several reasons:

• The difference ignores the size of the base counts. For example a difference of 4 = 8− 4 has thesame impact as a difference of 4 = 80 − 76. Clearly a difference of 4 when the base counts arearound 6 is more “interesting” than a difference of 4 when the base counts are around 78. This iswhy the proportion choosing one of the treatments is a better choice because now 0.67 = 8/12 ismore interesting than .51 = 80/176.

• Many effects operate multiplicatively on counts. For example, consider counts of (8,4) and (80,40).In both bases the proportion choosing the treatment cell (the first cell) is 0.67 despite the largedifference in the base counts. A simple difference of 4 or 40 will hide the consistent effects.Consequently it makes sense then to use the raw proportions 0.67 = 8/12 = 80/120 or thelog(RATIO) of the two counts log(2.0) = log(8/4) = log(80/40) as both give consistent effects.The log() of the ratio is used to make the analysis symmetrical. For example, it should not matterif you analyze the proportion that chose the treatment or the proportion that chose control as oneis simply the complement of the other. Similarly, log(8/4) = − log(4/8) so the analysis willagain be symmetrical depending on how you compute the ratio. Note that the ratio of the countsis actually the logit in disguise, i.e.

2.0 = log(8/4) = log((8/12)/(4/12)) = log(p/(1− p)) = logit(p)

It is also tempting to do a simple chi-square test based on the pooled data, i.e. find the total numberof mosquitos that choose each type of trap (371 for Treatment; 271 for control) and test if the pooledproportion is 0.5 (indicating no preference) using a one sample z-test for proportions. For example, hereis the output from R:

*** It is tempting (but wrong) to try a one-sample z-test on the total proportions ***

1-sample proportions test with continuity correction

data: sum(mos$Treatment) out of sum(mos$total.mos), null probability 0.5X-squared = 15.266, df = 1, p-value = 9.336e-05alternative hypothesis: true p is not equal to 0.595 percent confidence interval:0.5385411 0.6162769sample estimates:

p0.5778816

Estimated (incorrect) standard error 0.01949257



The (incorrect) p-value is .00009 with the estimated proportion choosing the treatment trap of 0.57 (se0.019) and (incorrect) 95% confidence interval of 0.53 → 0.61. As you will see later, the reportedstandard error is too small and the reported confidence interval is too narrow. The key problem is thatignoring pseudo-replication ignores extra experimental unit variation – this is called overdispersion inthe logistic world.

Is there evidence of over dispersion? This would mean that the variation in results is larger thanexpected under simple binomial variation. For example, here is plot of the 95% confidence intervals forthe proportion that chose the treatment trap compared to the overall proportion that chose the treatmenttrap.6 This was computed and plotted using

# We need to account for potential overpersion as shown in the following plotmos$p.lcl <- mos$p.treatment -

qnorm(0.975)*sqrt(mos$p.treatment*(1-mos$p.treatment)/mos$total.mos)mos$p.ucl <- mos$p.treatment +

qnorm(0.975)*sqrt(mos$p.treatment*(1-mos$p.treatment)/mos$total.mos)quasiplot <- ggplot(data=mos, aes(x=Replicate, y=p.treatment))+

ggtitle("Is there evidence of over dispersion?")+ylab("P(choose treatment) with overall proportion")+geom_point()+geom_errorbar(aes(ymin=p.lcl, ymax=p.ucl), width=0.1)+geom_hline(yintercept=sum(mos$Treatment)/sum(mos$total.mos))

quasiplot

Notice that several of the confidence intervals from the individual enclosured do not cover the overallproportion that chose each treatment. This indicates that there is some (random) day and enclosure effectthat is influencing all of the mosquitos simultaneously (e.g. daily humidity; daily temperature, etc.).

Because of this extra-binomial variation, it is not proper to simply “ignore” the days and pool overall of the days and do a simple chi-square test or simple logistic regression. This would be an exampleof sacrificial pseudo-replication as outlined by Hurlbert (1984).

6This plot was prepared by R.




As suggested by Hurlbert (1984), a simple analysis could proceed by finding the proportion of mosquitosthat choose the treatment trap in each replicate (e.g. for replicate 1, p = 32/55 = 0.58) and thendoing a one-sample t-test analysis on these proportions. This is not completely satisfactory because thenumber of mosquitos in each replicate (n) varies considerably from replicate-to-replicated, and hencethe variance of p also varies7. A weighted analysis could be performed which would partially solve thisproblem.

If the mosquitos were choosing the traps simply at random, we would expect that p = 0.5 - this isthe hypothesized value in the t-test:

H0 : µp = 0.5

HA : µp 6= 0.5

A plot of the empirical probability of selecting the treatment trap

shows that most (but not all of the proportion exceed 0.5. The approximate confidence interval does notappear to include the hypothesized value of 0.5.

7The binomial variance of each p for a replicate would be found as√p(1− p)/n



The t.test() function is used on the empirical proportion that choose the treatment trap:

# Do a simple t-test to see if the proportion of mosquitos choosing# treatment is 0.5? Unfortunately, this doesn’t weight by number# of mosquitos in total which chose. Confidence interval is also# silly as the upper and lower limits should not exceed 1.res.p <- t.test(mos$p.treatment, mu=0.5)res.pcat("Estimated SE for p is", abs((res.p$estimate-0.5)/res.p$statistic), "\n")

giving

One Sample t-test

data: mos$p.treatmentt = 4.2669, df = 14, p-value = 0.0007821alternative hypothesis: true mean is not equal to 0.595 percent confidence interval:0.5519133 0.6568510sample estimates:mean of x0.6043822

Estimated SE for p is 0.02446343

The p-value for no preference is .00078. The estimated proportion that choose the treatment trapis 0.60 (se 0.024) with a 95% ci of 0.55 → 0.66. This analysis is not completely satisfactory for tworeasons:

• The distribution of the empirical proportions may not have a nice “normal” shape, especially if theestimates are close to 0 or 1. As well, the reported confidence bounds may fall outside of 0 and 1.

• Each replicate is given equal weight even though some replicates had close to 60 mosquitos makinga choice and some replicates only had 10 mosquitos that made a choice.

The first item can be solved using an analysis on the empirical logits. The second item can be resolvedusing a weighted analysis, or as shown below a logistic analysis (adjusted for overdispersion).

21.5.2 Using the empirical logits as data

We proceed by finding the empirical logit of the proportion of mosquitos that choose the treatment trapin each replicate (e.g. for replicate 1, logit = logit(32/55) = logit(0.58) = log( 0.58

1−0.58 ) = 0.33) andthen doing a one-sample t-test analysis on these logits, This is not completely satisfactory because thenumber of mosquitos in each replicate (n) varies considerably from replicate-to-replicate, and hence thevariance of the logits also varies.



If the mosquitos were choosing the traps simply at random, we would expect that p = 0.5 or thelogit = 0.0 Hence our hypothesis is now

H0 : µlogit = 0.0

HA : µlogit 6= 0.0

The t.test() is used on the empirical proportion that choose the treatment trap:

# Do a comparison on the logit scale. This is often more desireable# because the logit often has nicer distribution than the raw# proportion. This automatically keeps the confidence interval# bounds between 0 and 1 unlike the previous analysis.# Fortunately, none of the estimates are 0 or 1# which causes problems in computing the logit which is infinite.res.logit <- t.test(mos$logit.treatment, mu=0)res.logitcat("Estimated SE for mean logit is", res.logit$estimate/res.logit$statistic, "\n")

# Find the estimate on the anti-logit scalecat("Estimate of P(choose treatment) are :", plogis(res.logit$estimate), "\n")cat("Estimated SE for mean proportion is",

abs(res.logit$estimate/res.logit$statistic)*plogis(res.logit$estimate)*(1-plogis(res.logit$estimate)), "\n")

cat("Confidence bounds for P(choose treatment) are :",plogis(res.logit$conf.int), "\n")

giving

One Sample t-test

data: mos$logit.treatmentt = 4.005, df = 14, p-value = 0.001303alternative hypothesis: true mean is not equal to 095 percent confidence interval:0.2070315 0.6844290sample estimates:mean of x0.4457302

Estimated SE for mean logit is 0.1112925Estimate of P(choose treatment) are : 0.6096236Estimated SE for mean proportion is 0.02648569Confidence bounds for P(choose treatment) are : 0.5515738 0.6647265

The p-value for no preference is .0013. The estimated logit of the proportion that chose the treatmentis 0.45 (se 0.11) which can be back-transformed to estimate the proportion that choose the treatmenttrap as 0.61 (se 0.027) with a 95% ci of 0.55→ 0.66. These results are very similar to the the previousanalysis.




The enclosures serve as “cluster” in this experiment, so the ad hoc methods of correcting for overdis-persion caused by “cluster” effects can also be done, i.e. estimating the overdispersion factor (c) andmultiplying the standard errors by the


This is done in R using the glm() function but with the quasibinomial family selected. Notice that wefit the intercept only model (the ∼ 1 term in the model formula). Because the fit takes place on the logitscale, the hypothesis is that the intercept is 0 which is automatically provided.

res.quasilogit <- glm(p.treatment ~ 1, data=mos, weight=total.mos,family=quasibinomial(link=logit))

summary(res.quasilogit)$coefficientsconfint(res.quasilogit)phat <- plogis(coef(res.quasilogit))cat("Estimate of P(choose treatment) are :", phat, "\n")cat("Estimated SE for mean proportion is",

sqrt(vcov(res.quasilogit))*phat*(1-phat), "\n")cat("Confidence bounds for P(choose treatment) are :",

plogis(confint(res.quasilogit)), "\n")# Overdispersion factor iscat("Overdispersion factor ", summary(res.quasilogit)$dispersion, "\n")

giving:

Estimate Std. Error t value Pr(>|t|)(Intercept) 0.3140832 0.09425537 3.332258 0.004935137

2.5 % 97.5 %0.1301004 0.4998411Estimate of P(choose treatment) are : 0.5778816Estimated SE for mean proportion is 0.02299213Confidence bounds for P(choose treatment) are : 0.5324793 0.622422Overdispersion factor 1.391303

The p-value for no preference is 0.0049. Note that different packages may give slight different valuesfor the p-value depending on the way the overdispersion factor is computed (e.g. using the deviance or thePearson chi-square) and the way the p-value is computed (e.g. using quasilikelihood or a Wald statistic).All are asymptotically equivalent but can differ in small sample sizes – contact me for more details. Theestimated logit of the preference for the treatment is 0.31 (se .094) which can be converted back to theestimate of the proportion choosing the treatment of 0.57 (se 0.23) with a 95% confidence interval from0.53→ 0.62.

Again, the results are similar to what was seen previously.

The estimated overdispersion factor is 1.39 (see above) indicating that standard errors from a simplelogistic fit must be adjusted by a factor of

√1.39. This was automatically done above.



21.5.4 GLIMM modeling the random effect of replicates

Finally, a generalized linear mixed model logistic ANOVA can be done that explicitly models the randomeffects of replicates directly. This is done using the glmer() function in the lme4 package. Again noticethat the model only has the intercept term (∼ 1) plus the random effects of the replicates.

# Do a random effects logisitic regressionres.rand.logit <- glmer(p.treatment ~1 + (1|Replicate), data=mos,

weight=mos$total.mos, family=binomial(link=logit))summary(res.rand.logit)$coefficientsconfint(res.rand.logit)phat <- plogis(fixef(res.rand.logit))cat("Estimate of P(choose treatment) is :", phat , "\n")cat("Estimated SE for mean proportion is",

as.vector(sqrt(vcov(res.rand.logit)))*phat*(1-phat), "\n")

cat("Confidence bounds for P(choose treatment) are :",plogis(confint(res.rand.logit)[2,]), "\n")

# Random effect of replicatecat("Estimated size of random effect \n")VarCorr(res.rand.logit)


logit(p) = 1 + replicates(R)


treatmenti ∼Binomial(ni, pi)θi =logit(pi)

θi =1 + replicates(R)

The estimated replicate variance (on the logit scale) is found as:

Estimated size of random effectGroups Name Std.Dev.Replicate (Intercept) 0.17724

Notice that R reports the standard deviation of the random effect rather than the variance.

The results of the fit are:

Estimate Std. Error z value Pr(>|z|)(Intercept) 0.3412777 0.09898551 3.447755 0.0005652676

2.5 % 97.5 %.sig01 0.0000000 0.4560886(Intercept) 0.1565661 0.5674595Estimate of P(choose treatment) is : 0.5845009



Estimated SE for mean proportion is 0.02403958Confidence bounds for P(choose treatment) are : 0.5390618 0.6381768Estimated size of random effectGroups Name Std.Dev.Replicate (Intercept) 0.17724

The p-value is .00056. The p-value may differ slightly between packages depending a likelihoodratio or Wald test is used (contact me for details). The estimated logit of the preference for the treatmentis 0.34 (se .098) which can be converted back to the estimate of the proportion choosing the treatment of0.58 (se 0.24) with a 95% confidence interval from 0.54→ 0.64.

It is again tempting to fit a simple generalized linear model with FIXED block effects rather than ran-dom block effects. This gives similar results as seen in the previous analysis but is again not completelysatisfactory. The key problem is that the analysis with fixed block effects assumes that if the experimentwes replicated, that the replicate effects would be indentical the next time around. Given the the replicateeffect are “day” effects and represent the effects of external factors that vary by day (e.g. temperature,humidity, handling effects), this is unlikely to be true. An example of the analysis using fixed blockeffects is found in the R code. As expected, the estimates are similar, but the reported standard errors aretoo small and the confidence interval too narrow because that extra source of variation if new replicateswere done is ignored.

21.5.5 Recommendation

As long as the number of mosquitos that select one of the traps is roughly the same among all of theenclosures, the analysis of the empirical logits using a t-test analysis is simple and relatively straightforward to do with modern statistical software.

If the number of mosquitos that makes a choice is highly variable and/or the empirical probabilitiesof selection are close to 0 or 1, then the more advanced methods will be preferred. Note that if some ofthe cages has 100% or 0% of mosquitos choosing one of the traps, then many of the methods will failbecause the empirical logit are±∞ respectively. You can try using the adjustment to the empirical logitsof adding 1/2 to the number of failures and the number of successes, or Bayesian methods often workquite well.

21.6 Example: Reprise: Are mosquitos choosy? A preference ex-periment with complete blocks.


Both sexes of mosquitoes consume sugar from flowers and from honeydew (no, not the fruit8.Volatile chemicals emitted by flowers may be used by foraging mosquitoes to locate a source of sugar.In this experiment, three different oils (T1, T2, T3) from a plant were extracted and applied to delta trapswithin one of threes enclosures (see previous section for a picture) on each day with one trap having theextracted oils (the treatment) and the other delta trap serving as a control.





Then approximately 60 mosquitos were released in the enclosure. After 2 hours, the traps wereremoved and the mosquitos in each trap counted. Not all mosquitos choose one of the traps. Thesemosquitos can be ignored as they are non-informative about preferences. Each experiment took one day,i.e. all three oils were tested on a single day in 3 enclosures and were randomly assigned to the threeenclosures in a day similar to:

Day Chamber 1 Chamber 2 Chamber 3

1 T1 vs. C T2 vs. C T3 vs. C



etc.

Note that an alternative design where all three oils and the control traps are placed in the same cage(i.e. 4 traps per enclosure) is much more difficult to analyze. They key problem with this revised designis the constraint that mosquitos must choose one of the traps and so the sum of the counts in the trap mustbe less than or equal to the number released. This introduces a negative dependence among the traps, i.e.if one trap catches many mosquitos, the other traps must capture fewer. This negative dependence alsoexists in the two traps/enclosure designs, but by measuring just the proportion of mosquitos selectingone of the traps or by using the logit of the capture probability, this negative dependency is translatedinto just a single variable. A similar analysis pathway occurs if there were 4 trap types per enclosure –the 4 counts are translated into 3 response variables, but there are now many ways in which this can bedone. Please contact me for details on the analysis of such experiments.

There are now TWO different questions of interest

• Is there evidence for each oil (T1, T2, T3) that mosquitos choose it with a probability differentthan 0.5, i.e. is there evidence of a an attractiveness or repulsion for each oil

• Can we compare the efficacy of the three treatments

The first questions above is similar to what was done in the the previous section when only 1 treatmentis compared to a control trap.

The data is available in the file mosquitos2.csv in the Sample Program Library at: http://www.stat.sfu.ca/~cschwarz/Stat-Ecology-Datasets. The full dataset is not given here butsnippets are shown below.

The data are read in the usual fashion:

# Read the raw datamos <- read.csv("mosquitos2.csv", header=TRUE, strip.white=TRUE, as.is=TRUE)mos$total.mos <- mos$Treatment + mos$Controlmos$p.treatment <- mos$Treatment/mos$total.mosmos$logit.treatment <- log(mos$p.treatment/(1-mos$p.treatment))mos$Day <- factor(mos$Day)mos$Cage <- factor(mos$Cage)mos$Experiment <- paste(mos$Day, ".", mos$Cage, sep="")head(mos)

and derived variables computed. Note that we declare the Day and Cage as a factor and create an





variable, Experiment which is a combination of a day and a cage because the cage number is replicatedwithin each day.

Part of the raw data is shown below:

Day Cage trt Treatment Control total.mos p.treatment logit.treatment1 1 1 T2 42 6 48 0.8750000 1.94591012 1 2 T3 33 10 43 0.7674419 1.19392253 1 3 T1 26 19 45 0.5777778 0.31365764 2 1 T1 44 9 53 0.8301887 1.58696515 2 2 T2 42 9 51 0.8235294 1.54044506 2 3 T3 32 13 45 0.7111111 0.9007865

Experiment1 1.12 1.23 1.34 2.15 2.26 2.3

Notice that the experimental unit is the trap, but the observational unit (what is actually measured)is the individual mosquito. The response variable for each individual mosquito is the either Treatmentor Control trap. The fact that the data have been summarized to the replicate level doesn’t change theobservational unit. It is very tempting to think that the counts in each trap are the response variable, butthat is incorrect – these are simply a summary of the data.

Consequently, this is yet another example of pseudo-replication (experiment units not equal to the ob-servational unit) and any analysis must account for this mismatch. As was seen in the previous sections,there are several ways to analyze this data:

• Compute a response variable at the trap (experimental unit) level. This is often the “average”response when the response variable is continuous. In this case, the “average” will be the empiricalproportion that choose the Treatment trap in each replicate. The analysis would be a simple t-test on this empirical proportion to see if there is evidence that the proportion that choose theTreatment is different from 0.5 for each of the individual oils. A combined analysis that analyzesall the oils simultaneously can also be be used which has some advantages as will be seen. Asimple RCB analysis can be used to compare the efficacy of the oils.

• Similarly, compute the logit of the proportion that chose the treatment, and do a one-sample t-test on the empirical logits to see if there is evidence that the mean logit is different from 0 =logit(0.5). A combined analysis that analyzes all the oils simultaneously can also be be usedwhich has some advantages as will be seen. A simple RCB analysis can be used to compare theefficacy of the oils among themselves.

• Fit a generalized-linear model but adjust the results for over dispersion. This is known as a quasi-binomial or quasi-logit analysis. The goodness-of-fit statistics can be used to estimate the overdispersion factor which is then used to adjust the standard errors, test statistics, and p-values.





On the surface, the data looks like a paired design with two counts (corresponding to the two treat-ments) in each enclosure. Many students would like to analyze the difference in the counts from eachenclosure. This is not a good choice of analysis for several reasons:

• The difference ignores the size of the base counts. For example a difference of 4 = 8− 4 has thesame impact as a difference of 4 = 80 − 76. Clearly a difference of 4 when the base counts arearound 6 is more “interesting” than a difference of 4 when the base counts are around 78. This iswhy the proportion choosing one of the treatments is a better choice because now 0.67 = 8/12 ismore interesting than .51 = 80/176.

• Many effects operate multiplicatively on counts. For example, consider counts of (8,4) and (80,40).In both bases the proportion choosing the treatment cell (the first cell) is 0.67 despite the largedifference in the base counts. A simple difference of 4 or 40 will hide the consistent effects.Consequently it makes sense then to use the raw proportions 0.67 = 8/12 = 80/120 or thelog(RATIO) of the two counts log(2.0) = log(8/4) = log(80/40) as both give consistent effects.The log() of the ratio is used to make the analysis symmetrical. For example, it should not matterif you analyze the proportion that chose the treatment or the proportion that chose control as oneis simply the complement of the other. Similarly, log(8/4) = − log(4/8) so the analysis willagain be symmetrical depending on how you compute the ratio. Note that the ratio of the countsis actually the logit in disguise, i.e.

2.0 = log(8/4) = log((8/12)/(4/12)) = log(p/(1− p)) = logit(p)

It is also tempting to do a simple chi-square test based on the pooled data, i.e. find the total number ofmosquitos that choose each type of trap and test if the pooled proportion is 0.5 (indicating no preference)using a one sample z-test for proportions for each oil. For example, here is the output from R for thethree oils:

*** It is tempting (but wrong) to try a one-sample z-test on the total proportions for each treatment ***The estimated overall proportion isn’t badly estimated, but reported SE are too small, and p-value is too small


data: sum(x$Treatment) out of sum(x$total.mos), null probability 0.5X-squared = 28.136, df = 1, p-value = 1.131e-07alternative hypothesis: true p is not equal to 0.595 percent confidence interval:0.5618727 0.6338312sample estimates:

p0.5983718


data: sum(x$Treatment) out of sum(x$total.mos), null probability 0.5X-squared = 107.22, df = 1, p-value < 2.2e-16alternative hypothesis: true p is not equal to 0.595 percent confidence interval:0.6576330 0.7259704sample estimates:

p0.6928375




data: sum(x$Treatment) out of sum(x$total.mos), null probability 0.5X-squared = 6.7246, df = 1, p-value = 0.009509alternative hypothesis: true p is not equal to 0.595 percent confidence interval:0.5119093 0.5864170sample estimates:

p0.549435

trt sum.treatment sum.control statistic df p.value phat se1 T1 441 296 28.135685 1 1.131011e-07 0.5983718 0.018057762 T2 503 223 107.219008 1 3.985953e-25 0.6928375 0.017121093 T3 389 319 6.724576 1 9.509332e-03 0.5494350 0.01869908

lcl ucl1 0.5618727 0.63383122 0.6576330 0.72597043 0.5119093 0.5864170

The estimated proportions are about correct, but the reported standard errors and reported p-values willbe TOO SMALL. The key problem is that ignoring pseudo-replication ignores extra experimental unitvariation – this is called overdispersion in the logistic world.

Similarly, it is tempting to pool over the data and do a chi-square test to compare the three oils, i.e.create a two-way table of oil by choosing the treatment or control trap. For example, here is the outputfrom R for the pooled chi-square test on the counts for the three oils:

*** It is tempting (but wrong) to try a pooled-chisquare to compare treatments ***The reported p-value is too small

trt total.Treatment total.control1 T1 441 2962 T2 503 2233 T3 389 319

Pearson’s Chi-squared test

data: bad.table[, 2:3]X-squared = 32.252, df = 2, p-value = 9.923e-08

The Pearson chi-square test is inappropriate to compare the efficacy because of the pseudo-replicationand overdispersion in the experiment.

Is there evidence of over dispersion? This would mean that the variation in results is larger thanexpected under simple binomial variation. For example, here is plot of the 95% confidence intervals forthe proportion that chose the treatment trap compared to the overall proportion that chose the treatmenttrap.9 This was computed and plotted using

9This plot was prepared by R. A similar plot was also prepared using SAS but is not shown here.



# We need to account for potential overpersion as shown in the [following] plotgrand.means <- ddply(mos, "trt", summarize, phat=sum(Treatment)/sum(total.mos))

mos$p.lcl <- mos$p.treatment -qnorm(0.975)*sqrt(mos$p.treatment*(1-mos$p.treatment)/mos$total.mos)

mos$p.ucl <- mos$p.treatment +qnorm(0.975)*sqrt(mos$p.treatment*(1-mos$p.treatment)/mos$total.mos)

quasiplot <- ggplot(data=mos, aes(x=Experiment, y=p.treatment, color=trt, shape=trt, linetype=trt))+ggtitle("Is there evidence of over dispersion?")+ylab("P(choose treatment) with overall proportion")+geom_point()+geom_errorbar(aes(ymin=p.lcl, ymax=p.ucl), width=0.1)+geom_hline(data=grand.means, aes(yintercept=phat, color=trt,

shape=trt, linetype=trt))quasiplot

Notice that several of the confidence intervals from the individual cages do not cover the overall propor-tion that chose each treatment for each oil. This indicates that there is some (random) day and enclosureeffect that is influencing all of the mosquitos simultaneously (e.g. replicate humidity; replicate tempera-ture, etc.).

Because of this extra-binomial variation, it is not proper to simply “ignore” the replicates and poolover all of the replicates and do a simple chi-square test or simple logistic regression. This would be anexample of sacrificial pseudo-replication as outlined by Hurlbert (1984).

Before doing any analysis, it is always a good idea to make sure there are no missing data, and if ablocked design, to see if the design is complete:

Daytrt 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

T1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1T2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1T3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1



where a 1 indicates that the treatment (oil) was tested on that day. We see that every oil is tested onevery day and every day has every oil tested. The design is complete. This makes the analysis somewhatsimpler.


As suggested by Hurlbert (1984), a simple analysis could proceed by finding the proportion of mosquitosthat choose the treatment trap in each replicate and then doing a one-sample t-test analysis on theseproportions to see if proportion is different from 0.5 This is not completely satisfactory because thenumber of mosquitos in each replicate (n) varies considerably from replicate-to-replicated, and hencethe variance of p also varies10. A weighted analysis could be performed which would partially solve thisproblem.

If the mosquitos were choosing the traps simply at random, we would expect that p = 0.5 - this isthe hypothesized value in the t-test for each oil:

H0 : µp = 0.5

HA : µp 6= 0.5

A plot of the empirical probability of selecting the treatment trap

10The binomial variance of each p for a replicate would be found as√p(1− p)/n



The approximate confidence intervals (notches on the boxplots) can be used to (approximately) test thehypothesized value of 0.5.

The t.test() function is used on the empirical proportion that choose the treatment trap for each oil:

# Do a simple t-test to see if the proportion of mosquitos choosing# treatment is 0.5? Unfortunately, this doesn’t weight by number of# mosquitos in total which chose. Confidence interval is also silly# as the proportion can’t exceed 1. Because the design is complete,# the analysis is a separate t-test for each trtres.p <- ddply(mos, "trt", function(x){

res.p <- t.test(x$p.treatment, mu=0.5)print(res.p)phat <- res.p$estimatese <- (res.p$estimate-0.5)/res.p$statisticdata.frame(statistic=res.p$statistic, df= res.p$parameter,

p.value=res.p$p.value, phat=phat, se=se,lcl=res.p$conf.int[1], ucl=res.p$conf.int[2])

})



res.p

giving

One Sample t-test

data: x$p.treatmentt = 2.7451, df = 14, p-value = 0.01579alternative hypothesis: true mean is not equal to 0.595 percent confidence interval:0.5213022 0.6735064sample estimates:mean of x0.5974043

One Sample t-test

data: x$p.treatmentt = 6.5667, df = 14, p-value = 1.256e-05alternative hypothesis: true mean is not equal to 0.595 percent confidence interval:0.6295164 0.7551552sample estimates:mean of x0.6923358

One Sample t-test

data: x$p.treatmentt = 1.1918, df = 14, p-value = 0.2532alternative hypothesis: true mean is not equal to 0.595 percent confidence interval:0.4629895 0.6295737sample estimates:mean of x0.5462816

trt statistic df p.value phat se lcl ucl1 T1 2.745146 14 1.579479e-02 0.5974043 0.03548238 0.5213022 0.67350642 T2 6.566748 14 1.256001e-05 0.6923358 0.02928935 0.6295164 0.75515523 T3 1.191760 14 2.531668e-01 0.5462816 0.03883467 0.4629895 0.6295737

The estimated proportion that choose the treatment trap for each oil are shown along with the p-valuefor each oil’s test and the 95% confidence interval for the proportion that choose the oil. This analysis isnot completely satisfactory for several reasons:

• The distribution of the empirical proportions may not have a nice “normal” shape, especially if theestimates are close to 0 or 1. As well, the reported confidence bounds may fall outside of 0 and 1.



• Each replicate is given equal weight even not all trails had the same number of mosquitos thatmade a choice

• Each oil was analyzed separately, even though they were all tested on the same day. Consequentlya separate estimate of variance was computed for each oil’s analysis which is inefficient and leadsto larger reported standard errors and a loss of power to detect effects.

The first item can be solved using an analysis on the empirical logits. The second item can be resolvedusing a weighted analysis, or as shown below a logistic analysis (adjusted for overdispersion). The thirditem can be resolved by fitting an RCB on the empirical proportions and then testing if the marginal“means” have the value of 0.5.

In order to make the analysis more efficient, we fit an RCB model with RANDOM blocks to fullycapture the variability in the data. If an RCB model with fixed blocks was fit, it would assume that theexact same set of days and enclosures would be used in the future. While we may have control over theenclosures, we have no control over environmental variables such as humidity that may change from dayto day. The lmer() function is used to fit the RCB with random blocks model, followed by the emmeans()function in the emmeans package to test if the marginal “means” (i.e. the marginal proportions) are equalto 0.5.

# We can also get this from an RCB analysis on the p.treatments,# but we need to treat the blocks (day) as random effects.# This is more efficient than an separate analysis by treatment# (above), because the analysis of all of the treatments# simultaneously provides better information on the varianceres.prcb <- lmer( p.treatment ~ trt + (1|Day), data=mos)

# We now compute the emmeans and test if these equal 0.5# Estimates/se are the same as above, but ci are narrower and pvalues smallerres.prcb.emmo <- emmeans::emmeans(res.prcb, ~ trt)summary(res.prcb.emmo, infer=TRUE, null=0.5)

giving

trt emmean SE df lower.CL upper.CL null t.ratio p.valueT1 0.597 0.0348 26.9 0.526 0.669 0.5 2.802 0.0093T2 0.692 0.0348 26.9 0.621 0.764 0.5 5.533 <.0001T3 0.546 0.0348 26.9 0.475 0.618 0.5 1.331 0.1942

Degrees-of-freedom method: kenward-rogerConfidence level used: 0.95

Notice that now all of the marginal “means” all have the same standard error. This is because thedesign was complete (i.e. every block had every treatment), and so a pooled estimate of the residualvariation was used to estimate the standard error for each estimate. On average, these standard errorswill be slightly smaller than those computed previously and the tests of the hypothesis will be slightlymore powerful. The effect of analyzing each oil separately vs. a combined analysis will not be that greatunless there only a few blocks (days) of data.

Once the RCB analysis is fit, it is then straight forward to compare the efficacy of the oils in theusual way, i.e. using a multiple comparison procedure following the overall test for now differences in



efficacy: We use the cld() function in the emmeans package to compare the marginal “means” (i.e. themarginal proportions) and to compute the pairwise differences:

# We can now do comparison between the various treatments in the# traditional way using the emmeans package.# using the emmeans::CLD() and pairs() functionsres.prcb.cld <- emmeans::CLD(res.prcb.emmo)res.prcb.cld

# Here is the plotres.prcb.plot <- sf.cld.plot.bar(res.prcb.cld, "trt")+

ggtitle("Comparions of attractivness of oils")+xlab("Treatment")+ylab("Proportion attracted to bait")

res.prcb.plot

# all the pairwide differencesres.prcb.pairs <- pairs(res.prcb.emmo)summary(res.prcb.pairs, infer=TRUE)

giving

trt emmean SE df lower.CL upper.CL .groupT3 0.546 0.0348 26.9 0.475 0.618 1T1 0.597 0.0348 26.9 0.526 0.669 1T2 0.692 0.0348 26.9 0.621 0.764 2

Degrees-of-freedom method: kenward-rogerConfidence level used: 0.95P value adjustment: tukey method for comparing a family of 3 estimatessignificance level used: alpha = 0.05contrast estimate SE df lower.CL upper.CL t.ratio p.valueT1 - T2 -0.0949 0.0337 28 -0.1783 -0.0116 -2.818 0.0231T1 - T3 0.0511 0.0337 28 -0.0322 0.1345 1.518 0.2981T2 - T3 0.1461 0.0337 28 0.0627 0.2294 4.336 0.0005

Confidence level used: 0.95Conf-level adjustment: tukey method for comparing a family of 3 estimatesP value adjustment: tukey method for comparing a family of 3 estimates

The compact letter display (cld) is interpreted in the usual fashion.

A plot of the results can be make in the usual fashion as shown using R.




In an analogous fashion, we proceed by finding the empirical logit of the proportion of mosquitos thatchoose the treatment trap in each replicate and then doing a one-sample t-test analysis on these logitsand/or an RCB analysis on the logits

If the mosquitos were choosing the traps simply at random, we would expect that p = 0.5 or thelogit = 0.0 for each marginal logit Hence our hypothesis is now

H0 : µlogit = 0.0

HA : µlogit 6= 0.0

for each marginal logit.

The t.test() is used on the empirical proportion that choose the treatment trap:

# Do a comparison on the logit scale. This is often more desireable# because the logit often has a nicer distribution than the raw# proportion. This automatically keeps the confidence interval# bounds between 0 and 1 unlike the previous analysis.# Fortunately, none of the estimates are 0 or 1# which causes problems in computing the logit.res.logit <- ddply(mos, "trt", function(x){

res.logit <- t.test(x$logit.treatment, mu=0)print(res.logit)logithat <- res.logit$estimate



logitse <- abs((res.logit$estimate)/res.logit$statistic)phat <- expit(logithat)pse <- logitse*phat*(1-phat)plcl <- expit(res.logit$conf.int[1])pucl <- expit(res.logit$conf.int[2])data.frame(statistic=res.logit$statistic, df= res.logit$parameter,

p.value=res.logit$p.value, logithat=logithat, logitse=logitse,logitlcl=res.logit$conf.int[1], logitucl=res.logit$conf.int[2],phat=phat, pse=pse, plcl=plcl, pucl=pucl)

})res.logit

giving

One Sample t-test

data: x$logit.treatmentt = 2.6743, df = 14, p-value = 0.01815alternative hypothesis: true mean is not equal to 095 percent confidence interval:0.0859759 0.7824053sample estimates:mean of x0.4341906

One Sample t-test

data: x$logit.treatmentt = 5.8603, df = 14, p-value = 4.144e-05alternative hypothesis: true mean is not equal to 095 percent confidence interval:0.5520209 1.1893261sample estimates:mean of x0.8706735

One Sample t-test

data: x$logit.treatmentt = 1.0739, df = 14, p-value = 0.301alternative hypothesis: true mean is not equal to 095 percent confidence interval:-0.1879099 0.5647669sample estimates:mean of x0.1884285

trt statistic df p.value logithat logitse logitlcl logitucl1 T1 2.674345 14 1.814511e-02 0.4341906 0.1623540 0.0859759 0.78240532 T2 5.860329 14 4.143805e-05 0.8706735 0.1485708 0.5520209 1.1893261



3 T3 1.073871 14 3.010415e-01 0.1884285 0.1754666 -0.1879099 0.5647669phat pse plcl pucl

1 0.6068739 0.03873409 0.5214807 0.68619832 0.7048858 0.03090596 0.6346043 0.76662053 0.5469682 0.04347957 0.4531603 0.6375548

The estimated logit of the proportion can be back-transformed to estimate the proportion that choosethe treatment trap. These results are very similar to the the previous analysis.

We again fit the RCB model to the logits to do the comparison of the marginal logits for all three oilssimultaneously and to compare the efficacy of the three oils: The lmer() function is used to fit the RCBwith random blocks model, followed by the emmeans() function in the emmeans package to test if themarginal “logits” are equal to 0. We can also back transform the estimates on the logit scale.

# We can also get this from an RCB analysis on the logit.treatments,# but we need to treat the blocks (day) as random effects# This is more efficient than an separate analysis by treatment (# above), because the analysis of all of the treatments simultaneously# provides better information on the variance.res.logitrcb <- lmer( logit.treatment ~ trt + (1|Day), data=mos)

# We now compute the emmeans and test if these equal 0.5# Estimates/se are the same as above, but ci are narrower and pvalues smallerres.logitrcb.emmo <- emmeans::emmeans(res.logitrcb, ~ trt)res.logitrcb.logits <- summary(res.logitrcb.emmo, infer=TRUE, null=0.)res.logitrcb.logits$phat <- expit(res.logitrcb.logits$emmean)res.logitrcb.logits$pse <- res.logitrcb.logits$SE*res.logitrcb.logits$phat*

(1-res.logitrcb.logits$phat)res.logitrcb.logits$plcl <- expit(res.logitrcb.logits$lower.CL)res.logitrcb.logits$pucl <- expit(res.logitrcb.logits$upper.CL)res.logitrcb.logits

giving

trt emmean SE df lower.CL upper.CL t.ratio p.value phat pse plclT1 0.434 0.163 26.9 0.101 0.768 2.672 0.0127 0.607 0.0388 0.525T2 0.871 0.163 26.9 0.537 1.204 5.358 <.0001 0.705 0.0338 0.631T3 0.188 0.163 26.9 -0.145 0.522 1.160 0.2564 0.547 0.0403 0.464pucl

0.6830.7690.628

Degrees-of-freedom method: kenward-rogerConfidence level used: 0.95

Notice that now all of the marginal “logits” all have the same standard error for the same reasons aswhen we analyzed the empirical proportions directly.

Once the RCB analysis is fit, it is then straight forward to compare the efficacy of the oils in the



usual way, i.e. using a multiple comparison procedure following the overall test for now differences inefficacy: We use the cld() function in the emmeans package to compare the marginal “logits” (i.e. themarginal proportions) and to compute the pairwise differences:

# We can now do comparison between the various treatments# in the usual way with the emmeans packaged and# using the emmeans::CLD() and pairs() functions.res.logitrcb.cld <- emmeans::CLD(res.logitrcb.emmo)res.logitrcb.cld

# Here is the plotres.logitrcb.plot <- sf.cld.plot.bar(res.logitrcb.cld, "trt")+

ggtitle("Comparions of attractivness of oils on logit scale")+xlab("Treatment")+ylab("logit(P( attracted to bait))")

res.logitrcb.plot

# all the pairwide differences. The anti-logs are odds ratiosres.logitrcb.pairs <- pairs(res.logitrcb.emmo)res.logitrcb.pairs <- summary(res.logitrcb.pairs, infer=TRUE)res.logitrcb.pairs$OR <- exp(res.logitrcb.pairs$estimate)res.logitrcb.pairs$OR.lcl <- exp(res.logitrcb.pairs$lower.CL)res.logitrcb.pairs$OR.ucl <- exp(res.logitrcb.pairs$upper.CL)res.logitrcb.pairs

giving

trt emmean SE df lower.CL upper.CL .groupT3 0.188 0.163 26.9 -0.145 0.522 1T1 0.434 0.163 26.9 0.101 0.768 1T2 0.871 0.163 26.9 0.537 1.204 2

Degrees-of-freedom method: kenward-rogerConfidence level used: 0.95P value adjustment: tukey method for comparing a family of 3 estimatessignificance level used: alpha = 0.05contrast estimate SE df lower.CL upper.CL t.ratio p.value OR OR.lclT1 - T2 -0.436 0.157 28 -0.826 -0.0469 -2.773 0.0257 0.646 0.438T1 - T3 0.246 0.157 28 -0.144 0.6353 1.561 0.2789 1.279 0.866T2 - T3 0.682 0.157 28 0.293 1.0718 4.334 0.0005 1.978 1.340OR.ucl0.9541.8882.921


The compact letter display (cld) is interpreted in the usual fashion. Notice that when we back transformthe difference in logits, it is equivalent to the odds ratio (OR) for choosing the treated trap compared tothe control trap and an OR = 1 indicates no evidence of a difference in efficacy between the oils.



A plot of the results can be make in the usual fashion as shown using R.




This is done in R using the glm() function but with the quasibinomial family selected. Notice thatwe fit the RCB model. Because the default of R is to perform Type I tests (incremental tests) we needto ensure that the trt is last in the model and/or use the contrasts argument to set the contrast matrixand then use the ANOVA function in the car package. Because the fit takes place on the logit scale, thehypothesis is that the marginal logits are 0 and then we compare the oils in the usual fashion. Note thatthe emmeans package gives you the option of doing the back transformation automatically.

res.quasilogit <- glm(p.treatment ~ Day + trt, data=mos,weight=total.mos, family=quasibinomial(link=logit),contrasts=list(Day=contr.sum, trt=contr.sum))

# get the marginal means (on the logit scale) and the probability scaleres.quasilogit.emmo <- emmeans::emmeans(res.quasilogit, ~ trt)summary(res.quasilogit.emmo, infer=TRUE)summary(res.quasilogit.emmo, type="response")

giving:

trt emmean SE df asymp.LCL asymp.UCL z.ratio p.value



T1 0.424 0.104 Inf 0.21919 0.628 4.060 <.0001T2 0.853 0.112 Inf 0.63433 1.072 7.636 <.0001T3 0.203 0.105 Inf -0.00239 0.409 1.937 0.0527

Results are averaged over the levels of: DayResults are given on the logit (not the response) scale.Confidence level used: 0.95trt prob SE df asymp.LCL asymp.UCLT1 0.604 0.0250 Inf 0.555 0.652T2 0.701 0.0234 Inf 0.653 0.745T3 0.551 0.0259 Inf 0.499 0.601

Results are averaged over the levels of: DayConfidence level used: 0.95Intervals are back-transformed from the logit scale

This analysis is not completely satisfactory because Day is treated as a fixed effect and so the estimatedmarginal logits will tend to have reported standard errors that are too small. See me for additional details.

Note that different packages may give slight different values for the p-values for the marginal logitsdepending on the way the overdispersion factor is computed (e.g. using the deviance or the Pearsonchi-square) and the way the p-value is computed (e.g. using quasilikelihood or a Wald statistic). All areasymptotically equivalent but can differ in small sample sizes – contact me for more details.

Once the RCB analysis is fit, it is then straight forward to compare the efficacy of the oils in the usualway, i.e. using a multiple comparison procedure after the test for no differences in efficacy: We use thecld() function in the emmeans package to compare the marginal “logits” (i.e. the marginal proportions)and to compute the pairwise differences:

# Multiple comparison between the treatments# Because the design is balanced, type I are the same as Type III tests.Anova(res.quasilogit, test="F", type=3)res.quasilogit.cld <- emmeans::CLD(res.quasilogit.emmo)res.quasilogit.cld

# Here is the plotres.quasilogit.plot <- sf.cld.plot.bar(res.quasilogit.cld, "trt")+

ggtitle("Comparions of attractivness of oils - using quasi-binomial model")+xlab("Treatment")+ylab("logit Proportion attracted to bait")

res.quasilogit.plot

# all the pairwide differencesres.quasilogit.pairs <- pairs(res.quasilogit.emmo)summary(res.quasilogit.pairs, infer=TRUE)summary(res.quasilogit.pairs, infer=TRUE,type="response")

giving

Analysis of Deviance Table (Type III tests)

Response: p.treatment



Error estimate based on Pearson residuals

Sum Sq Df F values Pr(>F)Day 107.075 14 4.2033 0.0006012 ***trt 34.456 2 9.4680 0.0007230 ***Residuals 50.949 28---Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1trt emmean SE df asymp.LCL asymp.UCL .groupT3 0.203 0.105 Inf -0.00239 0.409 1T1 0.424 0.104 Inf 0.21919 0.628 1T2 0.853 0.112 Inf 0.63433 1.072 2

Results are averaged over the levels of: DayResults are given on the logit (not the response) scale.Confidence level used: 0.95Results are given on the log odds ratio (not the response) scale.P value adjustment: tukey method for comparing a family of 3 estimatessignificance level used: alpha = 0.05contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.valueT1 - T2 -0.430 0.152 Inf -0.787 -0.0724 -2.819 0.0134T1 - T3 0.221 0.148 Inf -0.125 0.5667 1.495 0.2934T2 - T3 0.650 0.153 Inf 0.292 1.0086 4.254 0.0001

Results are averaged over the levels of: DayResults are given on the log odds ratio (not the response) scale.Confidence level used: 0.95Conf-level adjustment: tukey method for comparing a family of 3 estimatesP value adjustment: tukey method for comparing a family of 3 estimatescontrast odds.ratio SE df asymp.LCL asymp.UCL z.ratio p.valueT1 / T2 0.651 0.0992 Inf 0.455 0.93 -2.819 0.0134T1 / T3 1.247 0.1841 Inf 0.882 1.76 1.495 0.2934T2 / T3 1.916 0.2929 Inf 1.339 2.74 4.254 0.0001

Results are averaged over the levels of: DayConfidence level used: 0.95Conf-level adjustment: tukey method for comparing a family of 3 estimatesIntervals are back-transformed from the log odds ratio scaleP value adjustment: tukey method for comparing a family of 3 estimatesTests are performed on the log odds ratio scale

The compact letter display (cld) is interpreted in the usual fashion. Notice that when we back trans-form the difference in logits, it is equivalent to the odds ratio (OR) for choosing the treated trap comparedto the control trap and an OR = 1 indicates no evidence of a difference in efficacy between the oils.

A plot of the results can be make in the usual fashion as well (but is not shown) in these notes.

The estimated overdispersion factor is:

# Overdispersion factor iscat("Overdispersion factor ", summary(res.quasilogit)$dispersion, "\n")



giving

Overdispersion factor 1.81959

Notice that R gives the

indicating that standard errors from a simple logistic fit must be adjusted by the square root of the factor.This was automatically done above.

This analysis is not completely satisfactory because Day is treated as a fixed effect and so the esti-mated marginal logits will tend to have reported standard errors that are too small. See me for additionaldetails.

21.6.4 GLIMM modeling

Finally, a generalized linear mixed model logistic ANOVA can be done that explicitly models the randomeffects of days and enclosures (the combination of day and cage) directly. This is done using the glmer()function in the lme4 package. We fit a model with trt to compare the marginal logits to 0 and to comparethe efficacy among the different oils

# Do a random effects logisitic regressionres.rand.logit <- glmer(p.treatment ~trt + (1|Day)+(1|Experiment),

data=mos, weight=mos$total.mos,family=binomial(link=logit))

# get the marginal means (on the logit scale) and the probability scaleres.rand.logit.emmo <- emmeans::emmeans(res.rand.logit, ~ trt)summary(res.rand.logit.emmo, infer=TRUE)summary(res.rand.logit.emmo, type="response")


logit(p) = trt + Day(R) + Experiment(R)



θi =trt + Day(R) + Experiment(R)

where the Experiment term represent the combination of day and cage. You should not use Cage as therandom effect because cage 1 on day 1 is not necessarily the same “cage” on day 2, despite it havingthe same physical location etc. Perhaps something happened to the cage on day 2 that made it subtlydifferent from the same cage on the previous day.

This gives the estimated marginals and comparisons to 0 on both the logit and probability scale.

trt emmean SE df asymp.LCL asymp.UCL z.ratio p.valueT1 0.422 0.149 Inf 0.1299 0.713 2.833 0.0046



T2 0.853 0.152 Inf 0.5557 1.150 5.623 <.0001T3 0.199 0.149 Inf -0.0928 0.492 1.337 0.1811

Results are given on the logit (not the response) scale.Confidence level used: 0.95trt prob SE df asymp.LCL asymp.UCLT1 0.604 0.0356 Inf 0.532 0.671T2 0.701 0.0318 Inf 0.635 0.760T3 0.550 0.0369 Inf 0.477 0.621

Confidence level used: 0.95Intervals are back-transformed from the logit scale

The usual comparison among treatments is made after the test for no treatment effect.

# Multiple comparison between the treatmentsres.rand.logit.cld <- emmeans::CLD(res.rand.logit.emmo)res.rand.logit.cld

# Here is the plotres.rand.logit.plot <- sf.cld.plot.bar(res.rand.logit.cld, "trt")+

ggtitle("Comparions of attractivness of oils - using random effect glmer")+xlab("Treatment")+ylab("logit Proportion attracted to bait")

res.rand.logit.plot

# all the pairwide differencesres.rand.logit.pairs <- pairs(res.rand.logit.emmo)summary(res.rand.logit.pairs, infer=TRUE)summary(res.rand.logit.pairs, infer=TRUE,type="response")

giving

trt emmean SE df asymp.LCL asymp.UCL .groupT3 0.199 0.149 Inf -0.0928 0.492 1T1 0.422 0.149 Inf 0.1299 0.713 1T2 0.853 0.152 Inf 0.5557 1.150 2

Results are given on the logit (not the response) scale.Confidence level used: 0.95Results are given on the log odds ratio (not the response) scale.P value adjustment: tukey method for comparing a family of 3 estimatessignificance level used: alpha = 0.05contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.valueT1 - T2 -0.431 0.145 Inf -0.771 -0.0917 -2.977 0.0082T1 - T3 0.222 0.142 Inf -0.112 0.5559 1.560 0.2633T2 - T3 0.654 0.145 Inf 0.313 0.9945 4.493 <.0001

Results are given on the log odds ratio (not the response) scale.Confidence level used: 0.95Conf-level adjustment: tukey method for comparing a family of 3 estimatesP value adjustment: tukey method for comparing a family of 3 estimates



contrast odds.ratio SE df asymp.LCL asymp.UCL z.ratio p.valueT1 / T2 0.65 0.0942 Inf 0.462 0.912 -2.977 0.0082T1 / T3 1.25 0.1778 Inf 0.894 1.743 1.560 0.2633T2 / T3 1.92 0.2797 Inf 1.367 2.703 4.493 <.0001

Confidence level used: 0.95Conf-level adjustment: tukey method for comparing a family of 3 estimatesIntervals are back-transformed from the log odds ratio scaleP value adjustment: tukey method for comparing a family of 3 estimatesTests are performed on the log odds ratio scale

The p-values may differ slightly between packages depending a likelihood ratio or Wald test is used(contact me for details).

The difference of logits corresponds to the log(OddsRation) of the proportion choosing the treat-ment trap vs the control trap between two different oils. A back transformation can give estimates ofthe odds ratio (and confidence limits) directly. Notice that you do NOT take the anti-log of the se of thelog(odds ratio) to get the se of the odds-ratio, but must do a delta-method transformation. Contact mefor details.

The estimated variance components (on the logit scale) are found as:

Estimated variance componentsGroups Name Std.Dev.Experiment (Intercept) 0.24778Day (Intercept) 0.42508



As long as the number of mosquitos that select one of the traps is roughly the same among all of theenclosures, the analysis of the empirical logits using a Randomized Complete Block (RCB) analysis issimple and relatively straight forward to do with modern statistical software. The key to this model isthe ability to declare blocks as a random factor to ensure that the individual comparisons of the marginallogits to 0 captures the true extent of the uncertainty.

If the number of mosquitos that makes a choice is highly variable and/or the empirical probabilitiesof selection are close to 0 or 1, then the more advanced methods will be preferred. Note that if some ofthe cages has 100% or 0% of mosquitos choosing one of the traps, then many of the methods will failbecause the empirical logit are ±∞ respectively. In such cases, Bayesian methods can often work quitewell.



21.7 Example: Reprise: Are mosquitos choosy? A preference ex-periment with INCOMPLETE blocks.


This third analysis continues that of the previous section but now deals with cases where there ismissing data, i.e. not all oils are tested on all days. For example, you may have 5 different oils, but onlythree enclosures that can be used in a day. Consequently, not all oils can be tested in one day. Or, missingdata (e.g. something went wrong with an enclosure) implies that the blocks are also not complete. Wewill (arbitrarily) delete some of the data from the previous section and redo the analyses.

As before, both sexes of mosquitoes consume sugar from flowers and from honeydew (no, not thefruit11. Volatile chemicals emitted by flowers may be used by foraging mosquitoes to locate a sourceof sugar. In this experiment, three different oils (T1, T2, T3) from a plant were extracted and applied todelta traps within one of threes enclosures (see previous section for a picture) on each day with one traphaving the extracted oils (the treatment) and the other delta trap serving as a control.

Then approximately 60 mosquitos were released in the enclosure. After 2 hours, the traps wereremoved and the mosquitos in each trap counted. Not all mosquitos choose one of the traps. Thesemosquitos can be ignored as they are non-informative about preferences. Each experiment took oneday, i.e. the oils were tested on a single day in 3 enclosures and were randomly assigned to the threeenclosures in a day similar to:

Day Chamber 1 Chamber 2 Chamber 3




etc.

However, as noted, we will (arbitrarily) remove some of the data and the revised design looks like:

Daytrt 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

T1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1T2 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1T3 0 0 1 0 0 1 1 0 1 1 1 1 0 1 1

where a 1 indicates that the treatment was tested on that day and 0 indicates it was not. We see that notevery oil is tested on every day and not every day has every oil tested. The design is INCOMPLETE.

A naive analysis might simple delete those blocks with missing data. This is inefficient and costly –there is information in the incomplete blocks and you should not simply discard it.

Incomplete block designs must be analyzed with care because:

• Simple marginal comparisons of treatment no longer have the same set of blocks and so some of





the differences in the margins are confounded with treatment effects. The key is do a block analysisto “adjust” the reading to make all treatments compared against a comparable set of blocks.

• There are two sources of information about comparisons among treatments. The intra-block anal-ysis looks at differences among treatments WITHIN blocks. If the blocking design is complete,this is the entire information about treatment comparisons. However, with incomplete blocks,there is another (smaller) source of information, the inter-block analysis. This will automaticallybe included if you do the analysis with random blocks.

As before, there are now TWO different questions of interest

• Is there evidence for each oil (T1, T2, T3) that mosquitos choose it with a probability differentthan 0.5, i.e. is there evidence of a an attractiveness or repulsion for each oil

• Can we compare the efficacy of the three treatments

We will arbitrarily delete some data from the mosquitos3.csv in the Sample Program Library at:http://www.stat.sfu.ca/~cschwarz/Stat-Ecology-Datasets. Consult the programcode for details.

As before, the Day and Cage are declared as as a factor variables and we created anvariable,Experiment which is a combination of a day and a cage because the cage number is replicated withineach day.

Part of the new raw data is shown below:

Day Cage trt Treatment Control total.mos p.treatment logit.treatment1 1 1 T2 42 6 48 0.8750000 1.94591012 2 1 T1 44 9 53 0.8301887 1.58696513 2 2 T2 42 9 51 0.8235294 1.54044504 3 1 T1 25 21 46 0.5434783 0.17435345 3 2 T3 33 20 53 0.6226415 0.50077536 3 3 T2 34 13 47 0.7234043 0.9614112

Experiment1 1.12 2.13 2.24 3.15 3.26 3.3

As before, notice that the experimental unit is the trap, but the observational unit (what is actuallymeasured) is the individual mosquito. The response variable for each individual mosquito is the eitherTreatment or Control trap. The fact that the data have been summarized to the replicate level doesn’tchange the observational unit. It is very tempting to think that the counts in each trap are the responsevariable, but that is incorrect – these are simply a summary of the data.

Consequently, this is yet another example of pseudo-replication (experiment units not equal to theobservational unit) and any analysis must account for this mismatch. Now some care must be taken inthe analysis:




• Compute a response variable at the trap (experimental unit) level. This is often the “average”response when the response variable is continuous. In this case, the “average” will be the empiricalproportion that choose the Treatment trap in each replicate. Now a block analysis that analyzesall the oils simultaneously MUST BE USED. You should not analyze each oil separately becausethe set of blocks over which the oil was tests is NOT the same.

• Similarly, compute the logit of the proportion that chose the treatment, and perform a blockanalysis on the empirical logits to see if there is evidence that the mean logit is different from0 = logit(0.5). You should not analyze each oil separately because the set of blocks over whichthe oil was tests is NOT the same.

• Fit a generalized-linear model but adjust the results for over dispersion. This is known as a quasi-binomial or quasi-logit analysis. The goodness-of-fit statistics can be used to estimate the overdispersion factor which is then used to adjust the standard errors, test statistics, and p-values.Unfortunately, this will NOT automatically incorporate the inter-block information becauseblocks must be declared as fixed.

• Perform a generalized linear mixed model analysis where the ordinary logistic regression is aug-mented with random effects. The key problem here is the need to integrate over the random effectswhen computing the likelihood and this can be computationally challenging. This automaticallyincludes both the inter- and intra-block information.


As before, simple ch-square tests on the poole data are NOT appropriate because of the pseudo-replication issues.

Is there evidence of over dispersion? Here is plot of the 95% confidence intervals for the proportionthat chose the treatment trap compared to the overall proportion that chose the treatment trap.



Notice that several of the confidence intervals from the individual cages do not cover the overall propor-tion that chose each treatment for each oil. This indicates that there is some (random) day and enclosureeffect that is influencing all of the mosquitos simultaneously (e.g. replicate humidity; replicate tempera-ture, etc.).


Because the design is an incomplete block design, a combined block analysis MUST be used to avoidhaving the oils using different sets of blocks when computing the marginal means. We fit an block modelwith RANDOM blocks to fully capture the variability in the data. If a block model with fixed blocks wasfit, it would assume that the exact same set of days and enclosures would be used in the future. While wemay have control over the enclosures, we have no control over environmental variables such as humiditythat may change from day to day. The lmer() function is used to fit the block with random blocks model,followed by the emmeans() function in the emmeans package to test if the marginal “means” (i.e. themarginal proportions) are equal to 0.5.

# We do the combined inter- and intra-block analysis on the p.treatments.# This is recommended# The blocks (day) MUST be declared as random effectsres.prcb <- lmer( p.treatment ~ trt + (1|Day), data=mos3)

# We now compute the emmeans and test if these equal 0.5# Estimates/se are the same as above, but ci are narrower and# the pvalues smaller.res.prcb.emmo <- emmeans::emmeans(res.prcb, ~ trt)cat("*** RECOMMENDED *** Adjusts for incomplete blocks and extracts ",

"intra- and inter-block information \n")summary(res.prcb.emmo, infer=TRUE, null=0.5)

giving

*** RECOMMENDED *** Adjusts for incomplete blocks and extracts intra- and inter-block informationtrt lsmean SE df lower.CL upper.CL null t.ratio p.valueT1 0.6046668 0.03359525 26.52 0.5356763 0.6736572 0.5 3.116 0.0044T2 0.6965971 0.03359525 26.52 0.6276067 0.7655875 0.5 5.852 <.0001T3 0.5665061 0.04016233 32.34 0.4847316 0.6482807 0.5 1.656 0.1074

Confidence level used: 0.95

Notice that not all of the marginal “means” all have the same standard error. This is because thedesign was incomplete and each oil was tested in differing number of blocks. Once the block analysisis fit, it is then straight forward to compare the efficacy of the oils in the usual way, i.e. using a multiplecomparison procedure: We use the cld() function in the emmeans package to compare the marginal“means” (i.e. the marginal proportions) and to compute the pairwise differences:

# We can now do comparison between the various treatments in the# usual way using the emmeans packaged and# using the emmeans::CLD() and pairs() functions



res.prcb.cld <- emmeans::CLD(res.prcb.emmo)res.prcb.cld

# Here is the plotres.prcb.plot <- sf.cld.plot.bar(res.prcb.cld, "trt")+

ggtitle("Comparions of attractivness of oils - incomplete blocks")+xlab("Treatment")+ylab("Proportion attracted to bait")

res.prcb.plot

# all the pairwide differencesres.prcb.pairs <- pairs(res.prcb.emmo)summary(res.prcb.pairs, infer=TRUE)

giving

trt emmean SE df lower.CL upper.CL .groupT3 0.567 0.0402 32.3 0.485 0.648 1T1 0.605 0.0336 26.5 0.536 0.674 1T2 0.697 0.0336 26.5 0.628 0.766 2

Degrees-of-freedom method: kenward-rogerConfidence level used: 0.95P value adjustment: tukey method for comparing a family of 3 estimatessignificance level used: alpha = 0.05contrast estimate SE df lower.CL upper.CL t.ratio p.valueT1 - T2 -0.0919 0.0358 21.4 -0.1821 -0.00173 -2.566 0.0452T1 - T3 0.0382 0.0414 22.0 -0.0659 0.14226 0.921 0.6331T2 - T3 0.1301 0.0414 22.0 0.0260 0.23419 3.140 0.0127


The compact letter display (cld) is interpreted in the usual fashion.

A plot of the results can be make in the usual fashion (not shown).


In an analogous fashion, we proceed by finding the empirical logit of the proportion of mosquitos thatchoose the treatment trap in each replicate and then perform a block analysis analysis on the logits

If the mosquitos were choosing the traps simply at random, we would expect that p = 0.5 or thelogit = 0.0 for each marginal logit Hence our hypothesis is now

H0 : µlogit = 0.0

HA : µlogit 6= 0.0

for each marginal logit.



We again fit the block model to the logits to do the comparison of the marginal logits for all three oilssimultaneously and to compare the efficacy of the three oils: The lmer() function is used to fit the blockmodel with random blocks model, followed by the emmeans() function in the emmeans package to testif the marginal “logits” are equal to 0. We can also back transform the estimates on the logit scale.

# The block analysis on the logit.treatments needs to treat the# blocks (day) as random effects. This is more efficient than# a separate analysis by treatment (above), because# the analysis of all of the treatments simultaneously# provides better information on the variance.res.logit2 <- lmer( logit.treatment ~ trt + (1|Day), data=mos3)

# We now compute the emmeans and test if these equal 0.5# Estimates/se are the same as above, but ci are narrower and pvalues smallerres.logit2.emmo <- emmeans::emmeans(res.logit2, ~ trt)res.logit2.logits <- summary(res.logit2.emmo, infer=TRUE, null=0.)res.logit2.logits$phat <- expit(res.logit2.logits$lsmean)res.logit2.logits$pse <- res.logit2.logits$SE*res.logit2.logits$phat*

(1-res.logit2.logits$phat)res.logit2.logits$plcl <- expit(res.logit2.logits$lower.CL)res.logit2.logits$pucl <- expit(res.logit2.logits$upper.CL)cat("*** RECOMMENDED *** Adjusts for incomplete blocks and extracts intra- and inter-block information \n")res.logit2.logits

giving

*** RECOMMENDED *** Adjusts for incomplete blocks and extracts intra- and inter-block informationtrt lsmean SE df lower.CL upper.CL t.ratio p.value phat pse plcl puclT1 0.4801399 0.1593132 25.59 0.15241028 0.8078696 3.014 0.0058 0.6177809 0.03761824 0.5380290 0.6916553T2 0.8928238 0.1593132 25.59 0.56509410 1.2205534 5.604 <.0001 0.7094726 0.03283783 0.6376304 0.7721609T3 0.2946268 0.1888357 31.83 -0.09009779 0.6793515 1.560 0.1286 0.5731285 0.04619907 0.4774908 0.6635939

Confidence level used: 0.95

Notice that not all of the marginal “logits” all have the same standard error, again because they usedifferent sets of blocks. Once the block analysis is fit, it is then straight forward to compare the efficacyof the oils in the usual way, i.e. using a multiple comparison procedure: We use the cld() function in theemmeans package to compare the marginal “logits” (i.e. the marginal proportions) and to compute thepairwise differences:

# We can now do comparison between the various treatments in the# usual way using the emmeans package and# using the emmeans::CLD() and pairs() functions.res.logit2.cld <- emmeans::CLD(res.logit2.emmo)res.logit2.cld

# Here is the plotres.logit2.plot <- sf.cld.plot.bar(res.logit2.cld, "trt")+

ggtitle("Comparisons of attractivness of oils on logit scale")+xlab("Treatment")+ylab("logit(P( attracted to bait))")



res.logit2.plot

# all the pairwide differences. The anti-logs are odds ratiosres.logit2.pairs <- pairs(res.logit2.emmo)res.logit2.pairs <- summary(res.logit2.pairs, infer=TRUE)res.logit2.pairs$OR <- exp(res.logit2.pairs$estimate)res.logit2.pairs$OR.lcl <- exp(res.logit2.pairs$lower.CL)res.logit2.pairs$OR.ucl <- exp(res.logit2.pairs$upper.CL)res.logit2.pairs

giving

trt emmean SE df lower.CL upper.CL .groupT3 0.2946268 0.1888357 31.83 -0.09009779 0.6793515 1T1 0.4801399 0.1593132 25.59 0.15241028 0.8078696 12T2 0.8928238 0.1593132 25.59 0.56509410 1.2205534 2

Degrees-of-freedom method: kenward-rogerConfidence level used: 0.95P value adjustment: tukey method for comparing a family of 3 estimatessignificance level used: alpha = 0.05contrast estimate SE df lower.CL upper.CL t.ratio p.valueT1 - T2 -0.4126838 0.1645216 21.31 -0.8269206 0.001552983 -2.508 0.0510T1 - T3 0.1855131 0.1902558 21.79 -0.2927500 0.663776174 0.975 0.5999T2 - T3 0.5981969 0.1902558 21.79 0.1199338 1.076459999 3.144 0.0127

OR OR.lcl OR.ucl0.6618715 0.4373941 1.0015541.2038360 0.7462087 1.9421121.8188363 1.1274223 2.934274



A plot of the results can be make in the usual fashion (not shown).




This is done in R using the glm() function but with the quasibinomial family selected. Notice thatwe fit a block model. Because the default of R is to perform Type I tests (incremental tests) we needto ensure that the trt is last in the model and/or use the contrasts argument to set the contrast matrix



and then use the ANOVA function in the car package. Because the fit takes place on the logit scale, thehypothesis is that the marginal logits are 0 and then we compare the oils in the usual fashion. Note thatthe emmeans package gives you the option of doing the back transformation automatically.

res.quasilogit <- glm(p.treatment ~ Day + trt, data=mos3,weight=total.mos, family=quasibinomial(link=logit),contrasts=list(Day=contr.sum, trt=contr.sum))

# get the marginal means (on the logit scale) and the probability scaleres.quasilogit.emmo <- emmeans::emmeans(res.quasilogit, ~ trt)summary(res.quasilogit.emmo, infer=TRUE)summary(res.quasilogit.emmo, type="response")

giving:

trt emmean SE df asymp.LCL asymp.UCL z.ratio p.valueT1 0.4982021 0.1118172 Inf 0.27904447 0.7173598 4.456 <.0001T2 0.8955512 0.1170990 Inf 0.66604133 1.1250611 7.648 <.0001T3 0.3369068 0.1449502 Inf 0.05280973 0.6210039 2.324 0.0201

Results are averaged over the levels of: DayResults are given on the logit (not the response) scale.Confidence level used: 0.95trt prob SE df asymp.LCL asymp.UCLT1 0.6220367 0.02628901 Inf 0.5693119 0.6720254T2 0.7100344 0.02410900 Inf 0.6606162 0.7549263T3 0.5834390 0.03522839 Inf 0.5131994 0.6504468

Results are averaged over the levels of: DayConfidence level used: 0.95Intervals are back-transformed from the logit scale

This analysis is not completely satisfactory because Day is treated as a fixed effect and so the estimatedmarginal logits will tend to have reported standard errors that are too small. See me for additional details.

Note that different packages may give slight different values for the p-values for the marginal logitsdepending on the way the overdispersion factor is computed (e.g. using the deviance or the Pearsonchi-square) and the way the p-value is computed (e.g. using quasilikelihood or a Wald statistic). All areasymptotically equivalent but can differ in small sample sizes – contact me for more details.

Once the block analysis is fit, it is then straight forward to compare the efficacy of the oils in the usualway, i.e. using a multiple comparison procedure: We use the cld() function in the emmeans package tocompare the marginal “logits” (i.e. the marginal proportions) and to compute the pairwise differences:

# Multiple comparison between the treatmentsAnova(res.quasilogit, test="F", type=3)res.quasilogit.cld <- emmeans::CLD(res.quasilogit.emmo)res.quasilogit.cld

# Here is the plotres.quasilogit.plot <- sf.cld.plot.bar(res.quasilogit.cld, "trt")+



ggtitle(paste("Comparisons of attractivness of oils \n using"," quasi-binomial model and incomplete blocks", sep=’’))+

xlab("Treatment")+ylab("logit Proportion attracted to bait")res.quasilogit.plot

# all the pairwide differencesres.quasilogit.pairs <- pairs(res.quasilogit.emmo)summary(res.quasilogit.pairs, infer=TRUE)summary(res.quasilogit.pairs, infer=TRUE,type="response")

giving

Analysis of Deviance Table (Type III tests)

Response: p.treatmentError estimate based on Pearson residuals

SS Df F Pr(>F)Day 80.733 14 3.3388 0.007066 **trt 19.754 2 5.7185 0.010861 *Residuals 34.543 20---Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1trt emmean SE df asymp.LCL asymp.UCL .groupT3 0.3369068 0.1449502 Inf 0.05280973 0.6210039 1T1 0.4982021 0.1118172 Inf 0.27904447 0.7173598 1T2 0.8955512 0.1170990 Inf 0.66604133 1.1250611 2

Results are averaged over the levels of: DayResults are given on the logit (not the response) scale.Confidence level used: 0.95Results are given on the log odds ratio (not the response) scale.P value adjustment: tukey method for comparing a family of 3 estimatessignificance level used: alpha = 0.05contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.valueT1 - T2 -0.3973491 0.1579826 Inf -0.7676129 -0.02708524 -2.515 0.0319T1 - T3 0.1612953 0.1746503 Inf -0.2480327 0.57062336 0.924 0.6254T2 - T3 0.5586444 0.1783210 Inf 0.1407133 0.97657545 3.133 0.0049

Results are averaged over the levels of: DayResults are given on the log odds ratio (not the response) scale.Confidence level used: 0.95Conf-level adjustment: tukey method for comparing a family of 3 estimatesP value adjustment: tukey method for comparing a family of 3 estimatescontrast odds.ratio SE df asymp.LCL asymp.UCL z.ratio p.valueT1 / T2 0.6720994 0.1061800 Inf 0.4641196 0.9732783 -2.515 0.0319T1 / T3 1.1750319 0.2052197 Inf 0.7803344 1.7693697 0.924 0.6254T2 / T3 1.7483009 0.3117588 Inf 1.1510946 2.6553473 3.133 0.0049

Results are averaged over the levels of: DayConfidence level used: 0.95Conf-level adjustment: tukey method for comparing a family of 3 estimatesIntervals are back-transformed from the log odds ratio scale



P value adjustment: tukey method for comparing a family of 3 estimatesTests are performed on the log odds ratio scale


A plot of the results can be make in the usual fashion as well (but not shown).

The estimated overdispersion factor is

# Overdispersion factor iscat("Overdispersion factor ", summary(res.quasilogit)$dispersion, "\n")

giving

Overdispersion factor 1.727161

indicating that standard errors from a simple logistic fit must be adjusted by the square root of the factor.This was automatically done above.

This analysis is not completely satisfactory because Day is treated as a fixed effect and so the esti-mated marginal logits will tend to have reported standard errors that are too small. See me for additionaldetails.

21.7.4 GLIMM modeling

Finally, a generalized linear mixed model logistic ANOVA can be done that explicitly models the randomeffects of days and enclosures (the combination of day and cage) directly. This analysis will automati-cally include the inter- and intra-block information. This is done using the glmer() function in the lme4package. We fit a model with trt to compare the marginal logits to 0 and to compare the efficacy amongthe different oils


data=mos3, weight=mos3$total.mos,family=binomial(link=logit))



logit(p) = trt + Day(R) + Experiment(R)





θi =trt + Day(R) + Experiment(R)

where the Experiment term represent the combination of day and cage. You should not use Cage as therandom effect because cage 1 on day 1 is not necessarily the same “cage” on day 2, despite it havingthe same physical location etc. Perhaps something happened to the cage on day 2 that made it subtlydifferent from the same cage on the previous day.

This gives the estimated marginals and comparisons to 0 on both the logit and probability scale.

trt emmean SE df asymp.LCL asymp.UCL z.ratio p.valueT1 0.4577048 0.1473787 Inf 0.16884782 0.7465618 3.106 0.0019T2 0.8725659 0.1500167 Inf 0.57853861 1.1665932 5.816 <.0001T3 0.2895170 0.1706528 Inf -0.04495631 0.6239904 1.697 0.0898

Results are given on the logit (not the response) scale.Confidence level used: 0.95trt prob SE df asymp.LCL asymp.UCLT1 0.6124695 0.03498043 Inf 0.5421120 0.6784291T2 0.7052793 0.03118253 Inf 0.6407311 0.7625287T3 0.5718779 0.04178154 Inf 0.4887628 0.6511256


The usual comparison among treatments is made


data=mos3, weight=mos3$total.mos,family=binomial(link=logit))


giving

trt emmean SE df asymp.LCL asymp.UCL z.ratio p.valueT1 0.4577048 0.1473787 Inf 0.16884782 0.7465618 3.106 0.0019T2 0.8725659 0.1500167 Inf 0.57853861 1.1665932 5.816 <.0001T3 0.2895170 0.1706528 Inf -0.04495631 0.6239904 1.697 0.0898

Results are given on the logit (not the response) scale.Confidence level used: 0.95trt prob SE df asymp.LCL asymp.UCL



T1 0.6124695 0.03498043 Inf 0.5421120 0.6784291T2 0.7052793 0.03118253 Inf 0.6407311 0.7625287T3 0.5718779 0.04178154 Inf 0.4887628 0.6511256


The p-values may differ slightly between packages depending a likelihood ratio or Wald test is used(contact me for details).

The difference of logits corresponds to the log(OddsRation) of the proportion choosing the treat-ment trap vs the control trap between two different oils. A back transformation can give estimates ofthe odds ratio (and confidence limits) directly. Notice that you do NOT take the anti-log of the se of thelog(odds ratio) to get the se of the odds-ratio, but must do a delta-method transformation. Contact mefor details.

The estimated variance components (on the logit scale) are found as:

Estimated variance componentsGroups Name Std.Dev.Experiment (Intercept) 0.24024Day (Intercept) 0.39924



As long as the number of mosquitos that select one of the traps is roughly the same among all of theenclosures, the analysis of the empirical logits using a incomplete block analysis is simple and relativelystraight forward to do with modern statistical software. The key to this model is the ability to declareblocks as a random factor to ensure that the individual comparisons of the marginal logits to 0 capturesthe true extent of the uncertainty and to capture the intra- and inter-block information.

If the number of mosquitos that makes a choice is highly variable and/or the empirical probabilitiesof selection are close to 0 or 1, then the more advanced methods will be preferred. Note that if some ofthe cages has 100% or 0% of mosquitos choosing one of the traps, then many of the methods will failbecause the empirical logit are ±∞ respectively. In such cases, Bayesian methods can often work quitewell.


logistic regression - advanced...

Documents