
On the Robustness of Dirichlet-multinomial Regression in the Context of Modeling Pollination Networks

by

Catherine Crea

A Thesis presented to

The University of Guelph

In partial fulfilment of requirements for the degree of Master of Science

in Mathematics and Statistics

Guelph, Ontario, Canada

© Catherine Crea, December, 2011

ABSTRACT

ON THE ROBUSTNESS OF DIRICHLET-MULTINOMIAL REGRESSION IN

THE CONTEXT OF MODELING POLLINATION NETWORKS

Catherine Crea
University of Guelph, 2011

Advisor: Professor A. Ali

Recent studies have suggested that the structure of plant-pollinator networks is driven by two opposing theories: neutrality and linkage rules. However, relatively few studies have tried to exploit both of these theories in building pollination webs. This thesis proposes Dirichlet-Multinomial (DM) regression to model plant-pollinator interactions as a function of plant-pollinator characteristics (e.g. complementary phenotypic traits), for evaluating the contribution of each process to network structure. DM regression models first arose in econometrics for modeling consumers’ choice behaviour. Further, this thesis (i) evaluates the robustness of DM regression to misspecification of dispersion structure, and (ii) compares the performance of DM regression to grouped conditional logit (GCL) regression through simulation studies. Results of these studies suggest that DM regression is a robust statistical method for modeling qualitative plant-pollinator interaction networks and outperforms the GCL regression when data are indeed over-dispersed. Finally, using DM regression seems to significantly improve model fit.


Acknowledgements

First and foremost, I would like to thank my advisor, Dr. Ayesha Ali, for

her expertise, support, and patience throughout the course of my research. Her

ability to be both a teacher and a mentor has given me the skills and confidence

necessary to complete this thesis. I appreciate all her contributions of time, ideas,

and advice to make my master's experience productive and stimulating.

I would like to thank NSERC-CANPOLIN for providing the funding for this

research. A special thank you to Dr. Peter Kevan, principal investigator for

CANPOLIN, Dr. Tom Woodcock, research associate for CANPOLIN, and Dr.

Sarah Bates, network manager for CANPOLIN. I appreciate the time you spent

providing input and feedback throughout the development of this research; your interest and attention were also most encouraging.

I would like to thank Dr. Gary Umphrey for being on my advisory committee.

Not only am I grateful for his technical insight and thoughtful input with respect

to this thesis, but also for being one of the most passionate professors in the

department. Through his courses, I gained a solid grasp of the fundamentals of

Statistics, which afforded me the skills necessary to pursue a master's. Also, I am

thankful to all the members of the Department of Mathematics and Statistics,

whether it be the professors, administrative staff, or student colleagues, you have

all made my graduate experience challenging, enjoyable and unforgettable.

It is difficult to overstate my gratitude to my employer, Geosyntec Consultants,

for supporting me both financially and professionally during my post-graduate

studies. Being surrounded by such brilliant and remarkable professionals has given

me the motivation to pursue a higher level of expertise in my field of study.

Lastly, this thesis would not have been possible without the love and support


of my family and friends. My sisters, Mary and Carm, encouraged, supported,

guided, and understood me at every moment and I am forever indebted to them

for giving me the strength to persevere. My brother, Vince, is my heart and soul

and without him I would not be the determined person I am. His extraordinary

will to live and continuous resiliency to overcome any illness will forever inspire

me and keep me grounded. Finally, my friends kept me sane and laughing during

all the stages of my thesis and for that I am so appreciative.

Table of Contents

1 Introduction
2 Pollination Networks
    2.1 Definition
    2.2 Network Patterns
    2.3 Network Processes
    2.4 Network Notation
    2.5 Previous Pollination Network Studies
3 Dirichlet-Multinomial Regression
    3.1 Multinomial Responses
    3.2 Random Utility Model
    3.3 Grouped Conditional Logit
    3.4 Dirichlet-Multinomial Model
        3.4.1 Additional Parameterizations and Considerations
4 Design of Simulation Study
    4.1 Description of Simulation Study A
    4.2 Description of Simulation Study B
    4.3 Data Generation
    4.4 Model Fitting and Summary Statistics
5 Results
    5.1 Simulation Study A Results
        5.1.1 Model Convergence
        5.1.2 Estimation of β
        5.1.3 Model Fit and Estimated Dispersion
    5.2 Simulation Study B Results
        5.2.1 Model Convergence
        5.2.2 Estimation of β
        5.2.3 Model Fit and Estimated Dispersion
6 Discussion
7 Conclusions
    7.1 Future Work
Bibliography
A Derivation of Conditional Logit Model
B Supplementary Tables for Simulation Study A


List of Tables

4.1 Simulation Study A
4.2 Simulation Study B
4.3 Dispersion structures
5.1 Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 57 x 23.
5.2 Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of δ_g.
5.3 Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of ρ_g.
5.4 Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM model in terms of δ_g.
5.5 Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of ρ_g.
5.6 Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 57 x 23.
5.7 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.36, 1.88) and network size 57 x 23.
5.8 Percentage of samples that reached convergence for β = (1.2, −0.4, 0.1)
5.9 Percentage of samples that reached convergence for β = (1.1, 0.8, 2.3)
5.10 Percent relative bias of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM model in terms of δ_g.
5.11 Percent relative bias of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM model in terms of ρ_g.
5.12 Percent relative bias of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of δ_g.
5.13 Percent relative bias of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of ρ_g.
5.14 Percent coefficient of variation of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM model in terms of δ_g.
5.15 Percent coefficient of variation of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM model in terms of ρ_g.
5.16 Percent coefficient of variation of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of δ_g.
5.17 Percent coefficient of variation of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of ρ_g.
β = (1.2, −0.4, 0.1) and network size 49 x 18.
β = (1.1, 0.8, 2.3) and network size 49 x 18.
5.20 Percentage of χ² p-values < 0.05 for samples generated with β = (1.2, −0.4, 0.1) and network size 49 x 18.
5.21 Percentage of χ² p-values < 0.05 for samples generated with β = (1.1, 0.8, 2.3) and network size 49 x 18.
sion parameters with β = (1.2, −0.4, 0.1) and network size 49 x 18.
sion parameters with β = (1.1, 0.8, 2.3) and network size 49 x 18.

B.1 Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88)
B.9 Median negative log-likelihood values for samples generated with
β = (2.78, 2.35, 1.88) and network size 105 x 76.
B.17 Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 90 x 54.
B.20 Monte Carlo bias of β for β = (3.21, 3.85, 3.50) and network size 105 x 76.
B.24 Monte Carlo bias of β for β = (3.92, 4.54, 3.52) and network size 57 x 23.
B.25 Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 90 x 54.
B.33 Percent relative bias and percent coefficient of variation for dispersion parameters

List of Figures

2.1 Pollination web depicted as a bipartite graph
3.1 Graphical model of DM regression: Y contains the observed counts; P contains the corresponding interaction probabilities, which are a function of the multi-dimensional array X, consisting of k observable covariate matrices, and the corresponding β; δ is an over-dispersion parameter
3.2 Graphical model of Stata’s parameterization of DM regression: Y contains the observed counts; P contains the corresponding interaction probabilities, which are a function of $\lambda^* = e^{\gamma' x^*} = e^{\beta' x_{gj} - \psi' x_g}$

Chapter 1

Introduction

This thesis contributes to the area of multivariate statistics and provides a

study of the robustness of multinomial-based regression models to misspecifica-

tion of dispersion structure, within an ecology context. In particular, Dirichlet-

multinomial (DM) regression is proposed to model the interaction probabilities of

plant-pollinator mutualistic networks, and the robustness of the DM model with

respect to over-dispersion is investigated through simulation studies.

To date, few studies have evaluated the simultaneous contributions of the processes driving the structural patterns of pollination networks. Vazquez et al. (2009b) provide a conceptual framework for evaluating multiple factors in determining network structure, and within the past couple of years others have adopted a similar framework (Santamaría and Rodríguez-Gironés, 2007; Allesina et al., 2008; Stang et al., 2009). However, the statistical approaches that have been used in practice are typically simplistic, such as χ² tests for observed proportions. In fact, none of these studies determined the relative contribution of these factors.


DM regression is a cutting-edge technique that arose in econometrics to model consumers’ choice behaviour and that may provide a flexible model for estimating the true contribution of various factors to observed plant-pollinator interactions (Guimaraes and Lindrooth, 2007). Hence, this thesis represents an effort to extend Vazquez’s conceptual framework and apply state-of-the-art methods from econometrics to a pollination context. Finally, the DM framework for pollination is evaluated through simulation studies.

Mutualisms among organisms are found at the core of any ecosystem; therefore, they play a key role in ecology. Among the most commonly studied mutualisms are those between plants and pollinators. Pollinators are responsible for the sexual reproduction of flowering plants, seed production and fruit production. In fact, humans rely on pollination for about one third of the food they eat (Kearns et al., 1998). Unfortunately, pollination networks are constantly under threat due to anthropogenic activities, such as agricultural practices, change in land use, and habitat loss. Since plants and pollinators are interdependent, the extinction of one results in the extinction of the other. Thus their conservation and management are crucial for the Earth’s biodiversity.

Ecologists and evolutionary biologists have become increasingly interested in

the study of plant and pollinator interactions within a community context in order

to understand the implications of these mutualisms (Vazquez et al., 2009b). As

such, a network approach has been adopted to gain insight on the structure of

plant-pollinator mutualisms. These networks may consist of hundreds of species

that form highly complex and heterogeneous ecosystems (Bascompte and Jordano,

2007). Recent work has revealed that these networks share common structural

patterns, such as the nested organization of pairwise interactions and the skewed


distribution of links per species (Vazquez et al., 2009a). It is believed that these

structural patterns are being driven by both evolutionary and ecological processes,

which are summarized by the theories of neutrality (random interactions) and

linkage rules (trait matching).

Evidence suggests that both theories contribute to the organization of network

structure; therefore, there are multiple determinants that influence the probability

that a given pollinator species will interact with a given plant species. Vazquez et

al. (2009b) developed a conceptual framework which states that the observed in-

teractions of a given network are assumed to be distributed multinomial and are a

function of probability matrices derived from relative abundance, spatio-temporal

overlap, phenotypic traits, phylogenetic signal and sampling effects. They in-

vestigated the extent to which relative abundance and spatio-temporal overlap

predicted network structure using the multinomial likelihood calculated from the

probability matrices and observed counts from real-world networks. Others (Santamaría and Rodríguez-Gironés, 2007; Allesina et al., 2008; Stang et al., 2009)

have adopted a similar conceptual framework in order to evaluate the extent to

which these processes, namely, trait complementarity, describe network topology.

Although these studies have confirmed that these factors do contribute to network

structure, they fall short of quantifying the relative contribution of each factor to

the interaction probabilities.

In order to evaluate the relative contribution of the mechanisms driving net-

work structure, a more sophisticated statistical approach is needed. Hence, the

motivation of this thesis is to exploit the theories of neutrality and forbidden links

in one comprehensive model.

Multinomial logistic regression is often used in econometrics to model indi-


vidual choice behaviour (Hausman et al., 1984; McFadden, 1974; Shonkwiler and

Hanley, 2003). Analogous to consumers choosing among a set of brand products,

pollinator species choose among a set of plant species. As such, according to the

random utility hypothesis, pollinators assign a level of utility to every plant species

and choose the one that provides the maximum utility. McFadden (1974) derived

a conditional logit (CL) model from the random utility model which incorporates

factors that include characteristics of the individuals, or pollinators, and/or the

attributes of the choices, or plants. DM regression is an extension of McFadden’s

CL model which allows for an extra level of variability that is consumer (pollinator

species) specific (Guimaraes and Lindrooth, 2007). Since plant-pollinator mutual-

istic networks are known to be heterogeneous, this extra unobserved heterogeneity

accounts for the possibility of over-dispersion in the observed network counts.

As part of this thesis, simulation studies were conducted to evaluate the ro-

bustness of DM regression to misspecification of dispersion structure, and to com-

pare the performance of DM regression to grouped conditional logit (GCL) re-

gression. The GCL model is a special case of the DM model which assumes no

over-dispersion. Plant-pollinator mutualistic networks of varying size and disper-

sion structures were simulated according to the DM regression model to conduct

the analysis.
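As a rough illustration of the over-dispersion referred to above, the following sketch draws visit counts for a single pollinator species from a multinomial distribution and from a Dirichlet-multinomial distribution with the same mean probabilities. The parameterization (Dirichlet parameters equal to the mean probabilities multiplied by a precision constant of 2), the probabilities, the number of visits and the seed are assumptions chosen for illustration; they are not the settings of the simulation studies in this thesis.

```python
import random

def sample_multinomial(rng, n, p):
    """Draw one vector of counts for n visits with choice probabilities p."""
    counts = [0] * len(p)
    for j in rng.choices(range(len(p)), weights=p, k=n):
        counts[j] += 1
    return counts

def sample_dirichlet_multinomial(rng, n, alpha):
    """Draw probabilities from a Dirichlet(alpha), then counts given them."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return sample_multinomial(rng, n, [g / total for g in gammas])

def variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(2011)            # arbitrary seed (assumption)
p = [0.5, 0.3, 0.2]                  # hypothetical mean probabilities
alpha = [2.0 * pj for pj in p]       # hypothetical precision of 2

# Counts for the first plant species over 2000 simulated pollinators.
mult_counts = [sample_multinomial(rng, 50, p)[0] for _ in range(2000)]
dm_counts = [sample_dirichlet_multinomial(rng, 50, alpha)[0] for _ in range(2000)]

print(variance(mult_counts), variance(dm_counts))
```

Under these assumed settings the sample variance of the Dirichlet-multinomial counts should be considerably larger than that of the multinomial counts, which is the kind of extra variability that a model without a dispersion parameter does not accommodate.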

This thesis provides a summary of the motivation, objective and results of

the research on DM regression in the context of pollination networks. Chapter 2

provides general background on pollination networks, including the patterns and

processes driving network structure, and a review of previous pollination studies

that have stimulated this research. Chapter 3 reviews the DM regression model

through the methodologies adapted from econometrics. Chapter 4 provides a description of the simulation studies and Chapter 5 presents their results. A discussion of the results is provided in Chapter 6, and conclusions and future work are discussed in Chapter 7.


Chapter 2

Pollination Networks

This chapter provides a general overview of plant-pollinator mutualistic net-

works. Section 2.1 defines a pollination network and shows how a pollination web

can be depicted as a bipartite graph. Section 2.2 presents the network statistics

used to uncover the common structural patterns that characterize these networks,

while Section 2.3 presents the underlying mechanisms driving the network pat-

terns. Section 2.4 introduces the notation used to describe pollination networks

throughout this thesis and Section 2.5 summarizes the previous network studies

that have motivated the work presented in this thesis.

2.1 Definition

A pollination network is a graph consisting of nodes that represent plants

and pollinators, and undirected edges that represent the mutualistic interactions

between plant and pollinator species within a given ecosystem. These networks

are considered bipartite networks since interactions occur only between plants and


pollinators and not within plants or pollinators. In other words, pollinator species

do not interact with other pollinator species and plant species do not interact

with other plant species. An example of a pollination network is shown in Figure

2.1. The pollinator species are represented by purple nodes and the plant species

are represented by green nodes. An edge connecting a pollinator node to a plant

node represents an interaction or link between those two species. Essentially, the

graph gives a snapshot of which pollinator species are interacting with which plant

species.

Figure 2.1: Pollination web depicted as a bipartite graph

Due to the interdependent relationship between these plants and pollinators,

it quickly becomes clear that as the size of the network increases, the complexity

of the network also increases. The study of plant and pollinator communities pro-

vides an understanding of the underlying patterns of these complex networks and

the processes - both ecological and evolutionary - that are driving these network

patterns.


2.2 Network Patterns

Network statistics are metrics calculated from an observed real-world network

which aim at quantifying its topological features. Examples of these statistics

include:

• connectance – proportion of links that are actually realized (Jordano, 1987);

• species degree – number of different species to which a specific species is

linked (Jordano et al., 2003); and

• interaction strength or dependence – an estimate of the extent to which one

species depends on another species, typically approximated by interaction

frequency (Vazquez et al., 2005; Bascompte et al., 2006).

Network statistics provide a means of uncovering the underlying structural pat-

terns of a given network.
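The first two statistics listed above can be computed directly from a matrix of observed counts. The sketch below assumes a small, made-up count matrix with pollinator species in rows and plant species in columns; the function names are illustrative and do not come from any particular network package.

```python
# Toy count matrix: rows are pollinator species, columns are plant species.
# The values are invented for illustration only.
counts = [
    [5, 0, 2],
    [0, 0, 1],
    [3, 1, 0],
    [0, 4, 0],
]

def connectance(y):
    """Proportion of possible pollinator-plant links that are realized."""
    realized = sum(1 for row in y for c in row if c > 0)
    return realized / (len(y) * len(y[0]))

def pollinator_degrees(y):
    """Number of distinct plant species visited by each pollinator species."""
    return [sum(1 for c in row if c > 0) for row in y]

def plant_degrees(y):
    """Number of distinct pollinator species visiting each plant species."""
    return [sum(1 for row in y if row[j] > 0) for j in range(len(y[0]))]

print(connectance(counts))          # 6 realized links out of 12 possible
print(pollinator_degrees(counts))
print(plant_degrees(counts))
```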

Extensive studies of plant-pollinator networks reveal that, regardless of the geo-

graphical origins, mutualistic networks share common structural patterns. Vazquez

et al. (2009a) summarize the topological features of plant-pollinator networks as

follows:

• Connectance is typically low in mutualistic networks.

• There are often many more pollinator species than there are plant species.

• The species degree distribution is skewed: many species have few links (these

species are considered specialists) and few species have many links (these

species are considered generalists).


• Mutualistic networks tend to be: nested – specialists tend to interact with

a subset of generalists (Bascompte et al., 2003), and compartmentalized –

clearly defined groups of species that have many intragroup links and few

intergroup links (Olesen et al., 2007).

• Most interactions are asymmetric in that specialists tend to interact with

generalists.

These structural patterns suggest that pollination networks are very heteroge-

neous; therefore, there may be multiple determinants that play a role in the orga-

nization of the network structure.

2.3 Network Processes

The topological features of these complex networks are driven by ecological

and evolutionary mechanisms which can be explained by two opposing theories:

neutrality and linkage rules. Neutrality states that all individuals interact ran-

domly such that all plant-pollinator pairs have the same probability of interacting.

As a result, individuals interact proportionately to their relative abundance, i.e.,

more abundant species interact more frequently than rare species (Vazquez et al.,

2009b).
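One simple way to express neutrality numerically is to make each interaction probability proportional to the product of the relative abundances of the two species. This product form is an assumption adopted here for illustration, and the abundances below are hypothetical.

```python
def neutral_probabilities(pollinator_abundance, plant_abundance):
    """Interaction probabilities proportional to the product of abundances.

    Returns a matrix whose entries sum to one over the whole network.
    """
    total = sum(pollinator_abundance) * sum(plant_abundance)
    return [[a * b / total for b in plant_abundance]
            for a in pollinator_abundance]

# Hypothetical abundances for 2 pollinator species and 3 plant species.
P_neutral = neutral_probabilities([10, 30], [1, 3, 4])

for row in P_neutral:
    print(row)
```

In this toy network the most abundant pollinator and the most abundant plant receive the largest interaction probability, as neutrality would suggest.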

Linkage rules arise from trait matching, namely, trait complementarity and

exploitation barriers. Both rules prevent the occurrence of certain interactions,

known as forbidden links (Santamaría and Rodríguez-Gironés, 2007). An example

of a complementarity trait would be a plant’s nectar concentration matching a

pollinator’s concentration preference. Similarly, an example of an exploitation

barrier would be a pollinator’s proboscis length being long enough to forage a


plant’s corolla tube (Bascompte et al., 2003). Other evolutionary processes, such

as neutral evolution and phylogenetic signal, may also have a causal influence on

trait matching (Vazquez et al., 2009a). This thesis attempts to exploit both of

these theories simultaneously in one comprehensive model.
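The exploitation-barrier example above can be sketched as a binary matrix of permitted links. The rule used here (a link is permitted when the proboscis is at least as long as the corolla tube is deep) and the trait values are assumptions made for illustration only.

```python
def allowed_links(proboscis_lengths, corolla_depths):
    """Return 1 where a pollinator can reach the nectar of a plant, else 0.

    A zero entry corresponds to a forbidden link under this assumed rule.
    """
    return [[1 if p >= c else 0 for c in corolla_depths]
            for p in proboscis_lengths]

# Hypothetical trait values (same length unit for both traits).
allowed = allowed_links([4.0, 9.0], [3.0, 6.0, 12.0])
print(allowed)
```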

It is worth noting that although these mechanisms influence the true network

structure, these mechanisms also affect the observed network structure through

sampling effects. The discrepancy between the true network and the observed

network is an artifact of sampling bias. An example of a sampling effect would be

observation error: interactions between certain plants and pollinator species occur

but are not recorded because they are not observed during the sampling times.

2.4 Network Notation

For the purpose of this thesis, consider a plant-pollinator network with G

pollinator species, J plant species, and K traits (or covariates). Let Y denote

the matrix of observed counts where ygj is the observed number of plant visits

between pollinator g and plant j. Then we have,

\[
Y = \begin{pmatrix}
y_{11} & \cdots & y_{1J} \\
\vdots & \ddots & \vdots \\
y_{G1} & \cdots & y_{GJ}
\end{pmatrix}.
\]

The total number of observed interactions ng for pollinator species g is the sum

of the gth row of Y , and the total number of observed interactions in the network


N is the sum of all counts, given by

\[
N = \sum_{g=1}^{G} \sum_{j=1}^{J} y_{gj}. \tag{2.1}
\]
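The row totals and the network total defined above amount to simple sums over the count matrix. A minimal sketch, using an invented matrix with G = 3 pollinator species and J = 3 plant species:

```python
# Hypothetical observed counts: Y[g][j] is the number of visits by
# pollinator species g to plant species j.
Y = [
    [5, 0, 2],
    [0, 0, 1],
    [3, 1, 0],
]

n = [sum(row) for row in Y]   # n_g: total interactions for pollinator g
N = sum(n)                    # N: total interactions in the network, Eq. (2.1)

print(n, N)
```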

Let Xk denote the matrix for observable covariate k, where xkgj is the observed

covariate k value for pollinator g and plant j. Then,

\[
X_k = \begin{pmatrix}
x_{k11} & \cdots & x_{k1J} \\
\vdots & \ddots & \vdots \\
x_{kG1} & \cdots & x_{kGJ}
\end{pmatrix}, \quad k = 1, \ldots, K.
\]

Let probability matrix P contain the interaction probabilities corresponding to the

observed counts in Y . It is assumed that the probabilities in P are being driven

by the covariates, or traits, in Xk. In other words, the interaction probabilities are

a function of multiple factors contained in Xk, k = 1, . . . , K. This thesis considers

DM regression, in which the interaction probabilities are modeled through a logit link function applied to a linear combination of these covariates.
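To make the role of the covariates concrete, the sketch below computes interaction probabilities from covariate matrices and a coefficient vector, assuming the conditional logit form in which, for each pollinator species g, the probability of plant j is exp(Σ_k β_k x_kgj) divided by the sum of the same quantity over all plants. This normalization within each pollinator species is an assumption made here; the covariate values and coefficients are invented.

```python
import math

def interaction_probabilities(X, beta):
    """Conditional-logit probabilities for each pollinator species.

    X[k][g][j] is the value of covariate k for pollinator g and plant j;
    beta[k] is the coefficient of covariate k. Each row of the returned
    matrix sums to one.
    """
    G, J = len(X[0]), len(X[0][0])
    P = []
    for g in range(G):
        eta = [sum(beta[k] * X[k][g][j] for k in range(len(beta)))
               for j in range(J)]
        m = max(eta)                          # subtract max for stability
        w = [math.exp(e - m) for e in eta]
        s = sum(w)
        P.append([v / s for v in w])
    return P

# One hypothetical covariate, one pollinator species, two plant species.
X_demo = [[[0.0, math.log(2.0)]]]
P_demo = interaction_probabilities(X_demo, [1.0])
print(P_demo)   # the second plant is twice as likely as the first
```

With all coefficients set to zero the function returns equal probabilities for every plant, which gives a quick check that the covariates are the only source of differences between plants in this sketch.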

2.5 Previous Pollination Network Studies

As discussed in Section 2.3, due to the evolutionary and ecological processes

driving network pattern, multiple determinants must be considered in evaluat-

ing network structure. Several studies have attempted to evaluate many factors

simultaneously. Vazquez et al. (2009b) proposed a conceptual framework for

modeling a pollination web as a function of several factors. They attempted to

evaluate space, time and relative abundance using a likelihood-based approach.


They assumed that the counts in Y are distributed multinomial and calculated

the corresponding interaction probability matrices based on all possible combina-

tions of time, space and relative abundance. However, for this analysis, Vazquez

et al. (2009b) created probability models by multiplying binary covariate matri-

ces and normalizing the product matrices so that their elements summed to one.

The observed counts and those expected under each of the probability models

were compared by calculating the corresponding likelihoods and AIC values. The

models containing more than one factor proved to best predict network structure.

A similar likelihood approach was used by Allesina et al. (2008) in a food web context. Multiplying and normalizing covariate matrices is a very rudimentary approach for modeling interaction probabilities as a function of multiple determinants. In this thesis, Vazquez’s framework is extended so that the probabilities are modeled as a function of covariates via a logit formulation.
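The likelihood-based comparison described above can be sketched as follows: binary covariate matrices are multiplied element by element, the product is normalized so that its entries sum to one, and the multinomial log-likelihood of the observed counts and the corresponding AIC are computed. The matrices, the counts and the number of parameters passed to the AIC are hypothetical, and the code is a reading of the description given here rather than the original authors' implementation.

```python
import math

def normalized_product(*matrices):
    """Element-wise product of matrices, scaled so all entries sum to one.

    Assumes the product has at least one non-zero entry.
    """
    G, J = len(matrices[0]), len(matrices[0][0])
    prod = [[math.prod(m[g][j] for m in matrices) for j in range(J)]
            for g in range(G)]
    total = sum(map(sum, prod))
    return [[v / total for v in row] for row in prod]

def multinomial_loglik(Y, P):
    """Log-likelihood of counts Y under one multinomial over all cells."""
    N = sum(map(sum, Y))
    ll = math.lgamma(N + 1)
    for y_row, p_row in zip(Y, P):
        for y, p in zip(y_row, p_row):
            ll -= math.lgamma(y + 1)
            if y > 0:
                if p == 0:
                    return -math.inf   # observed a link the model forbids
                ll += y * math.log(p)
    return ll

def aic(loglik, n_params):
    """Akaike information criterion."""
    return 2 * n_params - 2 * loglik

# Hypothetical binary matrices for spatial and temporal overlap.
space = [[1, 1], [0, 1]]
time = [[1, 0], [1, 1]]
P_model = normalized_product(space, time)

Y_obs = [[3, 0], [0, 1]]               # hypothetical observed counts
ll_demo = multinomial_loglik(Y_obs, P_model)
aic_demo = aic(ll_demo, 2)             # parameter count is an assumption
print(P_model, ll_demo, aic_demo)
```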

Santamaría and Rodríguez-Gironés (2007) investigated whether simple linkage rules, which account for trait complementarity and/or exploitation barriers, could explain the structure of plant-pollinator mutualistic networks. They used simulation methods to build binary interaction matrices, i.e., qualitative networks in which 1 indicates the presence of an interaction between a given pollinator species and a given plant species, and 0 indicates its absence. These networks were simulated using models with one, two and four complementarity traits and barrier traits, together with two null models. The structure of simulated communities based on simple linkage rules was compared to the structure of 37 real-world networks. Network topology was described by the number of interactions, nestedness, relative nestedness and connectivity. Santamaría and Rodríguez-Gironés found that models that incorporate two traits were able to predict the network statistics found in the real-world networks.


Trait matching was also investigated by Stang et al. (2009) in a Spanish plant-pollinator network. They introduced a new network statistic: the degree of size matching between nectar depth and proboscis length. They used two rules, (i) size threshold and (ii) relative abundance, to explain the frequency distributions of interactions across size classes and the average degree of size matching for individuals in a species (Stang et al., 2009). For each analysis they compared observed and expected values using a contingency table approach (χ² tests). Observed values were calculated as a function of size for the first analysis, and using the mean and standard deviation of trait values for the second. The expected frequencies were calculated from probabilities derived from a threshold indicator (whether the interaction is possible) and from either equal or relative species abundance. They found that size thresholds, size distributions (nectar depths and proboscis lengths), and species abundance seemed to be important in understanding observed interaction probabilities.

Although these studies provide a conceptual framework for incorporating mul-

tiple factors in predicting network structure, a means of quantifying the relative

contribution of each factor is yet to be explored. The DM regression model can

identify the factors that affect the interaction probabilities and can estimate their

relative contribution to those interaction probabilities. Additionally, DM regres-

sion is a flexible model and can be used to incorporate different kinds of covariates,

such as space and time, or possibly to learn a set of linkage rules.

The next chapter provides details of the DM regression model from its roots

in econometrics. Further discussion extends the DM regression model into the

context of pollination networks.


Chapter 3

Dirichlet-Multinomial Regression

This chapter begins with a review of multinomial response data and the logit

formulation used to model the multinomial probabilities. Section 3.2 introduces

the random utility model from econometrics (used to model individual choice

behaviour). Section 3.3 provides the derivation of the grouped conditional logit

(GCL) model from the random utility framework and places it in the context of

pollination networks. Finally, Section 3.4 provides the details of the DM regression

model, including its extension of the GCL model, its equivalence to the log-linear

model, and its alternate parameterizations.

3.1 Multinomial Responses

A dependent variable that can take on more than two discrete values is known

as a polytomous response variable. For example, travelers may choose among a

set of travel modes or consumers choose among a set of brand name products.

The number of travelers who choose the respective travel mode or the number of

consumers who choose the respective brand name products can be modeled using

the multinomial distribution. The multinomial responses can be either nominal,

i.e., there is no natural order to the categories, or ordinal, i.e., there is an order

or ranking to the categories. In this thesis, only nominal response variables are

considered.

Consider a random variable $Y_i$ that can take on one of a finite number of

discrete values, $1, 2, \ldots, J$. Let $p_{ij} = P(Y_i = j)$ be the probability that the

$i$th response falls into the $j$th category. Assuming that the response categories

are mutually exclusive and exhaustive, $\sum_{j=1}^{J} p_{ij} = 1$ for each $i$. Let $Y_{ij}$ be the number

of observations falling into category $j$ for a group or individual $i$, and let

$n_i = \sum_{j=1}^{J} Y_{ij}$. For ungrouped data, $n_i = 1$: the count for the single observed

category is one and the counts for the remaining $J - 1$ categories are zero.

The probability distribution of the counts $Y_{ij}$ given the total $n_i$ is given by the

multinomial distribution:

\[ P(Y_{i1} = y_{i1}, \ldots, Y_{iJ} = y_{iJ}) = \frac{n_i!}{y_{i1}! \cdots y_{iJ}!}\, p_{i1}^{y_{i1}} \cdots p_{iJ}^{y_{iJ}}. \tag{3.1} \]

The special case where J = 2 is the binomial distribution. The expected value

and variance of the Yij are:

\[ E(Y_{ij}) = n_i p_{ij} \tag{3.2} \]

and

\[ \mathrm{Var}(Y_{ij}) = n_i p_{ij}(1 - p_{ij}). \tag{3.3} \]
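
The multinomial probability and its moments can be checked numerically for small cases. The following is a minimal sketch, assuming small $n_i$ and $J$ so that all outcomes can be enumerated; the function names are illustrative and are not part of the thesis.

```python
import math
from itertools import product


def multinomial_pmf(counts, probs):
    """P(Y_i1 = y_i1, ..., Y_iJ = y_iJ) as in Equation 3.1."""
    n = sum(counts)
    coef = math.factorial(n)
    for y in counts:
        coef //= math.factorial(y)          # exact integer multinomial coefficient
    return coef * math.prod(p ** y for p, y in zip(probs, counts))


def multinomial_mean_var(n, probs):
    """Category-wise mean n*p and variance n*p*(1-p) (Equations 3.2 and 3.3)."""
    return [n * p for p in probs], [n * p * (1 - p) for p in probs]


def enumerate_outcomes(n, J):
    """All count vectors of length J that sum to n (feasible only for small n and J)."""
    return [c for c in product(range(n + 1), repeat=J) if sum(c) == n]
```

Summing `multinomial_pmf` over `enumerate_outcomes(n, J)` should return one, and the enumerated mean and variance of any single category should agree with `multinomial_mean_var`.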

Multinomial logistic (MNL) regression models the $p_{ij}$ in terms of individual

or group specific covariates $x_i$. The link function, known as the logit link or

log-odds, connects the $p_{ij}$ to the covariates $x_i$. The logit uses the $J$th category as the

baseline group; therefore, the log-odds for all other $J - 1$ categories are relative

to the baseline. This generalized logit can be written as a linear combination of

covariates:

\[ \nu_{ij} = \log\frac{p_{ij}}{p_{iJ}} = \beta_j' x_i, \tag{3.4} \]

where $\beta_j$ is a vector of regression coefficients for $j = 1, \ldots, J - 1$ associated with

the covariate values for the $i$th individual or group.¹

The probability that the $i$th individual or group selects the $j$th category is:

\[ p_{ij} = \frac{\exp(\nu_{ij})}{\sum_{k=1}^{J} \exp(\nu_{ik})}, \tag{3.5} \]

which is the same logit formulation used in the conditional logit (CL) models

discussed in Sections 3.2 and 3.3. Note, however, that the interpretation of

the β parameters differs between the MNL and the CL models. In the former, these

parameters correspond to individual or group level characteristics, while those in

the latter correspond to choice attributes. An introduction to the random utility

model will elucidate this distinction.

¹ η is commonly used to denote the linear predictor in a generalized linear model; however, in this thesis, η is used to represent the random group effects for the DM regression model (see Section 3.4).

3.2 Random Utility Model

McFadden (1974) derived a CL model from the random utility model commonly

used in econometrics for modeling individual choice behaviour. The random utility

model assumes that (i) an individual is faced with Ji mutually exclusive and

exhaustive choices, (ii) the utilities Uij are random variables that vary across

individuals, and (iii) an individual selects the choice with the highest, or maximum,

utility (Maddala, 1983). The utility function is defined as the utility ascribed to

choice j by individual i:

Uij = Vij + εij, (3.6)

for i = 1, . . . , N and j = 1, . . . , J and where Vij is a function of covariates that

can reflect the choice attributes and/or the individual characteristics and the εij

is a random error term. The εij is assumed to follow a Type I Extreme Value

(Gumbel) distribution, sometimes called a Weibull distribution in the econometrics

literature, because the modeled utilities are maxima.

The probability that individual i selects choice j can be expressed by the

following logit formulation:

\[ p_{ij} = \frac{\exp(V_{ij})}{\sum_{k=1}^{J} \exp(V_{ik})} = \frac{\exp(\beta' x_{ij})}{\sum_{k=1}^{J} \exp(\beta' x_{ik})}, \tag{3.7} \]

where β is a vector of unknown parameters associated with each of the covariates,

and xij is the vector of covariate values corresponding to individual i and choice

j, for i = 1, . . . , N and j = 1, . . . , J . Note that the MNL model is a special case

of the CL model when only individual characteristics are considered as covariates

and β = βj and xij = xi. However, if both types of covariates are included, the

covariates that vary across individuals are constant for all choices and cancel out

of the logit formulation. It can be shown that the logit formulation is a direct

result of the extreme value distribution placed on the random errors in Equation

3.6. Appendix A provides an explicit derivation of McFadden’s CL formulation.
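
The link between the extreme value errors and the logit formulation can also be illustrated by simulation. The sketch below is an illustration only and not the derivation in Appendix A: it draws standard Gumbel errors, records which choice attains the maximum utility, and compares the resulting shares with the closed-form logit probabilities. The function names and the example utilities are assumptions made for this illustration.

```python
import math
import random


def cl_probabilities(v):
    """Logit probabilities computed from systematic utilities V_i1..V_iJ (Equation 3.7)."""
    m = max(v)                                # subtract the max for numerical stability
    w = [math.exp(u - m) for u in v]
    s = sum(w)
    return [x / s for x in w]


def standard_gumbel(rng):
    """Inverse-CDF draw: if U ~ Uniform(0, 1), then -log(-log(U)) is standard Gumbel."""
    u = rng.random()
    while u == 0.0:                           # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_choice_shares(v, n_draws, rng):
    """Share of draws in which each choice has the largest U_ij = V_ij + eps_ij."""
    wins = [0] * len(v)
    for _ in range(n_draws):
        utilities = [u + standard_gumbel(rng) for u in v]
        wins[utilities.index(max(utilities))] += 1
    return [w / n_draws for w in wins]
```

With a few thousand draws, `simulate_choice_shares([1.0, 0.0, -0.5], n, random.Random(1))` should lie close to `cl_probabilities([1.0, 0.0, -0.5])`, up to Monte Carlo error.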

3.3 Grouped Conditional Logit

In a pollination context, each pollinator species is faced with the same choice

set, or J plant species. Further, it is assumed that the pij are identical for all

individuals in the same group, or pollinator species. Thus the covariate values

are identical across members of a group. As such, the utility function for the

ith individual in the gth group and the probability that the individuals in the gth

group select the jth plant can be rewritten as:

\[ U_{igj} = \beta' x_{gj} + \varepsilon_{igj}, \tag{3.8} \]

and

\[ p_{gj} = \frac{\exp(\beta' x_{gj})}{\sum_{k=1}^{J} \exp(\beta' x_{gk})}, \tag{3.9} \]

where the xgj is the vector of covariate values corresponding to pollinator species

g and plant species j, and β and εigj are defined as in Section 3.2.

The likelihood function for the grouped conditional logit is:

\[ L_{GCL} = \prod_{g=1}^{G} \prod_{j=1}^{J} p_{gj}^{y_{gj}}, \tag{3.10} \]

where the ygj are the number of individuals from pollinator species g that select

plant species j. The parameters of the GCL model can be estimated via maxi-

mum likelihood (ML) procedures available in most statistical software packages.

Guimaraes and Lindrooth (2007) give a detailed discussion of the equivalent log-

linear model that arises when the ygj are modeled as a count variable. As such,

the corresponding model parameters can be estimated using Poisson regression.

In this thesis, the plant-pollinator interaction probabilities are modeled using

only the GCL and the DM models and the estimates and corresponding standard

errors of β are compared.
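
For concreteness, the GCL probabilities and log-likelihood can be evaluated directly. This is a minimal sketch, assuming the covariates are stored as nested lists `X[g][j]` (the vector $x_{gj}$) and the counts as `Y[g][j]`; this layout and the function names are illustrative and do not describe the Stata routines used in the thesis.

```python
import math


def gcl_probabilities(beta, x_g):
    """p_g1..p_gJ for one pollinator species (Equation 3.9).

    x_g is a list of J covariate vectors x_gj, one per plant species.
    """
    v = [sum(b * c for b, c in zip(beta, x_gj)) for x_gj in x_g]
    m = max(v)                                # subtract the max for numerical stability
    w = [math.exp(u - m) for u in v]
    s = sum(w)
    return [u / s for u in w]


def gcl_log_likelihood(beta, X, Y):
    """Log of Equation 3.10: the sum over g and j of y_gj * log(p_gj)."""
    ll = 0.0
    for x_g, y_g in zip(X, Y):
        p_g = gcl_probabilities(beta, x_g)
        ll += sum(y * math.log(p) for y, p in zip(y_g, p_g) if y > 0)
    return ll
```

Passing this function to a general-purpose optimizer would give ML estimates of β in principle; the sketch only evaluates the objective.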

3.4 Dirichlet-Multinomial Model

In the standard GCL model, it is assumed that the pgj are fixed constants,

g = 1, . . . , G and j = 1, . . . , J . However, due to the complex structure of plant-

pollinator networks, the counts in Y often vary more than predicted by the

GCL model, a phenomenon known as over-dispersion. Consequently, pgj may vary

within a pollinator species due to some unobserved heterogeneity (Faraway, 2006).

For example, the pollinators in species g may be observed more frequently than

those from other species due to observation error; therefore, the counts for the

species g may be greater than predicted by pgj. To account for the possibility of

this group-specific heterogeneity, the utility function can be expressed as

Uigj = β′xgj + ηgj + εigj, (3.11)

where ηgj is the random group effect for pollinator species g and plant species

j; and the εigj are independent conditional on the group random effect, for i =

1, . . . , N , g = 1, . . . , G, and j = 1, . . . , J .

Conditional on the group random effects, a modified expression for the prob-

ability that an individual from pollinator g selects plant j is:

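\[ p_{gj} = \frac{\exp(\beta' x_{gj} + \eta_{gj})}{\sum_{k=1}^{J} \exp(\beta' x_{gk} + \eta_{gk})} = \frac{\lambda_{gj}\exp(\eta_{gj})}{\sum_{k=1}^{J} \lambda_{gk}\exp(\eta_{gk})}, \tag{3.12} \]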

where λgj=exp(β′xgj), for g = 1, . . . , G, and j = 1, . . . , J . Introducing this extra

level of variability into the model can allow for correlation across the choices for

pollinators in the same group, which translates into over-dispersion of the ygj

counts (Guimaraes and Lindrooth, 2007).

Assume that the $\exp(\eta_{gj})$ are independently gamma distributed with both

shape and rate parameters equal to $\delta_g^{-1}\lambda_{gj}$, where $\delta_g > 0$.

Then the expected value of $\exp(\eta_{gj})$ is 1 and the variance is $\delta_g\lambda_{gj}^{-1}$. Furthermore,

the products $\lambda_{gj}\exp(\eta_{gj})$, for $g = 1, \ldots, G$ and $j = 1, \ldots, J$, also have

independent gamma distributions, with shape and rate parameters $(\delta_g^{-1}\lambda_{gj}, \delta_g^{-1})$. Since these

variables follow independent gamma distributions with the same rate parameter, the

vector of probabilities for a given pollinator species, or group, $(p_{g1}, \ldots, p_{gJ})$ follows

a Dirichlet distribution with parameters $(\delta_g^{-1}\lambda_{g1}, \ldots, \delta_g^{-1}\lambda_{gJ})$ (Mosimann, 1962).

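The probability density function of $(p_{g1}, \ldots, p_{gJ})$ can then be written as:

\[ f_D(p_{g1}, \ldots, p_{g,J-1}) = \frac{\Gamma(\delta_g^{-1}\lambda_g)}{\prod_{j=1}^{J}\Gamma(\delta_g^{-1}\lambda_{gj})} \prod_{j=1}^{J} p_{gj}^{\delta_g^{-1}\lambda_{gj} - 1}, \tag{3.13} \]

where $p_{gJ} = 1 - \sum_{j=1}^{J-1} p_{gj}$.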

Within a Bayesian framework, the above modification is equivalent to placing

a Dirichlet prior on pgj. Note that the Dirichlet distribution happens to be the

conjugate prior for the multinomial distribution. The resulting distribution for

Y is the Dirichlet-multinomial distribution with parameters $(n_g; p_{g1}, \ldots, p_{gJ})$, $g = 1, \ldots, G$,

where $n_g = \sum_{j=1}^{J} y_{gj}$. Mosimann (1962) provides a closed form expression

for the unconditional DM likelihood:

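\[ L_{DMd} = \prod_{g=1}^{G} \frac{n_g!\,\Gamma(\delta_g^{-1}\lambda_g)}{\Gamma(\delta_g^{-1}\lambda_g + n_g)} \prod_{j=1}^{J} \frac{\Gamma(\delta_g^{-1}\lambda_{gj} + y_{gj})}{y_{gj}!\,\Gamma(\delta_g^{-1}\lambda_{gj})} \tag{3.14} \]

for $g = 1, \ldots, G$ and $j = 1, \ldots, J$, and where $\lambda_g = \sum_{j=1}^{J} \lambda_{gj}$.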
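
The DM likelihood is most conveniently evaluated on the log scale using the log-gamma function. The following is a minimal sketch of one group's contribution to Equation 3.14, taking the counts in the second product to be the $y_{gj}$; the function names are illustrative. A multinomial counterpart is included because the DM probability approaches the multinomial probability with $p_{gj} = \lambda_{gj}/\lambda_g$ as $\delta_g$ tends to zero.

```python
import math


def dm_log_likelihood_group(y_g, lam_g, delta_g):
    """Log of the g-th factor of Equation 3.14.

    y_g:     counts y_g1..y_gJ for pollinator species g
    lam_g:   lambda_g1..lambda_gJ, with lambda_gj = exp(beta' x_gj)
    delta_g: over-dispersion parameter (must be > 0)
    """
    a = [l / delta_g for l in lam_g]          # Dirichlet parameters delta_g^{-1} * lambda_gj
    a_sum, n_g = sum(a), sum(y_g)
    ll = math.lgamma(n_g + 1) + math.lgamma(a_sum) - math.lgamma(a_sum + n_g)
    for y, a_j in zip(y_g, a):
        ll += math.lgamma(a_j + y) - math.lgamma(y + 1) - math.lgamma(a_j)
    return ll


def multinomial_log_likelihood_group(y_g, lam_g):
    """Multinomial log-probability with p_gj = lambda_gj / lambda_g."""
    n_g, lam_sum = sum(y_g), sum(lam_g)
    ll = math.lgamma(n_g + 1)
    for y, l in zip(y_g, lam_g):
        ll += y * math.log(l / lam_sum) - math.lgamma(y + 1)
    return ll
```

Summing `dm_log_likelihood_group` over the groups gives the full log-likelihood for a given β and set of $\delta_g$ values.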

A graphical model for DM regression is shown in Figure 3.1. The observed

counts in Y follow a multinomial distribution with parameters P = (p11, . . . , pGJ).

The interaction matrix P follows a Dirichlet distribution with parameters

(δ−11 λ11, . . . , δ−1G λGJ). These Dirichlet parameters depend on observed covariates

X and the associated parameter β through λgj = exp(β′xgj) and over-dispersion

parameter δg. As mentioned earlier, counts in Y may vary more than predicted

by P; therefore, δ is group specific and accounts for this variability. In

summary, the pgj provide information on the strength of the links in the network

and β summarizes the covariates’ contributions to those probabilities. Only the

observed counts in Y and the observed covariates in X are needed to estimate the

parameters of the DM model.

Figure 3.1: Graphical model of DM regression - Y contains the observed counts; P contains the corresponding interaction probabilities, which are a function of the multi-dimensional array X, consisting of k observable covariate matrices, and the corresponding β; δ is an over-dispersion parameter.

As mentioned earlier, the GCL model can be rewritten as a log-linear model

by letting ygj follow a Poisson distribution and conditioning on the sum of counts

ng. Guimaraes and Lindrooth (2007) give the analogous relationship between the

DM model and the negative binomial model, also known as the negative binomial

type 1 or negative binomial model with fixed effects in the econometrics literature.

Once again, conditioning on the sum of counts $n_g$, assume the $y_{gj}$ follow a Poisson

distribution with parameter $\lambda^*_{gj}$ and let $\lambda^*_{gj}$ follow a gamma distribution with

shape and rate parameters $(\delta_g^{-1}\lambda_{gj}, \delta_g^{-1})$. Then under these assumptions, the $y_{gj}$ follow a negative

binomial distribution. This parameterization was used in the simulation study to

generate plant-pollinator networks, discussed in Section 4.3.

3.4.1 Additional Parameterizations and Considerations

As mentioned earlier, the addition of the group random effect ηgj induces

correlation across the choices of plant species. Under the DM model, the marginal

distribution of each ygj is a beta-binomial distribution with parameters (ng, pgj) (Guimaraes

and Lindrooth, 2007). As such, the intragroup correlation coefficient can be ex-

pressed as:

\[ \rho_g = \frac{1}{\delta_g^{-1}\lambda_g + 1} = \frac{\delta_g}{\lambda_g + \delta_g} \tag{3.15} \]

for $g = 1, \ldots, G$.

By inspection, it is obvious that ρg tends to zero as δg approaches zero. The

DM likelihood parameterized in terms of the intragroup correlation coefficient is:

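\[ L_{DMr} = \prod_{g=1}^{G} \frac{n_g!\,\Gamma(\rho_g^{-1} - 1)}{\Gamma[(\rho_g^{-1} - 1) + n_g]} \prod_{j=1}^{J} \frac{\Gamma[(\rho_g^{-1} - 1)p_{gj} + y_{gj}]}{\Gamma[(\rho_g^{-1} - 1)p_{gj}]\, y_{gj}!}. \tag{3.16} \]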

Maximization of the likelihood provides estimates of β and the dispersion

parameters. An iterative procedure such as Newton-Raphson or Fisher scoring

can be employed to obtain the maximum likelihood (ML) estimates. Existing

routines are available in statistical software packages such as LIMDEP and

Stata.

In this thesis, Stata was used to obtain estimates of the DM model parame-

ters. Figure 3.2 displays the graphical model of Stata’s parameterization of DM

regression. Stata’s implementation of DM regression models the interaction

probabilities $p_{gj}$ as a function of $\lambda^* = e^{\gamma' x^*} = e^{\beta' x_{gj} - \psi' x_g}$, where β and $x_{gj}$ are as defined

in Sections 3.2 and 3.3, respectively,

\[ \gamma = \left(\begin{array}{c} \beta \\ \hline \psi \end{array}\right) \quad \text{and} \quad X^* = \left(\begin{array}{c} x_{gj} \\ \hline x_g \end{array}\right). \]

Figure 3.2: Graphical model of Stata’s parameterization of DM regression - Y contains the observed counts; P contains the corresponding interaction probabilities, which are a function of $\lambda^* = e^{\gamma' x^*} = e^{\beta' x_{gj} - \psi' x_g}$.

If δg is modeled as a constant, then ψ is an unknown scalar constant and xg=1.

Otherwise, ψ is a vector of unknown coefficients and xg is a vector of pollinator-

specific covariates.

Since the group random effect ηgj translates into a pollinator-specific over-

dispersion parameter, δg or ρg, it may be modeled in several ways. Hence, the

options for modeling the dispersion parameters for the DM model in Stata are as

follows:

1. Over-dispersion parameter is modeled as a constant: $\delta_g = e^{-\delta}$

   This implementation assumes that all pollinator species share the same

   over-dispersion parameter. This is equivalent to introducing an intercept term

   into the model.

2. Over-dispersion parameter is modeled as a function of pollinator species

   covariates: $\delta_g = f(x_g)$

   This implementation assumes $\delta_g = e^{-\psi' x_g}$. This is equivalent to introducing

   an intercept and additional coefficient terms into the model.

3. Intragroup correlation coefficient is modeled as a constant: $\mathrm{logit}(\rho_g) = \rho$

   This implementation assumes that all pollinator species share the same

   intragroup correlation coefficient. Hence, this is equivalent to introducing an

   intercept term into the model.

4. Intragroup correlation coefficient is modeled as a function of covariates:

   $\rho_g = f(x_g)$

   This implementation assumes $\mathrm{logit}(\rho_g) = \psi' x_g$. In Stata, this is equivalent

   to introducing an intercept and additional coefficient terms into the model.

The pollinator-specific parameters account for extra-multinomial variability, but

do not affect the choice probabilities since they drop out of the logit formulation.

However, the addition of pollinator-specific covariates (Options 2 and 4 above)

can provide additional insight into the heterogeneity of plant-pollinator networks.

Option 2 provides information on the impact of these covariates on the number

of times each plant species is chosen (Guimaraes and Lindrooth, 2007). Option 4

provides an assessment of the impact that the covariates have on the correlation

across plant species for individuals in the same pollinator species.

In this thesis, Options 1–3 and the GCL model were used to generate plant-

pollinator networks for the simulation studies and Options 1–4 and the GCL model

were used to fit the simulated data sets. Chapter 4 gives a detailed outline of the

procedures carried out in the simulation studies.

Chapter 4

Design of Simulation Study

This chapter provides a description of the simulation studies conducted for

the evaluation of the DM regression model in the context of pollination networks.

Section 4.1 describes the design of Simulation Study A, for which the aim was to

gain insights into the robustness of DM regression to various parameterizations of

the model dispersion. Section 4.2 describes the design of Simulation Study B, for

which the aim was to challenge the performance of DM regression with respect to

the parameter boundaries tested in Simulation Study A. Section 4.3 explains the

data generation procedure for the various plant-pollinator networks used in the

simulation studies. Finally, Section 4.4 sketches the model fitting techniques used

to analyze the simulated data sets and the summary statistics used to compile the

results of the simulation studies.

4.1 Description of Simulation Study A

The main objective of Simulation Study A was to evaluate the overall per-

formance of the DM model for plant-pollinator network data and to compare its

performance to the GCL model in the presence of mild over-dispersion. Data were

generated based on three sets of parameter values, three network sizes and four

dispersion structures. The DM model parameters were randomly generated as

follows:

• βi ∼ Uniform(1 + i, 3 + i), i = 0, 1, 2

• δg ∼ Uniform(0, 2), for δg = δ

• δg ∼ Uniform(−2, 4), for δg = f(xg)

• ρg ∼ Uniform(0, 0.25)

The β parameters were randomly generated from a Uniform distribution over in-

terval (1, 3) for the first set of parameter values, but the interval was shifted up by

one for each subsequent set of parameter values. The dispersion parameters were

also randomly generated from a Uniform distribution: (i) the δg parameters were

selected within the range of (0, 2) if δg was considered constant for all pollinator

species or within the range of (-2, 4) if δg was modeled as a function of pollinator

covariates, and (ii) the ρ parameters were selected within the range of (0, 0.25).

Parameter ranges were based on an ad-hoc pre-analysis (results not shown)

that considered various parameter values in the generation of plant-pollinator

networks. The results from the pre-analysis suggested that β < −1 or β > 5

resulted in sparse networks containing almost all zero counts or highly populated

networks containing astronomical counts (>1 million), respectively. Similarly, δ

values in excess of 10 produced sparse networks with many zero counts and a few

cells containing very low counts. Also, ρ values in excess of 0.5 produced many

zero counts with a few cells containing high counts. Accordingly, the β ranges

were selected in such a way so as to produce heavily populated plant-pollinator

networks, i.e., N > 90, 000, while the ranges of the dispersion parameters were

selected to represent mild over-dispersion, i.e. δ < 2 and ρ < 0.25. The choice

of parameter values reflected an attempt to evaluate the performance of the DM

model with network counts representing close to an infinite population.

Network sizes were selected based on a review of 35 published plant-pollinator

communities available from the Interaction Web Database (IWDB) (Guimaraes

et al., 2011). The Interaction Web Database is a cooperative effort of scien-

tists interested in the study of species interactions and is hosted by the Na-

tional Center for Ecological Analysis and Synthesis, at the University of California,

U.S.A. Similar to the technique used by Santamaría and Rodríguez-Gironés

(2007), the number of plant species J in a given network was randomly gener-

ated from a Uniform distribution over (7, 135), the endpoints of which corre-

sponded to minimum and maximum number of plant species from all networks

recorded in IWDB, respectively. The number of pollinator species G was then

calculated from a regression of pollinators on plants fit from the same networks,

i.e., $G = (0.5491 + 4.4821 \log\sqrt{J})^2$.
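
The mapping from the number of plant species to the number of pollinator species can be sketched as follows. Two assumptions are made here that the text does not state explicitly: the logarithm is taken to be the natural logarithm, and the result is rounded to the nearest integer. Under these assumptions the function reproduces the network sizes listed in Table 4.1 (57 x 23, 90 x 54 and 105 x 76). Treating the Uniform(7, 135) draw as a draw over integers is a further assumption, and the function names are illustrative.

```python
import math
import random


def pollinators_from_plants(J):
    """G = (0.5491 + 4.4821 * log(sqrt(J)))^2, rounded to the nearest integer.

    Natural logarithm and rounding are assumptions (see the lead-in).
    """
    return round((0.5491 + 4.4821 * math.log(math.sqrt(J))) ** 2)


def random_network_size(rng, j_min=7, j_max=135):
    """Draw J uniformly from the integers j_min..j_max and derive G from it."""
    J = rng.randint(j_min, j_max)
    return pollinators_from_plants(J), J
```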

The four types of dispersion structures represent no dispersion, constant dis-

persion, dispersion as a function of pollinator covariates, and constant dispersion

in terms of intragroup correlation. Table 4.1 outlines the scenarios defined as a

unique combination of parameter set, network size and dispersion structure.

So for each of the three parameter sets, all combinations of network size and

dispersion structure were considered, resulting in 3 × 3 × 4 = 36 combinations

in total. For each scenario, 750 plant-pollinator networks were generated (as

outlined in Section 4.3), fit using GCL and DM regression models and analyzed

via summary statistics (as outlined in Section 4.4).

4.2 Description of Simulation Study B

The results of Simulation Study A provided insights into the performance of DM

regression for varying sizes of networks, but because data were generated for spe-

cific sets of parameter values, one cannot conjecture about performance trends

for varying β, δg or ρg values. In response to this concern, Simulation Study B

focused on one network size, but looked at many combinations of the other model

parameters. More specifically, data were generated based on one network size,

two sets of β values, and one set of ten dispersion values (incorporating the four

dispersion structures outlined in Section 4.1).

Table 4.1: Simulation Study A

Parameter Set                                          Network     Dispersion
β                     δg     δg = f(xg)       ρ        Size        Structure

(3.21, 3.85, 3.50)    0.4    (1.48, −0.55)    0.05     57 x 23     δg = 0
(3.92, 4.54, 3.52)    1.1    (0.25, 2.41)     0.08     90 x 54     δg = δ
(2.78, 2.35, 1.88)    1.6    (−0.96, 2.13)    0.11     105 x 76    δg = f(xg)
                                                                   ρg = ρ

Following the same procedure as was used in Simulation Study A, the network

size was obtained using the median number of plant species (J = 18) recorded in

IWDB (Guimaraes et al., 2011) and letting the number of pollinator species equal

$G = (0.5491 + 4.4821 \log\sqrt{J})^2 = 49$.

The DM model parameters were randomly generated according to:

• βi ∼ Uniform(−1 + i, 2 + i), i = 0, 1

• δg ∼ Uniform(−1, 3), for δg = f(xg)

and dispersion parameters specified as follows:

• δg = 0, 0.1, 0.5, 2, for δg = δ

• ρg = 0.05, 0.2, 0.5.

The choices of parameter ranges and values were based on both the pre-analysis

and the results of Simulation Study A. As such, the β parameters were randomly

generated from a Uniform distribution within the range of (-1, 3), in order to

produce total network counts that matched the averages of those recorded in

IWDB, i.e., 1700 < N < 3400.

The dispersion parameters were selected in such a way so as to incorporate

the four dispersion structures introduced in Table 4.1, but also to provide a rep-

resentative range of values that spanned the boundaries of the parameter space

based on Simulation Study A. Accordingly, if δg was considered constant for all

pollinator species, values were selected within the range of (0, 2), where the spe-

cial case of δg = 0 accounts for a no dispersion structure. If δg was modeled as

a function of pollinator covariates, the parameters of δg were randomly generated

from a Uniform distribution within the range of (-1, 3). Finally, in terms of the

intragroup correlation coefficient, ρ values were selected within the range of (0,

0.5).

Table 4.2: Simulation Study B

Network Size    β                    Dispersion

48 x 19         (1.2, -0.4, 0.1)     none                         δ = 0
                (1.1, 0.8, 2.3)      constant                     δ = 0.1
                                                                  δ = 0.5
                                                                  δ = 2
                                     function of covariates       δg = 0.2 + 1.4xg
                                                                  δg = −0.9 + 2.1xg
                                                                  δg = 1.5 − 0.5xg
                                     intragroup corr. constant    ρ = 0.05
                                                                  ρ = 0.2
                                                                  ρ = 0.5

Table 4.2 outlines the scenarios defined as a unique combination of a set of β

parameters and a dispersion parameter. For each of the two sets of β

parameters, data were generated for the ten dispersion parameter values, resulting

in 2 × 10 = 20 scenarios in total. The advantage of this simulation study is that

it allows for a comparison of the four dispersion structures and a comparison of

the effect of varying amounts of dispersion, given a set of β values and a network

size of 48x19. For each scenario, data were: (i) generated for 750 individual

plant-pollinator networks (as outlined in Section 4.3), (ii) fit using GCL and DM

regression models, and (iii) analyzed via Monte Carlo biases and standard errors

(as outlined in Section 4.4).

4.3 Data Generation

Data were generated in R (R Development Core Team, 2011) for all simulation

studies. A total of three covariates, which incorporate the theories of linkage

rules and neutrality, were used to model the interaction probabilities. Two binary

linkage rules, one barrier trait and one complementarity trait, were simulated

based on plant and pollinator species traits as follows:

1. Mean proboscis length ∼ Uniform (0, 10)

2. Mean tubal length ∼ Uniform (0,10)

3. Sweet fragrance preference ∼ Bernoulli (0.5)

4. Sweet fragrance status ∼ Bernoulli (0.65)

The linkage rules were then determined by matching the plant and pollinator

traits according to the following boolean operators (Santamaría and Rodríguez-

Gironés, 2007): If the mean proboscis length was greater than or equal to the

tubal length for a given plant-pollinator species pair, then the barrier trait for

that plant-pollinator pair equaled 1; 0 otherwise. Similarly, if the sweet fragrance

preference and status matched for a given plant-pollinator species pair, then the

complementarity trait for that plant-pollinator pair equaled 1; 0 otherwise.
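
The construction of the two binary linkage covariates can be sketched as below. The thesis generated its data in R; this Python sketch is an illustration only. It assumes that proboscis length and fragrance preference are pollinator traits while tubal length and fragrance status are plant traits, which the trait list does not state explicitly, and all names are hypothetical.

```python
import random


def simulate_linkage_covariates(G, J, rng):
    """Return (barrier, complementarity) as G x J nested lists of 0/1 indicators."""
    proboscis = [rng.uniform(0, 10) for _ in range(G)]            # pollinator trait
    tube = [rng.uniform(0, 10) for _ in range(J)]                 # plant trait
    prefers_sweet = [int(rng.random() < 0.5) for _ in range(G)]   # pollinator trait
    is_sweet = [int(rng.random() < 0.65) for _ in range(J)]       # plant trait

    # Barrier trait: 1 if the proboscis is at least as long as the flower tube.
    barrier = [[int(proboscis[g] >= tube[j]) for j in range(J)] for g in range(G)]
    # Complementarity trait: 1 if fragrance preference and status match.
    complementarity = [[int(prefers_sweet[g] == is_sweet[j]) for j in range(J)]
                       for g in range(G)]
    return barrier, complementarity
```

Because the barrier rule is a threshold on a single pollinator trait, the rows of `barrier` are nested: a pollinator with a longer proboscis can reach every plant that a pollinator with a shorter proboscis can reach.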

Finally, relative species abundance was generated using the species abundance

distribution proposed by Ravasz et al. (2005). This type of covariate was used

to model both the interaction probabilities (which corresponds to plant species

abundance) and over-dispersion for δg = f(xg) (which corresponds to pollinator

species abundance). The normalized probability density function for the species

abundance distribution is:

\[ f(x) = \frac{1}{N_s \ln(N_s) - N_s + 1} \cdot \frac{N_s - x}{x} \tag{4.1} \]

where Ns is the number of individuals in the most abundant species and x rep-

resents the size of a given species in a network. Using the inverse cumulative

distribution function method (Devroye, 1986), plant or pollinator species abun-

dances (SA) for a given network were randomly sampled using Ns = 15, 000. The

corresponding relative species abundances (RA) were calculated as:

\[ RA_i = \frac{SA_i}{\sum_{k=1}^{I} SA_k}. \tag{4.2} \]
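
The inverse cumulative distribution function method for this density can be sketched as follows. The sketch assumes the support is $1 \le x \le N_s$, which is the range over which the normalizing constant $N_s \ln(N_s) - N_s + 1$ makes the density integrate to one, so that the cumulative distribution function is $F(x) = (N_s \ln x - x + 1)/(N_s \ln N_s - N_s + 1)$. The inverse is found numerically by bisection; this is an illustrative choice and not necessarily the method used in the thesis, and the function names are hypothetical.

```python
import math
import random


def abundance_cdf(x, n_s):
    """CDF of the density in Equation 4.1, assuming support 1 <= x <= n_s."""
    return (n_s * math.log(x) - x + 1) / (n_s * math.log(n_s) - n_s + 1)


def sample_abundance(u, n_s, tol=1e-9):
    """Invert the CDF at u in [0, 1] by bisection."""
    lo, hi = 1.0, float(n_s)
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if abundance_cdf(mid, n_s) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def relative_abundances(n_species, n_s, rng):
    """Sample species abundances and normalize them as in Equation 4.2."""
    sa = [sample_abundance(rng.random(), n_s) for _ in range(n_species)]
    total = sum(sa)
    return [s / total for s in sa]
```

For example, `relative_abundances(23, 15000, random.Random(0))` returns 23 positive values that sum to one.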

Once the covariates were generated, 750 random samples from a DM distri-

bution were simulated in R. Since the counts in Y can also be modeled as over-

dispersed count variables as mentioned in Section 3.4, the Poisson and Gamma

distributions were used to generate each sample. The following algorithm was

used to generate samples from a DM distribution:

1. Calculate λgj = exp(β′xgj).

2. Randomly sample λ∗gj ∼ Gamma(δ−1g λgj, δ−1g ).

3. Randomly sample ygj ∼ Poisson(λ∗gj).

4. Repeat.
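
The steps above can be sketched in Python as follows. The thesis used R, so this is an illustration and not the original code. Python's `random.gammavariate` takes a shape and a scale, so a rate of $\delta_g^{-1}$ corresponds to a scale of $\delta_g$. The Poisson sampler is a simple helper written for this sketch because the standard library does not provide one, and the nested-list layout of `X` is an assumption.

```python
import math
import random


def sample_poisson(mean, rng):
    """Poisson draw by counting unit-rate exponential arrivals before time `mean`.

    Simple but O(mean); adequate for a small illustration only.
    """
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_counts(beta, X, delta, rng):
    """One simulated network Y, following steps 1 to 3.

    X[g][j] is the covariate vector x_gj; delta[g] is the over-dispersion
    parameter for pollinator species g (0 is treated as no over-dispersion).
    """
    Y = []
    for x_g, d in zip(X, delta):
        row = []
        for x_gj in x_g:
            lam = math.exp(sum(b * c for b, c in zip(beta, x_gj)))   # step 1
            if d > 0:
                # step 2: shape lam/d and rate 1/d, i.e. scale d
                lam_star = rng.gammavariate(lam / d, d)
            else:
                lam_star = lam
            row.append(sample_poisson(lam_star, rng))                # step 3
        Y.append(row)
    return Y
```

Under this scheme each count has mean $\lambda_{gj}$ and variance $\lambda_{gj}(1 + \delta_g)$, so the counts are over-dispersed relative to the Poisson whenever $\delta_g > 0$.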

For scenarios that considered a dispersion structure in terms of the intragroup

correlation coefficient as a non-zero constant, i.e., ρg = ρ, δg was calculated as a

function of ρ as follows:

\[ \delta_g = \frac{\rho}{1 - \rho}\,\lambda_g, \tag{4.3} \]

for g = 1, . . . , G, and was substituted into the above algorithm.
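
The conversion between the two dispersion parameterizations follows from solving Equation 3.15 for $\delta_g$. A minimal sketch, with illustrative function names:

```python
def delta_from_rho(rho, lam_g):
    """Equation 4.3: delta_g = rho / (1 - rho) * lambda_g, for 0 <= rho < 1."""
    return rho / (1.0 - rho) * lam_g


def rho_from_delta(delta_g, lam_g):
    """Equation 3.15: rho_g = delta_g / (lambda_g + delta_g)."""
    return delta_g / (lam_g + delta_g)
```

Applying one function after the other recovers the starting value, which is a quick check that the two equations are consistent.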

4.4 Model Fitting and Summary Statistics

Estimates of model parameters were obtained via Stata (StataCorp, 2011) using

the multin and dirmul commands. These commands are part of a package

named ‘groupcl’, available in the Statistical Software Components (SSC)

library of Stata. These commands implement a Newton-Raphson algorithm for the

ML estimation of the parameters. Initially, a Poisson regression model is fit to the

data set to provide starting values for the ML estimation. For any given data set,

regardless of the true dispersion structure, maximum likelihood (ML) estimates

for the model parameters were obtained under five different dispersion structure

assumptions, i.e. five different model fits. Table 4.3 lists the model fits and the

corresponding acronyms to be used as a reference in the Results section (Chapter

5).

Fitting the data sets under the different dispersion structure assumptions

makes it possible to examine the impact of incorrect modeling on the ML es-

timates, e.g. via calculation of biases and standard errors. Ultimately, it allows

for a comparison of the robustness and accuracy of the DM model to the GCL

model.

Table 4.3: Dispersion structures

Dispersion      Acronym

δ = 0           GCL
δg = δ          DMd
δg = f(xg)      DMdf
ρg = ρ          DMr
ρg = f(xg)      DMrf

For each data set, ML estimates, standard errors, log-likelihoods, number of

iterations until convergence, number of convergence issues, and Pearson χ2 test

statistics were recorded. Guimaraes and Lindrooth (2007) provide a modified

Pearson χ2 test for the DM model:

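\[ P = \sum_{g=1}^{G} \sum_{j=1}^{J} \frac{(y_{gj} - n_g p_{gj})^2}{\phi_g n_g p_{gj}}, \tag{4.4} \]

where $\phi_g = \dfrac{\lambda_g + n_g\delta_g}{\lambda_g + \delta_g}$, for $g = 1, \ldots, G$ and $j = 1, \ldots, J$.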
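
The modified Pearson statistic can be computed as in the sketch below. It takes the numerator to be the squared difference between the observed count $y_{gj}$ and its expected value $n_g p_{gj}$, which is the usual Pearson form and is an assumption about the intended formula. The argument layout and function name are illustrative.

```python
def modified_pearson(Y, P, lam_tot, delta):
    """Modified Pearson statistic of Equation 4.4.

    Y[g][j]:    observed counts y_gj
    P[g][j]:    fitted probabilities p_gj
    lam_tot[g]: lambda_g, the sum over j of lambda_gj
    delta[g]:   over-dispersion parameter delta_g (0 gives the usual Pearson statistic)
    """
    stat = 0.0
    for y_g, p_g, lam_g, d_g in zip(Y, P, lam_tot, delta):
        n_g = sum(y_g)
        phi_g = (lam_g + n_g * d_g) / (lam_g + d_g)   # variance inflation factor
        for y, p in zip(y_g, p_g):
            expected = n_g * p
            stat += (y - expected) ** 2 / (phi_g * expected)
    return stat
```

When $\delta_g = 0$ the factor $\phi_g$ equals one, and larger values of $\delta_g$ shrink the statistic to reflect the extra-multinomial variability.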

Monte Carlo means, standard errors, and the corresponding biases were calcu-

lated as per Equations 4.5 and 4.6 for the model parameters using R (Robert and

Casella, 2010). Suppose that $\hat{\theta}_r$ denotes the ML estimator for a given parameter

$\theta$ obtained from the $r$th sample out of $R$ replications, $r = 1, \ldots, R$, under one of

the four types of dispersion structures. Then we define the bias and Monte Carlo

standard error of the ML estimator $\hat{\theta}$ as:

\[ \mathrm{Bias}(\hat{\theta}) = \bar{\theta} - \theta, \tag{4.5} \]

where $\bar{\theta} = \frac{1}{R}\sum_{r=1}^{R} \hat{\theta}_r$ is the Monte Carlo mean, and

\[ SE(\hat{\theta}) = \sqrt{\frac{1}{R - 1}\sum_{r=1}^{R} (\hat{\theta}_r - \bar{\theta})^2}. \tag{4.6} \]

Chapter 5 summarizes the numerical results of the simulation studies and Chapter

6 provides an interpretation of the trends suggested by these results.
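
The Monte Carlo summaries in Equations 4.5 and 4.6 amount to a sample mean and a sample standard deviation of the replicate estimates. A minimal sketch, with an illustrative function name and at least two replicates assumed:

```python
import math


def monte_carlo_summary(estimates, true_value):
    """Monte Carlo mean, bias (Equation 4.5) and standard error (Equation 4.6)."""
    R = len(estimates)
    mean = sum(estimates) / R
    bias = mean - true_value
    se = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (R - 1))
    return mean, bias, se
```

In the simulation studies, `estimates` would hold the ML estimates of one parameter from the replicates that converged, and `true_value` the value used to generate the data.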

Chapter 5

Results

The results of the two simulation studies are presented in Sections 5.1 and

5.2, respectively. The objective of Simulation Study A was to evaluate the overall

performance of DM regression compared to GCL regression for varying network

sizes and specific sets of parameter values. The results of the study suggest that

DM regression outperforms GCL regression when data are indeed over-dispersed

and significantly improves the model fit. Simulation Study B focuses on one

network size, but evaluates performance trends based on many combinations of

the other model parameters. The results of Simulation Study B reveal trends

similar to those of Simulation Study A; however, they raise questions about whether

DM regression can handle continuous covariates.

5.1 Simulation Study A Results

This simulation study was conducted to evaluate the overall performance of the

DM regression model compared to the standard GCL regression model. Parameter

values were chosen to ensure observed interaction frequency matrices were not too

sparse. Further, sets of parameter values (β, δg, ρg) were used to simulate networks

of various sizes assuming one of four dispersion structures discussed in this thesis

(GCL, DMd, DMdf, DMr, as per Table 4.3) in order to assess whether observed

trends in performance, with respect to misspecification of dispersion structure,

depended on network size or not.

Although only four dispersion structures were used to generate data, all five

dispersion structures (GCL, DMd, DMdf, DMr, DMrf) were used to fit each data

set. In what follows, only the results associated with β = (2.78, 2.35, 1.88) and

network size 57 x 23 are presented and discussed in detail. However, the trends

seen here were similar for the other parameter sets, and result tables are provided

in Appendix B.

5.1.1 Model Convergence

Convergence issues due to Hessian instability did arise while fitting some data

sets to the DM models. Table 5.1 presents the percentage of the 750 samples that

reached convergence and were used to calculate the simulation statistics. For each

of the four scenarios, listed in the first column, the percentage of samples that

reached convergence for each of the five model fits are recorded in columns 2–6.

Note that all convergence issues arose either when data were generated with

zero dispersion, but fit by a DM model, i.e., model with non-zero dispersion, (first

row), or when data were generated with dispersion in terms of δg, but fit by the

DMrf model (last column), or both. For the scenario in which the true dispersion

was δg = 0, but the modeled dispersion was DMrf, the fit for only 2% of the

runs converged. Among those that did converge, more than 10 iterations were

Table 5.1: Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 57 x 23.

True Dispersion       Modeled Dispersion*
                      GCL     DMd     DMdf    DMr     DMrf
δg = 0                100     79      45      54      2
δg = 0.9              100     100     100     100     18
δg = −1 + 2.1xg       100     100     100     100     51
ρg = 0.11             100     100     100     100     100

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

needed to reach convergence. In the cases where 100% of the samples reached

convergence, e.g. the GCL model fits, convergence was reached in fewer than 10

iterations.

Consequently, summary statistics are reported throughout the remainder of this

section only when at least 50% of the samples reached convergence; otherwise,

not applicable (NA) is reported.

5.1.2 Estimation of β

In general, most models produced accurate β estimates with small bias, i.e.,

percent relative bias ≤ 1. Tables 5.2 and 5.3 display the percent relative bias

of β for the four dispersion structure scenarios (GCL, DMd, DMdf, DMr) when

β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models fitted in terms of

δg and ρg, respectively.
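
The summary statistics reported in the tables of this chapter can be sketched as follows. This is a minimal illustration, not the code used for the thesis: the function name `summarize` and its arguments are hypothetical, and the formulas are the common definitions (bias relative to the true value, coefficient of variation as the Monte Carlo standard deviation over the absolute mean estimate), which are assumed here to agree with those given in Chapter 4. The 50% convergence threshold is the reporting rule stated in Section 5.1.1.

```python
import statistics


def summarize(estimates, converged, true_value, min_rate=0.5):
    """Monte Carlo summary for one coefficient, or None (reported as NA)."""
    rate = sum(converged) / len(converged)
    if rate < min_rate:
        return None  # too few converged samples to report
    kept = [e for e, ok in zip(estimates, converged) if ok]
    mean_hat = statistics.fmean(kept)
    mc_se = statistics.stdev(kept)  # Monte Carlo standard error (sample SD)
    return {
        "convergence_pct": 100 * rate,
        "pct_relative_bias": 100 * abs(mean_hat - true_value) / abs(true_value),
        "pct_cv": 100 * mc_se / abs(mean_hat),
    }


# Toy input: three samples, the last of which failed to converge.
result = summarize([1.0, 3.0, 99.0], [True, True, False], true_value=4.0)
print(result)
```

Only converged samples enter the mean and the standard deviation, so a fit that fails often can still look accurate among the runs that survive; this is why the convergence percentages are reported alongside the bias tables.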

Not surprisingly, the ML estimates for β obtained under a dispersion structure

that matched the true dispersion structure of the data tend to have the lowest

bias. For example, when data were generated with no over-dispersion, i.e., δg = 0, the

estimates obtained from the GCL regression model produced a percent relative

bias ≤ 0.03 for all three β parameters (Table 5.2). A similar trend can be seen for the

remaining three dispersion structure scenarios.

When the true underlying dispersion structure is δg = 0.9 or δg = −1 + 2.1xg,

fitting the data with any of the five dispersion structures produced β values with

low bias, though the DMd and the DMdf models (Table 5.2) showed slightly

lower bias than the DMr and DMrf models (Table 5.3). However, when the true

underlying dispersion structure is δg = 0 (no dispersion) or ρg = 0.11, then fitting

the DMd and DMdf models produced β values with high bias relative to the other

modeled dispersion structures. Overall, regardless of the

true dispersion structure, the GCL and DMr models tended to have lower bias

than the DMd and DMdf models.

Tables 5.4 and 5.5 display the percent coefficient of variation of β for the DM

models parameterized in terms of δg and ρg, respectively. The true dispersion

structures are listed in the first columns of the tables.

Analogous to what was observed in Tables 5.2 and 5.3, model fits for which

the modeled dispersion matched the true dispersion produced β values with small

standard errors. The standard errors for the DMd model were very large when

data were generated with no dispersion. Finally, the standard errors for all models

when the true dispersion structure was either δg = 0.9 or δg = −1+2.1xg were small

and comparable to each other. Interestingly, when the true dispersion structure

was ρg = 0.11, no model seemed to fit the data well, though standard errors for the

GCL model were considerably higher than those of the other models. In general,

most of the standard errors for β3 are consistently greater than those for β1 and

β2. It should be noted that the average of the estimated standard errors of β (from

Table 5.2: Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (< 0.01, < 0.01, 0.03)    (1.50, 1.03, 24.08)       NA
δg = 0.9              (0.01, < 0.01, 0.10)      (0.01, < 0.01, 0.10)      (0.01, < 0.01, 0.10)
δg = −1 + 2.1xg       (0.02, 0.02, 0.08)        (0.01, 0.01, 0.04)        (0.01, 0.02, 0.06)
ρg = 0.11             (0.25, 0.48, 0.19)        (33.40, 20.87, 78.84)     (33.3, 20.93, 78.70)

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg).

Table 5.3: Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (2.36, 4.00, 241.66)      NA
δg = 0.9              (0.01, < 0.01, 0.10)      NA
δg = −1 + 2.1xg       (0.01, 0.02, 0.06)        (0.43, 0.15, 1.56)
ρg = 0.11             (33.30, 20.93, 78.70)     (0.03, 0.04, 0.19)

* DMr: ρg = ρ; DMrf: ρg = f(xg).

Table 5.4: Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (0.29, 0.24, 2.45)        (13, 8.88, 299.79)        NA
δg = 0.9              (0.38, 0.33, 3.31)        (0.38, 0.33, 3.31)        (0.38, 0.33, 3.31)
δg = −1 + 2.1xg       (0.54, 0.45, 4.62)        (0.51, 0.43, 4.6)         (0.51, 0.43, 4.61)
ρg = 0.11             (5.67, 5.56, 60.20)       (1.80, 1.93, 29.94)       (1.80, 1.93, 30)


Table 5.5: Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                         DMrf
δg = 0                (74.27, 389.71, 2800.45)    NA
δg = 0.9              (0.38, 0.33, 3.31)          NA
δg = −1 + 2.1xg       (0.51, 0.43, 4.61)          (0.56, 0.47, 5.04)
ρg = 0.11             (1.80, 1.93, 30)            (2.25, 2.14, 29.05)


the Stata output) matched the trends seen here with respect to the Monte

Carlo standard errors of β (calculated as per Equation 4.6). As such, only the

Monte Carlo standard errors are discussed throughout this section.

5.1.3 Model Fit and Estimated Dispersion

Table 5.6 provides the median negative log-likelihood values for the samples

that reached convergence and that were generated with β = (2.78, 2.35, 1.88) and

network size 57 x 23. Each row in Table 5.6 corresponds to the true dispersion

structure specified in the first column. In terms of model fit, the improvements

in the log-likelihood and the Pearson χ² statistics indicate that the DM models

do provide a better fit to the data. In fact, the median negative log-likelihood values are

nearly two orders of magnitude smaller for the DM models than for the GCL model. For

data generated with a non-zero dispersion structure, the percentage of Pearson χ²

p-values < 0.05 for samples that reached convergence ranged from 0–20% for the

DM model fits, while the percentages ranged from 60–100% for the GCL model

fits, which suggests that the DM models tend to provide a better fit compared to

the GCL model (results not shown, refer to Simulation Study B for results and a

more detailed discussion).

Table 5.7 displays the percent relative bias and percent coefficient of variation

obtained for the dispersion parameters when the modeled dispersion matched the

true dispersion structure of the data. The true dispersion values are

listed in the first column.

The point estimate for δg = 0.9 produced small bias and corresponding stan-

dard errors; however, the point estimate for ρg = 0.11 produced slightly larger

bias, but a lower corresponding standard error. Interestingly, for δg = −1 + 2.1xg,

Table 5.6: Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 57 x 23.

True Dispersion       Modeled Dispersion*
                      GCL         DMd       DMdf      DMr       DMrf
δg = 0                1503182     20838     NA        20859     NA
δg = 0.9              1502999     22816     22815     23006     NA
δg = −1 + 2.1xg       1503256     24361     24359     24858     24856
ρg = 0.11             1504077     16733     16733     16334     16334


Table 5.7: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.35, 1.88) and network size 57 x 23.

True Dispersion       Modeled Dispersion
                      % Relative Bias     % Coefficient of Variation
δg = 0.9              1.56                4.44
δg = −1 + 2.1xg       (0.1, 3.24)         (2.92, 56.76)
ρg = 0.11             2.73                1.82

the point estimate of the slope produced considerably smaller bias and correspond-

ing standard error than the point estimate of the intercept term, for which the

associated standard error is noticeably the largest.

5.2 Simulation Study B Results

This simulation study was conducted to further compare the robustness of the

DM regression model to that of the GCL regression model, this time over a

more selective range of parameter values. The results of Simulation

Study A suggest that similar trends exist among all network sizes; therefore,

only one network size, 49 x 18, was selected for this study. Furthermore, ranges

for the β parameters were selected to produce plant-pollinator network counts

that matched the averages of those recorded in IWDB. Finally, the ranges for

the dispersion parameters were chosen to be more representative of the parameter

space, incorporating varying structures and values, in order to allow for a better

evaluation of misspecification of dispersion structure.

Analogous to Simulation Study A, comparisons are made between the perfor-

mance of the different DM models discussed (GCL, DMd, DMdf, DMr, DMrf),

fit to the data sets generated for the 20 unique combinations of β and dispersion

values from Table 4.2. The tables in this section report the results of the model

comparisons for the ten dispersion parameter values, given a set of β parameters.

5.2.1 Model Convergence

Once again, convergence issues due to Hessian instability arose while fitting

the DM models to the data. Tables 5.8 and 5.9 provide the percentage of the 750

samples that reached convergence, which were used to calculate the simulation

statistics, for β = (1.2,−0.4, 0.1) and β = (1.1, 0.8, 2.3), respectively.

Table 5.8: Percentage of samples that reached convergence for β = (1.2, −0.4, 0.1) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL     DMd     DMdf    DMr     DMrf
δg = 0                100     86      58      61      19
δg = 0.1              100     100     86      97      53
δg = 0.5              100     100     100     100     87
δg = 2                100     100     100     100     100

δg = 1.5 − 0.5xg      100     100     95      100     75
δg = 0.2 + 1.4xg      100     100     100     100     89
δg = −0.9 + 2.1xg     100     100     100     100     99

ρg = 0.05             100     100     100     100     100
ρg = 0.2              100     100     100     100     100
ρg = 0.5              100     100     100     100     100


Similar trends can be seen in both tables. Most convergence issues were en-

countered when data were generated with true dispersion δg = 0 or δg = 0.1,

but were modeled with a different dispersion structure (rows 1 and 2), or if data

were generated with true dispersion parameterized in terms of δg, but modeled in

terms of ρg = f(xg) (last column). In particular, most convergence issues were

encountered when data generated in terms of dispersion parameter δg, either as

a constant or as a function of covariates, were fit to the DMrf model. For these

data sets, 54% to 99% of the 750 samples reached convergence, and, among those

that did, up to 13 iterations were needed to reach convergence. In the cases where

100% of the samples reached convergence, e.g. the GCL model fits, convergence

Table 5.9: Percentage of samples that reached convergence for β = (1.1, 0.8, 2.3) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL     DMd     DMdf    DMr     DMrf
δg = 0                100     90      56      61      10
δg = 0.1              100     99      86      98      32
δg = 0.5              100     100     99      100     80
δg = 2                100     100     100     100     100

δg = 1.5 − 0.5xg      100     100     97      100     54
δg = 0.2 + 1.4xg      100     100     98      100     87
δg = −0.9 + 2.1xg     100     100     100     100     99

ρg = 0.05             100     100     100     100     100
ρg = 0.2              100     100     100     100     100
ρg = 0.5              100     100     100     100     100


was met within 10 iterations.

Consequently, summary statistics are reported throughout the remainder of this

section only when at least 50% of the samples reached convergence; otherwise,

not applicable (NA) is reported.

5.2.2 Estimation of β

Tables 5.10 and 5.11 display the percent relative bias of β for β = (1.2,−0.4, 0.1)

and network size 49 x 18, for DM models fit in terms of δg and ρg, respectively.

Tables 5.12 and 5.13 display the percent relative bias of β for β = (1.1, 0.8, 2.3)

and network size 49 x 18, for DM models fit in terms of δg and ρg, respectively.

Not surprisingly, most ML estimates for β obtained under a dispersion

Table 5.10: Percent relative bias of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM models in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (0.15, 0.23, 17.81)       (0.19, 0.20, 9.82)        NA
δg = 0.1              (0.29, 0.69, 27.62)       (0.31, 0.73, 26.09)       (0.38, 0.46, 27.65)
δg = 0.5              (0.30, 0.56, 9.86)        (0.52, 0.77, 1.57)        (0.51, 0.81, 1.25)
δg = 2                (0.93, 0.51, 46.45)       (0.86, 0.46, 17.08)       (0.83, 0.46, 16.59)

δg = 1.5 − 0.5xg      (0.27, 0.05, 41.12)       (0.23, 0.05, 37.42)       (0.2, < 0.01, 39.61)
δg = 0.2 + 1.4xg      (0.22, 0.48, 36.57)       (0.17, 0.39, 28.15)       (0.21, 0.37, 28.29)
δg = −0.9 + 2.1xg     (0.51, 0.09, 67.15)       (0.30, 0.18, 29.59)       (0.29, 0.01, 30.86)

ρg = 0.05             (< 0.01, 1.18, 40.95)     (8.10, 2.13, 159.30)      (8.11, 2.16, 159.83)
ρg = 0.2              (0.83, 3.41, 112.80)      (17.11, 3.75, 327.44)     (17.04, 3.82, 327.24)
ρg = 0.5              (2.86, 0.33, 498.76)      (18.51, 2.40, 478.89)     (18.34, 2.40, 475.94)

Table 5.11: Percent relative bias of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM models in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (0.31, 0.35, 31.22)       NA
δg = 0.1              (0.38, 0.64, 25.31)       NA
δg = 0.5              (0.86, 0.79, 8.63)        (0.90, 0.68, 15.07)
δg = 2                (0.85, 0.66, 11.29)       (0.93, 0.65, 12.03)

δg = 1.5 − 0.5xg      (0.13, 0.04, 34.61)       (0.13, 0.01, 24.37)
δg = 0.2 + 1.4xg      (0.56, 0.28, 12.91)       (0.49, 0.67, 13.54)
δg = −0.9 + 2.1xg     (2.47, 0.37, 4.02)        (2.56, 0.34, 0.51)

ρg = 0.05             (< 0.01, 0.92, 15.25)     (< 0.01, 0.98, 15.84)
ρg = 0.2              (0.31, 1.31, 17.75)       (0.35, 1.37, 17.73)
ρg = 0.5              (0.17, 0.11, 139.65)      (0.25, 0.18, 135.33)

Table 5.12: Percent relative bias of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM models in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (0.12, 0.01, 0.04)        (0.18, 0.06, 0.10)        NA
δg = 0.1              (0.05, 0.27, 0.27)        (0.06, 0.28, 0.27)        (0.08, 0.29, 0.13)
δg = 0.5              (0.16, 0.03, 0.33)        (0.13, 0.03, 0.20)        (0.14, 0.03, 0.20)
δg = 2                (0.20, 0.45, 0.90)        (0.13, 0.36, 0.84)        (0.13, 0.37, 0.83)

δg = 1.5 − 0.5xg      (0.20, 0.08, 0.46)        (0.22, 0.07, 0.40)        (0.25, 0.09, 0.50)
δg = 0.2 + 1.4xg      (0.16, 0.34, 0.31)        (15.29, 36.76, 17.11)     (7.80, 17.00, 7.43)
δg = −0.9 + 2.1xg     (0.08, 0.01, 0.08)        (0.40, 0.12, 0.62)        (0.36, 0.19, 0.61)

ρg = 0.05             (0.45, 0.42, 0.65)        (10.45, 0.68, 7.61)       (10.42, 0.64, 7.58)
ρg = 0.2              (0.19, 0.57, 3.79)        (21.71, 2.70, 16.84)      (21.65, 2.65, 16.80)
ρg = 0.5              (2.36, 2.64, 8.76)        (21.47, 3.13, 17.20)      (21.33, 3.08, 17.11)

Table 5.13: Percent relative bias of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM models in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (0.15, 0.11, 0.07)        NA
δg = 0.1              (0.04, 0.30, 0.27)        NA
δg = 0.5              (0.01, 0.07, 0.13)        (0.02, 0.23, 0.01)
δg = 2                (0.73, 0.05, 0.10)        (0.71, 0.07, 0.09)

δg = 1.5 − 0.5xg      (0.26, 0.07, 0.37)        (0.62, 0.19, 0.03)
δg = 0.2 + 1.4xg      (0.45, 0.27, 0.03)        (0.38, 0.15, 0.18)
δg = −0.9 + 2.1xg     (1.07, 0.36, 1.26)        (1.16, 0.29, 1.27)

ρg = 0.05             (0.38, 0.24, 0.10)        (0.43, 0.28, 0.14)
ρg = 0.2              (0.33, 0.09, 1.96)        (0.26, 0.03, 1.91)
ρg = 0.5              (1.14, 0.01, 1.33)        (1.24, 0.05, 1.25)

structure that matched the true dispersion structure of the data seem to have the

lowest bias. For example, when the true underlying dispersion structure has ρg

as a non-zero constant, the estimates obtained from the DMr regression model

produced a bias noticeably lower than the other modeled dispersion structures. A

similar trend can be seen for most of the remaining dispersion scenarios.

Interestingly, when data were generated with δg = 0, the estimates obtained

from the GCL and DM regression models produced small bias for β1 and β2,

while the estimates obtained by the GCL model produced the lowest bias for

β3. When the true dispersion structure is δg = δ or δg = f(xg), most β values

had low bias. For β = (1.2,−0.4, 0.1), the DMd and DMdf models (Table 5.10)

showed slightly lower bias than the DMr and DMrf models (Table 5.11). For

β = (1.1, 0.8, 2.3), the DMd and DMdf models (Table 5.12) showed slightly lower

bias for all values of δg except δg = 0.5, for which the DMr and DMrf models

(Table 5.13) showed the lowest bias; however, when δg = f(xg), the other models

performed as well as or better than the DMdf model. In fact, when δg = 0.2 + 1.4xg,

the DMd and DMdf models produced the largest bias. Conversely, when the true

dispersion structure was ρg = ρ, then the DMd and DMdf models produced β val-

ues with noticeably high bias relative to the other modeled dispersion structures.

However, regardless of the true dispersion structure, the GCL, DMr and DMrf

models consistently showed lower bias than the DMd and DMdf models.

In general, the bias of β3 was greater than that of β1 and β2. Here β3 represents the

effect of plant relative species abundance, a continuous covariate, on the

interaction probabilities. Also, as the values of the dispersion parameters increase, the

biases of β1 and β3 also tend to increase, which is to be expected since the greater the

dispersion, the more heterogeneous the network, making it more difficult to detect

the structure.

Tables 5.14 and 5.15 display the percent coefficient of variation of β for

β = (1.2, −0.4, 0.1), and Tables 5.16 and 5.17 display it for β = (1.1, 0.8, 2.3). Similar to what was

seen in Tables 5.10 through 5.13, model fits for which the modeled dispersion matched

the true dispersion produced β values with small standard errors. Further, for

data generated with δg = 0, standard errors for all models, with the exception of

the DMdf model, were comparatively small. The standard errors for all models

when the true dispersion structure was either δg = δ or δg = f(xg) were small and

comparable to each other, though the DMd and DMdf models produced slightly

lower standard errors. The one anomaly was for β = (1.1, 0.8, 2.3) and δg =

0.2 + 1.4xg in which case the standard errors of β obtained from the DMd and

DMdf models were considerably larger. Interestingly, when the true dispersion

structure was ρg = ρ, all standard errors were relatively large regardless of the

modeled dispersion. Further, the standard errors for the GCL model tended to be

higher than those of the other models for all data sets generated with dispersion.

Overall, the standard errors tended to increase as the amount of dispersion

increased, which is expected with data showing a higher level of heterogeneity.

Also, the standard errors for β2 are considerably higher than those for β1 and β3.

Table 5.14: Percent coefficient of variation of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM models in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (6.25, 12.83, 3.09)       (6.26, 12.77, 3.17)       NA
δg = 0.1              (6.69, 12.85, 3.32)       (6.69, 12.83, 3.30)       (6.64, 12.83, 3.25)
δg = 0.5              (7.95, 15.66, 3.90)       (7.69, 15.63, 3.79)       (7.68, 15.67, 3.79)
δg = 2                (11.24, 21.47, 5.41)      (9.57, 19.48, 4.6)        (9.58, 19.47, 4.59)

δg = 1.5 − 0.5xg      (7.01, 12.97, 3.58)       (6.93, 12.92, 3.56)       (6.90, 12.83, 3.56)
δg = 0.2 + 1.4xg      (8.94, 16.67, 4.10)       (8.33, 16.01, 3.91)       (8.33, 16.01, 3.93)
δg = −0.9 + 2.1xg     (12.19, 22.61, 5.95)      (9.91, 19.56, 4.86)       (9.88, 19.61, 4.86)

ρg = 0.05             (10.82, 21.15, 5.33)      (9.19, 19.06, 4.69)       (9.19, 19.11, 4.69)
ρg = 0.2              (19.04, 40.47, 9.72)      (11.76, 27.36, 6.55)      (11.76, 27.37, 6.55)
ρg = 0.5              (38.7, 81.84, 20.84)      (18.92, 40.55, 10.76)     (18.92, 40.51, 10.78)

Table 5.15: Percent coefficient of variation of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM models in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (6.18, 12.36, 3.12)       NA
δg = 0.1              (6.71, 12.74, 3.30)       NA
δg = 0.5              (7.84, 15.63, 3.81)       (7.95, 15.84, 3.84)
δg = 2                (10.17, 19.54, 4.67)      (10.17, 19.55, 4.67)

δg = 1.5 − 0.5xg      (6.94, 12.90, 3.56)       (6.95, 12.86, 3.60)
δg = 0.2 + 1.4xg      (8.60, 16.06, 3.93)       (8.63, 16.00, 3.92)
δg = −0.9 + 2.1xg     (10.72, 19.45, 4.94)      (10.65, 19.50, 4.94)

ρg = 0.05             (9.76, 19.08, 4.69)       (9.77, 19.12, 4.70)
ρg = 0.2              (13.2, 27.41, 6.52)       (13.20, 27.43, 6.52)
ρg = 0.5              (20.64, 40.92, 10.75)     (20.71, 40.86, 10.77)

Table 5.16: Percent coefficient of variation of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM models in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (4.76, 4.43, 0.07)        (4.75, 4.45, 0.07)        NA
δg = 0.1              (5.24, 4.82, 0.08)        (5.23, 4.83, 0.08)        (5.24, 4.77, 0.08)
δg = 0.5              (5.80, 5.71, 0.09)        (5.64, 5.66, 0.09)        (5.63, 5.65, 0.09)
δg = 2                (8.68, 8.08, 0.13)        (7.83, 7.42, 0.12)        (7.84, 7.42, 0.12)

δg = 1.5 − 0.5xg      (5.51, 5.31, 0.07)        (5.49, 5.29, 0.07)        (5.47, 5.28, 0.07)
δg = 0.2 + 1.4xg      (7.04, 6.18, 0.10)        (181.49, 428.5, 1.91)     (176.26, 380.47, 1.57)
δg = −0.9 + 2.1xg     (9.19, 8.66, 0.13)        (7.92, 7.89, 0.12)        (7.94, 7.88, 0.12)

ρg = 0.05             (10.10, 10.34, 0.15)      (8.56, 8.64, 0.13)        (8.58, 8.65, 0.13)
ρg = 0.2              (20.02, 21.08, 0.31)      (11.36, 13.38, 0.20)      (11.36, 13.39, 0.20)
ρg = 0.5              (39, 40.31, 0.65)         (18.29, 20.17, 0.32)      (18.31, 20.21, 0.32)

Table 5.17: Percent coefficient of variation of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM models in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (4.85, 4.42, 0.07)        NA
δg = 0.1              (5.24, 4.81, 0.08)        NA
δg = 0.5              (5.72, 5.69, 0.09)        (5.70, 5.57, 0.09)
δg = 2                (8.25, 7.50, 0.12)        (8.25, 7.48, 0.12)

δg = 1.5 − 0.5xg      (5.50, 5.31, 0.07)        (5.33, 5.44, 0.07)
δg = 0.2 + 1.4xg      (6.86, 6.04, 0.10)        (6.75, 6.05, 0.10)
δg = −0.9 + 2.1xg     (8.45, 7.95, 0.12)        (8.50, 7.95, 0.12)

ρg = 0.05             (9.05, 8.65, 0.13)        (9.07, 8.66, 0.13)
ρg = 0.2              (12.81, 13.47, 0.20)      (12.82, 13.46, 0.20)
ρg = 0.5              (20.17, 20.21, 0.32)      (20.19, 20.24, 0.32)

5.2.3 Model Fit and Estimated Dispersion

Tables 5.18 and 5.19 provide the median negative log-likelihood values for the

samples that reached convergence and for the scenarios with β = (1.2,−0.4, 0.1)

and β = (1.1, 0.8, 2.3), respectively. Both tables demonstrate that a model as-

suming some dispersion provided a better fit compared to the GCL model. The

improvement in log-likelihood was most marked when the true dispersion structure

was ρg = ρ, and as the intragroup correlation coefficient (ρ) increased.
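
One way to see why the gain over the GCL model grows with ρ is through the variance of a Dirichlet-multinomial count. The expression below is the standard textbook result, written in notation that is assumed here rather than taken from this chapter: n_g is the total number of interactions recorded for pollinator group g, p_gj is the probability that group g interacts with plant j, y_gj is the corresponding count, and ρ_g is taken to be the usual intragroup correlation.

```latex
\operatorname{Var}(y_{gj}) = n_g \, p_{gj} \, (1 - p_{gj}) \, \bigl[ 1 + (n_g - 1) \, \rho_g \bigr]
```

Setting ρ_g = 0 recovers the multinomial variance assumed by the GCL model, so the larger ρ_g is, the more variability the GCL model leaves unaccounted for.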

The median negative log-likelihood values for the DM models are approximately 10 to 30 percent

of those for the GCL model. However, for any given true dispersion structure, the fit

was comparable across the DM models.

Table 5.18: Median negative log-likelihood values for samples generated with β = (1.2, −0.4, 0.1) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL      DMd      DMdf     DMr      DMrf
δg = 0                4721     1238     1240     1243     NA
δg = 0.1              4725     1258     1259     1259     1263
δg = 0.5              4714     1317     1316     1318     1318
δg = 2                4723     1359     1358     1363     1362

δg = 1.5 − 0.5xg      4727     1281     1281     1281     1282
δg = 0.2 + 1.4xg      4712     1339     1339     1341     1342
δg = −0.9 + 2.1xg     4715     1352     1352     1357     1356

ρg = 0.05             4719     1370     1370     1366     1365
ρg = 0.2              4709     1149     1148     1144     1144
ρg = 0.5              4575     646      646      645      644

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Tables 5.20 and 5.21 provide the percentage of Pearson χ² p-values < 0.05 for

samples that reached convergence for the scenarios with β = (1.2,−0.4, 0.1) and

β = (1.1, 0.8, 2.3), respectively. The Pearson chi-squared goodness of fit tests also

Table 5.19: Median negative log-likelihood values for samples generated with β = (1.1, 0.8, 2.3) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL      DMd      DMdf     DMr      DMrf
δg = 0                9221     1542     1545     1549     NA
δg = 0.1              9243     1570     1572     1571     NA
δg = 0.5              9219     1655     1655     1657     1658
δg = 2                9219     1776     1776     1783     1782

δg = 0.2 + 1.4xg      9223     1697     1697     1700     1700
δg = −0.9 + 2.1xg     9239     1786     1785     1793     1792
δg = 1.5 − 0.5xg      9230     1603     1603     1604     1606

ρg = 0.05             9210     1803     1802     1796     1795
ρg = 0.2              9197     1473     1473     1466     1466
ρg = 0.5              9001     802      803      800      799


suggest that the DM models provide an adequate fit compared to the GCL model.

In fact, the DM models provided a better fit than the GCL model in the absence

of dispersion. Conversely, the GCL model provides an inadequate fit 60−100% of

the time, for data generated with any of the dispersion structures. Interestingly,

the DMd and DMdf models tended to consistently provide the best fit regardless

of dispersion structures, while the DMr and DMrf models tended to have a higher

percentage of p-values < 0.05 as the values of the dispersion parameters increased.

When the modeled dispersion matched the true dispersion structure of the

data, the percent relative bias and percent coefficient of variation of the

dispersion parameter estimates could be evaluated. Tables 5.22 and 5.23 present the percent

relative bias and percent coefficient of variation for the dispersion parameters for

β = (1.2, −0.4, 0.1) and β = (1.1, 0.8, 2.3), respectively.

When the dispersion structure was either δg = δ or ρg = ρ, the DM model

Table 5.20: Percentage of χ² p-values < 0.05 for samples generated with β = (1.2, −0.4, 0.1) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL     DMd     DMdf    DMr     DMrf
δg = 0                6       0       1       0       NA
δg = 0.1              61      0       0       0       0
δg = 0.5              100     0       0       2       2
δg = 2                100     7       7       25      25

δg = 1.5 − 0.5xg      99      0       0       0       0
δg = 0.2 + 1.4xg      100     2       2       8       8
δg = −0.9 + 2.1xg     100     8       7       25      24

ρg = 0.05             100     1       1       5       5
ρg = 0.2              100     5       6       16      16
ρg = 0.5              100     7       7       16      16


Table 5.21: Percentage of χ² p-values < 0.05 for samples generated with β = (1.1, 0.8, 2.3) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL     DMd     DMdf    DMr     DMrf
δg = 0                5       0       1       0       NA
δg = 0.1              60      0       0       0       NA
δg = 0.5              100     0       0       1       0
δg = 2                100     5       5       14      14

δg = 1.5 − 0.5xg      99      0       0       0       0
δg = 0.2 + 1.4xg      100     1       0       3       2
δg = −0.9 + 2.1xg     100     5       5       18      19

ρg = 0.05             100     2       2       7       7
ρg = 0.2              100     6       6       18      18
ρg = 0.5              100     8       9       20      20

Table 5.22: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (1.2, −0.4, 0.1) and network size 49 x 18.

True Dispersion       Modeled Dispersion
                      % Relative Bias       % Coefficient of Variation
δg = 0.1              20.00                 1.42E+19
δg = 0.5              3.40                  76.20
δg = 2                2.00                  3.35

δg = 1.5 − 0.5xg      (3.20, 643.60)        (24.00, 2226.00)
δg = 0.2 + 1.4xg      (110.00, 169.36)      (80.00, 345.71)
δg = −0.9 + 2.1xg     (0.78, 10.81)         (15.56, 103.33)

ρg = 0.05             < 0.01                1.00
ρg = 0.2              0.50                  6.50
ρg = 0.5              4.40                  5.80

Table 5.23: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (1.1, 0.8, 2.3) and network size 49 x 18.

True Dispersion       Modeled Dispersion
                      % Relative Bias       % Coefficient of Variation
δg = 0.1              20.00                 5.34E+09
δg = 0.5              4.20                  73.00
δg = 2                4.60                  11.00

δg = 1.5 − 0.5xg      (3.73, 638.20)        (33.33, 2364.00)
δg = 0.2 + 1.4xg      (403.50, 126.29)      (8710.00, 1419.29)
δg = −0.9 + 2.1xg     (< 0.01, 10.62)       (12.22, 90.95)

ρg = 0.05             < 0.01                0.60
ρg = 0.2              < 0.01                2.20
ρg = 0.5              < 0.01                5.20

tended to estimate the dispersion with small bias and standard error, except when

δ was small (δg = 0.1 or 0.5). When the dispersion structure was δg = f(xg), the

bias was generally small for the intercept parameter, but generally large for the

slope parameter. The associated standard errors were large for both the intercept

and slope parameters.

The trends presented in this chapter persisted in all scenarios and networks

simulated. Full results are presented in Appendix B.

Chapter 6

Discussion

The results of both simulation studies suggest that employing a model that

matches the true dispersion structure of the data tends to produce point estimates

of β with the smallest bias and standard errors, but, in general, bias and standard

errors increase as the amount of dispersion increases. However, the GCL and DMr

models seem to consistently provide estimates with small bias and small standard

errors, regardless of the true dispersion structure.

In the absence of dispersion, convergence issues, namely those due to Hessian

instability, were encountered when the DM models were used to fit the data. This

result is not surprising since the DM model collapses to the standard GCL model

when the group random effects are zero, making the use of the GCL model more

stable. Nevertheless, among those that did converge, the ML estimates obtained

for β using the DM models were comparable to those obtained using the GCL

model. In fact, the DM models were able to detect a lack of over-dispersion as

demonstrated by corresponding dispersion parameter estimates that were close to

zero and not statistically significant.

In the presence of over-dispersion, the DM models, on average, outperformed

the GCL model. As the values of δg and ρg increased, the ML estimates for β

from the DM models tended to have smaller bias and estimated standard errors

relative to those of the GCL estimates. The DM models parameterized in terms

of δg performed best when fit to data with the same dispersion structure, while

the DM models parameterized in terms of ρg tended to perform well regardless of

whether the dispersion was generated in terms of δg or ρg.

Recall that each model contained three covariates, two corresponding to link-

age rules and the third corresponding to relative species abundance. Results of

Simulation Study B demonstrated that the bias and standard errors for the plant

relative species abundance covariate estimates were noticeably larger than those

corresponding to the binary covariate estimates. One possible explanation may

be that relative species abundance is a continuous variable and that DM regres-

sion tends to handle estimation of coefficients for binary covariates better. Also,

as the values of the dispersion parameters increased, the corresponding bias and

standard errors for β also increased. This finding is not surprising since the more

over-dispersed the data are, the harder it may be to learn the structure of the

data.

The same trends can be described for the estimation of the dispersion pa-

rameters. In general, estimates for the dispersion parameters were accurate with

small standard errors, but as the value of the dispersion parameters increased,

the bias and standard errors of the point estimates also increased. For data

generated with little dispersion, i.e. δg = 0.1, the DMd model had difficulties

obtaining an estimate of δg. The DM model parameterized in terms of ρg was able

to consistently provide estimates with small bias for the dispersion parameters.

However, the DMdf model had difficulties obtaining an estimate corresponding to

the group-level covariate (pollinator relative species abundance). Consequently,

the corresponding bias and standard errors were noticeably large. Pollinator rela-

tive species abundance is a continuous covariate, suggesting, once again, that DM

regression may have issues estimating this type of covariate.

Although the GCL model seems competitive with the DM models with respect to bias

and standard errors, the DM model has an advantage over the GCL model in

terms of model fit. The reduction in the DM log-likelihood compared to the GCL

log-likelihood suggests that modeling the dispersion helps reduce the variation

in the model. This hypothesis is supported by comparing the corresponding χ²

statistics, which also suggest that the DM models provide a fit that is as good or better,

regardless of the true dispersion structure.

DM regression seems to be a robust method for modeling plant-pollinator networks

compared to GCL regression. Since one cannot predict the true dispersion

structure of an observed network, DM regression provides a procedure for the de-

tection and estimation of the known factors that contribute to network pattern.

More specifically, the results of the simulation studies suggest that, in practice, all

five DM models (GCL, DMd, DMdf, DMr, and DMrf) can be fit initially. Once

the models are fit, comparisons can be made via log-likelihood values, or more

formally via the Pearson χ² goodness-of-fit tests. If the estimates of the dispersion

parameters are not significantly different from zero and have inflated standard

errors, then there is evidence to suggest that the data are not over-dispersed, in

which case the GCL model is an appropriate choice. However, if the estimates

of β and the corresponding standard errors do not differ greatly between models,

and all models provide the same fit, then any one may be selected. Alternatively,

if the estimates of the dispersion parameters are significantly different from zero,

then the data may be over-dispersed, in which case the DM models may be an

appropriate choice. If convergence issues arise, then perhaps the DM model with

an alternative parameterization is more appropriate.
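
The screening steps described above can be sketched in code. This is a simplified, hypothetical rendering rather than a procedure implemented in this thesis: the `Fit` record, the function `choose_model`, the 0.05 threshold, and the use of the smallest negative log-likelihood as a tie-breaker are all assumptions. In the usage example, the negative log-likelihood values are borrowed from the δg = 0.5 row of Table 5.18 purely as placeholders, while the convergence flags and p-values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fit:
    name: str                      # "GCL", "DMd", "DMdf", "DMr" or "DMrf"
    converged: bool
    neg_loglik: float
    dispersion_p: Optional[float]  # p-value for the dispersion term(s); None for GCL


def choose_model(fits: List[Fit], alpha: float = 0.05) -> str:
    """Pick a model name following the screening steps described in the text."""
    usable = [f for f in fits if f.converged]  # drop fits with convergence issues
    # DM fits whose dispersion estimate differs significantly from zero.
    overdispersed = [
        f for f in usable
        if f.dispersion_p is not None and f.dispersion_p < alpha
    ]
    if not overdispersed:
        return "GCL"  # no evidence of over-dispersion
    # Assumed tie-breaker: the DM fit with the smallest negative log-likelihood.
    return min(overdispersed, key=lambda f: f.neg_loglik).name


# Placeholder log-likelihoods from Table 5.18; flags and p-values are made up.
fits = [
    Fit("GCL", True, 4714.0, None),
    Fit("DMd", True, 1317.0, 0.001),
    Fit("DMdf", True, 1316.0, 0.002),
    Fit("DMr", True, 1318.0, 0.001),
    Fit("DMrf", False, 1318.0, 0.001),
]
print(choose_model(fits))
```

Comparing raw log-likelihoods across models with different numbers of dispersion parameters is crude, so in practice the Pearson χ² goodness-of-fit results would be consulted alongside this kind of rule.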

If over-dispersion exists, the DM model can be parameterized to account for a

non-zero constant or pollinator-specific covariates that can be used to absorb or

explain this extra-multinomial variability. If the dispersion parameters are modeled

as a function of covariates, the estimates of the pollinator-specific parameters

can provide additional information in terms of the impact these covariates have

on the number of interactions that occur between a given pollinator and plant

pair (through the use of the DMdf model) or the impact these covariates have on

the correlation among individuals of a species in selecting a particular plant species

(through the DMrf model).
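To illustrate what extra-multinomial variability means in practice, the sketch below compares the variance of a single plant's visit count under a multinomial and under a Dirichlet-multinomial model. It assumes the standard parameterization in which the intra-class correlation ρ and the Dirichlet precision α0 satisfy ρ = 1/(1 + α0), which may differ from the parameterization of δg and ρg used in this thesis. Under that assumption each count variance is inflated by a factor of 1 + (n − 1)ρ. The choice probabilities and the number of visits n = 50 are hypothetical; ρ = 0.11 is one of the values used in Simulation Study A.

```python
import random


def dm_variance_inflation(n, rho):
    """Factor by which a DM count variance exceeds the multinomial variance."""
    return 1.0 + (n - 1) * rho


def sample_multinomial(rng, n, probs):
    """Draw one multinomial count vector with n trials."""
    counts = [0] * len(probs)
    for j in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[j] += 1
    return counts


def sample_dirichlet_multinomial(rng, n, probs, rho):
    """Draw probabilities from a Dirichlet, then counts from a multinomial."""
    alpha0 = (1.0 - rho) / rho  # precision implied by rho = 1 / (1 + alpha0)
    gammas = [rng.gammavariate(p * alpha0, 1.0) for p in probs]
    total = sum(gammas)
    return sample_multinomial(rng, n, [g / total for g in gammas])


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(2011)
probs, n, rho, reps = [0.2, 0.3, 0.5], 50, 0.11, 2000  # probs and n are hypothetical
var_mult = sample_variance(
    [sample_multinomial(rng, n, probs)[0] for _ in range(reps)]
)
var_dm = sample_variance(
    [sample_dirichlet_multinomial(rng, n, probs, rho)[0] for _ in range(reps)]
)
print("multinomial variance:", var_mult)
print("DM variance:", var_dm)
print("theoretical inflation:", dm_variance_inflation(n, rho))
```

The ratio of the two simulated variances should be close to the theoretical inflation factor, which grows with the number of visits n.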

Although DM regression appears to be a robust model for pollination network

data, at least for the results presented in this thesis, it should be noted that the

DMrf model is problematic: Hessian instability results in convergence issues.

Furthermore, Simulation Study B suggests that the coefficient estimates for continuous

covariates may show larger bias and standard errors.

Additionally, highly populated networks, such as those generated in Simulation

Study A, are unrealistic and are not representative of those found in observed

networks. The results of Simulation Study A do, however, provide insight into

the asymptotic properties of the DM model parameters, which suggest that the

DM model does produce accurate estimates with small standard errors. Similarly,

despite the efforts in the design of Simulation Study B, the generated networks did

not exhibit the sparse and nested properties observed in real world networks. In


fact, the simulated networks were not designed to account for sampling bias, which

is known to influence observed network structure (Vazquez et al., 2009a).


Chapter 7

Conclusions

This thesis introduces Dirichlet-multinomial regression to the modeling of pollination

networks. It further evaluates the robustness of multinomial regression models

to misspecification of dispersion structure within an ecological context. Specifically,

GCL and DM regression were used to model the interaction probabilities

of various simulated plant-pollinator networks as a function of trait matching and

relative species abundance. The performance of the DM model was compared with that

of the standard GCL model under misspecification of dispersion structure

through simulation studies. The results of the simulation studies

suggest that both the GCL model and the DM models perform comparably for

plant-pollinator network data. However, the DM model outperforms the GCL

model in the presence of over-dispersion and significantly improves the model fit.

To date, simple statistical methods such as χ² tests for proportions and simple

linear regressions have been employed on both real-world and simulated networks

to predict network structure. Unfortunately, these methods have only confirmed

that the factors in question contribute to and only partially explain network struc-


ture, but do not quantify the relative contribution of each factor. Further, they are

not commonly used in practice since they are all relatively ‘new’ methods introduced

within the past few years (Allesina et al., 2008; Santamaría and Rodríguez-Gironés,

2007; Stang et al., 2009; Vazquez et al., 2009b).

The mechanisms driving the topological features of plant-pollinator networks

were examined using an extension of the conceptual framework proposed by Vazquez

et al. (2009) and cutting-edge statistical modeling techniques borrowed from

econometrics (random utility model) (Guimaraes and Lindrooth, 2007). More

specifically, DM regression was used to exploit the theories of neutrality and link-

age rules to determine their relative contribution to the structure of mutualistic

plant-pollinator networks. DM regression uses a hierarchical model within a

Bayesian framework to model plant-pollinator interaction probabilities as a func-

tion of plant-pollinator characteristics (e.g. complementary phenotypic traits). In

short, the DM model allows for the exploration of covariates that are plant spe-

cific, pollinator specific, or both, and facilitates identification of factors that affect

interaction probabilities and estimates the relative contribution of those factors.

More specifically, DM regression uses a logit formulation to model the interac-

tion probabilities. Essentially, the log ratio of two probabilities is being modeled

as a linear combination of covariates. As such, the model obtains a β estimate for

each covariate introduced into the model, which can be easily interpreted. The β

estimate corresponding to a particular covariate reflects the effect of a change

in that covariate's value on the log odds of choosing one plant species over

another. In other words, the probabilities pgj provide a measure of

the strength of the interactions between pollinator g, g = 1, . . . , G, and plant

j, j = 1, . . . , J , while β summarizes a covariate’s relative contribution to those


interaction probabilities.
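The logit formulation can be made concrete with a small numerical sketch. The code below computes interaction probabilities for a single pollinator across three plants and checks that the log ratio of two probabilities equals the linear combination of covariate differences. The β values are those used in Simulation Study A; the covariate matrix and the function name are hypothetical and chosen only for illustration.

```python
import math


def interaction_probs(beta, covariates):
    """Logit probabilities over plants for one pollinator.

    covariates[j] holds the covariate vector for plant j.
    """
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    shift = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - shift) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


beta = [2.78, 2.35, 1.88]  # values from Simulation Study A
x = [
    [1.0, 0.2, 0.0],  # hypothetical covariates for plant 1
    [0.0, 0.5, 1.0],  # hypothetical covariates for plant 2
    [0.0, 0.1, 0.0],  # hypothetical covariates for plant 3
]
p = interaction_probs(beta, x)

# The log ratio of two probabilities is linear in the covariate differences.
log_ratio = math.log(p[0] / p[1])
linear = sum(b * (a - c) for b, a, c in zip(beta, x[0], x[1]))
print(p)
print(log_ratio, linear)
```

Because the normalizing sum cancels in the ratio, each β can be read as the change in the log odds between two plants per unit difference in the corresponding covariate.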

The results presented in this thesis suggest that DM regression is a promising,

robust statistical approach to evaluate the processes driving the structural patterns

in plant-pollinator mutualistic networks. Additionally, the model can be extended

to incorporate additional types of covariates, such as time and space, or can be

used to learn a set of linkage rules. Furthermore, no other simulation studies

have been done, in econometrics or ecology, to evaluate the effect of misspecifying

the dispersion structure.

Although the proposed DM model is a step forward for the

study of pollination networks, the model has yet to be tested on real-world data

sets. In order to conduct a general evaluation of these processes on network

structure, detailed information needs to be measured at the time of data collection.

Hopefully, this research will motivate increased sampling efforts that facilitate the

collection of detailed and representative samples of observed networks over a longer

span of time.

7.1 Future Work

In light of the results presented in this thesis, additional work devoted to

further developing and applying the DM regression in the context of pollination

networks can be groundbreaking for the research of mutualisms. Some ideas and

considerations for future work include:

1. Currently, the DM model does not account for structural zero counts in

the interaction matrix. The presence of zero counts can be attributed to

sampling effects or other informative or relevant driving forces. Therefore,


additional extensions to the DM regression model, such as zero-inflated ad-

justments, can be made to account for these structural zeroes.

2. Studies have confirmed that temporal and spatial variability impose con-

straints on potential interactions which in turn influence network pattern

and interaction probabilities (Vazquez et al., 2009a; Jordano et al., 2006).

Species that do not overlap in space or time will not and cannot interact.

Hence, a true evaluation of the mechanisms driving network structure re-

quires an exploration of the spatio-temporal distribution of the species in

these networks.

3. Recent studies exploring the theory of linkage rules suggest that such ecolog-

ical processes drive network structure and encourage the search for linkage

rules in the field (Santamaría and Rodríguez-Gironés, 2007). As such, DM

regression can be used to test or learn a set of linkage rules.

4. The nestedness typically seen in pollination networks seems to increase as the

size of the network increases. Nested networks tend to be more robust, i.e.

resistant to species loss, making them less vulnerable to species extinction

(Bascompte and Jordano, 2007). An assessment of the robustness of these

networks in the presence of invading species can provide further insight into

the study of mutualisms and co-evolution (Díaz-Castelazo et al., 2010).


Bibliography

Allesina, S., Alonso, D., Pascual, M., 2008. A general model for food web structure.

Science 320, 658–661.

Bascompte, J., Jordano, P., 2007. Plant-animal mutualistic networks: the archi-

tecture of biodiversity. Annual Review of Ecology, Evolution and Systematics

38 (1), 567–593.

Bascompte, J., Jordano, P., Bluthgen, N., 2006. Asymmetric coevolutionary net-

works facilitate biodiversity maintenance. Science 312, 431–433.

Bascompte, J., Jordano, P., Melian, C. J., Olesen, J. M., 2003. The nested assem-

bly of plant-animal mutualistic networks. Proceedings of the National Academy

of Sciences of the United States of America 100 (16), 9383–9387.

Devroye, L., 1986. Non-uniform random variate generation. Springer-Verlag, New

York.

Díaz-Castelazo, C., Guimaraes, J., Jordano, P., Thompson, J. N., Marquis, R. J.,

Rico-Gray, V., 2010. Changes of a mutualistic network over time: reanalysis

over a 10-year period. Ecology 91 (3), 793–801.

Faraway, J. J., 2006. Extending the Linear Model with R: Generalized Linear,


Mixed Effects and Nonparametric Regression Models. Chapman & Hall/CRC,

Boca Raton, FL.

Guimaraes, P., 2005. A simple approach to fit the beta-binomial model. Stata

Journal 5 (3), 385–394.

Guimaraes, P., Galdini Raimundo, R. L., Cagnolo, L., 2011. Interaction Web

Database. National Center for Ecological Analysis and Synthesis, University of

California, Santa Barbara, USA.

URL http://www.nceas.ucsb.edu/interactionweb/index.html

Guimaraes, P., Lindrooth, R. C., 2007. Controlling for overdispersion in grouped

conditional logit models: A computationally simple application of dirichlet-

multinomial regression. The Econometrics Journal 10 (2), 439–452.

Hausman, J. A., Hall, B. H., Griliches, Z., 1984. Econometric models for count

data with an application to the patents-R&D relationship. Econometrica 52,

909–938.

Johnson, N. L., Kotz, S., Balakrishnan, N., 1997. Discrete Multivariate Distribu-

tions. John Wiley & Sons, Inc., New York.

Jordano, P., 1987. Patterns of mutualistic interactions in pollination and seed dis-

persal: Connectance, dependence asymmetries, and coevolution. The American

Naturalist 129 (5), 657–677.

Jordano, P., Bascompte, J., Olesen, J. M., 2003. Invariant properties in coevolu-

tionary networks of plant animal interactions. Ecology Letters 6, 69–81.


Jordano, P., Bascompte, J., Olesen, J. M., 2006. The ecological consequences

of complex topology and nested structure in pollination webs. University Of

Chicago Press, Chicago, IL.

Kearns, C. A., Inouye, D. W., Waser, N. M., 1998. Endangered mutualisms: The

conservation of plant-pollinator interactions. Annual Review of Ecology and

Systematics 29 (1), 83–112.

Maddala, G., 1983. Limited-dependent and qualitative variables in econometrics.

Cambridge University Press, New York.

McCulloch, C. E., Searle, S. R., 2005. Generalized, Linear, and Mixed Models.

John Wiley & Sons, Inc., New York.

McFadden, D., 1974. Conditional logit analysis of qualitative choice behavior.

Vol. 1. Academic Press, New York, Ch. 4, pp. 105–142.

Mosimann, J. E., 1962. On the compound multinomial distribution, the multivariate

beta-distribution, and correlations among proportions. Biometrika 49,

65–82.

Olesen, J. M., Bascompte, J., Dupont, Y. L., Jordano, P., 2007. The modularity

of pollination networks. Proceedings of the National Academy of Sciences of the

United States of America 104 (50), 19891–19896.

R Development Core Team, 2011. R: A Language and Environment for Statistical

Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN

3-900051-07-0.

URL http://www.R-project.org


Ravasz, M., Balog, A., Marko, V., Neda, Z., 2005. The species abundances distri-

bution in a new perspective. Arxiv preprint qbio0502029, 8.

Robert, C., Casella, G., 2010. Introducing Monte Carlo Methods with R. Springer,

New York, NY.

Santamaría, L., Rodríguez-Gironés, M. A., 2007. Linkage rules for plant-pollinator

networks: Trait complementarity or exploitation barriers? PLoS Biol 5 (2), e31.

Shonkwiler, J. S., Hanley, N., 2003. A new approach to random utility model-

ing using the dirichlet multinomial distribution. Environmental and Resource

Economics 26 (3), 401–416.

Stang, M., Klinkhamer, P. G. L., Waser, N. M., Stang, I., van der Meijden, E.,

2009. Size-specific interaction patterns and size matching in a plant-pollinator

interaction web. Annals of Botany 103 (9), 1459–1469.

StataCorp, 2011. Stata Statistical Software: Release 11. StataCorp LP, College

Station, TX.

Thompson, J., 2005. The geographic mosaic of coevolution. University of Chicago

Press.

Vazquez, D. P., 2005. Degree distribution in plant-animal mutualistic networks:

forbidden links or random interactions? Oikos 108, 421–426.

Vazquez, D. P., Bluthgen, N., Cagnolo, L., Chacoff, N. P., 2009a. Uniting pattern

and process in plant-animal mutualistic networks: a review. Annals of Botany

103 (9), 1445–1457.


Vazquez, D. P., Chacoff, N. P., Cagnolo, L., 2009b. Evaluating multiple deter-

minants of the structure of plant-animal mutualistic networks. Ecology 90 (8),

2039–2046.

Vazquez, D. P., Morris, W. F., Jordano, P., 2005. Interaction frequency as a

surrogate for the total effect of animal mutualists on plants. Ecology Letters

8 (10), 1088–1094.


Appendix A

Derivation of Conditional Logit Model

Maddala (1983) provides a derivation of McFadden’s conditional logit model.

The derivation is as follows:

Suppose an individual faces $J$ choices and define $Y_j^*$ as the level of indirect utility
associated with the $j$th choice. The observed variables $Y_j$ are defined as:

$$Y_j = \begin{cases} 1, & \text{if } Y_j^* = \max(Y_1^*, Y_2^*, \ldots, Y_J^*), \\ 0, & \text{otherwise.} \end{cases}$$

Then,

$$Y_j^* = V_j(X_j) + \varepsilon_j, \tag{A.1}$$

where $X_j$ is a vector of attributes for the $j$th choice and $\varepsilon_j$ is a random error term
that captures unobserved variability. Assume that the $\varepsilon_j$ are independently and
identically distributed (i.i.d.) with a type I extreme value distribution, whose probability
density function (PDF) and cumulative distribution function (CDF) are:

$$f(\varepsilon_j) = \exp(-\varepsilon_j - e^{-\varepsilon_j}) \tag{A.2}$$

and

$$F(\varepsilon) = P(\varepsilon_j < \varepsilon) = \exp(-e^{-\varepsilon}), \tag{A.3}$$

respectively. Then it can be shown that:

$$P(Y_j = 1 \mid X) = \frac{e^{V_j}}{\sum_{k=1}^{J} e^{V_k}}, \tag{A.4}$$

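where $V_j = \beta' X_j$. The condition $Y_j^* = \max(Y_1^*, Y_2^*, \ldots, Y_J^*)$ implies:

$$\varepsilon_j + V_j > \varepsilon_k + V_k, \quad \text{for all } k \neq j,$$

$$\varepsilon_k < \varepsilon_j + V_j - V_k, \quad \text{for all } k \neq j. \tag{A.5}$$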

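Hence, if $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_J$ are i.i.d. with CDF given by A.3, then

$$\begin{aligned}
P(Y_j = 1 \mid X) &= P(\varepsilon_k < \varepsilon_j + V_j - V_k, \text{ for all } k \neq j) \\
&= \int_{-\infty}^{\infty} \prod_{k \neq j} F(\varepsilon_j + V_j - V_k) \, f(\varepsilon_j) \, d\varepsilon_j,
\end{aligned} \tag{A.6}$$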

where f(·) and F (·) are given by A.2 and A.3, respectively. Now


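$$\begin{aligned}
\prod_{k \neq j} F(\varepsilon_j + V_j - V_k) \, f(\varepsilon_j)
&= \prod_{k \neq j} \exp\left(-e^{-\varepsilon_j - V_j + V_k}\right) \exp\left(-\varepsilon_j - e^{-\varepsilon_j}\right) \\
&= \exp\left[-\varepsilon_j - e^{-\varepsilon_j}\left(1 + \sum_{k \neq j} \frac{e^{V_k}}{e^{V_j}}\right)\right].
\end{aligned} \tag{A.7}$$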

If we let

$$\lambda_j = \log\left(1 + \sum_{k \neq j} \frac{e^{V_k}}{e^{V_j}}\right) = \log\left(\sum_{k=1}^{J} \frac{e^{V_k}}{e^{V_j}}\right), \tag{A.8}$$

then we can rewrite A.6 as:

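$$\begin{aligned}
\int_{-\infty}^{\infty} \exp\left(-\varepsilon_j - e^{-(\varepsilon_j - \lambda_j)}\right) d\varepsilon_j
&= \exp(-\lambda_j) \int_{-\infty}^{\infty} \exp\left(-\varepsilon_j^* - e^{-\varepsilon_j^*}\right) d\varepsilon_j^* \\
&= \exp(-\lambda_j) \\
&= \frac{e^{V_j}}{\sum_{k=1}^{J} e^{V_k}},
\end{aligned}$$

where $\varepsilon_j^* = \varepsilon_j - \lambda_j$, and the remaining integral equals 1 because its integrand is the PDF given by A.2.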
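The result of this derivation can be checked numerically. The sketch below is a Monte Carlo illustration, not part of Maddala's derivation: it adds i.i.d. type I extreme value errors to a set of hypothetical systematic utilities, records which choice attains the maximum, and compares the observed choice frequencies with the logit probabilities. The utility values, the seed, and the function names are assumptions made for the example.

```python
import math
import random


def gumbel(rng):
    """Draw from the type I extreme value distribution with CDF exp(-exp(-e))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))  # inverse-CDF sampling


def logit_probs(v):
    """Logit choice probabilities exp(V_j) / sum_k exp(V_k)."""
    m = max(v)
    w = [math.exp(x - m) for x in v]
    s = sum(w)
    return [x / s for x in w]


def simulate_choice_freqs(v, draws, seed=0):
    """Relative frequency with which each choice has the largest utility."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(draws):
        utilities = [vj + gumbel(rng) for vj in v]
        counts[utilities.index(max(utilities))] += 1
    return [c / draws for c in counts]


v = [1.0, 0.5, -0.3]  # hypothetical systematic utilities V_j
empirical = simulate_choice_freqs(v, 20000)
theoretical = logit_probs(v)
print("empirical:  ", empirical)
print("theoretical:", theoretical)
```

With 20000 draws the two sets of probabilities should agree to about two decimal places; the agreement tightens as the number of draws grows.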

If we have a set of $N$ individuals facing $J$ choices, we can define for $i = 1, \ldots, N$
and $j = 1, \ldots, J$:

$Y_{ij}^* = V_{ij}$, the level of indirect utility for the $i$th individual making the $j$th choice.

$Y_{ij} = 1$, if the $i$th individual makes the $j$th choice.

$Y_{ij} = 0$, otherwise.

Assume that $V_{ij} = \beta' X_{ij} + \alpha_j' Z_i + \varepsilon_{ij}$, where $Z_i$ are individual-specific variables
and $X_{ij}$ is the vector of values of attributes of the $j$th choice as perceived by the
$i$th individual, and $\beta$ and $\alpha$ are unknown parameters to be estimated.


Then the probability that the $i$th individual selects the $j$th choice is:

$$P_{ij} = P(Y_{ij} = 1) = \frac{e^{\beta' X_{ij} + \alpha_j' Z_i}}{\sum_{k=1}^{J} e^{\beta' X_{ik} + \alpha_k' Z_i}}, \tag{A.9}$$

which is the logit formulation used to model multinomial probabilities.


Appendix B

Supplementary Tables for Simulation Study A

Table B.1: Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 90 x 54.

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 100 | 82 | 46 | 55 | 5
δg = 0.9 | 100 | 100 | 100 | 100 | 59
δg = −1 + 2.1xg | 100 | 95 | 97 | 100 | 91
ρg = 0.11 | 100 | 100 | 100 | 100 | 100




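                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 100 | 80 | 49 | 60 | 3
δg = 0.9 | 100 | 100 | 100 | 100 | 71
δg = −1 + 2.1xg | 100 | 99 | 98 | 100 | 100
ρg = 0.11 | 100 | 100 | 100 | 100 | 100

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 100 | 79 | 47 | 52 | 1
δg = 2.5 | 100 | 97 | 98 | 100 | 98
δg = 1.5 − 0.5xg | 100 | 100 | 98 | 100 | 17
ρg = 0.05 | 100 | 100 | 100 | 100 | 100

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 100 | 78 | 51 | 54 | 2
δg = 2.5 | 100 | 100 | 100 | 100 | 94
δg = 1.5 − 0.5xg | 100 | 99 | 98 | 100 | 20
ρg = 0.05 | 100 | 100 | 100 | 100 | 100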




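                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 100 | 75 | 53 | 57 | 1
δg = 2.5 | 100 | 100 | 100 | 100 | 41
δg = 1.5 − 0.5xg | 100 | 100 | 100 | 100 | 6
ρg = 0.05 | 100 | 100 | 100 | 100 | 100

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 100 | 75 | 48 | 52 | 3
δg = 0.6 | 100 | 100 | 100 | 100 | 48
δg = 0.2 + 2.4xg | 100 | 100 | 100 | 100 | 44
ρg = 0.08 | 100 | 100 | 100 | 100 | 100

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 100 | 72 | 48 | 52 | 3
δg = 0.6 | 100 | 100 | 100 | 100 | 52
δg = 0.2 + 2.4xg | 100 | 100 | 97 | 100 | 51
ρg = 0.08 | 100 | 100 | 100 | 100 | 100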




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 100 | 70 | 35 | 50 | 3
δg = 0.6 | 100 | 100 | 98 | 100 | 51
δg = 0.2 + 2.4xg | 100 | 100 | 100 | 100 | 56
ρg = 0.08 | 100 | 100 | 100 | 100 | 100


Table B.9: Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 90 x 54.

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 220460 | 4246 | NA | 4257 | NA
δg = 0.9 | 220417 | 4657 | 4656 | 4682 | 4683
δg = −1 + 2.1xg | 220357 | 4966 | 4966 | 5035 | 5033
ρg = 0.11 | 219528 | 4247 | 4246 | 4166 | 4165

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 851438 | 12756 | NA | 12776 | NA
δg = 0.9 | 851557 | 13975 | 13974 | 14073 | 14079
δg = −1 + 2.1xg | 851549 | 14939 | 14939 | 15214 | 15207
ρg = 0.11 | 852426 | 11246 | 11241 | 11012 | 11012




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 1656747 | 5219 | NA | 5227 | NA
δg = 2.5 | 1656799 | 5988 | 5988 | 6177 | 6177
δg = 1.5 − 0.5xg | 1656574 | 5356 | 5356 | 5363 | NA
ρg = 0.05 | 1650244 | 6314 | 6312 | 6054 | 6053

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 4665737 | 14773 | 14789 | 14800 | NA
δg = 2.5 | 4665713 | 16987 | 16986 | 17325 | 17324
δg = 1.5 − 0.5xg | 4665837 | 15181 | 15182 | 15197 | NA
ρg = 0.05 | 4660583 | 14779 | 14778 | 14409 | 14408

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 8067871 | 24118 | 24127 | 24132 | NA
δg = 2.5 | 8067725 | 27681 | 27680 | 28122 | NA
δg = 1.5 − 0.5xg | 8067731 | 24775 | 24775 | 24796 | NA
ρg = 0.05 | 8046445 | 22028 | 22026 | 21591 | 21590




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 6736102 | 6069 | NA | 6081 | NA
δg = 0.6 | 6736328 | 6417 | 6416 | 6432 | NA
δg = 0.2 + 2.4xg | 6735725 | 6463 | 6462 | 6484 | NA
ρg = 0.08 | 6716123 | 5857 | 5856 | 5736 | 5736

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 18572360 | 16434 | NA | 16446 | NA
δg = 0.6 | 18571834 | 17376 | 17375 | 17437 | 17436
δg = 0.2 + 2.4xg | 18571976 | 17501 | 17490 | 17580 | 17585
ρg = 0.08 | 18559325 | 14046 | 14042 | 13790 | 13789

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | 24047668 | 25524 | NA | 25546 | NA
δg = 0.6 | 24047495 | 27036 | 27039 | 27200 | 27199
δg = 0.2 + 2.4xg | 24047736 | 27251 | 27250 | 27457 | 27457
ρg = 0.08 | 23993702 | 20352 | 20351 | 19758 | 19758



Table B.17: Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 90 x 54.

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.01, 0.04, 0.07) | (11.28, 51.48, 21.77) | NA | (< 0.01, 0.03, 0.16) | NA
δg = 0.9 | (0.02, 0.01, 0.03) | (0.02, 0.01, 0.04) | (0.02, 0.01, 0.04) | (0.1, 0.02, 0.05) | (0.07, 0, 0.1)
δg = −1 + 2.1xg | (0.02, 0, 0.23) | (21, 168.39, 45.08) | (16.92, 13.33, 20.96) | (0.4, 0.13, 0.59) | (0.44, 0.12, 0.6)
ρg = 0.11 | (0.99, 0.4, 0.4) | (21.74, 2.71, 10.43) | (21.66, 2.65, 10.49) | (0.27, 0.05, 0.54) | (0.3, 0.08, 0.49)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (< 0.01, 0.01, 0.26) | (5.32, 152.08, 91.4) | NA | (0.01, < 0.01, 0.34) | NA
δg = 0.9 | (0.03, < 0.01, 0.01) | (0.03, 0, 0.02) | (0.03, < 0.01, 0.02) | (0.21, 0.01, 0.02) | (0.22, 0, 0.02)
δg = −1 + 2.1xg | (0.03, 0.05, 0.33) | (2.88, 63.56, 2.84) | (8.37, 6.82, 17.3) | (0.6, 0.16, 0.19) | (0.59, 0.16, 0.17)
ρg = 0.11 | (0.06, 0.28, 0.5) | (34.14, 6.68, 27.87) | (33.93, 6.44, 27.11) | (0.13, 0.18, 0.72) | (0.13, 0.19, 0.68)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.01, 0.01, 0.02) | (1.61, 2.69, 4.33) | NA | (0.02, < 0.01, < 0.01) | NA
δg = 2.5 | (0.02, 0.01, 0.03) | (31.26, 39.71, 29.27) | (10.76, 13.09, 19.11) | (< 0.01, 0.15, 0.15) | (0, 0.15, 0.15)
δg = 1.5 − 0.5xg | (< 0.01, 0.01, 0.01) | (2.34, 3.95, 7.76) | (2.28, 3.75, 6.65) | (< 0.01, 0.01, 0.01) | NA
ρg = 0.05 | (0.42, 0.77, 0.74) | (27.32, 24.06, 25.52) | (27.26, 24.14, 25.38) | (0.1, 0.08, 0.56) | (0.12, 0.11, 0.6)



Table B.20: Monte Carlo bias of β for β = (3.21, 3.85, 3.50) and network size 105 x 76.

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.01, < 0.01, 0.06) | (39.7, 0.13, 35.74) | (31.25, 6.32, 1) | (< 0.01, < 0.01, 0.05) | NA
δg = 2.5 | (< 0.01, < 0.01, 0.01) | (0, 0, 0.01) | (< 0.01, < 0.01, 0.01) | (0.15, 0.07, 0.05) | (0.15, 0.08, 0.06)
δg = 1.5 − 0.5xg | (< 0.01, 0.01, 0.01) | (10.4, 1.44, 14.06) | (12.4, 1.55, 15.86) | (< 0.01, 0.01, 0.01) | NA
ρg = 0.05 | (0.23, 0.45, 1.34) | (30.4, 4.46, 15.79) | (30.28, 4.42, 15.68) | (0.02, 0.02, 0.23) | (0.01, 0.02, 0.23)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (< 0.01, < 0.01, < 0.01) | (< 0.01, < 0.01, < 0.01) | (14.14, 9.35, 43.42) | (< 0.01, 0.01, < 0.01) | NA
δg = 2.5 | (0.01, < 0.01, 0.01) | (0.01, < 0.01, 0.01) | (0.01, < 0.01, 0.01) | (0.17, < 0.01, 0.13) | NA
δg = 1.5 − 0.5xg | (0.01, < 0.01, 0.02) | (0.01, < 0.01, 0.02) | (0.01, < 0.01, 0.02) | (< 0.01, < 0.01, 0.02) | NA
ρg = 0.05 | (0.11, 0.1, 0.49) | (27.39, 7.89, 14.39) | (27.43, 7.94, 14.38) | (0.1, 0.07, 0.12) | (0.11, 0.08, 0.12)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.01, 0.01, < 0.01) | (2.24, 9.32, 27.94) | NA | (0.02, 0.01, 0.01) | NA
δg = 0.6 | (< 0.01, < 0.01, 0.02) | (0.01, 0.06, 0.25) | (< 0.01, < 0.01, 0.02) | (0.01, < 0.01, 0.02) | NA
δg = 0.2 + 2.4xg | (0.01, 0.01, 0.01) | (0.01, 0.01, 0.01) | (0.01, 0.01, 0.01) | (0.02, 0.01, 0.01) | NA
ρg = 0.08 | (1.42, 1.79, 0.15) | (23.69, 6.64, 2.51) | (23.56, 6.58, 2.59) | (0.27, 0.13, 0.13) | (0.29, 0.16, 0.09)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (< 0.01, < 0.01, < 0.01) | (0.39, 0.41, 0.24) | NA | (< 0.01, < 0.01, < 0.01) | NA
δg = 0.6 | (< 0.01, 0.01, 0.01) | (< 0.01, 0.01, 0.01) | (0.12, 0.14, 0.07) | (< 0.01, 0.01, 0.01) | (0.01, 0.01, < 0.01)
δg = 0.2 + 2.4xg | (< 0.01, < 0.01, < 0.01) | (< 0.01, < 0.01, < 0.01) | (33.64, 34.86, 15.84) | (0.01, 0.01, 0.01) | (0.01, 0.01, 0.01)
ρg = 0.08 | (0.51, 1.09, 0.13) | (26.12, 7.34, 5.4) | (25.95, 7.34, 5.66) | (0.05, 0.07, 0.74) | (0.06, 0.08, 0.75)



Table B.24: Monte Carlo bias of β for β = (3.92, 4.54, 3.52) and network size 57 x 23.

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (< 0.01, < 0.01, < 0.01) | (5.72, 0.59, 14.97) | NA | (< 0.01, < 0.01, 0.03) | NA
δg = 0.6 | (< 0.01, 0.01, 0.03) | (6.6, 0.34, 15.03) | (1.57, 0.23, 3.55) | (0.01, 0.01, 0.02) | (0.01, < 0.01, 0.06)
δg = 0.2 + 2.4xg | (< 0.01, < 0.01, 0.02) | (0.4, 0.03, 0.81) | (< 0.01, < 0.01, 0.02) | (0.01, 0.01, 0.02) | (0.01, < 0.01, 0.05)
ρg = 0.08 | (0.34, 1.37, 3.66) | (37.19, 8.37, 54.51) | (37.05, 8.34, 54.26) | (0.04, 0.13, 1.1) | (0.05, 0.14, 1.12)



Table B.25: Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 90 x 54.

                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.56, 0.52, 2.51) | (92.47, 432.27, 197.98) | NA | (0.57, 0.52, 2.54) | NA
δg = 0.9 | (0.79, 0.71, 3.47) | (0.78, 0.71, 3.46) | (0.79, 0.71, 3.45) | (0.8, 0.73, 3.56) | (0.81, 0.75, 3.55)
δg = −1 + 2.1xg | (1.04, 0.97, 4.58) | (222.95, 1252.44, 460.33) | (320.46, 499.41, 440.13) | (1.07, 1.01, 4.87) | (1.07, 1, 4.82)
ρg = 0.11 | (6.51, 7.5, 40.11) | (2.92, 3.74, 25.84) | (2.93, 3.74, 25.86) | (3.55, 3.71, 24.48) | (3.56, 3.72, 24.53)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.36, 0.3, 2.54) | (25.92, 775.5, 473.63) | NA | (0.36, 0.31, 2.63) | NA
δg = 0.9 | (0.48, 0.44, 3.5) | (0.47, 0.44, 3.49) | (0.47, 0.44, 3.49) | (0.49, 0.45, 3.55) | (0.49, 0.46, 3.48)
δg = −1 + 2.1xg | (0.65, 0.56, 4.97) | (30.37, 663.98, 49.26) | (227.39, 186.44, 477.7) | (0.69, 0.6, 5.54) | (0.69, 0.59, 5.51)
ρg = 0.11 | (5.72, 6.25, 56.6) | (1.9, 2.55, 25.96) | (1.9, 2.56, 25.96) | (2.46, 2.62, 27.15) | (2.46, 2.62, 27.12)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.22, 0.27, 0.39) | (14.12, 23.87, 38.91) | NA | (0.22, 0.28, 0.38) | NA
δg = 2.5 | (0.38, 0.52, 0.74) | (285.32, 352.44, 827.4) | (141.25, 169.35, 327.65) | (0.44, 0.59, 0.83) | (0.44, 0.59, 0.83)
δg = 1.5 − 0.5xg | (0.23, 0.32, 0.46) | (23.01, 39, 75.91) | (18.87, 31.4, 55.9) | (0.23, 0.32, 0.47) | NA
ρg = 0.05 | (5.19, 5.92, 13.42) | (2.18, 2.15, 9.55) | (2.19, 2.15, 9.56) | (2.48, 2.35, 9.04) | (2.48, 2.35, 9.04)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.14, 0.15, 0.66) | (212.39, 14.91, 258.43) | (564.98, 59.35, 231.32) | (0.14, 0.15, 0.67) | NA
δg = 2.5 | (0.27, 0.27, 1.24) | (0.26, 0.27, 1.24) | (0.27, 0.27, 1.24) | (0.29, 0.3, 1.35) | (0.28, 0.3, 1.35)
δg = 1.5 − 0.5xg | (0.15, 0.17, 0.72) | (124.34, 12.25, 152.29) | (156.3, 17.23, 212.97) | (0.15, 0.17, 0.71) | NA
ρg = 0.05 | (3.84, 5.16, 23.92) | (1.44, 1.87, 15.13) | (1.44, 1.87, 15.14) | (1.82, 1.85, 14.47) | (1.83, 1.85, 14.46)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.11, 0.11, 0.46) | (0.11, 0.11, 0.46) | (204.71, 159.09, 846.52) | (0.1, 0.12, 0.47) | NA
δg = 2.5 | (0.2, 0.22, 0.87) | (0.2, 0.22, 0.87) | (0.2, 0.22, 0.87) | (0.21, 0.23, 0.96) | NA
δg = 1.5 − 0.5xg | (0.12, 0.13, 0.56) | (0.12, 0.13, 0.56) | (0.12, 0.13, 0.56) | (0.12, 0.13, 0.56) | NA
ρg = 0.05 | (3.3, 4.57, 21.26) | (1.29, 1.61, 14.25) | (1.29, 1.61, 14.27) | (1.48, 1.65, 13.07) | (1.49, 1.66, 13.07)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.15, 0.12, 0.2) | (14.39, 58.01, 171.78) | NA | (0.15, 0.12, 0.19) | NA
δg = 0.6 | (0.19, 0.16, 0.25) | (0.27, 1.22, 4.75) | (0.19, 0.16, 0.26) | (0.19, 0.16, 0.26) | NA
δg = 0.2 + 2.4xg | (0.2, 0.17, 0.26) | (0.2, 0.16, 0.26) | (0.2, 0.16, 0.26) | (0.2, 0.17, 0.26) | NA
ρg = 0.08 | (8.16, 8.34, 14.25) | (2.23, 2.68, 11.22) | (2.23, 2.68, 11.27) | (2.98, 2.78, 10.19) | (2.98, 2.78, 10.23)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.07, 0.09, 0.35) | (9.16, 9.7, 5.89) | NA | (0.08, 0.09, 0.35) | NA
δg = 0.6 | (0.1, 0.12, 0.42) | (0.1, 0.12, 0.42) | (3.34, 3.73, 1.74) | (0.1, 0.12, 0.42) | (0.09, 0.12, 0.42)
δg = 0.2 + 2.4xg | (0.1, 0.12, 0.44) | (0.1, 0.12, 0.44) | (147.18, 151.74, 70.8) | (0.11, 0.12, 0.44) | (0.1, 0.12, 0.44)
ρg = 0.08 | (4.93, 7.1, 30.28) | (1.47, 2.37, 19.13) | (1.49, 2.37, 19.16) | (1.82, 2.43, 17.55) | (1.82, 2.42, 17.54)




                  Modeled Dispersion*
True Dispersion | GCL | DMd | DMdf | DMr | DMrf
δg = 0 | (0.06, 0.07, 0.5) | (35.39, 3.71, 91.05) | NA | (0.06, 0.07, 0.48) | NA
δg = 0.6 | (0.07, 0.1, 0.64) | (52.01, 2.89, 115.83) | (13.1, 2.04, 30.48) | (0.07, 0.1, 0.65) | (0.07, 0.1, 0.64)
δg = 0.2 + 2.4xg | (0.08, 0.11, 0.65) | (6.42, 0.61, 13.1) | (0.08, 0.11, 0.65) | (0.08, 0.11, 0.67) | (0.07, 0.11, 0.66)
ρg = 0.08 | (3.82, 7.47, 52.91) | (1.09, 2.07, 21.34) | (1.09, 2.07, 21.32) | (1.37, 2.17, 22.78) | (1.37, 2.17, 22.78)



Table B.33: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.35, 1.88) and network size 90 x 54.

δg = 0.9 | 0.56 | 9.67
δg = −1 + 2.1xg | (200.21, 115.76) | (1999.79, 1285.62)
ρg = 0.11 | 2.73 | 4.55

δg = 0.9 | 1.56 | 5.78
δg = −1 + 2.1xg | (13.65, 9.05) | (359.38, 253)
ρg = 0.11 | 2.73 | 2.73





δg = 2.5 | 100.00 | 9.24E+37
δg = 1.5 − 0.5xg | (123.07, 4390.00) | (1026.20, 41533.20)
ρg = 0.05 | < 0.01 | 4.00

δg = 2.5 | 0.84 | 0.52
δg = 1.5 − 0.5xg | (242.00, 2421.80) | (2431.67, 26050.80)
ρg = 0.05 | < 0.01 | 2.00

δg = 2.5 | 0.60 | 0.40
δg = 1.5 − 0.5xg | (0.40, 112.40) | (6.47, 831.80)
ρg = 0.05 | < 0.01 | 2.00





δg = 0.6 | 100.00 | 1.22E+41
δg = 0.2 + 2.4xg | (3.00, 18.75) | (47.00, 147.17)
ρg = 0.08 | 5.00 | 5.00

δg = 0.6 | 5.67 | 14.83
δg = 0.2 + 2.4xg | (6744.00, 611.04) | (29524.00, 4366.83)
ρg = 0.08 | 5.00 | 2.50

δg = 0.6 | 100.00 | 8.93E+47
δg = 0.2 + 2.4xg | (1.00, 3.38) | (21.00, 91.50)
ρg = 0.08 | 5.00 | 2.50


