On the Robustness of Dirichlet-multinomial Regression in the Context of Modeling Pollination Networks
by
Catherine Crea
A Thesis presented to
The University of Guelph
In partial fulfilment of requirements for the degree of Master of Science
in Mathematics and Statistics
Guelph, Ontario, Canada
© Catherine Crea, December, 2011
ABSTRACT
ON THE ROBUSTNESS OF DIRICHLET-MULTINOMIAL REGRESSION IN
THE CONTEXT OF MODELING POLLINATION NETWORKS
Catherine Crea
University of Guelph, 2011
Advisor: Professor A. Ali
Recent studies have suggested that the structure of plant-pollinator networks
is driven by two opposing theories: neutrality and linkage rules. However, rela-
tively few studies have tried to exploit both of these theories in building pollina-
tion webs. This thesis proposes Dirichlet-Multinomial (DM) regression to model
plant-pollinator interactions as a function of plant-pollinator characteristics (e.g.
complementary phenotypic traits), for evaluating the contribution of each pro-
cess to network structure. DM regression models first arose in econometrics for
modeling consumers’ choice behaviour. Further, this thesis (i) evaluates the ro-
bustness of DM regression to misspecification of dispersion structure, and (ii)
compares the performance of DM regression to grouped conditional logit (GCL)
regression through simulation studies. Results of these studies suggest that DM
regression is a robust statistical method for modeling qualitative plant-pollinator
interaction networks and outperforms the GCL regression when data are indeed
over-dispersed. Finally, using DM regression seems to significantly improve model
fit.
Acknowledgements
First and foremost, I would like to thank my advisor, Dr. Ayesha Ali, for
her expertise, support, and patience throughout the course of my research. Her
ability to be both a teacher and a mentor has given me the skills and confidence
necessary to complete this thesis. I appreciate all her contributions of time, ideas,
and advice to make my master’s experience productive and stimulating.
I would like to thank NSERC-CANPOLIN for providing the funding for this
research. A special thank you to Dr. Peter Kevan, principal investigator for
CANPOLIN, Dr. Tom Woodcock, research associate for CANPOLIN, and Dr.
Sarah Bates, network manager for CANPOLIN. I appreciate the time you spent
providing input and feedback throughout the development of this research, and
your interest and attention were most encouraging.
I would like to thank Dr. Gary Umphrey for being on my advisory committee.
Not only am I grateful for his technical insight and thoughtful input with respect
to this thesis, but also for being one of the most passionate professors in the
department. Through his courses, I gained a solid grasp of the fundamentals of
Statistics which afforded me the skills necessary to pursue a master’s. Also, I am
thankful to all the members of the Department of Mathematics and Statistics:
professors, administrative staff, and student colleagues alike, you have
all made my graduate experience challenging, enjoyable and unforgettable.
It is difficult to overstate my gratitude to my employer, Geosyntec Consultants,
for supporting me both financially and professionally during my post-graduate
studies. Being surrounded by such brilliant and remarkable professionals has given
me the motivation to pursue a higher level of expertise in my field of study.
Lastly, this thesis would not have been possible without the love and support
of my family and friends. My sisters, Mary and Carm, encouraged, supported,
guided, and understood me at every moment and I am forever indebted to them
for giving me the strength to persevere. My brother, Vince, is my heart and soul
and without him I would not be the determined person I am. His extraordinary
will to live and continuous resiliency to overcome any illness will forever inspire
me and keep me grounded. Finally, my friends kept me sane and laughing during
all the stages of my thesis and for that I am so appreciative.
Table of Contents
1 Introduction
2 Pollination Networks
2.1 Definition
2.2 Network Patterns
2.3 Network Processes
2.4 Network Notation
2.5 Previous Pollination Network Studies
3 Dirichlet-Multinomial Regression
3.1 Multinomial Responses
3.2 Random Utility Model
3.3 Grouped Conditional Logit
3.4 Dirichlet-Multinomial Model
3.4.1 Additional Parameterizations and Considerations
4 Design of Simulation Study
4.1 Description of Simulation Study A
4.2 Description of Simulation Study B
4.3 Data Generation
4.4 Model Fitting and Summary Statistics
5 Results
5.1 Simulation Study A Results
5.1.1 Model Convergence
5.1.2 Estimation of β
5.1.3 Model Fit and Estimated Dispersion
5.2 Simulation Study B Results
5.2.1 Model Convergence
5.2.2 Estimation of β
5.2.3 Model Fit and Estimated Dispersion
6 Discussion
7 Conclusions
7.1 Future Work
Bibliography
A Derivation of Conditional Logit Model
B Supplementary Tables for Simulation Study A
List of Tables
4.1 Simulation Study A
4.2 Simulation Study B
4.3 Dispersion structures
5.1 Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 57 x 23.
5.2 Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of $\delta_g$.
5.3 Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of $\rho_g$.
5.4 Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM model in terms of $\delta_g$.
5.5 Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of $\rho_g$.
5.6 Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 57 x 23.
5.7 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.36, 1.88) and network size 57 x 23.
5.8 Percentage of samples that reached convergence for β = (1.2, −0.4, 0.1) and network size 49 x 18.
5.9 Percentage of samples that reached convergence for β = (1.1, 0.8, 2.3) and network size 49 x 18.
5.10 Percent relative bias of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM model in terms of $\delta_g$.
5.11 Percent relative bias of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM model in terms of $\rho_g$.
5.12 Percent relative bias of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of $\delta_g$.
5.13 Percent relative bias of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of $\rho_g$.
5.14 Percent coefficient of variation of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM model in terms of $\delta_g$.
5.15 Percent coefficient of variation of β for β = (1.2, −0.4, 0.1) and network size 49 x 18 for DM model in terms of $\rho_g$.
5.16 Percent coefficient of variation of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of $\delta_g$.
5.17 Percent coefficient of variation of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of $\rho_g$.
5.18 Median negative log-likelihood values for samples generated with β = (1.2, −0.4, 0.1) and network size 49 x 18.
5.19 Median negative log-likelihood values for samples generated with β = (1.1, 0.8, 2.3) and network size 49 x 18.
5.20 Percentage of $\chi^2$ p-values < 0.05 for samples generated with β = (1.2, −0.4, 0.1) and network size 49 x 18.
5.21 Percentage of $\chi^2$ p-values < 0.05 for samples generated with β = (1.1, 0.8, 2.3) and network size 49 x 18.
5.22 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (1.2, −0.4, 0.1) and network size 49 x 18.
5.23 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (1.1, 0.8, 2.3) and network size 49 x 18.
B.1 Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 90 x 54.
B.2 Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 105 x 76.
B.3 Percentage of samples that reached convergence for β = (3.21, 3.85, 3.50) and network size 90 x 54.
B.4 Percentage of samples that reached convergence for β = (3.21, 3.85, 3.50) and network size 105 x 76.
B.5 Percentage of samples that reached convergence for β = (3.21, 3.85, 3.50) and network size 57 x 23.
B.6 Percentage of samples that reached convergence for β = (3.92, 4.54, 3.52) and network size 90 x 54.
B.7 Percentage of samples that reached convergence for β = (3.92, 4.54, 3.52) and network size 105 x 76.
B.8 Percentage of samples that reached convergence for β = (3.92, 4.54, 3.52) and network size 57 x 23.
B.9 Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 90 x 54.
B.10 Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 105 x 76.
B.11 Median negative log-likelihood values for samples generated with β = (3.21, 3.85, 3.50) and network size 90 x 54.
B.12 Median negative log-likelihood values for samples generated with β = (3.21, 3.85, 3.50) and network size 105 x 76.
B.13 Median negative log-likelihood values for samples generated with β = (3.21, 3.85, 3.50) and network size 57 x 23.
B.14 Median negative log-likelihood values for samples generated with β = (3.92, 4.54, 3.52) and network size 90 x 54.
B.15 Median negative log-likelihood values for samples generated with β = (3.92, 4.54, 3.52) and network size 105 x 76.
B.16 Median negative log-likelihood values for samples generated with β = (3.92, 4.54, 3.52) and network size 57 x 23.
B.17 Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 90 x 54.
B.18 Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 105 x 76.
B.19 Percent relative bias of β for β = (3.21, 3.85, 3.50) and network size 90 x 54.
B.20 Monte Carlo bias of β for β = (3.21, 3.85, 3.50) and network size 105 x 76.
B.21 Percent relative bias of β for β = (3.21, 3.85, 3.50) and network size 57 x 23.
B.22 Percent relative bias of β for β = (3.92, 4.54, 3.52) and network size 90 x 54.
B.23 Percent relative bias of β for β = (3.92, 4.54, 3.52) and network size 105 x 76.
B.24 Monte Carlo bias of β for β = (3.92, 4.54, 3.52) and network size 57 x 23.
B.25 Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 90 x 54.
B.26 Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 105 x 76.
B.27 Percent coefficient of variation of β for β = (3.21, 3.85, 3.50) and network size 90 x 54.
B.28 Percent coefficient of variation of β for β = (3.21, 3.85, 3.50) and network size 105 x 76.
B.29 Percent coefficient of variation of β for β = (3.21, 3.85, 3.50) and network size 57 x 23.
B.30 Percent coefficient of variation of β for β = (3.92, 4.54, 3.52) and network size 90 x 54.
B.31 Percent coefficient of variation of β for β = (3.92, 4.54, 3.52) and network size 105 x 76.
B.32 Percent coefficient of variation of β for β = (3.92, 4.54, 3.52) and network size 57 x 23.
B.33 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.35, 1.88) and network size 90 x 54.
B.34 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.35, 1.88) and network size 105 x 76.
B.35 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.21, 3.85, 3.50) and network size 90 x 54.
B.36 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.21, 3.85, 3.50) and network size 105 x 76.
B.37 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.21, 3.85, 3.50) and network size 57 x 23.
B.38 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.92, 4.54, 3.52) and network size 90 x 54.
B.39 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.92, 4.54, 3.52) and network size 105 x 76.
B.40 Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.92, 4.54, 3.52) and network size 57 x 23.
List of Figures
2.1 Pollination web depicted as a bipartite graph
3.1 Graphical model of DM regression: Y contains the observed counts; P contains the corresponding interaction probabilities, which are a function of multi-dimensional array X, consisting of k observable covariate matrices, and corresponding β; δ is an over-dispersion parameter
3.2 Graphical model of Stata’s parameterization of DM regression: Y contains the observed counts; P contains the corresponding interaction probabilities, which are a function of $\lambda^* = e^{\gamma' x^*} = e^{\beta' x_{gj} - \psi' x_g}$
Chapter 1
Introduction
This thesis contributes to the area of multivariate statistics and provides a
study of the robustness of multinomial-based regression models to misspecifica-
tion of dispersion structure, within an ecology context. In particular, Dirichlet-
multinomial (DM) regression is proposed to model the interaction probabilities of
plant-pollinator mutualistic networks, and the robustness of the DM model with
respect to over-dispersion is investigated through simulation studies.
To date, few studies have evaluated the simultaneous contributions of the
processes driving the structural patterns of pollination networks. Vazquez et al.
(2009b) provide a conceptual framework for evaluating multiple factors in
determining network structure, and in recent years others have adopted a similar
framework (Santamaría and Rodríguez-Gironés, 2007; Allesina et al., 2008; Stang
et al., 2009). However, the statistical approaches used in practice are typically
simplistic, such as $\chi^2$ tests for observed proportions, and none of these
studies determined the relative contribution of these factors.
DM regression is a cutting-edge technique that arose in econometrics to model
consumers’ choice behaviour and that may provide a flexible model for estimating
the contribution of various factors to observed plant-pollinator interactions
(Guimaraes and Lindrooth, 2007). Hence, this thesis represents an effort to
extend Vazquez’s conceptual framework and apply state-of-the-art methods from
econometrics to a pollination context. Finally, the DM framework for pollination
is evaluated through simulation studies.
Mutualisms among organisms are found at the core of any ecosystem; therefore,
they play a key role in ecology. Among the most commonly studied mutualisms
are those between plants and pollinators. Pollinators are responsible for the sexual
reproduction of flowering plants, seed production and fruit production. In fact,
humans rely on pollination for about one third of the food they eat (Kearns et al.,
1998). Unfortunately, pollination networks are constantly under threat due to
anthropogenic activities, such as agricultural practices, changes in land use, and
habitat loss. Since plants and pollinators are interdependent, the extinction of
one can result in the extinction of the other.
Thus their conservation and management are crucial for the Earth’s biodiversity.
Ecologists and evolutionary biologists have become increasingly interested in
the study of plant and pollinator interactions within a community context in order
to understand the implications of these mutualisms (Vazquez et al., 2009b). As
such, a network approach has been adopted to gain insight on the structure of
plant-pollinator mutualisms. These networks may consist of hundreds of species
that form highly complex and heterogeneous ecosystems (Bascompte and Jordano,
2007). Recent work has revealed that these networks share common structural
patterns, such as the nested organization of pairwise interactions and the skewed
distribution of links per species (Vazquez et al., 2009a). It is believed that these
structural patterns are being driven by both evolutionary and ecological processes,
which are summarized by the theories of neutrality (random interactions) and
linkage rules (trait matching).
Evidence suggests that both theories contribute to the organization of network
structure; therefore, there are multiple determinants that influence the probability
that a given pollinator species will interact with a given plant species. Vazquez et
al. (2009b) developed a conceptual framework which states that the observed in-
teractions of a given network are assumed to be distributed multinomial and are a
function of probability matrices derived from relative abundance, spatio-temporal
overlap, phenotypic traits, phylogenetic signal and sampling effects. They in-
vestigated the extent to which relative abundance and spatio-temporal overlap
predicted network structure using the multinomial likelihood calculated from the
probability matrices and observed counts from real-world networks. Others
(Santamaría and Rodríguez-Gironés, 2007; Allesina et al., 2008; Stang et al., 2009)
have adopted a similar conceptual framework in order to evaluate the extent to
which these processes, namely, trait complementarity, describe network topology.
Although these studies have confirmed that these factors do contribute to network
structure, they fall short of quantifying the relative contribution of each factor to
the interaction probabilities.
In order to evaluate the relative contribution of the mechanisms driving net-
work structure, a more sophisticated statistical approach is needed. Hence, the
motivation of this thesis is to exploit the theories of neutrality and forbidden links
in one comprehensive model.
Multinomial logistic regression is often used in econometrics to model indi-
vidual choice behaviour (Hausman et al., 1984; McFadden, 1974; Shonkwiler and
Hanley, 2003). Analogous to consumers choosing among a set of brand products,
pollinator species choose among a set of plant species. As such, according to the
random utility hypothesis, pollinators assign a level of utility to every plant species
and choose the one that provides the maximum utility. McFadden (1974) derived
a conditional logit (CL) model from the random utility model which incorporates
factors that include characteristics of the individuals, or pollinators, and/or the
attributes of the choices, or plants. DM regression is an extension of McFadden’s
CL model which allows for an extra level of variability that is consumer (pollinator
species) specific (Guimaraes and Lindrooth, 2007). Since plant-pollinator mutual-
istic networks are known to be heterogeneous, this extra unobserved heterogeneity
accounts for the possibility of over-dispersion in the observed network counts.
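The link between utility maximization and the logit form can be checked numerically. The sketch below is illustrative only. It assumes that each utility is a systematic part plus an independent standard Gumbel error, which is the classical assumption behind the conditional logit derivation but is not stated in this chapter. The utility values and all function names are hypothetical.

```python
import math
import random

def logit_probabilities(systematic_utilities):
    """Choice probabilities exp(v_j) / sum_m exp(v_m) for one pollinator."""
    top = max(systematic_utilities)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in systematic_utilities]
    total = sum(weights)
    return [w / total for w in weights]

def standard_gumbel(rng):
    """One standard Gumbel draw via the inverse CDF, avoiding log(0)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_utility_choices(systematic_utilities, n_draws, seed=0):
    """Empirical choice shares when each draw picks the maximum-utility plant.

    Assumption: utility = systematic part + independent standard Gumbel error.
    """
    rng = random.Random(seed)
    counts = [0] * len(systematic_utilities)
    for _ in range(n_draws):
        utilities = [v + standard_gumbel(rng) for v in systematic_utilities]
        counts[utilities.index(max(utilities))] += 1
    return [c / n_draws for c in counts]

# Hypothetical systematic utilities of three plant species for one pollinator.
example_utilities = [1.0, 0.0, -0.5]
theoretical_shares = logit_probabilities(example_utilities)
simulated_shares = simulate_utility_choices(example_utilities, n_draws=100_000)
```

Under the stated assumption, the simulated shares should approach the logit probabilities as the number of draws grows.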
As part of this thesis, simulation studies were conducted to evaluate the ro-
bustness of DM regression to misspecification of dispersion structure, and to com-
pare the performance of DM regression to grouped conditional logit (GCL) re-
gression. The GCL model is a special case of the DM model which assumes no
over-dispersion. Plant-pollinator mutualistic networks of varying size and disper-
sion structures were simulated according to the DM regression model to conduct
the analysis.
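To make the role of over-dispersion concrete, the following minimal sketch contrasts counts drawn from a plain multinomial with counts drawn from a Dirichlet-multinomial. It is not the data-generation procedure of this thesis: the parameterization Dirichlet(concentration × mean), the probabilities, the sample sizes, and every function name are assumptions chosen for illustration.

```python
import random

def draw_dirichlet(alphas, rng):
    """One Dirichlet draw, built by normalizing independent gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def draw_multinomial(n_visits, probabilities, rng):
    """One multinomial draw of n_visits visits over the plant species."""
    counts = [0] * len(probabilities)
    plants = range(len(probabilities))
    for plant in rng.choices(plants, weights=probabilities, k=n_visits):
        counts[plant] += 1
    return counts

def simulate_pollinator_row(mean_probabilities, n_visits, concentration, rng):
    """Counts for one pollinator species.

    concentration=None gives the plain multinomial (no over-dispersion).
    Otherwise the probabilities are first drawn from
    Dirichlet(concentration * mean), an assumed parameterization.
    """
    if concentration is None:
        probabilities = list(mean_probabilities)
    else:
        alphas = [concentration * p for p in mean_probabilities]
        probabilities = draw_dirichlet(alphas, rng)
    return draw_multinomial(n_visits, probabilities, rng)

def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(2011)
mean_probabilities = [0.5, 0.3, 0.2]  # hypothetical interaction probabilities
multinomial_first = [
    simulate_pollinator_row(mean_probabilities, 50, None, rng)[0] for _ in range(2000)
]
dm_first = [
    simulate_pollinator_row(mean_probabilities, 50, 5.0, rng)[0] for _ in range(2000)
]
multinomial_variance = sample_variance(multinomial_first)
dm_variance = sample_variance(dm_first)
```

Both samplers share the same mean probabilities, so any difference between `dm_variance` and `multinomial_variance` reflects the extra variability introduced by the Dirichlet layer.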
This thesis provides a summary of the motivation, objective and results of
the research on DM regression in the context of pollination networks. Chapter 2
provides general background on pollination networks, including the patterns and
processes driving network structure, and a review of previous pollination studies
that have stimulated this research. Chapter 3 reviews the DM regression model
through the methodologies adapted from econometrics. Chapter 4 provides a
description of the simulation studies and Chapter 5 presents the results of the
simulation studies. A discussion of the results of this thesis is provided in Chapter
6, and conclusions and future work are discussed in Chapter 7.
Chapter 2
Pollination Networks
This chapter provides a general overview of plant-pollinator mutualistic net-
works. Section 2.1 defines a pollination network and shows how a pollination web
can be depicted as a bipartite graph. Section 2.2 presents the network statistics
used to uncover the common structural patterns that characterize these networks,
while Section 2.3 presents the underlying mechanisms driving the network pat-
terns. Section 2.4 introduces the notation used to describe pollination networks
throughout this thesis and Section 2.5 summarizes the previous network studies
that have motivated the work presented in this thesis.
2.1 Definition
A pollination network is a graph consisting of nodes that represent plants
and pollinators, and undirected edges that represent the mutualistic interactions
between plant and pollinator species within a given ecosystem. These networks
are considered bipartite networks since interactions occur only between plants and
pollinators and not within plants or pollinators. In other words, pollinator species
do not interact with other pollinator species and plant species do not interact
with other plant species. An example of a pollination network is shown in Figure
2.1. The pollinator species are represented by purple nodes and the plant species
are represented by green nodes. An edge connecting a pollinator node to a plant
node represents an interaction or link between those two species. Essentially, the
graph gives a snapshot of which pollinator species are interacting with which plant
species.
Figure 2.1: Pollination web depicted as a bipartite graph
Due to the interdependent relationship between these plants and pollinators,
it quickly becomes clear that as the size of the network increases, the complexity
of the network also increases. The study of plant and pollinator communities pro-
vides an understanding of the underlying patterns of these complex networks and
the processes - both ecological and evolutionary - that are driving these network
patterns.
2.2 Network Patterns
Network statistics are metrics calculated from an observed real-world network
which aim at quantifying its topological features. Examples of these statistics
include:
• connectance – proportion of links that are actually realized (Jordano, 1987);
• species degree – number of different species to which a specific species is
linked (Jordano et al., 2003); and
• interaction strength or dependence – an estimate of the extent to which one
species depends on another species, typically approximated by interaction
frequency (Vazquez et al., 2005; Bascompte et al., 2006).
Network statistics provide a means of uncovering the underlying structural pat-
terns of a given network.
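The three statistics listed above can be computed directly from a matrix of observed visit counts. The sketch below is a minimal illustration, assuming that rows are pollinator species and columns are plant species. The example counts are made up, the function names are hypothetical, and dependence is approximated here by each pollinator's share of visits, which is one possible frequency-based proxy.

```python
def connectance(counts):
    """Proportion of possible pollinator-plant links that are realized."""
    n_cells = len(counts) * len(counts[0])
    n_links = sum(1 for row in counts for y in row if y > 0)
    return n_links / n_cells

def pollinator_degrees(counts):
    """Number of plant species each pollinator species is linked to."""
    return [sum(1 for y in row if y > 0) for row in counts]

def plant_degrees(counts):
    """Number of pollinator species each plant species is linked to."""
    n_plants = len(counts[0])
    return [sum(1 for row in counts if row[j] > 0) for j in range(n_plants)]

def pollinator_dependence(counts):
    """Share of each pollinator's visits made to each plant (assumed proxy)."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([y / total if total > 0 else 0.0 for y in row])
    return result

# Hypothetical 4 x 3 network: 4 pollinator species, 3 plant species.
example_counts = [
    [12, 3, 0],
    [0, 5, 0],
    [7, 0, 1],
    [0, 2, 0],
]
```

For the example matrix, six of the twelve possible links are realized, so `connectance(example_counts)` is 0.5.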
Extensive studies of plant-pollinator networks reveal that, regardless of the geo-
graphical origins, mutualistic networks share common structural patterns. Vazquez
et al. (2009a) summarize the topological features of plant-pollinator networks as
follows:
• Connectance is typically low in mutualistic networks.
• There are often many more pollinator species than there are plant species.
• The species degree distribution is skewed: many species have few links (these
species are considered specialists) and few species have many links (these
species are considered generalists).
• Mutualistic networks tend to be: nested – specialists tend to interact with
a subset of generalists (Bascompte et al., 2003), and compartmentalized –
clearly defined groups of species that have many intragroup links and few
intergroup links (Olesen et al., 2007).
• Most interactions are asymmetric in that specialists tend to interact with
generalists.
These structural patterns suggest that pollination networks are very heteroge-
neous; therefore, there may be multiple determinants that play a role in the orga-
nization of the network structure.
2.3 Network Processes
The topological features of these complex networks are driven by ecological
and evolutionary mechanisms which can be explained by two opposing theories:
neutrality and linkage rules. Neutrality states that all individuals interact ran-
domly such that all plant-pollinator pairs have the same probability of interacting.
As a result, individuals interact proportionately to their relative abundance, i.e.,
more abundant species interact more frequently than rare species (Vazquez et al.,
2009b).
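One simple way to express neutrality numerically is to let each pairwise interaction probability be proportional to the product of the two species' relative abundances. This formula is an assumption made for illustration and is not given in this section. The abundances and the function name are hypothetical.

```python
def neutral_interaction_probabilities(pollinator_abundance, plant_abundance):
    """Matrix with p[g][j] equal to the product of relative abundances.

    Assumption: interactions are random encounters, so the probability of a
    pollinator-plant pair depends only on how common each species is.
    """
    total_pollinators = sum(pollinator_abundance)
    total_plants = sum(plant_abundance)
    return [
        [(a / total_pollinators) * (b / total_plants) for b in plant_abundance]
        for a in pollinator_abundance
    ]

# Hypothetical abundances: 3 pollinator species, 2 plant species.
neutral_probabilities = neutral_interaction_probabilities([60, 30, 10], [80, 20])
```

Because each factor is a relative abundance, the entries of the resulting matrix sum to one, and the most abundant pair receives the largest probability.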
Linkage rules arise from trait matching, namely, trait complementarity and
exploitation barriers. Both rules prevent the occurrence of certain interactions,
known as forbidden links (Santamaría and Rodríguez-Gironés, 2007). An example
of a complementarity trait would be a plant’s nectar concentration matching a
pollinator’s concentration preference. Similarly, an example of an exploitation
barrier would be a pollinator’s proboscis length being long enough to forage a
plant’s corolla tube (Bascompte et al., 2003). Other evolutionary processes, such
as neutral evolution and phylogenetic signal, may also have a causal influence on
trait matching (Vazquez et al., 2009a). This thesis attempts to exploit both of
these theories simultaneously in one comprehensive model.
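The two linkage rules described above can be written as simple predicates on trait values. The sketch below is hypothetical: the threshold form of the barrier rule, the tolerance form of the complementarity rule, the trait values, and the function names are all assumptions chosen to mirror the proboscis and nectar examples in the text.

```python
def barrier_allows(proboscis_length, corolla_depth):
    """Exploitation barrier: the proboscis must reach the bottom of the corolla tube."""
    return proboscis_length >= corolla_depth

def complementarity_allows(pollinator_trait, plant_trait, tolerance):
    """Trait complementarity: the two trait values must match within a tolerance."""
    return abs(pollinator_trait - plant_trait) <= tolerance

def allowed_links(proboscis_lengths, corolla_depths):
    """Binary matrix: 1 where an interaction is permitted, 0 for a forbidden link."""
    return [
        [1 if barrier_allows(p, c) else 0 for c in corolla_depths]
        for p in proboscis_lengths
    ]

# Hypothetical trait values in millimetres: 3 pollinators (rows), 3 plants (columns).
example_links = allowed_links([2.0, 6.0, 11.0], [1.5, 5.0, 9.0])
```

In this made-up example the short-tongued pollinator can reach only the shallowest flower, while the long-tongued pollinator can reach all three.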
It is worth noting that although these mechanisms influence the true network
structure, these mechanisms also affect the observed network structure through
sampling effects. The discrepancy between the true network and the observed
network is an artifact of sampling bias. An example of a sampling effect would be
observation error: interactions between certain plants and pollinator species occur
but are not recorded because they are not observed during the sampling times.
2.4 Network Notation
For the purpose of this thesis, consider a plant-pollinator network with $G$
pollinator species, $J$ plant species, and $K$ traits (or covariates). Let $Y$ denote
the matrix of observed counts where $y_{gj}$ is the observed number of plant visits
between pollinator $g$ and plant $j$. Then we have,
$$Y = \begin{pmatrix} y_{11} & \cdots & y_{1J} \\ \vdots & \ddots & \vdots \\ y_{G1} & \cdots & y_{GJ} \end{pmatrix}.$$
The total number of observed interactions $n_g$ for pollinator species $g$ is the sum
of the $g$th row of $Y$, and the total number of observed interactions in the network
$N$ is the sum of all counts, given by
$$N = \sum_{g=1}^{G} \sum_{j=1}^{J} y_{gj}. \tag{2.1}$$
Let $X_k$ denote the matrix for observable covariate $k$, where $x_{kgj}$ is the observed
covariate $k$ value for pollinator $g$ and plant $j$. Then,
$$X_k = \begin{pmatrix} x_{k11} & \cdots & x_{k1J} \\ \vdots & \ddots & \vdots \\ x_{kG1} & \cdots & x_{kGJ} \end{pmatrix}, \quad k = 1, \ldots, K.$$
Let the probability matrix $P$ contain the interaction probabilities corresponding to the
observed counts in $Y$. It is assumed that the probabilities in $P$ are driven
by the covariates, or traits, in $X_k$. In other words, the interaction probabilities are
a function of multiple factors contained in $X_k$, $k = 1, \ldots, K$. This thesis considers
DM regression, in which the interaction probabilities are modeled through a logit
link whose linear predictor is a linear combination of these covariates.
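A minimal sketch of how a probability matrix could be built from the covariate matrices is given below. It assumes a conditional-logit form in which each pollinator's probabilities are normalized across plant species, consistent with the description of pollinators choosing among plants in Chapter 1. The exact specification used in this thesis appears in Chapter 3, so the normalization, the example values, and the function name are illustrative assumptions.

```python
import math

def interaction_probabilities(covariates, beta):
    """Row-normalized logit probabilities from K covariate matrices.

    covariates[k][g][j] holds covariate k for pollinator g and plant j.
    Assumed form: P[g][j] = exp(eta_gj) / sum_m exp(eta_gm),
    with eta_gj = sum_k beta[k] * covariates[k][g][j].
    """
    n_pollinators = len(covariates[0])
    n_plants = len(covariates[0][0])
    probabilities = []
    for g in range(n_pollinators):
        eta = [
            sum(b * x_k[g][j] for b, x_k in zip(beta, covariates))
            for j in range(n_plants)
        ]
        top = max(eta)  # subtract the max for numerical stability
        weights = [math.exp(e - top) for e in eta]
        total = sum(weights)
        probabilities.append([w / total for w in weights])
    return probabilities

# Hypothetical example with K = 2 covariates, G = 2 pollinators, J = 3 plants.
example_covariates = [
    [[0.0, 0.0, math.log(2.0)], [1.0, 0.5, 0.0]],
    [[0.2, 0.4, 0.6], [0.1, 0.1, 0.1]],
]
example_probabilities = interaction_probabilities(example_covariates, [1.0, 0.0])
```

With the second coefficient set to zero, only the first covariate matters, so the first pollinator's probabilities are proportional to 1, 1 and 2.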
2.5 Previous Pollination Network Studies
As discussed in Section 2.3, due to the evolutionary and ecological processes
driving network patterns, multiple determinants must be considered in evaluating
network structure. Several studies have attempted to evaluate many factors
simultaneously. Vazquez et al. (2009b) proposed a conceptual framework for
modeling a pollination web as a function of several factors. They attempted to
evaluate space, time and relative abundance using a likelihood-based approach.
They assumed that the counts in Y are distributed multinomial and calculated
the corresponding interaction probability matrices based on all possible combina-
tions of time, space and relative abundance. However, for this analysis, Vazquez
et al. (2009b) created probability models by multiplying binary covariate matri-
ces and normalizing the product matrices so that their elements summed to one.
The observed counts and those expected under each of the probability models
were compared by calculating the corresponding likelihoods and AIC values. The
models containing more than one factor proved to best predict network structure.
A similar likelihood approach was used by Allesina et al. (2008) in a food web
context. This is a very rudimentary approach for modeling interaction probabilities
as a function of multiple determinants. This thesis extends the framework of
Vazquez et al. (2009b) by modeling the probabilities as a function of covariates via
a logit formulation.
Santamaría and Rodríguez-Gironés (2007) investigated whether simple linkage
rules that account for trait complementarity and/or exploitation barriers
could explain the structure of plant-pollinator mutualistic networks. They used
simulation methods to build binary interaction matrices, i.e., a qualitative network
where 1 indicates the presence of an interaction between a given pollinator species
and a given plant species, 0 otherwise. These models were simulated using one,
two and four complementarity trait and barrier trait models and two null models.
The structure of simulated communities based on simple linkage rules was com-
pared to the structure of 37 real-world networks. Network topology was described
by the number of interactions, nestedness, relative nestedness and connectivity.
Santamaría and Rodríguez-Gironés found that models that incorporate two traits
were able to predict the network statistics found in the real-world networks.
Trait matching was also investigated by Stang et al. (2009) in a Spanish plant-
pollinator network. They introduced a new network statistic: the degree of size
matching between nectar depth and proboscis length. They used two rules: (i)
size threshold and (ii) relative abundance, to explain the frequency distributions of
interactions across size classes and average degree of size matching for individuals
in a species (Stang et al., 2009). For each analysis they compared observed and
expected values using a contingency table approach ($\chi^2$ tests). Observed values
were calculated as a function of size and using the mean and standard deviation
of trait values, for each analysis, respectively. The expected frequencies were
calculated based on probabilities derived from a threshold indicator, which indicates
whether the interaction is possible, and determined by either equal species abundance
or relative species abundance. They found that size thresholds, size distributions (nectar
depths and proboscis lengths), and species abundance seemed to be important in
understanding observed interaction probabilities.
Although these studies provide a conceptual framework for incorporating mul-
tiple factors in predicting network structure, a means of quantifying the relative
contribution of each factor is yet to be explored. The DM regression model can
identify the factors that affect the interaction probabilities and can estimate their
relative contribution to those interaction probabilities. Additionally, DM regres-
sion is a flexible model and can be used to incorporate different kinds of covariates,
such as space and time, or possibly to learn a set of linkage rules.
The next chapter provides details of the DM regression model from its roots
in econometrics. Further discussion extends the DM regression model into the
context of pollination networks.
Chapter 3
Dirichlet-Multinomial Regression
This chapter begins with a review of multinomial response data and the logit
formulation used to model the multinomial probabilities. Section 3.2 introduces
the random utility model from econometrics (used to model individual choice
behaviour). Section 3.3 provides the derivation of the grouped conditional logit
(GCL) model from the random utility framework and places it in the context of
pollination networks. Finally, Section 3.4 provides the details of the DM regression
model, including its extension of the GCL model, its equivalence to the log-linear
model, and its alternate parameterizations.
3.1 Multinomial Responses
A dependent variable that can take on more than two discrete values is known
as a polytomous response variable. For example, travelers may choose among a
set of travel modes or consumers choose among a set of brand name products.
The number of travelers who choose the respective travel mode or the number of
consumers who choose the respective brand name products can be modeled using
the multinomial distribution. The multinomial responses can be either nominal,
i.e., there is no natural order to the categories, or ordinal, i.e., there is an order
or ranking to the categories. In this thesis, only nominal response variables are
considered.
Consider a random variable $Y_i$ that can take on one of a finite number of
discrete values, $1, 2, \ldots, J$. Let $p_{ij} = P(Y_i = j)$ be the probability that the
ith response falls into the jth category. Assuming that the response categories
are mutually exclusive, $\sum_{j=1}^{J} p_{ij} = 1$ for each $i$. Let $Y_{ij}$ be the number
of observations falling into category $j$ for a group or individual $i$ and let
$n_i = \sum_{j=1}^{J} Y_{ij}$. For ungrouped data, $n_i = 1$, corresponding to the one observation
falling into the jth category, and the counts for the remaining $J - 1$ categories are zero.
The probability distribution of the counts Yij given the total ni is given by the
multinomial distribution:
\[
P(Y_{i1} = y_{i1}, \ldots, Y_{iJ} = y_{iJ}) = \frac{n_i!}{y_{i1}! \cdots y_{iJ}!} \, p_{i1}^{y_{i1}} \cdots p_{iJ}^{y_{iJ}}. \tag{3.1}
\]
The special case where J = 2 is the binomial distribution. The expected value
and variance of the Yij are:
E(Yij) = nipij (3.2)
and
V ar(Yij) = nipij(1− pij). (3.3)
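Equations 3.1 to 3.3 can be checked numerically with a short sketch. The function names below are illustrative and not part of the thesis; only the Python standard library is assumed.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Probability of the count vector (y_i1, ..., y_iJ) under Equation 3.1."""
    n = sum(counts)
    coef = factorial(n)
    for y in counts:
        coef //= factorial(y)  # exact: every partial quotient is an integer
    return coef * prod(p ** y for p, y in zip(probs, counts))

def multinomial_mean_var(n, probs):
    """(E(Y_ij), Var(Y_ij)) for each category, per Equations 3.2 and 3.3."""
    return [(n * p, n * p * (1 - p)) for p in probs]

probs = [0.5, 0.3, 0.2]
pmf = multinomial_pmf([2, 1, 0], probs)   # 3!/(2! 1! 0!) * 0.5^2 * 0.3
moments = multinomial_mean_var(3, probs)
```

With J = 2 the same function returns binomial probabilities, which matches the special case noted above.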
Multinomial logistic (MNL) regression models the pij in terms of individual
or group specific covariates Xi. The link function, known as the logit link or
log-odds, connects the pij to the covariates Xi. The logit uses the Jth category as the
baseline group; therefore, the log-odds for all other J − 1 categories are relative
to the baseline. This generalized logit can be written as a linear combination of
covariates:
\[
\nu_{ij} = \log \frac{p_{ij}}{p_{iJ}} = \beta_j' x_i, \tag{3.4}
\]
where $\beta_j$ is a vector of regression coefficients for j = 1, . . . , J − 1 associated with
the covariate values for the ith individual or group.¹
The probability that the ith individual or group selects the jth category is:
\[
p_{ij} = \frac{\exp(\nu_{ij})}{\sum_{l=1}^{J} \exp(\nu_{il})}, \tag{3.5}
\]
which is the same logit formulation used in the conditional logit (CL) models
discussed in Sections 3.2 and 3.3. It should be noted that the interpretation of
the β parameters differs between the MNL and the CL models. In the former, these
parameters correspond to individual or group level characteristics, while those in
the latter correspond to choice attributes. An introduction to the random utility
model will elucidate this distinction.
¹ η is commonly used to represent the link function in a generalized linear model; however, in this thesis, η is used to represent the random group effects for the DM regression model (see Section 3.4).
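A minimal sketch of the MNL probabilities follows, assuming the baseline category J has its linear predictor fixed at zero (which is what the log-odds relative to category J imply). The function name and the toy inputs are illustrative only.

```python
from math import exp

def mnl_probabilities(x_i, betas):
    """Category probabilities for one individual or group.

    x_i   : covariate vector for individual/group i
    betas : list of J-1 coefficient vectors beta_j; category J is the
            baseline, so its linear predictor nu_iJ is set to 0.
    """
    nu = [sum(b * x for b, x in zip(beta_j, x_i)) for beta_j in betas] + [0.0]
    shift = max(nu)                      # subtract the max for numerical stability
    weights = [exp(v - shift) for v in nu]
    total = sum(weights)
    return [w / total for w in weights]

p = mnl_probabilities([1.0, 2.0], [[0.5, -0.25], [0.1, 0.2]])
```

Taking log(p[j] / p[-1]) for any non-baseline category recovers the corresponding linear predictor, which is a quick way to confirm the baseline convention.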
3.2 Random Utility Model
McFadden (1974) derived a CL model from the random utility model commonly
used in econometrics for modeling individual choice behaviour. The random utility
model assumes that (i) an individual is faced with Ji mutually exclusive and
exhaustive choices, (ii) the utilities Uij are random variables that vary across
individuals, and (iii) an individual selects the choice with the highest, or maximum,
utility (Maddala, 1983). The utility function is defined as the utility ascribed to
choice j by individual i:
Uij = Vij + εij, (3.6)
for i = 1, . . . , N and j = 1, . . . , J and where Vij is a function of covariates that
can reflect the choice attributes and/or the individual characteristics and εij
is a random error term. The εij are assumed to follow a Type I Extreme Value
(Gumbel) distribution, referred to as a Weibull distribution in some of the
econometrics literature, because the modeled utilities are maxima.
The probability that individual i selects choice j can be expressed by the
following logit formulation:
\[
p_{ij} = \frac{\exp(V_{ij})}{\sum_{l=1}^{J} \exp(V_{il})} = \frac{\exp(\beta' x_{ij})}{\sum_{l=1}^{J} \exp(\beta' x_{il})}, \tag{3.7}
\]
where β is a vector of unknown parameters associated with each of the covariates,
and xij is the vector of covariate values corresponding to individual i and choice
j, for i = 1, . . . , N and j = 1, . . . , J . Note that the MNL model is a special case
of the CL model when only individual characteristics are considered as covariates
and β = βj and xij = xi. However, if both types of covariates are included, the
covariates that vary across individuals are constant for all choices and cancel out
of the logit formulation. It can be shown that the logit formulation is a direct
result of the extreme value distribution placed on the random errors in Equation
3.6. Appendix A provides an explicit derivation of McFadden’s CL formulation.
3.3 Grouped Conditional Logit
In a pollination context, each pollinator species is faced with the same choice
set, or J plant species. Further, it is assumed that the pij are identical for all
individuals in the same group, or pollinator species. Thus the covariate values
are identical across members of a group. As such, the utility function for the
ith individual in the gth group and the probability that the individuals in the gth
group select the jth plant can be rewritten as:
Uigj = β′xgj + εigj, (3.8)
and
\[
p_{gj} = \frac{\exp(\beta' x_{gj})}{\sum_{l=1}^{J} \exp(\beta' x_{gl})}, \tag{3.9}
\]
where the xgj is the vector of covariate values corresponding to pollinator species
g and plant species j, and β and εigj are defined as in Section 3.2.
The likelihood function for the grouped conditional logit is:
\[
L_{GCL} = \prod_{g=1}^{G} \prod_{j=1}^{J} p_{gj}^{y_{gj}}, \tag{3.10}
\]
where the ygj are the number of individuals from pollinator species g that select
plant species j. The parameters of the GCL model can be estimated via maxi-
mum likelihood (ML) procedures available in most statistical software packages.
Guimaraes and Lindrooth (2007) give a detailed discussion of the equivalent log-
linear model that arises when the ygj are modeled as a count variable. As such,
the corresponding model parameters can be estimated using Poisson regression.
In this thesis, the plant-pollinator interaction probabilities are modeled using
only the GCL and the DM models and the estimates and corresponding standard
errors of β are compared.
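The GCL probabilities and log-likelihood can be sketched as follows. This is an illustration under stated assumptions, not the Stata routine used in the thesis: the nested-list layout X[g][j] for the covariate vectors and the function names are hypothetical.

```python
from math import exp, log

def gcl_probabilities(x_g, beta):
    """p_g1, ..., p_gJ for one pollinator species (Equation 3.9).

    x_g : list over plants j of covariate vectors x_gj
    """
    scores = [sum(b * x for b, x in zip(beta, x_gj)) for x_gj in x_g]
    shift = max(scores)                  # numerical stability only
    weights = [exp(s - shift) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def gcl_log_likelihood(Y, X, beta):
    """Log of Equation 3.10: sum over g and j of y_gj * log(p_gj)."""
    ll = 0.0
    for y_g, x_g in zip(Y, X):
        p_g = gcl_probabilities(x_g, beta)
        ll += sum(y * log(p) for y, p in zip(y_g, p_g))
    return ll

# Toy network: G = 2 pollinators, J = 2 plants, K = 1 covariate.
X = [[[1.0], [0.0]], [[0.0], [0.0]]]
Y = [[3, 1], [2, 2]]
ll = gcl_log_likelihood(Y, X, [log(3.0)])
```

Because the probabilities are normalized within each pollinator species, any covariate that takes the same value for every plant j within a group cancels out of the ratio.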
3.4 Dirichlet-Multinomial Model
In the standard GCL model, it is assumed that the pgj are fixed constants,
g = 1, . . . , G and j = 1, . . . , J. However, due to the complex structure of
plant-pollinator networks, the variability of the counts in Y is often greater than that
predicted by the GCL model, a phenomenon known as over-dispersion. Consequently, pgj may vary
within a pollinator species due to some unobserved heterogeneity (Faraway, 2006).
For example, the pollinators in species g may be observed more frequently than
those from other species due to observation error; therefore, the counts for the
species g may be greater than predicted by pgj. To account for the possibility of
this group-specific heterogeneity, the utility function can be expressed as
Uigj = β′xgj + ηgj + εigj, (3.11)
where ηgj is the random group effect for pollinator species g and plant species
j; and the εigj are independent conditional on the group random effect, for i =
1, . . . , N , g = 1, . . . , G, and j = 1, . . . , J .
Conditional on the group random effects, a modified expression for the prob-
ability that an individual from pollinator g selects plant j is:
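\[
p_{gj} = \frac{\exp(\beta' x_{gj} + \eta_{gj})}{\sum_{l=1}^{J} \exp(\beta' x_{gl} + \eta_{gl})} = \frac{\lambda_{gj} \exp(\eta_{gj})}{\sum_{l=1}^{J} \lambda_{gl} \exp(\eta_{gl})}, \tag{3.12}
\]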
where λgj=exp(β′xgj), for g = 1, . . . , G, and j = 1, . . . , J . Introducing this extra
level of variability into the model can allow for correlation across the choices for
pollinators in the same group, which translates into over-dispersion of the ygj
counts (Guimaraes and Lindrooth, 2007).
Assume that the $\exp(\eta_{gj})$ are independent and gamma distributed with shape
and rate parameters both equal to $\delta_g^{-1}\lambda_{gj}$, where $\delta_g^{-1} > 0$.
Then, the expected value of $\exp(\eta_{gj})$ is 1 and the variance is $\delta_g\lambda_{gj}^{-1}$.
Furthermore, the products $\lambda_{gj}\exp(\eta_{gj})$, for g = 1, . . . , G and j = 1, . . . , J, also have
independent gamma distributions, with shape and rate parameters $(\delta_g^{-1}\lambda_{gj}, \delta_g^{-1})$.
Since, within a group, all of these variables follow independent gamma distributions with the
same rate parameter, the vector of probabilities for a given pollinator species, or group,
$(p_{g1}, \ldots, p_{gJ})$ follows a Dirichlet distribution with parameters
$(\delta_g^{-1}\lambda_{g1}, \ldots, \delta_g^{-1}\lambda_{gJ})$ (Mosimann, 1962).
The probability density function of (pg1,. . . ,pgJ) can then be written as:
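\[
f_D(p_{g1}, \ldots, p_{g,J-1}) = \frac{\Gamma(\delta_g^{-1}\lambda_g)}{\prod_{j=1}^{J} \Gamma(\delta_g^{-1}\lambda_{gj})} \prod_{j=1}^{J} p_{gj}^{\delta_g^{-1}\lambda_{gj} - 1}, \tag{3.13}
\]
where $p_{gJ} = 1 - \sum_{j=1}^{J-1} p_{gj}$ and $\lambda_g = \sum_{j=1}^{J} \lambda_{gj}$.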
Within a Bayesian framework, the above modification is equivalent to placing
a Dirichlet prior on pgj. Note that the Dirichlet distribution happens to be the
conjugate prior for the multinomial distribution. The resulting distribution for
Y is the Dirichlet-multinomial distribution with parameters (ng; pg1,. . . ,pgJ), g =
1, . . . , G, where $n_g = \sum_{j=1}^{J} y_{gj}$. Mosimann (1962) provides a closed form expression
for the unconditional DM likelihood:
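\[
L_{DMd} = \prod_{g=1}^{G} \frac{n_g!\,\Gamma(\delta_g^{-1}\lambda_g)}{\Gamma(\delta_g^{-1}\lambda_g + n_g)} \prod_{j=1}^{J} \frac{\Gamma(\delta_g^{-1}\lambda_{gj} + y_{gj})}{y_{gj}!\,\Gamma(\delta_g^{-1}\lambda_{gj})} \tag{3.14}
\]
for g = 1, . . . , G and j = 1, . . . , J and where $\lambda_g = \sum_{j=1}^{J} \lambda_{gj}$.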
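As a rough sketch, the logarithm of the DM likelihood can be evaluated with log-gamma functions, which avoids overflow in the factorial and gamma terms. The function name and the nested-list inputs are illustrative assumptions; the counts entering the product over j are taken to be the observed y_gj.

```python
from math import lgamma

def dm_log_likelihood(Y, Lam, delta):
    """Log of the DM likelihood with dispersion parameters delta_g.

    Y[g][j]   : observed counts y_gj
    Lam[g][j] : lambda_gj = exp(beta' x_gj)
    delta[g]  : over-dispersion parameter for group g (must be > 0)
    """
    ll = 0.0
    for y_g, lam_g, d in zip(Y, Lam, delta):
        n_g = sum(y_g)
        alpha = [lam / d for lam in lam_g]   # Dirichlet parameters lambda_gj / delta_g
        alpha0 = sum(alpha)                  # lambda_g / delta_g
        ll += lgamma(n_g + 1) + lgamma(alpha0) - lgamma(alpha0 + n_g)
        ll += sum(lgamma(a + y) - lgamma(y + 1) - lgamma(a)
                  for a, y in zip(alpha, y_g))
    return ll

# A single observation (n_g = 1) has probability alpha_j / alpha_0.
ll_single = dm_log_likelihood([[1, 0]], [[2.0, 6.0]], [2.0])
```

As delta_g shrinks toward zero, the value approaches the multinomial log-likelihood with probabilities lambda_gj / lambda_g, which gives a convenient sanity check.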
A graphical model for DM regression is shown in Figure 3.1. The observed
counts in Y follow a multinomial distribution with parameters P = (p11, . . . , pGJ).
The interaction matrix P follows a Dirichlet distribution with parameters
$(\delta_1^{-1}\lambda_{11}, \ldots, \delta_G^{-1}\lambda_{GJ})$. These Dirichlet parameters depend on observed covariates
X and the associated parameter β through λgj = exp(β′xgj) and over-dispersion
parameter δg. As mentioned earlier, the counts in Y may be more variable than
predicted by P; therefore, δ is group specific and accounts for this variability. In
summary, the pgj provide information on the strength of the links in the network
and β summarizes the covariates’ contributions to those probabilities. Only the
observed counts in Y and the observed covariates in X are needed to estimate the
parameters of the DM model.
Figure 3.1: Graphical model of DM regression. Y contains the observed counts; P contains the corresponding interaction probabilities, which are a function of multi-dimensional array X, consisting of k observable covariate matrices, and corresponding β; δ is an over-dispersion parameter.
As mentioned earlier, the GCL model can be rewritten as a log-linear model
by letting ygj follow a Poisson distribution and conditioning on the sum of counts
ng. Guimaraes and Lindrooth (2007) give the analogous relationship between the
DM model and the negative binomial model, also known as the negative binomial
type 1 or negative binomial model with fixed effects in the econometrics literature.
Once again, conditioning on the sum of counts ng, assume ygj follow a Poisson
distribution with parameter $\lambda^*_{gj}$ and let $\lambda^*_{gj}$ follow a gamma distribution with
parameters $(\delta_g^{-1}\lambda_{gj}, \delta_g^{-1})$. Then under these assumptions, the ygj follow a negative
binomial distribution. This parameterization was used in the simulation study to
generate plant-pollinator networks, discussed in Section 4.3.
3.4.1 Additional Parameterizations and Considerations
As mentioned earlier, the addition of the group random effect ηgj induces
correlation across the choices of plant species. Under the DM model, the marginal
distribution of each ygj is a beta-binomial distribution with parameters (ng, pgj)
(Guimaraes and Lindrooth, 2007). As such, the intragroup correlation coefficient can be
expressed as:
\[
\rho_g = \frac{1}{\delta_g^{-1}\lambda_g + 1} = \frac{\delta_g}{\lambda_g + \delta_g} \tag{3.15}
\]
for g = 1, ..., G.
By inspection, it is obvious that ρg tends to zero as δg approaches zero. The
DM likelihood parameterized in terms of the intragroup correlation coefficient is:
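\[
L_{DMr} = \prod_{g=1}^{G} \frac{n_g!\,\Gamma(\rho_g^{-1} - 1)}{\Gamma[(\rho_g^{-1} - 1) + n_g]} \prod_{j=1}^{J} \frac{\Gamma[(\rho_g^{-1} - 1)p_{gj} + y_{gj}]}{\Gamma[(\rho_g^{-1} - 1)p_{gj}]\,y_{gj}!}. \tag{3.16}
\]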
Maximization of the likelihoods provides estimates of β and the dispersion
parameters. An iterative procedure such as Newton-Raphson or Fisher scoring
can be employed to obtain the maximum likelihood (ML) estimates. Existing
routines are available in statistical software packages such as LIMDEP and
Stata.
In this thesis, Stata was used to obtain estimates of the DM model parame-
ters. Figure 3.2 displays the graphical model of Stata’s parameterization of DM
regression. Stata’s implementation of DM regression models the interaction
probabilities $p_{gj}$ as a function of $\lambda^* = e^{\gamma' x^*} = e^{\beta' x_{gj} - \psi' x_g}$, where $\beta$ and $x_{gj}$ are as defined
in Sections 3.2 and 3.3, respectively,
\[
\gamma = \left(\begin{array}{c} \beta \\ \hline \psi \end{array}\right)
\quad \text{and} \quad
X^* = \left(\begin{array}{c} x_{gj} \\ \hline x_g \end{array}\right).
\]
Figure 3.2: Graphical model of Stata’s parameterization of DM regression. Y contains the observed counts; P contains the corresponding interaction probabilities, which are a function of $\lambda^* = e^{\gamma' x^*} = e^{\beta' x_{gj} - \psi' x_g}$.
If δg is modeled as a constant, then ψ is an unknown scalar constant and xg = 1.
Otherwise, ψ is a vector of unknown coefficients and xg is a vector of pollinator-
specific covariates.
Since the group random effect ηgj translates into a pollinator-specific over-
dispersion parameter, δg or ρg, it may be modeled in several ways. Hence, the
options for modeling the dispersion parameters for the DM model in Stata are as
follows:
1. Over-dispersion parameter is modeled as a constant: $\delta_g = e^{-\delta}$
This implementation assumes that all pollinator species share the same over-
dispersion parameter. This is equivalent to introducing an intercept term
into the model.
2. Over-dispersion parameter is modeled as a function of pollinator species
covariates: δg = f(xg)
This implementation assumes $\delta_g = e^{-\psi' x_g}$. This is equivalent to introducing
an intercept and additional coefficient terms into the model.
3. Intragroup correlation coefficient is modeled as a constant: logit(ρg) = ρ
This implementation assumes that all pollinator species share the same in-
tragroup correlation coefficient. Hence, this is equivalent to introducing an
intercept term into the model.
4. Intragroup correlation coefficient is modeled as a function of covariates: ρg =
f(xg)
This implementation assumes logit(ρg) = ψ′xg. In Stata, this is equivalent
to introducing an intercept and additional coefficient terms into the model.
The pollinator-specific parameters account for extra-multinomial variability, but
do not affect the choice probabilities since they drop out of the logit formulation.
However, the addition of pollinator-specific covariates (Options 2 and 4 above)
can provide additional insight into the heterogeneity of plant-pollinator networks.
Option 2 provides information on the impact of these covariates on the number
of times each plant species is chosen (Guimaraes and Lindrooth, 2007). Option 4
provides an assessment of the impact that the covariates have on the correlation
across plant species for individuals in the same pollinator species.
In this thesis, Options 1–3 and the GCL model were used to generate plant-
pollinator networks for the simulation studies and Options 1–4 and the GCL model
were used to fit the simulated data sets. Chapter 4 gives a detailed outline of the
procedures carried out in the simulation studies.
Chapter 4
Design of Simulation Study
This chapter provides a description of the simulation studies conducted for
the evaluation of the DM regression model in the context of pollination networks.
Section 4.1 describes the design of Simulation Study A, for which the aim was to
gain insights into the robustness of DM regression to various parameterizations of
the model dispersion. Section 4.2 describes the design of Simulation Study B, for
which the aim was to challenge the performance of DM regression with respect to
the parameter boundaries tested in Simulation Study A. Section 4.3 explains the
data generation procedure for the various plant-pollinator networks used in the
simulation studies. Finally, Section 4.4 sketches the model fitting techniques used
to analyze the simulated data sets and the summary statistics used to compile the
results of the simulation studies.
4.1 Description of Simulation Study A
The main objective of Simulation Study A was to evaluate the overall per-
formance of the DM model for plant-pollinator network data and to compare its
performance to the GCL model in the presence of mild over-dispersion. Data were
generated based on three sets of parameter values, three network sizes and four
dispersion structures. The DM model parameters were randomly generated as
follows:
• βi ∼ Uniform(1 + i, 3 + i), i = 0, 1, 2
• δg ∼ Uniform(0, 2), for δg = δ
• δg ∼ Uniform(−2, 4), for δg = f(xg)
• ρg ∼ Uniform(0, 0.25)
The β parameters were randomly generated from a Uniform distribution over in-
terval (1, 3) for the first set of parameter values, but the interval was shifted up by
one for each subsequent set of parameter values. The dispersion parameters were
also randomly generated from a Uniform distribution: (i) the δg parameters were
selected within the range of (0, 2) if δg was considered constant for all pollinator
species or within the range of (-2, 4) if δg was modeled as a function of pollinator
covariates, and (ii) the ρ parameters were selected within the range of (0, 0.25).
Parameter ranges were based on an ad-hoc pre-analysis (results not shown)
that considered various parameter values in the generation of plant-pollinator
networks. The results from the pre-analysis suggested that β < −1 or β > 5
resulted in sparse networks containing almost all zero counts or highly populated
networks containing astronomical counts (>1 million), respectively. Similarly, δ
values in excess of 10 produced sparse networks with many zero counts and a few
cells containing very low counts. Also, ρ values in excess of 0.5 produced many
zero counts with a few cells containing high counts. Accordingly, the β ranges
were selected in such a way so as to produce heavily populated plant-pollinator
networks, i.e., N > 90,000, while the ranges of the dispersion parameters were
selected to represent mild over-dispersion, i.e. δ < 2 and ρ < 0.25. The choice
of parameter values reflected an attempt to evaluate the performance of the DM
model with network counts representing close to an infinite population.
Network sizes were selected based on a review of 35 published plant-pollinator
communities available from the Interaction Web Database (IWDB) (Guimaraes
et al., 2011). The Interaction Web Database is a cooperative effort of scientists
interested in the study of species interactions and is hosted by the National
Center for Ecological Analysis and Synthesis, at the University of California,
U.S.A. Similar to the technique used by Santamaría and Rodríguez-Gironés
(2007), the number of plant species J in a given network was randomly gener-
ated from a Uniform distribution over (7, 135), the endpoints of which corre-
sponded to minimum and maximum number of plant species from all networks
recorded in IWDB, respectively. The number of pollinator species G was then
calculated from a regression of pollinators on plants fit from the same networks,
i.e., $G = (0.5491 + 4.4821 \log(\sqrt{J}))^2$.
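The network-size rule can be sketched in a few lines. Two details are assumptions rather than statements from the text: the logarithm is taken to be the natural log, and G is rounded to the nearest integer. Under those assumptions the function reproduces the pollinator counts 57, 90 and 105 reported in Table 4.1 for J = 23, 54 and 76. Drawing J as an integer from 7 to 135 inclusive is likewise an illustrative reading of "Uniform over (7, 135)".

```python
import random
from math import log, sqrt

def pollinators_from_plants(J):
    """Number of pollinator species G implied by the fitted regression on J.

    Assumes a natural logarithm and rounding to the nearest integer.
    """
    return round((0.5491 + 4.4821 * log(sqrt(J))) ** 2)

def draw_network_size(rng):
    """Draw J uniformly on the integers 7..135, then derive G from J."""
    J = rng.randint(7, 135)
    return pollinators_from_plants(J), J

G, J = draw_network_size(random.Random(0))
```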
The four types of dispersion structures represent no dispersion, constant dis-
persion, dispersion as a function of pollinator covariates, and constant dispersion
in terms of intragroup correlation. Table 4.1 outlines the scenarios defined as a
unique combination of parameter set, network size and dispersion structure.
So for each of the three parameter sets, all combinations of network size and
dispersion structure were considered, resulting in 3 × 3 × 4 = 36 combinations
in total. For each scenario, 750 plant-pollinator networks were generated (as
outlined in Section 4.3), fit using GCL and DM regression models and analyzed
via summary statistics (as outlined in Section 4.4).
4.2 Description of Simulation Study B
The results of Simulation Study A provided insights into the performance of DM
regression for varying sizes of networks, but because data were generated for
specific sets of parameter values, one cannot conjecture about performance trends
for varying β, δg or ρg values. In response to this concern, Simulation Study B
focused on one network size, but considered many combinations of the other model
parameters. More specifically, data were generated based on one network size,
two sets of β values, and one set of ten dispersion values (incorporating the four
dispersion structures outlined in Section 4.1).
Following the same procedure as was used in Simulation Study A, the network
size was obtained using the median number of plant species (J = 18) recorded in
IWDB (Guimaraes et al., 2011) and letting the number of pollinator species equal
Table 4.1: Simulation Study A

Parameter Set                                          Network     Dispersion
β                     δg     δg = f(xg)       ρ        Size        Structure
(3.21, 3.85, 3.50)    0.4    (1.48, −0.55)    0.05     57 x 23     δg = 0
(3.92, 4.54, 3.52)    1.1    (0.25, 2.41)     0.08     90 x 54     δg = δ
(2.78, 2.35, 1.88)    1.6    (−0.96, 2.13)    0.11     105 x 76    δg = f(xg)
                                                                   ρg = ρ
$G = (0.5491 + 4.4821 \log(\sqrt{J}))^2 \approx 49$.
The DM model parameters were randomly generated according to:
• βi ∼ Uniform(−1 + i, 2 + i), i = 0, 1
• δg ∼ Uniform(−1, 3), for δg = f(xg)
and dispersion parameters specified as follows:
• δg = 0, 0.1, 0.5, 2, for δg = δ
• ρg = 0.05, 0.2, 0.5.
The choices of parameter ranges and values were based on both the pre-analysis
and the results of Simulation Study A. As such, the β parameters were randomly
generated from a Uniform distribution within the range of (-1, 3), in order to
produce total network counts that matched the averages of those recorded in
IWDB, i.e., 1700 < N < 3400.
The dispersion parameters were selected in such a way so as to incorporate
the four dispersion structures introduced in Table 4.1, but also to provide a rep-
resentative range of values that spanned the boundaries of the parameter space
based on Simulation Study A. Accordingly, if δg was considered constant for all
pollinator species, values were selected within the range of (0, 2), where the spe-
cial case of δg = 0 accounts for a no dispersion structure. If δg was modeled as
a function of pollinator covariates, the parameters of δg were randomly generated
from a Uniform distribution within the range of (-1, 3). Finally, in terms of the
intragroup correlation coefficient, ρ values were selected within the range of (0,
0.5).
Table 4.2: Simulation Study B

Network Size    β                    Dispersion
48 x 19         (1.2, -0.4, 0.1)     none                         δ = 0
                (1.1, 0.8, 2.3)      constant                     δ = 0.1
                                                                  δ = 0.5
                                                                  δ = 2
                                     function of covariates       δg = 0.2 + 1.4xg
                                                                  δg = −0.9 + 2.1xg
                                                                  δg = 1.5 − 0.5xg
                                     intragroup corr. constant    ρ = 0.05
                                                                  ρ = 0.2
                                                                  ρ = 0.5
Table 4.2 outlines the scenarios defined as a unique combination of a set of β
parameters and a dispersion parameter. So for each set of the two sets of β
parameters, data were generated for the ten dispersion parameter values, resulting
in 2 × 10 = 20 scenarios in total. The advantage of this simulation study is that
it allows for a comparison of the four dispersion structures and a comparison of
the effect of varying amounts of dispersion, given a set of β values and a network
size of 48x19. For each scenario, data were: (i) generated for 750 individual
plant-pollinator networks (as outlined in Section 4.3), (ii) fit using GCL and DM
regression models, and (iii) analyzed via Monte Carlo biases and standard errors
(as outlined in Section 4.4).
4.3 Data Generation
Data were generated in R (R Development Core Team, 2011) for all simulation
studies. A total of three covariates, which incorporate the theories of linkage
rules and neutrality, were used to model the interaction probabilities. Two binary
linkage rules, one barrier trait and one complementarity trait, were simulated
based on plant and pollinator species traits as follows:
1. Mean proboscis length ∼ Uniform (0, 10)
2. Mean tubal length ∼ Uniform (0,10)
3. Sweet fragrance preference ∼ Bernoulli (0.5)
4. Sweet fragrance status ∼ Bernoulli (0.65)
Each linkage rule was then determined by matching the plant and pollinator
traits according to the following Boolean operators (Santamaría and Rodríguez-Gironés,
2007): If the mean proboscis length was greater than or equal to the
tubal length for a given plant-pollinator species pair, then the barrier trait for
that plant-pollinator pair equaled 1; 0 otherwise. Similarly, if the sweet fragrance
preference and status matched for a given plant-pollinator species pair, then the
complementarity trait for that plant-pollinator pair equaled 1; 0 otherwise.
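The two binary linkage-rule covariates described above can be sketched as follows. The thesis generated its data in R; this Python version is only an illustration. It assumes that proboscis length and fragrance preference are pollinator traits while tubal length and fragrance status are plant traits, and the function name and return layout are hypothetical.

```python
import random

def simulate_linkage_covariates(G, J, rng):
    """Return (barrier, complement, traits) for G pollinators and J plants."""
    traits = {
        "proboscis": [rng.uniform(0, 10) for _ in range(G)],       # pollinator
        "tube": [rng.uniform(0, 10) for _ in range(J)],            # plant
        "prefers_sweet": [rng.random() < 0.5 for _ in range(G)],   # pollinator
        "is_sweet": [rng.random() < 0.65 for _ in range(J)],       # plant
    }
    # Barrier trait: 1 if the proboscis is at least as long as the floral tube.
    barrier = [[int(traits["proboscis"][g] >= traits["tube"][j])
                for j in range(J)] for g in range(G)]
    # Complementarity trait: 1 if fragrance preference and status match.
    complement = [[int(traits["prefers_sweet"][g] == traits["is_sweet"][j])
                   for j in range(J)] for g in range(G)]
    return barrier, complement, traits

barrier, complement, traits = simulate_linkage_covariates(4, 3, random.Random(0))
```

Each returned matrix has one row per pollinator species and one column per plant species, matching the layout of the covariate matrices Xk in Section 2.4.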
Finally, relative species abundance was generated using the species abundance
distribution proposed by Ravasz et al. (2005). This type of covariate was used
to model both the interaction probabilities (which corresponds to plant species
abundance) and over-dispersion for δg = f(xg) (which corresponds to pollinator
species abundance). The normalized probability density function for the species
abundance distribution is:
\[
f(x) = \frac{1}{N_s \ln(N_s) - N_s + 1} \cdot \frac{N_s - x}{x} \tag{4.1}
\]
where Ns is the number of individuals in the most abundant species and x rep-
resents the size of a given species in a network. Using the inverse cumulative
distribution function method (Devroye, 1986), plant or pollinator species abun-
dances (SA) for a given network were randomly sampled using Ns = 15, 000. The
corresponding relative species abundances (RA) were calculated as:
\[
RA_i = \frac{SA_i}{\sum_{l=1}^{I} SA_l}. \tag{4.2}
\]
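A sketch of inverse-CDF sampling for the species abundance distribution follows. The support is assumed to be 1 ≤ x ≤ Ns, which is the range over which the stated normalizing constant makes the density integrate to one. Since the resulting CDF, F(x) = (Ns ln x − x + 1) / (Ns ln Ns − Ns + 1), has no elementary inverse, the sketch inverts it by bisection; that choice, the function names, and the use of Python rather than R are illustrative and not taken from the thesis.

```python
import random
from math import log

def abundance_cdf(x, n_s):
    """CDF of the species abundance density on [1, n_s] (assumed support)."""
    return (n_s * log(x) - x + 1) / (n_s * log(n_s) - n_s + 1)

def sample_abundance(n_s, rng, tol=1e-9):
    """Draw one species abundance by numerically inverting the CDF."""
    u = rng.random()
    lo, hi = 1.0, float(n_s)
    while hi - lo > tol * hi:            # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if abundance_cdf(mid, n_s) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_abundance(sa):
    """Normalize abundances so they sum to one (Equation 4.2)."""
    total = sum(sa)
    return [s / total for s in sa]

rng = random.Random(2)
sa = [sample_abundance(15000, rng) for _ in range(19)]
ra = relative_abundance(sa)
```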
Once the covariates were generated, 750 random samples from a DM distri-
bution were simulated in R. Since the counts in Y can also be modeled as over-
dispersed count variables as mentioned in Section 3.4, the Poisson and Gamma
distributions were used to generate each sample. The following algorithm was
used to generate samples from a DM distribution:
1. Calculate λgj = exp(β′xgj).
2. Randomly sample $\lambda^*_{gj} \sim \text{Gamma}(\delta_g^{-1}\lambda_{gj}, \delta_g^{-1})$.
3. Randomly sample ygj ∼ Poisson(λ∗gj).
4. Repeat Steps 2 and 3 for each of the 750 samples.
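The algorithm above can be sketched in Python using only the standard library (the thesis itself used R). Two points are assumptions: the second gamma parameter is read as a rate, so it is passed to `random.gammavariate` as the scale δg, and δg = 0 is treated as the no-dispersion case with λ* = λ. The Poisson sampler counts unit-rate exponential arrivals, which is exact but takes time proportional to the mean, so it is suitable only for illustration with modest counts.

```python
import random
from math import exp

def sample_poisson(mean, rng):
    """Exact Poisson draw: count unit-rate exponential arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def sample_dm_network(X, beta, delta, rng):
    """One simulated count matrix Y via the gamma-Poisson steps.

    X[g][j]  : covariate vector x_gj
    beta     : coefficient vector
    delta[g] : over-dispersion parameter for pollinator g
    """
    Y = []
    for x_g, d in zip(X, delta):
        row = []
        for x_gj in x_g:
            lam = exp(sum(b * x for b, x in zip(beta, x_gj)))           # Step 1
            # Step 2: shape lam/d and rate 1/d, i.e. scale d in gammavariate.
            lam_star = rng.gammavariate(lam / d, d) if d > 0 else lam
            row.append(sample_poisson(lam_star, rng))                   # Step 3
        Y.append(row)
    return Y

rng = random.Random(1)
Y = sample_dm_network([[[1.0], [0.0]]], [1.0], [0.5], rng)
```

Under this construction each count has mean λgj and variance λgj(1 + δg), so larger δg values produce more over-dispersed networks.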
For scenarios that considered a dispersion structure in terms of the intragroup
correlation coefficient as a non-zero constant, i.e., ρg = ρ, δg was calculated as a
function of ρ as follows:
\[
\delta_g = \frac{\rho}{1 - \rho}\,\lambda_g, \tag{4.3}
\]
for g = 1, . . . , G, and was substituted into the above algorithm.
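The conversion between the two dispersion parameterizations is a one-line rearrangement of Equation 3.15, sketched below with illustrative function names.

```python
def delta_from_rho(rho, lam_g):
    """Equation 4.3: delta_g implied by intragroup correlation rho (0 <= rho < 1)."""
    return rho / (1.0 - rho) * lam_g

def rho_from_delta(delta_g, lam_g):
    """Equation 3.15: intragroup correlation implied by delta_g."""
    return delta_g / (lam_g + delta_g)

# Round trip: rho = 0.2 with lambda_g = 8 gives delta_g = 2, and back again.
delta_g = delta_from_rho(0.2, 8.0)
rho_back = rho_from_delta(delta_g, 8.0)
```

Because λg differs across pollinator species, a constant ρ generally implies a different δg for each species.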
4.4 Model Fitting and Summary Statistics
Estimates of model parameters were obtained via Stata (StataCorp, 2011) using
the multin and dirmul commands. These commands are part of a package
named ‘groupcl’, available in the Statistical Software Components (SSC) library
of Stata. These commands implement a Newton-Raphson algorithm for the
ML estimation of the parameters. Initially, a Poisson regression model is fit to the
data set to provide starting values for the ML estimation. For any given data set,
regardless of the true dispersion structure, maximum likelihood (ML) estimates
for the model parameters were obtained under five different dispersion structure
assumptions, i.e. five different model fits. Table 4.3 lists the model fits and the
corresponding acronyms to be used as a reference in the Results section (Chapter
5).
Fitting the data sets under the different dispersion structure assumptions
makes it possible to examine the impact of incorrect modeling on the ML es-
timates, e.g. via calculation of biases and standard errors. Ultimately, it allows
for a comparison of the robustness and accuracy of the DM model to the GCL
model.
Table 4.3: Dispersion structures

Dispersion      Acronym
δ = 0           GCL
δg = δ          DMd
δg = f(xg)      DMdf
ρg = ρ          DMr
ρg = f(xg)      DMrf
For each data set, ML estimates, standard errors, log-likelihoods, number of
iterations until convergence, number of convergence issues, and Pearson χ2 test
statistics were recorded. Guimaraes and Lindrooth (2007) provide a modified
Pearson χ2 test for the DM model:
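\[
P = \sum_{g=1}^{G} \sum_{j=1}^{J} \frac{(y_{gj} - n_g p_{gj})^2}{\phi_g n_g p_{gj}}, \tag{4.4}
\]
where $\phi_g = \frac{\lambda_g + n_g\delta_g}{\lambda_g + \delta_g}$, for g = 1, . . . , G and j = 1, . . . , J.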
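A sketch of the modified Pearson statistic is given below. It assumes that the numerator compares each observed count y_gj with its expected value n_g p_gj, and that the fitted probabilities, the group totals λg and the dispersion parameters δg are supplied by the caller; the function name and argument layout are hypothetical.

```python
def dm_pearson_statistic(Y, P, lam_tot, delta):
    """Modified Pearson chi-square statistic for the DM model.

    Y[g][j]    : observed counts y_gj
    P[g][j]    : fitted probabilities p_gj
    lam_tot[g] : lambda_g, the sum over j of lambda_gj
    delta[g]   : over-dispersion parameter delta_g
    """
    stat = 0.0
    for y_g, p_g, lam_g, d in zip(Y, P, lam_tot, delta):
        n_g = sum(y_g)
        phi_g = (lam_g + n_g * d) / (lam_g + d)   # variance inflation factor
        stat += sum((y - n_g * p) ** 2 / (phi_g * n_g * p)
                    for y, p in zip(y_g, p_g))
    return stat

stat_gcl = dm_pearson_statistic([[6, 4]], [[0.5, 0.5]], [4.0], [0.0])
stat_dm = dm_pearson_statistic([[6, 4]], [[0.5, 0.5]], [4.0], [1.0])
```

With δg = 0 the inflation factor equals one and the statistic reduces to the usual multinomial Pearson statistic.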
Monte Carlo means, standard errors, and the corresponding biases were calcu-
lated as per Equations 4.5 and 4.6 for the model parameters using R (Robert and
Casella, 2010). Suppose that $\hat{\theta}_r$ denotes the ML estimator for a given parameter
θ obtained from the rth sample out of R replications, r = 1, . . . , R, under one of
the four types of dispersion structures. Then we define the bias and Monte Carlo
standard error of the ML estimator $\hat{\theta}$ as:
\[ \mathrm{Bias}(\hat{\theta}) = \bar{\theta} - \theta, \tag{4.5} \]
where $\bar{\theta} = \frac{1}{R}\sum_{r=1}^{R} \hat{\theta}_r$ is the Monte Carlo mean, and
\[ \mathrm{SE}(\hat{\theta}) = \sqrt{\frac{1}{R-1} \sum_{r=1}^{R} (\hat{\theta}_r - \bar{\theta})^2}. \tag{4.6} \]
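The Monte Carlo summaries in Equations 4.5 and 4.6 amount to a sample mean and a sample standard deviation with divisor R − 1. The following Python sketch shows the calculation for a single parameter; it is an illustration rather than the R code used for this thesis, and the function name is hypothetical.

```python
import numpy as np


def monte_carlo_summary(estimates, theta):
    """Return (Monte Carlo mean, bias, Monte Carlo standard error).

    estimates : ML estimates of one parameter, one per replication
    theta     : true parameter value used to generate the data
    """
    est = np.asarray(estimates, dtype=float)
    mc_mean = est.mean()        # Monte Carlo mean of the estimates
    bias = mc_mean - theta      # Equation 4.5
    se = est.std(ddof=1)        # Equation 4.6, divisor R - 1
    return mc_mean, bias, se
```

With hypothetical estimates (1.9, 2.1, 2.0, 2.2) of a true value θ = 2.0, the Monte Carlo mean is 2.05 and the bias is 0.05.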
Chapter 5 summarizes the numerical results of the simulation studies and Chapter
6 provides an interpretation of the trends suggested by these results.
Chapter 5
Results
The results of the two simulation studies are presented in Sections 5.1 and
5.2, respectively. The objective of Simulation Study A was to evaluate the overall
performance of DM regression compared to GCL regression for varying network
sizes and specific sets of parameter values. The results of the study suggest that
DM regression outperforms GCL regression when data are indeed over-dispersed
and significantly improves the model fit. Simulation Study B focuses on one
network size, but evaluates performance trends based on many combinations of
the other model parameters. The results of Simulation Study B reveal trends
similar to those of Simulation Study A; however, they raise questions about whether
DM regression can handle continuous covariates.
5.1 Simulation Study A Results
This simulation study was conducted to evaluate the overall performance of the
DM regression model compared to the standard GCL regression model. Parameter
values were chosen to ensure observed interaction frequency matrices were not too
sparse. Further, sets of parameter values (β, δg, ρg) were used to simulate networks
of various sizes assuming one of four dispersion structures discussed in this thesis
(GCL, DMd, DMdf, DMr, as per Table 4.3) in order to assess whether observed
trends in performance, with respect to misspecification of dispersion structure,
depended on network size or not.
Although only four dispersion structures were used to generate data, all five
dispersion structures (GCL, DMd, DMdf, DMr, DMrf) were used to fit each data
set. In what follows, only the results associated with β = (2.78, 2.35, 1.88) and
network size 57 x 23 are presented and discussed in detail. However, the trends
seen here were similar for the other parameter sets, and result tables are provided
in Appendix B.
5.1.1 Model Convergence
Convergence issues due to Hessian instability did arise while fitting some data
sets to the DM models. Table 5.1 presents the percentage of the 750 samples that
reached convergence and were used to calculate the simulation statistics. For each
of the four scenarios, listed in the first column, the percentage of samples that
reached convergence for each of the five model fits are recorded in columns 2–6.
Note that all convergence issues arose either when data were generated with
zero dispersion, but fit by a DM model, i.e., model with non-zero dispersion, (first
row), or when data were generated with dispersion in terms of δg, but fit by the
DMrf model (last column), or both. For the scenario in which the true dispersion
was δg = 0, but the modeled dispersion was DMrf, the fit for only 2% of the
runs converged. Among those that did converge, more than 10 iterations were
needed to reach convergence. In the cases where 100% of the samples reached
convergence, e.g. the GCL model fits, convergence was reached in fewer than 10
iterations.

Table 5.1: Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 57 x 23.

True Dispersion       Modeled Dispersion*
                      GCL    DMd    DMdf   DMr    DMrf
δg = 0                100    79     45     54     2
δg = 0.9              100    100    100    100    18
δg = −1 + 2.1xg       100    100    100    100    51
ρg = 0.11             100    100    100    100    100

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Consequently, the summary tables throughout the remainder of this section
report statistics only when at least 50% of the samples reached convergence;
otherwise, not applicable (NA) is reported.
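The reporting rule above can be sketched as follows in Python. This is an illustration under the assumption that each replicate carries a convergence flag; the function name and inputs are hypothetical, and the mean stands in for any of the summary statistics.

```python
import numpy as np


def summarize_converged(estimates, converged, threshold=0.5):
    """Summarize converged replicates, or return "NA" if too few converged.

    estimates : one estimate per replicate
    converged : one boolean flag per replicate
    threshold : minimum proportion of converged replicates (0.5 = 50%)
    """
    est = np.asarray(estimates, dtype=float)
    ok = np.asarray(converged, dtype=bool)
    if ok.mean() < threshold:
        return "NA"
    # Only replicates that reached convergence enter the summary.
    return float(est[ok].mean())
```

Under this rule a model fit with 54% convergence is summarized from the converged samples only, whereas a fit with 45% convergence is reported as NA.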
5.1.2 Estimation of β
In general, most models produced accurate β estimates with small bias, i.e.,
percent relative bias ≤ 1. Tables 5.2 and 5.3 display the percent relative bias
of β for the four dispersion structure scenarios (GCL, DMd, DMdf, DMr) when
β = (2.78, 2.36, 1.88) and network size 57 x 23 for DM models fitted in terms of
δg and ρg, respectively.
Not surprisingly, the ML estimates for β obtained under a dispersion structure
that matched the true dispersion structure of the data tend to have the lowest
bias. For example, when data were generated with no over-dispersion, i.e., δg = 0, the
estimates obtained from the GCL regression model produced a percent relative
bias ≤ 0.03 for all three β parameters (Table 5.2). A similar trend can be seen for the
remaining three dispersion structure scenarios.
When the true underlying dispersion structure is δg = 0.9 or δg = −1 + 2.1xg,
fitting the data with any of the five dispersion structures produced β values with
low bias, though the DMd and the DMdf models (Table 5.2) showed slightly
lower bias than the DMr and DMrf models (Table 5.3). However, when the true
underlying dispersion structure is δg = 0 (no dispersion) or ρg = 0.11, fitting
the DMd and DMdf models produced β values with high bias relative to the other
modeled dispersion structures. Overall, regardless of the true dispersion
structure, the GCL and DMr models tended to have lower bias than the DMd and
DMdf models.
Tables 5.4 and 5.5 display the percent coefficient of variation of β for the DM
models parameterized in terms of δg and ρg, respectively. The true dispersion
structures are listed in the first columns of the tables.
Analogous to what was observed in Tables 5.2 and 5.3, model fits for which
the modeled dispersion matched the true dispersion produced β values with small
standard errors. The standard errors for the DMd model were very large when
data were generated with no dispersion. Finally, the standard errors for all models
when the true dispersion structure was either δg = 0.9 or δg = −1 + 2.1xg were small
and comparable to each other. Interestingly, when the true dispersion structure
was ρg = 0.11, no model seemed to fit the data well, though standard errors for the
GCL model were considerably higher than those of the other models. In general,
most of the standard errors for β3 are consistently greater than those for β1 and
β2. It should be noted that the average of the estimated standard errors of β (from
the Stata output) matched the trends seen here with respect to the Monte
Carlo standard errors of β (calculated as per Equation 4.6). As such, only the
Monte Carlo standard errors are discussed throughout this section.

Table 5.2: Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (< 0.01, < 0.01, 0.03)    (1.50, 1.03, 24.08)       NA
δg = 0.9              (0.01, < 0.01, 0.10)      (0.01, < 0.01, 0.10)      (0.01, < 0.01, 0.10)
δg = −1 + 2.1xg       (0.02, 0.02, 0.08)        (0.01, 0.01, 0.04)        (0.01, 0.02, 0.06)
ρg = 0.11             (0.25, 0.48, 0.19)        (33.40, 20.87, 78.84)     (33.3, 20.93, 78.70)

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg).

Table 5.3: Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (2.36, 4.00, 241.66)      NA
δg = 0.9              (0.01, < 0.01, 0.10)      NA
δg = −1 + 2.1xg       (0.01, 0.02, 0.06)        (0.43, 0.15, 1.56)
ρg = 0.11             (33.30, 20.93, 78.70)     (0.03, 0.04, 0.19)

* DMr: ρg = ρ; DMrf: ρg = f(xg).

Table 5.4: Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (0.29, 0.24, 2.45)        (13, 8.88, 299.79)        NA
δg = 0.9              (0.38, 0.33, 3.31)        (0.38, 0.33, 3.31)        (0.38, 0.33, 3.31)
δg = −1 + 2.1xg       (0.54, 0.45, 4.62)        (0.51, 0.43, 4.6)         (0.51, 0.43, 4.61)
ρg = 0.11             (5.67, 5.56, 60.20)       (1.80, 1.93, 29.94)       (1.80, 1.93, 30)

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg).

Table 5.5: Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 57 x 23 for DM models in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                          DMrf
δg = 0                (74.27, 389.71, 2800.45)     NA
δg = 0.9              (0.38, 0.33, 3.31)           NA
δg = −1 + 2.1xg       (0.51, 0.43, 4.61)           (0.56, 0.47, 5.04)
ρg = 0.11             (1.80, 1.93, 30)             (2.25, 2.14, 29.05)

* DMr: ρg = ρ; DMrf: ρg = f(xg).
5.1.3 Model Fit and Estimated Dispersion
Table 5.6 provides the median negative log-likelihood values for the samples
that reached convergence and that were generated with β = (2.78, 2.36, 1.88) and
network size 57 x 23. Each row in Table 5.6 corresponds to the true dispersion
structure specified in the first column. In terms of model fit, the improvement
in the log-likelihood and the Pearson χ2 statistics indicate that the DM models
do provide a better fit to the data. In fact, the log-likelihood values decrease by
two orders of magnitude for the DM models as compared to the GCL model. For
data generated with a non-zero dispersion structure, the percentage of Pearson χ2
p-values < 0.05 for samples that reached convergence ranged from 0−20% for the
DM model fits, while the percentages ranged from 60− 100% for the GCL model
fits, which suggests that the DM models tend to provide a better fit compared to
the GCL model (results not shown, refer to Simulation Study B for results and a
more detailed discussion).
Table 5.7 displays the percent relative bias and percent coefficient of variation
obtained for the dispersion parameters when the modeled dispersion matched the
true dispersion structure of the data. The true dispersion values are
listed in the first column.
The point estimate for δg = 0.9 produced small bias and corresponding stan-
dard errors; however, the point estimate for ρg = 0.11 produced slightly larger
bias, but a lower corresponding standard error. Interestingly, for δg = −1 + 2.1xg,
the point estimate of the slope produced considerably smaller bias and corresponding
standard error than the point estimate of the intercept term, for which the
associated standard error is noticeably the largest.

Table 5.6: Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 57 x 23.

True Dispersion       Modeled Dispersion*
                      GCL        DMd      DMdf     DMr      DMrf
δg = 0                1503182    20838    NA       20859    NA
δg = 0.9              1502999    22816    22815    23006    NA
δg = −1 + 2.1xg       1503256    24361    24359    24858    24856
ρg = 0.11             1504077    16733    16733    16334    16334

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table 5.7: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.36, 1.88) and network size 57 x 23.

True Dispersion       Modeled Dispersion
                      % Relative Bias      % Coefficient of Variation
δg = 0.9              1.56                 4.44
δg = −1 + 2.1xg       (0.1, 3.24)          (2.92, 56.76)
ρg = 0.11             2.73                 1.82
5.2 Simulation Study B Results
This simulation study was conducted to further compare the robustness of the
DM regression model to that of the GCL regression model, this time over a more
selective range of parameter values. The results of Simulation
Study A suggest that similar trends exist among all network sizes; therefore,
only one network size, 49 x 18, was selected for this study. Furthermore, ranges
for the β parameters were selected to produce plant-pollinator network counts
that matched the averages of those recorded in IWDB. Finally, the ranges for
the dispersion parameters were chosen to be more representative of the parameter
space, incorporating varying structures and values, in order to allow for a better
evaluation of misspecification of dispersion structure.
Analogous to Simulation Study A, comparisons are made between the perfor-
mance of the different DM models discussed (GCL, DMd, DMdf, DMr, DMrf),
fit to the data sets generated for the 20 unique combinations of β and dispersion
values from Table 4.2. The tables in this section report the results of the model
comparisons for the ten dispersion parameter values, given a set of β parameters.
5.2.1 Model Convergence
Once again, convergence issues due to Hessian instability arose while fitting
the DM models to the data. Tables 5.8 and 5.9 provide the percentage of the 750
samples that reached convergence, which were used to calculate the simulation
statistics, for β = (1.2,−0.4, 0.1) and β = (1.1, 0.8, 2.3), respectively.
Table 5.8: Percentage of samples that reached convergence for β = (1.2,−0.4, 0.1) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL    DMd    DMdf   DMr    DMrf
δg = 0                100    86     58     61     19
δg = 0.1              100    100    86     97     53
δg = 0.5              100    100    100    100    87
δg = 2                100    100    100    100    100
δg = 1.5 − 0.5xg      100    100    95     100    75
δg = 0.2 + 1.4xg      100    100    100    100    89
δg = −0.9 + 2.1xg     100    100    100    100    99
ρg = 0.05             100    100    100    100    100
ρg = 0.2              100    100    100    100    100
ρg = 0.5              100    100    100    100    100

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Similar trends can be seen in both tables. Most convergence issues were en-
countered when data were generated with true dispersion δg = 0 or δg = 0.1,
but were modeled with a different dispersion structure (rows 1 and 2), or if data
were generated with true dispersion parameterized in terms of δg, but modeled in
terms of ρg = f(xg) (last column). In particular, most convergence issues were
encountered when data generated in terms of dispersion parameter δg, either as
a constant or as a function of covariates, were fit to the DMrf model. For these
data sets, 54% to 99% of the 750 samples reached convergence, and, among those
that did, up to 13 iterations were needed to reach convergence. In the cases where
100% of the samples reached convergence, e.g. the GCL model fits, convergence
was met within 10 iterations.

Table 5.9: Percentage of samples that reached convergence for β = (1.1, 0.8, 2.3) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL    DMd    DMdf   DMr    DMrf
δg = 0                100    90     56     61     10
δg = 0.1              100    99     86     98     32
δg = 0.5              100    100    99     100    80
δg = 2                100    100    100    100    100
δg = 1.5 − 0.5xg      100    100    97     100    54
δg = 0.2 + 1.4xg      100    100    98     100    87
δg = −0.9 + 2.1xg     100    100    100    100    99
ρg = 0.05             100    100    100    100    100
ρg = 0.2              100    100    100    100    100
ρg = 0.5              100    100    100    100    100

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Consequently, the summary tables throughout the remainder of this section
report statistics only when at least 50% of the samples reached convergence;
otherwise, not applicable (NA) is reported.
5.2.2 Estimation of β
Tables 5.10 and 5.11 display the percent relative bias of β for β = (1.2,−0.4, 0.1)
and network size 49 x 18, for DM models fit in terms of δg and ρg, respectively.
Tables 5.12 and 5.13 display the percent relative bias of β for β = (1.1, 0.8, 2.3)
and network size 49 x 18, for DM models fit in terms of δg and ρg, respectively.
Table 5.10: Percent relative bias of β for β = (1.2,−0.4, 0.1) and network size 49 x 18 for DM model in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (0.15, 0.23, 17.81)       (0.19, 0.20, 9.82)        NA
δg = 0.1              (0.29, 0.69, 27.62)       (0.31, 0.73, 26.09)       (0.38, 0.46, 27.65)
δg = 0.5              (0.30, 0.56, 9.86)        (0.52, 0.77, 1.57)        (0.51, 0.81, 1.25)
δg = 2                (0.93, 0.51, 46.45)       (0.86, 0.46, 17.08)       (0.83, 0.46, 16.59)
δg = 1.5 − 0.5xg      (0.27, 0.05, 41.12)       (0.23, 0.05, 37.42)       (0.2, < 0.01, 39.61)
δg = 0.2 + 1.4xg      (0.22, 0.48, 36.57)       (0.17, 0.39, 28.15)       (0.21, 0.37, 28.29)
δg = −0.9 + 2.1xg     (0.51, 0.09, 67.15)       (0.30, 0.18, 29.59)       (0.29, 0.01, 30.86)
ρg = 0.05             (< 0.01, 1.18, 40.95)     (8.10, 2.13, 159.30)      (8.11, 2.16, 159.83)
ρg = 0.2              (0.83, 3.41, 112.80)      (17.11, 3.75, 327.44)     (17.04, 3.82, 327.24)
ρg = 0.5              (2.86, 0.33, 498.76)      (18.51, 2.40, 478.89)     (18.34, 2.40, 475.94)

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg).

Table 5.11: Percent relative bias of β for β = (1.2,−0.4, 0.1) and network size 49 x 18 for DM model in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (0.31, 0.35, 31.22)       NA
δg = 0.1              (0.38, 0.64, 25.31)       NA
δg = 0.5              (0.86, 0.79, 8.63)        (0.90, 0.68, 15.07)
δg = 2                (0.85, 0.66, 11.29)       (0.93, 0.65, 12.03)
δg = 1.5 − 0.5xg      (0.13, 0.04, 34.61)       (0.13, 0.01, 24.37)
δg = 0.2 + 1.4xg      (0.56, 0.28, 12.91)       (0.49, 0.67, 13.54)
δg = −0.9 + 2.1xg     (2.47, 0.37, 4.02)        (2.56, 0.34, 0.51)
ρg = 0.05             (< 0.01, 0.92, 15.25)     (< 0.01, 0.98, 15.84)
ρg = 0.2              (0.31, 1.31, 17.75)       (0.35, 1.37, 17.73)
ρg = 0.5              (0.17, 0.11, 139.65)      (0.25, 0.18, 135.33)

* DMr: ρg = ρ; DMrf: ρg = f(xg).

Table 5.12: Percent relative bias of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (0.12, 0.01, 0.04)        (0.18, 0.06, 0.10)        NA
δg = 0.1              (0.05, 0.27, 0.27)        (0.06, 0.28, 0.27)        (0.08, 0.29, 0.13)
δg = 0.5              (0.16, 0.03, 0.33)        (0.13, 0.03, 0.20)        (0.14, 0.03, 0.20)
δg = 2                (0.20, 0.45, 0.90)        (0.13, 0.36, 0.84)        (0.13, 0.37, 0.83)
δg = 1.5 − 0.5xg      (0.20, 0.08, 0.46)        (0.22, 0.07, 0.40)        (0.25, 0.09, 0.50)
δg = 0.2 + 1.4xg      (0.16, 0.34, 0.31)        (15.29, 36.76, 17.11)     (7.80, 17.00, 7.43)
δg = −0.9 + 2.1xg     (0.08, 0.01, 0.08)        (0.40, 0.12, 0.62)        (0.36, 0.19, 0.61)
ρg = 0.05             (0.45, 0.42, 0.65)        (10.45, 0.68, 7.61)       (10.42, 0.64, 7.58)
ρg = 0.2              (0.19, 0.57, 3.79)        (21.71, 2.70, 16.84)      (21.65, 2.65, 16.80)
ρg = 0.5              (2.36, 2.64, 8.76)        (21.47, 3.13, 17.20)      (21.33, 3.08, 17.11)

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg).

Table 5.13: Percent relative bias of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (0.15, 0.11, 0.07)        NA
δg = 0.1              (0.04, 0.30, 0.27)        NA
δg = 0.5              (0.01, 0.07, 0.13)        (0.02, 0.23, 0.01)
δg = 2                (0.73, 0.05, 0.10)        (0.71, 0.07, 0.09)
δg = 1.5 − 0.5xg      (0.26, 0.07, 0.37)        (0.62, 0.19, 0.03)
δg = 0.2 + 1.4xg      (0.45, 0.27, 0.03)        (0.38, 0.15, 0.18)
δg = −0.9 + 2.1xg     (1.07, 0.36, 1.26)        (1.16, 0.29, 1.27)
ρg = 0.05             (0.38, 0.24, 0.10)        (0.43, 0.28, 0.14)
ρg = 0.2              (0.33, 0.09, 1.96)        (0.26, 0.03, 1.91)
ρg = 0.5              (1.14, 0.01, 1.33)        (1.24, 0.05, 1.25)

* DMr: ρg = ρ; DMrf: ρg = f(xg).

Not surprisingly, most ML estimates for β obtained under a dispersion
structure that matched the true dispersion structure of the data seem to have the
lowest bias. For example, when the true underlying dispersion structure has ρg
as a non-zero constant, the estimates obtained from the DMr regression model
produced a bias noticeably lower than the other modeled dispersion structures. A
similar trend can be seen for most of the remaining dispersion scenarios.
Interestingly, when data were generated with δg = 0, the estimates obtained
from the GCL and DM regression models produced small bias for β1 and β2,
while the estimates obtained by the GCL model produced the lowest bias for
β3. When the true dispersion structure is δg = δ or δg = f(xg), most β values
had low bias. For β = (1.2,−0.4, 0.1), the DMd and DMdf models (Table 5.10)
showed slightly lower bias than the DMr and DMrf models (Table 5.11). For
β = (1.1, 0.8, 2.3), the DMd and DMdf models (Table 5.12) showed slightly lower
bias for all values of δg, except when δg = 0.5, for which the DMr and DMrf models
(Table 5.13) showed the lowest bias; however, when δg = f(xg), the other models
performed as well as or better than the DMdf model. In fact, when δg = 0.2 + 1.4xg,
the DMd and DMdf models produced the largest bias. Conversely, when the true
dispersion structure was ρg = ρ, then the DMd and DMdf models produced β val-
ues with noticeably high bias relative to the other modeled dispersion structures.
However, regardless of the true dispersion structure, the GCL, DMr and DMrf
models consistently showed lower bias than the DMd and DMdf models.
In general, the bias of β3 was greater than that of β1 and β2. β3 represents the
effect of plant relative species abundance on the interaction probabilities and is a
continuous covariate. Also, as the values of the dispersion parameters increase, the
bias of β1 and β3 also tends to increase, which is to be expected: the greater the
dispersion, the more heterogeneous the network, and the more difficult it is to detect
the structure.
Tables 5.14 through 5.17 display the percent coefficient of variation of β for
β = (1.2,−0.4, 0.1) (Tables 5.14 and 5.15) and β = (1.1, 0.8, 2.3) (Tables 5.16
and 5.17). Similar to what was seen in Tables 5.10 through 5.13, model fits for
which the modeled dispersion matched the true dispersion produced β values with
small standard errors. Further, for
data generated with δg = 0, standard errors for all models, with the exception of
the DMdf model, were comparatively small. The standard errors for all models
when the true dispersion structure was either δg = δ or δg = f(xg) were small and
comparable to each other, though the DMd and DMdf models produced slightly
lower standard errors. The one anomaly was for β = (1.1, 0.8, 2.3) and δg =
0.2 + 1.4xg, in which case the standard errors of β obtained from the DMd and
DMdf models were considerably larger. Interestingly, when the true dispersion
structure was ρg = ρ, all standard errors were relatively large regardless of the
modeled dispersion. Further, the standard errors for the GCL model tended to be
higher than those of the other models for all data sets generated with dispersion.
Overall, the standard errors tended to increase as the amount of dispersion
increased, which is expected with data showing a higher level of heterogeneity.
Also, the standard errors for β2 are considerably higher than those for β1 and β3.
Table 5.14: Percent coefficient of variation of β for β = (1.2,−0.4, 0.1) and network size 49 x 18 for DM model in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (6.25, 12.83, 3.09)       (6.26, 12.77, 3.17)       NA
δg = 0.1              (6.69, 12.85, 3.32)       (6.69, 12.83, 3.30)       (6.64, 12.83, 3.25)
δg = 0.5              (7.95, 15.66, 3.90)       (7.69, 15.63, 3.79)       (7.68, 15.67, 3.79)
δg = 2                (11.24, 21.47, 5.41)      (9.57, 19.48, 4.6)        (9.58, 19.47, 4.59)
δg = 1.5 − 0.5xg      (7.01, 12.97, 3.58)       (6.93, 12.92, 3.56)       (6.90, 12.83, 3.56)
δg = 0.2 + 1.4xg      (8.94, 16.67, 4.10)       (8.33, 16.01, 3.91)       (8.33, 16.01, 3.93)
δg = −0.9 + 2.1xg     (12.19, 22.61, 5.95)      (9.91, 19.56, 4.86)       (9.88, 19.61, 4.86)
ρg = 0.05             (10.82, 21.15, 5.33)      (9.19, 19.06, 4.69)       (9.19, 19.11, 4.69)
ρg = 0.2              (19.04, 40.47, 9.72)      (11.76, 27.36, 6.55)      (11.76, 27.37, 6.55)
ρg = 0.5              (38.7, 81.84, 20.84)      (18.92, 40.55, 10.76)     (18.92, 40.51, 10.78)

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg).
Table 5.15: Percent coefficient of variation of β for β = (1.2,−0.4, 0.1) and network size 49 x 18 for DM model in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (6.18, 12.36, 3.12)       NA
δg = 0.1              (6.71, 12.74, 3.30)       NA
δg = 0.5              (7.84, 15.63, 3.81)       (7.95, 15.84, 3.84)
δg = 2                (10.17, 19.54, 4.67)      (10.17, 19.55, 4.67)
δg = 1.5 − 0.5xg      (6.94, 12.90, 3.56)       (6.95, 12.86, 3.60)
δg = 0.2 + 1.4xg      (8.60, 16.06, 3.93)       (8.63, 16.00, 3.92)
δg = −0.9 + 2.1xg     (10.72, 19.45, 4.94)      (10.65, 19.50, 4.94)
ρg = 0.05             (9.76, 19.08, 4.69)       (9.77, 19.12, 4.70)
ρg = 0.2              (13.2, 27.41, 6.52)       (13.20, 27.43, 6.52)
ρg = 0.5              (20.64, 40.92, 10.75)     (20.71, 40.86, 10.77)

* DMr: ρg = ρ; DMrf: ρg = f(xg).
Table 5.16: Percent coefficient of variation of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of δg.

True Dispersion       Modeled Dispersion*
                      GCL                       DMd                       DMdf
δg = 0                (4.76, 4.43, 0.07)        (4.75, 4.45, 0.07)        NA
δg = 0.1              (5.24, 4.82, 0.08)        (5.23, 4.83, 0.08)        (5.24, 4.77, 0.08)
δg = 0.5              (5.80, 5.71, 0.09)        (5.64, 5.66, 0.09)        (5.63, 5.65, 0.09)
δg = 2                (8.68, 8.08, 0.13)        (7.83, 7.42, 0.12)        (7.84, 7.42, 0.12)
δg = 1.5 − 0.5xg      (5.51, 5.31, 0.07)        (5.49, 5.29, 0.07)        (5.47, 5.28, 0.07)
δg = 0.2 + 1.4xg      (7.04, 6.18, 0.10)        (181.49, 428.5, 1.91)     (176.26, 380.47, 1.57)
δg = −0.9 + 2.1xg     (9.19, 8.66, 0.13)        (7.92, 7.89, 0.12)        (7.94, 7.88, 0.12)
ρg = 0.05             (10.10, 10.34, 0.15)      (8.56, 8.64, 0.13)        (8.58, 8.65, 0.13)
ρg = 0.2              (20.02, 21.08, 0.31)      (11.36, 13.38, 0.20)      (11.36, 13.39, 0.20)
ρg = 0.5              (39, 40.31, 0.65)         (18.29, 20.17, 0.32)      (18.31, 20.21, 0.32)

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg).
Table 5.17: Percent coefficient of variation of β for β = (1.1, 0.8, 2.3) and network size 49 x 18 for DM model in terms of ρg.

True Dispersion       Modeled Dispersion*
                      DMr                       DMrf
δg = 0                (4.85, 4.42, 0.07)        NA
δg = 0.1              (5.24, 4.81, 0.08)        NA
δg = 0.5              (5.72, 5.69, 0.09)        (5.70, 5.57, 0.09)
δg = 2                (8.25, 7.50, 0.12)        (8.25, 7.48, 0.12)
δg = 1.5 − 0.5xg      (5.50, 5.31, 0.07)        (5.33, 5.44, 0.07)
δg = 0.2 + 1.4xg      (6.86, 6.04, 0.10)        (6.75, 6.05, 0.10)
δg = −0.9 + 2.1xg     (8.45, 7.95, 0.12)        (8.50, 7.95, 0.12)
ρg = 0.05             (9.05, 8.65, 0.13)        (9.07, 8.66, 0.13)
ρg = 0.2              (12.81, 13.47, 0.20)      (12.82, 13.46, 0.20)
ρg = 0.5              (20.17, 20.21, 0.32)      (20.19, 20.24, 0.32)

* DMr: ρg = ρ; DMrf: ρg = f(xg).
5.2.3 Model Fit and Estimated Dispersion
Tables 5.18 and 5.19 provide the median negative log-likelihood values for the
samples that reached convergence and for the scenarios with β = (1.2,−0.4, 0.1)
and β = (1.1, 0.8, 2.3), respectively. Both tables demonstrate that a model as-
suming some dispersion provided a better fit compared to the GCL model. The
improvement in log-likelihood was most marked when the true dispersion structure
was ρg = ρ, and as the intragroup correlation coefficient (ρ) increased.
The negative log-likelihood values for the DM models are approximately 10 to 30
percent of those for the GCL model. However, the model fit across the DM models
for any true dispersion structure was comparable.
Table 5.18: Median negative log-likelihood values for samples generated with β = (1.2,−0.4, 0.1) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL     DMd     DMdf    DMr     DMrf
δg = 0                4721    1238    1240    1243    NA
δg = 0.1              4725    1258    1259    1259    1263
δg = 0.5              4714    1317    1316    1318    1318
δg = 2                4723    1359    1358    1363    1362
δg = 1.5 − 0.5xg      4727    1281    1281    1281    1282
δg = 0.2 + 1.4xg      4712    1339    1339    1341    1342
δg = −0.9 + 2.1xg     4715    1352    1352    1357    1356
ρg = 0.05             4719    1370    1370    1366    1365
ρg = 0.2              4709    1149    1148    1144    1144
ρg = 0.5              4575    646     646     645     644

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table 5.19: Median negative log-likelihood values for samples generated with β = (1.1, 0.8, 2.3) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL     DMd     DMdf    DMr     DMrf
δg = 0                9221    1542    1545    1549    NA
δg = 0.1              9243    1570    1572    1571    NA
δg = 0.5              9219    1655    1655    1657    1658
δg = 2                9219    1776    1776    1783    1782
δg = 0.2 + 1.4xg      9223    1697    1697    1700    1700
δg = −0.9 + 2.1xg     9239    1786    1785    1793    1792
δg = 1.5 − 0.5xg      9230    1603    1603    1604    1606
ρg = 0.05             9210    1803    1802    1796    1795
ρg = 0.2              9197    1473    1473    1466    1466
ρg = 0.5              9001    802     803     800     799

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Tables 5.20 and 5.21 provide the percentage of Pearson χ2 p-values < 0.05 for
samples that reached convergence for the scenarios with β = (1.2,−0.4, 0.1) and
β = (1.1, 0.8, 2.3), respectively. The Pearson chi-squared goodness of fit tests also
suggest that the DM models provide an adequate fit compared to the GCL model.
In fact, the DM models provided a better fit than the GCL model in the absence
of dispersion. Conversely, the GCL model provides an inadequate fit 60−100% of
the time, for data generated with any of the dispersion structures. Interestingly,
the DMd and DMdf models tended to consistently provide the best fit regardless
of dispersion structures, while the DMr and DMrf models tended to have a higher
percentage of p-values < 0.05 as the value of the dispersion parameters increased.
When the modeled dispersion matched the true dispersion structure of the
data, the percent relative bias and percent coefficient of variation of the
estimated dispersion parameters can be compared. Tables 5.22 and 5.23 present the percent
relative bias and percent coefficient of variation for the dispersion parameters for
β = (1.2,−0.4, 0.1) and β = (1.1, 0.8, 2.3), respectively.
Table 5.20: Percentage of χ2 p-values < 0.05 for samples generated with β = (1.2,−0.4, 0.1) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL    DMd    DMdf   DMr    DMrf
δg = 0                6      0      1      0      NA
δg = 0.1              61     0      0      0      0
δg = 0.5              100    0      0      2      2
δg = 2                100    7      7      25     25
δg = 1.5 − 0.5xg      99     0      0      0      0
δg = 0.2 + 1.4xg      100    2      2      8      8
δg = −0.9 + 2.1xg     100    8      7      25     24
ρg = 0.05             100    1      1      5      5
ρg = 0.2              100    5      6      16     16
ρg = 0.5              100    7      7      16     16

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table 5.21: Percentage of χ2 p-values < 0.05 for samples generated with β = (1.1, 0.8, 2.3) and network size 49 x 18.

True Dispersion       Modeled Dispersion*
                      GCL    DMd    DMdf   DMr    DMrf
δg = 0                5      0      1      0      NA
δg = 0.1              60     0      0      0      NA
δg = 0.5              100    0      0      1      0
δg = 2                100    5      5      14     14
δg = 1.5 − 0.5xg      99     0      0      0      0
δg = 0.2 + 1.4xg      100    1      0      3      2
δg = −0.9 + 2.1xg     100    5      5      18     19
ρg = 0.05             100    2      2      7      7
ρg = 0.2              100    6      6      18     18
ρg = 0.5              100    8      9      20     20

* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table 5.22: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (1.2,−0.4, 0.1) and network size 49 x 18.

True Dispersion       Modeled Dispersion
                      % Relative Bias       % Coefficient of Variation
δg = 0.1              20.00                 1.42E+19
δg = 0.5              3.40                  76.20
δg = 2                2.00                  3.35
δg = 1.5 − 0.5xg      (3.20, 643.60)        (24.00, 2226.00)
δg = 0.2 + 1.4xg      (110.00, 169.36)      (80.00, 345.71)
δg = −0.9 + 2.1xg     (0.78, 10.81)         (15.56, 103.33)
ρg = 0.05             < 0.01                1.00
ρg = 0.2              0.50                  6.50
ρg = 0.5              4.40                  5.80

Table 5.23: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (1.1, 0.8, 2.3) and network size 49 x 18.

True Dispersion       Modeled Dispersion
                      % Relative Bias       % Coefficient of Variation
δg = 0.1              20.00                 5.34E+09
δg = 0.5              4.20                  73.00
δg = 2                4.60                  11.00
δg = 1.5 − 0.5xg      (3.73, 638.20)        (33.33, 2364.00)
δg = 0.2 + 1.4xg      (403.50, 126.29)      (8710.00, 1419.29)
δg = −0.9 + 2.1xg     (< 0.01, 10.62)       (12.22, 90.95)
ρg = 0.05             < 0.01                0.60
ρg = 0.2              < 0.01                2.20
ρg = 0.5              < 0.01                5.20

When the dispersion structure was either δg = δ or ρg = ρ, the DM model
tended to estimate the dispersion with small bias and standard error, except when
δ was small (δg = 0.1 or 0.5). When the dispersion structure was δg = f(xg), the
bias was generally small for the intercept parameter, but generally large for the
slope parameter. The associated standard errors were large for both the intercept
and slope parameters.
The trends presented in this chapter persisted in all scenarios and networks
simulated. Full results are presented in Appendix B.
Chapter 6
Discussion
The results of both simulation studies suggest that employing a model that
matches the true dispersion structure of the data tends to produce point estimates
of β with the smallest bias and standard errors, but, in general, bias and standard
errors increase as the amount of dispersion increases. However, the GCL and DMr
models seem to consistently provide estimates with small bias and small standard
errors, regardless of the true dispersion structure.
In the absence of dispersion, convergence issues, namely, due to Hessian in-
stability, were encountered when the DM models were used to fit the data. This
result is not surprising since the DM model collapses to the standard GCL model
when the group random effects are zero, making the use of the GCL model more
stable. Nevertheless, among those that did converge, the ML estimates obtained
for β using the DM models were comparable to those obtained using the GCL
model. In fact, the DM models were able to detect a lack of over-dispersion as
demonstrated by corresponding dispersion parameter estimates that were close to
zero and not statistically significant.
In the presence of over-dispersion, the DM models, on average, outperformed
the GCL model. As the values of δg and ρg increased, the ML estimates for β
from the DM models tended to have smaller bias and estimated standard errors
than the GCL estimates. The DM models parameterized in terms
of δg performed best when fit to data with the same dispersion structure, while
the DM models parameterized in terms of ρg tended to perform well regardless of
whether the dispersion was generated in terms of δg or ρg.
Recall that each model contained three covariates, two corresponding to link-
age rules and the third corresponding to relative species abundance. Results of
Simulation Study B demonstrated that the bias and standard errors for the plant
relative species abundance covariate estimates were noticeably larger than those
corresponding to the binary covariate estimates. One possible explanation may
be that relative species abundance is a continuous variable and that DM regres-
sion tends to handle estimation of coefficients for binary covariates better. Also,
as the values of the dispersion parameters increased, the corresponding bias and
standard errors for β also increased. This finding is not surprising since the more
over-dispersed the data are, the harder it may be to learn the structure of the
data.
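For reference, Monte Carlo summaries of this kind can be computed from replicate estimates as in the sketch below. It is a minimal illustration that assumes percent relative bias is defined as 100·|mean(β̂) − β|/|β| and percent coefficient of variation as 100·sd(β̂)/|mean(β̂)|. These definitions, the function names, and the replicate values are assumptions made for illustration and are not taken from the simulation code.

```python
import statistics

def percent_relative_bias(estimates, true_value):
    """100 * |mean(estimates) - true_value| / |true_value| (assumed definition)."""
    mean_est = statistics.mean(estimates)
    return 100.0 * abs(mean_est - true_value) / abs(true_value)

def percent_cv(estimates):
    """100 * sample standard deviation / |mean| (assumed definition)."""
    mean_est = statistics.mean(estimates)
    return 100.0 * statistics.stdev(estimates) / abs(mean_est)

true_beta = 2.78
replicates = [2.80, 2.75, 2.79, 2.77, 2.81]  # made-up estimates from five replicates

print(f"percent relative bias: {percent_relative_bias(replicates, true_beta):.3f}")
print(f"percent CV:            {percent_cv(replicates):.3f}")
```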
The same trends can be described for the estimation of the dispersion pa-
rameters. In general, estimates for the dispersion parameters were accurate with
small standard errors, but as the value of the dispersion parameters increased,
the bias and standard errors of the point estimates also increased. For data
generated with little dispersion, i.e. δg = 0.1, the DMd model had difficulties
obtaining an estimate of δg. The DM model parameterized in terms of ρg was able
to consistently provide estimates with small bias for the dispersion parameters.
However, the DMdf model had difficulties obtaining an estimate corresponding to
the group-level covariate (pollinator relative species abundance). Consequently,
the corresponding bias and standard errors were noticeably large. Pollinator rela-
tive species abundance is a continuous covariate, suggesting, once again, that DM
regression may have issues estimating this type of covariate.
Although the GCL model seems competitive with the DM models with respect to bias
and standard errors, the DM models have an advantage over the GCL model in
terms of model fit. The reduction in the negative log-likelihood of the DM models relative to
that of the GCL model suggests that modeling the dispersion helps reduce the variation
in the model. This hypothesis is supported by comparing the corresponding χ2
statistics, which also suggest that the DM models provide as good or better a fit,
regardless of the true dispersion structure.
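As an illustration of this kind of goodness-of-fit comparison, the following sketch computes a Pearson χ2 statistic for an interaction matrix against fitted probabilities. It assumes the multinomial form of the statistic, the sum over groups g and plants j of (n_gj − n_g·p_gj)²/(n_g·p_gj); the thesis may use a different variant for the DM models, and the counts and fitted probabilities below are invented.

```python
def pearson_chi2(counts, fitted_probs):
    """Pearson chi-square for counts[g][j] against fitted_probs[g][j].

    Each row of fitted_probs is assumed to sum to 1 (multinomial form).
    """
    stat = 0.0
    for row_counts, row_probs in zip(counts, fitted_probs):
        n_g = sum(row_counts)
        for observed, p in zip(row_counts, row_probs):
            expected = n_g * p
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical 2 pollinators x 3 plants interaction matrix and fitted probabilities.
counts = [[12, 5, 3], [2, 14, 4]]
fitted = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]

chi2 = pearson_chi2(counts, fitted)
print(f"Pearson chi-square: {chi2:.4f}")
```

A statistic that is large relative to its degrees of freedom is commonly read as a sign of lack of fit or over-dispersion, so comparing it across fitted models gives a rough view of which dispersion structure fits better.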
DM regression seems to be a robust method for modeling plant-pollinator networks
compared to GCL regression. Since one cannot predict the true dispersion
structure of an observed network, DM regression provides a procedure for the detection
and estimation of the known factors that contribute to network pattern.
More specifically, the results of the simulation studies suggest that, in practice, all
five models (GCL, DMd, DMdf, DMr, and DMrf) can be fit initially. Once
the models are fit, comparisons can be made via log-likelihood values or, more
formally, Pearson χ2 goodness-of-fit tests. If the estimates of the dispersion
parameters are not significantly different from zero and have inflated standard
errors, then there is evidence to suggest that the data are not over-dispersed, in
which case the GCL model is an appropriate choice. However, if the estimates
of β and the corresponding standard errors do not differ greatly between models,
and all models provide the same fit, then any one may be selected. Alternatively,
if the estimates of the dispersion parameters are significantly different from zero,
then the data may be over-dispersed, in which case the DM models may be an
appropriate choice. If convergence issues arise, then perhaps the DM model with
an alternative parameterization is more appropriate.
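The selection procedure described above can be summarized in a short sketch. Everything in it is hypothetical: the `FittedModel` record, the use of a Wald-type z test for a single dispersion parameter, the 0.05 threshold, and the numbers are illustrative assumptions rather than a definitive implementation. Models with several dispersion parameters (DMdf, DMrf) would need one such check per parameter.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class FittedModel:
    name: str
    converged: bool
    neg_loglik: float
    dispersion_est: Optional[float] = None  # None for the GCL model
    dispersion_se: Optional[float] = None

def wald_p_value(estimate: float, std_error: float) -> float:
    """Two-sided p-value for H0: parameter = 0 under a normal approximation."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))

def choose_model(fits: List[FittedModel], alpha: float = 0.05) -> Optional[str]:
    """Pick GCL when no dispersion estimate is significant, else the best-fitting DM model."""
    usable = [f for f in fits if f.converged]
    dm_fits = [f for f in usable if f.dispersion_est is not None]
    significant = [
        f for f in dm_fits
        if wald_p_value(f.dispersion_est, f.dispersion_se) < alpha
    ]
    if not significant:
        gcl = [f for f in usable if f.dispersion_est is None]
        return gcl[0].name if gcl else None
    # Among DM models with significant dispersion, prefer the smallest negative log-likelihood.
    return min(significant, key=lambda f: f.neg_loglik).name

# Made-up fit summaries for one hypothetical network.
fits = [
    FittedModel("GCL", True, 5200.0),
    FittedModel("DMd", True, 4650.0, 0.90, 0.05),
    FittedModel("DMr", True, 4680.0, 0.45, 0.03),
    FittedModel("DMrf", False, float("inf"), 0.0, 1.0),
]
print(choose_model(fits))
```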
If over-dispersion exists, the DM model can be parameterized to account for a
non-zero constant or pollinator specific covariates that can be used to absorb or
explain this extra-multinomial variability. If the dispersion parameters are mod-
eled as a function of covariates, the estimates of the pollinator specific parameters
can provide additional information in terms of the impact these covariates have
on the number of interactions that occur between a given pollinator and plant
pair (through the use of the DMdf model) or the impact these covariates have on
the correlation among individuals in a species to select a particular plant species
(through the DMrf model).
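To make the notion of extra-multinomial variability concrete, the sketch below uses a standard property of the Dirichlet-multinomial distribution: the variance of a count equals the multinomial variance multiplied by 1 + (n − 1)ρ, where n is the group total and ρ is the intra-group correlation. How ρ_g relates to δ_g in the parameterization of this thesis is not restated here, so the function takes ρ directly; the function names and numbers are invented for illustration.

```python
def multinomial_variance(n, p):
    """Variance of a single multinomial count with total n and probability p."""
    return n * p * (1.0 - p)

def dm_variance(n, p, rho):
    """Dirichlet-multinomial variance: multinomial variance inflated by 1 + (n - 1) * rho."""
    return multinomial_variance(n, p) * (1.0 + (n - 1) * rho)

n_visits, p_plant = 50, 0.3  # hypothetical group total and interaction probability
for rho in (0.0, 0.05, 0.11):
    print(f"rho = {rho:.2f}: variance = {dm_variance(n_visits, p_plant, rho):.3f}")
```

With ρ = 0 the variance reduces to the multinomial value, which matches the case in which the GCL model is adequate.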
Although DM regression appears to be a robust model for pollination network
data, at least for the results presented in this thesis, it should be noted that the
DMrf model is problematic due to Hessian instability resulting in convergence is-
sues. Furthermore, Simulation Study B suggests that the estimation of continuous
covariates may show larger bias and standard errors.
Additionally, highly populated networks, such as those generated in Simulation
Study A, are unrealistic and are not representative of those found in observed
networks. The results of Simulation Study A do, however, provide insight into
the asymptotic properties of the DM model parameters, which suggest that the
DM model does produce accurate estimates with small standard errors. Similarly,
despite the efforts in the design of Simulation Study B, the generated networks did
not exhibit the sparse and nested properties observed in real world networks. In
fact, the simulated networks were not designed to account for sampling bias, which
is known to influence observed network structure (Vazquez et al., 2009a).
Chapter 7
Conclusions
This thesis introduces Dirichlet-multinomial regression to the modeling of pollination
networks. It further evaluates the robustness of multinomial regression models
to misspecification of dispersion structure within an ecological context. Specifically,
GCL and DM regression were used to model the interaction probabilities
of various simulated plant-pollinator networks as a function of trait matching and
relative species abundance. The performance of the DM model was compared with
that of the standard GCL model under misspecification of dispersion structure
through simulation studies. The results of the simulation studies
suggest that both the GCL model and the DM models perform comparably for
plant-pollinator network data. However, the DM model outperforms the GCL
model in the presence of over-dispersion and significantly improves the model fit.
To date, simple statistical methods such as χ2 tests for proportions and simple
linear regressions have been employed on both real-world and simulated networks
to predict network structure. Unfortunately, these methods have only confirmed
that the factors in question contribute to and only partially explain network struc-
ture, but do not quantify the relative contribution of each factor. Further, they are
not commonly used in practice since they are all relatively ‘new’ methods introduced
within the past few years (Allesina et al., 2008; Santamaría and Rodríguez-Gironés,
2007; Stang et al., 2009; Vazquez et al., 2009b).
The mechanisms driving the topological features of plant-pollinator networks
were examined using an extension of the conceptual framework proposed by Vazquez
et al. (2009) and cutting-edge statistical modeling techniques borrowed from
econometrics (random utility model) (Guimaraes and Lindrooth, 2007). More
specifically, DM regression was used to exploit the theories of neutrality and link-
age rules to determine their relative contribution to the structure of mutualistic
plant-pollinator networks. DM regression uses a hierarchical model within a
Bayesian framework to model plant-pollinator interaction probabilities as a func-
tion of plant-pollinator characteristics (e.g. complementary phenotypic traits). In
short, the DM model allows for the exploration of covariates that are plant spe-
cific, pollinator specific, or both, and facilitates identification of factors that affect
interaction probabilities and estimates the relative contribution of those factors.
More specifically, DM regression uses a logit formulation to model the interac-
tion probabilities. Essentially, the log ratio of two probabilities is being modeled
as a linear combination of covariates. As such, the model obtains a β estimate for
each covariate introduced into the model which can be easily interpreted. The β
estimate corresponding to a particular covariate reflects the impact of the change
in that covariate value to the probability of choosing one plant species over the
other plant species. In other words, the probabilities pgj provide a measure of
the strength of the interactions between pollinator g, g = 1, . . . , G, and plant
j, j = 1, . . . , J , while β summarizes a covariate’s relative contribution to those
interaction probabilities.
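The interpretation of β as a contribution to a log ratio of probabilities can be illustrated with a small sketch. It assumes interaction probabilities of the conditional logit form p_gj = exp(β′x_gj)/Σ_k exp(β′x_gk) for a fixed pollinator g; the covariate values and coefficients are invented, and the function name is hypothetical.

```python
import math

def interaction_probs(beta, covariates):
    """Conditional-logit probabilities over plants for one pollinator.

    covariates[j] is the covariate vector for plant j.
    """
    scores = [sum(b * x for b, x in zip(beta, x_j)) for x_j in covariates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

beta = [2.0, 1.5, 0.5]  # made-up coefficients
x = [
    [1, 0, 0.2],        # plant 0: two binary trait-matching covariates, one abundance
    [0, 1, 0.5],        # plant 1
    [0, 0, 0.3],        # plant 2
]

probs = interaction_probs(beta, x)
log_odds = math.log(probs[0] / probs[1])
linear = sum(b * (a - c) for b, a, c in zip(beta, x[0], x[1]))
print(f"log(p_0 / p_1) = {log_odds:.4f}, beta'(x_0 - x_1) = {linear:.4f}")
```

The two printed quantities agree, which is the sense in which each coefficient measures how a change in its covariate shifts the log ratio of choosing one plant species over another.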
The results presented in this thesis suggest that DM regression is a promising
robust statistical approach to evaluate the processes driving the structural patterns
in plant-pollinator mutualistic networks. Additionally, the model can be extended
to incorporate additional types of covariates, such as time and space, or can be
used to learn a set of linkage rules. Furthermore, no other simulation studies
have been done, in econometrics or ecology, to evaluate the effects of misspecifying the
dispersion structure.
Although the proposed DM model is a step forward for the
study of pollination networks, the model has yet to be tested on real-world data
sets. In order to conduct a general evaluation of these processes on network
structure, detailed information needs to be measured at the time of data collection.
Hopefully, this research will motivate increased sampling efforts that facilitate the
collection of detailed and representative samples of observed networks over a longer
span of time.
7.1 Future Work
In light of the results presented in this thesis, additional work devoted to
further developing and applying the DM regression in the context of pollination
networks can be groundbreaking for the research of mutualisms. Some ideas and
considerations for future work include:
1. Currently, the DM model does not account for structural zero counts in
the interaction matrix. The presence of zero counts can be attributed to
sampling effects or other informative or relevant driving forces. Therefore,
additional extensions to the DM regression model, such as zero-inflated ad-
justments, can be made to account for these structural zeroes.
2. Studies have confirmed that temporal and spatial variability impose con-
straints on potential interactions which in turn influence network pattern
and interaction probabilities (Vazquez et al., 2009a; Jordano et al., 2006).
Species that do not overlap in space or time will not and cannot interact.
Hence, a true evaluation of the mechanisms driving network structure re-
quires an exploration of the spatio-temporal distribution of the species in
these networks.
3. Recent studies exploring the theory of linkage rules suggest that such ecolog-
ical processes drive network structure and encourage the search for linkage
rules in the field (Santamaría and Rodríguez-Gironés, 2007). As such, DM
regression can be used to test or learn a set of linkage rules.
4. The nestedness typically seen in pollination networks seems to increase as the
size of the network increases. Nested networks tend to be more robust, i.e.
resistant to species loss, making them less vulnerable to species extinction
(Bascompte and Jordano, 2007). An assessment of the robustness of these
networks in the presence of invading species can provide further insight into
the study of mutualisms and co-evolution (Díaz-Castelazo et al., 2010).
Bibliography
Allesina, S., Alonso, D., Pascual, M., 2008. A general model for food web structure.
Science 320, 658–661.
Bascompte, J., Jordano, P., 2007. Plant-animal mutualistic networks: the archi-
tecture of biodiversity. Annual Review of Ecology, Evolution and Systematics
38 (1), 567–593.
Bascompte, J., Jordano, P., Bluthgen, N., 2006. Asymmetric coevolutionary net-
works facilitate biodiversity maintenance. Science 312, 431–433.
Bascompte, J., Jordano, P., Melian, C. J., Olesen, J. M., 2003. The nested assem-
bly of plant-animal mutualistic networks. Proceedings of the National Academy
of Sciences of the United States of America 100 (16), 9383–9387.
Devroye, L., 1986. Non-uniform random variate generation. Springer-Verlag, New
York.
Díaz-Castelazo, C., Guimaraes, J., Jordano, P., Thompson, J. N., Marquis, R. J.,
Rico-Gray, V., 2010. Changes of a mutualistic network over time: reanalysis
over a 10-year period. Ecology 91 (3), 793–801.
Faraway, J. J., 2006. Extending the Linear Model with R: Generalized Linear,
Mixed Effects and Nonparametric Regression Models. Chapman & Hall/CRC,
Boca Raton, FL.
Guimaraes, P., 2005. A simple approach to fit the beta-binomial model. Stata
Journal 5 (3), 385–394.
Guimaraes, P., Galdini Raimundo, R. L., Cagnolo, L., 2011. Interaction Web
Database. National Center for Ecological Analysis and Synthesis, University of
California, Santa Barbara, USA.
URL http://www.nceas.ucsb.edu/interactionweb/index.html
Guimaraes, P., Lindrooth, R. C., 2007. Controlling for overdispersion in grouped
conditional logit models: A computationally simple application of dirichlet-
multinomial regression. The Econometrics Journal 10 (2), 439–452.
Hausman, J. A., Hall, B. H., Griliches, Z., 1984. Econometric models for count
data with an application to the patents-R&D relationship. Econometrica 52,
909–938.
Johnson, N. L., Kotz, S., Balakrishnan, N., 1997. Discrete Multivariate Distribu-
tions. John Wiley & Sons, Inc., New York.
Jordano, P., 1987. Patterns of mutualistic interactions in pollination and seed dis-
persal: Connectance, dependence asymmetries, and coevolution. The American
Naturalist 129 (5), 657–677.
Jordano, P., Bascompte, J., Olesen, J. M., 2003. Invariant properties in coevolu-
tionary networks of plant animal interactions. Ecology Letters 6, 69–81.
Jordano, P., Bascompte, J., Olesen, J. M., 2006. The ecological consequences
of complex topology and nested structure in pollination webs. University Of
Chicago Press, Chicago, IL.
Kearns, C. A., Inouye, D. W., Waser, N. M., 1998. Endangered mutualisms: The
conservation of plant-pollinator interactions. Annual Review of Ecology and
Systematics 29 (1), 83–112.
Maddala, G., 1983. Limited-dependent and qualitative variables in econometrics.
Cambridge University Press, New York.
McCulloch, C. E., Searle, S. R., 2005. Generalized, Linear, and Mixed Models.
John Wiley & Sons, Inc., New York.
McFadden, D., 1974. Conditional logit analysis of qualitative choice behavior.
Vol. 1. Academic Press, New York, Ch. 4, pp. 105–142.
Mosimann, J. E., 1962. On the compound multinomial distribution, the multivariate
beta-distribution, and correlations among proportions. Biometrika 49,
65–82.
Olesen, J. M., Bascompte, J., Dupont, Y. L., Jordano, P., 2007. The modularity
of pollination networks. Proceedings of the National Academy of Sciences of the
United States of America 104 (50), 19891–19896.
R Development Core Team, 2011. R: A Language and Environment for Statistical
Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN
3-900051-07-0.
URL http://www.R-project.org
Ravasz, M., Balog, A., Marko, V., Neda, Z., 2005. The species abundances distribution
in a new perspective. arXiv preprint q-bio/0502029, 8.
Robert, C., Casella, G., 2010. Introducing Monte Carlo Methods with R. Springer,
New York, NY.
Santamaría, L., Rodríguez-Gironés, M. A., 2007. Linkage rules for plant-pollinator
networks: Trait complementarity or exploitation barriers? PLoS Biol 5 (2), e31.
Shonkwiler, J. S., Hanley, N., 2003. A new approach to random utility model-
ing using the dirichlet multinomial distribution. Environmental and Resource
Economics 26 (3), 401–416.
Stang, M., Klinkhamer, P. G. L., Waser, N. M., Stang, I., van der Meijden, E.,
2009. Size-specific interaction patterns and size matching in a plant-pollinator
interaction web. Annals of Botany 103 (9), 1459–1469.
StataCorp, 2011. Stata Statistical Software: Release 11. StataCorp LP, College
Station, TX.
Thompson, J., 2005. The geographic mosaic of coevolution. University of Chicago
Press.
Vazquez, D. P., 2005. Degree distribution in plant-animal mutualistic networks:
forbidden links or random interactions? Oikos 108, 421–426.
Vazquez, D. P., Bluthgen, N., Cagnolo, L., Chacoff, N. P., 2009a. Uniting pattern
and process in plant-animal mutualistic networks: a review. Annals of Botany
103 (9), 1445–1457.
Vazquez, D. P., Chacoff, N. P., Cagnolo, L., 2009b. Evaluating multiple deter-
minants of the structure of plant-animal mutualistic networks. Ecology 90 (8),
2039–2046.
Vazquez, D. P., Morris, W. F., Jordano, P., 2005. Interaction frequency as a
surrogate for the total effect of animal mutualists on plants. Ecology Letters
8 (10), 1088–1094.
Appendix A
Derivation of Conditional Logit Model
Maddala (1983) provides a derivation of McFadden’s conditional logit model.
The derivation is as follows:
Suppose an individual faces $J$ choices and define $Y_j^*$ as the level of indirect utility
associated with the $j$th choice. The observed variables $Y_j$ are defined as:
$$Y_j = \begin{cases} 1, & \text{if } Y_j^* = \max(Y_1^*, Y_2^*, \ldots, Y_J^*), \\ 0, & \text{otherwise.} \end{cases}$$
Then,
$$Y_j^* = V_j(X_j) + \varepsilon_j, \tag{A.1}$$
where $X_j$ is a vector of attributes for the $j$th choice and $\varepsilon_j$ is a random error term
that captures unobserved variability. Assume that the $\varepsilon_j$ are independently and
identically distributed (i.i.d.) according to a type I extreme value distribution, whose probability
density function (PDF) and cumulative distribution function (CDF) are:
$$f(\varepsilon_j) = \exp\left(-\varepsilon_j - e^{-\varepsilon_j}\right) \tag{A.2}$$
and
$$F(\varepsilon) = P(\varepsilon_j < \varepsilon) = \exp\left(-e^{-\varepsilon}\right), \tag{A.3}$$
respectively. Then it can be shown that:
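$$P(Y_j = 1 \mid X) = \frac{e^{V_j}}{\sum_{k=1}^{J} e^{V_k}}, \tag{A.4}$$
where $V_j = \beta' X_j$. The condition $Y_j^* = \max(Y_1^*, Y_2^*, \ldots, Y_J^*)$ implies:
$$\begin{aligned} \varepsilon_j + V_j &> \varepsilon_k + V_k, && \text{for all } k \neq j, \\ \varepsilon_k &< \varepsilon_j + V_j - V_k, && \text{for all } k \neq j. \end{aligned} \tag{A.5}$$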
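Hence, if $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_J$ are i.i.d. with CDF given by (A.3), then
$$\begin{aligned} P(Y_j = 1 \mid X) &= P(\varepsilon_k < \varepsilon_j + V_j - V_k \ \text{for all } k \neq j) \\ &= \int_{-\infty}^{\infty} \prod_{k \neq j} F(\varepsilon_j + V_j - V_k)\, f(\varepsilon_j)\, d\varepsilon_j, \end{aligned} \tag{A.6}$$
where $f(\cdot)$ and $F(\cdot)$ are given by (A.2) and (A.3), respectively. Now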
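$$\begin{aligned} \prod_{k \neq j} F(\varepsilon_j + V_j - V_k)\, f(\varepsilon_j) &= \prod_{k \neq j} \exp\left(-e^{-\varepsilon_j - V_j + V_k}\right) \exp\left(-\varepsilon_j - e^{-\varepsilon_j}\right) \\ &= \exp\left[-\varepsilon_j - e^{-\varepsilon_j}\left(1 + \sum_{k \neq j} \frac{e^{V_k}}{e^{V_j}}\right)\right]. \end{aligned} \tag{A.7}$$
If we let
$$\lambda_j = \log\left(1 + \sum_{k \neq j} \frac{e^{V_k}}{e^{V_j}}\right) = \log\left(\sum_{k=1}^{J} \frac{e^{V_k}}{e^{V_j}}\right), \tag{A.8}$$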
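then we can rewrite (A.6) as:
$$\begin{aligned} \int_{-\infty}^{\infty} \exp\left(-\varepsilon_j - e^{-(\varepsilon_j - \lambda_j)}\right) d\varepsilon_j &= \exp(-\lambda_j) \int_{-\infty}^{\infty} \exp\left(-\varepsilon_j^* - e^{-\varepsilon_j^*}\right) d\varepsilon_j^* \\ &= \exp(-\lambda_j) \\ &= \frac{e^{V_j}}{\sum_{k=1}^{J} e^{V_k}}, \end{aligned}$$
where $\varepsilon_j^* = \varepsilon_j - \lambda_j$.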
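The closed form in (A.4) can be checked numerically against the integral in (A.6). The sketch below is only a sanity check under stated assumptions: the utilities V are made-up values, the function names are hypothetical, and the integral is approximated with a simple trapezoidal rule on a finite interval chosen wide enough that the neglected tails are negligible for these values.

```python
import math

def gumbel_pdf(e):
    """Type I extreme value density, as in (A.2)."""
    return math.exp(-e - math.exp(-e))

def gumbel_cdf(e):
    """Type I extreme value distribution function, as in (A.3)."""
    return math.exp(-math.exp(-e))

def choice_prob_by_integration(v, j, lower=-30.0, upper=50.0, steps=20000):
    """Trapezoidal approximation of the integral in (A.6) for choice j."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps + 1):
        e = lower + i * h
        value = gumbel_pdf(e)
        for k, v_k in enumerate(v):
            if k != j:
                value *= gumbel_cdf(e + v[j] - v_k)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * value
    return total * h

def choice_prob_closed_form(v, j):
    """Logit probability, as in (A.4)."""
    return math.exp(v[j]) / sum(math.exp(v_k) for v_k in v)

v = [1.0, 0.5, -0.3]  # made-up systematic utilities for three choices
for j in range(len(v)):
    numeric = choice_prob_by_integration(v, j)
    exact = choice_prob_closed_form(v, j)
    print(f"choice {j}: integral = {numeric:.6f}, closed form = {exact:.6f}")
```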
If we have a set of $N$ individuals facing $J$ choices, we can define for $i = 1, \ldots, N$
and $j = 1, \ldots, J$:
$Y_{ij}^* = V_{ij}$, the level of indirect utility for the $i$th individual making the $j$th choice.
$Y_{ij} = 1$, if the $i$th individual makes the $j$th choice.
$Y_{ij} = 0$, otherwise.
Assume that $V_{ij} = \beta' X_{ij} + \alpha_j' Z_i + \varepsilon_{ij}$, where $Z_i$ are individual specific variables
and $X_{ij}$ is the vector of values of attributes of the $j$th choice as perceived by the
$i$th individual, and $\beta$ and $\alpha$ are unknown parameters to be estimated.
Then the probability that the $i$th individual selects the $j$th choice is:
$$P_{ij} = P(Y_{ij} = 1) = \frac{e^{\beta' X_{ij} + \alpha_j' Z_i}}{\sum_{k=1}^{J} e^{\beta' X_{ik} + \alpha_k' Z_i}}, \tag{A.9}$$
which is the logit formulation used to model multinomial probabilities.
Appendix B
Supplementary Tables for Simulation Study A
Table B.1: Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL   DMd   DMdf  DMr   DMrf
δg = 0             100   82    46    55    5
δg = 0.9           100   100   100   100   59
δg = −1 + 2.1xg    100   95    97    100   91
ρg = 0.11          100   100   100   100   100
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.2: Percentage of samples that reached convergence for β = (2.78, 2.35, 1.88) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL   DMd   DMdf  DMr   DMrf
δg = 0             100   80    49    60    3
δg = 0.9           100   100   100   100   71
δg = −1 + 2.1xg    100   99    98    100   100
ρg = 0.11          100   100   100   100   100
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.3: Percentage of samples that reached convergence for β = (3.21, 3.85, 3.50) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL   DMd   DMdf  DMr   DMrf
δg = 0             100   79    47    52    1
δg = 2.5           100   97    98    100   98
δg = 1.5 − 0.5xg   100   100   98    100   17
ρg = 0.05          100   100   100   100   100
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.4: Percentage of samples that reached convergence for β = (3.21, 3.85, 3.50) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL   DMd   DMdf  DMr   DMrf
δg = 0             100   78    51    54    2
δg = 2.5           100   100   100   100   94
δg = 1.5 − 0.5xg   100   99    98    100   20
ρg = 0.05          100   100   100   100   100
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.5: Percentage of samples that reached convergence for β = (3.21, 3.85, 3.50) and network size 57 x 23.
                   Modeled Dispersion*
True Dispersion    GCL   DMd   DMdf  DMr   DMrf
δg = 0             100   75    53    57    1
δg = 2.5           100   100   100   100   41
δg = 1.5 − 0.5xg   100   100   100   100   6
ρg = 0.05          100   100   100   100   100
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.6: Percentage of samples that reached convergence for β = (3.92, 4.54, 3.52) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL   DMd   DMdf  DMr   DMrf
δg = 0             100   75    48    52    3
δg = 0.6           100   100   100   100   48
δg = 0.2 + 2.4xg   100   100   100   100   44
ρg = 0.08          100   100   100   100   100
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.7: Percentage of samples that reached convergence for β = (3.92, 4.54, 3.52) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL   DMd   DMdf  DMr   DMrf
δg = 0             100   72    48    52    3
δg = 0.6           100   100   100   100   52
δg = 0.2 + 2.4xg   100   100   97    100   51
ρg = 0.08          100   100   100   100   100
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.8: Percentage of samples that reached convergence for β = (3.92, 4.54, 3.52) and network size 57 x 23.
                   Modeled Dispersion*
True Dispersion    GCL   DMd   DMdf  DMr   DMrf
δg = 0             100   70    35    50    3
δg = 0.6           100   100   98    100   51
δg = 0.2 + 2.4xg   100   100   100   100   56
ρg = 0.08          100   100   100   100   100
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.9: Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL       DMd       DMdf      DMr       DMrf
δg = 0             220460    4246      NA        4257      NA
δg = 0.9           220417    4657      4656      4682      4683
δg = −1 + 2.1xg    220357    4966      4966      5035      5033
ρg = 0.11          219528    4247      4246      4166      4165
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.10: Median negative log-likelihood values for samples generated with β = (2.78, 2.35, 1.88) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL       DMd       DMdf      DMr       DMrf
δg = 0             851438    12756     NA        12776     NA
δg = 0.9           851557    13975     13974     14073     14079
δg = −1 + 2.1xg    851549    14939     14939     15214     15207
ρg = 0.11          852426    11246     11241     11012     11012
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.11: Median negative log-likelihood values for samples generated with β = (3.21, 3.85, 3.50) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL       DMd       DMdf      DMr       DMrf
δg = 0             1656747   5219      NA        5227      NA
δg = 2.5           1656799   5988      5988      6177      6177
δg = 1.5 − 0.5xg   1656574   5356      5356      5363      NA
ρg = 0.05          1650244   6314      6312      6054      6053
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.12: Median negative log-likelihood values for samples generated with β = (3.21, 3.85, 3.50) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL       DMd       DMdf      DMr       DMrf
δg = 0             4665737   14773     14789     14800     NA
δg = 2.5           4665713   16987     16986     17325     17324
δg = 1.5 − 0.5xg   4665837   15181     15182     15197     NA
ρg = 0.05          4660583   14779     14778     14409     14408
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.13: Median negative log-likelihood values for samples generated with β = (3.21, 3.85, 3.50) and network size 57 x 23.
                   Modeled Dispersion*
True Dispersion    GCL       DMd       DMdf      DMr       DMrf
δg = 0             8067871   24118     24127     24132     NA
δg = 2.5           8067725   27681     27680     28122     NA
δg = 1.5 − 0.5xg   8067731   24775     24775     24796     NA
ρg = 0.05          8046445   22028     22026     21591     21590
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.14: Median negative log-likelihood values for samples generated with β = (3.92, 4.54, 3.52) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL       DMd       DMdf      DMr       DMrf
δg = 0             6736102   6069      NA        6081      NA
δg = 0.6           6736328   6417      6416      6432      NA
δg = 0.2 + 2.4xg   6735725   6463      6462      6484      NA
ρg = 0.08          6716123   5857      5856      5736      5736
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.15: Median negative log-likelihood values for samples generated with β = (3.92, 4.54, 3.52) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL       DMd       DMdf      DMr       DMrf
δg = 0             18572360  16434     NA        16446     NA
δg = 0.6           18571834  17376     17375     17437     17436
δg = 0.2 + 2.4xg   18571976  17501     17490     17580     17585
ρg = 0.08          18559325  14046     14042     13790     13789
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.16: Median negative log-likelihood values for samples generated with β = (3.92, 4.54, 3.52) and network size 57 x 23.
                   Modeled Dispersion*
True Dispersion    GCL       DMd       DMdf      DMr       DMrf
δg = 0             24047668  25524     NA        25546     NA
δg = 0.6           24047495  27036     27039     27200     27199
δg = 0.2 + 2.4xg   24047736  27251     27250     27457     27457
ρg = 0.08          23993702  20352     20351     19758     19758
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.17: Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (0.01, 0.04, 0.07)         (11.28, 51.48, 21.77)      NA                         (< 0.01, 0.03, 0.16)       NA
δg = 0.9           (0.02, 0.01, 0.03)         (0.02, 0.01, 0.04)         (0.02, 0.01, 0.04)         (0.1, 0.02, 0.05)          (0.07, 0, 0.1)
δg = −1 + 2.1xg    (0.02, 0, 0.23)            (21, 168.39, 45.08)        (16.92, 13.33, 20.96)      (0.4, 0.13, 0.59)          (0.44, 0.12, 0.6)
ρg = 0.11          (0.99, 0.4, 0.4)           (21.74, 2.71, 10.43)       (21.66, 2.65, 10.49)       (0.27, 0.05, 0.54)         (0.3, 0.08, 0.49)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.18: Percent relative bias of β for β = (2.78, 2.35, 1.88) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (< 0.01, 0.01, 0.26)       (5.32, 152.08, 91.4)       NA                         (0.01, < 0.01, 0.34)       NA
δg = 0.9           (0.03, < 0.01, 0.01)       (0.03, 0, 0.02)            (0.03, < 0.01, 0.02)       (0.21, 0.01, 0.02)         (0.22, 0, 0.02)
δg = −1 + 2.1xg    (0.03, 0.05, 0.33)         (2.88, 63.56, 2.84)        (8.37, 6.82, 17.3)         (0.6, 0.16, 0.19)          (0.59, 0.16, 0.17)
ρg = 0.11          (0.06, 0.28, 0.5)          (34.14, 6.68, 27.87)       (33.93, 6.44, 27.11)       (0.13, 0.18, 0.72)         (0.13, 0.19, 0.68)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.19: Percent relative bias of β for β = (3.21, 3.85, 3.50) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (0.01, 0.01, 0.02)         (1.61, 2.69, 4.33)         NA                         (0.02, < 0.01, < 0.01)     NA
δg = 2.5           (0.02, 0.01, 0.03)         (31.26, 39.71, 29.27)      (10.76, 13.09, 19.11)      (< 0.01, 0.15, 0.15)       (0, 0.15, 0.15)
δg = 1.5 − 0.5xg   (< 0.01, 0.01, 0.01)       (2.34, 3.95, 7.76)         (2.28, 3.75, 6.65)         (< 0.01, 0.01, 0.01)       NA
ρg = 0.05          (0.42, 0.77, 0.74)         (27.32, 24.06, 25.52)      (27.26, 24.14, 25.38)      (0.1, 0.08, 0.56)          (0.12, 0.11, 0.6)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.20: Monte Carlo bias of β for β = (3.21, 3.85, 3.50) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (0.01, < 0.01, 0.06)       (39.7, 0.13, 35.74)        (31.25, 6.32, 1)           (< 0.01, < 0.01, 0.05)     NA
δg = 2.5           (< 0.01, < 0.01, 0.01)     (0, 0, 0.01)               (< 0.01, < 0.01, 0.01)     (0.15, 0.07, 0.05)         (0.15, 0.08, 0.06)
δg = 1.5 − 0.5xg   (< 0.01, 0.01, 0.01)       (10.4, 1.44, 14.06)        (12.4, 1.55, 15.86)        (< 0.01, 0.01, 0.01)       NA
ρg = 0.05          (0.23, 0.45, 1.34)         (30.4, 4.46, 15.79)        (30.28, 4.42, 15.68)       (0.02, 0.02, 0.23)         (0.01, 0.02, 0.23)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.21: Percent relative bias of β for β = (3.21, 3.85, 3.50) and network size 57 x 23.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (< 0.01, < 0.01, < 0.01)   (< 0.01, < 0.01, < 0.01)   (14.14, 9.35, 43.42)       (< 0.01, 0.01, < 0.01)     NA
δg = 2.5           (0.01, < 0.01, 0.01)       (0.01, < 0.01, 0.01)       (0.01, < 0.01, 0.01)       (0.17, < 0.01, 0.13)       NA
δg = 1.5 − 0.5xg   (0.01, < 0.01, 0.02)       (0.01, < 0.01, 0.02)       (0.01, < 0.01, 0.02)       (< 0.01, < 0.01, 0.02)     NA
ρg = 0.05          (0.11, 0.1, 0.49)          (27.39, 7.89, 14.39)       (27.43, 7.94, 14.38)       (0.1, 0.07, 0.12)          (0.11, 0.08, 0.12)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.22: Percent relative bias of β for β = (3.92, 4.54, 3.52) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (0.01, 0.01, < 0.01)       (2.24, 9.32, 27.94)        NA                         (0.02, 0.01, 0.01)         NA
δg = 0.6           (< 0.01, < 0.01, 0.02)     (0.01, 0.06, 0.25)         (< 0.01, < 0.01, 0.02)     (0.01, < 0.01, 0.02)       NA
δg = 0.2 + 2.4xg   (0.01, 0.01, 0.01)         (0.01, 0.01, 0.01)         (0.01, 0.01, 0.01)         (0.02, 0.01, 0.01)         NA
ρg = 0.08          (1.42, 1.79, 0.15)         (23.69, 6.64, 2.51)        (23.56, 6.58, 2.59)        (0.27, 0.13, 0.13)         (0.29, 0.16, 0.09)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.23: Percent relative bias of β for β = (3.92, 4.54, 3.52) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (< 0.01, < 0.01, < 0.01)   (0.39, 0.41, 0.24)         NA                         (< 0.01, < 0.01, < 0.01)   NA
δg = 0.6           (< 0.01, 0.01, 0.01)       (< 0.01, 0.01, 0.01)       (0.12, 0.14, 0.07)         (< 0.01, 0.01, 0.01)       (0.01, 0.01, < 0.01)
δg = 0.2 + 2.4xg   (< 0.01, < 0.01, < 0.01)   (< 0.01, < 0.01, < 0.01)   (33.64, 34.86, 15.84)      (0.01, 0.01, 0.01)         (0.01, 0.01, 0.01)
ρg = 0.08          (0.51, 1.09, 0.13)         (26.12, 7.34, 5.4)         (25.95, 7.34, 5.66)        (0.05, 0.07, 0.74)         (0.06, 0.08, 0.75)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.24: Monte Carlo bias of β for β = (3.92, 4.54, 3.52) and network size 57 x 23.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (< 0.01, < 0.01, < 0.01)   (5.72, 0.59, 14.97)        NA                         (< 0.01, < 0.01, 0.03)     NA
δg = 0.6           (< 0.01, 0.01, 0.03)       (6.6, 0.34, 15.03)         (1.57, 0.23, 3.55)         (0.01, 0.01, 0.02)         (0.01, < 0.01, 0.06)
δg = 0.2 + 2.4xg   (< 0.01, < 0.01, 0.02)     (0.4, 0.03, 0.81)          (< 0.01, < 0.01, 0.02)     (0.01, 0.01, 0.02)         (0.01, < 0.01, 0.05)
ρg = 0.08          (0.34, 1.37, 3.66)         (37.19, 8.37, 54.51)       (37.05, 8.34, 54.26)       (0.04, 0.13, 1.1)          (0.05, 0.14, 1.12)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.25: Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 90 x 54.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (0.56, 0.52, 2.51)         (92.47, 432.27, 197.98)    NA                         (0.57, 0.52, 2.54)         NA
δg = 0.9           (0.79, 0.71, 3.47)         (0.78, 0.71, 3.46)         (0.79, 0.71, 3.45)         (0.8, 0.73, 3.56)          (0.81, 0.75, 3.55)
δg = −1 + 2.1xg    (1.04, 0.97, 4.58)         (222.95, 1252.44, 460.33)  (320.46, 499.41, 440.13)   (1.07, 1.01, 4.87)         (1.07, 1, 4.82)
ρg = 0.11          (6.51, 7.5, 40.11)         (2.92, 3.74, 25.84)        (2.93, 3.74, 25.86)        (3.55, 3.71, 24.48)        (3.56, 3.72, 24.53)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.26: Percent coefficient of variation of β for β = (2.78, 2.35, 1.88) and network size 105 x 76.
                   Modeled Dispersion*
True Dispersion    GCL                        DMd                        DMdf                       DMr                        DMrf
δg = 0             (0.36, 0.3, 2.54)          (25.92, 775.5, 473.63)     NA                         (0.36, 0.31, 2.63)         NA
δg = 0.9           (0.48, 0.44, 3.5)          (0.47, 0.44, 3.49)         (0.47, 0.44, 3.49)         (0.49, 0.45, 3.55)         (0.49, 0.46, 3.48)
δg = −1 + 2.1xg    (0.65, 0.56, 4.97)         (30.37, 663.98, 49.26)     (227.39, 186.44, 477.7)    (0.69, 0.6, 5.54)          (0.69, 0.59, 5.51)
ρg = 0.11          (5.72, 6.25, 56.6)         (1.9, 2.55, 25.96)         (1.9, 2.56, 25.96)         (2.46, 2.62, 27.15)        (2.46, 2.62, 27.12)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).

Table B.27: Percent coefficient of variation of β for β = (3.21, 3.85, 3.50) and network size 90 x 54.
True Dispersion     Modeled Dispersion*
                    GCL                         DMd                         DMdf                        DMr                         DMrf
δg = 0              (0.22, 0.27, 0.39)          (14.12, 23.87, 38.91)       NA                          (0.22, 0.28, 0.38)          NA
δg = 2.5            (0.38, 0.52, 0.74)          (285.32, 352.44, 827.4)     (141.25, 169.35, 327.65)    (0.44, 0.59, 0.83)          (0.44, 0.59, 0.83)
δg = 1.5 − 0.5xg    (0.23, 0.32, 0.46)          (23.01, 39, 75.91)          (18.87, 31.4, 55.9)         (0.23, 0.32, 0.47)          NA
ρg = 0.05           (5.19, 5.92, 13.42)         (2.18, 2.15, 9.55)          (2.19, 2.15, 9.56)          (2.48, 2.35, 9.04)          (2.48, 2.35, 9.04)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.28: Percent coefficient of variation of β for β = (3.21, 3.85, 3.50) and network size 105 x 76.
True Dispersion     Modeled Dispersion*
                    GCL                         DMd                         DMdf                        DMr                         DMrf
δg = 0              (0.14, 0.15, 0.66)          (212.39, 14.91, 258.43)     (564.98, 59.35, 231.32)     (0.14, 0.15, 0.67)          NA
δg = 2.5            (0.27, 0.27, 1.24)          (0.26, 0.27, 1.24)          (0.27, 0.27, 1.24)          (0.29, 0.3, 1.35)           (0.28, 0.3, 1.35)
δg = 1.5 − 0.5xg    (0.15, 0.17, 0.72)          (124.34, 12.25, 152.29)     (156.3, 17.23, 212.97)      (0.15, 0.17, 0.71)          NA
ρg = 0.05           (3.84, 5.16, 23.92)         (1.44, 1.87, 15.13)         (1.44, 1.87, 15.14)         (1.82, 1.85, 14.47)         (1.83, 1.85, 14.46)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.29: Percent coefficient of variation of β for β = (3.21, 3.85, 3.50) and network size 57 x 23.
True Dispersion     Modeled Dispersion*
                    GCL                         DMd                         DMdf                        DMr                         DMrf
δg = 0              (0.11, 0.11, 0.46)          (0.11, 0.11, 0.46)          (204.71, 159.09, 846.52)    (0.1, 0.12, 0.47)           NA
δg = 2.5            (0.2, 0.22, 0.87)           (0.2, 0.22, 0.87)           (0.2, 0.22, 0.87)           (0.21, 0.23, 0.96)          NA
δg = 1.5 − 0.5xg    (0.12, 0.13, 0.56)          (0.12, 0.13, 0.56)          (0.12, 0.13, 0.56)          (0.12, 0.13, 0.56)          NA
ρg = 0.05           (3.3, 4.57, 21.26)          (1.29, 1.61, 14.25)         (1.29, 1.61, 14.27)         (1.48, 1.65, 13.07)         (1.49, 1.66, 13.07)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.30: Percent coefficient of variation of β for β = (3.92, 4.54, 3.52) and network size 90 x 54.
True Dispersion     Modeled Dispersion*
                    GCL                         DMd                         DMdf                        DMr                         DMrf
δg = 0              (0.15, 0.12, 0.2)           (14.39, 58.01, 171.78)      NA                          (0.15, 0.12, 0.19)          NA
δg = 0.6            (0.19, 0.16, 0.25)          (0.27, 1.22, 4.75)          (0.19, 0.16, 0.26)          (0.19, 0.16, 0.26)          NA
δg = 0.2 + 2.4xg    (0.2, 0.17, 0.26)           (0.2, 0.16, 0.26)           (0.2, 0.16, 0.26)           (0.2, 0.17, 0.26)           NA
ρg = 0.08           (8.16, 8.34, 14.25)         (2.23, 2.68, 11.22)         (2.23, 2.68, 11.27)         (2.98, 2.78, 10.19)         (2.98, 2.78, 10.23)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.31: Percent coefficient of variation of β for β = (3.92, 4.54, 3.52) and network size 105 x 76.
True Dispersion     Modeled Dispersion*
                    GCL                         DMd                         DMdf                        DMr                         DMrf
δg = 0              (0.07, 0.09, 0.35)          (9.16, 9.7, 5.89)           NA                          (0.08, 0.09, 0.35)          NA
δg = 0.6            (0.1, 0.12, 0.42)           (0.1, 0.12, 0.42)           (3.34, 3.73, 1.74)          (0.1, 0.12, 0.42)           (0.09, 0.12, 0.42)
δg = 0.2 + 2.4xg    (0.1, 0.12, 0.44)           (0.1, 0.12, 0.44)           (147.18, 151.74, 70.8)      (0.11, 0.12, 0.44)          (0.1, 0.12, 0.44)
ρg = 0.08           (4.93, 7.1, 30.28)          (1.47, 2.37, 19.13)         (1.49, 2.37, 19.16)         (1.82, 2.43, 17.55)         (1.82, 2.42, 17.54)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.32: Percent coefficient of variation of β for β = (3.92, 4.54, 3.52) and network size 57 x 23.
True Dispersion     Modeled Dispersion*
                    GCL                         DMd                         DMdf                        DMr                         DMrf
δg = 0              (0.06, 0.07, 0.5)           (35.39, 3.71, 91.05)        NA                          (0.06, 0.07, 0.48)          NA
δg = 0.6            (0.07, 0.1, 0.64)           (52.01, 2.89, 115.83)       (13.1, 2.04, 30.48)         (0.07, 0.1, 0.65)           (0.07, 0.1, 0.64)
δg = 0.2 + 2.4xg    (0.08, 0.11, 0.65)          (6.42, 0.61, 13.1)          (0.08, 0.11, 0.65)          (0.08, 0.11, 0.67)          (0.07, 0.11, 0.66)
ρg = 0.08           (3.82, 7.47, 52.91)         (1.09, 2.07, 21.34)         (1.09, 2.07, 21.32)         (1.37, 2.17, 22.78)         (1.37, 2.17, 22.78)
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.33: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.35, 1.88) and network size 90 x 54.
True Dispersion     Modeled Dispersion
                    % Relative Bias         % Coefficient of Variation
δg = 0.9            0.56                    9.67
δg = −1 + 2.1xg     (200.21, 115.76)        (1999.79, 1285.62)
ρg = 0.11           2.73                    4.55
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
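To make the percent relative bias column easier to interpret, here is a minimal Python sketch of one common way to compute it from Monte Carlo replicates. It assumes the definition 100 × |mean estimate − true value| / |true value|, with the absolute value taken because the tabulated entries are all non-negative; the thesis's own definition in the main text takes precedence. It also assumes that paired entries for a dispersion model of the form δg = f(xg) correspond to its two coefficients (for example, intercept and slope). The function name and the replicate values are hypothetical.

```python
import numpy as np


def percent_relative_bias(estimates, true_value):
    """Percent relative bias per parameter.

    estimates: array-like of shape (n_replicates, n_parameters).
    true_value: array-like of shape (n_parameters,), all nonzero.
    Assumed definition: 100 * |mean(estimates) - true| / |true|.
    """
    estimates = np.asarray(estimates, dtype=float)
    true_value = np.asarray(true_value, dtype=float)
    bias = estimates.mean(axis=0) - true_value
    return 100.0 * np.abs(bias) / np.abs(true_value)


# Hypothetical replicates for a two-coefficient dispersion model with
# true coefficients (-1, 2.1); these are NOT the thesis's estimates.
rng = np.random.default_rng(1)
true_delta = np.array([-1.0, 2.1])
replicates = true_delta + rng.normal(loc=0.05, scale=0.3, size=(500, 2))

rel_bias = percent_relative_bias(replicates, true_delta)
print(np.round(rel_bias, 2))  # one percent relative bias per coefficient
```

This form is undefined when the true value is zero, which would be consistent with the δg = 0 scenario having no row in the dispersion-parameter tables.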
Table B.34: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (2.78, 2.35, 1.88) and network size 105 x 76.
True Dispersion     Modeled Dispersion
                    % Relative Bias         % Coefficient of Variation
δg = 0.9            1.56                    5.78
δg = −1 + 2.1xg     (13.65, 9.05)           (359.38, 253)
ρg = 0.11           2.73                    2.73
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.35: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.21, 3.85, 3.50) and network size 90 x 54.
True Dispersion     Modeled Dispersion
                    % Relative Bias         % Coefficient of Variation
δg = 2.5            100.00                  9.24E+37
δg = 1.5 − 0.5xg    (123.07, 4390.00)       (1026.20, 41533.20)
ρg = 0.05           < 0.01                  4.00
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.36: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.21, 3.85, 3.50) and network size 105 x 76.
True Dispersion     Modeled Dispersion
                    % Relative Bias         % Coefficient of Variation
δg = 2.5            0.84                    0.52
δg = 1.5 − 0.5xg    (242.00, 2421.80)       (2431.67, 26050.80)
ρg = 0.05           < 0.01                  2.00
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.37: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.21, 3.85, 3.50) and network size 57 x 23.
True Dispersion     Modeled Dispersion
                    % Relative Bias         % Coefficient of Variation
δg = 2.5            0.60                    0.40
δg = 1.5 − 0.5xg    (0.40, 112.40)          (6.47, 831.80)
ρg = 0.05           < 0.01                  2.00
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.38: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.92, 4.54, 3.52) and network size 90 x 54.
True Dispersion     Modeled Dispersion
                    % Relative Bias         % Coefficient of Variation
δg = 0.6            100.00                  1.22E+41
δg = 0.2 + 2.4xg    (3.00, 18.75)           (47.00, 147.17)
ρg = 0.08           5.00                    5.00
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.39: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.92, 4.54, 3.52) and network size 105 x 76.
True Dispersion     Modeled Dispersion
                    % Relative Bias         % Coefficient of Variation
δg = 0.6            5.67                    14.83
δg = 0.2 + 2.4xg    (6744.00, 611.04)       (29524.00, 4366.83)
ρg = 0.08           5.00                    2.50
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).
Table B.40: Percent relative bias and percent coefficient of variation for dispersion parameters with β = (3.92, 4.54, 3.52) and network size 57 x 23.
True Dispersion     Modeled Dispersion
                    % Relative Bias         % Coefficient of Variation
δg = 0.6            100.00                  8.93E+47
δg = 0.2 + 2.4xg    (1.00, 3.38)            (21.00, 91.50)
ρg = 0.08           5.00                    2.50
* GCL: δg = 0; DMd: δg = δ ≠ 0; DMdf: δg = f(xg); DMr: ρg = ρ; DMrf: ρg = f(xg).