montpellier
TRANSCRIPT
Overview New methods War stories Conclusions References
Statistical machismo and common sense
Ben Bolker, McMaster University
Departments of Mathematics & Statistics and Biology
ISEC
July 2014
Outline
1 Overview
  Statistical machismo
  Why do ecological statistics?
2 New methods
  Desiderata
  Statisticians and software
3 War stories
4 Conclusions
Acknowledgements
People Steve Walker, Mollie Brooks, Mike McCoy
Support NSERC Discovery grant
Statistical machismo
blog post by Brian McGill on Dynamic Ecology
≈ "method ageism" (M. Brewer)
criticizes unnecessarily fancy statistics, e.g.
Bonferroni corrections
phylogenetic corrections
spatial regression
estimation of detectability
Bayesian methods
Also cf. Murtaugh (2007; 2009; 2014)
Statistical machismo (2)
Slippery slope: what is "good enough"? (GLM vs ANOVA)
McGill: "And when the p-values are <0.0000001 and unlikely to change . . ."
Criticizing dogma, not methods per se
Does this apply to statistical ecologists?
Are we enabling bad practice?
Caveat: researcher/teacher vs. researcher/consultant niche
Intellectual satisfaction
Novelty
Seek elegant/rigorous solutions
Mathematical or computational challenges
Solve scientific and societal problems
Knuth: "The important thing, once you have enough to eat and a nice house, is what you can do for others, what you can contribute to the enterprise as a whole"
Basic science
Hypothesis testing (broad sense)
Links to ecological theory
Applied science
Prediction
Decision analysis
Technology transfer (where possible)
Career advancement
get a job/get tenure/get funded
satisfy colleagues
be either useful or rigorous!
avoid "too much collaboration"
curse of novelty
Robustness
less reliance on assumptions (Box: "all models are wrong . . .")
computational robustness
handle crappy (small, noisy, missing . . . ) data
robust to user error
examples: sandwich estimators; M-estimators; design-based statistics; permutation tests
tradeoffs: complexity, loss of interpretability, increased bias, decreased efficiency
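As one illustration of the "less reliance on assumptions" point, here is a minimal sketch of a two-sample permutation test for a difference in means. It assumes only that observations are exchangeable under the null hypothesis; the function name, defaults, and toy data are hypothetical and not taken from the talk.

```python
import numpy as np

def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference and a p-value. The +1 terms count the
    observed labelling itself, so the p-value can never be exactly zero.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    n_x = len(x)
    n_extreme = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)  # random relabelling of groups
        diff = shuffled[:n_x].mean() - shuffled[n_x:].mean()
        if abs(diff) >= abs(observed):
            n_extreme += 1
    return observed, (n_extreme + 1) / (n_perm + 1)

# Toy data (hypothetical): two clearly separated groups
obs_diff, p_value = permutation_test(np.arange(10) + 10.0, np.arange(10.0))
```

The tradeoff noted above applies: the test makes no distributional assumption, but it answers a narrower question than a fitted model would.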
Speed and scalability
"just get a bigger computer", or a cluster
quantitative becomes qualitative
modern methods (ensembles, resampling, MCMC . . . ) increase need for speed (but usually trivially parallelizable)
scalability (parallel implementations; computational complexity)
examples: ensemble-based approaches; kernel-based methods; sparse matrix computation; map/reduce
tradeoffs: complexity, lack of generality
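To make "trivially parallelizable" concrete, here is a hedged sketch of a bootstrap of the sample mean split into independent chunks, each with its own random stream. The function names, worker count, and toy data are assumptions for illustration. Threads are used only to keep the sketch self-contained; a process pool or a cluster scheduler would play the same role for heavier work.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def bootstrap_chunk(data, n_rep, seed_seq):
    """Draw n_rep bootstrap resamples of data and return their means."""
    rng = np.random.default_rng(seed_seq)
    idx = rng.integers(0, len(data), size=(n_rep, len(data)))
    return data[idx].mean(axis=1)

def parallel_bootstrap_means(data, n_rep=1000, n_workers=4, seed=0):
    """Split n_rep replicates across workers; chunks share no state."""
    data = np.asarray(data, dtype=float)
    # Independent, reproducible random streams, one per worker
    seeds = np.random.SeedSequence(seed).spawn(n_workers)
    reps = [n_rep // n_workers] * n_workers
    for i in range(n_rep % n_workers):  # distribute any remainder
        reps[i] += 1
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        chunks = list(pool.map(bootstrap_chunk, [data] * n_workers, reps, seeds))
    return np.concatenate(chunks)

toy_data = np.arange(100.0)  # hypothetical data
boot_means = parallel_bootstrap_means(toy_data, n_rep=1000)
boot_se = boot_means.std(ddof=1)  # bootstrap standard error of the mean
```

Because the replicates never communicate, the only coordination needed is seeding and concatenation, which is why such methods scale out easily.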
Interpretability (Warton and Hui, 2010)
connection to mechanistic/theoretical models
. . . or to statistical paradigms (e.g. linear models)
maybe a hard sell
examples: Bayesian statistics (P(H|D) vs. P(D|H)); GLMs
counterargument: Breiman (2001)
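The P(H|D) vs. P(D|H) distinction can be shown with a small worked example of Bayes' rule for a single binary hypothesis. The prior and likelihood values below are made-up numbers chosen only for illustration.

```python
def posterior_probability(prior_h, p_d_given_h, p_d_given_not_h):
    """Bayes' rule for a binary hypothesis H and observed data D."""
    evidence = p_d_given_h * prior_h + p_d_given_not_h * (1.0 - prior_h)
    return p_d_given_h * prior_h / evidence

# Hypothetical numbers: the data are likely under H (P(D|H) = 0.8),
# but H itself is a priori unlikely (P(H) = 0.1).
p_h_given_d = posterior_probability(prior_h=0.1,
                                    p_d_given_h=0.8,
                                    p_d_given_not_h=0.05)
# p_h_given_d is 0.08 / 0.125 = 0.64, noticeably different from P(D|H) = 0.8
```

The posterior answers the question most users actually ask (how probable is the hypothesis given the data), at the cost of having to specify a prior.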
Increased correctness (O'Hara and Kotze, 2010; Warton and Hui, 2010)
decrease bias/type I error; improve coverage
how big is bias generally? how much does it matter?
unpopular with ecologists! especially if it "lowers power"/makes effects disappear . . .
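One way to approach "how much does it matter?" is to compute coverage directly. The sketch below computes the exact coverage of the textbook Wald 95% interval for a binomial proportion by summing binomial probabilities over all outcomes whose interval contains the true value. The choice of interval, sample sizes, and true proportion are assumptions for illustration and do not come from the talk.

```python
from math import comb, sqrt

def wald_coverage(n, p, z=1.96):
    """Exact coverage probability of the Wald interval for a proportion."""
    coverage = 0.0
    for k in range(n + 1):
        p_hat = k / n
        half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
        if p_hat - half_width <= p <= p_hat + half_width:
            # Add the binomial probability of observing exactly k successes
            coverage += comb(n, k) * p**k * (1.0 - p) ** (n - k)
    return coverage

cov_small = wald_coverage(20, 0.1)   # small sample, well below nominal 95%
cov_large = wald_coverage(500, 0.1)  # larger sample, much closer to nominal
```

Comparing the two values shows the shortfall shrinking as the sample grows, which is the kind of evidence needed to decide whether a fancier interval is worth the trouble.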
Statistical power/efficiency
squeezing more out of data is always good
. . . but how much?
maybe irrelevant if there's enough data (econometrics, e.g. Angrist and Pischke (2009))
examples: GLMs vs. least-squares; random vs fixed effects
tradeoffs: complexity, model-dependence/loss of robustness
Handle new kinds of data / solve new problems
Hard to argue with this one!
Most often, methods for combining different characteristics:
spatial/temporal/phylog. correlation
non-Normal data
missing data
regularization/smoothing/dimension limitation
etc. ...
Ease of use
general, flexible frameworks (maybe) (or domain-specific solutions)
interfaces
examples: MaxEnt; information-theoretic approaches
counterexamples: BUGS/JAGS, AD Model Builder?
User friendliness/interfaces
how hard should you try to make your methods usable? (Dynamic Ecology blog post)
what kind of interface?
equations
pseudocode
code blobs
package
GUI
Technology transfer and software engineering
Technology transfer might be the most useful we can be
Boring? Unrewarding? New tech has to come from somewhere
statistical software now commoditized and often free
good statistical software takes a lot of time and effort (and few users want to pay for it)
eating dog food/developer blindness
building software exposes new issues:
Knuth: "Science is what we understand well enough to explain to a computer. Art is everything else we do"
Responsibilities
Do you have a responsibility to
Fix bugs?
Provide features?
Advise users?
Does the "no liability" clause of free software licenses (e.g. GPL §16) absolve us of moral responsibility?
Spatial moment equations
general theoretical framework for ecological dynamics of spatial point processes (Ovaskainen et al., 2014)
used mostly to understand qualitative ecological dynamics
framework for connecting spatial theory and data?
in particular, should be able to deconvolve effects of (e.g.) habitat and habitat preference
$$N(x) = \int E(y - x)\, H(y)\, dy = (E * H)(x)$$
then
$$\tilde{H}(\omega) = \frac{\tilde{C}_{EN}}{\tilde{C}_{EE}}$$
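A minimal numerical sketch of this deconvolution is given below, under several stated assumptions: a periodic 1-D landscape, an ordinary circular convolution N = E * H, E treated as an observed field and H as the unknown kernel, and cross-spectra defined as conj(Ẽ)·Ñ and conj(Ẽ)·Ẽ. The Gaussian kernel, the white-noise field, and all variable names are hypothetical. No observation noise, non-Normality, or nonlinear response is included.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 256
x = np.arange(n)

# Hypothetical kernel H: a Gaussian bump in circular distance, normalized
circ_dist = np.minimum(x, n - x)
H = np.exp(-0.5 * (circ_dist / 5.0) ** 2)
H /= H.sum()

# Hypothetical observed field E: white noise
E = rng.normal(size=n)

# Forward model: circular convolution N = E * H, computed via the FFT
fE = np.fft.fft(E)
N = np.fft.ifft(fE * np.fft.fft(H)).real

# Deconvolution: ratio of cross-spectrum to power spectrum
fN = np.fft.fft(N)
C_EN = np.conj(fE) * fN  # cross-spectrum of E and N
C_EE = np.conj(fE) * fE  # power spectrum of E
H_recovered = np.fft.ifft(C_EN / C_EE).real

max_abs_error = np.max(np.abs(H_recovered - H))
```

In this noise-free setting the kernel is recovered to floating-point accuracy. With real data the division amplifies noise at frequencies where the power spectrum is small, so some regularization would be needed.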
Deconvolution: example
[Figure: two panels, "habitat preference" (axes: distance, habitat) and "correlations & preference" (axes: distance, population)]
so far just a gleam in my eye
many details (non-Normality, nonlinear response) . . .
Estimating growth autocorrelation in unmarked individuals (Brooks et al., 2013)
uncorrelated growth: σ²(t) linear
correlated growth: σ²(t) quadratic
straightforward estimation
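The linear vs. quadratic contrast can be checked with a small simulation. This is a sketch under simplifying assumptions (additive growth, Normal increments, and the two extreme cases of zero and perfect autocorrelation), not the estimation method of Brooks et al. (2013). All parameter values and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_individuals, n_steps = 5000, 20
t = np.arange(1, n_steps + 1)

# Uncorrelated growth: each individual draws a fresh increment every step
increments = rng.normal(loc=1.0, scale=0.5, size=(n_individuals, n_steps))
size_uncorrelated = increments.cumsum(axis=1)

# Perfectly correlated growth: each individual keeps one growth rate
rates = rng.normal(loc=1.0, scale=0.5, size=(n_individuals, 1))
size_correlated = rates * t

# Among-individual variance in size at each time
var_uncorrelated = size_uncorrelated.var(axis=0)
var_correlated = size_correlated.var(axis=0)

# Slope of log(variance) against log(time): about 1 if linear, 2 if quadratic
slope_uncorrelated = np.polyfit(np.log(t), np.log(var_uncorrelated), 1)[0]
slope_correlated = np.polyfit(np.log(t), np.log(var_correlated), 1)[0]
```

Only the size distribution at each time is used, never individual trajectories, which is why the approach can in principle work with unmarked individuals.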
Growth autocorrelation (cont.)
power analysis: require 500 individuals @ 25% autocorrelation, 100 individuals @ 50% autocorrelation
available data: 5–50 individuals
so far unused (cf. Lavine et al. (2002))
Mixed stock estimation (Bolker et al., 2003, 2007; Okuyama and Bolker, 2005)
Bayesian "mixed stock" analysis, following Pella and Masuda (2001)
unconditional likelihood: account for sampling error in sources
better confidence intervals
many-to-many methods (Chen et al., 2010)
widely used . . .
[Figure: plot of labelled points; labels as extracted: AS, AVCR, CYFL, GBGG, MX, BRSU, TR, BAH, BAR, FLF, CBGMOC, NIC, BRF]
A cautionary example
Mismatch between a simple Gibbs sampler and an equivalent WinBUGS implementation
Bug (??) in relatively widely used software; haven't found time to diagnose/fix it!
[Figure: density of contribution (0.0 to 1.0) in four panels (NWFL NWFL, SOFL NWFL, NWFL SOFL, SOFL SOFL); legend: tmcmc, wbugs, wbugsL, wbugsLL]
Mixed models
(generalized) linear mixed models
very useful framework: GLMs (exponential family) + random effects
linear algebra magic by D. Bates, M. Maechler
technology transfer
amazingly popular!
[Figure: citations by year, 1995 to 2015; legend: GLMM paper, all others]
Conclusions
Can't be too gloomy . . . trickle-down process does work
But we should at least recognize when we are amusing ourselves, and when we are doing science
Can we formalize these ideas?
Is it worth it?
References
Angrist, J.D. and Pischke, J.S., 2009. Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press, Princeton, 1st edition. ISBN 9780691120355.
Bolker, B., Okuyama, T., et al., 2003. Ecological Applications, 13(3):763–775.
Bolker, B.M., Okuyama, T., et al., 2007. Molecular Ecology, 16:685–695. doi:10.1111/j.1365-294X.2006.03161.x.
Breiman, L., 2001. Statistical Science, 16(3):199–215. ISSN 08834237.
Brooks, M.E., McCoy, M.W., and Bolker, B.M., 2013. PLoS ONE, 8(10):e76389. doi:10.1371/journal.pone.0076389.
Chen, Y., Smith, S.J., and Campana, S.E., 2010. Canadian Journal of Fisheries and Aquatic Sciences, 67(10):1533–1548. ISSN 0706-652X, 1205-7533. doi:10.1139/F10-078.
Lavine, M., Beckage, B., and Clark, J.S., 2002. Journal of Agricultural, Biological, and Environmental Statistics, 7(1):21–41. ISSN 1085-7117, 1537-2693. doi:10.1198/108571102317475044.
Murtaugh, P.A., 2007. Ecology, 88(1):56–62.
Murtaugh, P.A., 2009. Ecology Letters, 12(10):1061–1068. ISSN 1461023X. doi:10.1111/j.1461-0248.2009.01361.x.
Murtaugh, P.A., 2014. Ecology, 95(3):611–617. ISSN 0012-9658. doi:10.1890/13-0590.1.
O'Hara, R.B. and Kotze, D.J., 2010. Methods in Ecology and Evolution, 1(2):118–122. ISSN 2041-210X. doi:10.1111/j.2041-210X.2010.00021.x.
Okuyama, T. and Bolker, B.M., 2005. Ecological Applications, 15(1):315–325.
Ovaskainen, O., Finkelshtein, D., et al., 2014. Theoretical Ecology, 7(1):101–113. ISSN 1874-1738, 1874-1746. doi:10.1007/s12080-013-0202-8.
Pella, J. and Masuda, M., 2001. Fisheries Bulletin, 99:151–167.
Warton, D.I. and Hui, F.K.C., 2010. Ecology, 92(1):3–10. ISSN 0012-9658. doi:10.1890/10-0340.1.