New methodologies for the use of cladistic-type matrices to measure
morphological disparity and evolutionary rate
@GraemeTLloyd
Acknowledgements
Matt Friedman
Liam Revell
Mark Bell
Peter Smits
Steve Brusatte
Roger Benson
Steve Wang
Rich Fitzjohn
Cladistic-type data
- Discrete morphological data
- Limited to 32 states (often less)
- Frequently non-Euclidean
- Missing data common
Acladistic analyses
- Disparity: common; no models; single approach (GED)
- Rates: rare; simple models; N approaches ≈ N studies
- Time series an issue
Claddis
github.com/graemetlloyd/Claddis
Disparity
Disparity studies
Brusatte et al. 2008
Thorne et al. 2011
Butler et al. 2011
Toljagic and Butler 2013
Cladistic disparity
Cladistic matrix → (distance metric) → Distance matrix → Ordination ‘Morphospace’
Desiderata
An ideal distance metric should:
1. have high fidelity
2. be normally distributed
3. be Euclidean
4. be calculable
5. be easily visualised
Generalised Euclidean Distance
Wills 2001
But: Sijk is incalculable if the value of character k is missing for i or j (or both)
Alternate distances
- GED
- Raw
- Gower
- MOD
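To make the missing-data issue concrete, here is a minimal Python sketch of two of the alternatives for binary characters. It assumes "Raw" means a Euclidean distance over only the characters scored in both taxa, and "Gower" means the summed differences divided by the number of such comparable characters. The function names and the `MISSING` marker are hypothetical and are not the Claddis API; GED and MOD are not shown.

```python
import math

MISSING = None  # hypothetical marker for an unscored character


def comparable_differences(taxon_i, taxon_j):
    """Absolute per-character differences, skipping any character
    that is missing in either taxon."""
    return [
        abs(a - b)
        for a, b in zip(taxon_i, taxon_j)
        if a is not MISSING and b is not MISSING
    ]


def raw_distance(taxon_i, taxon_j):
    """Euclidean distance over comparable characters only (assumed 'Raw')."""
    diffs = comparable_differences(taxon_i, taxon_j)
    if not diffs:
        return None  # no shared characters: distance is incalculable
    return math.sqrt(sum(d * d for d in diffs))


def gower_distance(taxon_i, taxon_j):
    """Summed differences rescaled by the number of comparable
    characters (assumed 'Gower')."""
    diffs = comparable_differences(taxon_i, taxon_j)
    if not diffs:
        return None  # no shared characters: distance is incalculable
    return sum(diffs) / len(diffs)


# Three toy taxa scored for five binary characters.
a = [0, 1, 1, MISSING, 0]
b = [1, 1, 0, 0, MISSING]
c = [MISSING, MISSING, MISSING, MISSING, 1]

print(raw_distance(a, b), gower_distance(a, b))
print(raw_distance(b, c))  # b and c share no scored characters
```

Under these assumptions the raw distance shrinks as fewer characters are comparable, whereas the rescaled distance does not, and a pair with no shared characters has no distance at all.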
Simulations
Input:
- 20 taxa
- 50 binary characters
- 0-80% missing data
Output:
- Mantel test
- N taxa retained
- Variance of first two PCA axes
- Shapiro-Wilk test
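The input side of this design can be sketched in Python as follows. This is only an illustration, assuming characters are drawn independently and uniformly and that cells are removed at random. The slide does not say how the matrices were generated, so the sampling scheme, the function names and the seed are assumptions. As a simple stand-in for the "N taxa retained" output, the sketch reports the proportion of taxon pairs that still share at least one scored character.

```python
import random


def simulate_matrix(n_taxa=20, n_chars=50, prop_missing=0.0, seed=0):
    """Random binary taxon-by-character matrix with a fixed proportion
    of cells set to None (missing). The generating model is assumed."""
    rng = random.Random(seed)
    matrix = [[rng.randint(0, 1) for _ in range(n_chars)] for _ in range(n_taxa)]
    cells = [(t, c) for t in range(n_taxa) for c in range(n_chars)]
    n_remove = round(prop_missing * len(cells))
    for t, c in rng.sample(cells, n_remove):
        matrix[t][c] = None
    return matrix


def proportion_calculable_pairs(matrix):
    """Share of taxon pairs with at least one character scored in both."""
    n = len(matrix)
    total = 0
    calculable = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += 1
            if any(
                a is not None and b is not None
                for a, b in zip(matrix[i], matrix[j])
            ):
                calculable += 1
    return calculable / total


# Sweep the incompleteness range shown on the slide.
for prop in (0.0, 0.2, 0.4, 0.6, 0.8):
    m = simulate_matrix(prop_missing=prop)
    print(prop, proportion_calculable_pairs(m))
```

The remaining outputs on the slide (Mantel test, variance of the first two axes, Shapiro-Wilk test) would be computed from the distance matrices derived from each simulated input and are not shown here.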
Calculable
[Figure: % taxa retained (0% to 100%) against incompleteness (0% to 80%) for Gower, Raw, GED and MOD]
Visualisation
[Figure: % variance on axes 1 & 2 (15% to 45%) against incompleteness (0% to 80%) for Gower, Raw, GED and MOD]
Normalcy
[Figure: Shapiro-Wilk test (0.75 to 1.00) against incompleteness (0% to 80%) for Gower, Raw, GED and MOD]
Fidelity
[Figure: correlation (-30% to +30%) against incompleteness (0% to 80%) for Gower, Raw, GED and MOD]
Fidelity
[Figure: % highest fidelity (0% to 100%) against incompleteness (0% to 80%) for Gower, Raw, GED and MOD]
% missing data
[Figure: N data sets (0 to 30) against incompleteness (0% to 80%)]
Rates
Rate studies
Derstler 1982
Forey 1988
Ruta et al. 2006
Brusatte et al. 2008
Rate calculation
Rate = N changes / (Δt × Completeness)
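A worked example may help here. The Python sketch below assumes the formula is read as the number of changes divided by the product of branch duration and completeness, with completeness taken as the proportion of characters that can be scored on the branch (between 0 and 1). That reading, the function name and the example numbers are assumptions for illustration, not values from the talk.

```python
def branch_rate(n_changes, duration, completeness):
    """Per-branch rate: changes / (duration * completeness).

    duration: branch length in time units (e.g. millions of years).
    completeness: assumed to be the proportion of characters observable
    on the branch, in the interval (0, 1].
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be in (0, 1]")
    return n_changes / (duration * completeness)


# Hypothetical branch: 6 changes over 10 time units, half the characters scored.
print(branch_rate(6, 10.0, 0.5))  # 1.2 changes per time unit
```

Dividing by completeness scales the rate up for poorly known branches, on the assumption that unscored characters change at the same rate as scored ones.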
Null hypothesis
H0 = equal rates
Alternate hypothesis
Halt = [figure not recovered]
Lungfish
Westoll 1949
Lungfish
Devonian: high rates
Lloyd et al. 2012
Lungfish
Post-Devonian: low rates
Lloyd et al. 2012
Parsimony problem
[Figure: tree with ambiguous (‘?’) ancestral states]
- Change early: ACCTRAN
- Change late: DELTRAN
Parsimony problem
Lloyd et al. 2012
Parsimony problem
[Figure: tree with unknown (‘?’) states; no changes]
Internal vs. terminal
- Rate: internal > terminal
- Changes: internal ≈ terminal
- Duration: internal < terminal
Internal vs. terminal
Solution
Rates revisited
Brusatte et al. 2014
[Figure legend: high rates; low rates]
Time series problems
Disparity time series
Brusatte et al. 2008
Thorne et al. 2011
Butler et al. 2011
Toljagic and Butler 2013
[Figure labels, study order not recoverable: 4 time bins; 4 time bins; 2 time bins; 14 time bins]
Rate time series
Lloyd et al. 2012
Ruta et al. 2006
No completeness
Branch-binning
Rate time series
N changes | Δt | Completeness
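Conclusions
- Distance metrics: Gower, Raw, GED, MOD
- [Figure: morphospace, PCO 1 against PCO 2, marked ‘?’]
- [Figure: rate against time (t), marked ‘?’]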