1
Multivariate Statistics: An Ecological Perspective
Nature is Complex!
2
Advantages of Multivariate Statistics
• Reflect more accurately the true multidimensional, multivariate nature of natural systems.
• Provide a way to handle large data sets with large numbers of variables.
• Provide a way of summarizing redundancy in large data sets.
• Provide rules for combining variables in an "optimal" way.
3
Advantages of Multivariate Statistics
• Provide a solution to the multiple comparison problem by controlling the experimentwise error rate.
• Provide a means of detecting and quantifying truly multivariate patterns that arise out of the correlational structure of the variable set.
• Provide a means of exploring complex data sets for patterns and relationships from which hypotheses can be generated and subsequently tested experimentally.
4
What is Multivariate Statistics?
Model                                  Techniques
y = x1 + x2 + ... + xj                 Regression; Analysis of Variance; Contingency Tables
y1 + y2 + ... + yi = x                 Multivariate ANOVA; Discriminant Analysis; CART, MRPP, MANTEL
y1 + y2 + ... + yi = x1 + x2 + ... + xj  Canonical Corr. Analysis; Redundancy Analysis; Can. Correspond. Analysis
y1 + y2 + ... + yi                     Ordination; Cluster Analysis

Multivariate Statistics: the models with more than one y variable.
5
Example 1-Environmental Gradient
Data Matrix
Obs   Canopy Cover   Snag Density   Canopy Height
1     80             0.2            35
2     75             0.5            32
3     72             0.8            28
.     .              .              .
.     .              .              .
12    25             0.6            15
[Figure: observations plotted in a 3-Dimensional Data Space; technique: Ordination]
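One way to picture the ordination named on this slide is a principal components analysis of the data matrix. The sketch below is only an illustration, not the method behind the slide's figure: it assumes PCA on standardized variables, uses only the four rows visible in the data matrix (the elided rows are left out), and finds the first axis by power iteration. All function names are hypothetical.

```python
import math

# Visible rows of the Example 1 data matrix (Obs 1, 2, 3, 12).
# Columns: canopy cover, snag density, canopy height. Elided rows are omitted.
rows = [
    (80.0, 0.2, 35.0),
    (75.0, 0.5, 32.0),
    (72.0, 0.8, 28.0),
    (25.0, 0.6, 15.0),
]

def standardize(data):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in data) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in data]

def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_axis(matrix, iterations=500):
    """Leading eigenvector and eigenvalue by power iteration."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue

z = standardize(rows)
axis, eigenvalue = first_axis(correlation_matrix(z))

# Position of each observation along the first ordination axis.
scores = [sum(a * b for a, b in zip(row, axis)) for row in z]

# Share of total standardized variance carried by the first axis.
explained = eigenvalue / len(axis)
print(axis, scores, explained)
```

The sign of the axis is arbitrary, so only the relative positions of the scores carry meaning.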
6
Example 1-Environmental Gradient
Data Matrix
Obs   Canopy Cover   Snag Density   Canopy Height
1     80             0.2            35
2     75             0.5            32
3     72             0.8            28
.     .              .              .
.     .              .              .
12    25             0.6            15
[Figure: observations plotted in a 3-Dimensional Data Space; technique: Cluster Analysis]
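Cluster analysis covers a family of techniques; as one hedged illustration, the sketch below runs single-linkage agglomerative clustering on the four visible rows of the data matrix. The choice of single linkage and of Euclidean distance on raw (unstandardized) values is an assumption made for brevity, so canopy cover dominates the distances here. The function name is hypothetical.

```python
import math

# Visible rows of the Example 1 data matrix, keyed by observation number.
# Columns: canopy cover, snag density, canopy height. Elided rows are omitted.
observations = {
    1: (80.0, 0.2, 35.0),
    2: (75.0, 0.5, 32.0),
    3: (72.0, 0.8, 28.0),
    12: (25.0, 0.6, 15.0),
}

def single_linkage(points):
    """Repeatedly merge the two clusters whose closest members are nearest."""
    clusters = [frozenset([key]) for key in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the minimum pairwise distance.
                d = min(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

merges = single_linkage(observations)
for left, right, d in merges:
    print(left, "+", right, "at distance", round(d, 2))
```

With these four rows, observations 2 and 3 join first, observation 1 joins them next, and observation 12 joins last at a much larger distance.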
7
Example 2-Community Structure
Data Matrix
Sample   Species A   Species B   Species C
1        80          1.2         35
2        75          0.5         32
3        72          0.8         28
.        .           .           .
.        .           .           .
12       25          0.6         15
[Figure: samples 1-12 plotted in a 3-Dimensional Species Space; technique: Ordination]
8
Example 2-Community Structure
Data Matrix
Sample   Species A   Species B   Species C
1        80          1.2         35
2        75          0.5         32
3        72          0.8         28
.        .           .           .
.        .           .           .
12       25          0.6         15
[Figure: samples 1-12 plotted in a 3-Dimensional Species Space; technique: Cluster Analysis]
9
Example 2-Community Structure
Data Matrix
Sample   Species A   Species B   Species C
1        80          1.2         35
2        75          0.5         32
3        72          0.8         28
.        .           .           .
.        .           .           .
12       25          0.6         15
[Figure: species A, B, and C plotted in a 3-Dimensional Sample Space; technique: Ordination]
10
Example 2-Community Structure
Data Matrix
Sample   Species A   Species B   Species C
1        80          1.2         35
2        75          0.5         32
3        72          0.8         28
.        .           .           .
.        .           .           .
12       25          0.6         15
[Figure: samples and species A, B, and C plotted together in a 3-Dimensional Ordination Space; technique: Ordination]
11
Example 3-Niche Separation
Data Matrix
Ind.   Species   Canopy Cover   Snag Density   Canopy Height
1      A         80             1.2            35
2      A         75             0.5            32
3      A         72             0.8            28
.      .         .              .              .
31     B         35             3.3            15
32     B         75             4.1            25
60     B         15             5.0            3
.      .         .              .              .
61     C         5              2.1            5
62     C         8              3.4            2
90     C         25             0.6            15
[Figure: 1-Dimensional Data Space; species A, B, and C positioned along each variable separately, including X2=Snag density and X3=Canopy height]
12
Example 3-Niche Separation
Data Matrix
Ind.   Species   Canopy Cover   Snag Density   Canopy Height
1      A         80             1.2            35
2      A         75             0.5            32
3      A         72             0.8            28
.      .         .              .              .
31     B         35             3.3            15
32     B         75             4.1            25
60     B         15             5.0            3
.      .         .              .              .
61     C         5              2.1            5
62     C         8              3.4            2
90     C         25             0.6            15
[Figure: 2-Dimensional Data Space; individuals of species A, B, and C plotted as points]
14
Example 3-Niche Separation
Data Matrix
Ind.   Species   Canopy Cover   Snag Density   Canopy Height
1      A         80             1.2            35
2      A         75             0.5            32
3      A         72             0.8            28
.      .         .              .              .
31     B         35             3.3            15
32     B         75             4.1            25
60     B         15             5.0            3
.      .         .              .              .
61     C         5              2.1            5
62     C         8              3.4            2
90     C         25             0.6            15
[Figure: 3-Dimensional Data Space; individuals of species A, B, and C plotted as points, with axes including X2=Snag density and X3=Canopy height]
16
Example 4-Habitat Use
Data Matrix
Obs   Group    Canopy Cover   Snag Density   Canopy Height
1     Use      80             1.2            35
2     Use      75             0.5            32
3     Use      72             0.8            28
4     Use      35             3.3            15
.     .        .              .              .
31    Random   5              2.1            5
32    Random   68             3.4            2
33    Random   25             0.6            15
34    Random   70             1.3            33
.     .        .              .              .
[Figure: 3-Dimensional Data Space; Use and Random observations plotted as points, with axes including X2=Snag density and X3=Canopy height]
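A simple first look at how the Use and Random groups differ is to compare their centroids in the data space. The sketch below does only that, using the eight rows visible in the data matrix (elided rows are left out). It is a descriptive illustration, not a discriminant analysis or a significance test, and the helper name is hypothetical.

```python
import math

# Visible rows of the Example 4 data matrix.
# Columns: canopy cover, snag density, canopy height. Elided rows are omitted.
use = [
    (80.0, 1.2, 35.0),
    (75.0, 0.5, 32.0),
    (72.0, 0.8, 28.0),
    (35.0, 3.3, 15.0),
]
random_points = [
    (5.0, 2.1, 5.0),
    (68.0, 3.4, 2.0),
    (25.0, 0.6, 15.0),
    (70.0, 1.3, 33.0),
]

def centroid(rows):
    """Mean of each variable: the center of a group's cloud of points."""
    return tuple(sum(column) / len(rows) for column in zip(*rows))

use_centroid = centroid(use)
random_centroid = centroid(random_points)

# Per-variable difference between group centers (Use minus Random).
difference = tuple(u - r for u, r in zip(use_centroid, random_centroid))

# Straight-line distance between the two centroids in raw units.
separation = math.dist(use_centroid, random_centroid)
print(use_centroid, random_centroid, difference, separation)
```

Because the variables are on different scales, the raw separation is driven mostly by canopy cover; standardizing first would weight the variables more evenly.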
17
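Example 5-Constrained Ordination
Data Matrix
[Figure: 3-D Species Space (axes include X2=Species B, X3=Species C) and 3-D Environment Space (axes include X2=Snag Density, X3=Canopy Height), combined in an ordination plot showing samples, species A, B, and C, and the environmental variables Canopy Cover, Snags, and Canopy Height]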
18
Multivariate Statistics: Key Points
• Multivariate statistics applies to cases with multiple "dependent" variables, or a single set of variables presumed to be dependent on some underlying (latent) but unknown factors.
• All multivariate problems can be represented as a two-way data matrix in which rows represent sampling entities and columns represent variables; the internal structure of the matrix with respect to groups of sampling entities or dependence relationships among variables distinguishes among the various multivariate techniques.
• All multivariate problems can be conceptualized geometrically as a data cloud in a P-dimensional data space, where the dimensions (or axes) are defined by the variables of interest; it is the shape, clumping, and dispersion of this cloud that multivariate techniques seek to describe.
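The two-way data matrix and the data cloud can be made concrete with a small sketch. It treats each row of the matrix as one point in a P-dimensional space and summarizes the cloud by its centroid and the distance of each point from it. The rows are the four visible observations from Example 1 (elided rows are left out); the variable names and the choice of summary are assumptions for illustration.

```python
import math

# Columns of the data matrix define the axes of the data space.
variables = ["canopy_cover", "snag_density", "canopy_height"]

# Rows are sampling entities (visible rows of Example 1: Obs 1, 2, 3, 12).
data_matrix = [
    [80.0, 0.2, 35.0],
    [75.0, 0.5, 32.0],
    [72.0, 0.8, 28.0],
    [25.0, 0.6, 15.0],
]

n_entities = len(data_matrix)   # number of points in the cloud
n_dimensions = len(variables)   # P, the dimensionality of the data space

# Center of the cloud: the mean of each variable.
centroid = [sum(row[j] for row in data_matrix) / n_entities
            for j in range(n_dimensions)]

# Dispersion: Euclidean distance of each point from the centroid.
spread = [math.dist(row, centroid) for row in data_matrix]
mean_spread = sum(spread) / n_entities
print(n_entities, n_dimensions, centroid, mean_spread)
```

Here the last observation lies far from the other three, which is the kind of clumping and dispersion the techniques in this deck set out to describe.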
19
Multivariate Description versus Inference
On the Descriptive Side:
• Provide rules for combining the variables in an optimal way. What is meant by 'optimal' may vary from one technique to the next.
On the Inferential Side:
• Provide explicit control over the experimentwise error rate. Many situations in which multivariate techniques are applied could be analyzed through a series of univariate significance tests.
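Why a series of univariate tests raises the experimentwise error rate can be shown with a short calculation. The sketch below assumes the tests are independent, which real ecological variables rarely are, so treat the numbers as a rough guide. The formula 1 - (1 - alpha)^k and the Bonferroni adjustment are standard results and are not taken from the slides; the function names are hypothetical.

```python
def experimentwise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha that keeps the experimentwise rate at or below alpha."""
    return alpha / n_tests

for k in (1, 3, 10):
    print(k, round(experimentwise_error_rate(0.05, k), 3))
# prints:
# 1 0.05
# 3 0.143
# 10 0.401
```

With ten separate tests at alpha = 0.05, the chance of at least one spurious result is about 40 percent, which is the problem a single multivariate test is meant to avoid.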
20
Multivariate Confusion
• Which technique to use?
  - Ordination or cluster analysis?
  - Unconstrained or constrained ordination?
  - Polar ordination, principal components analysis, principal coordinates analysis, correspondence analysis, nonmetric multidimensional scaling?
21
Multivariate Confusion
• Alternative Terminology for Techniques
  - Indirect Gradient Analysis or Unconstrained Ordination?
  - Reciprocal Averaging or Correspondence Analysis?
  - Canonical Ordination or Constrained Ordination?
  - Discriminant Analysis or Canonical Variates Analysis?
22
Multivariate Confusion
• Terminology for Variable Labels Based on Data Type and Measurement Scale
  - Categorical Variable
    - Dichotomous
    - Polytomous
      - Ordinal Scale
      - Nominal Scale
  - Continuous Variable
    - Ratio Scale (true zero)
    - Interval Scale (arbitrary zero)
  - Count Variable
23
Multivariate Confusion
• Terminology for Variable Labels Based on the Relationship with Other Variables
  - Independent Variable: a variable presumed to be a cause of any change in a dependent variable; often regarded as fixed, either as in experimentation or because the context of the data suggests it plays a causal role in the situation under study.
  - Dependent Variable: a variable presumed to be responding to a change in an independent variable; free to vary in response to controlled conditions.
24
Multivariate Techniques
Technique                                          Objective
Unconstrained Ordination (PCA, PO, CA, DCA, NMDS)  Extract gradients of maximum variation
Cluster Analysis (family of techniques)            Establish groups of similar entities
Discrimination (MRPP, MANTEL, DA, CART, ...)       Test for or describe differences among groups of entities, or predict group membership
Constrained Ordination (RDA, CCA, CAPS, CanCorr)   Extract gradients of variation in dependent variables explainable by independent variables
25
Multivariate Techniques
Technique                                          Dependence Type
Unconstrained Ordination (PCA, PO, CA, DCA, NMDS)  Interdependence
Cluster Analysis (family of techniques)            Interdependence
Discrimination (MRPP, MANTEL, DA, CART, ...)       Dependence
Constrained Ordination (RDA, CCA, CAPS, CanCorr)   Dependence
26
Multivariate Techniques
Technique                                          Data Structure
Unconstrained Ordination (PCA, PO, CA, DCA, NMDS)  One set; >>2 variables
Cluster Analysis (family of techniques)            One set; >>2 variables
Discrimination (MRPP, MANTEL, DA, CART, ...)       Two sets; 1 grouping variable, >>2 discriminating variables
Constrained Ordination (RDA, CCA, CAPS, CanCorr)   Two sets; >>2 dependent variables, >>2 independent variables
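The technique tables in these slides can be read as a lookup from data structure to candidate technique families. The sketch below encodes the dependence type and data structure columns as a small dictionary and filters on them. The dictionary keys, field names, and function name are hypothetical, and the result is only a starting point, since the same question can often be approached with more than one technique.

```python
# Hypothetical encoding of the technique tables from the slides.
# "sets" is the number of variable sets; "grouping" marks whether one set
# is a single grouping variable.
TECHNIQUES = {
    "Unconstrained Ordination": {
        "dependence": "Interdependence", "sets": 1, "grouping": False},
    "Cluster Analysis": {
        "dependence": "Interdependence", "sets": 1, "grouping": False},
    "Discrimination": {
        "dependence": "Dependence", "sets": 2, "grouping": True},
    "Constrained Ordination": {
        "dependence": "Dependence", "sets": 2, "grouping": False},
}

def candidate_techniques(n_variable_sets, has_grouping_variable=False):
    """Return technique families whose data structure matches the inputs."""
    return [
        name
        for name, traits in TECHNIQUES.items()
        if traits["sets"] == n_variable_sets
        and traits["grouping"] == has_grouping_variable
    ]

print(candidate_techniques(1))                              # one set of variables
print(candidate_techniques(2, has_grouping_variable=True))  # groups plus discriminators
print(candidate_techniques(2))                              # dependent plus independent sets
```

A single variable set returns both interdependence families, so the choice between ordination and clustering still rests on the research question.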
27
Multivariate Techniques
Obs   Group   X-set                     Y-set
1     A       a11 a12 a13 ... a1p       b11 b12 b13 ... b1m
2     A       a21 a22 a23 ... a2p       b21 b22 b23 ... b2m
3     A       a31 a32 a33 ... a3p       b31 b32 b33 ... b3m
.     .       .   .   .   ... .         .   .   .   ... .
.     .       .   .   .   ... .         .   .   .   ... .
n     A       an1 an2 an3 ... anp       bn1 bn2 bn3 ... bnm
n+1   C       c11 c12 c13 ... c1p
n+2   C       c21 c22 c23 ... c2p
n+3   C       c31 c32 c33 ... c3p
.     .       .   .   .   ... .
.     .       .   .   .   ... .
N     C       cn1 cn2 cn3 ... cnp
28
Multivariate Techniques
Technique                                          Sample Characteristics
Unconstrained Ordination (PCA, PO, CA, DCA, NMDS)  N (from a known or unknown number of populations)
Cluster Analysis (family of techniques)            N (from a known or unknown number of populations)
Discrimination (MRPP, MANTEL, DA, CART, ...)       N (from a known number of populations) or N1, N2, ... (from separate populations)
Constrained Ordination (RDA, CCA, CAPS, CanCorr)   N (from one population)
29
Multivariate Statistics: Key Points
• Multivariate statistics involves both descriptive and inferential statistics, although most applications are exploratory and descriptive in nature.
• Multivariate statistics includes a broad array of techniques, with confusing and inconsistent use of terminology; sorry, no way around this.
• Research questions often warrant the use of more than one technique, as the same technique can often be used to answer different questions and the same question can often be answered with different techniques.