Multivariate Analysis of Variance for Stream Classification in Texas

Eric S. Hersh

CE397 – Statistics in Water Resources

Term Project

Cinco de Mayo, 2009

Can we quantitatively regionalize the streams of Texas?

East Texas

North-Central Texas

West Texas

South-Central Texas

Lower Rio Grande Basin

Hersh, E.S., Maidment, D.R., and W.S. Gordon. “An Integrated Stream Classification System to Support Environmental Flow Analyses in Texas.” J. Am. Water Res. Assoc. Submitted November 2008.

Revisited - the question posed

Can we improve the way in which we perform the regionalization and thus (potentially) increase its classification strength?

Analysis of Variance (ANOVA)

Purpose: test whether group means are different
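As a rough illustration of what "testing whether group means are different" involves, the one-way ANOVA F statistic compares the variation between group means with the variation within groups. The sketch below uses only the Python standard library; the function name `one_way_f` and the sample values are hypothetical and are not taken from the Texas data set.

```python
def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    n = len(all_values)          # total number of observations
    k = len(groups)              # number of groups
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their own group mean
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values of one parameter in three regions
f_stat = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

A large F value suggests that the group means differ by more than the within-group scatter would explain; turning it into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom.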

Multivariate Analysis of Variance (MANOVA)

Purpose: ANOVA with several dependent variables

• Multiple metric dependent variables (n=18)

• Based on categorical (non-metric) independent variables (n=5 regions)

• Manipulate independent variables to determine effect on dependent variables using SAS PROC GLM (general linear model)

Region = DO ± Temp ± TSS ± pH ± Cond ± AirTemp ± Precip ± PET ± MAQ ± MAV ± BFI ± ZeroQ ± IQR ± Slope ± Substrate ± Sand ± Silt ± Clay

The Model

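ANOVA:

$$\mu_1 = \mu_2 = \dots = \mu_k$$

MANOVA:

$$\begin{pmatrix}\mu_{11}\\ \mu_{21}\\ \vdots\\ \mu_{p1}\end{pmatrix} = \begin{pmatrix}\mu_{12}\\ \mu_{22}\\ \vdots\\ \mu_{p2}\end{pmatrix} = \dots = \begin{pmatrix}\mu_{1k}\\ \mu_{2k}\\ \vdots\\ \mu_{pk}\end{pmatrix}$$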

where:

p = parameter (dependent variable)

k = factor (independent variable)

Data Gaps

• Total number of subbasins in Texas = 205

• Number with complete data = 103

Uh oh! This test is going to lose a lot of value. Unless…

• Can we fill in the gaps somehow?

Data Gaps

• Some of the subbasins in Texas have no rivers.

• Many have no gages.

• Many have no water quality (WQ) sampling stations.

  – Synthetic data would be difficult to generate and of poor quality.

• But, the MANOVA test requires complete matrices.

  – Solution: fill in gaps with parameter means

  – Dilutes strength of classification (regions tend toward others)
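The gap-filling step can be sketched as follows. This is a minimal illustration that assumes each gap is replaced by the mean of that parameter over all subbasins with observed values; the function name `fill_with_parameter_means`, the use of `None` to mark a gap, and the sample values are all hypothetical.

```python
def fill_with_parameter_means(table):
    """Replace None entries in each column with the mean of that column's observed values.

    `table` is a list of rows (one per subbasin); each row holds one value per parameter.
    Assumes every column has at least one observed value.
    """
    n_cols = len(table[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in table if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if row[j] is None else row[j] for j in range(n_cols)]
            for row in table]


# Hypothetical values of two parameters in four subbasins; None marks a data gap
table = [[6.0, 20.0], [None, 22.0], [8.0, None], [7.0, 24.0]]
filled = fill_with_parameter_means(table)
```

Because every filled value sits at the overall mean regardless of region, subbasins with gaps are pulled toward the center of the data, which is why this approach dilutes the separation between regions.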

Hypothesis Test

• Null Hypothesis: (vectors of) the group means are equal

Of course not! That’s preposterous! There would be no regionalization!

But… we don’t care.

(PRISM, 1971-2000)

Evaluating the Model

• Pillai’s trace considered most robust

  – K.C.S. Pillai, 1920-1985, Indian statistician
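For reference, Pillai's trace is commonly defined as V = tr(H (H + E)^-1), where H is the between-group and E the within-group sums-of-squares-and-cross-products matrix. The sketch below computes it in plain Python under that definition; it is an illustration only, not the SAS PROC GLM procedure used in the analysis, and the helper names and sample data are hypothetical.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def sscp(rows, center):
    """Sums of squares and cross products of rows about a center vector."""
    p = len(center)
    m = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - center[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                m[i][j] += d[i] * d[j]
    return m


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pillai_trace(groups):
    """groups maps a region label to a list of observation rows (one value per parameter)."""
    all_rows = [r for rows in groups.values() for r in rows]
    p = len(all_rows[0])
    total = sscp(all_rows, mean_vector(all_rows))          # T = H + E
    within = [[0.0] * p for _ in range(p)]                 # E
    for rows in groups.values():
        w = sscp(rows, mean_vector(rows))
        for i in range(p):
            for j in range(p):
                within[i][j] += w[i][j]
    between = [[total[i][j] - within[i][j] for j in range(p)] for i in range(p)]  # H
    t_inv = invert(total)
    # trace(H * T^-1) without forming the full product
    return sum(between[i][j] * t_inv[j][i] for i in range(p) for j in range(p))


# Hypothetical two-parameter data: identical groups, then one group shifted
same = {"A": [[1, 2], [2, 1], [3, 3]], "B": [[1, 2], [2, 1], [3, 3]]}
shifted = {"A": [[1, 2], [2, 1], [3, 3]], "B": [[11, 12], [12, 11], [13, 13]]}
v_same = pillai_trace(same)
v_shifted = pillai_trace(shifted)
```

The statistic ranges from 0 (identical group mean vectors) up to the smaller of the number of parameters and the number of groups minus one, so larger values indicate stronger separation between regions.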

Revision Methodology

1. Identify bordering subbasins (n=50, but 10 border multiple, so 60 trials total)

2. Switch one subbasin, check for increase in test stat, record and reset (21 deemed beneficial)

3. Rank by improvement

4. Implement changes in order, discard if decline (18 kept)

5. View in geographic context, apply decision rules (no islands or peninsulas, 15 kept)
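Steps 2 through 4 of the procedure above can be sketched as a simple greedy search. This is an illustration under stated assumptions, not the code used in the study: `stat` stands in for whatever test statistic is being maximized (Pillai's trace in this analysis), the function names are hypothetical, and the toy statistic at the bottom merely counts matches against a made-up target so the example can run. Step 5, the geographic review, is a manual judgment and is not modeled.

```python
def rank_single_switches(assignment, candidates, stat):
    """Steps 2-3: try each switch alone, keep those that raise the statistic, rank by gain.

    assignment: dict mapping subbasin -> region
    candidates: list of (subbasin, neighboring_region) trials
    stat: function taking an assignment dict and returning a number
    """
    base = stat(assignment)
    gains = []
    for subbasin, new_region in candidates:
        trial = dict(assignment)          # copy, so each trial starts from the original
        trial[subbasin] = new_region
        gain = stat(trial) - base
        if gain > 0:
            gains.append((gain, subbasin, new_region))
    return sorted(gains, reverse=True)


def apply_in_order(assignment, ranked, stat):
    """Step 4: apply ranked switches cumulatively, discarding any that lower the statistic."""
    current = dict(assignment)
    best = stat(current)
    kept = []
    for _, subbasin, new_region in ranked:
        trial = dict(current)
        trial[subbasin] = new_region
        value = stat(trial)
        if value < best:
            continue                      # decline: discard this switch
        current, best = trial, value
        kept.append((subbasin, new_region))
    return current, kept


# Toy example with a stand-in statistic (hypothetical; not Pillai's trace)
target = {"a": "East", "b": "West", "c": "West"}
toy_stat = lambda asg: sum(asg[s] == target[s] for s in asg)

start = {"a": "East", "b": "East", "c": "West"}
ranked = rank_single_switches(start, [("b", "West"), ("a", "West")], toy_stat)
final, kept = apply_in_order(start, ranked, toy_stat)
```

Re-checking the statistic in the second pass matters because a switch that helps on its own may no longer help once earlier switches have been applied.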

[Map panels: OLD, NEW, SWITCHED]

Possible Future Work

• Write final report
