chapter 30 structural equation modeling from: mccune, b. & j. b. grace. 2002. analysis of...


CHAPTER 30 Structural Equation Modeling

From: McCune, B. & J. B. Grace. 2002. Analysis of Ecological Communities. MjM Software Design, Gleneden Beach, Oregon http://www.pcord.com

Tables, Figures, and Equations

Figure 30.1. Example of an ordination biplot showing results of nonmetric multidimensional scaling, group identities for individual plots, and vectors indicating environmental correlations (modified from Grace et al. 2000). Elev is elevation; C, Ca, K, Mg, Mn, N, P, and Zn are elements in soil samples. Ellipses represent ordination space envelopes for vegetation groups A, B, C, D, and E, while an envelope is not given for the heterogeneous group F.

Table 30.1. Example of correlations among environmental and axis variables (Pearson correlation coefficients). Numbers that are underlined represent significant correlations at the p = 0.05 level. Variables are elevation, soil composition (calcium, magnesium, manganese, zinc, potassium, phosphorus, acidity, total carbon, and total nitrogen), and ordination axis scores.

        elev   Ca     Mg     Mn     Zn     K      P      pH     C      N      Axis1  Axis2
elev    1.0    --     --     --     --     --     --     --     --     --     --     --
Ca      .42    1.0    --     --     --     --     --     --     --     --     --     --
Mg      .21    .79    1.0    --     --     --     --     --     --     --     --     --
Mn      .71    .66    .42    1.0    --     --     --     --     --     --     --     --
Zn      -.10   .47    .45    .39    1.0    --     --     --     --     --     --     --
K       -.20   .40    .35    .25    .69    1.0    --     --     --     --     --     --
P       -.32   .32    .26    .08    .72    .74    1.0    --     --     --     --     --
pH      .43    -.02   -.14   .28    -.26   -.46   -.48   1.0    --     --     --     --
C       -.44   .04    .15    -.19   .49    .55    .62    -.63   1.0    --     --     --
N       -.46   -.05   .11    -.27   .31    .41    .51    -.53   .81    1.0    --     --
Axis1   .77    .30    .03    .55    -.20   -.23   -.37   .47    -.55   -.47   1.0    --
Axis2   .005   -.26   -.35   -.14   -.28   -.16   -.25   .11    -.21   -.13   .17    1.0

Correlations with Axis 1

Variable  Simple  Standard  Stepwise
elev      .77     .57       .77
Ca        .30     .33       ----
Mg        ----    -.31      ----
Mn        .55     ----      ----
Zn        -.20    ----      ----
K         -.23    ----      ----
P         -.37    ----      ----
pH        .47     ----      ----
C         -.55    -.33      -.23
N         -.47    ----      ----
R2 =      .59     .65       .64

Correlations with Axis 2

Variable  Simple  Standard  Stepwise
elev      .005    ----      ----
Ca        -.26    ----      ----
Mg        -.35    -.40      -.35
Mn        ----    ----      ----
Zn        -.28    ----      ----
K         ----    ----      ----
P         -.25    ----      ----
pH        ----    ----      ----
C         -.21    ----      ----
N         ----    ----      ----
R2 =      .12     .11       .12

Figure 30.2. Illustration of the regression relationships between environmental parameters and Axes 1 and 2 of the ordination. Simple bivariate regression, multiple regression, and stepwise regression results are shown for comparison. “----” denotes nonsignificant coefficients. Double-headed arrows represent correlations between independent variables, which are dealt with differently in the three methods of correlation analysis. R2 in the simple correlation column represents the highest R2 obtained for any single variable; for other columns it is the variance explained for the whole model.

Correlation Matrix

        Graz   Bio    Rich
Graz    1.0    ---    ---
Bio     -.5    1.0    ---
Rich    0.0    -.6    1.0

Bivariate Regressions: Graz → Rich = 0.0 (R2 = 0.0); Bio → Rich = -.6 (R2 = .36).

Multiple Regression: Graz → Rich = -.4; Bio → Rich = -.8; correlation between Graz and Bio = -.5; R2 = .48.

Path Model Representation: Graz → Bio = -.5; Bio → Rich = -.8; Graz → Rich = -.4; R2 = .48.

Figure 30.3. Offsetting pathways are represented differently by bivariate correlation and regression, multiple regression, and path models. Graz = grazing (yes or no), Bio = standing community biomass, and Rich = plant species richness. In this example, the path model shows how offsetting negative and positive effects of grazing on richness can result in a zero bivariate correlation.
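The multiple regression coefficients in Figure 30.3 can be recovered from the correlation matrix alone. The following is a minimal sketch, assuming standardized variables; the function name `standardized_betas` is hypothetical and chosen only for this illustration.

```python
import numpy as np

# Correlations from Figure 30.3 (order: Graz, Bio, Rich).
R = np.array([
    [ 1.0, -0.5,  0.0],
    [-0.5,  1.0, -0.6],
    [ 0.0, -0.6,  1.0],
])

def standardized_betas(R, predictors, response):
    """Solve R_xx @ beta = r_xy for standardized partial regression coefficients."""
    R_xx = R[np.ix_(predictors, predictors)]  # correlations among predictors
    r_xy = R[predictors, response]            # predictor-response correlations
    beta = np.linalg.solve(R_xx, r_xy)
    r_squared = float(beta @ r_xy)            # variance explained
    return beta, r_squared

beta, r_squared = standardized_betas(R, predictors=[0, 1], response=2)
print(beta)       # approximately [-0.4, -0.8]
print(r_squared)  # approximately 0.48
```

Grazing has a zero bivariate correlation with richness, yet its partial coefficient is -0.4 once biomass enters the model.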

Partial correlations help to make sense of the mathematical interrelationships among a set of intercorrelated variables. As the name implies, partial correlations are calculated to remove the effect of a third variable and then recalculate the relationship between two variables. This is also referred to as “controlling for the third variable,” “partialling out” the third variable, or “holding the third variable constant.”

Consider, for example, three intercorrelated variables, A, B, and C. We can represent the partial correlation between A and B, $r_{AB \cdot C}$, as

$$r_{AB \cdot C} = \frac{r_{AB} - r_{AC}\, r_{BC}}{\sqrt{(1 - r_{AC}^2)(1 - r_{BC}^2)}}$$

where $r_{AB}$, $r_{AC}$, and $r_{BC}$ represent the bivariate correlations between pairs of variables. If, for example, $r_{AB} = 0.5$, $r_{AC} = 0.8$, and $r_{BC} = 0.6$,

$$r_{AB \cdot C} = \frac{0.5 - (0.8)(0.6)}{\sqrt{(1 - 0.8^2)(1 - 0.6^2)}} = 0.04$$

To better understand this equation, consider what is accomplished by the numerator and denominator. First, the numerator subtracts out the indirect correlation between A and B that passes through their joint correlations with C. Second, the denominator adjusts the total standardized variance of A and B by subtracting the portions of each explained by C.

Box 30.2 What is a partial correlation?
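As a quick numerical check of the partial correlation in Box 30.2, here is a minimal sketch. The function name `partial_correlation` is an assumption made for this example, not something from the original text.

```python
import math

def partial_correlation(r_ab, r_ac, r_bc):
    """Correlation between A and B after removing the effect of C."""
    numerator = r_ab - r_ac * r_bc
    denominator = math.sqrt((1 - r_ac**2) * (1 - r_bc**2))
    return numerator / denominator

# Worked example from the box: r_AB = 0.5, r_AC = 0.8, r_BC = 0.6.
r_ab_c = partial_correlation(0.5, 0.8, 0.6)
print(round(r_ab_c, 2))  # 0.04

# Figure 30.3 values: Graz-Rich = 0.0, Graz-Bio = -0.5, Rich-Bio = -0.6.
graz_rich_given_bio = partial_correlation(0.0, -0.5, -0.6)
print(round(graz_rich_given_bio, 2))  # -0.43
```

The second call shows that a zero bivariate correlation can hide a clearly negative partial correlation once the third variable is held constant.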

The strength of the path mediated through biomass is calculated based on the formula:

strength of a compound path = product of path components

which in the case of Graz → Bio → Rich is

(-0.5) × (-0.8) = +0.40

Further, the total effect of grazing on richness is the sum of the various paths that connect a predictor variable with a response variable. In this case,

total effect = sum of individual paths

= -0.40 + 0.40 = 0.0
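The path arithmetic above can be reproduced for all pairs of variables at once. The sketch below assumes a recursive model (no feedback loops) and uses the standard identity that total effects equal (I - B)^-1 - I, where B holds the direct path coefficients; the matrix layout is a choice made for this illustration.

```python
import numpy as np

names = ["Graz", "Bio", "Rich"]

# B[i, j] is the direct standardized path from variable j to variable i.
B = np.zeros((3, 3))
B[1, 0] = -0.5  # Graz -> Bio
B[2, 0] = -0.4  # Graz -> Rich (direct path)
B[2, 1] = -0.8  # Bio -> Rich

# For a recursive model, B + B^2 + B^3 + ... = (I - B)^-1 - I.
identity = np.eye(3)
total = np.linalg.inv(identity - B) - identity
indirect = total - B

print(indirect[2, 0])  # Graz -> Bio -> Rich: (-0.5)(-0.8) = +0.40
print(total[2, 0])     # direct + indirect: -0.40 + 0.40 = 0.0, up to rounding
```

With only three variables the matrix form is overkill, but the same two lines handle models with many compound paths.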

In multiple regression and path models, results compensate for the correlations among predictors. For the example in Figure 30.3, variation in richness is explained by both variation in grazing and biomass. If grazing and biomass were uncorrelated, the variance explained (R2) would simply be the sum of the squared bivariate correlations. In such a case, the R2 for richness would be

$$R^2 = (0.0)^2 + (-0.6)^2 = 0.36$$

However, when predictors are correlated, the variance explained differs markedly from that estimated by a simple addition.

Box 30.3. Calculation of R2 in a multiple regression or path model.

Estimates of the predicted values can be calculated as follows:

$$\hat{y}_z = \beta_1 x_{1z} + \beta_2 x_{2z}$$

where $\beta_1$ and $\beta_2$ are standardized partial regression coefficients (see Box 30.2) and $x_{1z}$ and $x_{2z}$ are “z-transformed” predictor variables. As in bivariate regression, the values of the betas are those that satisfy the least squares criterion,

$$\min \sum (y - \hat{y})^2$$

Now, the R2 for our example can be calculated using the formula

$$R^2_{y \cdot 12} = \beta_1 (r_{y1}) + \beta_2 (r_{y2}),$$

where $r_{y1}$ and $r_{y2}$ are the bivariate correlations between y and x1 and y and x2. For our example in Figure 30.3,

$$R^2_{y \cdot 12} = (-0.4)(0.0) + (-0.8)(-0.6) = 0.48$$

To come full circle in this illustration, if we had the case where grazing (x1) and biomass (x2) were uncorrelated as first mentioned in this Box, then the bivariate and partial correlations would be equal (i.e., $\beta_1 = r_{y1}$ and $\beta_2 = r_{y2}$) and the above equation would reduce to

$$R^2_{y \cdot 12} = (r_{y1})^2 + (r_{y2})^2 = (0.0)^2 + (-0.6)^2 = 0.36$$
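The least squares criterion and the R2 formula in Box 30.3 can be checked against data. The sketch below builds artificial z-scores whose sample correlations equal those in Figure 30.3 exactly, then fits the regression by least squares. The data are synthetic, and the helper `z_scores_with_correlations` is a hypothetical name introduced only for this example.

```python
import numpy as np

def z_scores_with_correlations(R, n=200, seed=0):
    """Return an n x k array of z-scores whose sample correlation matrix is R."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, R.shape[0]))
    X -= X.mean(axis=0)              # center each column
    Q, _ = np.linalg.qr(X)           # orthonormal columns that remain centered
    L = np.linalg.cholesky(R)        # R = L @ L.T
    return Q @ L.T * np.sqrt(n - 1)  # unit variance (ddof=1), correlations R

# Correlations from Figure 30.3 (order: Graz, Bio, Rich).
R = np.array([
    [ 1.0, -0.5,  0.0],
    [-0.5,  1.0, -0.6],
    [ 0.0, -0.6,  1.0],
])
Z = z_scores_with_correlations(R)
X, y = Z[:, :2], Z[:, 2]

# lstsq finds the betas that minimize sum((y - y_hat)**2).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum(y ** 2)

r_y = R[:2, 2]                # bivariate correlations r_y1 and r_y2
print(beta)                   # close to [-0.4, -0.8]
print(r_squared, beta @ r_y)  # both close to 0.48
```

Both routes to R2 agree: one minus the residual share of variance, and the sum of each beta times its bivariate correlation.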

Table 30.2. Principal component loadings for the first five principal components (PC1 to PC5). Loadings greater than 0.3 are shown in bold to highlight patterns.

Variable  PC1    PC2    PC3    PC4    PC5
elev      -.18   .44    -.04   .55    -.26
Ca        .19    .47    -.29   -.14   .04
Mg        .22    .36    -.60   -.29   .25
Mn        .04    .52    .23    .32    -.11
Zn        .38    .20    .38    -.14   .22
K         .41    .10    .29    -.11   -.26
P         .42    .01    .32    -.13   -.01
pH        -.33   .20    .35    .03    .77
C         .40    -.19   -.09   .41    .12
N         .35    -.24   -.22   .52    .37
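Loadings like those in Table 30.2 can be approximated from the environmental correlations in Table 30.1. The sketch below assumes that the principal components were extracted from the correlation matrix and that the tabled loadings are unit-length eigenvector elements; neither assumption is confirmed by this transcript. Because the input correlations are rounded and eigenvector signs are arbitrary, the output may differ from the table in sign and in the second decimal place.

```python
import numpy as np

names = ["elev", "Ca", "Mg", "Mn", "Zn", "K", "P", "pH", "C", "N"]

# Lower triangle of the environmental correlations in Table 30.1.
lower = [
    [],
    [.42],
    [.21, .79],
    [.71, .66, .42],
    [-.10, .47, .45, .39],
    [-.20, .40, .35, .25, .69],
    [-.32, .32, .26, .08, .72, .74],
    [.43, -.02, -.14, .28, -.26, -.46, -.48],
    [-.44, .04, .15, -.19, .49, .55, .62, -.63],
    [-.46, -.05, .11, -.27, .31, .41, .51, -.53, .81],
]
R = np.eye(len(names))
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r

# eigh returns eigenvalues in ascending order; reorder so PC1 comes first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
loadings = eigenvectors[:, order]

explained = eigenvalues / eigenvalues.sum()
for name, row in zip(names, loadings[:, :5]):
    print(f"{name:>5}", np.round(row, 2))
print("proportion of variance:", np.round(explained[:5], 2))
```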

When moving from a multiple regression to a structural equation model, the underlying mathematics changes from

$$y = a_1 + b_1 x_1 + b_2 x_2$$

where y is Rich and x1 and x2 are Graz and Bio, respectively, to a structured set of simultaneous regression equations (hence, “structural equation” modeling)

$$y_1 = a_1 + b_{11} x_1 + b_{12} y_2$$

$$y_2 = a_2 + b_{21} x_1$$
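To make the two-equation system concrete, the sketch below simulates data from it and estimates each equation by ordinary least squares. It reads y1 as Rich, y2 as Bio, and x1 as Graz, following the path model in Figure 30.3. All parameter values are hypothetical, and equation-by-equation least squares is a stand-in that suits only a recursive model with observed variables and uncorrelated errors; structural equation software typically estimates all equations simultaneously.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical unstandardized parameters, chosen only for illustration.
a1, b11, b12 = 10.0, -2.0, -0.5  # y1 = a1 + b11*x1 + b12*y2
a2, b21 = 4.0, -1.5              # y2 = a2 + b21*x1

x1 = rng.integers(0, 2, size=n).astype(float)                 # Graz: 0 = no, 1 = yes
y2 = a2 + b21 * x1 + rng.normal(0.0, 1.0, size=n)             # Bio
y1 = a1 + b11 * x1 + b12 * y2 + rng.normal(0.0, 1.0, size=n)  # Rich

def ols(response, *predictors):
    """Least squares intercept and slopes for one equation."""
    design = np.column_stack([np.ones(len(response)), *predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

eq_y2 = ols(y2, x1)      # estimates of a2, b21
eq_y1 = ols(y1, x1, y2)  # estimates of a1, b11, b12
print(np.round(eq_y2, 2))
print(np.round(eq_y1, 2))
```

The key structural point is that y2 appears as a response in one equation and as a predictor in the other.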

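[Path diagram: environmental variables Mg, Mn, N, Ca, Zn, K, P, pH, C, and elev; principal components PC1, PC2, PC4, and PC5; response variables Axis1 and Axis2. Coefficients shown: -.45, -.28, .59, .25, -.13. R2 = .62 and R2 = .10.]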

Figure 30.4. Use of principal components in combination with multiple regression. PC1 through PC5 represent principal components of the environmental variables at left. Axis1 and Axis2 represent scores on NMS ordination axes of community data. Numbers along arrows are partial correlation coefficients.

[Two diagrams. Regression model (indicator variables only): x1 through x6 and y. Structural equation model (indicator and latent variables): indicators x1 through x6 and y; latent variables A, B, and C.]

Figure 30.5. Development of a structural equation model in contrast to a regression model. Boxes represent measured or indicator variables while ellipses represent conceptual or latent variables. In the structural equation model, the indicator variables are organized around the hypothesis that x1, x2, and x3 are different facets of a single underlying causal variable, A, while x4, x5, and x6 are different facets of the causal variable, B. Further, y represents an available estimate of the latent variable C.

[Path diagram: indicator variables Mg, Mn, N, Ca, Zn, K, P, pH, C, and elev; latent variables ELEV, MINRL, and HYDR.]

Figure 30.6. Initial (hypothesized) measurement model relating three latent variables and ten indicator variables.

[Path diagram: indicator variables Mg, Mn, N, Ca, Zn, K, P, pH, C, and elev; latent variables ELEV, MINRL, and HYDR.]

Figure 30.7. Modified measurement model.

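[Path diagram: indicator variables Mg, Mn, N, Ca, Zn, K, P, pH, C, and elev; latent variables ELEV, MINRL, HYDR, AXIS1, and AXIS2; measured axis scores a1 and a2. Coefficients shown: -.67, .71, -.33, .57, -.30, -.28, -.23. R2 = .63 and R2 = .09.]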

Figure 30.8. Final full model. Here a1 and a2 represent the measured ordination axis scores.

Measured   Latent factors
variable   ELEV    MINRL   HYDR    AXIS1   AXIS2
elev       1.00    ---     ---     ---     ---
Ca         ---     0.66    ---     ---     ---
Mg         ---     0.43    ---     ---     ---
Mn         ---     0.99    ---     ---     ---
Zn         ---     0.66    0.81    ---     ---
K          ---     0.53    0.84    ---     ---
P          ---     0.39    0.93    ---     ---
pH         ---     ---     -0.66   ---     ---
C          ---     ---     0.78    ---     ---
N          ---     ---     0.67    ---     ---
axis1      ---     ---     ---     1.00    ---
axis2      ---     ---     ---     ---     1.00

Table 30.3. Standardized factor loadings resulting from structural equation model analysis.

Table 30.4. Correlations among latent variables.

        ELEV    MINRL   HYDR    AXIS1   AXIS2
ELEV    1.00    ---     ---     ---     ---
MINRL   0.71    1.00    ---     ---     ---
HYDR    -0.67   -0.33   1.00    ---     ---
AXIS1   0.77    0.50    -0.67   1.00    ---
AXIS2   0.03    -0.14   -0.21   0.07    1.00
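Tables 30.3 and 30.4 together imply a correlation for every pair of indicators. In a factor model with uncorrelated measurement errors, the off-diagonal implied correlations are given by loadings × latent correlations × loadings transposed. The sketch below applies that standard formula. It assumes that "---" entries in Table 30.3 are zero loadings and that no correlated errors were included, which this transcript does not confirm.

```python
import numpy as np

indicators = ["elev", "Ca", "Mg", "Mn", "Zn", "K", "P", "pH", "C", "N"]

# Standardized loadings from Table 30.3; columns are ELEV, MINRL, HYDR.
# Assumption: "---" in the table means the loading is fixed at zero.
loadings = np.array([
    [1.00, 0.00,  0.00],  # elev
    [0.00, 0.66,  0.00],  # Ca
    [0.00, 0.43,  0.00],  # Mg
    [0.00, 0.99,  0.00],  # Mn
    [0.00, 0.66,  0.81],  # Zn
    [0.00, 0.53,  0.84],  # K
    [0.00, 0.39,  0.93],  # P
    [0.00, 0.00, -0.66],  # pH
    [0.00, 0.00,  0.78],  # C
    [0.00, 0.00,  0.67],  # N
])

# Correlations among ELEV, MINRL, and HYDR from Table 30.4.
phi = np.array([
    [ 1.00,  0.71, -0.67],
    [ 0.71,  1.00, -0.33],
    [-0.67, -0.33,  1.00],
])

# Off-diagonal entries are the model-implied indicator correlations.
implied = loadings @ phi @ loadings.T

def implied_r(a, b):
    """Model-implied correlation between two named indicators."""
    return implied[indicators.index(a), indicators.index(b)]

print(round(implied_r("Zn", "K"), 2))     # about 0.71
print(round(implied_r("elev", "Mn"), 2))  # about 0.70
```

For comparison, the observed correlations in Table 30.1 are .69 for Zn and K and .71 for elev and Mn. Pairs that the model reproduces poorly under these assumptions may indicate terms, such as correlated errors, that are not visible in the transcript.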

[Conceptual diagram: Environmental Conditions, Biomass, Competitive Exclusion, Potential Richness, and Realized Richness; pathways labeled A, B, C, and D.]

Figure 30.9. Initial conceptual model (from Gough et al. 1994).

[Path diagram: latent variables Abiotic, Disturb., Biomass, and Rich; indicator variables abio#1, abio#2, dist, mass, lt-lo, lt-hi, and rich.]

Figure 30.10. Initial structural equation model.

Figure 30.11. Initial results of confirmatory factor analysis of measurement model. Numbers are path coefficients, which represent partial regression coefficients and correlation coefficients.

Figure 30.12. Final results for confirmatory factor analysis of revised measurement model.

[Path diagram: latent variables Abiotic, Disturb., Biomass, Light, and Rich; indicator variables abio#1, abio#2, dist, mass, lt-lo, lt-hi, and rich.]

Figure 30.13. Revised structural equation model.

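[Path diagram: latent variables Abiotic, Disturb., Biomass, Light, and Rich; indicator variables abio#1, abio#2, dist, mass, lt-lo, lt-hi, and rich. Coefficients shown: .99, .99, 1.0, -.52, -.25, -.39, -.57 (+/-), -.51, -.68, .20, .30, .86, .98, .97. R2 = .61, R2 = .59, and R2 = .42.]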

Figure 30.14. Final structural equation model. Path coefficients shown represent partial regression and correlation coefficients. R2 values specify the amount of variance explained for the associated endogenous variable.

Table 30.5. Standardized total effects of predictors on predicted variables for the model in Figure 30.14. Total effects include both indirect and direct effects, and represent the sum of the strengths of all pathways between two variables. Numbers in parentheses are standard errors. Numbers in brackets are t values. Reprinted with permission from Grace and Pugesek (1997).

Predicted   Predictors
variables   abiotic    disturbance  biomass    light
biomass     -0.2533    -0.6846      --         --
            (0.0510)   (0.0743)
            [-4.97]    [-9.21]
light       0.1303     0.6553       -0.5144    --
            (0.0361)   (0.0648)     (0.0980)
            [3.86]     [10.11]      [-5.25]
richness    -0.4971    -0.1046      -0.0981    -0.5665
            (0.0602)   (0.0568)     (0.0971)   (0.0761)
            [-8.25]    [-1.84]      [-1.01]    [-7.44]
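The total effects in Table 30.5 can be partly decomposed by hand using the compound-path rule (strength of a compound path = product of path components). The sketch below makes two assumptions about the model structure that the transcript does not state outright: that abiotic conditions affect light only through biomass, and that disturbance affects light both directly and through biomass.

```python
# Standardized total effects taken from Table 30.5.
abiotic_to_biomass = -0.2533
disturbance_to_biomass = -0.6846
biomass_to_light = -0.5144
abiotic_to_light_total = 0.1303
disturbance_to_light_total = 0.6553

# Assumption: abiotic conditions reach light only through biomass,
# so the total effect is a single compound path.
via_biomass = abiotic_to_biomass * biomass_to_light
print(round(via_biomass, 4))  # 0.1303, matching the tabled total effect

# Assumption: disturbance reaches light directly and through biomass,
# so the direct path is the total effect minus the indirect path.
indirect = disturbance_to_biomass * biomass_to_light
implied_direct = disturbance_to_light_total - indirect
print(round(indirect, 2), round(implied_direct, 2))  # 0.35 0.3
```

A coefficient of .30 does appear among the values in Figure 30.14, although the transcript does not show which path it labels.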
