CRD, Strength of Association, Effect Size, Power, and Sample Size Calculations

BUSI 6480 Lecture 4

Distinction between Type I and Type III Sum of Squares

This printout is from Proc GLM using 5 Factors with two levels each. The dependent variable is House Price. Notice the section with Type I SS and Type III SS.

Test A is the overall F test.

Tests B through F are sequential (Type I) F tests.

Tests G through K are Type III F tests, which assume that the other variables are in the model.

Tests L through Q are t-tests based on Type III SS. Directional hypotheses can be tested.

Type I and III Tests

Type I tests are sequential tests based on the order in which the variables are entered. It is assumed that the variables listed above the variable being tested are included in the model.

Type III tests assume that all of the other variables are included in the model. Usually Type III sums of squares are the values presented in published articles.

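Note that the F tests labeled F and K (see slide #2) are the same. The last variable entered has all of the other variables listed above it, so its Type I and Type III tests coincide.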

t tests (slide #2, tests L through Q)

Note that the p-values for t-tests for the parameter estimates are the same as for the Type III F tests. This is because the t tests assume that all other variables are in the model.

Note that the square of the t statistic is equal to the Type III F statistic. The reason that the Parameter tests are provided is to display the regression coefficient and its standard error.

For a Factorial Experiment with an Orthogonal Design, Type I = Type III

For an orthogonal design, the Type I sum of squares and the Type III sum of squares will be the same since the sums of squares are independent.
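The claim above can be checked numerically with a small sketch. It assumes a main-effects model with two effect-coded factors, A and B, and uses made-up response values (not from the lecture data). The function names are illustrative. For A, the Type I SS (A entered first) ignores B, while the Type III SS adjusts A for B.

```python
# Sketch: sequential (Type I) vs. adjusted (Type III) SS with two predictors.
# The response values are hypothetical and used for illustration only.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def ss_entered_first(x, y):
    """Type I SS for x when x is the first variable in the model."""
    xc, yc = center(x), center(y)
    return dot(xc, yc) ** 2 / dot(xc, xc)

def ss_adjusted(x, other, y):
    """SS for x after adjusting for `other` (Type III in a two-predictor model)."""
    xc, oc, yc = center(x), center(other), center(y)
    slope = dot(xc, oc) / dot(oc, oc)
    x_resid = [a - slope * o for a, o in zip(xc, oc)]  # part of x not explained by other
    return dot(x_resid, yc) ** 2 / dot(x_resid, x_resid)

# Balanced 2 x 2 factorial, two observations per cell (effect coding).
A = [-1, -1, -1, -1, 1, 1, 1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
y = [10, 12, 15, 17, 20, 21, 30, 33]  # hypothetical responses

balanced_type1 = ss_entered_first(A, y)
balanced_type3 = ss_adjusted(A, B, y)

# Dropping the last observation makes the design non-orthogonal.
unbalanced_type1 = ss_entered_first(A[:-1], y[:-1])
unbalanced_type3 = ss_adjusted(A[:-1], B[:-1], y[:-1])

print(balanced_type1, balanced_type3)      # equal for the orthogonal design
print(unbalanced_type1, unbalanced_type3)  # differ once the design is unbalanced
```

In the balanced case the columns for A and B are uncorrelated, so adjusting for B changes nothing. Once a cell loses an observation the two sums of squares separate.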

Measures of Strength of Association

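For a CRD, omega squared ($\omega^2$) for the fixed effects model and the intraclass correlation ($\rho_I$) for the random effects model are defined as follows:

$\omega^2$ (or $\rho_I$) $= \sigma_\alpha^2 / (\sigma_\alpha^2 + \sigma_\varepsilon^2)$

where $\sigma_\alpha^2$ is the effect variance and $\sigma_\varepsilon^2$ is the error variance. For the fixed effects model, the effect variance is $\sum \alpha_j^2 / p$.

This measure indicates the proportion of the population variance in the dependent variable that is accounted for by the treatment levels.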

Cohen (1988) suggested guidelines for omega squared:

$\omega^2 = .010$ is a small association
$\omega^2 = .059$ (approx. .06) is a medium association
$\omega^2 = .138$ (approx. .14) is a large association

Formulas for $\omega^2$

For Fixed Effects Models:
$\omega^2 = (SSB - (p-1)MSE) / (SSTO + MSE)$
$\omega^2 = (p-1)(F-1) / ((p-1)(F-1) + np)$

For Random Effects Models:
$\rho_I = (MSB - MSE) / (MSB + (n-1)MSE)$
$\rho_I = (F-1) / (n-1+F)$

$\rho_I$ is the intraclass correlation of the random treatments.
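As a rough numerical check of these formulas, the sketch below rebuilds the ANOVA quantities from the group means (3, 3.5, 4.25, 6.25), cell size (8), and error standard deviation (1.476) used in this lecture's PROC GLMPOWER example. It assumes equal cell sizes. Applying the random effects formula to the same numbers is only for illustration.

```python
# Sketch: omega squared and intraclass correlation for a one-way CRD.
# Inputs come from the PROC GLMPOWER example in this lecture
# (p = 4 levels, n = 8 observations per level); equal n is assumed.
means = [3.0, 3.5, 4.25, 6.25]
n = 8
sd_error = 1.476

p = len(means)
grand_mean = sum(means) / p
ssb = n * sum((m - grand_mean) ** 2 for m in means)  # between-groups SS
mse = sd_error ** 2                                   # error mean square
sse = p * (n - 1) * mse                               # error SS
ssto = ssb + sse                                      # total SS
msb = ssb / (p - 1)
f_stat = msb / mse

# Fixed effects: two algebraically equivalent forms of omega squared.
omega_sq_ss = (ssb - (p - 1) * mse) / (ssto + mse)
omega_sq_f = (p - 1) * (f_stat - 1) / ((p - 1) * (f_stat - 1) + n * p)

# Random effects: two equivalent forms of the intraclass correlation.
rho_ms = (msb - mse) / (msb + (n - 1) * mse)
rho_f = (f_stat - 1) / (n - 1 + f_stat)

print(round(omega_sq_ss, 3), round(omega_sq_f, 3))
print(round(rho_ms, 3), round(rho_f, 3))
```

Both versions of each formula give the same value, which is a quick way to catch a transcription error in either one.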

Effect Size

Cohen (1988) popularized the term effect size. It is a standardized difference between the mean of a group and the overall mean.

$d_1 = (\mu - \mu_1) / \sigma$

The top graph has a larger effect size than the bottom graph.

Just a standardized difference.

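Formulas for Effect Size (represented by the letter f)

First recall that $E(MSB) = \sigma_\varepsilon^2 + n \sum \alpha_j^2 / (p-1)$ for a fixed effects model.

$f = \sqrt{(\sum (\mu_j - \mu)^2 / p) / \sigma_\varepsilon^2} = \sqrt{(\sum \alpha_j^2 / p) / \sigma_\varepsilon^2}$

$f = \sqrt{\omega^2 / (1 - \omega^2)}$

Note that $\sum \alpha_j^2 / p$ can be estimated by $((p-1)/(np)) (MSB - MSE)$. This comes from the E(MSB) above.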

Cohen (1988) suggested guidelines for the f measure of effect size:

f = .10 is a small effect size
f = .25 is a medium effect size
f = .40 (or larger) is a large effect size
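The two routes to f (from omega squared, or from the mean squares) can be sketched as follows. The function names are illustrative. Treating Cohen's benchmark values as bin boundaries is an assumption made here for convenience, not something Cohen prescribes. The inputs are the group means, cell size, and error standard deviation from this lecture's PROC GLMPOWER example.

```python
import math

def cohen_f_from_omega_sq(omega_sq):
    """f = sqrt(omega^2 / (1 - omega^2))."""
    return math.sqrt(omega_sq / (1.0 - omega_sq))

def cohen_f_from_mean_squares(msb, mse, n, p):
    """Estimate sum(alpha_j^2)/p by ((p-1)/(np)) * (MSB - MSE), then divide by MSE."""
    effect_var = (p - 1) / (n * p) * (msb - mse)
    return math.sqrt(effect_var / mse)

def cohen_label(f):
    """Assumption: treat Cohen's benchmarks (.10, .25, .40) as lower bin edges."""
    if f >= 0.40:
        return "large"
    if f >= 0.25:
        return "medium"
    if f >= 0.10:
        return "small"
    return "below small"

# Values from the lecture's example: p = 4 levels, n = 8 per level.
means = [3.0, 3.5, 4.25, 6.25]
n, p = 8, len(means)
grand_mean = sum(means) / p
msb = n * sum((m - grand_mean) ** 2 for m in means) / (p - 1)
mse = 1.476 ** 2
f_stat = msb / mse
omega_sq = (p - 1) * (f_stat - 1) / ((p - 1) * (f_stat - 1) + n * p)

f_from_omega = cohen_f_from_omega_sq(omega_sq)
f_from_ms = cohen_f_from_mean_squares(msb, mse, n, p)
print(round(f_from_omega, 3), round(f_from_ms, 3), cohen_label(f_from_ms))
```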

Introduction to Calculation of Power

The F statistic has a noncentrality parameter $\lambda = \sum \alpha_j^2 / (\sigma_\varepsilon^2 / n)$ when the null hypothesis is false.

The Tang charts in Table E.12 can be used to compute the power. To use the charts, the value of $\phi$ must be entered:

$\phi = \sqrt{\lambda / p} = \sqrt{(\sum \alpha_j^2 / p) / (\sigma_\varepsilon^2 / n)}$

Note that $\sum \alpha_j^2 / p$ can be estimated by $((p-1)/(np)) (MSB - MSE)$.
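A minimal sketch of this calculation, using the group means, cell size, and error standard deviation from this lecture's PROC GLMPOWER example and assuming equal cell sizes. The resulting value of phi is close to the 2.21 used in the Tang chart example. It only prepares the chart inputs and does not compute power itself.

```python
import math

# Inputs from the lecture's example (p = 4 levels, n = 8 per level).
means = [3.0, 3.5, 4.25, 6.25]
n = 8
sd_error = 1.476

p = len(means)
grand_mean = sum(means) / p
msb = n * sum((m - grand_mean) ** 2 for m in means) / (p - 1)
mse = sd_error ** 2

# Estimate of sum(alpha_j^2)/p from the mean squares.
effect_var = (p - 1) / (n * p) * (msb - mse)

# Noncentrality parameter: lambda = sum(alpha_j^2) / (sigma^2 / n).
lam = p * effect_var / (mse / n)

# Value entered on the Tang chart: phi = sqrt(lambda / p).
phi = math.sqrt(lam / p)

nu1 = p - 1        # numerator degrees of freedom
nu2 = p * (n - 1)  # denominator degrees of freedom
print(round(lam, 2), round(phi, 2), nu1, nu2)
```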

How to Increase Power

Power is related to the effect size (separation of means), the standard deviation of the residuals (sigma), the significance level (alpha), and the sample size. In general, there are four ways to increase power:

increase effect size
decrease residual variance
increase sample size
increase alpha

Example Using Tang’s Charts

Assume that the estimate of $\phi$ is 2.21 and that the numerator ($\nu_1$) and denominator ($\nu_2$) degrees of freedom for the F statistic are 3 and 28, respectively.

For an alpha value of .05, use the chart on page 818 and locate on the x-axis the tick mark after the 2 (that’s approx 2.21). Go up the chart vertically till you intersect the curve with 30 degrees of freedom (that’s the closest to 28 df). Approximate the power at .95.

Estimating Sample Size for a Specified Value of Power

Use the Tang Charts by substituting different values of n' into the formula

$\phi = \sqrt{n'} \sqrt{(\sum \alpha_j^2 / p) / \sigma_\varepsilon^2}$

Use a pilot study to estimate $\sigma_\varepsilon^2$ by MSE and $\sum \alpha_j^2 / p$ by $((p-1)/(np)) (MSB - MSE)$.

Two Other Ways to Estimate Sample Size Using Tang Charts

If there is no pilot study, figure out a desirable value for d (effect size) such as 1.5, then use

$\phi = \sqrt{n'} \sqrt{d^2 / (2p)}$

Or specify a value of $\omega^2$ (strength of the relationship) and use

$\phi = \sqrt{n'} \sqrt{\omega^2 / (1 - \omega^2)}$
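The trial-and-error search over n' can be organized as a small table of chart inputs. The sketch below assumes p = 4 levels and d = 1.5, and the candidate cell sizes are arbitrary. It only produces phi and the error degrees of freedom for each candidate. The power still has to be read from the Tang chart.

```python
import math

def phi_from_d(n_prime, d, p):
    """phi = sqrt(n') * sqrt(d^2 / (2p))."""
    return math.sqrt(n_prime) * math.sqrt(d ** 2 / (2 * p))

def phi_from_omega_sq(n_prime, omega_sq):
    """phi = sqrt(n') * sqrt(omega^2 / (1 - omega^2))."""
    return math.sqrt(n_prime) * math.sqrt(omega_sq / (1 - omega_sq))

p = 4    # number of treatment levels (assumed for this example)
d = 1.5  # desired standardized difference, as in the text

rows = []
for n_prime in (4, 8, 12, 16, 20):  # candidate cell sizes (arbitrary)
    nu2 = p * (n_prime - 1)         # error degrees of freedom for this n'
    rows.append((n_prime, nu2, round(phi_from_d(n_prime, d, p), 2)))

for row in rows:
    print(row)  # (n', nu2, phi) to look up on the chart with nu1 = p - 1
```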

Relationship of effect size f to noncentrality parameter $\lambda$

$f = \sqrt{\lambda / (np)}$. Note that np is the total number of observations.

SPSS provides the noncentrality parameter when an estimate of the effect size is requested. The actual effect size needs to be computed using the above formula.
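The conversion is a one-line calculation. In the sketch below the function name is illustrative and the noncentrality value is a made-up input, not actual SPSS output.

```python
import math

def f_from_noncentrality(lam, n_total):
    """Effect size f = sqrt(lambda / (n * p)), where n * p is the total sample size."""
    return math.sqrt(lam / n_total)

lam_example = 19.49  # hypothetical noncentrality parameter
n_total = 32         # total number of observations (n * p)

f_example = f_from_noncentrality(lam_example, n_total)
print(round(f_example, 2))
```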

Test for Presence of Trend

If the Levels can be expressed quantitatively (in an order), then the mean of each level can be estimated by

$\mu_j = \beta_0 + \beta_1 c_{1j} + \beta_2 c_{2j} + \beta_3 c_{3j}$ for a p = 4 level factor.

To check for Linear, Quadratic, and Cubic Trend components, the following contrasts can be used.

Trend Component     X1   X2   X3   X4
Linear (c1j)        -3   -1    1    3
Quadratic (c2j)      1   -1   -1    1
Cubic (c3j)         -1    3   -3    1
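To see what these coefficients do, the sketch below applies them to the four group means (3, 3.5, 4.25, 6.25, with n = 8 per level) from this lecture's PROC GLMPOWER example. It assumes equally spaced levels and equal cell sizes, and uses the standard single-degree-of-freedom formula SS = n * psi^2 / sum(c^2). Because the three contrasts are mutually orthogonal, their sums of squares add up to the between-groups sum of squares.

```python
from itertools import combinations

contrasts = {
    "linear":    [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic":     [-1, 3, -3, 1],
}
means = [3.0, 3.5, 4.25, 6.25]  # group means from the lecture's example
n = 8                           # observations per level

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Every pair of contrasts should be orthogonal (dot product of zero).
pairwise_dots = [dot(contrasts[a], contrasts[b])
                 for a, b in combinations(contrasts, 2)]

# Contrast estimate psi and its sum of squares for each trend component.
psi = {name: dot(c, means) for name, c in contrasts.items()}
ss = {name: n * psi[name] ** 2 / dot(c, c) for name, c in contrasts.items()}

grand_mean = sum(means) / len(means)
ssb = n * sum((m - grand_mean) ** 2 for m in means)

print(pairwise_dots)
print(psi)
print(ss, sum(ss.values()), ssb)
```

Most of the between-groups variation here falls on the linear component.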

Computing Effect Size in SAS

DM "Log;Clear;OUT;Clear;";

/* Read data */
PROC IMPORT OUT=WORK.KirkPage171
    DATAFILE="D:\KirkPage171Data.xls"
    DBMS=EXCEL2000 REPLACE;
    RANGE="Sheet1$";
    GETNAMES=Yes;
RUN;

proc print data=KirkPage171;
run;

/* One-way ANOVA; save the overall ANOVA table */
proc glm data=KirkPage171;
    class Level;
    model Response = Level;
    ods output overallanova = atable;
run; quit;

/* Compute omega squared and effect size from the ANOVA table */
data compute_effect;
    set atable end = eof;
    retain pminus1 fv;
    if _n_ = 1 then do;          /* Model row: df = p - 1 */
        pminus1 = df;
        fv = fvalue - 1;
    end;
    /* On the last row (Corrected Total), df + 1 = np */
    omega_sq = (pminus1)*fv / ((pminus1)*fv + df + 1);
    effect_size = sqrt(omega_sq / (1 - omega_sq));
    if NOT eof then delete;      /* keep only the last row */
    keep omega_sq effect_size;
run;

proc print data = compute_effect;
run;

To Get Power Estimate in SAS

data KirkPg171;
    input Level ResponseMean cellsize;
    datalines;
1 3 8
2 3.5 8
3 4.25 8
4 6.25 8
;
run;

proc glmpower data=KirkPg171;
    class Level;
    model ResponseMean = Level;
    power
        alpha  = .01 .05
        stddev = 1.476
        ntotal = 32
        power  = .;
run;

Output from SAS on Power

The GLMPOWER Procedure

Fixed Scenario Elements

Dependent Variable            ResponseMean
Source                        Level
Error Standard Deviation      1.476
Total Sample Size             32
Test Degrees of Freedom       3
Error Degrees of Freedom      28

Computed Power

Index   Alpha   Power
1       0.01    0.880
2       0.05    0.972

Power and Effect Size in SPSS

Click on Analyze > General Linear Model > Univariate

Click on Options and select Estimates of effect size and Observed Power

Output from SPSS

Power is given. To estimate the effect size f, convert the noncentrality parameter to f.

Estimate Trend Components Across Levels Using SAS

/* Integer polynomial coefficients */
proc glm data=KirkPage171;
    class Level;
    model Response = Level / noint;
    estimate 'Linear' Level -3 -1 1 3;
    contrast 'Quad '  Level 1 -1 -1 1;
    estimate 'Cubic ' Level -1 3 -3 1;
run; quit;

/* Same contrasts with decimal (unit-length) coefficients */
proc glm data=KirkPage171;
    class Level;
    model Response = Level / noint;
    estimate 'Linear' Level -.671 -.224 .224 .671;
    contrast 'Quad '  Level .5 -.5 -.5 .5;
    estimate 'Cubic ' Level -.224 .671 -.671 .224;
run; quit;

Note that the squared coefficients sum to 1 (within rounding) for the decimal coefficients.
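The decimal coefficients are the integer polynomial coefficients divided by their vector length. A minimal sketch of that rescaling (the function name is illustrative):

```python
import math

def normalize(coeffs):
    """Scale contrast coefficients so that their squares sum to 1."""
    length = math.sqrt(sum(c * c for c in coeffs))
    return [c / length for c in coeffs]

linear = normalize([-3, -1, 1, 3])
quadratic = normalize([1, -1, -1, 1])
cubic = normalize([-1, 3, -3, 1])

for name, coeffs in (("Linear", linear), ("Quad", quadratic), ("Cubic", cubic)):
    print(name, [round(c, 3) for c in coeffs])
```

Rescaling a contrast changes its estimate and standard error by the same factor, so the test of the contrast is unaffected.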

Estimating Trend Components Across Levels Using SPSS

Select Contrasts and choose Polynomial. The menu does not allow for specific values of the polynomial coefficients.

SPSS uses the following polynomial coefficients for 4 groups.

'Linear' Level -.671 -.224 .224 .671;
'Quad '  Level .5 -.5 -.5 .5;
'Cubic ' Level -.224 .671 -.671 .224;

SAS Proc Insight – Use to Get Quick Look at Data

PROC IMPORT OUT=WORK.KirkPage171
    DATAFILE="D:\KirkPage171Data.xls"
    DBMS=EXCEL2000 REPLACE;
    RANGE="Sheet1$";
    GETNAMES=Yes;
RUN;

proc Insight;
    Open KirkPage171;
    Fit Response = Level;
run; quit;

Power Plot in SAS

proc glmpower data=KirkPg203Ex2;
    class Level;
    model ResponseMean = Level;
    power
        alpha  = .01 .05
        stddev = 2.6618      /* sqrt(MSE) */
        ntotal = 30
        power  = .;
    plot x=n min=20 max=80;
run;

Stddev of ANOVA and total sample size need to be specified.

Plot for Alpha = .01 and .05

Contrasts can be added to procedures such as Proc Glmpower and Proc Multtest, but note the difference in syntax: Proc Glmpower requires the word Level before the coefficients, while Proc Multtest does not.

proc multtest data=myAnovaData Holm;
    class Level;
    /* Note: do not type Level before the coefficients as in Proc glmpower */
    contrast 'Grp1VersusGrp2' 1 -1 0;
    contrast 'Grp2VersusGrp3' 0 1 -1;
    contrast 'Grp1VersusGrp3' 1 0 -1;
    test mean(resp);         /* resp is the dependent variable */
run;

proc glmpower data=KirkPg203Ex2;
    class Level;
    model ResponseMean = Level;
    contrast 'Grp1VersusGrp2' Level 1 -1 0;   /* Note the word Level */
    contrast 'Grp2VersusGrp3' Level 0 1 -1;
    contrast 'Grp1VersusGrp3' Level 1 0 -1;
    power
        alpha  = .01 .05
        stddev = 2.6618      /* sqrt(MSE) */
        ntotal = 30
        power  = .;
    ods output output=mypoweroutput;
run;

Put a “.” for ntotal if you want the total sample size for a specified power

power alpha=.01 .05 stddev= 2.6618 /*sqrt(MSE)*/ ntotal= . power= .8;
