
ST 370 - Factorial Experiments and ANOVA

Chapter 4

Readings: Chapter 13.1-13.2, Chapter 14.1-14.4

Recap: So far we've learned:

• Why we want a 'random' sample and how to achieve it (Sampling Scheme)

• How to use randomization, replication, and control/blocking to create a valid experiment.

• Now we'll look at a specific type of experiment and how to investigate which factors are important.

Motivating Example: Mentos and Coke

Consider an experiment where we want to determine the effect of initial volume (591 ml, 1000 ml, or 2000 ml) on the % of coke expelled when Mentos (the freshmaker) are dropped in. A CRD was used.

What is the response? Factor(s)? Level(s)? Treatments?

Answer: The response is the % of coke expelled. The factor is initial volume, with three levels, giving three treatments: 591 ml, 1000 ml, and 2000 ml.

What parameters might answer our question?

Answer: The mean (or median) of the % of coke expelled for each treatment.
Suppose we collect data. Consider the following two hypothetical sets of boxplots for the data:

Which set of boxplots gives more evidence that the true means differ?

Answer: The right panel, because the boxplots show less variation within each treatment than those in the left panel.

Although we'll never know the true values of the parameters, we can use our sample data to estimate them.
How do we use these estimates to make a claim?

One-Way Analysis of Variance (ANOVA) Model (used to analyze a CRD): Consider the data below.

We 'fit' the following model to this completely randomized design:

Answer: Y_{ij} = μ + τ_i + ε_{ij}, for i = 1, ..., t and j = 1, ..., n. See the textbook (6th edition), page 542, equation (13-1). Here:

• μ is the overall mean;

• τ_i is the treatment effect of the i-th treatment, i.e. the deviation of the i-th treatment from the overall mean μ;

• μ_i = μ + τ_i is the mean of the i-th treatment;

• ε_{ij} are the random errors, assumed to be normally and independently distributed with mean zero and variance σ².

For model identifiability we need the constraint τ_1 + ... + τ_t = 0.
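As a small added step (not written out in the handout), the constraint is what ties μ to the treatment means: averaging μ_i = μ + τ_i over the t treatments gives

(1/t) Σ_{i=1}^{t} μ_i = μ + (1/t) Σ_{i=1}^{t} τ_i = μ,

so μ is the average of the treatment means and τ_i = μ_i − μ measures how far treatment i sits from that average.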
Is the factor important in our One-Way ANOVA model?

How can we estimate each treatment mean?

Answer: Use the sample mean for each treatment, Ȳ_i = (Y_{i1} + ... + Y_{in})/n, i = 1, ..., t. Then Ȳ_i is an estimate of μ_i; the grand mean Ȳ = (Y_{11} + ... + Y_{1n} + ... + Y_{t1} + ... + Y_{tn})/(nt) is an estimate of the overall mean μ; and Ȳ_i − Ȳ is an estimate of τ_i, i = 1, ..., t.

If these sample means differ by enough, what would this imply?

Answer: It implies some of the τ_i differ from the others; in other words, different treatments may give a different mean % of coke expelled when Mentos are added. Formally, we test the null hypothesis H_0: τ_1 = ... = τ_t = 0 against the alternative hypothesis H_a: τ_i ≠ 0 for at least one i.

Main effects - the difference(s) in the mean response when the factor goes from one level to another.

Here, we would have two elements of our main effect. Write down their true values and their estimates.

Answer: The main effects are treatment 1 vs. treatment 2 and treatment 1 vs. treatment 3. True values: μ_1 − μ_2 = τ_1 − τ_2 and μ_1 − μ_3 = τ_1 − τ_3. Estimates: Ȳ_1 − Ȳ_2 and Ȳ_1 − Ȳ_3.

If the differences are not large enough, we don't have evidence to reject the null hypothesis.
Ok, so now we can estimate the treatment means for the Mentos and Coke example. Does there appear to be evidence the factor is important? What information would help?

Answer: The estimated treatment means are Ȳ_1 = 0.599 (591 ml), Ȳ_2 = 0.550 (1000 ml), and Ȳ_3 = 0.538 (2000 ml), so the estimated main effects are Ȳ_1 − Ȳ_2 = 0.049 and Ȳ_1 − Ȳ_3 = 0.061. To judge whether these differences are large enough, variation matters: the variation between treatments and the variation within the same treatment.

One-Way ANOVA model in StatCrunch
Remember: Statistics is all about Variation!

Total amount of variation in the data: the total sum of squares,

SS(Total) = Σ_{i=1}^{t} Σ_{j=1}^{n} (Y_{ij} − Ȳ)²  (see page 544 of the textbook).

The ANOVA (Analysis of Variance) table splits up this total variation into different sources to help determine which sources are statistically significant.

The ANOVA table generally has 6 columns:

• Source: treatment (or the factor of interest), error, and total.

• SS: the sum of squares for each source.

• Df: Degrees of Freedom.

• MS: the mean square for each source, MS = SS/Df.

• F-stat: MS for treatment divided by MS for error.

• P-value: a probability; here, P(F-stat ≥ 5.6536 | H_0 is true).
In One-Way ANOVA we only have 2 sources we care about (recall sources of variation from the design of experiments section!):

• Treatment Effect

• Error

Source: Treatment Effect

• SS(Trt) = the treatment sum of squares, n Σ_{i=1}^{t} (Ȳ_i − Ȳ)².

• DF: For t treatments, we have t − 1 degrees of freedom. (Why t − 1? Recall the constraint τ_1 + ... + τ_t = 0.)

• Often called the between-treatment variation.

• MS(Trt) = SS(Trt)/(t − 1).

Source: Error

• SS(E) = the error sum of squares, Σ_{i=1}^{t} Σ_{j=1}^{n} (Y_{ij} − Ȳ_i)².

• DF: For N total observations and t treatments, we have N − t degrees of freedom; in the balanced case with n observations per treatment this is t(n − 1). The degrees of freedom work out because tn − 1 = (t − 1) + t(n − 1).

• Often called the within-treatment variation.

• MS(E) = SS(E)/[t(n − 1)]. MS(E) is an unbiased estimator of σ² regardless of whether or not the null hypothesis H_0 is true.
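A supporting fact for the balanced one-way model (standard in the textbook's Chapter 13 development; stated here for reference rather than derived in the handout): the expected mean squares are

E[MS(E)] = σ²   and   E[MS(Trt)] = σ² + n Σ_{i=1}^{t} τ_i² / (t − 1).

So MS(Trt) also estimates σ² when H_0: τ_1 = ... = τ_t = 0 holds, but tends to exceed σ² otherwise; this is what makes a large F-ratio = MS(Trt)/MS(E) evidence against H_0.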
Table for balanced one-way ANOVA:

Source        DF          SS         MS                        F
Treatments    t − 1       SS(T)      MS(T) = SS(T)/(t − 1)     F = MS(T)/MS(E)
Error         t(n − 1)    SS(E)      MS(E) = SS(E)/(N − t)
Total         nt − 1      SS(Tot)

where

SS(T) = Σ_{i=1}^{t} Σ_{j=1}^{n} (ȳ_{i·} − ȳ_{··})² = n Σ_{i=1}^{t} (ȳ_{i·} − ȳ_{··})²

SS(E) = Σ_{i=1}^{t} Σ_{j=1}^{n} (y_{ij} − ȳ_{i·})²

SS(Tot) = Σ_{i=1}^{t} Σ_{j=1}^{n} (y_{ij} − ȳ_{··})²

Notes: SS(T) is also called SS(Between) and SS(E) is also called SS(Within).

Treatment DF + Error DF = Total DF, and SS(Trt) + SS(E) = SS(Tot).
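A one-line justification of that last identity (worked out here; the handout just states it): add and subtract ȳ_{i·} inside each squared term,

Σ_{i} Σ_{j} (y_{ij} − ȳ_{··})² = Σ_{i} Σ_{j} [(y_{ij} − ȳ_{i·}) + (ȳ_{i·} − ȳ_{··})]² = SS(E) + SS(T),

because the cross-product term 2 Σ_{i} (ȳ_{i·} − ȳ_{··}) Σ_{j} (y_{ij} − ȳ_{i·}) vanishes: Σ_{j} (y_{ij} − ȳ_{i·}) = 0 within every treatment i.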

More on the F-ratio and P-value

Answer: Under H_0 the F-ratio has an F-distribution with t − 1 and t(n − 1) degrees of freedom. A larger F-ratio is stronger evidence against the null hypothesis. An intuitive explanation: the between-treatment variation (the numerator) is large relative to the within-treatment variation (the denominator), so the treatments contribute substantially to the variation in the response and are important. A more formal explanation: when the null hypothesis is true, MS(Trt) is also an unbiased estimator of σ²; when the alternative is true, the expected value of MS(Trt) is greater than σ². Therefore, under the alternative hypothesis, the expected value of the numerator of the F-ratio is greater than the expected value of the denominator, and we should reject H_0 when the F-ratio is large.

A larger F-ratio implies a smaller p-value, and the smaller the p-value, the stronger the evidence against the null hypothesis. In practice, we compare the p-value with a pre-determined α level, say α = 0.05 or 0.01. If the p-value ≤ α, we reject the null; otherwise, we do not have enough evidence to reject it.
Recall the boxplot idea: Consider the following two hypothetical sets of boxplots for the data:

We use the p-value to determine if the F-ratio is 'large' enough. If the p-value is less than a pre-specified value (usually 0.05), we say we have evidence that the main effect(s) are not all 0.

That is, we reject the null and conclude the factor (treatment) is important for the mean response.

Idea of a P-value

• P-values are defined as the probability of obtaining a result equal to or "more extreme" than what was actually observed, when the null hypothesis is true.

• Here, the p-value represents the strength of evidence for rejecting the null hypothesis.

• P-value for Initial Volume = 0.0148. Small! Since it is less than 0.05, we have strong evidence to conclude that H_0 is not true. Practical interpretation: there is strong evidence to conclude that the initial volume has an effect on the % of coke expelled when Mentos are added.

Goals of One-Way ANOVA

• Determine if the factor is related to the response.

• If so, estimate the main effects (factor level differences).
One-Way ANOVA Example: (some description taken from Goosen, 2014)

Consider having 24 pieces of cheese. Color of the cheese is important in terms of consumer satisfaction. We have interest in how the color differs for 4 different types of corn syrup (26, 42, 55, and 62) (4 treatments). A CRD design is decided upon and we randomly assign each corn syrup type to 6 pieces of cheese (6 replicates for each treatment).

As a response, we measure the color using a 3-part CIE L*a*b* Color System.

• 'L' reflects the lightness of a sample, from black (L = 0) to white (L = 100) and runs from top to bottom.

• 'a' defines the shades from red (positive values) to green (negative values).

• 'b' defines the shades from yellow (positive values) to blue (negative values).

All three of these could be treated as responses (and analyzed together), but for our purposes we will only look at the 'L' response variable.

Again, we will focus on the means of the population. How might we make inference here?

Define

• μ_1 = mean 'L' score for all pieces of cheese made with corn syrup 26.

• μ_2 = mean 'L' score for all pieces of cheese made with corn syrup 42.

• μ_3 = mean 'L' score for all pieces of cheese made with corn syrup 55.

• μ_4 = mean 'L' score for all pieces of cheese made with corn syrup 62.

1. What is our factor and what are the levels of that factor?

Answer: The factor is corn syrup type, with four levels: 26, 42, 55, and 62.

2. What hypothesis do we want to test?

Answer: That there is no difference in mean 'L' score between the 4 different types of corn syrup: H_0: μ_1 = μ_2 = μ_3 = μ_4 vs. H_a: at least two of them differ. Equivalently, H_0: τ_1 = ... = τ_4 = 0 vs. H_a: at least one τ_i ≠ 0.
3. The data are given below. Fill in the label column (labeling the observations in terms of y's). Data and labeling:

Corn Syrup   Replicate #   'L' measurement   Response Label
26           1             51.89             Y_11
26           2             51.52             Y_12
26           3             52.69             Y_13
26           4             52.06             Y_14
26           5             51.63             Y_15
26           6             52.73             Y_16
42           1             47.21             Y_21
42           2             48.57             Y_22
42           3             47.57             Y_23
42           4             46.85             Y_24
42           5             48.64             Y_25
42           6             47.49             Y_26
55           1             41.43             Y_31
55           2             42.31             Y_32
55           3             42.31             Y_33
55           4             41.49             Y_34
55           5             42.12             Y_35
55           6             42.65             Y_36
62           1             45.99             Y_41
62           2             46.66             Y_42
62           3             47.35             Y_43
62           4             45.83             Y_44
62           5             46.77             Y_45
62           6             47.88             Y_46
4. Summary statistics (from SAS) and a boxplot of the data are given below. Based on these summary measures, does there appear to be a relationship between the 'L' score and corn syrup amount? Justify your answer.

Analysis of this data can be done in SAS using the following code (assume the data is read in as cheese):
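For reference, here is one possible (hypothetical, not from the handout) data step that would create the cheese data set from the table in question 3, using the variable names syrup and L expected by the code below:

data cheese;
  input syrup L @@;   * read (corn syrup type, 'L' measurement) pairs, several pairs per line;
  datalines;
26 51.89 26 51.52 26 52.69 26 52.06 26 51.63 26 52.73
42 47.21 42 48.57 42 47.57 42 46.85 42 48.64 42 47.49
55 41.43 55 42.31 55 42.31 55 41.49 55 42.12 55 42.65
62 45.99 62 46.66 62 47.35 62 45.83 62 46.77 62 47.88
;
run;

With the data in place, the proc anova step below produces the ANOVA table and Tukey comparisons discussed in questions 5-12.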

proc anova data=cheese;

class syrup;

model L = syrup;

means syrup/tukey;

run;

5. Why do we have 3 degrees of freedom for the model (aka treatment)?

Answer: We have 4 treatments, so Df = 4 − 1 = 3.

6. What is the number 313.39 and what does that value represent for this data?

Answer: 313.39 is the total sum of squares, SS(Tot); it represents the total variation in the responses.
7. If the value for SS(Trt) were missing but you had SS(Tot) and SS(E), how could you find SS(Trt)?

Answer: SS(Trt) = SS(Tot) − SS(E).

8. What is the relationship between the Mean Square values and the F-ratio?

Answer: F-ratio = MS(Trt)/MS(E).

9. What does the MSE value of 0.41378 represent?

Answer: Here MSE means MS(E), the mean square for error; 0.41378 is an unbiased estimate of σ², regardless of whether or not the null hypothesis is correct.

10. Based on the p-value what is your conclusion about the 'hypothesis' we are testing?

Answer: Reject the null hypothesis in favor of the alternative.

11. What does the p-value mean in words for this experiment?

Answer: There is strong evidence to conclude that different corn syrup types give different mean 'L' scores.

12. Since we have a significant factor, we would want to estimate the main effects. Give an estimate of the main effect of syrup 42 vs syrup 26 and also for syrup 55 vs syrup 26.

Answer: 42 vs. 26: 47.72 − 52.09 = −4.37. 55 vs. 26: 42.05 − 52.09 = −10.04.
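As a check (worked from the data table in question 3; the handout does not show the calculation), the treatment means behind question 12 are

ȳ_{1·} = (51.89 + 51.52 + 52.69 + 52.06 + 51.63 + 52.73)/6 ≈ 52.09  (syrup 26),
ȳ_{2·} = (47.21 + 48.57 + 47.57 + 46.85 + 48.64 + 47.49)/6 ≈ 47.72  (syrup 42),
ȳ_{3·} = (41.43 + 42.31 + 42.31 + 41.49 + 42.12 + 42.65)/6 ≈ 42.05  (syrup 55),

so the estimated main effects are ȳ_{2·} − ȳ_{1·} ≈ −4.37 and ȳ_{3·} − ȳ_{1·} ≈ −10.04.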
Factorial designs

• This idea of modeling can be extended to when we look at more than one factor at a time.

• Factorial experiment: an experiment that includes two or more factors that the experimenter thinks may be important.

• Full factorial design (henceforth factorial design): experimental trials (or runs) are performed at all combinations of factor levels.

Example: A gardener wants to look at water and fertilizer on crop yield.

• factor A: Water (levels = Low, High)

• factor B: Fertilizer (levels = Nitrogen, Phosphate)

Treatments?

Answer: 4 treatments: Low+Nitrogen, Low+Phosphate, High+Nitrogen, High+Phosphate.

Note: In general, there is no need to look at all treatment combinations. However, we will only consider the full experiments.

Why do we want to look at more than one variable at a time?

Answer: Because many variables can affect the response, and they can have interaction effects, which cannot be studied one variable at a time. Factorial design is an extremely important tool for engineers and scientists who are interested in improving the performance of a manufacturing process. It also has extensive application in the development of new processes and in new product design.
Notation for factorial design

• 2x2 factorial design = 2 factors, each with 2 levels.

• 3x4 factorial design = 2 factors, the first with 3 levels and the second with 4 levels.

• 2x3x4 factorial design = 3 factors, the first with 2 levels, the second with 3 levels, and the third with 4 levels.

Total # of treatments is found by multiplying the #s of levels. How many treatments for each design above?

Answer: 4 in the first design, 12 in the second, and 24 in the third.

Recall: Goals of One-Way ANOVA

• Determine if the factor is related to the response.

• If so, estimate the main effects (factor level differences)

Goals of factorial data analysis?

Basically, which factors are important and then estimation of the appropriate effects:

• Determine which factor (or factors) are related to the response.

• For each significant factor, estimate the main effects.

• Determine whether there are interaction effects between factors (both graphically and numerically).

• If so, estimate the interaction effects.
Two-Way ANOVA example: Back to the Mentos and Coke Example

A CRD is run, the response is still percent of coke expelled, but consider a second factor in this experiment.

• Factor A: initial volume (591 ml, 1000 ml, or 2000 ml)

• Factor B: # of mentos (4 or 8)

What type of factorial design is this? How many total treatments?

Answer: A 3x2 factorial design, with 6 treatments.

What are the parameters of interest now?

Answer: The mean percent of coke expelled for each treatment.

We might assume the factors act independently of one another, called an additive (or main effects only) model.
Two-Way (additive) ANOVA Model

First, consider a different parametrization of the One-Way ANOVA Model.

Answer: Write μ_{ij} for the mean of treatment group ij (factor A at level i and factor B at level j), and let Y_{ijk} be the k-th observation in treatment group ij, k = 1, ..., n. The model is Y_{ijk} = μ_{ij} + ε_{ijk}, where ε_{ijk} is a random error component having a normal distribution with mean zero and variance σ².

Back to the Two-Way (additive) ANOVA model. Assuming the factors act independently, the two-way additive ANOVA model is

Answer: μ_{ij} = μ + α_i + β_j, where μ is the overall mean, α_i is the effect of the i-th level of factor A (i = 1, ..., a), and β_j is the effect of the j-th level of factor B (j = 1, ..., b). For identifiability we assume α_1 + ... + α_a = 0 and β_1 + ... + β_b = 0.
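A possible SAS sketch of fitting this additive model, analogous to the proc glm examples later in these notes. The data set and variable names (mentosCoke, volume, mentos, pctExpelled) are hypothetical stand-ins, not from the handout:

proc glm data=mentosCoke;              * hypothetical data set with one row per run;
  class volume mentos;                 * both factors are treated as categorical (class) variables;
  model pctExpelled = volume mentos;   * additive model: main effects only, no interaction term;
run;

Writing the model as pctExpelled = volume|mentos instead would add the volume*mentos interaction term, which is the interaction model discussed below.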
Now we can get the 'treatment means' with these parameters:

• μ_{11} = Mean for 591 ml/4 mentos group = μ + α_1 + β_1

• μ_{21} = Mean for 1000 ml/4 mentos group = μ + α_2 + β_1

• μ_{31} = Mean for 2000 ml/4 mentos group = μ + α_3 + β_1

• μ_{12} = Mean for 591 ml/8 mentos group = μ + α_1 + β_2

• μ_{22} = Mean for 1000 ml/8 mentos group = μ + α_2 + β_2

• μ_{32} = Mean for 2000 ml/8 mentos group = μ + α_3 + β_2

How can we estimate each treatment mean?

• μ_{11} is estimated by ȳ_{11·}

• μ_{21} is estimated by ȳ_{21·}

• μ_{31} is estimated by ȳ_{31·}

• μ_{12} is estimated by ȳ_{12·}

• μ_{22} is estimated by ȳ_{22·}

• μ_{32} is estimated by ȳ_{32·}

We can also define the marginal averages ȳ_{i··} (i = 1, ..., a), ȳ_{·j·} (j = 1, ..., b), and the grand mean ȳ_{···}. Then ȳ_{···} estimates μ, ȳ_{i··} − ȳ_{···} estimates α_i, and ȳ_{·j·} − ȳ_{···} estimates β_j.

How to determine if the Factors are Important?

If the sample means involving the same level of factor B differ by enough (i.e., fixing factor B at a given level and comparing across levels of A), then we would say factor A is important!

If the sample means involving the same level of factor A differ by enough (i.e., fixing factor A at a given level and comparing across levels of B), then we would say factor B is important.

These are investigations of the main effects!
Let's estimate the treatment means and main effects for this example.

How to fit the Two-Way (additive) ANOVA model in StatCrunch?

More about the P-values in this model

Probability of getting our sample estimates or further apart assuming there is no effect from that factor. (We usually compare against an α-level of 0.05.)

• P-value for # Mentos = 0.5777: not significant (no evidence this factor has an effect on the mean response).

• P-value for initial Volume = 0.0183: significant (this factor has an effect on the mean response).

Interactions - We've been assuming the factors act independently.

Note: You should know how to make an interaction plot by hand! Put the levels of one factor on the X-axis, plot the treatment means, and connect the means that share the same level of the other factor with a line; give each line its own index, e.g. a different color (red vs. blue) or line style (dotted vs. solid) for each level of the other factor.
If the lines are parallel, there is no visual evidence of interaction.

If the lines are not parallel, there is visual evidence of interaction.

Why?

Answer: If the treatment effects of the two factors are additive, then when we fix one factor at a given level (say level i of factor A, with effect α_i), the difference in mean response between the levels of the other factor is the same constant for every i = 1, ..., a. In the previous example, when the two-way additive ANOVA model is correct, μ_{i1} − μ_{i2} = β_1 − β_2 for i = 1, 2, 3. Similarly, μ_{1j} − μ_{2j} = α_1 − α_2 for j = 1, 2, and μ_{1j} − μ_{3j} = α_1 − α_3 for j = 1, 2. Constant differences are exactly what parallel lines show.

Creating Interaction plots

1. Calculate a table of means for each treatment group.

2. On the X-axis put the levels of one factor.

3. The Y-axis represents the means of each treatment group.

4. Draw a point for each treatment mean.

5. Connect treatment means that have the same level of the other factor.
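A possible SAS sketch of such a plot (not in the handout; it reuses the hypothetical mentosCoke data set and variable names from the earlier proc glm sketch):

proc sgplot data=mentosCoke;
  * one line per level of mentos; each point is the mean response at that initial volume;
  vline volume / response=pctExpelled stat=mean group=mentos markers;
run;

StatCrunch produces the same kind of plot when the interaction plot box is checked, as noted just below.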
Let's recreate the interaction plots for the Mentos experiment.

Note: To get the interaction plots in StatCrunch, simply check the box.

If you hit Next you can then put the plots side by side if you like.
Recap:

• To investigate the interactions in the interaction plots:

– Factors A and B interact when the effect of factor A depends on the levels of factor B.

– So we look for parallelism (or lack thereof).

– There appears to be an interaction in the Mentos and Coke interaction plots, but remember this is just a visual check!

• We can also visually look at main effects using the interaction plots:

Modeling Interactions in Two-Way ANOVA

More general than the additive model, we may consider a model with interaction effects:

• call our factors "factor A" and "factor B"

• we need to model: Y_{ijk} = μ + α_i + β_j + (αβ)_{ij} + ε_{ijk}, for i = 1, ..., a; j = 1, ..., b; k = 1, ..., n, with the constraints α_1 + ... + α_a = 0, β_1 + ... + β_b = 0, (αβ)_{1j} + ... + (αβ)_{aj} = 0 for each j, and (αβ)_{i1} + ... + (αβ)_{ib} = 0 for each i.

Back to the Mentos and Coke Example. Same 6 parameters of interest! Now we just model them differently.

• μ = overall mean % volume lost

• α_1, α_2, α_3, β_1, β_2 same as before

• (αβ)_{11} = interaction effect for 591 ml and 4 mentos

• (αβ)_{12} = interaction effect for 591 ml and 8 mentos

• (αβ)_{21} = interaction effect for 1000 ml and 4 mentos

• (αβ)_{22} = interaction effect for 1000 ml and 8 mentos

• (αβ)_{31} = interaction effect for 2000 ml and 4 mentos

• (αβ)_{32} = interaction effect for 2000 ml and 8 mentos

If there is no interaction, then all (αβ)_{ij} = 0 and we get back the additive model!

Estimates: (αβ)_{ij} is estimated by ȳ_{ij·} − ȳ_{i··} − ȳ_{·j·} + ȳ_{···}. Why? Recall μ_{ij} = μ + α_i + β_j + (αβ)_{ij}; we use ȳ_{ij·} to estimate μ_{ij}, ȳ_{i··} − ȳ_{···} to estimate α_i, ȳ_{·j·} − ȳ_{···} to estimate β_j, and ȳ_{···} to estimate μ, so the interaction estimate is what remains.
To fit the interaction model in StatCrunch, uncheck the box that says additive model.

Interpreting the p-value on the interaction:

Answer: The interaction effect is not significant (p-value > 0.05). We may conclude that there is no interaction effect between the # of mentos and the initial volume on the mean response.
What to report when you have an interaction and when you don't

Answer: When there is an interaction, report "simple" effects; when there is no interaction, report only the main effects of the significant factors.

For the Mentos example: the 2 main effects of initial volume are 591 ml vs. 1000 ml (0.599 − 0.550) and 591 ml vs. 2000 ml (0.599 − 0.538). The main effects of # of Mentos are not significant, and the interaction effects between initial volume and # of Mentos are not significant, so neither is reported.
More on the ANOVA table in the Two-Factor Model

To construct the ANOVA table, we will still take the total sum of squares and split it up. Now we split it into a few more parts than previously:

ANOVA table for a×b (balanced) factorial experiment

Source   df               SS        MS                                 F-stat
A        a − 1            SS(A)     MS(A) = SS(A)/(a − 1)              MS(A)/MS(E)
B        b − 1            SS(B)     MS(B) = SS(B)/(b − 1)              MS(B)/MS(E)
AB       (a − 1)(b − 1)   SS(AB)    MS(AB) = SS(AB)/((a − 1)(b − 1))   MS(AB)/MS(E)
Error    ab(n − 1)        SS(E)     MS(E) = SS(E)/(ab(n − 1))
Total    N − 1            SS(Tot)

Note: MS(E) is always an unbiased estimate of σ².
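As a quick check of the table (a small worked step, not in the handout), the degrees of freedom add up to the total for a balanced design with n replicates per treatment, since N = abn:

(a − 1) + (b − 1) + (a − 1)(b − 1) + ab(n − 1) = (ab − 1) + ab(n − 1) = abn − 1 = N − 1.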
Steps for Investigating Two-Way ANOVA

When fitting a model with interactions:

1. First check if the interaction is significant.

• If significant, then both factors are important.

• We should report simple effects.

2. If the interaction is not significant, look at the main effects of each factor.

• For each main effect that is significant, report the fitted main effects.

Example - An entomologist records energy expended (Y) by N = 18 honeybees at a = 3 temperature (A) levels (20, 30, 40°C) consuming liquids with b = 2 levels of sucrose concentration (B) (20%, 40%) in a balanced, completely randomized, crossed 3×2 design. The data are given below:

Temp   Suc   Sample
20     20    y_111 = 3.1    y_112 = 3.7    y_113 = 4.7
20     40    y_121 = 5.5    y_122 = 6.7    y_123 = 7.3
30     20    y_211 = 6      y_212 = 6.9    y_213 = 7.5
30     40    y_221 = 11.5   y_222 = 12.9   y_223 = 13.4
40     20    y_311 = 7.7    y_312 = 8.3    y_313 = 9.5
40     40    y_321 = 15.7   y_322 = 14.3   y_323 = 15.9

proc glm data=ent;

class Temp Suc;

model Energy=Temp|Suc; *Vertical Bar fits all combinations of Temp and Suc (main effects and interactions);

run;

(In the model statement, Energy = Temp|Suc fits the α (Temp), β (Suc), and (αβ) (Temp*Suc interaction) effects.)

Marginal means computed from the data: for temperature, ȳ_{1··} ≈ 5.17 (20°C), ȳ_{2··} = 9.7 (30°C), ȳ_{3··} = 11.9 (40°C); for sucrose concentration, ȳ_{·1·} ≈ 6.38 (20%), ȳ_{·2·} ≈ 11.47 (40%); and the grand mean is ȳ_{···} ≈ 8.92.
1. What are the factor(s), levels, and treatments?

Answer: The factors are temperature (3 levels: 20, 30, 40°C) and sucrose concentration (2 levels: 20%, 40%), for a total of 6 treatments.

2. Construct an interaction plot for this experiment.

Answer: You need to know how to do this by hand and how to do it in StatCrunch, as shown before.

3. Which factors are important and why? State the hypothesis you are testing.

Answer: Both main effects and the interaction effect of the two factors are important, since their p-values are very small (less than 0.05). The hypotheses tested are H_0: α_1 = ... = α_a = 0, H_0: β_1 = ... = β_b = 0, and H_0: (αβ)_{11} = ... = (αβ)_{ab} = 0, each against the alternative that at least one of the effects is nonzero.

4. Report the appropriate effects given your response to the previous question.

Answer: Marginal (main) effects of temperature: α_1 is estimated by ȳ_{1··} − ȳ_{···} ≈ −3.76, α_2 by ≈ 0.78, and α_3 by ≈ 2.98. Marginal effects of sucrose concentration: β_1 is estimated by ȳ_{·1·} − ȳ_{···} ≈ −2.54 and β_2 by ≈ 2.54. Interaction effects: (αβ)_{11} is estimated by ȳ_{11·} − ȳ_{1··} − ȳ_{·1·} + ȳ_{···} ≈ 1.21, and (αβ)_{21} by ȳ_{21·} − ȳ_{2··} − ȳ_{·1·} + ȳ_{···} ≈ −0.36. The remaining interaction estimates follow from the constraints: (αβ)_{31} ≈ −(1.21 − 0.36) ≈ −0.86, and (αβ)_{12} ≈ −1.21, (αβ)_{22} ≈ 0.36, (αβ)_{32} ≈ 0.86.

5. What does the MSE represent for this model?

Answer: MSE = MS(E) = 0.7722, an unbiased estimate of σ².
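As a check of a couple of these numbers (worked directly from the data table; the handout does not show the arithmetic): the 20°C/20% cell total is 3.1 + 3.7 + 4.7 = 11.5, the six 20°C observations total 31, the nine 20% observations total 57.4, and all 18 observations total 160.6, so

ȳ_{11·} = 11.5/3 ≈ 3.83,   ȳ_{1··} = 31/6 ≈ 5.17,   ȳ_{·1·} = 57.4/9 ≈ 6.38,   ȳ_{···} = 160.6/18 ≈ 8.92,

and the first interaction estimate is (αβ)_{11} ≈ 11.5/3 − 31/6 − 57.4/9 + 160.6/18 = 21.8/18 ≈ 1.21.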
Example: Tomato Yield

Yields on 36 tomato crops from a balanced, complete, crossed design with a = 3 varieties (A) at b = 4 planting densities (B):

Variety   Density (k/hectare)   Sample
1         10                    7.9    9.2    10.5
2         10                    8.1    8.6    10.1
3         10                    15.3   16.1   17.5
1         20                    11.2   12.8   13.3
2         20                    11.5   12.7   13.7
3         20                    16.6   18.5   19.2
1         30                    12.1   12.6   14.0
2         30                    13.7   14.4   15.4
3         30                    18.0   20.8   21.0
1         40                    9.1    10.8   12.5
2         40                    11.3   12.5   14.5
3         40                    17.2   18.4   18.9

proc glm data=tomato; class variety density;

model Yield=Variety|Density;

run;

1. What are the factor(s), levels, and treatments?

Answer: The factors are variety (3 levels) and density (4 levels), so this is a 3x4 factorial design with 12 treatments and 3 replicates per treatment.
2. Which factors are important and why? State the hypothesis(es) you are testing.

Answer: The main effects of both factors are important since their p-values are small (less than 0.05), but the interaction between the two factors is not significant (p-value > 0.05). The hypotheses tested are H_0: α_1 = ... = α_a = 0, H_0: β_1 = ... = β_b = 0, and H_0: (αβ)_{11} = ... = (αβ)_{ab} = 0, each against the alternative that at least one of the effects is nonzero.

3. Report the appropriate effects given your response to the previous question.

Answer: Since the interaction is not significant, there are no interaction effects to report; we report the main effects of each significant factor. The 2 main effects of factor A (variety): 1 vs. 2: 11.33 − 12.21 = −0.88; 1 vs. 3: 11.33 − 18.13 = −6.80. The 3 main effects of factor B (density): 10 vs. 20: 11.48 − 14.39 = −2.91; 10 vs. 30: 11.48 − 15.78 = −4.30; 10 vs. 40: 11.48 − 13.91 = −2.43.

4. What does the MSE represent for this model?

Answer: MSE = 1.585 is an unbiased estimate of σ².
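As a check (worked from the tomato data table; the arithmetic is not shown in the handout), the marginal means used above are

ȳ_{1··} = (7.9 + 9.2 + 10.5 + 11.2 + 12.8 + 13.3 + 12.1 + 12.6 + 14.0 + 9.1 + 10.8 + 12.5)/12 = 136.0/12 ≈ 11.33,

and similarly ȳ_{2··} = 146.5/12 ≈ 12.21 and ȳ_{3··} = 217.5/12 ≈ 18.13 for the varieties, and ȳ_{·1·} = 103.3/9 ≈ 11.48, ȳ_{·2·} = 129.5/9 ≈ 14.39, ȳ_{·3·} = 142.0/9 ≈ 15.78, ȳ_{·4·} = 125.2/9 ≈ 13.91 for the four planting densities.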