anova one way
Analysis of variance
ONE WAY ANOVA
It is used to estimate and compare the effects of different treatments on the response variable.
We do this by estimating and comparing the treatment means.
It applies when we compare more than two groups based on one factor.
E.g., suppose a motor oil company would like to compare the effects of three oil types, A, B and C, on the mileage obtained by midsize cars. The company randomly selects 5 cars for each oil type from 1000 cars. There is only one factor: oil type.
Assumptions of one-way ANOVA
1. Constant variance: the populations of values of the response variable associated with the treatments have equal variances.
2. Normality: the populations of values of the response variable associated with the treatments all have normal distributions.
3. Independence: the samples of experimental units associated with the treatments are randomly selected, independent samples.
Testing for significant differences between treatment means
Suppose the oil company gets the following results.
We have n=15 observations and p=3 treatments.
Car     Oil type A   Oil type B   Oil type C
1       34.0         35.3         33.3
2       35.0         36.5         34.0
3       34.3         36.4         34.7
4       35.5         37.0         33.0
5       35.8         37.6         34.9
Sum     174.6        182.8        169.9
Mean    34.92        36.56        33.98
Overall mean of all 15 observations: 35.153
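The sums and means in the table can be checked with a short script. This is a minimal sketch in plain Python; the names `mileage`, `sums`, `means` and `grand_mean` are illustrative and do not come from the slides.

```python
# Mileage data from the table (mpg), keyed by oil type.
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

# Per-treatment sums and sample means.
sums = {oil: sum(values) for oil, values in mileage.items()}
means = {oil: sums[oil] / len(mileage[oil]) for oil in mileage}

# Overall mean across all n = 15 observations.
n = sum(len(values) for values in mileage.values())
grand_mean = sum(sums.values()) / n

for oil in mileage:
    print(f"Oil type {oil}: sum = {sums[oil]:.1f}, mean = {means[oil]:.2f}")
print(f"Overall mean = {grand_mean:.3f}")
```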
Step 1: Determine the null hypothesis and the alternative hypothesis.
H0: µ1=µ2=µ3
Ha: at least two of µ1, µ2, µ3 differ
To test H0, we compare the between-treatment variability to the within-treatment variability.
Between-treatment variability is the variability of the sample means from sample to sample.
Within-treatment variability is the variability of the observed values within each sample.
Step 2: In order to numerically compare the between-treatment and within-treatment variability, we define sums of squares and mean squares.
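The treatment sum of squares (SST) measures the variability of the sample treatment means:
$SST = \sum_{i=1}^{p} n_i (\bar{x}_i - \bar{x})^2$
where $n_i$ is the size of sample i, $\bar{x}_i$ is the mean of sample i, and $\bar{x}$ is the overall mean.
So SST = 5(34.92-35.153)^2 + 5(36.56-35.153)^2 + 5(33.98-35.153)^2 = 17.0493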
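As a check on the arithmetic, the treatment sum of squares can be computed directly from the data. This is a sketch under the assumption that the 15 observations are exactly those listed in the results table; the variable names are illustrative.

```python
# Mileage data from the results table (mpg), keyed by oil type.
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

# Overall mean of all observations.
all_values = [x for values in mileage.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# SST: for each treatment, sample size times the squared deviation of the
# treatment mean from the overall mean, summed over treatments.
sst = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in mileage.values()
)
print(f"SST = {sst:.4f}")
```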
Step 2: The error sum of squares (SSE) measures the within-treatment variability:
$SSE = \sum_{i=1}^{p} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$
We compute the SSE by calculating the squared difference between each observed value and its own treatment mean, then adding these up.
So SSE = (34.0-34.92)^2 + (35.0-34.92)^2 + ... + (34.9-33.98)^2 = 8.028
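The error sum of squares can be checked the same way. This is a minimal sketch that again assumes the observations listed in the results table; `sse_by_oil` is an illustrative name used to show each treatment's contribution.

```python
# Mileage data from the results table (mpg), keyed by oil type.
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

# Within each treatment, sum the squared deviations from that treatment's mean.
sse_by_oil = {}
for oil, values in mileage.items():
    mean = sum(values) / len(values)
    sse_by_oil[oil] = sum((x - mean) ** 2 for x in values)

# SSE is the total of the within-treatment sums of squares.
sse = sum(sse_by_oil.values())
print(f"SSE = {sse:.3f}")
```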
Step 3: We define a sum of squares that measures the total amount of variability in the observed values of the response, the total sum of squares (SSTO):
SSTO = SST + SSE = 17.0493 + 8.028 = 25.0773
Using the treatment and error sums of squares, we next define two mean squares.
The treatment mean square is MST = SST/(p-1) = 17.0493/(3-1) = 8.525
The error mean square is MSE = SSE/(n-p) = 8.028/(15-3) = 0.669
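The identity SSTO = SST + SSE can be verified numerically by computing SSTO directly as the sum of squared deviations of every observation from the overall mean. The sketch below does this and then forms the two mean squares; the variable names are illustrative.

```python
# Mileage data from the results table (mpg), keyed by oil type.
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

all_values = [x for values in mileage.values() for x in values]
n, p = len(all_values), len(mileage)
grand_mean = sum(all_values) / n

# Between-treatment and within-treatment sums of squares.
sst = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in mileage.values())
sse = sum((x - sum(v) / len(v)) ** 2 for v in mileage.values() for x in v)

# Total sum of squares computed directly from the overall mean.
ssto = sum((x - grand_mean) ** 2 for x in all_values)

# Mean squares: each sum of squares divided by its degrees of freedom.
mst = sst / (p - 1)
mse = sse / (n - p)

print(f"SSTO = {ssto:.4f}, SST + SSE = {sst + sse:.4f}")
print(f"MST = {mst:.3f}, MSE = {mse:.3f}")
```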
Then we calculate the F value.
Define the F statistic
$F = \frac{MST}{MSE} = \frac{SST/(p-1)}{SSE/(n-p)}$
and its p-value to be the area under the F curve with p-1 and n-p degrees of freedom to the right of F. We reject H0 at significance level α if either of the following equivalent conditions holds:
1. F > Fα
2. p-value < α
It follows that F = 8.525/0.669 = 12.74.
In order to test H0 at the 0.05 level of significance, we use F.05 with p-1 = 3-1 = 2 numerator and n-p = 15-3 = 12 denominator degrees of freedom. From the table we get F.05 = 3.89.
So we have F = 12.74 > F.05 = 3.89. Therefore, we reject H0 at the 0.05 level of significance. In other words, we conclude that at least two of oil types A, B and C have different effects on the mean mileage.
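The test can be sketched in a few lines starting from the sums of squares quoted above. The critical value 3.89 is the table value from the text. The p-value line is an addition not found in the slides: it uses a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here. For other cases a statistics library would be needed.

```python
# Sums of squares and sample sizes quoted in the text.
sst, sse = 17.0493, 8.028
p, n = 3, 15

# Mean squares and the F statistic.
mst = sst / (p - 1)
mse = sse / (n - p)
f_stat = mst / mse

# Critical value F.05 with 2 and 12 degrees of freedom, taken from the text.
f_crit = 3.89

# Assumption (not in the slides): when the numerator df is exactly 2, the area
# to the right of F under the F(2, df2) curve is (1 + 2F/df2) ** (-df2/2).
df2 = n - p
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

# Both rejection rules from the text.
reject_by_f = f_stat > f_crit
reject_by_p = p_value < 0.05

print(f"F = {f_stat:.2f}, critical value = {f_crit}, p-value = {p_value:.4f}")
print("Reject H0" if reject_by_f and reject_by_p else "Do not reject H0")
```

Both rules lead to the same decision here, which is what the equivalence of the two conditions predicts.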
Pairwise comparison
If the one-way ANOVA F test says that at least two treatment means differ, we estimate how large the differences are by comparing treatment means two at a time. In our example we might estimate the pairwise difference µA-µB, which is the change in mean mileage achieved by changing from B to A.
There are two approaches to calculating intervals for pairwise differences:
1. Individual: a 100(1-α) percent confidence interval for each pairwise difference µi-µh is
$[(\bar{x}_i - \bar{x}_h) \pm t_{\alpha/2} \sqrt{MSE (1/n_i + 1/n_h)}]$
where tα/2 is based on n – p degrees of freedom.
2. Simultaneous confidence intervals: such intervals make us 100(1-α) percent confident that all of the pairwise differences are simultaneously contained in their respective intervals. There are several kinds, but the Tukey formula is the most commonly used:
$[(\bar{x}_i - \bar{x}_h) \pm q_{\alpha} \sqrt{MSE/m}]$
where qα is the upper α percentage point of the studentized range for p and (n – p), obtained from a table, and m is the common size of each sample.
E.g., in the oil mileage example we are comparing p=3 treatments, each with sample size m=5, for a total of n=15 observations. MSE=0.669, and q.05=3.77 from the table entry corresponding to p=3 and n-p=12.
The Tukey simultaneous 95% confidence interval for µB-µA is
$[(36.56 - 34.92) \pm 3.77 \sqrt{0.669/5}] = [0.261, 3.019]$
This interval makes us simultaneously 95% confident that changing from oil type A to oil type B increases mean mileage by between 0.261 and 3.019 mpg.
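The two interval formulas can be compared side by side for µB-µA. This is a sketch starting from the sample means, MSE and q.05 quoted in the text. The t value 2.179 (t.025 with 12 degrees of freedom) is an assumption taken from a standard t table, since the slides do not give it, and the function names are illustrative.

```python
import math

# Summary values quoted in the text.
means = {"A": 34.92, "B": 36.56, "C": 33.98}
mse, m = 0.669, 5
q_05 = 3.77    # studentized range point for p=3 and n-p=12, from the text

# Assumption: t.025 with 12 degrees of freedom from a standard t table;
# this value does not appear in the slides.
t_025 = 2.179

def individual_interval(mean_i, mean_h, n_i, n_h):
    """Individual 95% interval for one pairwise difference."""
    diff = mean_i - mean_h
    half_width = t_025 * math.sqrt(mse * (1 / n_i + 1 / n_h))
    return diff - half_width, diff + half_width

def tukey_interval(mean_i, mean_h):
    """Tukey simultaneous 95% interval, equal sample sizes m."""
    diff = mean_i - mean_h
    half_width = q_05 * math.sqrt(mse / m)
    return diff - half_width, diff + half_width

indiv_ba = individual_interval(means["B"], means["A"], m, m)
tukey_ba = tukey_interval(means["B"], means["A"])

print(f"Individual: [{indiv_ba[0]:.3f}, {indiv_ba[1]:.3f}]")
print(f"Tukey:      [{tukey_ba[0]:.3f}, {tukey_ba[1]:.3f}]")
```

The Tukey interval comes out wider than the individual one, which is the price paid for covering all pairwise differences at once.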