anova one way

Analysis of variance

ONE WAY ANOVA

One-way ANOVA is used to estimate and compare the effects of different treatments on a response variable.

We do this by estimating and comparing the treatment means.

It applies when we compare more than two groups based on a single factor.

E.g., suppose an oil company would like to compare the effects of three oil types, A, B, and C, on the mileage obtained by midsize cars. The company randomly selects 5 cars for each oil type from a population of 1,000 cars. There is only one factor: oil type.

Assumptions of one-way ANOVA

1. Constant variance: the populations of values of the response variable associated with the treatments have equal variances.

2. Normality: the populations of values of the response variable associated with the treatments all have normal distributions.

3. Independence: the samples of experimental units associated with the treatments are randomly selected, independent samples.

Testing for significant differences between treatment means

Suppose the oil company obtains the following results.

We have n = 15 observations and p = 3 treatments.

Car     Oil type A   Oil type B   Oil type C
1       34.0         35.3         33.3
2       35.0         36.5         34.0
3       34.3         36.4         34.7
4       35.5         37.0         33.0
5       35.8         37.6         34.9

Sum     174.6        182.8        169.9
Mean    34.92        36.56        33.98

Overall mean of all 15 observations: 35.153
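The sums and means in the table can be checked with a few lines of Python. This is only a sketch: the dictionary name `mileage` and the variable names are illustrative choices, not part of the original slides.

```python
# Mileage data from the table (5 cars per oil type).
mileage = {
    "A": [34.0, 35.0, 34.3, 35.5, 35.8],
    "B": [35.3, 36.5, 36.4, 37.0, 37.6],
    "C": [33.3, 34.0, 34.7, 33.0, 34.9],
}

n = sum(len(values) for values in mileage.values())  # total observations
p = len(mileage)                                     # number of treatments

sums = {oil: sum(values) for oil, values in mileage.items()}
means = {oil: sums[oil] / len(mileage[oil]) for oil in mileage}
grand_mean = sum(sums.values()) / n                  # mean of all observations

for oil in mileage:
    print(f"Oil type {oil}: sum = {sums[oil]:.1f}, mean = {means[oil]:.2f}")
print(f"n = {n}, p = {p}, overall mean = {grand_mean:.3f}")
```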

Step 1: Determine the null hypothesis and the alternative hypothesis.

H0: µ1 = µ2 = µ3

Ha: at least two of µ1, µ2, µ3 differ

To test H0, we compare the between-treatment variability to the within-treatment variability.

Between-treatment variability is the variability of the sample means from sample to sample.

Within-treatment variability is the variability of the observed values within each sample.

Step 2: To compare the between-treatment and within-treatment variability numerically, we define sums of squares and mean squares.

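The treatment sum of squares (SST) measures the variability of the sample treatment means:

$$SST = \sum_{i=1}^{p} n_i (\bar{x}_i - \bar{x})^2$$

Here n_i is the size of sample i, \bar{x}_i is the mean of sample i, and \bar{x} is the overall mean of all n observations.

So SST = 5(34.92 - 35.153)^2 + 5(36.56 - 35.153)^2 + 5(33.98 - 35.153)^2 = 17.0493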
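The SST calculation can be reproduced as follows. This is a minimal sketch; the function name `treatment_sum_of_squares` is a hypothetical helper introduced here for illustration.

```python
def treatment_sum_of_squares(groups):
    """SST: sum over treatments of n_i * (treatment mean - overall mean)^2."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    return sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )

oil_a = [34.0, 35.0, 34.3, 35.5, 35.8]
oil_b = [35.3, 36.5, 36.4, 37.0, 37.6]
oil_c = [33.3, 34.0, 34.7, 33.0, 34.9]

sst = treatment_sum_of_squares([oil_a, oil_b, oil_c])
print(round(sst, 4))  # 17.0493
```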

Step 2 (continued): The error sum of squares (SSE) measures the within-treatment variability:

$$SSE = \sum_{i=1}^{p} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

We compute the SSE by summing the squared differences between each observed value and the mean of its own treatment.

So SSE = (34.0 - 34.92)^2 + (35.0 - 34.92)^2 + ... + (34.9 - 33.98)^2 = 8.028
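The SSE calculation can be sketched the same way. The function name `error_sum_of_squares` is a hypothetical helper, not something defined in the slides.

```python
def error_sum_of_squares(groups):
    """SSE: squared deviation of every observation from its own treatment mean."""
    total = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        total += sum((x - group_mean) ** 2 for x in group)
    return total

oil_a = [34.0, 35.0, 34.3, 35.5, 35.8]
oil_b = [35.3, 36.5, 36.4, 37.0, 37.6]
oil_c = [33.3, 34.0, 34.7, 33.0, 34.9]

sse = error_sum_of_squares([oil_a, oil_b, oil_c])
print(round(sse, 3))  # 8.028
```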

Step 3: We define a sum of squares that measures the total amount of variability in the observed values of the response, the total sum of squares (SSTO):

SSTO = SST + SSE = 17.0493 + 8.028 = 25.0773

Using the treatment and error sums of squares, we next define two mean squares.

The treatment mean square is MST = SST/(p - 1) = 17.0493/(3 - 1) = 8.525

The error mean square is MSE = SSE/(n - p) = 8.028/(15 - 3) = 0.669
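As a check on the decomposition, the sketch below computes SSTO directly as the sum of squared deviations of all observations from the overall mean, and confirms that it equals SST + SSE. All variable names are illustrative.

```python
groups = [
    [34.0, 35.0, 34.3, 35.5, 35.8],  # oil type A
    [35.3, 36.5, 36.4, 37.0, 37.6],  # oil type B
    [33.3, 34.0, 34.7, 33.0, 34.9],  # oil type C
]
all_values = [x for g in groups for x in g]
n, p = len(all_values), len(groups)
grand_mean = sum(all_values) / n

# Between-treatment and within-treatment sums of squares.
sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# Total sum of squares, computed directly from the overall mean.
ssto = sum((x - grand_mean) ** 2 for x in all_values)

mst = sst / (p - 1)  # treatment mean square
mse = sse / (n - p)  # error mean square

print(f"SSTO = {ssto:.4f}, SST + SSE = {sst + sse:.4f}")
print(f"MST = {mst:.3f}, MSE = {mse:.3f}")
```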

Then we calculate the F value.

Define the F statistic

$$F = \frac{MST}{MSE} = \frac{SST/(p-1)}{SSE/(n-p)}$$

and its p-value to be the area under the F curve with p - 1 and n - p degrees of freedom to the right of F. We reject H0 at significance level α if either of the following equivalent conditions holds:

1. F > Fα

2. p-value < α

It follows that F = 8.525/0.669 = 12.74.

In order to test H0 at the 0.05 level of significance, we use F.05 with p - 1 = 3 - 1 = 2 numerator and n - p = 15 - 3 = 12 denominator degrees of freedom. From the F table we get F.05 = 3.89.

So we have F = 12.74 > F.05 = 3.89. Therefore, we reject H0 at the 0.05 level of significance. In other words, we conclude that at least two of oil types A, B, and C have different effects on mean mileage.
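The test can also be checked without a table. The sketch below relies on a special case that is not in the slides: when the numerator degrees of freedom equal 2, the right-tail area of the F distribution has the closed form (1 + 2F/df2)^(-df2/2). For any other numerator degrees of freedom this shortcut does not apply, and a statistics library or an F table is needed.

```python
sst, sse = 17.0493, 8.028  # sums of squares from the worked example
n, p = 15, 3
alpha = 0.05

df1, df2 = p - 1, n - p    # 2 and 12 degrees of freedom
f_stat = (sst / df1) / (sse / df2)

# Closed-form right-tail area; valid ONLY because df1 == 2 here.
assert df1 == 2
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

# Inverting the same formula gives the critical value F_alpha.
f_crit = (df2 / 2) * (alpha ** (-2 / df2) - 1)

reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.2f}, F_crit = {f_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Both rejection conditions agree, as they must: F exceeds the critical value exactly when the p-value falls below α.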

Pairwise comparison

If the one-way ANOVA F test says that at least two treatment means differ, we estimate how large the differences are by comparing treatment means two at a time. In our example we might estimate the pairwise difference µA - µB, which is the change in mean mileage achieved by changing from oil type B to oil type A.

There are two approaches to calculating intervals for pairwise differences

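1. Individual: a 100(1 - α) percent confidence interval for each pairwise difference µi - µh is

$$(\bar{x}_i - \bar{x}_h) \pm t_{\alpha/2} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_h} \right)}$$

where t_{α/2} is based on n - p degrees of freedom.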
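A sketch of the individual interval for µB - µA follows. The critical value t.025 = 2.179 for 12 degrees of freedom is taken from a standard t table and is an assumption added here; it does not appear in the slides. The function name `individual_interval` is likewise hypothetical.

```python
import math

def individual_interval(mean_i, mean_h, n_i, n_h, mse, t_crit):
    """Individual confidence interval for the difference mu_i - mu_h."""
    half_width = t_crit * math.sqrt(mse * (1 / n_i + 1 / n_h))
    diff = mean_i - mean_h
    return diff - half_width, diff + half_width

mse = 0.669     # error mean square from the worked example
t_crit = 2.179  # assumed: t.025 with n - p = 12 df, from a standard t table

lower, upper = individual_interval(36.56, 34.92, 5, 5, mse, t_crit)
print(f"95% individual interval for muB - muA: [{lower:.3f}, {upper:.3f}]")
```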

2. Simultaneous: such intervals make us 100(1 - α) percent confident that all of the pairwise differences are simultaneously contained in their respective intervals. There are several kinds, but the Tukey formula is the one most commonly used:

$$(\bar{x}_i - \bar{x}_h) \pm q_{\alpha} \sqrt{\frac{MSE}{m}}$$

where q_α is the upper α percentage point of the studentized range for p and (n - p), taken from the table, and m is the common sample size of each treatment.

E.g., in the oil mileage example we are comparing p = 3 treatments, each with sample size m = 5, for a total of n = 15 observations. We have MSE = 0.669 and q.05 = 3.77 from the table entry corresponding to p = 3 and n - p = 12.

The Tukey simultaneous 95% confidence interval for µB - µA is

[(36.56 - 34.92) ± 3.77 √(0.669/5)] = [0.261, 3.019]

The interval makes us simultaneously 95% confident that changing from oil type A to oil type B increases mean mileage by between 0.261 and 3.019 mpg.
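The same Tukey formula can be applied to every pair of oil types. This is a sketch using the values from the example (MSE = 0.669, q.05 = 3.77, m = 5); the function name `tukey_interval` and the dictionary `means` are illustrative.

```python
import math
from itertools import combinations

def tukey_interval(mean_i, mean_h, mse, m, q_crit):
    """Tukey simultaneous interval for mu_i - mu_h with equal sample sizes m."""
    half_width = q_crit * math.sqrt(mse / m)
    diff = mean_i - mean_h
    return diff - half_width, diff + half_width

means = {"A": 34.92, "B": 36.56, "C": 33.98}
mse, m, q_crit = 0.669, 5, 3.77  # q.05 for p = 3 and n - p = 12, from the table

intervals = {}
for h, i in combinations(means, 2):  # pairs (A, B), (A, C), (B, C)
    intervals[(i, h)] = tukey_interval(means[i], means[h], mse, m, q_crit)
    lower, upper = intervals[(i, h)]
    print(f"mu{i} - mu{h}: [{lower:.3f}, {upper:.3f}]")
```

An interval that does not contain zero indicates that the two treatment means differ at the simultaneous 95% confidence level.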
