
Post on 20-Jan-2016


ONE WAY ANALYSIS OF VARIANCE

ANOVA

o It is used to investigate the effect of one factor that occurs at k levels (k ≥ 3).

Example: Suppose that we wish to test the effect of temperature at several levels (20, 30, 35, 40 °C) on serum total proteins.

Biostatistics and Data Analysis, 3rd Lecture

RANDOM MODEL HYPOTHESIS

1) The k samples are independent random samples drawn from k specific populations with means $\mu_1, \mu_2, \ldots, \mu_k$.

2) Each of the k populations is normally distributed.

3) Each of the k populations has the same variance.

$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$

$H_A$: at least one pair of $\mu$'s are not equal.

The summary statistics for each temperature group are shown in the table below

                  20 °C     25 °C     30 °C
Sample size (n)   7         9         8
Mean              2.2857    2.444     5.625
S.D.              0.487     0.882     1.922
Variance (S²)     0.237     0.778     3.694

Temperature (°C)   Serum Total Proteins (g/dL)
20                 2, 3, 2, 2, 3, 2, 2
25                 4, 3, 2, 3, 1, 2, 2, 3, 2
30                 5, 6, 7, 4, 2, 6, 7, 8
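The summary table can be reproduced from the raw readings. The following is a minimal sketch using only Python's standard `statistics` module; the dictionary name `data` and the output layout are illustrative choices, not part of the lecture. Values may differ from the slide in the third decimal because the slide rounds intermediate results.

```python
from statistics import mean, stdev, variance

# Serum total protein readings (g/dL) from the slide, keyed by temperature in °C.
data = {
    20: [2, 3, 2, 2, 3, 2, 2],
    25: [4, 3, 2, 3, 1, 2, 2, 3, 2],
    30: [5, 6, 7, 4, 2, 6, 7, 8],
}

# Sample size, mean, sample standard deviation and sample variance (n - 1 divisor).
summary = {
    temp: {"n": len(v), "mean": mean(v), "sd": stdev(v), "var": variance(v)}
    for temp, v in data.items()
}

for temp, s in summary.items():
    print(f"{temp} °C: n={s['n']}, mean={s['mean']:.4f}, "
          f"sd={s['sd']:.3f}, var={s['var']:.3f}")
```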

Variation (Sum of Squares) = SS

o The sum of the squared deviations between each value and the mean of the values

SS between groups: SS(B)
SS within groups: SS(W)

Variances (Mean Squares) = MS

o The average squared deviation from the mean, found by dividing the variation by its degrees of freedom

MS = SS / df

MS between groups: MS(B)
MS within groups: MS(W)

• Are all of the values identical?
– There is variation among the data, called the total variation SS(T).

Temperature (°C)   Serum Total Proteins (g/dL)   Mean
20                 2, 3, 2, 2, 3, 2, 2           2.2857
25                 4, 3, 2, 3, 1, 2, 2, 3, 2     2.444
30                 5, 6, 7, 4, 2, 6, 7, 8        5.625

• Are all of the sample means identical?
– There is variation called the between-group variation SS(B), or the variation due to the factor.


• Are all of the values within each group identical?
– There is variation within groups, SS(W) (error variation).


There are two sources of variation

– The variation between groups, SS(B), or the variation due to the factor

– The variation within groups, SS(W), or the error variation
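The two sources add up to the total variation, SS(T) = SS(B) + SS(W). The sketch below checks this partition directly on the raw readings; the variable names are illustrative, and because it works from unrounded values the results differ slightly from the rounded figures on the slides.

```python
# Raw readings (g/dL) from the slide, keyed by temperature in °C.
data = {
    20: [2, 3, 2, 2, 3, 2, 2],
    25: [4, 3, 2, 3, 1, 2, 2, 3, 2],
    30: [5, 6, 7, 4, 2, 6, 7, 8],
}

values = [x for group in data.values() for x in group]
grand_mean = sum(values) / len(values)

# Total variation: every value against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in values)

# Between-group variation: each group mean against the grand mean, weighted by n.
ss_between = sum(
    len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in data.values()
)

# Within-group variation: every value against its own group mean.
ss_within = sum(
    (x - sum(g) / len(g)) ** 2 for g in data.values() for x in g
)

print(f"SS(T)={ss_total:.3f}  SS(B)={ss_between:.3f}  SS(W)={ss_within:.3f}")
```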

• Here is the basic one-way ANOVA table

Source              SS    df    MS    F    P
Between (Factor)
Within (Error)
Total


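Grand Mean

– The grand mean is the average of all the values

– It is a weighted average of the individual sample means

$$\bar{X}_g = \frac{\sum_{i=1}^{k} n_i \bar{X}_i}{\sum_{i=1}^{k} n_i} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \cdots + n_k \bar{X}_k}{n_1 + n_2 + \cdots + n_k}$$

For the example: $\bar{X}_g = \frac{7(2.2857) + 9(2.444) + 8(5.625)}{7 + 9 + 8} = 3.458$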
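The weighted-average form of the grand mean can be checked from the summary statistics alone. This is a small sketch with illustrative variable names; the inputs are the rounded sample sizes and means from the slides.

```python
# Sample sizes and (rounded) sample means for 20, 25 and 30 °C, as on the slide.
n = [7, 9, 8]
means = [2.2857, 2.444, 5.625]

# Grand mean = sum(n_i * mean_i) / sum(n_i)
grand_mean = sum(ni * mi for ni, mi in zip(n, means)) / sum(n)

print(f"grand mean = {grand_mean:.3f}")
```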

Between Group Variation, SS(B)

$$SS(B) = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X}_g)^2$$

$$SS(B) = 7(2.2857 - 3.458)^2 + 9(2.444 - 3.458)^2 + 8(5.625 - 3.458)^2$$

$$SS(B) = 9.62 + 9.254 + 37.567 = 56.441$$
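The between-group sum of squares can be recomputed term by term from the rounded summary statistics. The sketch below assumes the rounded grand mean 3.458 used on the slide; the names `terms` and `ss_between` are illustrative.

```python
# Rounded summary statistics from the slides.
n = [7, 9, 8]
means = [2.2857, 2.444, 5.625]
grand_mean = 3.458  # rounded grand mean, as used on the slide

# One term per group: n_i * (mean_i - grand_mean)^2
terms = [ni * (mi - grand_mean) ** 2 for ni, mi in zip(n, means)]
ss_between = sum(terms)

print([round(t, 3) for t in terms], round(ss_between, 3))
```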

Within Group Variation, SS(W)

$$SS(W) = df_1 S_1^2 + df_2 S_2^2 + \cdots + df_k S_k^2, \quad df_i = n_i - 1$$

$$SS(W) = 6 \times 0.237 + 8 \times 0.778 + 7 \times 3.694$$

$$SS(W) = 1.422 + 6.224 + 25.858 = 33.504$$
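The within-group sum of squares follows the same pattern, with each sample variance weighted by its degrees of freedom (n − 1). A minimal sketch using the rounded variances from the slides; variable names are illustrative.

```python
# Rounded sample sizes and sample variances from the slides.
n = [7, 9, 8]
variances = [0.237, 0.778, 3.694]

# One term per group: (n_i - 1) * S_i^2
terms = [(ni - 1) * s2 for ni, s2 in zip(n, variances)]
ss_within = sum(terms)

print([round(t, 3) for t in terms], round(ss_within, 3))
```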

Variances

• After filling in the sums of squares, we have …

Source     SS        df    MS    F    P
Between    56.441    2
Within     33.504    21
Total      89.945    23

– MS = SS / df
  • MS(B) = 56.441 / 2 = 28.221
  • MS(W) = 33.504 / 21 = 1.595
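The degrees of freedom in the table are k − 1 between groups, N − k within groups and N − 1 in total, where k is the number of groups and N the total number of observations. The sketch below derives them and the mean squares from the slide's sums of squares; the variable names are illustrative.

```python
# Sample sizes and sums of squares as given on the slides.
n = [7, 9, 8]
ss_between = 56.441
ss_within = 33.504

k = len(n)   # number of groups
N = sum(n)   # total number of observations

df_between = k - 1
df_within = N - k
df_total = N - 1

# Mean square = sum of squares / degrees of freedom
ms_between = ss_between / df_between
ms_within = ss_within / df_within

print(df_between, df_within, df_total)
print(f"MS(B)={ms_between:.3f}  MS(W)={ms_within:.3f}")
```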

F test

• After filling in the mean squares, we have …

Source     SS        df    MS        F    P
Between    56.441    2     28.221
Within     33.504    21    1.595
Total      89.945    23

– An F test statistic is the ratio of two sample variances

– MS(B) and MS(W) are two sample variances, and their ratio gives F.

– F = MS(B) / MS(W)
  F = 28.221 / 1.595 = 17.69
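Putting the steps together, the whole table can be computed from the raw readings in one pass. The function `one_way_anova` below is a hypothetical helper written for this illustration, not a library routine. Working from unrounded values it gives F ≈ 17.67, slightly below the slide's 17.69, which is based on rounded summary statistics.

```python
def one_way_anova(groups):
    """Return SS, df, MS and F for a list of samples (one list per group)."""
    values = [x for g in groups for x in g]
    N, k = len(values), len(groups)
    grand_mean = sum(values) / N

    # Between-group and within-group sums of squares.
    ss_b = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_w = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_b, df_w = k - 1, N - k
    ms_b, ms_w = ss_b / df_b, ss_w / df_w
    return {
        "ss_between": ss_b, "ss_within": ss_w,
        "df_between": df_b, "df_within": df_w,
        "ms_between": ms_b, "ms_within": ms_w,
        "F": ms_b / ms_w,
    }


# Readings at 20, 25 and 30 °C from the slide.
result = one_way_anova([
    [2, 3, 2, 2, 3, 2, 2],
    [4, 3, 2, 3, 1, 2, 2, 3, 2],
    [5, 6, 7, 4, 2, 6, 7, 8],
])
print(f"F({result['df_between']}, {result['df_within']}) = {result['F']:.2f}")
```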

After filling in the F statistic, we have …

Source     SS        df    MS        Fcal     P
Between    56.441    2     28.221    17.69
Within     33.504    21    1.595
Total      89.945    23

Tabulated F(2,21) at 5% = 3.47, at 1% = 5.78, at 0.1% = 9.77.
The calculated F (17.69) at df 2, 21 is greater than the tabulated F(2,21) at 0.1% (9.77), so we reject the null hypothesis (P < 0.001).
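The comparison with the tabulated values can be cross-checked without tables. When the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form P(F > f) = (1 + 2f/df2)^(−df2/2). The sketch below relies on that special case only and does not generalize to other numerator degrees of freedom; the function name `f_upper_tail_df1_2` is illustrative.

```python
def f_upper_tail_df1_2(f, df2):
    """Upper-tail probability P(F > f) for an F distribution with (2, df2) df.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


df_within = 21

# Tail areas at the tabulated critical values should be close to 5%, 1% and 0.1%.
for f_crit in (3.47, 5.78, 9.77):
    print(f"F={f_crit}: tail area = {f_upper_tail_df1_2(f_crit, df_within):.4f}")

# P value for the calculated F from the slides.
p_value = f_upper_tail_df1_2(17.69, df_within)
print(f"P = {p_value:.6f}")
```

The P value for the calculated F falls well below 0.001, which agrees with rejecting the null hypothesis at the 0.1% level.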
