Statistical analysis of biological data (comparison of means)

Zagazig university Faculty of Veterinary Medicine Animal Wealth Development Department Session#2: Statistical Analysis of Biological Data (Comparison of Means) M.Afifi M.Sc., Biostatistics(Joint Supervision with ISSR, Cairo University) Ph.D., Candidate (AVC, UPEI, Canada) E-mail: [email protected], [email protected] Tel: +201060658185

Upload: mohamed-afifi

Post on 27-Jan-2017


Category:

Health & Medicine



TRANSCRIPT

Page 1: Statistical analysis of biological data (comparison of means)

Zagazig University

Faculty of Veterinary Medicine

Animal Wealth Development Department

Session #2: Statistical Analysis of Biological Data

(Comparison of Means)

M. Afifi

M.Sc., Biostatistics (Joint Supervision with ISSR, Cairo University); Ph.D. Candidate (AVC, UPEI, Canada)

E-mail: [email protected], [email protected] Tel: +201060658185

Page 2: Statistical analysis of biological data (comparison of means)
Page 3: Statistical analysis of biological data (comparison of means)

Course materials are derived from Statistics for Veterinary and Animal Science, Third Edition (2013).

• Very popular

• Minimal mathematical approach

• Well-defined veterinary examples (highlighted)

• Exercises following each chapter

• Solutions to exercises (p. 303)

• Ready-made SPSS outputs

Page 4: Statistical analysis of biological data (comparison of means)

134 Publications

Page 5: Statistical analysis of biological data (comparison of means)
Page 6: Statistical analysis of biological data (comparison of means)

Flowcharts for selection of appropriate tests

Page 7: Statistical analysis of biological data (comparison of means)
Page 8: Statistical analysis of biological data (comparison of means)

Basic rules of any statistical test

• Assumptions

• Hypothesis testing

Page 9: Statistical analysis of biological data (comparison of means)

Basic rules of hypothesis testing

Hypothesis:

• Null hypothesis, H0:

• The difference in (means, proportions, medians) is not real (non-significant),

• The difference is not due to a treatment effect but to other causes (chance, error)

• Alternative hypothesis, HA: the opposite of H0

Test statistic: a value calculated from the data (an algebraic expression particular to the hypothesis we are testing):

t-test >>>> t-value

F-test >>>> F-value

χ2-test >>>> χ2-value

P-value: a probability (0-1), labelled ‘Sig.’ in SPSS, attached to each value of the test statistic. It is the probability of getting the observed effect (or one more extreme) if the null hypothesis is true.

Page 10: Statistical analysis of biological data (comparison of means)

Set up your probabilities

Decision | Null hypothesis true | Null hypothesis false

Accept the null hypothesis | Confidence level (95%) | P (Type II error)

Reject the null hypothesis | Level of significance (5%) | Power

Page 11: Statistical analysis of biological data (comparison of means)

Comparison of Means

Page 12: Statistical analysis of biological data (comparison of means)

Two independent sample means

Page 13: Statistical analysis of biological data (comparison of means)

Two independent sample means

Page 14: Statistical analysis of biological data (comparison of means)

Two-sample t-test (unpaired t-test)

Compares the means in two independent groups of observations using representative samples.

Assumptions

• The two samples must be independent (unrelated)

• Normality: a small departure from Normality is not crucial and leads to only a marginal loss in power

• Homoscedasticity (equal variances) >>>> checked by Levene’s test
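As a rough illustration of the equal-variance check, the sketch below computes Levene's W statistic in its mean-centred form (a one-way analysis of variance on the absolute deviations from each group mean) using only the Python standard library. The function name `levene_w` and the two small samples are hypothetical and are not the Table 7.2 data. To get a P-value, W would be referred to an F distribution with (k - 1, N - k) degrees of freedom, which is not done here.

```python
import statistics


def levene_w(*groups):
    """Levene's W: one-way ANOVA F ratio computed on |x - group mean|."""
    # Absolute deviation of every observation from its own group mean
    abs_dev = [[abs(x - statistics.mean(g)) for x in g] for g in groups]
    k = len(abs_dev)
    n_total = sum(len(z) for z in abs_dev)
    grand_mean = sum(sum(z) for z in abs_dev) / n_total
    # Variation of the group mean deviations around the grand mean deviation
    between = sum(len(z) * (statistics.mean(z) - grand_mean) ** 2 for z in abs_dev)
    # Variation of the deviations inside each group
    within = sum((v - statistics.mean(z)) ** 2 for z in abs_dev for v in z)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical samples, for illustration only
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
w = levene_w(group_a, group_b)
print(f"Levene W = {w:.3f} on (1, {len(group_a) + len(group_b) - 2}) df")
```

A large W (small P-value) would suggest unequal variances, in which case the "equal variances not assumed" line of the output is the one to read.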

Page 15: Statistical analysis of biological data (comparison of means)

7.4.4 Example

Consider the comparison of the mean body weights at the time of mating in one group of ewes which have been flushed (put on a high plane of nutrition for 2–3 weeks prior to mating) and another group which have not. Each ewe in a random sample of 54 ewes is randomly allocated to the flushed or control group. Table 7.2 shows the weights of two samples of 24 flushed and 30 control ewes.

Page 16: Statistical analysis of biological data (comparison of means)
Page 17: Statistical analysis of biological data (comparison of means)

The null hypothesis: mean body weights in the populations of flushed and control ewes are equal; the two-sided alternative is that they are different.

The test statistic for the two-sample t-test is: (formula on slide). It is derived as: (derivation on slide).

The P-value (denoted in SPSS by ‘Sig.’)
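A minimal sketch of how the equal-variance (pooled) two-sample t statistic is usually computed, assuming the standard textbook form: the difference in sample means divided by its standard error, with the two sample variances pooled on n1 + n2 - 2 degrees of freedom. The function name `pooled_t_test` and the weights are hypothetical and are not the Table 7.2 data. The P-value itself needs the t distribution, which the standard library does not provide, so only t and the degrees of freedom are returned.

```python
import math
import statistics


def pooled_t_test(sample_1, sample_2):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1, mean_2 = statistics.mean(sample_1), statistics.mean(sample_2)
    var_1, var_2 = statistics.variance(sample_1), statistics.variance(sample_2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / df
    # Standard error of the difference between the two means
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean_1 - mean_2) / se_diff, df


# Hypothetical body weights (kg), for illustration only
flushed = [66.0, 68.0, 70.0, 72.0, 74.0]
control = [62.0, 64.0, 66.0, 68.0, 70.0]
t_value, df = pooled_t_test(flushed, control)
print(f"t({df}) = {t_value:.2f}")
```

With 24 flushed and 30 control ewes, the same formula would give 24 + 30 - 2 = 52 degrees of freedom.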

Page 18: Statistical analysis of biological data (comparison of means)

Data file: Statistics for Veterinary and Animal Science-Wiley-Blackwell (2013)\Data File (Excel-SPSS-Stata)\SPSS\Table 7.2.sav

2 variables:

• Ewe weight

• Group: 0 = flushed, 1 = control

Page 19: Statistical analysis of biological data (comparison of means)

Analyze >>>> Compare Means >>>> Independent-Samples T Test

Page 20: Statistical analysis of biological data (comparison of means)
Page 21: Statistical analysis of biological data (comparison of means)
Page 22: Statistical analysis of biological data (comparison of means)
Page 23: Statistical analysis of biological data (comparison of means)

SPSS Output

Page 24: Statistical analysis of biological data (comparison of means)

Descriptives

Page 25: Statistical analysis of biological data (comparison of means)

Test

Page 26: Statistical analysis of biological data (comparison of means)

Levene’s test

t-test

The first line of the t-test result (equal variances assumed) is relevant in this instance because the result of Levene’s test for the equality of variances indicates that the two variances are not significantly different (P = 0.617).

Page 27: Statistical analysis of biological data (comparison of means)

How would these results be reported in a scientific journal article?

Page 28: Statistical analysis of biological data (comparison of means)

Tabular presentation

Page 29: Statistical analysis of biological data (comparison of means)

Mean ± SD or SEM with both t-value and P-value

Mean ± SD or SEM with only P-value

Page 30: Statistical analysis of biological data (comparison of means)

Representing P-values with asterisks

Representing P-values with superscripts

Page 31: Statistical analysis of biological data (comparison of means)

Graphical presentation

Page 32: Statistical analysis of biological data (comparison of means)

Simple bar chart and box plots

Page 33: Statistical analysis of biological data (comparison of means)

Report your results in words

Page 34: Statistical analysis of biological data (comparison of means)

Your formal sentence must include:

• The dependent and independent variables

• The exact P-value, unless it is less than 0.001 (SPSS then displays .000): report it as P < 0.001 (or P < 0.0001), never as P < 0.000

• The direction of the effect as evidenced by the reported means, as well as a statement about statistical significance

• The symbol of the test (t), the degrees of freedom (6) and the value of the test statistic (2.95)

Page 35: Statistical analysis of biological data (comparison of means)

There was a significant difference in mean body weight between the flushed ewes (M = 6.79, SD = 1.94) and the control ewes (M = 5.59, SD = 1.81), t(6) = 2.95, p = 0.018.

The mean body weight of the flushed ewes was significantly different from that of the control ewes, with the estimated mean body weight in the flushed ewes being 1.59 kg greater than that of the control ewes. The significantly higher mean body weight of the flushed ewes implies an effect on metabolism and is expected to be associated with an optimal ovulation rate.

P = 0.018 indicates that the chance of obtaining a difference in means at least as large as 1.59 kg is only 1.8% if the null hypothesis is true.

Page 36: Statistical analysis of biological data (comparison of means)

Welch's t-test

(Unequal variances t-test) A widely used modification of the t-test that adjusts the number of degrees of freedom when the variances are not equal to each other.
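The degrees-of-freedom adjustment can be sketched as follows, assuming the usual Welch-Satterthwaite approximation. The function name `welch_t_test` and the two samples are hypothetical, and only the standard library is used, so no P-value is computed.

```python
import math
import statistics


def welch_t_test(sample_1, sample_2):
    """Return (t, df) for Welch's unequal-variances t-test."""
    n1, n2 = len(sample_1), len(sample_2)
    # Squared standard error of each mean, kept separate (no pooling)
    v1 = statistics.variance(sample_1) / n1
    v2 = statistics.variance(sample_2) / n2
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; usually not a whole number
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical samples with clearly different spreads
sample_a = [10.0, 12.0, 14.0, 16.0, 18.0]
sample_b = [5.0, 10.0, 15.0]
t_value, df = welch_t_test(sample_a, sample_b)
print(f"t = {t_value:.3f}, adjusted df = {df:.2f}")
```

The adjusted degrees of freedom fall below the n1 + n2 - 2 used by the pooled test (6 for these samples), which makes the test more conservative when the variances differ.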

Page 37: Statistical analysis of biological data (comparison of means)
Page 38: Statistical analysis of biological data (comparison of means)

If the sample sizes are not large and equal variances cannot be assumed >>>> non-parametric method: Mann–Whitney U test

Page 39: Statistical analysis of biological data (comparison of means)

Paired (dependent) t-test

Page 40: Statistical analysis of biological data (comparison of means)
Page 41: Statistical analysis of biological data (comparison of means)

Freezing | Refrigeration

Page 42: Statistical analysis of biological data (comparison of means)

Methods of pairing:

• Self-pairing: each animal is used as its own control (before and after)

• Natural pairing: each pair of animals is biologically related (e.g. litter mates)

• Artificial (matched) pairing: each animal is paired with an animal matched with respect to one or more factors that affect response

To avoid allocation bias in an experiment when there is self-pairing, each animal is randomly allocated to receive one of the two treatments initially; it then receives the other treatment later.

If there is natural or matched pairing, one member of the pair is randomly allocated to one of the two treatments and the other member receives the second treatment.

Page 43: Statistical analysis of biological data (comparison of means)

Assumption (paired t-test)

We take the difference between the observations in each pair; the set of differences for all pairs should be approximately Normally distributed, even though the original observations in the groups may not be.

Page 44: Statistical analysis of biological data (comparison of means)

7.5.4 Example

Nelson et al. (1998) conducted a randomized cross-over trial of two diets in 11 insulin-dependent diabetic dogs; they measured serum glucose as the variable indicating the quality of diabetic control. The diets contained either low insoluble fibre (LF) or high insoluble fibre (HF). Each dog was randomly allocated to receive a particular diet first. The dogs were adapted to the diet for 2 months and then fed it for 6 months: evaluation was performed at 6-week intervals. As the study ran over 16 months of each dog’s life, we might expect changes in the animal’s metabolic responses to diabetes during the course of the trial, irrespective of diet.

Page 45: Statistical analysis of biological data (comparison of means)
Page 46: Statistical analysis of biological data (comparison of means)

Null hypothesis: the true mean difference in the preprandial serum glucose levels between the low-fibre and high-fibre diets is zero; the two-sided alternative is that it is not zero.

Test statistic:

P-value
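A minimal sketch of the paired t statistic, assuming the standard form: the mean of the within-pair differences divided by its standard error, on n - 1 degrees of freedom (n = number of pairs). The function name `paired_t_test` and the glucose values are hypothetical and are not the Table 7.3 data.

```python
import math
import statistics


def paired_t_test(first, second):
    """Return (t, df) for the paired t-test on two equal-length lists."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # The test works on the within-pair differences only
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se_diff, n - 1


# Hypothetical serum glucose values for four dogs on each diet
low_fibre = [10.0, 12.0, 9.0, 11.0]
high_fibre = [9.0, 10.0, 9.0, 8.0]
t_value, df = paired_t_test(low_fibre, high_fibre)
print(f"t({df}) = {t_value:.3f}")
```

For the 11 dogs in the trial the same calculation would use 11 - 1 = 10 degrees of freedom.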

Page 47: Statistical analysis of biological data (comparison of means)

SPSS data file: Table 7.3

2 variables:

• Low fibre

• High fibre

Page 48: Statistical analysis of biological data (comparison of means)

Analyze >>>> Compare Means >>>> Paired-Samples T Test

Page 49: Statistical analysis of biological data (comparison of means)
Page 50: Statistical analysis of biological data (comparison of means)

SPSS computer output for paired t-test

Page 51: Statistical analysis of biological data (comparison of means)

Descriptives

Page 52: Statistical analysis of biological data (comparison of means)
Page 53: Statistical analysis of biological data (comparison of means)

Test

Page 54: Statistical analysis of biological data (comparison of means)

Paired vs. independent test

Page 55: Statistical analysis of biological data (comparison of means)

If the sample sizes are not large and equal variances cannot be assumed >>>> non-parametric method: Wilcoxon signed-rank test

Page 56: Statistical analysis of biological data (comparison of means)

F-test

ANOVA

Comparing more than two means

Page 57: Statistical analysis of biological data (comparison of means)
Page 58: Statistical analysis of biological data (comparison of means)
Page 59: Statistical analysis of biological data (comparison of means)

Suppose, for example, we have four groups >>>> comparing them with a two-sample t-test for every combination of pairs of groups >>>> six possible t-tests
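The count of six, and why running that many separate t-tests is a problem, can be checked with a few lines of Python. The group labels are hypothetical, and the error-rate figure assumes the six tests are independent, which pairwise tests on shared groups are not, so it should be read only as a rough guide.

```python
import math
from itertools import combinations

groups = ["A", "B", "C", "D"]  # hypothetical group labels
pairs = list(combinations(groups, 2))
n_tests = math.comb(len(groups), 2)

alpha = 0.05
# Chance of at least one false positive across all tests,
# under the simplifying assumption that the tests are independent
familywise_error = 1 - (1 - alpha) ** n_tests

print(pairs)
print(f"{n_tests} pairwise t-tests; "
      f"chance of at least one false positive is about {familywise_error:.2f}")
```

This inflation of the overall error rate is the usual argument for a single analysis of variance followed by adjusted multiple comparisons.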

Page 60: Statistical analysis of biological data (comparison of means)

Principle

The total variability in a data set is partitioned into different sources of variation.

The sources of variation comprise one or more factors, each explained by the levels or categories of that factor (e.g. the two levels, ‘male’ and ‘female’, defining the factor ‘sex’, or three dose levels for a given drug factor), and also unexplained or residual variation which results from uncontrolled biological variation and technical error.

The variation from each source is expressed by its variance.

We can assess the contribution of the different factors to the total variation by making the appropriate comparisons of these variances.

Page 61: Statistical analysis of biological data (comparison of means)
Page 62: Statistical analysis of biological data (comparison of means)

The analysis of variance encompasses a broad spectrum of experimental designs ranging from the simple to the complex.

Page 63: Statistical analysis of biological data (comparison of means)
Page 64: Statistical analysis of biological data (comparison of means)
Page 65: Statistical analysis of biological data (comparison of means)
Page 66: Statistical analysis of biological data (comparison of means)
Page 67: Statistical analysis of biological data (comparison of means)

One-way analysis of variance

A single factor with several levels or categories, where each level comprises a group of observations.

For example, the levels may be:

• Feed formulas for dogs: a dry feed formula, a tinned feed and raw meat

• Different treatment dose levels of a drug, one of which is a placebo representing simply the drug vehicle, while the others are, say, 50%, 100% and 200% of the presumed effective dose

Consider the simple case >>>> only one factor, 2 sources of variation:

• Between the group means

• Within the groups
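The two sources of variation can be made concrete with a short sketch that partitions the sums of squares and forms the F-ratio (between-groups mean square divided by within-groups mean square). The function name `one_way_anova` and the three small groups are hypothetical; no P-value is computed because the F distribution is not in the standard library.

```python
import statistics


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups: how far each group mean lies from the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: spread of observations around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical index values for three treatment groups
group_1 = [1.0, 2.0, 3.0]
group_2 = [2.0, 3.0, 4.0]
group_3 = [4.0, 5.0, 6.0]
f_ratio, df_b, df_w = one_way_anova(group_1, group_2, group_3)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

An F-ratio well above 1 indicates that the group means vary more than the within-group scatter alone would explain.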

Page 68: Statistical analysis of biological data (comparison of means)

In the experimental situation, the animals should be randomly allocated to one of the levels of the factor, i.e. to one of the groups, in order to avoid allocation bias (see Section 5.6).

Page 69: Statistical analysis of biological data (comparison of means)

Assumptions:

The results are reliable only if the assumptions on which the analysis is based are satisfied:

• The samples representing the levels are independent

• Observations in each sample come from a Normally distributed population with variance σ²; this implies that the group variances are the same. Approximate Normality may be established by drawing a histogram; moderate departures from Normality have little effect on the result.

• Constant variance, the more important assumption, may be established by Levene’s test

Page 70: Statistical analysis of biological data (comparison of means)

8.6.4 Example

Dogs were fed a dry diet coated with different agents that were believed to affect the build-up of calculus on the teeth. Calculus accumulation was measured by an index that combined estimates of both the proportion of the teeth covered by the deposit and the thickness of the deposit. Twenty-six dogs were randomly allocated to three treatments: control, soluble pyrophosphate (P2O7) and sodium hexametaphosphate (HMP). The calculus accumulation index was measured on each dog 4 weeks after it received treatment.

Page 71: Statistical analysis of biological data (comparison of means)
Page 72: Statistical analysis of biological data (comparison of means)

Null hypothesis: the mean calculus indices in the three treatment groups are equal; the alternative hypothesis is that they are not all equal.

Test statistic >>>> F-ratio (F-value)

P-value = 0.005 >>>> further examination by post hoc multiple comparisons

Page 73: Statistical analysis of biological data (comparison of means)

SPSS data file: Table 8.1

2 variables:

• Calculus

• Group (1, 2, 3)

Page 74: Statistical analysis of biological data (comparison of means)

Analyze >>>> Compare Means >>>> One-Way ANOVA

Page 75: Statistical analysis of biological data (comparison of means)
Page 76: Statistical analysis of biological data (comparison of means)
Page 77: Statistical analysis of biological data (comparison of means)
Page 78: Statistical analysis of biological data (comparison of means)

Unequal variances ANOVA: Welch ANOVA

Brown-Forsythe test

Page 79: Statistical analysis of biological data (comparison of means)

SPSS computer output for analysis of variance of calculus data on three groups of dogs.

Page 80: Statistical analysis of biological data (comparison of means)

Descriptives

Page 81: Statistical analysis of biological data (comparison of means)

(P = 0.44) >>>> the variances of the observations in the three groups are not significantly different

Page 82: Statistical analysis of biological data (comparison of means)
Page 83: Statistical analysis of biological data (comparison of means)

Which group means differ?

Post hoc test

Multiple comparisons

Page 84: Statistical analysis of biological data (comparison of means)

Post hoc test: multiple comparisons of means

Page 85: Statistical analysis of biological data (comparison of means)

Multiple comparisons

We could conduct a number of tests, but the more tests we perform, the more likely it is that we will obtain a significant P-value on the basis of chance alone.

We have to approach this problem of multiple comparisons in such a way that we avoid spurious P-values.

With Bonferroni’s correction, adjusted P-values are simply the unadjusted P-values multiplied by the number of possible comparisons (six in the four-group example).

If multiplying a P-value by the number of comparisons produces a value greater than one, the probability is given as 1.00.
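The adjustment rule just described (multiply by the number of comparisons, cap at 1.00) is short enough to write out directly. The function name `bonferroni_adjust` and the six unadjusted P-values are hypothetical; the sketch assumes the list holds one P-value for every possible comparison.

```python
def bonferroni_adjust(p_values):
    """Multiply each P-value by the number of comparisons, capped at 1.0."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]


# Hypothetical unadjusted P-values from six pairwise comparisons
unadjusted = [0.001, 0.02, 0.04, 0.2, 0.5, 0.9]
adjusted = bonferroni_adjust(unadjusted)

for raw, adj in zip(unadjusted, adjusted):
    print(f"unadjusted P = {raw:.3f}  adjusted P = {adj:.3f}")
```

Note how an unadjusted P-value of 0.02 or 0.04 is no longer below 0.05 once it has been multiplied by six.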

Page 86: Statistical analysis of biological data (comparison of means)

Most common multiple comparison methods

• Least significant difference (LSD)

• Duncan’s multiple range test (DMRT)

• Tukey’s (HSD)

• Newman–Keuls test

• Bonferroni’s correction

• Scheffé’s test

Be aware: they often produce slightly different results!

Page 87: Statistical analysis of biological data (comparison of means)
Page 88: Statistical analysis of biological data (comparison of means)

Bonferroni's

Page 89: Statistical analysis of biological data (comparison of means)

Reporting results in a table:

Page 90: Statistical analysis of biological data (comparison of means)

Automatic multiple comparison using Bonferroni’s test.

Page 91: Statistical analysis of biological data (comparison of means)
Page 92: Statistical analysis of biological data (comparison of means)
Page 93: Statistical analysis of biological data (comparison of means)
Page 94: Statistical analysis of biological data (comparison of means)

Duncan’s multiple range test


Page 95: Statistical analysis of biological data (comparison of means)

If the sample sizes are not large and equal variances cannot be assumed >>>> non-parametric method: Kruskal–Wallis test

Page 96: Statistical analysis of biological data (comparison of means)

Data Transformation

Reasons for transforming data:

• Unequal variance within groups (to obtain homogeneous data values), e.g. titre data (virus HI titre), bacterial counts

• Skewness (to normalize the data)

Page 97: Statistical analysis of biological data (comparison of means)

Transforming data: Transform menu >>>> Compute Variable

Page 98: Statistical analysis of biological data (comparison of means)

Essential types of transformation

Log transformation (log to the base 10)

An example: the bacterial counts 10, 100 and 1000 have a simple average (arithmetic mean) of (10 + 100 + 1000)/3 = 1110/3 = 370. When converted to logs they become 1, 2 and 3, whose average is (1 + 2 + 3)/3 = 6/3 = 2 or, converted back, 100. Thus the geometric mean of 10, 100 and 1000 is 100, whereas the arithmetic mean is 370. Note that the high value of 1000 skews the arithmetic mean upwards, making it much higher than the geometric mean.
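The bacterial-count arithmetic can be reproduced in a few lines, which also shows what the log transformation does before any test is run. The variable names are illustrative only; the three counts are the ones from the example.

```python
import math
import statistics

counts = [10, 100, 1000]  # bacterial counts from the example

# Arithmetic mean on the original scale
arithmetic_mean = statistics.mean(counts)

# Log10-transform, average on the log scale, then back-transform
log_counts = [math.log10(c) for c in counts]
mean_log = statistics.mean(log_counts)
geometric_mean = 10 ** mean_log

print(f"arithmetic mean = {arithmetic_mean}")
print(f"mean of logs = {mean_log:.1f}, geometric mean = {geometric_mean:.0f}")
```

In SPSS the equivalent step is a new variable computed as the base-10 log of the original one, after which the tests are run on the transformed variable.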

Page 99: Statistical analysis of biological data (comparison of means)

Enter the name of the target variable, for example Logvirus

Page 100: Statistical analysis of biological data (comparison of means)