unit 8: inference for two samples - university of...
TRANSCRIPT
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 1
Statistics 571: Statistical MethodsRamón V. León
Unit 8: Inference for Two Samples
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 2
Introductory Remarks• A majority of statistical studies whether experimental or
observational are comparative• Simplest type of comparative study compares two
populations• Two principal designs for comparative studies
– Using independent samples– Using matched pairs
• Graphical methods for informal comparisons• Formal comparisons of means and variances of normal
populations– Confidence intervals– Hypothesis tests
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 3
Independent Samples DesignExample: Compare Control Group to Treatment Group •Children are randomly assigned to two groups. One group is taught using an old teaching method and the other group using a new teaching method. To avoid one possible source of confounding both methods are taught by the same teacher.
1
2
1 2
1 2
:Sample 1: , ,...,
Sample 2: , ,...,n
n
x x x
y y y
Independent samples design
•Independent sample design relies on random assignment to make the two groups equal (on the average) on all attributes except for the treatment used (treatment factor).
•The two samples are independent
Possibly different numbers
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 4
Matched Pairs DesignExample: Comparison of Two Methods of Teaching Spelling.•Pretest the children and match pairs of children with similar test scores.
•Randomly assign one child of each pair to the control group andthe other to the treatment group
•The pretest score is called a blocking factor
yny2y1Sample 2:xnx2x1Sample 1:n…21Pair:
The x’s are the changes in the test scores of the children taught by the old method.The y’s are the changes in the test scores of the children taught by the new method.
The same number
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 5
Pros and Cons of Each Design• In the independent sample design the two groups may not be
quite equal on some attribute, especially if the sample sizes are small, because randomization assures equality only on the average. Does one group have a higher pretest score than the other? Difference in the groups could be the result of this difference not the treatment factor.
• The matched pair design enables more precise comparisons to be made between treatment groups because of smaller experimental error
• However, it can be difficult to form matched pairs, especially when dealing with human subjects
• The two types of designs can be used in observational studies where the two groups are not formed by randomization. Then a cause and effect relationship cannot be established.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 6
Hospitalization Cost: Independent Sample Design
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 7
Graphical Methods for Comparing Independent Samples: Side-by-Side Box Plots
Outliers present and variation seems different
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 8
Side-by-Side Box Plots:Log Costs
•No outliers for LogeCost•Assumption of equal variance appears reasonable for the two groups in the log scale
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 9
Another Graphical
Methods for Comparing
Two Independent
Samples
( ) ( )
Plot of the order statistics ordered pairs ( , )
which are the i quantiles
n+1of the respective samples
i ix y
Plot suggests that treatment group costs are less than control group costs. But is this true?
Book discusses how to prepare this graph when the two samples are not of the same size.
Q-Q Plot
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 10
Graphical Displays of Data from Matched Pairs
• Plot the pairs (xi, yi) in a scatter plot. Using the 45° line as a reference, one can judge whether the two sets of values are similar or whether one set tends to be larger than the other
• Plots of the differences or the ratios of the pairs may prove to be useful
• Will show a JMP plot that is very informative• A Q-Q plot is meaningless for paired data because the
same quantiles based on the order observations do not, in general, come from the same pair.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 11
Comparing Means of Two Populations:Independent Samples Design
(Large Samples Case)
1 21 2 1 2
12 2
2 1 2
Suppose that the observations , ,..., and , ,...,
are independent random samples from two populations with means
and and variances and . Both means and variancesare assumed to
n nx x x y y y
µ
µ σ σ
1
2 1 2 1
2
be unknown. The goal is to compare and in terms of their difference - . We assume that and are large (say 30).
nn
µµ µ µ
>
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 12
Comparing Means of Two Populations:Independent Samples Design
1 22 21 2
1 2
1 22 21 1 2 2
1 2
( ) ( ) ( )
( ) ( ) ( )
Therefore the standarized r.v.( ) has mean = 0 and variance = 1
If and are large, then Z is approximately (0,1) byth
E X Y E X E Y
Var X Y Var X Var Yn n
X YZn n
n n N
µ µ
σ σ
µ µσ σ
− = − = −
− = + = +
− − −=
+
e Central Limit Theorem though we did not assume the samples came from normal populations. (We also use fact that the difference of independent normal r.v.'s is also normal.)
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 13
Large Sample (Approximate) 100(1-α)% CI for µ1−µ2
( ) ( )2 2 2 21 2 1 2
2 1 2 21 2 1 2
2 2Note has been substituted for because samples arelarge, i.e., bigger than 30.
i i
s s s sx y z x y zn n n n
s
α αµ µ
σ
− − + ≤ − ≤ − + +
Example 8.2: Two large calculus classes of 150 students taught by two different professors (99% CI for difference of means)
2 2 2 21 2
.0051 2
(15) (12)(75 70) 2.576 [0.960,9.040]150 150
s sx y zn n
− ± + = − ± + =
Note interval does not contain zero so there is a α =.01 level significant difference between the two professors
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 14
Large Sample (Approximate) Test of Hypothesis
0 1 2 0 1 1 2 0 0: vs. : (Typically 0)H Hµ µ δ µ µ δ δ− = − ≠ =
02 21 1 2 2
( )Test statistics: x yzs n s n
δ− −=
+
Example 8.2: Final exam grades for large calculus classes of 150 students taught by two different professors. (Same test used on both classes.)
02 2 2 21 1 2 2
( ) (75 70) 0 3.188(15) 150 (12) 150
[ 3.188] 2[1 (3.188)] .0014Result is significant at the .01 level.
x yzs n s n
P value P z
δ
α
− − − −= = =
+ +
− = ≥ = − Φ =
=
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 15
Inference for Small Samples2 21 2Case 1: Variances and assumed equal.σ σ
2 22 22 1 1 2 2
1 2 1 22 2 2
1 2
Pooled estimate of the common variance:( ) ( )( 1) ( 1)
( 1) ( 1) 2Note: ( ) / 2 if sample sizes are equal
i iX X Y Yn S n SSn n n n
S S S
− + −− + −= =
− + − + −
= +
∑ ∑
1 21 2
1 2
( ) has -distribution with 2 d.f.1 1
X YT t n nS n n
µ µ− − −= + −
+
Assumption of normal populations is important since we cannot invoke the CLT
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 16
Inference for Small Sample: Confidence Intervals and Hypothesis Tests
1 2 1 22, 2 1 2 2, 21 2 1 2
Two-sided 100(1- )% CI is given by:
1 1 1 1n n n nx y t s x y t s
n n n nα α
α
µ µ+ − + −− − + ≤ − ≤ − + +
1 2
0 1 2 0 1 1 2 0
0
1 2
0 2, 2
Test of Hypothesis: : vs. :
Test statistics: 1 1
Reject if n n
H Hx yts
n n
H t t α
µ µ δ µ µ δδ
+ −
− = − ≠− −
=+
>
2 21 2Case 1: Variances and assumed equal.σ σ
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 17
JMP: Checking Assumptions for Log Costs
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 18
Normal Plots by Group
After log transformation, control group data seems normal
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 19
Normal Plots by Group
After log transformation, treatment group data seems normal
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 20
Testing for Normality
Control:
Treatment:
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 21
Hospitalization Cost Example•Work with log cost since then the normal assumption holds•Assumption of equal variance is also reasonable as seen bythe side-by-side box plots of loge cost (Slide 8)
2 2
Pooled estimate of the common variance:
29(1.167) 29(0.905) 1.045 with 58 d.f.58
s += =
58,.0251 2
95% CI:
1 1 1 1[ (8.251 8.084) 2.002 1.045 [ 0.373,0.707]30 30
x y t sn n
− ± + = − ± × + = −
Difference in cost in logarithmic scale are not significant. Contrast this conclusion with apparent difference seen on the Q-Q plot in Figure 8.1 (Slide 9)
SE( - ) 0.2698x y =
Contains zero
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 22
Interpretation of Difference in Means on the Log Scale
Mean (log Cost) = Median (log Cost) = log (Median Cost)
Because distribution of log cost is symmetric
Because the log preserves ordering
0.373 (log ) (log ) 0.707 0.373 log( ) log( ) 0.707
0.373 log 0.707
.689 exp( 0.373) exp(0.707) 2.028
C T
C T
C
T
C
T
Mean Cost Mean CostMedian Cost Median Cost
Median CostMedian Cost
Median CostMedian Cost
− ≤ − ≤− ≤ − ≤
− ≤ ≤
= − ≤ ≤ =
95% confidence interval for the ratio of median costs
This Interpretation is not in your textbook
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 23
JMP Analysis of Hospitalization Cost Example
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 24
JMP Analysis of Hospitalization Cost Example
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 25
JMP Analysis of Hospitalization Cost Example
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 26
Means Diamonds and x-Axis ProportionalA means diamond illustrates a sample mean and 95% confidence interval, as shown by the schematic in "Means Diamonds and X-Proportional Options". The line across each diamond represents the group mean. The vertical span of each diamond represents the 95%confidence interval for each group. Overlap marks are drawn above and below the group mean. For groups with equal sample sizes, overlapping marks indicate that the two group means are not significantly different at the 95% confidence level. If the X-Axis Proportional display option is in effect, the horizontal extent of each group along the x-axis (the horizontal size of the diamond) is proportional to the sample size of each level of the xvariable. It follows that the narrower diamonds are usually the taller ones because fewer data points yield a less precise estimate of the group mean. The confidence interval computation assumes that variances are equal across observations. Therefore, the height of the confidence interval (diamond) is proportional to the reciprocal of the square root of the number of observations in the group.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 27
JMP Analysis of Hospitalization Cost Example
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 28
Inference for Small Samples2 21 2Case 2: Variances and unequal.σ σ
1 22 2
1 2
1 2
( ) does not have a Student - distributionX YT tS Sn n
µ µ− − −=
+
T is not a pivotal quantity since it depends on ratio of sigmas.
However, when n1 and n2 are large T has an approximate N(0,1) distribution
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 29
Inference for Small Samples2 21 2Case 2: Variances and unequal.σ σ
( ) ( )
1 22 2
1 2
1 2
21 2
2 21 1 2 2
2 22 21 2
1 21 2
For small samples( ) has an approximately -distribution
( )with degrees of freedom( 1) ( 1)
where SEM( ) and SEM( )
X YT tS Sn n
w ww n w n
s sw x w yn n
µ µ
ν
− − −=
+
+=
− + −
= = = =
Note: d.f. are estimated from the data and are not a function of the samples sizes alone
Note: ν is not usually an integer but is rounded down to the nearest integer
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 30
Inference for Small Samples2 21 2Case 2: Variances and unequal.σ σ
1 2
2 2 2 21 2 1 2
, 2 1 2 , 21 2 1 2
Approximate 100(1- )% two-sided CI for :
s s s sx y t x y tn n n nν α ν α
α µ µ
µ µ
−
− − + ≤ − ≤ − − +
0 1 2 0 1 1 2 0
02 21 1 1 1
0 , 2
Test statistics for : vs. :
is
Reject if .
H Hx yt
s n s n
H t tν α
µ µ δ µ µ δδ
− = − ≠− −
=+
>
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 31
Hospitalization Costs: Inference Using Separate Variances
2 2 2 2 221 2 1 2
1 21 2
2
2 2 2Since we have 2
where is the pooled sample variance. Hence,
s s s s sn n n sn n n n n
s
+ = = + = × = =
0 0 02 2 21 1 1 1
0.6191 12
That is, with this is the same as the with pooled variance
x y x y x yts n ns n s n s n
t t
δ δ δ− − − − − −= = = =
++
equal sample sizes
The textbook calculates the d.f. to be ν = 54.6 = 54Recall that assuming equal variance the d.f. were n1+ n2 - 2 = 58
Since the t distributions with 54 and 58 d.f. do not differ by much the results not assuming equal variances do not differ by much from those gotten assuming equal variances.
Recall that we are working with log cost
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 32
Hospitalization Costs: Inference Using Separate Variances – Conclusions
• Result with equal variances and unequal variance do not differ by much because– The two sample variances are not too different– The two sample sizes are equal
• In other cases there can be considerable loss in d.f. as a result of using separate variance estimates
• If there is any doubt about the equality of variances do not usethe pooled variance method since this method is not robustagainst the violation of the assumption of equal variances. This is particularly so when sample sizes are unequal.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 33
Testing for the Equality of VariancesSection 8.4 covers the classical F test for the equality of two variances and associated confidence intervals. However, this method is not robust against departures from normality. For example, p-values can be off by a factor of 10 if the distributions have shorter or longer tails than the normal.A robust alternative is Levene’s Test. His test applies the two-sample t-test to the absolute value of the difference of each observation and the group mean
1 1 1
2 2 2
| |, 1, 2, ,| |, 1, 2, ,
i
i
Y Y i nY Y i n
− = ⋅⋅⋅
− = ⋅⋅⋅This method works well even though these absolute deviations are not independent.
In the Brown-Forsythe test the response is the absolute value of the difference of each observation and the group median
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 34
JMP Analysis Without Assuming Equal Variances
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 35
JMP 4 Analysis Without Assuming Equal Variances
Note that (.6181)2 = 0.3820
Classical Test for equality of several variances is not robust against departure from the normality assumption
These tests can be used when there aremore than two populations
ν = 54.6 = 54
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 36
Independent Sample Design: Sample Size Determination Assuming Equal Variances
0 1 2 1 1 22
21 2
: 0 vs. : 0
( )2
H H
z zn n n α β
µ µ µ µ
σδ
− = − ≠
+ = = =
Hospital cost example: Assume σ = 1 in loge scale costs (consistent with data)Assume we want to detect change of a factor of two in the median cost with at least .90 power. The alpha level is .05.
21 2
1
2 .025 .10
2
Then (log ) (log ) log log 2 0.693
1, 1.960, and 1.282
1.0(1.960 1.282)2 43.75 44.693
median CostE Y E Ymedian Cost
z z z z
n
α β
δ
σ
= − = =
= = = = =
+ = =
Because we assume a known variance this n is a slight underestimate of sample size
Smallest difference of practical importance that we want to detect
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 37
Sample Size Determination With JMP
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 38
Sample Size Determination With JMP: Method 1
.693/2= δ/2
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 39
Sample Size Determination With JMP
Sample sizehere is 2n
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 40
Sample Size Determination With JMP:Exploring a Range of Standard Deviations
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 41
Sample Size Determination With JMPExploring a Range of Standard Deviations
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 42
Sample Size Determination With JMP: A Simpler Way
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 43
Sample Size Determination With JMP: A Simpler Way
Extra Paramsis only for multi-factor designs. Leave this field zero in simple cases.
2n
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 44
Matched Pairs Design
Example: Clinical study to compare two methods of measuringcardiac output discussed in Example 4.11. Method A is invasivewhile Method B is noninvasive. To control for difference betweenpersons, both methods are used on each person participating in the study. The purpose is to compare the two methods in terms ofthe mean difference.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 45
Cardiac Output Data
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 46
Cardiac Output JMP FileYou can downloadthis file in the coursehome page followingthe Textbook Data Setslink.
Methods used for matched pair are the one-sample methods already learned applied to the A-B data.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 47
Cardiac Output JMP Analysis
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 48
Testing Equality of Means
Test that the meandifference is zero
Example 8.6 in the textbook has a slight rounding error in the calculation of the mean difference
Distribution JMP platform:
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 49
Confidence Interval for Difference of Means
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 50
JMP’s Matched Pairs Platform: Another Way to Compare Matched Pairs
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 51
JMP’s Matched Pairs Platform
X axis measures how different the pairs are. The more scatter in this axis the more likely the results are to generalize to other persons outside the sample.
Significant difference between the methods because significance bands do not contain 0.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 52
Statistical Justification of Matched Pairs Design
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 53
Sample Size Determination2
22
( ) (One-Sided Test)
( ) (Two-Sided Test)
D
D
z zn
z zn
α β
α β
σδ
σδ
+ =
+ =
•One needs a planning value for σD
•This formulas come from the one-sample formulas applied to the differences
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 54
Comparing Variances of Two Populations
•Application arises when comparing instrument precision oruniformities of products. Also when checking assumptions.
•The method discussed in the book (F test) is applicable only under the assumption of normality of the data. The F test is highly sensitive to even modest departures from normality.
• In case of nonnormal data there are nonparametric and other robust methods for comparing data dispersion. We recallsome supported by JMP.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 55
Example 8.7 (Hospitalization Costs: Test of Equality of Variances)
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 56
Comparing Variances with JMP
These tests are alternatives to the F test used inExample 8.7 on Page 288. These tests can be usedwith more than two samples. Levene’s test is relatively robust against violations of the normalityassumption . See JMP’s help for details.
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 57
Comparing Variances of Two Populations: F Test
1
1
21 2 1 1
21 2 1 1
Independent sample design:Sample 1: , ,..., is a random sample from ( , )
Sample 2: , ,..., is a random sample from ( , )n
n
x x x N
y y y N
µ σ
µ σ2 2
1 11 22 2
2 2
has an F distribution 1 and 1 d.f. respectivelySF n nS
σσ
= − −
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 58
An Important Industrial Application:Example 8.8
Do the two labs have equal measurement precision?
7/14/2004 Unit 8 - Stat 571 - Ramón V. León 59
Example 8.8: Calculation Details
d.f. = (12 - 1) + (12 - 1) + (12 - 1) = 33
d.f. = (6 - 1) + (6 - 1) + (6 - 1) = 15, ,
, ,1
Used the fact that1
m nn m
ff α
α−
=