two populations -- unknown sigmas
7/29/2019 Two Populations -- Unknown Sigmas
http://slidepdf.com/reader/full/two-populations-unknown-sigmas 1/39
Two Population Means
Hypothesis Testing and Confidence Intervals
With Unknown Standard Deviations
The Problem
• µ1 and µ2 are unknown
• σ1 and σ2 are not known (the usual case)
OBJECTIVES
• Test whether µ1 > µ2 (by a certain amount)
– or whether µ1 ≠ µ2
• Determine a confidence interval for the difference in the means: µ1 - µ2
KEY ASSUMPTIONS
Sampling is done from two populations.
– Population 1 has mean µ1 and variance σ1².
– Population 2 has mean µ2 and variance σ2².
– A sample of size n1 will be taken from population 1.
– A sample of size n2 will be taken from population 2.
– Sampling is random and both samples are drawn independently.
– Either the sample sizes will be large or the populations are assumed to be normally distributed.
Random variable $\bar{X}_1$: mean $\mu_1$, standard deviation $\sigma_1/\sqrt{n_1}$, variance $\sigma_1^2/n_1$
Random variable $\bar{X}_2$: mean $\mu_2$, standard deviation $\sigma_2/\sqrt{n_2}$, variance $\sigma_2^2/n_2$
Distribution of X̄1 - X̄2
• Since X̄1 and X̄2 are both assumed to be normal, or the sample sizes n1 and n2 are assumed to be large, then because σ1 and σ2 are unknown, the random variable X̄1 - X̄2 (standardized by its estimated standard deviation) has a:
– Distribution: t
– Mean = µ1 - µ2
– Standard deviation that depends on whether or not the standard deviations of populations 1 and 2 (although unknown) can be assumed to be equal
– Degrees of freedom that also depend on whether or not the standard deviations of populations 1 and 2 can be assumed to be equal
Appropriate Standard Deviation For X̄1 - X̄2 When σ's Are Known
• Recall the appropriate standard deviation for X̄1 - X̄2 is:
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
• Now if σ1 = σ2 we can simply call it σ and write it as:
$$\sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
• So if the standard deviations are unknown, we need an estimate for the common variance, σ².
Estimating σ²
Degrees of Freedom
• If we can assume that the populations have equal variances, then the common variance σ² is estimated by the weighted average of s1² and s2², weighted by their DEGREES OF FREEDOM.
• There are n1 - 1 degrees of freedom from the first sample and n2 - 1 degrees of freedom from the second sample, so
• Total Degrees of Freedom for the hypothesis test or confidence interval = (n1 - 1) + (n2 - 1) = n1 + n2 - 2
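The Appropriate Standard Deviation For X̄1 - X̄2 When σ's Are Unknown, but Can Be Assumed to Be Equal
• The best estimate for σ² then is the pooled variance, sp²:
$$s_p^2 = \frac{(n_1 - 1)\,s_1^2}{n_1 + n_2 - 2} + \frac{(n_2 - 1)\,s_2^2}{n_1 + n_2 - 2} = \frac{DF_1}{\text{Total DF}}\,s_1^2 + \frac{DF_2}{\text{Total DF}}\,s_2^2$$
• Thus the best estimates for the variance and standard deviation of X̄1 - X̄2 are:
$$s^2_{\bar{X}_1 - \bar{X}_2} = s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right) \qquad s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$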
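The pooled variance and the resulting standard error can be sketched in a few lines of Python. This is only an illustration of the formulas on this slide: the function names and the sample inputs (two samples of 10 with standard deviations 2 and 4) are made up for the example and do not come from the slides.

```python
import math


def pooled_variance(n1, s1, n2, s2):
    """Weighted average of s1^2 and s2^2, weighted by degrees of freedom."""
    df1, df2 = n1 - 1, n2 - 1
    total_df = df1 + df2  # n1 + n2 - 2
    return (df1 * s1**2 + df2 * s2**2) / total_df


def pooled_standard_error(n1, s1, n2, s2):
    """Estimated standard deviation of (x1_bar - x2_bar), equal-variance case."""
    return math.sqrt(pooled_variance(n1, s1, n2, s2) * (1 / n1 + 1 / n2))


# Made-up inputs, chosen only so the arithmetic is easy to follow.
sp2 = pooled_variance(10, 2.0, 10, 4.0)        # (9*4 + 9*16) / 18
se = pooled_standard_error(10, 2.0, 10, 4.0)   # sqrt(sp2 * 0.2)
print(sp2, se)
```

With equal sample sizes the pooled variance is just the plain average of the two sample variances; the weighting only matters when n1 and n2 differ.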
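t-Statistic and t-Confidence Interval Assuming Equal Variances
Degrees of Freedom = n1 + n2 - 2
t-Statistic (v is the hypothesized value of µ1 - µ2 under H0)
$$t = \frac{\text{Point Estimate} - v}{\text{Standard Error}} = \frac{(\bar{x}_1 - \bar{x}_2) - v}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Confidence Interval
$$\text{Point Estimate} \pm t_{\alpha/2}\,(\text{Standard Error}) = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,DF}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$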
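As a rough sketch of how the equal-variance t-statistic and confidence interval fit together, the Python below takes the pooled variance and a table critical value as inputs. The function names and the numbers in the usage lines are illustrative assumptions, not values from the slides; the critical value is supplied by the caller rather than computed.

```python
import math


def equal_variance_t(x1_bar, x2_bar, n1, n2, sp2, v=0.0):
    """Return (t, df) for H0: mu1 - mu2 = v under the equal-variance assumption."""
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = ((x1_bar - x2_bar) - v) / se
    return t, n1 + n2 - 2


def equal_variance_ci(x1_bar, x2_bar, n1, n2, sp2, t_crit):
    """Return (lower, upper); t_crit is t_{alpha/2, n1+n2-2} read from a t table."""
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    point = x1_bar - x2_bar
    return point - t_crit * se, point + t_crit * se


# Made-up inputs: means 12 and 10, eight observations each, pooled variance 4.
t_stat, df = equal_variance_t(12.0, 10.0, 8, 8, 4.0)
# 2.145 is the usual table value for t_{.025,14}.
lower, upper = equal_variance_ci(12.0, 10.0, 8, 8, 4.0, t_crit=2.145)
print(t_stat, df, lower, upper)
```

Passing the critical value in keeps the sketch free of any statistics library; the same table lookup used for a hand calculation applies here.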
The Appropriate Standard Deviation For X̄1 - X̄2 When σ's Are Unknown, And Cannot Be Assumed to Be Equal
• If we cannot assume that the populations have equal variances, then the best estimate for σ1² is s1² and the best estimate for σ2² is s2².
• Thus the best estimates for the variance and standard deviation of X̄1 - X̄2 are:
$$s^2_{\bar{X}_1 - \bar{X}_2} = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \qquad s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
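t-Statistic and t-Confidence Interval Assuming Unequal Variances
t-Statistic
$$t = \frac{\text{Point Estimate} - v}{\text{Standard Error}} = \frac{(\bar{x}_1 - \bar{x}_2) - v}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Confidence Interval
$$\text{Point Estimate} \pm t_{\alpha/2}\,(\text{Standard Error}) = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,DF}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Total Degrees of Freedom
$$DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$
Round the resulting value.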
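The degrees-of-freedom formula is the easiest part of the unequal-variance procedure to get wrong by hand, so a small Python sketch may help. The function names and the inputs below are assumptions made for illustration only; they are not taken from the slides.

```python
import math


def welch_degrees_of_freedom(n1, s1, n2, s2):
    """Unrounded degrees of freedom for the unequal-variance t-test."""
    a = s1**2 / n1
    b = s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def unequal_variance_standard_error(n1, s1, n2, s2):
    """Estimated standard deviation of (x1_bar - x2_bar), unequal-variance case."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


# Made-up inputs: two samples of 10, both with standard deviation 3.
df_raw = welch_degrees_of_freedom(10, 3.0, 10, 3.0)
df = round(df_raw)  # the slide says to round the resulting value
se = unequal_variance_standard_error(10, 3.0, 10, 3.0)
print(df_raw, df, se)
```

When the sample sizes and sample variances happen to be equal, this formula gives n1 + n2 - 2, the same degrees of freedom as the equal-variance test, which is a convenient sanity check.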
Testing whether the Variances Can Be Assumed to Be Equal
• The following hypothesis test tests whether or not equal variances can be assumed:
H0: σ1²/σ2² = 1 (They are equal)
HA: σ1²/σ2² ≠ 1 (They are different)
This is an F-test!
If the larger of s1² and s2² is put in the numerator (call that sample 1), then the test is:
Reject H0 if F = s1²/s2² > F_{α/2, DF1, DF2}
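A minimal Python sketch of this rule follows, assuming the critical value F_{α/2, DF1, DF2} is looked up in an F table and passed in. The function names, the sample inputs, and the threshold 3.0 are hypothetical placeholders, not values from the slides or from an F table.

```python
def f_ratio_larger_on_top(n1, s1, n2, s2):
    """Return (F, df_numerator, df_denominator) with the larger s^2 in the numerator."""
    v1, v2 = s1**2, s2**2
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    return v2 / v1, n2 - 1, n1 - 1


def can_conclude_unequal_variances(f_stat, f_crit):
    """Reject H0 (equal variances) when F exceeds the alpha/2 critical value."""
    return f_stat > f_crit


# Made-up inputs: sample 2 has the larger variance, so it goes on top.
f_stat, df_num, df_den = f_ratio_larger_on_top(16, 2.0, 21, 6.0)
# 3.0 is a placeholder threshold; a real analysis would use the table value
# for F_{alpha/2, df_num, df_den}.
reject = can_conclude_unequal_variances(f_stat, f_crit=3.0)
print(f_stat, df_num, df_den, reject)
```

Returning the degrees of freedom in numerator-then-denominator order matters because the F table is indexed that way.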
Hypothesis Test/Confidence Interval Approach With Unknown σ's
• Take a sample of size n1 from population 1
– Calculate x̄1 and s1²
• Take a sample of size n2 from population 2
– Calculate x̄2 and s2²
• Perform an F-test to determine if the variances can be assumed to be equal
• Perform the appropriate hypothesis test or construct the appropriate confidence interval
Example 1
Based on the following two random samples,
– Can we conclude that women on the average score better than men on civil service tests?
– Construct a 95% confidence interval for the difference in average scores between women and men on civil service tests.
• Because the sample sizes are large, we do not have to assume that test scores have a normal distribution to perform our analyses.

                    Women    Men
Number sampled      32       30
Sample Average      75       73
Sample St'd Dev.    13.92    11.79
Example 1 – F-test
Do an F-test to determine if variances can be assumed to be equal.
H0: σW²/σM² = 1 (Equal Variances)
HA: σW²/σM² ≠ 1 (Unequal Variances)
• Select α = .05.
• Reject H0 (Accept HA) if Larger s²/Smaller s² > F.025,DF(Larger s²),DF(Smaller s²) = F.025,31,29 = 2.09 *
(*Note: this is F.025,30,29, since the table does not give the value for F.025,31,29.)
Calculation: sW²/sM² = (13.92)²/(11.79)² = 1.39
Since 1.39 < 2.09, we cannot conclude unequal variances.
Do the Equal Variance t-test with 32 + 30 - 2 = 60 degrees of freedom.
Example 1
The Equal Variance t-Test
H0: µW - µM = 0
HA: µW - µM > 0
• Select α = .05.
• Reject H0 (Accept HA) if t > t.05,60 = 1.671
$$s_p^2 = \frac{31}{60}(13.92)^2 + \frac{29}{60}(11.79)^2 = 167.30$$
$$t = \frac{(75 - 73) - 0}{\sqrt{167.30\left(\frac{1}{32} + \frac{1}{30}\right)}} = .608$$
Since .608 < 1.671, we cannot conclude that women average better than men on the tests.
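Example 1
95% Confidence Interval
$$(\bar{x}_W - \bar{x}_M) \pm t_{.025,60}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
$$(75 - 73) \pm 2.000\sqrt{167.30\left(\frac{1}{32} + \frac{1}{30}\right)}$$
2 ± 6.57
From -4.57 to 8.57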
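The Example 1 figures can be re-derived from the summary statistics with a short Python script. The variable names are chosen here for illustration, and the critical value 2.000 is taken from the slide as given rather than computed.

```python
import math

# Summary statistics from the Example 1 slide.
n_w, mean_w, sd_w = 32, 75.0, 13.92  # women
n_m, mean_m, sd_m = 30, 73.0, 11.79  # men

# F ratio with the larger sample variance in the numerator.
f_ratio = max(sd_w, sd_m) ** 2 / min(sd_w, sd_m) ** 2

# Pooled variance and standard error for the equal-variance case.
df = n_w + n_m - 2
sp2 = ((n_w - 1) * sd_w**2 + (n_m - 1) * sd_m**2) / df
se = math.sqrt(sp2 * (1 / n_w + 1 / n_m))

# t statistic for H0: mu_W - mu_M = 0.
t_stat = ((mean_w - mean_m) - 0) / se

# 95% confidence limits; 2.000 is t_{.025,60} as quoted on the slide.
t_crit = 2.000
lcl = (mean_w - mean_m) - t_crit * se
ucl = (mean_w - mean_m) + t_crit * se

print(round(f_ratio, 2), round(sp2, 2), round(t_stat, 3))  # 1.39 167.3 0.608
print(round(lcl, 2), round(ucl, 2))                        # -4.57 8.57
```

The interval contains 0, which agrees with the one-tail test's failure to show that women score higher on average.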
Example 2
Based on the following random samples of basketball attendances at the Staples Center,
– Can we conclude that the Lakers' average attendance is more than 2000 more than the Clippers' average attendance at the Staples Center?
– Construct a 95% confidence interval for the difference in average attendance between Lakers and Clippers games at the Staples Center.
• Since the sample sizes are small, we must assume that attendance at Lakers and Clippers games have normal distributions to perform the analyses.

                    LA Lakers    LA Clippers
Number sampled      13           11
Sample Average      16,675       12,009
Sample St'd Dev.    1014.97      3276.73
Example 2 – F-test
• Do an F-test to determine if variances can be assumed to be equal.
H0: σC²/σL² = 1 (Equal Variances)
HA: σC²/σL² ≠ 1 (Unequal Variances)
Note: Clipper variance is the larger sample variance
• Choose α = .05.
• Reject H0 (Accept HA) if Larger s²/Smaller s² > F.025,DF(Larger variance),DF(Smaller variance) = F.025,10,12 = 3.37
Calculation: sC²/sL² = (3276.73)²/(1014.97)² = 10.42
Since 10.42 > 3.37, we can conclude unequal variances.
Do the Unequal Variance t-test.
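Degrees of Freedom for the Unequal Variance t-Test
• The degrees of freedom for this test is given by:
$$DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\frac{(1014.97)^2}{13} + \frac{(3276.73)^2}{11}\right)^2}{\frac{\left((1014.97)^2/13\right)^2}{12} + \frac{\left((3276.73)^2/11\right)^2}{10}} = 11.626$$
This is rounded to 12 degrees of freedom.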
Example 2 – the t-Test
Proceed to the hypothesis test for the difference in means with unequal variances:
H0: µL - µC = 2000
HA: µL - µC > 2000
• Select α = .05.
• Reject H0 (Accept HA) if t > t.05,12 = 1.782
$$t = \frac{(16{,}675 - 12{,}009) - 2000}{\sqrt{\frac{(1014.97)^2}{13} + \frac{(3276.73)^2}{11}}} = 2.595$$
Since t = 2.595 > 1.782, we can conclude that the Lakers average more than 2000 per game more than the Clippers at the Staples Center.
Example 2
95% Confidence Interval
$$(\bar{x}_L - \bar{x}_C) \pm t_{.025,12}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$(16{,}675 - 12{,}009) \pm 2.179\sqrt{\frac{(1014.97)^2}{13} + \frac{(3276.73)^2}{11}}$$
4666 ± 2238.47
From 2427.53 to 6904.47
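The Example 2 calculations can be checked the same way. The script below is a sketch using variable names chosen for illustration; the critical value 2.179 is the slide's t_{.025,12} and is not computed here.

```python
import math

# Summary statistics from the Example 2 slide.
n_l, mean_l, sd_l = 13, 16675.0, 1014.97  # Lakers
n_c, mean_c, sd_c = 11, 12009.0, 3276.73  # Clippers

# F ratio: the Clippers have the larger sample variance.
f_ratio = sd_c**2 / sd_l**2

# Per-sample variance-of-the-mean terms.
a = sd_l**2 / n_l
b = sd_c**2 / n_c

# Degrees of freedom for the unequal-variance test, then rounded.
df_raw = (a + b) ** 2 / (a**2 / (n_l - 1) + b**2 / (n_c - 1))
df = round(df_raw)

# Standard error and t statistic for H0: mu_L - mu_C = 2000.
se = math.sqrt(a + b)
t_stat = ((mean_l - mean_c) - 2000) / se

# 95% confidence limits; 2.179 is t_{.025,12} as quoted on the slide.
t_crit = 2.179
lcl = (mean_l - mean_c) - t_crit * se
ucl = (mean_l - mean_c) + t_crit * se

print(round(f_ratio, 2), round(df_raw, 3), df, round(t_stat, 3))
print(round(lcl, 2), round(ucl, 2))
```

The whole interval lies above 2000, which is consistent with rejecting H0 in the one-tail test.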
Excel Approach
• F-test, t-test Assuming Equal Variances, and t-test Assuming Unequal Variances are all found in Data Analysis.
• Excel only performs a one-tail F-test.
– Multiply this 1-tail p-value by 2 to get the p-
value for the 2-tail F-test.
• Formulas must be entered for the LCL and
UCL of the confidence intervals.
– All values for these formulas can be found in
the Equal or Unequal Variance t-test Output.
Inputting/Interpreting Results
From Hypotheses Tests
• Express H0 and HA so that the number on the right side is positive (or 0)
• The p-value returned for the two-tailed test will always be correct.
• The p-value returned for the one-tail test is usually correct. It is correct if:
– HA is a “> test” and the t-statistic is positive
  • This is the usual case
  • If t < 0, the true p-value is 1 – (p-value printed by Excel)
– HA is a “< test” and the t-statistic is negative
  • This is the usual case
  • If t > 0, the true p-value is 1 – (p-value printed by Excel)
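The one-tail correction rule above can be written as a small helper. This is a sketch only: the function name and its arguments are hypothetical, and it simply encodes the rule stated on this slide rather than anything Excel exposes.

```python
def true_one_tail_p_value(printed_p, t_stat, alternative):
    """Apply the slide's rule to the one-tail p-value printed by Excel.

    alternative is ">" when HA is a "> test" and "<" when HA is a "< test".
    """
    if alternative not in (">", "<"):
        raise ValueError("alternative must be '>' or '<'")
    # The printed value is correct when the sign of t matches the direction of HA.
    sign_matches = (alternative == ">" and t_stat > 0) or (
        alternative == "<" and t_stat < 0
    )
    return printed_p if sign_matches else 1.0 - printed_p


# Made-up values: a printed one-tail p-value of .03 with t on either side of 0.
usual_case = true_one_tail_p_value(0.03, 2.1, ">")
wrong_sign = true_one_tail_p_value(0.03, -2.1, ">")
print(usual_case, wrong_sign)
```

When t is exactly 0 the printed one-tail p-value is .5, so either branch returns the same answer.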
Excel For Example 1 – F-Test
Go to Tools
Select Data Analysis
Select F-Test Two-Sample For Variances
Example 1 – F-Test (Cont’d)
Use Women (Column A) for Variable Range 1
Use Men (Column B) for Variable Range 2
Check Labels
Designate first cell for output.
Example 1 – F-Test (Cont’d)
p-value for
one-tail test
Enter =2*D9 to multiply the one-tail p-value by 2 to get the 2-tail p-value.
High p-value (.371671): cannot conclude unequal variances.
Use the Equal Variance t-test.
Example 1 – t-Test
Go to Tools
Select Data Analysis
Select t-Test: Two-Sample Assuming Equal Variances
Example 1 – t-Test (Cont’d)
Since HA is µW - µM > 0, enter
Column A for Range 1
Column B for Range 2
0 for Hypothesized Mean Difference
Check Labels
Designate first cell for output.
Example 1 – t-test (Cont’d)
p-value for the one-tail “>” test
p-value for the two-tail “≠” test
High p-value for 1-tail test!
Cannot conclude average women's score > average men's score
Example 1 – 95% Confidence Interval
=(D15-E15)-TINV(.05,D20)*SQRT(D18*(1/D17+1/E17))
$$(\bar{x}_1 - \bar{x}_2) - t_{.025,\,DF}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
Highlight Cell G19
Add $ Signs Using F4 key
Drag to cell G20
Change “-” to “+”
Excel For Example 2 – F-Test
Go to Tools
Select Data Analysis
Select F-Test Two-Sample For Variances
Example 2 – F-Test (Cont’d)
Use Lakers (Column B) for Variable Range 1
Use Clippers (Column D) for Variable Range 2
Check Labels
Designate first cell for output.
Example 2 – F-Test (Cont’d)
Enter =2*F9 to give the p-value for the two-tailed test
p-value for one-tail test
Low p-value (.000352): can conclude unequal variances.
Use the Unequal Variance t-test.
Example 2 – t-Test
Go to Tools
Select Data Analysis
Select t-Test: Two-Sample Assuming Unequal Variances
Example 2 – t-Test (Cont’d)
Since HA is µL - µC > 2000, enter
Column B for Range 1
Column D for Range 2
2000 for Hypothesized Mean Difference
Check Labels
Designate first cell for output.
Example 2 – t-test (Cont’d)
p-value for the one-tail “>” test
p-value for the two-tail “≠” test
Low p-value for 1-tail test (compared to α = .05)!
Can conclude the Lakers average more than 2000 more people per game than the Clippers.
Example 2 – 95% Confidence Interval
=(F15-G15)-TINV(.05,F19)*SQRT(F16/F17+G16/G17)
$$(\bar{x}_1 - \bar{x}_2) - t_{.025,\,DF}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Highlight Cell I14
Add $ Signs Using F4 key
Drag to cell I15
Change “-” to “+”
Review
• Standard Errors and Degrees of Freedom when:
– Variances are assumed equal
– Variances are not assumed equal
• F-statistic to determine if variances differ
• t-statistic and confidence interval when:
– Variances are assumed equal
– Variances are not assumed equal
• Hypothesis Tests/Confidence Intervals for Differences in Means (Assuming Equal or Unequal Variances)
– By hand
– By Excel