Test about a Population Proportion
Summary for large-sample tests for population proportion p
Null hypothesis: H0 : p = p0
Test statistic value:
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} \]
Alternative Hypothesis      Rejection Region
$H_a: p > p_0$              $z \ge z_\alpha$ (upper-tailed)
$H_a: p < p_0$              $z \le -z_\alpha$ (lower-tailed)
$H_a: p \ne p_0$            either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed)
Remark: These test procedures are valid provided that $np_0 \ge 10$ and $n(1 - p_0) \ge 10$.
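The summary above can be sketched in a few lines of Python. This is only an illustration: the function name `proportion_z_test`, its argument names, and the data in the usage line (60 successes in 100 trials, p0 = 0.5) are assumptions made for the example and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0, alpha=0.05, alternative="two-sided"):
    """Large-sample z test of H0: p = p0, given x successes in n trials.

    Returns the test statistic z and whether H0 is rejected at level alpha.
    """
    # Validity check from the remark: n*p0 >= 10 and n*(1 - p0) >= 10.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("large-sample conditions are not met")
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: p > p0, reject if z >= z_alpha
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # Ha: p < p0, reject if z <= -z_alpha
        reject = z <= -std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":    # Ha: p != p0, reject if |z| >= z_{alpha/2}
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject

# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5 against Ha: p > 0.5.
z, reject = proportion_z_test(60, 100, 0.5, alpha=0.05, alternative="greater")
print(round(z, 3), reject)
```

With these made-up numbers the statistic is 2.0, which exceeds the upper-tailed critical value at level 0.05 but not the one at level 0.01.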
Test about a Population Proportion
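Type II Error β for large-sample tests
The type II error probability $\beta(p')$ at an alternative value $p'$ depends on the alternative hypothesis.
For $H_a: p > p_0$:
\[ \beta(p') = \Phi\left[\frac{p_0 - p' + z_\alpha\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right] \]
For $H_a: p < p_0$:
\[ \beta(p') = 1 - \Phi\left[\frac{p_0 - p' - z_\alpha\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right] \]
For $H_a: p \ne p_0$:
\[ \beta(p') = \Phi\left[\frac{p_0 - p' + z_{\alpha/2}\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right] - \Phi\left[\frac{p_0 - p' - z_{\alpha/2}\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right] \]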
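A minimal sketch of these three β formulas, assuming the third row is the two-tailed alternative. The function name `beta_proportion` and the example values (p0 = 0.5, n = 100, alternatives 0.55 and 0.6) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def beta_proportion(p0, p_alt, n, alpha=0.05, alternative="greater"):
    """Type II error probability beta(p') of the large-sample proportion test."""
    std_normal = NormalDist()
    sd_null = sqrt(p0 * (1 - p0) / n)        # standard deviation of p-hat under H0
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)   # standard deviation of p-hat when p = p'
    if alternative == "greater":
        z = std_normal.inv_cdf(1 - alpha)
        return std_normal.cdf((p0 - p_alt + z * sd_null) / sd_alt)
    if alternative == "less":
        z = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf((p0 - p_alt - z * sd_null) / sd_alt)
    if alternative == "two-sided":
        z = std_normal.inv_cdf(1 - alpha / 2)
        upper = std_normal.cdf((p0 - p_alt + z * sd_null) / sd_alt)
        lower = std_normal.cdf((p0 - p_alt - z * sd_null) / sd_alt)
        return upper - lower
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Hypothetical setting: H0: p = 0.5, n = 100, upper-tailed test at level 0.05.
beta_near = beta_proportion(0.5, 0.55, 100)
beta_far = beta_proportion(0.5, 0.60, 100)
print(round(beta_near, 3), round(beta_far, 3))
```

A quick sanity check on any of the three formulas is that β(p0) equals 1 − α, and that β shrinks as p′ moves further from p0 in the direction of the alternative.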
Test about a Population Proportion
Small-Sample Tests
When the sample size n is small (n ≤ 30), we test the hypotheses based directly on the binomial distribution.
For example, if the null hypothesis is $H_0: p = p_0$ and the alternative hypothesis is $H_a: p > p_0$, then the rejection region is of the form $X \ge c$, where $X \sim \mathrm{Bin}(n, p)$.
\[
\begin{aligned}
P(\text{type I error}) &= P(\text{reject } H_0 \mid H_0) = P(X \ge c \mid p = p_0) \\
&= 1 - P(X < c \mid p = p_0) = 1 - P(X \le c - 1 \mid p = p_0) \\
&= 1 - B(c - 1; n, p_0)
\end{aligned}
\]
And
\[
\begin{aligned}
P(\text{type II error}) &= P(\text{fail to reject } H_0 \mid p = p') = P(X < c \mid p = p') \\
&= P(X \le c - 1 \mid p = p') = B(c - 1; n, p')
\end{aligned}
\]
Here $B(x; n, p)$ denotes the cdf of the $\mathrm{Bin}(n, p)$ distribution.
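The two error probabilities can be evaluated directly from the binomial cdf. The sketch below is illustrative only: the helper names `binom_cdf` and `error_probabilities` and the numbers n = 10, p0 = 0.5, c = 8, p′ = 0.8 are assumptions introduced for the example.

```python
from math import comb

def binom_cdf(k, n, p):
    """B(k; n, p) = P(X <= k) for X ~ Bin(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))

def error_probabilities(c, n, p0, p_alt):
    """Error probabilities of the upper-tailed test that rejects H0 when X >= c."""
    type_1 = 1 - binom_cdf(c - 1, n, p0)   # P(X >= c | p = p0)
    type_2 = binom_cdf(c - 1, n, p_alt)    # P(X <= c - 1 | p = p')
    return type_1, type_2

# Hypothetical setting: n = 10, H0: p = 0.5, rejection region X >= 8, alternative p' = 0.8.
type_1, type_2 = error_probabilities(8, 10, 0.5, 0.8)
print(round(type_1, 4), round(type_2, 4))
```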
Test about a Population Proportion
Small-Sample Tests
Remark: In the small-sample case, it is usually not possible to find a value c for which P(type I error) is exactly the desired significance level α. Therefore we choose the largest rejection region satisfying
\[ P(\text{type I error}) \le \alpha. \]
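For an upper-tailed test, the largest admissible rejection region corresponds to the smallest cutoff c whose type I error probability does not exceed α. A minimal sketch of that search follows; the names `upper_tail` and `critical_value` and the setting n = 10, p0 = 0.5, α = 0.05 are hypothetical.

```python
from math import comb

def upper_tail(c, n, p):
    """P(X >= c) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(max(c, 0), n + 1))

def critical_value(n, p0, alpha):
    """Smallest c with P(X >= c | p = p0) <= alpha, i.e. the largest region {c, ..., n}.

    Returns c = n + 1 (an empty rejection region) if no nonempty region qualifies.
    """
    for c in range(n + 2):
        tail = upper_tail(c, n, p0)
        if tail <= alpha:
            return c, tail

# Hypothetical setting: n = 10, H0: p = 0.5, desired level alpha = 0.05.
c, actual_level = critical_value(10, 0.5, 0.05)
print(c, round(actual_level, 4))
```

In this made-up setting the region X ≥ 8 has type I error probability 56/1024 ≈ 0.055, slightly above 0.05, so the search moves on to X ≥ 9, whose actual level is 11/1024 ≈ 0.011.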
P-value
Definition
The P-value (or observed significance level) is the smallest level of significance at which H0 would be rejected when a specified test procedure is used on a given data set. Once the P-value has been determined, the conclusion at any particular level α results from comparing the P-value to α:
1. P-value ≤ α ⇒ reject H0 at level α.
2. P-value > α ⇒ fail to reject H0 at level α.
Convention: It is customary to call the data significant when H0 is rejected and not significant otherwise.
P-value
P-value for z Tests
\[
P = \begin{cases}
1 - \Phi(z) & \text{for an upper-tailed test} \\
\Phi(z) & \text{for a lower-tailed test} \\
2[1 - \Phi(|z|)] & \text{for a two-tailed test}
\end{cases}
\]
where $\Phi(z)$ is the cdf of a standard normal rv.
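The three cases translate directly into code. In this sketch the function name `z_p_value` and the tail labels are assumptions; `statistics.NormalDist().cdf` from the Python standard library plays the role of Φ.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """P-value of a z test; tail is 'upper', 'lower' or 'two'."""
    phi = NormalDist().cdf   # standard normal cdf
    if tail == "upper":
        return 1 - phi(z)
    if tail == "lower":
        return phi(z)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'upper', 'lower' or 'two'")

# Illustrative values of z (not taken from the slides).
print(round(z_p_value(1.96, "two"), 4))
print(round(z_p_value(-1.645, "lower"), 4))
```

Both illustrative calls return values very close to 0.05, matching the familiar critical values 1.96 and 1.645.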
Statistical Inference Based on Two Samples
Basic Assumptions
1. $X_1, X_2, \dots, X_m$ is a random sample from a population with mean $\mu_1$ and variance $\sigma_1^2$.
2. $Y_1, Y_2, \dots, Y_n$ is a random sample from a population with mean $\mu_2$ and variance $\sigma_2^2$.
3. The two samples are independent of one another.
Proposition
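The expected value of $\bar{X} - \bar{Y}$ is $\mu_1 - \mu_2$ and the standard deviation of $\bar{X} - \bar{Y}$ is
\[ \sigma_{\bar{X} - \bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}} \]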
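The proposition follows from linearity of expectation together with the independence of the two samples. A short derivation, using the standard one-sample facts that a sample mean of m observations has expectation $\mu_1$ and variance $\sigma_1^2/m$ (and likewise for the second sample of size n):

```latex
\begin{aligned}
E(\bar{X} - \bar{Y}) &= E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2, \\
V(\bar{X} - \bar{Y}) &= V(\bar{X}) + V(\bar{Y})
   = \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}
   \quad \text{(the samples are independent)}, \\
\sigma_{\bar{X} - \bar{Y}} &= \sqrt{V(\bar{X} - \bar{Y})}
   = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}.
\end{aligned}
```

Note that the variances add even though the means are subtracted, which is why independence (assumption 3) is needed.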
Samples from Normal Populations with Known Variances
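If the two samples $X_1, X_2, \dots, X_m$ and $Y_1, Y_2, \dots, Y_n$ are from normal populations, then we have
\[ \bar{X} - \bar{Y} \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}\right) \]
Therefore,
\[ Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} \]
is a standard normal rv.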
Samples from Normal Populations with Known Variances
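If the population variances are known to be $\sigma_1^2$ and $\sigma_2^2$, then the two-sided confidence interval for the difference of the population means $\mu_1 - \mu_2$ with confidence level $1 - \alpha$ is
\[ \left( (\bar{X} - \bar{Y}) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}},\ (\bar{X} - \bar{Y}) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}} \right) \]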
Samples from Normal Populations with Known Variances
In the case of known population variances, the procedure for testing hypotheses about the difference of the population means $\mu_1 - \mu_2$ is similar to the one-sample test for the population mean:
Null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$
Test statistic value
\[ z = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} \]
Alternative Hypothesis              Rejection Region for Level α Test
$H_a: \mu_1 - \mu_2 > \Delta_0$     $z \ge z_\alpha$ (upper-tailed)
$H_a: \mu_1 - \mu_2 < \Delta_0$     $z \le -z_\alpha$ (lower-tailed)
$H_a: \mu_1 - \mu_2 \ne \Delta_0$   $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed)
Samples from Normal Populations with Known Variances
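The type II error probability when $\mu_1 - \mu_2 = \Delta'$ is calculated in the same way as in the one-sample case:
Alternative Hypothesis              Type II Error Probability $\beta(\Delta')$ for Level α Test
$H_a: \mu_1 - \mu_2 > \Delta_0$     $\Phi\left(z_\alpha + \frac{\Delta_0 - \Delta'}{\sigma}\right)$
$H_a: \mu_1 - \mu_2 < \Delta_0$     $1 - \Phi\left(-z_\alpha + \frac{\Delta_0 - \Delta'}{\sigma}\right)$
$H_a: \mu_1 - \mu_2 \ne \Delta_0$   $\Phi\left(z_{\alpha/2} + \frac{\Delta_0 - \Delta'}{\sigma}\right) - \Phi\left(-z_{\alpha/2} + \frac{\Delta_0 - \Delta'}{\sigma}\right)$
where
\[ \sigma = \sigma_{\bar{X} - \bar{Y}} = \sqrt{(\sigma_1^2/m) + (\sigma_2^2/n)}. \]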
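A sketch of the β(Δ′) calculation. For the two-tailed case it uses Φ(z_{α/2} + (Δ0 − Δ′)/σ) − Φ(−z_{α/2} + (Δ0 − Δ′)/σ), which is the form that gives β(Δ0) = 1 − α. The function name `beta_two_sample` is an assumption, and the illustrative call borrows σ1 = 4, σ2 = 5, m = 20, n = 25 from Example 9.1 while the choices Δ′ = 5 and α = 0.01 are made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def beta_two_sample(delta0, delta_alt, sigma1, sigma2, m, n,
                    alpha=0.05, alternative="two-sided"):
    """Type II error probability of the two-sample z test with known variances."""
    std_normal = NormalDist()
    sigma = sqrt(sigma1**2 / m + sigma2**2 / n)   # sd of Xbar - Ybar
    shift = (delta0 - delta_alt) / sigma
    if alternative == "greater":
        return std_normal.cdf(std_normal.inv_cdf(1 - alpha) + shift)
    if alternative == "less":
        return 1 - std_normal.cdf(-std_normal.inv_cdf(1 - alpha) + shift)
    if alternative == "two-sided":
        z = std_normal.inv_cdf(1 - alpha / 2)
        return std_normal.cdf(z + shift) - std_normal.cdf(-z + shift)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Illustrative: two-tailed level 0.01 test of H0: mu1 - mu2 = 0 when the true difference is 5.
beta = beta_two_sample(0.0, 5.0, 4.0, 5.0, 20, 25, alpha=0.01)
print(round(beta, 4))
```

Under these assumptions β comes out near 0.125, so the test would detect a true difference of 5 with probability of roughly 0.875.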
Large Size Samples
Example 9.1
Analysis of a random sample consisting of m = 20 specimens of cold-rolled steel to determine yield strengths resulted in a sample average strength of $\bar{x} = 29.8$ ksi. A second random sample of n = 25 two-sided galvanized steel specimens gave a sample average strength of $\bar{y} = 34.7$ ksi. Assuming that the two yield-strength distributions are normal with $\sigma_1 = 4.0$ and $\sigma_2 = 5.0$, do the data indicate that the corresponding true average yield strengths $\mu_1$ and $\mu_2$ are different?
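The slide does not show a solution. One way the example could be worked, assuming a two-tailed test of H0: µ1 − µ2 = 0 with the known-variance z statistic and a P-value computed from the standard normal cdf:

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 9.1.
m, n = 20, 25
xbar, ybar = 29.8, 34.7       # sample means (ksi)
sigma1, sigma2 = 4.0, 5.0     # known population standard deviations

delta0 = 0.0                                     # H0: mu1 - mu2 = 0 (assumed null value)
se = sqrt(sigma1**2 / m + sigma2**2 / n)         # sd of Xbar - Ybar
z = (xbar - ybar - delta0) / se                  # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # two-tailed P-value

print(round(se, 3), round(z, 2), p_value)
```

The statistic is about −3.65, far beyond the two-tailed critical values at the usual levels, so under these assumptions the data suggest the true average yield strengths differ.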
Large Size Samples
When the sample sizes are large, both $\bar{X}$ and $\bar{Y}$ are approximately normally distributed, and
\[ Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}} \]
is approximately a standard normal rv.
Large Size Samples
In case both m and n are large (m, n > 30), the procedures for constructing confidence intervals and testing hypotheses for the difference of two population means are similar to the one-sample case.
The two-sided confidence interval for the difference of the population means $\mu_1 - \mu_2$ with confidence level $1 - \alpha$ is
\[ \left( (\bar{X} - \bar{Y}) - z_{\alpha/2}\sqrt{\frac{S_1^2}{m} + \frac{S_2^2}{n}},\ (\bar{X} - \bar{Y}) + z_{\alpha/2}\sqrt{\frac{S_1^2}{m} + \frac{S_2^2}{n}} \right) \]
Large Size Samples
In case both m and n are large (m, n > 30), the procedure for testing hypotheses about the difference of the population means $\mu_1 - \mu_2$ is:
Null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$
Test statistic value
\[ z = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}} \]
Alternative Hypothesis              Rejection Region for Level α Test
$H_a: \mu_1 - \mu_2 > \Delta_0$     $z \ge z_\alpha$ (upper-tailed)
$H_a: \mu_1 - \mu_2 < \Delta_0$     $z \le -z_\alpha$ (lower-tailed)
$H_a: \mu_1 - \mu_2 \ne \Delta_0$   $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed)
Samples from Normal Populations with Known Variances
Example Problem 7
Are male college students more easily bored than their female counterparts? This question was examined in the article “Boredom in Young Adults – Gender and Cultural Comparisons” (J. of Cross-Cultural Psych., 1991: 209-223). The authors administered a scale called the Boredom Proneness Scale to 97 male and 148 female U.S. college students. Do the accompanying data support the research hypothesis that the mean Boredom Proneness Rating is higher for men than for women?
Gender   Sample Size   Sample Mean   Sample SD
Male     97            10.40         4.83
Female   148           9.26          4.68
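The slide leaves the problem unsolved. A sketch of one way to work it, assuming the large-sample z test applies (both samples exceed 30), reading the sample standard deviations as 4.83 and 4.68, and taking men as population 1 so that the research hypothesis becomes the upper-tailed alternative Ha: µ1 − µ2 > 0:

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the table (males = sample 1, females = sample 2).
m, xbar, s1 = 97, 10.40, 4.83
n, ybar, s2 = 148, 9.26, 4.68

se = sqrt(s1**2 / m + s2**2 / n)     # estimated sd of Xbar - Ybar
z = (xbar - ybar - 0) / se           # test statistic for H0: mu1 - mu2 = 0
p_value = 1 - NormalDist().cdf(z)    # upper-tailed P-value

print(round(z, 2), round(p_value, 4))
```

Under these assumptions z is about 1.83 with a P-value near 0.034, so H0 would be rejected at level 0.05 but not at level 0.01.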
Two-Sample t Test and C.I.
Assumptions:
Both populations are normal, so that $X_1, X_2, \dots, X_m$ is a random sample from a normal distribution and so is $Y_1, Y_2, \dots, Y_n$ (with the X's and Y's independent of one another). The plausibility of these assumptions can be judged by constructing a normal probability plot of the $x_i$'s and another of the $y_i$'s.
Two-Sample t Test and C.I.
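Theorem
When the population distributions are both normal, the standardized variable
\[ T = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\sqrt{\dfrac{S_X^2}{m} + \dfrac{S_Y^2}{n}}} \]
has approximately a t distribution with df ν estimated from the data by
\[ \nu = \frac{\left(\dfrac{s_X^2}{m} + \dfrac{s_Y^2}{n}\right)^2}{\dfrac{(s_X^2/m)^2}{m - 1} + \dfrac{(s_Y^2/n)^2}{n - 1}} \]
(round ν down to the nearest integer).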
Two-Sample t Test and C.I.
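Remark: The df of the r.v. T can also be estimated from the sample data by
\[ \nu = \frac{\left[(se_X)^2 + (se_Y)^2\right]^2}{\dfrac{(se_X)^4}{m - 1} + \dfrac{(se_Y)^4}{n - 1}} \]
where
\[ se_X = \frac{s_X}{\sqrt{m}}, \qquad se_Y = \frac{s_Y}{\sqrt{n}} \]
(round ν down to the nearest integer).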
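The two expressions for ν are the same quantity, since $(se_X)^2 = s_X^2/m$ and $(se_Y)^2 = s_Y^2/n$. A small sketch that computes both and rounds down; the function names are assumptions, and the inputs are the summary statistics of the fabric example that appears later in these notes (s = 0.444 with m = 24, s = 0.530 with n = 8).

```python
from math import floor, sqrt

def welch_df(s_x, m, s_y, n):
    """Estimated df, written in terms of the sample variances s^2/m and s^2/n."""
    v_x, v_y = s_x**2 / m, s_y**2 / n
    return (v_x + v_y) ** 2 / (v_x**2 / (m - 1) + v_y**2 / (n - 1))

def welch_df_se(s_x, m, s_y, n):
    """Same estimate, written in terms of the standard errors se = s / sqrt(size)."""
    se_x, se_y = s_x / sqrt(m), s_y / sqrt(n)
    return (se_x**2 + se_y**2) ** 2 / (se_x**4 / (m - 1) + se_y**4 / (n - 1))

nu = welch_df(0.444, 24, 0.530, 8)
nu_se = welch_df_se(0.444, 24, 0.530, 8)
df = floor(nu)   # round down to the nearest integer

print(round(nu, 2), round(nu_se, 2), df)
```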
Two-Sample t Test and C.I.
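The two-sample t confidence interval for $\mu_X - \mu_Y$ with confidence level $100(1 - \alpha)\%$ is given by
\[ \left( (\bar{x} - \bar{y}) - t_{\alpha/2,\nu}\sqrt{\frac{s_X^2}{m} + \frac{s_Y^2}{n}},\ (\bar{x} - \bar{y}) + t_{\alpha/2,\nu}\sqrt{\frac{s_X^2}{m} + \frac{s_Y^2}{n}} \right) \]
A one-sided confidence bound can be obtained by replacing $t_{\alpha/2,\nu}$ with $t_{\alpha,\nu}$.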
Two-Sample t Test and C.I.
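The two-sample t test for testing $H_0: \mu_X - \mu_Y = \Delta_0$ is as follows:
Test statistic value:
\[ t = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\dfrac{s_X^2}{m} + \dfrac{s_Y^2}{n}}} \]
Alternative Hypothesis              Rejection Region for Approximate Level α Test
$H_a: \mu_X - \mu_Y > \Delta_0$     $t \ge t_{\alpha,\nu}$ (upper-tailed)
$H_a: \mu_X - \mu_Y < \Delta_0$     $t \le -t_{\alpha,\nu}$ (lower-tailed)
$H_a: \mu_X - \mu_Y \ne \Delta_0$   $t \ge t_{\alpha/2,\nu}$ or $t \le -t_{\alpha/2,\nu}$ (two-tailed)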
Two-Sample t Test and C.I.
Example: (Problem 23)
Fusible interlinings are being used with increasing frequency to support outer fabrics and improve the shape and drape of various pieces of clothing. The article “Compatibility of Outer and Fusible Interlining Fabrics in Tailored Garments” (Textile Res. J. 1997: 137-142) gave the accompanying data on extensibility (%) at 100 gm/cm for both high-quality fabric (H) and poor-quality fabric (P) specimens.
H  1.2 0.9 0.7 1.0 1.7 1.7 1.1 0.9 1.7
   1.9 1.3 2.1 1.6 1.8 1.4 1.3 1.9 1.6
   0.8 2.0 1.7 1.6 2.3 2.0
P  1.6 1.5 1.1 2.1 1.5 1.3 1.0 2.6
The sample mean and standard deviation for the high-quality sample are 1.508 and 0.444, respectively, and those for the poor-quality sample are 1.588 and 0.530.
Construct a 95% C.I. for the difference of the true average extensibility between high-quality fabric and poor-quality fabric. Decide whether the true average extensibility differs for the two types.
Two-Sample t Test and C.I.
[Figure: quantile-quantile plot for sample H]
[Figure: quantile-quantile plot for sample P]
Two-Sample t Test and C.I.
Degrees of freedom calculated from the samples:
\[ \nu = \frac{(0.444^2/24 + 0.530^2/8)^2}{\dfrac{(0.444^2/24)^2}{24 - 1} + \dfrac{(0.530^2/8)^2}{8 - 1}} = 10.5, \]
which is rounded down to ν = 10.
With $\alpha = 0.05$, $t_{\alpha/2,\nu} = t_{0.025,10} = 2.228$.
Therefore the 95% C.I. for the difference of the true average extensibility for the two types of fabric is given by
\[ (1.508 - 1.588) \pm 2.228 \cdot \sqrt{\frac{0.444^2}{24} + \frac{0.530^2}{8}} \]
which is (−0.544, 0.384).
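The interval can be checked numerically. This sketch takes the critical value 2.228 from the text rather than computing it, since the Python standard library has no t quantile function; the variable names are assumptions.

```python
from math import sqrt

# Summary statistics for the fabric data (H = sample 1, P = sample 2).
m, xbar, s1 = 24, 1.508, 0.444
n, ybar, s2 = 8, 1.588, 0.530
t_crit = 2.228                      # t_{0.025,10}, taken from the text

se = sqrt(s1**2 / m + s2**2 / n)    # estimated sd of Xbar - Ybar
diff = xbar - ybar
lower, upper = diff - t_crit * se, diff + t_crit * se
contains_zero = lower < 0 < upper

print(round(lower, 3), round(upper, 3), contains_zero)
```

Because the interval contains 0, the data do not show a difference in true average extensibility between the two fabric types at the 95% confidence level.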
Two-Sample t Test and C.I.
1. Let $\mu_1$ be the true average extensibility for the high-quality fabric and $\mu_2$ for the poor-quality fabric.
2. Hypotheses: $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 < 0$
3. Test statistic:
\[ T = \frac{(\bar{X} - \bar{Y}) - 0}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}}, \]
and the value of the test statistic is
\[ t = \frac{1.508 - 1.588}{\sqrt{\dfrac{0.444^2}{24} + \dfrac{0.530^2}{8}}} = \frac{-0.080}{0.208} = -0.384, \]
and df is 10.
4. The P-value for a lower-tailed t test in this case is approximately 0.35.
5. Using significance level 0.01, we will not reject $H_0$.
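A sketch that recomputes the test statistic and the lower-tailed P-value from the summary statistics. Note that the denominator is the square root of 0.444²/24 + 0.530²/8 (about 0.208), which gives t ≈ −0.384; dividing by the sum without the square root would give −1.846 instead. Since the Python standard library has no t distribution, the helper functions `t_pdf` and `t_cdf` (names introduced here) integrate the t density with Simpson's rule as a stand-in for a t table.

```python
from math import gamma, pi, sqrt

def t_pdf(t, nu):
    """Density of the t distribution with nu degrees of freedom."""
    const = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return const * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_cdf(t, nu, steps=2000):
    """P(T <= t) via Simpson's rule on [0, |t|] and symmetry about 0 (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3               # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

# Summary statistics for the fabric data (H = sample 1, P = sample 2).
m, xbar, s1 = 24, 1.508, 0.444
n, ybar, s2 = 8, 1.588, 0.530
nu = 10                                # df after rounding down

se = sqrt(s1**2 / m + s2**2 / n)
t_stat = (xbar - ybar - 0) / se        # H0: mu1 - mu2 = 0
p_value = t_cdf(t_stat, nu)            # lower-tailed: P(T <= t)

print(round(t_stat, 3), round(p_value, 3))
```

The P-value is far above 0.01, so H0 is not rejected, in agreement with the 95% confidence interval (−0.544, 0.384) containing 0.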