Teleconference 4: Statistics, Chapters 8, 9 and 10
CHAPTER 8: Confidence Interval Estimation
INTRODUCTION
Statistical inference is the process of using sample results to draw conclusions about the characteristics of a population.
Inferential statistics enables us to estimate an unknown population mean or population proportion.
There are two types of estimate: the point estimate and the interval estimate.
A confidence interval estimate is a range of numbers, called an interval, constructed around the point estimate.
POINT AND INTERVAL ESTIMATES
A confidence interval is a range of values within which the population parameter is expected to occur.
The two confidence intervals that are used extensively are the 95% and the 99%.
An Interval Estimate states the range within which a population parameter probably lies.
A point estimate is a single value (statistic) used to estimate a population value (parameter).
Factors that determine the width of a confidence interval:
The sample size, n
The variability in the population, usually estimated by s
The desired level of confidence
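How is the formula derived? It comes from the z test formula:
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Rearrange the above formula:
$\mu = \bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
The value of Z is taken from the z table.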
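8.1 Confidence interval for a mean (σ known)
$\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
or
$\bar{X} - Z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z \frac{\sigma}{\sqrt{n}}$
Refer to page 327 for Example 8.1.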
EXAMPLE 3
The Dean of the Business School wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours. What is the population mean?
The value of the population mean is not known. Our best estimate of this value is the sample mean of 24.0 hours. This value is called a point estimate.
The 95 percent confidence interval for the population mean is:
$\bar{X} \pm 1.96 \frac{s}{\sqrt{n}} = 24.00 \pm 1.96 \frac{4}{\sqrt{49}} = 24.00 \pm 1.12$
The confidence limits range from 22.88 to 25.12.
About 95 percent of similarly constructed intervals include the population parameter.
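[Figure: standard normal curve with 95% of the area between z = -1.96 and z = 1.96 and 2.5% in each tail; critical values taken from the z table]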
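The arithmetic in the student-hours example can be checked with a short Python sketch. The function name `mean_confidence_interval` is an illustrative choice, not something from the slides, and the z value of 1.96 is supplied by hand from the z table.

```python
import math

def mean_confidence_interval(sample_mean, std_dev, n, z):
    """Return (lower, upper) limits of sample_mean +/- z * std_dev / sqrt(n)."""
    margin = z * std_dev / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Student-hours example: n = 49, mean 24 hours, standard deviation 4 hours, z = 1.96
lower, upper = mean_confidence_interval(24.0, 4.0, 49, 1.96)
print(f"{lower:.2f} to {upper:.2f}")
```

Changing the z argument to 2.58 would give the wider 99 percent interval for the same sample.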
8.2 Confidence interval for a mean (σ unknown)
If the population σ is unavailable, we need to develop a confidence interval estimate of µ using only the sample statistics: the mean and standard deviation ($\bar{X}$ and S).
We then use Student's t distribution instead of the Z value.
$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$
Refer to Table E.3, page 330.
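Confidence interval for a mean (σ unknown)
$\bar{X} \pm t_{n-1} \frac{S}{\sqrt{n}}$
or
$\bar{X} - t_{n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1} \frac{S}{\sqrt{n}}$
For examples, refer to pages 332 and 333.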
8.3 Confidence interval estimation for the proportion
$p \pm Z \sqrt{\frac{p(1-p)}{n}}$
or
$p - Z \sqrt{\frac{p(1-p)}{n}} \le \pi \le p + Z \sqrt{\frac{p(1-p)}{n}}$
Refer to Example 8.4, page 340.
EXAMPLE 4
A sample of 500 executives who own their own home revealed that 175 planned to sell their homes and retire to Arizona. Develop a 98% confidence interval for the proportion of executives who plan to sell and move to Arizona.
$.35 \pm 2.33 \sqrt{\frac{(.35)(.65)}{500}} = .35 \pm .0497$
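As a rough check on the executives example, the sketch below computes the sample proportion and the margin of error. The function name is hypothetical, and the z value of 2.33 for 98% confidence is entered by hand, as in the example.

```python
import math

def proportion_confidence_interval(successes, n, z):
    """Return the sample proportion p and the margin z * sqrt(p(1-p)/n)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin

# 175 of 500 executives, 98% confidence (z = 2.33)
p, margin = proportion_confidence_interval(175, 500, 2.33)
print(f"{p:.2f} +/- {margin:.4f}")
print(f"limits: {p - margin:.4f} to {p + margin:.4f}")
```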
8.4 Determining Sample Size
8.4.1 How to determine the right sample size for the mean
$n = \frac{Z^2 \sigma^2}{e^2}$
where e = the acceptable sampling error
8.4.2 Sample size determination for the proportion
$n = \frac{Z^2 \pi (1 - \pi)}{e^2}$
If there is no knowledge about the population proportion π, use π = 0.5 to determine the sample size. This value maximizes π(1 - π) and so gives the largest, most conservative sample size.
To determine the sample size, you must know three factors:
1. The desired confidence level, which determines the value of Z, the critical value from the standardized normal distribution
2. The acceptable sampling error
3. The standard deviation or population proportion
EXAMPLE 6
A consumer group would like to estimate the mean monthly electricity charge for a single family house in July within $5 using a 99 percent level of confidence. Based on similar studies the standard deviation is estimated to be $20.00. How large a sample is required?
$n = \left( \frac{(2.58)(20)}{5} \right)^2 = 106.5$, which is rounded up to 107.
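The two sample-size formulas can be sketched in Python as follows. The function names are illustrative, and rounding up with `math.ceil` reflects the usual convention that a fractional sample size is always rounded to the next whole observation.

```python
import math

def sample_size_for_mean(z, sigma, e):
    """n = (z * sigma / e)^2, rounded up to a whole observation."""
    return math.ceil((z * sigma / e) ** 2)

def sample_size_for_proportion(z, pi, e):
    """n = z^2 * pi * (1 - pi) / e^2, rounded up; use pi = 0.5 if unknown."""
    return math.ceil(z ** 2 * pi * (1 - pi) / e ** 2)

# Electricity-charge example: 99% confidence (z = 2.58), sigma = $20, e = $5
n = sample_size_for_mean(2.58, 20.0, 5.0)
print(n)
```

The proportion version is called the same way, passing the assumed π (0.5 when nothing is known) in place of the standard deviation.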
CHAPTER 9: Fundamentals of Hypothesis Testing: One-Sample Tests
WHAT IS A HYPOTHESIS?
A hypothesis is a statement about the value of a population parameter developed for the purpose of testing.
Examples:
Twenty percent of all customers at Bovine's Chop House return for another meal within a month.
The mean monthly income for systems analysts is $6,325.
WHAT IS HYPOTHESIS TESTING?
Hypothesis testing is a procedure, based on sample evidence and probability theory, used to determine whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be rejected.
HYPOTHESIS TESTING
Step 1: State null and alternate hypotheses
Step 2: Select a level of significance
Step 3: Identify the test statistic
Step 4: Formulate a decision rule
Step 5: Take a sample, arrive at a decision
Outcome: do not reject the null, or reject the null and accept the alternate
Step One: State the null and alternate hypotheses
Null Hypothesis H0: A statement about the value of a population parameter.
Alternative Hypothesis H1: A statement that is accepted if the sample data provide evidence that the null hypothesis is false.
3 HYPOTHESES ABOUT MEANS
Three possibilities regarding means:
H0: µ = 0, H1: µ ≠ 0
H0: µ ≤ 0, H1: µ > 0
H0: µ ≥ 0, H1: µ < 0
The null hypothesis always contains equality.
STEP TWO: SELECT A LEVEL OF SIGNIFICANCE
Level of significance: The probability of rejecting the null hypothesis when it is actually true; the level of risk in so doing.
Type I error: Rejecting the null hypothesis when it is actually true. The probability of a Type I error is under your control, because you choose the level of significance.
Type II error: Accepting the null hypothesis when it is actually false.
                    Researcher
Null hypothesis     Accepts H0          Rejects H0
H0 is true          Correct decision    Type I error
H0 is false         Type II error       Correct decision
RISK TABLE
Level of significance (α), Type I error    Confidence level    Z value for two-tail test    Z value for one-tail test
0.01                                       99%                 p = 0.005, z = 2.58          p = 0.01, z = 2.33
0.05                                       95%                 p = 0.025, z = 1.96          p = 0.05, z = 1.65
0.10                                       90%                 p = 0.05, z = 1.65           p = 0.10, z = 1.28
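The z values in the risk table can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_z` is an assumption made for illustration. The exact one-tail value at α = 0.05 is about 1.645, which the table rounds to 1.65.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Critical z for significance level alpha with 1 or 2 rejection tails."""
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.01, 0.05, 0.10):
    two_tail = critical_z(alpha, 2)
    one_tail = critical_z(alpha, 1)
    print(f"alpha={alpha:.2f}  two-tail z={two_tail:.3f}  one-tail z={one_tail:.3f}")
```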
[Figure: One-tail test. Do-not-reject region (probability = .95) to the left of the critical value 1.65; region of rejection (probability = .05) to the right.]
[Figure: Two-tail test. Do-not-reject region (probability = .95) between the critical values -1.96 and 1.96; regions of rejection (probability = .025 each) in both tails.]
STEP THREE: SELECT THE TEST STATISTIC
Test statistic: A value, determined from sample information, used to determine whether or not to reject the null hypothesis.
Examples: z, t, F, χ²
z Distribution as a test statistic
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
The z value is based on the sampling distribution of $\bar{X}$, which is normally distributed when the sample is reasonably large (recall the Central Limit Theorem).
Step Four: Formulate the decision rule
Critical value: The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected.
[Figure: Sampling distribution of the statistic z, a right-tailed test, .05 level of significance. Do-not-reject region (probability = .95) to the left of the critical value 1.65; region of rejection (probability = .05) to the right.]
DECISION RULE
Reject the null hypothesis and accept the alternate hypothesis if
Computed -z < Critical -z
or
Computed z > Critical z
USING THE P-VALUE IN HYPOTHESIS TESTING
p-Value: The probability, assuming that the null hypothesis is true, of finding a value of the test statistic at least as extreme as the computed value for the test.
It is calculated from the probability distribution function or by computer.
Decision Rule
If the p-Value is larger than or equal to the significance level, α, H0 is not rejected.
If the p-Value is smaller than the significance level, α, H0 is rejected.
Refer to Example 9.4, page 382.
Interpreting p-values
p between .05 and .10: SOME evidence H0 is not true
p between .01 and .05: STRONG evidence H0 is not true
p between .001 and .01: VERY STRONG evidence H0 is not true
ONE-TAILED TESTS OF SIGNIFICANCE
The alternate hypothesis, H1, states a direction.
H1: The mean yearly commissions earned by full-time realtors is more than $35,000. (µ > $35,000)
H1: The mean speed of trucks traveling on I-95 in Georgia is less than 60 miles per hour. (µ < 60)
H1: Less than 20 percent of the customers pay cash for their gasoline purchase. (π < .20)
[Figure: Sampling distribution of the statistic z, a right-tailed test, .05 level of significance. Do-not-reject region (probability = .95) to the left of the critical value 1.65; region of rejection (probability = .05) to the right.]
Two-Tailed Tests of Significance
No direction is specified in the alternate hypothesis H1.
H1: The mean price for a gallon of gasoline is not equal to $1.54. (µ ≠ $1.54)
H1: The mean amount spent by customers at the Wal-mart in Georgetown is not equal to $25. (µ ≠ $25)
[Figure: Regions of nonrejection and rejection for a two-tailed test, .05 level of significance. Do-not-reject region (probability = .95) between the critical values -1.96 and 1.96; regions of rejection (probability = .025 each) in both tails.]
TESTING FOR THE POPULATION MEAN: LARGE SAMPLE, POPULATION STANDARD DEVIATION KNOWN
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
EXAMPLE 1
The processors of Fries' Catsup indicate on the label that the bottle contains 16 ounces of catsup. The standard deviation of the process is 0.5 ounces. A sample of 36 bottles from last hour's production revealed a mean weight of 16.12 ounces per bottle. At the .05 significance level is the process out of control? That is, can we conclude that the mean amount per bottle is different from 16 ounces?
µ = 16 ounces, σ = 0.5 ounces, n = 36, $\bar{X}$ = 16.12 ounces, α = 0.05
Step 1: State the null and the alternative hypotheses. H0: µ = 16, H1: µ ≠ 16
Step 2: Select the significance level. The significance level is .05.
Step 3: Identify the test statistic. Because we know the population standard deviation, the test statistic is z.
Step 4: State the decision rule. This is a two-tail test: reject H0 if z > 1.96 or z < -1.96, or if p < .05.
Step 5: Make a decision and interpret the results.
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{16.12 - 16.00}{0.5 / \sqrt{36}} = 1.44$
The p-value is .1499 for a two-tailed test, that is, twice p(z > 1.44).
Computed z of 1.44 < critical z of 1.96, and p of .1499 > α of .05, so do not reject the null hypothesis.
We cannot conclude the mean is different from 16 ounces.
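The catsup example can be reproduced with a minimal two-tailed z test in Python. The function name is hypothetical; the p-value is taken as twice the upper-tail area beyond |z|, using the standard library's `NormalDist`.

```python
import math
from statistics import NormalDist

def z_test_two_tailed(sample_mean, mu0, sigma, n):
    """Return (z, two-tailed p-value) for H0: mu = mu0 with sigma known."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Catsup example: X-bar = 16.12, mu0 = 16, sigma = 0.5, n = 36
z, p_value = z_test_two_tailed(16.12, 16.0, 0.5, 36)
reject = abs(z) > 1.96  # decision rule at the .05 level
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```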
TESTING FOR THE POPULATION MEAN: LARGE SAMPLE, POPULATION STANDARD DEVIATION UNKNOWN
Here σ is unknown, so we estimate it with the sample standard deviation s.
As long as the sample size n > 30, z can be approximated using
$z = \frac{\bar{X} - \mu}{s / \sqrt{n}}$
EXAMPLE 2
Roder's Discount Store chain issues its own credit card. Lisa, the credit manager, wants to find out if the mean monthly unpaid balance is more than $400. The level of significance is set at .05. A random check of 172 unpaid balances revealed the sample mean to be $407 and the sample standard deviation to be $38.
Should Lisa conclude that the population mean is greater than $400, or is it reasonable to assume that the difference of $7 ($407 - $400) is due to chance?
Step 1: H0: µ ≤ $400, H1: µ > $400
Step 2: The significance level is .05.
Step 3: Because the sample is large we can use the z distribution as the test statistic.
Step 4: H0 is rejected if z > 1.65 or if p < .05.
Step 5: Make a decision and interpret the results.
$z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{\$407 - \$400}{\$38 / \sqrt{172}} = 2.42$
The p(z > 2.42) is .0078 for a one-tailed test.
Computed z of 2.42 > critical z of 1.65, and p of .0078 < α of .05, so reject H0.
Lisa can conclude that the mean unpaid balance is greater than $400.
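A sketch of the same upper-tail test in Python is shown below; the function name is an illustrative assumption. Because the code keeps the unrounded z (about 2.416) rather than 2.42, its p-value can differ from the tabled .0078 in the fourth decimal place.

```python
import math
from statistics import NormalDist

def z_test_upper_tail(sample_mean, mu0, s, n):
    """Return (z, one-tailed p-value) for H1: mu > mu0, large sample, s estimates sigma."""
    z = (sample_mean - mu0) / (s / math.sqrt(n))
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Unpaid-balance example: X-bar = $407, mu0 = $400, s = $38, n = 172
z, p_value = z_test_upper_tail(407, 400, 38, 172)
reject = z > 1.65  # decision rule at the .05 level
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```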
TESTING FOR A POPULATION MEAN: SMALL SAMPLE, POPULATION STANDARD DEVIATION UNKNOWN
The test statistic follows the t distribution.
$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$
The critical value of t is determined by its degrees of freedom, equal to n - 1.
EXAMPLE 3
The current rate for producing 5 amp fuses at Neary Electric Co. is 250 per hour. A new machine has been purchased and installed that, according to the supplier, will increase the production rate. Hourly production is normally distributed. A sample of 10 randomly selected hours from last month revealed that the mean hourly production on the new machine was 256 units, with a sample standard deviation of 6 per hour.
At the .05 significance level can Neary conclude that the new machine is faster?
Step 1: State the null and alternate hypotheses. H0: µ ≤ 250, H1: µ > 250
Step 2: Select the level of significance. It is .05.
Step 3: Find a test statistic. Use the t distribution since σ is not known and n < 30.
Step 4: State the decision rule. There are 10 - 1 = 9 degrees of freedom. The null hypothesis is rejected if t > 1.833 or, using the p-value, if p < .05.
Step 5: Make a decision and interpret the results.
$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{256 - 250}{6 / \sqrt{10}} = 3.162$
The p(t > 3.162) is .0058 for a one-tailed test.
Computed t of 3.162 > critical t of 1.833, and p of .0058 < α of .05, so reject H0.
The mean number of fuses produced is more than 250 per hour.
t test for one tail (extracted from Excel)
Null hypothesis µ            250
Level of significance α      0.05
Sample size n                10
Sample mean X̄                256
Sample SD s                  6
Intermediate calculation
Std error of the mean        1.8974
Degrees of freedom           9
t test statistic             3.1623
One-tail test
Lower critical value         -1.8331
p-Value                      0.0058
Reject the null hypothesis
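The intermediate values in the Excel output can be recomputed with a few lines of Python. This is only a sketch: the Python standard library has no t distribution, so the critical value 1.833 (9 degrees of freedom, .05 in the upper tail) is copied from the example rather than calculated, and the p-value is not reproduced.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """Return (t statistic, standard error of the mean, degrees of freedom)."""
    std_error = s / math.sqrt(n)
    t = (sample_mean - mu0) / std_error
    return t, std_error, n - 1

# Fuse example: X-bar = 256, mu0 = 250, s = 6, n = 10
t, std_error, df = one_sample_t(256, 250, 6, 10)
T_CRITICAL = 1.833  # tabled value taken from the example, not computed here
reject = t > T_CRITICAL
print(f"std error = {std_error:.4f}, df = {df}, t = {t:.4f}, reject H0: {reject}")
```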
Test Statistic for Testing a Single Population Proportion
Proportion: The fraction or percentage that indicates the part of the population or sample having a particular trait of interest.
$p = \frac{\text{Number of successes in the sample}}{\text{Number sampled}}$
$z = \frac{p - \pi}{\sqrt{\frac{\pi (1 - \pi)}{n}}}$
The sample proportion is p and π is the population proportion.
EXAMPLE 4
In the past, 15% of the mail order solicitations for a certain charity resulted in a financial contribution. A new solicitation letter that has been drafted is sent to a sample of 200 people and 45 responded with a contribution. At the .05 significance level can it be concluded that the new letter is more effective?
Step 1: State the null and the alternate hypotheses. H0: π ≤ .15, H1: π > .15
Step 2: Select the level of significance. It is .05.
Step 3: Find a test statistic. The z distribution is the test statistic.
Step 4: State the decision rule. The null hypothesis is rejected if z is greater than 1.65 or if p < .05.
Step 5: Make a decision and interpret the results.
$z = \frac{p - \pi}{\sqrt{\frac{\pi (1 - \pi)}{n}}} = \frac{\frac{45}{200} - .15}{\sqrt{\frac{.15 (1 - .15)}{200}}} = 2.97$
p(z > 2.97) = .0015.
Because the computed z of 2.97 > critical z of 1.65 and the p-value of .0015 < α of .05, the null hypothesis is rejected. More than 15 percent are responding with a pledge. The new letter is more effective.
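The solicitation-letter example can be checked with the following sketch of a one-sample proportion test. The function name and the argument name `pi0` (the hypothesized population proportion) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def proportion_z_test_upper(successes, n, pi0):
    """Return (z, one-tailed p-value) for H1: population proportion > pi0."""
    p = successes / n
    z = (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# 45 contributions from 200 letters, hypothesized proportion .15
z, p_value = proportion_z_test_upper(45, 200, 0.15)
reject = z > 1.65  # decision rule at the .05 level
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```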
CHAPTER 10: TWO-SAMPLE TESTS
COMPARING TWO POPULATIONS
Does the distribution of the differences in sample means have a mean of 0?
If both samples contain at least 30 observations we use the z distribution as the test statistic.
No assumptions about the shape of the populations are required.
The samples are from independent populations.
The formula for computing the value of z is:
$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
EXAMPLE 1
Two cities, Bradford and Kane, are separated only by the Conewango River. There is competition between the two cities. The local paper recently reported that the mean household income in Bradford is $38,000 with a standard deviation of $6,000 for a sample of 40 households. The same article reported the mean income in Kane is $35,000 with a standard deviation of $7,000 for a sample of 35 households. At the .01 significance level can we conclude the mean income in Bradford is more?
Step 1: State the null and alternate hypotheses. H0: µB ≤ µK, H1: µB > µK
Step 2: State the level of significance. The .01 significance level is stated in the problem.
Step 3: Find the appropriate test statistic. Because both samples are more than 30, we can use z as the test statistic.
Step 4: State the decision rule. The null hypothesis is rejected if z is greater than 2.33 or p < .01.
Step 5: Compute the value of z and make a decision.
$z = \frac{\$38,000 - \$35,000}{\sqrt{\frac{(\$6,000)^2}{40} + \frac{(\$7,000)^2}{35}}} = 1.98$
The p(z > 1.98) is .0239 for a one-tailed test of significance.
Because the computed z of 1.98 < critical z of 2.33 and the p-value of .0239 > α of .01, the decision is to not reject the null hypothesis. We cannot conclude that the mean household income in Bradford is larger.
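The Bradford and Kane comparison can be sketched in Python as below. The function name is hypothetical, and the p-value is computed from the unrounded z, so it may differ slightly from the tabled .0239.

```python
import math
from statistics import NormalDist

def two_sample_z_upper(mean1, s1, n1, mean2, s2, n2):
    """Return (z, one-tailed p-value) for H1: mu1 > mu2 with two large samples."""
    z = (mean1 - mean2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Bradford: mean $38,000, s $6,000, n 40; Kane: mean $35,000, s $7,000, n 35
z, p_value = two_sample_z_upper(38000, 6000, 40, 35000, 7000, 35)
reject = z > 2.33  # decision rule at the .01 level
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```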
Two Sample Tests of Proportions
Two-sample tests of proportions investigate whether two samples came from populations with an equal proportion of successes.
The two samples are pooled using the following formula.
$p_c = \frac{X_1 + X_2}{n_1 + n_2}$
where X1 and X2 refer to the number of successes in the respective samples of n1 and n2.
The value of the test statistic is computed from the following formula.
$z = \frac{p_1 - p_2}{\sqrt{\frac{p_c (1 - p_c)}{n_1} + \frac{p_c (1 - p_c)}{n_2}}}$
where p1 = X1/n1 and p2 = X2/n2 are the two sample proportions.
EXAMPLE 2
Are unmarried workers more likely to be absent from work than married workers? A sample of 250 married workers showed 22 missed more than 5 days last year, while a sample of 300 unmarried workers showed 35 missed more than five days. Use a .05 significance level.
The null and the alternate hypotheses: H0: πU ≤ πM, H1: πU > πM
The null hypothesis is rejected if the computed value of z is greater than 1.65 or the p-value < .05.
The pooled proportion:
$p_c = \frac{35 + 22}{300 + 250} = .1036$
$z = \frac{\frac{35}{300} - \frac{22}{250}}{\sqrt{\frac{.1036 (1 - .1036)}{300} + \frac{.1036 (1 - .1036)}{250}}} = 1.10$
The p(z > 1.10) = .136 for a one-tailed test of significance.
Because the calculated z of 1.10 < the critical z of 1.65 and the p-value of .136 > α of .05, the null hypothesis is not rejected. We cannot conclude that a higher proportion of unmarried workers than married workers miss more than five days in a year.
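A minimal Python sketch of the pooled two-proportion test for the absenteeism example follows. The function name is an assumption for illustration; the first sample is the one hypothesized to have the larger proportion (unmarried workers here).

```python
import math
from statistics import NormalDist

def two_proportion_z_upper(x1, n1, x2, n2):
    """Return (pooled proportion, z, one-tailed p-value) for H1: pi1 > pi2."""
    pc = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(pc * (1 - pc) / n1 + pc * (1 - pc) / n2)
    z = (x1 / n1 - x2 / n2) / std_error
    p_value = 1 - NormalDist().cdf(z)
    return pc, z, p_value

# Unmarried: 35 of 300; married: 22 of 250
pc, z, p_value = two_proportion_z_upper(35, 300, 22, 250)
reject = z > 1.65  # decision rule at the .05 level
print(f"pc = {pc:.4f}, z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject}")
```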
SMALL SAMPLE TESTS OF MEANS
The t distribution is used as the test statistic if one or more of the samples have fewer than 30 observations.
The required assumptions:
1. Both populations must follow the normal distribution.
2. The populations must have equal standard deviations.
3. The samples are from independent populations.
SMALL SAMPLE TEST OF MEANS CONTINUED
Finding the value of the test statistic requires two steps.
Step One: Pool the sample standard deviations.
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
Step Two: Determine the value of t from the following formula.
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
EXAMPLE 3
A recent EPA study compared the highway fuel economy of domestic and imported passenger cars. A sample of 15 domestic cars revealed a mean of 33.7 mpg with a standard deviation of 2.4 mpg.
A sample of 12 imported cars revealed a mean of 35.7 mpg with a standard deviation of 3.9 mpg. At the .05 significance level can the EPA conclude that the mpg is higher on the imported cars?
Step 1: State the null and alternate hypotheses. H0: µD ≥ µI, H1: µD < µI
Step 2: State the level of significance. The .05 significance level is stated in the problem.
Step 3: Find the appropriate test statistic. Both samples are less than 30, so we use the t distribution.
Step 4: The decision rule is to reject H0 if t < -1.708 or if p-value < .05. There are n1 + n2 - 2 = 25 degrees of freedom.
Step 5: We compute the pooled variance.
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(15 - 1)(2.4)^2 + (12 - 1)(3.9)^2}{15 + 12 - 2} = 9.918$
We compute the value of t as follows.
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{33.7 - 35.7}{\sqrt{9.918 \left( \frac{1}{15} + \frac{1}{12} \right)}} = -1.640$
P(t < -1.64) = .0567 for a one-tailed t-test.
Since the computed t of -1.64 > critical t of -1.71 and the p-value of .0567 > α of .05, H0 is not rejected. There is insufficient sample evidence to claim a higher mpg on the imported cars.
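The two-step pooled calculation for the fuel-economy example can be sketched as follows. The function name is illustrative, and the critical value -1.708 (25 degrees of freedom, .05 in the lower tail) is copied from the example because the standard library has no t distribution.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    t = (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t, df

# Domestic: mean 33.7, s 2.4, n 15; imported: mean 35.7, s 3.9, n 12
sp2, t, df = pooled_t(33.7, 2.4, 15, 35.7, 3.9, 12)
T_CRITICAL = -1.708  # tabled value taken from the example, not computed here
reject = t < T_CRITICAL
print(f"sp2 = {sp2:.3f}, df = {df}, t = {t:.3f}, reject H0: {reject}")
```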
HYPOTHESIS TESTING INVOLVING PAIRED OBSERVATIONS
Independent samples are samples that are not related in any way.
Dependent samples are samples that are paired or related in some fashion.
If you wished to buy a car you would look at the same car at two (or more) different dealerships and compare the prices.
If you wished to measure the effectiveness of a new diet you would weigh the dieters at the start and at the finish of the program.
Use the following test when the samples are dependent:
$t = \frac{\bar{d}}{s_d / \sqrt{n}}$
where $\bar{d}$ is the mean of the differences, $s_d$ is the standard deviation of the differences, and n is the number of pairs (differences).
EXAMPLE 4
An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz and Avis. A random sample of eight cities revealed the following information. At the .05 significance level can the testing agency conclude that there is a difference in the rental charged?
City           Hertz ($)    Avis ($)
Atlanta        42           40
Chicago        56           52
Cleveland      45           43
Denver         48           48
Honolulu       37           32
Kansas City    45           48
Miami          41           39
Seattle        46           50
Step 1: H0: µd = 0, H1: µd ≠ 0
Step 2: The stated significance level is .05.
Step 3: The appropriate test statistic is the paired t-test.
Step 4: H0 is rejected if t < -2.365 or t > 2.365, or if p-value < .05. We use the t distribution with n - 1 = 7 degrees of freedom.
Step 5: Perform the calculations and make a decision.
City           Hertz    Avis    d     d²
Atlanta        42       40      2     4
Chicago        56       52      4     16
Cleveland      45       43      2     4
Denver         48       48      0     0
Honolulu       37       32      5     25
Kansas City    45       48      -3    9
Miami          41       39      2     4
Seattle        46       50      -4    16
$\bar{d} = \frac{\sum d}{n} = \frac{8.0}{8} = 1.00$
$s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{78 - \frac{8^2}{8}}{8 - 1}} = 3.1623$
$t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{1.00}{3.1623 / \sqrt{8}} = 0.894$
P(t > .894) = .20 for a one-tailed t-test at 7 degrees of freedom, so the two-tailed p-value is about .40.
Because 0.894 is less than the critical value of 2.365 and the p-value is greater than α of .05, do not reject the null hypothesis. We cannot conclude that there is a difference in the mean amount charged by Hertz and Avis.
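The paired calculation for the Hertz and Avis data can be reproduced with the standard library, as in this sketch. The variable names are illustrative; `statistics.stdev` uses the n - 1 divisor, which matches the formula for the standard deviation of the differences, and the critical value 2.365 (7 degrees of freedom, two-tailed .05) is copied from the example.

```python
import math
from statistics import mean, stdev

# Daily rental cost in the eight sampled cities, in table order
hertz = [42, 56, 45, 48, 37, 45, 41, 46]
avis = [40, 52, 43, 48, 32, 48, 39, 50]

d = [h - a for h, a in zip(hertz, avis)]  # paired differences
d_bar = mean(d)                           # mean of the differences
s_d = stdev(d)                            # sample standard deviation (n - 1 divisor)
t = d_bar / (s_d / math.sqrt(len(d)))

T_CRITICAL = 2.365  # tabled value taken from the example, not computed here
reject = abs(t) > T_CRITICAL
print(f"d-bar = {d_bar:.2f}, s_d = {s_d:.4f}, t = {t:.3f}, reject H0: {reject}")
```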
COMPARING DEPENDENT AND INDEPENDENT SAMPLES
Advantage of dependent samples: Reduction in variation in the sampling distribution
Disadvantage of dependent samples: Degrees of freedom are halved
Two types of dependent samples:
Matched or paired observations
The same subjects measured at two different points in time
THANK YOU
Good Luck