mms testing of hypothesis
TRANSCRIPT
-
8/6/2019 Mms Testing of Hypothesis
1/69
Statistical Estimation
1. Point and interval estimation
2. Confidence interval for mean, proportion & variance
-
8/8/2011 Lecture24 2
Introduction
Everyone makes estimates!
When you are ready to cross a street, you estimate the speed of any approaching car, the distance between you and that car, and your own speed. Having made these quick estimates, you decide whether to wait, walk, or run.
All managers must make quick estimates too. The outcome of these estimates can seriously affect their organizations.
-
Introduction
How do managers use sample statistics to estimate population parameters?
Statistical estimation methods enable us to estimate the population proportion with reasonable accuracy.
If all these estimates were obtained on a census basis, it would be a very costly and time-consuming proposition. Hence sampling theory.
-
Statistical Estimation
Statistical estimation is the procedure of using a sample statistic to estimate a population parameter. A statistic used to estimate a parameter is called an estimator, and the value taken by the estimator is called an estimate.
Statistical estimation is divided into two main categories: point estimation & interval estimation.
-
Point estimation
A point estimate is a single number that is used to estimate an unknown population parameter.
Example: If a firm takes a sample of 50 salesmen,
and the average amount of time each salesman spends with his customers is 80 minutes,
and this figure is used as an estimate of the population parameter,
then 80 is the point estimate.
-
Interval estimation
An estimate of a population parameter given by two numbers between which the parameter may be considered to lie is called an interval estimate of the parameter.
An interval estimate is a range of values used to estimate a population parameter.
Example: The average amount of time each salesman spends with his customers is between 60 and 80 minutes.
-
Criteria of a Good Estimator
A good estimate is one which is close to the population parameter being estimated.
We can evaluate the quality of a statistic as an estimator by using four criteria.
Unbiasedness
Consistency
Efficiency
Sufficiency
-
Unbiasedness
An estimator is said to be unbiased if the expected value of the estimator is equal to the population parameter being estimated.
OR
If the mean of the estimator's values over repeated samples is equal to the population parameter, then it is an unbiased estimator.
-
Consistency
As the sample size increases, the difference between the sample statistic and the population parameter should become smaller and smaller. If the difference continues to become smaller and smaller as the sample size becomes larger, the sample statistic is said to converge in probability to the parameter and is said to be a consistent estimator of that parameter.
-
Efficiency
The efficiency of an estimator depends on its variance. If the variance of the estimator is small, the estimate tends to be closer to the parameter value.
For example, for a symmetric population the sample mean and sample median are both unbiased and consistent estimators of the population mean. Choose between them on the basis of relative efficiency (select the one that has the smaller variance).
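The comparison of mean and median by relative efficiency can be illustrated with a small simulation. This is only a sketch: the normal population, the sample size of 25, the 2000 repetitions, and the function name `estimator_variances` are all assumptions chosen for illustration, not part of the lecture.

```python
import random
import statistics

def estimator_variances(n=25, reps=2000, seed=0):
    """Estimate the sampling variance of the sample mean and sample median.

    Draws `reps` samples of size `n` from a standard normal population
    (an assumed population, used only for illustration).
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.variance(means), statistics.variance(medians)

var_mean, var_median = estimator_variances()
print(f"variance of sample mean:   {var_mean:.4f}")
print(f"variance of sample median: {var_median:.4f}")
```

Under these assumptions the sample mean shows the smaller variance, so it would be the more efficient of the two estimators for this population.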
-
Sufficiency
A sufficient estimator is one that uses all the information about the population parameter contained in the sample.
Example: The sample mean is a sufficient estimator of the population mean, since all the information in the sample is used in its computation. The sample range is not, since it uses only the largest and smallest values.
-
Example
Consider
A medical supplies company that produces disposable hypodermic syringes. Each syringe is wrapped in a sterile package and then jumble-packed in a large corrugated carton. Jumble packing causes the cartons to contain differing numbers of syringes. Because the syringes are sold on a per-unit basis, the company needs an estimate of the number of syringes per carton for billing purposes.
-
Cont..
A sample of 35 cartons is taken and the number of syringes in each carton is recorded.
Obtain the sample mean.
Sample mean = 102 syringes
Then we can say that the point estimate of the population mean is 102 syringes per carton.
-
Cont..
The manufactured price of a disposable hypodermic syringe is quite small, so both the buyer and the seller would accept the use of this point estimate, 102, as the basis for billing.
The manufacturer can save the time and expense of counting each syringe that goes into a carton.
-
Interval estimates
An interval estimate describes a range of values within which a population parameter is likely to lie.
-
Example
Suppose the marketing research director needs an estimate of the average life, in months, of the car batteries his company manufactures. Select a sample of 200 car owners, interview them, and collect data about the life of their batteries.
Let the mean life = 36 months.
The point estimate is then 36 months.
-
Cont..
If the director asks for a statement about the uncertainty that is likely to accompany this estimate, we can give a range.
That can be done by calculating the standard error of the mean as
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Say $\sigma_{\bar{x}} = 0.707$.
We could now report that our estimate of the life of the company's batteries may lie somewhere in the range of 35.293 to 36.707 months (36 ± 0.707, one standard error on either side of the mean).
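The arithmetic behind the battery range can be checked with a short sketch. The slides give only the sample size (200), the mean (36 months) and the standard error (0.707); the population standard deviation of 10 months used below is an assumption, chosen because it reproduces that standard error.

```python
import math

n = 200        # sample size from the battery example
x_bar = 36.0   # sample mean life in months
sigma = 10.0   # ASSUMED population s.d.; not stated in the slides

# Standard error of the mean: sigma / sqrt(n)
se = sigma / math.sqrt(n)

# Range of one standard error on either side of the sample mean
low, high = x_bar - se, x_bar + se
print(round(se, 3), round(low, 3), round(high, 3))
```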
-
Confidence Interval
The probability that we associate with an interval estimate is called the confidence level.
How confident?
The most commonly used confidence levels are 90%, 95% & 99%.
We are free to apply any confidence level.
-
Interval estimation: Student's t distribution
Whenever the sample size is 30 or less and the population standard deviation is not known, use the t distribution.
In using the t distribution we assume that the population is normally distributed.
A t distribution is lower at the mean and higher at the tails than a normal distribution.
There is a separate t distribution for each sample size, that is, for each number of degrees of freedom.
-
Degrees of freedom
The number of values we can choose freely.
When a single sample of size n is used to estimate a mean, there are n - 1 degrees of freedom.
-
Example
As part of the budgeting process for next year, the manager of the Fan point electric generating plant must estimate the coal he will need for this year.
Last year the plant almost ran out, so he is reluctant to budget for that same amount again. The plant manager took a random sample of 10 plant operating weeks chosen over the last 5 years. It yielded a mean usage of 11400 tons a week and a sample standard deviation of 700 tons a week. Calculate a sensible estimate of the amount (with 95% confidence) to order this year.
-
n = 10, df = 9
Sample mean = 11400
s.d. = 700 (use the sample s.d. as an estimate of the population s.d.)
Standard error: $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 700 / \sqrt{10} = 221.36$
From the t-table, for 9 degrees of freedom and a combined tail area of 1.00 - 0.95 = 0.05, the t value = 2.262
The confidence interval is 11400 ± 2.262 × 221.36
10899 to 11901 tons with 95% confidence
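The coal interval can be reproduced in a few lines. This sketch takes the t value of 2.262 directly from the table lookup quoted in the example rather than computing it, since the Python standard library has no t distribution; the variable names are illustrative.

```python
import math

n = 10            # sampled operating weeks
x_bar = 11400.0   # mean weekly usage in tons
s = 700.0         # sample standard deviation in tons
t_crit = 2.262    # t-table value for 9 df, 95% confidence (from the example)

# Estimated standard error of the mean
se = s / math.sqrt(n)

# Margin of error and the two ends of the interval
margin = t_crit * se
lower, upper = x_bar - margin, x_bar + margin
print(f"standard error = {se:.2f}")
print(f"95% interval: {lower:.0f} to {upper:.0f} tons")
```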
-
Tests of Hypothesis
-
Suppose the manager of a large shopping mall tells us that the average work efficiency of her employees is at least 90%. How can we test the validity of her claim?
We could calculate the efficiency of a sample of her employees.
If this sample statistic came out to be 95%, we would accept the manager's statement.
But if it is 46%, we would reject her assumption as untrue.
Suppose the sample statistic is 88%. Do we accept or reject?
We cannot be absolutely certain that our decision is correct.
Therefore we must learn to deal with uncertainty in our decision making.
-
Hypothesis
Here we wish to test efficiency = 90% (null)
against the alternative, efficiency ≠ 90% (alternate)
Or we can say
null hypothesis H0: μ = 90
alternate hypothesis H1: μ ≠ 90
-
Level of significance
If we assume the hypothesis is correct, then the significance level will indicate the percentage of sample means that is outside certain limits.
-
Cont..
The purpose of testing is not to question the computed value of the sample statistic but to make a judgment about the difference between that sample statistic and the tested population parameter.
-
Introduction
A hypothesis is an assumption about the population parameter, to be tested based on sample information.
Hypothesis tests are widely used in business and industry for making decisions.
Examples
Based on sample data, decide whether a new medicine is really effective in curing a disease.
Whether one training procedure is better than another.
-
The hypothesis is made about the value of some parameter (the only facts available to estimate the true parameter are those provided by the sample).
If the sample statistic differs from the hypothesis made about the population parameter, and the difference is significant, then reject the hypothesis.
If it is not significant, then the hypothesis must be accepted. Hence tests of hypothesis.
-
Procedures of Hypothesis Testing
1. Set up a hypothesis
2. Set up a suitable significance level
3. Determination of a suitable test statistic
4. Determination of the critical region
5. Doing computations
6. Making decisions
-
Set up a hypothesis
Establish the hypothesis to be tested.
Set up
the null hypothesis, denoted by H0, and
the alternate hypothesis, denoted by H1.
The null hypothesis:
There is no true difference between the sample statistic and the population parameter under consideration.
-
Set up a hypothesis
The hypothesis that is different from the null hypothesis is the alternate hypothesis H1.
If the sample information leads us to reject H0, then accept H1.
-
Set up a suitable significance level
The confidence with which an experimenter rejects or retains the null hypothesis.
The level of significance is denoted by α.
It is generally specified before any sample is drawn (so that the sample results have no influence on the choice).
In practice, a 5% or 1% level of significance is used.
5%: 5 chances out of 100 that we would reject the null hypothesis when it is true (95% confident of making the right decision).
-
When the null hypothesis is rejected at α = 0.05, the result is said to be significant.
When the null hypothesis is rejected at α = 0.01, the result is said to be highly significant.
-
Determination of the critical region
Determination of
which values of the test statistic will lead to a rejection of H0,
and which will lead to acceptance of H0.
The former is called the critical region.
Establishing a critical region is similar to determining a 100(1 - α)% confidence interval.
-
Doing computations
Calculations for step 3
-
Making decisions
Draw statistical conclusions
Either acceptance of the null hypothesis or rejection of it,
based on whether the computed value of the test statistic falls in the region of acceptance or the region of rejection.
-
Procedures for statistical inferences
Point estimation.
Appropriate when the goal is to estimate a population parameter.
Confidence interval.
Appropriate when the goal is to estimate a population parameter with confidence.
Hypothesis testing.
Hypothesis: a statement about the parameters.
Appropriate when the goal is to assess whether the evidence provided by the data is in favor of some claim about the population.
-
Confidence Interval
Point estimate ± margin of error
Confidence interval for a population mean
Assumption: the population variance is known.
Confidence level: C
$\left( \bar{x} - z^{*} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z^{*} \frac{\sigma}{\sqrt{n}} \right)$
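As a rough sketch of the known-variance interval (point estimate ± margin of error), the helper below computes the critical value from the confidence level C with the standard library's `statistics.NormalDist`. The function name `z_interval` and the numbers in the usage line (mean 100, s.d. 15, n = 36) are hypothetical and serve only to exercise the formula.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when the population s.d. is known."""
    # Critical value z*: leaves (1 - C)/2 in each tail of the standard normal
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical inputs, for illustration only
low, high = z_interval(x_bar=100.0, sigma=15.0, n=36, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```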
-
Hypothesis Testing
Sometimes you are not interested in
estimating an unknown parameter, or
providing a confidence interval for the parameter.
But rather, you have some claim (belief) about the parameter and you want to see whether the data supports the claim or not.
Support
Contradict
-
Concepts of Hypothesis Testing
The critical concepts of hypothesis testing: two hypotheses
H0 - the null hypothesis
The statement of no effect or no difference.
Ha - the alternative hypothesis
The statement we hope or suspect is true.
Usually one would decide on Ha first.
-
Biased one-Euro Coin?
A group of Statistics students spun the Belgian one-Euro coin 250 times, and it came up heads 140 times.
p: the probability of getting a head on each spin.
One-sided: H0: p = .5 against Ha: p > .5.
Two-sided: H0: p = .5 against Ha: p ≠ .5.
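To see how the one-sided and two-sided alternatives differ numerically for the coin data, here is a sketch using the large-sample normal approximation for a proportion. That approximation is an assumption of this sketch; the slide itself states only the hypotheses and does not carry out the test.

```python
import math
from statistics import NormalDist

heads, spins = 140, 250
p0 = 0.5                      # value of p under H0

p_hat = heads / spins         # observed proportion of heads
# Standard error of p_hat computed under H0
se0 = math.sqrt(p0 * (1.0 - p0) / spins)
z = (p_hat - p0) / se0

phi = NormalDist().cdf
p_one_sided = 1.0 - phi(z)               # Ha: p > .5
p_two_sided = 2.0 * (1.0 - phi(abs(z)))  # Ha: p differs from .5
print(f"z = {z:.3f}")
print(f"one-sided P-value = {p_one_sided:.4f}")
print(f"two-sided P-value = {p_two_sided:.4f}")
```

The two-sided P-value is twice the one-sided one, so the same data can give different conclusions at a fixed significance level depending on which alternative was chosen beforehand.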
-
Company Billing System
A new billing system for a company will be cost-effective only if the mean monthly account is more than $170.
A sample of 400 monthly accounts has a mean of $178.
If the accounts are normally distributed with σ = $65, can we conclude that the new system will be cost-effective?
The population is the credit accounts at the store.
We want to show that the mean account for all customers is greater than $170. Ha: μ > 170.
The null hypothesis must specify a single value of the parameter μ. H0: μ = 170.
How can we achieve that?
-
Test statistic
A test is based on a statistic, which estimates the parameter that appears in the hypotheses.
Point estimate
Values of the estimate far from the parameter value in H0 give evidence against H0.
Ha determines which direction will be counted as far from the parameter value.
-
Company Billing System
Question:
Is a sample mean of 178 sufficiently greater than 170 to infer that the population mean is greater than 170?
Answer:
Let's assume the population mean is 170, and see how likely it is for us to observe a sample mean of 178 or even more.
-
P-value
P-value: the probability of observing a test statistic as extreme as or more extreme than the actually observed value, given that H0 is true.
"Extreme" means far from what we would expect under H0.
The P-value provides information about the amount of statistical evidence that supports the null hypothesis.
The smaller the P-value, the less the evidence for H0.
-
Interpreting P-value
Because the probability that the sample mean is equal to or larger than 178, when μ = 170, is so small (.0069), there is little reason to believe that μ = 170
(or, there is reason to believe that μ > 170).
We can conclude that the smaller the P-value, the more statistical evidence exists to support the alternative hypothesis.
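The figure of .0069 for the billing example can be recovered as the upper-tail probability of a standard normal variable. The sketch below uses the values given in the example (n = 400, sample mean 178, population s.d. 65, hypothesized mean 170); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 400
x_bar = 178.0
sigma = 65.0
mu0 = 170.0   # population mean assumed under H0

# Standardize the observed sample mean under H0
z = (x_bar - mu0) / (sigma / math.sqrt(n))

# P(sample mean >= 178 when the true mean is 170)
p_value = 1.0 - NormalDist().cdf(z)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```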
-
Describing P-value
If the P-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.
If the P-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.
If the P-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.
If the P-value exceeds 10%, there is no evidence that supports the alternative hypothesis.
-
Significance Level α
We need to make a conclusion after carrying out the hypothesis test. What do we conclude?
We can compare the P-value with a fixed value that we regard as decisive.
This amounts to announcing in advance how much evidence against H0 we require in order to reject H0.
The decisive value is called the significance level of the test. It is denoted by α, and the corresponding test is called a level α test.
Statistical significance: If the P-value ≤ α, we say that the data are statistically significant at level α.
-
α and P-value
P-value and significance level α:
Reject H0 if P-value ≤ α.
Do not reject H0 if P-value > α.
When is it easier to reject H0? Large α or small α?
When is the evidence against H0 stronger? Large P-value or small P-value?
-
Four steps of hypotheses testing
1. Define the hypotheses to test, and the required significance level α.
2. Calculate the value of the test statistic.
3. Find the P-value based on the observed data.
4. State the conclusion.
Reject the null hypothesis if the P-value ≤ α. If the P-value > α, the data do not provide sufficient evidence to reject the null.
-
Testing for normal mean with known σ
Let X1, …, Xn be a random sample from N(μ, σ²).
Null hypothesis: H0: μ = μ0
Alternative hypothesis: Ha: μ ≠ μ0, Ha: μ > μ0, or Ha: μ < μ0
-
Normal with known σ: Z test
When H0 is true, $\mu_{\bar{X}} = \mu_0$, and
$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
has a standard normal distribution.
Z is a natural measure of the distance between the sample mean $\bar{X}$ and its expected value μ0.
For a given sample, we observe
$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$.
If H0 is true, we expect z to be close to 0.
-
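Normal with known σ
Case 1: Ha: μ ≠ μ0. H0 should be rejected if z is too far away from 0.
The P-value is $2P(Z \ge |z|)$.
Case 2: Ha: μ > μ0. H0 should be rejected if z is much larger than 0.
The P-value is $P(Z \ge z)$.
Case 3: Ha: μ < μ0. H0 should be rejected if z is much smaller than 0.
The P-value is $P(Z \le z)$.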
-
Normal with known σ: P-value method
Null hypothesis: H0: μ = μ0
Test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Alternative hypothesis    P-value
Ha: μ ≠ μ0                $2P(Z \ge |z|)$
Ha: μ > μ0                $P(Z \ge z)$
Ha: μ < μ0                $P(Z \le z)$
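The P-value method for the three kinds of alternative hypothesis can be sketched as one small function built on the standard normal CDF. The function name `z_test_p_value` and the labels "two-sided", "greater" and "less" are naming choices made here for illustration, not notation from the lecture.

```python
from statistics import NormalDist

def z_test_p_value(z, alternative):
    """P-value of a z test for a normal mean with known standard deviation.

    alternative: "two-sided" (mean differs from the null value),
                 "greater"   (mean above the null value),
                 "less"      (mean below the null value).
    """
    phi = NormalDist().cdf
    if alternative == "two-sided":
        return 2.0 * (1.0 - phi(abs(z)))
    if alternative == "greater":
        return 1.0 - phi(z)
    if alternative == "less":
        return phi(z)
    raise ValueError(f"unknown alternative: {alternative!r}")

# An observed z of 1.96 under each alternative
p_two = z_test_p_value(1.96, "two-sided")
p_greater = z_test_p_value(1.96, "greater")
p_less = z_test_p_value(1.96, "less")
print(round(p_two, 3), round(p_greater, 3), round(p_less, 3))
```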
-
Sprinkler
A sprinkler systems maker claims that the true average system-activation temperature is 130°. A sample of n = 9 systems, when tested, yields a sample average activation temperature of 131.08°. If the distribution of activation temperature is normal with σ = 1.5°, does the data contradict the claim at significance level α = .01?
Let μ = true average activation temperature.
Hypotheses:
Test statistic:
P-value:
Conclusion:
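One possible way to work through the four blanks is sketched below. It assumes that "contradict the claim" calls for a two-sided alternative (true average different from 130); the slide leaves the hypotheses to be filled in, so that choice is an interpretation rather than the lecture's stated answer.

```python
import math
from statistics import NormalDist

mu0 = 130.0     # claimed true average activation temperature
x_bar = 131.08  # sample average
sigma = 1.5     # known population standard deviation
n = 9
alpha = 0.01

# Test statistic
z = (x_bar - mu0) / (sigma / math.sqrt(n))

# Two-sided P-value (ASSUMED alternative: true average differs from 130)
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Conclusion: reject H0 only if the P-value is at most alpha
reject = p_value <= alpha
print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Under that assumption the P-value is about 0.03, which is larger than .01, so the data would not contradict the claim at this level.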
-
Rejection Region Method
The rejection region is a range of values such that if the test statistic falls into that range, the null hypothesis is rejected.
The rejection region method:
1. Define the hypotheses to test, and the required significance level α.
2. Find the corresponding rejection region.
3. Calculate the test statistic.
4. Reject the null hypothesis only if the value of the test statistic falls in the rejection region.
-
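Normal with known σ: Rejection Region Method
Null hypothesis: H0: μ = μ0
Test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Alternative hypothesis    Rejection region for level α test
Ha: μ ≠ μ0                $|z| \ge z_{\alpha/2}$
Ha: μ > μ0                $z \ge z_{\alpha}$
Ha: μ < μ0                $z \le -z_{\alpha}$
Here $z_{\alpha}$ is the value that a standard normal variable exceeds with probability α.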
-
Sprinkler
A sprinkler systems maker claims that the true average system-activation temperature is 130°. A sample of n = 9 systems, when tested, yields a sample average activation temperature of 131.08°. If the distribution of activation temperature is normal with σ = 1.5°, does the data contradict the claim at significance level α = .01?
Let μ = true average activation temperature.
1 Hypotheses:
2 Rejection region:
3 Test statistic:
4 Conclusion:
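The same sprinkler data can be run through the rejection region method. As a sketch, this again assumes a two-sided alternative, so the rejection region consists of test statistic values whose absolute value is at least the critical value that leaves .005 in each tail.

```python
import math
from statistics import NormalDist

mu0, x_bar, sigma, n = 130.0, 131.08, 1.5, 9
alpha = 0.01

# Step 2: critical value for a two-sided test (ASSUMED alternative),
# with alpha / 2 in each tail of the standard normal distribution
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)

# Step 3: test statistic
z = (x_bar - mu0) / (sigma / math.sqrt(n))

# Step 4: reject H0 only if z falls in the rejection region
in_rejection_region = abs(z) >= z_crit
print(f"reject if |z| >= {z_crit:.3f}; observed z = {z:.2f}")
print(f"reject H0: {in_rejection_region}")
```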
-
Sprinkler Revisited
A sprinkler systems maker claims that the true average system-activation temperature is 130°. A sample of n = 9 systems, when tested, yields a sample average activation temperature of 131.08°. If the distribution of activation temperature is normal with σ = 1.5°, does the data contradict the claim at significance level α = .01?
What's the 99% confidence interval for the activation temperature?
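A short sketch of the 99% interval asked for here, using the known-standard-deviation formula (sample mean ± critical value × standard error). The check at the end compares the interval with the claimed value of 130; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, sigma, n = 131.08, 1.5, 9
confidence = 0.99
claimed = 130.0

# Critical value leaving (1 - 0.99) / 2 in each tail
z_star = NormalDist().inv_cdf(0.5 + confidence / 2.0)
margin = z_star * sigma / math.sqrt(n)
ci_low, ci_high = x_bar - margin, x_bar + margin

claim_inside = ci_low <= claimed <= ci_high
print(f"99% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"claimed value 130 inside the interval: {claim_inside}")
```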
-
CI & 2-Sided Tests
A level α 2-sided test rejects H0: μ = μ0 exactly when the value μ0 falls outside a level 1 - α confidence interval for μ.
A confidence interval can be used to test hypotheses.
Calculate the level 1 - α confidence interval; then, if μ0 falls within the interval, do not reject the null hypothesis. Otherwise, reject the null hypothesis.
-
SAT
In a discussion of SAT scores, someone comments: "Because only a minority of students take the test, the scores overestimate the ability of typical seniors. The mean SAT-M score is about 475, but I think if all seniors took the test, the mean would be 450."
You gave the test to an SRS of 500 seniors from California. These students had an average score of 461. (The SAT-M score follows a normal distribution with a standard deviation of 100.)
Is there sufficient evidence against the claim that the mean for all California seniors is 450 at a significance level of 0.05?
Give a 95% CI for the mean score μ of all seniors.
-
SAT
The hypotheses are
The test statistic is
Because Ha is two-sided, the P-value is
Conclusion:
A 95% confidence interval for μ is
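The quantities left blank on this slide can be computed from the numbers in the problem statement (n = 500, sample mean 461, standard deviation 100, claimed mean 450, level 0.05). The sketch below follows the slide in treating the alternative as two-sided; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0 = 450.0     # claimed mean for all California seniors
x_bar = 461.0   # sample mean of the SRS
sigma = 100.0   # known standard deviation
n = 500
alpha = 0.05

norm = NormalDist()
se = sigma / math.sqrt(n)

# Test statistic and two-sided P-value
z = (x_bar - mu0) / se
p_value = 2.0 * (1.0 - norm.cdf(abs(z)))
reject = p_value <= alpha

# 95% confidence interval for the mean score
z_star = norm.inv_cdf(1.0 - alpha / 2.0)
ci_low, ci_high = x_bar - z_star * se, x_bar + z_star * se

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```

The claimed value 450 falls outside the 95% interval and the P-value is below 0.05, which agrees with the link between confidence intervals and 2-sided tests.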
-
Take Home Message
Tests of significance: when to use it
Two hypotheses:
Null
Alternative
Test for a population mean with known σ
Test statistic
P-value
Significance level α
P-value method: 4 steps
Rejection region method
CI and 2-sided test
-
Homework 12.1
Reading in Text 435-452
Exercises in Text
6.32, 6.36, 6.44, 6.48, 6.52, 6.56
Due Time
Thursday, April 28