mms testing of hypothesis

8/6/2019 Mms Testing of Hypothesis


Statistical Estimation

1. Point and interval estimation

2. Confidence interval for mean, proportion & variance



8/8/2011 Lecture24 2

Introduction

Everyone makes estimates!

When you are ready to cross a street, you estimate the speed of any approaching car, the distance between you and that car, and your own speed. Having made these quick estimates, you decide whether to wait, walk, or run.

All managers must make quick estimates too. The outcome of these estimates can seriously affect their organizations.



Introduction

How do managers use sample statistics to estimate population parameters?

Statistical estimation methods enable us to estimate the population proportion with reasonable accuracy.

If all these estimates were obtained on a census basis, it would be a very costly and time-consuming proposition. Hence the need for sampling theory.



Statistical Estimation

Statistical estimation is the procedure of using a sample statistic to estimate a population parameter. A statistic used to estimate a parameter is called an estimator, and the value taken by the estimator is called an estimate.

Statistical estimation is divided into two main categories: point estimation and interval estimation.



Point estimation

A point estimate is a single number that is used to estimate an unknown population parameter.

Example: if a firm takes a sample of 50 salesmen,

and the average amount of time each salesman spends with his customers is 80 minutes,

and this figure is used as an estimate of the population parameter,

then 80 minutes is the point estimate.



Interval estimation

An estimate of a population parameter given by two numbers between which the parameter may be considered to lie is called an interval estimate of the parameter.

An interval estimate is a range of values used to estimate a population parameter.

Example: the average amount of time each salesman spends with his customers is between 60 and 80 minutes.



Criteria of a Good Estimator.

A good estimator is one whose estimates are close to the population parameter being estimated.

We can evaluate the quality of a statistic as an estimator by using four criteria:

Unbiasedness

Consistency

Efficiency

Sufficiency



Unbiasedness

An estimator is said to be unbiased if the expected value of the estimator is equal to the population parameter being estimated.

OR

If the mean of the estimator's values over all possible samples is equal to the population parameter, then it is an unbiased estimator.



Consistency

As the sample size increases, the difference between the sample statistic and the population parameter should become smaller and smaller. If the difference continues to become smaller and smaller as the sample size becomes larger, the sample statistic is said to converge in probability to the parameter and is said to be a consistent estimator of that parameter.



Efficiency

The efficiency of an estimator depends on its variance. If the variance of the estimator is small, then its estimates tend to be closer to the parameter value.

For example, for a symmetric population the sample mean and the sample median are both unbiased and consistent estimators of the population mean. Choose between them on the basis of relative efficiency (select the one which has the smaller variance).
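The comparison of two estimators by variance can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: the population is taken to be standard normal, and the sample size, number of repetitions and random seed are arbitrary choices.

```python
import random
import statistics


def estimator_variances(n=25, reps=2000, seed=0):
    """Simulate repeated samples and return the variance of the
    sample mean and of the sample median across those samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(reps):
        # Assumed population: normal with mean 0 and standard deviation 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.variance(means), statistics.variance(medians)


var_mean, var_median = estimator_variances()
print(f"variance of sample mean:   {var_mean:.4f}")
print(f"variance of sample median: {var_median:.4f}")
```

For normal data the sample mean should show the smaller variance, so it would be the more efficient of the two estimators.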



Sufficiency

A sufficient estimator is one that uses all the information about the population parameter contained in the sample.

Example: the sample mean is a sufficient estimator of the population mean, since all the information in the sample is used in its computation. The sample range is not, since it uses only the largest and smallest values.



Example

Consider a medical supplies company that produces disposable hypodermic syringes. Each syringe is wrapped in a sterile package and then jumble-packed in a large corrugated carton. Jumble packing causes the cartons to contain differing numbers of syringes. Because the syringes are sold on a per-unit basis, the company needs an estimate of the number of syringes per carton for billing purposes.



Cont..

A sample of 35 cartons is taken and the number of syringes in each carton is recorded.

Obtain the sample mean.

Sample mean = 102 syringes

Then we can say that the point estimate of the population mean is 102 syringes per carton.



Cont..

The manufactured price of a disposable hypodermic syringe is quite small, so both the buyer and the seller would accept the use of this point estimate of 102 as the basis for billing.

The manufacturer can save the time and expense of counting each syringe that goes into a carton.



Interval estimates

An interval estimate describes a range of values within which a population parameter is likely to lie.



Example

Suppose the marketing research director needs an estimate of the average life in months of the car batteries his company manufactures. Select a sample of 200 car owners. Interview these owners and collect data about the life of their batteries.

Let the mean life = 36 months. The point estimate is then 36 months.



Cont..

If the director asks for a statement about the uncertainty that is likely to accompany this estimate, we can give a range.

That can be done by calculating the standard error of the mean,

$\sigma_{\bar{x}} = \sigma / \sqrt{n}$

Say $\sigma_{\bar{x}} = 0.707$.

We could now report that our estimate of the life of the company's batteries may lie somewhere in the range of 35.293 to 36.707 months (the sample mean plus or minus one standard error).
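The battery figures can be reproduced with a few lines of Python. The slides do not state the population standard deviation; the value of 10 months used here is an assumption, chosen because it gives the quoted standard error of 0.707 with n = 200.

```python
import math

n = 200             # sample size from the example
sample_mean = 36.0  # mean battery life in months
sigma = 10.0        # ASSUMED population standard deviation (not given in the slides)

# Standard error of the mean: sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)

# Range of one standard error on either side of the sample mean
lower = sample_mean - standard_error
upper = sample_mean + standard_error

print(f"standard error = {standard_error:.3f}")
print(f"range = {lower:.3f} to {upper:.3f} months")
```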



Confidence Interval

The probability that we associate with an interval estimate is called the confidence level.

How confident?

The most commonly used confidence levels are 90%, 95% and 99%.

We are free to apply any confidence level.



Interval estimation: Student's t distribution

Use the t distribution whenever the sample size is 30 or less and the population standard deviation is not known.

In using the t distribution we assume that the population is normally distributed.

A t distribution is lower at the mean and higher at the tails than a normal distribution.

There is a separate t distribution for each sample size, that is, for each number of degrees of freedom.



Degrees of freedom.

The number of values we can choose freely.



Example

As part of the budgeting process for next year, the manager of the Fan point electric generating plant must estimate the coal he will need for this year.

Last year the plant almost ran out, so he is reluctant to budget for that same amount again. The plant manager took a random sample of 10 plant operating weeks chosen over the last 5 years. It yielded a mean usage of 11400 tons a week and a sample standard deviation of 700 tons a week. Calculate a sensible estimate of the amount (with 95% confidence) to order this year.



n = 10, df = 9

Sample mean = 11400

S.D. = 700 (used as an approximation of the population S.D.)

Standard error: $\sigma_{\bar{x}} = \sigma / \sqrt{n} \approx 700 / \sqrt{10} = 221.36$

From the t-table, for 9 degrees of freedom and a two-tailed area of (1.00 - 0.95) = 0.05, the t value is 2.262.

The confidence interval is 11400 ± 2.262 × 221.36,

i.e. 10899 to 11901 tons a week with 95% confidence.
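The arithmetic of this coal example can be checked with a short script. It is a minimal sketch: the critical value 2.262 is copied from the slide's t-table rather than computed, because the Python standard library has no t distribution.

```python
import math

n = 10
sample_mean = 11400.0  # tons per week
sample_sd = 700.0      # tons per week
t_critical = 2.262     # from the t-table: df = 9, two-tailed area 0.05

# Estimated standard error of the mean
standard_error = sample_sd / math.sqrt(n)

# Margin of error and interval limits
margin = t_critical * standard_error
lower = sample_mean - margin
upper = sample_mean + margin

print(f"standard error = {standard_error:.2f}")
print(f"95% interval: {lower:.0f} to {upper:.0f} tons per week")
```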



Tests of Hypothesis



Suppose the manager of a large shopping mall tells us that the average work efficiency of her employees is at least 90%. How can we test the validity of her claim?

We could calculate the efficiency of a sample of her employees.

If this sample statistic came out to be 95%, we would accept the manager's statement.

But if it is 46%, we would reject her assumption as untrue.

Suppose the sample statistic is 88%. Do we accept or reject?

We cannot be absolutely certain that our decision is correct.

Therefore we must learn to deal with uncertainty in our decision making.



Hypothesis

Here we wish to test efficiency = 90% (null)

against the alternative, efficiency ≠ 90% (alternate).

Or we can say:

null hypothesis H0: μ = 90

alternate hypothesis H1: μ ≠ 90



Level of significance

If we assume the hypothesis is correct, then the significance level indicates the percentage of sample means that fall outside certain limits.



Cont..

The purpose of testing is not to question the computed value of the sample statistic but to make a judgment about the difference between that sample statistic and the hypothesized population parameter.



Introduction

A hypothesis is an assumption about a population parameter, to be tested on the basis of sample information.

Hypothesis tests are widely used in business and industry for making decisions.

Examples

Based on sample data, decide whether a new medicine is really effective in curing a disease.

Decide whether one training procedure is better than another.



The hypothesis is made about the value of some parameter (the only facts available to estimate the true parameter are those provided by the sample).

If the sample statistic differs from the hypothesis made about the population parameter, and the difference is significant, then reject the hypothesis.

If the difference is not significant, then the hypothesis must be accepted. Hence tests of hypothesis.



Procedures of Hypothesis Testing

Set up a hypothesis

Set up a suitable significance level

Determination of a suitable test statistic

Determination of the critical region

Doing computations

Making decisions



Set up a hypothesis

Establish the hypothesis to be tested.

Set up the null hypothesis, denoted by H0, and the alternate hypothesis, denoted by H1.

The null hypothesis: there is no true difference between the sample statistic and the population parameter under consideration.



Set up a hypothesis

The hypothesis that is different from the null hypothesis is the alternate hypothesis H1.

If the sample information leads us to reject H0, then accept H1.



Set up a suitable significance level

The confidence with which an experimenter rejects or retains the null hypothesis depends on the level of significance.

The level of significance is denoted by α.

It is generally specified before any sample is drawn (so that the results obtained have no influence on the choice).

In practice a 5% or 1% level of significance is used.

5%: there are 5 chances out of 100 that we would reject the null hypothesis when it is true (we are 95% confident of making the right decision).



When the null hypothesis is rejected at α = 0.05, the result is said to be significant.

When the null hypothesis is rejected at α = 0.01, the result is said to be highly significant.



Determination of the critical region

Determine

which values of the test statistic will lead to a rejection of H0,

and which will lead to acceptance of H0. The former set is called the critical region.

Establishing a critical region is similar to determining a 100(1 - α)% confidence interval.



Doing computations

Calculations for step 3



Making decisions

Draw statistical conclusions

Either accept the null hypothesis or reject it,

based on whether the computed value of the test statistic falls in the region of acceptance or the region of rejection.



Procedures for statistical inference

Point estimation.

Appropriate when the goal is to estimate a population parameter.

Confidence interval.

Appropriate when the goal is to estimate a population parameter with confidence.

Hypothesis testing.

Hypothesis: a statement about the parameters.

Appropriate when the goal is to assess whether the evidence provided by the data is in favor of some claim about the population.



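Confidence Interval

Point estimate +/- margin of error

Confidence interval for a population mean

Assumption: the population variance is known.

Confidence level: C

$\left( \bar{x} - z^{*} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z^{*} \frac{\sigma}{\sqrt{n}} \right)$

where $z^{*}$ is the standard normal critical value for confidence level C.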



Hypothesis Testing

Sometimes you are not interested in

estimating an unknown parameter, or

providing a confidence interval for the parameter.

Rather, you have some claim (belief) about the parameter and you want to see whether the data support or contradict the claim.



Concepts of Hypothesis Testing

The critical concepts of hypothesis testing: two hypotheses

H0 - the null hypothesis

The statement of no effect or no difference.

Ha - the alternative hypothesis

The statement we hope or suspect is true.

Usually one would decide on Ha first.



Biased one-Euro Coin?

A group of Statistics students spun the Belgian one-Euro coin 250 times, and it came up heads 140 times.

p: the probability of getting a head on each spin.

One-sided: H0: p = .5 against Ha: p > .5.

Two-sided: H0: p = .5 against Ha: p ≠ .5.



Company Billing System

A new billing system for a company will be cost-effective only if the mean monthly account is more than $170.

A sample of 400 monthly accounts has a mean of $178.

If the accounts are normally distributed with σ = $65, can we conclude that the new system will be cost-effective?

The population is the credit accounts at the store.

We want to show that the mean account for all customers is greater than $170. Ha: μ > 170.

The null hypothesis must specify a single value of the parameter μ. H0: μ = 170.

How can we achieve that?



Test statistic

A test is based on a statistic, which estimates the parameter that appears in the hypotheses.

Point estimate

Values of the estimate far from the parameter value in H0 give evidence against H0.

Ha determines which direction will be counted as far from the parameter value.



Company Billing System

Question:

Is a sample mean of 178 sufficiently greater than 170 to infer that the population mean is greater than 170?

Answer:

Let's assume the population mean is 170, and see how likely it is for us to observe a sample mean of 178 or even more.



P-value

P-value: the probability of observing a test statistic as extreme as or more extreme than the actually observed value, given that H0 is true.

"Extreme" means far from what we would expect under H0.

The P-value provides information about the amount of statistical evidence that supports the null hypothesis.

The smaller the P-value, the less evidence there is for H0.



Interpreting P-value

Because the probability that the sample mean is equal to or larger than 178, when μ = 170, is so small (.0069), there is little reason to believe that μ = 170

(or, there is reason to believe that μ > 170).

We can conclude that the smaller the P-value, the more statistical evidence exists to support the alternative hypothesis.
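The probability of .0069 quoted for the billing example can be reproduced as follows. This is a sketch that uses `statistics.NormalDist` from the Python standard library for the standard normal distribution; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

mu0 = 170.0          # mean under H0
sample_mean = 178.0  # observed sample mean
sigma = 65.0         # known population standard deviation
n = 400              # number of monthly accounts sampled

# Standardized distance of the sample mean from mu0
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# One-sided P-value for Ha: mu > 170, i.e. P(Z >= z)
p_value = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```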



Describing P-value

If the P-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.

If the P-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.

If the P-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.

If the P-value exceeds 10%, there is no evidence that supports the alternative hypothesis.
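The verbal scale above can be written as a small lookup function. The function name is hypothetical, and the treatment of values that fall exactly on a boundary (1%, 5%, 10%) is an assumption, since the slide does not say which band they belong to.

```python
def describe_p_value(p_value):
    """Return the slide's verbal description of the evidence
    for the alternative hypothesis, given a P-value in [0, 1]."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    # Assumption: a boundary value is placed in the weaker band
    if p_value < 0.01:
        return "overwhelming"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "weak"
    return "none"


for p in (0.0069, 0.03, 0.07, 0.20):
    print(p, describe_p_value(p))
```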



Significance Level α

We need to make a conclusion after carrying out the hypothesis test. What do we conclude?

We can compare the P-value with a fixed value that we regard as decisive.

This amounts to announcing in advance how much evidence against H0 we require in order to reject H0.

The decisive value is called the significance level of the test. It is denoted by α, and the corresponding test is called a level α test.

Statistical significance: if the P-value ≤ α, we say that the data are statistically significant at level α.



α and P-value

P-value and significance level α:

Reject H0 if P-value ≤ α.

Do not reject H0 if P-value > α.

When is it easier to reject H0?

Large α or small α?

When is the evidence against H0 stronger?

Large P-value or small P-value?



Four steps of hypothesis testing

Define the hypotheses to test, and the required significance level α.

Calculate the value of the test statistic.

Find the P-value based on the observed data.

State the conclusion.

Reject the null hypothesis if the P-value ≤ α; otherwise, the data do not provide sufficient evidence to reject the null.



Testing for normal mean with known σ

Let X1, …, Xn be a random sample from N(μ, σ²).

Null hypothesis: H0: μ = μ0

Alternative hypothesis: Ha: μ ≠ μ0, Ha: μ > μ0, or Ha: μ < μ0



Normal with known σ: Z test

When H0 is true, $\mu_{\bar{X}} = \mu_0$ and

$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$

has a standard normal distribution.

Z is a natural measure of the distance between the sample mean $\bar{X}$ and its expected value $\mu_0$.

For a given sample, we observe

$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}.$

If H0 is true, we expect z to be close to 0.



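Normal with known σ

Case 1: Ha: μ ≠ μ0. H0 should be rejected if z is too far away from 0.

The P-value is $2P(Z \ge |z|)$.

Case 2: Ha: μ > μ0. H0 should be rejected if z is much larger than 0.

The P-value is $P(Z \ge z)$.

Case 3: Ha: μ < μ0. H0 should be rejected if z is much smaller than 0.

The P-value is $P(Z \le z)$.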



Normal with known σ: P-value method

Null hypothesis: H0: μ = μ0

Test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

Alternative hypothesis    P-value
Ha: μ ≠ μ0                $2P(Z \ge |z|)$
Ha: μ > μ0                $P(Z \ge z)$
Ha: μ < μ0                $P(Z \le z)$
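The P-value method for a normal mean with known standard deviation can be sketched as one function covering the three alternatives. The function name, the `alternative` labels and the numbers in the usage lines are hypothetical and chosen only for illustration; the P-value formulas are the standard ones for a z test.

```python
import math
from statistics import NormalDist


def z_test_p_value(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """Return (z, P-value) for a test of H0: mu = mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "two-sided":   # Ha: mu != mu0
        p = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    elif alternative == "greater":   # Ha: mu > mu0
        p = 1.0 - std_normal.cdf(z)
    elif alternative == "less":      # Ha: mu < mu0
        p = std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p


# Illustrative (made-up) data: sample mean 52, mu0 = 50, sigma = 4, n = 16
z, p_two = z_test_p_value(52.0, 50.0, 4.0, 16, "two-sided")
_, p_greater = z_test_p_value(52.0, 50.0, 4.0, 16, "greater")
_, p_less = z_test_p_value(52.0, 50.0, 4.0, 16, "less")
print(f"z = {z:.2f}, two-sided P = {p_two:.4f}")
print(f"upper-tail P = {p_greater:.4f}, lower-tail P = {p_less:.4f}")
```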



Sprinkler

A sprinkler systems maker claims that the true average system-activation temperature is 130°. A sample of n = 9 systems, when tested, yields a sample average activation temperature of 131.08°. If the distribution of activation temperature is normal with σ = 1.5°, does the data contradict the claim at significance level α = .01?

Let μ = true average activation temperature.

Hypotheses:

Test statistic:

P-value:

Conclusion:
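The slide leaves the four answers to be worked out. One way to carry out the calculation is sketched below, assuming a two-sided alternative (H0: μ = 130 against Ha: μ ≠ 130), since the question is whether the data contradict the claimed value in either direction.

```python
import math
from statistics import NormalDist

mu0 = 130.0           # claimed mean activation temperature
sample_mean = 131.08  # observed sample mean
sigma = 1.5           # known population standard deviation
n = 9                 # sample size
alpha = 0.01          # significance level

# Test statistic
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# Two-sided P-value (ASSUMED alternative: mu != 130)
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Decision rule: reject H0 when the P-value is at most alpha
reject = p_value <= alpha

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Under these assumptions the P-value is larger than .01, so the data would not contradict the maker's claim at that level.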



Rejection Region Method

The rejection region is a range of values such that if the test statistic falls into that range, the null hypothesis is rejected.

The rejection region method:

Define the hypotheses to test, and the required significance level α.

Find the corresponding rejection region.

Calculate the test statistic.

Reject the null hypothesis only if the value of the test statistic falls in the rejection region.



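Normal with known σ: Rejection Region Method

Null hypothesis: H0: μ = μ0

Test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

Alternative hypothesis    Rejection region for level α test
Ha: μ ≠ μ0                $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$
Ha: μ > μ0                $z \ge z_{\alpha}$
Ha: μ < μ0                $z \le -z_{\alpha}$

Here $z_{\alpha}$ is the value with upper-tail area α under the standard normal curve.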



Sprinkler

A sprinkler systems maker claims that the true average system-activation temperature is 130°. A sample of n = 9 systems, when tested, yields a sample average activation temperature of 131.08°. If the distribution of activation temperature is normal with σ = 1.5°, does the data contradict the claim at significance level α = .01?

Let μ = true average activation temperature.

1 Hypotheses:

2 Rejection region:

3 Test statistic:

4 Conclusion:
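The same sprinkler data can be run through the rejection region method. This sketch again assumes a two-sided alternative (Ha: μ ≠ 130) and obtains the critical value from `NormalDist().inv_cdf` instead of a printed table.

```python
import math
from statistics import NormalDist

mu0 = 130.0
sample_mean = 131.08
sigma = 1.5
n = 9
alpha = 0.01

# Two-sided critical value: upper-tail area alpha / 2 (ASSUMED alternative mu != 130)
z_critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)

# Test statistic
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

# H0 is rejected only if z falls in the rejection region |z| >= z_critical
in_rejection_region = abs(z) >= z_critical

print(f"rejection region: |z| >= {z_critical:.3f}")
print(f"z = {z:.2f}, in rejection region: {in_rejection_region}")
```

The decision agrees with the P-value method applied to the same data, as it must for any fixed α.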



Sprinkler Revisited

A sprinkler systems maker claims that the true average system-activation temperature is 130°. A sample of n = 9 systems, when tested, yields a sample average activation temperature of 131.08°. If the distribution of activation temperature is normal with σ = 1.5°,

does the data contradict the claim at significance level α = .01?

what's the 99% confidence interval for the mean activation temperature?
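A 99% confidence interval for the mean activation temperature can be computed from the same figures. This is a sketch using the known-σ interval (sample mean plus or minus z* times σ/√n), with z* taken from `NormalDist().inv_cdf`.

```python
import math
from statistics import NormalDist

mu0 = 130.0           # claimed mean, used only for the final check
sample_mean = 131.08
sigma = 1.5
n = 9
confidence = 0.99

# Critical value z* leaving (1 - confidence) / 2 in each tail
z_star = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)

# Margin of error and interval limits
margin = z_star * sigma / math.sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

print(f"99% CI: ({lower:.3f}, {upper:.3f})")
print(f"claimed value {mu0} inside the interval: {lower <= mu0 <= upper}")
```

The claimed value of 130 lies inside this interval, which matches the two-sided test's failure to reject at α = .01.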



CI & 2-Sided Tests

A level α 2-sided test rejects H0: μ = μ0 exactly when the value μ0 falls outside a level 1 - α confidence interval for μ.

A confidence interval can be used to test hypotheses.

Calculate the level 1 - α confidence interval; then, if μ0 falls within the interval, do not reject the null hypothesis. Otherwise, reject the null hypothesis.



SAT

In a discussion of SAT scores, someone comments: "Because only a minority of students take the test, the scores overestimate the ability of typical seniors. The mean SAT-M score is about 475, but I think if all seniors took the test, the mean would be 450."

You gave the test to an SRS of 500 seniors from California. These students had an average score of 461. (The SAT-M score follows a normal distribution with a standard deviation of 100.)

Is there sufficient evidence against the claim that the mean for all California seniors is 450 at a significance level of 0.05?

Give a 95% CI for the mean score μ of all seniors.



SAT

The hypotheses are

The test statistic is

Because Ha is two-sided, the P-value is

Conclusion:

A 95% confidence interval for μ is
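The blanks on this slide can be filled in numerically. The sketch below takes H0: μ = 450 against the two-sided Ha: μ ≠ 450, as the slide indicates, and uses the stated σ = 100 and n = 500; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0 = 450.0          # claimed mean for all seniors
sample_mean = 461.0  # mean of the SRS
sigma = 100.0        # known standard deviation
n = 500              # sample size
alpha = 0.05

std_normal = NormalDist()
standard_error = sigma / math.sqrt(n)

# Test statistic and two-sided P-value
z = (sample_mean - mu0) / standard_error
p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
reject = p_value <= alpha

# 95% confidence interval for mu
z_star = std_normal.inv_cdf(1.0 - alpha / 2.0)
lower = sample_mean - z_star * standard_error
upper = sample_mean + z_star * standard_error

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Here the test rejects H0 at the 0.05 level and the interval excludes 450, so the two approaches agree.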



Take Home Message

Tests of significance: when to use them

Two hypotheses:

Null

Alternative

Test for a population mean with known σ

Test statistic

P-value

Significance level α

P-value method: 4 steps

Rejection region method

CI and 2-sided test



Homework 12.1

Reading in Text 435-452

Exercises in Text

6.32, 6.36, 6.44, 6.48, 6.52, 6.56

Due Time

Thursday, April 28
