agenda


Upload: varick

Post on 04-Jan-2016



Page 1: AGENDA

AGENDA

1. SIRS Forms

2. Quiz 4

3. Chi-Square

• Goodness of Fit

• Test of Independence

Page 2: AGENDA

3. Chi-square Tests

In this class, we will learn tests that make use of the chi-square distribution, with enumerative (counts or frequencies) data:

1. Chi-square tests for goodness of fit

• With equal expected frequencies (proportions)

• With unequal expected frequencies (proportions)

2. Chi-square tests for independence

• Contingency table analysis

Page 3: AGENDA

Chi-square Test for Goodness-of-Fit

The goodness-of-fit test is one of the most commonly used nonparametric tests.

The purpose of this test is to determine how well an observed set of data fits an expected outcome.

Chi-square analysis is useful for goodness-of-fit tests because many real-world situations in business and other areas allow for the collection of count data.

Page 4: AGENDA

Two types of Chi-square Goodness-of-Fit:

1. With equal expected frequencies: All the proportions are hypothesized to be equal to each other.

H0: $p_1 = p_2 = p_3 = p_4 = \dots = p_k = 1/k$

H1: At least one proportion is not equal to $1/k$

2. With unequal expected frequencies: The hypothesized proportions are not all equal to each other.

H0: $p_i = p_{i,H_0}$ for every category $i = 1, \dots, k$, where $p_{i,H_0}$ is the hypothesized proportion for category $i$

H1: At least one $p_i$ is not equal to its hypothesized value

Page 5: AGENDA

Chi-square Test for Goodness-of-Fit

Hypotheses:

H0: Probabilities of occurrence of events are equal to the given probabilities.

H1: Probabilities of occurrence of events are not equal to the given probabilities.

The expected count in the cell:

$E_i = n p_i$

The test statistic for this test is:

$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$

The critical value for this test is (pg 761):

$\chi^2_{k-1}$

where k is the # of categories
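A minimal Python sketch of the goodness-of-fit statistic on this slide, under stated assumptions: the function names `expected_counts` and `chi_square_gof` and the sample counts are hypothetical illustrations, not part of the original slides. It computes E_i = n * p_i for each category and then sums (O_i - E_i)^2 / E_i.

```python
def expected_counts(n, proportions):
    """Expected count per category: E_i = n * p_i."""
    return [n * p for p in proportions]


def chi_square_gof(observed, proportions):
    """Return (statistic, df) for a chi-square goodness-of-fit test.

    statistic = sum over the k categories of (O_i - E_i)^2 / E_i
    df        = k - 1
    """
    n = sum(observed)
    expected = expected_counts(n, proportions)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df


# Hypothetical counts (NOT from the slides): 100 observations in 3 categories.
observed = [48, 30, 22]
proportions = [0.50, 0.25, 0.25]

statistic, df = chi_square_gof(observed, proportions)
print(f"chi-square = {statistic:.2f}, df = {df}")  # chi-square = 1.44, df = 2
```

The statistic is then compared with the chi-square critical value for k - 1 degrees of freedom; H0 is rejected when the statistic is larger.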

Page 6: AGENDA

Example (with equal expected frequencies)

The marketing manager of a manufacturer of sports cards plans to begin a series of cards with pictures and playing statistics of former major league baseball players. At the baseball card show at a mall last weekend, she set up a booth and offered cards of the following six Hall of Fame baseball players: Dizzy Dean, Bob Feller, Phil Rizzuto, Warren Spahn, Mickey Mantle, and Willie Mays. At the end of the first day she had sold a total of 120 cards. Can she conclude that the sales of cards are the same for the six players?

#   Player          Cards sold
1   Dizzy Dean      13
2   Bob Feller      33
3   Phil Rizzuto    14
4   Warren Spahn     7
5   Mickey Mantle   36
6   Willie Mays     17
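The baseball card example can be worked through as follows. This is a sketch rather than the instructor's solution: the significance level of 0.05 is an assumption (the slide does not state one), and 11.070 is the standard chi-square table value for 5 degrees of freedom at that level.

```python
# Cards sold per player on the first day (data from the example).
cards_sold = {
    "Dizzy Dean": 13,
    "Bob Feller": 33,
    "Phil Rizzuto": 14,
    "Warren Spahn": 7,
    "Mickey Mantle": 36,
    "Willie Mays": 17,
}

n = sum(cards_sold.values())  # total cards sold
k = len(cards_sold)           # number of categories (players)
expected = n / k              # equal expected frequency under H0

# Chi-square statistic: sum of (O_i - E_i)^2 / E_i
chi_square = sum((o - expected) ** 2 / expected for o in cards_sold.values())
df = k - 1

# Assumption: alpha = 0.05; table value for df = 5.
CRITICAL_VALUE = 11.070
reject_h0 = chi_square > CRITICAL_VALUE

print(f"n = {n}, expected per player = {expected:.0f}")
print(f"chi-square = {chi_square:.1f}, df = {df}, reject H0: {reject_h0}")
# n = 120, expected per player = 20
# chi-square = 34.4, df = 5, reject H0: True
```

Under these assumptions the statistic (34.4) exceeds the critical value, so the data do not support the claim that sales are the same for the six players.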

Page 7: AGENDA

Example (with unequal expected frequencies)

A national study of hospital admissions during a two-year period revealed these statistics concerning senior citizens who resided in care centers and who were hospitalized anytime during the period: Forty percent were admitted only once in the two-year period. Twenty percent were admitted twice. Fourteen percent were admitted three times, and so on.

The administrator of the local hospital is anxious to compare her hospital’s experience with the national pattern. She selected 400 senior citizens in local care centers who needed hospitalization and determined the number of times during a two-year period each was admitted to her hospital. How can the locally observed frequencies be compared with national percentages?

Number of times   Observed # of      Expected percent
admitted          admissions (Oi)    (pi)
1                 165                40%
2                  79                20%
3                  50                14%
4                  44                10%
5                  32                 8%
6                  20                 6%
7                  10                 2%
Total             400               100%
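One way to compare the local frequencies with the national percentages is to convert each percentage into an expected count (E_i = n * p_i) and compute the goodness-of-fit statistic. The sketch below does this; the significance level of 0.05 is an assumption not stated on the slide, and 12.592 is the standard table value for 6 degrees of freedom at that level.

```python
# Local observed counts and national percentages, indexed by
# number of times admitted (1 through 7), from the example.
observed = [165, 79, 50, 44, 32, 20, 10]
national_percent = [40, 20, 14, 10, 8, 6, 2]

n = sum(observed)  # 400 senior citizens

# Expected count in each category: E_i = n * p_i
expected = [n * pct / 100 for pct in national_percent]

# Chi-square statistic: sum of (O_i - E_i)^2 / E_i
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Assumption: alpha = 0.05; table value for df = 6.
CRITICAL_VALUE = 12.592
reject_h0 = chi_square > CRITICAL_VALUE

print("expected counts:", expected)
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_h0}")
```

With these inputs the statistic comes out at roughly 2.38, well below the critical value, so at the assumed 5% level there is no evidence that the local pattern differs from the national one.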

Page 8: AGENDA

Chi-square Test for Independence

A common problem in applied statistics is deciding whether two variables are related.

So far, we have used different methods for investigating the relationship between two or more variables (ANOVA, regression, etc.). However, those methods are valid only if at least the dependent variable is continuous.

Now, we will see how the chi-square statistic can be adapted to test the independence of two categorical variables.

Page 9: AGENDA

Chi-square Test for Independence

Hypotheses:

H0: The two classification variables are independent.

H1: The two classification variables are not independent.

The expected count in the cell:

$E_{ij} = \frac{R_i C_j}{n}$

The test statistic for this test is:

$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$

The critical value for this test is:

$\chi^2_{(r-1)(c-1)}$

where r and c are the # of categories for the two variables.
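The expected counts and the test statistic for the independence test can be sketched in Python as below. The function names `expected_table` and `chi_square_independence` and the 2 x 2 table are hypothetical illustrations, not part of the original slides. Each expected count is the row total times the column total divided by n, and the degrees of freedom are (r - 1)(c - 1).

```python
def expected_table(observed):
    """Expected counts E_ij = R_i * C_j / n for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(observed):
    """Return (statistic, df) for a chi-square test of independence."""
    expected = expected_table(observed)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical 2 x 2 table (NOT from the slides).
table = [[10, 20], [30, 40]]

print(expected_table(table))  # [[12.0, 18.0], [28.0, 42.0]]
statistic, df = chi_square_independence(table)
print(f"chi-square = {statistic:.3f}, df = {df}")
```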

Page 10: AGENDA

Contingency Table

Row                              Column Classification Variable
Classification
Variable        1           2           3           4           5           6           Row Total
1               O11 (E11)   O12 (E12)   O13 (E13)   O14 (E14)   O15 (E15)   O16 (E16)   R1
2               O21 (E21)   O22 (E22)   O23 (E23)   O24 (E24)   O25 (E25)   O26 (E26)   R2
3               O31 (E31)   O32 (E32)   O33 (E33)   O34 (E34)   O35 (E35)   O36 (E36)   R3
4               O41 (E41)   O42 (E42)   O43 (E43)   O44 (E44)   O45 (E45)   O46 (E46)   R4
5               O51 (E51)   O52 (E52)   O53 (E53)   O54 (E54)   O55 (E55)   O56 (E56)   R5
Column Total    C1          C2          C3          C4          C5          C6          n

Page 11: AGENDA

Example #1

A publishing house wanted to find out whether there is a dependence between the place where the book is sold and the color of its cover. For one of its latest novels, the publisher sent displays and a supply of copies of the novel to large bookstores in five major cities. The resulting sales of the novel for each city-color combination are as follows. Numbers are in thousands of copies sold over a 3-month period. Conduct the test for independence of color and location.

                      Color
City           Red   Blue   Green   Yellow   Total
New York        21     27      40       15     103
Washington      14     18      28        8      68
Boston          11     13      21        7      52
Chicago          3     33      30        9      75
Los Angeles     30     11      34       10      85
Total           79    102     153       49     383
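The city-by-color data can be run through the independence test as sketched below. This is an illustration rather than the instructor's worked solution: the significance level of 0.05 is an assumption, and 21.026 is the standard chi-square table value for 12 degrees of freedom at that level.

```python
cities = ["New York", "Washington", "Boston", "Chicago", "Los Angeles"]
colors = ["Red", "Blue", "Green", "Yellow"]

# Sales in thousands of copies; rows follow `cities`, columns follow `colors`.
sales = [
    [21, 27, 40, 15],
    [14, 18, 28, 8],
    [11, 13, 21, 7],
    [3, 33, 30, 9],
    [30, 11, 34, 10],
]

row_totals = [sum(row) for row in sales]
col_totals = [sum(col) for col in zip(*sales)]
n = sum(row_totals)

# Accumulate (O_ij - E_ij)^2 / E_ij with E_ij = R_i * C_j / n.
chi_square = 0.0
for i, row in enumerate(sales):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi_square += (o - e) ** 2 / e

df = (len(cities) - 1) * (len(colors) - 1)

# Assumption: alpha = 0.05; table value for df = 12.
CRITICAL_VALUE = 21.026
reject_h0 = chi_square > CRITICAL_VALUE

print(f"n = {n}, df = {df}")
print(f"chi-square = {chi_square:.2f}, reject H0: {reject_h0}")
```

The statistic comes out at roughly 34, driven mostly by the Red and Blue cells for Chicago and Los Angeles, so at the assumed 5% level independence of color and location would be rejected.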

Page 12: AGENDA

Example #2

An analyst in the soft drink industry wants to conduct a statistical test to determine whether there is a relationship between a person’s preference for one of the four brands (Coke, Pepsi, 7Up, and Dr. Pepper) and whether the person drinks regular or diet drinks. A random sample of 330 people is selected, and their responses are as follows.

                 Soft Drink Preference
           Coke   Pepsi   7Up   Dr Pepper   Total
Diet         55      32    47          21     155
Regular      60      43    35          37     175
Total       115      75    82          58     330
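For the soft drink preference data, a sketch of the same independence test follows. The slide does not state a significance level, so the code checks two commonly used ones as an assumption; 7.815 and 6.251 are the standard chi-square table values for 3 degrees of freedom at 0.05 and 0.10.

```python
brands = ["Coke", "Pepsi", "7Up", "Dr Pepper"]

# Responses from the sample of 330 people; columns follow `brands`.
responses = {
    "Diet": [55, 32, 47, 21],
    "Regular": [60, 43, 35, 37],
}

rows = list(responses.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
n = sum(row_totals)

# Chi-square statistic with E_ij = R_i * C_j / n.
chi_square = sum(
    (o - r * c / n) ** 2 / (r * c / n)
    for row, r in zip(rows, row_totals)
    for o, c in zip(row, col_totals)
)
df = (len(rows) - 1) * (len(brands) - 1)

# Assumption: candidate significance levels and their table values for df = 3.
critical_values = {0.05: 7.815, 0.10: 6.251}
decisions = {alpha: chi_square > cv for alpha, cv in critical_values.items()}

print(f"n = {n}, df = {df}, chi-square = {chi_square:.2f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: reject H0 = {reject}")
```

The statistic is roughly 6.8, which falls between the two critical values, so the conclusion here depends on the significance level chosen: H0 would be rejected at 0.10 but not at 0.05.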