mb0040 slm unit10


Upload: smu-sovabazar

Post on 04-Jan-2016


Statistics for Management Unit 10

Sikkim Manipal Page No. 1

Unit 10 Chi-Square Test

Structure:

10.1 Introduction
       Objectives
       Relevance
10.2 Chi-Square test
       Characteristics of Chi-Square test
       Steps in solving problems related to Chi-Square test
       Conditions for applying the Chi-Square test
       Restrictions in applying Chi-Square test
       Practical applications of Chi-Square test
       Uses of Chi-Square test
       Degrees of freedom
       Levels of significance
       Interpretation of Chi-Square values
10.3 Applications of Chi-Square Test
       Tests for independence of attributes
       Test of goodness of fit
       Test for comparing variance
10.4 Summary
10.5 Glossary
10.6 Terminal Questions
10.7 Answers
10.8 Case Study

10.1 Introduction

In the previous unit, Testing of Hypothesis, we discussed how to test hypotheses concerning parameters such as the mean and the proportion, using data from either one or two samples. We used one-sample tests to determine whether a mean or a proportion was significantly different from a hypothesised value. In the two-sample tests, we examined the difference between either two means or two proportions, and we tried to learn whether this difference was significant.

Suppose, however, that we have proportions from five populations instead of only two. In such cases, the methods for comparing proportions described for testing hypotheses with two samples do not apply; we must use the Chi-Square test (χ² test). In this unit, we will discuss the Chi-Square tests, which enable us to test whether more than two population proportions can be considered equal. The Chi-Square test is a non-parametric test which can be applied to categorical or qualitative data. It can be applied when we have few or no assumptions about the population parameters.

Actually, Chi-Square tests allow us to do a lot more than just test for the equality of several proportions. If we classify a population into several categories with respect to two attributes (such as age and job performance), we can then use a Chi-Square test to determine whether the two attributes are independent of each other. So, Chi-Square tests can be applied to a contingency table.

Objectives:

After studying this unit, you should be able to:
• describe the non-parametric method of testing hypothesis
• describe the Chi-Square characteristics
• identify the conditions required for applying Chi-Square test for a given population distribution
• recognise the applications of Chi-Square test
• describe the steps in solving problems related to Chi-Square test

10.1.1 Relevance

Case-let

Women still earn less than men

On 27 February 2006 the Women and Work Commission (WWC) published its report on the causes of the “gender pay gap”, or the difference between men’s and women’s hourly pay. According to the report, British women working full-time currently earn 17% less per hour than men. In February the European Commission also brought out its own report on the pay gap across the European Union. Its findings were similar in that, on an hourly basis, women earn 15% less than men for the same work.

In the United States, the difference in median pay between men and women is around 20%. According to the WWC report, the gender pay gap opens early. Boys and girls study different subjects in school, and boys’ subjects lead to more lucrative careers. They then work in different sorts of jobs. As a result, average hourly pay for a woman at the start of her working life is only 91% of a man’s, even though nowadays she is probably better qualified. How do we compile this type of statistical information? We can use Chi-Square testing for more than one type of population.

(Source: Derek L Waller Published by Elsevier Inc Ed 2008).

10.2 Chi-Square test

The Chi-Square test is one of the most commonly used non-parametric tests in statistical work. The Greek letter χ² is used to denote this test. χ² describes the magnitude of discrepancy between the observed and the expected frequencies. The value of χ² is calculated as:

\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \dots + \frac{(O_n - E_n)^2}{E_n} \]

Where, O1, O2, O3, …, On are the observed frequencies and E1, E2, E3, …, En are the corresponding expected or theoretical frequencies.

10.2.1 Characteristics of Chi-Square test

The following are the characteristics of a Chi-Square test (χ² test):

• The χ² test is based on frequencies and not on parameters
• It is a non-parametric test where no rigid assumptions regarding the population parameters are required
• Additive property is also found in the χ² test
• The χ² test is useful to test the hypothesis about the independence of attributes
• The χ² test can be used in complex contingency tables
• The χ² test is very widely used for research purposes in behavioural and social sciences, including business research
• While testing whether the observed frequencies of certain outcomes fit the expected frequencies defined by a theoretical distribution, the χ² value defined here follows the χ² distribution:

\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \]

where, ‘Oi’ is the observed frequency and ‘Ei’ is the expected frequency.
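The formula above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `chi_square_statistic` and the frequencies used are hypothetical and are not part of the unit.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical frequencies, chosen only for illustration.
observed = [18, 22, 20]
expected = [20, 20, 20]
statistic = chi_square_statistic(observed, expected)
print(round(statistic, 4))  # 0.4
```

Each category contributes its squared discrepancy scaled by the expected frequency, so a category that matches its expectation exactly contributes nothing to the total.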

Key Statistic

The observed frequencies are the frequencies obtained from the observation, which are sample frequencies. The expected frequencies are the calculated frequencies.

10.2.2 Steps in solving problems related to Chi-Square test

Figure 10.1 depicts the steps required for solving the problems related to Chi-Square test.

Fig. 10.1: Procedural Steps in Solving Problems on Chi-Square Test

10.2.3 Conditions for applying the Chi-Square test

The following are the conditions for using the Chi-Square test:

1. The frequencies used in Chi-Square test must be absolute and not in relative terms.

2. The total number of observations collected for this test must be large.

3. Each of the observations which make up the sample of this test must be independent of each other.

4. As the χ² test is based wholly on sample data, no assumption is made concerning the population distribution. In other words, it is a non-parametric test.

5. The χ² test is wholly dependent on degrees of freedom. As the degrees of freedom increase, the Chi-Square distribution curve becomes more symmetrical.

6. The expected frequency of any item or cell must not be less than 5. If it is, the frequencies of adjacent items or cells should be pooled together in order to make it 5 or more.

7. The data should be expressed in original units for convenience of comparison and the given distribution should not be replaced by relative frequencies or proportions.

8. This test is used only for drawing inferences through test of the hypothesis, so it cannot be used for estimation of parameter value.

10.2.4 Restrictions in applying Chi-Square test

The sample observations should be independently and normally distributed. For this, either the parent population should be effectively infinite, that is, very large (for example, greater than 50), or sampling should be done with replacement.

Constraints imposed upon the observations must be of linear character, for example,

\[ \sum O_i = \sum E_i \]

The χ² distribution is essentially a continuous distribution; however, its character of continuity is maintained only when the individual frequencies of the variate values remain greater than or equal to 5. So, in applying the χ² test in the testing of the goodness of fit or testing of the dependency of variables in a contingency table, the cell frequency should not be less than 5. In practical problems, we can combine a few values of small frequencies into one to get a pooled frequency of 5 or more.
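The pooling of small frequencies can be sketched as follows. This is one possible approach under stated assumptions, not a prescribed procedure: the helper `pool_small_cells`, the left-to-right merging rule and the frequencies are all hypothetical.

```python
def pool_small_cells(observed, expected, minimum=5):
    """Merge adjacent categories until every pooled expected frequency >= minimum."""
    pooled_obs, pooled_exp = [], []
    acc_o = acc_e = 0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= minimum:
            pooled_obs.append(acc_o)
            pooled_exp.append(acc_e)
            acc_o = acc_e = 0
    # A leftover tail that is still too small is merged into the last pooled group.
    if acc_o or acc_e:
        if pooled_exp:
            pooled_obs[-1] += acc_o
            pooled_exp[-1] += acc_e
        else:
            pooled_obs.append(acc_o)
            pooled_exp.append(acc_e)
    return pooled_obs, pooled_exp

# Hypothetical frequencies; the first two and last two expected values are below 5.
observed = [2, 3, 30, 40, 20, 4, 1]
expected = [1.5, 4.0, 28.0, 41.5, 19.0, 4.5, 1.5]
pooled_observed, pooled_expected = pool_small_cells(observed, expected)
print(pooled_observed)  # [5, 30, 40, 20, 5]
print(pooled_expected)  # [5.5, 28.0, 41.5, 19.0, 6.0]
```

Pooling reduces the number of categories, so the degrees of freedom should be computed from the pooled categories rather than the original ones.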

Key Statistic

The results of Chi-Square test cannot be accurate if the cell frequencies in a contingency table are less than 5.

10.2.5 Practical applications of Chi-Square test

In inferential statistics, the Chi-Square test can also be applied to discrete distributions. In using the Chi-Square test, we need no assumptions regarding the shape of sampling distributions. The applications of the Chi-Square test include testing:

• the significance of sample variances
• the goodness of fit of a theoretical distribution
• the independence in a contingency table
• whether the observed results are consistent with the expected segregations in breeding experiments of genetics

Here, the first is a parametric test and the others are non-parametric tests.

10.2.6 Uses of Chi-Square test

The χ² test is used broadly to:

• Test goodness of fit for one-way classification or for one variable only
• Test independence or interaction for more than one row or column in the form of a contingency table concerning several attributes
• Test population variance ‘σ²’ through confidence intervals suggested by the χ² test

10.2.7 Degrees of freedom

The number of degrees of freedom for ‘n’ observations is ‘n − k’ and is usually denoted by ‘ν’, where ‘k’ is the number of independent linear constraints imposed upon them.

Example 1

Suppose we are asked to write any four numbers. We are free to choose all four numbers. If a restriction is imposed that the sum of these numbers should be 50, then only three numbers can be chosen freely, because the fourth is fixed by the sum. The degrees of freedom would now be 3.

If χ² is defined as the sum of the squares of ‘n’ independent standardised normal variates, and ‘k’ linear constraints are imposed upon them (such as the estimation of some population parameter), then ‘n’ degrees of freedom are replaced by ‘n − k’. If the deviations are taken from the sample mean instead of the population mean, then ‘n’ is replaced by n − 1 = ν. This is because one linear constraint has been imposed.
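The four-numbers example can be checked with a small sketch. The helper name `complete_with_sum` and the three freely chosen numbers are hypothetical; the point is only that the last value is forced once the constraint is known.

```python
def complete_with_sum(free_values, required_total):
    """Given the freely chosen values, return the full list whose sum is required_total."""
    last_value = required_total - sum(free_values)
    return list(free_values) + [last_value]

n, k = 4, 1                     # four numbers, one linear constraint (the sum)
degrees_of_freedom = n - k      # only three numbers can be chosen freely
numbers = complete_with_sum([10, 15, 20], 50)
print(numbers, degrees_of_freedom)  # [10, 15, 20, 5] 3
```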

Key Statistic

The Chi-Square distribution has only one parameter, that is, the degrees of freedom.

10.2.8 Levels of significance

Tables have been prepared for the values of ‘P’, the probability of getting a value of χ² ≥ χ₀², where χ₀² is an observed value. From these tables, we can find the value of ‘P’ corresponding to an observed value of χ² and then proceed to test whether the difference between observed and theoretical frequencies is significant or not. The smaller the value of ‘P’, the greater the divergence between fact and theory, so small values lead us to suspect the hypothesis. Not only do small values of ‘P’ lead us to suspect the hypothesis, but a value of ‘P’ very near to unity may also lead to a similar result. Thus, if P = 1, χ² = 0, showing that there is a perfect agreement between fact and theory, and this is a very improbable event. There are two conventional levels of significance:

• If P < 0.05, we say that the observed value of χ² is significant at the 5% level of significance.

• Similarly, if P < 0.01, the value is significant at the 1% level.
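Where printed tables are not at hand, the value of ‘P’ can be approximated numerically. The sketch below is an assumption-laden illustration, not the method used to build the tables: the function name `chi2_sf` is hypothetical, and it uses a standard power series for the incomplete gamma function, which is adequate for moderate χ² values but loses accuracy when ‘P’ is extremely small.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P of a chi-square value >= x with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Power series for the regularised lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    lower_tail = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower_tail)

# The 5% table values 3.84 (1 d.f.) and 9.49 (4 d.f.) should both give P close to 0.05.
p_df1 = chi2_sf(3.84, 1)
p_df4 = chi2_sf(9.49, 4)
print(round(p_df1, 3), round(p_df4, 3))
```

A calculated χ² is significant at the 5% level when the returned ‘P’ is below 0.05, which is equivalent to the calculated value exceeding the 5% table value.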

10.2.9 Interpretation of Chi-Square values

After ascertaining the χ² value, we compare it with the value in the χ² table. The table comprises columns headed with symbols such as 0.05 for the 5% level of significance and 0.01 for the 1% level of significance. The left-hand side indicates the degrees of freedom. If the calculated value of χ² falls in the acceptance region, the null hypothesis ‘H0’ is accepted; otherwise, it is rejected. Figure 10.2 depicts the acceptance and rejection regions of Chi-Square distribution.

Fig. 10.2: Acceptance and Rejection Regions under Chi-Square Distribution


Key Statistic

The Chi-Square curve will be on the positive side of the x-axis because Chi-Square values are never negative.

10.3 Applications of Chi-Square test

10.3.1 Tests for independence of attributes

In the test for independence, the null hypothesis is that the row and column variables are independent of each other. We have studied earlier, that the hypothesis testing is done under the assumption that the null hypothesis is true.

The following are the properties of the test for independence:

• The data are the observed frequencies
• The data is arranged in the form of a contingency table
• The degrees of freedom ‘ν’ can be calculated as:

\[ \nu = (\text{Number of rows} - 1)(\text{Number of columns} - 1) \]

where, ‘ν’ is the degrees of freedom

• The test for independence has a Chi-Square distribution and is always a right-tail test.
• The expected value is computed by taking the row total, multiplying it with the column total and dividing by the grand total. That is given by:

\[ E = \frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}} \]

• The test statistic value does not change if the order of the rows or columns is interchanged. Also, the value does not change even if the rows and columns are interchanged.
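The rules for the expected value and the degrees of freedom can be sketched as below. The function names and the 2 x 3 table are hypothetical and serve only to illustrate the two formulas.

```python
def expected_frequencies(table):
    """Expected cell value = row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

def degrees_of_freedom(table):
    """(number of rows - 1) x (number of columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

# Hypothetical 2 x 3 contingency table of observed frequencies.
table = [[10, 20, 30],
         [20, 20, 20]]
expected = expected_frequencies(table)
df = degrees_of_freedom(table)
print(expected)  # [[15.0, 20.0, 25.0], [15.0, 20.0, 25.0]]
print(df)        # 2
```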

Solved Problem 1

Calculate the degrees of freedom for a contingency table with three rows and two columns.

Solution: The degrees of freedom denoted by ‘ν’ is calculated as:

\[ \nu = (\text{Number of rows} - 1)(\text{Number of columns} - 1) = (3 - 1)(2 - 1) = 2 \]

Hence, a contingency table with three rows and two columns has two degrees of freedom.

Solved Problem 2

Table 10.1 depicts the production in three shifts and the number of defective goods that turned out in three weeks. Test at 5% level of significance whether weeks and shifts are independent.

Table 10.1: Production of Defective Goods in Three Shifts

Shift     Week 1   Week 2   Week 3   Total
I           15        5       20       40
II          20       10       20       50
III         25       15       20       60
Total       60       30       60      150

Solution: Table 10.1a depicts the observed and expected values required to calculate χ².

Table 10.1a: Observed and Expected Values

Observed Value   Expected Value                                   (Oi – Ei)²   (Oi – Ei)² / Ei
Oi               Ei = (Row Total x Column Total) / Grand Total
15               (40 x 60) / 150 = 16                                  1          0.0625
20               (50 x 60) / 150 = 20                                  0          0.0000
25               (60 x 60) / 150 = 24                                  1          0.0417
5                (40 x 30) / 150 = 8                                   9          1.1250
10               (50 x 30) / 150 = 10                                  0          0.0000
15               (60 x 30) / 150 = 12                                  9          0.7500
20               (40 x 60) / 150 = 16                                 16          1.0000
20               (50 x 60) / 150 = 20                                  0          0.0000
20               (60 x 60) / 150 = 24                                 16          0.6667
                                                                      χ²cal =     3.6459

The steps to calculate χ² are described as follows:

1. Null hypothesis ‘H0’: The weeks and shifts are independent
   Alternate hypothesis ‘H1’: The weeks and shifts are dependent

2. Level of significance is 5% and degrees of freedom d.f. = (3 – 1) (3 – 1) = 4
   χ²tab = 9.49

3. Test statistic

\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \]

   χ²cal = 3.6459

4. Conclusion: Since χ²cal (3.6459) < χ²tab (9.49), ‘H0’ is accepted. Hence, the attributes ‘week’ and ‘shifts’ are independent.
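The calculation for Solved Problem 2 can be reproduced with a short sketch. The function name `chi_square_independence` is hypothetical; the data are those of Table 10.1 and the table value 9.49 is the one quoted in the solution. The unrounded statistic is about 3.6458, which differs from 3.6459 only because the tabulated terms were rounded.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Rows are shifts I, II, III; columns are weeks 1, 2, 3 (Table 10.1).
defects = [[15, 5, 20],
           [20, 10, 20],
           [25, 15, 20]]
chi_cal, df = chi_square_independence(defects)
chi_tab = 9.49  # 5% table value for 4 degrees of freedom, as quoted in the solution
h0_accepted = chi_cal < chi_tab
print(round(chi_cal, 4), df, h0_accepted)

# Interchanging rows and columns leaves the statistic unchanged.
chi_transposed, _ = chi_square_independence(list(zip(*defects)))
```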

Solved Problem 3

Out of 1000 people surveyed, 600 belonged to urban areas and rest to rural areas. Among 500 who visited other states, 400 belonged to urban areas. Test at 5% level of significance whether area and visiting other states are dependent.

Solution: Table 10.2 depicts the information given in Solved Problem 3 in a tabulated form.

Table 10.2: People Belonging to Urban and Rural Areas

Other States   Urban   Rural   Total
Visited         400     100     500
Not Visited     200     300     500
Total           600     400    1000

Table 10.2a depicts the observed and expected values for the calculation of χ².

Table 10.2a: Observed and Expected Values

Observed Value   Expected Value                                   (Oi – Ei)²   (Oi – Ei)² / Ei
Oi               Ei = (Row Total x Column Total) / Grand Total
400              300                                                10000         33.33
200              300                                                10000         33.33
100              200                                                10000         50.00
300              200                                                10000         50.00
                                                                      χ²cal =    166.66

The steps for calculation of Chi-Square are described as follows:

1. Null hypothesis ‘H0’: Area and visit are independent.
   Alternate hypothesis ‘H1’: They are dependent.

2. Level of significance is 5% and degrees of freedom d.f. = (2 – 1) (2 – 1) = 1
   χ²tab = 3.84

3. Test statistic

\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \]

   χ²cal = 166.66

4. Conclusion: Since χ²cal (166.66) > χ²tab (3.84), ‘H0’ is rejected. Hence, the ‘area’ and ‘visit’ are dependent.

10.3.2 Test of goodness of fit

The test of goodness of fit of a statistical model measures how well the model fits a set of observations. This test measures and summarises the differences, if any, between the observed values and the values expected under the considered statistical model. These test results are helpful to know whether the samples are drawn from identical distributions or not. For a test against a uniform distribution over ‘n’ categories, the degrees of freedom are ‘n − 1’ and the expected value for each category is equal to the average of the observed values.

Solved Problem 4

A personnel manager is interested in trying to determine whether absenteeism is greater on one day of the week than on another day of the week. The record for the past years is available. Table 10.3 depicts the absenteeism for each working day over a week. Test whether absenteeism is uniformly distributed over the week.

Table 10.3: Comparison of Data about Absenteeism

Days of Week          Monday   Tuesday   Wednesday   Thursday   Friday
Number of absentees     66       57         54          48        75

Solution: If the absenteeism is uniformly distributed over the week, then the expected number of absentees per day is given by:

\[ E_i = \frac{66 + 57 + 54 + 48 + 75}{5} = 60 \]

Table 10.3a depicts the calculated expected values required for calculation of χ² for the data related to Solved Problem 4.

Table 10.3a: Observed and Expected Values for Calculation of χ²

Observed Value Oi   Expected Value Ei   (Oi – Ei)²   (Oi – Ei)² / Ei
66                  60                      36          0.6000
57                  60                       9          0.1500
54                  60                      36          0.6000
48                  60                     144          2.4000
75                  60                     225          3.7500
                                         χ²cal =        7.5000

The steps for calculation of Chi-Square are described as follows:

1. Null hypothesis ‘H0’: The observed frequencies fit the uniform distribution.
   Alternate hypothesis ‘H1’: The observed frequencies do not fit the uniform distribution.

2. Level of significance is 5% and degrees of freedom (d.f.) = (5 – 1) = 4
   χ²tab = 9.49

3. Test statistic

\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \]

   χ²cal = 7.50

4. Conclusion: Since χ²cal (7.5) < χ²tab (9.49), ‘H0’ is accepted. In other words, we conclude at 5% level of significance that absenteeism is uniformly distributed and is independent of the days of the week.
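The arithmetic of Solved Problem 4 can be checked with the following sketch. The function name `uniform_goodness_of_fit` is hypothetical; the absentee counts come from Table 10.3 and the table value 9.49 is the one quoted in the solution.

```python
def uniform_goodness_of_fit(observed):
    """Chi-square statistic and degrees of freedom against a uniform distribution."""
    expected = sum(observed) / len(observed)   # same expected value for every category
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, len(observed) - 1

absentees = [66, 57, 54, 48, 75]   # Monday to Friday
chi_cal, df = uniform_goodness_of_fit(absentees)
chi_tab = 9.49                     # 5% table value for 4 degrees of freedom
h0_accepted = chi_cal < chi_tab
print(round(chi_cal, 2), df, h0_accepted)
```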

Solved Problem 5

According to a theory in Genetics, the proportion of beans of A, B, C and D types in a generation should be 9:3:3:1. In an experiment with 1600 beans, the frequency of beans of A, B, C and D types was observed to be 882, 313, 287 and 118 respectively. Does the result support the theory?

Solution: The steps for calculation of Chi-Square are described as follows:

1. Null hypothesis ‘H0’: The result supports the theory
   Alternate hypothesis ‘H1’: The result does not support the theory

2. Level of significance is 5% and degrees of freedom (d.f.) = (4 – 1) = 3
   χ²tab = 7.81

3. Test statistic

\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \]

Table 10.4 depicts the observed and expected values for calculation of χ² for Solved Problem 5.

Table 10.4: Observed and Expected Values for Calculation of χ²

Observed Value Oi   Expected Value Ei         (Oi – Ei)²   (Oi – Ei)² / Ei
882                 (1600 x 9) / 16 = 900        324          0.36
313                 (1600 x 3) / 16 = 300        169          0.56
287                 (1600 x 3) / 16 = 300        169          0.56
118                 (1600 x 1) / 16 = 100        324          3.24
                                               χ²cal =        4.72

   χ²cal = 4.72

4. Conclusion: Since χ²cal (4.72) < χ²tab (7.81), ‘H0’ is accepted. Therefore, the result supports the theory.
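When the theory specifies a ratio rather than equal frequencies, the expected values are the total split in that ratio. The sketch below checks Solved Problem 5 on that basis; the function names are hypothetical, while the counts, the 9:3:3:1 ratio and the table value 7.81 come from the problem. The unrounded statistic is about 4.73, against 4.72 from the rounded terms in Table 10.4.

```python
def expected_from_ratio(total, ratio):
    """Split the total into expected frequencies proportional to the given ratio."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

beans = [882, 313, 287, 118]                       # types A, B, C, D
expected = expected_from_ratio(sum(beans), [9, 3, 3, 1])
chi_cal = chi_square_statistic(beans, expected)
df = len(beans) - 1
chi_tab = 7.81                                     # 5% table value for 3 degrees of freedom
h0_accepted = chi_cal < chi_tab
print(expected)  # [900.0, 300.0, 300.0, 100.0]
print(round(chi_cal, 2), df, h0_accepted)
```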

Solved Problem 6

The following table gives the classification of 100 workers according to gender and the nature of work. Test whether nature of work is independent of the gender of the worker.

Table 10.5

           Skilled   Unskilled   Total
Males        40         20         60
Females      10         30         40
Total        50         50        100

The steps for calculation of Chi-Square are described as follows:

1. Null hypothesis ‘H0’: The nature of work is independent of the gender of the worker
   Alternate hypothesis ‘H1’: The nature of work depends on the gender of the worker

2. Level of significance is 5% and degrees of freedom (d.f.) = (r – 1)(c – 1) = (2 – 1)(2 – 1) = 1
   χ²tab = 3.84

3. Test statistic

\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \]

Table 10.5a depicts the observed and expected values for calculation of χ² for Solved Problem 6.

Table 10.5a: Observed and Expected Values for Calculation of χ²

Observed Value Oi   Expected Value Ei   (Oi – Ei)²   (Oi – Ei)² / Ei
40                  30                     100          3.333
10                  20                     100          5.000
20                  30                     100          3.333
30                  20                     100          5.000
                                         χ²cal =       16.666

   χ²cal = 16.666

4. Conclusion: Since χ²cal (16.666) > χ²tab (3.84), ‘H0’ is rejected. Therefore, gender and nature of work are not independent.

10.3.3 Test for comparing variance

When we have to use χ² as a test of population variance, the hypotheses are:

\[ H_0: \sigma_s^2 = \sigma_p^2 \quad \text{and} \quad H_A: \sigma_s^2 > \sigma_p^2 \]

and the test statistic is:

\[ \chi^2 = \frac{\sigma_s^2}{\sigma_p^2}(n - 1) \]

Where σs² = variance of the sample
σp² = variance of the population
(n − 1) = degrees of freedom, n being the number of items in the sample.

Then by comparing the calculated value with the table value of χ² for (n − 1) degrees of freedom at a given level of significance, we may either accept or reject the null hypothesis. If the calculated value of χ² is less than the table value, the null hypothesis is accepted, but if the calculated value is equal to or greater than the table value, the hypothesis is rejected.
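The variance test can be sketched as follows. Everything specific here is an assumption made for illustration: the sample values, the hypothesised population variance of 8 and the function name are hypothetical, and the sample variance is computed with the (n − 1) divisor, which the unit does not state explicitly.

```python
import statistics

def variance_test_statistic(sample, population_variance):
    """Return (chi-square statistic, degrees of freedom) for a test of variance."""
    n = len(sample)
    sample_variance = statistics.variance(sample)  # divides by n - 1 (assumption)
    chi_square = (n - 1) * sample_variance / population_variance
    return chi_square, n - 1

sample = [10, 12, 14, 16, 18]       # hypothetical sample, variance 10
hypothesised_variance = 8.0         # hypothetical population variance
chi_cal, df = variance_test_statistic(sample, hypothesised_variance)
chi_tab = 9.49                      # 5% table value for 4 degrees of freedom
h0_rejected = chi_cal >= chi_tab
print(chi_cal, df, h0_rejected)     # 5.0 4 False
```

With these illustrative numbers the calculated value is below the table value, so the null hypothesis would be accepted.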

Self Assessment Questions

1. χ² test is a ________ test.

2. A table with 4 rows and 2 columns has the degrees of freedom of ________.

3. χ² test is wholly based on ________ data.

4. If there are four rows and five columns in classification for χ² test, then the number of degrees of freedom is equal to ________.

5. If the calculated χ² value is less than the tabulated χ² value, then the null hypothesis is ________.

Activity

Objective Questions:

1. What is the appropriate test to use if you want to determine whether there is evidence that the proportion of successes is higher in group 1 than in group 2 and we have obtained independent samples from the two groups?
   i) The Z test
   ii) The Chi-Square test
   iii) Both of the above
   iv) None of the above

2. Which of the following values cannot occur in a Chi-Square distribution?
   i) 100.0
   ii) 38.4
   iii) 0.61
   iv) -2.45

3. What test would you use to determine whether a set of observed frequencies differ from their corresponding expected frequencies?
   i) The t test for dependent samples
   ii) The Chi-Square test
   iii) The t test for independent samples
   iv) The F test

4. When using the Chi-Square test for differences in two proportions with a contingency table that has r rows and c columns, how many degrees of freedom will the test statistic have?
   i) n – 1
   ii) n1 + n2 – 2
   iii) (r – 1) x (c – 1)
   iv) (r – 1) + (c – 1)

5. When testing for independence in a contingency table with 3 rows and 4 columns, how many degrees of freedom will the test statistic have?
   i) 5
   ii) 6
   iii) 7
   iv) 12

6. Which of the following is true about the Chi-Square distribution?
   i) It is a skewed distribution
   ii) Its shape depends on the number of degrees of freedom
   iii) As the degrees of freedom increase, the Chi-Square distribution becomes more symmetrical
   iv) All of the above

7. What other name is used for a contingency table?
   i) A cross-classification table
   ii) An ANOVA table
   iii) A histogram
   iv) None of the above

Solutions to Objective Questions

1. i) The Z test
2. iv) -2.45
3. ii) The Chi-Square test
4. iii) (r – 1) x (c – 1)
5. ii) 6
6. iv) All of the above
7. i) A cross-classification table

10.4 Summary

Let us recapitulate the important concepts discussed in this unit:

• Chi-Square test is a non-parametric test. The important applications of Chi-Square test are the tests for independence of attributes, the test of goodness of fit and the test for specified variance.

• χ² describes the magnitude of discrepancy between the observed and the expected frequencies. The value of χ² is calculated as:

\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \dots + \frac{(O_n - E_n)^2}{E_n} \]

Where, O1, O2, O3, …, On are the observed frequencies and E1, E2, E3, …, En are the corresponding expected or theoretical frequencies.

• An important criterion for applying the Chi-Square test is that the sample size should be very large.

10.5 Glossary

Chi-Square test: It is a non-parametric test where no rigid assumptions regarding the population parameters are required.

Level of significance: The probability of rejecting the null hypothesis when it is true (type I error), fixed in advance, usually at a number such as 0.05 (5%). It is compared with the P-value of the test, which is the chance of getting a sample like the one being analysed if the null hypothesis were true. If the P-value is less than the level of significance, the null hypothesis is rejected in favour of the alternative. A small P-value implies that getting such a sample was highly unlikely, suggesting that the null hypothesis is probably not true.

10.6 Terminal Questions

1. Two groups of 400 items each (of a material) were given treatments ‘x’ and ‘y’ respectively to enhance the strength of the material. 80 gained strength by treatment ‘x’ and 20 gained strength by treatment ‘y’. Does the gain in strength depend on the treatment?

2. The demand for a particular spare part was found to vary from day to day. Table 10.6 depicts the information obtained in a sample study. Test the hypothesis that the number demanded depends upon the day.

Table 10.6: Spare Part Demand from Monday to Saturday

Days                Mon    Tue    Wed    Thur   Fri    Sat
Quantity Demanded   1124   1125   1110   1120   1126   1115

3. In a survey of 200 boys, of whom 75 were intelligent, 40 of the intelligent boys had skilled fathers, while 85 of the unintelligent boys had unskilled fathers. Can we say on the basis of the information that skilled fathers had intelligent boys?

4. The number of car accidents per month in a town was as follows: 6, 9, 4, 12, 8, 20, 14, 15, 2, and 10. Test the hypothesis that the number of accidents is the same every month.

5. In a particular industry the postgraduates, graduates and undergraduates are in the ratio 2:3:5. A firm belonging to the industry had 400, 550 and 1050 postgraduates, graduates and undergraduates on its pay-roll. Do they follow the earlier observation about the industry?

6. Three hundred digits were chosen at random from a set of tables. The frequencies of the digits were as follows:

Digits      0    1    2    3    4    5    6    7    8    9
Frequency   28   29   33   31   26   35   32   30   31   25

Using the Chi-Square test, assess the hypothesis that the digits were distributed in equal numbers in the table.

10.7 Answers

Self Assessment Questions

1. Non-parametric
2. 3
3. Sample
4. 12
5. Not rejected

Terminal Questions

1. χ²cal = 41.142, H0 rejected
2. χ²cal = 0.179, H0 accepted
3. χ²cal = 8.888, H0 rejected
4. χ²cal = 26.6, H0 rejected
5. χ²cal = 6.6667, H0 rejected
6. χ²cal = 2.864, H0 accepted

10.8 Case Study

Automobile Preference

A market research firm in an Asian country made a survey to see if there was any correlation between a person’s nationality and their preference in the make of automobile they purchased. Table 10.7 depicts the sample information obtained.

Table 10.7: Types of Automobile Purchased in Various Countries

                Pakistan   China   India   Sri Lanka   Nepal
Maruti Suzuki      40        28      30       25         50
Opel               32        35      29       39         35
Lancer             24        40      27       28         29
Ford               40        20      40       26         40
Fiat               26        10      35       35         46

Discussion Questions:

i. Indicate the appropriate null and alternative hypotheses to test whether the make of automobile purchased is dependent on an individual’s nationality.

ii. Using the critical value approach of the Chi-Square test at a 1% significance level, does it appear that there is a relationship between automobile purchase and nationality?

iii. Verify the result of Question ii by using the p-value approach of the Chi-Square test.

iv. What has to be the significance level in order that there appears a breakeven situation between dependency of nationality and automobile preference?

v. What is your comment about the results?
