final a092 scheme editted

CONFIDENTIAL SQQS2013

UNIVERSITI UTARA MALAYSIA

FINAL EXAMINATION SECOND SEMESTER 2009/2010 SESSION

CODE / COURSE NAME : SQQS2013 / APPLIED STATISTICS

DATE : 1st MAY 2010

TIME : 2.30 PM – 5.00 PM (2 ½ HOUR)

VENUE : DMS, TE, KYM, IKIP, PMI, NEGERI, KIA, KTB

INSTRUCTIONS:

1. This book script contains FOUR (4) questions in TWENTY (20) printed pages excluding the cover page.

2. List of formulae and distributions are provided on pages THIRTEEN (13) until TWENTY (20).

3. Answer ALL questions in the SPACE provided.

4. Show all the calculation (if relevant), and use FOUR (4) decimal places in your calculation.

MATRIC NO. :_______________________________

(in words) (in numbers)

IDENTITY CARD NO. :

LECTURER : _____________________________

GROUP : TABLE NO. :

PLEASE DO NOT OPEN THIS QUESTION BOOKLET

UNTIL FURTHER INSTRUCTION IS GIVEN

CONFIDENTIAL


QUESTION 1 (25 MARKS)

a) Tick the correct answer.

i) A numerical quantity that is computed from sample data and used in deciding whether or not to reject the null hypothesis is referred to as:

significance level
critical value
test statistic
parameter

(1 mark)

ii) In developing an 87.4% confidence interval estimate for a population mean, the value of z to use is:

1.15
0.32
1.53
0.16

(1 mark)

iii) Given a significance level of 4.4%, the critical value for testing that the proportion in population A is different from that in population B is:

2.12
1.82
2.00
1.96

(1 mark)

iv) When the p-value is found to be equal to 0.076, the result at the 0.05 significance level is:

reject H0
fail to reject H0

(1 mark)

v) In Levene’s test of equality of variance, we conclude that the variances of the two populations can be assumed equal when we:

reject H0
fail to reject H0

(1 mark)

vi) The manager of a cyber café claims that the mean daily revenue was $700 with a standard deviation of $70. A sample of 32 days reveals a mean daily revenue of $620. The test we would use is:

z-test
t-test

(1 mark)

vii) To determine whether the mean test score of English students, μE, is higher than the mean test score of American students, μA, the alternative hypothesis is:

H1: μE < μA
H1: μE ≤ μA
H1: μE > μA
H1: μE ≥ μA

(1 mark)

viii) For a left-tailed test of the difference between two means of independent populations, the alternative hypothesis for Levene’s test of equality of variance is:

H1: σ1² = σ2²
H1: σ1² ≠ σ2²
H1: σ1² < σ2²
H1: σ1² ≤ σ2²


(1 mark)

ix) If the lower limit of a confidence interval is , what is the upper limit for this interval?

320
340
380
400

(1 mark)

x) If there are two unbiased estimators, the one whose variance is smaller is said to be relatively efficient.

True
False

(1 mark)
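The z-values asked for in items (ii) and (iii) can be checked numerically. The sketch below is only an illustration using Python's standard-library `statistics.NormalDist`; the helper name `two_sided_z` is hypothetical, and item (iii) is assumed to be a two-tailed test because the question says "different".

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def two_sided_z(alpha):
    """Return z such that P(-z < Z < z) = 1 - alpha (hypothetical helper)."""
    return std_normal.inv_cdf(1 - alpha / 2)


# Item (ii): 87.4% confidence level, so alpha = 1 - 0.874.
z_ci = two_sided_z(1 - 0.874)

# Item (iii): 4.4% significance level, assumed two-tailed.
z_test = two_sided_z(0.044)

print(round(z_ci, 2))    # compare with the options listed in (ii)
print(round(z_test, 2))  # compare with the closest option listed in (iii)
```

Printed tables round to two decimals, so the value for (iii) should be matched to the nearest listed option rather than expected to agree exactly.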

b) Mary, the owner of two laundry shops (Perfect Laundry and Best Laundry), would like to determine the number of complaints arising from dissatisfaction with her laundry services. Customer satisfaction is the key to success. The owner has therefore decided that if the number of complaints is at most 5 per week, the services provided by her laundry shops are considered a success. The numbers of complaints over 45 weeks for her two laundry shops are given in Table 1.

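Table 1

Perfect Laundry

Week | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
Number of complaints | 4 | 2 | 0 | 7 | 6 | 0 | 6 | 4 | 2 | 6 | 5 | 2 | 5 | 1 | 0

Week | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
Number of complaints | 0 | 3 | 2 | 6 | 5 | 1 | 0 | 1 | 0 | 4 | 1 | 6 | 0 | 4 | 6

Week | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45
Number of complaints | 0 | 0 | 4 | 5 | 5 | 7 | 5 | 0 | 5 | 5 | 4 | 5 | 6 | 2 | 1

Best Laundry

Week | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
Number of complaints | 8 | 2 | 1 | 4 | 6 | 7 | 6 | 0 | 7 | 2 | 1 | 3 | 5 | 2 | 0

Week | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
Number of complaints | 0 | 2 | 2 | 6 | 5 | 6 | 4 | 6 | 6 | 2 | 4 | 6 | 5 | 3 | 3

Week | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45
Number of complaints | 0 | 2 | 8 | 6 | 6 | 7 | 5 | 3 | 7 | 8 | 0 | 5 | 6 | 3 | 7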

i) Construct a 90% confidence interval for the difference in the proportion of weeks with failure status between Perfect Laundry and Best Laundry.

(5 marks)

-1m

-1m

-1m for 1.6449


-1m
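The working for (i) did not survive in this transcript. As a rough sketch only, the interval can be reproduced from Table 1, assuming a "failure" week is one with more than 5 complaints and using the usual large-sample interval for a difference of two proportions. The variable and function names are illustrative, not taken from the scheme.

```python
from math import sqrt
from statistics import NormalDist

# Weekly complaint counts copied from Table 1.
perfect = [4, 2, 0, 7, 6, 0, 6, 4, 2, 6, 5, 2, 5, 1, 0,
           0, 3, 2, 6, 5, 1, 0, 1, 0, 4, 1, 6, 0, 4, 6,
           0, 0, 4, 5, 5, 7, 5, 0, 5, 5, 4, 5, 6, 2, 1]
best = [8, 2, 1, 4, 6, 7, 6, 0, 7, 2, 1, 3, 5, 2, 0,
        0, 2, 2, 6, 5, 6, 4, 6, 6, 2, 4, 6, 5, 3, 3,
        0, 2, 8, 6, 6, 7, 5, 3, 7, 8, 0, 5, 6, 3, 7]


def failure_proportion(complaints, limit=5):
    """Share of weeks with more than `limit` complaints (assumed failure rule)."""
    return sum(c > limit for c in complaints) / len(complaints)


p1 = failure_proportion(perfect)
p2 = failure_proportion(best)
n1, n2 = len(perfect), len(best)

z = NormalDist().inv_cdf(0.95)  # about 1.6449 for a 90% interval
standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

lower = (p1 - p2) - z * standard_error
upper = (p1 - p2) + z * standard_error
print(round(lower, 4), round(upper, 4))
```

Under these assumptions both limits are negative, which is consistent with the scheme's remark in (ii) that 0 is not in the interval.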

ii) Based on your answer in (i), does Mary have any significant evidence to conclude that there is no difference in the proportion of weeks with failure status between her two laundry shops? Give your reason.

(2 marks)

- 0 is not in the interval. -1m

- At the 90% confidence level, there is not enough evidence to conclude that there is no difference in the proportion of weeks with failure status between the two laundries. -1m

iii) For nearly 10 years, Perfect Laundry maintained a good service record, with failure in at most 10% of every 45 weeks of operation. Can we conclude that Perfect Laundry’s achievement remains the same now? Carry out an appropriate test at the 5% significance level.

(6 marks)

versus H0 : p -1m

-2m

-1m

Reject H0 -1m

Perfect Laundry’s achievement has changed. -1m


iv) If the significance level changes to 1%, is there any change in your conclusion in (iii)? Give your reason.

(2 marks)

, Fail to Reject H0 -1m

The conclusion in (iii) has changed at 1% level of significance -1m
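The test statistic for (iii) and (iv) is missing from this transcript. A minimal sketch of a one-proportion z-test follows. It assumes 9 failure weeks out of 45 for Perfect Laundry (weeks with more than 5 complaints in Table 1) and a right-tailed alternative H1: p > 0.10; both are assumptions, not the scheme's confirmed working.

```python
from math import sqrt
from statistics import NormalDist

failures, n = 9, 45   # assumed count of failure weeks from Table 1
p0 = 0.10             # hypothesised maximum failure proportion
p_hat = failures / n

# z statistic for a single proportion under H0: p = p0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Right-tailed p-value (assumed direction of the alternative).
p_value = 1 - NormalDist().cdf(z)

for alpha in (0.05, 0.01):
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"alpha = {alpha}: z = {z:.4f}, p-value = {p_value:.4f}, {decision}")
```

With these inputs the decision is to reject H0 at 5% and fail to reject at 1%, which agrees with the scheme. A two-tailed version doubles the p-value and leads to the same two decisions.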


QUESTION 2 (25 MARKS)

a) Choose the correct answer.

i) Analysis of variance is used to

compare nominal data.

compute t test.

compare population proportion.

simultaneously compare several population means.

(1 mark)

ii) In ANOVA, F statistic is used to test a null hypothesis such as:

(1 mark)

iii) If an ANOVA test is conducted and the null hypothesis is rejected, what does this indicate?

Too many degrees of freedom

No difference between the population means

Difference between at least one pair of population means

None of the above

(1 mark)

b) A study was conducted to compare the final scores obtained by students from 5 different schools in four different subjects. The researcher wanted to show that schools have an effect on the scores. He believed that the subjects have an effect on the scores too. The following data represent the final scores obtained by randomly selected students from 5 different schools in Mathematics, English, Science and Biology.

School \ Subject | Mathematics | English | Science | Biology
1 | 68 | 57 | 73 | 61
2 | 83 | 94 | 91 | 86
3 | 72 | 81 | 63 | 59
4 | 55 | 73 | 77 | 66
5 | 92 | 68 | 75 | 87

i) Based on the data, complete the analysis of variance table.

(10 marks)

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F
Treatment | 1618.7 (2m) | 4 | 404.675 (1m) | 4.3666 (1m)
Block | 42.15 (2m) | 3 | 14.05 (1m) |
Error | 1112.1 (1m) | 12 | 92.675 (1m) |
Total | 2772.95 | 19 | |

(1m) and - school

and - subject

,

, ,
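The sums of squares in the table can be reproduced from the score data with the standard randomized block formulas, treating schools as treatments and subjects as blocks. The sketch below is illustrative only; the critical value 3.2592 for F(0.05; 4, 12) is taken from a standard F table and does not appear in this transcript.

```python
# Final scores from the question: rows are schools 1-5,
# columns are Mathematics, English, Science, Biology.
scores = [
    [68, 57, 73, 61],
    [83, 94, 91, 86],
    [72, 81, 63, 59],
    [55, 73, 77, 66],
    [92, 68, 75, 87],
]
n_schools = len(scores)
n_subjects = len(scores[0])
n_total = n_schools * n_subjects

grand_total = sum(sum(row) for row in scores)
correction = grand_total ** 2 / n_total  # correction factor T^2 / N

ss_total = sum(x ** 2 for row in scores for x in row) - correction
ss_treatment = sum(sum(row) ** 2 for row in scores) / n_subjects - correction
column_totals = [sum(row[j] for row in scores) for j in range(n_subjects)]
ss_block = sum(t ** 2 for t in column_totals) / n_schools - correction
ss_error = ss_total - ss_treatment - ss_block

df_treatment = n_schools - 1
df_block = n_subjects - 1
df_error = df_treatment * df_block

ms_treatment = ss_treatment / df_treatment
ms_block = ss_block / df_block
ms_error = ss_error / df_error
f_treatment = ms_treatment / ms_error

F_CRITICAL = 3.2592  # assumed table value for F(0.05; 4, 12)
print(round(ss_treatment, 2), round(ss_block, 2), round(ss_error, 2))
print(round(f_treatment, 4), f_treatment > F_CRITICAL)
```

Since the computed F for schools exceeds the assumed critical value, the sketch agrees with the scheme's decision to reject the null hypothesis for the school effect.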

ii) Use a 0.05 level of significance to test the researcher’s interest. (4 marks)

Reject

We conclude that the school has an effect on the scores. -1m

c) A study on rental rates in four cities has been done. Based on OUTPUT 2.1, what is your conclusion on the rental rates between the four cities at ?


OUTPUT 2.1

Rental per month for two-bedded apartments

Source | Sum of Squares | df | Mean Square | F | Sig.
Between Groups | 44947.000 | 3 | 14982.333 | 3.802 | .013
Within Groups | 378299.040 | 96 | 3940.615 | |
Total | 423246.040 | 99 | | |

(4 marks)

We conclude that the rental rates differ between the cities. -1m

d) A researcher in a manufacturing company conducted a study on the effect of incentives given by the company on the workers’ productivity. To reduce the error of the experiment, the workers’ commitment was also considered in the study. The collected data have been analyzed using SPSS and the output is shown below.

OUTPUT 2.2

Tests of Between-Subjects Effects

Dependent Variable: productivity

Source | Type III Sum of Squares | df | Mean Square | F | Sig.
Model | 543.222(a) | 5 | 108.644 | 115.035 | .000
Incentive | 27.556 | 2 | 13.772 | B | .015
Commitment | D | 2 | C | 24.482 | .006
Error | 3.776 | A | .944 | |
Total | 547.000 | 9 | | |

a R Squared = .993 (Adjusted R Squared = .984)

Based on the OUTPUT 2.2, find the value of A, B, C and D.
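The answers for A to D are blank in this transcript. One way to recover them, sketched here under the assumption that the printed figures are read as Model, Incentive, Commitment, Error and Total rows, is to use the identities MS = SS / df and F = MS / MS(Error). The numbers are the printed (rounded) SPSS values, so results are approximate.

```python
# Printed values from OUTPUT 2.2 (rounded as shown in the output).
ms_incentive = 13.772
ss_error, ms_error = 3.776, 0.944
f_commitment = 24.482
df_commitment = 2
df_total, df_model = 9, 5

# A: error degrees of freedom, from SS / MS and from total minus model df.
A = round(ss_error / ms_error)
assert A == df_total - df_model  # both routes should agree

# B: F statistic for Incentive.
B = ms_incentive / ms_error

# C: mean square for Commitment, from its F statistic.
C = f_commitment * ms_error

# D: sum of squares for Commitment.
D = C * df_commitment

print(A, round(B, 3), round(C, 3), round(D, 3))
```

Small differences in the last decimal place are expected, because the output itself rounds every entry to three decimals.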

A =

B =

C =

D =

(4 marks)

QUESTION 3 (25 MARKS)

a) Answer the following questions.


i) Give one measurement scale that can be analyzed using the Chi-square test. (1 mark)

Nominal or Ordinal

ii) One guideline to ensure a good approximation to the Chi-square distribution is that the expected frequency for the ith category is at least 5 (Ei ≥ 5). If this is not possible, what would be a possible solution? (1 mark)

Combine rows or columns, or increase the sample size.

b) A social worker believes that the age distribution of regular users of marijuana in a certain population is as follows: below 21, 30%; 21 – 30, 60%; 31 – 40, 8%; and over 40, 2% of the total population. A random sample of 300 drawn from the population yielded the age breakdown shown in Table 3.1. Do these data provide sufficient evidence to support the social worker’s belief at 5% level of significance?

Table 3.1

Age (years) | Number
Below 21 | 96
21 – 30 | 171
31 – 40 | 22
Over 40 | 11

(7 marks)

H0: The age distribution of regular users of marijuana follows the social worker’s belief.

H1: The age distribution of regular users of marijuana differs from the social worker’s belief. -1m

The observed and expected frequencies are shown in the table below, where E = np.

 | Below 21 | 21 – 30 | 31 – 40 | Over 40 | Total
Observed | 96 | 171 | 22 | 11 | 300
Expected | 90 | 180 | 24 | 6 | 300

Correct expected value -1m

χ² = Σ (O − E)²/E = 0.4 + 0.45 + 0.1667 + 4.1667 = 5.1833 -2m

The critical value: χ²(0.05, 3) = 7.8147 -1m

Fail to reject H0 -1m

There is insufficient evidence at the 0.05 level of significance to show that the age distribution of regular users of marijuana differs from the social worker’s belief. -1m

We assume that the social worker’s belief is true.
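The goodness-of-fit calculation above can be checked with a few lines of code. This is a sketch only: the expected counts come from E = np with the percentages stated in the question, and the critical value 7.8147 is the one quoted in the scheme.

```python
observed = [96, 171, 22, 11]               # Table 3.1
believed_shares = [0.30, 0.60, 0.08, 0.02]  # social worker's belief
n = sum(observed)

# Expected frequency for each age group: E = n * p.
expected = [n * p for p in believed_shares]

# Chi-square statistic: sum of (O - E)^2 / E over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_VALUE = 7.8147  # quoted in the scheme for alpha = 0.05, df = 3
decision = "reject H0" if chi_square > CRITICAL_VALUE else "fail to reject H0"
print(round(chi_square, 4), decision)
```

The degrees of freedom are 4 − 1 = 3 because no parameter is estimated from the sample.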


c) A graduate student in psychology recorded the number of people contributing to a solicitor for a charity organization stationed in a shopping mall during the Christmas season. The numbers of people contributing during five-minute time intervals were counted. The results are shown in Table 3.2.

Table 3.2

Number of contributors | 0 | 1 | 2 | 3 | 4 | 5 | ≥6
Number of intervals | 15 | 30 | 36 | 33 | 22 | 12 | 6

i) Find the sample mean for the number of contributors. (1 mark)

ii) Test the hypothesis that the number of people contributing during five-minute time intervals follows a Poisson distribution at the 1% significance level.

(8 marks)

Number of contributors | 0 | 1 | 2 | 3 | 4 | 5 | ≥6
Number of intervals (Oi) | 15 | 30 | 36 | 33 | 22 | 12 | 6
Pi | 0.0821 | 0.2052 | 0.2565 | 0.2138 | 0.1336 | 0.0668 | 0.042
Ei | 12.6434 | 31.6008 | 39.501 | 32.9252 | 20.5744 | 10.2872 | 6.468
(Oi − Ei)²/Ei | 0.4392 | 0.0811 | 0.3103 | 0.0002 | 0.0988 | 0.2852 | 0.0339

Correct Pi -1m

Correct Ei -1m

H0 : The number of people contributing during five-minute time intervals follows a Poisson distribution.

H1 : The number of people contributing during five-minute time intervals does not follow a Poisson distribution. -1m

χ² = Σ (Oi − Ei)²/Ei = 1.2487 -2m

The critical value : -1m

Fail to reject H0 -1m

We do not have enough evidence that the number of people contributing during five-minute time intervals does not follow a Poisson distribution.


We assume that the number of people contributing during five-minute time intervals follows a Poisson distribution. -1m
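The sample mean and the Poisson goodness-of-fit table can be reproduced as sketched below. Several choices here are assumptions: the "≥6" class is counted as 6 when computing the mean, probabilities are rounded to four decimals to mirror the scheme's table, and the critical value 15.0863 for χ²(0.01, 5) is taken from a standard chi-square table because it is missing from this transcript.

```python
from math import exp, factorial

counts = [0, 1, 2, 3, 4, 5, 6]            # last class stands for ">= 6"
observed = [15, 30, 36, 33, 22, 12, 6]    # Table 3.2
n = sum(observed)

# Sample mean, treating the open-ended class as exactly 6 (assumption).
mean = sum(x * f for x, f in zip(counts, observed)) / n

# Poisson probabilities for 0..5, rounded to 4 decimals as in the scheme.
probabilities = [round(exp(-mean) * mean ** x / factorial(x), 4)
                 for x in counts[:-1]]
# The final class takes the remaining probability mass.
probabilities.append(round(1 - sum(probabilities), 4))

expected = [n * p for p in probabilities]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# df = classes - 1 - estimated parameters = 7 - 1 - 1 = 5.
degrees_of_freedom = len(observed) - 1 - 1
CRITICAL_VALUE = 15.0863  # assumed table value for alpha = 0.01, df = 5
decision = "reject H0" if chi_square > CRITICAL_VALUE else "fail to reject H0"
print(mean, round(chi_square, 4), decision)
```

The statistic may differ from the scheme's 1.2487 in the fourth decimal, since the scheme sums terms that were already rounded.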

d) A study is conducted to see if there is any association between the colour of cars involved in accidents and the time the accidents occur. The result of the analysis is displayed in Output 3.1.

Output 3.1

Car_Colour * Time Crosstabulation

Car_Colour | | Morning | Noon | Night | Total
Bright | Count | 30 | 50 | 35 | -
Bright | Expected Count | A | - | - | -
Dark | Count | 80 | B | 120 | -
Dark | Expected Count | - | 60.8 | - | -
Total | Count | - | - | - | 355
Total | Expected Count | - | - | - |

Chi-Square Tests

 | Value | df | Asymp. Sig. (2-sided)
Pearson Chi-Square | 30.179(a) | C | .000
Likelihood Ratio | 29.010 | 2 | .000
Linear-by-Linear Association | 1.611 | 1 | .204
N of Valid Cases | 355 | |

i) Based on Output 3.1, find the value of A, B and C. (3 marks)

A = 35.6338

B = 40

C = (r – 1)(c – 1) = (2 – 1)(3 – 1) = 2
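The values of A, B and C, and the Pearson chi-square value printed in Output 3.1, can be verified from the observed counts. This is a sketch only: it takes B = 40 from the answer above and uses expected count = (row total × column total) / grand total. For 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(−x/2), which is used for the p-value.

```python
from math import exp

# Observed counts: rows are Bright and Dark, columns are Morning, Noon, Night.
observed = [
    [30, 50, 35],
    [80, 40, 120],  # B = 40, taken from the scheme's answer
]
row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / grand_total for c in column_totals] for r in row_totals]

A = expected[0][0]  # expected count for Bright, Morning
C = (len(observed) - 1) * (len(observed[0]) - 1)  # degrees of freedom

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Upper-tail probability of a chi-square variable with 2 degrees of freedom.
p_value = exp(-chi_square / 2)
print(round(A, 4), C, round(chi_square, 3), p_value < 0.025)
```

The p-value is far below 0.025, in line with the ".000" shown in the output.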

ii) At 2.5% significance level test whether there is an association between the colour of cars involved in accidents and the time the accidents occur.

(4 marks)


H0 : There is no association between colour of car involved in accidents and the time the accidents occur

H1 : There is an association between colour of car involved in accidents and the time the accidents occur -1m

P-value : 0.000 < α = 0.025 -1m

Reject H0 -1m

There is an association between colour of car involved in accidents and the time the accidents occur. -1m

QUESTION 4 (25 MARKS)

a) A biologist assumes that there is a linear relationship between the amount of fertilizer supplied to tomato plants and the subsequent yield of tomatoes obtained. Eight tomato plants, of the same variety, were selected at random and treated, weekly, with solution in which fertilizer (in grams) was dissolved in a fixed quantity of water. The yield of tomatoes (in kilograms) was recorded.

Tomato plant | A | B | C | D | E | F | G | H
Fertilizer (in grams) | 1.0 | 1.5 | 2.0 | 2.5 | 3.0 | 3.5 | 4.0 | 4.5
Yield of tomatoes (in kilograms) | 3.9 | 4.4 | 5.8 | 6.6 | 7.0 | 7.1 | 7.3 | 7.7

i) Identify the dependent and independent variables. (2 marks)

Dependent variable: yield of tomato plant -- 1M

Independent variable: amount of fertilizer -- 1M

ii) Find the Pearson correlation coefficient and interpret.(2 marks)

r = 0.9444 -- 1 M

The correlation coefficient suggests a strong positive linear relationship between yield of tomato plant and amount of fertilizer. -- 1M

iii) Test the relationship between yield of tomato plant and amount of fertilizer based on the Pearson correlation coefficient at α = 5%.

(6 marks)

H0 : ρ = 0 vs H1 : ρ ≠ 0 -- 1M

-- 2M

-- 1M

so reject H0. -- 1M


There is enough evidence to conclude that the positive relationship between y and x is significant. --1M
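The test statistic for (iii) is missing from this transcript. As an illustration, the sketch below recomputes r from the data and then the usual statistic t = r√(n − 2)/√(1 − r²). The critical value 2.4469 for t(0.025, 6) comes from a standard t table and is an assumption here, as is the two-tailed form of the test.

```python
from math import sqrt

fertilizer = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]  # x, grams
tomato_yield = [3.9, 4.4, 5.8, 6.6, 7.0, 7.1, 7.3, 7.7]  # y, kilograms
n = len(fertilizer)

sum_x = sum(fertilizer)
sum_y = sum(tomato_yield)

# Corrected sums of squares and cross-products.
s_xx = sum(x ** 2 for x in fertilizer) - sum_x ** 2 / n
s_yy = sum(y ** 2 for y in tomato_yield) - sum_y ** 2 / n
s_xy = sum(x * y for x, y in zip(fertilizer, tomato_yield)) - sum_x * sum_y / n

r = s_xy / sqrt(s_xx * s_yy)

# t statistic for H0: rho = 0 with n - 2 degrees of freedom.
t_statistic = r * sqrt(n - 2) / sqrt(1 - r ** 2)

T_CRITICAL = 2.4469  # assumed table value for t(0.025, 6)
decision = "reject H0" if abs(t_statistic) > T_CRITICAL else "fail to reject H0"
print(round(r, 4), round(t_statistic, 2), decision)
```

The computed r matches the 0.9444 reported in (ii), and the decision agrees with the scheme.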

iv) Fit a least squares line. (3 marks)

a = 3.2524 -- 1M

b = 1.0810 -- 1M

ŷ = 3.2524 + 1.0810x -- 1M

v) Interpret the slope. (1 mark)

An increase of 1 gram in the amount of fertilizer (x) increases the yield of the tomato plant (y) by 1.0810 kilograms. -- 1M

vi) Estimate the yield of plant treated, weekly, with 3.2 grams of fertilizer.(2 marks)

-- 1M

-- 1M
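The fitted line in (iv) and the estimate in (vi), whose working is missing here, can be reproduced as follows. This sketch uses the ordinary least squares formulas b = Sxy/Sxx and a = ȳ − b·x̄; the variable names are illustrative.

```python
fertilizer = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]  # x, grams
tomato_yield = [3.9, 4.4, 5.8, 6.6, 7.0, 7.1, 7.3, 7.7]  # y, kilograms
n = len(fertilizer)

mean_x = sum(fertilizer) / n
mean_y = sum(tomato_yield) / n

# Corrected sum of squares for x and sum of cross-products.
s_xx = sum((x - mean_x) ** 2 for x in fertilizer)
s_xy = sum((x - mean_x) * (y - mean_y)
           for x, y in zip(fertilizer, tomato_yield))

slope = s_xy / s_xx                   # b
intercept = mean_y - slope * mean_x   # a

# Estimated yield for a plant treated weekly with 3.2 grams of fertilizer.
predicted_yield = intercept + slope * 3.2
print(round(intercept, 4), round(slope, 4), round(predicted_yield, 2))
```

Using the rounded coefficients 3.2524 and 1.0810 instead of the unrounded ones changes the estimate only in the fourth decimal place.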

b) A researcher wants to find out the relationship between concentration of cholesterol in blood serum (y), age (x1), and body mass index (x2). He ran a regression analysis using a computer and the output obtained is shown in OUTPUT 4.1.

OUTPUT 4.1

ANOVA(b)

Model | | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 23.132 | 2 | 11.566 | | .000(a)
1 | Residual | 26.571 | 27 | .984 | |
1 | Total | 49.703 | 29 | | |

a Predictors: (Constant), x1, x2

b Dependent V

Coefficients(a)

Model | | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig.
1 | (Constant) | -.740 | 1.896 | | -.390 | .700
1 | x1 | .041 | .014 | .462 | 3.006 | .006
1 | x2 | .201 | .089 | .349 | 2.269 | .031

a Dependent Variable: y


i) Write down the estimated regression model obtained.(1 mark)

-- 1M

ii) Briefly explain the significance of the estimated regression model and its coefficients. (Use α = 5%).

(4 marks)

The ANOVA table shows that the model is significant since the p-value = 0.000 is less than α = 5%. -- 2M

The coefficients table shows that both x1 and x2 make a significant contribution to the model since the p-value of x1 = 0.006 and the p-value of x2 = 0.031 are less than α = 5%. -- 2M

iii) How many variables have a positive significant effect on the model at α = 1%? (1 mark)

One -- 1M

iv) Interpret the effect of x2 on the model. (1 mark)

b2 = 0.201 indicates that, holding the other variable constant, an increase of 1 unit in body mass index increases the average concentration of cholesterol in blood serum by 0.201. -- 1M

v) What is the estimated value of concentration of cholesterol in blood serum if a person is 47 years old and his body mass index is 23.1?

(2 marks)

-- 1M


-- 1M
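The working for (i) and (v) is missing from this transcript. A minimal sketch, assuming the estimated model is read directly from the unstandardized B column of the Coefficients table in OUTPUT 4.1, gives the estimate below. The function name `estimated_cholesterol` is hypothetical.

```python
# Unstandardized coefficients read from the Coefficients table (OUTPUT 4.1).
INTERCEPT = -0.740
B_AGE = 0.041   # coefficient of x1 (age)
B_BMI = 0.201   # coefficient of x2 (body mass index)


def estimated_cholesterol(age, bmi):
    """Estimated y from the assumed model y = -0.740 + 0.041*x1 + 0.201*x2."""
    return INTERCEPT + B_AGE * age + B_BMI * bmi


# Part (v): a person aged 47 with a body mass index of 23.1.
estimate = estimated_cholesterol(47, 23.1)
print(round(estimate, 4))
```

The same coefficients also illustrate the answer to (iii): only x1 has a p-value (0.006) below 1%.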

~ END OF QUESTIONS~
