MS 08 - Solved Assignment


ASSIGNMENTS

Course Code : MS 08

Course Title : Quantitative Analysis for Managerial Applications

Assignment No. : 08/TMA-1/SEM-I/2011
Coverage : All Blocks

Note: Answer all the questions and send them to the Coordinator of the Study Centre you are attached

with.

1. Calculate the mean, median and mode from the following data relating to production of a steel mill for 60 days:

Production (in tons per day):   21-22   23-24   25-26   27-28   29-30
Number of days:                     7      13      22      10       8

Solution :

Class        No. of days (f)   Mid value (x)   d = (x − A)/h   d²    fd    fd²   c.f.
20.5-22.5           7               21.5             -2          4    -14    28      7
22.5-24.5          13               23.5             -1          1    -13    13     20
24.5-26.5          22               25.5 = A          0          0      0     0     42
26.5-28.5          10               27.5              1          1     10    10     52
28.5-30.5           8               29.5              2          4     16    32     60
Total         Σf = 60                                               Σfd = −1  Σfd² = 83

a) Mean = A + [ (∑fd/∑f) × h]

Where A = assumed mean

h = class size = 2

= 25.5 + [(-1/60) × 2]

= 25.467 (approx)

Page 2: MS 08 - Solved Assignment

8/4/2019 MS 08 - Solved Assignment

http://slidepdf.com/reader/full/ms-08-solved-assignment 2/13

2

b) Median (Q2) = l1 + [{(N/2 − p.c.f.)/f} × (l2 − l1)]

where l1 is the lower boundary of the median class, p.c.f. is the cumulative frequency of the preceding class, f is the frequency of the median class, and l2 − l1 = h is the class width.

Median class = 24.5 – 26.5

Median = 24.5 + [{(60/2 – 20) /22} × 2]

= 25.409 (approx.)

c) Mode = l + [{(f − f1)/(2f − f1 − f2)} × h]

where l is the lower boundary of the modal class, f its frequency, and f1 and f2 the frequencies of the preceding and succeeding classes.

Modal class = 24.5-26.5

Mode = 24.5 + [{ (22-13)/(2×22-13-10) } ×2]

= 25.357 (approx.)
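As a cross-check, here is a short Python sketch (illustrative, not part of the original assignment) that reproduces all three grouped-data statistics from the table above:

# Grouped-data mean, median and mode for the steel-mill production table.
lower = [20.5, 22.5, 24.5, 26.5, 28.5]    # lower class boundaries
freq  = [7, 13, 22, 10, 8]                # number of days per class
h = 2.0                                   # class width
N = sum(freq)

# Mean via the assumed-mean (step-deviation) method, A = 25.5
A = 25.5
d = [((lo + h / 2) - A) / h for lo in lower]
mean = A + sum(f * di for f, di in zip(freq, d)) / N * h

# Median: locate the class containing the (N/2)-th observation
cf = 0
for i, f in enumerate(freq):
    if cf + f >= N / 2:
        median = lower[i] + (N / 2 - cf) / f * h
        break
    cf += f

# Mode: modal class has the largest frequency; f1, f2 are its neighbours
m = freq.index(max(freq))
f0, f1, f2 = freq[m], freq[m - 1], freq[m + 1]
mode = lower[m] + (f0 - f1) / (2 * f0 - f1 - f2) * h

print(round(mean, 3), round(median, 3), round(mode, 3))
# -> 25.467 25.409 25.357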

2. A restaurant is experiencing discontentment among its customers. It analyses that there are three factors responsible, viz. food quality, service quality and interior décor. By conducting an analysis, it assesses the probabilities of discontentment with the three factors as 0.40, 0.35 and 0.25 respectively. By conducting a survey among the customers, it also evaluated the probabilities of a customer going away discontented on account of these factors as 0.6, 0.8 and 0.5, respectively. With this information, the restaurant wants to know that, if a customer is discontented, what are the probabilities that it is so due to food, service or interior décor?

Solution : Let

“A” be the case of discontentment due to food.

“B” be the case of discontentment due to service.

& “C” be the case of discontentment due to interior décor.

 Now, it’s given that

P(A) = 0.40

P(B) = 0.35

P(C) = 0.25

Page 3: MS 08 - Solved Assignment

8/4/2019 MS 08 - Solved Assignment

http://slidepdf.com/reader/full/ms-08-solved-assignment 3/13

3

Let the probability of a customer going away discontented (on account of any of these factors) be represented as "E".

Then, it’s given that,

P(E/A) = 0.6 (probability of a customer going away discontented due to food)

P(E/B) = 0.8 (probability of a customer going away discontented due to service)

P(E/C) = 0.5 (probability of a customer going away discontented due to interior décor)

Now, if a customer is discontented, the probability that this is

due to FOOD is given by P(A/E),

due to SERVICE by P(B/E), and

due to INTERIOR DÉCOR by P(C/E).

According to Bayes' theorem,

a) P(A/E) = {P(E/A) × P(A)} ÷ {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}

= {0.6 × 0.40} ÷ {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}

= 0.372 (approx.)

b) P(B/E) = {P(E/B) × P(B)} ÷ {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}

= {0.8 × 0.35} ÷ {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}

= 0.434 (approx.)

c) P(C/E) = {P(E/C) × P(C)} ÷ {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}

= {0.5 × 0.25} ÷ {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}

= 0.194 (approx.)
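These three posteriors can be checked with a few lines of Python (a sketch, using only the figures given above):

# Bayes' theorem check for the restaurant problem.
priors = {"food": 0.40, "service": 0.35, "decor": 0.25}      # P(A), P(B), P(C)
likelihoods = {"food": 0.6, "service": 0.8, "decor": 0.5}    # P(E/A), P(E/B), P(E/C)

# Total probability that a customer goes away discontented, P(E)
p_e = sum(priors[k] * likelihoods[k] for k in priors)        # = 0.645

# Posterior P(factor/E) for each factor
for k in priors:
    print(k, round(priors[k] * likelihoods[k] / p_e, 3))
# -> food 0.372, service 0.434, decor 0.194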


3. The monthly incomes of a group of 10,000 persons were found to be normally distributed with mean equal to 15,000 and standard deviation equal to 1000. What is the lowest income among the richest 250 persons?

Solution :
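A sketch of the standard normal-distribution approach:

The richest 250 persons form the top 250/10,000 = 2.5% of the distribution. We need the income x for which P(X > x) = 0.025, i.e. P(Z > z) = 0.025 with z = (x − 15,000)/1,000. From the standard normal tables, z = 1.96, so

x = 15,000 + 1.96 × 1,000 = 16,960

Hence the lowest income among the richest 250 persons is approximately 16,960. A quick numerical check in Python (scipy assumed available):

# Lowest income among the richest 250 of 10,000 normally distributed incomes.
from scipy.stats import norm

mean, sd = 15_000, 1_000
top_fraction = 250 / 10_000            # richest 2.5%

# Income at the 97.5th percentile: P(X > x) = 0.025
x = norm.ppf(1 - top_fraction, loc=mean, scale=sd)
print(round(x))                        # -> 16960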


4. Write short notes on the following:
a. Test of goodness of fit
b. Critical region of a test
c. Exponential Smoothing Method

Solution :

a. Test of goodness of fit

The goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, e.g. to test for normality of residuals, to test whether two samples are drawn from identical distributions, or to test whether outcome frequencies follow a specified distribution. In the analysis of variance, one of the components into which the variance is partitioned may be a lack-of-fit sum of squares.

Fit of distributions

In assessing whether a given distribution is suited to a data-set, the following tests and their underlying

measures of fit can be used:

Kolmogorov–Smirnov test;

Cramér–von-Mises criterion;

Anderson–Darling test.
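For instance, a Kolmogorov–Smirnov test of normality takes only a few lines of Python (scipy assumed available; the data here are illustrative):

# Kolmogorov-Smirnov test: is the sample consistent with a standard normal?
from scipy.stats import kstest, norm

sample = norm.rvs(size=200, random_state=0)   # illustrative data
stat, p = kstest(sample, "norm")
print(stat, p)   # a large p-value gives no evidence against normality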

Regression analysis

In regression analysis, the following topics relate to goodness of fit:

Coefficient of determination (The R squared measure of goodness of fit);

Lack-of-fit sum of squares.

Example

One way in which a measure of goodness of fit statistic can be constructed, in the case where the variance of the measurement error is known, is to construct a weighted sum of squared errors:

χ² = Σ (O − E)² / σ²

where σ² is the known variance of the observation, O the observed value and E the expected value.


This definition is only useful when one has estimates for the error on the measurements, but it leads to a

situation where a chi-square distribution can be used to test goodness of fit, provided that the errors can

 be assumed to have a normal distribution.

The reduced chi-squared statistic is simply the chi-squared divided by the number of degrees of freedom:

χ²_red = χ² / ν

where ν is the number of degrees of freedom, usually given by N − n − 1, where N is the number of observations and n is the number of fitted parameters, assuming that the mean value is an additional fitted parameter. The advantage of the reduced chi-squared is that it already normalizes for the number of data points and model complexity.

As a rule of thumb, a large χ²_red indicates a poor model fit. A value of χ²_red < 1 indicates that the model is 'over-fitting' the data (either the model is improperly fitting noise, or the error variance has been over-estimated). A value of χ²_red > 1 indicates that the fit has not fully captured the data (or that the error variance has been under-estimated). In principle, a value of χ²_red around 1 indicates that the extent of the match between observations and estimates is in accord with the error variance.

Categorical data

The following are examples that arise in the context of categorical data.

Pearson's chi-square test

Pearson's chi-square test uses a measure of goodness of fit which is the sum of differences between observed and expected outcome frequencies (that is, counts of observations), each squared and divided by the expectation:

χ² = Σi (Oi − Ei)² / Ei

where:

Oi = an observed frequency (i.e. count) for the ith bin
Ei = an expected (theoretical) frequency for the ith bin, asserted by the null hypothesis.

The resulting value can be compared to the chi-square distribution to determine the

goodness of fit. In order to determine the degrees of freedom of the chi-squared


distribution, one takes the total number of observed frequencies and subtracts one. For 

example, if there are eight different frequencies, one would compare to a chi-squared

with seven degrees of freedom.

 Example: equal frequencies of men and women

For example, to test the hypothesis that a random sample of 100 people has been drawn from a population

in which men and women are equal in frequency, the observed number of men and women would be

compared to the theoretical frequencies of 50 men and 50 women. If there were 44 men in the sample and

56 women, then

χ² = (44 − 50)²/50 + (56 − 50)²/50 = 0.72 + 0.72 = 1.44

If the null hypothesis is true (i.e., men and women are chosen with equal probability in the sample), the test statistic will be drawn from a chi-square distribution with one degree of freedom. Though

one might expect two degrees of freedom (one each for the men and women), we must take into

account that the total number of men and women is constrained (100), and thus there is only one

degree of freedom (2 − 1). Alternatively, if the male count is known the female count is determined,

and vice-versa.

Consultation of the chi-square distribution for 1 degree of freedom shows that the probability of 

observing this difference (or a more extreme difference than this) if men and women are equally

numerous in the population is approximately 0.23. This probability is higher than conventional

criteria for statistical significance (.001-.05), so normally we would not reject the null hypothesis

that the number of men in the population is the same as the number of women (i.e. we would

consider our sample within the range of what we'd expect for a 50/50 male/female ratio.)
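This example is easy to verify in Python (scipy assumed available):

# Chi-square goodness-of-fit test: are men and women equally frequent?
from scipy.stats import chisquare

observed = [44, 56]          # men, women in the sample
expected = [50, 50]          # theoretical 50/50 split

stat, p = chisquare(observed, f_exp=expected)
print(round(stat, 2), round(p, 2))   # -> 1.44 0.23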

Binomial case

A binomial experiment is a sequence of independent trials in which the trials can result in one of two

outcomes, success or failure. There are n trials each with probability of success, denoted by p. Provided that npi ≫ 1 for every i (where i = 1, 2, ..., k), then

χ² = Σi (Ni − npi)² / (npi)

This has approximately a chi-squared distribution with k − 1 df. The fact that df = k − 1 is a consequence of the restriction ΣNi = n. We know there are k observed cell counts; however, once any k − 1 are


known, the remaining one is uniquely determined. Basically, one can say, there are only k − 1 freely

determined cell counts, thus df = k − 1.

Other measures of fit

The likelihood ratio test statistic is a measure of the goodness of fit of a model, judged by whether an

expanded form of the model provides a substantially improved fit.

b. Critical region of a test

A statistical hypothesis test is a method of making decisions using data, whether from a controlled

experiment or an observational study (not controlled). In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level. The phrase "test of significance" was coined by Ronald Fisher:

"Critical tests of this kind may be called tests of significance, and when such tests are available we may

discover whether a second sample is or is not significantly different from the first." Hypothesis testing is

sometimes called confirmatory data analysis, in contrast to exploratory data analysis. In frequency

 probability, these decisions are almost always made using null-hypothesis tests (i.e., tests that answer the

question Assuming that the null hypothesis is true, what is the probability of observing a value for the test 

 statistic that is at least as extreme as the value that was actually observed?) One use of hypothesis testing

is deciding whether experimental results contain enough information to cast doubt on conventional

wisdom.

A result that was found to be statistically significant is also called a positive result; conversely, a result

whose probability under the null hypothesis exceeds the significance level is called a negative result or 

a null result.

Statistical hypothesis testing is a key technique of frequentist statistical inference. The Bayesian approach

to hypothesis testing is to base rejection of the hypothesis on the posterior probability.

Other approaches to reaching a decision based on data are available via decision theory and optimal

decisions.

The critical region of a hypothesis test is the set of all outcomes which, if they occur, will lead us to

decide that there is a difference. That is, cause the null hypothesis to be rejected in favor of the alternative

hypothesis. The critical region is usually denoted by the letter C.


For concreteness, consider testing whether a subject can call the suit of each of 25 randomly drawn cards. Under the null hypothesis H0 that the subject is merely guessing, the number of correct calls X is binomially distributed with n = 25 and p = 1/4; under the alternative hypothesis H1 the subject does better than chance.

Before the test is actually performed, the desired probability of a Type I error is determined. Typically, values in the range of 1% to 5% are selected. Depending on this desired Type I error rate, the critical value c is calculated. For example, if we select an error rate of 1%, c is calculated from

P(X ≥ c | H0) ≤ 0.01

From all the numbers c with this property, we choose the smallest, in order to minimize the probability of a Type II error, a false negative. For the above example, we select c = 13.

But what if the subject did not guess any cards at all? Having zero correct answers is clearly an oddity too. The probability of guessing incorrectly once is equal to p′ = (1 − p) = 3/4. Using the same approach we can calculate that the probability of randomly calling all 25 cards wrong is

(3/4)^25 ≈ 0.00075

This is highly unlikely (less than a 1 in 1000 chance). While the subject can't guess the cards correctly,

dismissing H0 in favour of H1 would be an error. In fact, the result would suggest a trait on the subject's

 part of avoiding calling the correct card. A test of this could be formulated: for a selected 1% error rate

the subject would have to answer correctly at least twice, for us to believe that card calling is based purely

on guessing.
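A short Python sketch of these calculations (scipy assumed available):

# Critical region for the card-guessing test: X ~ Binomial(25, 1/4) under H0.
from scipy.stats import binom

n, p, alpha = 25, 0.25, 0.01

# Smallest c with P(X >= c | H0) <= alpha (upper critical value)
c = next(k for k in range(n + 1) if binom.sf(k - 1, n, p) <= alpha)
print(c)                                   # -> 13

# Probability of calling all 25 cards wrong: (3/4)**25
print(binom.pmf(0, n, p))                  # -> ~0.00075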

c. Exponential Smoothing Method

Exponential smoothing is a technique that can be applied to time series data, either to produce

smoothed data for presentation, or to make forecasts. The time series data themselves are a

sequence of observations. The observed phenomenon may be an essentially random process, or it

may be an orderly, but noisy, process. Whereas in the simple moving average the past

observations are weighted equally, exponential smoothing assigns exponentially decreasing

weights over time.

Exponential smoothing is commonly applied to financial market and economic data, but it can be used with any discrete set of repeated measurements. The raw data sequence is often represented by {xt}, and the output of the exponential smoothing algorithm is commonly written as {st}, which may be regarded as a best estimate of what the next value of x will be. When the sequence of observations begins at time t = 0, the simplest form of exponential smoothing is given by the formulas

s1 = x0
st = α·xt−1 + (1 − α)·st−1, t > 1


where α is the smoothing factor, and 0 < α < 1.
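A minimal Python sketch of this recurrence (the function name and sample data are illustrative only):

# Simple exponential smoothing: s_1 = x_0, s_t = a*x_{t-1} + (1-a)*s_{t-1}.
def exponential_smoothing(x, alpha):
    s = [x[0]]                            # s_1 = x_0
    for t in range(1, len(x)):
        s.append(alpha * x[t - 1] + (1 - alpha) * s[-1])
    return s

data = [3.0, 5.0, 9.0, 20.0, 12.0, 17.0, 22.0, 23.0, 51.0, 41.0]
print(exponential_smoothing(data, alpha=0.5))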

The simple moving average

Intuitively, the simplest way to smooth a time series is to calculate a simple, or unweighted, moving average. The smoothed statistic st is then just the mean of the last k observations:

st = (xt + xt−1 + … + xt−k+1) / k

where the choice of an integer k > 1 is arbitrary. A small value of k will have less of a smoothing effect

and be more responsive to recent changes in the data, while a larger k will have a greater smoothing

effect, and produce a more pronounced lag in the smoothed sequence. One disadvantage of this technique

is that it cannot be used on the first k −1 terms of the time series.
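A one-line Python sketch of this (illustrative only):

# Simple (unweighted) moving average of the last k observations.
def moving_average(x, k):
    return [sum(x[t - k + 1 : t + 1]) / k for t in range(k - 1, len(x))]

print(moving_average([3.0, 5.0, 9.0, 20.0, 12.0, 17.0], k=3))
# note: undefined for the first k-1 terms, as described above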

The weighted moving average

A slightly more intricate method for smoothing a raw time series {xt} is to calculate a weighted moving average by first choosing a set of weighting factors

{w1, w2, …, wk} such that w1 + w2 + … + wk = 1

and then using these weights to calculate the smoothed statistics {st}:

st = w1·xt + w2·xt−1 + … + wk·xt−k+1

In practice the weighting factors are often chosen to give more weight to the most recent terms in the time

series and less weight to older data. Notice that this technique has the same disadvantage as the simple

moving average technique (i.e., it cannot be used until at least k observations have been made), and that it entails a more complicated calculation at each step of the smoothing procedure. In addition to this

disadvantage, if the data from each stage of the averaging is not available for analysis, it may be difficult

if not impossible to reconstruct a changing signal accurately (because older samples may be given less

weight). If the number of stages missed is known however, the weighting of values in the average can be

adjusted to give equal weight to all missed samples to avoid this issue.


The exponential moving average

The simplest form of exponential smoothing is given by the formulae

s1 = x0
st = α·xt−1 + (1 − α)·st−1, t > 1

where α is the smoothing factor, and 0 < α < 1. In other words, the smoothed statistic st is a simple weighted average of the previous observation xt−1 and the previous smoothed statistic st−1. The term smoothing factor applied to α here is something of a misnomer, as larger values of α actually reduce the level of smoothing. In the limiting case with α = 1 the output series is just the same as the original series. Simple exponential smoothing is easily applied, and it produces a smoothed statistic as soon as two observations are available.

Values of α close to one have less of a smoothing effect and give greater weight to recent changes in

the data, while values of α closer to zero have a greater smoothing effect and are less responsive to

recent changes. There is no formally correct procedure for choosing α. Sometimes the statistician's

 judgment is used to choose an appropriate factor. Alternatively, a statistical technique may be used

to optimize the value of α. For example, the method of least squares might be used to determine the value of α for which the sum of the quantities (sn−1 − xn−1)² is minimized.

This technique does not share the moving averages' disadvantage that it cannot be used until a minimum number of observations have been made, though in practice a "good average" will not be achieved

until several samples have been averaged together (a constant signal will take

approximately 3/α stages to reach 95% of the actual value). To accurately reconstruct the original

signal without information loss all stages of the exponential moving average must also be available

(because older samples decay in weighting exponentially). In the simple moving average some

samples can be skipped without as much loss of information, due to the constant weighting of 

samples within the average. If a known number of samples will be missed, a weighted average can

 be adjusted for this as well, by giving equal weight to the new sample and all those to be skipped.

This simple form of exponential smoothing is also known as an exponentially weighted moving

average (EWMA). Technically it can also be classified as an autoregressive integrated moving average ARIMA(0,1,1) model with no constant term.


By direct substitution of the defining equation for simple exponential smoothing back into itself we find that

st = α·xt−1 + (1 − α)·st−1
   = α·xt−1 + α(1 − α)·xt−2 + (1 − α)²·st−2
   = α·[xt−1 + (1 − α)·xt−2 + (1 − α)²·xt−3 + … + (1 − α)^(t−2)·x1] + (1 − α)^(t−1)·x0

In other words, as time passes the smoothed statistic st becomes the weighted average of a greater and greater number of the past observations xt−n, and the weights assigned to previous observations are in general proportional to the terms of the geometric progression {1, (1 − α), (1 − α)², (1 − α)³, …}.

A geometric progression is the discrete version of an exponential function, so this is where the name for 

this smoothing method originated.

Double exponential smoothing

Simple exponential smoothing does not do well when there is a trend in the data.

In such situations, double exponential smoothing can be used.

Again, the raw data sequence of observations is represented by {xt}, beginning at time t = 0. We use {st} to represent the smoothed value for time t, and {bt} is our best estimate of the trend at time t. The output of the algorithm is now written as Ft+m, an estimate of the value of x at time t+m, m > 0, based on the raw data up to time t. Double exponential smoothing is given by the formulas

s0 = x0
st = α·xt + (1 − α)·(st−1 + bt−1)
bt = β·(st − st−1) + (1 − β)·bt−1
Ft+m = st + m·bt

where α is the data smoothing factor, 0 < α < 1, β is the trend smoothing factor, 0 < β < 1, and b0 is taken as (xn−1 − x0)/(n − 1) for some n > 1. Note that F0 is undefined (there is no estimation for time 0), and according to the definition F1 = s0 + b0, which is well defined; thus further values can be evaluated.
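A minimal Python sketch of these recurrences (naming and sample data are illustrative; initialization follows the definitions above):

# Double (Holt) exponential smoothing with a linear trend component.
def double_exponential_smoothing(x, alpha, beta, m=1):
    n = len(x)
    s = x[0]                               # s_0 = x_0
    b = (x[n - 1] - x[0]) / (n - 1)        # b_0 = (x_{n-1} - x_0)/(n - 1)
    forecasts = [s + b]                    # F_1 = s_0 + b_0
    for t in range(1, n):
        s_prev = s
        s = alpha * x[t] + (1 - alpha) * (s + b)
        b = beta * (s - s_prev) + (1 - beta) * b
        forecasts.append(s + m * b)        # F_{t+m} = s_t + m*b_t
    return forecasts

data = [3.0, 5.0, 9.0, 20.0, 12.0, 17.0, 22.0, 23.0, 51.0, 41.0]
print(double_exponential_smoothing(data, alpha=0.5, beta=0.3))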