QUANTITATIVE ANALYSIS FOR MANAGEMENT – II

http://ganga.iiml.ac.in/~ggarg/QAM2.htm

QAM – II by Gaurav Garg (IIM Lucknow)

Upload: shantanu-chikhale

Post on 01-Nov-2014



COURSE OUTLINE

• Sampling Distributions – Chi-Square, t and F distributions
• Interval Estimation
• Sample Size Decision
• Testing of Hypothesis
  • single population and
  • two populations
• Measures of Association for Qualitative data and contingency Table
• Chi-square test for Goodness of fit
• Analysis of Variance – one way and two way
• Nonparametric tests
• Multiple Regression Analysis

EVALUATION SCHEME AND EXAM DATES

Component        Weight   Date
QUIZ-1           10%      SEP 28, 2011 / 0830 - 0850 Hrs.
QUIZ-2           10%      OCT 12, 2011 / 0830 - 0850 Hrs.
MID TERM EXAM    25%      OCT 27 - 29, 2011
QUIZ-3           00%      NOV 16, 2011 / 0830 - 0850 Hrs.
ASSIGNMENT       10%      NOV 24, 2011, due date: DEC 10, 2011
END TERM EXAM    40%      DEC 14 - 19, 2011
IMPRESSION       05%      --

• Three quizzes will be conducted and the best two out of three will be considered.

Sampling Distributions

• Concept of Sampling Distribution
• Distributions of Sample Mean and Sample Proportion
• Central Limit Theorem
• t, Chi-Square and F distributions.

Parameter and Statistic

• Parameter: a statistical measure computed using population observations. Let $X_1, X_2, \ldots, X_N$ be the population units.
  Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$
  Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2$

• Statistic: a statistical measure computed using sample observations. Let $x_1, x_2, \ldots, x_n$ be the sample units.
  Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
  Sample variance: $s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$ or $s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

• In practice, parameter values are not known.
• They are estimated using sample observations.
• Parameter values are fixed.
• The value of a statistic varies from sample to sample.
• Unbiased Estimate: if E(statistic) = parameter, then the statistic is said to be an unbiased estimate of the parameter.
• The sample mean is an unbiased estimate of the population mean.

• Let us consider the following population of size 4: 18, 20, 22, 24
• Population mean = (18 + 20 + 22 + 24)/4 = 21
• Population variance = $[(18-21)^2 + (20-21)^2 + (22-21)^2 + (24-21)^2]/4 = 5$
• Consider all possible samples of size 2 (drawn with replacement, so there are 16 ordered samples).
• Obtain the sample mean and sample variance of all the samples.
• The sample mean is an unbiased estimate of the population mean.
• This means that the average of all sample means equals the population mean.

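Samples    $\bar{x}$    $s^2$    $s_1^2$
18, 18     18           0        0
20, 18     19           1        2
22, 18     20           4        8
24, 18     21           9        18
18, 20     19           1        2
20, 20     20           0        0
22, 20     21           1        2
24, 20     22           4        8
18, 22     20           4        8
20, 22     21           1        2
22, 22     22           0        0
24, 22     23           1        2
18, 24     21           9        18
20, 24     22           4        8
22, 24     23           1        2
24, 24     24           0        0
Average    21           2.5      5

Here $\mu = 21$, $\sigma^2 = 5$, $n = 2$, and

$E(\bar{x}) = \mu$, $E(s^2) \neq \sigma^2$, $E(s_1^2) = \sigma^2$,

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$, $s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$.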
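The averages in the table above can be checked by brute-force enumeration. The following is a minimal Python sketch, assuming the 16 samples are the ordered pairs drawn with replacement; the names `population`, `s2` and `s1_2` are illustrative and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]
n = 2

def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))

def s2(sample):
    """Sample variance with divisor n."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def s1_2(sample):
    """Sample variance with divisor n - 1."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

# All 16 ordered samples of size 2, drawn with replacement
samples = list(product(population, repeat=n))

avg_mean = mean([mean(s) for s in samples])
avg_s2 = mean([s2(s) for s in samples])
avg_s1_2 = mean([s1_2(s) for s in samples])

print(avg_mean, avg_s2, avg_s1_2)
```

The average of the sample means and the average of the (n - 1)-divisor variances match the population mean and variance, while the n-divisor variance averages to half the population variance for n = 2.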

Sampling Distributions

• Unknown parameters are estimated using sample observations.
• Parameter values are fixed.
• The value of a statistic varies from sample to sample.
• Each sample has some probability of being chosen.
• Each value of a statistic is associated with a probability.
• A statistic is a random variable.
• The distribution of a statistic is called its sampling distribution.
• The distribution of a statistic may not be the same as the distribution of the population.

Sampling Distribution of Mean (or Distribution of Sample Mean)

• Consider the previous example again.
• Histogram of population units:

[Figure: histogram of the population units 18, 20, 22 and 24 (horizontal axis: x), each with relative frequency 0.25]

• Each value occurs only once.
• The population distribution is a discrete uniform distribution.

Samples                                   Sample Mean   Frequency   Probability = relative frequency
(18, 18)                                  18            1           1/16
(20, 18), (18, 20)                        19            2           2/16
(22, 18), (18, 22), (20, 20)              20            3           3/16
(24, 18), (18, 24), (20, 22), (22, 20)    21            4           4/16
(20, 24), (24, 20), (22, 22)              22            3           3/16
(22, 24), (24, 22)                        23            2           2/16
(24, 24)                                  24            1           1/16
Total                                                   16          1

[Figure: histogram of the sample mean over the values 18 to 24, with probabilities rising from 1/16 at 18 to 4/16 at 21 and falling back to 1/16 at 24 (no longer uniform)]

• The value of the sample mean depends on the chosen sample.
• Each sample is chosen with certain probability.
• So, each possible value of sample mean is associated with some probability.
• Distribution of sample mean is the list of all possible values along with corresponding probabilities.

Sample Mean   18     19     20     21     22     23     24
Probability   1/16   2/16   3/16   4/16   3/16   2/16   1/16

• In other words, the statistic $T = \bar{x}$ (sample mean) can be considered as a random variable.
• The distribution of T is given by the following table:

t       P(T=t)   t × P(T=t)   t^2 × P(T=t)
18      1/16     1.125        20.250
19      2/16     2.375        45.125
20      3/16     3.750        75.000
21      4/16     5.250        110.250
22      3/16     4.125        90.750
23      2/16     2.875        66.125
24      1/16     1.500        36.000
Total   1        21.000       443.500

$E(T) = 21$
$E(T^2) = 443.5$
$Var(T) = E(T^2) - [E(T)]^2 = 2.5$
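The moments of T can also be computed directly from the list of samples. This is a small Python sketch under the same assumption as the example (ordered samples of size 2 drawn with replacement); the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]
samples = list(product(population, repeat=2))

# Frequency of each possible value of the sample mean
counts = Counter(Fraction(a + b, 2) for a, b in samples)

# Sampling distribution: value of T -> probability (relative frequency)
dist = {t: Fraction(c, len(samples)) for t, c in sorted(counts.items())}

e_t = sum(t * p for t, p in dist.items())         # E(T)
e_t2 = sum(t ** 2 * p for t, p in dist.items())   # E(T^2)
var_t = e_t2 - e_t ** 2                           # Var(T)

for t, p in dist.items():
    print(t, p)
print(e_t, e_t2, var_t)
```

Note that `Fraction` prints probabilities in lowest terms, so 2/16 appears as 1/8 and 4/16 as 1/4.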

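• In general, $E(\bar{x}) = \mu$, $Var(\bar{x}) = \sigma^2/n$.
• $E(\bar{x})$ and $Var(\bar{x})$ can also be obtained as follows:

$$E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \frac{1}{n}\, n\mu = \mu$$

$$Var(\bar{x}) = Var\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} Var(x_i) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$

• Common Notation: $E(\bar{x}) = \mu_{\bar{x}}$, $Var(\bar{x}) = \sigma_{\bar{x}}^2 = \sigma^2/n$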

Standard Error

• Different samples of the same size from the same population will yield different sample means.
• A measure of the variability in different values of sample mean is given by the Standard Error of the sample mean.
• Standard error of a statistic is the standard deviation of its distribution:
  standard error$(\bar{x}) = \sigma_{\bar{x}} = \sqrt{Var(\bar{x})} = \sigma/\sqrt{n}$
• In our example, $\sigma_{\bar{x}} = \sqrt{2.5} = 1.5811$
• Standard error decreases when sample size is increased.
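How the standard error shrinks with sample size can be seen with a few lines of Python. This is only a sketch: the helper name `standard_error` and the sample sizes other than n = 2 are assumptions chosen for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = math.sqrt(5)  # population s.d. of 18, 20, 22, 24 (variance 5)

# n = 2 is the example from the slides; the larger n are illustrative
for n in (2, 4, 16, 64):
    print(n, round(standard_error(sigma, n), 4))
```

Quadrupling the sample size halves the standard error, since n enters through its square root.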

Central Limit Theorem

• When the population distribution is N(μ, σ), then $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$.
• When the population distribution is not normal, then also $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$, provided n → ∞.
• Practically, this result is true for n ≥ 30.

[Figure: histogram of 1,800 Randomly Selected Values from an Exponential Distribution (horizontal axis: x), followed by histograms of the Distribution of Sample Mean for n = 2, n = 5 and n = 30]

[Figure: histogram of 1,800 Randomly Selected Values from a Uniform Distribution (axes: X, Frequency), followed by histograms of the Distribution of Sample Mean for n = 2, n = 5 and n = 30 (axes: x, Frequency)]
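The behaviour shown in these figures can be imitated with a small simulation. The sketch below is an illustration only: it assumes an exponential population with mean 1 and standard deviation 1 (the slides do not state the parameters) and 2,000 repeated samples per sample size, and it uses only the Python standard library.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, reps=2000):
    """Means of `reps` samples of size n from an exponential population (mean 1, s.d. 1)."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

for n in (2, 5, 30):
    means = sample_means(n)
    # Compare the spread of the simulated means with sigma / sqrt(n) = 1 / sqrt(n)
    print(n, round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3), round(1 / n ** 0.5, 3))
```

As n grows, the simulated means cluster more tightly around the population mean and their spread tracks σ/√n, which is what the central limit theorem predicts.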

• Example:
• Suppose a population has mean μ = 8 and standard deviation σ = 3.
• Suppose a random sample of size n = 36 is selected.
• What is the probability that the sample mean is between 7.75 and 8.25?
• Even if the population is not normally distributed, the central limit theorem can be used (n > 30).
• So, the distribution of the sample mean is approximately N(8, 3/6),
• i.e., $\bar{x} \sim N(8, 3/6)$.
• $P[7.75 < \bar{x} < 8.25] = ?$
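The question posed in this example can be answered numerically. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), with the normal approximation taken from the example itself:

```python
from statistics import NormalDist

mu, sigma, n = 8, 3, 36
se = sigma / n ** 0.5        # standard error: 3/6 = 0.5

xbar = NormalDist(mu, se)    # approximate distribution of the sample mean
prob = xbar.cdf(8.25) - xbar.cdf(7.75)

print(round(prob, 4))
```

Both limits lie half a standard error from the mean (z = ±0.5), so the probability is about 0.38.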

Sampling Distribution of Proportion (or Distribution of Sample Proportion)

• Let us consider that the population is divided into two mutually exclusive and collectively exhaustive classes.
• One class possesses a particular attribute.
• The other class does not possess that attribute.
• For example, the people in a city could be divided into “Smokers” and “Non-smokers”.

• Let
  N = population size
  X = no. of people out of N possessing a particular attribute
  P = actual proportion of the people possessing a particular attribute = X/N

• Let a sample be selected from this population.
  n = sample size
  x = no. of people in the sample possessing a particular attribute
  p = x/n = sample proportion

• X, P are population parameters.
• x, p are sample statistics.
• p provides an estimate of P.
• Note that x ~ B(n, P).
• E(x) = nP, Var(x) = nPQ (where Q = 1 - P).
• This implies that
  E(p) = E(x/n) = P,
  Var(p) = Var(x/n) = nPQ/n^2 = PQ/n.
• Standard error (p) = √[Var(p)] = √(PQ/n)
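These two results can be verified exactly from the binomial probabilities. The sketch below is illustrative: the function name `proportion_moments` and the values n = 10, P = 0.3 are assumptions, not taken from the slides.

```python
from math import comb, isclose

def proportion_moments(n, P):
    """Exact mean and variance of p = x/n when x ~ B(n, P)."""
    Q = 1 - P
    # Binomial probabilities Pr[x = 0], ..., Pr[x = n]
    pmf = [comb(n, x) * P ** x * Q ** (n - x) for x in range(n + 1)]
    mean = sum((x / n) * w for x, w in enumerate(pmf))
    var = sum((x / n - mean) ** 2 * w for x, w in enumerate(pmf))
    return mean, var

n, P = 10, 0.3   # illustrative values
mean_p, var_p = proportion_moments(n, P)

# The last value printed is the formula PQ/n for comparison
print(mean_p, var_p, P * (1 - P) / n)
```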

• When the sample size n is large enough,

$$Z = \frac{x - nP}{\sqrt{nPQ}} \sim N(0,1) \quad \text{or} \quad Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0,1)$$

• This is a particular case of central limit theorem.
• Practically, this result is true for n ≥ 30.
• Or, when nP ≥ 5 as well as nQ ≥ 5

• Example:
• Suppose the true proportion of voters who support ABC party is 0.4.
• What is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?
• P = 0.4, Q = 1 – 0.4 = 0.6
• n = 200.
• $Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0,1)$
• Pr[0.40 < p < 0.45] = ?
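The probability asked for in this example follows from standardising both limits. A short sketch with the standard library's `statistics.NormalDist`, using the normal approximation stated for large n; the variable names are illustrative.

```python
from statistics import NormalDist

P, n = 0.4, 200
Q = 1 - P
se = (P * Q / n) ** 0.5       # standard error of p: sqrt(PQ/n)

z_low = (0.40 - P) / se       # lower limit coincides with P, so z = 0
z_high = (0.45 - P) / se

std_normal = NormalDist()     # N(0, 1)
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(se, 4), round(z_high, 4), round(prob, 4))
```

The upper limit is roughly 1.44 standard errors above P, so the probability is a little under 0.43.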

Finite Population Correction

• For the application of central limit theorem, we assumed that the sample size n is large.
• If the population size N is small, sample size n cannot be sufficiently large.
• And we cannot apply central limit theorem.
• In this situation, we multiply the standard error by the Finite Population Correction (fpc),
• which is given by

$$fpc = \sqrt{\frac{N - n}{N - 1}}$$

• Clearly, when N → ∞, fpc → 1.

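• Thus

$$\bar{x} \sim N\left(\mu,\ \frac{\sigma}{\sqrt{n}}\, fpc\right) \quad \text{or} \quad \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}} \sim N(0,1).$$

• And

$$\frac{p - P}{\sqrt{\frac{PQ}{n}}\; fpc} = \frac{p - P}{\sqrt{\frac{PQ}{n}}\sqrt{\frac{N-n}{N-1}}} \sim N(0,1).$$

• fpc should be used when n / N > 0.05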
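The correction can be checked on the small population 18, 20, 22, 24 used earlier. The sketch below assumes samples of size 2 drawn without replacement, which is the situation the fpc is designed for; the function name `fpc` and the variable names are illustrative.

```python
from itertools import combinations
from math import isclose, sqrt

def fpc(N, n):
    """Finite population correction: sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

population = [18, 20, 22, 24]
N, n = len(population), 2
sigma2 = 5                      # population variance from the earlier example

# All 6 samples of size 2 drawn without replacement
means = [sum(s) / n for s in combinations(population, n)]
grand_mean = sum(means) / len(means)
var_exact = sum((m - grand_mean) ** 2 for m in means) / len(means)

# Standard error from the formula: (sigma / sqrt(n)) * fpc
se_corrected = sqrt(sigma2 / n) * fpc(N, n)

print(var_exact, se_corrected ** 2)
```

Here n/N = 0.5, well above 0.05. The exact variance of the sample mean over the six samples agrees with the square of the corrected standard error.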

Degree of Freedom

• The number of independent observations which make up a statistic is known as the degrees of freedom (d.f.) associated with that statistic.
• d.f. is the number of values in the final calculation of a statistic that are free to vary.
• In general, d.f. of a statistic = (no. of independent observations) - (no. of parameters estimated)
• Assume four numbers a, b, c and d such that a + b + c + d = m.
• You are free to choose any three of the numbers at random.
• But the 4th must be chosen so that it makes the total equal to m.
• Thus your degree of freedom is three.

Chi Square Distribution

• Let us take a sample $x_1, x_2, \ldots, x_n$ from N(μ, σ).
• Define the statistic

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{x_i - \mu}{\sigma}\right)^2$$

• The symbol $\chi^2$ is read as Chi-Square, and the statistic has a Chi-Square Distribution with n degrees of freedom and range (0, ∞).
• This distribution is denoted as $\chi^2_{(n)}$.
• If we define the statistic as

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{\sigma}\right)^2$$

• The distribution of this statistic is $\chi^2_{(n-1)}$.

• If $X \sim \chi^2_{(k)}$, then $E(X) = k$, $Var(X) = 2k$.

[Figure: plot of the p.d.f. of the Chi-Square distribution with d.f. k]

• The modes are at X = k - 2 (k > 1).
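The mean k and variance 2k can be checked by simulation. A minimal sketch, assuming k = 5 and 20,000 replications (both chosen only for illustration), which builds each Chi-Square value as a sum of squared standard normal draws:

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def chi_square_draw(k):
    """Sum of squares of k independent N(0, 1) values."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(k))

k = 5  # illustrative degrees of freedom
draws = [chi_square_draw(k) for _ in range(20000)]

# Simulated mean and variance, to be compared with k and 2k
print(round(statistics.fmean(draws), 2), round(statistics.pvariance(draws), 2))
```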

Student’s t Distribution

• Let us take a sample $x_1, x_2, \ldots, x_n$ from N(μ, σ).
• Define the statistic

$$T = \frac{\bar{x} - \mu}{s_1/\sqrt{n}}, \quad \text{where } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2.$$

• Then T follows Student’s t Distribution with (n-1) d.f. and range (-∞, ∞).
• It is denoted as $T \sim t_{(n-1)}$.
• If $T \sim t_{(k)}$, then $E(T) = 0$, $Var(T) = \frac{k}{k-2}$, (k > 2).
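The statistic T can be simulated to see that its variance exceeds 1. The sketch below is an illustration under assumed values (μ = 50, σ = 10, n = 10, 20,000 replications), none of which come from the slides.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable
mu, sigma, n = 50, 10, 10  # illustrative population and sample size

def t_statistic():
    """T = (xbar - mu) / (s1 / sqrt(n)) for one normal sample of size n."""
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(sample)
    s1 = statistics.stdev(sample)  # uses divisor n - 1
    return (xbar - mu) / (s1 / n ** 0.5)

values = [t_statistic() for _ in range(20000)]
k = n - 1  # degrees of freedom

# Simulated mean and variance, then the theoretical variance k / (k - 2)
print(round(statistics.fmean(values), 3),
      round(statistics.pvariance(values), 3),
      round(k / (k - 2), 3))
```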

[Figure: pdf of Student’s t distribution with k d.f.]

• This distribution is symmetric about 0.
• Mean = Median = Mode = 0

[Figure: density curves (horizontal axis: t, centred at 0) of t (df = 5), t (df = 13) and the Standard Normal (t with df = ∞)]

• t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal.
• Note: t(n-1) → N(0,1) as n increases

Snedecor’s F Distribution

• Let X and Y be two independent random variables such that $X \sim \chi^2_{(d_1)}$ and $Y \sim \chi^2_{(d_2)}$.
• Define the statistic

$$F = \frac{X/d_1}{Y/d_2}$$

• F follows Snedecor’s F Distribution with d1 and d2 d.f. and range (0, ∞).
• It is denoted as F ~ F(d1, d2).

$$E(F) = \frac{d_2}{d_2 - 2},\ d_2 > 2 \quad \text{and} \quad Var(F) = \frac{2 d_2^2 (d_1 + d_2 - 2)}{d_1 (d_2 - 2)^2 (d_2 - 4)},\ d_2 > 4$$
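The definition of F as a ratio of two scaled Chi-Square variables can be tried out by simulation. A minimal sketch, assuming d1 = 5, d2 = 10 and 20,000 replications (illustrative choices), with each Chi-Square value built from squared standard normal draws:

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

def chi_square_draw(d):
    """Sum of squares of d independent N(0, 1) values."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(d))

def f_draw(d1, d2):
    """Ratio of two independent Chi-Square values, each divided by its d.f."""
    return (chi_square_draw(d1) / d1) / (chi_square_draw(d2) / d2)

d1, d2 = 5, 10  # illustrative degrees of freedom
draws = [f_draw(d1, d2) for _ in range(20000)]

# Simulated mean next to the theoretical mean d2 / (d2 - 2)
print(round(statistics.fmean(draws), 3), round(d2 / (d2 - 2), 3))
```

Only the mean is compared here, because the sample variance of F converges slowly when d2 is small.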

[Figure: pdf of Snedecor’s F distribution with d1 and d2 d.f.]

Summary

• Parameter and Statistic
• Unbiasedness
• Distribution of sample mean
• Distribution of sample proportion
• Central limit theorem
• Finite population correction
• Degree of Freedom
• Student’s t, Chi-Square and Snedecor’s F-distributions