
Class 2: Statistical Inference

Lionel Nesta

Observatoire Français des Conjonctures Economiques


CERAM February-March-April 2008

Hypothesis Testing

The Notion of Hypothesis in Statistics

Expectation

A hypothesis is a conjecture, an expected explanation of why a given phenomenon is occurring

Operationality

A hypothesis must be precise, univocal and quantifiable

Refutability

The result of a given experiment must give rise to either the refutation or the corroboration of the tested hypothesis

Replicability

Exclude ad hoc, local arrangements from the experiment, and seek universality

Examples of Good and Bad Hypotheses

« The Peugeot and Citroën stocks have the same variance »

« God exists! »

« In general, the closure of a given production site in Europe is positively associated with the share price of a given company on financial markets. »

« Knowledge has a positive impact on economic growth »

Hypothesis Testing

In statistics, hypothesis testing aims at accepting or rejecting a hypothesis

The statistical hypothesis is called the “null hypothesis” H0

The null hypothesis proposes something initially presumed true.

It is rejected only when it becomes evidently false, that is, when the researcher has a certain degree of confidence, usually 95% to 99%, that the data do not support the null hypothesis.

The alternative hypothesis (or research hypothesis) H1 is the complement of H0.

Hypothesis Testing

There are two kinds of hypothesis testing:

A homogeneity test compares the means of two samples, or the mean of one sample with a given value.

H0 : Mean(x) = Mean(y) ; Mean(x) = 0

H1 : Mean(x) ≠ Mean(y) ; Mean(x) ≠ 0

A conformity test looks at whether the distribution of a given sample follows the properties of a distribution law (normal, also called Gaussian; Poisson; binomial).

H0 : ℓ(x) = ℓ*(x)

H1 : ℓ(x) ≠ ℓ*(x)

The Four Steps of Hypothesis Testing

1. Spelling out the null hypothesis H0 and the alternative hypothesis H1.

2. Computation of a statistic corresponding to the distance between two sample means (homogeneity test) or between the sample and the distribution law (conformity test).

3. Computation of the (critical) probability of observing what one observes if H0 is true.

4. Conclusion of the test according to an agreed threshold around which one arbitrates between H0 and H1.

The Logic of Hypothesis Testing

We need to say something about the reliability (or representativeness) of a mean

Law of large numbers; central limit theorem

The notion of confidence interval

Once done, we can test whether two means are alike

If so (not), their confidence intervals are (not) overlapping

Statistical Inference

In real life calculating parameters of populations is prohibitive because populations are very large.

Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.

The sampling distribution of the statistic is the tool that tells us how close the statistic is to the parameter.

Prerequisite: Standard Normal Distribution

Two Prerequisites

Law of large numbers

The law of large numbers tells us that the sample mean will converge to the population (true) mean as the sample size increases.

Central Limit Theorem

The central limit theorem tells us that for many samples of like and sufficiently large size, the histogram of these sample means will appear to be a normal distribution.
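Both prerequisites can be illustrated with a short simulation. The sketch below is only an illustration under assumptions of my own: a fair six-sided die, a fixed random seed, samples of size 30 and 5,000 repetitions (none of these choices come from the slides).

```python
import random
import statistics

def sample_mean_of_dice(n, rng):
    """Roll a fair die n times and return the mean of the rolls."""
    return statistics.fmean(rng.randint(1, 6) for _ in range(n))

rng = random.Random(2008)  # fixed seed (arbitrary) so that the run is reproducible

# Law of large numbers: a single sample mean approaches 3.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, round(sample_mean_of_dice(n, rng), 3))
big_sample_mean = sample_mean_of_dice(100_000, rng)

# Central limit theorem: many sample means of like size (n = 30).
means = [sample_mean_of_dice(30, rng) for _ in range(5_000)]
centre = statistics.fmean(means)
spread = statistics.stdev(means)

# For a normal distribution, about 95% of values lie within two
# standard deviations of the centre.
within_two = sum(abs(m - centre) <= 2 * spread for m in means) / len(means)
print(round(centre, 2), round(spread, 2), round(within_two, 3))
```

The share of sample means within two standard deviations of their centre should come out close to 95%, which is what a normal histogram would give.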

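The Dice Experiment

Value  P(X = x)
1      1/6
2      1/6
3      1/6
4      1/6
5      1/6
6      1/6

$$\mu_X = E(X) = \sum_{x=1}^{6} x \, P(X = x) = \frac{1}{6}\left(1 + 2 + \dots + 6\right) = \frac{21}{6} = 3.5$$

[Figure: bar chart of P(X = x) for x = 1 to 6; vertical axis from 0.00 to 0.20]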


The Dice Experiment (n = 2)

$$\mu_{\bar{X}} = E(\bar{X}) = \frac{1}{36}\bar{X}_1 + \frac{1}{36}\bar{X}_2 + \dots + \frac{1}{36}\bar{X}_{36} = 3.5$$

[Figure: distribution of the sample mean for n = 2; horizontal axis: sample mean from 1 to 6 in steps of 0.5; vertical axis: probability from 1/36 to 6/36]

No.  Sample  Mean    No.  Sample  Mean    No.  Sample  Mean
1    1,1     1       13   3,1     2       25   5,1     3
2    1,2     1.5     14   3,2     2.5     26   5,2     3.5
3    1,3     2       15   3,3     3       27   5,3     4
4    1,4     2.5     16   3,4     3.5     28   5,4     4.5
5    1,5     3       17   3,5     4       29   5,5     5
6    1,6     3.5     18   3,6     4.5     30   5,6     5.5
7    2,1     1.5     19   4,1     2.5     31   6,1     3.5
8    2,2     2       20   4,2     3       32   6,2     4
9    2,3     2.5     21   4,3     3.5     33   6,3     4.5
10   2,4     3       22   4,4     4       34   6,4     5
11   2,5     3.5     23   4,5     4.5     35   6,5     5.5
12   2,6     4       24   4,6     5       36   6,6     6
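The 36 samples and the distribution of their means can be enumerated exactly rather than read off the table. A minimal sketch, assuming a fair die and ordered pairs of rolls; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered samples of two rolls.
samples = list(product(range(1, 7), repeat=2))
sample_means = [Fraction(a + b, 2) for a, b in samples]

# Distribution of the sample mean: each sample has probability 1/36.
counts = Counter(sample_means)
distribution = {m: Fraction(c, 36) for m, c in sorted(counts.items())}
for m, p in distribution.items():
    print(float(m), p)  # Fraction prints in lowest terms, e.g. 6/36 as 1/6

# Expected value of the sample mean.
expected = sum(m * p for m, p in distribution.items())
print(float(expected))
```

The probabilities rise from 1/36 at a mean of 1 to 6/36 at a mean of 3.5 and fall back symmetrically, which is the triangular shape of the n = 2 histogram.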

The Normal Distribution

In probability, a random variable follows a normal distribution law (also called Gaussian, Laplace-Gauss distribution law) of expectation μ and standard deviation σ if its probability density function (pdf) is such that

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

This law is written N(μ,σ²). The density function of a normal distribution is symmetrical.

Normal Distributions for Different Values of μ and σ

[Figure: normal density curves for (μ=0;σ=1), (μ=0.5;σ=1.1) and (μ=-2;σ=0.5); horizontal axis from -5 to 5]

The Standard Normal Distribution

The standard normal distribution, also called Z distribution, represents a probability density function with mean μ = 0 and standard deviation σ = 1. It is written as N(0,1).

Any random variable following a normal law can be standardized via the following transformation

$$z = \frac{x - \mu}{\sigma}$$

[Figure: The Standard Normal Distribution (μ=0; σ=1); horizontal axis from -5 to 5]

[Figure: standard normal density with the ranges covering 68%, 95% and 99.7% of observations marked]

[Figure: standard normal density with the central 95% of observations and 2.5% in each tail]

[Figure: standard normal density split at zero into P(Z < 0) and P(Z ≥ 0)]

The Standard Normal Distribution (z scores)

[Figure: standard normal density with the area P(Z ≥ 0.51) marked; horizontal axis from -5 to 5]

Probability of an event (z = 0.51)

The z-score is used to compute the probability of obtaining an observed score.

Example

Let z = 0.51. What is the probability of observing z = 0.51?

It is the probability of observing z ≥ 0.51: P(z ≥ 0.51) = ??

Standard Normal Distribution Table

Each cell gives P(Z ≥ z); the row gives z to one decimal and the column adds the second decimal.

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.0 0.500 0.496 0.492 0.488 0.484 0.480 0.476 0.472 0.468 0.464

0.1 0.460 0.456 0.452 0.448 0.444 0.440 0.436 0.433 0.429 0.425

0.2 0.421 0.417 0.413 0.409 0.405 0.401 0.397 0.394 0.390 0.386

0.3 0.382 0.378 0.375 0.371 0.367 0.363 0.359 0.356 0.352 0.348

0.4 0.345 0.341 0.337 0.334 0.330 0.326 0.323 0.319 0.316 0.312

0.5 0.309 0.305 0.302 0.298 0.295 0.291 0.288 0.284 0.281 0.278

0.6 0.274 0.271 0.268 0.264 0.261 0.258 0.255 0.251 0.248 0.245

0.7 0.242 0.239 0.236 0.233 0.230 0.227 0.224 0.221 0.218 0.215

0.8 0.212 0.209 0.206 0.203 0.201 0.198 0.195 0.192 0.189 0.187

0.9 0.184 0.181 0.179 0.176 0.174 0.171 0.169 0.166 0.164 0.161

1.0 0.159 0.156 0.154 0.152 0.149 0.147 0.145 0.142 0.140 0.138

1.6 0.055 0.054 0.053 0.052 0.050 0.050 0.049 0.048 0.047 0.046

1.9 0.029 0.028 0.027 0.027 0.026 0.026 0.025 0.024 0.024 0.023

2.0 0.023 0.022 0.022 0.021 0.021 0.020 0.020 0.019 0.019 0.018

2.5 0.006 0.006 0.006 0.006 0.006 0.005 0.005 0.005 0.005 0.005

2.9 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.001 0.001
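The entries of the table are upper-tail probabilities, so they can be recomputed rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `upper_tail` is my own.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

def upper_tail(z):
    """Return P(Z >= z) for the standard normal distribution."""
    return 1.0 - Z.cdf(z)

# Rebuild the z = 0.5 row of the table (columns 0.00 to 0.09).
row = [round(upper_tail(0.5 + k / 100), 3) for k in range(10)]
print(row)

# The entry used in the z = 0.51 example.
print(round(upper_tail(0.51), 4))
```

Reading the table at row 0.5 and column 0.01 corresponds to calling `upper_tail(0.51)`.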

Probability of an event (Z = 0.51)

The Z-score is used to compute the probability of obtaining

an observed score.

Example

Let z = 0.51. What is the probability of observing z=0.51?

It is the probability of observing z ≥ 0.51: P(z ≥ 0.51)

P(z ≥ 0.51) = 0.3050

Example

Suppose that for a population of students of a famous business school in Sophia-Antipolis, grades are normally distributed with an average of 10 and a standard deviation of 3. What proportion of them

Exceeds 12; exceeds 15

Does not exceed 8; does not exceed 12

Let the mean μ = 10 and standard deviation σ = 3:

$$z = \frac{12 - 10}{3} = 0.66, \quad P(z \ge 0.66) = 0.255 = 25.5\%$$

$$z = \frac{15 - 10}{3} = 1.66, \quad P(z \ge 1.66) = 0.049 = 4.9\%$$

$$z = \frac{8 - 10}{3} = -0.66, \quad P(z \le -0.66) = P(z \ge 0.66) = 0.255 = 25.5\%$$

$$z = \frac{12 - 10}{3} = 0.66, \quad P(z \le 0.66) = 1 - P(z \ge 0.66) = 1 - 0.255 = 74.5\%$$
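The four proportions can be checked numerically. This is a sketch using `statistics.NormalDist` with the mean and standard deviation of the example; the variable names are illustrative. The results differ slightly from the table-based values because the slides cut z to two decimals (0.66 and 1.66) before the lookup, whereas the code works with the unrounded z.

```python
from statistics import NormalDist

grades = NormalDist(mu=10, sigma=3)  # grade distribution of the example

exceeds_12 = 1 - grades.cdf(12)   # P(X > 12)
exceeds_15 = 1 - grades.cdf(15)   # P(X > 15)
at_most_8 = grades.cdf(8)         # P(X <= 8)
at_most_12 = grades.cdf(12)       # P(X <= 12)

for label, p in [
    ("exceeds 12", exceeds_12),
    ("exceeds 15", exceeds_15),
    ("does not exceed 8", at_most_8),
    ("does not exceed 12", at_most_12),
]:
    print(f"{label}: {p:.1%}")
```

By symmetry of the normal density, the share below 8 equals the share above 12, as in the third line of the worked example.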

Confidence Interval

Inverting the way of thinking

Until now, we have thought in terms of observations x and sample values μ and σ to produce the z score.

Let us now imagine that we do not know x, we know μ and σ. If we consider any interval, we can write:

$$z = \frac{x - \mu}{\sigma} \quad \Rightarrow \quad z\sigma = x - \mu$$

$$\mu - z\sigma \le x \le \mu + z\sigma$$

Inverting the way of thinking

If z∈[-2.58;+2.58] we know that 99% of z-scores will fall within the range

If z∈[-1.64;+1.64] we know that 90% of z-scores will fall within the range

Let us now consider an interval which comprises 95% of observations. Looking at the z table, we know that z=1.96

$$\Pr\left(-1.96 \le \frac{x - \mu}{\sigma} \le 1.96\right) = 0.95$$
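The z value attached to a given coverage can be obtained by inverting the standard normal distribution instead of scanning the table. A minimal sketch; the function name `central_z` is an assumption of this example.

```python
from statistics import NormalDist

def central_z(coverage):
    """Return z such that P(-z <= Z <= z) equals the given coverage."""
    tail = (1 - coverage) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)

for coverage in (0.90, 0.95, 0.99):
    print(f"{coverage:.0%}: z = {central_z(coverage):.3f}")
```

The two tails share the probability left outside the interval, which is why a 95% interval is read at an upper-tail probability of 2.5%.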

Confidence Interval

In statistics, a confidence interval is an interval within which the value of a parameter (here, the mean) is likely to be. Instead of estimating the parameter by a single value, an interval of likely estimates is given.

Confidence intervals are used to indicate the reliability of an estimate.

A1. The sample mean is a random variable following a normal distribution

A2. The sample values μ and σ are good approximations of the population values.

The Central Limit Theorem

If a random sample is drawn from any population, the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size.

The larger the sample size, the more closely the sampling distribution of $\bar{x}$ will resemble a normal distribution.

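Moments of Sample Mean: The Mean

$$\bar{X} = \frac{1}{n}\left(X_1 + X_2 + \dots + X_n\right)$$

$$E(\bar{X}) = \frac{1}{n} E\left(X_1 + X_2 + \dots + X_n\right) = \frac{1}{n}\left[E(X_1) + E(X_2) + \dots + E(X_n)\right] = \frac{1}{n}\, n\, E(X) = \mu$$

On average, the sample mean will be on target, that is, equal to the population mean.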

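Moments of Sample Mean: The Variance

For independent draws:

$$\operatorname{var}(\bar{X}) = \operatorname{var}\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \dots + \frac{1}{n}X_n\right) = \frac{1}{n^2}\left[\operatorname{var}(X_1) + \operatorname{var}(X_2) + \dots + \operatorname{var}(X_n)\right] = \frac{1}{n^2}\, n\, \sigma^2 = \frac{\sigma^2}{n}$$

Standard error of $\bar{X}$:

$$\sigma_{\bar{X}} = \sqrt{\operatorname{var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}$$

The standard deviation of the sample means represents the estimation error of the sample mean, and therefore it is called the standard error.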
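The two moments can be verified exactly on the dice experiment by enumerating every possible sample. The sketch below assumes a fair die and independent rolls; `moments_of_sample_mean` is a hypothetical helper written for this check.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
mu = Fraction(sum(faces), 6)                        # population mean: 7/2
sigma2 = sum((x - mu) ** 2 for x in faces) / 6      # population variance: 35/12

def moments_of_sample_mean(n):
    """Exact mean and variance of the sample mean over all 6**n samples."""
    means = [Fraction(sum(s), n) for s in product(faces, repeat=n)]
    mean = sum(means) / len(means)
    var = sum((m - mean) ** 2 for m in means) / len(means)
    return mean, var

for n in (1, 2, 3, 4):
    mean, var = moments_of_sample_mean(n)
    # The mean stays at mu and the variance equals sigma2 / n.
    print(n, mean, var, var == sigma2 / n)
```

Exact fractions are used so that the equality with σ²/n is checked without floating-point error.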

The Sampling Distribution of the Sample Mean

1. $\mu_{\bar{x}} = \mu_x$

2. $\sigma^2_{\bar{x}} = \dfrac{\sigma^2_x}{n}$

3. If x is normal, $\bar{x}$ is normal. If x is nonnormal, $\bar{x}$ is approximately normally distributed for sufficiently large sample size.

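Confidence Interval

General definition

$$\bar{X} - z_{pc}\frac{\sigma}{\sqrt{N}} \le \mu \le \bar{X} + z_{pc}\frac{\sigma}{\sqrt{N}}$$

Definition for 90% CI

$$\bar{X} - 1.64\frac{\sigma}{\sqrt{N}} \le \mu \le \bar{X} + 1.64\frac{\sigma}{\sqrt{N}}$$

Definition for 95% CI

$$\bar{X} - 1.96\frac{\sigma}{\sqrt{N}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{N}}$$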

Standard Normal Distribution and CI

[Figure: standard normal density with the ranges covering 90%, 95% and 99.7% of observations marked; horizontal axis from -5 to 5]

Application of Confidence Interval

Let us draw a sample of 25 students from CERAM (n = 25), with sample mean X̄ = 10 and σ = 3. Let us build the 95% CI

$$10 - 1.96\frac{3}{\sqrt{25}} \le \mu \le 10 + 1.96\frac{3}{\sqrt{25}} \iff 8.8 \le \mu \le 11.2$$

[Figure: CERAM average grades; horizontal axis from 0 to 20, with the interval from 8.8 to 11.2 marked: 95% of chances that the mean is indeed located within this interval]

Let us draw a sample of 30 students from HEC (n = 30), with sample mean X̄ = 11.5 and σ = 4.7. Let us build the 95% CI

$$11.5 - 1.96\frac{4.7}{\sqrt{30}} \le \mu \le 11.5 + 1.96\frac{4.7}{\sqrt{30}} \iff 9.8 \le \mu \le 13.2$$

[Figure: HEC average grades; horizontal axis from 0 to 20, with the interval from 9.8 to 13.2 marked: 95% of chances that the mean is indeed located within this interval]
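Both intervals follow from the same formula, so a small helper is enough to reproduce them. A sketch only: the function name and the default z = 1.96 are choices made for this example, and the inputs are the CERAM and HEC figures quoted above.

```python
from math import sqrt

def z_confidence_interval(mean, sigma, n, z=1.96):
    """Return (lower, upper) for the CI mean +/- z * sigma / sqrt(n)."""
    half_width = z * sigma / sqrt(n)
    return mean - half_width, mean + half_width

ceram = z_confidence_interval(10, 3, 25)
hec = z_confidence_interval(11.5, 4.7, 30)

print("CERAM:", [round(bound, 1) for bound in ceram])
print("HEC:  ", [round(bound, 1) for bound in hec])
```

Passing `z=1.64` instead gives the 90% interval of the general definition.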

Hypothesis Testing

Hypothesis 1: Students from CERAM have an average grade which is not significantly different from 11

H0 : μ(CERAM) = 11

H1 : μ(CERAM) ≠ 11

The value 11 lies inside the 95% CI of CERAM [8.8; 11.2]: I accept H0 and reject H1

Hypothesis 2: Students from CERAM have similar grades as students from HEC

H0 : μ(CERAM) = μ(HEC)

H1 : μ(CERAM) ≠ μ(HEC)

The 95% CIs of CERAM [8.8; 11.2] and HEC [9.8; 13.2] overlap: I accept H0 and reject H1

Comparing the Means Using CI’s

[Figure: sampling distributions of the mean for HEC (μ=11.5;σ=4.7) and CERAM (μ=10;σ=3) on a common axis from 0 to 20]

The overlap of the two CIs means that at the 95% level, the two means do not differ significantly.

The Student Test

Thus far, we have assumed that we know both the mean and the standard deviation of the population. But in fact, we do not know them: both μ and σ are unknown.

The Student t statistic is then preferred to the z statistic. Its distribution is similar (identical to z as n → +∞). The CI becomes

$$\bar{X} \pm t^{df}_{cp}\,\frac{s}{\sqrt{N}}$$

where s is the sample standard deviation and df = N − 1 is the number of degrees of freedom.

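Application of Student t to CI’s

Let us draw a sample of 25 students from CERAM (n = 25), with sample mean X̄ = 10 and s = 3. Let us build the 95% CI

$$10 - t^{24}_{2.5}\frac{3}{\sqrt{25}} \le \mu \le 10 + t^{24}_{2.5}\frac{3}{\sqrt{25}}$$

$$10 - 2.06\frac{3}{\sqrt{25}} \le \mu \le 10 + 2.06\frac{3}{\sqrt{25}} \iff 8.76 \le \mu \le 11.24$$

Let us draw a sample of 30 students from HEC (n = 30), with sample mean X̄ = 11.5 and s = 4.7. Let us build the 95% CI

$$11.5 - 2.06\frac{4.7}{\sqrt{30}} \le \mu \le 11.5 + 2.06\frac{4.7}{\sqrt{30}} \iff 9.73 \le \mu \le 13.27$$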
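The same computation can be sketched in code. The Python standard library has no Student t distribution, so the critical values below are typed in from a standard t table and should be read as assumed inputs: 2.064 for 24 degrees of freedom and 2.045 for 29. The slides use 2.06 for both samples; with the 29-df value the HEC interval comes out slightly narrower than the one shown.

```python
from math import sqrt

# Two-sided 95% critical values taken from a standard Student t table
# (assumed inputs, keyed by degrees of freedom).
T_CRIT_95 = {24: 2.064, 29: 2.045}

def t_confidence_interval(mean, s, n):
    """Return (lower, upper) for the CI mean +/- t * s / sqrt(n), df = n - 1."""
    df = n - 1
    half_width = T_CRIT_95[df] * s / sqrt(n)
    return mean - half_width, mean + half_width

ceram = t_confidence_interval(10, 3, 25)
hec = t_confidence_interval(11.5, 4.7, 30)

print("CERAM:", [round(bound, 2) for bound in ceram])
print("HEC:  ", [round(bound, 2) for bound in hec])
```

Both intervals are a little wider than their z-based counterparts, which reflects the extra uncertainty from estimating σ by s.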

SPSS Application: Student t

Import CERAM_LMC into SPSS

Produce descriptive statistics for sales, labour, and R&D expenses

Analyse > Statistiques descriptives > Descriptive (in the English interface: Analyze > Descriptive Statistics > Descriptives)

Options: choose the statistics you may wish

A newspaper writes that by and large, LMCs have 95,000 employees.

Test statistically whether this is true at 1% level

Test statistically whether this is true at 5% level

Test statistically whether this is true at 10% and 20% level

Write out H0 and H1

Analyse > Comparer les moyennes > Test t pour échantillon unique (in the English interface: Analyze > Compare Means > One-Sample T Test)

Options: 99, 95, 90%

SPSS Application: t test at 99% level

One-sample statistics (Statistiques sur échantillon unique)

Variable  N     Mean      Std. deviation  Std. error of the mean
labour    1634  91298.87  96400.957       2384.818

One-sample test (Test sur échantillon unique), test value = 95000

Variable  t       df    Sig. (2-tailed)  Mean difference
labour    -1.552  1633  .121             -3701.130

Confidence interval of the difference

Level  Lower     Upper
99%    -9851.20  2448.94
95%    -8378.75  976.50
80%    -6758.63  -643.63

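SPSS Results (at 1% level)

$$\bar{X} - 95000 - t_{0.01/2}\frac{s}{\sqrt{N}} \le \mu - 95000 \le \bar{X} - 95000 + t_{0.01/2}\frac{s}{\sqrt{N}}$$

$$\bar{X} - 95000 - 2.579\frac{96400}{\sqrt{1634}} \le \mu - 95000 \le \bar{X} - 95000 + 2.579\frac{96400}{\sqrt{1634}}$$

$$-9851.20 \le \mu - 95000 \le 2448.94$$

$$85148.8 \le \mu \le 97448.94$$

$$\Pr\left(85148.8 \le \mu \le 97448.94\right) = 0.99$$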

Critical probability

The confidence interval is designed in such a way that each t statistic chosen defines the share of observations which the CI comprises.

For large n, when t = 1.96, we have 95% CI

For large n, when t = 2.58, we have 99% CI

Actually, to each t there corresponds a share of observations: http://www.socr.ucla.edu/Applets.dir/T-table.html

One can compute directly the t value from our observations as follows:

$$t = \frac{95000 - \bar{X}}{\sqrt{s^2 / N}} = \frac{95000 - 91298}{\sqrt{96400^2 / 1634}} = \frac{3702}{2384} = 1.552$$
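The t value and its critical probability can be recomputed from the summary statistics of the SPSS output. This is a sketch under one stated assumption: with 1,633 degrees of freedom the Student t distribution is approximated by the standard normal, since the Python standard library offers no t distribution. The sign of t follows the SPSS convention (sample mean minus test value).

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, s = 1634, 91298.87, 96400.957   # from the SPSS output
mu_0 = 95000                                    # value claimed by the newspaper

standard_error = s / sqrt(n)
t = (sample_mean - mu_0) / standard_error

# Two-sided critical probability, using the normal approximation
# to the t distribution (assumption: df = 1633 is large enough).
p_value = 2 * (1 - NormalDist().cdf(abs(t)))

print(round(standard_error, 3), round(t, 3), round(p_value, 3))
```

The three printed values can be compared with the standard error of the mean, the t statistic and the two-tailed significance reported by SPSS.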

Critical probability

With t = 1.552, I can conclude the following:

If the population mean is 95,000 (H0 true), there is a 12% probability of observing a t statistic at least this far from zero (6.1% in each tail)

I have a 12% chance of wrongly rejecting H0

If H0 is true, 88% of t statistics fall between -1.552 and +1.552

Shall I then accept or reject H0?

[Figure: t distribution with 88.0% of observations between -1.552 and +1.552 and 6.1% in each tail]

I accept H0 !!!

The practice is to reject H0 only when the critical probability is lower than 0.1, or 10%.

Some are even more cautious and prefer to reject H0 at a critical probability level of 0.05, or 5%.

In any case, the philosophy of the statistician is to be conservative.
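The decision rule amounts to comparing the critical probability with the agreed threshold. A minimal sketch; the function name `decide` and its return strings are illustrative, and 0.121 is the two-tailed significance reported by SPSS for the labour variable.

```python
def decide(p_value, alpha):
    """Reject H0 only when the critical probability is below the threshold."""
    return "reject H0" if p_value < alpha else "accept H0"

# Critical probability of the one-sample test on labour (SPSS: .121).
for alpha in (0.10, 0.05, 0.01):
    print(f"threshold {alpha:.0%}: {decide(0.121, alpha)}")
```

A lower threshold makes rejection harder, which is the conservative stance described above.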

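A Direct Comparison of Means Using Student t

Another way to compare two sample means is to calculate the CI of the mean difference. If 0 does not belong to the CI, then the two samples have significantly different means.

$$\left(\bar{X}_1 - \bar{X}_2\right) \pm t_{pc}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$$s_p^2 = \frac{\sum \left(X_1 - \bar{X}_1\right)^2 + \sum \left(X_2 - \bar{X}_2\right)^2}{(n_1 - 1) + (n_2 - 1)}$$

The term $s_p\sqrt{1/n_1 + 1/n_2}$ is the standard error of the difference; $s_p^2$ is called the pooled variance.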
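The pooled computation can be sketched from summary statistics alone, since the sum of squared deviations of a group equals (n − 1) times its sample variance. The inputs below are the group means, standard deviations and sizes for US and non-US firms reported in the SPSS output of this class; the function names and the large-sample critical value 1.96 are assumptions of this example.

```python
from math import sqrt

def pooled_t_test(mean1, s1, n1, mean2, s2, n2):
    """Return (difference, standard error, t, df) under equal variances."""
    df = (n1 - 1) + (n2 - 1)
    # Sum of squared deviations of each group is (n - 1) * s**2.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    difference = mean1 - mean2
    return difference, standard_error, difference / standard_error, df

def confidence_interval(difference, standard_error, t_crit):
    """Return (lower, upper) for difference +/- t_crit * standard_error."""
    half_width = t_crit * standard_error
    return difference - half_width, difference + half_width

# US firms (group 1) against firms from the rest of the world (group 2).
diff, se, t_stat, df = pooled_t_test(97808.99, 112765.1, 628,
                                     87234.90, 84403.469, 1006)
lower, upper = confidence_interval(diff, se, t_crit=1.96)

print(round(diff, 2), round(se, 1), round(t_stat, 3), df)
print(round(lower, 1), round(upper, 1))
```

Because 0 lies outside the interval obtained with 1.96, the two means differ significantly at the 5% level under this approximation.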

SPSS Application: t test comparing means

Another newspaper argues that US companies are much larger than those from the rest of the world. Is this true?

Produce descriptive statistics for labour comparing the two groups

Produce a group variable which equals 1 for US firms, 0 otherwise

This is called a dummy variable

Write out H0 and H1

Analyse > Comparer les moyennes > Test t pour échantillons indépendants (in the English interface: Analyze > Compare Means > Independent-Samples T Test)

What do you conclude at 5% level?

What do you conclude at 1% level?


Group statistics (Statistiques de groupe)

Variable  AM  N     Mean      Std. deviation  Std. error of the mean
labour    1   628   97808.99  112765.1        4499.817
labour    0   1006  87234.90  84403.469       2661.101

Independent samples test (Test d'échantillons indépendants)

Levene's test for equality of variances: F = .024, Sig. = .877

t-test for equality of means

labour                       t      df        Sig. (2-tailed)  Mean difference  Std. error of the difference
Equal variances assumed      2.159  1632      .031             10574.084        4897.135
Equal variances not assumed  2.023  1061.268  .043             10574.084        5227.792

Confidence interval of the difference

labour                       95% lower  95% upper  99% lower  99% upper
Equal variances assumed      968.751    20179.417  -2054.870  23203.038
Equal variances not assumed  316.102    20832.067  -2916.075  24064.243
