class 2 statistical inference lionel nesta observatoire français des conjonctures economiques...
Post on 18-Dec-2015
![Page 1: Class 2 Statistical Inference Lionel Nesta Observatoire Français des Conjonctures Economiques Lionel.nesta@ofce.sciences-po.fr CERAM February-March-April](https://reader036.vdocument.in/reader036/viewer/2022081519/56649d255503460f949fbad6/html5/thumbnails/1.jpg)
Class 2: Statistical Inference
Lionel Nesta
Observatoire Français des Conjonctures Economiques
CERAM February-March-April 2008
Hypothesis Testing
The Notion of Hypothesis in Statistics
Expectation
A hypothesis is a conjecture, an expected explanation of why a given
phenomenon is occurring.
Operationality
A hypothesis must be precise, unequivocal and quantifiable.
Refutability
The result of a given experiment must lead to either the refutation or the
corroboration of the tested hypothesis.
Replicability
Exclude ad hoc, local arrangements from the experiment, and seek universality.
Examples of Good and Bad Hypotheses
« The shares of Peugeot and Citroën have the same variance »
« God exists! »
« In general, the closure of a given production site in Europe is positively
associated with the share price of a given company on financial markets. »
« Knowledge has a positive impact on economic growth »
Hypothesis Testing
In statistics, hypothesis testing aims at accepting or rejecting a
hypothesis.
The statistical hypothesis is called the “null hypothesis” H0
The null hypothesis proposes something initially presumed true.
It is rejected only when it becomes evidently false, that is, when the
researcher has a certain degree of confidence, usually 95% to 99%,
that the data do not support the null hypothesis.
The alternative hypothesis (or research hypothesis) H1 is the
complement of H0.
Hypothesis Testing
There are two kinds of hypothesis testing:
A homogeneity test compares the means of two samples, or one sample mean with a reference value.
H0 : Mean(x) = Mean(y) ; Mean(x) = 0
H1 : Mean(x) ≠ Mean(y) ; Mean(x) ≠ 0
A conformity test looks at whether the distribution of a given sample follows
the properties of a distribution law (normal, Poisson, binomial).
H0 : ℓ(x) = ℓ*(x)
H1 : ℓ(x) ≠ ℓ*(x)
The Four Steps of Hypothesis Testing
1. Spelling out the null hypothesis H0 and the alternative
hypothesis H1.
2. Computation of a statistic corresponding to the distance
between two sample means (homogeneity test) or between the
sample and the distribution law (conformity test).
3. Computation of the (critical) probability of observing what one
observes if H0 is true.
4. Conclusion of the test according to an agreed threshold around
which one arbitrates between H0 and H1.
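The four steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population standard deviation is treated as known (so a z statistic is used), the numbers are illustrative, and the helper name `one_sample_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, sigma, n, mu0, alpha=0.05):
    """Two-sided test of H0: mu = mu0 against H1: mu != mu0."""
    # Step 2: distance between the sample mean and mu0, in standard errors.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 3: critical probability (two-sided p-value) under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: compare the critical probability with the agreed threshold.
    reject_h0 = p_value < alpha
    return z, p_value, reject_h0


# Step 1: H0: mu = 11 versus H1: mu != 11 (illustrative numbers).
z, p_value, reject_h0 = one_sample_z_test(sample_mean=10, sigma=3, n=25, mu0=11)
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The threshold `alpha` is the only place where the analyst's judgement enters; everything else follows mechanically from the data.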
The Logic of Hypothesis Testing
We need to say something about the reliability (or
representativeness) of a mean:
The law of large numbers; the central limit theorem
The notion of confidence interval
Once this is done, we can assess whether two means are alike.
If so (not), their confidence intervals are (not) overlapping.
Statistical Inference
In real life calculating parameters of populations is prohibitive because populations are very large.
Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.
The sampling distribution of the statistic is the tool that tells us how close the statistic is to the parameter.
Prerequisite: Standard Normal Distribution
Two Prerequisites
Law of large numbers
The law of large numbers tells us that the sample mean will converge
to the population (true) mean as the sample size increases.
Central Limit Theorem
The Central Limit Theorem tells us that, for many samples of like and
sufficiently large size, the histogram of the sample means will
appear to be a normal distribution.
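The law of large numbers can be illustrated with a small simulation. The sketch below assumes a fair six-sided die (population mean 3.5); the seed and the sample sizes are arbitrary choices, and `mean_of_rolls` is a hypothetical helper.

```python
import random

rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility


def mean_of_rolls(n):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


# The population mean of a fair die is 3.5; larger samples tend to land closer.
for n in (10, 1_000, 100_000):
    print(n, round(mean_of_rolls(n), 3))
```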
The Dice Experiment
Value   P(X = x)
1       1/6
2       1/6
3       1/6
4       1/6
5       1/6
6       1/6
$$\mu_X = E(X) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = \frac{21}{6} = 3.5$$
[Figure: bar chart of P(X = x) for x = 1, ..., 6; each bar has height 1/6]
The Dice Experiment (n = 2)
No.  Sample  Mean    No.  Sample  Mean    No.  Sample  Mean
 1   1,1     1       13   3,1     2       25   5,1     3
 2   1,2     1.5     14   3,2     2.5     26   5,2     3.5
 3   1,3     2       15   3,3     3       27   5,3     4
 4   1,4     2.5     16   3,4     3.5     28   5,4     4.5
 5   1,5     3       17   3,5     4       29   5,5     5
 6   1,6     3.5     18   3,6     4.5     30   5,6     5.5
 7   2,1     1.5     19   4,1     2.5     31   6,1     3.5
 8   2,2     2       20   4,2     3       32   6,2     4
 9   2,3     2.5     21   4,3     3.5     33   6,3     4.5
10   2,4     3       22   4,4     4       34   6,4     5
11   2,5     3.5     23   4,5     4.5     35   6,5     5.5
12   2,6     4       24   4,6     5       36   6,6     6
$$\mu_{\bar{X}} = E(\bar{X}) = \frac{1}{36}\bar{X}_1 + \frac{1}{36}\bar{X}_2 + \dots + \frac{1}{36}\bar{X}_{36} = 3.5$$
[Figure: sampling distribution of the sample mean for n = 2; x-axis: sample mean from 1 to 6 in steps of 0.5; y-axis: probability from 1/36 to 6/36]
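The sampling distribution for n = 2 can be rebuilt by enumerating the 36 equally likely samples. The following sketch uses exact fractions; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# All 36 equally likely ordered samples of size n = 2 and their means.
sample_means = [Fraction(a + b, 2) for a, b in product(faces, repeat=2)]

# Sampling distribution: probability of each possible sample mean.
distribution = {
    m: Fraction(count, len(sample_means))
    for m, count in sorted(Counter(sample_means).items())
}

# Expected value of the sample mean, computed exactly.
expected_mean = sum(m * p for m, p in distribution.items())

for m, p in distribution.items():
    print(float(m), p)
print("E(sample mean) =", float(expected_mean))
```

The probabilities rise from 1/36 at a mean of 1 to 6/36 at a mean of 3.5 and fall back symmetrically, which is the triangular shape of the histogram.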
The Normal Distribution
In probability, a random variable follows a normal distribution
law (also called Gaussian or Laplace-Gauss distribution law) of
expectation μ and standard deviation σ if its probability
density function (pdf) is such that
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
This law is written N(μ, σ²). The density function of a normal
distribution is symmetrical.
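As a quick check of the density formula, the sketch below writes it out by hand and compares it with `statistics.NormalDist` from the Python standard library. The function name `normal_pdf` and the evaluation points are illustrative choices.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


# Symmetry around mu: points equally far from the mean have equal density.
print(normal_pdf(-1.0, 0.0, 1.0), normal_pdf(1.0, 0.0, 1.0))
# Cross-check against the standard library implementation.
print(normal_pdf(0.5, 0.5, 1.1), NormalDist(0.5, 1.1).pdf(0.5))
```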
Normal Distributions for Different Values of μ and σ
[Figure: normal density curves for (μ=0; σ=1), (μ=0.5; σ=1.1) and (μ=-2; σ=0.5); x-axis from -5 to 5]
The Standard Normal Distribution
The standard normal distribution, also called the Z distribution,
is the normal distribution with mean μ = 0 and
standard deviation σ = 1. It is written N(0,1).
Any random variable following a normal law can be standardized via
the following transformation:
$$z = \frac{x - \mu}{\sigma}$$
The Standard Normal Distribution (μ=0; σ=1)
[Figure: standard normal density curve; x-axis from -5 to 5]
The Standard Normal Distribution
[Figure: standard normal density; 68%, 95% and 99.7% of observations lie within 1, 2 and 3 standard deviations of the mean, respectively]
The Standard Normal Distribution
[Figure: standard normal density; the central area holds 95% of observations, with 2.5% in each tail]
The Standard Normal Distribution (z scores)
[Figure: standard normal density split at 0 into the areas P(Z < 0) and P(Z ≥ 0)]
Probability of an event (z = 0.51)
[Figure: standard normal density with the upper-tail area P(Z ≥ 0.51) marked]
Probability of an event (z = 0.51)
The z-score is used to compute the probability of
obtaining an observed score.
Example
Let z = 0.51. What is the probability of observing
z=0.51?
It is the probability of observing z ≥ 0.51: P(z ≥ 0.51)
= ??
Standard Normal Distribution Table (entries give P(Z ≥ z))
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.500 0.496 0.492 0.488 0.484 0.480 0.476 0.472 0.468 0.464
0.1 0.460 0.456 0.452 0.448 0.444 0.440 0.436 0.433 0.429 0.425
0.2 0.421 0.417 0.413 0.409 0.405 0.401 0.397 0.394 0.390 0.386
0.3 0.382 0.378 0.375 0.371 0.367 0.363 0.359 0.356 0.352 0.348
0.4 0.345 0.341 0.337 0.334 0.330 0.326 0.323 0.319 0.316 0.312
0.5 0.309 0.305 0.302 0.298 0.295 0.291 0.288 0.284 0.281 0.278
0.6 0.274 0.271 0.268 0.264 0.261 0.258 0.255 0.251 0.248 0.245
0.7 0.242 0.239 0.236 0.233 0.230 0.227 0.224 0.221 0.218 0.215
0.8 0.212 0.209 0.206 0.203 0.201 0.198 0.195 0.192 0.189 0.187
0.9 0.184 0.181 0.179 0.176 0.174 0.171 0.169 0.166 0.164 0.161
1.0 0.159 0.156 0.154 0.152 0.149 0.147 0.145 0.142 0.140 0.138
1.6 0.055 0.054 0.053 0.052 0.050 0.050 0.049 0.048 0.047 0.046
1.9 0.029 0.028 0.027 0.027 0.026 0.026 0.025 0.024 0.024 0.023
2.0 0.023 0.022 0.022 0.021 0.021 0.020 0.020 0.019 0.019 0.018
2.5 0.006 0.006 0.006 0.006 0.006 0.005 0.005 0.005 0.005 0.005
2.9 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.001 0.001
Probability of an event (Z = 0.51)
The Z-score is used to compute the probability of obtaining
an observed score.
Example
Let z = 0.51. What is the probability of observing z=0.51?
It is the probability of observing z ≥ 0.51: P(z ≥ 0.51)
P(z ≥ 0.51) = 0.3050
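The table look-up can be reproduced numerically. This minimal sketch relies on `statistics.NormalDist` from the Python standard library; `upper_tail` is a hypothetical helper name.

```python
from statistics import NormalDist


def upper_tail(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 1 - NormalDist().cdf(z)


print(round(upper_tail(0.51), 4))
```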
Example
Suppose that for the population of students of a famous business school in
Sophia-Antipolis, grades are normally distributed with an average of 10
and a standard deviation of 3. What proportion of them:
Exceeds 12; exceeds 15
Does not exceed 8; does not exceed 12
Let the mean μ = 10 and standard deviation σ = 3:
$$z = \frac{12 - 10}{3} = 0.66, \quad P(z \geq 0.66) = 0.255 = 25.5\%$$
$$z = \frac{15 - 10}{3} = 1.66, \quad P(z \geq 1.66) = 0.049 = 4.9\%$$
$$z = \frac{8 - 10}{3} = -0.66, \quad P(z \leq -0.66) = P(z \geq 0.66) = 0.255 = 25.5\%$$
$$z = \frac{12 - 10}{3} = 0.66, \quad P(z \leq 0.66) = 1 - P(z \geq 0.66) = 1 - 0.255 = 74.5\%$$
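The four proportions can also be computed without a table. The sketch below assumes the same N(10, 3²) population; its results differ slightly in the third decimal from the table look-ups because the z scores are not truncated to two decimals.

```python
from statistics import NormalDist

grades = NormalDist(mu=10, sigma=3)

exceeds_12 = 1 - grades.cdf(12)   # P(X > 12)
exceeds_15 = 1 - grades.cdf(15)   # P(X > 15)
at_most_8 = grades.cdf(8)         # P(X <= 8), equal to P(X > 12) by symmetry
at_most_12 = grades.cdf(12)       # P(X <= 12) = 1 - P(X > 12)

for label, p in [("> 12", exceeds_12), ("> 15", exceeds_15),
                 ("<= 8", at_most_8), ("<= 12", at_most_12)]:
    print(f"P(X {label}) = {p:.3f}")
```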
Confidence Interval
Inverting the way of thinking
Until now, we have thought in terms of observations x
and the values μ and σ to produce the z score.
Let us now imagine that we do not know x, but we know μ
and σ. If we consider any interval, we can write:
$$z = \frac{x - \mu}{\sigma} \;\Leftrightarrow\; x = \mu + z\sigma$$
$$-z \leq \frac{x - \mu}{\sigma} \leq z \;\Leftrightarrow\; \mu - z\sigma \leq x \leq \mu + z\sigma$$
Inverting the way of thinking
We know that 99% of z-scores fall within the range
[-2.58; +2.58]
We know that 90% of z-scores fall within the range
[-1.64; +1.64]
Let us now consider an interval which comprises 95% of
observations. Looking at the z table, we know that
z=1.96
$$\Pr\left(-1.96 \leq \frac{x - \mu}{\sigma} \leq 1.96\right) = 0.95$$
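The critical z values for the usual confidence levels can be recovered from the inverse of the standard normal cumulative distribution. A minimal sketch, with `critical_z` as a hypothetical helper name:

```python
from statistics import NormalDist


def critical_z(confidence):
    """z such that P(-z <= Z <= z) equals the given confidence level."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_z(level):.3f}")
```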
Confidence Interval
In statistics, a confidence interval is an interval within which the value
of a parameter (here, the mean) is likely to lie. Instead of estimating the
parameter by a single value, an interval of likely estimates is given.
Confidence intervals are used to indicate the reliability of an estimate.
A1. The sample mean is a random variable following a normal distribution.
A2. The sample values X̄ and s are good approximations of the population values μ and σ.
The Central Limit Theorem
If a random sample is drawn from any population, the sampling distribution of the sample mean is
approximately normal for a sufficiently large sample size.
The larger the sample size, the more closely the sampling distribution of x̄ will resemble a normal distribution.
Moments of Sample Mean: The Mean
$$\bar{X} = \frac{1}{n}\left(X_1 + X_2 + \dots + X_n\right)$$
$$E(\bar{X}) = \frac{1}{n}\left[E(X_1) + E(X_2) + \dots + E(X_n)\right]$$
$$E(\bar{X}) = \frac{1}{n}\left[\mu + \mu + \dots + \mu\right]$$
$$E(\bar{X}) = \frac{1}{n}\, n\mu = \mu$$
On average, the sample mean will be on target, that is, equal to the population mean.
Moments of Sample Mean: The Variance
$$\operatorname{var}(\bar{X}) = \operatorname{var}\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \dots + \frac{1}{n}X_n\right)$$
$$\operatorname{var}(\bar{X}) = \frac{1}{n^2}\left[\operatorname{var}(X_1) + \operatorname{var}(X_2) + \dots + \operatorname{var}(X_n)\right]$$
$$\operatorname{var}(\bar{X}) = \frac{1}{n^2}\left[\sigma^2 + \sigma^2 + \dots + \sigma^2\right]$$
$$\operatorname{var}(\bar{X}) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$
Standard error of X̄:
$$\sigma_{\bar{X}} = \sqrt{\operatorname{var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}$$
The standard deviation of the sample means represents the estimation error of the sample mean, and therefore it is called the standard
error.
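Both moments can be verified exactly on the dice experiment with n = 2. The sketch below is only an illustration on that small example (independent rolls of a fair die); `moments` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
n = 2


def moments(values):
    """Exact mean and variance of a list of equally likely values."""
    values = list(values)
    mu = Fraction(sum(values), len(values))
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, var


# Population: one roll of a fair die.
mu, sigma2 = moments(Fraction(f) for f in faces)

# Sampling distribution: means of all 6**n equally likely samples of size n.
mean_of_means, var_of_means = moments(
    Fraction(sum(rolls), n) for rolls in product(faces, repeat=n)
)

print("population:", mu, sigma2)
print("sample mean:", mean_of_means, var_of_means)
```

The mean of the sample means equals the population mean, and their variance equals the population variance divided by n, as in the derivation.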
The Sampling Distribution of the Sample Mean
1. $\mu_{\bar{x}} = \mu_x$
2. $\sigma^2_{\bar{x}} = \frac{\sigma^2_x}{n}$
3. If x is normal, x̄ is normal. If x is nonnormal,
x̄ is approximately normally distributed for
sufficiently large sample size.
Confidence Interval
General definition
$$\bar{X} - z_{pc}\frac{\sigma}{\sqrt{N}} \leq \mu \leq \bar{X} + z_{pc}\frac{\sigma}{\sqrt{N}}$$
Definition for 90% CI
$$\bar{X} - 1.64\frac{\sigma}{\sqrt{N}} \leq \mu \leq \bar{X} + 1.64\frac{\sigma}{\sqrt{N}}$$
Definition for 95% CI
$$\bar{X} - 1.96\frac{\sigma}{\sqrt{N}} \leq \mu \leq \bar{X} + 1.96\frac{\sigma}{\sqrt{N}}$$
Standard Normal Distribution and CI
[Figure: standard normal density marking the central areas that hold 90%, 95% and 99.7% of observations; x-axis from -5 to 5]
Application of Confidence Interval
Let us draw a sample of 25 students from CERAM (n = 25), with X̄ =
10 and σ = 3. Let us build the 95% CI:
$$10 - 1.96\frac{3}{\sqrt{25}} \leq \mu \leq 10 + 1.96\frac{3}{\sqrt{25}} \;\Rightarrow\; 8.8 \leq \mu \leq 11.2$$
CERAM Average grades
[Figure: distribution of CERAM average grades on a 0 to 20 axis, with the interval from 8.8 to 11.2 marked]
There is a 95% chance that the mean is indeed located within this interval (8.8 to 11.2).
Application of Confidence Interval
Let us draw a sample of 30 students from HEC (n = 30), with X̄ = 11.5
and σ = 4.7. Let us build the 95% CI:
$$11.5 - 1.96\frac{4.7}{\sqrt{30}} \leq \mu \leq 11.5 + 1.96\frac{4.7}{\sqrt{30}} \;\Rightarrow\; 9.8 \leq \mu \leq 13.2$$
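The two intervals can be reproduced with a short function. This is a sketch assuming σ is known, so that the z-based interval applies; `z_confidence_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds of the z-based CI for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


ceram = z_confidence_interval(10, 3, 25)
hec = z_confidence_interval(11.5, 4.7, 30)

# Two intervals overlap when each one starts before the other ends.
overlap = ceram[0] <= hec[1] and hec[0] <= ceram[1]
print(ceram, hec, overlap)
```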
HEC Average grades
[Figure: distribution of HEC average grades on a 0 to 20 axis, with the interval from 9.8 to 13.2 marked]
There is a 95% chance that the mean is indeed located within this interval (9.8 to 13.2).
Hypothesis Testing
Hypothesis 1 : Students from CERAM have an average grade which is
not significantly different from 11
H0 : μ(CERAM) = 11
H1 : μ(CERAM) ≠ 11
I accept H0 and reject H1
Hypothesis 2 : Students from CERAM have similar grades as students
from HEC
H0 : μ(CERAM) = μ(HEC)
H1 : μ(CERAM) ≠ μ(HEC)
I accept H0 and reject H1
Comparing the Means Using CI’s
[Figure: grade distributions of HEC (μ=11.5; σ=4.7) and CERAM (μ=10; σ=3) on a 0 to 20 axis, with their 95% confidence intervals]
The overlap of the two CIs means that, at the 95% level, the two means do not differ significantly.
The Student Test
Thus far, we have assumed that we know both the mean
and the standard deviation of the population. But in fact,
we do not know them: both μ and σ are unknown.
The Student t statistic is then preferred to the z statistic.
Its distribution is similar (identical to z as n → +∞). The CI
becomes
$$\bar{X} \pm t^{df}_{cp}\,\frac{s}{\sqrt{N}}$$
Application of Student t to CI’s
Let us draw a sample of 25 students from CERAM (n = 25), with X̄ = 10
and s = 3. Let us build the 95% CI:
$$10 - t^{24}_{2.5\%}\frac{3}{\sqrt{25}} \leq \mu \leq 10 + t^{24}_{2.5\%}\frac{3}{\sqrt{25}}$$
$$10 - 2.06\frac{3}{\sqrt{25}} \leq \mu \leq 10 + 2.06\frac{3}{\sqrt{25}} \;\Rightarrow\; 8.76 \leq \mu \leq 11.23$$
Let us draw a sample of 30 students from HEC (n = 30), with X̄ = 11.5
and s = 4.7. Let us build the 95% CI, with 29 degrees of freedom:
$$11.5 - 2.045\frac{4.7}{\sqrt{30}} \leq \mu \leq 11.5 + 2.045\frac{4.7}{\sqrt{30}} \;\Rightarrow\; 9.75 \leq \mu \leq 13.25$$
SPSS Application: Student t
Import CERAM_LMC into SPSS
Produce descriptive statistics for sales, labour, and R&D expenses
Analyze → Descriptive Statistics → Descriptives
Options: choose the statistics you wish
A newspaper writes that by and large, LMCs have 95,000 employees.
Test statistically whether this is true at the 1% level
Test statistically whether this is true at the 5% level
Test statistically whether this is true at the 10% and 20% levels
Write out H0 and H1
Analyze → Compare Means → One-Sample T Test
Options: 99, 95, 90%
SPSS Application: t test at 99% level
One-Sample Statistics
         N     Mean      Std. Deviation  Std. Error Mean
labour   1634  91298.87  96400.957       2384.818
One-Sample Test (Test Value = 95000)
         t       df    Sig. (2-tailed)  Mean Difference  99% CI of the Difference: Lower  Upper
labour   -1.552  1633  .121             -3701.130        -9851.20                         2448.94
SPSS Application: t test at 95% level
One-Sample Test (Test Value = 95000)
         t       df    Sig. (2-tailed)  Mean Difference  95% CI of the Difference: Lower  Upper
labour   -1.552  1633  .121             -3701.130        -8378.75                         976.50
SPSS Application: t test at 80% level
One-Sample Test (Test Value = 95000)
         t       df    Sig. (2-tailed)  Mean Difference  80% CI of the Difference: Lower  Upper
labour   -1.552  1633  .121             -3701.130        -6758.63                         -643.63
SPSS Results (at 1% level)
$$\bar{X} - 95000 - t_{0.01/2}\frac{s}{\sqrt{N}} \leq \mu - 95000 \leq \bar{X} - 95000 + t_{0.01/2}\frac{s}{\sqrt{N}}$$
$$\bar{X} - 95000 - 2.573\frac{96400}{\sqrt{1634}} \leq \mu - 95000 \leq \bar{X} - 95000 + 2.573\frac{96400}{\sqrt{1634}}$$
$$-9851.20 \leq \mu - 95000 \leq 2448.94$$
$$85148.8 \leq \mu \leq 97448.94$$
$$\Pr(85148.8 \leq \mu \leq 97448.94) = 0.99$$
Critical probability
The confidence interval is designed in such a way that for each t
statistic chosen, we define the share of observations which this CI
comprises.
For large n, when t = 1.96, we have a 95% CI
For large n, when t = 2.58, we have a 99% CI
Actually, to each t there corresponds a share of observations: http://www.socr.ucla.edu/Applets.dir/T-table.html
One can compute the t value directly from our observations as follows:
$$t = \frac{\bar{X} - \mu}{\sqrt{s^2 / N}}$$
$$t = \frac{\bar{X} - 95000}{\sqrt{s^2 / N}} = \frac{91298 - 95000}{\sqrt{96400^2 / 1634}} = \frac{-3702}{2384} = -1.552$$
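The t statistic and its critical probability can be recomputed from the summary statistics reported by SPSS. The sketch below makes one assumption that is not in the slides: with 1633 degrees of freedom, the Student t distribution is approximated by the standard normal, which keeps the example within the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

n = 1634
sample_mean = 91298.87
sample_sd = 96400.957
mu0 = 95000  # value claimed by the newspaper (H0)

standard_error = sample_sd / sqrt(n)
t_stat = (sample_mean - mu0) / standard_error

# Assumption: with 1633 degrees of freedom, Student's t is approximated
# by the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"SE = {standard_error:.3f}, t = {t_stat:.3f}, p = {p_value:.3f}")
```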
Critical probability
With |t| = 1.552, I can conclude the following:
If H0 is true (population mean = 95,000), there is a 12% probability of
observing a sample mean at least this far from 95,000.
If I reject H0, I have a 12% chance of wrongly rejecting it.
The remaining 88% of the distribution under H0 lies closer to 95,000
than the observed sample mean.
Shall I then accept or reject H0?
[Figure: t distribution under H0, with 88.0% of the area in the centre and 6.1% in each tail]
I accept H0!
Critical probability
The practice is to reject H0 only when the
critical probability is lower than 0.1, or 10%.
Some are even more cautious and prefer to
reject H0 at a critical probability level of 0.05,
or 5%.
In any case, the philosophy of the statistician
is to be conservative.
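The decision rule amounts to a single comparison. A minimal sketch, using the critical probability of 0.121 from the SPSS output for labour; `decide` is a hypothetical helper name.

```python
def decide(p_value, alpha):
    """Return the test conclusion for a given threshold alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


p_value = 0.121  # critical probability reported for the labour variable
for alpha in (0.01, 0.05, 0.10, 0.20):
    print(f"alpha = {alpha:.2f}: {decide(p_value, alpha)}")
```

Only at the 20% threshold is H0 rejected, which matches the 80% confidence interval of the difference being the only one that excludes 0.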
A Direct Comparison of Means Using Student t
Another way to compare two sample means is to calculate the CI
of the mean difference. If 0 does not belong to the CI, then the two
samples have significantly different means.
$$(\bar{X}_1 - \bar{X}_2) \pm t_{pc}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$s_p^2 = \frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)}$$
Here $s_p \sqrt{1/n_1 + 1/n_2}$ is the standard error of the mean difference, and $s_p^2$ is called the pooled
variance.
SPSS Application: t test comparing means
Another newspaper argues that US companies are much larger than
those from the rest of the world. Is this true?
Produce descriptive statistics for labour, comparing the two groups
Produce a group variable which equals 1 for US firms, 0 otherwise
This is called a dummy variable
Write out H0 and H1
Analyze → Compare Means → Independent-Samples T Test
What do you conclude at the 5% level?
What do you conclude at the 1% level?
SPSS Application: t test comparing means
Group Statistics
labour   AM   N     Mean      Std. Deviation  Std. Error Mean
         1    628   97808.99  112765.1        4499.817
         0    1006  87234.90  84403.469       2661.101
Independent Samples Test (labour)
Levene's Test for Equality of Variances: F = .024, Sig. = .877
t-test for Equality of Means, with 95% CI of the difference
                             t      df        Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower    Upper
Equal variances assumed      2.159  1632      .031             10574.084        4897.135               968.751  20179.417
Equal variances not assumed  2.023  1061.268  .043             10574.084        5227.792               316.102  20832.067
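The "equal variances assumed" row can be recomputed from the group statistics with the pooled-variance formula. The sketch below is an illustration only, and it assumes that with 1632 degrees of freedom the normal quantile can stand in for the Student t quantile, so the interval bounds differ slightly from those reported by SPSS.

```python
from math import sqrt
from statistics import NormalDist

# Group statistics for labour (AM = 1 and AM = 0) as reported by SPSS.
n1, mean1, sd1 = 628, 97808.99, 112765.1
n2, mean2, sd2 = 1006, 87234.90, 84403.469

# Pooled variance: squared deviations of both groups over (n1 - 1) + (n2 - 1).
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / ((n1 - 1) + (n2 - 1))
se_diff = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)

diff = mean1 - mean2
t_stat = diff / se_diff

# Assumption: the normal quantile approximates the t quantile for 1632 df.
z95 = NormalDist().inv_cdf(0.975)
ci95 = (diff - z95 * se_diff, diff + z95 * se_diff)

print(f"SE = {se_diff:.1f}, t = {t_stat:.3f}")
print(f"approximate 95% CI: {ci95[0]:.0f} to {ci95[1]:.0f}")
```

Since 0 lies outside the approximate 95% interval, the two group means differ significantly at the 5% level under these assumptions.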
SPSS Application: t test comparing means (99% confidence interval)
Independent Samples Test (labour)
Levene's Test for Equality of Variances: F = .024, Sig. = .877
t-test for Equality of Means, with 99% CI of the difference
                             t      df        Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower      Upper
Equal variances assumed      2.159  1632      .031             10574.084        4897.135               -2054.870  23203.038
Equal variances not assumed  2.023  1061.268  .043             10574.084        5227.792               -2916.075  24064.243