statistics review

Review of Mathematics & Statistics Essentials BSP 4513 Econometrics 1 Econometrics Lecture 2

Upload: vu-duy-anh

Post on 28-Dec-2015


DESCRIPTION

Correlation, statistics

TRANSCRIPT

Page 1: Statistics Review

Review of Mathematics & Statistics Essentials

BSP 4513 Econometrics 1

Econometrics

Lecture 2

Page 2: Statistics Review

References


(1) Principles of Econometrics :: Appendix A and Appendix B

Page 3: Statistics Review

1) Summation and Operators

2) Logarithm – always natural

3) Linear Relationships

4) Nonlinear Relationships

Slide A-3

Page 4: Statistics Review

Notation

• Summation

$\sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n$ ; $\sum_{j=1}^{n} R^j = R + R^2 + R^3 + \dots + R^n$

$\sum_i kX_i = k\sum_i X_i$ ; $0.8\sum_{j=1}^{n} R^j = 0.8R + 0.8R^2 + 0.8R^3 + \dots + 0.8R^n$

$\sum_i (aX_i + bY_i) = a\sum_i X_i + b\sum_i Y_i$

• Lag Operator

$LX_t = X_{t-1}$ ; $L^n X_t = X_{t-n}$

$Y_t = a_0 X_t + a_1 X_{t-1} + a_2 X_{t-2} = (a_0 + a_1 L + a_2 L^2)X_t = A(L)X_t$

• Difference Operator

$\Delta X_t = X_t - X_{t-1} = X_t - LX_t = (1 - L)X_t$, so $\Delta = 1 - L$ ; $\Delta(aX_t) = a\Delta X_t$
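The lag and difference operators can be tried out on a short series. The following is a minimal sketch, assuming a made-up five-period series and made-up coefficients 0.5, 0.3, 0.2 for A(L); the helper names `lag`, `diff` and `lag_polynomial` are illustrative and do not come from the lecture.

```python
# Illustrative series X_1..X_5; the values are invented for this example.
X = [10.0, 12.0, 15.0, 14.0, 18.0]

def lag(series, n=1):
    """Apply L^n: shift the series back n periods (first n entries undefined)."""
    return [None] * n + series[:len(series) - n]

def diff(series):
    """Apply (1 - L): X_t - X_{t-1}, undefined for the first observation."""
    lagged = lag(series)
    return [None if prev is None else cur - prev
            for cur, prev in zip(series, lagged)]

def lag_polynomial(series, coeffs):
    """Apply A(L) = a0 + a1*L + a2*L^2 + ...; undefined until enough lags exist."""
    out = []
    for t in range(len(series)):
        if t < len(coeffs) - 1:
            out.append(None)
        else:
            out.append(sum(a * series[t - k] for k, a in enumerate(coeffs)))
    return out

lagged = lag(X)                          # L X_t
delta = diff(X)                          # (1 - L) X_t
Y = lag_polynomial(X, [0.5, 0.3, 0.2])   # hypothetical a0, a1, a2

print(lagged)
print(delta)
print(Y)
```

Entries marked `None` are periods where the required lagged value does not exist in the sample.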

Page 5: Statistics Review

Slide A-5

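With $f(x, y) = x + y$:

$\sum_{i=1}^{m}\sum_{j=1}^{n} f(x_i, y_j) = \sum_{i=1}^{m}\sum_{j=1}^{n} (x_i + y_j)$

$\sum_{i=1}^{m}\sum_{j=1}^{n} f(x_i, y_j) = \sum_{i=1}^{m} \left[ f(x_i, y_1) + f(x_i, y_2) + \dots + f(x_i, y_n) \right]$

$\sum_{i=1}^{m}\sum_{j=1}^{n} f(x_i, y_j) = \sum_{j=1}^{n}\sum_{i=1}^{m} f(x_i, y_j)$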

Page 6: Statistics Review

$d\ln(X_t) = dX_t / X_t$, the growth rate of X

Discrete approximation:

$\Delta\ln(X_t) = \ln(X_t) - \ln(X_{t-1})$
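To see how close the log difference is to the exact growth rate, here is a small sketch. The two levels (100 and 103) are hypothetical numbers chosen only for illustration.

```python
import math

# Hypothetical levels of X in two consecutive periods.
X_prev, X_curr = 100.0, 103.0

# Exact relative change (X_t - X_{t-1}) / X_{t-1}.
exact_growth = (X_curr - X_prev) / X_prev

# Log-difference approximation ln(X_t) - ln(X_{t-1}).
log_diff = math.log(X_curr) - math.log(X_prev)

print(f"exact growth   = {exact_growth:.6f}")
print(f"log difference = {log_diff:.6f}")
```

The approximation is close for small changes and deteriorates as the change becomes large.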

Slide A-6

If $x = e^{b}$, then $\ln(x) = \ln(e^{b}) = b$

Rules:

$\ln(xy) = \ln(x) + \ln(y)$

$\ln(x/y) = \ln(x) - \ln(y)$

$\ln(x^{a}) = a\ln(x)$

Page 7: Statistics Review

Slide A-7

Page 8: Statistics Review

The exponential function is the antilogarithm because we can recover

the value of x using it.

Slide A-8

$\exp(\ln x) = e^{\ln x} = x$

Page 9: Statistics Review

Slide A-9

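$y = \beta_1 + \beta_2 x$   (A.1)

$\Delta y = \beta_2 \Delta x$   (A.2)

At $x = 0$: $y = \beta_1 + \beta_2 x = \beta_1 + \beta_2 \cdot 0 = \beta_1$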

Page 10: Statistics Review

Figure A.1 A linear relationship Slide A-10

Page 11: Statistics Review

Slide A-11

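$\beta_2 = \dfrac{dy}{dx}$   (A.3)

$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3$   (A.4)

$\beta_2 = \dfrac{\Delta y}{\Delta x_2}$ given that $x_3$ is held constant

$\beta_3 = \dfrac{\Delta y}{\Delta x_3}$ given that $x_2$ is held constant   (A.5)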

Page 12: Statistics Review

Slide A-12

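$Q = \beta_1 + \beta_2 L + \beta_3 K$

$\beta_2 = \dfrac{\Delta Q}{\Delta L}$ given that capital K is held constant

$= MP_L$, the marginal product of labor input

$\beta_2 = \dfrac{\partial y}{\partial x_2}, \quad \beta_3 = \dfrac{\partial y}{\partial x_3}$   (A.6)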

Page 13: Statistics Review

Slide A-13

(A.7)

∆y/y = the relative change in y, which is a decimal

%∆y = percentage change in y

(A.8a)

(A.8b)

(A.8c)

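$\varepsilon_{yx} = \dfrac{\Delta y / y}{\Delta x / x} = \dfrac{\Delta y}{\Delta x} \times \dfrac{x}{y} = \text{slope} \times \dfrac{x}{y}$

$\%\Delta y = 100 \times \dfrac{\Delta y}{y}$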

Page 14: Statistics Review

Figure A.2 A nonlinear relationship Slide A-14

Page 15: Statistics Review

Slide A-15

(A.9)

(A.10)

$dy = \beta_2\,dx$

$\varepsilon_{yx} = \dfrac{dy / y}{dx / x} = \dfrac{dy}{dx} \times \dfrac{x}{y} = \text{slope} \times \dfrac{x}{y}$

Page 16: Statistics Review

Slide A-16

Page 17: Statistics Review

Figure A.3 Alternative Functional Forms Slide A-17

Page 18: Statistics Review

If β3 > 0, then the curve is U-shaped, and representative of average or

marginal cost functions, with increasing marginal effects. If β3 < 0,

then the curve is an inverted-U shape, useful for total product curves,

total revenue curves, and curves that exhibit diminishing marginal

effects.

Slide A-18

$y = \beta_1 + \beta_2 x + \beta_3 x^2$

$dy/dx = \beta_2 + 2\beta_3 x = 0$, or $x = -\beta_2 / (2\beta_3)$
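The turning point of the quadratic can be checked numerically. This is a minimal sketch with hypothetical coefficients (β1 = 10, β2 = -4, β3 = 0.5, so the curve is U-shaped); the function names are illustrative.

```python
def quadratic(x, b1, b2, b3):
    """y = b1 + b2*x + b3*x^2."""
    return b1 + b2 * x + b3 * x ** 2

def slope(x, b2, b3):
    """dy/dx = b2 + 2*b3*x."""
    return b2 + 2 * b3 * x

def turning_point(b2, b3):
    """Value of x at which the slope is zero: x = -b2 / (2*b3)."""
    return -b2 / (2 * b3)

# Hypothetical coefficients; b3 > 0 gives a U-shaped curve.
b1, b2, b3 = 10.0, -4.0, 0.5

x_star = turning_point(b2, b3)
print("turning point x* =", x_star)
print("slope at x*      =", slope(x_star, b2, b3))
print("y at x*-1, x*, x*+1:",
      quadratic(x_star - 1, b1, b2, b3),
      quadratic(x_star, b1, b2, b3),
      quadratic(x_star + 1, b1, b2, b3))
```

With β3 > 0 the value of y at the turning point is lower than at neighbouring points, which is the minimum of the U shape.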

Page 19: Statistics Review

• Example: the Phillips Curve

Slide A-19

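$y = \beta_1 + \beta_2 x^{-1} = \beta_1 + \beta_2 \dfrac{1}{x}$

$\%\Delta w_t = \dfrac{w_t - w_{t-1}}{w_{t-1}} \times 100$

$\%\Delta w_t = \beta_1 + \beta_2 \dfrac{1}{u_t}$

where $w_t$ is the wage rate and $u_t$ is the unemployment rate.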

Page 20: Statistics Review

• In order to use this model all values of y and x must be positive. The

slopes of these curves change at every point, but the elasticity is

constant and equal to β2.

Slide A-20

$\ln(y) = \beta_1 + \beta_2 \ln(x)$
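The constant-elasticity property of the log-log model can be checked numerically. The sketch below assumes hypothetical coefficients β1 = 1.0 and β2 = 0.7 and approximates the slope by a central finite difference; none of these choices come from the lecture.

```python
import math

# Hypothetical coefficients of ln(y) = beta1 + beta2 * ln(x).
beta1, beta2 = 1.0, 0.7

def y_of(x):
    """Recover y from the log-log model (x must be positive)."""
    return math.exp(beta1 + beta2 * math.log(x))

def slope(x, h=1e-6):
    """Central finite-difference approximation to dy/dx."""
    return (y_of(x + h) - y_of(x - h)) / (2 * h)

def elasticity(x):
    """Elasticity = slope * x / y."""
    return slope(x) * x / y_of(x)

points = [0.5, 2.0, 10.0]
slopes = [slope(x) for x in points]
elasticities = [elasticity(x) for x in points]

for x, s, e in zip(points, slopes, elasticities):
    print(f"x = {x:5.1f}  slope = {s:.4f}  elasticity = {e:.4f}")
```

The slope differs at each point while the elasticity stays at β2, up to numerical error.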

Page 21: Statistics Review

• Both its slope and elasticity change at each point and are the same

sign as β2.

• The slope at any point is β2y, which for β2 > 0 means that the marginal

effect increases for larger values of y.

Slide A-21

$\ln(y) = \beta_1 + \beta_2 x$

$y = \exp[\ln(y)] = \exp(\beta_1 + \beta_2 x)$

Page 22: Statistics Review

Random Variables

Probability Distributions

Joint, Marginal and Conditional Probability

Distributions

Properties of Probability Distributions

Slide B-22

Page 23: Statistics Review

Random Variables

• A random variable is simply a variable whose values are determined by some chance mechanism.

• A random variable has a designated set of possible values and associated probabilities.

• A discrete random variable X consists of a set of possible values X1, X2, …, XN and associated non-negative fractions (probabilities) p1, p2, …, pN such that

$p_1 + p_2 + \dots + p_N = \sum_{i=1}^{N} p_i = 1$

Page 24: Statistics Review

Probability Distribution

• A formula giving the probabilities for the different values of a random variable X is called a probability distribution. For a continuous random variable it is called a probability density function.

Page 25: Statistics Review

Probability Distribution

Example: (a) Probability function for the outcome from tossing 2 well-balanced coins:

Outcome (No. of Heads)    X         P(X)    X·P(X)
No Head                   X1 = 0    0.25    0 × 0.25 = 0.0
1 Head                    X2 = 1    0.50    1 × 0.50 = 0.5
2 Heads                   X3 = 2    0.25    2 × 0.25 = 0.5
Total                               1.00    ∑X·P(X) = 1.00

[Tree diagram: each toss lands H or T with probability ½; the four branches HH, HT, TH and TT give 2, 1, 1 and 0 heads.]
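The two-coin probability function can be rebuilt by listing the four equally likely outcomes. This is a minimal sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of tossing two fair coins.
outcomes = list(product("HT", repeat=2))
prob_each = Fraction(1, len(outcomes))

# Probability function of X = number of heads.
pmf = {}
for outcome in outcomes:
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, 0) + prob_each

# Expected value: sum of X * P(X).
expected = sum(x * p for x, p in pmf.items())

for x in sorted(pmf):
    print(f"P(X = {x}) = {pmf[x]}")
print("E(X) =", expected)
```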

Page 26: Statistics Review

Probability Distribution

Example: (b) Probability Function for outcome from tossing

a die with the numbers shown on its faces.

Page 27: Statistics Review

Probability Density Functions…

• Unlike a discrete random variable, a continuous random variable is one that can assume an uncountable number of values.

• We cannot list the possible values because there is an infinite number of them.

• Because there is an infinite number of values, the probability of each individual value is 0.

Page 28: Statistics Review

Point Probabilities are Zero

Because there is an infinite number of values, the probability of each individual value is 0.

Thus, we can determine the probability of a range of values only.

• E.g. with a discrete random variable like tossing a die, it is meaningful to talk about P(X=5), say.

• In a continuous setting (e.g. with time as a random variable), the probability that the random variable of interest, say task length, equals exactly 5 minutes is zero: P(X=5) = 0.

• It is meaningful to talk about P(X ≤ 5).

Page 29: Statistics Review

Probability Density Function…

• A function f(x) is called a probability density function (over the range a ≤ x ≤ b) if it meets the following requirements:

1) f(x) ≥ 0 for all x between a and b, and

2) The total area under the curve between a and b is 1.0

[Figure: density curve f(x) over the interval from a to b on the x-axis; area under the curve = 1]

Page 30: Statistics Review

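Uniform Distribution…

• Consider the uniform probability distribution (sometimes called the rectangular probability distribution).

• It is described by the function:

$f(x) = \dfrac{1}{b - a}$, where $a \le x \le b$

[Figure: rectangle of height 1/(b − a) over the interval from a to b on the x-axis]

area = width × height = $(b - a) \times \dfrac{1}{b - a} = 1$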

Page 31: Statistics Review

Example 2.1(a)…

• The amount of gasoline sold daily at a service station is uniformly distributed with a minimum of 2,000 gallons and a maximum of 5,000 gallons.

• Find the probability that daily sales will fall between 2,500 and 3,000 gallons.

• Algebraically: what is P(2,500 ≤ X ≤ 3,000) ?

[Figure: uniform density f(x) over the interval from 2,000 to 5,000 gallons]

Page 32: Statistics Review

Example 2.1(a)…

• P(2,500 ≤ X ≤ 3,000) = (3,000 – 2,500) × 1/(5,000 – 2,000) = 0.1667

• “there is about a 17% chance that between 2,500 and 3,000 gallons of gas will be sold on a given day”
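The gasoline example can be reproduced with a few lines of code. This is a sketch only; the helper names `uniform_pdf` and `uniform_prob` are illustrative.

```python
def uniform_pdf(x, a, b):
    """Density of the uniform distribution on [a, b]: 1/(b - a) inside, 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: width of the overlap times the height."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

# Daily gasoline sales, uniform between 2,000 and 5,000 gallons.
a, b = 2000, 5000
p = uniform_prob(2500, 3000, a, b)

print("height of the density:", uniform_pdf(3000, a, b))
print("P(2,500 <= X <= 3,000) =", round(p, 4))
```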

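[Figure: uniform density f(x) over the interval from 2,000 to 5,000 gallons, with the area between 2,500 and 3,000 shaded]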

Page 33: Statistics Review

Probability Density Function

• PDF of a Continuous Random Variable

Example: the Normal Distribution for values of a random variable taking values in the interval (−∞, +∞)

[Figure: probability density f(x) against weight in kilograms; the probability that weight lies between 45 and 60 kg is given by the shaded area]

Page 34: Statistics Review

Normal Distribution

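• A bell-shaped distribution

• Symmetrical about its mean, median and mode

• The univariate normal probability density function is:

$f(X) = \dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\tfrac{1}{2} \left[ \dfrac{X - \mu}{\sigma} \right]^2 \right\}, \quad -\infty < X < +\infty$

with mean = μ and variance = σ²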
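The normal density formula can be coded directly and compared with Python's standard-library `statistics.NormalDist`. The mean of 55 kg and standard deviation of 8 kg below are hypothetical values chosen for illustration; the slides do not state the parameters of the weight example.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Univariate normal density: exp(-0.5*((x - mu)/sigma)^2) / sqrt(2*pi*sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi * sigma ** 2)

# Hypothetical parameters for a weight distribution (kg).
mu, sigma = 55.0, 8.0
dist = NormalDist(mu, sigma)

# The hand-coded density should agree with the library density.
print(normal_pdf(50.0, mu, sigma), dist.pdf(50.0))

# Probability of a range = area under the density = difference of CDF values.
prob_45_60 = dist.cdf(60.0) - dist.cdf(45.0)
print("P(45 <= X <= 60) =", round(prob_45_60, 4))
```

As with any continuous variable, only ranges carry probability, so the calculation uses the cumulative distribution function rather than the density at a point.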

Page 35: Statistics Review

Measure of Central Tendency

• Mean (average); Median; Mode

• Expectation Operator, E(·):

$E(x) = \int_{-\infty}^{+\infty} x\,f(x)\,dx = \mu_x$ = mean of distribution

More generally, for any function of x, g(x):

$E[g(x)] = \int_{-\infty}^{+\infty} g(x)\,f(x)\,dx$

In particular, $E[a + bx] = a + bE(x)$ when a and b are non-stochastic.

$E[x - E(x)]^2$ = variance of distribution $= E[x^2 + (E(x))^2 - 2xE(x)] = E(x^2) - (E(x))^2$

• Note that E(XY) = E(X)·E(Y) if X and Y are independent; in general, E(X/Y) ≠ E(X)/E(Y).

Page 36: Statistics Review

Measure of Dispersion

• Range (max – min); Variance (σ²); Standard deviation (σ); Coefficient of variation (CV = σ/μ).

• Definition of variance:

$\mathrm{Var}(x) = \sigma^2 = E[x - E(x)]^2 = E[x^2 + (E(x))^2 - 2xE(x)] = E(x^2) - (E(x))^2$

For a discrete random variable:

$\mathrm{Var}(X) = \sum_i (X_i - \bar{X})^2 P(X_i)$

Page 37: Statistics Review

Measure of Dispersion Example: The variance of a random variable X,

the number shown on a die.

σ2 = 2.9167; σ = 1.708; CV = 0.488
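The die figures can be reproduced from the definition of variance. This is a minimal sketch using exact fractions for the mean and variance; the variable names are illustrative.

```python
import math
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)          # each face of a fair die is equally likely

mean = sum(x * p for x in faces)
variance = sum((x - mean) ** 2 * p for x in faces)

# Shortcut formula: E(x^2) - (E(x))^2 should give the same variance.
variance_shortcut = sum(x ** 2 * p for x in faces) - mean ** 2

std_dev = math.sqrt(variance)
cv = std_dev / float(mean)

print("mean     =", mean)
print("variance =", variance, "=", round(float(variance), 4))
print("std dev  =", round(std_dev, 3))
print("CV       =", round(cv, 3))
```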

Page 38: Statistics Review

Dispersion with same mean

Hypothetical PDFs of continuous random variables

all with the same expected value.

Page 39: Statistics Review

Joint Distribution; Marginal and Conditional Distributions

Consider the 16 possible outcomes from tossing a coin four times.

X = number of heads obtained on the first three tosses;

Y = number of heads obtained on four tosses;

          X = 0   X = 1   X = 2   X = 3   g(Y)    E(X|Y)
Y = 0     1/16    0       0       0       1/16    0.0
Y = 1     1/16    3/16    0       0       1/4     0.75
Y = 2     0       3/16    3/16    0       3/8     1.5
Y = 3     0       0       3/16    1/16    1/4     2.25
Y = 4     0       0       0       1/16    1/16    3.0
f(X)      1/8     3/8     3/8     1/8
E(Y|X)    0.5     1.5     2.5     3.5

Marginal distribution of X = f(X); E(X) = 12/8 = 3/2

Marginal distribution of Y = g(Y); E(Y) = 32/16 = 2
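The joint table can be rebuilt by enumerating the 16 outcomes of four tosses. The sketch below codes a head as 1 and a tail as 0 and uses exact fractions; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Joint distribution of X (heads in first three tosses) and Y (heads in four tosses).
joint = {}
for tosses in product((0, 1), repeat=4):      # 1 = head, 0 = tail
    x, y = sum(tosses[:3]), sum(tosses)
    joint[(x, y)] = joint.get((x, y), 0) + Fraction(1, 16)

# Marginal distributions: sum the joint probabilities over the other variable.
f_X, g_Y = {}, {}
for (x, y), p in joint.items():
    f_X[x] = f_X.get(x, 0) + p
    g_Y[y] = g_Y.get(y, 0) + p

E_X = sum(x * p for x, p in f_X.items())
E_Y = sum(y * p for y, p in g_Y.items())

def cond_mean_Y_given_X(x0):
    """E(Y | X = x0) = sum over y of y * P(x0, y) / P(x0)."""
    return sum(y * p for (x, y), p in joint.items() if x == x0) / f_X[x0]

def cond_mean_X_given_Y(y0):
    """E(X | Y = y0) = sum over x of x * P(x, y0) / P(y0)."""
    return sum(x * p for (x, y), p in joint.items() if y == y0) / g_Y[y0]

print("f(X):", dict(sorted(f_X.items())))
print("g(Y):", dict(sorted(g_Y.items())))
print("E(X) =", E_X, " E(Y) =", E_Y)
print("E(Y|X):", [cond_mean_Y_given_X(x) for x in sorted(f_X)])
print("E(X|Y):", [cond_mean_X_given_Y(y) for y in sorted(g_Y)])
```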

Page 40: Statistics Review

Conditional Probabilities and Conditional Distribution

• P(Y|X) = P(X,Y)/P(X)

Or P(X,Y) = P(Y|X)·P(X) = P(X|Y)·P(Y)

• E.g. P(Y=3|X=2) = P(Y=3, X=2)/P(X=2) = (3/16)/(3/8) = 0.5

Similarly, P(Y=2|X=2) = (3/16)/(3/8) = 0.5

The conditional distribution of Y given X = 2 is:

Values of Y given X = 2    Probabilities           Y·P(Y|X)
Y = 2                      0.5 = P(Y=2|X=2)        1.0
Y = 3                      0.5 = P(Y=3|X=2)        1.5
Total                      1.00                    2.5 = E(Y|X=2)

Page 41: Statistics Review

Conditional Distribution

Y        P(X=2, Y)   P(Y|X=2)
Y = 0    0           0
Y = 1    0           0
Y = 2    3/16        0.5
Y = 3    3/16        0.5
Y = 4    0           0
Total    3/8         1.0

P(Y=3|X=2) = P(X=2, Y=3)/P(X=2) = (3/16)/(3/8) = 0.5

• Conditional Mean:

E(Y|X=2) = 0.5 × 2 + 0.5 × 3 = 2.5

• Conditional Variance:

V(Y|X=2) = 0.5 × (2 − 2.5)² + 0.5 × (3 − 2.5)² = 0.25
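The conditional distribution, mean and variance for X = 2 can be computed from the X = 2 column of the joint table. This is a minimal sketch with the column typed in by hand; the variable names are illustrative.

```python
from fractions import Fraction

# Joint probabilities P(X = 2, Y = y), taken from the X = 2 column of the joint table.
joint_x2 = {
    0: Fraction(0),
    1: Fraction(0),
    2: Fraction(3, 16),
    3: Fraction(3, 16),
    4: Fraction(0),
}

# Marginal probability P(X = 2) is the column total.
p_x2 = sum(joint_x2.values())

# Conditional distribution: divide each joint probability by P(X = 2).
cond = {y: p / p_x2 for y, p in joint_x2.items()}

cond_mean = sum(y * p for y, p in cond.items())
cond_var = sum((y - cond_mean) ** 2 * p for y, p in cond.items())

print("P(X = 2)   =", p_x2)
print("P(Y | X=2) =", cond)
print("E(Y | X=2) =", cond_mean)
print("V(Y | X=2) =", cond_var)
```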

Page 42: Statistics Review

Statistical Independence

X and Y are said to be statistically independent if

P(X,Y) = P(X).P(Y)

          X = 1   X = 2   X = 3   g(Y)
Y = 1     0.12    0.18    0.30    0.6
Y = 2     0.08    0.12    0.20    0.4
f(X)      0.2     0.3     0.5

Example:

P(X=1, Y=2) = P(X=1)·P(Y=2) = 0.2 × 0.4 = 0.08.
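Independence requires the product rule to hold in every cell, not just one. The following sketch checks all six cells of the table; the small tolerance is an assumption made to absorb floating-point rounding.

```python
# Joint probabilities P(X = x, Y = y) from the table, keyed by (x, y).
joint = {
    (1, 1): 0.12, (2, 1): 0.18, (3, 1): 0.30,
    (1, 2): 0.08, (2, 2): 0.12, (3, 2): 0.20,
}

# Marginal distributions f(X) and g(Y).
f_X, g_Y = {}, {}
for (x, y), p in joint.items():
    f_X[x] = f_X.get(x, 0.0) + p
    g_Y[y] = g_Y.get(y, 0.0) + p

# X and Y are independent if P(X, Y) = P(X) * P(Y) in every cell.
TOL = 1e-12   # tolerance for floating-point rounding
independent = all(abs(p - f_X[x] * g_Y[y]) < TOL for (x, y), p in joint.items())

print("f(X):", f_X)
print("g(Y):", g_Y)
print("independent:", independent)
```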

Page 43: Statistics Review

Conditional Distribution

• Conditional density function: f(Y|X)

• Conditional Expectation $= \mu_{Y|X} = E(Y|X) = E_X(Y) = \sum_i Y_i P(Y_i|X)$

E.g. $\mu_{Y|X=2} = E(Y|X=2) = 2 \times 0.5 + 3 \times 0.5 = 5/2 = 2.5$

• Conditional Variance $= E_X\{Y - \mu_{Y|X}\}^2$

Page 44: Statistics Review

Conditional Expectation & Regression Line

• Note that the expectation of Y conditional on X can be expressed as a function of X.

• For the coin tossing example:

E(Y|X) = 0.5 + 1.0X (the regression line)

• Also notice that the variance of (Y|X) in the coin-tossing example is the same for different values of X, i.e. it is homoscedastic:

$\mathrm{Var}(Y|X) = E_X\{Y - \mu_{Y|X}\}^2 = 0.25$

Page 45: Statistics Review

Regression Line

[Figure: the regression line E(Y|X) = 0.5 + 1.0X, with X on the horizontal axis (0 to 3) and Y|X on the vertical axis]

Page 46: Statistics Review

Covariance

• $\mathrm{Cov}(X,Y) = E[(X - \bar{X})(Y - \bar{Y})] = E(XY) - \bar{X}\bar{Y}$

$= \sum_x \sum_y (X - \bar{X})(Y - \bar{Y})\,P(X,Y)$

$= \sum_x \sum_y XY\,f(X,Y) - \bar{X}\bar{Y}$

• Cov(X,X) = Var(X)

• Cov(a+bX, c+dY) = bd·Cov(X,Y)

Page 47: Statistics Review

Covariance as an average of area

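[Figure: points (x1, y1), (x2, y2) and (x3, y3) plotted on x-y axes]

Let $x = X - \bar{X}$ and $y = Y - \bar{Y}$. Then

$\mathrm{Cov}(X,Y) = \sum\sum (X - \bar{X})(Y - \bar{Y})\,f(X,Y) = (1/n)\sum\sum xy$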

Page 48: Statistics Review

Correlation Between X and Y

• Correlation Coefficient:

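$\rho = \mathrm{Cov}(X,Y) / (\sigma_X \sigma_Y)$

Computation formula:

$\rho = \dfrac{\sum\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \cdot \sum (Y - \bar{Y})^2}}$

In the coin-tossing example, Cov(X,Y) = E(XY) − E(X)E(Y) = 3.75 − 1.5 × 2 = 0.75, Var(X) = 0.75 and Var(Y) = 1.0, so

ρ = 0.75/√(0.75 × 1.0) = 0.8660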
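The covariance and correlation for the coin-tossing example can be recomputed from the 16 equally likely outcomes. This is a sketch under the example's own definitions (X = heads in the first three tosses, Y = heads in all four); the helper `expect` is an illustrative name.

```python
import math
from fractions import Fraction
from itertools import product

# Joint distribution of X and Y over the 16 outcomes (1 = head, 0 = tail).
joint = {}
for tosses in product((0, 1), repeat=4):
    key = (sum(tosses[:3]), sum(tosses))
    joint[key] = joint.get(key, 0) + Fraction(1, 16)

def expect(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)

# Cov(X, Y) = E(XY) - E(X)E(Y); variances from the definition.
cov = expect(lambda x, y: x * y) - mean_x * mean_y
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)

rho = float(cov) / math.sqrt(float(var_x * var_y))

print("Cov(X,Y) =", cov)
print("Var(X)   =", var_x, " Var(Y) =", var_y)
print("rho      =", round(rho, 4))
```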

Page 49: Statistics Review

Properties of Correlation Coefficient

• A measure of linear association between two variables.

• The correlation coefficient always lies between -1 and +1.

Page 50: Statistics Review

Figure 3-3

Some typical patterns of the correlation coefficient, ρ.

Page 51: Statistics Review

Skewness of Distribution

Measured by the normalized third moment, $S = E(X - \mu_X)^3 / \sigma^3$

Skewness = 0 when the distribution is symmetric

Page 52: Statistics Review

Kurtosis

• Measured by the normalized fourth moment, $K = E(X - \mu_X)^4 / \sigma^4$

• For the normal distribution, K = 3.

Page 53: Statistics Review

Example

• According to the Registry of Business in a small city, there are 1,200 establishments. The tabulation below shows the size of the establishment in terms of the number of employees.

No. of Employees    No. of Establishments    Relative Frequency
0 - 50              768                      0.640
51 - 100            180                      0.150
101 - 200           180                      0.150
201 - 500           72                       0.060
Total               1200                     1.000

Page 54: Statistics Review

Example: Distribution of Establishments

Size         Rel Freq (p)
0 - 50       0.640
51 - 100     0.150
101 - 200    0.150
201 - 500    0.060

[Figure: histogram of probability density against number of employees, with class boundaries at 0, 50, 100, 200 and 500; total shaded area = 1.0]
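For a histogram with unequal class widths, the bar height is the relative frequency divided by the class width, so that the bar areas sum to 1. The sketch below assumes class boundaries at 0, 50, 100, 200 and 500, as on the horizontal axis of the histogram.

```python
# (lower boundary, upper boundary), number of establishments.
# Boundaries at 0, 50, 100, 200, 500 are assumed from the histogram axis.
classes = [((0, 50), 768), ((50, 100), 180), ((100, 200), 180), ((200, 500), 72)]

total = sum(count for _, count in classes)

rows = []
for (lo, hi), count in classes:
    rel_freq = count / total
    density = rel_freq / (hi - lo)     # bar height = relative frequency / class width
    rows.append((lo, hi, rel_freq, density))

# Total area = sum of height * width over all bars.
area = sum(density * (hi - lo) for lo, hi, _, density in rows)

for lo, hi, rel_freq, density in rows:
    print(f"{lo:>3} - {hi:<3}  rel freq = {rel_freq:.3f}  density = {density:.5f}")
print("total area =", round(area, 10))
```

The widest class has a very low bar even though it holds 6% of establishments, because that probability is spread over 300 employee-units.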

Page 55: Statistics Review

Quick Quiz

• Express the following using the ‘L’ operator:

(a) $Y_t = 2 + 3X_t + 4X_{t-1} + 8X_{t-4}$

(b) $Q_t = 1 + 0.3Q_{t-1} + 0.4N_t + 0.2K_{t-1}$

Page 56: Statistics Review

Quick Quiz

          X = 0   X = 1   X = 2   X = 3   g(Y)    E(X|Y)
Y = 0     1/8     1/16    0       0       3/16    ?
Y = 1     1/8     1/8     0       0       1/4     ?
Y = 2     0       1/16    1/8     0       3/16    ?
Y = 3     0       0       1/16    1/16    1/8     ?
Y = 4     0       0       1/16    3/16    1/4     ?
f(X)      1/4     1/4     1/4     1/4
E(Y|X)    ?       ?       ?       ?

Fill in the blanks

E(X) = ____

E(Y) = ____