Chapter 2: Probability

2.1

A random variable (r.v.) is a variable whose value is unknown until it is observed. The value of a random variable results from an experiment.


Experiments can be either controlled (laboratory) or uncontrolled (observational). Most economic variables are random and are the result of uncontrolled experiments.

2.2 Random Variables

A discrete random variable can take on only a finite (or countably infinite) number of values, such as:

• The number of visits to a doctor’s office

• Number of children in a household

• Flip of a coin

• Dummy (binary) variable: D=0 if male, D=1 if female

A continuous random variable can take any real value (not just whole numbers) in an interval on the real number line such as:

• Gross Domestic Product next year

• Price of a share in Microsoft

• Interest rate on a 30-year mortgage

2.3 Probability Distributions of Random Variables

• All random variables have probability distributions that describe the values the random variable can take on and the associated probabilities of these values.

• Knowing the probability distribution of a random variable gives us some indication of the value the r.v. may take on.

2.4 Probability Distribution for Discrete Random Variable

Expressed as a table, graph or function

1. Suppose X = # of tails when a coin is flipped twice. X can take on the values 0, 1 or 2. Let f(x) be the associated probabilities:

Table

x      f(x)
0      0.25
1      0.50
2      0.25

[Graph: bar chart of f(x) against x = 0, 1, 2, with bar heights 0.25, 0.50 and 0.25.]

Probability is represented as height on this bar graph.
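The table above can be reproduced by listing every outcome of two flips and counting tails. This is a minimal sketch that assumes a fair coin (which the 0.25/0.50/0.25 probabilities imply); the names `outcomes` and `pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin flips.
outcomes = list(product("HT", repeat=2))

# X = number of tails in each outcome.
counts = Counter(outcome.count("T") for outcome in outcomes)

# f(x) = (number of outcomes with x tails) / (total number of outcomes).
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, prob in pmf.items():
    print(x, float(prob))
```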

2.5

2. Suppose X is a binary variable that can take on two values: 0 or 1. Furthermore, assume P(X=1) = p and P(X=0) = (1-p)

Function:

$P(X = x) = f(x) = p^{x}(1-p)^{1-x}$ for x = 0, 1

Table

X f(x)

0 (1-p)

1 p

Suppose p = 0.10

Then X takes on 0 with probability 0.90 and X takes on 1 with probability 0.10
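As a quick check of the function form, the sketch below evaluates f(x) at x = 0 and x = 1 for p = 0.10. The function name `bernoulli_pmf` is a label chosen here, not one used in the slides.

```python
def bernoulli_pmf(x, p):
    """Return f(x) = p**x * (1 - p)**(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return p**x * (1 - p) ** (1 - x)


p = 0.10
f0 = bernoulli_pmf(0, p)  # probability that X = 0
f1 = bernoulli_pmf(1, p)  # probability that X = 1
print(f0, f1)
```

Setting x = 0 switches off the p term and leaves (1 - p); setting x = 1 switches off the (1 - p) term and leaves p, which is why one formula covers both rows of the table.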

2.6 Facts about discrete probability distribution functions

1. Each probability P(X=x) = f(x) must lie between 0 and 1: $0 \le f(x) \le 1$

2. The sum of the probabilities must be 1. If X can take on n different values then:

$f(x_1) + f(x_2) + \dots + f(x_n) = 1$

2.7

Probability Distribution (Density) for Continuous Random Variables

Expressed as a function or graph.

Continuous r.v.’s can take on an infinite number of values in a given interval

– A table isn’t appropriate to express the p.d.f.

EX: $f(x) = 2x$ for $0 \le x \le 1$

$f(x) = 0$ otherwise

2.8

Because a continuous random variable has an uncountably infinite number of values, the probability of any single value occurring is zero.

P(X = a) = 0

Instead, we ask “What is the probability that X is between a and b?”

P[a < X < b] = ?

The probability P[a < X < b] is the proportion of the time, over many repetitions of the experiment, that X will fall between a and b.

2.9

Probability is represented as area under the function.

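Total area must be 1.0. For f(x) = 2x on 0 ≤ x ≤ 1, the area of the triangle is ½ × 1 × 2 = 1.0.

Probability that X lies between 0 and 1/2: $P[0 \le X \le 1/2] = \tfrac{1}{2} \times \tfrac{1}{2} \times 1 = 0.25$ (the area of any triangle is ½ × base × height).

[Graph: f(x) = 2x plotted against x from 0 to 1, reaching height 2 at x = 1, with x = 1/2 marked.]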
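The two areas can also be approximated numerically, which is the kind of calculation a computer does in place of integration by hand. The sketch below uses a simple midpoint rule; the helper name `area` and the step count are choices made here for illustration.

```python
def f(x):
    """Density from the example: f(x) = 2x on [0, 1], 0 otherwise."""
    return 2 * x if 0 <= x <= 1 else 0.0


def area(a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (k + 0.5) * width) for k in range(steps)) * width


total = area(0, 1)    # should be close to 1.0
half = area(0, 0.5)   # should be close to 0.25
print(round(total, 6), round(half, 6))
```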

2.10

Uniform Random Variable: u is distributed uniformly between a and b

• p.d.f. is a line between a and b of height 1/(b-a)

• $f(u) = 1/(b - a)$ if $a \le u \le b$

$f(u) = 0$ otherwise

EX: Spin a dial on a clock, so a = 0 and b = 12. Find the probability that u lies between 1 and 2.

[Graph: f(u) is a horizontal line of height 1/12 over u from 0 to 12, with u = 1 and u = 2 marked.]
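For a uniform r.v. the probability of an interval is a rectangle: its width times the height 1/(b - a). A small sketch of the clock example follows; `uniform_prob` is a hypothetical helper name, and exact fractions are used so the answer prints as 1/12.

```python
from fractions import Fraction


def uniform_prob(lo, hi, a, b):
    """P(lo <= u <= hi) for u uniform on [a, b]: width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)  # clip the interval to [a, b]
    if hi <= lo:
        return Fraction(0)
    return Fraction(hi - lo, b - a)


prob = uniform_prob(1, 2, 0, 12)
print(prob)
```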

2.11

In calculus, the integral of a function defines the area under it:

$P[a \le X \le b] = \int_{a}^{b} f(x)\,dx$

For continuous random variables it is the area under f(x), and not f(x) itself, which defines the probability of an event. We will NOT be integrating functions; when necessary we use tables and/or computers to calculate the necessary probability (integral).

2.12

Rules of Summation

Rule 1: $\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$

Rule 2: $\sum_{i=1}^{n} a = na$

Rule 3: $\sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i$

Rule 4: $\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$

2.13

Rules of Summation (continued)

Rule 5: $\sum_{i=1}^{n} (a x_i + b y_i) = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i$

Rule 6: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n}$

From Rule 6, we can prove (in class) that:

$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$
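The result holds because the sum of the x values equals n times the mean (Rule 6), while summing the constant mean n times also gives n times the mean (Rule 2), so the two pieces cancel. A numerical check on a made-up list of values (the numbers are hypothetical, chosen only for illustration):

```python
from fractions import Fraction

# Hypothetical data; any list of numbers would do.
x = [Fraction(v) for v in (2, 5, 7, 10)]
n = len(x)

x_bar = sum(x) / n                      # Rule 6: the sample mean
dev_sum = sum(xi - x_bar for xi in x)   # sum of deviations from the mean

print(x_bar, dev_sum)
```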

2.14

Rules of Summation (continued)

Rule 6: $\sum_{i=1}^{n} f(x_i) = f(x_1) + f(x_2) + \dots + f(x_n)$

Notation: $\sum_{x} f(x_i) = \sum_{i} f(x_i) = \sum_{i=1}^{n} f(x_i)$

Rule 7: $\sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j) = \sum_{i=1}^{n} [ f(x_i, y_1) + f(x_i, y_2) + \dots + f(x_i, y_m) ]$

The order of summation does not matter:

$\sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j) = \sum_{j=1}^{m} \sum_{i=1}^{n} f(x_i, y_j)$

2.15

The Mean of a Random Variable

The mean of a random variable is its mathematical expectation, or expected value. For a discrete random variable, this is:

$E(X) = \sum_{i=1}^{n} x_i f(x_i) = x_1 f(x_1) + x_2 f(x_2) + \dots + x_n f(x_n)$

where n is the number of values X can take on.

It is a probability-weighted average of the possible values the random variable X can take on. This is a sum for discrete r.v.’s and an integral for continuous r.v.’s

2.16

• E(X) tells us the “long-run” average value for X. It is not the value one would expect X to take on.

• If you were to randomly draw values of X from its pdf a very large number of times and average these values, the average would approach E(X).

• $E(X) = \mu$. The Greek letter “mu” is not used in your text but is commonly used to denote the mean of X.

2.17 Example: Roll a fair die

$E(X) = \sum_{i=1}^{6} x_i f(x_i) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 21/6 = 3.5$

Interpretation: In a large number of rolls of a fair die, one-sixth of the values will be 1’s, one-sixth of the values will be 2’s, etc., and the average of these values will be 3.5.
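Both the exact calculation and the long-run interpretation can be illustrated in a few lines. The sketch below computes E(X) from the pdf and then averages simulated rolls; the seed and the number of rolls are arbitrary choices made here, and the simulated average will only be close to 3.5, not equal to it.

```python
import random
from fractions import Fraction

# Exact expected value: probability-weighted sum over the six faces.
f = {x: Fraction(1, 6) for x in range(1, 7)}
expected = sum(x * prob for x, prob in f.items())
print(expected, float(expected))

# Long-run average: simulate many rolls and average them.
random.seed(0)  # arbitrary seed, only so the run is repeatable
rolls = [random.randint(1, 6) for _ in range(100_000)]
average = sum(rolls) / len(rolls)
print(average)
```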

2.18 Mathematical Expectation

• Think of E(.) as an operator that requires you to weight by probabilities any expression inside the parentheses, and then sum

• $E(g(x)) = \sum_{i=1}^{n} g(x_i) f(x_i) = g(x_1) f(x_1) + g(x_2) f(x_2) + \dots + g(x_n) f(x_n)$

2.19 Rules of Mathematical Expectation

• E(c) = c where c is a constant

• E(cX) = cE(X) where c is a constant and X is a random variable

• E(a + cX) = a + cE(X) where a and c are constants and X is a random variable.

2.20 Variance of a Random Variable

• Like the mean, the variance of a r.v. is an expected value, but it is the expected value of the squared deviations from the mean

• Let $g(x) = (x - E(x))^2$

• Variance: $\sigma^2 = \mathrm{Var}(x) = E[(x - E(x))^2] = \sum_{i=1}^{n} g(x_i) f(x_i) = \sum_{i=1}^{n} (x_i - E(x))^2 f(x_i)$

• It measures the amount of dispersion in the possible values for X.

2.21 About Variance

• Unit of measurement is X units squared

• When we create a new random variable as a linear transformation of X:

y = a + cx

We know that E(y) = a + cE(x)

But $\mathrm{Var}(y) = c^2\,\mathrm{Var}(x)$

(proof in class) This property tells us that the amount of variation in y is determined by the amount of variation in x and the constant c. The additive constant a in no way alters the amount of variation in the values of y.
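The two properties can be checked numerically on any discrete pdf. The sketch below uses the fair die as X and the hypothetical constants a = 10 and c = 3 (chosen here purely for illustration); `mean` and `variance` are helper names introduced for this example.

```python
from fractions import Fraction


def mean(pmf):
    """Probability-weighted average of the values in a {value: prob} dict."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Expected squared deviation from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


pmf_x = {x: Fraction(1, 6) for x in range(1, 7)}   # fair die
a, c = 10, 3                                       # hypothetical constants
pmf_y = {a + c * x: p for x, p in pmf_x.items()}   # y = a + c*x

print(mean(pmf_y), a + c * mean(pmf_x))            # both equal E(y)
print(variance(pmf_y), c**2 * variance(pmf_x))     # both equal Var(y)
```

Changing `a` shifts every value of y by the same amount and leaves the variance untouched, while changing `c` rescales the spread by c squared.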

2.22 About Variance (cont’d)

• $E[(x - E(x))^2] = E[x^2 - 2E(x)x + E(x)^2]$

$= E(x^2) - 2E(x)E(x) + E(x)^2$

$= E(x^2) - 2E(x)^2 + E(x)^2$

$= E(x^2) - E(x)^2$

• Run the E(.) operator through, pulling out constants and stopping on random variables. Remember that E(x) is itself a constant, so

• E(E(x)) = E(x)
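The shortcut can be confirmed against the definition using the two-flip coin example from earlier in the chapter (X = number of tails, with probabilities 0.25, 0.50, 0.25). Variable names below are illustrative.

```python
from fractions import Fraction

# pdf of X = number of tails in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

e_x = sum(x * p for x, p in pmf.items())        # E(x)
e_x2 = sum(x**2 * p for x, p in pmf.items())    # E(x^2)

var_definition = sum((x - e_x) ** 2 * p for x, p in pmf.items())
var_shortcut = e_x2 - e_x**2                    # E(x^2) - E(x)^2

print(e_x, e_x2, var_definition, var_shortcut)
```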

2.23 Standard Deviation

• Because variance is in squared units of the r.v., we can take the square root of the variance to obtain the standard deviation.

$\sigma = \sqrt{\sigma^2} = \sqrt{\mathrm{Var}(x)}$

Be sure to take the square root after you square and sum the deviations from the mean.

2.24 Joint Probability

• An experiment can randomly determine the outcome of more than one variable.

• When there are 2 random variables of interest, we study the joint probability density function

• When there are more than 2 random variables of interest, we study the multivariate probability density function.

2.25 For a discrete joint pdf, probability is expressed in a matrix:

                    X
Y        -10      0       10      20      f(y)
6        0        0       0.10    0.10
8        0        0.10    0.30    0.20
10       0.10     0.10    0       0
f(x)

Let X= return on stocks, Y= return on bonds

P(X=x,Y=y) = f(x,y)

e.g. P(X=10,Y=8) = 0.30

2.26 About Joint p.d.f.’s

• Marginal Probability Distribution: What is the probability distribution for X regardless of what values Y takes on?

$f(x) = \sum_{y} f(x, y)$

What is the probability distribution for Y regardless of what values X takes on?

$f(y) = \sum_{x} f(x, y)$
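Applied to the stock and bond return matrix from the joint pdf slide, the marginals are just the column and row totals. A minimal sketch, with the matrix entered in tenths so the arithmetic stays exact; the names `rows`, `joint`, `f_x` and `f_y` are choices made for this example.

```python
from fractions import Fraction

# Joint pdf from the matrix: columns are X values, rows are Y values (in tenths).
xs = (-10, 0, 10, 20)
rows = {6: (0, 0, 1, 1), 8: (0, 1, 3, 2), 10: (1, 1, 0, 0)}
joint = {(x, y): Fraction(n, 10) for y, row in rows.items() for x, n in zip(xs, row)}

# Marginal of X: sum f(x, y) over y. Marginal of Y: sum f(x, y) over x.
f_x, f_y = {}, {}
for (x, y), p in joint.items():
    f_x[x] = f_x.get(x, 0) + p
    f_y[y] = f_y.get(y, 0) + p

for x in sorted(f_x):
    print("f(x=%d) = %.2f" % (x, float(f_x[x])))
for y in sorted(f_y):
    print("f(y=%d) = %.2f" % (y, float(f_y[y])))
```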

2.27 • Conditional Probability Distribution:

What is the probability distribution for X given that Y takes on a particular value?

f(x|y) = f(x,y)/f(y)

What is the probability distribution for Y given that X takes on a particular value?

f(y|x) = f(x,y)/f(x)
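As an illustration with the stock and bond return matrix, the sketch below computes the distribution of X given Y = 8 by dividing that row of the joint pdf by the marginal f(y = 8). Conditioning on Y = 8 is an arbitrary choice made here; any row would work the same way.

```python
from fractions import Fraction

# Joint pdf from the matrix: columns are X values, rows are Y values (in tenths).
xs = (-10, 0, 10, 20)
rows = {6: (0, 0, 1, 1), 8: (0, 1, 3, 2), 10: (1, 1, 0, 0)}
joint = {(x, y): Fraction(n, 10) for y, row in rows.items() for x, n in zip(xs, row)}

y0 = 8  # the value of Y we condition on (chosen for illustration)
f_y0 = sum(p for (x, y), p in joint.items() if y == y0)         # marginal f(y0)
cond = {x: p / f_y0 for (x, y), p in joint.items() if y == y0}  # f(x | y0)

for x, p in cond.items():
    print("f(x=%d | y=%d) = %s" % (x, y0, p))
```

The conditional probabilities sum to 1, as any probability distribution must.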

2.28

• Covariance: A measure that summarizes the joint probability distribution between two random variables.

cov(x,y) = E[(x – E(x))(y-E(y))]

$= \sum_{i} \sum_{j} (x_i - E(x))(y_j - E(y)) f(x_i, y_j)$

Ex:

2.29 About Covariance:

It measures the joint association between 2 random variables. Try asking: “When X is large, is Y more or less likely to also be large?”

If the answer is that Y is likely to be large when X is large, then we say X and Y have a positive relationship. Cov(x,y) > 0

If the answer is that Y is likely to be small when X is large, then we say that X and Y have a negative relationship. Cov(x,y) < 0.

cov(x,y) = E[(x – E(x))(y – E(y))]

= E[xy – E(x)y – xE(y) + E(x)E(y)]

= E(xy) – E(x)E(y) – E(x)E(y) + E(x)E(y)

= E(xy) – E(x)E(y) useful!!

2.30

• Correlation

Covariance has awkward units of measurement. Correlation removes all units of measurement by dividing covariance by the product of the standard deviations:

$\rho_{xy} = \mathrm{Cov}(x, y) / (\sigma_x \sigma_y)$ and $-1 \le \rho_{xy} \le 1$
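Covariance and correlation can be worked out for the stock and bond return matrix from the joint pdf slide. The sketch below is one way to organize the calculation; the helper `E` (expected value of any function of x and y under the joint pdf) is a name introduced here, not notation from the text.

```python
import math
from fractions import Fraction

# Joint pdf from the matrix: columns are X values, rows are Y values (in tenths).
xs = (-10, 0, 10, 20)
rows = {6: (0, 0, 1, 1), 8: (0, 1, 3, 2), 10: (1, 1, 0, 0)}
joint = {(x, y): Fraction(n, 10) for y, row in rows.items() for x, n in zip(xs, row)}


def E(g):
    """Expected value of g(x, y): weight by joint probabilities, then sum."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


mean_x, mean_y = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: x * y) - mean_x * mean_y       # E(xy) - E(x)E(y)
var_x = E(lambda x, y: x**2) - mean_x**2
var_y = E(lambda x, y: y**2) - mean_y**2
rho = float(cov) / math.sqrt(float(var_x * var_y))  # unit-free, between -1 and 1

print(mean_x, mean_y, cov, var_x, var_y, round(rho, 4))
```

For this matrix the covariance is negative, so stock and bond returns tend to move in opposite directions.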

Ex:

2.31 What does correlation look like?

[Figure: four panels labeled ρ = 0, ρ = .3, ρ = .7 and ρ = .9.]

2.32 Statistical Independence

Two random variables are statistically independent if knowing the value that one will take on does not reveal anything about what value the other may take on:

f(x|y) = f(x) or f(y|x) = f(y)

This implies that f(x,y) = f(x)f(y) if X and Y are independent.

If 2 r.v.’s are independent, then their covariance will necessarily be equal to 0.

2.33 Functions of more than one Random Variable

Suppose that X and Y are two random variables. If we form a weighted sum of them we create a new random variable that has the following mean and variance:

Z = aX + bY

$E(Z) = E(aX + bY) = aE(X) + bE(Y)$

$\mathrm{Var}(Z) = \mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$

If X and Y are independent, then Cov(X, Y) = 0 and:

$\mathrm{Var}(Z) = \mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$ (see page 31)
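The variance formula can be checked against a direct calculation using the stock and bond return matrix from the joint pdf slide. The weights a = b = 1/2 below are hypothetical (an equal split between stocks and bonds, chosen only for illustration), and `E` is a helper name introduced for this sketch.

```python
from fractions import Fraction

# Joint pdf from the matrix: columns are X values, rows are Y values (in tenths).
xs = (-10, 0, 10, 20)
rows = {6: (0, 0, 1, 1), 8: (0, 1, 3, 2), 10: (1, 1, 0, 0)}
joint = {(x, y): Fraction(n, 10) for y, row in rows.items() for x, n in zip(xs, row)}

a, b = Fraction(1, 2), Fraction(1, 2)  # hypothetical weights


def E(g):
    """Expected value of g(x, y) under the joint pdf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


mean_x, mean_y = E(lambda x, y: x), E(lambda x, y: y)
var_x = E(lambda x, y: (x - mean_x) ** 2)
var_y = E(lambda x, y: (y - mean_y) ** 2)
cov_xy = E(lambda x, y: (x - mean_x) * (y - mean_y))

# Direct route: treat Z = aX + bY as its own random variable.
mean_z = E(lambda x, y: a * x + b * y)
var_z_direct = E(lambda x, y: (a * x + b * y - mean_z) ** 2)

# Formula route: a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
var_z_formula = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

print(mean_z, var_z_direct, var_z_formula)
```

Because the covariance in this matrix is negative, the 2ab Cov(X, Y) term lowers the variance of Z below what the two variance terms alone would give.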

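2.34 Normal Probability Distribution

• Many random variables tend to have a normal distribution (a well known bell shape)

• Theoretically, $x \sim N(\beta, \sigma^2)$ where $E(x) = \beta$ and $\mathrm{Var}(x) = \sigma^2$

The probability density function is

$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \beta)^2}{2\sigma^2} \right), \quad -\infty < x < \infty$

[Graph: density plotted against x, with points a and b marked on the axis.]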

2.35 Normal Distribution (cont’d)

• A family of distributions, each with its own mean and variance. The mean anchors the distribution’s center and the variance captures the spread of the bell-shaped curve

• Finding the area under the curve would require integrating the p.d.f., which is too complicated to do by hand. A computer-generated table gives all the probabilities we need for a normal r.v. that has mean 0 and variance 1

To use the table (pg. 389), we need to take a normal random variable $x \sim N(\beta, \sigma^2)$ and transform it by subtracting the mean and dividing by the standard deviation. This is a linear transformation of x that creates a new random variable that has mean 0 and variance 1.

$Z = (x - \beta)/\sigma$ where $Z \sim N(0, 1)$
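In place of the printed table, the standard normal probabilities can be computed from the error function in Python's standard library. The sketch below standardizes both endpoints of an interval and then looks up the two standard normal probabilities; the example values (mean 10, variance 4, interval 8 to 12) are hypothetical and the function names are choices made here.

```python
import math


def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob(lo, hi, mean, sd):
    """P(lo <= x <= hi) for x ~ N(mean, sd**2), by standardizing both endpoints."""
    z_lo = (lo - mean) / sd
    z_hi = (hi - mean) / sd
    return std_normal_cdf(z_hi) - std_normal_cdf(z_lo)


# Hypothetical example: x ~ N(10, 4), so the standard deviation is 2.
prob = normal_prob(8, 12, mean=10, sd=2)
print(round(prob, 4))
```

Here both endpoints lie one standard deviation from the mean, so the interval maps to Z between -1 and 1.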

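2.36 Statistical inference: drawing conclusions about a population based on a sample

Population quantities:

$E(X) = \sum_{i} x_i f(x_i)$

$\mathrm{Var}(X) = E[(X - E(X))^2] = E(X^2) - E(X)^2$

$\sigma_x = \sqrt{\sigma_x^2} = \sqrt{\mathrm{Var}(X)}$

$\mathrm{Cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)] = E(XY) - \mu_x \mu_y$

$\rho_{xy} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$

Sample counterparts:

$\bar{X} = \frac{1}{T} \sum_{t=1}^{T} X_t$

$s_x^2 = \frac{\sum (x_i - \bar{x})^2}{T - 1}$

$s_x = \sqrt{s_x^2}$

$S_{xy} = \frac{1}{T - 1} \sum (x_t - \bar{x})(y_t - \bar{y})$

$r = \frac{S_{xy}}{\sqrt{s_x^2 s_y^2}} = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sqrt{\sum (x_t - \bar{x})^2 \sum (y_t - \bar{y})^2}}$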
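The sample formulas can be tried on a small data set. The sketch below uses two made-up samples of T = 5 observations (the numbers are hypothetical) and assumes the T - 1 divisor for the sample variance and covariance.

```python
import math

# Hypothetical paired sample of T = 5 observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T = len(x)

x_bar = sum(x) / T                                    # sample mean of x
y_bar = sum(y) / T                                    # sample mean of y

# Sample variances and covariance, assuming the T - 1 divisor.
s2_x = sum((xt - x_bar) ** 2 for xt in x) / (T - 1)
s2_y = sum((yt - y_bar) ** 2 for yt in y) / (T - 1)
s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y)) / (T - 1)

r = s_xy / math.sqrt(s2_x * s2_y)                     # sample correlation

print(x_bar, y_bar, s2_x, s2_y, s_xy, round(r, 4))
```

Because the same divisor appears in the numerator and denominator of r, it cancels, which is why r can also be written using only sums of deviations.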
