
Post on 21-Dec-2015


Probability Distributions

Random Variables: Finite and Continuous

A review

MAT174, Spring 2004

Finite Random Variables

We want to associate probabilities with the values that the random variable takes on.

There are two types of functions that allow us to do this:

– Probability Mass Functions (p.m.f.)

– Cumulative Distribution Functions (c.d.f.)

Probability Distributions

The pattern of probabilities for a random variable is called its probability distribution.

In the case of a finite random variable we call this the probability mass function (p.m.f.), $f_X(x)$, where $f_X(x) = P(X = x)$.

$0 \le P(X = x) \le 1$. Thus, $0 \le f_X(x) \le 1$ for any value of $x$, and

$$\sum_{\text{all } x} f_X(x) = \sum_{i=1}^{n} f_X(x_i) = 1$$

Probability Mass Function

A p.m.f. can be drawn as a histogram representing the probabilities.

The bars are centered above the values of the random variable.

The heights of the bars are equal to the corresponding probabilities (when the width of each rectangle is 1).

[Figure: histogram of P(X = x) for x = 0, 1, 2; vertical axis from 0 to 0.5]

Cumulative Distribution Function

The same probability information is often given in a different form, called the cumulative distribution function (c.d.f.) or FX.

– FX(x) = P(X ≤ x)

– 0 ≤ FX(x) ≤ 1, for all x

– In the finite case, the graph of a c.d.f. should look like a step function, where the maximum is 1 and the minimum is 0.
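As a minimal sketch of how a p.m.f. and a c.d.f. carry the same information, the Python below checks the two p.m.f. properties and accumulates the probabilities into a step-function c.d.f. The random variable (number of heads in two fair coin tosses) and the helper names are assumptions chosen only for illustration; they are not taken from the course files.

```python
from itertools import accumulate

# Hypothetical finite random variable: number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

def is_valid_pmf(pmf, tol=1e-12):
    """Check 0 <= f_X(x) <= 1 for every x and that the probabilities sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    return in_range and abs(sum(pmf.values()) - 1.0) <= tol

def cdf_from_pmf(pmf):
    """Return {x: F_X(x)} at each listed value, where F_X(x) = P(X <= x)."""
    xs = sorted(pmf)
    running_totals = accumulate(pmf[x] for x in xs)
    return dict(zip(xs, running_totals))

def cdf_at(pmf, x):
    """Evaluate the step function F_X at any real number x."""
    return sum(p for value, p in pmf.items() if value <= x)

cdf = cdf_from_pmf(pmf)
print(is_valid_pmf(pmf))  # True
print(cdf)                # {0: 0.25, 1: 0.75, 2: 1.0}
print(cdf_at(pmf, 1.5))   # 0.75, the c.d.f. is flat between the values 1 and 2
```

The c.d.f. only jumps at the values the random variable can take, which is why its graph is a step function rising from 0 to 1.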

Cumulative Distribution Function

[Figure: Cumulative Distribution Function, graph of FX(x) against x for x from 0 to 14; vertical axis from 0.0 to 1.0]

Binomial Random Variable

Let X stand for the number of successes in n Bernoulli trials. X is then called a Binomial Random Variable.

Binomial Setting:

1. You have n repeated trials of an experiment
2. On a single trial, there are only two possible outcomes
3. The probability of success is the same from trial to trial
4. The outcome of each trial is independent

The Expected Value of a Binomial R.V. is given by E(X) = n*p

BINOMDIST

BINOMDIST is a built-in Excel function that gives values for the p.m.f. and c.d.f. of any binomial random variable.

It is located under Statistical in the Function menu.

– BINOMDIST(x, n, p, false) = P(X = x)

– BINOMDIST(x, n, p, true) = P(X ≤ x)
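For readers working outside Excel, the following is a rough Python analog of the two BINOMDIST calls above. The function name `binomdist` and the example inputs are assumptions for illustration; this is a sketch of the binomial formula, not Excel's implementation.

```python
from math import comb

def binomdist(x, n, p, cumulative):
    """Hypothetical analog of Excel's BINOMDIST(x, n, p, cumulative).

    cumulative=False returns P(X = x); cumulative=True returns P(X <= x),
    where X counts successes in n independent trials with success probability p.
    """
    def pmf(k):
        # Number of ways to place k successes, times the probability of each way.
        return comb(n, k) * p**k * (1 - p)**(n - k)

    if cumulative:
        return sum(pmf(k) for k in range(0, x + 1))
    return pmf(x)

# Example inputs (assumed): 4 trials with success probability 0.5.
print(binomdist(2, 4, 0.5, False))  # P(X = 2)  -> 0.375
print(binomdist(2, 4, 0.5, True))   # P(X <= 2) -> 0.6875
```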

Expected Value

This is the average value of X (what happens on average in infinitely many repeated trials of the underlying experiment).

– It is denoted by $\mu_X$

$$E(X) = \mu_X = \sum_{\text{all } x} x \cdot f_X(x)$$

For a Binomial Random Variable, E(X) = n*p, where n is the number of independent trials and p is the probability of success.
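The sum over all x of x times the p.m.f. can be checked against the shortcut E(X) = n*p for a binomial random variable. The sketch below does this in Python; the values n = 10 and p = 0.3 and the helper names are assumptions chosen only for illustration.

```python
from math import comb

def expected_value(pmf):
    """E(X) for a finite random variable: sum of x * f_X(x) over all x."""
    return sum(x * prob for x, prob in pmf.items())

def binomial_pmf(n, p):
    """Return {k: P(X = k)} for a binomial random variable with n trials."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

n, p = 10, 0.3  # hypothetical example values
mean_from_sum = expected_value(binomial_pmf(n, p))
print(mean_from_sum, n * p)  # the two numbers agree up to rounding error
```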

Continuous Random Variable

Continuous random variables take on values in an interval; you cannot list all the possible values.

Examples:
1. Let X be a randomly selected number between 0 and 1
2. Let R be a future value of a weekly ratio of closing prices for IBM stock
3. Let W be the exact weight of a randomly selected student

You can only calculate useful probabilities for intervals of values of X. P(X = x) carries no information, because it is 0 for every x; however, we can still look at the c.d.f., FX(x).

Probability Density Function (p.d.f)

Represented by fX(x)

– fX(x) is the height of the graph of fX at an input of x

– This function does not give probabilities

For any continuous random variable, X, P(X = a) = 0 for every number a.

Look at probabilities associated with X taking on an interval of values

– P(a ≤ X ≤ b)

Probability Density Function (p.d.f)

To find P(a ≤ X ≤ b), we need to look at the portion of the graph that corresponds to this interval.

How can we relate this to integration?

[Figure: graph of fX, with A marking the area under the curve between a and b]

Probability Density Function

$$A = F_X(b) - F_X(a) = P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$$

Cumulative Distribution Function

CDF

– FX(x) = P(X ≤ x)

– 0 ≤ FX(x) ≤ 1, for all x

NOTE: Regardless of whether the random variable is finite or continuous, the c.d.f., FX, has the same interpretation

– I.e., FX(x) = P(X ≤ x)

Cumulative Distribution Function

For the finite case, our c.d.f graph was a step function

For the continuous case, our c.d.f. graph will be a continuous graph

Cumulative Distribution Function

0.00.20.40.60.81.01.2

-1 0 1 2 3t

F T(t )

Fundamental Theorem of Calculus (FTC)

Given that $G(x) = \int g(x)\,dx$

– Differentiate both sides and what happens?

Well, from the previous slide we can see that $F_X(x) = \int f_X(x)\,dx$

– If we differentiate both sides, we get that $F_X'(x) = f_X(x)$

What does this say? How can we verify this claim?
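One way to check the claim that the derivative of the c.d.f. is the p.d.f. is numerically: estimate the slope of the c.d.f. with a small difference quotient and compare it with the p.d.f. The pair below (F(x) = x^2 and f(x) = 2x on [0, 1]) is a hypothetical example chosen for illustration, not one from the course files.

```python
def cdf(x):
    """Hypothetical c.d.f.: F_X(x) = x^2 on [0, 1], 0 below and 1 above."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x * x

def pdf(x):
    """Matching p.d.f.: f_X(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0 <= x <= 1 else 0.0

def derivative(func, x, h=1e-5):
    """Central difference quotient, an approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.2, 0.5, 0.9):
    # The slope of the c.d.f. and the height of the p.d.f. should agree.
    print(x, derivative(cdf, x), pdf(x))
```

In words: the height of the density at x measures how fast cumulative probability is building up at x.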

Example 7 from Course Files

Define the following function:

$$f_X(x) = \begin{cases} -7.5x^4 + 30x^3 - 37.5x^2 + 15x & \text{if } 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$$

– What are the possible values of X?

– Set up an integral that would give you the following probabilities:
   P(X < 0.5)
   P(X > 0.6)
   P(0.1 ≤ X ≤ 0.9)
   P(0.1 ≤ X ≤ 5)

– Verify that the function is a density function

– What is E(X)?
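The questions in this example can be explored with exact polynomial antiderivatives. The sketch below assumes the density is f(x) = -7.5x^4 + 30x^3 - 37.5x^2 + 15x on [0, 1] and 0 elsewhere; the signs of the coefficients are an assumption, reconstructed as the only choice for which the function integrates to 1. Function names are illustrative.

```python
# Assumed density on [0, 1] (signs reconstructed, see the note above):
# f_X(x) = -7.5x^4 + 30x^3 - 37.5x^2 + 15x, stored as {power: coefficient}.
COEFFS = {4: -7.5, 3: 30.0, 2: -37.5, 1: 15.0}

def antiderivative(x, extra_power=0):
    """Antiderivative of x**extra_power * f_X(x), chosen to be 0 at x = 0."""
    return sum(
        c * x ** (k + extra_power + 1) / (k + extra_power + 1)
        for k, c in COEFFS.items()
    )

def prob(a, b):
    """P(a <= X <= b); the interval is clipped to [0, 1] since f_X is 0 elsewhere."""
    a, b = max(a, 0.0), min(b, 1.0)
    if b <= a:
        return 0.0
    return antiderivative(b) - antiderivative(a)

total_area = prob(0.0, 1.0)                              # should be 1 for a density
mean = antiderivative(1.0, 1) - antiderivative(0.0, 1)   # integral of x * f_X(x)

print(total_area)       # 1.0
print(prob(0.0, 0.5))   # P(X < 0.5)
print(prob(0.6, 1.0))   # P(X > 0.6)
print(prob(0.1, 0.9))   # P(0.1 <= X <= 0.9)
print(prob(0.1, 5.0))   # same as P(0.1 <= X <= 1), no density beyond 1
print(mean)             # E(X) = 0.375
```

Checking that the function is a density also requires f(x) ≥ 0 on [0, 1]; under the assumed signs the polynomial factors as 7.5x(1 - x)^2(2 - x), which is non-negative there.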

Expected Value

For a finite random variable, we summed over all possible values of x

For a continuous random variable, we want to integrate over all possible values of x

This implies that

$$E(X) = \mu_X = \int_{-\infty}^{\infty} x \cdot f_X(x)\,dx$$

Example 8 from the Course Files

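Let T be the amount of time between consecutive computer crashes. T has the following p.d.f. and c.d.f.:

$$f_T(t) = \begin{cases} 0 & \text{if } t < 0 \\ \dfrac{1}{16.8}\, e^{-t/16.8} & \text{if } 0 \le t \end{cases}
\qquad
F_T(t) = \begin{cases} 0 & \text{if } t < 0 \\ 1 - e^{-t/16.8} & \text{if } 0 \le t \end{cases}$$

– What type of r.v. is T?

– Calculate P(1 < T < 5) in two different ways.

– What is E(T)?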
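The two ways of computing P(1 < T < 5) can be compared numerically. The sketch below assumes the p.d.f. (1/16.8)e^(-t/16.8) and c.d.f. 1 - e^(-t/16.8) for t ≥ 0; the midpoint Riemann sum is a stand-in for the integral and is an illustrative choice, not the method prescribed in the course files.

```python
from math import exp

ALPHA = 16.8  # parameter appearing in the example's p.d.f. and c.d.f.

def pdf(t):
    """f_T(t) = (1/ALPHA) * e^(-t/ALPHA) for t >= 0, and 0 otherwise."""
    return exp(-t / ALPHA) / ALPHA if t >= 0 else 0.0

def cdf(t):
    """F_T(t) = 1 - e^(-t/ALPHA) for t >= 0, and 0 otherwise."""
    return 1.0 - exp(-t / ALPHA) if t >= 0 else 0.0

def integrate(func, a, b, steps=10_000):
    """Midpoint Riemann sum approximating the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

# Way 1: difference of c.d.f. values.
via_cdf = cdf(5) - cdf(1)
# Way 2: area under the p.d.f. between 1 and 5.
via_pdf = integrate(pdf, 1, 5)

print(via_cdf, via_pdf)  # the two results agree to many decimal places
```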

Exponential Distribution

Exponential random variables usually describe the waiting time between consecutive events.

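In general, the p.d.f. and c.d.f. for an exponential random variable X are given as follows:

$$f_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \dfrac{1}{\alpha}\, e^{-x/\alpha} & \text{if } 0 \le x \end{cases}
\qquad
F_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 - e^{-x/\alpha} & \text{if } 0 \le x \end{cases}$$

Any EXPONENTIAL random variable X, with parameter $\alpha$, has $E(X) = \alpha$.

How can we verify this?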
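One possible verification uses integration by parts. The sketch below assumes the density is $\frac{1}{\alpha} e^{-x/\alpha}$ for $x \ge 0$ and 0 otherwise, with $\alpha > 0$ denoting the parameter, and takes $u = x$, $dv = \frac{1}{\alpha} e^{-x/\alpha}\,dx$:

```latex
\begin{align*}
E(X) &= \int_{-\infty}^{\infty} x \, f_X(x)\,dx
      = \int_{0}^{\infty} \frac{x}{\alpha}\, e^{-x/\alpha}\,dx \\
     &= \Big[ -x\, e^{-x/\alpha} \Big]_{0}^{\infty}
        + \int_{0}^{\infty} e^{-x/\alpha}\,dx \\
     &= 0 + \Big[ -\alpha\, e^{-x/\alpha} \Big]_{0}^{\infty}
      = \alpha .
\end{align*}
```

The boundary term vanishes because $x e^{-x/\alpha} \to 0$ as $x \to \infty$.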

Continuous R.V. with exponential distribution

[Figure, left: Probability Density Function, fX(x) against x for x from -3 to 15]

[Figure, right: Cumulative Distribution Function, FX(x) against x for x from -3 to 15]

• How can we verify that the graph on the left is the graph of a p.d.f.?

Uniform Distribution

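If the probability that X assumes a value in a subinterval is the same for all subintervals of equal length within an interval [0,u], then we have a continuous uniform random variable.

– X is equally likely to assume any value in [0,u]

If X is uniform on the interval [0,u], then we have the following formulas:

$$f_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \dfrac{1}{u} & \text{if } 0 \le x \le u \\ 0 & \text{if } u < x \end{cases}
\qquad
F_X(x) = \begin{cases} 0 & \text{if } x < 0 \\ \dfrac{x}{u} & \text{if } 0 \le x \le u \\ 1 & \text{if } u < x \end{cases}$$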

Continuous R.V. with uniform distribution

[Figure: Probability Density Function, fX(x) against x for x from -100 to 800; vertical axis from 0.0000 to 0.0016]

[Figure: Cumulative Distribution Function, FX(x) against x for x from -100 to 800; vertical axis from 0.0 to 1.0]

• In general, if X is a continuous random variable with a UNIFORM distribution on [0,u], then $E(X) = \dfrac{u}{2}$
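The formula for the mean can be checked directly from the continuous definition of expected value, assuming the uniform density is $1/u$ on $[0,u]$ and 0 elsewhere:

```latex
\begin{align*}
E(X) &= \int_{-\infty}^{\infty} x \, f_X(x)\,dx
      = \int_{0}^{u} x \cdot \frac{1}{u}\,dx
      = \frac{1}{u} \Big[ \frac{x^2}{2} \Big]_{0}^{u}
      = \frac{u}{2} .
\end{align*}
```

This matches the intuition that the average of a value spread evenly over [0,u] is the midpoint of the interval.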

Focus on the Project

Look at the file Auction Focus.xls in the course files

– This file contains 22 prior leases

– Looking at each prior lease, we see that if each company bid their signal, every company that won the auction would have lost money

– We want to devise a new bidding strategy using this data

Use data to simulate thousands of similar auctions

Identify Random Variables

We need random variables

– Let V be the continuous random variable that gives the fair profit value, in millions of dollars, for an oil lease similar to the 22 tracts
   Look through Auction Focus.xls to see the statistics for the sample

– Each signal is an observation of the continuous random variable $S_v$, where v is the actual fair value of the tract
   It is assumed that $E(S_v) = v$ for every lease

– $R_v$ gives the error in a company’s signal
   Given by the signal minus the actual fair profit value of the lease
   $E(R_v) = 0$ for every value of v

What should you do?

From slide 65 in MBD 2 Proj 2.ppt

1. Start an Excel file which incorporates the historical data on the lease values and your team’s particular set of signals

2. Use these to compute the complete sample of signal errors, and then analyze this sample. Specifically, you should compute the maximum, minimum, and sample mean of the errors. You should also plot a histogram that approximates the actual p.d.f., fR, of R

– Go to slide 50 to see information about relative frequencies
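The project itself is done in Excel, but the calculation in step 2 can be sketched in Python to show its shape. Everything below is an assumption for illustration: the numbers are made-up placeholders, not the data in Auction Focus.xls, and scaling bar heights as relative frequency divided by bin width is one common convention for approximating a p.d.f., which may differ from what the referenced slides prescribe.

```python
# Made-up placeholder numbers, NOT the data in Auction Focus.xls.
# Each entry: (fair value of the lease, list of company signals), in millions of dollars.
leases = [
    (10.0, [8.0, 11.0, 13.0]),
    (20.0, [17.0, 20.0, 24.0]),
]

def signal_errors(leases):
    """Error = signal minus the actual fair value of the lease, for every signal."""
    return [signal - value for value, signals in leases for signal in signals]

def histogram_heights(errors, low, high, bins):
    """Bar heights = relative frequency / bin width, so the total bar area is 1.

    Assumes low <= min(errors) and high >= max(errors).
    """
    width = (high - low) / bins
    counts = [0] * bins
    for e in errors:
        index = min(int((e - low) / width), bins - 1)  # e == high goes in the last bin
        counts[index] += 1
    return [c / len(errors) / width for c in counts]

errors = signal_errors(leases)
sample_mean = sum(errors) / len(errors)
heights = histogram_heights(errors, low=-4.0, high=4.0, bins=4)

print(max(errors), min(errors), sample_mean)
print(heights)
```

Because the bar areas add up to 1, the histogram can be read as a rough picture of the density of the signal errors.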