continuous random variables - binghamton university we try and deal cards every second until we...

74
Continuous random variables Continuous r.v.’s take an uncountably infinite number of possible values. ✿✿✿✿✿✿✿✿✿ Examples: Heights of people Weights of apples Diameters of bolts Life lengths of light-bulbs We cannot assign positive probabilities to every particular point in an interval. (Why?) We can only measure intervals and their unions. For this reason, pmf (probability mass function) is not useful for continuous r.v.’s. For them, probability distributions are described differently.

Upload: lamdieu

Post on 31-Mar-2018

216 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Continuous random variables

Continuous r.v.’s take an uncountably infinite number of possiblevalues.

:::::::::Examples:

• Heights of people• Weights of apples• Diameters of bolts• Life lengths of light-bulbs

We cannot assign positive probabilities to every particular point in aninterval. (Why?)We can only measure intervals and their unions.

For this reason, pmf (probability mass function) is not useful forcontinuous r.v.’s. For them, probability distributions are describeddifferently.

Page 2: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Cumulative distribution function

::::Defn: The

::::::::::::(Cumulative)

::::::::::distribution

::::::::function (or cdf) of a r.v. Y is

denoted F (y) and is defined as

F (y) = P(Y ≤ y)

for −∞ < y <∞.

::::::::Example (cdf of a discrete r.v.):Let Y be a discrete r.v. with probability functiony 1 2 3p(y) 0.3 0.5 0.2 Find the cdf of Y and sketch the graph of the

cdf.

F (y) = P(Y ≤ y) =

0, for y < 1,0.3 for 1 ≤ y < 2,0.8 for 2 ≤ y < 3,1 for y ≥ 3.

::::Note: The cdf for any discrete r.v. is a

:::step function.

There us a finite or countable number of points where cdf increases.

Page 3: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Properties of a cdf function

1. F (−∞) ≡ limy→−∞ F (y) = 0.2. F (∞) ≡ limy→∞ F (y) = 1.3. F (y) is a nondecreasing function of y .4. F (y) is a right-continuous function of y .

A r.v. is said to be::::::::::continuous, if its distribution function is continuous

everywhere.

Page 4: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Probability density functionSuppose Y is a continuous r.v. with cdf F (y). Assume that F (y) isdifferentiable everywhere,except possibly a finite number of points in every finite interval.

Then the density or probability density function (pdf) of Y is definedas

f (y) =dF (y)

dy≡ F ′(y).

It follows from the definitions that

F (y) =

∫ y

−∞f (t)dt .

Page 5: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Properties of a pdf

1. f (y) ≥ 0 for all y(because F (y) is non-decreasing.)

2.∫∞−∞ f (t)dt = 1

(because F (∞) = 1.)

::::::::Example

::1: Let Y be a continuous r.v. with cdf:

F (y) =

0, for y < 0,y3, for 0 ≤ y ≤ 1,1, for y > 1.

Find the pdf f (y) and graph both F (y) and f (y). Find P(Y ≤ 0.5).

Page 6: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Interval probabilitiesWith continuous r.v.’s we usually want the probability that Y falls in aspecific interval [a,b]; that is, we want P(a ≤ Y ≤ b) for numbersa < b.

::::::::Theorem If Y is a continuous r.v. with pdf f (y) and a < b, then

P(a ≤ Y ≤ b) =

∫ b

af (y)dy .

This represents the “area under the density curve” between y = aand y = b:

Page 7: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

::::::::Example Let Y be a continuous r.v. with pdf:

f (y) =

0, for y < 0,3y2, for 0 ≤ y ≤ 1,0, for y > 1.

Find P(0.4 ≤ y ≤ 0.8).

Page 8: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Which of the following are graphs of valid cumulative distributionfunctions?

(A) 1, 2, and 3(B) 1 and 2(C) 2 and 3(D) 3 only(E) 4 only

Page 9: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Suppose X has range [0,2] and pdf f (x) = cx .

What is the value of c?

(A) 1/4(B) 1/2(C) 1(D) 2(E) Other

Page 10: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Suppose X has range [0,2] and pdf f (x) = cx .

Compute P(1 ≤ X ≤ 2).

(A) 1/4(B) 1/2(C) 3/4(D) Other

Page 11: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Suppose X has range [0,b] and cdf f (x) = x2/9.

What is b?

Page 12: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Suppose X is a continuous random variable.

What is P(a ≤ X ≤ a)?

Page 13: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Suppose X is a continuous random variable.

Does P(X = a) = 0 mean X never equals a?

(A) Yes(B) No

Page 14: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

What does it mean never happens?

The number of possible deals in bridge card game is(52

13,13,13,13

)=

52!

13!13!13!13!≈ 5× 1028

. If we try and deal cards every second until we find a specificcombination, then the time until success is a geometric randomvariable with the expected time equal to 5× 1028 seconds.

The age of the Universe is currently estimated as 13.7× 109 years.Every year has 60× 60× 24× 365 ≈ 31× 106 seconds. So the ageof the Universe is ≈ 4× 1017 seconds

So the time to wait for a specific combination of cards in the bridgegame is significantly longer than the age of the Universe.

Page 15: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Why continuous r.v. take values?

Why can’t we argue as below:

P(X ∈ [0,1] = P(⋃

0≤a≤1

{X = a}) =∑

0≤a≤1

P(X = a) =∑

0≤a≤1

0 = 0 ?

Axioms of probability theory do not allow us to speculateabout the probability of a union of the uncountable number of events.

Page 16: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

Let Y be a continuous r.v. with pdf

f (y) =

{cy−1/2, for 0 < y ≤ 4,0, elsewhere,

for some constant c.

Find c, find F (y), graph f (y) and F (y), and find P(Y < 1).

Page 17: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Expected value of a continuous random variable

::::Defn: The

:::::::::expected

:::::value of a continuous r.v. Y is

E(Y ) =

∫ ∞−∞

yf (y)dy ,

provided that the integral exists.

::::::::Theorem If g(Y ) is a function of a continuous r.v. Y , then

E[g(Y )] =

∫ ∞−∞

g(y)f (y)dy .

This is similar to the case of a discrete r.v., with an integral replacinga sum.

Page 18: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Properties of the expected value

• E[aY + b] = aEY + b.• E[Y1 + Y2 + . . .+ Yn] = EY1 + EY2 + . . .+ EYn.

The variance and the standard deviation are defined in the same wayas for the discrete r.v.:

Var(Y ) = E[(Y − µ)2] = E[Y 2]− µ2,

where µ = EY .

Andσ =

√Var(Y ).

Page 19: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

Find the expectation and variance for the r.v. in the previous example.

Pdf is

f (y) =

{cy−1/2, for 0 < y ≤ 4,0, elsewhere,

for some constant c.

Page 20: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

True or False. The mean divides the probability mass in half.

(A) True(B) False

Page 21: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quantiles::::Defn: Let 0 < x < 1. The x-th quantile of a r.v. Y (denoted qx ) is thesmallest value such that P(Y ≤ qx ) ≥ x .

If Y is continuous, qx is the smallest value such that P(Y ≤ qx ) = x .

The quantile function is the inverse of the cumulative distributionfunction.

Page 22: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Median

The 50% quantile of a random variable Y is called the median of Y .

Example: What is the median in the previous example?

In R, quantiles can be computed by functions like qbinom or qpois.

Example: what is the median of the Poisson distribution withparameter λ = 7.

Page 23: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

In the plot below some densities are shown. The median of the blackplot is always at qp. Which density has the greatest median?

(A) Black(B) Red(C) Blue(D) All the same(E) Impossible to tell

Page 24: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

In the plot below some densities are shown. The median of the blackplot is always at qp. Which density has the greatest median?

(A) Black(B) Red(C) Blue(D) All the same(E) Impossible to tell

Page 25: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

The Uniform Distribution

::::::::Example Suppose a bus is supposed to arrive at a bus stop at 8 : 00AM. In reality, it is equally likely to arrive at any time between 8 : 00AM and 8 : 10 AM.

Let Y be the amount of time between 8 : 00 AM and the momentwhen the bus arrives. This is a simple example of a continuous r.v.

What is the probability the bus will arrive between 8 : 02 and 8 : 06AM?

Page 26: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

The uniform distribution

::::Defn: A r.v. Y has a uniform probability distribution on the interval[θ1, θ2] if its pdf is

f (y) =

{ 1θ2−θ1

, if θ1 ≤ y ≤ θ2

0, otherwise.

(If parameters θ1 = 0 and θ2 = 1, the distribution is called thestandard uniform distribution.)

::::::::Theorem If Y ∼ Unif(θ1, θ2), then

E(Y ) =θ1 + θ2

2,

Var(Y ) =(θ2 − θ1)2

12.

Proof:

Page 27: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Find the variance of X ∼ Unif(0,4).

(A) 1/12(B) 1/3(C) 4/3(D) 2(E) other

Page 28: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Find the variance of the sum of four independent Unif(0,4) randomvariables.

Page 29: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Find the 0.7 quantile of X ∼ Unif(2,32).

(A) 21(B) 23(C) 24(D) 25(E) other

Page 30: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

The normal (Gaussian) distribution

The::::::normal

::::::::::distribution is called normal because it describes well

many real-world random variables. It is also often called::::::::Gaussian.

::::Defn: A r.v. Y has a normal or Gaussian distribution, if its pdf is

f (y) =1√

2πσ2exp−(y−µ)

2/(2σ2).

Notation: Y ∼ N (µ, σ2).

The normal pdf is defined over entire real line.

Page 31: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Standard normal distribution

If µ = 0 and σ = 1, then the distribution is called the standard normaldistribution.

Rules of thumb:P(−1 ≤ Z ≤ 1) ≈ .68,P(−2 ≤ Z ≤ 2) ≈ .95,P(−3 ≤ Z ≤ 3) ≈ .997.

Page 32: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

Suppose a p-quantile of the standard normal distribution equals 2. Towhich of the following is p closest to?

(A) 99.7%(B) 97.5%(C) 95%(D) 5%(E) 2.5 %

Page 33: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Normal distribution as an approximation

Let X1, X2, . . . Xn be independent identically distributed r.v. withexpectation µ and variance σ2.

ThenY =

X1 + X2 + . . .+ Xn − nµ√nσ2

is distributed approximately as a standard normal r.v.

Page 34: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

R-simulationGenerating data:repet <- 10000size <- 100p <- .5data <- (rbinom(repet, size, p) - size * p) / sqrt(size * p * (1-p))

Histograms:• By default you get a histogram with the counts of items in each

binTo get a density you set freq = FALSE

• controlling the number of bins:‘breaks = a number’ sets the number of bins to use in thehistogram

hist(data, breaks = size/2, col = ’red’, freq = FALSE)

Plotting the pdf of the normal distribution:

x = seq(min(data) - 1, max(data) + 1, .01)lines(x, dnorm(x), col=’green’, lwd = 4)

Page 35: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Properties of normal distribution

The functionf (x) =

1√2πσ2

e−(x−µ)2/(2σ2)

is a valid density function:

∫ ∞−∞

1√2πσ2

e−(x−µ)2/(2σ2)dx = 1

Proof: First let us change variable, t = (x − µ)/σ.

Page 36: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

The Gaussian integral

We need to compute

I =

∫ ∞−∞

e−t2/2dt

Let us try a change of variable: x = t2

2 , t =√

2x , dt = 1√2x−1/2dx .

Then we get

I =√

2∫ ∞

0x−1/2e−xdx .

The integral∫∞−∞ xa−1e−xdx is called the Gamma function Γ(a) and

therefore

I =√

(12

).

It is easy to calculate that Γ(n) = (n − 1)! for positive integers n.But what is Γ

( 12

)?

Page 37: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

A trick

I2 =

∫ ∞−∞

e−x2/2∫ ∞−∞

e−y2/2dy

=

∫ ∞−∞

∫ ∞−∞

e−(x2+y2)/2dxdy .

Let us use the polar coordinates:

I2 =

∫ 2π

0

∫ ∞0

re−r2/2drdθ

I2 = −2π e−r2/2∣∣∣∞0

= 2π.

So I =√

2π and 1√2π

e−t2/2 is a density.

And Γ( 1

2

)=√π, which is surprising.

Page 38: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Standartization

:::::::::Theorem. If Y ∼ N(µ, σ2), then the

::::::::::::standardized version of Y ,

Z =Y − µσ

,

has the standard normal distribution N(0,1).

Proof: Let us calculate the cdf of Z:

FZ (t) = P(Z ≤ t) = P(Y − µσ

≤ t) = . . .

.

Page 39: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Mean and variance of normal distribution

::::::::Theorem If Y ∼ N(µ, σ), then

E(Y ) = µ,

Var(Y ) = σ2.

Proof: Since Y = µ+ σZ , where Z is the standard normal r.v., let usfirst compute EZ and VarZ .

Page 40: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Mgf of normal distribution

::::::::Theorem If Z ∼ N(0,1), then its moment-generating function is

mZ (t) = et2/2.

Proof:

Similarly, the moment generating function of Y ∼ N(µ, σ2) ismY (t) = eµt+σ2

2 t2.

Consequence: The odd moments of the standard normal variable arezero and the even moments are:m′2 = 1, m′3 = 0,m′4 = 3, m′5 = 0,m′6 = 3 · 5, m′7 = 0,m′8 = 3 · 5 · 7,. . .m′2n = (2n − 1)!! = 1√

πΓ(n + 1

2

)

Page 41: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Calculating normal probabilities

If Y ∼ N(µ, σ2), the cdf for Y is

P(Y ≤ y)

∫ y

−∞

1√2πσ2

exp−(x−µ)2/(2σ2)dx = 1.

However, this integral cannot be expressed in elementary functions.

It can be calculated via numerical integration.

We use software or tables to calculate such normal probabilities.

Page 42: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example::::::::Example. If Z ∼ N(0,1), findP(Z > 1.83), P(Z > −0.42), P(Z ≤ 1.19), P(−1.26 < Z ≤ 0.35)

Page 43: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

Using standartization, we can use Table 4 to find probabilities for anynormal r.v.

::::::::Example A graduating class has GPAs that follow a normaldistribution with mean 2.70 and variance 0.16.What is the probability a randomly chosen student has a GPA greaterthan 2.50?

What is the probability that a random GPA is between 3.00 and 3.50?

Exactly 5% of students have GPA above what number?

Page 44: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

::::::::Example: Let Y be a normal r.v. with σ = 10. Find µ such thatP(Y < 10) = 0.75.

Page 45: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

The length of human gestation (pregnancy) is well-approximated by anormal distribution with mean µ = 280 days and standard deviationσ = 8.5 days. Suppose your final exam is scheduled for May 18 andyour pregnant professor has a due date of May 25.

Find the probability she will give birth on or before the day of the final.

Page 46: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

QuizFind the probability she will give birth on or before the day of the final.

(A) ≈ 2.5% (B) ≈ 5% (C) ≈ 10% (D) ≈ 20% (E) ≈ 30%

Page 47: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

QuizThe professor decides to move up the exam date so there will be a 95%probability that she will give birth afterward. What date should she pick?

(A) May 20 (B) May 15 (C) May 11 (D) May 8 (E) May 6

Page 48: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

The Gamma distribution and related distributions

Many variables have distributions that are non-negative and:::::::skewed

to the right (long right tail).

:::::::::Examples:

• Lifelengths of manufactured parts• Lengths of time between arrivals at a restorant• Survival times for severely ill patients

Such r.v.’s may often be modeled with a Gamma distribution.

Page 49: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

The Gamma distribution::::Defn: A continuous r.v. has a Gamma distribution if its pdf is

f (y) =1

βαΓ(α)yα−1e−y/β ,

where α > 0, β > 0, and y ∈ [0,∞).

Γ(α) =

∫ ∞0

yα−1e−y dy .

α and β are called the::::::shape and scale parameters, respectively.

Page 50: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

The mean and variance of gamma distribution

:::::::::Theorem: If Y has a gamma distribution with parameters α and β,then

E(Y ) = αβ,

Var(Y ) = αβ2.

Proof:

Page 51: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

The mgf of gamma distribution

:::::::::Theorem: If Y has a gamma distribution with parameters α and β,then its moment generating function is

mY (t) =1

(1− βt)α.

Proof:

Page 52: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

Let f (y) = cy4e−y/3 if y ≥ 0 and zero elsewhere.What value of c makes f (y) a valid density?

What is E(Y )? What is V (Y )

Page 53: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

Suppose Y ∼ Γ(5,3). Find the probability that Y falls within 2standard deviations of its mean.

Page 54: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

First special case: χ2 distribution.::::Defn: For any integer ν ≥ 1, a r.v. Y has a chi-square (χ2) distributionwith ν degrees of freedom if Y is a Gamma r.v. with α = ν/2 andβ = 2.[Shorthand: Y ∼ χ2(ν)]

In other words, χ2 distribution is the Gamma distribution forhalf-integer α and a specific choice of β.

Page 55: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Properties of χ2 distribution

If X1, X2, . . . Xν are independent standard normal r.v.’s, thenY = (X1)2 + (X2)2 + . . . (Xν)2 has the χ2 distribution with ν degrees offreedom.

Proof:

χ2 distribution often occurs in statistics.

Page 56: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Mean, variance and mgf of χ2 distribution:::::::::Theorem: If Y ∼ χ2(ν), then

E(Y ) = ν,

V (Y ) = 2ν,

mY (t) =1

(1− 2t)ν/2 .

Page 57: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Second special case: Exponential distribution.

The exponential distribution is commonly used as a model forlifetimes, survival times, or waiting times.

Page 58: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Mean, variance, and mgf of exponential distribution.

::::Defn: A r.v. Y has an exponential distribution with parameter β if Y isa Gamma r.v. with α = 1. That is, the pdf of Y is

f (y) =1β

e−y/β

for y ≥ 0.

::::::::Theorem If Y is an exponential r.v. with parameter β, then

E(Y ) = β,

Var(Y ) = β2.

mY (t) =1

1− βt

Page 59: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

Let the lifetime of a part (in thousands of hours) follow an exponentialdistribution with mean lifetime 2000 hours. (β = . . .)

Find the probability the part lasts less than 1500 hours.

Find the probability the part lasts more 2200 hours.

Find the probability the part lasts between 2000 and 2500 hours.

Page 60: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Memoryless property of exponential distribution

Let a lifetime Y follow an exponential distribution. Suppose the parthas lasted a units of time already. Then the conditional probability ofit lasting at least b

::::::::additional units of time is

P(Y > a + b|Y > a)

.

It is the same as the probability of lasting at least b units of time in thefirst place:

P(Y > b)

Proof:

The exponential distribution is the only:::::::::continuous distribution with

this “memoryless”property.It is the continuous analog of the

:::::::::geometric distribution.

Page 61: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Relationship with Poisson process

Suppose we have a Poisson process X with a mean of λ events pertime unit. Consider the waiting time W until the first occurrence.Then, W ∼ Exp(1/λ).

Proof:

P(W ≤ t) = P(X[0,t] ≥ 1) = 1− P(X[0,t] = 0) = . . .

.

This result can be generalized: For any integer α ≥ 1, the waitingtime W until the α-th occurence has a gamma distribution withparameters α and β = 1/γ.

Page 62: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

Suppose customers arrive in a queue according to a Poisson processwith mean λ = 12 per hour. What is the probability we must wait morethan 10 minutes until the first customer?

What is the probability we must wait more than 10 minutes to see the

:::::fourth customer?

Page 63: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Beta distribution

Suppose we want to model a random quantity that takes valuesbetween 0 and 1.

Examples:• The proportion of a chemical product that is pure.• The proportion of a hospital’s patients infected with a certain

virus.• The parameter p of a binomial distribution.

We can model this random quantity with a beta-distributed randomvariable.

More generally, if a random quantity Y has its support on interval[c,d ] then we can model the transformed quantity Y ∗ = Y−c

d−c with abeta r.v.

Page 64: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Beta distribution

::::Defn: A r.v. Y has a beta distribution with parameters α > 0 andβ > 0 [Shorthand: Y ∼ Beta(α, β)] if its pdf is:

f (y) =1

B(α, β)yα−1(1− y)β−1,

where

B(α, β) =

∫ 1

0yα−1(1− y)β−1dy =

Γ(α)Γ(β)

Γ(α + β).

Note 1: The beta pdf has support on [0,1].

Note 2: For integer α and β,

1B(α, β)

=(α + β − 1)!

(α− 1)!(β − 1)!.

Page 65: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Graphs of beta distribution

Beta distribution is quite flexible.

Page 66: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Quiz

What is another name for Beta(1,1)?

(A) Normal(B) Exponential(C) Poisson(D) Chi-square(E) Uniform

Page 67: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Mean and variance of beta distribution

:::::::::Theorem: If Y ∼ Beta(α, β), then

E(Y ) =α

α + β,

Var(Y ) =αβ

(α + β)2(α + β + 1).

What happens if α and β grow ?

Page 68: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Relation to Bayesian Statistics

Suppose we perform an experiment with outcomes F and S. Theprobability of outcome S is p.

We model p as a r.v. and our initial belief about the distribution of p isthat it is distributed according to Beta distribution with parameters αand β.

We observed outcome S. How we should update our beliefs?

The posterior distribution of p is Beta(α + 1, β).

What happens with the distribution if we repeat the experiment 100times with 90 successes and 10 failures?

Page 69: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Finding beta probabilities

If α and β are integers, then one can use the formula

F (y) =1

B(α, β)

∫ y

0tα−1(1− t)β−1dt =

n∑i=α

(ni

)y i (1− y)n−i ,

where n = α + β − 1. This formula can be established by partialintegration.

That is, we can write

F (y) = P(X ≥ α) = 1− P(X ≤ α− 1),

where X ∼ Binom(n, y).

In general, one can use R functions like pbeta(y0, α, 1/β) for cdf andqbeta(p, α, 1/β) for quantiles.

Page 70: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

Define the downtime rate as the proportion of time a machine isunder repair. Suppose a factory produces machines whose downtimerate follows a beta(3,18) distribution.

What is the pdf of the downtime rate Y ?

For a randomly selected machine, what is the expected downtimerate?

What is the probability that a randomly selected machine is underrepair less than 5% of the time?

If machines act independently, in a shipment of 25 machines, what isthe probability that at least 3 have downtime rates greater than 0.20?

Page 71: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Mixed distributions

A mixed distribution has probability mass placed at one or morediscrete points, and has its remaining probability spread overintervals.

::::::::Example Let Y be the annual amount paid out to a random customerby an auto insurance company.

Suppose 3/4 of customers each year do not receive:::any payout.

Of those who do receive benefits, the payout amount Y2 follows agamma(10,160) distribution.

For a random customer, what is the expected payout amount?

Page 72: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

CDF of a mixed distribution

If Y has a mixed distribution, then the cdf of Y may be written as

F (y) = c1F1(y) + c2F2(y),

where F1(y) is a step distribution function, F2(y) is a continuousdistribution function, c1 is the total mass of all discrete points, and c2is the total probability of all continuous portions.We can think about the distributions F1(y) and F2(y) as cdf’s of twor.v.’s, X1 (discrete) and X2 (continuous).What are c1, c2, mpf of X1 and pdf of X2 in the previous example?

For any function g(Y ) of Y ,

Eg(Y ) = c1Eg(X1) + c2Eg(X2).

For a random customer, what is the expected payout amount?

Page 73: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Example

In a lab experiment, a rat is given a shot of a dangerous toxin. Thereis a probability 0.2 that the rat will die instantly. If it survives the initialshot, suppose it would ordinarily then die at a random point in timeover the next 24 hours. However, if it is still alive 6 hours after theshot, the rat is killed.

Find the expected value of the survival time for a rat entering thisexperiment.What is the probability that a rat entering the experiment will survive 3hours or less?

Page 74: Continuous random variables - Binghamton University we try and deal cards every second until we find a specific combination, then the time until success is a geometric random variable

Chebyshev’s theorem one more time

:::::::::Theorem: Let Y be a r.v. with mean µ and variance σ2. Then forevery k > 0, we have

P(|Y − µ| ≥ kσ) ≤ 1k2 .

Proof:Let X = (Y − µ)2/(kσ)2. We need to prove that P(X ≥ 1) ≤ 1/k2.Define the

:::::::indicator

::::::::function of the event A = {X ≥ 1}:

1A(x) =

{1, if x ≥ 1,0, otherwise.

Then P(X ≥ 1) = E[1A(X )].Note that 1A(X ) ≤ X ,hence P(X ≥ 1) = E[1A(X )] ≤ EX = E[(Y − µ)2/(kσ)2] = 1/k2. �