
STAT 3610: Review of Probability Distributions

Mark Carpenter, Professor of Statistics

Department of Mathematics and Statistics

August 25, 2015

Support of a Random Variable

Definition: The support of a random variable, say $X$, denoted $\mathcal{X}$, is defined to be the set of all points on the real line for which the pdf/pmf is non-zero. That is,

$$\mathcal{X} = \{x \in \mathbb{R} : f_X(x) > 0\},$$

where the braces indicate a set.
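As a small illustration of this definition, the sketch below recovers the support of a discrete distribution by keeping only the points with positive mass. The pmf `example_pmf` and the helper `support` are hypothetical names invented for this example; they are not part of the course notes.

```python
from fractions import Fraction


def support(pmf):
    """Return the sorted points x at which the pmf is strictly positive."""
    return sorted(x for x, p in pmf.items() if p > 0)


# Hypothetical pmf listed on the candidate points 0..4.
# The points 3 and 4 carry zero mass, so they are not in the support.
example_pmf = {
    0: Fraction(1, 4),
    1: Fraction(1, 4),
    2: Fraction(1, 2),
    3: Fraction(0),
    4: Fraction(0),
}

print(support(example_pmf))  # [0, 1, 2]
```

For a continuous random variable the same idea applies, but the support is an interval (or union of intervals) rather than a list of points.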

Support of a Random Variable

- The support of a random variable is usually denoted by the script form of the letter corresponding to the random variable:
  - the random variable $X$ has support $\mathcal{X}$
  - the random variable $Y$ has support $\mathcal{Y}$
  - the random variable $Z$ has support $\mathcal{Z}$
- The support of a random variable is one of the first characteristics we can use to help identify the distribution of a random variable.

Continuous versus Discrete Random Variables

Definition (Continuous Random Variable): A random variable is said to be continuous if its support is a continuous set (made up of unions and intersections of real intervals). The CDF* of a continuous random variable must be a continuous function on the real line.

Definition (Discrete Random Variable): A random variable is said to be discrete if its support is a discrete set. The CDF* of a discrete random variable is not continuous, but it is right continuous.

*CDF stands for Cumulative Distribution Function

Difference between Discrete and Continuous Random Variables

So, whether the random variable $X$ is continuous or discrete depends on whether its support is continuous or discrete.

- In the discrete case, the pmf* is a discontinuous function with a positive mass (probability) at each point in the support.
- In the continuous case, the pdf** itself does not have to be a continuous function everywhere, but it is usually a continuous function on intervals in the support.

*pmf stands for probability mass function

**pdf stands for probability density function

Exponential Random Variable and Exponential Distribution

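Example 1: (continuous support) Suppose $X \sim \text{exponential}(\theta)$. From Section 3.2 (pp. 95-113) of the textbook, the exponential distribution, indexed by the scale parameter $\theta$ ($\theta > 0$), has pdf

$$f(x; \theta) = \frac{1}{\theta} e^{-x/\theta}\, I_{[0,\infty)}(x) = \begin{cases} \frac{1}{\theta} e^{-x/\theta} & x \ge 0 \\ 0 & \text{otherwise,} \end{cases}$$

which means $\{x \in \mathbb{R} : f(x) > 0\} = [0, \infty)$ and the support of $X$ is $\mathcal{X} = [0, \infty)$, a continuous set.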

We see that the pdf for the exponential is zero for all points below zero, then jumps to $\frac{1}{\theta} e^{0} = \frac{1}{\theta}$ at $x = 0$ and is continuous on $[0, \infty)$.
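The jump at $x = 0$ can be seen numerically. The following is a minimal sketch of the scale-parameterized exponential pdf, with the indicator $I_{[0,\infty)}(x)$ written as an `x >= 0` test; the function name `exponential_pdf` and the value `theta = 2.0` are illustrative assumptions, not taken from the notes.

```python
import math


def exponential_pdf(x, theta):
    """pdf of the exponential distribution with scale parameter theta > 0."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    # The indicator I_[0, inf)(x) becomes the `x >= 0` test.
    return math.exp(-x / theta) / theta if x >= 0 else 0.0


theta = 2.0  # illustrative scale value (an assumption for this example)
print(exponential_pdf(-0.001, theta))  # 0.0: just below zero, outside the support
print(exponential_pdf(0.0, theta))     # 0.5: the pdf jumps to 1/theta at x = 0
print(exponential_pdf(1.0, theta))     # about 0.303, decaying continuously
```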

pdf and cdf for the Standard Exponential

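[Figure: two panels for the standard exponential, the pdf $f(x)$ and the cdf $F(x)$, each plotted against $x$ with the point $a$ marked and annotated with $F(a) = 1 - e^{-a}$.]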

Exponential is special case of Gamma and Weibull

You can verify that the non-truncated gamma and Weibull distributions, of which the exponential is a special case, share this same support.

If $X$ is a normal random variable, then the support is $\mathcal{X} = (-\infty, \infty) = \mathbb{R}$.

Mean or Expected Value of a Random Variable

Recall, for any random variable $X$ with pdf/pmf $f(x)$, a measure of central tendency of the population is the population mean $\mu$, or the expected value/long run average for $X$. More formally,

Population Mean: For any random variable $X$ with pdf/pmf $f(x)$, the population mean is $\mu = E(X)$, where

$$\mu = E(X) = \begin{cases} \displaystyle\int_{-\infty}^{\infty} x f(x)\,dx & \text{if } X \text{ is continuous} \\[2ex] \displaystyle\sum_{x \in \mathcal{X}} x f(x) & \text{if } X \text{ is discrete} \end{cases}$$
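Both branches of this definition can be checked numerically. The sketch below sums $x f(x)$ over the support for a Binomial(4, 1/2) pmf and approximates the integral of $x f(x)$ for an exponential pdf with a midpoint rule. The helper names, the choice $\theta = 2$, and the truncation of the integral at 100 are assumptions made for illustration only.

```python
import math


def discrete_mean(pmf):
    """E(X) as the sum of x * f(x) over the support, for a pmf given as a dict."""
    return sum(x * p for x, p in pmf.items())


def continuous_mean(pdf, lower, upper, steps=100_000):
    """Approximate E(X), the integral of x * f(x) dx, with the midpoint rule."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th subinterval
        total += x * pdf(x)
    return total * width


# Discrete case: Binomial(4, 1/2), whose pmf is C(4, x) / 16 on {0, ..., 4}.
binomial_pmf = {x: math.comb(4, x) / 16 for x in range(5)}
print(discrete_mean(binomial_pmf))  # 2.0

# Continuous case: exponential with scale theta = 2 (illustrative value).
# The upper limit 100 stands in for infinity; the neglected tail is negligible.
theta = 2.0
exp_pdf = lambda x: math.exp(-x / theta) / theta if x >= 0 else 0.0
print(continuous_mean(exp_pdf, 0.0, 100.0))  # approximately 2.0, i.e. theta
```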

Population Variance or Variance of a Random Variable

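Population Variance: For any random variable $X$ with pdf/pmf $f(x)$, the population variance is $\sigma^2 = E[(X - \mu)^2]$, where

$$\sigma^2 = E[(X - \mu)^2] = \begin{cases} \displaystyle\int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx & \text{if } X \text{ is continuous} \\[2ex] \displaystyle\sum_{x \in \mathcal{X}} (x - \mu)^2 f(x) & \text{if } X \text{ is discrete} \end{cases}$$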

Sometimes Easier Way to Compute Population Variance

Note that it is often easier to compute the variance by noting that

$$\sigma^2 = E[(X - \mu)^2] = E(X^2 - 2X\mu + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - [E(X)]^2.$$

So, rather than going through the original expression, one need only compute $E(X^2)$ and $\mu = E(X)$ and plug the results into the following expression:

$$\sigma^2 = E(X^2) - \mu^2.$$
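The agreement between the definition and the shortcut can be confirmed with exact arithmetic. This sketch uses a Binomial(4, 1/2) pmf purely as an illustrative choice; the variable names are assumptions for the example.

```python
from fractions import Fraction
from math import comb

# pmf of Binomial(4, 1/2), used here only as an illustration.
pmf = {x: Fraction(comb(4, x), 16) for x in range(5)}

mu = sum(x * p for x, p in pmf.items())               # E(X)
second_moment = sum(x**2 * p for x, p in pmf.items())  # E(X^2)

# Variance from the definition E[(X - mu)^2] ...
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())
# ... and from the shortcut E(X^2) - mu^2.
var_shortcut = second_moment - mu**2

print(mu, second_moment, var_definition, var_shortcut)  # 2 5 1 1
```

Using `Fraction` keeps every value exact, so the two variances match exactly rather than only up to rounding.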

Expectations of Functions of a Random Variable

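You might notice that $E(X)$, $E(X^2)$, and $E[(X - \mu)^2]$ are each the expected value of a different function of $X$: $g_1(x) = x$, $g_2(x) = x^2$, and $g_3(x) = (x - \mu)^2$. The expected value of any function $g(X)$ is defined in the same way as the mean, with $g(x)$ in place of $x$:

$$E[g(X)] = \begin{cases} \displaystyle\int_{-\infty}^{\infty} g(x) f(x)\,dx & \text{if } X \text{ is continuous} \\[2ex] \displaystyle\sum_{x \in \mathcal{X}} g(x) f(x) & \text{if } X \text{ is discrete} \end{cases}$$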

Moment Generating Functions

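Whenever it exists, the moment-generating function for a random variable $X$, denoted $M_X(t)$, is the function of $t$ given as

$$M_X(t) = E\left[e^{tX}\right], \quad t \in (-h, h), \; h > 0.$$

It exists when the expectation is finite for all $t$ in some such interval around zero. The interval $(-h, h)$ is referred to as the interval of convergence, and $h$ as the radius of convergence.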

Properties of a Moment Generating Function (mgf)

This function is called the moment-generating function because you can find the $n$th moment of the random variable $X$ by computing its $n$th derivative with respect to $t$ and then setting $t = 0$, as follows:

$$E(X^n) = M_X^{(n)}(0) = \left.\frac{d^n}{dt^n} M_X(t)\right|_{t=0}.$$

Notice that the moment generating function is a continuous and differentiable function of $t$ for $|t| < h$, whether or not $X$ is continuous. In fact, the moment generating function is mathematically independent of the original variable (since it was integrated or summed over the support) and relates to the variable $X$ only through the moments of the distribution and any related parameters.
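The derivative property can be illustrated numerically. The sketch below takes the exponential mgf $M(t) = 1/(1 - \theta t)$, $t < 1/\theta$, as its test case and approximates $M'(0)$ and $M''(0)$ with central finite differences. The function names, the step size, and the value $\theta = 2$ are assumptions for this example, and finite differences are only a stand-in for the exact derivatives.

```python
def exponential_mgf(t, theta):
    """M(t) = 1 / (1 - theta * t), valid only for t < 1 / theta."""
    if t >= 1 / theta:
        raise ValueError("the mgf is only defined for t < 1/theta")
    return 1.0 / (1.0 - theta * t)


def first_two_moments(mgf, step=1e-4):
    """Central finite-difference approximations to M'(0) and M''(0)."""
    m_plus, m_zero, m_minus = mgf(step), mgf(0.0), mgf(-step)
    first = (m_plus - m_minus) / (2 * step)
    second = (m_plus - 2 * m_zero + m_minus) / step**2
    return first, second


theta = 2.0  # illustrative scale value
ex, ex2 = first_two_moments(lambda t: exponential_mgf(t, theta))

# Expect values close to theta = 2, 2 * theta^2 = 8, and theta^2 = 4.
print(ex, ex2, ex2 - ex**2)
```

Only values of $t$ near zero are ever evaluated, which is why the mgf needs to exist only on some interval $(-h, h)$ around zero.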

Properties of Exponential

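We will show on chalkboard that if $X \sim \text{Exp}(\theta)$ then

- $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = \int_{0}^{\infty} \frac{1}{\theta} e^{-x/\theta}\,dx = 1.$
- $\displaystyle\mu = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx = \int_{0}^{\infty} \frac{x}{\theta} e^{-x/\theta}\,dx = \theta.$
- $\displaystyle\sigma^2 = E[(X - \mu)^2] = \int_{0}^{\infty} (x - \mu)^2 f(x)\,dx = \theta^2.$
- The cumulative distribution function (cdf) for any $w \ge 0$ is $F(w) = P(X \le w) = 1 - e^{-w/\theta}.$
- The moment generating function (mgf), denoted $M(t)$, exists and $M(t) = \dfrac{1}{1 - \theta t}$ for $t < \dfrac{1}{\theta}.$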
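These results are left to the chalkboard in the notes. One possible route, sketched here as a worked derivation rather than the argument given in lecture, is to obtain the mgf by direct integration and then read the mean and variance off its derivatives:

```latex
\begin{aligned}
M(t) &= E\left[e^{tX}\right]
      = \int_{0}^{\infty} e^{tx}\,\frac{1}{\theta}\,e^{-x/\theta}\,dx
      = \frac{1}{\theta}\int_{0}^{\infty} e^{-x\left(\frac{1}{\theta}-t\right)}\,dx \\
     &= \frac{1}{\theta}\cdot\frac{1}{\frac{1}{\theta}-t}
      = \frac{1}{1-\theta t}, \qquad t < \frac{1}{\theta}, \\
M'(t) &= \frac{\theta}{(1-\theta t)^{2}}
      \;\Rightarrow\; E(X) = M'(0) = \theta, \\
M''(t) &= \frac{2\theta^{2}}{(1-\theta t)^{3}}
      \;\Rightarrow\; E(X^{2}) = M''(0) = 2\theta^{2}, \\
\sigma^{2} &= E(X^{2}) - \mu^{2} = 2\theta^{2} - \theta^{2} = \theta^{2}.
\end{aligned}
```

The integral converges only when $\frac{1}{\theta} - t > 0$, which is where the restriction $t < \frac{1}{\theta}$ comes from.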

Discrete Example (Binomial)

Example: (discrete support) Suppose $Y$ is a Binomial random variable with parameters $n$ and $p$ (see page 117 of textbook); then the probability mass function (pmf) is

$$f_Y(y) = \begin{cases} \dbinom{n}{y} p^{y} (1 - p)^{n - y} & y = 0, 1, \ldots, n \\ 0 & \text{otherwise,} \end{cases}$$

which means $\{y : f(y) > 0\} = \{0, 1, \ldots, n\}$ and the support of $Y$ is $\mathcal{Y} = \{0, 1, \ldots, n\}$, a discrete (and finite) set of points.

Recall that

$$\binom{n}{y} = \frac{n!}{y!\,(n - y)!}.$$

cdf for Binomial Random Variable

Example 2 (binomial): Suppose $X$ is Binomial$(4, 0.5)$; then the pmf is

$$f(x; n = 4, p = 1/2) = \binom{4}{x} \frac{1}{2^{n}}, \quad x \in \mathcal{X} = \{0, 1, 2, 3, 4\}.$$

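Table: CDF for Binomial (n = 4, p = 1/2)

| Interval for $x$ | $F(x)$ | Sum of pmf values | Value |
|---|---|---|---|
| $(-\infty, 0)$ | $P(X < 0) = 0$ | $0$ | $0$ |
| $[0, 1)$ | $P(X \le 0) = P(0)$ | $\frac{1}{16}$ | $\frac{1}{16}$ |
| $[1, 2)$ | $P(X \le 1) = P(0) + P(1)$ | $\frac{1}{16} + \frac{4}{16}$ | $\frac{5}{16}$ |
| $[2, 3)$ | $P(X \le 2) = P(0) + P(1) + P(2)$ | $\frac{1}{16} + \frac{4}{16} + \frac{6}{16}$ | $\frac{11}{16}$ |
| $[3, 4)$ | $P(X \le 3) = P(0) + P(1) + P(2) + P(3)$ | $\frac{1}{16} + \frac{4}{16} + \frac{6}{16} + \frac{4}{16}$ | $\frac{15}{16}$ |
| $[4, \infty)$ | $P(X \le 4) = P(0) + P(1) + P(2) + P(3) + P(4)$ | $\frac{1}{16} + \frac{4}{16} + \frac{6}{16} + \frac{4}{16} + \frac{1}{16}$ | $1$ |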
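The table entries can be reproduced by accumulating pmf values. The following sketch evaluates the cdf at any real $x$ (with integers passed as Python `int`), which also makes the right-continuous step behavior visible: the value changes exactly at the support points and is constant in between. The function names are assumptions for this example.

```python
from fractions import Fraction
from math import comb, floor


def binomial_pmf(k, n, p):
    """P(X = k) for integer k and X ~ Binomial(n, p); zero outside {0, ..., n}."""
    if k < 0 or k > n:
        return Fraction(0)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(x, n, p):
    """F(x) = P(X <= x) for any real x, as a right-continuous step function."""
    if x < 0:
        return Fraction(0)
    top = min(floor(x), n)  # largest support point not exceeding x
    return sum(binomial_pmf(k, n, p) for k in range(top + 1))


n, p = 4, Fraction(1, 2)
# Evaluate at support points and at points strictly between them.
for x in [-1, 0, 0.5, 1, 2, 3, 3.999, 4, 10]:
    print(x, binomial_cdf(x, n, p))
```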

cdf for Binomial Random Variable

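[Figure: step plot of the cdf of the Binomial(4, 1/2), with x-axis ticks at 0, 1, 2, 3, 4, 5 and steps at heights 1/16, 5/16, 11/16, 15/16, and 1.]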

Binomial, hypergeometric, geometric

One can verify that the supports for the negative binomial$(r, p)$, the Geometric$(p)$, and the Poisson random variables (see page 126) are the same countably infinite set $\{0, 1, 2, \ldots\}$. The support of the hypergeometric$(n, M, N)$ is the discrete, finite set $\{\max(0, n - N + M), \ldots, \min(n, M)\}$.
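The hypergeometric support formula is easy to misread, so a short sketch may help. It assumes the common reading of the parameters, which the slide does not spell out: $N$ is the population size, $M$ the number of "success" items in the population, and $n$ the sample size. The function name is hypothetical.

```python
def hypergeometric_support(n, M, N):
    """Support {max(0, n - N + M), ..., min(n, M)} as a list of integers.

    Assumed meaning of the parameters (not stated on the slide):
    N = population size, M = number of success items, n = sample size.
    """
    if not (0 <= M <= N and 0 <= n <= N):
        raise ValueError("need 0 <= M <= N and 0 <= n <= N")
    return list(range(max(0, n - N + M), min(n, M) + 1))


# Small sample from a large population: every count from 0 to M is possible.
print(hypergeometric_support(n=5, M=4, N=20))  # [0, 1, 2, 3, 4]

# Large sample: only 4 non-success items exist, so at least 4 successes are drawn.
print(hypergeometric_support(n=8, M=6, N=10))  # [4, 5, 6]
```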

Poisson Random Variable

Suppose $X$ is a discrete random variable with a Poisson$(\lambda)$ distribution; then the probability mass function (pmf) is

$$f(x) = \begin{cases} \dfrac{\lambda^{x} e^{-\lambda}}{x!} & x = 0, 1, 2, \ldots \\ 0 & \text{otherwise,} \end{cases}$$

where $\lambda > 0$.

Recall that the Maclaurin series of the exponential function $e^{y}$ is

$$e^{y} = \sum_{i=0}^{\infty} \frac{y^{i}}{i!},$$

where $i!$ represents the factorial function for an integer $i$.

Some Properties of a Poisson Random Variable

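- $\displaystyle\sum_{x \in \mathcal{X}} f(x; \lambda) = \sum_{x=0}^{\infty} \frac{\lambda^{x} e^{-\lambda}}{x!} = 1$
- $\displaystyle\mu = E(X) = \sum_{x \in \mathcal{X}} x \cdot f(x) = \lambda$
- $\displaystyle\sigma^2 = E[(X - \mu)^2] = \sum_{x \in \mathcal{X}} (x - \mu)^2 \cdot f(x) = \lambda$
- $\displaystyle M(t) = E(e^{tX}) = \sum_{x \in \mathcal{X}} e^{tx} \cdot f(x) = e^{\lambda(e^{t} - 1)}, \quad -\infty < t < \infty.$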
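These four properties can be checked numerically by truncating the infinite sums. The sketch below is only a sanity check under stated assumptions: the values `lam = 3.0` and `t = 0.5` are arbitrary, the sums stop after 100 terms (the neglected tail is negligible for these values), and the function names are invented for the example.

```python
import math


def poisson_pmf(x, lam):
    """f(x) = lam^x * e^(-lam) / x! for integer x >= 0; zero otherwise."""
    if x < 0:
        return 0.0
    return lam**x * math.exp(-lam) / math.factorial(x)


def poisson_summaries(lam, t, terms=100):
    """Truncated-series approximations to total mass, mean, variance, and mgf."""
    xs = range(terms)
    total = sum(poisson_pmf(x, lam) for x in xs)
    mean = sum(x * poisson_pmf(x, lam) for x in xs)
    var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)
    mgf = sum(math.exp(t * x) * poisson_pmf(x, lam) for x in xs)
    return total, mean, var, mgf


lam, t = 3.0, 0.5  # illustrative values (assumptions for this example)
total, mean, var, mgf = poisson_summaries(lam, t)

print(total, mean, var)  # close to 1, lam, lam
# The truncated sum and the closed form e^(lam * (e^t - 1)) should agree closely.
print(mgf, math.exp(lam * (math.exp(t) - 1)))
```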