Other Common Univariate Distributions
Dear students, besides the Normal, Bernoulli and Binomial
distributions, the following distributions are also very
important in our studies.
1. Discrete Distributions
1.1. Geometric distribution
This is a discrete waiting-time distribution. Suppose a
sequence of independent Bernoulli trials is performed, and
let X be the number of failures preceding the first success.
Then $X \sim \mathrm{Geom}(p)$, with pdf
$$f(x) = p(1-p)^x, \quad x = 0, 1, 2, \ldots$$
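As a quick numerical sanity check, here is a minimal sketch assuming numpy/scipy are available. Note that scipy.stats.geom counts the trial on which the first success occurs (support 1, 2, ...); the failures-before-first-success convention used here corresponds to scipy.stats.nbinom with n = 1. The value p = 0.3 is an arbitrary illustration.

```python
# Check f(x) = p(1-p)^x against scipy.
# scipy.stats.geom has support {1, 2, ...} (trial of the first success);
# the convention in these notes matches scipy.stats.nbinom(n=1, p).
from scipy.stats import nbinom

p = 0.3
for x in range(5):
    manual = p * (1 - p) ** x
    print(x, manual, nbinom.pmf(x, 1, p))  # the two columns should agree
```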
1.2. Negative Binomial distribution
Suppose a sequence of independent Bernoulli trials is
conducted. If X is the number of failures preceding the nth
success, then X has a negative binomial distribution.
Probability density function:
$$f(x) = \underbrace{\binom{n+x-1}{n-1}}_{\text{ways of allocating failures and successes}} p^n (1-p)^x, \quad x = 0, 1, 2, \ldots$$
$$E(X) = \frac{n(1-p)}{p}, \qquad \mathrm{Var}(X) = \frac{n(1-p)}{p^2},$$
$$M_X(t) = \left[\frac{p}{1-(1-p)e^t}\right]^n, \quad t < -\ln(1-p).$$
1. If n = 1, we obtain the geometric distribution.
2. It also arises as the sum of n independent geometric
random variables (see the simulation sketch below).
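A minimal simulation sketch of point 2, assuming numpy/scipy; the values n = 4 and p = 0.3 are arbitrary illustration choices.

```python
# The sum of n independent Geom(p) failure counts should be NegBin(n, p);
# compare simulated sums with scipy.stats.nbinom moments.
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(0)
n, p, reps = 4, 0.3, 100_000
# numpy's rng.geometric returns the trial of the first success (1, 2, ...),
# so subtract 1 to count failures, matching the convention above.
sums = (rng.geometric(p, size=(reps, n)) - 1).sum(axis=1)
print(sums.mean(), nbinom.mean(n, p))   # both ~ n(1-p)/p = 9.33
print(sums.var(),  nbinom.var(n, p))    # both ~ n(1-p)/p^2 = 31.1
```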
1.3. Poisson distribution
Parameter: rate $\lambda > 0$
MGF: $M_X(t) = e^{\lambda(e^t - 1)}$
Probability density function:
$$f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$
1. The Poisson distribution arises as the distribution for the
number of “point events” observed from a Poisson process.
Examples: see Figure 1.
[Figure 1: Poisson Example]
2. The Poisson distribution also arises as the limiting form
of the binomial distribution:
$$n \to \infty, \quad p \to 0, \quad np \to \lambda.$$
The derivation of the Poisson distribution (via the binomial)
is underpinned by a Poisson process, i.e., a point process on
$[0, \infty)$; see Figure 1.
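To see this limit numerically, here is a minimal sketch (assuming scipy is available) comparing $B(n, p)$ with $\mathrm{Poisson}(\lambda)$ while $np = \lambda$ is held fixed; $\lambda = 2$ is an arbitrary illustration value.

```python
# Numerical look at B(n, p) -> Poisson(lambda) with np = lambda fixed
# as n grows and p shrinks.
from scipy.stats import binom, poisson

lam = 2.0
for n in (10, 100, 10_000):
    p = lam / n
    # total variation distance between B(n, p) and Poisson(lam)
    tv = 0.5 * sum(abs(binom.pmf(k, n, p) - poisson.pmf(k, lam))
                   for k in range(40))
    print(n, round(tv, 6))  # shrinks toward 0 as n grows
```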
AXIOMS for a Poisson process of rate $\lambda > 0$ (that is, a
counting process is a Poisson process if it satisfies the
following rules):
(A) The numbers of occurrences in disjoint intervals are
independent.
(B) The probability of exactly 1 occurrence in any sub-interval
$[t, t+h)$ is $\lambda h + o(h)$ as $h \to 0$ (approximately, the
probability equals the length of the interval, $h$, times $\lambda$).
(C) The probability of more than one occurrence in $[t, t+h)$ is
$o(h)$ as $h \to 0$ (i.e., the probability is small, negligible).
Note: $o(h)$, pronounced "small order $h$", is standard notation
for any function $r(h)$ with the property
$$\lim_{h \to 0} \frac{r(h)}{h} = 0.$$
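These axioms can be turned directly into a small simulation: chop $[0, 1]$ into $m$ tiny sub-intervals and put an event in each independently with probability $\lambda h = \lambda/m$, per axioms (A) and (B) (axiom (C) is exactly the approximation being made). A sketch assuming numpy/scipy, with $\lambda = 3$ as an arbitrary illustration value:

```python
# Chop [0, 1] into m sub-intervals, place an event in each independently
# with probability lambda/m, and check the total count is ~ Poisson(lambda).
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
lam, m, reps = 3.0, 10_000, 50_000      # m sub-intervals of width h = 1/m
counts = rng.binomial(m, lam / m, size=reps)  # sum of m Bernoulli(lam/m)
print(counts.mean(), counts.var())            # both ~ lam = 3
print(np.mean(counts == 2), poisson.pmf(2, lam))  # empirical vs exact P(N=2)
```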
1.4. Hypergeometric distribution
Consider an urn containing M black and N white balls.
Suppose n balls are sampled randomly without replacement
and let X be the number of black balls chosen. Then X has a
hypergeometric distribution.
Parameters: $M, N > 0$, $0 < n \le M + N$
Possible values: $\max(0, n-N) \le x \le \min(n, M)$
Prob. density function:
$$f(x) = \frac{\binom{M}{x}\binom{N}{n-x}}{\binom{M+N}{n}},$$
$$E(X) = \frac{nM}{M+N}, \qquad \mathrm{Var}(X) = \frac{M+N-n}{M+N-1} \cdot \frac{nMN}{(M+N)^2}.$$
The mgf exists, but there is no useful expression available.
1. The hypergeometric pdf is simply
$$\frac{\#\{\text{samples with } x \text{ black balls}\}}{\#\{\text{possible samples}\}} = \frac{\binom{M}{x}\binom{N}{n-x}}{\binom{M+N}{n}}.$$
2. To see how the limits arise, observe that we must have
$x \le n$ (i.e., no more than the sample size of black balls
in the sample). Also, $x \le M$; hence $x \le \min(n, M)$.
Similarly, we must have $x \ge 0$ (i.e., we cannot have fewer
than 0 black balls in the sample), and $n - x \le N$ (i.e., we
cannot have more white balls than the number in the urn),
i.e., $x \ge n - N$; hence $x \ge \max(0, n-N)$.
3. If we sample with replacement, we would get
$X \sim B(n, p = \frac{M}{M+N})$. It is interesting to compare moments:

Hypergeometric: $E(X) = np$, $\mathrm{Var}(X) = \frac{M+N-n}{M+N-1}\,[np(1-p)]$
Binomial: $E(X) = np$, $\mathrm{Var}(X) = np(1-p)$

The factor $\frac{M+N-n}{M+N-1}$ is the finite population
correction. When we sample all the balls in the urn
($n = M+N$), $\mathrm{Var}(X) = 0$.
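A quick numerical look at this comparison, assuming scipy is available. One caveat: scipy.stats.hypergeom uses different letters than these notes (its M = population size, n = number of black balls, N = sample size); the counts 30/70/20 below are arbitrary illustration values.

```python
# Finite population correction: hypergeometric vs binomial moments.
from scipy.stats import hypergeom, binom

M_black, N_white, n_draw = 30, 70, 20
pop = M_black + N_white
p = M_black / pop

hg = hypergeom(pop, M_black, n_draw)   # scipy order: (pop, #black, #draws)
b = binom(n_draw, p)
print(hg.mean(), b.mean())                             # both np = 6.0
print(hg.var(), b.var() * (pop - n_draw) / (pop - 1))  # agree once fpc applied
print(b.var())                                         # np(1-p), no fpc: larger
```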
2. Continuous Distributions
2.1 Uniform Distribution
For $X \sim U(a, b)$, its pdf and cdf are:
$$f(x) = \begin{cases} \frac{1}{b-a}, & a < x < b \\ 0, & \text{otherwise,} \end{cases}$$
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{a}^{x} \frac{1}{b-a}\,dt = \frac{x-a}{b-a}, \quad \text{for } a < x < b.$$
The more complete form of the cdf is:
$$F(x) = \begin{cases} 0, & x \le a \\ \frac{x-a}{b-a}, & a < x < b \\ 1, & b \le x. \end{cases}$$
Its mgf is:
$$M_X(t) = \frac{e^{tb} - e^{ta}}{t(b-a)}.$$
A special case is the $X \sim U(0, 1)$ distribution:
$$f(x) = \begin{cases} 1, & 0 < x < 1 \\ 0, & \text{otherwise,} \end{cases} \qquad F(x) = x \ \text{ for } 0 < x < 1,$$
$$E(X) = \frac{1}{2}, \qquad \mathrm{Var}(X) = \frac{1}{12}, \qquad M(t) = \frac{e^t - 1}{t}.$$
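A short simulation check of these $U(0,1)$ facts, as a sketch assuming numpy; the evaluation point $t = 0.7$ for the mgf is arbitrary.

```python
# Check E(X) = 1/2, Var(X) = 1/12 for U(0, 1), and the mgf at one point.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 1_000_000)
print(x.mean(), x.var())                # ~ 0.5 and ~ 1/12 = 0.0833
t = 0.7
print(np.mean(np.exp(t * x)), (np.exp(t) - 1) / t)  # empirical vs exact M(t)
```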
2.2 Exponential Distribution
PDF: $f(x) = \lambda e^{-\lambda x}, \quad x \ge 0, \ \lambda > 0$
CDF: $F(x) = 1 - e^{-\lambda x}, \quad x \ge 0$
MGF: $M_X(t) = \frac{\lambda}{\lambda - t}, \quad t < \lambda$
This is the distribution for the waiting time until the first
occurrence in a Poisson process with rate parameter $\lambda > 0$.
1. If $X \sim \exp(\lambda)$, then
$$P(X \ge t + x \mid X \ge t) = P(X \ge x)$$
(memoryless property).
2. It can be obtained as the limiting form of the geometric
distribution.
3. One can also easily show that the geometric distribution
has the memoryless property. (A numerical check of the
exponential case is sketched below.)
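A minimal check of the memoryless property, assuming scipy; note scipy.stats.expon is parameterized by scale = $1/\lambda$, and the values of $\lambda$, $t$, $x$ below are arbitrary illustration choices.

```python
# Memoryless property: P(X >= t + x | X >= t) = P(X >= x) for X ~ exp(lam).
from scipy.stats import expon

lam, t, x = 2.0, 0.5, 1.2
X = expon(scale=1 / lam)          # scipy uses scale = 1/lambda
cond = X.sf(t + x) / X.sf(t)      # P(X >= t + x) / P(X >= t)
print(cond, X.sf(x))              # equal: both exp(-lam * x)
```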
2.3 Gamma distribution
$$f(x) = \frac{\lambda^r x^{r-1} e^{-\lambda x}}{\Gamma(r)}, \quad x > 0, \ r > 0, \ \lambda > 0.$$
Or, some books use $\alpha$ in place of $r$ for the shape parameter.
Then:
$$E(X) = \frac{r}{\lambda}, \qquad \mathrm{Var}(X) = \frac{r}{\lambda^2}, \qquad M_X(t) = \left(\frac{\lambda}{\lambda - t}\right)^r, \quad t < \lambda.$$
If $r$ is a positive integer, then $\Gamma(r) = (r-1)!$.
Special case: when $r = 1$, the gamma reduces to the exponential distribution $\exp(\lambda)$.
Special case: when $r = k/2$ and $\lambda = 1/2$, we obtain the chi-square distribution with $k$ degrees of freedom (see note 2 below).
Review
p.d.f. and m.g.f. of the gamma distribution (as given above).
[Three worked examples followed here; their equations were rendered as images in the original and are not recoverable in this transcript.]
*** In general, for $X \sim \mathrm{gamma}(\alpha, \lambda)$, $\alpha > 0$, $\lambda > 0$.
(Note: sometimes we use $r$ instead of $\alpha$, as shown above.)
1. $\alpha$ is the shape parameter and $\lambda$ is the rate
(inverse-scale) parameter.
Note: if $Y \sim \mathrm{gamma}(\alpha, 1)$ and $X = Y/\lambda$,
then $X \sim \mathrm{gamma}(\alpha, \lambda)$; that is, $1/\lambda$
sets the scale of the distribution.
Figure 2: Gamma Distribution
2. The $\mathrm{Gamma}(\frac{k}{2}, \frac{1}{2})$ distribution is
also called the $\chi^2_k$ (chi-square with $k$ df) distribution,
if $k$ is a positive integer.
3. The $\mathrm{gamma}(K, \lambda)$ random variable can also be
interpreted as the waiting time until the $K$-th occurrence
(event) in a Poisson process. (Both facts are checked numerically
in the sketch below.)
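A sketch of both checks, assuming scipy. One caveat: scipy.stats.gamma takes a shape a and a scale argument, where scale = $1/\lambda$ in the notation of these notes; k = 5, K = 3, and $\lambda = 2$ are arbitrary illustration values.

```python
# (i) gamma(k/2, lambda = 1/2) matches chi-square with k df;
# (ii) the sum of K exponential(lam) waiting times is gamma(K, lam).
import numpy as np
from scipy.stats import gamma, chi2, kstest

k = 5
# rate 1/2 means scale = 2 in scipy's parameterization
print(gamma.pdf(2.0, a=k / 2, scale=2), chi2.pdf(2.0, k))  # identical

rng = np.random.default_rng(3)
K, lam = 3, 2.0
waits = rng.exponential(1 / lam, size=(100_000, K)).sum(axis=1)
print(kstest(waits, gamma(a=K, scale=1 / lam).cdf).pvalue)  # large p-value
```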
2.4 Beta density function
Suppose $Y_1 \sim \mathrm{Gamma}(\alpha, \lambda)$ and $Y_2 \sim \mathrm{Gamma}(\beta, \lambda)$ independently. Then
$$X = \frac{Y_1}{Y_1 + Y_2} \sim B(\alpha, \beta), \quad 0 \le x \le 1.$$
Remark: we derived this in Lecture 5 (transformations). I have
copied it here again as a review.
e.g. Suppose that $Y_1 \sim \mathrm{gamma}(\alpha, 1)$ and $Y_2 \sim \mathrm{gamma}(\beta, 1)$, and that $Y_1$ and $Y_2$ are independent. Define the transformation
$$U_1 = g_1(Y_1, Y_2) = Y_1 + Y_2, \qquad U_2 = g_2(Y_1, Y_2) = \frac{Y_1}{Y_1 + Y_2}.$$
Find each of the following distributions:
(a) $f_{U_1, U_2}(u_1, u_2)$, the joint distribution of $U_1$ and $U_2$;
(b) $f_{U_1}(u_1)$, the marginal distribution of $U_1$; and
(c) $f_{U_2}(u_2)$, the marginal distribution of $U_2$.
Solutions. (a) Since $Y_1$ and $Y_2$ are independent, the joint distribution of $Y_1$ and $Y_2$ is
$$f_{Y_1, Y_2}(y_1, y_2) = f_{Y_1}(y_1) f_{Y_2}(y_2) = \frac{1}{\Gamma(\alpha)} y_1^{\alpha-1} e^{-y_1} \times \frac{1}{\Gamma(\beta)} y_2^{\beta-1} e^{-y_2} = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} y_1^{\alpha-1} y_2^{\beta-1} e^{-(y_1+y_2)},$$
for $y_1 > 0$, $y_2 > 0$, and 0 otherwise. Here, $R_{Y_1, Y_2} = \{(y_1, y_2) : y_1 > 0, y_2 > 0\}$. By inspection, we see that $u_1 = y_1 + y_2 > 0$, and $u_2 = \frac{y_1}{y_1 + y_2}$ must fall between 0 and 1.
Thus, the domain of $U = (U_1, U_2)$ is given by
$$R_{U_1, U_2} = \{(u_1, u_2) : u_1 > 0, \ 0 < u_2 < 1\}.$$
The next step is to derive the inverse transformation. It follows that
$$u_1 = y_1 + y_2, \quad u_2 = \frac{y_1}{y_1 + y_2} \ \Rightarrow \ y_1 = g_1^{-1}(u_1, u_2) = u_1 u_2, \quad y_2 = g_2^{-1}(u_1, u_2) = u_1 - u_1 u_2.$$
The Jacobian is given by
$$J = \det \begin{vmatrix} \frac{\partial g_1^{-1}}{\partial u_1} & \frac{\partial g_1^{-1}}{\partial u_2} \\[2pt] \frac{\partial g_2^{-1}}{\partial u_1} & \frac{\partial g_2^{-1}}{\partial u_2} \end{vmatrix} = \det \begin{vmatrix} u_2 & u_1 \\ 1 - u_2 & -u_1 \end{vmatrix} = -u_1 u_2 - u_1(1 - u_2) = -u_1.$$
We now write the joint distribution for $U = (U_1, U_2)$. For $u_1 > 0$ and $0 < u_2 < 1$, we have that
$$f_{U_1, U_2}(u_1, u_2) = f_{Y_1, Y_2}\left[g_1^{-1}(u_1, u_2), g_2^{-1}(u_1, u_2)\right] |J| = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} (u_1 u_2)^{\alpha-1} (u_1 - u_1 u_2)^{\beta-1} e^{-[(u_1 u_2) + (u_1 - u_1 u_2)]} \times |-u_1|$$
$$= \frac{1}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} u_2^{\alpha-1} (1 - u_2)^{\beta-1}.$$
Note: We see that $U_1$ and $U_2$ are independent, since the domain $R_{U_1, U_2} = \{(u_1, u_2) : u_1 > 0, \ 0 < u_2 < 1\}$ does not constrain $u_1$ by $u_2$ or vice versa, and since the nonzero part of $f_{U_1, U_2}(u_1, u_2)$ can be factored into the two expressions $h_1(u_1)$ and $h_2(u_2)$, where
$$h_1(u_1) = u_1^{\alpha+\beta-1} e^{-u_1} \quad \text{and} \quad h_2(u_2) = \frac{u_2^{\alpha-1}(1 - u_2)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)}.$$
(b) To obtain the marginal distribution of $U_1$, we integrate the joint pdf $f_{U_1, U_2}(u_1, u_2)$ over $u_2$. That is, for $u_1 > 0$,
$$f_{U_1}(u_1) = \int_{u_2=0}^{1} f_{U_1, U_2}(u_1, u_2) \, du_2 = \int_{u_2=0}^{1} \frac{u_2^{\alpha-1}(1 - u_2)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} \, du_2 = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} \int_{u_2=0}^{1} u_2^{\alpha-1}(1 - u_2)^{\beta-1} \, du_2.$$
Since $u_2^{\alpha-1}(1 - u_2)^{\beta-1}$ is the $\mathrm{beta}(\alpha, \beta)$ kernel, the remaining integral equals $\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$, so
$$f_{U_1}(u_1) = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} \times \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} = \frac{1}{\Gamma(\alpha+\beta)} u_1^{\alpha+\beta-1} e^{-u_1}.$$
Summarizing,
$$f_{U_1}(u_1) = \begin{cases} \frac{1}{\Gamma(\alpha+\beta)} u_1^{\alpha+\beta-1} e^{-u_1}, & u_1 > 0 \\ 0, & \text{otherwise.} \end{cases}$$
We recognize this as a $\mathrm{gamma}(\alpha+\beta, 1)$ pdf; thus, marginally, $U_1 \sim \mathrm{gamma}(\alpha+\beta, 1)$.
(c) To obtain the marginal distribution of $U_2$, we integrate the joint pdf $f_{U_1, U_2}(u_1, u_2)$ over $u_1$. That is, for $0 < u_2 < 1$,
$$f_{U_2}(u_2) = \int_{u_1=0}^{\infty} f_{U_1, U_2}(u_1, u_2) \, du_1 = \int_{u_1=0}^{\infty} \frac{u_2^{\alpha-1}(1 - u_2)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} \, du_1 = \frac{u_2^{\alpha-1}(1 - u_2)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)} \int_{u_1=0}^{\infty} u_1^{\alpha+\beta-1} e^{-u_1} \, du_1 = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} u_2^{\alpha-1}(1 - u_2)^{\beta-1}.$$
Summarizing,
$$f_{U_2}(u_2) = \begin{cases} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} u_2^{\alpha-1}(1 - u_2)^{\beta-1}, & 0 < u_2 < 1 \\ 0, & \text{otherwise.} \end{cases}$$
Thus, marginally, $U_2 \sim \mathrm{beta}(\alpha, \beta)$. □
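The derivation can also be corroborated by simulation. A sketch assuming numpy/scipy; the shapes $\alpha = 2$ and $\beta = 5$ are arbitrary illustration values.

```python
# Y1/(Y1 + Y2) with independent gamma(alpha, 1) and gamma(beta, 1) draws
# should follow Beta(alpha, beta).
import numpy as np
from scipy.stats import beta, kstest

rng = np.random.default_rng(4)
a, b, reps = 2.0, 5.0, 100_000
y1 = rng.gamma(a, 1.0, reps)
y2 = rng.gamma(b, 1.0, reps)
u2 = y1 / (y1 + y2)
print(u2.mean(), a / (a + b))             # ~ 2/7
print(kstest(u2, beta(a, b).cdf).pvalue)  # large p-value: fits Beta(2, 5)
```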
Figure. Selected Beta pdf’s
(https://en.wikipedia.org/wiki/Beta_distribution).
2.5 Standard Cauchy distribution
Possible values: $x \in \mathbb{R}$
PDF: $f(x) = \frac{1}{\pi}\left(\frac{1}{1 + x^2}\right)$ (location parameter $\theta = 0$)
CDF: $F(x) = \frac{1}{2} + \frac{1}{\pi} \arctan x$
$E(X)$, $\mathrm{Var}(X)$, and $M_X(t)$ do not exist.
The Cauchy is a bell-shaped distribution symmetric about
zero for which no moments are defined.
If $Z_1 \sim N(0, 1)$ and $Z_2 \sim N(0, 1)$ independently, then
$X = Z_1 / Z_2$ follows the standard Cauchy distribution. Please
also see the lecture notes on transformations for the proof
(Lecture 5).
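A minimal simulation sketch of the ratio result, assuming numpy; the evaluation points are arbitrary.

```python
# The ratio of two independent N(0, 1) variables should be standard Cauchy;
# compare the empirical cdf with F(x) = 1/2 + arctan(x)/pi at a few points.
import numpy as np

rng = np.random.default_rng(5)
z1, z2 = rng.standard_normal(1_000_000), rng.standard_normal(1_000_000)
x = z1 / z2
for q in (-1.0, 0.0, 2.0):
    print(q, np.mean(x <= q), 0.5 + np.arctan(q) / np.pi)
# Note: the sample mean of x does not settle down as the sample grows,
# consistent with E(X) not existing.
```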
HW4: 3.2, 3.5, 3.7, 3.9, 3.12, 3.18, 3.23
HW3 Q.1. Suppose that $Y_1$ and $Y_2$ are random variables with joint pdf
$$f_{Y_1, Y_2}(y_1, y_2) = \begin{cases} 8 y_1 y_2, & 0 < y_1 < y_2 < 1 \\ 0, & \text{otherwise.} \end{cases}$$
Find the pdf of $U_1 = Y_1 / Y_2$.
It is given that the joint pdf of $Y_1$ and $Y_2$ is
$$f_{Y_1, Y_2}(y_1, y_2) = 8 y_1 y_2, \quad 0 < y_1 < y_2 < 1.$$
Now we need to introduce a second random variable $U_2$ which is a function of $Y_1$ and $Y_2$. We wish to do so in a way that the resulting bivariate transformation is one-to-one and our task of finding the pdf of $U_1$ is as easy as possible. Our choice of $U_2$ is, of course, not unique. Let us define $U_2 = Y_2$. Then the inverse transformation is:
$$Y_1 = U_1 U_2, \qquad Y_2 = U_2.$$
From this, we find the Jacobian:
$$J = \det \begin{vmatrix} u_2 & u_1 \\ 0 & 1 \end{vmatrix} = u_2.$$
To determine the new joint domain $B$ of $U_1$ and $U_2$, we note that
$$0 < y_1 < y_2 < 1 \ \Rightarrow \ 0 < u_1 u_2 < u_2 < 1,$$
equivalently $0 < u_1 < 1$ and $0 < u_2 < 1$, so $B$ is the open unit square. On $B$,
$$f_{U_1, U_2}(u_1, u_2) = 8 (u_1 u_2)(u_2) \cdot |u_2| = 8 u_1 u_2^3.$$
Thus, the marginal pdf of $U_1$ is obtained by integrating
$f_{U_1, U_2}(u_1, u_2)$ with respect to $u_2$, yielding
$$f_{U_1}(u_1) = \int_0^1 8 u_1 u_2^3 \, du_2 = 2 u_1, \quad 0 < u_1 < 1.$$
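As a check on this answer, here is a sketch assuming numpy: since $8 y_1 y_2$ on $\{0 < y_1 < y_2 < 1\}$ is proportional to $y_1 y_2$ there, we can sample the joint pdf by simple rejection from the unit square and then look at $U_1 = Y_1/Y_2$.

```python
# Verify f_{U1}(u1) = 2*u1 by rejection sampling from the joint pdf.
import numpy as np

rng = np.random.default_rng(6)
y1, y2 = rng.uniform(size=(2, 2_000_000))
# accept uniform square points in {y1 < y2} with probability y1*y2 (<= 1)
keep = (y1 < y2) & (rng.uniform(size=2_000_000) < y1 * y2)
u1 = y1[keep] / y2[keep]
print(u1.mean())            # ~ 2/3, the mean of the density 2*u1 on (0, 1)
print(np.mean(u1 <= 0.5))   # ~ 0.25, since F(u1) = u1^2
```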