Other Common Univariate Distributions
Dear students, besides the Normal, Bernoulli and Binomial
distributions, the following distributions are also very
important in our studies.
1. Discrete Distributions
1.1. Geometric distribution
This is a discrete waiting-time distribution. Suppose a
sequence of independent Bernoulli trials is performed, and
let X be the number of failures preceding the first success.
Then $X \sim \mathrm{Geom}(p)$, with pdf
$$f(x) = p(1-p)^x, \quad x = 0, 1, 2, \ldots$$
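As a quick numerical sanity check, here is a minimal sketch assuming numpy/scipy are available. Note that scipy.stats.geom counts the trial on which the first success occurs (support 1, 2, ...); the failures-before-first-success convention used here corresponds to scipy.stats.nbinom with n = 1. The value p = 0.3 is an arbitrary illustration.

```python
# Check f(x) = p(1-p)^x against scipy.
# scipy.stats.geom has support {1, 2, ...} (trial of the first success);
# the convention in these notes matches scipy.stats.nbinom(n=1, p).
from scipy.stats import nbinom

p = 0.3
for x in range(5):
    manual = p * (1 - p) ** x
    print(x, manual, nbinom.pmf(x, 1, p))  # the two columns should agree
```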
1.2. Negative Binomial distribution
Suppose a sequence of independent Bernoulli trials is
conducted. If X is the number of failures preceding the nth
success, then X has a negative binomial distribution.
Probability density function:
$$f(x) = \underbrace{\binom{n+x-1}{n-1}}_{\text{ways of allocating failures and successes}} p^n (1-p)^x, \quad x = 0, 1, 2, \ldots$$
$$E(X) = \frac{n(1-p)}{p}, \qquad \mathrm{Var}(X) = \frac{n(1-p)}{p^2},$$
$$M_X(t) = \left[\frac{p}{1-(1-p)e^t}\right]^n, \quad t < -\ln(1-p).$$
1. If n = 1, we obtain the geometric distribution.
2. It also arises as the sum of n independent geometric
random variables (see the simulation sketch below).
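A minimal simulation sketch of point 2, assuming numpy/scipy; the values n = 4 and p = 0.3 are arbitrary illustration choices.

```python
# The sum of n independent Geom(p) failure counts should be NegBin(n, p);
# compare simulated sums with scipy.stats.nbinom moments.
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(0)
n, p, reps = 4, 0.3, 100_000
# numpy's rng.geometric returns the trial of the first success (1, 2, ...),
# so subtract 1 to count failures, matching the convention above.
sums = (rng.geometric(p, size=(reps, n)) - 1).sum(axis=1)
print(sums.mean(), nbinom.mean(n, p))   # both ~ n(1-p)/p = 9.33
print(sums.var(),  nbinom.var(n, p))    # both ~ n(1-p)/p^2 = 31.1
```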
1.3. Poisson distribution
Parameter: rate $\lambda > 0$
MGF: $M_X(t) = e^{\lambda(e^t - 1)}$
Probability density function:
$$f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$
1. The Poisson distribution arises as the distribution for the
number of “point events” observed from a Poisson process.
Examples: see Figure 1.
[Figure 1: Poisson Example]
2. The Poisson distribution also arises as the limiting form
of the binomial distribution:
$$n \to \infty, \quad p \to 0, \quad np \to \lambda.$$
The derivation of the Poisson distribution (via the binomial)
is underpinned by a Poisson process, i.e., a point process on
$[0, \infty)$; see Figure 1.
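To see this limit numerically, here is a minimal sketch (assuming scipy is available) comparing $B(n, p)$ with $\mathrm{Poisson}(\lambda)$ while $np = \lambda$ is held fixed; $\lambda = 2$ is an arbitrary illustration value.

```python
# Numerical look at B(n, p) -> Poisson(lambda) with np = lambda fixed
# as n grows and p shrinks.
from scipy.stats import binom, poisson

lam = 2.0
for n in (10, 100, 10_000):
    p = lam / n
    # total variation distance between B(n, p) and Poisson(lam)
    tv = 0.5 * sum(abs(binom.pmf(k, n, p) - poisson.pmf(k, lam))
                   for k in range(40))
    print(n, round(tv, 6))  # shrinks toward 0 as n grows
```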
AXIOMS for a Poisson process of rate $\lambda > 0$ (that is, a
counting process is a Poisson process if it satisfies the
following rules):
(A) The numbers of occurrences in disjoint intervals are
independent.
(B) The probability of exactly 1 occurrence in any sub-interval
$[t, t+h)$ is $\lambda h + o(h)$ as $h \to 0$ (approximately, the
probability equals the length of the interval, $h$, times $\lambda$).
(C) The probability of more than one occurrence in $[t, t+h)$ is
$o(h)$ as $h \to 0$ (i.e., the probability is small, negligible).
Note: $o(h)$, pronounced "small order $h$", is standard notation
for any function $r(h)$ with the property
$$\lim_{h \to 0} \frac{r(h)}{h} = 0.$$
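These axioms can be turned directly into a small simulation: chop $[0, 1]$ into $m$ tiny sub-intervals and put an event in each independently with probability $\lambda h = \lambda/m$, per axioms (A) and (B) (axiom (C) is exactly the approximation being made). A sketch assuming numpy/scipy, with $\lambda = 3$ as an arbitrary illustration value:

```python
# Chop [0, 1] into m sub-intervals, place an event in each independently
# with probability lambda/m, and check the total count is ~ Poisson(lambda).
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
lam, m, reps = 3.0, 10_000, 50_000      # m sub-intervals of width h = 1/m
counts = rng.binomial(m, lam / m, size=reps)  # sum of m Bernoulli(lam/m)
print(counts.mean(), counts.var())            # both ~ lam = 3
print(np.mean(counts == 2), poisson.pmf(2, lam))  # empirical vs exact P(N=2)
```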
1.4. Hypergeometric distribution
Consider an urn containing M black and N white balls.
Suppose n balls are sampled randomly without replacement
and let X be the number of black balls chosen. Then X has a
hypergeometric distribution.
Parameters: $M, N > 0$, $0 < n \le M + N$
Possible values: $\max(0, n-N) \le x \le \min(n, M)$
Prob. density function:
$$f(x) = \frac{\binom{M}{x}\binom{N}{n-x}}{\binom{M+N}{n}},$$
$$E(X) = \frac{nM}{M+N}, \qquad \mathrm{Var}(X) = \frac{M+N-n}{M+N-1} \cdot \frac{nMN}{(M+N)^2}.$$
The mgf exists, but there is no useful expression available.
1. The hypergeometric pdf is simply
$$\frac{\#\{\text{samples with } x \text{ black balls}\}}{\#\{\text{possible samples}\}} = \frac{\binom{M}{x}\binom{N}{n-x}}{\binom{M+N}{n}}.$$
2. To see how the limits arise, observe that we must have
$x \le n$ (i.e., no more than the sample size of black balls
in the sample). Also, $x \le M$; hence $x \le \min(n, M)$.
Similarly, we must have $x \ge 0$ (i.e., we cannot have fewer
than 0 black balls in the sample), and $n - x \le N$ (i.e., we
cannot have more white balls than the number in the urn),
i.e., $x \ge n - N$; hence $x \ge \max(0, n-N)$.
3. If we sample with replacement, we would get
$X \sim B(n, p = \frac{M}{M+N})$. It is interesting to compare moments:

Hypergeometric: $E(X) = np$, $\mathrm{Var}(X) = \frac{M+N-n}{M+N-1}\,[np(1-p)]$
Binomial: $E(X) = np$, $\mathrm{Var}(X) = np(1-p)$

The factor $\frac{M+N-n}{M+N-1}$ is the finite population
correction. When we sample all the balls in the urn
($n = M+N$), $\mathrm{Var}(X) = 0$.
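A quick numerical look at this comparison, assuming scipy is available. One caveat: scipy.stats.hypergeom uses different letters than these notes (its M = population size, n = number of black balls, N = sample size); the counts 30/70/20 below are arbitrary illustration values.

```python
# Finite population correction: hypergeometric vs binomial moments.
from scipy.stats import hypergeom, binom

M_black, N_white, n_draw = 30, 70, 20
pop = M_black + N_white
p = M_black / pop

hg = hypergeom(pop, M_black, n_draw)   # scipy order: (pop, #black, #draws)
b = binom(n_draw, p)
print(hg.mean(), b.mean())                             # both np = 6.0
print(hg.var(), b.var() * (pop - n_draw) / (pop - 1))  # agree once fpc applied
print(b.var())                                         # np(1-p), no fpc: larger
```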
2. Continuous Distributions
2.1 Uniform Distribution
For $X \sim U(a, b)$, its pdf and cdf are:
$$f(x) = \begin{cases} \frac{1}{b-a}, & a < x < b \\ 0, & \text{otherwise,} \end{cases}$$
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{a}^{x} \frac{1}{b-a}\,dt = \frac{x-a}{b-a}, \quad \text{for } a < x < b.$$
The more complete form of the cdf is:
$$F(x) = \begin{cases} 0, & x \le a \\ \frac{x-a}{b-a}, & a < x < b \\ 1, & b \le x. \end{cases}$$
Its mgf is:
$$M_X(t) = \frac{e^{tb} - e^{ta}}{t(b-a)}.$$
A special case is the $X \sim U(0, 1)$ distribution:
$$f(x) = \begin{cases} 1, & 0 < x < 1 \\ 0, & \text{otherwise,} \end{cases} \qquad F(x) = x \ \text{ for } 0 < x < 1,$$
$$E(X) = \frac{1}{2}, \qquad \mathrm{Var}(X) = \frac{1}{12}, \qquad M(t) = \frac{e^t - 1}{t}.$$
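A short simulation check of these $U(0,1)$ facts, as a sketch assuming numpy; the evaluation point $t = 0.7$ for the mgf is arbitrary.

```python
# Check E(X) = 1/2, Var(X) = 1/12 for U(0, 1), and the mgf at one point.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 1_000_000)
print(x.mean(), x.var())                # ~ 0.5 and ~ 1/12 = 0.0833
t = 0.7
print(np.mean(np.exp(t * x)), (np.exp(t) - 1) / t)  # empirical vs exact M(t)
```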
2.2 Exponential Distribution
PDF: $f(x) = \lambda e^{-\lambda x}, \quad x \ge 0, \ \lambda > 0$
CDF: $F(x) = 1 - e^{-\lambda x}, \quad x \ge 0$
MGF: $M_X(t) = \frac{\lambda}{\lambda - t}, \quad t < \lambda$
This is the distribution for the waiting time until the first
occurrence in a Poisson process with rate parameter $\lambda > 0$.
1. If $X \sim \exp(\lambda)$, then
$$P(X \ge t + x \mid X \ge t) = P(X \ge x)$$
(memoryless property).
2. It can be obtained as the limiting form of the geometric
distribution.
3. One can also easily show that the geometric distribution
has the memoryless property. (A numerical check of the
exponential case is sketched below.)
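A minimal check of the memoryless property, assuming scipy; note scipy.stats.expon is parameterized by scale = $1/\lambda$, and the values of $\lambda$, $t$, $x$ below are arbitrary illustration choices.

```python
# Memoryless property: P(X >= t + x | X >= t) = P(X >= x) for X ~ exp(lam).
from scipy.stats import expon

lam, t, x = 2.0, 0.5, 1.2
X = expon(scale=1 / lam)          # scipy uses scale = 1/lambda
cond = X.sf(t + x) / X.sf(t)      # P(X >= t + x) / P(X >= t)
print(cond, X.sf(x))              # equal: both exp(-lam * x)
```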
2.3 Gamma distribution
$$f(x) = \frac{\lambda^r x^{r-1} e^{-\lambda x}}{\Gamma(r)}, \quad x > 0, \ r > 0, \ \lambda > 0.$$
Or, some books use $\alpha$ in place of $r$ for the shape parameter.
Then:
$$E(X) = \frac{r}{\lambda}, \qquad \mathrm{Var}(X) = \frac{r}{\lambda^2}, \qquad M_X(t) = \left(\frac{\lambda}{\lambda - t}\right)^r, \quad t < \lambda.$$
If $r$ is a positive integer, then $\Gamma(r) = (r-1)!$.
Special case: when $r = 1$, the gamma reduces to the exponential distribution $\exp(\lambda)$.
Special case: when $r = k/2$ and $\lambda = 1/2$, we obtain the chi-square distribution with $k$ degrees of freedom (see note 2 below).
Review
p.d.f. and m.g.f. of the gamma distribution (as given above).
[Three worked examples followed here; their equations were rendered as images in the original and are not recoverable in this transcript.]
*** In general, for $X \sim \mathrm{gamma}(\alpha, \lambda)$, $\alpha > 0$, $\lambda > 0$.
(Note: sometimes we use $r$ instead of $\alpha$, as shown above.)
1. $\alpha$ is the shape parameter and $\lambda$ is the rate
(inverse-scale) parameter.
Note: if $Y \sim \mathrm{gamma}(\alpha, 1)$ and $X = Y/\lambda$,
then $X \sim \mathrm{gamma}(\alpha, \lambda)$; that is, $1/\lambda$
sets the scale of the distribution.
Figure 2: Gamma Distribution
2. The $\mathrm{Gamma}(\frac{k}{2}, \frac{1}{2})$ distribution is
also called the $\chi^2_k$ (chi-square with $k$ df) distribution,
if $k$ is a positive integer.
3. The $\mathrm{gamma}(K, \lambda)$ random variable can also be
interpreted as the waiting time until the $K$-th occurrence
(event) in a Poisson process. (Both facts are checked numerically
in the sketch below.)
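A sketch of both checks, assuming scipy. One caveat: scipy.stats.gamma takes a shape a and a scale argument, where scale = $1/\lambda$ in the notation of these notes; k = 5, K = 3, and $\lambda = 2$ are arbitrary illustration values.

```python
# (i) gamma(k/2, lambda = 1/2) matches chi-square with k df;
# (ii) the sum of K exponential(lam) waiting times is gamma(K, lam).
import numpy as np
from scipy.stats import gamma, chi2, kstest

k = 5
# rate 1/2 means scale = 2 in scipy's parameterization
print(gamma.pdf(2.0, a=k / 2, scale=2), chi2.pdf(2.0, k))  # identical

rng = np.random.default_rng(3)
K, lam = 3, 2.0
waits = rng.exponential(1 / lam, size=(100_000, K)).sum(axis=1)
print(kstest(waits, gamma(a=K, scale=1 / lam).cdf).pvalue)  # large p-value
```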
2.4 Beta density function
Suppose $Y_1 \sim \mathrm{Gamma}(\alpha, \lambda)$ and $Y_2 \sim \mathrm{Gamma}(\beta, \lambda)$ independently. Then
$$X = \frac{Y_1}{Y_1 + Y_2} \sim B(\alpha, \beta), \quad 0 \le x \le 1.$$
Remark: we derived this in Lecture 5 (transformations). I have
copied it here again as a review.
e.g. Suppose that $Y_1 \sim \mathrm{gamma}(\alpha, 1)$ and $Y_2 \sim \mathrm{gamma}(\beta, 1)$, and that $Y_1$ and $Y_2$ are independent. Define the transformation
$$U_1 = g_1(Y_1, Y_2) = Y_1 + Y_2, \qquad U_2 = g_2(Y_1, Y_2) = \frac{Y_1}{Y_1 + Y_2}.$$
Find each of the following distributions:
(a) $f_{U_1, U_2}(u_1, u_2)$, the joint distribution of $U_1$ and $U_2$;
(b) $f_{U_1}(u_1)$, the marginal distribution of $U_1$; and
(c) $f_{U_2}(u_2)$, the marginal distribution of $U_2$.
Solutions. (a) Since $Y_1$ and $Y_2$ are independent, the joint distribution of $Y_1$ and $Y_2$ is
$$f_{Y_1, Y_2}(y_1, y_2) = f_{Y_1}(y_1) f_{Y_2}(y_2) = \frac{1}{\Gamma(\alpha)} y_1^{\alpha-1} e^{-y_1} \times \frac{1}{\Gamma(\beta)} y_2^{\beta-1} e^{-y_2} = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} y_1^{\alpha-1} y_2^{\beta-1} e^{-(y_1+y_2)},$$
for $y_1 > 0$, $y_2 > 0$, and 0 otherwise. Here, $R_{Y_1, Y_2} = \{(y_1, y_2) : y_1 > 0, y_2 > 0\}$. By inspection, we see that $u_1 = y_1 + y_2 > 0$, and $u_2 = \frac{y_1}{y_1 + y_2}$ must fall between 0 and 1.
Thus, the domain of $U = (U_1, U_2)$ is given by
$$R_{U_1, U_2} = \{(u_1, u_2) : u_1 > 0, \ 0 < u_2 < 1\}.$$
The next step is to derive the inverse transformation. It follows that
$$u_1 = y_1 + y_2, \quad u_2 = \frac{y_1}{y_1 + y_2} \ \Rightarrow \ y_1 = g_1^{-1}(u_1, u_2) = u_1 u_2, \quad y_2 = g_2^{-1}(u_1, u_2) = u_1 - u_1 u_2.$$
The Jacobian is given by
$$J = \det \begin{vmatrix} \frac{\partial g_1^{-1}}{\partial u_1} & \frac{\partial g_1^{-1}}{\partial u_2} \\[2pt] \frac{\partial g_2^{-1}}{\partial u_1} & \frac{\partial g_2^{-1}}{\partial u_2} \end{vmatrix} = \det \begin{vmatrix} u_2 & u_1 \\ 1 - u_2 & -u_1 \end{vmatrix} = -u_1 u_2 - u_1(1 - u_2) = -u_1.$$
We now write the joint distribution for $U = (U_1, U_2)$. For $u_1 > 0$ and $0 < u_2 < 1$, we have that
$$f_{U_1, U_2}(u_1, u_2) = f_{Y_1, Y_2}\left[g_1^{-1}(u_1, u_2), g_2^{-1}(u_1, u_2)\right] |J| = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} (u_1 u_2)^{\alpha-1} (u_1 - u_1 u_2)^{\beta-1} e^{-[(u_1 u_2) + (u_1 - u_1 u_2)]} \times |-u_1|$$
$$= \frac{1}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} u_2^{\alpha-1} (1 - u_2)^{\beta-1}.$$
Note: We see that $U_1$ and $U_2$ are independent, since the domain $R_{U_1, U_2} = \{(u_1, u_2) : u_1 > 0, \ 0 < u_2 < 1\}$ does not constrain $u_1$ by $u_2$ or vice versa, and since the nonzero part of $f_{U_1, U_2}(u_1, u_2)$ can be factored into the two expressions $h_1(u_1)$ and $h_2(u_2)$, where
$$h_1(u_1) = u_1^{\alpha+\beta-1} e^{-u_1} \quad \text{and} \quad h_2(u_2) = \frac{u_2^{\alpha-1}(1 - u_2)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)}.$$
(b) To obtain the marginal distribution of $U_1$, we integrate the joint pdf $f_{U_1, U_2}(u_1, u_2)$ over $u_2$. That is, for $u_1 > 0$,
$$f_{U_1}(u_1) = \int_{u_2=0}^{1} f_{U_1, U_2}(u_1, u_2) \, du_2 = \int_{u_2=0}^{1} \frac{u_2^{\alpha-1}(1 - u_2)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} \, du_2 = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} \int_{u_2=0}^{1} u_2^{\alpha-1}(1 - u_2)^{\beta-1} \, du_2.$$
Since $u_2^{\alpha-1}(1 - u_2)^{\beta-1}$ is the $\mathrm{beta}(\alpha, \beta)$ kernel, the remaining integral equals $\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$, so
$$f_{U_1}(u_1) = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} \times \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} = \frac{1}{\Gamma(\alpha+\beta)} u_1^{\alpha+\beta-1} e^{-u_1}.$$
Summarizing,
$$f_{U_1}(u_1) = \begin{cases} \frac{1}{\Gamma(\alpha+\beta)} u_1^{\alpha+\beta-1} e^{-u_1}, & u_1 > 0 \\ 0, & \text{otherwise.} \end{cases}$$
We recognize this as a $\mathrm{gamma}(\alpha+\beta, 1)$ pdf; thus, marginally, $U_1 \sim \mathrm{gamma}(\alpha+\beta, 1)$.
(c) To obtain the marginal distribution of $U_2$, we integrate the joint pdf $f_{U_1, U_2}(u_1, u_2)$ over $u_1$. That is, for $0 < u_2 < 1$,
$$f_{U_2}(u_2) = \int_{u_1=0}^{\infty} f_{U_1, U_2}(u_1, u_2) \, du_1 = \int_{u_1=0}^{\infty} \frac{u_2^{\alpha-1}(1 - u_2)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)} u_1^{\alpha+\beta-1} e^{-u_1} \, du_1 = \frac{u_2^{\alpha-1}(1 - u_2)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)} \int_{u_1=0}^{\infty} u_1^{\alpha+\beta-1} e^{-u_1} \, du_1 = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} u_2^{\alpha-1}(1 - u_2)^{\beta-1}.$$
Summarizing,
$$f_{U_2}(u_2) = \begin{cases} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} u_2^{\alpha-1}(1 - u_2)^{\beta-1}, & 0 < u_2 < 1 \\ 0, & \text{otherwise.} \end{cases}$$
Thus, marginally, $U_2 \sim \mathrm{beta}(\alpha, \beta)$. □
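The derivation can also be corroborated by simulation. A sketch assuming numpy/scipy; the shapes $\alpha = 2$ and $\beta = 5$ are arbitrary illustration values.

```python
# Y1/(Y1 + Y2) with independent gamma(alpha, 1) and gamma(beta, 1) draws
# should follow Beta(alpha, beta).
import numpy as np
from scipy.stats import beta, kstest

rng = np.random.default_rng(4)
a, b, reps = 2.0, 5.0, 100_000
y1 = rng.gamma(a, 1.0, reps)
y2 = rng.gamma(b, 1.0, reps)
u2 = y1 / (y1 + y2)
print(u2.mean(), a / (a + b))             # ~ 2/7
print(kstest(u2, beta(a, b).cdf).pvalue)  # large p-value: fits Beta(2, 5)
```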
Figure. Selected Beta pdf’s
(https://en.wikipedia.org/wiki/Beta_distribution).
2.5 Standard Cauchy distribution
Possible values: $x \in \mathbb{R}$
PDF: $f(x) = \frac{1}{\pi}\left(\frac{1}{1 + x^2}\right)$ (location parameter $\theta = 0$)
CDF: $F(x) = \frac{1}{2} + \frac{1}{\pi} \arctan x$
$E(X)$, $\mathrm{Var}(X)$, and $M_X(t)$ do not exist.
The Cauchy is a bell-shaped distribution symmetric about
zero for which no moments are defined.
If $Z_1 \sim N(0, 1)$ and $Z_2 \sim N(0, 1)$ independently, then
$X = Z_1 / Z_2$ follows the standard Cauchy distribution. Please
also see the lecture notes on transformations for the proof
(Lecture 5).
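A minimal simulation sketch of the ratio result, assuming numpy; the evaluation points are arbitrary.

```python
# The ratio of two independent N(0, 1) variables should be standard Cauchy;
# compare the empirical cdf with F(x) = 1/2 + arctan(x)/pi at a few points.
import numpy as np

rng = np.random.default_rng(5)
z1, z2 = rng.standard_normal(1_000_000), rng.standard_normal(1_000_000)
x = z1 / z2
for q in (-1.0, 0.0, 2.0):
    print(q, np.mean(x <= q), 0.5 + np.arctan(q) / np.pi)
# Note: the sample mean of x does not settle down as the sample grows,
# consistent with E(X) not existing.
```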
HW4: 3.2, 3.5, 3.7, 3.9, 3.12, 3.18, 3.23
HW3 Q.1. Suppose that $Y_1$ and $Y_2$ are random variables with joint pdf
$$f_{Y_1, Y_2}(y_1, y_2) = \begin{cases} 8 y_1 y_2, & 0 < y_1 < y_2 < 1 \\ 0, & \text{otherwise.} \end{cases}$$
Find the pdf of $U_1 = Y_1 / Y_2$.
It is given that the joint pdf of $Y_1$ and $Y_2$ is
$$f_{Y_1, Y_2}(y_1, y_2) = 8 y_1 y_2, \quad 0 < y_1 < y_2 < 1.$$
Now we need to introduce a second random variable $U_2$ which is a function of $Y_1$ and $Y_2$. We wish to do so in a way that the resulting bivariate transformation is one-to-one and our task of finding the pdf of $U_1$ is as easy as possible. Our choice of $U_2$ is, of course, not unique. Let us define $U_2 = Y_2$. Then the inverse transformation is:
$$Y_1 = U_1 U_2, \qquad Y_2 = U_2.$$
From this, we find the Jacobian:
$$J = \det \begin{vmatrix} u_2 & u_1 \\ 0 & 1 \end{vmatrix} = u_2.$$
To determine the new joint domain $B$ of $U_1$ and $U_2$, we note that
$$0 < y_1 < y_2 < 1 \ \Rightarrow \ 0 < u_1 u_2 < u_2 < 1,$$
equivalently $0 < u_1 < 1$ and $0 < u_2 < 1$, so $B$ is the open unit square. On $B$,
$$f_{U_1, U_2}(u_1, u_2) = 8 (u_1 u_2)(u_2) \cdot |u_2| = 8 u_1 u_2^3.$$
Thus, the marginal pdf of $U_1$ is obtained by integrating
$f_{U_1, U_2}(u_1, u_2)$ with respect to $u_2$, yielding
$$f_{U_1}(u_1) = \int_0^1 8 u_1 u_2^3 \, du_2 = 2 u_1, \quad 0 < u_1 < 1.$$
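As a check on this answer, here is a sketch assuming numpy: since $8 y_1 y_2$ on $\{0 < y_1 < y_2 < 1\}$ is proportional to $y_1 y_2$ there, we can sample the joint pdf by simple rejection from the unit square and then look at $U_1 = Y_1/Y_2$.

```python
# Verify f_{U1}(u1) = 2*u1 by rejection sampling from the joint pdf.
import numpy as np

rng = np.random.default_rng(6)
y1, y2 = rng.uniform(size=(2, 2_000_000))
# accept uniform square points in {y1 < y2} with probability y1*y2 (<= 1)
keep = (y1 < y2) & (rng.uniform(size=2_000_000) < y1 * y2)
u1 = y1[keep] / y2[keep]
print(u1.mean())            # ~ 2/3, the mean of the density 2*u1 on (0, 1)
print(np.mean(u1 <= 0.5))   # ~ 0.25, since F(u1) = u1^2
```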