San José State University
Math 161A: Applied Probability & Statistics
A summary and two new continuous distributions
Prof. Guangliang Chen
A summary first
We have covered a total of 9 distributions thus far:
• 6 discrete distributions
• 3 continuous distributions
Special discrete distributions
• Bernoulli
• Binomial
• HyperGeometric
• Geometric
• Negative Binomial
• Poisson
Prof. Guangliang Chen | Mathematics & Statistics, San José State University 3/31
The Bernoulli distribution (X ∼ Bernoulli(p))
• Probability mass function:
$f_X(x) = p^x (1 - p)^{1 - x}, \quad x = 0, 1$
• Example: Toss a coin once and let X = #heads.
• Expected value: E(X) = p
• Variance: Var(X) = p(1− p)
The Binomial distribution (X ∼ B(n, p))
• Probability mass function
$f_X(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$
• Example: Toss a coin n times and let X = #heads.
• Expected value: E(X) = np
• Variance: Var(X) = np(1− p)
• Special case: B(n = 1, p) = Bernoulli(p)
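The Binomial formulas above can be checked numerically. The following is a minimal Python sketch using only the standard library; the values n = 10 and p = 0.3 are illustrative assumptions, not taken from the slides.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # illustrative values (assumed)
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                        # should be 1
mean = sum(x * f for x, f in enumerate(pmf))            # should equal n*p
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))  # n*p*(1-p)

print(total, mean, variance)
print(binomial_pmf(1, 1, p))  # B(n = 1, p) gives the Bernoulli(p) pmf at x = 1
```

Summing the pmf over its support and comparing against np and np(1 − p) is a cheap sanity check that works the same way for the other discrete distributions in this summary.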
The HyperGeometric distribution (X ∼ HyperGeom(N, r, n))
• Probability mass function
$f_X(x) = \binom{r}{x} \binom{N - r}{n - x} \Big/ \binom{N}{n}, \quad x = 0, 1, \ldots, n$
• Example: Draw n objects, without replacement, from an urn having r red and N − r blue balls. Let X = #red balls selected.
• Expected value: $E(X) = n \frac{r}{N} = np$ (where $p = \frac{r}{N}$)
• Variance: $\mathrm{Var}(X) = np(1 - p) \left( \frac{N - n}{N - 1} \right)$
• Binomial approximation: $\mathrm{HyperGeom}(N, r, n) \approx B(n, p = \frac{r}{N})$ when N, r are both large (relative to n).
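To see how close the Binomial approximation is, one can compare the two pmfs directly. This is a minimal sketch; the urn sizes N = 10000, r = 3000 and the sample size n = 10 are assumed values chosen so that N and r are large relative to n.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, r: int, n: int) -> float:
    """P(X = x) red balls when drawing n without replacement (r red, N - r blue)."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

N, r, n = 10_000, 3_000, 10  # illustrative values (assumed)
p = r / N

# Largest pointwise difference between the two pmfs over x = 0, ..., n
max_gap = max(
    abs(hypergeom_pmf(x, N, r, n) - binomial_pmf(x, n, p)) for x in range(n + 1)
)
hyper_mean = sum(x * hypergeom_pmf(x, N, r, n) for x in range(n + 1))

print(max_gap)     # small when N, r are large relative to n
print(hyper_mean)  # should match n * r / N
```

Shrinking N while keeping n fixed (for example N = 30, r = 9) makes the gap visibly larger, which is the situation where sampling without replacement matters.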
The Geometric distribution (X ∼ Geom(p))
• Probability mass function:
$f(x) = (1 - p)^{x - 1} p, \quad x = 1, 2, \ldots$
• Example: Toss a coin repeatedly until a head first appears. Let X = #tosses in total.
• Expected value: $E(X) = \frac{1}{p}$
• Variance: $\mathrm{Var}(X) = \frac{1 - p}{p^2}$
The Negative Binomial distribution (X ∼ NB(p, r))
• Probability mass function:
$f(x) = \binom{x - 1}{r - 1} p^r (1 - p)^{x - r}, \quad x = r, r + 1, r + 2, \ldots$
• Example: Toss a coin repeatedly until a total of r heads have been obtained. Let X = #tosses in total.
• Expected value: $E(X) = \frac{r}{p}$
• Variance: $\mathrm{Var}(X) = \frac{r(1 - p)}{p^2}$
• Note: If $X_1, \ldots, X_r \overset{\text{iid}}{\sim} \mathrm{Geom}(p)$, then $X = \sum_{i=1}^{r} X_i \sim \mathrm{NB}(p, r)$.
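The Negative Binomial mean and variance, and the r = 1 link to the Geometric distribution, can be verified from the pmf. This is a sketch under assumed values p = 0.4 and r = 3; the infinite support is truncated at 400, where the remaining tail is negligible for these values.

```python
from math import comb, isclose

def neg_binomial_pmf(x: int, p: float, r: int) -> float:
    """P(X = x) for X ~ NB(p, r): total tosses needed to obtain r heads."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) for X ~ Geom(p): tosses until the first head."""
    return (1 - p) ** (x - 1) * p

p, r = 0.4, 3            # illustrative values (assumed)
support = range(r, 400)  # truncated support; the tail beyond 400 is negligible here

mean = sum(x * neg_binomial_pmf(x, p, r) for x in support)
variance = sum((x - mean) ** 2 * neg_binomial_pmf(x, p, r) for x in support)

print(mean, r / p)                    # both should agree
print(variance, r * (1 - p) / p**2)   # both should agree

# With r = 1 the Negative Binomial pmf coincides with the Geometric pmf
r1_matches_geometric = all(
    isclose(neg_binomial_pmf(x, p, 1), geometric_pmf(x, p)) for x in range(1, 20)
)
print(r1_matches_geometric)
```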
The Poisson distribution (X ∼ Pois(λ))
• Probability mass function:
$f(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, \ldots$
• Example: X = #hurricanes that hit a region each year
• Expected value: E(X) = λ
• Variance: Var(X) = λ
• Note: B(n, p) ≈ Poisson(λ) for large n and small p such that λ = np
is moderate.
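The Poisson approximation to the Binomial can be illustrated by comparing the two pmfs. A minimal sketch follows; n = 1000 and p = 0.003 (so λ = 3) are assumed values meant to represent "large n, small p, moderate λ".

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Pois(lam)."""
    return lam**x / factorial(x) * exp(-lam)

n, p = 1000, 0.003  # illustrative values (assumed)
lam = n * p

# Largest pointwise difference over the values of x that carry almost all the mass
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(31))

for x in range(6):
    print(x, round(binomial_pmf(x, n, p), 5), round(poisson_pmf(x, lam), 5))
print(max_gap)
```

Repeating the comparison with a larger p (say n = 10, p = 0.3, which also gives λ = 3) shows a noticeably worse match, consistent with the "small p" requirement.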
Special continuous distributions
• Uniform
• Exponential
• Normal
The Uniform distribution (X ∼ Unif(a, b))
• Probability density function:
$f(x) = \frac{1}{b - a}, \quad a < x < b$
• Cumulative distribution function:
$F(x) = \frac{x - a}{b - a}, \quad a < x < b$
• Expected value: $E(X) = \frac{a + b}{2}$
• Variance: $\mathrm{Var}(X) = \frac{(b - a)^2}{12}$
The Normal distribution (X ∼ N(µ, σ²))
• Probability density function (bell-shaped, symmetric and unimodal):
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$
It is called standard normal if µ = 0, σ = 1.
• Example: Measurements, test scores of a large class, etc.
• Mean and variance: E(X) = µ, and Var(X) = σ²
• Standardization: If X ∼ N(µ, σ²), then $\frac{X - \mu}{\sigma} \sim N(0, 1)$
• Normal approximation to binomial: If np ≥ 10 and n(1 − p) ≥ 10, then
$B(n, p) \approx N(\underbrace{np}_{\mu}, \underbrace{np(1 - p)}_{\sigma^2})$
This means that for any integer 0 < x < n,
$P(\underbrace{X}_{\text{binomial}} = x) \approx P(x - 0.5 < \underbrace{X}_{\text{normal}} < x + 0.5)$
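The continuity correction can be tried out numerically. The sketch below assumes n = 100, p = 0.4 and x = 40 (so np = 40 and n(1 − p) = 60 both exceed 10) and evaluates the normal cdf through the error function `math.erf`.

```python
from math import comb, erf, sqrt

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), written via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p, x = 100, 0.4, 40  # illustrative values (assumed)
mu = n * p
sigma = sqrt(n * p * (1 - p))

exact = binomial_pmf(x, n, p)
# Continuity correction: P(X = x) is approximated by P(x - 0.5 < X < x + 0.5)
approx = normal_cdf(x + 0.5, mu, sigma) - normal_cdf(x - 0.5, mu, sigma)

print(exact, approx)
```

The ±0.5 window is what turns a single-point binomial probability, which would be zero under a continuous distribution, into an interval probability under the normal curve.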
The Exponential distribution (X ∼ Exp(λ))
• Probability density function
$f(x) = \lambda e^{-\lambda x}, \quad x > 0.$
• Cumulative distribution function:
$F(x) = 1 - e^{-\lambda x}, \quad x > 0$
• Complementary CDF:
$\bar{F}(x) = 1 - F(x) = e^{-\lambda x}, \quad x > 0$
• Example: Waiting times
• Mean and variance:
$E(X) = \frac{1}{\lambda}, \quad \mathrm{Var}(X) = \frac{1}{\lambda^2}$
• Exponential random variables have the memoryless property:
$P(X > t + t_0 \mid X > t_0) = P(X > t)$, for all $t_0, t > 0$
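The memoryless property follows in a few lines from the complementary CDF $\bar{F}(x) = e^{-\lambda x}$. The derivation below is a short worked sketch; it uses only the definition of conditional probability and the fact that the event {X > t + t0} is contained in {X > t0}.

```latex
\begin{align*}
P(X > t + t_0 \mid X > t_0)
  &= \frac{P(X > t + t_0,\ X > t_0)}{P(X > t_0)}
   = \frac{P(X > t + t_0)}{P(X > t_0)} \\
  &= \frac{\bar{F}(t + t_0)}{\bar{F}(t_0)}
   = \frac{e^{-\lambda (t + t_0)}}{e^{-\lambda t_0}}
   = e^{-\lambda t}
   = P(X > t).
\end{align*}
```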
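[Diagram: relationships among the distributions covered so far]
• B(n, p) with n = 1 is Bernoulli(p)
• B(n, p) → Pois(λ) as n → ∞, p → 0, with np = λ
• B(n, p) → N(µ, σ²) as n → ∞, with µ = np, σ² = np(1 − p): $\frac{X - np}{\sqrt{np(1 - p)}} \sim N(0, 1)$ approximately
• N(µ, σ²) with µ = 0, σ² = 1 is N(0, 1)
• NB(p, r) with r = 1 is Geom(p)
• Other nodes: HyperGeom(N, r, n), Unif(a, b), Exp(λ)
• Edge labels: waiting time, draw balls w/o replacement, fixed #successes, fixed #trials, #occurrences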
So far we have only covered three special continuous distributions:
• Uniform
• Exponential
• Normal
Next, we present two extra continuous distributions:
• Gamma
• Chi-squared
This is from Section 4.4 of the book.
The Gamma distribution
The Gamma distribution is defined based on the Gamma function.
Def 0.1. The Gamma function is a function $\Gamma : (0, \infty) \to (0, \infty)$ with
$\Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} e^{-x} \, dx, \quad \alpha > 0$
(The Gamma function can be seen as a way to generalize factorials from integers to non-integers, e.g., 2.4!)
Properties:
• $\Gamma(1) = 1$
• For any $\alpha > 0$, $\Gamma(\alpha + 1) = \alpha \cdot \Gamma(\alpha)$
• For any positive integer n, $\Gamma(n) = (n - 1)!$
• $\Gamma(\frac{1}{2}) = \sqrt{\pi}$
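These properties are easy to confirm with `math.gamma` from the Python standard library. The sketch below is only a numerical spot check at a few assumed test points (for example α = 2.4), not a proof.

```python
from math import factorial, gamma, isclose, pi, sqrt

alpha = 2.4  # arbitrary non-integer test point (assumed)

checks = {
    "Gamma(1) = 1": isclose(gamma(1.0), 1.0),
    "Gamma(alpha + 1) = alpha * Gamma(alpha)": isclose(
        gamma(alpha + 1), alpha * gamma(alpha)
    ),
    "Gamma(n) = (n - 1)!": all(
        isclose(gamma(n), factorial(n - 1)) for n in range(1, 11)
    ),
    "Gamma(1/2) = sqrt(pi)": isclose(gamma(0.5), sqrt(pi)),
}
for name, ok in checks.items():
    print(name, ok)

# "2.4!" in the generalized sense is Gamma(2.4 + 1); it lies between 2! and 3!
generalized_factorial = gamma(alpha + 1)
print(generalized_factorial)
```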
https://www.medcalc.org/manual/gamma_function.php
The one-parameter Gamma distribution
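The Gamma distribution uses the template function $x^{\alpha - 1} e^{-x}$ over $(0, \infty)$ to model certain random phenomena. The normalizing constant C is determined by
$1 = \int_0^\infty C x^{\alpha - 1} e^{-x} \, dx = C \cdot \Gamma(\alpha) \implies C = \frac{1}{\Gamma(\alpha)}.$
Def 0.2. Any random variable X that has a pdf of the form
$f(x; \alpha) = \frac{1}{\Gamma(\alpha)} x^{\alpha - 1} e^{-x}, \quad x > 0$
is said to follow a one-parameter Gamma distribution with parameter α. We denote this by X ∼ Gamma(α).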
Remark. When α = 1, the pdf becomes $f(x; 1) = e^{-x}$, so the Gamma distribution reduces to the exponential distribution Exp(λ = 1).
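We introduce a second parameter (β) to make the Gamma distribution more flexible.
From
$1 = \int_0^\infty \frac{1}{\Gamma(\alpha)} x^{\alpha - 1} e^{-x} \, dx,$
by letting x = y/β for some β > 0, we have
$1 = \int_0^\infty \frac{1}{\Gamma(\alpha)} \left( \frac{y}{\beta} \right)^{\alpha - 1} e^{-y/\beta} \, \frac{1}{\beta} \, dy = \int_0^\infty \underbrace{\frac{1}{\beta^\alpha \Gamma(\alpha)} y^{\alpha - 1} e^{-y/\beta}}_{\text{two-parameter Gamma density}} \, dy$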
The two-parameter Gamma distribution
Def 0.3. Any random variable X that has a pdf of the form
$f(x; \alpha, \beta) = \frac{1}{\beta^\alpha \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\beta}, \quad x > 0$
is said to follow a (two-parameter) Gamma distribution with parameters α, β. We denote this by X ∼ Gamma(α, β).
Two special cases:
• Gamma(α, β = 1) = Gamma(α)
• Gamma(α = 1, β) = Exp(λ = 1/β).
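The two-parameter density and its special cases can be checked with a few lines of Python. This is a minimal sketch: the helper names, the test parameters α = 2.5, β = 2, and the crude midpoint-rule integrator are all assumptions made for illustration.

```python
from math import exp, gamma, isclose

def gamma_pdf(x: float, alpha: float, beta: float = 1.0) -> float:
    """Gamma(alpha, beta) density with beta as the scale; beta = 1 gives Gamma(alpha)."""
    return x ** (alpha - 1) * exp(-x / beta) / (beta**alpha * gamma(alpha))

def exponential_pdf(x: float, lam: float) -> float:
    """Exp(lam) density."""
    return lam * exp(-lam * x)

def midpoint_integral(f, lo: float, hi: float, steps: int = 20_000) -> float:
    """Crude midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

alpha, beta = 2.5, 2.0  # illustrative values (assumed)

# The density should integrate to 1; the mass beyond x = 100 is negligible here
area = midpoint_integral(lambda x: gamma_pdf(x, alpha, beta), 0.0, 100.0)
print(area)

# Special case: Gamma(alpha = 1, beta) has the same density as Exp(lam = 1/beta)
same_as_exponential = all(
    isclose(gamma_pdf(x, 1.0, beta), exponential_pdf(x, 1 / beta))
    for x in (0.1, 1.0, 5.0)
)
print(same_as_exponential)
```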
(β is a scale parameter)
Like the normal distribution, there is no closed-form formula for the cdf of the Gamma distribution: For any x > 0,
$F(x; \alpha, \beta) = \int_0^x \frac{1}{\beta^\alpha \Gamma(\alpha)} y^{\alpha - 1} e^{-y/\beta} \, dy = \frac{1}{\Gamma(\alpha)} \int_0^{x/\beta} z^{\alpha - 1} e^{-z} \, dz$
where the last integral is the incomplete Gamma function.
However, the expected value and variance of the Gamma distribution can still be computed explicitly.
Theorem 0.1. If X ∼ Gamma(α, β), then
$E(X) = \alpha\beta, \quad \mathrm{Var}(X) = \alpha\beta^2$
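One way to see why Theorem 0.1 holds is sketched below (the slides state the result without proof). The idea is to recognize each integrand as a constant multiple of another Gamma density, which integrates to 1, and then use Γ(α + 1) = α · Γ(α).

```latex
\begin{align*}
E(X) &= \int_0^\infty x \cdot \frac{1}{\beta^\alpha \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\beta} \, dx
      = \frac{\beta^{\alpha + 1} \Gamma(\alpha + 1)}{\beta^\alpha \Gamma(\alpha)}
        \underbrace{\int_0^\infty \frac{1}{\beta^{\alpha + 1} \Gamma(\alpha + 1)}
        x^{(\alpha + 1) - 1} e^{-x/\beta} \, dx}_{= 1}
      = \alpha \beta, \\
E(X^2) &= \frac{\beta^{\alpha + 2} \Gamma(\alpha + 2)}{\beta^\alpha \Gamma(\alpha)}
        = \alpha (\alpha + 1) \beta^2, \\
\mathrm{Var}(X) &= E(X^2) - [E(X)]^2
        = \alpha (\alpha + 1) \beta^2 - \alpha^2 \beta^2
        = \alpha \beta^2.
\end{align*}
```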
Application of the Gamma distribution
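Consider the experiment of counting the occurrences of a rare event (such as a hurricane) that occurs with rate λ:
[Figure: timeline of the unit interval from 0 to 1 with occurrences marked; the count X (#occurrences) ∼ Pois(λ), and the successive waiting times T1, T2, ..., Tn ∼ Exp(λ)]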
It is known that
• The total number of occurrences of the event in a unit interval of time has a Poisson distribution: X ∼ Pois(λ);
• The separate waiting time for each occurrence of the event has an exponential distribution: $T_1, T_2, \ldots \sim \mathrm{Exp}(\lambda)$ (and are independent);
• The total waiting time for n occurrences of the event has a Gamma distribution:
$T = T_1 + \cdots + T_n \sim \mathrm{Gamma}(\alpha = n, \beta = 1/\lambda)$
This implies that
$E(T) = E(T_1) + \cdots + E(T_n) = n \cdot \frac{1}{\lambda} = \frac{n}{\lambda}$
$\mathrm{Var}(T) = \mathrm{Var}(T_1) + \cdots + \mathrm{Var}(T_n) = n \cdot \frac{1}{\lambda^2} = \frac{n}{\lambda^2}$
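A small simulation can make the waiting-time result concrete. The sketch below assumes λ = 2 and n = 5, sums n independent exponential waiting times with `random.expovariate`, and compares the result with direct draws from `random.gammavariate(alpha, beta)`, whose `beta` argument is a scale parameter like the β used here. The seed and sample size are arbitrary choices.

```python
import random

def sample_mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

rng = random.Random(0)               # fixed seed so the run is reproducible
lam, n, trials = 2.0, 5, 50_000      # illustrative values (assumed)

# Total waiting time for n occurrences: sum of n independent Exp(lam) draws
totals = [sum(rng.expovariate(lam) for _ in range(n)) for _ in range(trials)]

# Direct draws from Gamma(alpha = n, beta = 1/lam) for comparison
direct = [rng.gammavariate(n, 1 / lam) for _ in range(trials)]

print("theory   :", n / lam, n / lam**2)
print("sum of T :", sample_mean(totals), sample_variance(totals))
print("gamma    :", sample_mean(direct), sample_variance(direct))
```

Both simulated samples should land close to the theoretical mean n/λ and variance n/λ², up to sampling noise.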
The chi-squared distribution
Another special case of the Gamma distribution is the chi-squared distribution with parameter k, denoted as $\chi^2(k)$ and sometimes also $\chi^2_k$:
$\mathrm{Gamma}(\alpha = \frac{k}{2}, \beta = 2) = \chi^2(k)$ (k is called the #degrees of freedom),
which plays an important role in statistical inference.
The pdf of the $\chi^2(k)$ distribution is the following:
$f(x; k) = \frac{1}{2^{k/2} \Gamma(k/2)} x^{(k/2) - 1} e^{-x/2}, \quad x > 0$
Its mean and variance are E(X) = k and Var(X) = 2k.
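The chi-squared pdf, mean, and variance all come from substituting α = k/2 and β = 2 into the two-parameter Gamma pdf (Def 0.3) and Theorem 0.1, as the following short worked sketch shows.

```latex
\begin{align*}
f(x; \alpha = \tfrac{k}{2}, \beta = 2)
  &= \frac{1}{2^{k/2} \Gamma(k/2)} x^{(k/2) - 1} e^{-x/2}, \quad x > 0, \\
E(X) &= \alpha \beta = \tfrac{k}{2} \cdot 2 = k, \\
\mathrm{Var}(X) &= \alpha \beta^2 = \tfrac{k}{2} \cdot 2^2 = 2k.
\end{align*}
```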
Clarifications
• Gamma and Chi-squared distributions: You only need to know the concepts; no calculations involving them will be required.
• Joint distributions: The material is only needed for the optional homework (HW7). It won't be tested in the second midterm or the final.