Probability Theory and Specific Distributions (Moore Ch5 and Guan Ch6)
DESCRIPTION
Objectives: Bernoulli; The binomial setting; Binomial probabilities; Hypergeometric distribution; The Poisson model; Exponential distribution; The Normal approximation; Chi-square distribution; t distribution; F distribution; Discrete distribution; Continuous distribution
TRANSCRIPT
Probability Theory and Specific Distributions (Moore Ch5 and Guan Ch6)
Research questions: Waiting for the bus
On average, how many buses will arrive in an hour?
Or, we may ask how long we need to wait at the bus stop.
Objectives Bernoulli
The binomial setting
Binomial probabilities
Hypergeometric distribution
The Poisson model
Exponential distribution
The Normal approximation
Chi-square distribution
t distribution
F distribution
Discrete distribution
Continuous distribution
Bernoulli experiment
Bernoulli experiment (白努利試驗): an experiment with a binary outcome, such as flipping a coin with half-half chance.
The probability function of a Bernoulli random variable X is
f_X(a; p) = P(X = a) = p^a (1 − p)^(1−a), a = 0 or 1, where p is the probability of success.
A coin is flipped 1 time. The variable X is the outcome, either head or tail. If the coin is a fair one, then
X = 1 for a head; 0 otherwise;
f_X(a; 0.5) = P(X = a) = 0.5^a (1 − 0.5)^(1−a), a = 0 or 1.
E(X) = 1 × p + 0 × (1 − p) = p
Var(X) = E(X²) − (E(X))² = 1² × p + 0² × (1 − p) − p² = p − p² = p(1 − p)
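As a quick check, the Bernoulli pmf and these moment formulas can be verified numerically. A minimal Python sketch (p = 0.5 is the fair-coin value from the example):

```python
# Bernoulli pmf, mean, and variance for success probability p
def bernoulli_pmf(a, p):
    """f_X(a; p) = p^a * (1 - p)^(1 - a) for a in {0, 1}."""
    return p**a * (1 - p)**(1 - a)

p = 0.5  # fair coin
mean = sum(a * bernoulli_pmf(a, p) for a in (0, 1))              # E(X) = p
var = sum(a**2 * bernoulli_pmf(a, p) for a in (0, 1)) - mean**2  # p(1 - p)
print(mean, var)  # 0.5 0.25
```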
Binomial setting
Binomial distributions (二項分配) are models for some categorical variables, typically representing the number of successes in a series of n trials.
The observations must meet these requirements:
The total number of observations n is fixed in advance.
The outcomes of all n observations are statistically independent.
Each observation falls into just one of two categories: success and failure.
All n observations have the same probability of “success,” p.
We record the next 50 vehicles sold at a dealership. Each buyer either
purchases or leases; each vehicle sold is either an SUV or not.
The distribution of the count X of successes in the binomial setting is the binomial distribution with parameters n and p: B(n,p).
The parameter n is the total number of observations. The parameter p is the probability of success on each observation. The count of successes X can be any whole number between 0 and n.
A coin is flipped 10 times. Each outcome is either a head or a tail.
The variable X is the number of heads among those 10 flips, our count
of “successes.”
On each flip, the probability of success, “head,” is 0.5. The number X of
heads among 10 flips has the binomial distribution B(n = 10, p = 0.5).
Binomial distribution
Applications for binomial distributions
Binomial distributions describe the possible number of times that a particular event will occur in a sequence of observations.
They are used when we want to know about the occurrence of an event, not its magnitude.
In a clinical trial, a patient’s condition may improve or not. We study the number of patients who improved, not how much better they feel.
Was a sales transaction considered pleasant? The binomial distribution describes the number of pleasant transactions, not how pleasant they are.
In quality control we assess the number of defective items in a lot of goods, irrespective of the type of defect.
Binomial probabilities
The number of ways of arranging k successes in a series of n observations (with constant probability p of success) is the number of possible combinations (unordered sequences).
This can be calculated with the binomial coefficient:
f_X(a; n, p) = P(X = a) = C(n, a) p^a (1 − p)^(n−a), a = 0, 1, …, n, with parameters n and p.
If X obeys binomial distribution with parameters n and p, we usually note as X~B(n,p).
Binomial formulas
The binomial coefficient “n_choose_k” uses the factorial notation “!”. The factorial n! for any strictly positive whole number n is:
n! = n × (n − 1) × (n − 2) × · · · × 3 × 2 × 1
For example: 5! = 5 × 4 × 3 × 2 × 1 = 120
Note that 0! = 1.
Finding binomial probabilities: tables
You can also look up the probabilities for some values of n and p in Table C in the back of the book.
The entries in the table are the probabilities P(X = k) of individual outcomes.
The values of p that appear in Table C are all 0.5 or smaller. When the probability of a success is greater than 0.5, restate the problem in terms of the number of failures.
Customer satisfaction
Each consumer has probability of 0.25 of preferring your product over a
competitor’s product. If we question five consumers, what is the probability that
exactly three of them prefer your product?
Use Excel “=BINOMDIST(number_s,trials,probability_s,cumulative)”
P(X = 3) = BINOMDIST(3, 5, 0.25, 0) = 0.08789
P(X = 3) = (n! / (a!(n − a)!)) p^a (1 − p)^(n−a) = (5! / (3! 2!)) × 0.25^3 × 0.75^2
P(X = 3) = ((5 × 4) / (2 × 1)) × 0.25^3 × 0.75^2
P(X = 3) = 10 × 0.015625 × 0.5625 = 0.08789
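The same probability can be reproduced outside Excel with Python's standard library; math.comb supplies the binomial coefficient, and the inputs n = 5, a = 3, p = 0.25 come from the example above:

```python
from math import comb

def binom_pmf(a, n, p):
    """P(X = a) for X ~ B(n, p): C(n, a) * p^a * (1 - p)^(n - a)."""
    return comb(n, a) * p**a * (1 - p)**(n - a)

# Five consumers, each preferring our product with probability 0.25.
prob = binom_pmf(3, 5, 0.25)
print(round(prob, 5))  # 0.08789
```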
Binomial mean and standard deviation
The center and spread of the binomial distribution for a count X are defined by the mean μ and standard deviation σ:
μ = np, σ = √(npq) = √(np(1 − p))
Effect of changing p when n is fixed:
a) n = 10, p = 0.25
b) n = 10, p = 0.5
c) n = 10, p = 0.75
For small samples, binomial distributions are skewed when p is different from 0.5.
[Figure: three bar charts of P(X = x) against the number of successes (0 to 10) for cases a, b, and c.]
Hypergeometric distribution (超幾何分配 )
A population contains two kinds of objects (black and white balls, for example): M of one kind and N of the other (say M black balls and N white balls). For a simple random sample of n draws without replacement, the count of the designated kind (black balls, for example) is a hypergeometric setting.
We can use the Bernoulli experiment to understand the hypergeometric setting. A hypergeometric variable X is the sum of dependent Bernoulli variables, where Yi = 1 on a success and Yi = 0 on a failure:
X = Y1 + Y2 + … + Yn
Hypergeometric distribution
The pmf of a hypergeometric distribution is
f_X(a; M, N, n) = P(X = a) = C(M, a) C(N, n − a) / C(M + N, n)
Parameter a ranges over max(0, n − N) ≤ a ≤ min(n, M),
where M, N, and n are all parameters: M is the number of successes, N is the number of failures, and n is the total number of trials.
We usually note a hypergeometric distribution as
X ~ HG(M, N, n)
Hypergeometric distribution
If X ~ HG(M, N, n), then X = Y1 + … + Yn, where the Yi are dependent Bernoulli variables with E(Yi) = M/(M + N), and
E(X) = nM/(M + N),
var(X) = n × (M/(M + N)) × (N/(M + N)) × ((M + N − n)/(M + N − 1)).
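A minimal Python sketch of this pmf and the mean formula, using math.comb for the binomial coefficients; the values M = 5, N = 15, n = 4 are arbitrary illustration choices:

```python
from math import comb

def hypergeom_pmf(a, M, N, n):
    """P(X = a) for X ~ HG(M, N, n): M successes, N failures, n draws."""
    return comb(M, a) * comb(N, n - a) / comb(M + N, n)

M, N, n = 5, 15, 4  # e.g. 5 black balls, 15 white balls, draw 4
mean = sum(a * hypergeom_pmf(a, M, N, n) for a in range(n + 1))
# Closed form from the slide: E(X) = nM / (M + N)
print(mean, n * M / (M + N))
```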
The Poisson setting
A count X of successes has a Poisson distribution (波氏分配) in the Poisson setting:
The number of successes that occur in any unit of measure is independent of the number of successes that occur in any non-overlapping unit of measure.
The probability that a success will occur in a time slot is the same for all time slots of equal length and is proportional to the length of the time.
The probability that two or more successes will occur in a unit approaches 0 as the size of the unit becomes smaller.
Waiting for the bus:
We count the number of buses arriving in an hour; the number of buses then obeys a Poisson distribution.
Poisson distribution
The distribution of the count X of successes in the Poisson setting is the Poisson distribution with mean λ. The parameter λ is the mean number of successes per unit of measure.
The possible values of X are the whole numbers 0, 1, 2, 3, …. If a is any whole number 0 or greater, then
P(X = a) = e^(−λ) λ^a / a!
The standard deviation of the distribution is the square root of λ. The Poisson is a discrete distribution.
Poisson distribution
Let 1.6 be the average number of flaws in a square yard of carpet.
The count X of flaws per square yard of carpet is modeled by the Poisson distribution with λ = 1.6.
The probability of two or fewer flaws per square yard is P(X ≤ 2).
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = e^(−1.6)(1.6)^0/0! + e^(−1.6)(1.6)^1/1! + e^(−1.6)(1.6)^2/2! = 0.2019 + 0.3230 + 0.2584 = 0.7833
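This calculation can be checked with a short Python sketch using only the standard library (λ = 1.6 is the carpet-flaw rate from the example):

```python
from math import exp, factorial

def poisson_pmf(a, lam):
    """P(X = a) for X ~ Poisson(lam): e^-lam * lam^a / a!."""
    return exp(-lam) * lam**a / factorial(a)

lam = 1.6
p_at_most_2 = sum(poisson_pmf(a, lam) for a in range(3))
print(round(p_at_most_2, 4))  # 0.7834 (the slide's 0.7833 rounds each term first)
```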
Poisson distribution
For two independent Poisson variables X and Y with parameters λ1 and λ2, X + Y is Poisson with mean λ1 + λ2.
For a binomial distribution, when n approaches infinity and p approaches 0 (with np → λ), the binomial approaches a Poisson:
If X ~ B(n, p), n → ∞ and p → 0 with np → λ, then
f_X(a; n, p) → e^(−λ) λ^a / a!
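A quick numeric illustration of this limit in Python; n = 1000 and p = 0.0016 are arbitrary choices giving np = 1.6 to match the carpet example:

```python
from math import comb, exp, factorial

n, p = 1000, 0.0016  # large n, small p, np = 1.6
lam = n * p
a = 2
binom = comb(n, a) * p**a * (1 - p)**(n - a)   # exact binomial probability
poisson = exp(-lam) * lam**a / factorial(a)    # Poisson approximation
print(round(binom, 4), round(poisson, 4))      # the two values nearly agree
```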
Exponential distribution (指數分配 )
An exponential distribution describes the time span between two events.
Let X be the time span between two A events. Given t > 0,
P(X ≤ t) = 1 − P(X > t),
where {X > t} means the next event A will occur later than time t.
Waiting for the bus:
We measure the waiting time; the waiting time until the next bus then obeys an exponential distribution.
Exponential distribution
For a Poisson, the parameter λ is the expected number of occurrences in a period of time t. So β = t/λ is the expected time span between two successive events.
The pdf of an exponential distribution is
f_X(t) = (1/β) e^(−t/β), t ≥ 0.
We usually note an exponential variable as X ~ Exp(β). If X ~ Exp(β), then E(X) = β and var(X) = β².
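A small Python sketch of the exponential pdf and cdf; β = 0.5 (an average of half an hour between buses) is an illustrative assumption:

```python
from math import exp

def expon_pdf(t, beta):
    """f_X(t) = (1/beta) * e^(-t/beta) for t >= 0."""
    return exp(-t / beta) / beta

def expon_cdf(t, beta):
    """P(X <= t) = 1 - e^(-t/beta)."""
    return 1 - exp(-t / beta)

beta = 0.5  # expected wait: half an hour between buses
# Probability of waiting more than one hour: P(X > 1) = e^(-1/0.5)
print(round(1 - expon_cdf(1, beta), 4))  # 0.1353
```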
Normal distribution
The density function of a normal distribution is
f_X(b; μ, σ²) = (1/√(2πσ²)) exp(−(b − μ)² / (2σ²)),
where −∞ < b < ∞, −∞ < μ < ∞, and σ² > 0.
We can describe a normal by two parameters only: mean and variance.
Normal distribution features
The probability density function is in continuous form, and it is also termed the Gaussian distribution (高斯分配).
A normal has mean = median = mode.
A normal is bell shaped and symmetric.
A normal distribution with high standard deviation has fat tails on both sides.
We usually use φ and Φ to stand for the pdf and cdf of a normal.
A normal distribution with mean 0 and standard deviation (or variance) 1 is termed the standard normal distribution (標準常態分配).
Normal approximation
Back to the binomial distribution: if n is large, and p is not too close to 0 or 1, the binomial distribution can be approximated by the normal distribution N(µ = np, σ = √(np(1 − p))).
Practically, the Normal approximation can be used when both np ≥ 10 and n(1 − p) ≥ 10.
If X is the count of successes in the sample, the sampling distribution for large n is:
X approximately N(µ = np, σ² = np(1 − p))
Normal approximation
If 60% of people claim to be satisfied with their lives, what is the probability that at least 610 people out of 1010 surveyed will say they are satisfied with their lives?
µ = np = (1010)(0.60) = 606
σ = √(np(1 − p)) = √(1010(0.60)(0.40)) = 15.57
So X is approximately N(606, 15.57); note: np ≥ 10 and n(1 − p) ≥ 10.
P(X ≥ 610) = P((X − 606)/15.57 ≥ (610 − 606)/15.57)
= P(Z ≥ 0.26) = 0.3974
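The same normal-approximation probability can be computed in Python, building the standard normal cdf from math.erf:

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 1010, 0.60
mu = n * p                       # 606.0
sigma = sqrt(n * p * (1 - p))    # about 15.57
z = (610 - mu) / sigma
print(round(1 - norm_cdf(z), 4)) # 0.3986 (the slide rounds z to 0.26, giving 0.3974)
```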
Chi-square distribution (卡方分配)
A random variable obeying the Chi-square distribution with degree of freedom 1 is the square of a standard normal Z: if Z ~ N(0, 1), then Z² ~ χ²(1).
For a Chi-square random variable X with degree of freedom n, we usually write X ~ χ²(n).
For Z1, …, Zn being n independent standard normal variables, X = Z1² + … + Zn² ~ χ²(n).
If X ~ χ²(n) and Y ~ χ²(m) are independent, then X + Y ~ χ²(n + m).
Chi-square distribution
For a χ²(n) random variable X, its mean is n and its variance is 2n. As the degrees of freedom increase, the Chi-square approaches a symmetric distribution.
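These moments can be checked by simulating sums of squared standard normals with Python's standard library; n = 5, the seed, and the sample size are arbitrary simulation choices:

```python
import random

random.seed(42)
n = 5            # degrees of freedom
draws = 200_000  # number of simulated chi-square values

# X = Z1^2 + ... + Zn^2 for independent standard normals Zi
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(draws)]

mean = sum(samples) / draws
var = sum((x - mean) ** 2 for x in samples) / draws
print(round(mean, 2), round(var, 2))  # close to n = 5 and 2n = 10
```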
t distribution (t分配)
For Z ~ N(0, 1) and Y ~ χ²(n), where Z and Y are independent,
Z / √(Y/n) ~ t(n),
the t distribution with degree of freedom n, denoted t(n). The moments of a t(n) depend on its degree of freedom n.
t(1) has no finite expected value; it is also known as the Cauchy distribution (柯西分配). In other words, only when n > 1 does t(n) have a finite expected value.
t(2) has no valid variance. Only when n > 2 does t(n) have a finite variance, equal to n/(n − 2).
t distribution
When n approaches infinity, the variance of t(n) is close to 1, meaning that t(n) approaches the standard normal. When the t distribution has low df (e.g., n = 2 or 4), we get fat tails (厚尾) and more outliers.
F distribution (F 分配)
For two independent Chi-square variables Z ~ χ²(m) and Y ~ χ²(n),
(Z/m) / (Y/n) ~ F(m, n),
where m and n are the df of the numerator and denominator. If X ~ F(m, n), then 1/X ~ F(n, m). The moments of the F distribution depend on the df. When n > 2, F(m, n) has mean n/(n − 2); when n > 4, F(m, n) has variance
2n²(m + n − 2) / (m(n − 2)²(n − 4)).
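A simulation sketch in Python checking the F mean formula; m = 4, n = 10, the seed, and the sample size are arbitrary choices (n > 4, so both mean and variance exist):

```python
import random

random.seed(7)
m, n = 4, 10      # numerator and denominator df
draws = 200_000

def chi2(df):
    """One draw from chi-square(df) as a sum of squared standard normals."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

# F(m, n) = (Z/m) / (Y/n) for independent Z ~ chi2(m), Y ~ chi2(n)
samples = [(chi2(m) / m) / (chi2(n) / n) for _ in range(draws)]

mean = sum(samples) / draws
print(round(mean, 3))  # close to n / (n - 2) = 1.25
```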