Probability Theory and Specific Distributions (Moore Ch5 and Guan Ch6)
DESCRIPTION
Objectives: Bernoulli; The binomial setting; Binomial probabilities; Hypergeometric distribution; The Poisson model; Exponential distribution; The Normal approximation; Chi-square distribution; t distribution; F distribution; Discrete distribution; Continuous distribution
TRANSCRIPT
Probability Theory and Specific Distributions (Moore Ch5 and Guan Ch6)
Research questions: Waiting for the bus
On average, how many buses will arrive in an hour?
Or, we may ask how long we need to wait at the bus stop.
Objectives Bernoulli
The binomial setting
Binomial probabilities
Hypergeometric distribution
The Poisson model
Exponential distribution
The Normal approximation
Chi-square distribution
t distribution
F distribution
Discrete distribution
Continuous distribution
Bernoulli experiment
Bernoulli experiment (白努利試驗): an experiment with a binary outcome, such as flipping a coin with half-half chance.
The probability function of a Bernoulli random variable X is
f_X(a; p) = P(X = a) = p^a (1 − p)^(1−a), a = 0 or 1, where p is the probability of success.
A coin is flipped 1 time. The variable X is the outcome, either head or tail. If the coin is a fair one, then
X = 1 for a head; 0 otherwise;
f_X(a; 0.5) = P(X = a) = 0.5^a (1 − 0.5)^(1−a), a = 0 or 1.
E(X) = 1 × p + 0 × (1 − p) = p
Var(X) = E(X²) − (E(X))² = 1² × p + 0² × (1 − p) − p² = p − p² = p(1 − p)
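As a quick check, the Bernoulli pmf and these moment formulas can be verified numerically. A minimal Python sketch (p = 0.5 is the fair-coin value from the example):

```python
# Bernoulli pmf, mean, and variance for success probability p
def bernoulli_pmf(a, p):
    """f_X(a; p) = p^a * (1 - p)^(1 - a) for a in {0, 1}."""
    return p**a * (1 - p)**(1 - a)

p = 0.5  # fair coin
mean = sum(a * bernoulli_pmf(a, p) for a in (0, 1))              # E(X) = p
var = sum(a**2 * bernoulli_pmf(a, p) for a in (0, 1)) - mean**2  # p(1 - p)
print(mean, var)  # 0.5 0.25
```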
Binomial setting
Binomial distributions (二項分配) are models for some categorical variables, typically representing the number of successes in a series of n trials.
The observations must meet these requirements:
The total number of observations n is fixed in advance.
The outcomes of all n observations are statistically independent.
Each observation falls into just one of two categories: success and failure.
All n observations have the same probability of “success,” p.
We record the next 50 vehicles sold at a dealership. Each buyer either
purchases or leases; each vehicle sold is either an SUV or not.
The distribution of the count X of successes in the binomial setting is the binomial distribution with parameters n and p: B(n,p).
The parameter n is the total number of observations. The parameter p is the probability of success on each observation. The count of successes X can be any whole number between 0 and n.
A coin is flipped 10 times. Each outcome is either a head or a tail.
The variable X is the number of heads among those 10 flips, our count
of “successes.”
On each flip, the probability of success, “head,” is 0.5. The number X of
heads among 10 flips has the binomial distribution B(n = 10, p = 0.5).
Binomial distribution
Applications for binomial distributions
Binomial distributions describe the possible number of times that a particular event will occur in a sequence of observations.
They are used when we want to know about the occurrence of an event, not its magnitude.
In a clinical trial, a patient’s condition may improve or not. We study the number of patients who improved, not how much better they feel.
Was a sales transaction considered pleasant? The binomial distribution describes the number of pleasant transactions, not how pleasant they are.
In quality control we assess the number of defective items in a lot of goods, irrespective of the type of defect.
Binomial probabilities
The number of ways of arranging k successes in a series of n observations (with constant probability p of success) is the number of possible combinations (unordered sequences).
This can be calculated with the binomial coefficient:
f_X(a; n, p) = P(X = a) = C(n, a) p^a (1 − p)^(n−a), a = 0, 1, …, n, with parameters n and p.
If X obeys binomial distribution with parameters n and p, we usually note as X~B(n,p).
Binomial formulas
The binomial coefficient “n_choose_k” uses the factorial notation “!”. The factorial n! for any strictly positive whole number n is:
n! = n × (n − 1) × (n − 2) × · · · × 3 × 2 × 1
For example: 5! = 5 × 4 × 3 × 2 × 1 = 120
Note that 0! = 1.
Finding binomial probabilities: tables
You can also look up the probabilities for some values of n and p in Table C in the back of the book.
The entries in the table are the probabilities P(X = k) of individual outcomes.
The values of p that appear in Table C are all 0.5 or smaller. When the probability of a success is greater than 0.5, restate the problem in terms of the number of failures.
Customer satisfaction
Each consumer has probability of 0.25 of preferring your product over a
competitor’s product. If we question five consumers, what is the probability that
exactly three of them prefer your product?
Use Excel “=BINOMDIST(number_s,trials,probability_s,cumulative)”
P(X = 3) = BINOMDIST(3, 5, 0.25, 0) = 0.08789
P(X = 3) = (n! / (a!(n − a)!)) p^a (1 − p)^(n−a) = (5! / (3! 2!)) × 0.25^3 × 0.75^2
P(X = 3) = ((5 × 4) / (2 × 1)) × 0.25^3 × 0.75^2
P(X = 3) = 10 × 0.015625 × 0.5625 = 0.08789
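The same probability can be reproduced outside Excel with Python's standard library; math.comb supplies the binomial coefficient, and the inputs n = 5, a = 3, p = 0.25 come from the example above:

```python
from math import comb

def binom_pmf(a, n, p):
    """P(X = a) for X ~ B(n, p): C(n, a) * p^a * (1 - p)^(n - a)."""
    return comb(n, a) * p**a * (1 - p)**(n - a)

# Five consumers, each preferring our product with probability 0.25.
prob = binom_pmf(3, 5, 0.25)
print(round(prob, 5))  # 0.08789
```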
Binomial mean and standard deviation
The center and spread of the binomial distribution for a count X are defined by the mean μ and standard deviation σ:
μ = np, σ = √(npq) = √(np(1 − p))
Effect of changing p when n is fixed:
a) n = 10, p = 0.25
b) n = 10, p = 0.5
c) n = 10, p = 0.75
For small samples, binomial distributions are skewed when p is different from 0.5.
[Figure: three bar charts of P(X = x) against the number of successes (0 to 10) for cases a, b, and c.]
Hypergeometric distribution (超幾何分配 )
A population contains two kinds of objects (black and white balls, for example): M of one kind and N of the other (say M black balls and N white balls). For a simple random sample of n draws without replacement, the count of the designated kind (black balls, for example) is a hypergeometric setting.
We can use the Bernoulli experiment to understand the hypergeometric setting. A hypergeometric variable X is the sum of dependent Bernoulli variables, where Yi = 1 on a success and Yi = 0 on a failure:
X = Y1 + Y2 + … + Yn
Hypergeometric distribution
The pmf of a hypergeometric distribution is
f_X(a; M, N, n) = P(X = a) = C(M, a) C(N, n − a) / C(M + N, n)
Parameter a ranges over max(0, n − N) ≤ a ≤ min(n, M),
where M, N, and n are all parameters: M is the number of successes, N is the number of failures, and n is the total number of trials.
We usually note a hypergeometric distribution as
X ~ HG(M, N, n)
Hypergeometric distribution
If X ~ HG(M, N, n), then X = Y1 + … + Yn, where the Yi are dependent Bernoulli variables with E(Yi) = M/(M + N), and
E(X) = nM/(M + N),
var(X) = n × (M/(M + N)) × (N/(M + N)) × ((M + N − n)/(M + N − 1)).
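A minimal Python sketch of this pmf and the mean formula, using math.comb for the binomial coefficients; the values M = 5, N = 15, n = 4 are arbitrary illustration choices:

```python
from math import comb

def hypergeom_pmf(a, M, N, n):
    """P(X = a) for X ~ HG(M, N, n): M successes, N failures, n draws."""
    return comb(M, a) * comb(N, n - a) / comb(M + N, n)

M, N, n = 5, 15, 4  # e.g. 5 black balls, 15 white balls, draw 4
mean = sum(a * hypergeom_pmf(a, M, N, n) for a in range(n + 1))
# Closed form from the slide: E(X) = nM / (M + N)
print(mean, n * M / (M + N))
```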
The Poisson setting
A count X of successes has a Poisson distribution (波氏分配) in the Poisson setting:
The number of successes that occur in any unit of measure is independent of the number of successes that occur in any non-overlapping unit of measure.
The probability that a success will occur in a time slot is the same for all time slots of equal length and is proportional to the length of the time.
The probability that two or more successes will occur in a unit approaches 0 as the size of the unit becomes smaller.
Waiting for the bus:
We count the number of buses arriving in an hour; the number of buses then obeys a Poisson distribution.
Poisson distribution
The distribution of the count X of successes in the Poisson setting is the Poisson distribution with mean λ. The parameter λ is the mean number of successes per unit of measure.
The possible values of X are the whole numbers 0, 1, 2, 3, …. If a is any whole number 0 or greater, then
P(X = a) = e^(−λ) λ^a / a!
The standard deviation of the distribution is the square root of λ. The Poisson is a discrete distribution.
Poisson distribution
Let 1.6 be the average number of flaws in a square yard of carpet.
The count X of flaws per square yard of carpet is modeled by the Poisson distribution with λ = 1.6.
The probability of two or fewer flaws per square yard is P(X ≤ 2).
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = e^(−1.6)(1.6)^0/0! + e^(−1.6)(1.6)^1/1! + e^(−1.6)(1.6)^2/2! = 0.2019 + 0.3230 + 0.2584 = 0.7833
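This calculation can be checked with a short Python sketch using only the standard library (λ = 1.6 is the carpet-flaw rate from the example):

```python
from math import exp, factorial

def poisson_pmf(a, lam):
    """P(X = a) for X ~ Poisson(lam): e^-lam * lam^a / a!."""
    return exp(-lam) * lam**a / factorial(a)

lam = 1.6
p_at_most_2 = sum(poisson_pmf(a, lam) for a in range(3))
print(round(p_at_most_2, 4))  # 0.7834 (the slide's 0.7833 rounds each term first)
```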
Poisson distribution
For two independent Poisson variables X and Y with parameters λ1 and λ2, X + Y is Poisson with mean λ1 + λ2.
For a binomial distribution, when n approaches infinity and p approaches 0 (with np → λ), the binomial approaches a Poisson:
If X ~ B(n, p), n → ∞ and p → 0 with np → λ, then
f_X(a; n, p) → e^(−λ) λ^a / a!
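A quick numeric illustration of this limit in Python; n = 1000 and p = 0.0016 are arbitrary choices giving np = 1.6 to match the carpet example:

```python
from math import comb, exp, factorial

n, p = 1000, 0.0016  # large n, small p, np = 1.6
lam = n * p
a = 2
binom = comb(n, a) * p**a * (1 - p)**(n - a)   # exact binomial probability
poisson = exp(-lam) * lam**a / factorial(a)    # Poisson approximation
print(round(binom, 4), round(poisson, 4))      # the two values nearly agree
```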
Exponential distribution (指數分配 )
An exponential distribution describes the time span between two events.
Let X be the time span between two A events. Given t > 0,
P(X ≤ t) = 1 − P(X > t),
where {X > t} means the next event A will occur later than time t.
Waiting for the bus:
We measure the waiting time; the waiting time until the next bus then obeys an exponential distribution.
Exponential distribution
For a Poisson, the parameter λ is the expected number of occurrences in a period of time t. So β = t/λ is the expected time span between two successive events.
The pdf of an exponential distribution is
f_X(t) = (1/β) e^(−t/β), t ≥ 0.
We usually note an exponential variable as X ~ Exp(β). If X ~ Exp(β), then E(X) = β and var(X) = β².
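A small Python sketch of the exponential pdf and cdf; β = 0.5 (an average of half an hour between buses) is an illustrative assumption:

```python
from math import exp

def expon_pdf(t, beta):
    """f_X(t) = (1/beta) * e^(-t/beta) for t >= 0."""
    return exp(-t / beta) / beta

def expon_cdf(t, beta):
    """P(X <= t) = 1 - e^(-t/beta)."""
    return 1 - exp(-t / beta)

beta = 0.5  # expected wait: half an hour between buses
# Probability of waiting more than one hour: P(X > 1) = e^(-1/0.5)
print(round(1 - expon_cdf(1, beta), 4))  # 0.1353
```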
Normal distribution
The density function of a normal distribution is
f_X(b; μ, σ²) = (1/√(2πσ²)) exp(−(b − μ)² / (2σ²)),
where −∞ < b < ∞, −∞ < μ < ∞, and σ² > 0.
We can describe a normal by two parameters only: mean and variance.
Normal distribution features
The probability density function is in continuous form, and it is also termed the Gaussian distribution (高斯分配).
A normal has mean = median = mode.
A normal is bell shaped and symmetric.
A normal distribution with high standard deviation has fat tails on both sides.
We usually use φ and Φ to stand for the pdf and cdf of a normal.
A normal distribution with mean 0 and standard deviation (or variance) 1 is termed the standard normal distribution (標準常態分配).
Normal approximation
Back to the binomial distribution: if n is large, and p is not too close to 0 or 1, the binomial distribution can be approximated by the normal distribution N(µ = np, σ = √(np(1 − p))).
Practically, the Normal approximation can be used when both np ≥ 10 and n(1 − p) ≥ 10.
If X is the count of successes in the sample, the sampling distribution for large n is:
X approximately N(µ = np, σ² = np(1 − p))
Normal approximation
If 60% of people claim to be satisfied with their lives, what is the probability that at least 610 people out of 1010 surveyed will say they are satisfied with their lives?
µ = np = (1010)(0.60) = 606
σ = √(np(1 − p)) = √(1010(0.60)(0.40)) = 15.57
So X is approximately N(606, 15.57); note: np ≥ 10 and n(1 − p) ≥ 10.
P(X ≥ 610) = P((X − 606)/15.57 ≥ (610 − 606)/15.57)
= P(Z ≥ 0.26) = 0.3974
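The same normal-approximation probability can be computed in Python, building the standard normal cdf from math.erf:

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 1010, 0.60
mu = n * p                       # 606.0
sigma = sqrt(n * p * (1 - p))    # about 15.57
z = (610 - mu) / sigma
print(round(1 - norm_cdf(z), 4)) # 0.3986 (the slide rounds z to 0.26, giving 0.3974)
```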
Chi-square distribution (卡方分配)
A random variable obeying the Chi-square distribution with degree of freedom 1 is the square of a standard normal Z: if Z ~ N(0, 1), then Z² ~ χ²(1).
For a Chi-square random variable X with degree of freedom n, we usually write X ~ χ²(n).
For Z1, …, Zn being n independent standard normal variables, X = Z1² + … + Zn² ~ χ²(n).
If X ~ χ²(n) and Y ~ χ²(m) are independent, then X + Y ~ χ²(n + m).
Chi-square distribution
For a χ²(n) random variable X, its mean is n and its variance is 2n. As the degrees of freedom increase, the Chi-square approaches a symmetric distribution.
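These moments can be checked by simulating sums of squared standard normals with Python's standard library; n = 5, the seed, and the sample size are arbitrary simulation choices:

```python
import random

random.seed(42)
n = 5            # degrees of freedom
draws = 200_000  # number of simulated chi-square values

# X = Z1^2 + ... + Zn^2 for independent standard normals Zi
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(draws)]

mean = sum(samples) / draws
var = sum((x - mean) ** 2 for x in samples) / draws
print(round(mean, 2), round(var, 2))  # close to n = 5 and 2n = 10
```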
t distribution (t分配)
For Z ~ N(0, 1) and Y ~ χ²(n), where Z and Y are independent,
Z / √(Y/n) ~ t(n),
the t distribution with degree of freedom n, denoted t(n). The moments of a t(n) depend on its degree of freedom n.
t(1) has no finite expected value; it is also known as the Cauchy distribution (柯西分配). In other words, only when n > 1 does t(n) have a finite expected value.
t(2) has no valid variance. Only when n > 2 does t(n) have a finite variance, equal to n/(n − 2).
t distribution
When n approaches infinity, the variance of t(n) is close to 1, meaning that t(n) approaches the standard normal. When the t distribution has low df (e.g., n = 2 or 4), we get fat tails (厚尾) and more outliers.
F distribution (F 分配)
For two independent Chi-square variables Z ~ χ²(m) and Y ~ χ²(n),
(Z/m) / (Y/n) ~ F(m, n),
where m and n are the df of the numerator and denominator. If X ~ F(m, n), then 1/X ~ F(n, m). The moments of the F distribution depend on the df. When n > 2, F(m, n) has mean n/(n − 2); when n > 4, F(m, n) has variance
2n²(m + n − 2) / (m(n − 2)²(n − 4)).
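A simulation sketch in Python checking the F mean formula; m = 4, n = 10, the seed, and the sample size are arbitrary choices (n > 4, so both mean and variance exist):

```python
import random

random.seed(7)
m, n = 4, 10      # numerator and denominator df
draws = 200_000

def chi2(df):
    """One draw from chi-square(df) as a sum of squared standard normals."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

# F(m, n) = (Z/m) / (Y/n) for independent Z ~ chi2(m), Y ~ chi2(n)
samples = [(chi2(m) / m) / (chi2(n) / n) for _ in range(draws)]

mean = sum(samples) / draws
print(round(mean, 3))  # close to n / (n - 2) = 1.25
```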