ESS011 Mathematical statistics and signal processing Lecture 8: Some common distributions Tuomas A. Rajala Chalmers TU March 31, 2014


ESS011 Mathematical statistics and signal processing

Lecture 8: Some common distributions

Tuomas A. Rajala

Chalmers TU

March 31, 2014


Course ESS011 (2014)

Lecture 8: Some common distributions

Where are we

Last week:

Laws for several events: Conditional probability, multiplication rule, Bayes

Independence: $A \perp\!\!\!\perp B \Rightarrow P(A \cap B) = P(A)P(B)$

and

Events generated by random variables (r.v.’s)

Distribution, density, CDF

Characteristics: Expectation, mean, variance, sd

Today we study some of the most common parametric models for r.v.'s


Parametric density models

Some new definitions:

i.i.d. If r.v.'s $X_1, X_2, \ldots$ have the same distribution $f$ and are independent, they are called independent and identically distributed (i.i.d.).

Distribution family If a distribution function $f(\cdot) = f(\cdot; \theta)$ depends on some parameters $\theta \in \mathbb{R}^p$, $p > 0$, we call the set of functions

$$\{f(\cdot; \theta) : \theta \in \mathbb{R}^p\}$$

a distribution family.

We often say e.g. "Gaussian r.v." instead of "r.v. having a density from the Gaussian family".


Moment generating function

Expected value and variance are called moments of a distribution. One way to derive them is using the

Moment generating function (mgf) For a random variable $X$ with density $f$, the function

$$m_X(t) := E(e^{tX})$$

is called the moment generating function.

Why this is useful is shown by the following theorem:

Moments using mgf If r.v. $X$ has mgf $m_X(t)$, then

$$E(X^k) = \left.\frac{d^k m_X(t)}{dt^k}\right|_{t=0}$$

Proof: Series expansion of $e^{tX}$.
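The series-expansion argument can be spelled out as follows. This is only a sketch: it assumes that expectation and summation may be interchanged, which holds when $m_X(t)$ is finite in a neighbourhood of $t = 0$.

```latex
\begin{align*}
m_X(t) = E\left(e^{tX}\right)
  &= E\left(\sum_{j=0}^{\infty} \frac{(tX)^j}{j!}\right)
   = \sum_{j=0}^{\infty} \frac{t^j}{j!}\, E\left(X^j\right), \\
\frac{d^k m_X(t)}{dt^k}
  &= \sum_{j=k}^{\infty} \frac{t^{j-k}}{(j-k)!}\, E\left(X^j\right), \\
\left.\frac{d^k m_X(t)}{dt^k}\right|_{t=0}
  &= E\left(X^k\right).
\end{align*}
```

At $t = 0$ only the $j = k$ term of the differentiated series survives, which gives the theorem.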


First: Bernoulli

Bernoulli trial (discrete) Let parameter $p \in (0, 1)$. A random variable is called a Bernoulli trial iff its density is

$$f(x) = p^x (1-p)^{1-x}, \quad x \in \{0, 1\}$$

Notation: $X \sim \mathrm{Bernoulli}(p)$ or $\mathrm{Bern}(p)$

Examples: Fair coin heads $p = 0.5$; Win election $p = 0.51$; Win the lottery jackpot $p \approx 10^{-9}$.

Common notation in e.g. sports: odds $O = \frac{p}{1-p}$.

Moments: mgf $= (1-p) + pe^t$, $E(X) = p$, $\mathrm{Var}(X) = p(1-p)$
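As a rough numerical illustration of how the mgf yields the moments, the following Python sketch differentiates the Bernoulli mgf at $t = 0$ by central differences. The value `p = 0.3`, the step size `h` and all function names are assumptions chosen for illustration, not part of the lecture.

```python
import math


def bernoulli_mgf(t: float, p: float) -> float:
    """mgf of Bern(p): (1 - p) + p * e^t."""
    return (1.0 - p) + p * math.exp(t)


def moments_from_mgf(mgf, h: float = 1e-4):
    """Approximate E(X) and E(X^2) by central differences of the mgf at t = 0."""
    first = (mgf(h) - mgf(-h)) / (2.0 * h)
    second = (mgf(h) - 2.0 * mgf(0.0) + mgf(-h)) / (h * h)
    return first, second


p = 0.3  # illustrative value, not taken from the slides
ex, ex2 = moments_from_mgf(lambda t: bernoulli_mgf(t, p))
mean = ex                 # should be close to p
variance = ex2 - ex ** 2  # should be close to p * (1 - p)
odds = p / (1.0 - p)      # odds O = p / (1 - p)
```

Up to discretisation error, `mean` and `variance` agree with $p$ and $p(1-p)$.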


Geometric

Imagine a sequence of Bern(p) trials with interpretation 0 = "failure" and 1 = "success":

Geometric distribution (discrete) The random number of steps until the first "success" has the geometric distribution with density

$$f(x) = (1-p)^{x-1} p, \quad x = 1, 2, \ldots$$

Notation: $X \sim \mathrm{Geom}(p)$

Moments: $E(X) = 1/p$, $\mathrm{Var}(X) = (1-p)/p^2$

[Figure: geometric densities for $p = 0.3$ and $p = 0.1$, plotted for $x = 1, \ldots, 48$.]
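The geometric moments can be checked numerically by summing over the density $(1-p)^{x-1}p$. The Python sketch below does this for an illustrative `p = 0.3`; the truncation at `x = 200` is an assumption that makes the neglected tail negligible for this particular `p`.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) for X ~ Geom(p): x - 1 failures followed by one success."""
    return (1.0 - p) ** (x - 1) * p


p = 0.3
xs = range(1, 201)  # truncated support; the tail beyond 200 is negligible here
total = sum(geometric_pmf(x, p) for x in xs)
mean = sum(x * geometric_pmf(x, p) for x in xs)
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)
```

Here `total` should be close to 1, `mean` close to $1/p$ and `variance` close to $(1-p)/p^2$.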


Binomial

Repeat a Bern(p) trial $n$ times: How many 1's?

Binomial distribution For i.i.d. $Z_i \sim \mathrm{Bern}(p)$ the sum $X := \sum_{i=1}^{n} Z_i$ has the binomial distribution with density

$$f(x) = \binom{n}{x} (1-p)^{n-x} p^x, \quad x = 0, \ldots, n$$

Notation: $X \sim \mathrm{Binom}(n, p)$.

Moments: mgf $= (1 - p + pe^t)^n$, $E(X) = np$, $\mathrm{Var}(X) = np(1-p)$

[Figure: binomial densities for $p = 0.3$ and $p = 0.1$, plotted for $x = 1, \ldots, 48$.]
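A short Python sketch of the binomial density follows, with a check that it sums to one and reproduces $np$ and $np(1-p)$. The values `n = 50` and `p = 0.3` are illustrative assumptions, not read from the slides.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binom(n, p)."""
    return math.comb(n, x) * (1.0 - p) ** (n - x) * p ** x


n, p = 50, 0.3  # illustrative values
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))
```

With these values `mean` is close to 15 and `variance` close to 10.5.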


Negative binomial

Let's reverse the binomial setup: I want $r$ successful Bern(p) trials; how many trials do I need in total?

Negative binomial distribution For i.i.d. $Z_i \sim \mathrm{Bern}(p)$ and a parameter $r \in \mathbb{N}$, the smallest number $X$ for which $\sum_{i=1}^{X} Z_i = r$ follows the negative binomial distribution with density

$$f(n) = \binom{n-1}{r-1} (1-p)^{n-r} p^r, \quad n = r, r+1, r+2, \ldots$$

Notation: $X \sim \mathrm{NegBinom}(r, p)$.

Moments: $E(X) = r/p$, $\mathrm{Var}(X) = r(1-p)/p^2$

E.g. Collect names for a cause. Approx. 1 out of 10 sign up, $p = 0.1$. You need 1000 names. Expected work: Ask $E(X) = 1000/0.1 = 10000$ people, with $P(X > 10000) \approx 0.5$.
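Tail probabilities for the signature example can be evaluated directly from the negative binomial density. The Python sketch below works on the log scale to avoid overflow in the binomial coefficient; the function names are hypothetical, and using `math.lgamma` is one convenient choice rather than the only one.

```python
import math


def negbinom_log_pmf(n: int, r: int, p: float) -> float:
    """log P(X = n) for X ~ NegBinom(r, p), n = r, r + 1, ..."""
    # log of the binomial coefficient C(n - 1, r - 1) via log-gamma
    log_binom = math.lgamma(n) - math.lgamma(r) - math.lgamma(n - r + 1)
    return log_binom + (n - r) * math.log(1.0 - p) + r * math.log(p)


def negbinom_tail(k: int, r: int, p: float) -> float:
    """P(X > k), computed as 1 minus the sum of the pmf over n = r, ..., k."""
    cdf = sum(math.exp(negbinom_log_pmf(n, r, p)) for n in range(r, k + 1))
    return 1.0 - cdf


r, p = 1000, 0.1
expected_trials = r / p                      # E(X) = r / p
sd_trials = math.sqrt(r * (1.0 - p)) / p     # sqrt of r(1 - p) / p^2
tail_at_mean = negbinom_tail(10000, r, p)    # P(X > 10000)
```

The standard deviation here is 300 people, so the total amount of asking is fairly tightly concentrated around the expected 10000.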


Poisson distribution

Consider counting something that takes place at random intervals over a period of time. It can be shown that (with assumptions) the natural model is the

Poisson distribution A r.v. $X \in \mathbb{N}$ is said to be Poisson distributed if

$$f(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, \ldots$$

with some parameter $\lambda > 0$. Notation: $X \sim \mathrm{Pois}(\lambda)$.

Moments: mgf $= e^{\lambda(e^t - 1)}$, $E(X) = \lambda$, $\mathrm{Var}(X) = \lambda$

Note: Let the number of occurrences per time unit follow $\mathrm{Pois}(\lambda)$. Then the count over a time period of length $T$ follows $\mathrm{Pois}(\lambda T)$.

If a process outputs counts according to the Poisson distribution, the process is called a Poisson process.


Poisson example

Example: A small grocery store sells on average 3 melons a day. The shopkeeper wants to optimize his stock. How many should he stock for every 5-day period so that the chance of running out is less than 0.01?

If we assume a Poisson distribution, we have the rate $\lambda = 3$ per day. The time window of interest is $T = 5$ days. Then $X \sim \mathrm{Pois}(15)$. The question is to find the smallest $k$ with $P(X \le k) \ge 0.99$.

Such a value $k$ is called the 0.99-quantile. Look it up from a table/computer: $k = 25$ should be enough stock.

[Figure: Pois(15) density (lambda = 15) and its CDF $F(x)$ for $x = 0, \ldots, 30$, with $k = 25$ marked at $F(k) = 0.99$.]
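The "look it up from a computer" step could look like the following Python sketch, which accumulates the Poisson density until the CDF first reaches 0.99. The function names are hypothetical; statistical packages provide an equivalent quantile function.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Pois(lam)."""
    return lam ** x / math.factorial(x) * math.exp(-lam)


def poisson_quantile(q: float, lam: float) -> int:
    """Smallest k with P(X <= k) >= q."""
    k, cdf = 0, poisson_pmf(0, lam)
    while cdf < q:
        k += 1
        cdf += poisson_pmf(k, lam)
    return k


rate_per_day, days = 3.0, 5
stock = poisson_quantile(0.99, rate_per_day * days)  # 25 melons
```

For $\mathrm{Pois}(15)$ the CDF is about 0.989 at 24 and about 0.994 at 25, so 25 is the first stock level that meets the requirement.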


Other discrete

Generalize the Bernoulli trial: Let $p = \{p_k \ge 0 : k = 1, \ldots, K\}$ be a distribution.

Categorical distribution The random variable $X \in \{1, \ldots, K\}$ follows the categorical distribution with parameters $\{p_k\}$ if

$$f(k) = P(X = k) = p_k \quad \forall k = 1, \ldots, K$$

Notation: $X \sim \mathrm{Cat}(p)$

Uniform distribution: A r.v. $X \in \{1, \ldots, K\}$ with

$$f(k) = P(X = k) = \frac{1}{K} \quad \forall k$$

has the uniform distribution, denoted by $X \sim \mathrm{Unif}(1, \ldots, K)$

e.g. die cast, fully random sampling, when people say "random".


Continuous uniform

Let $D \subset \mathbb{R}^d$ be a compact set for any $d > 0$, and write $|D| := \int_D dx$ for its size.

Uniform distribution A r.v. $X$ with density

$$f(x) = \frac{1}{|D|} \quad \forall x \in D$$

follows the uniform distribution, $X \sim \mathrm{Unif}(D)$.

For an interval $D = [a, b]$: $E(X) = \frac{b+a}{2}$, $\mathrm{Var}(X) = \frac{(b-a)^2}{12}$

Computers: Simulation of other r.v.'s is based on $X_0 \sim \mathrm{Unif}([0, 1])$
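One common way to build other r.v.'s from a uniform draw is inverse transform sampling: apply the inverse of the target CDF to $U \sim \mathrm{Unif}([0, 1])$. The slides do not say which method is meant, so the Python sketch below is only an illustration. It uses the exponential distribution with scale $\beta$ (treated later in this lecture) as the target, with an arbitrary scale and seed.

```python
import math
import random


def sample_exponential(scale: float, rng: random.Random) -> float:
    """Draw from Exp(scale) by inverting its CDF F(x) = 1 - exp(-x / scale)."""
    u = rng.random()  # U ~ Unif([0, 1))
    return -scale * math.log(1.0 - u)


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
scale = 2.0             # illustrative value
draws = [sample_exponential(scale, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)  # should be near the scale
```

Using `1.0 - u` rather than `u` keeps the argument of the logarithm strictly positive, since `random()` can return 0 but never 1.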


Gamma

Gamma distribution With some parameters $\alpha > 0$, $\beta > 0$, a r.v. $X > 0$ is said to be gamma distributed, $X \sim \Gamma(\alpha, \beta)$, if

$$f(x) = \frac{\beta^{-\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-x/\beta}, \quad x > 0$$

Note: $\alpha$ shape, $\beta$ scale. Sometimes written with the rate $1/\beta$ instead.

Moments: mgf $= (1 - \beta t)^{-\alpha}$, $E(X) = \alpha\beta$, $\mathrm{Var}(X) = \alpha\beta^2$.

[Figure: gamma densities Gamma(a, 4) for a = 0.8, 2, 5 and Gamma(4, b) for b = 0.8, 2, 5.]
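The gamma moments can be recovered from the mgf with the exponent $-\alpha$, as a worked instance of the mgf theorem. The restriction $t < 1/\beta$ is the usual domain on which this mgf is finite.

```latex
\begin{align*}
m_X(t) &= (1 - \beta t)^{-\alpha}, \qquad t < 1/\beta, \\
m_X'(t) &= \alpha\beta\,(1 - \beta t)^{-\alpha - 1}
  &&\Rightarrow\quad E(X) = m_X'(0) = \alpha\beta, \\
m_X''(t) &= \alpha(\alpha + 1)\beta^2\,(1 - \beta t)^{-\alpha - 2}
  &&\Rightarrow\quad E(X^2) = m_X''(0) = \alpha(\alpha + 1)\beta^2, \\
\operatorname{Var}(X) &= E(X^2) - E(X)^2 = \alpha\beta^2.
\end{align*}
```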

Exponential

Consider the family $\Gamma(1, \beta)$ with some $\beta > 0$:

Exponential distribution $X$ follows the exponential distribution if

$$f(x) = \frac{1}{\beta} e^{-x/\beta}, \quad x > 0$$

denoted by $X \sim \mathrm{Exp}(\beta)$.

Moments: See Gamma.

Connection to Poisson process with parameter $\lambda > 0$: If $W$ is the waiting time until the next event, then $W \sim \mathrm{Exp}(1/\lambda)$. (proof: prob. of no events)

E.g. Probability of waiting less than 6 mins (0.1 hours) for a tram, if trams come at random at a rate of 3 per hour: $W \sim \mathrm{Exp}(1/3)$ so $P(W < 0.1) = F(0.1) = 1 - e^{-0.3} = 0.26$.
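The "prob. of no events" argument says that $W > t$ exactly when the Poisson count over a window of length $t$ is zero, so $P(W > t) = e^{-\lambda t}$. The Python sketch below checks this on the tram numbers; the function names are hypothetical.

```python
import math


def prob_no_events(rate: float, t: float) -> float:
    """P(no events in a window of length t) for a Poisson count with mean rate * t."""
    return math.exp(-rate * t)


def exponential_cdf(w: float, scale: float) -> float:
    """F(w) = 1 - exp(-w / scale) for W ~ Exp(scale), w > 0."""
    return 1.0 - math.exp(-w / scale)


rate = 3.0     # trams per hour
wait = 6 / 60  # six minutes, expressed in hours
p_wait = exponential_cdf(wait, scale=1.0 / rate)
p_via_poisson = 1.0 - prob_no_events(rate, wait)
```

Both routes give the same probability, approximately 0.26.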


Normal distribution briefly

The Most Popular Distribution on the planet:

Normal, or Gaussian, distribution Let $\mu \in \mathbb{R}$, $\sigma > 0$. A r.v. $X$ with density

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}(x-\mu)^2/\sigma^2}, \quad x \in \mathbb{R}$$

is said to follow the normal distribution $N(\mu, \sigma^2)$ with parameters $\mu$ and $\sigma$.

Moments: $E(X) = \mu$, $\mathrm{Var}(X) = \sigma^2$. "Location" and "scale"

The standard normal distribution is $N(0, 1)$, with mean $\mu = 0$ and sd $\sigma = 1$.
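The normal density can be written out directly from the formula and compared with a library implementation. In the Python sketch below, `statistics.NormalDist` from the standard library is used as the reference, and the values of `mu` and `sigma` are illustrative assumptions.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2), written out from the formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


mu, sigma = 1.0, 2.0               # illustrative values
reference = NormalDist(mu, sigma)  # note: takes the sd, not the variance
density_at_zero = normal_pdf(0.0, mu, sigma)
standard_peak = normal_pdf(0.0, 0.0, 1.0)  # height of N(0, 1) at its mean
```

The notation $N(\mu, \sigma^2)$ lists the variance, while `NormalDist` expects the standard deviation, so it is worth checking which convention a given tool follows.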


Others...

e.g. Chi-squared, or $\chi^2$, distribution Let $X \sim \Gamma(\nu/2, 2)$ with some $\nu \in \mathbb{N} \setminus \{0\}$. Then $X$ follows the $\chi^2$ distribution with $\nu$ degrees of freedom.

Beta distribution $X \in [0, 1]$ follows the beta distribution $\mathrm{Beta}(\alpha, \beta)$ if

$$f(x) = \frac{1}{B(\alpha, \beta)} x^{\alpha-1} (1-x)^{\beta-1}, \quad x \in [0, 1]$$

where $B$ is the beta function.

[Figure: beta densities on $[0, 1]$.]

Pareto distribution $X \in [x_m, \infty)$ follows the Pareto distribution if

$$f(x) = \frac{\alpha x_m^{\alpha}}{x^{\alpha+1}}, \quad x > x_m$$

for some parameters $x_m, \alpha > 0$.


Many, many more

Leemis and McQueston (2008) “Univariate Distribution Relationships”


Summary

Today:

Moment generating function

Some popular families of distributions

Tomorrow we study the normal distribution, and see how we can transform and combine random variables.
