
Page 1: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

EE401 (Semester 1) Jitkomut Songsiri

1. Review of Probability

• Random Experiments

• The Axioms of Probability

• Conditional Probability

• Independence of Events

• Sequential Experiments

1-1

Page 2: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Random Experiments

An experiment in which the outcome varies in an unpredictable fashion when the experiment is repeated under the same conditions

Examples:

• Select a ball from an urn containing balls numbered 1 to n

• Toss a coin and note the outcome

• Roll a die and note the outcome

• Measure the time between page requests in a Web server

• Pick a number at random between 0 and 1

Review of Probability 1-2

Page 3: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Sample space

Sample space is the set of all possible outcomes, denoted by S

• obtained by listing all the elements, e.g., S = {H,T}, or

• giving a property that specifies the elements, e.g., S = {x | 0 ≤ x ≤ 3}

Same experimental procedure may have different sample spaces

[Figures: sample space S1 is the unit square 0 ≤ x, y ≤ 1; sample space S2 is the triangular region 0 ≤ y ≤ x ≤ 1]

• Experiment 1: Pick two numbers at random between zero and one

• Experiment 2: Pick a number X at random between 0 and 1, then picka number Y at random between 0 and X

Review of Probability 1-3

Page 4: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Three possibilities for the number of outcomes in sample spaces

finite, countably infinite, uncountably infinite

Examples:

S1 = {1, 2, 3, . . . , 10}

S2 = {HH,HT,TT,TH}

S3 = {x ∈ Z | 0 ≤ x ≤ 10}

S4 = {1, 2, 3, . . .}

S5 = {(x, y) ∈ R × R | 0 ≤ y ≤ x ≤ 1}

S6 = set of functions X(t) for which X(t) = 0 for t ≥ t0

Discrete sample space: if S is countable (S1, S2, S3, S4)

Continuous sample space: if S is not countable (S5, S6)

Review of Probability 1-4

Page 5: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Events

An event is a subset of the sample space, consisting of the outcomes that satisfy certain conditions

Examples: Ak denotes an event corresponding to the experiment Ek

E1 : Select a ball from an urn containing balls numbered 1 to 10

A1 : An even-numbered ball (from 1 to 10) is selected

S1 = {1, 2, 3, . . . , 10}, A1 = {2, 4, 6, 8, 10}

E2 : Toss a coin twice and note the sequence of heads and tails

A2 : The two tosses give the same outcome

S2 = {HH,HT,TT,TH}, A2 = {HH,TT}

Review of Probability 1-5

Page 6: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

E3 : Count # of voice packets containing only silence from 10 speakers

A3 : No active packets are produced

S3 = {x ∈ Z | 0 ≤ x ≤ 10}, A3 = {0}

Two events of special interest:

• Certain event, S, which consists of all outcomes and hence always occurs

• Impossible event or null event, ∅, which contains no outcomes and never occurs

Review of Probability 1-6

Page 7: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Review of Set Theory

• A = B if and only if A ⊂ B and B ⊂ A

• A ∪B (union): set of outcomes that are in A or in B

• A ∩B (intersection): set of outcomes that are in A and in B

• A and B are disjoint or mutually exclusive if A ∩ B = ∅

• Ac (complement): set of all elements not in A

• A ∪B = B ∪A and A ∩B = B ∩A

• A ∪ (B ∪ C) = (A ∪B) ∪ C and A ∩ (B ∩ C) = (A ∩B) ∩ C

• A∪ (B ∩C) = (A∪B)∩ (A∪C) and A∩ (B ∪C) = (A∩B)∪ (A∩C)

• DeMorgan’s Rules

(A ∪B)c = Ac ∩Bc, (A ∩B)c = Ac ∪Bc

Review of Probability 1-7

Page 8: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Axioms of Probability

Probabilities are numbers assigned to events indicating how likely it is that the events will occur

A Probability law is a rule that assigns a number P (A) to each event A

P (A) is called the probability of A and satisfies the following axioms

Axiom 1 P (A) ≥ 0

Axiom 2 P (S) = 1

Axiom 3 If A ∩B = ∅ then P (A ∪B) = P (A) + P (B)

Review of Probability 1-8

Page 9: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Probability Facts

• P (Ac) = 1− P (A)

• P (A) ≤ 1

• P (∅) = 0

• If A1, A2, . . . , An are pairwise mutually exclusive then

P (A1 ∪ A2 ∪ · · · ∪ An) = P (A1) + P (A2) + · · · + P (An)

• P (A ∪B) = P (A) + P (B)− P (A ∩B)

• If A ⊂ B then P (A) ≤ P (B)

Review of Probability 1-9

Page 10: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Conditional Probability

The probability of event A given that event B has occurred

The conditional probability, P (A|B), is defined as

P (A|B) = P (A ∩ B) / P (B), for P (B) > 0

If B is known to have occurred, then A can occur only if A ∩ B occurs

Simply renormalizes the probability of events that occur jointly with B

Useful in finding probabilities in sequential experiments

Review of Probability 1-10

Page 11: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Tree diagram of picking balls

Selecting two balls at random without replacement

[Tree diagram: the first draw gives B1 with probability 2/5 or W1 with probability 3/5; the second-draw branch probabilities are 1/4, 3/4, 2/4, 2/4; the four leaf probabilities are 1/10, 3/10, 3/10, 3/10]

B1, B2 are the events of getting a black ball in the first and second draw

P (B2|B1) = 1/4, P (W2|B1) = 3/4, P (B2|W1) = 2/4, P (W2|W1) = 2/4

The probability of a path is the product of the probabilities in the transition

P (B1 ∩ B2) = P (B2|B1)P (B1) = (1/4)(2/5) = 1/10
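The same number can be checked by brute-force enumeration; the MATLAB sketch below assumes an urn with 2 black and 3 white balls, which is consistent with the branch probabilities P (B1) = 2/5 and P (W1) = 3/5.

% Sketch: check P(B1 and B2) by enumerating all ordered draws without replacement
urn = [1 1 0 0 0];                 % assumed contents: 1 = black, 0 = white
nBB = 0; nPairs = 0;
for i = 1:5
    for j = 1:5
        if i ~= j                  % second ball differs from the first
            nPairs = nPairs + 1;
            nBB = nBB + (urn(i) == 1 && urn(j) == 1);
        end
    end
end
nBB / nPairs                       % 2/20 = 1/10, matching P(B2|B1)P(B1)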

Review of Probability 1-11

Page 12: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Tree diagram of Binary Communication

[Tree diagram: the input is 0 with probability 1 − p or 1 with probability p; each bit is flipped with probability ε and received correctly with probability 1 − ε; the four leaf probabilities are (1 − p)(1 − ε), (1 − p)ε, pε, p(1 − ε)]

Ai: event the input was i, Bi: event the receiver output was i

P (A0 ∩B0) = (1− p)(1− ε)

P (A0 ∩B1) = (1− p)ε

P (A1 ∩B0) = pε

P (A1 ∩B1) = p(1− ε)

Review of Probability 1-12

Page 13: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Theorem on Total Probability

Let B1, B2, . . . , Bn be mutually exclusive events such that

S = B1 ∪B2 ∪ · · · ∪Bn

(their union equals the sample space)

Event A can be partitioned as

A = A ∩ S = (A ∩B1) ∪ (A ∩B2) ∪ · · · ∪ (A ∩Bn)

Since A ∩Bk are disjoint, the probability of A is

P (A) = P (A ∩B1) + P (A ∩B2) + · · ·+ P (A ∩Bn)

or equivalently,

P (A) = P (A|B1)P (B1) + P (A|B2)P (B2) + · · ·+ P (A|Bn)P (Bn)

Review of Probability 1-13

Page 14: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: revisit the tree diagram of picking two balls

[Tree diagram as on page 1-11: first-draw probabilities 2/5 and 3/5; second-draw branch probabilities 1/4, 3/4, 2/4, 2/4]

Find the probability of the event that the second ball is white

P (W2) = P (W2|B1)P (B1) + P (W2|W1)P (W1) = (3/4)(2/5) + (1/2)(3/5) = 3/5

Review of Probability 1-14

Page 15: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Bayes’ Rule

The conditional probability of event A given B is related to the inverse conditional probability of event B given A by

P (A|B) = P (B|A)P (A) / P (B)

• P (A) is called a priori probability

• P (A|B) is called a posteriori probability

Let A1, A2, . . . , An be a partition of S

P (Ai|B) = P (B|Ai)P (Ai) / (P (B|A1)P (A1) + · · · + P (B|An)P (An))

Review of Probability 1-15

Page 16: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Binary Channel

[Binary channel diagram: input bit 0 or 1 is flipped with probability ε and received correctly with probability 1 − ε]

Ai: event the input was i

Bi: event the receiver output was i

Input is equally likely to be 0 or 1

P (B1) = P (B1|A0)P (A0)+P (B1|A1)P (A1) = ε(1/2)+(1−ε)(1/2) = 1/2

Applying Bayes’ Rule, we obtain

P (A0|B1) = P (B1|A0)P (A0) / P (B1) = (ε/2)/(1/2) = ε

If ε < 1/2, input 1 is more likely than 0 when 1 is observed

Review of Probability 1-16

Page 17: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Independence of Events

Events A and B are independent if

P (A ∩B) = P (A)P (B)

• Knowledge of event B does not alter the probability of event A

• This implies P (A|B) = P (A)

[Figures: two shaded events in the unit square with (x, y) picked at random; A and B are independent, while A and C are not independent]

Review of Probability 1-17

Page 18: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: System Reliability

[Block diagram: a controller connected to peripheral units 1, 2 and 3]

• System is 'up' if the controller and at least two units are functioning

• Controller fails with probability p

• Peripheral unit fails with probability a

• All components fail independently

• A: event the controller is functioning

• Bi: event unit i is functioning

• F : event two or more peripheral units are functioning

Find the probability that the system is up

Review of Probability 1-18

Page 19: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

The event F can be partitioned as

F = (B1 ∩ B2 ∩ Bc3) ∪ (B1 ∩ Bc2 ∩ B3) ∪ (Bc1 ∩ B2 ∩ B3) ∪ (B1 ∩ B2 ∩ B3)

Thus,

P (F ) = P (B1)P (B2)P (Bc3) + P (B1)P (Bc2)P (B3) + P (Bc1)P (B2)P (B3) + P (B1)P (B2)P (B3)

= 3a(1 − a)^2 + (1 − a)^3

P (system is up) = P (A ∩ F ) = P (A)P (F ) = (1 − p)P (F ) = (1 − p){3a(1 − a)^2 + (1 − a)^3}
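A quick MATLAB sketch that evaluates this expression and cross-checks it by Monte Carlo simulation under the stated independence assumptions (the failure probabilities p and a below are illustrative).

% Sketch: probability that the system is up
p = 0.05;  a = 0.1;                          % illustrative failure probabilities
P_F  = 3*a*(1 - a)^2 + (1 - a)^3;            % at least two peripheral units functioning
P_up = (1 - p)*P_F
N = 1e6;                                     % Monte Carlo check
ctrl  = rand(N,1) > p;                       % controller functioning
units = rand(N,3) > a;                       % each peripheral unit functioning
mean(ctrl & sum(units,2) >= 2)               % should be close to P_up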

Review of Probability 1-19

Page 20: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Sequential Independent Experiments

• Consider a random experiment consisting of n independent experiments

• Let A1, A2, . . . , An be events of the experiments

• We can compute the probability of events of the sequential experiment

P (A1 ∩A2 ∩ · · · ∩An) = P (A1)P (A2) . . . P (An)

• Example: Bernoulli trial

– Perform an experiment and note if the event A occurs
– The outcome is “success” or “failure”
– The probability of success is p and failure is 1 − p

Review of Probability 1-20

Page 21: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Binomial Probability

• Perform n Bernoulli trials and observe the number of successes

• Let X be the number of successes in n trials

• The probability of X is given by the Binomial probability law

P (X = k) = (n choose k) p^k (1 − p)^(n−k), for k = 0, 1, . . . , n

• The binomial coefficient

(n choose k) = n! / (k!(n − k)!)

is the number of ways of picking k out of n for the successes

Review of Probability 1-21

Page 22: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Error Correction Coding

[Binary channel diagram: each transmitted bit is flipped with probability ε and received correctly with probability 1 − ε]

• Transmit each bit three times

• Decoder takes a majority vote of the received bits

Compute the probability that the receiver makes an incorrect decision

• View each transmission as a Bernoulli trial

• Let X be the number of wrong bits from the receiver

P (X ≥ 2) = (3 choose 2) ε^2 (1 − ε) + (3 choose 3) ε^3
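A short MATLAB sketch evaluating this majority-vote error probability for an illustrative value of ε and comparing it with the uncoded bit-error probability.

% Sketch: wrong majority-vote decision vs a single uncoded transmission
epsl = 0.05;                                           % illustrative bit-error probability
P_err_coded = nchoosek(3,2)*epsl^2*(1 - epsl) + epsl^3  % P(X >= 2)
P_err_plain = epsl                                      % repetition helps whenever epsl < 1/2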

Review of Probability 1-22

Page 23: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Multinomial Probability

• Generalize the binomial probability law to the occurrence of more than one event

• Let B1, B2, . . . , Bm be possible events with

P (Bk) = pk, and p1 + p2 + · · ·+ pm = 1

• Suppose n independent repetitions of the experiment are performed

• Let Xj be the number of times each Bj occurs

• The probability of the vector (X1,X2, . . . ,Xm) is given by

P (X1 = k1, X2 = k2, . . . , Xm = km) = (n! / (k1! k2! · · · km!)) p1^k1 p2^k2 · · · pm^km

where k1 + k2 + · · · + km = n

Review of Probability 1-23

Page 24: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Geometric Probability

• Repeat independent Bernoulli trials until the first success occurs

• Let X be the number of trials until the occurrence of the first success

• The probability of this event is called the geometric probability law

P (X = k) = (1− p)k−1p, for k = 1, 2, . . .

• The geometric probabilities sum to 1:

∑_{k=1}^{∞} P (X = k) = p ∑_{k=1}^{∞} q^(k−1) = p / (1 − q) = 1

where q = 1 − p

• The probability that more than n trials are required before a success

P (X > n) = (1− p)n

Review of Probability 1-24

Page 25: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Error Control by Retransmission

• A sends a message to B over a radio link

• B can detect if the messages have errors

• The probability of transmission error is q

• Find the probability that a message needs to be transmitted more than two times

Each transmission is a Bernoulli trial with probability of success p = 1− q

The probability that more than 2 transmissions are required is

P (X > 2) = q2

Review of Probability 1-25

Page 26: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Sequential Dependent Experiments

Sequence of subexperiments in which the outcome of a given subexperiment determines which subexperiment is performed next

Example: Select the urn for the first draw by flipping a fair coin

[Figure: two urns, labeled 0 and 1, each containing balls numbered 0 and 1]

Draw a ball, note the number on the ball and replace it back in its urn

The urn used in the next experiment depends on # of the ball selected

Review of Probability 1-26

Page 27: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Trellis Diagram

Sequence of outcomes

[Trellis diagram: the possible sequences of outcomes 0 and 1 over successive draws, starting from the initial coin flip H or T]

Probability of a sequence of outcomes

[Trellis diagram annotated with the initial probabilities 1/2, 1/2 and the transition probabilities 2/3, 1/3 out of urn 0 and 1/6, 5/6 out of urn 1]

is the product of the probabilities along the path

Review of Probability 1-27

Page 28: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Markov Chains

Let A1, A2, . . . , An be a sequence of events from n sequential experiments

The probability of a sequence of events is given by

P (A1A2 · · ·An) = P (An|A1A2 · · ·An−1)P (A1A2 · · ·An−1)

If the outcome of An−1 only determines the nth experiment and An then

P (An|A1A2 · · ·An−1) = P (An|An−1)

and the sequential experiments are called Markov Chains

Thus,

P (A1A2 · · ·An) = P (An|An−1)P (An−1|An−2) · · ·P (A2|A1)P (A1)

Find P (0011) in the urn example

Review of Probability 1-28

Page 29: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

The probability of the sequence 0011 is given by

P (0011) = P (1|1)P (1|0)P (0|0)P (0)

where the transition probabilities are

P (1|1) = 5/6, P (1|0) = 1/3, P (0|0) = 2/3

and the initial probability is given by

P (0) = 1/2

Hence,

P (0011) = (5/6) · (1/3) · (2/3) · (1/2) = 5/54
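The same product of transition probabilities can be evaluated in MATLAB; the sketch below encodes the two urns as rows of a transition matrix.

% Sketch: probability of the sequence 0011 for the two-urn Markov chain
T = [2/3 1/3;                 % row 1: P(0|0), P(1|0)
     1/6 5/6];                % row 2: P(0|1), P(1|1)
seq = [0 0 1 1];
p = 1/2;                      % initial probability P(0)
for k = 2:length(seq)
    p = p * T(seq(k-1)+1, seq(k)+1);
end
p                             % = 5/54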

Review of Probability 1-29

Page 30: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

References

Chapter 2 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009

Review of Probability 1-30

Page 31: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

EE401 (Semester 1) Jitkomut Songsiri

2. Random Variables

• definition

• probability measures: CDF, PMF, PDF

• expected values and moments

• examples of RVs

2-1

Page 32: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Definition

[Figure: a random variable X maps an outcome ζ in the sample space S to a point x = X(ζ) on the real line; SX is the range of X]

a random variable X is a function mapping an outcome to a real number

• the sample space, S, is the domain of the random variable

• SX is the range of the random variable

example: toss a coin three times and note the sequence of heads and tails

S = {HHH,HHT,HTH,THH,HTT,THT,TTH,TTT}

Let X be the number of heads in the three tosses

SX = {0, 1, 2, 3}

Random Variables 2-2

Page 33: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Types of Random Variables

Discrete RVs take values from a countable set

example: let X be the number of times a message needs to be transmitted until it arrives correctly

SX = {1, 2, 3, . . .}

Continuous RVs take values in a continuum (uncountably many possible values)

example: let X be the time it takes before receiving the next phone call

Mixed RVs have some part taking values over an interval like typical continuous variables, and part concentrated on particular values like discrete variables

Random Variables 2-3

Page 34: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Probability measures

Cumulative distribution function (CDF)

F (a) = P (X ≤ a)

Probability mass function (PMF) for discrete RVs

p(k) = P (X = k)

Probability density function (PDF) for continuous RVs

f(x) = dF (x)/dx

Random Variables 2-4

Page 35: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Cumulative Distribution Function (CDF)

Properties

0 ≤ F (a) ≤ 1

F (a) → 1 as a → ∞, F (a) → 0 as a → −∞

[Plots: example CDFs of a continuous RV and a discrete RV, increasing from 0 to 1]

F (b) − F (a) = ∫_a^b f(x)dx (continuous), F (a) = ∑_{k≤a} p(k) (discrete)

Random Variables 2-5

Page 36: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Probability Density Function

Probability Density Function (PDF)

• f(x) ≥ 0

• P (a ≤ X ≤ b) = ∫_a^b f(x)dx

• F (x) = ∫_{−∞}^x f(u)du

Probability Mass Function (PMF)

• p(k) ≥ 0 for all k

• ∑_{k∈S} p(k) = 1

Random Variables 2-6

Page 37: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Expected values

let g(X) be a function of random variable X

E[g(X)] = ∑_{x∈S} g(x)p(x) (X is discrete), E[g(X)] = ∫_{−∞}^{∞} g(x)f(x)dx (X is continuous)

Mean

µ = E[X] = ∑_{x∈S} x p(x) (X is discrete), µ = E[X] = ∫_{−∞}^{∞} x f(x)dx (X is continuous)

Variance

σ2 = var[X] = E[(X − µ)2]

nth Moment

E[Xn]

Random Variables 2-7

Page 38: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Facts

Let Y = g(X) = aX + b, a, b are constants

• E[Y ] = aE[X ] + b

• var[Y ] = a2 var[X ]

• var[X ] = E[X2]− (E[X ])2

Random Variables 2-8

Page 39: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Random Variables

Discrete RVs

• Bernoulli

• Binomial

• Geometric

• Negative binomial

• Poisson

• Uniform

Continuous RVs

• Uniform

• Exponential

• Gaussian (Normal)

• Gamma

• Rayleigh

• Cauchy

• Laplacian

Random Variables 2-9

Page 40: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Bernoulli random variables

let A be an event of interest

a Bernoulli random variable X is defined as

X = 1 if A occurs and X = 0 otherwise

it can also be given by the indicator function for A

X(ζ) = 0 if ζ is not in A, and X(ζ) = 1 if ζ is in A

PMF: p(1) = p, p(0) = 1− p, 0 ≤ p ≤ 1

Mean: E[X ] = p

Variance: var[X ] = p(1− p)

Random Variables 2-10

Page 41: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Bernoulli PMF: p = 1/3

[Plots: PMF p(x) and CDF F(x) of the Bernoulli RV with p = 1/3]

Random Variables 2-11

Page 42: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Binomial random variables

• X is the number of successes in a sequence of n independent trials

• each experiment yields success with probability p

• when n = 1, X is a Bernoulli random variable

• SX = {0, 1, 2, . . . , n}

• ex. Transmission errors in a binary channel: X is the number of errors in n independent transmissions

PMF

p(k) = P (X = k) = (n choose k) p^k (1 − p)^(n−k), k = 0, 1, . . . , n

Mean E[X] = np

Variance var[X] = np(1 − p)

Random Variables 2-12

Page 43: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Binomial PMF: p = 1/3, n = 10

[Plots: PMF p(x) and CDF F(x) of the binomial RV with p = 1/3, n = 10]

Random Variables 2-13

Page 44: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Geometric random variables

• repeat independent Bernoulli trials, each has probability of success p

• X is the number of experiments required until the first success occurs

• SX = {1, 2, 3, . . .}

• ex. Message transmissions: X is the number of times a message needs to be transmitted until it arrives correctly

PMF

p(k) = P (X = k) = (1 − p)^(k−1) p

Mean E[X] = 1/p

Variance var[X] = (1 − p)/p^2

Random Variables 2-14

Page 45: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Geometric PMF: p = 1/3

[Plot: geometric PMFs p(x) for several values of p]

• parameters: p = 1/4, 1/3, 1/2

Random Variables 2-15

Page 46: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Negative binomial (Pascal) random variables

• repeat independent Bernoulli trials until observing the rth success

• X is the number of trials required until the rth success occurs

• X can be viewed as the sum of r geometric RVs

• SX = {r, r + 1, r + 2, . . .}

PMF

p(k) = P (X = k) = ((k − 1) choose (r − 1)) p^r (1 − p)^(k−r), k = r, r + 1, . . .

Mean E[X] = r/p

Variance var[X] = r(1 − p)/p^2

Random Variables 2-16

Page 47: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of negative binomial PMF: p = 1/3, r = 5

[Plots: PMF p(x) and CDF F(x) of the negative binomial RV with p = 1/3, r = 5]

Random Variables 2-17

Page 48: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Poisson random variables

• X is a number of events occurring in a certain period of time

• events occur with a known average rate

• the expected number of occurrences in the interval is λ

• SX = {0, 1, 2, . . .}

• examples:

– number of emissions of a radioactive mass during a time interval
– number of queries arriving in t seconds at a call center
– number of packet arrivals in t seconds at a multiplexer

PMF

p(k) = P (X = k) = (λ^k / k!) e^(−λ), k = 0, 1, 2, . . .

Mean E[X ] = λ

Variance var[X ] = λ

Random Variables 2-18

Page 49: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Poisson PMF: λ = 2

[Plots: PMF p(x) and CDF F(x) of the Poisson RV with λ = 2]

Random Variables 2-19

Page 50: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Derivation of Poisson distribution

• approximate a binomial RV when n is large and p is small

• define λ = np, in 1898 Bortkiewicz showed that

p(k) = (n choose k) p^k (1 − p)^(n−k) ≈ (λ^k / k!) e^(−λ)

Proof.

p(0) = (1 − p)^n = (1 − λ/n)^n ≈ e^(−λ), n → ∞

p(k + 1)/p(k) = (n − k)p / ((k + 1)(1 − p)) = (1 − k/n)λ / ((k + 1)(1 − λ/n))

take the limit n → ∞

p(k + 1) = (λ/(k + 1)) p(k) = (λ/(k + 1))(λ/k) · · · (λ/1) p(0) = (λ^(k+1)/(k + 1)!) e^(−λ)

Random Variables 2-20

Page 51: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Comparison of Poisson and Binomial PMFs

[Plots: PMFs p(x) for p = 1/2, n = 10 (left) and p = 1/10, n = 100 (right)]

• red: Poisson

• blue: Binomial
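The comparison can be reproduced with a few lines of MATLAB; the sketch below plots the binomial PMF and its Poisson approximation for the second set of parameters.

% Sketch: binomial PMF vs Poisson approximation with lambda = n*p
n = 100;  p = 1/10;  lam = n*p;  k = 0:20;
pBin  = exp(gammaln(n+1) - gammaln(k+1) - gammaln(n-k+1)) .* p.^k .* (1-p).^(n-k);
pPois = lam.^k ./ factorial(k) .* exp(-lam);
stem(k, pBin, 'b'); hold on
stem(k + 0.2, pPois, 'r'); hold off
legend('Binomial', 'Poisson')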

Random Variables 2-21

Page 52: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Uniform random variables

Discrete Uniform RVs

• X has n possible values, x1, . . . , xn that are equally probable

• PMF

p(x) = 1/n if x ∈ {x1, . . . , xn}, and p(x) = 0 otherwise

Continuous Uniform RVs

• X takes any values on an interval [a, b] that are equally probable

• PDF

f(x) = 1/(b − a) for x ∈ [a, b], and f(x) = 0 otherwise

• Mean: E[X ] = (a+ b)/2

• Variance: var[X ] = (b− a)2/12

Random Variables 2-22

Page 53: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Discrete Uniform PMF: X = 0, 1, 2, . . . , 10

[Plots: PMF p(x) and CDF F(x) of the discrete uniform RV on {0, 1, . . . , 10}]

Example of Continuous Uniform PDF: X ∈ [0, 2]

[Plots: PDF f(x) and CDF F(x) of the continuous uniform RV on [0, 2]]

Random Variables 2-23

Page 54: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Exponential random variables

• arise when describing the time between occurrence of events

• examples:

– the time between customer demands for call connections
– the time used for a bank teller to serve a customer

• λ is the rate at which events occur

• a continuous counterpart of the geometric random variable

PDF

f(x) = λe^(−λx) for x ≥ 0, and f(x) = 0 for x < 0

Mean E[X] = 1/λ

Variance var[X] = 1/λ^2

Random Variables 2-24

Page 55: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Exponential PDF

[Plots: PDF f(x) and CDF F(x) of exponential RVs]

• parameters: λ = 1, 1/2, 1/3

Random Variables 2-25

Page 56: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Memoryless property

P (X > t+ h|X > t) = P (X > h)

• P (X > t + h|X > t) is the probability of having to wait at least h additional seconds given that one has already been waiting t seconds

• P (X > h) is the probability of waiting at least h seconds when one first begins to wait

• thus, the probability of waiting at least an additional h seconds is the same regardless of how long one has already been waiting

Proof.

P (X > t + h|X > t) = P{(X > t + h) ∩ (X > t)} / P (X > t), for h > 0

= P (X > t + h) / P (X > t) = e^(−λ(t+h)) / e^(−λt) = e^(−λh)
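A simulation sketch (MATLAB) of the memoryless property; exponential samples are generated with the inverse-transform trick used again on page 3-10, and t, h, λ below are illustrative.

% Sketch: empirical check of P(X > t+h | X > t) = P(X > h) for an exponential RV
lam = 0.5;  t = 2;  h = 1;  N = 1e6;
X = -log(rand(N,1))/lam;                  % exponential samples
lhs = sum(X > t + h)/sum(X > t)           % estimate of P(X > t+h | X > t)
rhs = exp(-lam*h)                         % P(X > h); the two numbers should agree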

Random Variables 2-26

Page 57: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

this is not the case for other non-negative continuous RVs

in fact, the conditional probability

P (X > t + h|X > t) = (1 − P (X ≤ t + h)) / (1 − P (X ≤ t)) = (1 − F (t + h)) / (1 − F (t))

depends on t in general

Random Variables 2-27

Page 58: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Gaussian (Normal) random variables

• arise as the outcome of the central limit theorem

• the sum of a large number of RVs is distributed approximately normally

• many results involving Gaussian RVs can be derived in analytical form

• let X be a Gaussian RV with parameters mean µ and variance σ2

Notation X ∼ N (µ, σ2)

PDF

f(x) = (1/√(2πσ2)) exp(−(x − µ)2 / (2σ2)), −∞ < x < ∞

Mean E[X ] = µ

Variance var[X ] = σ2

Random Variables 2-28

Page 59: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

let Z ∼ N (0, 1) be the normalized Gaussian variable

CDF of Z is

FZ(z) = (1/√(2π)) ∫_{−∞}^{z} e^(−t^2/2) dt ≜ Φ(z)

then CDF of X ∼ N (µ, σ2) can be obtained by

FX(x) = Φ((x − µ)/σ)

in MATLAB, the error function is defined as

erf(x) = (2/√π) ∫_0^x e^(−t^2) dt

hence, Φ(z) can be computed via the erf command as

Φ(z) = (1/2)[1 + erf(z/√2)]
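For example, the CDF of X ∼ N (µ, σ2) can be evaluated as sketched below (the values of µ, σ and x are illustrative).

% Sketch: Gaussian CDF via the erf command
mu = 1;  sigma = 2;  x = 2.5;                % illustrative values
z = (x - mu)/sigma;
Phi = 0.5*(1 + erf(z/sqrt(2)))               % = FX(x)
% with the Statistics Toolbox the same value is given by normcdf(x, mu, sigma)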

Random Variables 2-29

Page 60: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Gaussian PDF

[Plots: PDF f(x) and CDF F(x) of Gaussian RVs]

• parameters: µ = 1, σ = 1, 2, 3

Random Variables 2-30

Page 61: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Gamma random variables

• appears in many applications:

– the time required to service customers in queuing system
– the lifetime of devices in reliability studies
– the defect clustering behavior in VLSI chips

• let X be a Gamma variable with parameters α, λ

PDF

f(x) = λ(λx)^(α−1) e^(−λx) / Γ(α), x ≥ 0; α, λ > 0

where Γ(z) is the gamma function, defined by

Γ(z) = ∫_0^∞ x^(z−1) e^(−x) dx, z > 0

Mean E[X] = α/λ Variance var[X] = α/λ^2

Random Variables 2-31

Page 62: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Properties of the gamma function

Γ(1/2) =√π

Γ(z + 1) = zΓ(z) for z > 0

Γ(m+ 1) = m!, for m a nonnegative integer

Special cases

a Gamma RV becomes

• exponential RV when α = 1

• m-Erlang RV when α = m, a positive integer

• chi-square RV with k DOF when α = k/2, λ = 1/2

Random Variables 2-32

Page 63: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Gamma PDF

[Plot: gamma PDFs f(x) for the parameter values listed below]

• blue: α = 0.2, λ = 0.2 (long tail)

• green: α = 1, λ = 0.5 (exponential)

• red: α = 3, λ = 1/2 (Chi square with 6 DOF)

• black: α = 5, 20, 50, 100 and α/λ = 10 (α-Erlang with mean 10)

Random Variables 2-33

Page 64: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

m-Erlang random variables

[Figure: a timeline of event times t1, t2, . . . , with interarrival times X1, X2, . . . between consecutive events]

• the kth event occurs at time tk

• the times X1, X2, . . . , Xm between events are exponential RVs

• N(t) denotes the number of events in t seconds, which is a Poisson RV

• Sm = X1 +X2 + . . .+Xm is the elapsed time until the mth occurs

we can show that Sm is an m-Erlang random variable

Random Variables 2-34

Page 65: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Proof. Sm ≤ t iff m or more events occur in t seconds

F (t) = P (Sm ≤ t) = P (N(t) ≥ m) = 1 − ∑_{k=0}^{m−1} ((λt)^k / k!) e^(−λt)

to get the density function of Sm, we take the derivative of F (t):

f(t) = dF (t)/dt = ∑_{k=0}^{m−1} (e^(−λt)/k!) (λ(λt)^k − kλ(λt)^(k−1))

= λ(λt)^(m−1) e^(−λt) / (m − 1)! ⇒ Erlang distribution with parameters m, λ

the sum of m exponential RVs with rate λ is an m-Erlang RV

if m becomes large, the m-Erlang RV should approach the normal RV

Random Variables 2-35

Page 66: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Rayleigh random variables

• arise when observing the magnitude of a vector

• ex. the absolute value of a random complex number whose real and imaginary parts are i.i.d. Gaussian

PDF

f(x) = (x/σ2) e^(−x^2/(2σ2)), x ≥ 0, σ > 0

Mean E[X] = σ√(π/2)

Variance var[X] = ((4 − π)/2) σ2

the Chi square distribution is a generalization of Rayleigh distribution

Random Variables 2-36

Page 67: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Rayleigh PDF

[Plots: PDF f(x) and CDF F(x) of Rayleigh RVs]

• parameters: σ = 1, 2, 3

Random Variables 2-37

Page 68: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Cauchy random variables

PDF

f(x) = (1/π) / (1 + x2), −∞ < x < ∞

• Cauchy distribution does not have any moments

• no mean, variance or higher moments defined

• Z = X/Y is the standard Cauchy if X and Y are independent Gaussian

[Plot: the Cauchy PDF f(x)]

Random Variables 2-38

Page 69: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Laplacian random variables

PDF

f(x) = (α/2) e^(−α|x−µ|), −∞ < x < ∞

Mean E[X] = µ

Variance var[X] = 2/α2

• arise as the difference between two i.i.d exponential RVs

• unlike the Gaussian, the Laplace density is expressed in terms of the absolute difference from the mean

Random Variables 2-39

Page 70: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example of Laplacian PDF

[Plot: Laplacian PDFs f(x)]

• parameters: µ = 1, α = 1, 2, 3, 4, 5

Random Variables 2-40

Page 71: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Related MATLAB commands

• cdf returns the values of a specified cumulative distribution function

• pdf returns the values of a specified probability density function

• randn generates random numbers from the standard Gaussian distribution

• rand generates random numbers from the standard uniform distribution

• random generates random numbers drawn from a specified distribution
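A brief usage sketch of these commands (cdf, pdf and random are part of the Statistics Toolbox; the distribution names and parameter values below are illustrative).

u = rand(1000, 1);                  % samples from the standard uniform distribution
g = randn(1000, 1);                 % samples from the standard Gaussian distribution
F = cdf('Normal', 0, 0, 1)          % P(Z <= 0) = 0.5 for Z ~ N(0,1)
f = pdf('Exponential', 1, 2)        % exponential pdf with mean 2 evaluated at x = 1
x = random('Poisson', 3, [5 1])     % five samples of a Poisson RV with lambda = 3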

Random Variables 2-41

Page 72: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

References

Chapter 3, 4 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009

Random Variables 2-42

Page 73: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

EE401 (Semester 1) Jitkomut Songsiri

3. Functions of random variables

• linear and quadratic transformations

• general transformations

• characteristic function

• Markov and Chebyshev inequalities

• Chernoff bound

3-1

Page 74: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Functions of random variables

let X be an RV and g(x) be a real-valued function defined on the real line

• Y = g(X), Y is also an RV

• CDF of Y will depend on g(x) and CDF of X

Example: define g(x) as

g(x) = (x)+ = x if x ≥ 0, and 0 if x < 0

• an input voltage X passes thru a halfwave rectifier

• A/D converter: a uniform quantizer maps input to the closest point

• Y is # of active speakers in excess of M , i.e., Y = (X −M)+

Functions of random variables 3-2

Page 75: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

CDF of Y = g(X)

probability of equivalent events:

P (Y in C) = P (g(X) in C) = P (X in B)

where B is the equivalent event of values of X such that g(X) is in C

Functions of random variables 3-3

Page 76: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Voice Transmission System

• X is # of active speakers in a group of N speakers

• let p be the probability that a speaker is active

• a voice transmission system can transmit up to M signals at a time

• let Y be the number of signals discarded, so Y = (X −M)+

Y takes values from the set SY = {0, 1, . . . , N −M}

we can compute PMF of Y as

P (Y = 0) = P (X in {0, 1, . . . ,M}) = ∑_{k=0}^{M} pX(k)

P (Y = k) = P (X = M + k) = pX(M + k), 0 < k ≤ N − M

Functions of random variables 3-4

Page 77: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Affine functions

define Y = aX + b, a > 0. Find CDF and PDF of Y

If a > 0

[Figure: the event {Y ≤ y} is equivalent to {X ≤ (y − b)/a} on the line Y = aX + b]

P (Y ≤ y) = P (aX + b ≤ y) = P (X ≤ (y − b)/a)

thus,

FY (y) = FX((y − b)/a)

PDF of Y is obtained by differentiating the CDF with respect to y

fY (y) = (1/a) fX((y − b)/a)

Functions of random variables 3-5

Page 78: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Affine function of a Gaussian

let X ∼ N (m,σ2):

fX(x) = (1/√(2πσ2)) exp(−(x − m)2 / (2σ2))

let Y = aX + b, with a > 0

from page 3-5,

fY (y) = (1/a) fX((y − b)/a) = (1/√(2π(aσ)2)) exp(−(y − b − am)2 / (2(aσ)2))

• Y has also a Gaussian distribution with mean b+am and variance (aσ)2

• thus, a linear function of a Gaussian is also a Gaussian

Functions of random variables 3-6

Page 79: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Quadratic functions

define Y = X2. find CDF and PDF of Y

for a positive y, we have

{Y ≤ y} ⇐⇒ {−√y ≤ X ≤ √y}

thus,

FY (y) = 0 for y < 0, and FY (y) = FX(√y) − FX(−√y) for y > 0

[Figure: the event {Y ≤ y} for Y = X2 corresponds to −√y ≤ X ≤ √y]

differentiating with respect to y gives

fY (y) = fX(√y)/(2√y) + fX(−√y)/(2√y)

for X ∼ N (0, 1), Y is a Chi-square random variable with one DOF

Functions of random variables 3-7

Page 80: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

General functions of random variables

suppose Y = g(X) is a transformation (could be many-to-one)

to find fY (y) for a fixed y, we solve the equation y = g(x) to get x

this may have multiple roots:

y = g(x1) = g(x2) = · · · = g(xn) = · · ·

it can be shown that

fY (y) = fX(x1)/|g′(x1)| + · · · + fX(xn)/|g′(xn)| + · · ·

where g′(x) is the derivative (Jacobian) of g(x)

Functions of random variables 3-8

Page 81: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Affine transformation: Y = aX + b, g′(x) = a

the equation y = ax+ b has a single solution x = (y − b)/a for every y, so

fY (y) = (1/|a|) fX((y − b)/a)

Quadratic transformation: Y = aX2, a > 0, g′(x) = 2ax

if y ≤ 0, then the equation y = ax2 has no real solutions, so fY (y) = 0

if y > 0, then it has two solutions

x1 = √(y/a), x2 = −√(y/a)

and therefore

fY (y) = (1/(2√(ay))) (fX(√(y/a)) + fX(−√(y/a)))

Functions of random variables 3-9

Page 82: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Log of uniform variables

verify that if X has a standard uniform distribution U(0, 1), then

Y = − ln(X)/λ

has an exponential distribution with parameter λ

for Y = y, we can solve X = x = e−λy ⇒ unique root

• the Jacobian is g′(x) = −1/(λx) = −e^(λy)/λ

• fY (y) = 0 when x = e^(−λy) /∈ [0, 1] or when y < 0

• when y ≥ 0 (or e^(−λy) ∈ [0, 1]), we will have

fY (y) = fX(e^(−λy)) / |−1/(λx)| = λe^(−λy)
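This result is the basis of inverse-transform sampling; the MATLAB sketch below compares a histogram of Y = −ln(X)/λ with the exponential density (the value of λ is illustrative).

% Sketch: generate exponential samples from uniform samples
lam = 2;  N = 1e5;
X = rand(N, 1);                          % X ~ U(0,1)
Y = -log(X)/lam;                         % should be exponential with parameter lambda
histogram(Y, 'Normalization', 'pdf'); hold on
y = linspace(0, 4, 200);
plot(y, lam*exp(-lam*y), 'r'); hold off  % overlay the target density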

Functions of random variables 3-10

Page 83: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Amplitude samples of a sinusoidal waveform

let Y = cosX where X ∼ U(0, 2π], find the pdf of Y

for |y| > 1 there is no solution of x ⇒ fY (y) = 0

for |y| < 1 the equation y = cosx has two solutions:

x1 = cos−1(y), x2 = 2π − x1

the Jacobians are

g′(x1) = − sin(x1) = − sin(cos−1(y)) = −√(1 − y2), g′(x2) = √(1 − y2)

since fX(x) = 1/(2π) in the interval (0, 2π], so

fY (y) = 1/(π√(1 − y2)), for −1 < y < 1

note that although fY (±1) = ∞ the probability that y = ±1 is 0

Functions of random variables 3-11

Page 84: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Transform Methods

• moment generating function

• characteristic function

Functions of random variables 3-12

Page 85: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Moment generating functions

for a random variable X , the moment generating function (MGF) of X is

Φ(t) = E[etX]

Continuous

Continuous: Φ(t) = ∫_{−∞}^{∞} e^(tx) f(x)dx

Discrete: Φ(t) = ∑_k e^(t xk) p(xk)

• except for a sign change, Φ(t) is the 2-sided Laplace transform of pdf

• knowing Φ(t) is equivalent to knowing f(x)

Functions of random variables 3-13

Page 86: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Moment theorem

computing any moments of X is easily obtained by

E[Xn] = d^nΦ(t)/dt^n evaluated at t = 0

because

E[e^(tX)] = E[1 + tX + (tX)2/2! + · · · + (tX)n/n! + · · ·]

= 1 + tE[X] + (t2/2!)E[X2] + · · · + (tn/n!)E[Xn] + · · ·

note that Φ(0) = 1

Functions of random variables 3-14

Page 87: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

MGF of Gaussian variables

the MGF of X ∼ N (µ, σ2) is given by

Φ(t) = e(µt+σ2t2/2)

it can be derived by completing square in the exponent:

Φ(t) = (1/√(2πσ2)) ∫_{−∞}^{∞} e^(−(x−µ)2/(2σ2)) e^(tx) dx

where

−(x − µ)2/(2σ2) + tx = −(x − (µ + σ2t))2/(2σ2) + µt + σ2t2/2

from the MGF, we obtain

Φ′(0) = µ, Φ′′(0) = µ2 + σ2

Functions of random variables 3-15

Page 88: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Characteristic functions

the characteristic function (CF) of a random variable X is defined by

Continuous

Continuous: Φ(ω) = E[e^(jωX)] = ∫_{−∞}^{∞} f(x) e^(jωx) dx

Discrete: Φ(ω) = E[e^(jωX)] = ∑_k e^(jωxk) p(xk)

• Φ(ω) is obtained from the moment generating function when t = jω

• Φ(ω) is simply the Fourier transform of the PDF or PMF of X

• every pdf and its characteristic function form a unique Fourier pair:

Φ(ω) ⇐⇒ f(x)

Functions of random variables 3-16

Page 89: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

the characteristic function is maximum at origin because f(x) ≥ 0:

|Φ(ω)| ≤ Φ(0) = 1

Linear transformation: if Y = aX + b , then

Φy(ω) = ejbωΦx(aω)

Gaussian variables: let X ∼ N (µ, σ2)

the characteristic function of X is

Φ(ω) = ejµω · e−σ2ω2/2

Binomial variables: parameters are n, p and q = 1− p

Φ(ω) = (pejω + q)n

Functions of random variables 3-17

Page 90: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Markov and Chebyshev Inequalities

Markov inequality

let X be a nonnegative RV with mean E[X ]

P (X ≥ a) ≤ E[X]/a, a > 0

Chebyshev inequality

let X be an RV with mean µ and variance σ2

P (|X − µ| ≥ a) ≤ σ2/a2

Functions of random variables 3-18

Page 91: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

example: manufacturing of low grade resistors

• assume the average resistance is 100 ohms (measured by a statistical analysis)

• some of the resistors have different values of resistance

if all resistors over 200 ohms are discarded, what is the maximum fraction of resistors that can be discarded?

using Markov inequality with µ = 100 and a = 200

P (X ≥ 200) ≤ 100/200 = 0.5

the percentage of discarded resistors cannot exceed 50% of the total

Functions of random variables 3-19

Page 92: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

if the variance of the resistance is known to equal 100, find the probability that the resistance values are between 50 and 150

P (50 ≤ X ≤ 150) = P (|X − 100| ≤ 50) = 1 − P (|X − 100| ≥ 50)

by Chebyshev inequality

P (|X − 100| ≥ 50) ≤ σ2/(50)2 = 1/25

hence,

P (50 ≤ X ≤ 150) ≥ 1 − 1/25 = 24/25

Functions of random variables 3-20

Page 93: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Chernoff bound

the Chernoff bound is given by

P (X ≥ a) ≤ inf_{t≥0} E[e^(t(X−a))]

which can be expressed as

log P (X ≥ a) ≤ inf_{t≥0} {−ta + log E[e^(tX)]}

• E[etX] is the moment generating function

• logEetX is called the cumulant generating function

• Chernoff bound is useful when EetX has an analytical expression

Functions of random variables 3-21

Page 94: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: X is Gaussian with zero mean and unit variance

the cumulant generating function is

logE[etX] = t2/2

hence,

log P (X ≥ a) ≤ inf_{t≥0} {−ta + t2/2} = −a2/2

and the Chernoff bound gives

P (X ≥ a) ≤ e^(−a2/2)

which is tighter than the Chebyshev inequality:

P (|X| ≥ a) ≤ 1/a2 =⇒ P (X ≥ a) ≤ 1/(2a2)

Functions of random variables 3-22

Page 95: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

[Plot: the exact tail probability P(X ≥ a), the Chernoff bound, and the Chebyshev bound versus a]

when a is small, the Chebyshev bound is useless while the Chernoff bound is tighter
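The curves can be regenerated with the MATLAB sketch below, which compares both bounds with the exact Gaussian tail computed through erf.

% Sketch: Chernoff and Chebyshev bounds vs the exact tail P(X >= a), X ~ N(0,1)
a = linspace(0.1, 3, 100);
chernoff  = exp(-a.^2/2);
chebyshev = 1./(2*a.^2);
exact     = 0.5*(1 - erf(a/sqrt(2)));   % Q-function via erf
semilogy(a, exact, a, chernoff, a, chebyshev)
legend('P(X \geq a)', 'Chernoff', 'Chebyshev')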

Functions of random variables 3-23

Page 96: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

References

Chapter 3, 4 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009

Functions of random variables 3-24

Page 97: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

EE401 (Semester 1) Jitkomut Songsiri

4. Pairs of Random Variables

• probabilities

• conditional probability and expectation

• independence

• joint characteristic function

• functions of random variables

4-1

Page 98: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Definition

let ζ be an outcome in the sample space S

a pair of RVs Z(ζ) is a function that maps ζ to a pair of real numbers

Z(ζ) = (X(ζ), Y (ζ))

example: a web page provides the user with a choice either to watch an ad or move directly to the requested page

let ζ be the patterns of user arrivals to a webpage

• N1(ζ) be the number of times the webpage is directly requested

• N2(ζ) be the number of times the ad is chosen

(N1(ζ), N2(ζ)) assigns a pair of nonnegative integers to each outcome ζ

Pairs of Random Variables 4-2

Page 99: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Events of interest

events involving a pair of RVs (X, Y ) can be represented by regions

Example:

[Figures: the regions corresponding to events A, B and C in the (x, y) plane]

A = {X + Y ≤ 4}, B = {min(X,Y ) ≤ 5}, C = {X2 + Y 2 ≤ 25}

• A: total revenue from two sources is less than 4

• C: total noise power is less than r2

Pairs of Random Variables 4-3

Page 100: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Events and Probabilities

we consider events of the product form:

C = {X in A} ∩ {Y in B}

the probability of product-form events is

P (C) = P ({X in A} ∩ {Y in B}) = P (X in A, Y in B)

Pairs of Random Variables 4-4

Page 101: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Probability for pairs of random variables

Joint cumulative distribution function

FXY (a, b) = P (X ≤ a, Y ≤ b)

Properties:

[Figure: the event {X ≤ a, Y ≤ b} is the semi-infinite rectangle with upper-right corner (a, b)]

• a joint CDF is a nondecreasing function of x and y:

FXY (x1, y1) ≤ FXY (x2, y2), if x1 ≤ x2 and y1 ≤ y2

• FXY (x1,−∞) = 0, FXY (−∞, y1) = 0, FXY (∞,∞) = 1

• P (x1 < X ≤ x2, y1 < Y ≤ y2)

= FXY (x2, y2)− FXY (x2, y1)− FXY (x1, y2) + FXY (x1, y1)

Pairs of Random Variables 4-5

Page 102: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Joint PMF for discrete RVs

pXY (x, y) = P (X = x, Y = y), (x, y) ∈ S

Joint PDF for continuous RVs

fXY (x, y) = ∂2FXY (x, y) / ∂x∂y

Marginal PMF

pX(x) = ∑_{y∈S} pXY (x, y), pY (y) = ∑_{x∈S} pXY (x, y)

Marginal PDF

fX(x) = ∫_{−∞}^{∞} fXY (x, z)dz, fY (y) = ∫_{−∞}^{∞} fXY (z, y)dz

Pairs of Random Variables 4-6

Page 103: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example 1: Jointly Gaussian Random Variables

if X,Y are jointly Gaussian, a joint pdf of X and Y can be given by

fXY (x, y) = (1/(2π√(1 − ρ2))) exp(−(x2 − 2ρxy + y2) / (2(1 − ρ2))), −∞ < x, y < ∞

[Plot: surface of the joint pdf fXY (x, y)]

with |ρ| < 1

find the marginal PDF’s

Pairs of Random Variables 4-7

Page 104: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

the marginal pdf of X is found by integrating fXY (x, y) over y:

fX(x) = (e^(−x2/(2(1−ρ2))) / (2π√(1 − ρ2))) ∫_{−∞}^{∞} e^(−(y2−2ρxy)/(2(1−ρ2))) dy

= (e^(−x2/2)/√(2π)) ∫_{−∞}^{∞} (e^(−(y−ρx)2/(2(1−ρ2))) / √(2π(1 − ρ2))) dy

= e^(−x2/2)/√(2π)

• the second step follows from completing the square in (y − ρx)2

• the last integral equals 1 since its integrand is a Gaussian pdf with mean ρx and variance 1 − ρ2

• the marginal pdf of X is also a Gaussian with mean 0 and variance 1

• from the symmetry of fXY (x, y) in x and y, the marginal pdf of Y is also the same as X

Pairs of Random Variables 4-8

Page 105: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example 2

consider X and Y with a joint PDF

fXY (x, y) = ce−xe−y, 0 ≤ y ≤ x < ∞

find the constant c, the marginal PDFs and P (X + Y ≤ 1)

[Figures: the support region 0 ≤ y ≤ x (left) and its intersection with the half-plane x + y ≤ 1 (right)]

the constant c is found from the normalization condition:

1 = ∫_0^∞ ∫_0^x c e^(−x) e^(−y) dy dx =⇒ c = 2

Pairs of Random Variables 4-9

Page 106: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

the marginal PDFs are obtained by

fX(x) = ∫_0^∞ fXY (x, y)dy = ∫_0^x 2e^(−x) e^(−y) dy = 2e^(−x)(1 − e^(−x)), 0 ≤ x < ∞

fY (y) = ∫_0^∞ fXY (x, y)dx = ∫_y^∞ 2e^(−x) e^(−y) dx = 2e^(−2y), 0 ≤ y < ∞

P (X + Y ≤ 1) can be found by taking the intersection of the region where the joint PDF is nonzero and the event {X + Y ≤ 1}

P (X + Y ≤ 1) = ∫_0^(1/2) ∫_y^(1−y) 2e^(−x) e^(−y) dx dy = ∫_0^(1/2) 2e^(−y)[e^(−y) − e^(−(1−y))] dy = 1 − 2e^(−1)
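These numbers can be checked numerically; the MATLAB sketch below integrates the joint pdf over its support and over the region {y ≤ x, x + y ≤ 1}.

% Sketch: numerical check of the normalization and of P(X + Y <= 1)
f = @(x, y) 2*exp(-x).*exp(-y);                     % joint pdf on 0 <= y <= x
total = integral2(f, 0, Inf, 0, @(x) x)             % should be close to 1
probA = integral2(f, 0, 1, 0, @(x) min(x, 1 - x))   % should be close to 1 - 2*exp(-1)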

Pairs of Random Variables 4-10

Page 107: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Conditional Probability

Discrete RVs

the conditional PMF of Y given X = x is defined by

pY |X(y|x) = P (Y = y|X = x) =P (X = x, Y = y)

P (X = x)

=pXY (x, y)

pX(x)

Continuous RVs

the conditional PDF of Y given X = x is defined by

fY |X(y|x) =fXY (x, y)

fX(x)

Pairs of Random Variables 4-11

Page 108: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Number of defects in a region

• let X be the total number of defects on a chip

X ∼ Poisson(α)

• let Y be the number of defects falling in region R

• if X = n (given), then Y is binomial with (n, p)

pY |X(k|n) = 0 for k > n, and pY |X(k|n) = (n choose k) p^k (1 − p)^(n−k) for 0 ≤ k ≤ n

• we can show that Y ∼ Poisson(αp)

Pairs of Random Variables 4-12

Page 109: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

P (Y = k) = ∑_{n=0}^{∞} P (Y = k|X = n)P (X = n)

= ∑_{n=k}^{∞} (n choose k) p^k (1 − p)^(n−k) (α^n e^(−α) / n!)

= ((αp)^k e^(−α) / k!) ∑_{n=k}^{∞} [(1 − p)α]^(n−k) / (n − k)!

= (αp)^k e^(−α) e^((1−p)α) / k! = ((αp)^k / k!) e^(−αp)

Pairs of Random Variables 4-13

Page 110: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Customers arrive at a service station

• let N be # of customers arriving at a station during time t

N ∼ Poisson(βt)

• let T be the service time for each customer

T ∼ exponential(α)

• we can show that the number of customers that arrive during the service time is a geometric RV with probability of success α/(α + β)

Pairs of Random Variables 4-14

Page 111: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

P (N = k) = ∫_0^∞ P (N = k|T = t) fT (t) dt

= ∫_0^∞ ((βt)^k / k!) e^(−βt) αe^(−αt) dt

= (αβ^k / k!) ∫_0^∞ t^k e^(−(α+β)t) dt

let r = (α + β)t, then

P (N = k) = (αβ^k / (k!(α + β)^(k+1))) ∫_0^∞ r^k e^(−r) dr

= (α/(α + β)) (β/(α + β))^k

(the last integral is a gamma function and is equal to k!)

Pairs of Random Variables 4-15

Page 112: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Conditional Expectation

the conditional expectation of Y given X = x is defined by

Continuous RVs

E[Y |X] = ∫_{−∞}^{∞} y fY |X(y|x) dy

Discrete RVs

E[Y |X] = ∑_y y pY |X(y|x)

• E[Y |X ] is the center of mass associated with the conditional pdf or pmf

• E[Y |X ] can be viewed as a function of random variable X

• E[E[Y |X ]] = E[Y ]

Pairs of Random Variables 4-16

Page 113: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

in fact, we can show that

E[h(Y )] = E[E[h(Y )|X]]

for any function h(·) such that E[|h(Y )|] < ∞

proof.

E[E[h(Y )|X]] = ∫_{−∞}^{∞} E[h(Y )|x] fX(x) dx

= ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(y) fY |X(y|x) dy fX(x) dx

= ∫_{−∞}^{∞} h(y) ∫_{−∞}^{∞} fXY (x, y) dx dy

= ∫_{−∞}^{∞} h(y) fY (y) dy = E[h(Y )]

Pairs of Random Variables 4-17

Page 114: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Average defects of a chip in a region

From the example on page 4-12,

E[Y ] = E[E[Y |X]] = ∑_{n=0}^{∞} np P (X = n) = p ∑_{n=0}^{∞} n P (X = n) = pE[X] = αp

Pairs of Random Variables 4-18

Page 115: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Average arrivals in a service time

from the example on page 4-14,

E[N ] = E[E[N |T ]] = ∫_0^∞ E[N |T = t] fT (t)dt = ∫_0^∞ βt fT (t)dt = βE[T ] = β/α

Pairs of Random Variables 4-19

Page 116: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Variance of arrivals in a service time

same example as on page 4-14

N is Poisson RV with parameter βt when T = t is given, so

E[N |T = t] = βt, E[N2|T = t] = (βt) + (βt)2

the second moment of N can be calculated by

E[N2] = E[E[N2|T ]]

=

∫ ∞

0

E[N2|T = t] fT (t) dt

=

∫ ∞

0

(βt+ β2t2) fT (t) dt

= βE[T ] + β2E[T 2]

Pairs of Random Variables 4-20

Page 117: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

therefore,

var(N) = E[N2]− (E[N ])2

= β2E[T 2] + βE[T ]− β2(E[T ])2

= β2 var(T ) + βE[T ]

• if T is not random (E[T ] is constant and var(T ) = 0), the mean and variance of N are those of a Poisson RV with parameter βE[T ]

• when T is random, the mean of N remains the same, but var(N) increases by the term β2 var(T )

• note that the above result holds for any distribution fT (t)

• if T is exponential with parameter α, then E[T ] = 1/α and var(T ) = 1/α2, so

E[N ] = β/α, var(N) = β2/α2 + β/α

Pairs of Random Variables 4-21

Page 118: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Independence of two random variables

X and Y are independent if and only if

FXY (x, y) = FX(x)FY (y), ∀x, y

this is equivalent to

Discrete Random Variables

pXY (x, y) = pX(x)pY (y)

pY |X(y|x) = pY (y)

Continuous Random Variables

fXY (x, y) = fX(x)fY (y)

fY |X(y|x) = fY (y)

If X and Y are independent, so are any pair of functions g(X) and h(Y )

Pairs of Random Variables 4-22

Page 119: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example

let X and Y be Gaussian RVs with zero mean and unit variance

the product of the marginal pdf’s of X and Y is

fX(x)fY (y) = (1/(2π)) exp(−(x2 + y2)/2), −∞ < x, y < ∞

from the example on page 4-7, the joint pdf of X and Y is

fXY (x, y) = (1/(2π√(1 − ρ2))) exp(−(x2 − 2ρxy + y2)/(2(1 − ρ2))), −∞ < x, y < ∞

therefore the jointly Gaussian X and Y are independent if and only if

ρ = 0

ρ is called correlation coefficient between X and Y

Pairs of Random Variables 4-23

Page 120: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Expected Values and Covariance

the expected value of Z = g(X,Y ) is defined as

E[Z] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) fXY (x, y) dx dy (X, Y continuous)

E[Z] = ∑_x ∑_y g(x, y) pXY (x, y) (X, Y discrete)

• E[X + Y ] = E[X ] +E[Y ]

• E[XY ] = E[X ]E[Y ] if X and Y are independent

Covariance of X and Y

cov(X,Y ) = E[(X −E[X ])(Y −E[Y ])]

• cov(X,Y ) = E[XY ]−E[X ]E[Y ]

• cov(X,Y ) = 0 if X and Y are independent (the converse is NOT true)

Pairs of Random Variables 4-24

Page 121: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Correlation Coefficient

denote σX = √var(X), σY = √var(Y )

the standard deviations of X and Y

the correlation coefficient of X and Y is defined by

ρXY = cov(X,Y ) / (σXσY )

• −1 ≤ ρXY ≤ 1

• ρXY gives the linear dependence between X and Y : for Y = aX + b,

ρXY = 1 if a > 0 and ρXY = −1 if a < 0

• X and Y are said to be uncorrelated if ρXY = 0

Pairs of Random Variables 4-25

Page 122: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

if X and Y are independent then X and Y are uncorrelated

but the converse is NOT true

example: uncorrelated but dependent random variables

let θ be a uniform RV in the interval (0, 2π) and let

X = cos θ, Y = sin θ

[Figure: the point (X, Y ) lies on the unit circle in the (x, y) plane]

• the marginals of X and Y are arcsine pdf’s

• the product of the marginals of X and Y is nonzero in the whole square region

• (X,Y ) is a point on the unit circle, so they are dependent

E[XY ] = (1/(2π)) ∫_0^(2π) sinφ cosφ dφ = (1/(4π)) ∫_0^(2π) sin 2φ dφ = 0

since E[X ] = E[Y ] = 0, the above eq. implies X and Y are uncorrelated

Pairs of Random Variables 4-26

Page 123: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Joint Characteristic Function

the joint characteristic function of X and Y is defined by

ΦXY (λ, ω) = E[ej(λX+ωY )] =

∫ ∞

−∞

∫ ∞

−∞ej(λX+ωY )fXY (x, y) dx dy

the joint characterestic function is a 2D Fourier transform

if X and Y are independent

ΦXY (λ, ω) = E[ejλX]E[ejωY ] = ΦX(λ)ΦY (ω)

Pairs of Random Variables 4-27

Page 124: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example

let U ∼ N (0, 1) and V ∼ N (0, 1) be independent RVs

define X = U + V, Y = U − V

the joint characteristic function of X,Y is obtained by

ΦXY (λ, ω) = E[e^(j(λ(U+V )+ω(U−V )))]

= E[e^(j(λ+ω)U+j(λ−ω)V )]

= E[e^(j(λ+ω)U)] E[e^(j(λ−ω)V )]

= ΦU(λ + ω)ΦV (λ − ω)

= e^(−(λ+ω)2/2) e^(−(λ−ω)2/2)

= e^(−(λ2+ω2)) = ΦX(λ)ΦY (ω)

X and Y are also Gaussian with zero mean and variance 2

Pairs of Random Variables 4-28

Page 125: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

from the identity:

∂2E[e^(j(λX+ωY ))] / ∂λ∂ω = j2 E[XY e^(j(λX+ωY ))]

the joint characteristic function is also useful for finding E[XY ], since

E[XY ] = (1/j2) ∂2ΦXY (λ, ω)/∂λ∂ω evaluated at λ = 0, ω = 0

= −∂2 e^(−(λ2+ω2))/∂λ∂ω at λ = 0, ω = 0

= −e^(−(λ2+ω2))(4λω) at λ = 0, ω = 0

= 0

thus X and Y are uncorrelated

(note that X and Y have zero mean)

Pairs of Random Variables 4-29

Page 126: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Function of Multiple Random Variables

• sum of random variables: Z = X + Y

• division of random variables: Z = X/Y

• linear transformation

Pairs of Random Variables 4-30

Page 127: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Sum of Random Variables

Let Z = X + Y

P (Z ≤ z) = P (X + Y ≤ z)

[Figure: the half-plane x + y ≤ z in the (x, y) plane]

integrate the joint pdf fXY over the region x + y ≤ z

CDF of Z: FZ(z) = P (Z ≤ z) = ∫_{−∞}^{∞} ∫_{−∞}^{z−x} fXY (x, y) dy dx

PDF of Z: fZ(z) = dFZ(z)/dz = ∫_{−∞}^{∞} fXY (x, z − x) dx

when X,Y are independent, the pdf of Z has the form of a convolution:

fZ(z) = ∫_{−∞}^{∞} fX(x)fY (z − x) dx
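A numerical sketch of the convolution formula (MATLAB): for two independent U(0, 1) variables the sum has the triangular density on [0, 2].

% Sketch: pdf of Z = X + Y for independent U(0,1) RVs via numerical convolution
dx = 0.01;  x = 0:dx:1;
fX = ones(size(x));  fY = ones(size(x));   % both pdfs equal 1 on [0,1]
fZ = conv(fX, fY)*dx;                      % discretized convolution integral
z  = 0:dx:2;
plot(z, fZ(1:length(z)))                   % triangular density peaking at z = 1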

Pairs of Random Variables 4-31

Page 128: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example

find the pdf of the sum Z = X + Y

X, Y are jointly Gaussian with zero mean and unit variance with correlation coefficient ρ = −1/2

fZ(z) = ∫_{−∞}^{∞} fXY (x, z − x) dx

= (1/(2π√(1 − ρ2))) ∫_{−∞}^{∞} e^(−(x2 − 2ρx(z−x) + (z−x)2)/(2(1−ρ2))) dx

= (1/(2π√(3/4))) ∫_{−∞}^{∞} e^(−(x2 − xz + z2)/(2(3/4))) dx

= e^(−z2/2)/√(2π)

the sum of these two nonindependent Gaussian is also a Gaussian RV

Pairs of Random Variables 4-32

Page 129: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Characteristic function of a sum

let X and Y be independent RVs and define

Z = X + Y

then the CF of Z is the product of the CFs of X and Y :

ΦZ(ω) = ΦX(ω)ΦY (ω)

proof.

ΦZ(ω) = E[ejωZ] = E[ejω(X+Y )]

= E[ejωX] E[ejωY ] (∵ X and Y are independent)

= ΦX(ω) ΦY (ω)

Pairs of Random Variables 4-33

Page 130: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: sum of independent binomials

let X and Y be i.i.d. binomials RVs with parameters n, p

PX(k) = PY (k) = (n choose k) p^k q^(n−k) (q = 1 − p)

first compute the CF of X and Y

ΦX(ω) = ΦY (ω) = ∑_{k=0}^{n} e^(jωk) (n choose k) p^k q^(n−k) = (pe^(jω) + q)^n

the CF of Z is then given by

ΦZ(ω) = ΦX(ω) ΦY (ω) = (pejω + q)2n

conclusion: Z is also a binomial with parameters 2n and p

Pairs of Random Variables 4-34

Page 131: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Division of Random Variables

let Z = X/Y

if Y = y (given), then Z = X/y, a scaled version of X

therefore, if Y is fixed then the distribution of Z must be the same as X

fZ|Y (z|y) = |y|fX|Y (yz|y)

use this result to find the pdf of Z:

fZ(z) = ∫_{−∞}^{∞} fZ|Y (z|y)fY (y) dy = ∫_{−∞}^{∞} |y| fX|Y (yz|y)fY (y) dy = ∫_{−∞}^{∞} |y| fXY (yz, y) dy

Pairs of Random Variables 4-35

Page 132: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Division of Exponential RVs

let X and Y be exponential RVs with mean 1

fX(x) = e−x, x ≥ 0, fY (y) = e−y, y ≥ 0

assume that X,Y are independent, so

fXY (x, y) = fX(x)fY (y) = e−(x+y)

the pdf of Z = Y/X can be determined by

fZ(z) = ∫_0^∞ y e^(−yz) e^(−y) dy = 1/(z + 1)2, z > 0

Pairs of Random Variables 4-36

Page 133: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Linear Transformation

let A be an invertible linear transformation such that

[U ; V ] = A [X; Y ] ⇐⇒ [X; Y ] = A−1 [U ; V ]

[Figure: a small rectangle with sides dx, dy at (x, y) is mapped by A to a parallelogram of area dM at (u, v)]

P (X ∈ dx, Y ∈ dy) = fXY (x, y)dxdy, P (U ∈ du, V ∈ dV ) = fUV (u, v)dM

it can be shown that

dM = | detA| dx dy, fUV (u, v) = (1/| detA|) fXY (x, y)

Pairs of Random Variables 4-37

Page 134: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability

Example: Linear Transformation of a Gaussian

let X and Y be jointly Gaussian RVs with the joint pdf

fXY(x, y) = (1 / (2π√(3/4))) exp{ −2(x² − xy + y²)/3 }

let U and V be obtained from (X, Y) by

[U; V] = (1/√2) [1 1; −1 1] [X; Y]   ⟺   [X; Y] = (1/√2) [1 −1; 1 1] [U; V]

therefore the pdf of U and V is

fUV(u, v) = (1 / (π√3)) exp{ −(u²/3 + v²) }

U and V become independent, zero-mean Gaussian RVs
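
A quick numeric check of this example (not from the slides), assuming numpy: the quadratic form of fXY corresponds to the covariance Σ = [1 1/2; 1/2 1], so A Σ Aᵀ should come out diagonal with entries 3/2 and 1/2, which matches exp{−(u²/3 + v²)} = exp{−u²/(2·3/2) − v²/(2·1/2)}.

# sketch: transformed covariance A Sigma A^T should be diag(3/2, 1/2)
import numpy as np

Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])               # covariance of (X, Y)
A = (1 / np.sqrt(2)) * np.array([[1.0, 1.0],
                                 [-1.0, 1.0]])
print(A @ Sigma @ A.T)                       # ~ [[1.5, 0], [0, 0.5]]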

Pairs of Random Variables 4-38


References

Chapter 5 in A. Leon-Garcia, Probability, Statistics, and Random Processes for

Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009

Pairs of Random Variables 4-39


EE401 (Semester 1) Jitkomut Songsiri

5. Random Vectors

• probabilities

• characteristic function

• cross correlation, cross covariance

• Gaussian random vectors

• functions of random vectors

5-1


Random vectors

we denote by X a random vector

X is a function that maps each outcome ζ to a vector of real numbers

an n-dimensional random variable has n components:

X = [X1, X2, . . . , Xn]ᵀ

also called a multivariate or multiple random variable

Random Vectors 5-2


Probabilities

Joint CDF

F(x) ≜ FX(x1, x2, . . . , xn) = P(X1 ≤ x1, X2 ≤ x2, . . . , Xn ≤ xn)

Joint PMF

p(x) ≜ pX(x1, x2, . . . , xn) = P(X1 = x1, X2 = x2, . . . , Xn = xn)

Joint PDF

f(x) ≜ fX(x1, x2, . . . , xn) = ∂ⁿF(x) / (∂x1 · · · ∂xn)

Random Vectors 5-3


Marginal PMF

pXj(xj) = P(Xj = xj) = Σ_{x1} · · · Σ_{xj−1} Σ_{xj+1} · · · Σ_{xn} pX(x1, x2, . . . , xn)

Marginal PDF

fXj(xj) = ∫_{−∞}^{∞} · · · ∫_{−∞}^{∞} fX(x1, x2, . . . , xn) dx1 · · · dxj−1 dxj+1 · · · dxn

Conditional PDF: the PDF of Xn given X1, . . . , Xn−1 is

f(xn | x1, . . . , xn−1) = fX(x1, . . . , xn) / fX1,...,Xn−1(x1, . . . , xn−1)

Random Vectors 5-4


Characteristic Function

the characteristic function of an n-dimensional RV is defined by

Φ(ω) = Φ(ω1, . . . , ωn) = E[e^{j(ω1X1 + ··· + ωnXn)}] = ∫_x e^{jωᵀx} f(x) dx

where

ω = [ω1, ω2, . . . , ωn]ᵀ,   x = [x1, x2, . . . , xn]ᵀ

Φ(ω) is the n-dimensional Fourier transform of f(x)

Random Vectors 5-5


Independence

the random variables X1, . . . , Xn are independent if

the joint pdf (or pmf) is equal to the product of their marginals

Discrete:    pX(x1, . . . , xn) = pX1(x1) · · · pXn(xn)

Continuous:  fX(x1, . . . , xn) = fX1(x1) · · · fXn(xn)

we can specify an RV by the characteristic function in place of the pdf,

X1, . . . , Xn are independent if

Φ(ω) = Φ1(ω1) · · ·Φn(ωn)

Random Vectors 5-6


Example: White noise signal in communication

the n samples X1, . . . , Xn of a noise signal have the joint pdf:

fX(x1, . . . , xn) = e^{−(x1² + ··· + xn²)/2} / (2π)^{n/2}   for all x1, . . . , xn

the joint pdf is the product of n one-dimensional Gaussian pdfs

thus, X1, . . . ,Xn are independent Gaussian random variables

Random Vectors 5-7


Expected Values

the expected value of a function

g(X) = g(X1, . . . , Xn)

of a vector random variable X is defined by

E[g(X)] = ∫_x g(x) f(x) dx   (continuous)

E[g(X)] = Σ_x g(x) p(x)      (discrete)

Mean vector

µ = E[X] = E[ [X1, X2, . . . , Xn]ᵀ ] ≜ [E[X1], E[X2], . . . , E[Xn]]ᵀ

Random Vectors 5-8


Correlation and Covariance matrices

Correlation matrix has the second moments of X as its entries:

R ≜ E[XXᵀ] = [ E[X1X1]  E[X1X2]  · · ·  E[X1Xn] ;
               E[X2X1]  E[X2X2]  · · ·  E[X2Xn] ;
                  ...      ...    . . .    ...  ;
               E[XnX1]  E[XnX2]  · · ·  E[XnXn] ]

with Rij = E[XiXj]

Covariance matrix has the second-order central moments as its entries:

C ≜ E[(X − µ)(X − µ)ᵀ]

with Cij = cov(Xi, Xj) = E[(Xi − µi)(Xj − µj)]

Random Vectors 5-9


Symmetric matrix

A ∈ Rn×n is called symmetric if A = AT

Facts: if A is symmetric

• all eigenvalues of A are real

• all eigenvectors of A are orthogonal

• A admits a decomposition

A = UDUT

where UTU = UUT = I (U is unitary) and D is diagonal

(of course, the diagonals of D are eigenvalues of A)

Random Vectors 5-10


Unitary matrix

a matrix U ∈ Rn×n is called unitary if

UTU = UUT = I

example:   (1/√2) [1 −1; 1 1],    [cos θ  −sin θ; sin θ  cos θ]

Facts:

• a real unitary matrix is also called orthogonal

• a unitary matrix is always invertible and U−1 = UT

• column vectors of U are mutually orthogonal

• norm is preserved under a unitary transformation:

y = Ux =⇒ ‖y‖ = ‖x‖

Random Vectors 5-11


Positive definite matrix

a symmetric matrix A is positive semidefinite, written as A ⪰ 0, if

xᵀAx ≥ 0, ∀x ∈ Rⁿ

and positive definite, written as A ≻ 0, if

xᵀAx > 0, for all nonzero x ∈ Rⁿ

Facts: A ⪰ 0 if and only if

• all eigenvalues of A are non-negative

• all principal minors of A are non-negative

Random Vectors 5-12


example: A = [1 −1; −1 2] ⪰ 0 because

xᵀAx = [x1 x2] [1 −1; −1 2] [x1; x2] = x1² + 2x2² − 2x1x2 = (x1 − x2)² + x2² ≥ 0

or we can check from

• eigenvalues of A are 0.38 and 2.61 (real and positive)

• the principal minors are 1 and det [1 −1; −1 2] = 1 (all positive)

note: A ⪰ 0 does not mean all entries of A are positive!

Random Vectors 5-13


Properties of correlation and covariance matrices

let X be a (real) n-dimensional random vector with mean µ

Facts:

• R and C are n× n symmetric matrices

• R and C are positive semidefinite

• If X1, . . . ,Xn are independent, then C is diagonal

• the diagonals of C are given by the variances of Xk

• if X has zero mean, then R = C

• C = R− µµT

Random Vectors 5-14


Cross Correlation and Cross Covariance

let X,Y be vector random variables with means µX, µY respectively

Cross Correlation

cor(X,Y) = E[XYT ]

if cor(X,Y) = 0 then X and Y are said to be orthogonal

Cross Covariance

cov(X, Y) = E[(X − µX)(Y − µY)ᵀ]

          = cor(X, Y) − µX µYᵀ

if cov(X,Y) = 0 then X and Y are said to be uncorrelated

Random Vectors 5-15


Affine transformation

let Y be an affine transformation of X:

Y = AX+ b

where A is a deterministic matrix and b is a deterministic vector

• µY = AµX + b

µY = E[AX+ b] = AE[X] +E[b] = AµX + b

• CY = ACXAT

CY = E[(Y − µY )(Y − µY )T ] = E[(A(X− µX))(A(X− µX))T ]

= AE[(X− µX)(X− µX)T ]AT = ACXAT

Random Vectors 5-16


Diagonalization of covariance matrix

suppose a random vector Y is obtained via a linear transformation of X

(diagram: X → [ A ] → Y)

• the covariance matrices of X,Y are CX,CY respectively

• A may represent linear filter, system gain, etc.

• the covariance of Y is CY = ACXAT

Problem: choose A such that CY becomes ’diagonal’

in other words, the variables Y1, . . . , Yn are required to be uncorrelated

Random Vectors 5-17


since CX is symmetric, it has the decomposition:

CX = UDUT

where

• D is diagonal and its entries are eigenvalues of CX

• U is unitary and the columns of U are eigenvectors of CX

diagonalization: pick A = UT to obtain

CY = A CX Aᵀ = Uᵀ(U D Uᵀ)U = D

as desired

Random Vectors 5-18


one can write X in terms of Y as

X = UUᵀX = UY = [U1 U2 · · · Un][Y1, Y2, . . . , Yn]ᵀ = Σ_{k=1}^{n} Yk Uk

this equation is called the Karhunen-Loève expansion

• X can be expressed as a weighted sum of the eigenvectors Uk

• the weighting coefficients are uncorrelated random variables Yk

Random Vectors 5-19


example: X has the covariance matrix CX = [4 2; 2 4]

design a transformation Y = AX s.t. the covariance of Y is diagonal

the eigenvalues of CX and the corresponding eigenvectors are

λ1 = 6, u1 = [1; 1],    λ2 = 2, u2 = [1; −1]

u1 and u2 are orthogonal, so if we normalize uk so that ‖uk‖ = 1 then

U = [u1/√2   u2/√2]   is unitary

therefore, CX = UDUᵀ where

U = [1/√2  1/√2; 1/√2  −1/√2],    D = [6 0; 0 2]

thus, if we choose A = Uᵀ then CY = D, which is diagonal as desired
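
The same computation can be reproduced numerically (a sketch, not from the slides), assuming numpy; note that numpy orders the eigenvalues ascending, so D appears as diag(2, 6).

# sketch: diagonalize CX by its eigenvectors and check CY = A CX A^T is diagonal
import numpy as np

CX = np.array([[4.0, 2.0],
               [2.0, 4.0]])
eigvals, U = np.linalg.eigh(CX)   # columns of U are orthonormal eigenvectors
A = U.T                           # choose A = U^T
CY = A @ CX @ A.T
print(eigvals)                    # [2. 6.]  (ascending order)
print(CY)                         # ~ diag(2, 6), i.e. uncorrelated components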

Random Vectors 5-20


Whitening transformation

we wish to find a transformation Y = AX such that

CY = I

• a white noise property: the covariance is the identity matrix

• all components in Y are all uncorrelated

• the variances of Yk are normalized to 1

from CY = A CX Aᵀ and the eigenvalue decomposition of CX,

CY = A U D Uᵀ Aᵀ = A U D^{1/2} D^{1/2} Uᵀ Aᵀ

denote by D^{1/2} the square root of D (with D ⪰ 0), i.e., D^{1/2}D^{1/2} = D

D = diag(d1, . . . , dn)  ⟹  D^{1/2} = diag(√d1, . . . , √dn)

can you find A that makes CY the identity matrix?
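
One choice that works (a sketch of a possible answer, not stated on the slide) is A = D^{−1/2}Uᵀ, assuming the eigenvalues of CX are strictly positive; the numpy fragment below checks it on the previous example.

# whitening sketch: with A = D^{-1/2} U^T, CY = A CX A^T = D^{-1/2}(U^T CX U)D^{-1/2} = I
import numpy as np

CX = np.array([[4.0, 2.0],
               [2.0, 4.0]])
d, U = np.linalg.eigh(CX)
A = np.diag(1.0 / np.sqrt(d)) @ U.T
print(A @ CX @ A.T)               # ~ identity matrix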

Random Vectors 5-21


Gaussian random vector

X1, . . . , Xn are said to be jointly Gaussian if their joint pdf is given by

f(x) ≜ fX(x1, x2, . . . , xn) = (1 / ((2π)^{n/2} det(Σ)^{1/2})) exp{ −(1/2)(x − µ)ᵀΣ⁻¹(x − µ) }

µ is the mean (n × 1) and Σ ≻ 0 is the covariance matrix (n × n):

µ = [µ1, µ2, . . . , µn]ᵀ,    Σ = [ Σ11 Σ12 · · · Σ1n ;
                                   Σ21 Σ22 · · · Σ2n ;
                                   ...  ...  . . . ... ;
                                   Σn1 Σn2 · · · Σnn ]

and

µk = E[Xk],    Σij = E[(Xi − µi)(Xj − µj)]

Random Vectors 5-22


example: the joint density function of x (not normalized) is given by

f(x1, x2, x3) = exp{ −(x1² + 3x2² + 2(x3 − 1)² + 2x1(x3 − 1)) / 2 }

• f is an exponential of a negative quadratic in x, so x must be a Gaussian

f(x1, x2, x3) = exp{ −(1/2) [x1, x2, x3 − 1] [1 0 1; 0 3 0; 1 0 2] [x1, x2, x3 − 1]ᵀ }

• the mean vector is (0, 0, 1) and the covariance matrix is

C = [1 0 1; 0 3 0; 1 0 2]⁻¹ = [2 0 −1; 0 1/3 0; −1 0 1]

• the variance of x1 is the highest while that of x2 is the smallest

• x1 and x2 are uncorrelated, and so are x2 and x3
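
A quick check of this example (not from the slides), assuming numpy: inverting the matrix in the exponent recovers the covariance matrix and its diagonal of variances.

# sketch: recover C by inverting the precision matrix in the exponent
import numpy as np

P = np.array([[1.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 2.0]])    # inverse covariance (precision) matrix
C = np.linalg.inv(P)
print(C)            # ~ [[2, 0, -1], [0, 1/3, 0], [-1, 0, 1]]
print(np.diag(C))   # variances: 2, 1/3, 1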

Random Vectors 5-23


examples of Gaussian density contours (level sets of the quadratic form in the exponent)

[x1, x2] [Σ11 Σ12; Σ12 Σ22]⁻¹ [x1, x2]ᵀ = 1

(figure: three contour plots in the (x1, x2)-plane)

• uncorrelated:         Σ = [1 0; 0 1]

• different variances:  Σ = [1/2 0; 0 1]

• correlated:           Σ = [1 −1; −1 2]

Random Vectors 5-24


Properties of Gaussian variables

many results on Gaussian RVs can be obtained analytically:

• marginals of X are also Gaussian

• conditional pdf of Xk given the other variables is a Gaussian distribution

• uncorrelated Gaussian random variables are independent

• any affine transformation of a Gaussian is also a Gaussian

these are well-known facts

and more can be found in the areas of estimation, statistical learning, etc.

Random Vectors 5-25


Characteristic function of Gaussian

Φ(ω) = Φ(ω1, ω2, . . . , ωn) = e^{jµᵀω} e^{−ωᵀΣω/2}

Proof. By definition and rearranging the quadratic term in the exponent,

Φ(ω) = (1 / ((2π)^{n/2}|Σ|^{1/2})) ∫_x e^{jxᵀω} e^{−(x−µ)ᵀΣ⁻¹(x−µ)/2} dx

     = (e^{jµᵀω} e^{−ωᵀΣω/2} / ((2π)^{n/2}|Σ|^{1/2})) ∫_x e^{−(x−µ−jΣω)ᵀΣ⁻¹(x−µ−jΣω)/2} dx

     = exp(jµᵀω) exp(−(1/2) ωᵀΣω)

(the remaining integral cancels the normalizing constant since the integrand has the form of a Gaussian density)

for a one-dimensional Gaussian with zero mean and variance Σ = σ²,

Φ(ω) = e^{−σ²ω²/2}

Random Vectors 5-26


Affine Transformation of a Gaussian is Gaussian

let X be an n-dimensional Gaussian, X ∼ N (µ,Σ) and define

Y = AX+ b

where A is m× n and b is m× 1 (so Y is m× 1)

ΦY(ω) = E[e^{jωᵀY}] = E[e^{jωᵀ(AX+b)}]

       = E[e^{jωᵀAX} · e^{jωᵀb}] = e^{jωᵀb} ΦX(Aᵀω)

       = e^{jωᵀb} · e^{jµᵀAᵀω} · e^{−ωᵀAΣAᵀω/2}

       = e^{jωᵀ(Aµ+b)} · e^{−ωᵀAΣAᵀω/2}

we read off that Y is Gaussian with mean Aµ+ b and covariance AΣAT

Random Vectors 5-27


Marginal of Gaussian is Gaussian

the kth component of X is obtained by

Xk = [0 · · · 0 1 0 · · · 0] X ≜ ekᵀX

(ek is a standard unit column vector; all entries are zero except the kth position)

hence, Xk is simply a linear transformation (in fact, a projection) of X

Xk is then a Gaussian with mean

ekᵀµ = µk

and variance

ekᵀ Σ ek = Σkk

Random Vectors 5-28


Uncorrelated Gaussians are independent

suppose (X, Y) is a jointly Gaussian vector with

mean µ = [µx; µy] and covariance [CX 0; 0 CY]

in other words, X and Y are uncorrelated Gaussians:

cov(X, Y) = E[XYᵀ] − E[X]E[Y]ᵀ = 0

the joint density can be written as

fXY(x, y) = (1 / ((2π)ⁿ|CX|^{1/2}|CY|^{1/2})) exp{ −(1/2) [x − µx; y − µy]ᵀ [CX⁻¹ 0; 0 CY⁻¹] [x − µx; y − µy] }

          = (1 / ((2π)^{n/2}|CX|^{1/2})) e^{−(1/2)(x−µx)ᵀCX⁻¹(x−µx)} · (1 / ((2π)^{n/2}|CY|^{1/2})) e^{−(1/2)(y−µy)ᵀCY⁻¹(y−µy)}

proving the independence

Random Vectors 5-29


we can also see this from the characteristic function

Φ(ω1, ω2) = E[ exp( j [ω1; ω2]ᵀ [X; Y] ) ]

          = exp( j [ω1; ω2]ᵀ [µx; µy] ) · exp( −(1/2) [ω1; ω2]ᵀ [CX 0; 0 CY] [ω1; ω2] )

          = exp(jω1ᵀµx) exp(−ω1ᵀCXω1/2) · exp(jω2ᵀµy) exp(−ω2ᵀCYω2/2)

          ≜ Φ1(ω1) · Φ2(ω2)

proving the independence

Random Vectors 5-30


Conditional of Gaussian is Gaussian

let Z be an n-dimensional Gaussian which can be decomposed as

Z = [X; Y] ∼ N( [µx; µy], [Σxx  Σxy; Σxyᵀ  Σyy] )

the conditional pdf of X given Y is also Gaussian with conditional mean

µX|Y = µx + ΣxyΣyy⁻¹(Y − µy)

and conditional covariance

ΣX|Y = Σxx − ΣxyΣyy⁻¹Σxyᵀ
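
A small numeric illustration of these two formulas (not from the slides), assuming numpy; the block values below are hypothetical example numbers, not taken from the course.

# sketch: conditional mean and covariance of a 2-block Gaussian (made-up values)
import numpy as np

mu_x, mu_y = np.array([0.0]), np.array([1.0])
S_xx = np.array([[2.0]])
S_xy = np.array([[0.8]])
S_yy = np.array([[1.0]])

y = np.array([2.0])                                    # observed value of Y
mu_cond = mu_x + S_xy @ np.linalg.inv(S_yy) @ (y - mu_y)
S_cond = S_xx - S_xy @ np.linalg.inv(S_yy) @ S_xy.T
print(mu_cond, S_cond)     # mean 0.8, covariance 1.36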

Random Vectors 5-31


Proof:

from the matrix inversion lemma, Σ⁻¹ can be written as

Σ⁻¹ = [ S⁻¹                  −S⁻¹ΣxyΣyy⁻¹ ;
        −Σyy⁻¹ΣxyᵀS⁻¹         Σyy⁻¹ + Σyy⁻¹ΣxyᵀS⁻¹ΣxyΣyy⁻¹ ]

where S is the Schur complement of Σyy in Σ:

S = Σxx − ΣxyΣyy⁻¹Σxyᵀ,    det Σ = det S · det Σyy

we can show that Σ ≻ 0 if and only if S ≻ 0 and Σyy ≻ 0

Random Vectors 5-32


from fX|Y(x|y) = fXY(x, y)/fY(y), we calculate the exponent terms

[x − µx; y − µy]ᵀ Σ⁻¹ [x − µx; y − µy] − (y − µy)ᵀΣyy⁻¹(y − µy)

= (x − µx)ᵀS⁻¹(x − µx) − (x − µx)ᵀS⁻¹ΣxyΣyy⁻¹(y − µy)
  − (y − µy)ᵀΣyy⁻¹ΣxyᵀS⁻¹(x − µx)
  + (y − µy)ᵀ(Σyy⁻¹ΣxyᵀS⁻¹ΣxyΣyy⁻¹)(y − µy)

= [x − µx − ΣxyΣyy⁻¹(y − µy)]ᵀ S⁻¹ [x − µx − ΣxyΣyy⁻¹(y − µy)]

≜ (x − µX|Y)ᵀ ΣX|Y⁻¹ (x − µX|Y)

fX|Y(x|y) is an exponential of a quadratic function in x,

so it has the form of a Gaussian

Random Vectors 5-33


Standard Gaussian vectors

for an n-dimensional Gaussian vector X ∼ N (µ,C) with C ≻ 0

let A be an n× n invertible matrix such that

AAᵀ = C

(A is called a factor of C)

then the random vector

Z = A⁻¹(X − µ)

is a standard Gaussian vector, i.e.,

Z ∼ N(0, I)

(obtain A via eigenvalue decomposition or Cholesky factorization)
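
A sketch of the standardization (not from the slides), assuming numpy: use a Cholesky factor for A and check the sample covariance of the standardized vectors.

# sketch: standardize Gaussian samples with a Cholesky factor A, A A^T = C
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
C = np.array([[4.0, 2.0],
              [2.0, 4.0]])
A = np.linalg.cholesky(C)                              # lower triangular, A A^T = C

X = rng.multivariate_normal(mu, C, size=100_000)       # samples, shape (N, 2)
Z = (np.linalg.inv(A) @ (X - mu).T).T                  # Z = A^{-1}(X - mu)
print(np.cov(Z.T))                                     # ~ identity matrix
print(Z.mean(axis=0))                                  # ~ [0, 0]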

Random Vectors 5-34


Functions of random vectors

• minimum and maximum of random variables

• general transformation

• affine transformation

Random Vectors 5-35


Minimum and Maximum of RVs

let X1,X2, . . . ,Xn be independent RVs

define the minimum and maximum of RVs by

Y = min(X1,X2, . . . ,Xn), Z = max(X1, X2, . . . , Xn)

the maximum of X1, X2, . . . , Xn is at most z iff Xi ≤ z for all i, so (when the Xi share a common CDF FX)

FZ(z) = P(X1 ≤ z)P(X2 ≤ z) · · · P(Xn ≤ z) = (FX(z))ⁿ

the minimum of X1, X2, . . . , Xn is greater than y iff Xi > y for all i, so

1 − FY(y) = P(X1 > y)P(X2 > y) · · · P(Xn > y) = (1 − FX(y))ⁿ

and

FY(y) = 1 − (1 − FX(y))ⁿ

Random Vectors 5-36


Example: Merging of independent Poisson arrivals

(diagram: sources 1, 2, . . . , n each send web page requests to a single server)

• Ti denotes the interarrival times for source i

• Ti has an exponential distribution with rate λi

• find the distribution of the interarrival times between consecutive requests at the server

Random Vectors 5-37


each Ti satisfies the memoryless property, so the time that has elapsed since the last arrival is irrelevant

the time until the next arrival at the multiplexer is

Z = min(T1, T2, . . . , Tn)

therefore, the cdf of Z can be computed by:

1 − FZ(z) = P(min(T1, T2, . . . , Tn) > z)

          = P(T1 > z)P(T2 > z) · · · P(Tn > z)

          = (1 − FT1(z))(1 − FT2(z)) · · · (1 − FTn(z))

          = e^{−λ1z} · · · e^{−λnz} = e^{−(λ1+λ2+···+λn)z}

the interarrival time is an exponential RV with rate λ1 + λ2 + · · ·+ λn
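
A Monte Carlo sketch (not from the slides), assuming numpy and hypothetical rates λ = (0.5, 1, 2): the minimum of the three exponentials should again be exponential, with rate 3.5.

# sketch: min of independent exponentials has rate equal to the sum of the rates
import numpy as np

rng = np.random.default_rng(0)
lam = np.array([0.5, 1.0, 2.0])                    # hypothetical example rates
T = rng.exponential(1.0 / lam, size=(200_000, 3))  # columns: T1, T2, T3
Z = T.min(axis=1)
print(1.0 / Z.mean())        # ~ 3.5 = 0.5 + 1.0 + 2.0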

Random Vectors 5-38


General transformation

let X be a vector random variable

define Z = g(X) : Rn → Rn and assume that g is invertible

so that for Z = z we can solve for x uniquely:

x = g−1(z)

then the joint pdf of Z is given by

fZ(z) = fX(g⁻¹(z)) / |det J|

where det J is the determinant of the Jacobian matrix (evaluated at x = g⁻¹(z)):

J = [ ∂g1/∂x1  · · ·  ∂g1/∂xn ;
         ...    . . .    ...  ;
      ∂gn/∂x1  · · ·  ∂gn/∂xn ]

Random Vectors 5-39


Affine transformation

if X is a continuous random vector and A is an invertible matrix

then Y = AX+ b has pdf

fY(y) = (1 / |det(A)|) fX(A⁻¹(y − b))

Gaussian case: let X ∼ N(0, Σ)

fY(y) = (1 / ((2π)^{n/2}|detA||Σ|^{1/2})) exp{ −(1/2)(y − b)ᵀA⁻ᵀΣ⁻¹A⁻¹(y − b) }

      = (1 / ((2π)^{n/2}|AΣAᵀ|^{1/2})) exp{ −(1/2)(y − b)ᵀ(AΣAᵀ)⁻¹(y − b) }

we read off that Y is also Gaussian with mean b and covariance AΣAᵀ

this agrees with the results on pages 5-16 and 5-27

Random Vectors 5-40


Example: Sum of jointly Gaussian

a special case of linear transformation is

Z = a1X1 + a2X2 + · · ·+ anXn

where X1, . . . , Xn are jointly Gaussian

Z can be written as

Z = [a1 · · · an] [X1, . . . , Xn]ᵀ ≜ AX

Z is simply a linear transformation of a Gaussian

Random Vectors 5-41


therefore, Z is Gaussian with mean

E[Z] = Aµ = Σ_{i=1}^{n} ai E[Xi]

and variance

var(Z) = cov(Z) = AΣAᵀ = Σ_{i=1}^{n} Σ_{j=1}^{n} ai aj cov(Xi, Xj)

if X1, . . . , Xn are independent Gaussians, i.e.,

cov(Xi, Xj) = 0 for i ≠ j

then the variance of Z reduces to

var(Z) = Σ_{i=1}^{n} ai² cov(Xi, Xi) = Σ_{i=1}^{n} ai² var(Xi)

Random Vectors 5-42


References

Chapter 6 in A. Leon-Garcia, Probability, Statistics, and Random Processes for

Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009

Random Vectors 5-43


EE401 (Semester 1) Jitkomut Songsiri

6. Sums of Random Variables

• mean and variance

• PDF of sums of independent RVs

• laws of large numbers

• central limit theorems

6-1


Mean and Variance

let X1,X2, . . . ,Xn be a sequence of RVs

regardless of statistical dependence, we have

E[X1 +X2 + · · ·+Xn] = E[X1] + E[X2] + · · ·+ E[Xn]

the variance of a sum of RVs is, however, NOT equal to the sum of variances

var(X1 + X2 + · · · + Xn) = Σ_{k=1}^{n} var(Xk) + Σ_{j=1}^{n} Σ_{k=1, k≠j}^{n} cov(Xj, Xk)

If X1, X2, . . . , Xn are uncorrelated, then

var(X1 +X2 + · · ·+Xn) = var(X1) + var(X2) + · · ·+ var(Xn)

Sums of Random Variables 6-2


PDF of sums of independent RVs

consider the sum of n independent RVs

Sn = X1 +X2 + · · ·+Xn

the characteristic function of Sn is

ΦS(ω) = E[e^{jωSn}] = E[e^{jω(X1+X2+···+Xn)}]

       = E[e^{jωX1}] · · · E[e^{jωXn}]

       = ΦX1(ω) · · · ΦXn(ω)

thus the pdf of Sn is found by taking the inverse Fourier transform of ΦS(ω):

fS(x) = F⁻¹[ΦX1(ω) · · · ΦXn(ω)]

Sums of Random Variables 6-3


Example

find the pdf of a sum of n independent exponential RVs

all exponential variables have parameter α

the characteristic function of a single exponential RV is

ΦX(ω) = α / (α − jω)

the characteristic function of the sum is

ΦS(ω) = ( α / (α − jω) )ⁿ

we see that Sn is an n-Erlang RV

Sums of Random Variables 6-4


Sample mean

let X be an RV with E[X ] = µ (unknown)

X1, X2, . . . , Xn denote n independent, repeated measurements of X

Xj’s are independent, identically distributed (i.i.d.) RVs

the sample mean of the sequence is used to estimate E[X]:

Mn = (1/n) Σ_{j=1}^{n} Xj

two statistical quantities for characterizing the sample mean’s properties:

• E[Mn]: we say Mn is unbiased if E[Mn] = µ

• var(Mn): we examine this value when n is large

Sums of Random Variables 6-5


the sample mean is an unbiased estimator for µ:

E[Mn] = E[ (1/n) Σ_{j=1}^{n} Xj ] = (1/n) Σ_{j=1}^{n} E[Xj] = µ

suppose var(X) = σ² (true variance)

since the Xj's are i.i.d., the variance of Mn is

var(Mn) = (1/n²) Σ_{j=1}^{n} var(Xj) = nσ²/n² = σ²/n

hence, the variance of the sample mean approaches zero as the number of samples increases
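
A simulation sketch of this fact (not from the slides), assuming numpy: the empirical variance of the sample mean over many trials should be close to σ²/n.

# sketch: variance of the sample mean of n i.i.d. samples is about sigma^2/n
import numpy as np

rng = np.random.default_rng(0)
sigma2, n, trials = 4.0, 50, 20_000
X = rng.normal(0.0, np.sqrt(sigma2), size=(trials, n))
Mn = X.mean(axis=1)                     # one sample mean per trial
print(Mn.var(), sigma2 / n)             # both ~ 0.08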

Sums of Random Variables 6-6


Weak Law of Large Numbers

let X1, X2, . . . , Xn be a sequence of i.i.d. RVs with finite mean E[X] = µ and variance σ²

for any ε > 0,   lim_{n→∞} P[ |Mn − µ| < ε ] = 1

• for large enough n, the sample mean will be close to the true mean with high probability

• Proof. apply the Chebyshev inequality:

P[|Mn − µ| ≥ ε] ≤ σ²/(nε²)   ⟹   P[|Mn − µ| < ε] ≥ 1 − σ²/(nε²)

Sums of Random Variables 6-7


scattergram of 1000 realizations of the sample mean

(figure: four scatter plots of the sample means (x1, x2) for n = 100, 1000, 10000, 100000; both axes range from −0.4 to 0.4)

• the Mn's are computed from a 2-dimensional Gaussian with zero mean

• as n increases, the Mn's concentrate around zero with high probability

Sums of Random Variables 6-8


Strong Law of Large Numbers

let X1, X2, . . . , Xn be a sequence of i.i.d. RVs with finite mean E[X] = µ and finite variance, then

P[ lim_{n→∞} Mn = µ ] = 1

• Mk is the sequence of sample means computed using X1 through Xk

• with probability 1, every sequence of sample mean calculations will eventually approach and stay close to E[X] = µ

• the strong law implies the weak law

Sums of Random Variables 6-9


Central Limit Theorem

let X1, X2, . . . , Xn be a sequence of i.i.d. RVs with

finite mean E[X] = µ and finite variance σ²

let Sn be the sum of the first n RVs in the sequence:

Sn = X1 + X2 + · · · + Xn

and define

Zn = (Sn − nµ) / (σ√n)

then

lim_{n→∞} P(Zn ≤ z) = (1/√(2π)) ∫_{−∞}^{z} e^{−x²/2} dx

as n becomes large, the CDF of the normalized Sn approaches the Gaussian distribution

Sums of Random Variables 6-10


Proof of Central Limit Theorem

first note that

Zn = (Sn − nµ) / (σ√n) = (1/(σ√n)) Σ_{k=1}^{n} (Xk − µ)

the characteristic function of Zn is given by

ΦZn(ω) = E[e^{jωZn}] = E[ exp( (jω/(σ√n)) Σ_{k=1}^{n} (Xk − µ) ) ]

        = E[ Π_{k=1}^{n} e^{jω(Xk−µ)/(σ√n)} ]

        = ( E[e^{jω(X−µ)/(σ√n)}] )ⁿ

(using the fact that the Xk's are i.i.d.)

Sums of Random Variables 6-11


expanding the exponential expression gives

E[e^{jω(X−µ)/(σ√n)}] = E[ 1 + (jω/(σ√n))(X − µ) + ((jω)²/(2!nσ²))(X − µ)² + · · · ]

                     ≈ 1 − ω²/(2n)

(the higher-order terms can be neglected as n becomes large)

then we obtain

ΦZn(ω) ≈ ( 1 − ω²/(2n) )ⁿ → e^{−ω²/2},  as n → ∞

Sums of Random Variables 6-12


(figure: two plots of F(x); each shows the CDF of the sum of n exponential RVs (blue, solid) and the matching Gaussian CDF (red, dashed), for n = 5 on the left and n = 50 on the right)

• blue lines are the CDFs of the sum of n exponential RVs with λ = 1, where n = 5 (left) and n = 50 (right)

• red dashed lines are the CDFs of a Gaussian RV with the same mean (n/λ) and variance (n/λ²)

• as n increases, the CDF approaches that of the Gaussian distribution
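
The comparison in the figure can be reproduced with a short simulation (a sketch, not from the slides), assuming numpy and scipy.

# sketch: empirical CDF of a sum of n exponential(1) RVs vs the Gaussian CDF
# with the same mean n and variance n
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, trials = 50, 100_000
S = rng.exponential(1.0, size=(trials, n)).sum(axis=1)

x = np.linspace(20, 80, 61)
emp_cdf = np.array([(S <= xi).mean() for xi in x])
gauss_cdf = norm.cdf(x, loc=n, scale=np.sqrt(n))
print(np.max(np.abs(emp_cdf - gauss_cdf)))   # small for n = 50 (CLT)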

Sums of Random Variables 6-13


(figure: two plots of F(x); each shows the CDF of the sum of n Bernoulli RVs (blue, solid) and the matching Gaussian CDF (red, dashed), for n = 5 on the left and n = 25 on the right)

• blue lines are the CDFs of the sum of n Bernoulli RVs with p = 1/2, where n = 5 (left) and n = 25 (right)

• red dashed lines are the CDFs of a Gaussian with mean np and variance np(1 − p)

Sums of Random Variables 6-14


Example

the times between events are i.i.d. exponential RVs with mean m sec

find the probability that the 1000th event occurs in the time interval (1000 ± 50)m

• Xj is the time between events

• Sn is the time of the nth event (so Sn = X1 + X2 + · · · + Xn)

• E[Sn] = nm and var(Sn) = nm²

the CLT gives

P(950m ≤ S1000 ≤ 1050m) = P( (950m − 1000m)/(m√1000) ≤ Zn ≤ (1050m − 1000m)/(m√1000) )

                        ≈ Φ(1.58) − Φ(−1.58)
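
Evaluating the final expression numerically (not from the slides), assuming scipy, gives roughly 0.886.

# sketch: evaluate Phi(1.58) - Phi(-1.58) with the standard normal CDF
from scipy.stats import norm

z = 50 / (1000 ** 0.5)                    # = 1.5811...
print(norm.cdf(z) - norm.cdf(-z))         # ~ 0.886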

Sums of Random Variables 6-15


References

Chapter 7 in A. Leon-Garcia, Probability, Statistics, and Random Processes for

Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009

Sums of Random Variables 6-16