INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous)
DUNDIGAL, HYDERABAD - 500043

PPT ON PROBABILITY THEORY & STOCHASTIC PROCESS
II B.Tech I Semester (JNTUH-R15)

Prepared by
Ms. G. Mary Swarna Latha (Assistant Professor)
Mr. G. Anil Kumar Reddy (Assistant Professor)
Probability Introduced Through Sets and Relative Frequency
• Experiment: a random experiment is an action or process that leads to one of several possible outcomes.
Experiment          Outcomes
Flip a coin         Heads, Tails
Exam marks          Numbers: 0, 1, 2, ..., 100
Assembly time       t > 0 seconds
Course grades       F, D, C, B, A, A+
Sample Space
• The list of all possible outcomes is called the Sample Space; the individual outcomes are called the Simple Events.
• This list must be exhaustive, i.e. ALL possible outcomes must be included:
   – Die roll {1,2,3,4,5} is not exhaustive; die roll {1,2,3,4,5,6} is.
• The list must be mutually exclusive, i.e. no two outcomes can occur at the same time:
   – Die roll {odd number or even number} is mutually exclusive; die roll {number less than 4 or even number} is not.
Sample Space
• A list of exhaustive [don't leave anything out] and mutually exclusive [impossible for two different outcomes to occur in the same experiment] outcomes is called a sample space and is denoted by S.
• The outcomes are denoted by O1, O2, ..., Ok.
• Using notation from set theory, we can represent the sample space and its outcomes as S = {O1, O2, ..., Ok}.
• Given a sample space S = {O1, O2, …, Ok}, the probabilities assigned to the outcome must satisfy these requirements:
(1) The probability of any outcome is between 0 and 1
• i.e. 0 ≤ P(Oi) ≤ 1 for each i, and
(2) The sum of the probabilities of all the outcomes equals 1
• i.e. P(O1) + P(O2) + … + P(Ok) = 1
Relative Frequency
Consider a random experiment with sample space S. We shall assign a non-negative number called probability to each event in the sample space.
Let A be a particular event in S; the probability of event A is denoted by P(A).
Suppose that the random experiment is repeated n times and the event A occurs n_A times. Then the probability of event A is defined as the relative frequency:
• Relative Frequency Definition: the probability of an event A is defined as
$$P(A) = \lim_{n \to \infty} \frac{n_A}{n}$$
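A quick numerical sketch of this definition (not part of the original slides; Python with numpy is an assumed tool here):

```python
import numpy as np

rng = np.random.default_rng(42)

# Estimate P(A) for the event A = "a fair die shows an even number"
# by the relative frequency n_A / n as the number of trials n grows.
for n in [100, 10_000, 1_000_000]:
    rolls = rng.integers(1, 7, size=n)           # outcomes 1..6
    n_A = np.count_nonzero(rolls % 2 == 0)       # times A occurred
    print(f"n = {n:>9}: n_A/n = {n_A / n:.4f}")  # approaches P(A) = 1/2
```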
Axioms of Probability
For any event A, we assign a number P(A), called the probability of the event A. This number satisfies the following three conditions, which act as the axioms of probability:
(i) $P(A) \ge 0$ (probability is a non-negative number)
(ii) $P(S) = 1$ (probability of the whole set is unity)
(iii) If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$.
(Note that (iii) states that if A and B are mutually exclusive (M.E.) events, the probability of their union is the sum of their probabilities.)
Events
• The probability of an event is the sum of the probabilities of the simple events that constitute the event.
• E.g. (assuming a fair die) S = {1, 2, 3, 4, 5, 6} and
• P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
• Then:
• P(EVEN) = P(2) + P(4) + P(6) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
Conditional Probability
• Conditional probability is used to determine how two events are related; that is, we can determine the probability of one event given the occurrence of another related event.
• Experiment: randomly select one student in class.
   – P(randomly selected student is male) = ?
   – P(randomly selected student is male | student is on 3rd row) = ?
• Conditional probabilities are written as P(A | B) and read as "the probability of A given B".
• Multiplication rule: P(A and B) = P(A)·P(B | A) = P(B)·P(A | B); both factorizations are true.
• Keep this in mind!
Bayes' Law
• Bayes' Law is named for Thomas Bayes, an eighteenth-century mathematician.
• In its most basic form, if we know P(B | A), we can apply Bayes' Law to determine P(A | B).
• The probabilities P(A) and P(A^C) are called prior probabilities because they are determined prior to the decision about taking the preparatory course.
• The conditional probability P(A | B) is called a posterior probability (or revised probability), because the prior probability is revised after the decision about taking the preparatory course.
Total Probability Theorem
• Take events $A_i$, for $i = 1$ to $k$, to be:
   – Mutually exclusive: $A_i \cap A_j = \emptyset$ for all $i \ne j$
   – Exhaustive: $A_1 \cup A_2 \cup \dots \cup A_k = S$
• Then for any event B on S,
$$P(B) = P(B \mid A_1)P(A_1) + \dots + P(B \mid A_k)P(A_k) = \sum_{i=1}^{k} P(B \mid A_i)\,P(A_i).$$
Bayes' theorem follows:
$$P(A_j \mid B) = \frac{P(B \mid A_j)\,P(A_j)}{P(B)} = \frac{P(B \mid A_j)\,P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i)\,P(A_i)}.$$
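A small numerical sketch of both theorems (the priors and likelihoods below are illustrative assumptions, not values from the slides):

```python
import numpy as np

# Assumed priors P(A_i) and likelihoods P(B | A_i) for three mutually
# exclusive, exhaustive causes; the numbers are illustrative only.
prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.02, 0.05, 0.10])

# Total probability theorem: P(B) = sum_i P(B | A_i) P(A_i)
p_B = np.sum(likelihood * prior)

# Bayes' theorem: P(A_j | B) = P(B | A_j) P(A_j) / P(B)
posterior = likelihood * prior / p_B
print(p_B)                         # 0.045
print(posterior, posterior.sum())  # posteriors sum to 1
```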
Independence
• Do A and B depend on one another?
   – If they do, B is more likely to be true when A has occurred, and A is more likely when B has occurred.
• If independent:
$$P(A \cap B) = P(A)\,P(B), \qquad P(A \mid B) = P(A), \qquad P(B \mid A) = P(B).$$
• If dependent:
$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A).$$
Random Variable
• Random variable: a rule that assigns a numerical value to each outcome of a particular experiment; it maps the sample space S to the real line R.
• Example 1: Machine breakdowns
   – Sample space: S = {electrical, mechanical, misuse}
   – Each of these failures may be associated with a repair cost
   – State space: {50, 200, 350}
   – Cost is a random variable: 50, 200, and 350
• Probability Mass Function (p.m.f.)
   – A set of probability values $p_i$ assigned to each of the values $x_i$ taken by the discrete random variable:
$$0 \le p_i \le 1, \qquad \sum_i p_i = 1, \qquad P(X = x_i) = p_i.$$
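A short sketch of a p.m.f. in code (the probability values assigned to the Example 1 repair costs are assumptions for illustration):

```python
import numpy as np

# p.m.f. for the repair cost of Example 1; the probability values
# 0.3, 0.2, 0.5 are illustrative assumptions, not from the slides.
costs = np.array([50, 200, 350])
p = np.array([0.3, 0.2, 0.5])

# Check the p.m.f. conditions 0 <= p_i <= 1 and sum_i p_i = 1:
assert np.all((p >= 0) & (p <= 1)) and np.isclose(p.sum(), 1.0)

# Probability of an event = sum of p.m.f. values over its outcomes:
print("P(cost > 100) =", p[costs > 100].sum())   # 0.7
```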
Continuous and Discrete Random Variables
• Discrete random variables have a countable number of outcomes.
   – Examples: dead/alive, treatment/placebo, dice, counts, etc.
• Continuous random variables have an infinite continuum of possible values.
   – Examples: blood pressure, weight, the speed of a car, the real numbers from 1 to 6.
• Distribution function: $F_X(x) = P(X \le x)$.
• If $F_X(x)$ is a continuous function of x, then X is a continuous random variable.
   – $F_X(x)$ discrete in x: discrete r.v.'s
   – $F_X(x)$ piecewise continuous: mixed r.v.'s
• Properties: $F_X$ is non-decreasing, with $F_X(-\infty) = 0$ and $F_X(\infty) = 1$.
Probability Density Function (pdf)
• If X is a continuous r.v., then its pdf is
$$f_X(x) = \frac{dF_X(x)}{dx}.$$
• pdf properties:
   1. $f_X(x) \ge 0$
   2. $\displaystyle \int_{-\infty}^{\infty} f_X(x)\,dx = 1, \qquad F_X(t) = \int_{-\infty}^{t} f_X(x)\,dx.$
Binomial
• Suppose that the probability of success is p
• What is the probability of failure?
q = 1 – p
• Examples
– Toss of a coin (S = head): p = 0.5 q = 0.5
– Roll of a die (S = 1): p = 0.1667 q = 0.8333
– Fertility of a chicken egg (S = fertile): p = 0.8 q = 0.2
• Imagine that a trial is repeated n times
• Examples
– A coin is tossed 5 times
– A die is rolled 25 times
– 50 chicken eggs are examined
• Assume p remains constant from trial to trial and that the trials are statistically independent of each other
• Example
– What is the probability of obtaining 2 heads from a coin that was tossed 5 times?
   P(HHTTT) = (1/2)^5 = 1/32. Each particular sequence with exactly 2 heads has this probability, and there are C(5,2) = 10 such sequences, so P(2 heads) = 10/32.
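A hedged sketch (not from the slides): the number of successes in n independent trials follows the binomial pmf $P(k) = \binom{n}{k} p^k q^{n-k}$; scipy is an assumed tool here.

```python
from math import comb
from scipy.stats import binom

# P(exactly 2 heads in 5 fair tosses), computed two ways:
p_manual = comb(5, 2) * (0.5 ** 2) * (0.5 ** 3)  # C(5,2) p^2 q^3
p_scipy = binom.pmf(2, n=5, p=0.5)
print(p_manual, p_scipy)                         # both 0.3125 = 10/32
```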
Poisson
• When there is a large number of trials but a small probability of success, the binomial calculation becomes impractical.
– Example: Number of deaths from horse kicks in the Army in different years
• The mean number of successes from n trials is µ = np
– Example: 64 deaths in 20 years from thousands of soldiers
If we substitute µ/n for p, and let n tend to infinity, the binomial distribution becomes the Poisson distribution:
$$P(x) = \frac{e^{-\mu}\,\mu^{x}}{x!}$$
• Poisson distribution is applied where random events in space or time are expected to occur
• Deviation from Poisson distribution may indicate some degree of non-randomness in the events under study
• Investigation of cause may be of interest
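A small sketch comparing the two pmfs numerically (the parameters are illustrative; scipy is an assumed tool):

```python
from scipy.stats import binom, poisson

# Large n, small p: binomial(n, p) is well approximated by Poisson(mu = n p).
n, p = 10_000, 0.0003
mu = n * p                                   # mean number of successes

for x in range(7):
    print(x, binom.pmf(x, n, p), poisson.pmf(x, mu))   # nearly identical
```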
Uniform
• All (pseudo) random generators generate random deviates of the U(0,1) distribution; that is, if you generate a large number of random variables and plot their empirical distribution function, it will approach this distribution in the limit.
• The U(a,b) pdf is constant over the (a,b) interval, and the CDF is the ramp function.
[Figure: the U(0,1) pdf and the corresponding cdf plotted against time.]
Uniform distribution cdf:
$$F(x) = \begin{cases} 0, & x < a, \\[1ex] \dfrac{x-a}{b-a}, & a \le x < b, \\[1ex] 1, & x \ge b. \end{cases}$$
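A minimal sketch of the first bullet above, assuming numpy as the U(0,1) generator:

```python
import numpy as np

rng = np.random.default_rng(0)

# U(a,b) deviates from a U(0,1) generator: X = a + (b - a) * U
a, b = 2.0, 5.0
u = rng.random(100_000)          # U(0,1)
x = a + (b - a) * u              # U(a,b)

# Empirical cdf at a test point vs the ramp formula (x - a)/(b - a):
t = 3.5
print(np.mean(x <= t), (t - a) / (b - a))   # both close to 0.5
```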
Gaussian (Normal) Distribution
• Bell-shaped pdf – intuitively pleasing!
• Central Limit Theorem: the mean of a large number of mutually independent r.v.'s (having arbitrary distributions) starts following the Normal distribution as n → ∞.
• μ: mean, σ: std. deviation, σ²: variance (N(μ, σ²)).
• μ and σ completely describe the statistics. This is significant in statistical estimation/signal processing/communication theory etc.
• N(0,1) is called the normalized Gaussian.
• N(0,1) is symmetric, i.e.
   – f(x) = f(−x)
   – F(−z) = 1 − F(z).
• The failure rate h(t) follows IFR (increasing failure rate) behavior.
   – Hence, N(μ, σ²) is suitable for modeling long-term wear or aging related failure phenomena.
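A numerical sketch of the Central Limit Theorem bullet (the uniform source distribution and sample sizes are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Means of n U(0,1) r.v.'s (a decidedly non-Gaussian distribution)
# become approximately N(mu, sigma^2 / n) as n grows.
n, trials = 50, 100_000
means = rng.random((trials, n)).mean(axis=1)

mu, sigma = 0.5, np.sqrt(1 / 12)            # mean and std of U(0,1)
print(means.mean(), mu)                     # ~0.5
print(means.std(), sigma / np.sqrt(n))      # ~0.0408
```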
Conditional Distributions
• The conditional distribution of Y given X = 1 is obtained by restricting attention to the outcomes with X = 1 and renormalizing.
• While marginal distributions are obtained from the bivariate distribution by summing, conditional distributions are obtained by "making a cut" through the bivariate distribution.
The Expectation of a Random Variable
• Expectation of a discrete random variable with p.m.f. $P(X = x_i) = p_i$:
$$E[X] = \sum_i p_i x_i$$
• Expectation of a continuous random variable with p.d.f. f(x):
$$E[X] = \int_{\text{state space}} x f(x)\,dx$$
• Expectation of X = mean of X = average of X:
$$E[X] = \bar{X} = \int_{-\infty}^{\infty} x f_X(x)\,dx \;\;\text{(continuous r.v.)}, \qquad E[X] = \bar{X} = \sum_{i=1}^{N} x_i P(x_i) \;\;\text{(discrete r.v.)}$$
• If $f_X(x)$ is symmetric about $x = a$, i.e. $f_X(x+a) = f_X(-x+a)$, then $E[X] = a$.
• A function of a r.v. is a r.v.: Y = g(X). Ex: $Y = g(X) = X^2$ with
$$P(X=0) = P(X=1) = P(X=-1) = \tfrac{1}{3} \;\Rightarrow\; P(Y=0) = \tfrac{1}{3},\quad P(Y=1) = \tfrac{2}{3}.$$
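The three-point example above, checked numerically (a sketch, assuming numpy):

```python
import numpy as np

# E[X] = sum_i x_i P(x_i) and E[Y] for Y = g(X) = X^2, matching the
# three-point example above.
x = np.array([-1, 0, 1])
p = np.array([1 / 3, 1 / 3, 1 / 3])

E_X = np.sum(x * p)            # 0.0 (pmf symmetric about 0)
E_Y = np.sum((x ** 2) * p)     # 2/3 = 0 * P(Y=0) + 1 * P(Y=1)
print(E_X, E_Y)
```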
Expectation
• Expectation of a function of a r.v. X:
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx \;\;\text{(continuous r.v.)}, \qquad E[g(X)] = \sum_{i=1}^{N} g(x_i) P(x_i) \;\;\text{(discrete r.v.)}$$
• Conditional expectation of a r.v. X:
$$E[X \mid B] = \int_{-\infty}^{\infty} x f_X(x \mid B)\,dx \;\;\text{(continuous r.v.)}, \qquad E[X \mid B] = \sum_{i=1}^{N} x_i P(x_i \mid B) \;\;\text{(discrete r.v.)}$$
• Ex: $B = \{X \le b\}$:
$$f_X(x \mid X \le b) = \begin{cases} \dfrac{f_X(x)}{\int_{-\infty}^{b} f_X(x)\,dx}, & x \le b, \\[2ex] 0, & x > b, \end{cases} \qquad\quad E[X \mid X \le b] = \frac{\int_{-\infty}^{b} x f_X(x)\,dx}{\int_{-\infty}^{b} f_X(x)\,dx}.$$
Moments
• n-th moment of a r.v. X:
$$m_n = E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\,dx \;\;\text{(continuous r.v.)}, \qquad m_n = E[X^n] = \sum_{i=1}^{N} x_i^n P(x_i) \;\;\text{(discrete r.v.)}$$
• $m_0 = 1$, $m_1 = \bar{X}$.
• Properties of expectation:
   (1) $E[c] = c$ for a constant c.
   (2) $E[a\,g(X) + b\,h(X)] = a\,E[g(X)] + b\,E[h(X)]$.
• Proof: $E[c] = \int c f_X(x)\,dx = c \int f_X(x)\,dx = c$, and
$$E[a\,g(X) + b\,h(X)] = \int \{a\,g(x) + b\,h(x)\}\, f_X(x)\,dx = a \int g(x) f_X(x)\,dx + b \int h(x) f_X(x)\,dx = a\,E[g(X)] + b\,E[h(X)].$$
• Variance of a r.v. X:
$$\sigma_X^2 = E[(X - \bar{X})^2] = E[X^2 - 2X\bar{X} + \bar{X}^2] = E[X^2] - \bar{X}^2 = m_2 - m_1^2$$
• Standard deviation of a r.v. X: $\sigma_X \ge 0$.
• Skewness of a r.v. X: $E[(X - \bar{X})^3] / \sigma_X^3$.
• Ex 3.2-1 & Ex 3.2-2: if $f_X(x)$ is symmetric about 0, then $\bar{X} = 0$, $E[X^3] = 0$, and the skewness is 0.
• Exponential r.v.:
$$f_X(x) = \begin{cases} \dfrac{1}{b}\, e^{-(x-a)/b}, & x > a, \\[1ex] 0, & x < a. \end{cases}$$
$$m_1 = E[X] = \int_a^{\infty} x\, \frac{1}{b}\, e^{-(x-a)/b}\,dx = a + b$$
$$m_2 = E[X^2] = \int_a^{\infty} x^2\, \frac{1}{b}\, e^{-(x-a)/b}\,dx = a^2 + 2ab + 2b^2, \qquad \sigma_X^2 = m_2 - m_1^2 = b^2$$
$$m_3 = E[X^3] = \int_a^{\infty} x^3\, \frac{1}{b}\, e^{-(x-a)/b}\,dx = a^3 + 3a^2 b + 6ab^2 + 6b^3$$
$$E[(X-\bar{X})^3] = E[X^3 - 3X^2\bar{X} + 3X\bar{X}^2 - \bar{X}^3] = m_3 - 3m_1 m_2 + 2m_1^3$$
$$= a^3 + 3a^2 b + 6ab^2 + 6b^3 - 3(a+b)(a^2 + 2ab + 2b^2) + 2(a+b)^3 = 2b^3$$
• Skewness of the exponential r.v.: $E[(X-\bar{X})^3]/\sigma_X^3 = 2b^3/b^3 = 2$.
Chebyshev's inequality:
$$\sigma_X^2 = \int_{-\infty}^{\infty} (x-\bar{X})^2 f_X(x)\,dx \ge \int_{|x-\bar{X}| \ge \varepsilon} (x-\bar{X})^2 f_X(x)\,dx \ge \varepsilon^2\, P[|X-\bar{X}| \ge \varepsilon]$$
$$\Rightarrow \quad P[|X - \bar{X}| \ge \varepsilon] \le \frac{\sigma_X^2}{\varepsilon^2}.$$
Markov's inequality (for $X \ge 0$, i.e. $P[X < 0] = 0$):
$$P[X \ge a] \le \frac{E[X]}{a}.$$
Ex 3.2-3:
$$P[|X - \bar{X}| \ge 3\sigma_X] \le \frac{\sigma_X^2}{9\sigma_X^2} = \frac{1}{9}.$$
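An empirical sketch of the Chebyshev bound (the exponential test distribution is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Chebyshev check: P[|X - mean| >= 3 sigma] <= 1/9 for any distribution.
x = rng.exponential(scale=2.0, size=1_000_000)   # exponential r.v. (b = 2)
mean, sigma = x.mean(), x.std()

tail = np.mean(np.abs(x - mean) >= 3 * sigma)
print(tail, 1 / 9)   # the empirical tail is well below the bound
```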
Functions That Give Moments
Characteristic function of a r.v. X:
$$\Phi_X(\omega) = E[e^{j\omega X}] = \int_{-\infty}^{\infty} f_X(x)\,e^{j\omega x}\,dx,$$
which is a Fourier transform relationship; the inverse transform gives
$$f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi_X(\omega)\,e^{-j\omega x}\,d\omega.$$
$$|\Phi_X(\omega)| \le \int_{-\infty}^{\infty} f_X(x)\,dx = 1 = \Phi_X(0)$$
Moments from the characteristic function:
$$\frac{d^n \Phi_X(\omega)}{d\omega^n}\bigg|_{\omega=0} = \int_{-\infty}^{\infty} f_X(x)\,(jx)^n\,dx = j^n\,E[X^n] \quad\Rightarrow\quad m_n = (-j)^n\, \frac{d^n \Phi_X(\omega)}{d\omega^n}\bigg|_{\omega=0}.$$
Moment generating function of a r.v. X:
$$M_X(v) = E[e^{vX}] = \int_{-\infty}^{\infty} f_X(x)\,e^{vx}\,dx$$
$$\frac{d^n M_X(v)}{dv^n}\bigg|_{v=0} = \int_{-\infty}^{\infty} f_X(x)\,x^n\,e^{vx}\,dx\,\bigg|_{v=0} = \int_{-\infty}^{\infty} f_X(x)\,x^n\,dx = m_n.$$
Ex 3.3-1 & Ex 3.3-2: for the exponential r.v.
$$f_X(x) = \begin{cases} \dfrac{1}{b}\, e^{-(x-a)/b}, & x > a, \\[1ex] 0, & x < a, \end{cases}$$
$$\Phi_X(\omega) = E[e^{j\omega X}] = \int_a^{\infty} \frac{1}{b}\, e^{-(x-a)/b}\, e^{j\omega x}\,dx = \frac{e^{j\omega a}}{1 - j\omega b}.$$
$$\frac{d\Phi_X(\omega)}{d\omega} = \frac{ja\, e^{j\omega a}(1 - j\omega b) + jb\, e^{j\omega a}}{(1 - j\omega b)^2} \quad\Rightarrow\quad m_1 = (-j)\,\frac{d\Phi_X(\omega)}{d\omega}\bigg|_{\omega=0} = a + b.$$
$$M_X(v) = E[e^{vX}] = \frac{e^{va}}{1 - vb}$$
$$\frac{dM_X(v)}{dv} = \frac{a\, e^{va}(1 - vb) + b\, e^{va}}{(1 - vb)^2} \quad\Rightarrow\quad m_1 = \frac{dM_X(v)}{dv}\bigg|_{v=0} = a + b.$$
Ex 3.3-3, Chernoff's inequality:
$$P[X \ge a] = \int_a^{\infty} f_X(x)\,dx = \int_{-\infty}^{\infty} f_X(x)\,u(x-a)\,dx \le \int_{-\infty}^{\infty} f_X(x)\,e^{v(x-a)}\,dx = e^{-va}\,M_X(v), \quad v > 0.$$
Transformations of a Random Variable
• $Y = T(X)$: given $f_X(x)$, find $f_Y(y)$.
• Monotone increasing: $T(x_1) < T(x_2)$ for any $x_1 < x_2$. Monotone decreasing: $T(x_1) > T(x_2)$ for any $x_1 < x_2$.
• Assume $Y = T(X)$ with $T(\cdot)$ monotone increasing. Then
$$F_Y(y_0) = P[Y \le y_0] = P[X \le x_0] = F_X(x_0), \qquad x_0 = T^{-1}(y_0),$$
$$\int_{-\infty}^{y_0} f_Y(y)\,dy = \int_{-\infty}^{T^{-1}(y_0)} f_X(x)\,dx \quad\Rightarrow\quad f_Y(y) = f_X\!\left(T^{-1}(y)\right)\frac{dT^{-1}(y)}{dy} = f_X(x)\,\frac{dx}{dy}.$$
• Assume $Y = T(X)$ with $T(\cdot)$ monotone decreasing. Then
$$F_Y(y_0) = P[Y \le y_0] = P[X \ge x_0] = 1 - F_X(x_0) \quad\Rightarrow\quad f_Y(y) = -f_X(x)\,\frac{dx}{dy}.$$
• For monotone $T(\cdot)$ (increasing or decreasing), both cases combine into
$$f_Y(y) = f_X(x)\,\left|\frac{dx}{dy}\right| = \frac{f_X(x)}{\left|dy/dx\right|}.$$
• Nonmonotone $T(\cdot)$: sum over all roots $x_n$ of $y = T(x)$:
$$f_Y(y) = \sum_n \frac{f_X(x_n)}{\left|\dfrac{dT(x)}{dx}\right|_{x = x_n}}.$$
• Ex 3.4-2: $Y = T(X) = cX^2$ (nonmonotone). For $y > 0$ the roots are $x = \pm\sqrt{y/c}$, so
$$f_Y(y) = \frac{f_X\!\left(\sqrt{y/c}\right) + f_X\!\left(-\sqrt{y/c}\right)}{2\sqrt{cy}}, \quad y > 0,$$
and $f_Y(y) = 0$ for $y < 0$.
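A simulation sketch of Ex 3.4-2 (the standard normal input and the point of comparison are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Y = c X^2 with X ~ N(0,1): compare a histogram density estimate of f_Y
# with f_Y(y) = [f_X(sqrt(y/c)) + f_X(-sqrt(y/c))] / (2 sqrt(c y)).
c = 2.0
x = rng.standard_normal(1_000_000)
y = c * x ** 2

def f_X(t):
    return np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)

y0, h = 1.5, 0.05
f_Y = (f_X(np.sqrt(y0 / c)) + f_X(-np.sqrt(y0 / c))) / (2 * np.sqrt(c * y0))
print(np.mean((y > y0 - h) & (y < y0 + h)) / (2 * h), f_Y)   # agree
```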
MULTIPLE RANDOM VARIABLES AND OPERATIONS: MULTIPLE RANDOM VARIABLES
Vector Random Variables
A vector random variable X is a function that assigns a vector of real numbers to each outcome ζ in S, the sample space of the random experiment.
Events and Probabilities
EXAMPLE 4.4
Consider the two-dimensional random variable X = (X, Y). Find the region of the plane corresponding to the events
$$A = \{X + Y \le 10\}, \qquad B = \{\min(X, Y) \le 5\}, \qquad C = \{X^2 + Y^2 \le 100\}.$$
The regions corresponding to events A and C are straightforward to find and are shown in Fig. 4.1.
Independence
The one-dimensional random variables X and Y are "independent" if, for A1 any event that involves X only and A2 any event that involves Y only,
$$P[X \text{ in } A_1,\; Y \text{ in } A_2] = P[X \text{ in } A_1]\;P[Y \text{ in } A_2].$$
In the general case of n random variables, we say that the random variables X1, X2, ..., Xn are independent if
$$P[X_1 \text{ in } A_1, \dots, X_n \text{ in } A_n] = P[X_1 \text{ in } A_1] \cdots P[X_n \text{ in } A_n], \tag{4.3}$$
where the Ak is an event that involves Xk only.
Pairs of Discrete Random Variables
Let the vector random variable X = (X, Y) assume values from some countable set $S = \{(x_j, y_k),\; j = 1, 2, \dots,\; k = 1, 2, \dots\}$. The joint probability mass function of X specifies the probabilities of the product-form event $\{X = x_j\} \cap \{Y = y_k\}$:
$$p_{X,Y}(x_j, y_k) = P[\{X = x_j\} \cap \{Y = y_k\}] = P[X = x_j,\; Y = y_k], \quad j = 1, 2, \dots,\; k = 1, 2, \dots \tag{4.4}$$
The probability of any event A is the sum of the pmf over the outcomes in A:
$$P[X \text{ in } A] = \sum_{(x_j,\, y_k) \text{ in } A} p_{X,Y}(x_j, y_k) \tag{4.5}$$
$$\sum_{j=1}^{\infty} \sum_{k=1}^{\infty} p_{X,Y}(x_j, y_k) = 1 \tag{4.6}$$
The marginal probability mass functions:
$$p_X(x_j) = P[X = x_j] = P[X = x_j,\; Y = \text{anything}] = \sum_{k=1}^{\infty} p_{X,Y}(x_j, y_k) \tag{4.7a}$$
$$p_Y(y_k) = P[Y = y_k] = \sum_{j=1}^{\infty} p_{X,Y}(x_j, y_k) \tag{4.7b}$$
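A sketch of a joint pmf and its marginals in code (the grid of probability values is an illustrative assumption):

```python
import numpy as np

# Illustrative joint pmf p_{X,Y}(x_j, y_k) on a 2 x 3 grid (values assumed).
p = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.15, 0.20]])
assert np.isclose(p.sum(), 1.0)       # Eq. (4.6)

p_X = p.sum(axis=1)   # marginal over y_k: Eq. (4.7a)
p_Y = p.sum(axis=0)   # marginal over x_j: Eq. (4.7b)
print(p_X, p_Y)

# Independence would require p[j, k] = p_X[j] p_Y[k] for all j, k:
print(np.allclose(p, np.outer(p_X, p_Y)))   # False for this pmf
```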
The Joint cdf of X and Y
The joint cumulative distribution function of X and Y is defined as the probability of the product-form event $\{X \le x_1\} \cap \{Y \le y_1\}$:
$$F_{X,Y}(x_1, y_1) = P[X \le x_1,\; Y \le y_1]. \tag{4.8}$$
The joint cdf is nondecreasing in the "northeast" direction:
(i) $F_{X,Y}(x_1, y_1) \le F_{X,Y}(x_2, y_2)$ if $x_1 \le x_2$ and $y_1 \le y_2$.
It is impossible for either X or Y to assume a value less than $-\infty$, therefore
(ii) $F_{X,Y}(-\infty, y_1) = F_{X,Y}(x_1, -\infty) = 0$.
It is certain that X and Y will assume values less than infinity, therefore
(iii) $F_{X,Y}(\infty, \infty) = 1$.
If we let one of the variables approach infinity while keeping the other fixed, we obtain the marginal cumulative distribution functions:
(iv) $F_X(x) = F_{X,Y}(x, \infty) = P[X \le x]$ and $F_Y(y) = F_{X,Y}(\infty, y) = P[Y \le y]$.
Recall that the cdf for a single random variable is continuous from the right. It can be shown that the joint cdf is continuous from the "north" and from the "east":
(v) $\displaystyle \lim_{x \to a^{+}} F_{X,Y}(x, y) = F_{X,Y}(a, y)$ and $\displaystyle \lim_{y \to b^{+}} F_{X,Y}(x, y) = F_{X,Y}(x, b)$.
The Joint pdf of Two Jointly Continuous Random Variables
We say that the random variables X and Y are jointly continuous if the probabilities of events involving (X, Y) can be expressed as an integral of a pdf. There is a nonnegative function f_{X,Y}(x,y), called the joint probability density function, that is defined on the real plane such that for every event A, a subset of the plane,
$$P[X \text{ in } A] = \int\!\!\int_A f_{X,Y}(x', y')\,dx'\,dy', \tag{4.9}$$
as shown in Fig. 4.7. When A is the entire plane, the integral must equal one:
$$1 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f_{X,Y}(x', y')\,dx'\,dy'. \tag{4.10}$$
The joint cdf can be obtained in terms of the joint pdf of jointly continuous random variables by integrating over the semi-infinite rectangle defined by (x, y). The marginal pdf's f_X(x) and f_Y(y) are obtained by taking the derivative of the corresponding marginal cdf's, $F_X(x) = F_{X,Y}(x, \infty)$ and $F_Y(y) = F_{X,Y}(\infty, y)$:
$$f_X(x) = \frac{d}{dx} \int_{-\infty}^{x}\!\left[\int_{-\infty}^{\infty} f_{X,Y}(x', y')\,dy'\right] dx' = \int_{-\infty}^{\infty} f_{X,Y}(x, y')\,dy' \tag{4.15a}$$
$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x', y)\,dx'. \tag{4.15b}$$
INDEPENDENCE OF TWO RANDOM VARIABLES
X and Y are independent random variables if any event A1 defined in terms of X is independent of any event A2 defined in terms of Y:
$$P[X \text{ in } A_1,\; Y \text{ in } A_2] = P[X \text{ in } A_1]\;P[Y \text{ in } A_2]. \tag{4.17}$$
Suppose that X and Y are a pair of discrete random variables. If we let $A_1 = \{X = x_j\}$ and $A_2 = \{Y = y_k\}$, then the independence of X and Y implies that
$$p_{X,Y}(x_j, y_k) = P[X = x_j,\; Y = y_k] = P[X = x_j]\,P[Y = y_k] = p_X(x_j)\,p_Y(y_k) \quad \text{for all } x_j \text{ and } y_k. \tag{4.18}$$
4.4 CONDITIONAL PROBABILITY AND CONDITIONAL EXPECTATION
Conditional Probability
In Section 2.4, we know
$$P[Y \text{ in } A \mid X = x] = \frac{P[Y \text{ in } A,\; X = x]}{P[X = x]}. \tag{4.22}$$
If X is discrete, then Eq. (4.22) can be used to obtain the conditional cdf of Y given X = x_k:
$$F_Y(y \mid x_k) = \frac{P[Y \le y,\; X = x_k]}{P[X = x_k]}, \quad \text{for } P[X = x_k] > 0. \tag{4.23}$$
The conditional pdf of Y given X = x_k, if the derivative exists, is given by
$$f_Y(y \mid x_k) = \frac{d}{dy} F_Y(y \mid x_k). \tag{4.24}$$
MULTIPLE RANDOM VARIABLES
Joint Distributions
The joint cumulative distribution function of X1, X2, ..., Xn is defined as the probability of an n-dimensional semi-infinite rectangle associated with the point (x1, ..., xn):
$$F_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = P[X_1 \le x_1,\; X_2 \le x_2, \dots,\; X_n \le x_n]. \tag{4.38}$$
The joint cdf is defined for discrete, continuous, and random variables of mixed type.
FUNCTIONS OF SEVERAL RANDOM VARIABLES
One Function of Several Random Variables
Let the random variable Z be defined as a function of several random variables:
$$Z = g(X_1, X_2, \dots, X_n). \tag{4.51}$$
The cdf of Z is found by first finding the equivalent event of $\{Z \le z\}$, that is, the set $R_z = \{\mathbf{x} : g(\mathbf{x}) \le z\}$; then
$$F_Z(z) = P[\mathbf{X} \text{ in } R_z] = \int\!\cdots\!\int_{\mathbf{x'} \text{ in } R_z} f_{X_1, \dots, X_n}(x_1', \dots, x_n')\,dx_1' \cdots dx_n'. \tag{4.52}$$
EXAMPLE 4.31 Sum of Two Random Variables
Let Z = X + Y. Find F_Z(z) and f_Z(z) in terms of the joint pdf of X and Y.
The cdf of Z is
$$F_Z(z) = \int_{-\infty}^{\infty} \int_{-\infty}^{z - x'} f_{X,Y}(x', y')\,dy'\,dx'.$$
The pdf of Z is
$$f_Z(z) = \frac{d}{dz} F_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(x', z - x')\,dx'. \tag{4.53}$$
Thus the pdf for the sum of two random variables is given by a superposition integral. If X and Y are independent random variables, then by Eq. (4.21) the pdf is given by the convolution integral of the marginal pdf's of X and Y:
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x')\,f_Y(z - x')\,dx'. \tag{4.54}$$
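A simulation sketch of Eq. (4.54) (the uniform marginals are an illustrative assumption; the convolution of two U(0,1) pdf's is the triangular pdf on (0, 2)):

```python
import numpy as np

rng = np.random.default_rng(4)

# Z = X + Y for independent X, Y ~ U(0,1): f_Z is the convolution of the
# marginals, i.e. the triangular pdf on (0, 2). Empirical check at z0 = 1:
z = rng.random(1_000_000) + rng.random(1_000_000)

z0, h = 1.0, 0.02
empirical = np.mean((z > z0 - h) & (z < z0 + h)) / (2 * h)
triangular = 1.0 - abs(z0 - 1.0)      # analytic convolution result
print(empirical, triangular)          # both close to 1.0
```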
pdf of Linear Transformations
We consider first the linear transformation of two random variables:
$$V = aX + bY, \qquad W = cX + eY, \qquad \text{i.e.} \qquad \begin{pmatrix} V \\ W \end{pmatrix} = \begin{pmatrix} a & b \\ c & e \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}.$$
Denote the above matrix by A. We will assume A has an inverse, so each point (v, w) has a unique corresponding point (x, y) obtained from
$$\begin{pmatrix} x \\ y \end{pmatrix} = A^{-1} \begin{pmatrix} v \\ w \end{pmatrix}. \tag{4.56}$$
In Fig. 4.15, the infinitesimal rectangle and the parallelogram are equivalent events, so their probabilities must be equal. Thus
$$f_{V,W}(v, w)\,dP = f_{X,Y}(x, y)\,dx\,dy \quad\Rightarrow\quad f_{V,W}(v, w) = \frac{f_{X,Y}(x, y)}{dP / (dx\,dy)}, \tag{4.57}$$
where dP is the area of the parallelogram. It can be shown that $dP = |ae - bc|\,dx\,dy$, so the "stretch factor" is
$$\frac{dP}{dx\,dy} = |ae - bc| = |A|,$$
where |A| is the determinant of A. The joint pdf of V and W is thus given by
$$f_{V,W}(v, w) = \frac{f_{X,Y}(x, y)}{|A|},$$
where x and y are related to (v, w) by Eq. (4.56). Let the n-dimensional vector Z be
$$\mathbf{Z} = A\mathbf{X},$$
where A is an n × n invertible matrix. The joint pdf of Z is then
$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{f_{\mathbf{X}}(A^{-1}\mathbf{z})}{|A|}.$$
EXPECTED VALUE OF FUNCTIONS OF RANDOM VARIABLES
The expected value of Z = g(X, Y) can be found using the following expressions:
$$E[Z] = \begin{cases} \displaystyle \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g(x, y)\,f_{X,Y}(x, y)\,dx\,dy, & X, Y \text{ jointly continuous,} \\[2ex] \displaystyle \sum_i \sum_n g(x_i, y_n)\,p_{X,Y}(x_i, y_n), & X, Y \text{ discrete.} \end{cases} \tag{4.64}$$
*Joint Characteristic Function
The joint characteristic function of n random variables is defined as
$$\Phi_{X_1, X_2, \dots, X_n}(w_1, w_2, \dots, w_n) = E\!\left[e^{j(w_1 X_1 + w_2 X_2 + \dots + w_n X_n)}\right]. \tag{4.73a}$$
For two random variables:
$$\Phi_{X,Y}(w_1, w_2) = E\!\left[e^{j(w_1 X + w_2 Y)}\right]. \tag{4.73b}$$
If X and Y are jointly continuous random variables, then
$$\Phi_{X,Y}(w_1, w_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f_{X,Y}(x, y)\,e^{j(w_1 x + w_2 y)}\,dx\,dy. \tag{4.73c}$$
The inversion formula for the Fourier transform implies that the joint pdf is given by
$$f_{X,Y}(x, y) = \frac{1}{4\pi^2} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \Phi_{X,Y}(w_1, w_2)\,e^{-j(w_1 x + w_2 y)}\,dw_1\,dw_2. \tag{4.74}$$
JOINTLY GAUSSIAN RANDOM VARIABLES
The random variables X and Y are said to be jointly Gaussian if their joint pdf has the form
$$f_{X,Y}(x, y) = \frac{\exp\!\left\{ \dfrac{-1}{2(1 - \rho_{X,Y}^2)} \left[ \left(\dfrac{x - m_1}{\sigma_1}\right)^{\!2} - \dfrac{2\rho_{X,Y}(x - m_1)(y - m_2)}{\sigma_1 \sigma_2} + \left(\dfrac{y - m_2}{\sigma_2}\right)^{\!2} \right] \right\}}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho_{X,Y}^2}}, \quad -\infty < x, y < \infty. \tag{4.79}$$
The pdf is constant for values x and y for which the argument of the exponent is constant:
$$\left(\frac{x - m_1}{\sigma_1}\right)^{\!2} - \frac{2\rho_{X,Y}(x - m_1)(y - m_2)}{\sigma_1 \sigma_2} + \left(\frac{y - m_2}{\sigma_2}\right)^{\!2} = \text{constant},$$
i.e. the contours of equal pdf value form an ellipse.
When ρ_{X,Y} = 0, X and Y are independent; when ρ_{X,Y} ≠ 0, the major axis of the ellipse is oriented along the angle
$$\theta = \frac{1}{2} \arctan\!\left( \frac{2\rho_{X,Y}\,\sigma_1 \sigma_2}{\sigma_1^2 - \sigma_2^2} \right). \tag{4.80}$$
Note that the angle is 45º when the variances are equal.
The marginal pdf of X is found by integrating f_{X,Y}(x, y) over all y:
$$f_X(x) = \frac{e^{-(x - m_1)^2 / 2\sigma_1^2}}{\sqrt{2\pi}\,\sigma_1}, \tag{4.81}$$
that is, X is a Gaussian random variable with mean m1 and variance σ1².
n Jointly Gaussian Random Variables
The random variables X1, X2, ..., Xn are said to be jointly Gaussian if their joint pdf is given by
$$f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = \frac{\exp\!\left\{ -\tfrac{1}{2} (\mathbf{x} - \mathbf{m})^{T} K^{-1} (\mathbf{x} - \mathbf{m}) \right\}}{(2\pi)^{n/2}\, |K|^{1/2}}, \tag{4.83}$$
where x and m are column vectors defined by
$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad \mathbf{m} = \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_n \end{pmatrix} = \begin{pmatrix} E[X_1] \\ E[X_2] \\ \vdots \\ E[X_n] \end{pmatrix},$$
and K is the covariance matrix that is defined by
$$K = \begin{pmatrix} \mathrm{VAR}(X_1) & \mathrm{COV}(X_1, X_2) & \cdots & \mathrm{COV}(X_1, X_n) \\ \mathrm{COV}(X_2, X_1) & \mathrm{VAR}(X_2) & \cdots & \mathrm{COV}(X_2, X_n) \\ \vdots & \vdots & & \vdots \\ \mathrm{COV}(X_n, X_1) & \mathrm{COV}(X_n, X_2) & \cdots & \mathrm{VAR}(X_n) \end{pmatrix}. \tag{4.84}$$
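A sketch evaluating Eq. (4.83) for n = 2 (the mean vector, covariance matrix, and test point are illustrative assumptions; scipy is an assumed tool):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Evaluate the jointly Gaussian pdf (4.83) directly and via scipy.
m = np.array([1.0, -2.0])             # mean vector
K = np.array([[2.0, 0.6],
              [0.6, 1.0]])            # covariance matrix

x = np.array([0.5, -1.5])
d = x - m
pdf_manual = np.exp(-0.5 * d @ np.linalg.inv(K) @ d) / (
    (2 * np.pi) ** (len(m) / 2) * np.sqrt(np.linalg.det(K)))
pdf_scipy = multivariate_normal(mean=m, cov=K).pdf(x)
print(pdf_manual, pdf_scipy)          # identical
```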
Transformations of Random Vectors
Let X1, ..., Xn be random variables associated with some experiment, and let the random variables Z1, ..., Zn be defined by n functions of X = (X1, ..., Xn):
$$Z_1 = g_1(\mathbf{X}), \quad Z_2 = g_2(\mathbf{X}), \quad \dots, \quad Z_n = g_n(\mathbf{X}).$$
The joint cdf of Z1, ..., Zn at the point z = (z1, ..., zn) is equal to the probability of the region of x where $g_k(\mathbf{x}) \le z_k$ for $k = 1, \dots, n$:
$$F_{Z_1, \dots, Z_n}(z_1, \dots, z_n) = P[g_1(\mathbf{X}) \le z_1, \dots, g_n(\mathbf{X}) \le z_n] \tag{4.55a}$$
$$F_{Z_1, \dots, Z_n}(z_1, \dots, z_n) = \int\!\cdots\!\int_{\mathbf{x'}:\; g_k(\mathbf{x'}) \le z_k} f_{X_1, \dots, X_n}(x_1', \dots, x_n')\,dx_1' \cdots dx_n'. \tag{4.55b}$$
Stochastic Processes
Let ξ denote the random outcome of an experiment. To every such outcome suppose a waveform X(t, ξ) is assigned. The collection of such waveforms forms a stochastic process. The set of {ξ_k} and the time index t can be continuous or discrete (countably infinite or finite) as well.
For fixed ξ_i ∈ S (the set of all experimental outcomes), X(t, ξ_i) is a specific time function. For fixed t = t_1, X_1 = X(t_1, ξ) is a random variable. The ensemble of all such realizations X(t, ξ) over time represents the stochastic process X(t). (Fig. 14.1 shows an ensemble of realizations X(t, ξ_1), X(t, ξ_2), ..., X(t, ξ_n) plotted against t.) For example,
$$X(t) = a\cos(\omega_0 t + \varphi),$$
with φ a random variable, represents a stochastic process.
If X(t) is a stochastic process, then for fixed t, X(t) represents a random variable. Its distribution function is given by
$$F_X(x, t) = P\{X(t) \le x\}.$$
Notice that $F_X(x, t)$ depends on t, since for a different t we obtain a different random variable. Further,
$$f_X(x, t) = \frac{dF_X(x, t)}{dx}$$
represents the first-order probability density function of the process X(t).
For t = t1 and t = t2, X(t) represents two different random variables X1 = X(t1) and X2 = X(t2) respectively. Their joint distribution is given by
$$F_X(x_1, x_2, t_1, t_2) = P\{X(t_1) \le x_1,\; X(t_2) \le x_2\},$$
and
$$f_X(x_1, x_2, t_1, t_2) = \frac{\partial^2 F_X(x_1, x_2, t_1, t_2)}{\partial x_1\, \partial x_2}$$
represents the second-order density function of the process X(t). Similarly $f_X(x_1, x_2, \dots, x_n, t_1, t_2, \dots, t_n)$ represents the nth-order density function of the process X(t). Complete specification of the stochastic process X(t) requires the knowledge of $f_X(x_1, x_2, \dots, x_n, t_1, t_2, \dots, t_n)$ for all $t_i,\; i = 1, 2, \dots, n$ and for all n (an almost impossible task in reality).
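A small sketch generating an ensemble for the cosine example (the uniform phase and parameter values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)

# Ensemble of realizations of X(t) = a cos(w0 t + phi): each outcome
# phi_k of the experiment selects one complete waveform X(t, phi_k).
a, w0 = 1.0, 2 * np.pi
t = np.linspace(0, 2, 401)

phi = rng.uniform(0, 2 * np.pi, size=5)              # 5 outcomes
ensemble = a * np.cos(w0 * t[None, :] + phi[:, None])
print(ensemble.shape)     # (5, 401): 5 realizations over 401 time points

# For fixed t (here t = t[100]), X(t) is a random variable across the ensemble:
print(ensemble[:, 100])
```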
Mean of a Stochastic Process:
$$\mu_X(t) = E\{X(t)\} = \int_{-\infty}^{\infty} x\,f_X(x, t)\,dx$$
represents the mean value of the process X(t). In general, the mean of a process can depend on the time index t.
Autocorrelation function of a process X(t) is defined as
$$R_{XX}(t_1, t_2) = E\{X(t_1)\,X^*(t_2)\} = \int\!\!\int x_1 x_2^*\, f_X(x_1, x_2, t_1, t_2)\,dx_1\,dx_2,$$
and it represents the interrelationship between the random variables X1 = X(t1) and X2 = X(t2) generated from the process X(t).
Properties:
1. $R_{XX}(t_1, t_2) = R_{XX}^*(t_2, t_1) = [E\{X(t_2)\,X^*(t_1)\}]^*$
2. $R_{XX}(t, t) = E\{|X(t)|^2\} \ge 0.$
3. $R_{XX}(t_1, t_2)$ represents a nonnegative definite function, i.e., for any set of constants $\{a_i\}_{i=1}^{n}$,
$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i\, a_j^*\, R_{XX}(t_i, t_j) \ge 0. \tag{14-8}$$
Eq. (14-8) follows by noticing that $E\{|Y|^2\} \ge 0$ for $Y = \sum_{i=1}^{n} a_i X(t_i)$.
The function
$$C_{XX}(t_1, t_2) = R_{XX}(t_1, t_2) - \mu_X(t_1)\,\mu_X^*(t_2) \tag{14-9}$$
represents the autocovariance function of the process X(t).
Example 14.1
Let
$$z = \int_{-T}^{T} X(t)\,dt.$$
Then
$$E[|z|^2] = \int_{-T}^{T}\!\int_{-T}^{T} E\{X(t_1)\,X^*(t_2)\}\,dt_1\,dt_2 = \int_{-T}^{T}\!\int_{-T}^{T} R_{XX}(t_1, t_2)\,dt_1\,dt_2. \tag{14-10}$$
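A sketch estimating the mean and autocorrelation by ensemble averages (the cosine process with uniform phase is an assumed example):

```python
import numpy as np

rng = np.random.default_rng(6)

# Ensemble estimates of the mean and autocorrelation of
# X(t) = cos(w0 t + phi) with phi ~ U(0, 2 pi).
w0, n_real = 2 * np.pi, 200_000
t1, t2 = 0.3, 0.8

phi = rng.uniform(0, 2 * np.pi, size=n_real)
x1 = np.cos(w0 * t1 + phi)
x2 = np.cos(w0 * t2 + phi)

print(x1.mean())                        # ~0: mean is constant (zero)
print(np.mean(x1 * x2))                 # ensemble-average R_XX(t1, t2)
print(0.5 * np.cos(w0 * (t1 - t2)))     # analytic: depends only on t1 - t2
```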
Stationary Stochastic Processes
Stationary processes exhibit statistical properties that are invariant to a shift in the time index. Thus, for example, second-order stationarity implies that the statistical properties of the pairs {X(t1), X(t2)} and {X(t1+c), X(t2+c)} are the same for any c. Similarly, first-order stationarity implies that the statistical properties of X(ti) and X(ti+c) are the same for any c.
In strict terms, the statistical properties are governed by the joint probability density function. Hence a process is nth-order Strict-Sense Stationary (S.S.S) if
$$f_X(x_1, x_2, \dots, x_n, t_1, t_2, \dots, t_n) = f_X(x_1, x_2, \dots, x_n, t_1 + c, t_2 + c, \dots, t_n + c) \tag{14-14}$$
for any c, where the left side represents the joint density function of the random variables $X_1 = X(t_1),\, X_2 = X(t_2),\, \dots,\, X_n = X(t_n)$ and the right side corresponds to the joint density function of the random variables $X_1' = X(t_1 + c),\, X_2' = X(t_2 + c),\, \dots,\, X_n' = X(t_n + c)$. A process X(t) is said to be strict-sense stationary if (14-14) is true for all $t_i,\; i = 1, 2, \dots, n$, $n = 1, 2, \dots$, and any c.
For a first-order strict-sense stationary process, from (14-14) we have
$$f_X(x, t) = f_X(x, t + c) \tag{14-15}$$
for any c. In particular, c = –t gives
$$f_X(x, t) = f_X(x), \tag{14-16}$$
i.e., the first-order density of X(t) is independent of t. In that case
$$E[X(t)] = \int_{-\infty}^{\infty} x\,f_X(x)\,dx = \mu, \quad \text{a constant.} \tag{14-17}$$
Similarly, for a second-order strict-sense stationary process we have from (14-14)
$$f_X(x_1, x_2, t_1, t_2) = f_X(x_1, x_2, t_1 + c, t_2 + c)$$
for any c. For c = –t2 we get
$$f_X(x_1, x_2, t_1, t_2) = f_X(x_1, x_2, t_1 - t_2), \tag{14-18}$$
i.e., the second-order density function of a strict-sense stationary process depends only on the difference of the time indices $t_1 - t_2$. In that case the autocorrelation function is given by
$$R_{XX}(t_1, t_2) = E\{X(t_1)\,X^*(t_2)\} = \int\!\!\int x_1 x_2^*\, f_X(x_1, x_2, \tau = t_1 - t_2)\,dx_1\,dx_2 = R_{XX}(t_1 - t_2) = R_{XX}(\tau) = R_{XX}^*(-\tau), \tag{14-19}$$
i.e., the autocorrelation function of a second-order strict-sense stationary process depends only on the difference of the time indices $t_1 - t_2$. Notice that (14-17) and (14-19) are consequences of the stochastic process being first and second-order strict-sense stationary. On the other hand, the basic conditions for the first and second order stationarity – Eqs. (14-16) and (14-18) – are usually difficult to verify. In that case, we often resort to a looser definition of stationarity, known as Wide-Sense Stationarity (W.S.S), by making use of
(14-17) and (14-19) as the necessary conditions. Thus, a process X(t) is said to be Wide-Sense Stationary if
(i) $E\{X(t)\} = \mu$, a constant, (14-20)
and
(ii) $E\{X(t_1)\,X^*(t_2)\} = R_{XX}(t_1 - t_2)$, (14-21)
i.e., for wide-sense stationary processes, the mean is a constant and the autocorrelation function depends only on the difference between the time indices. Notice that (14-20)-(14-21) do not say anything about the nature of the probability density functions, and instead deal with the average behavior of the process. Since (14-20)-(14-21) follow from (14-16) and (14-18), strict-sense stationarity always implies wide-sense stationarity. However, the converse is not true in general, the only exception being the Gaussian process. This follows, since if X(t) is a Gaussian process, then by definition $X_1 = X(t_1),\, X_2 = X(t_2),\, \dots,\, X_n = X(t_n)$ are jointly Gaussian random variables for any $t_1, t_2, \dots, t_n$, whose joint characteristic function is given by
$$\Phi_X(\omega_1, \omega_2, \dots, \omega_n) = e^{\,j \sum_{k=1}^{n} \mu_X(t_k)\,\omega_k \;-\; \frac{1}{2} \sum_{l=1}^{n}\sum_{k=1}^{n} C_{XX}(t_l, t_k)\,\omega_l \omega_k}, \tag{14-22}$$
where $C_{XX}(t_i, t_k)$ is as defined in (14-9). If X(t) is wide-sense stationary, then using (14-20)-(14-21) in (14-22) we get
$$\Phi_X(\omega_1, \omega_2, \dots, \omega_n) = e^{\,j \mu \sum_{k=1}^{n} \omega_k \;-\; \frac{1}{2} \sum_{l=1}^{n}\sum_{k=1}^{n} C_{XX}(t_l - t_k)\,\omega_l \omega_k}, \tag{14-23}$$
and hence if the set of time indices are shifted by a constant c to generate a new set of jointly Gaussian random variables $X_1' = X(t_1 + c),\, X_2' = X(t_2 + c),\, \dots,\, X_n' = X(t_n + c)$, then their joint characteristic function is identical to (14-23). Thus the sets of random variables $\{X_i\}_{i=1}^{n}$ and $\{X_i'\}_{i=1}^{n}$ have the same joint probability distribution for all n and all c, establishing the strict-sense stationarity of Gaussian processes from their wide-sense stationarity.
To summarize: if X(t) is a Gaussian process, then wide-sense stationarity (w.s.s) ⇒ strict-sense stationarity (s.s.s).
Notice that this holds because the joint p.d.f. of Gaussian random variables depends only on their second-order statistics, which is also the basis of wide-sense stationarity.
Systems with Stochastic Inputs
A deterministic system¹ transforms each input waveform X(t, ξ_i) into an output waveform Y(t, ξ_i) = T[X(t, ξ_i)] by operating only on the time variable t. Thus a set of realizations at the input corresponding to a process X(t) generates a new set of realizations {Y(t, ξ)} at the output associated with a new process Y(t). (Fig. 14.3: X(t) → T[·] → Y(t).)
Our goal is to study the output process statistics in terms of the input process statistics and the system function.
¹A stochastic system, on the other hand, operates on both the variables t and ξ.
Linear Systems: L[·] represents a linear system if
$$L\{a_1 X(t_1) + a_2 X(t_2)\} = a_1 L\{X(t_1)\} + a_2 L\{X(t_2)\}. \tag{14-28}$$
Let
$$Y(t) = L\{X(t)\} \tag{14-29}$$
represent the output of a linear system.
Time-Invariant System: L[·] represents a time-invariant system if
$$Y(t) = L\{X(t)\} \;\Rightarrow\; L\{X(t - t_0)\} = Y(t - t_0), \tag{14-30}$$
i.e., a shift in the input results in the same shift in the output. If L[·] satisfies both (14-28) and (14-30), then it corresponds to a linear time-invariant (LTI) system.
LTI systems can be uniquely represented in terms of their output to a delta function: an impulse δ(t) applied to an LTI system produces h(t), the impulse response of the system (Fig. 14.5).
Eq. (14-31), the convolution representation
$$Y(t) = \int_{-\infty}^{\infty} h(\tau)\,X(t - \tau)\,d\tau = \int_{-\infty}^{\infty} h(t - \tau)\,X(\tau)\,d\tau, \tag{14-31}$$
follows by expressing X(t) as
$$X(t) = \int_{-\infty}^{\infty} X(\tau)\,\delta(t - \tau)\,d\tau \tag{14-32}$$
and applying (14-28) and (14-30) to $Y(t) = L\{X(t)\}$. Thus
$$Y(t) = L\{X(t)\} = L\left\{\int_{-\infty}^{\infty} X(\tau)\,\delta(t - \tau)\,d\tau\right\} = \int_{-\infty}^{\infty} X(\tau)\,L\{\delta(t - \tau)\}\,d\tau \quad \text{(by linearity)}$$
$$= \int_{-\infty}^{\infty} X(\tau)\,h(t - \tau)\,d\tau \quad \text{(by time-invariance)} \;=\; \int_{-\infty}^{\infty} h(\tau)\,X(t - \tau)\,d\tau. \tag{14-33}$$
Thus an arbitrary input X(t) applied to an LTI system with impulse response h(t) produces the output Y(t) given by the convolution above (Fig. 14.6).
Output Statistics: Using (14-33), the mean of the output process is given by
$$\mu_Y(t) = E\{Y(t)\} = E\left\{\int_{-\infty}^{\infty} X(t - \tau)\,h(\tau)\,d\tau\right\} = \int_{-\infty}^{\infty} \mu_X(t - \tau)\,h(\tau)\,d\tau = \mu_X(t) * h(t). \tag{14-34}$$
Similarly, the cross-correlation function between the input and output processes is given by
$$R_{XY}(t_1, t_2) = E\{X(t_1)\,Y^*(t_2)\} = E\left\{X(t_1) \int_{-\infty}^{\infty} X^*(t_2 - \alpha)\,h^*(\alpha)\,d\alpha\right\}$$
$$= \int_{-\infty}^{\infty} E\{X(t_1)\,X^*(t_2 - \alpha)\}\,h^*(\alpha)\,d\alpha = \int_{-\infty}^{\infty} R_{XX}(t_1, t_2 - \alpha)\,h^*(\alpha)\,d\alpha = R_{XX}(t_1, t_2) * h^*(t_2). \tag{14-35}$$
Finally, the output autocorrelation function is given by
$$R_{YY}(t_1, t_2) = E\{Y(t_1)\,Y^*(t_2)\} = E\left\{\int_{-\infty}^{\infty} X(t_1 - \beta)\,h(\beta)\,d\beta\;\, Y^*(t_2)\right\}$$
$$= \int_{-\infty}^{\infty} E\{X(t_1 - \beta)\,Y^*(t_2)\}\,h(\beta)\,d\beta = \int_{-\infty}^{\infty} R_{XY}(t_1 - \beta, t_2)\,h(\beta)\,d\beta = R_{XY}(t_1, t_2) * h(t_1), \tag{14-36}$$
or
$$R_{YY}(t_1, t_2) = R_{XX}(t_1, t_2) * h^*(t_2) * h(t_1). \tag{14-37}$$
(Fig. 14.7: (a) $X(t) \to h(t) \to Y(t)$; (b) $R_{XX}(t_1, t_2) \to h^*(t_2) \to R_{XY}(t_1, t_2) \to h(t_1) \to R_{YY}(t_1, t_2)$.)
In particular, if X(t) is wide-sense stationary, then we have $\mu_X(t) = \mu_X$, so that from (14-34)
$$\mu_Y(t) = \mu_X \int_{-\infty}^{\infty} h(\tau)\,d\tau = c, \quad \text{a constant.} \tag{14-38}$$
Also $R_{XX}(t_1, t_2) = R_{XX}(t_1 - t_2)$, so that (14-35) reduces to
$$R_{XY}(t_1, t_2) = \int_{-\infty}^{\infty} R_{XX}(t_1 - t_2 + \alpha)\,h^*(\alpha)\,d\alpha = R_{XX}(\tau) * h^*(-\tau) = R_{XY}(\tau), \quad \tau = t_1 - t_2. \tag{14-39}$$
Thus X(t) and Y(t) are jointly w.s.s. Further, from (14-36), the output autocorrelation simplifies to
$$R_{YY}(t_1, t_2) = \int_{-\infty}^{\infty} R_{XY}(t_1 - \beta - t_2)\,h(\beta)\,d\beta = R_{XY}(\tau) * h(\tau) = R_{YY}(\tau). \tag{14-40}$$
From (14-37), we obtain
$$R_{YY}(\tau) = R_{XX}(\tau) * h^*(-\tau) * h(\tau). \tag{14-41}$$
From (14-38)-(14-40), the output process is also wide-sense stationary. This gives rise to the following representation (Fig. 14.8):
(a) an LTI system h(t) driven by a wide-sense stationary process X(t) produces a wide-sense stationary process Y(t) at the output;
(b) a linear system driven by a strict-sense stationary process X(t) produces a strict-sense stationary process Y(t) at the output (see text for proof);
(c) an LTI system h(t) driven by a Gaussian process (also stationary) X(t) produces a Gaussian process (also stationary) Y(t) at the output.
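A discrete-time sketch of statement (a) (the FIR filter and white-noise input are illustrative assumptions; scipy is an assumed tool):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(7)

# White (w.s.s.) noise through a discrete LTI filter: the output is w.s.s.,
# and R_YY depends only on the lag (estimated here by time averages).
x = rng.standard_normal(2_000_000)
h = np.array([1.0, 0.5, 0.25])          # FIR impulse response
y = lfilter(h, [1.0], x)

def autocorr(z, lag):
    return np.mean(z[lag:] * z[:len(z) - lag])

# For unit-variance white input, R_YY(lag) = sum_k h[k] h[k + lag]:
for lag in range(3):
    print(autocorr(y, lag), np.sum(h[:len(h) - lag] * h[lag:]))
```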
Discrete-Time Stochastic Processes:
A discrete-time stochastic process Xn = X(nT) is a sequence of random variables. The mean, autocorrelation and autocovariance functions of a discrete-time process are given by
$$\mu_n = E\{X(nT)\} \tag{14-57}$$
$$R(n_1, n_2) = E\{X(n_1 T)\,X^*(n_2 T)\} \tag{14-58}$$
and
$$C(n_1, n_2) = R(n_1, n_2) - \mu_{n_1}\,\mu_{n_2}^* \tag{14-59}$$
respectively. As before, the strict-sense stationarity and wide-sense stationarity definitions apply here also. For example, X(nT) is wide-sense stationary if
$$E\{X(nT)\} = \mu, \quad \text{a constant,} \tag{14-60}$$
and
$$E[X\{(k + n)T\}\,X^*\{kT\}] = R(n) = r_n = r_{-n}^*. \tag{14-61}$$
Power Spectrum
For a deterministic signal x(t), the spectrum is well defined: if $X(\omega)$ represents its Fourier transform, i.e., if
$$X(\omega) = \int_{-\infty}^{\infty} x(t)\,e^{-j\omega t}\,dt, \tag{18-1}$$
then $|X(\omega)|^2$ represents its energy spectrum. This follows from Parseval's theorem, since the signal energy is given by
$$\int_{-\infty}^{\infty} |x(t)|^2\,dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |X(\omega)|^2\,d\omega = E. \tag{18-2}$$
Thus $|X(\omega)|^2\,\Delta\omega$ represents the signal energy in the band $(\omega, \omega + \Delta\omega)$ (see Fig. 18.1, which shows a signal X(t) and the energy $|X(\omega)|^2$ in the band $(\omega, \omega + \Delta\omega)$).
However, for stochastic processes a direct application of (18-1) generates a sequence of random variables for every ω. Moreover, for a stochastic process, E{|X(t)|²} represents the ensemble average power (instantaneous energy) at the instant t.
To obtain the spectral distribution of power versus frequency for stochastic processes, it is best to avoid infinite intervals to begin with, and start with a finite interval (–T, T) in (18-1). Formally, the partial Fourier transform of a process X(t) based on (–T, T) is given by
$$X_T(\omega) = \int_{-T}^{T} X(t)\,e^{-j\omega t}\,dt, \tag{18-3}$$
so that
$$\frac{|X_T(\omega)|^2}{2T} = \frac{1}{2T}\left| \int_{-T}^{T} X(t)\,e^{-j\omega t}\,dt \right|^2 \tag{18-4}$$
represents the power distribution associated with that realization based on (–T, T). Notice that (18-4) represents a random variable for every ω, and its ensemble average gives the average power distribution based on (–T, T). Thus
$$P_T(\omega) = E\left\{\frac{|X_T(\omega)|^2}{2T}\right\} = \frac{1}{2T} \int_{-T}^{T}\!\int_{-T}^{T} E\{X(t_1)\,X^*(t_2)\}\,e^{-j\omega(t_1 - t_2)}\,dt_1\,dt_2 = \frac{1}{2T} \int_{-T}^{T}\!\int_{-T}^{T} R_{XX}(t_1, t_2)\,e^{-j\omega(t_1 - t_2)}\,dt_1\,dt_2 \tag{18-5}$$
represents the power distribution of X(t) based on (–T, T). For wide-sense stationary (w.s.s) processes, it is possible to further simplify (18-5). Thus if X(t) is assumed to be w.s.s, then $R_{XX}(t_1, t_2) = R_{XX}(t_1 - t_2)$ and (18-5) simplifies to
$$P_T(\omega) = \frac{1}{2T} \int_{-T}^{T}\!\int_{-T}^{T} R_{XX}(t_1 - t_2)\,e^{-j\omega(t_1 - t_2)}\,dt_1\,dt_2.$$
Let $\tau = t_1 - t_2$ and, proceeding as in (14-24), we get
$$P_T(\omega) = \frac{1}{2T} \int_{-2T}^{2T} R_{XX}(\tau)\,e^{-j\omega\tau}\,(2T - |\tau|)\,d\tau = \int_{-2T}^{2T} R_{XX}(\tau)\,e^{-j\omega\tau}\left(1 - \frac{|\tau|}{2T}\right)d\tau \ge 0 \tag{18-6}$$
to be the power distribution of the w.s.s. process X(t) based on (–T, T). Finally, letting $T \to \infty$ in (18-6), we obtain
$$S_{XX}(\omega) = \lim_{T \to \infty} P_T(\omega) = \int_{-\infty}^{\infty} R_{XX}(\tau)\,e^{-j\omega\tau}\,d\tau \ge 0 \tag{18-7}$$
to be the power spectral density of the w.s.s process X(t). Notice that
$$R_{XX}(\tau) \;\xrightarrow{\;\mathcal{F}\;}\; S_{XX}(\omega) \ge 0, \tag{18-8}$$
i.e., the autocorrelation function and the power spectrum of a w.s.s process form a Fourier transform pair, a relation known as the Wiener-Khinchin Theorem. From (18-8), the inverse formula gives
$$R_{XX}(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_{XX}(\omega)\,e^{j\omega\tau}\,d\omega, \tag{18-9}$$
and in particular for $\tau = 0$ we get
$$\frac{1}{2\pi} \int_{-\infty}^{\infty} S_{XX}(\omega)\,d\omega = R_{XX}(0) = E\{|X(t)|^2\} = P, \quad \text{the total power.} \tag{18-10}$$
From (18-10), the area under $S_{XX}(\omega)$ represents the total power of the process X(t), and hence $S_{XX}(\omega)$ truly represents the power spectrum (Fig. 18.2).
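A discrete-time sketch of the Wiener-Khinchin relation (the MA(1) test process and block sizes are illustrative assumptions; scipy/numpy are assumed tools):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(8)

# For a w.s.s. discrete process, the Fourier transform of the autocorrelation
# should match the averaged periodogram |X_T|^2 / N (Wiener-Khinchin).
n, n_fft = 1_000_000, 256
x = lfilter([1.0, 0.7], [1.0], rng.standard_normal(n))   # MA(1) process

# Averaged periodogram over non-overlapping blocks:
blocks = x[: (n // n_fft) * n_fft].reshape(-1, n_fft)
psd = np.mean(np.abs(np.fft.fft(blocks, axis=1)) ** 2, axis=0) / n_fft

# Spectrum from the estimated autocorrelation R(0), R(1), R(2):
r = [np.mean(x[k:] * x[: n - k]) for k in range(3)]
w = 2 * np.pi * np.arange(n_fft) / n_fft
s_from_r = r[0] + 2 * (r[1] * np.cos(w) + r[2] * np.cos(2 * w))
print(psd[:4])
print(s_from_r[:4])   # close agreement
```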
If X(t) is a real w.s.s process, then $R_{XX}(\tau) = R_{XX}(-\tau)$, so that
$$S_{XX}(\omega) = \int_{-\infty}^{\infty} R_{XX}(\tau)\,e^{-j\omega\tau}\,d\tau = \int_{-\infty}^{\infty} R_{XX}(\tau)\cos(\omega\tau)\,d\tau = 2\int_{0}^{\infty} R_{XX}(\tau)\cos(\omega\tau)\,d\tau = S_{XX}(-\omega) \ge 0, \tag{18-13}$$
so that the power spectrum is an even function (in addition to being real and nonnegative).