INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous)
DUNDIGAL, HYDERABAD - 500043

PPT ON PROBABILITY THEORY & STOCHASTIC PROCESS
II B.Tech I Semester (JNTUH-R15)

Prepared by
Ms. G. Mary Swarna Latha (Assistant Professor)
Mr. G. Anil Kumar Reddy (Assistant Professor)
Probability Introduced Through Sets and Relative Frequency
• Experiment: a random experiment is an action or process that leads to one of several possible outcomes.
Experiment          Outcomes
Flip a coin         Heads, Tails
Exam marks          Numbers: 0, 1, 2, ..., 100
Assembly time       t > 0 seconds
Course grades       F, D, C, B, A, A+
Sample Space
• The list of all possible outcomes is called the Sample Space; the individual outcomes are called the Simple Events.
• This list must be exhaustive, i.e. ALL possible outcomes must be included:
   – Die roll {1,2,3,4,5} is not exhaustive; die roll {1,2,3,4,5,6} is.
• The list must be mutually exclusive, i.e. no two outcomes can occur at the same time:
   – Die roll {odd number or even number} is mutually exclusive; die roll {number less than 4 or even number} is not.
Sample Space
• A list of exhaustive [don't leave anything out] and mutually exclusive [impossible for two different outcomes to occur in the same experiment] outcomes is called a sample space and is denoted by S.
• The outcomes are denoted by O1, O2, ..., Ok.
• Using notation from set theory, we can represent the sample space and its outcomes as S = {O1, O2, ..., Ok}.
• Given a sample space S = {O1, O2, …, Ok}, the probabilities assigned to the outcome must satisfy these requirements:
(1) The probability of any outcome is between 0 and 1
• i.e. 0 ≤ P(Oi) ≤ 1 for each i, and
(2) The sum of the probabilities of all the outcomes equals 1
• i.e. P(O1) + P(O2) + … + P(Ok) = 1
Relative Frequency
Consider a random experiment with sample space S. We shall assign a non-negative number called probability to each event in the sample space.
Let A be a particular event in S; the probability of event A is denoted by P(A).
Suppose that the random experiment is repeated n times and the event A occurs n_A times. Then the probability of event A is defined as the relative frequency:
• Relative Frequency Definition: the probability of an event A is defined as
$$P(A) = \lim_{n \to \infty} \frac{n_A}{n}$$
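A quick numerical sketch of this definition (not part of the original slides; Python with numpy is an assumed tool here):

```python
import numpy as np

rng = np.random.default_rng(42)

# Estimate P(A) for the event A = "a fair die shows an even number"
# by the relative frequency n_A / n as the number of trials n grows.
for n in [100, 10_000, 1_000_000]:
    rolls = rng.integers(1, 7, size=n)           # outcomes 1..6
    n_A = np.count_nonzero(rolls % 2 == 0)       # times A occurred
    print(f"n = {n:>9}: n_A/n = {n_A / n:.4f}")  # approaches P(A) = 1/2
```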
Axioms of Probability
For any event A, we assign a number P(A), called the probability of the event A. This number satisfies the following three conditions, which act as the axioms of probability:
(i) $P(A) \ge 0$ (probability is a non-negative number)
(ii) $P(S) = 1$ (probability of the whole set is unity)
(iii) If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$.
(Note that (iii) states that if A and B are mutually exclusive (M.E.) events, the probability of their union is the sum of their probabilities.)
Events
• The probability of an event is the sum of the probabilities of the simple events that constitute the event.
• E.g. (assuming a fair die) S = {1, 2, 3, 4, 5, 6} and
• P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
• Then:
• P(EVEN) = P(2) + P(4) + P(6) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
Conditional Probability
• Conditional probability is used to determine how two events are related; that is, we can determine the probability of one event given the occurrence of another related event.
• Experiment: randomly select one student in class.
   – P(randomly selected student is male) = ?
   – P(randomly selected student is male | student is on 3rd row) = ?
• Conditional probabilities are written as P(A | B) and read as "the probability of A given B".
• Multiplication rule: P(A and B) = P(A)·P(B | A) = P(B)·P(A | B); both factorizations are true.
• Keep this in mind!
Bayes' Law
• Bayes' Law is named for Thomas Bayes, an eighteenth-century mathematician.
• In its most basic form, if we know P(B | A), we can apply Bayes' Law to determine P(A | B).
• The probabilities P(A) and P(A^C) are called prior probabilities because they are determined prior to the decision about taking the preparatory course.
• The conditional probability P(A | B) is called a posterior probability (or revised probability), because the prior probability is revised after the decision about taking the preparatory course.
Total Probability Theorem
• Take events $A_i$, for $i = 1$ to $k$, to be:
   – Mutually exclusive: $A_i \cap A_j = \emptyset$ for all $i \ne j$
   – Exhaustive: $A_1 \cup A_2 \cup \dots \cup A_k = S$
• Then for any event B on S,
$$P(B) = P(B \mid A_1)P(A_1) + \dots + P(B \mid A_k)P(A_k) = \sum_{i=1}^{k} P(B \mid A_i)\,P(A_i).$$
Bayes' theorem follows:
$$P(A_j \mid B) = \frac{P(B \mid A_j)\,P(A_j)}{P(B)} = \frac{P(B \mid A_j)\,P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i)\,P(A_i)}.$$
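A small numerical sketch of both theorems (the priors and likelihoods below are illustrative assumptions, not values from the slides):

```python
import numpy as np

# Assumed priors P(A_i) and likelihoods P(B | A_i) for three mutually
# exclusive, exhaustive causes; the numbers are illustrative only.
prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.02, 0.05, 0.10])

# Total probability theorem: P(B) = sum_i P(B | A_i) P(A_i)
p_B = np.sum(likelihood * prior)

# Bayes' theorem: P(A_j | B) = P(B | A_j) P(A_j) / P(B)
posterior = likelihood * prior / p_B
print(p_B)                         # 0.045
print(posterior, posterior.sum())  # posteriors sum to 1
```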
Independence
• Do A and B depend on one another?
   – If they do, B is more likely to be true when A has occurred, and A is more likely when B has occurred.
• If independent:
$$P(A \cap B) = P(A)\,P(B), \qquad P(A \mid B) = P(A), \qquad P(B \mid A) = P(B).$$
• If dependent:
$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A).$$
Random Variable
• Random variable: a rule that assigns a numerical value to each outcome of a particular experiment; it maps the sample space S to the real line R.
• Example 1: Machine breakdowns
   – Sample space: S = {electrical, mechanical, misuse}
   – Each of these failures may be associated with a repair cost
   – State space: {50, 200, 350}
   – Cost is a random variable: 50, 200, and 350
• Probability Mass Function (p.m.f.)
   – A set of probability values $p_i$ assigned to each of the values $x_i$ taken by the discrete random variable:
$$0 \le p_i \le 1, \qquad \sum_i p_i = 1, \qquad P(X = x_i) = p_i.$$
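A short sketch of a p.m.f. in code (the probability values assigned to the Example 1 repair costs are assumptions for illustration):

```python
import numpy as np

# p.m.f. for the repair cost of Example 1; the probability values
# 0.3, 0.2, 0.5 are illustrative assumptions, not from the slides.
costs = np.array([50, 200, 350])
p = np.array([0.3, 0.2, 0.5])

# Check the p.m.f. conditions 0 <= p_i <= 1 and sum_i p_i = 1:
assert np.all((p >= 0) & (p <= 1)) and np.isclose(p.sum(), 1.0)

# Probability of an event = sum of p.m.f. values over its outcomes:
print("P(cost > 100) =", p[costs > 100].sum())   # 0.7
```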
Continuous and Discrete Random Variables
• Discrete random variables have a countable number of outcomes.
   – Examples: dead/alive, treatment/placebo, dice, counts, etc.
• Continuous random variables have an infinite continuum of possible values.
   – Examples: blood pressure, weight, the speed of a car, the real numbers from 1 to 6.
• Distribution function: $F_X(x) = P(X \le x)$.
• If $F_X(x)$ is a continuous function of x, then X is a continuous random variable.
   – $F_X(x)$ discrete in x: discrete r.v.'s
   – $F_X(x)$ piecewise continuous: mixed r.v.'s
• Properties: $F_X$ is non-decreasing, with $F_X(-\infty) = 0$ and $F_X(\infty) = 1$.
Probability Density Function (pdf)
• If X is a continuous r.v., then its pdf is
$$f_X(x) = \frac{dF_X(x)}{dx}.$$
• pdf properties:
   1. $f_X(x) \ge 0$
   2. $\displaystyle \int_{-\infty}^{\infty} f_X(x)\,dx = 1, \qquad F_X(t) = \int_{-\infty}^{t} f_X(x)\,dx.$
Binomial
• Suppose that the probability of success is p
• What is the probability of failure?
q = 1 – p
• Examples
– Toss of a coin (S = head): p = 0.5 q = 0.5
– Roll of a die (S = 1): p = 0.1667 q = 0.8333
– Fertility of a chicken egg (S = fertile): p = 0.8 q = 0.2
• Imagine that a trial is repeated n times
• Examples
– A coin is tossed 5 times
– A die is rolled 25 times
– 50 chicken eggs are examined
• Assume p remains constant from trial to trial and that the trials are statistically independent of each other
• Example
– What is the probability of obtaining 2 heads from a coin that was tossed 5 times?
   P(HHTTT) = (1/2)^5 = 1/32. Each particular sequence with exactly 2 heads has this probability, and there are C(5,2) = 10 such sequences, so P(2 heads) = 10/32.
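A hedged sketch (not from the slides): the number of successes in n independent trials follows the binomial pmf $P(k) = \binom{n}{k} p^k q^{n-k}$; scipy is an assumed tool here.

```python
from math import comb
from scipy.stats import binom

# P(exactly 2 heads in 5 fair tosses), computed two ways:
p_manual = comb(5, 2) * (0.5 ** 2) * (0.5 ** 3)  # C(5,2) p^2 q^3
p_scipy = binom.pmf(2, n=5, p=0.5)
print(p_manual, p_scipy)                         # both 0.3125 = 10/32
```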
Poisson
• When there is a large number of trials but a small probability of success, the binomial calculation becomes impractical.
– Example: Number of deaths from horse kicks in the Army in different years
• The mean number of successes from n trials is µ = np
– Example: 64 deaths in 20 years from thousands of soldiers
If we substitute µ/n for p, and let n tend to infinity, the binomial distribution becomes the Poisson distribution:
$$P(x) = \frac{e^{-\mu}\,\mu^{x}}{x!}$$
• Poisson distribution is applied where random events in space or time are expected to occur
• Deviation from Poisson distribution may indicate some degree of non-randomness in the events under study
• Investigation of cause may be of interest
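A small sketch comparing the two pmfs numerically (the parameters are illustrative; scipy is an assumed tool):

```python
from scipy.stats import binom, poisson

# Large n, small p: binomial(n, p) is well approximated by Poisson(mu = n p).
n, p = 10_000, 0.0003
mu = n * p                                   # mean number of successes

for x in range(7):
    print(x, binom.pmf(x, n, p), poisson.pmf(x, mu))   # nearly identical
```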
Uniform
• All (pseudo) random generators generate random deviates of the U(0,1) distribution; that is, if you generate a large number of random variables and plot their empirical distribution function, it will approach this distribution in the limit.
• The U(a,b) pdf is constant over the (a,b) interval, and the CDF is the ramp function.
[Figure: the U(0,1) pdf and the corresponding cdf plotted against time.]
Uniform distribution cdf:
$$F(x) = \begin{cases} 0, & x < a, \\[1ex] \dfrac{x-a}{b-a}, & a \le x < b, \\[1ex] 1, & x \ge b. \end{cases}$$
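A minimal sketch of the first bullet above, assuming numpy as the U(0,1) generator:

```python
import numpy as np

rng = np.random.default_rng(0)

# U(a,b) deviates from a U(0,1) generator: X = a + (b - a) * U
a, b = 2.0, 5.0
u = rng.random(100_000)          # U(0,1)
x = a + (b - a) * u              # U(a,b)

# Empirical cdf at a test point vs the ramp formula (x - a)/(b - a):
t = 3.5
print(np.mean(x <= t), (t - a) / (b - a))   # both close to 0.5
```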
Gaussian (Normal) Distribution
• Bell-shaped pdf – intuitively pleasing!
• Central Limit Theorem: the mean of a large number of mutually independent r.v.'s (having arbitrary distributions) starts following the Normal distribution as n → ∞.
• μ: mean, σ: std. deviation, σ²: variance (N(μ, σ²)).
• μ and σ completely describe the statistics. This is significant in statistical estimation/signal processing/communication theory etc.
• N(0,1) is called the normalized Gaussian.
• N(0,1) is symmetric, i.e.
   – f(x) = f(−x)
   – F(−z) = 1 − F(z).
• The failure rate h(t) follows IFR (increasing failure rate) behavior.
   – Hence, N(μ, σ²) is suitable for modeling long-term wear or aging related failure phenomena.
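A numerical sketch of the Central Limit Theorem bullet (the uniform source distribution and sample sizes are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Means of n U(0,1) r.v.'s (a decidedly non-Gaussian distribution)
# become approximately N(mu, sigma^2 / n) as n grows.
n, trials = 50, 100_000
means = rng.random((trials, n)).mean(axis=1)

mu, sigma = 0.5, np.sqrt(1 / 12)            # mean and std of U(0,1)
print(means.mean(), mu)                     # ~0.5
print(means.std(), sigma / np.sqrt(n))      # ~0.0408
```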
Conditional Distributions
• The conditional distribution of Y given X = 1 is obtained by restricting attention to the outcomes with X = 1 and renormalizing.
• While marginal distributions are obtained from the bivariate distribution by summing, conditional distributions are obtained by "making a cut" through the bivariate distribution.
The Expectation of a Random Variable
• Expectation of a discrete random variable with p.m.f. $P(X = x_i) = p_i$:
$$E[X] = \sum_i p_i x_i$$
• Expectation of a continuous random variable with p.d.f. f(x):
$$E[X] = \int_{\text{state space}} x f(x)\,dx$$
• Expectation of X = mean of X = average of X:
$$E[X] = \bar{X} = \int_{-\infty}^{\infty} x f_X(x)\,dx \;\;\text{(continuous r.v.)}, \qquad E[X] = \bar{X} = \sum_{i=1}^{N} x_i P(x_i) \;\;\text{(discrete r.v.)}$$
• If $f_X(x)$ is symmetric about $x = a$, i.e. $f_X(x+a) = f_X(-x+a)$, then $E[X] = a$.
• A function of a r.v. is a r.v.: Y = g(X). Ex: $Y = g(X) = X^2$ with
$$P(X=0) = P(X=1) = P(X=-1) = \tfrac{1}{3} \;\Rightarrow\; P(Y=0) = \tfrac{1}{3},\quad P(Y=1) = \tfrac{2}{3}.$$
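The three-point example above, checked numerically (a sketch, assuming numpy):

```python
import numpy as np

# E[X] = sum_i x_i P(x_i) and E[Y] for Y = g(X) = X^2, matching the
# three-point example above.
x = np.array([-1, 0, 1])
p = np.array([1 / 3, 1 / 3, 1 / 3])

E_X = np.sum(x * p)            # 0.0 (pmf symmetric about 0)
E_Y = np.sum((x ** 2) * p)     # 2/3 = 0 * P(Y=0) + 1 * P(Y=1)
print(E_X, E_Y)
```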
Expectation
• Expectation of a function of a r.v. X:
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx \;\;\text{(continuous r.v.)}, \qquad E[g(X)] = \sum_{i=1}^{N} g(x_i) P(x_i) \;\;\text{(discrete r.v.)}$$
• Conditional expectation of a r.v. X:
$$E[X \mid B] = \int_{-\infty}^{\infty} x f_X(x \mid B)\,dx \;\;\text{(continuous r.v.)}, \qquad E[X \mid B] = \sum_{i=1}^{N} x_i P(x_i \mid B) \;\;\text{(discrete r.v.)}$$
• Ex: $B = \{X \le b\}$:
$$f_X(x \mid X \le b) = \begin{cases} \dfrac{f_X(x)}{\int_{-\infty}^{b} f_X(x)\,dx}, & x \le b, \\[2ex] 0, & x > b, \end{cases} \qquad\quad E[X \mid X \le b] = \frac{\int_{-\infty}^{b} x f_X(x)\,dx}{\int_{-\infty}^{b} f_X(x)\,dx}.$$
Moments
• n-th moment of a r.v. X:
$$m_n = E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\,dx \;\;\text{(continuous r.v.)}, \qquad m_n = E[X^n] = \sum_{i=1}^{N} x_i^n P(x_i) \;\;\text{(discrete r.v.)}$$
• $m_0 = 1$, $m_1 = \bar{X}$.
• Properties of expectation:
   (1) $E[c] = c$ for a constant c.
   (2) $E[a\,g(X) + b\,h(X)] = a\,E[g(X)] + b\,E[h(X)]$.
• Proof: $E[c] = \int c f_X(x)\,dx = c \int f_X(x)\,dx = c$, and
$$E[a\,g(X) + b\,h(X)] = \int \{a\,g(x) + b\,h(x)\}\, f_X(x)\,dx = a \int g(x) f_X(x)\,dx + b \int h(x) f_X(x)\,dx = a\,E[g(X)] + b\,E[h(X)].$$
• Variance of a r.v. X:
$$\sigma_X^2 = E[(X - \bar{X})^2] = E[X^2 - 2X\bar{X} + \bar{X}^2] = E[X^2] - \bar{X}^2 = m_2 - m_1^2$$
• Standard deviation of a r.v. X: $\sigma_X \ge 0$.
• Skewness of a r.v. X: $E[(X - \bar{X})^3] / \sigma_X^3$.
• Ex 3.2-1 & Ex 3.2-2: if $f_X(x)$ is symmetric about 0, then $\bar{X} = 0$, $E[X^3] = 0$, and the skewness is 0.
• Exponential r.v.:
$$f_X(x) = \begin{cases} \dfrac{1}{b}\, e^{-(x-a)/b}, & x > a, \\[1ex] 0, & x < a. \end{cases}$$
$$m_1 = E[X] = \int_a^{\infty} x\, \frac{1}{b}\, e^{-(x-a)/b}\,dx = a + b$$
$$m_2 = E[X^2] = \int_a^{\infty} x^2\, \frac{1}{b}\, e^{-(x-a)/b}\,dx = a^2 + 2ab + 2b^2, \qquad \sigma_X^2 = m_2 - m_1^2 = b^2$$
$$m_3 = E[X^3] = \int_a^{\infty} x^3\, \frac{1}{b}\, e^{-(x-a)/b}\,dx = a^3 + 3a^2 b + 6ab^2 + 6b^3$$
$$E[(X-\bar{X})^3] = E[X^3 - 3X^2\bar{X} + 3X\bar{X}^2 - \bar{X}^3] = m_3 - 3m_1 m_2 + 2m_1^3$$
$$= a^3 + 3a^2 b + 6ab^2 + 6b^3 - 3(a+b)(a^2 + 2ab + 2b^2) + 2(a+b)^3 = 2b^3$$
• Skewness of the exponential r.v.: $E[(X-\bar{X})^3]/\sigma_X^3 = 2b^3/b^3 = 2$.
Chebyshev's inequality:
$$\sigma_X^2 = \int_{-\infty}^{\infty} (x-\bar{X})^2 f_X(x)\,dx \ge \int_{|x-\bar{X}| \ge \varepsilon} (x-\bar{X})^2 f_X(x)\,dx \ge \varepsilon^2\, P[|X-\bar{X}| \ge \varepsilon]$$
$$\Rightarrow \quad P[|X - \bar{X}| \ge \varepsilon] \le \frac{\sigma_X^2}{\varepsilon^2}.$$
Markov's inequality (for $X \ge 0$, i.e. $P[X < 0] = 0$):
$$P[X \ge a] \le \frac{E[X]}{a}.$$
Ex 3.2-3:
$$P[|X - \bar{X}| \ge 3\sigma_X] \le \frac{\sigma_X^2}{9\sigma_X^2} = \frac{1}{9}.$$
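An empirical sketch of the Chebyshev bound (the exponential test distribution is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Chebyshev check: P[|X - mean| >= 3 sigma] <= 1/9 for any distribution.
x = rng.exponential(scale=2.0, size=1_000_000)   # exponential r.v. (b = 2)
mean, sigma = x.mean(), x.std()

tail = np.mean(np.abs(x - mean) >= 3 * sigma)
print(tail, 1 / 9)   # the empirical tail is well below the bound
```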
Functions That Give Moments
Characteristic function of a r.v. X:
$$\Phi_X(\omega) = E[e^{j\omega X}] = \int_{-\infty}^{\infty} f_X(x)\,e^{j\omega x}\,dx,$$
which is a Fourier transform relationship; the inverse transform gives
$$f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi_X(\omega)\,e^{-j\omega x}\,d\omega.$$
$$|\Phi_X(\omega)| \le \int_{-\infty}^{\infty} f_X(x)\,dx = 1 = \Phi_X(0)$$
Moments from the characteristic function:
$$\frac{d^n \Phi_X(\omega)}{d\omega^n}\bigg|_{\omega=0} = \int_{-\infty}^{\infty} f_X(x)\,(jx)^n\,dx = j^n\,E[X^n] \quad\Rightarrow\quad m_n = (-j)^n\, \frac{d^n \Phi_X(\omega)}{d\omega^n}\bigg|_{\omega=0}.$$
Moment generating function of a r.v. X:
$$M_X(v) = E[e^{vX}] = \int_{-\infty}^{\infty} f_X(x)\,e^{vx}\,dx$$
$$\frac{d^n M_X(v)}{dv^n}\bigg|_{v=0} = \int_{-\infty}^{\infty} f_X(x)\,x^n\,e^{vx}\,dx\,\bigg|_{v=0} = \int_{-\infty}^{\infty} f_X(x)\,x^n\,dx = m_n.$$
Ex 3.3-1 & Ex 3.3-2: for the exponential r.v.
$$f_X(x) = \begin{cases} \dfrac{1}{b}\, e^{-(x-a)/b}, & x > a, \\[1ex] 0, & x < a, \end{cases}$$
$$\Phi_X(\omega) = E[e^{j\omega X}] = \int_a^{\infty} \frac{1}{b}\, e^{-(x-a)/b}\, e^{j\omega x}\,dx = \frac{e^{j\omega a}}{1 - j\omega b}.$$
$$\frac{d\Phi_X(\omega)}{d\omega} = \frac{ja\, e^{j\omega a}(1 - j\omega b) + jb\, e^{j\omega a}}{(1 - j\omega b)^2} \quad\Rightarrow\quad m_1 = (-j)\,\frac{d\Phi_X(\omega)}{d\omega}\bigg|_{\omega=0} = a + b.$$
$$M_X(v) = E[e^{vX}] = \frac{e^{va}}{1 - vb}$$
$$\frac{dM_X(v)}{dv} = \frac{a\, e^{va}(1 - vb) + b\, e^{va}}{(1 - vb)^2} \quad\Rightarrow\quad m_1 = \frac{dM_X(v)}{dv}\bigg|_{v=0} = a + b.$$
Ex 3.3-3, Chernoff's inequality:
$$P[X \ge a] = \int_a^{\infty} f_X(x)\,dx = \int_{-\infty}^{\infty} f_X(x)\,u(x-a)\,dx \le \int_{-\infty}^{\infty} f_X(x)\,e^{v(x-a)}\,dx = e^{-va}\,M_X(v), \quad v > 0.$$
Transformations of a Random Variable
• $Y = T(X)$: given $f_X(x)$, find $f_Y(y)$.
• Monotone increasing: $T(x_1) < T(x_2)$ for any $x_1 < x_2$. Monotone decreasing: $T(x_1) > T(x_2)$ for any $x_1 < x_2$.
• Assume $Y = T(X)$ with $T(\cdot)$ monotone increasing. Then
$$F_Y(y_0) = P[Y \le y_0] = P[X \le x_0] = F_X(x_0), \qquad x_0 = T^{-1}(y_0),$$
$$\int_{-\infty}^{y_0} f_Y(y)\,dy = \int_{-\infty}^{T^{-1}(y_0)} f_X(x)\,dx \quad\Rightarrow\quad f_Y(y) = f_X\!\left(T^{-1}(y)\right)\frac{dT^{-1}(y)}{dy} = f_X(x)\,\frac{dx}{dy}.$$
• Assume $Y = T(X)$ with $T(\cdot)$ monotone decreasing. Then
$$F_Y(y_0) = P[Y \le y_0] = P[X \ge x_0] = 1 - F_X(x_0) \quad\Rightarrow\quad f_Y(y) = -f_X(x)\,\frac{dx}{dy}.$$
• For monotone $T(\cdot)$ (increasing or decreasing), both cases combine into
$$f_Y(y) = f_X(x)\,\left|\frac{dx}{dy}\right| = \frac{f_X(x)}{\left|dy/dx\right|}.$$
• Nonmonotone $T(\cdot)$: sum over all roots $x_n$ of $y = T(x)$:
$$f_Y(y) = \sum_n \frac{f_X(x_n)}{\left|\dfrac{dT(x)}{dx}\right|_{x = x_n}}.$$
• Ex 3.4-2: $Y = T(X) = cX^2$ (nonmonotone). For $y > 0$ the roots are $x = \pm\sqrt{y/c}$, so
$$f_Y(y) = \frac{f_X\!\left(\sqrt{y/c}\right) + f_X\!\left(-\sqrt{y/c}\right)}{2\sqrt{cy}}, \quad y > 0,$$
and $f_Y(y) = 0$ for $y < 0$.
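A simulation sketch of Ex 3.4-2 (the standard normal input and the point of comparison are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Y = c X^2 with X ~ N(0,1): compare a histogram density estimate of f_Y
# with f_Y(y) = [f_X(sqrt(y/c)) + f_X(-sqrt(y/c))] / (2 sqrt(c y)).
c = 2.0
x = rng.standard_normal(1_000_000)
y = c * x ** 2

def f_X(t):
    return np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)

y0, h = 1.5, 0.05
f_Y = (f_X(np.sqrt(y0 / c)) + f_X(-np.sqrt(y0 / c))) / (2 * np.sqrt(c * y0))
print(np.mean((y > y0 - h) & (y < y0 + h)) / (2 * h), f_Y)   # agree
```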
MULTIPLE RANDOM VARIABLES AND OPERATIONS: MULTIPLE RANDOM VARIABLES
Vector Random Variables
A vector random variable X is a function that assigns a vector of real numbers to each outcome ζ in S, the sample space of the random experiment.
Events and Probabilities
EXAMPLE 4.4
Consider the two-dimensional random variable X = (X, Y). Find the region of the plane corresponding to the events
$$A = \{X + Y \le 10\}, \qquad B = \{\min(X, Y) \le 5\}, \qquad C = \{X^2 + Y^2 \le 100\}.$$
The regions corresponding to events A and C are straightforward to find and are shown in Fig. 4.1.
Independence
The one-dimensional random variables X and Y are "independent" if, for A1 any event that involves X only and A2 any event that involves Y only,
$$P[X \text{ in } A_1,\; Y \text{ in } A_2] = P[X \text{ in } A_1]\;P[Y \text{ in } A_2].$$
In the general case of n random variables, we say that the random variables X1, X2, ..., Xn are independent if
$$P[X_1 \text{ in } A_1, \dots, X_n \text{ in } A_n] = P[X_1 \text{ in } A_1] \cdots P[X_n \text{ in } A_n], \tag{4.3}$$
where the Ak is an event that involves Xk only.
Pairs of Discrete Random Variables
Let the vector random variable X = (X, Y) assume values from some countable set $S = \{(x_j, y_k),\; j = 1, 2, \dots,\; k = 1, 2, \dots\}$. The joint probability mass function of X specifies the probabilities of the product-form event $\{X = x_j\} \cap \{Y = y_k\}$:
$$p_{X,Y}(x_j, y_k) = P[\{X = x_j\} \cap \{Y = y_k\}] = P[X = x_j,\; Y = y_k], \quad j = 1, 2, \dots,\; k = 1, 2, \dots \tag{4.4}$$
The probability of any event A is the sum of the pmf over the outcomes in A:
$$P[X \text{ in } A] = \sum_{(x_j,\, y_k) \text{ in } A} p_{X,Y}(x_j, y_k) \tag{4.5}$$
$$\sum_{j=1}^{\infty} \sum_{k=1}^{\infty} p_{X,Y}(x_j, y_k) = 1 \tag{4.6}$$
The marginal probability mass functions:
$$p_X(x_j) = P[X = x_j] = P[X = x_j,\; Y = \text{anything}] = \sum_{k=1}^{\infty} p_{X,Y}(x_j, y_k) \tag{4.7a}$$
$$p_Y(y_k) = P[Y = y_k] = \sum_{j=1}^{\infty} p_{X,Y}(x_j, y_k) \tag{4.7b}$$
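A sketch of a joint pmf and its marginals in code (the grid of probability values is an illustrative assumption):

```python
import numpy as np

# Illustrative joint pmf p_{X,Y}(x_j, y_k) on a 2 x 3 grid (values assumed).
p = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.15, 0.20]])
assert np.isclose(p.sum(), 1.0)       # Eq. (4.6)

p_X = p.sum(axis=1)   # marginal over y_k: Eq. (4.7a)
p_Y = p.sum(axis=0)   # marginal over x_j: Eq. (4.7b)
print(p_X, p_Y)

# Independence would require p[j, k] = p_X[j] p_Y[k] for all j, k:
print(np.allclose(p, np.outer(p_X, p_Y)))   # False for this pmf
```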
The Joint cdf of X and Y
The joint cumulative distribution function of X and Y is defined as the probability of the product-form event $\{X \le x_1\} \cap \{Y \le y_1\}$:
$$F_{X,Y}(x_1, y_1) = P[X \le x_1,\; Y \le y_1]. \tag{4.8}$$
The joint cdf is nondecreasing in the "northeast" direction:
(i) $F_{X,Y}(x_1, y_1) \le F_{X,Y}(x_2, y_2)$ if $x_1 \le x_2$ and $y_1 \le y_2$.
It is impossible for either X or Y to assume a value less than $-\infty$, therefore
(ii) $F_{X,Y}(-\infty, y_1) = F_{X,Y}(x_1, -\infty) = 0$.
It is certain that X and Y will assume values less than infinity, therefore
(iii) $F_{X,Y}(\infty, \infty) = 1$.
If we let one of the variables approach infinity while keeping the other fixed, we obtain the marginal cumulative distribution functions:
(iv) $F_X(x) = F_{X,Y}(x, \infty) = P[X \le x]$ and $F_Y(y) = F_{X,Y}(\infty, y) = P[Y \le y]$.
Recall that the cdf for a single random variable is continuous from the right. It can be shown that the joint cdf is continuous from the "north" and from the "east":
(v) $\displaystyle \lim_{x \to a^{+}} F_{X,Y}(x, y) = F_{X,Y}(a, y)$ and $\displaystyle \lim_{y \to b^{+}} F_{X,Y}(x, y) = F_{X,Y}(x, b)$.
The Joint pdf of Two Jointly Continuous Random Variables
We say that the random variables X and Y are jointly continuous if the probabilities of events involving (X, Y) can be expressed as an integral of a pdf. There is a nonnegative function f_{X,Y}(x,y), called the joint probability density function, that is defined on the real plane such that for every event A, a subset of the plane,
$$P[X \text{ in } A] = \int\!\!\int_A f_{X,Y}(x', y')\,dx'\,dy', \tag{4.9}$$
as shown in Fig. 4.7. When A is the entire plane, the integral must equal one:
$$1 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f_{X,Y}(x', y')\,dx'\,dy'. \tag{4.10}$$
The joint cdf can be obtained in terms of the joint pdf of jointly continuous random variables by integrating over the semi-infinite rectangle defined by (x, y). The marginal pdf's f_X(x) and f_Y(y) are obtained by taking the derivative of the corresponding marginal cdf's, $F_X(x) = F_{X,Y}(x, \infty)$ and $F_Y(y) = F_{X,Y}(\infty, y)$:
$$f_X(x) = \frac{d}{dx} \int_{-\infty}^{x}\!\left[\int_{-\infty}^{\infty} f_{X,Y}(x', y')\,dy'\right] dx' = \int_{-\infty}^{\infty} f_{X,Y}(x, y')\,dy' \tag{4.15a}$$
$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x', y)\,dx'. \tag{4.15b}$$
INDEPENDENCE OF TWO RANDOM VARIABLES
X and Y are independent random variables if any event A1 defined in terms of X is independent of any event A2 defined in terms of Y:
$$P[X \text{ in } A_1,\; Y \text{ in } A_2] = P[X \text{ in } A_1]\;P[Y \text{ in } A_2]. \tag{4.17}$$
Suppose that X and Y are a pair of discrete random variables. If we let $A_1 = \{X = x_j\}$ and $A_2 = \{Y = y_k\}$, then the independence of X and Y implies that
$$p_{X,Y}(x_j, y_k) = P[X = x_j,\; Y = y_k] = P[X = x_j]\,P[Y = y_k] = p_X(x_j)\,p_Y(y_k) \quad \text{for all } x_j \text{ and } y_k. \tag{4.18}$$
4.4 CONDITIONAL PROBABILITY AND CONDITIONAL EXPECTATION
Conditional Probability
In Section 2.4, we know
$$P[Y \text{ in } A \mid X = x] = \frac{P[Y \text{ in } A,\; X = x]}{P[X = x]}. \tag{4.22}$$
If X is discrete, then Eq. (4.22) can be used to obtain the conditional cdf of Y given X = x_k:
$$F_Y(y \mid x_k) = \frac{P[Y \le y,\; X = x_k]}{P[X = x_k]}, \quad \text{for } P[X = x_k] > 0. \tag{4.23}$$
The conditional pdf of Y given X = x_k, if the derivative exists, is given by
$$f_Y(y \mid x_k) = \frac{d}{dy} F_Y(y \mid x_k). \tag{4.24}$$
MULTIPLE RANDOM VARIABLES
Joint Distributions
The joint cumulative distribution function of X1, X2, ..., Xn is defined as the probability of an n-dimensional semi-infinite rectangle associated with the point (x1, ..., xn):
$$F_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = P[X_1 \le x_1,\; X_2 \le x_2, \dots,\; X_n \le x_n]. \tag{4.38}$$
The joint cdf is defined for discrete, continuous, and random variables of mixed type.
FUNCTIONS OF SEVERAL RANDOM VARIABLES
One Function of Several Random Variables
Let the random variable Z be defined as a function of several random variables:
$$Z = g(X_1, X_2, \dots, X_n). \tag{4.51}$$
The cdf of Z is found by first finding the equivalent event of $\{Z \le z\}$, that is, the set $R_z = \{\mathbf{x} : g(\mathbf{x}) \le z\}$; then
$$F_Z(z) = P[\mathbf{X} \text{ in } R_z] = \int\!\cdots\!\int_{\mathbf{x'} \text{ in } R_z} f_{X_1, \dots, X_n}(x_1', \dots, x_n')\,dx_1' \cdots dx_n'. \tag{4.52}$$
EXAMPLE 4.31 Sum of Two Random Variables
Let Z = X + Y. Find F_Z(z) and f_Z(z) in terms of the joint pdf of X and Y.
The cdf of Z is
$$F_Z(z) = \int_{-\infty}^{\infty} \int_{-\infty}^{z - x'} f_{X,Y}(x', y')\,dy'\,dx'.$$
The pdf of Z is
$$f_Z(z) = \frac{d}{dz} F_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(x', z - x')\,dx'. \tag{4.53}$$
Thus the pdf for the sum of two random variables is given by a superposition integral. If X and Y are independent random variables, then by Eq. (4.21) the pdf is given by the convolution integral of the marginal pdf's of X and Y:
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x')\,f_Y(z - x')\,dx'. \tag{4.54}$$
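A simulation sketch of Eq. (4.54) (the uniform marginals are an illustrative assumption; the convolution of two U(0,1) pdf's is the triangular pdf on (0, 2)):

```python
import numpy as np

rng = np.random.default_rng(4)

# Z = X + Y for independent X, Y ~ U(0,1): f_Z is the convolution of the
# marginals, i.e. the triangular pdf on (0, 2). Empirical check at z0 = 1:
z = rng.random(1_000_000) + rng.random(1_000_000)

z0, h = 1.0, 0.02
empirical = np.mean((z > z0 - h) & (z < z0 + h)) / (2 * h)
triangular = 1.0 - abs(z0 - 1.0)      # analytic convolution result
print(empirical, triangular)          # both close to 1.0
```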
pdf of Linear Transformations
We consider first the linear transformation of two random variables:
$$V = aX + bY, \qquad W = cX + eY, \qquad \text{i.e.} \qquad \begin{pmatrix} V \\ W \end{pmatrix} = \begin{pmatrix} a & b \\ c & e \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}.$$
Denote the above matrix by A. We will assume A has an inverse, so each point (v, w) has a unique corresponding point (x, y) obtained from
$$\begin{pmatrix} x \\ y \end{pmatrix} = A^{-1} \begin{pmatrix} v \\ w \end{pmatrix}. \tag{4.56}$$
In Fig. 4.15, the infinitesimal rectangle and the parallelogram are equivalent events, so their probabilities must be equal. Thus
$$f_{V,W}(v, w)\,dP = f_{X,Y}(x, y)\,dx\,dy \quad\Rightarrow\quad f_{V,W}(v, w) = \frac{f_{X,Y}(x, y)}{dP / (dx\,dy)}, \tag{4.57}$$
where dP is the area of the parallelogram. It can be shown that $dP = |ae - bc|\,dx\,dy$, so the "stretch factor" is
$$\frac{dP}{dx\,dy} = |ae - bc| = |A|,$$
where |A| is the determinant of A. The joint pdf of V and W is thus given by
$$f_{V,W}(v, w) = \frac{f_{X,Y}(x, y)}{|A|},$$
where x and y are related to (v, w) by Eq. (4.56). Let the n-dimensional vector Z be
$$\mathbf{Z} = A\mathbf{X},$$
where A is an n × n invertible matrix. The joint pdf of Z is then
$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{f_{\mathbf{X}}(A^{-1}\mathbf{z})}{|A|}.$$
EXPECTED VALUE OF FUNCTIONS OF RANDOM VARIABLES
The expected value of Z = g(X, Y) can be found using the following expressions:
$$E[Z] = \begin{cases} \displaystyle \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g(x, y)\,f_{X,Y}(x, y)\,dx\,dy, & X, Y \text{ jointly continuous,} \\[2ex] \displaystyle \sum_i \sum_n g(x_i, y_n)\,p_{X,Y}(x_i, y_n), & X, Y \text{ discrete.} \end{cases} \tag{4.64}$$
*Joint Characteristic Function
The joint characteristic function of n random variables is defined as
$$\Phi_{X_1, X_2, \dots, X_n}(w_1, w_2, \dots, w_n) = E\!\left[e^{j(w_1 X_1 + w_2 X_2 + \dots + w_n X_n)}\right]. \tag{4.73a}$$
For two random variables:
$$\Phi_{X,Y}(w_1, w_2) = E\!\left[e^{j(w_1 X + w_2 Y)}\right]. \tag{4.73b}$$
If X and Y are jointly continuous random variables, then
$$\Phi_{X,Y}(w_1, w_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f_{X,Y}(x, y)\,e^{j(w_1 x + w_2 y)}\,dx\,dy. \tag{4.73c}$$
The inversion formula for the Fourier transform implies that the joint pdf is given by
$$f_{X,Y}(x, y) = \frac{1}{4\pi^2} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \Phi_{X,Y}(w_1, w_2)\,e^{-j(w_1 x + w_2 y)}\,dw_1\,dw_2. \tag{4.74}$$
JOINTLY GAUSSIAN RANDOM VARIABLES
The random variables X and Y are said to be jointly Gaussian if their joint pdf has the form
$$f_{X,Y}(x, y) = \frac{\exp\!\left\{ \dfrac{-1}{2(1 - \rho_{X,Y}^2)} \left[ \left(\dfrac{x - m_1}{\sigma_1}\right)^{\!2} - \dfrac{2\rho_{X,Y}(x - m_1)(y - m_2)}{\sigma_1 \sigma_2} + \left(\dfrac{y - m_2}{\sigma_2}\right)^{\!2} \right] \right\}}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho_{X,Y}^2}}, \quad -\infty < x, y < \infty. \tag{4.79}$$
The pdf is constant for values x and y for which the argument of the exponent is constant:
$$\left(\frac{x - m_1}{\sigma_1}\right)^{\!2} - \frac{2\rho_{X,Y}(x - m_1)(y - m_2)}{\sigma_1 \sigma_2} + \left(\frac{y - m_2}{\sigma_2}\right)^{\!2} = \text{constant},$$
i.e. the contours of equal pdf value form an ellipse.
When ρ_{X,Y} = 0, X and Y are independent; when ρ_{X,Y} ≠ 0, the major axis of the ellipse is oriented along the angle
$$\theta = \frac{1}{2} \arctan\!\left( \frac{2\rho_{X,Y}\,\sigma_1 \sigma_2}{\sigma_1^2 - \sigma_2^2} \right). \tag{4.80}$$
Note that the angle is 45º when the variances are equal.
The marginal pdf of X is found by integrating f_{X,Y}(x, y) over all y:
$$f_X(x) = \frac{e^{-(x - m_1)^2 / 2\sigma_1^2}}{\sqrt{2\pi}\,\sigma_1}, \tag{4.81}$$
that is, X is a Gaussian random variable with mean m1 and variance σ1².
n Jointly Gaussian Random Variables
The random variables X1, X2, ..., Xn are said to be jointly Gaussian if their joint pdf is given by
$$f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = \frac{\exp\!\left\{ -\tfrac{1}{2} (\mathbf{x} - \mathbf{m})^{T} K^{-1} (\mathbf{x} - \mathbf{m}) \right\}}{(2\pi)^{n/2}\, |K|^{1/2}}, \tag{4.83}$$
where x and m are column vectors defined by
$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad \mathbf{m} = \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_n \end{pmatrix} = \begin{pmatrix} E[X_1] \\ E[X_2] \\ \vdots \\ E[X_n] \end{pmatrix},$$
and K is the covariance matrix that is defined by
$$K = \begin{pmatrix} \mathrm{VAR}(X_1) & \mathrm{COV}(X_1, X_2) & \cdots & \mathrm{COV}(X_1, X_n) \\ \mathrm{COV}(X_2, X_1) & \mathrm{VAR}(X_2) & \cdots & \mathrm{COV}(X_2, X_n) \\ \vdots & \vdots & & \vdots \\ \mathrm{COV}(X_n, X_1) & \mathrm{COV}(X_n, X_2) & \cdots & \mathrm{VAR}(X_n) \end{pmatrix}. \tag{4.84}$$
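A sketch evaluating Eq. (4.83) for n = 2 (the mean vector, covariance matrix, and test point are illustrative assumptions; scipy is an assumed tool):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Evaluate the jointly Gaussian pdf (4.83) directly and via scipy.
m = np.array([1.0, -2.0])             # mean vector
K = np.array([[2.0, 0.6],
              [0.6, 1.0]])            # covariance matrix

x = np.array([0.5, -1.5])
d = x - m
pdf_manual = np.exp(-0.5 * d @ np.linalg.inv(K) @ d) / (
    (2 * np.pi) ** (len(m) / 2) * np.sqrt(np.linalg.det(K)))
pdf_scipy = multivariate_normal(mean=m, cov=K).pdf(x)
print(pdf_manual, pdf_scipy)          # identical
```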
Transformations of Random Vectors
Let X1, ..., Xn be random variables associated with some experiment, and let the random variables Z1, ..., Zn be defined by n functions of X = (X1, ..., Xn):
$$Z_1 = g_1(\mathbf{X}), \quad Z_2 = g_2(\mathbf{X}), \quad \dots, \quad Z_n = g_n(\mathbf{X}).$$
The joint cdf of Z1, ..., Zn at the point z = (z1, ..., zn) is equal to the probability of the region of x where $g_k(\mathbf{x}) \le z_k$ for $k = 1, \dots, n$:
$$F_{Z_1, \dots, Z_n}(z_1, \dots, z_n) = P[g_1(\mathbf{X}) \le z_1, \dots, g_n(\mathbf{X}) \le z_n] \tag{4.55a}$$
$$F_{Z_1, \dots, Z_n}(z_1, \dots, z_n) = \int\!\cdots\!\int_{\mathbf{x'}:\; g_k(\mathbf{x'}) \le z_k} f_{X_1, \dots, X_n}(x_1', \dots, x_n')\,dx_1' \cdots dx_n'. \tag{4.55b}$$
Stochastic Processes
Let ξ denote the random outcome of an experiment. To every such outcome suppose a waveform X(t, ξ) is assigned. The collection of such waveforms forms a stochastic process. The set of {ξ_k} and the time index t can be continuous or discrete (countably infinite or finite) as well.
For fixed ξ_i ∈ S (the set of all experimental outcomes), X(t, ξ_i) is a specific time function. For fixed t = t_1, X_1 = X(t_1, ξ) is a random variable. The ensemble of all such realizations X(t, ξ) over time represents the stochastic process X(t). (Fig. 14.1 shows an ensemble of realizations X(t, ξ_1), X(t, ξ_2), ..., X(t, ξ_n) plotted against t.) For example,
$$X(t) = a\cos(\omega_0 t + \varphi),$$
with φ a random variable, represents a stochastic process.
If X(t) is a stochastic process, then for fixed t, X(t) represents a random variable. Its distribution function is given by
$$F_X(x, t) = P\{X(t) \le x\}.$$
Notice that $F_X(x, t)$ depends on t, since for a different t we obtain a different random variable. Further,
$$f_X(x, t) = \frac{dF_X(x, t)}{dx}$$
represents the first-order probability density function of the process X(t).
For t = t1 and t = t2, X(t) represents two different random variables X1 = X(t1) and X2 = X(t2) respectively. Their joint distribution is given by
$$F_X(x_1, x_2, t_1, t_2) = P\{X(t_1) \le x_1,\; X(t_2) \le x_2\},$$
and
$$f_X(x_1, x_2, t_1, t_2) = \frac{\partial^2 F_X(x_1, x_2, t_1, t_2)}{\partial x_1\, \partial x_2}$$
represents the second-order density function of the process X(t). Similarly $f_X(x_1, x_2, \dots, x_n, t_1, t_2, \dots, t_n)$ represents the nth-order density function of the process X(t). Complete specification of the stochastic process X(t) requires the knowledge of $f_X(x_1, x_2, \dots, x_n, t_1, t_2, \dots, t_n)$ for all $t_i,\; i = 1, 2, \dots, n$ and for all n (an almost impossible task in reality).
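A small sketch generating an ensemble for the cosine example (the uniform phase and parameter values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)

# Ensemble of realizations of X(t) = a cos(w0 t + phi): each outcome
# phi_k of the experiment selects one complete waveform X(t, phi_k).
a, w0 = 1.0, 2 * np.pi
t = np.linspace(0, 2, 401)

phi = rng.uniform(0, 2 * np.pi, size=5)              # 5 outcomes
ensemble = a * np.cos(w0 * t[None, :] + phi[:, None])
print(ensemble.shape)     # (5, 401): 5 realizations over 401 time points

# For fixed t (here t = t[100]), X(t) is a random variable across the ensemble:
print(ensemble[:, 100])
```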
Mean of a Stochastic Process:
$$\mu_X(t) = E\{X(t)\} = \int_{-\infty}^{\infty} x\,f_X(x, t)\,dx$$
represents the mean value of the process X(t). In general, the mean of a process can depend on the time index t.
Autocorrelation function of a process X(t) is defined as
$$R_{XX}(t_1, t_2) = E\{X(t_1)\,X^*(t_2)\} = \int\!\!\int x_1 x_2^*\, f_X(x_1, x_2, t_1, t_2)\,dx_1\,dx_2,$$
and it represents the interrelationship between the random variables X1 = X(t1) and X2 = X(t2) generated from the process X(t).
Properties:
1. $R_{XX}(t_1, t_2) = R_{XX}^*(t_2, t_1) = [E\{X(t_2)\,X^*(t_1)\}]^*$
2. $R_{XX}(t, t) = E\{|X(t)|^2\} \ge 0.$
3. $R_{XX}(t_1, t_2)$ represents a nonnegative definite function, i.e., for any set of constants $\{a_i\}_{i=1}^{n}$,
$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i\, a_j^*\, R_{XX}(t_i, t_j) \ge 0. \tag{14-8}$$
Eq. (14-8) follows by noticing that $E\{|Y|^2\} \ge 0$ for $Y = \sum_{i=1}^{n} a_i X(t_i)$.
The function
$$C_{XX}(t_1, t_2) = R_{XX}(t_1, t_2) - \mu_X(t_1)\,\mu_X^*(t_2) \tag{14-9}$$
represents the autocovariance function of the process X(t).
Example 14.1
Let
$$z = \int_{-T}^{T} X(t)\,dt.$$
Then
$$E[|z|^2] = \int_{-T}^{T}\!\int_{-T}^{T} E\{X(t_1)\,X^*(t_2)\}\,dt_1\,dt_2 = \int_{-T}^{T}\!\int_{-T}^{T} R_{XX}(t_1, t_2)\,dt_1\,dt_2. \tag{14-10}$$
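A sketch estimating the mean and autocorrelation by ensemble averages (the cosine process with uniform phase is an assumed example):

```python
import numpy as np

rng = np.random.default_rng(6)

# Ensemble estimates of the mean and autocorrelation of
# X(t) = cos(w0 t + phi) with phi ~ U(0, 2 pi).
w0, n_real = 2 * np.pi, 200_000
t1, t2 = 0.3, 0.8

phi = rng.uniform(0, 2 * np.pi, size=n_real)
x1 = np.cos(w0 * t1 + phi)
x2 = np.cos(w0 * t2 + phi)

print(x1.mean())                        # ~0: mean is constant (zero)
print(np.mean(x1 * x2))                 # ensemble-average R_XX(t1, t2)
print(0.5 * np.cos(w0 * (t1 - t2)))     # analytic: depends only on t1 - t2
```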
Stationary Stochastic Processes
Stationary processes exhibit statistical properties that are invariant to a shift in the time index. Thus, for example, second-order stationarity implies that the statistical properties of the pairs {X(t1), X(t2)} and {X(t1+c), X(t2+c)} are the same for any c. Similarly, first-order stationarity implies that the statistical properties of X(ti) and X(ti+c) are the same for any c.
In strict terms, the statistical properties are governed by the joint probability density function. Hence a process is nth-order Strict-Sense Stationary (S.S.S) if
$$f_X(x_1, x_2, \dots, x_n, t_1, t_2, \dots, t_n) = f_X(x_1, x_2, \dots, x_n, t_1 + c, t_2 + c, \dots, t_n + c) \tag{14-14}$$
for any c, where the left side represents the joint density function of the random variables $X_1 = X(t_1),\, X_2 = X(t_2),\, \dots,\, X_n = X(t_n)$ and the right side corresponds to the joint density function of the random variables $X_1' = X(t_1 + c),\, X_2' = X(t_2 + c),\, \dots,\, X_n' = X(t_n + c)$. A process X(t) is said to be strict-sense stationary if (14-14) is true for all $t_i,\; i = 1, 2, \dots, n$, $n = 1, 2, \dots$, and any c.
For a first-order strict-sense stationary process, from (14-14) we have
$$f_X(x, t) = f_X(x, t + c) \tag{14-15}$$
for any c. In particular, c = –t gives
$$f_X(x, t) = f_X(x), \tag{14-16}$$
i.e., the first-order density of X(t) is independent of t. In that case
$$E[X(t)] = \int_{-\infty}^{\infty} x\,f_X(x)\,dx = \mu, \quad \text{a constant.} \tag{14-17}$$
Similarly, for a second-order strict-sense stationary process we have from (14-14)
$$f_X(x_1, x_2, t_1, t_2) = f_X(x_1, x_2, t_1 + c, t_2 + c)$$
for any c. For c = –t2 we get
$$f_X(x_1, x_2, t_1, t_2) = f_X(x_1, x_2, t_1 - t_2), \tag{14-18}$$
i.e., the second-order density function of a strict-sense stationary process depends only on the difference of the time indices $t_1 - t_2$. In that case the autocorrelation function is given by
$$R_{XX}(t_1, t_2) = E\{X(t_1)\,X^*(t_2)\} = \int\!\!\int x_1 x_2^*\, f_X(x_1, x_2, \tau = t_1 - t_2)\,dx_1\,dx_2 = R_{XX}(t_1 - t_2) = R_{XX}(\tau) = R_{XX}^*(-\tau), \tag{14-19}$$
i.e., the autocorrelation function of a second-order strict-sense stationary process depends only on the difference of the time indices $t_1 - t_2$. Notice that (14-17) and (14-19) are consequences of the stochastic process being first and second-order strict-sense stationary. On the other hand, the basic conditions for the first and second order stationarity – Eqs. (14-16) and (14-18) – are usually difficult to verify. In that case, we often resort to a looser definition of stationarity, known as Wide-Sense Stationarity (W.S.S), by making use of
(14-17) and (14-19) as the necessary conditions. Thus, a process X(t) is said to be Wide-Sense Stationary if
(i) $E\{X(t)\} = \mu$, a constant, (14-20)
and
(ii) $E\{X(t_1)\,X^*(t_2)\} = R_{XX}(t_1 - t_2)$, (14-21)
i.e., for wide-sense stationary processes, the mean is a constant and the autocorrelation function depends only on the difference between the time indices. Notice that (14-20)-(14-21) do not say anything about the nature of the probability density functions, and instead deal with the average behavior of the process. Since (14-20)-(14-21) follow from (14-16) and (14-18), strict-sense stationarity always implies wide-sense stationarity. However, the converse is not true in general, the only exception being the Gaussian process. This follows, since if X(t) is a Gaussian process, then by definition $X_1 = X(t_1),\, X_2 = X(t_2),\, \dots,\, X_n = X(t_n)$ are jointly Gaussian random variables for any $t_1, t_2, \dots, t_n$, whose joint characteristic function is given by
$$\Phi_X(\omega_1, \omega_2, \dots, \omega_n) = e^{\,j \sum_{k=1}^{n} \mu_X(t_k)\,\omega_k \;-\; \frac{1}{2} \sum_{l=1}^{n}\sum_{k=1}^{n} C_{XX}(t_l, t_k)\,\omega_l \omega_k}, \tag{14-22}$$
where $C_{XX}(t_i, t_k)$ is as defined in (14-9). If X(t) is wide-sense stationary, then using (14-20)-(14-21) in (14-22) we get
$$\Phi_X(\omega_1, \omega_2, \dots, \omega_n) = e^{\,j \mu \sum_{k=1}^{n} \omega_k \;-\; \frac{1}{2} \sum_{l=1}^{n}\sum_{k=1}^{n} C_{XX}(t_l - t_k)\,\omega_l \omega_k}, \tag{14-23}$$
and hence if the set of time indices are shifted by a constant c to generate a new set of jointly Gaussian random variables $X_1' = X(t_1 + c),\, X_2' = X(t_2 + c),\, \dots,\, X_n' = X(t_n + c)$, then their joint characteristic function is identical to (14-23). Thus the sets of random variables $\{X_i\}_{i=1}^{n}$ and $\{X_i'\}_{i=1}^{n}$ have the same joint probability distribution for all n and all c, establishing the strict-sense stationarity of Gaussian processes from their wide-sense stationarity.
To summarize: if X(t) is a Gaussian process, then wide-sense stationarity (w.s.s) ⇒ strict-sense stationarity (s.s.s).
Notice that this holds because the joint p.d.f. of Gaussian random variables depends only on their second-order statistics, which is also the basis of wide-sense stationarity.
Systems with Stochastic Inputs
A deterministic system¹ transforms each input waveform X(t, ξ_i) into an output waveform Y(t, ξ_i) = T[X(t, ξ_i)] by operating only on the time variable t. Thus a set of realizations at the input corresponding to a process X(t) generates a new set of realizations {Y(t, ξ)} at the output associated with a new process Y(t). (Fig. 14.3: X(t) → T[·] → Y(t).)
Our goal is to study the output process statistics in terms of the input process statistics and the system function.
¹A stochastic system, on the other hand, operates on both the variables t and ξ.
Linear Systems: L[·] represents a linear system if
$$L\{a_1 X(t_1) + a_2 X(t_2)\} = a_1 L\{X(t_1)\} + a_2 L\{X(t_2)\}. \tag{14-28}$$
Let
$$Y(t) = L\{X(t)\} \tag{14-29}$$
represent the output of a linear system.
Time-Invariant System: L[·] represents a time-invariant system if
$$Y(t) = L\{X(t)\} \;\Rightarrow\; L\{X(t - t_0)\} = Y(t - t_0), \tag{14-30}$$
i.e., a shift in the input results in the same shift in the output. If L[·] satisfies both (14-28) and (14-30), then it corresponds to a linear time-invariant (LTI) system.
LTI systems can be uniquely represented in terms of their output to a delta function: an impulse δ(t) applied to an LTI system produces h(t), the impulse response of the system (Fig. 14.5).
Eq. (14-31), the convolution representation
$$Y(t) = \int_{-\infty}^{\infty} h(\tau)\,X(t - \tau)\,d\tau = \int_{-\infty}^{\infty} h(t - \tau)\,X(\tau)\,d\tau, \tag{14-31}$$
follows by expressing X(t) as
$$X(t) = \int_{-\infty}^{\infty} X(\tau)\,\delta(t - \tau)\,d\tau \tag{14-32}$$
and applying (14-28) and (14-30) to $Y(t) = L\{X(t)\}$. Thus
$$Y(t) = L\{X(t)\} = L\left\{\int_{-\infty}^{\infty} X(\tau)\,\delta(t - \tau)\,d\tau\right\} = \int_{-\infty}^{\infty} X(\tau)\,L\{\delta(t - \tau)\}\,d\tau \quad \text{(by linearity)}$$
$$= \int_{-\infty}^{\infty} X(\tau)\,h(t - \tau)\,d\tau \quad \text{(by time-invariance)} \;=\; \int_{-\infty}^{\infty} h(\tau)\,X(t - \tau)\,d\tau. \tag{14-33}$$
Thus an arbitrary input X(t) applied to an LTI system with impulse response h(t) produces the output Y(t) given by the convolution above (Fig. 14.6).
Output Statistics: Using (14-33), the mean of the output process is given by
$$\mu_Y(t) = E\{Y(t)\} = E\left\{\int_{-\infty}^{\infty} X(t - \tau)\,h(\tau)\,d\tau\right\} = \int_{-\infty}^{\infty} \mu_X(t - \tau)\,h(\tau)\,d\tau = \mu_X(t) * h(t). \tag{14-34}$$
Similarly, the cross-correlation function between the input and output processes is given by
$$R_{XY}(t_1, t_2) = E\{X(t_1)\,Y^*(t_2)\} = E\left\{X(t_1) \int_{-\infty}^{\infty} X^*(t_2 - \alpha)\,h^*(\alpha)\,d\alpha\right\}$$
$$= \int_{-\infty}^{\infty} E\{X(t_1)\,X^*(t_2 - \alpha)\}\,h^*(\alpha)\,d\alpha = \int_{-\infty}^{\infty} R_{XX}(t_1, t_2 - \alpha)\,h^*(\alpha)\,d\alpha = R_{XX}(t_1, t_2) * h^*(t_2). \tag{14-35}$$
Finally, the output autocorrelation function is given by
$$R_{YY}(t_1, t_2) = E\{Y(t_1)\,Y^*(t_2)\} = E\left\{\int_{-\infty}^{\infty} X(t_1 - \beta)\,h(\beta)\,d\beta\;\, Y^*(t_2)\right\}$$
$$= \int_{-\infty}^{\infty} E\{X(t_1 - \beta)\,Y^*(t_2)\}\,h(\beta)\,d\beta = \int_{-\infty}^{\infty} R_{XY}(t_1 - \beta, t_2)\,h(\beta)\,d\beta = R_{XY}(t_1, t_2) * h(t_1), \tag{14-36}$$
or
$$R_{YY}(t_1, t_2) = R_{XX}(t_1, t_2) * h^*(t_2) * h(t_1). \tag{14-37}$$
(Fig. 14.7: (a) $X(t) \to h(t) \to Y(t)$; (b) $R_{XX}(t_1, t_2) \to h^*(t_2) \to R_{XY}(t_1, t_2) \to h(t_1) \to R_{YY}(t_1, t_2)$.)
In particular, if X(t) is wide-sense stationary, then we have $\mu_X(t) = \mu_X$, so that from (14-34)
$$\mu_Y(t) = \mu_X \int_{-\infty}^{\infty} h(\tau)\,d\tau = c, \quad \text{a constant.} \tag{14-38}$$
Also $R_{XX}(t_1, t_2) = R_{XX}(t_1 - t_2)$, so that (14-35) reduces to
$$R_{XY}(t_1, t_2) = \int_{-\infty}^{\infty} R_{XX}(t_1 - t_2 + \alpha)\,h^*(\alpha)\,d\alpha = R_{XX}(\tau) * h^*(-\tau) = R_{XY}(\tau), \quad \tau = t_1 - t_2. \tag{14-39}$$
Thus X(t) and Y(t) are jointly w.s.s. Further, from (14-36), the output autocorrelation simplifies to
$$R_{YY}(t_1, t_2) = \int_{-\infty}^{\infty} R_{XY}(t_1 - \beta - t_2)\,h(\beta)\,d\beta = R_{XY}(\tau) * h(\tau) = R_{YY}(\tau). \tag{14-40}$$
From (14-37), we obtain
$$R_{YY}(\tau) = R_{XX}(\tau) * h^*(-\tau) * h(\tau). \tag{14-41}$$
From (14-38)-(14-40), the output process is also wide-sense stationary. This gives rise to the following representation (Fig. 14.8):
(a) an LTI system h(t) driven by a wide-sense stationary process X(t) produces a wide-sense stationary process Y(t) at the output;
(b) a linear system driven by a strict-sense stationary process X(t) produces a strict-sense stationary process Y(t) at the output (see text for proof);
(c) an LTI system h(t) driven by a Gaussian process (also stationary) X(t) produces a Gaussian process (also stationary) Y(t) at the output.
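A discrete-time sketch of statement (a) (the FIR filter and white-noise input are illustrative assumptions; scipy is an assumed tool):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(7)

# White (w.s.s.) noise through a discrete LTI filter: the output is w.s.s.,
# and R_YY depends only on the lag (estimated here by time averages).
x = rng.standard_normal(2_000_000)
h = np.array([1.0, 0.5, 0.25])          # FIR impulse response
y = lfilter(h, [1.0], x)

def autocorr(z, lag):
    return np.mean(z[lag:] * z[:len(z) - lag])

# For unit-variance white input, R_YY(lag) = sum_k h[k] h[k + lag]:
for lag in range(3):
    print(autocorr(y, lag), np.sum(h[:len(h) - lag] * h[lag:]))
```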
Discrete-Time Stochastic Processes:
A discrete-time stochastic process Xn = X(nT) is a sequence of random variables. The mean, autocorrelation and autocovariance functions of a discrete-time process are given by
$$\mu_n = E\{X(nT)\} \tag{14-57}$$
$$R(n_1, n_2) = E\{X(n_1 T)\,X^*(n_2 T)\} \tag{14-58}$$
and
$$C(n_1, n_2) = R(n_1, n_2) - \mu_{n_1}\,\mu_{n_2}^* \tag{14-59}$$
respectively. As before, the strict-sense stationarity and wide-sense stationarity definitions apply here also. For example, X(nT) is wide-sense stationary if
$$E\{X(nT)\} = \mu, \quad \text{a constant,} \tag{14-60}$$
and
$$E[X\{(k + n)T\}\,X^*\{kT\}] = R(n) = r_n = r_{-n}^*. \tag{14-61}$$
Power Spectrum
For a deterministic signal x(t), the spectrum is well defined: if $X(\omega)$ represents its Fourier transform, i.e., if
$$X(\omega) = \int_{-\infty}^{\infty} x(t)\,e^{-j\omega t}\,dt, \tag{18-1}$$
then $|X(\omega)|^2$ represents its energy spectrum. This follows from Parseval's theorem, since the signal energy is given by
$$\int_{-\infty}^{\infty} |x(t)|^2\,dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |X(\omega)|^2\,d\omega = E. \tag{18-2}$$
Thus $|X(\omega)|^2\,\Delta\omega$ represents the signal energy in the band $(\omega, \omega + \Delta\omega)$ (see Fig. 18.1, which shows a signal X(t) and the energy $|X(\omega)|^2$ in the band $(\omega, \omega + \Delta\omega)$).
However, for stochastic processes a direct application of (18-1) generates a sequence of random variables for every ω. Moreover, for a stochastic process, E{|X(t)|²} represents the ensemble average power (instantaneous energy) at the instant t.
To obtain the spectral distribution of power versus frequency for stochastic processes, it is best to avoid infinite intervals to begin with, and start with a finite interval (–T, T) in (18-1). Formally, the partial Fourier transform of a process X(t) based on (–T, T) is given by
$$X_T(\omega) = \int_{-T}^{T} X(t)\,e^{-j\omega t}\,dt, \tag{18-3}$$
so that
$$\frac{|X_T(\omega)|^2}{2T} = \frac{1}{2T}\left| \int_{-T}^{T} X(t)\,e^{-j\omega t}\,dt \right|^2 \tag{18-4}$$
represents the power distribution associated with that realization based on (–T, T). Notice that (18-4) represents a random variable for every ω, and its ensemble average gives the average power distribution based on (–T, T). Thus
$$P_T(\omega) = E\left\{\frac{|X_T(\omega)|^2}{2T}\right\} = \frac{1}{2T} \int_{-T}^{T}\!\int_{-T}^{T} E\{X(t_1)\,X^*(t_2)\}\,e^{-j\omega(t_1 - t_2)}\,dt_1\,dt_2 = \frac{1}{2T} \int_{-T}^{T}\!\int_{-T}^{T} R_{XX}(t_1, t_2)\,e^{-j\omega(t_1 - t_2)}\,dt_1\,dt_2 \tag{18-5}$$
represents the power distribution of X(t) based on (–T, T). For wide-sense stationary (w.s.s) processes, it is possible to further simplify (18-5). Thus if X(t) is assumed to be w.s.s, then $R_{XX}(t_1, t_2) = R_{XX}(t_1 - t_2)$ and (18-5) simplifies to
$$P_T(\omega) = \frac{1}{2T} \int_{-T}^{T}\!\int_{-T}^{T} R_{XX}(t_1 - t_2)\,e^{-j\omega(t_1 - t_2)}\,dt_1\,dt_2.$$
Let $\tau = t_1 - t_2$ and, proceeding as in (14-24), we get
$$P_T(\omega) = \frac{1}{2T} \int_{-2T}^{2T} R_{XX}(\tau)\,e^{-j\omega\tau}\,(2T - |\tau|)\,d\tau = \int_{-2T}^{2T} R_{XX}(\tau)\,e^{-j\omega\tau}\left(1 - \frac{|\tau|}{2T}\right)d\tau \ge 0 \tag{18-6}$$
to be the power distribution of the w.s.s. process X(t) based on (–T, T). Finally, letting $T \to \infty$ in (18-6), we obtain
$$S_{XX}(\omega) = \lim_{T \to \infty} P_T(\omega) = \int_{-\infty}^{\infty} R_{XX}(\tau)\,e^{-j\omega\tau}\,d\tau \ge 0 \tag{18-7}$$
to be the power spectral density of the w.s.s process X(t). Notice that
$$R_{XX}(\tau) \;\xrightarrow{\;\mathcal{F}\;}\; S_{XX}(\omega) \ge 0, \tag{18-8}$$
i.e., the autocorrelation function and the power spectrum of a w.s.s process form a Fourier transform pair, a relation known as the Wiener-Khinchin Theorem. From (18-8), the inverse formula gives
$$R_{XX}(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_{XX}(\omega)\,e^{j\omega\tau}\,d\omega, \tag{18-9}$$
and in particular for $\tau = 0$ we get
$$\frac{1}{2\pi} \int_{-\infty}^{\infty} S_{XX}(\omega)\,d\omega = R_{XX}(0) = E\{|X(t)|^2\} = P, \quad \text{the total power.} \tag{18-10}$$
From (18-10), the area under $S_{XX}(\omega)$ represents the total power of the process X(t), and hence $S_{XX}(\omega)$ truly represents the power spectrum (Fig. 18.2).
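A discrete-time sketch of the Wiener-Khinchin relation (the MA(1) test process and block sizes are illustrative assumptions; scipy/numpy are assumed tools):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(8)

# For a w.s.s. discrete process, the Fourier transform of the autocorrelation
# should match the averaged periodogram |X_T|^2 / N (Wiener-Khinchin).
n, n_fft = 1_000_000, 256
x = lfilter([1.0, 0.7], [1.0], rng.standard_normal(n))   # MA(1) process

# Averaged periodogram over non-overlapping blocks:
blocks = x[: (n // n_fft) * n_fft].reshape(-1, n_fft)
psd = np.mean(np.abs(np.fft.fft(blocks, axis=1)) ** 2, axis=0) / n_fft

# Spectrum from the estimated autocorrelation R(0), R(1), R(2):
r = [np.mean(x[k:] * x[: n - k]) for k in range(3)]
w = 2 * np.pi * np.arange(n_fft) / n_fft
s_from_r = r[0] + 2 * (r[1] * np.cos(w) + r[2] * np.cos(2 * w))
print(psd[:4])
print(s_from_r[:4])   # close agreement
```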
If X(t) is a real w.s.s process, then $R_{XX}(\tau) = R_{XX}(-\tau)$, so that
$$S_{XX}(\omega) = \int_{-\infty}^{\infty} R_{XX}(\tau)\,e^{-j\omega\tau}\,d\tau = \int_{-\infty}^{\infty} R_{XX}(\tau)\cos(\omega\tau)\,d\tau = 2\int_{0}^{\infty} R_{XX}(\tau)\cos(\omega\tau)\,d\tau = S_{XX}(-\omega) \ge 0, \tag{18-13}$$
so that the power spectrum is an even function (in addition to being real and nonnegative).