Chapter 5
Estimation Theory and Applications
References:
S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice Hall, 1993
Application Areas
1. Radar
A radar system transmits an electromagnetic pulse $s(n)$. It is reflected by an aircraft, causing an echo $r(n)$ to be received after $\tau_0$ seconds:

$r(n) = \alpha s(n - \tau_0) + w(n)$

where the range $R$ of the aircraft is related to the time delay by

$\tau_0 = 2R/c$
2. Mobile Communications
The position of the mobile terminal can be estimated using the time-of-arrival measurements received at the base stations.
Differences from Detection
1. Radar
A radar system transmits an electromagnetic pulse $s(n)$. After some time, it receives a signal $r(n)$. The detection problem is to decide whether $r(n)$ is an echo from an object or not.
2. Communications
In wired or wireless communications, we need to know the information sent from the transmitter to the receiver.

e.g., a binary phase shift keying (BPSK) signal consists of only two symbols, "0" and "1". The detection problem is to decide whether the received symbol is "0" or "1".
4. Image Processing
Fingerprint authentication: given a fingerprint image whose owner claims to be "A", we need to verify whether the claim is true or not
Other biometric examples include face authentication, iris authentication,
etc.
5. Biomedical Engineering
17 Jan. 2003, Hong Kong Economics Times
e.g., given some X-ray slides, the detection problem is to determine whether the patient has breast cancer or not
6. Seismology
To detect whether or not there is oil in a region
Estimate the value of a resistance $R$ from a set of voltage and current readings:

$V[n] = V_{\rm actual} + w_1[n], \quad I[n] = I_{\rm actual} + w_2[n], \quad n = 0, 1, \ldots, N-1$

Given pairs of $(V[n], I[n])$, we need to estimate the resistance $R$; ideally, $R = V/I$.

⇒ the parameter is not directly observed in the received signals
Estimate the position of the mobile terminal using time-of-arrival measurements:

$r[n] = \dfrac{\sqrt{(x_s - x_n)^2 + (y_s - y_n)^2}}{c} + w[n], \quad n = 0, 1, \ldots, N-1$

Given $r[n]$, we need to find the mobile position $(x_s, y_s)$, where $c$ is the signal propagation speed and $(x_n, y_n)$ represents the known position of the $n$th base station.

⇒ the parameters are not directly observed in the received signals
Parameter is unknown deterministic or random

Unknown deterministic: constant but unknown (classical approach), e.g., the DC value $A$ is an unknown constant

Random: random variable with prior knowledge of its PDF (Bayesian approach), e.g., if we have prior knowledge that the DC value is bounded by $-A_0$ and $A_0$ with a particular PDF ⇒ better estimate

Parameter is stationary or changing

Stationary: unknown deterministic for the whole observation period, e.g., the time-of-arrivals of a static source

Changing: unknown deterministic at different time instants, e.g., the time-of-arrivals of a moving source
Performance Measures for Classical Parameter Estimation
Accuracy:
Is the estimator biased or unbiased?
e.g., $x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

where $w[n]$ is a zero-mean random noise with variance $\sigma_w^2$

Proposed estimators:

$\hat{A}_1 = x[0]$

$\hat{A}_2 = \dfrac{1}{N}\sum_{n=0}^{N-1} x[n]$

$\hat{A}_3 = \dfrac{1}{N-1}\sum_{n=0}^{N-1} x[n]$

$\hat{A}_4 = \left(\prod_{n=0}^{N-1} x[n]\right)^{1/N} = \left(x[0]\cdot x[1]\cdots x[N-1]\right)^{1/N}$
Biased: $E\{\hat{A}\} \ne A$

Unbiased: $E\{\hat{A}\} = A$

Asymptotically unbiased: $E\{\hat{A}\} = A$ only if $N \to \infty$

Taking the expected values for $\hat{A}_1$, $\hat{A}_2$ and $\hat{A}_3$, we have

$E\{\hat{A}_1\} = E\{x[0]\} = E\{A + w[0]\} = A + 0 = A$

$E\{\hat{A}_2\} = E\left\{\dfrac{1}{N}\sum_{n=0}^{N-1} x[n]\right\} = \dfrac{1}{N}\cdot N\cdot A + \dfrac{1}{N}\sum_{n=0}^{N-1} E\{w[n]\} = A + 0 = A$

$E\{\hat{A}_3\} = \dfrac{1}{1 - 1/N}\cdot A = \dfrac{N}{N-1}\cdot A$

Q. State the biasedness of $\hat{A}_1$, $\hat{A}_2$ and $\hat{A}_3$.
For $\hat{A}_4$, it is difficult to analyze the biasedness. However, for $w[n] = 0$:

$\left(x[0]\cdot x[1]\cdots x[N-1]\right)^{1/N} = (A\cdot A\cdots A)^{1/N} = (A^N)^{1/N} = A$

What is the value of the mean square error or variance?

They correspond to the fluctuation of the estimate in the second order:

$\text{MSE} = E\{(\hat{A} - A)^2\}$   (5.1)

$\text{var} = E\{(\hat{A} - E\{\hat{A}\})^2\}$   (5.2)

If the estimator is unbiased, then MSE = var
An optimum estimator should give estimates which are

Unbiased

Minimum variance (and hence minimum MSE as well)

Q. How do we know the estimator has the minimum variance?

Cramer-Rao Lower Bound (CRLB)

Performance bound in terms of the minimum achievable variance provided by any unbiased estimator

Used for classical parameter estimation

Requires knowledge of the noise PDF, and the PDF must have a closed form

Easier to determine than other variance bounds
Let the parameters to be estimated be $\boldsymbol{\theta} = [\theta_1, \theta_2, \ldots, \theta_P]^T$. The CRLB for $\theta_i$ in Gaussian noise is stated as follows:

$\text{CRLB}(\theta_i) = \left[\mathbf{J}(\boldsymbol{\theta})\right]_{i,i} = \left[\mathbf{I}^{-1}(\boldsymbol{\theta})\right]_{i,i}$   (5.4)

where

$\mathbf{I}(\boldsymbol{\theta}) = -\begin{bmatrix} E\left\{\dfrac{\partial^2 \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \theta_1^2}\right\} & E\left\{\dfrac{\partial^2 \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \theta_1 \partial \theta_2}\right\} & \cdots & E\left\{\dfrac{\partial^2 \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \theta_1 \partial \theta_P}\right\} \\ \vdots & \ddots & & \vdots \\ E\left\{\dfrac{\partial^2 \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \theta_P \partial \theta_1}\right\} & \cdots & & E\left\{\dfrac{\partial^2 \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \theta_P^2}\right\} \end{bmatrix}$   (5.5)
Review of Gaussian (Normal) Distribution
The Gaussian PDF for a scalar random variable $x$ is defined as

$p(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\dfrac{1}{2\sigma^2}(x-\mu)^2\right)$   (5.6)

We can write $x \sim \mathcal{N}(\mu, \sigma^2)$

The Gaussian PDF for a random vector $\mathbf{x}$ of size $N$ is defined as

$p(\mathbf{x}) = \dfrac{1}{(2\pi)^{N/2}\det^{1/2}(\mathbf{C})}\exp\left(-\dfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T\mathbf{C}^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)$   (5.7)

We can write $\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{C})$
The covariance matrix $\mathbf{C}$ has the form of

$\mathbf{C} = E\{(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^T\} = \begin{bmatrix} E\{(x[0]-\mu_0)^2\} & \cdots & E\{(x[0]-\mu_0)(x[N-1]-\mu_{N-1})\} \\ \vdots & \ddots & \vdots \\ E\{(x[N-1]-\mu_{N-1})(x[0]-\mu_0)\} & \cdots & E\{(x[N-1]-\mu_{N-1})^2\} \end{bmatrix}$   (5.8)

where

$\mathbf{x} = [x[0], x[1], \ldots, x[N-1]]^T$

$\boldsymbol{\mu} = E\{\mathbf{x}\} = [\mu_0, \mu_1, \ldots, \mu_{N-1}]^T$
If $\mathbf{x}$ is a zero-mean white vector and all vector elements have variance $\sigma^2$, then

$\mathbf{C} = E\{(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^T\} = \sigma^2\cdot\mathbf{I}_N = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \sigma^2 \end{bmatrix}$

The Gaussian PDF for the random vector $\mathbf{x}$ can then be simplified as

$p(\mathbf{x}) = \dfrac{1}{(2\pi\sigma^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma^2}\sum_{n=0}^{N-1} x^2[n]\right)$   (5.9)

with the use of $\mathbf{C}^{-1} = \sigma^{-2}\cdot\mathbf{I}_N$ and $\det(\mathbf{C}) = (\sigma^2)^N = \sigma^{2N}$
Example 5.1
Determine the PDF of

$x[0] = A + w[0]$

and

$x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

where $\{w[n]\}$ is a white Gaussian process with known variance $\sigma_w^2$ and $A$ is a constant.

$p(x[0]; A) = \dfrac{1}{\sqrt{2\pi\sigma_w^2}}\exp\left(-\dfrac{1}{2\sigma_w^2}(x[0]-A)^2\right)$

$p(\mathbf{x}; A) = \dfrac{1}{(2\pi\sigma_w^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right)$
Example 5.2
Find the CRLB for $A$ based on a single measurement:

$x[0] = A + w[0]$

$p(x[0]; A) = \dfrac{1}{\sqrt{2\pi\sigma_w^2}}\exp\left(-\dfrac{1}{2\sigma_w^2}(x[0]-A)^2\right)$

$\Rightarrow \ln(p(x[0]; A)) = -\ln\left(\sqrt{2\pi\sigma_w^2}\right) - \dfrac{1}{2\sigma_w^2}(x[0]-A)^2$

$\Rightarrow \dfrac{\partial \ln(p(x[0]; A))}{\partial A} = -\dfrac{1}{2\sigma_w^2}\cdot 2\cdot(x[0]-A)\cdot(-1) = \dfrac{x[0]-A}{\sigma_w^2}$

$\Rightarrow \dfrac{\partial^2 \ln(p(x[0]; A))}{\partial A^2} = -\dfrac{1}{\sigma_w^2}$
As a result,
$-E\left\{\dfrac{\partial^2 \ln(p(x[0]; A))}{\partial A^2}\right\} = \dfrac{1}{\sigma_w^2}$

$\Rightarrow I(A) = \dfrac{1}{\sigma_w^2} \Rightarrow J(A) = \text{CRLB}(A) = \sigma_w^2$

This means the best we can do is to achieve estimator variance $= \sigma_w^2$, or

$\text{var}(\hat{A}) \ge \sigma_w^2$

where $\hat{A}$ is any unbiased estimator for estimating $A$
We also observe that a simple unbiased estimator
$\hat{A}_1 = x[0]$

achieves the CRLB:

$E\{(\hat{A}_1 - A)^2\} = E\{(x[0]-A)^2\} = E\{(A + w[0] - A)^2\} = E\{w^2[0]\} = \sigma_w^2$

Example 5.3

Find the CRLB for $A$ based on $N$ measurements:

$x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

$p(\mathbf{x}; A) = \dfrac{1}{(2\pi\sigma_w^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right)$
$\Rightarrow \ln(p(\mathbf{x}; A)) = -\ln\left((2\pi\sigma_w^2)^{N/2}\right) - \dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)^2$

$\Rightarrow \dfrac{\partial \ln(p(\mathbf{x}; A))}{\partial A} = -\dfrac{1}{2\sigma_w^2}\cdot 2\cdot\sum_{n=0}^{N-1}(x[n]-A)\cdot(-1) = \dfrac{1}{\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)$

$\Rightarrow \dfrac{\partial^2 \ln(p(\mathbf{x}; A))}{\partial A^2} = -\dfrac{N}{\sigma_w^2}$

$\Rightarrow -E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; A))}{\partial A^2}\right\} = \dfrac{N}{\sigma_w^2}$
As a result,
$I(A) = \dfrac{N}{\sigma_w^2} \Rightarrow J(A) = \text{CRLB}(A) = \dfrac{\sigma_w^2}{N}$

This means the best we can do is to achieve estimator variance $= \sigma_w^2/N$, or

$\text{var}(\hat{A}) \ge \dfrac{\sigma_w^2}{N}$

where $\hat{A}$ is any unbiased estimator for estimating $A$
We also observe that a simple unbiased estimator
$\hat{A}_1 = x[0]$

does not achieve the CRLB:

$E\{(\hat{A}_1 - A)^2\} = E\{(x[0]-A)^2\} = E\{(A + w[0] - A)^2\} = E\{w^2[0]\} = \sigma_w^2$

On the other hand, the sample mean estimator

$\hat{A}_2 = \dfrac{1}{N}\sum_{n=0}^{N-1} x[n]$

achieves the CRLB:

$E\{(\hat{A}_2 - A)^2\} = E\left\{\left(\dfrac{1}{N}\sum_{n=0}^{N-1} x[n] - A\right)^2\right\} = E\left\{\left(\dfrac{1}{N}\sum_{n=0}^{N-1} w[n]\right)^2\right\} = \dfrac{\sigma_w^2}{N}$
⇒ sample mean is the optimum estimator for white Gaussian noise
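A minimal MATLAB sketch (illustrative, not from the slides; all values assumed) to verify that the sample mean variance matches the CRLB $\sigma_w^2/N$:

% Compare the empirical variance of the sample mean with the CRLB
A = 1; N = 50; sw2 = 2; M = 20000;        % assumed values
Ahat = zeros(M,1);
for m = 1:M
    x = A + sqrt(sw2)*randn(N,1);
    Ahat(m) = mean(x);                    % sample mean estimate
end
[var(Ahat) sw2/N]                         % empirical variance vs CRLB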
Example 5.4
Find the CRLB for $A$ and $\sigma_w^2$ given $\{x[n]\}$:

$x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

Here $\boldsymbol{\theta} = [A, \sigma_w^2]^T$ and

$p(\mathbf{x}; \boldsymbol{\theta}) = \dfrac{1}{(2\pi\sigma_w^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right)$

$\Rightarrow \ln(p(\mathbf{x}; \boldsymbol{\theta})) = -\dfrac{N}{2}\ln(2\pi) - \dfrac{N}{2}\ln(\sigma_w^2) - \dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)^2$

$\Rightarrow \dfrac{\partial \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A} = \dfrac{1}{\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)$
$\Rightarrow \dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A^2} = -\dfrac{N}{\sigma_w^2}$

$\Rightarrow -E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A^2}\right\} = \dfrac{N}{\sigma_w^2}$

$\Rightarrow \dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A\,\partial \sigma_w^2} = -\dfrac{1}{\sigma_w^4}\sum_{n=0}^{N-1}(x[n]-A) = -\dfrac{1}{\sigma_w^4}\sum_{n=0}^{N-1} w[n]$

$\Rightarrow -E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A\,\partial \sigma_w^2}\right\} = \dfrac{1}{\sigma_w^4}\sum_{n=0}^{N-1} E\{w[n]\} = 0$
$\ln(p(\mathbf{x}; \boldsymbol{\theta})) = -\dfrac{N}{2}\ln(2\pi) - \dfrac{N}{2}\ln(\sigma_w^2) - \dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)^2$

$\Rightarrow \dfrac{\partial \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial \sigma_w^2} = -\dfrac{N}{2\sigma_w^2} + \dfrac{1}{2\sigma_w^4}\sum_{n=0}^{N-1}(x[n]-A)^2$

$\Rightarrow \dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial (\sigma_w^2)^2} = \dfrac{N}{2\sigma_w^4} - \dfrac{1}{\sigma_w^6}\sum_{n=0}^{N-1}(w[n])^2$

$\Rightarrow -E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial (\sigma_w^2)^2}\right\} = -\dfrac{N}{2\sigma_w^4} + \dfrac{1}{\sigma_w^6}\cdot N\sigma_w^2 = \dfrac{N}{2\sigma_w^4}$

$\mathbf{I}(\boldsymbol{\theta}) = \begin{bmatrix} \dfrac{N}{\sigma_w^2} & 0 \\ 0 & \dfrac{N}{2\sigma_w^4} \end{bmatrix}$
$\mathbf{J}(\boldsymbol{\theta}) = \mathbf{I}^{-1}(\boldsymbol{\theta}) = \begin{bmatrix} \dfrac{\sigma_w^2}{N} & 0 \\ 0 & \dfrac{2\sigma_w^4}{N} \end{bmatrix}$

$\Rightarrow \text{CRLB}(A) = \dfrac{\sigma_w^2}{N}, \quad \text{CRLB}(\sigma_w^2) = \dfrac{2\sigma_w^4}{N}$

⇒ the CRLBs for $A$ with unknown and with known noise power are identical

Q. The CRLB for $A$ is not affected by knowledge of the noise power. Why?

Q. Can you suggest a method to estimate $\sigma_w^2$?
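As a hedged illustration of the last question, the sample variance is one candidate estimator (it is in fact the ML estimator derived later in Example 5.12). A minimal MATLAB sketch (assumed values, not from the slides) compares its spread with $\text{CRLB}(\sigma_w^2) = 2\sigma_w^4/N$:

% Estimate sigma_w^2 by the sample variance and compare with its CRLB
A = 1; N = 100; sw2 = 2; M = 20000;       % assumed values
s2hat = zeros(M,1);
for m = 1:M
    x = A + sqrt(sw2)*randn(N,1);
    s2hat(m) = mean((x - mean(x)).^2);    % sample variance with 1/N normalization
end
[var(s2hat) 2*sw2^2/N]                    % empirical variance vs CRLB (approximately equal)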
Example 5.5
Find the CRLB for the phase of a sinusoid in white Gaussian noise:

$x[n] = A\cos(\omega_0 n + \phi) + w[n], \quad n = 0, 1, \ldots, N-1$

where $A$ and $\omega_0$ are assumed known. The PDF is

$p(\mathbf{x}; \phi) = \dfrac{1}{(2\pi\sigma_w^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2\right)$

$\Rightarrow \ln(p(\mathbf{x}; \phi)) = -\ln\left((2\pi\sigma_w^2)^{N/2}\right) - \dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2$
$\dfrac{\partial \ln(p(\mathbf{x}; \phi))}{\partial \phi} = -\dfrac{A}{\sigma_w^2}\sum_{n=0}^{N-1}\left(x[n]\sin(\omega_0 n + \phi) - \dfrac{A}{2}\sin(2\omega_0 n + 2\phi)\right)$

$\dfrac{\partial^2 \ln(p(\mathbf{x}; \phi))}{\partial \phi^2} = -\dfrac{A}{\sigma_w^2}\sum_{n=0}^{N-1}\left(x[n]\cos(\omega_0 n + \phi) - A\cos(2\omega_0 n + 2\phi)\right)$

$E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \phi))}{\partial \phi^2}\right\} = -\dfrac{A}{\sigma_w^2}\sum_{n=0}^{N-1}\left(A\cos^2(\omega_0 n + \phi) - A\cos(2\omega_0 n + 2\phi)\right)$

$= -\dfrac{A^2}{\sigma_w^2}\sum_{n=0}^{N-1}\left(\dfrac{1}{2} + \dfrac{1}{2}\cos(2\omega_0 n + 2\phi) - \cos(2\omega_0 n + 2\phi)\right)$
$E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \phi))}{\partial \phi^2}\right\} = -\dfrac{NA^2}{2\sigma_w^2} + \dfrac{A^2}{2\sigma_w^2}\sum_{n=0}^{N-1}\cos(2\omega_0 n + 2\phi)$

As a result,

$\text{CRLB}(\phi) = \left(\dfrac{NA^2}{2\sigma_w^2} - \dfrac{A^2}{2\sigma_w^2}\sum_{n=0}^{N-1}\cos(2\omega_0 n + 2\phi)\right)^{-1} = \dfrac{2\sigma_w^2}{NA^2}\left(1 - \dfrac{1}{N}\sum_{n=0}^{N-1}\cos(2\omega_0 n + 2\phi)\right)^{-1}$

If $N \gg 1$, $\dfrac{1}{N}\sum_{n=0}^{N-1}\cos(2\omega_0 n + 2\phi) \approx 0$, then

$\text{CRLB}(\phi) \approx \dfrac{2\sigma_w^2}{NA^2}$
Example 5.6

Find the CRLB for $A$, $\omega_0$ and $\phi$ jointly in $x[n] = A\cos(\omega_0 n + \phi) + w[n]$, i.e., $\boldsymbol{\theta} = [A, \omega_0, \phi]^T$. Using the same PDF as above:

$\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A^2} = -\dfrac{1}{\sigma_w^2}\sum_{n=0}^{N-1}\cos^2(\omega_0 n + \phi) = -\dfrac{1}{\sigma_w^2}\sum_{n=0}^{N-1}\left(\dfrac{1}{2} + \dfrac{1}{2}\cos(2\omega_0 n + 2\phi)\right) \approx -\dfrac{N}{2\sigma_w^2}$

$\Rightarrow E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A^2}\right\} \approx -\dfrac{N}{2\sigma_w^2}$

Similarly,

$E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A\,\partial \omega_0}\right\} = \dfrac{A}{2\sigma_w^2}\sum_{n=0}^{N-1} n\sin(2\omega_0 n + 2\phi) \approx 0$

$E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A\,\partial \phi}\right\} = \dfrac{A}{2\sigma_w^2}\sum_{n=0}^{N-1}\sin(2\omega_0 n + 2\phi) \approx 0$
$E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial \omega_0^2}\right\} = -\dfrac{A^2}{\sigma_w^2}\sum_{n=0}^{N-1} n^2\sin^2(\omega_0 n + \phi) \approx -\dfrac{A^2}{2\sigma_w^2}\sum_{n=0}^{N-1} n^2$

$E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial \omega_0\,\partial \phi}\right\} = -\dfrac{A^2}{\sigma_w^2}\sum_{n=0}^{N-1} n\sin^2(\omega_0 n + \phi) \approx -\dfrac{A^2}{2\sigma_w^2}\sum_{n=0}^{N-1} n$

$E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial \phi^2}\right\} = -\dfrac{A^2}{\sigma_w^2}\sum_{n=0}^{N-1}\sin^2(\omega_0 n + \phi) \approx -\dfrac{NA^2}{2\sigma_w^2}$

$\mathbf{I}(\boldsymbol{\theta}) \approx \dfrac{1}{\sigma_w^2}\begin{bmatrix} \dfrac{N}{2} & 0 & 0 \\ 0 & \dfrac{A^2}{2}\sum_{n=0}^{N-1} n^2 & \dfrac{A^2}{2}\sum_{n=0}^{N-1} n \\ 0 & \dfrac{A^2}{2}\sum_{n=0}^{N-1} n & \dfrac{NA^2}{2} \end{bmatrix}$
After matrix inversion, we have

$\text{CRLB}(A) \approx \dfrac{2\sigma_w^2}{N}$

$\text{CRLB}(\omega_0) \approx \dfrac{12}{\text{SNR}\cdot N(N^2-1)}, \quad \text{SNR} = \dfrac{A^2}{2\sigma_w^2}$

$\text{CRLB}(\phi) \approx \dfrac{2(2N-1)}{\text{SNR}\cdot N(N+1)}$

Note that

$\text{CRLB}(\phi) \approx \dfrac{2(2N-1)}{\text{SNR}\cdot N(N+1)} \approx \dfrac{4}{\text{SNR}\cdot N} > \dfrac{1}{\text{SNR}\cdot N} = \dfrac{2\sigma_w^2}{NA^2}$

⇒ In general, the CRLB increases as the number of parameters to be estimated increases

⇒ The CRLB decreases as the number of samples increases
Parameter Transformation in CRLB
Find the CRLB for $\alpha = g(\theta)$ where $g(\cdot)$ is a function.

e.g., $x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

What is the CRLB for $A^2$?

The CRLB for the parameter transformation $\alpha = g(\theta)$ is given by

$\text{CRLB}(\alpha) = \dfrac{\left(\dfrac{\partial g(\theta)}{\partial \theta}\right)^2}{-E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x};\theta))}{\partial \theta^2}\right\}}$   (5.10)

For a nonlinear function, "$=$" is replaced by "$\approx$", and it is true only for large $N$
Example 5.7
Find the CRLB for the power of the DC value, i.e., $A^2$:

$x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

$\alpha = g(A) = A^2 \Rightarrow \dfrac{\partial g(A)}{\partial A} = 2A \Rightarrow \left(\dfrac{\partial g(A)}{\partial A}\right)^2 = 4A^2$

From Example 5.3, we have

$-E\left\{\dfrac{\partial^2 \ln(p(\mathbf{x};A))}{\partial A^2}\right\} = \dfrac{N}{\sigma_w^2}$

As a result,

$\text{CRLB}(A^2) \approx 4A^2\cdot\dfrac{\sigma_w^2}{N} = \dfrac{4A^2\sigma_w^2}{N}, \quad N \gg 1$
Example 5.8
Find the CRLB for $\alpha = c_1 A + c_2$ from

$x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

$\alpha = g(A) = c_1 A + c_2 \Rightarrow \dfrac{\partial g(A)}{\partial A} = c_1 \Rightarrow \left(\dfrac{\partial g(A)}{\partial A}\right)^2 = c_1^2$

As a result,

$\text{CRLB}(\alpha) = c_1^2\cdot\text{CRLB}(A) = \dfrac{c_1^2\sigma_w^2}{N}$
Maximum Likelihood Estimation
Parameter estimation is achieved via maximizing the likelihood function

An optimum realizable approach that can give performance close to the CRLB

Used for classical parameter estimation

Requires knowledge of the noise PDF, and the PDF must have a closed form

Generally computationally demanding

Let $p(\mathbf{x}; \boldsymbol{\theta})$ be the PDF of the observed vector $\mathbf{x}$ parameterized by the parameter vector $\boldsymbol{\theta}$. The maximum likelihood (ML) estimate is

$\hat{\boldsymbol{\theta}} = \arg\max_{\boldsymbol{\theta}} p(\mathbf{x}; \boldsymbol{\theta})$   (5.11)
Example 5.9
Given

$x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

where $A$ is an unknown constant and $w[n]$ is a white Gaussian noise with known variance $\sigma_w^2$. Find the ML estimate of $A$.

$p(\mathbf{x}; A) = \dfrac{1}{(2\pi\sigma_w^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right)$

Since $\arg\max_{\boldsymbol{\theta}} p(\mathbf{x};\boldsymbol{\theta}) = \arg\max_{\boldsymbol{\theta}} \ln(p(\mathbf{x};\boldsymbol{\theta}))$, taking the logarithm of $p(\mathbf{x}; A)$ gives

$\ln(p(\mathbf{x}; A)) = -\ln\left((2\pi\sigma_w^2)^{N/2}\right) - \dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)^2$
Differentiating with respect to $A$ yields

$\dfrac{\partial \ln(p(\mathbf{x}; A))}{\partial A} = \dfrac{1}{\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-A)$

$\hat{A} = \arg\max_A \ln(p(\mathbf{x}; A))$ is determined from

$\dfrac{1}{\sigma_w^2}\sum_{n=0}^{N-1}(x[n]-\hat{A}) = 0 \Rightarrow \sum_{n=0}^{N-1}(x[n]-\hat{A}) = 0 \Rightarrow \hat{A} = \dfrac{1}{N}\sum_{n=0}^{N-1} x[n]$

Note that

The ML estimate is identical to the sample mean

It attains the CRLB

Q. How about if $\sigma_w^2$ is unknown?
Example 5.10
Find the ML estimate for the phase of a sinusoid in white Gaussian noise:

$x[n] = A\cos(\omega_0 n + \phi) + w[n], \quad n = 0, 1, \ldots, N-1$

where $A$ and $\omega_0$ are assumed known. The PDF is

$p(\mathbf{x}; \phi) = \dfrac{1}{(2\pi\sigma_w^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2\right)$

$\Rightarrow \ln(p(\mathbf{x}; \phi)) = -\ln\left((2\pi\sigma_w^2)^{N/2}\right) - \dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2$
It is obvious that the maximum of $p(\mathbf{x}; \phi)$ or $\ln(p(\mathbf{x}; \phi))$ corresponds to the minimum of

$\dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2$ or $\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2$

Differentiating with respect to $\phi$ and then setting the result to zero:

$\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)\sin(\omega_0 n + \phi) = \sum_{n=0}^{N-1}\left(x[n]\sin(\omega_0 n + \phi) - \dfrac{A}{2}\sin(2\omega_0 n + 2\phi)\right) = 0$

$\Rightarrow \sum_{n=0}^{N-1} x[n]\sin(\omega_0 n + \hat{\phi}) = \dfrac{A}{2}\sum_{n=0}^{N-1}\sin(2\omega_0 n + 2\hat{\phi})$

The ML estimate for $\phi$ is determined from the root of the above equation.

Q. Any ideas to solve the nonlinear equation?
An approximate ML (AML) solution may exist, depending on the structure of the ML expression. For example, there exists an AML solution for $\phi$:
$\sum_{n=0}^{N-1} x[n]\sin(\omega_0 n + \hat{\phi}) = \dfrac{A}{2}\sum_{n=0}^{N-1}\sin(2\omega_0 n + 2\hat{\phi})$

$\Rightarrow \dfrac{1}{N}\sum_{n=0}^{N-1} x[n]\sin(\omega_0 n + \hat{\phi}) = \dfrac{A}{2}\cdot\dfrac{1}{N}\sum_{n=0}^{N-1}\sin(2\omega_0 n + 2\hat{\phi}) \approx \dfrac{A}{2}\cdot 0 = 0, \quad N \gg 1$
The AML solution is obtained from
$\sum_{n=0}^{N-1} x[n]\sin(\omega_0 n + \hat{\phi}) = 0$

$\Rightarrow \cos(\hat{\phi})\sum_{n=0}^{N-1} x[n]\sin(\omega_0 n) + \sin(\hat{\phi})\sum_{n=0}^{N-1} x[n]\cos(\omega_0 n) = 0$

$\Rightarrow \sin(\hat{\phi})\cdot\sum_{n=0}^{N-1} x[n]\cos(\omega_0 n) = -\cos(\hat{\phi})\cdot\sum_{n=0}^{N-1} x[n]\sin(\omega_0 n)$
$\hat{\phi} = -\tan^{-1}\left(\dfrac{\sum_{n=0}^{N-1} x[n]\sin(\omega_0 n)}{\sum_{n=0}^{N-1} x[n]\cos(\omega_0 n)}\right)$
In fact, the AML solution is reasonable:
$\hat{\phi} = -\tan^{-1}\left(\dfrac{\sum_{n=0}^{N-1}\left(A\cos(\omega_0 n + \phi) + w[n]\right)\sin(\omega_0 n)}{\sum_{n=0}^{N-1}\left(A\cos(\omega_0 n + \phi) + w[n]\right)\cos(\omega_0 n)}\right)$

$\approx -\tan^{-1}\left(\dfrac{-\dfrac{NA}{2}\sin(\phi) + \sum_{n=0}^{N-1} w[n]\sin(\omega_0 n)}{\dfrac{NA}{2}\cos(\phi) + \sum_{n=0}^{N-1} w[n]\cos(\omega_0 n)}\right), \quad N \gg 1$

$= \tan^{-1}\left(\dfrac{\dfrac{NA}{2}\sin(\phi) - \sum_{n=0}^{N-1} w[n]\sin(\omega_0 n)}{\dfrac{NA}{2}\cos(\phi) + \sum_{n=0}^{N-1} w[n]\cos(\omega_0 n)}\right)$
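A minimal MATLAB sketch of the closed-form AML phase estimator (values assumed for illustration; atan2 is used here instead of the plain arctangent of the slides to avoid the quadrant ambiguity):

% AML phase estimate: phi_hat = -atan( sum(x.*sin(w0*n)) / sum(x.*cos(w0*n)) )
N = 100; n = 0:N-1;
w0 = 0.2*pi; A = sqrt(2); phi = 0.3*pi;               % assumed signal parameters
x = A*cos(w0*n + phi) + sqrt(0.1)*randn(1,N);         % noisy observations
phi_hat = -atan2(sum(x.*sin(w0*n)), sum(x.*cos(w0*n)))  % should be close to 0.3*pi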
For parameter transformation, if there is a one-to-one relationship between $\alpha = g(\boldsymbol{\theta})$ and $\boldsymbol{\theta}$, the ML estimate for $\alpha$ is simply

$\hat{\alpha} = g(\hat{\boldsymbol{\theta}})$   (5.12)
Example 5.11
Given $N$ samples of a white Gaussian process $w[n]$, $n = 0, 1, \ldots, N-1$, with unknown variance $\sigma^2$, determine the power of $w[n]$ in dB.

The power in dB is related to $\sigma^2$ by

$P = 10\log_{10}(\sigma^2)$

which is a one-to-one relationship. To find the ML estimate for $P$, we first find the ML estimate for $\sigma^2$.
Denoting the observed samples of the process by $x[n]$, $n = 0, 1, \ldots, N-1$:

$p(\mathbf{x}; \sigma^2) = \dfrac{1}{(2\pi\sigma^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma^2}\sum_{n=0}^{N-1} x^2[n]\right)$

$\Rightarrow \ln(p(\mathbf{x}; \sigma^2)) = -\dfrac{N}{2}\ln(2\pi) - \dfrac{N}{2}\ln(\sigma^2) - \dfrac{1}{2\sigma^2}\sum_{n=0}^{N-1} x^2[n]$

Differentiating the log-likelihood function w.r.t. $\sigma^2$:

$\dfrac{\partial \ln(p(\mathbf{x}; \sigma^2))}{\partial \sigma^2} = -\dfrac{N}{2\sigma^2} + \dfrac{1}{2\sigma^4}\sum_{n=0}^{N-1} x^2[n]$

Setting the resultant expression to zero:

$\dfrac{N}{2\hat{\sigma}^2} = \dfrac{1}{2\hat{\sigma}^4}\sum_{n=0}^{N-1} x^2[n] \Rightarrow \hat{\sigma}^2 = \dfrac{1}{N}\sum_{n=0}^{N-1} x^2[n]$

As a result,

$\hat{P} = 10\log_{10}(\hat{\sigma}^2) = 10\log_{10}\left(\dfrac{1}{N}\sum_{n=0}^{N-1} x^2[n]\right)$
Example 5.12
Given

$x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

where $A$ is an unknown constant and $w[n]$ is a white Gaussian noise with unknown variance $\sigma^2$. Find the ML estimates of $A$ and $\sigma^2$.

Here $\boldsymbol{\theta} = [A, \sigma^2]^T$ and

$p(\mathbf{x}; \boldsymbol{\theta}) = \dfrac{1}{(2\pi\sigma^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right)$

$\Rightarrow \dfrac{\partial \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial A} = \dfrac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)$

$\dfrac{\partial \ln(p(\mathbf{x}; \boldsymbol{\theta}))}{\partial \sigma^2} = -\dfrac{N}{2\sigma^2} + \dfrac{1}{2\sigma^4}\sum_{n=0}^{N-1}(x[n]-A)^2$
Solving the first equation:
$\hat{A} = \dfrac{1}{N}\sum_{n=0}^{N-1} x[n] = \bar{x}$

Putting $A = \hat{A} = \bar{x}$ in the second equation and setting it to zero:

$\hat{\sigma}^2 = \dfrac{1}{N}\sum_{n=0}^{N-1}(x[n]-\bar{x})^2$
Numerical Computation of ML Solution
When the ML solution is not of closed form, it can be computed by
Grid search
Numerical methods: Newton-Raphson, Golden section, bisection, etc
Example 5.13
From Example 5.10, the ML solution of $\phi$ is determined from

$\sum_{n=0}^{N-1} x[n]\sin(\omega_0 n + \hat{\phi}) = \dfrac{A}{2}\sum_{n=0}^{N-1}\sin(2\omega_0 n + 2\hat{\phi})$

Suggest methods to find $\hat{\phi}$.

Approach 1: Grid search

Let

$g(\phi) = \sum_{n=0}^{N-1} x[n]\sin(\omega_0 n + \phi) - \dfrac{A}{2}\sum_{n=0}^{N-1}\sin(2\omega_0 n + 2\phi)$

It is obvious that

$\hat{\phi}$ = root of $g(\phi)$
The idea of grid search is simple:
⇒ Search over all possible values of $\phi$, or a given range of $\phi$, to find the root $\hat{\phi}$

Values are discrete ⇒ tradeoff between resolution and computation

e.g., range for $\hat{\phi}$: any value in $[0, 2\pi)$ ⇒ with 1000 discrete points, the resolution is $2\pi/1000$
MATLAB source code:

N=100; n=[0:N-1];
w = 0.2*pi;                      % known frequency
A = sqrt(2);                     % known amplitude
p = 0.3*pi;                      % true phase
np = 0.1;                        % noise power
q = sqrt(np).*randn(1,N);
x = A.*cos(w.*n+p)+q;
for j=1:1000
    pe = j/1000*2*pi;            % candidate phase
    s1 = sin(w.*n+pe);
    s2 = sin(2.*w.*n+2.*pe);
    g(j) = x*s1' - A/2*sum(s2);  % evaluate g at the grid point
end
pe = [1:1000]/1000;
plot(pe,g)

Note: the x-axis is $\phi/(2\pi)$
stem(pe,g)
axis([0.14 0.16 -2 2])
$g(0.152\cdot 2\pi) = -0.2324, \quad g(0.153\cdot 2\pi) = 0.2168$

$\Rightarrow \hat{\phi} = 0.153\cdot 2\pi = 0.306\pi \ (\pm 0.001\pi)$
For a smaller resolution, say 200 discrete points:
clear pe; clear s1; clear s2; clear g;
for j=1:200
    pe = j/200*2*pi;
    s1 = sin(w.*n+pe);
    s2 = sin(2.*w.*n+2.*pe);
    g(j) = x*s1' - A/2*sum(s2);
end
pe = [1:200]/200;
plot(pe,g)
stem(pe,g)
axis([0.14 0.16 -2 2])
$g(0.150\cdot 2\pi) = -1.1306, \quad g(0.155\cdot 2\pi) = 1.1150$

$\Rightarrow \hat{\phi} = 0.155\cdot 2\pi = 0.310\pi \ (\pm 0.005\pi)$

⇒ Accuracy increases as the number of grid points increases
Approach 2: Newton-Raphson method, with the iteration $\phi_{k+1} = \phi_k - g(\phi_k)/g'(\phi_k)$:

p1 = 0;                              % initial phase estimate
for k=1:10
    s1 = sin(w.*n+p1);
    s2 = sin(2.*w.*n+2.*p1);
    c1 = cos(w.*n+p1);
    c2 = cos(2.*w.*n+2.*p1);
    g  = x*s1' - A/2*sum(s2);        % g(p1)
    g1 = x*c1' - A*sum(c2);          % g'(p1)
    p1 = p1 - g/g1;                  % Newton-Raphson update
    p1_vector(k) = p1;
end
stem(p1_vector/(2*pi))
Newton/Raphson method converges at ~ 3rd iteration
$\hat{\phi} = 0.1525\cdot 2\pi = 0.305\pi$
Q. Can you comment on the grid search & Newton/Raphson method?
ML Estimation for General Linear Model
The general linear data model is given by
$\mathbf{x} = \mathbf{H}\boldsymbol{\theta} + \mathbf{w}$   (5.14)

where

$\mathbf{x}$ is the observed vector of size $N$

$\mathbf{w}$ is a Gaussian noise vector with known covariance matrix $\mathbf{C}$

$\mathbf{H}$ is a known matrix of size $N \times p$

$\boldsymbol{\theta}$ is the parameter vector of size $p$

Based on (5.7), the PDF of $\mathbf{x}$ parameterized by $\boldsymbol{\theta}$ is

$p(\mathbf{x}; \boldsymbol{\theta}) = \dfrac{1}{(2\pi)^{N/2}\det^{1/2}(\mathbf{C})}\exp\left(-\dfrac{1}{2}(\mathbf{x}-\mathbf{H}\boldsymbol{\theta})^T\mathbf{C}^{-1}(\mathbf{x}-\mathbf{H}\boldsymbol{\theta})\right)$   (5.15)
Maximizing $\ln(p(\mathbf{x}; \boldsymbol{\theta}))$ is equivalent to minimizing $(\mathbf{x}-\mathbf{H}\boldsymbol{\theta})^T\mathbf{C}^{-1}(\mathbf{x}-\mathbf{H}\boldsymbol{\theta})$, which gives the ML solution

$\hat{\boldsymbol{\theta}} = (\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H})^{-1}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{x}$   (5.16)

For white noise, $\mathbf{C} = \sigma_w^2\mathbf{I}$, and (5.16) reduces to

$\hat{\boldsymbol{\theta}} = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x}$   (5.17)
Example 5.14
Given $N$ pairs of $(x, y)$ where $x$ is error-free but $y$ is subject to error:

$y[n] = m\cdot x[n] + c + w[n], \quad n = 0, 1, \ldots, N-1$

where $\mathbf{w}$ is a Gaussian noise vector with known covariance matrix $\mathbf{C}$. Find the ML estimates for $m$ and $c$.

$y[n] = m\cdot x[n] + c + w[n] \Rightarrow y[n] = [x[n] \ \ 1]\cdot\boldsymbol{\theta} + w[n], \quad \boldsymbol{\theta} = [m, c]^T$

$\Rightarrow \begin{aligned} y[0] &= [x[0] \ \ 1]\,\boldsymbol{\theta} + w[0] \\ y[1] &= [x[1] \ \ 1]\,\boldsymbol{\theta} + w[1] \\ &\cdots \\ y[N-1] &= [x[N-1] \ \ 1]\,\boldsymbol{\theta} + w[N-1] \end{aligned}$
Writing in matrix form:

$\mathbf{y} = \mathbf{H}\boldsymbol{\theta} + \mathbf{w}$

where

$\mathbf{y} = [y[0], y[1], \ldots, y[N-1]]^T$

$\mathbf{H} = \begin{bmatrix} x[0] & 1 \\ x[1] & 1 \\ \vdots & \vdots \\ x[N-1] & 1 \end{bmatrix}$

Applying (5.16) gives

$\hat{\boldsymbol{\theta}} = [\hat{m}, \hat{c}]^T = (\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H})^{-1}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{y}$
Example 5.15
Find the ML estimates of $A$, $\omega_0$ and $\phi$ for

$x[n] = A\cos(\omega_0 n + \phi) + w[n], \quad n = 0, 1, \ldots, N-1, \quad N \gg 1$

where $w[n]$ is a white Gaussian noise with variance $\sigma_w^2$. Recall from Example 5.6 that $\boldsymbol{\theta} = [A, \omega_0, \phi]^T$ and

$p(\mathbf{x}; \boldsymbol{\theta}) = \dfrac{1}{(2\pi\sigma_w^2)^{N/2}}\exp\left(-\dfrac{1}{2\sigma_w^2}\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2\right)$

The ML solution for $\boldsymbol{\theta}$ can be found by minimizing

$J(A, \omega_0, \phi) = \sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2$
This can be achieved by using a 3-D grid search or the Newton-Raphson method, but it is computationally complex.
Another, simpler solution is as follows:

$J(A, \omega_0, \phi) = \sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2 = \sum_{n=0}^{N-1}\left(x[n]-A\cos(\phi)\cos(\omega_0 n) + A\sin(\phi)\sin(\omega_0 n)\right)^2$

Since $J(A, \omega_0, \phi)$ is not quadratic in $A$ and $\phi$, the first step is to use the parameter transformation

$\alpha_1 = A\cos(\phi), \quad \alpha_2 = -A\sin(\phi) \quad \Rightarrow \quad A = \sqrt{\alpha_1^2 + \alpha_2^2}, \quad \phi = -\tan^{-1}\left(\dfrac{\alpha_2}{\alpha_1}\right)$
The model becomes $\mathbf{x} = \mathbf{H}\boldsymbol{\alpha} + \mathbf{w}$ with $\boldsymbol{\alpha} = [\alpha_1, \alpha_2]^T$ and $\mathbf{H} = [\mathbf{c} \ \ \mathbf{s}]$, where $\mathbf{c} = [1, \cos(\omega_0), \ldots, \cos(\omega_0(N-1))]^T$ and $\mathbf{s} = [0, \sin(\omega_0), \ldots, \sin(\omega_0(N-1))]^T$. Substituting $\hat{\boldsymbol{\alpha}} = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x}$ back into $J(\alpha_1, \alpha_2, \omega_0)$:

$J(\omega_0) = (\mathbf{x} - \mathbf{H}\hat{\boldsymbol{\alpha}})^T(\mathbf{x} - \mathbf{H}\hat{\boldsymbol{\alpha}})$

$= \left((\mathbf{I} - \mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T)\mathbf{x}\right)^T\left((\mathbf{I} - \mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T)\mathbf{x}\right)$

$= \mathbf{x}^T(\mathbf{I} - \mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T)(\mathbf{I} - \mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T)\mathbf{x}$

$= \mathbf{x}^T(\mathbf{I} - \mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T)\mathbf{x}$

$= \mathbf{x}^T\mathbf{x} - \mathbf{x}^T\mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x}$

Minimizing $J(\omega_0)$ is identical to maximizing $\mathbf{x}^T\mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x}$, or

$\hat{\omega}_0 = \arg\max_{\omega_0} \mathbf{x}^T\mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x}$

⇒ the 3-D search is reduced to a 1-D search
After determining $\hat{\omega}_0$, $\hat{\boldsymbol{\alpha}}$ can be obtained as well.
For sufficiently large $N$:

$\mathbf{x}^T\mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x} = [\mathbf{c}^T\mathbf{x} \ \ \mathbf{s}^T\mathbf{x}]\begin{bmatrix} \mathbf{c}^T\mathbf{c} & \mathbf{c}^T\mathbf{s} \\ \mathbf{s}^T\mathbf{c} & \mathbf{s}^T\mathbf{s} \end{bmatrix}^{-1}\begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix}$

$\approx [\mathbf{c}^T\mathbf{x} \ \ \mathbf{s}^T\mathbf{x}]\begin{bmatrix} N/2 & 0 \\ 0 & N/2 \end{bmatrix}^{-1}\begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix}$

$= \dfrac{2}{N}\left((\mathbf{c}^T\mathbf{x})^2 + (\mathbf{s}^T\mathbf{x})^2\right) = \dfrac{2}{N}\left|\sum_{n=0}^{N-1} x[n]\exp(-j\omega_0 n)\right|^2$

$\Rightarrow \hat{\omega}_0 = \arg\max_{\omega_0} \dfrac{1}{N}\left|\sum_{n=0}^{N-1} x[n]\exp(-j\omega_0 n)\right|^2$

⇒ periodogram maximum
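A minimal MATLAB sketch (assumed values) of this frequency estimate via coarse periodogram maximization; in practice one would refine the grid around the peak:

% Frequency estimation by maximizing the periodogram over a grid
N = 100; n = 0:N-1;
w0 = 0.2*pi; A = sqrt(2); phi = 0.3*pi;               % assumed signal parameters
x = A*cos(w0*n + phi) + sqrt(0.1)*randn(1,N);
wgrid = pi*(1:999)/1000;                              % candidate frequencies in (0, pi)
P = zeros(size(wgrid));
for i = 1:length(wgrid)
    P(i) = abs(sum(x.*exp(-1j*wgrid(i)*n)))^2/N;      % periodogram value
end
[~, idx] = max(P);
w0_hat = wgrid(idx)                                   % should be close to 0.2*pi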
Least Squares Methods
Parameter estimation is achieved via minimizing a least squares (LS) cost function

Generally not optimum but computationally simple

Used for classical parameter estimation
No knowledge of the noise PDF is required
Can be considered as a generalization of LS filtering
Variants of LS Methods
1. Standard LS
Consider the general linear data model:

$\mathbf{x} = \mathbf{H}\boldsymbol{\theta} + \mathbf{w}$

where

$\mathbf{x}$ is the observed vector of size $N$

$\mathbf{w}$ is a zero-mean noise vector with unknown covariance matrix

$\mathbf{H}$ is a known matrix of size $N \times p$

$\boldsymbol{\theta}$ is the parameter vector of size $p$

The LS solution is given by

$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} (\mathbf{x}-\mathbf{H}\boldsymbol{\theta})^T(\mathbf{x}-\mathbf{H}\boldsymbol{\theta}) = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x}$   (5.18)

which is equal to (5.17)
⇒ the LS solution is optimum if the covariance matrix of $\mathbf{w}$ is $\mathbf{C} = \sigma_w^2\cdot\mathbf{I}$ and $\mathbf{w}$ is Gaussian distributed
Define
$\mathbf{e} = \mathbf{x} - \mathbf{H}\boldsymbol{\theta}$

where

$\mathbf{e} = [e(0) \ e(1) \ \cdots \ e(N-1)]^T$

Then (5.18) is equivalent to

$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \sum_{k=0}^{N-1} e^2(k)$   (5.19)
which is similar to LS filtering
Q. Any differences between (5.19) and LS filtering?
Example 5.16
Given

$x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$

where $A$ is an unknown constant and $w[n]$ is a zero-mean noise. Find the LS solution of $A$.

Using (5.19),

$\hat{A} = \arg\min_A \sum_{n=0}^{N-1}\left(x[n]-A\right)^2$

Differentiating $\sum_{n=0}^{N-1}(x[n]-A)^2$ with respect to $A$ and setting the result to 0:

$\hat{A} = \dfrac{1}{N}\sum_{n=0}^{N-1} x[n]$
On the other hand, writing $\{x[n]\}$ in matrix form:
$\mathbf{x} = \mathbf{H}A + \mathbf{w}$, where $\mathbf{H} = [1 \ 1 \ \cdots \ 1]^T$

Using (5.18),

$\hat{A} = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x} = \left([1 \ 1 \ \cdots \ 1]\begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}\right)^{-1}[1 \ 1 \ \cdots \ 1]\begin{bmatrix} x[0] \\ x[1] \\ \vdots \\ x[N-1] \end{bmatrix} = \dfrac{1}{N}\sum_{n=0}^{N-1} x[n]$

Both (5.18) and (5.19) give the same answer, and the LS solution is optimum if the noise is white Gaussian.

Example 5.17
Consider the LS filtering problem again. Given
$d[n] = \mathbf{X}^T[n]\cdot\mathbf{W} + q[n], \quad n = 0, 1, \ldots, N-1$

where

$d[n]$ is the desired response

$\mathbf{X}[n] = [x[n] \ x[n-1] \ \cdots \ x[n-L+1]]^T$ is the input signal vector

$\mathbf{W} = [w_0 \ w_1 \ \cdots \ w_{L-1}]^T$ is the unknown filter weight vector

$q[n]$ is zero-mean noise

Writing in matrix form:

$\mathbf{d} = \mathbf{H}\cdot\mathbf{W} + \mathbf{q}$

Using (5.18): $\hat{\mathbf{W}} = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{d}$
where

$\mathbf{H} = \begin{bmatrix} \mathbf{X}^T(0) \\ \mathbf{X}^T(1) \\ \vdots \\ \mathbf{X}^T(N-1) \end{bmatrix} = \begin{bmatrix} x[0] & 0 & \cdots & 0 \\ x[1] & x[0] & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ x[N-1] & x[N-2] & \cdots & x[N-L] \end{bmatrix}$

with $x[-1] = x[-2] = \cdots = 0$
Note that

$\mathbf{H}^T\mathbf{H} = \mathbf{R}_{xx}$

$\mathbf{H}^T\mathbf{d} = \mathbf{R}_{dx}$

where $\mathbf{R}_{xx}$ is not the original version but the modified version of (3.6)
The LS solution is then

$\hat{\mathbf{W}} = \mathbf{R}_{xx}^{-1}\mathbf{R}_{dx}$

Example 5.18

For $x[n] = A\cos(\omega_0 n + \phi) + w[n]$ with known $\omega_0$ and $\phi$, minimizing $\sum_{n=0}^{N-1}\left(x[n]-A\cos(\omega_0 n + \phi)\right)^2$ gives the LS amplitude estimate

$\hat{A} = \dfrac{\sum_{n=0}^{N-1} x[n]\cos(\omega_0 n + \phi)}{\sum_{n=0}^{N-1} \cos^2(\omega_0 n + \phi)}$
2. Weighted LS
Use a more general form of LS via a symmetric weighting matrix $\mathbf{W}$, such that $\mathbf{W}^T = \mathbf{W}$:

$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} (\mathbf{x}-\mathbf{H}\boldsymbol{\theta})^T\mathbf{W}(\mathbf{x}-\mathbf{H}\boldsymbol{\theta}) = (\mathbf{H}^T\mathbf{W}\mathbf{H})^{-1}\mathbf{H}^T\mathbf{W}\mathbf{x}$   (5.20)

Due to the presence of $\mathbf{W}$, it is generally difficult to write the cost function $(\mathbf{x}-\mathbf{H}\boldsymbol{\theta})^T\mathbf{W}(\mathbf{x}-\mathbf{H}\boldsymbol{\theta})$ in scalar form as in (5.19)
Rationale of using $\mathbf{W}$: put larger weights on data with smaller errors, and smaller weights on data with larger errors.

When $\mathbf{W} = \mathbf{C}^{-1}$, where $\mathbf{C}$ is the covariance matrix of the noise vector:

$\hat{\boldsymbol{\theta}} = (\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H})^{-1}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{x}$   (5.21)

which is equal to the ML solution and is optimum for Gaussian noise
Example 5.19
Given two noisy measurements of $A$:

$x_1 = A + w_1 \quad \text{and} \quad x_2 = A + w_2$

where $w_1$ and $w_2$ are zero-mean uncorrelated noises with known variances $\sigma_1^2$ and $\sigma_2^2$. Determine the optimum weighted LS solution.
Use

$\mathbf{W} = \mathbf{C}^{-1} = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}^{-1} = \begin{bmatrix} 1/\sigma_1^2 & 0 \\ 0 & 1/\sigma_2^2 \end{bmatrix}$

Grouping $x_1$ and $x_2$ into matrix form:

$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\cdot A + \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}$ or $\mathbf{x} = \mathbf{H}\cdot A + \mathbf{w}$

Using (5.21),

$\hat{A} = (\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H})^{-1}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{x} = \left([1 \ \ 1]\begin{bmatrix} 1/\sigma_1^2 & 0 \\ 0 & 1/\sigma_2^2 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right)^{-1}[1 \ \ 1]\begin{bmatrix} 1/\sigma_1^2 & 0 \\ 0 & 1/\sigma_2^2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$
As a result,
$\hat{A} = \left(\dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_2^2}\right)^{-1}\left(\dfrac{x_1}{\sigma_1^2} + \dfrac{x_2}{\sigma_2^2}\right) = \dfrac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\cdot x_1 + \dfrac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\cdot x_2$

Note that

If $\sigma_2^2 > \sigma_1^2$, a larger weight is placed on $x_1$, and vice versa

If $\sigma_1^2 = \sigma_2^2$, the solution is equal to the standard sample mean

The solution will be more complicated if $w_1$ and $w_2$ are correlated

Exact values for $\sigma_1^2$ and $\sigma_2^2$ are not necessary; only their ratio is needed

Defining $\lambda = \sigma_1^2/\sigma_2^2$, we have

$\hat{A} = \dfrac{1}{1+\lambda}\cdot x_1 + \dfrac{\lambda}{1+\lambda}\cdot x_2$
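A small MATLAB sketch of Example 5.19 (variances assumed), comparing the optimum weighting with the plain average:

% Optimum weighted LS combination of two noisy measurements
A = 4; s1 = 1; s2 = 4; M = 20000;         % noise variances sigma1^2 = 1, sigma2^2 = 4
x1 = A + sqrt(s1)*randn(M,1);
x2 = A + sqrt(s2)*randn(M,1);
Ahat  = (s2*x1 + s1*x2)/(s1 + s2);        % weighted LS estimate
Amean = (x1 + x2)/2;                      % unweighted sample mean
[var(Ahat) var(Amean)]                    % weighted LS should show the smaller variance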
3. Nonlinear LS
The LS cost function cannot be represented as a linear model as in

$\mathbf{x} = \mathbf{H}\boldsymbol{\theta} + \mathbf{w}$

In general, it is more complex to solve. e.g., the LS estimates for $A$, $\omega_0$ and $\phi$ can be found by minimizing

$\sum_{n=0}^{N-1}\left(x[n] - A\cos(\omega_0 n + \phi)\right)^2$

whose solution is not straightforward, as seen in Example 5.15.

Grid search and numerical methods are used to find the minimum
4. Constrained LS
The linear LS cost function is minimized subject to constraints:
$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} (\mathbf{x}-\mathbf{H}\boldsymbol{\theta})^T(\mathbf{x}-\mathbf{H}\boldsymbol{\theta})$ subject to $S$   (5.22)

where $S$ is a set of equalities/inequalities in terms of $\boldsymbol{\theta}$

Generally it can be solved by linear/nonlinear programming, but simpler solutions exist for linear and quadratic constraint equations, e.g.,

Linear constraint equation: $\theta_1 + \theta_2 + \theta_3 = 10$

Quadratic constraint equation: $\theta_1^2 + \theta_2^2 + \theta_3^2 = 100$

Other types of constraints: $\theta_1 > \theta_2 > \theta_3 > 10$, $\theta_1 + 2\theta_2 + 3\theta_3 \ge 100$
Consider the case where the constraint set $S$ is

$\mathbf{A}\boldsymbol{\theta} = \mathbf{b}$

which contains $r$ linear equations. The constrained LS problem for the linear model is
$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} (\mathbf{x}-\mathbf{H}\boldsymbol{\theta})^T(\mathbf{x}-\mathbf{H}\boldsymbol{\theta})$ subject to $\mathbf{A}\boldsymbol{\theta} = \mathbf{b}$   (5.23)

The technique of Lagrangian multipliers can solve (5.23) as follows.

Define the Lagrangian

$J_c = (\mathbf{x}-\mathbf{H}\boldsymbol{\theta})^T(\mathbf{x}-\mathbf{H}\boldsymbol{\theta}) + \boldsymbol{\lambda}^T(\mathbf{A}\boldsymbol{\theta} - \mathbf{b})$   (5.24)

where $\boldsymbol{\lambda}$ is an $r$-length vector of Lagrangian multipliers.

The procedure is to first solve for $\boldsymbol{\lambda}$ and then $\boldsymbol{\theta}$
Expanding (5.24):

$J_c = \mathbf{x}^T\mathbf{x} - 2\boldsymbol{\theta}^T\mathbf{H}^T\mathbf{x} + \boldsymbol{\theta}^T\mathbf{H}^T\mathbf{H}\boldsymbol{\theta} + \boldsymbol{\lambda}^T\mathbf{A}\boldsymbol{\theta} - \boldsymbol{\lambda}^T\mathbf{b}$

Differentiating $J_c$ with respect to $\boldsymbol{\theta}$:

$\dfrac{\partial J_c}{\partial \boldsymbol{\theta}} = -2\mathbf{H}^T\mathbf{x} + 2\mathbf{H}^T\mathbf{H}\boldsymbol{\theta} + \mathbf{A}^T\boldsymbol{\lambda}$

Setting the result to zero:

$-2\mathbf{H}^T\mathbf{x} + 2\mathbf{H}^T\mathbf{H}\hat{\boldsymbol{\theta}}_c + \mathbf{A}^T\boldsymbol{\lambda} = \mathbf{0}$

$\Rightarrow \hat{\boldsymbol{\theta}}_c = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x} - \dfrac{1}{2}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{A}^T\boldsymbol{\lambda} = \hat{\boldsymbol{\theta}} - \dfrac{1}{2}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{A}^T\boldsymbol{\lambda}$

where $\hat{\boldsymbol{\theta}}$ is the unconstrained LS solution. Putting $\hat{\boldsymbol{\theta}}_c$ into $\mathbf{A}\boldsymbol{\theta} = \mathbf{b}$:

$\mathbf{A}\hat{\boldsymbol{\theta}}_c = \mathbf{A}\hat{\boldsymbol{\theta}} - \dfrac{1}{2}\mathbf{A}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{A}^T\boldsymbol{\lambda} = \mathbf{b} \Rightarrow \boldsymbol{\lambda} = 2\left(\mathbf{A}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{A}^T\right)^{-1}(\mathbf{A}\hat{\boldsymbol{\theta}} - \mathbf{b})$
Putting $\boldsymbol{\lambda}$ back into $\hat{\boldsymbol{\theta}}_c$:

$\hat{\boldsymbol{\theta}}_c = \hat{\boldsymbol{\theta}} - (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{A}^T\left(\mathbf{A}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{A}^T\right)^{-1}(\mathbf{A}\hat{\boldsymbol{\theta}} - \mathbf{b})$

The idea of constrained LS can be illustrated by finding the minimum value of $y$:

$y = x^2 - 3x + 2$ subject to $x - y = 1$
5. Total LS
Motivation: noises at both $\mathbf{x}$ and $\mathbf{H}\boldsymbol{\theta}$:

$\mathbf{x} + \mathbf{w}_1 = \mathbf{H}\boldsymbol{\theta} + \mathbf{w}_2$   (5.25)

where $\mathbf{w}_1$ and $\mathbf{w}_2$ are zero-mean noise vectors

A typical example is LS filtering in the presence of both input noise and output noise. The noisy input is

$x(k) = s(k) + n_i(k), \quad k = 0, 1, \ldots, N-1$

and the noisy output is

$r(k) = s(k)\otimes h(k) + n_o(k), \quad k = 0, 1, \ldots, N-1$

The parameters to be estimated are $\{h(k)\}$ given $x(k)$ and $r(k)$
Another example is in frequency estimation using linear prediction:
For a single sinusoid $s(k) = \cos(\omega k + \phi)$, it is true that

$s(k) = 2\cos(\omega)s(k-1) - s(k-2)$

i.e., $s(k)$ is perfectly predicted by $s(k-1)$ and $s(k-2)$:

$s(k) = a_0 s(k-1) + a_1 s(k-2)$

It is desirable to obtain $a_0 = 2\cos(\omega)$ and $a_1 = -1$ in the estimation process.

In the presence of noise, the observed signal is

$x(k) = s(k) + w(k), \quad k = 0, 1, \ldots, N-1$
The linear prediction model is now
$x(k) = a_0 x(k-1) + a_1 x(k-2), \quad k = 2, 3, \ldots, N-1$:

$x(2) = a_0 x(1) + a_1 x(0)$
$x(3) = a_0 x(2) + a_1 x(1)$
$\cdots$
$x(N-1) = a_0 x(N-2) + a_1 x(N-3)$

$\Rightarrow \begin{bmatrix} x(2) \\ x(3) \\ \vdots \\ x(N-1) \end{bmatrix} = \begin{bmatrix} x(1) & x(0) \\ x(2) & x(1) \\ \vdots & \vdots \\ x(N-2) & x(N-3) \end{bmatrix}\begin{bmatrix} a_0 \\ a_1 \end{bmatrix}$

$\Rightarrow \begin{bmatrix} s(2) \\ s(3) \\ \vdots \\ s(N-1) \end{bmatrix} + \begin{bmatrix} w(2) \\ w(3) \\ \vdots \\ w(N-1) \end{bmatrix} = \left(\begin{bmatrix} s(1) & s(0) \\ s(2) & s(1) \\ \vdots & \vdots \\ s(N-2) & s(N-3) \end{bmatrix} + \begin{bmatrix} w(1) & w(0) \\ w(2) & w(1) \\ \vdots & \vdots \\ w(N-2) & w(N-3) \end{bmatrix}\right)\begin{bmatrix} a_0 \\ a_1 \end{bmatrix}$

⇒ noise appears in both the data vector and the data matrix, matching the total LS model of (5.25)
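As a sketch of how such a total LS problem can be solved, the standard SVD-based TLS solution (a well-known technique, though not given in the slides; all values assumed) applied to linear-prediction frequency estimation:

% Total LS frequency estimation via linear prediction, solved with the SVD
N = 100; k = 0:N-1;
wt = 0.4*pi; phi = 0.2*pi;               % true frequency and phase (assumed)
x = cos(wt*k + phi) + 0.05*randn(1,N);   % noisy single sinusoid
b = x(3:N).';                            % left-hand side: x(2),...,x(N-1)
D = [x(2:N-1).' x(1:N-2).'];             % data matrix columns: x(k-1), x(k-2)
[~,~,V] = svd([D b]);                    % SVD of the augmented matrix [D b]
a = -V(1:2,3)/V(3,3);                    % TLS solution [a0; a1]
w_hat = acos(a(1)/2)                     % since a0 = 2cos(w); assumes |a0/2| <= 1

At low noise levels, $a_1$ should also come out close to $-1$, as predicted by the noise-free recursion.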
6. Mixed LS
A combination of LS, weighted LS, nonlinear LS, constrained LS and/or total LS
Examples: weighted LS with constraints, total LS with constraints, etc.
Questions for Discussion
1. Suppose you have $N$ pairs of $(x_i, y_i)$, $i = 1, 2, \ldots, N$, and you need to fit them to the model $y = ax$. Assuming that only $\{y_i\}$ contain zero-mean noise, determine the least squares estimate for $a$.

(Hint: the relationship between $x_i$ and $y_i$ is

$y_i = a x_i + n_i, \quad i = 1, 2, \ldots, N$

where $\{n_i\}$ are the noise in $\{y_i\}$.)
2. Use least squares to estimate the line $y = ax$ in Q.1, but now only $\{x_i\}$ contain zero-mean noise.
3. In a radar system, the received signal is

$r(n) = \alpha s(n - \tau_0) + w(n)$

where the range $R$ of an object is related to the time delay by

$\tau_0 = 2R/c$

Suppose we get an unbiased estimate of $\tau_0$, say $\hat{\tau}_0$, and its variance is $\text{var}(\hat{\tau}_0)$. Determine the corresponding range variance $\text{var}(\hat{R})$, where $\hat{R}$ is the estimate of $R$.

If $\text{var}(\hat{\tau}_0) = (0.1\,\mu\text{s})^2$ and $c = 3\times 10^8\ \text{m s}^{-1}$, what is the value of $\text{var}(\hat{R})$?