TRANSCRIPT
Introduction to Detection and Estimation, and
Mathematical Notations
Mathematical Methods and Algorithms for Signal Processing, Ch10
2002/12/10 DSP Group Meeting
Outline
♦ Detection and estimation theory
♦ Notational conventions
♦ Conditional expectations
♦ Transforms of random variables
♦ Sufficient statistics
♦ Exponential families
Detection and Estimation
♦ Two examples
– Detection
Let x(t) = A cos(2π f_c t), t ∈ [0, T), where A ∈ {−1, 1}. The signal is observed in noise, y(t) = x(t) + n(t), where n(t) is a random process. An example of a detection problem is the choice between the two values of A, given y(t).
Detection: making a choice over some countable (often finite) set of options.
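As a sketch of this detection problem (not from the chapter), the following assumes a sampled version of y(t), Gaussian noise, and arbitrary illustrative values for f_c, T, the sampling rate, and the noise level; the decision correlates y with the known carrier and takes the sign:

```python
import numpy as np

rng = np.random.default_rng(0)
fc, T, fs = 10.0, 1.0, 1000.0         # carrier (Hz), interval (s), sample rate: illustrative
t = np.arange(0.0, T, 1.0 / fs)

A = rng.choice([-1, 1])               # Nature's choice of A
x = A * np.cos(2 * np.pi * fc * t)    # x(t) = A cos(2*pi*fc*t)
y = x + rng.normal(0.0, 0.5, t.size)  # y(t) = x(t) + n(t)

# Correlate the observation with the known carrier and decide by the sign.
stat = np.sum(y * np.cos(2 * np.pi * fc * t))
A_hat = 1 if stat >= 0 else -1
print(A_hat == A)                     # correct with overwhelming probability at this SNR
```

At this signal-to-noise ratio the correlator essentially always recovers A; lowering the amplitude or raising the noise level makes errors appear.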
Detection and Estimation (cont.)
– Estimation
The signal y(t) = x_θ(t) + n(t) = cos(2π f_c t + θ) + n(t) is measured at a receiver, where θ is an unknown phase. An example of an estimation problem is the determination of the phase θ, based upon observation of the signal over some interval of time.
Estimation: making a choice over a continuum of options.
Game Theory
♦ The component of statistical theory that we are concerned with fits into a larger mathematical construct: that of game theory.
♦ Definition of a mathematical game
– A two-person, zero-sum mathematical game, which we will refer to simply as a game, consists of three basic components:
• A nonempty set, Θ1, of possible actions available to Player 1
• A nonempty set, Θ2, of possible actions available to Player 2
• A loss function, L: Θ1 × Θ2 → R, representing the loss incurred by Player 1
Property of a Game
♦ In a two-person game there are two players, either of whom may be Nature
♦ Each player attempts to make a choice that helps them achieve their goal (e.g., "winning")
♦ In a zero-sum game, one person's loss is the other's gain
An Odd-or-Even Game
♦ Two players simultaneously put up either one or two fingers.
♦ Player 1 wins if the sum of the digits showing is odd.
♦ Player 2 wins if the sum of the digits showing is even.
♦ The winner receives in dollars the sum of the digits showing.
♦ Here Θ1 = Θ2 = {1, 2}, and the loss to Player 1 is
L(1, 1) = −2, L(1, 2) = 3, L(2, 1) = 3, L(2, 2) = −4, i.e.

            θ2 = 1   θ2 = 2
  θ1 = 1      −2        3
  θ1 = 2       3       −4
A Statistical Game
♦ An important class of games are those in which one player is able to obtain information relating to the choices made by their opponent before committing to their own choice; these observations, however, are subject to error.
♦ Decision and estimation theory can be viewed as a two-person game between Nature and a decision-making agent.
– The choices available to Nature are represented as elements θ of a set Θ
– The decisions that the agent makes are represented as elements δ of a set Δ
– A loss function L
– The observed samples of a random variable X, defined over a sample space, whose distribution depends on θ
Definition of a Statistical Game
♦ Θ ⊂ R^k is a nonempty set of possible states of nature, or parameters. Θ is sometimes referred to as the parameter space. An element of Θ is denoted θ
♦ Δ is a nonempty set of possible decisions available to the agent, called the decision space. An element of Δ is denoted δ
♦ L: Θ × Δ → R is a loss function or cost function
Definition of a Statistical Game (cont.)
♦ X: χ → R^n, n ≥ 1, is a random variable or vector, with cumulative distribution function F_X(x|θ). The distribution of X is governed by the parameter θ ∈ Θ
♦ φ: χ → Δ is a decision rule, also termed a strategy, decision function, or test, that provides the coupling between the observations and the decisions
Elements of the Statistical Decision Game
Randomization
♦ Non-randomized decision rule
– The decision rule φ is a single function mapping observations into the decision space.
♦ Randomized decision rule
– Let D denote the space of all nonrandomized decision rules. A randomized rule ϕ: D → [0, 1] is a probability distribution that specifies the probability of selecting elements of D. D* is the space of all randomized decision rules.
Special Cases
♦ The binary hypothesis testing problem: Δ = {δ0, δ1}. Corresponding to each decision is a hypothesis.
– By choosing δ0, the agent accepts hypothesis H0, thus rejecting hypothesis H1.
– By choosing δ1, the agent accepts hypothesis H1, thus rejecting hypothesis H0.
Special Cases (cont.)
♦ Example: a radar signal is examined at the receiver to determine whether a target is present
– H0: no target present, θ ≤ θ0
– H1: target present, θ > θ0
where the observation is modeled as X = θ + N
♦ H0 is termed the null hypothesis, and H1 the alternative hypothesis
Special Cases (cont.)
♦ Four possible outcomes:

                                   Null Hypothesis is True      Null Hypothesis is False
  Reject Null Hypothesis           False Alarm (Type I Error)   Correct Detection
  Fail to Reject Null Hypothesis   Correct Detection            Missed Detection (Type II Error)
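These error probabilities can be estimated by simulation. A minimal sketch, assuming the radar model X = θ + N with standard Gaussian noise, a threshold test (reject H0 when X > τ), and arbitrary illustrative values for the threshold and the alternative:

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, theta0, tau = 100_000, 0.0, 1.0   # threshold tau is an arbitrary choice
theta1 = 2.0                                # one representative alternative under H1

x_h0 = theta0 + rng.normal(size=n_trials)   # X = theta + N when H0 is true
x_h1 = theta1 + rng.normal(size=n_trials)   # X = theta + N when H1 is true

p_false_alarm = np.mean(x_h0 > tau)         # Type I error: reject H0 when it is true
p_miss = np.mean(x_h1 <= tau)               # Type II error: fail to reject when H0 is false
print(p_false_alarm, p_miss)                # each near 0.159 for these values
```

Moving τ trades one error rate against the other, which is the design choice studied in later detection chapters.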
Special Cases (cont.)
♦ Multiple decision problems, also called multiple hypothesis testing problems: Δ = {δ1, δ2, ..., δM}, M ≥ 3
♦ Point estimation of a real parameter θ: Δ = ℝ
– Example: the squared-error loss L(θ, δ) = c(θ − δ)²
Notational Conventions
♦ F_X(x|θ) indicates a probability distribution for the random variable X. θ can be viewed as a parameter, or regarded as a random variable.
♦ F_X(x|θ) is also denoted by the abbreviated form F_θ(x)
♦ E_θ[X] = ∫ x f_θ(x) dx
♦ P_θ[s] denotes the probability of the event s under the condition that θ is the true parameter
Populations and Statistics
♦ The problem of estimation is to
– Obtain a set of data (observations)
– Use this information to fashion a guess for the value of an unknown parameter
– One way to achieve this goal is random sampling
• Sampling: repeat a given experiment a number of times
• Let X be a random variable known as the population random variable
• The ith repetition involves the creation of a copy of the population on which Xi is defined
• The distribution of Xi is the same as the distribution of X
• The Xi are called sample random variables or, sometimes, the sample values of X
• Sampling is with replacement
Populations and Statistics (cont.)
♦ Statistics
– A function of the sample values of a random variable X is called a statistic of X.
– Example: the sample mean

X̄ = (1/n) Σ_{i=1}^n X_i
Conditional Expectation
♦ Continuous distributions:

E(X|Y) = ∫ x f_{X|Y}(x|y) dx

♦ Discrete distributions:

E(X|Y) = Σ_x x f_{X|Y}(x|y)

♦ In the discrete case, with events A = {X = x} and B = {Y = y}:

P(A ∩ B) = P(X = x, Y = y) = f(x, y),  P(B) = P(Y = y) = g(y)
P(A|B) = P(A ∩ B) / P(B)
f_{X|Y}(x|y) = f(x, y) / g(y)
Conditional Expectation (cont.)
♦ Properties of conditional expectations
– E(X|Y) = EX if X and Y are independent
– E(X|Y) is a function of Y
– E[E(X|Y)] = EX
– E[g(Y)X|Y] = g(Y)E(X|Y), where g(·) is a function
– E[c|Y] = c for any constant c
– E[g(Y)|Y] = g(Y)
– E[(cX + dZ)|Y] = cE[X|Y] + dE[Z|Y] for any constants c and d
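A quick numerical check of the tower property E[E(X|Y)] = EX, using an arbitrary joint distribution (Y uniform on {0, 1, 2} and X given Y = y distributed N(y, 1), so that E(X|Y) = Y and EX = EY = 1):

```python
import numpy as np

rng = np.random.default_rng(2)
# Arbitrary joint distribution: Y uniform on {0, 1, 2}; X | Y = y ~ N(y, 1).
y = rng.integers(0, 3, size=200_000)
x = y + rng.normal(size=y.size)

e_x = x.mean()                                            # EX
e_cond = np.array([x[y == k].mean() for k in range(3)])   # E(X|Y=k): a function of Y
e_tower = e_cond[y].mean()                                # E[E(X|Y)]
print(e_x, e_tower)                                       # the two agree
```

With the empirical conditional means, the agreement is exact up to floating-point rounding, not just approximate.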
Transformations of Random Variables
♦ Theorem:
– Let X and Y be continuous random variables with Y = g(X). Suppose g is one-to-one, and both g and its inverse function g⁻¹ are continuously differentiable. Then

f_Y(y) = f_X(g⁻¹(y)) |d g⁻¹(y)/dy|
Transformations of Random Variables (cont.)
♦ Proof:
Since g is one-to-one, it is either increasing or decreasing; suppose it is increasing. Let a and b be real numbers such that a < b; we have

P[Y ∈ (a, b)] = P[g(X) ∈ (a, b)] = P[X ∈ (g⁻¹(a), g⁻¹(b))]

P[Y ∈ (a, b)] = ∫_a^b f_Y(y) dy

P[X ∈ (g⁻¹(a), g⁻¹(b))] = ∫_{g⁻¹(a)}^{g⁻¹(b)} f_X(x) dx = ∫_a^b f_X(g⁻¹(y)) |d g⁻¹(y)/dy| dy

so that

∫_a^b [ f_Y(y) − f_X(g⁻¹(y)) |d g⁻¹(y)/dy| ] dy = 0    (*)
Transformations of Random Variables (cont.)
If the equation stated by the theorem is not true, there must exist some y* at which equality does not hold. By the continuity of the density functions f_X and f_Y, the integrand in (*) must be nonzero over some open interval containing y*, so the integral over that interval cannot vanish. This yields a contradiction; thus the theorem is true if g is increasing.
For the case where g is decreasing, the change of variable reverses the limits of integration as well as the sign of the slope; thus the absolute value is required.
Transformations of Random Variables (cont.)
♦ Example:
Suppose that a random variable X has the density

f_X(x) = (2πσ²)^(−1/2) exp(−x²/(2σ²))

and let R = g(X) = X², so that x = ±√r. Applying the theorem to each of the two branches of the inverse and summing (each branch contributes the slope factor (1/2) r^(−1/2)) gives

f_R(r) = (2πσ²r)^(−1/2) exp(−r/(2σ²)),  r > 0
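A Monte Carlo check of the derived density: square N(0, σ²) samples and compare the empirical distribution of R with the closed-form integral ∫_0^t f_R(r) dr = erf(√t/(σ√2)); the value of σ is arbitrary:

```python
import math

import numpy as np

rng = np.random.default_rng(3)
sigma = 1.5
r = rng.normal(0.0, sigma, size=500_000) ** 2    # R = X^2 with X ~ N(0, sigma^2)

# The derived density integrates to P(R <= t) = erf(sqrt(t) / (sigma * sqrt(2))).
for t in (0.5, 2.0, 5.0):
    emp = (r <= t).mean()                        # empirical CDF of R at t
    thy = math.erf(math.sqrt(t) / (sigma * math.sqrt(2)))
    assert abs(emp - thy) < 0.005
print("empirical distribution of R matches the derived density")
```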
Sufficient Statistics
♦ “How much information must be retained from sample data in order to make valid decisions?”
♦ Let X be a random variable whose distribution depends on a parameter θ. A real-valued function T of X is said to be sufficient for θ if the conditional distribution of X, given T = t, is independent of θ. That is, T is sufficient for θ if

F_{X|T}(x|t, θ) = F_{X|T}(x|t)
Sufficient Statistics (cont.)
♦ Example: a coin with unknown probability of heads p is independently tossed n times.
– Let Xi be zero if the outcome of the ith toss is tails and one if the outcome is heads
– Let T denote the total number of heads, T = Σ_{i=1}^n Xi
– X1, ..., Xn are i.i.d. with common pmf

f_X(xi|p) = P(Xi = xi|p) = p^{xi}(1 − p)^{1−xi}

– We must prove that the conditional probability of {X1, ..., Xn}, given T = t,

f_{X1,...,Xn|T}(x1, ..., xn|t, p) = P(X1 = x1, ..., Xn = xn|T = t, p),

is independent of p.
Sufficient Statistics (cont.)
The conditional probability is

f_{X1,...,Xn|T}(x1, ..., xn|t, p) = P(X1 = x1, ..., Xn = xn, T = t|p) / P(T = t|p)

For any x1, ..., xn with Σ_i xi = t,

P(X1 = x1, ..., Xn = xn, T = t|p) = p^{x1}(1 − p)^{1−x1} ⋯ p^{xn}(1 − p)^{1−xn} = p^{Σ xi}(1 − p)^{n−Σ xi} = p^t (1 − p)^{n−t}

P(T = t|p) = C(n, t) p^t (1 − p)^{n−t}

so

f_{X1,...,Xn|T}(x1, ..., xn|t, p) = 1 / C(n, t)

where C(n, t) = n!/(t!(n − t)!).
Independent of p
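This computation can be confirmed exactly by enumerating all length-n binary outcomes; the helper below (names are illustrative) reproduces the conditional pmf and checks that it equals 1/C(n, t) for several values of p:

```python
from itertools import product
from math import comb

def cond_pmf(x, t, p, n):
    """P(X1..Xn = x | T = t, p) for n independent Bernoulli(p) tosses."""
    if sum(x) != t:
        return 0.0
    joint = p**t * (1 - p) ** (n - t)             # P(X1=x1,...,Xn=xn, T=t | p)
    p_t = comb(n, t) * p**t * (1 - p) ** (n - t)  # P(T=t | p)
    return joint / p_t

n, t = 4, 2
for p in (0.2, 0.5, 0.9):
    vals = [cond_pmf(x, t, p, n) for x in product((0, 1), repeat=n) if sum(x) == t]
    assert all(abs(v - 1 / comb(n, t)) < 1e-12 for v in vals)
print("conditional pmf equals 1/C(n,t) for every p")
```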
Factorization Theorem
♦ Factorization Theorem
– A convenient mechanism for testing the sufficiency of a statistic.
– Let X = [X1, X2, ..., Xn]^T be a discrete random vector whose pmf f_X(x|θ), θ ∈ Θ, depends on a parameter θ. The statistic T = t(x) is sufficient for θ if and only if the pmf factors into a product of a function of t(x) and θ and a function of x alone; that is:

f_X(x|θ) = b(t(x), θ) a(x)
Factorization Theorem (cont.)
♦ Proof (⇒):
The joint pmf of X and T is

f_{X,T}(x, t(x)|θ) = f_X(x|θ) if T = t(x), and 0 otherwise

Since T is sufficient, f_{X|T}(x|t(x), θ) = f_{X|T}(x|t(x)) does not depend on θ, so

f_X(x|θ) = f_{X,T}(x, t(x)|θ) = f_{X|T}(x|t(x), θ) f_T(t(x)|θ) = f_{X|T}(x|t(x)) f_T(t(x)|θ)

Taking a(x) = f_{X|T}(x|t(x)) and b(t(x), θ) = f_T(t(x)|θ) gives the desired factorization.
Factorization Theorem (cont.)
♦ Proof (⇐): suppose f_X(x|θ) = b(t(x), θ) a(x).
Choose t0 such that f_T(t0|θ) > 0 for some θ ∈ Θ. For f_T(t0|θ) ≠ 0,

f_{X|T}(x|t0, θ) = f_{X,T}(x, t0|θ) / f_T(t0|θ)

where

f_{X,T}(x, t0|θ) = f_X(x|θ) if t(x) = t0, and 0 otherwise

f_T(t0|θ) = Σ_{x': t(x') = t0} f_X(x'|θ) = b(t0, θ) Σ_{x': t(x') = t0} a(x')

Hence, for t(x) = t0,

f_{X|T}(x|t0, θ) = b(t0, θ) a(x) / [b(t0, θ) Σ_{x': t(x') = t0} a(x')] = a(x) / Σ_{x': t(x') = t0} a(x')

and f_{X|T}(x|t0, θ) = 0 when t(x) ≠ t0.
Independent of θ
Examples of Sufficient Statistics (I)
♦ Bernoulli random variables
– Let X = [X1, X2, ..., Xn]^T be a random vector, where the Xi are independent Bernoulli random variables with parameter p.

f_X(x|p) = Π_{i=1}^n p^{xi}(1 − p)^{1−xi} = p^{Σ xi}(1 − p)^{n−Σ xi} = p^t (1 − p)^{n−t},  t = t(x) = Σ_i xi

By the factorization theorem, with b(t, p) = p^t (1 − p)^{n−t} and a(x) = 1, the statistic t(x) = Σ_i xi is sufficient for p.
Examples of Sufficient Statistics (II)
♦ Gaussian random variables
– Let X = [X1, X2, ..., Xn]^T be a random vector, where each Xi is drawn from N(μ, σ²)
– The joint pdf of these random variables is

f_X(x|μ, σ²) = (2πσ²)^(−n/2) exp(−(2σ²)^(−1) Σ_{i=1}^n (xi − μ)²)

– Three cases arise:
• Only μ is unknown: Σ_i xi is sufficient
• Only σ² is unknown: Σ_i (xi − μ)² is sufficient
• Both μ and σ² are unknown: (Σ_i xi, Σ_i xi²) is jointly sufficient
Examples of Sufficient Statistics (III)
♦ X1, ..., Xn are drawn from the uniform distribution over [a, b]
– The joint density of X1, ..., Xn is

f_X(x|a, b) = (b − a)^(−n) Π_{i=1}^n I_{[a,b]}(xi)

where

I_A(x) = 1 if x ∈ A, 0 if x ∉ A

Equivalently,

f_X(x|a, b) = (b − a)^(−n) I_{(−∞, b]}(max_i xi) I_{[a, ∞)}(min_i xi)

• a is unknown: min_i xi is sufficient
• b is unknown: max_i xi is sufficient
• Both a and b are unknown: (min_i xi, max_i xi) is jointly sufficient
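The two expressions for the joint density (a product of per-sample indicators versus indicators of the min and max) can be checked to agree numerically; a, b, and the sampling range below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
a, b, n = 1.0, 3.0, 5                  # arbitrary interval and sample size

def joint_product(x):
    # (b - a)^(-n) * product of I_[a,b](x_i)
    return (b - a) ** -n * np.all((a <= x) & (x <= b))

def joint_minmax(x):
    # (b - a)^(-n) * I_(-inf,b](max x_i) * I_[a,inf)(min x_i)
    return (b - a) ** -n * (x.max() <= b) * (x.min() >= a)

for _ in range(1000):
    x = rng.uniform(0.0, 4.0, size=n)  # some draws fall outside [a, b]
    assert joint_product(x) == joint_minmax(x)
print("the two forms of the joint density agree")
```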
Minimal Sufficient Statistics
♦ A sufficient statistic leads to economy in the design of algorithms to compute estimates, and simplifies the requirements for data acquisition and storage, since only the sufficient statistic needs to be retained for purposes of estimation.
♦ Which sufficient statistic is minimal?
– An example: each of the following is sufficient

T1 = (X1, ..., Xn)
T2 = X̄
T3 = (Σ_{i=1}^{n−1} Xi, Xn)
Definition of Minimal Sufficient Statistics
♦ A sufficient statistic S for a parameter θ ∈ Θ that is a function of every other sufficient statistic for θ is said to be a minimal sufficient statistic for θ
♦ Questions about minimal sufficient statistics:
– Does one always exist?
– If so, is it unique?
– If it exists, how do I find it?
Complete Sufficient Statistics
♦ Let T be a sufficient statistic for a parameter θ, and let ω(t) be any real-valued function of T. T is said to be complete if

E_θ[ω(T)] = 0 for all θ ∈ Θ

implies that

P_θ[ω(T) = 0] = 1 for all θ ∈ Θ
An Example for Complete Sufficient Statistics
♦ Let X1, ..., Xn be a sample from the uniform distribution over the interval [0, θ], θ > 0, and let T = max_j Xj. We will compute the density of T.

P[T ≤ t] = P(X1 ≤ t, ..., Xn ≤ t) = 0 for t < 0, (t/θ)^n for 0 ≤ t ≤ θ, and 1 for θ < t

so the density of T is f_T(t|θ) = n t^{n−1} θ^{−n} on [0, θ], and

E_θ[ω(T)] = ∫ ω(t) f_T(t|θ) dt = ∫_0^θ ω(t) n t^{n−1} θ^{−n} dt
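The cdf computed above, P[T ≤ t] = (t/θ)^n on [0, θ], is easy to verify by simulation; θ and n below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)
theta, n = 2.0, 5                                 # arbitrary parameter and sample size
t_max = rng.uniform(0.0, theta, size=(200_000, n)).max(axis=1)  # T = max_j X_j

# Check P(T <= t) = (t / theta)^n at a few points in [0, theta].
for t in (0.5, 1.0, 1.5):
    emp = (t_max <= t).mean()
    thy = (t / theta) ** n
    assert abs(emp - thy) < 0.005
print("P(T <= t) matches (t/theta)^n")
```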
Minimal Sufficient Statistics
♦ Theorem: a complete sufficient statistic for a parameter θ is minimal
Exponential Families
♦ A family of distributions with pmf or pdf f_X(x|θ) is said to be a k-parameter exponential family if f_X(x|θ) has the form

f_X(x|θ) = c(θ) a(x) exp[Σ_{i=1}^k π_i(θ) t_i(x)]

In this definition, θ may be either a scalar or a vector of parameters
♦ For a k-parameter exponential family, the sufficient statistic

T = (Σ_{j=1}^n t_1(Xj), ..., Σ_{j=1}^n t_k(Xj))^T

is complete, and therefore a minimal sufficient statistic
Examples of Exponential Family
♦ Binomial distribution

f_X(x|θ) = C(m, x) θ^x (1 − θ)^{m−x} = C(m, x) (1 − θ)^m exp{x [log θ − log(1 − θ)]}

so c(θ) = (1 − θ)^m, a(x) = C(m, x), π(θ) = log θ − log(1 − θ), and t(x) = x
•Binomial distribution
•Normal distribution
+
−
−
=
−−=
xx
xxfX
22
22
2
2
2
21exp
2exp
21
2)(exp
21)(
σµ
σσµ
σπ
σµ
σπ