Old and New Results in Robust Hypothesis Testing
Bernard C. Levy
Department of Electrical and Computer Engineering
University of California, Davis
January 12, 2008
GGAM Mini-Conference, Department of Electrical and Computer Engineering, University of California at Davis
Outline

• Binary Hypothesis Testing
• Robust Hypothesis Testing
• Huber's Clipped LR Test
• Robustness with a KL Divergence Tolerance
• Simulations
Binary Hypothesis Testing

• Consider an observation Y ∈ 𝒴, where under hypothesis H_0, Y has probability density f_0(y), and under H_1 it has density f_1(y).

• Given Y, we need to decide between H_0 and H_1. We use a randomized decision rule δ(y): given Y = y, we select H_1 with probability δ(y) and H_0 with probability 1 − δ(y), where 0 ≤ δ(y) ≤ 1 for all y ∈ 𝒴. Note that the set 𝒟 of decision rules is convex.

• Bayesian hypothesis testing assumes a priori probabilities

  π_0 = P[H_0],   π_1 = P[H_1] = 1 − π_0,

and costs C_M and C_F for a miss (deciding H_0 when H_1 holds) and a false alarm (deciding H_1 when H_0 holds), respectively.
Binary Hypothesis Testing (cont'd)

Let

  P_F(δ, f_0) = ∫ δ(y) f_0(y) dy
  P_M(δ, f_1) = ∫ (1 − δ(y)) f_1(y) dy

denote the probability of false alarm and of a miss under H_0 and H_1, respectively. The optimal Bayesian test minimizes the risk

  R(δ, f_0, f_1) = π_0 C_F P_F(δ, f_0) + π_1 C_M P_M(δ, f_1)
                 = ∫ [π_0 C_F δ(y) f_0(y) + π_1 C_M (1 − δ(y)) f_1(y)] dy.
Binary Hypothesis Testing (cont'd)

Optimal Bayesian test: Let L(y) = f_1(y)/f_0(y) denote the likelihood ratio (LR) and τ_B = (π_0 C_F)/(π_1 C_M). The test minimizing the Bayesian risk is given by

  δ_B(y) = 1          if L(y) > τ_B
         = 0          if L(y) < τ_B
         = arbitrary  if L(y) = τ_B,

and randomization is not needed.

Neyman-Pearson test (of type I): minimizes P_M(δ, f_1) under the constraint P_F(δ, f_0) ≤ α. Solution:

  δ_NP(y) = 1  if L(y) > τ
          = 0  if L(y) < τ
          = γ  if L(y) = τ.
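As a concrete illustration of the Bayesian LR test above, here is a minimal numerical sketch. The Gaussian nominal densities f_0 = N(−1, 1) and f_1 = N(+1, 1) are an assumption for illustration (they match the shift model used in the simulation slides), for which L(y) = exp(2y):

```python
import numpy as np

# Illustrative nominal densities: f0 = N(-1, 1) under H0, f1 = N(+1, 1) under H1.
def f0(y):
    return np.exp(-(y + 1.0) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

def f1(y):
    return np.exp(-(y - 1.0) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

def likelihood_ratio(y):
    # L(y) = f1(y)/f0(y); for this Gaussian shift model it reduces to exp(2y).
    return f1(y) / f0(y)

def bayes_test(y, pi0=0.5, pi1=0.5, c_f=1.0, c_m=1.0):
    # Decide H1 (return 1) iff L(y) > tau_B = (pi0 * C_F) / (pi1 * C_M).
    # The event L(y) = tau_B has probability zero here, so no randomization is needed.
    tau_b = (pi0 * c_f) / (pi1 * c_m)
    return int(likelihood_ratio(y) > tau_b)
```

With equal priors and unit costs, τ_B = 1, so the test reduces to deciding H_1 exactly when y > 0.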
Binary Hypothesis Testing (cont'd)

• The threshold τ and randomization probability γ are selected as follows. Let F_L(ℓ | H_0) = P[L ≤ ℓ | H_0] denote the cumulative probability distribution of the likelihood ratio L under H_0. Then F_L(τ | H_0) = 1 − α and γ = 0 if 1 − α is in the range of F_L(· | H_0), and if

  F_L(τ⁻ | H_0) < 1 − α < F_L(τ | H_0),

then

  γ = [F_L(τ | H_0) − (1 − α)] / [F_L(τ | H_0) − F_L(τ⁻ | H_0)].

• Both the Bayesian and NP tests rely on the LR function L(y). Only the threshold selection changes.
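When the distribution of L under H_0 has atoms, this rule determines both τ and γ. A small sketch with a made-up discrete model (all pmf values hypothetical, chosen so the LR has an atom at the threshold):

```python
import numpy as np

# Toy discrete model: Y takes 3 values with nominal pmfs f0 and f1 (hypothetical numbers).
f0 = np.array([0.6, 0.3, 0.1])
f1 = np.array([0.2, 0.3, 0.5])
L = f1 / f0  # likelihood-ratio values: [1/3, 1, 5]

def np_threshold(alpha):
    # Select tau as the smallest LR value with F_L(tau | H0) >= 1 - alpha, then
    # choose gamma by the randomization formula so that P_F equals alpha exactly.
    order = np.argsort(L)
    l_sorted, p0_sorted = L[order], f0[order]
    F = np.cumsum(p0_sorted)                  # F_L(l | H0) at each atom
    k = int(np.searchsorted(F, 1.0 - alpha))  # first atom with F_L >= 1 - alpha
    tau, F_tau = l_sorted[k], F[k]
    F_tau_minus = F[k - 1] if k > 0 else 0.0
    if np.isclose(F_tau, 1.0 - alpha):
        gamma = 0.0
    else:
        gamma = (F_tau - (1.0 - alpha)) / (F_tau - F_tau_minus)
    return tau, gamma

tau, gamma = np_threshold(alpha=0.05)  # tau = 5.0, gamma ≈ 0.5
```

The resulting size is exact: P_F = P[L > τ | H_0] + γ P[L = τ | H_0] = 0 + 0.5 × 0.1 = 0.05.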
Robust Hypothesis Testing

• The actual probability densities g_0 and g_1 of observation Y under H_0 and H_1 may differ slightly from the nominal densities f_0 and f_1. Assume g_j ∈ 𝒢_j, where 𝒢_j denotes a convex neighborhood of f_j, for j = 0, 1.

• Let 𝒢 = 𝒢_0 × 𝒢_1. The robust Bayesian hypothesis testing problem can be expressed as

  min_{δ ∈ 𝒟} max_{(g_0, g_1) ∈ 𝒢} R(δ, g_0, g_1).

Since R(δ, g_0, g_1) is separately linear with respect to δ and (g_0, g_1), the min-max problem has a convex-concave structure. For appropriate choices of metrics, 𝒟 and 𝒢 are compact, so by von Neumann's minimax theorem there exists a saddle point (δ_R, g_0^L, g_1^L) satisfying

  R(δ_R, g_0, g_1) ≤ R(δ_R, g_0^L, g_1^L) ≤ R(δ, g_0^L, g_1^L).   (1)
Robust Hypothesis Testing (cont'd)

• Here δ_R = robust test, and (g_0^L, g_1^L) = least-favorable (LF) densities. The second inequality in (1) implies δ_R is the optimum Bayesian test for the pair (g_0^L, g_1^L), so δ_R can be expressed as the LR test

  L^L(y) = g_1^L(y)/g_0^L(y) ≷ τ_B   (decide H_1 if above, H_0 if below).

• Since R(δ, g_0, g_1) is a fixed linear combination of P_M(δ, g_1) and P_F(δ, g_0), the first inequality in (1) is equivalent to

  P_F(δ_R, g_0) ≤ P_F(δ_R, g_0^L),   P_M(δ_R, g_1) ≤ P_M(δ_R, g_1^L)   (2)

for all g_0 ∈ 𝒢_0 and g_1 ∈ 𝒢_1.
Robust Hypothesis Testing (cont'd)

• The robust NP test solves

  min_{δ ∈ 𝒟_α} max_{g_1 ∈ 𝒢_1} P_M(δ, g_1)   (3)

where

  𝒟_α = {δ ∈ 𝒟 : max_{g_0 ∈ 𝒢_0} P_F(δ, g_0) ≤ α}

is the set of decision rules of size at most α. Since P_F(δ, g_0) is a convex function of δ for each g_0 ∈ 𝒢_0, so is max_{g_0 ∈ 𝒢_0} P_F(δ, g_0); hence 𝒟_α is convex.

• The cost function P_M(δ, g_1) has a convex-concave structure, so a saddle point exists, and δ_R is the optimal NP test for the least-favorable observation densities (g_0^L, g_1^L).
Huber's Clipped LR Test

• Different choices of neighborhoods 𝒢_j yield different robust tests. Let G_j(y) and F_j(y) denote the cumulative probability distribution functions corresponding to the actual and nominal densities g_j(y) and f_j(y) for j = 0, 1. For some numbers 0 ≤ ε_0, ε_1, δ_0, δ_1 < 1, Huber considered the neighborhoods

  𝒢_0 = {g_0 : G_0(y) ≥ (1 − ε_0) F_0(y) − δ_0 for all y ∈ 𝒴}
  𝒢_1 = {g_1 : 1 − G_1(y) ≥ (1 − ε_1)(1 − F_1(y)) − δ_1 for all y ∈ 𝒴}.

• The constraints specifying 𝒢_j are linear in g_j, so the neighborhoods are convex.

• Since the functions P_M(δ_R, g_1) and P_F(δ_R, g_0) are linear in g_1 and g_0, the maximization (2) for the least-favorable densities is a linear programming problem, so solutions will be located on the boundary of 𝒢_j.
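Membership in these neighborhoods is a pointwise CDF inequality, which is easy to check numerically. A minimal sketch (the contaminating density, the shifted density, and the ε, δ values are all hypothetical, chosen only for illustration):

```python
import numpy as np

# Grid and nominal density f0 = N(-1, 1); contaminant h = N(4, 1).
y = np.linspace(-12.0, 12.0, 24001)
dy = y[1] - y[0]
f0 = np.exp(-(y + 1.0) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
h = np.exp(-(y - 4.0) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
eps0, delta0 = 0.1, 0.02

def cdf(p):
    # CDF on the grid via a cumulative Riemann sum.
    return np.cumsum(p) * dy

def in_G0(g0, eps, delta):
    # Pointwise check of G0(y) >= (1 - eps) F0(y) - delta at every grid point.
    return bool(np.all(cdf(g0) >= (1.0 - eps) * cdf(f0) - delta))

g_contam = (1.0 - eps0) * f0 + eps0 * h  # an eps-contaminated version of f0
g_shift = np.exp(-(y - 2.0) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)  # N(2,1): mass pushed right

# in_G0(g_contam, eps0, delta0) is True; in_G0(g_shift, eps0, delta0) is False,
# since shifting mass to the right depresses the CDF below the allowed bound.
```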
Huber's Clipped LR Test (cont'd)

Least-favorable densities: There exists an interval [y', y''] such that over this interval

  G_0^L(y) = (1 − ε_0) F_0(y) − δ_0
  G_1^L(y) = (1 − ε_1) F_1(y) + ε_1 + δ_1,

so the least-favorable densities are on the boundary of the sets 𝒢_0 and 𝒢_1. For j = 0, 1, this implies

  g_j^L(y) = (1 − ε_j) f_j(y)

over [y', y'']. Let

  W(y) = w' f_0(y) + v' f_1(y)
  V(y) = w'' f_0(y) + v'' f_1(y),
Huber’s Clipped LR Test (cont’d)
with
Y^Z R S � � R � Y^Z Z R � S �� � R �
\ Z S �� � R � � \ Z Z S � � R !
Let2 1 " �� 1 ,2W " ��W . Then
4 15 �� _ 5 X �� � � � � 1
4 15 �� � 5 ] �� � � U �W �
with _ _ � � � R � � R � 2 1 � � � � � � R � � R � 2W !
GGAM Mini-Conference Department of Electrical and Computer EngineeringUniversity of California at Davis
14
Huber's Clipped LR Test (cont'd)

Clipping transformation: The least-favorable LR can be expressed as

  L^L(y) = g_1^L(y)/g_0^L(y) = [(1 − ε_1)/(1 − ε_0)] φ(L(y)),

where φ(·) is the clipping nonlinearity shown below.

[Figure: φ(L) versus L; φ(L) = ℓ' for L ≤ ℓ', φ(L) = L for ℓ' ≤ L ≤ ℓ'', and φ(L) = ℓ'' for L ≥ ℓ''.]
Huber's Clipped LR Test (cont'd)

Robust test: The decision rule

  L^L(y) ≷ τ_B   (decide H_1 if above, H_0 if below)

can be rewritten as

  φ(L(y)) ≷ [(1 − ε_0)/(1 − ε_1)] τ_B.
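The resulting detector needs only the nominal LR and the two clipping levels. A minimal sketch; the Gaussian nominal LR exp(2y) and all parameter values (ℓ', ℓ'', ε_0, ε_1) are illustrative assumptions:

```python
import numpy as np

def phi(L, l_lo, l_hi):
    # Huber's clipping nonlinearity: l' for L <= l', identity in between, l'' for L >= l''.
    return np.clip(L, l_lo, l_hi)

def huber_robust_test(y, l_lo, l_hi, tau_b, eps0, eps1):
    # Decide H1 (return 1) iff phi(L(y)) > ((1 - eps0)/(1 - eps1)) * tau_b.
    L = np.exp(2.0 * y)  # nominal LR for the N(-1,1) vs N(+1,1) model (assumption)
    threshold = (1.0 - eps0) / (1.0 - eps1) * tau_b
    return (phi(L, l_lo, l_hi) > threshold).astype(int)

y = np.array([-3.0, -0.2, 0.2, 3.0])
decisions = huber_robust_test(y, l_lo=0.5, l_hi=2.0, tau_b=1.0, eps0=0.1, eps1=0.1)
# decisions -> [0, 0, 1, 1]
```

The clipping is what buys robustness: an outlier at y = 3 contributes no more evidence than the clip point ℓ'' allows, so a small fraction of contaminated observations cannot dominate the decision.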
Huber's Clipped LR Test (cont'd)

• For δ_0 = δ_1 = 0, the LF distributions belong to the contamination class

  𝒞_j^ε = {g_j : G_j(y) = (1 − ε_j) F_j(y) + ε_j H(y) for all y}

contained in 𝒢_j, where H(y) is an arbitrary probability distribution.

• For ε_0 = ε_1 = 0, the LF densities belong to the total variation class

  𝒞_j^TV = {g_j : ∫ |g_j(y) − f_j(y)| dy ≤ 2 δ_j}

contained in 𝒢_j.
Robustness with a KL Tolerance

• For j = 0, 1, consider the neighborhoods

  𝒢_j = {g_j : D(g_j ‖ f_j) ≤ ε},

where

  D(g ‖ f) = ∫ ln(g(y)/f(y)) g(y) dy

is the Kullback-Leibler divergence, or relative entropy, of density g with respect to f.

• D(g ‖ f) is convex in g, so 𝒢_j is convex. D(g ‖ f) is not a true distance, since it is not symmetric (D(f ‖ g) ≠ D(g ‖ f)) and does not satisfy the triangle inequality. But D(g ‖ f) ≥ 0, with equality if and only if g = f.

• D(g ‖ f) and its dual D*(g ‖ f) = D(f ‖ g) admit a non-Riemannian differential-geometric interpretation in terms of dual connections.
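The divergence and its asymmetry are easy to check numerically. A sketch using two zero-mean Gaussians on a grid; the closed form D(N(0, σ_1²) ‖ N(0, σ_2²)) = ln(σ_2/σ_1) + σ_1²/(2σ_2²) − 1/2 is a standard fact used here only to validate the quadrature:

```python
import numpy as np

def kl_divergence(g, f, dy):
    # D(g || f) = integral of ln(g/f) g dy, approximated by a Riemann sum on the grid.
    return float(np.sum(g * np.log(g / f)) * dy)

y = np.linspace(-12.0, 12.0, 120001)
dy = y[1] - y[0]
g = np.exp(-y ** 2 / (2.0 * 4.0)) / np.sqrt(2.0 * np.pi * 4.0)  # N(0, 4)
f = np.exp(-y ** 2 / 2.0) / np.sqrt(2.0 * np.pi)                # N(0, 1)

d_gf = kl_divergence(g, f, dy)  # ~ ln(1/2) + 2 - 1/2 ≈ 0.8069
d_fg = kl_divergence(f, g, dy)  # ~ ln(2) + 1/8 - 1/2 ≈ 0.3181
# d_gf != d_fg: the divergence is not symmetric, as stated above.
```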
Robustness with a KL Tolerance (cont'd)

Assumptions:

i) The nominal LR L(y) = f_1(y)/f_0(y) is monotone increasing in y.

ii) f_1(y) = f_0(−y).

iii) 0 < ε < D(f_{1/2} ‖ f_0), where f_{1/2}(y) is the mid-way density on the geodesic

  f_λ(y) = f_0^{1−λ}(y) f_1^λ(y) / Z(λ)

linking f_0 and f_1. Here

  Z(λ) = ∫ f_0^{1−λ}(y) f_1^λ(y) dy

is a normalization constant.
Robustness with a KL Tolerance (cont'd)

Robust test and LF densities: For a minimum probability of error criterion (C_F = C_M = 1) and equally likely hypotheses, there exists y_u > 0 such that

  δ_R(y) = 1                               for y > y_u
         = (1/2) [1 + ln L(y) / ln ℓ'']    for −y_u ≤ y ≤ y_u
         = 0                               for y < −y_u,

  g_0^L(y) = ℓ'' f_0(y) / Z(y_u)                          for y > y_u
           = (ℓ'')^{1/2} (f_0(y) f_1(y))^{1/2} / Z(y_u)   for −y_u ≤ y ≤ y_u
           = f_0(y) / Z(y_u)                              for y < −y_u,

and g_1^L(y) = g_0^L(−y), with ℓ'' = L(y_u).
Robustness with a KL Tolerance (cont'd)

Nonlinear transformation: The least-favorable LR can be expressed as a nonlinear transformation L^L = ψ(L) of the nominal LR.

[Figure: ψ(L) versus L; ψ(L) = ℓ'' L for L ≤ 1/ℓ'', ψ(L) = 1 for 1/ℓ'' ≤ L ≤ ℓ'', and ψ(L) = L/ℓ'' for L ≥ ℓ''.]
Robustness with a KL Tolerance (cont'd)

• g_0^L(· | y_u) is parametrized by y_u, with g_0^L = f_0 for y_u = 0 and lim g_0^L = f_{1/2} as y_u → ∞.

• y_u is selected such that D(g_0^L(· | y_u) ‖ f_0) = ε. This relies on showing that

  D(y_u) = D(g_0^L(· | y_u) ‖ f_0)

is a monotone increasing function of y_u.
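Given the monotonicity of D(y_u), the tolerance constraint D(y_u) = ε can be solved by bisection. A sketch, reusing the Gaussian nominal model N(∓1, 1) and the piecewise g_0^L form from the earlier slide (both assumptions carried over), with grid quadrature throughout:

```python
import numpy as np

y = np.linspace(-12.0, 12.0, 60001)
dy = y[1] - y[0]
f0 = np.exp(-(y + 1.0) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
f1 = np.exp(-(y - 1.0) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

def g0_least_favorable(y_u):
    l2 = np.exp(2.0 * y_u)
    g = np.where(y < -y_u, f0, np.where(y > y_u, l2 * f0, np.sqrt(l2 * f0 * f1)))
    return g / (np.sum(g) * dy)

def D_of_yu(y_u):
    # D(y_u) = D(g0^L(.|y_u) || f0), computed on the grid.
    g = g0_least_favorable(y_u)
    return float(np.sum(g * np.log(g / f0)) * dy)

def solve_yu(eps, lo=0.0, hi=10.0, iters=60):
    # Bisection: D(0) = 0 and D(y_u) increases toward D(f_1/2 || f0) = 0.5 for
    # this model, so any 0 < eps < 0.5 has a unique solution.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if D_of_yu(mid) < eps else (lo, mid)
    return 0.5 * (lo + hi)

y_u_hat = solve_yu(eps=0.1)
```

Bisection is the simplest choice here precisely because the slide's monotonicity result guarantees a single crossing; a faster root finder would also work but adds nothing conceptually.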
Simulations

Consider the nominal model

  H_0 : Y = −1 + V,   H_1 : Y = 1 + V,

with V ~ N(0, σ²), so Y ~ N(∓1, σ²). D(y_u) is plotted below for SNR = 0 dB (σ = 1).

[Figure: D(y_u) versus y_u for 0 ≤ y_u ≤ 0.8; the curve increases monotonically from 0 to about 0.18.]
Simulations (cont'd)

LF densities g_0^L for ε = 0.1 and SNR = 0, 10 dB.

[Figure: two plots of g_0 versus y, each comparing the nominal density and the least-favorable density, for SNR = 0 dB (left) and SNR = 10 dB (right).]
Simulations (cont'd)

Comparison of the worst-case probability of error of the robust test δ_R with ε = 0.01 and 0.1 against the probability of error of the Bayesian test on the nominal model.

[Figure: P_E versus SNR (0 to 15 dB) on a logarithmic scale from 10^0 down to 10^−10, with curves for the nominal test, ε = 0.01, and ε = 0.1.]
References

[1] P. J. Huber, "A robust version of the probability ratio test," Annals Math. Stat., vol. 36, pp. 1753–1758, Dec. 1965.

[2] P. J. Huber, Robust Statistics. New York: J. Wiley, 1981.

[3] B. C. Levy, "Robust hypothesis testing with a relative entropy tolerance," preprint, available at http://arxiv.org/abs/cs/0707.2926, July 2007.

[4] B. C. Levy, Principles of Signal Detection and Parameter Estimation. Springer Verlag, May 2008 (to appear).