Robust Frequency-Selective Filtering using Weighted Myriad Filters Admitting Real-Valued Weights

Sudhakar Kalluri and Gonzalo R. Arce
Department of Electrical and Computer Engineering
University of Delaware, Newark, Delaware 19716
phone: (302) 831-8030, fax: (302) 831-4316
e-mail: [email protected] and [email protected]

Abstract

Weighted Myriad Smoothers have recently been proposed as a class of nonlinear filters for robust non-Gaussian signal processing in impulsive noise environments. However, weighted myriad smoothers are severely limited, since their weights are restricted to be non-negative. This constraint makes them unusable in bandpass or highpass filtering applications which require negative filter weights. Further, they are incapable of amplifying selected frequency components of an input signal, since the output of a weighted myriad smoother always lies within the dynamic range of its input samples. In this paper, we generalize the weighted myriad smoother to a richer structure, a weighted myriad filter admitting real-valued weights. This involves assigning a pair of filter weights, one positive and the other negative, to each of the input samples. Equivalently, the filter can be described as a weighted myriad smoother applied to a transformed set of samples that includes the original input samples as well as their negatives. The weighted myriad filter is analogous to a normalized linear FIR filter with real-valued weights whose absolute values sum to unity. By suitably scaling the output of the weighted myriad filter, we extend it to yield the so-called scaled weighted myriad filter, which includes (but is more powerful than) the traditional unconstrained linear FIR filter. Finally, we derive stochastic gradient-based nonlinear adaptive algorithms for the optimization of these novel myriad filters under the mean square error criterion.

Submitted to the IEEE Transactions on Signal Processing.
EDICS Paper Category: SP 2.1.5 (Rank-Order and Median Filters).
Permission to publish this abstract separately is granted.
This research was funded in part by the National Science Foundation under Grant MIP-9530923.

1 Introduction

The statistical signal processing literature has traditionally been dominated by linear estimators, which are optimal under the assumption of the Gaussian model for the signal statistics. However, a large number of real-world processes (including radar clutter, ocean acoustic noise, and multiple-access interference in wireless communication systems) have been found to be impulsive in nature, with sharp spikes or occasional outliers present in the data [1, 2]. Such processes are more accurately modeled by distributions having heavier-than-Gaussian tails in their density functions [3]. There is, therefore, a need for the development of robust nonlinear estimators for impulsive environments, based on heavy-tailed non-Gaussian distributions for the underlying signals.

The class of M-estimators (Maximum-Likelihood type estimators) of location, developed in the theory of robust statistics [4], has been of fundamental importance in the development of robust signal processing techniques [5]. Given a set of observations (input samples) $\{x_i\}_{i=1}^N$, an M-estimate of their common location is given by

    \hat{\theta} \triangleq \arg\min_{\theta} \sum_{i=1}^{N} \rho(x_i - \theta),    (1)

where $\rho(\cdot)$ is called the cost function of the M-estimator. Maximum-Likelihood location estimates form a special type of M-estimators, with the observations being independent and identically distributed (i.i.d.) and $\rho(u) \equiv -\log f(u)$, where $f(u)$ is the common density function of the input samples. The linear, weighted median and the recently proposed weighted myriad smoother families are all derived from Maximum-Likelihood location estimators. Weighted median smoothers, and other nonlinear smoothers based on order statistics [6, 7], are derived from the heavy-tailed Laplacian noise model, and are extensively used in robust image processing applications. Weighted Myriad Smoothers [8, 9, 10, 11] have been developed based on the so-called $\alpha$-stable distributions [2, 3], which are more heavy-tailed than the Gaussian as well as the Laplacian distributions, while including the Gaussian distribution as a special limiting case. Myriad smoothers have been successfully utilized in several applications in robust communications and image processing [12, 13, 14].
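As a quick numerical illustration of the location estimate in (1), the sketch below (not part of the original paper) approximates the minimizer by a dense grid search over the sample range, where the estimate is known to lie for the cost functions considered here. The function name, grid resolution, test data, and the choice $K = 1$ for the myriad cost $\log(K^2 + u^2)$ introduced in the sequel are all illustrative assumptions.

```python
import numpy as np

def m_estimate(x, rho, grid_points=2001):
    """Numerically approximate the M-estimate of location in (1):
    theta_hat = argmin_theta sum_i rho(x_i - theta),
    searched on a dense grid spanning the sample range."""
    x = np.asarray(x, dtype=float)
    grid = np.linspace(x.min(), x.max(), grid_points)
    # Cost sum_i rho(x_i - theta), evaluated for every candidate theta on the grid.
    cost = np.array([np.sum(rho(x - t)) for t in grid])
    return grid[np.argmin(cost)]

x = np.array([0.1, -0.3, 0.2, 8.0, 0.0])            # one gross outlier (8.0)
print(m_estimate(x, lambda u: u**2))                 # sample mean, pulled toward the outlier
print(m_estimate(x, np.abs))                         # sample median, close to the data cluster
print(m_estimate(x, lambda u: np.log(1.0 + u**2)))   # sample myriad (K = 1), also robust
```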

The fundamental M-estimators that generate the linear, weighted median and weighted myriad smoother families are the sample mean, sample median and sample myriad, respectively. Using the Gaussian and Laplacian density functions in Maximum-Likelihood location estimation, we obtain the cost functions for the sample mean and the sample median as $\rho(u) = u^2$ and $\rho(u) = |u|$, respectively. The sample myriad [15, 8, 9, 10] is defined using the Cauchy density function, which is the only symmetric (non-Gaussian) $\alpha$-stable density function that is expressible in closed form. The cost function thus obtained is $\rho(u) = \log(K^2 + u^2)$, where the parameter $K$ controls the robustness of the estimator; a more detailed description is given in Section 2. The first three rows of Table 1 show the cost functions for the sample mean, median and myriad estimators.

From a signal processing point of view, these plain location M-estimators, using i.i.d. observations, do not adequately capture the statistical relationships among the different samples in an input signal. A proper extension of these estimators would be to consider multivariate density functions for the input samples: $\{X_i\}_{i=1}^N \sim f_{\mathbf{X}}(\mathbf{x}; \theta) = f_{\mathbf{X}}(x_1 - \theta, x_2 - \theta, \ldots, x_N - \theta)$. An alternative approach that is more tractable is to consider samples that are independent, but not identically distributed, with common location $\theta$ but varying scale factors $S_i$: $X_i \sim f(x_i - \theta; S_i)$. Using this approach, Maximum-Likelihood location estimators are generalized by introducing non-negative weights $\{w_i \geq 0\}_{i=1}^N$ in their cost function expressions; the weight $w_i$ is related to the scale factor $S_i$, and the varying scales reflect the non-uniform reliabilities of the input samples. By extending this principle to the more general M-estimator of (1), we obtain a weighted M-estimator:

    \hat{\theta} \triangleq \arg\min_{\theta} \sum_{i=1}^{N} \rho\left(\sqrt{w_i}\,(x_i - \theta)\right).    (2)

For reasons to be described shortly, these estimators are also referred to as M-smoothers. The last three rows of Table 1 show the different M-smoothers obtained using the cost functions for the linear, median and myriad smoother families, respectively. The notations $w_i \diamond x_i$ (for the weighted median) and $w_i \circ x_i$ (for the weighted myriad), shown in the last column of the table, reflect the weighting operation in (2).

    Smoother          Cost Function                                              Output \hat{\theta}
    Linear            \sum_{i=1}^N (x_i - \theta)^2                              mean(x_1, x_2, ..., x_N) = \sum_{i=1}^N x_i / N
    Median            \sum_{i=1}^N |x_i - \theta|                                median(x_1, x_2, ..., x_N)
    Myriad            \sum_{i=1}^N \log[K^2 + (x_i - \theta)^2]                  myriad(x_1, x_2, ..., x_N; K)
    Weighted Mean     \sum_{i=1}^N w_i (x_i - \theta)^2                          mean(w_i \diamond x_i)|_{i=1}^N = \sum_{i=1}^N w_i x_i / \sum_{i=1}^N w_i
    Weighted Median   \sum_{i=1}^N w_i' |x_i - \theta|,  w_i' \triangleq \sqrt{w_i}   median(w_i' \diamond x_i)|_{i=1}^N
    Weighted Myriad   \sum_{i=1}^N \log[K^2 + w_i (x_i - \theta)^2]              myriad(w_i \circ x_i; K)|_{i=1}^N

Table 1: M-estimator and M-smoother (weighted M-estimator) cost functions and outputs for various filter families.

In spite of their proven robustness properties, weighted M-estimators in general, and weighted median and myriad smoothers in particular, are severely limited in their potential for signal processing and communications applications. This is directly attributable to the constraint of non-negative weights, which makes these estimators no more than smoothers or "lowpass" type filters; this is precisely why they are called 'M-smoothers'. M-smoothers are thus unusable for bandpass or highpass filtering applications which require negative filter weights. Further, they are incapable of arbitrary amplification of specified frequency components of an input signal, since the output of an M-smoother is restricted to the dynamic range of its input samples. That the weighted myriad smoother in particular is severely constrained is clear from the fact that this filter is analogous to the weighted mean smoother, which is a normalized linear FIR filter with non-negative weights summing to unity. It is evident that such a severely handicapped linear filter would be quite useless in most signal processing applications. In order to overcome the limitations of the weighted myriad smoother, it needs to be generalized into a robust filter structure that possesses the full signal processing power of the unconstrained linear FIR filter (with unnormalized and real-valued weights), while performing efficiently in impulsive environments.

Recently, the weighted median smoother has been successfully generalized to yield a weighted median filter structure that admits real-valued weights [16, 17]. This new filter is derived by assigning a pair of filter weights, one of them positive and the other negative, to each input sample. In the present paper, we first generalize this approach by extending the class of M-smoothers defined in (2), leading to a new class of M-filters that allow for real-valued weights. We then focus on the special case of myriad estimators, generalizing the weighted myriad smoother into a Weighted Myriad Filter structure that admits real-valued weights. This filter is analogous to the weighted mean filter with real-valued weights. However, the weighted myriad filter cannot achieve arbitrary amplification of selected frequency components of an input signal, since its output is bounded in magnitude by the maximum absolute value of the input samples. By appropriately scaling (multiplying) the filter output, we further extend this structure to yield the so-called Scaled Weighted Myriad Filters, which include as a special case the traditional unconstrained linear FIR filter. Myriad filters thus provide for a robust generalization of linear signal processing, outperforming linear filters in a variety of applications in impulsive environments.

The rest of the paper is organized as follows. Section 2 introduces the weighted myriad smoother. The class of myriad filters admitting real-valued weights is developed in Section 3. Having defined these filter structures, the important issue of optimization of their parameters (weights) is addressed in Section 4. The optimization of weighted myriad smoothers was considered in [14]. In the present paper, we derive necessary conditions for optimality of the different myriad filters under the mean square error (MSE) criterion, and develop stochastic gradient-based (LMS-type) nonlinear adaptive algorithms for the optimization of the filter parameters. The performance of the different adaptive myriad filters is demonstrated in Section 5 through computer simulations of frequency-selective filtering in impulsive noise environments.

2 Weighted Myriad Smoothers

This section gives a brief introduction to the weighted myriad smoother. For a more detailed treatment, see [8, 9, 10, 11, 14]. It should be noted that in all the previous papers on myriad filtering, the weighted myriad smoother has been referred to as the weighted myriad filter. To avoid confusion, we shall henceforth reserve the term weighted myriad filter for the more general case when real-valued weights are admissible, while the smoother itself is restricted to have non-negative weights.

Weighted myriad smoothers are derived from the sample myriad, which is defined as the Maximum-Likelihood estimate of location of data following the Cauchy distribution. As mentioned in Section 1, myriad filters are motivated by the properties of $\alpha$-stable distributions [3], of which the Gaussian and Cauchy distributions are special cases. For simplicity and tractability, the myriad has been defined using the Cauchy distribution, the only (non-Gaussian) $\alpha$-stable distribution that has a closed-form expression for its density function. However, the myriad in fact turns out to be the optimal estimator of the location of $\alpha$-stable distributions for the triplet $\alpha = 0, 1, 2$ [9, 10]. Consider now a set of $N$ independent and identically distributed (i.i.d.) random variables $\{X_i\}_{i=1}^N$, each following a Cauchy distribution with location parameter $\theta$ and scaling factor $K > 0$. Thus, $X_i \sim \mathrm{Cauchy}(\theta, K)$, with the density function

    f_{X_i}(x_i; \theta, K) = \frac{K}{\pi} \cdot \frac{1}{K^2 + (x_i - \theta)^2} = \frac{1}{K}\, f\!\left(\frac{x_i - \theta}{K}\right),    (3)

where $f(v) \triangleq \frac{1}{\pi} \cdot \frac{1}{1 + v^2}$ is the density function of a standard Cauchy random variable: $\frac{X_i - \theta}{K} \sim \mathrm{Cauchy}(0, 1)$. Given a set of observations $\{x_i\}_{i=1}^N$, the sample myriad $\hat{\theta}_K$ maximizes the likelihood function $\prod_{i=1}^N f_{X_i}(x_i; \theta, K)$. Using (3) and some manipulation, we can write

    \hat{\theta}_K \triangleq \mathrm{myriad}(x_1, x_2, \ldots, x_N; K) = \arg\min_{\theta} \prod_{i=1}^{N} \left[1 + \left(\frac{x_i - \theta}{K}\right)^2\right] = \arg\min_{\theta} \sum_{i=1}^{N} \log\left[K^2 + (x_i - \theta)^2\right],    (4)

using the fact that $\log(\cdot)$ is a strictly increasing function.

Comparing (4) with the definition of M-estimators in (1), we see that the M-estimator cost function for the sample myriad is given by $\rho(u) = \log(K^2 + u^2)$. It is interesting to note that the sample myriad includes the sample mean as the special limiting case when $K \to \infty$ [9]: $\lim_{K \to \infty} \hat{\theta}_K = \sum_{i=1}^N x_i / N$.

The sample myriad can be generalized to the weighted myriad smoother by assigning non-negative weights to the input samples (observations); the weights reflect the varying levels of "reliability". To this end, the observations are assumed to be drawn from independent Cauchy random variables which are, however, not identically distributed. Given $N$ observations $\{x_i\}_{i=1}^N$ and non-negative weights $\{w_i \geq 0\}_{i=1}^N$, let the input and weight vectors be defined as $\mathbf{x} \triangleq [x_1, x_2, \ldots, x_N]^T$ and $\mathbf{w} \triangleq [w_1, w_2, \ldots, w_N]^T$, respectively. For a given nominal scale factor $K$, the underlying random variables are assumed to be independent and Cauchy distributed with a common location parameter $\theta$, but varying scale factors $\{S_i\}_{i=1}^N$: $X_i \sim \mathrm{Cauchy}(\theta, S_i)$, where

    S_i \triangleq \frac{K}{\sqrt{w_i}} > 0, \quad i = 1, 2, \ldots, N.    (5)

A larger value for the weight $w_i$ (smaller scale $S_i$) makes the distribution of $X_i$ more concentrated around $\theta$, thus increasing the reliability of the sample $x_i$. Note that the special case when all the weights are equal to unity corresponds to the sample myriad at the nominal scale factor $K$, with all the scale factors reducing to $S_i = K$.

It is important to realize that the location estimation problem being considered here is intimately related to the problem of filtering a time-series $\{x(n)\}$ using a sliding window. The output $y(n)$, at time $n$, can be interpreted as an estimate of location based on the input samples $\{x(n - N_1), \ldots, x(n-1), x(n), x(n+1), \ldots, x(n + N_2)\}$. Further, the afore-mentioned model of independent but not identically distributed samples can capture the temporal relationships usually present among the input samples. To see this, note that the output $y(n)$, as an estimate of location, would rely more on (give more weight to) the sample $x(n)$, when compared with samples that are further away in time. By assigning varying scale factors in modeling the input samples, leading to different weights (reliabilities), their temporal correlations can be effectively accounted for.
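The sliding-window interpretation just described can be sketched as follows; this is not from the paper, and the window length, estimators, and edge handling (clipping the window at the signal boundaries) are illustrative assumptions.

```python
import numpy as np

def sliding_location_filter(x, estimator, N1=2, N2=2):
    """Run a location estimator over a sliding window of x; the output at index n is the
    estimate computed from samples x[n-N1 .. n+N2] (window clipped at the edges)."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    for n in range(len(x)):
        window = x[max(0, n - N1): n + N2 + 1]
        y[n] = estimator(window)
    return y

# A running median versus a running mean on a signal with a single large impulse.
sig = np.sin(2 * np.pi * 0.02 * np.arange(200))
sig[50] += 10.0                                  # impulsive outlier
y_mean = sliding_location_filter(sig, np.mean)   # smeared by the impulse
y_med = sliding_location_filter(sig, np.median)  # largely unaffected by it
```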

The weighted myriad smoother output $\hat{\theta}_K(\mathbf{w}, \mathbf{x})$ can now be defined as the value of $\theta$ that maximizes the likelihood function $\prod_{i=1}^N f_{X_i}(x_i; \theta, S_i)$. Using (3) for $f_{X_i}(x_i; \theta, S_i)$ leads to

    \hat{\theta}_K(\mathbf{w}, \mathbf{x}) \triangleq \mathrm{myriad}(w_i \circ x_i; K)|_{i=1}^{N} = \arg\min_{\theta} \prod_{i=1}^{N} \left[1 + \left(\frac{x_i - \theta}{S_i}\right)^2\right] = \arg\min_{\theta} \prod_{i=1}^{N} \left[K^2 + w_i (x_i - \theta)^2\right] \triangleq \arg\min_{\theta} P(\theta);    (6)

the notation $w_i \circ x_i$ denotes the weighting operation in the above expressions. Alternatively, we can write $\hat{\theta}_K(\mathbf{w}, \mathbf{x}) \equiv \hat{\theta}$ as

    \hat{\theta} = \arg\min_{\theta} Q(\theta) \triangleq \arg\min_{\theta} \sum_{i=1}^{N} \log\left[K^2 + w_i (x_i - \theta)^2\right];    (7)

thus $\hat{\theta}$ is the global minimizer of $P(\theta)$ as well as of $Q(\theta) \triangleq \log(P(\theta))$. Therefore, depending on the context, we refer to either of the functions $P(\theta)$ and $Q(\theta)$ as the weighted myriad smoother objective function. Note that when $w_i = 0$, the corresponding term drops out of $P(\theta)$ and $Q(\theta)$; thus a sample $x_i$ is effectively ignored if its weight is zero. Now, by comparing (7) and (2), we confirm that the weighted myriad smoother is a weighted M-estimator (or an M-smoother) with the cost function $\rho(u) = \log(K^2 + u^2)$.

As we can see from (6), the objective function $P(\theta)$ is a polynomial in $\theta$ of degree $2N$, with well-defined derivatives of all orders. Therefore, $P(\theta)$ (and the equivalent objective function $Q(\theta)$) can have several local minima. This makes the exact computation of $\hat{\theta}$ (by direct minimization of $P(\theta)$ or $Q(\theta)$) a prohibitively expensive task. In [11], we have developed fast algorithms for the computation of the weighted myriad smoother output using an indirect method with a high degree of accuracy.

In order to determine the properties of $\hat{\theta}$, we examine the objective function further, focusing on $Q(\theta)$. Using (6) and (7), the derivative of $Q(\theta)$ can be written as

    Q'(\theta) = \frac{P'(\theta)}{P(\theta)} = 2 \sum_{i=1}^{N} \frac{w_i (\theta - x_i)}{K^2 + w_i (x_i - \theta)^2}.    (8)

We now briefly present a few basic properties of $Q(\theta)$ and $\hat{\theta}$ that will be useful in later sections of the paper. The proofs of these properties can be found in [11, 14].

[Figure 1: Sketch of a typical weighted myriad objective function Q(θ); the horizontal axis runs from x_(1) to x_(N), and θ̂ marks the global minimum.]

First, let $\{x_{(j)}\}_{j=1}^N$ denote the order statistics (samples sorted in increasing order of amplitude) of the input vector $\mathbf{x}$, with $x_{(1)}$ the smallest and $x_{(N)}$ the largest. The following statements are true:

(a) The weighted myriad smoother has only $N$ independent parameters; the output $\hat{\theta}_K(\mathbf{w}, \mathbf{x})$ depends only on the normalized weight vector $(\mathbf{w}/K^2)$.

(b) The objective function $Q(\theta)$ has a finite number (at most $2N - 1$) of local extrema, all of which lie within the range $[x_{(1)}, x_{(N)}]$ of the input samples.

(c) The weighted myriad smoother output $\hat{\theta}$ is one of the local minima of $Q(\theta)$:

    Q'(\hat{\theta}) = 0.    (9)

(d) $\hat{\theta}$ is always within the range of the input samples: $x_{(1)} \leq \hat{\theta} \leq x_{(N)}$.

Some of the afore-mentioned properties are illustrated by Fig. 1, which shows a sketch of a typical objective function $Q(\theta)$. It is clear from the figure that the output $\hat{\theta}$ is restricted to the dynamic range of the input samples. This is a direct consequence of the constraint of non-negative weights. As a result, the weighted myriad smoother (and in general, any M-smoother) is incapable of highpass or bandpass-type behavior and unable to amplify the dynamic range of an input signal.

The severely constrained nature of the weighted myriad smoother can also be seen by considering the limiting case when $K \to \infty$, with the weights $\{w_i\}$ held constant. Using (8) and (9), we can show that as $K \to \infty$, $Q(\theta)$ reduces to having a single local extremum, and

    \lim_{K \to \infty} \hat{\theta}_K(\mathbf{w}, \mathbf{x}) = \hat{\theta}_{\infty}(\mathbf{w}, \mathbf{x}) = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i},    (10)

which is the (linear) weighted mean smoother; $K$ is called the "linearity parameter" precisely because of this limiting result. Thus, the weighted myriad smoother is analogous to the weighted mean smoother, which is a severely handicapped linear FIR filter, since its weights are constrained to be non-negative, while also summing to unity.

3 Myriad Filters with Real-Valued Weights

In this section, we first extend the class of M-smoothers (weighted M-estimators) defined in (2), leading to the more general class of M-filters that allow for negative weights. We then consider the special case of the family of myriad M-estimators, generalizing the weighted myriad smoother of Section 2 into a variety of myriad filters admitting real-valued weights.

3.1 M-filters

Before we can extend the general class of M-smoothers, it is useful to examine the special cases of the weighted mean and median smoothers. Referring to Table 1, the output of a weighted mean smoother, with input vector $\mathbf{x}$ and a weight vector $\mathbf{w}$ of non-negative weights, is given by

    \hat{\theta}(\mathbf{w}, \mathbf{x}) = \mathrm{mean}(w_i \diamond x_i)|_{i=1}^{N} = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}.    (11)

Notice from Table 1 that $\hat{\theta}$ minimizes the cost function $\sum_{i=1}^{N} w_i (x_i - \theta)^2$. When the weights are allowed to be both positive and negative, $\{w_i \in \mathbb{R}\}_{i=1}^N$, a natural extension of (11) would be to write

    \tilde{\theta}(\mathbf{w}, \mathbf{x}) \triangleq \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} |w_i|}    (12)

for the weighted mean filter output.

Note that the normalization factor in (12) is now the sum of the absolute values of the filter weights. Now, we can rewrite (12) as

    \tilde{\theta}(\mathbf{w}, \mathbf{x}) = \frac{\sum_{i=1}^{N} |w_i| \cdot \mathrm{sgn}(w_i)\, x_i}{\sum_{i=1}^{N} |w_i|};    (13)

thus the sign of the weight is uncoupled from its magnitude and attached to the corresponding input sample. By comparing with (11), we can rewrite (13) as

    \tilde{\theta}(\mathbf{w}, \mathbf{x}) = \mathrm{mean}(|w_i| \diamond \mathrm{sgn}(w_i) x_i)|_{i=1}^{N} \triangleq \mathrm{mean}(g_i \diamond z_i)|_{i=1}^{N} = \hat{\theta}(\mathbf{g}, \mathbf{z}),    (14)

where $\{g_i = |w_i|\}_{i=1}^N$ and $\{z_i = \mathrm{sgn}(w_i) x_i\}_{i=1}^N$ are represented by the vectors $\mathbf{g}$ and $\mathbf{z}$, respectively. The weighted mean filter can therefore be thought of as a weighted mean smoother applied to a modified set of samples $\{z_i\}_{i=1}^N$, using the non-negative weights $\{g_i \geq 0\}_{i=1}^N$. Referring to the cost function for the weighted mean smoother in Table 1, we can infer that the weighted mean filter output $\tilde{\theta}(\mathbf{w}, \mathbf{x})$ minimizes the cost function $\sum_{i=1}^{N} |w_i| \cdot (\mathrm{sgn}(w_i) x_i - \theta)^2$.

Following the above approach, the weighted median smoother was recently extended to a class of weighted median filters with real-valued weights [16]. Thus, referring to Table 1, and by analogy to (14), the output of a weighted median filter with weights $\{w_i \in \mathbb{R}\}_{i=1}^N$ was written in [16] as

    \bar{\theta}(\mathbf{w}, \mathbf{x}) = \mathrm{median}(|w_i| \diamond \mathrm{sgn}(w_i) x_i)|_{i=1}^{N} = \arg\min_{\theta} \sum_{i=1}^{N} |w_i| \cdot |\mathrm{sgn}(w_i) x_i - \theta|.    (15)

However, it was subsequently found in [17] that a more general definition of the weighted median filter was possible. Notice in (15) that if a particular weight $w_i$ is negative, the corresponding modified sample is $\mathrm{sgn}(w_i) x_i = -x_i$, which is the mirror sample of $x_i$. Thus, depending on the signs of the weights, some of the mirror samples $\{-x_i\}$ come into play in the filtering operation. A more general approach might then be to include all the mirror samples $\{-x_i\}_{i=1}^N$ in the filter definition. An alternative way to achieve this is to assign a pair of weights $w_i \geq 0$ and $h_i \leq 0$ to each sample $x_i$. As before, we uncouple the sign of a weight whenever it is negative, merging the sign with the corresponding sample. Thus, since $h_i \leq 0$, we have the equivalence $h_i \diamond x_i = |h_i| \diamond \mathrm{sgn}(h_i) x_i = |h_i| \diamond (-x_i)$. This concept of double weighting emerges naturally in [17] through the analysis of the so-called stack filters with real-valued weights (weighted median filters are a special case of stack filters).

Thus, given the set of weights $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, the weighted median filter output [17] is defined as

    \bar{\theta} \triangleq \mathrm{median}(w_i \diamond x_i,\ |h_i| \diamond (-x_i))|_{i=1}^{N} = \mathrm{median}(\langle w_i, h_i \rangle \diamond x_i)|_{i=1}^{N},    (16)

where the notation $\langle w_i, h_i \rangle$ reflects the double weight assigned to the sample $x_i$. Note that the weighted median filter can also be described as a weighted median smoother applied to a modified input vector $\mathbf{z} = [\mathbf{x}^T, -\mathbf{x}^T]^T$ and a modified weight vector of non-negative weights, $\mathbf{g} = [\mathbf{w}^T, |\mathbf{h}|^T]^T$. Referring to the weighted median smoother cost function in Table 1, we can show that the weighted median filter output minimizes the cost function $\sum_{i=1}^{N} w_i \cdot |x_i - \theta| + \sum_{i=1}^{N} |h_i| \cdot |-x_i - \theta|$.

At first glance, the double weighting in (16) seems redundant, and the structure in (15), with a single real-valued weight for each sample, appears to be sufficient as in the case of linear FIR filters. However, the peculiarity of redundancy exists only in the case of the linear filter; the weighted mean filter analogous to (16) is given by $\tilde{\theta} = \mathrm{mean}(w_i \diamond x_i,\ |h_i| \diamond (-x_i))|_{i=1}^{N} = \mathrm{mean}((w_i - |h_i|) \diamond x_i)|_{i=1}^{N}$, the double weight $\langle w_i, h_i \rangle$ thus collapsing to a single real-valued weight $(w_i - |h_i|)$. This conversion into a single weight, due to the superposition property of linear filters, does not occur for the weighted median filter [17], and cannot be expected for the more general class of M-filters to be developed in the following.

Using the afore-mentioned ideas, the extension of the general class of M-smoothers, to allow for real-valued weights, can be achieved in a straightforward manner as follows. The output of the general M-smoother, with weights $\{w_i \geq 0\}_{i=1}^N$ and cost function $\rho(\cdot)$, is given from (2) by

    \hat{\theta}(\mathbf{w}, \mathbf{x}) = \arg\min_{\theta} \sum_{i=1}^{N} \rho\left(\sqrt{w_i}\,(x_i - \theta)\right).    (17)

Given a set of positive and negative weights, $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, the M-smoother is generalized to the new class of M-filters described by the filter output

    \tilde{\theta}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) \triangleq \arg\min_{\theta} \sum_{i=1}^{N} \left[\rho\left(\sqrt{w_i}\,(x_i - \theta)\right) + \rho\left(\sqrt{|h_i|}\,(-x_i - \theta)\right)\right],    (18)

where the notation $\langle \mathbf{w}, \mathbf{h} \rangle$ describes the double weight vector with positive weights in $\mathbf{w}$ and negative weights in $\mathbf{h}$.
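As a concrete instance of the double-weighting construction in (16) and of the general M-filter (18) with $\rho(u) = |u|$, the following sketch (not from the paper) evaluates the weighted median filter as a weighted median smoother applied to the expanded sample set. The sample data and weights are arbitrary, and the weighted median is computed by the usual sort-and-accumulate rule.

```python
import numpy as np

def weighted_median_smoother(z, g):
    """Weighted median smoother: argmin_theta sum_i g_i |z_i - theta| with g_i >= 0,
    computed by sorting the samples and accumulating weights up to half their total."""
    order = np.argsort(z)
    z_sorted, g_sorted = np.asarray(z, float)[order], np.asarray(g, float)[order]
    cumulative = np.cumsum(g_sorted)
    return z_sorted[np.searchsorted(cumulative, 0.5 * cumulative[-1])]

def weighted_median_filter(x, w, h):
    """Double-weighted median filter of (16): a weighted median smoother applied to
    the expanded samples z = [x, -x] with non-negative weights g = [w, |h|]."""
    x, w, h = map(lambda a: np.asarray(a, float), (x, w, h))
    z = np.concatenate([x, -x])
    g = np.concatenate([w, np.abs(h)])
    return weighted_median_smoother(z, g)

x = np.array([0.9, 1.1, 1.0, -0.2, 6.0])
w = np.array([1.0, 1.0, 2.0, 0.0, 0.5])     # non-negative weights w_i
h = np.array([0.0, 0.0, 0.0, -2.0, 0.0])    # negative weight paired with the fourth sample
print(weighted_median_filter(x, w, h))
```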

Note that the M-filter $\tilde{\theta}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ can also be described as an M-smoother $\hat{\theta}(\mathbf{g}, \mathbf{z})$, with a transformed input vector $\mathbf{z} = [\mathbf{x}^T, -\mathbf{x}^T]^T$ and a transformed weight vector of non-negative weights, $\mathbf{g} = [\mathbf{w}^T, |\mathbf{h}|^T]^T$. The rest of this section is devoted to deriving various myriad filters from (18).

3.2 Weighted Myriad Filters

Recall from (7) that the output of the weighted myriad smoother with weights $\{w_i \geq 0\}_{i=1}^N$ is given by

    \hat{\theta}_K(\mathbf{w}, \mathbf{x}) = \arg\min_{\theta} Q(\theta) \triangleq \arg\min_{\theta} \sum_{i=1}^{N} \log\left[K^2 + w_i (x_i - \theta)^2\right].    (19)

As mentioned in Section 2, the weighted myriad smoother is an M-smoother with the cost function $\rho(u) = \log(K^2 + u^2)$. Using this cost function in the expression for the M-filter output given in (18), we obtain the output of the weighted myriad filter, with weights $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, as

    \tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) \triangleq \mathrm{myriad}(\langle w_i, h_i \rangle \circ x_i; K)|_{i=1}^{N} = \mathrm{myriad}(w_i \circ x_i,\ |h_i| \circ (-x_i); K)|_{i=1}^{N} = \arg\min_{\theta} Q_1(\theta),    (20)

where the weighted myriad filter objective function $Q_1(\theta)$ is given by

    Q_1(\theta) \triangleq \sum_{i=1}^{N} \left\{\log\left[K^2 + w_i (x_i - \theta)^2\right] + \log\left[K^2 + |h_i| (-x_i - \theta)^2\right]\right\}.    (21)

The weighted myriad filter can be represented as a weighted myriad smoother $\hat{\theta}_K(\mathbf{g}, \mathbf{z})$ with the transformed input and weight vectors $\mathbf{z} = [x_1, x_2, \ldots, x_N, -x_1, -x_2, \ldots, -x_N]^T$ and $\mathbf{g} = [w_1, w_2, \ldots, w_N, |h_1|, |h_2|, \ldots, |h_N|]^T$, respectively. Consequently, many of the properties of this filter are similar to those of the weighted myriad smoother that were described in Section 2.
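A minimal sketch (not from the paper) of the weighted myriad filter of (20)-(21), evaluated exactly as described above, as a weighted myriad smoother on the transformed samples and weights. The grid-search approximation, data, and weights are assumptions made for illustration only.

```python
import numpy as np

def weighted_myriad_smoother(z, g, K, grid_points=20001):
    """Grid-search approximation of the weighted myriad smoother (19) over [z_(1), z_(2N)]."""
    z, g = np.asarray(z, float), np.asarray(g, float)
    grid = np.linspace(z.min(), z.max(), grid_points)
    Q = np.sum(np.log(K**2 + g[:, None] * (z[:, None] - grid[None, :])**2), axis=0)
    return grid[np.argmin(Q)]

def weighted_myriad_filter(x, w, h, K):
    """Weighted myriad filter of (20)-(21): the smoother applied to the transformed
    samples z = [x, -x] with non-negative weights g = [w, |h|]."""
    x, w, h = map(lambda a: np.asarray(a, float), (x, w, h))
    z = np.concatenate([x, -x])
    g = np.concatenate([w, np.abs(h)])
    return weighted_myriad_smoother(z, g, K)

x = np.array([0.5, -0.1, 0.7, 0.6, 4.0])    # last sample is an outlier
w = np.array([1.0, 0.5, 1.0, 1.0, 1.0])
h = np.array([0.0, -1.5, 0.0, 0.0, 0.0])    # negative weight paired with the second sample
print(weighted_myriad_filter(x, w, h, K=0.5))
```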

Now, let $\{z_{(j)}\}_{j=1}^{2N}$ denote the order statistics (sorted samples) of the transformed input vector $\mathbf{z}$; the smallest and largest order statistics are then given by $z_{(1)} = -M_x = -\max\{|x_i|\}_{i=1}^N$ and $z_{(2N)} = +M_x$. The following properties of the filter output $\tilde{\theta}$ and the objective function $Q_1(\theta)$ follow easily from Section 2:

(a) The weighted myriad filter has only $2N$ independent parameters; the filter output $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ depends only on the normalized double weight vector $\langle \mathbf{w}/K^2, \mathbf{h}/K^2 \rangle$.

(b) The derivative of the objective function $Q_1(\theta)$ is given by

    Q_1'(\theta) = 2 \sum_{i=1}^{N} \left[\frac{w_i (\theta - x_i)}{K^2 + w_i (x_i - \theta)^2} + \frac{|h_i| (\theta + x_i)}{K^2 + |h_i| (x_i + \theta)^2}\right].    (22)

(c) $Q_1(\theta)$ has not more than $(4N - 1)$ local extrema, all of which lie within the range $[-M_x, +M_x]$ of the magnitudes of the input samples.

(d) $\tilde{\theta}$ is one of the local minima of $Q_1(\theta)$:

    Q_1'(\tilde{\theta}) = 0.    (23)

(e) $\tilde{\theta}$ is always within the range of the magnitudes of the input samples: $-M_x \leq \tilde{\theta} \leq +M_x$.

Since the weighted myriad filter allows for positive as well as negative weights, it overcomes the limitations of the weighted myriad smoother to the extent that it can accomplish a wide range of frequency-selective filtering operations, including bandpass and highpass filtering. Unlike the weighted myriad smoother output $\hat{\theta}_K(\mathbf{w}, \mathbf{x})$ of (19), which is limited to the range $[x_{(1)}, x_{(N)}]$ of the input samples, the weighted myriad filter output $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ is confined to a larger interval $[-M_x, +M_x]$, $M_x = \max\{|x_i|\}_{i=1}^N$, which includes the input samples as well as their negatives. However, the fact that the output $\tilde{\theta}$ is restricted to some finite interval that depends on the input samples is in itself a limitation of this filter. Like the weighted myriad smoother, the weighted myriad filter is also incapable of amplifying the dynamic range of an input signal, since the output cannot go beyond the dynamic range of the magnitudes of the input samples. To see the constrained nature of this filter, consider the limiting case when the linearity parameter $K \to \infty$, while holding the weights $\mathbf{w}$ and $\mathbf{h}$ constant.

Using (22) and (23), we can show that as $K \to \infty$, $Q_1(\theta)$ has a single local minimum, and

    \lim_{K \to \infty} \tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) = \tilde{\theta}_{\infty}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) = \frac{\sum_{i=1}^{N} (w_i - |h_i|)\, x_i}{\sum_{i=1}^{N} (w_i + |h_i|)}.    (24)

We can also write the above expression as

    \tilde{\theta}_{\infty}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) = \hat{\theta}_{\infty}(\mathbf{g}, \mathbf{z}) = \frac{\sum_{i=1}^{2N} g_i z_i}{\sum_{i=1}^{2N} g_i},    (25)

where we have used the modified inputs and weights, $\mathbf{z} = [x_1, \ldots, x_N, -x_1, \ldots, -x_N]^T$ and $\mathbf{g} = [w_1, \ldots, w_N, |h_1|, \ldots, |h_N|]^T$, respectively. The restricted nature of the filter is due to the presence of the normalization factor $\sum_{i=1}^{2N} g_i = \sum_{i=1}^{N} (w_i + |h_i|)$ in the above expressions. As (24) shows, the weighted myriad filter is analogous to nothing more than a linear FIR filter with constrained real-valued weights.

As a final note on weighted myriad filters, we shall consider a special type of weighted myriad filter which has only $N$ weights, instead of the general case of $2N$ weights. This special case is derived by analogy with the weighted mean filter of (14) and the special weighted median filter (15), proposed in [16]. We define a weighted myriad filter without double weighting, with real-valued weights $\{w_i' \in \mathbb{R}\}_{i=1}^N$, as

    \bar{\theta}_K(\mathbf{w}', \mathbf{x}) \triangleq \mathrm{myriad}(|w_i'| \circ \mathrm{sgn}(w_i') x_i; K)|_{i=1}^{N} = \arg\min_{\theta} Q_2(\theta),    (26)

where

    Q_2(\theta) \triangleq \sum_{i=1}^{N} \log\left[K^2 + |w_i'| \cdot (\mathrm{sgn}(w_i') x_i - \theta)^2\right]    (27)

is the objective function of this special weighted myriad filter. Although this filter does not possess the full power of the general double-weighted filter, it is simpler to design and implement, and it might be sufficient in some applications.
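A short sketch (not from the paper) of the special filter in (26)-(27): each real-valued weight has its sign uncoupled and attached to its sample, after which an ordinary non-negative-weight myriad smoother is applied. The grid-search approximation and the example data are assumptions for illustration.

```python
import numpy as np

def special_weighted_myriad_filter(x, w_prime, K, grid_points=20001):
    """Weighted myriad filter without double weighting, (26)-(27): the sign of each
    real-valued weight w'_i is moved onto its sample, and a weighted myriad smoother
    with the magnitudes |w'_i| is applied to the sign-modified samples."""
    x, w_prime = np.asarray(x, float), np.asarray(w_prime, float)
    z = np.sign(w_prime) * x          # sign-modified samples sgn(w'_i) x_i
    g = np.abs(w_prime)               # non-negative weights |w'_i|
    grid = np.linspace(z.min(), z.max(), grid_points)
    Q2 = np.sum(np.log(K**2 + g[:, None] * (z[:, None] - grid[None, :])**2), axis=0)
    return grid[np.argmin(Q2)]

x = np.array([0.5, -0.1, 0.7, 0.6, 4.0])
w_prime = np.array([1.0, -1.5, 1.0, 1.0, 0.8])   # a genuinely negative weight
print(special_weighted_myriad_filter(x, w_prime, K=0.5))
```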

To see how the filter of (26) is in fact a special case of the double-weighted myriad filter of (20), consider the general weighted myriad filter output $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ with $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, and let $\{w_i' \triangleq w_i + h_i = w_i - |h_i|\}_{i=1}^N$, with the added constraint that $\{w_i \cdot h_i = 0\}_{i=1}^N$. Thus, only one of $w_i$ and $h_i$ is allowed to be non-zero, and $w_i'$ is equal to that non-zero weight. It is easy to show, using (20) and (21) together with the above constraints, that the general filter output $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ reduces to $\bar{\theta}_K(\mathbf{w}', \mathbf{x})$ of (26). Now, by comparing (26) and (27) with (19), we see that $\bar{\theta}_K(\mathbf{w}', \mathbf{x})$ can also be expressed as the output of a weighted myriad smoother $\hat{\theta}_K(\mathbf{g}', \mathbf{z}')$, with weights $\{g_i' = |w_i'|\}_{i=1}^N$ and transformed input samples $\{z_i' = \mathrm{sgn}(w_i') x_i\}_{i=1}^N$. The filter output $\bar{\theta}_K$ is seen to be always confined to the interval $[z_{(1)}', z_{(N)}']$, the range of the transformed input samples. Considering the limiting case $K \to \infty$, we can use (24) and the relation between $\mathbf{w}'$ and $\langle \mathbf{w}, \mathbf{h} \rangle$ to show that

    \lim_{K \to \infty} \bar{\theta}_K(\mathbf{w}', \mathbf{x}) = \bar{\theta}_{\infty}(\mathbf{w}', \mathbf{x}) = \frac{\sum_{i=1}^{N} w_i' x_i}{\sum_{i=1}^{N} |w_i'|},    (28)

which is a weighted mean filter with real-valued weights.

3.3 Scaled Weighted Myriad Filters

The weighted myriad filters described in Section 3.2 have the drawback that the output is always constrained to belong to a certain finite interval that is a function of the input samples. These filters are therefore not able to amplify the dynamic range of the input signal. Consider the weighted myriad filter output $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ of (20), which always lies within the interval $[-M_x, +M_x]$, where $M_x = \max\{|x_i|\}_{i=1}^N$. In order to remove this constraint on $\tilde{\theta}_K$, we examine its limiting case given in (24): $\tilde{\theta}_{\infty} = \sum_{i=1}^{N} (w_i - |h_i|)\, x_i \,/\, \sum_{i=1}^{N} (w_i + |h_i|)$. Note that if we removed the normalization factor $S \equiv S(\langle \mathbf{w}, \mathbf{h} \rangle) \triangleq \sum_{i=1}^{N} (w_i + |h_i|)$ from the expression for $\tilde{\theta}_{\infty}$, we would obtain the full power of the linear FIR filter in the expression $\sum_{i=1}^{N} (w_i - |h_i|)\, x_i$. Extending this idea to the case of a finite value of $K$, we define the output of the so-called scaled weighted myriad filter, with weights $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, as

    \tilde{\theta}_K^{(S)}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) \triangleq S(\langle \mathbf{w}, \mathbf{h} \rangle) \cdot \tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) = \left[\sum_{i=1}^{N} (w_i + |h_i|)\right] \cdot \mathrm{myriad}(\langle w_i, h_i \rangle \circ x_i; K)|_{i=1}^{N},    (29)

where the weighted myriad filter output $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ is given by (20).
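A brief sketch (not from the paper) of the scaled filter in (29): the unscaled weighted myriad filter output, approximated by a grid search as before, is simply multiplied by the weight sum $S$. Data and weights are arbitrary illustrations.

```python
import numpy as np

def scaled_weighted_myriad_filter(x, w, h, K, grid_points=20001):
    """Scaled weighted myriad filter of (29): the (unscaled) weighted myriad filter output,
    approximated on a grid, multiplied by S = sum_i (w_i + |h_i|)."""
    x, w, h = map(lambda a: np.asarray(a, float), (x, w, h))
    z = np.concatenate([x, -x])
    g = np.concatenate([w, np.abs(h)])
    grid = np.linspace(z.min(), z.max(), grid_points)
    Q1 = np.sum(np.log(K**2 + g[:, None] * (z[:, None] - grid[None, :])**2), axis=0)
    S = np.sum(w + np.abs(h))                      # scaling factor S(<w, h>)
    return S * grid[np.argmin(Q1)]

x = np.array([0.2, 0.1, 0.15, 0.12])
w = np.array([2.0, 0.0, 1.5, 0.0])
h = np.array([0.0, -1.0, 0.0, -0.5])
print(scaled_weighted_myriad_filter(x, w, h, K=1.0))   # can exceed max|x_i| since S > 1
```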

Now, since we know that $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ lies within the interval $[-M_x, +M_x]$, we see that the scaled weighted myriad filter output is constrained by $|\tilde{\theta}_K^{(S)}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})| \leq S(\langle \mathbf{w}, \mathbf{h} \rangle) \cdot M_x$. However, this interval can be made as large as desired, by simply increasing the magnitudes of the filter weights (thus increasing the factor $S$). Thus, the scaled filters are capable of input signal amplification, unlike the weighted myriad filters of Section 3.2. Further, since $\tilde{\theta}_{\infty}^{(S)}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) = \sum_{i=1}^{N} (w_i - |h_i|)\, x_i$, the scaled weighted myriad filter is analogous to (but more robust than) the traditional linear FIR filter. Hence, the full signal processing potential of myriad filters in impulsive environments can be realized with this new class of scaled filters. It is important to note that, unlike the weighted myriad filter which has only $2N$ independent parameters (the output depending only on the normalized weights $\langle \mathbf{w}/K^2, \mathbf{h}/K^2 \rangle$), the scaled weighted myriad filter depends on $(2N + 1)$ parameters, consisting of the $N$ pairs of weights $\{\langle w_i, h_i \rangle\}_{i=1}^N$ as well as the linearity parameter $K$. Finally, it is easily shown that the scaled weighted myriad filter is BIBO (bounded-input bounded-output) stable. Consider a bounded input signal $\{x(n)\}$ with $\{|x(n)| < B\}_{n=-\infty}^{+\infty}$. From the previous discussion, it is clear that the filter output, at time $n$, is bounded in magnitude by $S(\langle \mathbf{w}, \mathbf{h} \rangle) \cdot M_x(n) < S \cdot B$, since $M_x(n) = \max\{|x_i(n)|\}_{i=1}^N < B$; hence the output signal is bounded as desired.

In Section 3.2, we introduced a special kind of weighted myriad filter with only $N$ weights (without double weighting). We can develop a scaled version of this filter following the same approach as above. Referring to the definition of this special filter in (26) and to its limiting case (when $K \to \infty$) in (28), we define the scaled weighted myriad filter with $N$ real-valued weights, $\{w_i' \in \mathbb{R}\}_{i=1}^N$, as

    \bar{\theta}_K^{(S)}(\mathbf{w}', \mathbf{x}) \triangleq \left(\sum_{i=1}^{N} |w_i'|\right) \cdot \bar{\theta}_K(\mathbf{w}', \mathbf{x}) = \left(\sum_{i=1}^{N} |w_i'|\right) \cdot \mathrm{myriad}(|w_i'| \circ \mathrm{sgn}(w_i') x_i; K)|_{i=1}^{N},    (30)

where $\bar{\theta}_K(\mathbf{w}', \mathbf{x})$ is given by (26). In the limiting case, we see from (28) that $\bar{\theta}_{\infty}^{(S)}(\mathbf{w}', \mathbf{x}) = \sum_{i=1}^{N} w_i' x_i$, which is the familiar linear FIR filter. The filter described in (30) is a special case of the scaled filter of (29), with $\{w_i' \triangleq w_i + h_i = w_i - |h_i|\}_{i=1}^N$ and the constraints $\{w_i \cdot h_i = 0\}_{i=1}^N$.

Thus, the stability of this special scaled filter and its ability to amplify the range of its input signals all follow from the previous discussion of the general scaled filter. Note that the filter of (30) has $(N + 1)$ independent parameters, consisting of the $N$ real-valued weights $\{w_i'\}_{i=1}^N$ as well as the linearity parameter $K$. This filter is simpler to design and implement than the general scaled filter, since it requires about half the number of parameters.

4 Adaptive Myriad Filtering Algorithms

In this section, we address the issue of optimization of the filter parameters for the different myriad filters that were defined in Section 3. Consider first the general case of an arbitrary nonlinear filter with an $N$-long input vector $\mathbf{x}$, a parameter vector $\mathbf{q}$ of length $M$, and filter output $y$. The filtering error in estimating a desired signal $d$ is given by $e = y - d$. Under the mean square error (MSE) criterion, the filter parameters are designed to minimize the cost function $J(\mathbf{q}) \triangleq E\{e^2\} = E\{(y - d)^2\}$, where $E\{\cdot\}$ denotes statistical expectation. By setting the gradient of the cost function equal to zero, we obtain the necessary conditions to be satisfied by the optimal filter parameters:

    \frac{\partial J(\mathbf{q})}{\partial q_i} = 2\, E\left\{e\, \frac{\partial y}{\partial q_i}\right\} = 0, \quad i = 1, 2, \ldots, M.    (31)

The nonlinear nature of the equations in (31) prevents a closed-form solution for the optimal parameters. We therefore turn to the method of steepest descent, which continually updates the filter parameters in an attempt to converge to the global minimum of the cost function $J(\mathbf{q})$:

    q_i(n+1) = q_i(n) - \frac{1}{2}\,\mu\, \frac{\partial J}{\partial q_i}(n), \quad i = 1, 2, \ldots, M,    (32)

where $q_i(n)$ is the $i$th parameter at iteration $n$, $\mu > 0$ is the step-size of the update, and the gradient at the $n$th iteration is given by

    \frac{\partial J}{\partial q_i}(n) = 2\, E\left\{e(n)\, \frac{\partial y}{\partial q_i}(n)\right\}, \quad i = 1, 2, \ldots, M.    (33)

When the underlying signal statistics are unavailable and/or rapidly changing, we use instantaneous estimates for the gradient, since the expectation in (33) cannot be evaluated. Thus, removing the expectation operator in (33) and using the result in (32), we obtain the following nonlinear adaptive filtering algorithm:

    q_i(n+1) = q_i(n) - \mu\, e(n)\, \frac{\partial y}{\partial q_i}(n), \quad i = 1, 2, \ldots, M.    (34)

So far, we have imposed no constraints on the filter parameters: $\{q_i \in \mathbb{R}\}_{i=1}^M$. Suppose now that a particular filter parameter $q_i$ is restricted to be non-negative, $q_i \geq 0$, a situation that will be encountered in the optimization of some of the filters of Section 3. In this case, the algorithm of (34) has to be modified to ensure that $q_i(n+1) \geq 0$ at each iteration step. The adaptive filtering algorithm for a non-negative parameter is therefore given by

    q_i(n+1) = P\left[q_i(n) - \mu\, e(n)\, \frac{\partial y}{\partial q_i}(n)\right],    (35)

where $P[\cdot]$, defined by

    P[u] \triangleq \begin{cases} u, & u \geq 0 \\ 0, & u < 0, \end{cases}    (36)

projects the updated value onto the constraint space $\mathbb{R}^{+}$ of this filter parameter. Depending on whether a particular parameter is constrained or not, the appropriate algorithm out of (34) and (35) can be used. From the expressions in the above two algorithms, we see that the adaptive algorithm for any specific filter directly follows once we evaluate $\{\partial y / \partial q_i\}_{i=1}^M$ for that filter; we now proceed to accomplish this for the various myriad filters defined in Section 3.

4.1 Adaptive Weighted Myriad Filters

Given a set of weights $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, and a linearity parameter $K > 0$, the weighted myriad filter output $y = \tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ is given by (20). As mentioned in Section 3.2, this filter can also be expressed as a weighted myriad smoother $\hat{\theta}_K(\mathbf{g}, \mathbf{z})$ with the transformed input and weight vectors $\mathbf{z} = [x_1, x_2, \ldots, x_N, -x_1, -x_2, \ldots, -x_N]^T$ and $\mathbf{g} = [w_1, w_2, \ldots, w_N, |h_1|, |h_2|, \ldots, |h_N|]^T$, respectively. Thus,

    y = \tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) = \hat{\theta}_K(\mathbf{g}, \mathbf{z}).    (37)

It was also stated in Section 3.2 that $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ depends only on the normalized double weight vector $\langle \mathbf{w}/K^2, \mathbf{h}/K^2 \rangle$. The optimal filtering action is therefore independent of $K$, which can be chosen arbitrarily (the optimal weights would scale according to the value of $K$ used). From the above discussion, we see that optimizing the weighted myriad filter is equivalent to finding the optimal transformed weights $\{g_i\}_{i=1}^{2N}$. Since these weights are non-negative, the appropriate adaptive algorithm to be used is given by (35), with the parameter vector $\mathbf{q} = \mathbf{g}$ and $M = 2N$. The task that remains is to find an expression for $\{\partial y / \partial g_i\}_{i=1}^{2N}$. From (37), we have

    \frac{\partial y}{\partial g_i} = \frac{\partial}{\partial g_i} \hat{\theta}_K(\mathbf{g}, \mathbf{z}), \quad i = 1, 2, \ldots, 2N.    (38)

However, the above quantity has already been derived in [14], in the context of optimization of the weighted myriad smoother, and is given by

    \frac{\partial}{\partial g_i} \hat{\theta}_K(\mathbf{g}, \mathbf{z}) = \frac{\dfrac{-\,(\hat{\theta} - z_i)}{\left[1 + \dfrac{g_i}{K^2}(\hat{\theta} - z_i)^2\right]^2}}{\displaystyle\sum_{j=1}^{2N} \dfrac{g_j \left[1 - \dfrac{g_j}{K^2}(\hat{\theta} - z_j)^2\right]}{\left[1 + \dfrac{g_j}{K^2}(\hat{\theta} - z_j)^2\right]^2}},    (39)

where $\hat{\theta} \equiv \hat{\theta}_K(\mathbf{g}, \mathbf{z})$.

Using (37), (38) and (39) in (35), we finally obtain the following adaptive algorithm for the weights $\{g_i\}_{i=1}^{2N}$:

Adaptive Weighted Myriad Filter Algorithm

    g_i(n+1) = P\left[\, g_i(n) + \mu\, e(n)\, \frac{\left\{\dfrac{(y - z_i)}{\left[1 + \dfrac{g_i}{K^2}(y - z_i)^2\right]^2}\right\}(n)}{\left\{\displaystyle\sum_{j=1}^{2N} \dfrac{g_j \left[1 - \dfrac{g_j}{K^2}(y - z_j)^2\right]}{\left[1 + \dfrac{g_j}{K^2}(y - z_j)^2\right]^2}\right\}(n)} \,\right],    (40)

where the transformed samples $\{z_i(n)\}_{i=1}^{2N}$ are found from the input samples $\{x_i(n)\}_{i=1}^N$ using

    z_i(n) = \begin{cases} x_i(n), & i = 1, 2, \ldots, N \\ -x_{i-N}(n), & i = N+1, N+2, \ldots, 2N. \end{cases}    (41)
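A sketch (not from the paper) of one iteration of the update in (40)-(41): the window is transformed, the filter output is approximated by a grid search, and each non-negative weight is moved along the instantaneous gradient with the projection $P[\cdot]$. Function names, the grid approximation, and the toy data are assumptions.

```python
import numpy as np

def smoother_output(z, g, K, grid_points=20001):
    """Grid-search approximation of the weighted myriad smoother output (cf. (19))."""
    grid = np.linspace(z.min(), z.max(), grid_points)
    Q = np.sum(np.log(K**2 + g[:, None] * (z[:, None] - grid[None, :])**2), axis=0)
    return grid[np.argmin(Q)]

def adaptive_weighted_myriad_step(x, d, g, K, mu):
    """One iteration of (40)-(41): transform the window, filter, and update each
    non-negative weight g_i, projecting the result onto g_i >= 0."""
    z = np.concatenate([x, -x])                    # transformed samples, (41)
    y = smoother_output(z, g, K)                   # filter output via (37)
    e = y - d                                      # filtering error
    u = 1.0 + (g / K**2) * (y - z)**2
    numerator = (y - z) / u**2                     # per-weight numerator in (40)
    denominator = np.sum(g * (1.0 - (g / K**2) * (y - z)**2) / u**2)
    g_new = np.maximum(g + mu * e * numerator / denominator, 0.0)   # projection P[.]
    return g_new, y, e

# Toy usage with an arbitrary window, desired value, and initial weights.
x = np.array([0.3, -0.2, 0.5])
g = np.full(6, 1.0 / 6.0)
g, y, e = adaptive_weighted_myriad_step(x, d=0.1, g=g, K=1.0, mu=0.01)
```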

Note that the original filter weights $\{w_i(n) \geq 0\}_{i=1}^N$ and $\{h_i(n) \leq 0\}_{i=1}^N$ can be recovered from the transformed weights $\{g_i(n) \geq 0\}_{i=1}^{2N}$ at any iteration using

    w_i(n) = g_i(n), \quad h_i(n) = -|h_i(n)| = -g_{i+N}(n), \quad i = 1, 2, \ldots, N.    (42)

In Section 3.2, we introduced a special type of weighted myriad filter having only $N$ weights, instead of the general case of $2N$ weights. We now consider the problem of optimizing the parameters of this filter. Given a set of real-valued weights $\{w_i' \in \mathbb{R}\}_{i=1}^N$, the output of this filter, $y = \bar{\theta}_K(\mathbf{w}', \mathbf{x})$, is given by (26). It was mentioned in Section 3.2 that $\bar{\theta}_K(\mathbf{w}', \mathbf{x})$ could be expressed as a special case of the general double-weighted myriad filter output $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ of (20). Indeed, given the weights $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, we have $y = \bar{\theta}_K(\mathbf{w}', \mathbf{x}) = \tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$, with $\{w_i' \triangleq w_i + h_i = w_i - |h_i|\}_{i=1}^N$ and the added constraints $\{w_i \cdot h_i = 0\}_{i=1}^N$. Thus, only one of $w_i$ and $h_i$ is allowed to be non-zero, and $w_i'$ is equal to that non-zero weight. Due to the peculiar nature of this relationship between $\langle \mathbf{w}, \mathbf{h} \rangle$ and $\mathbf{w}'$, however, the general adaptive algorithm of (40) cannot be directly used for the optimization of the weights $\{w_i'\}_{i=1}^N$; an entirely different approach is therefore necessary. Now, since the weights $\{w_i'\}_{i=1}^N$ are all real-valued, the appropriate adaptive algorithm to be used is given by (34), with the parameter vector $\mathbf{q} = \mathbf{w}'$ and $M = N$. All that remains is to find an expression for

    \frac{\partial y}{\partial w_i'} = \frac{\partial}{\partial w_i'} \bar{\theta}_K(\mathbf{w}', \mathbf{x}), \quad i = 1, 2, \ldots, N;    (43)

this is derived in Appendix A and is given by

    \frac{\partial}{\partial w_i'} \bar{\theta}_K(\mathbf{w}', \mathbf{x}) = -\, \frac{\dfrac{K^2\, \mathrm{sgn}(w_i')\left(\bar{\theta} - \mathrm{sgn}(w_i') x_i\right)}{\left[K^2 + |w_i'| \left(\bar{\theta} - \mathrm{sgn}(w_i') x_i\right)^2\right]^2}}{\displaystyle\sum_{j=1}^{N} \dfrac{|w_j'| \left[K^2 - |w_j'| \left(\bar{\theta} - \mathrm{sgn}(w_j') x_j\right)^2\right]}{\left[K^2 + |w_j'| \left(\bar{\theta} - \mathrm{sgn}(w_j') x_j\right)^2\right]^2}},    (44)

where $\bar{\theta} \equiv \bar{\theta}_K(\mathbf{w}', \mathbf{x})$.

Using (34), we finally obtain the following adaptive algorithm to update the weights $\{w_i'\}_{i=1}^N$:

    w_i'(n+1) = w_i'(n) - \mu\, e(n)\, \frac{\partial \bar{\theta}}{\partial w_i'}(n),    (45)

with $\frac{\partial \bar{\theta}}{\partial w_i'}(n)$ given by (44).

4.2 Adaptive Scaled Weighted Myriad Filters

The output of the scaled weighted myriad filter with weights $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, and linearity parameter $K > 0$, is given from (29) by

    y = \tilde{\theta}_K^{(S)}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) = \left[\sum_{j=1}^{N} (w_j + |h_j|)\right] \cdot \tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}),    (46)

where the (unscaled) weighted myriad filter output $\tilde{\theta}_K(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x})$ is defined in (20). Using the interpretation of the weighted myriad filter as a weighted myriad smoother, as shown in (37), we can rewrite (46) as

    y = \tilde{\theta}_K^{(S)}(\langle \mathbf{w}, \mathbf{h} \rangle, \mathbf{x}) = \left(\sum_{j=1}^{2N} g_j\right) \cdot \hat{\theta}_K(\mathbf{g}, \mathbf{z}),    (47)

with the transformed inputs $\mathbf{z} = [x_1, x_2, \ldots, x_N, -x_1, -x_2, \ldots, -x_N]^T$ and transformed weights $\mathbf{g} = [w_1, w_2, \ldots, w_N, |h_1|, |h_2|, \ldots, |h_N|]^T$. As we mentioned in Section 3.3, the scaled weighted myriad filter has $(2N + 1)$ independent parameters represented by the $N$ pairs of weights $\{\langle w_i, h_i \rangle\}_{i=1}^N$ and the linearity parameter $K$. Equivalently, from (47), the filter can be described by the modified weights $\{g_i \geq 0\}_{i=1}^{2N}$ and the linearity parameter $K > 0$. Since the parameters are all non-negative in this second formulation of the filter, the appropriate adaptive algorithm for their optimization is given by (35). It would appear that the parameter vector $\mathbf{q}$ to be used in (35) should consist of the weights $\{g_i\}_{i=1}^{2N}$ and the linearity parameter $K$. However, it makes more sense to optimize the squared linearity parameter $\alpha \triangleq K^2$ rather than $K$ itself, for two reasons. First, in the definitions of all the different myriad filters, $K$ always occurs only in the form of its squared value. Second, an adaptive algorithm for $K$ faces an ambiguity in the sign of $K$ and might converge to a negative value (that is, to $-\sqrt{\alpha}$).

Therefore, the parameter vector used in (35) is chosen to be $\mathbf{q} = [\mathbf{g}^T, \alpha]^T$, with a length $M = 2N + 1$. Our task is then to derive expressions for $\{\partial y / \partial g_i\}_{i=1}^{2N}$, and for $\partial y / \partial \alpha$. After optimizing $\alpha$, the linearity parameter can be recovered as $K = +\sqrt{\alpha}$.

Differentiating with respect to the weight $g_i$ in (47), we have

    \frac{\partial y}{\partial g_i} = \hat{\theta}_K(\mathbf{g}, \mathbf{z}) + \left(\sum_{j=1}^{2N} g_j\right) \cdot \frac{\partial}{\partial g_i} \hat{\theta}_K(\mathbf{g}, \mathbf{z});    (48)

note that we already know $\frac{\partial}{\partial g_i} \hat{\theta}_K(\mathbf{g}, \mathbf{z})$ from (39). Now, differentiating (47) with respect to $\alpha = K^2$, holding the weights $\mathbf{g}$ fixed, we obtain

    \frac{\partial y}{\partial \alpha} = \left(\sum_{j=1}^{2N} g_j\right) \cdot \frac{\partial}{\partial \alpha} \hat{\theta}_K(\mathbf{g}, \mathbf{z}).    (49)

An expression for $\frac{\partial}{\partial \alpha} \hat{\theta}_K(\mathbf{g}, \mathbf{z})$ is derived in Appendix B, and is given by

    \frac{\partial}{\partial \alpha} \hat{\theta}_K(\mathbf{g}, \mathbf{z}) = \frac{\displaystyle\sum_{j=1}^{2N} \dfrac{g_j (\hat{\theta} - z_j)}{\left[\alpha + g_j (\hat{\theta} - z_j)^2\right]^2}}{\displaystyle\sum_{j=1}^{2N} \dfrac{g_j \left[\alpha - g_j (\hat{\theta} - z_j)^2\right]}{\left[\alpha + g_j (\hat{\theta} - z_j)^2\right]^2}},    (50)

where $\hat{\theta} \equiv \hat{\theta}_K(\mathbf{g}, \mathbf{z})$. Substituting the expressions for $\partial y / \partial g_i$ and $\partial y / \partial \alpha$ (from (48) and (49)) into (35) results in the following algorithms to update the weights $\{g_i\}_{i=1}^{2N}$ and the squared linearity parameter $\alpha = K^2$:

Adaptive Scaled Weighted Myriad Filter Algorithms

    g_i(n+1) = P\left[\, g_i(n) - \mu\, e(n) \cdot \left\{\hat{\theta}(n) + \left(\sum_{j=1}^{2N} g_j(n)\right) \cdot \frac{\partial \hat{\theta}}{\partial g_i}(n)\right\} \right]    (51)

and

    \alpha(n+1) = P\left[\, \alpha(n) - \mu\, e(n) \cdot \left(\sum_{j=1}^{2N} g_j(n)\right) \cdot \frac{\partial \hat{\theta}}{\partial \alpha}(n) \right],    (52)

where $\frac{\partial \hat{\theta}}{\partial g_i}(n)$ and $\frac{\partial \hat{\theta}}{\partial \alpha}(n)$ are given by (39) and (50), respectively. Note that the transformed inputs and weights $\{z_i(n), g_i(n)\}_{i=1}^{2N}$ that occur in the above algorithms can be mapped to the original samples and weights $\{x_i(n), \langle w_i(n), h_i(n) \rangle\}_{i=1}^N$ when desired, using (41) and (42).
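A sketch (not from the paper) of one iteration of (51)-(52): the smoother output and the two gradients are evaluated (here with a grid-search approximation and with (39) rewritten directly in terms of $\alpha = K^2$), and the weights and $\alpha$ are updated with the projection $P[\cdot]$. All function names and the toy data are assumptions.

```python
import numpy as np

def smoother_and_gradients(z, g, alpha, grid_points=20001):
    """Grid-search smoother output plus the gradients of (39) and (50), with alpha = K^2."""
    grid = np.linspace(z.min(), z.max(), grid_points)
    Q = np.sum(np.log(alpha + g[:, None] * (z[:, None] - grid[None, :])**2), axis=0)
    theta = grid[np.argmin(Q)]
    r = theta - z
    u = alpha + g * r**2
    denom = np.sum(g * (alpha - g * r**2) / u**2)
    dtheta_dg = (-r * alpha / u**2) / denom        # equivalent form of (39) in terms of alpha
    dtheta_dalpha = np.sum(g * r / u**2) / denom   # (50)
    return theta, dtheta_dg, dtheta_dalpha

def adaptive_scaled_step(x, d, g, alpha, mu):
    """One iteration of (51)-(52) for the scaled weighted myriad filter of (47)."""
    z = np.concatenate([x, -x])
    theta, dtheta_dg, dtheta_dalpha = smoother_and_gradients(z, g, alpha)
    Sg = np.sum(g)
    y = Sg * theta                                  # scaled output, (47)
    e = y - d
    g_new = np.maximum(g - mu * e * (theta + Sg * dtheta_dg), 0.0)   # (51) with projection
    alpha_new = max(alpha - mu * e * Sg * dtheta_dalpha, 0.0)        # (52) with projection
    return g_new, alpha_new, y, e

x = np.array([0.3, -0.2, 0.5])
g = np.full(6, 1.0 / 6.0)
g, alpha, y, e = adaptive_scaled_step(x, d=1.2, g=g, alpha=1.0, mu=0.005)
```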

5 Simulation Results

The adaptive myriad filtering algorithms developed in Section 4 are investigated in this section using three computer simulation examples. The first example illustrates the inadequacy of the weighted myriad smoother in bandpass filtering a clean (noiseless) chirp-type signal. In the second example, the weighted myriad filters of Section 3.2 are trained to bandpass filter a chirp-type signal corrupted by alpha-stable noise. The third example consists of adaptively designing the scaled weighted myriad filters of Section 3.3 to retain the high frequency tones in a sum of sinusoids, followed by selective amplification of the extracted components. Thus, it is demonstrated that the scaled weighted myriad filters possess the full signal processing power of the traditional linear FIR filters.

Example 1:

In this example, we illustrate the fact that the weighted myriad smoother is unusable in bandpass filtering applications, due to its constraint of non-negative weights. Fig. 2(a) shows a clean (noiseless) chirp-type test signal $s(n)$, which is a digital sinusoid with a quadratically increasing instantaneous frequency. Specifically, the test signal of length $L$ is given by $s(n) = \sin(\omega(n)\, n)$, $n = 0, 1, \ldots, L-1$, where the radian frequency $\omega(n) = 0.2\pi \cdot \left(\frac{n}{L-1}\right)^2$ increases quadratically with $n$ from $0$ to a maximum of $0.2\pi$. The desired signal $d(n)$, shown in Fig. 2(b), was obtained by passing $s(n)$ through an FIR bandpass filter of window length $N = 21$, designed using MATLAB's fir1 function with the passband cutoff frequencies $(\omega_1, \omega_2) = (0.16\pi, 0.18\pi)$.

The observed signal $s(n)$ and desired signal $d(n)$ of length $L = 300$ were used to train the linear FIR filter and the weighted myriad smoother (of Section 2), using a filter window length $N = 21$ in each case. The linear filter was trained using the standard LMS algorithm, with the initial weights being all zero: $\mathbf{w}(0) = \mathbf{0}$, while the weighted myriad smoother was trained using the adaptive LMS-type algorithm developed in [14], with the initial weights being all identical and normalized to sum to unity: $w_i(0) = 1/N = 0.0476$, $i = 1, 2, \ldots, N$. The linearity parameter was arbitrarily chosen to be $K = 1.0$; recall from Section 2 that the filter output depends only on $(\mathbf{w}/K^2)$.
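The chirp-type test signal and the FIR-derived desired signal described above can be generated along the following lines; this sketch is not from the paper, and scipy.signal.firwin (with cutoffs normalized to the Nyquist frequency) is assumed here only as a rough stand-in for MATLAB's fir1, not an exact reproduction of the filter used by the authors.

```python
import numpy as np
from scipy.signal import firwin, lfilter

# Chirp-type test signal of Example 1: s(n) = sin(w(n) * n), with w(n) rising
# quadratically from 0 to 0.2*pi over L = 300 samples.
L = 300
n = np.arange(L)
omega = 0.2 * np.pi * (n / (L - 1)) ** 2
s = np.sin(omega * n)

# Desired signal: s(n) passed through a 21-tap bandpass FIR filter with passband
# (0.16*pi, 0.18*pi); firwin cutoffs are normalized so that 1.0 corresponds to pi.
b = firwin(21, [0.16, 0.18], pass_zero=False)
d = lfilter(b, [1.0], s)
```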

[Figure 2: Example 1 (bandpass filtering a chirp-type signal), (a) s(n): clean chirp-type test signal, (b) d(n) ≡ y_lin(n): desired signal (linear FIR filter output), (c) y_wmy(n): weighted myriad smoother output.]

Both the adaptive algorithms were implemented using 150 passes through the data, for a total of 45000 iterations. The step-size used was $\mu = 0.01$ in both cases, and was chosen to obtain as small an MSE as possible.

The linear filter output $y_{\mathrm{lin}}(n)$, using the trained FIR filter on the test signal $s(n)$, was very close to the desired signal $d(n)$ (an MSE of $8.7 \times 10^{-10}$) as expected, and is therefore not shown separately in Fig. 2. However, the output $y_{\mathrm{wmy}}(n)$ of the optimized weighted myriad smoother, shown in Fig. 2(c), is clearly quite different from the desired signal (an MSE of 0.0846). It is interesting to note from the figure that $y_{\mathrm{wmy}}(n)$ is close to a lowpass filtered version of the test signal $s(n)$, with a cutoff frequency corresponding to the upper cutoff $\omega_2 = 0.18\pi$ of the designed bandpass linear FIR filter. In attempting to bandpass filter $s(n)$, the weighted myriad smoother thus does the best that it can, which is to lowpass filter $s(n)$ with the upper cutoff frequency. This example clearly shows that the weighted myriad smoother is limited to a lowpass type behavior.

Example 2:

This example demonstrates that, unlike the weighted myriad smoother, the weighted myriad filters of Section 3.2 can perform bandpass filtering operations, since the weights can be both positive and negative. Fig. 3(a) shows a clean (noiseless) chirp-type signal $s(n)$, which is the same signal that was used in Example 1 (see Fig. 2). The desired signal $d(n)$, shown in Fig. 3(c), is also the same as the one obtained in Example 1 through bandpass filtering $s(n)$ with a linear FIR filter of window length $N = 21$.

The noiseless signal $s(n)$ and the desired signal $d(n)$ of length $L = 300$ were used to train the linear FIR filter, and the general and special weighted myriad filters defined in Section 3.2, with a filter window of length $N = 21$ in each case. As in Example 1, the linear filter was trained using the LMS algorithm, with initial weights given by $\mathbf{w}(0) = \mathbf{0}$. Recall from Section 4.1 that a general weighted myriad filter with weights $\{w_i \geq 0\}_{i=1}^N$ and $\{h_i \leq 0\}_{i=1}^N$, and linearity parameter $K > 0$, can be optimized by finding the best transformed weights $\{g_i\}_{i=1}^{2N}$, where $\mathbf{g} = [w_1, w_2, \ldots, w_N, |h_1|, |h_2|, \ldots, |h_N|]^T$.

[Figure 3: Example 2 (bandpass filtering a chirp-type signal), (a) clean chirp-type training signal, (b) noisy chirp-type test signal, (c) desired signal, (d) linear FIR filter output, (e) weighted myriad filter output, (f) special weighted myriad filter output.]

The general weighted myriad filter with $2N$ weights was optimized using the adaptive algorithm of (40), with the initial transformed weights being all identical and normalized to unity: $g_i(0) = 1/(2N) = 0.0238$, $i = 1, 2, \ldots, 2N$. The special weighted myriad filter with weights $\{w_i' \in \mathbb{R}\}_{i=1}^N$ was trained using the adaptive algorithm of (45), with the initial weights again being identical and normalized to unity: $w_i'(0) = 1/N = 0.0476$, $i = 1, 2, \ldots, N$. All the adaptive algorithms were implemented using 100 passes through the data, for a total of 30000 iterations. A step-size $\mu = 0.01$ was chosen in all cases. For both these myriad filters, the optimal filter is independent of $K$; the linearity parameter in this example was therefore arbitrarily chosen to be $K = 1.0$.

Table 2 shows the final filter weights obtained by the different adaptive algorithms. The noiseless training signal $s(n)$ was filtered using the optimized linear and myriad filters; the mean square errors between these outputs and the desired signal $d(n)$ are shown in the second column of Table 3. Since the filter outputs are very close to the desired signal in all cases, they are not shown here.

Fig. 4 shows the MSE learning curves (squared error versus algorithm iterations) for the adaptive algorithms for the general and special myriad filters. Notice that the algorithm for the special myriad filter converges much faster than in the general case. This is to be expected since the general filter has twice the number of parameters and is naturally more complex. The special filter also achieves a lower MSE than the general case, as seen from Table 3; this is mainly due to the ease of convergence. It is not surprising that the linear filter achieves the lowest MSE in this noiseless case; it also converges the fastest in this case (the learning curve for the linear filter is not shown here).

Having designed the different bandpass filters in a noiseless environment, we now test their performance when applied to signals in impulsive noise. To this end, alpha-stable noise was added to the signal $s(n)$, with $\alpha = 1.4$ and the so-called dispersion of the noise being given by $\gamma = 0.1$. Fig. 3(b) shows the resulting test signal $x(n)$; note the heavy distortion in the signal due to the impulsive component. The noisy test signal $x(n)$ was filtered using the various filters that had been trained in a noiseless environment; the mean square errors are shown in the last column of Table 3.

    Index   Linear FIR      General Myriad        Special Myriad
    i       w_i             w_i        |h_i|      w'_i
    1        0.0084          0.0340     0.0006     0.0095
    2        0.0020          0.0156     0.0148     0.0048
    3       -0.0134          0.0001     0.0511    -0.0040
    4       -0.0410          0.0147     0.1603    -0.0402
    5       -0.0732          0.0217     0.2297    -0.0036
    6       -0.0891          0.0064     0.4526    -0.1153
    7       -0.0677          0.0057     0.2079    -0.0082
    8       -0.0046          0.0250     0.0030     0.0013
    9        0.0815          0.2941     0.0019     0.0099
    10       0.1561          0.9450     0.0210     0.2238
    11       0.1857          1.1756     0.0319     0.0249
    12       0.1561          0.8962     0.0112     0.2271
    13       0.0814          0.3576     0.0073     0.0052
    14      -0.0047          0.0143     0.0131     0.0021
    15      -0.0676          0.0099     0.2009    -0.0086
    16      -0.0891          0.0150     0.4976    -0.1115
    17      -0.0732          0.0154     0.1682    -0.0044
    18      -0.0409          0.0108     0.2445    -0.0362
    19      -0.0133          0.0098     0.0193    -0.0026
    20       0.0020          0.0111     0.0125     0.0057
    21       0.0085          0.0221     0.0079     0.0068

Table 2: Optimized filter parameters in Example 2.

From the table, we see that the MSEs for the myriad filters are still quite low, while there is a severe degradation in the MSE of the linear filter, since it is not robust to changes in the noise environment. This is also seen in the linear filter output shown in Fig. 3(d), where the severe distortion in the high frequency region is due to the presence of the impulse which affects the linear filter. On the other hand, we see from Figs. 3(e) and (f) that the myriad filter outputs are relatively closer to the desired signal and are not affected as much by the impulse in the signal. It is evident that these filters are more robust to changes in the noise environment. This example thus shows the potential of weighted myriad filters in robust bandpass and highpass filtering applications.

 Filter Type        Training (noiseless)    Test (noisy)
 Linear FIR              1.6 x 10^-9            0.0544
 General Myriad          0.0060                 0.0173
 Special Myriad          0.0033                 0.0229

Table 3: Example 2, mean square error (MSE) in filtering the training and test signals.
Figure 4: Example 2, MSE learning curves, (a) general weighted myriad filter, (b) special weighted myriad filter.

Example 3:

We show in this example that the scaled weighted myriad filter of Section 3.3 is capable of highpass filtering as well as amplification or attenuation of selected frequency components. The chosen observed signal is a clean (noiseless) sum of three sinusoids: $s(n) = \sum_{k=0}^{2} a_k \sin(2\pi f_k n)$. Fig. 5(a) shows a segment of $s(n)$, with the digital frequencies $f_0 = 0.01$, $f_1 = 0.03$, and $f_2 = 0.2$, and corresponding amplitudes $(a_0, a_1, a_2) = (0.3, 0.4, 0.1)$. The desired signal $d(n)$, shown in Fig. 5(b), is a sum of the two higher-frequency sinusoids: $d(n) = \sum_{k=1}^{2} b_k \sin(2\pi f_k n)$, with the amplitudes chosen as $(b_1, b_2) = (0.2, 1.0)$. Note that it is required to delete the lowest-frequency tone at $f_0$, attenuate the component at $f_1$ by a factor $(a_1/b_1) = 2.0$, and amplify the highest-frequency tone at $f_2$ by a factor $(b_2/a_2) = 10.0$. Further, we can easily verify that the dynamic ranges of the signals are given by $|s(n)| \le 0.6527$ and $|d(n)| \le 1.1495$.
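As a small aid to reproducing this setup, the following sketch generates the two signals with the frequencies and amplitudes quoted above; the 6,000-sample length matches the training length mentioned below, and the variable names are choices made for this sketch.

```python
# Illustrative generation of the Example 3 signals: observed s(n) and desired d(n).
import numpy as np

n = np.arange(6000)                # training length used later in this example
f = np.array([0.01, 0.03, 0.2])    # digital frequencies f0, f1, f2
a = np.array([0.3, 0.4, 0.1])      # amplitudes of s(n)
b = np.array([0.0, 0.2, 1.0])      # amplitudes of d(n): f0 deleted, f1 halved, f2 amplified tenfold

s = (a[:, None] * np.sin(2 * np.pi * f[:, None] * n[None, :])).sum(axis=0)
d = (b[:, None] * np.sin(2 * np.pi * f[:, None] * n[None, :])).sum(axis=0)
```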

Figure 5: Example 3 (highpass filtering a sum of sinusoids), (a) s(n): clean sum of three sinusoids, (b) d(n): desired highpass component with selective amplification/attenuation.

It is obvious that a weighted myriad filter without scaling would not succeed in achieving the amplification of the dynamic range evident in the figure, since its output is restricted to the dynamic range of its input.

The linear FIR filter and the scaled weighted myriad filter were both trained using 6,000 samples of the observed signal $s(n)$ and the desired signal $d(n)$, using a window size $N = 21$ in each case. For the linear filter, the standard LMS algorithm was used, with the initial weights being all zero: $\mathbf{w}(0) = \mathbf{0}$. Recall from Section 4.2 that, for a scaled weighted myriad filter with weights $\{w_i \ge 0\}_{i=1}^{N}$ and $\{h_i \le 0\}_{i=1}^{N}$, and linearity parameter $K > 0$, it is convenient to optimize the modified parameter vector $\mathbf{q} = [\mathbf{g}^T, \rho]^T$, where the transformed weights are given by $\mathbf{g} = [w_1, w_2, \ldots, w_N, |h_1|, |h_2|, \ldots, |h_N|]^T$, and $\rho = K^2$ is the squared linearity parameter. Notice that, unlike the filters considered so far, where we could choose an arbitrary value for $K$, we have to optimize $K$ here in addition to the weights $\{g_i\}$.
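Since the text states that the linear filter was trained with the standard LMS recursion from zero initial weights, a plain sketch of that recursion is given below for reference; the alignment of the desired sample with the centre of the observation window and the helper name `lms_train` are assumptions of this sketch, and the step-size value is the one quoted later in the text for all the algorithms in this example.

```python
# Illustrative LMS training of the N = 21 tap linear FIR filter of Example 3.
import numpy as np

def lms_train(s, d, N=21, mu=0.1):
    """Standard LMS: w <- w + mu * e(n) * x(n), starting from w(0) = 0."""
    w = np.zeros(N)
    c = (N - 1) // 2                       # centre-aligned desired sample (an assumption)
    for n in range(N - 1, len(s)):
        x_win = s[n - N + 1:n + 1][::-1]   # current observation window
        e = d[n - c] - np.dot(w, x_win)    # error between desired sample and filter output
        w += mu * e * x_win                # LMS weight update
    return w

# w_opt = lms_train(s, d)   # s, d as generated for Example 3
```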

The scaled weighted myriad filter was trained in this example using the algorithms of (51) and (52), with the initial value of the linearity parameter being $K(0) = 0.5$, and the initial transformed weights being all identical and normalized to unity: $g_i(0) = 1/(2N) = 0.0238$, $i = 1, 2, \ldots, 2N$. A step-size $\mu = 0.1$ was used in all the algorithms.

Figure 6: Example 3, MSE learning curve for scaled myriad filtering.

Fig. 6 shows the MSE learning curve (squared error versus algorithm iterations) for the adaptive scaled myriad filtering algorithms, demonstrating their convergence. The final values of the different filter weights are shown in Table 4, with the optimized value of the linearity parameter given by $K = 0.8236$. The observed signal $s(n)$ was filtered using the optimized linear and scaled myriad filters. Since the resulting outputs were very close to the desired signal (with MSEs equal to $7.7 \times 10^{-8}$ and $0.0073$, respectively), they are not shown here. Thus, the trained scaled myriad filter was successful in emulating the frequency-selective behavior of the linear highpass filter. This example clearly shows that the scaled weighted myriad filter is analogous to the linear FIR filter in its ability to arbitrarily shape the spectrum of an input signal, as well as to amplify the signal if desired.

 Index   Linear FIR    Scaled Myriad (K = 0.8236)
   i        w_i            w_i        |h_i|
   1      0.6142         0.5913      0.0271
   2      0.0248         0.0795      0.2909
   3     -0.9666         0.0189      0.9142
   4     -0.8975         0.0152      0.7974
   5      0.2289         0.3840      0.0820
   6      0.9450         1.2599      0.0190
   7      0.3432         0.6118      0.1499
   8     -0.6728         0.0175      0.8607
   9     -0.6398         0.0140      0.8212
  10      0.4407         0.5756      0.0222
  11      1.1027         0.8490      0.0167
  12      0.4408         0.5255      0.0115
  13     -0.6392         0.0164      1.0291
  14     -0.6715         0.0184      0.9543
  15      0.3448         0.5417      0.0399
  16      0.9462         1.1263      0.0167
  17      0.2296         0.6270      0.0371
  18     -0.8972         0.0192      0.7638
  19     -0.9665         0.0216      0.8441
  20      0.0240         0.0189      0.2996
  21      0.6119         0.6732      0.1451

Table 4: Optimized filter parameters in Example 3.

6 Conclusion

In this paper, we generalize the weighted myriad smoother, proposed for robust nonlinear "lowpass" filtering applications in impulsive environments, into a richer class of Weighted Myriad Filters admitting real-valued weights. These filters are developed by assigning a pair of weights, one positive and the other negative, to each of the input samples. This procedure can be applied to the generalization of any weighted M-estimator of location ("M-smoother"), leading to the class of so-called M-filters. With this new filter structure, weighted myriad filters can now be employed in a variety of applications that require "bandpass" or "highpass" type filtering. By suitably scaling the outputs of these filters, we develop the class of Scaled Weighted Myriad Filters. These novel filters constitute a robust generalization of the traditional linear FIR filter, outperforming linear filters in a variety of signal processing and communications applications in impulsive environments. We develop nonlinear adaptive algorithms for the optimization of the various myriad filters. The performance of these adaptive filters is demonstrated through computer simulation examples involving robust frequency-selective filtering in impulsive noise environments.

A   Evaluation of $\frac{\partial}{\partial w'_i}\,\beta^*_K(\mathbf{w}', \mathbf{x})$

From (26) in Section 3.2, the output of the special type of weighted myriad filter with $N$ real-valued weights $\{w'_i \in \mathbb{R}\}_{i=1}^{N}$ is $\beta^*_K(\mathbf{w}', \mathbf{x}) = \arg\min_{\beta} Q_2(\beta)$, where $Q_2(\beta)$ is given by (27):
$$
Q_2(\beta) \;=\; \sum_{i=1}^{N} \log\!\left[\, K^2 + |w'_i|\,\bigl(\operatorname{sgn}(w'_i)\,x_i - \beta\bigr)^2 \right]. \tag{53}
$$
In this appendix, we shall evaluate the derivative of $\beta^*_K(\mathbf{w}', \mathbf{x})$ with respect to the weight $w'_i$, holding all other quantities constant. Since $\beta^*_K(\mathbf{w}', \mathbf{x}) \equiv \beta^*$ is one of the local minima of $Q_2(\beta)$, it follows that
$$
Q_2'(\beta^*) = 0. \tag{54}
$$
Differentiating (53) and substituting into (54), we obtain
$$
G(\beta^*, w'_i) \;\triangleq\; Q_2'(\beta^*) \;=\; 2 \sum_{j=1}^{N} \frac{|w'_j|\,\bigl(\beta^* - \operatorname{sgn}(w'_j)\,x_j\bigr)}{K^2 + |w'_j|\,\bigl(\beta^* - \operatorname{sgn}(w'_j)\,x_j\bigr)^2} \;=\; 0, \tag{55}
$$
where we have introduced the function $G(\cdot,\cdot)$ to emphasize the implicit dependence of the output $\beta^*$ on the weight $w'_i$, since we are interested in evaluating $\partial \beta^*/\partial w'_i$ while holding all other quantities constant. By implicit differentiation of (55) with respect to $w'_i$, we obtain
$$
\left(\frac{\partial G}{\partial \beta^*}\right)\left(\frac{\partial \beta^*}{\partial w'_i}\right) \;+\; \left(\frac{\partial G}{\partial w'_i}\right) \;=\; 0, \tag{56}
$$
from which we can find $\partial \beta^*/\partial w'_i$ once we evaluate $\partial G/\partial \beta^*$ and $\partial G/\partial w'_i$. Using (55), it is straightforward to show that
$$
\frac{\partial G}{\partial \beta^*} \;=\; 2 \sum_{j=1}^{N} \frac{|w'_j|\left[\, K^2 - |w'_j|\,\bigl(\beta^* - \operatorname{sgn}(w'_j)\,x_j\bigr)^2 \right]}{\left[\, K^2 + |w'_j|\,\bigl(\beta^* - \operatorname{sgn}(w'_j)\,x_j\bigr)^2 \right]^2}. \tag{57}
$$
Evaluating $\partial G/\partial w'_i$ from (55) is, however, more delicate. The difficulty is due to the term $\operatorname{sgn}(w'_i)$ that occurs in the expression for $G(\beta^*, w'_i)$: a direct differentiation with respect to $w'_i$ would involve the quantity $\frac{\partial}{\partial w'_i}\operatorname{sgn}(w'_i)$, which is not defined.

Fortunately, this problem can be circumvented by rewriting the expression for $G(\beta^*, w'_i)$ in (55), expanding it as follows:
$$
G(\beta^*, w'_i) \;=\; 2 \sum_{j=1}^{N} \frac{|w'_j|\,\beta^* - w'_j\,x_j}{K^2 + |w'_j|\,\bigl((\beta^*)^2 + x_j^2\bigr) - 2\,w'_j\,\beta^*\,x_j}, \tag{58}
$$
where we have used the fact that $|w'_j|\operatorname{sgn}(w'_j) = w'_j$. The form (58) presents no difficulty in differentiating with respect to $w'_i$, since the term $\operatorname{sgn}(w'_i)$ is no longer present. Differentiating with respect to $w'_i$, and performing some straightforward manipulations, we then obtain
$$
\frac{\partial G}{\partial w'_i} \;=\; \frac{2 K^2\,\operatorname{sgn}(w'_i)\,\bigl(\beta^* - \operatorname{sgn}(w'_i)\,x_i\bigr)}{\left[\, K^2 + |w'_i|\,\bigl(\beta^* - \operatorname{sgn}(w'_i)\,x_i\bigr)^2 \right]^2}. \tag{59}
$$
Substituting (57) and (59) into (56), we finally obtain the following expression for $\partial \beta^*/\partial w'_i$:
$$
\frac{\partial}{\partial w'_i}\,\beta^*_K(\mathbf{w}', \mathbf{x})
\;=\; -\;\frac{\displaystyle \frac{K^2\,\operatorname{sgn}(w'_i)\,\bigl(\beta^* - \operatorname{sgn}(w'_i)\,x_i\bigr)}{\left[\, K^2 + |w'_i|\,\bigl(\beta^* - \operatorname{sgn}(w'_i)\,x_i\bigr)^2 \right]^2}}
{\displaystyle \sum_{j=1}^{N} \frac{|w'_j|\left[\, K^2 - |w'_j|\,\bigl(\beta^* - \operatorname{sgn}(w'_j)\,x_j\bigr)^2 \right]}{\left[\, K^2 + |w'_j|\,\bigl(\beta^* - \operatorname{sgn}(w'_j)\,x_j\bigr)^2 \right]^2}}. \tag{60}
$$
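As an aside, the closed form (60) can be checked numerically; the sketch below (an illustration only, with arbitrary test data and helper names chosen for this sketch) minimizes $Q_2$ of (53) on a dense grid with a bounded refinement, and compares (60) against a central finite difference. The comparison assumes that the same local minimum is tracked before and after the small weight perturbation.

```python
# Illustrative numerical check of Eq. (60) against a finite difference.
import numpy as np
from scipy.optimize import minimize_scalar

def Q2(beta, w, x, K):
    """Objective of Eq. (53) for real-valued weights w and samples x."""
    return np.sum(np.log(K**2 + np.abs(w) * (np.sign(w) * x - beta)**2))

def beta_star(w, x, K, grid_points=4001):
    """Minimizer of Q2: dense grid over the signed samples, then a bounded refinement."""
    z = np.sign(w) * x
    grid = np.linspace(z.min(), z.max(), grid_points)
    b0 = grid[np.argmin([Q2(b, w, x, K) for b in grid])]
    h = (z.max() - z.min()) / grid_points
    return minimize_scalar(Q2, bounds=(b0 - h, b0 + h), args=(w, x, K),
                           method='bounded', options={'xatol': 1e-12}).x

def dbeta_dwi(w, x, K, i):
    """Closed-form derivative of beta* with respect to w_i, Eq. (60)."""
    b = beta_star(w, x, K)
    r = b - np.sign(w) * x
    denom = (K**2 + np.abs(w) * r**2)**2
    num = K**2 * np.sign(w[i]) * r[i] / denom[i]
    den = np.sum(np.abs(w) * (K**2 - np.abs(w) * r**2) / denom)
    return -num / den

rng = np.random.default_rng(1)
w, x, K, i, eps = rng.standard_normal(5), rng.standard_normal(5), 1.0, 2, 1e-4
wp, wm = w.copy(), w.copy()
wp[i] += eps
wm[i] -= eps
fd = (beta_star(wp, x, K) - beta_star(wm, x, K)) / (2 * eps)
print(dbeta_dwi(w, x, K, i), fd)   # the two values should agree closely
```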

B   Evaluation of $\frac{\partial}{\partial \rho}\,\hat{\beta}_K(\mathbf{g}, \mathbf{z})$

In this appendix, we shall obtain an expression for the derivative of the weighted myriad smoother output $\hat{\beta}_K(\mathbf{g}, \mathbf{z})$ with respect to its squared linearity parameter $\rho = K^2$, while holding the weights $\mathbf{g}$ and inputs $\mathbf{z}$ constant. Noting that the vectors $\mathbf{g}$ and $\mathbf{z}$ are of length $2N$, and referring to Section 2, we conclude from (8) and (9) that the output $\hat{\beta}_K(\mathbf{g}, \mathbf{z}) \equiv \hat{\beta}$ satisfies
$$
F(\hat{\beta}, \rho) \;=\; \sum_{j=1}^{2N} \frac{g_j\,(\hat{\beta} - z_j)}{\rho + g_j\,(\hat{\beta} - z_j)^2} \;=\; 0, \tag{61}
$$
where the function $F(\cdot,\cdot)$ emphasizes the implicit dependence of the output $\hat{\beta}$ on $\rho$, holding all other quantities constant. By implicit differentiation of (61) with respect to $\rho$, we obtain
$$
\left(\frac{\partial F}{\partial \hat{\beta}}\right)\left(\frac{\partial \hat{\beta}}{\partial \rho}\right) \;+\; \left(\frac{\partial F}{\partial \rho}\right) \;=\; 0, \tag{62}
$$
from which we can extract $\partial \hat{\beta}/\partial \rho$ once we evaluate $\partial F/\partial \hat{\beta}$ and $\partial F/\partial \rho$. It is straightforward to find these quantities using the expression for $F(\hat{\beta}, \rho)$ given in (61):
$$
\frac{\partial F}{\partial \hat{\beta}} \;=\; \sum_{j=1}^{2N} \frac{g_j\left[\, \rho - g_j\,(\hat{\beta} - z_j)^2 \right]}{\left[\, \rho + g_j\,(\hat{\beta} - z_j)^2 \right]^2} \tag{63}
$$
and
$$
\frac{\partial F}{\partial \rho} \;=\; -\sum_{j=1}^{2N} \frac{g_j\,(\hat{\beta} - z_j)}{\left[\, \rho + g_j\,(\hat{\beta} - z_j)^2 \right]^2}. \tag{64}
$$
Substituting (63) and (64) into (62), we finally obtain the following expression for $\partial \hat{\beta}/\partial \rho$:
$$
\frac{\partial}{\partial \rho}\,\hat{\beta}_K(\mathbf{g}, \mathbf{z})
\;=\; \frac{\displaystyle \sum_{j=1}^{2N} \frac{g_j\,(\hat{\beta} - z_j)}{\left[\, \rho + g_j\,(\hat{\beta} - z_j)^2 \right]^2}}
{\displaystyle \sum_{j=1}^{2N} \frac{g_j\left[\, \rho - g_j\,(\hat{\beta} - z_j)^2 \right]}{\left[\, \rho + g_j\,(\hat{\beta} - z_j)^2 \right]^2}}. \tag{65}
$$
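A quick numerical illustration of (65), analogous to the check given for (60), is sketched below; the test data, helper names, and the grid-plus-refinement search are choices made for this sketch only.

```python
# Illustrative numerical check of Eq. (65) against a finite difference in rho.
import numpy as np
from scipy.optimize import minimize_scalar

def smoother(z, g, rho, grid_points=4001):
    """Weighted myriad smoother output: minimize sum_j log(rho + g_j (beta - z_j)^2)."""
    obj = lambda b: np.sum(np.log(rho + g * (b - z)**2))
    grid = np.linspace(z.min(), z.max(), grid_points)
    b0 = grid[np.argmin([obj(b) for b in grid])]
    h = (z.max() - z.min()) / grid_points
    return minimize_scalar(obj, bounds=(b0 - h, b0 + h),
                           method='bounded', options={'xatol': 1e-12}).x

def dbeta_drho(z, g, rho):
    """Closed-form derivative of the smoother output with respect to rho, Eq. (65)."""
    b = smoother(z, g, rho)
    r = b - z
    denom = (rho + g * r**2)**2
    return np.sum(g * r / denom) / np.sum(g * (rho - g * r**2) / denom)

rng = np.random.default_rng(2)
z, g, rho, eps = rng.standard_normal(8), rng.random(8), 1.0, 1e-4
fd = (smoother(z, g, rho + eps) - smoother(z, g, rho - eps)) / (2 * eps)
print(dbeta_drho(z, g, rho), fd)   # the two values should agree closely
```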
