Statistics & Probability Letters 69 (2004) 233–242

On the finite sample breakdown points of redescending M-estimates of location

Zhiqiang Chen a,*,1, David E. Tyler b,2

a Department of Mathematics, William Paterson University, Wayne, NJ 07470, USA
b Department of Statistics, Rutgers, The State University of New Jersey, Piscataway, NJ 08855, USA

Received 2 March 2004; received in revised form 2 April 2004

Available online 6 July 2004

Abstract

The finite sample breakdown points of scale equivariant redescending M-estimates of location are studied. In particular, a simple lower bound for the finite sample breakdown point of redescending M-estimates of location is given whenever the M-estimate of location is defined using the median absolute deviation about the median (MAD) as a scaling term. This lower bound is close to 0.49 for many common cases and depends on the configuration of the “good” data only through the breakdown point of the MAD.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Breakdown point; MAD; Redescending M-estimates

1. Introduction

Given a univariate sample $X_n = \{x_1, x_2, \ldots, x_n\}$ and an objective function $\rho : \mathbb{R} \to \mathbb{R}$, an M-estimate of location for $X_n$ can be defined as

$$T(X_n) = \arg\min_{t \in \mathbb{R}} \sum_{i=1}^{n} \rho(x_i - t). \tag{1}$$

ARTICLE IN PRESS

*Corresponding author.

E-mail address: [email protected] (Zhiqiang Chen).

1 Research partially supported by ART program and by Center For Research at William Paterson University.

2 Research partially supported by NSF Grant DMS-0305858.

0167-7152/$ - see front matter © 2004 Elsevier B.V. All rights reserved.

doi:10.1016/j.spl.2004.06.007

One typically associates M-estimates with solutions to an M-estimating equation. In particular, if $\rho$ is differentiable, then the M-estimate of location satisfies the M-estimating equation

$$\sum_{i=1}^{n} \psi(x_i - t) = 0 \quad \text{or} \quad \sum_{i=1}^{n} u(x_i - t)(x_i - t) = 0, \tag{2}$$

where $\psi(z) \propto \rho'(z)$ and with the weight function being defined so that $\psi(z) = z\,u(z)$. If $\psi$ is also monotonic, then $t = T(X_n)$ is the unique solution of (2). For a bounded differentiable $\rho$, however, the corresponding $\psi$-function is not monotonic but instead redescends, i.e. it goes to zero as $|z| \to \infty$. For such $\rho$, the M-estimating equation (2) can admit multiple solutions, but not every solution satisfies (1). The focus of this paper is on the breakdown points of redescending M-estimates of location defined via a bounded $\rho$-function, and hence attention is restricted to its definition as a solution to a minimization problem.

In practice, a positive scale statistic $S(X_n)$, as well as a positive tuning parameter $c$, are usually included within the definition of an M-estimate of location. The M-estimate of location is then defined as

$$T(X_n) = \arg\min_{t \in \mathbb{R}} \sum_{i=1}^{n} \rho\left(\frac{x_i - t}{c\,S(X_n)}\right). \tag{3}$$

If the scale statistic is translation invariant and scale equivariant, then the corresponding M-estimate of location is both translation and scale equivariant. In contrast, the definition given by (1) implies that $T(X_n)$ is translation equivariant but not scale equivariant.
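To make definition (3) concrete, the following is a minimal sketch that approximates the minimizer by a grid search over $[\min X_n, \max X_n]$, using the MAD as the scale statistic. The function names, the choice of Tukey's biweight with $c = 6$, the grid size, and the use of the unscaled MAD with Python's even-sample median convention are all assumptions made for illustration; a grid search is not the only (or the most efficient) way to solve the minimization.

```python
import math
import statistics


def tukey_rho(z):
    """Tukey's biweight objective, normalized so rho(0) = 0 and rho(z) = 1 for |z| >= 1."""
    return 1.0 - (1.0 - z * z) ** 3 if abs(z) <= 1.0 else 1.0


def mad(xs):
    """Median absolute deviation about the median, without a consistency factor."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def m_estimate(xs, rho=tukey_rho, c=6.0, grid_size=4001):
    """Approximate arg min over t of sum_i rho((x_i - t) / (c * MAD(xs))) on a grid."""
    s = mad(xs)
    if s == 0:
        raise ValueError("MAD is zero, so the estimate is not defined")
    lo, hi = min(xs), max(xs)
    best_t, best_val = lo, math.inf
    for k in range(grid_size):
        t = lo + (hi - lo) * k / (grid_size - 1)
        val = sum(rho((x - t) / (c * s)) for x in xs)
        if val < best_val:
            best_t, best_val = t, val
    return best_t


# Hypothetical sample: five well-behaved points and one gross outlier.
sample = [-2.0, -1.0, 0.0, 1.0, 2.0, 1000.0]
print("mean:", sum(sample) / len(sample))
print("M-estimate:", m_estimate(sample))
```

Because the MAD rescales with the data, applying the same affine map to every observation moves the grid, the scale and the minimizer together, which is the equivariance property described above.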

M-estimates of location having bounded, monotonic and odd $\psi$-functions have an asymptotic breakdown point of $1/2$. The breakdown point of the scale equivariant version of these M-estimates is dependent on the breakdown properties of the scale statistic (see e.g. Huber, 1981). In particular, if the median absolute deviation about the median (MAD) is used as the scale statistic, they maintain an asymptotic breakdown point of $1/2$. For redescending M-estimates of location, Huber (1984) derives the finite sample breakdown point for those defined via (1), i.e. without a scale term, and notes that the breakdown point depends not only on the objective function $\rho$ but also on the configuration of the “good” data. The asymptotic behavior of these finite sample breakdown points has recently been studied by Zhang and Li (1998).

Less is known regarding the breakdown points of the scale equivariant redescending M-estimates, i.e. those defined via (3). Some simulation results for the breakdown point of M-estimators using suitable scaling and tuning constants are reported by Hoaglin et al. (1983). Their simulation results indicate that when the MAD is used as the scale statistic, the M-estimates have breakdown points close to $1/2$ for many objective functions. Huber (1984) also reports similar simulation results.

In this short article, some theoretical results are given for the breakdown point of the scale equivariant redescending M-estimates of location, with the primary focus being on the use of the popular MAD as the estimate of scale. In particular, we establish numerical lower bounds for their finite sample breakdown points. When using the MAD in conjunction with Tukey's biweight M-estimate of location with tuning constant $c = 9$, this lower bound is 0.495. These results thus give a theoretical justification for the aforementioned simulation results. The main results of the paper are given in the next section, while the proofs and some technical results are given in an appendix.

Zhiqiang Chen, D. E. Tyler / Statistics & Probability Letters 69 (2004) 233–242

2. Finite sample breakdown points

2.1. Preliminaries and background

In what follows, the objective function $\rho$ is assumed to satisfy the following regularity condition:

Condition 2.1. The function $\rho(z)$ satisfies:

(i) $\rho$ is symmetric about 0, i.e. $\rho(z) = \rho(-z)$, and nondecreasing for $z \ge 0$;
(ii) $\lim_{z \to 0} \rho(z) = \rho(0) = 0$ and $\lim_{z \to \infty} \rho(z) = 1$.
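Condition 2.1 is easy to check numerically for a candidate objective function. The sketch below does so on a finite grid for three bounded $\rho$-functions (Tukey's biweight, Andrews' wave and Welsch's, in the normalized forms used in this article). The function names, grid range, step count and tolerances are illustrative assumptions, and a grid check of this kind is evidence rather than a proof.

```python
import math


def tukey_rho(z):
    """Tukey's biweight objective: 1 - (1 - z^2)^3 on |z| <= 1, and 1 outside."""
    return 1.0 - (1.0 - z * z) ** 3 if abs(z) <= 1.0 else 1.0


def andrews_rho(z):
    """Andrews' wave objective: 1 - (1 + cos(pi z)) / 2 on |z| <= 1, and 1 outside."""
    return 1.0 - 0.5 * (1.0 + math.cos(math.pi * z)) if abs(z) <= 1.0 else 1.0


def welsch_rho(z):
    """Welsch's objective: 1 - exp(-z^2)."""
    return 1.0 - math.exp(-z * z)


def satisfies_condition_2_1(rho, z_max=50.0, steps=5000, tol=1e-9):
    """Grid check of symmetry, monotonicity on z >= 0, rho(0) = 0 and rho -> 1."""
    zs = [z_max * k / steps for k in range(steps + 1)]
    vals = [rho(z) for z in zs]
    symmetric = all(abs(rho(z) - rho(-z)) <= tol for z in zs)
    nondecreasing = all(b >= a - tol for a, b in zip(vals, vals[1:]))
    zero_at_origin = abs(rho(0.0)) <= tol and abs(rho(1e-6)) <= tol
    tends_to_one = abs(vals[-1] - 1.0) <= tol
    return symmetric and nondecreasing and zero_at_origin and tends_to_one


for name, rho in [("Tukey", tukey_rho), ("Andrews", andrews_rho), ("Welsch", welsch_rho)]:
    print(name, satisfies_condition_2_1(rho))
```

An unbounded objective such as $\rho(z) = z^2$ fails the last check, which is the property that separates redescending M-estimates from the monotone case.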

Most of the commonly used objective functions for redescending M-estimators satisfy these conditions. For example, one obtains Tukey's biweight M-estimate by choosing

$$\rho(z) = 1 - \{(1 - z^2)^3\}\, I(|z| \le 1), \tag{4}$$

where $I(\cdot)$ is the usual indicator function. The corresponding weight function $u(z) = (1 - z^2)^2\, I(|z| \le 1)$ is then Tukey's biweight function (see e.g. Hoaglin et al., 1983). Another example is Andrews' M-estimate, which is obtained by choosing

$$\rho(z) = 1 - \left\{\tfrac{1}{2}\,[1 + \cos(\pi z)]\right\} I(|z| \le 1), \tag{5}$$

and has corresponding weight function $u(z) = (\sin(\pi z)/z)\, I(|z| \le 1)$ (see e.g. Andrews et al., 1972). Welsch's M-estimate, which is obtained by choosing $\rho(z) = 1 - e^{-z^2}$ (see e.g. Holland and Welsch, 1977), also satisfies these conditions. Its corresponding weight function, $u(z) = e^{-z^2}$, is referred to as Welsch's weight function in the statistical toolbox of MATLAB.

A number of definitions for the finite sample breakdown point have been proposed (see e.g. Hampel, 1971, 1974; Donoho, 1982; Donoho and Huber, 1983) as measures for quantifying the proportion of bad data in a sample that a statistic can tolerate before returning arbitrary values. In this article, we work with the finite sample contamination breakdown point and the finite sample replacement breakdown point. Convincing arguments for these measures can be found in Donoho and Huber (1983) and in He et al. (1990). Let $X_n$ be a sample of $n$ univariate data points or “good observations” and let $Y_m$ be $m$ arbitrary univariate points or “bad observations”. The finite sample contamination breakdown point for $T$ at $X_n$ is then defined to be

$$\epsilon_c(T, X_n) = \inf\left\{\frac{m}{n + m} : B_{c,m}(T, X_n) = \infty\right\}, \tag{6}$$

where the “maximum bias” is $B_{c,m}(T, X_n) = \sup_{Y_m} |T(X_n) - T(X_n \cup Y_m)|$. For the replacement version, rather than adding $Y_m$ to the data set $X_n$, $Y_m$ replaces $m$ arbitrary values in $X_n$. Denote the remaining $n - m$ values from $X_n$ by $X_{n-m}$. The finite sample replacement breakdown point for $T$ at $X_n$ is then defined to be

$$\epsilon_r(T, X_n) = \inf\left\{\frac{m}{n} : B_{r,m}(T, X_n) = \infty\right\}, \tag{7}$$

where $B_{r,m}(T, X_n) = \sup_{Y_m} |T(X_n) - T(X_{n-m} \cup Y_m)|$. By convention, if $T(X_n \cup Y_m)$ does not exist for some $Y_m$ then $B_{c,m}(T, X_n)$ is taken to be $\infty$. Non-existence of $T(X_n \cup Y_m)$ occurs, e.g., whenever $S(X_n \cup Y_m) = 0$. An analogous convention holds for $B_{r,m}(T, X_n)$.


The finite sample breakdown points of the scale statistic $S(X_n)$ play a role in the breakdown points of the corresponding M-estimate of location. The definition of the finite sample contamination breakdown point of $S$ at $X_n$, denoted by $\epsilon_c(S, X_n)$, is obtained by replacing $B_{c,m}(T, X_n)$ in (6) with $B_{c,m}(S, X_n) = \sup_{Y_m} |\log(S(X_n)) - \log(S(X_n \cup Y_m))|$. The finite sample replacement breakdown point of $S$ at $X_n$, denoted $\epsilon_r(S, X_n)$, is defined analogously. The scale statistic is thus said to break down if it can be made arbitrarily close to zero or arbitrarily large.

2.2. Point mass contamination

For many statistics, the worst case scenario for breakdown occurs when all the “bad” data points lie at the same value, i.e. when $Y_m = y^{[m]} \equiv \{y, \ldots, y\}$. It turns out that this is not necessarily true for the scale equivariant redescending M-estimates of location, as is shown in Remark 2.5 at the end of this subsection. Nevertheless, it is instructive to first consider how the statistic $T(X_n)$ behaves under point mass contamination. Define the finite sample point mass contamination breakdown point $\epsilon_p(T, X_n)$ as in (6) but with $y^{[m]}$ replacing $Y_m$. Clearly, $\epsilon_p(T, X_n)$ provides at least an upper bound for $\epsilon_c(T, X_n)$. Also, define the finite sample point mass contamination breakdown point for the scale statistic, i.e. $\epsilon_p(S, X_n)$, in a manner analogous to the definition of $\epsilon_p(T, X_n)$.

The following theorem gives general bounds for $\epsilon_p(T, X_n)$. Several remarks concerning the theorem are made afterwards.

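Theorem 2.1. Let $T$ be an M-estimate of location defined by (3) with $\rho$ being continuous and satisfying Condition 2.1. Let

$$A_m = \inf_t \sum_{i=1}^{n} \rho\left(\frac{x_i - t}{c\,s_{m,L}}\right),$$

where $s_{m,L} = \min\{s_{m,-}, s_{m,+}\}$, with

$$s_{m,-} = \liminf_{y \to -\infty} S(X_n \cup y^{[m]}) \quad \text{and} \quad s_{m,+} = \liminf_{y \to \infty} S(X_n \cup y^{[m]}).$$

Also, set $m^* = \inf\{m \mid m \ge \lceil n - A_m \rceil\}$ and $m^{**} = \inf\{m \mid m \ge \lfloor n - A_m + 1 \rfloor\}$. The finite sample point mass contamination breakdown point of $T(\cdot)$ at $X_n$ then satisfies

$$\epsilon_p(T, X_n) \ge \min\left\{\epsilon_p(S, X_n),\ \frac{m^*}{n + m^*}\right\}.$$

Furthermore, if $\epsilon_p(S, X_n) > m^{**}/(n + m^{**})$, then

$$\epsilon_p(T, X_n) \le \frac{m^{**}}{n + m^{**}}.$$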

It is straightforward to note that in the definition of $A_m$ the infimum over $t$ can be restricted to $\min(X_n) \le t \le \max(X_n)$. Also, $m^{**} = m^*$ unless $A_{m^*} = n - m^*$. Theorem 2.1 yields the exact finite sample point mass contamination breakdown point whenever $m^{**} = m^*$ and $m^*/(n + m^*) \le \epsilon_p(S, X_n)$.
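The quantities in Theorem 2.1 can be approximated numerically. The sketch below is one way to do so under several assumptions that are not part of the theorem: the scale is the unscaled MAD, the limits defining $s_{m,L}$ are approximated by placing the point mass at a very large $|y|$ (the hypothetical parameter `far`), the infimum defining $A_m$ is approximated on a grid over $[\min X_n, \max X_n]$, and $\rho$ is Tukey's biweight with $c = 6$. All function names are illustrative.

```python
import math
import statistics


def tukey_rho(z):
    """Tukey's biweight objective, normalized so rho(0) = 0 and rho(z) = 1 for |z| >= 1."""
    return 1.0 - (1.0 - z * z) ** 3 if abs(z) <= 1.0 else 1.0


def mad(xs):
    """Median absolute deviation about the median, without a consistency factor."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def s_mL(xs, m, far=1e9):
    """Approximate s_{m,L} by evaluating the MAD with m copies of a far-away point."""
    s_minus = mad(list(xs) + [-far] * m)
    s_plus = mad(list(xs) + [far] * m)
    return min(s_minus, s_plus)


def A_m(xs, m, rho=tukey_rho, c=6.0, grid_size=2001):
    """Grid approximation of inf_t sum_i rho((x_i - t) / (c * s_{m,L}))."""
    s = s_mL(xs, m)
    lo, hi = min(xs), max(xs)
    grid = (lo + (hi - lo) * k / (grid_size - 1) for k in range(grid_size))
    return min(sum(rho((x - t) / (c * s)) for x in xs) for t in grid)


def m_star(xs, rho=tukey_rho, c=6.0):
    """Smallest m with m >= ceil(n - A_m); the loop stops by m = n since A_m >= 0."""
    n = len(xs)
    m = 1
    while m < math.ceil(n - A_m(xs, m, rho, c)):
        m += 1
    return m


xs = [-15, -2, -1, 1, 2]
for m in range(1, 5):
    print(m, s_mL(xs, m), round(A_m(xs, m), 4))
print("m* =", m_star(xs))
```

The grid minimum can only overestimate $A_m$, so the check $m < \lceil n - A_m \rceil$ errs on the conservative side. As the theorem states, the resulting $m^*/(n + m^*)$ is a lower bound only after taking the minimum with $\epsilon_p(S, X_n)$.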


Remark 2.1. From the proof of Theorem 2.1, it can be noted that the continuity of $\rho$ is not needed if $S(X_n \cup y^{[m]}) = s_{m,+}$ for large enough $y$ and if $S(X_n \cup y^{[m]}) = s_{m,-}$ for small enough $y$. This holds, for example, when the scale statistic is taken to be the MAD.

Remark 2.2. One can note that the breakdown point of the scale statistic $\epsilon_p(S, X_n)$ is not included in the upper bound for $\epsilon_p(T, X_n)$ given in Theorem 2.1. If the scale statistic breaks down by becoming exactly zero, then the location statistic also breaks down since it is not defined. It can also be noted from the proof of Theorem 2.1 that if the scale statistic can be made arbitrarily close to zero, then the location statistic will again break down. However, if the scale statistic can be made arbitrarily large, it is not clear whether this implies that the location statistic breaks down. This may depend upon the specific scale statistic being used.

Remark 2.3. The finite sample contamination breakdown point of the MAD is given by

$$\epsilon_c(\mathrm{MAD}, X_n) = \frac{n - 2q(X_n) + 1}{2n - 2q(X_n) + 1};$$

see e.g. Donoho (1982). The quantity $q(X_n)$ represents the maximum number of data points in $X_n$ which are repeated, or more specifically

$$q(X_n) \equiv \max_{i=1,\ldots,n} \sum_{j=1}^{n} I(x_j = x_i). \tag{8}$$

The MAD can be broken down by adding $m = n - 2q(X_n) + 1$ data points equal to the data point in $X_n$ which is repeated $q(X_n)$ times. This results in $\mathrm{MAD} = 0$. It also implies $\epsilon_p(\mathrm{MAD}, X_n) = \epsilon_c(\mathrm{MAD}, X_n)$. Thus, for $S = \mathrm{MAD}$,

$$\frac{m_o}{n + m_o} \le \epsilon_p(T, X_n) \le \frac{m_{oo}}{n + m_{oo}},$$

where $m_o = \min\{m^*, n - 2q(X_n) + 1\}$ and $m_{oo} = \min\{m^{**}, n - 2q(X_n) + 1\}$.
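The implosion mechanism described in Remark 2.3 can be checked directly on a small sample. The sketch below computes $q(X_n)$ and the number $m = n - 2q(X_n) + 1$ of added points, then evaluates the MAD after adding $m$ and $m - 1$ copies of a most-repeated value. The function names are illustrative, the sample is hypothetical, and the MAD uses Python's `statistics.median`, which averages the two middle values for even sample sizes; other median conventions may behave differently at the boundary.

```python
import statistics
from collections import Counter


def mad(xs):
    """Median absolute deviation about the median, without a consistency factor."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def max_multiplicity(xs):
    """q(X_n): the largest number of times any single value is repeated."""
    return max(Counter(xs).values())


def mad_contamination_breakdown(xs):
    """Return (m, m / (n + m)) with m = n - 2 q(X_n) + 1, as in Remark 2.3."""
    n, q = len(xs), max_multiplicity(xs)
    m = n - 2 * q + 1
    return m, m / (n + m)


xs = [-15, -2, -1, 1, 2]
m, eps = mad_contamination_breakdown(xs)
most_repeated = Counter(xs).most_common(1)[0][0]
print("m =", m, "breakdown point =", eps)
print("MAD with m copies added:    ", mad(xs + [most_repeated] * m))
print("MAD with m - 1 copies added:", mad(xs + [most_repeated] * (m - 1)))
```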

Remark 2.4. The scale statistic need not be translation invariant and scale equivariant for Theorem 2.1 to hold. In particular, if $S(X_n) = s$, a constant, then $\epsilon_p(S, X_n) = 1$ and so

$$\frac{m^*}{n + m^*} \le \epsilon_p(T, X_n) \le \frac{m^{**}}{n + m^{**}}.$$

Furthermore, the infimum in the definition of $A_m$ is obtained at $t = T(X_n)$ and $A_m$ does not depend on $m$. These bounds for $\epsilon_p(T, X_n)$ are the same as the bounds for the finite sample contamination breakdown point $\epsilon_c(T, X_n)$ given by Huber (1984) for fixed scale.

Remark 2.5. The following example shows that $\epsilon_p(T, X_n)$ and $\epsilon_c(T, X_n)$ are not always equal. The idea behind this example arises from noting that the breakdown point $\epsilon_p(T, X_n)$ given by Theorem 2.1 generally decreases as the scale for $X_n \cup y^{[m]}$ decreases. So, in seeking a counterexample to equality, one might consider general contaminations $Y_m$ such that the scale for $X_n \cup Y_m$ tends to be smaller than that of $X_n \cup y^{[m]}$.

Let $n = 5$, $X_n = \{-15, -2, -1, 1, 2\}$ and let $T(X_n)$ be defined as in (3) with $S = \mathrm{MAD}$ and $c = 1$. Placing $m = 4$ extra points at one of the data points in $X_n$ will break down the MAD, but using only $m = 3$ points will not. Hence, for contamination of the form $y^{[m]}$, at most $m = 4$ points are needed to break down $T(X_n)$.

For $m = 1, 2,$ and $3$, $s_{m,L} = 2, 3,$ and $10$, respectively. If the $\rho$-function is chosen so that for some $\delta \ge 0$, $\rho(1/6) \le \delta$, $\rho(5/6) < 1 - 2\delta$ and $\rho(1/5) < 0.25$, then it can be shown that $A_1 < 4$, $A_2 < 3$, and $A_3 < 2$. Hence, $m < \lceil n - A_m \rceil$ for $m = 1, 2,$ or $3$. By Theorem 2.1, this implies that contamination of the form $y^{[m]}$, for $m = 1, 2,$ and $3$, cannot break down $T(X_n)$.

Consider now contamination of the form $Y_m = \{0, y, y\}$ for arbitrary $y$. Whether or not this configuration causes $T(X_n)$ to break down is equivalent to determining whether contamination of the form $y^{[m_o]} = \{y, y\}$ causes $T(X_{n_o})$ to break down, where $X_{n_o} = X_n \cup \{0\}$, $n_o = 6$ and $m_o = 2$. It can be shown that $s_{m_o,L} = 2$, and if in addition $\rho(1/4) > 0.8$ then $A_{m_o} > 4$. Hence, $m_o \ge \lfloor n_o - A_{m_o} + 1 \rfloor$. It follows by Theorem 2.1 that contamination of the form $y^{[m_o]}$ breaks down $T(X_{n_o})$. Consequently, $m = 3$ contamination points can break down $T(X_n)$, and so $\epsilon_c(T, X_n) < \epsilon_p(T, X_n)$.
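The scale values quoted in Remark 2.5 can be reproduced with a few lines of code. This is a minimal sketch, assuming that the limits defining $s_{m,-}$ and $s_{m,+}$ are already attained at a very large $|y|$ (the hypothetical parameter `far`, which Remark 2.1 suggests is reasonable for the MAD) and that the median of an even-sized sample is the average of its two middle values. The function names are illustrative.

```python
import statistics


def mad(xs):
    """Median absolute deviation about the median, without a consistency factor."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def s_mL(xs, m, far=1e9):
    """min(s_{m,-}, s_{m,+}), approximated with m copies of a point at -far or +far."""
    s_minus = mad(list(xs) + [-far] * m)
    s_plus = mad(list(xs) + [far] * m)
    return min(s_minus, s_plus)


x_n = [-15, -2, -1, 1, 2]
print([s_mL(x_n, m) for m in (1, 2, 3)])  # point mass contamination of X_n

x_no = x_n + [0]                          # X_n with the extra point 0 added
print(s_mL(x_no, 2))                      # two further points at y
```

Comparing the last value with $s_{3,L}$ for the original sample shows why the contamination $\{0, y, y\}$ is more harmful than $\{y, y, y\}$: it keeps the scale small.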

2.3. Lower bounds for $\epsilon(T, X_n)$ when $S = \mathrm{MAD}$

Although in general $\epsilon_c(T, X_n)$ and $\epsilon_p(T, X_n)$ are not necessarily equal, we conjecture that they may be equal if further regularity conditions are imposed on the function $\rho$ and on the scale statistic $S(\cdot)$. However, given that the form for $\epsilon_p(T, X_n)$ in Theorem 2.1 is itself quite complicated, this topic is not pursued further here. Rather, in this subsection, simple lower bounds for the finite sample breakdown points are obtained, at least whenever the scale statistic is taken to be the MAD. The importance of using the MAD in obtaining these bounds is made apparent by Lemma A.2 of the appendix. To begin, simple lower bounds for $\epsilon_p(T, X_n)$ are obtained.

Theorem 2.2. Let $T$ be an M-estimate of location defined by (3) with $\rho$ satisfying Condition 2.1 and with $S \equiv \mathrm{MAD}$. The finite sample point mass contamination breakdown point of $T$ at $X_n$ satisfies

$$\epsilon_p(T, X_n) \ge \min\left\{\frac{n - 2q(X_n) + 1}{2n - 2q(X_n) + 1},\ \frac{1 - \rho(1.5/c)}{2\,(1 - \rho(1.5/c) + \rho(0.5/c))}\right\},$$

where $q(X_n)$ is defined in (8).

An important property of the lower bounds in the above theorem is that they are “universal” in $X_n$, i.e. they do not depend upon the configuration of $X_n$ except through the finite sample breakdown point of the MAD, which in turn depends on $X_n$ only through $q(X_n)$. It is this property, together with the property $\epsilon_c(\mathrm{MAD}, X_n) = \epsilon_p(\mathrm{MAD}, X_n)$, that allows extending these lower bounds to the finite sample contamination breakdown point.

Theorem 2.3. Under the conditions of Theorem 2.2, the finite sample contamination breakdown point of $T$ at $X_n$ satisfies

$$\epsilon_c(T, X_n) \ge \min\left\{\frac{n - 2q(X_n) + 1}{2n - 2q(X_n) + 1},\ \frac{1 - \rho(1.5/c)}{2\,(1 - \rho(1.5/c) + \rho(0.5/c))}\right\}.$$

Finally, the “universal” nature of the bounds over $X_n$ allows them to be readily extended to the finite sample replacement breakdown points. An adjustment must be made, though, for the finite sample replacement breakdown point of the MAD, which is given by

$$\epsilon_r(\mathrm{MAD}, X_n) = \frac{\lfloor (n - 2q(X_n) + 2)/2 \rfloor}{n};$$

see e.g. Gather and Hilker (1997).

Theorem 2.4. Under the conditions of Theorem 2.2, the finite sample replacement breakdown point of $T$ at $X_n$ satisfies

$$\epsilon_r(T, X_n) \ge \min\left\{\frac{\lfloor (n - 2q(X_n) + 2)/2 \rfloor}{n},\ \frac{1 - \rho(1.5/c)}{2\,(1 - \rho(1.5/c) + \rho(0.5/c))}\right\}.$$

One can note from the previous theorems that for a large enough tuning constant $c$ the finite sample breakdown points of $T(X_n)$ correspond to those of $\mathrm{MAD}(X_n)$. This follows since

$$\lim_{c \to \infty} \frac{1 - \rho(1.5/c)}{2\,(1 - \rho(1.5/c) + \rho(0.5/c))} = \frac{1}{2}.$$

Furthermore, as $c \to \infty$, it can be shown that at a normal distribution the asymptotic efficiency of $T(X_n)$ relative to the sample mean goes to 1 when $\rho$ is differentiable. However, the gross error sensitivity also goes to infinity. So, in practice, one would wish to choose a tuning constant which yields both high relative efficiency at the normal model and a reasonably small gross error sensitivity.

When using the $\rho$-function (4) associated with Tukey's biweight M-estimate, a tuning constant between $c = 6$ and $9$ is usually recommended (see e.g. Hoaglin et al., 1983). For $c = 6$ and $c = 9$, the lower bound for the finite sample breakdown points is the minimum of the finite sample breakdown point of the MAD and of 0.488 and 0.495, respectively. This supports the simulation results reported by Hoaglin et al. (1983) and by Huber (1984). When using the $\rho$-function (5) associated with Andrews' wave M-estimate, a tuning constant of $c = 2.1\pi$ is recommended in Andrews et al. (1972). For Welsch's M-estimate, the statistical toolbox in MATLAB uses the constant $c = 2.985/0.6745 = 4.4255$ as the default value. The lower bounds for the finite sample breakdown points for these two choices are then the minimum of the finite sample breakdown point of the MAD and of 0.492 and 0.497, respectively.
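The $\rho$-dependent term in Theorems 2.2 to 2.4 is simple to evaluate. The sketch below computes it for Tukey's biweight (4) at $c = 6$ and $c = 9$ and for Andrews' wave (5) at $c = 2.1\pi$; up to rounding it should reproduce the values 0.488, 0.495 and 0.492 quoted above. The function names are illustrative and not taken from any library.

```python
import math


def tukey_rho(z):
    """Tukey's biweight objective (4)."""
    return 1.0 - (1.0 - z * z) ** 3 if abs(z) <= 1.0 else 1.0


def andrews_rho(z):
    """Andrews' wave objective (5)."""
    return 1.0 - 0.5 * (1.0 + math.cos(math.pi * z)) if abs(z) <= 1.0 else 1.0


def rho_lower_bound(rho, c):
    """(1 - rho(1.5/c)) / (2 (1 - rho(1.5/c) + rho(0.5/c)))."""
    r15 = rho(1.5 / c)
    r05 = rho(0.5 / c)
    return (1.0 - r15) / (2.0 * (1.0 - r15 + r05))


print("Tukey, c = 6:       %.4f" % rho_lower_bound(tukey_rho, 6.0))
print("Tukey, c = 9:       %.4f" % rho_lower_bound(tukey_rho, 9.0))
print("Andrews, c = 2.1pi: %.4f" % rho_lower_bound(andrews_rho, 2.1 * math.pi))
```

The overall lower bound is the minimum of this term and the relevant breakdown point of the MAD, so for samples with many ties the MAD term can be the binding one.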

Appendix A. Proofs and technical results

For simplicity, define

$$L(t, s, X_n) = \sum_{x \in X_n} \rho\left(\frac{x - t}{cs}\right).$$

Proof of Theorem 2.1. Let $t_y = T(X_n \cup y^{[m]})$ and $s_y = S(X_n \cup y^{[m]})$. Suppose throughout the proof that $m/(n + m) < \epsilon_p(S, X_n)$, and hence $s_y$ is bounded above and below.

Lower bound: If $|y|$ is bounded then $t_y$ is bounded, since as $|t| \to \infty$, $L(t, s_y, X_n \cup y^{[m]}) \to n + m$, whereas $L(x_1, s_y, X_n \cup y^{[m]}) \le n + m - 1$. Consider next the case $|y| \to \infty$ and suppose $|t_y| \to \infty$. By definition, for any $t \in \mathbb{R}$, $L(t_y, s_y, X_n \cup y^{[m]}) \le L(t, s_y, X_n \cup y^{[m]})$, and so since $0 \le \rho(r) \le 1$, $L(t_y, s_y, X_n) \le L(t, s_y, X_n) + m$. Taking the limit as $|y| \to \infty$ and noting that there must exist a subsequence such that either $s_y \to s_{m,-}$ or $s_y \to s_{m,+}$ gives $n \le L(t, s_{m,L}, X_n) + m$, and so $n \le A_m + m$ or $m \ge n - A_m$. Thus if $m < m^*$, then $T(X_n \cup y^{[m]})$ must stay bounded.

Upper bound: Consider a sequence such that $|y| \to \infty$ and $s_y \to s_{m,L}$. Suppose $t_y$ stays bounded. There then exists a subsequence such that $t_y \to t_o \in \mathbb{R}$. By definition, $L(t_y, s_y, X_n \cup y^{[m]}) \le L(y, s_y, X_n \cup y^{[m]}) \le n$. Taking the limit gives $L(t_o, s_{m,L}, X_n) + m \le n$, and so $A_m + m \le n$. Thus, if $m = m^{**}$, then $T(X_n \cup y^{[m]})$ cannot be bounded. □

Before proving Theorems 2.2 and 2.3, a couple of lemmas are first established. The second lemma is where the dependence of the lower bounds given in this article on the use of the MAD as the scale statistic arises.

Lemma A.1. Under Condition 2.1, for any $0 < a \le b < 1$,

$$n - L(t, s, X_n) \ge \#C(b, t, s, X_n)\,(1 - \rho(b/c)) + \#C(a, t, s, X_n)\,(\rho(b/c) - \rho(a/c)),$$

where $C(\gamma, t, s, X_n) = \{x \in X_n : |x - t| \le \gamma s\}$.

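Proof. For simplicity, set $C(\gamma) \equiv C(\gamma, t, s, X_n)$. Then

$$\begin{aligned}
n - L(t, s, X_n) &= \sum_{x \in X_n} \left\{1 - \rho\left(\frac{x - t}{cs}\right)\right\} \\
&\ge \sum_{x \in C(a)} \left\{1 - \rho\left(\frac{x - t}{cs}\right)\right\} + \sum_{x \in C(b) \cap C(a)^c} \left\{1 - \rho\left(\frac{x - t}{cs}\right)\right\} \\
&\ge \#C(a)\,(1 - \rho(a/c)) + (\#C(b) - \#C(a))\,(1 - \rho(b/c)),
\end{aligned}$$

which gives the desired inequality. □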
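The inequality of Lemma A.1 can be sanity-checked numerically by computing the difference between its two sides. The sketch below does this for Tukey's biweight with $c = 6$ on a simulated sample; the function names, the choice of $\rho$, $c$, $t$ and $s$, and the Gaussian data are all assumptions made for illustration. The values $a = 0.5$ and $b = 1.5$ are those used when the lemma is applied in the proofs of Theorems 2.2 and 2.3.

```python
import random


def tukey_rho(z):
    """Tukey's biweight objective, normalized so rho(0) = 0 and rho(z) = 1 for |z| >= 1."""
    return 1.0 - (1.0 - z * z) ** 3 if abs(z) <= 1.0 else 1.0


def lemma_a1_gap(xs, t, s, a, b, rho=tukey_rho, c=6.0):
    """Left-hand side minus right-hand side of Lemma A.1; nonnegative if it holds."""
    n = len(xs)
    loss = sum(rho((x - t) / (c * s)) for x in xs)       # L(t, s, X_n)
    count_a = sum(1 for x in xs if abs(x - t) <= a * s)  # #C(a, t, s, X_n)
    count_b = sum(1 for x in xs if abs(x - t) <= b * s)  # #C(b, t, s, X_n)
    rhs = count_b * (1.0 - rho(b / c)) + count_a * (rho(b / c) - rho(a / c))
    return (n - loss) - rhs


random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(25)]
print(lemma_a1_gap(xs, t=0.0, s=1.0, a=0.5, b=1.5))
```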

Lemma A.2. Let $M = \mathrm{Median}(X_n \cup Y_m)$, $s = \mathrm{MAD}(X_n \cup Y_m)$, and $t = M - s/2$. Suppose $|x| < a$ for all $x \in X_n$ and $|y| > b$ for all $y \in Y_m$. Then, for $b - a$ large enough,

$$\#C(0.5, t, s, X_n) \ge m_+ \quad \text{and} \quad \#C(1.5, t, s, X_n) \ge \frac{n + m}{2},$$

where $m_+ = \#\{y \in Y_m : y > b\}$.

Proof. First note that $C(0.5, t, s, X_n) = \{x \in X_n : -s \le x - M \le 0\}$ and $C(1.5, t, s, X_n) = \{x \in X_n : -2s \le x - M \le s\}$. It then follows from the definitions of the median and the MAD that for large enough $b - a$,

$$\#C(1.5, t, s, X_n) \ge \#\{x \in X_n : |x - M| \le s\} \ge \frac{n + m}{2},$$

and

$$\begin{aligned}
\#C(0.5, t, s, X_n) &= \#\{x \in X_n : x - M \le 0\} + \#\{x \in X_n : x - M \ge -s\} - n \\
&\ge \left\{\frac{n + m}{2} - m_-\right\} + \frac{n + m}{2} - n = m - m_- = m_+,
\end{aligned}$$

where $m_- = \#\{y \in Y_m : y < -b\}$. □


Proof of Theorem 2.2. Assume $m/(n + m) < \epsilon_p(\mathrm{MAD}, X_n)$. Also, without loss of generality, assume $s_{m,L} = s_{m,+}$. For large enough $y$, we then have $\mathrm{MAD}(X_n \cup y^{[m]}) = s_{m,L}$. Since $A_m = \inf_t L(t, s_{m,L}, X_n)$, applying Lemmas A.1 and A.2 and noting $m_+ = m$ gives

$$n - A_m \ge \frac{n + m}{2}\,(1 - \rho(1.5/c)) + m\,(\rho(1.5/c) - \rho(0.5/c)).$$

By Theorem 2.1, we know that $T(X_n \cup y^{[m]})$ is bounded whenever $m < n - A_m$, and hence is bounded whenever $m$ is less than the right-hand side of the above inequality. Re-expressing this in terms of $m/(n + m)$ gives the desired result. □
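The final re-expression can be written out step by step. In the sketch below, $\rho_{1.5} = \rho(1.5/c)$ and $\rho_{0.5} = \rho(0.5/c)$ are shorthand introduced here for readability, and the last step assumes $1 - \rho_{1.5} + \rho_{0.5} > 0$ so that the division preserves the inequality.

```latex
\begin{align*}
m &< \frac{n+m}{2}\,(1-\rho_{1.5}) + m\,(\rho_{1.5}-\rho_{0.5}) \\
\iff\quad m\,(1-\rho_{1.5}+\rho_{0.5}) &< \frac{n+m}{2}\,(1-\rho_{1.5}) \\
\iff\quad \frac{m}{n+m} &< \frac{1-\rho_{1.5}}{2\,(1-\rho_{1.5}+\rho_{0.5})}.
\end{align*}
```

So boundedness holds whenever $m/(n+m)$ lies below both this ratio and $\epsilon_p(\mathrm{MAD}, X_n)$, which is the minimum appearing in the statement of Theorem 2.2.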

Proof of Theorem 2.3. Assume $m/(n + m) < \epsilon_c(\mathrm{MAD}, X_n)$, and suppose $t_m = T(X_n \cup Y_m)$ is not bounded over all possible $Y_m$. There must then exist a sequence of $Y_m$ such that $|t_m| \to \infty$. Without loss of generality, suppose $t_m \to \infty$. Furthermore, this sequence can be chosen so that each element either converges to an element in $\mathbb{R}$, diverges to infinity or diverges to minus infinity. Without loss of generality, assume the first $m_o$ elements converge, the next $m_-$ diverge to $-\infty$, and the last $m_+$ diverge to $\infty$. Hence $m = m_o + m_- + m_+$. Let $Y_{m_o}$, $Y_{m_-}$, and $Y_{m_+}$ denote the sets of the first $m_o$, the next $m_-$, and the last $m_+$ elements of $Y_m$, respectively.

Now by definition, $L(t_m, s_m, X_n \cup Y_m) \le L(t, s_m, X_n \cup Y_m)$ for any $t \in \mathbb{R}$, where $s_m = \mathrm{MAD}(X_n \cup Y_m)$. This implies

$$L(t_m, s_m, X_n \cup Y_{m_o} \cup Y_{m_-}) \le L(t, s_m, X_n \cup Y_{m_o}) + m_- + m_+.$$

Application of Lemmas A.1 and A.2 to $n + m_o - L(t, s_m, X_n \cup Y_{m_o})$ implies that the right-hand side of the above inequality is less than or equal to

$$n + m - \frac{n + m}{2}\,(1 - \rho(1.5/c)) - m_+\,(\rho(1.5/c) - \rho(0.5/c)).$$

Since $L(t_m, s_m, X_n \cup Y_{m_o} \cup Y_{m_-}) \to n + m_o + m_-$ as $t_m \to \infty$, it then follows that if $T(X_n \cup Y_m)$ is not bounded over $Y_m$, then

$$m_+ \ge \frac{n + m}{2}\,(1 - \rho(1.5/c)) + m_+\,(\rho(1.5/c) - \rho(0.5/c)),$$

or

$$\frac{m}{n + m} \ge \frac{m_+}{n + m} \ge \frac{1 - \rho(1.5/c)}{2\,(1 - \rho(1.5/c) + \rho(0.5/c))}.$$

The theorem then follows. □

Proof of Theorem 2.4. Assume $m/n < \epsilon_r(\mathrm{MAD}, X_n)$. Suppose now that $t_m = T(X_{n-m} \cup Y_m)$ is not bounded over the possible choices of $X_{n-m}$ and $Y_m$. There then exists a sequence of $\{X_{n-m}, Y_m\}$ such that $|t_m| \to \infty$. Within any such sequence is a subsequence such that $X_{n-m}$ contains the same $n - m$ elements. Hence, $m/n = m/((n - m) + m) \ge \epsilon_c(T, X_{n-m})$ and the result then follows. □

References

Andrews, D.F., Bickel, P.J., Hampel, F.R., Huber, P.J., Rogers, W.H., Tukey, J.W., 1972. Robust Estimates of Location: Survey and Advances. Princeton University Press, Princeton, NJ.

Donoho, D.L., 1982. Breakdown properties of multivariate location estimators. Ph.D. Qualifying paper, Department of Statistics, Harvard University.

Donoho, D.L., Huber, P.J., 1983. The notion of breakdown point. In: Doksum, K., Hodges, J.L. (Eds.), Festschrift in Honor of Erich Lehmann. Wadsworth, Belmont, CA.

Gather, U., Hilker, T., 1997. A note on Tyler’s modification of the MAD for the Stahel–Donoho estimator. Ann. Statist. 25, 2024–2026.

Hampel, F.R., 1971. A general quantitative definition of robustness. Ann. Math. Statist. 42, 1887–1896.

Hampel, F.R., 1974. The influence curve and its role in robust estimation. J. Amer. Statist. Assoc. 69, 909–927.

He, X., Jureckova, J., Koenker, R., Portnoy, S., 1990. Tail behavior of regression estimators and their breakdown points. Econometrica 58, 1195–1214.

Hoaglin, D.C., Mosteller, F., Tukey, J.W., 1983. Understanding Robust and Exploratory Data Analysis. Wiley, New York.

Holland, P.W., Welsch, R.E., 1977. Robust regression using iteratively reweighted least-squares. Commun. Statist. Theory Methods A6, 813–827.

Huber, P.J., 1981. Robust Statistics. Wiley, New York.

Huber, P.J., 1984. Finite sample breakdown of M- and P-estimators. Ann. Statist. 12, 119–126.

Zhang, J., Li, G., 1998. Breakdown properties of location M-estimators. Ann. Statist. 26, 1170–1189.
