Bandwidth Selection for Changepoint Estimation in Nonparametric Regression

Irène GIJBELS and Anne-Cécile GODERNIAUX

Institut de Statistique, Université Catholique de Louvain
20 Voie du Roman Pays, B-1348 Louvain-la-Neuve, Belgium

Technometrics, February 2004, Vol. 46, No. 1, 76-86. DOI 10.1198/004017004000000130

Nonparametric estimation of abrupt changes in a regression function involves choosing smoothing (bandwidth) parameters. The performance of estimation procedures depends heavily on this choice. So far, little attention has been paid to the crucial issue of choosing appropriate bandwidth parameters in practice. In this article we propose a bootstrap procedure for selecting the bandwidth parameters in a nonparametric two-step estimation method. This method results in a fully data-driven procedure for estimating a finite (but possibly unknown) number of changepoints in a regression function. We evaluate the performance of the data-driven procedure via a simulation study, which reveals that the fully automatic procedure performs quite well. As an illustration, we apply the procedure to some real data.

KEY WORDS: Bandwidth; Bootstrap; Cross-validation; Discontinuity points; Least squares fitting.

1. INTRODUCTION

There is a vast literature on the estimation of smooth regression functions. In many applications in physical sciences, engineering, medicine, and economics, the regression function is smooth except at a finite number of locations (e.g., jump discontinuities). (See Müller and Stadtmüller 1999 for more background information and many examples of data applications.) Estimation of an unknown regression function with a finite number of discontinuities is usually done in several stages. In the first (and more challenging) stage, the locations of the possible jump discontinuities are estimated. The second stage is the actual estimation of the regression curve, usually by fitting nonparametrically smooth curves to the left and to the right of the estimated locations of jump discontinuities, relying on techniques available for estimating smooth curves. Another approach was described by Kang, Koo, and Park (2000), who estimated the locations of the jump discontinuities as well as the sizes of the jumps. Using these estimates, they then adjusted the data suitably and applied ordinary smoothing techniques to the adjusted data. Thus, estimating regression functions with jump discontinuities involves estimation of the number of jump discontinuities and their locations, the jump sizes, and the regression function itself.

The literature on nonparametric estimation of regression functions that are smooth except at some points is by now quite large. Kernel-based estimation methods have been studied by Canny (1986), Korostelev (1987), Hall and Titterington (1992), Müller (1992), Wu and Chu (1993a, b, c), Chu (1994), Speckman (1994), Eubank and Speckman (1994), Bunt, Koch, and Pope (1995), and Kang et al. (2000), among others. Local polynomial methods were used by McDonald and Owen (1986), Leclerc and Zucker (1987), Loader (1996), Horváth and Kokoszka (1997), Qiu and Yandell (1998), Spokoiny (1998), Hamrouni (1999), and Grégoire and Hamrouni (2002), among others. Work on spline-based methods has been reported by Laurent and Utreras (1986), Girard (1990), and Koo (1997); on wavelet-based methods, by Mallat and Hwang (1992), Potier and Vercken (1994), Wang (1995), Raimondo (1998), Oudshoorn (1998), and Antoniadis and Gijbels (2002). Müller and Song (1997) and Gijbels, Hall, and Kneip (1999) proposed two-step procedures.

Two important issues arise when dealing with nonparametric estimation of a regression curve with jump discontinuities. The first issue is knowledge of the finite number of jump discontinuities. It is often assumed that this number is known, whereas this is frequently not the case in applications. The second issue is that any nonparametric estimation method involves the choice of parameters, call them "smoothing parameters," and the performance of the estimation procedures depends heavily on the choice of these parameters. Hence it is very important to address the issue of how to choose these parameters in practice. Some attention has been paid to these issues in the literature. Authors who have explored the first issue include Yin (1988), Wu and Chu (1993a), Bunt et al. (1998), Beunen (1998), Oudshoorn (1998), Müller and Stadtmüller (1999), and Antoniadis and Gijbels (2002). As far as we know, little work has been done on the second issue; Wu and Chu (1993c) and Spokoiny (1998) have provided theoretical contributions.

In this article we focus on estimating the locations of the jump discontinuities, particularly on the practical choice of the smoothing (bandwidth) parameters involved. We develop an estimation procedure with a data-driven choice of the bandwidth parameters and with a built-in estimation of the number of discontinuity points that performs well in practice. We use the two-step estimation method proposed by Gijbels et al. (1999), for which it has been shown that the estimator for the location of a jump discontinuity achieves the optimal rate n^{-1}, where n is the sample size. Other estimation procedures achieve the same optimal rate of convergence, but we choose this method for two reasons. First, this estimation method has demonstrated, via extensive simulations, good finite-sample performance for various regression functions. Second, as we show later, the method leads to a fully data-driven procedure.


The simulation study carried out for the two-step estimation method by Gijbels et al. (1999) revealed that the choice of the bandwidth parameters is crucial in practice, not only for this estimation method, but also for any other estimation method in this context. In Section 2.3 we illustrate this point by comparing the two-step estimation method with another estimation procedure. The second issue motivates the present article.

We propose a bootstrap procedure for selecting the bandwidth parameters. Gijbels et al. (2004) dealt with interval and band estimation for curves with jumps and suggested this selection procedure. Some of the basic ingredients for the bootstrap procedure used in our bandwidth selection problem are similar to those described in that article, and hence we refer to this work for details. A theoretical justification of the bootstrap bandwidth selection procedure introduced here would rely on the theoretical results established by Gijbels et al. (2004). We tested our fully data-driven estimation procedure on various simulation models; all demonstrated very good performance.

The article is organized as follows. In Section 2 we briefly describe the estimation method and discuss the bandwidth parameters it involves. We also compare this method with the estimation method developed by Grégoire and Hamrouni (2002). In Section 3 we introduce the bootstrap bandwidth selection procedure and discuss how to choose the number of discontinuities. We present a numerical study of the fully data-driven estimation method, along with an illustration using some real data, in Section 4. We provide some discussion in Section 5.

2. ESTIMATING JUMP DISCONTINUITIES

2.1 Statistical Model

Consider a sample of n observed data pairs χ = {(X_1, Y_1), ..., (X_n, Y_n)} generated from the model

Y_i = g(X_i) + \varepsilon_i, \qquad 1 \le i \le n. \qquad (1)

The design points X_i are either regularly spaced on I = [0, 1] or are the order statistics of a random sample from a distribution having a density f supported on I. The errors ε_i are assumed to be iid with mean 0 and finite variance σ². The unknown function g(·) is smooth except at a finite number of jump discontinuities. For the moment, we assume that the number of jump points is known and, moreover, for ease of presentation, we explain the estimation and bootstrap selection procedure for the case of a single jump discontinuity in the regression function at the point x_0 ∈ ]0, 1[. We address the case of more than one jump discontinuity in Section 3.3 and discuss choosing the number of jump discontinuities in Section 3.4.

2.2 Estimation Procedure

The two-step estimation procedure for estimating x_0 introduced by Gijbels et al. (1999) involves (1) obtaining a rough estimate of x_0, say x̃_0, and (2) improving x̃_0 using least squares estimation in an interval around x̃_0, say [x̃_0 − h_2, x̃_0 + h_2], with h_2 > 0 a bandwidth parameter, to obtain an improved estimator x̂_0 of x_0. The first step, called the diagnostic step, is based on the derivative of the Nadaraya–Watson estimator of the regression function and involves a bandwidth parameter h_1. Section 5 presents some discussion on various aspects of this estimation method.

2.2.1 Diagnostic Step. One way to detect a jump discontinuity is to look at locations with high derivatives. A possible diagnostic function is the derivative of the Nadaraya–Watson estimator (see Nadaraya 1964; Watson 1964), defined as

D(x, h_1) = \frac{\partial}{\partial x} \left( \frac{\sum_{i=1}^{n} K\{(x - X_i)/h_1\}\, Y_i}{\sum_{i=1}^{n} K\{(x - X_i)/h_1\}} \right), \qquad (2)

where K is a compactly supported (with support [−v, v]) differentiable kernel function and h_1 > 0 is a bandwidth. A first rough estimator of x_0 is then given by

\tilde{x}_0 = \arg\max_{x \in\, ]vh_1,\, 1 - vh_1[} |D(x, h_1)|.

The rate of convergence of this preliminary estimator x̃_0 is n^{-1}(\log n)^{1/2}. The second, least squares, step will lead to an n^{-1}-rate estimator (see Gijbels et al. 1999 for further details).
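The following minimal Python sketch (not the authors' code) illustrates the diagnostic step: it evaluates the derivative of a Nadaraya–Watson estimate on a grid and returns the location of the largest absolute value. The Gaussian kernel, the grid resolution, and the function names nw_derivative and rough_changepoint are illustrative assumptions; the Gaussian kernel is not compactly supported, so v is treated here as an effective support radius.

```python
import numpy as np

def nw_derivative(x, X, Y, h1):
    """Derivative in x of the Nadaraya-Watson estimator with a Gaussian kernel."""
    u = (x - X) / h1
    K = np.exp(-0.5 * u ** 2)          # unnormalized Gaussian kernel (constants cancel)
    dK = -u / h1 * K                   # d/dx of K((x - X_i)/h1)
    num, den = np.sum(K * Y), np.sum(K)
    dnum, dden = np.sum(dK * Y), np.sum(dK)
    return (dnum * den - num * dden) / den ** 2   # quotient rule

def rough_changepoint(X, Y, h1, v=3.0, grid_size=400):
    """x~0 = argmax of |D(x, h1)| over ]v*h1, 1 - v*h1[ (v = effective kernel support)."""
    grid = np.linspace(v * h1, 1 - v * h1, grid_size)
    D = np.array([nw_derivative(x, X, Y, h1) for x in grid])
    return grid[np.argmax(np.abs(D))]
```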

2.2.2 Least Squares Step. We construct an interval concentrated around x̃_0, namely [x̃_0 − h_2, x̃_0 + h_2], to which x_0 belongs with high probability. Denote by {i_1, i_1 + 1, ..., i_2} the set of integers i for which X_i ∈ [x̃_0 − h_2, x̃_0 + h_2]. We fit a step function on the interval [x̃_0 − h_2, x̃_0 + h_2] using least squares. We estimate that the jump discontinuity occurs between design points X_{i_0} and X_{i_0+1}, where i_0 is chosen to minimize the sum of squares

\sum_{i=i_1}^{i_0} \Big\{ Y_i - (i_0 - i_1 + 1)^{-1} \sum_{j=i_1}^{i_0} Y_j \Big\}^2 + \sum_{i=i_0+1}^{i_2} \Big\{ Y_i - (i_2 - i_0)^{-1} \sum_{j=i_0+1}^{i_2} Y_j \Big\}^2.

Denote by î_0 the minimizer of this sum of squares. The final estimator of the jump discontinuity x_0 is then defined as the midpoint between X_{î_0} and X_{î_0+1}: x̂_0 = (X_{î_0} + X_{î_0+1})/2.
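A sketch of the least squares step, under the assumption that the design points are sorted; the helper name least_squares_step is illustrative. It tries every split point inside [x̃_0 − h_2, x̃_0 + h_2], keeps the split with the smallest residual sum of squares around the two local means, and returns the midpoint estimator.

```python
import numpy as np

def least_squares_step(X, Y, x_tilde, h2):
    """Fit a step function on [x_tilde - h2, x_tilde + h2]; return (i0_hat, x0_hat)."""
    idx = np.where((X >= x_tilde - h2) & (X <= x_tilde + h2))[0]
    i1, i2 = idx[0], idx[-1]           # assumes X is sorted, so the window is contiguous
    best_rss, best_i0 = np.inf, None
    for i0 in range(i1, i2):           # candidate split between X_{i0} and X_{i0+1}
        left, right = Y[i1:i0 + 1], Y[i0 + 1:i2 + 1]
        rss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if rss < best_rss:
            best_rss, best_i0 = rss, i0
    x_hat = 0.5 * (X[best_i0] + X[best_i0 + 1])   # midpoint of the bracketing design points
    return best_i0, x_hat
```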

2.3 Comparison With an Available Method

In this section we present a comparison between the performance of the two-step estimation method explained earlier and that of the method proposed by Grégoire and Hamrouni (2002), which also achieves the best rate of convergence n^{-1} when using a specific class of kernel functions. The latter method estimates the location of the discontinuity point by the value of t for which |ĝ_+(t) − ĝ_−(t)| is maximal, where ĝ_+ and ĝ_− are local linear regression estimates of the right and left limits of g. This method requires the choice of a bandwidth parameter h, because it involves local linear estimation. As an illustration, we applied the two-step estimation method and this one-step method to the regression model (1) with regression functions g_1 and g_2 given later (Sec. 4.1) in (4) and (5), respectively, and depicted in Figures 3(a) and (b). These functions show one jump discontinuity at the point .5, of sizes 1 and −2. We simulated 100 samples from model (1) with N(0, σ²)-distributed errors and with n = 50 and σ² = .1 for g_1 and n = 200 and σ² = .5 for g_2. For each fixed bandwidth in the set h = .03 + .015j for j = 0, 1, ..., 18, and for both methods, we obtained the estimate of the unknown location point for the 100 samples. Figure 1 provides a boxplot of these 100 values for each fixed value of h that we considered (indicated on the horizontal axis).


Figure 1. Boxplots of the Estimated Values for the Discontinuity Point x_0 for the Functions g_1 and g_2, Using the Two-Step Estimation Method [(b) and (d)] and the Method of Grégoire and Hamrouni (2002) [(a) and (c)]. (a) and (b) Results for g_1; (c) and (d) results for g_2.

Panels (a) and (c) correspond to the one-step method, whereas panels (b) and (d) correspond to the two-step procedure. The outer whiskers extend to the extreme values of the data (i.e., the smallest and largest observations) or to the nearest observation not exceeding a distance of 1.5 times the interquartile range measured from the quartiles, whichever is smaller. This figure illustrates how the methods depend on the choice of the bandwidth parameter and how they perform for finite samples. Figure 1 shows that the choice of the bandwidth parameter is indeed important. It also illustrates that the quadratic function g_1 [panels (a) and (b)] represents an easier example than the cosine function g_2 [panels (c) and (d)].

Another remark is that the method of Grégoire and Hamrouni (2002) tends to be more variable in general, judging from the interquartile ranges. Figure 1(c), for example, shows that the median values (the little white squares) of the estimate are quite far off from the true value of .5 for a whole range of h values. The two-step estimation method performs better here. This illustrates a feature of the latter method: If at the first step the preliminary estimator of x_0 is rather bad (but not too far off), then the second step gives a chance to "correct" for this.

2.4 Choice of the Bandwidth Parameters

Each step in the two-step estimation procedure involves a bandwidth parameter, h_1 or h_2. Both bandwidth choices are crucial for the performance of the procedure, as has been explored and discussed in detail by Gijbels et al. (1999).

The bandwidth h_1 should be small enough to capture the behavior at the jump discontinuity, but not so small that the diagnostic curve shows artificial peaks. The bandwidth h_2 used in the second step determines the interval in which we carry out the least squares approximation, namely [x̃_0 − h_2, x̃_0 + h_2]. If this interval is wide and g differs greatly from a step function, then a local constant approximation may be very poor, leading to a bad estimate of i_0 and hence of x_0. Thus the interval should be as small as possible, but still contain sufficient data points for a reasonable least squares fit.

A simplification is obtained by taking the same bandwidth in each step, that is, h_1 = h_2.


In Section 3 we present the fully data-driven, bootstrap-type procedure for this simplified case. For the more general case of two different bandwidths h_1 and h_2, a fully data-driven procedure has also been developed. Details of this more flexible procedure can be obtained from the authors on request.

2.5 Identification Problem

Gijbels et al. (1999) noted that a simple search for the maximum in the diagnostic function |D(x, h)| might be problematic. As an example, consider the cosine function in (5), depicted in Figure 3(b). This function has many steep declines and inclines. Thus the diagnostic function exhibits many local maxima, and the largest may correspond to a steep gradient instead of to the jump discontinuity. We need to identify the local maximum of the diagnostic function that corresponds to the jump discontinuity. Gijbels et al. (1999) proposed a four-step algorithm for dealing with this identification problem. The algorithm also includes an automatic choice of the bandwidth parameter h_1 in the diagnostic step. Let h_{1,i} = h_0 r^i, for i = 0, 1, 2, ..., be a range of decreasing values of the bandwidth h_1, with h_0 > 0 and 0 < r < 1. The four-step algorithm proceeds as follows:

Step 1: Initialization. Let M denote the number of local maxima of |D(·, h_{1,0})| on ]vh_{1,0}, 1 − vh_{1,0}[. Let {ξ_{0,j}, 1 ≤ j ≤ M} be the set of points at which the maxima are achieved.

Step 2: Iteration. Given a set {ξ_{i,j}, 1 ≤ j ≤ M} of local maxima of |D(·, h_{1,i})| on ]vh_{1,0}, 1 − vh_{1,0}[, let ξ_{i+1,j} denote the local maximum of |D(·, h_{1,i+1})| that is nearest to ξ_{i,j}.

Step 3: Termination. Stop the algorithm at iteration i = ĩ, when the number of data values in some interval ]x − h_{1,i}, x + h_{1,i}[ ⊆ ]vh_{1,0}, 1 − vh_{1,0}[ first falls below a predetermined value. The preliminary estimate x̃_0 of the jump discontinuity x_0 is the value of ξ_{ĩ,j} for which ||D(ξ_{ĩ,j}, h_{1,ĩ})| − |D(ξ_{0,j}, h_{1,0})|| is largest.

Step 4: Least squares. Use local least squares within the interval [x̃_0 − h_2, x̃_0 + h_2] to obtain the final estimator x̂_0.

The first three steps of the algorithm yield a preliminary estimator for x_0. The two parameters h_0 and r that appear in the automatic choice of the bandwidth h_1 are of little importance, because they represent only the starting value of the range of h_1 values and the multiplicative decreasing step in the sequence of h_1 values. The grid of h_1 values should be fine enough and should cover a reasonable range; thus safe choices are a large h_0 (e.g., half the length of the domain of the regression function) and r close to 1.
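A rough Python sketch of this tracking of local maxima, written as one possible reading of the algorithm above; the diagnostic is the Gaussian-kernel version from the earlier sketch, and the termination threshold min_points, the grid size, and all function names are illustrative assumptions rather than the authors' defaults.

```python
import numpy as np

def abs_diag(x, X, Y, h):
    """|D(x, h)| with a Gaussian kernel (same diagnostic as in the earlier sketch)."""
    u = (x - X) / h
    K = np.exp(-0.5 * u ** 2)
    dK = -u / h * K
    num, den = np.sum(K * Y), np.sum(K)
    return abs((np.sum(dK * Y) * den - num * np.sum(dK)) / den ** 2)

def local_maxima(grid, vals):
    """Grid points that are local maxima of the sampled diagnostic."""
    return [grid[j] for j in range(1, len(vals) - 1)
            if vals[j] >= vals[j - 1] and vals[j] >= vals[j + 1]]

def four_step_preliminary(X, Y, h0=0.25, r=0.9, v=3.0, min_points=5, grid_size=300):
    grid = np.linspace(v * h0, 1 - v * h0, grid_size)       # search stays in ]v*h0, 1 - v*h0[
    vals0 = [abs_diag(x, X, Y, h0) for x in grid]
    xi = local_maxima(grid, vals0)                          # xi_{0,j}, j = 1, ..., M
    height0 = [abs_diag(x, X, Y, h0) for x in xi]
    h = h0
    while True:                                             # bandwidths h_{1,i} = h0 * r**i
        h_next = h * r
        # stop when some tracking window contains too few observations
        if min(np.sum(np.abs(X - x) < h_next) for x in xi) < min_points:
            break
        vals = [abs_diag(x, X, Y, h_next) for x in grid]
        maxima = local_maxima(grid, vals)
        if not maxima:
            break
        xi = [min(maxima, key=lambda m: abs(m - x)) for x in xi]   # nearest new maximum
        h = h_next
    heights = [abs_diag(x, X, Y, h) for x in xi]
    j_star = int(np.argmax([abs(a - b) for a, b in zip(heights, height0)]))
    return xi[j_star]                                       # preliminary estimate of x0
```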

3. FULLY DATA-DRIVEN ESTIMATION PROCEDURE

The estimator î_0 obtained as in Section 2.2 is integer-valued and may differ in absolute value from the theoretical (random) i_0 by 0, 1, 2, .... Ideally, the estimator î_0 equals i_0 with high probability. We use a bootstrap procedure to estimate the probability Pr(î_0 − i_0 = 0) ≡ p_0 and to select the bandwidth for which this estimated probability is maximal. We first describe in detail the bootstrap algorithm for estimating p_0, and then specify the actual bandwidth selection procedure. We provide examples of implementations in Section 4 and discuss some alternative selection criteria in Section 5.

3.1 Bootstrap Algorithm

Step 1: Estimation of g and computation of residuals. Let x̂_0 = (X_{î_0} + X_{î_0+1})/2 denote the estimator introduced in Section 2.2. Using local linear regression (see, e.g., Fan and Gijbels 1996), we construct ĝ on [0, x̂_0] and [x̂_0, 1]. We define ε̃_i = Y_i − ĝ(X_i) for i = 1, ..., n, ε̄ the mean of the ε̃_i, and ε̂_i = ε̃_i − ε̄, the centered estimated residuals.

Step 2: Monte Carlo simulation. Conditional on the observed sample χ = {(X_1, Y_1), ..., (X_n, Y_n)}, we consider ε*_1, ..., ε*_n, a resample drawn randomly with replacement from the set ε̂_1, ..., ε̂_n. We define

Y_i^{*} = \hat{g}(X_i) + \varepsilon_i^{*}, \qquad i = 1, \ldots, n.

Then χ* = {(X_1, Y*_1), ..., (X_n, Y*_n)} is the bootstrap version of χ.

Step 3: Determination of the bootstrap probability. Using the method described in Section 2.2, we compute the analogs î*_0 and x̂*_0 = (X_{î*_0} + X_{î*_0+1})/2 of î_0 and x̂_0 for the resample χ* rather than the sample χ. From B bootstrap replications, we obtain B values of î*_0, denoted by î*b_0, b = 1, 2, ..., B, and we evaluate the discrete probability Pr(î*_0 − î_0 = 0 | χ) via

\frac{1}{B} \, \#\{ b : \hat{\imath}_0^{*b} = \hat{\imath}_0 \}. \qquad (3)

Using the foregoing procedure, we can evaluate all discrete probabilities Pr(î*_0 − î_0 = k | χ), k = 0, −1, 1, −2, 2, .... These have been studied theoretically by Gijbels et al. (2004), who showed that under some regularity conditions and for fixed design,

\sup_{k = 0, -1, 1, \ldots} \big| \Pr(\hat{\imath}_0^{*} - \hat{\imath}_0 = k \mid \chi) - \Pr(\hat{\imath}_0 - i_0 = k) \big| \to 0

in probability. The same holds for random design; the only difference is that the probabilities must be considered conditionally on the realized design X_1, ..., X_n.
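A sketch of the residual bootstrap of Section 3.1. The helpers are illustrative: local_linear is a simple Gaussian-weighted local linear smoother standing in for the fit of Fan and Gijbels (1996), and two_step_estimate is a user-supplied function implementing Section 2.2 (it can be assembled from the earlier sketches).

```python
import numpy as np

def local_linear(x0, X, Y, h):
    """Local linear fit at x0 with Gaussian weights; returns the fitted value."""
    w = np.sqrt(np.exp(-0.5 * ((X - x0) / h) ** 2))
    A = np.column_stack([np.ones_like(X), X - x0]) * w[:, None]
    beta, *_ = np.linalg.lstsq(A, Y * w, rcond=None)
    return beta[0]                       # intercept = estimate of g(x0)

def bootstrap_p0(X, Y, h, B, two_step_estimate, g_bandwidth, rng=None):
    """Bootstrap estimate of p0 = Pr(i0_hat* = i0_hat | data), as in Section 3.1."""
    rng = np.random.default_rng(rng)
    i0_hat, x0_hat = two_step_estimate(X, Y, h)          # Section 2.2 (user-supplied helper)
    left = X <= x0_hat
    g_hat = np.array([local_linear(x, X[left], Y[left], g_bandwidth) if x <= x0_hat
                      else local_linear(x, X[~left], Y[~left], g_bandwidth) for x in X])
    resid = Y - g_hat
    resid = resid - resid.mean()                         # centered residuals
    hits = 0
    for _ in range(B):
        Y_star = g_hat + rng.choice(resid, size=len(Y), replace=True)
        i0_star, _ = two_step_estimate(X, Y_star, h)
        hits += int(i0_star == i0_hat)
    return hits / B
```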

3.2 Bootstrap Bandwidth Selection Method

With the foregoing bootstrap procedure, the bandwidth selection is simple and reads as follows: For a set of potential bandwidths h, choose the bandwidth for which the bootstrap estimate of Pr(î_0 − i_0 = 0) is maximal. More precisely, denote by h_j, j = 0, ..., H, the set of potential bandwidths. For each bandwidth h_j, we obtain a preliminary estimator x̃(h_j), that is, the value of x that maximizes the diagnostic function |D(x, h_j)|. Next we construct an interval of length 2h_j around x̃(h_j) and use least squares fitting to obtain the final estimator corresponding to this bandwidth value h_j. Using the bootstrap algorithm described earlier, we then estimate, via (3), the probability Pr(î_0 − i_0 = 0) associated with that fixed bandwidth h_j. Finally, we select the bandwidth that yields the largest bootstrap-estimated probability. We denote this bandwidth by ĥ_boot. We then use this bandwidth to calculate the final estimator of the jump discontinuity. To be safe, one should consider a sufficiently large set of potential bandwidths.
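The selection rule itself is then a short maximization over the candidate grid; a sketch, reusing the illustrative bootstrap_p0 and two_step_estimate helpers from the previous sketch:

```python
import numpy as np

def select_bandwidth(X, Y, candidates, B, two_step_estimate, g_bandwidth, bootstrap_p0):
    """Return (h_boot, estimated p0 per candidate): maximize the bootstrap estimate of p0."""
    p0 = [bootstrap_p0(X, Y, h, B, two_step_estimate, g_bandwidth) for h in candidates]
    return candidates[int(np.argmax(p0))], p0

# Candidate grid used in the simulations of Section 4.1: h_j = .03 + .015 j, j = 0, ..., 18
# candidates = [0.03 + 0.015 * j for j in range(19)]
```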

Functions with possible identification problems, as indicated in Section 2.5, require the (more sophisticated) four-step algorithm. The bandwidth h_1 is chosen automatically by the algorithm, and the bandwidth h_2 is chosen as before.


We select a (large) set of potential bandwidths h_{2,j} for j = 0, ..., H, estimate the discrete probability Pr(î_0 − i_0 = 0) for each of them, and then choose the bandwidth h_2 from the candidate set that maximizes the bootstrap estimate of this probability.

To achieve a fully data-driven bandwidth selection procedure, we use cross-validation (CV) to estimate the (smooth) regression functions. Applying the same bandwidth to estimate g to the left and to the right of x̂_0, and denoting the local linear fit on [0, x̂_0] by ĝ_1 and the local linear fit on [x̂_0, 1] by ĝ_2, we choose the h that minimizes the CV quantity

\mathrm{CV}(h) = \sum_{i=1}^{\hat{\imath}_0} \{ \hat{g}_1^{-i}(X_i; h) - Y_i \}^2 + \sum_{i=\hat{\imath}_0 + 1}^{n} \{ \hat{g}_2^{-i}(X_i; h) - Y_i \}^2,

where ĝ_1^{-i}(·) and ĝ_2^{-i}(·) denote the estimators ĝ_1 and ĝ_2 obtained by discarding the ith data point on the intervals [0, x̂_0] and ]x̂_0, 1]. The CV bandwidth selector is then defined as ĥ_CV = argmin_h CV(h). In our simulation study presented in Section 4, we incorporate more flexibility by allowing two different bandwidths for estimating g to the left and to the right of x̂_0. We discuss alternative data-driven bandwidth selectors for this step in Section 5.
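A sketch of this leave-one-out CV criterion with a common bandwidth on both sides of x̂_0; local_linear is the illustrative smoother from the bootstrap sketch, and the function names are assumptions.

```python
import numpy as np

def cv_score(X, Y, x0_hat, h, local_linear):
    """Leave-one-out CV score CV(h) with a common bandwidth left and right of x0_hat."""
    score = 0.0
    for side in (X <= x0_hat, X > x0_hat):
        Xs, Ys = X[side], Y[side]
        for i in range(len(Xs)):
            keep = np.arange(len(Xs)) != i                 # delete the i-th point on this side
            score += (local_linear(Xs[i], Xs[keep], Ys[keep], h) - Ys[i]) ** 2
    return score

def cv_bandwidth(X, Y, x0_hat, candidates, local_linear):
    """h_CV = argmin over the candidate grid of CV(h)."""
    return min(candidates, key=lambda h: cv_score(X, Y, x0_hat, h, local_linear))
```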

3.3 Generalization for More Than One Discontinuity

The generalization of the data-driven bandwidth selection procedure to the case of more than one jump discontinuity is rather straightforward. For convenience, we use the four-step algorithm of Section 2.5, which generalizes easily to the case of k jump discontinuities x_1, x_2, ..., x_k by slight modifications of Steps 3 and 4:

Step 3′: Termination (addition). Take the preliminary estimates x̃_1, ..., x̃_k of the k jump discontinuities x_1, x_2, ..., x_k to be those values of ξ_{ĩ,j} for which ||D(ξ_{ĩ,j}, h_{1,ĩ})| − |D(ξ_{0,j}, h_{1,0})|| is one of the k largest.

Step 4′: Least squares (addition). Use local least squares within the intervals [x̃_j − h_2, x̃_j + h_2] for j = 1, ..., k, and obtain the final estimators x̂_1, ..., x̂_k of x_1, ..., x_k.

The first three steps of the generalized four-step algorithm lead to preliminary estimators for x_1, ..., x_k, involving an automatic choice of the bandwidth h_1 as in Section 2.5. An illustration of this four-step algorithm is provided in Figure 2, which deals with estimation of the piecewise quadratic function g(x) = 4x² + 1.2 I(x > .2) + .8 I(x > .5), where I(A) denotes the indicator function of a set A. This function has two jump discontinuities, of sizes 1.2 and .8, occurring at .2 and .5. Figure 2 presents four selected iterations of the four-step algorithm with h_0 = .1 and r = .9, corresponding to i = 0, 4, 10, and the final iteration step with ĩ = 15. In the initial iteration step (i = 0), the diagnostic function shows M = 3 local maxima, and hence the four-step algorithm searches in each iteration step for the closest local maxima, indicated by the vertical dotted lines in Figure 2. The preliminary estimates are the two values of ξ_{ĩ,j}, j = 1, 2, 3, for which ||D(ξ_{ĩ,j}, h_{1,ĩ})| − |D(ξ_{0,j}, h_{1,0})|| is one of the two largest. These differences are the differences in magnitude between the maximal values of the diagnostic function achieved in the first and the last iteration steps for the three local maxima. The first three steps of the four-step algorithm track back the jump points nicely and lead to the preliminary estimates .203 and .511 of the true values .2 and .5.

For the least squares fitting, we select, via the bootstrap algorithm, a bandwidth h_2 for each jump discontinuity separately. More precisely, we select a (large) set of potential bandwidths h_{2,j} for j = 0, ..., H and then estimate, for each jump point x_ℓ, ℓ = 1, ..., k, the discrete probability Pr(î_{0,ℓ} − i_{0,ℓ} = 0) ≡ p_{0,ℓ}, where the extra subscript ℓ refers to the jump discontinuity x_ℓ. For each jump discontinuity, we choose the bandwidth h_2 from the candidate set that maximizes the bootstrap estimate of the probability p_{0,ℓ}.

We need to ensure that the interval around x̃_ℓ in the least squares step contains only one jump discontinuity. Hence the set of potential bandwidths should not contain large bandwidth values.

3.4 Determining the Number of Discontinuities

In most practical examples, the number of discontinuities k (k ≥ 0) is not known, and hence we use the automatic bootstrap selection method described previously. We compare the quality of the fitted curves for various fixed values of k and select the one with the best fit in terms of minimizing the CV sum of squares

\mathrm{CV}(k) = \sum_{i=1}^{n} \big( Y_i - \hat{g}_k^{-i}(X_i) \big)^2,

where ĝ_k^{-i}(X_i) is the fit obtained at X_i, assuming k changepoints and excluding the data point (X_i, Y_i) when constructing the fit. This CV approach was proposed by Müller and Stadtmüller (1999). The number of discontinuities k is then estimated by

\hat{k} = \arg\min_{k \in \{0, 1, \ldots\}} \mathrm{CV}(k),

and the estimated curve is then the one associated with this number of discontinuities. With this CV rule for selecting the number of discontinuities, we have a fully data-driven method for estimating curves with possible jump discontinuities.
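A sketch of the CV rule for k; fit_with_k_jumps is a hypothetical stand-in for the full data-driven fit with k estimated jumps, returning the leave-one-out prediction at X_i.

```python
import numpy as np

def select_num_jumps(X, Y, k_max, fit_with_k_jumps):
    """k_hat = argmin_k CV(k); fit_with_k_jumps(X, Y, k, leave_out=i) predicts Y_i at X_i."""
    cv = [sum((Y[i] - fit_with_k_jumps(X, Y, k, leave_out=i)) ** 2 for i in range(len(Y)))
          for k in range(k_max + 1)]
    return int(np.argmin(cv)), cv
```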

4. NUMERICAL STUDY

In this section we evaluate, via a simulation study, the fully data-driven estimation procedure developed in Section 3 and provide an illustration with some real data. Emphasis is on the evaluation of the bootstrap bandwidth selection method, but we also briefly demonstrate the performance of the CV method in determining the number of discontinuities, in conjunction with the bootstrap selection method.

4.1 Simulation Study

We consider the regression functions

g_1(x) = 4x^2 + I(x > .5) \qquad (4)

and

g_2(x) = \cos\{8\pi(.5 - x)\} - 2\cos\{8\pi(.5 - x)\}\, I(x > .5). \qquad (5)


Figure 2. Performance of the Diagnostic Function for the Piecewise Quadratic Function With n = 50 and σ² = .1, Using the Four-Step Algorithm. (a)–(d) Plots of |D(x, h_{1,i})| for i = 0, 4, 10, and i = ĩ = 15.


Figure 3. The True Regression Functions (solid curves) With a Typical Simulated Dataset of Size n = 100 and Variance σ² = .5 for Regression Functions (a) g_1 and (b) g_2.

We considered a fixed equidistant design, x_i = i/n for i = 1, ..., n. The errors ε_i are Gaussian with variance σ² = .1 or .5. We present simulation results for sample sizes n = 50, 100, or 200. Figure 3 presents the true regression functions g_1 and g_2 with typical simulated datasets for sample size n = 100 and σ² = .5.
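This simulation setup can be reproduced with a few lines; the function names g1, g2, and simulate are illustrative, and the formulas follow (4), (5), and the design just described.

```python
import numpy as np

def g1(x):
    """Function (4): quadratic with a jump of size 1 at .5."""
    return 4 * x ** 2 + (x > 0.5)

def g2(x):
    """Function (5): cos{8*pi*(.5 - x)} * (1 - 2*1{x > .5}), jump of size -2 at .5."""
    return np.cos(8 * np.pi * (0.5 - x)) * (1 - 2 * (x > 0.5))

def simulate(g, n, sigma2, rng=None):
    """Equidistant design x_i = i/n with iid N(0, sigma2) errors, as in Section 4.1."""
    rng = np.random.default_rng(rng)
    X = np.arange(1, n + 1) / n
    return X, g(X) + rng.normal(scale=np.sqrt(sigma2), size=n)

# Example: a dataset like the one in Figure 3(a)
# X, Y = simulate(g1, n=100, sigma2=0.5)
```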

For the diagnostic function in (2), we use a standard Gaussian kernel, 1,000 simulations, and B = 2,000 bootstrap replicates. For each example, we use the CV bandwidths defined in Section 3.2 to choose the smoothing parameters in the first step of the bootstrap algorithm described in Section 3.1.

We first investigate the performance of the bootstrap bandwidth selection method for each of the regression functions. We simulate from model (1) with unknown regression function g_1. We consider the set of potential bandwidths h_j = .03 + .015j for j = 0, ..., 18. Figure 4(a) depicts a simulated dataset for n = 50 and σ² = .1, together with the true function g_1(·). Also presented is a local linear estimator of the regression function, obtained by applying local linear fitting to the left and right of the estimated jump point. Figure 4(b) shows the bootstrap estimates of the probability p_0 associated with the potential bandwidths h_j for j = 0, ..., 18. For this particular simulation, the bandwidth selected by the bootstrap algorithm is ĥ_boot = .075, which is the bandwidth with the largest bootstrap-estimated probability. With this bandwidth, the final estimator of the discontinuity point is x̂_0 = .51. To estimate the variability of the bootstrap algorithm for selecting h = h_1 = h_2, Figure 4(c) presents a kernel density estimate of the 1,000 bandwidths ĥ_boot selected for the 1,000 simulations.

Figure 4. (a) A Simulated Dataset With n = 50 and σ² = .1, Together With the True Regression Function g_1 (solid curve) and a Local Linear Estimator Adapted to the Estimated Changepoint (dashed curve); (b) the Bootstrap Estimate of the Probability p_0 for a Range of Values of the Bandwidth h for That Simulated Dataset; and (c) a Kernel Density Estimate of the Bandwidth ĥ_boot Selected by the Bootstrap Procedure, Based on 1,000 Simulations.


Table 1. Simulation Results for the Functions g_1 and g_2; Evaluation of the Bootstrap Bandwidth Selection Method

                           g_1                               g_2
                   σ² = .1            σ² = .5       σ² = .1            σ² = .5
                   n = 50   n = 100   n = 100       n = 100  n = 200   n = 200
% in [0, .15[        0        0          .2            .1       .5        1.6
% in [.15, .25[      0        0          .4           1.8       .5        7.1
% in [.25, .35[      0        0          .7            .3       .7        9.2
% in [.35, .45[       .1       .1       2.2           2.5       .7        4.5
% in [.45, .55[    93.9     99.5       78.4          91.2     96.4       58.8
% in [.55, .65[     4.9       .4        9.8           1.6       .4        4.8
% in [.65, .75[       .1      0         3              .9       .4        6.3
% in [.75, .85[       .6      0         3.2           1.6       .4        7.5
% in [.85, 1.0[       .4      0         2.1            0        0         2
Mean of x̂_0       .517640  .50685     .533130       .504740  .499390    .491135
SD of x̂_0         .035288  .007330    .091399       .06589   .044386    .151566

This graph shows that for these 1,000 simulations, the bootstrap-selected bandwidth ĥ_boot has a density function concentrated around the value .0488. The little mode around .3 can be explained by the fact that for the function g_1, bandwidths of larger order are also quite appropriate, because the function is overall increasing, and large bandwidths will still allow one to detect the jump discontinuity.

Table 1 summarizes the simulation results for functions g_1 and g_2. Presented are the percentages of the estimated values x̂_0 falling in the specified intervals. The last two rows list the means and standard deviations of x̂_0 across the 1,000 simulations. The function g_2 presents some specific difficulties, because it shows many fluctuations, with regions of steep increase followed by regions of steep decrease. To identify the "appropriate" local maximum, we use the four-step algorithm as specified in Section 2.5. For the bandwidth h_1 we considered the set of possible values h_{1,i} = .1 × .9^i for i = 0, 1, ..., and we took the set of potential bandwidths h_2 for the least squares step to be h_{2,j} = .03 + .015j for j = 0, 1, ..., 5. For the function g_2, we used fitting with linear functions instead of constant functions in the least squares step of the estimation procedure (see also Section 5 for some discussion of this issue). Note that the data-driven method performs very well even for these rather small sample sizes.

We now demonstrate the performance of the fully data-driven procedure, including the CV choice of the number of jump points, based on 100 simulations, 1,000 bootstrap replicates, and possible values for k of 0, 1, 2, 3, or 4. For each example, we use the CV bandwidth to choose the smoothing parameter needed to determine ĝ_k^{-i}. Table 2 summarizes the simulation results for the functions g_1 and g_2 and for the smooth function g_0(x) = x² using different sample sizes and values of σ². Presented are the frequencies (out of 100) with which the estimated value k̂ equals each of the specified values. Clearly, the CV choice of k seems to work nicely.

4.2 Application

As an illustration, we now apply the fully data-driven estimation method to a real dataset. The application concerns 215 average annual temperatures measured in Prague from 1775 to 1989, discussed by Horváth and Kokoszka (1997), who analyzed this dataset to detect climatic changes occurring over a span of several years or a decade. For this dataset, previous analyses often took the number of jump points to be either two or three. Antoniadis and Gijbels (2002) also studied these data from 1775 to 1902 and, using a wavelet method, found two jump points occurring at 1787 and 1837.

We apply the fully data-driven procedure to these data and search for changepoints between 1775 and 1989. From the CV criterion, we find k̂ = 3, and the locations of the jumps are estimated to be 1786.5, 1836.5, and 1942.5. This agrees with previous analyses, in particular with that of Horváth and Kokoszka (1997). Horváth, Kokoszka, and Steinebach (1999) analyzed these data considering models for dependent observations. They tested for changes in the mean temperature and found that changes occurred in the years 1835, 1893, and 1927. Note that the changepoint around 1836 was also an important one in our analysis. Figure 5(a) presents a smooth fit using local linear fitting. Figures 5(b)–(e) depict the fitted curves assuming k = 1, 2, 3, and 4 discontinuities. The large negative slope of the first fitted curve in Figures 5(c)–(e) looks rather strange. A possible explanation is that during this period, some errors appeared in the measurements of the temperatures. Note also that the first discontinuity occurs near the boundary, and hence very little data are used for the fitting. The plot of the CV function CV(k) is provided in Figure 5(f).

Table 2. Simulation Results for the CV Choice of k

         Function g_0            Function g_1            Function g_2
         n = 100    n = 200      n = 100    n = 200      n = 100    n = 200
 k      σ²=.1  .5   σ²=.1  .5    σ²=.1  .5  σ²=.1  .5    σ²=.1  .5  σ²=.1  .5
 0        80   72    88    77      1    7     0     1      1    8     3     2
 1         7    8     4     7     83   73    87    79     80   71    88    84
 2         5    8     4     3      6    6     0     4      7   18     2     3
 3         8   11     6     9      8   12     4    15      8    1     4     8
 4         0    1     2     4      2    2     9     1      4    2     3     3


Figure 5. Average Annual Temperatures in Prague With Superimposed Regression Fits. (a) Smooth fit without discontinuity and fit with (b) one discontinuity, (c) two discontinuities, (d) three discontinuities, and (e) four discontinuities; (f) the CV function CV(k).

5. DISCUSSION

5.1 Options for the Two-Step Estimation Method

The two-step estimation method of Gijbels et al. (1999) used here offers several options: (1) the diagnostic function could be any other consistent estimator of the derivative of the regression function; (2) the least squares step could involve fitting any appropriate parametric function; and (3) the final estimation of the function could be based on any other good nonparametric estimator. In our context of the bandwidth selection problem, we opted for implementing the simplest options: the Nadaraya–Watson estimator as a basis for the diagnostic function, fitting with constant functions in the least squares step, and local linear estimation for constructing the final estimator. For the final estimation we opted for local linear estimation, because it is known to handle boundary effects better than local constant approximation. When using the two-step estimation procedure, one should be aware of the various options and exploit them according to the context. For example, when dealing with functions for which one suspects significant curvature (such as the cosine function in our simulation study), we recommend using least squares fitting with a polynomial of degree at least 1 to better capture the curvature.

5.2 Alternative Bootstrap-Based Criteria for Selecting the Bandwidth

Our bandwidth selection criterion is based on maximizing the bootstrap estimate of the probability Pr(î_0 − i_0 = 0). Here we discuss some alternatives.

A first alternative criterion is to focus on minimizing the mean squared error (MSE) of î_0,

\mathrm{MSE}(\hat{\imath}_0) = \sum_{k=-\infty}^{+\infty} k^2 \Pr(\hat{\imath}_0 - i_0 = k) = \{\mathrm{Bias}(\hat{\imath}_0)\}^2 + \mathrm{var}(\hat{\imath}_0).

Each probability Pr(î_0 − i_0 = k) can be estimated by Pr(î*_0 − î_0 = k | χ), and thus this criterion amounts to taking the bandwidth that minimizes, using (3),

\widehat{\mathrm{MSE}}(\hat{\imath}_0) = \sum_{k=-\infty}^{+\infty} k^2 \Pr(\hat{\imath}_0^{*} - \hat{\imath}_0 = k \mid \chi).

A second alternative criterion is simply to minimize the estimated variance of î_0,

\widehat{\mathrm{var}}(\hat{\imath}_0) = \sum_{k=-\infty}^{+\infty} k^2 \Pr(\hat{\imath}_0^{*} - \hat{\imath}_0 = k \mid \chi) - \Big\{ \sum_{k=-\infty}^{+\infty} k \Pr(\hat{\imath}_0^{*} - \hat{\imath}_0 = k \mid \chi) \Big\}^2.
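Both alternatives are simple functions of the bootstrap replicates; a sketch (variable names are illustrative):

```python
import numpy as np

def bootstrap_mse_var(i0_star, i0_hat):
    """MSE and variance criteria from B bootstrap split indices i0_star and the original i0_hat."""
    k = np.asarray(i0_star, dtype=float) - i0_hat   # bootstrap analog of i0_hat - i0
    mse = np.mean(k ** 2)                           # sum_k k^2 * Pr(k), with Pr(k) = relative frequency
    var = mse - np.mean(k) ** 2                     # second moment minus squared first moment
    return mse, var
```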


We compared the performance of these alternative criteria with that of the proposed criterion. The results from 200 simulations for, for example, the quadratic function g_1 (with n = 50 and σ² = .1) revealed that the percentages of estimated values x̂_0 falling in the interval [.45, .55[ were 97.5%, 87.5%, and 91% for the criteria based on the maximum probability, the minimum MSE, and the minimum variance. For the cosine function g_2 (with n = 100 and σ² = .1), these percentages were 90.5%, 91.5%, and 91%. The means and standard deviations of the estimator x̂_0 were very close for the three criteria. Because the data-driven methods based on the different bandwidth selection criteria perform comparably, we opted for the simplest criterion and focused on maximization of Pr(î*_0 − î_0 = 0 | χ).

5.3 Alternatives to the Cross-Validation Bandwidth

For the final estimation of the curve with discontinuities, we use local linear estimation with CV bandwidth selectors. We opted for CV because it worked well throughout our extensive simulation study. Of course, here the user also has the option of choosing his or her favorite data-driven bandwidth selector. To illustrate this point, we implemented more sophisticated bandwidth selectors to replace the CV selectors that we used: the data-driven constant (global) and variable bandwidth selectors proposed by Fan and Gijbels (1995) in the context of locally weighted least squares regression. For the 1,000 simulations used to produce Table 1, we found for the quadratic function g_1 (with n = 50 and σ² = .1) that the percentages of x̂_0 falling in [.45, .55[ were 93.3% and 93.3% when using the constant and variable bandwidth selectors, with means (and standard deviations) of x̂_0 of .519 (.0378) and .519 (.0377). These figures should be compared with the corresponding ones in Table 1 (i.e., 93.9%, mean .518, and standard deviation .0353). For the cosine function (with n = 100 and σ² = .1), we obtained 92.4%, mean .507 (standard deviation .0588) and 92.4%, mean .509 (standard deviation .0593) for the constant and variable bandwidths. Hence very little difference results from using the more sophisticated bandwidth selectors in these examples. There is a small gain in using the better bandwidth selectors when dealing with the (more difficult) cosine function. Similar conclusions hold for all other simulations that we carried out (and do not report here). In cases where the functions are less nicely behaved to the right and the left of jump points, it might be worthwhile to choose better bandwidth selectors.

5.4 Further Discussion

So far, no theoretical results have been established for the data-driven procedure. From the literature, we know that for appropriately chosen fixed bandwidths, the two-step estimator of a jump discontinuity achieves the optimal rate n^{-1}. This would be expected to continue to hold when using the data-driven bandwidths (based on a consistent bootstrap procedure), but theoretical work is needed to establish the rate of convergence of the data-driven estimation method.

Jump discontinuities represent only one type of irregularity that might occur in an otherwise smooth regression function. Other types of irregularities include changes in the derivative functions. The fully data-driven estimation procedure developed in this article can be adapted for detecting jumps in derivative functions. This essentially requires the use of an appropriate diagnostic function and an appropriate family of parametric functions in the least squares step (see Gijbels and Goderniaux 2004b).

One also might be interested in testing whether or not an unknown regression function has jump discontinuities. Such testing problems are rather difficult and are mostly dealt with via asymptotic theory. We have proposed bootstrap testing procedures based on the two-step estimation method that seem to perform very well in comparison with other testing procedures available in the literature. They do not rely on asymptotics and are data-driven, so that the user is not left with a difficult and crucial choice of some smoothing parameter (see Gijbels and Goderniaux 2004a).

ACKNOWLEDGMENTS

Financial support from the "Projet d'Actions de Recherche Concertées," No. 98/03–217, and from the IAP research network No. P5/24 of the Federal Office for Scientific, Technical, and Cultural Affairs of the Belgian government is gratefully acknowledged. The authors are very grateful to the editor, an associate editor, and two anonymous referees for their excellent review reports, which led to important improvements in the article.

[Received October 2000. Revised July 2002.]

REFERENCES

Antoniadis, A., and Gijbels, I. (2002), "Detecting Abrupt Changes by Wavelet Methods," Journal of Nonparametric Statistics, 14, 7–29.

Beunen, J. A. (1998), "Implementation of a Method for Detection and Location of Discontinuities," unpublished master's thesis, University of Newcastle, Australia.

Bunt, M., Koch, I., and Pope, A. (1995), "Kernel-Based Nonparametric Regression for Discontinuous Functions," Statistics Research Report SRR 012-95, The Australian National University.

Bunt, M., Koch, I., and Pope, A. (1998), "Counting Discontinuities in Nonparametric Regression," Research Report 98-2, University of Newcastle, Australia.

Canny, J. (1986), "A Computational Approach to Edge Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, 8, 679–698.

Chu, C. K. (1994), "Estimation of Change-Points in a Nonparametric Regression Function Through Kernel Density Estimation," Communications in Statistics, Part A—Theory and Methods, 23, 3037–3062.

Eubank, R. L., and Speckman, P. L. (1994), "Nonparametric Estimation of Functions With Jump Discontinuities," in Change-Point Problems, eds. E. Carlstein, H.-G. Müller, and D. Siegmund, Hayward, CA: IMS, pp. 130–143.

Fan, J., and Gijbels, I. (1995), "Data-Driven Bandwidth Selection in Local Polynomial Fitting: Variable Bandwidth and Spatial Adaptation," Journal of the Royal Statistical Society, Ser. B, 57, 371–394.

Fan, J., and Gijbels, I. (1996), Local Polynomial Modelling and Its Applications, New York: Chapman & Hall.

Gijbels, I., and Goderniaux, A.-C. (2004a), "Bootstrap Test for Change Points in Nonparametric Regression," Journal of Nonparametric Statistics, to appear.

Gijbels, I., and Goderniaux, A.-C. (2004b), "Data-Driven Discontinuity Detection in Derivatives of a Regression Function," Communications in Statistics—Theory and Methods, 33, to appear.

Gijbels, I., Hall, P., and Kneip, A. (1999), "On the Estimation of Jump Points in Smooth Curves," The Annals of the Institute of Statistical Mathematics, 51, 231–251.

Gijbels, I., Hall, P., and Kneip, A. (2004), "Interval and Band Estimation for Curves With Jumps," Journal of Applied Probability, special issue "Stochastic Methods and Their Applications," in honor of Chris Heyde, 41A, to appear.

Girard, D. (1990), "Détection de discontinuités dans un signal par inf-convolution spline et validation croisée," Technical Report RR 702-I-M, University of Grenoble, France (in French).

Grégoire, G., and Hamrouni, Z. (2002), "Change-Point Estimation by Local Linear Smoothing," Journal of Multivariate Analysis, 83, 56–83.

Hall, P., and Titterington, D. M. (1992), "Edge-Preserving and Peak-Preserving Smoothing," Technometrics, 34, 429–440.

Hamrouni, Z. (1999), "Inférence statistique par lissage linéaire local pour une fonction de régression présentant des discontinuités," unpublished doctoral thesis, Université de Joseph Fourier, Grenoble, France (in French).

Horváth, L., and Kokoszka, P. (1997), "Change-Point Detection With Nonparametric Regression," technical report, University of Liverpool.

Horváth, L., Kokoszka, P., and Steinebach, J. (1999), "Testing for Changes in Multivariate Dependent Observations With an Application to Temperature Changes," Journal of Multivariate Analysis, 68, 96–119.

Kang, K.-H., Koo, J.-Y., and Park, C.-W. (2000), "Kernel Estimation of Discontinuous Regression Functions," Statistics & Probability Letters, 47, 277–285.

Koo, J.-Y. (1997), "Spline Estimation of Discontinuous Regression Function," Journal of Computational and Graphical Statistics, 6, 266–284.

Korostelev, A. P. (1987), "On Minimax Estimation of a Discontinuous Signal," Theory of Probability and Its Applications, 32, 727–730.

Laurent, P. J., and Utreras, F. (1986), "Optimal Smoothing of Noisy Broken Data With Spline Functions," Journal of Approximation Theory and Its Applications, 2, 71–94.

Leclerc, Y. C., and Zucker, S. W. (1987), "The Local Structure of Image Discontinuities in One Dimension," IEEE Transactions on Pattern Analysis and Machine Intelligence, 9, 341–355.

Loader, C. R. (1996), "Change Point Estimation Using Nonparametric Regression," The Annals of Statistics, 24, 1667–1678.

Mallat, S., and Hwang, W. L. (1992), "Singularity Detection and Processing With Wavelets," IEEE Transactions on Information Theory, 2, 617–643.

McDonald, J. A., and Owen, A. B. (1986), "Smoothing With Split Linear Fits," Technometrics, 28, 195–208.

Müller, H.-G. (1992), "Change-Points in Nonparametric Regression Analysis," The Annals of Statistics, 20, 737–761.

Müller, H.-G., and Song, K.-S. (1997), "Two-Stage Change-Point Estimators in Smooth Regression Models," Statistics & Probability Letters, 34, 323–335.

Müller, H.-G., and Stadtmüller, U. (1999), "Discontinuous Versus Smooth Regression," The Annals of Statistics, 27, 299–337.

Nadaraya, E. A. (1964), "On Estimating Regression," Theory of Probability and Its Applications, 9, 141–142.

Oudshoorn, C. G. M. (1998), "Asymptotically Minimax Estimation of a Function With Jumps," Bernoulli, 4, 15–33.

Potier, C., and Vercken, C. (1994), "Spline Fitting Numerous Noisy Data With Discontinuities," in Curves and Surfaces, eds. P.-J. Laurent, A. Le Mehaute, and L. L. Schumaker, New York: Academic Press, pp. 477–480.

Qiu, P., and Yandell, B. (1998), "A Local Polynomial Jump-Detection Algorithm in Nonparametric Regression," Technometrics, 40, 141–152.

Raimondo, M. (1998), "Minimax Estimation of Sharp Change Points," The Annals of Statistics, 26, 1379–1397.

Speckman, P. L. (1994), "Detection of Change-Points in Nonparametric Regression," unpublished manuscript.

Spokoiny, V. G. (1998), "Estimation of a Function With Discontinuities via Local Polynomial Fit With an Adaptive Window Choice," The Annals of Statistics, 26, 1356–1378.

Wang, Y. (1995), "Jump and Sharp Cusp Detection by Wavelets," Biometrika, 82, 385–397.

Watson, G. S. (1964), "Smooth Regression Analysis," Sankhya, Ser. A, 26, 359–372.

Wu, J. S., and Chu, C. K. (1993a), "Kernel-Type Estimation of Jump Points and Values of Regression Function," The Annals of Statistics, 21, 1545–1566.

Wu, J. S., and Chu, C. K. (1993b), "Modification for Boundary Effects and Jump Points in Nonparametric Regression," Nonparametric Statistics, 2, 341–354.

Wu, J. S., and Chu, C. K. (1993c), "Nonparametric Function Estimation and Bandwidth Selection for Discontinuous Regression Functions," Statistica Sinica, 3, 557–576.

Yin, Y. Q. (1988), "Detection of the Number, Locations and Magnitudes of Jumps," Communications in Statistics, Stochastic Models, 4, 445–455.
