
Robust estimators for additive models using backfitting

Graciela Boente
Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and CONICET, Argentina

Alejandra Martínez
Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and CONICET, Argentina

and Matias Salibián-Barrera
University of British Columbia, Canada

Abstract

Additive models provide an attractive setup to estimate regression functions in a nonparametric context. They provide a flexible and interpretable model, where each regression function depends only on a single explanatory variable and can be estimated at an optimal univariate rate. It is easy to see that most estimation procedures for these models are highly sensitive to the presence of even a small proportion of outliers in the data. In this paper, we show that a relatively simple robust version of the backfitting algorithm (consisting of using robust local polynomial smoothers) corresponds to the solution of a well-defined optimization problem. This formulation allows us to find mild conditions to show that these estimators are Fisher consistent and to study the convergence of the algorithm. Our numerical experiments show that the resulting estimators have good robustness and efficiency properties. We illustrate the use of these estimators on a real data set where the robust fit reveals the presence of influential outliers.

Keywords: Fisher-consistency; Kernel weights; Robust estimation; Smoothing

1 Introduction

Consider a general regression model, where a response variable Y ∈ R is related to a vector X = (X1, . . . , Xd)t ∈ R^d of explanatory variables through the following non-parametric regression model:

Y = g0(X) + σ0 ε . (1)

The error ε is assumed to be independent from X and centered at zero, while σ0 is the error scale parameter. When ε has a finite first moment, we have the usual regression representation E(Y | X) = g0(X). Standard estimators for g0 can thus be derived relying on local estimates of the conditional mean, such as kernel polynomial regression estimators. It is easy to see that such procedures can be seriously affected either by a small proportion of outliers in the response variable, or when the distribution of Y | X has heavy tails. Note, however, that even when ε does not have a finite first moment, the function g0(X) can still be interpreted as a location parameter for the distribution of Y | X. In this case, local robust estimators can be used to estimate the regression function as, for example, the local M-estimators proposed in Boente and Fraiman (1989) and the local medians studied in Welsh (1996).
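As a small illustration of the local medians mentioned above, the sketch below (our own minimal Python version; the smoother studied in Welsh, 1996, is a proper statistical procedure with bandwidth theory) estimates the centre of Y | X = x0 by the median of the responses in a window around x0, and shows that a single gross outlier barely moves it.

```python
import numpy as np

def local_median(x0, x, y, h):
    """Local median: the median of the responses whose covariate lies
    within a window of half-width h around x0."""
    mask = np.abs(x - x0) <= h
    return np.median(y[mask])

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=200)
idx = np.argmin(np.abs(x - 0.5))   # a design point inside the window at 0.5
y_out = y.copy()
y_out[idx] = 50.0                  # one gross vertical outlier
clean = local_median(0.5, x, y, 0.1)
dirty = local_median(0.5, x, y_out, 0.1)
print(clean, dirty)                # nearly identical: the median resists it
```

A local mean computed on the same window would instead shift by roughly 50/n_window.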

Unfortunately, both robust and non-robust non-parametric regression estimators are affected by the curse of dimensionality, which is caused by the fact that the expected number of observations in local neighbourhoods decreases exponentially as a function of d, the number of covariates. This results in regression estimators with a very slow convergence rate. Stone (1985) showed that additive models can avoid these problems and produce non-parametric multiple regression estimators with a univariate rate of convergence. In an additive model, the regression function is assumed to satisfy

g0(X) = µ0 + ∑_{j=1}^{d} g0,j(Xj) , (2)

where µ0 ∈ R and g0,j : R → R, 1 ≤ j ≤ d, are unknown smooth functions with E(g0,j(Xj)) = 0. Such a model retains the ease of interpretation of linear regression models, where each component g0,j can be thought of as the effect of the j-th covariate on the centre of the conditional distribution of Y. Moreover, Linton (1997), Fan et al. (1998) and Mammen et al. (1999) obtained different oracle properties showing that each additive component can be estimated as well as when all the other ones are known.

Several algorithms to fit additive models have been proposed in the literature. In this paper, we focus on the backfitting algorithm as introduced in Friedman and Stuetzle (1981) and discussed further in Buja et al. (1989). The backfitting algorithm can be intuitively motivated by observing that, if (2) holds, then

g0,j(x) = E( Y − µ0 − ∑_{ℓ≠j} g0,ℓ(Xℓ) | Xj = x ) . (3)

Hence, given a sample, the backfitting algorithm iteratively estimates the components g0,j, 1 ≤ j ≤ d, using a univariate smoother of the partial residuals in (3) as functions of the j-th covariate. This algorithm is widely used due to its flexibility (different univariate smoothers can be used), ease of implementation and intuitive motivation. Furthermore, it has been shown to work very well in simulation studies (Sperlich et al. 1999) and applications, although its statistical properties are difficult to study due to its iterative nature. Some results regarding its bias and conditional variance can be found in Opsomer and Ruppert (1997), Wand (1999) and Opsomer (2000).
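The iteration just described can be sketched in a few lines. The following is our own simplified Python illustration of classical backfitting, using a Nadaraya-Watson smoother with a Gaussian kernel (the paper's smoothers and bandwidth choices differ); each pass smooths the partial residuals against one covariate and re-centres the fitted component.

```python
import numpy as np

def kernel_smooth(x, r, h):
    """Nadaraya-Watson smoother of partial residuals r against covariate x."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)  # Gaussian weights
    return (w * r[None, :]).sum(axis=1) / w.sum(axis=1)

def backfit(X, y, h=0.1, n_iter=20):
    """Classical backfitting: cycle through coordinates, smoothing the
    partial residuals of (3), then re-centre each component to mean zero."""
    n, d = X.shape
    mu = y.mean()
    g = np.zeros((n, d))
    for _ in range(n_iter):
        for j in range(d):
            r = y - mu - g.sum(axis=1) + g[:, j]   # partial residuals for j
            g[:, j] = kernel_smooth(X[:, j], r, h)
            g[:, j] -= g[:, j].mean()              # identifiability constraint
    return mu, g

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(300, 2))
y = 4 * (X[:, 0] - 0.5) ** 2 + np.sin(np.pi * X[:, 1]) + 0.05 * rng.normal(size=300)
mu, g = backfit(X, y)
mse = np.mean((y - (mu + g.sum(axis=1))) ** 2)
print(mse)   # small in-sample residual mean square
```

Note the lack of robustness: a single huge response would be spread by the linear smoother across all fitted components, which is the failure mode the paper addresses.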

When second moments exist, Breiman and Friedman (1985) showed that, under certain regularity conditions, the backfitting procedure finds functions m1(X1), . . . , md(Xd) minimizing E(Y − µ0 − ∑_{j=1}^{d} mj(Xj))^2 over the space of functions with E[mj(Xj)] = 0 and finite second moments. In other words, even if the regression function g0 in (1) does not satisfy the additive model (2), the backfitting algorithm finds the orthogonal projection of the regression function onto the linear space of additive functions in L2. Equivalently, backfitting finds the closest additive approximation (in the L2 sense) to E(Y | X1, . . . , Xd). Furthermore, the backfitting algorithm is a coordinate-wise descent algorithm minimizing the squared loss functional above. The sample version of the algorithm solves a system of nd × nd normal equations and corresponds to the Gauss-Seidel algorithm for linear systems of equations.

If the smoother chosen to estimate (3) is not resistant to outliers, then the estimated additive components can be seriously affected by a relatively small proportion of atypical observations. Given the local nature of non-parametric regression estimators, we will be concerned with the case where outliers are present in the response variable. Bianco and Boente (1998) considered robust estimators for additive models using kernel regression, which are a robust version of those defined in Baek and Wehrly (1993). The main drawback of this approach is that it assumes that Y − g0,j(Xj) is independent from Xj, which is difficult to justify or verify in practice. Outlier-resistant fits for generalized additive models have been considered recently in the literature. When the variance is a known function of the mean and the dispersion parameter is known, we refer to Alimadad and Salibián-Barrera (2012) and Wong et al. (2014), who consider robust fits based on backfitting and penalized splines M-estimators, respectively. In the case of model (1), the approach of Wong et al. (2014) reduces to that of Oh et al. (2007), which is an alternative based on penalized splines. On the other hand, Croux et al. (2011) provides a robust fit for generalized additive models with nuisance parameters using penalized splines, but no theoretical support is provided for their method.

In this paper, we consider an intuitively appealing way to obtain robust estimators for model (1) which combines the backfitting algorithm with robust univariate smoothers. For example, one can consider those proposed in Boente and Fraiman (1989), Härdle and Tsybakov (1988), Härdle (1990) and Oh et al. (2007). One of the main contributions of this paper is to show that this intuitive approach to obtain a robust backfitting algorithm is well justified. Specifically, we show that applying the backfitting algorithm using the robust non-parametric regression estimators of Boente and Fraiman (1989) corresponds to minimizing E[ρ((Y − µ0 − ∑_{j=1}^{d} mj(Xj))/σ0)] over functions m1(X1), . . . , md(Xd) with E[mj(Xj)] = 0, where ρ is a loss function. Furthermore, this robust backfitting corresponds to a coordinate-wise descent algorithm and can be shown to converge. We also establish sufficient conditions for these robust backfitting estimators to be Fisher consistent for the true additive components when (2) holds. Our numerical experiments confirm that these estimators have very good finite-sample properties, both in terms of robustness, and efficiency with respect to the classical approach when the data do not contain outliers. These robust estimators cannot be interpreted as orthogonal projections of the regression function onto the space of additive functions of the predictors. However, the first-order conditions for the minimum of this optimization problem are closely related to the robust conditional location functional defined in Boente and Fraiman (1989).

The rest of the paper is organized as follows. In Section 2, we show that the robust backfitting algorithm mentioned above corresponds to a coordinate-descent algorithm to minimize a robust functional using a convex loss function. We also prove that the resulting estimator is Fisher consistent, which means that the solution to the population version of the problem is the object of interest (in our case, the true regression function). The convergence of this algorithm is studied in Section 2.1, while its finite-sample version using local M-regression smoothers is presented in Section 3. The results of our numerical experiments conducted to evaluate the performance of the proposed procedure are reported in Section 4. To study the sensitivity of the robust backfitting with respect to single outliers, Section 5 provides a numerical study of the empirical influence function. Finally, in Section 6 we illustrate the advantage of using robust backfitting on a real data set. All proofs are relegated to the Appendix.

    2 The robust backfitting functional

In this section, we introduce a population-level version of the robust backfitting algorithm. By showing that the robust backfitting corresponds to a coordinate-descent algorithm to minimize a "robust functional", we are able to find sufficient conditions for the robust backfitting to be Fisher-consistent.

In what follows, we will assume that (Xt, Y )t is a random vector satisfying the additive model (2), where Y ∈ R and X = (X1, . . . , Xd)t, that is,

Y = µ0 + ∑_{j=1}^{d} g0,j(Xj) + σ0 ε . (4)

As is customary, to ensure identifiability of the components of the model, we will further assume that E g0,j(Xj) = 0, 1 ≤ j ≤ d. When second moments exist, it is easy to see that

the backfitting estimators solve the following minimization problem

min_{(ν,m) ∈ R × H_ad} E( Y − ν − ∑_{j=1}^{d} mj(Xj) )^2 , (5)

where H_ad = { m(x) = ∑_{j=1}^{d} mj(xj), mj ∈ Hj } ⊂ H, H = { r(x) : E(r(X)) = 0, E(r^2(X)) < ∞ }, and Hj is the Hilbert space of measurable functions mj of Xj with zero mean and finite second moment, i.e., E mj(Xj) = 0 and E mj^2(Xj) < ∞. The solution to (5) is characterized by its residual Y − µ − g(X) being orthogonal to H_ad. Since this space is spanned by Hℓ, 1 ≤ ℓ ≤ d, the solution of (5) satisfies E( Y − µ − ∑_{j=1}^{d} gj(Xj) ) = 0 and E( Y − µ − ∑_{j=1}^{d} gj(Xj) | Xℓ ) = 0, for 1 ≤ ℓ ≤ d, from where it follows that µ = E(Y) and gℓ(Xℓ) = E( Y − µ − ∑_{j≠ℓ} gj(Xj) | Xℓ ), 1 ≤ ℓ ≤ d. Given a sample, the backfitting algorithm iterates the above system of equations replacing the conditional expectations with non-parametric regression estimators (e.g. local polynomial smoothers).

To reduce the effect of outliers on the regression estimates, we replace the square loss function in (5) by a function with bounded derivative, such as the Huber or Tukey's loss functions. For these losses, ρc(u) = c^2 ρ1(u/c), where c > 0 is a tuning constant to achieve a given efficiency. The Huber-type loss corresponds to ρ1 = ρh and the Tukey's loss to ρt, where ρh(u) = u^2/2 if |u| ≤ 1 and ρh(u) = |u| − 1/2 otherwise, while ρt(u) = min(3u^2 − 3u^4 + u^6, 1). Other possible choices are ρ1(u) = √(1 + u^2) − 1, which is a smooth approximation of the Huber function, and ρ1(u) = u arctan(u) − 0.5 ln(1 + u^2), which has derivative ρ′1(u) = arctan(u). The bounded derivative of the loss function controls the effect of outlying values in the response variable (sometimes called "vertical outliers" in the literature).
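The two main losses above can be written down directly; a minimal Python sketch (the tuning constants shown are the common 95%-efficiency choices used later in Section 4) makes the boundedness visible:

```python
import numpy as np

def rho_huber(u, c=1.345):
    """Huber loss rho_c(u) = c^2 rho_1(u/c): quadratic centre, linear tails."""
    a = np.abs(u)
    return np.where(a <= c, 0.5 * a ** 2, c * a - 0.5 * c ** 2)

def rho_tukey(u, c=4.685):
    """Tukey bisquare loss rho_c(u) = c^2 min(3t^2 - 3t^4 + t^6, 1), t = u/c.
    Bounded: any gross outlier contributes at most c^2."""
    t = np.minimum(np.abs(u) / c, 1.0)
    return c ** 2 * (3 * t ** 2 - 3 * t ** 4 + t ** 6)

# The squared loss at u = 100 would be 5000; the robust losses are far smaller.
print(float(rho_huber(100.0)), float(rho_tukey(100.0)))  # ≈ 133.6 and ≈ 21.9
```

Huber's loss grows only linearly in the tails (bounded derivative), while Tukey's loss is itself bounded, which is why its score function redescends.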

Formally, our objective function is given by

Υ(ν, m) = E ρ( (Y − ν − ∑_{j=1}^{d} mj(Xj)) / σ0 ) , (6)

where ρ : R → [0,∞) is even, ν ∈ R and the functions mj ∈ Hj, 1 ≤ j ≤ d. Let P be a distribution in R^{d+1} and let (Xt, Y )t ∼ P. Define the functional (µ(P), g(P)) as the solution of the following optimization problem:

(µ(P), g(P)) = argmin_{(ν,m) ∈ R × H_ad} Υ(ν, m) , (7)

where g(P)(X) = ∑_{j=1}^{d} gj(P)(Xj) ∈ H_ad.

To prove that the functional in (7) is Fisher-consistent and to derive first-order conditions for the point where it attains its minimum value, we will need the following assumptions:

E1 The random variable ε has a density function f0(t) that is even, non-increasing in |t|, and strictly decreasing for |t| in a neighbourhood of 0.

R1 The function ρ : R → [0,∞) is continuous, non-decreasing, ρ(0) = 0, and ρ(u) = ρ(−u). Moreover, if 0 ≤ u < v with ρ(v) < sup_t ρ(t), then ρ(u) < ρ(v).

A1 Given functions mj ∈ Hj, if P(∑_{j=1}^{d} mj(Xj) = 0) = 1 then, for all 1 ≤ j ≤ d, we have P(mj(Xj) = 0) = 1.

Remark 2.1. Assumption E1 is a standard condition needed to ensure Fisher-consistency of an M-location functional (see, e.g. Maronna et al., 2006). Assumption R1 is satisfied by the so-called family of "rho functions" in Maronna et al. (2006), which include many commonly used robust loss functions, such as those mentioned above. Since the loss function ρ can be chosen by the user, this assumption is not restrictive. Finally, assumption A1 allows us to write the functional g(P) in (7) uniquely as g(P) = ∑_{j=1}^{d} gj(P).

Assumption A1 appears to be the most restrictive and deserves some discussion. It is closely related to the identifiability of the additive model (4) and holds if the explanatory variables are independent from each other. Indeed, let us denote by (x, Xα) the vector with the α-th coordinate equal to x and the other ones equal to Xj, j ≠ α, and by m(x) = ∑_{j=1}^{d} mj(xj), for mj ∈ Hj. For any fixed 1 ≤ α ≤ d, the condition P(m(X) = 0) = 1 implies that, for almost every xα, P(m(xα, Xα) = 0 | Xα = xα) = 1. Using that the components of X are independent, we obtain that P(m(xα, Xα) = 0) = 1, which implies that ∫ m(xα, uα) dF_{Xα}(u) = 0. Note that since E mj(Xj) = 0 for all j, ∫ m(xα, uα) dF_{Xα}(u) = mα(xα) + ∫ ∑_{j≠α} mj(uj) dF_{Xα}(u) = mα(xα). Hence, mα(xα) = 0 for almost every xα, as desired. However, if the components of X are not independent, then P(m(xα, Xα) = 0 | Xα = xα) = 1 does not imply ∫ m(xα, uα) dF_{Xα}(u) = 0. This has already been observed by Hastie and Tibshirani (1990, page 107). The fact that H_ad is closed in H entails that, under mild assumptions, the minimum of E(Y − m(X))^2 over H_ad exists and is unique. However, the individual functions mj(xj) may not be uniquely determined, since the dependence among the covariates may lead to more than one representation for the same surface (see also Breiman and Friedman, 1985). In fact, condition A1 is analogous to assumption 5.1 of Breiman and Friedman (1985). It is also worth noticing that Stone (1985) gives conditions to ensure that A1 holds. Indeed, Lemma 1 in Stone (1985) implies Proposition 2.1 below, which gives weak conditions for the unique representation and hence, as shown in Theorem 2.1 below, for the Fisher-consistency of the functional g(P). Its proof is omitted since it follows straightforwardly.
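The failure of A1 under dependence is easy to exhibit numerically. In this deliberately extreme sketch (our own toy example, with X1 = X2 perfectly dependent), two non-trivial mean-zero components cancel exactly, so the additive representation of the zero surface is not unique:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 1000)
X1 = x
X2 = x                        # perfectly dependent covariates (concurvity)
m1 = X1 - X1.mean()           # E[m1(X1)] = 0, but m1 is not the zero function
m2 = -(X2 - X2.mean())        # E[m2(X2)] = 0
s = m1 + m2                   # identically zero on the data
print(np.max(np.abs(s)), np.std(m1))   # sum is 0 everywhere, yet m1 varies
```

With independent covariates this cannot happen: as the argument above shows, each component would have to vanish almost surely.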

Proposition 2.1. Assume that X has compact support S and that its density fX is bounded in S and such that inf_{x∈S} fX(x) > 0. Let Vj = mj(Xj) be random variables such that P(∑_{j=1}^{d} Vj = 0) = 1 and E(Vj) = 0; then P(Vj = 0) = 1.

The next theorem establishes the Fisher-consistency of the functional (µ(P), g(P)). In other words, it shows that the solution to the optimization problem (7) is given by the target quantities to be estimated under model (4).

Theorem 2.1. Assume that the random vector (Xt, Y )t ∈ R^{d+1} satisfies (4) and let P stand for its distribution.

a) If E1 and R1 hold, then Υ(ν, m) in (6) achieves its unique minimum over R × H_ad at (µ(P), g(P)) = (µ(P), ∑_{j=1}^{d} gj(P)) when µ(P) = µ0 and P( ∑_{j=1}^{d} gj(P)(Xj) = ∑_{j=1}^{d} g0,j(Xj) ) = 1.

b) If in addition A1 holds, the unique minimum (µ(P), g(P)) = (µ(P), ∑_{j=1}^{d} gj(P)) satisfies µ(P) = µ0 and P( gj(P)(Xj) = g0,j(Xj) ) = 1 for 1 ≤ j ≤ d.

It is worth noticing that a minimizer (µ(P), g(P)) of (7) always exists if ρ is a strictly convex function, even if E1 does not hold. If in addition A1 holds, the minimizer will have a unique representation.

For ν ∈ R, x = (x1, . . . , xd)t ∈ R^d and m = (m1, . . . , md)t ∈ H1 × H2 × · · · × Hd, let Γ(ν, m, x) = (Γ0(ν, m), Γ1(ν, m, x1), . . . , Γd(ν, m, xd))t, where

Γ0(ν, m) = E[ ψ( (Y − ν − ∑_{j=1}^{d} mj(Xj)) / σ0 ) ]

Γℓ(ν, m, xℓ) = E[ ψ( (Y − ν − ∑_{j=1}^{d} mj(Xj)) / σ0 ) | Xℓ = xℓ ] , 1 ≤ ℓ ≤ d . (8)

Our next theorem shows that it is possible to choose the solution g(P) of (7) so that its additive components gj = gj(P) satisfy first-order conditions which are generalizations of those corresponding to the classical case where ρ(u) = u^2.

Theorem 2.2. Let ρ be a differentiable function satisfying R1 and such that its derivative ρ′ = ψ is bounded and continuous. Let (Xt, Y )t ∼ P be a random vector such that (µ(P), g(P)) is a minimizer of Υ(ν, m) over R × H_ad, where µ(P) ∈ R and g(P) = ∑_{j=1}^{d} gj(P) ∈ H_ad, i.e., (µ(P), g(P)) is the solution of (7). Then, (µ(P), g(P)) satisfies the system of equations Γ(ν, m, x) = 0 almost surely PX.

Remark 2.2. a) It is also worth mentioning that if (Xt, Y )t satisfies (4) with the errors satisfying E1, then Γ(µ0, g0, x) = 0. Moreover, if the model is heteroscedastic,

Y = g0(X) + σ0(X) ε = µ0 + ∑_{j=1}^{d} g0,j(Xj) + σ0(X) ε , (9)

where the errors ε are symmetrically distributed and the score function ψ is odd, then (µ0, g0) satisfies

E[ ψ( (Y − µ0 − ∑_{j=1}^{d} g0,j(Xj)) / σ0(X) ) ] = 0 ,

E[ (1/σ0(X)) ψ( (Y − µ0 − ∑_{j≠ℓ} g0,j(Xj) − g0,ℓ(Xℓ)) / σ0(X) ) | Xℓ ] = 0 , 1 ≤ ℓ ≤ d ,

which provides a way to extend the robust backfitting algorithm to heteroscedastic models.

b) Assume now that missing responses can arise in the sample, that is, we have a sample (Xti, Yi, δi)t, 1 ≤ i ≤ n, where δi = 1 if Yi is observed and δi = 0 if Yi is missing, and (Xti, Yi)t satisfy an additive heteroscedastic model (9). Let (Xt, Y, δ)t be a random vector with the same distribution as (Xti, Yi, δi)t. Moreover, assume that responses may be missing at random (mar), i.e., P(δ = 1 | (X, Y)) = P(δ = 1 | X) = p(X). Define (µ(P), g(P)) = argmin_{(ν,m) ∈ R × H_ad} Υδ(ν, m), where

Υδ(ν, m) = E[ δ ρ( (Y − ν − ∑_{j=1}^{d} mj(Xj)) / σ0(X) ) ] = E[ p(X) ρ( (Y − ν − ∑_{j=1}^{d} mj(Xj)) / σ0(X) ) ] .

Arguments analogous to those considered in the proof of Theorem 2.1 allow us to show that, if E1 and R1 hold, Υδ(ν, m) achieves its unique minimum at (ν, m) ∈ R × H where ν = µ0 and P(m(X) = ∑_{j=1}^{d} g0,j(Xj)) = 1. Besides, if in addition A1 holds, the unique minimum satisfies µ(P) = µ0 and P(gj(P)(Xj) = g0,j(Xj)) = 1; that is, the functional is Fisher-consistent.

On the other hand, the proof of Theorem 2.2 can also be generalized to the case of a homoscedastic additive model (4) with missing responses. Effectively, when inf_x p(x) > 0, using the mar assumption, it is possible to show that there exists a unique measurable solution g̃ℓ(x) of λℓ,δ(x, a) = 0, where

λℓ,δ(x, a) = E{ p(X) ψ( (Y − µ(P) − ∑_{j≠ℓ} gj(P)(Xj) − a) / σ0 ) | Xℓ = x } .

More precisely, let Γδ(ν, m, x) = (Γ0,δ(ν, m), Γ1,δ(ν, m, x1), . . . , Γd,δ(ν, m, xd))t with m = (m1, . . . , md)t and

Γ0,δ(ν, m) = E[ p(X) ψ( (Y − ν − ∑_{j=1}^{d} mj(Xj)) / σ0 ) ]

Γℓ,δ(ν, m, xℓ) = E[ p(X) ψ( (Y − ν − ∑_{j≠ℓ} mj(Xj) − mℓ(Xℓ)) / σ0 ) | Xℓ = xℓ ] , 1 ≤ ℓ ≤ d .

Arguments similar to those considered in the proof of Theorem 2.2 allow us to show that if there exists a unique minimizer (µ(P), g(P)) ∈ R × H_ad of Υδ(ν, m), then (µ(P), g(P)) is a solution of Γδ(ν, m, x) = 0. Note that instead of a simplified approach, a propensity score approach can also be considered, taking δ/p(X) instead of δ. In this case, Υδ(ν, m) = Υ(ν, m) defined in (6) and Γℓ,δ = Γℓ defined in (8). The propensity approach is useful when preliminary estimates of the missing probability are available; otherwise, the simplified approach is preferred.

2.1 The population version of the robust backfitting algorithm

In this section, we derive an algorithm to solve (7) and study its convergence. For simplicity, we will assume that the vector (Xt, Y )t is completely observed and that it satisfies (4). By Theorem 2.2, the robust functional (µ(P), g(P)) satisfies (8). To simplify the notation, in what follows we will write µ = µ(P) and gj = gj(P), 1 ≤ j ≤ d, and ∑_{s=ℓ}^{m} as will be understood as 0 if m < ℓ. The robust backfitting algorithm is given in Algorithm 1.

Algorithm 1 Population version of the robust backfitting

1: Let ℓ = 0 and g^(0) = (g^(0)_1, . . . , g^(0)_d)t be an initial set of additive components (for example, g^(0) = 0) and µ^(0) an initial location parameter.
2: repeat
3:   ℓ ← ℓ + 1
4:   for j = 1 to d do
5:     Let R^(ℓ)_j = Y − µ^(ℓ−1) − ∑_{s=1}^{j−1} g̃^(ℓ)_s(Xs) − ∑_{s=j+1}^{d} g^(ℓ−1)_s(Xs)
6:     Let g̃^(ℓ)_j solve E[ ψ( (R^(ℓ)_j − g̃^(ℓ)_j(Xj)) / σ0 ) | Xj = x ] = 0 a.s.
7:   end for
8:   for j = 1 to d do
9:     g^(ℓ)_j = g̃^(ℓ)_j − E[g̃^(ℓ)_j(Xj)]
10:  end for
11:  Let µ^(ℓ) solve E[ ψ( (Y − µ^(ℓ) − ∑_{j=1}^{d} g^(ℓ)_j(Xj)) / σ0 ) ] = 0
12: until convergence

Our next theorem shows that each step ℓ of the algorithm above reduces the objective function Υ(µ^(ℓ), g^(ℓ)).

Theorem 2.3. Let ρ be a differentiable function satisfying R1 and such that its derivative ρ′ = ψ is a strictly increasing, bounded and continuous function with lim_{t→+∞} ψ(t) > 0 and lim_{t→−∞} ψ(t) < 0. Let (µ^(ℓ), g^(ℓ))_{ℓ≥1} = (µ^(ℓ), g^(ℓ)_1, . . . , g^(ℓ)_d)_{ℓ≥1} be the sequence obtained with Algorithm 1. Then, {Υ(µ^(ℓ), g^(ℓ))}_{ℓ≥1} is a decreasing sequence, so that the algorithm converges.

3 The sample version of the robust backfitting algorithm

In practice, given a random sample (Xti, Yi)t, 1 ≤ i ≤ n, from the additive model (4), we apply Algorithm 1 replacing the unknown conditional expectations with univariate robust smoothers. Different smoothers can be considered, including splines, kernel weights or even nearest neighbours with kernel weights. In what follows we describe the algorithm for kernel polynomial M-estimators.

Let K : R → R be a kernel function and let Kh(t) = (1/h) K(t/h). The estimators of the solutions of (8) using kernel M-polynomial estimators of order q ≥ 0 are given by the solution to the following system of equations:

(1/n) ∑_{i=1}^{n} ψ( (Yi − µ̂ − ∑_{j=1}^{d} ĝj(Xi,j)) / σ̂0 ) = 0

(1/n) ∑_{i=1}^{n} Khj(Xi,j − xj) ψ( (Yi − µ̂ − ∑_{ℓ≠j} ĝℓ(Xi,ℓ) − ∑_{s=0}^{q} βs,j Zi,j,s) / σ̂0 ) Zi,j(xj) = 0 , 1 ≤ j ≤ d ,

where Zi,j(xj) = (Zi,j,0, Zi,j,1, . . . , Zi,j,q)t with Zi,j,s = (Xi,j − xj)^s, 0 ≤ s ≤ q. Then, we have ĝj(xj) = β0,j, 1 ≤ j ≤ d. The corresponding algorithm is described in detail in Algorithm 2. The same procedure can be applied to the complete sample when responses are missing.
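For q = 0, the second estimating equation above reduces to a kernel-weighted M-location problem at each point, which is commonly solved by iteratively reweighted least squares (IRLS). The following is our own minimal Python sketch of this step (the authors' implementation is in R; function names and the IRLS solver are our illustrative choices), using Huber's ψ and the Epanechnikov kernel:

```python
import numpy as np

def psi_huber(u, c=1.345):
    """Huber score function: linear in the centre, constant in the tails."""
    return np.clip(u, -c, c)

def local_constant_m(x0, x, r, h, sigma, n_iter=100):
    """Local constant (q = 0) kernel M-estimate at x0: solve
    sum_i K_h(x_i - x0) psi((r_i - b) / sigma) = 0 for b via IRLS."""
    kw = np.maximum(0.75 * (1.0 - ((x - x0) / h) ** 2), 0.0)  # Epanechnikov
    b = np.median(r[kw > 0])                 # robust starting value
    for _ in range(n_iter):
        u = (r - b) / sigma
        safe = np.where(np.abs(u) < 1e-10, 1.0, u)
        w = kw * np.where(np.abs(u) < 1e-10, 1.0, psi_huber(u) / safe)
        b_new = np.sum(w * r) / np.sum(w)    # weighted mean update
        if abs(b_new - b) < 1e-10:
            b = b_new
            break
        b = b_new
    return b

rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 300)
r = 2 * np.pi * np.sin(np.pi * x) + 0.5 * rng.normal(size=300)
r[:30] = 15.0                               # 10% vertical outliers
b = local_constant_m(0.5, x, r, 0.1, 0.5)
print(b)                                    # close to 2*pi despite the outliers
```

Each IRLS step is a weighted least-squares fit with weights ψ(u)/u, which for a monotone ψ converges to the root of the estimating equation.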

Remark 3.1. A possible choice of the preliminary scale estimator σ̂0 is obtained by calculating the mad of the residuals obtained with a simple and robust nonparametric regression estimator, such as local medians. In that case we have σ̂0 = mad_{1≤i≤n} { Yi − Ŷi }, where Ŷi = median_{1≤j≤n} { Yj : |Xj,k − Xi,k| ≤ hk, ∀ 1 ≤ k ≤ d }. The bandwidths hk are preliminary values to be selected, or alternatively they can be chosen as the distance between Xi,k and its r-th nearest neighbour among {Xj,k}_{j≠i}.
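A direct Python sketch of this preliminary scale estimator follows (our own illustrative code; we use the usual normal-consistency factor 1.4826 for the MAD, which Remark 3.1 does not spell out):

```python
import numpy as np

def mad(u):
    """Median absolute deviation, scaled for consistency at the normal
    (the 1.4826 factor is an assumption; the text just says 'mad')."""
    return 1.4826 * np.median(np.abs(u - np.median(u)))

def sigma_hat(X, y, h):
    """Preliminary scale as in Remark 3.1: MAD of residuals from a
    local-median fit over cube neighbourhoods |X_jk - X_ik| <= h_k."""
    n, d = X.shape
    yhat = np.empty(n)
    for i in range(n):
        mask = np.all(np.abs(X - X[i]) <= h, axis=1)
        yhat[i] = np.median(y[mask])
    return mad(y - yhat)

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(400, 2))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.5 * rng.normal(size=400)
s = sigma_hat(X, y, np.array([0.15, 0.15]))
print(s)   # roughly the true error scale 0.5, inflated a little by smoothing bias
```

Because both the local fit and the dispersion measure are medians, a moderate fraction of outlying responses would leave σ̂0 essentially unchanged.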

4 Numerical studies

This section contains the results of a simulation study designed to compare the behaviour of our proposed estimator with that of the classical one. We generated N = 500 samples of size n = 500 for models with d = 2 and d = 4 components. We considered samples without outliers and also samples contaminated in different ways. We also included in our

Algorithm 2 The sample version of the robust backfitting

1: Let ℓ = 0 and ĝ^(0) = (ĝ^(0)_1, . . . , ĝ^(0)_d)t be an initial set of additive components (for example, ĝ^(0) = 0), and let σ̂0 be a robust residual scale estimator. Moreover, let µ̂^(0) be an initial location estimator, such as an M-location estimator of the responses.
2: repeat
3:   ℓ ← ℓ + 1
4:   for j = 1 to d do
5:     for i0 = 1 to n do
6:       Let xj = Xi0,j
7:       for i = 1 to n do
8:         Let Zi,j(xj) = (1, (Xi,j − xj), (Xi,j − xj)^2, . . . , (Xi,j − xj)^q)t and R̂^(ℓ)_{i,j} = Yi − µ̂^(ℓ−1) − ∑_{s=1}^{j−1} g̃^(ℓ)_s(Xi,s) − ∑_{s=j+1}^{d} ĝ^(ℓ−1)_s(Xi,s)
9:       end for
10:      Let β̂j(xj) = (β̂0j(xj), β̂1j(xj), . . . , β̂qj(xj))t be the solution to
         (1/n) ∑_{i=1}^{n} Kh(Xi,j − xj) ψ( (R̂^(ℓ)_{i,j} − β̂j(xj)t Zi,j(xj)) / σ̂0 ) Zi,j(xj) = 0 .
11:      Let g̃^(ℓ)_j(xj) = β̂0j(xj)
12:    end for
13:  end for
14:  for j = 1 to d do
15:    ĝ^(ℓ)_j = g̃^(ℓ)_j − ∑_{i=1}^{n} g̃^(ℓ)_j(Xi,j)/n
16:  end for
17:  Let µ̂^(ℓ) solve (1/n) ∑_{i=1}^{n} ψ( (Yi − µ̂^(ℓ) − ∑_{j=1}^{d} ĝ^(ℓ)_j(Xi,j)) / σ̂0 ) = 0
18: until convergence

experiment cases where the response variable may be missing, as described in Remark 2.2. All computations were carried out using an R implementation of our algorithm, publicly available on-line at http://www.stat.ubc.ca/~matias/soft.html.

To generate missing responses, we first generated observations (Xti, Yi)t satisfying the additive model Y = g0(X) + u = µ0 + ∑_{j=1}^{d} g0,j(Xj) + u, where u = σ0 ε. Then, we generated {δi}_{i=1}^{n} independent Bernoulli random variables such that P(δi = 1 | Yi, Xi) = P(δi = 1 | Xi) = p(Xi), where we used a different function p for each value of d. When d = 2, we used p(x) = p2(x) = 0.4 + 0.5 (cos(x1 + 0.2))^2, which yields around 31.5% of missing responses. For d = 4, we set p(x) = p4(x) = 0.4 + 0.5 (cos(x1 x3 + 0.2))^2, which produces approximately 33% of missing Yi's. In addition, we also consider the case where all responses are observed, i.e., p(x) ≡ 1.

We compared the following estimators:

• The classical backfitting estimator adapted to missing responses, denoted ĝbc.

• A robust backfitting estimator ĝbr,h using Huber's loss function with tuning constant c = 1.345. This loss function is such that ρ′c(u) = ψc(u) = min(c, max(−c, u)).

• A robust backfitting estimator ĝbr,t using Tukey's bisquare loss function with tuning constant c = 4.685. This loss function satisfies ρ′c(u) = ψc(u) = u (1 − (u/c)^2)^2 I_{[−c,c]}(u).

The univariate smoothers were computed using the Epanechnikov kernel K(u) = 0.75 (1 − u^2) I_{[−1,1]}(u). We used both local constants and local linear polynomials, which correspond to q = 0 and q = 1 in Algorithm 2, denoted in all tables and figures with a subscript of 0 and 1, respectively.
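The kernel and the redescending Tukey score just described are short one-liners; this small Python check (our own, purely illustrative) confirms that the Epanechnikov kernel integrates to one and that gross outliers receive a zero score under Tukey's ψ:

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1]."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0)

def psi_tukey(u, c=4.685):
    """Tukey bisquare score psi_c(u) = u (1 - (u/c)^2)^2 on [-c, c];
    it redescends to zero, so gross outliers get zero weight."""
    return np.where(np.abs(u) <= c, u * (1 - (u / c) ** 2) ** 2, 0.0)

u = np.linspace(-1, 1, 200_001)
area = epanechnikov(u).sum() * (u[1] - u[0])   # Riemann sum of the kernel
print(area, float(psi_tukey(100.0)))           # ≈ 1.0 and 0.0
```

The contrast with Huber's ψ, which stays at ±c in the tails, explains the behaviour seen later under heavy contamination: Tukey-based fits discard gross outliers entirely.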

The performance of each estimator ĝj of g0,j, 1 ≤ j ≤ d, was measured through the following approximated integrated squared error (ise):

ise(ĝj) = (1 / ∑_{i=1}^{n} δi) ∑_{i=1}^{n} ( g0,j(Xij) − ĝj(Xij) )^2 δi ,

where Xij is the j-th component of Xi, and δi = 0 if the i-th response was missing and δi = 1 otherwise. An approximation of the mean integrated squared error (mise) was obtained by averaging the ise above over all replications. A similar measure was used to compare the estimators of the regression function g0 = µ0 + ∑_{j=1}^{d} g0,j.

4.1 Monte Carlo study with d = 2 additive components

In this case, the covariates were generated from a uniform distribution on the unit square, Xi = (Xi,1, Xi,2)t ∼ U([0, 1]^2), the error scale was σ0 = 0.5 and the overall location µ0 = 0. The additive components were chosen to be

g0,1(x1) = 24 (x1 − 0.5)^2 − 2 , g0,2(x2) = 2π sin(π x2) − 4 . (10)

Optimal bandwidths with respect to the mise can be computed assuming that the other components in the model are known (see, for instance, Härdle et al., 2004). Since the explanatory variables are uniformly distributed, the optimal bandwidths for the local constant (q = 0) and local linear (q = 1) estimators are the same and very close to our choice of h = (0.10, 0.10).
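This simulation design is easy to reproduce; the sketch below (our own Python code; the paper's experiments use R) generates one sample from (10), including mar responses via the function p2 of Section 4, and verifies empirically that both components are centred, as the identifiability constraint requires:

```python
import numpy as np

def g01(x):
    """First additive component in (10); E[g01(X1)] = 0 for X1 ~ U(0, 1)."""
    return 24 * (x - 0.5) ** 2 - 2

def g02(x):
    """Second additive component in (10); E[g02(X2)] = 0 for X2 ~ U(0, 1)."""
    return 2 * np.pi * np.sin(np.pi * x) - 4

rng = np.random.default_rng(6)
n, sigma0, mu0 = 500, 0.5, 0.0
X = rng.uniform(0, 1, size=(n, 2))
y = mu0 + g01(X[:, 0]) + g02(X[:, 1]) + sigma0 * rng.normal(size=n)
# Missing responses via p2 from Section 4 (around 31.5% missing):
p = 0.4 + 0.5 * np.cos(X[:, 0] + 0.2) ** 2
delta = rng.binomial(1, p)
c1, c2, miss = g01(X[:, 0]).mean(), g02(X[:, 1]).mean(), 1 - delta.mean()
print(c1, c2, miss)   # both means near 0; missing rate near 0.315
```

Analytically, E[24(X1 − 0.5)^2 − 2] = 24/12 − 2 = 0 and E[2π sin(πX2) − 4] = 2π · (2/π) − 4 = 0, matching the sample checks.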

For the errors, we considered the following settings:

• C0: ui ∼ N(0, σ0^2).

• C1: ui ∼ (1 − 0.15) N(0, σ0^2) + 0.15 N(15, 0.01).

• C2: ui ∼ N(15, 0.01) for all i's such that Xi ∈ D0.3, where Dη = [0.2, 0.2 + η]^2.

• C3: ui ∼ N(10, 0.01) for all i's such that Xi ∈ D0.09, where Dη is as above.

• C4: ui ∼ (1 − 0.30) N(0, σ0^2) + 0.30 N(15, 0.01) for all i's such that Xi ∈ D0.3.

Case C0 corresponds to samples without outliers, and it will illustrate the loss of efficiency incurred by using a robust estimator when it may not be needed. The contamination setting C1 corresponds to a gross-error model where all observations have the same chance of being contaminated. On the other hand, case C2 is highly pathological in the sense that all observations with covariates in the square [0.2, 0.5] × [0.2, 0.5] are severely affected, while C3 is similar but in the region [0.2, 0.29] × [0.2, 0.29]. The difference between C2 and C3 is that the bandwidths we used (h = 0.10) are wider than the contaminated region in C3. Finally, case C4 is a gross-error model with a higher probability of observing an outlier, but these are restricted to the square [0.2, 0.5] × [0.2, 0.5]. Figure 1 illustrates these contamination scenarios on one randomly generated sample. The panels correspond to settings C2, C3 and C4, with solid triangles indicating contaminated cases.
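As a concrete reading of the mixture notation, the gross-error setting C1 can be simulated as follows (our own sketch; note that N(15, 0.01) denotes mean 15 and variance 0.01, i.e. standard deviation 0.1):

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma0 = 500, 0.5
# C1: each error is N(15, 0.01) with probability 0.15, else N(0, sigma0^2)
contaminated = rng.uniform(size=n) < 0.15
u = np.where(contaminated,
             rng.normal(15.0, 0.1, size=n),    # sd 0.1, i.e. variance 0.01
             rng.normal(0.0, sigma0, size=n))
rate, centre = contaminated.mean(), u[contaminated].mean()
print(rate, centre)   # about 0.15 contaminated, tightly clustered near 15
```

Because the outliers sit in a tight cluster 30 error standard deviations above zero, they devastate least-squares fits while remaining easy for a bounded ψ to discount.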

(a) C2  (b) C3  (c) C4

Figure 1: Scatter plots of the covariates (x1, x2)t, with solid triangles indicating observations with contaminated response variables, for contamination settings C2, C3 and C4. The square regions indicate the sets Dη for each scenario.

    Tables 1 to 3 report the obtained values of the mise when estimating the regression function g0 = µ0 + g0,1 + g0,2 and each additive component g0,1 and g0,2, respectively. Estimates obtained using local constant (q = 0) and local linear smoothers (q = 1) are indicated with a subscript "0" and "1", respectively. To complement the results reported in Tables 1 to 3, Tables 4 to 6 report the ratio between the mean integrated squared error (mise) obtained under each contamination setting and under clean data, denoted mise(Cj)/mise(C0).
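    The summaries in these tables can be reproduced mechanically once the integrated squared error (ise) of each replication is available. The sketch below uses synthetic placeholder ise values; only the definitions of mise and the reported ratio are taken from the text.

    ```python
    import numpy as np

    def ise(g_hat, g_true, a=0.0, b=1.0):
        """Integrated squared error of one fit, approximated on an evaluation grid."""
        return np.mean((g_hat - g_true) ** 2) * (b - a)

    # mise is the average of ise over Monte Carlo replications; the ratio tables
    # report mise(Cj)/mise(C0). The arrays below are illustrative placeholders.
    ise_by_setting = {"C0": np.array([0.010, 0.012, 0.008]),
                      "C1": np.array([0.300, 0.250, 0.350])}
    mise = {c: v.mean() for c, v in ise_by_setting.items()}
    ratio_C1 = mise["C1"] / mise["C0"]      # mise(C1)/mise(C0)
    ```
    
    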

    Consider first the situation where there are no missing responses. As expected, whenthe data do not contain outliers the robust estimators are slightly less efficient than the


                      p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
          ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1    ĝbr,h,1  ĝbr,t,1      ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1    ĝbr,h,1  ĝbr,t,1
    C0    0.0704   0.0703   0.0718   0.0077   0.0079   0.0079       0.0741   0.0756   0.0790   0.0110   0.0113   0.0113
    C1    5.8437   0.1306   0.0730   5.9026   0.0620   0.0091       6.1657   0.1457   0.0800   6.3125   0.0897   0.0272
    C2    8.5823   0.2470   0.0753   8.5218   0.1471   0.0086       10.0841  0.3550   0.0837   10.0100  0.3040   0.0396
    C3    0.1560   0.0722   0.0719   0.0930   0.0090   0.0080       0.1882   0.0786   0.0790   0.1224   0.0127   0.0113
    C4    0.9159   0.0764   0.0718   0.8523   0.0122   0.0080       1.1017   0.0840   0.0782   1.0307   0.0169   0.0115

    Table 1: mise of the estimators of the regression function g0 = µ0 + g0,1 + g0,2 under different contaminations and missing mechanisms.

                      p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
          ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1      ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1
    C0    0.0445   0.0445     0.0454     0.0096   0.0097     0.0097         0.0493   0.0502     0.0521     0.0138   0.0140     0.0140
    C1    0.3638   0.0503     0.0459     0.3903   0.0152     0.0103         0.5176   0.0592     0.0523     0.5904   0.0334     0.0271
    C2    3.5490   0.1348     0.0473     3.3751   0.0665     0.0101         3.7935   0.1641     0.0547     3.5914   0.0942     0.0176
    C3    0.0935   0.0468     0.0454     0.0490   0.0103     0.0097         0.1084   0.0532     0.0521     0.0599   0.0147     0.0140
    C4    0.4351   0.0514     0.0453     0.3547   0.0119     0.0098         0.4910   0.0587     0.0517     0.3991   0.0166     0.0141

    Table 2: mise of the estimators of the additive component g0,1 under different contaminations and missing mechanisms.

                      p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
          ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1      ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1
    C0    0.0405   0.0406     0.0415     0.0106   0.0107     0.0107         0.0465   0.0476     0.0497     0.0157   0.0159     0.0159
    C1    0.3559   0.0494     0.0418     0.3907   0.0160     0.0113         0.4945   0.0599     0.0498     0.5731   0.0273     0.0184
    C2    3.1891   0.0932     0.0434     3.3314   0.0633     0.0110         4.0168   0.1572     0.0520     4.1969   0.1642     0.0385
    C3    0.0683   0.0403     0.0416     0.0480   0.0111     0.0107         0.0893   0.0476     0.0496     0.0695   0.0165     0.0159
    C4    0.3237   0.0391     0.0415     0.3415   0.0119     0.0108         0.4195   0.0465     0.0492     0.4433   0.0177     0.0160

    Table 3: mise of the estimators of the additive component g0,2 under different contaminations and missing mechanisms.


                            p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
                    ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1     ĝbr,h,1  ĝbr,t,1      ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1    ĝbr,h,1  ĝbr,t,1
    mise(C1)/mise(C0)  82.9580  1.8577   1.0161   766.6964  7.8275   1.1499        83.1557  1.9285   1.0124   575.4987 7.9530   2.4127
    mise(C2)/mise(C0)  121.8359 3.5141   1.0492   1106.9072 18.5640  1.0907        136.0025 4.6984   1.0594   912.5855 26.9607  3.5103
    mise(C3)/mise(C0)  2.2147   1.0277   1.0015   12.0853   1.1315   1.0051        2.5383   1.0407   0.9996   11.1610  1.1286   1.0060
    mise(C4)/mise(C0)  13.0019  1.0869   0.9997   110.7086  1.5411   1.0138        14.8589  1.1118   0.9898   93.9705  1.4974   1.0160

    Table 4: Ratio between the mise of the estimators of the regression function g0 = µ0 + g0,1 + g0,2 under the considered contaminations and under clean data for the two missing mechanisms.

                            p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
                    ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1   ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1
    mise(C1)/mise(C0)  8.1696   1.1298     1.0113     40.5953  1.5601     1.0552       10.5007  1.1803     1.0044     42.6524  2.3876     1.9396
    mise(C2)/mise(C0)  79.6889  3.0283     1.0427     351.0462 6.8407     1.0395       76.9559  3.2722     1.0495     259.4529 6.7405     1.2555
    mise(C3)/mise(C0)  2.0994   1.0500     1.0020     5.0949   1.0586     1.0024       2.1986   1.0609     1.0008     4.3251   1.0529     1.0031
    mise(C4)/mise(C0)  9.7694   1.1541     0.9993     36.8950  1.2287     1.0063       9.9606   1.1701     0.9934     28.8339  1.1895     1.0073

    Table 5: Ratio between the mise of the estimators of the additive component g0,1 under the considered contaminations and under clean data for the two missing mechanisms.

                            p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
                    ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1   ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1
    mise(C1)/mise(C0)  8.7994   1.2173     1.0062     36.7398  1.4951     1.0528       10.6429  1.2567     1.0022     36.4372  1.7173     1.1599
    mise(C2)/mise(C0)  78.8384  2.2953     1.0454     313.3083 5.8990     1.0285       86.4525  3.2998     1.0461     266.8154 10.3392    2.4245
    mise(C3)/mise(C0)  1.6873   0.9925     1.0010     4.5158   1.0365     1.0020       1.9229   0.9997     0.9985     4.4195   1.0386     1.0027
    mise(C4)/mise(C0)  8.0022   0.9615     1.0000     32.1152  1.1092     1.0054       9.0289   0.9769     0.9896     28.1834  1.1154     1.0063

    Table 6: Ratio between the mise of the estimators of the additive component g0,2 under the considered contaminations and under clean data for the two missing mechanisms.


    least squares one, although the differences in the estimated mise's are well within the Monte Carlo margin of error. For the contamination cases C1 and C2, when using either local constant or local linear smoothers, the mise of the classical estimator for g0 is notably larger than those of the robust estimators (more than 40 times larger). This difference is smaller when estimating each component g0,1 and g0,2, but remains fairly large nonetheless. In general, when the data contain outliers, we note that the robust estimators give noticeably better regression estimates (both for g0 and its components) than the classical one. The estimator based on Tukey's bisquare function is very stable across all the contamination cases considered here, while for the Huber one there is a slight increase in mise for the estimated additive components under the contamination setting C1 and a larger one under C2. The advantage of the estimators based on Tukey's loss function over those based on the Huber one becomes more apparent when one inspects the ratio between the mise's obtained with and without outliers given in Tables 4 to 6.

    It is worth noting that, for clean data, local linear smoothers achieve much smaller mise values than local constants. This may be explained by the well-known bias-reducing property of local polynomials, particularly near the boundary. This behaviour can also be observed across contamination settings for the robust estimators.

    Interestingly, the results of our experiments with missing responses yield the same overall conclusions. The estimators based on the subset of complete data remain Fisher-consistent, but, as one would expect, they all yield larger mise's due to the smaller sample sizes used to compute them. Moreover, the estimators based on Tukey's loss function are still preferable, although they are slightly less efficient than those based on the Huber loss when q = 0. When comparing the behaviour of the robust local constant and local linear estimators one notices that, as above, local linear estimators have smaller mise's. However, the ratios of mise reported in Tables 4 to 6 show that under C2 the mise's of the local linear estimators of g0 with missing responses are more than 4 times larger than those obtained with local constants. This effect is mainly due to the poor estimation of g0,2 and of the location parameter µ0, as suggested by the reported results.

    We also looked at the median ise across the simulation replications to determine whether the differences in the observed average ise's are representative of most samples or were influenced by poor fits obtained in a few bad samples. Tables 7 to 9 report the median over replications of the ise, denoted Medise, for the estimators of the regression function g0 and each additive component when d = 2. In addition, Tables 10 to 12 report the ratio between the median integrated squared error (Medise) obtained under each contamination setting and under clean data, denoted Medise(Cj)/Medise(C0). The median ise results show that the robust estimators behave similarly when p(x) ≡ 1 and when responses may be missing. Furthermore, the performances of Tukey's local linear and local constant smoothers are equally good.
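    The point of comparing Medise with mise is that the median is insensitive to a handful of badly fitted replications. A small sketch, with synthetic placeholder ise values, shows how a single poor fit inflates the mean but leaves the median unchanged:

    ```python
    import numpy as np

    def medise(ise_values):
        """Median across replications of the integrated squared error."""
        return np.median(np.asarray(ise_values))

    # 99 typical replications plus one badly fitted sample (values illustrative)
    ise_values = np.array([0.01] * 99 + [5.0])
    mean_ise = ise_values.mean()      # pulled up by the single bad replication
    med_ise = medise(ise_values)      # stays at the typical value 0.01
    ```

    Large mise/Medise discrepancies therefore flag results driven by a few bad samples rather than by typical behaviour.
    
    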

    Based on the above results we recommend using local linear smoothers computed withTukey’s loss function, although in some situations local constants may give better results,


                      p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
          ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1    ĝbr,h,1  ĝbr,t,1      ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1    ĝbr,h,1  ĝbr,t,1
    C0    0.0700   0.0700   0.0714   0.0073   0.0074   0.0074       0.0733   0.0752   0.0784   0.0105   0.0108   0.0107
    C1    5.7681   0.1256   0.0713   5.8235   0.0587   0.0086       6.0524   0.1379   0.0793   6.1475   0.0676   0.0126
    C2    8.4718   0.2370   0.0747   8.4056   0.1333   0.0082       9.7880   0.2982   0.0825   9.7619   0.1794   0.0118
    C3    0.1375   0.0717   0.0710   0.0753   0.0083   0.0075       0.1535   0.0783   0.0781   0.0892   0.0122   0.0108
    C4    0.8481   0.0762   0.0712   0.7852   0.0113   0.0075       0.9826   0.0827   0.0776   0.9236   0.0157   0.0110

    Table 7: Medise of the estimators of the regression function g0 under different contaminations and missing mechanisms.

                      p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
          ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1      ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1
    C0    0.0430   0.0433     0.0440     0.0068   0.0069     0.0070         0.0465   0.0478     0.0501     0.0099   0.0099     0.0099
    C1    0.3296   0.0489     0.0443     0.3597   0.0130     0.0073         0.4574   0.0557     0.0498     0.5377   0.0194     0.0109
    C2    3.5188   0.1292     0.0453     3.3350   0.0613     0.0073         3.7000   0.1526     0.0523     3.4773   0.0750     0.0108
    C3    0.0865   0.0452     0.0441     0.0406   0.0078     0.0070         0.0981   0.0510     0.0502     0.0498   0.0109     0.0099
    C4    0.4121   0.0500     0.0436     0.3325   0.0096     0.0070         0.4404   0.0567     0.0496     0.3520   0.0131     0.0100

    Table 8: Medise of the estimators of the additive component g0,1 under different contaminations and missing mechanisms.

                      p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
          ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1      ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1
    C0    0.0382   0.0383     0.0391     0.0071   0.0072     0.0072         0.0430   0.0443     0.0462     0.0102   0.0103     0.0103
    C1    0.3180   0.0465     0.0398     0.3517   0.0130     0.0078         0.4503   0.0563     0.0463     0.5036   0.0193     0.0114
    C2    3.1744   0.0874     0.0409     3.3084   0.0575     0.0077         3.9785   0.1258     0.0489     4.1319   0.0884     0.0113
    C3    0.0620   0.0381     0.0387     0.0409   0.0077     0.0072         0.0767   0.0437     0.0463     0.0588   0.0111     0.0104
    C4    0.2993   0.0369     0.0391     0.3171   0.0086     0.0073         0.3722   0.0434     0.0461     0.3955   0.0122     0.0105

    Table 9: Medise of the estimators of the additive component g0,2 under different contaminations and missing mechanisms.


                                p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
                        ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1     ĝbr,h,1  ĝbr,t,1      ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1    ĝbr,h,1  ĝbr,t,1
    Medise(C1)/Medise(C0)  82.3814  1.7957   0.9991   802.7378  7.9218   1.1601        82.6104  1.8354   1.0113   584.4446 6.2633   1.1736
    Medise(C2)/Medise(C0)  120.9971 3.3876   1.0457   1158.6577 17.9962  1.1054        133.5979 3.9676   1.0525   928.0673 16.6135  1.0978
    Medise(C3)/Medise(C0)  1.9638   1.0254   0.9951   10.3761   1.1264   1.0095        2.0946   1.0416   0.9961   8.4822   1.1289   1.0078
    Medise(C4)/Medise(C0)  12.1124  1.0899   0.9971   108.2387  1.5199   1.0123        13.4123  1.1003   0.9906   87.8085  1.4519   1.0276

    Table 10: Ratio between the Medise of the estimators of the regression function g0 under the considered contaminations and under clean data for the two missing mechanisms.

                                p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
                        ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1   ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1
    Medise(C1)/Medise(C0)  7.6679   1.1314     1.0061     52.8814  1.8807     1.0462       9.8420   1.1645     0.9930     54.2267  1.9519     1.0976
    Medise(C2)/Medise(C0)  81.8740  2.9874     1.0294     490.3515 8.8710     1.0445       79.6110  3.1894     1.0435     350.6805 7.5627     1.0876
    Medise(C3)/Medise(C0)  2.0131   1.0458     1.0019     5.9736   1.1294     1.0016       2.1097   1.0655     1.0019     5.0231   1.0977     0.9964
    Medise(C4)/Medise(C0)  9.5897   1.1565     0.9911     48.8814  1.3842     1.0081       9.4753   1.1843     0.9894     35.5014  1.3224     1.0083

    Table 11: Ratio between the Medise of the estimators of the additive component g0,1 under the considered contaminations and under clean data for the two missing mechanisms.

                                p(x) ≡ 1                                                 p2(x) = 0.4 + 0.5 cos²(x1 + 0.2)
                        ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1   ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1
    Medise(C1)/Medise(C0)  8.3252   1.2142     1.0177     49.2713  1.8064     1.0717       10.4835  1.2699     1.0020     49.1453  1.8770     1.1082
    Medise(C2)/Medise(C0)  83.1079  2.2814     1.0453     463.4817 8.0080     1.0595       92.6170  2.8382     1.0571     403.2129 8.6146     1.0931
    Medise(C3)/Medise(C0)  1.6226   0.9934     0.9911     5.7296   1.0781     0.9914       1.7847   0.9856     1.0022     5.7381   1.0852     1.0044
    Medise(C4)/Medise(C0)  7.8349   0.9617     0.9990     44.4295  1.1974     1.0120       8.6646   0.9798     0.9963     38.5909  1.1879     1.0214

    Table 12: Ratio between the Medise of the estimators of the additive component g0,2 under the considered contaminations and under clean data for the two missing mechanisms.


    in particular when responses are missing and some neighbourhoods contain too few observations.

    4.2 Monte Carlo study with d = 4 additive components

    For this model we generated covariates Xi = (Xi1, Xi2, Xi3, Xi4) ∼ U([−3, 3]⁴), independent errors εi ∼ N(0, 1) and σ0 = 0.15. We used the following additive components: g0,1(x1) = x1³/12, g0,2(x2) = sin(−x2), g0,3(x3) = x3²/2 − 1.5, and g0,4(x4) = exp(x4)/4 − (e³ − e⁻³)/24. As in dimension d = 2, optimal bandwidths with respect to the mise were computed assuming that the other components in the model are known, resulting in hMISEopt = (0.36, 0.38, 0.34, 0.29). However, it was difficult to obtain a reliable estimate for the residual scale σ0 using these bandwidths (see Remark 3.1), since many 4-dimensional neighbourhoods did not contain sufficient observations. To solve this problem we used hσ = (0.93, 0.93, 0.93, 0.93) to estimate σ0 (we expect an average of 5 points in each 4-dimensional neighbourhood using hσ). We then applied the backfitting algorithm with the optimal bandwidths hMISEopt.
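    A quick numerical check, sketched below, confirms that each of the four additive components above integrates to zero against the U([−3, 3]) marginal, which is the centering needed for identifiability once the intercept µ0 is included. The grid size is an arbitrary choice.

    ```python
    import numpy as np

    def trapezoid(y, x):
        """Plain trapezoid rule (avoids depending on np.trapz naming changes)."""
        return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

    # The four additive components from the text
    components = [
        lambda x: x**3 / 12,
        lambda x: np.sin(-x),
        lambda x: x**2 / 2 - 1.5,
        lambda x: np.exp(x) / 4 - (np.exp(3) - np.exp(-3)) / 24,
    ]

    x = np.linspace(-3.0, 3.0, 200001)          # fine grid on the support
    means = [trapezoid(g(x), x) / 6.0 for g in components]   # E[g(X)], X ~ U(-3, 3)
    # each entry of `means` is numerically zero
    ```

    The first two components are odd functions, so they center automatically; the constants 1.5 and (e³ − e⁻³)/24 are exactly the U([−3, 3]) means of x²/2 and exp(x)/4.
    
    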

    In addition to the settings C0 and C1 described above, we modified the contamination setting C2 so that ui ∼ N(15, 0.01) for all i such that Xi,j ∈ [−1.5, 1.5] for all 1 ≤ j ≤ 4.

    Tables 13 to 17 report the mise for the different estimators, contamination settings and missing mechanisms. Ratios of mise's between contaminated and clean settings, together with median ise's, are reported in Tables 18 to 32.

    As observed in Section 4.1, our experiments with and without missing responses yield similar conclusions regarding the advantage of the robust procedure over the classical backfitting and the benefits of Tukey's loss function over the Huber one. As in dimension d = 2, when responses are missing all the mise's are slightly inflated, as is to be expected. It is also not surprising that when the data do not contain outliers (C0), the robust estimators have a slightly larger mise than their classical counterparts. However, when outliers are present, both robust estimators provide substantially better performance than the classical one, giving results similar to those for clean data. Also as noted for the model with d = 2, the mise's of the estimators based on Tukey's bisquare score function are more stable across the different contamination settings than those using Huber's score function. However, one difference with the results in dimension d = 2 should be pointed out. Under the gross-error model C1, the local linear smoother performs worse than the local constant when missing responses arise. This difference is highlighted when one looks at the ratio of the corresponding mise's. However, the median ise's reported in Tables 23 to 27 are more stable, which indicates that the difference in mise's may be due to a few samples. In fact, due to the relatively large number of missing observations that may be present in this setting (around 33%), around 6% of the replications produce robust fits with Tukey's loss function with atypically large values of mise, which are due to sparsely populated neighbourhoods. Based on these observations, we also recommend using Tukey's local linear estimators.
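    The "around 33%" missing proportion quoted above can be checked by simulating the second missing mechanism, p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2), used in the d = 4 tables. This is an illustrative sketch; the sample size and seed are arbitrary.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def p4(x):
        """Probability of observing the response under the second missing scheme."""
        return 0.4 + 0.5 * np.cos(x[:, 0] * x[:, 2] + 0.2) ** 2

    x = rng.uniform(-3, 3, size=(200000, 4))      # U([-3, 3]^4) covariates
    observed = rng.random(len(x)) < p4(x)         # response observed with prob. p4(x)
    miss = 1 - observed.mean()                    # empirical missing proportion
    # `miss` comes out close to one third, matching the ~33% quoted in the text
    ```

    Since p4(x) ranges over [0.4, 0.9], some regions of the covariate space lose up to 60% of their responses, which explains the sparsely populated neighbourhoods mentioned above.
    
    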


                      p(x) ≡ 1                                                 p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2)
          ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1    ĝbr,h,1  ĝbr,t,1      ĝbc,0    ĝbr,h,0  ĝbr,t,0  ĝbc,1    ĝbr,h,1  ĝbr,t,1
    C0    0.0123   0.0125   0.0125   0.0023   0.0023   0.0023       0.0135   0.0142   0.0150   0.0033   0.0033   0.0033
    C1    7.3345   0.0525   0.0134   7.6095   0.0497   0.0046       8.4183   0.0699   0.0192   8.8581   0.1563   0.0932
    C2    4.8441   0.0360   0.0128   4.8224   0.0221   0.0025       5.9889   0.0392   0.0148   6.0121   0.0258   0.0038

    Table 13: mise of the estimators of the regression function g0 = µ0 + g0,1 + g0,2 + g0,3 + g0,4 under different contaminations and missing mechanisms.

                      p(x) ≡ 1                                                 p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2)
          ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1      ĝ1,bc,0  ĝ1,br,h,0  ĝ1,br,t,0  ĝ1,bc,1  ĝ1,br,h,1  ĝ1,br,t,1
    C0    0.0046   0.0047     0.0047     0.0020   0.0020     0.0020         0.0059   0.0060     0.0061     0.0030   0.0030     0.0030
    C1    0.5547   0.0085     0.0049     0.6356   0.0066     0.0021         0.8355   0.0107     0.0063     0.9703   0.0250     0.0178
    C2    0.9691   0.0089     0.0047     0.9897   0.0060     0.0020         1.1446   0.0105     0.0062     1.1960   0.0073     0.0030

    Table 14: mise of the estimators of the additive component g0,1 under different contaminations and missing mechanisms.

                      p(x) ≡ 1                                                 p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2)
          ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1      ĝ2,bc,0  ĝ2,br,h,0  ĝ2,br,t,0  ĝ2,bc,1  ĝ2,br,h,1  ĝ2,br,t,1
    C0    0.0025   0.0025     0.0025     0.0016   0.0016     0.0016         0.0035   0.0035     0.0036     0.0024   0.0024     0.0024
    C1    0.5332   0.0062     0.0027     0.6189   0.0081     0.0026         0.7802   0.0101     0.0038     0.8982   0.0226     0.0136
    C2    0.9298   0.0066     0.0026     0.9482   0.0055     0.0016         1.1950   0.0083     0.0037     1.2339   0.0067     0.0024

    Table 15: mise of the estimators of the additive component g0,2 under different contaminations and missing mechanisms.

                      p(x) ≡ 1                                                 p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2)
          ĝ3,bc,0  ĝ3,br,h,0  ĝ3,br,t,0  ĝ3,bc,1  ĝ3,br,h,1  ĝ3,br,t,1      ĝ3,bc,0  ĝ3,br,h,0  ĝ3,br,t,0  ĝ3,bc,1  ĝ3,br,h,1  ĝ3,br,t,1
    C0    0.0091   0.0092     0.0092     0.0042   0.0042     0.0042         0.0136   0.0141     0.0146     0.0082   0.0082     0.0082
    C1    0.5991   0.0137     0.0095     0.6741   0.0106     0.0052         0.9195   0.0273     0.0157     1.0679   0.0449     0.0337
    C2    1.0137   0.0155     0.0094     1.0007   0.0085     0.0042         1.2200   0.0207     0.0144     1.2256   0.0126     0.0082

    Table 16: mise of the estimators of the additive component g0,3 under different contaminations and missing mechanisms.

      p(x) ≡ 1: ĝ_{4,bc,0}  ĝ_{4,br,h,0}  ĝ_{4,br,t,0}  ĝ_{4,bc,1}  ĝ_{4,br,h,1}  ĝ_{4,br,t,1}
      p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
C0    0.0070  0.0070  0.0071  0.0036  0.0036  0.0036 | 0.0095  0.0098  0.0103  0.0058  0.0058  0.0058
C1    0.6818  0.0118  0.0072  0.7558  0.0097  0.0037 | 1.0890  0.0235  0.0127  1.2310  0.0583  0.0429
C2    1.0582  0.0125  0.0071  1.0592  0.0078  0.0036 | 1.3678  0.0159  0.0101  1.3877  0.0117  0.0061

Table 17: mise of the estimators of the additive component g_{0,4} under different contaminations and missing mechanisms.


                    p(x) ≡ 1: ĝ_{bc,0}  ĝ_{br,h,0}  ĝ_{br,t,0}  ĝ_{bc,1}  ĝ_{br,h,1}  ĝ_{br,t,1}
                    p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
mise(C1)/mise(C0)   594.6856  4.1927  1.0660  3263.3298  21.3232  1.9603 | 622.6124  4.9188  1.2865  2677.7184  47.1836  28.0937
mise(C2)/mise(C0)   392.7639  2.8789  1.0204  2068.0873   9.4793  1.0625 | 442.9360  2.7631  0.9876  1817.4052   7.8002   1.1449

Table 18: Ratio between the mise of the estimators of the additive component g_0 = µ_0 + Σ_{j=1}^4 g_{0,j} under the considered contaminations and under clean data for the two missing mechanisms.

                    p(x) ≡ 1: ĝ_{1,bc,0}  ĝ_{1,br,h,0}  ĝ_{1,br,t,0}  ĝ_{1,bc,1}  ĝ_{1,br,h,1}  ĝ_{1,br,t,1}
                    p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
mise(C1)/mise(C0)   119.9119  1.8225  1.0448  318.2129  3.3018  1.0502 | 142.1759  1.7651  1.0330  327.3855  8.4212  5.9965
mise(C2)/mise(C0)   209.4957  1.9119  1.0185  495.5000  3.0240  1.0185 | 194.7711  1.7403  1.0134  403.5116  2.4681  1.0207

Table 19: Ratio between the mise of the estimators of the additive component g_{0,1} under the considered contaminations and under clean data for the two missing mechanisms.

                    p(x) ≡ 1: ĝ_{2,bc,0}  ĝ_{2,br,h,0}  ĝ_{2,br,t,0}  ĝ_{2,bc,1}  ĝ_{2,br,h,1}  ĝ_{2,br,t,1}
                    p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
mise(C1)/mise(C0)   213.8436  2.4642  1.0692  385.1917  5.0106  1.6478 | 225.1335  2.8729  1.0599  379.1973  9.5277  5.7400
mise(C2)/mise(C0)   372.9278  2.6408  1.0309  590.1075  3.4041  1.0208 | 344.8200  2.3533  1.0324  520.8911  2.8161  1.0259

Table 20: Ratio between the mise of the estimators of the additive component g_{0,2} under the considered contaminations and under clean data for the two missing mechanisms.

                    p(x) ≡ 1: ĝ_{3,bc,0}  ĝ_{3,br,h,0}  ĝ_{3,br,t,0}  ĝ_{3,bc,1}  ĝ_{3,br,h,1}  ĝ_{3,br,t,1}
                    p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
mise(C1)/mise(C0)    66.0132  1.4880  1.0318  160.3085  2.5264  1.2368 |  67.6382  1.9308  1.0775  130.9593  5.5051  4.1322
mise(C2)/mise(C0)   111.7021  1.6804  1.0150  237.9769  2.0109  1.0087 |  89.7438  1.4627  0.9885  150.2898  1.5440  1.0080

Table 21: Ratio between the mise of the estimators of the additive component g_{0,3} under the considered contaminations and under clean data for the two missing mechanisms.

                    p(x) ≡ 1: ĝ_{4,bc,0}  ĝ_{4,br,h,0}  ĝ_{4,br,t,0}  ĝ_{4,bc,1}  ĝ_{4,br,h,1}  ĝ_{4,br,t,1}
                    p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
mise(C1)/mise(C0)    97.8712  1.6842  1.0244  211.8380  2.7222  1.0340 | 114.1378  2.4048  1.2307  212.2685  10.0466  7.3858
mise(C2)/mise(C0)   151.9140  1.7760  1.0115  296.8837  2.1873  1.0131 | 143.3664  1.6293  0.9839  239.2791   2.0083  1.0541

Table 22: Ratio between the mise of the estimators of the additive component g_{0,4} under the considered contaminations and under clean data for the two missing mechanisms.


      p(x) ≡ 1: ĝ_{bc,0}  ĝ_{br,h,0}  ĝ_{br,t,0}  ĝ_{bc,1}  ĝ_{br,h,1}  ĝ_{br,t,1}
      p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
C0    0.0123  0.0125  0.0125  0.0023  0.0023  0.0023 | 0.0135  0.0142  0.0148  0.0033  0.0033  0.0033
C1    7.2318  0.0497  0.0133  7.5283  0.0421  0.0027 | 8.3171  0.0481  0.0155  8.7854  0.0435  0.0040
C2    4.7574  0.0344  0.0128  4.7560  0.0203  0.0024 | 5.9049  0.0365  0.0147  5.9358  0.0215  0.0035

Table 23: Medise of the estimators of the regression function g_0 under different contaminations and missing mechanisms.
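Throughout these tables, mise and Medise denote the mean and the median, over the Monte Carlo replications, of the integrated squared error (ISE) of a fit, and the ratio tables report the value under contamination relative to the value under clean data. The sketch below illustrates how such summaries could be computed; the function names, grid, and toy "fits" are illustrative assumptions, not the paper's code:

```python
import numpy as np

def ise(g_hat, g_true, grid):
    # Integrated squared error of one fitted curve, approximated by a
    # Riemann sum over an equally spaced evaluation grid.
    dx = grid[1] - grid[0]
    return np.sum((g_hat - g_true) ** 2) * dx

def mise_medise(fits, g_true, grid):
    # fits: (n_replications, len(grid)) array with one estimated curve
    # per Monte Carlo replication; mise/Medise are the mean/median ISE.
    errors = np.array([ise(f, g_true, grid) for f in fits])
    return errors.mean(), np.median(errors)

# Toy illustration of a ratio such as mise(C1)/mise(C0): the same
# estimator evaluated on clean (C0) and contaminated (C1) replications.
rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 101)
g_true = np.sin(2 * np.pi * grid)
fits_c0 = g_true + 0.05 * rng.normal(size=(500, grid.size))  # clean
fits_c1 = g_true + 0.50 * rng.normal(size=(500, grid.size))  # contaminated
mise_c0, medise_c0 = mise_medise(fits_c0, g_true, grid)
mise_c1, medise_c1 = mise_medise(fits_c1, g_true, grid)
ratio = mise_c1 / mise_c0  # values far above 1 flag sensitivity to outliers
```

The median-based Medise is less affected than mise by a handful of aberrant replications, which is why both summaries are reported.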

      p(x) ≡ 1: ĝ_{1,bc,0}  ĝ_{1,br,h,0}  ĝ_{1,br,t,0}  ĝ_{1,bc,1}  ĝ_{1,br,h,1}  ĝ_{1,br,t,1}
      p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
C0    0.0040  0.0040  0.0040  0.0013  0.0013  0.0013 | 0.0050  0.0051  0.0052  0.0019  0.0019  0.0019
C1    0.5139  0.0077  0.0042  0.5909  0.0059  0.0014 | 0.7747  0.0099  0.0055  0.9060  0.0075  0.0021
C2    0.9560  0.0084  0.0041  0.9677  0.0056  0.0013 | 1.1128  0.0096  0.0053  1.1591  0.0061  0.0020

Table 24: Medise of the estimators of the additive component g_{0,1} under different contaminations and missing mechanisms.

      p(x) ≡ 1: ĝ_{2,bc,0}  ĝ_{2,br,h,0}  ĝ_{2,br,t,0}  ĝ_{2,bc,1}  ĝ_{2,br,h,1}  ĝ_{2,br,t,1}
      p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
C0    0.0020  0.0020  0.0020  0.0011  0.0011  0.0011 | 0.0028  0.0028  0.0029  0.0016  0.0016  0.0016
C1    0.4980  0.0056  0.0022  0.5879  0.0054  0.0012 | 0.7289  0.0071  0.0031  0.8413  0.0064  0.0018
C2    0.9108  0.0062  0.0021  0.9260  0.0050  0.0011 | 1.1789  0.0077  0.0031  1.2041  0.0059  0.0016

Table 25: Medise of the estimators of the additive component g_{0,2} under different contaminations and missing mechanisms.

      p(x) ≡ 1: ĝ_{3,bc,0}  ĝ_{3,br,h,0}  ĝ_{3,br,t,0}  ĝ_{3,bc,1}  ĝ_{3,br,h,1}  ĝ_{3,br,t,1}
      p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
C0    0.0076  0.0077  0.0077  0.0023  0.0023  0.0023 | 0.0102  0.0108  0.0111  0.0044  0.0044  0.0044
C1    0.5686  0.0124  0.0080  0.6297  0.0075  0.0024 | 0.8675  0.0167  0.0110  1.0010  0.0115  0.0046
C2    1.0046  0.0141  0.0079  0.9907  0.0069  0.0023 | 1.1809  0.0177  0.0110  1.1953  0.0096  0.0045

Table 26: Medise of the estimators of the additive component g_{0,3} under different contaminations and missing mechanisms.

      p(x) ≡ 1: ĝ_{4,bc,0}  ĝ_{4,br,h,0}  ĝ_{4,br,t,0}  ĝ_{4,bc,1}  ĝ_{4,br,h,1}  ĝ_{4,br,t,1}
      p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
C0    0.0056  0.0057  0.0058  0.0022  0.0022  0.0022 | 0.0073  0.0076  0.0080  0.0034  0.0034  0.0034
C1    0.6510  0.0106  0.0061  0.7153  0.0078  0.0023 | 1.0298  0.0141  0.0080  1.1628  0.0112  0.0037
C2    1.0484  0.0115  0.0059  1.0447  0.0066  0.0022 | 1.3321  0.0140  0.0079  1.3391  0.0084  0.0035

Table 27: Medise of the estimators of the additive component g_{0,4} under different contaminations and missing mechanisms.


                        p(x) ≡ 1: ĝ_{bc,0}  ĝ_{br,h,0}  ĝ_{br,t,0}  ĝ_{bc,1}  ĝ_{br,h,1}  ĝ_{br,t,1}
                        p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
Medise(C1)/Medise(C0)   587.9280  3.9901  1.0632  3284.7120  18.3599  1.1840 | 616.5264  3.3837  1.0457  2694.8344  13.3422  1.2335
Medise(C2)/Medise(C0)   386.7698  2.7623  1.0253  2075.0990   8.8790  1.0682 | 437.7162  2.5680  0.9951  1820.7418   6.6007  1.0631

Table 28: Ratio between the Medise of the estimators of the additive component g_0 under the considered contaminations and under clean data for the two missing mechanisms.

                        p(x) ≡ 1: ĝ_{1,bc,0}  ĝ_{1,br,h,0}  ĝ_{1,br,t,0}  ĝ_{1,bc,1}  ĝ_{1,br,h,1}  ĝ_{1,br,t,1}
                        p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
Medise(C1)/Medise(C0)   129.7275  1.9127  1.0513  460.2723  4.5320  1.0668 | 156.4803  1.9257  1.0649  466.3972  3.8856  1.0932
Medise(C2)/Medise(C0)   241.3087  2.0909  1.0125  753.7706  4.3051  1.0116 | 224.7727  1.8676  1.0267  596.6858  3.1300  1.0223

Table 29: Ratio between the Medise of the estimators of the additive component g_{0,1} under the considered contaminations and under clean data for the two missing mechanisms.

                        p(x) ≡ 1: ĝ_{2,bc,0}  ĝ_{2,br,h,0}  ĝ_{2,br,t,0}  ĝ_{2,bc,1}  ĝ_{2,br,h,1}  ĝ_{2,br,t,1}
                        p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
Medise(C1)/Medise(C0)   253.9908  2.8292  1.0822  557.0175  5.1005  1.0940 | 258.5182  2.4780  1.0665  524.3816  4.0351  1.1120
Medise(C2)/Medise(C0)   464.5554  3.1424  1.0488  877.3765  4.7631  1.0244 | 418.1190  2.7026  1.0618  750.5317  3.6802  1.0270

Table 30: Ratio between the Medise of the estimators of the additive component g_{0,2} under the considered contaminations and under clean data for the two missing mechanisms.

                        p(x) ≡ 1: ĝ_{3,bc,0}  ĝ_{3,br,h,0}  ĝ_{3,br,t,0}  ĝ_{3,bc,1}  ĝ_{3,br,h,1}  ĝ_{3,br,t,1}
                        p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
Medise(C1)/Medise(C0)    75.1423  1.6025  1.0372  273.7164  3.2475  1.0414 |  85.4013  1.5531  0.9944  226.6848  2.6012  1.0510
Medise(C2)/Medise(C0)   132.7657  1.8189  1.0282  430.6451  3.0177  1.0127 | 116.2487  1.6494  0.9913  270.6875  2.1641  1.0084

Table 31: Ratio between the Medise of the estimators of the additive component g_{0,3} under the considered contaminations and under clean data for the two missing mechanisms.

                        p(x) ≡ 1: ĝ_{4,bc,0}  ĝ_{4,br,h,0}  ĝ_{4,br,t,0}  ĝ_{4,bc,1}  ĝ_{4,br,h,1}  ĝ_{4,br,t,1}
                        p4(x) = 0.4 + 0.5 cos²(x1x3 + 0.2): same six columns
Medise(C1)/Medise(C0)   115.3211  1.8521  1.0528  331.9275  3.6263  1.0442 | 140.1222  1.8468  0.9912  344.1223  3.3286  1.0869
Medise(C2)/Medise(C0)   185.7104  2.0115  1.0223  484.7585  3.0544  1.0200 | 181.2518  1.8305  0.9832  396.2957  2.4830  1.0267

Table 32: Ratio between the Medise of the estimators of the additive component g_{0,4} under the considered contaminations and under clean data for the two missing mechanisms.


5 Empirical Influence

    A well-known measure of robustness of an estimat