on locally most powerful sequential rank...

Full Terms & Conditions of access and use can be found athttp://www.tandfonline.com/action/journalInformation?journalCode=lsqa20

Download by: [Ustav informatiky AV CR] Date: 15 March 2017, At: 00:19

Sequential AnalysisDesign Methods and Applications

ISSN: 0747-4946 (Print) 1532-4176 (Online) Journal homepage: http://www.tandfonline.com/loi/lsqa20

On locally most powerful sequential rank tests

Jan Kalina

To cite this article: Jan Kalina (2017) On locally most powerful sequential rank tests, SequentialAnalysis, 36:1, 111-125, DOI: 10.1080/07474946.2016.1275501

To link to this article: http://dx.doi.org/10.1080/07474946.2016.1275501

Published online: 07 Mar 2017.

Submit your article to this journal

Article views: 5

View related articles

View Crossmark data

http://www.tandfonline.com/action/journalInformation?journalCode=lsqa20

http://www.tandfonline.com/loi/lsqa20

http://www.tandfonline.com/action/showCitFormats?doi=10.1080/07474946.2016.1275501

http://dx.doi.org/10.1080/07474946.2016.1275501

http://www.tandfonline.com/action/authorSubmission?journalCode=lsqa20&show=instructions

http://www.tandfonline.com/action/authorSubmission?journalCode=lsqa20&show=instructions

http://www.tandfonline.com/doi/mlt/10.1080/07474946.2016.1275501

http://www.tandfonline.com/doi/mlt/10.1080/07474946.2016.1275501

http://crossmark.crossref.org/dialog/?doi=10.1080/07474946.2016.1275501&domain=pdf&date_stamp=2017-03-07

http://crossmark.crossref.org/dialog/?doi=10.1080/07474946.2016.1275501&domain=pdf&date_stamp=2017-03-07

SEQUENTIAL ANALYSIS

2017, VOL. 36, NO. 1, 111–125

http://dx.doi.org/10.1080/07474946.2016.1275501

On locally most powerful sequential rank tests

Jan Kalinaa,b

aInstituteof Computer Scienceof theCzechAcademyof Sciences, Prague, CzechRepublic; bInstituteof InformationTheory and Automation of the Czech Academy of Sciences, Prague, Czech Republic

ABSTRACT

Sequential ranks are de�ned as ranks of such observations, which havebeenobserved so far in a sequential design. This article studies hypothe-ses tests based on sequential ranks for di�erent situations. The locallymost powerful sequential rank test is derived for the hypothesis ofrandomness against a general alternative, including the two-sampledi�erence in location or regression in location as special cases for thealternative hypothesis. Further, the locally most powerful sequentialrank tests are derived for the one-sample problem and for indepen-dence of two samples in an analogous spirit as the classical results ofHájek and Šidák (1967) for (classical) ranks. The locally most powerfultests are derived for a �xed sample size and the results bring argumentsin favor of existing tests. In addition, we propose a sequential testingprocedure based on these statistics of the locally most powerful tests.Principles of such sequential testing are explained on the two-sampleWilcoxon test based on sequential ranks.

ARTICLE HISTORY

Received 20 June 2016Revised 13 November 2016,6 December 2016Accepted 19 December 2016

KEYWORDS

Nonparametric tests;sequential ranks; stoppingvariable

MATHEMATICS SUBJECT

CLASSIFICATIONS

62G10; 62G30; 62L10

1. Introduction

The history of nonparametric sequential tests was initiated byMiller, who proposed a sequen-tial signed-rank test (Miller, 1970) and a two-sample sequential Wilcoxon test (Miller, 1973).Berk (1975) derived a locallymost powerful sequential rank test, and its structurewas recentlycharacterized by Novikov and Novikov (2010). Lombard investigated a sequential test basedon Kendall’s tau statistic (Lombard, 1977), invariance principles for a sequential test basedon classical ranks (Lombard, 1981), and rank tests for the change-point problem (Lombard,1983). Inverting sequential tests based on ranks allows de�ning R-estimators investigated byJurecková and Sen (1982). Other nonparametric tools for sequential analysis were overviewedbyHušková (1991), and recently they found applications in nonparametric tests in the contextof control charts (Zhou et al., 2016).

The simple linear rank statistic (say Mn) based on sequential ranks �rst appeared in thepioneering work of Mason (1981), who considered Mn as a useful tool for deriving limittheorems for tests based on classical ranks and derived the asymptotic normality ofMn usingmartingale theory for n → ∞. He proved that Mn and the simple linear rank statistic Tn

(based on classical ranks) are asymptotically equivalent in the quadratic mean under the nullhypothesis that X1, . . . ,Xn are independent and identically distributed (i.i.d.). Further, hefollowed Hájek and Šidák (1967) to state that they are asymptotically equivalent also undercontiguous alternatives of regression in location.

CONTACT Jan Kalina [email protected] Institute of Computer Science of the Czech Academy of Sciences, PodVodárenskou veží 2, 182 07 Prague 8, Czech Republic.

Recommended by Edit Gombay

© 2017 Taylor & Francis

http://dx.doi.org/10.1080/07474946.2016.1275501

mailto:[email protected]

112 J. KALINA

Mason (1984) proved the Pitman e�ciency of Tn and Mn to be the same, and he showeda test based on Tn to be more powerful in terms of the Bahadur e�ciency. Lombard andMason (1985) also derived invariance principles forMn. ThoughMn is constructed in thesepapers in a natural and intuitive way (but used entirely for a �xed sample size), there havebeen no e�orts to show any optimal property ofMn among all possible test statistics based onsequential ranks.

This article �lls the gap of the optimality results for various hypothesis tests based onsequential ranks. It has the following structure. A�er reviewing basic properties of sequentialranks in Section 2, the locally most powerful tests among all tests based on sequential ranksare derived. A rather general �rst result of Section 3 is formulated for various special casesthroughout Sections 4–7. Independent of these results, a proof for the locally most powerfultest based on sequential ranks is given for the one-sample test (Section 8) and for the test ofindependence of two random samples (Section 9). Though these results are valid for a �xednumber of observations n, we describe a sequential construction of tests in Section 10. Finally,a discussion in Section 11 concludes the article.

2. Sequential ranks

Let X = (X1, . . . ,Xn)T represent a random vector with values in (Rn,Bn), where B is the

system of Borel sets on R. Arranging X in ascending order, we obtain the vector of orderstatistics

Xn(1) ≤ Xn

(2) ≤ · · · ≤ Xn(n), (2.1)

where the upper index stresses that these are computed from n random variables. If no twoobservations are equal, the rank Ri of the ith observation is de�ned by Xi = X(Ri) for i =

1, . . . , n and the vector of ranks will be denoted by R = (R1, . . . ,Rn)T .Sequential ranks R∗ = (R11, . . . ,Rnn)T are de�ned as ranks computed from the data

observed so far, denoting Rkk the rank of Xi among the values X1, . . . ,Xi for any i = 1, . . . , n(Khmaladze, 2011). Throughout the article, we use the notation xi (or rii) in case we want tostress that the values of Xi (or Rii) are considered as �xed.

It holds that Xi = X(Rii) for any i = 1, . . . , n; in other words, Rii denotes the number ofvalues among r11, . . . , rnn smaller or equal to xi for i = 1, . . . , n. Barndor�-Nielsen (1963)proved the favorable property that R11, . . . ,Rnn are independent and

P(Rii = rii) =1

i, i = 1, . . . , n. (2.2)

Another useful property is devoted to the conditional expectation of a measurable function tof random variables X1, . . . ,Xn. Conditioning on the sequential ranks r∗ = (r11, . . . , rnn)T ,it holds that

E [t(X1, . . . ,Xn)|R11 = r11, . . . ,Rnn = rnn] = Et(X1(R11), . . . ,X

n(Rnn)

). (2.3)

If the sample size n is �xed, sequential ranks can be uniquely determined from the vectorof classical ranks in a trivial way. The reversed procedure is also possible but perhaps lessobvious. Thus, sequential ranks allowus to reconstruct the original observations conditionallyon the knowledge of (2.1). The computations corresponding to these one-to-one mappingsare formulated as Algorithms 2.1 and 2.2.

SEQUENTIAL ANALYSIS 113

Algorithm 2.1. Computation of classical ranks from sequential ranks.Require: R11, . . . ,RnnEnsure: R1, . . . ,RnI = {1, . . . , n}for j = 1 to n do

m := max {Rii; i ∈ I}p := min {i; i ∈ I,Rii = m}

Rp := jI := I\

{

p}

end for

Algorithm 2.2. Computation of the original observations from arranged observations andsequential ranks.

Require: Xn(1) ≤ · · · ≤ Xn

(n) and R11, . . . ,RnnEnsure: X1, . . . ,Xn

I = {1, . . . , n}for j = 1 to n do

m := max {Rii; i ∈ I}p := min {i; i ∈ I,Rii = m}

Xp := Xn(j)

I := I\{

p}

end for

The classical theory of rank tests was developed by Hájek and Šidák (1967) and extendedin Hájek et al. (1999). Standard monographs on sequential nonparametrics (Sen [1981] orGhosh and Sen [1981]) studied sequential hypotheses tests based on statistics computedfrom classical ranks. These are recalculated a�er each new observation is added. However,hypotheses tests based on sequential ranks were not considered in these monographs.

In this article, we study hypotheses tests based on sequential ranks for various situations,which are shown to be locally most powerful among all tests based on sequential ranks. Thesetests will be called locally most powerful sequential ranks tests.

De�nition 2.1. Let m(Q) be a measure of distance of alternative Q ∈ H1 from the nullhypothesis H0. The α−test 80 depending on the data only through the sequential ranks iscalled the locally most powerful sequential rank α−test of H0 against H1, if for any othertest 8 depending on the data only through sequential ranks there exists ε > 0 such that thepowers of the test satisfy β80 ≥ β8 for each Q with the property 0 < m(Q) < ε.

3. Test against a general alternative

Let us assume i.i.d. random variables X1, . . . ,Xn to be observed for n < ∞. Let X1 havea density with respect to the Lebesgue measure, so that there is a zero probability of twoobservations attaining the same value. Let c1, c2, . . ., denote a known sequence of regressionconstants. The joint density of the random vector (X1, . . . ,Xn)

T with �xed n under the null

114 J. KALINA

hypothesis will be denoted by p(x1, . . . , xn) and under the alternative hypothesis dependingon a parameter 1 > 0 by qn1(x1, . . . , xn).

For a �xed n < ∞, the null hypothesis of interest formulated in a very general way,

H0 : p(x1, . . . , xn) =

n∏

i=1

d(xi, 0), (3.1)

for a family of densities d(x, θ), where θ ∈ 2, 2 is an open interval containing 0 and d(x, θ)

is absolutely continuous in θ for almost every x. H0 will be tested against the alternativehypothesis

H1 : qn1(x1, . . . , xn) =

n∏

i=1

d(xi,1ci), 1 > 0. (3.2)

The null hypothesis can be also formulated as H0 : 1 = 0. We now derive the locally mostpowerful sequential rank test of H0 against the general alternative H1.

Theorem 3.1. Let the condition II.4.8.A1 of Hájek and Šidák (1967) be ful�lled. Then the testwith the critical region

n∑

i=1

ciEd(Xi

(Rii), 0)

d(Xi(Rii)

, 0)≥ kα , (3.3)

where

d(x, θ) =∂

∂θd(x, θ), (3.4)

is the locally most powerful sequential rank test of H0 against (3.2) at level α.

Proof. The condition of the assumptions, which is also denoted as 3.4.8.A1 of Hájek et al.(1999), ensures the existence of

d(x, 0) = limθ→0

1

θ[d(x, θ) − d(x, 0)] (3.5)

for every x. Replacing now the data by the sequential ranks, the joint likelihood of the dataunder H0 equals

P(R11 = r11, . . . ,Rnn = rnn) =1

n!(3.6)

for the vector (r11, . . . , rnn)T ful�lling rii ∈ {1, . . . , i} for i = 1, . . . , n. Under the alternativehypothesis with a �xed 1, Qn

1 will denote the probability distribution corresponding to thedensity (3.2). This can be expressed as

Qn1(R∗ = r∗) =

∫

· · ·

∫

R∗=r∗

qn1(x1, . . . , xn)dx1 · · · dxn. (3.7)

Neyman-Pearson’s lemma gives themost powerful test ofH0 against (3.2) based on sequentialranks, which rejects H0 if n!Qn

1(R∗ = r∗) exceeds the critical value. Therefore, we studyQn

1(R∗ = r∗) for the system of alternatives {Qn1}, where 1 → ∞.


From (3.2) it follows that (3.7) equals

1

n!+ 1

n∑

i=1

∫

· · ·

∫

R∗=r∗

d(xi,1ci) − d(xi, 0)

1

n∏

j=i+1

d(xj, 0)i−1∏

k=1

d(xk,1ck)

dx1 · · · dxn. (3.8)

Because

lim sup1→0

∫

· · ·

∫

R∗=r∗


1

n∏

j=i+1

d(xj, 0)i−1∏

k=1

d(xk,1ck)

dx1 · · · dxn

≤ |ci|

∫ ∞

−∞

|d(xi, 0)|dxi (3.9)

for i = 1, . . . , n, the convergence theorem II.4.2 of Hájek and Šidák (1967) can be applied toobtain

lim1→0

n∑

i=1

∫

· · ·

∫

R∗=r∗


1

n∏

j=i+1

d(xj, 0)i−1∏

k=1

d(xk,1ck)

dx1 · · · dxn

=

n∑

i=1

∫

· · ·

∫

R∗=r∗

cid(xi, 0)∏

j6=i

d(xj, 0)

dx1 · · · dxn

=

n∑

i=1

ci

∫

· · ·

∫

R∗=r∗

d(xi, 0)

d(xi, 0)

n∏

j=1

d(xj, 0)

dx1 · · · dxn

=1

n!

n∑

i=1

ciE

[

d(Xi, 0)

d(Xk, 0)

∣

∣

∣

∣

∣

R11 = r11, . . . ,Rnn = rnn

]

, (3.10)

using properties of the conditional expectation.Now using (2.3) we obtain that (3.9) equals

1

n!

n∑

i=1

ciE

[

d(Xi(Rii)

, 0)

d(Xi(Rii)

, 0)

]

, (3.11)

where the ith summand depends on X1, . . . ,Xi. (3.11) can be interpreted as the linearsequential-rank statistic. So we get an approximation for Qn

1(R∗ = r∗) by

1

n!+ 1

1

n!

n∑

i=1

ciE

[

d(Xi(Rii)

, 0)

d(Xi(Rii)

, 0)

]

(3.12)

for 1 → ∞. For this limit behavior the linear sequential-rank statistic (3.11) arranges thesequential ranks r11, . . . , rnn in the same way as Qn

1, which is at the same time the locallymost powerful sequential rank test of H0 against H1. This completes the proof.

4. Test of H0 against regression in location

The null hypothesis of randomness

H0 : p(x1, . . . , xn) =

n∏

i=1

f (xi) (4.1)

116 J. KALINA

and the alternative hypothesis of regression in location in the form

H1 : qn1(x1, . . . , xn) =

n∏

i=1

f (xi − 1ci), 1 > 0, (4.2)

are special cases of (3.1) and (3.2) of the previous section. Let us de�ne

F(x) =

∫ x

−∞

f (y)dy, (4.3)

its inverse F−1(u) = inf{x; F(x) ≥ u}, and the score function ϕ by

ϕ(u, f ) =f ′(F−1(u))

f (F−1(u)), 0 < u < 1. (4.4)

Let us assume a random sample of n random variables Un1 , . . . ,U

nn uniform on [0, 1] and

denote their ith-order statistic by Un(i).

Let us assume f to be absolutely continuous with∫ ∞

−∞

|f ′(x)|dx < ∞ (4.5)

to have well-de�ned scores for a �xed n < ∞

an(i, f ) = Eϕ(Un(i), f ) = E

[

−f ′

f(F−1(Un

(i)))

]

= E

[

−f ′

f(Xn

(i))

]

, i = 1, . . . , n. (4.6)

Theorem4.1. The locally most powerful sequential rank test of H0 against the system {qn1, 1 >

0} with qn1 de�ned by (4.2) at level α has the critical regionn∑

i=1

ciai(Rii, f ) ≥ kα (4.7)

assuming (4.5).

This is a consequence of Theorem 3.1. The test statistic (3.3) for the general test can beexpressed as

n∑

i=1

ciE

[

−f ′

f(Xi

(Rii))

]

. (4.8)

The statistic (4.8) can be expressed asn∑

i=1

ciE

[

−f ′

f(F−1(U i

(Rii)))

]

=

n∑

i=1

ciE[

ϕ(U i(Rii)

, f )]

=

n∑

i=1

ciai(Rii, f ) (4.9)

by applying (4.6), because Xi(Rii)

has the same distribution as F−1(U i(Rii)

) for each i.

5. Test of H0 against two samples di�ering in location

Let Z1,Z2, . . . ,Zn denote a random sample, which is obtained as a pooled sample of twoindependent sequences X1, . . . ,Xm and Y1, . . . ,Yn−m. The null hypothesis

H0 : p(z1, . . . , zn) =

n∏

i=1

f (zi) (5.1)


is tested against the alternative

H1 : qn1 =

n∏

i=1

f (zi − 1ci), 1 > 0, (5.2)

with a special choice of constants, namely, ci = 1 if Zi comes from the second sample and ci =

0 if Zi comes from the �rst sample. This alternative corresponds to the variables X1,X2, . . .,each with density f (x) and variables Y1,Y2, . . ., each with density f (y − 1).

Then for a �xed n < ∞ and random sample Z1, . . . ,Zn, the test statistic of Theorem 4.1 isused with the special choice of c1, . . . , cn. The sums in (4.8) and (4.9) are evaluated only overthe second sample.

The locally most powerful sequential rank test is a special case of (4.7) and the test statisticis equal to

n∑

i=1; Y

E

[

−f ′

f

(

Zi(Rii)

)

]

=

n∑

i=1; Y

E

[

−f ′

f(F−1(U i

(Rii)))

]

=

n∑

i=1; Y

Eϕ(U i(Rii)

, f ) =

n∑

i=1; Y

ai(Rii, f ).

(5.3)It is convenient to replace the scores (4.6) by approximate scores ϕ(EUn

(i), f ). Then byapplying that

EUn(i) =

i

n + 1, (5.4)

we obtain an approximation for the test statistic (5.3) in the form

n∑

i=1; Y

ϕ(EU i(Rii)

, f ) =

n∑

i=1; Y

ϕ

(

Riii + 1

, f

)

=

n∑

i=1; Y

[

−f ′(F−1)

f (F−1)

(

Riii + 1

)]

. (5.5)

Special cases include the two-sample Wilcoxon test

W =

n∑

i=1; Y

Riii + 1

(5.6)

with the approximate scores corresponding to the logistic distribution and the median test

n∑

i=1; Y

sgn

(

Rii −i + 1

2

)

(5.7)

derived with the approximate scores corresponding to the Laplace distribution.

6. Test of H0 against regression in scale

Regression in scale

H1 : qn1(x1, . . . , xn) = exp

(

−1

n∑

i=1

ci

)

n∏

i=1

f [(xi − µ)e−1ci], 1 > 0 (6.1)

is a special case of the general alternative (3.2) with known regression constants c1, . . . , cnand an arbitrary shi� parameterµ ∈ R. The locally most powerful sequential rank test ofH0

against (6.1) follows from Theorem 3.1.

118 J. KALINA

The score function ϕ1 will be de�ned by means of the distribution function F correspond-ing to the density f in the form

ϕ1(u, f ) = −1 − F−1(u)f ′(F−1(u))

f (F−1(u))(6.2)

and the corresponding scores are de�ned as

a1n(i, f ) = Eϕ1(Un(i), f ). (6.3)

Theorem6.1. The locally most powerful sequential rank test of H0 against the system {qn1, 1 >

0} with qn1 de�ned by (6.1) at level α has the critical region

n∑

i=1

cia1i(Rii, f ) ≥ kα (6.4)

assuming∫ ∞

−∞

|xf ′(x)|dx < ∞. (6.5)

Proof. Based onTheorem 3.1, the test statistic is a special case of (3.3), which can be expressedas

n∑

i=1

ciE

[

−1 − F−1(U i(Rii)

)f ′(F−1)

f (F−1)(U i

(Rii))

]

=

n∑

i=1

ciEϕ1(Ui(Rii)

, f ) =

n∑

i=1

cia1i(Rii, f ), (6.6)

which concludes the proof.

By approximating the scores (6.3) by ϕ1(EUn(i), f ), we obtain an approximation to the test

statistic of (6.4) in the formn∑

i=1

ciϕ1(EUi(Rii)

, f ) =

n∑

i=1

ciϕ1

(

Riii + 1

, f

)

=

n∑

i=1

ci

[

−1 − F−1(

Riii + 1

)

f ′(F−1)

f (F−1)

(

Riii + 1

)]

.

(6.7)

7. Test of H0 against two samples di�ering in scale

Let Z1,Z2, . . . ,Zn denote a random sample, which is obtained as a pooled sample of twoindependent sequences X1, . . . ,Xm and Y1, . . . ,Yn−m. The joint density is assumed in theform

H1 : qn1(z1, . . . , zn) = exp(−m1)

n∏

i=1; X

f [(zi − µ)e−1]

n∏

i=1; Y

f [(zi − µ)], 1 > 0, (7.1)

where the products are considered over the �rst and second sample, respectively. Here, f is aknown density and µ an arbitrary shi� parameter.

The test statistic of the locally most powerful sequential rank test is expressed inTheorem 6.1 assuming (6.5) and is equal to

n∑

i=1; Y

a1i(Rii, f ). (7.2)


An approximation based on (6.7) replaces the test statistic by

n∑

i=1; Y

[

−1 − F−1(

Riii + 1

)

f ′(F−1)

f (F−1)

(

Riii + 1

)]

. (7.3)

8. Test for the one-sample problem

We assume i.i.d. random variables X1, . . . ,Xn. Their sequential ranks will be denoted by(R+

11, . . . ,R+nn)

T , where R+ii is the rank of |Xi| among |X1|, . . . , |Xi| for i = 1, . . . , n. The vector

v = (v1, . . . , vn)T will stand for the vector of signs of X1, . . . ,Xn.Both the null hypothesis

H+0 : p(x1, . . . , xn) =

n∏

i=1

f (xi) (8.1)

and the alternative

H1 : qn1(x1, . . . , xn) =

n∏

i=1

f (xi − 1), 1 > 0, (8.2)

require f to be a known symmetric density satisfying f (x) = f (−x), x ∈ R.The score function ϕ+ will be de�ned as

ϕ+(u, f ) = ϕ

(

1

2+

1

2u, f

)

(8.3)

and the corresponding scores

a+n (i, f ) = Eϕ+(Un

(i), f ). (8.4)

Theorem8.1. The locallymost powerful sequential rank test ofH+0 against the system {qn1, 1 >


n∑

i=1

via+i (R+

ii , f ) ≥ kα (8.5)

assuming (6.5).

Proof. Under H+0 it holds that

P[sgnX = v, (R+11, . . . ,R

+nn)

T = r+] =1

2nn!(8.6)

due to the independence between sgnX and (R+11, . . . ,R

+nn)

T . The Neyman-Pearson lemmaallows to construct the most powerful test based on sgnX and (R+

11, . . . ,R+nn)

T as

Qn1[sgnX = v, (R+

11, . . . ,R+nn)

T = r+]

P[sgnX = v, (R+11, . . . ,R

+nn)

T = r+]= 2nn!Qn

1[sgnX = v, (R+11, . . . ,R

+nn)

T = r+],

(8.7)

120 J. KALINA

where Qn1 denotes the probability distribution corresponding to the density (8.2). Following

the steps of Hájek and Šidák (1967), we compute

lim1→0

1

1

[

2nn!Qn1

{

sgnX = v, (R+11, . . . ,R

+nn)

T = (r+1 , . . . , r+n )T

}

− 1]

=

n∑

i=1

viE

[

−f ′(|Xi|)

f (|Xi|)

∣

∣R+11 = r+1 , . . . ,R

+nn = r+n

]

=

n∑

i=1

viE

[

−f ′

f

(

|X|i(R+

ii )

)

]

=

n∑

i=1

viEϕ+(U i

(R+ii ), f ), (8.8)

where |X|i(k) denotes the kth-order statistic among |X1|, . . . , |Xi|. Therefore, the test statistic

in (8.5) is for 1 → 0 equivalent to the most powerful sequential rank test, so (8.5) is thelocally most powerful sequential rank test, which was to be proven.

The scores (8.4) can be approximated by ϕ+(EUn(i), f ). Then the test statistic of (8.5) can

be approximated byn∑

i=1

vi ϕ+(EU i

(R+ii ), f ) =

n∑

i=1

vi ϕ+

(

R+ii

i + 1, f

)

=

n∑

i=1

vi ϕ

(

1

2+

Rii2(i + 1)

, f

)

. (8.9)

9. Test of independence

We consider i.i.d. random variables X1, . . . ,Xn and Y1, . . . ,Yn. We denote the sequentialranks in the �rst and second samples by R∗ = (R11, . . . ,Rnn)T and Q∗ = (Q11, . . . ,Qnn)

T ,respectively.

The null hypothesis

H∗0 : p(x1, y1, . . . , xn, yn) =

n∏

i=1

f (xi)g(yi) (9.1)

will be tested against the alternative that Xi = X∗i + 1Zi and Yi = Y∗

i + 1Zi, where X∗i , Y

∗i ,

and Zi are mutually independent, 1 > 0, and X∗i and Y∗

i have a speci�ed distribution. Thisalternative can be formally expressed as

H1 : qn1(x1, y1, . . . , xn, yn) =

n∏

i=1

h1(xi, yi), 1 > 0, (9.2)

where

h1(x, y) =

∫ ∞

−∞

f (x − 1z)g(y − 1z)dM(z) (9.3)

with an arbitrary distribution functionM. The score function (4.4) and scores (4.6) are usedalso in this context.

Theorem9.1. The locallymost powerful sequential rank test of H∗0 against the system {qn1, 1 >


n∑

i=1

ai(Rii, f ) ai(Qii, g) ≥ kα . (9.4)


Proof. Starting to express the most powerful sequential rank test based on the Neyman-Pearson lemma, we obtain

lim1→0

1

12

[

(n!)2Qn1(R11, . . . ,Rnn,Q11, . . . ,Qnn) − 1

]

= σ 2(n!)2n∑

i=1

∫

· · ·

∫

R∗=r∗, Q∗=q∗

[

f ′(xi)g′(yi)

f (xi)g(yi)

n∏

k=1

f (xk)g(yk)

]

dx1 · · · dxndy1 · · · dyn, (9.5)

where σ 2 is a constant corresponding to the arbitrary variables Z1, . . . ,Zn and r∗ =

(r11, . . . , rnn)T and q∗ = (q11, . . . , qnn)T are observed values of the sequential ranks of thetwo samples.

Following further steps of Hájek and Šidák (1967), we express (9.5) as

σ 2n∑

i=1

n!

∫

· · ·

∫

R∗=r∗

f ′(xi)

f (xi)

n∏

k=1

f (xk)dx1 · · · dxn

n!

∫

· · ·

∫

Q∗=q∗

g′(yi)

g(yi)

n∏

k=1

g(yk)dy1 · · · dyn

= σ 2n∑

i=1

E

[

f ′

f(X(Rii)) |R11, . . . ,Rnn

]

E

[

g′

g(Y(Qii)) |Q11, . . . ,Qnn

]

= σ 2n∑

i=1

E

[

−f ′

f(F−1(U i

(Rii)))

]

E

[

−g′

g(G−1(U i

(Qii)))

]

, (9.6)

where F andG are distribution functions corresponding to densities f and g, respectively. Thistest statistic is further equivalent to

n∑

i=1

Eϕ(U i(Rii)

, f ) Eϕ(U i(Qii)

, g), (9.7)

which equals the test statistic of (9.4) and concludes the proof.

As a special case we can express the test statistic of (9.4) for the linear score function ϕ;such a test statistic

n∑

i=1

RiiQii

(i + 1)2(9.8)

can be interpreted as a sequential-rank analogy of Spearman correlation coe�cient.

10. A sequential testing procedure

The tests investigated in the previous sections considered a �xed number of observations. Inthis section, we propose a sequential version of the two-sample test statistic W (5.6) basedon sequential ranks. Sequential tests for all situations in Sections 4–7 can be performed in ananalogous, straightforward way.

We describe the sequential testing procedure on the example of the two-sample Wilcoxontest. In the whole section, we use the notation of Section 5 and consider H0 : 1 = 0 againstthe one-sided alternative H1 : 1 < 0. Because the test depends on the order of Xs in the

122 J. KALINA

pooled sample, we assume a �xed order X−Y−X−Y−X−Y · · · . In other words, Z2k−1 = Xk

and Z2k = Yk for k = 1, 2, . . . .The observations are obtained sequentially, and we require the maximal sample size not

to exceed a given NS. Our approach allows rejecting H0 earlier compared to the test with the�xed size NS. The stopping time is a random variable required not to exceed a �xed integerNS. The computation is explained in Algorithm 10.1, using critical values of the followingde�nition.

De�nition 10.1. Weconsider a �xed levelα of the test and a �xed value of themaximal samplesize N. Let us consider an even n. We de�ne the critical value rαn for n < N as the null criticalvalue of the permutation test based on the test statistic

NS∑

i=1; Y

riii + 1

(10.1)

evaluated asR223

+R445

+ · · · +Rnnn + 1

+n + 2

n + 3+ · · · +

NS

NS + 1. (10.2)

In (10.2), the sequential ranks R22,R44, . . . ,Rnn are supplemented by the least favorablevalues for the sequential ranks

R22,R44, . . . ,Rnn, n + 2, . . . ,NS. (10.3)

This allows us to determine whether the test remains to reject H0 always for any reallyobserved data Zn+1, . . . ,ZNS . The null critical value is found exact (or can approximated bysimulations) for the exact number of observations equal toNS. In this way, the test is ensuredto hold the probability of type I error on the selected level α.

The earliest possible stopping time for various values of NS is shown in Figure 1 for thesituation described at the beginning of this section. Let us explain how the �gure was obtained

Algorithm 10.1. Exact sequential two-sample test based on sequential ranks.Require: Integer NS

n := 0repeat

n := n + 2observe Zn−1 (from Xs) and Zn (from Ys)computeW (5.6)compute EW = n/4compute the critical value rαnif W − EW < rαn then

reject H0

stopend if

until n = NS

accept H0


Figure 1. Horizontal: Values of the upper bound NS on the number of observations. Vertical: The smallestpossible sample size of the test for which it is possible to reject H0 by the sequential testing procedure.

on the particular value of NS = 20. Let us say that the �xed number n of observations hasbeen already observed. First, we �nd the critical value of the one-sided test based onW−EWfor n = 20 to be equal to −1.30.

Let us now say that n = 14 observations have been already observed and the vector ofsequential ranks of Ys is equal to the extreme case 1 − 1 − 1 − · · · − 1. Then, in the mostextreme situation that observations 15 to 20 extend the vector of sequential ranks to 1 − 1 −

1− · · · − 1− 16− 18− 20, we will haveW = 3.86 andW − EW = −1.14 is not signi�cant.Later, if n = 16, we obtain W = 2.98 and EW = −2.02, thus yielding already a signi�cantresult. Thus, this most extreme situation leads us to the conclusion that n = 16 is the smallestpossible stopping time for a given NS = 20.

11. Discussion

Previous research considered such test statistics based on sequential ranks, which wereconstructed in an intuitive way without investigating their optimality. As we document it inSection 1, the optimal tests have been unknown and this article �lls exactly this gap of theoptimality results for tests based on sequential ranks.

To be speci�c, we derive the locally most powerful tests among all tests based on sequentialranks for a variety of situations. A general version of the test of Section 3 is formulated for twoimportant special cases, namely, regression in location (Section 4) and regression in scale(Section 6). These results are further formulated for the special case of two-sample tests,namely, for location (Section 5) and scale (Section 7). Further tests, not using the generalresult of Section 3, include the proof of the locally most powerful one-sample test (Section 8)and test of independence (Section 9) based on sequential ranks.

Our main result can be summarized as follows. Classical locally most powerful rankstatistics are known to be obtained as projections of the locally most powerful (parametric)statistics into the space of linear rank statistics. This does not hold for sequential rankstatistics.

124 J. KALINA

Thus, the current article brings arguments in favor of some of the previously used teststatistic based on sequential ranks. Particularly, we prove the linear sequential rank statisticof Lombard andMason (1985) to be locallymost powerful among all tests based on sequentialranks in Section 4. On the other hand, a previous statistic of Mason (1981) uses a di�erentstandardization of the regression constants. In Section 10,we consider theWilcoxon test basedon sequential ranks, which is again equal to the previous Wilcoxon test obtained as a specialcase of Lombard and Mason (1985). These considerations are however derived for a �xedsample size.

If the sample size is �xed, sequential ranks are fully equivalent to classical ranks. Inaddition, there is no advantage in computing sequential ranks in terms of the computationalcomplexity. The computation of R11, . . . ,Rnn has the complexity of order of n log n, which isthe same as that of R1, . . . ,Rn.

Let us recall properties of the test statistics based on sequential ranks for a �xed sam-ple size. Though previous works considered sequential ranks as a tool allowing to derivetheoretical properties of nonparametric hypothesis tests based on classical (nonsequential)ranks in a standard (nonsequential) setting, we have the impression that papers investigatingsequential ranks ignored these properties as well as the unique relationship between classicaland sequential ranks (Algorithms 2.1 and 2.2). Therefore, we �nd it worth mentioning thatsome of the properties may be suitable only in some particular situations. In any case, weunderstand the sequential ranks as a tool principally di�erent from classical ranks requiringa di�erent (“sequential”) way of thinking and interpreting the results.We will now discuss theproperties of the test statistics based on sequential ranks and for the sake of clarity; some areexplained based on the example of the two-sample Wilcoxon test statisticW (5.6):• Locally most powerful property among all tests based on sequential ranks.• A simpli�ed theoretical reasoning due to independence of R11, . . . ,Rnn (Mason, 1981).• The computationhas the same computational complexity as analogous standard tests based

on classical ranks.• W depends on the order of Ys in the pooled sample.• W depends on the order of observations even within each of the two individual samples.• Though EW does not not depend on the order of Xs in the pooled sample,

varW = varn∑

i=1; Y

Riii + 1

=

n∑

i=1; Y

varRiii + 1

(11.1)

depends on this design of the study.In Section 10, we present a straightforward way how to use the optimal test statistics in asequential testing. Our approach considers a random stopping time that is not allowed toexceed a given upper boundNS. The test statistics keeps the properties as described above fora �xed sample size. In addition, we �nd these properties speci�c for the sequential approachimportant:• It is especially suitable if acquiring new observations is expensive.• It is able to come to the conclusion more quickly compared to a test based on classical

ranks, because the weights of the �rst observations are larger compared to later ones.• It is computationally more complicated than classical approaches.Wemay conclude that the advantages of the sequential rank test are revealed in the sequentialtesting procedure, which makes the approach favorable compared to testing with a �xedsample size.


Acknowledgments

The author would like to thank Jana Jurecková, the anonymous referees, an associate editor, andthe editor-in-chief for their time and constructive advice.

Funding

This work was supported by the Neuron Fund for Support of Science and by the Czech ScienceFoundation project number 17-07384S.

References

Barndor�-Nielsen, O. (1963). On the Limit Behaviour of Extreme Order Statistics, Annals ofMathematical Statistics 34: 992–1002.

Berk, R. H. (1975). Locally Most Powerful Sequential Tests, Annals of Statistics 3: 373–381.Ghosh, B. K. and Sen, P. K. (1981).Handbook of Sequential Analysis, edited volume, New York: Dekker.Hájek, J. and Šidák, Z. (1967). Theory of Rank Tests, Prague: Academia.Hájek, J., Šidák, Z., and Sen, P. K. (1999). Theory of Rank Tests, 2nd edition, San Diego: Academic Press.Hušková, M. (1991). Sequentially Adaptive Nonparametric Procedures, in Handbook of Sequential

Analysis, B. K. Ghosh and P. K. Sen, eds., pp. 459–474, New York: Dekker.Jurecková, J. and Sen, P. K. (1982).M-Estimators and L-Estimators of Location: Uniform Integrability

andAsymptotic Risk-E�cient Sequential Versions,Communications in Statistics - Sequential Analysis1: 27–56.

Khmaladze, E. V. (2011). Sequential Ranks, in International Encyclopedia of Statistical Science,M. Lovric, ed., pp. 1308–1311, Berlin: Springer.

Lombard, F. (1977). Sequential Procedures Based on Kendall’s Tau Statistics, South African StatisticalJournal 11: 79–87.

Lombard, F. (1981). An Invariance Principle for Sequential Nonparametric Test Statistics underContiguous Alternatives, South African Statistical Journal 15: 129–152.

Lombard, F. (1983). Asymptotic Distributions of Rank Statistics in the Change-Point Problem, SouthAfrican Statistical Journal 17: 83–105.

Lombard, F. and Mason, D. M. (1985). Limit Theorems for Generalized Sequential Rank Statistics,Zeitschri� für Wahrscheinlichkeitstheorie und Verwandte Gebiete 70: 395–410.

Mason, D. M. (1981). On the Use of a Statistic Based on Sequential Ranks to Prove Limit Theorems forSimple Linear Rank Statistics, Annals of Statistics 9: 424–436.

Mason, D. M. (1984). A Bahadur E�ciency Comparison between One and Two Sample Rank Statisticsand Their Sequential Rank Statistic Analogues, Journal of Multivariate Analysis 14: 181–200.

Miller, R. G. (1970). Sequential Signed-Rank Test, Journal of American Statistical Association 65:1554–1561.

Miller, R. G. (1973).ATwo Sample SequentialWilcoxonTest, Technical Report 46, Stanford:Departmentof Statistics, Stanford University.

Novikov, A. and Novikov, P. (2010). Locally Most Powerful Sequential Tests of a Simple Hypothesis vs.One-Sided Alternatives, Journal of Statistical Planning and Inference 140: 750–765.

Sen, P. K. (1981). Sequential Nonparametrics: Invariance Principles and Statistical Inference, New York:Wiley.

Siegel, S. and Tukey, J. W. (1960). A Nonparametric Sum of Ranks Procedure for Relative Spread inUnpaired Samples, Journal of American Statistical Association 55: 429–445.

Zhou,M., Zhou, Q., andGeng,W. (2016). ANewNonparametric Control ChartMonitoring Variability,Quality and Reliability Engineering International 32: 2471–2479.

on locally most powerful sequential rank...

Documents