barnett v. (1976) the ordering of multivariate data.pdf

The Ordering of Multivariate Data Author(s): V. Barnett Reviewed work(s): Source: Journal of the Royal Statistical Society. Series A (General), Vol. 139, No. 3 (1976), pp. 318-355 Published by: Wiley-Blackwell for the Royal Statistical Society Stable URL: http://www.jstor.org/stable/2344839 . Accessed: 11/10/2012 19:22 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp . JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. . Wiley-Blackwell and Royal Statistical Society are collaborating with JSTOR to digitize, preserve and extend access to Journal of the Royal Statistical Society. Series A (General). http://www.jstor.org



J. R. Statist. Soc. A (1976), 139, Part 3, p. 318

The Ordering of Multivariate Data

By V. BARNETT, University of Sheffield†

[Read before the ROYAL STATISTICAL SOCIETY on Wednesday, April 28th, 1976, the President, Miss STELLA V. CUNLIFFE, in the Chair]

SUMMARY In spite of the lack of a natural basis for ordering multivariate data, we encounter an extension of univariate order concepts such as medians, extremes and ranges to the higher dimensional situation. Also, much multivariate theory, and method, exploits order properties in the data or model. We examine the role of ordering in these descriptive and methodological aspects of multivariate analysis by means of a four-fold classification of sub-ordering principles.

Keywords: MULTIVARIATE DATA; ORDER STATISTICS; PARTIAL ORDERING; SUB-ORDERING; EXTREME; MEDIAN; RANGE

1. INTRODUCTION

KENDALL (1966) is not alone in observing that "order properties . . . exist only in one dimension". His remark was made in the context of multivariate discriminant analysis, and classification. Bell and Haller (1969), in studying tests of bivariate symmetry, claim to show that "there is no 'natural' concept of rank" for bivariate data. In both cases, the lack of any obvious and unambiguous means of fully ordering, or ranking, observations in a multivariate sample appears as an obstacle to the development of statistical method: in particular to the extension to higher dimensions of areas of application, methodological advantages or general properties of univariate order statistics.

This is not to say that the idea of order or rank is entirely absent from the multivariate scene. Indeed, a substantial effort has been directed to defining some sorts of higher dimensional analogues of univariate order concepts, and much of multivariate statistical method employs (perhaps only implicitly) various types of sub- (less than total) ordering principle. This paper will review, and classify, the work in these two categories and thus has a wider brief than the thorny, possibly irresolvable, issue of how to define higher dimensional order statistics.

At the intuitive level we recognize some rough, primitive notion of order or rank in relation to, say, a bivariate scatter diagram. Consider the two random bivariate samples of 50 observations presented as Figs 1 and 2. Certain observations might appear to be "extreme": they "surprise us" by their apparent separation from the data mass. The interrelationships between the sample points (by appearing to be more "ordered" on a SW-NE axis in Fig. 2 than in Fig. 1) may suggest a greater degree of association in the second sample. Such ideas can to a limited extent be formalized and we shall propose (Section 2) a four-fold classification of sub-ordering principles which might serve in this respect, and one or more of which can be seen clearly represented in various attempts to extend univariate order concepts, or in particular results in multivariate analysis or multivariate distribution theory. We shall term these sub-ordering principles: marginal ordering, reduced (aggregate) ordering, partial ordering and conditional (sequential) ordering.

The material is presented as follows. A brief review of the nature and methods of application of univariate order statistics serves as a backcloth against which to set the higher dimensional scene. Section 2 presents a four-fold classification of sub-ordering principles for multivariate (or multi-sample) data. These various principles can be seen to be used (sometimes overtly, more often implicitly) in attempts that have been made to produce direct multivariate analogues of univariate order statistics concepts, and to underlie many results in multivariate analysis and distribution theory. The paper traces these links. In Section 3 we consider multivariate medians, ranges, extremes and order statistics. In Section 4 we proceed to examine the way in which the sub-ordering principles enter into a variety of multivariate (and multi-sample) analysis procedures. A fairly detailed review is given covering such topics as outliers, discriminant analysis, mixtures of multivariate distributions, tests of symmetry, tolerance regions, distribution theory for univariate samples of dependent observations, multivariate and multi-sample non-parametric methods, with brief comment on informal large-scale data screening ("data analysis") methods including probability plotting techniques, cluster analysis and multidimensional scaling. The final Section focuses on a single topic: the assessment of association (specifically correlation) between the components of a bivariate random variable. We see in particular how a certain conditional ordering principle using the idea of concomitants (David, 1973) leads to new and attractive estimates of the correlation coefficient in a bivariate normal distribution, in a "limited information" context.

[Figures: bivariate scatter diagrams with axes x1 and x2.] FIG. 1. Sample 1 (n = 50, p = 2). FIG. 2. Sample 2 (n = 50, p = 2).

† Work conducted in part in the Stat. Lab., SUNY, Buffalo, USA with the support of Public Health Service Grant No. CA-10810-08.

Most of the work of the paper is expository; the classification and attribution of the different sub-ordering principles appears to be new. An extensive list of references provides a fairly comprehensive coverage of relevant work published in the international statistical journals over the past 20 years, together with earlier key references of historical or motivational interest.

1.1. Univariate Order Statistics

The ordering of a univariate random sample as a means for clear representation of the sample import has long been an important principle. Such work has snowballed to the stage where we now find built up a vast statistical methodology, and associated distribution theory, concerned with ordered samples. This is well described in the text by David (1970), and the earlier set of edited papers and tables by Sarhan and Greenberg (1962).

For a univariate sample the ordering principle is clear and unambiguous. If $x_1, x_2, \ldots, x_n$ is a random sample of observations of some continuous random variable X, we can place them in increasing order, as $x_{(1)} < x_{(2)} < \ldots < x_{(n)}$.

The ordered values $x_{(i)}$ can be regarded as realized values of dependent random variables $X_{(i)}$ $(i = 1, 2, \ldots, n)$ which are termed the order statistics of X for a sample of size n. Properties of the sample, or examination of the structure of X, can be approached in terms of the ordered observations $x_{(i)}$, or the order statistics $X_{(i)}$.

[An alternative method of ordering the sample would be to order the observations in relation to their absolute deviation, or "distance", from some reference point, a. If a < x(1) the ordering is the same as that just described. If a is in the body of the data, e.g. at the median, quite a different ordering arises. Such "distance orderings" are seldom directly considered for univariate data, but we will see that they have an inevitable appeal in the multivariate context.]
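
As a minimal illustrative sketch (not part of the original exposition; the function names and the data are hypothetical), the conventional ordering and the "distance ordering" just described might be contrasted in Python as follows.

```python
# Illustrative sketch only: function names and data are hypothetical.

def order_statistics(sample):
    """Return the ordered sample x(1) <= x(2) <= ... <= x(n)."""
    return sorted(sample)


def distance_ordering(sample, a):
    """Order the observations by absolute deviation |x - a| from a reference point a."""
    return sorted(sample, key=lambda x: abs(x - a))


sample = [4.1, 0.3, 2.7, 9.6, 5.0]  # made-up observations

print(order_statistics(sample))          # conventional increasing order
print(distance_ordering(sample, a=0.0))  # a < x(1): agrees with the conventional order
print(distance_ordering(sample, a=4.1))  # a at the sample median: a different order
```

With the reference point below the smallest observation the two orderings coincide; with the reference point at the median the ordering works outwards from the centre of the data.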

As a framework on which to consider potential (and actual) multivariate extensions of principle, application and method it is useful to distinguish different emphases in the univariate situation.

Natural interest in order

At the most basic level we find ordered observations used to express intrinsic natural features of a set of data, reflecting extremeness, contiguity, variability or the effect of external contamination. Extreme values can be crucially important as expressions of the worst or best that may be encountered (minimum temperatures or maximum flood levels in meteorological work; minimum lifetimes for components in industrial reliability studies). They may pinpoint outliers, indicating foreign influences or errors in the data assembly process. Their separation (that is, the range) is a natural measure of variability. The median provides a simple assessment of location. The very form of a problem may naturally censor a set of data: a medical study, or a reliability trial, may have to be terminated before all patients, or pieces of equipment, have reached the critical stage of behaviour. Here we are forced to work with a set of lower order statistics. It is inevitable, natural.

Exploitation of order for speed, ease or efficiency

Alternatively, a set of observations may be deliberately ordered to give ease or advantage in the statistical analysis, rather than to reflect natural factors in the data. With this emphasis we encounter optimum linear order statistics estimation of scale and location parameters; short-cut methods for parameter estimation or testing based on range, mid-range, etc.; tests for outliers; methods based on contrived censoring, such as elimination or adjustment of extreme values as a robustness aid (trimmed, Winsorized, means); probability plotting methods for model validation and parameter estimation; use of ordered residuals in analysis of variance; and the whole range of non-parametric procedures based on ranks, or signs.
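
By way of illustration only, the two forms of contrived censoring mentioned above (elimination and adjustment of extreme values) can be sketched as follows. The function names, the symmetric choice of k observations at each end, and the data are assumptions made for the example.

```python
# Illustrative sketch only: names, the symmetric trimming rule and data are assumptions.

def trimmed_mean(sample, k):
    """Mean after eliminating the k smallest and k largest ordered observations."""
    xs = sorted(sample)
    if 2 * k >= len(xs):
        raise ValueError("k is too large for this sample size")
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)


def winsorized_mean(sample, k):
    """Mean after replacing the k smallest (largest) values by x(k+1) (x(n-k))."""
    xs = sorted(sample)
    n = len(xs)
    if 2 * k >= n:
        raise ValueError("k is too large for this sample size")
    low, high = xs[k], xs[n - k - 1]
    adjusted = [min(max(x, low), high) for x in xs]
    return sum(adjusted) / n


sample = [1, 2, 3, 4, 100]  # made-up data with one wild value
print(sum(sample) / len(sample))   # ordinary mean, dominated by the extreme
print(trimmed_mean(sample, 1))     # extremes eliminated
print(winsorized_mean(sample, 1))  # extremes pulled in to their neighbours
```

Both estimates depend on the data only through the ordered sample, which is the sense in which order is being exploited rather than merely observed.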

Distribution theory

Whether the ordering of the data serves a natural purpose, or is contrived for ease or efficiency, we need a vast battery of distribution theory results for order statistics. This aspect has been widely studied. Much is known of the forms of, and interrelationships between, moments of order statistics both for general distributions and in specific cases. One particular area concerns limit law results for extreme values. In the univariate situation there are just three forms of limiting extreme value distribution; this is not so in higher dimensions. See Section 3.3. Distribution theory results for non-independent univariate samples can be viewed within a multivariate context and will be considered later (Section 1.2, Section 4.1) in some detail.


The broad classification of univariate order statistics study into two different emphases from the point of view of application, and the corresponding background distribution theory, will prove a useful basis for examining order concepts in multivariate data. Section 3, for example, examines direct parallels to the "natural" emphasis, and considers some of the associated distribution theory, whilst the later sections are more concerned with the exploitation aspect. But first we must consider what might be meant by "ordering" of multivariate data, and to facilitate this discussion some comment is needed on notation and terminology.

1.2. Notation for Multivariate Samples or Distributions

We shall denote by $x_1, x_2, \ldots, x_n$ a random sample of n observations of a p-dimensional random variable, X. Corresponding with any component random variable, $X_i$ of X we have a component random sample $x_{i1}, x_{i2}, \ldots, x_{in}$ $(i = 1, 2, \ldots, p)$. The component samples for different values of i are in general non-independent, reflecting the multivariate structure of X. Properties of the component random samples, or random variables, will be termed marginal. Thus moments of $X_i$ are the ith marginal moments of X; the set of ordered observations $x_{i(1)} < x_{i(2)} < \ldots < x_{i(n)}$ is the ith marginal ordered sample and corresponding random variables $X_{i(1)}, X_{i(2)}, \ldots, X_{i(n)}$ are the ith marginal order statistics.

We can fit in this framework certain special cases.

(i) The $X_i$ are independent. Here the marginal samples are independent random samples of observations of the independent random variables, $X_i$. If the $X_i$ can be observed separately there is no reason why all marginal sample sizes need be the same. We shall denote them $n_1, n_2, \ldots, n_p$. Data of this type are better termed multi-sample rather than multivariate. The $X_i$ may on occasions be identically distributed.

(ii) Internal comparisons. Usually we will be interested in comparisons from one observation, $x_i$, to another, $x_j$. We may seek to "order" the $x_i$ (i = 1, 2, ..., p) or look for "extreme values" or "outliers". The corresponding marginal concepts are clear, and individually may be handled by univariate order statistics results. Interrelationships between marginal properties need, however, to exploit the multivariate structure of X. A special case arises with n = 1 (a single multivariate observation, x), when we order over the component observations; thus x can be ordered

$$x_{(1)} < x_{(2)} < \ldots < x_{(p)}$$

with corresponding internal order statistics $X_{(1)}, X_{(2)}, \ldots, X_{(p)}$. If the $X_i$ are independent and identically distributed this specialization yields nothing more than order statistics for a univariate random sample of size p. If the $X_i$ are dependent, the internal order statistics are what have been termed "order statistics for dependent variables (processes)". Much has been written on these, and it is relevant to our study (if somewhat off the main stream). See Section 4.1.

(iii) Several multivariate samples. Occasionally we shall wish to consider, simultaneously, samples from more than one multivariate distribution. To avoid confusion it is best to use distinct symbols for the different distributions and samples. Thus $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$ may be two random samples of different multivariate random variables X and Y. Within-sample (and particularly, marginal) factors will use the above designations based on x and y separately.

2. SUB-ORDERING PRINCIPLES FOR MULTIVARIATE AND MULTI-SAMPLE DATA

Accepting the futility of seeking any simple, unambiguous, universally agreeable, total ordering of the n sample values $x_1, x_2, \ldots, x_n$ among themselves, we limit our interest to ways in which some restricted form of ordering of multivariate (or multi-sample) data is feasible and advantageous. The end product of any restricted ordering principle, which we term sub-ordering, is an ordering or ranking of one or more summary features of the observations (usually quantitative and uni-dimensional) considered either individually or in combination. Sometimes we may merely achieve a relative order comparison (in declaring, say, that one observation is more, or less, extreme than another in some limited respect), or may conclude that some set of observations is of "different order" from another with no formal intra-set order comparison.

It appears convenient to distinguish four particular sub-ordering principles for multivariate data. They are not entirely mutually exclusive. Sometimes an order-based method of data study is to be found which clearly incorporates more than one of the sub-ordering principles; occasionally the method might seem to be classifiable under more than one heading, or possibly has dubious pedigree under any of the four. But the sub-division of sub-ordering principle does seem to partition the field of order-based multivariate study fairly well, either in terms of basic principle or in terms of practical interest.

The four sub-ordering principles for multivariate data are marginal ordering, reduced (aggregate) ordering, partial ordering and conditional (sequential) ordering. We shall refer to these as M-ordering, R-ordering, P-ordering and C-ordering, respectively. Separate consideration is given to some ordering methods in multi-sample data.

2.1. Marginal Ordering (M-ordering)

As the name suggests, ordering or ranking here takes place within one or more of the marginal samples. Interest may centre on the individual ordered marginal samples as an aid to inference about the marginal distributions; certain order features on the marginal samples may be considered in combination (as in global, or component-based, concepts of median, range, extremes, etc.; see Section 3), or marginal ordering may serve as a prelude to some further sub-ordering principle (as in Bennett, 1966, on confidence intervals for ratios of marginal medians; or various correlation estimates, see Section 5.1). In addition, Singh (1960) considers marginal ordering due to censoring, in the estimation of parameters of the multivariate normal distribution from truncated, and censored, samples. Studies of joint distributional properties of marginal order statistics include Mustafi (1969), who, for bivariate samples, considers recurrence relationships for the joint distribution function

$$G_{r,s}(x_1, x_2) = P(X_{1(r)} < x_1,\ X_{2(s)} < x_2)$$

of any two marginal order statistics from a sample of size n, and Galambos (1975) on the p-dimensional joint distribution of $(X_{1(r_1)}, X_{2(r_2)}, \ldots, X_{p(r_p)})$.

More examples of the use of M-ordering will be encountered when we discuss (in Section 4) specific aspects of multivariate theory or method.
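
A minimal sketch of M-ordering may help to fix ideas. It is an illustration only: the function names and the three bivariate observations are hypothetical. Each component sample is ordered separately, and a component-based summary (here the vector of marginal medians) is then assembled from the marginal orderings.

```python
# Illustrative sketch only: function names and data are hypothetical.
from statistics import median


def marginal_ordered_samples(data):
    """data: list of p-component observations. Returns the p marginal ordered samples."""
    return [sorted(component) for component in zip(*data)]


def marginal_median(data):
    """Vector whose ith entry is the median of the ith marginal sample."""
    return tuple(median(component) for component in zip(*data))


data = [(1.0, 9.0), (2.0, 1.0), (3.0, 5.0)]  # made-up bivariate sample, n = 3

print(marginal_ordered_samples(data))  # each margin ordered on its own
print(marginal_median(data))           # need not coincide with any observation
```

Here the vector of marginal medians, (2.0, 5.0), is not one of the three observations: marginal ordering discards the pairing between components, which is why it is only a sub-ordering of the multivariate sample.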

We frequently find M-ordering applied to transformations of the data set. The ordering of particular linear combinations (projections) of component values or of radial distances or angular deviations from some fixed point or direction are exploited (see, for example, Blumen, 1958; Weiss, 1960; Vincze, 1961; Bhattacharyya and Johnson, 1969; Andrews et al., 1972; Russell and Puri, 1974; for particular applications discussed in more detail later). The initial transformation may take the form of a component analysis (Gnanadesikan and Kettenring, 1972, in relation to outlier identification). Then again, the sample points may each be preliminarily reduced to a single value by some appropriate (non-linear) metric, perhaps of a generalized distance type. However, most examples of ordering after initial transformation of the data are best considered under the next heading of reduced (aggregate) ordering since their intention is not to represent (joint) marginal behaviour but to summarily express overall characteristics for the multivariate data set.

2.2. Reduced (Aggregate) Ordering (R-ordering)

With this type of ordering each multivariate observation is reduced to a single value by means of some combination of the component sample values. The metric employed is frequently of the "generalized distance" type: x being represented by a quadratic function $(x - a)' \Gamma^{-1} (x - a)$ for some convenient choice of a and $\Gamma$; a may be the origin, the mean or the set of marginal medians (sample or population); $\Gamma$ may be the identity matrix, I, the population or sample dispersion matrix ($\Sigma$ or S) or perhaps the diagonal matrix of (sample or population) component variances. See, for example, Wilk and Gnanadesikan (1964) on informal graphical assessment of multi-response data.

In contrast to marginal ordering, the aim is to effect some sort of restricted overall ordering of the multivariate sample. This may be explicit (as, for example, in much of the work on extremes, described in Section 3.3, or outliers, Section 4.2) or merely implicit in a particular method of multivariate analysis, as we shall observe throughout Section 4.

Generalized distance measures figure widely in statistical analysis, going back to Pearson (1900). Their use as a basis for (reduced type) sub-ordering is but a small area of application. Setting $\Gamma = I$ has a primitive appeal in merely ordering the Euclidean distance of the sample points from some "centre" a, possibly the natural origin (a = 0). Its disadvantages include the disregard of second-order moment structure (and location), its frequent lack of appropriate probabilistic interpretation and its failure to reduce to the conventional ordering principle in one dimension (it yields here the type of distance ordering described at the beginning of Section 1.1).

If we knew the distributional form of X there would be some appeal in ordering in relation to probability concentration contours. For a normal distribution this adds respectability to ordering based on the generalized distance $(x - \mu)' \Sigma^{-1} (x - \mu)$ where $\mu$ is the mean vector of X (or the sample equivalent). For the data in Figs 1 and 2, the underlying distributions were in fact standardized normal with correlation 0 and 0.5, respectively. Thus concentric circles, or ellipses, would be the corresponding bases on which to assess order or extremeness. See Figs 3 and 4. But such an approach has limited utility. The distribution of X will typically be unknown and cannot be used (except, perhaps, for identification of outliers) as a basis for expressing order or extremeness.

[Figures: probability contours superimposed on the two samples.] FIG. 3. Probability circles: sample 1. FIG. 4. Probability ellipses: sample 2.
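
The effect of the choice of dispersion matrix on a generalized-distance ordering can be seen in a small sketch. This is an illustration under stated assumptions, not a procedure from the literature cited: it is restricted to the bivariate case so that the 2 x 2 inverse can be written out explicitly, and the function names and the two test points are hypothetical.

```python
# Illustrative sketch only: bivariate case, hypothetical names and points.

def generalized_distance(x, a, gamma):
    """Quadratic form (x - a)' gamma^{-1} (x - a) for a 2 x 2 matrix gamma."""
    (g11, g12), (g21, g22) = gamma
    det = g11 * g22 - g12 * g21
    if det == 0:
        raise ValueError("gamma must be non-singular")
    d1, d2 = x[0] - a[0], x[1] - a[1]
    # The inverse of gamma is [[g22, -g12], [-g21, g11]] / det.
    return (g22 * d1 * d1 - (g12 + g21) * d1 * d2 + g11 * d2 * d2) / det


def r_order(points, a, gamma):
    """Reduced ordering: sort the observations by generalized distance from a."""
    return sorted(points, key=lambda x: generalized_distance(x, a, gamma))


origin = (0.0, 0.0)
identity = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]  # unit variances, correlation 0.5
points = [(2.0, -2.0), (2.0, 2.0)]

print([generalized_distance(p, origin, identity) for p in points])    # equal distances
print([generalized_distance(p, origin, correlated) for p in points])  # no longer equal
print(r_order(points, origin, correlated))
```

With the identity matrix the two points are equally extreme (circular contours); with correlation 0.5 the point lying across the SW-NE axis is judged the more extreme (elliptical contours).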

A slightly different version of R-ordering involves the accumulated (or aggregated) distance of each point from all the other points, rather than its distance from a single fixed point. This principle is implicit in Wilks' (1963) test for the identification of multivariate outliers (Section 4.2) and is indirectly applied in a particular definition of the multivariate sample median (see Section 3.1).
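
One simple reading of this aggregate principle is sketched below. It is an assumption-laden illustration: plain Euclidean distance is used as the metric, which is not the statistic employed in Wilks' test, and the function names and data are hypothetical.

```python
# Illustrative sketch only: Euclidean metric, hypothetical names and data.
import math


def aggregate_distance(point, points):
    """Sum of Euclidean distances from one observation to all observations in the sample."""
    return sum(math.dist(point, other) for other in points)


def aggregate_order(points):
    """Order observations from most central to most outlying by aggregate distance."""
    return sorted(points, key=lambda p: aggregate_distance(p, points))


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]  # made-up sample
ordered = aggregate_order(points)
print(ordered[0])   # observation with the smallest accumulated distance
print(ordered[-1])  # observation most remote from the rest
```

The observation with the smallest accumulated distance behaves as a sample-based "centre", and the largest values flag candidates for extremeness, without any fixed reference point having to be chosen.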

A novel possibility for reduced ordering might be found in the Fourier-type reduction of Andrews (1972), where x is represented by $x_1/\sqrt{2} + x_2 \sin t + x_3 \cos t + x_4 \sin 2t + x_5 \cos 2t + \ldots$ for some choice (or range) of values of t. More speculatively, might we even contemplate "ordering" Chernoff's faces? Chernoff (1973) suggests reducing a multivariate observation x to a caricature of the human face for ease of assimilation of x, or distinction between different x. He proposes no ordering (other than rough grouping) of the faces. But why not? Could we not imagine them subsequently ordered in terms of "beauty", or "intelligence" or "malevolence"? The scope for personal judgment is endless!
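
The Fourier-type reduction can be written out directly. The following is a minimal sketch in which the function name, the choice of t and the data are hypothetical; the series itself is the one quoted above from Andrews (1972).

```python
# Illustrative sketch only: function name, value of t and data are hypothetical.
import math


def andrews_value(x, t):
    """x1/sqrt(2) + x2 sin t + x3 cos t + x4 sin 2t + x5 cos 2t + ... at a given t."""
    total = x[0] / math.sqrt(2)
    for j, xj in enumerate(x[1:], start=1):
        harmonic = (j + 1) // 2            # 1, 1, 2, 2, 3, 3, ...
        if j % 2 == 1:
            total += xj * math.sin(harmonic * t)
        else:
            total += xj * math.cos(harmonic * t)
    return total


observations = [(1.0, 2.0, 3.0, 4.0, 5.0), (5.0, 4.0, 3.0, 2.0, 1.0)]  # made-up data
t = 0.0
reduced = sorted(observations, key=lambda x: andrews_value(x, t))
print([andrews_value(x, t) for x in observations])
print(reduced)  # a reduced ordering for this particular t
```

Any ordering obtained in this way is specific to the value of t chosen; a different t gives a different linear combination of the components and possibly a different order.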

2.3. Partial Ordering (P-ordering)

The emphasis here moves away from consideration of the marginal samples or individual multivariate observations to consider overall interrelational properties in the total deployment of the sample. The way in which observations fall into different regions of the sample space, where such partitioning may be based on one of several possible principles, is used to distinguish between groups of observations with regard to order, rank or extremeness. The partitioning method in P-ordering may involve marginal properties or reduction metrics but the aim is usually restricted to limited order distinctions for the whole set of data. Specific forms are a basis for particular methods of multivariate analysis, for example, in cluster analysis or discriminant analysis (considered in more detail below: Section 4).

P-ordering results either in a basis for dividing the sample into distinct groups of different order, one to another, with no internal distinction or for making statements of relative order in which any one observation is ranked with respect to the others. We shall consider various examples of such a principle.

Figs 5 and 6 again show the random samples 1 and 2. In each case the convex hull has been constructed by drawing the minimum convex set which encloses all sample points. Those points on the perimeter are designated c-order group 1, and discarded. The convex hull of the residue is formed; those on the perimeter are group 2. The process is repeated, thus providing an entirely sample-based method of dividing the data into order groups; the lower the group number, the more "extreme" the observation. Such a method of partial ordering is analogous to Tukey's proposal for "peeling" a multivariate sample (as the generalization of "trimming" a univariate sample) but it does not seem to have been discussed in detail, and is rather attractive. It might, for example, lead to a simple method for estimating correlation from a bivariate sample. Correlation estimates based merely on linear combinations of the differences in the number of line segments of positive and negative slope for different group perimeters (perhaps inversely weighted by group number) have intuitive appeal which is supported on empirical study. Distributional results corresponding to convex hull ordering may not be too tractable but some studies have been made. Efron (1965) discusses the expected value of the area, perimeter, probability content and number of sides of the overall convex hull. His results, in the form of integrals, are manageable for uniform and normal distributions. Fisher (1969) considers all possible limiting shapes for the convex hull as $n \to \infty$, and refers to earlier work by himself (1966), Geffroy (1961, on the asymptotic behaviour of the convex hull for multivariate normal data), and Renyi and Sulanke (1963, 1964). See also Carnal (1970). Quesenberry and Gessaman (1968) refer to the use of tolerance regions to form convex hulls about sample points in their study of multivariate non-parametric discriminant analysis (see Section 4.3), extending the suggestion by Kendall (1966) that an observation be assigned to one population or another on the basis of whether it falls in the convex hull of samples from one or the other population.
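
The successive convex hull construction described above can be sketched for bivariate data as follows. This is an illustration under stated assumptions rather than a definitive implementation: the hull is found by the standard monotone chain method, only hull vertices are treated as lying "on the perimeter" (points that happen to fall exactly on an edge are left for a later group), coincident points are merged, and the function names and data are hypothetical.

```python
# Illustrative sketch only: bivariate data, hypothetical names and points.

def cross(o, a, b):
    """Cross product of vectors o->a and o->b; positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of the convex hull, counter-clockwise (monotone chain method)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def c_order_groups(points):
    """Map each observation to its c-order group (1 = outermost hull)."""
    remaining = set(points)
    groups = {}
    group = 1
    while remaining:
        hull = convex_hull(list(remaining))
        for vertex in hull:
            groups[vertex] = group
        remaining -= set(hull)   # discard the perimeter and repeat on the residue
        group += 1
    return groups


sample = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1), (3, 1), (2, 3), (2, 2)]
groups = c_order_groups(sample)
for point in sample:
    print(point, groups[point])
```

For this made-up sample the four corners form group 1, the inner triangle group 2, and the central point group 3, so that lower group numbers correspond to more "extreme" observations.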

Perhaps the simplest basis for P-ordering lies in examining the numbers of sample points which lie (in the bivariate case) within prescribed regions of different shape (rectangles or circles, say). Cohen (1955) at one stage suggests manipulating a circle of fixed radius to include the maximum number of sample points, as a basis for estimating a bivariate mean. He recommends that the radius should be chosen with the aim of enclosing about 70-90 per cent of the sample. Loftsgaarden and Quesenberry (1965) present a non-parametric estimate of a multivariate density function using counts of observations in regions defined in terms of Euclidean distance; Elkins (1968), for a similar purpose, compares cubical and spherical regions and concludes that the optimal side or diameter is O(bn-4) where b depends on the true density function; Murthy (1966) uses "p-dimensional windows". Naus (1965) discusses the determination of $p(m \mid n, u, v)$: the probability that a rectangle with sides u, v, contains at least m out of n observations from a bivariate uniform distribution.

[Figures: successive convex hulls drawn on the two samples.] FIG. 5. Convex hulls: sample 1. FIG. 6. Convex hulls: sample 2.

An important principle for P-ordering is found in the idea of layer ranks, due to Barndorff-Nielsen and Sobel (1966). Illustrating this idea for bivariate data, the observation $x_i$ is said to be rth layer, first quadrant-admissible (r = 1, 2, ...) if there are precisely $r - 1$ observations in the sample with both co-ordinates in excess of those of $x_i$. For any r, each observation is either rth layer, first quadrant-admissible, or rth layer, first quadrant-inadmissible. For the second, third and fourth quadrants, analogous attributions are defined for each $x_i$ in terms of other observations whose co-ordinates are (appropriately) greater or smaller than those of $x_i$. (For p > 2, the definitions extend naturally in terms of orthants, but only rth layer, first orthant-admissibility is considered.) For the special case where the marginal distributions are independent, Barndorff-Nielsen and Sobel discuss the distribution of the distribution-free quantities $A_r(q)$ $(q = 1, 2, 3, 4;\ r = 1, 2, \ldots, n)$: the numbers of rth layer, qth quadrant-admissible points in a bivariate sample of size n. They also consider analogous results for p > 2 in the first orthant only. Whereas Barndorff-Nielsen and Sobel are concerned only with distributional properties with no specific application in mind, the idea of layer ranks does seem implicit in a simple form in the work of Siddiqui (1960) on bivariate extremes, and it is used explicitly by Bhattacharyya and Johnson (1970) in a "layer rank test" for equality of two bivariate distributions where the alternative hypothesis is one-sided with "large" observations more likely from one distribution rather than the other. This distribution-free test for ordered shifts in bivariate or multivariate distributions uses the notion of "stochastic ordering for two-dimensional vectors". X, with distribution function F, is strongly (weakly) smaller than Y, with distribution function G, if $F \ne G$ and, for all (x, y), $F \ge G$ and $\bar{F} \le \bar{G}$ ($F \ge G$ or $\bar{F} \le \bar{G}$). The test statistic is based on sums, over each sample, of layer ranks in the combined sample.
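
The first-quadrant layer rank defined above is easily computed by direct counting. The sketch below is illustrative only: the function names and the four bivariate observations are hypothetical, ties are assumed absent, and only the first quadrant is treated.

```python
# Illustrative sketch only: first quadrant, no ties, hypothetical names and data.
from collections import Counter


def first_quadrant_layer_ranks(points):
    """Layer rank r of each point: 1 + number of points exceeding it in both co-ordinates."""
    return [
        1 + sum(1 for q in points if q[0] > p[0] and q[1] > p[1])
        for p in points
    ]


def layer_counts(points):
    """Number of rth layer, first quadrant-admissible points for each r that occurs."""
    return Counter(first_quadrant_layer_ranks(points))


points = [(1, 1), (2, 3), (3, 2), (4, 4)]  # made-up bivariate sample
print(first_quadrant_layer_ranks(points))
print(layer_counts(points))
```

Note that two observations can share a layer rank, as (2, 3) and (3, 2) do here: the layer ranks give a partial, not a total, ordering of the sample.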

2.4. Conditional (Sequential) Ordering (C-ordering)

The final sub-ordering principle for multivariate data is one in which ordering or ranking is conducted on one of the marginal sets of observations conditional on selection, or ordering or ranking, within the data in terms of other marginal sets of observations. Examples appear in the work of Kreimerman (1975), or the use of concomitants (David, 1973). (See Sections 3.4 and 5, respectively.) The marginal samples used may be the original ones, or those derived from some preliminary co-ordinate transformation. The process is often repeated sequentially throughout all the marginal sets of observations, or may be limited to a single stage of conditioning.
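
The simplest single-stage case, ordering on one component and carrying the other along, can be sketched as follows. The term "concomitant" is used here in the usual sense of the second-component value paired with the rth ordered first-component value; the function name and data are hypothetical, and ties in the first component are assumed absent.

```python
# Illustrative sketch only: hypothetical name and data, no ties in the first component.

def concomitants(pairs):
    """Second-component values arranged in the order induced by the first component."""
    ordered = sorted(pairs, key=lambda pair: pair[0])
    return [y for _, y in ordered]


pairs = [(3.0, 10.0), (1.0, 30.0), (2.0, 20.0)]  # made-up bivariate sample
print(concomitants(pairs))  # second components, ordered by the first
```

The resulting sequence is not in general monotone; how far it departs from monotonicity reflects the association between the two components.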

The principal example of C-ordering is found in the notion of statistically equivalent blocks. The term is due to Tukey (1947) but the concept is developed from ideas of Wilks (1941, 1942) and Wald (1943). Wilks (1948) surveys the work to that date; Anderson (1966) refines the notion and proposes many non-parametric procedures based on it; Kendall (1966), Gessaman (1970) and Richards (1972) propose rather informal applications for multivariate discriminant analysis and classification.

The idea stems from work by Wilks (1941) on the determination of distribution-free tolerance limits for a univariate distribution, based on the fact that the coverages for the intervals $I_i = [X_{(i-1)}, X_{(i)}]$ $(i = 1, 2, \ldots, n+1$ with $X_{(0)} = -\infty$, $X_{(n+1)} = \infty)$ are distribution-free. The coverage of $I_i$ is just $F(X_{(i)}) - F(X_{(i-1)})$. In higher dimensions, the multi-dimensional coverages produced by "slicing" with respect to any set of marginal order statistics will clearly also be distribution-free. This extends to hyper-rectangular coverages when the component variables are independent. Wald (1943), again concerned with tolerance regions, showed how to extend the construction of distribution-free rectangular coverages to the general (non-independent component) bivariate case. The rectangular regions were constructed by slicing with respect to chosen ordered values of the first component, then internally slicing each slice with respect to the ordered values of the second component within that slice. Tukey (1947) extended this to general multivariate situations and also relaxed the constraint of slicing parallel to the co-ordinate axes, to produce statistically equivalent blocks. Their construction is well described by Anderson (1966) in the following terms. Suppose $h_1(x), h_2(x), \ldots, h_n(x)$ are n one-dimensional functions of x, not necessarily different, and $k_1, k_2, \ldots, k_n$ is a permutation of 1, 2, ..., n. We use $h_{k_1}(x)$ to order the $x_i$ and use the $k_1$th ordered value of $h_{k_1}(x)$ to divide the sample into two blocks. One or other of these two blocks (depending on whether $k_2 > k_1$ or $k_1 > k_2$) is now divided into two in terms of the values of $h_{k_2}(x)$ for the sample points it contains. This process is continued until we finish up with n + 1 statistically equivalent blocks with distribution-free coverages.
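
The distribution-free character of the univariate coverages can be checked informally by simulation. The sketch below is an illustration only: the function names, the two parent distributions, the sample size and the number of replications are all assumptions made for the example. It relies on the standard result that each coverage has expectation 1/(n + 1) whatever the continuous parent distribution.

```python
# Illustrative sketch only: names, distributions and simulation sizes are assumptions.
import math
import random


def coverages(sample, cdf):
    """Coverages F(X(i)) - F(X(i-1)), i = 1..n+1, with F(X(0)) = 0 and F(X(n+1)) = 1."""
    xs = sorted(sample)
    probs = [0.0] + [cdf(x) for x in xs] + [1.0]
    return [probs[i] - probs[i - 1] for i in range(1, len(probs))]


def mean_first_coverage(draw, cdf, n, reps, rng):
    """Monte Carlo average of the coverage of the first interval."""
    total = 0.0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        total += coverages(sample, cdf)[0]
    return total / reps


rng = random.Random(0)
n = 4
expo = mean_first_coverage(lambda r: r.expovariate(1.0),
                           lambda x: 1.0 - math.exp(-x), n, 20000, rng)
unif = mean_first_coverage(lambda r: r.random(), lambda x: x, n, 20000, rng)
print(expo, unif)  # both should be close to 1 / (n + 1)
```

The n + 1 coverages always sum to one, and their joint distribution does not involve F; this is the property that the slicing constructions carry over to blocks in higher dimensions.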

Anderson (1966) shows how to use statistically equivalent blocks to test if the sample comes from a prescribed multivariate distribution, or that two multivariate samples come from the same distribution (using either the blocks defined by one sample, or by the combined sample, as a basis for ranking the observations). He also considers an application to discriminant analysis. The proposed procedures are of course non-parametric. Vincze (1961) implicitly uses particular statistically equivalent blocks for his two-sample non-parametric Smirnov-type tests for bivariate samples, and many other non-parametric procedures have a similar implicit basis. The "bivariate order statistics" of Kreimerman (1975) also illustrate the use of statistically equivalent blocks. See Sections 3.4 and 5.

2.5. Combined and Marginal Ordering in Multi-sample Data

Univariate non-parametric methods frequently use the ranks (or relative signs) of observations. When considering multi-sample comparisons the ordering may take place within each sample or possibly over the combined set of samples. As described above, multi-sample data can be viewed in the general multivariate context. The "component" or "marginal" samples are of course independent and their sizes may differ. Ordering of individual samples is of the marginal type described above, but ordering over the combined sample needs to be viewed as a distinct sub-ordering principle which we shall call combined ordering. Many non-parametric statistical methods employ combined ordering. We shall not attempt to review in detail order concepts in relation to univariate non-parametric methods. However, fuller consideration will be given to multivariate non-parametric methods (Section 4.7) since these reflect not only the distinction between within-sample, and combined, ordering but also the above four types of sub-ordering for coping with the multivariate nature of the data. (Multivariate multi-sample data is best viewed directly as a set of multivariate samples.)

Some specific distribution theory results on ordered multi-sample data include Conover (1965). He employs both combined and marginal ordering. Each of p independent, equal sized, samples from a common distribution is internally ordered; the p ordered samples are then "ordered" in relation to their maximum observations. Some distributional properties are investigated. See also David (1966). For a similar set-up, but where the distributions may possibly differ, Cohn et al. (1960) examine the probability that the p sample maxima are the p largest values in the combined sample, and the probability of non-empty intersection of the set of intervals $\{(x_{i(r)}, x_{i(r+1)}),\ i = 1, 2, \ldots, p\}$ (which is distribution-free and maximized if all distributions coincide). In the bivariate case they show that the distribution-free property disappears and probabilities of intersections of statistically equivalent blocks depend on the manner of their construction.

3. DIRECT ANALOGUES OF UNIVARIATE ORDER CONCEPTS FOR MULTIVARIATE DISTRIBUTIONS AND DATA

An obvious starting point in studying ordering principles applied to multivariate data is to seek direct parallels with the univariate case in the naming of concepts or expression of ideas from the point of view of the "natural" representation of the data. The literature contains many examples, including quite specifically multivariate medians, ranges, extremes and order statistics. The authors are not hampered by the obvious intangibility of any direct and total higher-dimensional ordering concept; they seldom consider the lack of formal definition of order. It would doubtless prove difficult to define an elephant, but we have some idea of what we mean by an "elephant" and readily attach the label to certain manifestations. The fact that one man's "elephant" may differ from another's does not prevent them from trying to communicate on a matter of such substantial importance.

3.1. The Median

Austin (1959) discusses the determination of the median of a bivariate sample as that point from which the sum of the absolute distances of all observations is a minimum. Known also as "the point of minimum aggregate distance (travel)" it has been claimed to be the "proper generalization of the median concept to higher dimension", and has interested economists and planners (to whom its determination is also known as the "generalized Weber problem") with regard to the optimum location of a storage depot to serve a set of industrial plants. Mathematicians have long been interested in the difficulty of its determination. Wide statistical interest in the 1930's (primarily expressed through the pages of the journal, Metron) has recently been revived by new proposals (Seymour, 1970; again in Metron) for numerical determination. Such a "median" will clearly reduce, in its univariate analogue, to the traditional sample median. The implicit sub-ordering concept is the aggregate form of R-ordering.

The term median is employed quite differently elsewhere: to represent merely the vector of marginal (population or sample) medians. Mood (1941) considers the joint distribution of the set of marginal sample medians; Olmsted and Tukey (1947) partition a bivariate sample about the marginal sample medians in their non-parametric "quadrant sum test" of association, and Gnanadesikan and Kettenring (1972) propose such a "median" as a robust estimator of multivariate location. Hoel and Scheuer (1961) and Bennett (1966, 1968) are concerned with joint confidence sets, and with confidence limits for ratios of marginal medians in a bivariate distribution. For example, Bennett (1966) employs the sampling distribution of the ordered values of $z_i(\theta) = x_{2i} - \theta x_{1i}$ in a bivariate sample $(x_{1i}, x_{2i})$ ($i = 1, 2, \ldots, n$) to construct confidence intervals for the ratio $\eta/\xi$ of the marginal medians $(\xi, \eta)$. Later, in Bennett (1968), he considers two bivariate distributions whose median pairs $(\xi_1, \eta_1)$ and $(\xi_2, \eta_2)$ satisfy

$$\xi_2 = \alpha \xi_1, \qquad \eta_2 = \alpha \eta_1,$$

that is, the marginal medians in the two bivariate distributions are in constant proportion to each other. By considering the signs of the differences $z_{1j} = y_{1j} - \alpha x_{1j}$ and $z_{2j} = y_{2j} - \alpha x_{2j}$ he constructs an estimator and confidence limits for $\alpha$.

Blumen (1958) is concerned with a test for the value of the pair of marginal medians in the bivariate case.

This concept of a multivariate median involves just M-ordering but its subsequent investigation may also employ R-ordering as in the use of the ordered values of $z_i(\theta)$ by Bennett (1966).

We see for the median a basic distinction which permeates use of order concepts in multivariate studies. Generalization of a univariate concept either utilizes internal structure (perhaps through R-ordering) and is correspondingly remote from its univariate progenitor, or consists merely of the vector of marginal univariate equivalents (using M-ordering) and correspondingly loses information on the internal structure of X.

Haldane (1948) describes the marginal form of the median as the "arithmetic median" and declares it "the only reasonable generalisation ... obviously to be preferred in ordinary statistical work". He terms the aggregate distance form the "geometric median" with "certain advantages in problems of geometric probability".
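The two forms of median can be contrasted on a small sample. The sketch below is an illustration only: the function names and the data are hypothetical, and the geometric median is approximated by a Weiszfeld-type fixed-point iteration, which is a standard numerical device substituted here and not the method of Austin (1959) or Seymour (1970).

```python
import math
from statistics import median


def arithmetic_median(points):
    """Vector of marginal sample medians (M-ordering)."""
    return tuple(median(coord) for coord in zip(*points))


def aggregate_distance(center, points):
    """Sum of Euclidean distances from center to every observation."""
    return sum(math.dist(center, p) for p in points)


def geometric_median(points, iterations=200, eps=1e-12):
    """Approximate the point of minimum aggregate distance.

    Weiszfeld-type iteration started at the centroid; eps guards against
    division by zero if an iterate lands on a data point.
    """
    dim = len(points[0])
    current = tuple(sum(p[k] for p in points) / len(points) for k in range(dim))
    for _ in range(iterations):
        weights = [1.0 / max(math.dist(current, p), eps) for p in points]
        total = sum(weights)
        current = tuple(
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        )
    return current


# Hypothetical bivariate sample.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (1.0, 5.0), (7.0, 1.0)]
arith = arithmetic_median(points)
geom = geometric_median(points)
```

By construction the geometric median has aggregate distance no larger than that of the arithmetic median, and the two points need not coincide.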

These distinctions in form will be noticed in the following discussion of range and extremes.

3.2. The Range

It is natural to seek to represent the variability in a multivariate sample in terms of a generalization of range. Cacoullos and DeCicco (1967) consider the distribution of the "bivariate range". They propose and examine various candidates for this label, principally in relation to the bivariate circular normal distribution. In the estimation of the common standard deviation, $\sigma$, they consider, inter alia,

bivariate range: $\max_{i,j} [(x_{1i} - x_{1j})^2 + (x_{2i} - x_{2j})^2]^{1/2}$

(i.e. the maximum Euclidean separation between two sample points),

diagonal: $[R_1^2 + R_2^2]^{1/2}$,

figure of merit: $(R_1 + R_2)/2$,

where $R_1$ and $R_2$ are the two marginal ranges. These involve R-ordering (for the bivariate range), and M-ordering (for the diagonal, and the figure of merit). They refer also to the use of covering circles (Daniels, 1952) and of the area, or perimeter, of the convex hull of the data set, thus employing P-ordering. The detailed form of the distribution of the Studentized bivariate range, when sampling from a circular normal distribution, is discussed by Gentle et al. (1975).
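The three range measures are simple to compute directly from their definitions. The following minimal sketch uses hypothetical function names and a hypothetical four-point sample; it makes no attempt at the estimation of the standard deviation that Cacoullos and DeCicco build on these quantities.

```python
import math
from itertools import combinations


def marginal_ranges(points):
    """Marginal ranges (R1, R2) of a bivariate sample (M-ordering)."""
    return tuple(max(c) - min(c) for c in zip(*points))


def bivariate_range(points):
    """Maximum Euclidean separation between two sample points (R-ordering)."""
    return max(math.dist(p, q) for p, q in combinations(points, 2))


def diagonal(points):
    """[R1^2 + R2^2]^(1/2)."""
    r1, r2 = marginal_ranges(points)
    return math.hypot(r1, r2)


def figure_of_merit(points):
    """(R1 + R2) / 2."""
    r1, r2 = marginal_ranges(points)
    return (r1 + r2) / 2


# Hypothetical sample.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (2.0, -1.0)]
```

Since no two points can differ by more than R1 in the first component or R2 in the second, the bivariate range can never exceed the diagonal.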

Another approach to range is found in the definition of connected range by Tsukibayashi (1962). For p = 2, suppose $(X_1, X_2)$ have zero means, and the regression of $X_2$ on $X_1$ is linear and homoscedastic. A random sample $(x_{1i}, x_{2i})$ ($i = 1, 2, \ldots, n$) has minimum and maximum x-values, $x_{(1)}$ and $x_{(n)}$. If $y_{[1]}$ and $y_{[n]}$ are the associated y-values, the "connected range" is $y_{[n]} - y_{[1]}$ and this is used in estimating the covariance, and the regression and correlation coefficients. The $y_{[i]}$ are the so-called concomitants of the ordered $x_i$ (David, 1973) and thus we also encounter the fourth type of sub-ordering principle (C-ordering) in the study of multivariate "range".

However, M-ordering is the predominant principle in the discussion of range. We saw the marginal ranges $R_1$, $R_2$ used in combination by Cacoullos and DeCicco (1967). Several authors have considered the correlation of the marginal ranges. For samples from a standardized bivariate normal distribution with correlation $\rho$, Hartley (1950) showed how to determine the correlation between $R_1$ and $R_2$ for a sample of size n, as a function of n and $\rho$. Mardia (1967) claimed to correct an error in Hartley's work and gives tabulated values for $\rho$ = 0 (0.05) 1, n = 2 (1) 10, 20. Smith and Hartley (1968) point out that there was no mistake in Hartley (1950); it was a misunderstanding. Kurtz et al. (1966) give an explicit expression for this correlation when n = 2, 3, and its limiting form as $\rho \to 0$ for any n.

3.3. Extremes and Quantiles

The concept of an extreme of a multivariate sample is important intuitively and methodologically. Its role in identification of outliers and tests of multivariate structure is considered later. At this stage we examine attempts to explicitly define, in an overall or in a marginal sense, the idea of a multivariate extreme. We shall also review some relevant distribution theory.

Kudo (1957) claims that we are "justified to call the observation ... [$x_j$] which has maximum value of $(x_j - \bar{x})' \Sigma^{-1} (x_j - \bar{x})$ ... the extreme value" in a multivariate sample $\{x_j\}$ ($j = 1, 2, \ldots, n$) with sample mean $\bar{x}$, from a distribution with known variance-covariance matrix $\Sigma$. We are, of course, unlikely to know $\Sigma$, in which case Kudo suggests replacing it with S, the sample moment estimator. He is concerned primarily with a normal distribution and the identification of outliers. (See also Section 4.2.)

Siotani (1959) adopts a similar principle of extremeness but bases it on more generalized "distance measures" $(x_j - a)' \Gamma^{-1} (x_j - a)$ where $\Gamma$ may be $\Sigma$ or S and a may be the natural origin or the true mean, as well as the sample mean. Both Kudo and Siotani are using R-ordering.
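The R-ordering used by Kudo and Siotani can be sketched for the bivariate case. In the code below, which is an illustration only, all names are hypothetical, the matrix inverse is written out explicitly for a 2 x 2 matrix, and the sample covariance uses divisor n - 1 (an assumption; the choice of divisor does not alter the ordering).

```python
def mean_vector(points):
    """Sample mean of a bivariate sample."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def sample_covariance(points):
    """2 x 2 sample covariance matrix S (divisor n - 1, an assumption)."""
    n = len(points)
    m1, m2 = mean_vector(points)
    s11 = sum((x - m1) ** 2 for x, _ in points) / (n - 1)
    s22 = sum((y - m2) ** 2 for _, y in points) / (n - 1)
    s12 = sum((x - m1) * (y - m2) for x, y in points) / (n - 1)
    return ((s11, s12), (s12, s22))


def generalized_distance(x, a, gamma):
    """(x - a)' gamma^{-1} (x - a) for a symmetric positive-definite 2 x 2 gamma."""
    (g11, g12), (_, g22) = gamma
    det = g11 * g22 - g12 * g12
    d1, d2 = x[0] - a[0], x[1] - a[1]
    return (g22 * d1 * d1 - 2.0 * g12 * d1 * d2 + g11 * d2 * d2) / det


def extreme_observation(points, a=None, gamma=None):
    """Observation with the largest generalized distance.

    Defaults to the sample mean and S, as in Kudo's suggestion when the
    true covariance matrix is unknown.
    """
    a = mean_vector(points) if a is None else a
    gamma = sample_covariance(points) if gamma is None else gamma
    return max(points, key=lambda x: generalized_distance(x, a, gamma))
```

Passing the identity matrix and the origin reduces the measure to squared Euclidean distance from the origin, one of the special cases covered by Siotani's formulation.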

The M-ordering notion of an extreme in the sense of the set of marginal extreme values $(x_{1(1)}, x_{2(1)}, \ldots, x_{p(1)})$ or $(x_{1(n)}, x_{2(n)}, \ldots, x_{p(n)})$ has also been widely discussed. Sibuya (1960) claims that the pair of marginal extremes from a bivariate sample is the appropriate extension of the extremal concept to two dimensions. He shows that there is an infinity of possible limiting joint distributions for $(X_{1(n)}, X_{2(n)})$ (in contrast to the three which exist in the univariate case) and that for a large class of distributions, including the normal with imperfect correlation, $X_{1(n)}$ and $X_{2(n)}$ are asymptotically independent. Gumbel and Goldstein (1964) examine marginal extremes for two real-life bivariate samples: one describing oldest ages at death for the two sexes in a particular community over many years, the other maximum flood levels for various years at two points on a river. They conclude that the maximum ages are independent, the maximum flood levels, dependent. Gumbel and Mustafi (1967) examine the analytic properties of two forms of "stable bivariate extremal distributions" having marginal type I extreme value distributions. They examine some data on flood levels to determine which of the models provides the better fit. See also Gumbel (1961).

Extending Sibuya's results, Srivastava (1967) develops ideas by Finkelshtein (1953) to examine conditions for asymptotic independence of the minima $(X_{1(1)}, X_{2(1)})$, or of any pair of marginal order statistics, and Mardia (1964a) further extends this work to the set $(X_{1(1)}, X_{2(1)}, X_{1(n)}, X_{2(n)})$ of the four marginal minima and maxima. Other relevant papers are Geffroy (1959), Posner et al. (1969) and Galambos (1975). The asymptotic joint distribution of marginal quantiles (their independence, or joint normality) is considered by Siddiqui (1960); by Weiss (1964) building on earlier results of Siddiqui and of Mood (1941); and by Srivastava et al. (1964), including conditions for independence of distances between quantiles.

Other work in this area includes exact (non-asymptotic) joint distributions for marginal extrema, quantiles and ranges for general distributions (Siddiqui, 1960; Mardia, 1964b), or for particular bivariate distributions (Mardia, 1964c, on minima and ranges for multivariate normal and Pareto type 1, but only for samples of sizes 2 and 3 for normal marginal ranges).

Convergence of minima or maxima to specific forms of limiting joint distribution is also widely discussed. Berman (1962b) considers a limiting joint distribution having joint distribution function of the form

$$\Phi(x_1, x_2) = \Phi_2(x_2)\,[\Phi_1(x_1)]^{\chi[\log \Phi_2(x_2)/\log \Phi_1(x_1)] + 1},$$

where $\chi(t)$ is a continuous, convex, function with $\max(-t, -1) \le \chi(t) \le 0$.

In a long and important series of papers over a period of 15 years or so, Tiago de Oliveira has considered asymptotic joint distributions for bivariate marginal extremes. In Tiago de Oliveira (1959, 1962) he shows that the asymptotic joint distribution must be a stable distribution with marginal distributions which (possibly after simple logarithmic transformations) are each univariate extreme value distributions of the Gumbel type, that is with distribution function of the form $\Phi(x) = \exp(-e^{-x})$. More specifically the asymptotic joint distribution function must have a form

$$\Phi(x_1, x_2) = [\Phi(x_1)\Phi(x_2)]^{k(x_2 - x_1)} = \exp[-(e^{-x_1} + e^{-x_2})\,k(x_2 - x_1)],$$

where k(w), the dependence function expressing interdependence of the component random variables, must satisfy prescribed conditions. Asymptotic independence arises if k(w) = 1 and yields $\Phi(x_1, x_2)$ as

$$\Phi_1(x_1, x_2) = \exp[-(e^{-x_1} + e^{-x_2})],$$

whilst complete dependence, or diagonality, arises when $k(w) = \max(1, e^w)/(1 + e^w)$ in which case $\Phi(x_1, x_2)$ is

$$\Phi_2(x_1, x_2) = \exp\{-\exp[-\min(x_1, x_2)]\}.$$

In general

$$\Phi_1(x_1, x_2) \le \Phi(x_1, x_2) \le \Phi_2(x_1, x_2).$$

The distribution is exchangeable: $\Phi(x_1, x_2) = \Phi(x_2, x_1)$, if k(w) = k(-w). Tiago de Oliveira (1965) considers testing for independence, and estimation of $\Phi(x_1, x_2)$. Subsequent papers (1968, 1970, 1971, 1974, 1975) consider the nature and properties of specific distributions in the family $\Phi(x_1, x_2)$. These include the so-called logistic, mixed, biextremal and Gumbel distributions: the latter being related to the bivariate exponential distribution of Marshall and Olkin (1967). Posner et al. (1969) also consider estimation of parameters in bivariate extremal distributions.
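The role of the dependence function k(w) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and `k_halfway` is an arbitrary equal mixture of the two limiting dependence functions, introduced purely to show a case lying between independence and diagonality; it is not one of the named models discussed above.

```python
import math


def bivariate_extremal_cdf(x1, x2, k):
    """exp[-(e^{-x1} + e^{-x2}) k(x2 - x1)] for a dependence function k."""
    return math.exp(-(math.exp(-x1) + math.exp(-x2)) * k(x2 - x1))


def k_independent(w):
    """k(w) = 1: asymptotic independence."""
    return 1.0


def k_diagonal(w):
    """k(w) = max(1, e^w) / (1 + e^w): complete dependence (diagonality)."""
    return max(1.0, math.exp(w)) / (1.0 + math.exp(w))


def k_halfway(w):
    """Hypothetical intermediate case: equal mixture of the two limits."""
    return 0.5 * (k_independent(w) + k_diagonal(w))


grid = [(-1.0, 0.5), (0.0, 0.0), (0.3, -0.7), (2.0, 1.0)]
values = [
    (
        bivariate_extremal_cdf(x1, x2, k_independent),
        bivariate_extremal_cdf(x1, x2, k_halfway),
        bivariate_extremal_cdf(x1, x2, k_diagonal),
    )
    for x1, x2 in grid
]
```

At each grid point the three values are in increasing order, and the diagonal case agrees with exp{-exp[-min(x1, x2)]}, since (e^{-x1} + e^{-x2}) max(1, e^{w})/(1 + e^{w}) with w = x2 - x1 simplifies to exp[-min(x1, x2)].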

This vast array of work on joint distributions of marginal extremes clearly uses no overall multivariate ordering concept. It is based almost exclusively on M-ordering. Occasionally, other sub-ordering principles are also present as, for example, in Siddiqui (1960) where P-ordering is used in a manner which appears to anticipate "layer ranks".

3.4. Order Statistics

There seem to be few examples in the literature where a claim is advanced for a global principle of ordering the observations in a multivariate sample. One example is in Healy (1968) who suggests a "natural" ordering principle in terms of "distance" from the mean. Explicit use of the term "order statistics" in relation to bivariate distributions appears in the title of Galambos (1975, where only M-ordering is considered) and in a report by Kreimerman (1975) on a test of goodness of fit for a continuous bivariate distribution using gradually increasing numbers of order statistics. Kreimerman's "order statistics" for a bivariate sample are obtained as follows. He orders one component and selects certain ordered values to define "strip limits". For the points within each strip the values of the second component are ordered. He utilizes jointly the strip limits and a selection of ordered second-component values within each of them. Such a procedure employs C-ordering in a form akin to "statistically equivalent blocks".

4. MULTIVARIATE ANALYSIS OR THEORY INVOLVING SUB-ORDERING

We shall now review particular areas of multivariate or multi-sample analysis or theory where use is made of some form of data ordering or ranking. Several examples have already been quoted. The review in this section extends these, but must necessarily be brief and selective. In particular, we shall not attempt to survey cluster analysis or multidimensional scaling even though these can each in a sense be viewed as intrinsically concerned with ordering multidimensional data (in partial and reduced (aggregate) senses, respectively).

4.1. Distribution Theory for Ordered Dependent Observations

Suppose $X_1, X_2, \ldots, X_p$ are dependent or non-identically distributed random variables. A set $x_1, x_2, \ldots, x_p$ of one random observation from each is ordered to produce

$$x_{(1)} < x_{(2)} < \ldots < x_{(p)}.$$

The corresponding random variables $\{X_{(i)}\}$ are referred to as "order statistics from a dependent sample (or process)". Equivalently $x = (x_1, x_2, \ldots, x_p)'$ can be regarded as a single observation of a p-dimensional multivariate random variable $X = (X_1, X_2, \ldots, X_p)'$. Internally ordering x, on a component basis, yields the $\{x_{(i)}\}$. For independent, identically distributed, $X_i$ the distributional properties of the $X_{(i)}$ have been widely studied, since the $X_{(i)}$ are just the usual univariate order statistics for a sample of size p. But there are some analogous results for dependent, or non-identically distributed, $X_i$. These are concerned in the main with exchangeable, m-dependent or equi-correlated $X_i$, or with particular assumed distributional forms for X.

Most of the work in this area is concerned with the extreme value, $X_{(p)}$, but general $X_{(i)}$ and linear combinations of them (including the range, $X_{(p)} - X_{(1)}$) also feature.

The limiting distribution of $X_{(p)}$ is studied by Watson (1954), by Newell (1964) for m-dependent $X_i$ and by Berman (1962) for exchangeable $X_i$ or where p is random. Gallot (1966) considers bounds on $P(X_{(p)} > c)$ for general $X_i$.

David and Joshi (1968) extend to exchangeable $X_i$ some of the known results on the moments of the order statistics, $X_{(i)}$, for independent $X_i$; Young (1967) derives recurrence relationships between the density functions of the $X_{(i)}$ for exchangeable or equi-correlated $X_i$; Sen (1968) demonstrates asymptotic normality of sample quantiles for m-dependent $X_i$.

For equi-correlated standard normal $X_i$, Steck and Owen (1962), and Grieg (1967), consider the approximate distribution of the extreme $X_{(p)}$ for any p; Gupta et al. (1973) present values of the probability integral and percentage points for $X_{(p)}$ extending tables by Gupta (1963); Owen and Steck (1962) express moments and product moments for $\{X_{(i)}\}$ in terms of those for uncorrelated standard normal $X_i$. Teichroew (1955) and Kapur (1957) consider the distribution of $X_{(p)}$ when the $X_i$ are independent normal, but one has mean different from the others. Both are concerned with a p-sample slippage problem. Further study of $X_{(p)}$ includes Afonja (1972) for general correlated normal and t-variates, and Kozelka (1956) for multinomial sampling.

The range $X_{(p)} - X_{(1)}$ is considered by Ishii and Yamasaki (1961) for binomial $X_i$ and by Gupta et al. (1964) for correlated normal $X_i$. This latter paper, and Gupta and Pillai (1965), give limited consideration to linear combinations of the $X_{(i)}$, and ratios of such linear forms. This suggests an interesting prospect. For a multivariate sample $x_1, x_2, \ldots, x_n$ we might contemplate a particular form of C-ordering: first ordering the component values within any $x_i$ and then ordering the $x_i$ in relation to values of some linear combination of the ordered component values.

4.2. Multivariate Outliers

One area of multivariate analysis which cannot avoid order relationships is the identification of outliers. The principal reference is still the work by Wilks (1963) on "multivariate statistical outliers". For a multivariate normal distribution Wilks considers in detail a test for a single outlying value and for two outliers; tests for more than two outliers are discussed in general terms but without any relevant tabulation. The identification procedure is based on outlier scatter ratios. For the tth observation the "one outlier scatter ratio" is

$$S_t = |a_{ijt}| / |a_{ij}|,$$

where $|a_{ij}|$ is the "internal scatter" (the determinant of the matrix of sample sums of squares and products about the component sample means) and $|a_{ijt}|$ the analogous quantity with $x_t$ omitted. The $S_t$ ($t = 1, 2, \ldots, n$) are ordered as $S_{(1)}, S_{(2)}, \ldots, S_{(n)}$ and the observation yielding the smallest $S_t$, that is, $S_{(1)}$, is identified as an outlier if $S_{(1)}$ is sufficiently small. For two (or more) outliers the same principle is proposed except that the numerator term in the outlier scatter ratio is determined on omission of any two (or more) sample points. Tables of useful upper bounds for the lower tail probabilities of $S_{(1)}$ are presented for the cases of one, or two, possible outliers under the null hypothesis of an uncontaminated sample. The sub-ordering principle used is clearly R-ordering equivalent to ordering the sums of squares of the volumes of all possible simplexes which can be obtained on omission of different observations (either individually, or in pairs, etc., depending on whether we anticipate one, two, etc. outliers).
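The one-outlier scatter ratios are straightforward to compute for a bivariate sample. The following is a minimal sketch with hypothetical names and data; it only ranks the observations and does not reproduce Wilks's tabulated bounds, which are needed to judge whether the smallest ratio is "sufficiently small".

```python
def internal_scatter(points):
    """Determinant of the 2 x 2 matrix of sums of squares and products
    about the component sample means."""
    n = len(points)
    m1 = sum(x for x, _ in points) / n
    m2 = sum(y for _, y in points) / n
    a11 = sum((x - m1) ** 2 for x, _ in points)
    a22 = sum((y - m2) ** 2 for _, y in points)
    a12 = sum((x - m1) * (y - m2) for x, y in points)
    return a11 * a22 - a12 * a12


def one_outlier_scatter_ratios(points):
    """Scatter with observation t omitted, divided by the full-sample scatter."""
    full = internal_scatter(points)
    return [
        internal_scatter(points[:t] + points[t + 1:]) / full
        for t in range(len(points))
    ]


# Hypothetical sample: five nearly collinear points and one discordant point.
points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 3.0), (4.0, 3.9), (2.0, 9.0)]
ratios = one_outlier_scatter_ratios(points)
candidate = min(range(len(points)), key=ratios.__getitem__)
```

Here `candidate` is the index of the observation whose removal shrinks the internal scatter most, i.e. the one yielding the smallest ratio.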

Guttman (1973) presents a Bayesian approach to the identification of a single outlier in a multivariate normal distribution. He assumes that all but one observation come from a $N(\mu, \Sigma)$ distribution; the anomalous observation comes from a $N(\mu + a, \Sigma)$ distribution. Identification of an outlier is based on the posterior distribution of a.

Healy (1968) proposes a graphical test based on R-ordering for the case of a bivariate normal distribution: plotting ordered values of $(x - \mu)' \Sigma^{-1} (x - \mu)$ against the means of the reduced order statistics for a univariate exponential distribution as a test for normality or outliers. Alternatively, he suggests a square-root or cube-root, normalizing transformation of the generalized distances. Gnanadesikan and Kettenring (1972) extend such an approach by suggesting a range of graphical methods including marginal probability plots, plots of generalized distances or plots of low-order components after a preliminary component analysis. They explain that in higher dimensions an outlier does not retain its simple univariate expression as "the one that sticks out at the end" of the sample. Other multivariate outlier procedures based on ordering the principal components include Hawkins (1974) and Fellegi (1975, in the context of the automatic editing of quantitative data).

Rohlf (1975) uses R-ordering to produce a "generalized gap test" for outliers based on lengths of the edges of minimum spanning trees in the matrix of Euclidean distances between all pairs of observations. See also Devlin et al. (1975).

Karlin and Truax (1960) and Ferguson (1961) discuss tests for multivariate outliers using (R-ordering) Studentized distance measures and a slippage-type alternative hypothesis.

4.3. Discriminant Analysis and Classification

Both Anderson (1966) and Kendall (1966) use sub-ordering of multivariate data in discriminant analysis. Having samples from each of two populations, the aim is to use the information they portray to assist in the assignment of a new observation to the appropriate population. Anderson shows how statistically equivalent blocks (C-ordering) can promote a method of discriminant analysis. Kendall's proposals are less formal. They amount to considering the components of the multivariate observations sequentially in some sort of decreasing order of importance. The first component is that which effects "the best division" between the populations, as reflected by the ordered component values in the two samples (M-ordering). The new observation may be assigned on the basis of the chosen component alone; if this is not possible, the second most important component (similarly structured from ordered values of the second component for the two samples within the uncertainty range for the first component) is considered as a basis for assignment. This process is continued until the new observation has been assigned or until all components have been considered. ("We end with a residuum of cases which are undecided.") An alternative procedure proposed by Kendall employs P-ordering through the concept of sample convex hulls. If $A_1$, $A_2$ are the convex hulls for the two samples, a new observation x is assigned to population 1 or 2 if

$$x \in A_1 \cap \bar{A}_2 \quad \text{or} \quad x \in \bar{A}_1 \cap A_2,$$

respectively; otherwise it is not assigned to either. Discussion of these methods is qualitative with no consideration of statistical properties. A principle similar to Kendall's use of convex hulls is employed by Quesenberry and Gessaman (1968), except that the convex hull is determined from tolerance regions rather than from the basic sample configurations. Richards (1972) "refines and extends" Kendall's successive elimination approach by informally considering pairs of components at certain stages of the elimination process.
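The convex hull rule can be sketched for bivariate samples. The code below is an illustration under stated assumptions: the names are hypothetical, each sample is assumed to contain at least three non-collinear points, and the hull is found by Andrew's monotone chain algorithm, a standard device chosen here for brevity and not attributed to Kendall.

```python
def cross(o, a, b):
    """Signed area of the parallelogram spanned by (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def in_hull(x, hull):
    """True if x lies inside or on the boundary of the counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], x) >= 0 for i in range(n))


def assign(x, sample1, sample2):
    """Return 1 or 2 if x lies in exactly one sample hull, otherwise None."""
    in1 = in_hull(x, convex_hull(sample1))
    in2 = in_hull(x, convex_hull(sample2))
    if in1 and not in2:
        return 1
    if in2 and not in1:
        return 2
    return None


# Hypothetical samples with overlapping hulls.
sample1 = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
sample2 = [(1, 1), (4, 1), (4, 4), (1, 4)]
```

Observations falling in both hulls, or in neither, are left unassigned, matching the "residuum of cases which are undecided" that such informal rules accept.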

Gessaman and Gessaman (1972) review and compare multivariate non-parametric discrimination procedures, including Anderson's and Kendall's ideas and their own particularization of statistically equivalent blocks. See also Gessaman (1970).

4.4. Mixtures of Distributions

Suppose that independent observations arise from each of three distributions $F_1$, $F_2$ and $F_3$, and n such triplets are available. We wish to test if $F_3$ is a mixture of $F_1$ and $F_2$ in the form

$$F_3 = \theta F_1 + (1 - \theta) F_2.$$

The data can be regarded as a sample of n multivariate observations with p = 3. Thomas (1969) constructs a non-parametric test based on internal ordering of each triplet. If the ranks are (1), (2), (3) a score is assigned as zero if [(1), (2), (3)] is an even permutation of [1, 2, 3] or one otherwise. The test statistic is a combination of such rank scores over the n observation triplets. This is an example of the rather uncommon application of internal, or component-by-component, ordering and also illustrates the widespread principle of "rank scoring" in non-parametric analysis. See also Sections 4.5 and 4.7. Chatterjee (1972) considers the multivariate extension of such a model. Concerned with estimating $\theta$, he uses "linearly compounded rank scores". Ranks are assigned for each component in the combined sample from all three distributions and the resulting "rank matrix" has scores attached to each element. Combination of the rank scores yields an estimator of $\theta$, which employs combined (marginal) ordering.
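The internal ordering of each triplet and the parity score are easy to illustrate. In the sketch below the names and data are hypothetical, ties are assumed absent, and the simple sum of scores is an assumption standing in for whatever combination Thomas (1969) actually uses.

```python
def triplet_score(triplet):
    """0 if the within-triplet ranks form an even permutation of (1, 2, 3),
    1 otherwise. Assumes the three values are distinct."""
    ordered = sorted(triplet)
    ranks = [ordered.index(v) + 1 for v in triplet]
    # A permutation is even exactly when its number of inversions is even.
    inversions = sum(
        1
        for i in range(3)
        for j in range(i + 1, 3)
        if ranks[i] > ranks[j]
    )
    return inversions % 2


# Hypothetical triplets (one observation from each of F1, F2, F3).
triplets = [(0.2, 0.5, 0.9), (0.5, 0.2, 0.9), (0.5, 0.9, 0.2), (0.9, 0.5, 0.2)]
scores = [triplet_score(t) for t in triplets]
total = sum(scores)  # assumed combination, for illustration only
```

The four triplets have rank patterns (1, 2, 3), (2, 1, 3), (2, 3, 1) and (3, 2, 1), of which the first and third are even permutations.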

4.5. Tests of Symmetry

Several authors use ordering or ranking methods to construct tests of symmetry for a bivariate distribution. If $F(x_1, x_2)$ is the distribution function, they are concerned with testing the hypothesis that $F(x_1, x_2) = F(x_2, x_1)$. Sen (1967) combines the 2n component observations, ranks overall, and separates the components into a 2 x n "collection rank matrix". Here we see combined ordering applied to non-independent components in a multivariate sample, rather than its more usual application to independent univariate samples (multi-sample data).

Hollander (1971) bases a test on the sample distribution function, implicitly using a P-ordering concept akin to layer ranks. Bell and Haller (1969) consider parametric and non-parametric tests and show that "all distribution-free ... procedures are based on ranks". They use M-ordering.

4.6. Some Multi-sample Comparisons

We have seen how sub-ordering principles in multi-sample data (for testing equality of, or interrelationships between, distributions) usually lead to consideration of rank order. The use of signs of relative differences in observation (component) values is also considered (see Section 4.7). However, some multi-sample, order-based methods retain quantitative features of the data.

Lewis (1972) considers the ranges of each of p univariate samples from distributions differing at most in their location and rejects equality if at least two ranges do not overlap.
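The decision rule described for Lewis (1972) can be written down directly. This is a minimal sketch with hypothetical names and data; it implements only the overlap criterion as stated above and says nothing about the significance level of the resulting test.

```python
from itertools import combinations


def sample_range(sample):
    """Closed interval (minimum, maximum) spanned by a univariate sample."""
    return (min(sample), max(sample))


def ranges_overlap(r, s):
    """True if the two closed intervals share at least one point."""
    return r[0] <= s[1] and s[0] <= r[1]


def reject_equality(samples):
    """True if at least two of the sample ranges fail to overlap."""
    ranges = [sample_range(s) for s in samples]
    return any(not ranges_overlap(r, s) for r, s in combinations(ranges, 2))


# Hypothetical data: in the first set the ranges (1, 3) and (3.5, 5) are disjoint.
separated = [[1.0, 2.0, 3.0], [2.5, 4.0], [3.5, 5.0]]
overlapping = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
```

With these data `reject_equality(separated)` is true and `reject_equality(overlapping)` is false.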

For two bivariate distributions, Vincze (1961) constructs a Smirnov-type test of equality by using signed projections of the bivariate observations on a line of random angle, $\alpha$, relative to the $X_1$ axis. The signed projections are ordered over the two samples in combination. The ordering is of reduced, and combined, form. He refers to the unsolved problem of determining the distribution (or limiting distribution) of the Smirnov statistic as we let $\alpha$ vary over $(0, 2\pi)$.

Weiss (1960) proposes a test of equality of two multivariate distributions based on "distances" between any two observations within each multivariate sample and on numbers of observations in one sample within spheres of given radius about each observation in the other sample.

4.7. Non-parametric Procedures (Ranks and Signs)

Almost all non-parametric statistical methods involve some idea of ordering. This usually takes the form of rank order or signs of relative differences, either within samples or over the combination of samples for multi-sample data, expressed in terms of individual component variate values or of some one-dimensional metric on the multivariate sample space of the observations (one of the earliest references to use such R-order ranks is van Dantzig and Hemelrijk, 1954). Thus sub-ordering concepts are of marginal, reduced and combined forms.

Non-parametric multi-sample methods for univariate samples have extensive coverage in the literature and we cannot hope to provide a comprehensive survey even in the limited respect of their order-related basis. Some examples have been discussed above. Others include Mosteller (1948) and Mosteller and Tukey (1950) on a p-sample slippage test (based on picking the sample with the largest observation and counting the number of its observations which exceed all observations in the other samples) and tests of equality of distribution for several samples; for two samples with censoring, Saw (1966) and Young (1970, 1973) consider rank order in the combined sample, or numbers of times observations from one sample exceed those from the other sample, to obtain censored forms of Mann-Whitney, Smirnov, Wilcoxon, "median" or "precedence" tests; for p-samples, Breslow (1970) offers a generalized Kruskal-Wallis test for censored data and Bhapkar (1961) a test based on numbers of p-plets which can be formed (comprising one observation from each sample) such that the observation from the ith sample has the smallest value. See also Dwass (1960), Savage (1964) and Odeh (1967).
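The counting statistic behind the p-sample slippage test described above can be sketched as follows. The names and data are hypothetical, ties between sample maxima are assumed absent, and no critical values are given here.

```python
def slippage_count(samples):
    """Return (index of the sample holding the largest observation,
    number of its observations exceeding every observation elsewhere)."""
    top = max(range(len(samples)), key=lambda i: max(samples[i]))
    others_max = max(max(s) for i, s in enumerate(samples) if i != top)
    count = sum(1 for x in samples[top] if x > others_max)
    return top, count


# Hypothetical data: the first sample holds the overall maximum, and two of
# its values (9 and 10) exceed 8, the largest value in the other samples.
samples = [[1, 5, 9, 10], [2, 3, 8], [4, 6, 7]]
top, count = slippage_count(samples)
```

A large count suggests that the chosen sample has slipped to the right of the others; the count is always at least one because the overall maximum itself is included.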

Non-parametric multivariate methods have been reviewed by Puri and Sen (1971). For multivariate multi-sample data we encounter all four possibilities for sub-ordering of the multivariate observations within any sample, as well as over the combined set of samples. Ranks and signs are the usual expression of ordering for the development of non-parametric methods. Marginal ranks and combined marginal ranks feature in this work, expressed as rank matrices for each component or aggregated over the components to form rank scores. Various forms of scoring of ranks are employed. Shane and Puri (1969) proposed a family of rank order tests based on sums of quadratic forms of linear signed rank statistics; results are


extended by Russell and Puri (1974). Rank order multi-sample tests of location are discussed by Sen and Puri (1967); tests of independence of subsets of components based on a single sample are considered by Puri et al. (1970); tests of linear combinations of component means for p-samples (one of which is a control sample) are examined by Tamura (1969); Sen (1969) considers tests for parallelism of several regression lines, whilst Sen and Puri (1970) present a multivariate analysis of covariance based on marginal ranks. Puri and Sen (1966) offer other rank order non-parametric multivariate procedures, as do Karlin and Truax (1960), Bennett (1964, 1968), Bhapkar (1966), Bhattacharyya and Johnson (1970) and Johnson and Mehrotra (1972). Mardia (1969b, 1970) considers a test of homogeneity of bivariate distributions based on ranks (in the combined sample) of angles between the observations and the overall sample mean.

Some work on sign tests for multivariate data includes Bennett (1962, concerning equality of means in two correlated multivariate normal distributions, p (4); Hodges (1955) and Bhattacharyya and Johnson (1969) using numbers of positive projections of observations on an arbitrary line, maximized with respect to the choice of the line; Blumen (1958, a test for the value of a bivariate median based on ordering slopes of lines from the observations to the sample median).

4.8. Graphical Methods

There has been built up over recent years an array of informal screening procedures for the examination of large-scale data sets. Lacking much knowledge of their statistical properties, such methods of "data analysis" nonetheless have high intuitive appeal and are finding widespread application. They include methods of cluster analysis and of multidimensional scaling; also means for quick assimilation of certain multivariate characteristics through judicious choice of graphical plotting procedures. These extend the univariate methods for assessing the validity of a probability model, or estimating its parameters (see, for example, Barnett, 1975), based on plotting ordered sample values against a convenient choice of plotting positions; or the "half-normal plots" of Daniel (1959) in the interpretation of factorial experiments. For multivariate data the methods still depend on an assessed ordering of certain aspects of the data and are, therefore, relevant to this review of multidimensional ordering.

Andrews et al. (1972) propose plotting of ordered radii, or angles, for bivariate data after polar transformation, to assess bivariate normality. Many papers by Wilk and Gnanadesikan (and vice versa) including (1961, 1964, 1968, 1969), and Gnanadesikan and Kettenring (1972), propose multivariate graphical procedures for a variety of purposes such as examination of residuals, identification of outliers and the determination of significant effects in multiple-response experimental data. Such work employs M-ordering, but more frequently R-ordering, based on "generalized distances" of various types and often leading to "gamma plots" in view of the ubiquity of the $\chi^2$ distribution for such distances resulting from an underlying normal error model.
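As a rough illustration of R-ordering by generalized distance, the sketch below orders bivariate observations by their squared distance from the sample mean in the metric of the sample covariance matrix, and pairs the ordered distances with chi-squared quantiles. It is not the procedure of any of the cited papers: the function names, the plotting positions $(i - 0.5)/n$ and the toy data are assumptions. For two degrees of freedom the chi-squared distribution function is $1 - e^{-x/2}$, so its quantiles have the closed form used here.

```python
import math


def ordered_generalized_distances(points):
    """Squared generalized distances of bivariate points from the sample
    mean, returned in increasing order as (distance, original index)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    d2 = []
    for i, (x, y) in enumerate(points):
        dx, dy = x - mx, y - my
        # Quadratic form (dx, dy) S^{-1} (dx, dy)' for the 2 x 2 matrix S.
        d2.append(((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det, i))
    return sorted(d2)


def chi2_2df_quantiles(n):
    """Chi-squared (2 d.f.) quantiles at the assumed positions (i - 0.5)/n."""
    return [-2.0 * math.log(1.0 - (i - 0.5) / n) for i in range(1, n + 1)]


# Hypothetical data; each pair is (quantile, ordered distance) for plotting.
points = [(0, 0), (1, 1), (2, 0), (1, -1), (1, 0)]
ordered = ordered_generalized_distances(points)
plot_pairs = list(zip(chi2_2df_quantiles(len(points)), (d for d, _ in ordered)))
```

Points falling well above the line through the bulk of such a plot would be candidates for closer inspection as outliers.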

5. CORRELATION AND ASSOCIATION

One area of multivariate analysis in which order plays an important implicit role is in the estimation and testing of association, or correlation, between the component variables in a bivariate distribution. Many measures of association, and tests of independence, are based on ranking the component observations or counting numbers of observations in different regions of the sample space after it has been partitioned in some manner. Thus we encounter M-ordering or P-ordering (sometimes C-ordering). We cannot fully survey the work on association or correlation but will make a brief review of estimators or tests in which ordering plays a particularly notable part. References will be given only to work in which the order basis is pronounced, or novel in some respect. A somewhat fuller description is given of some estimators of the correlation coefficient, $\rho$, in a bivariate normal distribution, based on the C-ordering concept of concomitants, and relevant to situations where we have limited information on one of the marginal samples.

5.1. Order-based Estimators, and Tests, of Association

P-ordering underlies many estimators or tests of correlation based on frequency counts of observations in different regions of the partitioned sample space. (Many such examples stem from the turn of the century, often in the work of Karl Pearson and Udny Yule, and have been rediscovered or re-examined on various subsequent occasions.) This is true of the tetrachoric, polychoric or biserial measures of correlation, in which the sample space is partitioned to form 2 × 2, r × s or 2 × s tables of frequencies; and of coefficients of colligation or of contingency.

Interpretation of such measures and assessment of their properties is often possible only if we assume an underlying normal distribution. Tests of independence, including the ubiquitous $\chi^2$ test, are manifold. Often the basis for categorization is outside the control of the investigator or reflects qualitative factors which are not readily interpretable in any ordering or ranking sense. Our current interest is restricted to situations in which the categories of classification reflect either direct, or implicit, ordering of the two marginal variables whether they be qualitative or quantitative. Here we witness partial sub-ordering.

This would be so if, for example, two quantitative variables were dichotomized with respect to large and small values of each for determination of (say) tetrachoric correlation, coefficient of contingency or application of a $\chi^2$ test of independence. Mosteller (1946) offers an improved tetrachoric correlation estimator of $\rho$ in $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$ by dichotomizing $x_2$ about $\mu_2$, but dividing the $x_1$-range into three parts corresponding to $x_1 < \mu_1 - k\sigma_1$, $\mu_1 - k\sigma_1 < x_1 < \mu_1 + k\sigma_1$, $x_1 > \mu_1 + k\sigma_1$. Counts in the four corners (omitting the middle region for $x_1$) improve on the tetrachoric estimator ($k = 0$) for special choice of $k$ dependent on $\rho$. The optimum $k$ for $\rho = 0$ is 0.612, when the estimator has asymptotic variance $1.939/n$. If the means and variances are unknown, Mosteller shows that similar advantages accrue if we partition on the basis of a certain C-ordering of the data set: retaining $m$ observations with lowest, and $m$ with highest, ordered $x_1$-values and dividing those observations in each retained $x_1$-group into two groups with respect to their ordered $x_2$-values. When $\rho = 0$, $m/n = 0.27$ is optimal. See also Ogawa (1962). Goodman and Kruskal (1954, 1959), Lancaster and Hamdan (1964) and Hamdan (1970) consider other aspects of measuring association from contingency tables.
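For the simplest case, in which both variables are cut at their centres, a quadrant-count estimate can be sketched as follows. The sketch relies on the standard bivariate normal result that the probability of the two centred variables having the same sign is $1/2 + \arcsin(\rho)/\pi$. Cutting at the sample medians rather than at known means, dropping points that lie on a median line, and the function name `quadrant_rho` are all assumptions of this illustration; it is not Mosteller's improved three-part estimator.

```python
import math
from statistics import median


def quadrant_rho(x1, x2):
    """Estimate rho from counts of points in concordant and discordant
    quadrants formed by the two sample medians (bivariate normal assumed)."""
    m1, m2 = median(x1), median(x2)
    agree = disagree = 0
    for a, b in zip(x1, x2):
        s = (a - m1) * (b - m2)
        if s > 0:
            agree += 1      # upper-right or lower-left quadrant
        elif s < 0:
            disagree += 1   # upper-left or lower-right quadrant
        # Points on a median line are ignored (an assumption of this sketch).
    p = agree / (agree + disagree)
    # Invert P(same sign) = 1/2 + arcsin(rho)/pi.
    return math.sin(math.pi * (p - 0.5))


# Hypothetical data: two points agree and two disagree, so the estimate is 0.
rho_hat = quadrant_rho([1, 2, 3, 4], [1, 4, 2, 3])
```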

Corresponding tests for association have been proposed, and studied, by Hotelling and Pabst (1936) and by Olmstead and Tukey (1947, a "corner test" based on counts in the four corners after partitioning about the marginal medians. Counts are made inward in each quadrant from the four extremities until forced to cross the median lines. Alternate + and - signs are attached to the counts in the successive quadrants and the accumulated signed counts provide a test statistic for assessing association. The test is also known as the "quadrant sum test"). Shahani (1969) suggests leaving a dead region about each marginal mean of width $2k\sigma_i$ for suitable choice of $k$ or, if $\mu_i$ and $\sigma_i$ are unknown, omitting about the mid 40 per cent of the (marginal) ordered $x_1$-values and the mid 40 per cent of the (marginal) ordered $x_2$-values. Mardia (1969a) presents further work on tests based on counts in four quadrants. Elston and Stewart (1970) offer a test of association based on constructing a contingency table of counts with class boundaries defined by particular marginal order statistics.

Another use of ordering in estimating or testing association arises when observations are marginally ranked as in the determination of rank correlation coefficients such as Kendall's $\tau$ or Spearman's $\rho$, or in rank order tests of independence. A detailed comparative review of the use of "ordinal measures of association" is given by Kruskal (1958). See also Aitkin (1964) and Aitkin and Hume (1965, 1966, 1968). Extension of rank correlation to the comparison of correlations between $X_1$ and $X_2$, and $X_3$ and $X_4$, in a multivariate distribution with $p = 4$, is considered by Davis and Quade (1968).
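Marginal ranking of this kind is easy to make concrete. The sketch below computes Spearman's coefficient from the two sets of marginal ranks using the familiar formula $1 - 6\sum d_i^2 / \{n(n^2 - 1)\}$, which holds when there are no ties; the helper names and the data are assumptions for illustration.

```python
def ranks(values):
    """Marginal ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def spearman_rho(x1, x2):
    """Spearman's rank correlation from the marginal ranks (no ties)."""
    n = len(x1)
    # Sum of squared differences between the two marginal ranks.
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x1), ranks(x2)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical bivariate sample of size 4.
rs = spearman_rho([10, 20, 30, 40], [2, 3, 1, 4])
```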


A basic order concept of relevance to rank correlation is the idea of order association, proposed by Kendall (1962) and having some similarity with the "layer rank" concept. Two bivariate observations $(x_{11}, x_{21})$ and $(x_{12}, x_{22})$ are said to be:

concordant, if $(x_{11} < x_{12},\ x_{21} < x_{22})$ or $(x_{11} > x_{12},\ x_{21} > x_{22})$;

discordant, if $(x_{11} < x_{12},\ x_{21} > x_{22})$ or $(x_{11} > x_{12},\ x_{21} < x_{22})$;

tied, otherwise.
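The three-way classification can be sketched directly, together with the simplest rank correlation built from it: the difference between the numbers of concordant and discordant pairs divided by the total number of pairs. The function names and data are assumptions made for the example, and no correction for ties is attempted.

```python
from itertools import combinations


def classify_pair(a, b):
    """Classify two bivariate observations a = (x1, x2) and b = (x1, x2)."""
    # The product of the component differences is positive exactly when
    # both components order the two observations the same way.
    s = (a[0] - b[0]) * (a[1] - b[1])
    if s > 0:
        return "concordant"
    if s < 0:
        return "discordant"
    return "tied"


def kendall_tau(data):
    """Return ((concordant - discordant) / number of pairs, counts)."""
    counts = {"concordant": 0, "discordant": 0, "tied": 0}
    for a, b in combinations(data, 2):
        counts[classify_pair(a, b)] += 1
    n = len(data)
    tau = (counts["concordant"] - counts["discordant"]) / (n * (n - 1) / 2)
    return tau, counts


# Hypothetical bivariate sample of size 4 (six pairs in all).
tau, counts = kendall_tau([(1, 2), (2, 3), (3, 1), (4, 4)])
```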

The "quadrant dependence" concept described by Lehmann (1966) is relevant here, in providing a model for association between random variables X1 and X2 in which large (or small) values of X1 and X2 tend to be associated (either positively or negatively).

Daniels (1944, 1948) proposes a general class of correlation estimators ("coefficients of disarray") which include as special cases the familiar particular estimators (product-moment, Kendall's $\tau$ and Spearman's $\rho$). Farlie (1961) investigates the efficiency of Daniels's generalized correlation coefficients.

Other quick estimators of $\rho$ using alternative sub-ordering ideas have been proposed by Chown and Moran (1951) in the form

\[ \sum_{i=1}^{n-1} \operatorname{sgn}(x_{1,i} - x_{1,i+1})\, \operatorname{sgn}(x_{2,i} - x_{2,i+1}) \Big/ (n - 1) \]

with efficiency about 0.33 when $\rho = 0$ (but note how the estimator depends on the fortuitous order in which the observations arise), and by Leigh-Dugmore (1953) based on the "range of the deviations about the reduced major axis".
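The dependence on the order of arrival is easy to see numerically. The sketch below evaluates the displayed expression (the mean product of signs of successive differences) for the same five pairs taken in two different orders; the function names and data are assumptions for illustration, and nothing beyond the displayed expression is computed.

```python
def sign(v):
    """Sign of v as -1, 0 or +1."""
    return (v > 0) - (v < 0)


def successive_sign_statistic(x1, x2):
    """Mean product of the signs of successive differences in x1 and x2."""
    n = len(x1)
    total = sum(
        sign(x1[i] - x1[i + 1]) * sign(x2[i] - x2[i + 1]) for i in range(n - 1)
    )
    return total / (n - 1)


# Hypothetical pairs in their order of arrival.
x1 = [1, 3, 2, 5, 4]
x2 = [2, 4, 5, 6, 1]
as_observed = successive_sign_statistic(x1, x2)

# The same pairs re-ordered by their x1-values give a different answer.
pairs = sorted(zip(x1, x2))
reordered = successive_sign_statistic([a for a, _ in pairs], [b for _, b in pairs])
```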

5.2. Correlation Estimators Based on Concomitants

Watterson (1959) examined linear estimation of the parameters of a multivariate normal distribution when different forms of censoring affect the observational data. To discuss the idea of multivariate censoring he suggested a method of ordering a multivariate sample $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ in terms of the marginal ordering of one component (say, $x_1$). The sample is then represented as

\[
\begin{array}{cccc}
x_{1(1)}, & x_{1(2)}, & \ldots, & x_{1(n)} \\
x_{2[1]}, & x_{2[2]}, & \ldots, & x_{2[n]} \\
\vdots & & & \vdots \\
x_{p[1]}, & x_{p[2]}, & \ldots, & x_{p[n]}
\end{array}
\]

where $x_{r[s]}$ $(r > 1)$ is the value of the $r$th component which has arisen in association with the $s$th ordered value of $x_1$. Thus we observe a C-ordering principle being used: the $x_{r[s]}$ are quasi-ordered component observations conditional on an ordering of the $x_1$-component values. Linear combinations of the $x_{r[s]}$ are then considered for estimation of first- and second-order moments.
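The representation above amounts to sorting the observations on one component and carrying the remaining components along. A minimal sketch, in which the function name `concomitants`, its `key` argument and the data are assumptions made for illustration:

```python
def concomitants(sample, key=0):
    """Order multivariate observations on component `key`.

    Returns one row per component: row `key` holds the order statistics
    of that component and every other row holds the values that arose
    with them (the concomitants), in the same order."""
    ordered = sorted(sample, key=lambda obs: obs[key])
    p = len(sample[0])
    return [[obs[r] for obs in ordered] for r in range(p)]


# Hypothetical sample of n = 3 observations on p = 3 components.
sample = [(3.1, 0.5, 7), (1.2, 2.0, 9), (2.4, -1.0, 8)]
rows = concomitants(sample)
```

The first row is then the ordered first component, while the second and third rows are generally not in increasing order, which is the sense in which they are only quasi-ordered.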

David (1973) and David and Galambos (1974) reconsider such a principle for ordering a bivariate sample and term the $x_{2[s]}$ the concomitants of the order statistics of $X_1$. Likewise, we could define $x_{1[s]}$ as concomitants of $X_2$. Figs 7 and 8 illustrate the concomitants of $X_1$ for samples 1 and 2, and comparing the two figures makes intuitively clear the potential information the concomitants convey about association between $X_1$ and $X_2$.

David and Galambos were not concerned with statistical inference problems. They extend the distribution theory results of Watterson. Recent work by Barnett et al. (1976) utilizes such results for an uncensored bivariate normal distribution $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$ where $\mu_2$ and $\sigma_2^2$ are known. Extending Watterson's results, some useful new concomitant-based estimators of $\rho$ are described and examined.

[Figures omitted: scatter plots of two samples with the concomitants $x_{2[s]}$ labelled.]

FIG. 7. Concomitants of $X_1$: sample 1.

FIG. 8. Concomitants of $X_1$: sample 2.

In view of the linearity and homoscedasticity of the regression of X2 on X1 we can write

\[ X_{2[s]} = \rho\,\frac{X_{1(s)} - \mu_1}{\sigma_1} + Z_s \qquad (s = 1, 2, \ldots, n) \]

(assuming, without loss of generality, that $\mu_2 = 0$ and $\sigma_2 = 1$) where the $Z_s$ are independent, $N[0, (1 - \rho^2)]$. Thus if $\mathbf{X}_2$ denotes the vector of concomitants of the order statistics of $X_1$, its mean, and variance-covariance matrix are

\[ E(\mathbf{X}_2) = \rho\boldsymbol{\alpha}, \qquad V(\mathbf{X}_2) = (1 - \rho^2)\mathbf{I} + \rho^2\mathbf{V} \]

where $\boldsymbol{\alpha}$, $\mathbf{V}$ are the mean and variance-covariance matrix of the reduced order statistics $(X_{(i)} - \mu)/\sigma$ for a univariate sample of size $n$ from $N(\mu, \sigma^2)$, and $\mathbf{I}$ is the $n \times n$ identity matrix. Asymptotically $\{X_{2[s]} - \rho\alpha_s\}$ $(s = 1, 2, \ldots, n)$ are independent $N(0, 1 - \rho^2)$. See David and Galambos (1974).
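The stated mean and variance-covariance matrix follow in one step from the linear regression representation. As a short sketch, write $U_{(s)}$ (a symbol introduced here for convenience) for the $s$th reduced order statistic of the $X_1$-sample, with mean $\alpha_s$ and covariances $v_{st}$, and assume as in the text that the $Z_s$ are independent of one another and of the $U_{(s)}$:

```latex
% Representation of the concomitants (mu_2 = 0, sigma_2 = 1):
X_{2[s]} = \rho\, U_{(s)} + Z_s , \qquad Z_s \sim N(0,\, 1 - \rho^2).

% Mean: E(Z_s) = 0, so
E\bigl(X_{2[s]}\bigr) = \rho\, E\bigl(U_{(s)}\bigr) = \rho\, \alpha_s .

% Covariance: the cross terms vanish because Z is independent of U, so
\operatorname{cov}\bigl(X_{2[s]}, X_{2[t]}\bigr)
  = \rho^2 \operatorname{cov}\bigl(U_{(s)}, U_{(t)}\bigr)
    + \operatorname{cov}(Z_s, Z_t)
  = \rho^2 v_{st} + (1 - \rho^2)\,\delta_{st} ,

% which in matrix form is
V(\mathbf{X}_2) = \rho^2 \mathbf{V} + (1 - \rho^2)\,\mathbf{I}.
```

Here $\delta_{st}$ is 1 when $s = t$ and 0 otherwise.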

Exploiting the form of $E(\mathbf{X}_2)$, $V(\mathbf{X}_2)$ and of the asymptotic distribution of $X_{2[s]} - \rho\alpha_s$, Barnett et al. (1976) consider various linear estimators of $\rho$ as well as generalized least squares, and maximum likelihood, estimators. Some readily calculable estimators, with reasonable efficiency characteristics, are presented.

An interesting feature of this approach is the irrelevance of the values of $\mu_1$ and $\sigma_1$ to the moment, and asymptotic distribution, properties of the concomitants, $\mathbf{X}_2$. Thus lack of knowledge of $\mu_1$ and $\sigma_1^2$ is no obstacle to such estimation of $\rho$. Indeed, we can proceed even if we know only the ranks of observations in the marginal sample of $X_1$ values.
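To make the rank-only case concrete, the sketch below estimates the correlation from the ranks of the first variable and the standardized values of the second. It uses one simple linear estimator suggested by the mean of the concomitants being proportional to the expected normal order statistics, namely least squares through the origin. This is an illustrative choice and is not claimed to be one of the estimators of Barnett et al. (1976). The expected order statistics are approximated by Blom's formula, which is a further assumption, as are the function names.

```python
from statistics import NormalDist


def blom_scores(n):
    """Approximate expected standard normal order statistics (Blom)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def rho_from_ranks(x1_ranks, x2_values):
    """Least squares through the origin of x2 on the normal scores.

    x1_ranks  : rank (1..n) of each observation in the X1 sample
    x2_values : matching X2 values, assumed already standardized
                (known mean subtracted, divided by known s.d.)"""
    n = len(x2_values)
    alpha = blom_scores(n)
    # The observation with rank s in X1 supplies the concomitant for score s.
    num = sum(alpha[r - 1] * x2 for r, x2 in zip(x1_ranks, x2_values))
    den = sum(a * a for a in alpha)
    return num / den


# Hypothetical check: if X2 equals the score of its X1-rank, the estimate is 1.
x1_ranks = [3, 1, 2, 5, 4]
scores = blom_scores(5)
perfect = rho_from_ranks(x1_ranks, [scores[r - 1] for r in x1_ranks])
```

Because the estimate is not constrained to lie between -1 and 1, a practical version might truncate it to that interval.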

Such a limited information situation could arise in practice. Consider estimating the correlation between adjudged grades of a group of individuals and values on some subsequent performance measure. In educational, psychological or industrial testing we can encounter situations where the earlier results have been recorded (or even collected) as rank orders, but where we can realistically postulate a joint normal distribution for the inaccessible earlier values (on which the ranks are based) and the observable later values.

6. CONCLUSION

No reasonable basis exists for fully ordering a set of multivariate observations. Notwithstanding this fact, ideas of ordering permeate the study of multivariate distributions and methods of multivariate or multi-sample analysis. We have reviewed the manners in which order relationships are introduced and applied, facilitated by a four-fold classification of sub-ordering principles. Ordering clearly plays an important implicit role in multivariate statistical theory and method; its more formal recognition and study might yield added benefits.

ACKNOWLEDGEMENT

The author is grateful to the referees for suggesting some additional references.

REFERENCES

AFONJA, B. (1972). The moments of the maximum of correlated normal and t-variates. J. R. Statist. Soc. B, 34, 251-262.

AITKIN, M. A. (1964). Correlation in a singly truncated bivariate normal distribution. Psychometrika, 29, 263-270.

- (1966). Correlation in a singly truncated bivariate normal distribution. III. Correlation between ranks and variate-values. Biometrika, 53, 278-281.

- (1968). Correlation in a singly truncated bivariate normal distribution. IV. Empirical variances of rank correlation coefficients. Biometrika, 55, 437-438.

AITKIN, M. A. and HUME, M. W. (1965). Correlation in a singly truncated bivariate normal distribution. II. Rank correlation. Biometrika, 52, 639-643.

ANDERBERG, M. R. (1973). Cluster Analysis for Applications. New York: Academic Press.

ANDERSON, T. W. (1966). Some nonparametric multivariate procedures based on statistically equivalent blocks. "Krishnaiah I", 5-27.

ANDREWS, D. F. (1972). Plots of high-dimensional data. Biometrics, 28, 125-136.

ANDREWS, D. F., GNANADESIKAN, R. and WARNER, J. L. (1972). Methods for assessing multivariate normality. "Krishnaiah III", 95-115.

AUSTIN, T. L., JR (1959). An approximation to the point of minimum aggregate distance. Metron, 19, 10-21.

BARNDORFF-NIELSEN, O. and SOBEL, M. (1966). On the distribution of the number of admissible points in a vector random sample. Theor. Probability Appl., 11, 249-269.

BARNETT, V. (1975). Probability plotting methods and order statistics. Appl. Statist., 24, 95-108.

BARNETT, V., GREEN, P. G. and ROBINSON, A. (1976). Concomitants and correlation estimates. Biometrika, 63, in the press.

BELL, C. B. and HALLER, H. S. (1969). Bivariate symmetry tests: parametric and nonparametric. Ann. Math. Statist., 40, 259-269.

BENNETT, B. M. (1962). On multivariate sign tests. J. R. Statist. Soc. B, 24, 159-161.

- (1964). A bivariate signed rank test. J. R. Statist. Soc. B, 26, 457-461.

- (1966). Note on confidence limits for a ratio of bivariate medians. Metrika, 10, 52-54.

- (1968). On estimation of a ratio of multivariate medians by non-parametric methods. Metrika, 12, 22-28.

BERMAN, S. M. (1962a). Limiting distribution of the maximum term in sequences of dependent random variables. Ann. Math. Statist., 33, 894-908.

- (1962b). Convergence to bivariate limiting extreme value distributions. Ann. Inst. Statist. Math., Tokyo, 13, 217-223.

BHAPKAR, V. P. (1961). A nonparametric test for the problem of several samples. Ann. Math. Statist., 32, 1108-1117.

- (1966). Some nonparametric tests for the multivariate several sample location problem. "Krishnaiah I", 29-41.

BHATTACHARYYA, G. K. and JOHNSON, R. A. (1969). On Hodges's bivariate sign test and a test for uniformity of a circular distribution. Biometrika, 56, 446-449.

- (1970). A layer rank test for ordered bivariate alternatives. Ann. Math. Statist., 41, 1296-1310.

BLUMEN, I. (1958). A new bivariate sign test. J. Amer. Statist. Ass., 53, 448-456.

BRESLOW, N. A. (1970). A generalized Kruskal-Wallis test for comparing k samples subject to marginal patterns of censorship. Biometrika, 57, 579-594.

CACOULLOS, T. and DECICCO, H. (1967). On the distribution of the bivariate range. Technometrics, 9, 476-480.

CARNAL, H. (1970). Die konvexe Hülle von n rotationssymmetrisch verteilten Punkten. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 15, 168-176.

CHATTERJEE, S. J. (1972). Rank approach to the multivariate two-population mixture problem. J. Mult. Anal., 2, 261-281.

CHERNOFF, H. (1973). Using faces to represent points in k-dimensional space graphically. J. Amer. Statist. Ass., 68, 361-368.

CHOWN, L. N. and MORAN, P. A. P. (1951). Rapid methods for estimating correlation coefficients. Biometrika, 38, 464-467.

COHEN, A. C., JR (1955). Maximum likelihood estimation of the dispersion parameter of a chi-distributed radial error from truncated and censored samples with applications to target analysis. J. Amer. Statist. Ass., 50, 1122-1135.

COHN, R., MOSTELLER, F., PRATT, J. W. and TATSUOKA, M. (1960). Maximizing the probability that adjacent order statistics of samples from several populations form overlapping intervals. Ann. Math. Statist., 31, 1095-1104.

CONOVER, W. J. (1965). A k-sample model in order statistics. Ann. Math. Statist., 36, 1223-1235.

DANIEL, C. (1959). Use of half-normal plots in factorial two-level experiments. Technometrics, 1, 311-341.

DANIELS, H. E. (1944). The relation between measures of correlation in the universe of sample permutations. Biometrika, 33, 129-135.

- (1948). A property of rank correlations. Biometrika, 35, 416-417.

- (1952). The covering circle of a sample from a circular normal distribution. Biometrika, 39, 137-143.

VAN DANTZIG, D. and HEMELRIJK, J. (1954). Statistical methods based on few assumptions. Bull. Int. Statist. Inst., 34(2), 239-267.

DAVID, H. A. (1966). A note on "A k-sample model in order statistics" by W. J. Conover. Ann. Math. Statist., 37, 287-288.

- (1970). Order Statistics. New York: Wiley.

- (1973). Concomitants of order statistics. Bull. Int. Statist. Inst., 45, 295-300.

DAVID, H. A. and GALAMBOS, J. (1974). The asymptotic theory of concomitants of order statistics. J. Appl. Prob., 11, 762-770.

DAVID, H. A. and JOSHI, P. C. (1968). Recurrence relations between moments of order statistics for exchangeable variates. Ann. Math. Statist., 39, 272-274.

DAVIS, C. E. and QUADE, D. (1968). On comparing the correlations within two pairs of variables. Biometrics, 24, 987-995.

DEVLIN, S. J., GNANADESIKAN, R. and KETTENRING, J. R. (1975). Robust estimation and outlier detection with correlation coefficients. Biometrika, 62, 531-546.

DWASS, M. (1960). Some k-sample rank-order tests. Chapter 17 in Contributions to Probability and Statistics. Essays in Honor of Harold Hotelling (Olkin et al., eds). Stanford: Stanford University Press.

EFRON, B. (1965). The convex hull of a random set of points. Biometrika, 52, 331-343.

ELKINS, T. A. (1968). Cubical and spherical estimation of multivariate probability density. J. Amer. Statist. Ass., 63, 1495-1513.

ELSTON, R. C. and STEWART, J. (1970). A new test of association for continuous variables. Biometrics, 26, 305-314.

FARLIE, D. J. G. (1961). The asymptotic efficiency of Daniels's generalized correlation coefficients. J. R. Statist. Soc. B, 23, 128-141.

FELLEGI, I. P. (1975). Automatic editing and imputation of quantitative data. I.S.I. Conference, Warsaw.

FERGUSON, T. S. (1961). On the rejection of outliers. Proc. 4th Berkeley Symp. Math. Stat. Prob., 1, 253-287.

FINKELSHTEIN, B. V. (1953). Limiting distribution of extreme terms of variational series of a two-dimensional random variable. Dokl. Ak. Nauk. S.S.S.R., 91, 000-0. (In Russian.)

FISHER, L. (1966). The convex hull of a sample. Bull. Amer. Math. Soc., 72, 555-558.

- (1969). Limiting sets and convex hulls of samples from product measures. Ann. Math. Statist., 40, 1824-1832.

GALAMBOS, J. (1975). Order statistics of samples from multivariate distributions. J. Amer. Statist. Ass., 70, 674-680.

GALLOT, S. (1966). A bound for the maximum of a number of random variables. J. Appl. Prob., 3, 556-558.

GEFFROY, J. (1959). Contribution à la théorie des valeurs extrêmes. Publ. Inst. Stat. Paris, 8, 123-184.

- (1961). Localisation asymptotique du polyèdre d'appui d'un échantillon Laplacien à k dimensions. Publ. Inst. Statist. Univ. Paris, 10, 213-228.

GENTLE, J. E. G., KODELL, R. L. K. and SMITH, P. L. S. (1975). On the distribution of the Studentised bivariate range. Technometrics, 17, 501-505.

GESSAMAN, M. P. (1970). A consistent nonparametric multivariate density estimator based on statistically equivalent blocks. Ann. Math. Statist., 41, 1344-1346.

GESSAMAN, M. P. and GESSAMAN, P. H. (1972). A comparison of some multivariate discrimination procedures. J. Amer. Statist. Ass., 67, 468-472.

GNANADESIKAN, R. and KETTENRING, J. R. (1972). Robust estimates, residuals and outlier detection with multiresponse data. Biometrics, 28, 81-124.

GNANADESIKAN, R. and WILK, M. B. (1969). Data analysis methods in multivariate statistical analysis. "Krishnaiah II", 593-638.

GOODMAN, L. A. and KRUSKAL, W. H. (1954, 1959). Measures of association for cross classifications. Part I. J. Amer. Statist. Ass., 49, 732-764. Part II. J. Amer. Statist. Ass., 54, 123-163.

GREIG, M. (1967). Extremes in a random assembly. Biometrika, 54, 273-282.

GUMBEL, E. J. (1961). Multivariate extremal distributions. Bull. Int. Statist. Inst., 33e sess., 2e liv., Paris.

GUMBEL, E. J. and GOLDSTEIN, N. (1964). Analysis of empirical bivariate extremal distributions. J. Amer. Statist. Ass., 59, 794-816.

GUMBEL, E. J. and MUSTAFI, C. K. (1967). Some analytical properties of bivariate extremal distributions. J. Amer. Statist. Ass., 62, 569-588.

GUPTA, S. S. (1963). Probability integrals of multivariate normal and multivariate t. Ann. Math. Statist., 34, 792-828.

GUPTA, S. S., NAGEL, K. and PANCHAPAKESAN, S. (1973). On the order statistics from equally correlated normal random variables. Biometrika, 60, 403-413.

GUPTA, S. S. and PILLAI, K. C. S. (1965). On linear functions of ordered correlated random variables. Biometrika, 52, 367-379.

GUPTA, S. S., PILLAI, K. C. S. and STECK, G. P. (1964). On the distribution of linear functions and ratios of linear functions of ordered correlated normal random variables with emphasis on range. Biometrika, 51, 143-151.

GUTTMAN, I. (1973). Care and handling of univariate or multivariate outliers in detecting spuriosity - a Bayesian approach. Technometrics, 15, 723-738.

HABERMAN, S. (1955). Distributions of Kendall's tau based on partially ordered systems. Biometrika, 42, 417-424.

HALDANE, J. B. S. (1948). Note on the median of a multivariate distribution. Biometrika, 35, 414-415.

HAMDAN, M. A. (1970). The equivalence of tetrachoric and maximum likelihood estimates of ρ in 2 × 2 tables. Biometrika, 57, 212-215.

HARTLEY, H. O. (1950). The use of the range in analysis of variance. Biometrika, 37, 271-289.

HAWKINS, D. M. (1974). The detection of errors in multivariate data using principal components. J. Amer. Statist. Ass., 69, 340-344.

HEALY, M. J. R. (1968). Multivariate normal plotting. Appl. Statist., 17, 157-161.

HODGES, J. L., JR (1955). A bivariate sign test. Ann. Math. Statist., 26, 523-527.

HOEL, P. G. and SCHEUER, E. M. (1961). Confidence sets for multivariate medians. Ann. Math. Statist., 32, 477-484.

HOLLANDER, M. (1971). A nonparametric test for bivariate symmetry. Biometrika, 58, 203-212.

HOTELLING, H. and PABST, M. R. (1936). Rank correlation and tests of significance involving no assumptions of normality. Ann. Math. Statist., 7, 29-43.

ISHII, G. and YAMASAKI, M. (1961). A note on the testing of homogeneity of k binomial experiments based on the range. Ann. Inst. Stat. Math. Tokyo, 12, 273-278.

JOHNSON, R. A. and MEHROTRA, K. G. (1972). Nonparametric tests for ordered alternatives in the bivariate case. J. Mult. Anal., 2, 219-229.

KAPUR, M. N. (1957). A property of the optimum solution suggested by Paulson for the k-sample slippage problem for the normal distribution. Ind. Soc. Agric. Statist. 9, 179-190.

KARLIN, S. and TRUAX, D. (1960). Slippage problems. Ann. Math. Statist., 31, 296-324.

KENDALL, M. G. (1962). Rank Correlation Methods, 3rd ed. New York: Hafner.

- (1966). Discrimination and classification. "Krishnaiah I", 165-184.

KOZELKA, R. M. (1956). Approximate upper percentage points for extreme values in multinomial sampling. Ann. Math. Statist., 27, 507-512.

KREIMERMAN, J. (1975). A bivariate test of goodness of fit based on a gradually increasing number of order statistics. Tech. Report No. 250, Department of Operations Research, College of Engineering, Cornell University.

KRISHNAIAH, P. R. (ed.) Multivariate Analysis, Vol. I (1966), Vol. II (1969), Vol. III (1972). New York: Academic Press.

KRUSKAL, W. H. (1958). Ordinal measures of association. J. Amer. Statist. Ass., 53, 814-861.

KUDO, A. (1956). On the testing of outlying observations. Sankhyā A, 17, 67-76.

- (1957). The extreme value in a multivariate normal sample. Mem. Fac. Sci. Kyushu Univ. (A), 11, 143-156.

KURTZ, T. E., LINK, R. F., TUKEY, J. W. and WALLACE, D. L. (1966). Correlation of ranges of correlated deviates. Biometrika, 53, 191-197.

LANCASTER, H. O. and HAMDAN, M. A. (1964). Estimation of the correlation coefficient in contingency tables with possibly nonmetrical characters. Psychometrika, 29, 383-391.

LEHMANN, E. L. (1966). Some concepts of dependence. Ann. Math. Statist., 37, 1137-1153.

LEIGH-DUGMORE, C. H. (1953). A rapid method for estimating the correlation coefficient from the range of the deviations about the reduced major axis. Biometrika, 40, 218-219.

LEWIS, J. L. (1972). A k-sample test based on range intervals. Biometrika, 59, 155-160.

LOFTSGAARDEN, D. O. and QUESENBERRY, C. P. (1965). A nonparametric estimate of a multivariate density function. Ann. Math. Statist., 36, 1049-1051.

MARDIA, K. V. (1964a). Asymptotic independence of bivariate extremes. Calcutta Statist. Ass. Bull., 13, 172-178.

- (1964b). Exact distributions of extremes, ranges and mid-ranges in samples from any multivariate population. J. of the Indian Stat. Assn, 2, 126-130.

- (1964c). Some results on the order statistics of the multivariate normal and Pareto type 1 populations. Ann. Math. Statist., 35, 1815-1818.

- (1967). Correlation of the ranges of correlated samples. Biometrika, 54, 529-539.

- (1969a). The performance of some tests of independence for contingency-type bivariate distributions. Biometrika, 56, 449-451.

- (1969b). On the null distribution of a nonparametric test for the bivariate two-sample problem. J. R. Statist. Soc. B, 31, 98-102.

- (1970). A bivariate non-parametric c-sample test. J. R. Statist. Soc. B, 32, 74-89.

MARSHALL, A. W. and OLKIN, I. (1967). A multivariate exponential distribution. J. Amer. Statist. Ass., 62, 30-44.

MOOD, A. M. (1941). On the joint distribution of the medians in samples from a multivariate population. Ann. Math. Statist., 12, 268-278.

MOSTELLER, F. (1946). On some useful "inefficient" statistics. Ann. Math. Statist., 17, 377-408.

- (1948). A k-sample slippage test for an extreme population. Ann. Math. Statist., 19, 58-65.

MOSTELLER, F. and TUKEY, J. W. (1950). Significance levels for a k-sample slippage test. Ann. Math. Statist., 21, 120-123.

MURTHY, V. K. (1966). Nonparametric estimation of multivariate densities with applications. "Krishnaiah I", 43-56.

MUSTAFI, C. K. (1969). A recurrence relation for distribution. J. Amer. Statist. Ass., 64, 600-601.

NAUS, J. I. (1965). Clustering of random points in two dimensions. Biometrika, 52, 263-267.

NEWELL, G. F. (1964). Asymptotic extremes for m-dependent random variables. Ann. Math. Statist., 35, 1322-1325.

ODEH, R. E. (1967). The distribution of the maximum sum of ranks. Technometrics, 9, 271-278.

OGAWA, J. (1962). Chapter 10F in Contributions to Order Statistics (A. E. Sarhan and B. G. Greenberg, eds). New York: Wiley.

OLMSTEAD, P. S. and TUKEY, J. W. (1947). A corner test for association. Ann. Math. Statist., 18, 495-513.

OWEN, D. B. and STECK, G. P. (1962). Moments of order statistics from the equicorrelated multivariate normal distribution. Ann. Math. Statist., 33, 1286-1291.

PEARSON, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Phil. Mag., 50, 157-172.

POSNER, E. C., RODEMICH, E. R., ASHLOCK, J. C. and LURIE, S. (1969). Application of an estimator of high efficiency in bivariate extreme value theory. J. Amer. Statist. Ass., 64, 1403-1414.

PURI, M. L. and SEN, P. K. (1966). On a class of multivariate multisample rank order tests. Sankhyā A, 28, 353-376.

- (1971). Nonparametric Methods in Multivariate Analysis. New York: Wiley.

PURI, M. L., SEN, P. K. and GOKHALE, D. V. (1970). On a class of rank order tests for independence in multivariate distributions. Sankhyā A, 32, 271-298.

QUESENBERRY, C. P. and GESSAMAN, M. P. (1968). Nonparametric discrimination using tolerance regions. Ann. Math. Statist., 39, 664-673.

RENYI, A. and SULANKE, R. (1963, 1964). Über die konvexe Hülle von n zufällig gewählten Punkten I and II. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 2, 75-84 and 3, 138-148.

RICHARDS, L. E. (1972). Refinement and extension of distribution-free discriminate analysis. Appl. Statist., 21, 174-176.

ROHLF, F. J. (1975). Generalisation of the gap test for the detection of multivariate outliers. Biometrics, 31, 93-101.

RUSSELL, C. T. and PURI, M. L. (1974). Joint asymptotic normality for a class of rank order statistics in multivariate paired comparisons. J. Mult. Anal., 4, 88-105.

SARHAN, A. E. and GREENBERG, B. G. (eds) (1962). Contributions to Order Statistics. New York: Wiley.

SAVAGE, I. R. (1964). Contributions to the theory of rank order statistics: applications of lattice theory. Rev. Int. Statist. Inst., 32, 52-64.

SAW, G. J. (1966). A nonparametric comparison of two samples one of which is censored. Biometrika, 53, 599-602.

SEN, P. K. (1967). Nonparametric tests for multivariate interchangeability. Part 1: Problems of location and scale in bivariate distributions. Sankhyā A, 29, 351-372.

- (1968). Asymptotic normality of sample quantiles for m-dependent processes. Ann. Math. Statist., 39, 1724-1730.

- (1969). On a class of rank order tests for the parallelism of several regression lines. Ann. Math. Statist., 40, 1668-1683.

SEN, P. K. and PURI, M. L. (1967). On the theory of rank order tests for location in the multivariate one sample problem. Ann. Math. Statist., 38, 1216-1228.

- (1970). Asymptotic theory of likelihood ratio and rank order tests in some multivariate linear models. Ann. Math. Statist., 41, 87-100.

SEYMOUR, D. R. (1970). Note on Austin's "An approximation to the point of minimum aggregate distance". Metron, 28, 412-421.

SHAHANI, A. K. (1969). A simple graphical test of association for large samples. Appl. Statist., 18, 185-190.

SHANE, H. D. and PURI, M. L. (1969). Rank order tests for multivariate paired comparisons. Ann. Math. Statist., 40, 2101-2117.

SIBUYA, M. (1960). Bivariate extremal statistics, I. Ann. Inst. Stat. Math., Tokyo, 11, 195-210.

SIDDIQUI, M. M. (1960). Distribution of quantiles in samples from a bivariate population. J. Res. Nat. Bur. Stand., 64B, 145-150.

SINGH, N. (1960). Estimation of parameters of a multivariate normal population from truncated and censored samples. J. R. Statist. Soc. B, 22, 307-311.

SIOTANI, M. (1959). The extreme value of the generalized distances of the individual points in the multivariate normal sample. Ann. Inst. Statist. Math., Tokyo, 10, 183-208.

SMITH, W. B. and HARTLEY, H. O. (1968). A note on the correlation of ranges in correlated normal samples. Biometrika, 55, 595-597.

SRIVASTAVA, O. P. (1967). Asymptotic independence of certain statistics connected with the extreme order statistics in a bivariate distribution. Sankhyā A, 29, 175-182.

SRIVASTAVA, O. P., HARKNESS, W. L. and BARTOO, J. B. (1964). Asymptotic distribution of distances between order statistics from bivariate populations. Ann. Math. Statist., 35, 748-754.

STECK, G. P. and OWEN, D. B. (1962). A note on the equicorrelated multivariate normal distribution. Biometrika, 49, 269-271.

TAMURA, R. (1969). Some multivariate comparison procedures based on ranks. Ann. Math. Statist., 40, 1486-1491.

TEICHROEW, D. (1955). Probabilities associated with order statistics in samples from two normal populations with equal variance. Army Chemical Center, Maryland, Chemical Corps Engineering Agency.

THOMAS, E. A. C. (1969). Distribution free tests for mixed probability distributions. Biometrika, 56, 475-484.

TIAGO DE OLIVEIRA, J. (1959). Extremal distributions. Rev. Fac. Ciencias Lisbon, Ser. 2, A, 7, 219-227.

- (1962). Structure theory of bivariate extremes; extensions. Estudos de Mathematica, Estatistica E. Econometria, 7, 165-195.

- (1965). Statistical decision for bivariate extremes. Port. Math., 24, 145-154.

- (1968). Extremal processes; definition and properties. Publ. Inst. Stat. Univ. Paris, 17, (2), 25-36.

- (1970). Biextremal distributions: statistical decision. Trab. Estad. y Inv. Oper., 21, 107-117.

- (1971). A new model for bivariate extremes: statistical decision. Istituto de Calcolo delle Probabilita dell'Università degli Studi di Roma, ed. Studi di Probabilita Statistica e Ricerca Operativa in Onore de Giuseppe Pompilj, pp. 437-449. Gubbio: Oderisi.

- (1974). Regression in the nondifferentiable bivariate extreme models. J. Amer. Statist. Ass., 69, 816-818.

- (1975). Bivariate extremes: extensions. I.S.I. Conference, Warsaw.

TSUKIBAYASHI, S. (1962). Estimation of bivariate parameters based on range. Rep. Statist. Appl. Res. JUSE, 9, 10-23.

TUKEY, J. W. (1947). Nonparametric estimation. II. Statistically equivalent blocks and tolerance regions in the continuous case. Ann. Math. Statist., 18, 529-539.

VINCZE, I. (1961). On two-sample tests based on order statistics. Proc. 4th Berkeley Symposium Math. Stat. Prob., 1, 695-705.

WALD, A. (1943). An extension of Wilks' method for setting tolerance limits. Ann. Math. Statist., 14, 45-55.

WATSON, G. S. (1954). Extreme values in samples from m-dependent stationary stochastic processes. Ann. Math. Statist., 25, 798-800.

WATTERSON, G. A. (1959). Linear estimation in censored samples from multivariate normal populations. Ann. Math. Statist., 30, 814-824.

WEISS, L. (1960). Two-sample tests for multivariate distributions. Ann. Math. Statist., 31, 159-164.

- (1964). On the asymptotic joint normality of quantiles from a multivariate distribution. J. Res. Nat. Bur. Stand., 68B, 65-66.

WILK, M. B. and GNANADESIKAN, R. (1961). Graphical analysis of multiple response experimental data using ordered distances. Proc. Nat. Acad. Sci., USA, 47, 1209-1212.

- (1964). Graphical methods for internal comparisons in multiresponse experiments. Ann. Math. Statist., 35, 613-631.

- (1968). Probability plotting methods for the analysis of data. Biometrika, 55, 1-18.

WILKS, S. S. (1941). On the determination of sample sizes for setting tolerance limits. Ann. Math. Statist., 12, 91-96.

- (1942). Statistical prediction with special reference to the problem of tolerance limits. Ann. Math. Statist., 13, 400-409.

WILKS, S. S. (1948). Order statistics. Bull. Amer. Math. Soc., 54, 6-50.

- (1962). Mathematical Statistics. New York: Wiley.

- (1963). Multivariate statistical outliers. Sankhyā A, 25, 407-426.

YOUNG, D. H. (1967). Recurrence relations between the P.D.F.'s of order statistics of dependent variables, and some applications. Biometrika, 54, 283-292.

- (1970). Consideration of power for some two sample tests with censoring based on a given order statistic. Biometrika, 57, 595-604.

- (1973). Distributions of some censored rank statistics under Lehmann alternatives for the two-sample case. Biometrika, 60, 543-549.

DISCUSSION OF PROFESSOR BARNETT'S PAPER

Professor R. L. PLACKETT (University of Newcastle upon Tyne): The order statistics of a random sample form an integral part of statistical methodology, and have been studied for just as long. Some of the salient historical features are worth noting, in view of the wider interest now taken in past developments. The second supplement to the Théorie Analytique des Probabilités is dated February 1818. Laplace is concerned with the classical problem of estimating $\beta$ from the linear model

$$E(Y_j) = \beta x_j \quad (j = 1, 2, \ldots, n).$$

Suppose that $x_1, x_2, \ldots$ are positive, and that the ratios $y_1/x_1, y_2/x_2, \ldots$ are decreasing. His procedure is to estimate $\beta$ by $y_r/x_r$, where $r$ is such that

$$x_1 + x_2 + \cdots + x_{r-1} < x_r + x_{r+1} + \cdots + x_n \quad \text{and} \quad x_1 + x_2 + \cdots + x_r > x_{r+1} + x_{r+2} + \cdots + x_n.$$

Thus $y_r/x_r$ is the value of $\beta$ which minimizes

$$\sum_{j} |y_j - \beta x_j|.$$

Laplace derives the asymptotic distribution of the estimator $y_r/x_r$, which is a generalized form of the sample median. More details of his work are given by Stigler (1973).
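
Laplace's rule can be read as a weighted median of the ratios $y_j/x_j$ with weights $x_j$. The following is a minimal Python sketch of that reading, assuming every $x_j$ is positive; the names `laplace_slope` and `absolute_deviation` and the four data points are illustrative and do not come from the paper.

```python
def absolute_deviation(beta, x, y):
    """Sum of |y_j - beta * x_j| over the sample."""
    return sum(abs(yj - beta * xj) for xj, yj in zip(x, y))


def laplace_slope(x, y):
    """Return y_r / x_r, with r chosen by Laplace's two inequalities.

    Assumes every x_j > 0. The pairs are first sorted so that the
    ratios y_j / x_j are decreasing, as the text supposes.
    """
    pairs = sorted(zip(x, y), key=lambda p: p[1] / p[0], reverse=True)
    total = sum(xj for xj, _ in pairs)
    cumulative = 0.0
    for xj, yj in pairs:
        cumulative += xj
        # First r for which x_1 + ... + x_r exceeds x_{r+1} + ... + x_n.
        if cumulative > total - cumulative:
            return yj / xj
    raise ValueError("empty sample")


# Hypothetical data: the ratios y_j / x_j are 4, 3, 2, 1.
x = [1, 2, 3, 4]
y = [4, 6, 6, 4]
beta_hat = laplace_slope(x, y)
```

When a partial sum of the $x_j$ equals exactly half the total, the first inequality holds only as an equality and the minimizing value is not unique; the sketch then returns one end of the minimizing interval.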

After 70 years, in which much that is relevant doubtless occurred, we find Francis Galton rhapsodizing in Natural Inheritance (1889) about the Normal Curve, and writing as follows.

"Whenever a large sample of chaotic elements are taken in hand and marshalled in the order of their magnitude, an unsuspected and most beautiful form of regularity proves to have been latent all along."

The first volume of Biometrika contains articles by both Galton and Karl Pearson on the mean value of the difference between successive order statistics. This interest of K. P. must be responsible for the attention given to order statistics at University College London, which has influenced many associated with that institution. For example, the use of range in place of standard deviation, when examining the stability of variation among a large number of small samples, was suggested by Tippett (1925) and "Student" (1927). Tables of the necessary factors for control charts were included in a British standard on industrial standardization and quality control by E. S. Pearson (1935). During the immediate postwar period, simple methods of inference based on order statistics were actively explored, but they are largely obsolete in a computer age. However, the sufficiency of the order statistics provides a basis for permutation tests. These tests will continue to be used in areas such as psychology, where methods based on models with more structure cannot be justified.

The question arises naturally: what generalizations of the concept of order can be made in two or more dimensions? Recent work is well described in Professor Barnett's paper. At the beginning, he gives a quotation from Sir Maurice Kendall to the effect that order properties exist only in one dimension, and at the end he concludes that no reasonable basis exists for a full ordering of multivariate data. I agree. We lose some attractive features of order statistics when the transition is made from one dimension to several, notably uniqueness and simplicity. Sufficiency remains, to support multivariate permutation tests.

The problem of ordering multivariate data can be expressed as follows. Given a sample $x_1, x_2, \ldots, x_n$ we require an arrangement of the form

$$\{x_{(i)}\} \preceq \{x_{(j)}\} \preceq \cdots \preceq \{x_{(k)}\},$$

where $\preceq$ is the symbol: not preferred to, and the subscripts i, j, ..., k range over mutually exclusive and exhaustive subsets of the integers 1, 2, ..., n. Professor Barnett has given four methods of interpreting the symbol $\preceq$, in which he describes the ordering as marginal, reduced, partial or conditional. I believe that there are really only two methods: analytical and geometrical. They are largely, but not altogether, distinct. The first method is to introduce a function f(·) with p arguments, monotonic increasing in any one argument for fixed values of the others. We calculate the n values of f(·) for the sample, and then order these values. This procedure includes both marginal and reduced ordering, between which there is no clearcut distinction. The second method is based on geometrical concepts such as the convex hull, which are not easily or usefully expressed in analytical terms. However, when the regions are prescribed, then this too is an analytical approach and can be expressed in terms of a function f(·).
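
The analytical method can be sketched in a few lines of Python: compute f for each observation, sort, and collect observations with equal values of f into the same subset. The function name `reduced_ordering` and the sample are illustrative assumptions, and indices start at 0 rather than 1.

```python
from itertools import groupby


def reduced_ordering(sample, f):
    """Arrange sample indices into subsets ordered by increasing f.

    Observations with exactly equal values of f fall in the same subset,
    so the subsets are mutually exclusive and exhaustive.
    """
    def score(i):
        return f(sample[i])

    ordered = sorted(range(len(sample)), key=score)
    return [list(group) for _, group in groupby(ordered, key=score)]


# Hypothetical bivariate sample.
sample = [(3, 1), (1, 1), (2, 2), (0, 5)]

by_total = reduced_ordering(sample, sum)                # f = x + y
by_first = reduced_ordering(sample, lambda p: p[0])     # f = x (a marginal ordering)
```

With f equal to a single coordinate the same routine gives a marginal ordering, which is one way of seeing why the two are hard to separate.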

Professor Barnett has given examples where f(·) is a linear combination or a quadratic form, so let us consider other functions which may be relevant. Examiners in schools and universities use the totals of marks from different questions and subjects to determine a provisional ranking of the candidates. However, the final ranking may depend on further considerations. Suppose that two candidates in a particular subject achieve the same total mark, but the marks on individual questions are in the first case widely dispersed, and in the second exactly equal. The first candidate is usually preferred, on the grounds that greater weight is given to completed questions than to a multiplicity of fragments. A higher ranking of the first candidate can be achieved by combining the marks on questions using a concave function, which must also be symmetrical whenever the questions are on the same footing, as is usual. On the other hand, suppose that two candidates achieve the same total mark over all subjects, but the marks for individual subjects are in the first case exactly equal, and in the second widely dispersed. The first candidate is usually preferred on the grounds that a uniform performance on the different subjects is preferable to an uneven one. A higher ranking of the first candidate can be achieved by combining the marks on subjects using a symmetrical convex function. Both possibilities are included in the following proposal. Let $n_1, n_2, \ldots, n_r$ be the r marks to be combined, and N the score on which the ranking is based. Define N as a function of $n_1, n_2, \ldots, n_r$ by

$$N^{\alpha} = \sum_{i=1}^{r} n_i^{\alpha} / r.$$

The function is convex if $0 < \alpha < 1$, and concave if $\alpha > 1$. As $\alpha \to 0$, N tends to the geometric mean of $n_1, n_2, \ldots, n_r$. This proposal may appear radical, but would eliminate many of the present subjective adjustments after taking $\alpha = 1$ for both questions and subjects. The values $\alpha = 2$ for questions and $\alpha = \tfrac{1}{2}$ for subjects are suggested. They correspond respectively to finding the root mean square and the square mean root, and lie well within the scope of a pocket calculator.
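
The effect of the exponent can be checked on two candidates with the same total. This is a small Python sketch; the function name `power_mean` and the marks are hypothetical, and the exponent is assumed to be positive.

```python
def power_mean(marks, alpha):
    """Score N defined by N**alpha = (sum of n_i**alpha) / r, for alpha > 0."""
    r = len(marks)
    return (sum(m ** alpha for m in marks) / r) ** (1.0 / alpha)


dispersed = [10, 0]   # hypothetical marks, total 10
uniform = [5, 5]      # hypothetical marks, total 10

# Exponent 2 (root mean square) ranks the dispersed candidate higher.
rms_dispersed = power_mean(dispersed, 2)
rms_uniform = power_mean(uniform, 2)

# Exponent 1/2 (square mean root) ranks the uniform candidate higher.
smr_dispersed = power_mean(dispersed, 0.5)
smr_uniform = power_mean(uniform, 0.5)
```

With exponent 1 both candidates score 5, which is the tie that the other two exponents break in opposite directions.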

There is much in Professor Barnett's paper to show that the ordering of multivariate data can help in the solution of statistical problems. The study of outliers is particularly important, arising as they do in meteorology and reliability, and the associated topic of robust estimation continues to deserve attention. On the other hand, I believe that coefficients of association should nearly always be avoided, on the grounds that they are seldom meaningful except as parameters in a statistical model. The opposite view is given wide currency at present in Statistical Package for the Social Sciences (Nie et al., 1975). Chapter 16 of this manual is concerned with contingency tables and related measures of association. Nearly all the figures reproduce printouts giving the values of 13 coefficients of association, which include "raw chi square" and functions thereof. This is the only method of analysis which is presented. Three-dimensional tables are treated as a sequence of two-dimensional tables, from which the same quantities are calculated. Consider, for example, the data in Fig. 16.1, reproduced below:

                         Race
Income               White   Non-white
Less than $4,000       396          98
$4,000-7,999           526          70
$8,000-12,499          612          64
$12,500 and over       624          40

What seems to be called for here is a logit regression of the white/non-white ratio on income, possibly transformed, in which the relationship is expressed by a model with two parameters. This would lead to more understanding than any analysis based on coefficients of association.
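
One way such a two-parameter model might be fitted is sketched below in Python: log(white/non-white) = a + b × score, estimated by Newton-Raphson on the binomial likelihood. The equally spaced scores 0, 1, 2, 3 for the four income classes are an assumption (the text leaves the transformation of income open), and `fit_logit` is an illustrative name rather than a routine from any package.

```python
import math

white = [396, 526, 612, 624]
nonwhite = [98, 70, 64, 40]
score = [0.0, 1.0, 2.0, 3.0]   # assumed equally spaced income scores


def fit_logit(successes, failures, x, iterations=25):
    """Maximum likelihood fit of logit(p) = a + b * x by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for s, f, xi in zip(successes, failures, x):
            n = s + f
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            resid = s - n * p             # observed minus fitted count
            w = n * p * (1.0 - p)         # binomial variance
            g0 += resid
            g1 += resid * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 information matrix against the score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


a, b = fit_logit(white, nonwhite, score)
fitted_logits = [a + b * s for s in score]
observed_logits = [math.log(w / nw) for w, nw in zip(white, nonwhite)]
```

Comparing `fitted_logits` with `observed_logits` gives a quick check on whether a straight line in the chosen scores is adequate.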

I would like to conclude by thanking Professor Barnett for guiding us through the complexities of this important and difficult topic. By presenting so many results in such a clear and systematic pattern he has brought, so to speak, order out of chaos. I have much pleasure in proposing the vote of thanks on his paper.

Professor K. V. MARDIA (University of Leeds): I am very pleased to be able to second this vote of thanks. However, Professor Barnett need not speculate as to which referee's report was mine as I am acting as a replacement in performing this pleasant duty.

The paper is to be greatly welcomed for its efforts in the classification and unification of previous work. In fact, this area needed a fresh look. Unlike Professor Plackett, I agree with Professor Barnett's four basic ordering principles. The development of this subject has been somewhat haphazard and these principles give an underlying structure which connects previous studies. I think it would be advantageous to sub-classify the different types of R-ordering. Two types could be "distance-ordering" and "projection-ordering". The former uses any specific measure of distance with the idea of the form of the underlying population while the latter would include ordering the sample using the first principal component (or higher), seriation, etc. Another type could be "polar-ordering".

The paper gives me an opportunity to look at my old work. Looking back, the importance of my 1967 paper is also in giving a pleasing general formula for corr (R1, R2) which gives a simplified expression for the normal case. This expression is also computationally more convenient. Further, the formula simplifies for n = 2 and n = 3 as given in my Ph.D. thesis of 1964 (Rajasthan University) and also obtained independently by Kurtz et al. (1966) as mentioned in Mardia (1967, p. 533). The exact expressions for marginal ranges for any n for the multinormal case are also contained in the thesis. These results are published only in the form of abstracts in the Annals of Mathematical Statistics (1963, pp. 1131, 1627; 1964, p. 461) and therefore it is not surprising that these are not generally known.

In distance-ordering, it is convenient to call $D^2(x, \mu; \Sigma) = (x - \mu)' \Sigma^{-} (x - \mu)$ the "Mahalanobis distance" between $x$ and $\mu$, where each of the three quantities $x$, $\mu$, $\Sigma$ may be constant or stochastic, and $\Sigma^{-}$ is a g-inverse of $\Sigma$ (see Mardia, 1975a). If $D_i^2 = D^2(x_i, \bar{x}; S)$, it can be seen that the "outlier-scatter ratio" $S_i = 1 - (n - 1)^{-1} D_i^2$ and, as Professor Barnett already knows, the graphical procedures suggested by Healy (1968) (and also by Cox, 1968) look basically at these same quantities. Of course, Gnanadesikan, Wilk and Kettenring give graphical methods to assess multinormality but I think specific analytical measures and test procedures were first given by Mardia (1970) which depend on Mahalanobis angles and distances (see, for example, Mardia, 1975b). The correspondence between these tests and graphical methods is similar to that between Wilks' outlier test and graphical procedures of Cox (1968), Healy (1968) and others. There are obvious advantages of distance-ordering once the appropriate distance for the data is known.
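
The relation between the distances and the scatter ratio can be checked numerically. The Python sketch below is restricted to the bivariate case, assumes that S is the sample covariance matrix with divisor n, and takes the scatter ratio to be the determinant of the sum-of-products matrix with one point removed divided by the determinant for the full sample; the function names and the data are illustrative.

```python
def mean2(pts):
    n = len(pts)
    return sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n


def scatter2(pts):
    """Sum-of-products matrix about the mean, returned as (a11, a12, a22)."""
    mx, my = mean2(pts)
    a11 = sum((x - mx) ** 2 for x, _ in pts)
    a12 = sum((x - mx) * (y - my) for x, y in pts)
    a22 = sum((y - my) ** 2 for _, y in pts)
    return a11, a12, a22


def det2(a):
    return a[0] * a[2] - a[1] ** 2


def mahalanobis_sq(pts):
    """D_i^2 for each point, using S = A / n (divisor n is an assumption)."""
    n = len(pts)
    mx, my = mean2(pts)
    a11, a12, a22 = scatter2(pts)
    s11, s12, s22 = a11 / n, a12 / n, a22 / n
    det = s11 * s22 - s12 ** 2
    return [
        (s22 * (x - mx) ** 2 - 2 * s12 * (x - mx) * (y - my) + s11 * (y - my) ** 2) / det
        for x, y in pts
    ]


def scatter_ratios(pts):
    """1 - D_i^2 / (n - 1) for each point."""
    n = len(pts)
    return [1.0 - d2 / (n - 1) for d2 in mahalanobis_sq(pts)]


# Hypothetical sample; distance-ordering sorts the indices by D_i^2.
pts = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (9.0, 9.0)]
d2 = mahalanobis_sq(pts)
distance_order = sorted(range(len(pts)), key=lambda i: d2[i])
ratios = scatter_ratios(pts)
```

The smallest ratio belongs to the point with the largest distance, so ordering by either quantity picks out the same extreme observation.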

For multinomial populations, there has been some controversy on the appropriateness of a distance measure (see Edwards, 1971). I discover that surprisingly this can be resolved through its close connection with the von Mises-Fisher distribution $M_k(\mu, \kappa)$ with mean vector $\mu$ and concentration parameter $\kappa$. Let $n_1, \ldots, n_k$ be distributed as multinomial with parameters $(p_1, \ldots, p_k)$ and $n = n_1 + \cdots + n_k$. ($\sum p_i = 1$.)

(i) It can be shown that as $n \to \infty$, $n^{-1/2}(\sqrt{n_1}, \ldots, \sqrt{n_k}) \sim M_k\{(\sqrt{p_1}, \ldots, \sqrt{p_k}), 4n\}$.

Hence, when n is large, the distance measure for a von Mises population is suitable for a multinomial population.

(ii) If $I_1$ is the information matrix of $\sqrt{p_1}, \ldots, \sqrt{p_k}$ for the multinomial, and $I_2$ is the information matrix of $\mu_1, \ldots, \mu_k$ for $M_k(\mu, \kappa)$ with $\kappa$ fixed, then for $\mu_i = \sqrt{p_i}$

$$\phi_1(n) I_1 = \phi_2(\kappa) I_2,$$

where $\phi_1$ and $\phi_2$ are functions of $n$ and $\kappa$ only. Thus Rao's distance measure is the same for the two populations. Therefore, as in the von Mises-Fisher case, we are led to use the Bhattacharyya distance for the multinomial population which is the angle between $(\sqrt{n_1}, \ldots, \sqrt{n_k})$ and $(\sqrt{p_1}, \ldots, \sqrt{p_k})$.
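
The angle in question is simple to compute. A minimal Python sketch follows, assuming the probabilities sum to one; the function name `bhattacharyya_angle` and the counts are illustrative.

```python
import math


def bhattacharyya_angle(counts, probs):
    """Angle between (sqrt(n_1), ..., sqrt(n_k)) and (sqrt(p_1), ..., sqrt(p_k)).

    The first vector has length sqrt(n) and the second has length 1
    (assuming the p_i sum to one), so the cosine is
    sum(sqrt(n_i * p_i)) / sqrt(n).
    """
    n = sum(counts)
    cosine = sum(math.sqrt(c * p) for c, p in zip(counts, probs)) / math.sqrt(n)
    # Guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))


# Hypothetical counts in exact proportion to the probabilities.
matched = bhattacharyya_angle([25, 25, 50], [0.25, 0.25, 0.5])

# Hypothetical counts concentrated on a cell that has probability zero.
orthogonal = bhattacharyya_angle([10, 0], [0.0, 1.0])
```

The angle is near zero when the observed proportions agree with the probabilities and reaches a right angle when they have no cell in common.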

Professor Barnett indicates how an attractive estimate of $\rho$ in a bivariate normal population can be obtained. In quality control situations, a quick method to assess the extent of dependence is as follows. Let $r_w$ be the observed correlation between ranges and let $\rho_w(n, \rho) = f(|\rho|)$ for the normal case. One estimate of $\rho$ is $|\rho| = f^{-1}(r_w)$ where f is tabulated in Mardia (1967). In fact, using the approximation of Kurtz et al. (1966), it is found that

p2_ [-3n + {9L2 + 24r2(1 -Ln)2}]1/4(l -L),

where $L_n$ is a known constant, and this is much simpler to use.

Although historically unconditional (strict) non-parametric bivariate tests using circular ordering and circular non-parametric tests were developed independently, there is a one-to-one correspondence between tests in the two categories. In this sense, Hodges' test is related to Ajne's test (1968), and Mardia's (1967a) test is related to the uniform scores test. My bivariate test possesses an interesting invariance property, and the asymptotic efficiency of this test compared to $T^2$ amounts to about 80 per cent. However, on a hypersphere, we have again to rely on the four ordering principles. Consequently, as in the multivariate case (p > 3), there is a dearth of small-sample non-parametric tests of practical value on a hypersphere.

Broadly speaking, the types of ordering principles which will be more effective are determined by one's objectives, e.g. R-ordering is predominantly used in outlier problems. Perhaps Professor Barnett could have told us in Section 4 (possibly in tabular form?) which ordering principles in his opinion have been most effective in principal areas of multivariate analysis.

The paper elegantly leads us to the realistic view that fruitful research would emerge from the four ordering principles rather than trying to define higher-dimensional order statistics. It gives me the greatest pleasure to second the vote of thanks.

The vote of thanks was passed by acclamation.

Professor R. M. LOYNES (University of Sheffield): It is a great pleasure to see the Society holding one of its only too rare meetings outside London here in Sheffield today, and an equal pleasure that the speaker, who has given a useful and learned contribution, is also a colleague.

I should like to make just a few comments about various aspects of the paper. In the univariate case I see two reasons for the interest in order statistics: one, the common occurrence of censoring mechanisms in real life (the interest in linear combinations of order statistics as estimators surely depends on the existence of simple censoring processes which allow one to carry out ordering in practice, for example); and the other, the fact that within the model of a simple random sample (or more generally of exchangeable random variables) the order statistic is the minimal sufficient statistic. When we turn to the multivariate situation we find that both approaches run into rather similar difficulties. Of course the minimal sufficient statistic is in a sense easily described (it is the set of sample vectors, their labels no longer having any importance), but there seems no way of providing a simple canonical form for this. The difficulty can be illustrated rather differently: suppose $x_{(i)}$ are univariate order statistics of the sample $x_i$, and f is a monotone increasing function; then $f(x_{(i)})$ are the order statistics of the transformed sample $f(x_i)$. In two dimensions, for simplicity, what would we mean by a monotone transformation? Any natural definition would exclude the possibility of rotating axes, and yet in many situations we would wish to regard the axes as arbitrary. Perhaps we should accept that the axes may not be rotated in such discussions. Some support for such a restriction is added by the observation that although on the line a point (the median) exists with equal numbers of points on either side, in the plane no point exists, in general, such that equal numbers of points lie either side of an arbitrary line through it; the vector of marginal medians possesses this property for lines drawn parallel to the axes, however.
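
Both halves of this illustration can be seen on a toy example. In the Python sketch below the data, the choice of the exponential as the monotone function, and the helper names are all illustrative assumptions; the rotation is a quarter turn so that no trigonometry is needed.

```python
import math

# Univariate case: sorting commutes with a monotone increasing function.
sample = [3.0, 0.5, 7.2, 1.1]
f = math.exp
order_stats_commute = sorted(f(v) for v in sample) == [f(v) for v in sorted(sample)]


def x_ranking(points):
    """Indices of the points ordered by first coordinate (marginal ordering on x)."""
    return sorted(range(len(points)), key=lambda i: points[i][0])


def rotate_quarter_turn(points):
    """Rotate the axes by 90 degrees: (x, y) -> (-y, x)."""
    return [(-y, x) for x, y in points]


# Bivariate case: a rotation of the axes can reverse the marginal ordering.
points = [(1, 0), (2, 3)]
before = x_ranking(points)
after = x_ranking(rotate_quarter_turn(points))
```

Here `before` and `after` list the same two points in opposite orders, which is the sense in which a rotation cannot count as a monotone transformation.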

The idea which appears in several guises, that we should order not every single observation, but rather sets of them (as, for example, in Figs 5 and 6), seems to me an appealing one. But it is rather difficult to see why the particular example of convex hulls should be of general interest: it makes no reference to an origin, and yet if the origin were not inside (or at least not much outside) the innermost polygon one would find interpretation difficult.

Finally, references: those for Section 4.1 are of course not complete. More recent ones are Galambos (1972) and O'Brien (1974).

Dr A. HUITSON (Sheffield Polytechnic): I am interested in practical applications of the techniques which have been outlined in this paper. I was trying to tie up my practical experience with the various techniques and became a little lost when it came to the P-ordering and the C-ordering techniques. Would Professor Barnett care to outline his thoughts on these particular aspects please?

Dr G. M. PADDLE (ICI): I recently had occasion to give a set of lectures to some medical colleagues at work and, at the end, one of them observed that the sophisticated techniques which I had put forward were all very well, but all that he and his colleagues really wanted was a simple technique for looking at a mass of multivariate data to pick out the salient features. I commend this, with tongue in cheek, to Professor Barnett as a target to which we should work in the area he has described today.

I should also like to thank Professor Barnett for pointing out to me in the course of his paper that two or three problems which I felt were fairly easy, but which I was unable to solve, are in fact quite difficult.

Two problems to which I should like to draw attention as being relevant to this area are as follows.

(i) For many medical data, such as observations of a battery of blood tests on various patients, we might well expect to find high positive correlations. The medical interpretation of such data is to use marginal ordering and to pick out the outliers by applying rectangular definitions of abnormality. The fact that statistically, people with $x_i > \bar{x}_i$ but $x_j < \bar{x}_j$ are in some respects more abnormal is apparently of no interest and it is difficult to persuade them to repeat these observations, no matter how improbable they may be. In relation to this, Levin et al. (1973) have done some very interesting work in which they took not single observations but a number of repeat observations on the same people, and calculated a set of abnormal observations after allowing for time of day and personal characteristics. They found, as Professor Barnett suggests in his paper, a completely different set of abnormal observations from the one usually defined as such.

(ii) In the measurement of physical properties of materials, we might fit regression equations to a set of observations, $Y_{ij}$, for a range of materials with constituents $X_{ik}$. We might then want to look at the residuals, $r_{ij}$, to decide which of the samples of material were unrepresentative, and which individual readings were unrepresentative. For instance, if we were testing hardness, tensile strength and elongation, we might decide that all three were inaccurate for one sample, but that only one was inaccurate for another sample. Inaccurate samples could be defined by a distance ordering, weighted sum of squares for instance, but definition of samples with one or two incorrect readings requires a rather sophisticated form of ranking.

I am not suggesting that Professor Barnett has presented solutions to these problems, but he has made me aware that there is a way forward.
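
The contrast in problem (i) between rectangular limits and a distance-based definition can be made concrete. The Python sketch below assumes two standardized measurements with a known correlation of 0.9 and rectangular limits of ±2; both figures, and the two patients, are hypothetical.

```python
RHO = 0.9     # assumed correlation between the two standardized tests
LIMIT = 2.0   # assumed rectangular limit on each test


def inside_rectangle(x, y, limit=LIMIT):
    """True if both measurements lie within the marginal limits."""
    return abs(x) <= limit and abs(y) <= limit


def distance_sq(x, y, rho=RHO):
    """Squared generalized distance from the origin for correlation rho."""
    return (x * x - 2.0 * rho * x * y + y * y) / (1.0 - rho * rho)


# High on one test and low on the other: passes the rectangular check.
discordant = (1.5, -1.5)
# Moderately high on both tests: fails the rectangular check.
concordant = (2.2, 2.2)

discordant_d2 = distance_sq(*discordant)   # about 45
concordant_d2 = distance_sq(*concordant)   # about 5.1
```

The patient who passes the rectangular check is far more extreme in distance terms than the one who fails it.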

Professor T. LEWIS (University of Hull): I should like to make a few rather random and off-the-cuff remarks.

It is all very well for Professor Barnett to quote Kendall and make the remark which Professor Plackett quoted in his conclusions. It is all very well too for Professor Plackett to say that he agrees that one cannot really order multivariate data, but the fact (which was implied by Professor Plackett in his example about the class list for students) is that the topic of Professor Barnett's splendid paper is one of the most important in the human context which one could imagine. This is because the ordering of multivariate points is what is going on all the time: the student class lists, the Cabinet deciding between the various policies in order (we hope that is what is done). In fiction, it is in its purest form in the sort of radio play in which there are, say, 10 people on a sinking boat (a judge, an actress and so on) and they have to go, one by one, for the survival of one final vector value, $x_1$. This does exist in the real-life context, although Professor Barnett avoided using this notation in his paper.

It has also been said by everyone that ordering is perfectly obvious and natural in the univariate case. Professor Barnett says in his paper that univariate ordering is clear and unambiguous, and so it is, but it is not invariant because it depends on the specification of the variable. This is, indeed, a platitude, but the figures are as follows.

If we want to order this univariate sample (and I put it down in order) there is

A < B < C < D.

If we like to use a transform of this variable (0.6, 0.8, 0.4, 0.7) this gives C < A < D < B. If we use 3x, then we multiply all these by three, giving us: 4.8, and we will cross off the four; 5.4, and, if we cross off the integers, we get the following: D < C < B < A.

Of course, we are only interested in this one because there is a reason for preferring, in Professor Plackett's happy world, large values of x to small values of x.

If we turn to the multivariate case, I agree with Professor Plackett in his remarks that we really want to do R-ordering, reducing it to some univariate sample, using some function of the x's. If we take the bivariate case, to make it easy to talk about, which is: h(x, y), now h is not completely arbitrary, which would leave us in a very vague situation, but, as Professor Plackett said, we are really only interested in functions, h, which are monotonic in each of the arguments. It is easy to show then that if the marginal ordering in the x's is the same as the marginal ordering in the y's, that ordering applies also to any h satisfying these conditions.
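
The last claim is easy to try out. The Python sketch below uses a hypothetical sample whose x's and y's are in the same order, and three illustrative choices of h, each strictly increasing in both arguments on positive data; none of these choices comes from the discussion.

```python
import math


def ranking(values):
    """Indices ordered by increasing value."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical sample with identical marginal orderings.
sample = [(0.2, 1.0), (0.5, 1.3), (1.1, 2.0), (3.0, 2.1)]

x_order = ranking([x for x, _ in sample])
y_order = ranking([y for _, y in sample])

candidates = {
    "sum": lambda x, y: x + y,
    "product": lambda x, y: x * y,               # increasing for positive data
    "mixed": lambda x, y: math.exp(x) + y ** 3,
}

rankings = {
    name: ranking([h(x, y) for x, y in sample])
    for name, h in candidates.items()
}
```

Every entry of `rankings` coincides with the common marginal ordering, whatever monotone h is used.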

In that sense, for a sample with the same marginal orderings, it has a unique ordering in a reasonable way. If there is a sample of size 1,000, and the marginal ordering of the x's is not exactly the same as the y's, but there is perhaps only one reversal (the rank correlation is nearly one) it would seem reasonable to think that the ordering was nearly defined. If there is the situation in which x and y are independent, this will not happen very often, so it is not of much help, but if x and y are highly correlated, those samples will often be obtained and the ordering becomes a useful idea. This makes me think that perhaps the ordering is related (or that we should think of it as related) to the underlying distribution. Then I note that in Professor Barnett's discussion of various M-, R-, C- and P-orderings, somewhere at least one of his R-orderings uses the probability density function for the underlying distribution. I should like to ask him, therefore, whether along with his classification there is not tied up another, rather fundamental classification which is the following:

orderings which are based purely on the relative sizes of the numbers which come into the sample, on the one hand, and orderings which bring in the presumed distribution, on the other hand.

Let me conclude by thanking Professor Barnett, adding my thanks to those of the other speakers, for his paper.

Professor G. A. BARNARD (University of Essex): I suppose we should not be surprised at the multiplicity of methods of ordering multivariate data which could be presented in the country which invented snobbery and one-upmanship as well as other forms of class distinction. Since some of the requirements for ordering that have been mentioned this evening have been related to the question of outliers, I thought it would be worth while again drawing attention to Morven Gentleman's proposal for robust estimation in the multivariate case. This is one of rather few methods which do not require trimming or any ordering. It is to choose as the centre of location that point in the space from which the three-halves powers of the distances of the points are summed to a minimum. As far as I know, he has still not yet published details, but it is a method which is quite effective and efficient and does not require the isolation of a few outliers, allowing us therefore perhaps to progress along the road of recognizing that all of us excel in something-if only we can find out what it is.
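
Only the criterion is described here, so the Python sketch below is an assumption about how it might be minimized: it uses iteratively reweighted averaging, whose fixed points satisfy the first-order condition for the sum of three-halves powers of the distances. It is not Gentleman's own algorithm, and the function names and data are illustrative.

```python
import math


def criterion(centre, points):
    """Sum over the sample of (distance from centre) ** 1.5."""
    return sum(math.dist(centre, p) ** 1.5 for p in points)


def three_halves_centre(points, iterations=200, eps=1e-12):
    """Approximate the centre minimizing the three-halves power criterion.

    Starts at the sample mean and repeatedly replaces the centre by a
    weighted mean with weights 1 / sqrt(distance); eps guards against a
    zero distance.
    """
    n = len(points)
    dim = len(points[0])
    centre = [sum(p[k] for p in points) / n for k in range(dim)]
    for _ in range(iterations):
        weights = [1.0 / math.sqrt(max(math.dist(centre, p), eps)) for p in points]
        total = sum(weights)
        centre = [
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ]
    return centre


# Hypothetical data: a unit square, then the same square with one distant point.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
with_outlier = square + [(10.0, 10.0)]

centre_square = three_halves_centre(square)
centre_outlier = three_halves_centre(with_outlier)
mean_outlier = [2.4, 2.4]   # ordinary mean of the five points
```

In the second sample the centre is pulled towards the distant point less strongly than the mean is, without any point being trimmed or singled out.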

Professor A. M. WALKER (University of Sheffield): I wish to ask Professor Barnett if he can make any comment about problems arising from ties in any of the variate values, that is, the occurrence of two or more sample points such that the $x_{ij}$ are not distinct for all j. In the univariate case the presence of ties can, in certain circumstances, be troublesome, particularly when the distribution sampled from is discrete. However, perhaps the decreased probability of ties for multivariate samples, especially if these occur in all the variate values, makes such problems much less important.

Professor F. DOWNTON (University of Birmingham): All of us must be grateful to Professor Barnett for his comprehensive survey of the different approaches which have been adopted in studying the ordering of multivariate data, and I regret not being able to be present to add my voice to the vote of thanks.

I would like to raise a point concerning the purposes of multivariate ordering and what they imply in practice. These purposes include, for example, quick estimating and inference procedures, assessment of outliers and problems involving censoring. Quick procedures for a particular mathematical model can be produced by studying the mathematical properties of data from that model only, but what constitutes an outlier and how data are censored are more likely to come from the nature of the experiment and the resulting data than from their mathematics. It is this which makes statistics such a difficult (but also fascinating) subject, but it does raise a question, which Professor Barnett may treat as rhetorical. How many of the methods discussed in the paper arose from a practical need or have been used in practice? And how many arose from thoughts such as the interesting one he has in the last sentence of Section 4.1? My imagination has failed me in trying to visualize an experiment for which the type of C-ordering described there would be appropriate; can Professor Barnett put me out of my misery and give me an example of what he has in mind?

The following contributions were received in writing, after the meeting.

Mr P. J. GREEN (University of Bath): I have two general comments to make on Professor Barnett's interesting paper.

Firstly, there are many alternative ways of defining "ordering" in one dimension. These may be broadly divided into those ordering the space in which the data lies, and those ordering the data directly. Among the latter we can distinguish the concepts of adjacency, orderliness and canonical representation of the set, rather than the sequence, of data values.

These concepts give rise to the various different properties of order statistics, and it is the fact that they all coincide in one dimension that has led to the apparent unity of the subject. However, when we attempt to extend order to higher dimensions, we find firstly that these concepts do not necessarily coincide, and secondly that the concepts themselves may not be so uniquely defined.

Those methods classified by Professor Barnett as M- and R-ordering are largely concerned with ordering the sample space; but many order statistics seem more naturally defined in terms of a direct ordering of the data, as considered under P-ordering. The idea of adjacency seems most suitable for generalization, but in no unique manner. Possible definitions could be based upon those observations linked in the minimum spanning tree or path of the data, or upon some ranking of the inter-point distances. A definition that seems attractive empirically is to call $x_i$, $x_j$ adjacent if there is some point $x$ within the convex hull of the data for which

$$d(x, x_i) = d(x, x_j) = \min\{d(x, x_k) : k = 1, 2, \ldots, n\}.$$

None of these concepts is likely to yield a very complete notion of order in more than one dimension, but this is perhaps unimportant when, for example, as limited a notion of order as the convex hull of the data is sufficient to discuss range, extremes and outliers.
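One of the adjacency definitions mentioned above, linkage in the minimum spanning tree, is easy to make concrete. The following is a minimal sketch assuming Euclidean distance and Prim's algorithm; the function name and the five sample points are hypothetical.

```python
import math

def mst_edges(points):
    """Edges (i, j), i < j, of a Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    # best[j] = (distance from point j to the current tree, nearest tree vertex)
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        # Attach the point that is currently closest to the tree.
        j = min(best, key=lambda idx: best[idx][0])
        _, i = best.pop(j)
        edges.append((min(i, j), max(i, j)))
        # The newly attached point may now be the nearest tree vertex for others.
        for k in best:
            d = math.dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return sorted(edges)

# Hypothetical sample: four nearby points and one remote point.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.1), (2.0, 1.0), (8.0, 8.0)]
adjacent = mst_edges(pts)
```

Two observations are then called adjacent when the tree joins them directly. In this example the remote point is attached by a single, conspicuously long edge.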

My second comment is one which might be applied to much analysis of multivariate data. Before discussing such procedures it should surely be decided under what transformations of the data are the conclusions to be invariant. In the present context, presumably any useful "order" should be invariant under uniform change of scale, and possibly also under translation. But problems that are invariant under rotations or unequal scale change seem to form two disjoint (and possibly exhaustive) classes.

The suggestions made above refer principally to rotation-invariant data, as indeed do many modern "data-analysis" methods. But for the other type of problem, where the axes may represent incomparable variables, M- and C-ordering principles seem much more attractive. It is interesting to note, however, that the c-order groups of Section 2.3 are invariant under all the transformations mentioned above.

Sir MAURICE KENDALL (International Statistical Institute): I regret that absence abroad prevents me from attending this meeting. I congratulate the author on a most useful summary of a subject in which the literature is confused and scattered. I have one major comment. In one dimension order is invariant under any monotonic transformation of the scale. It seems to me desirable that any methods of extending the idea to more than one dimension, however imperfect, should also be scale invariant. Some of the methods described in the paper, such as the sequential ordering process, are so. Others, such as the use of principal components or distance metrics, are not. I would not discard them on that ground alone, but it has to be recognized that they are not metric-free as one would like to have a technique based on order alone.

Two minor points: (a) The convex-hull method of Section 2.3 is difficult and tedious to apply in more than two dimensions. The determination of the hull requires linear programming and if extra observations are added to the original sample the work has to be done again. (b) There are now programs which will project a multivariate complex, through a visual display unit, on to any assigned plane, not merely the co-ordinate planes, and this offers a relatively unsophisticated way of rejecting outliers, albeit by eye on a somewhat subjective basis.

Mr A. ROBINSON (University of Bath): Professor Barnett has offered us a fourfold classification of ordering methods applicable to multivariate data but he points out that the classes may not be mutually exclusive. Perhaps this difficulty arises because we should more relevantly be attempting to classify the types of problem to which we would like to apply "order-statistics" and ranking methods. The claim of David (1970) that order-statistics deserve to be treated as a unified subject is possibly a little overstated; that quite separate univariate problems find useful accommodation in the house of order-statistics does not imply that they were all born at that same address. One does not necessarily want the full battery of weapons which complement each other in the univariate case; for instance, in problems involving adjacency if we remove a particular point, then we do not necessarily require the "order" of the remaining points to be invariant. I agree that many methods employed in the analysis of multivariate data naturally result in some type of ordering but it is in the nature of the problem that it demands such a solution.

Many univariate methods have an appealing simplicity which I feel cannot be carried over to the multivariate situation if one concentrates on the ordering principle rather than on the particular facet of order induced by the problem.

Dr ALLAN SEHEULT (University of Durham), Mr PETER DIGGLE and Dr DENNIS EVANS (University of Newcastle upon Tyne): We shall confine our attention to R- and P-ordering. We find the notions of probability contours (introduced in Section 2.2 on R-ordering) and of convex hulls (Section 2.3 on P-ordering) particularly interesting, and it is perhaps worth while to reflect, initially, on their possible utility in the univariate case. The figure below gives the pH values for 52 soil samples using J. W. Tukey's "stem-and-leaf" plot; thus, there are two samples with pH value 5.4, three with pH value 5.6, etc. (an explanation of the italic figures is given in due course).

    5 | 4 4
    5 | 6 6 6 7 7 7 7
    5 | 8 8 8 8 8 8 8 8 9
    6 | 0 1 1 1 1
    6 | 2 2 2 2
    6 | 4 4
    6 | 7 7
    6 | 9 9 9
    7 | 0 0 0
    7 | 3
    7 | 4 4
    7 |
    7 | 8 8 9
    8 | 0 0 0 1
    8 | 3
    8 | 4 4 4 5

For these data, we can define 26 P-order groups (a term which we prefer to c-order to avoid confusion with the concept of C-ordering introduced in Section 2.4); group 1 = {5.4, 8.5}, group 2 = {5.4, 8.4} and so on until group 26 = {6.2}, each group containing exactly two points. The above procedure suggests the value of 6.2 as a summary measure of location. This is, of course, the sample median, and we suggest that in the multivariate case the last convex hull should define the median set of the data; the median point may then be defined as some enclosed point such as the centroid. The samples depicted in Figs 5 and 6 of the paper suggest that this is a reasonable measure of location, both median sets being close to, or enclosing, the origin. The important feature here is that for a univariate sample the convex hull emphasizes that there is no essential difference between the values 5.4 and 8.5; they are both extreme: we ignore direction, or equivalently we do not distinguish between monotone increasing and monotone decreasing functions. Note in passing that in the multivariate case the P-ordering obtained using convex hulls remains invariant under linear transformations of the data.
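The peeling procedure behind the P-order groups and the proposed median set may be sketched for bivariate data as follows. This is an illustrative sketch only: the hull routine (Andrew's monotone chain), the treatment of collinear boundary points and the nine sample points are assumptions of the example.

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(points):
    """Vertices of the convex hull of a set of 2-D points (monotone chain).

    Points lying on an edge of the hull, but not at a corner, are not
    returned; in this sketch they fall into a later group.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def peel(points):
    """Split the points into P-order groups: outer hull first, then the next, ..."""
    remaining = list(points)
    groups = []
    while remaining:
        layer = set(hull_vertices(remaining))
        groups.append([p for p in remaining if p in layer])
        remaining = [p for p in remaining if p not in layer]
    return groups

# Hypothetical sample: an outer square, an inner square and a central point.
data = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0),
        (1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0),
        (2.0, 2.0)]
groups = peel(data)
median_set = groups[-1]
median_point = tuple(sum(p[j] for p in median_set) / len(median_set)
                     for j in range(2))
```

Here the last group plays the part of the median set, and its centroid is taken as the median point, as suggested in the text.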

In a similar way, the inter-quartile range as a measure of dispersion may be generalized to the multivariate case as follows: let $C_i$ denote the convex hull of P-order group $i$ and find the integer $k$ such that $C_k$ contains at least 50 per cent of the data points and $C_{k+1}$ at most 50 per cent. The interquartile set, IQS, is then partially determined by the relationship $C_{k+1} \subseteq \mathrm{IQS} \subseteq C_k$ and, as with the median, may be uniquely determined by a suitable interpolation procedure.
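The choice of $k$ depends only on the sizes of the successive P-order groups, since the hull of group $i$ contains every point in groups $i, i+1, \ldots$. A minimal sketch, with a hypothetical function name, assuming the group sizes are already known:

```python
def interquartile_index(group_sizes):
    """Smallest k such that C_k holds at least half of the points
    and C_{k+1} at most half, where C_i contains groups i, i+1, ...
    """
    n = sum(group_sizes)
    inside = n  # number of points inside or on C_1
    for k, size in enumerate(group_sizes, start=1):
        inside_next = inside - size  # number of points inside or on C_{k+1}
        if 2 * inside >= n and 2 * inside_next <= n:
            return k
        inside = inside_next
    raise ValueError("no points supplied")

# 26 groups of two points, as in the univariate pH example.
k_soil = interquartile_index([2] * 26)
# A hypothetical bivariate sample with groups of 4, 4 and 1 points.
k_small = interquartile_index([4, 4, 1])
```

When some hull contains exactly half of the points, more than one $k$ satisfies the condition; the sketch returns the smallest, and the interpolation step mentioned in the text is not attempted.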

One criticism of partial ordering and the resulting summary measures it produces is that the notion of ordering therein is metrical and does not depend directly on the density properties of the data configuration. In our example, the values 5.4 and 8.5 are clearly extremes with respect to both the partial ordering induced by the convex hull approach and the reduced ordering induced by the empirical density of the data. However, the median value 6.2 is also relatively extreme with respect to density. In fact, the data were collected as part of a discriminant analysis and can be divided into two identifiable sub-groups, the second of which is indicated by the italic figures in the stem-and-leaf plot.

The paper by Loftsgaarden and Quesenberry cited by Professor Barnett gives a simple and workable non-parametric estimator for a multivariate density function which can be used to exploit the idea of probability contours even when the underlying distribution is unknown. The suggestion here is therefore that when analysing data, each point may usefully be assigned both its P-order group index and its density, or some estimate thereof. Such information might give useful insight into the configuration of the data and the possible existence of outliers and clusters. It also provides a set of bivariate data for analysis in its own right!
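A nearest-neighbour density estimate of the kind referred to can be sketched as below. The form $(k-1)/(nV)$, with $V$ the volume of the ball reaching the $k$-th nearest sample point, is assumed here; the exact constant should be checked against the cited paper, and the function name and sample points are hypothetical.

```python
import math

def knn_density(x, sample, k):
    """Nearest-neighbour density estimate at x, assumed form (k - 1) / (n * V).

    V is the volume of the ball centred at x whose radius is the distance
    to the k-th nearest sample point. If x is itself a sample point it
    counts as its own nearest neighbour, so k should be at least 2.
    """
    n, dim = len(sample), len(x)
    radius = sorted(math.dist(x, pt) for pt in sample)[k - 1]
    # Volume of the unit ball in `dim` dimensions.
    unit_ball = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)
    return (k - 1) / (n * unit_ball * radius ** dim)

# Hypothetical sample: a tight cluster and one isolated point.
cloud = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (5.0, 5.0)]
dense = knn_density((0.05, 0.05), cloud, k=3)
sparse = knn_density((5.0, 5.0), cloud, k=3)
```

Pairing each point's estimated density with its P-order group index then gives the bivariate summary suggested in the text.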

Dr D. H. YOUNG (Brunel University): In his interesting review of ordering procedures for multivariate data, Professor Barnett mentions the particular form of C-ordering in which the observations $x_1, \ldots, x_n$ are ordered by reference to linear combinations of the ordered component values. A possible use for this type of ordering occurs when the $\{x_i\}$ are independent multinomial observations with common index and interest is centred on ranking the $n$ multinomial distributions on the basis of a linear function of their ordered cell probabilities. For example, one might wish to select the multinomial distribution with the smallest range of cell probabilities or the one with the maximum cell probability. The usual indifference zone or subset selection procedures could be considered and their use would require a study of the distribution properties of

$$\min_i \max_{j,l} (x_{ij} - x_{il})$$

and other similar order statistics.
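As a small illustration of this selection problem, the range of the cell counts (largest ordered component minus smallest) can be computed for each multinomial observation and the observation with the smallest range picked out. The counts below are hypothetical, with common index 20, and the function names are invented for the sketch.

```python
def cell_range(counts):
    """Largest minus smallest cell count: a linear function of the ordered components."""
    return max(counts) - min(counts)

def smallest_range(observations):
    """Index of the observation with the smallest cell range, and all the ranges."""
    ranges = [cell_range(obs) for obs in observations]
    chosen = min(range(len(ranges)), key=ranges.__getitem__)
    return chosen, ranges

# Three hypothetical multinomial observations, each with index 20.
observed = [(7, 6, 7), (12, 5, 3), (10, 8, 2)]
chosen, ranges = smallest_range(observed)
```

The selection rule itself is trivial to compute; the statistical difficulty lies in the distribution of the minimum range, which the sketch does not address.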

The author replied in writing, as follows:

I should like to thank all the contributors to the discussion for their kind remarks, helpful comments and interesting proposals. It is not possible to deal with all the points in detail, but some useful summary comment can be made.

Professor Plackett's observations on the historical development of order statistics are fascinating; it would be nice to hear more about this matter. I do not, however, share his view that inference procedures based on order statistics are rendered "largely obsolete" by the advent of the computer. The acclaimed computational simplicity of order statistics methods has always struck me as a delusion. Linear forms are simple in principle but the determination of the appropriate weighting factors involves tedious calculation often made feasible only by powerful computers. The representational role of order statistics, use of order or rank in the construction of distribution-free statistical methods, probability plotting techniques (particularly in the primary examination of large data sets) all seem to have continuing practical relevance, reflected in an unabating flow of publications.

In more than one dimension we do of course lose "uniqueness and simplicity". My aim was to demonstrate that, in spite of this fundamental obstacle, the order concept is widely and variously represented in multivariate work, extending far beyond multivariate permutation tests! It was not my intention to judge the propriety of such usage, merely to report it. I hoped that the fourfold classification provided a reasonable basis for indicating distinctions of attitude and objective. I feel that Professor Plackett's dichotomy of ordering principle (analytical, geometrical) is tidier and more mathematically objective, but it does not so easily distinguish basic emphases. For example, whilst M- and R-ordering are analytical they tend to distinguish approaches concerned with limited or overall ordering aims, respectively. Indeed, the extended sub-classification of R-ordering suggested by Professor Mardia is appealing as a framework for further distinguishing operational objectives.

Professor Mardia provides some interesting results with a useful set of additional references. The familiar relationship between $S_i$ and $D_i^2$ is important in what it implies about the Wilks multivariate outlier rejection test, at least for an underlying normal distribution. The proposal to test outliers by examining minimum scatter ratios was advanced by Wilks on an entirely intuitive argument. But any test of outliers must involve some concept of order as a basis for a declaration that certain observations are "extreme". In the normal case with a location slippage alternative hypothesis we can set up a maximum likelihood ratio test for outliers which leads to identifying outliers as observations for which the value $D_i^2$ is large; the outlier is adjudged discordant if $D_i^2$ is sufficiently large. Thus the implicit ordering basis is in terms of values of the distance metric $D_i^2$. But the relationship between $D_i^2$ and $S_i$ reveals that the Wilks test is also a maximum likelihood ratio test (for normal data and a location slippage alternative) and is using again $D_i^2$ as an ordering basis.
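The equivalence of the two ordering bases can be checked numerically. The sketch below assumes that the scatter ratio for observation $i$ is $|A_{(i)}|/|A|$, where $A$ is the matrix of sums of squares and products about the mean and $A_{(i)}$ is the same matrix with observation $i$ deleted, and that $D_i^2$ is the Mahalanobis distance from the sample mean using $S = A/(n-1)$. Under those assumptions the ratio equals $1 - nD_i^2/(n-1)^2$, so the smallest ratio and the largest distance pick out the same observation. The data and function names are hypothetical.

```python
def scatter(points):
    """Sums of squares and products about the mean (a11, a12, a22), and the mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a11 = sum((p[0] - mx) ** 2 for p in points)
    a12 = sum((p[0] - mx) * (p[1] - my) for p in points)
    a22 = sum((p[1] - my) ** 2 for p in points)
    return (a11, a12, a22), (mx, my)

def det(a):
    """Determinant of the symmetric 2 x 2 matrix stored as (a11, a12, a22)."""
    return a[0] * a[2] - a[1] ** 2

def mahalanobis_sq(points):
    """D_i^2 = (x_i - mean)' S^{-1} (x_i - mean), with S = A / (n - 1)."""
    n = len(points)
    (a11, a12, a22), (mx, my) = scatter(points)
    d = det((a11, a12, a22))
    out = []
    for x, y in points:
        u, v = x - mx, y - my
        # Quadratic form (x_i - mean)' A^{-1} (x_i - mean) for a 2 x 2 matrix.
        quad = (a22 * u * u - 2 * a12 * u * v + a11 * v * v) / d
        out.append((n - 1) * quad)
    return out

def scatter_ratios(points):
    """|A_(i)| / |A| for each i, recomputing the scatter with point i deleted."""
    full = det(scatter(points)[0])
    return [det(scatter(points[:i] + points[i + 1:])[0]) / full
            for i in range(len(points))]

# Hypothetical bivariate sample; the last point lies away from the rest.
xy = [(1.0, 2.0), (2.0, 1.5), (3.0, 3.5), (4.0, 3.0), (5.0, 5.5), (9.0, 1.0)]
d2 = mahalanobis_sq(xy)
ratios = scatter_ratios(xy)
n = len(xy)
predicted = [1 - n * d / (n - 1) ** 2 for d in d2]
```

The lists `ratios` and `predicted` agree to rounding error, which is the sense in which both tests order the observations on the same basis.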

Professor Mardia, Dr Huitson and others essentially ask for some form of comparative assessment of the different sub-ordering principles. This is not feasible at the moment; individual principles figure in isolation (and often only implicitly) in different aspects of multivariate analysis. Part of the aim in airing this topic was to try to encourage more specific intercomparisons to be made; perhaps we might see some results arise in this area.

Dr Paddle's two practical examples well illustrate types of situation in which appropriate multivariate outlier methods need to be applied. Multivariate outliers often do not show up merely in the margins of the data.

Professor Barnard's description of the robust multivariate location estimator of Morven Gentleman is interesting. We must hope to see more of its credentials. In reply to Professor Walker's enquiry about tied values in discrete data, I can offer no useful information.

Many of the contributors to the discussion comment on the convex hull ordering proposed in Section 2.3. It has the disadvantages of probabilistic and manipulative complexity, as remarked by Sir Maurice Kendall, but it does have an appealing directness in its entirely data-oriented emphasis. The associated notions of median set and interquartile set, proposed by Dr Seheult, Mr Diggle and Dr Evans, are intriguing and surely merit further study. They comment, as does Dr Green, on the rather wide degree of invariance of c-order groups in respect of transformations of the co-ordinate basis.

Most of the discussion centres on the fundamental problems of transferring order concepts from one to higher dimensions. Frequently the question of invariance of ordering arises. Professor Loynes points to many difficulties in extending order-invariance to the multidimensional situation, even at the level of expressing the notion of monotone transformations: a point taken up also by Professor Lewis. Sir Maurice Kendall and Dr Green also comment on the invariance issue, the former urging caution with respect to ordering principles which are not scale invariant, the latter insisting on invariance under uniform change of scale and possibly also under translation. Most agree, however, that rotation-invariance can hardly be realistically demanded, and Professor Lewis points out that we do not experience a very wide degree of invariance even in one-dimensional ordering.

Other fundamental distinctions are drawn. Professor Downton, Dr Green, Professor Lewis and Mr Robinson all suggest that different interests dictate different ordering principles, depending on whether we assume some underlying probability model, wish to order in relation to the sample space or to order the data themselves directly. I entirely agree, and my writing of Section 2 was much influenced by such considerations but perhaps I did not sufficiently stress them. Dr Green's proposal that orderings might be based on an "adjacency" principle has some interest, particularly in relation to outlier identification. Indeed Rohlf (1975) develops a "gap test" for outliers based on the minimum spanning tree.

Professor Downton's request for an example of the type of C-ordering proposed at the end of Section 4.1 has been met by the examples described by Dr Young. Additionally, we have the variety of non-parametric slippage tests. A simple case would involve ordering the values in each of a set of samples, and then ordering the samples in terms of their extreme values.

In the paper, and in the most interesting discussion, much has been said about both general principle, and detail, in multivariate ordering. A great many difficulties exist. But it seems that in the final resort we cannot avoid the point made by Professor Lewis. Whether we like it or not practical problems inevitably involve the ordering of multivariate data, and there remains the need for much more statistical investigation of this topic.

REFERENCES IN THE DISCUSSION

AJNE, B. (1968). A simple test for uniformity of a circular distribution. Biometrika, 55, 343-354.
COX, D. R. (1968). Notes on some aspects of regression analysis. J. R. Statist. Soc. A, 131, 265-279.
DAVID, H. A. (1970). Order Statistics. New York: Wiley.
EDWARDS, A. W. F. (1971). Distances between populations on the basis of gene frequencies. Biometrics, 27, 873-881.
GALAMBOS, J. (1972). On the distribution of the maximum of random variables. Ann. Math. Statist., 43, 516-521.
GALTON, F. (1889). Natural Inheritance. London: Macmillan.
LAPLACE, P. S. (1820). Théorie Analytique des Probabilités, 3rd edn. Paris: Courcier.
LEVIN, G. E., MCPHERSON, C. K., FRASER, P. M. and BARON, D. N. (1973). Long term variation in plasma activities of aspartate transaminase and alkaline phosphatase in health. Clin. Sci., 44, 185-196.
MARDIA, K. V. (1967a). A non-parametric test for the bivariate two-sample location problem. J. R. Statist. Soc. B, 29, 320-342.
MARDIA, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57, 519-530.
MARDIA, K. V. (1975a). Mahalanobis distances and angles. In Proc. 4th Int. Symp. Multivar. Anal. (P. R. Krishnaiah, ed.). Ohio: D. Reidel Publishing Co.
MARDIA, K. V. (1975b). Assessment of multinormality and the robustness of Hotelling's $T^2$ test. Appl. Statist., 24, 163-171.
NIE, N. H., HULL, C. H., JENKINS, J. G., STEINBRENNER, K. and BENT, D. H. (1975). Statistical Package for the Social Sciences, 2nd edn. New York: McGraw-Hill.
O'BRIEN, G. L. (1974). Limit theorems for the maximum term of a stationary process. Ann. Prob., 2, 540-545.
PEARSON, E. S. (1935). The Application of Statistical Methods to Industrial Standardization and Quality Control. London: British Standards Institution.
STIGLER, S. M. (1973). Laplace, Fisher and the discovery of the concept of sufficiency. Biometrika, 60, 439-445.
"STUDENT" (1927). Errors of routine analysis. Biometrika, 19, 151-164.
TIPPETT, L. H. C. (1925). On the extreme individuals and the range of samples taken from a normal population. Biometrika, 17, 364-387.

As a result of the ballot held during the meeting, the following were elected Fellows of the Society.

ABBESS, Christopher R.; ADAMSON, Sir Campbell; ALIS, David Michael B.; BARRATT, Katherine M.; BEAMISH, Neil T.; BECKETT, James, III; CAPELIN, Howard J.; CHANG, Kar Y.; CLARKE, Ralph T.; COKER, Jonah B.; DAYKIN, Christopher D.; DELAHUNTY, Paul J.; DIMOU, Theodore; DUNSTAN, Frank D. J.; DUNN, Douglas M.; FENYO, Andrew J.; FERRIS, David; FISHER, William J.; GARNER, Janet P.; GARNSWORTHY, John; FRAGIADAKI-SALTAVAREA, Hellas-Maria; GIBSON, Robert F.; GIPPS, Peter G.; GOODWIN, Jennifer A. G.; HOUSTON, Alastair G. S.; GREEN, John L.; HIGGINS, Joseph; HUGHES, Philip G.; JONES, David H.; KADANE, Joseph B.; KENT, John T.; KNIGHT, John F.; KOKOLAKIS, George; LAVALLE, Irving H.; LEADBETER, Deana M.; MACFARLANE, Sarah B. J.; McGILL, Peter R.; MASTERS, Anthony W.; MATTHEWS, David E.; MERCHANT, John R.; MOHAMED, Abdalla E.; NARAIN, Hugh H.; NAYLOR, John C.; NEWELL, Robert; OGLE, Ian F.; OKUSANYA, Adedayo O.; PATER, John R.; RIGBY, Michael J.; RILEY, Patrick H.; RUBIN, Donald B.; RUST, John N.; RUSTON, Paul K.; SADAT, Ali N.; SAVAGE, I. R.; SANDILANDS, Douglas W.; SHAW, John E. H.; SKEGG, Joy L.; SPIEGELHALTER, David J.; STEIN, George J.; STEVENSON, Michael R.; STUBBS, Peter A.; SWAITHES, Gillian A.; SZYMANKIEWICZ, Jan Z.; TANG, Victor K. T.; TEICHMAN, Robert; TRIGGS, Christopher M.; UDOFIA, Godwin A.; WEIR, Bruce S.; WHITE, Patricia A.; WINTER, Paul D.; WHYNES, David K.; ZAFAR YAB, Muhammad.