order statistics...4.2 distribution-free bounds for the moments of order statistics and of the range...

30

Upload: others

Post on 21-Feb-2021

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic
Page 2: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

Order StatisticsThird Edition

Page 3: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

WILEY SERIES IN PROBABILITY AND STATISTICS

Established by WALTER A. SHEWHART and SAMUEL S. WILKS

Editors: David J. Balding, Noel A. C. Cressie, Nicholas I. Fisher,Iain M. Johnstone, J. B. Kadane, Louise M. Ryan, David W. Scott,Adrian F. M. Smith, JozefL. TeugelsEditors Emeriti: Vic Barnett, J. Stuart Hunter, David G. Kendall

A complete list of the titles in this series appears at the end of this volume.

Page 4: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

Order StatisticsThird Edition

H. A. DAVIDIowa State UniversityDepartment of StatisticsAmes, I A

H. N. NAGARAJAThe Ohio State UniversityDepartment of StatisticsColumbus, OH

iWILEY-INTERSCIENCE

A JOHN WILEY & SONS, INC., PUBLICATION

Page 5: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

Copyright © 2003 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form orby any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except aspermitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the priorwritten permission of the Publisher, or authorization through payment of the appropriate per-copy fee tothe Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax(978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission shouldbe addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ07030, (201) 748-6011, fax (201) 748-6008, e-mail: [email protected].

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts inpreparing this book, they make no representation or warranties with respect to the accuracy orcompleteness of the contents of this book and specifically disclaim any implied warranties ofmerchantability or fitness for a particular purpose. No warranty may be created or extended by salesrepresentatives or written sales materials. The advice and strategies contained herein may not besuitable for your situation. You should consult with a professional where appropriate. Neither thepublisher nor author shall be liable for any loss of profit or any other commercial damages, includingbut not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer CareDepartment within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print,however, may not be available in electronic format.

Library of Congress Cataloging-in-Publication Data:

David, H. A. (Herbert Aron), 1925-Order statistics — 3rd ed. / H.A. David, H.N. Nagaraja.

p. cm.Includes bibliographical references and index.ISBN 0-471-38926-9 (cloth)1. Order statistics. I. Nagaraja, H. N. (Haikady Navada), 1954-. II. Title.

QA278.7.D38 2003519.5—dc21 2003050174

Printed in the United States of America.

1 0 9 8 7 6 5 4 3 2 1

Page 6: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

To Ruth—HAD

To my mother, Susheela—HNN

Page 7: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

This page intentionally left blank

Page 8: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

Contents

PREFACE xi

1 INTRODUCTION 11.1 The subject of order statistics 11.2 The scope and limits of this book 31.3 Notation 41.4 Exercises 7

2 BASIC DISTRIBUTION THEORY 92.1 Distribution of a single order statistic 92.2 Joint distribution of two or more order statistics 112.3 Distribution of the range and of other systematic

statistics 132.4 Order statistics for a discrete parent 162.5 Conditional distributions, order statistics as a Markov

chain, and independence results 172.6 Related statistics 202.7 Exercises 22

3 EXPECTED VALUES AND MOMENTS 333.1 Basic formulae 333.2 Special continuous distributions 40

vii

Page 9: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

viii CONTENTS

3.3 The discrete case 423.4 Recurrence relations 443.5 Exercises 49

4 BOUNDS AND APPROXIMATIONS FOR MOMENTSOF ORDER STATISTICS 594.1 Introduction 594.2 Distribution-free bounds for the moments of order

statistics and of the range 604.3 Bounds and approximations by orthogonal inverse

expansion 704.4 Stochastic orderings 744.5 Bounds for the expected values of order statistics in

terms of quantiles of the parent distribution 804.6 Approximations to moments in terms of the quantile

function and its derivatives 834.7 Exercises 86

5 THE NON-IID CASE 955.7 Introduction 955.2 Order statistics for independent nonidentically

distributed variates 965.3 Order statistics for dependent variates 995.4 Inequalities and recurrence relations—non-IID cases 1025.5 Bounds for linear functions of order statistics and for

their expected values 1065.6 Exercises 113

6 FURTHER DISTRIBUTION THEORY 1216.1 Introduction 1216.2 Studentization 1226.3 Statistics expressible as maxima 1246.4 Random division of an interval 1336.5 Linear functions of order statistics 1376.6 Moving order statistics 140

Page 10: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

CONTENTS iX

6.7 Characterizations 1426.8 Concomitants of order statistics 1446.9 Exercises 148

7 ORDER STATISTICS IN NONPARAMETRICINFERENCE 1597.1 Distribution-free confidence intervals for quantiles 1597.2 Distribution-free tolerance intervals 1647.3 Distribution-free prediction intervals 1677.4 Exercises 169

8 ORDER STATISTICS IN PARAMETRIC INFERENCE 1718.1 Introduction and basic results 1718.2 Information in order statistics 1808.3 Bootstrap estimation of quantiles and of moments of

order statistics 1838.4 Least-squares estimation of location and scale

parameters by order statistics 1858.5 Estimation of location and scale parameters for

censored data 1918.6 Life testing, with special emphasis on the exponential

distribution 2048.7 Prediction of order statistics 2088.8 Robust estimation 2118.9 Exercises 223

9 SHORT-CUT PROCEDURES 2399.1 Introduction 2399.2 Quick measures of location 2419.3 Range and mean range as measures of dispersion 2439.4 Other quick measures of dispersion 2489.5 Quick estimates in bivariate samples 2509.6 The studentized range 2539.7 Quick tests 2579.8 Ranked-set sampling 262

Page 11: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

X CONTENTS

9.9 O-statistics and L-moments in data summarization 2689.10 Probability plotting and tests of goodness of fit 2 709.11 Statistical quality control 2749.12 Exercises 277

10 ASYMPTOTIC THEORY 28310.1 Introduction 28310.2 Representations for the central sample quantiles 28510.3 Asymptotic joint distribution of central quantiles 28810.4 Optimal choice of order statistics in large samples 29010.5 The asymptotic distribution of the extreme 29610.6 The asymptotic joint distribution of extremes 30610.7 Extreme-value theory for dependent sequences 30910.8 Asymptotic properties of intermediate order statistics 31110.9 Asymptotic results for multivariate samples 31310.10 Exercises 315

11 ASYMPTOTIC RESULTS FOR FUNCTIONSOF ORDER STATISTICS 32311.1 Introduction 32311.2 Asymptotic distribution of the range, midrange, and

spacings 32411.3 Limit distribution of the trimmed mean 32911.4 Asymptotic normality of linear functions of order

statistics 33111.5 Optimal asymptotic estimation by order statistics 33511.6 Estimators of tail index and extreme quantiles 34111.7 Asymptotic theory of concomitants of order statistics 34511.8 Exercises 350

APPENDIX: GUIDE TO TABLES AND ALGORITHMS 355

REFERENCES 367

INDEX 451

Page 12: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

Preface to Third Edition

Since the publication in 1981 of the second edition of this book both theory andapplications of order statistics have greatly expanded. In this edition Chapters 2-9deal with finite-sample theory, with division into distribution theory (Chapters 2-6)and statistical inference (Chapters 7-9). Asymptotic theory is treated in Chapters 10and 11, representing a doubling in coverage.

In the spirit of previous editions we present in detail an up-to-date account of whatwe regard as the basic results of the subject of order statistics. Many special topicsare also taken up, but for these we may merely provide an introduction if other moreextensive accounts exist. The number of references has increased from 1000 in thesecond edition to around 1500, and this in spite of the elimination of a good manyreferences cited earlier. Even so, we had to omit a larger proportion of relevantreferences than before, giving some preference to papers not previously mentionedin review articles.

In addition to an increased emphasis on asymptotic theory and on order statistics inother than random samples (Chapter 5), the following sections are entirely or largelynew: 2.6. Related statistics; 4.4. Stochastic orderings; 6.6. Moving order statistics;6.7. Characterizations; 7.3. Distribution-free prediction intervals; 8.2. Informationin order statistics; 8.3. Bootstrap estimation; 9.6. Studentizedrange;9.8. Ranked-setsampling; and 9.9. O-statistics and L-moments in data summarization. Section 6.6includes a major application to median and order-statistic filters and Section 9.6 tobioequivalence testing.

XI

Page 13: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

Xii PREFACE

Order Statistics continues to be both textbook and guide to the research literature.The reader interested in a particular section is urged at least to skim the exercisesfor that section and where relevant to look at the corresponding appendix section forrelated statistical tables and algorithms.

We are grateful to Stephen Quigley, Editor, Wiley Series in Probability and Statis-tics, for encouraging us to prepare this edition. Encouragement was also providedby N. Balakrishnan, to whom we owe a special debt for his careful reading of mostof the book. This has resulted in many corrections and clarifications as well as sev-eral suggestions. Chapter 10 benefited from a careful perusal by Barry Arnold. D.Dharmappa in Bangalore prepared a preliminary version of the manuscript in BTgXwith speed and accuracy. The typing of references and index was ably done byJeanette LaGrange of the Iowa State Statistics Department. We also acknowledgewith appreciation the general support of our respective departments.

H. A. DAVIDH. N. NAGARAJA

Ames, IowaColumbus, OhioJanuary 2003

Page 14: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

PREFACE XiH

Preface to Second Edition

In the ten years since the first edition of this book there has been much activityrelevant to the study of order statistics. This is reflected by an appreciable increasein the size of this volume. Nevertheless it has been possible to retain the outlook andthe essential structure of the earlier account. The principal changes are as follows.

Sections have been added on order statistics for independent nonidentically dis-tributed variates, on linear functions of order statistics (in finite samples), on concomi-tants of order statistics, and on testing for outliers from a regression model. In viewof major developments the section on robust estimation has been greatly expanded.Important progress in the asymptotic theory has resulted in the complete rewriting,with the help of Malay Ghosh, of the sections on the asymptotic joint distribution ofquantiles and on the asymptotic distribution of linear functions of order statistics.

Many other changes and additions have also been made. Thus the number ofreferences has risen from 700 to 1000, in spite of some deletions of entries in the firstedition. Many possible references were deemed either insufficiently central to ourpresentation or adequately covered in other books. By way of comparison it may benoted that the first (and so far only) published volume of Harter's (1978b) annotatedbibliography on order statistics contains 937 entries covering the work prior to 1950.

I am indebted to P. G. Hall, P. C. Joshi, Gordon Simons, and especially RichardSavage for pointing out errors in the first edition. The present treatment of asymptotictheory has benefitted from contributions by Ishay Weissman as well as Malay Ghosh.All the new material in this book has been read critically and constructively by H. N.Nagaraja. It is a pleasure to thank also Janice Peters for cheerfully given extensivesecretarial help. In addition, I am grateful to the U.S. Army Research Office forlongstanding support.

H. A. DAVID

Ames, IowaJuly 1980

Page 15: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

Xiv PREFACE

Preface

Order statistics make their appearance in many areas of statistical theory and prac-tice. Recent years have seen a particularly rapid growth, as attested by the referencesat the end of this book. There is a growing recognition that the large body of the-ory, techniques, and applications involving order statistics deserves study on its own,rather than as a mere appendage to other fields, such as nonparametric methods. Somemay decry this increased specialization, and indeed it is entirely appropriate that themost basic aspects of the subject be incorporated in general textbooks and courses,both theoretical and applied. On the other hand, there has been a clear trend in manyuniversities toward the establishment of courses of lectures dealing more extensivelywith order statistics. I first gave a short course in 1955 at the University of Melbourneand have since then periodically offered longer courses at the Virginia PolytechnicInstitute and especially at the University of North Carolina, where much of the presentmaterial has been tried out.

In this book an attempt is made to present the subject of order statistics in a mannercombining features of a textbook and of a guide through the research literature. Thewriting is at an intermediate level, presupposing on the reader's part the usual basicbackground in statistical theory and applications. Some portions of the book, are,however, quite elementary, whereas others, particularly in Chapters 4 and 9, are rathermore advanced. Exercises supplement the text and, in the manner of M. G. Kendall'sbooks, usually lead the reader to the original sources.

A special word is needed to explain the relation of this book to the only otherexisting general account, also prepared in the Department of Biostatistics, Universityof North Carolina, namely, the multiauthored Contributions to Order Statistics, editedby A. E. Sarhan and B. G. Greenberg, which appeared in this Wiley series in 1962.The present monograph is not meant to replace that earlier one, which is almost twiceas long. In particular, the extensive set of tables in Contributions will long retain theirusefulness. The present work contains only a few tables needed to clarify the text butprovides, as an appendix, an annotated guide to the massive output of tables scatteredover numerous journals and books; such tables are essential for the ready use of manyof the methods described. Contributions was not designed as a textbook and is, ofcourse, no longer quite up to date. However, on a number of topics well developedby 1962 more extensive coverage will be found there than here. Duplication of allbut the most fundamental material has been kept to a minimum.

In other respects also the size of this book has been kept down by deferring whereverfeasible to available specialized monographs. Thus plans for the treatment of the roleof order statistics in simultaneous inference have largely been abandoned in view ofR. G. Miller's very readable account in 1966.

Page 16: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

PREFACE XV

The large number of references may strike some readers as too much of a goodthing. Nevertheless the list is far from complete and is confined to direct, if oftenbrief, citations. For articles dealing with such central topics as distribution theory andestimation I have aimed at reasonable completeness, after elimination of supersededwork. Elsewhere the coverage is less comprehensive, especially where referenceto more specialized bibliographies is possible. In adopting this procedure I havebeen aided by knowledge of H. L. Barter's plans for the publication of an extensiveannotated bibliography of articles on order statistics.

It is a pleasure to acknowledge my long-standing indebtedness to H. O. Hartley, whofirst introduced me to the subject of order statistics with his characteristic enthusiasmand insight. I am also grateful to E. S. Pearson for his encouragement over the years.In writing this book I have had the warm support of B. G. Greenberg. My specialthanks go to P. C. Joshi, who carefully read the entire manuscript and made manysuggestions. Belpful comments were also provided by R. A. Bradley, J. L. Gastwirth,and P. K. Sen. Expert typing assistance and secretarial help were rendered by Mrs.Delores Gold and Mrs. Jean Scovil. The writing was supported throughout by theArmy Research Office, Durham, North Carolina.

H. A. DAVID

Chapel Hill, North CarolinaDecember 1969

Page 17: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

This page intentionally left blank

Page 18: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

1Introduction

1 .1 THE SUBJECT OF ORDER STATISTICS

If the random variables X\ , . . . , Xn are arranged in order of magnitude and then writtenas

we call X^ the z'th order statistic (i = 1, ..., n). In much of this book the (unordered)Xi are assumed to be statistically independent and identically distributed. Even thenthe X(^ are necessarily dependent because of the inequality relations among them.At times we shall relax the assumptions and consider nonidentically distributed Xias well as various patterns of dependence.

The subject of order statistics deals with the properties and applications of theseordered random variables and of functions involving them. Examples are the extremesX{n} and ̂ T(i), the range W = X^ — -X"(i), the extreme deviate (from the samplemean) A"(n) — X, and, for a random sample from a normal N(p,, a"2} distribution, thestudentized range W/S,,, where 5,, is a root-mean-square estimator of a based on i/degrees of freedom. All these statistics have important applications. The extremesarise, for example, in the statistical study of floods and droughts, in problems ofbreaking strength and fatigue failure, and in auction theory (Krishna, 2002). Therange is well known to provide a quick estimator of a and has found particularlywide acceptance in the field of quality control. The extreme deviate is a basic toolin the detection of outliers, large values of (X(n) — X)/o- indicating the presenceof an excessively large observation. The studentized range is of key importance in

Page 19: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

2 INTRODUCTION

the ranking of "treatment" means in analysis of variance situations. More recentlythis statistic has been found useful in bioequivalance testing and for related tests ofinterval hypotheses (Section 9.6).

Another major example is the median M = X/i^+y ^ or |(X(in) 4- X^in+^)according as n is odd or even. Long known to be a robust estimator of location, themedian for n odd has come to be widely used as a smoother in a time series X\, X?,...,that is, one plots or records the moving median M^ = med (Xj,Xj+i..., Xj+n-i),j = 1,2, ...Under the term median filter this statistic has been much used and studiedin signal and image processing.

With the help of the Gauss-Markov theorem of least squares it is possible to uselinear functions of order statistics quite systematically for the estimation of parametersof location and/or scale. This application is particularly useful when some of theobservations in the sample have been "censored," since in that case standard methodsof estimation tend to become laborious or otherwise unsatisfactory. Life tests providean ideal illustration of the advantages of order statistics in censored data. Since suchexperiments may take a long time to complete, it is often desirable to stop after failureof the first r out of n (similar) items under test. The observations are the r times tofailure, which here, unlike in most situations, arrive already ordered for us by themethod of experimentation; from them we can estimate the necessary parameters,such as the true mean life.

Other occurrences arise in the study of the reliability of systems. A system of ncomponents is called a k-out-of-n-system if it functions if and only if (iff) at leastA; components function. For components with independent lifetime distributionsF\,..., Fn the time to failure of the system is seen to be the (n — k + l)th orderstatistic from the set of underlying heterogeneous distributions FI ,..., Fn. The specialcases k = n and k = 1 correspond respectively to series and parallel systems.

Computers have provided a major impetus for the study of order statistics. Onereason is that they have made it feasible to look at the same data in many differentways, thus calling for a body of versatile, often rather informal techniques commonlyreferred to as data analysis (cf. Tukey, 1962; Mosteller and Tukey, 1977). Are thedata really in accord with (a) the assumed distribution and (6) the assumed model?Clues to (a) may be obtained from a plot of the ordered observations against somesimple function of their ranks, preferably on probability paper appropriate for thedistribution assumed. A straight-line fit in such a probability plot indicates that allis more or less well, whereas serious departures from a straight line may reveal thepresence of outliers or other failures in the distributional assumptions. Similarly, inanswer to (6), one can in simple cases usefully plot the ordered residuals from thefitted model. Somewhat in the same spirit is the search for statistics and tests that,although not optimal under ideal (say normal-theory) conditions, perform well undera variety of circumstances likely to occur in practice. An elementary example of theserobust methods is the use, in samples from symmetrical populations, of the trimmedmean, which is the average of the observations remaining after the most extreme k(k/n < |) at each end have been removed. Loss of efficiency in the normal case

Page 20: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

THE SCOPE AND LIMITS OF THIS BOOK 3

may, for suitable choice of k, be compensated by lack of sensitivity to outliers or toother departures from an assumed distribution.

Finally, we may point to a rather narrower but truly space-age application. Inlarge samples (e.g., of particle counts taken on a spacecraft) there are interestingpossibilities for data compression (Eisenberger and Posner, 1965), since the samplemay be replaced (on the spacecraft's computer) by enough order statistics to allow(on the ground) both satisfactory estimation of parameters and a test of the assumedunderlying distributional form. The availability of such large data sets, a commonfeature, for example, in environmental and financial studies, where central or extremeorder statistics are of main interest, necessitates the development of asymptotic theoryfor order statistics and related functions.

1.2 THE SCOPE AND LIMITS OF THIS BOOK

Although we will be concerned with all of the topics sketched in the preceding section,and with many others as well, the field of order statistics impinges on so many differentareas of statistics that some limitations in coverage have to be imposed. To startwith, unlike Wilks (1948), we use "order statistics" in the narrower sense now widelyaccepted: we will nor deal with rank-order statistics, as exemplified by the Wilcoxontwo-sample statistic, although these also require an ordering of the observations.The point of difference is that rank-order statistics involve the ranks of the orderedobservations only, not their actual values, and consequently lead to nonparametric ordistribution-free methods—at any rate for continuous random variables. On the otherhand, the great majority of procedures based on order statistics depend on the form ofthe underlying population. The theory of order statistics is, however, useful in manynonparametric problems and also in an assessment of the nonnull properties of ranktests, for example, by the power function.

Other restrictions in this book have more of an ad hoc character. Order statisticsplay an important supporting role in multiple comparisons and multiple decisionprocedures such as the ranking of treatment means. In view of the useful books byGupta and Panchapakesan (1979), Hsu (1996), and especially Hochberg and Tamhane(1987), there seems little point in developing here the inference aspects of the subject,although the needed order-statistics theory is either given explicitly or obtainable byonly minor extensions.

In the same spirit we have eliminated the chapter in previous editions on testsfor outliers, in view of the excellent extensive account by Barnett and Lewis (1994).However, we have expanded a section on robust estimation in which robustness againstoutliers is a major issue. A vast literature has developed on the analysis of data subjectto various kinds of censoring. We treat this in some detail, but confine ourselveslargely to normal and exponential data. These two cases are the most important andalso bring out the statistical issues involved.

Page 21: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

4 INTRODUCTION

Much more could be said about asymptotic methods than we do in Chapters 10and 11. In fact, the asymptotic theory of extremes and of related statistics has beendeveloped at length in a book by Galambos (1978, 1987), and a detailed compila-tion of theoretical results emphasizing financial applications is given by Embrechts,Kliippelburg, and Mikosch (1997); on the more applied side Gumbel's (1958) ac-count continues to be valuable. The asymptotic theory of central order statistics andof linear functions of order statistics has also been a very active research area in morerecent years and is well covered in Shorack and Wellner (1986). We have thought itbest to confine ourselves to a detailed treatment of some of the most important resultsand to a summary of other developments.

The effective application of order-statistics techniques requires a great many tables.Inclusion even of only the most useful would greatly increase the bulk of this book.We therefore limit ourselves to a few tables needed for illustration; for the rest, werefer the reader to general books of tables, such as the two volumes by Pearson andHartley (1970, 1972) and the collection of tables in Sarhan and Greenberg (1962).Extensive tables of many functions of interest have been prepared by Harter (1970a,b) in two large volumes devoted entirely to order statistics. Harter and Balakrishnan(1996, 1997) have extended these. Many references to tables in original papers aregiven throughout our text, and a guided commentary to available tables and algorithmsis provided in the Appendix.

Related Books

Many books on various aspects of order statistics have been written in the last20 years and are cited in the text as needed. Those of a general nature includethe somewhat more elementary and shorter account by Arnold, Balakrishnan, andNagaraja (1992) and two large multi-authored volumes on theory and applications,edited by Balakrishnan and Rao (1998a, b). Emphasizing asymptotic theory are booksby Leadbetter, Lindgren, and Rootzen (1983), Galambos (1987), Resnick (1987),Reiss (1989), and Embrechts et al. (1997).

1.3 NOTATION

Although this section may serve for reference, the reader is urged to look through itbefore proceeding further.

As far as is feasible, random variables, or variates for short, will be designated byuppercase letters, and their realizations (observations) by the corresponding lower-case letters. By order statistics we will mean either ordered variates or orderedobservations. Thus:

i,..., Xn unordered variatesi,..., xn unordered observations

Page 22: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

NOTATION

< < x(n)

X\-.n < • • • <

ordered variates 1ordered observations J

ordered variates-

order statistics

-extensive form

When the sample size n needs to be emphasized, we use the extensive form of notation,switching rather freely from the extensive to the brief form.

cumulative distribution function(cdf) of X

empirical distribution functionprobability density function (pdf)

for a continuous variateprobability function (pf)

for a discrete variateprobability integral transformationinverse cdf or quantile function,

F(x] = Pr{X < x}

Fn(x]

/(*)

u = F(x]

= inf{x : F(x) > u}= Q(u] (sometimes x(u})

F(r}(x},Fr:n(x]

= Pr{X(r)<x,X(s} <

F~l (0) = lower bound of Fcdf of X(r},Xr-n r = l,...,npdf or pf of X(r) , Xr:n

joint cdf of JC(r) and X(s)I <r < s <n

joint pdf or pf of X(r) and X(s)population quantile of order p, given

by F(£p) = p or equivalently by

But the sample median is

population mediansample quantile of order p, where[np] denotes the largest integer < npsample quantile of order pi ,

0 < P! < • • • < pk < 1

n oddn even(sample) rangeith quasi-range = W]

w

P = 0X1

mean of k ranges of nW for jfth sampleconcomitant of X(r) , Xr:n

mean, variance of Xmeans of X, Y (bivariate case)variance of X, Ycovariance of X, Ycorrelation coefficient

Page 23: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

INTRODUCTION

p,r:n = E(Xr-.n) mean of Xr-n

HXb = E(X*.n) kth raw moment of Xr:n

Pr,s:n ~ E(Xr:nXs:n)

<2(w) = F~l(u) inverse cdf, quantile functionpr = r/(n + l),qr = l-pr

Qr = Q(pr), /r = /(Qr)

Qr = dQ(pr)/dpr = l/fr

Sv estimator of cr based on v DF; for a

N(n,(r2) distribution vS^a1 = xlQn^v = Wn/Sv studentized range (Wn, Sv independent)

5 = [£(Xi - X)*/(n - 1)] * (internal) estimator of aS<p) = {[(n - l)^2 + vSl}/ pooled estimator of a

(n- l + i/)}*jS S for j th sampleF € 2?(G) F is in the domain of maximal attraction

of the extreme-value cdf G — (10.5.2) holdsB(a, 6) = Jo ta~l (I - t)b~ldt beta function

a > 0, b > 0

7p(a, 6) = /Qp ta~l(l - t}b~ldtl incomplete beta function (1.3.1)

B(a,6)/?(a, 6) beta variate X having cdf

Pr{X<x} = /x(o,6) (1.3.2)b(p, n) binomial variate X with success

probability p and trials nxf, chi-squared variate with v DF

1 1 2<£(x) = (27r)~2e~2x standard normal pdf

— 00 < X < OO

$(x) = J^ <{>(t}dt standard normal cdfN((j,, a } normal variate, mean fj,, variance cr2

AT(^t, S) multinomial variate, mean vector/Lt, covariance matrix E

nW =n(n-l) . . . (n-fc + l)k = 1, ...,n

[x] integral part of x, but n^ = E(X^ )and Y[r] = concomitant of X(r)

rv random variablepdf probability density functioncdf cumulative distribution functioniid independent, identically distributedinid independent, nonidentically distributed

Page 24: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

EXERCISES

c.f. characteristic functiona.s. almost surelyId limit distributionDF degrees of freedomML maximum likelihoodLS least squaresBLUE best linear unbiased estimatorUMVU uniformly minimum variance unbiasedUMP uniformly most powerfulH1, H2 Harter (1970a, b), Range, Order Statistics...1,2HB1, HB2 Harter and Balakrishnan (1997,1996),

Range, Order Statistics...!, 2BR1, BR2 Balakrishnan and Rao (1998 a, b) Order Statistics: Theory,

Applications... 1, 2PHI, PH2 Pearson and Hartley (1970, 1972) Biometrika Tables 1, 2SG Sarhan and Greenberg (1962) Contributions to Order StatisticsEx. Exercise ("example" is written in full)D decimal (e.g., to 3D = to 3 decimal places)S significant (e.g., to 4S = to 4 significant figures)A3.2 appendix listing of tables relating to Section 3.2

1.4 EXERCISES

1.1. For real numbers x i , . . . , xn determine all c values for which ^"=1 \Xi — c\ is thesmallest.

1.2. Let x r :n(xi,...., xn) denote the rth largest among the real numbers xi,..., xn.

(a) Show that

Xl-.2(xi,X2) = min(xi,X2) = f (Xi + X2) - |JX1 - X2 | ,

£2 :2(xi,x2) = max(xi,X2) = |(xi +x2) + ||xi -x2|.

(6) Show also that

Xl-.n(Xl, . . . ,X n ) = Xl :2(xi :n_l (xi, . . . , Xn-l ), £l:n-l (#2, . . . ,X n ) ) ,

Xn:n(xi, . . . ,X n ) = X2:2(xn-l:n-l (x\ , . . . , Xn_i ), Xn-i :n_l (x2, . . . ,X n ) ) ,

and that for r = 2,3,..., n - 1

Xr-.n(xi, ...,Xn) = Xl : 2(Xr:n-l^l, X2 , . . . , Xn-l ), • • - , Xr:n-l (xn, X\ , . . . , X n _ 2 ) ) .

(Meyer, 1969)

Page 25: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

8 INTRODUCTION

1.3. For the real numbers xi,...., xn let max^^xi,..., xn) denote the i/th largest (v =l,...,n). Also define

xn,v = max(l/) xi ,xi -f x2, . . . , x

and

Xm,v = max

Show that

(a) xn,v = xi + max(l/)(0,x2)...,£"=2au),

(&) xn,i/ = xi +max ( l / )(0,Xn-i,i,--.,x„-!,„)

(c) xn,^ = majc2'(xi,xi + Xn-i,v-i,x* + xJi_

i/ = l,...,n-l,

\ ^ — . 2 yj l - n > 3

(Pollaczek, 1975)

Page 26: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

2Basic Distribution Theory

2.1 DISTRIBUTION OF A SINGLE ORDER STATISTIC

We suppose that X\ , . . . , Xn are n independent variates, each with cumulative distri-bution function (cdf)F(x). Let F(r) (x)(r = 1, . . . , n) denote the cdf of the rth orderstatistic A"(r) . Then the cdf of the largest order statistic X(n) is given by

F(n)(x) = Pr{X(n}<x}

= Pr{all Xi<x} = Fn(x). (2.1.1)

Likewise we have

< x} = 1 - Pr{X(1} > x}

= 1 - Prfall Xi > x} = 1 - [1 - F(x)]n. (2.1.2)

These are important special cases of the general result for F(r) (x):

F(r)(x) = PT{X(r)<x}

= Pr{at least r of the Xi are less than or equal to x}

since the term in the summand is the binomial probability that exactly iofXi,...,Xn

are less than or equal to x.

Page 27: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

10 BASIC DISTRIBUTION THEORY

An alternative form of (2.1.3) is

F(r)(x) = F'(x) _ l - F(x)P,j=o ^ '

where the RHS is the sum of the probabilities that exactly rofXi,..., Xr+j , includ-ing Xr+j, are less than or equal to x. This negative binomial version of (2.1.3) mayalso be obtained by repeated application of (6) in Ex. 2.1.6. We write (2.1.3) as

F(r)(x)=£F(x)(n,r) (2.1.4)

and note that the E function has been tabled extensively (e.g., Harvard ComputationLaboratory, 1955, where the notation F/(n, r, F(x)) is used). Alternatively, from thewell-known relation between binomial sums and the incomplete beta function wehave

F(r)(x) = IF(x)(r,n- r + 1), (2.1.5)

where Ip(a, b) is defined by (1.3.1). Thus F(r) (x) can also be evaluated from tablesof 7p(a, 6) (K. Pearson, 1934). Percentage points of X(r) may be obtained by inverseinterpolation in the above tables or more directly from Table 16 of Biometrika Tables(Pearson and Hartley, 1970), which gives percentage points of the incomplete betafunction.

Example 2.1. Find the upper 5% point of J£(4) in samples of 5 from a standardnormal parent.

We require x such thatJF(X)(4,2)=0.95

orW(.) (2, 4) =0.05.

This gives 1 - F(x) = 0.07644 and hence x = 1.429.

It should be noted that results (2.1.1)-(2.1.5) hold equally for continuous anddiscrete variates. We shall now assume that Xi is continuous with probability densityfunction (pdf) /(x) = F'(x), but will return to the discrete case in Section 2.4. If/(r)(x) denotes the pdf of _X"(r) we have from (2.1.5)

F(x)

B(r,n-r + l)

A f x

T- / tr~l(l - t)n~rdtdx./0 v '

1 F'-1(x)[l-F(x)]"-'7(x). (2.1.6)B(r,n-r + 1)

In view of the importance of this result we will also derive it otherwise. The eventx < J£(r) < x -I- 5x may be realized as follows:

Page 28: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

JOINT DISTRIBUTION OF TWO OR MORE ORDER STATISTICS 11

r — 1 n — r

x 1 1 x + Sx

Xi < x for r - 1 of the Xit x < Xi < x + Sx for one Xi, and Xi > x + 6x for theremaining n — r of the Xi. The number of ways in which the n observations can beso divided into three parcels is

n! 1-r)! B(r,n-r + l)'

and each such way has probability

Fr~1(x)[F(x + Sx) - F(x)][l - F(x + Sx)]n~r-

Regarding Sx as small, we have therefore

1Pr{x < X(r) < x + Sx} =

B(r,n-r + 1)

•Ff-1(x)f(x)Sx[l - F(x + Sx)]n~r + O(Sx)2,

where O(Sx)2 means terms of order (Sx)2 and includes the probability of realizationsof x < X(r) < x + Sx in which more than one Xi is in (x, x + Sx]. Dividing bothsides by Sx and letting Sx —> 0, we again obtain (2.1.6).

The distribution of X^ when the sample size is itself a random variable, say N, iseasily obtained by conditioning on N = n. For example, specific results when N hasa generalized negative binomial, a generalized Poisson, or a generalized logarithmicseries distribution are given by Gupta and Gupta (1984). See also Exs. 2.1.8 and2.1.9.

2.2 JOINT DISTRIBUTION OF TWO OR MORE ORDER STATISTICS

The joint density function of X^ and X^(l < r < s < n) is denoted by/(r)(s)(a;, y). An expression corresponding to (2.1.6) may be derived by noting thatthe compound event x < X(rj < x + Sx, y < X^ <y + Syis realized (apart fromterms having a lower order of probability) by the configuration

r — 1

x

s — r — 1 M n — s

x + Sx y\\y + Sy

Page 29: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

12 BASIC DISTRIBUTION THEORY

meaning that r — 1 of the observations are less than x, one is in (x, x + 6x], etc. Itfollows that for x < y

n\ --I, —r—1

. (2.2.1)

Generalizations are now clear. Thus the joint pdf of -X"(ni) , . . . , -X"(nfc) (1 < n\ <• • • < nfc < n; 1 < k < n) is for x\ < • • • < xk ,

If we define XQ = — oo,Xk+iwritten as

n!

f(x2}---[l-F(xk)]n-n»f(xk). (2.2.2)

+00, no = 0,nfc+i = n + 1, the RHS may be

(2.2.3)

In particular, the joint pdf of all n order statistics becomes simply

This result is indeed directly obvious since there are n! equally likely orderings ofthe Xi, and may be used as the starting point for the derivation of the joint distributionof k order statistics (k < n) in the continuous case.

The joint cdf F(r)(s)(x,y) of X(r) and X^ may be obtained by integration of(2.2.1) as well as by a direct argument valid also in the discrete case. We have forx < y

F(r)(s) (x, y) = Pr{at least r Xi < x, at least s Xi < y}n j

exactly i Xi < x, exactly j Xi <y}

n\

(2.2.4)

Also for x > y the inequality X<a\ < y implies X<r\ < x, so that

(2.2.5)

Page 30: Order Statistics...4.2 Distribution-free bounds for the moments of order statistics and of the range 60 4.3 Bounds and approximations by orthogonal inverse expansion 70 4.4 Stochastic

DISTRIBUTION OF THE RANGE 13

We may remark here that a similar argument leads to the joint cdf of X^ and Y(s) whenthese order statistics stem from n independent observations on the couple (X, Y)—see Ex. 2.2.5. (Note that we no longer require r < s.) A rather different approach isgiven in Galambos( 1975). The correlation of X(n) andF(n) is studied for the bivariatenormal distribution by Bofinger and Bofinger (1965) and for several other distributionsby Bofinger (1970). For a general discussion of the ordering of multivariate data seeBarnett (1976b).

2.3 DISTRIBUTION OF THE RANGE AND OF OTHER SYSTEMATICSTATISTICS

From the joint pdf of k order statistics we can by standard transformation methodsderive the pdf of any well-behaved function of the order statistics. For example, tofind the pdf of Wrs = X(s) — X^ we put wrs = y — x in (2.2.1) and note that thetransformation from x, y to x, wrs has Jacobian unity in modulus. Thus, writing Crs

for the constant in (2.2.1), we have on integrating out over x

/

ooFr-l(x}f(x}[F(x + wra) - F(x}]s-r-1f(x + wrs)

-00•[1 - F(x + wra)}

n-sdx. (2.3.1)

Of special interest is the case r = 1, s = n, when Wrs becomes the range W and(2.3.1) reduces to

/

oof(x)[F(x + w)- F(x)]n~2f(x + w)dx. (2.3.2)

-00

The cdf of W is somewhat simpler. On interchanging the order of integration wehave

/

OO rW

f(x] I (n- l ) f ( x + w')[F(x + w1) - F(x)]n-2dw'dx.-oo JO

/

oof ( x ) [ F ( x + w')-F(x)]n-l\™',=%dx.

-oo/oo f ( x ) [ F ( x + w)- F(x)]n-ldx. (2.3.3)

-oo

This important result may also be obtained by noting that

nf(x)dx[F(x + tw) - F(x)]n-1

is the probability given x that one of the Xi falls into the interval (x, x + dx] and allof the n — 1 remaining Xi fall into (x, x + w).