
    A Review and Analysis of the Mahalanobis–Taguchi System

    William H.  Woodall  and Rachelle  Koudelik 

    Department of Statistics

    Virginia Polytechnic Institute

    and State University

    Blacksburg, VA 24061

    [email protected]; [email protected] )

    Kwok-Leung Tsui and Seoung Bum Kim

    School of Industrial and Systems Engineering

    Georgia Institute of Technology

     Atlanta, GA 30332

    [email protected] ch.edu; [email protected])

    Zachary G.  Stoumbos

    Department of Management Science

    and Information Systems and Rutgers Center

    for Operations Research (RUTCOR)

    Rutgers, The State University of New Jersey

    Piscataway, NJ 08854

    [email protected] s.edu)

    Christos P.  Carvounis, MD

    State University of New York at Stony Brook

    Nassau University Medical Center

    East Meadow, NY 11554

    ( [email protected] )

    The Mahalanobis–Taguchi system (MTS) is a relatively new collection of methods proposed for diagnosis and forecasting using multivariate data. The primary proponent of the MTS is Genichi Taguchi, who is very well known for his controversial ideas and methods for using designed experiments. The MTS results in a Mahalanobis distance scale used to measure the level of abnormality of “abnormal” items compared to a group of “normal” items. First, it must be demonstrated that a Mahalanobis distance measure based on all available variables on the items is able to separate the abnormal items from the normal items. If this is the case, then orthogonal arrays and signal-to-noise ratios are used to select an “optimal” combination of variables for calculating the Mahalanobis distances. Optimality is defined in terms of the ability of the Mahalanobis distance scale to match a prespecified or estimated scale that measures the severity of the abnormalities. In this expository article, we review the methods of the MTS and use a case study based on medical data to illustrate them. We identify some conceptual, operational, and technical issues with the MTS that lead us to advise against its use.

    KEY WORDS: Classification analysis; Discriminant analysis; Medical diagnosis; Multivariate analysis; Pattern recognition; Signal-to-noise ratio; Taguchi methods.

    1. INTRODUCTION

    Genichi Taguchi is most well known for his work on the design of experiments. His ideas have generated a considerable amount of discussion and controversy and his methods are widely used (see, e.g., Taguchi and Wu 1980; Box 1996; Montgomery 1992; Nair 1992; Tsui 1996; Wu and Hamada 2000; Taguchi, Chowdhury, and Taguchi 2000). The general consensus, among statisticians at least, seems to be that although many of Taguchi’s overall ideas on experimental design are very important and influential, the techniques that he proposed should be replaced with simpler, more effective statistical methods.

    It is not as well known that Taguchi also proposed on-line

    quality control methods (Taguchi 1981; Taguchi, Elsayed, and

    Hsiang 1989). Adams and Woodall (1989) and Nayebpour

    and Woodall (1993), among others, have studied these on-line

    methods. Taguchi’s off-line ideas have had a much greater

    impact than his ideas on on-line quality control.

    We study a new set of methods proposed by Taguchi, Chowdhury, and Wu (2001) and Taguchi and Rajesh (2000) collectively referred to as the Mahalanobis–Taguchi system (MTS). The MTS is proposed as a diagnosis and forecasting method using multivariate data. In this approach, this multivariate data must be available on a “healthy” or “normal” group of items and a number of “abnormal” items that may sometimes be classified into groups based on the severity levels of the abnormalities. In the MTS, it must first be confirmed that the relative sizes of the Mahalanobis distances (MDs) based on the standardized variables of the healthy group can discriminate between normal and abnormal items. Once this fact is established, the number of variables used is reduced, if possible, using orthogonal arrays (OAs) and signal-to-noise (S/N) ratios to evaluate the contribution of each variable. Each row of the OA determines a subset of the original variables. The recommended S/N ratio measures the ability of the MDs, corresponding to the abnormal items and calculated using this subset of variables, to reflect a prespecified or estimated measure of the severity of the abnormalities. Only those variables with effects that show an increase in the average S/N ratio are retained. The MD scale using these variables has a number of stated purposes, including diagnosis and forecasting.

    © 2003 American Statistical Association and

    the American Society for Quality

    TECHNOMETRICS, FEBRUARY 2003, VOL. 45, NO. 1

    DOI 10.1198/004017002188618626


    Taguchi et al. (2001) listed a number of areas of application for the MTS, including inspection and sensor systems in manufacturing, patient monitoring, fire detection, earthquake forecasting, weather forecasting, credit scoring, and voice recognition. They also described case studies involving engineering applications of the MTS in many large companies, including Nissan Motor, Mitsubishi Space Software, Xerox, Delphi Automotive Systems, ITT Industries, Ford Motor, Fuji Photo Film, and others.

    We review the MTS by explaining the approach and calculations in Section 2. In Section 3 we discuss the MTS and identify some conceptual, operational, and technical issues associated with the methods. We present a detailed case study in Section 4. We discuss other aspects of the MTS in Section 5, and present concluding remarks in Section 6. A primary conclusion is that the methods of the MTS are, in some respects, not well defined conceptually or operationally.

    2. DESCRIPTION OF THE MAHALANOBIS–TAGUCHI SYSTEM

    In this section we provide a detailed explanation of the MTS and the required computations, as presented by Taguchi and Rajesh (2000). These authors break the MTS into four stages.

    In stage 1, the variables that define the “healthiness” of an item are identified. Data are collected on the healthy or normal group. As described later, the variables are standardized and the MDs calculated for the normal items. These values define the “Mahalanobis space” used as a frame of reference for the MTS measurement scale.

    We refer to the variables collected on each item to determine its “healthiness” as $V_i$, $i = 1, 2, \ldots, p$. We denote by $V_{ij}$ the observation of the $i$th variable on the $j$th item, $i = 1, 2, \ldots, p$, $j = 1, 2, \ldots, m$. Thus the $p \times 1$ data vectors for the normal group are denoted by $\mathbf{v}_j$, $j = 1, 2, \ldots, m$.

    Each individual variable in each data vector is standardized by subtracting the mean of the variable and dividing by its standard deviation, with both statistics calculated using data on the variable in the normal group. Thus we have the standardized values

    $$Z_{ij} = \frac{V_{ij} - \bar{V}_i}{S_i}, \quad i = 1, 2, \ldots, p, \quad j = 1, 2, \ldots, m, \tag{1}$$

    where

    $$\bar{V}_i = \frac{1}{m} \sum_{j=1}^{m} V_{ij}$$

    and

    $$S_i = \sqrt{\frac{\sum_{j=1}^{m} (V_{ij} - \bar{V}_i)^2}{m - 1}}.$$

    Next, the values of the MDs, $MD_j$, $j = 1, 2, \ldots, m$, are calculated for the normal items using

    $$MD_j = \frac{1}{p}\, \mathbf{z}_j^{T} \mathbf{S}^{-1} \mathbf{z}_j, \tag{2}$$

    where $\mathbf{z}_j^{T} = [Z_{1j}, Z_{2j}, \ldots, Z_{pj}]$ and $\mathbf{S}$ is the sample correlation matrix calculated as

    $$\mathbf{S} = \frac{1}{m - 1} \sum_{j=1}^{m} \mathbf{z}_j \mathbf{z}_j^{T}.$$

    Taguchi and Rajesh (2000) stated that the $MD_j$ values in (2) have an average value of unity. For this reason, they also refer to the Mahalanobis space as the unit space.
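    The stage 1 calculations in (1) and (2) can be sketched in a few lines of NumPy. This is a minimal illustration, not code from Taguchi and Rajesh (2000): the function name `mahalanobis_space`, the row-per-item layout, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def mahalanobis_space(normal):
    """Return (mean, sd, corr_inv, md) for an m x p array of normal items."""
    m, p = normal.shape
    mean = normal.mean(axis=0)
    sd = normal.std(axis=0, ddof=1)           # divisor m - 1, as in S_i
    z = (normal - mean) / sd                  # standardized values, Eq. (1)
    corr = z.T @ z / (m - 1)                  # sample correlation matrix S
    corr_inv = np.linalg.inv(corr)
    # MD_j = (1/p) z_j^T S^{-1} z_j for every normal item, Eq. (2)
    md = np.einsum("ij,jk,ik->i", z, corr_inv, z) / p
    return mean, sd, corr_inv, md

# Hypothetical data: 200 simulated "normal" items measured on 5 variables.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 5))
mean, sd, corr_inv, md = mahalanobis_space(normal)
print(md.mean())  # close to 1; Section 3.3 discusses the exact value
```

    The returned mean, standard deviation, and inverse correlation matrix are what later stages would reuse, since abnormal items are scaled with normal-group statistics.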

    In stage 2, abnormal items must be selected. There is no uncertainty incorporated into the MTS regarding the status of each item used for determining the MTS measurement scale. As in discriminant analysis, it is assumed that each item is known to be either normal or abnormal.

    The MDs of the abnormals with data vectors denoted by $\mathbf{v}_j$, $j = m + 1, m + 2, \ldots, m + t$, are calculated after the variables are standardized using the normal-group means and standard deviations. Thus we have $MD_j$, $j = m + 1, m + 2, \ldots, m + t$, with $MD_j$ defined in (2), where the $i$th element of $\mathbf{z}_j$ in (2), $Z_{ij}$, is calculated using (1), for $i = 1, 2, \ldots, p$ and $j = m + 1, m + 2, \ldots, m + t$.

    According to the MTS, the resulting MD scale is good if the $MD_j$ values for the abnormal items are higher than those for the normal items.
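    Stage 2 can be sketched the same way. The example below is an illustration under stated assumptions: the helper `md_scale` and the simulated groups are hypothetical, and "higher than" is read here as complete separation (smallest abnormal MD above largest normal MD), which is only one possible interpretation.

```python
import numpy as np

def md_scale(normal, items):
    """MDs of `items`, standardized with the normal-group mean, sd, and S."""
    m, p = normal.shape
    mean = normal.mean(axis=0)
    sd = normal.std(axis=0, ddof=1)
    z_normal = (normal - mean) / sd
    corr_inv = np.linalg.inv(z_normal.T @ z_normal / (m - 1))
    z = (items - mean) / sd                   # Eq. (1) with normal-group statistics
    return np.einsum("ij,jk,ik->i", z, corr_inv, z) / p   # Eq. (2)

# Hypothetical data: 200 normal items and 17 abnormal items shifted on every variable.
rng = np.random.default_rng(1)
normal = rng.normal(size=(200, 4))
abnormal = rng.normal(loc=6.0, size=(17, 4))

md_normal = md_scale(normal, normal)
md_abnormal = md_scale(normal, abnormal)

# One possible reading of "higher than": complete separation of the two groups.
separated = md_abnormal.min() > md_normal.max()
print(separated)
```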

    In stage 3, OAs and S/N ratios are used to identify the most useful set of variables. An OA is a design matrix that contains the levels of various factors in the runs of an experiment to investigate the effects of the variables on a response of interest. Each factor of the experiment is assigned to a column of the OA, and the rows of the matrix correspond to the experimental runs. The MTS has $p$ factors in the experiment, each with two levels. The level of a factor signifies the inclusion or exclusion of a variable in the MTS analysis. The $p$ factors are assigned to the first $p$ columns of the OA, with the other columns ignored. Thus the OA selected must initially have at least $p$ columns. Each row of the OA determines which variables are included in any given experimental run. For each of these runs, the MD values are calculated for the abnormals as in stage 2, but using only the indicated variables. These MD values are then used to calculate the value of a S/N ratio, which becomes the response for the run.
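    To make the role of the OA concrete, the sketch below builds a two-level array with $2^k$ runs and $2^k - 1$ columns and maps each run to a subset of variables. It rests on assumptions that are not in the text: the parity construction and its column order need not match Taguchi's published $L_8$ or $L_{32}$ tables, the coding "level 1 = include, level 2 = exclude" is chosen for illustration, and all function names are hypothetical.

```python
def two_level_oa(k):
    """Return a 2**k-run array with 2**k - 1 two-level columns (levels 1 and 2)."""
    runs = 2 ** k
    array = []
    for run in range(runs):
        # Column c holds the parity of the bits shared by the run index and c.
        row = [bin(run & col).count("1") % 2 + 1 for col in range(1, runs)]
        array.append(row)
    return array

def is_orthogonal(array):
    """Check that every pair of columns shows each level pair equally often."""
    n_cols = len(array[0])
    target = len(array) // 4
    level_pairs = [(1, 1), (1, 2), (2, 1), (2, 2)]
    for a in range(n_cols):
        for b in range(a + 1, n_cols):
            pairs = [(row[a], row[b]) for row in array]
            if any(pairs.count(q) != target for q in level_pairs):
                return False
    return True

def subsets_from_oa(array, p):
    """Map each run to the variables at level 1 in the first p columns."""
    return [[i + 1 for i in range(p) if row[i] == 1] for row in array]

oa = two_level_oa(3)              # 8 runs, 7 columns
subsets = subsets_from_oa(oa, 5)  # hypothetical case with p = 5 variables
for run, variables in enumerate(subsets, start=1):
    print(run, variables)
```

    Each printed subset is the set of variables for which the MDs of the abnormal items would be recomputed in that run.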

    Many different S/N ratios are used in Taguchi’s analysis of designed experiments. These are defined in such a way that larger S/N ratio values are preferred. One option mentioned in the MTS is to use Taguchi’s larger-is-better S/N ratio, defined as

    $$-10 \log\left[\frac{1}{t} \sum_{j=m+1}^{m+t} \left(\frac{1}{MD_j}\right)^{2}\right],$$

    because larger MD values further separate the abnormals from the normal group. Taguchi and Rajesh (2000) recommended using the dynamic type S/N ratio instead. For the dynamic S/N ratio to be calculated, the severity value of each abnormal item must be established. These severity levels are denoted by $M_j$, $j = m + 1, m + 2, \ldots, m + t$. Larger values of $M_j$ indicate a greater degree of abnormality. The goal of this stage is to select a subset of the original variables such that the resulting $MD_j$ values of the abnormals most appropriately reflect the levels of severity $M_j$. If the values of $M_j$ are unknown, Taguchi and Rajesh (2000) recommended grouping the abnormal items into classes based on a general level of severity, perhaps obtained subjectively. The value of $M_j$ used for each member of a class is the average value of the square roots of the MDs for the members in the class. These MDs are

    [...]

    not understood in the context of a meaningful sampling (and

    conceptual) framework.”

    In addition, in our view, the use of the MTS measurement scale has never been clearly explained. Taguchi and Rajesh (2000), for example, stated that the problem of the MTS is not one of classification of a future observation into one of two populations corresponding to normal and abnormal. Taguchi et al. (2001, p. 7) stated that the MD values should be used “in continuous mode rather than discrete mode.” Nevertheless, a university admission process is given as an application of the MTS that would seem to require classification. Also, the use of a threshold for MD in the MTS seems to imply classification. It is clear, however, that the MTS results in an MD measurement scale that should measure the degree of abnormality of the items. Use of the MD scale is similar to that of a discriminant function in discriminant analysis. This similarity is discussed further in the case study in Section 4. Another statistical option would be to use standard model-fitting methods, such as ordinal logistic regression, with the level of severity as the dependent variable and the variables $V_i$, $i = 1, 2, 3, \ldots, p$, as the explanatory variables.

    3.2 Operational Issues

    In stage 2, it must be shown that the MD values of the abnormal items are higher than those for the normal items. No operational definition is given, however, for “higher than.” If the criterion means that the smallest MD value for the abnormal items must be higher than the largest value for the normal items, as in the case study in Section 4, then this would appear to limit the usefulness of the approach. If normal and abnormal items are not clearly distinguishable, then it seems that misclassification probabilities must be considered, something not possible under the MTS framework that eschews the use of probability.

    A designed fractional factorial experiment is used as a search algorithm for optimization in the MTS. The run for which all factors are at their low levels is not a valid run, however, because at least one variable must be used in the analysis. Thus an OA containing this run could not be used.

    The OA and the experimental design methods are used as an optimization technique to find the combination of variables that maximizes the S/N ratio. As illustrated in the case study in Section 4, this optimal combination is not always obtained. Fractional factorial designs are used in industry to reduce the number of runs, because each run is often expensive. This goal seems much less important in an optimization application involving only computations. Of course, the MTS approach could be modified to include a better search algorithm for the optimal combination of variables or another S/N ratio, e.g., one based on a rank correlation coefficient that would lead to an MTS scale that would match, to the greatest extent possible, the order of the given severity levels of the abnormal items.
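    Because each run involves only computation, one simple alternative to the OA is to evaluate every non-empty subset of variables; for $p = 17$ that is $2^{17} - 1 = 131{,}071$ subsets. The sketch below illustrates the idea with the larger-is-better S/N ratio of Section 2 (using the squared reciprocal as printed there). It is not part of the MTS: the function names are hypothetical, the standardized values are made up, and the variables are assumed uncorrelated so that each MD reduces to the mean of the squared standardized values in the subset.

```python
import math
from itertools import combinations

def sn_larger_is_better(md_values):
    """Larger-is-better S/N ratio: -10 log10[(1/t) * sum((1/MD_j)**2)]."""
    t = len(md_values)
    return -10.0 * math.log10(sum((1.0 / md) ** 2 for md in md_values) / t)

def exhaustive_search(p, score):
    """Evaluate `score` on every non-empty subset of variables 1..p."""
    best_subset, best_value = None, -math.inf
    for size in range(1, p + 1):
        for subset in combinations(range(1, p + 1), size):
            value = score(subset)
            if value > best_value:
                best_subset, best_value = subset, value
    return best_subset, best_value

# Hypothetical standardized values for 3 abnormal items on p = 4 variables.
# Assumption: uncorrelated variables, so MD_j is the mean of the squared z's.
z_abnormal = [
    [4.0, 0.5, 3.0, 0.2],
    [5.0, 1.0, 2.5, 0.1],
    [6.0, 0.3, 4.0, 0.4],
]

def score(subset):
    mds = [sum(z[i - 1] ** 2 for i in subset) / len(subset) for z in z_abnormal]
    return sn_larger_is_better(mds)

best_subset, best_value = exhaustive_search(4, score)
print(best_subset, round(best_value, 2))
```

    Enumerating only non-empty subsets also sidesteps the invalid run in which no variable is used. In a real analysis, `score` would recompute the abnormal MDs from the normal-group correlation matrix restricted to the subset, and could be swapped for a rank-correlation criterion against the severity levels.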

    3.3 Technical Issues

    Taguchi and Rajesh (2000) stated that the expected value of $MD_j$ in (2) for the normal items is unity. This is an approximation, however, evidently based on a chi-squared distribution with $p$ degrees of freedom. This is the probability distribution of $p\,MD_j$, provided that sampling is from a multivariate normal distribution and the mean vector and variance-covariance matrix are assumed to be known and used in the calculations instead of the estimates. Under the assumption of multivariate normality and estimation of the mean vector and variance-covariance matrix, Tracy, Young, and Mason (1992) reported that the marginal distribution of $MD_j$ is related to a beta distribution and has a mean of $(m - 1)/m$, not unity. The mean of $MD_j$ is also $(m - 1)/m$ if the $m$ observations in the normal group represent the entire population of normal items. Finally, it can be shown using matrix algebra that the average MD value for the $m$ items in the normal group is always exactly $(m - 1)/m$.

    Moreover, Taguchi and Rajesh (2000) stated that $\hat{\beta}$ from (4) is 1 when working averages are used to fit the regression line through the origin. This is true, however, only if the working averages are calculated using the variables included in the particular run being considered. It is not reasonable to use just the variables in each run to calculate the working averages, because this would cause the measure of the degree of severity of abnormal items, and their relative rankings, to vary from run to run. Although descriptions of the MTS do not specify explicitly the variables used to obtain the working averages, all of the variables are used to obtain the working averages in the medical data case study of Taguchi and Rajesh (2000).
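    The claim that the average MD of the normal group is always exactly $(m - 1)/m$ can be written out briefly. The steps below use only the definitions of $\mathbf{z}_j$, $\mathbf{S}$, and $MD_j$ from Section 2, the identity $\mathbf{a}^{T}\mathbf{B}\mathbf{a} = \operatorname{tr}(\mathbf{B}\mathbf{a}\mathbf{a}^{T})$, and the assumption that $\mathbf{S}$ is nonsingular:

```latex
\begin{align*}
\frac{1}{m}\sum_{j=1}^{m} MD_j
  &= \frac{1}{mp}\sum_{j=1}^{m} \mathbf{z}_j^{T}\mathbf{S}^{-1}\mathbf{z}_j
   = \frac{1}{mp}\,\operatorname{tr}\!\Bigl(\mathbf{S}^{-1}\sum_{j=1}^{m}\mathbf{z}_j\mathbf{z}_j^{T}\Bigr) \\
  &= \frac{1}{mp}\,\operatorname{tr}\!\bigl(\mathbf{S}^{-1}(m-1)\,\mathbf{S}\bigr)
   = \frac{(m-1)\,p}{mp}
   = \frac{m-1}{m}.
\end{align*}
```

    The result does not depend on any distributional assumption; it is an algebraic consequence of standardizing with the same group used to compute $\mathbf{S}$.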

    4. A MEDICAL CASE STUDY 

    Taguchi and Rajesh (2000) and Taguchi et al. (2001) justified their MTS approach solely through the use of case studies. In this section we consider a medical diagnosis case study of Taguchi and Rajesh (2000) involving liver disease. The study group comprised a healthy group of 200 people and an unhealthy group of 17 people. This healthy group was also used in a case study presented by Taguchi et al. (2001, chap. 3).

    The data variables consist of age ($V_1$), gender ($V_2$), and the 15 blood test measurements listed in Table 1. The data are available in EXCEL format from the first author.

    4.1 Results of the MTS

    As described by Taguchi and Rajesh (2000), the MD values were calculated in stage 1 for the healthy group, forming the Mahalanobis space. The reported MD values ranged from .3784 to 2.3581. The average MD value is given as .9951, which is, apart from rounding error, equal to $(m - 1)/m = 199/200 = .995$, as expected. In stage 2, the MD values calculated using the observations from the unhealthy group were higher, ranging from 7.7274 to 135.6978, so the measurement scale was said to be good. We note that such a wide, clear separation between the groups of interest is often not possible in many applications of traditional statistical methods.

    Because there were 17 variables, Taguchi and Rajesh (2000) selected an $L_{32}(2^{31})$ OA in stage 3. This fractional factorial design can accommodate up to 31 factors with 32 runs. Taguchi and Rajesh assigned the 17 variables to the first 17 columns of the array. The remaining columns are ignored. The MD values were calculated for all 17 unhealthy patients,


    Table 1. The Case Study Blood Test Variables With Normal Ranges

    Variables                         Symbol  Acronym  Normal ranges                     Taguchi et al. (2001) normal ranges
    Total protein in blood            V3      TP       6.0–8.3 g/dL                      6.5–7.5 g/dL
    Albumin in blood                  V4      Alb      3.4–5.4 g/dL                      3.5–4.5 g/dL
    Cholinesterase                    V5      ChE      Depends on technique;             .60–1.00 dpH
      (pseudocholinesterase)                           8–18 U/mL
    Glutamate O transaminase          V6      GOT      10–34 IU/L                        2–25 U
      (aspartate aminotransferase)
    Glutamate P transaminase          V7      GPT      6–59 U/L                          0–22 U
      (alanine transaminase)
    Lactic dehydrogenase              V8      LDH      105–333 IU/L                      130–250 U
    Alkaline phosphatase              V9      Alp      0–250 U/L, normal;                2.0–10.0 U
                                                       250–750 U/L, moderate elevation
    r-glutamyl transpeptidase         V10     r-GPT    0–51 IU/L                         0–68 U
      (gamma-glutamate transferase)
    Leucine aminopeptidase            V11     LAP      Serum: Male: 80–200 U/mL;
                                                       Female: 75–185 U/mL
    Total cholesterol                 V12     TCh      [...]


    Figure 1. Dotplot of MD Values for MTS, OA Optimal, and Optimal Combinations By Group (1 = mild; 2 = moderate).

    next examination or the loss increase after having subjective symptoms followed by taking a complete examination, and Dü is the “mid-value” of the MD of a patient group having the subjective symptoms. It is pointed out that $T$ will vary by disease, because the costs will vary by disease. The terms used in (6) are not clearly defined, however, because the meaning of “subjective symptoms” is not clear. It is important to note that statistical approaches based on misclassification costs would incorporate into any decision rule the probability of having the disease, given the data on a subject (see, e.g., Zielezny and Dunn 1975).

    4.2 Results Using Standard Methods

    Descriptions of the MTS do not mention graphical displays of the raw data. Our first step in the analysis of the medical data, however, was to plot each variable by status (healthy = 1; mild disease = 2; medium disease = 3). These plots are shown in the Appendix.

    A key aspect of medical diagnosis involves noting which variables fall outside their corresponding normal ranges. Normal ranges are calculated to include 95% of the measurements on all healthy patients. Taguchi et al. (2001, p. 3) discounted the usefulness of these ranges based on the work of Kanetaka (1990), stating that they are arbitrarily determined by test chemical manufacturers or, in extreme cases, textbook values used without modification. From the discussion of Harris (1981), however, it seems that considerable effort has gone into the determination of normal ranges. The standard practice of using normal ranges in medical diagnosis does have problems, however, as listed by Begg (1991), one of which is the fact that “normalcy is an inherently multivariate concept.”

    The normal ranges that we obtained from the National Library of Medicine (2001) are given in Table 1. The normal range for alkaline phosphatase ($V_9$) was obtained from Neuschwander-Tetri (1995). The ranges given by Taguchi et al. (2001, p. 36) for several of the variables are also shown in Table 1.

    Note that the pair of normal ranges for cholinesterase ($V_5$) in Table 1 do not match each other and are inconsistent with the values of this variable given in the dataset. Thus we do not consider the normal range for this variable. In addition, the normal range given by Taguchi et al. (2001) for alkaline phosphatase ($V_9$) does not match the values given in the dataset. It can be noted that the normal ranges given by Taguchi et al. (2001) do not exactly match those given by the National Library of Medicine for the other variables. It is not unusual for different sources to give somewhat different normal ranges. Also, the original study was done in Japan, so there could be differences in the normal ranges for the Japanese and the U.S. populations. The normal range for a variable also depends on the measurement method used. We have no information on the measurement methods used in this case study.

    Table 3 lists, for each unhealthy patient, the variables that are well outside the corresponding normal ranges. We use the normal ranges from the National Library of Medicine, with the exception of alkaline phosphatase ($V_9$), because we have the ranges for all variables and, for the most part, they cover more of the corresponding values of the healthy group. Note that subjects 2 and 3 do not have any variables clearly outside any of the normal ranges, but they differ considerably from the healthy group with respect to $V_5$. The relevance of the various variables to the diagnosis of liver disease is discussed in Section 4.3.

    Table 3. Variables for Unhealthy Patients Well Outside Normal Ranges

    Subject number    Variable number
    1                 12, 13
    2                 None
    3                 None
    4                 13
    5                 10
    6                 7
    7                 7
    8                 13
    9                 12, 13
    10                4, 12
    11                10, 12
    12                10
    13                10
    14                10, 13
    15                6, 7, 13
    16                3, 6, 7, 10, 12
    17                6, 7, 8, 10, 13
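    The kind of screening summarized in Table 3 can be sketched as a simple range check. The ranges below are the National Library of Medicine values from Table 1 for five of the variables; the patient record is hypothetical and not taken from the case study data. Because “well outside” is not defined operationally in the text, this sketch flags any value outside its range, which is an assumption of the example.

```python
# Normal ranges (lower, upper) from the "Normal ranges" column of Table 1.
NORMAL_RANGES = {
    "TP": (6.0, 8.3),     # g/dL
    "Alb": (3.4, 5.4),    # g/dL
    "GOT": (10, 34),      # IU/L
    "GPT": (6, 59),       # U/L
    "LDH": (105, 333),    # IU/L
}

def outside_normal_range(patient, ranges=NORMAL_RANGES):
    """Return the acronyms of the variables that fall outside their range."""
    flagged = []
    for name, (low, high) in ranges.items():
        value = patient.get(name)
        # Variables missing from the record are skipped, not flagged.
        if value is not None and not (low <= value <= high):
            flagged.append(name)
    return flagged

# Hypothetical patient record; these values are not from the case study data.
patient = {"TP": 7.1, "Alb": 4.0, "GOT": 120, "GPT": 85, "LDH": 200}
print(outside_normal_range(patient))
```

    A univariate check of this kind cannot detect a patient whose values are individually within range but jointly unusual, which is the multivariate point raised by Begg (1991).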

    The following conclusions can be reached by considering the raw data, the dotplots, and the normal ranges:

    1. We note from Figure A.1 that the unhealthy patients are on average 10 years older than the healthy patients. If the medical variables vary naturally by age, then it would seem important to have roughly the same range of ages in the two groups.

    2. From Figure A.14, there is a large difference between the abnormals and the healthy group for phospholipid ($V_{14}$), but all values of this variable are within the normal range.

    3. It is not clear from the univariate dotplots in Figures A.15 and A.17 why creatine ($V_{15}$) and uric acid ($V_{17}$) should be declared to be useful variables for the MTS.

    4. Some variables dropped under the MTS could be useful in the diagnosis for particular patients. In particular, this appears true for variables $V_6$ and $V_7$ for subjects numbered 15, 16, and 17 in the unhealthy group.

    The scatterplot of cholinesterase ($V_5$) and r-GPT ($V_{10}$) shows a clear separation between the healthy and unhealthy patients. This plot is shown in Figure 2, with healthy subjects represented by 1, those with mild disease by 2, and those with medium disease by 3. All outlying points correspond to unhealthy patients with two values plotted at the point (318, 44).

    Similarly, the unhealthy patients also show up in the scatterplot of PL ($V_{14}$) versus TCh ($V_{12}$). This is illustrated in Figure 3. Taguchi et al. (2001, p. 37) give the correlation matrix for the variables for the healthy group that shows the variables $V_{12}$ and $V_{14}$ as the most highly correlated pair.

    There are some significant differences in variation by gender over all groups. This is illustrated in Figure 4 by the r-GPT ($V_{10}$).

    Figure 2. Scatterplot of Variable 10 (V10, r-GPT) Versus Variable 5 (V5, ChE), With Plotting Symbols Distinguishing Groups 1, 2, and 3.

    Figure 3. Scatterplot of Variable 14 (V14, PL) Versus Variable 12 (V12, TCh), With Plotting Symbols Distinguishing Groups 1, 2, and 3.

    There has been an extensive amount of research on the use of statistical modeling for medical diagnosis (see, e.g., Sahai and Khurshid 1991). We applied the methods of discriminant analysis to the medical data, as discussed by Albert and Harris (1987, pp. 101–115). Interestingly, these authors apply discriminant analysis to the diagnosis of liver disease to illustrate their approach. We performed the discriminant analysis under the assumption of multivariate normality for two groups, with gender excluded in the analysis and a log transformation on $V_{10}$. The resulting discriminant function did not do as well as the MTS recommended scale, however, in separating the patients with mild disease severity from those with medium disease severity.
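    For readers unfamiliar with the technique, a two-group linear discriminant function can be sketched as follows. This is a generic textbook illustration, not the analysis reported above: it assumes a common covariance matrix for the two groups and equal priors and costs (neither is stated in the text), the function name is hypothetical, and the data are simulated rather than taken from the case study.

```python
import numpy as np

def fisher_discriminant(group0, group1):
    """Return (w, c): classify x into group 1 when w @ x > c."""
    mean0, mean1 = group0.mean(axis=0), group1.mean(axis=0)
    n0, n1 = len(group0), len(group1)
    # Pooled covariance: assumes both groups share one covariance matrix.
    pooled = ((n0 - 1) * np.cov(group0, rowvar=False)
              + (n1 - 1) * np.cov(group1, rowvar=False)) / (n0 + n1 - 2)
    w = np.linalg.solve(pooled, mean1 - mean0)
    c = w @ (mean0 + mean1) / 2       # midpoint rule: equal priors and costs
    return w, c

# Hypothetical data: two variables, the second right-skewed.
rng = np.random.default_rng(2)
healthy = np.column_stack([rng.normal(300, 40, 200), rng.lognormal(3.0, 0.4, 200)])
unhealthy = np.column_stack([rng.normal(450, 60, 17), rng.lognormal(4.5, 0.5, 17)])

# Log-transform the skewed variable before fitting, as done for V10 in the text.
healthy[:, 1] = np.log(healthy[:, 1])
unhealthy[:, 1] = np.log(unhealthy[:, 1])

w, c = fisher_discriminant(healthy, unhealthy)
scores_healthy = healthy @ w
scores_unhealthy = unhealthy @ w
n_total = len(healthy) + len(unhealthy)
error_rate = (np.sum(scores_healthy > c) + np.sum(scores_unhealthy <= c)) / n_total
print(error_rate)
```

    The discriminant score plays the same role as the MD scale: a single number per item whose size indicates how far the item lies toward the unhealthy group. The error rate here is a resubstitution estimate and would be optimistic on new data.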

    From the medical considerations discussed in more detail later, however, it is not reasonable to simply use collectively all of the variables in this dataset to assess the severity of liver disease. As discussed by Bodily and Fitz (1996) and Chopra (2001), the level of liver disease is most often measured by the modified Child–Pugh classification score, which is based on two clinical and three biochemical measures. The two clinical measures are ascites (fluid in the abdomen) and encephalopathy (mental alertness), and the three biochemical measures are bilirubin, albumin, and prothrombin time (blood clotting factor). Only albumin [Alb ($V_4$)] is included in the dataset used for this case study. Thus it is not possible to accurately assess the level of liver disease for the patients listed as “abnormal.”

    4.3 Medical Considerations

    In this case study we have considered using the MTS in assessing the presence and extent of liver disease in a limited group of Japanese patients. Taguchi and Rajesh (2000) attempted to derive a valid diagnostic scale based on these data. We have no information on how the patients were selected, or on the criteria used to identify the severity of their disease. Despite reservations concerning the data, we have presented some statistical results regarding the performance of the MTS. In this section we discuss some important medical issues concerning the difficulty of treating liver disease as a single entity, the shortcomings resulting from the use of so-called “liver function tests” (LFTs), and Taguchi and Rajesh’s (2000) lack of data from some critical, standard LFTs used for the diagnosis of liver disease and the classification of its severity level.

    Figure 4. Dotplot of V10 (r-GPT) by Gender.

    The diagnosis of liver disease is complicated for several reasons. For one, it is attributed to a diverse number of liver disorders with highly variable underlying pathophysiology and clinical presentations. In addition, the only way to obtain specific diagnostic results is often through invasive techniques (e.g., radiologic procedures and liver biopsy) or immunologic tests that allow specific diagnoses (e.g., hepatitis serology). The LFTs are also often used for diagnostic purposes. They represent a collection of tests that seldom give a specific diagnosis; rather, they suggest a general category of liver disorders (Pratt and Kaplan 1999). It is essential that LFTs be used collectively, because they have a limited sensitivity and specificity. According to Pratt and Kaplan (1999, p. 206) “when more than one of these tests provides abnormal findings or the findings are persistently abnormal on serial determinations, the probability of liver disease is high. When all test results are normal, the probability of missing occult liver disease is low.”

    The LFTs are divided into three major categories: (1) tests of the liver’s ability to transport organic anions and metabolize drugs, such as serum bilirubin; (2) tests that detect injury to liver cells, including aminotransferases, such as GOT (V6), transaminases, such as GPT (V7), and alkaline phosphatase, Alp (V9); and (3) tests of the liver’s biosynthetic capacity, including serum albumin, Alb (V4), and blood clotting factors, such as prothrombin time (Kaplan 1990). Indeed, three of these LFTs, namely Alb (V4), prothrombin time, and bilirubin, are used in the “modified Child–Pugh classification,” the classification standard for severity of liver disease (Bodily and Fitz 1996; Chopra 2001). In this classification, the severity level is determined by two physical findings (ascites and encephalopathy) and the three aforementioned LFTs. It should be noted that Taguchi and Rajesh (2000) made no mention of the modified Child–Pugh classification and gave no data for the two critical LFTs (bilirubin and prothrombin time) for the patients in this case study. The only critical LFT reported by Taguchi and Rajesh (2000), that of Alb (V4), is consistently normal (3.6–5.8 g/dL) in all 17 “abnormal” patients. In the modified Child–Pugh classification, an Alb (V4) level of 2.8–3.5 g/dL is consistent with mild disease, whereas moderate or severe disease is often found in patients with an Alb (V4) level less than 2.8 g/dL (Bodily and Fitz 1996; Chopra 2001).
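
    The albumin cut-offs quoted above can be written as a small lookup, which makes it easy to see why a value in the 3.6–5.8 g/dL range does not point to disease. The following is only a minimal sketch: the function name `albumin_grade` and its labels are illustrative, the treatment of values above 3.5 g/dL is an assumption, and the albumin component alone is not a full Child–Pugh classification (which also needs bilirubin, prothrombin time, ascites, and encephalopathy).

```python
def albumin_grade(alb_g_dl: float) -> str:
    """Grade a serum albumin value (g/dL) using only the cut-offs quoted in the text.

    This covers the albumin component alone; it is not a full Child-Pugh score.
    """
    if alb_g_dl < 2.8:
        return "moderate or severe"
    if alb_g_dl <= 3.5:
        return "mild"
    # Assumption: anything above 3.5 g/dL is treated as not suggestive of disease.
    return "not suggestive of disease"


# Hypothetical values spanning the range quoted for the 17 "abnormal" patients.
for value in (3.6, 4.7, 5.8):
    print(value, albumin_grade(value))
```

    Every value in the reported range falls into the last category, which is the point made in the text about the “abnormal” group.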

    There are two general types of liver disease, acute and chronic. In acute liver disease, the prominently abnormal LFTs are the aminotransferases [e.g., GOT (V6)], which often exceed 500 IU and can frequently reach levels in the thousands while most other tests remain normal for a while (Kaplan 1990; Pratt and Kaplan 1999). In contrast, in chronic liver failure, the aminotransferases [e.g., GOT (V6)] and transaminases [e.g., GPT (V7)] increase minimally to less than 500 IU, whereas the remaining LFTs are variable, according to the underlying pathology.

    In chronic liver disease, three major subtypes can be identified: chronic hepatocellular disorders (e.g., cirrhosis or alcoholic liver disease), cholestasis (e.g., obstruction), and infiltrative disorders (e.g., tumors or tuberculosis). Each of these subcategories has a specific pattern of presentation. In the case of hepatocellular disorder, an Alb (V4) level below 3.0 g/dL and an abnormally prolonged prothrombin time, with

    only minimally increased aminotransferases [e.g., GOT (V6)] to a level below 300 IU, is the norm. A ratio of GOT/GPT above 2.0 strongly suggests alcoholic liver disease in that setting (Clermont and Chalmers 1967). Whereas 70% of patients with alcoholic liver disease have GOT/GPT above 2.0, this is encountered in only 5% or less of patients with other disorders (Cohen and Kaplan 1979).

    In the cholestatic form of liver disease, the pattern is different. There, the Alp (V9) is usually elevated out of proportion with other enzymes. Values exceeding four times the normal level suggest cholestasis (Pratt and Kaplan 1999). Because Alp (V9) has a close linear relation with serum r-glutamyl transpeptidase [r-GPT (V10)], it is logical to look for similar changes in r-GPT (V10) (Whitfield et al. 1972). If Alp (V9) is elevated and r-GPT is not, then one would assume that Alp (V9) is not of liver origin (probably of bone disease origin). Aminotransferases [e.g., GOT (V6)] are usually elevated to levels up to 300 IU, with values exceeding 500 IU being rare.

    In cases of infiltrative liver disease, the pattern is closer to that seen with obstruction. Often, the earliest and only abnormal test is Alp (V9). Aminotransferases [e.g., GOT (V6)] are normal or minimally elevated, and so are Alb (V4) and prothrombin time (Pratt and Kaplan 1999).

    Of all LFTs (variables) in the patients’ dataset used for this case study, the most relevant ones for liver disease diagnosis and classification are Alb (V4), GOT (V6), GPT (V7), Alp (V9), and r-GPT (V10). The data results for the LFTs V3, V5, V8, and V11–V17 are not directly relevant to liver disease. From the foregoing medical discussion and the case study data, it is quite clear that while “abnormal” patients 15–17 seem to exhibit some chronic hepatocellular disease (e.g., cirrhosis or alcoholic liver disease), all other patients, both “normal” and “abnormal,” do not seem to exhibit any notable liver disease. In fact, it is quite doubtful that any patient participating in this case study has any significant liver disease, certainly not acute, because no patient has an Alb (V4) level below 3.5 g/dL.

    Although some of the abnormal patients 1–14 could exhibit some extremely weak signs of chronic cholestasis (e.g., obstruction) or infiltrative disorders (e.g., tumors), such a diagnosis would certainly require additional results from the two critical LFTs (bilirubin and prothrombin time) and would benefit from some physical findings (e.g., ascites), as suggested in the modified Child–Pugh classification method. However, these data are not available for the case study. Moreover, a sample of only 17 “abnormal” patients is extremely small for liver disease diagnosis and classification, given the highly diverse number of disorders attributed to liver disease.

    Finally, it is important to note that cluster analysis, which was applied to the five most relevant LFTs (variables) Alb (V4), GOT (V6), GPT (V7), Alp (V9), and r-GPT (V10) (for the combined sample of male and female patients), yielded an optimal number of two clusters. One cluster grouped together the “normal” patients with “abnormal” patients 1–14, whereas the second cluster consisted of the three “abnormal” patients, 15–17. That is, the results of cluster analysis are

    TECHNOMETRICS, FEBRUARY 2003, VOL. 45, NO. 1


    in full agreement with a careful medical diagnosis based on the data available for the case study. When the MTS analysis was similarly applied to these same five most relevant LFTs for both 8 and 32 runs, however, the results were consistent with each other, but different from the results of the cluster analysis and the medical diagnosis. This suggests that in this case, the problem with the MTS analysis is connected with the use of the S/N ratio measure rather than the interaction issue from the OA. In general, however, both of these issues can cause problems.

    5. OTHER ASPECTS OF THE MAHALANOBIS–TAGUCHI SYSTEM

    Taguchi et al. (2001) presented several other methods in the

    MTS framework. We summarize these in this section.

    5.1 Forecasting

    Taguchi et al. (2001, chap. 3) presented an application of 

    the MTS to evaluate the amount of credit that should be

    extended to applicants. The MTS is proposed as an alternative

    to credit-scoring methods. Taguchi et al. (2001, p. 25) stated that traditional methods in this area have not been successful,

    because only people who defaulted on loans were studied.

    Data on good customers is routinely used to build credit-

    scoring models, however, as discussed by Reichert, Cho, and

    Wagner (1983).

    In the MTS approach, the values of $M_j$ represent losses to the company due to unpaid bills. The regression model in (3) is fitted, and the loss corresponding to an applicant with an MD value of $D^2$ is estimated to be

    $$M = D/\hat{\beta} \pm 3\left(\sqrt{\mathrm{MSE}}/\hat{\beta}\right).$$

    This practice of using a fitted regression line to estimate the value of an unobserved independent variable corresponding to an observed value of the dependent variable is called “calibration” in the statistical literature. Brownlee (1965, pp. 361–362) discussed the calibration problem specifically for a line through the origin. As discussed by Mee and Eberhardt (1996), the statistical approach to this problem accounts for the error in estimating the variance and the slope of the line. This sampling variation is ignored in the MTS.
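
    To make the calibration step concrete, the following is a minimal sketch of the plug-in calculation, assuming that model (3) is a straight line through the origin relating D (the square root of the MD) to M, and that the estimate takes the form D divided by the fitted slope, plus or minus three root-MSE units divided by the slope. The function name, the `n - 1` divisor for the MSE, and the data are illustrative assumptions, not taken from Taguchi et al. (2001).

```python
import numpy as np


def calibrate_through_origin(M, D, D_new, k=3.0):
    """Plug-in inverse prediction for a line through the origin, D = beta * M.

    Returns (center, half_width), where the interval is center +/- half_width.
    beta_hat and MSE are treated as known constants, which is exactly the
    simplification criticized in the text.
    """
    M = np.asarray(M, dtype=float)
    D = np.asarray(D, dtype=float)
    beta_hat = np.sum(M * D) / np.sum(M ** 2)   # least-squares slope, no intercept
    resid = D - beta_hat * M
    mse = np.sum(resid ** 2) / (len(M) - 1)     # assumed divisor: n - 1 (one parameter)
    center = D_new / beta_hat
    half_width = k * np.sqrt(mse) / beta_hat
    return center, half_width


# Hypothetical losses M and observed D values, for illustration only.
M_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
D_obs = [1.9, 4.2, 5.8, 8.1, 10.1]
print(calibrate_through_origin(M_obs, D_obs, D_new=6.0))
```

    The half-width here depends only on the point estimates of the slope and the MSE. A calibration interval of the kind discussed by Mee and Eberhardt (1996) would also reflect the uncertainty in those two estimates.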

    5.2 Use in Clinical Trials

    Taguchi et al. (2001, chap. 5) pointed out that clinical trials involve large numbers of subjects, require quite a long time, and are very expensive. The two reasons given for this are the large individual differences between patients and the use of attribute data, not continuous variables. It is stated that if continuous variables, such as the MD, could be used, then the study could be conducted by observing only one or two patients in a short period. Statisticians would find this claim astounding, because clinical trials must have sample sizes sufficiently large for investigators to measure effectiveness relative to other treatments, determine dosage, assess the side effects of the treatment being studied, and determine which types of patients in a very heterogeneous population benefit most from the treatment. Taguchi et al. (2001, p. 4) considered use of the MTS in clinical trials to be its most exciting potential application.

    Taguchi et al. (2001, chap. 5) proposed a method for comparing the effectiveness of two treatments. Only one patient is used for each treatment. The MD values of each patient are recorded over time during treatment. The MD values of the two patients are scaled using the corresponding initial MD values, and regression equations are fitted to show the changes in the transformed MD values over time. The treatments are compared by comparing the estimated slopes of the two lines. In statistical terminology, this corresponds to a repeated-measurements experiment for two treatments, but with only one subject in each treatment group. Statisticians would never recommend this practice, however, because variation between subjects cannot be assessed. The treatment effect is confounded with the difference between subjects.
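
    The confounding can be illustrated with a small simulation. The sketch below is hypothetical throughout: it assumes scaled MD values that start at 1 and decline linearly, gives both patients the same treatment (so the true treatment difference is zero), and lets each patient have an individual slope. None of the numerical settings come from Taguchi et al. (2001).

```python
import numpy as np


def simulate_slope_differences(n_pairs=1000, true_slope=-0.05,
                               subject_sd=0.02, noise_sd=0.01, seed=0):
    """Slope differences between two single patients on the SAME treatment.

    Each patient's slope is true_slope plus a patient-specific deviation, so
    any difference between the two fitted slopes reflects only between-subject
    variation and measurement noise, never a treatment effect.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(10.0)                       # ten measurement times
    diffs = np.empty(n_pairs)
    for i in range(n_pairs):
        slopes = true_slope + rng.normal(0.0, subject_sd, size=2)
        y = 1.0 + slopes[:, None] * t + rng.normal(0.0, noise_sd, size=(2, t.size))
        fitted = [np.polyfit(t, y_row, 1)[0] for y_row in y]   # least-squares slopes
        diffs[i] = fitted[0] - fitted[1]
    return diffs


diffs = simulate_slope_differences()
print("SD of slope difference with no treatment effect:", diffs.std())
```

    Under these assumptions the slope differences vary widely around zero even though the treatments are identical, so a single observed difference between two patients cannot be attributed to the treatment without an estimate of between-subject variation.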

    5.3 Use of Principal Components

    Taguchi and Rajesh (2000) pointed out that in some applications of the MTS, there are two types of abnormalities present. For example, in the graduate student admission process there could be very good, as well as very bad, applicants. Thus, they noted that it is important to identify the direction of the abnormality. They stated further that this cannot be done with the MD values calculated using the inverse of the correlation matrix, but it can be done using the Gram–Schmidt orthogonalization process.

    The Gram–Schmidt process is recommended for obtaining a set of mutually perpendicular vectors from a set of linearly independent standardized original vectors. It appears that this is a recommendation for obtaining the values of the principal components of the abnormal items based on the correlation matrix of the normal group. The discussion is not clear, however, for several reasons. First, the classification into the good and bad categories is based on the signs of the principal components. Often this would not be any more helpful than using the signs of the standardized original variables. Second, the threshold of the MD values shown on bivariate plots should correspond to an ellipse, but instead linear limits are drawn. Third, the axes, corresponding to what appear to be the principal components in the bivariate plots, are not drawn along the major and minor axes of the MD contour ellipse and are not centered at the origin, as would be expected.
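
    The relation between the two ways of computing the MD can be checked numerically. The sketch below assumes that the MD is the quadratic form in the inverse correlation matrix of the normal group divided by the number of variables k, and that the Gram–Schmidt version sums the squared orthogonalized components, each divided by its variance in the normal group. All function names and data are illustrative. The orthogonalized components keep their signs, whereas the MD itself is a sum of squares.

```python
import numpy as np


def gram_schmidt(Z):
    """Classical Gram-Schmidt on the columns of Z, without normalization.

    Returns U (mutually orthogonal columns) and T such that U = Z @ T.
    """
    n, k = Z.shape
    U = np.zeros((n, k))
    T = np.eye(k)
    for j in range(k):
        u = Z[:, j].copy()
        t = T[:, j].copy()
        for i in range(j):
            coef = (Z[:, j] @ U[:, i]) / (U[:, i] @ U[:, i])
            u -= coef * U[:, i]
            t -= coef * T[:, i]
        U[:, j] = u
        T[:, j] = t
    return U, T


def md_inverse_correlation(Z_normal, z):
    """Scaled MD of z using the inverse correlation matrix of the normal group."""
    k = Z_normal.shape[1]
    C = np.corrcoef(Z_normal, rowvar=False)
    return float(z @ np.linalg.solve(C, z)) / k


def md_gram_schmidt(Z_normal, z):
    """Scaled MD of z from Gram-Schmidt components; also returns the signed components."""
    n, k = Z_normal.shape
    U, T = gram_schmidt(Z_normal)
    s2 = (U ** 2).sum(axis=0) / (n - 1)   # variance of each orthogonal component
    u = z @ T                             # same transformation applied to the new item
    return float(np.sum(u ** 2 / s2)) / k, u


# Hypothetical normal group: 40 items, 3 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3)) @ np.array([[1.0, 0.5, 0.2], [0.0, 1.0, 0.4], [0.0, 0.0, 1.0]])
mean, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
Z_normal = (X - mean) / sd
z_new = (np.array([2.0, -1.0, 3.0]) - mean) / sd   # a hypothetical "abnormal" item

md_gs, components = md_gram_schmidt(Z_normal, z_new)
print(md_inverse_correlation(Z_normal, z_new), md_gs, np.sign(components))
```

    Because the two calculations give the same MD, a fixed MD threshold defines the same elliptical contour in either parameterization. Note also that the Gram–Schmidt components depend on the order in which the variables are entered, which is not true of principal components.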

    6. CONCLUDING REMARKS

    As statisticians, we much prefer the multivariate statistical approaches based on underlying probability models to the MTS. Mahalanobis (1950) also greatly valued the use of probability, stating that statistics supplies the basis for choosing a particular course of action in practical problems by balancing the risks of gain and loss using the calculus of probability. He also held that the cross-examination of the data was the first responsibility of the statistician (Mahalanobis 1965). Questioning the validity of the data and the use of exploratory data analysis are not mentioned as part of the MTS.

    Statistical methods are better designed to account for variation between units in the groups and to account for sampling variation. The MTS does not adequately address the issue of variation between items, because this variation


    typically results in at least some classification errors, when a classification rule is developed from a dataset. This lack of attention to variation between units is most evident in the MTS clinical trials methods, in which variation between individuals is completely ignored. In addition, sampling variation is ignored in the decision rules involving the S/N ratios and in the calibration problem involving prediction of an $M_j$ value based on the value of MD.

    It should be noted that some of the application areas mentioned for the MTS have been studied extensively in the statistics and other subject matter literature, including medical diagnosis and credit scoring. These bodies of work are ignored in the development of the MTS approaches.

    The case study presented in Section 4 shows that the MTS analysis based on the S/N ratio in (5) does not necessarily lead to a good MD scale, in that separation between the classes with different severities of abnormality can be very poor. Of course, it is possible to use a more effective search algorithm than that based on the OA and to modify the S/N ratio. Even with such modifications, however, we believe that there are still important unresolved conceptual issues with the MTS, and that with further development of the basic approach, one would eventually need to incorporate methods based on probability. Despite such serious shortcomings, however, we expect the MTS to become more widely used in industry. Many practitioners will understand the advantages of using multivariate data, but will lack the expertise required to implement statistical approaches.

    ACKNOWLEDGMENTS

    The research of W. H. Woodall, R. Koudelik, K.-L. Tsui, and S. B. Kim was partially supported by National Science Foundation-DMI grant 9908013. K.-L. Tsui’s work was also partially supported by The Logistic Institute – Asia Pacific, Singapore. The work of Z. G. Stoumbos was funded in part by the Law School Admission Council (LSAC) and by a 2001 Rutgers Faculty of Management Research Fellowship. The opinions and conclusions contained in this publication are those of the authors and do not necessarily reflect the position or policy of LSAC. We thank Rajesh Jugulum and Genichi Taguchi for providing the medical case study dataset and allowing us to distribute it. We also thank the referees and the associate editor for their helpful comments.

     APPENDIX: DOTPLOTS FOR THE MEDICAL DATA VARIABLES

    (Status: Healthy = 1; mild disease = 2; medium disease = 3)

    [Dotplots not reproduced. Each figure plots one variable on the horizontal axis, with a separate row of dots for each patient status (1, 2, 3).]

    Figure A.1. Dotplot of V1 (Age) by Patient Status.

    Figure A.2. Dotplot of V2 (Gender) by Patient Status. Each dot represents up to 3 observations.

    Figure A.3. Dotplot of V3 (Total Protein) by Patient Status.

    Figure A.4. Dotplot of V4 (Albumin) by Patient Status.

    Figure A.5. Dotplot of V5 (Cholinesterase) by Patient Status.

    Figure A.6. Dotplot of V6 (Glutamate O Transaminase) by Patient Status.

    Figure A.7. Dotplot of V7 (Glutamate P Transaminase) by Patient Status.

    Figure A.8. Dotplot of V8 (Lactic Dehydrogenase) by Patient Status.

    Figure A.9. Dotplot of V9 (Alkaline Phosphatase) by Patient Status.

    Figure A.10. Dotplot of V10 (r-Glutamyl Transpeptidase) by Patient Status.

    Figure A.11. Dotplot of V11 (Leucine Aminopeptidase) by Patient Status.

    Figure A.12. Dotplot of V12 (Total Cholesterol) by Patient Status.

    Figure A.13. Dotplot of V13 (Triglyceride) by Patient Status.

    Figure A.14. Dotplot of V14 (Phospholipid) by Patient Status.

    Figure A.15. Dotplot of V15 (Creatinine) by Patient Status.

    Figure A.16. Dotplot of V16 (Blood Urea Nitrogen) by Patient Status.

    Figure A.17. Dotplot of V17 (Uric Acid) by Patient Status.

    [Received August 2001. Revised December 2001.]

    REFERENCES

    Adams, B. M., and Woodall, W. H. (1989), “An Analysis of Taguchi’s On-Line Process Control Method Under a Random Walk Model,” Technometrics, 31, 401–413.

    Albert, A., and Harris, E. K. (1987), Multivariate Interpretation of Clinical Laboratory Data, New York: Marcel Dekker.

    Begg, C. B. (1991), “Advances in Statistical Methodology for Diagnostic Medicine in the 1980’s,” Statistics in Medicine, 10, 1887–1895.

    Bodily, K. O., and Fitz, J. G. (1996), “Approach to the Patient with Suspected Liver Disease,” in Current Diagnosis & Treatment in Gastroenterology, eds. J. H. Grendell, K. R. McQuaid, and S. L. Friedman, Stamford, CT: Appleton & Lange, pp. 461–474.

    Box, G. E. P. (1996), “The Role of Statistics in Quality and Productivity Improvement,” Journal of Applied Statistics, 23, 3–20.

    Brownlee, K. A. (1965), Statistical Theory and Methodology in Science and Engineering, New York: Wiley.

    Chopra, S. (2001), “Diagnostic Approach to the Patient with Cirrhosis,” UpToDate (www.uptodate.com), 9, 1–6.

    Clermont, R. J., and Chalmers, T. C. (1967), “The Transaminase Tests in Liver Disease,” Medicine, 46, 197–207.

    Cohen, J. A., and Kaplan, M. M. (1979), “The SGOT/SGPT Ratio: An Indicator of Alcoholic Liver Disease,” Digestive Diseases and Sciences, 24, 835–839.

    Harris, E. K. (1981), “Statistical Aspects of Reference Values in Clinical Pathology,” in Progress in Clinical Pathology VIII, eds. M. Stefanini and E. Benson, New York: Grune and Stratton, pp. 45–66.

    Kanetaka, T. (1990), “Diagnosis of a Special Health Check Using Mahalanobis Distance,” ASI Journal, 3.

    Kaplan, M. M. (1990), “Evaluation of Hepatobiliary Disease,” in Internal Medicine (3rd ed.), eds. J. H. Stein et al., Boston, MA: Little, Brown, p. 443.

    Lunani, M., Nair, V. N., and Wasserman, G. S. (1997), “Graphical Methods for Robust Design with Dynamic Characteristics,” Journal of Quality Technology, 29, 327–338.

    Mahalanobis, P. C. (1950), “Why Statistics?,” Sankhyā, 10, 195–228.

    Mahalanobis, P. C. (1965), “Statistics as a Key Technology,” The American Statistician, 19, 43–46.

    Mee, R. W., and Eberhardt, K. (1996), “A Comparison of Uncertainty Criteria for Calibration,” Technometrics, 38, 221–229.

    Montgomery, D. C. (1992), “The Use of Statistical Process Control and Design of Experiments in Product and Process Improvement,” IIE Transactions, 24, 4–17.

    Nair, V. N. (ed.) (1992), “Taguchi’s Parameter Design: A Panel Discussion,” Technometrics, 34, 127–161.

    National Library of Medicine (2001), MEDLINEplus Health Information (www.nlm.nih.gov/medlineplus), May 16, 2001.

    Nayebpour, M. R., and Woodall, W. H. (1993), “An Analysis of Taguchi’s On-Line Quality Monitoring Procedures for Attributes,” Technometrics, 35, 53–60.

    Neuschwander-Tetri, B. A. (1995), “Common Blood Tests for Liver Disease,” Postgraduate Medicine, 98, 49–63.

    Pratt, D. S., and Kaplan, M. M. (1999), “Evaluation of the Liver. A. Laboratory Tests,” in Schiff’s Diseases of the Liver (5th ed.), eds. E. R. Schiff, M. F. Sorrell, W. C. Maddrey, Philadelphia: Lippincott-Raven, pp. 205–244.

    Reichert, A. K., Cho, C.-C., and Wagner, G. M. (1983), “An Examination of the Conceptual Issues Involved in Developing Credit-Scoring Models,” Journal of Business and Economic Statistics, 1, 101–114.

    Sahai, H., and Khurshid, A. (1991), “Mathematical and Statistical Models in Computer-Assisted Medical Diagnosis: An Overview and a Selected Bibliography,” Journal of Clinical Computing, 20, 33–81.

    Taguchi, G. (1981), On-Line Quality Control During Production, Tokyo: Japanese Standards Association.

    Taguchi, S., Chowdhury, S., and Taguchi, S. (2000), Robust Engineering, New York: McGraw-Hill.

    Taguchi, G., Chowdhury, S., and Wu, Y. (2001), The Mahalanobis–Taguchi System, New York: McGraw-Hill.

    Taguchi, G., Elsayed, E. A., and Hsiang, T. (1989), Quality Engineering in Production Systems, New York: McGraw-Hill.

    Taguchi, G., and Rajesh, J. (2000), “New Trends in Multivariate Diagnosis,” Sankhyā, 62, 233–248.

    Taguchi, G., and Wu, Y. (1980), Introduction to Off-Line Quality Control, Nagoya, Japan: Japan Quality Control Organization.

    Tracy, N. D., Young, J. C., and Mason, R. L. (1992), “Multivariate Control Charts for Individual Observations,” Journal of Quality Technology, 24, 88–95.

    Tsui, K.-L. (1996), “A Critical Look at Taguchi’s Modeling Approach for Robust Design,” Journal of Applied Statistics, 23, 81–95.

    Tsui, K.-L. (1999), “Response Model Analysis of Dynamic Robust Design Experiments,” IIE Transactions, 31, 1113–1122.

    Whitfield, J. B., Pounder, R. E., Neale, G., and Moss, D. W. (1972), “Serum γ-Glutamyl Transpeptidase Activity in Liver Disease,” Gut, 13, 702–708.

    Wu, C. F. J., and Hamada, M. (2000), Experiments: Planning, Analysis, and Parameter Design Optimization, New York: Wiley.

    Zielezny, M., and Dunn, O. J. (1975), “Cost Evaluation of a Two-Stage Classification Procedure,” Biometrics, 31, 37–47.