

JOURNAL OF BONE AND MINERAL RESEARCH Volume 11, Number 5, 1996 Blackwell Science, Inc. © 1996 American Society for Bone and Mineral Research

Dual X-Ray Absorptiometry Quality Control: Comparison of Visual Examination and Process-Control Charts

YING LU,' ASHWINI K. MATHUR,' BARBARA A. BLUNT,' CLAUS C. GLUER,' A. STEVE WILL,' THOMAS P. FUERST,' MICHAEL D. JERGAS,' KIM N. ANDRIANO,'

STEVEN R. CUMMINGS,4 and HARRY K. GENANT'

ABSTRACT

Dual X-ray absorptiometry (DXA) is widely used to monitor treatment efficacy in reducing the rate of bone mineral loss. In order to assure the validity of these measurements, instrument quality control of the DXA scanners becomes very important. This paper compares five quality control procedures (visual inspection, Shewhart chart with sensitizing rules, Shewhart chart with sensitizing rules and a filter for clinically insignificant mean changes, moving average chart and standard deviation, and cumulative sum chart [CUSUM]) in their ability to identify scanner malfunction by means of (1) an analysis of five longitudinal phantom data sets that had been collected during a clinical trial and (2) an analysis of simulated data sets. The visual inspection method is relatively subjective and depends on the operator's experience and attention. The regular Shewhart chart with sensitizing rules has a high false alarm rate. The Shewhart chart with sensitizing rules and an additional filter for clinically insignificant mean changes has the lowest false alarm rate but a relatively low sensitivity. The CUSUM method has good sensitivity and a low false alarm rate. In addition, this method provides an estimate of the date a change in the DXA scanner performance might have occurred. The method combining a moving average chart and a moving standard deviation chart came closest to the performance of the CUSUM method. Comparing the advantages and disadvantages of all methods, we propose the use of the CUSUM method as a quality control procedure for monitoring DXA scanner performance. For clinical trials use of the more intuitive Shewhart charts may be acceptable at the individual sites provided their scanner performance is followed up by CUSUM analysis at a central quality assurance center. (J Bone Miner Res 1996;11:626-637)

INTRODUCTION

Dual X-ray absorptiometry (DXA) is an accurate and precise noninvasive method to measure bone mineral density (BMD). It is also widely applied in multicenter clinical trials to monitor treatment effects on BMD values. Despite their remarkable accuracy and reproducibility, BMD values measured by DXA technology can still vary because of equipment changes, software upgrades, machine recalibration, hardware aging and/or failure, or operator errors. Therefore, many long-term clinical trials use daily phantom scans to monitor the longitudinal variation of a scanner and characterize the scanner's performance across distinct time intervals according to the observed drift and shift in calibration. Adjustments of patient scan measurements usually utilize these intervals.

Change points mark the beginning and the end of these distinct time intervals. There are several statistical methods to identify such change points. One of them is to check the retrospective data visually to determine the change points and verify these by a t-test. An alternative method is to use statistical process control charts. The Shewhart chart is the most commonly used process control chart. The moving average chart and cumulative sum chart (CUSUM) are other forms of process control charts. Although these quality control methods have been developed for use in osteoporosis studies, their appropriateness and effectiveness have not been evaluated and compared. In this study we examined these methods for identifying change points as well as the magnitude of these changes using longitudinal quality control data from five sites participating in one clinical trial. In addition, we investigated the sensitivity and false alarm rate of each method calculated on simulated data sets.

1 Osteoporosis Research Group, Department of Radiology, University of California, San Francisco, California, U.S.A.
2 Procter and Gamble Pharmaceuticals, Cincinnati, Ohio, U.S.A.
3 Sandoz Pharmaceuticals Corp., East Hanover, New Jersey, U.S.A.
4 Department of Epidemiology and Biostatistics, University of California, San Francisco, California, U.S.A.

MATERIALS AND METHODS

Analysis of longitudinal scan data

Longitudinal quality control data were obtained for a period of approximately 4 years at five centers using Hologic (Hologic, Waltham, MA, U.S.A.) scanners. These centers were selected from the Sandoz CT310 trial. Regular phantom scans were performed as a part of the quality control effort. We used SAS to analyze the five data sets, one for each center, and S-Plus to conduct our simulation experiments. When multiple scans were obtained on the same day, we used only the last scan done on that day for this paper because of a software limitation of SAS.

The performance of five quality control methods was compared in two ways. First, we used the longitudinal quality control data from the Sandoz trial to compare several features of the change points that trigger alarms, including the coefficients of variation (CV) of the corresponding disjoint time intervals, the percentage changes of the means between two subsequent intervals, and the length of the intervals. Second, we used simulations to compare the sensitivity and false alarm rate of various approaches for detecting scanner malfunctions.

Quality control methods

In an ideal setting, a well maintained scanner should produce phantom BMD values that are randomly spread around a reference value. A change point is defined as the point in time at which the BMD values start to deviate from that value. Although there is no generally accepted definition of a change point, it may have one or more of the following properties: (1) the mean BMD values before and after the change point are statistically significantly different; (2) the standard deviations of BMD values before and after the change point are statistically significantly different; (3) the BMD values after the change point show a gradual but significant departure from the reference value. In our analyses, we applied our quality control methods in a prospective manner.

Visual inspection: Potential change points in the data can be chosen after careful visual inspection. This can be done by plotting longitudinal BMD data over time and using visual judgment to identify the potential change points created by drifts or sudden jumps. Statistical tests, such as a t-test, can be used to confirm the significance of the changes.

Visual inspection can be used both prospectively and retrospectively. We applied visual inspection prospectively so as to be comparable with other methods and to simulate a real ongoing clinical trial. To do this using a retrospective data set, we plotted BMD values over time in several scatter plots for each study site. Each successive scatter plot contained 50 more scans than the previous one, and the last scatter plot contained the entire data set. This procedure produced 14-19 plots for each study site.

An experienced medical physicist examined these plots in their time sequence for each site. Change points were selected based on the visual judgment of the medical physicist using these plots. A significant drift, a jump, or an increase in variation was identified. The selection of the change points was based on the most recent scatter plot. Once a change point was identified, it could not be changed or deleted based on later scatter plots. However, additional change points could be determined retrospectively after visual inspection of several additional scatter plots.

Shewhart control chart: A Shewhart chart is a graphic display of a quality characteristic that has been measured over time. The chart contains a center line that represents the mean reference BMD value. Instead of using the mean of the overall data as the reference level, our starting reference value was the mean of the first 25 observations. The reference value changes whenever the Shewhart chart indicates an out-of-control signal. The new reference value will then be the mean of the 25 observations after the date of the signal. Although we can use more or fewer observations to calculate the reference value, the number 25 was chosen based on practical experience to balance the stability of the reference value and the length of time to establish it. Two other horizontal lines, the upper and lower control limits, are also shown on the chart. These are the reference value ± 3 SD. Since the standard deviation varies among individual scanners and manufacturers, to apply equivalent rules for all the centers, we assumed that the coefficient of variation for Hologic machines is 0.5%, which is close to reported data on long-term phantom precision.(13) Therefore, the standard deviation for a scanner was calculated as 0.005 times the reference value.
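The reference value and control limits described above can be sketched in a few lines of Python. This is only an illustration: the function name `shewhart_limits` and its parameters are hypothetical, while the 25-scan baseline, the assumed 0.5% CV, and the ± 3 SD limits are taken from the text.

```python
from statistics import mean

ASSUMED_CV = 0.005  # 0.5% coefficient of variation assumed in the text
BASELINE_N = 25     # number of scans used to establish a reference value


def shewhart_limits(bmd, start=0, cv=ASSUMED_CV, n_ref=BASELINE_N):
    """Return (reference, sd, lower, upper) from n_ref scans beginning at start.

    `start` is 0 for the initial reference value, or the index of the first
    scan after an out-of-control signal when the reference is re-established.
    """
    window = bmd[start:start + n_ref]
    if len(window) < n_ref:
        raise ValueError("not enough scans to establish a reference value")
    reference = mean(window)
    sd = cv * reference  # SD is derived from the assumed CV, not estimated
    return reference, sd, reference - 3 * sd, reference + 3 * sd
```

For a phantom that reads exactly 1.000 g/cm² on every baseline scan, this gives an SD of 0.005 and control limits of 0.985 and 1.015.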

Although it is intuitive and easy to apply, the Shewhart chart has a low sensitivity for small but significant changes. Therefore, a set of tests for assignable causes has been developed to improve the sensitivity of Shewhart charts, out of which eight are available in SAS. These tests are referred to as the sensitizing rules in the statistical and quality control literature.(4) We used four of these rules: (1) Test 1, 1 measurement more than 3 SD from the center line; (2) Test 2, 9 consecutive measurements either above or below the center line; (3) Test 5, 2 out of 3 measurements more than 2 SD from the center line; (4) Test 6, 4 out of 5 measurements more than 1 SD from the center line. Once a change point is identified by any one of the tests given above, we use the next 25 observations to generate new reference values and apply the tests on the subsequent data according to the new reference value. We will refer to this approach as Shewhart 1.
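The four sensitizing rules can be expressed compactly on standardized deviations. The sketch below is an illustration, not the SAS implementation: `fired_tests` is a hypothetical name, and it assumes that Tests 5 and 6 count points on the same side of the center line, which is the usual convention for these rules but is not spelled out in the text.

```python
def fired_tests(z):
    """Return the test numbers (1, 2, 5, 6) that fire at the latest scan.

    z holds standardized deviations (x - reference) / sd, oldest first.
    """
    fired = []
    # Test 1: one measurement more than 3 SD from the center line.
    if abs(z[-1]) > 3:
        fired.append(1)
    # Test 2: nine consecutive measurements on one side of the center line.
    last9 = z[-9:]
    if len(last9) == 9 and (all(v > 0 for v in last9) or all(v < 0 for v in last9)):
        fired.append(2)
    # Test 5: 2 of 3 beyond 2 SD; Test 6: 4 of 5 beyond 1 SD (same side assumed).
    for test, window, needed, limit in ((5, 3, 2, 2.0), (6, 5, 4, 1.0)):
        recent = z[-window:]
        if len(recent) < window:
            continue
        above = sum(v > limit for v in recent)
        below = sum(v < -limit for v in recent)
        if above >= needed or below >= needed:
            fired.append(test)
    return fired
```

In a prospective run, the function would be called after each new scan with the deviations accumulated since the current reference value was established.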

The sensitizing rules increase the sensitivity of the Shewhart chart but also the number of alarms that are clinically insignificant, which is not desirable. To overcome this, a threshold based on the magnitude of the mean shift was implemented. In this approach, we select 10 consecutive scans after the potential change point that the Shewhart chart has identified. The mean value of these scans is then calculated. If the mean differs by more than 1 SD (which equals 0.5% times the reference value, in our example) from the reference value, the change point is confirmed as a true change point. Otherwise the signal given by the Shewhart chart is ignored and the reference value is not changed. This approach filters out the small and clinically insignificant changes. However, the true difference has to be beyond 1 SD for this approach to be useful, and thus this approach may delay the recognition of true change points. We will refer to this approach as Shewhart 2.
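A minimal sketch of the Shewhart 2 filter follows. The name `confirm_change` is hypothetical, and the sketch assumes that the 10 confirmation scans start with the scan immediately after the alarm; the text does not say whether the alarming scan itself is included.

```python
from statistics import mean


def confirm_change(bmd, alarm_index, reference, cv=0.005, n_confirm=10):
    """Apply the mean-shift filter to a potential change point.

    Returns True if the mean of the n_confirm scans after the alarm differs
    from the reference by more than 1 SD, False if it does not, and None if
    too few scans have been collected to decide.
    """
    following = bmd[alarm_index + 1:alarm_index + 1 + n_confirm]
    if len(following) < n_confirm:
        return None
    sd = cv * reference  # 1 SD equals 0.5% of the reference value
    return abs(mean(following) - reference) > sd
```

The `None` result makes the delay built into this approach explicit: no alarm can be confirmed until 10 further scans are available.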

Moving average chart: An alternative method is to determine the mean and standard deviation of 25 consecutive scans and plot these means and standard deviations over time. Control limits can be set up based on the assumption of a constant coefficient of variation during the process (0.5% times the reference mean) and a type one error rate comparable to the original Shewhart method (0.27%). The control limits for the moving average are ±59.91% of the standard deviation from the reference mean, and the control limit of the moving standard deviation is 1.41 times the standard deviation (see Appendix). Note that there is only an upper limit for the moving standard deviation chart because we are only interested in an increase of the standard deviation. In other words, we are looking for quality control but not for quality improvement. Once the moving average moves out of the control limit, the value of the moving average at that point is used as the new reference value for the scans collected after that date.

The number of scans used to calculate the moving average will affect its performance. Twenty-five scans were selected based on power analysis, so that the moving average chart has less than a 0.27% chance for a false alarm and a 98% chance to detect an increase in the mean of 1 SD. In addition, the moving standard deviation chart has a 98% chance to pick up a 100% increase in standard deviation. Twenty-five scans also correspond to the typical number of phantom scans done in a month. Details about the moving average charts can be found in the Appendix.
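The combined moving average and moving standard deviation chart might be sketched as below. The function name and return format are hypothetical; the window of 25 scans and the limits of 59.91% and 1.41 times the standard deviation are the values quoted in the text. Resetting the reference value after an alarm is left out for brevity.

```python
from statistics import mean, stdev

WINDOW = 25
MEAN_LIMIT = 0.5991  # moving-average limit, as a fraction of the SD (from the text)
SD_LIMIT = 1.41      # upper limit for the moving SD, as a multiple of the SD


def moving_chart_alarm(bmd, reference, cv=0.005, window=WINDOW):
    """Return (index, kind) for the first out-of-control window, or None.

    index is the position of the last scan in the alarming window and kind
    is "mean" or "sd", depending on which chart crossed its limit.
    """
    sd = cv * reference
    for end in range(window, len(bmd) + 1):
        recent = bmd[end - window:end]
        if abs(mean(recent) - reference) > MEAN_LIMIT * sd:
            return end - 1, "mean"
        if stdev(recent) > SD_LIMIT * sd:  # only increases in SD are flagged
            return end - 1, "sd"
    return None
```

With a noise-free 2 SD upward shift after 25 in-control scans, the moving average in this sketch crosses its limit once 8 shifted scans have entered the window.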

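CUSUM chart: In this paper, we use a version of CUSUM known as tabular CUSUM.(4) Mathematically, we denote by $X_i$ the BMD value of the ith scan. We define the upper one-sided tabular CUSUM $S_H(i)$ and the lower one-sided tabular CUSUM $S_L(i)$ as the following:

$$S_H(i) = \max\left[0,\ \frac{X_i - \mu_0}{\sigma} - k + S_H(i-1)\right]$$

$$S_L(i) = \max\left[0,\ -\frac{X_i - \mu_0}{\sigma} - k + S_L(i-1)\right]$$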

Here $\mu_0$ is the reference BMD value, $\sigma$ is the standard deviation, and $k$ is selected as 0.5. The initial values of $S_H(0)$ and $S_L(0)$ are 0. The chart sends an alarm message if $S_H(i)$ or $S_L(i)$ is greater than 5. In other words, when the standardized BMD value deviates by more than $k$ from zero, the cumulative upper bounded sum increases by the amount of the deviation above $k$. However, if the deviation is less than $k$, the cumulative sum is reduced accordingly. When the cumulative sum is less than zero, we ignore the past data and set the cumulative sum to zero. However, when the cumulative sum is greater than five, we believe that there are enough indications of a deviation from the reference mean in the data.

When CUSUM indicates a change, it also estimates when the change occurred and the magnitude of the change. We use the estimated magnitude of change to establish the new reference values, as illustrated in the Appendix.
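The tabular CUSUM with k = 0.5 and an alarm threshold of 5 can be sketched as follows. The function name and the returned dictionary are hypothetical. The estimates of the starting scan and of the new mean use the common textbook approach of counting how long the cumulative sum has been above zero; this is an assumption, since the Appendix that defines the authors' estimator is not part of this section.

```python
def tabular_cusum(bmd, reference, cv=0.005, k=0.5, h=5.0):
    """Run upper and lower tabular CUSUMs; return alarm details or None."""
    sd = cv * reference
    s_hi = s_lo = 0.0
    n_hi = n_lo = 0  # consecutive scans for which each sum has been above zero
    for i, x in enumerate(bmd):
        z = (x - reference) / sd             # standardized BMD value
        s_hi = max(0.0, s_hi + z - k)
        s_lo = max(0.0, s_lo - z - k)
        n_hi = n_hi + 1 if s_hi > 0 else 0
        n_lo = n_lo + 1 if s_lo > 0 else 0
        if s_hi > h or s_lo > h:
            upper = s_hi > h
            s, n = (s_hi, n_hi) if upper else (s_lo, n_lo)
            shift_sd = k + s / n             # assumed estimate of the shift, in SD
            signed = shift_sd if upper else -shift_sd
            return {
                "alarm_index": i,
                "direction": "up" if upper else "down",
                "estimated_start": i - n + 1,  # first scan of the nonzero run
                "estimated_mean": reference + signed * sd,
            }
    return None
```

For a noise-free 2 SD upward shift, each scan adds 1.5 to the upper sum, so the alarm comes on the fourth shifted scan and the estimated start coincides with the first shifted scan.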

A separate CUSUM chart was performed for one-sided change in variance. Since tests 1, 5, and 6 of the rules of the Shewhart 1 also reflect changes in variance, it is reasonable to compare these methods in detecting both change in mean and change in variance. The one-sided variance chart was constructed according to Ryan. In this approach, the observed difference of two successive scans, $X_i - X_{i-1}$, was transformed to a statistic that approximately follows a standard normal distribution $N(0, 1)$. For the variance chart, we selected $k = 0.75$ to reduce the number of alarms due to single outliers. When an alarm for a change in variance was identified, we did not change the reference values unless the CUSUM chart for means also gave an alarm.

Simulation experiments

Studies based on real data help us to understand the performance of these methods in real applications. However, such studies have limitations because they are derived from a limited number of data sets and we do not have true information about the change points. Therefore, the comparison of methods using real data is less informative because there is no gold standard. To resolve these limitations, we used simulation experiments to characterize and describe the quantitative indices of the performance of the above methods, except for the visual inspection.

In our simulation experiments, we independently generated 200 normally distributed random numbers, with the first 100 simulating well-controlled scans and the last 100 simulating one of the three most common problems indicating out-of-control scanners, namely, shifting of the mean, increase in variance, and linear drifting. The first 100 scans always follow a normal distribution $N(1, 0.005^2)$. Depending on the simulation model, the data for the last 100 scans follow different normal distributions. To study the effect of mean shifting, the last 100 scans follow normal distributions of $N(1.0025, 0.005^2)$, $N(1.005, 0.005^2)$, and $N(1.01, 0.005^2)$, respectively, for simulation Models 1, 2, and 3. They correspond to mean shifts of 1/2, 1, and 2 SD. To study the effect of an increase in variance in simulation Models 4, 5, and 6, the last 100 scans follow normal distributions of $N(1, 0.0075^2)$, $N(1, 0.01^2)$, and $N(1, 0.0125^2)$, respectively, corresponding to an increase in standard deviation of 50, 100, and 150% from the well-controlled situation.

TABLE 1.

Clinic  # of scans  First date  Last date  Mean BMD (g/cm²)  Standard deviation  CV (%)  Autocorrelation
1       837         09/30/88    02/09/93   1.050             0.0049              0.4t1   0.32x
2       070         02/07/89    07/14/92   1.038             0.0063              0.61    0.388
3       707         01/03/89    11/05/92   1.038             0.0056              0.54    0.289
4       725         06/10/88    03/09/93   1.032             0.0046              0.45    0.024
5       674         11/01/88    07/22/92   1.042             0.0050              0.48    0.240

Simulation Models 7, 8, and 9 are designed to study the effect of linear drifting. In these models, linear drifts start from the 101st scan with positive slopes of mean increase of 0.00005, 0.0001, and 0.0002, respectively. They correspond to situations where the mean BMD value for the 150th scan is 1/2, 1, and 2 SD above the well-controlled situation. In these models, the variance of the last 100 scans is always $0.005^2$, the same as for the first 100 scans.

Each model is repeated 2000 times. The first 100 scans were used to evaluate the false alarm rate (or the type I error rate) for 100 well-controlled scans. The last 100 scans were used to evaluate the true alarm rate (sensitivity). In addition to the sensitivity and false alarm rate, we also evaluated the distribution of the number of scans needed to identify a change point. For CUSUM, we further examined the difference between the estimated change point and the true change point that we know for the simulated data sets, or in other words we examined the bias of the estimated change point.
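The nine simulation models can be reproduced in outline with the Python standard library. This is a sketch only: the authors used S-Plus, and the names `MODELS`, `expected_mean`, and `simulate` are hypothetical. It assumes the drift is measured from scan 100, so that the mean at the 150th scan is 1/2, 1, or 2 SD above 1, as stated in the text.

```python
import random

SD0 = 0.005  # in-control standard deviation around a mean of 1

# (mean shift, SD multiplier, drift per scan) for Models 1-9 in the text
MODELS = {
    1: (0.0025, 1.0, 0.0), 2: (0.005, 1.0, 0.0), 3: (0.01, 1.0, 0.0),
    4: (0.0, 1.5, 0.0), 5: (0.0, 2.0, 0.0), 6: (0.0, 2.5, 0.0),
    7: (0.0, 1.0, 0.00005), 8: (0.0, 1.0, 0.0001), 9: (0.0, 1.0, 0.0002),
}


def expected_mean(model, scan):
    """Expected BMD for a 1-based scan number under the given model."""
    shift, _, slope = MODELS[model]
    if scan <= 100:
        return 1.0  # the first 100 scans are always in control
    return 1.0 + shift + slope * (scan - 100)


def simulate(model, rng):
    """Generate one series of 200 scans for the given model."""
    _, sd_mult, _ = MODELS[model]
    series = []
    for scan in range(1, 201):
        sd = SD0 if scan <= 100 else SD0 * sd_mult
        series.append(rng.gauss(expected_mean(model, scan), sd))
    return series
```

Calling `simulate(model, random.Random(seed))` 2000 times per model with different seeds would give the replicates; a chart's false alarm rate is then the fraction of series with an alarm in the first 100 scans, and its sensitivity the fraction with an alarm in the last 100.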

RESULTS

Table 1 describes the data for this study. The number of scans ranged from 674 to 837. The coefficients of variation (CV) for the entire study period ranged from 0.45 to 0.61%. Short-term precision at the baseline was calculated by using the first month of phantom data. It ranged from 0.20 to 0.44%. In Table 1, the autocorrelation coefficient stands for the estimated correlation coefficient between two consecutive BMD readings. Spearman's autocorrelation coefficient ranged from 0.0241 to 0.3881. The high autocorrelation coefficients were related to drifts during the study. By excluding the BMD data from the obvious linear drifts, the Spearman's autocorrelation coefficients ranged from -0.1 to 0.1 and were not statistically different from zero (p > 0.05). If the autocorrelation creates waveform time series data, the quality control methods discussed in this paper become inappropriate.
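A lag-1 Spearman autocorrelation of the kind described above can be computed as the rank correlation between each reading and the next. The sketch below uses only the standard library; the helper names are hypothetical, and ties are given average ranks, which is one common convention and may differ from what the authors' software did.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            result[idx] = average
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


def lag1_spearman(bmd):
    """Spearman correlation between consecutive BMD readings."""
    return pearson(ranks(bmd[:-1]), ranks(bmd[1:]))
```

A steadily drifting series gives a coefficient near 1, which is consistent with the observation that the high autocorrelations were related to drifts.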

Table 2 compares the summary statistics of the change points identified by the five different methods. The Shewhart 1 approach identified the most change points, while the Shewhart 2 approach identified the fewest. In most centers, visual inspection identified fewer change points than both the moving average method and CUSUM, which had a similar number of change points. In terms of the magnitude of changes, both moving average and Shewhart 1 identified a substantial number of change points for which the percentage change in mean BMD before and after was less than 0.25%. This is not surprising since both methods identify all statistically significant changes in mean and standard deviation. They did not force a change point to have a mean change of more than any prespecified magnitude. Also, in some cases, an increase in standard deviation detected by the moving average or Shewhart 1 could be missed by Shewhart 2 if the corresponding mean changes were small. The Shewhart 2 method has a filter that requires the mean of the next 10 points after an alarm to differ by at least 1 SD (0.5%) from the current reference mean. However, Table 2 shows some examples where the mean difference between intervals is less than 0.5%. This is because the next interval may have more or fewer than 10 points. Therefore, its mean may be different from that of the first 10 points. The change point remains, nonetheless. Visual inspection and CUSUM methods identified the change points for the majority of instances when the percentage mean change was above 0.25%. In general, the magnitudes of mean changes identified by the CUSUM method are greater than those identified by other methods. Furthermore, the intervals identified by the CUSUM method had the minimum average CV (not presented in the table), which suggested that these intervals were the most homogeneous in comparison with intervals by other methods.

Figures 1 and 2 describe the results of these methods graphically for two representative centers. Figures for other centers are consistent with these two centers. In Figs. 1 and 2, the observed data are plotted. Five lines on the top of each figure correspond to the five quality control methods. For Center 1 (Fig. 1), the Shewhart 2 method missed a change point around May 1989 while the other four methods gave alarm signals. All five methods identified a major change point around August 1990. The CUSUM method gave an alarm for a downward trend (starting around December 1990) earlier than the other methods. For Center 2 (Fig. 2), all methods identified major change points around December 1989. The visual inspection missed change points


TABLE 2. COMPARISON OF CHANGE POINTS BY DIFFERENT METHODS

Method  n  Mean # of scans btw alarms  % mean changes: ≤0.25%  0.25-0.5%  0.5-1.0%  >1.0%

Clinic 1: Visual, Shewhart 1, Shewhart 2, MA, CUSUM

Clinic 2: Visual, Shewhart 1, Shewhart 2, MA, CUSUM

Clinic 3: Visual, Shewhart 1, Shewhart 2, MA, CUSUM

Clinic 4: Visual, Shewhart 1, Shewhart 2, MA, CUSUM

Clinic 5: Visual, Shewhart 1, Shewhart 2, MA, CUSUM

4 12 3 6 4

5 I6 4

1 1 1 1

6 18 4

12 15

2 1 1

1 3 4

167 64

209 120 I67

1 I3 40

136 57 57

1 I4 42

I59 66 50

242 81

363 121 I04

225 56

337 169 135

1(25%) 7 (59%) 1(33%) 3 (50%)

1 (20%)

2 (50%) 6 ( 5 5 % ) 1(9%)

9 (56%)

2 (33%) 10 (56%)

5 (45%) 1(7%)

2 (100%)

1(100%) 5 (100%) 2 (33%)

7 (88%)

1(50%) 6 (55%)

1(33%)

2 (50%) 4 (33%) 1(33%') 2 (33%') 3 (75%)

1 (20%) 1(6%)

3 (27%') 3 (27%))

2 (33%) 5 (2%) 2 ( S O % j )

4 (36%)) 6 (40%')

1(12%)

4 (67%')

1 (50%) 5 (45%) 1(100%)

4 (100%) 2 (67%)

1(25%) 1(8%) 1(33%) 1(17%)

I (25%))

1 (20%) 4 (2%) 1(25%) 2 (18%) 4 (37%)

1(17%') 1(6%) 2 (50%) 1(9%) 6 (40%)

2 (40%) 2 (13%') I(25%>)

3 (27%)

1(17%) 2 (12%)

1(9%) 2 (13%)

n, number of change points; Shewhart 1, Shewhart Chart with the sensitizing rules; Shewhart 2, Shewhart Chart with the sensitizing rules and a filter for mean changes; MA, moving average of mean and standard deviations.

around April 1992 that were identified by all the other methods. In addition, the Shewhart 2, moving average, and CUSUM methods missed a change point around December 1990, while it was identified by both visual inspection and the Shewhart 1 method. For Center 4, which had a very low CV, the visual inspection and the Shewhart 2 method identified fewer change points than the other three methods. It is possible that some of the signals identified by the CUSUM and moving average methods are not clinically significant. In summary, all the methods identified major change points. However, the Shewhart 1 approach produced alarm signals the most frequently.

Although visual inspection and CUSUM produced similar results, the visual check method was not as sensitive to drift. It took more time for visual inspection to detect a drift, so that some of the change points could be identified only after review of an additional two to three scatter plots.

Results from the simulation experiments in Table 3 demonstrate that Shewhart 1 has the highest false alarm rate (93%). In other words, if a site scans a phantom once every working day for about half a year and the scanner is in perfect condition, the Shewhart 1 procedure will have about a 93% chance of sending at least one alarm signal. If the system is in perfect condition, on average the Shewhart 1 procedure produces a false alarm once every 39 scans, the highest false alarm rate. In contrast, Shewhart 2 has the lowest false alarm rate (6%), producing an average of one false alarm for every 631 scans, equivalent to once every 2-3 years. The false alarm rates and average number of scans between two false alarms are comparable for the moving average and CUSUM methods. The false alarm rates were 47 and 30%, respectively, and the average number of scans between false alarms was 238 and 247, respectively, which is about once per year. Thus, based on these results the Shewhart 2 approach has a significantly lower false alarm rate than all other methods. The moving average and CUSUM methods have acceptable false alarm rates.

[Figure: phantom BMD plotted against scan date; rows above the data mark the change points for CUSUM, Shewhart 2, Shewhart 1, Moving Average, and Visual Inspection.]

FIG. 1. Phantom scan data and change point analysis from Center 1 (9/30/88-2/9/93).

[Figure: phantom BMD plotted against scan date; rows above the data mark the change points for CUSUM, Shewhart 2, Shewhart 1, Moving Average, and Visual Inspection.]

FIG. 2. Phantom scan data and change point analysis from Center 2 (2/7/89-7/14/92).

Table 4 demonstrates the sensitivities of the four methods to changes in mean, changes in variance, and linear drifting. All four methods will detect large changes such as those in models 3, 6, and 9. Overall, the Shewhart 1 is the most sensitive method. The CUSUM method is the second most sensitive one, and the Shewhart 2 approach is the least sensitive method. Among those events identified by Shewhart 2, the median number of scans that are needed to identify the out-of-control events is greater than that of the CUSUM method. Moving average performs as well as the CUSUM method. No matter what quality control methods are used, there are always delays in receiving alarm signals for the occurrence of events (Table 4).

The median bias of the estimated starting date by the CUSUM procedure is between -1 and 6 for mean shifts (models 1-3) and 1-13 for increase in variation (models 4-6), but is bigger (between 11-37) for linear drifts. This is because the magnitude of early drift is so small that it is still in the range of variation. As a matter of fact, all the estimated starting dates have mean BMD changes of less than half a standard deviation.


TABLE 3. FALSE ALARM RATE BY SIMULATION EXPERIMENTS

Method          False alarm rate (per 100 scans)  Average running length between two false alarms
Shewhart 1      0.93                              39
Shewhart 2      0.06                              622
Moving average  0.47                              238
CUSUM           0.30                              241

Shewhart 1, Shewhart Chart with the sensitivity rules but no filter for mean changes; Shewhart 2, Shewhart Chart with the sensitivity rules and a filter for mean changes.

One concern about the moving average method is that it takes time to construct the information and this can delay the alarm signal. However, Table 4 suggests that it is not a serious concern. The number of scans that are used by the

DISCUSSION AND CONCLUSION

All five methods studied in this paper can pick up major changes in the longitudinal data analysis and the simulation experiments. However, CUSUM is the best among these methods. The Shewhart 1 approach yields too many false alarms and should not be used in a quality assurance center.

Although the simulation study does not include the visual inspection (due to the magnitude of effort required in doing visual inspection o f 2000 simulated data sets and a fixed pattern of simulated data), we expect that with proper training, experience, and attention, the visual inspection can identify important changes and eliminate small and insignificant alarms. In summary, the visual inspection has the advantage o f bringing human judgment into the evalu- ation and can be performed as the data are being collected. I t has the disadvantage, however, of being less sensitive for identifying drifts compared with the other statistical meth- ods. Most importantly, it depends heavily on the experience of the evaluator. As a consequence, it will be less consistent across centers and time. Also it is difficult to characterize change points quantitatively. When there are a large num- ber of participating sites in a clinical trial, it is very difficult t o have one evaluator at a QA center visually review all the plots in a timely and consistent manner.

The Shewhart 1 approach is sensitive to all the change points, but i t also produces more false alarms than the other methods. The false alarm rate of the Shewhart 1 can be improved by using a mean chart of multiple observations, such as using the mean of weekly phantom BMD data. However, a pooled chart will reduce the sensitivity and is likely to delay the signal of a real event.

The Shewhart 2 approach has the advantage of a low false alarm rate but suffers by having a low sensitivity. The results of both the longitudinal data analysis and the simu- lation experiment demonstrated that this particular version o f the method is not as efficient as the CUSUM or visual inspection approach. Shewhart 2 can be improved by reduc- ing the threshold of mean changes but this will result in an increase in the number of false alarms.

The moving average chart did not show a better perfor- mance than the CUSUM method in our study. Compared with Shewhart charts, the advantage of the moving average is that it reduces the noise level, making it more reliable. Also, it gives a single criterion for sending an alarm signal rather than four to five rules for the Shewhart procedures.

moving average chart to identify the change points is com- parable to CUSUM in most of the simulation models. Since this method detects any significant differences in means, regardless of the magnitude of changes, it also produces clinically insignificant alarms. We note that the perfor- mance of moving average charts is also related to the number of scans used to calculate the moving avcragcs. Reduction in the number of scans used will, on the one hand, reduce the lag time between the occurrence of the change point and the alarm but, o n the other hand, will reduce the statistical power to detect true change points.

The CUSUM method has several advantages over other process control charts. The average number of scans between two false alarms was 247, which is considerably smaller than the number reported in the literature.(4,5) However, unlike these reports, our simulation included three one-sided CUSUM charts, one each for an increase and decrease in the mean and one for an increase in variance. Like the Shewhart 2, the CUSUM chart provides the capability for us to change parameters according to clinically significant magnitudes of mean and variance changes. This feature is not easy to implement for Shewhart 1 and moving average methods. Different from all the other methods, CUSUM provides a much better estimate of the date when an out-of-control event might have occurred. This is very important information in quality control practice since it can help in identifying the possible reasons that the change occurred and in taking the necessary steps to improve the scanner performance. Our simulation experiments have shown that the CUSUM method takes the fewest scans to detect a change and the median bias in the estimate of the change point date is almost zero for both mean and variance change, although it can be more than zero for linear drifts. This bias can be reduced using the method of Wu.(15) Unlike the visual inspection method, the CUSUM method can utilize the power of computers and efficiently process a large amount of data that are routinely acquired during a clinical trial. The disadvantage in using a CUSUM chart is that it is less intuitive than the Shewhart charts. This disadvantage is magnified where operators lack statistical and quality control training.
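The effect of running several one-sided charts at once on the false alarm spacing can be illustrated with a small simulation. The sketch below is not the simulation used in this study: the reference value k = 0.5, the decision limit h = 5, the number of runs, and the function names are all assumptions chosen for illustration, and only the two mean charts are combined.

```python
import random

def run_length(rng, k=0.5, h=5.0, two_sided=False, max_scans=100_000):
    """Number of in-control N(0, 1) scans until the first (false) alarm."""
    s_hi = s_lo = 0.0
    for i in range(1, max_scans + 1):
        z = rng.gauss(0.0, 1.0)
        s_hi = max(0.0, z - k + s_hi)    # accumulates evidence of an upward shift
        s_lo = max(0.0, -z - k + s_lo)   # accumulates evidence of a downward shift
        if s_hi > h or (two_sided and s_lo > h):
            return i
    return max_scans

def average_run_length(two_sided, runs=300, seed=1):
    """Mean run length over repeated in-control series."""
    rng = random.Random(seed)
    return sum(run_length(rng, two_sided=two_sided) for _ in range(runs)) / runs

arl_one = average_run_length(two_sided=False)   # upper chart only
arl_two = average_run_length(two_sided=True)    # upper and lower charts together
print(f"one chart: {arl_one:.0f} scans, two charts: {arl_two:.0f} scans")
```

Under these assumptions the combined charts signal a false alarm sooner on average than a single chart, which is the direction of the difference described above; adding a third chart for variance would shorten the spacing further.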

Despite the disadvantages in terms of false alarm rate, the Shewhart 1 is still the most commonly used method in clinical trials. All Hologic scanners automatically provide a Shewhart chart without sensitizing rules and a regression program for the quality control data. Applying the additional sensitizing rules is relatively easy. A combination of the Shewhart rules with visual judgment will further reduce the false alarm rate. However, at a quality assurance (QA) center, the use of Shewhart 1 (with visual inspection) and visual inspection alone are not recommended due to the magnitude of work involved. After comparing the advantages in sensitivities, false alarm rates, and the ability to identify the dates and magnitudes of the true change points, the CUSUM method is clearly a better choice. A combination of Shewhart, visual inspection, and CUSUM charts


TABLE 4. SENSITIVITY BY SIMULATION EXPERIMENTS

Model   1      2      3      4      5      6      7      8      9

        0.99   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
        0.32   0.93   1.00   0.47   0.02   1.00   0.54   1.00   1.00
        0.84   1.00   1.00   1.00   1.00   1.00   0.95   1.00   1.00
        0.86   1.00   1.00   1.00   1.00   1.00   0.97   1.00   1.00

13 5 2 6 3 2

22 20 16

31 1 1 2

38 22 12 70 49 27

24 19 9 5 6 3

15 13 2 6 3 4

55 51 38 34 26 22

Shewhart 1, Shewhart chart with the sensitivity rules but no filter for mean changes; Shewhart 2, Shewhart chart with the sensitivity rules and a filter for mean changes; Model 1, shift of mean of 1/2 SD; Model 2, shift of mean of 1 SD; Model 3, shift of mean of 1 1/2 SD; Model 4, 50% increase in standard deviation; Model 5, 100% increase in standard deviation; Model 6, 150% increase in standard deviation; Model 7, linear drifting with a slope 0.00005 per scan; Model 8, linear drifting with a slope 0.0001 per scan; and Model 9, linear drifting with a slope 0.0005 per scan.

may also be useful, particularly for QA in multicenter studies. Here, the study sites should use the Shewhart chart with the sensitizing rules and visual inspection while they are collecting quality control data, while the QA center should regularly use the CUSUM charts to monitor quality control data from the study sites and to identify the change points as well as the magnitude of changes. If a study site identifies a change point by the Shewhart 1 when the data are collected, the QA center should be informed and data should be transferred for a quick review by the QA center. We believe that this approach will reduce the length of the action cycle required for the investigation of potential change points, possible repair, or recalibration of the scanner and will improve the final quality of the data in clinical trials.

Since the SAS Shewhart procedure requires an equal number of scans per day, we only used the last scan for those days when multiple scans were obtained in our comparison of QC methods based on longitudinal data. Using partial scan data may diminish deviations from good control and delay the detection of change points. Nevertheless, the conclusions of the study are valid using all or part of the scans from one day. Because none of the QC methods require an equal number of scans per day in theory, an improvement of software will resolve this concern. When writing custom software is not an option, the last phantom scan done on a day is a better choice. When the results of the first scan are out of the manufacturer's specification, technicians generally rescan the phantom to assure that no human errors are involved. As such, the last scan is more likely to represent the true machine behavior.
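The selection of the last scan of each day can be sketched as follows. This is an illustrative helper, not part of the SAS procedure; the function name, the (timestamp, BMD) record layout, and the example values are assumptions.

```python
from datetime import datetime

def last_scan_per_day(scans):
    """Keep only the final phantom scan of each calendar day.

    `scans` is an iterable of (timestamp, bmd) pairs.
    """
    latest = {}
    for stamp, bmd in sorted(scans, key=lambda s: s[0]):
        latest[stamp.date()] = (stamp, bmd)   # a later scan overwrites an earlier one
    return [latest[day] for day in sorted(latest)]

scans = [
    (datetime(1989, 5, 2, 8, 5), 1.061),    # hypothetical first scan, later repeated
    (datetime(1989, 5, 2, 8, 20), 1.047),   # rescan on the same day
    (datetime(1989, 5, 3, 8, 10), 1.028),
]
daily = last_scan_per_day(scans)
print([bmd for _, bmd in daily])
```

Keeping the rescan rather than the first scan follows the reasoning above that the last scan of a day is less likely to reflect a handling error.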

Statistical literature has indicated that if a high autocorrelation exists, all quality control procedures, such as the Shewhart chart, moving average chart, and CUSUM, will have much higher false alarm rates than the level that they are designed for. Several methods have been proposed to address the problem of autocorrelated data.(16,17) As our data did not show any waveform behavior by visual evaluation or by correlation analysis after excluding linear drifts, and after consultation with experts in statistical quality control research, we concluded that autocorrelation is not a major concern, at least for this study. However, one should check the autocorrelation and examine the plot of data over time before using any process control chart.
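A check of this kind can be sketched as a lag-1 autocorrelation computed after a straight-line drift has been removed. The code below is a minimal illustration, not the correlation analysis performed in this study; the function names and the synthetic series are assumptions.

```python
def detrend(values):
    """Residuals from a least-squares straight line fitted against scan index."""
    n = len(values)
    t_mean = (n - 1) / 2
    x_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(values))
    slope = sxy / sxx
    return [x - (x_mean + slope * (t - t_mean)) for t, x in enumerate(values)]

def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation of a series."""
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]
    return sum(dev[i] * dev[i + 1] for i in range(n - 1)) / sum(d * d for d in dev)

# Synthetic series: a linear drift plus an alternating disturbance.
series = [1.030 + 0.001 * i + (0.002 if i % 2 else -0.002) for i in range(20)]
raw_r1 = lag1_autocorrelation(series)
detrended_r1 = lag1_autocorrelation(detrend(series))
print(f"raw: {raw_r1:.2f}, after removing drift: {detrended_r1:.2f}")
```

In this synthetic example the drift alone produces a positive lag-1 value, which is why the drift is removed before the autocorrelation is judged.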

Our primary concern in this paper was to compare statistical methods for identifying change points in BMD data for quality assurance. We must emphasize, however, that statistical procedures only provide us with clues of potential problems with scanner performance. Identification of change points focuses our attention on these potential problems and forces further investigation. No matter which method is used, there is always a chance of false positive results. It should also be noted that a single event may cause several alarm signals. In addition, there are other factors, such as operator errors in scanning the phantom, which are not related to the scanner and do not affect the patient data. Therefore, quality control of longitudinal scanner performance cannot and should not be an automatic process. The identification of change points is only one of several components of successful quality control of the machine performance over time. It is equally important to have procedures to confirm the validity of the change points that were statistically identified, and these points should never be accepted in isolation. Investigation of potential causes of instrument malfunction should be carried out by appropriately trained personnel, such as medical physicists or engineers.

Since the relationship between changes in phantom and human BMD values is not well understood, it is difficult to define what level of change in phantom BMD can be considered clinically significant. Unlike human scan data, the phantom scan data are not affected by biological attributes


like body size, fat content, etc. In practice, we target one standard deviation change of mean phantom BMD as a clinically relevant change in mean value and a doubling of the standard deviation (i.e., one additional standard deviation) as a clinically relevant change in variance (precision). These target levels, however, should be updated once we have a better understanding of the relationship between phantom data and patient data.

To apply the CUSUM method in a QA center, we suggest the following procedures. If a study site has a CV of all longitudinal quality control data that is less than 0.4% for a Hologic scanner or 0.5% for a Lunar scanner, we do not examine them for change points. These levels are specific for the Hologic spine phantom and may differ for other standards. This suggestion is based on our experience that statistically established change points for such sites often result in clinically insignificant correction of patient data. For sites that have CV values beyond the above standards, we apply CUSUM charts. If we observe mean changes beyond 1 SD or doubling of the standard deviation, we investigate and identify assignable causes. If there is a problem unrelated to the machine, there is no need to make any change in the scanner and, therefore, the manufacturer of the scanner should not be involved. If it is likely to be a machine-related problem, the manufacturer's cooperation is solicited to resolve the problem. For changes of less than 1 SD, but that are identified by the CUSUM chart, we still investigate possible assignable causes and carefully monitor the scanner for a month. Depending on the nature of the assignable causes and the performance of the follow-up data, we decide whether we should establish them as real change points. Nonetheless, we still recommend a thorough investigation of these change points for purposes of prevention. If the investigation indicates a scanner is out of control, the manufacturer should be notified.
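The screening steps above can be written down as a short sketch. It is only an illustration of the decision flow under stated assumptions: the function names, the dictionary of limits, and the returned labels are hypothetical, and the investigation itself remains a task for trained personnel.

```python
from statistics import mean, stdev

# CV limits quoted in the text: 0.4% for Hologic, 0.5% for Lunar scanners.
CV_LIMITS = {"hologic": 0.004, "lunar": 0.005}

def needs_cusum_review(bmd_values, scanner):
    """True when the longitudinal CV is not below the scanner-specific limit."""
    cv = stdev(bmd_values) / mean(bmd_values)
    return cv >= CV_LIMITS[scanner]

def classify_change(mean_shift_sd, sd_ratio):
    """Follow-up for a change flagged by the CUSUM chart.

    mean_shift_sd: absolute mean change in units of the in-control SD.
    sd_ratio: new SD divided by the in-control SD.
    """
    if mean_shift_sd > 1.0 or sd_ratio >= 2.0:
        return "investigate and identify assignable causes"
    return "investigate and monitor the scanner for a month"

stable_site = [1.040, 1.041, 1.039, 1.040, 1.042, 1.038]   # hypothetical values
noisy_site = [1.030, 1.045, 1.036, 1.050, 1.028, 1.041]    # hypothetical values
print(needs_cusum_review(stable_site, "hologic"), needs_cusum_review(noisy_site, "lunar"))
```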

We compared different methods for identification of the potential change points in the process of managing a clinical trial. Another potential use of change points is to correct patient data by removing machine-induced errors.(2,3) For doing this, we use the change points to define intervals during which scanner performance is consistent. This will, in general, reduce the number of change points identified by the process control charts since they may have given multiple alarms for one event. In addition, we can ignore clinically insignificant corrections of patient data, while for monitoring purposes we should pay attention to change points that might not be clinically significant at the time but could lead to significant corrections later in the trial. Statistical correction of patient data is a topic beyond the scope of this paper.

We conclude that visual inspection of the Shewhart chart is useful at a study site to quickly identify possible machine malfunctions but should not be used for large multicenter clinical trials that use QA centers. Of all the methods we compared, the CUSUM approach has the best combination of sensitivity, low false alarm rate, and identification of the time and magnitude of change and should be used by QA centers.

APPENDIX

Moving average

We denote by $X_j$, $j = 1, 2, \ldots, n$, the BMD values of $n$ longitudinal phantom scans from a study center. We define the moving average mean and standard deviation based on 25 scans as follows:

$$M_i = \frac{\sum_{j=i-24}^{i} X_j}{25}, \qquad i = 25, 26, \ldots, n \tag{1}$$

as the moving average of 25 scans to the date when the $i$th scan was collected, and

$$S_i = \sqrt{\frac{\sum_{j=i-24}^{i} (X_j - M_i)^2}{24}}, \qquad i = 25, 26, \ldots, n \tag{2}$$

as the moving standard deviation of the 25 scans to the date when the $i$th scan was collected. Note that the first moving average can only be calculated after the first 25 scans have been collected.
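The moving mean and moving standard deviation over the latest 25 scans can be computed as in the sketch below. The function name and the example series are assumptions made for illustration.

```python
from statistics import mean, stdev

WINDOW = 25   # number of scans in each moving window, as in the text

def moving_stats(bmd):
    """Return (i, M_i, S_i) for i = 25, ..., n, with i the 1-based scan number."""
    out = []
    for i in range(WINDOW, len(bmd) + 1):
        window = bmd[i - WINDOW:i]                      # scans i-24 through i
        out.append((i, mean(window), stdev(window)))    # stdev divides by 24
    return out

bmd = [1.040] * 25 + [1.045]    # hypothetical series with one higher final scan
stats = moving_stats(bmd)
print(stats[0], stats[1])
```

No value is produced for the first 24 scans, which reflects the start-up delay of the moving average chart.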

Now if we assume that the $X_j$'s independently follow a normal distribution $N(\mu, \sigma^2)$, it can be shown that the $M_i$'s follow a normal distribution $N(\mu, \sigma^2/25)$ and the $24 S_i^2/\sigma^2$'s follow a chi-square distribution with 24 degrees of freedom, denoted by $\chi^2_{24}$. However, note that the $M_i$'s and the $24 S_i^2/\sigma^2$'s are not independent samples from the normal distribution and the chi-square distribution, respectively.

Let $\mu_0$ be the reference mean BMD. If the scanner is in control, we should accept the null hypothesis, $H_0: \mu = \mu_0$. If the scanner is out of control, we will accept the alternative hypothesis, $H_1: \mu \neq \mu_0$. We select a type one error level of $\alpha = 0.0027$ to be comparable to the original Shewhart method. We will reject the null hypothesis if $|M_i - \mu_0| > (Z_{1-\alpha/2})\sigma/5 = 0.5991\sigma$. Here $\sigma/5$ is the standard deviation of the $M_i$'s.

We assumed that the CV for a Hologic scanner is a constant. Therefore, if the scanner is in control, we expect the standard deviation to be $\sigma_0 = CV \times \mu_0 = 0.005\mu_0$. To check whether the precision of the scanner is in control or not, we will test the null hypothesis, $H_0: \sigma = \sigma_0$, versus the alternative, $H_1: \sigma > \sigma_0$. With the same level of type one error rate as the mean difference, we will reject the null hypothesis if $24 S_i^2/\sigma_0^2 > \chi^2_{24,1-\alpha}$, or equivalently, if $S_i > 1.41\sigma_0$.
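The two rejection rules can be applied to a single window as in the sketch below. It assumes that the in-control standard deviation (0.005 times the reference mean) is used in both rules; the function name and the example inputs are hypothetical. The critical value is recomputed from the standard normal distribution for comparison with the multiplier quoted in the text.

```python
from statistics import NormalDist

ALPHA = 0.0027   # type one error level from the text
WINDOW = 25

# Two-sided normal critical value divided by sqrt(25), in units of sigma.
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)
mean_limit = z_crit / WINDOW ** 0.5

def moving_average_alarms(m_i, s_i, mu0, cv=0.005):
    """Return (mean_alarm, sd_alarm) for one window, using the quoted limits."""
    sigma0 = cv * mu0                                # expected in-control SD
    mean_alarm = abs(m_i - mu0) > 0.5991 * sigma0    # rule for the mean
    sd_alarm = s_i > 1.41 * sigma0                   # rule for the precision
    return mean_alarm, sd_alarm

print(round(mean_limit, 4), moving_average_alarms(1.044, 0.005, 1.040))
```

The recomputed multiplier comes out very close to 0.6, slightly different from the printed 0.5991; the sketch keeps the printed constants in the alarm rules.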

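We can calculate the statistical power of this procedure. For example, to detect a 1 SD increase of the mean BMD, the power of the procedure is given by

$$P\left(|M_i - \mu_0| > 0.5991\sigma \mid \mu = \mu_0 + \sigma\right) = P(Z > -2.0045) + P(Z < -7.9955) \approx 0.98,$$

where $Z$ is a standard normal random variable. Similarly, to detect a 100% increase in standard deviation, the procedure has a power of

$$P\left(24 S_i^2/\sigma_0^2 > 24(1.41)^2 \mid \sigma = 2\sigma_0\right) = P\left(\chi^2_{24} > 11.93\right) \approx 0.98.$$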
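These power calculations can be checked numerically. The sketch below assumes that all 25 scans in the window were collected after the change, and it uses the closed-form upper tail of the chi-square distribution that holds for an even number of degrees of freedom; the function and variable names are illustrative.

```python
import math
from statistics import NormalDist

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), exact for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):      # adds (x/2)^k / k! for k = 1 .. df/2 - 1
        term *= half / k
        total += term
    return math.exp(-half) * total

phi = NormalDist().cdf

# Mean chart: M_i ~ N(mu0 + sigma, sigma^2/25); alarm when |M_i - mu0| > 0.5991 sigma.
power_mean = (1 - phi((0.5991 - 1.0) * 5)) + phi((-0.5991 - 1.0) * 5)

# SD chart: alarm when 24 S_i^2 / sigma0^2 > 24 * 1.41^2; with sigma = 2 sigma0
# the statistic divided by 4 follows a chi-square distribution with 24 df.
cutoff = 24 * 1.41 ** 2
power_sd = chi2_upper_tail(cutoff / 4.0, 24)
alpha_sd = chi2_upper_tail(cutoff, 24)   # false alarm probability of the SD rule

print(f"{power_mean:.3f} {power_sd:.3f} {alpha_sd:.4f}")
```

Under these assumptions both powers come out near 0.98, and the false alarm probability of the 1.41 rule is close to the chosen level of 0.0027.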


[TABLE A1: CUSUM calculation for Center 3, printed rotated in the original; its entries are not legible in this copy.]

FIG. A1. Phantom scan data and CUSUM chart for Center 3 (3/13/89-5/15/89). The horizontal axis is scan date.

TABLE A2 (scans 55-70)

Scan  Date      BMD    Difference  SD       Z_i    Z_i - 0.75  S_H(i)  N_H(i)
55    04/10/89  1.037   0.004      0.00517  -0.24  -0.99       0.000   0
56    04/14/89  1.042   0.005      0.00517   0.01  -0.74       0.000   0
57    04/17/89  1.044   0.002      0.00517  -0.86  -1.61       0.000   0
58    04/18/89  1.041  -0.003      0.00517  -0.52  -1.27       0.000   0
59    04/19/89  1.040  -0.001      0.00517  -1.30  -2.05       0.000   0
60    04/20/89  1.036  -0.004      0.00517  -0.24  -0.99       0.000   0
61    04/28/89  1.039   0.003      0.00520  -0.52  -1.27       0.000   0
62    05/01/89  1.035  -0.004      0.00520  -0.24  -0.99       0.000   0
63    05/02/89  1.047   0.012      0.00520   1.30   0.55       0.554   1
64    05/03/89  1.028  -0.019      0.00520   2.25   1.50       2.053   2
65    05/04/89  1.035   0.007      0.00520   0.44  -0.31       1.742   3
66    05/05/89  1.038   0.003      0.00520  -0.53  -1.28       0.467   4
67    05/09/89  1.031  -0.007      0.00520   0.44  -0.31       0.156   5
68    05/10/89  1.041   0.010      0.00520   0.99   0.24       0.391   6
69    05/12/89  1.043   0.002      0.00520  -0.86  -1.61       0.000   0
70    05/15/89  1.034  -0.009      0.00520   0.81   0.06       0.060   1

greater than the reference value, or by $\mu_0 - \sigma(k + S_L(i)/N_L(i))$ when the new BMD values are smaller than the reference value.

Graphical presentation of the CUSUM chart of Table A1 is shown in Fig. A1.
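The estimate of the new mean after a signal can be illustrated with a small tabular CUSUM. This is a sketch under assumptions rather than the chart used in this study: the reference value k = 0.5, the upward form of the estimate (taken by symmetry with the downward expression above), the convention that scan i - N_H(i) + 1 is the estimated first out-of-control scan, and the example data are all assumptions.

```python
def mean_cusum(bmd, mu0, sigma, k=0.5):
    """Tabular CUSUM on standardized values; rows are (i, S_H, N_H, S_L, N_L)."""
    s_hi = s_lo = 0.0
    n_hi = n_lo = 0
    rows = []
    for i, x in enumerate(bmd, start=1):
        z = (x - mu0) / sigma
        s_hi = max(0.0, z - k + s_hi)          # upper cumulative sum
        s_lo = max(0.0, -z - k + s_lo)         # lower cumulative sum
        n_hi = n_hi + 1 if s_hi > 0 else 0     # scans since S_H was last zero
        n_lo = n_lo + 1 if s_lo > 0 else 0     # scans since S_L was last zero
        rows.append((i, s_hi, n_hi, s_lo, n_lo))
    return rows

def estimated_mean(mu0, sigma, k, s, n, upward):
    """New mean estimate: mu0 + sigma(k + S/N) upward, mu0 - sigma(k + S/N) downward."""
    shift = sigma * (k + s / n)
    return mu0 + shift if upward else mu0 - shift

bmd = [1.040, 1.041, 1.039, 1.046, 1.047, 1.045, 1.046]   # hypothetical upward shift at scan 4
rows = mean_cusum(bmd, mu0=1.040, sigma=0.0052)
i, s_hi, n_hi, _, _ = rows[-1]
new_mean = estimated_mean(1.040, 0.0052, 0.5, s_hi, n_hi, upward=True)
first_shifted_scan = i - n_hi + 1
print(round(new_mean, 4), first_shifted_scan)
```

In this example the estimate equals the average of the scans collected since the cumulative sum last left zero, which is what makes the counter useful for dating a change.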

We constructed the one-sided standard deviation chart according to Ryan.(5) The transformation $Z_i = \{|(X_i - X_{i-1})/(\sqrt{2}\sigma)|^{1/2} - 0.82218\}/0.34914$ follows a standard normal distribution. Since a change in mean BMD is more important in a clinical trial, we use a $k = 0.75$ to reduce the number of alarm signals due to changes in variance. The CUSUM chart is constructed in the same way as explained above by considering $Z_i$ instead of $X_i$. For example, in Table A2, we calculated values of $Z_i$. Since $Z_i$ follows a standard normal distribution, the upper bound CUSUM for variance is $S_H(i) = \max[0, Z_i - 0.75 + S_H(i-1)]$, which is given in the eighth column. As before, $N_H(i)$ indicates when a positive cumulative sum occurs and it is useful to find the assignable causes. A graphical presentation is similar to Figure A1 and is not presented here.
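The variance chart can be traced in a few lines of code. The sketch below applies the transformation and the recursion with k = 0.75 to the BMD values listed for scans 62 to 70, using the standard deviation of 0.00520 shown for those scans; the function names are illustrative, and small differences from the tabulated figures are expected because the listed BMD values are rounded.

```python
import math

def variance_z(x_curr, x_prev, sigma):
    """Transform a successive difference into an approximately N(0, 1) score."""
    y = abs(x_curr - x_prev) / (math.sqrt(2.0) * sigma)
    return (math.sqrt(y) - 0.82218) / 0.34914

def upper_cusum(z_values, k=0.75):
    """Return (S_H, N_H) pairs for the one-sided upper CUSUM."""
    s, n, out = 0.0, 0, []
    for z in z_values:
        s = max(0.0, z - k + s)
        n = n + 1 if s > 0 else 0     # consecutive scans with a positive sum
        out.append((s, n))
    return out

bmd = [1.035, 1.047, 1.028, 1.035, 1.038, 1.031, 1.041, 1.043, 1.034]  # scans 62-70
sigma = 0.00520
z = [variance_z(b, a, sigma) for a, b in zip(bmd, bmd[1:])]   # scores for scans 63-70
chart = upper_cusum(z)
for scan, (s_h, n_h) in zip(range(63, 71), chart):
    print(scan, round(s_h, 3), n_h)
```

The sum becomes positive at scan 63, grows at scan 64, and returns to zero at scan 69, so the counter points back to scan 63 as the start of the episode.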


The general procedure for deriving the algebraic boundaries of the CUSUM chart is given in the literature,(4,5) and theoretical comparisons of Shewhart and CUSUM can be found in both books.

ACKNOWLEDGMENT

This study was financially supported by funding from Sandoz Pharmaceuticals Corp. and Procter and Gamble Company.

REFERENCES

1. Glüer CC, Faulkner KG, Estilo MJ, Engelke K, Rosin J, Genant HK 1993 Quality assurance for bone densitometry research studies: concept and impact. Osteoporosis Int 3(5):227-235.

2. Blunt BA, Glüer CC, Brastow PC, Rosin JD, Jergas M, Genant HK 1993 Patient data correction factor for multicenter trials using quantitative computed tomography. J Bone Miner Res 8(Suppl 1):S356.

3. Lu Y, Mathur AK, Glüer CC, Andriano KM, Blunt BA, Fuerst TP, Genant HK 1995 Application of statistical quality control method in multicenter osteoporosis clinical trials. Proceedings of International Conference on Statistical Methods and Statistical Computation for Quality and Productivity Improvement, Vol II. Seoul, Korea, pp. 374-480.

4. Montgomery DC 1992 Introduction to Statistical Quality Control, 2nd Ed. Wiley, New York.

5. Ryan TP 1989 Statistical Methods for Quality Improvement. Wiley, New York.

6. Orwoll ES, Oviatt SK, Biddle JA 1993 Precision of dual-energy x-ray absorptiometry: development of quality control rules and their application in longitudinal studies. J Bone Miner Res 8(6):693-699.

7. Orwoll ES, Oviatt SK 1991 Longitudinal precision of dual-energy X-ray absorptiometry in a multi-center study, the Nafarelin/Bone study group. J Bone Miner Res 6(2):191-197.

8. Jergas M, Palermo L, Nevitt M, Black D, Ashby M, Glüer CC, Genant HK 1993 Quality control of bone densitometry data in multi-center studies. J Bone Miner Res 8(Suppl 1):S345.

9. Vilstrup L, Mollgaard A, Haubro AM, Gunther T, Riis BJ 1994 Quality assurance of multi-center trials including measurement of bone mass precision and accuracy. J Bone Miner Res 9(Suppl 1):S333.

10. Wahner HW, Looker A, Dunn WL, Walters LC, Hauser MF, Novak C 1994 Quality control of bone densitometry in a national health survey (NHANES III) using three mobile examination centers. J Bone Miner Res 9(6):951-960.

11. SAS/QC Software: Reference, Version 6, 1st Ed, 1989. SAS Institute Inc., Cary, North Carolina.

12. Becker RA, Chambers JM, Wilks AR 1988 The New S Language. Wadsworth & Brooks/Cole, Pacific Grove, California.

13. Jergas M, Genant HK 1993 Current methods and recent advances in the diagnosis of osteoporosis. Arthrit Rheum 36(12):1649-1662.

14. SAS Technical Report P-188, 1989. SAS/QC Software Examples, Version 6. SAS Institute Inc., Cary, North Carolina.

15. Wu Y 1994 On the biases of change point and change magnitude estimation after CUSUM test. Technical Report 94.05. Department of Statistics and Applied Probability, University of Alberta, Canada.

16. Ryan TP 1994 Methods for charting autocorrelated process data-what will and will not work. Abstracts of the Joint Statistical Meetings, Toronto, Canada, pp. 268.

17. Adams BM, Lin WS 1994 Monitoring autocorrelated data with a combined EWMA-Shewhart control chart. Abstracts of the Joint Statistical Meetings, Toronto, Canada, pp. 208.

Address reprint requests to:
Ying Lu
Osteoporosis Research Group
Department of Radiology
University of California-San Francisco
San Francisco, CA 94143-1349, U.S.A.

Received in original form June 16, 1995; in revised form December 15, 1995; accepted January 2, 1996.