ABSTRACT

Prior vertebral fracture increases fracture risk, making knowledge of fracture status important in clinical practice. However, many vertebral fractures are unappreciated; consequently, spine imaging is necessary for optimal clinical decision-making. One imaging modality is densitometric vertebral fracture assessment (VFA), which is increasingly appreciated as a valuable addition to clinical densitometry. However, VFA limitations include difficulty identifying mild (grade 1) fractures, a weakness that might be reduced by improvements in image quality and subsequent analysis. As such, this study evaluated the utility of SpineAnalyzer™, a software program that uses a 95-point morphometry approach, on VFA images acquired with a GE Healthcare iDXA. VFA was performed on 103 individuals: 79 women and 24 men. Their mean (range) age, lowest T-score (L1-4 spine, femoral neck or total proximal femur) and BMI were 72.6 (47.7 to 91.5) years, -1.5 (+3.7 to -3.4) and 26.5 (17.1 to 36.5) kg/m², respectively. Many of these individuals had substantial degenerative disease, scoliosis or other anatomic variation, making fracture identification challenging. VFA interpretation was performed using printed images and applying the Genant visual semi-quantitative (VSQ) approach by a recognized expert (HKG), whose reading was defined as the “gold standard”, and by a non-radiologist clinician experienced in VFA interpretation (NB). An ISCD-certified technologist (DK) used SpineAnalyzer™ to identify vertebral deformities in these same scans. Some manual adjustment of morphometry point placement was required on the majority of these images, primarily for abnormal and/or upper thoracic vertebral bodies. The main outcome parameter was vertebral fracture number and grade from T4-L4 using SpineAnalyzer™ in comparison to the gold standard and the non-radiologist clinician.

In this cohort, the gold standard analysis identified 53 vertebral fractures, whereas SpineAnalyzer™ identified 43 and the clinician 41. For analysis as normal (VSQ grade 0) or fracture (VSQ grade 1, 2 or 3), moderate agreement was observed between the gold standard and both SpineAnalyzer™ and the clinician (kappa 0.54 and 0.50, respectively). When limiting evaluation to just grade 2 and 3 fractures (VSQ = 0, 1 together vs. VSQ = 2, 3 together), moderate agreement with the gold standard was again observed: kappa of 0.57 for SpineAnalyzer™ and 0.56 for the clinician. In conclusion, when evaluating VFAs in a cohort with substantial degenerative disease and/or anatomical abnormalities, SpineAnalyzer™ with morphometry performed by a technologist performs similarly to an experienced clinician when compared with a gold standard reader. Studies evaluating other populations are needed to better characterize the utility of applying SpineAnalyzer™ to VFA images.

RESULTS

Non-evaluable Vertebrae (Figure 2)
• The vast majority (94-95%) were felt to be evaluable by all three interpretations
 - Most non-evaluable vertebrae were in the upper thoracic spine
 - Approximately equal numbers of vertebrae were excluded by the interpretations; n = 64, 76 and 77
 - Numerically more vertebrae were felt to be non-evaluable from T7 through T12 by the expert reader

Vertebral Fracture Identification and Agreement
• Total number and grade of vertebral fractures by reader (Figure 3):
 - Gold standard: 53; grade 3 = 4, grade 2 = 17, grade 1 = 32
 - Clinician: 41; grade 3 = 7, grade 2 = 15, grade 1 = 19
 - Technologist: 43; grade 3 = 3, grade 2 = 11, grade 1 = 29
• Agreement between analysis approaches for non-deformed (no fracture) vs. fracture (VSQ grades 1, 2 and 3 combined); Kappa (95% CI):
 - Gold standard vs. clinician = 0.50 (0.35-0.65)
 - Gold standard vs. technologist = 0.54 (0.40-0.68)
• Agreement between analysis approaches for no fracture plus mild (grades 0 and 1 combined) vs. moderate and severe fracture (grades 2 and 3 combined); Kappa (95% CI):
 - Gold standard vs. clinician = 0.56 (0.36-0.76)
 - Gold standard vs. technologist = 0.57 (0.35-0.78)
• Examples of agreement and disagreement are presented in Figure 4; disagreement from the gold standard is summarized in Table 1
• Location and grade of vertebral fractures by reader (Figure 5)

Osteoporosis Clinical Center & Research Program, University of Wisconsin-Madison

University of Wisconsin-Madison School of Medicine and Public Health

FIGURE 3

Figure 3: Number and Grade of Vertebral Fractures Identified. The total number and grade of vertebral fractures for the three interpretations are presented. Numerically more fractures were identified by the expert reader. Good agreement with the gold standard was observed for both the clinician read and the SpineAnalyzer™ method.

SUMMARY
• The vast majority of vertebral bodies (94-95%) are adequately visualized for fracture identification using iDXA images.
• There is good, but imperfect, agreement in fracture identification with a gold standard interpretation by both an experienced clinician and by SpineAnalyzer™.
 - Differences in identification of mild fractures continue to contribute to disagreement between approaches.
 - Some disagreement results from differences in what is considered to be an “evaluable” vertebral body. A clearer definition of what makes a vertebral body non-evaluable might enhance agreement.
• Technologist identification of vertebral deformities with SpineAnalyzer™ has the potential to facilitate fracture identification.

Evaluation of An Automated Morphometry Software Program (SpineAnalyzer™) on VFA Images

D. Krueger¹, J. Staal², P. Steiger², B. Buehring¹, H.K. Genant³, N. Binkley¹

¹University of Wisconsin Osteoporosis Clinical Research Program, ²Optasia Medical, Inc., ³University of California, San Francisco

INTRODUCTION

Prevalent vertebral fractures are powerful predictors of future fracture risk. However, many vertebral fractures are not appreciated by either the patient or physician, making spine imaging necessary for optimal fracture risk estimation. Vertebral fracture assessment (VFA) performed at the time of BMD measurement is a convenient, low radiation exposure approach to enhance fracture risk prediction. Exploration of approaches enhancing incorporation of VFA into clinical practice is clearly indicated. In this regard, automated morphometric assessment performed by the technologist may enhance fracture identification. In a cohort of 103 older adults, this study compares fracture identification by a technologist using proprietary morphometric software (SpineAnalyzer™) with that of an expert radiologist (HKG) and an experienced clinician (NB).

FIGURE 4

Figure 4: Examples of Agreement and Disagreement in Fracture Identification by Approach. Figure 4a illustrates agreement at T8 with all 3 interpretations identifying a mild T8 fracture. Similarly, the L1 fracture was identified by all, but graded as mild by SpineAnalyzer, moderate by NB and severe by HKG.

Figure 4b is an example in which morphometry did not identify fractures at L1 and L4 which were identified by HKG and NB.

Figure 4c illustrates disagreement among all three interpretations. In this example, T8, T9, T10 and T11 were deemed non-evaluable by HKG, while NB called these mild fractures and SpineAnalyzer identified no fractures.


METHODS

Subjects
• 103 older adults participating in a research study comparing iDXA with Prodigy densitometers who met ISCD indications for VFA performance
• 79 women/24 men
 - Mean (± SD) age 72.6 (± 7.2) years (range 47.7 to 91.5)
 - Mean (± SD) lowest BMD T-score of L1-4 spine, femoral neck or total proximal femur -1.5 (± 1.2) (range +3.7 to -3.4)
 - Mean (± SD) BMI 26.5 (± 4.3) kg/m² (range 17.1 to 36.5)

VFA Acquisition and Analysis
• GE Healthcare Lunar iDXA densitometer; all VFA studies performed in the lateral decubitus position by ISCD-certified technologists
• VFA interpretation (T4-L4) performed by a recognized expert (HKG) and a clinician experienced in VFA interpretation (NB) using printed images and applying the Genant VSQ scale (Figure 1a)
• SpineAnalyzer™ analysis performed by an ISCD-certified technologist (DK) as follows:
 - The technologist placed a point in the center of all imaged vertebrae and identified L4; the software numbered all vertebral bodies accordingly
 - The vertebral body outline is automatically generated by the software and 95-point morphometry is performed
 - The technologist validated or adjusted the 95-point morphometry as appropriate in the following order: six-point morphometry, rim centers, osteophyte handles, endplates, rims and margins (Figure 1b)
• The software generates a deformity results table (Figure 1c) by applying deformity thresholds to the morphometry measurements for subsequent clinical interpretation (the latter not performed in this study)

Outcome Parameters
• Number and location of evaluable vertebrae
• Number, grade and location of vertebral fractures

Statistical Analyses
• The “gold standard” analysis (HKG) was used for comparison
• Kappa statistics and 95% confidence intervals calculated on evaluable vertebral bodies in each pairing (n = 1189, 1205 & 1217) to compare the following:
 - No fracture (VSQ grade 0) vs. fracture (VSQ grades 1, 2 and 3)
 - No fracture plus mild (VSQ grades 0 and 1) vs. moderate or severe fracture (VSQ grades 2 and 3)
• Analyses performed using Matlab (MathWorks, Natick, MA)
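The two kappa comparisons described under Statistical Analyses can be illustrated with a short Python sketch. It is not the authors' Matlab analysis: the function names, the example grades and the simple large-sample standard error used for the confidence interval are all assumptions made for illustration, and the poster does not state which interval method was used.

```python
import math

def dichotomize(grades, cutoff):
    """Map VSQ grades (0-3) to 0/1: 1 when grade >= cutoff, else 0."""
    return [1 if g >= cutoff else 0 for g in grades]

def cohen_kappa(a, b, z=1.96):
    """Cohen's kappa for two binary ratings with an approximate 95% CI.

    The interval uses the simple large-sample standard error
    sqrt(po * (1 - po) / (n * (1 - pe)^2)); this is an assumed method.
    """
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    agree = sum(1 for x, y in zip(a, b) if x == y)
    p_obs = agree / n                       # observed agreement
    pa, pb = sum(a) / n, sum(b) / n         # positive rate for each reader
    p_exp = pa * pb + (1 - pa) * (1 - pb)   # agreement expected by chance
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_obs - p_exp) / (1 - p_exp)
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    return kappa, max(-1.0, kappa - z * se), min(1.0, kappa + z * se)

# Hypothetical per-vertebra VSQ grades for evaluable vertebrae (not study data).
gold = [0, 0, 1, 2, 3, 0, 1, 0, 2, 0]
reader = [0, 1, 1, 2, 2, 0, 0, 0, 1, 0]

# Cutoff 1: grade 0 vs. grades 1-3. Cutoff 2: grades 0-1 vs. grades 2-3.
for cutoff in (1, 2):
    k, lo, hi = cohen_kappa(dichotomize(gold, cutoff), dichotomize(reader, cutoff))
    print(f"cutoff {cutoff}: kappa = {k:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With only ten hypothetical vertebrae the interval is very wide; the study pairings each contained roughly 1200 evaluable vertebral bodies.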


FIGURE 5

Figure 5: Fracture Location and Severity by Interpretation. Distribution of fractures by vertebrae, grade and interpreter. In this cohort, the majority of vertebral fractures, including virtually all moderate and severe fractures, were observed from T7 through L1 by all interpretations. Most disagreement was in assessing grade 1 fractures.

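Columns: Fracture in Unevaluable Vertebrae | Fracture Missed | Fracture Overcall | Total Fracture Disagreement

Total Fracture Disagreement: NB 47 (79% G1); SpineAnalyzer 42 (90% G1)

Cell entries as extracted (count and VSQ grade; row and column alignment lost): 5 G1, 7 G1, 25 G1, 1 G3, 1 G1, 4 G2, 22 G1, 4 G2, 5 G2, 15 G1

Table 1: Summary of Disagreement from Gold Standard Interpretation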

*In the 1310 imaged vertebral bodies an ~3% fracture identification error rate was observed; 79-90% was related to differences in mild fracture identification.

FIGURE 2

Figure 2: Non-evaluable Vertebrae. Overall, only approximately 5-6% of vertebral bodies were deemed uninterpretable. Many of the non-evaluable vertebrae (43%, 64% and 79% depending on the reader) were located in the upper thoracic spine from T4-T6. Considering only T7 - L4, 96-98% of vertebral bodies were felt to be adequately visualized on these iDXA images. Numerically more vertebral bodies were deemed non-evaluable from T7 through T12 by the expert reader than by the other two approaches.
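The 5-6% and 94-95% figures can be reproduced from counts reported elsewhere on the poster. The sketch below assumes that the three exclusion counts (n = 64, 76 and 77) share the 1310 imaged vertebral bodies as their denominator; the poster does not say which count belongs to which reader, so the values are left unlabeled, and the variable and function names are illustrative only.

```python
IMAGED_VERTEBRAE = 1310               # imaged vertebral bodies (Table 1 footnote)
NON_EVALUABLE_COUNTS = [64, 76, 77]   # vertebrae excluded by the three interpretations

def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole

non_evaluable_pct = [round(percent(n, IMAGED_VERTEBRAE), 1)
                     for n in NON_EVALUABLE_COUNTS]
evaluable_pct = [round(100.0 - percent(n, IMAGED_VERTEBRAE), 1)
                 for n in NON_EVALUABLE_COUNTS]

print(non_evaluable_pct)  # -> [4.9, 5.8, 5.9]
print(evaluable_pct)      # -> [95.1, 94.2, 94.1]
```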

[Figure 2 chart: Percent Non-Evaluable of Imaged Vertebrae (y-axis, 0 to 40) by vertebral level T4 through L4 (x-axis); series: NB, HKG, SpineAnalyzer]

Figure 1: Genant Visual Semiquantitative (VSQ) Scale and SpineAnalyzer 95-Point Analysis. The Genant VSQ scale for fracture identification is depicted (Figure 1a). The SpineAnalyzer™ 95-point vertebral body outline (Figure 1b) is utilized to calculate values reported in the deformity result panel (Figure 1c).
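To make the link between six-point morphometry and a deformity grade concrete, here is a minimal Python sketch. It is not SpineAnalyzer's algorithm: the poster does not give the software's deformity thresholds, so the cut points below are the commonly cited approximate Genant values (about 20-25% height loss for grade 1, 25-40% for grade 2, above 40% for grade 3), and every name in the code is hypothetical. It also compares heights only within a single vertebra, so uniform (crush) loss relative to neighboring vertebrae is not detected.

```python
import math
from dataclasses import dataclass

@dataclass
class Heights:
    """Anterior, middle and posterior vertebral heights (any consistent unit)."""
    anterior: float
    middle: float
    posterior: float

def heights_from_points(points):
    """Build Heights from six (x, y) points ordered as anterior superior,
    anterior inferior, middle superior, middle inferior, posterior superior,
    posterior inferior. The ordering is an assumption of this sketch."""
    if len(points) != 6:
        raise ValueError("six points are required")
    a_sup, a_inf, m_sup, m_inf, p_sup, p_inf = points
    return Heights(math.dist(a_sup, a_inf),
                   math.dist(m_sup, m_inf),
                   math.dist(p_sup, p_inf))

def height_reduction_pct(h):
    """Percent reduction of the smallest height relative to the largest."""
    ref = max(h.anterior, h.middle, h.posterior)
    if ref <= 0:
        raise ValueError("heights must be positive")
    return 100.0 * (ref - min(h.anterior, h.middle, h.posterior)) / ref

def grade_from_reduction(reduction_pct):
    """Map percent height reduction to a grade of 0-3 (assumed thresholds)."""
    if reduction_pct < 20.0:
        return 0
    if reduction_pct < 25.0:
        return 1
    if reduction_pct <= 40.0:
        return 2
    return 3

# Hypothetical vertebra: anterior 20, middle 19, posterior 25 units tall.
example = heights_from_points([(0, 0), (0, 20), (10, 0), (10, 19), (20, 0), (20, 25)])
print(height_reduction_pct(example), grade_from_reduction(height_reduction_pct(example)))
```

A threshold table of this kind only flags deformity; as the Methods note, the software's results table is intended for subsequent clinical interpretation.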

FIGURE 1 (panels 1a, 1b, 1c)

[Figure 5 chart: Number of Fractures (y-axis, 0 to 10) by vertebral level T4 through L4 (x-axis), by grade (Grade 1, Grade 2, Grade 3) and reader (HKG, SpineAnalyzer, NB)]

[Figure 3 chart: Number of Fractures (y-axis, 0 to 60) by reader (NB, HKG, SpineAnalyzer) and grade (Grade 1, Grade 2, Grade 3)]