

Computer Methods and Programs in Biomedicine 62 (2000) 11–19

Application of neural networks and sensitivity analysis to improved prediction of trauma survival

Andrew Hunter a,*, Lee Kennedy b, Jenny Henry c, Ian Ferguson a

a Department of Computer Science, University of Sunderland, St Peter's Campus, Tyne and Wear, Sunderland SR2 7EE, UK
b Department of Medicine, Sunderland Royal Hospital, Kayll Road, Sunderland SR4 7TP, UK

c Scottish Trauma Audit Group, Royal Infirmary of Edinburgh, Lauriston Place, Edinburgh EH3 9YW, UK

Received 20 May 1999; accepted 26 July 1999

Abstract

The performance of trauma departments is widely audited by applying predictive models that assess probability of survival, and examining the rate of unexpected survivals and deaths. Although the TRISS methodology, a logistic regression modelling technique, is still the de facto standard, it is known that neural network models perform better. A key issue when applying neural network models is the selection of input variables. This paper proposes a novel form of sensitivity analysis, which is simpler to apply than existing techniques, and can be used for both numeric and nominal input variables. The technique is applied to the audit survival problem, and used to analyse the TRISS variables. The conclusions discuss the implications for the design of further improved scoring schemes and predictive models. © 2000 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Sensitivity analysis; Neural networks; TRISS; Trauma; Survival analysis

www.elsevier.com/locate/cmpb

1. Introduction

This paper discusses the application of neural networks to the audit of trauma survival data. Trauma survival is usually audited using the TRISS methodology [12]. This uses the well-known logistic regression model to estimate the probability of survival of patients based on clinical data gathered during admission. A number of authors have observed that the survival estimates yielded by TRISS can be improved by locally re-optimising the TRISS coefficients [4], or by including factors omitted from the original TRISS methodology, such as admission pH [6] or pre-existing medical conditions [4]. Other results have indicated that comparable results may be gained by replacing TRISS with simpler models using logistic regression, either using the same variables as used in TRISS or using alternative variables [2]. Some preliminary results have also indicated that prediction accuracy may be improved by using alternative modelling approaches, such as neural networks [9], in place of logistic regression.

* Corresponding author. Tel.: +44-191-515-2778; fax: +44-191-515-2781.

E-mail addresses: [email protected] (A.Hunter), [email protected] (L. Kennedy),[email protected] (I. Ferguson)

0169-2607/00/$ - see front matter © 2000 Elsevier Science Ireland Ltd. All rights reserved.

PII: S0169-2607(99)00046-2


This paper confirms many of the above results, using a large database of some 15 000 cases (from STAG, the Scottish Trauma Audit Group), with progressively improved results yielded by locally re-optimising TRISS, by using a simpler logistic regression model than TRISS, and by using a simple neural network. We have further experimented with a novel approach to sensitivity analysis of neural network inputs, and can thus speculate usefully about the most important variables in trauma survival assessment. This study of sensitivity agrees with other authors on the importance of some variables, but contradicts others.

2. Problem description

Our objective is to assess the probability of survival (PS) of patients admitted to trauma units, based on information gathered when the patient is admitted. Predictions are used for audit purposes: to identify unexpected deaths or survivals, so that the causes of any regional patterns can be identified. We wish to determine whether the results obtained using the TRISS methodology can be improved upon, and in particular whether more complex neural network models are justified in place of the traditional logistic regression approach.

The TRISS model consists of two logistic regression equations, one of which is applied if the patient has a penetrating injury, the other for blunt injuries. The TRISS equations [2] each use three predictor variables: the Revised Trauma Score (RTS), the Injury Severity Score (ISS), and the age encoded as a binary variable (AGE55, indicating whether patients are aged less than 55 years, or 55 years and over). The RTS variable is in turn derived from the patient's blood pressure (BP) and respiratory rate (RESP), and the Glasgow Coma Scale (GCS). GCS is in turn derived by scoring eye movement (EYE), motor co-ordination (MOTOR) and verbal response (VERBAL). The ISS variable is derived by scoring injury severity in six body areas (A1–A6), and combining these together.
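The two-equation structure described above can be sketched in a few lines. The coefficient values below are placeholders for illustration only; they are not the published TRISS coefficients, which are not reproduced in this paper.

```python
import math

def triss_ps(rts, iss, age55, blunt, coeffs):
    """Probability of survival under a TRISS-style two-equation scheme.

    `coeffs` maps the injury type ('blunt' or 'penetrating') to a tuple
    (b0, b_rts, b_iss, b_age) of logistic-regression coefficients.
    One equation is selected according to the injury type, then a
    weighted, biased sum is passed through the logistic function.
    """
    b0, b_rts, b_iss, b_age = coeffs['blunt' if blunt else 'penetrating']
    act = b0 + b_rts * rts + b_iss * iss + b_age * age55
    return 1.0 / (1.0 + math.exp(-act))

# Placeholder coefficients, for illustration only (NOT the published values).
demo_coeffs = {
    'blunt': (-1.0, 0.9, -0.08, -1.7),
    'penetrating': (-2.0, 1.0, -0.09, -1.1),
}
p = triss_ps(rts=7.84, iss=9, age55=0, blunt=True, coeffs=demo_coeffs)
```

The dispatch on injury type is the essential point: TRISS is two independent logistic models, not one model with injury type as an input.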

3. Experimental approach

We conducted a number of experiments to improve prediction accuracy. Our database consisted of 15 055 cases, gathered from Scottish trauma departments in the period 1992–1996. The data were gathered by the Scottish Trauma Audit Group from a range of teaching, district and community hospitals. Information is gathered for patients who are admitted for three or more days, or who die in hospital. The data is initially entered by Accident and Emergency nursing staff; injury scoring is performed by a highly-trained regional co-ordinator. The data set contains almost no missing values. However, if the initial clinical observation is missing and the patient was managed in a resuscitation room and/or died, normal physiological variables are allocated and the patient is added to the database. If these observations are missing and the patient survives and is not managed in a resuscitation room, the patient is omitted from the database. The data set was divided into two subsets: the training set, containing 7224 cases from 1992–1994; and the test set, containing 7831 cases gathered from 1995–1996. In all cases, the new models were optimised using the training set, so there is an element of prospective testing as results are reported on the test set. A small number of additional cases (15) were discarded because some variable values were missing.

The two major modelling techniques used were logistic regression and neural networks. The first of these is already used in the TRISS methodology and is widely known in the medical field. Neural networks are less well known in medicine, and are sometimes regarded as an unproven and 'mysterious' model. It is worth discussing the relationship between the two modelling techniques in some detail, as they are much more closely related than is commonly realised.

In logistic regression, a weighted sum of the input variables is taken, and to this sum a bias value is added. The resulting value is passed through the logistic function:

act = b + Σᵢ wᵢxᵢ

PS(x) = 1 / (1 + e^(−act))

The output of the logistic function is a value between 0 and 1, which is interpreted as a probability of class membership.
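A minimal sketch of this pair of equations follows; the weights and inputs are arbitrary illustrations, not fitted values.

```python
import math

def logistic_regression_ps(x, w, b):
    """PS(x) = 1 / (1 + exp(-act)), where act = b + sum_i w_i * x_i."""
    act = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-act))

# Three rescaled inputs with illustrative (not fitted) weights and bias.
ps = logistic_regression_ps(x=[0.8, 0.2, 0.3], w=[2.0, -3.0, -1.5], b=0.5)
```

Because the logistic function is monotonic, the decision boundary of this model is the hyperplane act = 0; only the probability estimates, not the ranking of cases, depend on the squashing.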

The theoretical justification for the use of logistic regression rests on the fact that, in the case where the two classes (live or die) have probability density functions following the multivariate normal distribution, with equal covariance matrices, the posterior probability function will indeed follow the logistic functional form.

Of course, if the probability density functions are significantly non-normal, or have different covariance matrices, then the performance of logistic regression will be non-optimal. It is therefore desirable to consider more complex forms of model.

Neural networks are a family of semi-parametric models which often exhibit better performance than logistic regression on classification problems with significant non-normality, or differing class covariance matrices [1]. In the most common form of neural network, the multilayer perceptron (MLP), two layers of units (sometimes called 'neurons') are commonly used, although more layers are occasionally seen. Each unit forms a weighted, biased sum of its inputs, and then applies an activation function to this sum. One common activation function is the logistic function. A single unit in the first layer (the hidden layer) of an MLP actually forms a logistic response function, identical to that used in logistic regression. However, in the MLP, a number of these logistic response surfaces are recombined in the output layer to form the actual probability estimate. The output units also use a biased logistic function. Thus, a logistic regression model is identical to an extremely simple neural network model (a neural network with no hidden layer, and a single output unit), and neural networks can be viewed as a generalisation of logistic regression. The primary advantage of neural networks over logistic regression is that they can model non-normal class distributions, and it is therefore not surprising to find that they are often capable of improved performance when compared to the logistic regression approach.
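The relationship described above can be made concrete in a few lines: each hidden unit of the MLP is itself a logistic-regression response, and removing the hidden layer recovers plain logistic regression. All weights below are arbitrary illustrations.

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def mlp_forward(x, hidden, output):
    """Two-layer MLP: each hidden unit is a biased logistic of the inputs,
    and the output unit is a biased logistic of the hidden activations."""
    h = [logistic(b + sum(w * xi for w, xi in zip(ws, x)))
         for ws, b in hidden]
    ws, b = output
    return logistic(b + sum(w * hi for w, hi in zip(ws, h)))

def lr_forward(x, ws, b):
    """The degenerate case: no hidden layer, i.e. logistic regression."""
    return logistic(b + sum(w * xi for w, xi in zip(ws, x)))

# Illustrative weights: 3 inputs, 2 hidden units, 1 output unit.
hidden = [([1.0, -2.0, 0.5], 0.1), ([-0.5, 1.5, -1.0], -0.2)]
output = ([2.0, -1.0], 0.3)
p = mlp_forward([0.4, 0.6, 0.1], hidden, output)
```

Note that the first hidden unit computes exactly what `lr_forward` would with the same weights; the MLP's extra power comes solely from recombining several such responses.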

It is easy to adjust the model complexity in neural networks, simply by increasing or decreasing the number of hidden units; this allows valuable modelling flexibility within a single framework. An important feature of neural networks is that as the number of units in the network increases, so the model complexity increases, but also the difficulty of fitting the model and the amount of data required increase. A good design therefore incorporates the minimum size of network necessary to adequately model the problem.

Both logistic regression and neural network models have to be optimised using sampled data. We used the same optimisation strategy for both types of model. First, the logistic regression coefficients or neural network weights were randomly assigned. They were then optimised against the training set using a two-stage process: on-line gradient descent with momentum (back propagation, learning rate 0.1; momentum coefficient 0.3) for a moderate number of epochs (30), followed by quasi-Newton BFGS optimisation for 100 epochs [7]. The multilayer perceptrons were optimised against the cross-entropy error function, which is equivalent to maximum likelihood optimisation of the logistic regression models [1]. The network error function in this case is:

E = −Σᵢ [tᵢ ln(oᵢ) + (1 − tᵢ) ln(1 − oᵢ)]

where tᵢ is the target output for training case i, and oᵢ the actual output of the output neuron. This error function is used instead of the usual sum-squared error function. Despite the apparently more complex form, when used to modify the back propagation algorithm, and in conjunction with the logistic activation function, the derivative used to calculate the error correction is actually simplified by omission of the o(1−o) term. Details are given in [1].
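The error function and the simplified error-correction term can be sketched as follows. The leading minus sign makes the minimised error non-negative, the conventional form of the cross-entropy error.

```python
import math

def cross_entropy_error(outputs, targets):
    """E = -sum_i [t_i * ln(o_i) + (1 - t_i) * ln(1 - o_i)]."""
    return -sum(t * math.log(o) + (1 - t) * math.log(1 - o)
                for o, t in zip(outputs, targets))

def output_delta(o, t):
    """dE/dact for a logistic output unit trained on cross-entropy.

    The o*(1 - o) factor from the logistic derivative cancels against
    the derivative of the cross-entropy error, leaving simply (o - t).
    """
    return o - t
```

This cancellation is the simplification referred to in the text: with sum-squared error the correction would carry the extra o(1−o) factor, which vanishes under cross-entropy.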

The two-stage optimisation process described above is little known, and we have adopted it because our experience indicates that second-order non-linear optimisation algorithms such as quasi-Newton are substantially more reliable if they are preceded by a short burst of gradient descent (they are far less likely to become stuck in a local minimum). Other authors have also reported this effect [8]. We suspect that this improved performance is due to greater stability in the error function Hessian matrix once the gradient-descent algorithm has located a reasonable starting point for second-order descent. We have also found it beneficial to scale the input variables into a consistent range [0, 1] using the minimax approach, as this aids the gradient descent stage (this is not necessary if optimisation uses quasi-Newton alone).
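A minimal sketch of minimax scaling follows, assuming (as is standard practice) that the scaling parameters are estimated on the training set and reapplied unchanged to test data.

```python
def minimax_scale(values):
    """Linearly rescale a list of values into the range [0, 1].

    Returns the scaled values together with the (minimum, span)
    parameters so the same transform can be reapplied to new data.
    """
    lo, hi = min(values), max(values)
    span = hi - lo if hi > lo else 1.0  # guard against constant inputs
    return [(v - lo) / span for v in values], (lo, span)

scaled, (lo, span) = minimax_scale([20.0, 35.0, 80.0])
```

Test-set values are scaled with the stored `(lo, span)` rather than their own extremes, so a test value outside the training range may legitimately fall outside [0, 1].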

Optimisation was quite straightforward, indicating that the data is well structured. Each model was optimised ten times, with the best resulting model selected. Results were remarkably consistent, showing little of the variation in performance which is common in neural network training: each model settled quickly to a consistent error level in the vast majority of cases (with one or two local minima, which were discarded), with no significant over-training. As a consequence, we were able to use the entire training set for optimisation of neural network models with no cross-validation. The results reported below are from one arbitrarily selected model of each type.

Initially we evaluated a variety of types of neural network, including radial basis function (RBF) networks, multilayer perceptrons and probabilistic neural networks (PNNs) [10], and with various numbers of hidden units. Since RBF and PNN networks proved consistently inferior to MLPs on this data set, we conducted the full set of experiments using MLPs only, and report on these results.

Models were assessed for performance by calculating the receiver operating characteristic (ROC) curves [12], and comparing the areas under the curves. The ROC curve and the area below it were calculated using the test set (1995–1996 cases), which was not used in optimising the models. When applying a classifier, there is an inevitable trade-off between type one and type two errors (false positives and false negatives), and the trade-off can be managed by adjusting the decision threshold. The ROC curve summarises the performance of a classifier across the range of possible decision thresholds. The area corresponds to the probability that a randomly selected patient who actually survives will be judged more likely to survive than a randomly selected patient who actually dies. In the case where a perfect classifier is possible (that is, where the cases from the two classes are separable) an area of 1.0 is achievable; in practical problems, where there is some overlap between classes, the upper threshold achievable is somewhere below 1.0. However, we do not know a priori what the optimum achievable result is.
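The probabilistic interpretation of the area gives a direct way to compute it, equivalent to the Mann–Whitney U statistic (a standard identity, not something introduced by this paper):

```python
def roc_area(survivor_scores, death_scores):
    """ROC area computed from its probabilistic interpretation: the
    probability that a randomly selected survivor receives a higher
    predicted PS than a randomly selected death (ties count one half)."""
    wins = 0.0
    for s in survivor_scores:
        for d in death_scores:
            if s > d:
                wins += 1.0
            elif s == d:
                wins += 0.5
    return wins / (len(survivor_scores) * len(death_scores))

auc = roc_area([0.9, 0.8, 0.7], [0.6, 0.8])
```

The pairwise loop is O(n·m) and is shown for clarity; for large test sets the same quantity is obtained more efficiently from the rank sum of one class.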

4. Improved prediction of survival probability

We conducted a number of experiments using the same variables as TRISS.

As a benchmark, we first calculated survival probabilities using the original TRISS coefficients. The area under the ROC curve in this case is 0.941.

We then re-optimised the TRISS coefficients using the available training data set, improving the result slightly to 0.943. This is consistent with results reported by other authors [4,6], which indicate that local re-optimisation is beneficial.

We next built simpler logistic regression models, in particular incorporating AGE directly as an input variable, and training a single model (either with TYPEINJ as an input, or ignored entirely). A model using a single logistic regression equation, with the injury type (TYPEINJ, blunt or penetrating), the RTS, the ISS and the age of the patient entered directly (after rescaling), rather than encoded into a binary variable as in TRISS, achieved an area under the ROC curve of 0.953, a noticeable improvement. Interestingly, a second logistic regression model with only three inputs (RTS, ISS and AGE) performed nearly as well, at 0.952. Since the type of injury is known to have a significant impact on the chance of survival (survival rates are far lower for those with penetrating injuries, although these are far less common), it initially seems surprising that this very simplified model can out-perform re-optimised TRISS. The explanation lies in the treatment of the AGE variable. In TRISS, this is reduced to a binary code indicating whether the patient is above or below 55. However, our sensitivity analysis experiments (discussed in more detail in Section 5) consistently identify age as the most important independent predictor of survival in our data set.

Finally, we built simple multilayer perceptron neural network models, using the same range of input variables as in our logistic regression models. With the number of hidden units in the (single hidden layer) MLP varying between two and eight, the area under the ROC curve was remarkably consistent, varying between 0.9536 and 0.9548. Performance was equally good with and without the type of injury as an input, so the TYPEINJ variable proved irrelevant and was discarded; the MLP inputs were therefore RTS, ISS and AGE. The optimum architecture had only two hidden units, with performance falling off gradually as more units were added.

To confirm our speculations regarding the importance of the encoding of the AGE variable, we experimented with an alternative MLP network with two hidden units, but with the TRISS-style binary encoded age, rather than the actual age, as input. In this case, performance dropped to 0.948, still better than TRISS, but inferior to all our other models. Even then, sensitivity analysis confirms that age is the most important independent predictor in the model.

The experimental results are summarised in Table 1; Fig. 1 shows the ROC curves for the best neural network and the original TRISS model.

Table 1
Performance of logistic regression and neural network classifiers

Experiment                                                       ROC area (test set)
TRISS                                                            0.9411
Re-optimised TRISS                                               0.9426
Logistic regression, including age and type of injury            0.9534
Logistic regression, including age                               0.9521
Multilayer perceptron, 2 hidden units, including age             0.9548
Multilayer perceptron, 2 hidden units, age<55 nominal-encoded    0.9475

Fig. 1. Comparison of ROC curves for best multilayer perceptron and original TRISS.

As the ROC area can at best be increased to 1, the performance of the best neural network reported corresponds to a 23% improvement in the classifier's performance (i.e. 23% of the remaining discrepancy in ROC area from perfect performance is removed, when compared with TRISS). We can, for example, maintain the same sensitivity, but improve specificity so that approximately 500 cases that were reported as unexpected survivors are instead classified as expected. This is a very significant improvement, making it far more feasible to follow up the audit with detailed study of those cases showing unexpected outcomes.

5. Sensitivity analysis

A key question when modelling such problems is this: which variables are the most important predictors? It is desirable to assign some rating of importance to each variable. Any procedure to do this is, however, fundamentally flawed, as variables are not in general independent. There may be some variables that are of use only in conjunction with others, and others that encode the same information and are therefore partially redundant (different arbitrary subsets of mutually redundant variables may be equally acceptable). Strictly speaking, variable selection is a subset selection problem [5], and as for N variables there are 2^N possible subsets, it is not usually practical to evaluate all subsets.

Notwithstanding these theoretical objections, it is useful to assign ratings to variables within the context of a particular model. This is known as sensitivity analysis.

One approach is to add noise to each input variable, and to observe the effect upon the overall error [3]. There are drawbacks to this approach for feature ranking, however. First, one must make an essentially arbitrary decision about how much noise to add (usually, some multiple of the sample standard deviation of the variable). Second, it is not possible to add noise to nominal variables (i.e. variables which take one of a number of discrete values). Although an alternative approach of introducing random errors with some determined frequency could be adopted, applications of sensitivity analysis tend to be in domains where all inputs are numeric. In our case, the type of injury is a nominal variable.

A second approach is to analyse the derivatives of the fan-out weights of the input units, with a high derivative indicating high sensitivity [11]. This approach also has its drawbacks, as it is relatively complex to calculate the sensitivity, and the rating for each variable is necessarily a composite value derived from a number of derivatives, which compromises its reliability.

We propose an alternative form of sensitivity analysis, based on our approach to the missing value problem. In many problem domains, it is common to find the values of some variables missing. The STAG data set is actually very well-defined, so that missing values are extremely rare, and such cases have been omitted from these experiments. In problem domains where missing values are encountered more frequently, it becomes critical to have a method to 'fill in' the missing values in partially complete cases.

There are a number of approaches to this problem. Perhaps the simplest is to substitute the sample mean from the training set for a missing value, in the case of a continuous variable, or the a priori probabilities (or an estimate of these using the distribution in the training set, if a priori values are not known) for a nominal variable. This is the approach we commonly use. More sophisticated approaches include data imputation, where one attempts to predict the unknown input variables conditioned upon those which are known, by constructing auxiliary models; the imputed values are used to fill in the gaps before the main model is used to predict the output. Any form of missing value substitution is compatible with our approach to sensitivity analysis, although we would expect (but have not confirmed) that data imputation would better isolate the 'independent' contribution of each variable.

We analyse sensitivity by replacing each variable in turn with missing values, and assessing the effect upon the output error (i.e. the RMS of the individual cross-entropy errors of the test cases). A variable that is relatively important will cause a correspondingly large deterioration in the model's performance. This approach is somewhat analogous to a single step in the backwards feature selection procedure [5], except that in the latter case alternative models are actually built, whereas we simply alter the input to our existing model.
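The procedure can be sketched as follows; `model_error` and `train_means` are assumptions standing in for the fitted model's error measure and the mean-substitution missing-value treatment described above.

```python
def sensitivity_ranking(model_error, cases, train_means):
    """Rank input variables by substituting each in turn with its
    training-set mean (our missing-value substitute) and measuring the
    resulting deterioration in error.

    `model_error(cases)` returns the model's error over a list of input
    vectors; `train_means[j]` is the training-set mean of variable j.
    A larger error increase implies a more important variable.
    """
    baseline = model_error(cases)
    ratings = []
    for j, mean in enumerate(train_means):
        substituted = [row[:j] + [mean] + row[j + 1:] for row in cases]
        ratings.append((model_error(substituted) - baseline, j))
    return sorted(ratings, reverse=True)  # most important variable first
```

Note that no model is retrained: the same fitted model is queried with altered inputs, which is what makes this cheaper than a step of backwards feature selection.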

We submitted all of our models to this form of sensitivity analysis. In all cases, the variables were ranked in importance as follows: Age, ISS, RTS. The type of injury, when included, was ranked fourth and of marginal importance. Precise errors (as opposed to rankings) vary from model to model.

This result may seem surprising; in particular, one might expect that the age of a patient is a less powerful predictor of trauma survival than any sensible measurement of trauma severity. This is a good example of how sensitivity analysis must be treated with caution. Age is the most significant independent contributor to the models; there is clearly a great deal of interdependence between the ISS and RTS variables. If either of these is ranked against AGE without including the other, it is far more significant.

6. Using sensitivity analysis to examine the composite variables

Since the major input variables used in the TRISS equations are themselves composites of a number of simpler variables, it is interesting to consider which of these base variables are most influential. Some of these variables are extremely expensive to gather, requiring an expert assessment of the patients, and the volume of data recorded is a significant cost for all trauma units. There is therefore a great deal of motivation to reduce the number of variables gathered, or to use variables that are less expensive to gather.

To assess the relative importance of the base variables, we built a multilayer perceptron model using fourteen variables as inputs, and submitted it to sensitivity analysis. The variables were the A1–A6 scores used in the ISS; the blood pressure and respiratory rate used in the RTS; the eye, motor and verbal response variables used in the GCS, which is in turn encoded in the RTS; plus the injury type (blunt or penetrating), age and sex of the patients.

This model achieves a ROC area of only 0.951, somewhat inferior to our other models. This appears to reflect both the higher dimensionality of the resulting model, and the fact that the scoring schemes used in TRISS do recode the information in a more effective fashion. Nonetheless, the results are interesting.

Table 2
Sensitivity analysis of base variables

Variable   Meaning                    Error when omitted   Ranking
AGE                                   0.603                1
MOTOR      Motor movement             0.478                2
A1         Head injury                0.465                3
A6         External trauma            0.430                4
A3         Chest injury               0.427                5
TYPEINJ    Blunt or penetrating       0.421                6
A4         Abdominal injury           0.420                7
VERBAL     Verbal response            0.410                8
BP         Systolic blood pressure    0.409                9
A5         Extremities                0.408                10
A2         Facial injury              0.407                11*
RRATE      Respiratory rate           0.406                12*
EYE        Eye movement detected      0.406                13*
SEX                                   0.399                14*

* These variables may be omitted entirely without degrading the model. The benchmark error (before omitting variables) was 0.407.

Table 2 summarises the sensitivity of the variables in this model; the errors compare with a baseline error of 0.407. It can be seen that three variables (sex, respiratory rate and eye response) actually slightly degrade the model, and a further one (A2) is of no benefit. The insignificance of parameter A2 is not surprising (it rates facial injuries, which do not greatly affect the chance of survival). The irrelevance of gender is interesting: this contradicts the findings of Hannan, although he considered only the specific case of trauma in victims of low falls [4]. However, it is very surprising to find that respiratory rate and eye motion are irrelevant; this no doubt reflects mutual redundancy with other variables, and is again indicative of the dangers of placing too naive an interpretation on the results of a sensitivity analysis.


Building a second model that used only the ten inputs ranked as most important in the above experiment, we found that performance was not affected, whereas removing further variables degraded performance. This confirms the results of the sensitivity analysis.

Although this ten-input model does not perform as well as our models using the composite variables derived from the TRISS approach, it does confirm our suspicion that some of the information used in TRISS is not of great significance. We therefore conclude that there is some value in revisiting the scoring methods used to produce those composite variables, a viewpoint which is borne out by the work of other authors [2] using quite different approaches.

It is possible to build quite effective models using only a small subset of the available variables. For example, with the first five inputs listed above we built a multilayer perceptron with a ROC area of 0.9445: significantly inferior to the models using a more complete set of input variables, but still superior to TRISS even when its coefficients are re-optimised. This opens up the possibility of using simplified models for triage, as this information requires relatively little expertise to gather and is available from an initial examination of the patient (age might need to be estimated if the patient cannot respond to verbal questions).

7. Conclusions

We have evaluated the TRISS methodology and alternative approaches, including neural networks, on a large database (STAG) which has not previously been used for this purpose. Our experiments confirm that neural networks can yield better results than logistic regression, and that locally re-optimised models are to be preferred. We have also confirmed that recoding the age variable in a more straightforward fashion improves the performance of the models.

We have conducted a novel form of sensitivity analysis on our models. This is simple to apply, and effective in identifying key variables, allowing an important form of knowledge extraction to be applied to the neural network. The technique is equally applicable to any modelling approach, including those that are more sensitive to the inclusion of irrelevant inputs, such as fuzzy logic systems.

The sensitivity analysis confirms the importance of the age variable. Further sensitivity analysis reveals that gender is irrelevant, and that a number of the basic variables used in the TRISS methodology contribute little.

The results of our sensitivity analysis suggest that further work should be undertaken in simplifying the scoring schemes used to produce the composite variables in TRISS. It should be possible to produce a system with a smaller number of base variables and with performance equal to our best models. For future work, we intend a more thorough investigation of alternative scoring schemes, together with an in-depth investigation of other variables that are commonly available, but which have not been included in this analysis.

References

[1] C. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK, 1995.

[2] H.R. Champion, W.S. Copes, W.J. Sacco, C.F. Frey, J.W. Holcroft, D.B. Hoyt, J.A. Weigelt, Improved predictions from a severity characterization of trauma (ASCOT) over trauma and injury severity score (TRISS): results of an independent evaluation, J. Trauma Inj. Infect. Crit. Care 40 (1) (1996).

[3] T.D. Gedeon, Data mining of inputs: analysing magnitude and functional measures, Int. J. Neural Syst. 8 (2) (1997) 209–218.

[4] E.L. Hannan, J. Mendeloff, L.S. Farrel, C.G. Cayten, J.C. Murphy, Multivariate models for predicting survival of patients with trauma from low falls: the impact of gender and pre-existing conditions, J. Trauma Inj. Infect. Crit. Care 38 (5) (1995) 697–704.

[5] A. Jain, D. Zongker, Feature selection: evaluation, application and small sample performance, IEEE Trans. Pattern Anal. Mach. Intell. 19 (2) (1997).

[6] F.H. Millham, M. Malone, J. Blansfield, W.W. LaMorte, E.F. Hirsch, Predictive accuracy of the TRISS survival statistic is improved by a modification that includes admission pH, Arch. Surg. 130 (1995).

[7] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, second ed., Cambridge University Press, Cambridge, UK, 1992.


[8] B.D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, UK, 1996.

[9] R. Rutledge, O. Turner, S. Emery, S. Kromhout-Schiro, The end of the injury severity score (ISS) and the trauma and injury severity score (TRISS), J. Trauma Inj. Infect. Crit. Care 44 (1) (1996).

[10] D.F. Specht, Probabilistic neural networks, Neural Netw. 3 (1) (1990) 109–118.

[11] J.M. Zurada, A. Malinowski, I. Cloete, Sensitivity analysis for minimization of input data dimension for feedforward neural network, in: IEEE International Symposium on Circuits and Systems, IEEE Press, London, 1994.

[12] M.H. Zweig, G. Campbell, Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine, Clin. Chem. 39 (4) (1993) 561–577.
