Using PROC GENMOD to Model Adverse Event Counts in a Health Care Setting

John Ulicny, Fox Chase - Temple Bone Marrow Transplant Program, Philadelphia, PA

Thomas R. Klumpp, MD, Fox Chase - Temple Bone Marrow Transplant Program, Philadelphia, PA

ABSTRACT

Statistical models for adverse events have been developed as part of a quality management initiative at the Fox Chase – Temple Bone Marrow Transplant Program in Philadelphia, PA. Their purpose is to enable the transplant team to compare recent adverse event counts with the counts that should be expected given the patient population currently under the team's care. The count of adverse events is the response variable in the regression models. Generalized linear modeling, as implemented in the GENMOD procedure, is an effective tool for performing regression analysis on a response of this type. Unlike ordinary least squares (OLS), it can be applied to a wide range of non-normal responses as long as they come from the natural exponential family of distributions and meet certain other assumptions. The models in this paper were developed using Version 9.1.3 of SAS. This version allows for enhanced, integrated graphical assessment of the model via the ODS Statistical Graphics facility, which is built into various statistical procedures, GENMOD among them. The ODS Statistical Graphics facility is experimental in release 9.1.3. Some examples of its use are provided in this paper.

INTRODUCTION

Hematopoietic Cell Transplantation (HCT) is performed on patients for a variety of severe illnesses, typically affecting the bone marrow and circulating blood. Bone Marrow Transplant (BMT) is an older term often used interchangeably with HCT, and for the purposes of this paper no distinction between the two terms is made. The vast majority of patients transplanted at the Fox Chase - Temple Bone Marrow Transplant Program have hematologic malignancies such as Non-Hodgkin's Lymphoma (NHL), Hodgkin's Disease (HD), Multiple Myeloma (MM), Acute Myelogenous Leukemia (AML) and others. Transplantation for some types of solid tumors is done as well, although these constitute a small minority of cases. Because most of these illnesses are so severe, high dose chemotherapy and radiation are typically administered to treat the cancer, but this results in the destruction of the bone marrow, which must then be replaced. This process is prone to generating a variety of toxicities, many of which are quite severe or even fatal. Common examples of HCT-related toxicities are anemia, nausea, diarrhea and mucositis.

For the purpose of predicting adverse events it is useful to distinguish three main types of HCT: autologous, allogeneic-related and allogeneic-unrelated. In an autologous transplant, a patient's own blood stem cells are harvested before the preparative regimen (i.e. chemotherapy and/or radiation) is administered. The stem cells are then reinfused in order to reconstitute the marrow. In an allogeneic-related HCT, a member of the patient's family is the donor. The regimen is administered and then the related donor's cells are infused into the patient. The third form of HCT is required when a compatible donor cannot be found within the patient's family. This is the allogeneic-unrelated HCT, and it involves a search through an international registry of donors to find a match. The matched unrelated donor's (MUD) cells are then infused after the regimen is administered.

Autologous transplants are the most common and least risky form of transplant, followed by allogeneic-related transplants. The riskiest transplants are the allogeneic-unrelated transplants. The toxicities associated with each transplant are recorded by the data management team at the Fox Chase – Temple BMT Program. They are graded according to the Eastern Cooperative Oncology Group (ECOG) standard grading system. The ECOG grades are ordinal, ranging from zero to five. A grade of zero indicates no event, and a grade of five indicates that the event was fatal. A toxicity of grade 3 or higher is considered an adverse event in the model. The event descriptions are listed in Table 1.

PostersNESUG 2006

The goal of the model described in this paper is to compute a reasonable expected number of adverse events each month based on the characteristics of the patients under our care. According to Table 1 there were (678 + 577 + 19)/60 ≈ 21 adverse events per month on average during 2000-2004, at a transplant volume of about 5 patients per month.

GENERALIZED LINEAR MODELS AND PROC GENMOD

There have been many excellent papers and books written about generalized linear models. For a thorough technical discussion see the books by Agresti [1] or Myers [2]. For sources that describe using PROC GENMOD for generalized linear models see Allison [3] or Stokes et al. [5]. The discussion here is designed as a tutorial for those who have little or no familiarity with this procedure. In multiple linear regression, a response variable Y is related to a set of X-variables linearly as

$$Y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_{p-1} x_{i(p-1)} + \varepsilon_i = \mathbf{x}_i'\boldsymbol{\beta} + \varepsilon_i \tag{1}$$

for i = 1, 2, …, n observations. The errors from this model are assumed to be independent with zero mean and constant variance. The assumption of error normality is usually also added to enable one to construct hypothesis tests and confidence intervals for the parameters. A generalized linear model also describes a relationship between a response variable and one or more independent variables; however, the relationship may be much more complex than a simple linear one. As described by Myers [2], the generalized linear model is made up of three components.

The random component. This component consists of the response variable Y with observed values $Y_1, Y_2, \dots, Y_n$. These observations are mutually independent and come from a natural exponential family. This family is of the general form given in equation (2), where the parameter vector $\boldsymbol{\theta}_i$ may vary depending on the values of the covariates.

$$f(y_i; \boldsymbol{\theta}_i) = a(\boldsymbol{\theta}_i)\, b(y_i)\, \exp[\, y_i\, C(\boldsymbol{\theta}_i)\,] \tag{2}$$

For instance, a special case of this family is the Bernoulli distribution with 1 parameter as given by

$$f(y_i; p_i) = p_i^{y_i}(1 - p_i)^{1 - y_i} = (1 - p_i)\exp\left[\, y_i \log\left(\frac{p_i}{1 - p_i}\right) \right] \tag{3}$$
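To make the correspondence with equation (2) explicit, the terms of the Bernoulli density in equation (3) can be matched one by one, and the same exercise can be repeated for the Poisson distribution used later in this paper. The Poisson lines below are a standard result rather than something taken from the original text, and the symbol $\mu_i$ for the Poisson mean is introduced here for illustration.

$$\begin{aligned}
\text{Bernoulli:}\quad & a(p_i) = 1 - p_i, \quad b(y_i) = 1, \quad C(p_i) = \log\left(\frac{p_i}{1 - p_i}\right) \\
\text{Poisson:}\quad & f(y_i; \mu_i) = \frac{e^{-\mu_i}\mu_i^{y_i}}{y_i!} = e^{-\mu_i} \cdot \frac{1}{y_i!} \cdot \exp[\, y_i \log \mu_i \,] \\
& a(\mu_i) = e^{-\mu_i}, \quad b(y_i) = \frac{1}{y_i!}, \quad C(\mu_i) = \log \mu_i
\end{aligned}$$

In both cases C is the natural parameter: the log odds for the Bernoulli and the log mean for the Poisson, which is why the logit and log links are the customary choices for these two distributions.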

The parameter(s) in equations (2) and (3) are indexed by i because they can vary as a function of the covariates. In other words, because each observation can have different covariate values, the estimate of $\boldsymbol{\theta}_i$ can be different for each observation due to the dependence relationship specified in the model. In a simple linear regression it is only the response mean that can vary, but in a generalized linear model the variance can vary as well. That is, the assumption of variance homogeneity can be relaxed, although it is important to understand how the variance depends on the model data. Keep in mind that there is a distinction between the parameters of the response distribution, represented by $\boldsymbol{\theta}_i$, and the parameters of the regression equation covariates, represented by $\boldsymbol{\beta}$.

Table 1. Toxicity Frequencies (An Adverse Event Is a Toxicity of Grade 3 or Higher)

| Event Grade | Description of Toxicity | Aggregate Frequency (2000-2004) |
|---|---|---|
| 1 | Mild | 1,737 |
| 2 | Moderate | 889 |
| 3 | Severe | 678 |
| 4 | Life threatening | 577 |
| 5 | Fatal | 19 |
| - | Total Toxicities | 3,900 |

The systematic component. This is a function of the $x_{ij}$ that is linear in the parameters. If Y depends on several covariates then $\mathbf{X}_{n \times p}$ is often called the design matrix, with n observations corresponding to p variables, possibly including an intercept. These explanatory variables can be combinations of continuous variables, categorical variables and interactions. The function results in a vector called the linear predictor. The equation is:

$$\boldsymbol{\eta}_{n \times 1} = \mathbf{X}_{n \times p}\, \boldsymbol{\beta}_{p \times 1} \tag{4}$$

The final component is called the link function g. All link functions must be monotonic and differentiable, and they are often non-linear. This function relates the first two components to each other by specifying that

$$\eta_i = g(\mu_i) = g[E(Y_i)] \tag{5}$$
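As a concrete instance of equation (5), the log link used for the count models later in this paper can be written out as follows. This is a standard illustration rather than part of the original text; the one-unit comparison in the last line assumes that all other covariates are held fixed.

$$\begin{aligned}
\eta_i &= \log \mu_i = \mathbf{x}_i'\boldsymbol{\beta} \\
\mu_i &= \exp(\mathbf{x}_i'\boldsymbol{\beta}) = e^{\beta_0}\, e^{\beta_1 x_{i1}} \cdots e^{\beta_{p-1} x_{i(p-1)}} \\
\frac{\mu_i \text{ evaluated at } x_{ij} + 1}{\mu_i \text{ evaluated at } x_{ij}} &= e^{\beta_j}
\end{aligned}$$

Because the exponential function is always positive, the fitted mean count can never be negative, and each coefficient acts multiplicatively on the expected count.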

The normal distribution is a special case of the natural exponential family, and the assumption of normality plays a key role in the process of estimating and evaluating simple linear and multiple regression models. In the present case, however, where adverse event counts need to be modeled, other distributions from the exponential family are required to adequately represent the special nature of the data. One obvious requirement of the distribution is that it does not allow for a negative count value. A probability distribution for a count variable also will not typically have a constant variance: the variance of a count variable usually increases as its mean increases. The Poisson distribution is therefore a natural choice because its variance equals its mean, so a model with a Poisson response accommodates this behavior. Note that this contrasts with the case of multiple regression, where homogeneity of variance is assumed. One cannot have homogeneity of variance in a Poisson regression unless the mean always stays the same!

The negative binomial distribution is another distribution that is useful for modeling count data; it belongs to the natural exponential family when its dispersion parameter is held fixed. Although significantly more complicated than the Poisson, it can handle the situation called "overdispersion", in which the variance of the count variable is greater than its mean. The adverse events model illustration will use the Poisson model as well as the negative binomial model to demonstrate the two different approaches.

In the standard regression case, where the response is normally distributed and the other regression assumptions mentioned above are met, ordinary least squares (OLS) yields parameter estimates that are unbiased and are also maximum likelihood estimates (MLEs). They are also the Best Linear Unbiased Estimators (BLUE), a property that follows from the Gauss-Markov theorem and does not itself require normality. With non-normal responses, however, it is necessary to use an iterative algorithm such as iteratively re-weighted least squares instead of OLS to arrive at the MLEs. These estimators are not guaranteed to be unbiased as in the normal response case; however, in large samples they tend to have the smallest variance possible, and they are by construction the parameter values under which the observed sample is most probable. This is why some GENMOD output will mention iteratively re-weighted least squares as the estimation technique used.
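For readers who want to see what one iteration looks like, the following is a sketch of the iteratively re-weighted least squares update for the special case of a Poisson response with a log link. It is the textbook form of the algorithm, not a description of GENMOD's internal implementation; the superscript (t) for the iteration counter and the symbols z and W are notation introduced here.

$$\begin{aligned}
\eta_i^{(t)} &= \mathbf{x}_i'\boldsymbol{\beta}^{(t)}, \qquad \mu_i^{(t)} = \exp(\eta_i^{(t)}) \\
z_i^{(t)} &= \eta_i^{(t)} + \frac{y_i - \mu_i^{(t)}}{\mu_i^{(t)}}, \qquad \mathbf{W}^{(t)} = \operatorname{diag}(\mu_1^{(t)}, \dots, \mu_n^{(t)}) \\
\boldsymbol{\beta}^{(t+1)} &= (\mathbf{X}'\mathbf{W}^{(t)}\mathbf{X})^{-1}\, \mathbf{X}'\mathbf{W}^{(t)}\mathbf{z}^{(t)}
\end{aligned}$$

Each step is a weighted least squares regression of the working response z on X, with weights recomputed from the current fitted means; the iterations stop when the estimates no longer change appreciably.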

CHARACTERISTICS OF THE DATA

There is a significant amount of very high quality data collected on each patient treated by the BMT team. The main file (which we call the "core" file) consists of one observation for each patient-protocol combination. For example, one patient might be registered onto three treatment protocols over a period of time, and only one of these protocols will actually involve a transplant. Nevertheless that patient will have three observations in the database and may be at risk for an adverse event under any of the protocols. In reality the risk of an adverse event on any protocol other than a transplant protocol is quite small, and so only patients registered for transplants are included in the model.

We collect and quality-assure dozens of variables, including patient demographic information, disease status, treatment details, outcomes and more. SAS/AF and SCL are used to input the information into a dataset. This dataset is then checked at least once per day by an expert system written in Base SAS. This system checks the relationships of key variables to one another, and it also checks whether the magnitudes of the most important variables are reasonable within the context of the patient's current status. For instance, the code in Figure 1 prints a warning message if the data manager enters a suspicious value for the creatinine level of a transplanted patient.

    /* ------------------------ */
    /* CHECK CREATININE LEVELS  */
    /* ------------------------ */
    TITLE 'WARNING (CREAT) CREATININE LESS THAN .1 OR GREATER THAN 4.0';
    TITLE2 'WHEN BMT=YES';
    PROC PRINT DATA=BMT.CORE;
      WHERE ((CREAT<.1) OR (CREAT>4.0)) AND BMT='YES';
      VAR ROWNUM NAME TYPE TUPN DIGNOSIS BMT BMTDAT CREAT;
    RUN;

We have defined a reporting period to be approximately one month. This coincides with our monthly quality management meetings and with the historical cycle of monthly permanent backups of the core file. Although our current backup strategy involves multiple daily, weekly and monthly backups, some of the old monthly backups were not done at precise monthly intervals, and so we must recognize and adjust for the fact that the exposure to adverse event risk is different in these periods.
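The paper does not show how the adjustment for unequal reporting periods was made. One common way to account for differing exposure in a log-link count model is an offset, sketched below. The variables PERIOD_START, PERIOD_END, PERIOD_DAYS and LOGDAYS are hypothetical names introduced for this illustration, and the single covariate is used only to keep the example short.

```sas
/* Sketch only: PERIOD_START, PERIOD_END, PERIOD_DAYS and LOGDAYS are
   hypothetical variable names, not fields documented in the paper. */
DATA AEDATA_EXP;
  SET AEDATA;
  PERIOD_DAYS = PERIOD_END - PERIOD_START + 1;  /* days in this reporting period */
  LOGDAYS = LOG(PERIOD_DAYS);                   /* offset must be on the log scale */
RUN;

PROC GENMOD DATA=AEDATA_EXP;
  /* OFFSET= adds LOGDAYS to the linear predictor with its coefficient fixed at 1 */
  MODEL AECOUNT3 = TRI / DIST=POISSON LINK=LOG OFFSET=LOGDAYS;
RUN;
```

With the offset in place the model describes the expected count per day of exposure, so a 35-day period and a 28-day period are compared on the same footing. If a patient was under treatment for only part of a period, the number of days actually at risk would be the more appropriate exposure.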

THE VARIABLES USED

The BMT team has limited resources available to collect adverse event data, and so we cannot record the exact date or exact cause of the events. (Sometimes it is virtually impossible to pin down such information regardless of the resources devoted to the task.) In addition, only the maximum grade experienced by a patient within each particular adverse event category is captured. For these reasons, we make assumptions regarding the distribution of the events between the ending dates of each reporting period. In particular, it is assumed that an event reported as of the end of a reporting period was equally likely to have happened on any day within that period on which the patient was under treatment. If the reporting period was 30 days and the patient was being treated under a protocol for the entire period, then the probability is 1/30 that the event happened on any particular day within the period.

Using this assumption, it was possible to construct a graph of the events as a function of days from the initiation of treatment for each patient. In virtually all cases the initiation of treatment is the start of the conditioning regimen, i.e. chemotherapy with or without radiation. The interest of the quality management team is highly focused on extremely severe adverse events. A plot of toxicities expressed as a rate per 100 patients per day is shown in Chart 1, with autologous, allogeneic-related and allogeneic-unrelated (MUD) events plotted separately.

RESPONSE VARIABLE: ADVERSE EVENTS (AECOUNT)

The adverse event variable is defined as the number of adverse events experienced by a patient in the reporting period under study. Each patient/month combination is a separate observation.

Chart 1. Estimated Historical Toxicity Rate

[Chart: toxicity rate per 100 surviving patients per day (vertical axis, 0 to 5) plotted against days after Rx initiated (horizontal axis, 0 to 500), with separate series for Allogeneic - MUD, Allogeneic - Other and Autologous transplants.]

COVARIATE 1 - TEMPORAL RISK INDEX (TRI)

Chart 1 shows graphically the tendency of the risk to start low and rise to a peak at about 50 to 75 days, depending on the transplant type. The risk tends to trail off at about 100 days, gradually subsiding over the subsequent 400-day period. This result is clinically reasonable. The process of conditioning and cell infusion occurs in the early stages of treatment and then, ideally, the gradual recovery of the patient commences once the blood cells (particularly neutrophils and platelets) start to be produced by the patient without the need for artificial support.

We fit a number of candidate curves to these data in order to find one that was simple, intuitive and a good fit to all 3 types of transplants. Because of the apparent curvilinear risk shape in the early stage, and the tapering off of risk after day 100, an inverse quadratic curve emerged as the best fit to the data. A predictor called the temporal risk index (TRI) was constructed based on this curve, although we knew that temporal risk was not the only factor affecting adverse events. The inverse quadratic equation and a plot of the fit to autologous toxicities are displayed below. Despite its relatively simple structure, this curve explained 95% of the toxicity variability in the AUTO, 85% of the variability in the ALLO-OTHER, and 72% of the variability in the ALLO-MUD transplants, and it yielded reasonably consistent parameter estimates across the different types of transplants as well as for different subsets of data. These characteristics reduce the possibility that the curves are simply fitting noise. We are also attempting to obtain data from national and international BMT registries to further validate this functional form for the relationship.

This curve can be fit using PROC NLIN in SAS. We used all toxicities instead of just adverse events in this stage in order to get a greater volume of data to work with. The resulting parameters are listed in Table 2 below.
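The paper does not list the PROC NLIN code. A minimal sketch of how an inverse quadratic curve of this kind might be fit is shown here; the data set TOXRATE and the variables RATE and DAY are hypothetical names, and the starting values are rough guesses chosen for illustration only.

```sas
/* Sketch only: TOXRATE, RATE and DAY are hypothetical names for a data set
   holding the daily toxicity rate (per 100 patients) by days since treatment
   began, for one transplant type. Starting values are rough guesses. */
PROC NLIN DATA=TOXRATE;
  PARMS A=2 B=-0.1 C=0.001;                  /* initial values for the iterative search */
  MODEL RATE = 1 / (A + B*DAY + C*DAY*DAY);  /* inverse quadratic curve */
RUN;
```

The procedure would be run once per transplant type (for example with a BY statement on a sorted transplant-type variable) to obtain a separate set of coefficients for each.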

Chart 2. Inverse Quadratic Fit to Toxicity Data

Table 2. Coefficients Estimated in the Inverse Quadratic Fit: $\text{rate} = 1/(a + bt + ct^2)$

| Coefficient | AUTO | ALLO - other | ALLO - MUD |
|---|---|---|---|
| a | 2.1773 | 0.7190 | 0.6672 |
| b | -0.0908 | -0.0181 | -0.0130 |
| c | 0.0011 | 0.00021 | 0.00013 |
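One property of the inverse quadratic form may help in reading Table 2: the rate peaks where the denominator is smallest. The worked values below take the rounded AUTO coefficients at face value, so they are only approximate (c in particular is reported to two significant figures), and the symbol $t^*$ is notation introduced here.

$$\begin{aligned}
\frac{d}{dt}\left(a + bt + ct^2\right) = 0 \;&\Rightarrow\; t^* = -\frac{b}{2c} \\
\text{AUTO:}\quad t^* &= \frac{0.0908}{2(0.0011)} \approx 41 \text{ days} \\
\text{rate}(t^*) &= \frac{1}{a - b^2/(4c)} = \frac{1}{2.1773 - 1.8738} \approx 3.3 \text{ per 100 patients per day}
\end{aligned}$$

The peak of the fitted curve need not coincide exactly with the visual peak of the raw data in Chart 1. Because c is positive, the rate declines toward zero as t grows, which matches the tapering of risk described above.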

COVARIATE 2 - LENGTH OF TIME AS INPATIENT (LTIP)

An explanatory variable based on the inpatient length of stay (LOS) was included as a proxy for illness severity and complications during the admission. It is defined as:

LTIP = 0, if the patient has not yet been discharged in the current reporting period
LTIP = LOS for the transplant admission, if the patient was discharged during or prior to the current reporting period
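The definition above translates directly into a DATA step. The sketch below assumes one observation per patient and reporting period; DISCHDT (discharge date), PERIOD_END (last day of the reporting period) and LOS (length of stay in days) are hypothetical variable names, not fields documented in the paper.

```sas
/* Sketch only: DISCHDT, PERIOD_END and LOS are hypothetical names. */
DATA AEDATA_LTIP;
  SET AEDATA;
  IF MISSING(DISCHDT) OR DISCHDT > PERIOD_END THEN LTIP = 0;  /* not yet discharged */
  ELSE LTIP = LOS;                          /* discharged in or before this period */
RUN;
```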

COVARIATE 3 – FOLLOW UP FLAG (FLFLAG)

Adverse events are recorded into the database as they are observed by the transplant team. The frequency of follow-ups typically diminishes over time. Because we have fewer chances to observe a patient at distant time points after transplant, any adverse events are more likely to be recorded in the month in which a follow up visit has occurred. This variable is a flag that indicates whether or not a follow up visit has occurred in the current reporting period.

COVARIATE 4 - OBSERVER FLAG (OBSERVER)

A personnel change took place in the BMT program during which the responsibility for assigning toxicity grades changed from one group of observers to another. This kind of change almost always introduces an observer bias, and our adverse events data capture is no exception. Therefore a dummy variable called OBSERVER was included in the model to account for the bias.

COVARIATE 5 - DIAGNOSIS/REGIMEN FLAG (DREG)

One combination of diagnosis and regimen that was suspected to be particularly prone to generating adverse events was renal cell carcinoma treated with fludarabine and total body irradiation as a preparative regimen. This indicator variable was included to assess the impact of this particular situation on adverse event risk.

COVARIATE 6 - TRANSPLANT YEAR (BMTYR)

Over time, improvements in transplant technology, training and process control should enable us to reduce the incidence of adverse events after adjusting for the other covariates. Examples of improvements in the area of supportive care include improved antibiotics and antiviral medications and better screening for viruses so that we can treat problems sooner. Treating problems sooner can keep toxicities that are at level 1 or 2 from becoming severe toxicities (i.e. adverse events). Including the transplant year variable allows us to test for evidence of this effect, and if there is improvement over time, it can indicate approximately how much improvement has occurred.

MODELS PRODUCED BY GENMOD

We will now present two models that can be produced by PROC GENMOD to analyze the adverse event data. The first is the Poisson model and the second is the negative binomial model. There are many other approaches to modeling these data that may or may not work better; however, this paper will focus on these two approaches to illustrate and contrast the techniques as implemented in the GENMOD procedure. For a discussion of the event history approach and PROC PHREG see chapter 5 in Allison [3], and for a description of a new SAS procedure called GLIMMIX that can estimate generalized linear mixed models see the SUGI 30 paper by Oliver Schabenberger [4] of the SAS Institute. A key thing to remember is that all of these approaches will be superior to using standard multiple regression for the reasons stated above.

The ODS statements invoke ODS statistical graphics, which is experimental in version 9.1.3 of SAS. The examples below show how to evaluate the link function by using the ASSESS statement within the GENMOD procedure. The ASSESS statement works in tandem with ODS statistical graphics to produce the assessment graphics (Charts 3-5) displayed later in this paper. According to the SAS documentation for version 9.1.3, the ASSESS statement implements the following idea:

"Lin, Wei, and Ying (2002) present graphical and numerical methods for model assessment based on the cumulative sums of residuals over certain coordinates (e.g., covariates or linear predictors) or some related aggregates of residuals. The distributions of these stochastic processes under the assumed model can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be generated by simulation. Each observed residual pattern can then be compared, both graphically and numerically, with a number of realizations from the null distribution. Such comparisons enable you to assess objectively whether the observed residual pattern reflects anything beyond random fluctuation. These procedures are useful in determining appropriate functional forms of covariates and link function."
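In symbols, the idea in the quoted passage can be sketched as follows for a single continuous covariate. The notation is a paraphrase of Lin, Wei and Ying (2002) and is not given in the paper itself: $e_i$ is the raw residual for observation i, $I(\cdot)$ is an indicator function, and $S_j$ is a label introduced here for the supremum statistic.

$$\begin{aligned}
W_j(x) &= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} I(x_{ij} \le x)\, e_i, \qquad e_i = y_i - \hat{\mu}_i \\
S_j &= \sup_x \left| W_j(x) \right|
\end{aligned}$$

The observed path $W_j(x)$ is what an assessment chart plots against x, and the p-value reported with the chart is the proportion of simulated paths whose supremum is at least as large as the observed $S_j$. For the link function assessment the same sum is taken over the fitted linear predictor in place of the covariate.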

MODEL 1 – POISSON REGRESSION

The code for this model is:

    /* --------------------- */
    /* POISSON REGRESSION    */
    /* --------------------- */
    ODS LISTING CLOSE;
    ODS RTF;
    ODS GRAPHICS ON;
    PROC GENMOD DATA=AEDATA;
      CLASS FLFLAG OBSERVER DREG;
      MODEL AECOUNT3 = TRI FLFLAG LTIP OBSERVER DREG BMTYR / DIST=POISSON LINK=LOG;
      ASSESS LINK / RESAMPLE=10000;
    RUN;
    QUIT;
    ODS GRAPHICS OFF;
    ODS RTF CLOSE;
    ODS LISTING;
    /* --------------------- */

The ASSESS statement checks the aptness of the functional form of the link or of one of the continuous covariates. The analysis centers on whether the simulated residual patterns that would be generated by the model under the specified assumptions are statistically different from the one actually generated. The actual pattern is printed in bold while the simulations are represented by dotted lines. If the p-value is quite low, say p<.05, then there is cause for concern that the functional form being used is less than optimal, because the actual residual pattern differs from the expected patterns generated by simulation. It is best to have p-values greater than .2. This is just an introduction to the ASSESS statement. For more on its capabilities please see the SAS documentation for version 9.1.3. Excerpts from the Poisson regression run are listed below:

Table 3 – Goodness of Fit Statistics for the Poisson Regression Model

Criteria For Assessing Goodness Of Fit

| Criterion | DF | Value | Value/DF |
|---|---|---|---|
| Deviance | 1080 | 858.9736 | 0.7953 |
| Scaled Deviance | 1080 | 858.9736 | 0.7953 |
| Pearson Chi-Square | 1080 | 1636.7203 | 1.5155 |
| Scaled Pearson X2 | 1080 | 1636.7203 | 1.5155 |
| Log Likelihood | | -502.7610 | |

Table 4 - Analysis of Parameter Estimates for the Poisson Regression Model

Analysis Of Parameter Estimates

| Parameter | Level | DF | Estimate | Standard Error | Wald 95% Lower Limit | Wald 95% Upper Limit | Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|---|---|---|
| Intercept | | 1 | 305.3136 | 167.9482 | -23.8589 | 634.4861 | 3.30 | 0.0691 |
| TRI | | 1 | 0.0868 | 0.0064 | 0.0742 | 0.0994 | 182.98 | <.0001 |
| FLFLAG | 0 | 1 | -0.3898 | 0.1158 | -0.6167 | -0.1629 | 11.34 | 0.0008 |
| FLFLAG | 1 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
| LTIP | | 1 | 0.9468 | 0.0937 | 0.7632 | 1.1304 | 102.18 | <.0001 |
| OBSERVER | 0 | 1 | 0.7258 | 0.2263 | 0.2822 | 1.1694 | 10.29 | 0.0013 |
| OBSERVER | 1 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
| DREG | 01 REN_FLUTBI | 1 | 1.5146 | 0.4789 | 0.5759 | 2.4532 | 10.00 | 0.0016 |
| DREG | 99 OTHER | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
| BMTYR | | 1 | -0.1554 | 0.0839 | -0.3198 | 0.0089 | 3.44 | 0.0638 |
| Scale | | 0 | 1.0000 | 0.0000 | 1.0000 | 1.0000 | | |
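Because the model uses a log link, exponentiating a coefficient in Table 4 gives the multiplicative change in the expected adverse event count, holding the other covariates fixed. The ratios below are simple arithmetic on the Table 4 estimates and are offered as a reading aid, not as results reported by the authors.

$$\begin{aligned}
\text{TRI (per unit):}\quad & e^{0.0868} \approx 1.09 \\
\text{FLFLAG (0 vs. 1):}\quad & e^{-0.3898} \approx 0.68 \\
\text{OBSERVER (0 vs. 1):}\quad & e^{0.7258} \approx 2.07 \\
\text{DREG (REN\_FLUTBI vs. OTHER):}\quad & e^{1.5146} \approx 4.55
\end{aligned}$$

For example, a reporting period without a follow up visit (FLFLAG = 0) is associated with roughly 32% fewer recorded adverse events than a period with a visit, which is consistent with the reporting effect described for that covariate.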

Chart 3 – Assessment of the Poisson Model Link Function

MODEL 2 – NEGATIVE BINOMIAL REGRESSION

    /* ------------------------------- */
    /* NEGATIVE BINOMIAL REGRESSION    */
    /* ------------------------------- */
    ODS LISTING CLOSE;
    ODS RTF;
    ODS GRAPHICS ON;
    PROC GENMOD DATA=AEDATA;
      CLASS FLFLAG OBSERVER DREG;
      MODEL AECOUNT3 = TRI FLFLAG LTIP OBSERVER DREG BMTYR / DIST=NB;
      ASSESS VAR=(TRI) / RESAMPLE=10000;
    RUN;
    QUIT;
    ODS GRAPHICS OFF;
    ODS RTF CLOSE;
    ODS LISTING;
    /* ------------------------------- */

Notice in the negative binomial code that DIST=NB is specified and no link function is specified. The default link function for a negative binomial regression is the log, so its specification may be omitted. Also notice that the ASSESS statement is now being used to evaluate the temporal risk index (TRI). You may assess any of the continuous variables in the model or you may assess the link function using this same statement, but not at the same time!

Table 5 – Goodness of Fit Statistics for the Negative Binomial Model

Criteria For Assessing Goodness Of Fit

| Criterion | DF | Value | Value/DF |
|---|---|---|---|
| Deviance | 1080 | 534.5730 | 0.4950 |
| Scaled Deviance | 1080 | 534.5730 | 0.4950 |
| Pearson Chi-Square | 1080 | 1209.1388 | 1.1196 |
| Scaled Pearson X2 | 1080 | 1209.1388 | 1.1196 |
| Log Likelihood | | -460.0776 | |

MODEL ANALYSIS

Both the Poisson and the negative binomial models found the temporal risk and the length of stay to be very significant adverse event predictors. Other variables that have an impact are FLFLAG, OBSERVER and DREG, although these variables are somewhat less important or only borderline significant. The DREG variable affects only a relatively small proportion of our patients, and so even though it is statistically significant it is not as useful as one might think without knowing the data. Unfortunately the BMT year variable, BMTYR, is only borderline significant in the Poisson model and is clearly not significant in the negative binomial model. The same can be said of the intercept estimate. It is possible to re-estimate these models using the NOINT option in the MODEL statement.

Tables 3 and 5 display goodness of fit statistics for the Poisson and negative binomial models respectively. Models that fit well have "Value/DF" entries close to 1. Notice that the Pearson chi-square has a ratio of 1.5155 for the Poisson model and 1.1196 for the negative binomial model. The ratio is closer to 1 in the negative binomial case because the negative binomial distribution allows for a variance greater than the mean (overdispersion), whereas the Poisson distribution requires that the variance equal the mean. This is the reason the scale parameter is set to 1.0 in the Poisson model, while the negative binomial model has a dispersion parameter estimate instead of a scale parameter estimate. The adverse event data are overdispersed, and Table 6 shows that the dispersion parameter is 1.4709 in the negative binomial model. Note that a dispersion parameter does not exist in the corresponding location in the Poisson parameter estimates table, Table 4. Instead the Poisson model displays a scale parameter that stays fixed at 1.

Table 6 – Analysis of Parameter Estimates for the Negative Binomial Model

Analysis Of Parameter Estimates

| Parameter | Level | DF | Estimate | Standard Error | Wald 95% Lower Limit | Wald 95% Upper Limit | Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|---|---|---|
| Intercept | | 1 | 235.1339 | 228.7798 | -213.266 | 683.5340 | 1.06 | 0.3041 |
| TRI | | 1 | 0.0853 | 0.0079 | 0.0697 | 0.1009 | 115.15 | <.0001 |
| FLFLAG | 0 | 1 | -0.4321 | 0.1603 | -0.7464 | -0.1179 | 7.26 | 0.0070 |
| FLFLAG | 1 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
| LTIP | | 1 | 1.1105 | 0.1765 | 0.7645 | 1.4564 | 39.59 | <.0001 |
| OBSERVER | 0 | 1 | 0.7349 | 0.2878 | 0.1708 | 1.2991 | 6.52 | 0.0107 |
| OBSERVER | 1 | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
| DREG | 01 REN_FLUTBI | 1 | 1.9887 | 0.6678 | 0.6798 | 3.2976 | 8.87 | 0.0029 |
| DREG | 99 OTHER | 0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
| BMTYR | | 1 | -0.1207 | 0.1142 | -0.3445 | 0.1032 | 1.12 | 0.2908 |
| Dispersion | | 1 | 1.4709 | 0.2956 | 0.8916 | 2.0502 | | |
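To see what the dispersion estimate in Table 6 implies, recall that GENMOD's negative binomial model uses the variance function $\mu + k\mu^2$, where k is the dispersion parameter. The illustration below plugs in the Table 6 estimate; the two example mean values are chosen here purely for illustration and do not come from the paper.

$$\begin{aligned}
\text{Poisson:}\quad & \operatorname{Var}(Y_i) = \mu_i \\
\text{Negative binomial:}\quad & \operatorname{Var}(Y_i) = \mu_i + k\mu_i^2, \qquad \hat{k} = 1.4709 \\
\mu_i = 0.5:\quad & 0.5 + 1.4709(0.25) \approx 0.87 \\
\mu_i = 2:\quad & 2 + 1.4709(4) \approx 7.88
\end{aligned}$$

The excess over the Poisson variance grows with the square of the mean, so overdispersion matters most for the patient-months with the highest expected counts. The Wald limits for the dispersion in Table 6 (0.8916 to 2.0502) exclude zero, the value at which the negative binomial collapses to the Poisson.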

Chart 4 – Assessment of the Negative Binomial Model’s Temporal Risk Index

Chart 5 – Assessment of the Link Function for a Multiple Regression Model

The link function assessment graph (Chart 3) shows that the log link is an excellent choice for the functional form of the link in this particular Poisson regression. For variety we showed the TRI assessment (Chart 4) for the negative binomial model. The verdict here is less clear, as there seems to be some divergence from the simulated cumulative errors at a TRI value of around 22. Nevertheless we cannot reject the null hypothesis that the cumulative error pattern for TRI is consistent with the random fluctuation we would expect if the functional form used were appropriate.

In order to assess an entire model one could produce these cumulative residual graphs for all covariates in the model as well as for the link function. Certain types of covariate misspecification will be readily apparent from the distinctive pattern produced in the cumulative residual graph. For instance, if a variable should be included in the model as log(X) but is instead included as X, then the graph will exhibit a distinct pattern that starts out sloping downward, then slopes upward to a peak, and then slopes downward again, like a sideways "S". Examples of these patterns are given in the ODS statistical graphics documentation for the GENMOD procedure in SAS 9.1.3.

For comparison purposes we ran a standard multiple regression analysis on the data and plotted the link function assessment graph. (To save space we have not shown the SAS code for this.) The link assessment is displayed in Chart 5. Note that the p-value is less than .0001, indicating a severe departure of the actual cumulative residuals from the simulated residuals. The chart shows that multiple regression is clearly not appropriate for these data. As an aside, PROC GENMOD and PROC REG yield virtually identical results when the response distribution is set to normal and the link is set to identity in GENMOD, but the link assessment is possible only in GENMOD.
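The authors omitted the code for the multiple regression comparison. A plausible reconstruction, assuming the same data set and covariates as Model 1, would simply swap the distribution and the link; it is shown here as a guess at the setup, not as the code that produced Chart 5.

```sas
/* Sketch only: assumes the same AEDATA data set and covariates as Model 1 */
ODS RTF;
ODS GRAPHICS ON;
PROC GENMOD DATA=AEDATA;
  CLASS FLFLAG OBSERVER DREG;
  MODEL AECOUNT3 = TRI FLFLAG LTIP OBSERVER DREG BMTYR
        / DIST=NORMAL LINK=IDENTITY;   /* ordinary multiple regression as a GLM */
  ASSESS LINK / RESAMPLE=10000;        /* cumulative residuals over the linear predictor */
RUN;
ODS GRAPHICS OFF;
ODS RTF CLOSE;
```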

CONCLUSION

The Poisson and negative binomial regression approaches to modeling adverse events were discussed in this paper. Either approach has predictive value and is far superior to using standard multiple regression. When overdispersion exists in the Poisson approach it may be more appropriate to use a model based on a negative binomial response. The price you pay for using a negative binomial model is the additional complexity of the response distribution; however, this additional complexity is worthwhile when the problem of overdispersion is pronounced.

These models tell us how many adverse events we should expect to be reported during a reporting period. That was the main goal. The models are not intended primarily for making clinical inferences. For example, the timing of a follow up visit has nothing to do with the adverse event risks our patients face, but it DOES say that an adverse event, if it occurred, is more likely to be reported to us during a period in which a follow up visit took place. If an effect of BMT year on adverse events exists, it is too small to be detected by these models. The fact that the parameter estimate was -.1554 and the p-value was .06 in the Poisson model was encouraging, but we clearly cannot draw any conclusions here without further information and additional analysis.

One must always be cognizant of multiple comparison issues when building regression models. Rigorous validation and model assessment techniques should be employed to assure that significant variables truly are significant and that the analyst is not simply modeling "noise". This is especially important in borderline significance cases such as we have with the BMTYR variable.

A future direction for studying adverse events would be to implement a generalized estimating equations (GEE) adjustment. According to Allison [3] this technique "allows for correlations in the dependent variable across observations". In the present case this means that the technique would adjust for correlation over time within each patient's transplant data. Such correlations violate the assumption of independence on which many of the formulas are based, and a GEE adjustment would reduce the impact of this violation. It would also be useful to try an event history approach (i.e. survival analysis) to model adverse events. Whereas the generalized linear model assumes that the response variable comes from an exponential family of distributions, that the responses are independent and that a specific link function applies, the event history approach has proportional hazards as a central assumption. In addition, the event history approach utilizes partial likelihood as opposed to the full maximum likelihood used in generalized linear models. These differences in the two approaches may yield somewhat different inferences.
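Returning to the main goal stated above, the expected number of adverse events for a reporting period is the sum of the fitted means over the patient-month observations in that period. The paper does not show this step; the sketch below is one way it might be done. PERIOD is a hypothetical variable identifying the reporting period, and PRED_OUT and EXPECTED are illustrative names.

```sas
/* Sketch only: PERIOD is a hypothetical variable identifying the reporting
   period of each patient/month observation; PRED_OUT and EXPECTED are
   illustrative names. */
PROC GENMOD DATA=AEDATA;
  CLASS FLFLAG OBSERVER DREG;
  MODEL AECOUNT3 = TRI FLFLAG LTIP OBSERVER DREG BMTYR / DIST=NB;
  OUTPUT OUT=PRED_OUT PRED=EXPECTED;   /* fitted mean count for each observation */
RUN;

/* Sum observed and expected counts within each reporting period */
PROC MEANS DATA=PRED_OUT SUM;
  CLASS PERIOD;
  VAR AECOUNT3 EXPECTED;
RUN;
```

Comparing the two sums for the most recent period shows whether the team observed more or fewer adverse events than the model would predict for the patients under care.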

REFERENCES

1. Agresti, A. (1990). Categorical Data Analysis. New York: Wiley.

2. Myers, Raymond H. (1990). Classical and Modern Regression with Applications, 2nd Edition. Pacific Grove: Duxbury Press.

3. Allison, Paul D. (2005). Fixed Effects Regression Methods for Longitudinal Data Using SAS, 1st Edition. Cary, NC: SAS Institute Inc.

4. Schabenberger, Oliver (2005). "Introducing the GLIMMIX Procedure for Generalized Linear Mixed Models." Proceedings of the Thirtieth Annual SAS Users Group International Conference, Philadelphia, PA, Paper 196-30.

5. Stokes, Maura E., Davis, Charles S., and Koch, Gary G. (1995). Categorical Data Analysis Using the SAS System. Cary, NC: SAS Institute Inc. 499 pp.

6. Nelder, J.A., and Wedderburn, R.W.M. (1972). "Generalized Linear Models." Journal of the Royal Statistical Society, Series A, 135: 370-384.

7. Gardner, W., Mulvey, Edward P., and Shaw, Esther C. (1995). "Regression Analysis of Counts and Rates: Poisson, Overdispersed Poisson, and Negative Binomial Models." Psychological Bulletin, 118(3): 392-404.

RECOMMENDED READING

Cameron, A.C., and Trivedi, P.K. (1998). Regression Analysis of Count Data. Cambridge: Cambridge University Press.

Firth, D. (1991). "Generalized Linear Models," in Statistical Theory and Modeling, ed. Hinkley, D.V., Reid, N., and Snell, E.J. London: Chapman and Hall.

Lin, D.Y., Wei, L.J., and Ying, Z. (2002). "Model-Checking Techniques Based on Cumulative Residuals." Biometrics, 58: 1-12.

McCullagh, P., and Nelder, J.A. (1983). Generalized Linear Models. New York: Chapman and Hall.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the authors at:

Fox Chase – Temple BMT Program
7604 Central Ave
Philadelphia, PA 19111
Phone: (215) 214-3121
E-mail: [email protected]
Web: http://www.foxchasetemplebmt.org/content/home.asp

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
