Application of Kopec et al’s Recommendations on Validation

Philippe Finès
July 2010

2 Statistics Canada • Statistique Canada

Validation exercise

Using the recommendations formulated by Kopec et al., we examine whether they apply to our validation work on the smoking and lung cancer model.

3 objectives for this exercise:
• Evaluation of pertinence and applicability of recommendations (“validation of validation”)
• Summary of steps and issues involved in the validation of a complex simulation model
• Examination of the specific case of smoking and lung cancer


Lung Cancer Risk Equation

[Diagram: Smoking → cumulated cigarettes (10 years lagged) and Radon → cumulated radon exposure (10 years lagged), combined with background rates in the population net of radon and smoking, give P → lung cancer incidence → lung cancer death]

Whittemore Equation: P = (1+0.0031*R) * (1+0.00051*S) * B

P = probability of developing lung cancer (age, sex, province)
R = 10-year lagged cumulative radon exposure
S = 10-year lagged cumulative smoking exposure
B = background incidence rate (age, sex, province)
Note: B is obtained by calibrating to observed CCR rates for 2005.
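As a rough illustration of how the lagged terms enter the Whittemore equation, the Python sketch below computes a 10-year lagged cumulative exposure from a yearly history and plugs it into the equation. The function names, the list-of-yearly-values representation, the exact lag convention and the numbers in the usage note are assumptions made for illustration; only the coefficients 0.0031 and 0.00051 come from the slide.

```python
def lagged_cumulative(yearly_exposure, lag=10):
    """Sum exposure over all years except the most recent `lag` years.

    `yearly_exposure` holds one value per year, oldest first. This
    representation and lag convention are assumptions, not taken from
    the slides.
    """
    if lag <= 0:
        return float(sum(yearly_exposure))
    # Slicing off the last `lag` entries yields an empty list when the
    # history is shorter than the lag, so the result is 0.0 in that case.
    return float(sum(yearly_exposure[:-lag]))


def whittemore_probability(radon, smoking, background):
    """P = (1 + 0.0031*R) * (1 + 0.00051*S) * B, as written on the slide."""
    return (1 + 0.0031 * radon) * (1 + 0.00051 * smoking) * background
```

For a hypothetical history of 30 years at 100 units per year, `lagged_cumulative([100] * 30)` counts only the first 20 years and returns 2000.0; the most recent 10 years do not contribute to S yet.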


For former smokers, the relative risk (RR) of developing lung cancer, relative to current smokers (see table below), is applied to the smoking term in the Whittemore equation:

P = (1+0.0031*R) * (1+0.00051*S)*RR * B

Recall that S in this equation is cumulative smoking, 10-year lagged; so even if a person quits today, and even with this reduced relative risk, S itself keeps going up for another 10 years.

RR by smoking status

Smoking status         Females   Males
Current smoker         1.0       1.0
Former, <10 years      0.69      0.66
Former, 10-19 years    0.21      0.44
Former, 20-29 years    0.05      0.20
Former, >=30 years     0.05      0.10
Never smoker           0.05      0.03

source: Peto et al. (2000)
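The adjusted equation and the RR values can be combined in a small lookup, sketched here in Python. The dictionary keys, function names and argument conventions are hypothetical; the RR values are copied from the table, and RR is placed as a multiplier after the smoking factor, exactly as the equation is written on the slide.

```python
# RR values copied from the table (source: Peto et al., 2000).
# The category keys are labels invented for this sketch.
RR_TABLE = {
    "F": {"current": 1.0, "former_lt10": 0.69, "former_10_19": 0.21,
          "former_20_29": 0.05, "former_ge30": 0.05, "never": 0.05},
    "M": {"current": 1.0, "former_lt10": 0.66, "former_10_19": 0.44,
          "former_20_29": 0.20, "former_ge30": 0.10, "never": 0.03},
}


def smoking_category(status, years_since_quit=0):
    """Map a status ('current', 'former', 'never') to a table category."""
    if status in ("current", "never"):
        return status
    if years_since_quit < 10:
        return "former_lt10"
    if years_since_quit < 20:
        return "former_10_19"
    if years_since_quit < 30:
        return "former_20_29"
    return "former_ge30"


def probability_with_rr(radon, smoking, background, sex, status,
                        years_since_quit=0):
    """P = (1 + 0.0031*R) * (1 + 0.00051*S) * RR * B, as on the slide."""
    rr = RR_TABLE[sex][smoking_category(status, years_since_quit)]
    return (1 + 0.0031 * radon) * (1 + 0.00051 * smoking) * rr * background
```

A woman who quit 12 years ago falls in the "former 10-19 years" category, so her RR is 0.21.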


Evidence from examining model development process (17 criteria)

1. Conceptual model

Topic Recommendations Application for LC/smoking model

1.1 Underlying theories

The conceptual model should be based on an accepted theory of the phenomena under study. If an accepted theory is not available, this limitation of the model should be acknowledged.

For LC incidence, we used the Whittemore model with the Peto adjustment for former smokers.
For LC death, we used observed survival rates.

1.2 Definitions of variables

Definitions of the variables in the model should be justified. Evidence that the definitions are acceptable should be provided (e.g., a reference to published and/or generally accepted clinical criteria or results from validation studies).

- Smoking status
- Cumulated pack-years of cigarettes
- Lung cancer incidence
- Lung cancer death
All these concepts are well defined.

1.3 Model content and structure

Evidence should be provided that the model is sufficiently complete and that the relationships between the variables in the model are correctly specified. If some variables or interactions are omitted, explanations should be given why this is acceptable and does not invalidate the results.

Our challenge comes from the fact that no previous research combines all the elements that we want to model: impact of sex, smoking, quitting, radon, LC incidence, LC death. We were not able to validate our model directly against a single source. Our model is therefore a first attempt at a comprehensive model of LC.

Table 2: Summary and recommendations (1)


2. Parameters

Topic Recommendations Application for LC/smoking model

2.1 Parameters obtained from experts

The process of parameter elicitation should be described (number of experts, their areas of expertise, questions asked, how the responses were converted to a parameter). Plausibility of the parameter value(s) should be assessed by independent experts. Comparisons should be made with other sources (if available) and the differences explained.

Not used in this part of the model

2.2 Parameters obtained from the literature

Quality of the source should be ascertained. If available, a published meta-analysis should be used, but a single high-quality study may be an alternative. If information from several sources is combined, the methodology should be explained. Comparisons should be made with alternative sources and discrepancies explained. If alternative sources are not available, plausibility of the parameter values should be assessed by independent experts.

Parameters for smoking and for radon from Whittemore
Parameters for quitting from Peto

Table 2: Summary and recommendations (2)


2. Parameters

Topic Recommendations Application for LC/smoking model

2.3 Parameters obtained from data analysis

Validity evidence regarding the data and methods of analysis should be equivalent to that required for a publication in a scientific peer-review journal. The results should be compared with estimates from other sources and (if not available) expert opinion. Evidence to support generalizability of the parameters to the population modeled should be provided.

Baseline incidence rates from (calibration and) CCR
Death rates from CCR

Our leap of faith was that data obtained from male miners would be generalizable to the general population (males and females).

2.4 Parameters obtained through calibration

Calibration methodology should be reported in detail (target data, search algorithm, goodness-of-fit metrics, acceptance criteria, and stopping rule). Plausibility of the parameters derived through calibration should be evaluated by independent experts and their values compared with external data (if available).

Baseline incidence rates from calibration with 2005 (and CCR).
The calibration technique is described somewhere, but in an informal and incomplete manner.
By construction, calibration makes the simulated data fit year 2005 perfectly, so there is no need for external expertise.
It is possible to redo the calibration for another year (e.g. 1995).
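The slides do not give the calibration algorithm, so the following Python sketch is only one plausible reading: within a stratum (age, sex, province), B is rescaled in one step so that the mean simulated probability equals the observed rate for the calibration year. Because P is proportional to B, this reproduces the target exactly, which is consistent with the remark that the fit is perfect by construction. The function names and the idea of passing in precomputed per-person multipliers are assumptions.

```python
from statistics import mean


def calibrate_background(observed_rate, multipliers):
    """Return B such that the mean of B * m over the stratum equals observed_rate.

    `multipliers` holds, for each simulated person in the stratum, the
    product (1 + 0.0031*R) * (1 + 0.00051*S) * RR. This one-step rescaling
    is an assumed method, not the documented POHEM calibration.
    """
    return observed_rate / mean(multipliers)


def simulated_rate(background, multipliers):
    """Mean probability of lung cancer in the stratum for a given B."""
    return mean(background * m for m in multipliers)
```

Redoing the calibration for another year, as suggested above, would amount to calling `calibrate_background` with that year's observed rate and that year's simulated multipliers.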

Table 2: Summary and recommendations (3)


Table 2: Summary and recommendations (4)

3. Computer implementation

Topic Recommendations Application for LC/smoking model

3.1 Selection of model type

A justification for the selected model type should be provided (micro- vs. macro-level simulation; discrete vs. continuous time models; interacting agent vs. non-interactive models, etc.). Whether or not the type of model is appropriate should be determined by independent experts.

Architecture and general framework of POHEM:
- Described elsewhere (e.g. Wolfson; Berthelot et al.), but not specifically for this application
- Not questioned (because we wanted to increase and exploit our expertise in this type of simulation)

3.2 Simulation software

Information should be provided on the simulation software and programming language. The choice of software should be justified. If available, an established and well-supported simulation platform should be used.

Same comments as for 3.1

3.3 Computer program

Independent experts should evaluate the key programming decisions and approaches used. The results of debugging tests should be documented and the source code should be made open to scrutiny by external experts.

As for POHEM, code should be available to users?


Table 2: Summary and recommendations (5)

4. Evidence from examining model performance

Topic Recommendations Application for LC/smoking model

4.1 Output plausibility

Plausibility (face validity) should be evaluated by experts in the subject matter area for a wide range of input conditions and output variables, over varying time horizons.

We use two circles of expertise:
- Inner circle (within STC)
- Outer circle (medical experts)
We do most of the validation with the inner circle. This reduces back-and-forth communication with external experts, whom we contact once sufficient work has been done within STC.

Most of the intermediate outputs (% of smokers, average number of cigarettes, etc.) fit perfectly, but past LC incidence is not reproduced correctly.

4.2 Internal consistency

Internal consistency should be assessed by considering functional and logical relationships between different output variables. Internal consistency should be tested under a wide range of conditions, including extreme values of the input parameters.

We are not able to reproduce past LC incidence, but we cannot conclude from this that internal consistency is not satisfied. It is either a problem of missing data (e.g. second-hand smoke, asbestos), non-generalizability (see above) or an incomplete model (as we said before, no research covers all the aspects simultaneously).
Several tests with a wide range of conditions were done, not in ModGen but in Excel.


Table 2: Summary and recommendations (6)

4. Evidence from examining model performance

Topic Recommendations Application for LC/smoking model

4.3 Parameter sensitivity analysis

Model validation should include uncertainty and sensitivity analyses of key parameters. Screening methods should be used to select the most influential parameters for more extensive analysis. If feasible, probabilistic uncertainty/sensitivity analysis is recommended. Interdependent parameters should be analyzed simultaneously. If parameters are estimated through calibration, the model should be recalibrated as part of uncertainty/sensitivity analysis. In probabilistic models, the Monte Carlo error should be estimated.

- We varied the smoking parameter from 1/10 to 10 times its value; with calibration, however, the results on LC incidence varied only slightly
- No need to determine the most influential parameters
- There is some interaction between sex, year and the smoking parameter, but it is not addressed in the way described here
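The limited effect of the smoking parameter once the model is recalibrated can be illustrated with a short Python sketch. It scales the smoking coefficient by several factors, recalibrates B each time so that a calibration-year population hits a fixed target rate, and then reports the simulated rate for a second population. The one-step recalibration, the function names and the omission of the radon and RR terms are simplifying assumptions; only the coefficient 0.00051 and the 1/10 to 10 range come from the slides.

```python
from statistics import mean

SMOKING_COEF = 0.00051  # smoking coefficient from the Whittemore equation


def mean_multiplier(exposures, coef):
    """Mean of (1 + coef * S) over a list of lagged cumulative exposures."""
    return mean(1 + coef * s for s in exposures)


def sensitivity(exposures_calib, exposures_other, target_rate,
                factors=(0.1, 1.0, 10.0)):
    """For each scaling factor, recalibrate B and return two simulated rates.

    Returns {factor: (rate in calibration year, rate in other year)}.
    Radon and RR are left out to keep the sketch short (an assumption).
    """
    results = {}
    for factor in factors:
        coef = SMOKING_COEF * factor
        # Recalibrate so the calibration-year rate equals the target.
        background = target_rate / mean_multiplier(exposures_calib, coef)
        results[factor] = (
            background * mean_multiplier(exposures_calib, coef),
            background * mean_multiplier(exposures_other, coef),
        )
    return results
```

In this sketch the calibration-year rate is identical for every factor, because recalibrating B absorbs the change in the coefficient; only the rate for the other population moves, and only to the extent that its exposure distribution differs from the calibration year.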

4.4 Between-model comparisons

Comparing the results of different models provides important evidence of validity and should be employed, if feasible. Between-model comparisons should take into account the extent to which models are developed independently. The impact of changing different aspects / components of the model should be evaluated.

- The model was compared with another candidate model (Bach), which had to be adapted to be comparable (e.g. rewriting the probability of LC incidence as an RR of LC relative to non-smokers). With this adaptation, results did not change drastically.
- The impact of quitting was compared with another candidate model (Freedman; Leffondré) in Excel.
- A comprehensive comparison of all available models is being developed.


Table 2: Summary and recommendations (7)

4. Evidence from examining model performance

Topic Recommendations Application for LC/smoking model

4.5 Comparisons with external data

Ideally, prospective data should be used for external validation. If prospective validation is not feasible, ex-post forecasting and backcasting based on historical data should be used to support predictive validity. Data used for validation should be different from data used in model development and calibration. Cross-validation and bootstrap methods should be considered as an alternative. Criteria for model acceptability should be specified in advance.

- For smoking prevalence: data used for development of the model (NPHS, CCHS) were not the same as data used for validation (Wolfson Institute)
- For LC incidence and LC death: data is only obtained as an output of the model
- Criteria for model acceptability: superimposition of observed and simulated data. No formal test.
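Since acceptability was judged by superimposing curves, with no formal test, the recommendation to specify criteria in advance could be met by something as simple as the following Python sketch. It is not what was done for this model: the choice of mean absolute percentage error, the function names and the 10% default threshold are all hypothetical.

```python
def mean_absolute_percentage_error(observed, simulated):
    """Average of |simulated - observed| / |observed| over paired points."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs(s - o) / abs(o) for o, s in zip(observed, simulated))
    return total / len(observed)


def is_acceptable(observed, simulated, threshold=0.10):
    """Apply a threshold fixed before looking at the results (hypothetical)."""
    return mean_absolute_percentage_error(observed, simulated) <= threshold
```

The point of the sketch is the order of operations: the metric and the threshold are chosen first, and the observed and simulated series (for example, yearly LC incidence) are compared afterwards.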


Table 2: Summary and recommendations (8)

5. Evidence from examining the consequences of model-based decisions

Topic Recommendations Application for LC/smoking model

5.1 Quality of decisions

Quality of decisions based on the model should be evaluated and compared with those based on alternative approaches to decision making, using both subjective and objective criteria.

- Comparative analysis of decision making should be done. A protocol could be devised for this purpose.

5.2 Usefulness

Uptake of a given model by policy makers should be monitored to assess model usefulness.

- As long as we are not able to reproduce the past well, we wonder whether the model is trustworthy enough to be used to project future outcomes (even though, in principle, errors should cancel out and one should be able to compare scenarios)
- Workshops have been organized on how to use the model


Conclusions

Evaluation of pertinence and applicability of recommendations (“validation of validation”)
• The tool is comprehensive and covers all aspects of validation

Summary of steps and issues involved in the validation of a complex simulation model
• We stress the fact that our model contains several nodes: some of them concern parameters only, some of them concern outcomes only.
• Therefore, the tool may be insufficient for such a complex model

Examination of the specific case of smoking and lung cancer
• We carried out many of the recommended steps and decided to go deeper into some of them to determine what goes wrong in the model.
• Because simulated data do not fit observed data and because there is no complete study that combines all the aspects simultaneously, we are no longer in the “validation” step, but are in fact doing a (meta-)data analysis.


References

Bach, Kattan, Thornquist, et al. (2003) Variations in lung cancer risk among smokers, Journal of the National Cancer Institute, 95:6, pp. 470-478 (with suppl.)

Berthelot, Le Petit, Flanagan (1997) Use of longitudinal data in health policy simulation models, Proceedings of the Section on Government Statistics and Section on Social Statistics, American Statistical Association, 1997:120-29

Freedman, Leitzmann, Hollenbeck, et al. (2008) Cigarette smoking and subsequent risk of lung cancer in men and women: analysis of a prospective cohort study, Lancet Oncology 9:649-656

Kopec, Finès, Manuel, et al. (2010) Validation of Population-Based Disease Simulation Models: A Review of Concepts and Methods

Leffondré, Abrahamowicz, Siemiatycki, Rachet (2002) Modeling smoking history: A comparison of different approaches, American Journal of Epidemiology 156:9, pp. 813-823

Peto, Darby, Deo, et al. (2000) Smoking, smoking cessation, and lung cancer in the UK since 1950: combination of national statistics with two case-control studies, BMJ, 321:323-329

Whittemore, McMillan (1983) Lung Cancer Mortality Among U.S. Uranium Miners: A Reappraisal, Journal of the National Cancer Institute, 71:3, pp. 489-499

Wolfson (1994) POHEM – a framework for understanding and modelling the health of human populations, World Health Stat Q 1994;47:157-76