Rigor, Reproducibility and Transparency Oh my! Sandra Taylor, Ph.D. Clinical and Translational Science Center University of California, Davis 11 January 2016


Rigor, Reproducibility and Transparency Oh my!

Sandra Taylor, Ph.D.

Clinical and Translational Science Center University of California, Davis

11 January 2016

Presenter
Presentation Notes
I am Sandy Taylor, a principal statistician with the CTSC, and I am going to talk about the NIH’s new requirements for proposals to address rigor, reproducibility and transparency – oh my! Much of what NIH wants to see is probably what you are already doing, so the new requirements may be more about “packaging” the proposal than about changing how you conduct research. Nevertheless, there may be some aspects of your work that you need to “tighten up” to meet the “rigor” standards.

Scientific Rigor Criterion

Application of the scientific method to ensure robust and unbiased results

– Experimental design

– Methodology

– Analysis

– Interpretation and reporting of results

Includes full transparency to allow reproducibility and extensions

Presenter
Presentation Notes
The whole intent of the “rigor criterion” is to ensure that we are doing “good” science that yields “reliable” results, that we document what we did well enough for others to reproduce the results, and that we report the findings fully and adequately. For proposals, this means explaining how these objectives will be met. The NIH “scientific rigor criterion” serves as the foundation for the “new” review standards. The criterion is the “strict application of the scientific method to ensure robust and unbiased results”. This standard is achieved through appropriate and rigorous experimental design, data collection methodology, data analysis, and interpretation and reporting of results. It includes full transparency about what was done, at a level sufficient to allow reproducibility and extensions.

“Robust and Unbiased”

Robust results are obtained using methods that

– Avoid bias

– Can be reproduced under well-controlled and reported experimental conditions

“Robust” and “Unbiased” results are goals, not absolute standards and may vary across scientific disciplines.

Presenter
Presentation Notes
NIH further elaborates on “robust” and “unbiased”. They say robust methods avoid bias and can be reproduced. Avoiding bias essentially means that the effects you see are attributable to the experimental manipulation and not the result of something else. Reproducibility encompasses both the conduct and reporting of experimental procedures. NIH notes that robust and unbiased results are goals and not absolute standards. There are circumstances where you can’t design and conduct a study to meet these objectives.

“Rigor” standard operates over entire project life

Experimental Design/Study Protocol

Data collection

Data analysis

Manuscript preparation

Studies need to be developed and proposals written with rigor standards in mind.

Presenter
Presentation Notes
What I want to emphasize here is that we need to consider “rigor” through the entire project. All aspects, from design to data collection to analysis, interpretation and reporting, influence a study’s rigor. I think most folks recognize the issues with experimental design and data collection, but rigor considerations extend to data analysis and manuscript preparation. Failures of rigor in these later stages can result from significance searching (i.e., looking for p < 0.05), dropping “outliers”, and reporting only statistically significant results. Studies need to be developed and proposals written with these standards in mind, and the standards then need to remain at the forefront during the actual conduct of the study.

Rigor and reproducibility start with the experimental design

Specify inclusion/exclusion criteria

Define and justify controls

Identify confounders and biases

– Integrate measures to avoid/reduce confounders/biases

Include biological and technical replicates

Include multiple reviewers

Define key terms, conditions, requirements up front

Presenter
Presentation Notes
It goes without saying that the study design and methodology need to be appropriate for and specific to the study aims. This typically isn’t a problem. You probably all are skilled in setting up a study specific to your hypotheses. I think most folks meet the “rigor” criterion in their work, but what is needed is to document that more explicitly in proposals. So it is mainly a packaging issue. Nevertheless, there are some experimental issues that you may not have explicitly considered in the past and/or that you may need to document better.

– Do specify and, if necessary, justify inclusion/exclusion criteria.

– Do carefully consider how you define controls. There may be multiple control groups.

– Do brainstorm about potential confounders and biases. Get other opinions, because other folks may identify confounders/biases that you don’t.

– Do include biological and technical replicates as appropriate for the study.

– Do include multiple reviewers/raters where applicable.

– Do define key terms, conditions and requirements up front, not when you are reviewing the data.

When putting together a proposal, you will want to explicitly address these points to the extent that they are applicable to your study design.

Key components for demonstrating “rigor”

Sample size/power analysis

Randomization

Blinding

Data handling plan

Statistical analysis plan

Presenter
Presentation Notes
In addition to tightening up some of the experimental design pieces I just highlighted, there are a few key components that should be included in many if not most proposals and that will bolster the proposal’s rigor. These are a sample size/power analysis, randomization, blinding, a data handling plan and a statistical analysis plan.

Sample size/power analysis

Ensure study has adequate power to detect meaningful and realistic differences

– Conduct while designing study

– Be study specific and use correct procedure

– Specify procedure (e.g., two-sample t-test) and assumptions (e.g., 80% power, 5% significance level, standard deviation of 1, difference of 2)

– Adjust for interim analyses and/or multiple primary endpoints

Presenter
Presentation Notes
Proposals typically need to have a section that justifies the proposed sample size. Sometimes a sample size is chosen by convention: “We have always used 10 mice”. Ten mice may be fine, but you need to justify it. You need to specify what procedure you used and the assumptions. For example: “Sample size was determined using a two-sample t-test at 80% power and a 5% significance level. The proposed sample size will be adequate to detect a mean difference of 2 assuming a standard deviation of 1 (citation).” Also, please address sample size during the development stage, not the day before the proposal is due. If you need help from a statistician, come talk to us earlier rather than later. Too often investigators come to us very late in the game and say “I just need to know how big a sample I need”, suggesting it is a trivial and unimportant piece of the proposal. I do think this is one area that will be easy for NIH to scrutinize more closely, so do it right.
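As a rough illustration of the kind of calculation described here, the sketch below estimates the per-group sample size for a two-sided comparison of two means. It is only a sketch under stated assumptions: it uses the normal approximation rather than the t distribution (so it tends to give a slightly smaller n than a t-based calculation, especially for small samples), and the function name `n_per_group` and the Bonferroni split across `n_endpoints` are illustrative choices, not part of the NIH guidance or the talk.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(diff, sd, power=0.80, alpha=0.05, n_endpoints=1):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Normal approximation only; a t-based calculation gives a somewhat
    larger n when the resulting sample size is small.
    """
    # Assumption: alpha is split evenly (Bonferroni) across primary endpoints.
    alpha_adj = alpha / n_endpoints
    z_alpha = NormalDist().inv_cdf(1 - alpha_adj / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / diff) ** 2)


# The slide's example assumptions: difference of 2, standard deviation of 1.
print(n_per_group(diff=2, sd=1))
# A smaller, harder-to-detect difference needs many more subjects.
print(n_per_group(diff=0.5, sd=1))
# Two primary endpoints, each tested at alpha / 2.
print(n_per_group(diff=2, sd=1, n_endpoints=2))
```

Whatever tool is used, the proposal should state the same ingredients the function takes as arguments: the test, the power, the significance level, the assumed standard deviation and the difference to be detected.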

Randomization

Important for

– Avoiding/reducing bias

– Reducing the likelihood of chance events impacting study results

Randomize wherever possible

– Treatment allocation

– Order of data collection (e.g., machine run order, order of evaluator review)

Presenter
Presentation Notes
Randomization is really important for yielding robust and unbiased results and can come into play multiple times in a study. The goal of randomization is to create experimental groups that don’t differ on average except for the assigned treatments. If subjects are assigned to treatments in a systematic way rather than at random, you might not end up with comparable groups. For example, at a medical clinic, suppose you decide to allocate the first 10 patients meeting inclusion criteria to one group and the next 10 to another group. You might end up with gender or age differences between the groups because of when people are able to come to the clinic. By randomly assigning, you avoid these types of issues. Randomization also helps reduce the potential impact of chance events on study results. Consider a thermostat problem: if you ran all the controls first and all the cases second, a temperature change could differentially affect one of the groups, whereas if run order were randomized the groups would be similarly affected on average. I think most investigators are attuned to randomization for treatment allocation, but it extends to other aspects of a study, such as the order in which samples are run or images are reviewed. For grants, make sure you explicitly state where you will randomize and how (e.g., will it be stratified in some way?). For a complex study design, you might want to consult with a statistician who can help develop a workable approach. If you can’t randomize, this should be acknowledged.
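To make the two uses of randomization above concrete, here is a minimal sketch of treatment allocation in permuted blocks and of a randomized run order. Permuted blocks are one common scheme, chosen here for illustration; the function names, arm labels, block size and seed are all hypothetical. For a stratified design, one option is to call `block_randomize` separately within each stratum.

```python
import random


def block_randomize(subject_ids, arms=("control", "treatment"),
                    block_size=4, seed=2016):
    """Assign subjects to arms in permuted blocks so group sizes stay balanced."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the list can be regenerated
    allocation = {}
    for start in range(0, len(subject_ids), block_size):
        # Each block holds every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        for subject, arm in zip(subject_ids[start:start + block_size], block):
            allocation[subject] = arm
    return allocation


def randomize_run_order(sample_ids, seed=2016):
    """Return the samples in a random processing order (e.g., machine runs)."""
    rng = random.Random(seed)
    order = list(sample_ids)
    rng.shuffle(order)
    return order


mice = [f"M{i:02d}" for i in range(1, 21)]
allocation = block_randomize(mice)
print(allocation)
print(randomize_run_order(mice))
```

Recording the seed and the procedure alongside the allocation list is one way to document "where and how" randomization was done.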

Blinding

Important for reducing bias and yielding “robust” results

Blind wherever possible

– Investigators, research personnel, animal caretakers

– Treatment allocation

– Conduct of the experiment

– Assessment of outcome/data collection

Acknowledge where you can’t blind and incorporate practices to minimize potential resultant biases

Presenter
Presentation Notes
Blinding goes hand-in-hand with randomization and is another important procedure for reducing bias and yielding “robust” results (i.e., results that are not spurious). Blinding could be a new issue for some investigators. It is common practice in human clinical trials, where double-blind, randomized trials are the gold standard, but my sense is that it is less commonly considered in animal studies. There are probably multiple reasons for this. Perhaps one thought is that animals are naturally blind to the treatment and so won’t have a placebo effect. Certainly, though, there is the potential for an unblinded investigator to introduce bias. You should blind personnel wherever possible. This includes investigators, research personnel, even animal caretakers. To the extent possible, personnel should be blind to the treatment allocation throughout the conduct of the experiment and the collection of data. Completely blinding all personnel throughout the entire experiment may not be possible. Acknowledge where blinding isn’t possible and integrate practices to minimize potential bias. Mixing personnel, so that the same person is not involved in all steps, could help.
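One simple way to blind outcome assessors is to relabel subjects or samples with opaque codes and keep the unblinding key with someone who is not involved in assessment. The sketch below illustrates that idea only; the function name, code format and example allocation are hypothetical, and a real study would also need a procedure for storing and releasing the key.

```python
import random


def make_blinding_key(allocation, seed=42):
    """Return (coded_labels, key) for a subject -> arm allocation.

    coded_labels: subject -> opaque code, used to label samples.
    key: code -> arm, held by someone not involved in assessment.
    """
    rng = random.Random(seed)
    subjects = list(allocation)
    codes = [f"S{n:03d}" for n in range(1, len(subjects) + 1)]
    # Shuffle so the code carries no hint of arm or enrollment order.
    rng.shuffle(codes)
    coded_labels = dict(zip(subjects, codes))
    key = {code: allocation[subject] for subject, code in coded_labels.items()}
    return coded_labels, key


allocation = {"mouse_01": "treatment", "mouse_02": "control",
              "mouse_03": "control", "mouse_04": "treatment"}
coded_labels, key = make_blinding_key(allocation)
# The assessor works only with the codes, never with the arm names.
print(sorted(coded_labels.values()))
```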

Data Handling

Define stopping criteria in advance

Prospectively define the primary endpoint

Define outliers and data exclusion criteria prospectively

– Statisticians frown on dropping “outliers” unless there is a good reason to

Report missing data, why missing and how handled

Presenter
Presentation Notes
Data handling is not specifically mentioned by NIH in its rigor and reproducibility guidance, but it is a pretty big piece of the paper by Story Landis and colleagues on transparent reporting in preclinical studies, and it does have bearing on rigor and reproducibility. Some of the issues falling under “Data Handling” include:

– Define the stopping criteria in advance. Unless you have a pre-defined plan for interim analyses, you should not evaluate the data midway through and, if you don’t have significance, add more animals and continue until you get significance.

– Prospectively define the primary endpoint. You need to do this in your proposal anyway. The point is that you don’t get to choose something else if the primary endpoint doesn’t turn out to be significant.

– Define outliers and data exclusion criteria prospectively. This can actually be a big deal. Statisticians frown on dropping “outliers” without a good reason.

– Report missing data, why they are missing and how they were handled. This can be a big deal as well, particularly if the missingness is non-random.
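The warning against unplanned interim looks can be checked with a small simulation. The sketch below generates two groups with no true difference and compares the false-positive rate of a single test at the planned sample size with the rate when the data are tested after every batch and the study stops at the first p < 0.05. All settings (batch size, maximum n, number of simulations, a z-test with known standard deviation of 1) are illustrative assumptions, not anything specified in the talk.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% significance level


def significant(a, b):
    """Two-sided z-test for equal means, assuming known SD = 1 in both groups."""
    z = (sum(a) / len(a) - sum(b) / len(b)) / sqrt(1 / len(a) + 1 / len(b))
    return abs(z) > Z_CRIT


def false_positive_rate(peek, n_sims=2000, batch=10, max_n=50, seed=1):
    """Share of null simulations declared significant.

    peek=False: test once, at max_n per group.
    peek=True: test after every batch and stop at the first significant look.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        a, b = [], []
        rejected = False
        while len(a) < max_n and not rejected:
            a.extend(rng.gauss(0, 1) for _ in range(batch))
            b.extend(rng.gauss(0, 1) for _ in range(batch))
            at_final_look = len(a) >= max_n
            if (peek or at_final_look) and significant(a, b):
                rejected = True
        hits += rejected
    return hits / n_sims


print("single planned test:", false_positive_rate(peek=False))
print("test after every batch:", false_positive_rate(peek=True))
```

With settings like these, the rate under repeated looks should come out well above the nominal 5%, which is why interim analyses need a pre-specified plan and an adjusted significance threshold.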

Statistical Analysis Plan

Include an appropriately detailed statistical analysis plan

Address ALL endpoints - primary and secondary

Include adjustment for multiple testing if necessary

Presenter
Presentation Notes
Do include an appropriately detailed statistical analysis plan in your proposal. Detail how you will analyze ALL endpoints, primary and secondary. Include adjustment for multiple testing if necessary.
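As an illustration of what an adjustment for multiple testing can look like, the sketch below implements two standard procedures, Bonferroni and Holm, for a set of endpoint p-values. The talk does not say which adjustment to use, so the choice of these two methods, the function names and the example p-values are assumptions made for illustration.

```python
def bonferroni_adjust(p_values):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def holm_adjust(p_values):
    """Holm step-down adjustment; never less powerful than Bonferroni."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values in the same order.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# Hypothetical p-values for three endpoints.
p_values = [0.01, 0.04, 0.03]
for name, adjust in [("Bonferroni", bonferroni_adjust), ("Holm", holm_adjust)]:
    adjusted = adjust(p_values)
    flags = [p_adj < 0.05 for p_adj in adjusted]
    print(name, [round(p_adj, 3) for p_adj in adjusted], flags)
```

The analysis plan should name the adjustment and the family of tests it covers before any data are examined.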

Additional Information

Landis, S.C. et al. 2012. A call for transparent reporting to optimize the predictive value of preclinical research. Nature 490(7419):187-191.

Collins, F.S. and L.A. Tabak. 2014. NIH plans to enhance reproducibility. Nature 505:612-613.

NIH Principles and Guidelines for Reporting Preclinical Research http://www.nih.gov/research-training/rigor-reproducibility/principles-guidelines-reporting-preclinical-research

Updated Application Instructions to Enhance Rigor and Reproducibility http://www.nih.gov/research-training/rigor-reproducibility/updated-application-instructions-enhance-rigor-reproducibility

How to Get Biostatistics Help

Office Hours: 12-1:30 on Tuesdays

– Input regarding study design, data analysis, interpretation or presentation of research results in informal setting

– Reserve 15 min. spots on-line at www.ucdmc.ucdavis.edu/ctsc/area/biostatistics/index.html

Application for Resource Use

– More in-depth assistance with study design, grant writing, data analysis and interpretation

– Submit application on-line at www.ucdmc.ucdavis.edu/ctsc/