gladstone bioinformatics core
Post on 20-Feb-2016
+Gladstone Bioinformatics Core
Kirsten E. Eilertson
+Statistics in Science: Best Practices
+Our Goal
• “Thoughtful,” “insightful,” “rigorous” statistical analyses
• Meaningful and solid inference which can be the basis of future work
Our Challenges
• Every application is different
• Precedents can be a problem
• P-value-centric publication system
• Reproducibility
+Reporting and Reproducibility
Reproducibility Crisis!!! Dr. Ioannidis (2005) PLoS Medicine
Discussion Today:
• Reporting results
• Power and experimental design
• Outliers
+Guidelines for reporting
Resources:
• Annals of Internal Medicine: http://www.people.vcu.edu/~albest/Guidance/guidelines_for_statistical_reporting.htm
• American Physiological Society: http://physiolgenomics.physiology.org/content/18/3/249.full?
‘Describe the statistical methods with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results’ (Bailar & Mosteller, 1988, p. 266)
+Guidelines for reporting
‘The design of an experiment, the analysis of its data, and the communication of the results are intertwined. In fact, design drives analysis and communication.’
• Always report the test used, the value of the test statistic, the degrees of freedom, and the P value (the probability, under the null hypothesis, of a result at least as extreme as the one observed).
• Report how assumptions were checked (e.g., histograms of residuals, tests of normality).
• Provide a clear description of the design of your study or experiment; state the null and alternative hypotheses.
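As an illustration of the reporting items above, here is a minimal sketch that produces the numbers a reader would need for a two-group comparison: the test statistic, the degrees of freedom, the two-sided P value, and a 95% confidence interval for the difference in means. It assumes Welch's unequal-variance t-test is the appropriate analysis, uses only the Python standard library (the t distribution is integrated numerically with Simpson's rule), and the function names and sample data are hypothetical, not taken from the slides.

```python
import math
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|) under the null, via Simpson's rule on [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central)

def t_critical(df, alpha=0.05):
    """Find t* with two_sided_p(t*, df) == alpha by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if two_sided_p(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def welch_report(a, b, alpha=0.05):
    """Return the quantities worth reporting for a Welch t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    se = math.sqrt(va + vb)
    diff = mean(a) - mean(b)
    t = diff / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    half_width = t_critical(df, alpha) * se
    return {"difference": diff, "t": t, "df": df,
            "p": two_sided_p(t, df),
            "ci": (diff - half_width, diff + half_width)}

# Hypothetical measurements for two groups
control = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3]
treated = [4.9, 4.6, 5.2, 4.4, 5.0, 4.7]
report = welch_report(treated, control)
print("t = {t:.2f}, df = {df:.1f}, P = {p:.4f}".format(**report))
print("difference = {:.2f}, 95% CI = ({:.2f}, {:.2f})".format(
    report["difference"], *report["ci"]))
```

In practice a statistics package would supply the P value and interval directly; the point of the sketch is only that all four quantities are reported together.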
+Guidelines for reporting
• Control for multiple comparisons.
• Report variability using a standard deviation (not standard error).
• Avoid sole reliance on statistical hypothesis testing, such as the use of P values, which fails to convey important quantitative information.
• Report uncertainty about scientific importance using a confidence interval.
• Interpret each main result by assessing the numerical bounds of the confidence interval and by considering the precise P value.
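The slides do not say which multiple-comparison procedure to use. As one possible illustration, the sketch below applies the Holm step-down adjustment, which controls the family-wise error rate and is never less powerful than a plain Bonferroni correction. The function name and the raw P values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted P values, in the original order."""
    m = len(p_values)
    # Indices of the P values, from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value is multiplied by (m - k + 1)
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw P values
for p, q in zip(raw, holm_adjust(raw)):
    print(f"raw P = {p:.3f}  Holm-adjusted P = {q:.3f}")
```

Reporting both the raw and the adjusted values lets a reader see how much the correction changed each conclusion.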
+Experimental Design & Power
The appropriate analysis depends on the design!
• Can I peek at my data?
• Can I add more samples later?
• Sequential or adaptive design
• Multiple testing / gambler’s ruin
GraphPad Prism example: http://www.graphpad.com/guides/prism/6/statistics/index.htm?stat_why_you_shouldnt_recompute_p_v.htm
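Why peeking matters can be checked with a small simulation. The sketch below is an illustration under simplifying assumptions, not the GraphPad example: the data are drawn from a normal distribution with true mean 0 and known standard deviation 1, a z-test is run at each interim look, and the experiment is declared "significant" if any look crosses the nominal threshold. The function name, look schedule, and seed are hypothetical.

```python
import math
import random
from statistics import NormalDist

def simulate_peeking(n_sims=2000, looks=(10, 20, 30, 40, 50),
                     alpha=0.05, seed=1):
    """Simulate one-sample z-tests on data generated under the null.

    Returns (rate_with_peeking, rate_fixed_n): the fraction of simulated
    experiments declared significant at ANY look, and the fraction
    significant at the final look only.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    any_look = final_only = 0
    for _ in range(n_sims):
        total, n, rejected_at = 0.0, 0, []
        for target in looks:
            while n < target:  # add samples up to the next look
                total += rng.gauss(0.0, 1.0)
                n += 1
            z = total / math.sqrt(n)  # sample mean divided by 1/sqrt(n)
            rejected_at.append(abs(z) > z_crit)
        any_look += any(rejected_at)
        final_only += rejected_at[-1]
    return any_look / n_sims, final_only / n_sims

peek_rate, fixed_rate = simulate_peeking()
print(f"false-positive rate, testing at every look: {peek_rate:.3f}")
print(f"false-positive rate, testing once at n=50:  {fixed_rate:.3f}")
```

The single test at the planned sample size should stay near the nominal 5%, while stopping at the first significant look pushes the false-positive rate well above it. Sequential designs address this by adjusting the threshold at each look.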
+A stochastic process
Error Statistics Blog Discussion
Not an “argument from intentions” but really about the “probative capacity of the test”.
+Power analysis
Uses:
• Pilot studies! (Don’t forget to control for multiple comparisons.)
• Detectable effect size (when the result is non-significant)
• Consider confidence intervals instead
From The American Statistician (2001)
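For planning purposes, a rough sample-size calculation can be sketched as follows. This assumes a two-sided, two-sample comparison of means with equal group sizes and uses the normal approximation, so it slightly understates what an exact t-based calculation would give. The function names are hypothetical, and `effect_size` is the standardized difference (difference in means divided by the common standard deviation).

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

def detectable_effect(n, alpha=0.05, power=0.80):
    """Smallest standardized effect detectable with n samples per group."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return (z_alpha + z_beta) * math.sqrt(2 / n)

print(n_per_group(0.5))                  # samples per group for d = 0.5
print(round(detectable_effect(20), 2))   # detectable d with 20 per group
```

Both functions are meant for use before the data are collected. After a non-significant result, a confidence interval for the observed effect conveys what the data can and cannot rule out more directly than a power calculation.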
+Outliers: measurement or model error?
Reasons for concern:
• Increases the estimated standard deviation
• May indicate the model (e.g. assumption of normality) is not correct
• May lead to model misspecification
• Biased parameter estimation
+Methods
Detection:
• Visual inspection
• Grubbs’ test (assumes normality)
• Chauvenet’s criterion (assumes normality)
• Dixon’s Q test (assumes normality)
• Based on the interquartile range
• More than 2 standard deviations from the mean
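Of the detection methods listed, the interquartile-range rule is the simplest to sketch and does not assume normality. The example below flags values more than 1.5 IQRs outside the quartiles (the conventional "Tukey fences"). The multiplier 1.5, the function name, and the data are assumptions for illustration. Quartile definitions differ between packages, so borderline points may be flagged differently elsewhere.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # first, second, third quartile
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements with one suspicious value
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 15.0]
print(iqr_outliers(data))  # flags 15.0
```

Flagging a point this way only identifies a candidate for closer inspection; it says nothing about whether the point should be removed.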
+Methods
Analysis:
• Delete the outlier
• Trimmed mean / Winsorized mean
• Weighted regression techniques
• Do nothing
• Report with and without outliers
Arguments for keeping:
• Having a method for identifying outliers does not make the practice of deleting them scientifically or methodologically sound
• Minimal effect on estimates/model (low influence)
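The trimmed and Winsorized means mentioned above can be sketched in a few lines. Here `k` is the number of observations affected at each end, an assumption made for simplicity (many packages take a proportion instead). The function names and data are hypothetical.

```python
from statistics import mean

def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    if 2 * k >= len(values):
        raise ValueError("k is too large for this sample")
    s = sorted(values)
    return mean(s[k:len(s) - k])

def winsorized_mean(values, k):
    """Mean after replacing the k most extreme values at each end
    with the nearest remaining value."""
    if 2 * k >= len(values):
        raise ValueError("k is too large for this sample")
    s = sorted(values)
    core = s[k:len(s) - k]
    return mean([core[0]] * k + core + [core[-1]] * k)

data = [1, 4, 5, 6, 7, 9, 40]  # hypothetical sample with extreme values
print(f"plain mean:      {mean(data):.2f}")
print(f"trimmed mean:    {trimmed_mean(data, 1):.2f}")
print(f"Winsorized mean: {winsorized_mean(data, 1):.2f}")
```

Printing the plain mean alongside the robust versions follows the "report with and without outliers" advice: the reader can see how much the extreme values drive the estimate.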
+Influential Points
Outlier
Leverage
Influence
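The three terms on this slide can be made concrete for a simple linear regression. In the sketch below, the residual measures how far a point lies from the fitted line (outlier), the hat value measures how unusual its x value is (leverage), and Cook's distance combines the two (influence). These are standard textbook formulas for a one-predictor model; the function name and data are hypothetical.

```python
from statistics import mean

def influence_measures(x, y):
    """Residuals, leverages, and Cook's distances for the fit y = a + b*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    p = 2  # number of fitted parameters (intercept and slope)
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance
    # Leverage: diagonal of the hat matrix for a single predictor
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's distance: squared residual scaled by a leverage term
    cooks = [(e * e / (p * s2)) * h / (1 - h) ** 2
             for e, h in zip(residuals, leverage)]
    return residuals, leverage, cooks

# Hypothetical data: the last point is far from the others in x
x = [1, 2, 3, 4, 5, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 10.0]
res, lev, cook = influence_measures(x, y)
for xi, e, h, d in zip(x, res, lev, cook):
    print(f"x={xi:>2}  residual={e:+.2f}  leverage={h:.2f}  Cook's D={d:.2f}")
```

In this example the last point has a small residual because it pulls the line toward itself, yet its leverage and Cook's distance are by far the largest. This is why residuals alone can miss the most influential observations.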
+Outliers: Decide whether or not deleting data points is warranted
• Do not delete data points just because they do not fit your preconceived model.
• You must have a good, objective reason for deleting data points: implausible values; inaccuracy in measurement; from a different population.
• If you delete any data after you've collected it, justify and describe it in your reports.
• If you are not sure what to do about a data point, analyze the data twice, once with and once without the data point, and report the results of both analyses.