statistical writing. tables and figures (sven sandin)

Statistical Writing * Tables and Figures Sven Sandin, Dpt of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm


Post on 15-Jul-2015


Page 1: Statistical Writing. Tables and Figures (Sven Sandin)

Statistical Writing*

Tables and Figures

Sven Sandin, Dpt of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm

Page 2: Statistical Writing. Tables and Figures (Sven Sandin)

Scope

Tables and figures - General comments

The primary table: table 1

The work flow

Figure presentations to use and to avoid

Page 3: Statistical Writing. Tables and Figures (Sven Sandin)

Presentations : Figures & Tables

Summarize and focus results

Facilitate reproducing results

Help interpreting the RESULTS - Avoid busy tables … not all data are interesting

Each table and figure must be able to stand by itself

Title - short, clear

Footnotes explaining ALL abbreviations …..

Underlying model must be clear

Categorical covariates

p-value: What's the hypothesis ?

Page 4: Statistical Writing. Tables and Figures (Sven Sandin)

Presentations: "Primary" table

Allow comparison of treatments (exposures)

Ideally (randomized) these should be "similar" ....

One column for each treatment

One row for each covariate

Confounding ...

Modifying of effect - sub tables
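A minimal sketch in base R of the layout described on this slide: one column per treatment, one row per covariate. The data are simulated, and the variable names and the helpers `med_iqr` and `n_pct` are hypothetical, not taken from any study cited in these slides.

```r
# Simulated data; all variable names are hypothetical
set.seed(1)
d <- data.frame(
  treatment = gl(2, 50, labels = c("A", "B")),
  age       = round(rnorm(100, mean = 50, sd = 10)),
  male      = rbinom(100, 1, 0.4)
)

# Median (Q1-Q3) for a continuous covariate
med_iqr <- function(x) {
  q <- quantile(x, c(0.5, 0.25, 0.75), na.rm = TRUE)
  sprintf("%.0f (%.0f-%.0f)", q[1], q[2], q[3])
}

# Number (percent) for a binary covariate
n_pct <- function(x) sprintf("%d (%.0f%%)", sum(x == 1), 100 * mean(x == 1))

# One row per covariate, one column per treatment
by_trt <- function(v, f) sapply(split(v, d$treatment), f)
table1 <- rbind(
  "N"                   = by_trt(d$age, function(x) as.character(length(x))),
  "Age, median (Q1-Q3)" = by_trt(d$age, med_iqr),
  "Male, n (%)"         = by_trt(d$male, n_pct)
)
print(table1, quote = FALSE)
```

Because the table is built by a script, adding a covariate means adding one row to `rbind()`, and the whole table can be rerun whenever the data change.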

Page 5: Statistical Writing. Tables and Figures (Sven Sandin)

Presentations: "Primary" table

Allow comparison of treatments (exposures)

Ideally these should be "similar" ....

One column for each treatment

One row for each covariate

Confounding ...

Modifying of effect - sub tables

[Diagram: Treatment, Outcome and Confounding covariate. Treatment corresponds to the table column, the confounding covariate to the table row]

Page 7: Statistical Writing. Tables and Figures (Sven Sandin)

EXAMPLE: "Primary" table

Trolle-Lagerros, Y., Mucci, L. A., Kumle, M., Braaten, T., Weiderpass, E., Hsieh, C.-C., Sandin, S. … Adami, H.-O. (2005). Physical activity as a determinant of mortality in women. Epidemiology, 16(6), 780–785.

Page 8: Statistical Writing. Tables and Figures (Sven Sandin)

EXAMPLE: Table summarizing results

Trolle-Lagerros, Y., Mucci, L. A., Kumle, M., Braaten, T., Weiderpass, E., Hsieh, C.-C., Sandin, S. … Adami, H.-O. (2005). Physical activity as a determinant of mortality in women. Epidemiology, 16(6), 780–785.

Page 9: Statistical Writing. Tables and Figures (Sven Sandin)

Presentations: "Primary" table

Allow comparison of treatments (exposures)

Ideally these should be "similar" ....

One column for each treatment

One row for each covariate

Confounding ...

Modifying of effect - sub tables

Generally, don't test for baseline differences !

If important ----> In the model already ---> No need to test !

If not important ----> p-value not important ---> No need to test !

Not known ---> No need to test !

p-values vs estimates ---> No need to test ! Estimate !

Confuse strength of association with importance

Inflation of overall significance level ...
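The inflation of the overall significance level can be illustrated with a short calculation. This sketch assumes that the k baseline tests are independent and each is run at the 5% level; both are simplifying assumptions, not statements from the slides.

```r
# Probability of at least one false positive among k independent tests at level alpha
alpha <- 0.05
k     <- c(1, 5, 10, 20)
fwer  <- 1 - (1 - alpha)^k
print(data.frame(tests = k, prob_any_false_positive = round(fwer, 2)))
```

With ten baseline covariates tested, the chance of at least one "significant" difference by chance alone is already about 40% under these assumptions.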

Page 10: Statistical Writing. Tables and Figures (Sven Sandin)

Presentations: Table work process

One-to-one relation

Data ----> Computer program ---> Table results

Method

Don't point-and-click (choice of software)

Rerun all results each time ....... or use log book

In your draft: Make notes about source, date...

Reproducibility !
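One possible way to keep the one-to-one relation between data, program and table is to let the script write the table together with a note about its source and run date. The sketch below uses simulated data; the script name `table2_mortality.R` and the output file are hypothetical.

```r
# Hypothetical script name and simulated data; replace with your own
script_name <- "table2_mortality.R"
set.seed(3)
d <- data.frame(group = gl(2, 30, labels = c("A", "B")), y = rnorm(60))

# The table result is computed by code, never typed in by hand
res <- aggregate(y ~ group, data = d, FUN = median)
names(res)[2] <- "median_y"

# Write a source/date note first, then the table itself
out_file <- file.path(tempdir(), "table2.txt")
con <- file(out_file, "w")
writeLines(sprintf("# Source: %s, run %s", script_name, format(Sys.Date())), con)
write.table(res, con, sep = "\t", quote = FALSE, row.names = FALSE)
close(con)

cat(readLines(out_file), sep = "\n")
```

Rerunning the script regenerates the file, so the numbers in the draft can always be traced back to one program and one date.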

Page 11: Statistical Writing. Tables and Figures (Sven Sandin)

Presentations : Tables

Layout

Decimals

Avoid using shading and colors

Measures

Number of missing data must be clear

Survival-type of analysis: Person year is the relevant measure

Binary data: Show one of the proportions, e.g. males

Continuous

Mean or median (both to show symmetry)

Q1 and Q3 or P10 and P90 etc. instead of Min and Max

SD not useful for asymmetric data
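The measures listed for continuous data can be collected in one small helper. This is a sketch on simulated, right-skewed data; the function name `describe` is hypothetical.

```r
set.seed(4)
x <- c(rexp(95, rate = 0.1), rep(NA, 5))   # skewed data with 5 missing values

# Count of missing values, mean and median, and percentiles instead of Min/Max
describe <- function(x) {
  q <- quantile(x, c(0.10, 0.25, 0.50, 0.75, 0.90), na.rm = TRUE, names = FALSE)
  c(n = sum(!is.na(x)), missing = sum(is.na(x)),
    mean = mean(x, na.rm = TRUE), median = q[3],
    Q1 = q[2], Q3 = q[4], P10 = q[1], P90 = q[5])
}
s <- describe(x)
round(s, 1)
```

Showing the mean next to the median makes the asymmetry visible: for skewed data like these the two differ clearly.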

Page 12: Statistical Writing. Tables and Figures (Sven Sandin)

Presentations: Figures

Figures - examples

Continuous - Box plots

Ordinal - Segmented bar charts

Agreement - Altman Bland

Interactions

Confidence intervals

Bar charts with SD errors and other things to avoid

Page 13: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Box plot

Qualities

Meaningful for any continuous data

Efficient when comparing several groups

Minimizes data reduction

Interpretation

Half of the data between Q1 and Q3

Half above and half below the median

Difference between mean and median indicates lack of symmetry

Whiskers drawn to what ? Tukey's rule or percentiles

Outliers
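The numbers behind a box plot can be inspected with `boxplot.stats()` from base R. The sketch below uses simulated data with one artificial outlier. Reading "percentiles" as P10 and P90 is an assumption about what the slide intends.

```r
set.seed(5)
x <- c(rnorm(200, mean = 10, sd = 2), 25)   # one artificial outlier at 25

# Tukey rule (default): whiskers reach at most 1.5 * IQR beyond the hinges
b <- boxplot.stats(x)
b$stats    # lower whisker, lower hinge, median, upper hinge, upper whisker
b$out      # points drawn individually as outliers

# Alternative whisker definition (assumed here): P10 and P90
quantile(x, c(0.10, 0.90))

# About half of the data lie between the hinges (approximately Q1 and Q3)
inside <- mean(x >= b$stats[2] & x <= b$stats[4])
inside
```

Whichever whisker rule is used, it should be stated in the figure legend, since the two definitions can give quite different whiskers.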

Page 14: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Box plot

Page 15: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Box plot

# Data simulated (n = 10 groups is assumed; n is not defined on the slide)
n <- 10
g <- gl(10, 100, n*100)
x <- rnorm(n*100) + sqrt(as.numeric(g))
boxplot(split(x, g), notch=TRUE)

Page 16: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Box plot

Wilcoxon rank sum test

Page 17: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Bar chart ± SD

Page 18: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Bar chart ± SD

t-test, assuming symmetric data

Page 19: Statistical Writing. Tables and Figures (Sven Sandin)

Bar chart with SD errors

Often misinterpreted to be "different" or "not different" if error bars overlap or not

Why ± 1*SD ? It is 1.96 or 2 times the SD that is relevant

A lot of ink to represent one (two) numbers: Mean and SD

Assumes symmetry and normal distribution

Use the box plot instead !
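The sensitivity of mean ± SD to a single outlier can be seen in a small simulation. This is a sketch with simulated right-skewed data; the helper name `summarise` and the outlier value 100 are arbitrary choices for illustration.

```r
set.seed(6)
x     <- rlnorm(50, meanlog = 1, sdlog = 0.5)   # right-skewed, strictly positive data
x_out <- c(x, 100)                              # same data plus one extreme value

# The numbers behind a bar chart (mean, SD) and behind a box plot (median, Q1, Q3)
summarise <- function(v) c(mean = mean(v), sd = sd(v), median = median(v),
                           Q1 = quantile(v, 0.25, names = FALSE),
                           Q3 = quantile(v, 0.75, names = FALSE))
s1 <- summarise(x)
s2 <- summarise(x_out)
round(rbind(original = s1, with_outlier = s2), 2)
```

The mean and SD change substantially when one observation is added, while the median and quartiles barely move. With the outlier included, mean minus one SD is negative even though every observation is positive.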

Page 20: Statistical Writing. Tables and Figures (Sven Sandin)

Bar chart vs Box plot

Box plots

Qualities

Meaningful for any continuous data

Efficient when comparing several groups

Minimizes data reduction

Interpretation

Half of the data between Q1 and Q3

Half above and half below the median

Difference between mean and median indicates lack of symmetry

Outliers

Bar chart ± SD

Qualities

NOT for any continuous data

NOT efficient when comparing several groups

BIG reduction

Interpretation

?

?

?

Can't evaluate lack of symmetry

Extremely sensitive to single outliers

Page 21: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Ordinal Scale

What do we want to achieve ?

What is an ordinal scale

Summarize data - not reducing

Evaluate distribution - Also cumulative

Change in distributions

Avoid problem with scattered tables

Integrated part of statistical analysis - test

Binary ?

Nominal ?
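A sketch of these goals in base R: tabulate the distribution per group, show it cumulatively, draw a segmented bar chart, and compare the groups with a rank-based test. The ordinal scale, group labels and probabilities are simulated and hypothetical.

```r
set.seed(7)
levels_ord <- c("none", "mild", "moderate", "severe")   # hypothetical ordinal scale
d <- data.frame(
  group = gl(2, 100, labels = c("A", "B")),
  score = factor(c(sample(levels_ord, 100, replace = TRUE, prob = c(.4, .3, .2, .1)),
                   sample(levels_ord, 100, replace = TRUE, prob = c(.2, .3, .3, .2))),
                 levels = levels_ord, ordered = TRUE)
)

tab  <- table(d$score, d$group)
prop <- prop.table(tab, margin = 2)   # distribution within each group
cum  <- apply(prop, 2, cumsum)        # cumulative distribution within each group
round(cum, 2)

# Segmented (stacked) bar chart: one bar per group, one segment per category
barplot(prop, legend.text = TRUE, ylab = "Proportion")

# Rank-based comparison; exact = FALSE because ordinal data are heavily tied
w <- wilcox.test(as.numeric(score) ~ group, data = d, exact = FALSE)
w$p.value
```

The figure keeps every category visible, so nothing is reduced to a single summary number, and the test uses the same ordering that the figure displays.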

Page 22: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Ordinal Scale

[Figure: ordinal outcome by treatment group. Groups: ICSI frozen, surgery; ICSI fresh, surgery; IVF fresh; IVF frozen; ICSI fresh; ICSI frozen. Group sizes shown: N=12,775; N=9,457; N=142; N=1,699; N=6,886]

Page 23: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Ordinal Scale

[Figure: ordinal outcome by treatment group. Groups: ICSI frozen, surgery; ICSI fresh, surgery; IVF fresh; IVF frozen; ICSI fresh; ICSI frozen. Group sizes shown: N=12,775; N=9,457; N=142; N=1,699; N=6,886]

Wilcoxon rank sum test

Page 24: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Ordinal Scale

Trolle-Lagerros, Y., Mucci, L. A., Kumle, M., Braaten, T., Weiderpass, E., Hsieh, C.-C., Sandin, S. … Adami, H.-O. (2005). Physical activity as a determinant of mortality in women. Epidemiology, 16(6), 780–785.

Page 25: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Interaction

Trolle-Lagerros, Y, Mucci, LA, Kumle, M, Braaten, T, Weiderpass, E, Hsieh, CC, Sandin, S … Adami, HO (2005). Physical activity as a determinant of mortality in women. Epidemiology, 16(6), 780–785

Page 26: Statistical Writing. Tables and Figures (Sven Sandin)

Figure - Confidence intervals

Page 27: Statistical Writing. Tables and Figures (Sven Sandin)

Figure - Confidence intervals on log scale

Sandin, S, Nygren, KG, Iliadou, A, Hultman, CM, Reichenberg, A (2013). Autism and mental retardation among offspring born after in vitro fertilization. JAMA, 310(1), 75–84

Page 28: Statistical Writing. Tables and Figures (Sven Sandin)

Figure - Confidence intervals on log scale

Knight, A, Sandin, S, Askling, J (2010). Occupational risk factors for Wegener’s granulomatosis: a case-control study. Annals of the Rheumatic Diseases, 69(4), 737–740
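A figure of ratio estimates with confidence intervals on a log scale can be sketched in base R as follows. The ratios and confidence limits are made-up numbers chosen for illustration, not results from the cited papers.

```r
# Hypothetical ratio estimates with 95% confidence limits
hr <- data.frame(
  label = c("Exposure 1", "Exposure 2", "Exposure 3"),
  est   = c(0.50, 1.00, 2.00),
  lower = c(0.25, 0.80, 1.00),
  upper = c(1.00, 1.25, 4.00)
)
y <- rev(seq_len(nrow(hr)))   # first row plotted at the top

# log = "x" puts the ratio axis on a log scale
plot(hr$est, y, log = "x", xlim = range(hr$lower, hr$upper), pch = 15,
     yaxt = "n", xlab = "Ratio (log scale)", ylab = "")
segments(hr$lower, y, hr$upper, y)            # confidence intervals
axis(2, at = y, labels = hr$label, las = 1)
abline(v = 1, lty = 2)                        # reference line: no association

# On the log scale the intervals are symmetric around the estimate
sym <- all.equal(log(hr$upper) - log(hr$est), log(hr$est) - log(hr$lower))
sym
```

On a linear axis a ratio of 0.5 and a ratio of 2 look unequal in size although they are equally strong associations. The log axis places them at the same distance from 1.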

Page 29: Statistical Writing. Tables and Figures (Sven Sandin)

Figure - Confidence intervals

Yang, L, Lof, M, Veierød, MB, Sandin, S, Adami, HO, Weiderpass, E (2011). Ultraviolet exposure and mortality among women in Sweden. Cancer Epidemiology, Biomarkers & Prevention, 20(4), 683–690

Page 30: Statistical Writing. Tables and Figures (Sven Sandin)

Figure - Confidence intervals

Knight, A, Sandin, S, Askling, J (2010). Increased risk of autoimmune disease in families with Wegener’s granulomatosis. The Journal of Rheumatology, 37(12), 2553–2558

Page 31: Statistical Writing. Tables and Figures (Sven Sandin)

Figure - Confidence intervals

Estimates with overlapping CIs can still be statistically significantly different

Scale: Ratio vs absolute (linear)

Tables with several comparisons can be hard to digest

Efficient in picking single effects

Efficient in picking out statistically significant results
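The first point, about overlapping intervals, can be checked with a small worked example. The two estimates and standard errors are hypothetical, and the estimates are assumed to be independent and approximately normal.

```r
est <- c(a = 0.0, b = 3.3)   # two independent estimates (hypothetical numbers)
se  <- c(a = 1.0, b = 1.0)   # their standard errors

# 95% confidence interval for each estimate
ci <- cbind(lower = est - 1.96 * se, upper = est + 1.96 * se)
ci

# Test of the difference between the two estimates
diff_est <- est[["b"]] - est[["a"]]
diff_se  <- sqrt(sum(se^2))        # valid for independent estimates
z <- diff_est / diff_se
p <- 2 * pnorm(-abs(z))
c(z = z, p = p)
```

The two intervals overlap, yet the p-value for the difference is below 0.05. The standard error of a difference is smaller than the sum of the two standard errors, so comparing interval ends is too conservative.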

Page 32: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Altman Bland

The problem

In a lab we have just bought a new robot. It is expected to be a lot more accurate than the old one.

Can we just start using it or do we need to evaluate ? How ?

There are two variables measuring the effect of disease.

Can they be used interchangeably ?

Page 34: Statistical Writing. Tables and Figures (Sven Sandin)

Figures - Altman Bland

The problem

Compare two methods

What is our best guess of the truth ?

X and X-Y correlated

Y and X-Y correlated

The Figure

Calculate the mean of X and Y

Calculate the difference X-Y

Plot Mean vs Difference

Draw reference line at D=0

Mean and Difference un-correlated
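The steps for the figure can be sketched in base R with simulated measurements. The sketch assumes that both methods measure the same true value with independent errors of equal variance, which is the condition under which mean and difference are uncorrelated. The limits of agreement are a common addition to this plot and are not mentioned on the slide.

```r
set.seed(8)
truth <- rnorm(500, mean = 50, sd = 5)   # unobserved true value (simulated)
x <- truth + rnorm(500, sd = 5)          # method X
y <- truth + rnorm(500, sd = 5)          # method Y, same error variance assumed

m <- (x + y) / 2   # mean of X and Y: best guess of the truth
d <- x - y         # difference X - Y

plot(m, d, xlab = "Mean of X and Y", ylab = "Difference X - Y")
abline(h = 0, lty = 2)                                  # reference line at D = 0
abline(h = mean(d) + c(-1.96, 1.96) * sd(d), lty = 3)   # 95% limits of agreement

# X and Y are each correlated with the difference; the mean is not
cors <- c(x_d = cor(x, d), y_d = cor(y, d), m_d = cor(m, d))
round(cors, 2)
```

Plotting the difference against X or Y alone would show a slope that comes only from measurement error. Plotting against the mean removes that artifact.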

Page 35: Statistical Writing. Tables and Figures (Sven Sandin)

EXAMPLE : Altman-Bland

Bexelius, C, Löf, M, Sandin, S, Trolle Lagerros, Y, Forsum, E, Litton JE (2010). Measures of physical activity using cell phones: validation using criterion methods. Journal of Medical Internet Research, 12(1)