

1

Biost 590: Statistical Consulting

Approach to Consulting

September 26, 2008

Scott S. Emerson, M.D., Ph.D.
Professor of Biostatistics, University of Washington

© 2000, 2008 Scott S. Emerson, M.D., Ph.D.

2

Lecture Outline

• Preliminaries
• The Abstract Problem
• The Concrete Problem

3

Preliminaries

4

It’s all about…

• The client?
  – We are providing a service to answer the client’s statistical questions
• You?
  – But we generally have to provide the organization of the session, or else we get nowhere

5

Fundamentals

• Consulting session involves
  – People
  – Science
  – Statistics
• Keep in mind the above order of importance

6

Preliminaries: People

• Statisticians
  – Name, title, background (a little), species
• Client
  – Name, title, department
  – Role in this research effort
    • Principal investigator, investigator, student, assistant
  – Background
    • Scientific, statistical expertise

7

Preliminaries: Session

• Outline structure of session
  – First: Scientific background
  – Then: Description of (planned) scientific study
  – Then: Statistical advice
  – Final: Wrap-up
    • Summarize what went on
    • Identify action items and timelines
      – Client
      – Statistician

8

Preliminaries: Caveats

• We ask questions and restate principles
  – Science is collaborative
  – Our knowledge about their field is as little as (or less than) their field knowledge about statistics
• We take notes
• In the class, we confer among ourselves
• There may be a need for additional visits

9

Discussing the Problem in the Abstract

10

Start at the Beginning

• Understand the scientific question
  – Overall goal
  – Current state of knowledge
  – Specific aims of this experiment
  – Scientific relevance of the experiment
    • “Why do we care?”
    • What will we do with the results
      – If positive
      – If negative

11

Objectives: Background

• We need to learn the scientific background
  – Experienced vs inexperienced teachers
• We need to have confidence in the client
  – Knowledge of science
  – Knowledge of current status of research
  – Knowledge of scientific method
  – Knowledge of statistics

12

Objectives: Study

• Specific aims
  – From the field’s perspective
    • “Minimal publishable quantity”
    • “Headline of eventual report”
  – From the client about this study
    • Primary, secondary, tertiary, exploratory
  – How do they address overall goal
• Logistical constraints
  – Scientific, financial, ethical, …

13

Tools: Listening and Curiosity

• First: Client’s soliloquy
• Then: Questions
  – Open-ended specific
  – Purposeful naivete
• Restatement of your understanding
• Conjectures

14

Preliminary Evaluation

• In learning the scientific background and specific aims, we start evaluating whether we will be able to provide any help at all
  – Can we refine the scientific question into a statistical question about a random variable?
    • Scientifically important measurement
    • Variability that can be quantified
      – in a scientifically important manner

15

Refine to Statistical Hypotheses

• Gently lead the researcher toward something that we can help with
  – Definition of ultimate goal
    • Clusters, latent variables, outcomes of interventions, comparisons across groups, group membership, …
  – Definition of probability model
    • Where does randomness come in
  – Summary of probability distribution

16

Avoid Pitfalls

• Do not “oversteer” the client to what is statistically easy
  – Rarely will two different analyses answer the exact same scientific question
  – Hence, steering a client to a particular analysis may be changing the question
• Two year olds should not be given hammers

17

Thought Experiment

• Think about ideal study design
  – It is very useful to imagine what you would like to do (unconstrained by practicality or ethics) as a starting point
  – This can then serve as a reference to help decide what can be done within the limitations of the actual problem

18

Can Statistics Help?

• Litmus Test # 1:

– If the scientific question cannot be answered by an experiment when outcomes are entirely deterministic, there is NO chance that statistics can be of any help.

19

Can Statistics Help?

• Litmus Test # 2:

– If the scientific researcher cannot decide on an ordering of data distributions which would be appropriate when measurements are available on the entire population, there is NO chance that statistics can be of any help.
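
One possible reading of this test, sketched in Python with made-up numbers: even when every value in both groups is known exactly, the mean and the median can order the two groups differently, so someone still has to decide which summary defines "higher". The data, the group labels, and the helper `higher_group` are hypothetical and not part of the lecture.

```python
from statistics import mean, median

# Hypothetical complete-population measurements for two groups (illustration only).
group_a = [1, 1, 1, 1, 10]
group_b = [2, 2, 2, 2, 2]

def higher_group(summary):
    """Return which group is 'higher' under the chosen summary measure."""
    a, b = summary(group_a), summary(group_b)
    if a == b:
        return "tie"
    return "A" if a > b else "B"

by_mean = higher_group(mean)      # mean of A is 2.8, mean of B is 2
by_median = higher_group(median)  # median of A is 1, median of B is 2
print(by_mean, by_median)         # prints "A B"
```

The two summaries disagree with no sampling variability anywhere in the picture, which suggests the choice of ordering is a scientific decision rather than a statistical one.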

20

Classify the Statistical Goal

• Type of question
  – Cluster analysis
  – Factor analysis
  – Quantifying distribution within group
  – Comparing distributions across groups
  – Prediction: point, interval
• Level of detail
  – Existence, direction, first order trends, precise dose response

21

Discussing the Concrete Problem at Hand

22

Now (and only now)

• Start considering the current study
  – Where are we in the study
  – Study design (so far?)
  – Data of interest
  – Data available
  – Prior analyses

23

Where Are We in the Study

• Hypothesis generation
  – (Interpretation of literature)
• Study design
• Statistical design
• Quality control
• Analysis of data
• Interpretation of results
• Response to reviewers

24

Statistical Hypotheses

• Prior to data analysis
  – Protect inference for primary, secondary aims
  – As careful a statement of hypotheses as possible
    • Scientific outcome
    • Statistical summary measures
    • Any variables used in adjustment
    • Exact statistic used as estimates, tests
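
As a rough illustration of how carefully such a statement might be written down before any data are analyzed, the Python sketch below records the four elements in an immutable structure. The class name `PrespecifiedHypothesis`, its field names, and the example values are all hypothetical; the lecture does not prescribe any particular format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the record cannot be edited after it is written
class PrespecifiedHypothesis:
    """Hypothetical record of one aim, fixed prior to data analysis."""
    aim: str                          # "primary", "secondary", ...
    scientific_outcome: str           # what is measured
    summary_measure: str              # how the distribution is summarized
    adjustment_variables: tuple = ()  # any variables used in adjustment
    test_statistic: str = ""          # exact statistic used for estimates, tests

# Made-up example values, for illustration only.
primary = PrespecifiedHypothesis(
    aim="primary",
    scientific_outcome="systolic blood pressure at 12 weeks",
    summary_measure="difference in means between treatment arms",
    adjustment_variables=("age", "sex", "baseline systolic blood pressure"),
    test_statistic="Wald statistic for the treatment coefficient in linear regression",
)
print(primary.aim, primary.summary_measure)
```

Making the record immutable is one simple way to mirror the idea of protecting inference: changing the plan after seeing data requires writing a new, visibly different record.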

25

Describe Sampling Methods

• Source of data
  – Location, time
  – Selection criteria (inclusion, exclusion)
• Sample sizes specified by design
  – Overall and within prespecified strata
• Sampling scheme may have specified number of smokers and nonsmokers in a cohort design
• Sampling scheme may reflect random process
  – incidence of events is then estimable

26

Scientifically Classify Variables

• E.g., in clinical studies
  – Demographic variables
  – Measures of exposure
  – Measures of concurrent disease
  – Measures of severity of disease
  – Measures of clinical outcomes
  – etc.

27

Not Always Easy

• Scientific classification of the same variable can differ from study to study
• E.g., What does race mean to you?
  – Genetics
  – Culture
  – Environmental factors
  – Socioeconomic status

28

Statistical Role of Variables

• Outcome (response) variable(s)
• Predictor(s) of interest
  – (main grouping variable(s))
• Subgroups of interest
  – for effect modification
• Potential confounders
• Variables that add precision to analysis
  – Often these are potential confounders, because they may be associated with predictor(s) of interest in sample
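
The role of a potential confounder can be illustrated with a small Python sketch on made-up records. Here age group is associated with both the outcome and, in this sample, the exposure, so the crude comparison and the within-stratum comparisons disagree. The data, variable names, and the helper `mean_difference` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical records: (age_group, exposed, outcome). Illustration only.
records = [
    ("young", False, 10), ("young", False, 10), ("young", False, 10), ("young", True, 10),
    ("old", False, 20), ("old", True, 20), ("old", True, 20), ("old", True, 20),
]

def mean_difference(rows):
    """Mean outcome in the exposed minus mean outcome in the unexposed."""
    exposed = [y for _, e, y in rows if e]
    unexposed = [y for _, e, y in rows if not e]
    return mean(exposed) - mean(unexposed)

# Crude comparison ignores age group entirely.
crude = mean_difference(records)

# Stratified comparison looks within each age group separately.
by_stratum = {
    group: mean_difference([r for r in records if r[0] == group])
    for group in ("young", "old")
}
print(crude, by_stratum)
```

In this constructed sample the crude difference is 5 while the difference within each age group is 0: the apparent exposure effect comes entirely from older subjects being more often exposed.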

29

Statistical Role of Variables

• Factors that are
  – Varied systematically
    • Interventions, blocking factors
  – Controlled at particular levels
  – Sampled, measured, and used in analysis
    • Observational predictor of interest or covariates
  – Sampled and used in analysis
    • Random effects, possibly balanced across interventions
  – Sampled but unmeasured
    • source of “error”?
  – Sampled and measured as outcome

30

Descriptive Statistics

• Tables, plots
• Identification of measurement or data entry errors
• Characterize materials and methods
  – Available data
• Validity of analysis methods
  – Assess scientific and statistical assumptions
• (Straightforward estimates of effects-- inference)
• Hypothesis generation (inference-- estimation)

31

Inferential Methods

• Cluster analysis
• Factor analysis
• Prediction, quantifying, or comparing distributions
• Summary measures
• Comparison measures
• Probability models
• Statistical models, covariate adjustment
• Measures of accuracy and precision of analysis

32

Results of Analyses

• For each scientific question
  – Point estimate
  – Interval estimate
  – Quantification of strength of evidence
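
A minimal Python sketch of these three quantities for one comparison of two groups, using only the standard library. It assumes the scientific question has been reduced to a difference in means and uses a large-sample normal approximation, which is crude for samples this small; a t-based or other method might well be preferred in practice. The data and the function `compare_means` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Hypothetical measurements in two groups (illustration only).
treated = [5.1, 6.3, 5.8, 7.0, 6.1, 5.5, 6.7, 6.0]
control = [4.9, 5.2, 5.0, 5.9, 4.7, 5.4, 5.6, 5.1]

def compare_means(x, y, level=0.95):
    """Difference in means with a normal-approximation interval and two-sided P value."""
    estimate = mean(x) - mean(y)  # point estimate
    # Standard error allowing unequal variances in the two groups.
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    interval = (estimate - z_crit * se, estimate + z_crit * se)  # interval estimate
    # Two-sided P value against the null hypothesis of no difference in means.
    p_value = 2 * (1 - NormalDist().cdf(abs(estimate / se)))
    return estimate, interval, p_value

estimate, (lower, upper), p_value = compare_means(treated, control)
print(f"estimate {estimate:.2f}, 95% CI ({lower:.2f}, {upper:.2f}), P = {p_value:.4f}")
```

Reporting all three together keeps the size of the effect, the precision of its estimate, and the strength of evidence visible side by side.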

33

Interpretation

• Statistician as collaborator
  – Contribution to manuscript
    • (Introduction)
    • Statistical methods
    • Results
      – Analysis
      – Clarity of presentation
    • Discussion
      – Especially where statistical limitations play a role

34

Education

• Understanding prior analyses
• Interpretation of literature

35

Next Lectures

• October 3:
  – Statistical classification of scientific questions
• October 10:
  – Approach to design of studies