Collecting and Organizing Data (for ease of analysis and good results!)

Annie N. Simpson, MSc.

Biostatistician

Upload: mervyn-stokes

Post on 05-Jan-2016


Page 1: Collecting and Organizing Data (for ease of analysis and good results!) Annie N. Simpson, MSc. Biostatistician


Page 2

Page 3

Data Collection Considerations

Should be investigated while your study is being planned

Should be implemented before (or shortly after) subject recruitment begins

Computational collection tools should be proportionate to the size of your study (size = number of subjects and number of forms/collection instruments)

A data collection schedule is often the best place to start!

Page 4

Case Report Forms

If available, use a previously used and vetted form (e.g., HAM-D)

All forms in a case book for a study should have the same “header information”

Header information should capture patient ID, patient initials (if commonly used), visit number and/or type, and time of visit (if collecting measurements multiple times during one visit)

Remember that you are creating forms that may be used more than once depending on your study design, so you need to be able to differentiate visits, etc.
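As a minimal sketch of the header idea, the fields listed above can be written down as one record type that every form shares. The class name, field names, and example values below are assumptions for illustration only; they do not come from any particular study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormHeader:
    """Header fields repeated on every form in the case book (names are illustrative)."""
    patient_id: str                          # kept as text so leading zeros survive
    visit_number: int
    visit_type: str                          # e.g. "screening" (hypothetical label)
    patient_initials: Optional[str] = None   # only if commonly used
    time_of_visit: Optional[str] = None      # only if collected several times in one visit

    def key(self):
        """Tuple that identifies one completed occurrence of a form."""
        return (self.patient_id, self.visit_number, self.time_of_visit)


# The same form filled in at three different occasions for one patient.
baseline = FormHeader("0012", 1, "screening", "AB")
week4_am = FormHeader("0012", 2, "follow-up", "AB", "08:00")
week4_pm = FormHeader("0012", 2, "follow-up", "AB", "16:00")
```

Because the key combines patient, visit, and time, a form used more than once can still be told apart at each use.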

Page 5

DART Seminar Invited Talk 3/24/09

In Protocol

Page 6

For Data Collection During Study!

*Can even be used as a “face page” for each subject’s binder, where each visit/form can get checked off!
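The "checked off" use of a form-based schedule can be sketched roughly as follows. The visits and forms are hypothetical (only HAM-D is mentioned in this talk), and a real schedule would come from the protocol.

```python
# Hypothetical form-based schedule: visit name -> forms due at that visit.
SCHEDULE = {
    "screening": ["demographics", "HAM-D"],
    "week 4": ["HAM-D", "adverse events"],
}


def outstanding_forms(schedule, completed):
    """Return (visit, form) pairs on the schedule that are not yet checked off."""
    return [
        (visit, form)
        for visit, forms in schedule.items()
        for form in forms
        if (visit, form) not in completed
    ]


# Forms already checked off on one subject's face page.
done = {("screening", "demographics"), ("screening", "HAM-D")}
missing = outstanding_forms(SCHEDULE, done)
```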

Page 7

Steps that I follow when I have a new study (from my perspective):

Create and review with the team (this is a very long but worthwhile meeting):

Updated “form based” data collection schedule

Complete blank case report form (CRF) book

Go through each page of the CRF book with your team and ask (and let them ask) about anything that is not clear to everyone (include your statistician!).

Review each person’s responsibilities/roles

Review the current timeline

Page 8

How to electronically capture your data to a spreadsheet…

Not every form HAS to be entered; think about whether the information will be analyzed or is only for study coordination

Patient identifier number should be a column on every spreadsheet and should be set up EXACTLY the same (same length and type)

Usually one spreadsheet per collection form

Usually laid out “vertically”, i.e. one row for each patient for each visit time

NEVER skip filling down the columns!
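The layout rules above can be checked mechanically. The sketch below assumes a hypothetical CSV export of one form, with made-up column names (`patient_id`, `visit`, `hamd_total`) and a four-digit ID convention. It flags blank IDs (a column that was not filled down), IDs of the wrong length or type, and repeated patient/visit rows.

```python
import csv
import io

# Hypothetical export of one collection form: one row per patient per visit.
RAW = """patient_id,visit,hamd_total
0012,1,21
0012,2,17
,3,15
013,1,19
"""


def layout_problems(rows, id_field="patient_id", id_length=4):
    """Return (line number, message) pairs for rows that break the layout rules."""
    problems = []
    seen = set()
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header row
        pid = row[id_field]
        if pid == "":
            problems.append((line_no, "blank patient ID (column not filled down)"))
            continue
        if len(pid) != id_length or not pid.isdigit():
            problems.append((line_no, "patient ID has wrong length or type"))
        key = (pid, row["visit"])
        if key in seen:
            problems.append((line_no, "duplicate patient/visit row"))
        seen.add(key)
    return problems


rows = list(csv.DictReader(io.StringIO(RAW)))
problems = layout_problems(rows)
```

Reading the ID as text rather than as a number is a deliberate choice here, since a numeric column would silently turn `0012` into `12`.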

Page 9

Examples of bad data layouts…

And good ones

Page 10

How to think about how to begin analysis?

First, clean your data!

Don’t forget to first check your N’s for correctness: are they what you expect (for each form!)?

Also examine the extreme values (max & min) for each of your variables; this is the simplest way to check for incorrectly entered (i.e. dirty) data.

Always keep original source documents (when you can), and don’t neglect checking your spreadsheets against them!
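A rough sketch of the first two cleaning checks (N's and extreme values) might look like this. The variable names, the expected N, and the plausible ranges are assumptions chosen for the example. The 0 to 52 range assumes the 17-item HAM-D and should be replaced by whatever your instrument actually allows.

```python
def check_form(rows, expected_n, limits):
    """Compare the row count with the expected N and flag out-of-range values.

    rows: list of dicts, one per patient/visit
    limits: variable name -> (low, high) plausible range (assumed, study-specific)
    """
    report = {"n": len(rows), "n_ok": len(rows) == expected_n, "flags": []}
    for var, (low, high) in limits.items():
        values = [row[var] for row in rows if row[var] is not None]
        if not values:
            continue
        report[var] = (min(values), max(values))  # the extremes to eyeball
        for row in rows:
            value = row[var]
            if value is not None and not (low <= value <= high):
                report["flags"].append((row["patient_id"], var, value))
    return report


rows = [
    {"patient_id": "0012", "age": 34, "hamd_total": 21},
    {"patient_id": "0013", "age": 340, "hamd_total": 17},  # looks like a typing error
    {"patient_id": "0014", "age": 51, "hamd_total": 19},
]
report = check_form(rows, expected_n=4, limits={"age": (18, 100), "hamd_total": (0, 52)})
```

Anything that lands in `report["flags"]` is a candidate for checking against the source document, not an automatic correction.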

Page 11

How to think about how to begin analysis?

Think about what your “Table 1” will look like and how you will build it

Should the table describe the total sample…or perhaps be broken down by gender? (Depends on the question or focus of your research)

Can use any simple software to do this: Excel, SPSS, Minitab

For all continuous variables get N, Mean, STD

For all categorical variables get N, %, Total N’s
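The talk names Excel, SPSS, and Minitab. As an illustration only, the same N / Mean / STD summary for a continuous variable, either for the total sample or split by a grouping variable, can be sketched in Python. The data and column names are invented.

```python
from statistics import mean, stdev


def describe_continuous(rows, var, by=None):
    """N, mean and sample standard deviation of `var`, overall or per level of `by`."""
    groups = {}
    for row in rows:
        if row[var] is None:          # missing values do not count toward N
            continue
        label = row[by] if by else "Total"
        groups.setdefault(label, []).append(row[var])
    return {
        label: {
            "N": len(values),
            "Mean": round(mean(values), 2),
            "STD": round(stdev(values), 2) if len(values) > 1 else None,
        }
        for label, values in groups.items()
    }


sample = [
    {"gender": "F", "age": 30},
    {"gender": "F", "age": 40},
    {"gender": "M", "age": 50},
    {"gender": "M", "age": 60},
    {"gender": "M", "age": None},     # missing age
]
overall = describe_continuous(sample, "age")
by_gender = describe_continuous(sample, "age", by="gender")
```

Reporting N next to each mean makes missing values visible: here the male group has three rows but N = 2.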

Page 12

Basic Analysis of Continuous Response Variables

Numerical Descriptives: Mean, Median, Mode, Variance, Range

Graphical Descriptives: Boxplots, Scatterplots, Histograms
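For the numerical descriptives, Python's standard `statistics` module covers each item in the list. The scores below are made up for illustration, and the variance shown is the sample variance (n - 1 denominator), which is an assumption about which version is wanted.

```python
import statistics

scores = [12, 15, 15, 18, 21, 24, 30]   # invented example values

summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "variance": statistics.variance(scores),   # sample variance, n - 1 denominator
    "range": max(scores) - min(scores),
}
```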

Page 13

Basic Analysis of Categorical Response Variables

Numerical Descriptives: Frequencies, Percents/Proportions

Graphical Descriptives: Bar Charts, Pie Graphs (not so common in biomedical research)
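Frequencies and percents for a categorical variable can be sketched with a simple counter. The category labels are hypothetical, and the percents use the non-missing total as the denominator, which is one common convention rather than the only one.

```python
from collections import Counter


def frequency_table(values):
    """Return ({level: (count, percent)}, total N), ignoring missing values."""
    observed = [v for v in values if v is not None]
    total = len(observed)
    counts = Counter(observed)
    table = {
        level: (n, round(100 * n / total, 1))
        for level, n in counts.most_common()
    }
    return table, total


groups = ["treatment", "placebo", "treatment", None, "treatment"]
table, total_n = frequency_table(groups)
```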

Page 14

Other data considerations

Large multi-center clinical trials will usually have a centralized data collection and coordinating center.

You, as a clinical site, would be responsible for error correction with source documentation.

Training of entry/coordination staff is very important (example: after 5 years of data collection, the statistician received the data and found that the study group had not been recorded anywhere, not even on the source documents!)

Your study is only as good as the data that you collect; pre-planning is the key.
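One small piece of pre-planning that guards against the missing study group problem described above is to list the variables the analysis cannot do without and check every spreadsheet for them before enrollment starts. The variable names here are assumptions for the sake of the sketch.

```python
# Hypothetical list of variables the analysis cannot do without.
REQUIRED = {"patient_id", "visit", "study_group"}


def missing_variables(columns, required=REQUIRED):
    """Return the required variables that are absent from a spreadsheet's columns."""
    return sorted(required - set(columns))


missing_vars = missing_variables(["patient_id", "visit", "hamd_total"])
```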