Collecting and Organizing Data (for ease of analysis and good results!)
Annie N. Simpson, MSc.
Biostatistician
3
Data Collection Considerations
Should be investigated when your study is being planned
Should be implemented before (or shortly after) subject recruitment begins
Computational collection tools should be proportionate to the size of your study (size = number of subjects, number of forms/collection instruments)
A Data Collection Schedule is often the best place to start!
4
Case Report Forms
If available, use a previously used and vetted form (e.g. HAM-D)
All forms in a case book for a study should have the same “header information”
Header information should capture patient ID, patient initials (if commonly used), visit number and/or type, and time of visit (if collecting things multiple times during one visit)
Remember that you are creating forms that may be used more than once depending on your study design, so you need to know how to differentiate visits, etc.
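One way to picture the shared header is as a small record whose fields together tell one completed form apart from another. This is only a sketch: the class name `FormHeader`, the field names, and the sample values are hypothetical and not taken from any real case report form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormHeader:
    """Header fields repeated on every form in the case book (hypothetical names)."""
    patient_id: str
    visit_number: int
    visit_type: str
    patient_initials: Optional[str] = None
    # Only needed when something is collected several times within one visit.
    time_of_visit: Optional[str] = None

    def key(self):
        """Tuple that distinguishes one completed form from another."""
        return (self.patient_id, self.visit_number, self.visit_type, self.time_of_visit)


# The same patient and the same form, filled in at two different visits.
baseline = FormHeader("0007", 1, "baseline")
followup = FormHeader("0007", 2, "follow-up")
print(baseline.key() != followup.key())
```

If two completed forms ever produce the same key, the header is missing something needed to differentiate visits.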
DART Seminar Invited Talk 3/24/09 5
In Protocol
6
For Data Collection During Study!
*Can even be used as a “face page” for each subject’s binder, where each visit/form can get checked off!
7
Steps that I follow when I have a new study (from my perspective):
Create and review with the team (this is a very long but worthwhile meeting):
Updated “Form Based” Data Collection Schedule
Complete blank Case Report Form book
Go through each page of the CRF book with your team and ask questions (and let them ask) about anything that is not clear to everyone (include your statistician!).
Review each person’s responsibilities/roles
Review the current timeline
8
How to electronically capture your data to a spreadsheet…
Not every form HAS to be entered; think about whether the information will be analyzed or is only for study coordination
Patient identifier number should be a column on every spreadsheet and should be set up EXACTLY the same (same length and type)
Usually one spreadsheet per collection form
Usually laid out “vertically”, i.e. one row for each patient for each visit time
NEVER skip filling down the columns!
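The identifier and fill-down rules above can be checked automatically once a form has been exported. The sketch below assumes a CSV export with a `patient_id` column and 4-digit IDs; the column names, the ID length, and the sample rows are all made up for illustration.

```python
import csv
import io

# Hypothetical export of one collection form, one row per patient per visit.
SAMPLE = """patient_id,visit,score
0001,1,12
0001,2,9
0002,1,15
,2,11
3,1,14
"""


def check_ids(rows, id_column="patient_id", id_length=4):
    """Return (row_number, problem) pairs for IDs that are blank or malformed."""
    problems = []
    for row_number, row in enumerate(rows, start=2):  # row 1 is the header
        value = row[id_column]
        if value == "":
            problems.append((row_number, "blank ID (column not filled down)"))
        elif not (value.isdigit() and len(value) == id_length):
            problems.append((row_number, f"ID {value!r} is not {id_length} digits"))
    return problems


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
problems = check_ids(rows)
for row_number, problem in problems:
    print(f"row {row_number}: {problem}")
```

Reading the IDs as text rather than numbers keeps leading zeros, which is one reason to fix the length and type up front.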
9
Examples of bad data layouts…
And good ones
10
How to think about how to begin analysis?
1st, clean your data!
Don’t forget to first check your N’s for correctness: are they what you expect (for each form!)?
Also examine the extreme values (maxes & mins) for each of your variables as the simplest way to check for incorrectly entered (i.e. dirty) data.
Always have original source documents (when you can) and don’t neglect checking between them and your spreadsheets!
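A minimal sketch of these two checks, assuming the expected N per visit comes from the data collection schedule. The records, the expected counts, and the plausible range for the variable are hypothetical values chosen only to show the idea.

```python
from collections import Counter

# Hypothetical rows from one form: (patient_id, visit, systolic blood pressure)
records = [
    ("0001", 1, 118), ("0001", 2, 122),
    ("0002", 1, 131), ("0002", 2, 1290),  # 1290 looks like a typo for 129
    ("0003", 1, 125),
]
expected_n_per_visit = {1: 3, 2: 3}  # assumed, from the data collection schedule

# Check the N's: how many rows were entered for each visit?
observed_n = Counter(visit for _, visit, _ in records)
n_mismatches = {
    visit: (observed_n.get(visit, 0), expected)
    for visit, expected in expected_n_per_visit.items()
    if observed_n.get(visit, 0) != expected
}

# Check the extremes: min and max, then list anything outside a plausible range.
values = [sbp for _, _, sbp in records]
low, high = min(values), max(values)
plausible = (60, 260)  # assumed plausible range for this variable
out_of_range = [r for r in records if not plausible[0] <= r[2] <= plausible[1]]

print("N mismatches (observed, expected):", n_mismatches)
print("min:", low, "max:", high)
print("check against source documents:", out_of_range)
```

Anything flagged here is a row to look up in the original source documents, not a value to correct by guesswork.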
11
How to think about how to begin analysis?
Think about what your “Table 1” will contain and how it will look
Should the table describe the total sample… or perhaps be split by gender? (depends on the question or focus of your research)
Can use any simple software to do this: Excel, SPSS, Minitab
For all continuous vars get N, Mean, STD
For all categorical vars get N, %, Total N’s
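The talk names Excel, SPSS, and Minitab; the same Table 1 numbers can also be sketched in a few lines of Python. The patient data and variable names below are invented for illustration, and the table describes the total sample rather than splitting by group.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical baseline data, one dictionary per patient.
patients = [
    {"age": 34, "gender": "F"},
    {"age": 51, "gender": "M"},
    {"age": 47, "gender": "F"},
    {"age": 29, "gender": "F"},
]


def describe_continuous(values):
    """N, mean, and sample standard deviation for a continuous variable."""
    return {"n": len(values), "mean": mean(values), "std": stdev(values)}


def describe_categorical(values):
    """Map each level to (count, percent of total N) for a categorical variable."""
    total = len(values)
    return {level: (count, 100 * count / total) for level, count in Counter(values).items()}


age_summary = describe_continuous([p["age"] for p in patients])
gender_summary = describe_categorical([p["gender"] for p in patients])

print(f"Age: N={age_summary['n']}, mean={age_summary['mean']:.1f}, SD={age_summary['std']:.1f}")
for level, (count, percent) in gender_summary.items():
    print(f"Gender {level}: N={count} ({percent:.0f}%) of {len(patients)}")
```

To build the table by gender instead, the same two functions could be applied to each subgroup of patients separately.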
12
Basic Analysis of Continuous Response Variables
Numerical Descriptives: Mean, Median, Mode, Variance, Range
Graphical Descriptives: Boxplots, Scatterplots, Histograms
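As a small illustration of the numerical descriptives listed above, Python's standard `statistics` module covers most of them directly. The values are made up, and the variance shown is the sample variance (dividing by N - 1), which is an assumption about which version is wanted.

```python
from statistics import mean, median, mode, variance

# Hypothetical continuous response values.
values = [4, 7, 7, 9, 13]

summary = {
    "mean": mean(values),
    "median": median(values),
    "mode": mode(values),
    "variance": variance(values),        # sample variance, divides by N - 1
    "range": max(values) - min(values),  # max minus min
}

for name, value in summary.items():
    print(f"{name}: {value}")
```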
13
Basic Analysis of Categorical Response Variables
Numerical Descriptives: Frequencies, Percents/Proportions
Graphical Descriptives: Bar Charts, Pie Graphs (not so common in biomedical research)
14
Other data considerations
Large multi-center clinical trials will usually have a centralized data collection and coordinating center.
You, as a clinical site, would be responsible for error correction with source documentation.
Training of entry/coordination staff is very important (ex: in one 5-year study, the statistician received the data at the end and found that study group had never been recorded anywhere, and it wasn’t on the source documents either!)
Your study is only as good as the data that you collect; pre-planning is the key.