![Page 1: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/1.jpg)
Elements of Data Documentation
Adam Mack
Education and Human Development Incubator (EHDi)
Social Science Research Institute
October 1, 2015
![Page 2: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/2.jpg)
Why Is Documentation Important?
• Describe the contents of the data
• Explain context in which data was collected
• Explain any manipulations performed on the data
• Allow research data to be understood by people outside of the original project
![Page 3: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/3.jpg)
Do I Need to Document?
Research: back in the day… and now.
![Page 4: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/4.jpg)
Consequences of Insufficient Documentation
• Data may be unusable
• Users may make inaccurate assumptions about the data
– Manipulations performed on data may affect results of analyses
– May be unclear how to interpret contents of a variable
![Page 6: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/6.jpg)
Consequences of Insufficient Documentation: Example
• Assume each of the following prompts is answered on a 1–5 agreement scale, with the responses shown:
– Data management is great. (dmgreat): 5
– Data management is the greatest! (dmgrtst): 5
– I don’t like data management. (dmnolike): 1
• dmnolike needs to be reverse scored (to 5) before a scale score can be calculated from the variables.
• You can recode this value within the same variable, but should you?
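One way to approach the question above is to leave the raw response untouched and write the reverse-scored value to a new variable, so the manipulation stays visible in the data. The following is a minimal sketch, not a prescribed method; the function `reverse_score` and the variable name `dmnolike_r` are assumptions made for illustration.

```python
# Hypothetical sketch: reverse score into a NEW variable so the raw response survives.
SCALE_MIN, SCALE_MAX = 1, 5  # the 1-5 agreement scale from the example


def reverse_score(value, low=SCALE_MIN, high=SCALE_MAX):
    """Flip a response on a low..high scale (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    return low + high - value


response = {"dmgreat": 5, "dmgrtst": 5, "dmnolike": 1}

# "dmnolike_r" is an assumed name; the "_r" suffix marks the reversed copy.
response["dmnolike_r"] = reverse_score(response["dmnolike"])

# The scale score uses the reversed copy, never the raw negative item.
scale_items = ["dmgreat", "dmgrtst", "dmnolike_r"]
scale_score = sum(response[item] for item in scale_items) / len(scale_items)

print(response["dmnolike"], response["dmnolike_r"], scale_score)
```

Keeping both variables means a later user can see the original answer and the recoded one side by side, which is much harder to misread than a silently overwritten value.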
![Page 7: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/7.jpg)
Elements of Data Documentation
• What are the most important elements to document?
– Data elements
– Study elements
– Processes and decisions
![Page 9: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/9.jpg)
Elements of Data Documentation
• Who will be using the documentation?
– Data managers
– Statisticians
– Researchers
– Outside users
![Page 10: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/10.jpg)
Elements of Data Documentation
• When should documentation be created?
– Often, projects wait until data has been collected before creating documentation such as codebooks.
– Creating documentation early in the project has numerous advantages.
![Page 11: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/11.jpg)
Elements of Data Documentation
• How should these elements be documented?
Potential forms that documentation may take include:
– Codebook
– Annotated version of instrument
– More descriptive, less structured forms of documentation (data narratives)
![Page 14: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/14.jpg)
Data-Level Documentation
• What are the most important elements to document?
– Data elements
– Study elements
– Processes and decisions
![Page 15: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/15.jpg)
Data-Level Documentation
Should include basic information needed to use the data, including:
• Structural information about variable
– Name of variable
– Label (if applicable)
– Type of variable (numeric or character)
– Length of variable
![Page 16: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/16.jpg)
Data-Level Documentation
• Information describing variable contents
– Question text (or text description of variable contents)
– Valid values
– Coding of values
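The structural and content fields listed on these slides can be gathered into one record per variable. The sketch below is only an illustration of that idea: the class name `CodebookEntry`, its field names, the length of 8, and the value labels are all assumptions, not part of the original workshop material.

```python
from dataclasses import dataclass, field


@dataclass
class CodebookEntry:
    """One codebook row: structural information plus a description of contents."""
    name: str            # name of variable
    label: str           # label (if applicable)
    var_type: str        # "numeric" or "character"
    length: int          # length of variable
    question_text: str   # question text or description of contents
    valid_values: list = field(default_factory=list)
    value_coding: dict = field(default_factory=dict)

    def is_valid(self, value):
        """True when the value is one of the documented valid values."""
        return value in self.valid_values


# Illustrative entry; the length and the value labels are assumed.
entry = CodebookEntry(
    name="dmgreat",
    label="Data management is great",
    var_type="numeric",
    length=8,
    question_text="Data management is great.",
    valid_values=[1, 2, 3, 4, 5],
    value_coding={1: "Strongly disagree", 5: "Strongly agree"},
)

print(entry.name, entry.var_type, entry.is_valid(3), entry.is_valid(6))
```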
![Page 17: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/17.jpg)
Data-Level Documentation
• Scales/derived variables
– Algorithm used to create variable
– Procedures for handling missing data
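Documenting a derived variable is easier when the algorithm and the missing-data rule live in one place. The sketch below assumes a scale scored as the mean of its answered items and an assumed rule that at least `min_valid` items must be answered; both the function name and the rule are hypothetical, and a real project should document whatever rule it actually uses.

```python
def scale_mean(values, min_valid):
    """Mean of the non-missing items, or None when too few items were answered.

    Missing responses are represented as None (an assumption of this sketch).
    """
    answered = [v for v in values if v is not None]
    if len(answered) < min_valid:
        return None
    return sum(answered) / len(answered)


complete = scale_mean([5, 5, 5], min_valid=2)          # all items answered
partial = scale_mean([5, None, 4], min_valid=2)        # one item missing
too_sparse = scale_mean([5, None, None], min_valid=2)  # below the threshold

print(complete, partial, too_sparse)
```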
![Page 18: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/18.jpg)
Data-Level Documentation
• Question routing (if skip patterns used)
– Identify number of participants asked each question/path through survey
• Error checking/validation
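Question routing and validation can be checked with a few counts. The sketch below assumes a made-up skip pattern in which `dm_hours` is asked only when `uses_dm` equals 1; the variable names and records are hypothetical and exist only to show the counts worth reporting.

```python
# Hypothetical records: dm_hours is only asked when uses_dm == 1.
records = [
    {"id": 1, "uses_dm": 1, "dm_hours": 10},
    {"id": 2, "uses_dm": 0, "dm_hours": None},
    {"id": 3, "uses_dm": 1, "dm_hours": None},  # asked, but did not answer
    {"id": 4, "uses_dm": 0, "dm_hours": 3},     # answered a question they should have skipped
]


def was_asked(record):
    """Routing rule for dm_hours under the assumed skip pattern."""
    return record["uses_dm"] == 1


asked = [r for r in records if was_asked(r)]
n_asked = len(asked)
n_answered = sum(1 for r in asked if r["dm_hours"] is not None)

# Validation: a value present for someone routed past the question is an error.
routing_errors = [
    r["id"] for r in records
    if not was_asked(r) and r["dm_hours"] is not None
]

print(n_asked, n_answered, routing_errors)
```

Reporting the number asked separately from the number answered lets a reader tell a legitimate skip apart from item nonresponse.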
![Page 19: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/19.jpg)
Data-Level Documentation
• Reliability of scales
– Calculate Cronbach’s alpha for each scale included in the data
– Compare values for your study to previously reported values in the literature
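For reference, Cronbach's alpha for a scale of k items is k/(k−1) × (1 − sum of item variances / variance of the total score). The sketch below implements that formula with the standard library only; it assumes complete data (no missing responses), and the sample responses are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per respondent, all the same length.
    Sample variances are used throughout; the ratio is the same as long
    as item and total variances use the same denominator.
    """
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented responses: five respondents, three items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 1, 2],
    [4, 4, 3],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Statistical packages report the same statistic; a hand-rolled version like this is mainly useful for checking that the documented value can be reproduced from the data file.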
![Page 20: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/20.jpg)
Types of Data Documentation
• Tabular codebook (Excel)
– Good for organizing a large amount of information concisely
– Sortable
– Filterable
– Customizable; can hide columns that may be needed but are not of interest to a general audience
![Page 21: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/21.jpg)
Tabular Codebook
![Page 22: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/22.jpg)
Types of Data Documentation
• Annotated instrument
– Contains basic variable and value information in context
– Easy to interpret
– Difficult to integrate much additional detail; not useful for some forms of data
![Page 23: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/23.jpg)
Annotated Instrument
![Page 24: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/24.jpg)
Study-Level Documentation
• What are the most important elements to document?
– Data elements
– Study elements
– Processes and decisions
![Page 25: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/25.jpg)
Study-Level Documentation
• Details about the source of the data
– Study design and purpose
– Collection method
– Information about the research sample
– Longitudinal time points (if applicable)
![Page 26: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/26.jpg)
Study-Level Documentation
• Information about data files
– File name/version
– Date created
– Number of records
– Number of variables
– Changes since last version of file
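Record and variable counts are easy to get wrong when typed by hand, so they can be computed from the file itself. The sketch below is one possible approach; the function name, the file name, and the sample contents are hypothetical, and the CSV text is passed in as a string only to keep the example self-contained.

```python
import csv
import io
from datetime import date


def describe_data_file(file_name, version, created, csv_text, changes):
    """Build file-level documentation; counts come from the data, not from memory."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)                       # first row holds variable names
    n_records = sum(1 for row in reader if row)  # skip any blank lines
    return {
        "file_name": file_name,
        "version": version,
        "date_created": created.isoformat(),
        "n_records": n_records,
        "n_variables": len(header),
        "changes_since_last_version": changes,
    }


# Hypothetical file contents and metadata.
sample_csv = "id,dmgreat,dmnolike\n1,5,1\n2,4,2\n"
info = describe_data_file(
    "dm_survey.csv", 2, date(2015, 10, 1), sample_csv,
    changes="Added second batch of responses.",
)

for key, value in info.items():
    print(f"{key}: {value}")
```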
![Page 27: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/27.jpg)
Study-Level Documentation
• Information about measures used
– Description of measure
– Description of scales
– Source of measure, including references as appropriate
![Page 28: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/28.jpg)
Study-Level Documentation
Programs used to process/manipulate data
– Documentation within program (comments)
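As an illustration of documentation within a program, the sketch below pairs a header comment block with an inline comment that explains a decision rather than restating the code. The program name, file names, and the age calculation are all hypothetical; the header fields are one reasonable layout, not a required standard.

```python
"""
Program:  age_example.py (hypothetical name)
Purpose:  Compute age in completed years at assessment.
Input:    Birth date and assessment date for each participant.
Output:   Integer age in years.
Changes:  v1 - initial version.
"""
from datetime import date


def age_in_years(birth_date, assessment_date):
    """Age in completed years on the assessment date."""
    years = assessment_date.year - birth_date.year
    # Decision: subtract a year when the birthday has not yet occurred in the
    # assessment year, so a child assessed the day before turning 5 counts as 4.
    birthday_pending = (assessment_date.month, assessment_date.day) < (
        birth_date.month,
        birth_date.day,
    )
    return years - int(birthday_pending)


print(age_in_years(date(2010, 10, 2), date(2015, 10, 1)))
```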
![Page 29: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/29.jpg)
Study-Level Documentation
Programs used to process/manipulate data
– Documentation of what various programs do and in what order they are used
| Program | Description |
| --- | --- |
| SSIS_01 | Creates data set with 1st batch of data. Includes scoring code for social skills and problem behavior scales and subscales. |
| SSIS_01a | Corrects scoring issue with problem behavior scale. |
| SSIS_02 | Adds 2nd batch of data; adds assessment date and birth date information to allow calculation of age-dependent scores. |
| SSIS_03 | Adds 3rd batch of data. |
![Page 30: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/30.jpg)
Study-Level Documentation
• Data narrative
– Good for measure/study-level information
![Page 31: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/31.jpg)
Study-Level Documentation
• Data narrative (continued)
![Page 32: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/32.jpg)
Decision and Process Documentation
• What are the most important elements to document?
– Data elements
– Study elements
– Processes and decisions
![Page 33: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/33.jpg)
Decision and Process Documentation
• By far, the least established area of research documentation.
• Due to individual differences between research projects, it can be difficult to identify a standard template.
![Page 34: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/34.jpg)
Decision and Process Documentation
Elements to include in documentation:
• Scope (variables/measures)
• Time (if multiple time points)
• Purpose of the process, or a description of the situation requiring a decision
![Page 35: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/35.jpg)
Decision and Process Documentation
Elements to include in documentation:
• Information from the data that describes or affects the decision or process
• A description of the process itself, including:
– Any software or tools needed to complete the process
– Any resources/references used
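Because no standard template exists, one option is to capture the elements listed on these two slides in a small structured record. The sketch below is only one possible layout; the class `DecisionRecord`, its field names, and the example decision about conflicting birth dates are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    """One documented decision or process, using the elements listed above."""
    scope: list          # variables/measures affected
    time_points: list    # empty when the study has a single time point
    purpose: str         # purpose of the process or situation requiring a decision
    relevant_data: str   # information from the data that affects the decision
    process: str         # description of the process itself
    tools: list = field(default_factory=list)
    references: list = field(default_factory=list)

    def to_text(self):
        """Render the record as plain text for a data narrative."""
        return "\n".join([
            f"Scope: {', '.join(self.scope)}",
            f"Time: {', '.join(self.time_points) or 'n/a'}",
            f"Purpose: {self.purpose}",
            f"Relevant data: {self.relevant_data}",
            f"Process: {self.process}",
            f"Tools: {', '.join(self.tools) or 'none'}",
            f"References: {', '.join(self.references) or 'none'}",
        ])


# Hypothetical example of a documented decision.
record = DecisionRecord(
    scope=["birth_date"],
    time_points=["Time 1", "Time 2"],
    purpose="Resolve birth dates that differ between time points.",
    relevant_data="Some participants have different birth dates at each time point.",
    process="Use the Time 1 value unless the source record confirms the Time 2 value.",
    tools=["Python 3"],
)

print(record.to_text())
```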
![Page 36: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/36.jpg)
Decision and Process Documentation
• What sorts of decisions and processes should be documented with this level of detail?
– Basic scales and processes that are commonly utilized may not require this much detail
– Processes and procedures that are not well established or that deviate significantly from the standard method should be documented
![Page 37: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/37.jpg)
Decision and Process Documentation
• Examples of processes that might need to be documented
– Naming conventions for variables
– Naming conventions for data files
– Structure of data directories
– Version information
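A naming convention is easier to follow when file names are generated and checked by code instead of typed by hand. The sketch below assumes a made-up convention of `study_content_vNN_YYYYMMDD.ext`; the convention, the function name, and the example values are hypothetical.

```python
import re
from datetime import date

# Assumed convention: study_content_vNN_YYYYMMDD.ext, all lowercase.
NAME_PATTERN = re.compile(r"^[a-z0-9]+_[a-z0-9]+_v\d{2}_\d{8}\.[a-z0-9]+$")


def data_file_name(study, content, version, created, ext="csv"):
    """Build a file name that carries the version and creation date."""
    return f"{study}_{content}_v{version:02d}_{created:%Y%m%d}.{ext}"


def follows_convention(file_name):
    """True when a file name matches the assumed convention."""
    return NAME_PATTERN.match(file_name) is not None


name = data_file_name("ssis", "scored", 3, date(2015, 10, 1))
print(name, follows_convention(name), follows_convention("final_FINAL2.csv"))
```

Zero-padding the version and writing the date as year, month, day keeps files in chronological order when a directory is sorted by name.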
![Page 38: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/38.jpg)
Decision and Process Documentation
• Examples of decisions that might need to be documented
– Resolving discrepancies in data obtained from multiple sources or at multiple time points
– Data transformations that require interpretation
![Page 39: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/39.jpg)
Decision and Process Documentation
![Page 40: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/40.jpg)
Tools for Documentation
• Statistical software packages (e.g. SAS, Stata)
– Variable information (PROC CONTENTS in SAS; describe in Stata)
– Provides a good starting point for a codebook
• Database management systems
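For projects that do not use SAS or Stata, a rough analog of that variable listing can be produced with a short script. The sketch below is not a reimplementation of PROC CONTENTS or describe; it only infers a name, a type, a display width, and a missing count from in-memory records, and every name in it is hypothetical.

```python
def describe_variables(records):
    """Summarize each variable as a starting point for a codebook.

    records: list of dicts sharing the same keys; None marks a missing value.
    "length" here is the widest value when written as text, not a storage length.
    """
    summary = []
    for name in records[0]:
        values = [r[name] for r in records if r[name] is not None]
        is_numeric = all(isinstance(v, (int, float)) for v in values)
        summary.append({
            "name": name,
            "type": "numeric" if is_numeric else "character",
            "length": max((len(str(v)) for v in values), default=0),
            "n_missing": len(records) - len(values),
        })
    return summary


# Hypothetical records.
data = [
    {"id": 1, "site": "north", "dmgreat": 5},
    {"id": 2, "site": "south", "dmgreat": None},
    {"id": 3, "site": "east", "dmgreat": 4},
]

variables = describe_variables(data)
for row in variables:
    print(row)
```

Labels, question text, and value coding still have to be added by hand; the generated listing only covers the structural columns.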
![Page 41: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/41.jpg)
Tools for Documentation
• Data collection instruments
– Paper forms
– Electronic/online collection
![Page 42: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/42.jpg)
PROC CODEBOOK (SAS)
• PROC CODEBOOK is a SAS macro that creates a codebook based on a SAS data set
![Page 43: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/43.jpg)
PROC CODEBOOK (SAS)
• Requirements
– Labels on variables and data set
– Formats assigned to categorical values
– Minimum of 1 categorical/2 numeric variables
• Optional elements
– Ordering of variables (default is by variable name)
– ODS formatting of title text
![Page 44: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/44.jpg)
PROC CODEBOOK (SAS)
• Can be useful when dealing with data sets that include SAS formats
• If data set does not already have formats applied, may take as much time to add them as to create your own codebook (which has more flexibility)
• To download the SAS macro and access documentation, visit http://www.cpc.unc.edu/research/tools/data_analysis/proc_codebook
![Page 45: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/45.jpg)
Documentation Standards
• How can we document the data in a way that helps interested parties find the data?
• Dublin Core
– Includes 15 standard elements
– Intended for describing a wide range of different web-based or physical resources
• Data Documentation Initiative (DDI)
– An international specification for describing data from the social, behavioral, and economic sciences
– Supports the entire research data lifecycle
![Page 46: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/46.jpg)
The Takeaway
• Good documentation is not just a product; it’s an approach
![Page 47: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/47.jpg)
Resources
• Inter-university Consortium for Political and Social Research (ICPSR)
– Guide to Social Science Data Preparation and Archiving
• Cornell Research Data Management Service Group
– Guide to writing "readme" style metadata
• Duke University Libraries
![Page 48: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/48.jpg)
Questions?
• Ask away!
• If you would like to talk more about documentation for your own projects, contact us at [email protected].
• Thanks for coming!
![Page 49: Elements of Data Documentation](https://reader033.vdocument.in/reader033/viewer/2022042723/58778ce11a28ab0f778b4811/html5/thumbnails/49.jpg)
Acknowledgements
For their help in putting together this workshop:
• Lorrie Schmid
• Chandler Thomas
And for helping keep you interested in the material:
• Darth Vader
• Success Kid
• Mark Wahlberg (and @ResearchMark)