Data Management 101
DESCRIPTION
Learn the basics of managing your research well, covering the topics of: file organization and naming, documentation, storage and backups, and future file usability.
TRANSCRIPT
![Page 1: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/1.jpg)
Do You Still Have Your Data?
• What if your hard drive crashes?
• What if you are accused of fraud?
• What if your collaborator abruptly quits?
• What if the building burns down?
• What if you need to use your old data?
• What if your backup fails?
• What if your computer gets stolen?
• What if…
![Page 2: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/2.jpg)
Data Management 101
14 November 2014
Kristin Briney, PhD
![Page 3: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/3.jpg)
Data Management Basics
• Introduction to a few topics in data management
– File organization and naming
– Documentation
– Storage and backups
– Future file usability
![Page 4: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/4.jpg)
For each minute of planning at the beginning of a project, you will save 10 minutes of headache later
![Page 5: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/5.jpg)
FILE ORGANIZATION & NAMING
Dan Zen, http://www.flickr.com/photos/danzen/5551831155/ (CC BY)
![Page 6: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/6.jpg)
File Organization
• What?
– Keeping your files in order
![Page 7: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/7.jpg)
File Organization
• Why?
– Easier to find and use data
– Tell, at a glance, what is done and what you have yet to do
– Can still find and use files in the future
![Page 8: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/8.jpg)
File Organization
• When?
– Always!
– Get in the habit of putting files in the right place
![Page 9: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/9.jpg)
File Organization
• How?
– Any system is better than none
– Make your system logical for your data
• 80/20 Rule
– Possibilities
• By project
• By analysis type
• By date
• …
![Page 10: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/10.jpg)
Example
• Thesis
– By chapter
• By file type (draft, figure, table, etc.)
• Data
– By researcher
• By analysis type
– By date
![Page 11: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/11.jpg)
File Naming Conventions
• What?
– Consistent naming for files
![Page 12: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/12.jpg)
http://retractionwatch.com/2014/01/07/doing-the-right-thing-authors-retract-brain-paper-with-systematic-human-error-in-coding/
![Page 13: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/13.jpg)
File Naming Conventions
• Why?
– Make it easier to find files
– Avoid duplicates
– Make it easier to wrap up a project because you know which files belong to it
![Page 14: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/14.jpg)
File Naming Conventions
• When?
– For a group of related files (3 to 1000+)
– May need different conventions for different groups
![Page 15: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/15.jpg)
File Naming Conventions
• How?
– Pick what is most important for your name
• Date
• Site
• Analysis
• Sample
• Short description
![Page 16: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/16.jpg)
File Naming Conventions
• How?
– Files should be named consistently
– File names should be descriptive but short (<25 characters)
– Use underscores instead of spaces
– Avoid these characters: “ / \ : * ? ‘ < > [ ] & $
– Use the dating convention: YYYY-MM-DD (or YYYYMMDD)
![Page 17: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/17.jpg)
Example
• YYYYMMDD_site_sampleNum
– 20140422_PikeLake_03
– 20140424_EastLake_12
• Analysis-sample-concentration
– UVVis-stilbene-10mM
– IR-benzene-pure
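A minimal Python sketch of the first convention above, assuming straight quote characters stand in for the quotes on the "avoid" list. The function names `make_name` and `check_name` are hypothetical, chosen for illustration only.

```python
from datetime import date

# Characters the slides say to avoid, plus the space (use underscores instead).
FORBIDDEN = set('"/\\:*?\'<>[]&$ ')
MAX_LENGTH = 25  # the slides suggest names shorter than 25 characters


def make_name(day: date, site: str, sample_num: int) -> str:
    """Build a name following the YYYYMMDD_site_sampleNum pattern."""
    return f"{day:%Y%m%d}_{site}_{sample_num:02d}"


def check_name(name: str) -> list[str]:
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if len(name) >= MAX_LENGTH:
        problems.append("too long")
    bad = sorted(FORBIDDEN.intersection(name))
    if bad:
        problems.append("forbidden characters: " + ", ".join(repr(c) for c in bad))
    return problems


print(make_name(date(2014, 4, 22), "PikeLake", 3))  # 20140422_PikeLake_03
print(check_name("my data: final?.csv"))
```

Running a check like this over a whole folder is one way to catch names that drift from the convention.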
![Page 18: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/18.jpg)
DOCUMENTATION
Brady, https://www.flickr.com/photos/freddyfromutah/4424199420 (CC BY)
![Page 19: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/19.jpg)
What would someone unfamiliar with your data need in order to find, evaluate, understand, and reuse them?
![Page 20: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/20.jpg)
Documentation
• Why?
– Data without notes are unusable
– Because you won’t remember everything
– For others who may need to use your files
![Page 21: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/21.jpg)
Documentation
• When?
– Always
– Documentation needs will vary between files
![Page 22: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/22.jpg)
Documentation
• How?
– Take good notes
– Metadata schemas
• http://www.dcc.ac.uk/resources/metadata-standards
![Page 23: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/23.jpg)
Documentation
• How?
– Methods
• Protocols
• Code
• Survey
• Codebook
• Data dictionary
• Anything that lets someone reproduce your results
![Page 24: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/24.jpg)
Documentation
• How?
– Templates
• Like structured metadata but easier
• Decide on a list of information before you collect data
– Make sure you record all necessary details
– Takes a few minutes upfront, easy to use later
• Print and post in prominent place or use as worksheet
![Page 25: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/25.jpg)
Example
• I need to collect:
– Date
– Experiment
– Scan number
– Powers
– Wavelengths
– Concentration (or sample weight)
– Calibration factors, like timing and beam size
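The same list can be kept as a small digital template. The sketch below is one possible approach in Python: the key names are hypothetical stand-ins for the fields on the slide, and the sample record holds placeholder values, not real data.

```python
# Fields taken from the example slide; the key names are hypothetical.
REQUIRED_FIELDS = [
    "date",
    "experiment",
    "scan_number",
    "powers",
    "wavelengths",
    "concentration",
    "calibration_factors",
]


def missing_fields(record: dict) -> list[str]:
    """Return the template fields that are absent or left blank."""
    return [
        field
        for field in REQUIRED_FIELDS
        if record.get(field) in (None, "")
    ]


# Placeholder values for illustration only.
record = {
    "date": "2014-04-22",
    "experiment": "example run",
    "scan_number": 3,
    "wavelengths": "",
}
print(missing_fields(record))
```

Checking each record before leaving the lab makes it harder to forget a detail that cannot be recovered later.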
![Page 26: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/26.jpg)
Documentation
• How?
– README.txt
• For digital information, address the questions
– “What the heck am I looking at?”
– “Where do I find X?”
• Use for project description in main folder
• Use to document conventions
• Use wherever you need extra clarity
![Page 27: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/27.jpg)
Example
• Project-wide README.txt
– Basic project information
• Title
• Contributors
• Grant info
• etc.
– Contact information for at least one person
– All locations where data live, including backups
![Page 28: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/28.jpg)
Example
“Talk_v1: rough outline of talk
Talk_v2: draft of talk
Talk_v3: updated 2014-01-15 after feedback”
“ ‘Data’ folder contains all raw data files by date
‘Analysis’ has analyzed data and plots
‘Paper’ has drafts of article on this work”
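A project-wide README can also be generated from a few basic facts. This is a rough Python sketch, not a required layout: the function `build_readme` and every value passed to it are hypothetical placeholders.

```python
def build_readme(title, contributors, grant, contact, data_locations):
    """Assemble the text of a project-wide README from basic project facts."""
    lines = [
        f"Title: {title}",
        f"Contributors: {', '.join(contributors)}",
        f"Grant info: {grant}",
        f"Contact: {contact}",
        "",
        "Data locations (including backups):",
    ]
    lines += [f"  - {location}" for location in data_locations]
    return "\n".join(lines) + "\n"


# All values below are placeholders for illustration.
text = build_readme(
    title="Example project",
    contributors=["Researcher A", "Researcher B"],
    grant="(grant number here)",
    contact="researcher@example.org",
    data_locations=["lab computer", "shared drive", "cloud backup"],
)
print(text)
```

Saving the returned text as README.txt in the main project folder covers the items listed on the project-wide README slide.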
![Page 29: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/29.jpg)
grover_net, http://www.flickr.com/photos/9246159@N06/599820538/ (CC BY-ND)
STORAGE AND BACKUPS
![Page 30: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/30.jpg)
Storage
• Why?
– Need good storage practices to prevent loss
– Keep data secure
![Page 31: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/31.jpg)
Storage
• How?
– Library motto: Lots of Copies Keeps Stuff Safe!
– Rule of 3: 2 onsite, 1 offsite
![Page 32: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/32.jpg)
Storage
• How?
– Computer
– External hard drive
– Shared drives/servers
– Tape backup
– Cloud storage*
– CDs/DVDs
– USB flash drive
Erica Wheelan, https://www.flickr.com/photos/reinventedwheel/5985479866 (CC BY)
![Page 33: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/33.jpg)
*Cloud Storage
• Read the Terms of Service!
• E.g., Google Drive
– “When you upload or otherwise submit content to our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving our Services, and to develop new ones”
![Page 34: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/34.jpg)
Backups
http://toystory.disney.com/
![Page 35: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/35.jpg)
Backups
• How?
– Any backup is better than none
– Automatic backup is better than manual
– Your work is only as safe as your backup plan
![Page 36: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/36.jpg)
Backups
• How?
– Check your backups
• Backups are only as good as your ability to recover data
• Test your backups periodically
– Preferably on a fixed schedule
– 1 or 2 times a year may be enough
– Bigger/more complex backups should be checked more often
• Test your backup whenever you change things
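One way to check a backup is to compare checksums of the original files against the copies. The Python sketch below assumes both folders are reachable as ordinary directories; the function names are hypothetical. It confirms that the copies exist and match, which is a useful check but not a substitute for a full restore test.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large data files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_backup(original_dir: Path, backup_dir: Path) -> list[str]:
    """List files that are missing from the backup or differ from the original."""
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source) != sha256_of(copy):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty result from `compare_backup` means every original file has an identical copy in the backup folder.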
![Page 37: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/37.jpg)
Example
• I keep my data
– On my computer
– Backed up manually on shared drive
• I set a weekly reminder to do this
– Backed up automatically via SpiderOak cloud storage
![Page 38: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/38.jpg)
FUTURE FILE USABILITY
Ian, http://www.flickr.com/photos/ian-s/2152798588/ (CC BY-NC-ND)
![Page 39: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/39.jpg)
Future File Usability
• What?
– Can you read your files from 10 years ago?
– Data needs to be
• Accessible
• Interpretable
• Readable
![Page 40: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/40.jpg)
lukasbenc, https://www.flickr.com/photos/lukasbenc/3493808772 (CC BY-NC-SA)
![Page 41: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/41.jpg)
Future File Usability
• Why?
– You may want to use the data in 5 years
– PI sometimes keeps data and notes
– Prep for data sharing
– Per OMB Circular A-110, must retain data at least 3 years post-project
• Better to retain for >6 years
![Page 42: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/42.jpg)
Future File Usability
• When?
– When you wrap up a project
– (As you work on a project)
![Page 43: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/43.jpg)
Future File Usability
• How?
– Back up written notes
• People always forget this one
• Difficult to interpret data without notes
• Options
– Digitally scan (recommended with digital data)
– Photocopies
![Page 44: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/44.jpg)
Future File Usability
• How?
– Convert file formats
• Can you open digital files from 10 years ago?
• Use open, non-proprietary formats that are in wide use
– .docx → .txt
– .xlsx → .csv
– .jpg → .tif
• Save a copy in the old format, just in case
• Preserve software if no open file format
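A first step toward converting formats is finding which files need it. The Python sketch below reads the extension pairs on this slide as "convert the first to the second", which is an assumption, and the function name is hypothetical. It only lists candidates; the conversion itself depends on the software that created each file.

```python
from pathlib import Path

# Pairs taken from the slide; extend for your own file types.
OPEN_ALTERNATIVES = {
    ".docx": ".txt",
    ".xlsx": ".csv",
    ".jpg": ".tif",
}


def conversion_candidates(folder: Path) -> list[tuple[str, str]]:
    """Pair each file in a listed format with the suggested open-format name."""
    candidates = []
    for path in sorted(folder.rglob("*")):
        target_suffix = OPEN_ALTERNATIVES.get(path.suffix.lower())
        if path.is_file() and target_suffix:
            relative = path.relative_to(folder)
            candidates.append(
                (relative.as_posix(), relative.with_suffix(target_suffix).as_posix())
            )
    return candidates
```

Keep the original file alongside the converted copy, as the slide advises.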
![Page 45: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/45.jpg)
Future File Usability
• How?
– Move to new media
• Hardware dies and becomes obsolete
– Floppy disks!
• Expect average lifetime to be 3-5 years
• Keep up with technology
![Page 46: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/46.jpg)
WHERE TO GO FROM HERE
![Page 47: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/47.jpg)
easylocum, https://www.flickr.com/photos/easylocum/2921542814 (CC BY)
![Page 48: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/48.jpg)
Chris Hoving, https://www.flickr.com/photos/pcrucifer/2433274595 (CC BY-ND)
![Page 49: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/49.jpg)
Resources
• Data Services
– http://uwm.edu/libraries/dataservices/
• Data Management Guide
– http://guides.library.uwm.edu/data
• Data Services Librarian
![Page 50: Data Management 101](https://reader034.vdocument.in/reader034/viewer/2022042700/559b053f1a28ab4e638b45d4/html5/thumbnails/50.jpg)
Thank You!
• This presentation available under a Creative Commons Attribution (CC-BY) license
• Some content courtesy of Dorothea Salo
– http://www.graduateschool.uwm.edu/research/researcher-central/proposal-development/data-plan/boot-camp/ (CC BY)