practical best practices for data management
TRANSCRIPT
![Page 1: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/1.jpg)
Practical Best Practices for Data Management
Brianna Marshall | @notsosternlib UW Digital Humanities Research Network | 12.2.2014
![Page 2: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/2.jpg)
my data (family history photos & documents)
![Page 3: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/3.jpg)
Time to ponder
Can you still access your data from… – 20 years ago? – 10 years ago? – 5 years ago? – 1 year ago?
Let’s talk about the data you’ve kept and lost.
![Page 4: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/4.jpg)
Horror stories
Data that is… – Lost (disorganized) – Unreadable – Without context – Gone (deleted)
![Page 5: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/5.jpg)
Data management basics File organization • Is your data organized meaningfully or jumbled together? Do you
know where your data is? Documentation • How much contextual information accompanies your data? Can you
understand it? Can a stranger understand it?
Storage & backup • Where is your data stored and backed up? Could you recover from
hardware failure or accidental deletion? Media obsolescence • Do you know how the software, hardware, and file formats you use
will impact your data’s readability in the future?
![Page 6: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/6.jpg)
1. Organization
• Create a system at the beginning of the project.
• Make sure the entire team is on board. • The more collaborators, the more
important your system becomes! • Any system is better than none.
![Page 7: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/7.jpg)
File naming conventions • Use them any time you have related files • Consistent • Short yet descriptive • Avoid spaces and special characters
example: File001.xls vs. Project_instrument_location_YYYYMMDD.xls
![Page 8: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/8.jpg)
Directory/folder organization • Lots of possibilities, so consider what makes
sense for your project – File type – Date – Type of analysis
example: MyDocuments\Research\Sample12.tiff vs.
C:\\NSFGrant01234\WaterQuality\Images\LakeMendota_20141030.tiff
![Page 9: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/9.jpg)
Retroactive organization
• Do a data inventory. List all the places where your data lives (both physical and digital)
• Make a plan for consolidating – follow the rule of 3, not the rule of 17
![Page 10: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/10.jpg)
2. Documentation
Coded SPSS survey responses (Useless without the original questionnaires)
![Page 11: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/11.jpg)
Document on many levels Project- & folder-level
– Create a readme file. (Good example located here: http://hdl.handle.net/2022/17155)
– Document any data processing and analyses. – Don’t forget written notes.
Item-level – Remember the importance of file names for conveying
descriptive information. – Find and adhere to disciplinary metadata standards
• TEI (XML) • Dublin Core
![Page 12: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/12.jpg)
3. Storage & backup storage = working files. The files you access regularly and change frequently. In general, losing your storage means losing current versions of the data. backup = regular process of copying data separate from storage. You don’t really need it until you lose data, but when you need to restore a file it will be the most important process you have in place.
![Page 13: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/13.jpg)
Rule of 3 • Keep THREE copies of your data
– TWO onsite – ONE offsite
• Example – One: Network drive – Two: External hard drive – Three: Cloud storage
• This ensures that your storage and backup is not all in the same place – that’s too risky!
![Page 14: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/14.jpg)
Evaluating cloud services • Lots of options out there – and not all are
created equal • Read the Terms of Service! • Servers get hacked all the time. Whatever you’re
storing, you don’t want your provider to have access to it.
• Data encryption is your friend.
![Page 15: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/15.jpg)
4. Media obsolescence • software • hardware • file formats
CC image by Flickr user wlef70
![Page 16: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/16.jpg)
Thwarting obsolescence • You can’t. • Today’s popular software can become
obsolete through business deals, new versions, or a gradual decline in user base. (Consider WordPerfect.)
• Anticipate average lifespan of media to be 3-5 years. Migrate your files every few years, if not more frequently!
![Page 17: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/17.jpg)
Thwarting obsolescence • Some file formats are less susceptible to
obsolescence than others – Open, non-proprietary formats (pick TXT
over DOCX, CSV over XSLX, TIF over JPG)
– Wide adoption – History of backward compatibility – Metadata support in open format (XML)
![Page 18: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/18.jpg)
Back to (data management) basics
File organization • Is your data organized meaningfully or jumbled together? Do you
know where your data is? Documentation • How much contextual information accompanies your data? Can you
understand it? Can a stranger understand it?
Storage & backup • Where is your data stored and backed up? Could you recover from
hardware failure or accidental deletion? Media obsolescence • Do you know how the software, hardware, and file formats you use
will impact your data’s readability in the future?
![Page 19: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/19.jpg)
Federal funding requirements
Data management plans (DMPs) are required by all federal funding agencies. 2013 OSTP mandate:
• Public access to data and publications. • Individual agencies create their own
requirements. • Goal is to make publically-funded research
reproducible.
![Page 20: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/20.jpg)
NEH Office of Digital Humanities DMP
2 page document that answers:
• What data are generated by your research? • What is your plan for managing the data?
NEH will also release requirements for public data access soon.
![Page 21: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/21.jpg)
http://researchdata.wisc.edu/make-a-plan/dmptool/
![Page 22: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/22.jpg)
My suggestion?
Grant or not, start new projects with a data management plan compiled by project leaders. The DMP should cover:
• Organization & naming • Documentation & metadata • Storage & sharing • Any and all other pertinent details. (The more the
better; it’ll save you headaches later.)
The DMP should be actively revisited and adapted as needed throughout the project.
![Page 23: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/23.jpg)
Final thoughts
• Think about how your data organization, documentation, and storage impacts your ability to access your data years from now.
• If organizing retroactively, prioritize your most important research.
• Any plan is better than no plan at all. Start today. Ask for help.
![Page 24: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/24.jpg)
Get in touch
Brianna Marshall Digital Curation Coordinator General Library System Lead, Research Data Services [email protected]
![Page 25: Practical Best Practices for Data Management](https://reader031.vdocument.in/reader031/viewer/2022032504/55c3ca86bb61eb145d8b458d/html5/thumbnails/25.jpg)
Thank you! Adapt this presentation as needed! Creative Commons Attribution: Some content adapted from the wise Kristin Briney. Find all RDS slides at: www.slideshare.net/UW_RDS/