acrl march2015 final
Roles for Libraries in Providing
Research Data Management
Services
Nicole Vasilevsky, Oregon Health & Science University
Victoria Mitchell, University of Oregon
Jeremy Kenyon, University of Idaho
Nicole Vasilevsky: Project Manager, Biocurator and Ontologist, Ontology Development Group, OHSU
Victoria Mitchell: Social Science Data & Government Documents Librarian, University of Oregon
Jeremy Kenyon: Research Librarian, University of Idaho Library
Why?
Researcher Perspective
Version control
Track processes for reproducibility
Quality Control
Stay Organized
Save Time and Stress
Avoid Data Loss
Format data for reuse (by self, team, or others)
Document for own recollection, accountability, reuse
Funding mandates
http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble
Reproducibility
Why?
Funding mandates
The UO Environment
• No campus-wide research data policy
• Library leading on research data management and preservation
• Collaborating with campus IT, Research Services
The UO Environment
• Digital Scholarship Center
  • Open Access Publishing
  • Digital Collections
  • Institutional Repository
  • Interactive Media Development
• Data Services
  • Science Data Services Librarian
  • Social Science Data Librarian
Education
• Workshops
• Presentations in classes and new faculty orientations
• 1-credit course in research data management for grad students
Graduate Seminar in Data Management
• 2 iterations so far
• 1st: Spring 2013 – 1 credit course, LIB 407/507
• Made it available to upper-division undergrads; none signed up
• 2nd: Spring 2014 – 1 credit course, LIB 607
Graduate Seminar in Data Management
Course built around the creation of a DMP for a funding agency
• Students registering for the course were strongly encouraged to have a research project already in mind or underway
• Also used, in part and with modification, the education modules created by DataONE
• Natural disaster
• Facilities infrastructure failure
• Storage failure
• Server hardware/software failure
• Application software failure
• External dependencies (e.g. PKI failure)
• Format obsolescence
• Legal encumbrance
• Human error
• Malicious attack by human or automated agents
• Loss of staffing competencies
• Loss of institutional commitment
• Loss of financial stability
• Changes in user expectations and requirements
Data Loss
CC image by Sharyn Morrow on Flickr
CC image by momboleum on Flickr
Slide adapted from DataONE Education Module: Why Data Management. DataONE. Retrieved March 21, 2013.
Spreadsheet for Help with Organizing
Research Project: [Name of research project]
Name: [Your name]
Dates: [when you'll be conducting your research, e.g. 7/14-1/15]
Project Data Folder: [e.g. dissertation_coldfusion_data]
Columns:
• Research Process/Method/Data Source
• Collection Dates
• Storage Format: Original Format, Working Format, Access Format, Preservation Format(s)
• File Naming Convention
• Folder Convention
• Versioning Strategy
• Storage Location
• Who can help?
• Access restrictions?
• Who needs access?
• Software / Tools Required
• Metadata Schema
• Notes
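A minimal sketch of how a planning sheet like the one above could be generated as a CSV file for each student. The column order, the treatment of "Storage Format" as a group of four format columns, and the names `COLUMNS` and `build_template` are assumptions for illustration, not part of the course materials.

```python
import csv
import io

# Column headings read from the slide. Treating "Storage Format" as a group
# label for the four format columns below is an assumption.
COLUMNS = [
    "Research Process/Method/Data Source",
    "Collection Dates",
    "Original Format",
    "Working Format",
    "Access Format",
    "Preservation Format(s)",
    "File Naming Convention",
    "Folder Convention",
    "Versioning Strategy",
    "Storage Location",
    "Who can help?",
    "Access restrictions?",
    "Who needs access?",
    "Software / Tools Required",
    "Metadata Schema",
    "Notes",
]


def build_template(project, name, dates, data_folder):
    """Return the planning sheet as CSV text: header block, blank row, column row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Research Project", project])
    writer.writerow(["Name", name])
    writer.writerow(["Dates", dates])
    writer.writerow(["Project Data Folder", data_folder])
    writer.writerow([])  # blank separator row
    writer.writerow(COLUMNS)
    return buffer.getvalue()


# Placeholder values copied from the slide's own examples.
template = build_template(
    project="[Name of research project]",
    name="[Your name]",
    dates="7/14-1/15",
    data_folder="dissertation_coldfusion_data",
)
print(template)
```

Each row a student adds below the column row would then describe one research process or data source.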
LIB 607 v.3
• Changed to Data Management for the Social Sciences (and Digital Humanities)
• Less emphasis on DMP per funder requirements
• More time to address issues specific to the social sciences and humanities
Research Data Services @ the University of Idaho Library
Credit: University of Idaho Creative Services
University of Idaho Characteristics:
• Public, comprehensive, land-grant university
• Strong emphasis on agriculture, environmental science, engineering
• Recent emphasis on developing research data and research cyberinfrastructure, including library research data services, INSIDE Idaho, the geospatial data repository, and NKN, a multi-disciplinary institutional data repository
Research Data Services at the U-Idaho Library
• Appointments & Consultations
• Northwest Knowledge Network (institutional data repository)
• Embedded Services (buy-outs of librarian time)
• Tool & Technology Support: IQ-Station, ESRI Products, DMPTool, metadata editors
• Website: Data Management Best Practices Guide
• Instruction & Workshops
Many modes of service
Raise awareness of research data management & our services
Create a culture of documentation
Transform thinking across disciplines about data distribution & publishing
Focus: creating a culture of documentation
FISH502 “One-shot” Instruction Session
- Class participants: fisheries biology and statistics graduate students
- Exercise: 1) review the following spreadsheet; 2) identify the information needed to re-use this dataset
Focus: creating a culture of documentation
Research consultation: environmental modelling
Post-doc from a multi-institutional project was primary contact for several teams
Metadata consultation took place towards the end of the project
Producing 6 discrete collections of data as netCDF (format required by funder)
Repository required ISO 19115 XML metadata for describing whole collections
Focus: creating a culture of documentation
Challenges:
Understanding the standard: Attribute Conventions for Dataset Discovery, ISO 19115-2, codelists and controlled vocabularies
Rules for free-text fields: what does a good title look like?
Placement of content: should variables be listed in keywords, title, or description?
Responsibilities: who should create XML files, the researcher or us?
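One way to make the "understanding the standard" challenge concrete is a quick completeness check on a collection's global attributes. The sketch below assumes the four attributes that ACDD 1.3 lists as highly recommended (`title`, `summary`, `keywords`, `Conventions`); the function name and the example values are hypothetical and do not come from the consultation described here.

```python
# Global attributes that ACDD 1.3 marks as "highly recommended".
HIGHLY_RECOMMENDED = ("title", "summary", "keywords", "Conventions")


def missing_attributes(global_attrs):
    """Return the highly recommended attributes that are absent or blank."""
    return [
        name
        for name in HIGHLY_RECOMMENDED
        if not str(global_attrs.get(name, "")).strip()
    ]


# Hypothetical attributes for one collection; values are invented for illustration.
example_attrs = {
    "title": "Daily modelled soil moisture, example basin, 1980-2010",
    "summary": "",
    "Conventions": "CF-1.6, ACDD-1.3",
}

print(missing_attributes(example_attrs))
```

A check like this only confirms that a field is filled in; whether a title or summary is actually good remains a judgment for the researcher and the librarian.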
Focus: creating a culture of documentation
Re-use and comprehension of data requires good documentation
Researchers often have idiosyncratic and localized, i.e. customized, documentation practices
Content standards are often not well-known among researchers
Disciplinary content standards are necessary for enabling advanced modes of data access
Library services must emphasize documentation
Future Directions
Fienberg, S.E. et al. (1985). Sharing Research Data. Washington, D.C: National Academies Press.
http://www.nap.edu/catalog/2033/sharing-research-data
What would you do with $1k today to make research communication better that doesn’t involve building another tool?
Your Data: Gummy Bear Raw Data
Bounces Amplitude Color
15 4 blue
43 3 red
58 9 green
75 82 purple
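A minimal sketch of how the raw table above could be turned into the plotted values, assuming springiness is bounces divided by amplitude (the definition the group methods later in this deck use). The names `RAW_DATA` and `springiness` are hypothetical.

```python
# Raw data from the slide: (bounces, amplitude, color).
RAW_DATA = [
    (15, 4, "blue"),
    (43, 3, "red"),
    (58, 9, "green"),
    (75, 82, "purple"),
]


def springiness(bounces, amplitude):
    """Springiness as bounces divided by amplitude (assumed definition)."""
    return bounces / amplitude


# Map each bear color to its computed springiness.
results = {color: springiness(b, a) for b, a, color in RAW_DATA}

for color, value in results.items():
    print(f"{color}: {value:.2f}")
```

Writing the calculation down this way records exactly which columns were combined, which is part of what the exercise asks participants to document.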
Materials:
• Haribo Gummi Bears, Sugar Free, 5 lb bag, Amazon.com (UPC: 422384500110)
• SpringOMatic 3000 (ICanPickleThat, Portland, OR)
http://laughingsquid.com/the-anatomy-of-a-gummy-bear-by-jason-freeny/
Figure 1. A) Gummy skeleton with belly button annotated with red arrow B) Springiness by sample color.
Methods Section: Haribo Gummi Bears (Sugar Free) were purchased from Amazon.com (UPC: 422384500110). Gummy bears were placed in the SpringOMatic 3000 (ICanPickleThat, Portland, OR) according to the manufacturer's instructions. The Gummy Anatomy (Jason Freeny) image was cropped in PPT (Microsoft) and annotated to highlight the belly button.
Gummy Bear Final Figure
[Figure: panel A, annotated gummy bear image; panel B, bar chart of Springiness (bounces/length) by Sample Color (blue, red, green, purple); y-axis 0 to 16]
• Figure legends/metadata
• Manipulating images
• Attribution
• Metadata about research resources
Group 1: Gummy Bear Final Data
[Chart: Springiness (Bounces/Amplitude) by color (blue, red, green, purple); y-axis 0 to 16; data labels 4, 3, 9, 82 and 15, 43, 58, 75]
15 4 blue
43 3 red
58 9 green
75 82 purple
Methods: A schematic of a Gummi Bear was cropped to indicate where the belly button is located (Fig. 1). At this point, raw experimental data showing the bounce, amplitude, and color were analyzed and the springiness calculated for each color of bear. This was accomplished by dividing the bounce by the amplitude and plotting this against bear color.
Fig. 1. Belly button of Haribo Sugar Free Gummi Bear
What is missing?
A. Image manipulation
B. Attribution
C. Figure Legends
D. Metadata about resources
Figure 1. A) Gummy skeleton with belly button annotated with red arrow B) Springiness by sample color.
Methods Section: Haribo Gummi Bears (Sugar Free) were purchased from Amazon.com (UPC: 422384500110). Gummy bears were placed in the SpringOMatic 3000 (ICanPickleThat, Portland, OR) according to the manufacturer's instructions.
Group 2: Gummy Bear Final Data
[Figure: panels A and B; panel B is a bar chart of Springiness (bounces/length) by Sample Color (blue, red, green, purple); y-axis 0 to 16]
What is missing?
A. Image manipulation
B. Attribution
C. Figure Legends
D. Metadata about resources
Figure 2: Schematic depiction of Haribo Gummi Bear umbilical skeletal anatomy.
Methods & Materials: Gummi Bears were obtained through Amazon in 3 kg bags. Lot and temperature during transport data were not made available. Bears were housed in a plastic bowl in accordance with IACUC policy and national standards for gummi bear care. They were housed at room temperature on a natural light cycle.
Food and water were provided ad libitum (consumption was not monitored)
Each bear was sampled only once to reduce costs
Group 3: Gummy Bear Final Data
What is missing?
A. Image manipulation
B. Attribution
C. Figure Legends
D. Metadata about resources
[Figure: (a) gummy bear schematic labeled "Belly Button"; (b) bar chart of Springiness (bounces/amplitude) by Gummy Bear Color (blue, red, green, purple); y-axis 0.00 to 16.00]
Fig. 1. (a) schematic of the anatomy of a gummy bear (adapted from 1). (b) springiness of bear by color using spring-o-matic.
Methods: Insert the sample of interest, specifically a colored gummy bear (Haribo, Japan). Position the probe above the sample. Press "Tickle" and the SpringOMatic (ICanPickleThat, Portland) will poke the belly button a standard depth of 1 cm. Record the number of bounces and the amplitude of the largest bounce in cm. From these values, the springiness can be calculated (bounce/amplitude).
What is missing?
A. Image manipulation
B. Attribution
C. Figure Legends
D. Metadata about resources
Group 4: Gummy Bear Final Data
GUMMY BEARS TAUGHT US…
• People see the same data very differently
• “Detailed” means different things…
• Metadata?!?
• File management is difficult
• Workflow
Vasilevsky N, Wirz J, Champieux R, Hannon T, Laraway B, Banerjee K, Shaffer C, and Haendel M. (2014). Lions, Tigers, and Gummi Bears: Springing Towards Effective Engagement with Research Data Management. Scholar Archive. Paper 3571.
Researchers DO need assistance:
• Finding and choosing data standards
• File versioning
• Applying metadata to facilitate data sharing
“Gummi Bear” themed data management exercise resonated well with students
Lack of awareness of services and expertise offered by the Library
Conclusions
OHSU New Directions
OHSU Library is developing data services for researchers
BD2K educational grants in collaboration with DMICE
www.ohsu.edu/xd/education/library/data
Acknowledgements
OHSU: Melissa Haendel, Robin Champieux, Jackie Wirz, Kyle Banerjee, Bryan Laraway, Chris Shaffer
Kaiser: Todd Hannon
UO: Brian Westra, Karen Estlund, Cathy Flynn-Purvis, John Russell
Idaho: Bruce Godfrey, Nancy Sprague, Lynn Baird, Greg Gollberg, Luke Sheneman, Steven Daley-Laursen
Thank you
Contact us
Nicole Vasilevsky: @N_Vasilevsky
Victoria Mitchell: @VictoriaStap
Jeremy Kenyon: @jr_kenyon