Vision for an academic research library as partner in campus-wide data management as it contributes to a preeminent institution
Plato L. Smith II, CLIR/DLF Postdoc Fellow at UNM University of Florida Libraries
August 25, 2015
As We May Think
“A record [data/database] if it is to be useful to science, must be continuously extended, it must be stored, and above all it must be consulted.” – Vannevar Bush, 1945
“The process by which data is captured and maintained continues to evolve and mature as scientific needs change.” – DAF Interview P1 Participant (2013)
• How can an Academic Research Library (ARL) make people, research, and data management services better?
Table of Contents
1. An Academic Research Library Perspective
2. A Data Asset Framework Use Case
3. Academic Research Library as Broker
4. An Organizational Approach – UF
5. Address Other RDM Challenges
6. Build Collaboration, Engagement, & Support
An Academic Research Library Perspective
1. CCSDS OAIS Reference Model (2002) – ISO 14721:2003
2. Levels 1 – 3 curation (2003)
3. Digital Curation Centre – DCC (2004)
4. DCC Curation Lifecycle Model (2007)
5. NSF DMP Requirement (2011)
6. JISC Research Lifecycle Model (2013)
7. OSTP Memo (2013)
8. NSF Public Access Plan (2015)
Map Research Data Life Cycle to Domains via UF RDMS
Source: UF Libraries Research Data Management Support (RDMS)
A Data Asset Framework Use Case
• Data Asset Framework (DAF) Methodology
• Mixed Methods – surveys and interviews
• Data Assessment (Environmental Scan)
• Gap Analysis
• Multiple Research Labs
Diagram labels: DAF, Data Assets, Types, Sources
The DAF was developed in 2009 by the Humanities Advanced Technology and Information Institute (HATII), University of Glasgow, in conjunction with the DCC via JISC support.
A Data Asset Framework Use Case
Research Labs/Centers @FSU
• Labs/Centers – 6/58 (10%)
1. Center for Ocean-Atmospheric Prediction Studies (COAPS)
2. National High Magnetic Field Laboratory (NHMFL)
3. Marine Coastal Laboratory
4. Antarctic Marine Geology Research Facility (AMGRF)
5. Center for Advanced Power Systems (CAPS)
6. Geophysical Fluid Dynamics Institute (GFDI)
• Interdisciplinary
• Multidisciplinary
Scientists/Faculty Participation
• Direct email to Directors
• Distributed to domain-specific listservs (Purposive Sampling)
• Responses and Completion
– Surveys – 107/129 (83%)
– Interviews – 6/7 (86%)
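The participation percentages on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the `rate` helper and the dictionary keys are hypothetical names, and the interview count is read as 6 of 7, which is the reading consistent with the stated 86%.

```python
def rate(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest whole number."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)

# Counts taken from the slide; the interview count is assumed to be 6 of 7.
participation = {
    "labs_centers": (6, 58),
    "surveys": (107, 129),
    "interviews": (6, 7),
}

rates = {name: rate(part, whole) for name, (part, whole) in participation.items()}
print(rates)  # {'labs_centers': 10, 'surveys': 83, 'interviews': 86}
```

Rounding to whole percentages reproduces the 10%, 83%, and 86% figures reported on the slide.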
A Data Asset Framework Use Case
Chart: What is your primary research role?
Categories: Senior Researcher, Principal Investigator, Research Assistant, Research Technician, Research Support, Research Student, Other
Values (as listed): 23, 29, 26, 3, 3, 10, 7
A Data Asset Framework Use Case
RDM Responsibility
Chart: Who is responsible for managing your research data (select all that apply)?
Categories: Project manager, Research assistant, Research groups, National data center, You, Other
Values (as listed): 20, 17, 16, 8, 70, 6
A Data Asset Framework Use Case
Primary Data Type
Chart: What is the data type of your primary data?
Values (as listed, category labels not shown): 3, 48, 58, 74, 42, 26, 2
A Data Asset Framework Use Case
Secondary Data Type
Chart: What is the data type of your secondary data?
Categories: Audio tapes, Computer software, Data - computer, Data - sensors, Digital audio files, Digital video files, Excel sheets, Fieldwork data, Images/scans/photos, Laboratory notes, MS Access, MS Powerpoint, MS Word, Slides - physical media, SPSS files/statistical, Video tapes, Web sites, Other
Values (as listed): 1, 37, 43, 50, 3, 9, 46, 12, 23, 40, 6, 30, 31, 4, 6, 4, 9, 2
A Data Asset Framework Use Case
RDM Issues
Chart: Which of the following data management issues have you experienced? [Please select all that apply]
Categories: Finding files/folder structure; Locating where data files are stored; Non standard file formats; Legal issues arising from transfer of; Problems establishing ownership of; Finding or accessing research data; Security and protection of files; Other
Values (as listed): 50, 54, 29, 6, 4, 45, 18, 6
A Data Asset Framework Use Case
Chart: Where do you store your data (excluding backup copies)? [Select all that apply]
Categories: CD/DVD; External commercial web data storage; External Hard Disk; Local computer; My documents on research lab PC; Paper/file records; Technology vendor file server; Other provided file server; Other - give details
Values (as listed): 14, 15, 49, 68, 30, 24, 5, 32, 7
A Data Asset Framework Use Case
Chart: What are some barriers for you with regards to managing and storing your research data?
• Budget/funding – 22%
• Infrastructure/resources – 31%
• Stakeholders – 8%
• Storage/technology – 25%
• Other – 14%
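As a quick consistency check on the barrier percentages reported on this slide, the sketch below confirms that the shares sum to 100 and ranks them from largest to smallest. The variable names are hypothetical; the percentages are the ones shown on the slide.

```python
# Barrier shares (percent) as reported on the slide.
barriers = {
    "Budget/funding": 22,
    "Infrastructure/resources": 31,
    "Stakeholders": 8,
    "Storage/technology": 25,
    "Other": 14,
}

# The shares of a single pie chart should add up to 100.
total = sum(barriers.values())

# Rank barriers from most to least frequently reported.
ranked = sorted(barriers.items(), key=lambda item: item[1], reverse=True)

print(total)
for name, share in ranked:
    print(f"{name}: {share}%")
```

Infrastructure/resources and storage/technology together account for 56% of responses, which is consistent with the infrastructure and tools emphasis of the broker role discussed next.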
Academic Research Library as Broker
• Capacities/Facilities
• Infrastructure (HPC)
• Resources & Tools
• Library & Campus-wide stakeholders
• University, Government, Industry
• Research Data Management (RDM)
• Repository (IR@UF)
• Publishing/Sharing
• Data Management Planning (DMP)
• Research Data Lifecycle
• DMP Tools
Diagram labels: Plan, Access, Assets, Support
Academic Research Library as Broker
1. Conduct Data Assessment & Gap Analysis across multiple disciplines, institutes, and centers (e.g. DAF, Evaluation, Monitor & Track Metrics)
2. Articulate and facilitate Federal Data Access Policies (e.g. NSF, OSTP) compliance – education, IM, outreach, training, webinars, workshops
3. Assist faculty with Data Management Planning (DMP) throughout research data lifecycle – DMP Tool, IR@UF, HiPerGator (HPC), FSP FAQ
4. Connect and integrate with diverse communities of practice (e.g. USGS)
5. Document Outcomes, Metrics, & Successes (e.g. varied infographics, IM)
6. Leverage services and services reallocation (e.g. data storage, HPC)
7. Secure library, campus, consortium, and university support (e.g. GUIRR)
(Diagram modeled after Purdue Libraries - used with permission)
(Diagram modeled after Purdue Libraries - used with permission)
Promote cross-cutting DMP/RDM education, outreach, and training synergies
Address Other RDM Challenges
• Earth-Centered Communication for Cyberinfrastructure (EC3) 2015 Field Trip Scenario
• Metadata, Features/Functionality, Architecture, Best Practices, Standards
• Interoperability, Data Collection & Integration, Semantics (e.g. ontology, vocabulary)
• Applications, Web Services (e.g. APIs, W3C, SOAP, RESTful, etc.)
• End-to-End development (e.g. funding beyond prototype/end of funding)
Diagram developed by GIS specialist, Nicole Kong (used with permission)
Address Other RDM Challenges
USGS Community for Data Integration (CDI) Science Support Framework (SSF) – 2015
Address Other RDM Challenges
General and Domain Specific Repositories
• dLOC-UFDC, Dryad, Dataverse, Figshare, HathiTrust, IR@UF
• EarthChem, GenBank, iDigBio, Integrated Earth Data Applications (IEDA)
• arXiv.org, XSEDE, Long Term Ecological Research (LTER), Morphbank, NCBI, NGDC/NOAA, NODC/NOAA, UCAR/NCAR
General and Domain Specific Tools
• DataUp, dataZoa, DCC Tools, IPython Notebook, Visual Understanding Environment (VUE)
• Digital Research Tools (DIRT), import.io, LabArchives, MATLAB, OpenRefine, R, SPSS, Tabula
• FGDC tools, NCBI (APIs, Code Libraries, Data Formats, GitHub repository), PubMed Tools
Author disambiguation and linked data Linking
• ORCID, DOI, EZID, Zenodo
• Impactstory, Open Science Framework (OSF), VIVO
• Linked Open Data (5 star), Ontologies, W3C PROV, RDF, XML
Build Collaboration, Engagement, & Support
• Build and extend existing collaborations and partnerships
• Develop Data Management Use Cases and RDM Scenarios
• Engage UF Preeminence Faculty (e.g. 8 Preeminence areas of focus – 4 CoE, 3 CoLAS, 3 CoM, 1 CoBA, 1 Levin CoL, 1 CoN, 1 CoP, 1 CoPH&HP)
• Engage Communities of Practice - AGU, ARL SHARE, CUAHSI, Dataverse, DataONE, Deep-C, Dryad, EarthCube, ESIP, GoMRI, GreyNet, HASTAC, IDCC, iDigBio, NHMFL, OGC, RDA, USGS
• Develop new Partnerships and Funding Opportunities (e.g. UF Division of Research Program Development, COS, NSF Funding)
Build Collaboration, Engagement, & Support
• Computer and Information Science and Engineering (CISE) Research Initiation Initiative (CRII) – untenured faculty/1st 2yr of academic position after PhD – Solicitation #15-569
• CISE Research Infrastructure (CRI) – Community Infrastructure/enhancement of existing CI-EN – Solicitation #15-590
• Campus Cyberinfrastructure – Data, Networking, and Innovation Program (CC*DNI) – (1) DIBBs (Multi-campus Model) or (2) Data Driven Networking Infrastructure for the Campus Researcher – Solicitation #15-534
• Grant Opportunities for Academic Liaison with Industry (GOALI) – promotes university-industry partnerships/linkages – Solicitation #12-513 (anytime)
• Industry/University Cooperative Research Centers Program (I/UCRC) – develops long-term partnerships among industry, academe, and government – Solicitation #13-594
References
• ACRL. (2015). Information Literacy Competency Standards for Higher Education. Retrieved August 19, 2015 from ACRL ILCS.
• Brandt, D. S. (2015). DLF E-Research Network 2015 Webinar on “Introduction/Research Data Management Services in Academic Libraries” for 2015 DLF E-Research Network Cohort, May 13, 2015.
• CCSDS. (2002). Consultative Committee for Space Data Systems (CCSDS) The OAIS Reference Model. Retrieved August 19, 2015 from OAIS.
• Martin, J. (2002). Cultures in Organizations: Three Perspectives. Oxford University Press.
• NSF. (2015). NSF’s Public Access Plan: Today’s Data, Tomorrow’s Discoveries – Increasing Access to the Results of Research Funded by the National Science Foundation. Retrieved August 19, 2015 from http://www.nsf.gov/pubs/2015/nsf15052/nsf15052.pdf.
• UF Libraries. (2015). Research Data Management Support (RDMS).
• USGS. (2015). USGS Community for Data Integration. The CDI Science Support Framework (SSF). Retrieved August 19, 2015 from http://www.usgs.gov/cdi/about.html.
• USGS. (2015). USGS Fundamental Science Practices (FSP). Retrieved August 19, 2015 from http://www.usgs.gov/fsp/.
Acknowledgements
1. UF Libraries Data Management Librarian Search Committee
2. Brian Keith, Hannah Norton, Laurie Taylor, Tina Marie Litchfield
3. FSU School of Information (Florida's iSchool)
4. Dr. Paul Marty and Dr. A.K.S.K. Prasad (FSU)
5. CLIR/DLF Postdoctoral Program
6. University of New Mexico Libraries
7. Dr. Karl Benedict (UNM)
8. Scott D. Brandt (Purdue)
9. New Mexico EPSCoR
10. NSF-Funded EarthCube, EC3, and DataONE Projects
11. USGS Community for Data Integration (CDI)/Fundamental Science Practices (FSP) publicly-available resources
Thank you
Questions and comments
Creative Commons Attribution-NonCommercial 4.0 International License