A centre of expertise in digital information management
www.ukoln.ac.uk
UKOLN is supported by:
Transition or Transform? Repositioning the Library for the Petabyte Era
Dr Liz Lyon, Director, UKOLN, University of Bath, UK
Associate Director, UK Digital Curation Centre
ARL / CNI Forum, Washington DC, October 2008
This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0
Perspectives
1. Disciplines and Practice : a diverse landscape
2. Institutions and Assets : emerging initiatives
3. People and Skills : building capacity
Robot librarian @ CERN tends 5PB data (Nature July 2008)
Disciplines and Practice
• Immersive case studies www.dcc.ac.uk/scarp/
• Disciplinary factors in curating Architectural Research (Colin Neilson)
• Curating Brain Images in a Psychiatric Research Group (Angus Whyte)
• Curating MST radar data at STFC (Esther Conway)
• Roles and reusability of video data in social research (Angus Whyte)
http://www.flickr.com/photos/30435752@N08/2892112112/
http://www.flickr.com/photos/macronin47/85006920/
Neuro-imaging Case Study
Report due October 2008
Division of Psychiatry, Univ. of Edinburgh
9Tb MRI images + demographic data shared in Neurogrid/NeuroPsygrid and multi-centre studies
10-year longitudinal studies: data is continually re-analysed, and new analytic techniques add value to older data
Data integration: multiple scanner image normalisation, shared terminology needed
Interdisciplinary team roles: clinicians, imaging researchers, psychologists, scanner engineers, sys-admin
Heedful interaction: team weekly meetings provide a human infrastructure for curation, based on trust
Data-sharing as a form of trade or gift exchange: “give to get” rather than “give away” – implications for research funder data access policies: research funders should “think global & act local”
Ethics, confidentiality, privacy issues: “skull-stripping” software & anonymisation, potential for prediction of psychiatric disorders
DRAMBORA Risk assessment, mitigation steps identified; data policy + core metadata set, data documentation, phased development
http://wiki.ecrystals.chem.soton.ac.uk/index.php/Main_Page
eCrystals Curation & Preservation Study Working with the Digital Curation Centre
Examined four main areas
1. Audit and certification (TRAC, DRAMBORA, NESTOR, ISO International repository audit and certification BOF Group)
2. The Open Archival Information System (OAIS) and Representation Information (RI)
3. eBank-UK application profile and preservation metadata
4. ePrints.org repository platform
Recommendations
http://www.ukoln.ac.uk/projects/ebank-uk/curation/eBank3-WP4-Report%20(Revised).pdf
eCrystals Federation: Preservation & sustainability Recommendations
Data repositories
• Use DRAMBORA Interactive Vs 1.0 for self-assessment
• Add PREMIS preservation metadata
• Collect eCrystals representation information
• Examine repository platform conformance to OAIS Reference Model
• Survey partner preservation policies
Digital Curation Centre partnership
Interviews & analysis of a discipline: crystallography
Findings:
• Diverse lab practice
• Laboratory Information Management Systems (LIMS) & proprietary formats
• Data policy should reflect lab practice & institutional model
• Data quality criteria/validation
• “Prior publication” problem
• We need scalable assignment of “terms” for data discovery
• No discipline preservation model
Recommendations (7), commentary
Scaling Up Report
UKOLN and University of Southampton, May 2008
Practice challenges?
1. Understanding the risks, awareness
2. Community consensus, advocacy
3. Data management plans
4. Appraisal: selection criteria
5. Data documentation: metadata, schema, RepInfo, semantics
6. Data formats: applying standards
7. Instrumentation: proprietary formats
8. Data provenance: authenticity
9. Data citation & versions: persistent IDs
10. Data validation and reproducibility
11. Data access: embargo policy
12. Data linking: text, images, software
***Open Science Experiment***
With thanks to Simon Coles, Univ Southampton
Blogging results data
1. Transition or Transform? the Library
• Remote research support or integrated team science?
• Passive observation or proactive participation?
• Is your library fully embedded in research practice?
• How do you acquire a deeper understanding of disciplinary data curation approaches?
• Models of engagement?
  – Immersive case studies
  – Joint R&D projects
  – New service offerings
  – Role extension (Faculty / subject / liaison librarians)
  – Secondments
Library supports human infrastructure for curation
Institutions and Assets
http://www.flickr.com/photos/mintchocicecream/7491707/
Shared Research Data Service Feasibility Study
• HEFCE award £255K to SERCO
• Objectives:
  – Develop understanding of UK’s current and future research data service needs
  – Work with other UK stakeholders to identify priorities for action
  – Develop a number of scenarios/options for the shared service from “do nothing” to a managed national service
  – Develop a detailed business plan for the preferred option(s)
  – Include assessment of costs and benefits in options appraisal
  – Indicate both scale of investment required & an estimate of likely ROI
  – Present outline governance and management proposals for the preferred option(s)
• 4 case study “volunteers”: Bristol, Leeds, Leicester, Oxford
• Report January 2009, Interim Report published July 2008
State-of-the Nation Analysis
Research funder policies
Data centres and facilities
International comparators
Options analysis and appraisal
Baseline for Costs
Stakeholder analysis,
Success criteria
Emerging survey themes:
Advocacy, Co-ordination and information, Coherence, Data Depository, Skills and training, Seeding the Data Commons
October 2008: Development of Models
http://www.flickr.com/photos/philipdunn/2424950499/
University of Oxford case study
37 interviews with researchers + Workshop
Report published July 2008
Background
A recommendation to JISC:
“JISC should develop a Data Audit Framework to enable all universities and colleges to carry out an audit of departmental data collections, awareness, policies and practice for data curation and preservation”
Liz Lyon, Dealing with Data: Roles, Rights, Responsibilities and Relationships, (2007)
Data Audit Framework
Launch: 1st October 2008
http://www.data-audit.eu/
Benefits:
Prioritisation of resources
Capacity development and planning
Efficiency savings – move data to more cost-effective storage
Manage risks associated with data loss
Realise value through improved access & re-use
Positioned as a self-audit tool
Scale: departments, institutions
Methodology
http://www.data-audit.eu/DAF_Methodology.pdf
Workflow and tasks
Detailed assessment
• ID
• Data creator(s)
• Title
• Description
• Subject
• Date
• Purpose
• Source
• Updating frequency
• Type
• Format
• Rights and restrictions
• Usage frequency
• Relation
• Back-up and archiving policy
• Management to date
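The detailed-assessment elements above could be captured as one record per data asset. What follows is only a minimal sketch, not the Data Audit Framework's own tooling: the class name `DataAssetRecord`, the field names, the helper `missing_elements` and the sample values are all hypothetical, chosen to mirror the element list on the slide. Only the ID and title are treated as mandatory here, which is an assumption rather than something the slide states.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class DataAssetRecord:
    """One inventory entry; field names mirror the slide's element list (hypothetical)."""
    asset_id: str                                   # ID
    title: str                                      # Title
    data_creators: List[str] = field(default_factory=list)  # Data creator(s)
    description: Optional[str] = None
    subject: Optional[str] = None
    date: Optional[str] = None
    purpose: Optional[str] = None
    source: Optional[str] = None
    updating_frequency: Optional[str] = None
    data_type: Optional[str] = None                 # Type
    file_format: Optional[str] = None               # Format
    rights_and_restrictions: Optional[str] = None
    usage_frequency: Optional[str] = None
    relation: Optional[str] = None
    backup_and_archiving_policy: Optional[str] = None
    management_to_date: Optional[str] = None


def missing_elements(record: DataAssetRecord) -> List[str]:
    """Return the names of elements still empty (None, empty string or empty list)."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Illustrative values only; they do not come from any real audit.
record = DataAssetRecord(
    asset_id="EXAMPLE-001",
    title="Example dataset",
    data_creators=["A. Researcher"],
    file_format="CSV",
)
gaps = missing_elements(record)
print(f"{len(gaps)} elements still to collect: {', '.join(gaps)}")
```

A completeness check of this kind is one way an auditor might see at a glance which elements remain to be collected in interviews.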
School of GeoSciences pilot audit
• 80 academics, 70 research fellows, 130 PhD students
• Annual research grant and contract income of £4-6M
• Staff contribute to >1 of five Research Groups
• Involvement in inter-University Research Consortia and Research Centres
• 15Tb data on main server
• Audit led by Information Services staff
• Interviews with 35 Faculty staff
• Create Inventory of 25 datasets and classify them
• Assess most significant assets in detail, collect basic set of data elements based on Dublin Core
• Draft Report and Recommendations to the School of GeoSciences and to Information Services
GeoSciences pilot: lessons learned
• Time needed is longer than initially anticipated (for interviews etc.) but still manageable - Plan well in advance
• Inventory doesn't have to be comprehensive but could be a representative sample
• Little documentation/knowledge of what exists: “a nightmare”
• There are no standards in creating & managing data assets
• Define the scope and granularity carefully
• Ensure appropriate timing (avoid exams, field trips, Boards…)
• Get support from senior management (VP level)
• Collect as much information as possible in interviews/surveys
• Variable openness of staff and their data
GeoSciences pilot: some outcomes
• Preliminary but positive
• Requirement for institution-wide data policy and guidelines
• Requirement for researcher training
• IPR issues associated with data ownership: individual or institution?
• Requirement for training for auditors
• Scaling up audits: 6 further data audits in process (including Physics, Biol Sci., Education, History, Classics & Archaeology, Biomedical Sciences)
2. Transition or Transform? Librarians
There are lots of opportunities for action
• Leadership by senior managers
  – Data policy development - with PVC / VP research
  – Storage infrastructure provision – with IT Director
• Faculty audit co-ordination (DAF tool)
• Advocacy, awareness-raising workshops, training
  – Data literacy programmes
  – Curation Lifecycle management
• Inform data management plans
• Data documentation best practice
• Repository assessment (DRAMBORA tool)
• Deliver new integrated support services
People and Skills
Background
Recommendations to JISC:
“A study is needed to examine the role and career development of data scientists and the associated supply of specialist data curation skills to the research community”.
“JISC should fund a study to assess the value and potential of extending data handling curation and preservation skills within the undergraduate and postgraduate curriculum”.
Liz Lyon, Dealing with Data: Roles, Rights, Responsibilities and Relationships, (2007)
“The role of the Library in data-intensive research is important and a strategic repositioning of the Library with respect to research support is now appropriate.”
“there are…not enough specialised data librarians yet”
“Recommendation: The research library community in the UK should work with universities and research institutes to define properly and to formalise the role of data librarians, and to develop a curriculum that ensures a suitable supply of librarians skilled in data handling.”
CILIP Update June 2008
Only 5 in the UK???
“Accidental” data librarians
Research Data Forum
• Bringing diverse communities together
• Data centre managers, IR managers, librarians, funders & policy makers
• Aims & Objectives:
– Facilitate co-operation between organisations and individuals
– Exchange experience and best practice
• November 2008, Manchester UK http://www.dcc.ac.uk/data-forum/
• 2nd joint DCC – RIN event
Day 2 (Thursday 27th November)
09.00 Identifying roles : JISC report Dr Alma Swan, Key Perspectives
09.30 Data creator Professor Stephen Lawrie, Dept of Psychiatry, University of Edinburgh
09.50 Data scientist Helen Parkinson, EBI
10.10 Data librarian Robin Rice, EDINA,
10.30 Data manager Sam Pepler, BADC
10.50 Coffee
11.15 Professional education and training perspectives
Professor Sheila Corrall, Dept of Information Studies, University of Sheffield
11.45 Discussion: Identifying core data skills Facilitator: Liz Lyon
13.00 Lunch
13.45 Introduction to Breakout Groups Graham Pryor, DCC (Chair)
1) Do librarians have the right skills? Facilitator: tbc
2) Ways to up-skill the research community? Facilitator: tbc
3) Sharing skills from data scientists and data managers Facilitator: Stuart Jeffrey, ADS
15.00 Feedback Session
15.30 Discussion
16.00 Way forward and next steps Graham Pryor, DCC
Roles and responsibilities for effective data management
DCC Digital Curation 101
• Curation “summer” school
• 6-10 October 2008 @ NeSC
• Lectures + hands-on
• Target participants: bench scientists, LIS professionals
• Focussed around the DCC Curation Lifecycle Model
• Conceptualise, Create and/or Receive, Appraise & select, Ingest & Store, Preserve, Access, Use & Re-Use, Data Management Plan
Open Science
http://jiscpowr.jiscinvolve.org/
3. Transition or Transform? the Role
• Multidisciplinary teams, multidisciplinary people
• Domain + ICT + library + archiving knowledge
• New roles: “data librarians”, “data scientists”
• Skills shortage: capacity building needed
• What core data skills are required?
• Not in LIS school curriculum? Radical change!
• Recruit different people to the LIS team
• Re-brand the LIS career
From Librarianship to Informatics
Thank you.
Slides will be available at:
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html