e-Research and the Data Librarian


Support for e-Research: Filling the Library Skills Gap

E-Science Institute, University of Edinburgh, June 2007

e-Research and the Data Librarian

Stuart Macdonald
Edinburgh University Data Library / EDINA National Data Centre

Luis Martinez
London School of Economics Data Library

• What are data?

• Where do you get it from?

• Data support services

• Developments in data storage, dissemination and analysis

• e-Research definition and examples

• DISC-UK DataShare

What are Data?

Some definitions:

• a collection of observations or other information related to a particular question, problem, experiment or place

• information, most commonly in the form of a series of binary digits, stored on a physical storage medium for manipulation by a computer program

• information in numerical form that can be digitally transmitted or processed

• a representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automated means

Data Types

Social Sciences: micro data; aggregated data; geospatial data; financial data; qualitative data; in addition to commercial or private data (bank transactions, Tesco customer purchase records, government administrative records, CCTV footage)

‘Hard Science’: astronomical and meteorological observations; climate modelling; crystallography; gene sequence data; clinical and epidemiological records; mass spectrometry; satellite or archaeological images and aerial photography; polar orbit tracking data; chemical, structural and mechanical engineering data; remote sensing, ...

Associated concerns:
• ethics (confidentiality/disclosure)
• scale (time/storage)
• proprietary formats
• copyright and legal issues
• long-term preservation

More data will be created in the next five years than has been collected in the whole of human history. Properly managed, this data will form major resource for Australian researchers.

Department of Education, Science and Training (2007) "Backing Australia's Ability - An Ongoing Commitment" – URL: http://backingaus.innovation.gov.au/info_booklet/on_commit.htm

Researchers, government institutions, non-profit organizations, schools, commercial organizations, and individual citizens all need the widest possible access to data from all sources to explore, experiment, test, create new knowledge and new products, and, ultimately, to increase understanding of our world.

Harlan Onsrud and James Campbell, Department of Spatial Information Science and Engineering, University of Maine [2006] – “Big Opportunities in Access to ‘Small Science’ Data”

‘increase the democratisation of knowledge’

Research Council-funded Data Centres

• EDINA, MIMAS (JISC/ESRC)
• UK Data Archive, ESDS (JISC/ESRC)
• Arts and Humanities Data Service (AHRC/JISC)

• NGDC - National Geoscience Data Centre (NERC)
• BADC - British Atmospheric Data Centre (NERC)
• AEDC - Antarctic Environmental Data Centre (NERC)
• NEODC - NERC Earth Observation Data Centre (NERC)
• BODC - British Oceanographic Data Centre (NERC)
• NEBC - NERC Environmental Bioinformatics Centre (NERC)

• UK Cluster Data Centre (Particle Physics and Astronomy Research Council)

• UK Stem Cell Bank (MRC)
• UK DNA Banking Network (MRC)
• Brain Tissue Bank (MRC)

• UKIDC - UK Infrared Space Observatory Data Centre (STFC)
• UKSSDC - UK Solar System Data Centre (STFC)
• Chemical Database Service (STFC)

Other Sources

National Statistical Agencies:
• Office of National Statistics (ONS) - http://www.statistics.gov.uk/
• General Register Office for Scotland (GROS) - http://www.gro-scotland.gov.uk/
• Northern Ireland Statistics and Research Agency (NISRA) - http://www.nisra.gov.uk/
• Statistics for Wales - http://new.wales.gov.uk/topics/statistics/
• Eurostat - http://epp.eurostat.ec.europa.eu/portal/

Free Resources:
• Non-Governmental Organisations
• Government websites (national/local)
• Independent Research Organisations
• Charitable Organisations
• Media Organisations

Data Discovery Tools:
• Intute - http://www.intute.ac.uk/
• Go-Geo! - http://www.gogeo.ac.uk/

Data Support Services

Institutions provide support for data services in different ways:
• Data Libraries
• University Libraries
• Computing Centres
• Research Offices
• Academic Departments

Data Libraries go beyond local support of national data centres & statistical agencies:
• Act as a ‘repository’ of data
• Reference service
• Train users to access and handle data resources

UK Data Libraries

• Edinburgh University Data Library – first such service in the UK, 1983
• Oxford University Data Library – 1988
• London School of Economics Data Library – 1997
• RLab Data Service – 1999, providing support to LSE’s research laboratory

Other institutions with ‘Social Statistics’ libraries:
• University of Southampton
• Strathclyde University

DISC-UK (Data Information Specialist Committee – UK)
• Foster understanding between data users and providers
• Raise awareness of the value of data support in Universities
• Share information and resources among local data support staff
• URL: http://www.disc-uk.org/

Web 2.0 – lateral thinking in a linear world?

• Blogs and wikis – WordPress, Blogger
• Social bookmarking – del.icio.us
• Media-sharing services – YouTube, Flickr, Scribd
• Social networking systems – MySpace, Elgg
• Collaborative editing tools – Google Docs and Spreadsheets, Gliffy
• Syndication technologies – RSS
• Mashups:

Numeric Data:
• Swivel – http://swivel.com/
• Many Eyes – http://services.alphaworks.ibm.com/manyeyes/home
• Data360 – http://www.data360.co.uk/

Spatial Data:
• BackOfMyHand – http://www.backofmyhand.com
• Map Builder – http://www.mapbuilder.net
• Maptrot – http://www.maptrot.com
• Click2Map – http://www.click2map.com
• Blockrocker – http://www.blockrocker.com

Institutional Repositories

UK Repository Projects:
• StORe – Source-to-Output Repositories
• GRADE – Geospatial Repository for Academic Deposit and Extraction
• R4L – Repository for the Laboratory
• SPECTRa – Submission, Preserv’n & Exposure of Chemistry Teaching & Research data
• CLADDIER – Citation, Location And Deposition in Discipline & Institutional Repositories

Issues for further development:
• Interoperability - Dublin Core, OAI versus domain-specific XML schemas
• Embedding - repository seen as part of the organisational workflow
• Redefining repository - as a suite of methodological and technological processes that facilitate the research lifecycle
• Web 2.0 tools for collaboration - across and within department / institution / discipline
• Clarity on data citation & persistent identifiers
• Data rights - open access v restricted access v user-defined access

Domain-Specific Repositories:
• ArXiv.org – physics, maths, computer science
• Blue Obelisk Data Repository – chemoinformatics
• PubMedCentral – biomedical and life sciences
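To make the interoperability issue listed above more concrete, here is a minimal sketch of how a dataset description might be expressed as unqualified Dublin Core in the oai_dc container used by OAI-PMH. The function name `dataset_to_oai_dc` and the sample dataset values are hypothetical illustrations, not taken from any of the projects named on this slide; only the two namespace URIs and the Dublin Core element names are standard.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for unqualified Dublin Core and the oai_dc container.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def dataset_to_oai_dc(fields):
    """Build an oai_dc record from a dict of {dc_element_name: [values]}.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            # Dublin Core elements are repeatable, so emit one element per value.
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Invented example values; a real record would come from the repository.
record = dataset_to_oai_dc({
    "title": ["Example household survey, 2006 (hypothetical)"],
    "creator": ["Example Research Group"],
    "type": ["Dataset"],
    "format": ["text/csv"],
    "rights": ["Restricted access"],
})
print(record)
```

A generic record like this is easy to harvest across repositories, but it cannot carry the variable-level or instrument-level detail that a domain-specific XML schema would hold, which is the trade-off the bullet points at.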

• eScience, e-Social Science, e-Research and cyberinfrastructure

• “E-Research extends e-Science’s remit to all sciences referring to the use of distributed resources across multiple domains to do science or further research with the following key features: collaborative, multidisciplinary, use of GRID technologies and vast amounts of data” (CURL Workshop, 2005)

GRIDPP
• Large Hadron Collider
• GRID Prototype to analyze data

AstroGRID
• UK contribution to Virtual Observatory

CQeSS
• Develop and support quantitative E-Social Science

MiMeG
• Tools and techniques to analyse audio-visual qualitative data

Seamless Access to Multiple Datasets (SAMD)

• MIMAS as major contributor
• ESRC and DTI funded
• Solving a problem of the UK academic Social Science


DISC-UK DATASHARE PROJECT

JISC Repository and Preservation Programme
March 2007 to March 2009

DISC-UK members:
• EDINA (lead)
• University of Edinburgh
• London School of Economics
• University of Oxford
• University of Southampton

Purpose:
“provide exemplars for a range of approaches and policies in which to embed the deposit and stewardship of datasets in institutional repositories”

DATASHARE Motivation

Growing presence of IRs

StORe Social Science Report:
• 70% of survey respondents producing quantitative questionnaire data
• Vast majority of researchers not depositing data

Deliverables

• Enhancements to partners’ IRs
• Exemplars of the process of setting up an institutional data repository service
• Documentation and open source code for adapting repository software for handling datasets
• Technical watch on e-Research, VREs and Web 2.0 developments
• Papers, presentations and online dissemination of collected knowledge

• Management: storage, curation, policies
• Legal: access rights, confidentiality and creating public use files
• Technical: standards to describe, transport and communicate
• Cultural and political: do people want to share data?
• Central vs. distributed. Self-archiving vs. assisted deposit

Thank you