Project 3TU.Datacentrum
Im@c, September 24th 2009
Jeroen Rombouts, MSc
Project manager 3TU.Datacentrum
Presentation outline
Why care about research data?
What do data producers have to say?
Why care? 1/3
[Diagram: Research; Manuscript, Publication; Data, Metadata; Repository, Library]
Why care? 2/3
Risks of current research data management
• Physical decay of storage media;
• Loss of descriptive (meta)data;
• Loss of ‘rendering’ capabilities (contemporary applications for viewing and analysing data).
Reasons for long-term preservation and access
• Data value (cost intensive, valorisation, continuous datasets);
• Research quality (verification, knowledge transfer, sharing).
Why care? 3/3
Project setting
• Plan of the National Science Foundation regarding preservation of digital scientific output (2006);
• OAIS reference model (2002 by CCSDS) becomes ISO standard (2009);
• KNAW starts Dutch data repository for humanities and social sciences: DANS (Data Archiving and Networked Services) (2005);
• No initiatives for engineering and science in the Netherlands.
The 3TU.Datacentrum 1/8
Project description
• Builds on two previous projects:
  – E-Archiving – digital depot
  – Darelux – Data Archiving River Environment Luxemburg
• Time frame of 3 years (2008-2010):
  – Financed mainly by 3TU.Federation
  – Datasets from TUD, TU/e and UT, later other science data
• Goal: long-term access to research data.
The 3TU.Datacentrum 2/8
Tasks
• Implement and run ‘data-archive’ (facilitate data producers);
  - Collect, preserve, publish and provide access to data
  - (β): drietu2.3tu.nl/repository/collection:all/view/html
• Data management consultancy;
  - Select and develop formats, metadata, tools, etc.
Collaboration
With DANS, SURF, Koninklijke Bibliotheek and others:
• “DRIVER-II” (EU-7FP), demonstrator for Enhanced Publications;
• “Waardevolle Data & Diensten” (SURFshare), identify the added value of a data repository for data producers;
• Partner in DataCite consortium with TIB Hannover, ETH Zurich, INIST (France), British Library, DTU Copenhagen, NRC-CISTI (Canada), California Digital Library.
The 3TU.Datacentrum 3/8
Which data to preserve? And why?
• Data of ‘enhanced publications’ (underlying data and visualisations linked to publications).
  Increase publication value (stronger basis, more citations, …);
• Data generated by ‘hard to repeat’ processes.
  E.g. high cost, (environmental) observations, complex or continuous experiments, …;
• Data collected with public funding.
  Conditions by funding organisations or publishers like Nature Publishing Group, NWO, governmental organisations, universities, …;
• Preferably open access data with potential for reuse (verification, new research, …).
  Increase visibility, efficiency and quality of research efforts.
The 3TU.Datacentrum 4/8
• Technical infrastructure (server, platform, websites, formats & models)
• Dataset Darelux (2.0) http://drietu2.3tu.nl/repository/resource:study-CITG/view/html
• Dataset Flame (BagIt) http://drietu2.3tu.nl/datasets/flame/
• Dataset Wind speed/Solar radiation http://drietu2.3tu.nl/datasets/windzon/
• Datasets ‘on the way’: NNV Survey ‘job market physicists’, Enhanced Publication ‘combustion’, Waterlab, Biotechnology, Remote sensing, ‘Tire noise’
Related ‘results’ 5/8
• Partner in DataCite consortium with TIB Hannover, ETH Zurich, INIST (France), British Library, DTU Copenhagen, NRC-CISTI (Canada), California Digital Library.
  “to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence”;
• Founding member of COAR: Confederation of Open Access Repositories (October);
• Provide input for “Nota Wetenschappelijke informatievoorziening” (OC&W), “Toekomst voor ons digitaal geheugen” (NCDD);
• Partner in “Nationale Coalitie Digitale Duurzaamheid” (www.ncdd.nl);
• Coordinating “Forum onderzoeksdata”.
The 3TU.Datacentrum 6/8
The 3TU.Datacentrum 8/8
The benefits for data producers and data consumers
• Increased visibility of research output (metadata in repository networks, assigning DOIs, facilitating increased citation rates for ‘enhanced publications’, …);
• Improved quality of datasets (quality assurance for multi-user setup, checks on ingest, …);
• (Long-term) preservation of, and access to, valuable research data;
• Distribution of research data for reuse, including administration and usage statistics;
• Advice on data management, rights, formats, metadata, etc.
What do data producers say? 1/2
Nobody needs my data
Data transfer not needed, every PhD does own project
Our datasets are confidential
Interesting but not for me
Only for long term continuous data
Datasets are stored by publisher
No time!
Our research is once only
What do data producers say? 2/2
Surprising our university had no facility for data preservation
Transfer of data between PhDs can be improved
Would like to publish data
Good opportunity to share datasets we bought
Very useful, essential metadata often missing
Much to improve in reuse of data
When can I store my datasets?
Questions? Suggestions?
Nature News Special on Data Sharing (September 2009) www.nature.com/news/specials/datasharing/index.html
Toekomst voor ons digitaal geheugen http://www.ncdd.nl/documents/NCDDToekomst2009_000.pdf
Resources
• The 3TU.Datacentrum project www.datacentrum.3tu.nl
• “Unavailability of online supplementary scientific information from articles published in major journals” doi:10.1096/fj.05-4784lsf
• “Going, Going, Gone: Lost Internet References” doi:10.1126/science.1088234
• “Sharing Detailed Research Data Is Associated with Increased Citation Rate” doi:10.1371/journal.pone.0000308
• “To share or not to share” www.rin.ac.uk/data-publication
• “NSF’s Cyberinfrastructure Vision for 21st Century Discovery” www.nsf.gov/od/oci/ci_v5.pdf
• “SURF Direct” Digitale rechten – onderzoeksdata (Dutch) www.surf.nl/surfdirect
• Nature News Special on Data Sharing (September 2009) www.nature.com/news/specials/datasharing/index.html
• Toekomst voor ons digitaal geheugen http://www.ncdd.nl/documents/NCDDToekomst2009_000.pdf