
A centre of expertise in digital information management

www.ukoln.ac.uk

UKOLN is supported by:

C21st Scholarship: Data as an Agent for Change

Dr Liz Lyon, Director, UKOLN, University of Bath, UK

Associate Director, UK Digital Curation Centre

3rd Bloomsbury Conference, London, June 2009

This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0

Perspectives

1. The 21stC Scholar : Team Science in the Cloud

2. Chemical Crystallography : Data Publishing Showcase

3. The Future : a Transformational Agenda

The 21stC Scholar : Team Science in the Cloud

http://www.flickr.com/photos/wwarby/3632317031/

What does the C21st research(er) look like?

• “From users to choosers” (Yanosky)

• Pro-sumers (Toffler)

• Digital nomads

• Work on the Webtop

http://www.flickr.com/photos/shankrad/2905938179/

• Multi-scale & complex

• Highly data-intensive

• Increasingly “open”

http://www.flickr.com/photos/stormsriver/2286011597/

OPEN CLOSED

“Continuum of Openness”?

What do we mean by Team Science?

• Science as a social activity: Tweet, Blog, Comment, Rate, Vote, Recommend, Tag, Share, Mash

• Highly collaborative

• Multi-disciplinary

• Core team skills

• Trust is key

• Inter-institutional collaboration – better science (Brian Uzzi, 2008)

A new digital economy?

• Data is:

  – On demand

  – A utility

  – Commoditised

  – Un-differentiated

  – “Publish then filter” (Shirky)

  – Traded

• “Cloud” model?

• Brokers & aggregators are key roles

• Free, pay per use, pay as you grow…

http://www.flickr.com/photos/will-lion/2738252562/

• Economies of scale

• Network effects

• New data publishing business models

http://www.flickr.com/photos/thomasreichart/2130018485/sizes/l/

Chemical Crystallography : Data Publishing Showcase

Data Deluge

A bottleneck : the primary cause is the current data publication process, which is tied to journal articles and peer review

“40 years ago a PhD student would determine about 3 crystal structures for their thesis – this can now be easily achieved in a day”

[Figure: data deluge diagram; surviving values: 35 million, 2.5 million, 0.5 million, ‘few thousand’]

Slide: Dr Simon Coles, Univ Soton

Multi-scale : from Diamond Light Source … to the Laboratory bench

eCrystals Team

• Domain (Chemists): Simon Coles, Mike Hursthouse, Jeremy Frey, Cameron Neylon, Andrew Milsted, Richard Stephenson, Jamie Robinson, Steven Wilson, Andrew Bailey, Mark Borkum

• Computer science: Dave DeRoure, Les Carr, Monica Schraefel, Chris Gutteridge, Tim Myles-Board, Arouna Woukei, Dave Tarrant, Stuart Middleton

• Informatics: Liz Lyon, Manjula Patel, Rachel Heery, Monica Duke, Michael Day, Traugott Koch, Pete Cliff

eCrystals Data Repository

• Quick & simple to deposit

• Software tools

• Laboratory archive

• Community involvement

• ‘Embargo’ facility

• Structured foundations

• Discoverable & harvestable

http://ecrystals.chem.soton.ac.uk

Trust

Standards

Audit and certification tools

• TRAC

• DRAMBORA

• PLATTER

• NESTOR

• Data Seal of Approval

eCrystals Curation Reports (3)

• Preservation metadata

• PREMIS Data Dictionary

• OAIS

• Representation Information

• Registry/Repository RRORI

Data sustainability

Data Discovery & Access

“Community Criteria for Interoperability” (Scaling Up Report 2008)

• Domain data format standard: CIF

• Domain data validation standard: CheckCIF

• Metadata schema: eCrystals Application Profile

http://www.ukoln.ac.uk/projects/ebank-uk/schemas/

• Crystallography Data Commons: TIDCC Data Model in development

• Embargo & Rights http://ecrystals.chem.soton.ac.uk/rights.html

• Domain identifier: International Chemical Identifier

• Citation & linking: DOI

http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
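The "Citation & linking: DOI" criterion can be illustrated with a small sketch that turns a DOI string into a resolvable link. This is only an illustration, not part of the eCrystals software: the function name `doi_to_url` is hypothetical, and the resolver base is simply the one visible in the example link on the slide.

```python
from urllib.parse import quote

# Assumption: resolver base copied from the example link on the slide.
RESOLVER_BASE = "http://dx.doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolver URL from a DOI (hypothetical helper for illustration)."""
    doi = doi.strip()
    # Accept the common "doi:" label in front of the identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # A DOI has a prefix starting with "10." and a non-empty suffix after "/".
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters, leaving "/" readable.
    return RESOLVER_BASE + quote(doi, safe="/")


link = doi_to_url("doi:10.1594/ecrystals.chem.soton.ac.uk/145")
print(link)
```

Keeping the DOI itself (rather than the repository URL) in a citation means the link can keep working if the repository moves, since only the resolver record needs updating.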

Paris, March 2009

Memorandum of Understanding

Dr Simon Coles, Univ Southampton

http://wiki.ecrystals.chem.soton.ac.uk/index.php/Main_Page

                                                             


Slide of data services : CrystalEye, Crystal Web, ChemXSeer etc. (search structures; check PMR stuff; aggregate, syndicate, filter etc.)

New Web service to aggregate published crystallography data...

... federated search.....

structure search...

Data casts : Lab Blogs

Machines Tools

Sensors

Original slide: Dr Simon Coles, Univ Soton

Publishing and sharing methodologies ...

... and workflows ...

... data for re-use, mash-ups, mining, computation, models, simulations ...

[Figure: Data (capture) → Semantic Graph (storage) → Mash-up (reuse); node labels: text, experiments, measurements, documents, data, molecules, scientists]

oreChem – The Chemical Semantic Web

• University of Cambridge

• Cornell University

• Indiana University

• Penn State University

• University of Queensland

• University of Southampton

• At-source capture of chemistry data

• Chemical structure search

• Compound object authoring

• Retrospective harvesting of chemistry data

• Reuse through common ORE data model

• Semantic authoring

• Virtualized triple storage

Slide: Dr Simon Coles, Univ Soton
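The ideas of compound objects and triple storage named on the oreChem slide can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the oreChem implementation: the `TripleStore` class and every identifier prefixed `ex:` are made up for the example; only the `ore:aggregates` term URI comes from the OAI-ORE vocabulary.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# OAI-ORE vocabulary term; all "ex:" identifiers below are hypothetical.
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


class TripleStore:
    """A toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return matching triples in sorted order; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(want is None or want == got for want, got in zip(pattern, t))
        )


# A compound object: one aggregation grouping a data file and its validation report.
store = TripleStore()
store.add("ex:aggregation-1", ORE_AGGREGATES, "ex:structure.cif")
store.add("ex:aggregation-1", ORE_AGGREGATES, "ex:checkcif-report.html")
store.add("ex:structure.cif", "ex:format", "CIF")

# Reuse: list everything the aggregation bundles together.
parts = [o for _, _, o in store.match("ex:aggregation-1", ORE_AGGREGATES)]
print(parts)
```

Because every statement has the same three-part shape, data captured at source and data harvested retrospectively can sit in one graph and be queried with the same pattern.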

The Future : a Transformational Agenda?

http://www.flickr.com/photos/cyber_chof/1246303241/sizes/m/

We need to understand the value and benefits of data publishing and associated data curation / management ... and articulate them clearly

• Values & benefits may be:

  – political

  – economic

  – societal...

• DCC Research Data Management Forum 3

Some issues and challenges.....

1. Research quality

• Publications based on closed peer review

• Maintain reputation

• Demonstrate provenance

• Open pilots – Nature

• Use collective intelligence

• Ratings, polls, recommender systems

• Data publishing policy?

2. Research sustainability

• Ensure curation & preservation of long term scientific record including the data

• Requires significant investment in infrastructure

• Assure data security

• Demonstrate resilience & robustness

• Establish trust

• New business models

• Understand full costs

3. Research capacity & capability

• Multi-disciplinary team

• Hybrid skills

• New field - data informatics

• New roles for information professionals?

IJDC 2009 (in press)

• Increase capacity & capability

• Embed skills in LIS curriculum

• Develop career paths, incentivise

Take homes

1. Team science is a social activity

2. We need to advocate the value & benefits of data publishing

3. Data informatics underpins C21st scholarship

Moving to Multi-Scale Science: Managing Complexity & Diversity

Thank you

Slides will be available at: http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html

http://www.dcc.ac.uk/