Data-Intensive Research: Making better use of Research Data
Malcolm Atkinson & David De Roure
8 December 2009
Report from a fact-finding mission
Mission goal: learn how researchers use data
Acknowledgements: the UK e-Science Directors CIR authors, our teams, the EPSRC and all our hosts in the USA. They had the good ideas; all the opinions, observations and recommendations are our own.
Outline
• Cornucopia of data
• Yet to learn how to use it well
• Hot topic
• Research to politics
• Concepts
• Datascopes, Intellectual Ramps, Going the last mile
• Co-*
• Digital ecosystem
• Principles
• Recommendations
• Actions
• Survival in the Digital Revolution
Data-Intensive Research Events
Bermuda agreement 1996, 97 & 98
SDSS Archive DB 1999
Human Genome 2001
DI Comp. Environm’s 2001
Fort Lauderdale 2003
Hey & Trefethen D.Del. 2003
Digital Curation Cen. 2004
NSF DataNet call 2007
XLDB series starts 2007
SciDB starts 2008
Yahoo DI workshop 2008
Harnessing data 2009
Beyond data del. 2009
Gov’s use Linked D. 2009
NSF CISE DI call 2009
4th Paradigm book 2009
JISC Research DM 2009
e-IRG DMTF report 2009
DIEW Japan 2010
Sir Tim Berners-Lee
http://www.w3.org/DesignIssues/GovData.html
Datascopes for the mind
NRAO/AUI/NSF
To see things in your data you could never see before
Data to Information to Knowledge to Wisdom
Changed our place in the universe
Example datascopes
Searching for an expression
≈ OTX1 AND (Pou3f2 OR (Brd4 AND Sim1))
2903 ≈ 947 AND (1688 OR (1697 AND 3096))
Dmbx1
Match = 0.755
Slide from Jano van Hemert
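The slide does not say how the match score is computed. As a minimal sketch only, assume each gene's expression pattern is a set of spatial region identifiers, that AND and OR are set intersection and union, and that the match is a Jaccard overlap. The region sets below are invented toy data, and the `jaccard` helper is hypothetical, not the tool shown on the slide.

```python
# Hypothetical sketch: each gene's expression pattern is modelled as a set of
# region ids. All data here is invented for illustration.

def jaccard(a, b):
    """Overlap score in [0, 1]: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

patterns = {
    "OTX1": {1, 2, 3, 4, 5, 6},
    "Pou3f2": {2, 3, 4},
    "Brd4": {4, 5, 7},
    "Sim1": {5, 8},
    "Dmbx1": {2, 3, 4, 5, 9},  # the pattern being searched for
}

# OTX1 AND (Pou3f2 OR (Brd4 AND Sim1)), with AND as intersection, OR as union
candidate = patterns["OTX1"] & (
    patterns["Pou3f2"] | (patterns["Brd4"] & patterns["Sim1"])
)

# Assumed match measure: how closely the combined pattern overlaps the target
score = jaccard(patterns["Dmbx1"], candidate)
print(sorted(candidate), round(score, 3))
```

A real datascope would work on image-derived expression volumes and search over many candidate expressions; the sketch only shows how one Boolean combination could be scored against a target pattern.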
Intellectual ramps
Easy and low risk to start
Progress to advanced skills
For research data users
No obligation
Go as far as you want
Find a service & relax
Dropbox as a Ramp
Local folder synchronised and shared via cloud
Condor job submitted by drag and drop
Ian Cottam
Results appear in Dropbox
Slide from David De Roure
Intuitive interfaces
e-Science Research
http://research.nesc.ac.uk/rapid/
Slide from Jano van Hemert
Engineering economic ramps
Going the last mile
Slide from Jano van Hemert
Gene Expression
Run 1
1 2 3 4 5 6 7   C1
8 9 10 11 12 13 14   C2
Run 2
1 2 3 4 5 6 7   C3
8 9 10 11 12 13 14   C4
15 16 17 18 19 20 21   C5
22 23 24 25 26 27 28   C6
Run 3
15 16 17 18 19 20 21   C7
22 23 24 25 26 27 28   C8
29 31 32 33 34 35 36   C9
37 38 39 40 41 42 43   C10
Run 4
29 31 32 33 34 35 36   C11
37 38 39 40 41 42 43   C12
44 45 46 47 48 49 50   C13
51 52 53 54 56 57 58   C14
Run 5
44 45 46 47 48 49 50   C15
51 52 53 54 56 57 58   C16
55 59 60 61 62 63 3*   C17
55 59 60 61 62 63 3*   C18
Slide from Rob Kitchin
Walking a path together
co-shaping
co-design
co-creation
co-constitution
co-evolution
co-construction
co-
Finding a niche in the digital ecosystem
Alignment of paths to routine use
• Invention
• Proof of concept demonstration
• Local group use
• Filling a research niche
• Community use
• Established but still evolving
• Widespread and global use
• de facto standard
Competition for mind-share and resources
A data-intensive future
e-Science Research
Slide from Jano van Hemert
General Principles
• Support for research data should be in harmony with the evolving digital-data ecosystem
• Increase investment in analysing data to be commensurate with that for collecting data
• Co-evolve research practices with new methods and their supporting software
• Democratise research by improving education and access
• Smooth the path from foundational research, through invention and proof of concept to sustained use
• Expose the costs of computation and data to researchers
Recommendations
• Stimulate new thinking and international collaboration
• Invest and collaborate in creating shared methods and their supporting software for exploring and exploiting digital data
• Build intellectual ramps to new methods and provide convenient services for routine tasks
• Invest in the foundations for exploiting research data
• Develop a smooth path from the invention of methods to their sustained and routine use
Actions
1. Workshops on DIR
2. DIR education
3. Ideas factory
4. Engage with current best practice
5. Immediate research challenges
6. DIR facilities
7. Boost reference data services
8. Foundational research
9. Green DIR
10. Coordination
Take home message
Survival in the digital-data revolution depends on speed and appropriateness of adaptation
ADMIRE – Framework 7 ICT 215024
?
Picture composition by Luke Humphry, based on prior art by Frans Hals
www.omii.ac.uk
www.admire-project.eu
www.ogsadai.org.uk
www.nesc.ac.uk