THE AUSTRALIAN NATIONAL DATA SERVICE AND OPEN ACCESS TO DATA
Andrew Treloar, Director, ANDS Establishment Project
ANDS: Building the Australian Research Data Commons
ands.org.au
Outline
• Context
• Blueprint
• Goal
• Structure
• Progress
• Internationally
• Acknowledgements
Context
eResearch Co-ordinating Committee (2006)
Thematic Issues
• Continuing Need for a Focus
  – through national coordination
• Human Capabilities
  – People, skills and understanding
• Linkage of eResearch Resources
  – seamless access to resources
• Access to Data
  – best practice data management and curation
• Structural and Cultural Change
  – evolution of organisational structures and cultures
• Awareness and Support
  – develop researchers’ ability to adopt eResearch
Service Clusters
• Data
  – outreach, curation, data management
  – meta-services, location, access, movement
  – practice, providers and users
• Computing
  – capability computing facilities
  – national computing environment
• Interoperation
  – discipline services (tools (software))
  – user and operations support
  – collaboration services support
• Access
  – the Australian access federation
  – the Australian research and education network
http://www.dest.gov.au/sectors/research_sector/publications_resources/profiles/e_research_strat_imp_framework.htm
Australian Code for the Responsible Conduct of Research
• Describes the responsibilities of institutions and researchers in a range of areas, including the management of research data and primary materials
• Institutions are to retain research data, provide secure data storage, identify ownership, and ensure security and confidentiality of research data
• Researchers are to retain research data and primary materials, manage storage of research data and primary materials, and maintain confidentiality of research data and primary materials
http://www.nhmrc.gov.au/publications/synopses/r39syn.htm
NCRIS Investments
$542M** over the five years: 2007-2011
• Evolving bio-molecular platforms and informatics
• Integrated biological systems
• Characterisation
• Fabrication
• Biotechnology products
• Optical and radio astronomy
• Integrated marine capability
• Structure and evolution of the Australian continent
• Networked biosecurity framework
• Population health and clinical data linkage
• Terrestrial ecosystem research network
+ Platforms for Collaboration (allocated $82M)
NCRIS Budget Breakdown
Platforms for Collaboration: Major Investments 2007-2011
• Compute: Capability Computing, Advanced models (NCI - $26M)
• Data: The Data Commons, Data Federations (ANDS - $24M)
• Access: Research connectivity, Seamless reach (AAF+AREN - $6M)
• Interoperation: Collaboration services, Research workflows (ARCS - $20M)
Blueprint
The ANDS Blueprint
• Towards the Australian Data Commons (TADC)
• Developed during 2007 by ANDS Technical Working Group
• Mapped out coherent vision of what needs to be done in the data space
• Available at http://www.pfc.org.au/bin/view/Main/Data
TADC: Why Data? Why Now?
• We are in an era of increasing data-intensive research
• Almost all data is now born digital
• Increasing amount of data generated (semi-)automatically
• “Consequently, increasing effort and therefore funding will necessarily be diverted to data and data management over time” (TADC, p. 4)
TADC: Need for standardisation
• Software and hardware keep getting cheaper, wetware keeps getting more expensive
• Fixing data management problems is enormously labour intensive and costly
• “Consequently, standardisation within forms of data and simplification in the frameworks around retention, storage, access and use of data, and the elimination of differences whose resolution requires labour, must be made, if the on-going keeping and reuse of data is to remain affordable” (TADC, p. 5)
TADC: Role of data federations
• With more data online, more can be done
• Possible now to answer questions unrelated to reasons why data was collected originally
• Increasing focus on cross-disciplinary science
• “Consequently greater clarity is needed over control and access to community-funded data, and the means of aggregating, federating and accessing such data are increasingly important” (TADC, p. 5)
Changing Data, Changing Research
• New scientific instruments
  – Large Hadron Collider at CERN will generate 1.5 gigabytes of data per second
  – the Square Kilometre Array (1 EB/day!)
• New scientific models
  – The mapping of the Human Genome: A billion DNA letters in a human sequence
  – Global climate models with ever finer resolution
• New knowledge from unlocked data
  – Hubble data has to be shared six months after collection
  – Majority of published research from Hubble telescope data was not “first use”
http://www.nature.com/news/specials/bigdata/ (was free for two weeks, now isn’t)
Goal
The ANDS Goal
• “to deliver greater access to Australia’s research data assets in forms that support easier and more effective data use and reuse” (TADC, p. 18)
• And to be a “voice for data” (RF, 24/9/08)
ANDS implementation assumptions
• ANDS doesn’t have enough money to fund storage
  – And so is predicated on institutionally-supported solutions
• Not all data shared by ANDS will be open
• ANDS aims to leverage existing activity, and coordinate/fund new activity
• ANDS will only start to build the Australian Research Data Commons
• ANDS governance and management arrangements are sized for the current funding
Realising the goal
• Develop user and owner frameworks for data commons
• Develop and operate national registries and discovery
• Seed the commons by connecting existing stores/federations
• Increase capabilities across sector in data mgt, integration
Structure
ANDS Delivery Structure
• ANDS has been structured as four inter-related and co-ordinated service delivery programs:
  – Developing Frameworks
  – Providing Utilities
  – Seeding the Commons
  – Building Capabilities
• Plus candidate service development activities funded through National eResearch Architecture Taskforce projects
Developing Frameworks (Monash)
• Influencing relevant national policies
• Building common understanding of data management issues and solutions across government, research funding agencies, and research intensive organizations
• Assisting OA by encouraging moves in favour of discipline-acceptable default data sharing practices
Providing Utilities (ANU)
• Building and delivering national technical services to support the data commons
• Initial services
  – Discovery
    – Both “you come to us” and “we come to you” flavours
    – Probably a two-step process for some collections
    – Includes surfacing of ISO2146 entities (next slide) for web harvesting
  – Persistent identifier minting and management
  – Collections registry to underpin discovery
• Plus Services Roadmap for later years
• Providing capability within ANDS for integration of existing systems into Australian Data Commons
• Assisting OA by improving discoverability, particularly across disciplines
ISO2146
Seeding the Commons (Monash)
• In targeted areas (because not enough resource to do everything), working to improve:
  – fabric for data management
  – amount of content
  – state of data capture and management
• Selection process to identify targets
• Plus, opportunistic content recruitment in first year
• Assisting OA by increasing the amount of content available, much of it (hopefully!) OA
Building Capabilities (ANU)
• Improving level of capability for research data management and research access to data
• Train-the-trainer model
• Two initial target populations
  – Early career researchers
  – Research support staff (IT, data management)
  – NOTE: Overlapping but different messages
• Building a community around data management concerns
• Assisting OA by advocating to researchers for changed practices
Progress
ANDS: From Project to Service
• Government asked Monash, ANU, CSIRO to set up ANDS
• Establishment Project has met all its deliverables
• DIISR has now signed contract for ANDS
• First (interim) Business Plan available at http://ands.org.au/andsinterimbusinessplan-final.pdf
  – This will run until June 2009
• Next Business Plan needs to be complete by March 2009 for consideration and approval
• ANDS will run until July 2011
Australian Strategic Roadmap Review
• Data Storage (p. 21)
  – National data-fabric, based on institutional nodes
• Shared Data (p. 22)
  – More ANDS
• Coordination Component (p. 23)
  – Integration of eResearch activities
• Expertise as an enabling infrastructure (p. 23)
http://www.innovation.gov.au/ScienceAndResearch/Documents/Strategic%20Roadmap%20Aug%202008.pdf
National Innovation System Review
• R7.10: A specific strategy for ensuring the scientific knowledge produced in Australia is placed in machine searchable repositories be developed and implemented using public funding agencies and universities as drivers
• R7.14: To the maximum extent practicable, information, research and content funded by Australian governments including national collections should be made freely available over the internet as part of the global public commons…
http://www.innovation.gov.au/innovationreview/Pages/home.aspx
Internationally
Wellcome Trust
• Policy on data management and sharing
• The Trust “wishes to ensure that the outputs of the research it funds, including research data, are managed and used in ways that maximise public benefit.”
• Benefits gained from research data “will be maximised when they are made widely available to the research community as soon as feasible, so that they can be verified, built upon and used to advance knowledge.”
• Trust “expects the researchers that it funds to maximise the availability of research data with as few restrictions as possible”
http://www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTX035043.htm
National Institutes of Health
• Final NIH Statement On Sharing Research Data (February 26, 2003)
• “Data sharing is essential for expedited translation of research results into knowledge, products, and procedures to improve human health”
• NIH “endorses the sharing of final research data to serve these and other important scientific goals”
• Investigators “are expected to include a plan for data sharing or state why data sharing is not possible”
http://grants.nih.gov/grants/guide/notice-files/NOT-OD-03-032.html
Acknowledgements
ANDS Project Management Committee
• Paul Bonnington, Monash
• Cathrine Harboe-Ree, Monash
• Alan McMeekin, Monash
• David Groenewegen, Monash
• Vic Elliott, ANU
• Adrian Burton, ANU
• Markus Buchhorn, ANU
• Alex Zelinsky, CSIRO
• David Toll, CSIRO
• Tracey Hind, CSIRO
• Clare McLaughlin/Jacqueline Cooke/Peter Nicholson, DIISR
• Rhys Francis, AeRIC
ANDS Organising Network
• Andrew Treloar, Monash
• David Groenewegen, Monash
• Adrian Burton, ANU
• Margaret Henty, ANU
• Chris Blackall, ANU
• Ross Wilkinson, CSIRO
• Tracey Hind, CSIRO
• John Morrissey, CSIRO
Senior Representatives
• Edwina Cornish, Monash
• Robin Stanton, ANU
• Alex Zelinsky, CSIRO