the research data alliance--creating the culture and technology for an international data...

Post on 22-Nov-2014

DESCRIPTION

RDA in theory and practice presented to Geoscience Australia and at an Australian National Data Service seminar

TRANSCRIPT

Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License

The Research Data Alliance: Creating the culture and technology for an international data infrastructure

Mark A. Parsons, Secretary General

Australian National Data Service, Melbourne, Australia, 24 October 2014

All of society’s grand challenges require diverse (often large) data to be shared and integrated across cultures, scales, and technologies.

Research Data Alliance

Vision: Researchers and innovators openly share data across technologies, disciplines, and countries to address the grand challenges of society.

Mission: RDA builds the social and technical bridges that enable open sharing of data.

Courtesy xkcd.com

Dynamics of Infrastructure

Edwards et al. 2007, Understanding Infrastructure: Dynamics, Tensions, and Design.

• Infrastructures become “ubiquitous, accessible, reliable, and transparent” as they mature.

• Systems → Networks → Inter-networks

• “system-building, characterized by the deliberate and successful design of technology-based services.”

• “technology transfer across domains and locations results in variations on the original design, as well as the emergence of competing systems.”

• Finally, “a process of consolidation characterized by gateways that allow dissimilar systems to be linked into networks.”

Not what, but When is infrastructure?

Not what, but When and Who is infrastructure?

Bridges and Gateways

Gateways are often wrongly understood as “technologies,” i.e. hardware or software alone. A more accurate approach conceives them as combining a technical solution with a social choice, i.e. a standard, both of which must be integrated into existing users’ communities of practice. Because of this, gateways rarely perform perfectly. — Edwards et al. 2007

"Data Deluge," Brett Ryder, The Economist, Feb. 2010

Data Blizzard? © Mindy Veissid | Mindy Veissid Photography.

Diverse snow crystal photos by Kenneth G. Libbrecht snowcrystals.com

The long tail of science (Heidorn 2008)

Distribution of NSF Awards by Dollar Value

© 2009 The Board of Trustees, University of Illinois

Surface-level diversity (race, age, gender) vs. deep-level diversity (values, conceptual metaphors, personality)

Ashby’s Law of Requisite Variety: Only variety absorbs variety

The value of a network increases as the square of the number of nodes.

Metcalfe’s Law
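As a rough illustration of the scaling claim above, the sketch below compares relative network value for a few node counts. The function names are hypothetical, and the proportionality constant is assumed to be 1, so only ratios between values are meaningful.

```python
def metcalfe_value(nodes: int) -> int:
    """Relative value of a network, assuming value is proportional to n squared."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * nodes


def possible_links(nodes: int) -> int:
    """Number of distinct pairwise links among the nodes: n * (n - 1) / 2."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * (nodes - 1) // 2


if __name__ == "__main__":
    # Compare how value and possible links grow as the network doubles in size.
    for n in (10, 20, 40):
        print(n, metcalfe_value(n), possible_links(n))
```

Under this assumption, doubling the number of nodes quadruples the relative value. The pairwise link count is included because it grows roughly as the square of the node count, which is the usual intuition offered for the law.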

Map of the internet by the Opte Project [CC-BY] via Wikimedia Commons

Networks or ecosystems often rely on “weak” links, so partner and build relationships. (See Barabási and Albert 1999, and others.)

Increasing Complexity of Mediation

From: C. Borgman, 2008, NSF Cyberlearning Report

Themes from A. Tsing on Collaboration

Friction: An Ethnography of Global Connection

• “Actual existing universalisms are hybrid, transient, and involved in constant reformulation through dialogue.” They work out through friction.

• “There is no reason to think collaborators have common goals.”

• Unity and diversity cover each other up. Need to remember the local.

Where Good Ideas Come From

• The Adjacent Possible: the importance of local

• Often not “Eureka!” but rather a slow hunch fading into view over time.

• Hunches need to collide with other hunches, so create that environment. Don’t protect IP; share it. Connecting vs. protecting.

• Sharing of failures as well.

• Create spaces for that to happen: virtual and real coffee shops.

• “Chance favors the connected mind.”

It’s all about Relationships (I’m an introvert)

• The central challenge is diversity.

• We address it through variety and myriad interfaces and connections.

• Fostering relationships is central to community and data science.

• they build social capital: success through giving

• they uncover tacit knowledge

• they inform methods

Data Science and Collaborative Methods

• User-driven design is not just end user. Engage providers and funders too.

• Case studies not just use cases.

• Ethnography—study relationships because data are often at the center of that interaction—a boundary object.

• Agile is not just for software (courtesy Bruce Caron).

• Individuals and interactions over processes and tools

• Working volunteers over comprehensive documentation

• Member collaboration over contract negotiation

• Responding to change over following a plan.

But what does this all have to do with RDA?

1. RDA focusses on developing “gateways”

2. RDA doesn’t do “architecture,” but it does provide a level of unity.

Deliverables that make data work

“Create - Adopt - Use”

• Adopted code, policy, specifications, standards, or practices that enable data sharing

• “Harvestable” efforts for which 12-18 months of work can eliminate a roadblock

• Efforts that have substantive applicability to groups within the data community but may not apply to all

• Efforts that can start today

RDA Principles

Openness

Consensus

Balance

Harmonization

Community Driven

Non-profit

RDA Organisational Framework

RDA Working Groups

1. Brokering Governance*

2. Data Citation WG

3. Data Description Registry Interoperability

4. Data Foundation and Terminology WG

5. Data Type Registries WG

6. Metadata Standards Directory Working Group

7. PID Information Types WG

8. Practical Policy WG

9. RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World*

10. RDA/WDS Publishing Data Bibliometrics WG

11. RDA/WDS Publishing Data Services WG

12. RDA/WDS Publishing Data Workflows WG

13. Repository Audit and Certification DSA–WDS Partnership WG

14. Standardisation of Data Categories and Codes WG

15. The BioSharing Registry: connecting data policies, standards & databases in life sciences*

16. Urban Quality of Life Indicators*

17. Wheat Data Interoperability WG

* in review

But what does this all have to do with RDA?

1. RDA focusses on developing “gateways”

2. RDA doesn’t do “architecture,” but it does provide a level of unity.

3. RDA plays both globally and locally—Think “glocal”.

Distribution of 2,353 Individual RDA Members in 96 Countries, 12 September 2014

Other 6%

Private 13%

Government 18%

Academia 63%

Map courtesy traveltip.org

Europe 50%

North America 36%

Austral-pacific 5%

Africa 3%

South America 1%

Asia 5%
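The percentages above can be turned into approximate head counts with simple arithmetic. The sketch below does so under two assumptions: that the first four labels describe member sector and the remaining six describe region (inferred from the labels, not stated on the slide), and that the published percentages are rounded, so the resulting counts are estimates only. The function and variable names are hypothetical.

```python
TOTAL_MEMBERS = 2353  # individual members reported on the slide

# Percentages as shown on the slide; the sector/region grouping is an assumption.
SECTOR_SHARE = {"Academia": 63, "Government": 18, "Private": 13, "Other": 6}
REGION_SHARE = {
    "Europe": 50,
    "North America": 36,
    "Austral-pacific": 5,
    "Asia": 5,
    "Africa": 3,
    "South America": 1,
}


def approximate_counts(total: int, shares_percent: dict) -> dict:
    """Estimate the number of members in each group from rounded percentages."""
    return {name: round(total * pct / 100) for name, pct in shares_percent.items()}


if __name__ == "__main__":
    for label, shares in (("sector", SECTOR_SHARE), ("region", REGION_SHARE)):
        print(label, approximate_counts(TOTAL_MEMBERS, shares))
```

Because the inputs are rounded to whole percentages, each estimate can be off by roughly a dozen members in either direction.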

Regional RDAs

• Australian National Data Service, RDA/United States, RDA/Europe,

• Implement RDA deliverables locally and enhance adoption.

• Ensure regional or national issues are addressed globally.

• Support plenaries and support attendance at plenaries.

But what does this all have to do with RDA?

1. RDA focusses on developing “gateways”

2. RDA doesn’t do “architecture,” but it does provide a level of unity.

3. RDA plays both globally and locally—Think glocal.

4. RDA fosters relationships, interfaces, and connections.

5. RDA provides a “neutral place” to identify and work through friction.

RDA Organisational Framework

RDA Interest Groups

1. Agricultural Data Interoperability IG

2. Big Data Analytics IG

3. Biodiversity Data Integration IG

4. Brokering IG

5. Community Capability Model IG

6. Data Fabric IG

7. Data for Development

8. Data in Context IG

9. Defining Urban Data Exchange for Science IG*

10. Development of cloud computing capacity and education in developing world research

11. Digital Practices in History and Ethnography IG

12. Domain Repositories Interest Group

13. Education and Training on handling of research data

14. ELIXIR Bridging Force IG*

15. Engagement IG

16. Federated Identity Management

17. Geospatial IG*

18. Libraries for Research Data*

19. Long tail of research data IG

20. Marine Data Harmonization IG

21. Metabolomics

22. Metadata IG

23. PID Interest Group

24. Preservation e-Infrastructure IG

25. RDA/CODATA Legal Interoperability IG

26. RDA/CODATA Materials Data, Infrastructure & Interoperability IG

27. RDA/WDS Certification of Digital Repositories IG

28. RDA/WDS Publishing Data Cost Recovery for Data Centres

29. RDA/WDS Publishing Data IG

30. Reproducibility IG*

31. Research data needs of the Photon and Neutron Science community

32. Research Data Provenance

33. Service Management IG

34. Structural Biology IG

35. Toxicogenomics Interoperability IG

* in review

Plenary 5, San Diego, California, 9-11 March 2015

©2013 Pecoff Studios Inc

RDA Organisational Framework

Fran Berman

• Council
§ Fran Berman (US), co-Chair
§ Patrick Cocquet (France)
§ Tony Hey (US)
§ Kaye Raseroka (Botswana)
§ Satoshi Sekiguchi (Japan)
§ Doris Wedlich (Germany)
§ Ross Wilkinson (Australia)
§ John Wood (UK), co-Chair

• Secretariat
§ Timea Biro
§ Hilary Hanahoe
§ Fotis Karayannis
§ Stefanie Kethers
§ Kathy Fontaine
§ Yolanda Meleco
§ Mark Parsons, Sec Gen
§ Herman Stehouwer

• Organisational Assembly
§ Juan Bicarregui, co-Chair
§ Walter Stewart, co-Chair

• Technical Advisory Board
§ Bridget Almas
§ Simon Cox
§ Liu Chuang
§ Peter Fox
§ Francoise Genova
§ Carole Palmer
§ Beth Plale, Chair
§ Susanna-Assunta Sansone
§ Jamie Shiers
§ Rainer Stotzka
§ Andrew Treloar, Chair
§ Peter Wittenburg

RDA Leadership

Organisational Partners—key linkages

• Organisations play an essential role as adopters!

• Organisational Assembly = Organisational Members and Affiliates.

• Organisational Advisory Board will represent Organisational Assembly to Council

• Organisational Members pay (modest) dues and have a special voice within RDA helping ensure RDA products stay relevant

Image courtesy anybots.com

Organisational Members and Affiliates

• Organisational Members
§ Alliance for Permanent Access
§ American University Library
§ Australian National Data Service
§ Barcelona Supercomputing Center - Centro Nacional de Supercomputación
§ Columbia University Library
§ CNRI
§ CSC
§ Digital Curation Center
§ EIROForum IT Working Group
§ eResearch Services and Scholarly Application Development Division of Information Services, Griffith University
§ European Data Infrastructure (EUDAT)
§ National Institute of Advanced Industrial Science and Technology (AIST), Japan
§ International Association of STM Publishers
§ Internet2
§ Microsoft Research
§ NZ eScience Infrastructure
§ Purdue University Libraries
§ Research Data Canada
§ Scholarly Publishing and Academic Resources Coalition (SPARC)
§ Washington University in St. Louis Libraries
§ Science and Technology Facilities Council

• Affiliates
§ CODATA
§ ICSU World Data System
§ ORCID
§ DataCite
§ CASRAI
§ Global Alliance for Genomics and Health

RDA Organisational Framework

• The group of government and non-profit science funding organisations that support the data and science communities to participate in RDA activities:

• Australian Government

• US Government (NSF and NIST)

• European Commission

• Allows agencies the opportunity to share funding program plans that support data exchange, interoperability, and data infrastructures across the globe, and thereby amplify their impact.

• Related to but distinct from RDA. A parallel organisation.

RDA Colloquium—RDAC

• A basic vocabulary of foundational terminology and query tool to make sure we know what we’re talking about.

• A data type model and registry (“MIME-types” for data) to help tools interpret, display, and process data.

• A persistent identifier type registry to help search engines understand what they are pointing to and retrieving.

• Coming soon:

• A basic set of machine actionable rules to enhance trust

• A metadata standards directory so we can describe similar things consistently

• A dynamic data citation methodology so we can reference precise subsets of changing data.

• Semantically linked terms describing wheat data so we can share harvest and related information around the world

Initial Products—adopt one today!
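To make the “MIME-types for data” idea above more concrete, here is a minimal sketch of how a tool might look up a registered data type and use it to interpret a payload. It is an illustration only, not the RDA Data Type Registries design or API: the class names, the identifier `example/temperature-series-csv`, and the parser are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DataType:
    identifier: str  # hypothetical type identifier, not a real registered ID
    description: str
    parser: Callable[[bytes], object]  # how a tool should interpret the bytes


class TypeRegistry:
    """In-memory stand-in for a shared registry of data type definitions."""

    def __init__(self) -> None:
        self._types: Dict[str, DataType] = {}

    def register(self, data_type: DataType) -> None:
        if data_type.identifier in self._types:
            raise ValueError(f"already registered: {data_type.identifier}")
        self._types[data_type.identifier] = data_type

    def interpret(self, identifier: str, payload: bytes) -> object:
        # Look up the type by identifier, then apply its parser to the payload.
        try:
            data_type = self._types[identifier]
        except KeyError:
            raise LookupError(f"unknown data type: {identifier}") from None
        return data_type.parser(payload)


def parse_csv_temperatures(payload: bytes) -> List[float]:
    """Parse a comma-separated list of numbers (an invented example format)."""
    return [float(v) for v in payload.decode("utf-8").split(",") if v.strip()]


registry = TypeRegistry()
registry.register(
    DataType(
        identifier="example/temperature-series-csv",
        description="Comma-separated temperature readings (illustrative only)",
        parser=parse_csv_temperatures,
    )
)

if __name__ == "__main__":
    print(registry.interpret("example/temperature-series-csv", b"1.5, 2.0,3"))
```

The point of the sketch is the indirection: a tool that receives only an identifier and some bytes can ask the registry how to handle them, much as a browser uses a MIME type to choose a handler.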

Get involved!

• Join RDA as an individual member supporting our principles at http://rd-alliance.org

• Join as an Organisational Member (nominal fee) or an Organisational Affiliate (jointly sponsored efforts).

• Initiate or join an Interest Group

• Propose or join a Working Group

• Attend the RDA Plenaries

Coming together is a beginning; keeping together is progress; working together is success.

—Henry Ford

Summary

• Infrastructure is created in phases with the final consolidation phase relying on gateways and bridges.

• Diversity is a central problem, but only diversity absorbs diversity.

• Networking and interconnection are the way to solve complex problems.

• We are in a more global and democratic world, but also a more local world. Coalition politics now involves new kinds of coalitions because there are new kinds of identity.

• Data science needs to focus on relationships, connections, interfaces.

• You must participate “glocally” to succeed.

• RDA provides mechanisms to address all of the above!

Info: enquiries@rd-alliance.org

@resdatall
