how open is open? an evaluation rubric for public knowledgebases

HOW OPEN IS OPEN? AN EVALUATION RUBRIC FOR PUBLIC KNOWLEDGEBASES MELISSA HAENDEL MARCH 28TH, 2017 @ontowonka

Upload: mhaendel

Post on 22-Jan-2018


TRANSCRIPT

Page 1: How open is open?  An evaluation rubric for public knowledgebases

HOW OPEN IS OPEN?

AN EVALUATION RUBRIC FOR PUBLIC KNOWLEDGEBASES

MELISSA HAENDEL

MARCH 28TH, 2017

@ontowonka

Page 2

THERE ARE OVER 1500 PUBLIC DATABASES IN THE NUCLEIC ACIDS RESEARCH DATABASE COLLECTION

https://doi.org/10.1093/nar/gkw1188

Page 3

HOW MANY OF THESE ARE TRULY OPEN?

OPENNESS IS AN NAR REQUIREMENT, BUT …

Page 4

WHY ARE WE STILL FAILING?

Page 5

OPEN DATA IS FAIR DATA

http://www.nature.com/articles/sdata201618

Findable Accessible Interoperable Reusable

Page 6

ANATOMY OF FAIR: FINDABLE

persistent identifier

rich metadata

registered or indexed in a searchable resource

McMurry et al., Identifiers for the 21st century

bit.ly/identifiers-2017

Page 7

ANATOMY OF FAIR: ACCESSIBLE

(Meta)data are openly retrievable by their identifier using a standardized communications protocol

Metadata are accessible, even when the data are no longer available

http://api.monarchinitiative.org/api/
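
To make "retrievable by their identifier using a standardized communications protocol" concrete, here is a minimal Python sketch that expands a compact identifier into an HTTP URL. The function name `expand_curie` and the `PREFIX_MAP` table are assumptions for illustration; only the `https://doi.org/` resolver and the example DOI come from the slides. Retrieval would then be an ordinary HTTP GET on the resulting URL.

```python
# Hypothetical prefix map for illustration; a real resource would publish
# its own list of prefixes and resolver URLs.
PREFIX_MAP = {"DOI": "https://doi.org/"}


def expand_curie(curie, prefix_map=PREFIX_MAP):
    """Expand a 'PREFIX:local_id' identifier into a resolvable URL."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id:
        raise ValueError(f"not a prefix:local_id identifier: {curie!r}")
    if prefix not in prefix_map:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefix_map[prefix] + local_id


# The NAR database collection paper cited on Page 2.
url = expand_curie("DOI:10.1093/nar/gkw1188")
print(url)
```

The point of the sketch is that the identifier alone, plus a public prefix table, is enough to locate the record. No resource-specific client code is needed.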

Page 8

ANATOMY OF FAIR: INTEROPERABLE

Use a formal, accessible, shared, and broadly applicable language for knowledge representation

Define semantics of all relationships, including cross references (hint: use the Relations Ontology!)
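
One way to read "define semantics of all relationships, including cross references" is that every link should carry an explicit predicate rather than being a bare "xref". The Python sketch below is only an illustration: the `CrossReference` type, the `untyped` helper, and the `EX:` / `OTHER:` identifiers are hypothetical. `skos:exactMatch` is a real SKOS mapping property; predicates for biological relationships would come from a vocabulary such as the Relations Ontology.

```python
from typing import NamedTuple, Optional


class CrossReference(NamedTuple):
    subject: str
    predicate: Optional[str]  # None models a bare, untyped cross reference
    obj: str


def untyped(xrefs):
    """Return the cross references whose relationship is not stated."""
    return [x for x in xrefs if not x.predicate]


# Hypothetical identifiers, for illustration only.
xrefs = [
    CrossReference("EX:0001", "skos:exactMatch", "OTHER:42"),
    CrossReference("EX:0002", None, "OTHER:77"),
]
print(untyped(xrefs))
```

A consumer can safely merge the first pair of records. For the second it cannot tell whether the two identifiers mean the same thing, something broader, or something merely related.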

Page 9

ANATOMY OF FAIR: INTEROPERABLE

Picking on the Personal Genome Project (thanks Sasha!)

Do you have a severe genetic disease or rare genetic trait? If so, you can add a description for your public profile.

1. Extreme susceptibility to motion sickness. - answers pertain to this trait

2. Pyloric stenosis

3. Unusually small feet for my height

Page 10

ANATOMY OF FAIR: REUSABLE

(Meta)data are described with a plurality of accurate and relevant attributes

Detailed provenance and use of community standards

www.obofoundry.org

https://www.w3.org/TR/hcls-dataset/

https://peerj.com/articles/2331.pdf

Page 11

A RUBRIC FOR EVALUATION

bit.ly/eval-rfi

Page 12

Findable Accessible Interoperable Reusable

FAIR-TLC

Traceable Licensed Connected
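
The seven FAIR-TLC criteria can be treated as a simple checklist. The Python sketch below is an assumption-laden illustration, not the rubric itself: the class name `FairTlcAssessment`, the yes/no scoring, and the example values are all hypothetical, and the actual rubric (bit.ly/eval-rfi) may grade each criterion differently.

```python
from dataclasses import dataclass, fields


@dataclass
class FairTlcAssessment:
    # One flag per criterion named on the slide; pass/fail is a simplification.
    findable: bool
    accessible: bool
    interoperable: bool
    reusable: bool
    traceable: bool
    licensed: bool
    connected: bool

    def unmet(self):
        """Names of the criteria that are not satisfied."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def score(self):
        """Number of criteria satisfied, out of seven."""
        return len(fields(self)) - len(self.unmet())


# Made-up assessment of an imaginary resource.
example = FairTlcAssessment(True, True, False, True, False, False, True)
print(example.score(), example.unmet())
```

Listing the unmet criteria alongside the score keeps the assessment actionable: a resource owner sees what to fix, not just a number.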

Page 13

FAIR-TLC: TRACEABILITY

Provenance is documented and attributed

Contributions to the content (data, tools, algorithms, sources, etc.) are declared

Documentation on how to cite a record from a source or the whole resource

Page 14

FAIR-TLC: LICENSURE

http://peterdesmet.com/posts/analyzing-gbif-data-licenses.html

Not all data resources are free to use, derive, and redistribute, even if they are publicly funded and seemingly publicly available.

Page 15

FAIR-TLC: LICENSURE

http://peterdesmet.com/posts/analyzing-gbif-data-licenses.html

Standard license: 171

Non-standard license: 1069

No license: 10734
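
To put the three counts on this slide (171, 1069, and 10734) in proportion, a short Python calculation of each category's share of the total follows. The variable names are illustrative. The counts are copied from the slide.

```python
# Counts as shown on the slide.
counts = {
    "standard license": 171,
    "non-standard license": 1069,
    "no license": 10734,
}

total = sum(counts.values())
# Share of each category as a percentage, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {total} ({pct}%)")
```

Of the 11974 entries counted, roughly 1.4% carry a standard license, 8.9% a non-standard one, and 89.6% none at all.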

Page 16

NON-STANDARD LICENSES BURDEN SCIENCE

bit.ly/reusabledata-forum

Page 17

FAIR-TLC: CONNECTED BECAUSE AGGREGATED != INTEGRATED

Page 18

FAIR-TLC: CONNECTED BECAUSE AGGREGATED != INTEGRATED

192K datasets … probably more than 38 are relevant to diabetes

Page 19

FAIR-TLC: CONNECTED BECAUSE AGGREGATED != INTEGRATED

Similarly, clouds do not integrate data.

http://stonebond.com/wp-content/uploads/2015/05/cloud-data-bullet-points-img.jpg

Page 20

EVALUATING THE OPEN SCIENCE CANDIDATES

Room for improvement

bit.ly/open-science-prize

Open imaging

Page 21

DISCUSSION: HOW DO WE DO BETTER?

Make the right thing the easy thing:

- Carrots:
  - Tenure & promotion cycles
  - Dedicated funding for increasing FAIR-TLC

- Sticks:
  - Publication requirements
  - Funding requirements

- Tools:
  - Tracking tools
  - Documentation tools
  - Social tools

Page 22

ARE JOURNAL DATA SHARING POLICIES HITTING THE MARK?

Vasilevsky et al.

https://doi.org/10.7287/peerj.preprints.2588v1

Page 23

TOO TINY A STICK?

Vasilevsky et al.

https://doi.org/10.7287/peerj.preprints.2588v1

Page 24

REUSABLEDATA.ORG

Curate, evaluate, and provide guidance on legal and effective data reuse and redistribution

Wanna help? Join the Google group at: bit.ly/reusabledata-forum

Seth Carbon

Page 25

THANKS TO:

JULIE MCMURRY

ANDREW SU

SETH CARBON