Biocuration 2014 - The Resource Identification Initiative
Melissa Haendel, OHSU Library
April 9, 2014
The Resource Identification Initiative:
What are we curating anyway?
@ontowonka #RII
Journal guidelines for methods are often poor and space is limited
“All companies from which materials were obtained should be listed.” - A well-known journal
Reproducibility depends, at a minimum, on using the same resources. But…
How identifiable are resources in the published literature?
An experiment in reproducibility
Domains: Immunology, Cell biology, Neuroscience, Developmental biology, General biology
Impact factors: High, Medium, Low
84 Journals
248 papers
707 antibodies
104 cell lines
258 constructs
210 knockdown reagents
437 model organisms
Reporting guidelines: Stringent, Satisfactory, Loose
Only ~50% of resources were identifiable (Vasilevsky et al., 2013, PeerJ)
There is no correlation between impact factor and resource identification
[Figure: fraction of resources identified (0.0 to 1.0) plotted against journal impact factor (0 to 40), with separate series for antibodies, cell lines, constructs, knockdown reagents, and organisms]
Resources are not more identifiable in journals with stricter reporting requirements
http://validation.scienceexchange.com/#/cancer-biology
Attempting to reproduce 50 prominent cancer studies
Even for some of the highest profile papers, we still have to go back to the authors to identify resources
Resources reported in the 50 Reproducibility Initiative studies show similar results
Vasilevsky et al., 2013, PeerJ; Reproducibility Initiative, http://shar.es/BSsab
Maybe labs are just disorganized?
Meet the Urban Lab
A+ organization!
The Urban lab antibodies
Of 9 antibodies published in 5 articles, only 44% were identifiable
[Figure: percent identifiable (0% to 100%) for four criteria: commercial Ab identifiable, catalog number reported, source organism reported, target uniquely identifiable]
Resource information is not adequately getting into the literature, EVEN
THOUGH IT IS READILY AVAILABLE
The problem is a lack of standards, review, and tools
http://www.force11.org/Resource_Identification_Initiative
Numerous endorsers: https://www.force11.org/RII/SignUp
Implementation of the new standard: http://biosharing.org/bsg-000532
Promoting use of Research Resource IDs (RRIDs) in the published literature
Antibodies
Software & Tools
Model Organisms
Pilot project runs from February – April
RRIDs should be:
• Machine readable
• Consistent across publishers and journals
• Free to generate and access
Resources:
Resource Identification Portal
Sample citation: Polyclonal rabbit anti-MAPK3 antibody, Abgent, Cat# AP7251E, RRID:AB_2140114
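To illustrate what "machine readable" could mean for a citation like the one above, here is a minimal Python sketch that pulls RRID-like tokens out of free text. The pattern is an assumption inferred only from the sample citation (a prefix, an underscore, then an accession); it is not an official RRID grammar, and the function name `find_rrids` is hypothetical.

```python
import re

# Assumed pattern, inferred only from the sample citation: "RRID:" followed by
# a letter prefix, an underscore, and an alphanumeric accession (e.g. AB_2140114).
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+)_([A-Za-z0-9]+)")


def find_rrids(text):
    """Return (prefix, accession) pairs for every RRID-like token in text."""
    return [(m.group(1), m.group(2)) for m in RRID_PATTERN.finditer(text)]


citation = (
    "Polyclonal rabbit anti-MAPK3 antibody, Abgent, "
    "Cat# AP7251E, RRID:AB_2140114"
)
print(find_rrids(citation))  # prints [('AB', '2140114')]
```

Because the identifier follows a fixed textual shape, a simple scan like this can work across journals and formats without access to the full text, which is the point of placing RRIDs outside the paywall.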
Publishing Workflow
1. Researcher submits a manuscript for publication
2. Editor or Publisher asks for inclusion of RRID
3. Author goes to Resource Identification Portal to locate RRID
4. RRID is included in Methods section and as Keyword
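Step 4 of the workflow could, in principle, be checked automatically at submission. The sketch below is one hypothetical way to do that in Python: it compares the RRIDs found in a Methods section with those listed as keywords and reports any mismatch. The function names, the input shapes, and the RRID pattern are all assumptions for illustration, not part of any publisher's system.

```python
import re

# Assumed RRID shape, inferred from the sample citation (e.g. RRID:AB_2140114).
RRID_TOKEN = re.compile(r"RRID:\s*[A-Za-z]+_[A-Za-z0-9]+")


def rrids_in(text):
    """Return the set of RRID tokens in text, with inner whitespace removed."""
    return {re.sub(r"\s+", "", m.group(0)) for m in RRID_TOKEN.finditer(text)}


def check_step_4(methods_text, keywords):
    """Report RRIDs that appear in only one of the two required places."""
    in_methods = rrids_in(methods_text)
    in_keywords = rrids_in(" ; ".join(keywords))
    return {
        "missing_from_keywords": sorted(in_methods - in_keywords),
        "missing_from_methods": sorted(in_keywords - in_methods),
    }


# Illustrative manuscript fragment built from the sample citation.
methods = "Sections were stained with anti-MAPK3 (Abgent, Cat# AP7251E, RRID:AB_2140114)."
report = check_step_4(methods, ["MAPK3", "antibody"])
print(report)
```

An empty list under both keys would mean the manuscript satisfies step 4 as stated; anything else could be sent back to the author before review.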
Potential outcomes
• Better reporting of materials and methods
• Making the literature machine readable outside the paywall
• Reduce the curation load
• Change how we utilize the literature: observations span journals and formats
• Credit and citation tracking for contributions of resources
• Better retrospective reanalysis of data
• Ability for authors to subscribe to resources and semantically similar entities
Be a beta tester!
http://www.antibodies-online.com/resource-identification-initiative/
Free gift!
Questions and issues
• Should the RRIDs be: DOIs? URIs? Nanopubs?
– Most sources do not create these
• Part of a new bibliographic type?
– For pilot, going into keywords to be outside paywall
• What is the best way to incorporate the RII Portal into the publishers' workflow?
– Provide text-mined checklist when author submits?
• How to best synchronize and coordinate nomenclature with authoritative sources aggregated in the RII Portal?
• How to utilize the RII Portal in dataset contribution to data repositories?
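As a rough illustration of the "text-mined checklist" idea raised above, the following Python sketch flags sentences that appear to mention a resource but carry no RRID. Everything here is a hypothetical heuristic: the cue words, the sentence splitter, and the RRID pattern are assumptions chosen for the example, and a real tool would need far better resource recognition.

```python
import re

# Assumed RRID shape, inferred from the sample citation (e.g. RRID:AB_2140114).
RRID_TOKEN = re.compile(r"RRID:\s*[A-Za-z]+_[A-Za-z0-9]+")

# Hypothetical cues that a sentence mentions a resource: the word "antibody"
# or "antibodies", or a catalog number marker such as "Cat#" or "catalog #".
RESOURCE_CUES = re.compile(r"\b(antibod(?:y|ies)|cat(?:alog)?\s*#)", re.IGNORECASE)


def build_checklist(methods_text):
    """Return sentences that mention a resource cue but contain no RRID."""
    sentences = re.split(r"(?<=[.!?])\s+", methods_text.strip())
    return [
        s for s in sentences
        if RESOURCE_CUES.search(s) and not RRID_TOKEN.search(s)
    ]


# Illustrative Methods text, not taken from any real manuscript.
methods = (
    "Cells were stained with a rabbit anti-MAPK3 antibody from Abgent. "
    "A second antibody (Cat# AP7251E, RRID:AB_2140114) was used as a control. "
    "Images were acquired on a confocal microscope."
)
for item in build_checklist(methods):
    print("Needs RRID:", item)
```

Only the first sentence is flagged here: it mentions an antibody without an identifier, while the second already carries an RRID and the third mentions no resource.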