The Neuroscience Information Framework: Making Resources Discoverable for the Computational Neuroscience Community
Jeffrey S. Grethe, Ph.D.
Co-Principal Investigator, NIF
Center for Research in Biological Systems
University of California, San Diego
OCNS 2010 Workshop on Methods in Neuroinformatics
The Neuroscience Information Framework: Discovery and utilization of web-based resources for neuroscience
http://neuinfo.org UCSD, Yale, Cal Tech, George Mason, Washington Univ
A portal for finding and using neuroscience resources
A consistent framework for describing resources
Provides simultaneous search of multiple types of information, organized by category
Supported by an expansive ontology for neuroscience
Utilizes advanced technologies to search the “hidden web”
Brief History of NIF
• Outgrowth of Society for Neuroscience Neuroinformatics Committee
– Neuroscience Database Gateway: a catalog of neuroscience databases
• “Didn’t I fund this already?”
– Over 2500 databases are on-line; no one can go to them all
• “Why can’t I have a Google for neuroscience?”
– “Easy”, comprehensive, pervasive
• Phase I-II: Funded by a broad agency announcement from the NIH Neuroscience Blueprint
– Feasibility
• Current phase: Started Sept 2008
How can we provide a consistent and easy-to-implement framework for those who are providing resources, e.g., data, and those looking for these data and resources?
➤ Both humans and machines
The Problem
• Over 2000 databases have been identified through NIF
– Researchers can’t visit them all
– Most content from these resources is not easily found through standard search engines
– Even more structured content on the web
• Databases provide domain-specific views of data
– NIF provides a snapshot of information in a simple-to-understand form that can be further explored in the native database
– Providing a biomedical-science-based semantic framework for resource description and search
NIF uniquely provides access to the largest registry of neuroscience resources available on the web
Date                Data Federation  Data Federation Records  Catalog  Web Index  Literature Corpus  NIF Vocabulary
9/2008              5                60,420*                  388      113,458    67,000             18,884†
7/2009              18               4,393,744*               1,605    497,740    101,627            17,086
5/2010              55               23,228,658               2,871    1,184,261  All (PubMed)       53,023
% yearly increase   205              429                      79       138        -                  181
% overall increase  1,000            38,345                   640      944        -                  210
* Numbers for initial sources were generated by examining current source content
† First year of NIF contract involved re-factoring of ontology
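As a worked check on the growth figures, the overall-increase values for the federation, catalog, and web index columns appear consistent with the usual formula (last − first) / first × 100, taken between the 9/2008 and 5/2010 snapshots. The sketch below assumes that reading; the function name and dictionary keys are illustrative. The vocabulary column is left out because the ontology was re-factored during the first year.

```python
# Snapshot values copied from the 9/2008 and 5/2010 rows of the table.
FIRST = {
    "Data Federation": 5,
    "Data Federation Records": 60_420,
    "Catalog": 388,
    "Web Index": 113_458,
}
LAST = {
    "Data Federation": 55,
    "Data Federation Records": 23_228_658,
    "Catalog": 2_871,
    "Web Index": 1_184_261,
}


def percent_increase(start: int, end: int) -> int:
    """Return the growth from start to end as a whole-number percentage."""
    return round((end - start) / start * 100)


# Assumption: "overall increase" compares the first and last snapshots.
overall = {name: percent_increase(FIRST[name], LAST[name]) for name in FIRST}

for name, pct in overall.items():
    print(f"{name}: {pct:,}% overall increase")
```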
Guiding principles of NIF
• Builds heavily on existing technologies (open source tools and ontologies)
• Information resources come in many sizes and flavors
• Framework has to work with resources as they are, not as we wish them to be
– Federated system; resources will be independently maintained
– Developed for their own purpose with different levels of resources
• No single strategy will work for the current diversity of neuroscience resources
• Trying to design the framework so it will be as broadly applicable as possible to those who are trying to develop technologies
• Interface neuroscience to the broader life science community
• Take advantage of emerging conventions in search, semantic web, linked data and in building web communities
http://neuinfo.org
A Quick Tour of the NIF
Domain Enhanced Search for Neuroscience
NIF now searches more than 55 databases with information on neuronal descriptions, neuronal morphology, connectivity, chemical compounds…
Ontology Based Search Refinement
Diverse Database Content
NeuroMorpho.org
NeuronDB
Concept-based search
• Search Google: GABAergic neuron
• Search NIF: “GABAergic neuron”
– NIF automatically searches for types of GABAergic neurons
Types of GABAergic neurons
Concept-based search
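The idea of concept-based search, where a query for a concept also retrieves its subtypes, can be sketched as a walk over an "is-a" hierarchy. This is a minimal illustration under stated assumptions, not NIF's implementation: the `CHILDREN` table is a toy hierarchy made up for the example, and `expand_concept` and `build_query` are hypothetical helper names.

```python
from collections import deque

# Toy "is-a" hierarchy: concept -> direct subtypes.
# Illustrative entries only; this is not NIFSTD content.
CHILDREN = {
    "GABAergic neuron": ["Purkinje cell", "Basket cell"],
    "Basket cell": ["Cerebellar basket cell"],
}


def expand_concept(term, children):
    """Return the term followed by all of its descendants, breadth-first."""
    found = [term]
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in found:  # skip concepts reachable by two paths
                found.append(child)
                queue.append(child)
    return found


def build_query(term, children):
    """OR together the quoted term and every descendant concept."""
    return " OR ".join(f'"{t}"' for t in expand_concept(term, children))


print(build_query("GABAergic neuron", CHILDREN))
```

A plain string search would only match the literal phrase; the expanded query also matches records that mention a subtype without ever using the parent term.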
Use of Ontologies within NIF
• Controlled vocabulary for describing type of resource and content
– Database, Image, Parkinson’s disease
• Entity-mapping of database and data content
• Data integration across sources
• Search: Mixture of mapped content and string-based search
– Different parts of NIF use the vocabularies in different ways
– Utilize synonyms, parents, children to refine search
– Increasing use of other relationships and logical inferencing
• Generation of semantic content (i.e., RDF, Linked Data)
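To make "generation of semantic content" concrete, a vocabulary term with a label, synonyms, and a parent can be written out as RDF in N-Triples form. The sketch below is an assumption-laden illustration rather than NIF's export format: the `http://example.org/vocab/` namespace and the term identifiers are hypothetical, and the choice of `rdfs:label`, `rdfs:subClassOf`, and `skos:altLabel` as predicates is ours.

```python
# Standard W3C predicate IRIs.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
SKOS_ALT_LABEL = "http://www.w3.org/2004/02/skos/core#altLabel"

# Hypothetical namespace for the example; not a real NIF namespace.
BASE = "http://example.org/vocab/"


def literal(text):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def term_to_ntriples(term_id, label, parent_id=None, synonyms=()):
    """Return one N-Triples line per statement about the term."""
    subject = f"<{BASE}{term_id}>"
    lines = [f"{subject} <{RDFS_LABEL}> {literal(label)} ."]
    if parent_id is not None:
        lines.append(f"{subject} <{RDFS_SUBCLASS_OF}> <{BASE}{parent_id}> .")
    for synonym in synonyms:
        lines.append(f"{subject} <{SKOS_ALT_LABEL}> {literal(synonym)} .")
    return lines


# Hypothetical identifiers, used only to show the output shape.
for line in term_to_ntriples(
    "term_0002", "Purkinje cell", parent_id="term_0001", synonyms=["Purkinje neuron"]
):
    print(line)
```

Statements in this form can be loaded into a triple store and queried with SPARQL, which is what makes the parent and synonym relationships usable by machines as well as by the search interface.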
http://neurolex.org
Building the NIF Ontologies
Modular Ontologies
[Diagram: NIFSTD modules: NS Function, NS Dysfunction, Molecule, Macromolecule, Gene, Molecule Descriptors, Subcellular Anatomy, Macroscopic Anatomy, Cell, Organism, Quality, Investigation, Techniques, Reagent, Protocols, Instruments, Resource]
• Single inheritance trees with minimal cross domain and intradomain properties
• Orthogonal: Neuroscientists didn’t like too many choices
• Human readable definitions (not complete yet)
• Set of expanded vocabularies largely imported from existing terminological resources
• Adhere to ontology best practices as we understood them
• Built from existing resources when possible
• Standardized to same upper ontology: BFO
• Encoded in OWL DL
• Provides mapping to source terminologies
• Provides synonyms, lexical variants, abbreviations
[Diagram: terms from separate modules linked through “bridge files”. Modules: Anatomy, Cell Type, Cellular Component, Small Molecule. Terms shown: Neurotransmitter, Transmembrane Receptor, GABA, GABA-R, Transmitter Vesicle, Terminal Axon Bouton, Presynaptic Density, Purkinje Cell, Neuron, Dentate Nucleus Neuron, CNS, Collection of Deep Cerebellar Nuclei, Purkinje Cell Layer, Dentate Nucleus, Cytoarchitectural Part of Cerebellar Cortex. Relations shown: Expressed in, Located in]
NIF Cell
• NIF has made significant enhancements to its cell ontology
– Expanded neuron list
– Generated neuronal classifications based on neurotransmitter, brain region, molecules, morphology, circuit role
– Recommended standard naming convention
– Is working with the International Neuroinformatics Coordinating Facility through the PONS (Program on Ontologies of Neural Structures) program
• Creating knowledge base for neuronal classification based on properties
Neurolex Wiki
http://neurolex.org
• NIF has posted its vocabularies in Wiki form (Semantic MediaWiki)
• Simplified interface for ontology construction and refinement
• Custom forms for neurons and brain regions
• Semantic linking between category pages
• Significant knowledge base
• Curation NIFSTD
NeuroLex and NeuroML
“There was further discussion of how to define specific types of morphological groups such as apical dendrites, basal dendrites, axons, etc. Several options include having predefined names for common types or linking to ontologies that define these types. We suggest adding tags or rdf for metadata that provide NeuroLex ontology ids to groups. We propose to begin with simple tags, and when a tag is present, one should assume it indicates ‘is a’. If more complicated semantic information is needed, we can use rdf in a way that is similar to SBML.”
NeuroML Development Workshop 2010
http://www.neuroml.org/files/NeuroMLWorkshop2010.pdf
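The tagging convention proposed in the quote, where a bare ontology id on a group is read as "is a", can be sketched with a small data structure. This is only an illustration of the convention: `MorphologyGroup` and `is_a_statements` are hypothetical names, the structure is not the NeuroML schema, and the id shown is a placeholder rather than a real NeuroLex identifier.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MorphologyGroup:
    """A named group of segments carrying optional ontology id tags."""

    name: str
    ontology_tags: List[str] = field(default_factory=list)


def is_a_statements(group: MorphologyGroup) -> List[Tuple[str, str, str]]:
    """Read each bare tag as an 'is a' assertion about the group."""
    return [(group.name, "is a", tag) for tag in group.ontology_tags]


# Placeholder id for illustration; not a real NeuroLex identifier.
tuft = MorphologyGroup("apical_tuft", ["EXAMPLE_ID_apical_dendrite"])

for subject, relation, target in is_a_statements(tuft):
    print(subject, relation, target)
```

Relationships other than "is a" do not fit this scheme, which is the case the quote reserves for fuller RDF metadata.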
http://neuinfo.org
Providing community access
Access at various levels…
• A search portal (link to NIF advanced search interface) for researchers, students, or anyone looking for neuroscience information, tools, data or materials
• Access to content normally not indexed by search engines, i.e., the “hidden web”
• Tools for resource providers to make resources more discoverable, e.g., ontologies, data federation tools, vocabulary services
• Tools for promoting interoperability among databases
• Standards for data annotation
• The NIFSTD ontology covering the major domains of neuroscience, e.g., brain anatomy, cells, organisms, diseases, techniques
• Services for accessing the NIF vocabulary and NIF tools
• Best practices for creating discoverable and interoperable resources
• Data annotation services: NIF experts can enhance your resource through semantic tagging
• NIF cards: Easy links to neuroscience information from any web browser
• Ontology services: NIF knowledge engineers can help create or extend ontologies for neuroscience
http://wholebraincatalog.org
Integration of NIF services and ontologies
WBC and Simulation Visualization
Demonstrates the neurogenesis simulation driven by the model of Aimone et al., 2009 from the Gage lab at the Salk Institute within the Whole Brain Catalog
http://www.youtube.com/watch?v=1YzfXv4yNzg
WBC and NeuroConstruct
http://www.neuroml.org/tool_support.php
A network model of the cerebellar granule cell layer which can be fully expressed as a Level 3 NeuroML file. Visualised in the Whole Brain Catalog (left), and neuroConstruct (right)
http://wiki.wholebraincatalog.org/wiki/Running_Simulations
NIF cards
Simple tool for linking search results to other sources of information
NIF literature results display for “Cerebellum”; concepts in NIF ontologies highlighted and linked to more information through NIF knowledge base
http://nifcards.neuinfo.org/nifstd/anatomical_structure/birnlex_1489.html
Providing Semantic Content
RDF data / SPARQL Queries
The NIF Team
• Maryann Martone, UCSD, PI
• Jeff Grethe, UCSD, Co-PI
• Amarnath Gupta, UCSD, Co-PI
• Ashraf Memon, UCSD, Project Manager
• Anita Bandrowski, UCSD, NIF Curator
• Fahim Imam, UCSD, Ontology Engineer
• David Van Essen, Wash U, Co-PI
• Erin Reid, Wash U
• Gordon Shepherd, Yale, Co-PI
• Perry Miller, Yale
• Luis Marenco, Yale
• Rixin Wang, Yale
• Paul Sternberg, Cal Tech, Co-PI
• Hans-Michael Muller, Cal Tech
• Arun Ragarajan, Cal Tech
• Giorgio Ascoli, George Mason, Co-PI
• Sridevi Polavaram, George Mason
• Vadim Astakhov, UCSD
• Andrea Arnaud-Stagg, UCSD
• Lee Hornbrook, UCSD
• Jennifer Lawrence, UCSD
• Irfan Baig, UCSD student
• Anusha Yelisetty, UCSD student
• Timothy Tsui, UCSD student
• Chris Condit, UCSD
• Xufei Qian, UCSD
• Larry Liu, UCSD