Post on 30-May-2020

Data Integration Model for Re-use of Biological Data

Julie Sullivan

InterMine as a data sharing platform

1. How InterMine enables data sharing

2. A specific example of data sharing

Who is InterMine?

● Started in 2003

● Department of Genetics, University of Cambridge

● Gos Micklem’s Lab

What is InterMine?

● Data warehouse

● Open source

● Biological data

What is InterMine? cont’d

● Import data

○ Core data model

○ Sophisticated data integration system

● Webapp

○ Analysis tools

○ Advanced search

● API

InterMine

[Figure: diagram of data types: genes, proteins, function, phenotypes, protein domains, expression, interactions, disease, orthologues, GWAS, alleles, sequence variants, Gene Ontology, pathways, regulation]

How InterMine Enables Data Sharing and Re-use

● Makes data available (and useful) to public

● FAIR data principles*

○ Findable

■ Identifiers

■ Searchable

○ Accessible

■ Public API

■ Perl / Python ..., JSON / XML ...

○ Interoperable

■ Ontologies

○ Re-usable

■ Provenance

*Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3:160018 (2016). doi: 10.1038/sdata.2016.18
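The "Accessible" point (a public API returning JSON or XML) can be illustrated with a minimal sketch that only builds a request URL and makes no network call. Everything specific here is an assumption rather than something taken from the slides: the service root, the `/query/results` path with `query` and `format` parameters, and the `Gene.symbol` / `Gene.primaryIdentifier` paths are illustrative and should be checked against the web service documentation of the mine being queried.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET


def build_path_query(views, constraints, model="genomic"):
    """Serialise a simple query as XML: output columns plus constraints.

    views: list of dotted paths to return, e.g. "Gene.symbol" (assumed names).
    constraints: list of (path, operator, value) tuples.
    """
    query = ET.Element("query", {"model": model, "view": " ".join(views)})
    for path, op, value in constraints:
        ET.SubElement(query, "constraint", {"path": path, "op": op, "value": value})
    return ET.tostring(query, encoding="unicode")


def results_url(service_root, query_xml, fmt="json"):
    """Build a GET URL asking the service for the query results in `fmt`."""
    params = urlencode({"query": query_xml, "format": fmt})
    return service_root.rstrip("/") + "/query/results?" + params


# Assumed service root for YeastMine; not stated in the slides.
SERVICE_ROOT = "https://yeastmine.yeastgenome.org/yeastmine/service"

query_xml = build_path_query(
    views=["Gene.primaryIdentifier", "Gene.symbol"],
    constraints=[("Gene.symbol", "=", "ACT1")],
)
url = results_url(SERVICE_ROOT, query_xml)
print(url)
```

The same URL could be fetched from Perl, Python, or any other HTTP client, which is the sense in which a public API makes the data accessible regardless of client language.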

InterMine as a data sharing platform

1. How we (InterMine) enable data sharing

2. A specific example of data sharing

[Figure: normal cell and tumor cell, each labelled BRCA1/2 and PARP]

[Figure, built up over three slides: the same normal cell and tumor cell with a PARP inhibitor added]

YeastMine

1. http://yeastmine.yeastgenome.org

2. SGD

a. Saccharomyces Genome Database

b. Stanford

c. Mike Cherry’s lab

3. Yeast data

a. Interactions loaded from BioGRID, an interactions database

We will analyse 173 tumor suppressor genes from Vanderbilt's TSGene Database

[Figure, built up over four slides: Tumour Suppressor Genes → (homologues) → Equivalent Yeast Genes → (synthetic lethal) → Interactors (yeast) → (homologues) → Equivalent Human Genes; the Tumour Suppressor Genes and the Equivalent Human Genes are then linked as synthetic lethal (potential)]
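The chain in this diagram (tumour suppressor genes, their yeast homologues, the yeast synthetic lethal interactors, and those interactors' human homologues) can be sketched as three dictionary lookups. This is a toy illustration only: the gene names and mappings below are hypothetical placeholders, not data from YeastMine, BioGRID, or the TSGene Database, and the function name is invented for this sketch.

```python
def candidate_pairs(tumour_suppressors, human_to_yeast,
                    yeast_synthetic_lethal, yeast_to_human):
    """Return sorted (tumour suppressor, candidate partner) human gene pairs.

    Each pair is only a *potential* synthetic lethal interaction, inferred
    from an interaction observed between the yeast homologues.
    """
    pairs = set()
    for tsg in tumour_suppressors:
        # Step 1: human tumour suppressor -> equivalent yeast genes (homologues)
        for yeast_gene in human_to_yeast.get(tsg, ()):
            # Step 2: yeast gene -> synthetic lethal interactors (yeast)
            for interactor in yeast_synthetic_lethal.get(yeast_gene, ()):
                # Step 3: yeast interactor -> equivalent human genes (homologues)
                for partner in yeast_to_human.get(interactor, ()):
                    if partner != tsg:  # skip a gene paired with itself
                        pairs.add((tsg, partner))
    return sorted(pairs)


# Hypothetical placeholder data, for illustration only.
tumour_suppressors = ["HUMAN_A", "HUMAN_B"]
human_to_yeast = {"HUMAN_A": ["yeast_a1", "yeast_a2"], "HUMAN_B": ["yeast_b"]}
yeast_synthetic_lethal = {
    "yeast_a1": ["yeast_x"],
    "yeast_a2": ["yeast_x", "yeast_y"],
    "yeast_b": [],
}
yeast_to_human = {"yeast_x": ["HUMAN_X"], "yeast_y": ["HUMAN_A", "HUMAN_Y"]}

pairs = candidate_pairs(tumour_suppressors, human_to_yeast,
                        yeast_synthetic_lethal, yeast_to_human)
print(pairs)  # [('HUMAN_A', 'HUMAN_X'), ('HUMAN_A', 'HUMAN_Y')]
```

Collecting pairs in a set removes duplicates that arise when several yeast homologues lead to the same human partner; the output is a list of hypotheses to check against the literature, not confirmed interactions.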

Human ATR and POLD1 potentially share a synthetic lethal interaction.

There is evidence that ATR and POLD1 have a synthetic lethal interaction:

Hocke S, Guo Y, Job A, Orth M, Ziesch A, Lauber K, De Toni EN, Gress TM, Herbst A, Göke B, Gallmeier E. A synthetic lethal screen identifies ATR-inhibition as a novel therapeutic approach for POLD1-deficient cancers. Oncotarget. 2016 Feb 9;7(6):7080-95. doi: 10.18632/oncotarget.6857.

The Future - InterMine in the cloud

Planning for an “InterMine in the Cloud”

Upload a data file and deploy an InterMine for your data

Federated search tools