gene association networks - large-scale integration of data and text

Post on 22-Jan-2017

Gene association networks

Large-scale integration of data and text

Lars Juhl Jensen

9.6 million genes

association network

guilt by association

genomic context

gene fusion

Korbel et al., Nature Biotechnology, 2004

phylogenetic profiles

Korbel et al., Nature Biotechnology, 2004
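
As a rough illustration of the phylogenetic-profile idea, the sketch below represents each gene as a presence/absence vector across genomes and scores gene pairs by how similar those vectors are. The gene names, the five genomes, and the choice of Jaccard similarity are illustrative assumptions, not the method of the cited paper.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) of each gene across five genomes.
profiles = {
    "geneA": (1, 1, 0, 1, 0),
    "geneB": (1, 1, 0, 1, 0),
    "geneC": (0, 1, 1, 0, 1),
}


def jaccard(p, q):
    """Fraction of genomes containing both genes among those containing either."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def profile_similarities(profiles):
    """Score every gene pair by the similarity of their phylogenetic profiles."""
    return {
        (g1, g2): jaccard(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


similarities = profile_similarities(profiles)
```

Genes with identical profiles (here geneA and geneB) get the highest score, which is the "guilt by association" signal this kind of genomic context evidence relies on.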

experimental data

gene coexpression

physical interactions

Jensen & Bork, Science, 2008

curated knowledge

protein complexes

pathways

Letunic & Bork, Trends in Biochemical Sciences, 2008

many databases

different formats

different identifiers

variable quality

not comparable

hard work

(Ph.D. students)

parsers

mapping files

quality scores

affinity purification

von Mering et al., Nucleic Acids Research, 2005

score calibration

gold standard

von Mering et al., Nucleic Acids Research, 2005
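
Calibrating raw quality scores against a gold standard could look roughly like the sketch below: pairs are binned by raw score, and each bin is mapped to the fraction of its pairs that the gold standard confirms. The binning approach, the function name `calibrate`, and all data are hypothetical simplifications; a fitted calibration curve would normally replace the per-bin fractions.

```python
from collections import defaultdict


def calibrate(scored_pairs, gold_positives, gold_genes, bin_width=1.0):
    """Map each raw-score bin to the fraction of pairs found in the gold standard."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for (a, b), raw in scored_pairs.items():
        # Only pairs where both genes are covered by the gold standard can be judged.
        if a not in gold_genes or b not in gold_genes:
            continue
        bin_id = int(raw // bin_width)
        totals[bin_id] += 1
        if frozenset((a, b)) in gold_positives:
            hits[bin_id] += 1
    return {bin_id: hits[bin_id] / totals[bin_id] for bin_id in totals}


# Hypothetical raw scores from one evidence source.
scored_pairs = {
    ("g1", "g2"): 2.5,
    ("g1", "g3"): 2.1,
    ("g2", "g4"): 2.2,
    ("g2", "g3"): 0.4,
    ("g3", "g4"): 0.9,
    ("g1", "g5"): 2.9,  # g5 is not in the gold standard, so this pair is skipped
}
gold_genes = {"g1", "g2", "g3", "g4"}
gold_positives = {frozenset(("g1", "g2")), frozenset(("g1", "g3"))}

calibrated = calibrate(scored_pairs, gold_positives, gold_genes)
```

Because every evidence source is mapped onto the same "fraction confirmed" scale, scores from different sources become comparable, and low-quality sources are down-weighted implicitly.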

implicit weighting by quality

common scale

cross-species transfer

Franceschini et al., Nucleic Acids Research, 2013

missing most of the data

>10 km

too much to read

text mining

comprehensive lexicon

cyclin dependent kinase 1

CDC2

orthographic variation

spaces and hyphens

cyclin dependent kinase 1

cyclin-dependent kinase 1

prefixes and suffixes

CDC2

hCdc2

“black list”

SDS
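
The lexicon matching steps above can be sketched as follows: names are normalized so that spaces and hyphens do not matter, a species prefix such as "h" is tolerated, and black-listed names are never matched. The identifiers (`GENE_0001`, `GENE_0002`), the case-insensitive matching, and the single "h" prefix are assumptions made for illustration only.

```python
import re

# Hypothetical lexicon: several names can point to the same gene identifier.
LEXICON = {
    "cyclin dependent kinase 1": "GENE_0001",
    "CDC2": "GENE_0001",
    "SDS": "GENE_0002",
}
# Names that are too ambiguous to match (SDS is also a common lab reagent).
BLACK_LIST = {"sds"}
# Prefixes that may be prepended to a gene name, e.g. "h" for human.
PREFIXES = ("h",)


def normalize(name):
    """Remove spaces and hyphens and lowercase the name."""
    return re.sub(r"[\s\-]+", "", name).lower()


# Index of normalized names, leaving out anything on the black list.
INDEX = {
    normalize(name): gene_id
    for name, gene_id in LEXICON.items()
    if normalize(name) not in BLACK_LIST
}


def lookup(mention):
    """Return the gene identifier for a text mention, or None if there is no match."""
    key = normalize(mention)
    if key in INDEX:
        return INDEX[key]
    for prefix in PREFIXES:
        if key.startswith(prefix) and key[len(prefix):] in INDEX:
            return INDEX[key[len(prefix):]]
    return None
```

With this sketch, `lookup("cyclin-dependent kinase 1")` and `lookup("hCdc2")` resolve to the same identifier, while `lookup("SDS")` returns `None`.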

co-mentioning

counting

within documents

within paragraphs

within sentences
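
Counting co-mentions at the three levels above might be sketched like this, assuming the text has already been tagged so that each sentence is a set of gene identifiers. The weights and the toy corpus are hypothetical; the only idea taken from the slides is that co-mentions are counted within documents, paragraphs, and sentences.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical weights: closer co-mentions count for more.
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 3.0}


def pairs(genes):
    """All unordered gene pairs, in a canonical (sorted) order."""
    return combinations(sorted(genes), 2)


def comention_scores(documents, weights=WEIGHTS):
    """Weighted co-mention count per gene pair.

    Each document is a list of paragraphs, each paragraph a list of
    sentences, and each sentence a set of tagged gene identifiers.
    """
    scores = defaultdict(float)
    for document in documents:
        document_genes = set()
        for paragraph in document:
            paragraph_genes = set()
            for sentence in paragraph:
                for pair in pairs(sentence):
                    scores[pair] += weights["sentence"]
                paragraph_genes |= sentence
            for pair in pairs(paragraph_genes):
                scores[pair] += weights["paragraph"]
            document_genes |= paragraph_genes
        for pair in pairs(document_genes):
            scores[pair] += weights["document"]
    return dict(scores)


# Toy corpus: two documents with tagged gene mentions.
documents = [
    [[{"A", "B"}, {"C"}], [{"A"}]],  # A and B share a sentence; C shares their paragraph
    [[{"A"}], [{"B"}]],              # A and B only share the document
]
scores = comention_scores(documents)
```

The resulting raw counts play the same role as the raw scores of any other evidence source and would still need calibration before they can be compared with them.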

quality scores

score calibration

cross-species transfer

combine all evidence
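
Once every evidence source is on a common probabilistic scale, one simple way to combine them is to treat the sources as independent and compute the probability that at least one of them is right. The sketch below shows that rule; the channel names and scores are hypothetical, and a production scheme may apply further corrections (for example, for the prior probability of an association).

```python
from math import prod


def combine_scores(channel_scores):
    """Combine per-channel probabilities as 1 - product(1 - s_i), assuming independence."""
    for channel, score in channel_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {channel!r} must be in [0, 1], got {score}")
    return 1.0 - prod(1.0 - score for score in channel_scores.values())


# Hypothetical calibrated scores for one gene pair.
evidence = {"coexpression": 0.5, "experiments": 0.6, "textmining": 0.5}
combined = combine_scores(evidence)
```

Several moderately confident sources thus add up to a combined score that is higher than any single one of them.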

Szklarczyk et al., Nucleic Acids Research, 2015

string-db.org

web resource

download files

REST API

Bioconductor package

Cytoscape App
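
For programmatic access, a request to the REST API can be assembled as in the sketch below. It assumes the URL layout `https://string-db.org/api/<format>/<method>` with `identifiers` and `species` parameters, as described in the STRING API documentation; the helper name `build_network_url` and the example genes are illustrative, and the current documentation should be checked before relying on any of this.

```python
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"  # assumed base URL


def build_network_url(identifiers, species, output_format="tsv"):
    """Build a URL asking for the network among the given proteins."""
    params = {
        # Multiple identifiers are separated by a carriage return (encoded as %0D).
        "identifiers": "\r".join(identifiers),
        # NCBI taxonomy identifier, e.g. 9606 for human.
        "species": species,
    }
    return f"{STRING_API}/{output_format}/network?{urlencode(params)}"


url = build_network_url(["CDK1", "CCNB1"], species=9606)
```

The resulting URL can be passed to any HTTP client, for example `urllib.request.urlopen(url)`, to retrieve the tab-separated network.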

Acknowledgments

Damian Szklarczyk

Michael Kuhn

Andrea Franceschini

Milan Simonovic

Alexander Roth

Sune Pletscher-Frankild

John “Scooter” Morris

Christian von Mering

Peer Bork

Unacknowledgments

Do yourself a favor, don’t fly
