text and data mining

Post on 10-May-2015

Text and data mining

Lars Juhl Jensen

Part 1: text mining

exponential growth

some things are constant

~45 seconds per paper

computer

as smart as a dog

teach it specific tricks

named entity identification

Reflect

Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009

comprehensive lexicon

orthographic variation

“black list”
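
The three ingredients above (a comprehensive lexicon, tolerance for orthographic variation, and a black list of ambiguous names) can be combined in a simple dictionary-based tagger. The following is a minimal sketch, not the Reflect implementation: the lexicon entries, identifiers, black-listed name, and function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon: name -> identifier (not real database content).
LEXICON = {
    "cdc2": "GENE:0001",
    "cyclin b": "GENE:0002",
    "sda": "GENE:0003",
}

# Hypothetical black list: names assumed too ambiguous to tag.
BLACK_LIST = {"sda"}


def normalize(name):
    """Collapse orthographic variation: case, hyphens, and spaces."""
    return re.sub(r"[\s\-]+", "", name.lower())


def build_index(lexicon, black_list):
    """Map normalized names to identifiers, skipping black-listed names."""
    blocked = {normalize(n) for n in black_list}
    return {
        normalize(name): ident
        for name, ident in lexicon.items()
        if normalize(name) not in blocked
    }


def tag(text, index, max_words=3):
    """Return (matched text, identifier) pairs, preferring longer matches."""
    words = re.findall(r"[A-Za-z0-9\-]+", text)
    hits = []
    i = 0
    while i < len(words):
        # Try the longest window first, then shrink it.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            ident = index.get(normalize(candidate))
            if ident is not None:
                hits.append((candidate, ident))
                i += n
                break
        else:
            i += 1
    return hits


index = build_index(LEXICON, BLACK_LIST)
hits = tag("CDC-2 binds Cyclin-B but SDA is ignored", index)
```

Here "CDC-2" and "Cyclin-B" match their lexicon entries despite the different spelling, while "SDA" is suppressed by the black list.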

information extraction

no access

collaboration

Part 2: protein networks

guilt by association

STRING

Szklarczyk, Franceschini et al., Nucleic Acids Research, 2011

genomic context

gene fusion

Korbel et al., Nature Biotechnology, 2004

experimental data

physical interactions

Jensen & Bork, Science, 2008

gene coexpression

genetic interactions

Beyer et al., Nature Reviews Genetics, 2007

curated knowledge

pathways

Letunic & Bork, Trends in Biochemical Sciences, 2008

text mining

many data types

many databases

different formats

different identifiers

variable quality

quality scores

calibrate vs. gold standard

von Mering et al., Nucleic Acids Research, 2005
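
One way to picture calibration against a gold standard: bin the raw scores from an evidence source and, for each bin, measure what fraction of the scored pairs also appear in the gold standard. The sketch below assumes a simple fixed-bin scheme; the pairs, scores, bin edges, and the `calibrate` function are hypothetical, and the actual STRING scoring procedure may differ.

```python
import bisect


def calibrate(scored_pairs, gold_positive, bin_edges):
    """Per raw-score bin, return the fraction of pairs in the gold standard.

    scored_pairs:  list of (pair, raw_score)
    gold_positive: set of pairs regarded as true associations
    bin_edges:     ascending lower bounds of the bins
    """
    totals = [0] * len(bin_edges)
    hits = [0] * len(bin_edges)
    for pair, score in scored_pairs:
        # Index of the last bin whose lower bound is <= score.
        idx = bisect.bisect_right(bin_edges, score) - 1
        if idx < 0:
            continue  # score lies below the first bin
        totals[idx] += 1
        if pair in gold_positive:
            hits[idx] += 1
    # Empty bins have no estimate.
    return [h / t if t else None for h, t in zip(hits, totals)]


# Hypothetical toy data: unordered protein pairs with raw scores.
scored = [
    (frozenset({"A", "B"}), 0.9),
    (frozenset({"A", "C"}), 0.8),
    (frozenset({"B", "C"}), 0.2),
    (frozenset({"C", "D"}), 0.1),
]
gold = {frozenset({"A", "B"})}
precision_per_bin = calibrate(scored, gold, [0.0, 0.5])
```

Because every evidence type is mapped onto the same scale (fraction confirmed by the gold standard), scores from different sources become comparable.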

orthology transfer

Frishman et al., Modern Genome Annotation, 2009

Part 3: drug networks

new uses for old drugs

shared target(s)

chemical similarity

Campillos & Kuhn et al., Science, 2008
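
Chemical similarity between two drugs is commonly quantified as the Tanimoto coefficient of their structural fingerprints. The slides do not name the measure, so treating it as Tanimoto is an assumption here, and the fingerprints below are hypothetical sets of feature indices rather than real molecules.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto coefficient: shared features divided by all features."""
    a, b = set(fingerprint_a), set(fingerprint_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(a & b) / len(union)


# Hypothetical fingerprints (indices of structural features that are present).
drug_x = {1, 2, 3}
drug_y = {2, 3, 4}
similarity = tanimoto(drug_x, drug_y)  # 2 shared features out of 4 in total
```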

similar drugs share targets

Campillos & Kuhn et al., Science, 2008

only trivial predictions

phenotypic similarity

chemical perturbations

phenotypic readouts

drug treatment

side effects

no database

package inserts

Campillos & Kuhn et al., Science, 2008

text mining

manual validation

side-effect correlations

Campillos & Kuhn et al., Science, 2008

side-effect frequencies

Campillos & Kuhn et al., Science, 2008

side-effect similarity

chemical similarity

Campillos & Kuhn et al., Science, 2008
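
A side-effect similarity can be sketched as a weighted overlap between two drugs' side-effect sets, where rare side effects count more than common ones. This is an illustrative assumption, not the published scoring scheme: the drug names, side-effect lists, inverse-frequency weighting, and function names are all hypothetical.

```python
import math


def rarity_weights(drug_effects):
    """Weight each side effect by -log(fraction of drugs that show it)."""
    n = len(drug_effects)
    counts = {}
    for effects in drug_effects.values():
        for effect in set(effects):
            counts[effect] = counts.get(effect, 0) + 1
    return {effect: -math.log(c / n) for effect, c in counts.items()}


def side_effect_similarity(effects_a, effects_b, weights):
    """Weighted Jaccard index over two side-effect sets."""
    a, b = set(effects_a), set(effects_b)
    shared = sum(weights[e] for e in a & b)
    total = sum(weights[e] for e in a | b)
    return shared / total if total > 0 else 0.0


# Hypothetical toy data, not taken from any package insert.
drugs = {
    "drug_a": {"nausea", "rash"},
    "drug_b": {"nausea", "rash"},
    "drug_c": {"nausea", "tremor"},
    "drug_d": {"nausea"},
}
weights = rarity_weights(drugs)
sim_ab = side_effect_similarity(drugs["drug_a"], drugs["drug_b"], weights)
sim_ac = side_effect_similarity(drugs["drug_a"], drugs["drug_c"], weights)
```

In this toy data, nausea occurs for every drug and therefore gets zero weight, so `drug_a` and `drug_c` score 0 despite sharing it, while `drug_a` and `drug_b` score 1 because they also share the rarer rash.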

categorization

Campillos & Kuhn et al., Science, 2008

20 drug–drug pairs

in vitro binding assays

Ki < 10 µM for 11 of 20

cell assays

9 of 9 showed activity

Acknowledgments

reflect.ws

Sune Frankild

Heiko Horn

Evangelos Pafilis

Michael Kuhn

Reinhardt Schneider

Sean O’Donoghue

sideeffects.embl.de

Monica Campillos

Michael Kuhn

Anne-Claude Gavin

Peer Bork

string-db.org

Damian Szklarczyk

Andrea Franceschini

Michael Kuhn

Milan Simonovic

Alexander Roth

Pablo Minguez

Tobias Doerks

Manuel Stark

Jean Muller

Peer Bork

Christian von Mering

larsjuhljensen
