Turning big data and text collections into web resources

Lars Juhl Jensen

Post on 10-May-2015

three parts

data integration

text mining

interface design

data integration

association networks

guilt by association
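
Guilt by association can be illustrated with a minimal sketch: an uncharacterized protein is assigned the annotation that is most common among its neighbors in the association network. The toy network, the protein names, the annotations, and the function names below are all hypothetical and are not taken from STRING.

```python
from collections import Counter

# Hypothetical toy association network (undirected edges).
EDGES = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
# Hypothetical known annotations; P1 is the uncharacterized protein.
ANNOTATIONS = {"P2": "cell cycle", "P3": "cell cycle", "P4": "metabolism"}


def neighbors(protein, edges):
    """Return the set of proteins directly linked to `protein`."""
    result = set()
    for a, b in edges:
        if a == protein:
            result.add(b)
        elif b == protein:
            result.add(a)
    return result


def predict_function(protein, edges, annotations):
    """Majority vote over annotated neighbors; None if there are none."""
    votes = Counter(
        annotations[n] for n in neighbors(protein, edges) if n in annotations
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


print(predict_function("P1", EDGES, ANNOTATIONS))
```

A real resource would weight each neighbor by the confidence of the association rather than counting all edges equally.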

STRING

Szklarczyk, Franceschini et al., Nucleic Acids Research, 2011

computational predictions

gene fusion

Korbel et al., Nature Biotechnology, 2004

experimental data

physical interactions

Jensen & Bork, Science, 2008

curated knowledge

metabolic pathways

Letunic & Bork, Trends in Biochemical Sciences, 2008

many databases

different formats

different identifiers

variable quality

not comparable

hard work
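
Part of that hard work is bringing records from different formats and identifier schemes into one namespace. The sketch below shows the idea under loud assumptions: the alias table, the two source layouts, and all identifiers other than the gene symbols are hypothetical and do not describe any specific database.

```python
# Hypothetical alias table mapping source-specific identifiers to one shared identifier.
ALIASES = {
    "CDC2": "CDK1",
    "cdk1_human": "CDK1",
    "CCNB1": "CCNB1",
    "ccnb1_human": "CCNB1",
}

# Two hypothetical sources with different record layouts and identifier schemes.
SOURCE_A = [{"gene1": "CDC2", "gene2": "CCNB1"}]
SOURCE_B = ["cdk1_human\tccnb1_human", "xyz_human\tccnb1_human"]


def normalize_pair(id1, id2, aliases):
    """Map both identifiers to the shared namespace; return a sorted pair or None."""
    try:
        return tuple(sorted((aliases[id1], aliases[id2])))
    except KeyError:
        return None  # unmappable identifiers are dropped rather than guessed


def integrate(source_a, source_b, aliases):
    """Merge both sources into {pair: set of sources supporting the pair}."""
    merged = {}
    for record in source_a:
        pair = normalize_pair(record["gene1"], record["gene2"], aliases)
        if pair is not None:
            merged.setdefault(pair, set()).add("A")
    for line in source_b:
        id1, id2 = line.rstrip("\n").split("\t")
        pair = normalize_pair(id1, id2, aliases)
        if pair is not None:
            merged.setdefault(pair, set()).add("B")
    return merged


print(integrate(SOURCE_A, SOURCE_B, ALIASES))
```

Sorting each pair makes the interaction undirected, so the same association reported in either order by two sources ends up under one key.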

quality scores

von Mering et al., Nucleic Acids Research, 2005

calibrate vs. gold standard
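
One simple way to calibrate raw scores against a gold standard is to bin them and measure, per bin, the fraction of pairs that the gold standard confirms. The slides do not state the exact procedure, so the sketch below is an assumption; the pairs, scores, bin edges, and the function name `calibrate` are hypothetical.

```python
import bisect


def calibrate(scored_pairs, gold_standard, bin_edges):
    """Return, per score bin, the fraction of pairs found in the gold standard.

    scored_pairs: list of ((protein_a, protein_b), raw_score)
    gold_standard: set of sorted (protein_a, protein_b) tuples
    bin_edges: ascending lower bounds of the score bins
    """
    tallies = [[0, 0] for _ in bin_edges]  # [hits, total] per bin
    for pair, raw_score in scored_pairs:
        idx = bisect.bisect_right(bin_edges, raw_score) - 1
        if idx < 0:
            continue  # score lies below the lowest bin
        tallies[idx][1] += 1
        if tuple(sorted(pair)) in gold_standard:
            tallies[idx][0] += 1
    # Empty bins have no estimate, so they are reported as None.
    return [hits / total if total else None for hits, total in tallies]


# Hypothetical raw scores from one evidence source and a hypothetical gold standard.
SCORED = [
    (("A", "B"), 0.9),
    (("A", "C"), 0.8),
    (("B", "C"), 0.3),
    (("C", "D"), 0.2),
    (("A", "D"), 0.1),
]
GOLD = {("A", "B"), ("A", "C"), ("B", "C")}
BIN_EDGES = [0.0, 0.5]

for edge, fraction in zip(BIN_EDGES, calibrate(SCORED, GOLD, BIN_EDGES)):
    print(f"raw score >= {edge}: fraction confirmed = {fraction}")
```

Because every evidence type is mapped onto the same scale (the fraction confirmed by the gold standard), scores from otherwise incomparable sources become comparable. In practice only pairs the gold standard can judge should be counted, and a smooth curve would be fitted through the bins.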

missing most of the data

text mining

>10 km

too much to read

computer

as smart as a dog

teach it specific tricks

named entity recognition

comprehensive lexicon

cyclin dependent kinase 1

CDC2

expansion rules

flexible matching

cyclin dependent kinase 1

cyclin-dependent kinase 1

CDC2

hCdc2

“black list”

SDS
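
The pieces named above (lexicon, expansion rules, flexible matching, black list) can be combined into a small dictionary-based tagger. This is a minimal sketch and not the tagger behind the resources in this talk: the lexicon entries, the "h" prefix expansion rule, the normalization, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical lexicon mapping names to entity identifiers.
LEXICON = {"cyclin dependent kinase 1": "CDK1", "CDC2": "CDK1", "SDS": "SDS"}
# Names that are too ambiguous to tag, e.g. SDS as in SDS gels.
BLACKLIST = {"sds"}


def normalize(name):
    """Flexible matching: ignore case and treat hyphens like spaces."""
    return re.sub(r"[-\s]+", " ", name.lower()).strip()


def build_index(lexicon, blacklist):
    """Build a lookup table of normalized names, skipping black-listed ones."""
    index = {}
    for name, entity in lexicon.items():
        key = normalize(name)
        if key in blacklist:
            continue
        index[key] = entity
        # Assumed expansion rule: single-word names may carry an "h" prefix (human).
        if " " not in key:
            index["h" + key] = entity
    return index


def tag(text, index, max_words=4):
    """Greedy longest-match tagging over runs of up to `max_words` words."""
    words = re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)
    matches = []
    i = 0
    while i < len(words):
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            entity = index.get(normalize(candidate))
            if entity is not None:
                matches.append((candidate, entity))
                i += n
                break
        else:
            i += 1  # no name starts at this word
    return matches


INDEX = build_index(LEXICON, BLACKLIST)
print(tag("hCdc2, also called cyclin-dependent kinase 1, was run on an SDS gel.", INDEX))
```

In this sketch "hCdc2" and "cyclin-dependent kinase 1" both resolve to the same entity, while "SDS" is skipped because of the black list.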

proteins

small molecules

compartments

tissues

diseases

information extraction

count co-mentioning

within documents

within paragraphs

within sentences
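
Counting co-mentions at the three levels above can be sketched as follows. The entity list, the example text, the naive paragraph and sentence splitting, and the function names are all assumptions for illustration; a real pipeline would use the named entity recognition step rather than plain whole-word search.

```python
import re
from collections import Counter
from itertools import combinations

ENTITIES = ["CDK1", "CCNB1", "TP53"]  # hypothetical names to look for


def mentioned(text):
    """Return the set of entity names appearing in `text` as whole words."""
    return {e for e in ENTITIES if re.search(r"\b" + re.escape(e) + r"\b", text)}


def count_comentions(documents):
    """Count entity pairs co-mentioned within documents, paragraphs and sentences."""
    counts = {"document": Counter(), "paragraph": Counter(), "sentence": Counter()}
    for doc in documents:
        # Naive splitting: blank lines separate paragraphs, .!? end sentences.
        paragraphs = [p for p in doc.split("\n\n") if p.strip()]
        sentences = [s for p in paragraphs for s in re.split(r"(?<=[.!?])\s+", p)]
        levels = (("document", [doc]), ("paragraph", paragraphs), ("sentence", sentences))
        for level, units in levels:
            for unit in units:
                for pair in combinations(sorted(mentioned(unit)), 2):
                    counts[level][pair] += 1
    return counts


DOCS = ["CDK1 binds CCNB1. TP53 is a tumor suppressor.\n\nCDK1 is a kinase."]
for level, counter in count_comentions(DOCS).items():
    print(level, dict(counter))
```

In the example, CDK1 and TP53 are co-mentioned at the document and paragraph level but never in the same sentence, which is the kind of distinction that lets closer co-mentions count as stronger evidence.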

corpora

~22 million abstracts

no access

~4 million full-text articles

interface design

ease of use

web resources

simple search interface

complex relational database

attractiveness

data visualization

STRING

Szklarczyk, Franceschini et al., Nucleic Acids Research, 2011

payload

compartments.jensenlab.org

COMPARTMENTS

compartments.jensenlab.org

TISSUES

tissues.jensenlab.org

provenance

evidence viewers

DISEASES

reusability

web services

download files

open licenses

Acknowledgments

Protein networks

Christian von Mering, Damian Szklarczyk, Michael Kuhn, Manuel Stark, Samuel Chaffron, Chris Creevey, Jean Muller, Tobias Doerks, Philippe Julien, Alexander Roth, Milan Simonovic, Jan Korbel, Berend Snel, Martijn Huynen, Peer Bork

Literature mining

Sune Frankild, Evangelos Pafilis, Janos Binder, Kalliopi Tsafou, Alberto Santos, Heiko Horn, Michael Kuhn, Nigel Brown, Reinhardt Schneider, Sean O’Donoghue