Your Brains in My e-Laboratory

Feasting on brains with Taverna and Semantic Web tools

Marco Roos, acknowledging the AID team (Scott Marshall, Sophia Katrenko, Willem van Hage, Edgar Meij, Konstantinos Krommydas, Pieter Adriaans), Andrew Gibson, Martijn Schuemie, Piter de Boer, the myGrid team (in particular Katy Wolstencroft, Carole Goble, and Dave de Roure), OMII-UK and NBIC

Amsterdam, May 28, 2009


Page 1: Your Brains in  My e-Laboratory

Your Brains in My e-Laboratory

Feasting on brains with Taverna and Semantic Web tools

Marco Roos

acknowledging the AID team (Scott Marshall, Sophia Katrenko, Willem van Hage, Edgar Meij, Konstantinos Krommydas, Pieter Adriaans), Andrew Gibson, Martijn Schuemie, Piter de Boer, the myGrid team (in particular Katy Wolstencroft, Carole Goble, and Dave de Roure), OMII-UK and NBIC

Amsterdam, May 28, 2009

Page 2: Your Brains in  My e-Laboratory

Marco Roos

Biologist and bioinformatician

Post-doc e-(bio)science, University of Amsterdam (BioRange/VL-e)

Project or Area Liaison (PAL) OMII-UK

Member UK e-Science All Hands Foundation

Member BioAssist programme committee NBIC

A biologist in e-Science

Page 3: Your Brains in  My e-Laboratory

My primary motivation

Structure and function of DNA in the nucleus

Mouse fibroblast (skin) cells

Escherichia coli

Page 4: Your Brains in  My e-Laboratory


/*
 * Determines ridges in the htm expression table.
 */

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h" /* assumed to define TRUE/FALSE and to declare the functions below */

/* Selects all rows of one chromosome, sorted on genstart. The result is handed
   back through htmtable (passed by pointer; the prototype in ridge.h must match). */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];

    /* write the query into querystring; chromname is quoted as an SQL string literal */
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);

    return validquery(*htmtable, querystring);
}

int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: TRUE if mincount consecutive rows starting at row reach exprthreshold */
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;

    /* PQgetvalue returns text, so convert before comparing; read the current row */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold) {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main(void)
{
    PGconn *conn;          /* holds database connection */
    char querystring[256]; /* query string */
    PGresult *result;
    int ok;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD) {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(1);
    } else
        printf("Connection ok\n");

    snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);
    result = PQexec(conn, querystring);

    ok = validquery(result, querystring);
    if (ok)
        printresults(result);

    PQclear(result);
    PQfinish(conn);
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}

/* prints the first column of at most ten rows */
int printresults(PGresult *tuples)
{
    int i;

    for (i = 0; i < PQntuples(tuples) && i < 10; i++) {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

/* returns TRUE if the query produced a tuple set, FALSE otherwise */
int validquery(PGresult *result, char *querystring)
{
    if (PQresultStatus(result) != PGRES_TUPLES_OK) {
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}
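The ridge test in the C listing can be tried without a database. The following is a minimal Python sketch of the same idea, assuming the expression values of one chromosome are already loaded into a list sorted on gene start. The function `find_ridges` and the sample values are illustrative additions, not part of the original program.

```python
def is_ridge(expr, row, threshold, mincount):
    """True if `mincount` consecutive values starting at `row` all reach `threshold`."""
    if mincount <= 0:
        return True
    if row + mincount > len(expr):
        return False
    return all(value >= threshold for value in expr[row:row + mincount])


def find_ridges(expr, threshold, mincount):
    """Return (start, end) pairs, end exclusive, of maximal runs of at least `mincount` genes."""
    ridges = []
    start = None
    for i, value in enumerate(expr):
        if value >= threshold:
            if start is None:
                start = i          # a candidate run begins here
        else:
            if start is not None and i - start >= mincount:
                ridges.append((start, i))
            start = None           # the run is broken
    if start is not None and len(expr) - start >= mincount:
        ridges.append((start, len(expr)))
    return ridges


# Made-up expression values, for illustration only.
expr = [1.0, 5.2, 6.1, 7.0, 0.5, 8.0, 9.1]
print(is_ridge(expr, 1, 5.0, 3))
print(find_ridges(expr, 5.0, 3))
```

The iterative scan avoids the recursion of the C version, which matters little for a handful of genes but keeps the sketch usable on long chromosomes.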

Page 5: Your Brains in  My e-Laboratory

‘Old school’ bioinformatics approach

Local database

Page 6: Your Brains in  My e-Laboratory


My tiny brain

Page 7: Your Brains in  My e-Laboratory


Virtual professor

My ws

Your ws

My ws

Your ws

My ws

* From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pg 23-34


Page 8: Your Brains in  My e-Laboratory


Combining expertise

Edgar Meij

Information retrieval expert

Page 9: Your Brains in  My e-Laboratory


Combining expertise

Sophia Katrenko

Machine learning expert

Page 10: Your Brains in  My e-Laboratory

Combining expertise

Willem van Hage

Semantic web expert (and bass guitar player)

Page 11: Your Brains in  My e-Laboratory

Combining expertise

Towards a knowledge framework

Computer scientist and bioinformatician

Scott Marshall

Page 12: Your Brains in  My e-Laboratory

The AIDA toolbox, Web Services for knowledge extraction and knowledge management

Page 13: Your Brains in  My e-Laboratory


AIDA toolbox

e-Science collaboration

Page 14: Your Brains in  My e-Laboratory

“Collaboration through Web Services”

Bio-text mining expert

BioSemantics group, Erasmus University Rotterdam

Martijn Schuemie

Page 15: Your Brains in  My e-Laboratory


“Collaboration through Web Services”

Biological Database expert

Hideaki Sugawara

Page 16: Your Brains in  My e-Laboratory


“Collaboration through Web Services”

e-bioscientist

Page 17: Your Brains in  My e-Laboratory


An insightful computational experiment

Page 18: Your Brains in  My e-Laboratory

Workflow paradigm for biologists

Page 19: Your Brains in  My e-Laboratory


e-Science leveraging the use of more brains

Want this…

Page 20: Your Brains in  My e-Laboratory


e-Science leveraging the use of more brains

…need this

Page 21: Your Brains in  My e-Laboratory

Workflow and Semantic Web

[Figure: chromatin condensation cartoon. Labels: Decondensed chromatin; Condensed chromatin; Histone methylation at H3K9; DNA methylation; HDAC; HAT; Histone acetylation]

Alpha version of Concept Web

Page 22: Your Brains in  My e-Laboratory

Separation of models and instances in OWL

[Figure: composite diagram of linked OWL models, each class shown with example instances. Recoverable labels:]

Protein or gene: HDAC1, PCAF, p68
Association / Interaction: HDAC1-PCAF interaction, HDAC1-p68 association
BiologicalModel: Chromatin condensation hypothesis

Protein term: “HDAC1”, “p68”
Interaction term: “associate”
Proteins or genes association assertion: “p68 and p72 associate with histone deacetylase 1 (HDAC1)”
Document: PMID: 15298701
Properties: isComponentOf, relates, relatesBy (class level: isComponentOf some, relates some, relatesBy some)

Discovered protein term: “p68”, “HDAC1”
Discovered interaction term: “interacts”
Discovered interaction assertion: “DNMT3B also interacts with histone deacetylase 1 (HDAC1)”
Discovered association assertion: “p68 and p72 associate with histone deacetylase 1 (HDAC1)”
Retrieved document: PMID: 15298701
Document search query: “HDAC1 AND chromatin”
Text mining process: AIDA based extraction process
Properties: discoveredBy, searchesWith

Page 23: Your Brains in  My e-Laboratory

Biological model (representing cartoon elements)

Classes: Protein, ProteinAssociation, BiologicalModel
Class restriction: ProteinAssociation hasParticipant some Protein

Instances: HDAC1, PCAF, HDAC1-PCAF interaction, Chromatin condensation hypothesis
Instance properties: hasModelComponent (three links), hasParticipant (two links)

<myModel:HDAC1> <rdf:type> <myModel:Protein>
<myModel:Protein> <rdf:type> <owl:Class>
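The two triples on this slide show the split between the model level (a class) and the instance level (a member of that class). As a minimal sketch, the same split can be held in a plain Python set of triples. The underscored identifiers, the class assignments of PCAF and the interaction, and the helper functions are assumptions made for illustration; only the names themselves come from the slide.

```python
RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"

triples = {
    # Model level: the classes of the biological model.
    ("myModel:Protein", RDF_TYPE, OWL_CLASS),
    ("myModel:ProteinAssociation", RDF_TYPE, OWL_CLASS),
    ("myModel:BiologicalModel", RDF_TYPE, OWL_CLASS),
    # Instance level: the elements of one particular hypothesis (typing partly assumed).
    ("myModel:HDAC1", RDF_TYPE, "myModel:Protein"),
    ("myModel:PCAF", RDF_TYPE, "myModel:Protein"),
    ("myModel:HDAC1-PCAF_interaction", RDF_TYPE, "myModel:ProteinAssociation"),
    ("myModel:Chromatin_condensation_hypothesis", RDF_TYPE, "myModel:BiologicalModel"),
    ("myModel:HDAC1-PCAF_interaction", "myModel:hasParticipant", "myModel:HDAC1"),
    ("myModel:HDAC1-PCAF_interaction", "myModel:hasParticipant", "myModel:PCAF"),
}


def classes(ts):
    """Everything declared as an owl:Class (the model level)."""
    return sorted(s for s, p, o in ts if p == RDF_TYPE and o == OWL_CLASS)


def instances_of(ts, cls):
    """Everything typed with the given class (the instance level)."""
    return sorted(s for s, p, o in ts if p == RDF_TYPE and o == cls)


def participants(ts, association):
    """Proteins linked to an association by hasParticipant."""
    return sorted(o for s, p, o in ts
                  if s == association and p == "myModel:hasParticipant")


print(classes(triples))
print(instances_of(triples, "myModel:Protein"))
print(participants(triples, "myModel:HDAC1-PCAF_interaction"))
```

A real knowledge base would use an RDF store and full IRIs; the set of tuples only makes the two levels visible.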

Page 24: Your Brains in  My e-Laboratory

Model for text mining observations

Protein term: “HDAC1”, “p68”
Interaction term: “associate”
Proteins or genes association assertion: “p68 and p72 associate with histone deacetylase 1 (HDAC1)”
Document: PMID: 15298701

Class restrictions: isComponentOf some, relates some, relatesBy some
Instance properties: isComponentOf, relates, relatesBy

Page 25: Your Brains in  My e-Laboratory

Experiment log model

Discovered protein term: “p68”
Discovered interaction term: “interacts”
Discovered interaction assertion: “DNMT3B also interacts with histone deacetylase 1 (HDAC1)”
Retrieved document: PMID: 15298701
Document search query: “HDAC1 AND chromatin”
Text mining process: AIDA based extraction process

Properties: discoveredBy, searchesWith

Page 26: Your Brains in  My e-Laboratory

Mappings between Biological and other models

Discovered protein or gene term: “p68”, “HDAC1”
Discovered association assertion: “p68 and p72 associate with histone deacetylase 1 (HDAC1)”
Document search query

Protein: HDAC1, p68
ProteinAssociation: HDAC1-p68 association
BiologicalModel: Chromatin condensation hypothesis

Property: references (six links)

Page 27: Your Brains in  My e-Laboratory


PRELIMINARY RESULTS

Page 28: Your Brains in  My e-Laboratory

Pseudo RDF query and results

SELECT label(comment), label(query1), label(query2)
FROM {protein_instance} rdf:type {bio:Protein} rdf:type {owl:Class},
     {protein_instance} rdfs:comment {comment};
                        bioModel:isModelComponentOf {model1};
                        bioModel:isModelComponentOf {model2},
     {representation1} mappingModel:partially_represents {model1};
                       methodModel:has_query {query1},
     {representation2} mappingModel:partially_represents {model2};
                       methodModel:has_query {query2}
WHERE model1 != model2

Protein | Query for model 1 | Query for model 2
"protein referred to by as NF-kappaB and UniProt ID: P19838" | "HDAC1 chromatin" | "(Nutrician OR food) AND (chromatin OR epigenetics) AND (protein OR proteins)"
"protein referred to by as p21 and UniProt ID: P38936" | "HDAC1 chromatin" | "(Nutrician OR food) AND (chromatin OR epigenetics) AND (protein OR proteins)"
"protein referred to by as Bax and UniProt ID: P97436" | "HDAC1 chromatin" | "(Nutrician OR food) AND (chromatin OR epigenetics) AND (protein OR proteins)"
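The pseudo query asks for proteins that are components of two different models and reports the search query behind each model. The Python sketch below mimics that join over a small in-memory triple list. The property names and the result strings are taken from the slide; every `ex:` identifier, the second protein, and the helper functions are hypothetical. Unlike the slide's table, the sketch returns each pair of models in both orders, because `model1 != model2` holds both ways round.

```python
FOOD_QUERY = "(Nutrician OR food) AND (chromatin OR epigenetics) AND (protein OR proteins)"
NFKB_COMMENT = "protein referred to by as NF-kappaB and UniProt ID: P19838"

TRIPLES = [
    ("ex:nfkb", "rdf:type", "bio:Protein"),
    ("ex:nfkb", "rdfs:comment", NFKB_COMMENT),
    ("ex:nfkb", "bioModel:isModelComponentOf", "ex:model_chromatin"),
    ("ex:nfkb", "bioModel:isModelComponentOf", "ex:model_food"),
    # Hypothetical protein that occurs in one model only, so it must not be reported.
    ("ex:lonely", "rdf:type", "bio:Protein"),
    ("ex:lonely", "rdfs:comment", "hypothetical protein found in one model only"),
    ("ex:lonely", "bioModel:isModelComponentOf", "ex:model_chromatin"),
    ("ex:repr_chromatin", "mappingModel:partially_represents", "ex:model_chromatin"),
    ("ex:repr_chromatin", "methodModel:has_query", "HDAC1 chromatin"),
    ("ex:repr_food", "mappingModel:partially_represents", "ex:model_food"),
    ("ex:repr_food", "methodModel:has_query", FOOD_QUERY),
]


def objects(subject, predicate):
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def queries_for(model):
    """Search queries of every representation that partially represents the model."""
    return [query
            for representation in subjects("mappingModel:partially_represents", model)
            for query in objects(representation, "methodModel:has_query")]


def shared_proteins():
    """Rows of (comment, query1, query2) for proteins found in two different models."""
    rows = []
    for protein in subjects("rdf:type", "bio:Protein"):
        models = objects(protein, "bioModel:isModelComponentOf")
        for comment in objects(protein, "rdfs:comment"):
            for model1 in models:
                for model2 in models:
                    if model1 == model2:
                        continue
                    for query1 in queries_for(model1):
                        for query2 in queries_for(model2):
                            rows.append((comment, query1, query2))
    return rows


for row in shared_proteins():
    print(row)
```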

Page 29: Your Brains in  My e-Laboratory

[Figure: provenance graph, schema level]

Nodes: Protein; Protein name; Discovery process run; Service run; Creator; Run date & time; Document

Links: references; discovered by; implemented by; run at; creator; has input; component of

Page 30: Your Brains in  My e-Laboratory

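[Figure: provenance graph, example instance]

Protein: UniProt:P19838
Protein name: NF-KappaB
Discovery process run: Conditional Random Fields Protein Name Recognition
Service run: AIDA:applyCRF
Creator: Sophia Katrenko (UvA)
Run date & time: 2008-11-18 03:29:30
Document: PMID:17540846

Links: references; discovered by; implemented by; run at; creator; has input; component of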
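One way to keep such a provenance trail next to a text mining result is to store it as a small record and turn it into triples on demand. The sketch below uses the values shown on the slide. The class `DiscoveryRecord`, its field names, and above all the choice of which node each link connects are assumptions, since the transcript preserves only the labels and not the wiring of the graph.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DiscoveryRecord:
    protein: str
    protein_name: str
    process_run: str
    service_run: str
    creator: str
    run_at: datetime
    document: str

    def to_triples(self):
        """Provenance links as (subject, link, object); the wiring is assumed."""
        name = f"name:{self.protein_name}"
        return [
            (name, "references", self.protein),
            (name, "discovered by", self.process_run),
            (name, "component of", self.document),
            (self.process_run, "implemented by", self.service_run),
            (self.service_run, "creator", self.creator),
            (self.service_run, "run at", self.run_at.isoformat()),
            (self.service_run, "has input", self.document),
        ]


record = DiscoveryRecord(
    protein="UniProt:P19838",
    protein_name="NF-KappaB",
    process_run="Conditional Random Fields Protein Name Recognition",
    service_run="AIDA:applyCRF",
    creator="Sophia Katrenko (UvA)",
    run_at=datetime(2008, 11, 18, 3, 29, 30),
    document="PMID:17540846",
)

for triple in record.to_triples():
    print(triple)
```

With records like this, a question such as "who ran the service that found this protein name, and on which document?" becomes a lookup rather than a search through lab notes.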

Page 31: Your Brains in  My e-Laboratory

Access to triples in Taverna via AIDA plugin


Page 32: Your Brains in  My e-Laboratory

Knowledge mining: my knowledge is mine, your knowledge is mine

Page 33: Your Brains in  My e-Laboratory

Demonstrate Exploiting Brains (2x)

My ws

Your ws

My ws

Your ws

My ws

Computational brains

Biological brains

* From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pg 23-34

Page 34: Your Brains in  My e-Laboratory

A typical biologist…

A needy biologist

Tiny brain

Lots of data to deal with

Lots of methods and algorithms to try and combine

No computational superpowers

Lots of knowledge to deal with

Page 35: Your Brains in  My e-Laboratory

An enhanced biologist…

An enhanced biologist

Many brains

Lots of data to support me

Web Services, Workflows, and their creators available

Other people’s computational superpowers

Knowledge bases to query

Page 36: Your Brains in  My e-Laboratory


Publish and share on myExperiment.org

Publish & share research objects

Page 37: Your Brains in  My e-Laboratory


e-Laboratory factories

Page 38: Your Brains in  My e-Laboratory


http://www.epigenius.org/ (mock-up)

Page 39: Your Brains in  My e-Laboratory

End of presentation...

Thank you

http://adaptivedisclosure.org

Are you willing to share your brain?