your brains in my e-laboratory

Post on 23-Feb-2016


DESCRIPTION

Amsterdam, May 28, 2009. Your Brains in My e-Laboratory: Feasting on brains with Taverna and Semantic Web tools. Marco Roos.

TRANSCRIPT

Your Brains in My e-Laboratory

Feasting on brains with Taverna and Semantic Web tools

Marco Roos

Acknowledging the AID team (Scott Marshall, Sophia Katrenko, Willem van Hage, Edgar Meij, Konstantinos Krommydas, Pieter Adriaans), Andrew Gibson, Martijn Schuemie, Piter de Boer, the myGrid team (in particular Katy Wolstencroft, Carole Goble, and Dave de Roure), OMII-UK and NBIC

Amsterdam, May 28, 2009

2

Marco Roos

Biologist and bioinformatician

Post-doc e-(bio)science, University of Amsterdam (BioRange/VL-e)

Project or Area Liaison (PAL), OMII-UK

Member, UK e-Science All Hands Foundation

Member, BioAssist programme committee, NBIC

A biologist in e-Science

3

Mouse fibroblast (skin) cells

My primary motivation

Structure and function of DNA in the nucleus

Escherichia coli

5

/*
 * determines ridges in htm expression table
 *
 * ridge.h is assumed to define TRUE and FALSE and to declare the
 * functions below; its prototypes must match these signatures.
 */

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h"

/* The result is handed back through htmtable, so it is passed by address. */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];

    /* chromname is a text value in SQL, so it is quoted in the query */
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);

    return validquery(*htmtable, querystring);
}

int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: */
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;

    /* PQgetvalue returns text; convert it before comparing with the threshold */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold) {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main()
{
    PGconn *conn;            /* holds database connection */
    char querystring[256];   /* query string */
    PGresult *result;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD) {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(1);
    } else
        printf("Connection ok\n");

    snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);
    result = PQexec(conn, querystring);

    if (validquery(result, querystring)) {
        printresults(result);
    } else {
        PQclear(result);
        PQfinish(conn);
        return EXIT_FAILURE;   /* non-zero exit status signals failure */
    }

    PQclear(result);
    PQfinish(conn);
    return EXIT_SUCCESS;
}

/* prints the first column of at most the first 10 rows */
int printresults(PGresult *tuples)
{
    int i;

    for (i = 0; i < PQntuples(tuples) && i < 10; i++) {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

/* TRUE if the query returned tuples, FALSE (with a message) otherwise */
int validquery(PGresult *result, char *querystring)
{
    printf(" in validquery\n");
    if (PQresultStatus(result) != PGRES_TUPLES_OK) {
        printf("Query %s failed.\n", querystring);
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}

6

‘Old school’ bioinformatics approach

Local database

8

My tiny brain

9

Virtual professor

My ws, Your ws, My ws, Your ws, My ws

* From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pp. 23-34

10

Combining expertise

Edgar Meij

Information retrieval expert

11

Combining expertise

Sophia Katrenko

Machine learning expert

12

Combining expertise

Willem van Hage

Semantic web expert (and bass guitar player)

13

Combining expertise

Towards a knowledge framework

Computer scientist and bioinformatician

Scott Marshall

14

The AIDA toolbox: Web Services for knowledge extraction and knowledge management

15

AIDA toolbox

e-Science collaboration

16

“Collaboration through Web Services”

Bio-text mining expert

BioSemantics group, Erasmus University Rotterdam

Martijn Schuemie

17

“Collaboration through Web Services”

Biological Database expert

Hideaki Sugawara

18

“Collaboration through Web Services”

e-bioscientist

19

An insightful computational experiment

Workflow paradigm for biologists

21

e-Science leveraging the use of more brains

Want this…

22

e-Science leveraging the use of more brains

…need this

23

Workflow and Semantic Web

[Figure: cartoon of decondensed versus condensed chromatin, labelled with histone methylation at H3K9, DNA methylation, HDAC, HAT and histone acetylation]

Alpha version of Concept Web

24

Separation of models and instances in OWL

[Figure: overview diagram combining four parts: the biological model (Protein or gene, Association, BiologicalModel, with the instances HDAC1, PCAF, HDAC1-PCAF interaction and Chromatin condensation hypothesis), the model for text mining observations, the experiment log model, and the mappings between them. The text mining, experiment log and mapping parts are shown separately on later slides.]

Biological model (representing cartoon elements)

Classes: Interaction (hasParticipant some Protein), ProteinAssociation, BiologicalModel

Instances: HDAC1, PCAF, HDAC1-PCAF interaction, Chromatin condensation hypothesis

Properties: hasModelComponent, hasParticipant

<myModel:HDAC1> <rdf:type> <myModel:Protein>

<myModel:Protein> <rdf:type> <owl:Class>

26

Model for text mining observations

Classes: Protein term, Interaction term, Proteins or genes association assertion, Document

Instances: “HDAC1”, “p68”, “associate”, “p68 and p72 associate with histone deacetylase 1 (HDAC1)”, PMID: 15298701

Class-level restrictions: isComponentOf some, relates some, relatesBy some

Instance-level properties: isComponentOf, relates, relatesBy

27

Experiment log model

Classes: Discovered protein term, Discovered interaction term, Discovered interaction assertion, Text mining process, Document search query, Retrieved document

Instances: “p68”, “interacts”, “DNMT3B also interacts with histone deacetylase 1 (HDAC1)”, AIDA based extraction process, “HDAC1 AND chromatin”, PMID: 15298701

Properties: discoveredBy, searchesWith

28

Mappings between Biological and other models

Classes: Discovered protein or gene term, Discovered association assertion, Document search query, Protein, ProteinAssociation, BiologicalModel

Instances: “p68”, “HDAC1”, “p68 and p72 associate with histone deacetylase 1 (HDAC1)”, HDAC1, p68, HDAC1-p68 association, Chromatin condensation hypothesis

Mapping property: references

29

PRELIMINARY RESULTS

Pseudo RDF query and results

SELECT label(comment), label(query1), label(query2)
FROM {protein_instance} rdf:type {bio:Protein} rdf:type {owl:Class},
     {protein_instance} rdfs:comment {comment};
                        bioModel:isModelComponentOf {model1};
                        bioModel:isModelComponentOf {model2},
     {representation1} mappingModel:partially_represents {model1};
                       methodModel:has_query {query1},
     {representation2} mappingModel:partially_represents {model2};
                       methodModel:has_query {query2}
WHERE model1 != model2

Protein: "protein referred to by as NF-kappaB and UniProt ID: P19838"
Query for model 1: "HDAC1 chromatin"
Query for model 2: "(Nutrician OR food) AND (chromatin OR epigenetics) AND (protein OR proteins)"

Protein: "protein referred to by as p21 and UniProt ID: P38936"
Query for model 1: "HDAC1 chromatin"
Query for model 2: "(Nutrician OR food) AND (chromatin OR epigenetics) AND (protein OR proteins)"

Protein: "protein referred to by as Bax and UniProt ID: P97436"
Query for model 1: "HDAC1 chromatin"
Query for model 2: "(Nutrician OR food) AND (chromatin OR epigenetics) AND (protein OR proteins)"

Protein: UniProt:P19838

Protein name: NF-KappaB

Discovery process run: Conditional Random Fields Protein Name Recognition

Service run: AIDA:applyCRF

Creator: Sophia Katrenko (UvA)

Run date & time: 2008-11-18 03:29:30

Document: PMID:17540846

Properties: references, discovered by, implemented by, run at, creator, has input, component of

Access to triples in Taverna via AIDA plugin

33

34

Knowledge mining

Knowledge mining: my knowledge is mine, your knowledge is mine

35

Demonstrate Exploiting Brains (2x)

My ws, Your ws, My ws, Your ws, My ws

* From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pp. 23-34

Computational brains

Biological brains

36

A typical biologist…

A needy biologist

Tiny brain

Lots of data to deal with

Lots of methods and algorithms to try and combine

No computational superpowers

Lots of knowledge to deal with

37

An enhanced biologist…

An enhanced biologist

Many brains

Lots of data to support me

Web Services, Workflows, and their creators available

Other people’s computational superpowers

Knowledge bases to query

38

Publish and share on myExperiment.org

Publish & share research objects

39

e-Laboratory factories

40

http://www.epigenius.org/ (mock-up)

41

End of presentation...

Thank you

http://adaptivedisclosure.org

Are you willing to share your brain?
