Semantic Relations for Interpreting DNA Microarray Data
and for Novel Hypotheses Generation
Dimitar Hristovski,1 PhD, Andrej Kastrin,2 Borut Peterlin,2 MD PhD, Thomas C Rindflesch,3 PhD
1Institute of Biomedical Informatics, Medical Faculty, University of Ljubljana, Slovenia
2Institute of Medical Genetics, University Medical Centre, Ljubljana, Slovenia
3National Library of Medicine, National Institutes of Health, Bethesda, MD, U.S.A.
e-mail: [email protected]
Introduction
Microarray experiments:
• great potential to support progress in biomedical research,
• results NOT EASY to interpret,
• information about functions and relations of relevant genes needs to be extracted from the vast biomedical literature
Related Work
• Text mining and microarray analysis
• Literature-based Discovery
Proposed Solution
• Computerized text analysis system
• Extract semantic relations from literature
– SemRep
• Integrate with microarray experiments
• Develop tools for:
– Interpretation
– Novel hypotheses generation
Overall Design
[Diagram: Medline → SemRep (semantic relations extraction) → semantic relations; GEO → R Bioconductor scripts → microarrays; both feed the Integrated Database (semantic relations + microarrays), which is used by the Interpretation & Discovery Tools]
SemRep
• Extracts semantic relations from biomedical text (implemented in Prolog)
• Based on UMLS Metathesaurus and Semantic Network
– <MetaConc> SEMNET RELATION <MetaConc>
• Database of relations extracted from MEDLINE
– 6.7M citations (01/01/1999 through 03/31/2009)
– 43M sentences
– 21M relation instances
– 7M relation types
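As a rough illustration of the `<MetaConc> SEMNET RELATION <MetaConc>` shape, each extracted relation can be viewed as a subject-predicate-object triple tied to the sentence it came from. The sketch below is an assumption made for illustration, not SemRep's output format or the actual database schema: the class name, field names, and the reading of a "relation type" as a distinct triple are all hypothetical. The sample rows reuse relations and sentences quoted on the Results slides of this deck.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationInstance:
    """One relation extracted from one sentence (hypothetical layout)."""
    subject: str    # Metathesaurus concept, e.g. "pramipexol"
    predicate: str  # Semantic Network relation, e.g. "STIMULATES"
    obj: str        # Metathesaurus concept, e.g. "NR4A2"
    pmid: str       # MEDLINE citation the sentence came from
    sentence: str   # source sentence, kept so a user can check the relation


def relation_types(instances):
    """Count instances per distinct (subject, predicate, object) triple.

    Treating a "relation type" as a distinct triple is an assumption
    made for this sketch.
    """
    return Counter((r.subject, r.predicate, r.obj) for r in instances)


instances = [
    RelationInstance("pramipexol", "STIMULATES", "NR4A2", "15740846",
                     "... the increase of Nurr1 gene expression induced by PRX, ..."),
    RelationInstance("pramipexol", "STIMULATES", "NR4A2", "15740846",
                     "... the induction of Nurr1 gene expression by PRX ..."),
    RelationInstance("Paclitaxel", "INHIBITS", "HSPB1", "15304155",
                     "Paclitaxel completely inhibited the expression of HSP27"),
]
types = relation_types(instances)
print(len(instances), "instances,", len(types), "types")
```

Keeping the sentence and PMID with every instance is what would let a tool show the supporting text for any relation it reports.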
Semantic Relations Extracted
• Wide range of relations in:
– Clinical medicine
– Molecular genetics
– Pharmacogenomics
• Genetic Etiology: associated_with, predisposes, causes
• Substance Relations: interacts_with, inhibits, stimulates
• Pharmacological Effects: affects, disrupts, augments
• Clinical Actions: administered_to, manifestation_of, treats
• Organism Characteristics: location_of, part_of, process_of
• Co-existence: co-exists_with
Examples
• “… the loss of Mbd1 could lead to autism-like behavioral phenotypes …”
• Relation: MBD1 causes Autistic Disorder
• “… Mbd1 can directly regulate the expression of Htr2c, one of the serotonin receptors, …”
• Relation: MBD1 interacts_with HTR2C
Interpretation of Microarrays
Find known facts from the literature:
• Disease related:
– Associated genes
– Current treatments
– …
• Microarray genes:
– Relations between genes (INHIBITS, STIMULATES, …)
– Relations between the genes and anything else
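Finding known facts of this kind amounts to filtering the extracted relations by predicate and argument. The following is a minimal sketch under stated assumptions: relations are held as plain `(subject, predicate, object)` tuples in memory, and the function names `find_relations` and `relations_touching` are hypothetical, not part of the tool described here. The triples are the ones quoted elsewhere in these slides.

```python
def find_relations(triples, predicates=None, subjects=None, objects=None):
    """Return triples matching every filter that is given (None = no filter)."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (predicates is None or p in predicates)
        and (subjects is None or s in subjects)
        and (objects is None or o in objects)
    ]


def relations_touching(triples, genes):
    """Return triples in which a microarray gene appears as either argument."""
    return [(s, p, o) for (s, p, o) in triples if s in genes or o in genes]


triples = [
    ("MBD1", "CAUSES", "Autistic Disorder"),
    ("MBD1", "INTERACTS_WITH", "HTR2C"),
    ("NR4A2", "ASSOCIATED_WITH", "Parkinson Disease"),
    ("pramipexol", "STIMULATES", "NR4A2"),
    ("Paclitaxel", "INHIBITS", "HSPB1"),
]

# Disease related: what causes or is associated with Parkinson disease?
disease_genes = find_relations(
    triples,
    predicates={"CAUSES", "ASSOCIATED_WITH"},
    objects={"Parkinson Disease"},
)

# Microarray genes: relations between the genes and anything else.
microarray_genes = {"HSPB1", "NR4A2"}
touching = relations_touching(triples, microarray_genes)
```

With a real relation database the same filters would more plausibly be expressed as indexed queries rather than list scans.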
Relations with “Parkinson” as Argument?
What Treats Parkinson?
What (causes, associated_with) Parkinson?
Sentences from which Relations are Extracted
Genes from the Microarray Related to Anything?
Novel Hypotheses Generation
• Based on discovery patterns
• Discovery patterns:
– search templates that have a higher likelihood of returning a new discovery
• Specific discovery patterns for specific discovery tasks
Discovery Patterns
• Inhibit the upregulated:
– Search for substances, genes, ... which, according to the literature, inhibit the top N (e.g. 300) genes that are upregulated on a given microarray
– Such substances, genes, … might be used to regulate the upregulated genes
• Stimulate the downregulated:
– Search for substances, genes, ... which, according to the literature, stimulate the top N (e.g. 300) genes that are downregulated on a given microarray
– Such substances, genes, … might be used to regulate the downregulated genes
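The two patterns above can be sketched as one small function. This is an illustrative sketch, not the authors' implementation: it assumes the microarray is summarized as a hypothetical `gene -> log fold change` dictionary and that literature relations are `(subject, predicate, object)` tuples. The fold-change numbers and `GENE_A` are made-up toy values; only the directions for HSPB1 and NR4A2 and the three relations come from the Results slides.

```python
def discovery_pattern(fold_changes, triples, direction, top_n=300):
    """Map candidate substances/genes to the microarray genes they act on.

    direction="up":   find subjects that INHIBIT the top-N upregulated genes.
    direction="down": find subjects that STIMULATE the top-N downregulated genes.
    """
    if direction == "up":
        ranked = sorted((g for g, fc in fold_changes.items() if fc > 0),
                        key=fold_changes.get, reverse=True)
        predicate = "INHIBITS"
    elif direction == "down":
        ranked = sorted((g for g, fc in fold_changes.items() if fc < 0),
                        key=fold_changes.get)
        predicate = "STIMULATES"
    else:
        raise ValueError("direction must be 'up' or 'down'")

    targets = set(ranked[:top_n])
    candidates = {}
    for subject, pred, obj in triples:
        if pred == predicate and obj in targets:
            candidates.setdefault(subject, set()).add(obj)
    return candidates


# Toy fold changes (hypothetical values).
fold_changes = {"HSPB1": 1.8, "GENE_A": 0.3, "NR4A2": -1.2}
triples = [
    ("Paclitaxel", "INHIBITS", "HSPB1"),
    ("Quercetin", "INHIBITS", "HSPB1"),
    ("pramipexol", "STIMULATES", "NR4A2"),
]

inhibitors = discovery_pattern(fold_changes, triples, "up", top_n=1)
stimulators = discovery_pattern(fold_changes, triples, "down", top_n=1)
```

Returning the supporting genes per candidate, rather than a bare list of candidates, keeps each hypothesis traceable to the microarray evidence behind it.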
Discovery Patterns – Graphical View
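[Diagram: on the microarray side, Disease X has upregulated Genes Y1 and downregulated Genes Y2; on the literature side, Drug (or substance) Z1 Inhibits and Drug (or substance) Z2 Stimulates these genes; the hypothesized links from the drugs to Disease X are labeled Maybe_Treats1? and Maybe_Treats2?]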
Results – Inhibit the Upregulated
• Parkinson microarray GSE8397
• HSP27 (HSPB1) gene is upregulated on the microarray
• We identified paclitaxel and quercetin as substances that inhibit the expression of this gene
Paclitaxel INHIBITS HSPB1|HSPB1 protein
Paclitaxel completely inhibited the expression of HSP27 (PMID: 15304155)
Quercetin INHIBITS HSPB1|HSPB1 gene
Quercetin …, inhibited the expression of both HSP70 and HSP27 (PMID: 12926076)
Inhibit the Upregulated
Results – Stimulate the Downregulated
• NR4A2 downregulated on the microarray
• We found out that:
– Pramipexol stimulates expression of NR4A2
– NR4A2 is associated with Parkinson disease
pramipexol STIMULATES NR4A2
… the increase of Nurr1 gene expression induced by PRX, ... (PMID: 15740846)
… the induction of Nurr1 gene expression by PRX ... (PMID: 15740846)
NR4A2 ASSOCIATED_WITH Parkinson Disease
… lower levels of NURR1 gene expression were associated with significantly increased risk for PD (PMID: 18684475)
Explaining a Relation - Closed Discovery
Closed Discovery – Aligned Relations
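The closed discovery slides survive here only as titles, so the following is a generic sketch of the idea rather than a description of the tool's screens. In closed discovery a start concept and an end concept are both given, and the search looks for intermediate concepts that link them. The function name `closed_discovery` and the tuple layout are assumptions; the two relations are the ones from the "Stimulate the Downregulated" results.

```python
def closed_discovery(triples, start, end):
    """Return (predicate1, intermediate, predicate2) paths linking start to end.

    Only paths of the form  start -p1-> intermediate -p2-> end  are
    considered; relation direction is taken as stored.
    """
    outgoing = [(p, o) for (s, p, o) in triples if s == start]
    incoming = {}
    for s, p, o in triples:
        if o == end:
            incoming.setdefault(s, []).append(p)
    return [
        (p1, middle, p2)
        for (p1, middle) in outgoing
        for p2 in incoming.get(middle, [])
    ]


triples = [
    ("pramipexol", "STIMULATES", "NR4A2"),
    ("NR4A2", "ASSOCIATED_WITH", "Parkinson Disease"),
    ("Paclitaxel", "INHIBITS", "HSPB1"),
]
paths = closed_discovery(triples, "pramipexol", "Parkinson Disease")
```

Each returned path is a candidate explanation for a proposed drug-disease link, here by way of NR4A2.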
Evaluation
• Estimate – based on [Masseroli, BMC Bioinformatics 2006]:
• Extract known facts – baseline precision on 2,042 extracted relations:
– Gene – Disease (causes, assoc_with, …) P=74.2%
– Gene – Gene (inhibits, stimulates, …) P=41.95%
• Propose Argument-Predicate distance for filtering (Gene-Gene):
– At distance no more than 1: P=70.75%; R=43.6%
– At distance no more than 2: P=55.88%; R=66.28%
• We use Argument-Predicate distance for ranking of semantic relations and we show relations more likely to be correct first.
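The trade-off behind these numbers can be reproduced on toy data: tightening the distance threshold discards some correct relations (lower recall) while removing proportionally more incorrect ones (higher precision). The sketch below assumes each manually judged relation is a hypothetical `(distance, is_correct)` pair and measures recall against all correct relations in the unfiltered set. The judged values are invented for illustration and are not the evaluation data cited above.

```python
def precision_recall(judged, max_distance):
    """Precision and recall after keeping relations with distance <= max_distance.

    judged: list of (argument_predicate_distance, is_correct) pairs.
    """
    kept = [ok for dist, ok in judged if dist <= max_distance]
    total_correct = sum(ok for _, ok in judged)
    precision = sum(kept) / len(kept) if kept else 0.0
    recall = sum(kept) / total_correct if total_correct else 0.0
    return precision, recall


def rank_by_distance(scored):
    """Order (relation, distance) pairs so the smallest distances come first."""
    return sorted(scored, key=lambda pair: pair[1])


# Toy judgments (hypothetical): 8 relations, 4 of them correct.
judged = [(1, True), (1, True), (1, False), (2, True),
          (2, False), (3, False), (3, True), (4, False)]

p1, r1 = precision_recall(judged, max_distance=1)
p2, r2 = precision_recall(judged, max_distance=2)

ranked = rank_by_distance([("B INHIBITS C", 3), ("A STIMULATES B", 1)])
```

Ranking instead of hard filtering keeps every relation available while still putting the ones more likely to be correct at the top.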
Conclusion
• A new bioinformatics tool for interpretation and novel hypotheses generation
• Based on integration of semantic relations extracted from literature with microarrays
• Available at:
• http://sembt.mf.uni-lj.si
Syntactic Processing
Mbd1 can directly regulate the expression of Htr2c
• MedPost tagger and shallow parser
[ NP[head([… inputmatch(mbd1),tag(noun)])], ...
[verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],...
NP[… head([… inputmatch(htr2c),tag(noun)])] ]
Semantic Processing
• Identify concepts: MetaMap and ABGene
[ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ...
[verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],...
NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ]
• Match semantic type patterns to ontology:
<gngm> INTERACTS_WITH <gngm>
• Apply indicator rule: Verb(regulate) → INTERACTS_WITH
• Substitute concepts for semantic types:
MBD1 INTERACTS_WITH HTR2C
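The semantic processing steps can be traced end to end in a few lines. This is a toy sketch, not SemRep itself (which is implemented in Prolog and relies on MetaMap and ABGene for concept identification). It assumes the concepts are already identified and that the ontology patterns and indicator rules fit in small in-memory tables; the names `ONTOLOGY`, `INDICATOR_RULES`, and `interpret` are hypothetical.

```python
# Allowed semantic type patterns (only the one shown on the slide).
ONTOLOGY = {("gngm", "INTERACTS_WITH", "gngm")}

# Indicator rules: a verb signals a candidate predicate.
INDICATOR_RULES = {"regulate": "INTERACTS_WITH"}


def interpret(subject, verb, obj):
    """Turn a parsed subject/verb/object into a relation, or return None.

    subject and obj are dicts with "concept" and "semtype" keys, standing
    in for noun phrases whose concepts have already been identified.
    """
    # Apply the indicator rule for the verb.
    predicate = INDICATOR_RULES.get(verb)
    if predicate is None:
        return None
    # Match the semantic type pattern against the ontology.
    if (subject["semtype"], predicate, obj["semtype"]) not in ONTOLOGY:
        return None
    # Substitute concepts for semantic types.
    return (subject["concept"], predicate, obj["concept"])


mbd1 = {"concept": "MBD1", "semtype": "gngm", "entrez": 4152}
htr2c = {"concept": "HTR2C", "semtype": "gngm", "entrez": 3358}
relation = interpret(mbd1, "regulate", htr2c)
```

A verb without an indicator rule, or a semantic type pair the ontology does not license, yields no relation, which is how the type constraints filter out implausible readings.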