Using text to build semantic networks for pharmacogenomics
![Page 1: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/1.jpg)
Using Text to Build Semantic Networks for Pharmacogenomics
George Karystianis
Adrien Coulet, Nigam Shah, Yael Garten, Mark Musen, Russ B. Altman
Journal of Biomedical Informatics (2010)
![Page 2: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/2.jpg)
2
Motivation
● Manually crafted rules to define relationships between entities.
– Limited scope domains.
● Pharmacogenomics.
– Semantic complexity.
● Enhance the PharmGKB.
● Large size of literature.
● NLP techniques promising.
![Page 3: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/3.jpg)
3
Aim
● Automatic relationship extraction.
● Entity mapping in a schema.
– Semantic network structure.
● Curation of PGx knowledge.
● Resource for knowledge discovery.
![Page 4: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/4.jpg)
4
However...
![Page 5: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/5.jpg)
5
What is the meaning of Pharmacogenomics?
![Page 6: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/6.jpg)
6
Pharmacogenomics (1)
Pharmaco Genomics PGx
Φάρμακο (drug) Γίνομαι (to become)
![Page 7: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/7.jpg)
7
Pharmacogenomics (2)
● How genetic variation influences drug response in patients.
● Most of this knowledge presented in binary relationships.
R(a,b)
R: relationship, a: subject, b: object
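A binary relationship R(a,b) can be held in a small record type. The sketch below is illustrative only: the class name `Relationship`, its field names, and the `associated_with` type label are assumptions, and the example instance is based on the VKORC1/warfarin sentence used in the parsing example of this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A binary relationship R(a, b). All names here are illustrative."""
    rel_type: str  # R, the relationship type
    subject: str   # a
    obj: str       # b

    def __str__(self) -> str:
        return f"{self.rel_type}({self.subject}, {self.obj})"


# Hypothetical instance based on the VKORC1/warfarin example sentence.
example = Relationship("associated_with", "VKORC1_polymorphisms", "warfarin_dose")
print(example)
```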
![Page 8: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/8.jpg)
8
Is This Something New?
● Co-occurrence approach:
– Pharmspresso.
– Tri-co-occurrences.
● Syntactic parser approach:
– OpenDMAP.
– Vocabularies.
Complex relationship semantics.
Manual relationship evaluation.
Explicit relationship identification.
Large pattern sets.
Stable ontologies.
![Page 9: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/9.jpg)
9
So...
Regular gene expression networks
Drug-disease networks
Molecular interaction networks
Gene-disease networks
![Page 10: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/10.jpg)
10
Method Overview
MEDLINE Abstracts
Dependency Graphs of Sentences
R
Ontology
PGx network
![Page 11: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/11.jpg)
11
1a. Sentence Parsing
● Implementation of lexicons for sentence retrieval.
● Stanford Parser.
● Focused on sentences with at least 2 key PGx entities.
![Page 12: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/12.jpg)
12
1b. Sentence Parsing
● Querying the sentence index using seeds.
– particular terms corresponding to recognized entities.
– focus on gene-drug/gene-phenotype pairs.
● Reducing set/size of parse trees.
● Parse trees -> dependency graphs.
– rooted, oriented, labelled; easier to read, process, and understand than parse trees.
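The sentence-selection step can be sketched as a lexicon lookup. This is a minimal illustration, not the authors' implementation: the toy `LEXICONS` contents, the tokenisation rule, and the function names are all assumptions, and a real run would use much larger lexicons and a sentence index.

```python
import re

# Toy lexicons (assumed content); a real run would use far larger term lists.
LEXICONS = {
    "gene": {"vkorc1", "cyp2c9"},
    "drug": {"warfarin"},
    "phenotype": {"bleeding"},
}


def find_entities(sentence):
    """Return (token, entity_type) pairs for tokens found in any lexicon."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence.lower())
    hits = []
    for tok in tokens:
        for etype, terms in LEXICONS.items():
            if tok in terms:
                hits.append((tok, etype))
    return hits


def is_candidate(sentence):
    """Keep sentences that mention a gene together with a drug or a phenotype."""
    types = {etype for _, etype in find_entities(sentence)}
    return "gene" in types and bool(types & {"drug", "phenotype"})


print(is_candidate("Several SNPs in VKORC1 are associated with warfarin dose"))
```

Only sentences passing such a filter would need to be sent to the parser, which is one way to keep the number of parse trees manageable.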
![Page 13: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/13.jpg)
13
Parsing Example
“Several single nucleotide polymorphisms (SNPs) in VKORC1 are associated with warfarin dose across the normal dose range”
![Page 14: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/14.jpg)
14
Dependency Graph
![Page 15: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/15.jpg)
15
2a. Relation Extraction
● Sentence analysis for raw relationship extraction.
● Seed recognition:
– through PharmGKB lexicons.
● Seed expansion:
– edge traversal of DG to see if the seed is a key entity or a modified entity.
![Page 16: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/16.jpg)
16
Dependencies for Seed Expansion
● Expand the seed
● End the expansion
● Interrupt the expansion
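A rough sketch of seed expansion over a dependency graph follows. Everything specific in it is an assumption made for illustration: the edges are hand-written for the VKORC1 example sentence (they are not Stanford Parser output), and the grouping of dependency types into "expand" and "end" sets is a guess, since the actual lists are not given here.

```python
# (governor, dependency_type, dependent) edges, hand-written for illustration.
EDGES = [
    ("associated", "nsubjpass", "polymorphisms"),
    ("polymorphisms", "prep_in", "VKORC1"),
    ("associated", "prep_with", "dose"),
    ("dose", "nn", "warfarin"),
]

# Assumed grouping of dependency types; any other type interrupts the expansion.
EXPAND = {"nn", "prep_in", "prep_of", "amod"}
END = {"nsubj", "nsubjpass", "dobj", "prep_with"}


def expand_seed(seed, edges=EDGES):
    """Walk from the seed towards the root; return (entity, stop_reason)."""
    governor_of = {dep: (gov, rel) for gov, rel, dep in edges}
    words = [seed]
    current = seed
    while current in governor_of:
        gov, rel = governor_of[current]
        if rel in EXPAND:
            # The governor modifies the seed: absorb it and keep walking.
            words.append(gov)
            current = gov
        elif rel in END:
            return "_".join(words), "end"
        else:
            return "_".join(words), "interrupt"
    return "_".join(words), "end"


print(expand_seed("VKORC1"))
print(expand_seed("warfarin"))
```

With these assumed edges, the seed VKORC1 grows into the modified entity VKORC1_polymorphisms, while a seed with no expanding edge would stay a key entity.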
![Page 17: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/17.jpg)
17
2b. Relation Extraction
● Seed coupling
– Two seeds linked by a normalised verb.
– Relationship creation.
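Seed coupling might look like the following sketch, which pairs two expanded entities that share a verb governor. The edges are hand-written for the example sentence, and the `VERB_LEMMAS` table, the `VERB_LINKS` set, and the function name are hypothetical stand-ins for whatever verb normalisation the method actually uses.

```python
# Hand-written edges for the example sentence (not parser output).
EDGES = [
    ("associated", "nsubjpass", "polymorphisms"),
    ("polymorphisms", "prep_in", "VKORC1"),
    ("associated", "prep_with", "dose"),
    ("dose", "nn", "warfarin"),
]

# Assumed map from surface verb forms to a normalised verb.
VERB_LEMMAS = {"associated": "associate", "associates": "associate"}

# Assumed dependency types that attach an argument to a verb.
VERB_LINKS = {"nsubj", "nsubjpass", "dobj", "prep_with"}


def couple_seeds(entity_heads, edges=EDGES):
    """entity_heads maps a head word to its expanded entity name.

    Returns (normalised_verb, subject, object) triples for entities
    attached to the same verb.
    """
    by_verb = {}
    for gov, rel, dep in edges:
        if rel in VERB_LINKS and dep in entity_heads and gov in VERB_LEMMAS:
            by_verb.setdefault(gov, []).append((rel, entity_heads[dep]))
    triples = []
    for verb, args in by_verb.items():
        subjects = [e for rel, e in args if rel.startswith("nsubj")]
        objects = [e for rel, e in args if not rel.startswith("nsubj")]
        for s in subjects:
            for o in objects:
                triples.append((VERB_LEMMAS[verb], s, o))
    return triples


heads = {"polymorphisms": "VKORC1_polymorphisms", "dose": "warfarin_dose"}
print(couple_seeds(heads))
```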
![Page 18: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/18.jpg)
18
2c. Relation Extraction
● Evaluation of precision:
– manual precision evaluation of extracting raw relationships.
– random selection of 220 raw relationships.
– classification: complete and true, incomplete and true, or false.
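Given manual labels in the three classes, precision can be computed in a strict form (only complete and true counts) and a lenient form (incomplete and true also counts). The label strings and the tiny sample below are made up for illustration and are not the study's data.

```python
from collections import Counter


def precision_report(labels):
    """Compute strict and lenient precision from manual evaluation labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels to evaluate")
    true_total = counts["complete_true"] + counts["incomplete_true"]
    return {
        "n": total,
        "precision_strict": counts["complete_true"] / total,
        "precision_lenient": true_total / total,
    }


# Made-up sample of four evaluated relationships.
sample = ["complete_true", "complete_true", "incomplete_true", "false"]
print(precision_report(sample))
```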
![Page 19: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/19.jpg)
19
3. Ontology Construction
● Identification of relationship (R) types.
● Hierarchical organisation of R types and entities (E).
– 4 lists: the most frequent relationship types, and the most frequent entities modified by genes, drugs, and phenotypes.
● Refine choice available.
![Page 20: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/20.jpg)
20
4a. Relationship Normalization
● Application of ontology to relationship instances.
● Creation of set of normalised relationships.
● Normalization of entity names:
– modified entity name returned in normalized form according to ontology.
– Decomposition of modified entity to iterate for the construction of normalised form.
![Page 21: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/21.jpg)
21
Example
![Page 22: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/22.jpg)
22
Example
● Seed: VKORC1_polymorphisms.
● Seed concept: Gene.
● Next word: polymorphism.
– refers to a concept modified by Gene.
– synonym of the concept “variant”.
● Normalised word:
– VKORC1_variant.
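The normalisation in this example can be sketched as a synonym lookup applied to each word after the seed. The `GENE_MODIFIED_SYNONYMS` table is an assumed fragment standing in for the ontology, and the function name is hypothetical.

```python
# Assumed fragment of the ontology: words naming concepts that a Gene can modify,
# mapped to the preferred concept label.
GENE_MODIFIED_SYNONYMS = {
    "polymorphism": "variant",
    "polymorphisms": "variant",
    "variant": "variant",
    "variants": "variant",
}


def normalise_gene_entity(raw):
    """Decompose a modified entity and rebuild it from normalised concept names.

    Returns None when a word has no matching concept.
    """
    seed, *modifiers = raw.split("_")
    parts = [seed]
    for word in modifiers:
        concept = GENE_MODIFIED_SYNONYMS.get(word.lower())
        if concept is None:
            return None
        parts.append(concept)
    return "_".join(parts)


print(normalise_gene_entity("VKORC1_polymorphisms"))
```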
![Page 23: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/23.jpg)
23
4b. Relation Normalization
● Normalization of relationship types.
– search for a role label which matches the relationship.
– the identifier of the corresponding role is the normalized type.
– creation of knowledge base of PGx relationships.
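Matching a raw relationship type against role labels can be sketched as a lookup that returns the role identifier. The role identifiers and labels below are invented for illustration and are not taken from the actual ontology.

```python
# Hypothetical roles: identifier -> set of labels that match it.
ROLE_LABELS = {
    "associates": {"associate", "associated", "correlate"},
    "inhibits": {"inhibit", "inhibited", "block"},
}


def normalise_type(raw_type):
    """Return the identifier of the first role whose labels contain raw_type."""
    key = raw_type.lower()
    for role_id, labels in ROLE_LABELS.items():
        if key in labels:
            return role_id
    return None  # no matching role: the relationship stays un-normalised


print(normalise_type("associated"))
```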
![Page 24: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/24.jpg)
24
Did it work?
● Input:
– 17,396,436 MEDLINE abstracts.
● Sentences:
– 87,806,828.
● Sentences with pairs of PGx entities:
– 295,569.
● After pruning:
– 41,134 raw relationships: 21,050 gene-drug pairs and 20,084 gene-phenotype pairs.
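The reported counts can be sanity-checked with a few lines of arithmetic. The numbers are those on this slide; the variable names are arbitrary.

```python
abstracts = 17_396_436
sentences = 87_806_828
candidate_sentences = 295_569
gene_drug = 21_050
gene_phenotype = 20_084
raw_relationships = 41_134

# The two pair counts should add up to the total number of raw relationships.
assert gene_drug + gene_phenotype == raw_relationships

print(f"sentences per abstract: {sentences / abstracts:.2f}")
print(f"share of sentences with a PGx entity pair: {candidate_sentences / sentences:.4%}")
```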
![Page 25: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/25.jpg)
25
![Page 26: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/26.jpg)
26
Results
● The 200 most frequent raw relationship types:
– 80% of the extracted relationships.
● Creation of an ontology:
– built from the 200 most frequent relationship types and modified entities; called PHARE (PHArmacogenomics RElationships).
– 237 concepts and 76 roles.
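The share of relationships covered by the k most frequent types is a simple cumulative count. The sketch below shows the calculation on a made-up list of ten relationship types; it does not reproduce the study's figures.

```python
from collections import Counter


def top_k_coverage(types, k):
    """Fraction of all relationship instances covered by the k most frequent types."""
    counts = Counter(types)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no relationship types given")
    covered = sum(n for _, n in counts.most_common(k))
    return covered / total


# Made-up data: ten instances spread over four types.
toy = ["associate"] * 5 + ["inhibit"] * 3 + ["metabolize"] + ["induce"]
print(top_k_coverage(toy, 2))
```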
![Page 27: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/27.jpg)
27
Results (2)
![Page 28: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/28.jpg)
28
Results (3)
![Page 29: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/29.jpg)
29
![Page 30: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/30.jpg)
30
![Page 31: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/31.jpg)
31
Discussion (1)
● Identification of both PGx entities.
● Identification of PGx modified entities.
● Use of key entity lexicons for discovery and normalization of modified entities.
● Record and recognition of modified entities under very general textual conditions.
● Flexible, precise method.
![Page 32: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/32.jpg)
32
Discussion (2)
● Concern: lower recall due to the large corpus size.
– improve precision with full text parsing.
● Applicable to other domains.
– Human effort required for the ontology creation.
![Page 33: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/33.jpg)
33
Conclusions (1)
● New method for PGx relationship extraction.
● Use of key PGx entities to identify modified entities.
● Capture and normalization of raw relationships.
● Automatic labelling of parsed sentences.
![Page 34: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/34.jpg)
34
Conclusions (2)
● Creation of a knowledge base.
● Creation of relationship summaries between:
– Genes, drugs, phenotypes.
● Novel approach for PGx text processing.
![Page 35: Using text to build semantic networks for pharmacaogenomics2](https://reader035.vdocument.in/reader035/viewer/2022062514/55b29ba2bb61eb3f218b4690/html5/thumbnails/35.jpg)
35
Questions?
Ερωτήσεις;
Questions? (in French ^_^)
Preguntas?
質問 ?