Semantic Web Development for Traditional Chinese Medicine
DESCRIPTION
This thesis is motivated by a case study of SSME, in which we digitize and integrate the cultural assets of TCM and provide Web-based knowledge services to medical experts. The major quest is to turn cultural assets from China's long history into knowledge services that contribute to modern biomedicine. In our view, the essence of a knowledge service is cross-domain collaboration in knowledge discovery on the Web of data. The Service-Oriented Architecture enables interactions between Web agents, and the Semantic Web provides a framework for knowledge representation and integration, but the feasibility and benefits of Web-based collaborative knowledge discovery need further investigation. We propose a methodology named Semantic Graph Mining (SGM), which uses the semantic graph model to integrate graph mining and ontology reasoning in order to better analyze biomedical complex networks (an important KDD problem). Potential methods of SGM include Web resource ranking, semantic association discovery, frequent subgraph mining, and clustering. The effectiveness of these methods is investigated in use cases such as TCM semantic search, TCM formulae analysis, and drug-interaction analysis.
TRANSCRIPT
Semantic Web Development for Traditional Chinese Medicine
Tong Yu, Zhejiang University, China
July 15th, 2008, IAAI, Chicago, IL
Outline
• Overview of TCM Semantic Web
• Ontology Engineering and Reuse
• Semantic Mapping and Integration
• Semantic Query, Search and Navigation
• Semantic Graph Mining
• Summary
The Semantic Web: “A Giant Graph of Things”
• Based on the Internet and the Web
• Formal Semantics
  o Use URIs as names for things
  o RDF information about things available through HTTP URIs
  o Use RDF statements for semantic links between things
• Global network of databases
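The idea of naming things with URIs and linking them through RDF statements can be illustrated with a minimal sketch. It models each statement as a plain (subject, predicate, object) tuple rather than using an RDF library. The `example.org` namespace, the property names, and the helper `outgoing_links` are hypothetical placeholders, not identifiers from the TCM platform.

```python
from collections import defaultdict

# Hypothetical namespace: example.org URIs are placeholders, not real TCM identifiers.
EX = "http://example.org/tcm/"

# Each RDF statement is a (subject, predicate, object) triple whose parts are URIs.
triples = [
    (EX + "herb/Ginseng", EX + "prop/treats", EX + "syndrome/QiDeficiency"),
    (EX + "formula/SiJunZiTang", EX + "prop/contains", EX + "herb/Ginseng"),
    (EX + "formula/SiJunZiTang", EX + "prop/treats", EX + "syndrome/QiDeficiency"),
]

def outgoing_links(statements):
    """Index statements by subject so semantic links can be followed from any thing."""
    index = defaultdict(list)
    for subject, predicate, obj in statements:
        index[subject].append((predicate, obj))
    return index

links = outgoing_links(triples)
formula_links = links[EX + "formula/SiJunZiTang"]
```

Because every node is a URI, statements published by different databases about the same URI merge into one graph without further mapping.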
Project Background
• Preservation and Modernization of TCM
• Integrative Medicine
• Connecting the data
• A Student Project
The ultimate vision: Make a connection between TCM and modern medicine
User Scenario
TCM Ontology Platform
• Domain Categorization
  o The current TCM ontology contains 15 major categories for each sub-domain.
• Ontology Structure
  o A typing system as a concept hierarchy
  o A semantic network defining the associations between concepts
• Scale
  o More than 20,000 classes and 100,000 instances defined in the current ontology
• Access Control
  o A layered privilege mechanism that defines users as reader, editor, checker, and administrator.
• Service
  o Web APIs for ontology-based applications.
The 15 Categories Defined in the TCM Ontology
Ajax-based Ontology Viewer and Editor
Ontology Fusion and Reuse
Visualized Mapper: An Eclipse Plugin
Visualized Mapper: The Ajax-based Tool
• http://ec2-67-202-22-44.compute-1.amazonaws.com:8890/demo/mapper/
Semantic Search Portal
Semantic Graph Mining
• We envision intelligent agents that work on the Semantic Web of structured data and help their users solve problems. Such agents can
  o discover important Web resources
  o discern latent semantic associations
  o interpret interesting graph patterns.
• Existing methods of data mining, especially graph mining, can be adopted to implement these intelligent agents.
• We propose a methodology, called Semantic Graph Mining (SGM), for building agents that discover knowledge on the Semantic Web.
TCM Semantic Graph
The Network of Herbs
The Process for Analyzing the Network of Herbs
1. Data Modeling
2. Data Transformation & Integration
3. Entity Disambiguation
4. Interaction Identification
5. Network Mapping
6. Network Analysis
Semantic Graph Resource Importance
• The in-degree centrality C_I of a resource is the weighted sum of the statements in which the resource is the object; the out-degree centrality is the weighted sum of the statements in which the resource is the subject.
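A minimal sketch of weighted degree centrality over a list of statements follows. The slide does not say how statements are weighted, so the per-predicate weight table, the default weight of 1.0, and all resource names here are assumptions made for illustration.

```python
from collections import defaultdict

def degree_centrality(statements, weights):
    """Return (in_degree, out_degree) dicts holding weighted statement sums."""
    c_in, c_out = defaultdict(float), defaultdict(float)
    for subject, predicate, obj in statements:
        w = weights.get(predicate, 1.0)  # assumed default weight for unlisted predicates
        c_out[subject] += w  # the statement has this resource as subject
        c_in[obj] += w       # the statement has this resource as object
    return dict(c_in), dict(c_out)

# Hypothetical toy statements and predicate weights.
statements = [
    ("formulaA", "contains", "herbX"),
    ("formulaB", "contains", "herbX"),
    ("herbX", "treats", "syndromeY"),
]
weights = {"contains": 1.0, "treats": 2.0}
c_in, c_out = degree_centrality(statements, weights)
```

In this toy graph `herbX` is the object of two `contains` statements and the subject of one `treats` statement, so both of its centralities depend directly on the chosen weights.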
Semantic Graph Resource Importance
• The Closeness Centrality of a resource r is defined as the inverse of the sum of the distances from r to all other resources.
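The closeness definition can be sketched with a breadth-first search. This sketch assumes that distance means hop count, that the graph is given as an undirected adjacency dict, and that unreachable resources are simply left out of the sum; none of these choices is stated on the slide.

```python
from collections import deque

def closeness(adj, r):
    """Inverse of the summed hop distances from r to every reachable resource."""
    dist = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    # An isolated resource has no distances to sum; report 0.0 by convention.
    return 1.0 / total if total else 0.0

# Hypothetical path graph a - b - c.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
closeness_b = closeness(adj, "b")
closeness_a = closeness(adj, "a")
```

The middle resource `b` is one hop from both neighbours, so it scores higher than the endpoint `a`, which is two hops from `c`.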
Semantic Graph Resource Importance
• The Betweenness Centrality of a resource r is defined as the fraction of shortest paths between pairs of other resources that pass through r.
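One way to read the betweenness definition is sketched below: for every ordered pair of other resources, take the share of their shortest paths that run through r, then average over the connected pairs. The averaging, the hop-count distance, and the undirected adjacency dict (which must list every resource as a key) are assumptions, and the function names are illustrative.

```python
from collections import deque
from itertools import permutations

def bfs_counts(adj, source):
    """Return hop distances and numbers of shortest paths from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma

def betweenness(adj, r):
    """Average share of shortest paths between other resources that pass through r."""
    info = {n: bfs_counts(adj, n) for n in adj}
    dist_r, sigma_r = info[r]
    others = [n for n in adj if n != r]
    share_sum, pairs = 0.0, 0
    for s, t in permutations(others, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are not connected
        pairs += 1
        # r lies on a shortest s-t path exactly when the two legs add up to d(s, t).
        if r in dist_s and t in dist_r and dist_s[r] + dist_r[t] == dist_s[t]:
            share_sum += sigma_s[r] * sigma_r[t] / sigma_s[t]
    return share_sum / pairs if pairs else 0.0

# Hypothetical diamond graph: a - b - d and a - c - d.
diamond = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
betweenness_b = betweenness(diamond, "b")
```

In the diamond, `b` carries one of the two shortest paths between `a` and `d` and none of the paths between the other pairs, so its score is small but non-zero.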
Frequent Semantic Subgraph
[Figure: example semantic subgraphs SG1, SG2, SG3, SG4, and SG5]
Frequent Semantic Subgraph
Pattern Interpretation
Interactive Mining of TCM Knowledge
Conclusion
• We took the first systematic approach to leverage the progress of Biomedical Informatics to address the modernization of TCM.
• Domain experts rated the platform’s major technical features as original and productive for Drug Safety and Efficacy analysis.
• This case study demonstrates the Semantic Web’s advantages in representation, integration, and discovery of knowledge with complex domain models.
• The work contributes to the Preservation and Modernization of TCM as intangible cultural heritage.
Reference
• TCM Ontology Engineering and Reuse
  o Y. Mao et al. Dynamic Sub-Ontology Evolution for Traditional Chinese Medicine Web Ontology. Journal of Biomedical Informatics, 2008 (in progress).
  o Y. Mao et al. Sub-Ontology Based Resource Management for Web-based e-Learning. IEEE Transactions on Knowledge and Data Engineering, 2008 (in progress).
• Data Mapping and Integration
  o Zhao-hui Wu, Hua-jun Chen. Semantic Grid: Model, Methodology, and Applications (monograph). Co-published by Zhejiang University Press and Springer-Verlag GmbH, 2008.
  o Huajun Chen et al. RDF/RDFS-based Relational Database Integration. ICDE 2006.
  o Huajun Chen et al. From Legacy Relational Databases to the Semantic Web: an In-Use Application for Traditional Chinese Medicine. ISWC 2006.
Reference
• Semantic Search, Query, and Navigation
  o Huajun Chen et al. Towards semantic e-science for traditional Chinese medicine. BMC Bioinformatics, 8(Suppl 3):56, 2007.
• Knowledge Discovery for TCM
  o Yi Feng et al. Knowledge discovery in traditional Chinese medicine: State of the art and perspectives. AI in Medicine, 38(3):219-236, 2006.
  o Xuezhong Zhou et al. Integrative mining of traditional Chinese medicine literature and MEDLINE for functional gene networks. AI in Medicine, 41:87-104, 2007.
• Semantic Graph Mining
  o Tong Yu et al. Semantic Graph Mining for Biomedical Complex Network Analysis. WWW ’08 Workshops: HCLS.
  o Huajun Chen et al. Semantic Graph Mining for Biomedical Complex Network Analysis. Briefings in Bioinformatics (in progress).
Thanks for your time!