Developing a Software Package for Conceptualizing Molecular Findings
Xinghua Lu, Harry Hochheiser & Vicky Chen
Dept Biomedical Informatics
Motivation and Goal
• Bioinformatics research often produces a long list of genes of potential interest, e.g., genes differentially expressed in a cancer tumor
• A key task is to understand the major functional themes of the genes
– Input: a list of genes of interest
– Output: genes divided into functional groups; the functional theme of each group is represented by a suitable biological concept
Strategies
• Currently, gene function is annotated with specific concepts from the Gene Ontology
• The Gene Ontology consists of a set of concepts organized in a hierarchy—a directed acyclic graph (DAG)
• Given a gene list, find the genes' corresponding annotations on the graph
• Group genes whose functions are closely related to each other within the graph and search for a general concept to summarize the genes
• Quantitatively assess the information loss and strive to minimize it
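The strategy above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy ontology, the gene names, and in particular the measure of information loss, which the slides do not define. Here the information content of a concept is taken as the negative log of the fraction of annotated genes at or below it, and the loss of summarizing a gene by a more general concept is the difference in information content. This is one common choice, not necessarily the one used in the project.

```python
import math

# Toy ontology (hypothetical names): child concept -> set of parent concepts.
# Together the parent links form a directed acyclic graph.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "glucose_metabolism": {"metabolism"},
    "kinase_signaling": {"signaling"},
}

# Toy annotations (hypothetical): gene -> its most specific concept.
ANNOTATION = {
    "geneA": "lipid_metabolism",
    "geneB": "glucose_metabolism",
    "geneC": "glucose_metabolism",
    "geneD": "kinase_signaling",
}


def ancestors(term):
    """Return the term itself plus every concept reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def information_content(term):
    """Assumed measure: -log2 of the fraction of genes annotated at or below term.

    Only meaningful for concepts that cover at least one annotated gene.
    """
    covered = sum(1 for t in ANNOTATION.values() if term in ancestors(t))
    return -math.log2(covered / len(ANNOTATION))


def summarize(genes):
    """Pick the most informative shared ancestor and the total information lost."""
    common = set.intersection(*(ancestors(ANNOTATION[g]) for g in genes))
    best = max(common, key=information_content)
    loss = sum(
        information_content(ANNOTATION[g]) - information_content(best)
        for g in genes
    )
    return best, loss


if __name__ == "__main__":
    # Closely related genes share a specific ancestor, so little is lost.
    print(summarize(["geneA", "geneB", "geneC"]))
    # Unrelated genes only share the root, so all specific information is lost.
    print(summarize(["geneA", "geneD"]))
```

In this sketch, grouping the three metabolism genes yields "metabolism" as the summary concept, while forcing geneA and geneD into one group falls back to the root and loses more information. That difference is the signal a grouping procedure would try to minimize.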
Finding Coherent Modules
What is involved
• Learning the Python programming language
• Object-oriented programming, graph representation, and graph algorithms
• Information theory
• Most of the functionality is already developed
• Need to organize the code into a package or toolkit
– Define API
– Learn the function of existing code, modify if needed
– Implement API by wrapping existing code
– Package into a Python module
– Documentation
– Submit to BioPython community
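The "define API, then wrap existing code" steps might look roughly like the sketch below. All names here (GeneGroup, ConceptSummarizer, _legacy_group_genes) are hypothetical placeholders; the slides do not describe the actual API or the existing functions, so the stand-in backend simply returns a single dummy group.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Shape assumed for the output of the existing research code:
# (concept name, genes in the group, information loss).
LegacyResult = List[Tuple[str, List[str], float]]


@dataclass
class GeneGroup:
    """Public result type: one functional group and its summarizing concept."""
    concept: str
    genes: List[str]
    information_loss: float


def _legacy_group_genes(gene_list: Sequence[str]) -> LegacyResult:
    """Stand-in for the existing code; the real function would be imported."""
    return [("placeholder_concept", list(gene_list), 0.0)]


class ConceptSummarizer:
    """Thin wrapper that hides the legacy call behind a stable interface."""

    def __init__(
        self,
        backend: Callable[[Sequence[str]], LegacyResult] = _legacy_group_genes,
    ):
        self._backend = backend

    def summarize(self, genes: Sequence[str]) -> List[GeneGroup]:
        if not genes:
            raise ValueError("gene list must not be empty")
        # Convert the legacy tuples into the documented result type.
        return [
            GeneGroup(concept, members, loss)
            for concept, members, loss in self._backend(genes)
        ]


if __name__ == "__main__":
    for group in ConceptSummarizer().summarize(["TP53", "BRCA1"]):
        print(group)
```

Passing the backend in as a parameter keeps the public interface fixed while the existing code is studied and modified, and makes the wrapper easy to test with a fake backend.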