biological storytelling: a software tool for biological information organization based upon...
Post on 11-Aug-2014
Description: Allan Kuchinsky, Kathy Graham, David Moh, Annette Adler, Ketan Babaria, Michael L. Creech; Biological storytelling: a software tool for biological information organization based upon narrative structure; ACM Siggroup Bulletin 01/2002; DOI:10.1145/1556262.1556315 ISBN: 1-58113-537-8. Presented by Anjani K Dhrangadhariya, Junior Student, M.S. Life Science Informatics, Bonn-Aachen International Center for Information Technology (B-IT)
A Software Tool for Biological Information Organization Based upon Narrative Structure
The way research really is
Your preference?
✔ Protein A with 2 active sites interacts with Protein B having 1 active site and 1 allosteric site, in presence of co-enzyme X under high concentration of XYZ in ABC pathway and BBC temperature in XOX fish found in XXX ocean.
● Drug design and discovery, and the treatment of life-threatening diseases, is a mind-mapping process: it involves piecing together the story of how genes and proteins behave in a pathway (in disease pathogenesis).
● The paradigm is personalized medicine, which takes an individual's particular genomic information into account and tailors drugs to their genetic makeup.
● Microarray: Made it possible for researchers to correlate gene expression with disease progression, screen for mutations, and treat patients according to their genetic profiles.
● Yu X, Schneiderhan-Marra N, Joos TO.; Protein microarrays for personalized medicine.; Clin Chem. 2010 Mar;56(3):376-87. doi: 10.1373/clinchem.2009.137158. Epub 2010 Jan 14.
DNA microarrays made the paradigm shift in medicine possible and GAVE A HUGE AMOUNT OF RAW DATA that remains to be made sense of by researchers
Analysis and Synthesis
● Synthesis is a Greek word meaning "to put together"
● Procedure to combine separate elements or components in order to form a coherent whole
● Synthesis and analysis tasks go hand in hand and complement each other.
● Every synthesis is built upon the results of a preceding analysis, and every analysis requires a subsequent synthesis in order to verify and correct its results.
● But this does not seem to be the case in bioinformatics, where a huge number of analysis tools are available but synthesis tools are lacking.
● Tom Ritchey.; Analysis and synthesis: On scientific method – based on a study by Bernhard Riemann.; Systems Research Volume 8, Issue 4, pages 21–41, December 1991; DOI: 10.1002/sres.3850080402
(1) Keeping track of the diverse pieces of information,
(2) Organizing and using the diverse information,
(3) Formulating hypotheses and higher-level explanations,
(4) Sharing the information
Findings from User Research
● The facts gathered and hypotheses formed during the synthesis task were usually stored as printed web pages kept in binders, but these did not help at a later date when the information was needed. (Finding a needle in a haystack!)
● User research
– Results showed that researchers have particular difficulty keeping track of diverse pieces of information. (Hence the problem in the synthesis task was identified.)
● From user studies and other collaborative design sessions with researchers at the Cancer Genetics lab, a number of themes for navigating and interacting with this huge amount of data were identified AND...
● Vicki O’Day, Annette Adler, Allan Kuchinsky, Anna Bouch.; When Worlds Collide: Molecular Biology as Interdisciplinary Collaboration; Proceedings of European Conference on Computer Supported Collaborative Work (ECSCW2001), Bonn, Germany, 2001.
A Picture emerged which explained the Synthesis tasks involved in
Piecing together the story
Aspects of User Research
● Connecting the dots “mind mapping”
● Information is in free form (Structure-less)
● Many researchers, many fields and many hypotheses
● Group work, multiple perspectives
● Solving the puzzle together
● Reasoning over data
● Sharing of data
The Role of Narrative Structure
Thorndyke: Comprehensibility and recall were a function of the amount of inherent plot structure in the story, independent of the actual content.
Schank: When prior experience is indexed cleverly, we can call it to mind to understand a current situation. This process can lead to brand new insights.
● Thorndyke, P.W.; Cognitive Structures in Comprehension and Memory of Narrative Discourses; Cognitive Psychology, 9, 1977, pp. 77-110
● Schank, R.; Tell Me a Story: Narrative and Intelligence; Northwestern University Press, 1990.
Middleton and Edwards: Telling stories preserves potentially arcane and
idiosyncratic pieces of information ...
Erickson: Depicts storytelling as an integral part of design. Stories have informalities that are well suited to the lack of certainty that characterises much design-related knowledge. Stories also provide concrete examples that people from vastly different backgrounds can relate to.
● Middleton, D., and Edwards, E, Collective Remembering; SAGE Publications;1990.
● Erickson, T.; Notes on Design Practice: Stories and Prototypes as Catalysts for Communication, in Scenario Based Design: Envisioning Work and Technology in Systems Development (ed. J. Carroll), Wiley, 1995, pp. 37-58
Birth of the Tool
After applying the above perspectives to the user findings and drawing analogies, a prototype software tool was developed around the concept of STORYTELLING. It uses narrative structure as a framework and supports biologists in their synthesis tasks:
✔ Organize and use information
✔ Build hypotheses from data
✔ Construct alternative explanations of biological processes
✔ Share diverse information
(Labs are generally interdisciplinary, with each researcher using field-specific terms, e.g. KB = kilobyte for a computer scientist and KB = kilobase for a biologist)
OVERVIEW OF THE SOFTWARE
Features provided by BIOLOGICAL STORYTELLING
● Free form database model
● Narrative framework
● Concrete social aspect of information sharing
● Creating alternative hypotheses with reasoning
● Semi-automatic clustering of biological entities
● Extensive cross references to public and proprietary data and literature
There are three connected entities in the free-form data model of Biological Storytelling: Items, Collections and Stories.
Items
● Basic unit of information
● Represent biological entities such as genes, proteins or other gene products like a transcript
● Items contain detailed information about a biological entity in the form of links to different public and proprietary databases and literature information
● Data can be automatically loaded into Items (tabular gene expression data)
● Researchers can also add information manually
● A viewer for datasets of Items loads raw data from microarray experiments
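The automatic loading of tabular expression data into Items might look roughly like the sketch below. The `Item` fields, the `load_items` helper, the tab-separated layout (gene name in the first column, one column per condition) and the toy values are all assumptions made for illustration; the slides do not describe the tool's actual file format or classes.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Item:
    """One biological entity plus what is known about it (hypothetical fields)."""
    name: str
    expression: dict[str, float] = field(default_factory=dict)  # condition -> value
    links: list[str] = field(default_factory=list)   # database / literature URLs
    notes: list[str] = field(default_factory=list)   # manually added information


def load_items(tabular_text: str) -> dict[str, Item]:
    """Build one Item per row of a tab-separated expression table.

    Assumed layout: the first column is the entity name, the remaining
    columns are expression values, one column per experimental condition.
    """
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    conditions = next(reader)[1:]  # header row minus the name column
    items: dict[str, Item] = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        values = {cond: float(val) for cond, val in zip(conditions, row[1:])}
        items[row[0]] = Item(name=row[0], expression=values)
    return items


# Toy table with made-up numbers, only to exercise the loader.
table = "gene\tcontrol\ttumor\nMyogenin\t1.0\t4.2\nMyoD\t0.8\t3.9\n"
items = load_items(table)
items["Myogenin"].notes.append("Manually added remark")  # researcher input
print(sorted(items))
```

Keeping manual notes and links on the same object as the loaded values mirrors the slide's point that an Item mixes automatically loaded and hand-entered information.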
Collections
● User-created, free-form sets of Items
● Analogous to clustering
● The Collections manager component provides a tree view of Collections (analogous to the tree view panel in Windows Explorer)
● Data can be manually or semi-automatically loaded into Collections
● Collections are malleable (one can split, merge, add to Collections and/or move Items from one Collection to another)
● Collections can contain other Collections as well as Items (nested)
● Repositories for links to experimental data and literature
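The malleability described above (split, merge, move, nesting) can be sketched as a small tree of sets. The class and function names here are hypothetical and chosen only for illustration; the entity names are taken from the example story in these slides.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """A free-form set of Item names that may also hold nested Collections."""
    name: str
    items: set[str] = field(default_factory=set)
    children: list["Collection"] = field(default_factory=list)

    def all_items(self) -> set[str]:
        """Every Item here and, recursively, in nested Collections."""
        found = set(self.items)
        for child in self.children:
            found |= child.all_items()
        return found


def merge(name: str, a: Collection, b: Collection) -> Collection:
    """Combine two Collections into a new one; the originals are untouched."""
    return Collection(name, a.items | b.items, a.children + b.children)


def split(source: Collection, name: str, wanted: set[str]) -> Collection:
    """Move the wanted Items out of source into a new Collection."""
    moved = source.items & wanted
    source.items -= moved
    return Collection(name, moved)


def move(item: str, src: Collection, dst: Collection) -> None:
    """Move one Item from src to dst; raises KeyError if src lacks it."""
    src.items.remove(item)
    dst.items.add(item)


upregulated = Collection("upregulated", {"Myogenin", "MyoD", "My14"})
myogenic = split(upregulated, "myogenic factors", {"Myogenin", "MyoD"})
upregulated.children.append(myogenic)  # nest the new Collection
print(sorted(upregulated.all_items()))
```

Sets are used because the slides describe Collections as free-form groupings rather than ordered lists; a real tool would presumably also track the links to experimental data and literature.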
Stories
● Narrative structure to represent the state of a biological hypothesis
● Textually and graphically building up Biological Stories (Story Editor)
● Allows a story to be represented in the form of themes, players and explanations
● Represent paths explored and alternative hypotheses formulated
● Repositories for links to experimental data and literature
● PAX3-FKHR oncogene
● Activates Myogenin and MyoD
● Action of Myogenin and MyoD induces My14
● Failure of muscle cells to differentiate and exit the cell cycle = uncontrolled proliferation = cancer
● The top-level element in the Story Editor is a STORY, which contains:
– A Theme
– A set of Players
– A set of Explanations
Theme
● Works as an abstract (as in a publication)
● Entered in free form text in Story Editor
● “PAX3 oncogene activates a myogenic transcription program, causing alveolar rhabdomyosarcoma”
Players
● Items and Collections that play a role in Biological Stories
● e.g. genes and proteins that interact in pathways; in effect, the characters playing a role in a story
● Myogenin and MyoD
Explanations
● Form the main plot of the Story
● Contain descriptive elements about a Biological Story
● An Explanation contains
– An optional theme
– A set of players
– A set of interactions
– Annotations which support or oppose the claims made in explanation
Description (could be a URL for a citation)
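The Story structure outlined above (a theme, players, and explanations that hold interactions and supporting or opposing annotations) can be sketched as nested records. All class and field names are assumptions made for illustration, not the tool's actual schema; the example reuses the PAX3 theme and entity names from these slides, and the annotation text is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Interaction:
    source: str  # Player name
    verb: str    # relationship, e.g. "activates"
    target: str  # Player name


@dataclass
class Annotation:
    text: str
    supports: bool      # True = supports the claim, False = opposes it
    citation: str = ""  # could be a URL for a citation


@dataclass
class Explanation:
    players: list[str]
    interactions: list[Interaction] = field(default_factory=list)
    annotations: list[Annotation] = field(default_factory=list)
    theme: str = ""  # left empty when the Explanation has no theme of its own

    def unknown_players(self) -> set[str]:
        """Names used in interactions that are not declared as players."""
        used = {i.source for i in self.interactions}
        used |= {i.target for i in self.interactions}
        return used - set(self.players)


@dataclass
class Story:
    theme: str
    players: list[str]
    explanations: list[Explanation] = field(default_factory=list)


cast = ["PAX3", "Myogenin", "MyoD", "My14"]
story = Story(
    theme="PAX3 oncogene activates a myogenic transcription program, "
          "causing alveolar rhabdomyosarcoma",
    players=cast,
)
explanation = Explanation(
    players=cast,
    interactions=[
        Interaction("PAX3", "activates", "Myogenin"),
        Interaction("PAX3", "activates", "MyoD"),
        Interaction("Myogenin", "induces", "My14"),
        Interaction("MyoD", "induces", "My14"),
    ],
    annotations=[Annotation("placeholder note", supports=True)],
)
story.explanations.append(explanation)
print(explanation.unknown_players())  # prints set()
```

The `unknown_players` check is one small way such a structure could keep an explanation consistent with its declared cast; alternative hypotheses would simply be further `Explanation` entries on the same `Story`.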
Putting the story together graphically
People think Graphically rather than Textually
Diagram Editor tool
Nodes = Players (Nouns)
Edges = Interactions (Verbs)
Diagram Editor Tool
● More general
● Predefined verbs for relationships between genes
– Inhibits
– Binds
– Promotes
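The nodes-as-nouns, edges-as-verbs idea maps naturally onto a labelled directed graph. The sketch below is an assumption about how such a diagram could be stored and read back as plain sentences; the `Diagram` class is hypothetical, the player names are placeholders, and rejecting verbs outside the predefined list is a simplification of this sketch rather than documented behaviour of the tool.

```python
from collections import defaultdict

# The three predefined verbs named on the slide.
PREDEFINED_VERBS = ("inhibits", "binds", "promotes")


class Diagram:
    """Directed graph: nodes are Players (nouns), edges are Interactions (verbs)."""

    def __init__(self) -> None:
        # player -> list of (verb, other player), kept in insertion order
        self._edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def connect(self, source: str, verb: str, target: str) -> None:
        """Add a labelled edge; only the predefined verbs are accepted here."""
        if verb not in PREDEFINED_VERBS:
            raise ValueError(f"unknown verb: {verb!r}")
        self._edges[source].append((verb, target))

    def players(self) -> set[str]:
        """All nodes that appear at either end of an edge."""
        nodes = set(self._edges)
        for outgoing in self._edges.values():
            nodes.update(target for _, target in outgoing)
        return nodes

    def sentences(self) -> list[str]:
        """Read every edge back as a noun-verb-noun sentence."""
        return [
            f"{source} {verb} {target}."
            for source, outgoing in self._edges.items()
            for verb, target in outgoing
        ]


diagram = Diagram()
diagram.connect("Protein A", "binds", "Protein B")
diagram.connect("Protein A", "inhibits", "Protein C")
print("\n".join(diagram.sentences()))
```

Reading edges back as sentences is only the simplest possible bridge between the graphical and textual views of a story.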
Semantic Overlays
● A way of juxtaposing Biological Stories with detailed experimental data
● Validate higher-level hypotheses
Annotation and Citations
● Textual notes
● Arbitrary list of citations (URL)
● Each citation can again have an arbitrary textual note attached to it
● For annotating biological elements an Object Editor is provided
Support for Group Work
● Each annotation is tagged with user name and timestamp
● Provides Support and Oppose elements in the Story Editor, which can accommodate alternative hypotheses
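Tagging each annotation with its author and creation time, and letting it either support or oppose a claim, could be modelled as below. The `GroupAnnotation` record, the `tally` helper, and the user names are hypothetical; the slides state only that user name, timestamp, and Support/Oppose elements exist.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

STANCES = ("support", "oppose")


@dataclass(frozen=True)
class GroupAnnotation:
    """A note on a claim, stamped with who wrote it and when."""
    author: str
    stance: str  # "support" or "oppose"
    note: str
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        if self.stance not in STANCES:
            raise ValueError(f"stance must be one of {STANCES}")


def tally(annotations: list[GroupAnnotation]) -> dict[str, int]:
    """Count how many annotations support and how many oppose a claim."""
    counts = {stance: 0 for stance in STANCES}
    for annotation in annotations:
        counts[annotation.stance] += 1
    return counts


notes = [
    GroupAnnotation("user_a", "support", "Consistent with our array data"),
    GroupAnnotation("user_b", "oppose", "Not seen in the replicate"),
    GroupAnnotation("user_c", "support", "Matches a published report"),
]
print(tally(notes))  # prints {'support': 2, 'oppose': 1}
```

Making the record immutable keeps the original author and timestamp intact, which seems in the spirit of preserving multiple perspectives side by side.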
● Conflict about use of terms in Story Editor.
– Some researchers liked literary terms while others liked scientific terms
● Discussions on the importance of the free-form data model (which is rather malleable)
● User suggestion: be able to group experiments into Collections, not just Items
● Using Players globally vs. tagging them with alternatives
● Storyspace (EastGate Systems)
● Lotus Agenda
● STKE, BIND, KEGG, EcoCyc, TransPath, SPAD
● The eLab Book
● Cell Space software
Better suited to microarray experiments than to research in general
Data format for loading (Tabular data for automatic loading into software)
✔ Provision for multiple data types
✔ Evolution to support “Systems approach”
✔ Support Scientific publications
✔ Converting diagrammatic representations into textual stories via the use of parsers
✔ Utilization of data mining methods. (Semi-automatic in populating collections)
Image sources
● http://www.clipartheaven.com/clipart/kids_stuff/images_%28a_-_f%29/boy_-_happy_3.gif
● Snapshots of papers
● Snapshots of public databases:
➔ PubMed http://www.ncbi.nlm.nih.gov/pubmed
➔ Pubchem: https://pubchem.ncbi.nlm.nih.gov/
➔ Protein DataBank: http://www.rcsb.org/pdb/home/home.do