Advanced Editing of Pathway/Genome Databases Ron Caspi

Introduction
know your α, β
Text wrapper: need two newlines to force a new paragraph
Text wrapper: Never leave empty spaces at the end of a line
An internal link to a reaction frame will print the reaction equation
To print an enzymatic activity name use an internal link to the enzymatic activity frame ID, not the enzyme frame ID (important when an enzyme is multifunctional e.g. CPLX-6934).
When providing multiple citations, use |CITS:[PMID1][PMID2]| (rather than |CITS:[PMID1]|, |CITS:[PMID2]|).
Special characters:
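The text-wrapper and citation rules listed earlier on this slide (no trailing spaces, one merged |CITS:[PMID1][PMID2]| construct) can be checked mechanically before text is pasted into a comment field. The following is a minimal Python sketch, not part of Pathway Tools; the function name `tidy_comment` is hypothetical, and it assumes citations use the simple bracketed form shown above.

```python
import re

# Two citation constructs separated only by optional commas or spaces.
ADJACENT = re.compile(
    r"\|CITS:((?:\[[^\]|]*\])+)\|[ ,]*\|CITS:((?:\[[^\]|]*\])+)\|"
)


def tidy_comment(text: str) -> str:
    """Strip trailing spaces and merge adjacent citation constructs."""
    # Rule: never leave empty spaces at the end of a line.
    lines = [line.rstrip(" \t") for line in text.split("\n")]
    text = "\n".join(lines)
    # Rule: use |CITS:[A][B]| rather than |CITS:[A]|, |CITS:[B]|.
    # Repeat until stable so that runs of three or more are merged too.
    previous = None
    while previous != text:
        previous = text
        text = ADJACENT.sub(r"|CITS:\1\2|", text)
    return text
```

Blank lines are left untouched, so the two newlines that force a new paragraph are preserved.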
Hypothetical Reactions, Excluded Enzymes
Enzymes Not Used: useful when an enzyme is associated with a reaction but does not participate in a specific pathway, for example a catabolic enzyme in a biosynthetic pathway (e.g. EC, ornithine carbamoyltransferase)
Hypothetical reactions: useful when a pathway step is proposed, but has not been proven
Specified in the Pathway Info Editor
Link pathways to upstream or downstream pathways with pathway links.
Keep pathways simple
is composed of:
catechol degradation II
mandelate degradation I
benzoate degradation (aerobic)
Using the Frame Editor
The frame editor is powerful, but dangerous. Use it only when there is no alternative, for example:
Replacing an enzyme or reaction in an enzymatic-reaction frame
Removing mistakes from pathway frames, such as predecessor pairs that the software ignores.
Removing duplicated values from slots that should only have a single value (OFFICIAL-EC?)
Investigating orphan enzymatic-reaction frames reported by the Consistency Checker
Adenosylmethionine decarboxylase is first synthesized as a proenzyme, and then self-cleaves into two smaller polypeptides. Each cleavage product forms a homotetramer, and the two complexes form a heterooctamer.
A combination of editors enables creation of such multi-level complexes.
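The nesting described above can be pictured as a small data structure: two polypeptides, one homotetramer of each, and a top-level complex containing both tetramers. This is an illustrative Python sketch only; the class names and frame IDs are hypothetical and do not reflect how Pathway Tools stores complexes internally.

```python
from dataclasses import dataclass, field


@dataclass
class Polypeptide:
    frame_id: str


@dataclass
class Complex:
    frame_id: str
    # Each component is paired with its coefficient (copies per complex).
    components: list = field(default_factory=list)

    def polypeptide_count(self) -> int:
        """Count polypeptide chains, descending through nested complexes."""
        total = 0
        for part, coefficient in self.components:
            if isinstance(part, Complex):
                total += coefficient * part.polypeptide_count()
            else:
                total += coefficient
        return total


# Hypothetical frame IDs, for illustration only.
product_a = Polypeptide("CLEAVAGE-PRODUCT-A")
product_b = Polypeptide("CLEAVAGE-PRODUCT-B")
tetramer_a = Complex("TETRAMER-A", [(product_a, 4)])
tetramer_b = Complex("TETRAMER-B", [(product_b, 4)])
octamer = Complex("HETEROOCTAMER", [(tetramer_a, 1), (tetramer_b, 1)])
```

Here `octamer.polypeptide_count()` sums four chains from each tetramer, matching the heterooctamer described in the text.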
Tutorial: Creating Protein Complexes
Instance frames describe specific objects (e.g. a specific gene)
Class frames describe general types of biological objects (e.g. the class of all genes)
Proteins that are substrates of MetaCyc reactions are classes
Every compound with an “R” in its structure should be a class
Open the compound editor
Rename the frame to follow class name convention (if necessary)
Modify the common name to start with “a”
Modification of MetaCyc classes is considered a schema change, and will be overwritten during the next update!
Only use this procedure to correct curation errors that were introduced in your PGDB!
Tutorial: the Ontology Editor
Consistency Checking should be performed routinely (every few months), and detected problems should be addressed
Bad Links
MetaCyc pathways are extensively linked to other pathways. When new PGDBs are created by PathoLogic, these links are still there, even if they point to pathways that are not present in the new PGDB. Such links are removed only by the Consistency Checker.
Consistency Checker – Manual Tasks
Example: create an empty |FRAME: | construct, then run the task “Check Frame References”
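To make the kind of problem concrete, the sketch below scans comment text for |FRAME: | constructs and reports those that are empty or that name a frame not in a known set. It is a stand-alone Python illustration, not the Consistency Checker's actual implementation; the function name and the frame IDs used with it are hypothetical, and only the simple |FRAME: ID| form is handled.

```python
import re

# Matches |FRAME: ID| constructs; the ID part may be empty or blank.
FRAME_REF = re.compile(r"\|FRAME:\s*([^|\s]*)\s*\|")


def check_frame_references(text, known_ids):
    """Return (empty_count, unknown_ids) for the |FRAME: | constructs in text."""
    empty = 0
    unknown = []
    for frame_id in FRAME_REF.findall(text):
        if not frame_id:
            # An empty construct such as |FRAME: | points at nothing.
            empty += 1
        elif frame_id not in known_ids:
            # A non-empty ID that does not exist in this database.
            unknown.append(frame_id)
    return empty, unknown
```

Calling it on text containing one valid reference, one empty construct, and one reference to a missing frame returns a count of one empty construct and a list with the one missing ID.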
(optional inclusion of enzymes and genes)
Edit => Add Pathway to File Export List
File => Export => Selected Pathways to Lisp-format File
To import a pathway from file:
File => Import => Pathways from File
To export a pathway directly to another PGDB (both PGDBs must be installed on the same system):
Edit => Export Pathway to DB
Moving Objects Between PGDBs
The following commands will import a frame from MetaCyc to EcoCyc:
Both databases must be open before this will work.
(import-compounds '(CPD-ID) (kb-of-organism 'meta) (kb-of-organism 'ecoli))
(import-reactions '(ID-RXN) (kb-of-organism 'meta) (kb-of-organism 'ecoli))
(import-proteins '(ID-MONOMER) (kb-of-organism 'meta) (kb-of-organism 'ecoli))
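When many frames need to be moved, the forms above can be generated rather than typed by hand. The Python sketch below only builds the text of one Lisp form per frame ID, using the three function names and the `kb-of-organism` arguments exactly as they appear above; the helper name `import_form` and its parameters are hypothetical, and the resulting forms would still have to be pasted into the Lisp listener with both databases open.

```python
# Importer names as they appear in the commands above.
IMPORTERS = {
    "compound": "import-compounds",
    "reaction": "import-reactions",
    "protein": "import-proteins",
}


def import_form(kind, frame_id, source="meta", target="ecoli"):
    """Build the text of one Lisp import form for a single frame ID."""
    function = IMPORTERS[kind]
    return (
        f"({function} '({frame_id}) "
        f"(kb-of-organism '{source}) (kb-of-organism '{target}))"
    )


def import_forms(kind, frame_ids, source="meta", target="ecoli"):
    """Build one form per frame ID, in the order given."""
    return [import_form(kind, fid, source, target) for fid in frame_ids]
```

Generating one form per ID keeps each line identical in shape to the commands shown on the slide, so nothing is assumed about how the importers treat longer lists.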
Exporting Graphics
You can save any screen as a vector-based PostScript file by using File -> Print
The PS files are easily converted to PDF by Adobe Distiller (part of the Acrobat Pro package)
Graphics programs like Corel Draw or Illustrator can open the PDF files and let you manipulate the graphics
The software also generates two posters, the cellular overview and the genome poster. Both are generated in PostScript format.
To define a new external database link:
File → Create → External Database Description
Enter frame name
To edit an existing link:
Right-click on a link (from a Navigator page), and select “Edit External Database Info”
Creating links to a PGDB
The Registry – Uploading Your PGDB