Advanced Editing of Pathway/Genome Databases

Post on 25-Feb-2016

Page 1: Advanced Editing of Pathway/Genome Databases

SRI International Bioinformatics

Advanced Editing of Pathway/Genome Databases

Ron Caspi

Page 2: Advanced Editing of Pathway/Genome Databases

General Curation

Page 3: Advanced Editing of Pathway/Genome Databases

User Preferences

Page 4: Advanced Editing of Pathway/Genome Databases

Create and Use Author and Organization Frames

Page 5: Advanced Editing of Pathway/Genome Databases

Using the Text Editor

● Formatting
  ● <i>italic</i>, <b>bold</b>
  ● Know your &alpha;, &beta;
● Text wrapper: two newlines are needed to force a new paragraph
● Text wrapper: never leave empty spaces at the end of a line
● An internal link to a reaction frame will print the reaction equation
● To print an enzymatic activity name, use an internal link to the enzymatic activity frame ID, not the enzyme frame ID (important when an enzyme is multifunctional, e.g. CPLX-6934)
● When providing multiple citations, use |CITS:[PMID1][PMID2]| rather than |CITS:[PMID1]|, |CITS:[PMID2]|
● Special characters:
  ● Ångström: &Aring; (Å)
  ● Degree: &deg; (°)
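The formatting rules above lend themselves to a quick automated check before text is pasted into a comment field. The following is a minimal sketch in standalone Python, not part of Pathway Tools; the function name `lint_comment` and the choice of checks are assumptions made for illustration. It flags mismatched <i>/<b> tags, trailing whitespace, and citation groups that should be merged.

```python
import re

# Only the two tags named on the slide are checked here.
TAG_RE = re.compile(r"<(/?)(i|b)>", re.IGNORECASE)
# Two |CITS:...| groups separated only by whitespace and an optional comma.
SPLIT_CITS_RE = re.compile(r"\|CITS:[^|]*\|\s*,?\s*\|CITS:")


def lint_comment(text):
    """Return a list of human-readable problems found in a comment string."""
    problems = []

    # 1. Tags must be closed in the reverse order they were opened.
    stack = []
    for match in TAG_RE.finditer(text):
        is_closing = match.group(1) == "/"
        name = match.group(2).lower()
        if not is_closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(f"unexpected </{name}>")
    problems.extend(f"unclosed <{name}>" for name in stack)

    # 2. No line may end in whitespace.
    for number, line in enumerate(text.split("\n"), start=1):
        if line != line.rstrip():
            problems.append(f"trailing whitespace on line {number}")

    # 3. Multiple citations belong in a single |CITS:...| group.
    if SPLIT_CITS_RE.search(text):
        problems.append("adjacent |CITS:...| groups; merge into one")

    return problems
```

For example, `lint_comment("<i>italic</i>, <b>bold</i>")` reports both the stray `</i>` and the `<b>` that is never closed.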

Page 6: Advanced Editing of Pathway/Genome Databases

Use Internal Hyperlinks

Page 7: Advanced Editing of Pathway/Genome Databases

Use Variant Classes

Example: putrescine biosynthesis

Page 8: Advanced Editing of Pathway/Genome Databases

Hypothetical Reactions, Excluded Enzymes

Enzymes Not Used: useful when an enzyme is associated with a reaction but does not participate in a specific pathway, for example a catabolic enzyme in a biosynthetic pathway (e.g. EC 2.1.3.3, ornithine carbamoyltransferase).

Hypothetical reactions: useful when a pathway step has been proposed but not proven.

Both are specified in the Pathway Info Editor.

Page 9: Advanced Editing of Pathway/Genome Databases

Super Pathways

● Need to keep pathways within well-defined end points
● Link pathways to upstream or downstream pathways with pathway links
● Keep pathways simple
● Create more complex metabolic networks using superpathways
● Example: superpathway of aromatic compound degradation (aerobic) is composed of:
  ● catechol degradation II
  ● mandelate degradation I
  ● benzoate degradation (aerobic)
  ● β-ketoadipate degradation
  ● protocatechuate degradation II
  ● shikimate degradation
  ● quinate degradation
  ● 4-hydroxymandelate degradation
  ● tryptophan degradation I

Page 10: Advanced Editing of Pathway/Genome Databases

Advanced Curation

Page 11: Advanced Editing of Pathway/Genome Databases

Using the Frame Editor

The frame editor is powerful, but dangerous… Use it when there are no alternatives.

Examples:
● Renaming frames
● Modified proteins
● Modifying dates of author credits
● Replacing an enzyme or reaction in an enzymatic-reaction frame
● Removing mistakes from pathway frames, such as predecessor pairs that the software ignores
● Removing duplicated values from slots that should only have a single value (OFFICIAL-EC?)
● Investigating orphan enzymatic reaction frames reported by the Consistency Checker

Page 12: Advanced Editing of Pathway/Genome Databases

Protein complexes

Adenosylmethionine decarboxylase is first synthesized as a proenzyme, and then self-cleaves into two smaller polypeptides. Each cleavage product forms a homotetramer, and the two complexes form a heterooctamer.

A combination of editors enables creation of such multi-level complexes.

Tutorial: Creating Protein Complexes

Page 13: Advanced Editing of Pathway/Genome Databases

Classes and Instances

● Instance frames describe specific objects (e.g. a specific gene)

● Class frames describe general types of biological objects (e.g. the class of all genes)

● Proteins that are substrates of MetaCyc reactions are classes

● Every compound with an “R” in its structure should be a class

Page 14: Advanced Editing of Pathway/Genome Databases

Converting an existing compound instance to a class

● Open the compound editor
● Click “Convert to Class” and exit
● Rename the frame to follow the class name convention (if necessary)
● Modify the common name to start with “a”

Modification of MetaCyc classes is considered a schema change, and will be overwritten during the next update!

Only use this procedure to correct curation errors that were introduced in your PGDB!

Page 15: Advanced Editing of Pathway/Genome Databases

The Ontology Editor

Page 16: Advanced Editing of Pathway/Genome Databases

The Ontology Editor

● Changing parent classes● Adding parent classes● Creating new classes to improve ontology

Tutorial: the Ontology Editor

Page 17: Advanced Editing of Pathway/Genome Databases

The Consistency Checker

Consistency checking should be performed routinely (every few months), and any problems it detects should be addressed.

Page 18: Advanced Editing of Pathway/Genome Databases

Consistency Checker – Automatic Tasks

Bad Links

MetaCyc pathways are extensively linked to other pathways. When new PGDBs are created by PathoLogic, these links are preserved even if they point to pathways that are not present in the new PGDB. Such links are removed only by the Consistency Checker.
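The idea of a bad link can be illustrated with a small standalone Python sketch. This is not how the Consistency Checker is implemented; the dictionary representation, the function names, and the pathway IDs (`PWY-A`, `PWY-B`, `PWY-C`) are hypothetical stand-ins for the pathway frames of a PGDB.

```python
def find_bad_links(pathway_links):
    """Return (pathway, target) pairs whose target is absent from the PGDB.

    pathway_links maps each pathway ID present in the PGDB to the list of
    pathway IDs it links to (a hypothetical stand-in for pathway frames).
    """
    present = set(pathway_links)
    bad = []
    for pathway, targets in pathway_links.items():
        for target in targets:
            if target not in present:
                bad.append((pathway, target))
    return sorted(bad)


def remove_bad_links(pathway_links):
    """Return a copy of pathway_links with dangling targets dropped."""
    present = set(pathway_links)
    return {
        pathway: [target for target in targets if target in present]
        for pathway, targets in pathway_links.items()
    }


# Hypothetical PGDB: PWY-C was in MetaCyc but was not predicted here.
links = {"PWY-A": ["PWY-B", "PWY-C"], "PWY-B": ["PWY-A"]}
```

With this input, `find_bad_links(links)` returns the single dangling pair `("PWY-A", "PWY-C")`, and `remove_bad_links(links)` keeps only the link from `PWY-A` to `PWY-B` and the link back.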

Page 19: Advanced Editing of Pathway/Genome Databases

Consistency Checker – Manual Tasks

Example: create an empty |FRAME: | construct, then run the task “Check Frame References”
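To make concrete what a frame-reference check looks for, here is a minimal standalone Python sketch. It is not the Consistency Checker's own code: the function name, the regular expression, and the idea of passing a set of known frame IDs are assumptions made for illustration, and the pattern only reads the first token after `FRAME:`.

```python
import re

# Capture the first whitespace-free token after "FRAME:", if there is one.
FRAME_REF_RE = re.compile(r"\|FRAME:\s*([^|\s]*)[^|]*\|")


def check_frame_references(text, known_frames):
    """Report empty or unknown |FRAME: ID| references found in text."""
    problems = []
    for match in FRAME_REF_RE.finditer(text):
        frame_id = match.group(1)
        if not frame_id:
            problems.append("empty |FRAME: | reference")
        elif frame_id not in known_frames:
            problems.append(f"unknown frame: {frame_id}")
    return problems


# CPLX-6934 is the frame ID mentioned earlier in this deck; NO-SUCH-FRAME is made up.
known = {"CPLX-6934"}
comment = "See |FRAME: CPLX-6934|, |FRAME: | and |FRAME: NO-SUCH-FRAME|."
```

Running `check_frame_references(comment, known)` on this text reports the empty construct and the unknown ID while accepting the valid reference.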

Page 20: Advanced Editing of Pathway/Genome Databases

Exporting Pathways Between PGDBs

● To export a pathway to a file (optional inclusion of enzymes and genes):
  ● Edit → Add Pathway to File Export List
  ● File → Export → Selected Pathways to Lisp-format File
● To import a pathway from a file:
  ● File → Import → Pathways from File
● To export a pathway directly to another PGDB (both PGDBs must be installed on the same system):
  ● Edit → Export Pathway to DB

Page 21: Advanced Editing of Pathway/Genome Databases

Moving Objects Between PGDBs

The following commands import a frame from MetaCyc into EcoCyc. Both databases must be open for them to work:

● (import-compounds '(CPD-ID) (kb-of-organism 'meta) (kb-of-organism 'ecoli))

● (import-reactions '(ID-RXN) (kb-of-organism 'meta) (kb-of-organism 'ecoli))

● (import-proteins '(ID-MONOMER) (kb-of-organism 'meta) (kb-of-organism 'ecoli))
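When many frames need to be moved, it can help to generate the command text rather than type each form by hand. The sketch below is standalone Python that only builds strings in the shape shown above; it does not talk to Pathway Tools. The helper name `import_form` is hypothetical, and passing several IDs in one quoted list is an assumption based on the list syntax of the commands, not something the slide confirms.

```python
# Lisp function names exactly as they appear on the slide.
IMPORTERS = {
    "compounds": "import-compounds",
    "reactions": "import-reactions",
    "proteins": "import-proteins",
}


def import_form(kind, frame_ids, source="meta", target="ecoli"):
    """Build the text of one import command to paste into the Lisp listener.

    Assumption: the quoted list may hold more than one frame ID.
    """
    ids = " ".join(frame_ids)
    return (
        f"({IMPORTERS[kind]} '({ids}) "
        f"(kb-of-organism '{source}) (kb-of-organism '{target}))"
    )


# Reproduces the first command on the slide.
print(import_form("compounds", ["CPD-ID"]))
```

The defaults mirror the MetaCyc-to-EcoCyc direction used on the slide; other organism identifiers would be passed as `source` and `target`.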

Page 22: Advanced Editing of Pathway/Genome Databases

Exporting Graphics

● You can save any screen as a vector-based PostScript file by using File → Print

● The PS files are easily converted to PDF by Adobe Distiller (part of the Acrobat Pro package)

● Graphics programs like CorelDRAW or Illustrator can open the PDF files and let you manipulate the graphics

● The software also generates two posters: the cellular overview and the genome poster. Those are also generated in PostScript format.
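Where Adobe Distiller is not available, the PostScript-to-PDF step can also be scripted. The following Python sketch swaps in Ghostscript's `ps2pdf` command, which is a substitution and not the tool named on the slide; it assumes Ghostscript is installed and on the PATH, and the function names and the file name `overview.ps` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def ps2pdf_command(ps_path):
    """Build the ps2pdf argument list, writing the PDF next to the input."""
    ps_path = Path(ps_path)
    return ["ps2pdf", str(ps_path), str(ps_path.with_suffix(".pdf"))]


def convert_to_pdf(ps_path):
    """Convert one PostScript file to PDF with Ghostscript's ps2pdf."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf not found; Ghostscript must be installed")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(ps2pdf_command(ps_path), check=True)


# Hypothetical file saved from File → Print.
print(ps2pdf_command("overview.ps"))
```

Keeping the command construction separate from the call makes it easy to inspect what will be run before converting a whole folder of exported screens.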

Page 23: Advanced Editing of Pathway/Genome Databases

Creating Links to External Databases

● To define a new external database link:
  ● File → Create → External Database Description
  ● Enter frame name
  ● Fill fields as shown in next slide
● To edit an existing link:
  ● Right-click on a link (from a Navigator page), and select “Edit External Database Info”
● Creating links to a PGDB: see http://biocyc.org/linking.shtml

Page 24: Advanced Editing of Pathway/Genome Databases

External Database Editor

Page 26: Advanced Editing of Pathway/Genome Databases

The Pathway Registry

Page 27: Advanced Editing of Pathway/Genome Databases

The Registry – Schema Upgrades

Page 28: Advanced Editing of Pathway/Genome Databases

The Registry – Uploading Your PGDB

The process of uploading a PGDB to the Registry is largely automated. See “Publishing PGDBs in the Registry” in the User Guide for details.