Editing Pathway/Genome Databases Compounds, Reactions and Pathways Ron Caspi

Upload: sheryl-evans

Post on 19-Jan-2016


Page 1: Editing Pathway/Genome Databases Compounds, Reactions and Pathways Ron Caspi

Editing Pathway/Genome Databases

Compounds, Reactions and Pathways

Ron Caspi

Page 2: Activate Editing Mode

SRI International Bioinformatics

Type (enable/disable-editors t) at the listener pane

Page 3: Why Curation is Important!

Curators need jobs

“In silico” information is less solid than experimental evidence

Database curation greatly enhances the usefulness of the data

Page 4: Pathway Tools Paradigms

Separate database from user interface

Navigator provides one interface to the DB

Editors provide an alternative interface to the DB

• Reuse information whenever possible!

• A PGDB should not describe the same biological or chemical entity more than once

• A tool helps to prevent creation of duplicate reactions
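
The reuse principle above can be illustrated with a small sketch. This is not Pathway Tools code: it assumes a hypothetical `ReactionRegistry` that treats two reactions as the same entity when they have the same compounds on each side, regardless of compound order or reaction direction. Stoichiometric coefficients are ignored for simplicity.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

ReactionKey = FrozenSet[FrozenSet[str]]


def reaction_key(left: Iterable[str], right: Iterable[str]) -> ReactionKey:
    """Build a key that ignores compound order and reaction direction."""
    return frozenset([frozenset(left), frozenset(right)])


class ReactionRegistry:
    """Toy store (hypothetical) that refuses to create the same reaction twice."""

    def __init__(self) -> None:
        self._by_key: Dict[ReactionKey, str] = {}

    def add(self, frame_id: str, left: Iterable[str],
            right: Iterable[str]) -> Tuple[str, bool]:
        """Return (frame ID in use, True if a new frame was created)."""
        key = reaction_key(left, right)
        existing = self._by_key.get(key)
        if existing is not None:
            # Reuse the existing frame instead of describing the entity twice.
            return existing, False
        self._by_key[key] = frame_id
        return frame_id, True
```

Calling `add` a second time with the same compounds, even written in the reverse direction, returns the frame ID of the first reaction rather than creating a new one.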

Page 5: Editing Rules: Support Policy

• Do not modify the EcoCyc or MetaCyc datasets
• Do not alter the DB schema; for example, do not add or remove classes or slots

Page 6: List of Editors

• Compound Editor
• Compound Structure Editors
• Reaction Editor
• Synonym Editor
• Publication Editor
• Pathway Editor and Pathway Info Editor
• Protein/Subunit structure/Enzymatic Reaction Editors
• Gene Editor
• Intron Editor (Eukaryotes only)
• Transcription Unit Editor
• Frame Editor
• Relationships Editor
• Ontology Editor

Page 7: Saving Changes

The user must save changes explicitly:
• File => Save Current DB
• Save DB button

• List Unsaved Changes in Current DB
• Revert Current DB
• Checkpoint Current DB Updates to File
• Restore Updates from Checkpoint File

Page 8: Other DB Commands under the File Menu

• Summarize databases
• Summarize current organism
• Refresh DB list
• Refresh All Current DBs
• Delete a DB
• Attempt to Reconnect to Database Server

Page 9: Invoking the Editors

1. New Object: Use the “New” command

2. Existing Object: Right-click on the object handle

Page 10: Compound Editor

Create or edit a compound:
• Specify class
• Common name and synonyms
• Comments, citations
• Links to other DBs

Page 11: The Synonym Editor

Lets you easily edit the synonyms and set the common name

Page 12: Citations

• Citation boxes
• The CITS field
• File => Import Citations from PubMed
• Publication editor (invoke by right-clicking on a citation at the bottom)
• Non-PubMed citation: enter it in a citation box in the form Smith06, then invoke the editor by clicking outside the citation box

Page 13: More Compound Editing

• Compound Structure Editors (Marvin, JME)
• Mol files
• Exporting to other DBs
• Merging
• Duplicate Frame and Edit

Page 14: Reaction Editor

• Enter or edit a reaction equation
• EC number (official?)
• Check for balance
• Compound Resolver
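
To make the "check for balance" step concrete, here is a minimal sketch of an element-balance test. It is an illustration only, not the Reaction Editor's actual algorithm: the `FORMULAS` table and both function names are hypothetical, each compound is assumed to appear with a coefficient of 1, and charge is not considered.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical formula table: compound name -> element counts.
FORMULAS: Dict[str, Dict[str, int]] = {
    "CO2": {"C": 1, "O": 2},
    "H2O": {"H": 2, "O": 1},
    "H2CO3": {"H": 2, "C": 1, "O": 3},
}


def element_totals(compounds: Iterable[str]) -> Counter:
    """Sum the element counts over one side of a reaction."""
    totals: Counter = Counter()
    for name in compounds:
        totals.update(FORMULAS[name])  # Counter.update adds counts
    return totals


def is_balanced(left: Iterable[str], right: Iterable[str]) -> bool:
    """A reaction is element-balanced when both sides have equal totals."""
    return element_totals(left) == element_totals(right)
```

For example, `is_balanced(["CO2", "H2O"], ["H2CO3"])` is true, while dropping the water from the left side makes the check fail.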

Page 15: Pathway Info Editor

• Class (variant class)

• Common Name

• Synonyms

• Evidence code

• Citations (CIT)

• Comments

• External Links

• Hypothetical reactions

• Author credits

Page 16: Pathway Editor

Graphically create and modify pathways

Two tools:
• Connections Editor: add reactions one by one
• Segment Editor: enter a linear pathway segment

Page 17: Connections Editor Operations

Two main display panes:
• Left: unconnected pathway reactions
• Right: draws connected reactions (looks like the regular Pathway display window)

Connecting reactions:
• Select initial reaction (in either pane) => red and green reactions
• Select a green reaction

Useful commands:
• Choose main compounds for reaction
• Disconnect all reactions

In circular pathways, specify which compound should be at the top

Add links to other pathways, reactions, or comments

Page 18: Pathway Segment Editor

Used to enter a linear sequence of reactions all at once

Reactions are specified by EC numbers or reaction substrates

One segment may contain up to 7 reactions

Page 19: Pathway Editor Limitations

Complex situations can cause ambiguity:
• Link may be ignored
• Dialog box for disambiguating
• Pathway drawn in bizarre arrangement

Fix: try removing the offending link and adding links in a different order

Pathway Editor does not handle polymerization pathways easily.

Page 20: Evidence Codes for Pathways

http://brg.ai.sri.com/ptools/evidence-ontology.html

EV-COMP: Inferred from computation
• HINF: human inference
• AINF: artificial inference

EV-AS: Author statement
• TAS: traceable
• NAS: non-traceable

EV-IC: Inferred by curator

EV-EXP: Inferred from experiment
• IDA: inferred from direct assay
• IPI: inferred from physical interaction
• TAS: inferred from traceable data (review)
• IEP: inferred from expression pattern
• IGI: inferred from genetic interaction
• IMP: inferred from mutant phenotype
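
The two-level hierarchy above maps naturally onto a nested lookup table. The sketch below is illustrative only. It assumes that a full code is formed by joining the parent code and the sub-code with a hyphen (for example EV-EXP-IDA), which the slide does not state explicitly, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Parent code -> (description, {sub-code: description}), as listed on the slide.
EVIDENCE_CODES: Dict[str, Tuple[str, Dict[str, str]]] = {
    "EV-COMP": ("Inferred from computation", {
        "HINF": "human inference",
        "AINF": "artificial inference",
    }),
    "EV-AS": ("Author statement", {
        "TAS": "traceable",
        "NAS": "non-traceable",
    }),
    "EV-IC": ("Inferred by curator", {}),
    "EV-EXP": ("Inferred from experiment", {
        "IDA": "inferred from direct assay",
        "IPI": "inferred from physical interaction",
        "TAS": "inferred from traceable data (review)",
        "IEP": "inferred from expression pattern",
        "IGI": "inferred from genetic interaction",
        "IMP": "inferred from mutant phenotype",
    }),
}


def full_codes() -> List[str]:
    """List every parent code and every parent-sub combination (assumed naming)."""
    codes: List[str] = []
    for parent, (_, subs) in EVIDENCE_CODES.items():
        codes.append(parent)
        codes.extend(f"{parent}-{sub}" for sub in subs)
    return codes


def describe(code: str) -> str:
    """Return a readable description for a parent or parent-sub code."""
    for parent, (label, subs) in EVIDENCE_CODES.items():
        if code == parent:
            return label
        if code.startswith(parent + "-"):
            sub = code[len(parent) + 1:]
            if sub in subs:
                return f"{label}: {subs[sub]}"
    raise KeyError(f"unknown evidence code: {code}")
```

Note that TAS appears under both EV-AS and EV-EXP, so the parent code is needed to tell the two apart.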

Page 21: Enzyme/Protein Editors

To add an enzyme to a reaction:
• Right-click the reaction and choose Edit => Create/Add enzyme
• “Choose Protein”: specify an ID, or choose “Search by genes or create new protein” => Protein subunit structure editor
• Protein Editor

Check the Curator Guide at http://bioinformatics.ai.sri.com/ptools/curatorsguide.pdf

Page 22: Protein Editor

Page 23: Enzymatic Reaction Editor

Page 24: Protein Subunit Editor

Page 25: Superpathways

• Keep pathways within well-defined end points
• Link pathways to upstream or downstream pathways with pathway links
• Create more complex metabolic networks using superpathways

Example: superpathway of aromatic compound degradation (aerobic) is composed of:
• catechol degradation II
• mandelate degradation I
• benzoate degradation (aerobic)
• β-ketoadipate degradation
• protocatechuate degradation II
• shikimate degradation
• quinate degradation
• 4-hydroxymandelate degradation
• tryptophan degradation I

Page 26: Pathway Export

Export:
• Edit => Add Pathway to File Export List
• File => Export => Selected Pathways to File

Import:
• File => Import => Pathways from File

Page 27: Creating Links to External Databases

Creating links from a PGDB to external databases

To define a new external database:
1. Tools => Ontology Browser
2. View => Browse from new root / type Databases
3. Highlight Databases
4. Frame => Create => Instance
5. Enter frame name, frame edit
6. Enter Common Name, Static-Search-URL (e.g. http://gene.pharma.com/dbquery?)
7. Enter a value for Search-Object-Class (e.g. Proteins)

Creating links to a PGDB: see http://biocyc.org/linking.shtml
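
As a rough illustration of how the three values entered for an external database might be used, the sketch below models them as a small record and builds a link for one object. It assumes that the link is formed by appending the external object's ID to the Static-Search-URL, which the slide does not spell out. The class name, the database name "PharmaGene" and the ID "P12345" are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class ExternalDatabase:
    """Hypothetical record of the values entered for a new external database."""
    common_name: str
    static_search_url: str      # e.g. "http://gene.pharma.com/dbquery?"
    search_object_class: str    # e.g. "Proteins"

    def link_for(self, object_id: str) -> str:
        """Assumption: the object ID is appended to the static search URL."""
        return self.static_search_url + quote(object_id, safe="")


# "PharmaGene" and "P12345" are made-up values for illustration.
pharma = ExternalDatabase("PharmaGene", "http://gene.pharma.com/dbquery?", "Proteins")
example_link = pharma.link_for("P12345")
```

Percent-encoding the ID with `quote` is a defensive choice of this sketch, so that IDs containing spaces or reserved characters still yield a valid URL.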

Page 28: Constraint Checking

General rules that constrain the valid relationships among instances

Constraints are checked when new facts are asserted, to ensure that the DB remains logically consistent

Constraints on slots:
• Domain violation: checks that the slots are in instances of the appropriate class
• Range violation: value type, value cardinality
• Inverse
• Cardinality
• Lisp-predicate
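
The domain and range checks listed above can be sketched as follows. This is a simplified illustration, not the constraint checker used by Pathway Tools: `SlotConstraint` and `check_slot` are hypothetical names, the domain test compares class names directly instead of walking a class hierarchy, and the inverse and Lisp-predicate constraints are left out.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


@dataclass(frozen=True)
class SlotConstraint:
    """Hypothetical description of what a slot may hold and where."""
    slot: str
    domain: str                          # class whose instances may carry the slot
    value_type: type                     # required Python type of each value
    max_cardinality: Optional[int] = None


def check_slot(constraint: SlotConstraint, instance_class: str,
               values: Sequence[Any]) -> List[str]:
    """Return the names of the constraints violated by a proposed assertion."""
    violations: List[str] = []
    if instance_class != constraint.domain:
        violations.append("domain")
    if any(not isinstance(v, constraint.value_type) for v in values):
        violations.append("value type")
    if (constraint.max_cardinality is not None
            and len(values) > constraint.max_cardinality):
        violations.append("value cardinality")
    return violations
```

An empty result means the new fact can be asserted. Any other result names the violated constraints, so the assertion can be rejected before the DB becomes inconsistent.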

Page 29: Consistency Checking

(correctify-kb)

• Removes newlines from names
• Converts “<” to “|” in string citations
• Checks isozyme sequence similarity
• Fixes references between polypeptides and genes
• Changes compound names to ids in a variety of slots
• Matches physiological regulators to other regulators
• Cross-references compounds to reactions
• Checks pathways predecessors/reactions/subs
• Checks reaction balancing
• Checks compound structures
• Calculates sub- and super-pathways
• Finds missing sub-pathway links
• Verifies chromosome components and positions

Page 30: Update your computers!

To install a patch:

Tools => Instant Patch => Download and Activate All Patches

Page 31: Make sure that…

You perform all exercises on the Hb. pylori database, not on your own!!!

Page 32: Creating New Reactions

Don’t forget to include spaces between chemical names and terms such as “+” and “=”:

1. ascorbate + H2O = 3-keto-L-gulonate

2. 3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP

3. 3-keto-L-gulonate-6-phosphate = L-xylulose-5-phosphate + CO2

4. L-xylulose-5-phosphate = L-ribulose-5-phosphate

5. L-ribulose-5-phosphate = xylulose-5-phosphate

6. xylulose-5-phosphate = D-ribulose-5-phosphate
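
The spacing rule matters because compound names can themselves contain characters such as "-" and "+" (NAD+, for instance). The sketch below shows one way an equation parser could rely on the surrounding spaces. It is an assumption about how such parsing might work, not the Reaction Editor's actual parser, and `parse_equation` is a hypothetical name.

```python
from typing import List, Tuple


def parse_equation(equation: str) -> Tuple[List[str], List[str]]:
    """Split an equation on ' = ' and ' + ' (with spaces) into compound names."""
    sides = equation.split(" = ")
    if len(sides) != 2:
        raise ValueError(f"expected exactly one ' = ' in: {equation!r}")
    left = [name.strip() for name in sides[0].split(" + ")]
    right = [name.strip() for name in sides[1].split(" + ")]
    if not all(left) or not all(right):
        raise ValueError(f"empty compound name in: {equation!r}")
    return left, right
```

With this approach, `3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP` yields two names on each side, whereas `ascorbate+H2O` written without spaces would be read as a single compound name.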

Page 33: Fill in the Reaction Frame IDs in Your Handout

Reaction | Frame ID
ascorbate + H2O = 3-keto-L-gulonate | XXX
3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP |
3-keto-L-gulonate-6-phosphate = L-xylulose-5-phosphate + CO2 |
L-xylulose-5-phosphate = L-ribulose-5-phosphate |
L-ribulose-5-phosphate = xylulose-5-phosphate |
xylulose-5-phosphate = D-ribulose-5-phosphate |

Page 34: Duplicate Reaction?

Frame ID of the new reaction to be created. This frame will NOT be created unless you choose “Keep”.

Frame ID of the existing reaction. This reaction will NOT be transferred into your database until you click “Import”!

Record this BEFORE you click “Import”

Page 35: Define a New Pathway

Define the pathway L-ascorbate degradation to D-ribulose-5-phosphate by connecting the reactions together

Assign class:

(Pathways -> Degradation/Utilization/Assimilation -> Carboxylates, Other)

Add the reactions, connect them, and add a link to the pathway non-oxidative branch of the pentose phosphate pathway (Generation of precursor metabolites and energy => Pentose phosphate pathways =>)

Add a reverse link from non-oxidative branch of the pentose phosphate pathway to the new pathway
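
Connecting the reactions of this exercise amounts to chaining each reaction's main product to the next reaction's main substrate. The sketch below illustrates that idea for a strictly linear pathway. It is not how the Pathway Editor is implemented: the function name is hypothetical, and each reaction is reduced to an assumed (main substrate, main product) pair, leaving out side compounds such as H2O, ATP, ADP and CO2.

```python
from typing import Dict, List, Tuple

Reaction = Tuple[str, str]  # (main substrate, main product)


def order_linear(reactions: List[Reaction]) -> List[Reaction]:
    """Order unconnected reactions into a single linear chain."""
    by_substrate: Dict[str, Reaction] = {sub: (sub, prod) for sub, prod in reactions}
    products = {prod for _, prod in reactions}
    # The first reaction consumes a compound that no other reaction produces.
    starts = [r for r in reactions if r[0] not in products]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting reaction")
    ordered = [starts[0]]
    while ordered[-1][1] in by_substrate and len(ordered) < len(reactions):
        ordered.append(by_substrate[ordered[-1][1]])
    if len(ordered) != len(reactions):
        raise ValueError("reactions do not form a single linear chain")
    return ordered


# The six exercise reactions, deliberately listed out of order.
unordered = [
    ("L-xylulose-5-phosphate", "L-ribulose-5-phosphate"),
    ("xylulose-5-phosphate", "D-ribulose-5-phosphate"),
    ("ascorbate", "3-keto-L-gulonate"),
    ("3-keto-L-gulonate-6-phosphate", "L-xylulose-5-phosphate"),
    ("L-ribulose-5-phosphate", "xylulose-5-phosphate"),
    ("3-keto-L-gulonate", "3-keto-L-gulonate-6-phosphate"),
]
pathway = order_linear(unordered)
```

The resulting chain starts at ascorbate and ends at D-ribulose-5-phosphate, the compound at which the link to the pentose phosphate pathway is added.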

Page 36: Run (correctify-kb)

Open the database Hb. pylori (HypCyc): (so 'hyp)

Run (correctify-kb)

Analyze output