The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks Neil Swainston Manchester Centre for Integrative Systems Biology Integrative Bioinformatics 2011, Wageningen, Netherlands 22 March 2011


Page 1: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Neil Swainston
Manchester Centre for Integrative Systems Biology

Integrative Bioinformatics 2011, Wageningen, Netherlands
22 March 2011

Page 2: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Metabolic networks

• Computational and mathematical representation of the metabolic capabilities of a given organism

• On a genome-scale
  • ~1000 unique metabolites
  • ~1000 unique reactions

• Predictive, simulatable

Page 3: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Metabolic networks

[Diagram: elements of a metabolic network reconstruction]
• Metabolic reactions: A + B → C + D
• Gene / enzyme: E1
• Protonation: C + DH+
• Mass balancing: A + 2B + H+
• Transport reactions: Aext (extracellular), intracellular; T1
• Compartmentalisation: mitochondria, cytosol; Cm
• Biomass objective: aa, nucl → biomass

• Goal: generate biomass from growth medium?

Page 4: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

How are they generated?

• Traditionally: manually

• Start with KEGG download? Genome sequence?
• Collated / edited in spreadsheets
• Many steps done by hand
• Curated in focussed meetings (“jamborees”)

• Expensive
• Boring

Page 5: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Automation

• Many of these steps can be automated
  • Subliminal Toolbox

• Goal is to generate a metabolic reconstruction automatically
  • Manual curation still necessary
  • BUT reduce what needs to be done

• Investigation
  • Can we automate the generation of a metabolic network in yeast?

Page 6: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

[Workflow diagram]
• Sources: KEGG, MetaCyc
• Per source: Draft; (De)protonate metabolites; Balance reactions
• Merge
• Further steps shown: Merge pathways; Add compartmentalisation; Add transport reactions; Add biomass reaction

Page 7: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Initial draft

• Both KEGG and MetaCyc allow export of pathways / networks in SBML

• BUT these are representations of the database, NOT computational models

• Merging issue:
  • Components are named inconsistently

Page 8: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Naming

• Glucose, glc, D-glucose, alpha-D-glucose?

• Need to be reconciled

• Use semantic annotations
  • ChEBI terms for metabolites
  • UniProt terms for enzymes
  • Apply MIRIAM standard (RDF and URIs)
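Once every component carries an annotation, reconciling names reduces to grouping by identifier. The following is a minimal sketch of that idea, not the Subliminal implementation: the species records and the `group_by_annotation` helper are hypothetical, and only the glucose URI is taken from this talk.

```python
from collections import defaultdict

# The glucose URI is the one used in this talk; the records are illustrative.
GLUCOSE = "urn:miriam:obo.chebi:CHEBI:17634"

# Hypothetical species records from two draft networks: (source, name, URI).
species = [
    ("KEGG", "D-Glucose", GLUCOSE),
    ("MetaCyc", "glc", GLUCOSE),
    ("MetaCyc", "unannotated metabolite", None),
]

def group_by_annotation(records):
    """Group species that share an annotation URI; keep unannotated ones apart."""
    groups = defaultdict(list)
    unannotated = []
    for source, name, uri in records:
        if uri is None:
            unannotated.append((source, name))
        else:
            groups[uri].append((source, name))
    return dict(groups), unannotated

groups, unannotated = group_by_annotation(species)
for uri, members in groups.items():
    print(uri, "->", members)
print("needs manual curation:", unannotated)
```

Species without an annotation are set aside rather than merged by name, which keeps the automatic step conservative.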

Page 9: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

MIRIAM annotation

<species metaid="_glc" id="glc" name="D-Glucose">
</species>

Page 10: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

MIRIAM annotation

<species metaid="_glc" id="glc" name="D-Glucose">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#_glc">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI:17634"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
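An annotation of this shape can be read back with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a copy of the species element; the namespace declarations on `rdf:RDF` are added here so that the fragment parses on its own (the slide omits them), and `miriam_resources` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

# Same species element as on the slide, with namespaces declared (assumption).
SPECIES_XML = f"""
<species metaid="_glc" id="glc" name="D-Glucose">
  <annotation>
    <rdf:RDF xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}">
      <rdf:Description rdf:about="#_glc">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI:17634"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
"""

def miriam_resources(species_xml, qualifier="is"):
    """Return the resource URIs listed under the given bqbiol qualifier."""
    root = ET.fromstring(species_xml)
    path = f".//{{{BQBIOL}}}{qualifier}/{{{RDF}}}Bag/{{{RDF}}}li"
    return [li.get(f"{{{RDF}}}resource") for li in root.findall(path)]

print(miriam_resources(SPECIES_XML))
```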

Page 11: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks
Page 12: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Exploiting model annotations

• MIRIAM annotations provide unambiguous, unique identifiers for model components

• But they also provide links to chem/bioinformatics resources via web services
  • Models become “live” and increase in utility as resources develop
  • Kinetic parameters accessible from ChEBI
  • Improving annotation in UniProt (phospho sites, etc.)
  • Extract data through web services

Page 13: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

[Diagram: libAnnotationSBML]
• Input: MIRIAM / RDF annotation
• Resources: UniProt, ChEBI, KEGG
• Output: molecular formula, protein sequence, etc.

Page 14: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Merging

• Standard identifiers: job done?

• Inconsistent charge states
  • Pyruvic acid and pyruvate

Page 15: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Charge state determination

• Annotated ChEBI terms provide web service access to structural data
  • InChI, SMILES strings
  • InChI=1/C3H4O3/c1-2(4)3(5)6/h1H3,(H,5,6)/p-1/fC3H3O3/q-1
  • CC(=O)C([O-])=O

• Cheminformatics software (ChemAxon MARVIN) can be used to predict charge state at given pH
  • Consistency
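The talk delegates the prediction itself to cheminformatics software. Purely to illustrate the principle, the sketch below assumes that the pKa values of a molecule's acidic groups are already known and supplied by the caller, and applies the Henderson-Hasselbalch relation. It is not how MARVIN works; the function names are hypothetical and the pKa of about 2.5 for pyruvic acid is an approximate literature value.

```python
def charge_state(neutral_h_count, acidic_pkas, ph=7.0):
    """Return (hydrogen count, net charge) of the predominant form at the given pH.

    Assumes acidic groups only: a group counts as deprotonated when pH > pKa.
    """
    lost = sum(1 for pka in acidic_pkas if ph > pka)
    return neutral_h_count - lost, -lost

def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch fraction of an acidic group in its deprotonated form."""
    return 1.0 / (1.0 + 10 ** (pka - ph))

# Pyruvic acid is C3H4O3; the pKa of about 2.5 is an approximate value.
h_count, charge = charge_state(neutral_h_count=4, acidic_pkas=[2.5], ph=7.0)
print(f"C3H{h_count}O3, charge {charge}")
print(f"fraction deprotonated at pH 7: {fraction_deprotonated(2.5, 7.0):.5f}")
```

Under these assumptions the predominant form at pH 7 is the pyruvate anion, C3H3O3 with charge -1, which matches the charged InChI and SMILES strings shown above.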

Page 16: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Stereochemistry

• KEGG and MetaCyc are inconsistent in their definition of stereochemical precision

• Considered different: apparently minor but can cause gaps in the network

beta-D-glucose D-glucose

Page 17: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Stereochemistry-induced gaps

X

Y

Page 18: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

ChEBI ontology

• ChEBI is an ontology and contains relationships between metabolites
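One way such relationships could be used is to test whether two differently specified metabolites are related by an "is a" chain, and so are candidates for bridging a gap. The sketch below uses a toy, single-parent hierarchy written by hand; it is not retrieved from ChEBI, real ontology terms can have several parents, and the helper names are hypothetical.

```python
# Toy fragment of an "is a" hierarchy; illustrative only, not ChEBI data.
IS_A = {
    "beta-D-glucose": "D-glucose",
    "alpha-D-glucose": "D-glucose",
    "D-glucose": "glucose",
}

def ancestors(term, is_a=IS_A):
    """Return the term followed by its chain of parents (assumes no cycles)."""
    chain = [term]
    while chain[-1] in is_a:
        chain.append(is_a[chain[-1]])
    return chain

def related(a, b, is_a=IS_A):
    """True if one term equals, or is an ancestor of, the other."""
    return b in ancestors(a, is_a) or a in ancestors(b, is_a)

print(related("beta-D-glucose", "D-glucose"))        # specific vs generic form
print(related("beta-D-glucose", "alpha-D-glucose"))  # siblings are not merged
```

Keeping siblings apart is a deliberate choice in this sketch: alpha and beta anomers share a parent but are distinct compounds.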

Page 19: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Stereochemistry-induced gaps

X

Y

Page 20: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Stereochemistry-induced gaps

X

Y

Page 21: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Stereochemistry-induced gaps

X Y

Page 22: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Reaction balancing

• Reaction elemental and charge balancing
  • Prevents mass violation and inconsistencies arising from “magical” production or disappearance of matter

• KEGG and MetaCyc reactions don’t always balance
  • Incorrect stoichiometry
  • Missing protons, water, etc.

• Solution: use linear programming

Page 23: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Reaction balancing

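carbon dioxide + 2-Acetolactate → Pyruvate

CO2 + C5H7O4- → C3H3O3-

Ab = 0, where b represents a vector of stoichiometries and A holds the element and charge counts (reactant columns positive, product columns negative):

        | Reactants      | Products | Optional reactants | Optional products
        | CO2    C5H7O4  | C3H3O3   | H+     H2O         | H+     H2O    CO2
C       |  1      5      | -3       |  0      0          |  0      0     -1
O       |  2      4      | -3       |  0      1          |  0     -1     -2
H       |  0      7      | -3       |  1      2          | -1     -2      0
charge  |  0     -1      |  1       |  1      0          | -1      0      0

bmin    |  1      1      |  1       |  0      0          |  0      0      0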

Page 24: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Reaction balancing

• Linear programming solver solves Ab = 0

        | Reactants      | Products | Optional reactants | Optional products
        | CO2    C5H7O4  | C3H3O3   | H+     H2O         | H+     H2O    CO2
C       |  1      5      | -3       |  0      0          |  0      0     -1
O       |  2      4      | -3       |  0      1          |  0     -1     -2
H       |  0      7      | -3       |  1      2          | -1     -2      0
charge  |  0     -1      |  1       |  1      0          | -1      0      0

bmin    |  1      1      |  1       |  0      0          |  0      0      0
b       |  1      1      |  2       |  0      0          |  1      0      0

carbon dioxide + 2-Acetolactate → 2 Pyruvate + H+

CO2 + C5H7O4- → 2 C3H3O3- + H+
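The balancing problem on this slide is small enough to check by hand. The sketch below reproduces it in Python, with one substitution stated plainly: instead of a linear programming solver it uses a brute-force search over small integer stoichiometries, and it assumes the objective is to minimise the total stoichiometry, which the slides do not state. The `balance` function and the `max_coeff` bound are hypothetical.

```python
from itertools import product

# Columns: CO2, C5H7O4-, C3H3O3- | optional reactants H+, H2O
#          | optional products H+, H2O, CO2
A = [
    [1, 5, -3, 0, 0, 0, 0, -1],    # C
    [2, 4, -3, 0, 1, 0, -1, -2],   # O
    [0, 7, -3, 1, 2, -1, -2, 0],   # H
    [0, -1, 1, 1, 0, -1, 0, 0],    # charge
]
B_MIN = [1, 1, 1, 0, 0, 0, 0, 0]   # original species must appear at least once

def balance(a, b_min, max_coeff=3):
    """Search integer vectors b >= b_min with A b = 0, minimising sum(b).

    Brute force stands in for the LP solver; max_coeff bounds the search.
    """
    best = None
    ranges = [range(lo, max_coeff + 1) for lo in b_min]
    for b in product(*ranges):
        if all(sum(x * y for x, y in zip(row, b)) == 0 for row in a):
            if best is None or sum(b) < sum(best):
                best = b
    return list(best) if best is not None else None

print(balance(A, B_MIN))  # [1, 1, 2, 0, 0, 1, 0, 0]
```

The result matches the b row above: one CO2 and one 2-acetolactate give two pyruvate and one proton.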

Page 25: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Compartmentalisation

• Determination of intracellular compartment in which enzymes operate

• Two approaches:
  • Extract curated information from UniProt annotation
  • Extract protein sequence from UniProt and pipe to WoLF PSORT localisation prediction algorithm

• Infer reaction localisation from enzyme localisation
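The inference step can be pictured as a lookup from reactions to enzymes to compartments. This is a minimal sketch under assumptions: the enzyme and reaction identifiers are made up, and falling back to the cytosol when no localisation is known is a choice made here, not something the talk specifies.

```python
# Hypothetical enzyme localisations, e.g. from curated annotation or a predictor.
ENZYME_COMPARTMENTS = {
    "E1": {"cytosol"},
    "E2": {"mitochondrion"},
    "E3": {"cytosol", "mitochondrion"},
}
# Hypothetical reaction-to-enzyme assignments.
REACTION_ENZYMES = {"R1": ["E1"], "R2": ["E2", "E3"], "R3": []}

def reaction_compartments(reaction_enzymes, enzyme_compartments, default="cytosol"):
    """Assign each reaction the union of its enzymes' compartments."""
    result = {}
    for reaction, enzymes in reaction_enzymes.items():
        found = set()
        for enzyme in enzymes:
            found |= enzyme_compartments.get(enzyme, set())
        # Assumption: reactions with no localised enzyme go to the default.
        result[reaction] = found or {default}
    return result

for reaction, compartments in reaction_compartments(
    REACTION_ENZYMES, ENZYME_COMPARTMENTS
).items():
    print(reaction, sorted(compartments))
```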

Page 26: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Biomass function

• Flux Balance Analysis requires an objective function to maximise
  • Traditionally, a biomass function is specified
  • Simulates cell growth

• Subliminal adds a generic biomass function
  • Production of amino acids, nucleotides, lipids, ATP

• Formats model such that it can be loaded into the COBRA Toolbox
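For reference, the optimisation behind Flux Balance Analysis is usually written as the linear program below. The symbols are the conventional ones and do not appear on the slides: S is the stoichiometric matrix, v the vector of reaction fluxes, c a vector that selects the biomass reaction, and v_min, v_max the flux bounds (which also encode the growth medium).

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
```

The constraint S v = 0 is the steady-state assumption, which is why unbalanced reactions in S lead to the “magical” production of matter mentioned earlier.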

Page 27: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

[Workflow diagram, repeated]
• Sources: KEGG, MetaCyc
• Per source: Draft; (De)protonate metabolites; Balance reactions
• Merge
• Further steps shown: Merge pathways; Add compartmentalisation; Add transport reactions; Add biomass reaction

Page 28: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Analysis

• Goal: can biomass be generated from growth medium?

• Simulate biomass generation

• Specify a “sensible” growth medium
  • Glucose, NH4+, etc.
  • Only histidine had to be added to the growth medium
  • Suggests good connectivity
  • BUT suggests gap(s) in histidine synthesis pathways

Page 29: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Analysis

Components                   Subliminal         Manual
Compartments                 7                  17
Unique metabolites           1277               728
Unique enzymes               847                939
Unique metabolic reactions   1394               947
Unreachable metabolites      1281/2287 (57%)    75/758 (9.9%)
Blocked reactions            728/1687 (43%)     140/1102 (13%)

• Many more metabolites
  • Better coverage? Poor merging?

• Many unreachable metabolites
• Many blocked reactions

• Network gaps?
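How the unreachable and blocked counts were computed is not stated in the talk. One simple, purely topological way to look for such gaps is to expand outwards from the growth medium, as sketched below on a toy network. The sketch treats every reaction as irreversible and ignores stoichiometry, so it is only an approximation of a flux-based analysis; all identifiers are illustrative.

```python
def reachable_metabolites(reactions, medium):
    """Expand from the medium: a reaction fires once all its substrates exist."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available

# Toy network, illustrative only: reaction id -> (substrates, products).
REACTIONS = {
    "R1": (["glc"], ["g6p"]),
    "R2": (["g6p"], ["pyr"]),
    "R3": (["x"], ["his"]),  # "x" is never produced, so "his" is unreachable
}

all_metabolites = {m for subs, prods in REACTIONS.values() for m in subs + prods}
reached = reachable_metabolites(REACTIONS, medium={"glc"})
unreachable = all_metabolites - reached
blocked = [r for r, (subs, _) in REACTIONS.items() if not set(subs) <= reached]

print("unreachable:", sorted(unreachable))
print("blocked:", blocked)
```

In this toy case the missing precursor of "his" plays the same role as the gap in histidine synthesis reported above.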

Page 30: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Future developments

• Directionality
  • Use thermodynamic predictions of reaction reversibility
  • Possible to automate due to extraction of chemical structures (SMILES, InChI) from ChEBI

• Editing
  • Online, graphical editor (with checking) would be incredibly useful
  • Difficult to render genome-scale reconstructions
  • Pathway by pathway?
  • WikiPathways? Payao?

Page 31: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Conclusion

• Many steps can be automated in generating genome-scale metabolic reconstructions

• Additional modules would be useful

• Manual curation still necessary… but…
  • Subliminal Toolbox is modular
  • Can be used in manual curation phase
  • Back-end for graphical editors?

• Approach is better than starting from scratch
  • Capable of producing reconstructions covering central carbon metabolism

Page 32: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Thanks…

Page 33: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Neil Swainston
Manchester Centre for Integrative Systems Biology

Integrative Bioinformatics 2011, Wageningen, Netherlands
22 March 2011

Page 34: The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks

Transporters

• Transporters are required to transport metabolites into and out of the cell

• TransportDB is a source of transporter proteins
  • BUT not comprehensive enough to assign these to individual reactions

• Approach taken is a pragmatic one
  • Add all transport proteins from TransportDB
  • Generate transport reactions for ALL metabolites
  • Map the proteins to the reactions manually
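The "generate transport reactions for ALL metabolites" step is mechanical and could look roughly like the sketch below. The record layout, the compartment suffixes and the choice to make every transport reaction reversible are assumptions made for illustration; the protein list is left empty because the mapping is described as manual.

```python
def transport_reactions(metabolites, inside="c", outside="e"):
    """Create one transport reaction per metabolite, with no protein assigned."""
    return [
        {
            "id": f"T_{met}",
            "reactants": [f"{met}_{outside}"],
            "products": [f"{met}_{inside}"],
            "reversible": True,  # assumption: direction is not constrained here
            "proteins": [],      # left empty: mapping to proteins is manual
        }
        for met in metabolites
    ]

# Hypothetical metabolite identifiers.
for reaction in transport_reactions(["glc", "nh4"]):
    print(reaction["id"], reaction["reactants"], "<->", reaction["products"])
```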