HELM Notation Overview

http://pistoiaalliance.org @PistoiaAlliance Pistoia Alliance HELM Project - What About the Big Guys? The emerging HELM standard for macromolecular representation Domain Lead – Sergio Rotstein Business Technology, Pfizer

Upload: pistoiaallianceclaire

Post on 11-Jun-2015


DESCRIPTION

HELM, which was originally developed by Pfizer, provides a way to represent molecules that are too large to represent atom by atom, or that contain non-natural chemical modifications that make it impractical to represent them as plain sequences. HELM's structure hierarchy consists of complex polymers, simple polymers, monomers, and atoms. Monomers are described using atoms and bonds, single-type polymers as sequences of monomers, and complex multi-type polymers as connected polymers. A detailed description of HELM is available in a paper published in the Journal of Chemical Information and Modeling.

TRANSCRIPT

Page 1: HELM Notation Overview

http://pistoiaalliance.org @PistoiaAlliance

Pistoia Alliance HELM Project - What About the Big Guys?

The emerging HELM standard for macromolecular representation

Domain Lead – Sergio Rotstein
Business Technology, Pfizer

Page 2: HELM Notation Overview

What is a “Biomolecule”?

Peptides

Therapeutic Proteins

ADCs

Antibodies

Vaccines

ASOs

siRNAs

For our purposes, anything that is not a small molecule is a biomolecule

Goal

• Eliminate biomolecule penalty

• Make these entities first-class citizens of the Informatics tool portfolio

Page 3: HELM Notation Overview

So what’s the problem?

[Figure: Small Molecules are covered by Small Molecule Tools and Sequences by Sequence-Based Tools; Biomolecules sit in the GAP between the two.]

Page 4: HELM Notation Overview

“Fit-for-Purpose” Structure Representation

We need to enable the representation, manipulation and visualization of each molecule type in a way that is appropriate for its size and complexity

Page 5: HELM Notation Overview

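Fit for Purpose: “Monomer” Level

• While you could draw out an oligonucleotide like this:

[Figure: atom-level drawing of an oligonucleotide]

• The monomer-level representation is likely more intuitive / practical:

[Figure: monomer-level representation of an oligonucleotide]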

Page 6: HELM Notation Overview

Fit for Purpose: Sequence Level

• But even the monomer level representation would not scale well to proteins with hundreds of amino acids. Larger molecules require a more sequence-oriented representation:

Page 7: HELM Notation Overview

Fit for Purpose: Component Level

• For multi-component structures such as antibody drug conjugates, component-level representations are required so that each component can be dealt with separately.

[Figure: antibody drug conjugate drawn as a “collapsed” antibody (Ab) attached to an expanded drug structure]

Page 8: HELM Notation Overview

Hierarchical Editing Language for Macromolecules

– Hierarchical

• Amenable to the various “levels”: Complex Polymer ⇒ Simple Polymer ⇒ Monomer ⇒ Atom

– Extensible

• Allowing addition of new biopolymer types

– (Reasonably) comprehensive

• e.g. allowing representation of oligonucleotide hybridization

– Canonicalizable

• Facilitating uniqueness checking

– (Somewhat) human-readable
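
The hierarchy listed above can be pictured as nested data structures. The following is only a minimal sketch of that idea, not the HELM data model itself: the class names, the field names, the "PEPTIDE1"-style identifiers and the 1-based positions are all assumptions made for illustration, and atom-level detail is left out on the assumption that it lives in a monomer database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Monomer:
    """One building block, e.g. an amino acid or a nucleotide."""
    symbol: str  # e.g. "A" or "dF"; atoms and bonds are not modelled here


@dataclass
class SimplePolymer:
    """A single-type polymer: an ordered sequence of monomers."""
    polymer_id: str  # hypothetical identifier such as "PEPTIDE1"
    monomers: List[Monomer]


@dataclass(frozen=True)
class Connection:
    """A link between two monomers, possibly in different simple polymers."""
    source_polymer: str
    source_position: int  # 1-based monomer position (assumed convention)
    target_polymer: str
    target_position: int


@dataclass
class ComplexPolymer:
    """Several simple polymers plus the connections between them."""
    polymers: List[SimplePolymer]
    connections: List[Connection] = field(default_factory=list)

    def monomer_count(self) -> int:
        # Total number of monomers across all simple polymers.
        return sum(len(p.monomers) for p in self.polymers)


chain_a = SimplePolymer("PEPTIDE1", [Monomer(s) for s in "ACD"])
chain_b = SimplePolymer("PEPTIDE2", [Monomer(s) for s in "GCK"])
complex_polymer = ComplexPolymer(
    polymers=[chain_a, chain_b],
    # Illustrative link between the second monomer of each chain.
    connections=[Connection("PEPTIDE1", 2, "PEPTIDE2", 2)],
)
print(complex_polymer.monomer_count())
```

Each level only refers to the level directly below it, which is what lets a tool work at whichever level suits the size of the molecule.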

Page 9: HELM Notation Overview

HELM Example: Simple polymer

• HELM notation: A.R.G.[dF].C.K.[ahA].E.D.A

– Non-natural amino acid codes are enclosed in square brackets

• Natural equivalent: ARGFCKXEDA
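
As a rough sketch of how the simple polymer notation above could be read by a program, the snippet below splits the string on dots, strips the square brackets, and derives the natural equivalent. The function names and the lookup table are hypothetical; the single analog entry (dF to F) and the fallback to X are inferred from the slide's own example rather than taken from a real monomer database. It also assumes monomer symbols never contain a dot.

```python
# Assumption: the 20 standard one-letter amino acid codes count as "natural".
NATURAL_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# Hypothetical lookup table; the one entry is inferred from the slide's example.
NATURAL_ANALOGS = {"dF": "F"}


def parse_simple_polymer(notation):
    """Split a dot-separated monomer string into a list of monomer symbols."""
    symbols = []
    for token in notation.split("."):
        # Non-natural codes are enclosed in square brackets; strip them.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
        if not token:
            raise ValueError("empty monomer in notation: %r" % notation)
        symbols.append(token)
    return symbols


def natural_equivalent(symbols):
    """Map each monomer to a one-letter code, using X when no analog is known."""
    letters = []
    for symbol in symbols:
        if symbol in NATURAL_AMINO_ACIDS:
            letters.append(symbol)
        else:
            letters.append(NATURAL_ANALOGS.get(symbol, "X"))
    return "".join(letters)


monomers = parse_simple_polymer("A.R.G.[dF].C.K.[ahA].E.D.A")
print(monomers)
print(natural_equivalent(monomers))  # ARGFCKXEDA
```

The natural equivalent loses information (dF becomes F, ahA becomes X), which is why the bracketed notation is needed in the first place.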

Page 10: HELM Notation Overview

HELM Example: Complex Polymer

Page 11: HELM Notation Overview

Monomer Database

• Each monomer used in the notation needs to be predefined in a monomer database

• The database includes the chemical structure of the monomer and a description of all acceptable attachment points
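
A minimal sketch of what such a monomer database might look like in code is shown below. It is an assumption-laden illustration, not the actual HELM monomer library format: the `MonomerEntry` class, the helper functions, the use of SMILES strings for the structure, and the R1/R2/R3 attachment point labels are all chosen here for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MonomerEntry:
    symbol: str
    structure: str  # chemical structure, stored here as a SMILES string (assumption)
    attachment_points: Tuple[str, ...]  # labels of the acceptable attachment points


# Hypothetical in-memory database with three amino acids.
MONOMER_DB: Dict[str, MonomerEntry] = {
    "G": MonomerEntry("G", "NCC(=O)O", ("R1", "R2")),
    "A": MonomerEntry("A", "N[C@@H](C)C(=O)O", ("R1", "R2")),
    # Cysteine gets a third, side-chain attachment point in this sketch.
    "C": MonomerEntry("C", "N[C@@H](CS)C(=O)O", ("R1", "R2", "R3")),
}


def undefined_monomers(symbols: List[str], db: Dict[str, MonomerEntry] = MONOMER_DB) -> List[str]:
    """Return the symbols that are not predefined in the database."""
    return [s for s in symbols if s not in db]


def can_attach(symbol: str, point: str, db: Dict[str, MonomerEntry] = MONOMER_DB) -> bool:
    """Check whether a monomer exists and accepts the given attachment point."""
    entry = db.get(symbol)
    return entry is not None and point in entry.attachment_points


print(undefined_monomers(["G", "A", "dF"]))
print(can_attach("C", "R3"))
print(can_attach("G", "R3"))
```

A notation that uses a monomer missing from the database (dF in this sketch) cannot be resolved to a chemical structure, so a check like `undefined_monomers` would typically run before anything else.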

Page 12: HELM Notation Overview

J. Chem. Inf. Model. 2012, 52, 2796-2806
