Babel

molecular structure file conversion

version 3.3 

OpenEye Scientific Software, Inc.

July 11, 2007

9 Bisbee Court, Suite D

Santa Fe, NM 87508

www.eyesopen.com

[email protected]

Copyright © 1997-2007 OpenEye Scientific Software, Santa Fe, New Mexico. All rights reserved. This material contains proprietary information of OpenEye Scientific Software. Use of copyright notice is precautionary only and does not imply publication or disclosure.

The information supplied in this document is believed to be true but no liability is assumed for its use or the infringement of the rights of others resulting from its use. Information in this document is subject to change without notice and does not represent a commitment on the part of OpenEye Scientific Software.

This package is sold/licensed/distributed subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out or otherwise circulated without OpenEye Scientific Software’s prior consent, in any form of packaging or cover other than that in which it was produced. No part of this manual or accompanying documentation may be reproduced, stored in a retrieval system on optical or magnetic disk, tape, CD, DVD or other medium, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, for any purpose other than for the purchaser’s personal use without a legal agreement or other written permission granted by OpenEye.

This product should not be used in the planning, construction, maintenance, operation or use of any nuclear facility nor the flight, navigation or communication of aircraft or ground support equipment. OpenEye Scientific Software shall not be liable, in whole or in part, for any claims arising from such use, including death, bankruptcy or outbreak of war.

Windows is a registered trademark of Microsoft Corporation. Apple and Macintosh are registered trademarks of Apple Computer, Inc. AIX and IBM are registered trademarks of International Business Machines Corporation. UNIX is a registered trademark of the Open Group. RedHat is a registered trademark of RedHat, Inc. Linux is a registered trademark of Linus Torvalds. Alpha is a trademark of Digital Equipment Corporation. SPARC is a registered trademark of SPARC International Inc.

SYBYL is a registered trademark of TRIPOS, Inc. MDL is a registered trademark and ISIS is a trademark of MDL Information Systems, Inc. SMILES, SMARTS, and SMIRKS may be trademarks of Daylight Chemical Information Systems. Macromodel is a trademark of Schrodinger, Inc. Schrodinger, Inc may be a wholly owned subsidiary of the Columbia University, New York.

Python is a trademark of the Python Software Foundation.

Java is a trademark or registered trademark of Sun Microsystems, Inc. in the U.S. or other countries.

“The forefront of chemoinformatics” is a trademark of Daylight Chemical Information Systems, Inc.

Other products and software packages referenced in this document are trademarks and registered trademarks of their respective vendors or manufacturers.

CONTENTS

1 Introduction

2 Theory
    2.1 Introduction
    2.2 Formats
    2.3 Flavors
    2.4 Multiconformer databases

3 Installation and Licensing
    3.1 Installation
    3.2 Licensing

4 Usage
    4.1 Command line interface
    4.2 Command line options

5 Example executions

A Release Notes
    A.1 Babel 3.3
    A.2 Babel3 2.2
    A.3 Babel3 2.1
    A.4 Babel3 2.0
    A.5 Babel2 2.0b1

B Known problems and caveats
    B.1 Reporting of non-compliant molecule records
    B.2 Aromaticity perception of very large ring systems

CHAPTER

ONE

Introduction

The OpenEye Babel program interconverts molecule files among several supported formats. Babel is built upon and in most ways a thin wrapper around OEChem, the OpenEye toolkit for chemistry, and the chemoinformatics foundation for most OpenEye applications. Babel is intended to provide convenient and flexible access to the file interconversion capabilities integral to the OpenEye suite of software.

The program name "Babel" has a rich and proud history involving Arizona (a U.S. state formerly in the New Mexico territories), the open source software movement, intrigue, daring, and several colorful characters. For now may it suffice to say that this incarnation, the OpenEye Babel application, should not be confused with OpenBabel, the original open source Babel, or OELib’s Babel.

CHAPTER

TWO

Theory

2.1 Introduction

Babel should not really need a theory manual. The relevant theory is OEChem theory. However, a few points are worth noting.

1. OEChem adheres to format specifications insofar as they are defined by authoritative documentation.

2. Format variants are generally handled by "flavors".

3. In some cases de facto format variants exist by virtue of their general usage, which OEChem may support when not in conflict with the defined format (e.g., MOL2 files with absent hydrogens).

4. Different formats differ not only in their encoding but also in the information represented, e.g., the molecular representation. Where source information is absent, use of the term "conversion" is something of a stretch, as information must be inferred (e.g., bonds from PDB files). Thus, it is not always possible to guarantee correct output for all conversions, when correctness is defined by information not strictly contained in the data source.

5. OEChem is specifically designed to handle and interconvert the different chemical models intrinsic to certain formats, for example, the varying aromaticity models of Tripos, MDL, and Daylight.

2.2 Formats

The current list of supported formats can be displayed by running "babel -helpformats". The formats are as follows:

extension       OEChem code             format              read  write
smi             ( 1) OEFormat::SMI      SMILES              yes   yes
mdl,mol,rxn     ( 2) OEFormat::MDL      MDL Mol             yes   yes
ent,pdb         ( 3) OEFormat::PDB      PDB                 yes   yes
mol2,syb        ( 4) OEFormat::MOL2     Tripos MOL2         yes   yes
bin             ( 5) OEFormat::BIN      OEBinary v1         yes   no
tdt             ( 6) OEFormat::TDT      Daylight TDT        no    no
ism,isosmi      ( 7) OEFormat::ISM      Isomeric SMILES     yes   yes
mol2h           ( 8) OEFormat::MOL2H    MOL2 with H         yes   yes
sd,sdf          ( 9) OEFormat::SDF      MDL SDF             yes   yes
can             (10) OEFormat::CAN      Canonical SMILES    yes   yes
mf              (11) OEFormat::MF       Molecular Formula   no    yes
xyz             (12) OEFormat::XYZ      XYZ                 yes   yes
fasta,seq       (13) OEFormat::FASTA    FASTA               yes   yes
mopac,pac       (14) OEFormat::MOPAC    MOPAC               no    yes
oeb             (15) OEFormat::OEB      OEBinary v2         yes   yes
dat,mmd,mmod    (16) OEFormat::MMOD     Macromodel          yes   yes
sln             (17) OEFormat::SLN      Tripos SLN          no    yes
rd,rdf          (18) OEFormat::RDF      MDL RDF             yes   no
cdx             (19) OEFormat::CDX      ChemDraw CDX        yes   yes
skc             (20) OEFormat::SKC      MDL ISIS Sketch     yes   no

This list is completely determined by the OEChem library upon which Babel is built. Similarly, all applications with OEChem inside can normally handle these same formats.

2.3 Flavors

For each supported format there may be input and output flavors. Some flavors are generic and applicable to multiple formats. If none are specified the default flavor will be used. Babel is intended to expose all the format flavors defined in the official OEChem API. To list the available flavors, run "babel --help all":

prompt> babel --help all

[ASCII-art OpenEye and Babel banner]

babel - molecular structure file conversion
version: 3.3
copyright (c) 2005,2006,2007 OpenEye Scientific Software, Inc.
OEChem version: 1.4.3
platform: osx-10.4-g++3.3-G4
built: 20070401
licensee: OpenEye
site: Albuquerque

Complete parameter list

basic : file i/o and other top level options
  -in : input file
  -out : output file
  -firstonly : convert first molecule
  -helpformats : list supported formats
  -n : convert only N molecules (0 means all)
  -skip : skip first N molecules
  -v : verbose
  -vv : very verbose

advanced
  -ifmt : input format
  -ofmt : output format
  -add2d : add 2D coordinates
  -chunk : split the output into several files
  -chunk_filecount : number of chunk output files
  -chunk_molcount : molecules per chunk output file
  -chunk_prefix : file prefix for output chunk files (default is <inbase>_XX)
  -hydrogens : hydrogen handling
  -igz : ungzip input
  -input_params : read execution parameters from file
  -mc : treat the file as multi-conformer
  -mc2sc : one output scmol for each input mcmol
  -mc_isomer : treat the file as multi-conformer (isomeric)
  -mc_titles : mc perception title-sensitive
  -mdlstereocorrect : correct non-compliant MDL stereo if possible
  -molcount : write molcount to stdout
  -nowarn : supress warnings
  -ogz : gzip output
  -output_names : write molecule names to file
  -output_params : write execution parameters to file
  -parts2mols : split out connected components
  -perceive_residues : perceive macromolecular residues
  -perceive_residues_preserve_All
  -perceive_residues_preserve_AlternateLocation
  -perceive_residues_preserve_ChainID
  -perceive_residues_preserve_HetAtom
  -perceive_residues_preserve_InsertCode
  -perceive_residues_preserve_ResidueName
  -perceive_residues_preserve_ResidueNumber
  -perceive_residues_preserve_SerialNumber
  -quiet : minimal verbosity, no banner
  -sc : handle as single-conformer molecules
  -sd2title : copy specified SD data to title
  -stereofrom3d : perceive stereo from input 3D

flavors, input generic : generic (format non-specific) input flavors
  -iAroMask
  -iFlavorNone : raw, no standardizations
  -iGenericMask
  -iOEAroModelDaylight
  -iOEAroModelMDL
  -iOEAroModelMMFF
  -iOEAroModelOpenEye
  -iOEAroModelTripos
  -iRings

flavors, output generic : generic (format non-specific) output flavors
  -oAroMask
  -oFlavorNone : raw, no standardizations
  -oGenericMask
  -oOEAroModelDaylight
  -oOEAroModelMDL
  -oOEAroModelMMFF
  -oOEAroModelOpenEye
  -oOEAroModelTripos
  -oRings

flavors, input format specific : format specific input flavors

flavors, mmod specific
  -mmodiDefault : default flavors
  -mmodiFormalCrg

flavors, mol2 specific
  -mol2iDefault : default flavors
  -mol2iM2H

flavors, pdb specific
  -pdbiALL : read all atoms including alternate locations, dummy atoms, etc.
  -pdbiAllMask
  -pdbiBasicMask
  -pdbiBondOrder
  -pdbiCHARGE : read partial charges from b-factor field
  -pdbiConnect
  -pdbiDATA : preserve header data as generic data
  -pdbiDELPHI : combines -pdbiCHARGE and -pdbiRADIUS
  -pdbiDefault : default flavors
  -pdbiEND : read END as separator
  -pdbiENDM : read ENDM as separator
  -pdbiExtraMask
  -pdbiFormalCrg
  -pdbiImplicitH
  -pdbiRADIUS : read atomic radius from occupancy field
  -pdbiRings
  -pdbiTER : read TER as separator
  -pdbiTerMask

flavors, smiles specific
  -smiiCanon
  -smiiDefault : default flavors
  -smiiStrict

flavors, xyz specific
  -xyziBondOrder
  -xyziConnect
  -xyziDefault : default flavors
  -xyziExtraMask
  -xyziFormalCrg
  -xyziImplicitH
  -xyziRings

flavors, output format specific : format specific output flavors

flavors, mdl specific
  -mdloCurrentParity : write internal parity
  -mdloDefault : default flavors
  -mdloMCHG : write MCHG and MRAD fields for charged/radical atoms
  -mdloMDLParity : write MDL parity
  -mdloMISO : write ISO field for isotopes
  -mdloMMask
  -mdloMRGP : write RGP field for each R-group atom
  -mdloMV30 : MDL V3000 format
  -mdloNoParity : write no parity
  -mdloPMask

flavors, mf specific
  -mfoDefault : default flavors
  -mfoTitle

flavors, mmod specific
  -mmodoAtomTypes
  -mmodoDefault : default flavors

flavors, mol2 specific
  -mol2oAtomNames
  -mol2oAtomTypeNames
  -mol2oBondTypeNames
  -mol2oDefault : default flavors
  -mol2oHydrogens
  -mol2oNameMask
  -mol2oOrderAtoms
  -mol2oSubstructure

flavors, mopac specific
  -mopacoCHARGES : write charges
  -mopacoDefault : default flavors
  -mopacoXYZ : cartesian coords (default is internal coords/z-matrix)

flavors, pdb specific
  -pdboBONDS : write CONECT records (all single without -pdboORDERS)
  -pdboBOTH : bi-directional CONECT records
  -pdboCHARGE : write partial charges to b-factor field
  -pdboCurrentResidues
  -pdboDELPHI : combines -pdboCHARGE and -pdboRADIUS
  -pdboDefault : default flavors
  -pdboELEMENT
  -pdboFormalCrg
  -pdboHETBONDS
  -pdboNoResidues
  -pdboOEResidues
  -pdboORDERS : include bond orders in CONECT records
  -pdboOrderAtoms
  -pdboRADIUS : write atomic radii to occupancy field
  -pdboTER : terminate with TER rather than END

flavors, smiles specific
  -smioAtomMaps
  -smioAtomStereo
  -smioBondStereo
  -smioCanonical
  -smioDefault : default flavors
  -smioExtBonds
  -smioHydrogens
  -smioImpHCount
  -smioIsotopes
  -smioKekule
  -smioRGroups
  -smioSmiMask
  -smioSuperAtoms

2.4 Multiconformer databases

OEChem and Babel are designed to handle multiconformer files in a consistent way across formats. OEBinary is an explicitly multiconformer format. Others are not, so consistent rules must exist to determine whether subsequent molecules are in fact conformers of the same molecule. Controls exist to specify whether stereoisomers are considered the same or different molecules, and whether titles should distinguish molecules. Within one multiconformer molecule, OEChem requires that atoms and bonds must be identically ordered.
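The grouping rule described here can be pictured with a small Python sketch: consecutive records are merged into one multiconformer molecule when a comparison key matches, and the key optionally includes stereochemistry and title. Everything in it is an assumption for illustration (the record dictionaries, the `graph` and `isomer` strings standing in for atom-and-bond-ordered connectivity, and the function name); it is not OEChem's actual perception algorithm.

```python
from itertools import groupby


def group_conformers(records, use_isomer=False, use_titles=False):
    """Group consecutive records that share a key into multiconformer molecules.

    Each record is a dict with hypothetical keys:
      'graph'  - string encoding connectivity with a fixed atom/bond order
      'isomer' - string encoding connectivity plus stereochemistry
      'title'  - molecule title
    """
    def key(record):
        parts = [record["graph"]]
        if use_isomer:
            parts.append(record["isomer"])  # stereoisomers become distinct molecules
        if use_titles:
            parts.append(record["title"])   # differing titles become distinct molecules
        return tuple(parts)

    # groupby only merges adjacent items, matching the "subsequent molecules" rule.
    return [list(group) for _, group in groupby(records, key=key)]


records = [
    {"graph": "CC(N)O", "isomer": "C[C@H](N)O", "title": "a"},
    {"graph": "CC(N)O", "isomer": "C[C@@H](N)O", "title": "a"},
    {"graph": "CCO", "isomer": "CCO", "title": "b"},
]

print([len(m) for m in group_conformers(records)])
print([len(m) for m in group_conformers(records, use_isomer=True)])
```

In this sketch the default call loosely corresponds to -mc, `use_isomer=True` to -mc_isomer, and `use_titles=True` to -mc_titles.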

CHAPTER

THREE

Installation and Licensing

3.1 Installation

As with other OpenEye packages, Babel is normally shipped as a gzipped tarball, to be installed into a subdirectory "openeye" (normally /usr/local/openeye). For example:

prompt> cd /usr/local

prompt> tar xzvf $HOME/babel-3.3-centos-3.6-i586.tar.gz

3.2 Licensing

Babel requires a valid OpenEye license for the product OEChem, and only OEChem licensees are entitled to use Babel. (One feature, generating 2D coordinates, invoked by the option -add2d, requires an Ogham (oedepict) license, but that license is not required for any other operation.) The license file should be defined by the environment variable OE_LICENSE. To request an evaluation license use OpenEye’s online request form. If already a licensee, contact [email protected] or your local system administrator for a license file. To purchase OEChem and Babel, or other OpenEye software, contact [email protected].
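Before running a batch of conversions it can be useful to confirm that OE_LICENSE is set and points at an existing file. The Python sketch below does only that; the function name and messages are made up for illustration, and it does not inspect or validate the license contents, which only the OpenEye software can do.

```python
import os


def license_status(environ=None):
    """Hypothetical pre-flight check: report whether OE_LICENSE names an existing file."""
    environ = os.environ if environ is None else environ
    path = environ.get("OE_LICENSE")
    if not path:
        return "OE_LICENSE is not set"
    if not os.path.isfile(path):
        return "OE_LICENSE points to a missing file: " + path
    return "OE_LICENSE file found: " + path


print(license_status())
```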

CHAPTER

FOUR

Usage

4.1 Command line interface

The command line interface is similar to that of other OpenEye applications. Normally, input and output file formats are implied by filename extensions. Also, gzipped input and output are allowed and implied by the ".gz" extension. Standard input and output can be used by specifying only the file extension (e.g., "-in .sdf.gz"). Extensive online help is available. The simplest way to get started is just to type "babel" and follow the directions.

4.2 Command line options

4.2.1 Basic

-in file containing input molecules

-out file containing output molecules

-firstonly Convert first molecule only and exit.

-helpformats List supported formats.

-n Convert the first n molecules (0, the default, means all).

-skip Skip the first n molecules.

-v verbose

-vv very verbose
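When Babel is driven from a script, the basic options can be assembled into an argument list. The Python sketch below builds such a list using only the flags listed above; the helper name `babel_command` is hypothetical, and passing the counts as a separate argument after -n and -skip is an assumption based on the "-n 0" usage mentioned in the release notes.

```python
import shlex


def babel_command(infile, outfile, n=0, skip=0, verbose=False):
    """Hypothetical helper: build an argv list from the basic options."""
    if n < 0 or skip < 0:
        raise ValueError("n and skip must be non-negative")
    argv = ["babel", "-in", infile, "-out", outfile]
    if n:
        argv += ["-n", str(n)]        # 0 means "all", so it is simply omitted
    if skip:
        argv += ["-skip", str(skip)]
    if verbose:
        argv.append("-v")
    return argv


cmd = babel_command("foo.sdf.gz", "bar.smi", n=100, skip=10)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where babel is installed and licensed; building a list rather than a single string avoids shell quoting problems with unusual file names.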

4.2.2 Advanced

-ifmt Input format specification. Not normally needed, since formats are implied by filename extensions. Specify by extension (e.g., "mol2", "sdf", "smi").

-ofmt Output format specification. Not normally needed, since formats are implied by filename extensions. Specify by extension (e.g., "mol2", "sdf", "smi"). Useful with -chunk.

-add2d Generate 2D coordinates and include them in the output. This functionality is disabled in the absence of a valid Ogham (oedepict) license. -add2d is incompatible with output formats which cannot represent 2D.

-chunk Split, a.k.a. "chunk", the output into several files. See also -chunk_prefix, -chunk_molcount, and -chunk_filecount. The default output file prefix is the input file directory and basename, and the specified output format. Output format should be specified by -ofmt and -ogz.

-chunk_filecount Number of chunk output files.

-chunk_molcount Molecules per chunk output file.

-chunk_prefix File prefix for output chunk files (default is INPATH/INBASE_XX).

-hydrogens Hydrogen handling: allowed values are "add", "delete" and "same" (meaning same as input).

-igz Ungzip input. Not normally needed, since file extension can imply gzipped.

-input_params Read execution parameters from file.

-mc Treat the file as multi-conformer. Not needed for OEB, which is multi-conformer by default.

-mc2sc One output scmol for each input mcmol.

-mc_isomer Treat the file as multi-conformer (isomeric).

-mc_titles Multi-conformer perception title-sensitive.

-mdlstereocorrect Correct non-compliant MDL stereo if possible.

-molcount Write molecule count to stdout.

-nowarn Suppress warnings.

-ogz Gzip output. Not normally needed, since file extension can imply gzipped. Useful with -chunk.

-output_names Write molecule names (titles) to file.

-output_params Write execution parameters to file.

-parts2mols Split out connected components.

-perceive_residues Perceive macromolecular residues.

-perceive_residues_preserve_All

-perceive_residues_preserve_AlternateLocation

-perceive_residues_preserve_ChainID

-perceive_residues_preserve_HetAtom

-perceive_residues_preserve_InsertCode

-perceive_residues_preserve_ResidueName

-perceive_residues_preserve_ResidueNumber

-perceive_residues_preserve_SerialNumber

-quiet Minimal verbosity, no banner.

-sc Handle input and output as single-conformer molecules. This is the default for all input formats except OEBinary v1 and v2 (.bin and .oeb).

-sd2title Copy specified SD data to title.

-stereofrom3d Perceive stereo from input 3D.

4.2.3 Format flavors

If no format flavor flags are invoked, Babel will use the default flavors for the specified input and output formats. These defaults are also available by using a flavor which combines the individual flavors; for example, -mdloDefault. To see what these defaults are, view the detailed help for the specific default flavor (e.g., babel --help -mdloDefault). If any input flavors are specified, the user must take full control over all input flavors. Likewise for output flavors. So, to add one flavor to the defaults, that flavor should be used in combination with the default flavor (e.g., babel -mdloDefault -mdloMV30). Flavors are combined by a bitwise OR of integer values, each integer serving as an array of bit flags. Babel reports flavors used as hex integers.
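The bitwise OR combination can be illustrated with a few lines of Python. The bit values and the contents of `DEFAULT` below are invented for the example; the real values are defined by OEChem, and the real default can be inspected with babel --help -mdloDefault.

```python
# Hypothetical bit assignments, for illustration only.
MCHG = 0x01
MISO = 0x02
MRGP = 0x04
MV30 = 0x08
DEFAULT = MCHG | MISO  # hypothetical stand-in for -mdloDefault


def combine(*flavors):
    """OR together any number of flavor bit masks."""
    result = 0
    for flavor in flavors:
        result |= flavor
    return result


# Naming a single flavor replaces the defaults rather than extending them.
only_mv30 = combine(MV30)

# Adding a flavor to the defaults means naming both, as in "-mdloDefault -mdloMV30".
default_plus_mv30 = combine(DEFAULT, MV30)

print(hex(only_mv30), hex(default_plus_mv30))  # prints: 0x8 0xb
```

Because OR is idempotent, repeating a flavor or naming one that is already part of the default changes nothing.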

Generic input flavorings (format non-specific)

-iAroMask

-iFlavorNone Raw, no standardizations.

-iGenericMask

-iOEAroModelDaylight Daylight aromaticity model.

-iOEAroModelMDL MDL aromaticity model.

-iOEAroModelMMFF MMFF aromaticity model.

-iOEAroModelOpenEye OpenEye aromaticity model.

-iOEAroModelTripos Tripos aromaticity model.

-iRings Perceive rings.

Generic output flavorings (format non-specific)

-oAroMask

-oFlavorNone Raw, no standardizations.

-oGenericMask

-oOEAroModelDaylight Daylight aromaticity model.

-oOEAroModelMDL MDL aromaticity model.

-oOEAroModelMMFF MMFF aromaticity model.

-oOEAroModelOpenEye OpenEye aromaticity model.

-oOEAroModelTripos Tripos aromaticity model.

-oRings

Input format specific flavorings: mmod

-mmodiDefault default flavors

-mmodiFormalCrg

Input format specific flavorings: mol2

-mol2iDefault default flavors

-mol2iM2H

Input format specific flavorings: pdb

-pdbiALL read all atoms including alternate locations, dummy atoms, etc.

-pdbiAllMask

-pdbiBasicMask

-pdbiBondOrder

-pdbiCHARGE read partial charges from b-factor field

-pdbiConnect

-pdbiDATA  preserve header data as generic data

-pdbiDELPHI combines -pdbiCHARGE and -pdbiRADIUS

-pdbiDefault default flavors

-pdbiEND read END as separator

-pdbiENDM  read ENDM as separator

-pdbiExtraMask

-pdbiFormalCrg

-pdbiImplicitH

-pdbiRADIUS read atomic radius from occupancy field

-pdbiRings

-pdbiTER  read TER as separator

-pdbiTerMask

Input format specific flavorings: smiles

-smiiCanon skips Kekulization test

-smiiDefault default flavors

-smiiStrict disallow format extensions

Input format specific flavorings: xyz

-xyziBondOrder

-xyziConnect

-xyziDefault default flavors

-xyziExtraMask

-xyziFormalCrg

-xyziImplicitH

-xyziRings

Output format specific flavorings: mdl

-mdloCurrentParity write internal parity

-mdloDefault default flavors

-mdloMCHG write MCHG and MRAD fields for charged/radical atoms

-mdloMDLParity write MDL parity

-mdloMISO write ISO field for isotopes

-mdloMMask

-mdloMRGP write RGP field for each R-group atom

-mdloMV30 MDL V3000 format

-mdloNoParity write no parity

-mdloPMask

Output format specific flavorings: mf

-mfoDefault default flavors

-mfoTitle include title

Output format specific flavorings: mmod

-mmodoAtomTypes

-mmodoDefault default flavors

Output format specific flavorings: mol2

-mol2oAtomNames

-mol2oAtomTypeNames

-mol2oBondTypeNames

-mol2oDefault default flavors

-mol2oHydrogens

-mol2oNameMask

-mol2oOrderAtoms

-mol2oSubstructure

Output format specific flavorings: mopac

-mopacoCHARGES write charges

-mopacoDefault default flavors

-mopacoXYZ cartesian coords (default is internal coords/z-matrix)

Output format specific flavorings: pdb

-pdboBONDS write CONECT records (all single without -pdboORDERS)

-pdboBOTH write bi-directional CONECT records

-pdboCHARGE write partial charges to b-factor field

-pdboCurrentResidues

-pdboDELPHI combines -pdboCHARGE and -pdboRADIUS

-pdboDefault default flavors

-pdboELEMENT writes the chemical symbol in columns 77-78 of the output

-pdboFormalCrg writes non-zero formal charges in columns 79-80 (and implies -pdboELEMENT)

-pdboHETBONDS all bonds between (and to/from) hetero atoms are written to the output PDB file

-pdboNoResidues

-pdboOEResidues

-pdboORDERS include bond orders in CONECT records

-pdboOrderAtoms

-pdboRADIUS write atomic radii to occupancy field

-pdboTER  terminate with TER rather than END
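Several of these PDB flavors refer to fixed-width fields of an ATOM record. The Python sketch below lays out a simplified record and reads back the fields that -pdboRADIUS (occupancy), -pdboCHARGE (b-factor), -pdboELEMENT (columns 77-78) and -pdboFormalCrg (columns 79-80) are described as writing. The occupancy and b-factor column ranges (55-60 and 61-66) are taken from the PDB format specification rather than from this manual, the helper names are hypothetical, and atom-name alignment is simplified.

```python
def format_atom(serial, name, res_name, chain, res_seq,
                x, y, z, occupancy, b_factor, element, charge=""):
    """Build a simplified 80-column PDB ATOM record (illustrative only)."""
    return (
        "ATOM  "                              # columns 1-6: record name
        + f"{serial:5d}" + " "                # 7-11 serial, 12 blank
        + f"{name:<4}"                        # 13-16 atom name (alignment simplified)
        + " "                                 # 17 alternate location
        + f"{res_name:>3}" + " "              # 18-20 residue name, 21 blank
        + f"{chain:1}"                        # 22 chain ID
        + f"{res_seq:4d}"                     # 23-26 residue number
        + " " * 4                             # 27 insertion code, 28-30 blank
        + f"{x:8.3f}{y:8.3f}{z:8.3f}"         # 31-54 coordinates
        + f"{occupancy:6.2f}{b_factor:6.2f}"  # 55-60 occupancy, 61-66 b-factor
        + " " * 10                            # 67-76 blank
        + f"{element:>2}{charge:>2}"          # 77-78 element, 79-80 formal charge
    )


def parse_fields(line):
    """Read back the fields that the flavors above refer to."""
    return {
        "occupancy": float(line[54:60]),  # holds the radius under -pdboRADIUS
        "b_factor": float(line[60:66]),   # holds the partial charge under -pdboCHARGE
        "element": line[76:78].strip(),
        "charge": line[78:80].strip(),
    }


line = format_atom(1, "N", "ALA", "A", 1, 11.104, 6.134, -6.504, 1.55, -0.30, "N", "1+")
print(line)
print(parse_fields(line))
```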

Output format specific flavorings: smiles

-smioAtomMaps

-smioAtomStereo

-smioBondStereo

-smioCanonical

-smioDefault default flavors

-smioExtBonds

-smioHydrogens

-smioImpHCount

-smioIsotopes

-smioKekule

-smioRGroups

-smioSmiMask

-smioSuperAtoms

CHAPTER

FIVE

Example executions

1. prompt> babel -in foo.sdf -out bar.mol2

Convert SDF file to MOL2 file.

2. prompt> babel -in foo.sdf.gz -out bar.smi

Convert gzipped SDF file to SMILES.

3. prompt> babel foo.sdf.gz bar.smi

Convert gzipped SDF file to SMILES using the shortcut "keyless" syntax.

4. prompt> cat foo.sdf.gz | babel -in .sdf.gz -out bar.smi

Convert gzipped SDF stream from stdin to SMILES.

5. prompt> babel -in mongodb.sdf.gz -out bar.oeb.gz -mc

Convert gzipped SDF file to OEBinary multiconformer file, where consecutive molecules may be interpreted as conformers of the same molecule.

6. prompt> babel -in mongodb.sdf.gz -out bar.oeb.gz -mc_isomer

Convert gzipped SDF file to OEBinary multiconformer file, where consecutive molecules may be interpreted as conformers of the same molecule, but different stereoisomers are considered different molecules.

7. prompt> babel -in mongodb.sdf.gz -mc -quiet

Create no output but report counts for input SDF file handled as multiconformer, without fanfare.

8. prompt> babel \
       -in mongodb.sdf.gz \
       -ofmt sdf \
       -ogz \
       -chunk \
       -chunk_prefix datadir/mongo_part \
       -chunk_molcount 100000

Split the input file into several files containing 100000 molecules each, in .sdf.gz format.
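The splitting performed in the last example can be pictured with a short Python sketch that cuts a stream of records into blocks of a fixed size, as -chunk_molcount does, and derives the per-file molecule count needed for a target number of files, as -chunk_filecount implies. The two-digit suffix in `chunk_filename` is an assumption suggested by the documented "_XX" default prefix; the exact names Babel writes may differ.

```python
from itertools import islice


def chunks(molecules, molcount):
    """Yield successive lists of at most `molcount` items from any iterable."""
    if molcount <= 0:
        raise ValueError("molcount must be positive")
    iterator = iter(molecules)
    while True:
        block = list(islice(iterator, molcount))
        if not block:
            return
        yield block


def molcount_for_filecount(total, filecount):
    """Molecules per file needed to spread `total` molecules over `filecount` files."""
    return -(-total // filecount)  # ceiling division


def chunk_filename(prefix, index, ext):
    """Hypothetical naming scheme: <prefix>_<two-digit index>.<ext>."""
    return f"{prefix}_{index:02d}.{ext}"


for index, block in enumerate(chunks(range(7), 3), start=1):
    print(chunk_filename("datadir/mongo_part", index, "sdf.gz"), len(block))
```

Only the last chunk can be short, so with 250000 molecules and a molcount of 100000 this scheme yields three files holding 100000, 100000 and 50000 molecules.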

APPENDIX

A

Release Notes

A.1 Babel 3.3

 July 2007  v3.3

1. Babel 3.3 is a minor update from Babel3 v2.2, largely to provide full compatibility with OEChem 1.5.0.

2. The name of the program has been changed from "babel3" to "babel", for simplicity. Incrementing the major version number only reflects this name change.

3. Option -output_names is added to extract a list of names.

4. Option -mdlstereocorrect is added to correct non-compliant MDL stereo if possible.

5. Option -molcount is added for use in automation.

6. Bug fixed: -chunk now allows -n 0.

7. Options added to allow residue perception with fine control:

-perceive_residues

-perceive_residues_preserve_AlternateLocation

-perceive_residues_preserve_ChainID

-perceive_residues_preserve_HetAtom

-perceive_residues_preserve_InsertCode

-perceive_residues_preserve_Residu

-perceive_residues_preserve_ResidueNumber

-perceive_residues_preserve_Ser

-perceive_residues_preserve_All

A.2 Babel3 2.2

October 2006  v2.2

1. Babel3 2.2 is a minor update largely to provide full compatibility with OEChem 1.4.2. Thereby, writing of the MDL V3000 file format is added (use -mdloDefault -mdloMV30). (Note that a spurious warning is generated by OEChem 1.4.2 when the atom count exceeds 999 with MDL V3000 output.) (Note that V3000 reading is not yet available.) Also included are improvements to PDB handling.

2. Option -quiet added, to avoid banner and verbose messages.

3. Options -chunk, -chunk_prefix, -chunk_molcount and -chunk_filecount added, to facilitate the task of splitting an input file into several output files. This can be useful for simple "divide and conquer" approaches to parallel processing. This replicates the functionality of the Rocs auxiliary program chunker but with format control.

4. Option -add2d added, to generate and add 2D coordinates. This feature is disabled in the absence of a valid Ogham (oedepict) license. Very useful for generating 2D SD files from 3D or 0D input (e.g., PDB, SMILES). 2D coordinates are required for preserving cis/trans stereochemistry in SD files. This task can also be done with the Ogham program depict.

5. Option -stereofrom3d added, to perceive stereochemistry from 3D coordinates, for writing to a non-3D format.

A.3 Babel3 2.1

 April 2006  v2.1

1. Babel3 2.1 is a minor update largely to provide full compatibility with OEChem 1.4.0. Thereby,

reading of the MDL ISIS Sketch File format is added.

A.4 Babel3 2.0

 August 2005 v2.0

1. Babel3 2.0 is the first officially supported version of this program. Prior to this, the program "Babel2" (babel2.cpp) was an unsupported OEChem example, despite its critical functionality. Hence the promotion. "Babel2 v2.0b1" was this program’s beta, prior to renaming.

2. Built with OEChem 1.3.4.

3. Performance has been improved significantly (x3 in typical use).

4. Fixed one-off bug in mol count with -n option.

5. Keyless syntax enabled; e.g., "babel foo.sdf bar.mol2".

6. Multiconformer perception for XYZ and PDB format disallowed, due to fundamental format limitations (ref: OEChem 1.3.4 manual).

A.5 Babel2 2.0b1

 May 2005 v2.0b1

1. This beta release (called Babel2) reflects the form and features of the planned Babel 2.0, to be the

first officially supported version. Prior to this, the program (babel2.cpp) has been an unsupported

OEChem example, despite its critical functionality. Hence the promotion.

APPENDIX

B

Known problems and caveats

B.1 Reporting of non-compliant molecule records

The Babel program makes use of the OEChem function OEReadMolecule() and input molecule streams.

This is a highly robust method for processing input data including recovering from input errors and

broken, non-compliant files. However, error recovery sometimes entails not reporting input errors, or not

correlating them precisely with input records.

B.2 Aromaticity perception of very large ring systems

The OpenEye aromaticity model considers all rings, including non-SSSR rings, with unlimited size. Thus with some large ring systems (esp. larger buckminsterfullerenes) the algorithm can be impractically slow (days). This problem does not affect Babel’s normal task of processing small organic molecules or oligomeric macromolecules.
