7/28/2019 BABEL.pdf
http://slidepdf.com/reader/full/babelpdf 1/24
Babel
molecular structure file conversion
version 3.3
OpenEye Scientific Software, Inc.
July 11, 2007
9 Bisbee Court, Suite D
Santa Fe, NM 87508
www.eyesopen.com
Copyright © 1997-2007 OpenEye Scientific Software, Santa Fe, New Mexico. All rights reserved.
This material contains proprietary information of OpenEye Scientific Software. Use
of copyright notice is precautionary only and does not imply publication or disclosure.
The information supplied in this document is believed to be true but no liability is assumed for its use or
the infringement of the rights of others resulting from its use. Information in this document is subject to
change without notice and does not represent a commitment on the part of OpenEye Scientific Software.
This package is sold/licensed/distributed subject to the condition that it shall not, by way of trade or
otherwise, be lent, re-sold, hired out or otherwise circulated without OpenEye Scientific Software’s prior
consent, in any form of packaging or cover other than that in which it was produced. No part of this
manual or accompanying documentation, may be reproduced, stored in a retrieval system on optical or
magnetic disk, tape, CD, DVD or other medium, or transmitted in any form or by any means, electronic,
mechanical, photocopying, recording or otherwise for any purpose other than for the purchaser’s personal
use without a legal agreement or other written permission granted by OpenEye.
This product should not be used in the planning, construction, maintenance, operation or use of any
nuclear facility nor the flight, navigation or communication of aircraft or ground support equipment.
OpenEye Scientific Software shall not be liable, in whole or in part, for any claims arising from such
use, including death, bankruptcy or outbreak of war.
Windows is a registered trademark of Microsoft Corporation. Apple and Macintosh are registered trade-
marks of Apple Computer, Inc. AIX and IBM are registered trademarks of International Business Ma-
chines Corporation. UNIX is a registered trademark of the Open Group. RedHat is a registered trade-
mark of RedHat, Inc. Linux is a registered trademark of Linus Torvalds. Alpha is a trademark of Digital
Equipment Corporation. SPARC is a registered trademark of SPARC International Inc.
SYBYL is a registered trademark of TRIPOS, Inc. MDL is a registered trademark and ISIS is a trademark
of MDL Information Systems, Inc. SMILES, SMARTS, and SMIRKS may be trademarks of Daylight
Chemical Information Systems. Macromodel is a trademark of Schrodinger, Inc. Schrodinger, Inc. may
be a wholly owned subsidiary of Columbia University, New York.
Python is a trademark of the Python Software Foundation.
Java is a trademark or registered trademark of Sun Microsystems, Inc. in the U.S. or other countries.
“The forefront of chemoinformatics” is a trademark of Daylight Chemical Information Systems, Inc.
Other products and software packages referenced in this document are trademarks and registered trade-
marks of their respective vendors or manufacturers.
CONTENTS
1 Introduction
2 Theory
2.1 Introduction
2.2 Formats
2.3 Flavors
2.4 Multiconformer databases
3 Installation and Licensing
3.1 Installation
3.2 Licensing
4 Usage
4.1 Command line interface
4.2 Command line options
5 Example executions
A Release Notes
A.1 Babel 3.3
A.2 Babel3 2.2
A.3 Babel3 2.1
A.4 Babel3 2.0
A.5 Babel2 2.0b1
B Known problems and caveats
B.1 Reporting of non-compliant molecule records
B.2 Aromaticity perception of very large ring systems
CHAPTER
ONE
Introduction
The OpenEye Babel program interconverts molecule files among several supported formats. Babel is
built upon and in most ways a thin wrapper around OEChem, the OpenEye toolkit for chemistry, and the
chemoinformatics foundation for most OpenEye applications. Babel is intended to provide convenient
and flexible access to the file interconversion capabilities integral to the OpenEye suite of software.
The program name "Babel" has a rich and proud history involving Arizona (a U.S. state formerly in
the New Mexico territories), the open source software movement, intrigue, daring, and several colorful
characters. For now, suffice it to say that this incarnation, the OpenEye Babel application, should
not be confused with OpenBabel, the original open source Babel, or OELib’s Babel.
CHAPTER
TWO
Theory
2.1 Introduction
Babel should not really need a theory manual. The relevant theory is OEChem theory. However, a few
points are worth noting.
1. OEChem adheres to format specifications insofar as they are defined by authoritative documentation.
2. Format variants are generally handled by "flavors".
3. In some cases de facto format variants exist by virtue of their general usage, which OEChem may
support when not in conflict with the defined format (e.g., MOL2 files with absent hydrogens).
4. Different formats differ not only in their encoding but also in the information represented,
e.g., the molecular representation. Where source information is absent, use of the term "conversion"
is something of a stretch, as information must be inferred (e.g., bonds from PDB files).
Thus, it is not always possible to guarantee correct output for all conversions when correctness is
defined by information not strictly contained in the data source.
5. OEChem is specifically designed to handle and interconvert the different chemical models intrinsic
to certain formats, for example, the varying aromaticity models of Tripos, MDL, and Daylight.
2.2 Formats
The current list of supported formats can be displayed by running "babel -helpformats", and is as
follows:
extension       OEChem code              format              read   write
smi             ( 1) OEFormat::SMI       SMILES              yes    yes
mdl,mol,rxn     ( 2) OEFormat::MDL       MDL Mol             yes    yes
ent,pdb         ( 3) OEFormat::PDB       PDB                 yes    yes
mol2,syb        ( 4) OEFormat::MOL2      Tripos MOL2         yes    yes
bin             ( 5) OEFormat::BIN       OEBinary v1         yes    no
tdt             ( 6) OEFormat::TDT       Daylight TDT        no     no
ism,isosmi      ( 7) OEFormat::ISM       Isomeric SMILES     yes    yes
mol2h           ( 8) OEFormat::MOL2H     MOL2 with H         yes    yes
sd,sdf          ( 9) OEFormat::SDF       MDL SDF             yes    yes
can             (10) OEFormat::CAN       Canonical SMILES    yes    yes
mf              (11) OEFormat::MF        Molecular Formula   no     yes
xyz             (12) OEFormat::XYZ       XYZ                 yes    yes
fasta,seq       (13) OEFormat::FASTA     FASTA               yes    yes
mopac,pac       (14) OEFormat::MOPAC     MOPAC               no     yes
oeb             (15) OEFormat::OEB       OEBinary v2         yes    yes
dat,mmd,mmod    (16) OEFormat::MMOD      Macromodel          yes    yes
sln             (17) OEFormat::SLN       Tripos SLN          no     yes
rd,rdf          (18) OEFormat::RDF       MDL RDF             yes    no
cdx             (19) OEFormat::CDX       ChemDraw CDX        yes    yes
skc             (20) OEFormat::SKC       MDL ISIS Sketch     yes    no
This list is completely determined by the OEChem library upon which Babel is built. Similarly, all
applications with OEChem inside can normally handle these same formats.
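As a compact illustration of the table above, the following Python sketch encodes the extensions and the read/write columns in a dictionary and checks whether a given conversion is possible. The dictionary and the helper name can_convert are illustrative only and are not part of Babel or OEChem; the authoritative list is whatever "babel -helpformats" reports for the installed version.

```python
# Read/write support per extension, transcribed from the format table.
# Each entry is (format name, readable, writable). Illustrative only.
FORMATS = {}

def _add(extensions, name, read, write):
    for ext in extensions.split(","):
        FORMATS[ext] = (name, read, write)

_add("smi", "SMILES", True, True)
_add("mdl,mol,rxn", "MDL Mol", True, True)
_add("ent,pdb", "PDB", True, True)
_add("mol2,syb", "Tripos MOL2", True, True)
_add("bin", "OEBinary v1", True, False)
_add("tdt", "Daylight TDT", False, False)
_add("ism,isosmi", "Isomeric SMILES", True, True)
_add("mol2h", "MOL2 with H", True, True)
_add("sd,sdf", "MDL SDF", True, True)
_add("can", "Canonical SMILES", True, True)
_add("mf", "Molecular Formula", False, True)
_add("xyz", "XYZ", True, True)
_add("fasta,seq", "FASTA", True, True)
_add("mopac,pac", "MOPAC", False, True)
_add("oeb", "OEBinary v2", True, True)
_add("dat,mmd,mmod", "Macromodel", True, True)
_add("sln", "Tripos SLN", False, True)
_add("rd,rdf", "MDL RDF", True, False)
_add("cdx", "ChemDraw CDX", True, True)
_add("skc", "MDL ISIS Sketch", True, False)

def can_convert(in_ext, out_ext):
    """Return True if in_ext is readable and out_ext is writable."""
    src = FORMATS.get(in_ext.lower())
    dst = FORMATS.get(out_ext.lower())
    return bool(src and dst and src[1] and dst[2])

print(can_convert("sdf", "mol2"))  # True
print(can_convert("mf", "sdf"))    # False: molecular formula is write-only
print(can_convert("pdb", "bin"))   # False: OEBinary v1 is read-only
```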
2.3 Flavors
For each supported format there may be input and output flavors. Some flavors are generic and applicable
to multiple formats. If none are specified, the default flavor will be used. Babel is intended to expose
all the format flavors defined in the official OEChem API. To list the available flavors, run "babel
--help all":
prompt> babel --help all
:jGf:
:jGDDDDf:
,fDDDGjLDDDf, ______ _ _
,fDDLt: :iLDDL; | ___ \ | | | |
;fDLt: :tfDG; | |_/ / __ _| |__ ___| |
,jft: ,ijfffji, :iff | ___ \/ _‘ | ’_ \ / _ \ |
.jGDDDDDDDDDGt. | |_/ / (_| | |_) | __/ |
;GDDGt:’’’:tDDDG, \____/ \__,_|_.__/ \___|_|
.DDDG: :GDDG.
;DDDj tDDDi babel - molecular structure file conversion
,DDDf fDDD, version: 3.3
LDDDt. .fDDDj
.tDDDDfjtjfDDDGt copyright (c) 2005,2006,2007
:ifGDDDDDGfi. OpenEye Scientific Software, Inc.
.:::. OEChem version: 1.4.3
...................... platform: osx-10.4-g++3.3-G4
DDDDDDDDDDDDDDDDDDDDDD built: 20070401
DDDDDDDDDDDDDDDDDDDDDD
licensee: OpenEye
site: Albuquerque
Complete parameter list
basic : file i/o and other top level options
-in : input file
-out : output file
-firstonly : convert first molecule
-helpformats : list supported formats
-n : convert only N molecules (0 means all)
-skip : skip first N molecules
-v : verbose
-vv : very verbose
advanced
-ifmt : input format
-ofmt : output format
-add2d : add 2D coordinates
-chunk : split the output into several files
-chunk_filecount : number of chunk output files
-chunk_molcount : molecules per chunk output file
-chunk_prefix : file prefix for output chunk files (default is
<inbase>_XX)
-hydrogens : hydrogen handling
-igz : ungzip input
-input_params : read execution parameters from file
-mc : treat the file as multi-conformer
-mc2sc : one output scmol for each input mcmol
-mc_isomer : treat the file as multi-conformer (isomeric)
-mc_titles : mc perception title-sensitive
-mdlstereocorrect : correct non-compliant MDL stereo if possible
-molcount : write molcount to stdout
-nowarn : supress warnings
-ogz : gzip output
-output_names : write molecule names to file
-output_params : write execution parameters to file
-parts2mols : split out connected components
-perceive_residues : perceive macromolecular residues
-perceive_residues_preserve_All
-perceive_residues_preserve_AlternateLocation
-perceive_residues_preserve_ChainID
-perceive_residues_preserve_HetAtom
-perceive_residues_preserve_InsertCode
-perceive_residues_preserve_ResidueName
-perceive_residues_preserve_ResidueNumber
-perceive_residues_preserve_SerialNumber
-quiet : minimal verbosity, no banner
-sc : handle as single-conformer molecules
-sd2title : copy specified SD data to title
-stereofrom3d : perceive stereo from input 3D
flavors, input generic : generic (format non-specific) input flavors
-iAroMask
-iFlavorNone : raw, no standardizations
-iGenericMask
-iOEAroModelDaylight
-iOEAroModelMDL
-iOEAroModelMMFF
-iOEAroModelOpenEye
-iOEAroModelTripos
-iRings
flavors, output generic : generic (format non-specific) output flavors
-oAroMask
-oFlavorNone : raw, no standardizations
-oGenericMask
-oOEAroModelDaylight
-oOEAroModelMDL
-oOEAroModelMMFF
-oOEAroModelOpenEye
-oOEAroModelTripos
-oRings
flavors, input format specific : format specific input flavors
flavors, mmod specific
-mmodiDefault : default flavors
-mmodiFormalCrg
flavors, mol2 specific
-mol2iDefault : default flavors
-mol2iM2H
flavors, pdb specific
-pdbiALL : read all atoms including alternate locations, dummy atoms,
etc.
-pdbiAllMask
-pdbiBasicMask
-pdbiBondOrder
-pdbiCHARGE : read partial charges from b-factor field
-pdbiConnect
-pdbiDATA : preserve header data as generic data
-pdbiDELPHI : combines -pdbiCHARGE and -pdbiRADIUS
-pdbiDefault : default flavors
-pdbiEND : read END as separator
-pdbiENDM : read ENDM as separator
-pdbiExtraMask
-pdbiFormalCrg
-pdbiImplicitH
-pdbiRADIUS : read atomic radius from occupancy field
-pdbiRings
-pdbiTER : read TER as separator
-pdbiTerMask
flavors, smiles specific
-smiiCanon
-smiiDefault : default flavors
-smiiStrict
flavors, xyz specific
-xyziBondOrder
-xyziConnect
-xyziDefault : default flavors
-xyziExtraMask
-xyziFormalCrg
-xyziImplicitH
-xyziRings
flavors, output format specific : format specific output flavors
flavors, mdl specific
-mdloCurrentParity : write internal parity
-mdloDefault : default flavors
-mdloMCHG : write MCHG and MRAD fields for charged/radical atoms
-mdloMDLParity : write MDL parity
-mdloMISO : write ISO field for isotopes
-mdloMMask
-mdloMRGP : write RGP field for each R-group atom
-mdloMV30 : MDL V3000 format
-mdloNoParity : write no parity
-mdloPMask
flavors, mf specific
-mfoDefault : default flavors
-mfoTitle
flavors, mmod specific
-mmodoAtomTypes
-mmodoDefault : default flavors
flavors, mol2 specific
-mol2oAtomNames
-mol2oAtomTypeNames
-mol2oBondTypeNames
-mol2oDefault : default flavors
-mol2oHydrogens
-mol2oNameMask
-mol2oOrderAtoms
-mol2oSubstructure
flavors, mopac specific
-mopacoCHARGES : write charges
-mopacoDefault : default flavors
-mopacoXYZ : cartesian coords (default is internal coords/z-matrix)
flavors, pdb specific
-pdboBONDS : write CONECT records (all single without -pdboORDERS)
-pdboBOTH : bi-directional CONECT records
-pdboCHARGE : write partial charges to b-factor field
-pdboCurrentResidues
-pdboDELPHI : combines -pdboCHARGE and -pdboRADIUS
-pdboDefault : default flavors
-pdboELEMENT
-pdboFormalCrg
-pdboHETBONDS
-pdboNoResidues
-pdboOEResidues
-pdboORDERS : include bond orders in CONECT records
-pdboOrderAtoms
-pdboRADIUS : write atomic radii to occupancy field
-pdboTER : terminate with TER rather than END
flavors, smiles specific
-smioAtomMaps
-smioAtomStereo
-smioBondStereo
-smioCanonical
-smioDefault : default flavors
-smioExtBonds
-smioHydrogens
-smioImpHCount
-smioIsotopes
-smioKekule
-smioRGroups
-smioSmiMask
-smioSuperAtoms
2.4 Multiconformer databases
OEChem and Babel are designed to handle multiconformer files in a consistent way across formats.
OEBinary is an explicitly multiconformer format. Others are not, so consistent rules must exist to determine
whether subsequent molecules are in fact conformers of the same molecule. Controls exist to specify
whether stereoisomers are considered the same or different molecules, and whether titles should
distinguish molecules. Within one multiconformer molecule, OEChem requires that atoms and bonds be
identically ordered.
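The idea of treating consecutive records as conformers can be illustrated with a small Python sketch. This is not OEChem's algorithm: it assumes each record has already been reduced to a hypothetical tuple of (title, connectivity key, isomeric key) and simply groups adjacent records whose chosen key matches, loosely mirroring the stereoisomer and title controls described above.

```python
from itertools import groupby

def group_conformers(records, isomeric=False, use_titles=False):
    """Group consecutive records that compare equal under the chosen key.

    records: iterable of (title, connectivity_key, isomeric_key) tuples,
    a hypothetical stand-in for real molecule records.
    """
    def key(rec):
        title, conn, iso = rec
        k = (iso,) if isomeric else (conn,)
        return k + (title,) if use_titles else k
    # groupby only merges adjacent items, matching the "consecutive" rule.
    return [list(g) for _, g in groupby(records, key=key)]

records = [
    ("a", "CC(N)O", "C[C@H](N)O"),
    ("a", "CC(N)O", "C[C@H](N)O"),
    ("b", "CC(N)O", "C[C@@H](N)O"),
    ("c", "CCO", "CCO"),
]
print([len(g) for g in group_conformers(records)])                   # [3, 1]
print([len(g) for g in group_conformers(records, isomeric=True)])    # [2, 1, 1]
print([len(g) for g in group_conformers(records, use_titles=True)])  # [2, 1, 1]
```

Because only adjacent records are merged, two conformers of the same molecule separated by a different molecule would end up in separate groups in this sketch.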
CHAPTER
THREE
Installation and Licensing
3.1 Installation
As with other OpenEye packages, Babel is normally shipped as a gzipped tarball, to be installed into a
subdirectory ”openeye” (normally /usr/local/openeye). For example:
prompt> cd /usr/local
prompt> tar xzvf $HOME/babel-3.3-centos-3.6-i586.tar.gz
3.2 Licensing
Babel requires a valid OpenEye license for the product OEChem, and only OEChem licensees are
entitled to use Babel. (One feature, generating 2D coordinates, invoked by the option -add2d, requires
an Ogham (oedepict) license, but this license is not required for any other operation.) The license file
should be specified by the environment variable OE_LICENSE. To request an evaluation license, use OpenEye’s
online request form. If already a licensee, contact [email protected] or your local system
administrator for a license file. To purchase OEChem and Babel, or other OpenEye software, contact
CHAPTER
FOUR
Usage
4.1 Command line interface
The command line interface is similar to that of other OpenEye applications. Normally, input and output
file formats are implied by filename extensions. Also, gzipped input and output are allowed and implied
by the ".gz" extension. Standard input and output can be used by specifying only the file extension.
Extensive online help is available. The simplest way to get started is just to type "babel" and follow
the directions.
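The extension rules above can be sketched in Python. The helper split_spec is hypothetical and only mimics the convention described here (format from the extension, gzip from a trailing ".gz", standard streams when only the extension is given); Babel's own parsing is done by OEChem and may differ in detail.

```python
def split_spec(spec):
    """Split a file argument into (basename, format extension, gzipped)."""
    gzipped = spec.endswith(".gz")
    if gzipped:
        spec = spec[:-3]
    base, dot, ext = spec.rpartition(".")
    if not dot:
        raise ValueError("no format extension in %r" % spec)
    # An empty basename, as in ".sdf.gz", stands for stdin/stdout.
    return base, ext, gzipped

print(split_spec("foo.sdf"))     # ('foo', 'sdf', False)
print(split_spec("bar.oeb.gz"))  # ('bar', 'oeb', True)
print(split_spec(".sdf.gz"))     # ('', 'sdf', True)
```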
4.2 Command line options
4.2.1 Basic
-in File containing input molecules.
-out File containing output molecules.
-firstonly Convert first molecule only and exit.
-helpformats List supported formats.
-n Convert the first n molecules (0, the default, means all).
-skip Skip the first n molecules.
-v Verbose.
-vv Very verbose.
4.2.2 Advanced
-ifmt Input format specification. Not normally needed, since formats are implied by filename
extensions. Specify by extension (e.g., "mol2", "sdf", "smi", etc.).
-ofmt Output format specification. Not normally needed, since formats are implied by filename
extensions. Specify by extension (e.g., "mol2", "sdf", "smi", etc.). Useful with -chunk.
-add2d Generate 2D coordinates and include them in the output. This functionality is disabled in the
absence of a valid Ogham (oedepict) license. -add2d is incompatible with output formats which cannot
represent 2D coordinates.
-chunk Split, a.k.a. "chunk", the output into several files. See also -chunk_prefix, -chunk_molcount,
and -chunk_filecount. The default output file prefix is the input file directory and basename, and
the specified output format. Output format should be specified by -ofmt and -ogz.
-chunk_filecount Number of chunk output files.
-chunk_molcount Molecules per chunk output file.
-chunk_prefix File prefix for output chunk files (default is INPATH/INBASE_XX).
-hydrogens Hydrogen handling: allowed values are "add", "delete" and "same" (meaning same as
input).
-igz Ungzip input. Not normally needed, since the file extension can imply gzipped.
-input_params Read execution parameters from file.
-mc Treat the file as multi-conformer. Not needed for OEB, which is multi-conformer by default.
-mc2sc One output scmol for each input mcmol.
-mc_isomer Treat the file as multi-conformer (isomeric).
-mc_titles Multi-conformer perception title-sensitive.
-mdlstereocorrect Correct non-compliant MDL stereo if possible.
-molcount Write molecule count to stdout.
-nowarn Suppress warnings.
-ogz Gzip output. Not normally needed, since the file extension can imply gzipped. Useful with -chunk.
-output_names Write molecule names (titles) to file.
-output_params Write execution parameters to file.
-parts2mols Split out connected components.
-perceive_residues Perceive macromolecular residues.
-perceive_residues_preserve_All
-perceive_residues_preserve_AlternateLocation
-perceive_residues_preserve_ChainID
-perceive_residues_preserve_HetAtom
-perceive_residues_preserve_InsertCode
-perceive_residues_preserve_ResidueName
-perceive_residues_preserve_ResidueNumber
-perceive_residues_preserve_SerialNumber
-quiet Minimal verbosity, no banner.
-sc Handle input and output as single-conformer molecules. This is the default for all input formats
except OEBinary v1 and v2 (.bin and .oeb).
-sd2title Copy specified SD data to title.
-stereofrom3d Perceive stereo from input 3D.
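The relationship between -chunk_molcount and -chunk_filecount is simple arithmetic, sketched below in Python. The function name chunk_plan is hypothetical, and the sketch assumes that a fixed file count spreads molecules as evenly as possible; the manual does not spell out how Babel balances chunks.

```python
def chunk_plan(total_mols, molcount=None, filecount=None):
    """Return (number of files, maximum molecules per file) for a split."""
    if (molcount is None) == (filecount is None):
        raise ValueError("give exactly one of molcount or filecount")
    if molcount is not None:
        # Integer ceiling division: the last file may be partly filled.
        return (total_mols + molcount - 1) // molcount, molcount
    return filecount, (total_mols + filecount - 1) // filecount

print(chunk_plan(1234567, molcount=100000))  # (13, 100000)
print(chunk_plan(1234567, filecount=8))      # (8, 154321)
```

The total molecule count needed for such a plan can be obtained beforehand with the -molcount option.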
4.2.3 Format flavors
If no format flavor flags are invoked, Babel will use the default flavors for the specified input and out-
put formats. These defaults are also available by using a flavor which combines the individual flavors;
for example, -mdloDefault. To see what these defaults are, view the detailed help for the specific
default flavor (e.g., babel --help -mdloDefault). If any input flavors are specified, the user
must take full control over all input flavors. Likewise for output flavors. So, to add one flavor to the de-
faults, that flavor should be used in combination with the default flavor (e.g., babel -mdloDefault
-mdloMV30). Combining flavors involves a bitwise OR-ing of an integer datatype which represents a
binary array for these purposes. Babel reports flavors used as hex integers.
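The bitwise combination can be illustrated in Python. The bit values below are invented for the example and do not correspond to the real OEChem constants, and the composition of the default flavor is likewise assumed; only the mechanism (OR-ing flags into one integer and reporting it in hex) reflects the text above.

```python
# Hypothetical bit assignments; the real values are defined by OEChem.
MCHG = 0x1
MISO = 0x2
MRGP = 0x4
MV30 = 0x8

# Assumed composition of a default flavor, for illustration only.
DEFAULT = MCHG | MISO | MRGP

# Specifying any flavor replaces the defaults, so the default must be
# OR-ed back in when adding one extra flag.
only_v30 = MV30
default_plus_v30 = DEFAULT | MV30

print(hex(only_v30))          # 0x8
print(hex(default_plus_v30))  # 0xf
```

In this picture, "-mdloMV30" alone corresponds to only_v30, while "-mdloDefault -mdloMV30" corresponds to default_plus_v30.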
Generic input flavorings (format non-specific)
-iAroMask
-iFlavorNone Raw, no standardizations.
-iGenericMask
-iOEAroModelDaylight Daylight aromaticity model.
-iOEAroModelMDL MDL aromaticity model.
-iOEAroModelMMFF MMFF aromaticity model.
-iOEAroModelOpenEye OpenEye aromaticity model.
-iOEAroModelTripos Tripos aromaticity model.
-iRings Perceive rings.
Generic output flavorings (format non-specific)
-oAroMask
-oFlavorNone Raw, no standardizations.
-oGenericMask
-oOEAroModelDaylight Daylight aromaticity model.
-oOEAroModelMDL MDL aromaticity model.
-oOEAroModelMMFF MMFF aromaticity model.
-oOEAroModelOpenEye OpenEye aromaticity model.
-oOEAroModelTripos Tripos aromaticity model.
-oRings
Input format specific flavorings: mmod
-mmodiDefault default flavors
-mmodiFormalCrg
Input format specific flavorings: mol2
-mol2iDefault default flavors
-mol2iM2H
Input format specific flavorings: pdb
-pdbiALL read all atoms including alternate locations, dummy atoms, etc.
-pdbiAllMask
-pdbiBasicMask
-pdbiBondOrder
-pdbiCHARGE read partial charges from b-factor field
-pdbiConnect
-pdbiDATA preserve header data as generic data
-pdbiDELPHI combines -pdbiCHARGE and -pdbiRADIUS
-pdbiDefault default flavors
-pdbiEND read END as separator
-pdbiENDM read ENDM as separator
-pdbiExtraMask
-pdbiFormalCrg
-pdbiImplicitH
-pdbiRADIUS read atomic radius from occupancy field
-pdbiRings
-pdbiTER read TER as separator
-pdbiTerMask
Input format specific flavorings: smiles
-smiiCanon skips Kekulization test
-smiiDefault default flavors
-smiiStrict disallow format extensions
Input format specific flavorings: xyz
-xyziBondOrder
-xyziConnect
-xyziDefault default flavors
-xyziExtraMask
-xyziFormalCrg
-xyziImplicitH
-xyziRings
Output format specific flavorings: mdl
-mdloCurrentParity write internal parity
-mdloDefault default flavors
-mdloMCHG write MCHG and MRAD fields for charged/radical atoms
-mdloMDLParity write MDL parity
-mdloMISO write ISO field for isotopes
-mdloMMask
-mdloMRGP write RGP field for each R-group atom
-mdloMV30 MDL V3000 format
-mdloNoParity write no parity
-mdloPMask
Output format specific flavorings: mf
-mfoDefault default flavors
-mfoTitle include title
Output format specific flavorings: mmod
-mmodoAtomTypes
-mmodoDefault default flavors
Output format specific flavorings: mol2
-mol2oAtomNames
-mol2oAtomTypeNames
-mol2oBondTypeNames
-mol2oDefault default flavors
-mol2oHydrogens
-mol2oNameMask
-mol2oOrderAtoms
-mol2oSubstructure
Output format specific flavorings: mopac
-mopacoCHARGES write charges
-mopacoDefault default flavors
-mopacoXYZ cartesian coords (default is internal coords/z-matrix)
Output format specific flavorings: pdb
-pdboBONDS write CONECT records (all single without -pdboORDERS)
-pdboBOTH write bi-directional CONECT records
-pdboCHARGE write partial charges to b-factor field
-pdboCurrentResidues
-pdboDELPHI combines -pdboCHARGE and -pdboRADIUS
-pdboDefault default flavors
-pdboELEMENT writes the chemical symbol in columns 77-78 of the output
-pdboFormalCrg writes non-zero formal charges in columns 79-80 (and implies -pdboELEMENT)
-pdboHETBONDS all bonds between (and to/from) hetero atoms are written to the output PDB file
-pdboNoResidues
-pdboOEResidues
-pdboORDERS include bond orders in CONECT records
-pdboOrderAtoms
-pdboRADIUS write atomic radii to occupancy field
-pdboTER terminate with TER rather than END
Output format specific flavorings: smiles
-smioAtomMaps
-smioAtomStereo
-smioBondStereo
-smioCanonical
-smioDefault default flavors
-smioExtBonds
-smioHydrogens
-smioImpHCount
-smioIsotopes
-smioKekule
-smioRGroups
-smioSmiMask
-smioSuperAtoms
CHAPTER
FIVE
Example executions
1. prompt> babel -in foo.sdf -out bar.mol2
Convert SDF file to MOL2 file.
2. prompt> babel -in foo.sdf.gz -out bar.smi
Convert gzipped SDF file to SMILES.
3. prompt> babel foo.sdf.gz bar.smi
Convert gzipped SDF file to SMILES using the shortcut "keyless" syntax.
4. prompt> cat foo.sdf.gz | babel -in .sdf.gz -out bar.smi
Convert gzipped SDF stream from stdin to SMILES.
5. prompt> babel -in mongodb.sdf.gz -out bar.oeb.gz -mc
Convert gzipped SDF file to OEBinary multiconformer file, where consecutive molecules may be
interpreted as conformers of the same molecule.
6. prompt> babel -in mongodb.sdf.gz -out bar.oeb.gz -mc_isomer
Convert gzipped SDF file to OEBinary multiconformer file, where consecutive molecules may
be interpreted as conformers of the same molecule, but different stereoisomers are considered
different molecules.
7. prompt> babel -in mongodb.sdf.gz -mc -quiet
Create no output but report counts for input SDF file handled as multiconformer, without fanfare.
8. prompt> babel \
      -in mongodb.sdf.gz \
      -ofmt sdf \
      -ogz \
      -chunk \
      -chunk_prefix datadir/mongo_part \
      -chunk_molcount 100000
Split the input file into several files containing 100000 molecules each, in .sdf.gz format.
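When Babel is driven from a script, building the argument list explicitly avoids shell quoting problems. The following Python sketch assembles the arguments for example 8 without running anything. The helper name chunk_command is hypothetical; the options are those documented in Chapter 4.

```python
def chunk_command(infile, prefix, molcount, ofmt="sdf", gzip_out=True):
    """Build the argument list for splitting infile into chunks."""
    cmd = ["babel", "-in", infile, "-ofmt", ofmt]
    if gzip_out:
        cmd.append("-ogz")
    cmd += ["-chunk", "-chunk_prefix", prefix,
            "-chunk_molcount", str(molcount)]
    return cmd

cmd = chunk_command("mongodb.sdf.gz", "datadir/mongo_part", 100000)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the Python standard library, provided that the "babel" found on the PATH is the OpenEye program and not another tool of the same name.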
APPENDIX
A
Release Notes
A.1 Babel 3.3
July 2007 v3.3
1. Babel 3.3 is a minor update from Babel3 v2.2, largely to provide full compatibility with OEChem
1.5.0.
2. The name of the program has been changed from "babel3" to "babel", for simplicity. Incrementing
the major version number only reflects this name change.
3. Option -output_names is added to extract a list of names.
4. Option -mdlstereocorrect is added to correct non-compliant MDL stereo if possible.
5. Option -molcount is added for use in automation.
6. Bug fixed: Fixed -chunk so -n 0 is allowed.
7. Options added to allow residue perception with fine control:
-perceive_residues
-perceive_residues_preserve_AlternateLocation
-perceive_residues_preserve_ChainID
-perceive_residues_preserve_HetAtom
-perceive_residues_preserve_InsertCode
-perceive_residues_preserve_ResidueName
-perceive_residues_preserve_ResidueNumber
-perceive_residues_preserve_SerialNumber
-perceive_residues_preserve_All
A.2 Babel3 2.2
October 2006 v2.2
1. Babel3 2.2 is a minor update largely to provide full compatibility with OEChem 1.4.2. Thereby,
writing of the MDL V3000 file format is added (use -mdloDefault -mdloMV30). (Note that
a spurious warning is generated by OEChem 1.4.2 when the atom count exceeds 999 with MDL
V3000 output.) (Note that V3000 reading is not yet available.) Also included are improvements to
PDB handling.
2. Option -quiet added, to avoid banner and verbose messages.
3. Options -chunk, -chunk_prefix, -chunk_molcount and -chunk_filecount added,
to facilitate the task of splitting an input file into several output files. This can be useful for simple
"divide and conquer" approaches to parallel processing. This replicates the functionality of the
Rocs auxiliary program chunker but with format control.
4. Option -add2d added, to generate and add 2D coordinates. This feature is disabled in the absence
of a valid Ogham (oedepict) license. Very useful for generating 2D SD files from 3D or 0D input
(e.g. PDB, SMILES). 2D coordinates are required for preserving cis/trans stereochemistry in SD
files. This task can also be done with the Ogham program depict.
5. Option -stereofrom3d added, to perceive stereochemistry from 3D coordinates, for writing to
a non-3D format.
A.3 Babel3 2.1
April 2006 v2.1
1. Babel3 2.1 is a minor update largely to provide full compatibility with OEChem 1.4.0. Thereby,
reading of the MDL ISIS Sketch File format is added.
A.4 Babel3 2.0
August 2005 v2.0
1. Babel3 2.0 is the first officially supported version of this program. Prior to this, the program
"Babel2" (babel2.cpp) was an unsupported OEChem example, despite its critical functionality.
Hence the promotion. "Babel2 v2.0b1" was this program’s beta, prior to renaming.
2. Built with OEChem 1.3.4.
3. Performance has been improved significantly (x3 in typical use).
4. Fixed one-off bug in mol count with -n option.
5. Keyless syntax enabled; e.g., "babel foo.sdf bar.mol2".
6. Multiconformer perception for XYZ and PDB formats disallowed, due to fundamental format
limitations (ref: OEChem 1.3.4 manual).
A.5 Babel2 2.0b1
May 2005 v2.0b1
1. This beta release (called Babel2) reflects the form and features of the planned Babel 2.0, to be the
first officially supported version. Prior to this, the program (babel2.cpp) has been an unsupported
OEChem example, despite its critical functionality. Hence the promotion.
APPENDIX
B
Known problems and caveats
B.1 Reporting of non-compliant molecule records
The Babel program makes use of the OEChem function OEReadMolecule() and input molecule streams.
This is a highly robust method for processing input data including recovering from input errors and
broken, non-compliant files. However, error recovery sometimes entails not reporting input errors, or not
correlating them precisely with input records.
B.2 Aromaticity perception of very large ring systems
The OpenEye aromaticity model considers all rings, including non-SSSR rings, with unlimited size. Thus,
with some large ring systems (esp. larger buckminsterfullerenes) the algorithm can be impractically slow
(days). This problem does not affect Babel’s normal task of processing small organic molecules or
oligomeric macromolecules.