Post on 31-May-2020

Page 1: Chemical and pharmaceutical databases and their …NIST Chemistry WebBook - Database of organic chemistry compounds, organized by species. Contains chemical and physical property data

Chemical and pharmaceutical databases and their use

The open Web offers a rich collection of diverse chemical data sources.

Chemistry Databases on the Web

CCOHS: Web Information Services - Offers subscriptions to chemical

databases as well as free access to Chemindex.

CHEMnetBASE - Online access to major chemical reference works from

Chapman and Hall/CRC including Combined Chemical Dictionary (CCD), The

Handbook of Chemistry and Physics, Polymers: A Property Database, Properties of

Organic Compounds.

CML Reference Collection - Commercial database of more than 120,000

3D molecular structures expressed in CML (Chemical Markup Language) format.

Chem Sources - Supplier information for approximately 200,000 chemical compounds. Searching for compounds and supplier information is fee-based.

ChemBank - Initiative for Chemical Genetics. Freely available collection

of data about small molecules (over 2000 compounds) and resources for studying

their properties, especially their effects on biology.

ChemExpo - News, stock reports, products, directories, classifieds, and

profiles of commercially available chemical products; focus reports.

ChemIDplus - A free database of 350,000 chemical compounds. Records consist of name, synonyms, CAS number, molecular formula, and direct links to biomedical resources. Also includes over 100,000 structures. Searchable by substructure with the free Chime plugin.

ChemSpider - Free chemistry search engine which aggregates and

indexes chemical structures and their associated information into a single searchable

repository.

Chemical Abstracts Service (CAS) - Provides pathways to published

research in the world's journal and patent literature through tools such as SciFinder,

SciFinder Scholar and STN. CAS databases provide access to over 25 million

documents, 26 million chemical substances, and 56 million sequences.

DDBST - Dortmund Data Bank and DDB Software Package - Database

of thermodynamic and transport properties of pure components and mixtures for use

in process design and development and related software package.

DETHERM Thermophysical Properties - Thermophysical properties of

pure substances and mixtures. Phase equilibrium data, caloric data, critical data,

transport properties, electrolyte data. Data sets with descriptors and bibliographic

references.

EaSI-Pro - On-line, fee-based database of dangerous substances and

(inter)national legislation on 200,000 chemicals.

FTIRsearch.com - Search a database of 87,000 FTIR and Raman spectra

using text or full spectral comparison. JAVA required.

Hazardous Chemical Database - Contains names, formulas and registry

numbers for 1991 hazardous chemicals. This site was last updated in 1996.

IRSLDB-Infrared and Raman Spectroscopy Literature Database -

Bibliography of infrared and Raman spectroscopy and its applications. Provides

reference data (title, author(s), journal, page, date info.) collected from 123 journals,

grouped into 129 classes.

IUPAC-NIST Solubility Data Series Database - Comprises 11 volumes from the IUPAC Solubility Data Series covering systems of liquids with liquids. A limited number of multi-component (organic-water-salt) systems are also included.

Indiana University Molecular Structure Center - Small molecule

database of structures that have primarily been determined using single crystal X-ray

diffraction. Features JAVA applets that can be used to orient and examine the

molecule, and a dedicated Beowulf computer system to generate high quality

rendered images. Currently contains over 2,500 molecules.

JAICI - JAICI is a not-for-profit scientific information service

organization featuring, in particular, chemical expertise.

LiqCryst - Online - Information about approximately 75,000 compounds with liquid crystalline properties. Compound search is free; the compound data are fee-based.

MALDI Database from NIST Polymers Division - Database of methods,

taken largely from the peer-reviewed scientific literature, for matrix-assisted laser

desorption ionization (MALDI) mass spectrometry of synthetic polymers.

NIH Chemical Information - Database of substances ranging from drugs

to hazardous chemicals, maintained by the US National Library of Medicine,

Bethesda, Maryland.

NIST Chemistry WebBook - Database of organic chemistry compounds,

organized by species. Contains chemical and physical property data on over 30,000

compounds.

Nanogen Index 2 - Offers a book containing data on over 3000 pesticides

and other environmental contaminants.

Organic Reactions on Wiley InterScience - Comprehensive database of

important synthetic reactions, with all published examples of the topic reactions.

Organic Syntheses - Free electronic version of printed Organic Syntheses

series - detailed reliable experimental methods for the synthesis of organic

compounds.

Organic Syntheses - Describes checked and edited experimental

procedures, spanning a broad range of synthetic methodologies. CrossRef(R) and

Chemport(R) enabled reference linking.

Organic Synthesis series - All annual volumes of Organic Syntheses in

ISIS/Base format. Checked and edited experimental synthetic procedures.

PubChem - Free database of chemical structures of small organic

molecules and information on their biological activities. Linked with NIH

PubMed/Entrez information retrieval system.

Public Chromatography Applications Database - Contains HPLC and

GC applications provided by major vendors of chromatography equipment. Can be

searched by numerous chromatographic parameters, as well as by structure,

substructure, and structural similarity.

Query Chem - Chemical search engine powered by the Google Web

API.

RADEN Data Bank - Designed to compile experimental studies and calculations of the RADiative and ENergy parameters that define the distribution of intensity in the electronic and vibrational spectra of diatomic molecules.

Reciprocal Net - Distributed database used by research crystallographers

to store information about 3D molecular structures.

SDBS, Spectral Database for Organic Compounds - Integrated spectral

database system for organic compounds, which includes the following spectra -

Electron Impact Mass (EI-MS), Fourier Transform Infrared (FT-IR), 1H and 13C

NMR, Raman and Electron Spin Resonance (ESR).

SOLV-DB - A free database of commercially available solvents

searchable by many properties. Compiled by National Center for Manufacturing

Sciences.

SPRESIweb - InfoChem GmbH - The new SpresiWeb developed by

InfoChem is an integrated scientific database containing over 4.5 million molecules,

3.5 million reactions, 380,000 references, 98,000 patents covering years 1974 - 1999.

STM Data - Data collections from Science, Technology and Medicine

(STM): AntiBase (natural compound identifier), AmicBase (antimicrobial activity

verifier), Mass Spectra (MS) of Pharmaceuticals and Agrochemicals, Infrared Spectra

(IR) of Polymers.

Spectra databases for analytical laboratories - ATR, FT IR, and Raman

Spectra of chemicals and other compounds. Includes polymers, solvents, forensics,

etc.

Structure Database of Chemicals with Pharmaceutical Activity -

Animations require the Chime plug-in to view. From Oxford University, UK.

Synthetic Pages - Free interactive database of practical and reliable

organic, organometallic and inorganic chemical syntheses submitted by synthetic

chemists.

The ChemExper Chemicals Directory - A catalog of 60,000 fine

chemicals searchable for free using a chemist-friendly search engine.

Thermochemistry Resource - Free database containing thermochemistry

data for over 400 compounds relevant to high-temperature processes used in

materials synthesis. Searchable by molecular formula.

Thermodata - Databases and software for thermochemistry.

WebReactions - Organic Reaction Retrieval System - Free reaction

search system offering direct retrieval of reaction precedents. Search based on

reaction types and bonding change.

Wiley Database of Polymer Properties - Physical property data for

commercially available polymers, with experimentally determined and selected data

for over 2,500 polymers. Derived from the Polymer Handbook.

PubChem “… organized as three linked databases …

PubChem Substance, PubChem Compound, and PubChem BioAssay.”

PubChem is a database of chemical molecules and their activities against

biological assays. The system is maintained by the National Center for Biotechnology

Information (NCBI), a component of the National Library of Medicine, which is part

of the United States National Institutes of Health (NIH). PubChem can be accessed

for free through a web user interface. Millions of compound structures and

descriptive datasets can be freely downloaded via FTP. PubChem contains substance

descriptions and small molecules with fewer than 1000 atoms and 1000 bonds. The

American Chemical Society tried to get the U.S. Congress to restrict the operation of

PubChem, claiming that it competes with the society's Chemical Abstracts Service.

More than 80 database vendors contribute to the growing PubChem database.

PubChem consists of three dynamically growing primary databases. As of 7

January 2011:

Compounds, 31 million entries, contains pure and characterized

chemical compounds.

Substances, 75 million entries, also contains mixtures, extracts,

complexes and uncharacterized substances.

BioAssay, bioactivity results from 1644 high-throughput screening

programs with several million values.

Searching the databases is possible for a broad range of properties including

chemical structure, name fragments, chemical formula, molecular weight, XLogP,

and hydrogen bond donor and acceptor count.
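
A property-based search of this kind can be pictured as a range filter over compound records. The following is a minimal local sketch in Python, not PubChem's own implementation: the record keys (`mw`, `hbdc`, `hbac`, `logp`), the helper `in_ranges`, and the two placeholder records with made-up values are all assumptions for illustration. The ranges are the ones used in this text's Rule of Five example.

```python
from typing import Dict, List, Tuple

# Ranges taken from the Rule of Five example in this text. The keys are
# assumed names for locally stored properties, not an official schema.
RULE_OF_FIVE: Dict[str, Tuple[float, float]] = {
    "mw": (0, 500),    # molecular weight
    "hbdc": (0, 5),    # hydrogen bond donor count
    "hbac": (0, 10),   # hydrogen bond acceptor count
    "logp": (-5, 5),   # XLogP
}


def in_ranges(record: Dict[str, float],
              ranges: Dict[str, Tuple[float, float]]) -> bool:
    """Return True if every listed property lies inside its inclusive range."""
    return all(low <= record[field] <= high
               for field, (low, high) in ranges.items())


# Placeholder records with invented values, for illustration only.
records: List[dict] = [
    {"name": "compound_a", "mw": 180.2, "hbdc": 1, "hbac": 4, "logp": 1.2},
    {"name": "compound_b", "mw": 812.0, "hbdc": 7, "hbac": 12, "logp": 6.3},
]

hits = [r["name"] for r in records if in_ranges(r, RULE_OF_FIVE)]
print(hits)
```

Keeping the ranges in a dictionary makes it easy to swap in other property limits without changing the filter itself.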

PubChem contains its own online molecule editor with SMILES/SMARTS

and InChI support that allows the import and export of all common chemical file

formats to search for structures and fragments.

Each hit provides information about synonyms, chemical properties,

chemical structure including SMILES and InChI strings, bioactivity, and links to

structurally related compounds and other NCBI databases like PubMed.

In the text search form the database fields can be searched by adding the

field name in square brackets to the search term. A numeric range is represented by

two numbers separated by a colon. The search terms and field names are case-

insensitive. Parentheses and the logical operators AND, OR, and NOT can be used.

AND is assumed if no operator is used.

Example (Lipinski's Rule of Five): 0:500[mw] 0:5[hbdc] 0:10[hbac] -5:5[logp]
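
The query syntax described above (a numeric range written as two numbers separated by a colon, followed by the field name in square brackets) can be assembled programmatically. The Python sketch below only builds the text of a query; it does not contact PubChem. The helper names `range_term` and `build_query` are hypothetical, and the field abbreviations are the ones from the example in this text.

```python
from typing import Iterable, Tuple


def range_term(low: int, high: int, field: str) -> str:
    """Format one numeric range term, e.g. 0:500[mw]."""
    return f"{low}:{high}[{field}]"


def build_query(terms: Iterable[Tuple[int, int, str]],
                operator: str = "") -> str:
    """Join range terms into one query string.

    With no explicit operator the terms are separated by spaces,
    which the text search form treats as an implicit AND.
    """
    parts = [range_term(low, high, field) for low, high, field in terms]
    separator = f" {operator} " if operator else " "
    return separator.join(parts)


# Rule of Five limits as (low, high, field) triples.
lipinski = build_query([
    (0, 500, "mw"),
    (0, 5, "hbdc"),
    (0, 10, "hbac"),
    (-5, 5, "logp"),
])
print(lipinski)
```

Passing `operator="OR"` or `operator="AND"` writes the logical operator out explicitly instead of relying on the implicit AND.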

The American Chemical Society has raised concerns about the publicly

supported PubChem database, since it appears to directly compete with their existing

Chemical Abstracts Service. They have a strong interest in the issue since the

Chemical Abstracts Service generates a large percentage of the society's revenue. To

advocate their position against the PubChem database, ACS has actively lobbied the

US Congress.

ZINC “… a free database of commercially-available compounds for virtual

screening. ZINC contains over 13 million purchasable compounds in ready-to-dock,

3D formats.”

The ZINC database is a curated collection of commercially available

chemical compounds prepared especially for virtual screening. ZINC is used by

investigators (generally people with training as biologists or chemists) in

pharmaceutical companies, biotech companies, and research universities.

ZINC is different from other chemical databases because it aims to represent

the biologically relevant, three dimensional form of the molecule.

ZINC is updated regularly and may be downloaded and used free of charge.

It is developed by John Irwin in the Shoichet Laboratory in the Department of

Pharmaceutical Chemistry at the University of California, San Francisco.

The latest release of the website interface is "ZINC 12" (2012). The database

contents are continuously updated. Static subsets are generated regularly and are

dated.

eMolecules “eMolecules discovers sources of chemical data by searching

the Internet, and receives submissions from data providers such as chemical suppliers

and academic research institutions.”

The typical eMolecules user is a research scientist, laboratory technician,

student, or procurement manager seeking small-molecule chemical compounds.

By area of specialization, typical users include:

Function: lab chemists, analytical chemists, medicinal chemists,

biochemists, and procurement managers.

Industry: pharmaceutical drug discovery, biotechnology, research

institutes, flavor and fragrance manufacturing, contract research organizations

(CROs), contract manufacturing organizations (CMOs), and legal (patent search).

Academic: university labs, professors, graduate students, reference

librarians.

ChEBI “Chemical Entities of Biological Interest (ChEBI) is a freely

available dictionary of molecular entities focused on ‘small’ chemical

compounds.”

Chemical Entities of Biological Interest, also known as ChEBI, is a

database and ontology of molecular entities focused on 'small' chemical compounds

that is part of the Open Biomedical Ontologies effort. The term "molecular entity"

refers to any "constitutionally or isotopically distinct atom, molecule, ion, ion pair,

radical, radical ion, complex, conformer, etc., identifiable as a separately

distinguishable entity".[3]

The molecular entities in question are either products of

nature or synthetic products used to intervene in the processes of living organisms.

Molecules directly encoded by the genome, such as nucleic acids, proteins and

peptides derived from proteins by proteolytic cleavage, are not as a rule included in

ChEBI.

ChEBI uses nomenclature, symbolism and terminology endorsed by the

International Union of Pure and Applied Chemistry (IUPAC) and Nomenclature

Committee of the International Union of Biochemistry and Molecular Biology (NC-

IUBMB).

All data in the database is non-proprietary or is derived from a non-

proprietary source. It is thus freely accessible and available to anyone. In addition,

each data item is fully traceable and explicitly referenced to the original source.

The ChEBI data is available through a public web interface, Web Service

and downloads.

NIST Chemistry WebBook “…provides thermochemical, thermophysical,

and ion energetics data compiled by NIST under the Standard Reference Data

Program.”

Compendium of Common Pesticide Names “This Compendium is

believed to be the only place where all of the ISO-approved standard names of

chemical pesticides are listed. It also includes more than 300 approved names from

national and international bodies for pesticides that do not have ISO names.”

For purposes of trade, registration and legislation, and for use in popular and

scientific publications, pesticides need names that are short, distinctive, non-

proprietary and widely-accepted. Systematic chemical names are rarely short and are

not convenient for general use, and so standards bodies assign common names to the

active ingredients of pesticides.

More than 1100 of these official common names for pesticides have been

assigned by the International Organization for Standardization (ISO). This

Compendium is believed to be the only place where all of the ISO-approved standard

names of chemical pesticides are listed.

The Compendium contains much more than ISO common names, with

nomenclature data sheets for more than 1700 different active ingredients and for

more than 350 ester and salt derivatives, made accessible by a comprehensive set of

indexes and a classification.

NMRShiftDB “… a NMR database (web database) for organic structures

and their nuclear magnetic resonance (nmr) spectra. It allows for spectrum prediction

(13C, 1H and other nuclei) as well as for searching spectra, structures and other

properties. Last not least, it features peer-reviewed submission of datasets by its

users.”

DrugBank “… a unique bioinformatics and cheminformatics resource that

combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data

with comprehensive drug target (i.e. sequence, structure, and pathway) information.

The database contains 6707 drug entries including 1436 FDA-approved small

molecule drugs, 134 FDA-approved biotech (protein/peptide) drugs, 83 nutraceuticals

and 5086 experimental drugs.”

Literature

ChEBI / Wikipedia – The Free Encyclopedia:

http://en.wikipedia.org/wiki/ChEBI

De Matos, P.; Alcantara, R.; Dekker, A.; Ennis, M.; Hastings, J.; Haug, K.;

Spiteri, I.; Turner, S. et al. (2009). "Chemical Entities of Biological Interest: An

update". Nucleic Acids Research 38 (Database issue): D249–54.

doi:10.1093/nar/gkp886. PMC 2808869. PMID 19854951.

//www.ncbi.nlm.nih.gov/pmc/articles/PMC2808869/

Degtyarenko, K.; De Matos, P.; Ennis, M.; Hastings, J.; Zbinden, M.;

McNaught, A.; Alcantara, R.; Darsow, M. et al. (2007). "ChEBI: A database and

ontology for chemical entities of biological interest". Nucleic Acids Research 36

(Database issue): D344–50. doi:10.1093/nar/gkm791. PMC 2238832.

PMID 17932057. //www.ncbi.nlm.nih.gov/pmc/articles/PMC2238832/

eMolecules: http://www.emolecules.com

Nic, M.; Jirat, J.; Kosata, B., eds. (2006–). "molecular entity". IUPAC

Compendium of Chemical Terminology (Online ed.). doi:10.1351/goldbook.M03986.

ISBN 0-9678550-9-8. http://goldbook.iupac.org/M03986.html

PubChem / Wikipedia – The Free Encyclopedia:

http://en.wikipedia.org/wiki/PubChem

Sixty-four free chemistry databases. – October 12th, 2011:

http://depth-first.com/articles/2011/10/12/sixty-four-free-chemistry-databases/

ZINC database / Wikipedia – The Free Encyclopedia:

http://en.wikipedia.org/wiki/ZINC_database