InstantJChem: a flexible chemical database system

G. Marcou, D. Horvath
Laboratoire d’infochimie, Université de Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg


Post on 08-Jan-2016



Introduction

The goal is to present InstantJChem for the storage and manipulation of chemical information.

1. General presentation
2. Database search
3. Creation of a database from scratch

What is a database?

• A database stores data in an ordered form on a precise subject.
• A relational database stores information in tables that reference one another.
• A relational database management system (RDBMS) is software that manages relational databases.

InstantJChem is neither a database nor an RDBMS.
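To make "tables that reference one another" concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in RDBMS. The table and column names (compound, measurement, compound_id) are hypothetical and the boiling points are illustrative literature values; none of this comes from the slides.

```python
import sqlite3

# In-memory SQLite database standing in for the RDBMS (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Two tables linked by a reference: each measurement points to one compound.
conn.execute("CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE measurement ("
    " id INTEGER PRIMARY KEY,"
    " compound_id INTEGER NOT NULL REFERENCES compound(id),"
    " property TEXT NOT NULL,"
    " value REAL)"
)

conn.execute("INSERT INTO compound (id, name) VALUES (1, 'benzene'), (2, 'pyridine')")
conn.execute(
    "INSERT INTO measurement (compound_id, property, value) VALUES"
    " (1, 'boiling_point_C', 80.1), (2, 'boiling_point_C', 115.2)"
)

# The RDBMS follows the reference to join the two tables.
rows = conn.execute(
    "SELECT c.name, m.property, m.value"
    " FROM measurement AS m JOIN compound AS c ON c.id = m.compound_id"
    " ORDER BY c.id"
).fetchall()
print(rows)
```

The join is the part the RDBMS does for you: the measurement table stores only a compound id, and the name is looked up through the reference.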

What is InstantJChem?

InstantJChem is a user-friendly interface between an RDBMS, chemical information and the user.

[Diagram: User, RDBMS, Chemical Information]

Key concepts of InstantJChem

• Projects
• Schema
• Databases and Tables
• Entities
• Data Trees
• Views

Exercise 1

Create a new project named IJCExercises…

Key concept: Project

A project contains resources and connections to one or more databases.

Exercise 1

… and import the file SC100.SDF into it…

Key concept: Schema

A schema (database) contains the connection to a database and special tables (JChemProperties).

Key concept: Database and Tables

Databases and tables are managed by the RDBMS. Tables are where the information is actually stored.

What can be stored

Standard table
  Type             Description
  Integer          Long integer: 2^32 = 4294967296
  Text             The user can set text field widths as large as needed.
  Real             Double-precision real number
  Date             Stores dates.
  Boolean          Value is True or False.
  List (Standard)  Stores a list of database items.

JChem table
  Type             Description
  Chemical terms   A list of functions evaluated on chemical structures: logD, pKa, tautomers, ...
  Structure        Chemical structure, created automatically with a JChem table

Key concept: Entities

An entity is a representation of data. It provides a single interface to conceptually different types of tables (Standard, Chemical, SQL, Extractions, etc.).

Key concept: Data Trees

A data tree is a collection of entities and views. It organizes information as a hierarchy (parent-child relationships between entities).
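The parent-child idea behind a data tree can be sketched in a few lines of Python. This is only a conceptual model, not InstantJChem's API: the classes Entity and DataTree, the function walk, and the entity and view names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Entity:
    """Hypothetical stand-in for an entity: a named representation of data."""
    name: str
    children: List["Entity"] = field(default_factory=list)


@dataclass
class DataTree:
    """Hypothetical stand-in for a data tree: a root entity plus named views."""
    root: Entity
    views: List[str] = field(default_factory=list)


def walk(entity: Entity, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, name) pairs, parents before their children."""
    yield depth, entity.name
    for child in entity.children:
        yield from walk(child, depth + 1)


# A parent entity with two child entities, and two views attached to the tree.
structures = Entity("Structures", children=[Entity("Assay results"), Entity("Suppliers")])
tree = DataTree(root=structures, views=["Grid view", "Form view"])

hierarchy = list(walk(tree.root))
for depth, name in hierarchy:
    print("  " * depth + name)
```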

Exercise 1

… Customize a browser for it.

Key concept: Views

A view is an interface to data. For simple data, a spreadsheet view is appropriate. For complex relational data, a form is required.

Exercise 2

In the SC100 database, search for fluorobenzene-containing and pyridine-containing molecules. Use Substructure or Similarity search.

Results for the two queries:

• Substructure search: 20 hits; Similarity search: 0 hits
• Substructure search: 14 hits; Similarity search: 0 hits

Similarity search uses the Chemical Hashed Fingerprints defined at database creation.
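One plausible reading of the 0 similarity hits is that a small query shares few fingerprint bits, relative to the total, with a much larger molecule. The sketch below assumes the similarity score is a Tanimoto coefficient over fingerprint bits, which is a common choice; the slides do not state the metric or threshold, and the bit positions are invented.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bit positions."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Invented bit positions: a small query and a larger molecule that contains it.
query_bits = {3, 17, 42, 58}
target_bits = query_bits | set(range(100, 120))  # all query bits plus 20 others

# Substructure screening only needs the query bits to be present in the target.
passes_screen = query_bits <= target_bits

# Similarity compares the whole fingerprints, so the extra target bits lower the score.
score = tanimoto(query_bits, target_bits)
print(passes_screen, round(score, 3))
```

Here the target passes the substructure screen, yet its similarity to the query is only 4/24, which would fall below any reasonably strict similarity threshold.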

Chemical Hashed Fingerprints (CHF)

• Pattern Length: number of bonds in a pattern
• Fingerprint Length: total number of bits used to store the fingerprint
• Bits per pattern: number of bits each pattern switches on

An efficient annotation that accelerates structure searches.

www.chemaxon.com
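The three parameters can be illustrated with a toy hashed fingerprint in pure Python. This is a simplified sketch, not ChemAxon's algorithm: it only enumerates linear atom paths, ignores bond orders and rings, and uses SHA-256 as the hash. The function names and the parameter values (3 bonds, 64 bits, 2 bits per pattern) are assumptions chosen for illustration.

```python
import hashlib


def linear_paths(atoms, bonds, max_bonds):
    """Enumerate linear atom paths with at most max_bonds bonds, as label strings."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    patterns = set()

    def extend(path):
        forward = "-".join(atoms[i] for i in path)
        backward = "-".join(atoms[i] for i in reversed(path))
        # A path and its reverse are the same pattern: keep one canonical form.
        patterns.add(min(forward, backward))
        if len(path) - 1 < max_bonds:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return patterns


def hashed_fingerprint(atoms, bonds, pattern_length=3, fingerprint_length=64, bits_per_pattern=2):
    """Return the set of bit positions switched on by the molecule's patterns."""
    bits = set()
    for pattern in linear_paths(atoms, bonds, pattern_length):
        for k in range(bits_per_pattern):
            digest = hashlib.sha256(f"{pattern}|{k}".encode()).digest()
            bits.add(int.from_bytes(digest[:4], "big") % fingerprint_length)
    return bits


# Heavy-atom graphs: a C-O query fragment and ethanol (C-C-O) as the target.
query_fp = hashed_fingerprint(["C", "O"], [(0, 1)])
target_fp = hashed_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])

# Every pattern of the query also occurs in the target, so its bits are a subset.
print(query_fp <= target_fp)
```

This subset property is what makes the fingerprint useful as a pre-filter: a molecule whose fingerprint lacks any query bit cannot contain the query, so the slower atom-by-atom match can be skipped for it.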

Exercise 3

Combine molecules 25 and 89 into a pseudo-molecule to perform a superstructure query.

Exercise 4

Use compound 46 as a Full query and as a Full fragment query to search the database. Repeat after removing the bromide from the query.

Structure Searches

[Figure; source: www.chemaxon.com]

Exercise 5

Search for benzene-containing compounds whose name contains “pyrimidin” and which are annotated as “Good” for aqueous solubility.

Exercise 6

Search for compounds with at least one aromatic ring containing at least one nitrogen atom.

Exercise 7

Search for compounds with MolWeight > 200 that do not contain a benzene ring.

Exercise 8

Search for compounds with MolWeight > 200, then for compounds without a benzene ring, and take the union of the two hit lists.
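Exercises 7 and 8 differ only in how the two criteria are combined: Exercise 7 asks for compounds meeting both (an intersection), Exercise 8 for compounds meeting either (a union). A minimal sketch with Python sets, using invented toy records rather than SC100 data:

```python
# Invented toy records keyed by compound id (not taken from SC100).
compounds = {
    1: {"mol_weight": 250.3, "has_benzene": True},
    2: {"mol_weight": 180.2, "has_benzene": False},
    3: {"mol_weight": 310.4, "has_benzene": False},
    4: {"mol_weight": 150.1, "has_benzene": True},
}

# Two separate hit lists, one per criterion.
heavy = {cid for cid, c in compounds.items() if c["mol_weight"] > 200}
no_benzene = {cid for cid, c in compounds.items() if not c["has_benzene"]}

# Exercise 7: both conditions in one query (logical AND, set intersection).
exercise_7 = heavy & no_benzene

# Exercise 8: union of the two hit lists (logical OR).
exercise_8 = heavy | no_benzene

print(sorted(exercise_7), sorted(exercise_8))
```

The union is always at least as large as the intersection, so Exercise 8 should never return fewer hits than Exercise 7.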

Exercise 9

Search for compounds possessing more than 4 microspecies at pH = 4.0, then export your hit list.

Exercise 10

Import the file ISICCRsm.RDF into your project, then create a browser for this database.

Exercise 11

Search for reactions that include an imidazole ring in their reactants, then in their products.

Exercise 12

Add to your schema a new data tree and structure entity named AlkanBoilingPoint, and add a floating-point field named BoilingPoint.

Exercise 13

Add the following data to the AlkanBoilingPoint entity.

Exercise 14

Add to the AlkanBoilingPoint entity a new date field named Date and fill it in.
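In InstantJChem, Exercises 12 to 14 are done through the interface, and the slides do not show the SQL issued underneath. As a rough analogy only, the same steps against a plain SQLite table might look like the sketch below. The structure column holding SMILES strings is an assumption, the boiling points are standard literature values standing in for the exercise data (which is not reproduced here), and the date is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Exercise 12 analogue: a table for structures with a floating-point BoilingPoint field.
conn.execute(
    "CREATE TABLE AlkanBoilingPoint ("
    " id INTEGER PRIMARY KEY,"
    " structure TEXT NOT NULL,"  # assumed: structures stored as SMILES strings
    " BoilingPoint REAL)"
)

# Exercise 13 analogue: illustrative values in degrees Celsius, not the slide's data.
alkanes = [("C", -161.5), ("CC", -88.6), ("CCC", -42.1)]
conn.executemany(
    "INSERT INTO AlkanBoilingPoint (structure, BoilingPoint) VALUES (?, ?)", alkanes
)

# Exercise 14 analogue: add a Date field, then fill it (SQLite stores dates as ISO text).
conn.execute('ALTER TABLE AlkanBoilingPoint ADD COLUMN "Date" TEXT')
conn.execute('UPDATE AlkanBoilingPoint SET "Date" = ?', ("2024-01-15",))  # placeholder date

columns = [row[1] for row in conn.execute("PRAGMA table_info(AlkanBoilingPoint)")]
rows = conn.execute(
    'SELECT structure, BoilingPoint, "Date" FROM AlkanBoilingPoint ORDER BY id'
).fetchall()
print(columns)
print(rows)
```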

Exercise 15

Add to the AlkanBoilingPoint entity a calculated LogP value using a Chemical Terms field.

Summary

• Create a project and schema
• Import data
• Search by substructure, superstructure, similarity, and exact match
• Search by keyword
• Combine queries and result lists
• Export query results
• Create a new database

Conclusion

• InstantJChem is a chemoinformatics layer on top of a standard DBMS.
• It provides many more chemoinformatics services (database overlap, QSPR modeling, plots, enumeration, scripting).

[Diagram: DBMS, InstantJChem]