“In silico” development of new metal complexants
Alexandre Varnek Laboratoire d’Infochimie,
Université Louis Pasteur, Strasbourg, FRANCE
[Scheme: COMPLEXATION of cations M1+, M2+ (counter-ion An-) by a ligand L; SOLVENT EXTRACTION of M1+, M2+ and An- by an extractant L]
- Acquisition of Data;
- Acquisition of Knowledge;
- Exploitation of Knowledge
« In silico » design of new complexants (extractants)
« In silico » design of new compounds
Generation of combinatorial
libraries
Models « structure-activity »
Database
Expert system
Clustering
Knowledge base
Screening
Hits
EXPERIMENT
ISIDA: In SIlico Design and data Analysis
Expert system
Generation of combinatorial
libraries
Database
Clustering
Knowledge base
Supplementary tools
Informational System for Complexation (Extraction)
Expert system
Generation of combinatorial
libraries
Database Comprehensive Solvent eXtraction Database
Substructural Molecular Fragments method
Generation of focussed libraries using molecular
fragments
Design of new compounds
Acquisition of Data:
• Database development.
Comprehensive Solvent Extraction Database (SXD)
Two of six informational pages of SXD “in house”
One record per extraction equilibrium (90 fields). It contains bibliography + system description + 2D and 3D structures of extractants + thermodynamic and kinetic data (in textual, numerical and graphical forms).
Acquisition of Knowledge:
• Development of an Expert System
Quantitative Structure Activity Relationship (QSAR)
X = f (molecular structure)
[Figure: example ligand structure]
Quantitative Structure Property Relationships (QSPR)
X = distribution coefficient, extraction constant, ….
QSAR / QSPR
Hansch-type approach: Property = f (physico-chemical, structural, … descriptors)
CODESSA PRO program
Free-Wilson-type approach: Property = f (fragment descriptors)
TRAIL program
SMF method
The SMF method is based on the representation of a molecule by its fragments and on the calculation of their contributions to a given property.
V. P. Solov’ev, A. Varnek, G. Wipff, J. Chem. Inf. Comput. Sci., 2000, 40, 847-858
A. Varnek, G. Wipff, V. P. Solov’ev, Solvent Extract. Ion. Exch., 2001, 19, 791-837
A. Varnek, G. Wipff, V. P. Solov’ev, J. Chem. Inf. Comput. Sci., 2002, 42, 812-829
V. P. Solov’ev, A. Varnek, J. Chem. Inf. Comput. Sci., 2003, 43, 1703 - 1719
Fragment Descriptors:
Substructural Molecular Fragments (SMF) method
[Figure: example molecule decomposed into fragments]
I. Sequences
II. Augmented Atoms
Substructural Molecular Fragments method
Type of Fragments
I. Sequences
I(AB, 2-4): sequences of atoms and bonds containing 2 to 4 atoms
Examples: C-N=C-H, C-N=C, N=C-N, C-N, N=C, C-H
II. Augmented Atoms
II(Hy): hybridization of neighbours is taken into account
II(A): no hybridization
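The sequence fragments of type I(AB, 2-4) can be illustrated with a small path enumeration. This is a minimal sketch, not the TRAIL implementation: the function name `sequence_fragments`, the graph encoding (a list of atoms plus a bond dictionary) and the rule of keeping the alphabetically smaller of the two reading directions are all assumptions made for the example.

```python
from collections import Counter

def sequence_fragments(atoms, bonds, min_len=2, max_len=4):
    """Count atom/bond sequences (simple paths) of min_len..max_len atoms.

    atoms: list of element symbols, indexed by atom number.
    bonds: dict mapping frozenset({i, j}) to a bond symbol such as '-' or '='.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for pair in bonds:
        i, j = tuple(pair)
        neighbours[i].append(j)
        neighbours[j].append(i)

    def label(path):
        # Write the path as alternating atoms and bonds, e.g. "C-N=C-H".
        parts = [atoms[path[0]]]
        for a, b in zip(path, path[1:]):
            parts.append(bonds[frozenset((a, b))] + atoms[b])
        return "".join(parts)

    counts = Counter()

    def extend(path):
        # Every path is reached from both ends; count it once only.
        if len(path) >= min_len and path[0] < path[-1]:
            # Assumed canonical form: the smaller of the two reading directions.
            counts[min(label(path), label(path[::-1]))] += 1
        if len(path) == max_len:
            return
        for nxt in neighbours[path[-1]]:
            if nxt not in path:
                extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return counts

# Toy chain C-N=C-H (hypothetical input, not a molecule from the slides).
atoms = ["C", "N", "C", "H"]
bonds = {frozenset((0, 1)): "-", frozenset((1, 2)): "=", frozenset((2, 3)): "-"}
fragments = sequence_fragments(atoms, bonds)
```

For this four-atom chain the counter holds six fragments: three bonds, two three-atom sequences and the full four-atom sequence.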
Substructural Molecular Fragments method
Fitting Equations
$$X = a_0 + \sum_i a_i N_i + \Gamma \qquad (1)$$
$$X = a_0 + \sum_i a_i N_i + \sum_i b_i (2N_i^2 - 1) + \Gamma \qquad (2)$$
$$X = a_0 + \sum_i a_i N_i + \sum_{i,k} b_{ik} N_i N_k + \Gamma \qquad (3)$$
X: property; $a_i$, $b_i$, $b_{ik}$: fitting coefficients; $N_i$: number of fragments of type i; $\Gamma$: external descriptor(s)
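As an illustration of how fragment contributions in equation (1) can be obtained, the sketch below fits the intercept and the per-fragment coefficients by ordinary least squares on fragment counts. It is a hedged example only: the external-descriptor term is omitted, the data are synthetic, and the helper names `solve_linear`, `fit_contributions` and `predict` are invented for this sketch rather than taken from TRAIL.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_contributions(counts, values):
    """Least-squares fit of X = a0 + sum_i a_i * N_i via the normal equations.

    counts: one row of fragment counts N_i per compound.
    values: the measured property X per compound.
    Returns [a0, a1, a2, ...].
    """
    rows = [[1.0] + [float(n) for n in row] for row in counts]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * x for r, x in zip(rows, values)) for i in range(k)]
    return solve_linear(ata, atb)

def predict(coeffs, row):
    """Apply fitted contributions to the fragment counts of a new compound."""
    return coeffs[0] + sum(a * n for a, n in zip(coeffs[1:], row))

# Synthetic training data generated from a0 = 0.5, a1 = 1.0, a2 = -0.25.
counts = [[1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
values = [1.5, 0.25, 1.25, 2.25, 1.0]
coeffs = fit_contributions(counts, values)
```

Equations (2) and (3) stay linear in their coefficients, so the same fit would apply after adding columns for the quadratic or pairwise terms.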
TRAIL program (SMF method)
TRAIL procedure for the property X
1. Training stage
• generates 147 computational models involving 49 types of fragments and 3 fitting equations;
• uses all generated models to fit fragment contributions;
• applies statistical criteria to select the “best fit” models for the training set.
2. Prediction stage
• applies the best models to “predict” properties of compounds from the test and/or CombiLibrary sets.
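The count of 147 models follows from pairing each of the 49 fragment types with each of the 3 fitting equations. The sketch below shows that bookkeeping and a simple ranking step. The fragment-type labels are placeholders, and ranking by the standard deviation of the fit is an assumed stand-in for the statistical criteria, which the slides do not specify.

```python
from itertools import product

# Placeholder labels: the slides give only the number of fragment types (49).
FRAGMENT_TYPES = [f"type_{i}" for i in range(1, 50)]
EQUATIONS = (1, 2, 3)

def enumerate_models():
    """Return every (fragment type, fitting equation) pair."""
    return list(product(FRAGMENT_TYPES, EQUATIONS))

def select_best(scores, top_n=3):
    """Pick the top_n models with the lowest score.

    scores: dict mapping a model to the standard deviation of its fit
    (assumed criterion; lower is better).
    """
    return sorted(scores, key=scores.get)[:top_n]

models = enumerate_models()

# Hypothetical scores for three of the models, for illustration only.
toy_scores = {models[0]: 0.40, models[1]: 0.22, models[2]: 0.31}
best = select_best(toy_scores, top_n=2)
```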
Application of the SMF method
Test calculations
• Octanol/water partition coefficients
• Eight physical properties of C2-C9 hydrocarbons
Complexation: assessment of stability constants
• phosphoryl-containing podands + K+ in THF/CHCl3
• crown-ethers + Na+, K+ and Cs+ in MeOH
• cyclodextrins + neutral guests in water
Solvent extraction: extraction constants
• UO₂²⁺ extracted in chloroform by phosphoryl-containing ligands
Solvent extraction: distribution coefficients
• Hg, In or Pt extracted in DChE by phosphoryl-containing podands
• UO₂²⁺ extracted in DChE by mono- and tripodands
• UO₂²⁺ extracted in toluene by amides
Biological properties
• Anti-HIV activity of HEPT, TIBO and CU derivatives
CODESSA PRO (Prof. A.R. Katritzky, Univ. of Florida, USA)
The program uses about 700 physico-chemical descriptors: constitutional, topological, geometrical, electrostatic, charged partial surface area (CPSA), quantum-chemical, MO-related and thermodynamical.
Fitting and validation of structure – property models
Building of structure - property models
Selection of the best models according to statistical criteria
Splitting of an initial data set into training and test sets
[Diagram: initial data set divided into a training set and a test set; label “10 – 15 %”]
“Prediction” calculations using the best structure - property models
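The splitting step can be sketched as a random hold-out. This is only an illustration: it reads the “10 – 15 %” label as the share of compounds placed in the test set, which is an assumption, and the function name, seed and compound labels are hypothetical.

```python
import random

def split_dataset(compounds, test_fraction=0.10, seed=0):
    """Randomly split compounds into (training, test) lists.

    test_fraction is the assumed share held out for testing.
    """
    shuffled = list(compounds)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# 32 hypothetical compound labels.
compounds = [f"compound_{i}" for i in range(1, 33)]
training, test = split_dataset(compounds)
```

With 32 compounds and a 10 % hold-out this gives 29 training and 3 test compounds, the same proportions as the podand sets described later in the slides.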
Property (X) predictions using best fit models
Compound model 1 model 2 … mean ± s
Compound 1 X11 X12 … <X1> ± X1
Compound 2 X21 X22 … <X2> ± X2
…
Compound m Xm1 Xm2 … <Xm> ± Xm
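The last column of the table, a mean and spread over the selected models, can be computed as below. This is a minimal sketch with made-up numbers; it assumes that “s” is the sample standard deviation across models.

```python
from statistics import mean, stdev

def consensus(predictions):
    """Average the predictions of several models for each compound.

    predictions: dict mapping a compound name to a list of predicted
    values, one per selected model. Returns name -> (mean, s).
    """
    return {name: (mean(vals), stdev(vals)) for name, vals in predictions.items()}

# Hypothetical predictions from three models.
preds = {
    "compound 1": [2.0, 2.5, 3.0],
    "compound 2": [1.0, 1.0, 1.0],
}
table = consensus(preds)
```

A large s for a compound signals that the selected models disagree, which is a reason to treat that prediction with caution.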
Expert System
• establishes reliable quantitative structure–property relationships
• must be very fast to analyse data sets of 10^4 to 10^6 compounds
Exploitation of Knowledge:
• Generation of virtual combinatorial libraries;
• Screening and hits selection.
Generation of Virtual Combinatorial Libraries
[Figure: Markush structures with variable substituents R1, R2, R3 on a phosphoryl (P=O) core and on an N,O-containing core, shown with the compounds enumerated from a chosen set of substituents]
Program CombiLib generates virtual combinatorial libraries based on the Markush structures when selected substituents are attached to a given molecular core.
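The enumeration idea can be sketched in a few lines. This is not the CombiLib code: the string template for the core, the short substituent list and the decision to treat the three positions on phosphorus as equivalent (so that permutations of the same substituents count once) are assumptions made for the example.

```python
from itertools import combinations_with_replacement

def enumerate_library(core, substituents, n_sites=3):
    """Attach every unordered choice of substituents to the core template."""
    library = []
    for combo in combinations_with_replacement(substituents, n_sites):
        library.append(core.format(*combo))
    return library

# Hypothetical text template for a phosphoryl core with three R positions.
core = "P(O)({})({})({})"
library = enumerate_library(core, ["Me", "Bu", "Ph", "Tol"])
```

Four substituents on three equivalent positions give 20 virtual compounds, from `P(O)(Me)(Me)(Me)` to `P(O)(Tol)(Tol)(Tol)`. Using `itertools.product` instead would keep permutations as distinct entries.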
COMPLEXATION
Complexation of crown-ethers with alkali cations
[Scheme: crown ether + M+ ⇌ crown ether·M+ complex]
Properties differ from those of acyclic ligands (macrocyclic effect):
ME = (logK)crown - (logK)acyclic
Goal:
- Estimation of stability constants for acyclic analogues of crowns
- Estimation of macrocyclic effect
- QSPR modelling on structurally diverse data set
A. Varnek, G. Wipff, V. P. Solov’ev, J. Chem. Inf. Comput. Sci., 2002, 42, 812-829
Complexation of crown-ethers with alkali cations: macrocyclic effect
$$\log K = a_0 + \sum_i a_i N_i + a_{cycl} N_{cycl} \qquad (4)$$
$$\log K = a_0 + \sum_i a_i N_i + \sum_i b_i (2N_i^2 - 1) + a_{cycl} N_{cycl} \qquad (5)$$
$$\log K = a_0 + \sum_i a_i N_i + \sum_{i,k} b_{ik} N_i N_k + a_{cycl} N_{cycl} \qquad (6)$$
L + M+ in MeOH:
a_cycl = 0.7
Na+: N_cycl = 2 (15c5); 3 (18c6); 0 (other)
K+: N_cycl = 2 (15c5); 5 (18c6); 3 (21c7); 2 (24c8 - 36c12); 0 (other)
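The macrocyclic term of equations (4) to (6) is a simple lookup and product. The sketch below uses the a_cycl and N_cycl values listed above; the function name is invented, and expanding the range “24c8 - 36c12” into 24c8, 27c9, 30c10, 33c11 and 36c12 is an assumed reading of the slide.

```python
A_CYCL = 0.7  # value given for L + M+ in MeOH

# N_cycl per cation and macrocyclic scaffold; anything not listed counts as 0.
N_CYCL = {
    "Na+": {"15c5": 2, "18c6": 3},
    "K+": {
        "15c5": 2, "18c6": 5, "21c7": 3,
        # Assumed expansion of the range "24c8 - 36c12".
        "24c8": 2, "27c9": 2, "30c10": 2, "33c11": 2, "36c12": 2,
    },
}

def macrocyclic_term(cation, scaffold):
    """Return a_cycl * N_cycl, the macrocyclic contribution to logK."""
    return A_CYCL * N_CYCL[cation].get(scaffold, 0)

k_18c6 = macrocyclic_term("K+", "18c6")    # 0.7 * 5
na_18c6 = macrocyclic_term("Na+", "18c6")  # 0.7 * 3
k_other = macrocyclic_term("K+", "12c4")   # scaffold not listed, N_cycl = 0
```

On these numbers the 18c6 ring contributes about 3.5 log units to the K+ stability constant and about 2.1 to the Na+ one.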
[Figure: crown ether scaffolds of the data set, with substituents R1-R5 and ring-size indices n and m]
R1 = H, Alk; n = 1-7
R2 = H, Alk, CH2(OH)CH2NHCH3
R3 = H, Alk
R4 = CH2NH-Alk, CH2O-Alk, CH2(OCH2O)1,2-Alk
R5 = H, Alk; m = 1-3
Crown-ethers with K+ in MeOH: training stage
[Plot: logK calc (mean) vs logK exp; n = 108, R² = 0.952, F = 2103, s = 0.22]
Crown-ethers with K+ in MeOH: validation stage
[Figure: structures of the test-set crown ethers]
[Plot: logK calc (mean) vs logK exp; n = 11, R² = 0.924, F = 110.0, s = 0.33]
Acyclic polyethers with K+ in MeOH: “prediction” of logK
[Figure: structures of the acyclic polyethers]
[Plot: logK calc (mean) vs logK exp; n = 13, R² = 0.732, F = 30.1, s = 0.24]
L + M+ in MeOH: estimation of the macrocyclic effect
Ratio = (a_cycl N_cycl) / logK
[Plot: ratio of the average ME contribution to the experimental logK for different macrocyclic scaffolds (15c5, 18c6, 21c7, 24c8, 30c10) for Na+, K+ and Cs+ crown ether complexes; vertical axis from 0.0 to 1.2]
SOLVENT EXTRACTION
Extraction of UO₂²⁺ by phosphoryl-containing podands:
QSPR modeling of distribution coefficient (logD)
[Figure: generic structures of the phosphoryl-containing podands, with substituents R and linkers X, Y]
R = Ph, Tol, OEt
X = (CH2)n-O-(CH2)m, (CH2)n
Y = (CH2)n-O-(CH2)m, (CH2)n, OCH2P(O)MeCH2O
Calculations were performed for the initial data set of 32 podands as well as for two splits into training (test) sets of 29 (3) compounds.
Extraction of UO₂²⁺ by podands: QSPR modeling of logD
Fragment descriptors, TRAIL: 3 models
Pre-selected 262 « classical » descriptors, CODESSA: 0 models (!)
Mixed (16 fragment + 262 « classical ») descriptors, CODESSA: 2 models
Virtual Combinatorial Libraries of Podands
[Figure: Markush structure P(O)R1R2R3]
R1, R2, R3 =
Me, Bu, Ph, Tol,
CH2O(o-C6H4)P(O)Bu2, CH2O(o-C6H4)P(O)Ph2, CH2O(o-C6H4)P(O)Tol2,
CH2O(o-C6H4)CH2P(O)Bu2, CH2O(o-C6H4)CH2P(O)Ph2, CH2O(o-C6H4)CH2P(O)Tol2,
CH2O(o-C6H4)OCH2P(O)Bu2, CH2O(o-C6H4)OCH2P(O)Ph2, CH2O(o-C6H4)OCH2P(O)Tol2,
CH2CH2OCH2CH2(o-C6H4)P(O)Bu2, CH2CH2OCH2CH2(o-C6H4)P(O)Ph2, CH2CH2OCH2CH2(o-C6H4)P(O)Tol2,
o-C6H4OCH2P(O)Bu2, o-C6H4OCH2P(O)Ph2, o-C6H4OCH2P(O)Tol2,
CH2CH2OCH2P(O)Bu2, CH2CH2OCH2P(O)Ph2, CH2CH2OCH2P(O)Tol2
Generation of Virtual Extractants and Hits Selection
Generated focussed combinatorial library of podands: 2200 compounds
Hits selection
Screening
Blind test: are our predictions reliable?!
[Plot: logD(UO₂²⁺) vs compound number (1-8); series: EXP, TRAIL, CODESSA-PRO]
Extraction properties for 7 of 8 new compounds have been correctly predicted
Synthesis
Extraction experiments
Theoretically generated compounds
[Figure: structures of the eight new phosphoryl-containing compounds, numbered 1-8]
« In silico » design of new compounds
EXPERIMENT
Expert system
Generation of combinatorial
libraries
Models « structure-activity »
Screening
Database
ACKNOWLEDGEMENTS
Denis FOURCHES Nicolas SIEFFERT
Dr Vitaly SOLOVIEV (IPAC, Russia)
Prof. Alan Katritzky (Univ. of Florida, USA)
GDR PARIS
« We are perhaps not far removed from the time when we shall be able to submit the bulk of chemical phenomena to calculation »
Joseph Louis Gay-Lussac, Mémoires de la Société d’Arcueil 2:207 (1808)
Solvent eXtraction Database (SXD)
Tools for searching and records preparation
• Structure-Data-File Editor (2D structures + properties)
• MOL Editor (2D structures)
• Internal Text Editor
• Digitizer (converts a graph represented as an image into a data table Y = F(X))
• Searching options (textual and numerical fields)
• (Sub)structural search (internal 2D editor + searching engine)
Labo d’Infochimie
Quantitative Structure Activity Relationships (QSAR)
[Diagram with boxes: Molecular Structure, Representation, Descriptors, Feature Selection & Mapping, Activities]
Extraction of UO₂²⁺ by phosphoryl-containing podands
(logD)exp   (logD)calc
1.2         0.78
-0.20       -0.38
1.72        1.40
R = 0.973, s = 0.071
[Plot: logD calc vs logD exp, training stage and validation stage; n = 24, R = 0.956, F = 235, s = 0.18]
LogDcalc = 0.060 + 0.914 LogDexp
[Figure: podand structures]