urmila joshi

Upload: nandini-banerjee

Post on 06-Apr-2018

TRANSCRIPT

  • 8/3/2019 Urmila Joshi

    1/53

    2D QSAR

    2/53

    QSAR IN DRUG DESIGN

    To introduce order in the universe, man must pay attention to the quantitative aspects of the universe and try to find a mathematical relationship between them.

    --------- Galileo Galilei

    3/53

    History of QSAR

    Biological properties of molecules are related to their structure.

    A mathematical relationship between the structure and biological properties was proposed at the end of the 19th century.

    Hydrophobicity was the first physicochemical property which showed a quantitative relationship with narcotic activity.

    4/53

    History of QSAR

    Crum-Brown and Fraser Postulate: Φ = f(C) (physiological activity Φ is a function of chemical constitution C)

    Meyer and Overton: Practical Evidence: Toxicity as a function of Lipophilicity

    Ferguson's Principle: Depressant action related to the relative saturation in the vapour phase. First attempt to use a thermodynamic constant.

    5/53

    History of QSAR

    Use of Substituent Constants:

    1. Hammett Constant (σ)

    2. Taft Steric Constant (Es) and the Hancock correction

    3. Lipophilicity Constant (π)

    6/53

    A Variety of Physicochemical Parameters

    Lipophilicity: log P, π, Rm

    Electronic Effect: σ, F, R, dipole moments, spectral shifts, ionization constants, quantum chemical indices for electron density

    Steric Parameters:

    7/53

    Molecular Connectivity Index

    8/53

    Molecular Descriptors

    Calculated/Computed: Can be calculated using a mathematical procedure that converts chemical structure/information into a number

    Experimentally determined: Using standardized experiments to measure some molecular attributes

    9/53

    Why do we need Descriptors?

    Describe different aspects of molecules

    Compare different molecular structures

    Compare different conformations of the same molecule

    Database storage

    10/53

    Molecular Descriptors

    Structural information is given by molecular descriptors

    Molecular descriptors can be classified as 1D, 2D and 3D descriptors

    1D descriptors give information about the whole molecule, 2D about the substituent and 3D about the molecular fields

    11/53

    Types of Descriptors

    Counts of features: For example HBAs, HBDs, aromatic ring systems, substructures/fragments (e.g., carbonyl groups, basic nitrogens, carboxyl groups), etc.

    Physicochemical Properties: LogP, solubility, MW, MP, BP, heat of sublimation, molar refractivity, Hammett parameters, etc.

    12/53

    Types of Descriptors Contd.

    Topological Indices: Wiener index, branching indices, kappa shape indices, electrotopological state indices, atom-pairs, topological torsions, etc.

    BCUTs (3-D, 2-D, 2-T): Electrostatic, charge, and polarizability (hydrophobic).

    Others: Volsurf, polar surface area, etc.

    14/53

    History of QSAR

    Subsequent efforts resulted in electronic and steric parameters being correlated with biological activity.

    The first successful application of QSAR is credited to the efforts of Prof. Corwin Hansch, who developed an equation correlating the biological activity of a series of closely related molecules to a linear combination of their lipophilicity and electronic effects.

    15/53

    Models in Medicinal Chemistry

    A model is a transformation of a prototype which can be more conveniently handled.

    Need for a Model: Ethical considerations and reduction in complexity

    Types of Models

    16/53

    What is QSAR

    A QSAR is a mathematical relationship between a biological activity of a molecular system and its geometric and chemical characteristics.

    QSAR attempts to find a consistent relationship between biological activity and molecular properties, so that these rules can be used to evaluate the activity of new compounds.

    17/53

    Why QSAR?

    The number of compounds required for synthesis in order to place 10 different groups in 4 positions of a benzene ring is 10^4 = 10,000.

    Solution: synthesize a small number of compounds and from their data derive rules to predict the biological activity of other compounds.

    18/53

    Necessities of QSAR

    Good input data

    Meaningful Structural Information

    Predictive Models

    19/53

    Experimental Data Set

    Needs of Experimental Data:

    1. As numerous as possible

    2. Correct

    3. Representative

    4. Homogeneous (ideally same lab, same method)

    20/53

    Criteria of Experimental Dataset

    Compounds should belong to the same congeneric series

    The compounds should have a similar binding mode

    Binding affinity should correlate with interaction energy

    Biological activity should correlate with binding affinity

    21/53

    Garbage In, Garbage Out

    The models will only be as good as the dataset used to develop them

    22/53

    A Dataset Will Look Like This!

    No.   Comp.   Activity
    1     A-1     1.23
    2     A-2     1.87
    3     A-3     2.65
    4     A-4     2.08
    5     A-5     1.95
    6     A-6     2.43
    7     A-7     2.28

    23/53

    A Dataset Along With Descriptors

    X       log(1/EC50)   MR      π       σ       Es
    H       4.93          1.03    0.00    0.00    0.00
    Cl      5.91          6.03    0.71    0.23    -0.97
    NO2     5.34          7.36    -0.28   0.78    -2.52
    CN      4.58          6.33    -0.57   0.66    -0.51
    C6H5    6.62          25.36   1.96    -0.01   -3.82
    NMe2    5.36          15.55   0.18    -0.83   -2.90
    I       6.46          13.94   1.12    0.18    -1.40
    NHCHO   ?             10.31   -0.98   0.00    -0.98

    24/53

    Too Many Descriptors!!

    Reduce to manageable size by

    1. Principal Component Analysis

    2. Cluster Analysis

    3. Choice of important Descriptors: Remove descriptors which show a similar value for 90% of the compounds

    4. Investigator's Intervention: Computers do mathematics, they do not understand biology
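Step 3 of the list above can be sketched in a few lines. This is only an illustration: it reads "similar value" as "identical value", and the function name, the 0.9 threshold argument and the ten-compound example are assumptions, not part of the slides.

```python
import numpy as np

def drop_near_constant(X, names, max_share=0.9):
    """Drop descriptor columns whose most common value covers >= max_share of the compounds."""
    keep = []
    for j in range(X.shape[1]):
        # Count how often each distinct value occurs in this descriptor column.
        _, counts = np.unique(X[:, j], return_counts=True)
        if counts.max() / X.shape[0] < max_share:
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]

# Hypothetical data: "n_rings" equals 1 for 9 of the 10 compounds, "logP" varies.
X = np.array([[1, 0.5], [1, 1.2], [1, 0.9], [1, 2.1], [1, 1.7],
              [1, 0.3], [1, 2.8], [1, 1.1], [1, 0.6], [2, 1.9]])
X_kept, kept_names = drop_near_constant(X, ["n_rings", "logP"])
print(kept_names)
```

For continuous descriptors a variance or range threshold would probably be a better reading of "similar value" than exact equality.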

    25/53

    Number of Descriptors

    If MLR is used as a method, the data set should contain at least 5 times as many compounds as the number of descriptors in the QSAR; and 60% of the no. of compounds if PLS is used as a method.

    Too few compounds relative to the number of descriptors will give a high but meaningless (chance) correlation.
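The 5:1 rule for MLR is easy to check before a model is built. The helper names below are hypothetical, and only the MLR part of the rule is covered.

```python
def max_descriptors_for_mlr(n_compounds, ratio=5):
    """Largest number of descriptors allowed by the 'ratio compounds per descriptor' rule."""
    return n_compounds // ratio

def mlr_ratio_ok(n_compounds, n_descriptors, ratio=5):
    """True if the data set has at least ratio * n_descriptors compounds."""
    return n_compounds >= ratio * n_descriptors

# A 30-compound series supports at most 6 descriptors; 7 compounds support only 1.
print(max_descriptors_for_mlr(30), max_descriptors_for_mlr(7))
print(mlr_ratio_ok(30, 2), mlr_ratio_ok(7, 4))
```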

    26/53

    Biological Data For QSAR

    Experimentally generated data

    Reported Data

    27/53

    Hansch Model

    Steps Involved

    1. Selection of Lead Compound

    2. Selection of Substituents

    3. Synthesis and Biological Evaluation

    4. Determination of Descriptors

    5. Generation of Regression Equation

    28/53

    Selection of Substituents

    Batchwise Selection

    Stepwise Selection

    29/53

    Indicator Variable

    Indicated by I

    Arbitrarily assigned only two values, 1 and 0

    Should always be used along with other descriptors, and those descriptors should be significant in the regression equation even in the absence of the indicator variable.

    30/53

    Mathematical Relationship

    QSAR attempts to find a mathematical relationship between descriptors and the biological activity in the form of an equation

    This is traditionally done using regression analysis

    Descriptors are considered to be independent parameters and the biological activity as the dependent parameter

    31/53

    Statistical Techniques

    Two major statistical techniques are used for

    this purpose

    1. MLR : Multiple Linear Regression

    2. PLS : Partial Least Squares
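A minimal MLR sketch using ordinary least squares from NumPy. The function name `fit_mlr` is an assumption. The numbers are the seven complete rows of the descriptor table on slide 23, reading its third numeric column as π. Only one descriptor is fitted, because seven compounds support no more than that under the 5:1 rule; the same function accepts several descriptor columns.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([X, np.ones(len(y))])   # last column models the intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[:-1], beta[-1]

# Substituents H, Cl, NO2, CN, C6H5, NMe2, I (assumed column reading: pi, log(1/EC50)).
pi = np.array([0.00, 0.71, -0.28, -0.57, 1.96, 0.18, 1.12])
activity = np.array([4.93, 5.91, 5.34, 4.58, 6.62, 5.36, 6.46])

coef, intercept = fit_mlr(pi.reshape(-1, 1), activity)
predicted = pi * coef[0] + intercept
print(f"log(1/EC50) = {coef[0]:.3f} pi + {intercept:.3f}")
```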

    32/53

    Newer Techniques

    Artificial Neural Networks

    Genetic Algorithm

    Advantages : Can detect nonlinear

    relationship between the descriptors and

    biological activity

    33/53

    The Mathematical Result

    The result is an equation with associated statistical parameters

    The Statistical Parameters are:

    1. n (number of compounds)

    2. r (correlation coefficient)

    3. s (standard deviation of the regression)

    4. F (Fisher F-statistic)
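The four parameters can be computed from observed and calculated activities with the usual least-squares definitions. The sketch below assumes a model with an intercept and k descriptors; the function name and the four-point example are made up for illustration.

```python
import math

def regression_stats(y_obs, y_calc, n_descriptors):
    """Return n, r, s and F for a least-squares fit with an intercept."""
    n = len(y_obs)
    k = n_descriptors
    mean_y = sum(y_obs) / n
    ss_res = sum((o - c) ** 2 for o, c in zip(y_obs, y_calc))   # unexplained variation
    ss_tot = sum((o - mean_y) ** 2 for o in y_obs)              # total variation
    r2 = 1.0 - ss_res / ss_tot
    dof = n - k - 1                                             # residual degrees of freedom
    return {
        "n": n,
        "r": math.sqrt(max(r2, 0.0)),
        "s": math.sqrt(ss_res / dof),
        "F": (r2 / k) / ((1.0 - r2) / dof),
    }

stats = regression_stats([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], n_descriptors=1)
print(stats)
```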

    34/53

    Putting it all together

    For a group of antihistamines,

    Log (1/C) = 0.440 Es - 2.204
    (n=30, s=0.307, r=0.886)

    Log (1/C) = 2.814 σ - 0.223
    (n=30, s=0.519, r=0.629)

    Log (1/C) = 0.492 Es - 0.585 σ - 2.445
    (n=30, s=0.301, r=0.889)
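One way to read the three antihistamine equations is to compare explained variance (r squared) and to ask whether the second descriptor earns its place. The sketch uses only the n and r values quoted above. The sequential (partial) F-test is a standard regression tool, but applying it here is an addition to the slides, and the function name is hypothetical.

```python
def partial_f(r_reduced, r_full, n, k_full, k_added=1):
    """Sequential F for adding k_added descriptors to reach a model with k_full descriptors."""
    r2_red, r2_full = r_reduced ** 2, r_full ** 2
    return ((r2_full - r2_red) / k_added) / ((1.0 - r2_full) / (n - k_full - 1))

for label, r in [("Es only", 0.886), ("electronic term only", 0.629), ("both terms", 0.889)]:
    print(f"{label:21s} r^2 = {r ** 2:.3f}")

# Does adding the electronic term to the Es-only equation help? (n = 30, two descriptors)
f_added = partial_f(r_reduced=0.886, r_full=0.889, n=30, k_full=2)
print(f"partial F = {f_added:.2f}")
```

The partial F comes out below 1, which suggests that for this series the steric term carries almost all of the correlation and the two-descriptor equation is hardly better than the Es-only one.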

    35/53

    Validation of QSAR Equations

    Statistical Parameters and their significance

    Scrambling the Y-values (Biological activity)

    Leave One Out and Leave Many Out Method

    Test set-Training set Method
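Two of the checks listed above, leave-one-out cross-validation and Y-scrambling, can be sketched with NumPy. The cross-validated q squared is taken here as 1 - PRESS/SS_total, one common definition; the function names and the toy data are assumptions.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out cross-validated q^2 for a linear model with an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([X, np.ones(n)])
    press = 0.0
    for i in range(n):
        train = np.arange(n) != i                       # leave compound i out
        beta, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        press += (y[i] - A[i] @ beta) ** 2              # squared error on the left-out compound
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def scrambled_q2(X, y, n_trials=20, seed=0):
    """q^2 values obtained after randomly permuting the activities (Y-scrambling)."""
    rng = np.random.default_rng(seed)
    return [loo_q2(X, rng.permutation(y)) for _ in range(n_trials)]

# Toy data with a perfect linear relationship (hypothetical).
x = np.arange(6, dtype=float).reshape(-1, 1)
y = 2.0 * x[:, 0] + 1.0
q2 = loo_q2(x, y)
q2_scrambled = scrambled_q2(x, y)
print(q2, float(np.mean(q2_scrambled)))
```

A model worth keeping should show a q squared well above the values obtained with scrambled activities.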

    36/53

    Interpretation of Equations and Use

    Prediction of activity of unknown compounds

    Improving the series of compounds with reference to the biological activity by synthesis of new and active compounds

    Restricting the number of compounds synthesized for maximising the activity

    37/53

    Limitations of QSAR

    False correlations due to noisy data

    False positives are a bigger problem than false negatives

    Statistical Gimmick

    Metabolism of the compounds is not taken into consideration

    Alternate binding modes may affect the results in a significant way.

    38/53

    3D QSAR: CoMFA

    Cramer and Milne (1979): Comparison of molecules by alignment and field generation

    Wold (1986): Proposes using PLS instead of PCA for the overrepresented problem (1000s of non-orthogonal field variables)

    Cramer, Patterson and Bunce (1988): Introduced CoMFA

    39/53

    Free Energy of Binding and Equilibrium Constants

    The free energy of binding is related to the reaction constants of ligand-receptor complex formation:

    ΔGbinding = -2.303 RT log K = -2.303 RT log (kon / koff)

    Equilibrium constant K

    Rate constants kon (association) and koff (dissociation)
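A short numerical sketch of this relation, taking K as the association constant kon/koff so that tight binding gives a negative free energy. The rate constants in the example are hypothetical.

```python
import math

R = 8.314462618e-3   # gas constant in kJ/(mol*K)

def binding_free_energy(k_on, k_off, temperature=298.15):
    """Free energy of binding in kJ/mol from K = k_on / k_off (association constant)."""
    K = k_on / k_off
    return -2.303 * R * temperature * math.log10(K)

# Hypothetical ligand: k_on = 1e6 per molar per second, k_off = 1e-3 per second, so K = 1e9 per molar.
dg = binding_free_energy(k_on=1.0e6, k_off=1.0e-3)
print(f"{dg:.1f} kJ/mol")
```

Each tenfold gain in K at room temperature is worth roughly 5.7 kJ/mol.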

    40/53

    Free Energy of Binding

    ΔGbinding = ΔG0 + ΔGhb + ΔGionic + ΔGlipo + ΔGrot

    ΔG0       entropy loss (translat. + rotat.)   +5.4
    ΔGhb      ideal hydrogen bond                 -4.7
    ΔGionic   ideal ionic interaction             -8.3
    ΔGlipo    lipophilic contact                  -0.17
    ΔGrot     entropy loss (rotat. bonds)         +1.4

    (Energies in kJ/mol per unit feature)
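The additive scheme can be turned into a rough estimate by counting features. The sketch assumes that the favourable terms (hydrogen bond, ionic interaction, lipophilic contact) are negative and the entropy penalties positive, and that the lipophilic term is given per square ångström of contact surface. The ligand in the example is invented.

```python
# Contributions in kJ/mol per unit feature (signs as assumed in the lead-in).
DG0, DG_HB, DG_IONIC, DG_LIPO, DG_ROT = 5.4, -4.7, -8.3, -0.17, 1.4

def estimate_binding_energy(n_hbonds, n_ionic, lipo_contact, n_rot_bonds):
    """Sum the per-feature contributions into an estimated free energy of binding."""
    return (DG0
            + DG_HB * n_hbonds
            + DG_IONIC * n_ionic
            + DG_LIPO * lipo_contact
            + DG_ROT * n_rot_bonds)

# Hypothetical ligand: 2 ideal H-bonds, 1 ionic interaction, 100 units of lipophilic contact,
# 3 rotatable bonds frozen on binding.
dg = estimate_binding_energy(n_hbonds=2, n_ionic=1, lipo_contact=100.0, n_rot_bonds=3)
print(f"{dg:.1f} kJ/mol")
```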

    41/53

    CoMFA

    Set of chemically related compounds

    3D structures needed

    Bioactive conformations of the active

    compounds are to be aligned

    42/53

    CoMFA Alignment

    [Figure: molecules aligned on a common "Pharmacophore" defined by three points L separated by distances d1, d2 and d3]

    43/53

    CoMFA Grid and Field Probe

    44/53

    Molecular Fields in CoMFA

    CoMFA standard: steric and electrostatic; additional: H-bonding, indicator, parabolic and others.

    A grid with energy fields is calculated by placing a probe atom at each voxel.

    The molecular fields are:

    Steric (Lennard-Jones) interactions

    Electrostatic (Coulombic) interactions

    A probe is an sp3 carbon atom with a charge of +1.0
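The field calculation can be sketched as follows: place the +1.0 probe at every grid point and sum a Coulomb term and a Lennard-Jones 12-6 term over the atoms of the molecule. The Lennard-Jones parameters (`eps`, `r_min`), the 30 kcal/mol energy cutoff, the 2 Å grid spacing and the two-atom "molecule" are illustrative assumptions, not values from the slides.

```python
import numpy as np

COULOMB_K = 332.06   # converts e^2/angstrom to kcal/mol

def comfa_fields(coords, charges, grid, probe_charge=1.0, eps=0.1, r_min=3.4, cutoff=30.0):
    """Return (steric, electrostatic) probe energies at each grid point, in kcal/mol."""
    # Distance from every grid point to every atom: shape (n_grid, n_atoms).
    d = np.linalg.norm(grid[:, None, :] - coords[None, :, :], axis=2)
    d = np.maximum(d, 1e-6)                              # avoid division by zero on an atom
    electrostatic = (COULOMB_K * probe_charge * charges[None, :] / d).sum(axis=1)
    ratio6 = (r_min / d) ** 6
    steric = (eps * (ratio6 ** 2 - 2.0 * ratio6)).sum(axis=1)   # Lennard-Jones 12-6
    # Truncate very large energies close to the atoms.
    return np.minimum(steric, cutoff), np.clip(electrostatic, -cutoff, cutoff)

# Hypothetical two-atom molecule inside a 5 x 5 x 5 grid with 2 angstrom spacing.
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
charges = np.array([0.4, -0.4])
axis = np.linspace(-4.0, 4.0, 5)
grid = np.array(np.meshgrid(axis, axis, axis, indexing="ij")).reshape(3, -1).T

steric, electrostatic = comfa_fields(coords, charges, grid)
print(steric.shape, electrostatic.shape)
```

Each grid point then contributes one steric and one electrostatic value per molecule, which become the columns of the descriptor table.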

    45/53

    Common 3D molecular fields

    MEP: Molecular Electrostatic Potential (unit positive charge probe).

    MLP: Molecular Lipophilicity Potential (no probe necessary).

    GRID: total energy of interaction, the sum of steric (Lennard-Jones), H-bonding and electrostatics (any probe can be used).

    47/53

    Electrostatic Potential Contour Lines

    48/53

    3D Contour Map for Electronegativity

    49/53

    CoMFA Pros and Cons

    Suitable to describe receptor-ligand interactions

    3D visualization of important features

    Good correlation within related set

    Predictive power within scanned space

    Alignment is often difficult

    Training required

    50/53

    3D-QSAR: CoMFA

    1st need to structurally align

    2-D Alignment Methods

    Maximum common substructure based methods

    Feature-Based

    Vector Methods

    Discrete feature values (Bit Strings)

    Continuous feature values

    3-D Alignment Methods

    Field-based methods

    Structure-based methods (generalized RMSD approaches)

    51/53

    Subsequent Steps

    Calculate property fields for each molecule at every grid point (training set)

    Property value at each grid point is equivalent to a descriptor value in 2-D QSAR

    Grid points with low variance may be neglected; nevertheless this may result in hundreds of grid points

    Many more descriptor values than experimental data points, thus traditional least-squares approach cannot be used

    Perform partial least squares (PLS) analysis

    Validate model (test set)

    Predict activities of new molecules
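The PLS step can be sketched with a small NIPALS-style PLS1 routine written in plain NumPy. The function names, the random "field" matrix (8 molecules, 200 grid points) and the choice of two latent components are assumptions made for illustration. A real CoMFA study would choose the number of components by cross-validation.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 regression (one response) via NIPALS; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                      # weight vector: covariance of each column with y
        w /= np.linalg.norm(w)
        t = Xk @ w                         # latent scores
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        c = (yk @ t) / tt                  # y loading
        Xk = Xk - np.outer(t, p)           # deflate X and y before the next component
        yk = yk - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def pls1_predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

# Hypothetical training set: 8 molecules, 200 grid-point field values each.
rng = np.random.default_rng(0)
fields = rng.normal(size=(8, 200))
activity = fields[:, :5].sum(axis=1) + rng.normal(scale=0.1, size=8)

fields = fields[:, fields.var(axis=0) > 1e-8]      # neglect (near-)constant grid points
coef, x_mean, y_mean = pls1_fit(fields, activity, n_components=2)
fitted = pls1_predict(fields, coef, x_mean, y_mean)
ss_res = float(np.sum((activity - fitted) ** 2))
ss_tot = float(np.sum((activity - activity.mean()) ** 2))
print(coef.shape, ss_res < ss_tot)
```

Unlike ordinary least squares, this works with far more columns than rows because the regression is done on a few latent score vectors rather than on the grid points themselves.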

    52/53

    CoMFA Contour Plots

    Inactive Molecule Active Molecule

    53/53

    The Hansch equation

    log (1/C) = -k1 (log P)^2 + k2 log P + k3 σ + k4

    Where k1, k2, k3 and k4 are constants

    Log P is the partition coefficient

    σ is the substituent constant describing the electronic effect of the substituent
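A small sketch of how such an equation is used, assuming the usual parabolic Hansch form log(1/C) = -k1 (log P)^2 + k2 log P + k3 σ + k4. Setting the derivative with respect to log P to zero gives the optimum lipophilicity log P0 = k2 / (2 k1). The constants in the example are invented.

```python
def hansch_log_inv_c(log_p, sigma, k1, k2, k3, k4):
    """Predicted log(1/C) from the parabolic Hansch equation."""
    return -k1 * log_p ** 2 + k2 * log_p + k3 * sigma + k4

def optimum_log_p(k1, k2):
    """log P at which the parabolic lipophilicity term is maximal."""
    return k2 / (2.0 * k1)

# Hypothetical constants, for illustration only.
k1, k2, k3, k4 = 0.5, 2.0, 1.0, 3.0
log_p0 = optimum_log_p(k1, k2)
best = hansch_log_inv_c(log_p0, sigma=0.2, k1=k1, k2=k2, k3=k3, k4=k4)
print(log_p0, best)
```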