similarity methods c371 fall 2004. limitations of substructure searching/3d pharmacophore searching...
TRANSCRIPT
![Page 1: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/1.jpg)
Similarity Methods
C371
Fall 2004
![Page 2: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/2.jpg)
Limitations of Substructure Searching/3D Pharmacophore Searching
• Need to know what you are looking for
• Compound is either there or not– Don’t get a feel for the relative ranking of the
compounds
• Output size can be a problem
![Page 3: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/3.jpg)
Similarity Searching
• Look for compounds that are most similar to the query compound
• Each compound in the database is ranked
• In other application areas, the technique is known as pattern matching or signature analysis
![Page 4: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/4.jpg)
Similar Property Principle
• Structurally similar molecules usually have similar properties, e.g., biological activity
• Known also as “neighborhood behavior”
• Examples: morphine, codeine, heroin
• Define: in silico– Using computational techniques as a
substitute for or complement to experimental methods
![Page 5: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/5.jpg)
Advantages of Similarity Searching
• One known active compound becomes the search key
• User sets the limits on output
• Possible to re-cycle the top answers to find other possibilities
• Subjective determination of the degree of similarity
![Page 6: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/6.jpg)
Applications of Similarity Searching
• Evaluation of the uniqueness of proposed or newly synthesized compounds
• Finding starting materials or intermediates in synthesis design
• Handling of chemical reactions and mixtures
• Finding the right chemicals for one’s needs, even if not sure what is needed.
![Page 7: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/7.jpg)
Subjective Nature of Similarity Searching
• No hard and fast rules
• Numerical descriptors are used to compare molecules
• A similarity coefficient is defined to quantify the degree of similarity
• Similarity and dissimilarity rankings can be different in principle
![Page 8: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/8.jpg)
Similarity and Dissimilarity
“Consider two objects A and B, a is the number of features (characteristics) present in A and absent in B, b is the number of features absent in A and present in B, c is the number of features common to both objects, and d is the number of features absent from both objects. Thus, c and d measure the present and the absent matches, respectively, i.e., similarity; while a and b measure the corresponding mismatches, i.e., dissimilarity.” (Chemoinformatics; A Textbook (2003), p. 304)
![Page 9: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/9.jpg)
2D Similarity Measures
• Commonly based on “fingerprints,” binary vectors with 1 indicating the presence of the fragment and 0 the absence
• Could relate structural keys, hashed fingerprints, or continuous data (e.g., topological indexes that take into acount size, degree of branching, and overall shape)
![Page 10: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/10.jpg)
Tanimoto Coefficient
• Tanimoto Coefficient of similarity for Molecules A and B:
SAB = c _
a + b – ca = bits set to 1 in A, b = bits set to 1 in B, c =
number of 1 bits common to both
Range is 0 to 1.
Value of 1 does not mean the molecules are identical.
![Page 11: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/11.jpg)
Similarity Coefficients
• Tanimoto coefficient is most widely used for binary fingerprints
• Others:– Dice coefficient– Cosine similarity– Euclidean distance– Hamming distance– Soergel distance
![Page 12: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/12.jpg)
Distance Between Pairs of Molecules
• Used to define dissimilarity of molecules
• Regards a common absence of a feature as evidence of similarity
![Page 13: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/13.jpg)
When is a distance coefficient a metric?
• Distance values must be zero or positive– Distance from an object to itself must be zero
• Distance values must be symmetric• Distance values must obey the triangle
inequality: DAB ≤ DAC + DBC
• Distance between non-identical objects must be greater than zero.
• Dissimilarity = distance in the n-dimensional descriptor space
![Page 14: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/14.jpg)
Size Dependency of the Measures
• Small molecules often have lower similarity values using Tanimoto
• Tanimoto normalizes the degree of size in the denominator:
SAB = c _
a + b – c
![Page 15: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/15.jpg)
Other 2D Descriptor Methods
• Similarity can be based on continuous whole molecule properties, e.g. logP, molar refractivity, topological indexes.
• Usual approach is to use a distance coefficient, such as Euclidean distance.
![Page 16: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/16.jpg)
Maximum Common Subgraph Similarity
• Another approach: generate alignment between the molecules (mapping)
• Define MCS: largest set of atoms and bonds in common between the two structures.
• A Non-Polynomial- (NP)-complete problem: very computer intensive; in the worst case, the algorithm will have an exponential computational complexity
• Tricks are used to cut down on the computer usage
![Page 17: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/17.jpg)
Maximum Common Subgraph
![Page 18: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/18.jpg)
Reduced Graph Similarity
• A structure’s key features are condensed while retaining the connections between them
• Cen ID structures with similar binding characteristics, but different underlying skeletons
• Smaller number of nodes speeds up searching
![Page 19: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/19.jpg)
3D Similarity
• Aim is often to identify structurally different molecules
• 3D methods require consideration of the conformational properties of molecules
![Page 20: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/20.jpg)
Tanimoto Coefficient to Find Compounds Similar to Morphine
![Page 21: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/21.jpg)
3D: Alignment-Independent Methods
• Descriptors: geometric atom pairs and their distances, valence and torsion angles, atom triplets
• Consideration of conformational flexibility increases greatly the compute time
• Relatively fewer pharmacophoric fingerprints than 2D fingerprints– Result: Low similarity values using Tanimoto
![Page 22: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/22.jpg)
Pharmacophore
• A structural abstraction of the interactions between various functional group types in a compound
• Described by a spatial representation of these groups as centers (or vertices) of geometrical polyhedra, together with pairwise distances between centers
• http://www.ma.psu.edu/~csb15/pubs/searle.pdf
![Page 23: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/23.jpg)
3D: Alignment Methods
• Require consideration of the degrees of freedom related to the conformational flexibility of the molecules
• Goal: determine the alignment where similarity measure is at a maximum
![Page 24: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/24.jpg)
3D: Field-Based Alignment Methods
• Consideration of the electron density of the molecules– Requires quantum mechanical calculation:
costly– Property not sufficiently discriminatory
![Page 25: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/25.jpg)
3D: Gnomonic Projection Methods
• Molecule positioned at the center of a sphere and properties projected on the surface
• Sphere approximated by a tessellated icosahedron or dodecahedron
• Each triangular face is divided into a series of smaller triangles
![Page 26: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/26.jpg)
Finding the Optimal Alignment
• Need a mechanism for exploring the orientational (and conformational) degrees of freedon for determining the optimal alignment where the similarity is maximized
• Methods: simplex algorithm, Monte Carlo methods, genetic alrogithms
![Page 27: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/27.jpg)
Evaluation of Similarity Methods
• Generally, 2D methods are more effective that 3D– 2D methods may be artificially enhanced
because of database characteristics (close analogs)
– Incomplete handling of conformational flexibility in 3D databases
• Best to use data fusion techniques, combining methods
![Page 28: Similarity Methods C371 Fall 2004. Limitations of Substructure Searching/3D Pharmacophore Searching Need to know what you are looking for Compound is](https://reader036.vdocument.in/reader036/viewer/2022062321/56649e495503460f94b3c8dc/html5/thumbnails/28.jpg)
For additional information . . .
• See Dr. John Barnard’s lecture at:http://www.indiana.edu/~cheminfo/C571/c571_Barnard6.ppt