The aim of my research is to establish relationships among diseases, physiological processes, and the action of small molecules such as mithramycin.
Our goal is to provide a generic solution to this problem by attempting to describe all biological states… in terms of genomic signatures, create a large public database of signatures of drugs and genes, and develop pattern-matching tools to detect similarities among these signatures.
FIRST GENERATION of CONNECTIVITY MAP
small molecules: 164 perturbagens tested (FDA-approved drugs and nondrug bioactive compounds)
cell lines: MCF7 (breast cancer), PC3 (prostate cancer), HL60 (leukemia), SKMEL5 (melanoma)
concentration and treatment: 10 µM (when the optimal concentration is unknown) × 6 h
control: cells on the same plate, treated with vehicle alone (medium, DMSO…)
OVERALL DATA
164 bioactive small molecules and corresponding vehicle control
Affymetrix GeneChip microarrays (HG U133A)
564 gene expression profiles
Traditional method: HIERARCHICAL CLUSTERING
A CLUSTER is a collection of objects/data that are:
* similar to the other objects in the same cluster
* different from the objects in the other clusters
In hierarchical clustering the data are not partitioned into a particular cluster in a single step. Instead, a series of partitions takes place, which may run from a single cluster containing all objects to n clusters each containing a single object.
Strategy already used to analyze data from yeast and rat tissues
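The series of partitions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it runs bottom-up (from n single-object clusters to one cluster), uses Euclidean distance with average linkage (the slides do not name a distance or linkage), and the function name `agglomerative_partitions` and the toy points are hypothetical.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerative_partitions(points):
    """Return every partition produced while merging n singletons into one cluster.

    Average linkage is an assumption made for this sketch.
    """
    clusters = [[i] for i in range(len(points))]
    partitions = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        def avg_dist(pair):
            a, b = pair
            total = sum(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))

        # Merge the two closest clusters.
        a, b = min(combinations(range(len(clusters)), 2), key=avg_dist)
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        partitions.append([list(c) for c in clusters])
    return partitions


# Toy data: two well-separated pairs of points (made up for illustration).
toy_points = [(0, 0), (0, 1), (10, 10), (10, 11)]
toy_partitions = agglomerative_partitions(toy_points)
for step in toy_partitions:
    print(len(step), "clusters:", step)
```

Each printed line is one level of the hierarchy; cutting the hierarchy at a given level yields the partition at that level.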
Drawbacks of hierarchical clustering
the structure that they obtained by this approach was related to cell type and batch effects
all profiles must be generated on the same microarray platform
an analytical method was needed that could detect multiple components within the cellular response to a perturbation
new rank-based method using the Kolmogorov-Smirnov statistic (comparable in purpose to the t-test)
QUERY SIGNATURE
Gene expression profile correlated with a biological state
EXPRESSION PROFILES
Gene expression profiles for the perturbagens tested
comparison
Query signature: up-regulated (+) and down-regulated (-) genes
Profiles: gene expression profile for each perturbagen compared to its vehicle (22,000 genes)
connection: strong positive … null … strong negative
connectivity score: +1 … 0 … -1
Connectivity map
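The comparison between a query signature and ranked expression profiles can be illustrated with a small rank-based sketch. It follows the general outline of a Kolmogorov-Smirnov style score computed separately for the up and down gene sets and then scaled to the +1 … 0 … -1 range. The exact formulas, the function names (`ks_score`, `raw_connectivity`, `normalize_scores`) and the ten toy genes are assumptions made for illustration, not a reproduction of the published software.

```python
def ks_score(ranks, n):
    """KS-style score for a gene set, given its 1-based ranks in a profile of n genes.

    Positive when the set sits near the top of the ranking, negative near the bottom.
    """
    t = len(ranks)
    v = sorted(ranks)
    a = max((j + 1) / t - v[j] / n for j in range(t))
    b = max(v[j] / n - j / t for j in range(t))
    return a if a > b else -b


def raw_connectivity(profile, up, down):
    """profile: genes ordered from most up-regulated to most down-regulated."""
    rank_of = {gene: i + 1 for i, gene in enumerate(profile)}
    n = len(profile)
    ks_up = ks_score([rank_of[g] for g in up], n)
    ks_down = ks_score([rank_of[g] for g in down], n)
    if ks_up * ks_down > 0:  # both sets move the same way: no connection
        return 0.0
    return ks_up - ks_down


def normalize_scores(raw):
    """Scale positive scores by the maximum and negative scores by the minimum."""
    p, q = max(raw), min(raw)
    return [s / p if s > 0 else (-(s / q) if s < 0 else 0.0) for s in raw]


# Toy example (made-up genes g0..g9, not real data).
genes = [f"g{i}" for i in range(10)]
up_tags, down_tags = ["g0", "g1"], ["g8", "g9"]
profile_same = genes                               # query genes move as in the query
profile_reversed = genes[::-1]                     # query genes move the opposite way
profile_unrelated = ["g0", "g1", "g8", "g9"] + genes[2:8]

raw_scores = [raw_connectivity(p, up_tags, down_tags)
              for p in (profile_same, profile_reversed, profile_unrelated)]
connectivity = normalize_scores(raw_scores)
print(connectivity)
```

Because only ranks enter the score, the profiles do not need to share an absolute expression scale with the query signature.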
SOME EXAMPLES
HDAC inhibitors
query signature: T24 (bladder), MDA435 and MDA468 (breast cancer) treated with HDAC inhibitors: vorinostat (SAHA), MS-27-275, trichostatin A
Gene expression profile
8 up-regulated genes:
* CDKN1A: cyclin-dependent kinase inhibitor 1A (p21, Cip1)
* FUCA1: fucosidase, alpha-L- 1, tissue
* MT1X: metallothionein 1X
* DHRS2: dehydrogenase/reductase (SDR family) member 2
* GLRX: glutaredoxin (thioltransferase)
* CLU: clusterin
* TUBA3: tubulin, alpha 3
* HIST1H2BG: histone 1, H2bg
5 down-regulated genes:
* ANP32B: acidic (leucine-rich) nuclear phosphoprotein
* TYMS: thymidylate synthetase
* CTPS: CTP synthase
* KPNB1: karyopherin (importin) beta 1
* ---: full-length cDNA clone CS0DH006YD11 of T cells
connectivity map
* Vorinostat
* Trichostatin A
* HC toxin
* Valproic acid
The Connectivity Map allows us to identify compounds not previously known to have this function
In this case the results are independent of the cell line used and of the dose of the drug
Estrogens
query signature: MCF7 treated with 17β-estradiol (E2), the natural ligand of ER
129 up and 89 down-regulated genes
connectivity map
• Both agonists and antagonists can be discovered directly from the Connectivity Map
• it is very important to collect the cells in an appropriate physiological state or molecular context
Gedunin
Gedunin is able to abrogate AR activity in prostate cancer cells. Mechanism?
• query signature: LNCaP treated for 6h with gedunin
35 up and 35 down-regulated genes
connectivity map
• high connectivity with HSP90 inhibitors
DISEASES
Diet-induced obesity
query signature: gene expression in rat model of diet-induced obesity
163 up and 161 down-regulated genes
• PPAR agonists and inducers of adipogenesis
• there is also a connection between the rat data and the human cell line data (but only in PC3)
Alzheimer disease
query signature: two independent studies
* Comparison between hippocampus from AD and normal brain: 40 genes
* Comparison between cerebral cortex from AD and age-matched controls: 25 genes
Significant negative connectivity with DAPH
Dexamethasone resistance in ALL
query signature: comparison of cells from dexamethasone-sensitive patients and dexamethasone-resistant patients
• sirolimus, mTOR inhibitor
• treatment with sirolimus sensitizes CEM-CL cell lines to dexamethasone treatment
[Figure: Sp1 near the start site → transcription; Sp1 and MTM near the start site → no transcription]
The anticancer activity of MTM has been associated with its ability to inhibit replication and transcription via cross-linking of the DNA strands; MTM is known to bind to the minor groove of GC-rich DNA as a Mg2+-dimer complex (MTM:Mg2+ = 2:1).
Our data: SDK
We tested a new MTM analog: SDK
3355 down-regulated genes, 48 up-regulated genes
900 with ≥2-fold change
240 with ≥3-fold change
query signature: A2780 treated with 100 nM SDK for 6 hours
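Counts like those above come from thresholding the treated-versus-control expression ratio for each gene. A minimal sketch of that arithmetic, assuming positive expression values and a fold change defined as treated divided by control; the function name `count_by_fold_change` and the five toy genes are hypothetical and are not the SDK data.

```python
def count_by_fold_change(treated, control, thresholds=(2.0, 3.0)):
    """Count up/down-regulated genes and genes reaching each fold-change threshold.

    treated, control: dicts mapping gene name -> positive expression value.
    """
    summary = {"up": 0, "down": 0}
    summary.update({t: 0 for t in thresholds})
    for gene, t_value in treated.items():
        c_value = control[gene]
        ratio = t_value / c_value
        if ratio > 1:
            summary["up"] += 1
        elif ratio < 1:
            summary["down"] += 1
        # Magnitude of change, regardless of direction.
        magnitude = max(ratio, c_value / t_value)
        for t in thresholds:
            if magnitude >= t:
                summary[t] += 1
    return summary


# Toy values (made up for illustration only).
toy_control = {"g1": 100, "g2": 100, "g3": 100, "g4": 100, "g5": 100}
toy_treated = {"g1": 50, "g2": 400, "g3": 90, "g4": 30, "g5": 100}
toy_summary = count_by_fold_change(toy_treated, toy_control)
print(toy_summary)
```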
DISCUSSION
encouraging results
the Connectivity Map can be used to:
- find drugs with a common mechanism of action (HDAC inhibitors)
- discover an unknown mechanism of action (gedunin)
- identify potential new therapeutics
genomic signatures are often conserved across different cell types and different origins
but this pilot study also has several limitations:
- small number of cell lines used
- few concentrations
- interpretation of the results
- the method for statistical analysis
Non-parametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from data. The term nonparametric is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance.
Nonparametric models are therefore also called distribution free. A histogram is a simple nonparametric estimate of a probability distribution.
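To make the histogram remark concrete, here is a minimal sketch of a histogram used as a density estimate: bin counts are divided by (number of points × bin width) so that the total area is 1. The function name `histogram_density`, the equal-width binning and the toy data are assumptions made for illustration.

```python
def histogram_density(data, bins):
    """Return (bin_left_edge, density) pairs for an equal-width histogram.

    Assumes max(data) > min(data). No distributional form is assumed;
    the shape of the estimate comes entirely from the data.
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        k = min(int((x - lo) / width), bins - 1)  # put the maximum in the last bin
        counts[k] += 1
    n = len(data)
    return [(lo + k * width, c / (n * width)) for k, c in enumerate(counts)]


# Toy data (made up for illustration).
toy_data = [0, 1, 1, 2, 2, 2, 3, 4]
toy_density = histogram_density(toy_data, bins=4)
for left_edge, density in toy_density:
    print(left_edge, density)
```

The number of bins is the flexible "parameter" here: it controls smoothness but does not fix the shape of the distribution in advance.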
Non-parametric (or distribution-free) inferential statistical methods are mathematical procedures for statistical hypothesis testing which, unlike parametric statistics, make no assumptions about the frequency distributions of the variables being assessed. The most frequently used tests include the Kolmogorov-Smirnov test (often called the K-S test), which is used to determine whether two underlying probability distributions differ, or whether an underlying probability distribution differs from a hypothesized distribution, in either case based on finite samples.
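For the two-sample case, the K-S statistic is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below computes only that statistic (no p-value); the function names `ecdf` and `ks_two_sample_statistic` and the toy samples are hypothetical.

```python
def ecdf(sample, x):
    """Empirical CDF: fraction of sample values less than or equal to x."""
    return sum(v <= x for v in sample) / len(sample)


def ks_two_sample_statistic(a, b):
    """Maximum absolute difference between the empirical CDFs of a and b."""
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


# Toy samples (made up for illustration).
d_identical = ks_two_sample_statistic([1, 2, 3, 4], [1, 2, 3, 4])
d_shifted = ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6])
d_disjoint = ks_two_sample_statistic([1, 2, 3], [4, 5, 6])
print(d_identical, d_shifted, d_disjoint)
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap), and it depends only on the ordering of the values, not on any assumed distribution.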
Nonparametric statistical methods allow one to analyze data without making strong assumptions about the process that generated the data. For example, instead of assuming that the data have a Gaussian distribution, we might assume only that the distribution has a probability density that satisfies some weak smoothness conditions.