Location analysis of transcription factor binding sites
Guy Naamati
Andrei Grodzovky
A brief history
What about today?
• Two weeks ago Masha and Michal told us about gene expression and gene clusters.
• Last week, Lior and Ofer told us about transcription factor binding sites (TFBSs), and how to identify them.
Today!
A revolutionary new method that identifies where and when in the genome a binding factor actually binds!
We will talk about the method that reveals genome-wide localization, and present several important examples from the world of yeast cells.
The star of the show
Motivation no. 1
What can location analysis give us that microarrays alone can’t?
• Microarrays identify changes in mRNA levels, but cannot distinguish direct from indirect effects.
Motivation no. 2
What advantage does localization have over trying to identify the binding site?
• Right! We don’t have to handle the many cases in which it “looks” like we identified a binding site, but in vivo it is not one.
The Method
Developed by the group of Richard A. Young in Cambridge.
A combination of location and expression profiles.
Allows protein-DNA interactions to be monitored across the entire yeast genome.
The Method
A modified ChIP (chromatin immunoprecipitation), combined with microarray analysis.
DNA was taken from cells and broken into fragments with sound waves (sonication).
Proteins of interest were tagged with myc.
Fragments cross-linked to those proteins were enriched by immunoprecipitation (IP).
What now?
Cross-links were reversed, and the enriched DNA was amplified and labeled (Cy5).
Cy5-labeled DNA was hybridized to a microarray, together with non-enriched DNA labeled with Cy3.
Gene expression was also analyzed.
Three independent experiments were performed, for accuracy.
Handling noise
A single-array error model was used.
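As a rough illustration of the readout described above, the sketch below computes a per-spot log2 ratio of the enriched (Cy5) channel to the non-enriched (Cy3) channel and combines three replicate experiments by taking the median. The median is a simple stand-in chosen for illustration, not the error model used in the study, and the spot names and intensities are made up.

```python
import math
from statistics import median

def log_ratios(cy5, cy3):
    """Per-spot log2(Cy5 / Cy3) for one array; inputs map spot -> intensity."""
    return {spot: math.log2(cy5[spot] / cy3[spot]) for spot in cy5}

def combine_replicates(replicates):
    """Median log2 ratio per spot across replicate arrays (a list of dicts)."""
    spots = replicates[0].keys()
    return {s: median(r[s] for r in replicates) for s in spots}

# Hypothetical (Cy5, Cy3) intensities for two spots in three experiments.
arrays = [
    ({"spot_A": 800.0, "spot_B": 100.0}, {"spot_A": 100.0, "spot_B": 100.0}),
    ({"spot_A": 400.0, "spot_B": 120.0}, {"spot_A": 100.0, "spot_B": 120.0}),
    ({"spot_A": 1600.0, "spot_B": 50.0}, {"spot_A": 100.0, "spot_B": 100.0}),
]
combined = combine_replicates([log_ratios(cy5, cy3) for cy5, cy3 in arrays])
```

A spot with a combined ratio well above zero (here `spot_A`) would be a candidate bound region; where to put the cutoff is exactly what a proper error model is for.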
How accurate is it?
This method can identify factors binding to DNA, but cannot recognize the exact location of the binding site. Why?
• The sonication breaks the DNA into fragments 500-1000 bases long. Not very specific.
Testing if it works
Used to identify sites bound by Gal4 in the yeast genome.
Found seven genes previously reported to be regulated by Gal4.
In addition, three more genes were found!
An important reminder
The consensus binding site for Gal4 was found in many places in the genome where Gal4 did not bind. Why is that?
• Previous studies of Gal4 have suggested that chromatin structure also has a big role.
Confirmation
The next investigation
Ste12 functions in the response of haploid yeast to mating pheromones.
More than 200 genes are activated in a Ste12-dependent fashion. Which are directly regulated?
• By this method, only 29!
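The reasoning behind that number can be sketched as a set intersection: a gene counts as directly regulated only if it is both dependent on the factor in the expression data and bound by the factor in the location data. The gene names below are placeholders, not results from the study.

```python
def direct_targets(dependent_genes, bound_genes):
    """Genes that both respond to the factor and are bound by it in vivo."""
    return sorted(set(dependent_genes) & set(bound_genes))

# Placeholder gene identifiers, for illustration only.
dependent = ["geneA", "geneB", "geneC", "geneD"]   # expression changes with the factor
bound = ["geneB", "geneD", "geneE"]                # promoter bound by the factor

direct = direct_targets(dependent, bound)
indirect = sorted(set(dependent) - set(bound))
```

Genes left in `indirect` change expression without being bound, which is the "indirect effect" that expression arrays alone cannot separate out.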
What’s next?
This method can identify the global set of genes that are regulated directly in vivo.
Gives us accurate information about where and when transcription factors bind.
Opens a new pathway into regulation analysis…
Transcriptional regulatory networks in yeast
Lee et al.
Just as there are networks of metabolic pathways…
There are networks of regulator-gene interactions
But the network consists of building blocks:
Those are…
How do we identify them?
Using genome-wide location analysis
Identification of a set of promoter regions that are bound by specific regulators allowed us to predict sequence motifs that are bound by these regulators
Auto-Regulation
Provides reduced response time to environmental stimuli.
Multi-Component Loop
Offers the potential to produce bistable systems that can switch between two alternative states.
Feed-Forward Loop
Provides a form of multistep ultrasensitivity, as small changes in the level of activity of the master regulator at the top of the loop might be amplified at the ultimate target.
Single Input Motif
Single-input motifs are potentially useful for coordinating a discrete unit of biological function, such as a set of genes that code for the subunits of a biosynthetic apparatus or the enzymes of a metabolic pathway.
Multi Input Motif
This motif offers the potential for coordinating gene expression across a wide variety of growth conditions.
Regulator Chain
The chain represents the simplest circuit logic for ordering transcriptional events in a temporal sequence.
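To make the building blocks concrete, here is a minimal sketch that finds three of them in a toy binding network, represented as a dict from regulator to the set of genes whose promoters it binds. The representation, function names, and the network itself are assumptions for illustration; in particular, treating "bound by exactly one regulator" as a single-input motif is a simplification.

```python
def autoregulators(net):
    """Regulators that bind their own promoter."""
    return sorted(r for r, targets in net.items() if r in targets)

def feed_forward_loops(net):
    """Triples (master, intermediate, target): the master binds both the
    intermediate regulator and the target, and the intermediate binds the target."""
    loops = []
    for a, a_targets in net.items():
        for b in a_targets:
            if b == a or b not in net:
                continue  # the intermediate must itself be a regulator
            for c in net[b]:
                if c in a_targets and c not in (a, b):
                    loops.append((a, b, c))
    return sorted(loops)

def single_input_targets(net):
    """Map each regulator to the genes bound by that regulator alone."""
    bound_by = {}
    for r, targets in net.items():
        for g in targets:
            bound_by.setdefault(g, set()).add(r)
    result = {}
    for g, regs in bound_by.items():
        if len(regs) == 1:
            (r,) = regs
            result.setdefault(r, []).append(g)
    return {r: sorted(genes) for r, genes in result.items()}

# Toy network with hypothetical regulators R1-R3 and genes g1-g4.
net = {
    "R1": {"R1", "R2", "g1"},   # R1 binds itself, R2 and g1
    "R2": {"g1", "g2"},
    "R3": {"g3", "g4"},
}
auto = autoregulators(net)
ffl = feed_forward_loops(net)
sim = single_input_targets(net)
```

Here `R1` is autoregulated, `R1 -> R2 -> g1` with `R1 -> g1` forms a feed-forward loop, and `R3` controls `g3` and `g4` alone.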
Single Input Motif: Example
FHL1: regulator of ribosomal protein genes.
Forms a single-input regulatory motif consisting of essentially all ribosomal protein genes (found by genome-wide location analysis).
Assembling motifs into network structures
An algorithm based on genome-wide location data and expression data from over 500 experiments was developed in order to identify groups of genes that are both coordinately bound and expressed.
Network assembly algorithm
1- Define a set of genes G bound by a set of regulators S.
2- Find a subset of G with a similar expression pattern.
3- Go over the genes in G and drop genes with a significantly different expression pattern.
4- Scan the remaining genome for genes with a similar expression profile and check if they’re bound by factors from S.
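The four steps above can be sketched as follows. This is an illustrative reading, not the published implementation: similarity is measured with Pearson correlation, the "similar subset" is seeded from the gene most correlated with the rest of G, and the names `bound_strict`, `bound_relaxed`, and the `min_corr` cutoff are all assumptions introduced here.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def assemble_module(S, bound_strict, bound_relaxed, expr, min_corr=0.8):
    S = set(S)
    # Step 1: genes bound by every regulator in S (strict binding calls).
    G = [g for g in expr if S <= bound_strict.get(g, set())]
    if not G:
        return []
    # Step 2: seed a similar subset from the gene most correlated with the
    # rest of G, then average that subset into a core expression profile.
    seed = max(G, key=lambda g: sum(pearson(expr[g], expr[h]) for h in G))
    similar = [g for g in G if pearson(expr[g], expr[seed]) >= min_corr]
    core = [mean(vals) for vals in zip(*(expr[g] for g in similar))]
    # Step 3: drop genes in G whose profile differs from the core.
    kept = [g for g in G if pearson(expr[g], core) >= min_corr]
    # Step 4: scan the rest of the genome for similar profiles and accept
    # genes bound by S under the relaxed binding calls.
    for g in expr:
        if g not in G and pearson(expr[g], core) >= min_corr \
                and S <= bound_relaxed.get(g, set()):
            kept.append(g)
    return sorted(kept)

# Made-up expression profiles and binding calls for five genes.
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],      # bound, but expressed in the opposite pattern
    "g4": [1, 2, 3, 5],      # co-expressed, bound only under relaxed calls
    "g5": [1, 2, 3, 4.5],    # co-expressed, but not bound
}
bound_strict = {"g1": {"A", "B"}, "g2": {"A", "B"}, "g3": {"A", "B"}}
bound_relaxed = dict(bound_strict, g4={"A", "B"})
module = assemble_module({"A", "B"}, bound_strict, bound_relaxed, expr)
```

In this toy run `g3` is dropped in step 3, `g4` is recovered in step 4, and `g5` stays out because co-expression alone is not evidence of binding.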
What have we got?
The resulting sets of genes and regulators are multi-input motifs, but they are refined for common expression (MIM-CEs).
MIM-CEs: What are they good for?
Using MIM-CEs, the yeast cell cycle network was constructed by an automated method, without prior knowledge of the regulators that control transcription.
The process
Check for MIM-CEs significantly enriched in genes whose expression oscillates during the cell cycle.
Align MIM-CEs around the cell cycle on the basis of peak expression of the genes in the MIM-CE.
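Both steps of the process can be sketched in a few lines. The choices here are assumptions made for illustration: enrichment is scored with a hypergeometric tail probability, peak times are given as fractions of the cycle in [0, 1), and each module is placed at the circular mean of its genes' peaks so that modules spanning the end of the cycle are handled correctly. The module names and numbers are invented.

```python
import math

def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance that a module of n genes, drawn from N genes of
    which K oscillate, contains at least k oscillating genes."""
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return hits / math.comb(N, n)

def circular_mean_phase(phases):
    """Mean of peak times expressed as fractions of the cycle in [0, 1)."""
    x = sum(math.cos(2 * math.pi * p) for p in phases)
    y = sum(math.sin(2 * math.pi * p) for p in phases)
    return (math.atan2(y, x) / (2 * math.pi)) % 1.0

def order_modules(module_peaks):
    """Module names sorted by the circular mean of their genes' peak times."""
    return sorted(module_peaks,
                  key=lambda m: circular_mean_phase(module_peaks[m]))

# Invented peak times for the genes of three modules.
module_peaks = {
    "late": [0.70, 0.80],
    "wrap": [0.95, 0.15],    # straddles the end of the cycle
    "early": [0.20, 0.30],
}
order = order_modules(module_peaks)
p_value = hypergeom_tail(2, 2, 2, 4)
```

An ordinary average would place the `wrap` module at 0.55, in the middle of the cycle; the circular mean places it at 0.05, just after the start.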
The outcome
Yeast cell cycle transcriptional regulatory network.
Features of the network model:
First, the computational positioning of regulators correlates with previous studies.
Second, regulators whose function was not known before could be positioned in the network on the basis of direct binding data.
Third, and most important, reconstruction of the regulatory architecture was automatic and required no prior knowledge of the regulators that control transcription during the cell cycle.
Serial Regulation of Transcriptional Regulators in the Yeast Cell Cycle
Simon et al.
Many transcriptional regulatory networks in yeast…
Why the cell cycle network?
Cyclins regulate the cell cycle
Regulation of the cell cycle clock is effected through activity of the cyclin-dependent kinase (CDK) family of protein kinases.
But who regulates the regulators?
Nine transcriptional regulators were identified
The method
Using genome-wide location analysis to identify the binding sites for each of the factors in vivo.
The resultsThe results
ChIP Micro Array
These results confirm the stage specific regulation of gene expression by those factors.
The results also confirm that genes encoding several of the cell cycle transcriptional regulators are themselves bound by other cell cycle regulators.
In this way a full regulatory network is formed.
And of course the cell cycle regulators themselves, the cyclins and CDKs, are also regulated by those factors.
Functional redundancy
Each of the factors binds a critical cell cycle gene.
Deletion mutants with one of the factors deleted survive…
Why?
What for?
Partial redundancy between the two members of a regulator pair ensures that the cell cycle completes efficiently.
On the other hand, devoting the two members of the pair to distinct functional groups of genes enables coordinated regulation of those functions.
The Genome-Wide Localization of Rsc9
Damelin et al., 2002
A bit of background
Recent studies identified a common set of genes that are repressed or induced in response to stress (in yeast).
They generalized the roles of Msn2 and Msn4 in the stress response.
Do Msn2 and Msn4 account for all the observed changes in the transcriptional response to stress?
Evidently not
Any explanation must account for extensive gene repression as well as activation.
Previous evidence (Gasch et al., 2000): many Msn2/4-dependent genes are activated only in some stress conditions.
• Tempting to consider a role for general transcription factors in the stress response.
Along came RSC
Regulation of gene expression is closely connected to changes in chromatin structure.
RSC: a 15-protein complex that uses ATP energy to reposition nucleosomes.
Rsc9: a stable component of the RSC complex.
Genome-wide localization
The exact method we talked about was used for Rsc9.
Two categories with significant enrichment:
1. Genes coding for the cytoplasmic and mitochondrial ribosomal proteins.
2. Genes involved in the stress response.
What kind of stress?
Both sets of genes are affected by many types of stress.
The question is raised whether Rsc9 responds to specific or general stress. How do we find out?
• Localization to the rescue!!
Two Stress Treatments
Hydrogen peroxide (elicits a transcriptional response similar to many other stresses).
Rapamycin (cell response is similar to starvation).
Similar changes of Rsc9 localization after both treatments suggest a general stress response.
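One way to put a number on "similar changes" is to compute, for each gene, the change in binding ratio after each treatment and then correlate the two sets of changes. This is a sketch of that idea under stated assumptions, not the analysis performed in the study: the binding values are invented log ratios, and Pearson correlation is just one possible similarity measure.

```python
from statistics import mean

def binding_changes(untreated, treated):
    """Per-gene change in binding log ratio after a treatment."""
    return {g: treated[g] - untreated[g] for g in untreated if g in treated}

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is flat)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def change_similarity(changes_a, changes_b):
    """Correlation of binding changes over the genes measured in both treatments."""
    genes = sorted(set(changes_a) & set(changes_b))
    return pearson([changes_a[g] for g in genes], [changes_b[g] for g in genes])

# Invented binding log ratios for four genes.
untreated = {"g1": 0.0, "g2": 1.0, "g3": 2.0, "g4": 0.5}
peroxide = {"g1": 1.0, "g2": 1.0, "g3": 0.5, "g4": 1.5}
rapamycin = {"g1": 0.8, "g2": 1.1, "g3": 1.0, "g4": 1.4}

similarity = change_similarity(binding_changes(untreated, peroxide),
                               binding_changes(untreated, rapamycin))
```

A similarity close to 1 across two unrelated stresses is what would point to a general rather than a stress-specific response.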
A question
• How would we know that the changes wouldn’t occur from an unrelated treatment?
Right. Genome-wide localization was performed after treatment with the mating pheromone alpha factor. The results were:
Conclusion
We have seen how genome-wide localization helps us recognize regulation motifs and networks.
We have also seen a computational method to create a whole regulatory network without prior knowledge of the factors involved.
Conclusion
The changes in Rsc9 localization suggest that the genome itself is conditioned during widespread transcriptional regulation.
This raises new and interesting questions for transcriptional regulation.
Bibliography
Lee et al. Transcriptional Regulatory Networks in Saccharomyces cerevisiae. Science 2002 298:799-804
Damelin et al. The Genome-Wide Localization of Rsc9, a Component of the RSC Chromatin-Remodeling Complex, Changes in Response to Stress. Mol Cell 2002 9:563-573
Simon et al. Serial Regulation of Transcriptional Regulators in the Yeast Cell Cycle. Cell 2001 106:697-708
Ren et al. Genome-Wide Location and Function of DNA Binding Proteins. Science 2000 290:2306-2309
Hope you had fun!