development team - inflibnet centre
TRANSCRIPT
ZOOLOGY Molecular Cell Biology
Principles of Gene Expression: Transcription
Development Team
Paper Coordinator : Prof. Kuldeep K. Sharma
Department of Zoology, University of Jammu
Principal Investigator : Prof. Neeta Sehgal
Department of Zoology, University of Delhi
Content Writer : Dr. Sudhida Gautam, Hansraj College, University of Delhi
Dr. Kiran Bala, Deshbandhu College, University of Delhi
Content Reviewer : Prof. Rup Lal
Department of Zoology, University of Delhi
Co-Principal Investigator : Prof. D.K. Singh
Department of Zoology, University of Delhi
Paper : 15 Molecular Cell Biology Module : 21 Principles of gene expression: Transcription
Description of Module
Subject Name ZOOLOGY
Paper Name Molecular Cell Biology; Zool 015
Module Name/Title Principles of Gene Expression
Module Id M21; Transcription
Keywords Central dogma, Transcription, RNA polymerase, template, Pribnow
box, promoter, splicing, holoenzyme, monocistronic
Contents
1. Learning Outcomes
2. Introduction
3. Transcription
3.1 Components
3.2 Types of RNA
4. Experimental Evidences
4.1 DNA acts as template for transcription
4.2 One DNA strand acts as a template
5. Transcription Unit
6. RNA Polymerase
6.1 Bacterial RNA polymerase
6.2 Eukaryotic RNA polymerase
7. Bacterial Transcription
7.1 Initiation
7.2 Elongation
7.3 Termination
8. Transcription in Eukaryotes
8.1 Initiation
8.2 Elongation
8.3 Termination
(i) Allosteric model
(ii) Torpedo model
9. Post Transcriptional Modifications
9.1 RNA splicing
9.2 Pre-mRNA Processing/ 3’ and 5’ modifications
10. Summary
1. Learning Outcomes
The present module explains:
The central dogma of molecular biology.
The basic differences between DNA and RNA.
The purpose of transcription in biological systems and its three stages (initiation, elongation and termination).
How genetic information passes from DNA to RNA before translation can begin.
The key differences between prokaryotic and eukaryotic transcription.
2. Introduction
The word transcript means a written or printed version of something. Transcription is a vital process in living organisms in which a single-stranded RNA is synthesized using DNA as a template. It is under complex regulatory control involving numerous gene regulatory elements such as promoters and silencers. In 1953 Watson and Crick proposed the double helix model of DNA, and in 1958 Crick stated the Central Dogma of molecular biology (Figure 1), according to which genetic information flows within an organism in two steps. First, the double-stranded DNA is locally unwound by enzymes to expose a single strand which acts as a template for the synthesis of mRNA (transcription); the mRNA chain produced follows the base sequence of the template DNA. Second, the mRNA directs the production of protein molecules (translation).
The growth and development of an organism depend on the properties of the various proteins present in its tissues and cells. Whenever a cell needs a certain protein, the protein is synthesized from the information in the DNA that specifies its amino acid sequence. Until the discovery of retroviruses it was believed that genetic information flows in one direction only, from DNA to RNA (transcription) and from RNA to protein (translation). This flow of genetic information is commonly known as gene expression, and the two-step process (DNA -> RNA -> Protein) was referred to by Crick as the Central Dogma of Molecular Biology. It provides the basic picture of how genetic information flows within a cell. Most organisms use DNA as the genetic material and follow the central dogma. Retroviruses, however, have RNA as the genetic material; for their genes to be expressed, the information in the RNA is first copied into DNA, a process known as reverse transcription. The central dogma in its simple unidirectional form is therefore no longer considered complete.
Figure 1: The central dogma of molecular biology
Table 1 lists the differences between DNA and RNA. DNA is a double-stranded molecule containing deoxyribose sugar. The sugars of RNA and DNA differ only at the second carbon (C2’): deoxyribose, as the name suggests, lacks the oxygen atom of the hydroxyl group at C2’ (Figure 2). The absence of this 2’-OH group makes DNA more stable than RNA, which contains ribose. Adenine, guanine and cytosine are present in both DNA and RNA. Thymine, however, is present in DNA but absent from RNA, where uracil takes its place.
Table 1: Differences between DNA and RNA
Characteristic | DNA (Deoxyribonucleic acid) | RNA (Ribonucleic acid)
Sugar | Deoxyribose | Ribose
Bases | A, T, G, C | A, U, G, C
Strands | Double stranded | Single stranded
Presence of 2’-OH group (Figure 2) | No | Yes
Stability | More stable | Less stable
Types | A, B and Z forms | rRNA, mRNA, tRNA, snRNA, miRNA, siRNA
(A) Ribose Sugar (B) Deoxyribose sugar
Figure 2: Structure of Pentoses sugar: (A) Ribose; (B) Deoxyribose
Source: https://s3.amazonaws.com/classconnection/506/flashcards/735506/jpg/picture2-148FB8638C909AD7EE1.jpg
3. Transcription
The genetic information of the DNA (genotype) is used to synthesize RNA through the process of transcription. It requires a series of events involving various RNA nucleotides, a DNA template and a set of protein components for its initiation and regulation.
3.1. Components
The transcription requires three major components:
1. DNA template
2. RNA polymerase and associated protein factors
3. Raw materials (Nucleotides)
RNA polymerase, the enzyme that catalyzes transcription, was discovered independently in 1960 by Samuel Weiss and Jerard Hurwitz. RNA polymerase uses the information in the DNA to make RNA, with one of the two DNA strands serving as the template; this strand is also known as the antisense or non-coding strand. The double-stranded DNA is unwound, and the strand that RNA polymerase reads acts as the template strand (Figure 3). The other strand is referred to as the non-template strand, coding strand or sense strand. Transcription begins at specific DNA sequences called promoters. It occurs in three phases: initiation, elongation and termination.
Figure 3: A transcribing unit
Source: (http://www.phschool.com/science/biology_place/biocoach/images/transcription/startrans.gif)
http://biobar.hbhcgz.cn/Article/UploadFiles/201104/20110415163953927.jpg
3.2. Types of RNA
RNA (ribonucleic acid) is a polymer of ribonucleotides linked together by 3’-5’ phosphodiester bonds. Before discussing transcription in detail, we briefly review the different types of RNA (Table 2).
Table 2: Types of RNA, classified according to function
Type of RNA | Function | Location
Ribosomal RNA (rRNA) | Structural component of the ribosomes | Cytoplasm
Messenger RNA (mRNA) | Carries the information in a gene for protein synthesis | Nucleus and cytoplasm
Transfer RNA (tRNA) | Transports amino acids to the ribosomes during protein synthesis | Cytoplasm
Small nuclear RNA (snRNA) | Modification of the RNA transcript | Nucleus
Micro RNA (miRNA) | RNA interference | Nucleus
Small interfering RNA (siRNA) | RNA interference | Nucleus
4. Experimental Evidences/Historical Studies Helpful to Study Transcription
4.1. DNA acts as the template for transcription
In 1970, Oscar Miller, Barbara Hamkalo and Charles Thomas provided evidence that the DNA molecule is used as a template for transcription. Electron microscopy of cellular contents revealed Christmas-tree-like structures: thin central fibers (the trunk of the tree) to which were attached strings (the branches) bearing granules (Figure 4). Adding deoxyribonucleases broke down the central fibers, indicating that the tree trunks were DNA molecules. Ribonucleases removed the granular strings, indicating that the branches were RNA. Each Christmas-tree-like structure was concluded to be a gene undergoing transcription. As transcription proceeds along the gene, each RNA chain grows longer, which is why the branches lengthen from one end of the tree to the other.
(a) Christmas tree (b) RNA Polymerase and DNA chain
Figure 4: Christmas tree like structures within the cell showing the site of transcription
Source: (a) http://www.nature.com/scitable/content/Under-the-electron-microscope-DNA-molecules-undergoing-29586
(b) http://mol-biol4masters.masters.grkraj.org/html/Gene_Expression_II3-RNAP_I_Promoter.htm
The DNA molecule unwinds to act as a template, i.e. the double helix opens up and only one strand serves as the template for transcription. RNA polymerase reads this template strand, which is complementary to the coding strand that carries the gene sequence. Because the RNA is built on the template strand, it has the same sequence as the coding strand of the DNA, with U in place of T. For a given gene, only one of the two DNA strands acts as the template.
4.2. One strand of DNA acts as a template for transcription
In 1963, Julius Marmur and his colleagues (Figure 5) showed that only one strand acts as the template. They used the DNA of bacteriophage SP8, which infects the bacterium Bacillus subtilis. The two strands of this phage’s double-stranded DNA have different densities, which permits their separation by equilibrium density gradient centrifugation into "heavy" and "light" DNA strands.
Marmur and his colleagues grew B. subtilis in a medium containing a radioactively labeled precursor of RNA. The bacteria were then infected with SP8, so that the phage DNA was injected into the bacterial cells. Transcription within the infected cells produced radioactively labeled RNA complementary to the phage DNA, and this newly synthesized RNA was isolated from the cells. Separately, DNA was isolated from a fresh culture of SP8 and its two strands were separated into the heavy and the light strand.
The radioactively labeled RNA from the infected cells was hybridized with the heavy and the light strand independently. Hybridization was observed only with the heavy strand; the light strand did not hybridize. This showed that RNA is transcribed from only one of the DNA strands in SP8, the heavy strand acting as the template in this case. The newly synthesized RNA strand is complementary and antiparallel to the template strand, and it has the same polarity and base sequence as the non-template strand, except that T in DNA is replaced by U in RNA.
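The relationship between the template strand, the RNA and the non-template strand can be illustrated with a minimal Python sketch. The sequence used is a made-up example (not SP8 DNA), and the function names are illustrative only. The template is written 3’ to 5’ so that the RNA reads 5’ to 3’ from left to right.

```python
# Base-pairing rules used when RNA is copied from a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def non_template_strand(template_3_to_5: str) -> str:
    """Return the coding (non-template) DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)


# Hypothetical template strand, written 3'->5'.
template = "TACGGATTC"
rna = transcribe(template)
coding = non_template_strand(template)

print("RNA    5'-" + rna + "-3'")
print("Coding 5'-" + coding + "-3'")
# The RNA matches the coding strand once T is replaced by U.
print(rna == coding.replace("T", "U"))
```

Because the RNA is complementary to the template strand only, it would hybridize with that strand and not with the coding strand, which is the logic behind Marmur's result.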
Figure 5: Marmur’s Experiment to prove only one strand of DNA acts as template during transcription
5. Transcription Unit
A transcription unit (Figure 6) is a stretch of DNA that encodes an RNA molecule together with the sequences needed for its transcription. It is flanked by a promoter situated upstream of the start site and a terminator located downstream of the structural genes. Its three components are:
1. Promoter: DNA sequences (consensus sequences) located towards the 5’ end of the unit, upstream of the RNA-coding region and the transcriptional start site, which recruit the transcription apparatus and initiate transcription. The promoter facilitates the binding of the transcription apparatus to the DNA template and ensures that the initiation of each RNA occurs at the same point. Two promoter elements have been identified in bacteria: (a) the Pribnow box, located about 10 nucleotides upstream of the transcription start site (-10; TATAAT sequence, rich in adenine and thymine), and (b) TTGACA, located about 35 nucleotides upstream (-35). The specific sequence of the promoter determines how strongly RNA polymerase binds to the transcription unit.
2. RNA-coding sequence: The sequence of the template DNA which is copied into the RNA molecule.
3. Terminator: The sequence of nucleotides which signals the termination of transcription. It is usually part of the RNA-coding region, so transcription stops only after the terminator has been copied into RNA.
Figure 6: A transcription unit
Source: http://bioap.wikispaces.com/Ch+17+Collaboration
6. RNA Polymerase
The enzyme RNA polymerase catalyzes the process of transcription
6.1. Bacterial RNA polymerase
A single large multimeric RNA polymerase catalyzes bacterial transcription (Figure 8). It consists of a core enzyme made of five polypeptides (two α, one β, one β’ and one ω) bound to another polypeptide called the sigma factor (σ). The sigma factor recognizes the -35 and -10 regions of the promoter and ensures that the binding of RNA polymerase to the DNA is stable. The core enzyme cannot initiate transcription on its own because it lacks the ability to bind specifically at the promoter; when the sigma factor binds to it, it is converted into the holoenzyme, which is ready to bind the promoter. The holoenzyme (the core enzyme and the sigma factor together as a single unit) is the functional form of the enzyme for initiation. It first binds to the promoter while the DNA is still double helical, forming the closed promoter complex. It then melts a short stretch of the helix, about 12 bp long, converting the closed complex into the open promoter complex. Incoming nucleotides complementary to the DNA template strand bind to the enzyme, which catalyzes the formation of phosphodiester bonds between them. This complex of polymerase, DNA and RNA is known as a ternary complex. The enzyme can then move forward to a new position on the template. The sigma factor can leave the complex once the RNA chain is 8-9 nucleotides long. The released sigma factor can be used by another RNA polymerase to initiate transcription of a different gene, and elongation of the RNA chain proceeds without it. The release of sigma is associated with conformational changes in the enzyme which lock it into a processive elongation mode. The enzyme continues to catalyze RNA elongation until it encounters a terminator sequence. Note that the core enzyme binds to DNA with the same affinity at any position; its binding sites are known as loose binding sites. The holoenzyme, in contrast, binds to promoters very tightly, with an association constant about 1000 times higher (on average) than that of the core enzyme and a half-life of several hours.
6.2. Eukaryotic RNA polymerase
Eukaryotic RNA polymerases are large and complex enzymes; for example, the yeast enzyme consists of two large subunits and ten small subunits (Figure 7). Eukaryotes have three different RNA polymerases, each responsible for transcribing a different class of RNA (Table 3). RNA polymerase I transcribes 28S, 18S and 5.8S rRNA and is located in the nucleolus. RNA polymerase II transcribes mRNA and snRNA and is present in the nucleoplasm. RNA polymerase III transcribes tRNA, 5S rRNA, some snRNAs and a few miRNAs and is also present in the nucleoplasm. In addition, RNA polymerase IV is present in the nuclei of plants and transcribes some siRNAs. Compared with the prokaryotic RNA polymerase, these enzymes are larger multimeric complexes whose subunits are encoded by several genes.
Table 3: Types of eukaryotic RNA polymerase
RNA polymerase | Function | Location
RNA polymerase I | Transcribes 28S, 18S and 5.8S rRNA molecules | Nucleolus
RNA polymerase II | Transcribes mRNA and snRNA | Nucleoplasm of the nucleus
RNA polymerase III | Transcribes tRNA, 5S rRNA, some snRNA and a few miRNA molecules | Nucleoplasm
RNA polymerase IV | Transcribes some siRNA in plants | Nucleus in plants
Figure 7: Comparison of structural composition of prokaryotic and eukaryotic RNA polymerase
(Source: http://www.cbs.dtu.dk/dtucourse/cookbooks/dave/Fig13_35.JPG)
7. Bacterial Transcription
The basic transcription unit and apparatus have already been discussed; we now study in detail how the process is carried out. Transcription can be divided into three steps:
7.1. Initiation
Initiation consists of the binding of RNA polymerase to the DNA helix so that RNA synthesis can begin. The transcription apparatus recognizes and binds to the promoter, which is identified by the sigma factor. The DNA of the closed promoter complex melts to give the open promoter complex. The template strand is identified and nucleotides are added accordingly. The rate of transcription varies between genes, depending on the affinity of RNA polymerase for each promoter. Two DNA sequences that play a critical role in the initiation of transcription are found in most E. coli promoters, upstream of the start site at -35 (which helps in initial recognition) and at -10 (where melting converts the closed promoter complex into an open promoter complex) (Figure 6 and Figure 8). A consensus sequence is the set of nucleotides most commonly found at each position when such sequences from many promoters are compared. The two consensus sequences of most bacterial promoters are 5’-TTGACA-3’ for the -35 region (the -35 box) and 5’-TATAAT-3’ for the -10 region (the -10 box, also called the Pribnow box after its discoverer, David Pribnow). The sigma factor associates with the core enzyme to form the holoenzyme. The holoenzyme binds to these consensus sequences, strongly at the -10 region, and this is accompanied by local untwisting of about 12-14 bp of DNA around that region. RNA polymerase thus orients itself to begin transcription at +1. The polymerase pairs the base of a ribonucleoside triphosphate with the complementary base on the template DNA. No primer is required for this pairing; the next incoming nucleotide is joined to the 3’ end of the first nucleotide with the release of pyrophosphate. Since no phosphodiester bond forms at the 5’ end of the first nucleotide, it retains its three phosphate groups.
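The idea of locating a promoter by its two consensus boxes can be sketched in Python as below. This is only an illustration under stated assumptions: it demands exact matches to TTGACA and TATAAT, whereas real promoters usually deviate from the consensus at some positions, and the allowed spacer of 15-19 bp between the two boxes is an assumed parameter that is not given in this module. The test sequence is invented.

```python
MINUS_35_BOX = "TTGACA"  # -35 consensus, coding strand 5'->3'
MINUS_10_BOX = "TATAAT"  # -10 consensus (Pribnow box)


def find_promoters(coding_strand, min_spacer=15, max_spacer=19):
    """Return (pos_-35, pos_-10, spacer) for each exact consensus pair.

    Positions are 0-based indices into coding_strand. The spacer limits
    are assumptions made for this sketch.
    """
    hits = []
    start = coding_strand.find(MINUS_35_BOX)
    while start != -1:
        box35_end = start + len(MINUS_35_BOX)
        for spacer in range(min_spacer, max_spacer + 1):
            pos10 = box35_end + spacer
            if coding_strand[pos10:pos10 + len(MINUS_10_BOX)] == MINUS_10_BOX:
                hits.append((start, pos10, spacer))
        # Look for the next -35 box, allowing overlapping candidates.
        start = coding_strand.find(MINUS_35_BOX, start + 1)
    return hits


# Hypothetical sequence: -35 box, 17-bp spacer, -10 box, then the start region.
example = "GG" + MINUS_35_BOX + "A" * 17 + MINUS_10_BOX + "GCGCGCA" + "ATG"
print(find_promoters(example))
```

A more realistic scanner would score partial matches (for example with a position weight matrix) rather than require the exact consensus.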
Figure 8: Binding of RNA apparatus to DNA
(Source: http://www.quia.com/jg/1269935list.html)
7.2. Elongation
Once an RNA about 10-12 nucleotides long has been synthesized, the sigma factor leaves the transcription complex, which leads to a conformational change in RNA polymerase. The core enzyme moves along the template joining nucleotides to the RNA molecule; in the process it untwists the DNA double helix downstream and reanneals it behind the transcription bubble (Figure 9). On average, 30-50 nucleotides per second are added to the elongating RNA molecule. Topoisomerases relieve the coiling of the DNA template generated as transcription proceeds (type I topoisomerases can cleave and rejoin a DNA strand without requiring another protein or a high-energy cofactor). RNA polymerase has a proofreading ability which allows it to remove a non-complementary base and continue transcription: if the enzyme incorporates a wrong base, it moves back, cleaves it off and resumes synthesis in the forward direction.
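As a rough worked example of the elongation rate quoted above, the short sketch below estimates how long one polymerase needs to traverse a gene. The gene length of 1500 nucleotides is a hypothetical value chosen for illustration, and the calculation assumes a constant rate with no pausing.

```python
def transcription_time_seconds(gene_length_nt: int, rate_nt_per_s: float) -> float:
    """Time for one polymerase to traverse a gene at a constant rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate must be positive")
    return gene_length_nt / rate_nt_per_s


GENE_LENGTH_NT = 1500  # hypothetical gene length, not taken from the module

for rate in (30, 50):  # range of elongation rates quoted in the text
    seconds = transcription_time_seconds(GENE_LENGTH_NT, rate)
    print(f"{rate} nt/s -> {seconds:.0f} s")
```

Under these assumptions a gene of this size would be transcribed in roughly half a minute to a minute.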
7.3. Termination
The sequences which signal the termination of transcription are referred to as terminator sequences. Termination involves the detachment of the enzyme from the DNA template and the release of the newly synthesized RNA molecule (Figure 9). Bacteria have two types of terminators, distinguished by whether they require an ancillary protein called the Rho factor: Rho-independent terminators (type I terminators) and Rho-dependent terminators (type II terminators) (Table 4). A polycistronic RNA is produced when several genes are transcribed into a single RNA, i.e. a single termination event occurs at the end of the group of genes. Polycistronic RNAs are generally absent in eukaryotes, where each gene has its own initiation and termination site.
Table 4: Major differences between type I (Rho-independent) terminators and type II (Rho-dependent) terminators
Type I (Rho-independent) terminators | Type II (Rho-dependent) terminators
Termination takes place in the absence of Rho factor. | Termination takes place in the presence of Rho factor.
The terminator consists of an inverted repeat sequence. | The terminator lacks the AT string found in Rho-independent terminators.
When transcribed, the inverted repeat sequence forms a hairpin loop. | The terminator lacks the hairpin loop.
The inverted repeat is followed by a string of approximately 6 adenine nucleotides; their transcription produces a string of uracil nucleotides after the hairpin loop. | Rho has RNA-binding and ATPase domains.
Formation of the hairpin slows down the polymerase, and the adenine-uracil base pairs which follow it are relatively unstable. This destabilizes the DNA-RNA pairing and results in the release of the RNA molecule. | Rho binds to unstructured RNA (a stretch of RNA upstream of the terminator sequence which lacks any secondary structure) and moves towards the 3’ end. When Rho reaches the transcription bubble, its helicase activity unwinds the RNA-DNA hybrid and stops transcription.
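The two features of a Rho-independent terminator listed in Table 4 (an inverted repeat that can fold into a hairpin, followed by a run of uracils) can be searched for with a simple Python sketch. The stem length, loop size range and U-run length used here are assumed parameters for illustration, the stem must pair perfectly, and the test RNA is invented; real terminator prediction tools also weigh the stability of the hairpin.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem_len=6, loop_range=(3, 8), min_u_run=6):
    """Return (stem_start, hairpin_end, loop_len) for each candidate.

    A candidate is a perfect stem of stem_len nucleotides whose two arms
    are separated by a loop, immediately followed by min_u_run uracils.
    All three parameters are assumptions made for this sketch.
    """
    hits = []
    for i in range(len(rna) - stem_len + 1):
        left_arm = rna[i:i + stem_len]
        right_arm_needed = reverse_complement(left_arm)
        for loop_len in range(loop_range[0], loop_range[1] + 1):
            right_start = i + stem_len + loop_len
            hairpin_end = right_start + stem_len
            has_stem = rna[right_start:hairpin_end] == right_arm_needed
            has_u_run = rna[hairpin_end:hairpin_end + min_u_run] == "U" * min_u_run
            if has_stem and has_u_run:
                hits.append((i, hairpin_end, loop_len))
    return hits


# Hypothetical transcript: GC-rich inverted repeat, 4-nt loop, then six U's.
example_rna = "AAGC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "GA"
print(find_intrinsic_terminators(example_rna))
```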
Figure 9: Transcription steps
Source: http://www.nature.com/nrmicro/journal/v9/n5/images/nrmicro2560-f4.jpg
8. Transcription in Eukaryotes
Transcription in eukaryotes is similar to that in prokaryotes. However, it involves three RNA polymerases, each of which recognizes its own class of promoters. These promoters have two regions: 1) the core promoter and 2) the promoter-proximal region or regulatory promoter (present further upstream of the gene sequence).
The core promoter is located immediately upstream of the initiation site and covers roughly 35 to 52 base pairs. The TATA box (also known as the Goldberg-Hogness box, after its discoverers) lies 25 to 30 bp upstream of the start site and has the consensus sequence TATAAA. These elements facilitate the formation of the initiation complex and affect the rate of transcription. The regulatory promoter elements are located upstream of the core promoter, e.g. the CAAT box (5’ GGCCAATCT 3’) and the GC box (GGGCGG), centered at about -75 to -120 (Figure 10). Mutations in this region can alter the rate of transcription, indicating the role of these elements in the efficiency of the initiation complex.
Figure 10: Sequence elements of a general eukaryotic promoter/gene
Source: http://mol-biol4masters.masters.grkraj.org/html/Gene_Structure5B-Eukaryotic_Promoter_Structure_for_RNA_Polymerase_II_files/image004.jpg
8.1. Initiation
Transcription initiation requires the sequential assembly of RNA polymerase and the general transcription factors (GTFs). The GTFs are specific for each RNA polymerase and are numbered according to the polymerase with which they work. The GTFs take the place of the sigma factor of prokaryotes. Those of RNA polymerase II are TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH; the final letter designates the individual factor (Figure 11). TFIID forms the initial committed complex: it recognizes and binds to the TATA box through its TBP (TATA-binding protein) subunit. TBP binds in the minor groove of the DNA, which bends the DNA and partially unwinds the helix.
This bending helps TFIIB bind to TFIID, followed by the sequential binding of the other GTFs (TFIIA, then TFIIF together with RNA polymerase II, and finally TFIIE and TFIIH) to produce the pre-initiation complex (Table 5). TFIIH acts as a helicase (it separates the two strands of the double-stranded DNA) to form an open complex. Conformational changes within the DNA and the polymerase result in the unwinding of 10-15 bp of DNA, and the template strand is positioned at the active site, giving the open initiation complex. TFIIH also hydrolyses ATP to phosphorylate the carboxy-terminal domain (CTD) of RNA polymerase II. This phosphorylation breaks the contact between RNA polymerase II and TFIIB. As a result TFIIB, TFIIE and TFIIH dissociate from the polymerase, which is then free to proceed to elongation.
Table 5: Function of the general transcription factors
General transcription factor | Function in transcription
TFIID (composed of TATA-binding protein (TBP) and TBP-associated factors (TAFs)) | Recognizes the TATA box in the promoter region (core promoter binding factor)
TFIIB | Interacts with the TBP of TFIID and stabilizes the TBP-TATA complex; recruits the TFIIF-RNA polymerase complex
TFIIH | Helicase activity for opening of the promoter complex; initiates transcription (enzymatic activities of DNA helicase and kinase) and repairs DNA damage (by nucleotide excision repair)
TFIIA | Stabilizes TBP-DNA binding
TFIIF | Binds to RNA polymerase and prevents it from binding to nonspecific DNA sites
TFIIE | Helps in maintenance of the initiation complex and in switching to the elongation process
Figure 11: Initiation in eukaryotes
Source: http://www.mun.ca/biology/desmid/brian/BIOL2060/BIOL2060-21/21_12.jpg
8.2. Elongation
Once the initiation complex begins synthesis and leaves the promoter region, the general transcription factors are released and can be used by other RNA polymerases. The RNA transcript, about 25-30 nucleotides long at this stage, keeps elongating as new nucleotides are added at its 3’ end. During elongation about 8 RNA nucleotides remain paired with the DNA template. This DNA-RNA duplex is bent at 90° between the jaw-like extensions of the enzyme. As the complex moves forward the unwound DNA is rewound and the RNA transcript exits separately. Roger Kornberg was awarded the Nobel Prize in Chemistry in 2006 for his studies of eukaryotic transcription. His group also discovered Mediator, a complex that mediates the interaction between RNA polymerase II and the regulatory transcription factors bound to enhancers or silencers, serving as an interface between RNA polymerase II and many diverse regulatory signals.
8.3. Termination
In eukaryotes, transcription continues along the DNA template beyond the polyadenylation (poly A) signal. Transcription can continue past the poly A site for hundreds or even thousands of nucleotides. The poly A signal has the consensus sequence AAUAAA and lies near the 3’ end of the mRNA. The addition of a tail of polyadenylic acid (poly A) to the 3’ end of the mRNA is referred to as polyadenylation. Polyadenylation involves recognition of the processing signal (AAUAAA) and cleavage of the RNA to create a 3’-OH end, to which poly A polymerase adds 60-200 adenylate residues. Transcription by RNA polymerase II typically terminates about 500 to 2000 nucleotides downstream of the poly A signal. Two models have been proposed for the termination process:
(i) Allosteric model: After the poly A signal has been transcribed, the association between RNA polymerase and the DNA template is destabilized, which ultimately results in their dissociation. For poly A addition to the RNA, a number of proteins, including cleavage and polyadenylation specificity factor (CPSF) and two cleavage factor proteins (CFI and CFII), bind to and cleave the RNA. The enzyme poly A polymerase (PAP) then uses ATP as a substrate and catalyzes the addition of A nucleotides to the 3’ end of the RNA to produce the poly (A) tail. During this process PAP is bound to CPSF. As the poly (A) tail is synthesized, molecules of poly (A) binding protein II (PABII) bind to it.
(ii) Torpedo model: This model requires the Rat1 exonuclease. Cleavage of the mRNA leaves a free 5’ end on the RNA that is still trailing out of the RNA polymerase (Figure 12). Rat1 attaches to this free 5’ end and degrades the growing RNA while moving towards its 3’ end. Rat1 is a 5’-3’ exonuclease, i.e. it digests the RNA from the 5’ end towards the 3’ end. Like a torpedo it devours the growing RNA, and on reaching the RNA polymerase it disrupts the transcriptional machinery and terminates transcription.
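The cleavage and polyadenylation step that both models build on can be sketched in Python as follows. This is a simplified illustration: the cleavage position (a fixed number of nucleotides downstream of AAUAAA) and the default tail length are assumed parameters, and the transcript is invented. The downstream fragment returned by the function corresponds to the RNA with the free 5’ end that Rat1 would degrade in the torpedo model.

```python
POLY_A_SIGNAL = "AAUAAA"


def cleave_and_polyadenylate(pre_mrna, cleavage_offset=20, tail_length=200):
    """Cleave downstream of the poly A signal and add a poly(A) tail.

    Returns (polyadenylated_rna, downstream_fragment), or None if the
    signal is absent or the transcript is too short to be cleaved.
    cleavage_offset and tail_length are assumptions for this sketch.
    """
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return None
    cleavage_site = signal_start + len(POLY_A_SIGNAL) + cleavage_offset
    if cleavage_site > len(pre_mrna):
        return None
    upstream = pre_mrna[:cleavage_site]      # receives the poly(A) tail
    downstream = pre_mrna[cleavage_site:]    # still attached to the polymerase
    return upstream + "A" * tail_length, downstream


# Hypothetical transcript that extends past the poly A site.
transcript = "GGCAUG" + POLY_A_SIGNAL + "C" * 20 + "GUGUUGG"
result = cleave_and_polyadenylate(transcript, tail_length=10)
print(result)
```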
Figure 12: Termination of transcription in eukaryotes. (1. Synthesis of the poly A tail; 2. The RNA is released, which destabilizes the complex of RNA polymerase and DNA; 3. Allosteric model: owing to this destabilization, the DNA and RNA polymerase separate; 4. Torpedo model: Rat1 exonuclease binds to the free 5’ end of the growing RNA; 5. Binding leads to a torpedo-like action which degrades the RNA, leading to separation of the DNA and RNA polymerase)
(Source: https://s3.amazonaws.com/classconnection/819/flashcards/3148819/png/eukaryotic_termination-149544E302B1B1E1FA2.png)
9. Post Transcriptional Modifications
Unlike prokaryotic mRNAs (which are often polycistronic and require no post-transcriptional modification), eukaryotic mRNAs are modified at both ends. Also, not all genes are collinear with the proteins they encode (when a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein, the two are said to be collinear). In the 1970s it was discovered that the coding regions of DNA were much longer than the corresponding RNA. When DNA and RNA were hybridized, the DNA-RNA hybrid showed looped structures, whereas DNA-DNA molecules matched along their entire length. It was concluded that certain regions of the DNA are absent from the RNA. This provided evidence that eukaryotic genes consist of coding and non-coding regions. The coding sequences, i.e. exons, are interrupted by non-coding introns. The term intron refers to the intervening sequences which do not code for amino acid sequences (refer to the value addition column). Exons are the expressed sequences, which are ligated to obtain a continuous coding mRNA. The introns are removed and the exons are joined together before the mRNA leaves the nucleus. This process is known as RNA splicing (Figures 13 and 14). The pre-mRNA bears three sites required for splicing: the 5’ splice site, where the intron begins with 5’ GU; the 3’ splice site, where the intron ends with AG 3’; and a branch point located approximately 18 to 40 nucleotides upstream of the 3’ splice site.
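The removal of introns and joining of exons can be illustrated with a minimal Python sketch. It assumes that the intron coordinates are already known and only checks the GU...AG rule at the intron boundaries; it does not model the branch point or the spliceosome. The pre-mRNA sequence and coordinates are hypothetical.

```python
def splice(pre_mrna, introns):
    """Remove introns and join the exons.

    introns is a list of (start, end) index pairs (end exclusive), sorted
    and non-overlapping. Each intron must begin with GU and end with AG.
    """
    exons = []
    cursor = 0
    for start, end in introns:
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} breaks the GU...AG rule")
        exons.append(pre_mrna[cursor:start])  # exon preceding this intron
        cursor = end
    exons.append(pre_mrna[cursor:])           # final exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1, one intron (GU...AG), exon 2.
exon1 = "AUGGCC"
intron1 = "GUAAGUCUAACUUUCAG"
exon2 = "GAAUAA"
pre_mrna = exon1 + intron1 + exon2

mature_mrna = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron1))])
print(mature_mrna)
```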
Value Addition: R-looping experiments were first performed by R.J. Roberts and P.A. Sharp, who identified introns in protein-coding adenovirus genes. They were awarded the Nobel Prize in 1993 for the discovery of split genes.
Source: https://upload.wikimedia.org/wikipedia/commons/d/da/R_loop.jpg
9.1. RNA splicing
RNA splicing involves the removal of introns and the joining of exons. In the simplest case, an endonucleolytic "cut" is made at each end of an intron, the intron is removed, and RNA ligase seals the exon ends to complete each splicing event. However, the precise excision of introns is much more complex and interesting in higher eukaryotes. In 1982, Thomas Cech and his colleagues discovered, during a study of the ciliate protozoan Tetrahymena, that some introns can excise themselves without the help of protein enzymes. Such catalytic RNAs are referred to as ribozymes. Introns are grouped into the following types according to their splicing mechanism.
Figure 13: Self splicing introns
(Source: Concepts of genetics; Klug and Cummings tenth edition. Pg: 335)
1. Group I: Self-splicing introns, which are present in some rRNA genes. Self-excision involves
an interaction between a guanosine cofactor and the primary transcript (Figure 13). In the first
step, the 3’-OH group of guanosine attacks the 5’ splice site, so that the guanosine becomes
attached to the nucleotide at the 5’ end of the intron and a free 3’-OH group is left on the
left-hand exon. In the second step, this newly acquired 3’-OH group on the left-hand exon reacts
with the phosphate at the 3’ end of the intron. The intron is spliced out and the two exon regions
are ligated, leading to the mature RNA.
2. Group II: Self-splicing introns with a mechanism different from that of group I; they are present
in the protein-coding genes of mitochondria and chloroplasts. An autocatalytic reaction leads to
excision of the intron, but it does not require guanosine as a cofactor.
3. Nuclear pre-mRNA introns: Splicing takes place within a large complex known as the spliceosome,
which consists of the pre-mRNA bound to small nuclear RNAs (snRNAs) ranging from 107 to 210
nucleotides in length. The snRNAs associate with proteins to form small nuclear ribonucleoprotein
particles (snRNPs, or snurps). They are an essential component of the spliceosomal complex and
are found in the nucleus of eukaryotic cells. Being rich in uridine, the snRNAs are designated
U1, U2, ... U6. Their sequential binding results in the formation of a lariat, which contains
the removed intron (Figure 14). The splicing reactions proceed as described below:
U1 binds to the 5’ splice site.
U2 binds to the branch point.
A complex of U4, U5 and U6 joins the spliceosome and associates with U1 and U2. This causes
the intron to loop out and brings the exons closer together.
U1 and U4 snRNPs dissociate, resulting in activation of the spliceosome complex.
The active complex removes the intron (in the form of a lariat) and ligates the two exons. The
branch point bond is then broken, and the linear intron is readily digested by nuclear enzymes.
The snRNPs are released after the exons are ligated, and this process is repeated for each
intron.
4. Transfer RNA introns: Found in tRNA genes. These introns are removed by specialized enzymes
that cut and reseal the RNA.
In prokaryotic cells, transcription and translation can take place at the same time, as the two
processes are coupled. Thus, the mRNA produced has no opportunity to be modified. In eukaryotes,
however, transcription takes place in the nucleus and translation in the cytoplasm. Modifications
are made to the nascent mRNA at both the 3’ and 5’ ends to protect the coding region of the
molecule.
Figure 14: Lariat formations in Spliceosome
(Source: Concepts of genetics; Klug and Cummings tenth edition. Pg: 336)
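The removal of introns and joining of exons described in section 9.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name, the toy sequences and the use of known intron coordinates are hypothetical, and the sketch models only the outcome of splicing, not the snRNP steps or lariat formation.

```python
from typing import List, Tuple


def splice(pre_mrna: str, introns: List[Tuple[int, int]]) -> str:
    """Remove introns, given as (start, end) half-open index pairs, and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Each intron is expected to begin with GU and end with AG.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} does not follow the GU...AG rule")
        exons.append(pre_mrna[position:start])  # exon preceding this intron
        position = end
    exons.append(pre_mrna[position:])  # final exon after the last intron
    return "".join(exons)


# Toy pre-mRNA: three exons interrupted by two introns.
exon1, intron1 = "AUGGCC", "GUAAGUCUAACAG"
exon2, intron2 = "UUUGGA", "GUCCCCAG"
exon3 = "UAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

# Intron coordinates are computed from the toy pieces rather than predicted.
first_start = len(exon1)
first_end = first_start + len(intron1)
second_start = first_end + len(exon2)
second_end = second_start + len(intron2)

mature = splice(pre_mrna, [(first_start, first_end), (second_start, second_end)])
print(mature)
```

The mature sequence is simply the exons in their original order, which is why the mRNA is shorter than the gene it was transcribed from.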
9.2. Pre-mRNA Processing/ 3’ and 5’ modifications
When the nascent pre-mRNA is about 20-30 nucleotides long, a capping enzyme adds a guanine
nucleotide to the 5’ end by an unusual 5’ to 5’ linkage, as opposed to the usual 5’ to 3’ linkage.
A methyl group is then added to position 7 of the base, making it 7-methylguanine. This is referred
to as capping, and the presence of the cap helps in the removal of introns in addition to providing
stability to the mRNA. The 5’ cap is readily recognized by the ribosome, which binds to it and
initiates the translation process. Additional methyl groups may also be attached to the second and
third nucleotides, typically on the ribose.
A sequence of about 50-250 adenine nucleotides is added to the 3’ end of the mRNA, forming a
poly(A) tail. The tail is added after the pre-mRNA is cleaved and released from the polymerase; this
process is known as polyadenylation. Polyadenylation provides stability to the RNA molecule
(protecting it from exonucleases) and keeps it available for translation for a longer period. The
poly(A) signal, located 11-30 nucleotides upstream of the cleavage site, has the consensus sequence
AAUAAA.
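The two end modifications can be sketched together in Python. This is a toy model, not a description of the enzymes involved: the function name, the text label used for the cap, and the choice to measure the 11-30 nucleotide distance from the end of the AAUAAA signal are assumptions made for this example.

```python
POLY_A_SIGNAL = "AAUAAA"
CAP_LABEL = "m7G"  # text stand-in for the 7-methylguanine cap (hypothetical notation)


def process_ends(pre_mrna, cleavage_offset=20, tail_length=200):
    """Return the cap, cleaved body and poly(A) tail of a toy transcript."""
    if not 11 <= cleavage_offset <= 30:
        raise ValueError("cleavage is expected 11-30 nucleotides after the signal")
    if not 50 <= tail_length <= 250:
        raise ValueError("the poly(A) tail is expected to be about 50-250 nucleotides")
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no AAUAAA signal found")
    # Assumption: the offset is counted from the last nucleotide of the signal.
    cleavage_site = signal_start + len(POLY_A_SIGNAL) + cleavage_offset
    if cleavage_site > len(pre_mrna):
        raise ValueError("transcript ends before the cleavage site")
    return {
        "cap": CAP_LABEL,                    # added at the 5' end
        "body": pre_mrna[:cleavage_site],    # transcript up to the cleavage site
        "tail": "A" * tail_length,           # added at the new 3' end
    }


# Toy transcript: coding region, spacer, signal, then downstream sequence.
toy_transcript = "AUGGCCUGA" + "CC" + POLY_A_SIGNAL + "C" * 30
processed = process_ends(toy_transcript, cleavage_offset=20, tail_length=50)
print(processed["cap"], processed["body"], len(processed["tail"]))
```

Sequence downstream of the cleavage site is discarded in this model, which reflects why the mature mRNA ends shortly after the AAUAAA signal rather than at the point where the polymerase stops.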
10. Summary
Genetic information flows according to the central dogma of biology, i.e. from DNA to RNA (via
transcription) and from RNA to proteins (via translation). In prokaryotes both processes occur in
the cytoplasm, whereas in eukaryotes the site of transcription is the nucleus and that of
translation is the cytoplasm. Retroviruses do not strictly follow the central dogma, as their RNA
genome is copied into DNA.
Transcription is the synthesis of RNA from a DNA template. The DNA unwinds and RNA polymerase,
together with certain general transcription factors, synthesizes the RNA. The entire process is
divided into three major steps, namely initiation, elongation and termination.
Prokaryotes: A single RNA polymerase catalyzes the polymerization of ribonucleoside
5′-triphosphates (NTPs), and the chain always grows in the 5′ to 3′ direction. Specific promoters
are recognized by the σ subunit, which initiates the binding of RNA polymerase. The core
polymerase, which consists of two α, one β and one β′ subunits, is fully capable of catalyzing the
polymerization of NTPs into RNA.
The enzyme moves along the DNA to continue elongation of the growing RNA chain. The moving
polymerase maintains an unwound region of about 17 base pairs, known as the transcription bubble,
which first forms at initiation as the open promoter complex. The addition of nucleotides continues
until the polymerase encounters a termination signal.
Termination is of two types: (a) Rho-dependent: The protein factor Rho binds to the RNA chain,
moves along the strand towards the open complex and releases the RNA transcript, after which all
components dissociate; (b) Rho-independent: Transcription of a GC-rich inverted repeat results
in the formation of a segment of RNA that can form a stable stem-loop structure by
complementary base pairing. The formation of such a self-complementary structure in the RNA
disrupts its association with the DNA template and terminates transcription.
In Eukaryotes: Transcription involves three RNA polymerases, which recognize specific promoter
regions located upstream of the initiation site. The promoter region consists of -35 to -52 base
pairs followed by the TATA box.
The general transcription factors (GTFs) take the place of the sigma factor of prokaryotes. TFIID
forms the initial committed complex, which recognizes and binds to the TATA box with the help of its
TBP (TATA-binding protein). The binding of TFIID facilitates the sequential binding of the other
GTFs (TFIID followed by TFIIA, TFIIB, TFIIF accompanied by the polymerase, and finally TFIIE
and TFIIH) and RNA polymerase to produce the initiation complex.
The RNA transcript initially has a length of 25-30 nucleotides and keeps elongating as new
nucleotides are added at the 3’ end.
Polyadenylation involves the addition of a poly A tail to the 3’-OH end of the newly synthesized
mRNA, downstream of the AAUAAA signal. Transcription by RNA polymerase II typically terminates
about 500 to 2000 nucleotides downstream of the poly A signal.
The allosteric model states that after transcribing the poly A sequence, RNA polymerase becomes
less tightly associated with the DNA template, which ultimately results in their dissociation. The
torpedo model requires the Rat1 exonuclease, which degrades the RNA still attached to the
polymerase and, on reaching the RNA polymerase, disrupts the transcriptional machinery and
terminates transcription.
Eukaryotic genes consist of coding and non-coding regions. The coding sequences, i.e. exons,
are interrupted by non-coding introns. The introns are removed and the exons are joined together;
this process is known as RNA splicing.
A guanine nucleotide is added to the 5’ end of the mRNA and a methyl group is added to position 7
of the base, making it 7-methylguanine. This is referred to as capping, and the presence of the cap
helps in the removal of introns in addition to providing stability to the mRNA. The 5’ cap is
readily recognized by the ribosome, which binds to it and initiates the translation process.