ZOOLOGY Molecular Cell Biology

Principles of Gene Expression: Transcription

Development Team

Paper Coordinator : Prof. Kuldeep K. Sharma

Department of Zoology, University of Jammu

Principal Investigator : Prof. Neeta Sehgal

Department of Zoology, University of Delhi

Content Writer : Dr. Sudhida Gautam, Hansraj College, University of Delhi

Dr. Kiran Bala, Deshbandhu College, University of Delhi

Content Reviewer : Prof. Rup Lal

Department of Zoology, University of Delhi

Co-Principal Investigator : Prof. D.K. Singh

Department of Zoology, University of Delhi

Paper : 15 Molecular Cell Biology

Module : 21 Principles of gene expression: Transcription

Description of Module

Subject Name ZOOLOGY

Paper Name Molecular Cell Biology; Zool 015

Module Name/Title Principles of Gene Expression

Module Id M21; Transcription

Keywords Central dogma, Transcription, RNA polymerase, template, Pribnow box, promoter, splicing, holoenzyme, monocistronic

Contents

1. Learning Outcomes

2. Introduction

3. Transcription

3.1 Components

3.2 Types of RNA

4. Experimental Evidences

4.1 DNA acts as template for transcription

4.2 One DNA strand acts as a template

5. Transcription Unit

6. RNA Polymerase

6.1 Bacterial RNA polymerase

6.2 Eukaryotic RNA polymerase

7. Bacterial Transcription

7.1 Initiation

7.2 Elongation

7.3 Termination

8. Transcription in Eukaryotes

8.1 Initiation

8.2 Elongation

8.3 Termination

(i) Allosteric model

(ii) Torpedo model

9. Post Transcriptional Modifications

9.1 RNA splicing

9.2 Pre-mRNA Processing/ 3’ and 5’ modifications

10. Summary

1. Learning Outcomes

After studying this module, you should be able to explain:

The central dogma of molecular biology.

The basic differences between DNA and RNA.

The purpose of transcription in biological systems and its three stages (initiation, elongation and termination).

How genetic information passes from DNA to RNA before translation can begin.

The key differences between prokaryotic and eukaryotic transcription.

2. Introduction

The word transcript means a written or printed version of something. Transcription is a vital process in living organisms in which a single-stranded RNA is synthesized using DNA as a template. It is governed by a complex regulatory system involving numerous gene regulatory elements such as promoters and silencers. In 1953 Watson and Crick proposed the double helix model of DNA, and a few years later (in a paper published in 1958) Crick put forward the Central Dogma of molecular biology (Figure 1), which describes the flow of genetic information within an organism as a two-step process. In the first step the double-stranded DNA is locally unwound by enzymes so that one strand can act as a template for the synthesis of mRNA (transcription). The base sequence of the mRNA is determined by the base sequence of the template DNA. In the second step the mRNA directs the synthesis of a protein; this process is known as translation.

The growth and development of an organism depend on the properties of the various proteins present in its cells and tissues. Whenever the cell requires a particular protein, the gene encoding it is transcribed and the resulting mRNA is translated into the amino acid sequence of that protein. Until the discovery of reverse transcription in retroviruses it was believed that genetic information flows in one direction only, from DNA to RNA (transcription) and from RNA to protein (translation). This flow of genetic information is commonly known as gene expression, and the two-step process (DNA → RNA → protein) was referred to by Crick as the Central Dogma of Molecular Biology. It thus provides the basic picture of how genetic information flows within a cell. Most organisms use DNA as the genetic material and follow the central dogma. Retroviruses, however, have RNA as the genetic material; for their genes to be expressed, the information in RNA is first copied into DNA, a process known as reverse transcription. The central dogma, understood as a strictly unidirectional scheme, is therefore no longer valid in its original form.
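
The two-step flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DNA sequence is hypothetical, the codon table is a small subset of the standard genetic code covering just the codons used here, and for simplicity the mRNA is derived from the coding (non-template) strand, whose sequence it shares apart from U replacing T.

```python
# Subset of the standard genetic code: only the codons used in this example.
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "UUC": "Phe", "GGA": "Gly", "UAA": "Stop",
}

def transcribe(coding_strand):
    """Step 1 (DNA -> RNA): the mRNA reads like the coding strand with U for T."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Step 2 (RNA -> protein): read codons in order until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

coding_dna = "ATGGCTTTCGGATAA"   # hypothetical coding strand, written 5'->3'
mrna = transcribe(coding_dna)
print(mrna)                      # AUGGCUUUCGGAUAA
print(translate(mrna))           # ['Met', 'Ala', 'Phe', 'Gly']
```

A real translation routine would need the full 64-codon table; the subset here is kept small so that the two steps stay visible.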

Figure 1: The central dogma of molecular biology

Table 1 summarizes the differences between DNA and RNA. DNA is a double-stranded molecule containing the sugar deoxyribose. The sugars of RNA and DNA differ only at the second carbon (C2’): deoxyribose, as the name suggests, lacks the oxygen atom of the hydroxyl group at C2’. The absence of this 2’-OH group makes DNA chemically more stable than RNA, which contains ribose. Adenine, guanine and cytosine are present in both DNA and RNA. Thymine, however, is present in DNA but absent from RNA, where uracil takes its place (Figure 2).

Table 1: Differences between DNA and RNA

Characteristic | DNA (deoxyribonucleic acid) | RNA (ribonucleic acid)
Sugar | Deoxyribose | Ribose
Bases | A, T, G, C | A, U, G, C
Strands | Double stranded | Single stranded
Presence of 2’-OH group (Figure 2) | No | Yes
Stability | More stable | Less stable
Types | A, B and Z | rRNA, mRNA, tRNA, snRNA, miRNA, siRNA

Figure 2: Structure of pentose sugars: (A) Ribose; (B) Deoxyribose

Source: https://s3.amazonaws.com/classconnection/506/flashcards/735506/jpg/picture2-148FB8638C909AD7EE1.jpg

3. Transcription

The genetic information stored in DNA (the genotype) is used to synthesize RNA through the process of transcription. This requires a series of events involving ribonucleotides, a DNA template and a set of protein components for initiation and regulation.

3.1. Components

Transcription requires three major components:

1. DNA template

2. RNA polymerase and associated protein factors

3. Raw materials (Nucleotides)

RNA polymerase, the enzyme that catalyzes transcription, was discovered independently in 1960 by Samuel Weiss and Jerard Hurwitz. RNA polymerase uses the information in DNA to make RNA, with one of the two DNA strands serving as the template; this strand is also known as the antisense or non-coding strand. The double-stranded DNA is unwound, and the strand that RNA polymerase reads acts as the template strand (Figure 3). The other strand is referred to as the non-template, coding or sense strand. Transcription begins at specific DNA sequences called promoters and occurs in three phases: initiation, elongation and termination.

Figure 3: A transcribing unit

Source: (http://www.phschool.com/science/biology_place/biocoach/images/transcription/startrans.gif)

http://biobar.hbhcgz.cn/Article/UploadFiles/201104/20110415163953927.jpg

3.2. Types of RNA

RNA (ribonucleic acid) is a polymer of ribonucleotides linked together by 3’-5’ phosphodiester bonds. Before discussing transcription in detail, we briefly review the different types of RNA (Table 2).

Table 2: Types of RNA, classified according to function

Type of RNA | Function | Location
Ribosomal RNA (rRNA) | Structural component of the ribosomes | Cytoplasm
Messenger RNA (mRNA) | Carries the information in a gene for protein synthesis | Nucleus and cytoplasm
Transfer RNA (tRNA) | Transports amino acids to the ribosomes during protein synthesis | Cytoplasm
Small nuclear RNA (snRNA) | Modification of the RNA transcript | Nucleus
Micro RNA (miRNA) | RNA interference | Nucleus
Small interfering RNA (siRNA) | RNA interference | Nucleus

4. Experimental Evidences/Historical Studies Helpful to Study Transcription

4.1. DNA acts as the template for transcription

In 1970, Oscar Miller, Barbara Hamkalo and Charles Thomas provided evidence that DNA is used as the template for transcription. Electron microscopy of cellular contents revealed Christmas-tree-like structures: thin central fibers (the trunk of the tree) to which strings with granules were attached (the branches) (Figure 4). Adding deoxyribonucleases broke down the central fibers, indicating that the trunks were DNA molecules. Ribonucleases removed the granular strings, indicating that the branches were RNA. Each Christmas-tree-like structure was concluded to be a gene undergoing transcription. As transcription proceeds along the gene, the RNA chains grow longer, which extends the branches of the tree.

(a) Christmas tree (b) RNA Polymerase and DNA chain

Figure 4: Christmas tree like structures within the cell showing the site of transcription

Source: (a) http://www.nature.com/scitable/content/Under-the-electron-microscope-DNA-molecules-undergoing-29586

(b) http://mol-biol4masters.masters.grkraj.org/html/Gene_Expression_II3-RNAP_I_Promoter.htm

The DNA molecule unwinds to act as a template, i.e. the double helix opens up locally and only one strand serves as the template for transcription. This is the strand that RNA polymerase reads; it is complementary to the coding strand, which carries the gene sequence. Because the RNA is built on this template, the new strand has the same base sequence as the coding strand of the DNA, with U in place of T. For a given gene, only one of the two DNA strands acts as the template.

4.2. One strand of DNA acts as a template for transcription

In 1963, Julius Marmur and his colleagues (Figure 5) showed that only one strand acts as the template. They used the DNA of bacteriophage SP8, which infects the bacterium Bacillus subtilis. The two strands of this phage's double-stranded DNA have different densities, which permits their separation by equilibrium density gradient centrifugation into "heavy" and "light" strands.

Marmur and his colleagues grew B. subtilis in a medium containing a radioactively labeled precursor of RNA. The bacteria were then infected with SP8, so that the phage DNA was injected into the bacterial cells. Transcription within the infected cells produced radioactively labeled RNA complementary to the phage DNA, and this newly synthesized RNA was isolated from the cells. Separately, DNA was isolated from a fresh culture of SP8 and its two strands were separated into heavy and light strands.

The radioactively labeled RNA obtained from the infected bacterial cells was hybridized with the heavy and the light strands separately. Hybridization was observed only with the heavy strand; the light strand showed none. Marmur and his colleagues thus demonstrated that RNA is transcribed from only one of the DNA strands in SP8, the heavy strand acting as the template in this case. The newly synthesized RNA is complementary and antiparallel to the template strand, and has the same polarity and base sequence as the non-template strand, except that T in DNA is replaced by U in RNA.
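
The relationship between the template strand, the non-template strand and the RNA can be checked with a short Python sketch. The template sequence below is hypothetical and the function names are illustrative only; the point is that reading the template 3’ to 5’ and pairing each base gives an RNA identical to the non-template strand except for U in place of T.

```python
# Base-pairing rules used when RNA is built on a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe_template(template_5to3):
    """Return the RNA (5'->3') made from a template strand written 5'->3'.

    The polymerase reads the template 3'->5', so the string is walked in
    reverse; this is what makes the RNA antiparallel to the template.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3.upper()))

def non_template_strand(template_5to3):
    """Return the complementary (coding) DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(template_5to3.upper()))

template = "TTATCCGAAAGCCAT"          # hypothetical template strand, 5'->3'
rna = transcribe_template(template)
coding = non_template_strand(template)

print(rna)                             # AUGGCUUUCGGAUAA
print(coding)                          # ATGGCTTTCGGATAA
print(rna == coding.replace("T", "U")) # True
```

In terms of Marmur's experiment, `template` plays the role of the heavy strand: only an RNA produced from it would be complementary to it and hybridize with it.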

Figure 5: Marmur’s Experiment to prove only one strand of DNA acts as template during transcription

5. Transcription Unit

A transcription unit (Figure 6) is a stretch of DNA that encodes an RNA molecule together with the sequences needed for its transcription: a promoter situated upstream of the start site and a terminator located downstream of the structural genes. Its three components are:

1. Promoter: DNA sequences (described by consensus sequences) located towards the 5’ side of the gene, upstream of the RNA-coding region and the transcription start site, at which transcription is initiated. The promoter facilitates the binding of the transcription apparatus to the DNA template and ensures that the initiation of each RNA occurs at the same point. Two promoter elements have been identified in bacteria: (a) the Pribnow box, located about 10 nucleotides upstream of the transcription start site (sequence TATAAT; rich in adenine and thymine; the bacterial counterpart of the eukaryotic TATA box), and (b) the sequence TTGACA, located about 35 nucleotides upstream. The specific sequence of the promoter determines how strongly RNA polymerase binds to the transcription unit.

2. RNA-coding sequence: The sequence of the DNA template that is copied into the RNA molecule.

3. Terminator: The sequence of nucleotides that signals the termination of transcription. It is usually part of the RNA-coding region, so transcription stops only after the terminator has been transcribed.
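
As a rough illustration of how the two bacterial promoter elements can be located, the sketch below slides each consensus hexamer along a sequence and reports the best-matching window relative to the start site. Everything in it is an assumption made for the example: the promoter sequence is invented, the 17 bp spacer between the two boxes is a typical value not stated in this module, and simple match counting stands in for the weighted scoring that real promoter-prediction tools use.

```python
def count_matches(window, consensus):
    """Number of positions at which a window agrees with the consensus."""
    return sum(1 for a, b in zip(window, consensus) if a == b)

def best_match(seq, consensus):
    """Return (start_index, score) of the window most similar to the consensus."""
    k = len(consensus)
    starts = range(len(seq) - k + 1)
    best = max(starts, key=lambda i: count_matches(seq[i:i + k], consensus))
    return best, count_matches(seq[best:best + k], consensus)

# Hypothetical promoter (coding strand, 5'->3'), assembled from labelled pieces.
upstream = "GGC"
box_35 = "TTGACA"
spacer = "GCGCGGCCGCGGCGCGC"    # 17 bp spacer: an assumed, typical length
box_10 = "TATAAT"
gap = "GCGGCC"
transcribed = "AGGCGC"           # the first base of this piece is +1
promoter = upstream + box_35 + spacer + box_10 + gap + transcribed
tss_index = len(promoter) - len(transcribed)

for name, consensus in (("-35 box", "TTGACA"), ("-10 box", "TATAAT")):
    start, score = best_match(promoter, consensus)
    print(name, "starts at position", start - tss_index,
          "with", score, "of", len(consensus), "matches")
```

With this layout the -35 hexamer begins at position -35 and the -10 hexamer at position -12 (so it spans -12 to -7). A promoter with fewer matches to the consensus would score lower, which mirrors the statement that the promoter sequence determines binding strength.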

Figure 6: A transcription unit

Source: http://bioap.wikispaces.com/Ch+17+Collaboration

6. RNA Polymerase

The enzyme RNA polymerase catalyzes the process of transcription.

6.1. Bacterial RNA polymerase

A single large, multimeric RNA polymerase catalyzes transcription in bacteria (Figure 8). It consists of a core enzyme made of five polypeptides (two α, one β, one β’ and one ω) bound to another polypeptide called the sigma factor (σ). The sigma factor recognizes the -35 and -10 regions of the promoter and ensures that RNA polymerase binds stably to the DNA there. The core enzyme cannot initiate transcription on its own because it lacks the ability to bind specifically at the promoter; when the sigma factor binds to it, it is converted into the holoenzyme, which is ready to bind the promoter. The holoenzyme (the core enzyme and the sigma factor together as a single unit) is the functional form of the enzyme for initiation. It first binds to the promoter while the DNA is still double helical, forming the closed promoter complex. It then melts a short stretch of the helix, about 12 bp long, converting the closed complex into the open promoter complex. Incoming nucleotides complementary to the DNA template strand bind to the enzyme, which catalyzes the formation of phosphodiester bonds between them. The resulting complex of polymerase, DNA and RNA is known as a ternary complex. The enzyme can then move forward to a new position on the template. The sigma factor can leave the complex once the RNA chain is about 8-9 nucleotides long; the released sigma factor can be used by another RNA polymerase to initiate transcription of a different gene. Further elongation of the RNA chain proceeds without the sigma factor. The departure of sigma is associated with conformational changes in the enzyme that make it highly processive. The enzyme continues to catalyze RNA elongation until it encounters a terminator sequence. Note that the core enzyme binds DNA with the same affinity at any position; these binding sites are known as loose binding sites. The holoenzyme, by contrast, binds to promoters very tightly, with an association constant on average about 1000 times that of the core enzyme and a half-life of several hours.

6.2. Eukaryotic RNA polymerase

Eukaryotic RNA polymerases are large and complex enzymes; the yeast enzyme, for example, consists of two large subunits and ten small subunits (Figure 7). Three different RNA polymerases are present in all eukaryotes, each responsible for transcribing a different class of RNA (Table 3). RNA polymerase I transcribes the 28S, 18S and 5.8S rRNAs and is located in the nucleolus. RNA polymerase II transcribes mRNA and snRNA and is present in the nucleoplasm. RNA polymerase III transcribes tRNA, 5S rRNA, some snRNA and a few miRNAs and is also present in the nucleoplasm. A fourth enzyme, RNA polymerase IV, is present in the nucleus of plants and transcribes some siRNAs. Compared with the prokaryotic RNA polymerase, these enzymes are larger multimeric complexes, and several genes encode their subunits.

Table 3: Types of eukaryotic RNA polymerase

RNA polymerase | Function | Location
RNA polymerase I | Transcribes 28S, 18S and 5.8S rRNA molecules | Nucleolus
RNA polymerase II | Transcribes mRNA and snRNA | Nucleoplasm of the nucleus
RNA polymerase III | Transcribes tRNA, 5S rRNA, some snRNA and a few miRNA molecules | Nucleoplasm
RNA polymerase IV | Transcribes some siRNA in plants | Nucleus in plants

Figure 7: Comparison of structural composition of prokaryotic and eukaryotic RNA polymerase

(Source: http://www.cbs.dtu.dk/dtucourse/cookbooks/dave/Fig13_35.JPG)

7. Bacterial Transcription

The basic transcription unit and apparatus have already been discussed; we now study in detail how the process is carried out. Transcription can be divided into three steps, namely initiation, elongation and termination.

7.1. Initiation

Initiation consists of the binding of RNA polymerase to the DNA helix so that RNA synthesis can begin. The transcription apparatus recognizes and binds to the promoter, which is identified by the sigma factor. The DNA of the closed promoter complex melts to give the open promoter complex. The template strand is identified and nucleotides are added accordingly. The rate of transcription varies between genes, depending on the affinity of RNA polymerase for each promoter. Two DNA sequences that play a critical role in the initiation of transcription are found upstream in most E. coli promoters, at -35 (for initial recognition) and -10 (for the melting reaction that converts the closed promoter complex into an open promoter complex) (Figure 6 and Figure 8). A short stretch of nucleotides shared, with small variations, by most promoters is referred to as a consensus sequence. The two consensus sequences of most bacterial promoters are 5’-TTGACA-3’ in the -35 region (the -35 box) and 5’-TATAAT-3’ in the -10 region (the -10 box, also called the Pribnow box after its discoverer, David Pribnow). The sigma factor associates with the core enzyme to form the holoenzyme. The holoenzyme binds to these consensus sequences, and strongly to the promoter at the -10 region, accompanied by local untwisting of about 12-14 bp of DNA around that region. RNA polymerase is thereby oriented to begin transcription at +1. The polymerase pairs the base of a ribonucleoside triphosphate with the complementary base on the template DNA. No primer is required for this pairing; the next incoming nucleotide is joined to the 3’ end of the first nucleotide with the release of pyrophosphate. Since no phosphodiester bond forms at its 5’ end, the first nucleotide retains all three of its phosphate groups.

Figure 8: Binding of RNA apparatus to DNA

(Source: http://www.quia.com/jg/1269935list.html)

7.2. Elongation

Once an RNA about 10-12 nucleotides long has been synthesized, the sigma factor leaves the transcription complex, which leads to a conformational change in RNA polymerase. The core enzyme moves along the template, joining nucleotides to the growing RNA molecule; in the process it unwinds the DNA double helix ahead of it (downstream) and lets it reanneal behind it (Figure 9). On average, 30-50 nucleotides per second are added to the elongating RNA molecule. Topoisomerases relieve the coiling of the DNA template that builds up as transcription proceeds (these enzymes cleave and rejoin DNA strands, and some of them do so without requiring a high-energy cofactor). RNA polymerase has a proofreading ability that helps it remove a non-complementary base and continue transcription: if the enzyme incorporates a wrong nucleotide, it backs up, cleaves it off and resumes synthesis in the forward direction.
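
The elongation rate quoted above gives a quick estimate of how long a gene takes to transcribe. The sketch below is a simple worked calculation; the gene length of 1200 nucleotides is a hypothetical value chosen for the example, and pauses, initiation and termination are ignored.

```python
def transcription_time_seconds(length_nt, rate_nt_per_s):
    """Time needed to elongate an RNA of the given length at a constant rate."""
    return length_nt / rate_nt_per_s

gene_length = 1200                      # hypothetical gene length in nucleotides
slowest = transcription_time_seconds(gene_length, 30)
fastest = transcription_time_seconds(gene_length, 50)
print(f"{fastest:.0f}-{slowest:.0f} seconds")   # 24-40 seconds
```

At 30-50 nucleotides per second, a transcript of this size would therefore be completed in well under a minute.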

7.3. Termination

The sequences that signal the end of transcription are referred to as terminator sequences. Termination involves the detachment of the enzyme from the DNA template and the release of the newly synthesized RNA molecule (Figure 9). Bacteria have two types of terminators, distinguished by whether or not they require an ancillary protein called the Rho factor: Rho-independent terminators (type I terminators) and Rho-dependent terminators (type II terminators) (Table 4). A polycistronic RNA is produced when several genes are transcribed into a single RNA, i.e. a single termination event occurs at the end of the group of genes. Polycistronic RNAs are generally absent in eukaryotes, where each gene has its own initiation and termination sites.

Table 4: Major differences between type I (Rho-independent) terminators and type II (Rho-dependent) terminators

Type I (Rho-independent) terminators | Type II (Rho-dependent) terminators
Termination takes place in the absence of Rho factor. | Termination takes place in the presence of Rho factor.
Terminator consists of an inverted repeat sequence. | Terminator lacks the AT string found in Rho-independent terminators.
When transcribed, the inverted repeat sequence forms a hairpin loop. | Terminator lacks the hairpin loop.
The inverted repeat is followed by a string of approximately 6 adenine nucleotides in the template; their transcription produces a string of uracil nucleotides after the hairpin loop. | Rho has RNA-binding and ATPase domains.
Formation of the hairpin slows down the polymerase, and the adenine-uracil base pairs that follow it are relatively unstable. This destabilizes the DNA-RNA pairing and results in the release of the RNA molecule. | Rho binds to unstructured RNA (a stretch of RNA upstream of the terminator sequence that lacks secondary structure) and moves towards the 3’ end. When Rho reaches the transcription bubble, its helicase activity unwinds the RNA-DNA hybrid and stops transcription.
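
The signature of a type I (Rho-independent) terminator, an inverted repeat that can fold into a hairpin followed by a run of uracils, can be searched for in a transcript with a short Python sketch. The transcript, the stem length of 6, the loop range of 3 to 8 nucleotides and the function name are all assumptions chosen for illustration; only Watson-Crick pairs are considered, and real terminator-prediction programs also evaluate hairpin stability.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Sequence that would base-pair with `rna` in a hairpin stem."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))

def find_intrinsic_terminator(rna, stem=6, loop_range=(3, 8), u_run=6):
    """Return (stem_start, loop_length) of the first hairpin that is
    immediately followed by a run of U residues, or None if there is none."""
    for start in range(len(rna) - 2 * stem - loop_range[0] - u_run + 1):
        left = rna[start:start + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem + loop
            right = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + u_run]
            if right == reverse_complement(left) and tail == "U" * u_run:
                return start, loop
    return None

# Hypothetical transcript: leader, stem, loop, stem, U run, trailing bases.
transcript = "AUGGCA" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "GA"
print(find_intrinsic_terminator(transcript))   # (6, 4)
```

The result says that a stem starts at index 6 and encloses a 4-nucleotide loop. A Rho-dependent terminator has no such hairpin and U run, so this search would return None for it.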

Figure 9: Transcription steps

Source: http://www.nature.com/nrmicro/journal/v9/n5/images/nrmicro2560-f4.jpg

8. Transcription in Eukaryotes

Transcription in eukaryotes is broadly similar to that in prokaryotes. However, it involves three RNA polymerases, each of which recognizes its own specific promoters. A eukaryotic promoter has two regions: 1) the core promoter and 2) the promoter-proximal or regulatory promoter region (present upstream of the core promoter).

The core promoter is located immediately upstream of the initiation site and extends to about -35 to -52 bp. The TATA box (also known as the Goldberg-Hogness box, after its discoverers) lies 25 to 30 bp upstream of the start site and has the consensus sequence TATAAA. The core promoter facilitates the formation of the initiation complex and affects the rate of transcription. The regulatory promoter elements are located upstream of the core promoter, e.g. the CAAT box (5' GGCCAATCT 3') and the GC box (GGGCGG), centered at about -75 to -120 (Figure 10). Mutations in this region can alter the rate of transcription, indicating their role in the efficiency of the initiation complex.

Figure 10: Sequence elements of a general eukaryotic promoter/gene

Source: http://mol-biol4masters.masters.grkraj.org/html/Gene_Structure5B-Eukaryotic_Promoter_Structure_for_RNA_Polymerase_II_files/image004.jpg

8.1. Initiation

Transcription initiation requires the sequential assembly of RNA polymerase and the general transcription factors (GTFs). The GTFs are specific for each RNA polymerase and are numbered according to the polymerase with which they work; functionally they take the place of the prokaryotic sigma factor. The GTFs for RNA polymerase II are designated TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH, the final letter identifying the individual factor (Figure 11). TFIID forms the initial committed complex: it recognizes and binds the TATA box through its TBP (TATA-binding protein) subunit. TBP binds in the minor groove of the DNA, which bends the DNA and partially unwinds the helix.

The bending caused by TFIID helps TFIIB to bind to TFIID. This is followed by the sequential binding of the other GTFs (TFIIA, then TFIIF together with RNA polymerase, and finally TFIIE and TFIIH) to produce the pre-initiation complex (Table 5). TFIIH acts as a helicase (it breaks the hydrogen bonds between the two DNA strands) to form an open complex. Conformational changes in the DNA and the polymerase result in the unwinding of 10-15 bp of DNA, and the template strand is positioned in the active site, giving the open initiation complex. TFIIH also uses ATP to phosphorylate the carboxy-terminal domain (CTD) of RNA polymerase II. This phosphorylation breaks the contact between RNA polymerase II and TFIIB. As a result TFIIB, TFIIE and TFIIH dissociate from RNA polymerase, which is then free to proceed with elongation.

Table 5: Functions of the general transcription factors

General transcription factor | Function in transcription
TFIID (composed of TATA-binding protein (TBP) and TBP-associated factors (TAFs)) | Recognizes the TATA box in the promoter region (core promoter binding factor)
TFIIB | Interacts with the TBP of TFIID and stabilizes the TBP-TATA complex; recruits the TFIIF-RNA polymerase complex
TFIIH | Helicase activity for opening of the promoter complex; initiates transcription (enzymatic activities of DNA helicase and ATP-dependent kinase); also takes part in DNA repair (nucleotide excision repair)
TFIIA | Stabilizes TBP-DNA binding
TFIIF | Binds to RNA polymerase and prevents it from binding to nonspecific DNA sites
TFIIE | Helps in maintenance of the initiation complex and in switching to the elongation process

Figure 11: Initiation in eukaryotes

Source: http://www.mun.ca/biology/desmid/brian/BIOL2060/BIOL2060-21/21_12.jpg

8.2. Elongation

Once the initiation complex begins synthesis from the promoter, the general transcription factors are released and can be used by other RNA polymerases. By this stage the RNA transcript is about 25-30 nucleotides long, and it keeps elongating as new nucleotides are added at its 3’ end. During elongation about 8 RNA nucleotides remain paired with the DNA template. This DNA-RNA duplex is bent at 90° between the jaw-like extensions of the enzyme. As the complex moves forward, the unwound DNA is rewound behind it and the RNA transcript exits separately. Roger Kornberg was awarded the Nobel Prize in Chemistry in 2006 for his studies of eukaryotic transcription. He also discovered Mediator, a complex that mediates the interaction between RNA polymerase II and the regulatory transcription factors that bind to enhancers or silencers, serving as an interface between RNA polymerase II and many diverse regulatory signals.

8.3. Termination

In eukaryotes, transcription by RNA polymerase II continues along the DNA template past the polyadenylation (poly A) signal, in some cases by 100 or even 1000 nucleotides. The poly A signal has the consensus sequence AAUAAA and lies near the 3’ end of the mRNA. The addition of a tail of polyadenylic acid (poly A) to the 3' end of the mRNA is referred to as polyadenylation. It involves recognition of the processing signal (AAUAAA) and cleavage of the RNA downstream of it to create a 3'-OH end, to which poly A polymerase adds 60-200 adenylate residues. Transcription by RNA polymerase II typically terminates about 500 to 2000 nucleotides downstream of the poly A signal. Two models have been proposed for the termination process:

(i) Allosteric model: After the poly A signal has been transcribed, the association between RNA polymerase and the DNA template is destabilized, which ultimately results in their dissociation. For poly A addition, a number of proteins, including the cleavage and polyadenylation specificity factor (CPSF), the cleavage stimulation factor (CstF) and two cleavage factor proteins (CFI and CFII), bind to the RNA and cleave it. The enzyme poly A polymerase (PAP) then uses ATP as a substrate and catalyzes the addition of A nucleotides to the 3’ end of the RNA to produce the poly (A) tail. During this process PAP is bound to CPSF. As the poly (A) tail is synthesized, molecules of poly (A) binding protein II (PABII) bind to it.

(ii) Torpedo model: This model requires the Rat1 exonuclease. Cleavage of the mRNA at the poly A site leaves a free 5’ end on the RNA that is still trailing out of RNA polymerase (Figure 12). Rat1 attaches to this free 5’ end and degrades the RNA while moving towards its 3’ end. Rat1 is a 5’-3’ exonuclease, i.e. it digests RNA from the 5’ end towards the 3’ end. Like a torpedo, it chews up the RNA still being made and, on reaching RNA polymerase, disrupts the transcriptional machinery and terminates transcription.
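
The cleavage and polyadenylation step that both models rely on can be sketched as a simple string operation. The pre-mRNA below is hypothetical, the function name is illustrative, the cleavage position of 15 nucleotides downstream of AAUAAA is an assumed value, and the tail length of 100 is simply one value within the 60-200 range given above.

```python
POLY_A_SIGNAL = "AAUAAA"

def polyadenylate(pre_mrna, cleavage_offset=15, tail_length=100):
    """Cleave downstream of the first AAUAAA signal and add a poly(A) tail.

    Returns None if the transcript carries no poly(A) signal.
    """
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return None
    cleavage_site = signal_start + len(POLY_A_SIGNAL) + cleavage_offset
    # Everything 3' of the cleavage site is discarded; in the torpedo model
    # this downstream fragment is what Rat1 degrades from its free 5' end.
    return pre_mrna[:cleavage_site] + "A" * tail_length

# Hypothetical pre-mRNA: body, signal, 15 nt up to the cleavage site, read-through.
pre_mrna = "GGCAUCG" + "AAUAAA" + "GCUAGCUAGCUAGCU" + "GUGUUGUCCAUG"
mature = polyadenylate(pre_mrna)
print(len(mature))          # 128
print(mature[:28])          # GGCAUCGAAUAAAGCUAGCUAGCUAGCU
```

The 12 read-through nucleotides after the cleavage site do not appear in the mature RNA, which reflects the fact that transcription continues past the poly A site before termination.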

Figure 12: Termination of transcription in eukaryotes. (1) Synthesis of the poly A tail; (2) RNA is released, which destabilizes the complex of RNA polymerase and DNA; (3) allosteric model: owing to this destabilization, DNA and RNA polymerase separate; (4) torpedo model: Rat1 exonuclease binds to the RNA still being made; (5) binding leads to a torpedo-like action that rapidly degrades the RNA, leading to separation of DNA and RNA polymerase.

(Source: https://s3.amazonaws.com/classconnection/819/flashcards/3148819/png/eukaryotic_termination-149544E302B1B1E1FA2.png)

9. Post Transcriptional Modifications

Unlike prokaryotic mRNA (which is polycistronic and requires no post-transcriptional modification), eukaryotic mRNA is modified at both ends. Also, not all genes are collinear with the proteins they encode (when a continuous sequence of nucleotides in DNA encodes a continuous sequence of amino acids in a protein, the two are said to be collinear). In the 1970s it was discovered that some regions of DNA were much longer than the corresponding RNA. When DNA and RNA were hybridized, the DNA-RNA hybrid showed looped structures, whereas DNA-DNA molecules matched along their entire length. It was concluded that certain regions of DNA are absent from the RNA. This provided evidence that eukaryotic genes consist of coding and non-coding regions. The coding sequences, i.e. exons, are interrupted by non-coding introns. The term intron refers to the intervening sequences which do not code for amino acid sequences (refer to the Value Addition column). Exons are the expressed sequences which are ligated to obtain a continuous coding mRNA. The introns are removed and the exons are joined together before the mRNA leaves the nucleus. This process of removing the introns and joining the exons is known as RNA splicing (Figures 13 and 14). The pre-mRNA bears three sites required for splicing: a 5’ consensus/splice site, where the intron begins with 5’ GU; a 3’ splice site, where the intron ends with AG 3’; and a branch point located approximately 18 to 40 nucleotides upstream of the 3’ splice site.
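The GU...AG rule described above can be illustrated with a short Python sketch. It assumes that the intron coordinates are already known and are supplied as hypothetical (start, end) index pairs; the function name `splice` and the example sequence are invented for illustration, and the branch point is not modelled.

```python
def splice(pre_mrna, introns):
    """Remove introns from a pre-mRNA string and join the exons.

    pre_mrna : RNA sequence written 5' to 3' (letters A, U, G, C)
    introns  : list of (start, end) pairs, 0-based, end exclusive
               (hypothetical coordinates supplied by the caller)

    Each intron must begin with GU and end with AG, as stated in the text.
    """
    exons = []
    cursor = 0  # first position not yet copied into the mature mRNA
    for start, end in sorted(introns):
        if start < cursor or end > len(pre_mrna):
            raise ValueError("introns overlap or lie outside the transcript")
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} does not follow the GU...AG rule")
        exons.append(pre_mrna[cursor:start])  # keep the exon before this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # last exon
    return "".join(exons)


if __name__ == "__main__":
    # Invented example: three exons separated by two introns.
    pre = "AUGGCC" + "GUAAGUCUAACAG" + "UUUGGA" + "GUCCCAG" + "UAA"
    print(splice(pre, [(6, 19), (25, 32)]))
```

The check on the first and last two nucleotides is deliberately strict; it is a simplification chosen for the sketch, since it treats the consensus as an absolute rule.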

Value Addition: R-looping experiments were first performed by R.J. Roberts and P.A. Sharp, who identified introns in a protein-coding adenovirus gene. They were awarded the Nobel Prize in 1993 for discovering introns.

Source: https://upload.wikimedia.org/wikipedia/commons/d/da/R_loop.jpg

9.1. RNA splicing

RNA splicing involves the removal of introns and the joining of exons. An endonucleolytic “cut” is made at each end of an intron, the intron is removed, and the exon ends are rejoined. RNA ligase seals the exon ends to complete each splicing event. However, the precise excision of introns is much more complex and interesting in higher eukaryotes. In some cases the intron RNA itself catalyzes its own excision; such catalytic RNAs are referred to as ribozymes. They were discovered by Thomas Cech and his colleagues in 1982 during a study of the ciliate protozoan Tetrahymena. Introns are grouped as follows according to their splicing mechanism.

Figure 13: Self splicing introns

(Source: Concepts of genetics; Klug and Cummings tenth edition. Pg: 335)

1. Group I: Self-splicing introns which are present in some rRNA genes. The self-excision involves an interaction between a guanosine cofactor and the primary transcript (Figure 13). In the first step, the 3’-OH group of guanosine is transferred to the nucleotide adjacent to the 5’ end of the intron, which leaves a free 3’-OH group on the left-hand exon. In the second step, this newly acquired 3’-OH group on the left-hand exon reacts with the phosphate at the 3’ end of the intron. The intron is spliced out and the two exon regions are ligated, leading to the mature RNA.

2. Group II: Self-splicing introns with a mechanism different from that of group I. They are present in the protein-coding genes of mitochondria and chloroplasts. An autocatalytic reaction leads to excision of the intron, but it does not require guanosine as a cofactor.

3. Nuclear pre-mRNA introns: Splicing takes place within a large complex known as the spliceosome, which consists of the pre-mRNA bound to snRNAs (small nuclear RNAs). The snRNAs range from 107 to 210 nucleotides in length and associate with proteins to form snRNPs (small nuclear ribonucleoprotein particles, or snurps). The snRNPs are an essential component of the spliceosomal complex, which acts on the transcripts of protein-coding genes in eukaryotic cells. Being rich in uridine, the snRNAs are designated U1, U2, ... U6. Their sequential binding results in the formation of a lariat, which contains the removed intron (Figure 14). The splicing reactions proceed as described below:

U1 binds to the 5’ splice site.

U2 binds to the branch point.

A complex of U4, U5 and U6 joins the spliceosome and associates with U1 and U2. This causes the intron to loop and brings the exons closer together.

The U1 and U4 snRNPs dissociate, resulting in activation of the spliceosome complex.

The active complex removes the intron (in the form of a lariat) and ligates the two exons. The bond at the branch point is then broken and the linear intron is readily digested by nuclear enzymes.

The snRNPs are released after the exons are ligated, and this process is repeated for each intron.

4. Transfer RNA introns: These are found in tRNA genes and are removed by specialized enzymes that cut and reseal the RNA.

In prokaryotic cells, transcription and translation are coupled and can take place at the same time, so the mRNA produced has no opportunity to be modified. In eukaryotes, however, transcription takes place in the nucleus and translation in the cytoplasm. The nascent eukaryotic mRNA is therefore modified at both the 3’ and 5’ ends to protect the coding region of the molecule.

Figure 14: Lariat formation in the spliceosome

(Source: Concepts of genetics; Klug and Cummings tenth edition. Pg: 336)

9.2. Pre-mRNA Processing/ 3’ and 5’ modifications

When the nascent mRNA is around 20-30 nucleotides long, a capping enzyme adds a methylated guanine nucleotide to the 5’ end through an unusual 5’ to 5’ linkage, as opposed to the usual 5’ to 3’ linkage. The methyl group is added to position 7 of the base, making the base 7-methylguanine. This is referred to as capping, and the presence of the cap helps in the removal of introns in addition to providing stability to the mRNA. The 5’ cap is easily recognized by the ribosome, which binds to it and initiates the translation process. In some cases, additional methyl groups may be attached to the sugars of the second and third nucleotides.

A sequence of about 50-250 adenines is added to the 3’ end of the mRNA, forming a poly (A) tail. The adenines are added after the mRNA is released from the polymerase, and the process is known as polyadenylation. Polyadenylation provides stability to the RNA molecule (it protects the RNA from exonucleases) and so gives it a longer time period in which to be available for translation. The poly A signal, located 11-30 nucleotides upstream of the cleavage site, has the consensus sequence AAUAAA.
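The two end modifications can be summarized in a small Python sketch. It is a simplification under stated assumptions: the function names `add_cap` and `polyadenylate`, the symbolic cap label and the default values are hypothetical, and the distance to the cleavage site is measured here from the 3’ end of the AAUAAA signal.

```python
POLY_A_SIGNAL = "AAUAAA"          # consensus sequence given in the text
CAP_LABEL = "m7G(5')ppp(5')"      # symbolic label for the 5' cap, not a sequence


def add_cap(mrna):
    """Prefix the transcript with a symbolic 7-methylguanine cap label."""
    return CAP_LABEL + mrna


def polyadenylate(pre_mrna, gap=20, tail_length=200):
    """Cleave downstream of the first AAUAAA signal and add a poly(A) tail.

    gap         : nucleotides between the signal and the cleavage site
                  (the text gives 11-30; the default is a hypothetical choice)
    tail_length : number of adenines added (the text gives about 50-250)
    """
    if not 11 <= gap <= 30:
        raise ValueError("gap must lie between 11 and 30 nucleotides")
    if not 50 <= tail_length <= 250:
        raise ValueError("tail_length must lie between 50 and 250")

    signal_start = pre_mrna.find(POLY_A_SIGNAL)  # first occurrence only
    if signal_start == -1:
        raise ValueError("no AAUAAA signal found")

    cleavage = signal_start + len(POLY_A_SIGNAL) + gap
    if cleavage > len(pre_mrna):
        raise ValueError("transcript ends before the cleavage site")

    # Everything downstream of the cleavage site is discarded.
    return pre_mrna[:cleavage] + "A" * tail_length


if __name__ == "__main__":
    pre = "GGCC" + "AAUAAA" + "C" * 40   # invented transcript
    print(add_cap(polyadenylate(pre, gap=15, tail_length=50)))
```

Using the first AAUAAA found is a design choice made for brevity; it does not attempt to represent how the cell selects among several possible signals.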

10. Summary

The genetic information is passed on through generations according to the central dogma of biology, i.e. from DNA to RNA (via transcription) and from RNA to proteins (via translation). In prokaryotes both processes occur in the cytoplasm, whereas in eukaryotes the site of transcription is the nucleus and that of translation is the cytoplasm. Retroviruses do not strictly follow the central dogma of biology, since they synthesize DNA from RNA by reverse transcription.

Transcription is the synthesis of RNA from a DNA template. The DNA unwinds and RNA polymerase, along with certain general transcription factors, synthesizes RNA. The entire process is divided into three major steps, namely initiation, elongation and termination.

Prokaryotes: A single RNA polymerase catalyzes the polymerization of ribonucleoside 5′-triphosphates (NTPs), and the chain always grows in the 5′ to 3′ direction. Specific promoters are recognized by the σ subunit, which initiates the binding of RNA polymerase. The core polymerase, which consists of two α, one β and one β′ subunits, is fully capable of catalyzing the polymerization of NTPs into RNA.

The enzyme moves along the DNA to continue elongation of the growing RNA chain. The moving polymerase maintains an unwound region of about 17 base pairs, and the entire transcription bubble is referred to as the open promoter complex. The addition of nucleotides continues until the polymerase encounters a termination signal.

Termination is of two types: (a) Rho-dependent: The protein factor Rho binds to the RNA chain and moves along the strand towards the open complex, where it dislodges the RNA transcript and all components dissociate; (b) Rho-independent: Transcription of a GC-rich inverted repeat results in the formation of a segment of RNA that can form a stable stem-loop structure by complementary base pairing. The formation of such a self-complementary structure in the RNA disrupts its association with the DNA template and terminates transcription.
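A GC-rich inverted repeat of the kind mentioned for Rho-independent termination can be searched for with a simple Python sketch. The function name `find_stem_loop` and the thresholds used (a stem of 6 nucleotides, a loop of 3 to 8 nucleotides, at least 60% GC in the stem) are hypothetical values chosen only for illustration, and only perfect Watson-Crick pairing is considered.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, U, G, C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_stem_loop(rna, stem=6, min_loop=3, max_loop=8, min_gc=0.6):
    """Find the first perfect, GC-rich inverted repeat in an RNA string.

    Returns (start_of_left_arm, loop_length), or None if nothing is found.
    All thresholds are hypothetical illustration values.
    """
    last_start = len(rna) - 2 * stem - min_loop
    for i in range(last_start + 1):
        left = rna[i:i + stem]
        gc_fraction = (left.count("G") + left.count("C")) / stem
        if gc_fraction < min_gc:
            continue  # this arm is not GC-rich enough
        target = reverse_complement(left)  # what the right arm must be
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop            # start of the candidate right arm
            right = rna[j:j + stem]
            if len(right) < stem:
                break                      # ran past the end of the RNA
            if right == target:
                return i, loop
    return None


if __name__ == "__main__":
    # Invented example: GCCGCC ... GGCGGC can pair to form a stem.
    print(find_stem_loop("AAUUGCCGCCUUUUGGCGGCUUUUUU"))
```

Real terminators tolerate mismatches and vary in stem length, so a sketch of this kind only shows the idea of self-complementarity and is not a terminator predictor.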

Eukaryotes: Transcription involves three RNA polymerases, which recognize specific promoter regions located upstream of the initiation site. The promoter consists of a region spanning -35 to -52 base pairs, followed by the TATA box.

The general transcription factors (GTFs) take the place of the sigma factor of prokaryotes. TFIID forms the initial committed complex; it recognizes and binds to the TATA box with the help of its TBP (TATA-binding protein). The binding of TFIID facilitates the sequential binding of the other GTFs (TFIID is followed by TFIIA, TFIIB, TFIIF accompanied by the polymerase, and finally TFIIE and TFIIH) and of RNA polymerase to produce the initiation complex.

The RNA transcript has a length of 25-30 nucleotides and keeps on elongating as new nucleotides are added at the 3’ end.

Polyadenylation involves the addition of a poly A tail to the 3’-OH end of the newly synthesized mRNA, downstream of the poly A signal (AAUAAA). Transcription by RNA polymerase II typically terminates about 500 to 2000 nucleotides downstream from the poly A signal.

The allosteric model states that after the poly A sequence has been transcribed, the RNA polymerase and DNA template are destabilized, which ultimately results in their dissociation. The torpedo model requires the Rat1 exonuclease, which degrades the growing RNA and, on reaching the RNA polymerase, disrupts the transcriptional machinery and terminates transcription.

Eukaryotic genes consist of coding and non-coding regions. The coding sequences, i.e. exons, are interrupted by non-coding introns. The introns are removed and the exons are joined together; this process is known as RNA splicing.

A methylated guanine nucleotide is added to the 5’ end of the mRNA; the methyl group is at position 7 of the base, making the base 7-methylguanine. This is referred to as capping, and the presence of the cap helps in the removal of introns in addition to providing stability to the mRNA. The 5’ cap is easily recognized by the ribosome, which binds to it and initiates the translation process.