chapter 1 - shodhgangashodhganga.inflibnet.ac.in/bitstream/10603/18005/7/07...several bacteriophage...

CHAPTER 1 INTRODUCTION AND REVIEW OF LITERATURE


1.1 INTRODUCTION

The rapid availability of complete genome sequences of various organisms, especially pathogens, has provided a route from the genomic era to the “post-genomic era”, also referred to as the “era of functional genomics”, which allows annotation of genes encoding proteins that contribute to a particular function. For a genome-wide approach in functional genomics, all the genes encoded by an organism must be identified, cloned, expressed and then tested for function. However, it is a major challenge to clone and purify all the encoded full-length proteins of an organism to create an ORFeome, a collection of the ORFs in a genome. The goal of functional genomics is to understand the relationship between an organism’s genome and its phenotype; its applications include studying protein-protein interactions and identifying novel genes essential for survival that may be valid drug or vaccine candidates. Several methods are known for the identification and characterization of protein-protein interactions at genome-wide scale, such as two-hybrid screens (yeast two-hybrid, Y2H) (Uetz et al., 2000; Ito et al., 2001) and co-immunoprecipitation or co-affinity purification followed by mass spectrometry (Gavin et al., 2002; Ho et al., 2002). Phage display is a powerful technology that has been used extensively to study protein-protein or protein-ligand interactions. It allows the creation of large, highly diverse libraries, which provide a means of rapidly screening and selecting large numbers of proteins against potential binding partners.

Different phage display vectors have been employed to display proteins/peptides, but M13-based phagemid vectors have been most extensively used to carry out fusions at the N-terminus of the gIIIp coat protein. To assemble functional phage particles in a phagemid system, co-infection with a helper phage (e.g., M13KO7), which has an origin of replication or a packaging signal of reduced functionality, is required. Hence, the resulting phages carry a mixture of wild-type and fusion coat proteins in a predominantly monovalent fashion. For multivalent display, alternative helper phages such as hyperphage and CT-phage have been used; these provide a reduced amount of gIIIp or no gIIIp at all. The low display density of the phagemid system can be overcome by


the use of phage display systems based on lytic phages such as T4, T7 or lambda, wherein phage assembly occurs inside the host cell, in contrast to the assembly of filamentous phages, which occurs in the periplasm. Bacteriophage lambda has been used for the high-density display of large proteins/polypeptides attached to the N- or C-termini of gpD (head protein) and gpV (tail protein). For an intron-less organism such as M. tuberculosis, the best nucleic acid source for creating an ORFeome is randomly fragmented genomic DNA, where each fragment represents a domain rather than a full-length protein, as domains are well folded and hence easier to produce in soluble form. However, in fragmented DNA libraries the vast majority of clones are nonfunctional because the inserts are out of frame with respect to the signal sequence and gIIIp, owing to frame shifts, stop codons or incorrect orientation. This indicates the need for a selective step to separate DNA fragments encoding ORFs from those that do not. Different approaches based on helper phage, C-terminal reporter genes and antibiotic selection markers have been reported for selecting ORFs, but these often impose certain limitations, as discussed later in this chapter. After a complete set of validated protein-coding ORFs has been obtained for an organism of interest, different protein functional studies demand different protein expression vectors; therefore, a flexible vector system should be used that enables the cloned ORFs to be transferred rapidly to any vector. Several cloning systems based on site-specific recombination are commercially available for the high-throughput transfer of clones with high fidelity and ease.
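The scale of the out-of-frame problem in fragment libraries can be illustrated with a rough count. The sketch below is not taken from the cited literature; it assumes that a random fragment is equally likely to be cloned in either orientation, that its first base falls with equal probability at any of the three codon positions, and that its length is equally likely to be 0, 1 or 2 modulo 3. In-frame stop codons and non-coding DNA are ignored, so the result is an optimistic estimate. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def in_frame_fraction() -> Fraction:
    """Fraction of random inserts that keep signal sequence, insert and gIIIp in one frame.

    Assumptions (illustrative only): orientation, start position within a codon
    and fragment length modulo 3 are independent and uniformly distributed;
    stop codons inside the insert are ignored.
    """
    functional = 0
    total = 0
    # orientation: 0 = same sense as the source ORF, 1 = reversed
    # start_offset: codon position of the first base of the fragment (0 = codon start)
    # length_mod3: fragment length modulo 3 (0 keeps the downstream gIIIp in frame)
    for orientation, start_offset, length_mod3 in product(range(2), range(3), range(3)):
        total += 1
        if orientation == 0 and start_offset == 0 and length_mod3 == 0:
            functional += 1
    return Fraction(functional, total)


if __name__ == "__main__":
    fraction = in_frame_fraction()
    print(f"In-frame, correctly oriented inserts: {fraction} of the library")
```

Under these assumptions only one of the eighteen equally likely combinations yields a functional fusion, which is consistent with the statement that the vast majority of clones are nonfunctional and motivates an ORF selection step.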

Therefore, the literature was collected and reviewed to understand the different phage display systems, the available methods for ORF selection, and the methods for transferring genes at genome scale from one vector to another. Ultimately, identification of the shortcomings of the available methods led to the formulation of the work embodied in this thesis.


1.2.1 PHAGE DISPLAY

Phage display is the technology (Smith et al., 1985) in which DNA encoding a peptide or protein, when cloned in frame with the gene for one of the coat proteins of a bacteriophage, results in display of that peptide or protein on the phage surface. The principle underlying all phage display systems is the physical linkage of a polypeptide's phenotype to its corresponding genotype. The proteins or peptides to be displayed are usually expressed as fusions with a phage coat protein, and the genetic information encoding the displayed fusion protein is packaged inside the same phage particle in the form of a cloned DNA sequence. Because of this unique property, phages that present a binding protein (or peptide) can be enriched by affinity selection of a phage library on an immobilized target. In this “panning” process, binding phages are captured whereas nonbinding ones are washed off. In the next step, the bound phages are eluted and amplified by reinfection of E. coli cells. The amplified phage population can, in turn, be subjected to the next round of panning. Phage library screening typically entails several consecutive rounds of panning and phage amplification before the selected phages and the polypeptides that they present are individually analyzed. Put simply, selection from phage display libraries is a cyclic process of selective enrichment and amplification.

Several bacteriophage systems such as lambda, T4 and T7 have been employed for displaying a variety of proteins or peptides, but the most prominent and frequently used system is the M13 filamentous phage-based phage display system.
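The cyclic nature of panning described above can be sketched with a simple numerical model. This is an illustrative simplification, not a protocol from the cited work: it assumes that in every round a fixed fraction of binding phages and a much smaller fixed fraction of nonbinding phages are recovered, and that amplification in E. coli preserves their ratio. The function name and all parameter values are hypothetical.

```python
def binder_fraction_per_round(initial_fraction, binder_recovery, background_recovery, rounds):
    """Return the fraction of binding phages before panning and after each round.

    initial_fraction    -- fraction of binders in the starting library (assumed)
    binder_recovery     -- fraction of binders retained on the target per round (assumed)
    background_recovery -- fraction of nonbinders carried over per round (assumed)
    """
    binders = initial_fraction
    others = 1.0 - initial_fraction
    history = [binders]
    for _ in range(rounds):
        # Selection: capture binders, wash off most nonbinders.
        binders *= binder_recovery
        others *= background_recovery
        # Amplification: the population regrows without changing the ratio.
        total = binders + others
        binders, others = binders / total, others / total
        history.append(binders)
    return history


if __name__ == "__main__":
    # Hypothetical values: one binder per million clones, 10 % binder recovery,
    # 0.01 % nonspecific carry-over per round.
    for round_number, fraction in enumerate(binder_fraction_per_round(1e-6, 0.1, 1e-4, 3)):
        print(f"after round {round_number}: binder fraction = {fraction:.6f}")
```

With these assumed values a rare binder comes to dominate the population within a few rounds, which is why several consecutive rounds of panning are usually performed before individual clones are analyzed.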

1.2.2 SURFACE DISPLAY IN FILAMENTOUS PHAGES

Most of the work in phage display is done using filamentous phages,

partly due to the ease with which the phage can be manipulated and also due to the availability of a detailed understanding of the viral life cycle and phage structure.


1.2.2.1 Biology and structural organization of filamentous phages

M13 has a flexible, rod-shaped structure with a circular genome of 6000-8000 bases enclosed in a coat composed of five different proteins (Kehoe and Kay, 2005; Smith and Petrenko, 1997). M13 phages are 65 Å in diameter, and their length depends on the length of the enclosed genome. The infectious cycle of the filamentous phage is initiated by infection of the bacterial host, leading to entry of the phage DNA into the cytoplasm. Upon infection, the single-stranded M13 genome is first converted to a double-stranded replicative form, which serves as a template for the production of viral proteins and single-stranded DNA progeny. The phage genome encodes ten proteins, of which five are structural components (gIIIp, gVIp, gVIIp, gVIIIp and gIXp), three are required for DNA synthesis (gIIp, gVp and gXp) and two serve in phage assembly (gIp and gIVp). The phage genome also contains short non-coding (intergenic) regions containing the origin of DNA replication and a hairpin loop region called the DNA packaging signal, which is the site of initiation for the assembly of phage particles. The single-stranded DNA is eventually extruded from the host cell through the inner membrane. During this process the viral proteins, which are anchored in the membrane, encapsulate the DNA as it traverses the membrane. The released virus is a long flexible rod about 1 µm in length. Most of the viral coat consists of the major coat protein gVIIIp (50 amino acids), of which there are approximately 2700 copies per phage. At one end of the virus are five copies each of the minor coat proteins gIIIp (406 amino acids) and gVIp (113 amino acids), and at the opposite end are two other minor coat proteins, gVIIp (33 amino acids) and gIXp (32 amino acids). All of the coat proteins are synthesized with N-terminal signal sequences that direct them via the secretory machinery to the membrane.

1.2.2.2 Phage coat proteins as vehicles of phage display

Phage display technology has been mainly applied to generate libraries of phages displaying the encoded peptide or antibody sequence on their surface


fused to one of the viral coat proteins. Display of proteins on the surface of the filamentous phage is possible by fusion of the gene of interest to any one of the five phage coat protein genes (encoding coat proteins III, VI, VII, VIII or IX). In early examples of phage display, peptides were fused to the amino terminus of either gIIIp or gVIIIp in the viral genome (Scott and Smith, 1990; Greenwood et al., 1991). Jespers et al. (1995) showed that proteins could be displayed as fusions to the carboxyl terminus of gVIp. Because of its orientation on the viral particle, gVIp is amenable to the display of cDNA-encoded libraries as C-terminal fusions. Gao et al. (1999) later demonstrated the display of antibody fragments fused to the N-termini of gVIIp and gIXp. Since the two proteins appear to interact with one another, they are well suited to the display of dimeric proteins such as antibodies. Fuh et al. (2000) have also demonstrated carboxyl-terminal gVIIIp display. However, this discussion is limited to studies related to gIIIp and gVIIIp, the two most widely used coat proteins in phage display.

1.2.2.2.a Minor phage coat protein, gIIIp

The minor coat protein gIIIp is synthesized as a 424 amino acid pre-protein, which includes an 18-residue amino-terminal signal sequence and requires the bacterial Sec system for insertion into the membrane (Rapoza and Webster, 1993). After removal of the signal sequence, the mature 406-residue gIIIp is attached to the virion through interaction with gVIp. The mature protein spans the membrane once, with only the last five carboxyl-terminal residues in the cytoplasm. NMR and X-ray structures have been reported for the gIIIp capsid protein (Holliger and Riechmann, 1997; Lubkowski et al., 1998; Holliger et al., 1999). Structurally, gIIIp is composed of three separate domains connected by glycine-rich linker regions. The amino-terminal domain (N1) and the middle domain (N2) have important functions in viral infection. N2 is responsible for binding to the tip of the F pilus of the E. coli host (Gray et al., 1981). Once the phage is bound, the pilus retracts, bringing the viral particle close to the cell surface (Jacobson, 1972). At the cell surface, N1 binds to the TolA receptor, which then leads to penetration of the host by the viral DNA. If the N1 and N2


domains are deleted, the resulting viral particles are non-infectious (Armstrong et al., 1981; Gray et al., 1981). The carboxyl-terminal domain (CT) is essential for viral morphogenesis and cannot be deleted. Interestingly, carboxyl-terminal fusions to gIIIp have also been shown to be tolerated, despite the fact that the CT domain appears to be buried in the viral capsid (Fuh et al., 2000).

1.2.2.2.b Major coat protein, gVIIIp

The M13 phage coat is composed of approximately 2700 copies of the major coat protein, gVIIIp. The structure of this coat protein has been extensively studied by X-ray crystallography and NMR (Glucksman et al., 1992; Kishchenko et al., 1994). The protein is encoded as a precursor of 73 amino acid residues, a water-soluble cytoplasmic protein, which contains an additional leader sequence of 23 residues at its N-terminus. When this protein is inserted into the membrane, the leader sequence is cleaved off by a leader peptidase. The structure of gVIIIp can be divided into three regions (Marvin, 1998). The coat protein has a hydrophobic stretch of 19 amino acids (transmembrane domain) flanked by an acidic N-terminal region of 20 amino acids (amphipathic domain) and a basic C-terminal region. The overall secondary structure of the protein is α-helical, with a β-turn connecting the transmembrane and the amphipathic domains (Overman et al., 1996). During assembly of the progeny virions at the plasma membrane, the gVp ssDNA-binding protein is removed and replaced by gVIIIp (Rasched and Oberer, 1986). The capsid structure is held together in the virion by hydrophobic interactions between the 19-residue apolar domains, forming a tube around the viral DNA. The flexible N-terminal amphipathic domain is located on the outside of the phage coat and the basic C-terminus faces the inside of the coat. The carboxyl terminus contains four positively charged lysine residues, which interact with the sugar-phosphate backbone of the DNA present inside the particle (Greenwood et al., 1991).

During the phage life cycle, gVIIIp interacts with several components, including (i) the phospholipid bilayer, (ii) gIp and gIXp, (iii) the host protein


thioredoxin, (iv) the phosphate backbone of the viral genome and (v) other gVIIIp molecules to form the viral coat. A number of studies, primarily involving mutagenesis, have examined the roles of individual residues in the packing and stabilizing interactions of the protein subunits (Deber et al., 1993; Williams et al., 1995). Mutagenesis studies suggested that small apolar residues are highly conserved in the apolar domain of gVIIIp (Williams et al., 1995). It was demonstrated that small residues at certain positions (Ala7, Ala10, Ala18, Gly34, Ala35 and Gly38) were not easily mutated or could be substituted only with other small residues. Mutations that seriously reduced the basic nature of the C-terminal domain, or that interrupted the hydrophobic nature of the signal peptide or of the central stretch of the mature protein with polar residues, impaired the capacity of the altered coat protein for membrane insertion (Kuhn et al., 1986). The importance of the lysine residues in the C-terminus for protein-DNA interaction was also studied by mutagenesis. It was shown that only Lys48 can be changed by mutation without affecting phage viability, although the change also results in longer particles to compensate for the decrease in positive charge density inside the coat (Hunter et al., 1987).

1.2.2.3 Vectors used for phage display

Though all the coat proteins of the filamentous phage can be fused to foreign peptides or proteins, current vectors mostly use the minor coat protein gIIIp or the major coat protein gVIIIp as the fusion partner. Various vectors have been made available for surface display in the filamentous phages M13 and fd. Depending upon the number of copies of the fusion protein that are to be displayed, the vector systems have been classified into two groups: phage vectors and phagemid vectors.

1.2.2.3.a Phage vectors

Viral vectors that accept and display fusions with gIIIp and gVIIIp have been termed type 3 and type 8 vectors, respectively. In phages produced from these vectors,


all five copies of gIIIp are displayed as fusions with the foreign peptide encoded by the cloned insert. Similarly, fusion with gVIIIp allows up to 2700 copies of the fusion protein to be displayed, resulting in multivalent display of the foreign peptide/protein. Most of the commonly used type 3 phage vectors have been derived from the filamentous phage fdtet (Zacher et al., 1980), which confers tetracycline resistance on the cells harboring it. Parmley and Smith (1988) developed fdtet into the type 3 display vectors fUSE1 and fUSE2, in which the site of gIIIp display was moved near its N-terminus, reducing the impairment in infectivity caused by gIIIp fusion. Some phage vectors have been described that contain a frame-shift mutation in gIIIp (Cwirla et al., 1990; Scott and Smith, 1990). Two restriction enzyme sites flank the region between the signal peptidase cleavage site and the +3 codon of the mature gIIIp, known as the stuffer region. Removal of this stuffer region and subsequent insertion of the target DNA restore the translation frame of gIIIp to produce infective phages. Vectors based on M13mp8 have also been described (Fowlkes et al., 1992). One of these vectors, m663, does not confer any antibiotic resistance but carries an intact LacZ gene, which allows detection of blue plaques. In most vectors, the displayed peptide or protein is separated from gIIIp or gVIIIp by short linkers, which improve the flexibility of the displayed peptide.

To display a peptide on gVIIIp, a fragment of DNA encoding the peptide is inserted at an engineered restriction site in bacteriophage gene VIII (g8), such that the encoded amino acid sequence appears fused to the mature coat protein that is generated by leader peptidase cleavage of the initial procoat (Greenwood et al., 1991). In such recombinant virions, all 2700 copies of the major coat protein display the peptide. However, there is a limit of about 5-6 on the number of amino acids that can be inserted into the phage coat when fused to all 2700 copies of the major coat protein in a recombinant phage (Greenwood et al., 1991; Iannolo et al., 1995). Fusion of larger peptides/proteins to gVIIIp in phage vectors interferes with the phage assembly process. This drawback, however, was overcome with the development of the phagemid system.


1.2.2.3.b Phagemid vectors

The second group of phage display systems utilizes phagemid vectors (Marks et al., 1991; Barbas et al., 1991; Breitling et al., 1991; Hoogenboom et al., 1991), which produce the fusion coat protein. A phagemid is a plasmid that, in addition to its plasmid origin of replication, bears a phage-derived origin of replication (also called the major intergenic region) and carries a copy of the gene encoding the peptide/protein fused to gIIIp or gVIIIp under the control of a regulated promoter, mostly lacPO, along with an antibiotic resistance gene. Unlike a plasmid, however, the phagemid genome can be packaged in the phage coat. The production of phages containing the phagemid genome can be achieved only when additional phage-derived proteins are present. For the purpose of phage display, these proteins are provided by superinfecting phagemid-carrying cells with a helper phage, which itself has an origin of replication or a packaging signal of reduced functionality. Propagation of a phagemid in cells superinfected with a helper phage or wild-type phage results in the packaging of the phagemid DNA into phage particles in a fashion identical to that of the phage DNA itself. The helper phage-derived proteins and enzymes act in trans on the phage origin of replication carried on both the helper phage and the phagemid genome. M13KO7 (Vieira and Messing, 1987) and its derivative VCSM13 (Stratagene) are commonly used helper phages. Both bear a kanamycin resistance gene, which, along with the antibiotic resistance gene carried on the corresponding phagemid, aids in the selection of cells that contain the genomes of both the helper phage and the phagemid. Therefore, two distinct types of phage particles with different genotypes are produced from cells bearing phagemid and helper phage DNA: (1) those carrying the phagemid genome and (2) those carrying the helper phage genome. For reasons that are not well understood, only a fraction of the gIIIp fusion encoded by the phagemid is incorporated into the extruded phages. As a consequence, a phage particle rarely displays more than one fusion protein. It has been reported that in the phagemid system only 1–10 % of phages display the fusion protein and the remainder contain only gIIIp produced by the helper phage. Phage particles containing the helper phage genome are useless in phage display processes even if they present the desired phenotype because they do not


contain the required genetic information. Independent of the genotype, phagemid-based display systems usually yield phages with a hybrid phenotype, displaying wild-type and fusion coat protein on the same particle. The phagemid particles can be individually titred and propagated by virtue of conferring antibiotic resistance on any cell they infect.

Display of proteins fused to gIIIp is greatly affected by gIIIp functions. The N-terminal portion of gIIIp is required for phage infectivity as well as for providing immunity to cells already infected with a filamentous phage, preventing their superinfection by other phages. To allow infection with helper phage, the N-terminal domain of gIIIp in the fusion expressed by the phagemid should be deleted. This feature was incorporated in phagemid vectors designed for antibody display, pComb3 (truncated gIIIp with aa 198-406; Barbas et al., 1991) and its derivatives pComb3H and pComb3X (truncated gIIIp with aa 230-406; Yang et al., 1995; Rader and Barbas, 1997), as well as other vectors. Several phagemid systems, including pHEN (Hoogenboom et al., 1991) and pCANTAB (Amersham Pharmacia Biotech), use the entire gIIIp as the fusion partner. The phages produced by such phagemid systems can display from 0 to 5 copies of wild-type gIIIp, because of competition between wild-type gIIIp produced by the helper phage and the fusion protein produced by the phagemid for incorporation into the assembling phage particle.

In practice, phage and phagemid libraries differ in a number of ways. At the DNA level (preparing DNA, cloning, transfection efficiency) it is easier to work with phagemids than with phage. As a result, phagemid libraries can be made far larger, in terms of the number of independent clones, than phage libraries. It is also easier to produce soluble proteins using phagemids, by inserting an amber stop codon between the displayed protein and gIIIp (Winter et al., 1991). Although soluble protein could theoretically be made in phage libraries using a similar genetic arrangement, the low copy number of the vector and the weakness of the gIIIp promoter and ribosome-binding site result in levels of soluble protein that are too low for most practical purposes, requiring subsequent


recloning into expression vectors (Marks et al., 2001). Another advantage of phagemids is their relative resistance to deletion of extraneous genetic material. Filamentous phage vectors, in general, have a tendency to delete unnecessary DNA, owing to the selective growth advantage that a smaller phage genome has over a larger one. Phagemids suffer far less from such deletions and, as a result, are genetically more stable.

Phage libraries, on the other hand, have considerable operational advantages. First, they do not require the use of helper phage for phage production. As a result, the additional technical procedures associated with helper phage infection, such as monitoring the absorbance of bacterial cultures, are omitted from protocols. To amplify phage libraries it is sufficient to grow the bacteria containing the phage genomes, and phage particles are produced. This makes phage far easier to use in selections, particularly in high-throughput selections. Moreover, each phage particle in a phage library displays up to five copies of the displayed protein (using a gIIIp display system), whereas only 1–10 % of phage particles in a phagemid library display a single copy of the displayed protein (Clackson et al., 1994). As a result, a greater number of binders in a library can be recovered, and the selected antibodies therefore tend to be more diverse. However, this is counterbalanced by a lower average affinity: phagemid display, by virtue of the display of single proteins, results in the selection of fewer unique binders, which tend to have higher affinities.
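The link between a low display rate and predominantly monovalent display can be made concrete with a small calculation. The sketch below is an illustration under assumptions that do not come from the cited studies: each of five gIIIp positions on a particle is taken to be filled independently, with the same probability, by either a fusion or a wild-type gIIIp, and that probability is chosen so that the fraction of particles carrying at least one fusion matches the reported display rate. The function names are hypothetical.

```python
from math import comb


def valency_distribution(display_fraction, sites=5):
    """Probability of 0..sites fusion copies per particle (simple binomial model).

    display_fraction -- fraction of particles carrying at least one fusion
    sites            -- number of gIIIp positions per particle (assumed to be 5)
    """
    # Per-site fusion probability q chosen so that P(at least one fusion) = display_fraction.
    q = 1.0 - (1.0 - display_fraction) ** (1.0 / sites)
    return [comb(sites, k) * q**k * (1.0 - q) ** (sites - k) for k in range(sites + 1)]


def monovalent_share(display_fraction, sites=5):
    """Among particles that display a fusion, the share carrying exactly one copy."""
    distribution = valency_distribution(display_fraction, sites)
    return distribution[1] / (1.0 - distribution[0])


if __name__ == "__main__":
    for rate in (0.01, 0.10):
        share = monovalent_share(rate)
        print(f"display rate {rate:.0%}: {share:.1%} of displaying particles are monovalent")
```

Within this simplified model, a display rate in the 1–10 % range implies that the large majority of displaying particles carry a single fusion, in line with the description of phagemid display as essentially monovalent.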

1.2.3 HELPER PHAGES

Phagemid systems yield hybrid phage, usually displaying wild-type and fusion coat protein at a certain ratio. Filamentous phage presents 3 to 5 copies of gIIIp per particle, which means that, ideally, multivalent phages displaying up to five copies of the fusion protein can occur. However, during phage assembly wild-type gIIIp is preferentially incorporated, so that most phages exhibit the wild-type phenotype (referred to as “bald” phages), whereas the few percent of phages that carry a fusion protein are mainly monovalent. The low level of display favors


the selection of antibodies or ligands with high monovalent affinities in the solid-phase panning step, as avidity effects can be avoided. However, the low display rate can be a drawback in applications where very few binders are to be enriched from a huge excess of unwanted phages, because bald phages can reduce the accessible molecular diversity of the library and the efficiency of the system. One solution to the problem is the use of true phage vectors, at the expense of the advantages associated with the phagemid system discussed above. A different approach is to avoid the delivery of wild-type gIIIp by engineering the helper phage. Several helper phage constructs with a deleted or optionally untranslated gene III have been developed (Duenas and Borrebaeck, 1995; Rakonjac et al., 1997; Rondot et al., 2001; Baek et al., 2002; Kramer et al., 2003; Soltes et al., 2003). These helper phages (Table I.1) provide all the proteins necessary for packaging the phagemid except wild-type gIIIp. Therefore, when a gIIIp-deficient helper phage is used for packaging, the only gIIIp that the phage particles display is the gIIIp fusion protein encoded by the phagemid. Apart from some technical problems with the propagation of the gene III-deficient helper phages associated with the early constructs, improved specific enrichment in panning experiments and higher display rates compared with standard helper phage systems were reported (Rondot et al., 2001; Kirsch et al., 2005).

Helper phage    Features                                            Reference
Ex-phage        gIIIp carries amber stop codon                      Baek et al., 2002
CT-Phage        N1–N2 deleted                                       Kramer et al., 2003
Hyper-phage     gIIIp deletion, special packaging strain            Rondot et al., 2001
R408d3          gIIIp deletion                                      Rakonjac et al., 1997
M13Δ3.2         gIIIp deletion                                      Duenas and Borrebaeck, 1995
Phaberge        gIIIp carries amber stop codon                      Soltes et al., 2003
KM13            protease site in gIIIp                              Kristensen and Winter, 1998
M13 KO7         replication-deficient                               Vieira and Messing, 1987
VCS M13         derivative of M13 KO7                               Stratagene
XP5             reduced expression of gIIIp due to rare codons      Beaber et al., 2012

Table I.1 Helper phages and their characteristic features.

Kristensen and Winter (1998) designed a helper phage, KM13, to reduce the hampering effects of bald phages during selection. KM13 helper phage was


constructed by introducing a protease cleavage sequence in the helper phage gIIIp between the N-terminal (N1–N2) domains and the CT domain. The N-terminal domains of gIIIp are required for infection; therefore, phages from which N1–N2 had been proteolytically cleaved off were noninfectious. Although using this helper phage probably did not alter the display rate, infectivity mediated through helper phage gIIIp could be destroyed such that only phages displaying the gIIIp fusion protein remained infectious. Another helper phage, named CT helper phage, was designed (Kramer et al., 2003) for monovalent display that does not rely on specific phagemids, bacteria or protease treatments. The CT helper phage encodes a truncated gIIIp lacking the infectivity domains N1 and N2, such that phages rescued with CT helper phage would be infective only if they display the phagemid-encoded fusion, which is fused to a full-length gIIIp. The reports showed that the background due to infective phages not displaying an scFv was not completely abolished using CT helper phage, yet it was drastically reduced, since only 0.001 % of CT helper phages carried a native wild-type gIIIp gene compared to > 99 % when VCSM13 was used.

However, it should be noted that most of the phage display phagemids exploiting gIIIp fusions use a truncated version of gIIIp (Hust and Dubel, 2005) comprising its CT domain, to reduce genetic instability and the negative effects on folding caused by an unpaired cysteine in the N-terminal domain (Krebber et al., 1997). In addition, the use of CT diminishes the resistance to superinfection caused by over-expression of full-length gIIIp fusion proteins (Stengele et al., 1990; Orum et al., 1993). For reasons of phage infectivity, only full-length gIIIp fusion proteins are able to complement the wild-type gIIIp. Phagemids using the truncated gIIIp are therefore not compatible with helper phages that do not provide functional gIIIp.

Hyperphage (Rondot et al., 2001) has a wild-type gIIIp phenotype and is therefore able to infect F+ Escherichia coli cells with high efficiency; however, its lack of a functional gIIIp gene means that the phagemid-encoded gIIIp fusion is the sole source of gIIIp during phage assembly, which results in a


considerable increase in the fraction of phage particles carrying a fusion protein on their surface. The enrichment factor was considerably improved when a human scFv library packaged using Hyperphage was panned on tetanus toxin. After two panning rounds, more than 50 % of the phages were found to bind to the antigen, compared to 3 % when the conventional M13KO7 helper phage was used.
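The effect of repeated panning on the fraction of antigen-binding phage can be illustrated with a toy model. The sketch below assumes that, in every round, binders are recovered a fixed number of times more efficiently than non-binders. This per-round enrichment factor, the starting fraction and the function name are hypothetical choices made for illustration and are not values or methods reported in the studies cited above.

```python
def binder_fraction(initial_fraction, enrichment_factor, rounds):
    """Return the fraction of binding phage after each panning round.

    Toy model (an assumption, not a published method): in every round,
    binders are recovered `enrichment_factor` times more efficiently
    than non-binders, and the pool is then re-amplified without bias.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must lie between 0 and 1")
    if enrichment_factor <= 0:
        raise ValueError("enrichment_factor must be positive")

    fractions = []
    fraction = initial_fraction
    for _ in range(rounds):
        binders = fraction * enrichment_factor   # relative recovery of binders
        non_binders = 1.0 - fraction             # relative recovery of non-binders
        fraction = binders / (binders + non_binders)
        fractions.append(fraction)
    return fractions


if __name__ == "__main__":
    # Hypothetical inputs: one binder per million clones at the start.
    for factor in (100, 1000):
        per_round = binder_fraction(1e-6, factor, rounds=3)
        print(factor, ["%.3g" % value for value in per_round])
```

Under these assumptions, a higher per-round enrichment factor (as would be expected from improved display levels) shortens the number of rounds needed before binders dominate the pool.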

Baek et al. (2002) have shown that the display of single-chain antibody fragments (scFv) generated with the pIGT3 phagemid can be increased dramatically by using a genetically modified helper phage, Ex-phage. Ex-phage was derived from M13KO7 by introducing an amber stop codon in gIIIp such that it produced a functional wild-type gIIIp in suppressing E. coli strains but did not make any gIIIp in non-suppressing strains. Packaging phagemids encoding antibody-gIIIp fusions in F+ non-suppressing E. coli strains with Ex-phage enhanced the display level of antibody fragments on the surfaces of recombinant phage particles, resulting in an increase in antigen-binding reactivity of >100-fold compared to packaging with M13KO7 helper phage. Thus, the Ex-phage and pIGT3 phagemid vector provide a system for the efficient enrichment of specific binding antibodies from a phage display library and thereby increase the chance of obtaining more diverse antibodies specific for target antigens.

In another independent study, Soltes et al. (2003) constructed a novel helper phage, Phaberge, having a conditional deletion of gIIIp and demonstrated that this helper phage conferred increased display levels by selectively preventing insert-less phagemids from being packaged into functional virions.

In contrast to Ex-phage, Phaberge gives a substantially higher yield of virions than M13KO7. Phaberge can also selectively prevent viral propagation of insert-less phagemid clones, which proves useful in preventing phagemid libraries from becoming overgrown by insert-less clones.

Recently, Beaber et al. (2012) created a new helper phage, XP5, that uses a combination of ribosome binding site spacing alterations and rare codon clusters


to reduce the expression of gIIIp from the helper phage. This reduction in gIIIp expression leads to an increased incorporation of gIIIp-Fab fusions during phage rescue. This helper phage reduced the level of wild-type gIIIp moderately, thereby increasing the percentage of phage displaying a fusion molecule while maintaining monovalency. The phage particles rescued with the XP5 helper phage had an 80 % increase in display levels as compared to phages rescued with M13KO7.

1.2.4 LIMITATIONS OF FILAMENTOUS PHAGE BASED PHAGE DISPLAY LIBRARIES

The M13 system, though extremely popular, has a few limitations related to the restricted codon usage of E. coli as host, the lack of posttranslational modifications and the potential toxicity of some gene products expressed in heterologous hosts. Further, the requirement of periplasmic secretion of the displayed peptide imposes a constraint on the sequence repertoire that can be efficiently displayed. Any cDNA gene product (fused to a capsid protein) whose properties prevent it from crossing the membrane will not be secreted and, therefore, will not be assembled into the phage coat. This is a potential limitation mainly concerning the display of transmembrane proteins such as receptor molecules, cell wall proteins or other structures embedded in membrane bilayers, which have never been reported to be found by screening of phage surface-displayed cDNA libraries. Also, a display density of several hundred copies of a variety of peptides and protein domains has not been achieved with M13-based gIIIp systems. The copy number of the peptide or protein displayed on the surface of a single particle is referred to as the display density. Display densities as low as one copy per particle are important for protein engineering when one needs to isolate improved binders to a bait, e.g., antibody molecules with high affinities for the ligand. Display of 3 to 10 copies of the molecule is sufficient to allow the isolation of specific binders to a bait when the strength of the interaction is in the nanomolar or micromolar range. Molecular interactions in biological systems are dependent on the concentrations and affinities of the interacting partners and the


analyses of results obtained from such studies indicate the potential benefits of high-density display. Random peptide libraries, gene/genome-fragment libraries and cDNA libraries displayed on phages can be used to identify peptides against which immune responses are elicited in patients. For the identification of specific epitopes using patient sera with low antibody titers, high-density display on phages will be useful for the selection and enrichment of specific binders from a large population of non-binders. Therefore, alternative high-density display systems based on large-genome phages, mainly T7, T4 and lambda, have been developed to overcome the potential limitations imposed by secretion of the fusion protein through the cellular membrane or by low display levels.
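The dependence of binding on concentration and affinity can be made concrete with the standard 1:1 equilibrium binding relation, in which the fraction of target occupied is [L] / ([L] + Kd). The sketch below is only an illustration under stated assumptions: it treats a single displayed molecule binding a single site, assumes the free ligand concentration is not depleted, and ignores the avidity gained from multivalent display. The function name and the example concentrations are hypothetical.

```python
def fraction_bound(ligand_conc_molar, kd_molar):
    """Fraction of target sites occupied at equilibrium for 1:1 binding.

    Assumes ligand is in excess (no depletion) and a single, independent
    binding site; avidity effects of multivalent display are not modeled.
    """
    if ligand_conc_molar < 0:
        raise ValueError("ligand concentration must be non-negative")
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)


if __name__ == "__main__":
    # Hypothetical case: displayed ligand present at 1 nM.
    ligand = 1e-9
    for kd in (1e-9, 1e-7, 1e-6):  # nanomolar to micromolar affinities
        print("Kd = %.0e M -> fraction bound = %.4f" % (kd, fraction_bound(ligand, kd)))
```

In this simplified picture, a micromolar binder present at nanomolar concentration occupies only a small fraction of its target, which is consistent with the argument that higher display densities help recover weak binders.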

1.2.5 LYTIC PHAGE BASED DISPLAY SYSTEMS

(1) T7 Phage display vectors

Bacteriophage T7 is a lytic phage that replicates within the host cell; upon amplification, phage progeny are released by host cell lysis. The displayed peptides do not have to be compatible with the secretory complex within the bacterial cell membrane or with the host cell infection process. Studier and coworkers have described display vectors based on bacteriophage T7. This system has the capacity to display peptides of up to 50 amino acids at a high copy number (415 per phage) and peptides or proteins of up to 1200 amino acids at a low copy number (0.1 to 1 per phage). The T7Select phage display system (available from Novagen) uses the T7 capsid protein to display peptides or proteins on the surface of the phage. The capsid protein of T7 is normally made in two forms, 10A (344 amino acids) and 10B (397 amino acids), and is present as 415 copies per capsid. There are two basic types of T7Select phage display vectors, namely, the T7Select 415-1 vector for high copy number display of peptides and the T7Select-1 and -2 vectors for low copy number display of peptides or larger proteins.


The T7Select 415 system has been used to display peptides of 10 to 39 residues at high copy number (415 per phage). The T7Select-1 vectors have been used to display peptides and proteins at low copy numbers; as measured by Western blot analysis, these were 0.5 per phage for herpes simplex virus Tag, 0.3 for the T7 single-stranded DNA binding protein, 0.2 for β-galactosidase and 0.1 for T7 RNA polymerase (Rosenberg et al., 1996).

T7 phage display based cDNA libraries have been used for the identification and characterization of a novel angiostatin-binding protein (Kang and Yu, 2004), vaccine candidates of Brugia malayi (Ramaswamy et al., 2004) and bacterial ribonuclease inhibitors, for the study of protein-protein interactions and for the cloning of RNA binding proteins. The commercial availability of the T7 phage display system, with optimized protocols and ready-made libraries for selections, has led to an increasing use of this system for a variety of studies.

(2) T4 Phage display vectors

The phage T4 capsid is composed of three essential capsid proteins, namely the major capsid protein gp23 (960 copies per phage particle) and the two minor capsid proteins gp24 (55 copies per particle) and gp20 (12 copies per particle). In addition, the outer surface of the capsid is coated with two nonessential outer capsid proteins, HOC (40 kDa) and SOC (9 kDa). HOC and SOC have several features that make them suitable for the display of peptides and proteins: neither protein is essential for T4 capsid morphogenesis, but if available they bind with high affinity to sites on the outer surface of the capsid after the completion of capsid assembly but prior to DNA packaging. Both proteins are present at high copy numbers, with 160 copies of HOC and 960 copies of SOC per phage particle. The elimination of one or both of these proteins by mutation does not affect phage productivity, viability or infectivity. The T4 SOC display system has been used to display a large number of copies of a 43-amino acid domain (V3) of the gp120 protein of human immunodeficiency virus type 1 (HIV-1) per phage capsid. These V3-displaying phages were highly antigenic in mice and


produced antibodies reactive to native gp120 (Ren et al., 1996). Similarly, the other nonessential protein of the T4 capsid, HOC, was used to display the HIV-1 receptor CD4 (183 amino acids), which was detectable by monoclonal antibodies against human CD4 domains 1 and 2 (Ren et al., 1998). The number of protein molecules displayed per phage particle was small (10 to 40 per phage particle), but the displayed molecules possessed a native conformation. Jiang et al. (1997) used T4 SOC and HOC to display a 36-amino acid PorA peptide from Neisseria meningitidis as an N-terminal fusion. It was demonstrated that PorA-HOC and PorA-SOC recombinant phages were highly immunogenic in mice and elicited strong anti-peptide antibody titers.

(3) Lambda phage display system

One of the lambda proteins used for display is gpD, a small 11.4 kDa capsid stabilizing protein that is found as 405 to 420 copies per capsid (Casjens et al., 1974). During morphogenesis, lambda DNA is packaged in the prohead shell, which expands and undergoes an irreversible conformational change that allows gpD to bind to the prohead and stabilize the phage head (Imber et al., 1980). The X-ray structure of gpD has been solved to 1.1 Å resolution and coupled with high-resolution cryo-electron microscopy to provide a detailed picture of the display scaffold (Pluckthun et al., 2000). While monomeric in solution, gpD is a trimer in the crystal. Virus structures reconstructed using cryo-electron microscopy show that gpD is a trimer on the phage capsid, suggesting that the trimer seen in the crystal structure is not an artifact of crystal packing. The trimers of gpD bind to underlying molecules of gpE that form the capsid shell. The crystal structure shows that both the amino and carboxy termini of gpD appear to point downwards towards the capsid interior rather than outward from the surface. Despite this, peptides and proteins fused to gpD are accessible at the surface. One explanation is that the flexible linkers that join gpD and the fusion partner allow the fusion to be displayed on the outward side of gpD. The ability to display proteins fused to gpD was demonstrated by selectively capturing phage with a reagent that specifically recognizes the fusion partner.


The IgG binding protein domains A and B1 were each fused to gpD and the resulting phage were efficiently captured on plates coated with IgG (Sternberg et al., 1995; Mikawa et al., 1996). These experiments also showed that fusions could be made to either the amino or carboxy termini of gpD. While protein domains A and B1 are relatively small (~ 65 amino acids), larger protein domains such as β-lactamase or tetrameric β-galactosidase could also be displayed in an enzymatically active form (Mikawa et al., 1996).

A second protein that has been used for lambda display is the tail protein gpV. While a crystal structure for gpV is unavailable, early genetic and biochemical analysis indicated that the carboxy-terminal portion of the protein was dispensable (Katsura et al., 1976). Electron micrographs of the hexamer rings formed by gpV showed that the carboxy-terminal deletion mutants lacked protrusions on the outer surface when compared with wild-type gpV preparations (Katsura et al., 1981). Despite the gpV carboxy deletions, such phages are viable. The non-essential nature of the carboxy terminus of gpV, combined with the fact that it faces outward from the surface of the tail structure, would appear to make this an ideal platform for the display of peptides or proteins. Initially, carboxy-terminal fusions to gpV were made using either β-galactosidase or the plant Bauhinia purpurea agglutinin (Maruyama et al., 1994). Display of these proteins was verified by affinity selection with either a monoclonal antibody to β-galactosidase or mucin. In the case of the fusion with β-galactosidase, it could be shown that enzymatic activity copurified with phage banded in CsCl. Electron micrographs of these phage clearly show the presence of the tetrameric β-galactosidase molecules on the surface of the phage tail. Peptides have also been displayed on gpV, including the target sequence for cAMP-dependent protein kinase and a complementing fragment of β-galactosidase (Dunn et al., 1995, 1996).

Lambda display has been used for epitope mapping of monoclonal antibodies against a large number of human and microbial proteins (Gupta et al., 2003; Kuwabara et al., 1997; 1999). One of the most promising potential applications of lambda


display is the construction of cDNA-encoded display libraries. Santini and Hoess (1998) reported the construction of a cDNA library of Hepatitis C virus on lambda and its use for affinity selection with monoclonal antibodies and patient sera. Several lambda display libraries of T. gondii cDNA fragments have been constructed and used for the diagnosis of congenital toxoplasmosis with patient sera from either pregnant women with acquired infection or diseased children (Beghetto et al., 2001; 2003; 2006). A lambda display library of DNA fragments from the whole genome of Streptococcus pneumoniae was constructed and challenged with sera from patients with acute pneumococcal pneumonia, allowing the identification of an immunodominant epitope of the bacterial immunoglobulin A protease, a proteolytic enzyme playing a major role in the pathogen’s resistance to the host’s immune system (De Paolis et al., 2007). Genome-fragment libraries on lambda phage of pathogens such as Mycoplasma pneumoniae (Beghetto et al., 2009) and human cytomegalovirus (HCMV) (Beghetto et al., 2008) have also been constructed with the aim of identifying B cell epitopes for developing tests or identifying antigens as vaccine candidates.

Gupta et al. (2003) described a bacteriophage lambda system for the display of peptides and proteins fused at the C-terminus of the head protein gpD. The system relies on a highly efficient process of phage infection and in vivo recombination that integrates the sequence encoding the peptide to be displayed into the lambda genome, so that the cloned sequence is displayed as a gpD fusion on the surface of progeny lambda phage particles. DNA encoding the foreign peptide is inserted at the 3’ end of the DNA segment encoding gpD under the control of the lac promoter in a plasmid vector (donor plasmid), which also carries the loxPwt and loxP511 mutant recombination sequences. Cells expressing the site-specific recombinase Cre are transformed with the plasmid and subsequently infected with the recipient lambda phage, which carries a stuffer DNA segment flanked by loxPwt and loxP511 sites. Recombination occurs in vivo to form Ampr cointegrates that produce recombinant phages displaying the foreign protein fused to gpD. It was also reported that the lambda system was able to display proteins of different sizes and that the number of copies of each protein per


phage particle was 2 to 3 orders of magnitude higher than that for display on M13 phage as fusion to gIIIp.

Although lambda is an attractive display vehicle, lambda phage display systems have been used for only a limited range of applications. This can be attributed to the complex biology of lambda phage and to its large genome (48 kb), which makes the isolation of viral DNA, the cloning of foreign DNA and the in vitro packaging of the ligated product into lambda particles difficult, thereby leading to small library sizes.

1.3 FUNCTIONAL GENOMICS

Functional genomics is a field of molecular biology that attempts to make use of the vast wealth of data produced by genomic projects (such as genome sequencing projects) to describe gene (and protein) functions and interactions. Unlike genomics, functional genomics focuses on the dynamic aspects such as gene transcription, translation, and protein–protein interactions, as opposed to the static aspects of the genomic information such as DNA sequence or structures. Functional genomics attempts to answer questions about the function of DNA at the levels of genes, RNA transcripts, and protein products. A key characteristic of functional genomics studies is their genome-wide approach to these questions, generally involving high-throughput methods rather than a more traditional “gene-by-gene” approach. The goal of functional genomics is to understand the relationship between an organism's genome and its phenotype. The term functional genomics is often used broadly to refer to the many possible approaches to understanding the properties and function of the entirety of an organism's genes and gene products. This includes identification of function of the different proteins encoded by the tens of thousands of genes, and to understand the pathways in which they participate. Functional genomics involves studies of natural variation in genes, RNA, and proteins over time (such as an organism's development) or space (such as its body regions), as well as studies of natural or experimental functional disruptions affecting genes, chromosomes, RNAs, or proteins. The promise of functional genomics is to


expand and synthesize genomic and proteomic knowledge into an understanding of the dynamic properties of an organism at cellular and/or organism levels.

There are several technologies available that facilitate rapid gene identification and the study of protein expression levels or protein-protein interactions. These involve traditional approaches, such as automated cloning and expression and transgenic animals, as well as more novel approaches such as the two-hybrid system (Uetz et al., 2002) and two-dimensional gel electrophoresis combined with mass spectrometry (Aebersold and Mann, 2003). However, recent developments in molecular libraries based on bacterial and phage display have shown great potential for generating information and providing the tools necessary for facilitating progress in functional genomics. Phage display is an attractive technique that allows the creation of large molecular (genomic, cDNA, gene-fragment, antibody) libraries, on the order of 10^10 to 10^11 members, that can be efficiently employed for various applications as discussed in detail in the next section.
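To put library sizes of this order into perspective, they can be compared with the number of possible sequences for a random peptide of a given length. The sketch below is a back-of-the-envelope estimate only. It assumes that every one of the 20 amino acids is equally likely at each position and that clones are sampled independently, which real libraries do not satisfy (codon bias and stop codons are ignored). The function names are hypothetical.

```python
import math

AMINO_ACIDS = 20  # standard genetically encoded amino acids


def sequence_space(peptide_length):
    """Number of distinct peptides of the given length (20 ** length)."""
    if peptide_length < 1:
        raise ValueError("peptide_length must be at least 1")
    return AMINO_ACIDS ** peptide_length


def coverage(library_size, peptide_length):
    """Probability that one particular peptide is present in the library.

    Assumes uniform, independent sampling, so the probability is
    1 - exp(-N / V) for N clones and V possible sequences.
    """
    if library_size < 0:
        raise ValueError("library_size must be non-negative")
    space = sequence_space(peptide_length)
    return -math.expm1(-library_size / space)


if __name__ == "__main__":
    library = 1e10  # lower end of the range quoted in the text
    for length in (6, 7, 9, 12):
        print(length, sequence_space(length), "%.3g" % coverage(library, length))
```

Under these idealized assumptions, a library of 10^10 clones samples hexapeptide and heptapeptide space almost completely, but only a very small fraction of the possible 12-mers.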

1.3.1 APPLICATION OF PHAGE DISPLAY

(1) Protein-protein interactions

Phage display has been used in numerous studies of protein-protein interactions. Its use with combinatorial mutagenesis provides a rapid method to identify residues contributing energetically to binding at protein-protein interfaces. Phage display random peptide libraries have been used to identify novel interacting partners of proteins. For example, phage display experiments predicted an interaction between the bacterial membrane transport proteins TonB and BtuF and identified the potential binding residues on each protein. Phage-displayed peptides were affinity selected in complementary biopanning using either TonB or BtuF as targets (James et al., 2009; Mandava et al., 2004; Carter et al., 2006). Phage display has also been used to map intracellular interactions of distinct protein domains like SH3 and PDZ (Fuh et al., 2000; Kiewitz & Wolfes, 1997).

Affinity selected peptides for SH3 domain also had sequences different from the


known natural ligands, suggesting the diverse binding specificity of SH3. The peptide modules affinity selected from a phage display library can be mapped back to whole genome sequences to identify the potential binding partners of the target proteins. A C-terminally expressed random peptide phage library is useful in exploring structural binding motifs like PDZ domains, which may interact with the C-terminus of another protein. Phage cDNA libraries allow the identification of endogenous protein ligands for protein and non-protein targets like phosphatidylserine (Caberoy et al., 2009). They provide an important tool in the functional characterization of genes identified by genome sequencing.

(2) Enzyme specificity and inhibitors

Phage display has been used in enzymology to determine substrate specificity and to develop modulators of both the active and allosteric sites of enzymes (Diamond, 2007; Kehoe & Kay, 2005; Kay et al., 2001; Benhar, 2001). The method can be used to display mutants of enzymes to study their mechanisms of action (Vanwetswinkel et al., 2000; Ponsard et al., 2001; Verhaert et al., 2002). Since filamentous phage is resistant to a broad range of proteases, it has been used in the identification of substrates of various proteases (Matthews & Wells, 1993; Diamond, 2007). A phage display library of random peptides with an N-terminal affinity tag, used either for immobilization of phage before protease exposure or for separation of protease-resistant phage from solution, is commonly employed. In addition to mapping substrate specificity, phage display is used to select stably folded proteins that are resistant to cleavage despite containing the protease substrate sites. Such selections may prove useful in linking protein sequence to structure and may help engineer proteins with improved folding and stability (Finucane et al., 1999).

Catalytic site enzyme inhibitors with high affinity may be developed by screening phage display random peptide libraries or libraries of mutants of existing inhibitors for binding to the immobilized enzyme molecules (Hekim et al., 2006). Highly specific catalytic site inhibitors of enzymes involved in the


bacterial cell wall synthesis have been developed by phage display (Paradis-Bleau et al., 2006; El Zoeiby et al., 2003; Paradis-Bleau et al., 2008, 2009; Molina-Lopez et al., 2006). Selections against therapeutic targets have yielded phage-encoded sequences with high affinity and specificity that have become candidates for drug development.

Phage display can be used to gain insight into enzyme-substrate interactions that may be responsible for enzyme specificity. Botulinum type B and tetanus endopeptidases are neurotoxins that may interact with a region of the substrate (the recognition motif) that is different from the cleavage site. A phage display library of vesicle-associated membrane protein with a mutated recognition motif was screened for cleavage by botulinum type B toxin to identify alternative substrates and the role of recognition motifs in endopeptidase specificity (Evans et al., 2005). In another example, phage display has been used to develop peptide inhibitors that bind either anthrax toxin or its cell surface receptors (Basha et al., 2006; Gujraty et al., 2005). Studies with inhibitors have been useful in providing information on the mechanism of cytotoxicity of anthrax toxin (Basha et al., 2006).

(3) Antibodies

Phage display of antibody fragments has been used successfully in generating target-specific antibodies, which can be useful in multiple applications including proteomics, specific drug delivery and the analysis of intracellular processes (Bratkovic, 2010; Benhar, 2001; Hoogenboom, 2005; Smith & Petrenko, 1997). The major advantages of using phage display for acquiring antibodies are speed and the fact that immunization of animals, and especially of humans, is not required. Naive antibody phage libraries made from rearranged V gene pools of a non-immunized individual, and synthetic antibody phage libraries made by artificially introducing diversity in the CDRs of germline V-gene segments, completely bypass the use of immunization. A naive library with natural CDRs or a synthetic library with artificial CDRs can then be screened against most antigens, including


non-immunogenic molecules and targets conserved between species. The power of this approach is illustrated in screening a naive antibody library to select antibodies specific for each of the 20 related human Src Homology 2 (SH2) domains that share a common three-dimensional structure with 20 – 89 % identity in their sequences (Pershad et al., 2010).

Phage display has also been used for the isolation of intrabodies, which are antibodies directed against intracellular target molecules. Intrabodies in the form of scFvs face the difficulty of folding properly in the reducing environment of the cell cytosol and nucleus. However, phage libraries of highly diverse scFvs or of engineered scFvs optimized for cellular expression have been screened to select intrabodies (Cardinale and Biocca, 2008; Philibert et al., 2007).

(4) Epitopes and mimotopes

Phage display is a cheap and rapid method to map the epitope of an antigen that is involved in specific interaction with an antibody. The identification of epitopes is essential in diagnostics, immunotherapy and vaccine development. Phage display peptide libraries can help identify critical residues within a continuous epitope that are involved in antibody binding. Since linear continuous epitopes are often six amino acids in length, the screening of libraries may affinity select peptides that exactly match the primary structure of the epitope (Geysen et al., 1988; Fack et al., 1997). Epitope mapping can be carried out by screening phage libraries displaying random peptides encoded by either synthetic oligonucleotides or gene-fragments (Wang & Yu, 2009; Bottger & Bottger, 2009; Scott & Smith, 1990). Gene-fragment libraries are useful in identifying epitopes that are longer or that adopt a structural conformation (Fack et al., 1997). A phage display peptide library can also be screened to affinity select mimotopes, which are peptides that mimic discontinuous epitope structures. Mimotopes may not have similarity to any linear sequence of the antigen and may represent a conformation-dependent interaction of the epitope with the antibody. Several analytical tools are available to map the native epitope based on sequences of the selected mimotopes and the three-dimensional structure of


the antigen (Mayrose et al., 2007; Huang et al., 2006). The phage display peptide library can also identify epitope mimics of carbohydrate and lipid antigens that have a low immunogenic profile (Forster-Waldl et al., 2005). Mimotopes coupled with carrier proteins or presented as polymers have been developed for cancer, anti-allergic and contraceptive vaccines (Naz, 2009; Knittelfelder et al., 2009).

(5) Receptors and G-proteins

Phage display has been used to identify agonists and antagonists to probe receptor structure and function. Peptide libraries can be screened for binding to functionally folded extracellular domains of receptors that contain the site for the natural ligand. Selected peptides that recognize the binding interface of the receptor can antagonize its interaction with the natural ligand. The structural and functional properties of individual members of a large receptor family that bind the same natural ligands can be characterized with affinity-selected peptides specific for each member (Koolpe et al., 2005). Phage-encoded peptide ligands have also been selected for targets like G-protein coupled receptors, for which it is difficult to purify the functionally folded extracellular receptor domains. Antibodies specific for the known receptor ligand can be used as a target to affinity select mimotopes of the ligand from a phage display library. The selected mimotopes can be used to study the mechanism of interaction of the ligand with its receptor and allow the development of potent agonists and antagonists (Bonetto et al., 2005). Receptor antagonists can also be obtained by selecting peptides that bind to the receptor agonist and thereby inhibit its interaction with the receptor. The phage display approach has also been used to determine biologically relevant proteins that bind to pharmacologically active compounds such as SB-236057, Taxol and FK506. The identification of these proteins helped in resolving the mechanism of action of these compounds (Rodi et al., 1999; Sche et al., 1999; Augustine-Rauch et al., 2004).


(6) Phage displayed gene/genome-fragment libraries

Phage display of DNaseI-generated overlapping gene-fragments has been employed for epitope mapping and for studying protein-protein interactions. In this strategy, target DNA encoding the antigen of interest is partially digested with DNaseI to generate overlapping fragments 50-300 bp long. These fragments are purified, polished with T4 DNA polymerase and cloned at the N-terminus of the coat protein in phage display vectors. The phages produced from the gene-fragment library are then affinity selected on the immobilized antibody whose epitope is to be determined. Bound phages are eluted, and the epitope is determined by nucleotide sequencing of the peptide-encoding fragment. Wang et al. (1995) constructed a gene-fragment phage display library of the outer capsid protein VP5 of Bluetongue virus in a phage vector. The specific phages enriched after affinity selection, on sequencing, yielded the amino acid sequences of the peptides involved in antibody binding. Van Zonneveld et al. (1995) used phagemid vectors to construct an epitope library for human plasminogen-activator inhibitor 1 (PAI-1) and used this library to map the epitope of a monoclonal antibody directed against this protein. The gene-fragment phage display approach was also used for mapping epitopes of polyclonal antibodies (Wilson et al., 1998). A gene-fragment library of the Bordetella pertussis virulence factor filamentous hemagglutinin (FHA) was prepared using phage vectors. The library was affinity selected with rabbit anti-FHA polyclonal antibodies. After analysis, a 90 aa domain within FHA was identified as an immunodominant region. This system was also utilized to map protein-protein interactions (Kiewitz and Wolfes, 1997). DNA segments of the proto-oncogene c-myb were cloned into a modified phagemid system, which allowed for expression in all possible reading frames. The library encompassing all functional domains of the protein was screened with the c-myb co-activator CBP protein. Alignment of the sequences from eluted phages

revealed that amino acids 317-342 of Myb interacted with CBP protein. Furthermore, it was shown that an intermolecular interaction could be detected between the N-terminal Myb domain and the C-terminus (aa 541-567). Three gene-fragment phage display libraries of porcine reproductive and respiratory syndrome virus (PRRSV) were constructed and used for identification of linear B-cell epitopes with sera from experimentally infected pigs (Oleksiewicz et al., 2001). Ten linear epitope sites, 11 to 53 amino acids in length, were identified after three or four rounds of affinity selection. Eight of these ten epitope sites resided within the replicase polyprotein. ELISA using phage-displayed epitope sites as antigen indicated that they might serve as important diagnostic agents. In another study, a gene-fragment phage display library of the VP2 outer capsid protein of African horsesickness virus (AHSV) was constructed (Bentley et al., 2000).

Peptides ranging in size from 30 to 100 amino acids were fused to gIIIp in phage display vector fUSE2. The resulting library was subjected to affinity selection with AHSV-specific polyclonal chicken IgY, polyclonal horse immunoglobulin and a neutralizing monoclonal antibody to AHSV. It was shown that most antigenic determinants that were mapped localized in the N-terminal half of VP2. Important binding areas were mapped with high resolution by identifying the minimum overlapping area of the selected peptides. A comparison of the antigenic regions identified by phage display with corresponding regions on three other serotypes of AHSV revealed two peptides of VP2 with potential to discriminate serologically between AHSV serotypes.
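The idea of locating a binding area from "the minimum overlapping area of the selected peptides" can be sketched as an interval intersection. This is only an illustrative sketch: the function name and the residue ranges are hypothetical and are not taken from the cited study, and it assumes every selected fragment contains the same single epitope.

```python
from typing import Iterable, Optional, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive residue positions on the antigen


def minimum_overlap(fragments: Iterable[Interval]) -> Optional[Interval]:
    """Return the region shared by all selected fragments, or None if there is none."""
    fragments = list(fragments)
    if not fragments:
        return None
    start = max(s for s, _ in fragments)  # latest start among the fragments
    end = min(e for _, e in fragments)    # earliest end among the fragments
    return (start, end) if start <= end else None


# Hypothetical antibody-selected fragments (residue ranges invented for illustration).
selected = [(40, 95), (62, 130), (55, 101)]
print(minimum_overlap(selected))
```

The more independent overlapping clones are recovered, the narrower the shared interval becomes, which is why fragment libraries with many staggered end points give higher mapping resolution.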

Gene fragment libraries have been employed in elucidating the regions of proteins involved in protein-protein interactions. A genome-wide screen to identify novel genes of the human pathogen Group B streptococcus (GBS) involved in binding to fibronectin, which leads to adherence to host cells, was carried out using a gene fragment phage display library of a serotype Ia GBS strain (Beckmann et al., 2002). Fragments of chromosomal DNA, 100 to 1000 bp long, were cloned into the phagemid vector pG3H6 and the resultant library was selected on immobilized fibronectin by four rounds of affinity selection. One of the selected clones showed significant homology to the gene (ScpB) for the GBS C5a peptidase, a surface-associated serine protease that cleaves complement C5a. The results suggested that C5a peptidase not only functions as an enzyme but also mediates adherence to fibronectin. In another study, a whole genome phage display library was constructed using the E. coli genome fragmented by digestion with three restriction enzymes (Yano et al., 2003). These fragments were ligated into


pCANTAB5 vector and the generated library was screened for clones with high affinity to alkaline phosphatase (AP) from calf intestine. After four panning rounds, three of the selected phage clones were shown by ELISA to have specific binding properties towards AP. Robben et al. (2002) constructed a whole genome phage display library of Toxoplasma gondii in which 50-500 bp fragments of T. gondii genomic DNA (80 Mb) were fused to gIIIp of M13. Biopanning of the library on Toxoplasma gondii MAbs resulted in identification of dense granule antigen GRA3.

1.3.2 PHAGE DISPLAYED cDNA LIBRARIES AND THEIR LIMITATIONS

Phage display has been widely used to identify bait-binding antibodies or short peptides from antibody libraries or random peptide libraries, but the potential of phage display as a tool to directly dissect cDNA libraries has been limited for several reasons. First, the efficiency with which the host bacteria secrete the cDNA-gIIIp fusion protein is highly dependent on the cDNA sequence: large cDNA fragments often show unsatisfactory presentation efficiencies compared to shorter inserts, resulting in a disproportionate representation of certain sequences in the library. Second, cDNAs must be in the same reading frame as both the gIIIp signal-peptide-encoding sequence and the gIIIp structural gene for the natural peptide to be displayed. Also, the gene fusion must not contain any in-frame stop codons that would prematurely terminate the fusion protein. Antibody libraries with predictable reading frames can be conveniently fused to gIIIp in the correct frame, whereas cDNA repertoires with unpredictable reading frames and stop codons may interfere with gIIIp expression, resulting in only ~6 % of identified clones encoding real proteins (Faix et al., 2004). The majority of identified non-open reading frames (non-ORFs) encode unnatural short peptides that have minimal implications in protein interaction networks.
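To give a feel for why out-of-frame inserts so often carry in-frame stop codons, the following sketch uses an idealised random-sequence model. The model is an assumption made here for illustration (equal, independent base frequencies, so 3 of 64 codons are stops); it is not the database analysis reported in the literature, and the function name is hypothetical.

```python
def p_at_least_one_stop(length_bp: int) -> float:
    """Probability that one reading frame of a random sequence contains a stop codon.

    Assumes equal, independent base frequencies, so each codon is a stop
    (TAA, TAG or TGA) with probability 3/64.
    """
    codons = length_bp // 3  # only complete codons are counted
    return 1.0 - (61.0 / 64.0) ** codons


for bp in (100, 200, 300):
    print(f"{bp} bp: {p_at_least_one_stop(bp):.3f}")
```

Under this simplified model roughly 96 % of 200 bp non-coding frames contain a stop codon, which is close to the database-derived figure quoted in Section 1.4; real sequences deviate from the model because of base composition and codon bias.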

To circumvent the problem of large cDNA fragments, cDNA libraries are

fragmented prior to cloning assuming that in doing so functional binding


domains can be separated from potentially problematic sequences (McCafferty et al., 2000). However, the majority of the clones in fragmented cDNA libraries appear to be non-functional or contain undesirable stop codons because of the cDNA fragments being out of frame to the N-terminal leader sequence, or gIIIp,

or both. To overcome this limitation, Crameri and Suter (1993) exploited the heterodimeric leucine zipper interaction of Jun and Fos. They fused a PelB signal sequence together with the cysteine-flanked Fos leucine zipper to the 5' end (i.e., N-terminus) of a cDNA library. The Fos-decorated cDNA products were

captured by a modified gIIIp, which had the 30-amino acid long leucine zipper domain of Jun-c flanked by cysteine residues fused to its N terminus. Jun-Fos heterodimerization and disulfide bond formation covalently linked the cDNA product to the gIIIp in the periplasm. This pseudofusion protein could be incorporated into the phage particle (Crameri et al., 1994). The phagemid pJuFo (and its derivatives) representing the embodiment of this innovative concept were mainly used to select IgG binding proteins from cDNA libraries of fungal, bacterial, plant, and human origin (Palzkill et al., 1998; Kleber-Janke et al., 1999;

Crameri et al., 2001). More than 350 IgG-binding molecules were isolated using the pJuFo system and several publications report its successful use.

An alternative approach to circumvent the limitations of N-terminal fusion to the coat protein was followed by Jespers et al. (1995), who described a system for the fusion of cDNA-encoded proteins to the C terminus of the phage protein pVI. This minor coat protein is probably oriented in the phage particle in such a way that its C terminus is solvent-exposed, which offers the opportunity of C-terminal fusion without hampering phage assembly. The system was tested on a cDNA library derived from the parasite Ancylostoma caninum, which was selected against trypsin and factor Xa. In another study (Hufton et al., 1999), pVI display vectors that allow the cloning of cDNA in all three reading frames were constructed and tested with a colorectal cancer tissue-derived cDNA library. This phage library was selected on an anti-β2M antibody and on anti-IgG serum, in which binding

phages carrying full-length β2M and constant Ig domains were selected, respectively. However, this study also revealed that the display rate of the model


protein alkaline phosphatase was decreased about 100-fold with the pVI system compared to the display rate obtained with the pJuFo phagemid. The pJuFo system also showed an approximately 20-fold higher enrichment factor when both

systems were tested for the selection of the CH3 domain of IgG. Recent studies suggest that gIIIp and pVIII can be engineered to allow C-terminal fusions as well

(Fuh and Sidhu 2000; Fuh et al., 2000; Weiss et al., 2003), although these systems have not yet been applied to cDNA screenings.

1.4 ORF SELECTION

Progress in functional genomics is currently hampered on a practical level by the extremely large number of clones that must be generated and screened for the relevant phenotype or function. A representative random genomic expression library must contain sufficient members to span the genome multiple times in order to ensure that each protein-coding segment is present and cloned in its correct frame and orientation. An important and unwanted feature of random construct libraries is the presence of non-coding diversity, i.e., gene fragments unable to produce proteins because the insert is frame-shifted relative to the start or stop codons. These clones dilute the desired ones and result in decreased screening efficiency and reduced coverage of diversity. For fragment libraries generated by physical methods such as sonication, 17 out of 18 clones in the library are unproductive. Therefore, it is important to enrich ORFs before selection. The principle used to construct ORF cDNA libraries is based on the fact that non-ORF cDNA has a high frequency of stop codons. Database analysis revealed that ~96 % of 200 bp non-ORF cDNAs have at least one stop codon (Garufi et al., 2005). This number increases to 99.6 % for non-ORF cDNAs of 300 bp. Several strategies have been explored to address the problem of selecting open reading frames.

1. Helper phage mediated phage display based selection

One of the possible explanations for low percentage of ORFs is that


filamentous phagemids encoding the library-gIIIp fusion protein require a helper phage carrying a predominant wild-type gIIIp gene to supply other proteins for the rescue of the phagemid assembly. It was speculated that avoiding the delivery of wild-type gIIIp during the phage packaging might solve this problem. Consequently, a new type of phage packaging system of hyperphage was developed to eliminate the packaging of wild-type gIIIp (Rondot et al., 2001). Approximately, 60% of cDNA library phages generated with the hyperphage had ORF inserts (Hust et al., 2006).

2. C-terminal ampicillin selection

The concept of C-terminal selection with an antibiotic resistance gene to remove deletion mutants from an antibody library was originally described by Seehaus et al. (1992), using a plasmid in which the antibody library was cloned upstream of a β-lactamase gene. Zacchi et al. (2003) further demonstrated a similar strategy with a phagemid, wherein cDNA inserts were followed by the β-lactamase gene and gIIIp. The β-lactamase gene was flanked by two homologous lox sites. After ampicillin selection, the β-lactamase gene was removed by Cre recombinase-mediated recombination. The removal of the β-lactamase gene was necessary for the efficient display of foreign polypeptides at the gIIIp N-terminus. Faix et al. (2004) pre-selected ORFs with a C-terminal β-lactamase gene in a plasmid. The ORF sequences were extracted from ampicillin-resistant plasmids, re-cloned into a phagemid, and rescued by hyperphage. The library had ~87 % ORF clones. Affinity selection with a monoclonal antibody (MAb) against human placental lactogen identified 8 clones with 6 ORFs encoding lactogen. However, the technical challenge of this strategy is the complicated procedure of generating the ORF phage display cDNA library with a shuttle plasmid. In fact, the cDNA library in the ampicillin-resistant shuttle plasmid in this study had only a limited representation of 1 × 10^6 clones, which restricted the quality of the subsequent phagemid cDNA library.
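The logic behind C-terminal selection can be sketched as a simple check: a downstream resistance gene is translated only if the insert is free of in-frame stop codons and leaves the reporter in frame. The sketch below is a simplified assumption of that principle, not a model of any specific published vector; the function name, the `required_remainder` parameter and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def passes_c_terminal_selection(insert: str, required_remainder: int = 0) -> bool:
    """True if the insert would let a downstream reporter be translated.

    The insert is read from its first base in the frame set by the upstream
    start codon. `required_remainder` is the value of len(insert) % 3 that
    places the reporter in frame (a property of the vector; 0 is assumed here).
    """
    insert = insert.upper()
    if len(insert) % 3 != required_remainder:
        return False  # reporter is shifted out of frame
    codons = (insert[i:i + 3] for i in range(0, len(insert) - 2, 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Hypothetical inserts: an open frame, a frame with a stop, and a frameshift.
for seq in ("ATGGCTGAA", "ATGTAAGCT", "ATGGCTGA"):
    print(seq, passes_c_terminal_selection(seq))
```

Note that the check cannot tell a genuine ORF from a wrong-frame or reverse-orientation fragment that happens to lack stop codons, which is one reason ORF-selected libraries are enriched rather than pure.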


3. C-terminal reporter genes

Several ORF selection vectors have been described that are based on fusing DNA inserts to reporter genes that encode selectable enzymatic functions. The

first described ORF vector, pUK230, contains a lacZ reporter gene located downstream and out of frame with respect to an initiating ATG codon, with the two being separated by restriction enzyme sites (Koenin et al., 1982). Insertion of

an ORF of the correct length allows expression of β-galactosidase, conferring a LacZ+ phenotype on the host. This vector was shown to be useful for isolating ORF clones from a randomly cleaved 10 kb genomic DNA fragment that contained two exons. However, while 13 % of clones exhibited a LacZ+ phenotype (Ruther et al., 1982), only one out of 18 clones (5.6 %) would be expected even from an entirely coding DNA fragment, indicating a large number of false positives. Moreover, given that only ~300 bp (3 %) of the 10 kb fragment is coding DNA, there was clearly a large overrepresentation of positives. Like pUK230, ORF vectors PORF1 and PORF2 also contain a lacZ

reporter gene (Weinstock et al., 1983; Weinstock, 1987). In addition, these

plasmids contain the 5′ end of the Escherichia coli ompF gene, including the promoter and translational start site, located upstream and out of frame with the

promoterless lacZ gene. As with pUK230, insertion of ORF fragments of the correct length to bring the reporter gene in frame with the initiating ATG gives rise to a LacZ+ phenotype. In this case, the resultant polypeptide is a tribrid

protein with the ORF translation product sandwiched between OmpF and β-galactosidase. The PORF vectors were tested with sub-fragments of isolated genes and shown to be useful for generating protein fusions that could be used to raise antibodies. However, the efficacy of using these vectors to distinguish ORFs from non-ORFs was not determined.

More recently, an ORFTRAP vector that contains an intein embedded

within a kanamycin resistance gene was described (Daugelat and Jacobs, 1999). The ORFTRAP system relies upon insertion of an ORF to allow the intein to be translated in its correct frame, resulting in splicing and hence expression of the


KanR gene. However, the ORFTRAP vector was not tolerant of a wide range of

fusions, as evidenced by intolerance for most fragments larger than ∼ 250 bp. Furthermore, a genomic screen of Haemophilus influenzae yielded only 0.5 % of KanR colonies, rather than the predicted 5.5 %, indicating that more than 90 % of the protein fusions were unstable.
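The "more than 90 %" estimate follows from comparing the observed and predicted colony frequencies. Written out, and assuming (as the statement implies) that every missing KanR colony reflects an unstable in-frame fusion:

```latex
\[
1 - \frac{\text{observed}}{\text{predicted}}
  = 1 - \frac{0.5\,\%}{5.5\,\%}
  \approx 0.91
\]
```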

Fusion of the protein to reporter genes such as chloramphenicol acetyltransferase (CAT) (Maxwell et al., 1999), DHFR (Liu et al., 2006) and thioredoxin (Trx) (Tsunoda et al., 2005) has also been reported to enhance the solubility of the fusion partner. Fluorescent protein fusions, most commonly GFP (Waldo et al., 1999) and split GFP (Waldo et al., 2005), have been shown to improve the folding and solubility of recalcitrant proteins from M. tuberculosis. These solubility-based reporter systems can also be applied to enrich ORF clones before the generation of phage display libraries, followed by subsequent cloning into any desired phage display vector. However, these reporter-based systems also have limitations, such as enrichment of truncated false positives fused to an intact reporter, resulting from translational initiation within the target sequence or proteolysis of the target within the host cytoplasm.

4. C-terminal biotin tag

Ansuini et al. (2002) generated an ORF phage display cDNA library in lambda phage with a C-terminal 13-amino acid biotinylation epitope or biotin tag. The cDNA library was fused to the C-terminus of the capsid D protein, followed by the biotin tag. If a cDNA insert is an ORF, the C-terminal tag is expressed and efficiently biotinylated by biotin holoenzyme synthetase (BirA) endogenously present in E. coli (Schatz, 1993). As a result, only the ORF phage clones are labeled with biotin and enriched by binding to immobilized streptavidin. Affinity selection with anti-GAP-43 mAb was used as a model system to evaluate the library. After selection, a total of 34 clones were randomly chosen and analyzed; ~79 % of them were in the correct reading frame, including 7 GAP-43-expressing clones.

1.5 HIGH-THROUGHPUT CLONING METHODS FOR TRANSFER OF OPEN READING FRAMES (ORFs) FROM ONE VECTOR TO ANOTHER

Functional genomics and proteomics offer the promise of examining the roles of all genes and proteins in an organism in a controlled format. Accordingly, researchers around the world have recently exploited the availability of completed genome and cDNA sequences to create comprehensive, arrayed collections of individual cloned genes for a number of organisms. Nearly all methods for studying protein function begin with the expression of protein from cloned copies of the protein-coding sequences. Thus, obtaining a complete set of validated protein-coding clones is the first step toward establishing a functional proteomics platform for any organism of interest. Different protein functional studies demand different protein expression vectors; therefore, a flexible vector system should be used that enables the cloned sequences to be transferred rapidly to any vector. The cloning systems available for high-throughput transfer of clones can be judged on several parameters, including fidelity and efficiency of transfer, ease of use, reliability and stability of the cloning system, and validation of the cloned products (LaBaer et al., 2004). These requirements are met most efficiently by vectors that employ recombinational cloning, a strategy that allows DNA fragments flanked by site-specific recombination sites to be moved from one vector to another in a single-step procedure, in frame and without mutation. Highly efficient site-specific recombination-based systems are available from commercial suppliers, such as the Gateway cloning system from Invitrogen and the Creator cloning system from Clontech; these reactions are simple enough to allow high-throughput (HTP) automation. Once a “master” clone is created, the identical sequences can be transferred easily into all bacterial, mammalian, and viral vectors commonly used for protein analysis in vivo or in vitro.


1. The Gateway Recombinational cloning system

The Gateway recombinational cloning system available from Invitrogen utilizes a modified version of the site-specific recombination system of bacteriophage lambda (λ) (Hartley et al., 2000; Walhout et al., 2000). The Gateway system utilizes a minimal set of the components of the λ system for in vitro transfer of DNA: the λ Integrase protein (Int), the λ Excisionase protein (Xis), the E. coli protein IHF, and the att recombination sequences embedded in the DNA to be recombined. The Gateway system stems from the observation that the lambda att sites can be mutated to generate variants with high specificity and virtually no cross talk, which allows the orientation of cloned DNA to be maintained through vector transfers. Hence, attB1 can recombine with the corresponding attP1 (upstream site on donor vector), but not with attP2 (downstream site on donor vector). To select for the desired recombinant product and against the parental plasmids and undesired recombination intermediates, the Gateway system uses an E. coli death gene, ccdB, in combination with differential drug-resistance markers on the master (Entry) and Destination plasmids. The ccdB gene, taken from the E. coli F plasmid segregation control system, allows for negative selection in E. coli by virtue of its ability to inhibit E. coli DNA gyrase (Bernard and Couturier, 1992). When the products of Gateway recombination reactions are used to transform E. coli, cells transformed by a Gateway Donor or Destination plasmid, or by the cointegrate intermediate of the Gateway recombination reaction, are thus unable to grow. Only the desired recombinant product, which lacks the ccdB gene and has the appropriate drug selection marker (e.g., ampicillin resistance for the expression plasmid product), can give rise to transformants.
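The double selection described above (ccdB counter-selection plus a drug marker) can be sketched as a small truth table. This is an illustrative sketch only: the class and function names are hypothetical, and kanamycin resistance on the Entry backbone is an assumption made for the example, since the text specifies only ampicillin resistance for the expression product.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Plasmid:
    name: str
    carries_ccdB: bool
    resistances: FrozenSet[str]


def gives_colonies(plasmid: Plasmid, drug: str) -> bool:
    """A transformant grows only if it lacks ccdB and resists the selection drug."""
    return (not plasmid.carries_ccdB) and (drug in plasmid.resistances)


# Species present after an LR reaction. Kanamycin on the Entry backbone is an
# assumption for illustration; the text only specifies ampicillin for the product.
AMP, KAN = "ampicillin", "kanamycin"
lr_mixture = [
    Plasmid("Entry clone (unreacted)", False, frozenset({KAN})),
    Plasmid("Destination vector (unreacted)", True, frozenset({AMP})),
    Plasmid("Cointegrate intermediate", True, frozenset({AMP, KAN})),
    Plasmid("By-product (ccdB on Entry backbone)", True, frozenset({KAN})),
    Plasmid("Expression clone", False, frozenset({AMP})),
]

survivors = [p.name for p in lr_mixture if gives_colonies(p, AMP)]
print(survivors)
```

Neither criterion alone is sufficient: the unreacted Entry clone lacks ccdB and is removed only by the drug, while the unreacted Destination vector carries the right marker and is removed only by ccdB.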

The primary means of creating Gateway master clones is BP

recombination, a reaction in which the ORF with flanking attB sites (usually generated by PCR) is recombined into a vector with the corresponding attP sites. BP recombination is accomplished by a simple in vitro recombination reaction that requires the lambda Int and the IHF protein (a mixture marketed as BP Clonase by Invitrogen), and is usually complete within hours. For large-scale


projects requiring high efficiency, it can be advantageous to allow an overnight incubation. Gateway BP recombinational cloning is both efficient and relatively insensitive to target DNA concentration, making it amenable to automation. However, this method of capture does exhibit some size bias, with reduced efficiency for fragments larger than 3 kb (D. Hill, pers. comm.; A. Rolfs, pers. comm.; G. Marsischky and J. LaBaer, unpubl.).

The transfer of cloned ORF sequences into Gateway expression clones is accomplished by LR recombination, a simple in vitro reaction that requires the phage lambda Int and Xis proteins together with the IHF protein (LR Clonase), an expression plasmid modified with attR sites, and an Entry plasmid ORF clone in which the ORF to be transferred is flanked by attL sites. As with BP recombination, the LR recombination reaction is complete within hours. The LR recombination reaction is streamlined, easily adapted to robotic manipulations, and exceedingly reliable, with the efficiency of transfer of cloned DNA to expression vectors approaching 100 %.
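The pairing rules of the BP and LR reactions can be summarised in a few lines. The sketch below is a bookkeeping aid under simplifying assumptions (sites are represented only by their class letter and specificity number; the function name and lookup table are hypothetical), not a description of the enzymology.

```python
from typing import Optional, Tuple

# Partner and product classes for the two Gateway reactions.
REACTIONS = {
    frozenset({"B", "P"}): ("L", "R"),  # BP reaction: attB x attP -> attL + attR
    frozenset({"L", "R"}): ("B", "P"),  # LR reaction: attL x attR -> attB + attP
}


def recombine(site_a: str, site_b: str) -> Optional[Tuple[str, ...]]:
    """Return the product sites, or None if the two att sites do not recombine.

    Sites are written like "attB1": a class letter (B, P, L, R) followed by a
    specificity number. Only sites with the same number are treated as partners.
    """
    class_a, num_a = site_a[3], site_a[4:]
    class_b, num_b = site_b[3], site_b[4:]
    if num_a != num_b:
        return None  # e.g. attB1 does not react with attP2
    products = REACTIONS.get(frozenset({class_a, class_b}))
    if products is None:
        return None  # e.g. two attB sites are not partners
    return tuple(f"att{c}{num_a}" for c in products)


print(recombine("attB1", "attP1"))
print(recombine("attB1", "attP2"))
print(recombine("attL2", "attR2"))
```

Because site 1 and site 2 react only with their own partners, an ORF flanked by sites 1 and 2 keeps its orientation through every transfer.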

Once a library of master clones is established, it is straightforward to

convert them to any expression vector that has been modified for use in the Gateway system. They allow expression of proteins in a wide range of organisms (including bacteria, yeast cells, insect cells, and mammalian cells), using both plasmid and viral expression vectors (adenovirus, retrovirus), with a variety of available promoters.

2. Clontech cloning system: In-fusion technology

The Clontech cloning approach uses two different enzyme systems for the capture of PCR products to create master clones and for the transfer of genes from master clones to expression clones. For the capture reaction, Clontech uses a proprietary enzyme, In-Fusion, which mediates DNA cloning by the use of short stretches of sequence homology. The Clontech Creator cloning system, which is used to transfer the cloned ORFs from master clones to expression vectors, is


based on the Cre-loxP - based site-specific recombination system of bacteriophage P1 (Sternberg et al., 1981). Both Clontech systems are well suited to automated methods.

As with the Gateway system, Clontech master (Donor) clones can be assembled using restriction enzymes; however, for large-scale cloning projects, the In-Fusion system is the straightforward choice. The In-Fusion system uses a proprietary enzyme that has intrinsic strand displacement and exonuclease activities and that promotes the pairing of two linear DNA fragments when their ends share the same sequence (any homologous sequence suffices). Once transformed into bacteria, the resected and paired DNA fragments are readily converted into circular plasmids. In this case, by ensuring that the ends of each PCR product contain 15 bp of homology to the corresponding ends of the vector, the PCR products are captured readily into the vector. One advantage of the In-Fusion reaction is that it is agnostic with respect to the sequences used for recombination. Thus, it can be used as a general method for inserting DNA fragments into any vector. In-Fusion recombinational cloning entails a brief in vitro incubation in which the ORF-specific PCR product and pDNR-Dual are mixed with the In-Fusion enzyme. This results in the amplified DNA being cloned between the loxP sites of pDNR-Dual. A simple blue-white screen identifies E. coli transformants of pDNR-Dual with cloned inserts. Because the cloned DNA disrupts the vector lacZ gene, clones with inserts are easily identifiable as white colonies on plates containing IPTG and X-Gal. Advantageously, the In-Fusion cloning reaction exhibits only minimal size bias.
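The requirement that each PCR product end carries 15 bp of homology to the vector translates into primers with a vector-derived 5' tail. The following is a minimal sketch of that primer layout; the function names, the 20-nucleotide gene-specific length and all sequences are hypothetical choices for illustration, not values from the supplier's protocol.

```python
from typing import Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_fusion_primers(vector_left: str, vector_right: str, orf: str,
                      homology: int = 15, anneal: int = 20) -> Tuple[str, str]:
    """Design forward and reverse primers carrying vector-homologous 5' tails.

    vector_left / vector_right are the top-strand vector sequences immediately
    upstream and downstream of the linearisation site.
    """
    if len(vector_left) < homology or len(vector_right) < homology:
        raise ValueError("vector flanks shorter than the requested homology")
    orf = orf.upper()
    # Forward primer: last `homology` bases of the left flank + start of the ORF.
    forward = vector_left[-homology:].upper() + orf[:anneal]
    # Reverse primer: reverse complement of (end of the ORF + start of the right flank).
    reverse = reverse_complement(orf[-anneal:] + vector_right[:homology].upper())
    return forward, reverse


# Invented sequences, for illustration only.
left_flank = "GGTACCGAGCTCGGATCCACTAGT"
right_flank = "GAATTCTGCAGATATCCATCACAC"
example_orf = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCTAA"
fwd, rev = in_fusion_primers(left_flank, right_flank, example_orf)
print(fwd)
print(rev)
```

Because only the tails depend on the vector, the same gene-specific portions can be reused with different tails to target other linearised vectors.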


OBJECTIVES AND SCOPE


Phage display is a technique where DNA encoding a peptide or protein, when cloned in frame with a coat protein of a bacteriophage, is displayed on the phage surface, while the encoding DNA is encapsulated in the same phage. This allows for a physical linkage between the phenotype (the displayed protein) and the genotype (the cloned DNA sequence). Because of this unique property, a phage displaying a peptide or protein can be enriched from a milieu of millions of other phages by a process of panning (affinity selection) on immobilized baits. Several bacteriophages like Lambda, T7, T4 have been employed for displaying

proteins. However, filamentous M13 phage has been most successfully exploited

for surface display. Although all the five coat proteins of M13 phage namely, gIIIp, gVIIIp, gVIIp, gVIp and gIXp have been shown suitable for functional display, gIIIp has been most extensively used. The phage display is a powerful technique for studying protein ligand interactions and can be employed to understand gene functions and thus, can be a prime technology for “Functional genomics”. Phage display has been employed for creating large libraries of antibodies and small peptide (random peptide libraries). The former have been successfully used for selecting high affinity antibody fragments against virtually any molecule including self-antigens (Hoogenboom et al., 2002; Kretzschmar and Ruden et al., 2002), and the latter has been used for identifying peptide mimics that can bind any receptor and such molecules have been used as inhibitors of interaction between two macromolecules (James et al., 2009; Mandava et al., 2004; Hekim et al., 2006).

Another major application of phage display has been in constructing cDNA and fragmented whole genome libraries where use of gIIIp display vectors has been challenging due to the presence of stop codon and polyA tails in the case of oligo dT tailed cDNA. This problem can be alleviated using random primed cDNA fragments or fragmented genomes, which can be cloned between the signal sequence and the gIIIp coding sequence. However, in this approach only a small fraction (1/18 = 5.55 %) of gene fragments are in reading frame with respect to signal sequence and gIIIp coding sequence to code for gIIIp fusion protein that matches with the proteome of the organism. Thus, majority of the phages either


do not display any protein, or display a protein that does not correspond to the proteome / ORFeome of the organism; the latter arise from fragments in the wrong frame that happen to lack internal stop codons. This situation can be improved if phages not displaying any protein are eliminated, enriching for phages that display proteins matching the real proteome, so that the phage library can be used directly for affinity selection on desired baits.
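The 1/18 figure can be reproduced by enumerating the ways a random fragment can be inserted. The sketch assumes the usual reasoning behind this number (two orientations, three possible frames at the junction with the signal sequence and three at the junction with gIIIp, all equally likely and independent); the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ORIENTATIONS = ("forward", "reverse")
ENTRY_FRAMES = (0, 1, 2)   # frame of the insert relative to the signal sequence
EXIT_FRAMES = (0, 1, 2)    # frame in which translation runs into gIIIp

all_cases = list(product(ORIENTATIONS, ENTRY_FRAMES, EXIT_FRAMES))
# Only a forward insert that is in frame at both junctions yields a native fusion.
productive = [c for c in all_cases if c == ("forward", 0, 0)]

fraction = Fraction(len(productive), len(all_cases))
print(fraction, f"= {float(fraction):.2%}")
```

The estimate applies to fragments that are entirely coding; for whole genomes containing non-coding DNA, the productive fraction is lower still.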

The whole genome phage display libraries of several organisms have been reported, with applications that include epitope mapping (Fehrsen and du Plessis, 1999; Palzkill et al., 1998; Robben et al., 2002; Wilson et al., 1998), identification of ligands binding to the yeast Abp1-SH3 domain (Fazi et al., 2002), identification of Gal80p-interacting proteins using a Saccharomyces cerevisiae whole genome phage displayed library (Herteveldt et al., 2003), identification of binding partners (Protein A) for an immobilized ligand (IgG protein) using a phage displayed genome fragment library of Staphylococcus aureus (Jacobsson and Frykberg, 1995; 1996) and screening of immunogenic polypeptides (Hust et al., 2008). Our lab constructed a large (10^7 independent clones) whole genome fragment library of M. tuberculosis containing fragments in the size range 100-600 bp cloned in a gIIIp-based phagemid vector (Kulshrestha A, Ph.D thesis, 2005). However, the majority of phages in this library displayed unwanted polypeptides, as only 1 out of 18 fragments leads to a functional polypeptide corresponding to the M. tuberculosis proteome; this, in combination with low display density, led to the failure to select the desired phages during affinity selection. M13 phage display systems based on gIIIp allow display of only a few copies per phage and are therefore used where high affinity interactions are required. For high-density display of proteins, lambda phage based gene fragment libraries have also been used extensively. Gene fragment libraries on lambda phage have been constructed using genomic DNA of Streptococcus pneumoniae (Beghetto et al., 2006) and Mycoplasma pneumoniae (Montagnani F et al., 2010) with the aim of identifying B cell epitopes for developing tests or identifying antigens as vaccine candidates. Efficient cloning of DNA libraries in lambda DNA is, however, quite challenging. Moreover, all these studies have used ORF unselected gene


fragments or have employed inefficient methods for ORF selection. Construction of an ORF-selected library should greatly enhance the quality of selection, as it would contain far fewer non-genic clones (clones having fragments in frame for display but not matching the proteome of the organism).

Further, the ORF-selected libraries in any format are a great resource for

genome-wide proteome analysis. Literature shows that there have been efforts to select ORF either by cloning gene fragments in fusion with β-lactamase to select

for ampicillin resistant clones as carrier of reading frames (Zacchi et al., 2003) or by cloning upstream of reporter genes such as LacZ or GFP to identify blue or fluorescent clones as carriers of reading frames (Koenin et al., 1982; Waldo et al.,

1999), and even a M13 helper phage which is genotypically devoid of gIIIp has been employed for ORF selection (Hust et al., 2006). However, all these systems have problems and more so, the ORF selected in the process cannot be easily transferred to other vector systems. In addition, the phage display based protein interaction studies can be greatly benefitted by high avidity effect such as by high-density phage display in phage lambda.

Based on the available literature and after identifying the need for improvement in Phage display based functional genomics, the work was planned to achieve the following objectives:

• Development of phagemid based system to construct ORFeome library of M. tuberculosis H37Rv on M13 phage using novel helper phage AGM13.
• Characterization of ORFeome library for the identification of epitopes recognized by various MAbs raised against different mycobacterial proteins.
• Novel and efficient genome-wide transfer of ORFs to other vectors.
• Development of new lambda phage based display vectors for high-density display of libraries.
