Characterization of the consensus DNA binding site for human

KRAB-zinc finger protein KS1

Jennifer M. Grants

Thesis submitted to the

Department of Biochemistry & Microbiology

in partial fulfillment of the requirements for the degree of

Bachelor of Science (Hon.)

University of Victoria

Victoria, British Columbia, Canada

April 2010

© Jennifer M. Grants, April 2010

Abstract

The zinc finger domain is a common eukaryotic DNA binding motif, and is often

found in transcription factors and transcriptional repressor proteins. KS1 (KRAB Suppressor

of Transformation 1) is a member of the Krüppel-Associated Box (KRAB) family of zinc

finger proteins. Its consensus DNA binding site (KSE) contains two regions of high sequence

conservation: a 5 base pair subsite termed the A box, and an 8 base pair subsite termed the B

box. The current study characterized this DNA binding site quantitatively, using

nitrocellulose filter binding assays (FBAs) to determine equilibrium constants for KS1

binding to the consensus site and scanning mutants. Numerous diagnostic and

troubleshooting procedures were utilized in the cloning and protein purification of the KS1

zinc finger region. A modified FBA protocol substituting polyA for poly(d[IC]) as the

carrier was developed in order to measure specific protein-DNA binding. The apparent

dissociation constant (Kd) for KS1 binding to KSE was 7.10 ± 1.28 nM. Scanning mutations

in the A or B boxes of KSE reduced the affinity of KS1 by 10- or 50-fold, respectively, while

mutations in flanking regions caused an approximate 2-fold reduction. Binding of KS1 to a

truncated binding site could not be detected. A model for KS1-DNA binding was proposed,

in which all zinc fingers may potentially bind to the entire 30 base pair consensus site, but the

strongest contacts are made by a subset of zinc fingers that bind to the A and B box sites.

Acknowledgements

I would like to thank Dr. Paul Romaniuk for providing me with so many opportunities

over the past three years, and for his patience and guidance during my time in his lab. I also

thank Tristen Weiss for passing on her expertise in laboratory protocols, and I thank Tristen

Weiss and Erin Flanagan for many helpful discussions. Finally, I thank my parents, Jan and

Arvid Grants, for their continuing support throughout my time at university.

Table of Contents

Abstract
Acknowledgements
List of Tables
List of Figures
Introduction
Materials and Methods
• PCR amplification of KS1 gene constructs from a pTOPO vector
• Diagnostic restriction digests
• Diagnostic PCR using universal primers
• Plasmid sequencing
• Cloning of KS1 gene constructs from human genomic DNA
• Plasmid isolation
• Protein purification
• Radiolabelling of DNA
• Standard nitrocellulose filter binding assay
• DNA excess nitrocellulose filter binding assay
• Cloning of KSE and mutant oligonucleotides
Results
• The pTOPO vector did not contain the KS1 gene
• KS1 zinc fingers can be cloned and purified from human genomic DNA
• DNA binding by KS1 zinc fingers is not sequence-specific under assay conditions
• Effect of monovalent salt concentration
• Effect of competitor DNA or RNA
• Kd values for specific binding determined by FBAs with polyA carrier
• Poly(d[IC]) interferes with binding reported in the literature
Discussion
• KS1 protein constructs
• Filter binding assays
Conclusion
References

List of Tables

Table 1: PCR programs and sequence of KS1-specific primers employed in the amplification of two KS1 gene constructs
Table 2: Standard Taq PCR program and sequence of pTOPO universal PCR primers
Table 3: Sequence of pET universal PCR primers
Table 4: Expected and actual sizes of restriction fragments in the diagnostic digest
Table 5: Sequences of oligonucleotides employed in standard and DNA excess FBAs
Table 6: Kd values and relative affinities as determined by standard FBAs
Table 7: Relative affinities as determined by DNA excess FBAs
Table 8: Relative affinities of KSE scanning mutants as determined by standard FBAs with polyA carrier
Table 9: Sequences of oligonucleotides used by Gebelein and Urrutia (2001)
Table 10: Dissociation constants of literature oligonucleotides as determined by standard FBA with polyA carrier

List of Figures

Figure 1: Crystal structures of Zif268 bound to DNA
Figure 2: Schematic of hydrogen bonds formed in C-capping and stabilization of the TGEKP linker
Figure 3: Comparison of consensus sequences for the KS1 DNA binding site
Figure 4: Hydrogen bond donors and acceptors presented at the major groove by DNA base pairs
Figure 5: Diagnostic PCR reactions with the putative pTOPO-KS1 plasmid
Figure 6: Diagnostic restriction digests of the putative pTOPO-KS1 plasmid
Figure 7: PCR of KS1-zf1 and KS1-zf2 constructs from human genomic DNA
Figure 8: Colony PCR identification of clones positive for KS1 gene constructs
Figure 9: Confirmation of colony PCR results using specific and universal primers
Figure 10: BLASTp (NCBI) analysis of KS1-zf1 #18(4)
Figure 11: Induction and purification of KS1-zf1 and KS1-zf2
Figure 12: Effect of competitor DNA or RNA on apparent Ka, as determined by standard FBA
Figure 13: Standard FBAs with 1 µg/ml polyA for KS1-zf1 binding to scanning mutants M1-M4
Figure 14: Comparison of standard FBAs with 1 µg/ml poly(d[IC]) or polyA for KS1-zf1 binding to literature mutants
Figure 15: Comparison of IC and GC base pairs

Introduction

Eukaryotic gene expression is a tightly controlled process, in which each step can be

up- or downregulated, at a gene-specific level or on a global scale within the cell. The first

step in this process, transcription, is modulated by a variety of transcription factors ranging

from the general factors needed to guide RNA polymerase to promoters, to highly specific

factors that act only at a unique address within the genome. In general, transcription factors

serve as adaptors for other proteins which lack the ability to bind to DNA on their own, a

function that requires a DNA binding domain and one or more protein binding domains.

The zinc finger is a common DNA binding motif which consists of approximately 30

amino acids folded into two N-terminal β-strands and a C-terminal α-helix, surrounding a

central zinc ion. This zinc ion is generally coordinated by four cysteine and/or histidine

residues; the so-called “classical” arrangement of ligands is C2H2 (Iuchi, 2001). From the

perspective of DNA binding, the key residues in a zinc finger motif are found in the α-helix,

as these form sequence-specific hydrogen bonds with DNA bases. When a zinc finger binds

to DNA, the α-helix is brought into close contact with the major groove so that these bonds

may form. Crystal structures have shown that each zinc finger typically contacts three to four

DNA bases, and while the majority of contacts are formed with one strand of the DNA

duplex, some cross-strand contacts are also formed (Isalan et al., 1997).

Within transcription factors, zinc fingers are not found as discrete units, but instead

are organized into domains of tandem zinc fingers. Perhaps one of the best characterized zinc

finger transcription factors is Zif268 (also known as EGR1), which contains three tandem

zinc fingers. When bound to DNA, the zinc finger domain is said to be oriented in an

“antiparallel” fashion to the DNA, meaning that its N-terminus is directed toward the 3’ end

of the DNA strand to which the majority of bonds are formed, while its C-terminus is found

at the 5’ end (Isalan et al., 1997). In order for each zinc finger to make specific contacts with

DNA, the entire domain must wrap around the major groove, as is shown in Figure 1(a).

Contacts are made with nine consecutive bases: each zinc finger forms hydrogen bonds with

a three base pair subsite on one strand of the double helix, as well as reaching over and back

to a fourth base on the opposite strand, upstream of the core subsite (Isalan et al., 1997).

Five amino acid linkers, with the canonical sequence TGEKP, separate individual zinc

fingers and serve to control their orientation and spacing for more facile interaction with

DNA (Peisach and Pabo, 2003).

Transcription factors may have many more than just three zinc fingers. In these

polyzinc finger proteins, individual zinc finger motifs still interact specifically with bases at

the major groove; however, natural polyzinc finger proteins generally do not wrap

indefinitely around the major groove. Instead, some zinc fingers adopt the role of a

“structured linker” that crosses the minor groove, bringing the next zinc finger back to the

major groove on the same side of the DNA duplex. Notably, zinc finger proteins which use

this mode of binding recognize discontinuous DNA binding sites. Xenopus transcription

factor IIIA (TFIIIA), a nine-zinc finger protein, provides a well characterized example of this

type of DNA binding: zinc fingers 1 to 3 bind to the so-called “A-block” of the internal

control region of the 5S rRNA gene, finger 5 binds to the “intermediate element”, and fingers

7 to 9 bind to the “C-block”, while fingers 4 and 6 cross the minor groove (Wuttke et al.,

1997).

Artificial polyzinc finger proteins with six or nine zinc fingers have been designed to

recognize continuous binding sites of up to 18 or 30 base pairs, respectively, by wrapping

around the major groove of the DNA duplex with all zinc fingers (Kamiuchi et al., 1998).

However, this mode of binding requires an extended period of time to reach equilibrium, and

gives dissociation constants that are considerably higher than expected for the number of zinc

fingers present. Artificial zinc finger proteins with much lower dissociation constants can be

achieved by using longer linkers (Kim and Pabo, 1998), or “structured linkers” consisting of

a non-DNA-binding zinc finger (Moore et al., 2001).

The problem of recognizing an extended DNA binding site stems from a difference in

the periodicities of DNA and the zinc finger domain. When a three-zinc finger protein such

as Zif268 binds to DNA, the double helix is slightly underwound (Peisach and Pabo, 2003).

The greater the number of zinc fingers that bind to the major groove, the longer the segment

of DNA that must unwind, and so the entire complex is strained. This raises the question: why

did the canonical linker between zinc fingers evolve to be only five amino acids long? If the

linker were longer, it would allow the zinc finger domain to conform to the natural

periodicity of B-DNA, which would be particularly beneficial in polyzinc finger proteins.

However, a crystal structure of two individual Zif268 zinc finger domains bound to adjacent

sites on DNA (Figure 1(b)) demonstrated that the distance spanned by a TGEKP linker is

naturally the optimal distance between zinc fingers in a polyzinc finger complex. It is

assumed that longer linkers increase the flexibility of the protein in solution, to the point that

DNA binding carries a high entropic penalty (Peisach and Pabo, 2003).

Figure 1: Crystal structures of Zif268 bound to DNA. (a) A single Zif268 zinc finger domain bound to DNA. Zinc fingers are separated by canonical TGEKP linkers, and the double helix is slightly unwound to 11.3 bp/turn (Elrod-Erickson et al., 1996). (b) Two Zif268 zinc finger domains bound to DNA. Individual domains bind such that they are separated by the distance spanned by a canonical TGEKP linker (Peisach and Pabo, 2003). N- and C-termini of the zinc finger proteins are indicated by letters N and C, and 5’ and 3’ ends of the DNA strand of interest are indicated by the appropriate numbers. [Image not reproduced; annotations mark the primary contacts with the brown and pink DNA strands.]

TGEKP linkers make significant enthalpic contributions to the free energy of DNA

binding, by stabilizing the zinc finger α-helices and locking the zinc finger domain in its

DNA binding conformation. Upon DNA binding, the linker stabilizes the C-terminus of the

α-helix in the preceding zinc finger domain, through an interaction referred to as “C-

capping”. Figure 2 is a schematic of this interaction, in which the glycine α-amino proton

forms a key hydrogen bond with the α-carbonyl oxygen of the third residue from the C-

terminal end of the α-helix, which is not stabilized by internal hydrogen bonds within the α-

helix itself. Also shown in Figure 2 is an internal hydrogen bond within the linker, between

the glutamate α-amino proton and the threonine Oγ, which converts the linker from a flexible,

unstructured region to a rigid structure upon DNA binding (Laity et al., 2000). Due to this

dramatic increase in protein rigidity upon DNA binding, low entropy in solution is

favourable. Thus, these two hydrogen bonds likely provided selective pressure for the

maintenance of both the sequence and length of the TGEKP linker throughout evolution. It

should be noted that similar N-capping interactions often form as well, to stabilize the N-

terminal end of a zinc finger α-helix; however, these interactions involve other residues

within the zinc finger, and not linker residues (Laity et al., 2000).

Figure 2: Schematic of hydrogen bonds formed in C-capping and stabilization of the TGEKP linker. The C-capping hydrogen bond (dashed arrow) forms between the glycine α-amino proton and the α-carbonyl group of the third amino acid from the C-terminal end of the α-helix. An additional hydrogen bond (dotted arrow) between the glutamate α-amino proton and the threonine Oγ stabilizes the linker in a rigid conformation (Laity et al., 2000). [Image not reproduced; labels: Thr, Gly, Glu, α-helix residues, linker residues.]

The protein which is investigated in this study, KS1 (KRAB suppressor of

transformation 1, also referred to in the literature as ZNF382), is an interesting zinc finger

protein in many regards. It contains a HCH2 zinc finger which is separated by a 56-amino

acid spacer from nine tandem C2H2 zinc fingers at its C-terminus (Luo et al., 2002); refer to

the schematic following this paragraph for the domain organization of KS1. Zinc finger 1 has been referred to

as a vestigial zinc finger (Luo et al., 2002), implying that its DNA binding activity has been

abolished by a cysteine to histidine mutation which rendered it a HCH2 zinc finger. However,

histidine is also capable of coordinating zinc, and experimental evidence to date has not

demonstrated definitively whether or not this zinc finger is capable of binding DNA.

Interestingly, a KS1 homolog in rats (83% amino acid identity in the zinc finger domain) has

ten C2H2 zinc fingers and a true vestigial zinc finger within the 56-amino acid spacer, in

which a cysteine to aspartate mutation and a histidine to leucine mutation prevent the

coordination of a zinc ion (Gebelein et al., 1998). In human KS1, this vestigial zinc finger

within the spacer retains only one histidine, as it has the same C/D and H/L mutations as the

rat homolog, as well as an additional cysteine to tryptophan mutation (Luo et al., 2002; refer

to Figure 10 for clarification). The remaining nine zinc fingers of human KS1 follow the

consensus sequence CX2CX3FX5LX2HX3H (Luo et al., 2002), which conforms to the

classical C2H2 zinc finger consensus, CX2-4CX12HX2-6H (Iuchi, 2001). A rather unusual

feature of the KS1 zinc fingers is that most are separated by canonical TGEKP linkers (Luo

et al., 2002); this feature suggests that C-capping may contribute significantly to the strength

of KS1-DNA binding.
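The relationship between the two consensus patterns quoted above can be checked mechanically. The sketch below is only an illustration: it writes each consensus as a regular expression and tests a placeholder finger in which every variable position X is filled with alanine. The placeholder sequence, the function name, and the choice of which cysteine and histidine to mutate in the "vestigial" example are assumptions for illustration, not sequences taken from KS1 or its rat homolog.

```python
import re

# Classical C2H2 consensus quoted in the text: C-X(2-4)-C-X(12)-H-X(2-6)-H
CLASSICAL_C2H2 = re.compile(r"C.{2,4}C.{12}H.{2,6}H")
# KS1 finger consensus quoted in the text: C-X2-C-X3-F-X5-L-X2-H-X3-H
KS1_FINGER = re.compile(r"C.{2}C.{3}F.{5}L.{2}H.{3}H")


def placeholder_finger(x="A"):
    """Build a hypothetical finger from the KS1 consensus, filling every X with `x`."""
    return "C" + x * 2 + "C" + x * 3 + "F" + x * 5 + "L" + x * 2 + "H" + x * 3 + "H"


finger = placeholder_finger()
# Hypothetical vestigial finger: first Cys -> Asp and last His -> Leu.
# Which ligand positions are actually mutated is an assumption made here.
vestigial = "D" + finger[1:-1] + "L"

print("KS1 consensus matched:      ", bool(KS1_FINGER.fullmatch(finger)))
print("classical consensus matched:", bool(CLASSICAL_C2H2.fullmatch(finger)))
print("vestigial finger matched:   ", bool(CLASSICAL_C2H2.fullmatch(vestigial)))
```

The spacing between the second cysteine and the first histidine in the KS1 pattern (3 + 1 + 5 + 1 + 2 residues) adds up to the 12 residues required by the classical pattern, which is why any sequence matching the KS1 consensus also matches the classical one.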

Schematic of the domain organization of the KS1 protein. From N-terminus (N) to C-terminus (C): KRAB domain, zinc finger 1, 56 amino acid spacer, zinc fingers 2 to 10.

The N-terminus of KS1 contains a Krüppel-associated box (KRAB) domain, which

recruits proteins involved in transcriptional repression, such as KAP-1 (Luo et al., 2002).

KS1 was the first KRAB-zinc finger protein proven capable of bringing about site-specific

transcriptional repression in vivo by binding to a high affinity site determined in an in vitro

random oligonucleotide selection process. Interestingly, the consensus binding site for KS1

was determined to be 27 base pairs long, and repression of promoters containing this

sequence required the presence of zinc fingers 2 through 10 (Gebelein and Urrutia, 2001).

Thus, prior research indicates that all nine zinc fingers may bind to DNA by wrapping around

the double helix at the major groove. If this postulation can be proved, then KS1 would be

the first example of a natural polyzinc finger protein to use such a mechanism of DNA

binding.

To date, the DNA binding site of KS1 has largely been studied in an indirect and/or

qualitative manner. Electrophoretic mobility shift assays (EMSAs) and reporter gene assays

were used to demonstrate that KS1 binds to the consensus site determined by random

oligonucleotide selection (Gebelein and Urrutia, 2001); however, these assays could not be

used to determine the equilibrium constants for the binding of KS1 to its consensus site or

mutants. An additional pitfall of these studies was that the consensus KS1 binding element

(KBE) was derived from 31 high affinity oligonucleotide sequences identified in an in vitro

selection process, but it was not taken into account that some of these sequences were

identical; therefore, certain bases within the KBE were unduly biased. In the current study, a

slightly modified consensus KS1 sequence element (KSE) was used instead, which was

derived from one representative of each unique sequence identified in the original random

oligonucleotide selection process. The differences between KBE and KSE are outlined in

Figure 3; notably, KBE is 27 base pairs long, while KSE is a 30 base pair consensus

sequence. If each zinc finger is presumed to contact consecutive groups of three base pairs, as

the model for Zif268 binding would indicate (Isalan et al., 1997), then the length of KSE

would suggest that all ten KS1 zinc fingers are involved in DNA binding.
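The arithmetic behind this inference is simple enough to write down. The following sketch assumes the Zif268-style model described above, with one finger per three base pair subsite and no linker fingers crossing the minor groove; the function name is introduced here for illustration only.

```python
BP_PER_FINGER = 3  # Zif268-style assumption: one finger per 3 bp subsite


def fingers_spanned(site_length_bp, bp_per_finger=BP_PER_FINGER):
    """Number of fingers needed to cover a continuous site, rounded up."""
    return -(-site_length_bp // bp_per_finger)


for name, length in (("Zif268 site", 9), ("KBE", 27), ("KSE", 30)):
    print(f"{name}: {length} bp -> {fingers_spanned(length)} fingers")
```

Under this assumption the 27 base pair KBE corresponds to nine fingers and the 30 base pair KSE to ten, which is the basis for the suggestion that all ten KS1 zinc fingers could be involved.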

KBE  TCCTACAGTACCAACCCTACAGAGTAA
     AGGATGTCATGGTTGGGATGTCTCATT

KSE  CTATCATACTACCCACCCTACAGATGCACA
     GATAGTATGATGGGTGGGATGTCTACGTGT

Figure 3: Comparison of consensus sequences for the KS1 DNA binding site. KBE was

determined by Gebelein and Urrutia (2001) from 31 sequences identified in a random

oligonucleotide selection process, not all of which were unique. KSE was used in this study,

and was derived from only the unique sequences identified by the Gebelein group;

differences from KBE are indicated in red. The A and B box regions are underlined, and

other conserved residues of KBE are highlighted in gray.
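As a quick consistency check on the sequences in Figure 3, the sketch below confirms the stated lengths of KBE and KSE and verifies that each bottom strand is the base-by-base complement of its top strand as printed. The dictionary and function names are introduced here for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Top and bottom strands exactly as printed in Figure 3.
SITES = {
    "KBE": ("TCCTACAGTACCAACCCTACAGAGTAA", "AGGATGTCATGGTTGGGATGTCTCATT"),
    "KSE": ("CTATCATACTACCCACCCTACAGATGCACA", "GATAGTATGATGGGTGGGATGTCTACGTGT"),
}


def complement(seq):
    """Return the base-by-base complement, written in the same left-to-right order."""
    return seq.translate(COMPLEMENT)


for name, (top, bottom) in SITES.items():
    print(f"{name}: {len(top)} bp, strands complementary: {complement(top) == bottom}")
```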

Despite these differences, KBE and KSE share two common motifs, termed the A and

B box regions. These were highly conserved among the 31 sequences identified by random

oligonucleotide selection, and this is reflected by the fact that KBE and KSE differ by only

one base pair in the A box and are identical in the B box (Figure 3). Despite a high degree of

conservation, preliminary mutagenesis studies of KBE indicated that while the A and B

boxes participate in protein-DNA binding, KS1 likely makes contact with bases flanking

these sites as well. That is, mutating one or both of the A and B boxes appeared to diminish

the affinity of protein-DNA binding, whereas deleting five or six base pairs at the 5’ or 3’

end of KBE abrogated binding entirely (Gebelein and Urrutia, 2001). This seemed to indicate

that KS1 strictly requires a 27 base pair minimum binding site; however, the study lacked

both quantitative data and a complete set of mutants. The strength of the binding interaction

observed between KS1 and mutants of the KBE A and/or B boxes could have been hundreds

or thousands of times weaker than binding to the consensus KBE, which would render the

binding interactions irrelevant in vivo. Indeed, KBE scanning mutants were not tested in

reporter assays; deletions of KBE were the only mutants investigated. In addition, a complete

set of mutants should have included scanning mutations in the 5’ and 3’ ends, as these would

have ruled out the possibility that the deletion mutations simply eliminated residues which

were crucial to high-affinity binding.

The current study provides a first step toward a more complete set of DNA mutant

data. The mutants examined included scanning mutations of the A or B boxes, as well as scanning mutations

in the 5’ and 3’ flanking regions of KSE. In this way, it was possible to assess the

contribution of each of these key regions to the free energy of protein-DNA binding. By

using scanning mutations of the 5’ and 3’ flanking sites, instead of simple deletions as in the

previous study, it was possible to distinguish between the sequence requirements and the

binding site length requirements of the KS1 protein. In addition, the scanning mutations of

the A and B boxes employed in this study were more informative than those used in previous

research: the mutations were designed to alter the pattern of hydrogen bond donors and

acceptors presented at the major groove by each base pair (Figure 4), whereas previous

studies had simply mutated the sequences of interest to polyG.

Figure 4: Hydrogen bond donors and acceptors presented at the major groove by DNA base

pairs. D indicates hydrogen bond donors, A indicates hydrogen bond acceptors.
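The reasoning behind this mutation design can be illustrated with the standard textbook major-groove code for Watson-Crick base pairs. This is a sketch under stated assumptions: the four-letter patterns below are the commonly cited ones (A = acceptor, D = donor, M = thymine methyl, H = nonpolar hydrogen) rather than values read from Figure 4, and the five-base example sequences are hypothetical and are not the A or B box.

```python
# Major-groove pattern read across each base pair, from the named base to its partner.
# A = H-bond acceptor, D = H-bond donor, M = thymine methyl, H = nonpolar hydrogen.
MAJOR_GROOVE = {
    "A": "ADAM",  # A·T pair
    "T": "MADA",  # T·A pair
    "G": "AADH",  # G·C pair
    "C": "HDAA",  # C·G pair
}


def changed_positions(wild_type, mutant):
    """Indices at which the major-groove pattern differs between two sequences."""
    return [
        i
        for i, (w, m) in enumerate(zip(wild_type, mutant))
        if MAJOR_GROOVE[w] != MAJOR_GROOVE[m]
    ]


# Hypothetical five-base site containing G at two positions.
site = "GTAGC"
print(changed_positions(site, "GGGGG"))  # polyG leaves native G positions untouched
print(changed_positions(site, "TGCTA"))  # a mutant that alters every position
```

Because the four patterns are all distinct, any base substitution changes what the protein sees in the major groove, whereas a blanket change to polyG leaves unchanged every position that was already G.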

This study also provides the first quantitative data for the binding of KS1 to its DNA

element. Nitrocellulose filter binding assays (FBA) were used to determine the equilibrium

constants for the binding of KS1 to the KSE consensus sequence and scanning mutants. This

technique (Romaniuk, 1985) relies upon the ability of a nitrocellulose filter to trap protein

and any complexed labelled oligonucleotides, while a nylon filter is used to trap unbound

labelled oligonucleotides. The fraction of DNA bound by varying protein concentrations can

thus be determined by autoradiography, and then normalized to account for minimal retention

of free oligonucleotide by the nitrocellulose; this fraction is subsequently plotted as a

function of protein concentration to generate an equilibrium binding curve (with curve fitting

software).

From this curve, it is possible to determine the dissociation constant (Kd) from the

following bimolecular equilibrium:

$$\text{DNA-protein} \rightleftharpoons \text{DNA}_{\text{free}} + \text{Protein}_{\text{free}} \qquad (1)$$

In an FBA, it is not possible to measure the concentration of free protein directly. However,

assuming that the concentration of bound protein is significantly lower than the Kd, the

concentration of free protein can be approximated by the total protein concentration:

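$$[\text{protein}]_{\text{free}} = [\text{protein}]_{\text{total}} - [\text{DNA-protein}]$$

where $[\text{DNA-protein}] \ll K_d$, so that

$$[\text{protein}]_{\text{free}} \approx [\text{protein}]_{\text{total}} \qquad (2)$$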

Bearing this assumption in mind, the Kd can be determined as the total protein concentration

at 50% saturation:

$$\frac{[\text{DNA-protein}]}{[\text{DNA}]_{\text{total}}} = \frac{[\text{protein}]_{\text{total}}}{[\text{protein}]_{\text{total}} + K_d} \qquad (3)$$
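Equation (3) can be sketched numerically as follows. The example uses the apparent Kd for KSE reported in this thesis, but the titration points and the resulting fractions are synthetic, and the simple per-point estimator is a stand-in for the nonlinear curve fitting that dedicated software would perform; the function names are introduced here for illustration only.

```python
KD_NM = 7.10  # apparent Kd for KSE reported in this thesis, in nM


def fraction_bound(protein_nM, kd_nM):
    """Equation (3): fraction of DNA bound at a given total protein concentration."""
    return protein_nM / (protein_nM + kd_nM)


def estimate_kd(protein_nM, fractions):
    """Average the per-point estimates Kd = P * (1 - f) / f, skipping f = 0 or 1."""
    estimates = [p * (1 - f) / f for p, f in zip(protein_nM, fractions) if 0 < f < 1]
    return sum(estimates) / len(estimates)


# Hypothetical titration (nM) and noise-free synthetic data generated from equation (3).
titration = [0.5, 1, 2, 5, 10, 20, 50, 100]
synthetic = [fraction_bound(p, KD_NM) for p in titration]

print("fraction bound at [protein] = Kd:", fraction_bound(KD_NM, KD_NM))
print("Kd recovered from synthetic data (nM):", round(estimate_kd(titration, synthetic), 2))
```

With real, noisy data the per-point estimates would scatter, particularly near the extremes of the curve, which is one reason a least-squares fit over the whole titration is preferred in practice.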

A drawback to this method is that the protein is assumed to be 100% active. In order

to circumvent this assumption, an alternate FBA can be used in which the Kd is determined

by titrating a constant concentration of protein and labelled oligonucleotide with an

increasing concentration of identical, unlabelled competitor oligonucleotide. Plotting the

amount of bound DNA as a function of the amount of unbound DNA gives a curve in which

the Kd is the free DNA concentration that gives 50% protein saturation:

$$\frac{[\text{DNA-protein}]}{[\text{Protein}]_{\text{total}}} = \frac{[\text{DNA}]_{\text{free}}}{[\text{DNA}]_{\text{free}} + K_d} \qquad (4)$$

Thus, this alternate assay (termed the DNA excess FBA) does not assume 100% protein

activity, since the concentration of protein is held constant. However, standard FBAs are still

valid when comparing the relative binding affinity of a single protein preparation to different

DNA mutants.
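The claim that the DNA excess FBA does not depend on protein activity can be illustrated with a short numerical sketch. It assumes that an inactive fraction of the protein simply scales the plateau of equation (4) without changing its shape; the `active_fraction` parameter, the function names, and the concentrations are hypothetical and introduced here for illustration only.

```python
def bound_complex(dna_free_nM, kd_nM, protein_total_nM, active_fraction=1.0):
    """Equation (4), scaled by the fraction of protein assumed to be active."""
    return active_fraction * protein_total_nM * dna_free_nM / (dna_free_nM + kd_nM)


def half_max_dna(kd_nM, protein_total_nM, active_fraction, upper_nM=1e6):
    """Bisect for the free DNA concentration that gives half of the plateau signal."""
    plateau = active_fraction * protein_total_nM
    lo, hi = 0.0, upper_nM
    for _ in range(200):
        mid = (lo + hi) / 2
        if bound_complex(mid, kd_nM, protein_total_nM, active_fraction) < plateau / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for active in (1.0, 0.5, 0.1):
    print(f"active fraction {active}: half-saturation at "
          f"{half_max_dna(7.10, 1.0, active):.2f} nM free DNA")
```

The half-saturation point stays at the Kd whatever the active fraction, because only the height of the curve changes. In the standard FBA, by contrast, an overestimated active protein concentration shifts the apparent Kd itself.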

The results of these FBAs demonstrated that the KS1 zinc finger region bound to

KSE with an apparent Kd of 7.10 ± 1.28 nM. The A and B boxes made the greatest

contribution to the free energy of binding, as mutations in these regions caused an

approximate 10- or 50-fold reduction in affinity, respectively, while mutations in the 5’

and 3’ flanking regions caused only an approximate 2-fold reduction. Conversely, binding of

KS1 to a 3’ deletion of KBE could not be detected. From these data, a modified model for

KS1-DNA binding emerged, in which all zinc fingers may potentially bind to the entire 30

base pair consensus site, but the strongest contacts are formed by a subset of zinc

fingers that bind to the A and B box sites.
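The fold changes in affinity can be translated into approximate free energy terms using the standard relations ΔG° = RT ln(Kd) and ΔΔG = RT ln(fold change). The sketch below is illustrative only: the temperature of 25 °C is an assumption, since the assay temperature is not stated in this section, and the function names are introduced here.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal/(mol*K)
T_KELVIN = 298.15   # assumed 25 °C; the assay temperature is not stated in this section


def delta_g(kd_molar, temp=T_KELVIN):
    """Standard binding free energy from Kd (1 M standard state): dG = RT ln(Kd)."""
    return R_KCAL * temp * math.log(kd_molar)


def delta_delta_g(fold_loss, temp=T_KELVIN):
    """Free energy penalty for a fold reduction in affinity: ddG = RT ln(fold)."""
    return R_KCAL * temp * math.log(fold_loss)


print(f"KSE (Kd = 7.10 nM): {delta_g(7.10e-9):.1f} kcal/mol")
for label, fold in (("A box mutant", 10), ("B box mutant", 50), ("flanking mutant", 2)):
    print(f"{label}: +{delta_delta_g(fold):.2f} kcal/mol")
```

On this scale a 2-fold loss costs well under 1 kcal/mol, while the 10- and 50-fold losses correspond to penalties of roughly 1.4 and 2.3 kcal/mol, which is consistent with the A and B boxes contributing most of the sequence-specific binding energy.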

Materials and Methods

PCR amplification of KS1 gene constructs from a pTOPO vector¹

A pTOPO vector containing the KS1 gene (Thermo Scientific) was isolated using a

QIAGEN miniprep kit. PCR primers (IDT) unique to KS1 were used to amplify two

constructs of the zinc finger region: zinc fingers 1 through 10 (henceforth referred to as KS1-

zf1), and zinc fingers 2 through 10 (referred to as KS1-zf2). Reactions contained 2x Pfx

DNA polymerase buffer and 0.5 units Pfx DNA polymerase (Invitrogen), 1 mM MgSO4, 0.3

mM dNTPs, 1 ng plasmid, 0.3 µM upstream NcoI primer, and 0.3 µM downstream XhoI

primer (Table 1). A standard 35-cycle Pfx PCR program outlined in Table 1 was employed,

using a Biometra T-personal thermocycler. Alternatively, the annealing temperature of the

standard PCR program was raised to 62°C, or a touchdown PCR program was employed

(Table 1). Samples from each reaction were analyzed on a 1.5% agarose gel.
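The touchdown phase listed in Table 1 (annealing from 60°C, decreasing 0.3°C per cycle over 30 cycles) implies the annealing schedule sketched below. Two points are assumptions: that the first cycle anneals at exactly 60°C, and that "30x" means 30 cycles in total for this phase. The function name is introduced here for illustration only.

```python
def touchdown_annealing(start_c=60.0, step_c=0.3, cycles=30):
    """Annealing temperature for each touchdown cycle, first cycle at `start_c`."""
    return [round(start_c - step_c * i, 1) for i in range(cycles)]


temps = touchdown_annealing()
print("first cycle:", temps[0], "°C")
print("last cycle: ", temps[-1], "°C")
print("touchdown cycles:", len(temps), "followed by 10 cycles at 55 °C")
```

Under these assumptions the touchdown phase ends a few degrees below the 55°C used for the final 10 cycles, so the program spans annealing temperatures both above and below the standard setting.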

¹ The pTOPO vector did not contain the KS1 gene, but instead the gene for a Rab kinase GTPase activating

protein (TBC1D13). However, this was unknown to me at the time.

Table 1: PCR programs and sequence of KS1-specific primers employed in the amplification

of two KS1 gene constructs.

Standard Pfx PCR program
  i)    94°C            2 min
  ii)   94°C            10 sec
  iii)  55°C or 62°C    10 sec
  iv)   68°C            90 sec
  v)    repeat ii-iv 35x
  vi)   68°C            2 min
  vii)  4°C             hold

Touchdown PCR program
  i)    94°C            5 min
  ii)   94°C            30 sec
  iii)  60°C            30 sec
  iv)   72°C            5 min
  v)    repeat ii-iv 30x, decreasing annealing temp. 0.3°C per cycle
  vi)   94°C            30 sec
  vii)  55°C            30 sec
  viii) 72°C            5 min
  ix)   repeat vi-viii 10x
  x)    72°C            5 min
  xi)   4°C             hold

Upstream NcoI primers
  KS1-zf1   CGGCCGCCATGGAACCATTTGACCATAATGAATGTGAAAAATCC
  KS1-zf2   CGGTCACCATGGAACCCTTTCACTGTCCTTACTGTGG

Downstream XhoI primer (both constructs)
            CGCCAGCTCGAGTCATTACTGAATTCCCATAGTTTCTACCTTGTG
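A simple string search confirms that each primer in Table 1 carries the restriction site named in its column heading. The sketch assumes that the hyphens in the table mark line continuations, so each primer is written here as one joined sequence, and that the single downstream primer is shared by both constructs. The dictionary and function names are introduced for illustration only.

```python
# Recognition sequences for the two enzymes used for cloning into pET30a.
RESTRICTION_SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG"}

# Primers from Table 1, joined across the table's line breaks (an assumption).
PRIMERS = {
    "KS1-zf1 upstream": "CGGCCGCCATGGAACCATTTGACCATAATGAATGTGAAAAATCC",
    "KS1-zf2 upstream": "CGGTCACCATGGAACCCTTTCACTGTCCTTACTGTGG",
    "downstream": "CGCCAGCTCGAGTCATTACTGAATTCCCATAGTTTCTACCTTGTG",
}


def find_sites(seq):
    """Map each enzyme whose site occurs in `seq` to the 0-based start index."""
    return {
        enzyme: seq.find(site)
        for enzyme, site in RESTRICTION_SITES.items()
        if site in seq
    }


for name, seq in PRIMERS.items():
    print(f"{name} ({len(seq)} nt): {find_sites(seq)}")
```

In all three primers the site begins after a six-base 5’ extension, and the NcoI site (CCATGG) contains an ATG that can serve as the start codon of the cloned construct.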

Diagnostic restriction digests

The pTOPO-KS1 plasmid was digested with NotI (Invitrogen), ScaI (Pharmacia),

NdeI, AflIII, PstI, AvaII, NcoI, or EcoRI (NEB). Reactions contained appropriate buffers (as

specified by the manufacturer), 200 ng plasmid, and 1 to 4 units of restriction enzyme.

Additional digests were conducted with an excess of restriction enzyme (15 units NotI, 10

units NcoI, or 20 units EcoRI). Digests were incubated for 90 minutes at 37°C. Samples were

analyzed on a 0.8% agarose gel, along with undigested plasmid.

Diagnostic PCR using universal primers

Universal M13 primers (IDT; Table 2), complementary to the pTOPO vector, were

used to detect the presence of a gene within the pTOPO vector. Reactions contained 1x Taq

DNA polymerase buffer and 0.25 units Taq DNA polymerase (ABM), 1.5 mM MgCl2, 0.1

mM dNTPs, 1 ng plasmid, 0.3 µM forward primer, and 0.3 µM reverse primer. Additional

reactions combined universal primers with the KS1-specific primers, to test the

functionality of the KS1 primers. A 35-cycle standard Taq PCR program was employed

(Table 2). Samples were analyzed on a 0.8% agarose gel stained with ethidium bromide.

Plasmid sequencing

Plasmids were sequenced by the University of Victoria Centre for Biomedical

Research. The identity of gene sequences was determined by BLAST analysis (NCBI).

Cloning of KS1 gene constructs from human genomic DNA

The two KS1 constructs were amplified from 1 µg of human genomic DNA, using the

same KS1-specific primers and the standard Pfx PCR reaction conditions that were employed

with the pTOPO vector. PCR products were purified by gel extraction (QIAGEN kit),

digested with NcoI and XhoI, purified with the QIAquick PCR purification kit (QIAGEN),

and ligated into pET30a digested with the same restriction enzymes. Restriction digests

contained the appropriate buffers (as specified by NEB) and 1 unit of each enzyme, and were

incubated for 90 minutes at 37°C. Ligation reactions contained 1x T4 DNA ligase buffer and

400 units T4 DNA ligase (NEB), 60 ng NcoI/XhoI-digested pET30a, and 5.0 µl of the

digested and purified PCR product; the reaction was incubated for 24 hours at room

temperature.

Plasmid isolation

Competent Escherichia coli DH5α cells were transformed with one half of each

ligation reaction, and plated on selective agar media containing 50 µg/ml kanamycin.

Colonies positive for the desired gene were identified by colony PCR, using 0.25 units Taq

DNA polymerase (ABM), 1.5 mM MgCl2, 0.1 mM dNTPs, 1 ng plasmid, 0.3 µM forward

primer, and 0.3 µM reverse primer. The pET-specific universal primers (IDT) detailed in

Table 3 were employed. Plasmids were purified using the QIAprep Spin Miniprep Kit

(QIAGEN).

Table 2: Standard Taq PCR program and sequence of pTOPO universal PCR primers.

Standard Taq PCR program
  i)    94°C            2 min
  ii)   94°C            10 sec
  iii)  55°C or 62°C    10 sec
  iv)   68°C            90 sec
  v)    repeat ii-iv 35x
  vi)   68°C            2 min
  vii)  4°C             hold

M13 forward primer   GGCGATTAAGTTGGGTAACGCCAG
M13 reverse primer   CTCGTATGTTGTGTGGAATTGTGAGCG

Table 3: Sequence of pET universal PCR primers. A standard Taq PCR program was

employed (Table 2).

pET16 forward primer   CGGCGGTGTGAGCGGATAACAATTCCC
pET16 reverse primer   GGCGCGATGCTAGTTATTGCTCAGCGG

Protein purification

Competent E. coli BL21(DE3) cells were transformed with 50 ng of each plasmid,

and plated on selective agar media containing 50 µg/ml kanamycin. Overnight cultures were

prepared by inoculating 5 ml 1x LB containing 50 µg/ml kanamycin with a single

transformed colony and incubating at 37°C for approximately 20 hours. Half of each

overnight culture was subcultured (1/10 dilution) and then grown at 37°C to an optical

density between 0.5 and 0.6 before induction with 1 mM IPTG. Induced cells were grown at

37°C to an optical density between 1.1 and 1.6; for incubation times longer than 5.5 hours,

the temperature was reduced to room temperature for the remainder of the period. Cells were

harvested by centrifugation (5000 rpm, Beckman JLA 16.250). Cells were lysed by French

press in a resuspension buffer consisting of 1x TAB (20 mM MgCl2, 5 mM ZnCl2, 250 mM

NaCl, pH 7.5), 1 mM PMSF, 10.2% glycerol, 5 mM DTT, and 5 mM imidazole. The lysate

was centrifuged (11,500 rpm, Beckman JA-20), and inclusion bodies in the insoluble fraction

were resuspended overnight in a denaturation buffer, consisting of resuspension buffer plus

5 M urea. The supernatant was isolated by centrifugation (11,500 rpm, Beckman JA-20), and

loaded on a Nickel-Agarose 6 Resin (ABM) column equilibrated with denaturation buffer.

Protein was eluted with 150 mM and 250 mM imidazole in denaturation buffer, and the

protein concentration was determined by the method of Bradford (1976). Samples from

inductions and column purification were analyzed on SDS-PAGE gels (3.5% stacking, 15%

separating) stained with Coomassie blue.

Radiolabelling of DNA

DNA oligonucleotides (IDT) to be used as the substrate in filter binding assays were

annealed by heating a solution of top and bottom strand oligonucleotides (40 µM each) in a

5x concentrated annealing buffer (250 mM NaCl, 5 mM EDTA, 15 mM Tris, 35 mM Tris-HCl) to

95°C for 5 minutes and then cooling to room temperature for 1 hour. The mixture was

subsequently diluted 1/10 in 1x annealing buffer. The oligonucleotides were designed such that

each had a 4 base pair 5’ overhang, to allow end-labelling with α-32P dATP. Labelling reactions

contained 1x React2 buffer and 2.5 units DNA polymerase I large fragment (Klenow), 4 µM

annealed oligonucleotide, 2.0 µl α-32P dATP (Perkin-Elmer), and 0.2 mM each of dGTP, dTTP

and dCTP. The reactions were incubated for 30 minutes at room temperature, and labelled

oligonucleotides were subsequently

separated from unincorporated nucleotides on a non-denaturing 20% acrylamide gel. Bands

were visualized on Kodak X-ray film (2 minute exposure), and then excised and eluted

overnight in buffer containing 1 mM EDTA, 0.6 M NH4OAc, and 0.1% SDS. The buffer now

containing labelled DNA was decanted, and precipitated for 1 hour at -80°C with 15 µg

glycogen and 70% ethanol. DNA was harvested by centrifugation (13,300 rpm, Fisher

Scientific tabletop centrifuge), the supernatant was removed, and the pellets were dried

(Savant Speed Vac Concentrator) and then resuspended in distilled DNase-free water

(Gibco). The concentration of each DNA preparation (in cpm/µl) was determined by

scintillation (LKB Wallac).

Standard nitrocellulose filter binding assay

The method developed by Romaniuk (1985) was followed. Briefly, a 12-point protein

dilution series (720 nM, 360 nM, 180 nM, 72 nM, 36 nM, 18 nM, 7.2 nM, 3.6 nM, 1.8 nM,

0.72 nM, 0.36 nM, and 0 protein) was prepared using TMK buffer as the diluent (pH 7.5, 100

mM KCl, 0.5 mM TCEP, 0.1 mM BSA, 5 mM MgCl2, 0.01 mM ZnCl2 final concentrations).

Radiolabelled DNA and poly(d[IC]) were added to each well, to a final concentration of 50

cpm/µl and 1 µg/ml, respectively. Mixtures were incubated for 90 minutes at 20°C, and then

80% of the total volume was filtered on nitrocellulose followed by nylon filter paper, using a

Dot Blot apparatus. Dried filters were exposed to a phosphor screen (Molecular Dynamics)

overnight, and the signal was detected using a Storm 820 phosphor imager (Molecular

Dynamics). The fraction of DNA bound at each protein concentration was determined with

Microsoft Excel, using the following formula:

$$\frac{[\text{DNA-protein}]}{[\text{DNA}]_{\text{total}}} = \frac{\text{bound DNA}_x}{\text{total DNA}_x} - \frac{\text{bound DNA}_0}{\text{total DNA}_0}$$

where total DNA = bound DNA + free DNA, subscript x indicates values for a

given protein concentration, and subscript 0 indicates values for the zero protein wells.

The fraction of DNA bound was plotted as a function of protein concentration, and a curve

was fitted using Kaleidagraph 2.0 software.
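The background correction and curve fit described above can be sketched in Python. This is a minimal illustration and not the Excel and Kaleidagraph workflow used in this study: the subtraction of the zero-protein ratio, the one-site binding isotherm, and the log-spaced grid search are assumptions, and all function names are hypothetical.

```python
def fraction_bound(bound_x, free_x, bound_0, free_0):
    """Background-corrected fraction of DNA bound at one protein concentration.

    bound/free are filter signals (nitrocellulose = bound, nylon = free);
    the subscript-0 values come from the zero-protein well.
    """
    total_x = bound_x + free_x
    total_0 = bound_0 + free_0
    return bound_x / total_x - bound_0 / total_0


def one_site(protein_nM, kd_nM):
    """Assumed one-site binding isotherm: fraction bound = [P] / (Kd + [P])."""
    return protein_nM / (kd_nM + protein_nM)


def fit_kd(protein_nM, fractions, lo=0.01, hi=10000.0, steps=4000):
    """Least-squares estimate of Kd by scanning a log-spaced grid of candidates."""
    best_kd, best_sse = lo, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        sse = sum((one_site(p, kd) - f) ** 2 for p, f in zip(protein_nM, fractions))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Protein dilution series from the standard assay (nM), excluding the zero well.
series = [720, 360, 180, 72, 36, 18, 7.2, 3.6, 1.8, 0.72, 0.36]
# Synthetic, noise-free data generated for illustration only.
synthetic = [one_site(p, 7.1) for p in series]
estimated_kd = fit_kd(series, synthetic)
```

A real fit would usually also float the maximum fraction bound, since only part of a refolded protein preparation may be active; that parameter is omitted here for brevity.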

DNA excess nitrocellulose filter binding assay

Protein was diluted to 200 or 400 nM in TMK buffer, and aliquoted into 11 of 12

wells in each column of a 96-well round-bottom plate; the 12th well was reserved for TMK

buffer alone (no protein). A cold DNA dilution series (81.6 nM, 42.6 nM, 21.7 nM, 10 nM,

7.22 nM, 4.26 nM, 2.17 nM, 1.00 nM, 0.722 nM, 0.426 nM, 0.217 nM, 0.217 nM) was

prepared separately, and then the equivalent radiolabelled DNA oligonucleotide and

poly(d[IC]) were added to each well, to a final concentration of 7,500 cpm/µl and 10 µg/ml,

respectively. Subsequently, DNA from each well of the DNA series was added to the

corresponding well in the diluted protein plate, diluting the DNA 1/10. The mixture was

incubated, filtered, and filters were exposed to a phosphor screen as in the standard filter

binding assay. Data were analyzed using the appropriate formulae, with Microsoft Excel and

Kaleidagraph 2.0 software.

Cloning of KSE and mutant oligonucleotides

Oligonucleotides to be sequenced were ligated into pK19 that had been digested for

90 minutes at 37°C with EcoRI and BamHI (NEB) in appropriate buffers. Ligation reactions

were prepared as described previously, and were incubated overnight at 37°C. Plasmids were

purified as described above, and were sequenced without prior screening by colony PCR.

Results

The pTOPO vector did not contain the KS1 gene

Attempts to amplify two KS1 gene constructs from a pTOPO vector, using a standard

PCR thermocycler program, failed to yield product. In an effort to increase primer annealing

efficiency, the annealing temperature was raised to 62°C (approximately 5°C below the

primer melting temperature), or a touchdown PCR program was employed; however, these

conditions also failed to yield product. As a positive control, PCR reactions were performed

using plasmids with known inserts and universal plasmid primers; these reactions

successfully yielded product, thus eliminating the possibility that human error was at fault.

This unexpected problem with the initial step in cloning required further investigation of the

pTOPO vector and the gene within it, including diagnostic restriction digests, PCR using

universal primers, and finally, sequencing of the plasmid.

A series of diagnostic PCR reactions was performed with universal primers. M13

primers complementary to pTOPO were intended to detect whether the plasmid was indeed

the pTOPO vector, and to assess the size of the insert. KS1-specific primers were employed

in combination with universal primers, in order to diagnose potential problems with either the

upstream or the downstream KS1 primers. All reactions were performed in quadruplicate,

using independent preparations of the putative pTOPO-KS1 plasmid as the template; all

replicates behaved identically. PCR using the universal primers produced a strong band

between 2036 and 3054 base pairs. By contrast, the KS1-zf1 upstream primer combined with

the universal reverse primer failed to yield a product. Both the KS1-zf2 upstream primer

combined with the universal reverse primer, and the universal forward primer combined with

the KS1 downstream primer yielded weak bands that were likely artefacts (Figure 5).

Diagnostic restriction digests were designed to target restriction sites within the

pTOPO vector, within the KS1 gene, or within both vector and gene. NotI and NdeI digests

were intended to linearize the plasmid by cutting at a unique restriction site within pTOPO or

within the KS1 gene, respectively. These digests would indicate the size of the plasmid and

insert combined, as well as offering preliminary evidence that the plasmid and insert were

correct, if two unique restriction sites were indeed present. All other digests were intended to

cut at multiple sites within both the plasmid and the insert, yielding a specific banding pattern

that could be predicted based on the vector map and the sequence of the KS1 gene (Table 4).
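The prediction of a banding pattern from a vector map can be sketched as follows. This is an illustrative calculation only: the cut positions are hypothetical and are not taken from the pTOPO map or the KS1 sequence, and the function name is an assumption.

```python
def circular_fragments(plasmid_len, cut_sites):
    """Fragment sizes (bp) expected from a complete digest of a circular plasmid.

    cut_sites are positions (bp) of the enzyme's recognition sites on the map.
    A single site linearizes the plasmid; n sites give n fragments.
    """
    sites = sorted(set(cut_sites))
    if not sites:
        return [plasmid_len]  # uncut, still circular
    fragments = [b - a for a, b in zip(sites, sites[1:])]
    # The last fragment wraps around the origin of the circular map.
    fragments.append(plasmid_len - sites[-1] + sites[0])
    return sorted(fragments, reverse=True)


# Hypothetical examples for a 6700 bp plasmid.
single_cutter = circular_fragments(6700, [1000])
double_cutter = circular_fragments(6700, [500, 4172])
```

Because the fragments of a complete digest must sum to the plasmid length, an observed pattern whose sizes sum to a different total is itself evidence that the plasmid is not the expected construct.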

Digests were performed in duplicate or quadruplicate, using independent preparations

of the putative pTOPO-KS1 plasmid. All replicates behaved identically, in that all reactions

failed to yield the expected size and/or number of restriction fragments (Figure 6 and Table

4). Notably, the linearized plasmid appeared to be approximately 6100 base pairs, although

pTOPO-KS1 was supposed to be 6700 base pairs in length. Additionally, EcoRI, NcoI, and

NotI digests were repeated with an excess of restriction enzyme, in order to ensure that the

reaction ran to completion. This, too, yielded bands of unexpected sizes.

The large number of unexpected results obtained in these analyses prompted me to

have all four pTOPO plasmid preparations sequenced (Centre for Biomedical Research,

University of Victoria). The insert within the vector was identified by BLASTn analysis

(NCBI) as the TBC1D13 gene, which encodes a human protein with Rab GTPase activator activity (NCBI).

Figure 5: Diagnostic PCR reactions with the putative pTOPO-KS1 plasmid. Universal

primers and/or KS1-specific primers were employed, and samples were analyzed on a 0.8%

agarose gel stained with ethidium bromide and imaged under UV light.

Figure 6: Diagnostic restriction digests of the putative pTOPO-KS1 plasmid. Digests were

incubated for 90 minutes at 37°C and samples were visualized on 0.8% agarose gels stained

with ethidium bromide and imaged under UV light. Reactions were run in quadruplicate or duplicate on

independent plasmid preparations; however, identical results of the quadruplicate trials

(reactions 1-6) are not shown. Table 4 summarizes the products obtained.

Table 4: Expected and actual sizes of restriction fragments in the diagnostic digest. Predicted

restriction sites and band sizes are based on the pTOPO vector map and the KS1 gene

sequence (Thermo Scientific).

Restriction enzyme | Expected site of restriction | Expected fragments (bp) | Actual fragments (bp, approx. values)
NotI | vector | 6700 | 6100
NdeI | insert | 6700 | 6090
ScaI | vector and insert | 3672, 3026 | 6100
AflIII | vector and insert | 5845, 686, 155 | 6090, 1000
PstI | vector and insert | 4545, 1739, 285, 113 | 6090, 1000
AvaII | vector and insert | 3575, 2294, 608, 219 | 2000, 1000

Restriction enzyme | Expected site of restriction | Expected fragments (bp) | Actual fragments, normal (bp, approx. values) | Actual fragments, excess (bp, approx. values)
NcoI | insert | 3196, 3134, 356 | 6100, (6000), (4000), (2100) | 6100, (6000)
EcoRI | insert | 3934, 1026, 626, 416, 288, 272, 108 | 6100, 4000, 2100 | 4000, 2100

KS1 zinc fingers can be cloned and purified from human genomic DNA

The KS1 zinc finger domain is encoded by a single exon on chromosome 19 (Luo et

al., 2002), and so it was possible to amplify the desired gene constructs by PCR using human

genomic DNA as a template (Figure 7). The construct termed “KS1-zf1” was 1161 base pairs

in length, and consisted of zinc fingers 1 through 10, including the spacer region separating

zinc fingers 1 and 2. The construct termed “KS1-zf2” was 909 base pairs long, and consisted

of zinc fingers 2 through 10 and the linker preceding finger 2, but did not include the spacer

region. Following gel extraction, the KS1-zf1 product was not concentrated enough to be

successfully ligated into pET30a. This problem can be visualized, albeit qualitatively, in

Figure 7a: the band for KS1-zf1 is weaker than the band for KS1-zf2, which was successfully

ligated into the pET30a cloning vector. The PCR amplification of KS1-zf1 was repeated,

yielding the product that was subsequently ligated into pET30a (Figure 7b).

Plasmids containing the KS1 gene constructs were identified by colony PCR (Figure

8), and then confirmed by a diagnostic PCR protocol which utilized KS1-specific primers

and universal primers complementary to the pET30a vector (see example in Figure 9).

Positive clones were subsequently sequenced (CBR). However, it was necessary to make two

separate plasmid preparations for the KS1-zf1 construct, as the first (denoted pET30a-KS1-

zf1 #4) was contaminated and could not be sequenced. The second plasmid preparation was

obtained by transforming E. coli DH5α with pET30a-KS1-zf1 #4 and re-purifying the

plasmid. This plasmid preparation was called pET30a-KS1-zf1 #18(4) to reflect that its

sequence was the same as KS1-zf1 plasmid construct #4, barring the occurrence of any

mutations during the re-transformation process. Sequencing and BLASTp (NCBI) analysis

revealed that the KS1-zf1 construct contained three amino acid mismatches: a glutamine to

glutamate mutation near zinc finger 1, a lysine to isoleucine mutation in the linker between

fingers 4 and 5, and a glutamine to arginine mutation between the two cysteine residues of

zinc finger 9 (Figure 10). The KS1-zf2 construct contained a single amino acid mutation: the

first amino acid at the peptide’s N-terminus was mutated from lysine to glutamate (data not

shown).
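Mismatches of this kind can be listed directly from an ungapped pairwise alignment. The sketch below is an illustration rather than the BLASTp procedure itself: the function name is hypothetical, the two short fragments are excerpted from the alignment in Figure 10, and their start coordinates were obtained by counting residues from the start of the corresponding alignment lines.

```python
def list_mismatches(query, subject, query_start, subject_start):
    """Return (subject_pos, subject_residue, query_pos, query_residue) for each
    position where two equal-length, ungapped aligned sequences differ."""
    if len(query) != len(subject):
        raise ValueError("aligned fragments must have equal length")
    found = []
    for offset, (q, s) in enumerate(zip(query, subject)):
        if q != s:
            found.append((subject_start + offset, s, query_start + offset, q))
    return found


# Fragment near zinc finger 1 (query residues 46-55, subject residues 209-218).
near_zf1 = list_mismatches("EPFDHNECEK", "QPFDHNECEK", 46, 209)
# Fragment spanning the linker between fingers 4 and 5
# (query residues 209-218, subject residues 372-381).
linker_4_5 = list_mismatches("THTGEIAYEC", "THTGEKAYEC", 209, 372)
```

Here the subject is the reference ZNF382 sequence and the query is the cloned construct, so each tuple reads as "reference residue at reference position, replaced by construct residue at construct position".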

Before it became apparent that the KS1-zf1 plasmid preparation #4 could not be

sequenced, it was used to express and purify protein. Similarly, KS1-zf1 protein was

expressed and purified from preparation #18(4) before the sequencing results revealed

mutations. The protein products of both the KS1-zf1 and KS1-zf2 gene constructs appeared

on a 15% SDS-PAGE gel at an apparent molecular weight between 30 and 46 kDa (Figure

11). These molecular weights are in agreement with predictions based on sequence, including

the 6-His tag: the molecular weight of KS1-zf1 is 44.5 kDa, and that of KS1-zf2 is 34.6 kDa.

Protein expression was unusually slow, requiring up to 22 hours to reach an optical density of

1.6.

Figure 7: PCR of KS1-zf1 and KS1-zf2 constructs from human genomic DNA. Primers

specific to the zinc finger region of the KS1 gene were used to amplify the two gene

constructs by PCR. Samples were visualized with ethidium bromide under UV light, on 0.8%

agarose gels. KS1-zf1 consists of zinc fingers 1 to 10, and KS1-zf2 is zinc fingers 2 to 10.

Note that a second KS1-zf1 reaction, (b), was performed because the first, (a), was not

concentrated enough for subsequent cloning steps.

Figure 8: Colony PCR identification of clones positive for KS1 gene constructs. (a)

Identification of KS1-zf1 clone. (b) Identification of KS1-zf2 clone. Universal primers

(pET16 forward and reverse) were employed with a standard PCR program, and sample

bands were visualized on 0.8% agarose gels stained with ethidium bromide. Asterisks

indicate positive clones.

Figure 9: Confirmation of colony PCR results using specific and

universal primers. This example shows the KS1-zf1 #4 plasmid.

The diagnostic PCR protocol was performed, with both specific

and universal primers. Samples were run on a 0.8% gel, and

visualized with ethidium bromide and imaged under UV light.

Query  46   EPFDHNECEKSFLMKGMLFTHTRAHRGERTFEYNKDGIAFIEKSSLSVHPSNLMEKKPSA 105
            +PFDHNECEKSFLMKGMLFTHTRAHRGERTFEYNKDGIAFIEKSSLSVHPSNLMEKKPSA
Sbjct  209  QPFDHNECEKSFLMKGMLFTHTRAHRGERTFEYNKDGIAFIEKSSLSVHPSNLMEKKPSA 268

Query  106  YNKYGKFLCRKPVFIMPQRPQTEEKPFHCPYCGNNFRRKSYLIEHQRIHTGEKPYVCNQC 165
            YNKYGKFLCRKPVFIMPQRPQTEEKPFHCPYCGNNFRRKSYLIEHQRIHTGEKPYVCNQC
Sbjct  269  YNKYGKFLCRKPVFIMPQRPQTEEKPFHCPYCGNNFRRKSYLIEHQRIHTGEKPYVCNQC 328

Query  166  GKAFRQKTALTLHEKTHIEGKPFICIDCGKSFRQKATLTRHHKTHTGEIAYECPQCGSAF 225
            GKAFRQKTALTLHEKTHIEGKPFICIDCGKSFRQKATLTRHHKTHTGExAYECPQCGSAF
Sbjct  329  GKAFRQKTALTLHEKTHIEGKPFICIDCGKSFRQKATLTRHHKTHTGEKAYECPQCGSAF 388

Query  226  RKKSYLIDHQRTHTGEKPYQCNECGKAFIQKTTLTVHQRTHTGEKPYICNECGKSFCQKT 285
            RKKSYLIDHQRTHTGEKPYQCNECGKAFIQKTTLTVHQRTHTGEKPYICNECGKSFCQKT
Sbjct  389  RKKSYLIDHQRTHTGEKPYQCNECGKAFIQKTTLTVHQRTHTGEKPYICNECGKSFCQKT 448

Query  286  TLTLHQRIHTGEKPYICNECGKSFRQKAILTVHHRIHTGEKSNGCPRCGKAFSRKSNLIR 345
            TLTLHQRIHTGEKPYICNECGKSFRQKAILTVHHRIHTGEKSNGCP+CGKAFSRKSNLIR
Sbjct  449  TLTLHQRIHTGEKPYICNECGKSFRQKAILTVHHRIHTGEKSNGCPQCGKAFSRKSNLIR 508

Query  346  HQKTHTGEKPYECKQCGKFFSCKSNLIVHQKTHKVETMGIQ 386
            HQKTHTGEKPYECKQCGKFFSCKSNLIVHQKTHKVETMGIQ
Sbjct  509  HQKTHTGEKPYECKQCGKFFSCKSNLIVHQKTHKVETMGIQ 549

Figure 10: BLASTp (NCBI) analysis of KS1-zf1 #18(4). The query sequence is compared to

Homo sapiens ZNF382, also known as KS1. Mismatches are in blue, zinc finger amino acids

are in boldface (zinc ligands underlined), and TGEKP linkers are in italics. “Vestigial”

designates an inactivated zinc finger also found in rat KS1 (Luo et al., 2002; Gebelein et al.,

1998).

Labels in Figure 10: zinc finger 1; vestigial; zinc fingers 2 to 10.

Figure 11: Induction and purification of KS1-zf1 and KS1-zf2. Induction gels are shown at left, in which E. coli BL21(DE3) cells were

induced with 1 mM IPTG at T=0 hours, and samples were taken at subsequent time points

until the cells were harvested. Samples from affinity column purification of each protein are

shown at right, with the following abbreviations: SF, soluble fraction of cell lysate; SN,

supernatant of the urea-solubilized insoluble fraction (containing the protein of interest); P, pellet; FT,

flow through; W, wash; E1-E4, two elutions with 150 mM imidazole followed by two

elutions with 250 mM imidazole. All samples were visualized on 3.5% stacking, 15%

separating SDS-PAGE gels, stained with Coomassie blue. Arrows indicate the protein of

interest.

DNA binding by KS1 zinc fingers is not sequence-specific under assay conditions

Standard nitrocellulose filter binding assays (FBAs) were performed to determine the

strength of binding of KS1-zf1 and -zf2 to the KS1 consensus DNA binding sequence (KSE),

or to scanning mutants of this sequence (M1-M4; oligonucleotide sequences given in Table

5). These assays involved a protein dilution series from 720 to 0.36 nM and a constant

concentration of radiolabelled oligonucleotide, in a buffer designed to maximize specific

protein-DNA binding and minimize non-specific interactions (Romaniuk, 1985). The

scanning mutants (Table 5) utilized in these assays were designed to test the contribution of

four major features of KSE, identified by Gebelein and Urrutia (2001), to the overall binding

energy: the region upstream of the A box (M1), the A box (M2), the B box (M3), and the

region downstream of the B box (M4). ZFY4R, the consensus DNA binding site of an

unrelated polyzinc finger protein, ZFY (Taylor-Harris et al., 1995), was included as a

negative control. The Kd values for the binding of KS1-zf1 or -zf2 to the consensus site were

unexpectedly high: 114 ± 10 nM and 133 ± 13 nM, respectively. In addition, there was very

little difference in the Kd values for binding to mutant sites or to negative control DNA

(Table 6).
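The extent of each scanning mutation can be checked by comparing a mutant top strand with the KSE top strand. The following sketch is illustrative only: the function name is hypothetical, the sequences are the upper-case top strands of KSE, M2, and M3 from Table 5 with the 5’ overhang removed, and positions are numbered from the first base after the overhang, which is a numbering convention assumed here rather than one defined in the text.

```python
# Top strands from Table 5, without the lower-case 5' overhang.
KSE = "CTATCATACTACCCACCCTACAGATGCACAACAG"
M2 = "CTATCATACCTTTTACCCTACAGATGCACAACAG"  # A box mutant
M3 = "CTATCATACTACCCATGTCTGTAGTGCACAACAG"  # B box mutant


def mutated_positions(wild_type, mutant):
    """1-based positions at which an equal-length mutant differs from wild type."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (w, m) in enumerate(zip(wild_type, mutant)) if w != m]


a_box_changes = mutated_positions(KSE, M2)
b_box_changes = mutated_positions(KSE, M3)
```

The same comparison applied to an oligonucleotide that had been synthesized incorrectly would return an empty list or an unexpected block of positions.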

At this point, oligonucleotides M1 through M4 were sequenced, in order to rule out

the possibility that the mutants had incorrectly been synthesized by IDT. If all four were

simply replicates of the KSE oligonucleotide sequence, then this would account for the

similarities in binding constants (although it would not explain the binding observed with

ZFY4R and NRSE). However, the sequences of these mutants were found to be correct.

This indicated that the protein constructs might instead be at fault. To determine if the

anomalous results of the standard FBAs were caused by low protein activity, the association

constants were also determined by DNA excess FBAs. In these assays, the protein and

radiolabelled DNA oligonucleotide concentrations were held constant, while the equivalent

unlabelled competitor DNA was diluted from 81.6 nM to 0.217 nM. Additional scanning

mutants (MA through MD; Table 5) were tested, which were essentially the inverse of the

original four mutants used in the standard assays. That is, all but one of the four major

regions of the KSE site were mutated in each construct. Another scanning mutant, MX, was

designed to alter every residue within KSE. Finally, a second non-cognate DNA, NRSE

(Mori et al., 1992), was used as a negative control in case ZFY4R shared slight sequence

similarity with KSE. The relative affinities, determined by DNA excess FBA for KS1-zf1

and -zf2 binding to all mutants, are summarized in Table 7. The data were again anomalous,

in that both protein constructs bound with high affinity to negative control DNA, as well as to

the total mutant MX. The apparent Kd values for binding to KSE were roughly 20- to 35-fold

lower than the values observed in the standard FBAs: 6.18 ± 3.91 nM for

KS1-zf1 and 3.76 ± 1.30 nM for KS1-zf2.

Table 5: Sequences of oligonucleotides employed in standard and DNA excess FBAs. The A

and B box regions are underlined, and scanning mutations are indicated in red. Lower case

letters indicate 5’ overhangs for labelling and/or ligations.

Oligonucleotide sequence

KSE aattCTATCATACTACCCACCCTACAGATGCACAACAG GATAGTATGATGGGTGGGATGTCTACGTGTTGTCctag

M1 aattGGTAGTACTTACCCACCCTACAGATGCACAACAG CCATCATGAATGGGTGGGATGTCTACGTGTTGTCctag

M2 aattCTATCATACCTTTTACCCTACAGATGCACAACAG GATAGTATGGAAAATGGGATGTCTACGTGTTGTCctag

M3 aattCTATCATACTACCCATGTCTGTAGTGCACAACAG GATAGTATGATGGGTACAGACATCACGTGTTGTCctag

M4 aattCTATCATACTACCCACCCTACAGACCGTTTTGGC GATAGTATGATGGGTGGGATGTCTGGCAAAACCGctag

MX aattGGTAGTACTCTTTTGTGTCTGTAGCCGTTTTGGC CCATCATGAGAAAACACAGACATCGGCAAAACCGctag

MA aattCTATCATACCTTTTGTGTCTGTAGCCGTTTTGGC GATAGTATGGAAAACACAGACATCGGCAAAACCGctag

MB aattCCTAGTACTTACCCATGTCTGTAGCCGTTTTGGC CCATCATGAATGGGTACAGACATCGGCAAAACCGctag

MC aattGGTAGTACTCTTTTGCCCTACAGACCGTTTTGGC CCATCATGAGAAAACGGGATGTCTGGCAAAACCGctag

MD aattGGTAGTACTCTTTTGTGTCTGTAGTGCACAACAG CCATCATGAGAAAACACAGACATCACGTGTTGTCctag

ZFY4R gatcCTGTCGATGGAGGCCCGAGTAGGCCTAAAATTGAGGCG GACAGCTACCTCCGGGCTCATCCGGATTTTAACTCCGCttaa

NRSE aattCGGTTCAGCACCCTGGACAGCTCCCGG GCCAAGTCGTGGGACCTGTCGAGGGCCttaa

Table 6: Kd values and relative affinities as determined by standard FBAs. KS1-zf1 or -zf2

binding to the consensus DNA binding sequence (KSE) and four scanning mutants. ZFY4R

was included as a negative control. Oligonucleotide sequences are given in Table 5.

KS1-zf1 KS1-zf2

Kd (nM)a Relative affinityb Kd (nM)a Relative affinityb

KSE 114 ± 10 1.00 133 ± 13 1.00

M1 75 ± 6 1.51 147 ± 14 0.91

M2 55 ± 5 2.05 175 ± 18 0.76

M3 54 ± 5 2.12 215 ± 31 0.62

M4 42 ± 4 2.68 217 ± 26 0.61

ZFY4R 156 ± 18 0.730 not determined not determined

a Error values represent the error within a single trial.

b Relative affinities were arrived at by dividing the apparent association constant (Ka = 1/Kd) for mutant oligonucleotides by the apparent Ka for the wild type DNA determined in parallel.
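The relative affinity defined in footnote b reduces to a ratio of dissociation constants, because Ka = 1/Kd. A minimal sketch, with a hypothetical function name and the rounded Kd values from Table 6 as inputs:

```python
def relative_affinity(kd_wild_type_nM, kd_mutant_nM):
    """Ka(mutant) / Ka(wild type), which equals Kd(wild type) / Kd(mutant)."""
    return kd_wild_type_nM / kd_mutant_nM


# KS1-zf1 values from Table 6 (nM).
zf1_m4 = relative_affinity(114, 42)
zf1_zfy4r = relative_affinity(114, 156)
# KS1-zf2 values from Table 6 (nM).
zf2_m3 = relative_affinity(133, 215)
```

Values recomputed from the rounded Kd entries can differ slightly in the last digit from the tabulated relative affinities, which were presumably calculated from unrounded fits.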

Table 7: Relative affinities as determined by DNA excess FBAs. KS1-zf1 or -zf2 binding to

the consensus DNA binding sequence (KSE) and scanning mutants. ZFY4R and NRSE were

included as negative controls. Oligonucleotide sequences are given in Table 5.

Relative Affinitya

KS1-zf1 KS1-zf2

KSE 1.00b 1.00b

M1 0.58 ± 0.81 1.34 ± 2.25

M2 0.40 ± 0.64 4.21 ± 1.55

M3 0.95 ± 2.29 4.80 ± 7.20

M4 0.38 ± 2.40 7.62 ± 5.56

MX 2.38 ± 3.23 0.56 ± 0.53

MA 2.83 ± 4.11 5.26 ± 3.15

MB 1.21 ± 2.23 0.16 ± 1.53

MC 1.09 ± 3.35 2.85 ± 1.30

MD 2.26 ± 3.14 2.69 ± 1.00

ZFY4R 3.20 ± 4.11 1.64 ± 1.03

NRSE 1.84 ± 2.85 2.10 ± 0.84

a Relative affinities were arrived at by dividing the mean apparent Ka for mutant oligonucleotides by the mean apparent Ka for the wild type DNA. Error values represent the standard error for at least two independent trials.

b The apparent Kd values for the binding of KS1-zf1 and KS1-zf2 to KSE were 6.18 ± 3.91 nM and 3.76 ± 1.30 nM, respectively.

Effect of monovalent salt concentration

The contribution of non-specific contacts with the DNA backbone to the observed

binding interactions was tested by increasing the concentration of KCl from 100 mM to 250

mM, and monitoring the effect on the binding of KS1-zf1 and -zf2 to KSE or MX in a DNA

excess FBA. As observed in other trials, both proteins bound to KSE and MX in 100 mM

KCl; however, binding to both DNA oligonucleotides was abolished in the presence of 250

mM KCl.

Effect of competitor DNA or RNA

The sequence specificity, as well as the specificity of KS1 for DNA or RNA, was

probed by monitoring the effect of using carrier RNA, polyA, in place of the usual carrier

DNA, poly(d[IC]). The apparent Kd was measured by standard FBA for the binding of

KS1-zf1 and -zf2 to KSE, MX, or ZFY4R oligonucleotides, in the presence or absence of

each carrier. The Ka values determined in these trials are depicted graphically in Figure 12. In

the absence of carrier, or in the presence of 1 µg/ml poly(d[IC]), the two proteins bound

indiscriminately to all three labelled DNA oligonucleotides. The apparent Ka in the absence

of carrier was approximately 1000-fold higher than in the presence of poly(d[IC]) in all

cases. However, KS1-zf1 appeared to bind specifically to KSE when polyA was used as the

carrier: the apparent Kd for binding to KSE was 7.14 ± 0.70 nM, while the Kd values for binding to

MX or ZFY4R were in the micromolar range (1.59 ± 1.31 µM and 3.14 ± 4.62 µM,

respectively).

Figure 12: Effect of competitor DNA or RNA on apparent Ka, as determined by standard

FBA. (a) KS1-zf1 (b) KS1-zf2. No competitor, or 1 µg/ml of either poly(d[IC]) competitor

DNA or polyA competitor RNA were present in each trial. Error bars represent the standard

deviation for two independent trials; where error bars are not shown, useable data could be

obtained in only one trial.

Kd values for specific binding determined by FBAs with polyA carrier

Since polyA did not appear to interfere with specific DNA binding by KS1-zf1, I used

it as the carrier in all subsequent FBAs. FBAs were conducted in triplicate, to determine the

apparent Kd values for the binding of KS1-zf1 to mutants M1-M4 (Figure 13). It was

necessary to suppress two outliers in each curve, corresponding to 720 and 360 nM protein,

in order to minimize the margin of error; these points were considerably higher than the fitted

curve. The consensus KSE site was included in each replicate, and the relative affinity of

each mutant was determined as the ratio of the mutant Ka to the consensus Ka (Table 8). In

this way, the fraction of properly refolded, active protein obtained in each assay (a variable)

was removed from the equation. The total mutant MX was also included as a negative control

against non-specific binding. These assays again indicated that KS1-zf1 binds specifically to

KSE, with a Kd of 7.10 ± 1.28 nM. Scanning mutants of the A or B sites affected binding

most profoundly: the A box mutant (M2) displayed an approximate 10-fold reduction in

affinity, while the B box mutant (M3) displayed an approximate 50-fold reduction in affinity

(Table 8). Mutations in the regions upstream and downstream of these sites (M1 and M4)

also displayed decreased affinity for KS1-zf1, but to a lesser extent. The Kd values obtained

for M1 and M4 contained a relatively large margin of error, as the FBA data displayed a

biphasic curve, to which Kaleidagraph software was unable to accurately fit the dissociation

curve.

Figure 13: Standard FBAs with 1 µg/ml polyA for KS1-zf1 binding to scanning mutants

M1-M4. Mutant MX was included as a negative control. Error bars represent the standard

deviation among three independent trials. Points corresponding to 720 and 360 nM protein

were suppressed in each curve. See Table 5 for oligonucleotide sequences, and Table 8 for

relative affinities.

Table 8: Relative affinities of KSE scanning mutants as determined by standard FBAs with

polyA carrier. Values were determined from Figure 13. Oligonucleotide

sequences are given in Table 5.

Relative affinitya
KSE 1.00b
M1 0.44 ± 0.21c
M2 0.12 ± 0.05
M3 0.04 ± 0.02
M4 0.52 ± 0.20
MX 0.00 ± 0.00

a Relative affinities were arrived at by dividing the mean apparent Ka for mutant oligonucleotides by the mean apparent Ka for the wild type DNA.

b The apparent Kd value for the binding of KS1-zf1 to KSE was 7.10 ± 1.28 nM.

c Errors were determined by the following formula:

$$\text{error} = \frac{x_{mut}}{x_{wt}} \left[ \left( \frac{\sigma_{mut}}{x_{mut}} \right)^{2} + \left( \frac{\sigma_{wt}}{x_{wt}} \right)^{2} \right]^{1/2}$$
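The error on a relative affinity can be illustrated with the standard propagation-of-error expression for a ratio of two independent means, which is what the Table 8 footnote appears to describe. This is a sketch under that assumption; the function name is hypothetical and the input numbers are invented for illustration, since the underlying mean Ka values and standard deviations are not tabulated.

```python
import math


def ratio_with_error(x_mut, sigma_mut, x_wt, sigma_wt):
    """Ratio x_mut / x_wt and its propagated error, assuming independent errors."""
    ratio = x_mut / x_wt
    relative_error = math.sqrt((sigma_mut / x_mut) ** 2 + (sigma_wt / x_wt) ** 2)
    return ratio, ratio * relative_error


# Invented example: mutant Ka = 0.5 ± 0.1, wild type Ka = 1.0 ± 0.2 (arbitrary units).
example_ratio, example_error = ratio_with_error(0.5, 0.1, 1.0, 0.2)
```

Because the relative errors add in quadrature, a mutant whose own Ka is poorly determined (such as M1 or M4, with their biphasic curves) carries a large error into its relative affinity even when the consensus Ka is well determined.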

Poly(d[IC]) interferes with binding reported in the literature

Standard FBAs were conducted to measure the binding of KS1-zf1 to

oligonucleotides reported in the literature (Gebelein and Urrutia, 2001), with either 1 µg/ml

poly(d[IC]) or 1 µg/ml polyA carrier. Oligonucleotide sequences are outlined in Table 9.

ROB1 was an individual oligonucleotide sequence selected by KS1, during the random

oligonucleotide binding assay in which the high affinity binding site for KS1 was

determined. A GC box (denoted GC) and mutant GC box (GCm) served as negative controls.

The final oligonucleotide that I was able to test was a KS1 consensus binding element in

which the base pairs flanking the B box on the 3’ side were deleted (KBE del 2). Additional

oligonucleotides had been tested by Gebelein and Urrutia (2001); however, I was unable to obtain

radiolabelled oligonucleotides with these constructs, as the 5’ overhangs were too short.

When poly(d[IC]) was used as the carrier, binding curves appeared to be shifted to the

right, and dissociation constants were accordingly high. Binding appeared to be non-specific,

as KS1-zf1 bound with approximately equal affinity to all oligonucleotides, including those

which were not bound in the literature, as well as the MX negative control (Figure 14). When

polyA was used as the carrier, on the other hand, KS1-zf1 bound specifically to KSE and

ROB1 only, which was in agreement with the literature. The dissociation constant for the

interaction with ROB1 was 5.30 ± 1.72 nM.

Table 9: Sequences of oligonucleotides used by Gebelein and Urrutia (2001). Differences

between the GC box and mutant GC box (GCm) are indicated in red.

Oligonucleotide sequence

ROB 1 TCCGTCTTGCTATCATACTACCCACCCTACAGATGCACAACAGTAGGAATT AGGCAGAACGATAGTATGATGGGTGGGATGTCTACGTGTTGTCATCCTTAA

GC ATTCGATCGGGGCGGGGCGAGC TAAGCTAGCCCCGCCCCGCTCG

GCm TTCGATCGGTTCGGGGCGAGC AAGCTAGCCAAGCCCCGCTCG

KBE TCCTACAGTACCAACCCTACAGAGTAA AGGATGTCATGGTTGGGATGTCTCATT

KBE del 1 AGTACCAACCCTACAGAGTAA TCATGGTTGGGATGTCTCATT

KBE del 2 TCCTACAGTACCAACCCTACAG AGGATGTCATGGTTGGGATGTC

Figure 14: Comparison of standard FBAs with 1 µg/ml poly(d[IC]) or polyA for KS1-zf1

binding to literature mutants. Mutant MX was included as a negative control. See Tables 5

and 9 for oligonucleotide sequences, Table 10 for dissociation constants.

Figure 14 panels: poly(d[IC]); polyA.

Table 10: Dissociation constants of literature oligonucleotides as determined by standard

FBA with polyA carrier. Refer to Table 9 for oligonucleotide sequences.

Kd (nM) Binding reported in literaturea

ROB 1 5.30 ± 1.72 Yes

GC 2090 ± 1770 No

GCm 1260 ± 811 No

KBE del 2 657 ± 316 No

a Gebelein and Urrutia (2001).

Discussion

KS1 protein constructs

Initial efforts to clone and express the zinc finger region of KS1 from the pTOPO

vector were fruitless, due to an error on the part of Thermo Scientific, who supplied the

wrong gene. Nevertheless, this was a helpful exercise in plasmid characterization, as well as

a reminder that one should maintain a healthy level of scepticism toward any biological or

chemical reagent that has not been prepared personally. In addition, one of the protocols that

I used to characterize the insert proved to be of use in the verification of later plasmid

preparations, as I will describe in due course.

When PCR reactions involving primers specific to KS1 failed to yield product, I first

investigated a potential problem with primer annealing. The annealing temperature of the

PCR thermocycler program was raised from 55°C to 62°C, which was 4 to 5 degrees below

the melting temperature of all three primers (IDT). It was hoped that this temperature would

be low enough to promote primer annealing to the complementary region within the KS1

gene, while being high enough to allow rapid breaking of transient bonds between

non-complementary regions. A touchdown PCR program was also employed; this program began

at a high annealing temperature and gradually decreased (Table 1), thus increasing the

likelihood that the optimal annealing temperature would be achieved during some of the PCR

cycles. Despite the fact that all PCR programs had failed to amplify the desired product,

positive controls (known plasmids and universal primers) were successfully amplified;

therefore, the reaction conditions, enzymes, buffers, and thermocycler were fully functional.

Diagnostic PCR reactions were performed in order to ascertain whether the problem

lay in the primers and/or the insert, or in the vector. The primers specific to KS1 would not

amplify a product if the insert was incorrect, and conversely, the KS1 gene would not be

amplified if the primers were incorrect; thus, the primers and the insert were essentially

indistinguishable in these trials. The universal M13 primers, which are complementary to the

pTOPO vector, produced a band of between 2036 and 3054 base pairs (Figure 5). This served

to demonstrate that the plasmid did, in fact, contain an insert. If the plasmid had been an

empty pTOPO vector, the band observed using the M13 primers would have been much

smaller. The full length KS1 gene is 2442 base pairs in length, and the M13 primers anneal


40 nucleotides upstream and downstream of the vector’s multiple cloning site; therefore, the

band observed in Figure 5 was in the correct size range.
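
The size argument above can be checked with simple arithmetic. The sketch below assumes that the amplicon length is the insert plus the two flanks of roughly 40 nucleotides stated in the text; primer lengths and any vector sequence between the cloning site and the insert are ignored, and the function and constant names are illustrative only.

```python
def expected_m13_amplicon(insert_bp: int, flank_bp: int = 40) -> int:
    """Approximate amplicon length: insert plus one flank on each side."""
    return insert_bp + 2 * flank_bp

KS1_GENE_BP = 2442           # full-length KS1 gene (from the text)
SIZE_WINDOW = (2036, 3054)   # size range reported for the observed band

size = expected_m13_amplicon(KS1_GENE_BP)
in_window = SIZE_WINDOW[0] <= size <= SIZE_WINDOW[1]
print(size, in_window)  # 2522 True
```

Because the window spans about 1000 base pairs, this check shows only that the band was consistent with a KS1 insert; it cannot distinguish KS1 from another insert of similar length.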

The functionality of each KS1-specific primer was tested individually, using

combinations of specific and universal primers. This attempt was a long shot, at best, as one

would expect to observe large smears if one functional KS1-specific primer were paired with

a non-functional one. If a DNA polymerase is left to terminate replication on its own, without

the “bracketing” function of a second primer, it does so at random, by eventually falling off

the DNA template strand (Berg et al., Biochemistry 6e). This creates amplicons of differing

lengths, which appear as a smear, instead of a band, within the agarose gel. As expected, the

combination of the KS1-zf1 upstream NcoI primer with the universal downstream primer

yielded no product. Surprisingly, the KS1-zf2 upstream NcoI primer combined with the

universal reverse primer produced two faint bands, and the KS1 XhoI primer combined with

the universal forward primer produced a single weak band (Figure 5). By chance, these bands

were within the correct size range to potentially correspond to the KS1 gene. However, this

left crucial questions unanswered: If both the KS1-zf2 upstream primer and the KS1

downstream primer were functional, why did they fail to yield product when combined? Why

did the KS1-zf2 primer produce two bands, and why were bands for both of the primers in

question so faint? It was much more likely that the bands were simply false positives.

Restriction digest characterization of the putative pTOPO-KS1 plasmid confirmed

that something was amiss, and ultimately indicated a problem with the insert. Firstly, the

linearized plasmid did not appear to run at the correct size in a 0.8% agarose gel: pTOPO-

KS1 was expected to be 6700 base pairs in length, while the plasmid at hand ran at

approximately 6100 base pairs (Figure 6). It is important to note that the plasmid was

linearized with two restriction enzymes: NotI was to cut a unique site within the vector, while

NdeI was to cut a unique site within the insert. The NdeI digest was incomplete in all

replicates, as is demonstrated by the faint band, corresponding to the relaxed circular

plasmid, that ran at approximately 12,000 base pairs; however, this was attributed to an

excess of plasmid in the restriction digest reaction.

Secondly, all of the diagnostic restriction digests which were designed to produce

identifiable banding patterns, by cutting at multiple sites within vector and insert, produced

an unexpected number and/or size of bands. This was particularly informative in the case of


EcoRI and NcoI, for which all of the expected restriction sites were within or directly

flanking the insert. These enzymes were expected to produce 7 and 3 restriction fragments,

respectively (Table 4). Instead, EcoRI produced only 3 fragments, and the plasmid appeared

to be largely cut only once by NcoI, with 3 other fragments faintly visible (Figure 6). In both

cases, the sum of the sizes of the bands was greater than the size of the linearized plasmid,

indicating that the restriction digests were not running to completion. Increasing the amount

of enzyme remedied this particular problem, but did not produce the banding patterns

expected for pTOPO-KS1 (Figure 6). Specifically, EcoRI produced 2 bands of approximately

4000 and 2100 base pairs, while NcoI appeared largely to linearize the plasmid, creating one

band equal in size to the product of a NotI digest (approximately 6100 base pairs). A very

faint band of approximately 6000 base pairs was also produced by the NcoI digest, perhaps

by partial digestion at a second, imperfect restriction site within the insert. Overall, the faulty

digestion of the insert region by EcoRI and NcoI strongly suggested that the insert within the

pTOPO plasmid was not the KS1 gene. As I have indicated, this hypothesis was ultimately

confirmed by sequencing the plasmid, to find that the insert was the TBC1D13 gene, which

encodes a Rab GTPase activator protein (NCBI).
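
The completeness check applied to the digests above rests on one rule: for a circular plasmid cut to completion, the fragment sizes should add up to the plasmid size. A minimal sketch of that rule follows; the 200 base pair tolerance for gel sizing error is an assumption, and the partial-digest band sizes are illustrative rather than measured values.

```python
def digest_is_complete(fragment_sizes_bp, plasmid_bp, tolerance_bp=200):
    """True if the fragment sizes add up to the plasmid size within tolerance."""
    return abs(sum(fragment_sizes_bp) - plasmid_bp) <= tolerance_bp

PLASMID_BP = 6100  # apparent size of the linearized plasmid (from the text)

# EcoRI digest with increased enzyme: bands of roughly 4000 and 2100 bp.
ecori_complete = digest_is_complete([4000, 2100], PLASMID_BP)

# A partial digest leaves singly cut plasmid alongside the final fragments,
# so the band sizes over-count the plasmid (illustrative values only).
partial_example = digest_is_complete([6100, 4000, 2100], PLASMID_BP)

print(ecori_complete, partial_example)  # True False
```

A digest that passes this check can still give the wrong banding pattern, as was the case here; the check addresses completeness only, not the identity of the insert.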

Despite the fact that the KS1 zinc finger domain could not be cloned from the pTOPO

plasmid, all was not lost in this experience. It was a simple matter to clone the desired

constructs from human genomic DNA (Figure 7), as the zinc finger domain is encoded in a

single exon at the 19q13.13 locus (Luo et al., 2002). Then, when it came time to sequence the

clones, I was able to make use of the KS1-specific and universal PCR primers to confirm that

the plasmids identified by colony PCR (Figure 8) indeed contained inserts (see example in

Figure 9). This was useful, as I had encountered problems with false positives from colony

PCR in the past.

A subsequent difficulty arose in the purification of the pET30a-KS1-zf1 plasmid: the

plasmid preparation was contaminated, and could not be sequenced. While these issues were

being resolved, by re-purification of the plasmid, research went ahead using protein

expressed from the plasmid which could not be sequenced. When the re-purified plasmid was

obtained, new KS1-zf1 protein was expressed and purified from this plasmid; from this point

onward, all FBAs were carried out using the second protein preparation only. What are the

possible implications for the FBA results obtained using these two protein preparations?


Ideally, the two preparations should be identical, but the replication of the original plasmid

within E. coli DH5α could have introduced one or more mutations. The samples of protein

expressed from either plasmid appeared at the same molecular weight on SDS-PAGE gels

(Figure 11), indicating that a frameshift mutation leading to the introduction of a premature

stop codon did not occur.

Sequencing of the re-purified KS1-zf1 plasmid was delayed by complications, this

time due to high plasmid concentration causing false peaks. When the sequence was

ultimately obtained, the protein was found to contain three amino acid mutations (Figure 10);

however, at this point it was too late to clone and sequence a new KS1-zf1 construct, and

obtain purified protein with which to conduct FBAs. The point mutations were not expected

to abolish the DNA binding activity of any of the protein’s zinc fingers, as none of the

mutations affected zinc-binding cysteines and histidines. However, all three mutations may

have destabilized the peptide. A glutamine to glutamate mutation in the first residue of the

KS1-zf1 peptide placed a negatively charged side chain at the positively charged N-terminus

of the protein, which may have destabilized protein folding. A lysine to isoleucine mutation

occurred in the linker between zinc fingers 4 and 5, altering the overall charge of the linker

sequence; however, it is the threonine side chain, rather than the lysine side chain, that

stabilizes the linker upon DNA binding (Figure 2). Finally, a glutamine to arginine

mutation between the two key cysteines in zinc finger 9 introduced a positive charge, which

could potentially have decreased the ability of the cysteine residues to bind zinc, by

electrostatic repulsion of the positively charged ion. Among all of the KS1 zinc fingers, there

is a strong tendency toward neutral and negatively charged amino acids between the two key

cysteines. In future, KS1-zf1 will be re-cloned to eliminate these mutations.

Filter binding assays

All FBAs, performed with either KS1-zf1 or -zf2, showed a general lack of DNA

sequence specificity under the assay conditions. However, these results were not entirely

believable, as the two peptides bound with surprisingly low affinity to their consensus DNA

binding sequence, KSE: the apparent Kd for KS1-zf1 was 114 ± 10 nM, and that of KS1-zf2

was 133 ± 13 nM (Table 6). Artificial 6-zinc finger peptides have been shown to bind to 18-

base pair DNA sites with femtomolar dissociation constants (Kim and Pabo, 1998). Seeing as


KS1 is thought to require 9 of its 10 zinc fingers to bind a 27-base pair minimum site

(Gebelein and Urrutia, 2001), very strong binding to the DNA consensus site was expected.

However, the caveat to the extreme binding capacity of artificial zinc finger proteins is that

groups of three tandem zinc fingers had to be separated by longer, noncanonical linker

peptides of eleven amino acids as opposed to the usual five in the canonical linker (Kim and

Pabo, 1998). Without this added flexibility, the strain of the polyzinc finger protein upon the

DNA, which must unwind slightly even for proteins with 3 zinc fingers (Peisach and Pabo,

2003), is so great that binding is hindered. Since the nine tandem zinc fingers in KS1 are

separated by canonical linkers (Luo et al., 2002), femtomolar Kd values were not expected

for this protein. Nevertheless, the dissociation constants of natural polyzinc finger proteins

are typically within the nanomolar range, and so the values observed in standard FBAs were

thought to be questionable.

It was also rather unsettling that the KS1 peptides displayed little or no specificity for

the consensus KSE compared to the mutants, or even the unrelated DNA oligonucleotide,

ZFY4R. This is particularly apparent for the KS1-zf1 peptide, which bound with slightly

higher affinity to scanning mutants M1 through 4, and with only moderately reduced affinity

to ZFY4R, the intended negative control. KS1-zf2 also showed only moderate reductions in

affinity for the scanning mutants (Table 6). These results disagreed with the very method by

which the consensus sequence had been determined, by Gebelein and Urrutia (2001): a

random library of DNA oligonucleotides was subjected to successive rounds of selection by

KS1 binding and purification, leading to the isolation of a pool of oligonucleotides with high

sequence identity, from which a consensus sequence could be deduced. Non-specific DNA

binding would not have led to the selection of oligonucleotides with sequence identities, and

so it would have been impossible to arrive at a consensus sequence.

At this point, the most likely explanation for both the weak DNA binding and the

lack of sequence specificity was very low protein activity. In the standard FBA, 100%

protein activity is assumed; however, expressing and storing the proteins in denaturing

conditions may reduce protein activity, due to improper refolding. When making

comparisons between DNA mutants, decreased activity is usually not a problem, as relative

affinities are determined as the ratio of mutant DNA binding to wild type DNA binding for

trials done in parallel, using the same refolded protein. In this way, the protein activity is


essentially “divided out” of the equation. (It follows from this that the Kd values for KS1-zf1

and -zf2 reported above cannot be reliably compared, since the activities of the two protein

preparations are unknown, and likely different.) Nevertheless, when considering the relative

affinity values of any one protein, the activity of the refolded protein may become

problematic if it is extremely low, to the point that the lower range of the protein dilution

series contains virtually no active protein. Indeed, an unusually low level of binding was

observed in the lower range of the protein dilution series; only the first three protein dilutions

(720 nM, 360 nM, 180 nM) showed significant DNA binding. This seemed to confirm that

protein activity was at fault; however, later evidence that poly(d[IC]) was interfering with

specific binding would disprove this hypothesis.
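
The way in which protein activity is "divided out" of a relative affinity can be illustrated numerically. The sketch below assumes a simple model in which inactive protein does not bind DNA at all, so that the apparent Kd, expressed in terms of total protein, is the true Kd divided by the active fraction. The Kd values are hypothetical, and relative affinity is taken here as Kd(wild type) / Kd(mutant).

```python
def apparent_kd(true_kd_nm: float, active_fraction: float) -> float:
    """Apparent Kd in terms of total protein when only a fraction is active."""
    return true_kd_nm / active_fraction

def relative_affinity(kd_wild_type_nm: float, kd_mutant_nm: float) -> float:
    """Affinity for the mutant relative to wild type (1.0 = unchanged)."""
    return kd_wild_type_nm / kd_mutant_nm

TRUE_KD_WT_NM, TRUE_KD_MUT_NM = 5.0, 50.0  # hypothetical values

for active_fraction in (1.0, 0.5, 0.05):
    wt = apparent_kd(TRUE_KD_WT_NM, active_fraction)
    mut = apparent_kd(TRUE_KD_MUT_NM, active_fraction)
    # The apparent Kd values change with activity, but their ratio does not.
    print(active_fraction, wt, mut, relative_affinity(wt, mut))
```

The ratio is independent of the active fraction only while the dilution series still spans the apparent Kd; at very low activity the apparent Kd moves beyond the highest protein concentration, which is the failure mode described above.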

There is no direct assay for the activity level of a refolded zinc finger protein. In

previous studies, the activity of zinc finger proteins, such as TFIIIA, has been assessed using

the Scatchard analysis. This method consists of a binding assay in which the protein

concentration is held constant, while the concentration of labelled nucleic acid is varied

(Romaniuk, 1985). The slope of the resulting line indicates the apparent Ka, and if this value

is in agreement with the Ka obtained by standard FBAs, then the protein can be assumed to

be 100% active. An important feature of the Scatchard analysis is that the protein

concentration is held constant in each well; it is this feature of the assay which allows protein

activity to be disregarded in Ka calculations.
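
A minimal sketch of a Scatchard analysis is given below, assuming the conventional form of the plot (bound/free against bound) and simple 1:1 binding; the exact procedure of Romaniuk (1985) may differ. In this form the slope equals −Ka and the x-intercept gives the concentration of active binding sites. The protein concentration, Kd, and ligand series are hypothetical.

```python
import math

def bound_complex(total_protein, total_ligand, kd):
    """Equilibrium complex concentration for 1:1 binding (quadratic solution)."""
    s = total_protein + total_ligand + kd
    return (s - math.sqrt(s * s - 4.0 * total_protein * total_ligand)) / 2.0

def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical experiment: 10 nM active protein, true Kd of 5 nM.
ACTIVE_PROTEIN_NM, TRUE_KD_NM = 10.0, 5.0
ligand_totals = [1, 2, 5, 10, 20, 50, 100]  # nM labelled nucleic acid

bound = [bound_complex(ACTIVE_PROTEIN_NM, lt, TRUE_KD_NM) for lt in ligand_totals]
free = [lt - b for lt, b in zip(ligand_totals, bound)]
ratio = [b / f for b, f in zip(bound, free)]

slope, intercept = linear_fit(bound, ratio)
ka_per_nm = -slope                        # slope of bound/free vs bound is -Ka
binding_sites_nm = intercept / ka_per_nm  # x-intercept: active binding sites
print(1.0 / ka_per_nm, binding_sites_nm)
```

With noise-free synthetic data the fit returns the Kd and active protein concentration that were used to generate it; with real data the x-intercept can be compared against the nominal protein concentration to estimate the active fraction.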

The DNA excess FBA makes use of the same principle as the Scatchard analysis in

order to eliminate the protein activity variable: the protein concentration as well as the

labelled oligonucleotide concentration are held constant, and instead, unlabelled competitor

DNA with the same sequence as the labelled DNA is used to titrate the protein of interest.

Thus, the Kd in a DNA excess FBA is defined as the DNA concentration at 50% saturation

(Equation 4), instead of being defined in terms of protein concentration as in the standard

FBA (Equation 3). Therefore, I used this assay, which would not be affected by protein

activity, in an attempt to detect specific DNA binding by KS1.

The dissociation constants determined by DNA excess FBAs fell within the expected

nanomolar range: 6.18 ± 3.91 nM and 3.76 ± 1.30 nM Kd values were obtained for KS1-zf1

and -zf2, respectively. This seemed to indicate that low protein activity was at fault for the

weak binding observed in standard FBAs. The DNA excess FBA protocol uses a constant


protein concentration of 200 nM, while the standard FBA involves a protein dilution series

from 720 to 0.36 nM. Within the standard FBA protein dilution series, only the first two

points fell above 200 nM; if protein activity was so low that concentrations below this level

contained virtually no active protein, then this would explain why many data points showed

little to no protein-DNA binding. For DNA excess FBAs conducted with KS1-zf2, it was

necessary to increase the protein concentration to 400 nM in order to observe binding. This

requirement for higher overall protein concentration was, at the time, attributed to an even

lower activity of the KS1-zf2 protein preparation than of the KS1-zf1 preparation. Once

again, later evidence of poly(d[IC]) interference disproved these hypotheses about protein

activity, as will be discussed.
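
The comparison between the two protocols can be made concrete by listing the standard dilution series. The sketch below assumes two-fold dilutions, which is inferred from the first three concentrations quoted above (720, 360, 180 nM), and assumes 12 points, which ends the series at about 0.35 nM, close to the 0.36 nM quoted; neither assumption is stated explicitly in the text.

```python
def two_fold_series(start_nm: float, n_points: int):
    """Protein concentrations for serial two-fold dilutions."""
    return [start_nm / 2 ** i for i in range(n_points)]

DNA_EXCESS_PROTEIN_NM = 200.0  # constant protein concentration in the DNA excess FBA

series = two_fold_series(720.0, 12)
above_200 = [c for c in series if c > DNA_EXCESS_PROTEIN_NM]

print([round(c, 2) for c in series])
print(above_200)  # [720.0, 360.0]
```

Only the two most concentrated wells exceed the 200 nM used in the DNA excess assay, which is the basis of the argument in the paragraph above.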

The DNA excess FBA protocol did not resolve the lack of sequence specificity that

was observed in standard FBAs. Notably, both KS1-zf1 and -zf2 appeared to bind to the

negative control oligonucleotides, ZFY4R and NRSE, with slightly higher affinity than the

consensus KSE (Table 7). Also, both proteins appeared to bind to the total mutant MX, in

which the entire 27-base pair KSE sequence was mutated. These key findings rendered the

results obtained for the remaining mutants inconsequential; that is, the binding of both KS1

peptides was not sequence specific, and so any variability observed between the mutants

could only be attributed to experimental error. Indeed, the margin of error in the data was

relatively high.

The standard and DNA excess FBA protocols were modified, to determine whether

DNA binding by the two KS1 constructs was non-specific in both cases. First, the

concentration of KCl was increased over two-fold in the DNA excess assay, to assess the

contribution to binding of electrostatic contacts between positively charged amino acid side

chains and the DNA backbone. Binding of both peptides to KSE or MX was abolished in the

presence of 250 mM KCl, indicating that the binding observed in the normal DNA excess

assay was due entirely to non-specific electrostatic contacts. Backbone contacts are

common among zinc finger proteins, and serve to enhance the free energy of binding. For

example, the extensively characterized zinc finger protein, Zif268, makes 7 or 8 contacts

with the DNA backbone (Hamilton et al., 1998). However, numerous specific interactions

with the DNA major groove are normally present in conjunction with these backbone

contacts; in the case of Zif268, each finger forms hydrogen bonds with three or four DNA


bases (Elrod-Erickson et al., 1996; Isalan et al., 1997). Although the effect of increasing the

KCl concentration suggested that KS1 was incapable of forming specific hydrogen bonding

interactions with DNA, this result could not be taken at face value. As mentioned previously,

a consensus binding site could not have been identified if KS1 did not bind specifically to

DNA under certain conditions; therefore, the conditions of the FBAs had to be disrupting

specific binding interactions. The two most likely candidates were improper protein folding

in the TMK buffer, or interference of poly(d[IC]).

Fortuitously, I next investigated the effect of the poly(d[IC]) carrier DNA, which was

found to be at the root of the problem. In these trials, the binding of KS1-zf1 and -zf2 to

KSE, MX, or ZFY4R was measured by standard FBAs, using either poly(d[IC]), polyA, or

no carrier. Poly(d[IC]) is the non-specific carrier DNA that was used in all standard and

DNA excess FBAs, and it was included in these trials at the standard concentration. PolyA

RNA was used as an alternate carrier, to assess whether single stranded nucleic acid of a

different sequence could interfere with specific DNA binding. The standard FBA protocol

was chosen for this purpose, because all of the DNA oligonucleotides of interest are labelled,

while poly(d[IC]) or polyA is unlabelled. Thus any reduction in apparent Ka could be

attributed solely to poly(d[IC]) or polyA, unlike a DNA excess FBA, in which a fraction of

the oligonucleotides of interest would also be unlabelled.

The results of this trial represented a turning point in my project, as they strongly

suggested that specific DNA binding can be achieved using polyA carrier. In the absence of

carrier, the Ka values for KS1-zf1 or -zf2 binding to all three labelled DNA oligonucleotides

were approximately 1000-fold greater than values obtained in the presence of poly(d[IC])

(Figure 12). These findings implied that the KS1 peptides were binding to

poly(d[IC]) DNA, which shares no sequence identity with any human DNA. On its own, this

result would have suggested once again that binding was non-specific. However, when polyA

was used as a carrier, KS1-zf1 bound to KSE with an apparent Kd of 7.14 ± 0.70 nM, and

did not show significant binding to non-cognate DNA (micromolar Kd values). KS1-zf2, on

the other hand, did not display specific DNA binding when polyA was used as the carrier;

instead, it bound weakly to all three labelled DNA oligonucleotides. Binding to KSE was

only approximately 5 times stronger than to MX or ZFY4R, and even so, the apparent Kd was


in the micromolar range. Thus, KS1-zf1 binds specifically to the KSE consensus sequence in

the presence of polyA carrier, but KS1-zf2 does not.

One possible explanation for the interference of poly(d[IC]) with the specific binding

of KS1-zf1 is that the repetitive inosine-cytosine sequence shares common chemical features

with critical residues of the KSE site. This seems unlikely, since inosine is not typically

found in DNA; however, an IC base pair displays the same set of hydrogen bond donors and

acceptors at the major groove as a GC base pair (Figure 15). It is only at the minor groove

that the IC and GC base pairs differ, as guanine has an additional amino group which inosine

lacks. Zinc finger proteins bind to DNA by wrapping around the major groove, and form

sparse contacts with the minor groove. Usually, zinc fingers only contact the minor groove if

they are acting as a spacer to traverse a region of minor groove separating two high-affinity

major groove binding sites, as is the case for TFIIIA zinc fingers 4 and 6 (Nolte et al., 1998).

Since KS1 is thought to wrap continuously around the major groove, it could conceivably

form the same specific hydrogen bonding interactions with an IC base pair of poly(d[IC])

carrier DNA as it would with a GC base pair of KSE, so that poly(d[IC]) would essentially act

as a competitive inhibitor of specific DNA binding. If these poly(d[IC])-KS1 complexes could

form electrostatic interactions with the DNA backbone, or if the refolded KS1 protein

preparation contained any misfolded molecules capable of forming electrostatic interactions

with DNA, then the non-specific binding of these species to labelled oligonucleotide would

have been detected in FBAs.

However, this model raises the question: when a GC box oligonucleotide was later

used as a negative control, why did this not show specific binding (Figure 14, Table 10)? A

notable feature of the GC box is that it is only 22 base pairs long; if the minimum binding site

of KS1 is 27 base pairs in length, as determined by Gebelein and Urrutia (2001), then the GC

box may have been too short to bind to KS1. Indeed, a truncated consensus site (KBE del 2),

which was also only 22 base pairs long, also failed to bind KS1 (Figure 14, Table 10).


Figure 15: Comparison of IC and GC base pairs. Both present identical groups at the major

groove. D denotes a hydrogen bond donor, and A denotes a hydrogen bond acceptor.

An alternative explanation is that KS1 bound non-specifically to the poly(d[IC])

backbone, and that the concentration of poly(d[IC]) was simply too high. Although poly(d[IC])

concentrations as high as 5 µg/ml have been used in FBAs (Romaniuk, 1990; Hamilton et al.,

1998), it is possible that KS1 is more sensitive to the presence of non-specific carrier DNA

than are other zinc finger proteins. In this model, KS1 forms strong bonds with the DNA

backbone of any 27-base pair site. If a poly(d[IC]) fragment is taken to be approximately

1000 base pairs long, then each fragment contains 37 potential KS1 binding sites. Thus, KS1

is easily saturated by poly(d[IC]), and it is unable to bind to labelled, specific

oligonucleotides except at high protein concentration. The results of FBAs conducted without

carrier support this model: both the KS1-zf1 and -zf2 peptides bound to the non-cognate

DNA molecules, MX and ZFY4R, with Kd’s in the nanomolar range (Figure 12). From this,

it follows that the peptides may also have bound to each 27-base pair site along the

poly(d[IC]) backbone with a Kd in the nanomolar range. This problem could potentially be

resolved by decreasing the poly(d[IC]) concentration. For the sake of thoroughness, the effect

of reducing the concentration of polyA should also be studied, to ensure that the observed Kd

values represent as accurately as possible the affinity of KS1 for the DNA oligonucleotide in

question.
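
The figure of 37 potential sites per poly(d[IC]) fragment corresponds to counting non-overlapping 27-base pair sites on the assumed 1000 base pair fragment. The short sketch below reproduces that count and, for comparison, the number of overlapping binding registers, which is an upper bound on the positions at which a non-specifically bound protein could sit. The function names are illustrative.

```python
def non_overlapping_sites(fragment_bp: int, site_bp: int) -> int:
    """Number of sites that fit end to end on the fragment."""
    return fragment_bp // site_bp

def overlapping_registers(fragment_bp: int, site_bp: int) -> int:
    """Number of distinct start positions for a site on the fragment."""
    return max(fragment_bp - site_bp + 1, 0)

FRAGMENT_BP = 1000  # approximate poly(d[IC]) fragment length assumed in the text
SITE_BP = 27        # minimum KS1 binding site length

print(non_overlapping_sites(FRAGMENT_BP, SITE_BP))   # 37
print(overlapping_registers(FRAGMENT_BP, SITE_BP))   # 974
```

Only 37 proteins could occupy one fragment simultaneously, but the much larger number of registers suggests why a carrier of this length could compete effectively with a single labelled site.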

Unlike poly(d[IC]), 1 µg/ml polyA did not interfere with the specific binding of KS1-

zf1 to KSE, while still preventing non-specific interactions with MX and ZFY4R (Figure 12).

As a single stranded RNA species, polyA neither presents a pattern of hydrogen bonding


groups in the context of a major groove, nor does it possess a double-helical backbone. DNA-

binding zinc fingers form bonds with both strands of the DNA double helix (Isalan et al.,

1997), and so binding to a single stranded nucleic acid molecule would be much less efficient

than to double stranded DNA. Although some zinc finger proteins, such as TFIIIA, are

capable of binding to both DNA and RNA, they do so using different subsets of zinc fingers

(Romaniuk, 1985; 1990). Thus, a zinc finger that binds specifically to DNA will not show

significant RNA binding, and vice versa. Since KS1 is a DNA-binding zinc finger protein, it

would not be expected to form strong interactions with RNA; therefore, the polyA RNA

carrier acted as a suitably weak competitor to disrupt non-specific binding, while leaving

specific protein-DNA interactions intact. Further research is needed to confirm that KS1 is

not capable of binding specific RNA sequences; however, for the purposes of this study, it is

sufficient to state that KS1 does not bind specifically to polyA RNA.

An important conclusion came out of the success of FBAs with the polyA carrier: my

prior hypothesis that the activity of the refolded KS1-zf1 protein preparation was extremely

low was disproved. The same cannot be said for the KS1-zf2 preparation, as this protein did

not show specific binding to KSE when polyA was used as carrier. However, this may

instead be evidence that zinc finger 1 and/or the 56-amino acid spacer region of the KS1 zinc

finger domain is required for specific DNA binding. Although a study by Gebelein and

Urrutia (2001) indicated that these regions do not participate directly in DNA binding, they

may be required to stabilize protein folding. Krüppel-type zinc fingers undergo a

conformational change from a flexible state to a rigid state upon DNA binding, due to C-

capping and N-capping interactions that form between zinc finger α-helices and surrounding

residues (Laity et al., 2000). It is conceivable that certain amino acids in the spacer region

between zinc fingers 1 and 2 could adopt a similar role in stabilizing zinc finger 2. However,

further research is required to determine the minimal KS1 peptide that is capable of specific

DNA binding. At this time, it can only be concluded that zinc fingers 2 to 10 do not appear to

be sufficient for specific DNA binding.

This conclusion disagrees with the work of Gebelein and Urrutia (2001), which

demonstrated that KS1 requires only 9 of its 10 zinc fingers to bind DNA. However, in that

particular study, the DNA binding activity of KS1 was measured based on repression of a

reporter gene. The transcriptional repressor function of KS1 also relied upon the presence of


an N-terminal KRAB domain; in the construct containing only 9 zinc fingers, this domain

was fused to the N-terminal end of zinc finger 2. In this way, the KRAB domain may also

have acted to stabilize zinc finger 2. In the current study, the KS1-zf2 construct began with

four amino acids upstream of zinc finger 2; the first residue was mutated from glutamine to

aspartate (data not shown), and this negative charge at the protein’s N-terminus may have

destabilized the folding of zinc finger 2. This implies that zinc finger 2 plays a critical role in

specific DNA binding, if disrupting its structure is enough to abolish the function of the

entire zinc finger domain.

Having determined the assay conditions which could be used to detect specific DNA

binding by KS1-zf1, I conducted standard FBAs involving scanning mutants M1 through M4

again, this time using the polyA carrier. The assay conditions were still not ideal, as it was

necessary to suppress the data points for the two highest protein concentrations (720 and 360

nM) when fitting the equilibrium binding curves. In order for an equilibrium binding curve to

be useful in determining a Kd value, the curve must plateau at the point when saturation is

reached. Such a plateau was indeed reached in all of my equilibrium binding curves;

however, the fractions of DNA bound by 720 and 360 nM protein were considerably higher

than the plateau. In the case of mutants M1 and M4, an apparent plateau was reached at even

lower protein concentrations, and so the curves appeared biphasic even when the final two

data points were suppressed. This may be explained by an increase in retention efficiency on

the nitrocellulose filter at high concentrations of DNA-protein complex. It is generally

assumed that the retention efficiency does not change over the range of protein

concentrations used in the standard FBA; however, this assumption must be tested, and

modifications to the protocol must be made accordingly, in the case of the KS1 protein.
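
The effect of suppressing the two highest protein concentrations can be illustrated with synthetic data. The sketch below is not the fitting procedure used in this study: it assumes a one-site curve with an adjustable plateau, fraction bound = plateau × P / (Kd + P), fits it by a simple grid search, and uses invented data in which the 720 and 360 nM points are artificially inflated to mimic increased filter retention.

```python
def fit_one_site(protein_nm, fraction_bound, kd_grid_nm):
    """Least-squares fit of y = plateau * P / (Kd + P) by grid search over Kd.

    For each trial Kd the best plateau has a closed form, so only Kd is scanned.
    Returns (kd_nm, plateau).
    """
    best = None
    for kd in kd_grid_nm:
        shape = [p / (kd + p) for p in protein_nm]
        plateau = (sum(s * y for s, y in zip(shape, fraction_bound))
                   / sum(s * s for s in shape))
        sse = sum((plateau * s - y) ** 2 for s, y in zip(shape, fraction_bound))
        if best is None or sse < best[0]:
            best = (sse, kd, plateau)
    return best[1], best[2]

# Synthetic data only: true Kd = 7 nM, plateau = 0.6, two-fold dilutions from 720 nM.
TRUE_KD_NM, TRUE_PLATEAU = 7.0, 0.6
protein = [720.0 / 2 ** i for i in range(12)]
bound = [TRUE_PLATEAU * (p / (TRUE_KD_NM + p)) for p in protein]
bound[0], bound[1] = 0.95, 0.90   # inflate the 720 and 360 nM points

kd_grid = [0.5 * i for i in range(1, 401)]   # 0.5 to 200 nM in 0.5 nM steps

kd_all, plateau_all = fit_one_site(protein, bound, kd_grid)
kd_trim, plateau_trim = fit_one_site(protein[2:], bound[2:], kd_grid)
print("all points:      ", kd_all, round(plateau_all, 3))
print("top two removed: ", kd_trim, round(plateau_trim, 3))
```

In this synthetic case the trimmed fit recovers the Kd and plateau used to generate the data, whereas the fit to all points is pulled toward a weaker apparent affinity. Whether the same holds for real data depends on the cause of the elevated points, which remains to be tested.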

Despite these problems, the results of FBAs with polyA carrier were still informative,

revealing that the strongest contacts are formed at the highly conserved regions of the KSE

consensus binding site. Mutation of the A box (mutant M2) reduced the binding affinity by

approximately 10-fold, while mutation of the B box (mutant M3) reduced the affinity by

approximately 50-fold (Table 8). Alterations to the regions flanking the A and B boxes also

affected KS1 binding; however, these mutations only reduced the affinity by approximately

2-fold (Table 8). Therefore, my results indicate that the KSE B box makes the greatest

contribution to the free energy of KS1 binding, and the A box also contributes significantly.


These findings agree with the nature of the KBE consensus sequence derived by Gebelein

and Urrutia (2001): the strong consensus among all of the A and B boxes in the selected

oligonucleotides implies that these regions were required for high-affinity binding during the

library enrichment process. The flanking regions did not show an equally strong consensus,

except at the -2, -4, and -8 positions upstream of the A box, as well as at the +5 position

downstream of the B box (highlighted in gray in Figure 3). The importance of these

individual base pairs, as well as the free energy contribution of each base pair within the A

and B boxes must be investigated in the future with KSE point mutants.
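
The fold reductions in affinity reported above can be converted into approximate free energy contributions using ΔΔG = RT ln(fold reduction). The sketch below assumes a temperature of 298.15 K, which is an assumption and not necessarily the temperature at which the FBAs were performed; the function name is illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol·K)

def ddg_kcal_per_mol(fold_reduction: float, temperature_k: float = 298.15) -> float:
    """Loss in binding free energy for a given fold reduction in affinity."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_reduction)

for label, fold in (("flanking regions", 2), ("A box (M2)", 10), ("B box (M3)", 50)):
    print(f"{label}: {ddg_kcal_per_mol(fold):.2f} kcal/mol")
# flanking regions: 0.41 kcal/mol
# A box (M2): 1.36 kcal/mol
# B box (M3): 2.32 kcal/mol
```

On this scale the B box mutation costs roughly 1 kcal/mol more than the A box mutation, while the flanking mutations each cost well under 1 kcal/mol.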

While the findings of these FBAs agreed with the general features of the consensus

binding site for KS1, there still remained one major discrepancy between the protein-DNA

binding data reported by Gebelein and Urrutia (2001) and my own: the Gebelein group had

observed what appeared to be specific DNA binding using the poly(d[IC]) carrier, while I

had not. This previous report had used non-quantitative electrophoretic mobility shift assays

(EMSAs) to investigate protein-DNA binding. In these experiments, protein and

radiolabelled oligonucleotides were incubated in an appropriate buffer, along with

poly(d[IC]) carrier. Reaction mixtures contained 200 ng of protein and 50 µg/ml poly(d[IC]),

whereas even the most concentrated well of the standard FBA protein dilution series

contained approximately 10 ng protein with 1 µg/ml poly(d[IC]); therefore, the specific

binding observed by the Gebelein group was not achieved by using an excess of protein.

Binding was subsequently analyzed on a non-denaturing polyacrylamide gel, where bound

protein would cause a supershift in the position of the labelled oligonucleotide, due to higher

combined molecular weight and due to the globular structure of the protein. Binding

appeared to be specific in these trials, as KS1 bound only to its high affinity binding site, and

not to unrelated oligonucleotides. In addition, visual inspection implied that supershifts were

weaker (although these experiments were not quantitative) when mutations were introduced

into the high affinity binding site, and binding was abrogated by deletions of the regions

flanking the A and B boxes.

I attempted to recreate the results of Gebelein and Urrutia by using identical

oligonucleotides (Table 9) in standard FBAs with either poly(d[IC]) or polyA carrier. If the

poly(d[IC]) carrier interfered with the binding of KS1-zf1 to the same oligonucleotide

sequences that had been bound previously, it would confirm that conducting FBAs with an

alternate carrier was an appropriate means of measuring KS1-DNA binding. Similarly, FBAs

conducted with polyA carrier would be reconciled with the literature if KS1 was shown to

bind specifically to the same sequences that had previously been bound in EMSAs. Indeed,

binding was non-specific in the presence of poly(d[IC]) carrier, while KS1-zf1 bound

specifically to one of its high affinity binding sites, ROB1, in the presence of polyA (Figure

14 and Table 10). Deleting five base pairs at the 3’ end of the KS1 consensus binding

element (KBE del 2), which rendered it only 22 base pairs long, abolished protein-DNA

binding, as was reported in the literature. Unfortunately, this result lacks an important

positive control (demonstrating that KS1-zf1 binds to full length KBE in standard FBA

conditions), as some oligonucleotides could not be end labelled due to short 5’ overhangs.

Nevertheless, the following conclusion can be drawn: since mutation of the analogous base

pairs in KSE (mutant M4) caused only a 2-fold reduction in binding affinity, the deletion

mutant must have abolished protein-DNA binding because the oligonucleotide no longer

satisfied the minimum binding site length, and not because the deletion removed a high-

affinity binding site.

PolyA also prevented non-specific interaction with two non-cognate oligonucleotides

which had been used as negative controls by Gebelein and Urrutia: a GC box, and a mutant

GC box, in which one GC-rich tract was interrupted by two mutations to thymine (Table 9).

It is important to note, however, that these negative control oligonucleotides were only 22

base pairs long. Thus, KS1 may have failed to bind to these sequences because they did not

satisfy the minimum binding site length, and not because of sequence specificity. Gebelein

and Urrutia postulated that KS1 requires a 27 base pair minimum binding site; however,

truncated binding sites between 23 and 26 base pairs in length have not been tested to date. It

may be worthwhile to test a series of successive deletion mutants to determine the true

minimum binding site length of KS1.
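
The deletion series suggested above could be laid out as in the following minimal sketch, which lists successive 3' truncations of a binding site for the untested lengths of 23 to 26 base pairs. It assumes the top strand is written 5' to 3', so that removing characters from the end of the string corresponds to a 3' deletion. The function name is hypothetical, and the 27-nucleotide string of N characters is a placeholder only, not the actual KBE sequence.

```python
def three_prime_deletion_series(site, min_len, max_len):
    """Return truncated versions of `site`, removing bases from the 3' end.

    Lengths run from max_len down to min_len (inclusive).
    """
    if not (0 < min_len <= max_len <= len(site)):
        raise ValueError("need 0 < min_len <= max_len <= len(site)")
    return [site[:length] for length in range(max_len, min_len - 1, -1)]


# Placeholder 27-nt sequence for illustration only; this is NOT the KBE sequence.
placeholder_site = "N" * 27

for oligo in three_prime_deletion_series(placeholder_site, 23, 26):
    print(len(oligo), oligo)
```

Substituting the real top-strand sequence for the placeholder would give the four oligonucleotides needed to test each intermediate length; an analogous series trimmed from the 5' end could be generated by slicing from the start of the string instead.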

Conclusion

Despite some difficulties and discrepancies, this study agrees with and builds upon

the current model for DNA binding by KS1. KS1 binds specifically to its full length binding

site, and not to a truncated binding site, as was reported previously. Specific binding appears

to require the full length KS1 zinc finger domain, at least to achieve correct protein folding.

Thus, at this point it is still conceivable that the entire zinc finger domain wraps around the

DNA double helix at the major groove, as was proposed previously. This study demonstrates

that protein-DNA binding, whatever the mechanism may be, has a dissociation constant of

7.10 ± 1.28 nM. The strongest interactions occur between KS1 and the B box region of the

consensus binding site; the A box also forms strong interactions with KS1, but these are

approximately 5-fold weaker than the interactions formed at the B box. The A and B boxes

together make up a 14 base pair core binding site; if each KS1 zinc finger motif binds a three

base pair subsite, then this implies that four or five zinc fingers contact this central region of

the binding site. Therefore, KS1 may circumvent the problem of strain associated with many

zinc fingers continuously wrapping around the DNA double helix by forming strong contacts

with only a subset of its zinc fingers, and forming weaker, more flexible contacts to the DNA

double helix with the rest of its zinc fingers.
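
To put the dissociation constant reported above in concrete terms, the following sketch evaluates a simple one-site binding isotherm, fraction bound = [P] / (Kd + [P]). This assumes a simple bimolecular equilibrium with protein in large excess over labelled DNA, and it is not necessarily the model used to fit the FBA data. The Kd for KSE is the value reported in this study; the mutant Kd values are only approximations obtained by multiplying it by the reported fold reductions, and the protein concentrations are arbitrary choices for illustration.

```python
def fraction_bound(protein_nm, kd_nm):
    """Fraction of DNA bound for a one-site equilibrium with protein in excess."""
    if protein_nm < 0 or kd_nm <= 0:
        raise ValueError("protein_nm must be >= 0 and kd_nm must be > 0")
    return protein_nm / (kd_nm + protein_nm)


KD_KSE_NM = 7.10  # apparent Kd reported for KS1 binding to KSE

# Approximate Kd values implied by the reported fold reductions (not measured values).
kd_by_site = {
    "KSE": KD_KSE_NM,
    "flanking mutant (~2-fold)": KD_KSE_NM * 2,
    "A box mutant (~10-fold)": KD_KSE_NM * 10,
    "B box mutant (~50-fold)": KD_KSE_NM * 50,
}

for protein_nm in (1.0, 7.10, 50.0):  # arbitrary illustrative concentrations
    for site, kd in kd_by_site.items():
        print(f"[P] = {protein_nm:5.2f} nM, {site}: {fraction_bound(protein_nm, kd):.2f} bound")
```

At a protein concentration equal to the Kd, half of the consensus site is bound by definition, while the same concentration leaves most of an A or B box mutant unbound under this model.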

References

Berg, J.M., Tymoczko, J.L., and Stryer, L. (2007) In: Biochemistry, 6th Edition. (W.H. Freeman

and Company, New York) pp 783-816.

Bradford, M.M. (1976) A rapid and sensitive method for the quantitation of microgram

quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72,

248-254.

Elrod-Erickson, M., Rould, M.A., Nekludova, L., and Pabo, C.O. (1996) Zif268 protein-

DNA complex refined at 1.6 Å: A model system for understanding zinc finger-DNA

interactions. Structure 4(10), 1171-1180.

Gebelein, B., Fernandez-Zapico, M., Imoto, M., and Urrutia, R. (1998) KRAB-independent

suppression of neoplastic cell growth by the novel zinc finger transcription factor

KS1. J. Clin. Invest. 102(11), 1911-1919.

Gebelein, B., and Urrutia, R. (2001) Sequence-specific transcriptional repression by KS1, a

multiple-zinc-finger-Krüppel-associated box protein. Mol. Cell. Biol. 21, 928-939.

Hamilton, T.B., Borel, F., and Romaniuk, P.J. (1998) Comparison of the DNA binding

characteristics of the related zinc finger proteins WT1 and EGR1. Biochemistry 37,

2051-2058.

Isalan, M., Choo, Y., and Klug, A. (1997) Synergy between adjacent zinc fingers in sequence-

specific DNA recognition. Proc. Natl. Acad. Sci. USA 94, 5617-5621.

Iuchi, S. (2001) Three classes of C2H2 zinc finger proteins. Cell. Mol. Life Sci. 58,

625-635.

Kamiuchi, T., Abe, E., Imanishi, M., Kaji, T., Nagaoka, M., and Sugiura, Y. (1998) Artificial

nine zinc-finger peptide with 30 base pair binding sites. Biochemistry 37,

13827-13834.

Kim, J., and Pabo, C.O. (1998) Getting a handhold on DNA: Design of poly-zinc finger

proteins with femtomolar dissociation constants. Proc. Natl. Acad. Sci. USA 95,

2812-2817.

Laity, J.H., Dyson, J., and Wright, P.E. (2000) DNA-induced α-helix capping in conserved

linker sequence is a determinant of binding affinity in Cys2-His2 zinc fingers. J. Mol.

Biol. 295, 719-727.

Luo, K., Yuan, W., Zhu, C., Li, Y., Wang, Y., Zeng, W., Jiao, W., Liu, M., and Wu, X.

(2002) Expression of a novel Krüppel-like zinc-finger gene, ZNF382, in human heart.

Biochem. Biophys. Res. Commun. 299, 606-612.

Moore, M., Choo, Y., and Klug, A. (2001) Design of polyzinc finger peptides with structured

linkers. Proc. Natl. Acad. Sci. USA 98, 1432-1436.

Mori, N., Schoenherr, C., Vandenbergh, D.J., and Anderson, D.J. (1992) A common silencer

element in the SCG10 and type II Na+ channel genes binds a factor present in nonneuronal

cells but not in neuronal cells. Neuron 9, 45-54.

Nolte, R.T., Conlin, R.M., Harrison, S.C., and Brown, R.S. (1998) Differing roles for zinc fingers

in DNA recognition: Structure of a six-finger transcription factor IIIA complex. Proc.

Natl. Acad. Sci. USA 95, 2938-2943.

Peisach, E., and Pabo, C.O. (2003) Constraints for zinc finger linker design as inferred

from x-ray crystal structure of tandem Zif268-DNA complexes. J. Mol. Biol.

330, 1-7.

Romaniuk, P.J. (1985) Characterization of the RNA binding properties of transcription factor

IIIA of Xenopus laevis oocytes. Nucleic Acids Res. 13(14), 5369-5387.

Romaniuk, P.J. (1990) Characterization of the equilibrium binding of Xenopus transcription

factor IIIA to the 5S RNA gene. J. Biol. Chem. 265, 17593-17600.

Taylor-Harris, P., Swift, S., and Ashworth, A. (1995) Zfy1 encodes a nuclear sequence-specific

DNA binding protein. FEBS Lett. 360, 315-319.

Wuttke, D.S., Foster, M.P., Case, D.A., Gottesfeld, J.M., and Wright, P.E. (1997) Solution

structure of the first three zinc fingers of TFIIIA bound to the cognate DNA

sequence: Determinants of affinity and sequence specificity. J. Mol. Biol.

273, 183-206.
