

The EMBO Journal vol.10 no.1 pp.35-43, 1991

The conformation of a B-DNA decamer is mainly determined by its sequence and not by crystal environment

Udo Heinemann and Claudia Alings

Abteilung Saenger, Institut für Kristallographie, Freie Universität Berlin, D-1000 Berlin, FRG

Communicated by W.Saenger

By comparing the conformations adopted by a double-stranded decameric B-DNA fragment in different crystal environments, we address the question of the degree of deformability of DNA helices. The three-dimensional structure of the self-complementary DNA decamer CCAGGCmeCTGG has been determined from crystals of space group P6 at 2.25 Å resolution with an R value of 17.2% for 2407 1σ structure amplitudes. The oligonucleotide forms a B-type double helix with a characteristic sequence-dependent conformation closely resembling that of the corresponding unmethylated decamer, the structure of which is known from a high-resolution analysis of crystals of space group C2. Evidently, both the effects of single-site methylation and altered crystal environment on the DNA conformation are small. Therefore, double-helical DNA may possess sequence-determined conformational features that are less deformable than previously thought.

Key words: Dcm methyltransferase/DNA conformation/DNA methylation/double helix/X-ray crystallography

Introduction

DNA helix conformation is determined by nucleotide sequence and environment. These factors direct the global structure towards one of the known helical families (A, B or Z) and determine the local conformational features that are thought to be recognized by sequence-specific DNA-binding proteins (Dickerson, 1990; Kennard and Hunter, 1989a,b; Shakked and Rabinovich, 1986; Rich et al., 1984; Saenger, 1984). It is not known with certainty to what extent the conformation of a DNA segment is influenced by either its sequence or its environment, i.e. the ionic medium and interacting small and large molecules. The three-dimensional structure of DNA molecules bound to proteins as seen by X-ray crystallography (McClarin et al., 1986; Aggarwal et al., 1988; Jordan and Pabo, 1988; Otwinowski et al., 1988; Suck et al., 1988; Wolberger et al., 1988) may thus be regarded as a consequence of the given sequence or an effect of the protein on an intrinsically flexible DNA helix.

Crystals of short DNA fragments provide a means to study the effects of sequence and environment on double helix structure by permitting high-resolution structural analysis of molecules in an environment exactly defined by lattice contacts with their neighbors. If a DNA fragment can be crystallized in different lattices, comparison of the resultant spatial DNA structures will yield an estimate of helix deformability. This experiment has been performed with two different octameric fragments of A-form DNA (Shakked et al., 1989; Jain et al., 1989). Significant differences between the helical conformations in the different crystal lattices have been reported in either case, indicating a rather 'soft' DNA structure.

We have recently determined the crystal structure of the self-complementary DNA decamer CCAGGCCTGG at high resolution (Heinemann and Alings, 1989). This oligonucleotide packs in crystals of space group C2 in end-to-end fashion to form quasi-continuous helices, and the two strands of the decamer helix are related by crystallographic symmetry, i.e. they have exactly the same conformation. The decamer forms B-DNA helices with characteristic conformational variability which is for the most part shared by the sequence-related decamers CCAAGATTGG and CCAACGTTGG, which crystallize in the same space group (Privé et al., 1987; Privé,G.G., Yanagi,K. and Dickerson,R.E., submitted).

Here we report the structural analysis of the DNA fragment CCAGGCmeCTGG from crystals of space group P6, a packing not previously observed with DNA molecules (Dickerson, 1990). This oligonucleotide consists of two hemimethylated recognition sites CCAGG/CmeCTGG of the Escherichia coli Dcm methyltransferase, an enzyme known to methylate position 5 of the inner two cytosines in its pentamer target sequence (Coulondre et al., 1978; Duncan and Miller, 1980). Since it differs from the previously characterized decamer only by the presence of one additional methyl group per strand and crystallizes in a different packing arrangement, it offers for the first time the opportunity to compare the conformation of DNA helices in the biologically relevant B-form in different environments. We find that, despite very different contacts experienced by the molecules in the different crystal lattices, the conformations of the two decamers are remarkably similar. It is concluded that the deformability of DNA molecules may be less pronounced than previously assumed.

Oxford University Press

Results

Refinement results and overall helix structure

The stereochemically restrained structure refinement against 2.25 Å resolution X-ray data converged with an R value of 17.2% (Table I). The root mean square (rms) positional and thermal shifts in the last refinement cycle were 0.007 Å and 0.14 Å², respectively, and the difference Fourier maps with an rms density of 0.063 e/Å³ showed only two remaining peaks over 0.25 e/Å³ which could not be assigned as solvent sites. The mean error in atomic coordinates according to Luzzati (1952) does not exceed 0.25 Å. The structure amplitudes and coordinates of 406 DNA atoms and 72 water oxygens have been deposited with the Brookhaven Protein Data Bank from which copies will be available.

It is worth noting that a grossly wrong molecular model could be refined to an R value of < 24% (see Materials and methods). This demonstrates that at medium resolution, major errors in the model may persist even with a seemingly reasonable R factor and that close attention must be paid to features of the electron density maps possibly indicating errors.

Fig. 1. Stereo view of two CCAGGCmeCTGG double helices stacked on top of each other as they pack in the crystal. Atoms are colored by type (grey for carbon, blue for nitrogen, red for oxygen and yellow for phosphorus), and one strand is shown with its Van der Waals surface.

Fig. 2. Base pairs meC(7)-G(14) of the final model shown with electron density before addition of the cytosine methyl group (left) and after completion of the refinement (right). 2Fo-Fc maps (blue) are contoured at 0.9 e/Å³ and Fo-Fc difference maps (red) at 0.18 e/Å³.

In the P6 crystals, CCAGGCmeCTGG forms a B-DNA helix with exact 10-fold repeat which permits end-to-end packing to form quasi-continuous helices (Figure 1). This arrangement is possible in spite of a slight curvature of individual decamer helices by 5.8° revealed by a conformational analysis of the structure with CURVES (Lavery and Sklenar, 1988, 1989). The methyl groups at C(7) and C(17) are apparent from electron density maps (Figure 2). They are positioned symmetrically across the DNA major groove.

Segmental flexibility of the decamer helix

In CCAGGCmeCTGG, the average flexibility, as judged from the crystallographic temperature factors, is highest for the phosphate groups (24.3 Å²) and lowest for the bases (13.9 Å²). This is commonly observed with nucleic acid helices (Shakked and Rabinovich, 1986). More importantly, in the methylated decamer there is a clear trend towards high B values of the terminal base pairs and low B values in the center (Figure 3). This flexibility profile should be kept in mind when the methylated and unmethylated forms of the decamer are compared: the conformation of the former is significantly better defined around the central base pairs than at the ends. The 72 water molecules bound to CCAGGCmeCTGG have temperature factors ranging from 7 to 66 Å².
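As a rough guide to what these temperature factors mean in terms of atomic motion, an isotropic B value can be converted to an rms displacement along one direction with the standard relation B = 8π²<u²>. This relation is textbook crystallography and is not quoted in the paper; the Python sketch below applies it to the two group averages given above, and the function name is hypothetical. B values from a restrained refinement at 2.25 Å also absorb static disorder and model error, so the resulting numbers are only indicative.

```python
import math


def rms_displacement(b_factor):
    """Rms displacement (Å) along one direction for an isotropic B value (Å^2).

    Uses B = 8 * pi^2 * <u^2>; the function name is illustrative only.
    """
    mean_square_u = b_factor / (8.0 * math.pi ** 2)
    return math.sqrt(mean_square_u)


# Average B values quoted in the text for the methylated decamer.
average_b = {"phosphate groups": 24.3, "bases": 13.9}

for group, b_value in average_b.items():
    u = rms_displacement(b_value)
    print(f"{group}: B = {b_value:.1f} Å^2, rms displacement about {u:.2f} Å")
```

Under this assumption the phosphates move by roughly half an ångström and the bases by somewhat less, which is of the same order as the differences between strands discussed later.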

Crystal environment in space groups P6 and C2 is different

The crystals in space group P6 of CCAGGCmeCTGG and in space group C2 of CCAGGCCTGG (Heinemann and Alings, 1989) have in common the end-to-end packing of decamer helices to yield quasi-continuous B-form helices. Clearly, this packing is compatible only with one type of molecular structure: relatively straight helices with an exact 10 bp repeat. Within these restrictions, helix packing in space group P6 is very different from that in C2.

Table I. Refinement statistics

Resolution range (Å)                                        8.0-2.25
R factor (2407 1σ Fo) (%)                                   17.2
         (2200 3σ Fo) (%)                                   16.2
Fo/Fc correlation coefficient                               0.927
ΣFo/ΣFc                                                     1.008
<|Fo-Fc|>                                                   31.51
Sugar-base bond distances (Å)                               0.013/0.025
Sugar-base bond angle distances (Å)                         0.032/0.040
Phosphate bond distances (Å)                                0.049/0.040
Phosphate bond angle distances and H-bond distances (Å)     0.053/0.060
Planar groups (Å)                                           0.019/0.030
Chiral volumes (Å³)                                         0.066/0.100
Single torsion non-bonded contacts (Å)                      0.127/0.250
Multiple torsion non-bonded contacts (Å)                    0.190/0.250
Isotropic thermal factors (Å²)
  Sugar-base bonds                                          2.39/3.00
  Sugar-base bond angles                                    3.56/5.00
  Phosphate bonds                                           3.79/5.00
  Phosphate bond angles, H-bonds                            3.37/5.00
Structure amplitudes
  A term                                                    /22.0
  B term                                                    /-150.0

Fo and Fc are observed and calculated structure amplitudes, respectively. The R factor is $\sum |F_o - F_c| / \sum F_o$ and the correlation coefficient is $\sum [(F_o - \langle F_o \rangle)(F_c - \langle F_c \rangle)] / [\sum (F_o - \langle F_o \rangle)^2 \sum (F_c - \langle F_c \rangle)^2]^{1/2}$. For stereochemical parameters, the left number gives the rms deviation from ideal values in the final model and the right number is the target variance used in refinement. The weight applied to the corresponding restraint is the inverse square of the target variance. The weight applied to a structure amplitude is the maximum of either the inverse square of $A + B(\sin^2\theta/\lambda^2 - 0.1667)$ or the experimental σ value.
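The two agreement statistics defined in the Table I footnote can be computed directly from paired lists of amplitudes. The Python sketch below is illustrative only: the function names and the five amplitude values are assumptions made for the example and are not taken from the refinement programs or from the deposited data.

```python
from math import sqrt


def r_factor(f_obs, f_calc):
    """Crystallographic R factor: sum|Fo - Fc| / sum(Fo)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def correlation_coefficient(f_obs, f_calc):
    """Linear correlation coefficient between observed and calculated amplitudes."""
    n = len(f_obs)
    mean_o = sum(f_obs) / n
    mean_c = sum(f_calc) / n
    numerator = sum((o - mean_o) * (c - mean_c) for o, c in zip(f_obs, f_calc))
    denominator = sqrt(
        sum((o - mean_o) ** 2 for o in f_obs)
        * sum((c - mean_c) ** 2 for c in f_calc)
    )
    return numerator / denominator


# Made-up amplitudes on an arbitrary scale, for demonstration only.
f_obs = [120.0, 85.0, 60.0, 40.0, 25.0]
f_calc = [110.0, 90.0, 55.0, 45.0, 20.0]

print(f"R = {100.0 * r_factor(f_obs, f_calc):.1f}%")
print(f"correlation = {correlation_coefficient(f_obs, f_calc):.3f}")
```

Both statistics compare amplitudes only, which is one reason why, as noted for the frame-shifted model, a moderately low R value at medium resolution does not by itself prove that a model is correct.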

In monoclinic C2, the strands of the decamer are related by a crystallographic dyad axis. Therefore, both strands have identical conformation as one would expect for a self-complementary double helix. In addition, both strands make identical lattice contacts with neighboring molecules. In hexagonal P6, the crystallographic asymmetric unit comprises the full decamer double helix, which hence displays similar, but non-identical conformation of both strands due to asymmetric lattice contacts. The small conformational difference of both strands (see below) has to be attributed to crystal lattice effects (Dickerson et al., 1987).

Whereas in the monoclinic cell only one type of interhelical contact exists through the lattice C-centering, in the hexagonal unit cell three different symmetry elements may promote contacts between helices. Intimate contacts between decamer helices are brought about by the 6-fold rotation axis, a looser contact is created by the redundant dyad axis, while the separation of molecules related by the redundant 3-fold axis is too large to permit direct contact (Figure 4). Within a layer of decamer helices, intermolecular contacts exist primarily between the phosphate groups in both types of packing. In space group C2, these contacts are the same for both strands and are approximately evenly distributed along one strand. In contrast, the methylated decamer in space group P6 experiences different contacts for both strands and an uneven distribution of contacts which leaves several phosphate groups without any intermolecular partner (Table II). Therefore, the crystal lattice environment of CCAGGCmeCTGG is very different from that experienced by its unmethylated counterpart.

Fig. 3. Average temperature factors of phosphates (squares), sugars (circles) and bases (diamonds) of the decamer. The open symbols indicate the plus strand (residues 1-10) and the filled symbols the minus strand (11-20).

Decamer conformation is insensitive to methylation and crystal environment

Crystal environment is responsible for the conformational deviation between the plus and the minus strands of CCAGGCmeCTGG and, probably along with the effect of C(7)/C(17) methylation, for structural differences between the methylated and unmethylated decamers. Both strands of the methylated decamer can be superimposed with an rms distance between equivalent atoms of 0.56 Å (Table III). CCAGGCCTGG deviates a little more, and the decamer fiber model deviates considerably. If only the central 8 bp are used for comparison, the differences get smaller, consistent with the high B values and concomitant large experimental uncertainties regarding the terminal base pairs of the methylated duplex. Considering the mean error in atomic coordinates of ~0.25 Å, these differences are small but significant. We shall go on to show that, despite these small structural deviations produced by the crystal environment, the characteristic global and local structural features of CCAGGC(me)CTGG remain unchanged.

The conformational differences between the two forms of the decamer discernible after a least-squares fit of the duplexes are restricted to the sugar-phosphate backbones and the terminal base pairs (Figure 5). In the central part, base pair geometries and stacking are similar. The minor groove of CCAGGCmeCTGG is wide at the helix ends and narrow in its center (Figure 6) just as in the unmethylated control. Qualitatively the same behavior has been observed with the sequence-related decamer CCAACGTTGG (Privé,G.G., Yanagi,K. and Dickerson,R.E., submitted) where it has been shown both experimentally and theoretically to be correlated with characteristic minor groove hydration patterns (Chuprina et al., 1990).

Fig. 4. Crystal environment of CCAGGCmeCTGG and its unmethylated counterpart. In (a) and (b) the view is down the c-axes onto the a/b-plane of the P6 unit cell of the methylated decamer and onto the projection perpendicular to the c-axis of the C2 unit cell (a = 32.15 Å, b = 25.49 Å, c = 34.82 Å, β = 116.7°) of the unmethylated decamer, respectively. The helices are represented by circles of radius 9.75 Å corresponding to the average distance of phosphorus atoms from the helix axis. In P6 the helices form layers in the a/b plane, while in C2, helix B is longitudinally displaced from the reference helix A as a consequence of the monoclinic angle. Crystal packing in C2 is more efficient, requiring an area of only 732 Å² per duplex in the projection shown, compared with 835 Å² for the P6 unit cell. (c), (d) Stereo side views of the close contacts between pairs of stacked helices A, B and F' and of the intermediate contact between A and D in the P6 unit cell. Note the close contacts between duplexes where strands cross over each other and the severe clash that would result from adding the missing phosphate groups in between decamer helices.

As with the global helix structure discussed above, the sequence-dependent local structure of the decamer remains largely unaffected by changing crystal environment. Figure 7 compares the most important helical parameters describing base pair geometry and stacking of the two decamer forms.

The close fit between the two duplexes is immediately apparent for all structural parameters except for the base pair tilt (τ) where the small range of values adopted (from -2° to 2°) simply reflects the limits of accuracy for the structure determination of CCAGGCmeCTGG. The parameters describing the stacking in the crystal between the terminal base pairs of unconnected decamer duplexes (those on the dashed vertical line) are usually within the range found inside the helices, demonstrating that in both lattice types, effectively endless helices are formed. The methylated duplex displays an extreme value of base pair cup at the junction which might indicate a slightly less efficient helix-to-helix stacking.

The large helix twist (44° and 48°) and slide (1.8 Å and 1.9 Å) values at the two CA/TG base pair steps of CCAGGCmeCTGG correspond well with the twist and slide parameters of 51° and 2.6 Å for the symmetric CA/TG steps of the unmethylated decamer. They describe a peculiar base pair stacking behavior whereby the usually observed geometrical overlap of base aromatic rings is abolished in favor of a juxtaposition of six-membered rings and carbonyl or amino functions (Figure 8). Although twist and slide values in the methylated decamer are somewhat less extreme compared with the unmethylated control, the stacking pattern of permanent dipoles (the carbonyl groups) over the polarizable base rings is the same in both decamers.

Table II. Crystal contacts of phosphate groups

Crystal  Molecule  Phosphate groups within 8 Å phosphorus-phosphorus separation from reference molecule A

P6 D 10
9 9
17
17 16
F' 14
2 16 16 (2)
2 3 15 15 13 14 (10)
B 4
12 6 6
(12) 4 (20) 12 13 5 5
A 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 20
C2 B 19 (10) (10) 18 3 15 15 12 (4) 9 (20) (20) 8 13 5 5 2 (14)
20 6 17 18 16 12 10 16 7 8 6 2
(3) (13)

Phosphorus atoms are identified by their residue numbers and the molecules in the crystal lattice are designated according to Figure 4. In space group C2, the four different neighbors B are not distinguished. Each column lists the contacts between a given phosphate group of the reference molecule A with its neighbors in space group P6 (CCAGGCmeCTGG, top) and in space group C2 (CCAGGCCTGG, bottom). Multiple contacts of a phosphate group with a symmetry-related molecule are listed by increasing P-P distance. Parentheses indicate phosphate groups that are brought into contact through a unit translation along the crystal c axis.

In addition to base pair stacking geometries, the sugar-phosphate backbones of the decamers are very similar. The characteristic distribution of BI and BII backbone types in CCAGGCCTGG is nearly preserved in its methylated analog (Figure 9). The effects of these backbone types on helix structure have been discussed at length (Cruse et al., 1986; Privé et al., 1987; Heinemann and Alings, 1989; Heinemann et al., 1990a) and need not be reiterated. In previously analyzed B-DNA structures, the BII backbone type occurs rather infrequently and is avoided at successive phosphate positions as well as across a dinucleotide step. Where two BII phosphates link the base pairs of a step, an unusual stacking pattern like that described above for the CA/TG step may occur. In CCAGGCmeCTGG, there are as many BII as BI phosphate positions and consecutive BII phosphates occur at three positions. All of the additional BII phosphates that are not present in the unmethylated decamer are at the ends of the helix where temperature factors are high and the accuracy of the analysis is reduced. Therefore, it should suffice to state that the backbone conformation of CCAGGCmeCTGG is strongly reminiscent of that of the control.

Fig. 5. Stereo view into the minor groove of CCAGGCCTGG (thin lines) superimposed onto CCAGGCmeCTGG (heavy lines).

Table III. Conformation of different strands

                Plus    Minus   CCAGGCCTGG   Fiber
Plus strand     -       0.56    0.70         1.11
Minus strand    0.40    -       0.71         1.11
CCAGGCCTGG      0.58    0.71    -            1.17
Fiber model     0.95    1.01    1.09         -

Rms distance (Å) between equivalent atoms in the plus and minus strands of CCAGGCmeCTGG, CCAGGCCTGG and its fiber model (Chandrasekaran and Arnott, 1989) after least-squares superpositions of single strands. Above the diagonal are the values for the 202 atoms of the whole strand, and below the diagonal are the values for the 161 atoms of the eight central base pairs.
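The rms distances in Table III are obtained after least-squares superposition of single strands. The paper does not say which fitting procedure was used; the sketch below uses the Kabsch algorithm (an SVD-based rigid-body fit) with NumPy as one common way to do this. The function name and the random placeholder coordinates are assumptions for illustration and are not the deposited atomic coordinates.

```python
import numpy as np


def superposed_rmsd(mobile, target):
    """Rms distance between paired atoms after optimal rigid-body superposition.

    Both inputs are (N, 3) coordinate arrays with atoms in the same order.
    Uses the Kabsch algorithm; reflections are excluded.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    if mobile.shape != target.shape or mobile.ndim != 2 or mobile.shape[1] != 3:
        raise ValueError("expected two (N, 3) coordinate arrays of equal shape")

    # Remove the translational component by centering both sets.
    p = mobile - mobile.mean(axis=0)
    q = target - target.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    u, _, vt = np.linalg.svd(p.T @ q)

    # Flip the last axis if needed so that the result is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    fitted = p @ rotation.T
    return float(np.sqrt(np.mean(np.sum((fitted - q) ** 2, axis=1))))


# Placeholder coordinates: 202 random "atoms" standing in for one strand.
rng = np.random.default_rng(0)
strand_a = rng.normal(scale=10.0, size=(202, 3))

# A rotated and translated copy should superimpose almost exactly.
angle = np.radians(36.0)
rot_z = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
strand_b = strand_a @ rot_z.T + np.array([5.0, -3.0, 12.0])

# Adding coordinate noise mimics genuine conformational differences.
strand_c = strand_b + rng.normal(scale=0.3, size=strand_b.shape)

print(superposed_rmsd(strand_a, strand_b))
print(superposed_rmsd(strand_a, strand_c))
```

Restricting both arrays to the atoms of the eight central base pairs before calling the function would correspond to the values below the diagonal of Table III.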

Discussion

We have shown that the conformation of the double-helical B-DNA decamer CCAGGCmeCTGG is very similar to that of CCAGGCCTGG despite significant differences in crystal environment. The remaining structural differences have to be attributed to experimental inaccuracies, especially in the structure determination of the methylated decamer which yields lower resolution, to the added methyl group and to the changed crystal packing. It is difficult to discriminate between the relative influences of methylation and changed crystal environment; this may be unnecessary since their cumulative effect is so small. An EcoRI recognition site present in a B-DNA dodecamer was analyzed under different methylation states in isomorphous crystals (Wing et al., 1980; Frederick et al., 1988) and was also found to adopt a nearly unchanged conformation. This dodecamer is also the only DNA fragment studied in free form and specifically bound to a protein, the E. coli EcoRI restriction endonuclease (McClarin et al., 1986). Unfortunately, the structural analysis of the cocrystals has not progressed far enough to permit a detailed comparison of the DNA conformations in both states.

The conservation of DNA conformation in different environments does not imply that every sub-fragment of a DNA helix adopts a rigid, sequence-determined geometry. The unique stacking behavior of the CA/TG dinucleotide step of the decamers, for example, is not observed in the crystal structures of CGCAAATTTGCG, CGCAAAAAAGCG, and CGCAAAAATGCG (Coll et al., 1987; Nelson et al., 1987; DiGabriele et al., 1989). Clearly, the sequence context of such an element must not be neglected. Recently, it has been demonstrated that DNA conformation is not determined in the form of simple dinucleotide code words, but that longer segments need be considered (Heinemann et al., 1990b).

There is an apparent contradiction between the results obtained with octameric A-DNA fragments where a changing crystal packing induced significant structural changes (Jain et al., 1989; Shakked et al., 1989) and our observation of a B-DNA structure mostly unaffected by crystal environment. A-DNA octamers do not complete a full turn of the helix and may therefore be especially sensitive to crystal packing effects. It can be shown that their global structure depends directly on crystal lattice parameters (Heinemann, 1990). In contrast, B-DNA decamers stacked to endless helices may behave like polymeric DNA and thus may be better models for DNA in biological systems.

The crystal structure of CCAGGCmeCTGG compared with the unmethylated molecule strongly suggests that many important conformational features of B-DNA are determined by the nucleotide sequence and are insensitive to environment. Sequence-specific DNA-binding proteins such as the Dcm methyltransferase may thus utilize these structures in site recognition and binding. This does not preclude a further modulation of helix structure (e.g. bending) upon protein binding. However, DNA is not a 'soft' molecule whose structure is primarily determined by the environment.

Fig. 6. Minor groove width in two stacked helices of CCAGGCmeCTGG (top) and CCAGGCCTGG (bottom) given as P-P separation across the groove minus 5.8 Å to account for the phosphate group Van der Waals radius. Phosphorus positions are numbered according to a 20 bp continuous helix.

Fig. 7. Variation of helical parameters with DNA sequence in CCAGGCmeCTGG (solid lines) and CCAGGCCTGG (dotted lines). The asterisk marks the site of methylation. The parameters are as defined by Fratini et al. (1982) and Dickerson (1985) and were calculated with Richard Dickerson's NEWHEL90 program; the sign convention is that adopted at the 1988 Cambridge meeting (Dickerson et al., 1989). Briefly, twist, roll (ρ) and tilt (τ) are rotations about the helix axis, the long and the short axes of a base pair required to superimpose two stacked base pairs. Rise (Dz) and slide (Dy) measure displacements along the helix axis and the long axes of two stacked base pairs, and cup gives the difference in buckle (κ) between two stacked base pairs. The inclination against the helix axis (η), as well as the base pair propeller and buckle (κ), rotations between the two bases of a pair about its long and short axes, respectively, are properties of individual base pairs. Rotations are given in degrees and translations in Å. Parameters for the stacking in the crystals between unconnected decamer duplexes lie on the dashed vertical line. Numerical parameter values on which this figure is based are available from the authors upon request.

Fig. 8. Stereo views of base pair stacking at the C(2)-A(3)/T(18)-G(19) base pair steps in CCAGGCmeCTGG (top) and CCAGGCCTGG (bottom). Carbon, nitrogen, oxygen and phosphorus atoms are drawn with increasing radius. The upper base pairs are drawn with solid bonds and hydrogen bonds are indicated by thin black lines.

Fig. 9. Distribution of BI and BII backbone types in the two decamers. Backbone torsion angles ε and ζ, trans (~180°) and -gauche (about -60°), respectively, define a BI backbone type; ε and ζ are -gauche and trans in a BII backbone. In a BI backbone, the vector through the phosphate oxygens is roughly perpendicular to the helix axis. In a BII backbone, this vector is roughly parallel to the axis.
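The BI/BII definition in the Fig. 9 caption translates into a simple test on the two backbone torsion angles ε and ζ. The Python sketch below is a minimal illustration: the ideal values (180° and -60°) come from the caption, whereas the 60° tolerance, the function names and the example torsion angles are assumptions chosen for the example and do not come from the paper or from the refined structure.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


# Ideal (epsilon, zeta) pairs following the Fig. 9 caption:
# BI  = epsilon trans, zeta -gauche; BII = epsilon -gauche, zeta trans.
CANONICAL = {"BI": (180.0, -60.0), "BII": (-60.0, 180.0)}


def classify_backbone(epsilon, zeta, tolerance=60.0):
    """Return 'BI', 'BII' or 'other' for one phosphate position.

    The tolerance (degrees) is an arbitrary illustrative choice.
    """
    for name, (eps_ideal, zeta_ideal) in CANONICAL.items():
        if (angular_difference(epsilon, eps_ideal) <= tolerance
                and angular_difference(zeta, zeta_ideal) <= tolerance):
            return name
    return "other"


# Illustrative torsion angles in degrees (not taken from the structure).
examples = [(-170.0, -90.0), (-100.0, 170.0), (60.0, 60.0)]
for eps, zeta in examples:
    print(f"epsilon = {eps:7.1f}, zeta = {zeta:7.1f} -> {classify_backbone(eps, zeta)}")
```

Applied to every phosphate of both strands, a classifier of this kind would yield the per-position BI/BII pattern that Figure 9 compares between the two decamers.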

Materials and methods

DNA synthesis and crystallization

Two micromoles of CCAGGCmeCTGG were synthesized on an Applied Biosystems automatic DNA synthesizer from nucleoside phosphoramidites. Before detritylation the decamer was purified by reverse-phase and afterwards by anion-exchange FPLC. In 20 mM sodium cacodylate (pH 7), the decamer double helix melts with a Tm of 52°C. Under identical conditions, the same oligonucleotide lacking the methyl group in position 5 of C(7) dissociates with a Tm of 59°C.

Needles with hexagonal cross-section, most of which had a hollow center, grew within 3 weeks at 4°C when a 2 mM DNA solution in 20 mM sodium cacodylate pH 7.5 and 50 mM magnesium chloride was equilibrated in a microdialysis tube (Dattagupta et al., 1975) against the same buffer supplemented with 40% (by volume) of 2-methyl-2,4-pentanediol (MPD). Precession photographs and diffractometry showed these crystals to have unit cell dimensions of a = 53.77(1) Å and c = 34.35(2) Å and to belong to space group P6 (no. 168).
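The packing density implied by these cell dimensions can be estimated with a short calculation. The sketch below is not part of the paper's analysis: it uses the general unit cell volume formula, the C2 cell quoted in the Fig. 4 caption, and the assumption that the cell contents follow from the general-position multiplicities (six duplexes in P6 with one duplex per asymmetric unit, and four single strands, i.e. two duplexes, in C2). Function and variable names are hypothetical.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume (Å^3) from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


BASE_PAIRS_PER_DUPLEX = 10

# Hexagonal P6 cell of the methylated decamer: a = b, gamma = 120 degrees.
volume_p6 = cell_volume(53.77, 53.77, 34.35, gamma=120.0)
duplexes_p6 = 6  # assumption: one duplex per asymmetric unit, multiplicity 6

# Monoclinic C2 cell of the unmethylated decamer (from the Fig. 4 caption).
volume_c2 = cell_volume(32.15, 25.49, 34.82, beta=116.7)
duplexes_c2 = 2  # assumption: one strand per asymmetric unit, multiplicity 4

volume_per_bp_p6 = volume_p6 / (duplexes_p6 * BASE_PAIRS_PER_DUPLEX)
volume_per_bp_c2 = volume_c2 / (duplexes_c2 * BASE_PAIRS_PER_DUPLEX)

print(f"P6: {volume_p6:.0f} Å^3 per cell, {volume_per_bp_p6:.0f} Å^3 per base pair")
print(f"C2: {volume_c2:.0f} Å^3 per cell, {volume_per_bp_c2:.0f} Å^3 per base pair")
```

Under these assumptions the hexagonal crystals come out at roughly 1430 Å³ per base pair against roughly 1270 Å³ in C2, in line with the statement in the Fig. 4 caption that packing in C2 is the more efficient of the two.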

X-ray data acquisition

Diffraction data were collected at room temperature in ω-scan mode on a Turbo-CAD4 diffractometer using Ni-filtered CuKα radiation from a Nonius F571 rotating anode generator operated at 45 kV and 100 mA. A variable scan speed depending on count rate was used and decay was monitored by repeated measurement of four controls. The data from two crystals of approximate dimensions 0.6 x 0.2 x 0.2 mm were subjected to standard data reduction procedures including a semi-empirical absorption correction (North et al., 1968). To the resolution limit of 2.25 Å, 2075 measurements from the first crystal and 2766 measurements from the second crystal were considered significant at the 1σ level. The two data sets were merged and scaled with the program ANISOSC from the CCP4 program suite (Machin et al., 1983). The final data set used in structure determination and refinement comprised 2407 1σ observations in the resolution range from 8 to 2.25 Å, representing 85% of the theoretically observable reflexions, or 240 structure amplitudes per base pair.

Structure solution and refinement

The length of the unit cell c-axis, corresponding to the pitch of a B-DNA helix, and the occurrence of prominent, strong reflexions near and on the c-axis of the unit cell at 3.4 Å spacing indicated the packing of B-form decamer helices around the 6-fold axis. The peculiar diffraction pattern was assumed to arise from the transform of the base pairs stacking at 3.4 Å distance perpendicular to the B-DNA helix axis. A-form DNA does not yield diffraction patterns of this kind due to the inclination of base pairs against the helix axis. Therefore, only a standard B-DNA model with sequence CCAGGCCTGG constructed from X-ray fiber diffraction coordinates (Chandrasekaran and Arnott, 1989) was considered as input to the multi-dimensional search program ULTIMA (Rabinovich and Shakked, 1984). Here, packing criteria and R-factor calculations using low-resolution data and group scattering factors were employed to position the search fragment in the unit cell. The present problem lent itself ideally to this approach, since in space group P6 only two translations had to be considered and the range of rotations was restricted to keep the helix approximately parallel to the c-axis. The 20 best solutions from ULTIMA were further subjected to rigid-body refinement, first at low resolution using group scatterers and then at higher resolution using atomic scattering factors. Refinement of the best solution against 3 Å data converged at an R value of 46%; the next best packing models yielded 49% or higher R values.
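The R values quoted here and below are crystallographic residuals between observed and calculated structure amplitudes. A minimal sketch of the conventional definition, R = Σ||Fo| − k|Fc|| / Σ|Fo|, is given for orientation. The single overall scale factor k = Σ|Fo|/Σ|Fc| is our assumption (the programs named in the text may scale differently), and the amplitudes are invented toy numbers, not data from this study.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R value for paired structure amplitudes.

    Assumes one overall scale factor k = sum|Fo| / sum|Fc| (an assumption of
    this sketch, not necessarily the scaling used by ULTIMA or NUCLSQ).
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    sum_obs = sum(abs(f) for f in f_obs)
    sum_calc = sum(abs(f) for f in f_calc)
    scale = sum_obs / sum_calc
    residual = sum(abs(abs(fo) - scale * abs(fc))
                   for fo, fc in zip(f_obs, f_calc))
    return residual / sum_obs

# Hypothetical toy amplitudes, for illustration only
toy_obs = [100.0, 50.0, 25.0, 25.0]
toy_calc = [90.0, 60.0, 30.0, 20.0]
print(f"R = {100 * r_factor(toy_obs, toy_calc):.1f}%")
```

A perfect model gives R = 0, which puts the 46% obtained for the best packing solution and the 49% of the next best into perspective as starting points for refinement.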

Structure refinement was continued with CORELS (Sussman et al., 1977) where the molecule was divided into constrained groups in the form of nucleoside pairs and phosphates and subsequently with NUCLSQ (Westhof et al., 1985) and XPLOR (Brünger et al., 1987). The positional refinement converged at an R value of 30.6%. With the inclusion of 40 solvent sites, identified from difference electron density maps and checked for reasonable geometry of binding to the DNA with FRODO (Jones, 1978) and after some temperature factor refinement, the agreement factor could be lowered to 23.3%. At this stage, several lines of evidence indicated a serious problem with the model being refined: (i) the difference density map was noisy, and the density maxima occurred as elongated blobs between base pairs and not in the solvent region, (ii) difference density minima occurred at phosphate positions 10 and 12, (iii) positive difference density indicating the methyl groups of C(7) and C(17) was not present, (iv) unreasonably close lattice contacts existed between phosphate groups at positions 10 and 12, (v) an unusual backbone conformation was observed for several nucleotides, and (vi) the helix conformation did not reflect the symmetry of the self-complementary DNA, but in contrast appeared frame-shifted by one base pair with respect to the sequence. These observations were consistent with an error in the model whereby a wrong decamer fragment of the continuous helix stacked parallel to the c-axis had been chosen. This incorporated one phosphate group on either strand that should have been absent, omitted one phosphate per strand, had a partially wrong sequence and therefore yielded an asymmetric DNA conformation.
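The consequence of choosing a decamer window displaced by one base pair can be illustrated at the level of sequence alone. The sketch below idealizes the stacked helices as a continuous repeat of the unmethylated search sequence CCAGGCCTGG, which is our simplification, and ignores the methyl group. All function names are hypothetical. It checks that the correct window is self-complementary, that a window shifted by one base is not, and counts how many positions of the model then carry the wrong base.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def is_self_complementary(seq):
    return seq == reverse_complement(seq)

def shifted_window(seq, shift):
    """Window of len(seq) bases read from an idealized continuous stack of
    repeats of seq, displaced by `shift` bases (positive or negative)."""
    repeated = seq * 3
    start = len(seq) + shift
    return repeated[start:start + len(seq)]

def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

MODEL = "CCAGGCCTGG"  # search model sequence, from the text

for shift in (0, 1, -1):
    window = shifted_window(MODEL, shift)
    print(shift, window, is_self_complementary(window),
          mismatches(MODEL, window))
```

In this idealization a shift in either direction places the wrong base at more than half of the ten positions and destroys the dyad symmetry of the sequence, in line with the partially wrong sequence and asymmetric conformation described above.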

Refinement, first with CORELS and then with NUCLSQ, was re-started from a fiber model of CCAGGCCTGG superimposed on the common phosphorus positions of a nonamer portion of the previously refined model plus one nucleotide pair of its stacked neighbor. With this starting structure, refinement proceeded smoothly. At R = 28.3%, refinement of thermal parameters was allowed, and from R = 25.8% onwards, solvent peaks, treated as water oxygens with unit occupancy, were obtained from the top 20 difference density peaks and included in the model. At R = 20.5% the missing methyl groups of C(7) and C(17) were readily apparent from density maps and added to the atom list. Refinement was continued by alternating NUCLSQ runs and inspection of maps with FRODO until none of the 20 highest difference density peaks could be assigned as solvent site and the positional and thermal shifts had become insignificant.
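The selection of solvent sites from the highest difference density peaks was done by map inspection. Purely as an illustration of the kind of geometric criterion involved, the sketch below keeps those of the top 20 peaks whose nearest DNA atom lies within a plausible hydrogen-bonding distance. The distance limits (2.4 to 3.5 Å), the data layout and all names are our assumptions and are not taken from the procedure used in this work.

```python
import math

def accept_solvent_peaks(peaks, dna_atoms, d_min=2.4, d_max=3.5, top_n=20):
    """Return peaks among the top_n highest whose nearest DNA atom lies
    between d_min and d_max Å (hypothetical acceptance limits).

    peaks     -- list of dicts with keys "xyz" (Cartesian Å) and "height"
    dna_atoms -- list of (x, y, z) Cartesian coordinates in Å
    """
    ranked = sorted(peaks, key=lambda p: p["height"], reverse=True)[:top_n]
    accepted = []
    for peak in ranked:
        nearest = min(math.dist(peak["xyz"], atom) for atom in dna_atoms)
        if d_min <= nearest <= d_max:
            accepted.append(peak)
    return accepted

# Invented toy coordinates, for illustration only
toy_atoms = [(0.0, 0.0, 0.0)]
toy_peaks = [
    {"name": "A", "xyz": (2.8, 0.0, 0.0), "height": 5.0},   # plausible contact
    {"name": "B", "xyz": (1.0, 0.0, 0.0), "height": 6.0},   # too close
    {"name": "C", "xyz": (10.0, 0.0, 0.0), "height": 4.0},  # too far
]
print([p["name"] for p in accept_solvent_peaks(toy_peaks, toy_atoms)])
```

An automated filter of this kind can only pre-select candidates; the final assignment in the refinement described here rested on visual inspection of the maps.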

Acknowledgements

The help of Dr R.Bald and S.Schulze in DNA synthesis, of Drs J.Granzin and W.Hinrichs in X-ray data collection, of A.Zouni in the UV melting experiment, of M.Steifa in all technical matters over many years and of V.Steifa in preparation of figures is gratefully acknowledged. Professors W.Saenger and R.E.Dickerson are thanked for continuous support and helpful discussions and Dr U.Egner for critically reading the manuscript. Drs D.Rabinovich, J.L.Sussman and E.Westhof provided computer programs. This work was funded by the Deutsche Forschungsgemeinschaft through SFB 344/D3 and supported by the Fonds der Chemischen Industrie.

References

Aggarwal,A.K., Rodgers,D.W., Drottar,M., Ptashne,M. and Harrison,S.C. (1988) Science, 242, 899-907.

Brünger,A.T., Kuriyan,J. and Karplus,M. (1987) Science, 235, 458-460.

Chandrasekaran,R. and Arnott,S. (1989) In Saenger,W. (ed.) Landolt-Börnstein, New Series, Group VII. Springer, Berlin, Vol. 1b, pp. 31-170.

Chuprina,V.P., Heinemann,U., Nurislamov,A.A., Zielenkiewicz,P., Dickerson,R.E. and Saenger,W. (1990) Proc. Natl. Acad. Sci. USA, in press.

Coll,M., Frederick,C.A., Wang,A.H.-J. and Rich,A. (1987) Proc. Natl. Acad. Sci. USA, 84, 8385-8389.

Coulondre,C., Miller,J.H., Farabaugh,P.J. and Gilbert,W. (1978) Nature, 274, 775-780.

Cruse,W.B.T., Salisbury,S.A., Brown,T., Cosstick,R., Eckstein,F. and Kennard,O. (1986) J. Mol. Biol., 192, 891-905.

Dattagupta,J.K., Fujiwara,T., Grishin,E.V., Lindner,K., Manor,P.C., Pieniazek,N.J., Saenger,W. and Suck,D. (1975) J. Mol. Biol., 97, 267-271.

Dickerson,R.E. (1985) In Jurnak,F. and McPherson,A. (eds) Biomolecular Macromolecules and Assemblies. Wiley, New York, Vol. 2 (Appendix), pp. 471-494.

Dickerson,R.E. (1990) In Sarma,R.H. and Sarma,M.H. (eds) Structure and Methods, Vol. 3: DNA and RNA. Adenine Press, Schenectady, USA, pp. 1-38.

Dickerson,R.E., Bansal,M., Calladine,C.R., Diekmann,S., Hunter,W.N., Kennard,O., von Kitzing,E., Lavery,R., Nelson,H.C.M., Olson,W.K., Saenger,W., Shakked,Z., Sklenar,H., Soumpasis,D.M., Tung,C.-S., Wang,A.H.-J. and Zhurkin,V.B. (1989) EMBO J., 8, 1-4.

Dickerson,R.E., Goodsell,D.S., Kopka,M.L. and Pjura,P.E. (1987) J. Biomol. Struct. Dyn., 5, 557-579.

DiGabriele,A.D., Sanderson,M.R. and Steitz,T.A. (1989) Proc. Natl. Acad. Sci. USA, 86, 1816-1820.

Duncan,B.K. and Miller,J.H. (1980) Nature, 287, 560-561.

Fratini,A.V., Kopka,M.L., Drew,H.R. and Dickerson,R.E. (1982) J. Biol. Chem., 257, 14686-14707.

Frederick,C.A., Quigley,G.J., van der Marel,G.A., van Boom,J.H., Wang,A.H.-J. and Rich,A. (1988) J. Biol. Chem., 263, 17872-17879.

Heinemann,U. (1990) J. Biomol. Struct. Dyn., in press.

Heinemann,U. and Alings,C. (1989) J. Mol. Biol., 210, 369-381.

Heinemann,U., Alings,C. and Lauble,H. (1990a) In Sarma,R.H. and Sarma,M.H. (eds) Structure and Methods, Vol. 3: DNA and RNA. Adenine Press, Schenectady, USA, pp. 39-53.

Heinemann,U., Alings,C. and Lauble,H. (1990b) Nucleosides Nucleotides, 9, 349-354.

Hendrickson,W.A. (1985) Methods Enzymol., 115, 252-270.

Jain,S., Zon,G. and Sundaralingam,M. (1989) Biochemistry, 28, 2360-2364.

Jones,T.A. (1978) J. Appl. Crystallogr., 11, 268-272.

Jordan,S.R. and Pabo,C.O. (1988) Science, 242, 893-899.

Kennard,O. and Hunter,W.N. (1989a) In Saenger,W. (ed.) Landolt-Börnstein, New Series, Group VII. Springer, Berlin, Vol. 1a, pp. 255-360.

Kennard,O. and Hunter,W.N. (1989b) Quart. Rev. Biophys., 22, 327-379.

Lavery,R. and Sklenar,H. (1988) J. Biomol. Struct. Dyn., 6, 63-91.

Lavery,R. and Sklenar,H. (1989) J. Biomol. Struct. Dyn., 6, 655-667.

Luzzati,V. (1952) Acta Crystallogr., 5, 802-810.

Machin,P.A., Wonacott,A. and Moss,D. (1983) Daresbury Laboratory News, 10, 3-9.

McClarin,J., Frederick,C.A., Wang,B.C., Greene,P., Boyer,H., Grable,J. and Rosenberg,J.M. (1986) Science, 234, 1526-1541.

Nelson,H.C.M., Finch,J.T., Luisi,B.F. and Klug,A. (1987) Nature, 330, 221-226.

North,A.C.T., Phillips,D.C. and Mathews,F.S. (1968) Acta Crystallogr., A24, 351-359.

Otwinowski,Z., Schevitz,R.W., Zhang,R.-G., Lawson,C.L., Joachimiak,A., Marmorstein,R.Q., Luisi,B.F. and Sigler,P.B. (1988) Nature, 335, 321-329.

Prive,G.G., Heinemann,U., Chandrasegaran,S., Kan,L.-S., Kopka,M.L. and Dickerson,R.E. (1987) Science, 238, 498-504.

Rabinovich,D. and Shakked,Z. (1984) Acta Crystallogr., A40, 195-200.

Rich,A., Nordheim,A. and Wang,A.H.-J. (1984) Annu. Rev. Biochem., 53, 791-846.

Saenger,W. (1984) Principles of Nucleic Acid Structure. Springer, New York.

Shakked,Z. and Rabinovich,D. (1986) Prog. Biophys. Mol. Biol., 47, 159-195.

Shakked,Z., Guerstein-Guzikevich,G., Eisenstein,M., Frolow,F. and Rabinovich,D. (1989) Nature, 342, 456-460.

Suck,D., Lahm,A. and Oefner,C. (1988) Nature, 332, 464-468.

Sussman,J.L., Holbrook,S.R., Church,G.M. and Kim,S.-H. (1977) Acta Crystallogr., A33, 800-804.

Westhof,E., Dumas,P. and Moras,D. (1985) J. Mol. Biol., 184, 119-145.

Wing,R., Drew,H., Takano,T., Broka,C., Tanaka,S., Itakura,K. and Dickerson,R.E. (1980) Nature, 287, 755-758.

Wolberger,C., Dong,Y., Ptashne,M. and Harrison,S.C. (1988) Nature, 335, 789-795.

Received on September 4, 1990; revised October 22, 1990
