Structure 14
Supplemental Data
Molecular Architecture and Conformational Flexibility of Human RNA Polymerase II
Seth A. Kostek, Patricia Grob, Sacha De Carlo, J. Slaton Lipscomb, Florian Garczarek, and Eva Nogales
Figure S1. Purification and Activity of hRNAPII
(A) SDS-PAGE of immuno-purified hRNAPII from HeLa cell nuclei. (B) In vitro
transcriptional activity of our purified hRNAPII using the Kashlev method of transcription
initiation.
Figure S2. FSC and Angular Distribution
(A) Fourier shell correlation curve indicating a resolution of 22 Å at the end of the
FREALIGN refinement. (B) Final angular distribution plot of the particles showing an
isotropic distribution of orientations.
Figure S3. 3D Variance and Docked Crystal Structure, Insertions and Deletions
The 3D variance (yellow mesh) is superimposed onto the yeast crystal structure docked
into the EM density map (PDB entry 1Y1V). Deletions in the hRNAPII sequence are
represented in white with green outline. Residues not present in the yeast crystal
structure and insertions in hRNAPII are shown as red dashes. Left and right show front
and side views, respectively.
Figure S4. 3D Variance and Docking
Left and right show front (as in Fig. 4B) and back views, respectively, of the yRNAPII
crystal structure docked into the EM hRNAPII map. The latter highlights the position of
pore 1, the Rpb6 N-terminus and the CTD linker. The 3D variance map is shown as a
yellow density.
Figure S5. Rigid Body Docking of the hRNAPII Homology Model into the
Cryo-EM Density Map
The same views of the EM density map from human RNAPII are shown as in
Fig. 2B (mesh). The homology model of hRNAPII (based on its sequence alignment
with yRNAPII; see Supplemental Experimental Procedures) was docked
independently into the cryo-EM density using rigid-body docking. The resulting
position for the core complex is identical to that obtained with the yeast crystal
structure. The only noticeable difference is in the stalk, which is smaller and
forms a different angle with the core of hRNAPII than in yeast. Also visible are
the models of most of the flexible loops and some short termini.
Table S1
Location on the yRNAPII sequence of the insertions and deletions in the
hRNAPII amino-acid sequence (see Fig. 2B). The number of residues is
indicated (with h for human, y for yeast if the numbers are different) as well as
the location relative to the secondary structure of the different subunits, as
defined in Cramer et al. 2001 (core complex) and Meka et al. 2005 (Rpb4/7
complex). The regions closest to the main high variance areas determined by our
analysis are also mentioned, with their corresponding number as in Fig. 4. Yellow
text corresponds to regions highlighted in the text concerning the contact of the
stalk with the core enzyme.
Subunit | Deletions in hRNAPII (number of residues) | Insertions in hRNAPII (number of residues) | Not localized in yRNAPII (number of residues) | Regions of high variance (variance region #)
Rpb1
44-45 (2) Zipper
1187-1188 (1, 4 total,
but 3 of them are not
localized in yeast Xtal
structure) loop a40-b29
3-4 (4), N-ter
34-35 (2), Zipper
129-130 (2) loop a3-a4
155-156 (11) loop a4-b3
583-584 (8) loop b20-
b21
1286-1287 (5) loop b32-
b33
186-195 (8) loop b4-b5
1176-1187 (10) loop
a40-b29
1243-1254 (10 y + 6 h)
loop b31-a43
1456-1979 - part of linker
and CTD, C-ter
Clamp head (1)
Lid (2)
Rudder (2)
Switch 2 (3)
a38-a39 (4)
a43-b32 (close to
interaction with Rpb9
and Rpb2) (4)
b30-b31 (4)
a41-a42 (4)
Linker and CTD (6)
Rpb2
231 (1) loop b7-b8
642-647 (6) loop b21-
a16
668 (10 total, but not all
localized) loop a16-a17
268-269 (6) loop b9-b10
20 (19) N-ter
Protrusion 70-90 (19) loop a2-b1
335-345 (9) a8
437-446 (8) a11-a12
Fork loop 1 (2)
Protrusion (2)
lobe
external 1
716-733 (17) loop a19-
b24
1178 (1) b44-b45
1220-1224 (5) C-ter
134-164 (19 y, 5 h)
loop b2-b3
External 1 668-678 (9 y, 0 h)
715-722 (6 y, 1 h) loop
a19-b24
Wall flap loop 919-933 (13)
b7-b8 (1)
b10-b11 (1)
External 1
Wall flap loop (5 and 2)
Rpb3
196-199 (4) b10-b11
216 (1) b10-b11
268 (1 localized, 44 total,
but most were not
localized) Tail, C-ter
3 (1) N-ter
76-77 (1) a2-b5
123-124 (4) b7-b8
3 (1) N-ter
268 (44) C-ter Tail
b7-b8
Tail
Rpb4
4-7 (4) N-ter
42-77 (6) a1-a2 loop
118-136 (9) a1-a2 loop
Tip
And/or interface with
Rpb7
Rpb5 7-8 (2) N-ter
68-73 (6) a4, loop a4-b2
48-49 (2) a3-b1
122-123 (1) a6
a8-b6 – contact with
Rpb1 (foot)
b7-b8 – contact with
Rpb1 (foot)
Rpb6 72-74 Tail (N-ter) …72 (71 y, 44 h) Tail (N-
ter)
Tail (5)
Possibly contacts with
Rpb1 and Rpb5 (5)
Rpb7 57-58 (2) b3 141-142 (2) b2-b3
171 (1 localized) C-ter
b3-b4 loop – interface
with Rpb4 ?
Rpb8 32 (1) b3-b4 loop
82-87 (6) b5-b6 loop
2 (1) N-Ter
18-19 (2) b3-b4
105-106 (5) b7-b8
139-140 (1) b9-b10
146 (3 ) C-ter
63-76 (12 ) b5-b6 loop b5-b6 loop
Rpb9 113-120 (10) C-ter 2 (10) N-ter
70-71 (1) b5-b6 loop C-ter (2) C terminal
105-106 (2 ) b7-b8 loop
Rpb10 65 (5y 3h) Cter N terminal (contacting
Rpb3)
Rpb11 114 (6y 3h) C-ter
Rpb12 25 (24y 12h) b3-b4 (interaction with
Rpb2)
Supplemental Experimental Procedures
hRNAPII Purification and Activity
Human RNAPII was purified as previously described [1, 2]. In brief, RNAPII was
extracted from HeLa nuclear pellets through sonication and precipitated with a 42%
ammonium sulfate cut. The pellets were resuspended and dialyzed to 0.15 M
ammonium sulfate and placed on a DEAE52 anion exchange column. Subsequent to
thorough washing, RNAPII was eluted with a buffer containing 0.4 M ammonium sulfate
and assayed by western blot. The fractions containing RNAPII were pooled and this
eluate was dialyzed into a buffer containing 0.2 M ammonium sulfate. This was
subsequently placed over a protein G affinity column containing 8WG16 antibodies from
NeoClone. After several high-salt washes the RNAPII was eluted four times with buffer
containing a tri-heptapeptide repeat of the CTD. These fractions were then pooled and
dialyzed against a buffer containing 0.15 M ammonium sulfate and then placed on a
DEAE-5PW ion exchange column to separate the different phosphorylation states of the
protein complex. Fractions were immediately dispensed into 5 µl aliquots and frozen in
liquid nitrogen for later use. Transcriptional activity was measured using a
promoter-less RNA polymerase initiation system prepared as described by Kashlev and
coworkers [3].
EM Sample Preparation and Data Collection
Fractions of 50 µg/mL hRNAPII were thawed just prior to EM use and diluted 5-fold,
and 5 µL of this dilution was applied for 30 seconds to a continuous carbon-coated,
400-mesh copper grid that had been glow discharged. The grid was then placed on a 100 µL
drop of stain that consisted of a saturated solution of ammonium molybdate neutralized
to a pH of 7.2 with 10 N NaOH [4]. After 30 seconds of exposure to the stain the grid
was mounted on a plunger, blotted to a thin layer, air-dried for 1-2 sec and finally
vitrified in liquid ethane. EM data were collected on an FEI CM200-FEG transmission
electron microscope at an acceleration voltage of 200 kV, with a calibrated
magnification of 50,280x, and using a Gatan 626 cryo-specimen holder (Gatan Inc.,
Warrendale, PA, USA) while keeping the specimen temperature at approximately
-180 ˚C. Images were taken under low-dose conditions (17 e⁻/Å²) with a defocus range of
1.2 µm to 3.6 µm and recorded on Kodak SO163 plate films. The quality of the
micrographs was checked by visual inspection for astigmatism and drift. The best 19
micrographs were digitized on a Nikon Super CoolScan 8000 with a 6.353 µm raster
size, resulting in a pixel size of 1.25 Å. Transmission values from the scanner were
converted to optical density with tm2od, an in-house convenience script that employs
the proc2d module of EMAN (see
http://cryoem.berkeley.edu/~slaton/bash/tm2od.shtml). Particles were then decimated to
2.5 Å/pixel.
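As a rough check of the numbers above, the specimen-level pixel size follows from the scanner raster divided by the calibrated magnification, and optical density is conventionally the negative base-10 logarithm of the fractional transmission. The sketch below is illustrative only: it is not the tm2od script, the function names are hypothetical, and 2x decimation by block averaging is an assumption about how the particles were binned.

```python
import math


def pixel_size_angstrom(raster_um: float, magnification: float) -> float:
    """Pixel size at the specimen level in Å (1 µm = 1e4 Å)."""
    return raster_um * 1e4 / magnification


def transmission_to_od(transmission: float) -> float:
    """Optical density from fractional transmission T, with 0 < T <= 1."""
    if not 0.0 < transmission <= 1.0:
        raise ValueError("transmission must be in (0, 1]")
    return -math.log10(transmission)


def decimate_2x(image):
    """Bin a 2-D list by averaging non-overlapping 2x2 blocks (assumed scheme)."""
    rows = len(image) // 2 * 2
    cols = len(image[0]) // 2 * 2
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, cols, 2)
        ]
        for y in range(0, rows, 2)
    ]


scanned = pixel_size_angstrom(raster_um=6.353, magnification=50280)
print(f"scanned pixel size: {scanned:.3f} Å, after 2x decimation: {2 * scanned:.3f} Å")
```

With the stated raster and magnification this arithmetic gives about 1.26 Å per pixel, close to the 1.25 Å quoted above, and about 2.5 Å per pixel after 2x decimation.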
Image Processing
The EMAN software package [5] was used to manually establish particle
coordinates (boxer) and window 9225 images at a size of 120 x 120 pixels (batchboxer).
Two methodologies were employed to estimate the contrast transfer function (CTF) for
each particle. The first method used the ctfit module of EMAN [5] to estimate the
defocus for an entire micrograph with subsequent assignment of CTF parameters to the
corresponding particles using IMAGIC [6]. This procedure corrected the individual
image CTFs by flipping the phase only, without regard for the amplitudes. The second
method relied on the ctftilt program of FREALIGN to estimate defocus and astigmatism
for each particle [7]. The program determines the specimen tilt parameters by
measuring the defocus at a series of locations on the image while constraining them to
a single plane. This information was stored and used for refinement in FREALIGN.
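Phase flipping, as used in the first method, can be sketched as follows: each Fourier coefficient is multiplied by the sign of the CTF at its spatial frequency, leaving the amplitudes untouched. This is a minimal illustration and not the IMAGIC or EMAN implementation. The CTF form follows the common convention with χ = πλΔz·s² − ½πCsλ³·s⁴; the spherical aberration and amplitude contrast used in the example call are assumed values that the text does not give.

```python
import math


def electron_wavelength_angstrom(voltage_v: float) -> float:
    """Relativistic electron wavelength in Å for an accelerating voltage in volts."""
    return 12.2643 / math.sqrt(voltage_v + 0.97845e-6 * voltage_v ** 2)


def ctf(s: float, defocus_a: float, wavelength_a: float, cs_a: float, amp_contrast: float) -> float:
    """CTF at spatial frequency s (1/Å); positive defocus means underfocus."""
    chi = (math.pi * wavelength_a * defocus_a * s ** 2
           - 0.5 * math.pi * cs_a * wavelength_a ** 3 * s ** 4)
    w = amp_contrast
    return -(math.sqrt(1.0 - w * w) * math.sin(chi) + w * math.cos(chi))


def flip_phase(coefficient: complex, ctf_value: float) -> complex:
    """Multiply a Fourier coefficient by the sign of the CTF (amplitude unchanged)."""
    return -coefficient if ctf_value < 0 else coefficient


wavelength = electron_wavelength_angstrom(200e3)  # 200 kV, as stated in the text
# Assumed for illustration: Cs = 2.0 mm (2.0e7 Å), amplitude contrast = 0.1.
for resolution in (50.0, 25.0, 15.0):
    value = ctf(1.0 / resolution, 12000.0, wavelength, 2.0e7, 0.1)
    print(f"{resolution:5.1f} Å  CTF = {value:+.3f}")
```

Because only the sign is corrected, the amplitude modulation of the CTF remains in the images, which is why a full CTF correction is applied later during refinement.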
The individual images were arithmetically normalized, iteratively centered, and
then multiplied by a Gaussian blurred circular mask, with a radius 90% of the total
image radius. 2-D analysis was performed using the program IMAGIC [6]. Particles
were subjected to multivariate statistical analysis (MSA) and hierarchical ascendant
classification (HAC).
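The normalization and masking steps can be illustrated with a short sketch. It assumes that "arithmetically normalized" means scaling to zero mean and unit standard deviation, and it approximates the Gaussian-blurred circular mask by a disc with a Gaussian fall-off outside the radius. The edge width `edge_sigma` is a hypothetical parameter, not a value from the text.

```python
import math


def normalize(image):
    """Scale a 2-D list of densities to zero mean and unit standard deviation."""
    values = [v for row in image for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
    return [[(v - mean) / std for v in row] for row in image]


def soft_circular_mask(size: int, radius_fraction: float = 0.9, edge_sigma: float = 2.0):
    """Mask equal to 1 inside the radius, with a Gaussian fall-off outside it."""
    centre = (size - 1) / 2.0
    radius = radius_fraction * size / 2.0
    mask = []
    for y in range(size):
        row = []
        for x in range(size):
            r = math.hypot(x - centre, y - centre)
            if r <= radius:
                row.append(1.0)
            else:
                row.append(math.exp(-((r - radius) ** 2) / (2.0 * edge_sigma ** 2)))
        mask.append(row)
    return mask


def apply_mask(image, mask):
    """Multiply an image by a mask, pixel by pixel."""
    return [[v * m for v, m in zip(irow, mrow)] for irow, mrow in zip(image, mask)]


mask = soft_circular_mask(120)  # 120 x 120 pixel boxes, radius 90% of the image radius
```

A soft edge avoids the sharp discontinuity of a binary mask, which would otherwise introduce ringing in Fourier space.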
Initial Euler angle assignment was performed with SPIDER [8] using a projection
matching strategy. The particles used for 2-D analysis were first converted to SPIDER
format using IMAGIC (em2em). The initial 2-D analysis indicated that the
crystallographic model of the yeast homologue of human RNAPII (PDB coordinates
1Y1V) could be used as an initial reference. The yRNAPII crystal structure was
low-pass Fermi-filtered to a spatial frequency of 1/50 Å⁻¹, thus allowing alignment of the
overall shape without bias from high-resolution frequencies. A gallery of 2-D references
was generated from the filtered yeast model by reprojecting it with an angular step
(delta theta) of 15 degrees. Then the individual experimental particle images were
cross-correlated to these references for alignment and Euler angular assignment [9].
After translational and rotational alignment of the particles a 3-D volume was calculated
by back projection using the assigned Euler angles. This procedure was iterated using
the volume derived from the previous round to generate reprojections. The theta angle
step size was decreased after several rounds of projection matching were performed for
each theta step.
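The core of projection matching is to score each particle against every reference projection and keep the best match, whose Euler angles are then assigned to the particle. The sketch below shows only that scoring step with a normalized cross-correlation coefficient. It is not the SPIDER procedure, it omits the rotational and translational alignment searches, and the function names are hypothetical.

```python
import math


def ncc(a, b) -> float:
    """Normalized cross-correlation coefficient of two equally sized flat images."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


def best_reference(particle, references):
    """Return (index, score) of the reference projection most similar to the particle."""
    scores = [ncc(particle, ref) for ref in references]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy 2x2 "projections" flattened to lists; real references would be
# reprojections of the filtered model at each angular step.
references = [[0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
index, score = best_reference([2, 2, 0, 0], references)
print(index, round(score, 3))
```

Iterating this assignment against reprojections of each new volume, with a progressively finer angular step, gives the refinement loop described above.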
Refinement and full CTF correction of the SPIDER model was performed with the
image processing software package FREALIGN [10]. This program refines the x, y
shifts and the three Euler angles for each particle, and performs a CTF correction on the
generated volume in Fourier space. Previously determined particle parameters are
used as initial input parameters. fwrap, an in-house developed package, was used to
divide the dataset across the processors of a Linux cluster and automatically manage
the multiple rounds of FREALIGN refinement (see
http://cryoem.berkeley.edu/~slaton/emperl/fwrap/index.shtml). CTF correction
refinement was performed with an initial resolution range of 200 to 40 Å (10 rounds)
with all 9225 particles. Subsequently the phase residual cut-off parameter was lowered
from 90 to 60 to obtain the best matching 6238 particles. Further refinement using
these particles, up to round 63, was performed with a resolution range of 200 to 10 Å.
The resolution of the final reconstruction was estimated from the Fourier shell
correlation function obtained by comparing two independent reconstructions, which
were generated by randomly splitting the data set in half [11]. The two data sets were
independently reconstructed and the resolution was given according to the 0.143 cut-off
in the Fourier shell correlation curve [12] (Figure S2A). The density
threshold was calculated with IMAGIC (threed-sexy). This program calculates the
protein mass enclosed by the voxels above the threshold, assuming a protein density of
0.844 Da/Å³.
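Reading a resolution off an FSC curve at a fixed cut-off can be sketched as below, assuming linear interpolation between the two shells that bracket the threshold. The curve values in the example are made up for illustration and are not the data behind Figure S2.

```python
def resolution_at_threshold(frequencies, fsc, threshold=0.143):
    """Resolution in Å where the FSC first drops below the threshold.

    frequencies: shell spatial frequencies in 1/Å, ascending.
    fsc: correlation value for each shell.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            f0, f1 = frequencies[i - 1], frequencies[i]
            # Linear interpolation between the bracketing shells.
            fraction = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            return 1.0 / (f0 + fraction * (f1 - f0))
    return None


# Illustrative curve only.
example = resolution_at_threshold([0.02, 0.04, 0.05], [0.9, 0.243, 0.043])
print(f"{example:.1f} Å")
```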
To assess the potential conformational heterogeneity, 2-D analysis was again
performed with IMAGIC on particles from angular distribution groups generated during
projection matching. Class averages were generated by averaging particles in angular
groups at a theta step of 15˚. Particles in each class were subjected to MSA and MRA
generating initially up to 5 sub-classes. However, visual inspection indicated that the
variability in the dataset was represented by two main sub-classes, with the others
being underrepresented.
3-D variance was calculated as previously described [13, 14]. Briefly, 500
bootstrap versions of the dataset were picked randomly with replacement from our
original dataset, leading to as many 3D reconstructions using the same 3D alignment
parameters, and low-pass filtered to 30 Å resolution. Some background “noise
particles” were extracted from the area around each particle of the dataset and treated
in the same way. An estimate of the 3D structure variance can be calculated using the
bootstrap variance $\sigma_B^2$ between the B = 500 reconstructions obtained from the resampled
data particles and the average of the “noise” variance $\sigma_{Back}^2$ calculated with the same
method (command VA 3R in SPIDER):
$$\sigma_{Struct}^2 = K\,(\sigma_B^2 - \sigma_{Back}^2),$$
where K is the number of particles in the original data set.
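The bootstrap estimate can be illustrated for a single voxel. In this toy sketch a "reconstruction" is simply the average of the resampled particle values, which stands in for a real 3D back projection; the background variance is set to zero and the data are simulated, so only the resampling logic and the final formula correspond to the text.

```python
import random


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Unbiased sample variance."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def bootstrap_variance(particles, n_boot, rng):
    """Variance across n_boot averages of datasets resampled with replacement."""
    k = len(particles)
    reconstructions = [mean(rng.choices(particles, k=k)) for _ in range(n_boot)]
    return variance(reconstructions)


def structure_variance(sigma_b2, sigma_back2, k):
    """sigma_Struct^2 = K * (sigma_B^2 - sigma_Back^2)."""
    return k * (sigma_b2 - sigma_back2)


rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(200)]  # simulated voxel values
sigma_b2 = bootstrap_variance(particles, n_boot=500, rng=rng)
estimate = structure_variance(sigma_b2, sigma_back2=0.0, k=len(particles))
print(f"bootstrap estimate: {estimate:.2f}, sample variance: {variance(particles):.2f}")
```

The factor K appears because the variance of an average of K particles is roughly K times smaller than the variance of the underlying values, so multiplying by K recovers the per-particle variance.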
Targeted Classification and Reconstruction
The different regions of the 3D variance were selected using spherical masks
centered at the highest variance peaks in SPIDER. They were then individually
projected in quasi-evenly distributed directions (15˚ spacing), identical to the directions
of the particle “angular groups”, to generate a series of 2D masks. 2D classification of
the particle data belonging to each angular group was performed within the
corresponding masks as in [14, 15]. Classification for the high variance region between
the clamp and the lobe (region 1, Fig 4) gave the clearest result, with two main classes
corresponding to a higher or lower density within the mask. The data was partitioned
accordingly for each view into two groups, “closed” or “open” clamp. Two 3D
reconstructions were obtained from the particle groups, which were then refined
simultaneously against the entire dataset; the final particle assignment to either
conformation was determined by the highest cross-correlation coefficient. An additional
refinement of the alignment parameters was performed for each separate group in
FREALIGN [10]. The 3D variance was calculated again for each group of particles,
showing a notably reduced variance level after partition of the data.
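The partition into a higher-density and a lower-density class within a mask can be sketched with a simple two-cluster split of the mean masked density of each particle. This is a deliberately simplified stand-in: the classification described above follows [14, 15], whereas the one-dimensional two-means procedure and the function names here are assumptions made for illustration.

```python
def masked_mean(image, mask):
    """Mean density of a flat image inside a binary mask (non-zero = inside)."""
    inside = [v for v, m in zip(image, mask) if m]
    return sum(inside) / len(inside)


def two_means(values, iterations=20):
    """1-D k-means with two clusters; label 0 = lower density, 1 = higher density."""
    low_centre, high_centre = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [1 if abs(v - high_centre) < abs(v - low_centre) else 0 for v in values]
        low = [v for v, label in zip(values, labels) if label == 0]
        high = [v for v, label in zip(values, labels) if label == 1]
        if low:
            low_centre = sum(low) / len(low)
        if high:
            high_centre = sum(high) / len(high)
    return labels


# Toy masked densities for six particles from one angular group.
densities = [0.10, 0.20, 0.15, 0.90, 1.00, 0.95]
print(two_means(densities))
```

In practice the split would be made separately within each angular group, since the projected mask differs from view to view.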
Docking of the Crystallographic Data
The atomic coordinates of the 12-subunit yeast RNAPII (PDB 1Y1V,
Kettenberger et al., 2004) were initially docked manually into the EM density map of the
human polymerase in Chimera (see below). The initial fit was refined using the rigid-
body docking program colores from the SITUS package [16, 17]. This program
performed an initial extensive search, with an angular step size of 15˚, followed by an
off-lattice Powell optimization of the fit. Rigid-body docking of the crystallographic data
into the two conformations obtained from targeted classification was also attempted.
The resulting best fit was obtained in approximately the same position as before
partition of the data, with lower cross-correlation coefficients. While most of the crystal
structure matched closely the envelope from the cryo-EM data, some domains seemed
to adopt different positions. Manual docking of the clamp and jaw-lobe domains gave a
better fit in those areas of the structure (Fig. 5, bottom row). The docking of the TFIIB-
yRNAPII core complex gave a slightly better fit for conformation 1, but this structure
lacks the stalk domain.
Homology Model of RNA Polymerase II from Human
A three-dimensional homology model of the human RNAPII was produced using
MODELLER 8v2 [18]. The sequence of the target protein was aligned to the
yeast RNAPII sequence using the program ClustalW [19] and the known X-ray
structure 1Y1V [20] (without TFIIS) was used as a template for the determination
of the tertiary structure. Prior to determination of the homology model, all amino
acids which were not present in the X-ray structure (loops and termini) were
removed from the yeast sequence. The MODELLER default script (model-default.py)
was used and extended with the “env.io.hetatm = True” command to
keep all ions (1 x Mg²⁺, 9 x Zn²⁺) within the homology model. Ten models were
created and the one with the lowest value of the MODELLER objective function
was chosen. Termini of the homology RNAPII structure corresponding to regions
that are not resolved within the yeast X-ray structure and that are longer than
15 residues were removed from the homology model (chain A 1486-1970 (1486-1970),
chain B 1971-1986 (1-15), chain F 3772-3818 (1-46)).
The resulting human model (available upon request) was docked into the cryo-EM
density map using the SITUS rigid-body docking command colores and is
shown in Figure S5 in the same position within the EM map as the docked yeast model
(1Y1V, Fig. 2B).
Volume Rendering
All the volumes represented were filtered to 22 Å resolution, as determined by
the FSC criterion at 0.143, using a cosine type filter. The density threshold was
calculated to include a protein volume corresponding to 517 kDa, the molecular mass of
holo-hRNAPII. The 3D density maps and atomic structures were rendered with the
UCSF Chimera package from the Computer Graphics Laboratory, University of
California, San Francisco ([21] , supported by NIH P41 RR-01081).
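The mass-based threshold can be made concrete with a short calculation: the molecular mass divided by the protein density gives the enclosed volume, and dividing by the voxel volume gives the number of voxels that should lie above the threshold. The sketch assumes the rendered maps have the 2.5 Å voxel size of the decimated particles, and it is not the IMAGIC implementation; the function names are hypothetical.

```python
def voxels_for_mass(mass_da: float, density_da_per_a3: float, voxel_size_a: float) -> int:
    """Number of voxels whose total volume corresponds to the given protein mass."""
    volume_a3 = mass_da / density_da_per_a3
    return round(volume_a3 / voxel_size_a ** 3)


def threshold_for_voxel_count(densities, n_voxels: int) -> float:
    """Density value such that the n_voxels highest-density voxels are at or above it."""
    ranked = sorted(densities, reverse=True)
    n = min(max(n_voxels, 1), len(ranked))
    return ranked[n - 1]


# 517 kDa and 0.844 Da/Å³ are from the text; the 2.5 Å voxel size is assumed.
n_voxels = voxels_for_mass(517_000, 0.844, 2.5)
print(n_voxels)
```

Under these assumptions about 39,000 voxels fall inside the rendered surface.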
Supplemental References
1. Thompson, N.E., Aronson, D.B., and Burgess, R.R. (1990). Purification of
eukaryotic RNA polymerase II by immunoaffinity chromatography. Elution of
active enzyme with protein stabilizing agents from a polyol-responsive
monoclonal antibody. J Biol Chem 265, 7069-7077.
2. Maldonado, E., Drapkin, R., and Reinberg, D. (1996). Purification of human RNA
polymerase II and general transcription factors. Methods Enzymol 274, 72-100.
3. Kireeva, M.L., Komissarova, N., Waugh, D.S., and Kashlev, M. (2000). The 8-
nucleotide-long RNA:DNA hybrid is a primary stability determinant of the RNA
polymerase II elongation complex. J Biol Chem 275, 6530-6536.
4. Adrian, M., Dubochet, J., Fuller, S.D., and Harris, J.R. (1998). Cryo-negative
staining. Micron 29, 145-160.
5. Ludtke, S.J., Baldwin, P.R., and Chiu, W. (1999). EMAN: semiautomated
software for high-resolution single-particle reconstructions. J Struct Biol 128,
82-97.
6. van Heel, M., Harauz, G., Orlova, E.V., Schmidt, R., and Schatz, M. (1996). A
new generation of the IMAGIC image processing system. J Struct Biol 116, 17-
24.
7. Mindell, J.A., and Grigorieff, N. (2003). Accurate determination of local defocus
and specimen tilt in electron microscopy. J Struct Biol 142, 334-347.
8. Frank, J., Radermacher, M., Penczek, P., Zhu, J., Li, Y.H., Ladjadj, M., and Leith,
A. (1996). SPIDER and WEB - Processing and visualization of images in 3D
microscopy and related fields. J Struct Biol 116, 190-199.
9. Penczek, P.A., Grassucci, R.A., and Frank, J. (1994). The ribosome at improved
resolution: new techniques for merging and orientation refinement in 3D cryo-
electron microscopy of biological particles. Ultramicroscopy 53, 251-270.
10. Grigorieff, N. (1998). Three-dimensional structure of bovine NADH:ubiquinone
oxidoreductase (complex I) at 22 Å in ice. J Mol Biol 277, 1033-1046.
11. Frank, J. (1996). Three-dimensional electron microscopy of macromolecular
assemblies (San Diego: Academic Press).
12. Rosenthal, P.B., and Henderson, R. (2003). Optimal determination of particle
orientation, absolute hand, and contrast loss in single-particle electron
cryomicroscopy. J Mol Biol 333, 721-745.
13. Penczek, P.A., Yang, C., Frank, J., and Spahn, C.M. (2006). Estimation of
variance in single-particle reconstruction using the bootstrap technique. J Struct
Biol.
14. Grob, P., Cruse, M.J., Inouye, C., Peris, M., Penczek, P.A., Tjian, R., and
Nogales, E. (2006). Cryo-electron microscopy studies of human TFIID:
conformational breathing in the integration of gene regulatory cues. Structure 14,
511-520.
15. Penczek, P.A., Frank, J., and Spahn, C.M. (2006). A method of focused
classification, based on the bootstrap 3D variance analysis, and its application to
EF-G-dependent translocation. J Struct Biol 154, 184-194.
16. Wriggers, W., and Birmanns, S. (2001). Using situs for flexible and rigid-body
fitting of multiresolution single-molecule data. J Struct Biol 133, 193-202.
17. Chacon, P., and Wriggers, W. (2002). Multi-resolution contour-based fitting of
macromolecular structures. J Mol Biol 317, 375-384.
18. Sali, A., and Blundell, T.L. (1993). Comparative protein modelling by satisfaction
of spatial restraints. J Mol Biol 234, 779-815.
19. Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994). CLUSTAL W: improving
the sensitivity of progressive multiple sequence alignment through sequence
weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids
Res 22, 4673-4680.
20. Kettenberger, H., Armache, K.J., and Cramer, P. (2003). Architecture of the RNA
polymerase II-TFIIS complex and implications for mRNA cleavage. Cell 114, 347-
357.
21. Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M.,
Meng, E.C., and Ferrin, T.E. (2004). UCSF Chimera--a visualization system for
exploratory research and analysis. J Comput Chem 25, 1605-1612.