

The Application of Surface Entropy Reduction

David R. Cooper, Tomek Boczek, WonChan Choi, Urszula Derewenda, Natalya Olekhnovitch, Ankoor Roy, Darkhan Utepbergenov, Meiying Zheng, and Zygmunt Derewenda

University of Virginia, Department of Molecular Physiology and Biological Physics, Charlottesville, VA 22908

The Integrated Center for Structure and Function Innovation (The ISFI), a PSI-2 Specialized Center.

Abstract

A wide array of factors influence whether or not a protein will crystallize, and many otherwise well-behaved proteins seem to harbor an intrinsic resistance to crystallization. For some proteins, the presence of highly entropic residues on the surface can doom crystallization attempts by creating an entropic shield that prevents the intermolecular contacts necessary for crystallization. The Surface Entropy Reduction (SER) approach of replacing these entropic residues has proven to be an effective means of producing crystals for some of these stubborn crystallizers. However, this approach is not appropriate for every difficult protein, and the likelihood of success is highly dependent on whether the protein can be considered a good SER candidate. Herein, we present the types of analysis that should be performed before a protein can be called a SER candidate, and we outline a strategy for the application of surface entropy reduction. We also discuss recent successes of our pipeline and present an indication of the technique’s success rate. Structures that will be shown include a transcription factor, two metal-dependent hydrolases, and a disulfide isomerase.

SER Basics

The Surface Entropy Reduction technique (SER) promotes crystallization by altering surface features that inhibit crystallization. Large, flexible residues (particularly lysines and glutamates) on the surface of a protein can create an “entropy shield” that impedes crystallization. The SER method involves replacing clusters of highly entropic residues with residues that can facilitate crystallization.

The original and still most popular replacement residue is alanine, but tyrosine, threonine, and methionine are also suitable.
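The notion of a "cluster of highly entropic residues" can be illustrated with a short sequence scan. The sketch below is only an illustration and is not the algorithm used by any prediction server: the function name `find_entropy_clusters` and the parameters `window` and `min_hits` are hypothetical, and treating glutamine as a target alongside lysine and glutamate is an assumption made here. It also ignores surface exposure, secondary structure, and conservation, all of which matter in practice.

```python
# Illustrative sketch only; not the SERp algorithm.
# Lys (K) and Glu (E) are named in the text; Gln (Q) is included as an assumption.
HIGH_ENTROPY = set("KEQ")


def find_entropy_clusters(sequence, window=5, min_hits=2):
    """Group high-entropy residues that fall within `window` positions.

    Returns a list of clusters, each a list of labels such as "K159"
    (one-letter code plus 1-based position). `window` and `min_hits`
    are hypothetical tuning parameters.
    """
    seq = sequence.upper()
    hits = [i for i, aa in enumerate(seq) if aa in HIGH_ENTROPY]

    clusters = []
    current = []
    for i in hits:
        # Close the current group once this hit lies outside the window
        # that starts at the group's first residue.
        if current and i - current[0] >= window:
            if len(current) >= min_hits:
                clusters.append(current)
            current = []
        current.append(i)
    if len(current) >= min_hits:
        clusters.append(current)

    return [[f"{seq[i]}{i + 1}" for i in cluster] for cluster in clusters]


if __name__ == "__main__":
    # Made-up sequence, used only to show the output format.
    print(find_entropy_clusters("MAGEKLAKTTTTTTTTEE"))
```

Each returned cluster is a candidate group of residues to replace together; isolated high-entropy residues are dropped because they fail the `min_hits` threshold.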

Mutations Facilitate Crystallization

In most SER structures, the mutations participate in crystal contacts. Here the mutated residues are either labeled or shown in magenta or pink.

LcrV Structure 12:357

Hsp33 Structure 12:1901

RGSL of PDZ-RhoGEF Structure 9:559

OhrB Acta Cryst D63:1269

RhoGDI K99S, Q100S Acta Cryst D63:636

RhoGDI K138Y, K141Y Acta Cryst D63:636

RhoGDI K135T, K138T, K141T Acta Cryst D63:636

RhoGDI E155H, E157H Acta Cryst D63:636

Is SER right for you?

Have you already tried …

… Reductive methylation? Nature Methods 5:853-4 (2008). Structure 14:1617-22 (2006).

Quick and easy. Can be performed in parallel with native protein already on hand.

… Reductive cyclic pentylation? Acta Crystallographica D65:462-9 (2009).

A recently reported alternative to methylation.

… Alternate Reservoir screening? Acta Crystallographica D61:490-3 (2005).

This technique involves setting up each crystallization drop as normal, but filling the reservoir of every well with something other than the crystallization solution. We use 1.5 M NaCl. Not only do we get more crystals (~33% more) using these screens, but the concentration of salt in the reservoir is an easy-to-optimize parameter. The one tricky bit is figuring out a good cryo solution, because you cannot assume that the composition of the drop is similar to that of the crystallization solution. Sometimes covering the drop with oil before harvesting crystals is sufficient.

Absolute minimum SER Candidate requirements:
• The protein is soluble and purifies well.
• It is difficult to crystallize or diffracts poorly.
• It contains a cluster of highly entropic residues.
• It lacks large regions of predicted disorder or coiled coils.
• Its crystallization has not been hampered by a missing co-factor or ligand.

Bioinformatics checks:

Even if you think you know your protein, check the following things.

• Check for disorder. One of the first checks should be for regions of high disorder. Significant regions of disorder can indicate that conventional crystallization screening will fail. Some proteins may require co-crystallization with a folding partner.

DisMeta (http://www-nmr.cabm.rutgers.edu/bioinformatics/disorder/)
A disorder meta-server that uses 10 disorder prediction programs to generate a consensus disorder prediction.

• Check for coiled coils. Since the interaction of large, charged residues can be critical for coiled-coil formation, SER is not appropriate for coiled coils.

Coils Server (http://www.ch.embnet.org/software/COILS_form.html)
Paircoil2 (http://groups.csail.mit.edu/cb/paircoil2/)

• Predict a tertiary fold and identify known domains or motifs. Information about a protein’s relatives can identify why previous crystallization attempts have failed, suggest a new method of screening the protein, or indicate residues for SER mutations. We have identified numerous cases where co-factors, ligands, or anaerobic conditions would probably be necessary for crystallization.

Blast (http://blast.ncbi.nlm.nih.gov/Blast.cgi)
BioInfoBank Meta Server (http://bioinfo.pl/)
A thorough meta server that can identify distantly related proteins that Blast misses.
ProDom (http://prodom.prabi.fr/prodom/current/html/home.php)
COGs (http://www.ncbi.nlm.nih.gov/COG/)
Pfam (http://pfam.sanger.ac.uk/)
Prosite (http://ca.expasy.org/prosite/)
Blocks (http://blocks.fhcrc.org/)

• Other things

XtalPred (http://ffas.burnham.org/XtalPred-cgi/xtal.pl)
A web server for prediction of protein crystallizability.
PSI / Nature Structural Genomics Knowledgebase (http://kb.psi-structuralgenomics.org/KB/)
Find related structures, models, protocols and more. Find out which PSI centers are working on similar (or identical!) proteins.

If you need SER …

Step 1) Use the SERp Server to help pick mutation sites.

http://nihserver.mbi.ucla.edu/SER/
Suggested mutations are predicted based on an algorithm incorporating a conformational entropy profile, a secondary structure prediction, and sequence conservation. While designed to be used with default values, the server has many user-controlled parameters allowing for considerable flexibility. Only the primary sequence is required for analysis. Predictions are available in minutes.

Step 2) Create SER variants.

Choose 2-3 mutation clusters and make 2-3 SER variants for each. Alanine, tyrosine, threonine, and methionine are effective replacement residues. (You may only have to screen 1 or 2 of the variants to get crystals.) There is an unpredictable correlation between the site of the mutations and the optimal replacement residue. Some combinations have a deleterious effect on solubility.
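The bookkeeping in Step 2 (several clusters, several replacement residues per cluster) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `make_variants` is hypothetical, and the label format "1A (K159A, E160A)" simply mirrors the variant names used in the results tables on this poster. The cluster list in the usage example is typed in by hand, not produced by the SERp server.

```python
def make_variants(clusters, replacements=("A", "Y")):
    """Enumerate SER variants for ranked clusters.

    `clusters` is a list of clusters in rank order, each a list of
    wild-type residue labels such as "K159". `replacements` holds the
    one-letter codes of the replacement residues to try.
    Returns labels such as "1A (K159A, E160A)".
    """
    variants = []
    for rank, cluster in enumerate(clusters, start=1):
        for new_aa in replacements:
            # Append the replacement residue to each wild-type label.
            mutations = ", ".join(f"{residue}{new_aa}" for residue in cluster)
            variants.append(f"{rank}{new_aa} ({mutations})")
    return variants


if __name__ == "__main__":
    # Hand-entered example clusters (hypothetical input).
    example = [["K159", "E160"], ["K78", "E79"], ["K100", "K101"]]
    for label in make_variants(example):
        print(label)
```

With three clusters and two replacement residues this yields six variants, which is the scale at which all constructs can reasonably be made and screened in parallel.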

Our Current SER Strategy

• Target evaluation and selection. See “Is SER right for you?” panel.

• Expression of Wild Type, taken through to crystallization trials.
Purification is performed on a chromatography system, with elution as a gradient to determine the optimal washing concentration of imidazole.
We screen the WT protein for crystallization. If we get crystal hits, we will work with them for ~2 months before undertaking mutagenesis. If the structure can be obtained with the WT protein, then the protein does not need SER and is not a SER candidate.

• Mutation site selection.
We use the SERp server. See “If you need SER …”.

• Primer design.
We order primers to make Ala and Tyr variants for the top 3 clusters. PrimerX is a nice web tool for designing mutagenesis primers. (http://www.bioinformatics.org/primerx/)

• QuikChange mutagenesis.
We make them all at once.

• Mutant expression, purification, crystallization.
We often purify several SER variants simultaneously, using gravity columns or an AKTAxpress. Variants are washed with the imidazole concentration determined for the wild type protein.

Lysine Rotamers

Glutamate Rotamers

An alanine SER variant

Tm0439 E118A, K119A, K122A (pdb 3FMS)

Mutations are in crystal contacts.

Variant                      Result
Wild Type                    No Hits
1A (E118A, K119A, K122A)     Structure
2A (K2A, K3A)                Diffraction
3A (E30A, K31A)              Diffraction

APC1446 Q100A, E101A (pdb 3FHK)

APC1446 crystallizes with 4 molecules in the ASU, which generates a non-crystallographic two-fold. Three types of crystal contacts are mediated by the SER mutations.

The structures of known oxidized CxC motifs, and the reduced CxC motif of YphP.

The catalytic loop of YphP. Arg121 activates Cys53, allowing intramolecular disulfide formation. Cys55 is in turn activated by Arg121, providing an “escape” mechanism.

YphP (Apc1446)
Thioredoxin (1AUC)

Variant                      Result
Wild Type                    No Hits
1A (E39A, K30A, E42A)        Crystals
1Y (E39Y, K30Y, E42Y)        No Hits
2A (K113A, E114A)            No Hits
2Y (K113Y, E114Y)            No Hits
3A (Q100A, E101A)            Structure
3Y (Q100Y, E101Y)            Crystals

Tm1382 (pdb 3E57)

This protein was phased with the SeMet SER variant K159A, E160A (which grew quickly) and refined against the native wild type (which took ~6 months to grow). Both crystals were grown using alternate reservoir screens (800 mM NaCl and 1.9 M NaCl for WT).

K159A, E160A: Phasing
Wild Type: Refinement

Variant                      Result
Wild Type                    Structure
1A (K159A, E160A)            Structure
1Y (K159Y, E160Y)            Diffraction
2A (K78A, E79A)              Crystals
2Y (K78Y, E79Y)              Crystals
3A (K100A, K101A)            Crystals
3Y (K100Y, K101Y)            Diffraction

Tm1679 K100Y, K101Y (pdb 3H3E)

Variant                      Result
Wild Type                    No Hits
1A (K159A, E160A)            No Hits
1Y (K159Y, E160Y)            No Hits
2A (K78A, E79A)              Diffraction
2Y (K78Y, E79Y)              No Hits
3A (K100A, K101A)            No Hits
3Y (K100Y, K101Y)            Structure

Mutations are in crystal contacts.

The conserved residues of COG 1237 are blue and additional residues conserved in the Tm1679-like family are red. The disulfides in the CHCT motif are shown. In the close up of the active site, the color of the Cα sphere indicates its conservation.

Tm1679
L1 (1SML), Subclass B3
Tflp (2P4Z), Top Dali Hit

TM1679 is a metallo-β-lactamase (MBL) domain-containing protein that has been assigned to COG1237, a small group with a distinct set of conserved residues. MBLs are the most troublesome source of antibiotic resistance. We have identified a family of 40 proteins that share a distinct set of conserved residues with TM1679, with some of these conserved residues overlapping with a Family Strand Block of RNA-metabolising metallo-beta-lactamases (Blocks server).

The Nudix superfamily is large and catalyzes a diverse set of reactions, generally acting on a NUcleoside DIphosphate linked to another moiety, X. Tm1382 contains the conserved “Nudix box,” Gx5Ex5[UA]xREx2EExGU. A number of Nudix functions can be ruled out by the lack of other conserved motifs, but for now the function of this protein remains a mystery for biochemists to solve.
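As a rough illustration, the Nudix box pattern quoted above can be written as a regular expression and searched for in a sequence. This is a sketch under stated assumptions: the poster does not define "U", so it is taken here to mean a bulky aliphatic residue (Ile, Leu, or Val), and "x" is taken to mean any residue. The function name `find_nudix_box` is hypothetical, and the test sequence is synthetic, not the Tm1382 sequence.

```python
import re

# Gx5Ex5[UA]xREx2EExGU, with U assumed to be I, L or V and x any residue.
NUDIX_BOX = re.compile(r"G.{5}E.{5}[ILVA].RE.{2}EE.G[ILV]")


def find_nudix_box(sequence):
    """Return (1-based start position, matched text) for each Nudix box hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in NUDIX_BOX.finditer(seq)]


if __name__ == "__main__":
    # Synthetic sequence built to satisfy the pattern; not a real protein.
    print(find_nudix_box("MKGAAAAAETTTTTLSREAVEETGI"))
```

A hit only shows that the signature is present; as noted above, assigning a specific function still depends on the other conserved motifs.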

Tm0439 is a member of the GntR superfamily of dimeric transcription factors. It has an N-terminal winged-helix DNA binding domain and a C-terminal FCD domain with an internally bound metal. The FCD domain also serves as a dimerization domain, yielding disparate orientations of the DNA binding domains.

These structures were aligned using the N-terminal domain of one monomer.

DNA binding model based on pdb 1HW2.

Apc1446 is encoded by the B. subtilis yphP gene. It belongs to DUF1094, which currently lacks a functional annotation. YphP has a core domain with high similarity to thioredoxin, but with a CxC motif instead of the classical CxxC. Disulfide isomerase assays show that this protein has significant isomerase activity.

This work was funded as a part of PSI-2. The ISFI is funded by NIH U54 GM074946. Data were collected at SER-CAT (APS 22-ID) and ALS 5.0.2.

The SER Summary

Some proteins refuse to crystallize because their surface is teeming with highly entropic residues that are the result of an evolutionary pressure to prevent non-specific interactions (Physical Biology 1:9-13 2004). These proteins are viable candidates for SER and other surface-altering techniques. As PSI has pursued targets with little or no functional annotation, ascertaining whether or not a protein fits into this category has become a task unto itself. SER has proven itself to be an effective technique, but its labor-intensive nature warrants a selection process that will weed out proteins for which SER will not solve the root of the crystallization problem.