Identification and analysis of the folding determinants of membrane proteins.
by
Fiona Cunningham
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Department of Biochemistry University of Toronto
© Copyright by Fiona Cunningham 2011
Abstract
Membrane proteins are responsible for a variety of key cellular functions including
transport of essential substrates across the membrane, signal transduction, and maintenance of
cellular morphology. However, given the size and high hydrophobicity of membrane proteins,
along with demanding expression and solubilization protocols that often preclude biophysical
studies, novel approaches must be devised for studies of their structure and function. This thesis
addresses these issues through several sets of inter-related experiments. We first examine
sequence motifs directing α-helix packing, wherein the determinants of glycophorin A (GpA)
dimerization were identified via the TOXCAT assay and the evaluation of GpA-derived peptides.
We found that (i) conservative mutations can have significant effects on the oligomerization of
glycophorin A; and (ii) residues that introduce more efficiently packed structures that are poorly
solvated by lipid lead to improved transmembrane segment dimerization. In a further study, we
inquired into the criteria for selection of membrane-spanning α-helices by cellular machinery
through investigation of hydrophobic helical segments (termed δ-helices) that we identified in
soluble proteins. We found that the number and location of charged residues in a given
hydrophobic helix are related to its insertion propensity as a membrane-spanning segment.
When we applied this criterion to δ-helices in their intact protein structures, we successfully
determined the extent of δ-helix mutations necessary to convert a soluble protein, in part, to a
membrane-inserted protein. Finally, using a three-transmembrane segment construct from the
cystic fibrosis transmembrane conductance regulator (CFTR), we performed experiments aimed
at optimizing criteria for protein overexpression, including construct design, choice of expression
system, growth media, and expression temperature. The overall findings are interpreted in terms
of progress towards defining the fundamental characteristics of membrane-spanning α-helices,
from their primary amino acid sequence to the helix-helix interactions they display in the
assembly of biologically-functional membrane protein structures.
Table of Contents
Abstract
Table of Contents
List of Figures
List of Tables
List of Appendices
List of Abbreviations
Chapter 1. Introduction.
1.1 Introduction to membrane proteins.
1.2 Properties of transmembrane segments.
1.2.1 Secondary structure of transmembrane segments.
1.2.2 Structural features of α-helical membrane proteins.
1.2.3 Structural features of β-barrel membrane proteins.
1.2.4 Amino acid composition of α-helical TM segments.
1.2.5 Insertion of membrane proteins into the membrane bilayer is mediated by the translocon.
1.3 Prediction of transmembrane segments from the primary amino acid sequence.
1.3.1 Tools for the prediction of transmembrane segments.
1.3.2 Unique marginally hydrophobic transmembrane segments: breaking the rules.
1.3.3 Charged residues in transmembrane segments: importance for membrane integration.
1.4 Folding of α-helical membrane proteins.
1.4.1 Stage 1: Insertion of transmembrane segments into the membrane bilayer.
1.4.2 Stage 2: Formation of tertiary contacts between transmembrane α-helices.
1.4.3 Fragmentation approach to studying membrane proteins.
1.4.4 Efforts to generate membrane protein high resolution structures.
1.5 Forces driving membrane protein folding.
1.5.1 van der Waals interactions.
1.5.2 Electrostatic interactions.
1.5.3 Cation-π interactions.
1.5.4 Helix lipid interactions.
1.5.5 Lipids as membrane protein chaperones.
1.6 Membrane protein oligomerization motifs.
1.6.1 Polar residue motifs.
1.6.2 Left-handed transmembrane folding motifs: heptad repeats of small or large residues.
1.6.3 Right-handed packing motifs: small-xxx-small motifs.
1.6.3.1 The human erythrocyte protein Glycophorin A.
1.7 Methods to study the oligomerization of transmembrane segments.
1.7.1 Membrane mimetic systems: micelles versus bilayers.
1.7.1.1 Membrane bilayers.
1.7.1.2 Detergent micelles.
1.7.2 The TOXCAT assay.
1.7.3 Sodium dodecyl sulfate polyacrylamide gel electrophoresis.
1.7.4 Förster Resonance Energy Transfer.
1.7.5 Analytical ultracentrifugation.
1.7.6 Computational methods.
1.8 Thesis hypothesis and outline.
Chapter 2. Beta-branched residues adjacent to GG4 motifs promote the efficient association of glycophorin A transmembrane helices.
2.1 Introduction.
2.2 Results.
2.2.1 Mutations at Val80 and Val84 in the glycophorin A dimerization motif can modulate the strength of oligomerization in vivo.
2.2.2 Mutations at Val80 and Val84 do not alter secondary structure.
2.2.3 Lipid accessibility of the ridge residues correlates inversely with tightness of dimer.
2.3 Discussion.
2.3.1 β-branched residues are required to mediate efficient association of the GpA homodimer in the membrane bilayer.
2.3.2 Hydrophobic β-branched residues may be structurally optimized for transmembrane segment folding.
2.3.3 Modulating helix interactions.
2.3.4 Conclusion.
2.4 Materials and Methods.
2.4.1 TOXCAT Assay.
2.4.2 Peptide Synthesis.
2.4.3 Circular Dichroism.
2.4.4 Glycophorin A Helix Solvation Calculations.
Chapter 3. Distinctions between hydrophobic helices in globular proteins and transmembrane segments as factors in protein sorting.
3.1 Introduction.
3.2 Results.
3.2.1 Hydropathy of δ-helices.
3.2.2 Hydrophobic and charged/polar residue content in δ-helices.
3.2.3 Amino acid composition of δ-helices vs. transmembrane and other globular helices.
3.2.4 δ-helices are more buried within their native folds than other globular helices.
3.2.5 Folding of the δ-helix peptides in aqueous and membrane mimetic environments.
3.2.6 Competence of δ-helix segments for in vivo membrane insertion.
3.2.7 Charged residue distribution distinguishes δ-helix and transmembrane sequences.
3.3 Discussion.
3.3.1 Role in globular proteins.
3.3.2 Role of residue content.
3.3.3 Recognition of hydrophobic segments.
3.3.4 Conclusions.
3.4 Materials and Methods.
3.4.1 Database construction.
3.4.2 Amino acid composition analysis.
3.4.3 Solvent accessibility analysis.
3.4.4 Residue position analysis.
3.4.5 Peptide synthesis and purification.
3.4.6 Circular dichroism and fluorescence spectroscopy.
3.4.7 Plasmid construction.
3.4.8 MalE complementation test.
3.4.9 Chloramphenicol acetyltransferase enzyme-linked immunosorbent assay.
Chapter 4. Converting a Marginally Hydrophobic Soluble Protein into a Membrane Protein.
4.1 Introduction.
4.2 Results.
4.2.1 δ-Helix hydrophobicity.
4.2.2 Choice of δ-helices for membrane insertion studies.
4.2.3 Experimental quantification of membrane insertion of selected δ-helices.
4.2.4 Converting δ-helices into transmembrane segments.
4.2.5 Converting a soluble protein into a membrane protein.
4.3 Discussion.
4.4 Materials and Methods.
4.4.1 DNA engineering.
4.4.2 Membrane insertion assay and endoH treatment.
Chapter 5. Optimizing synthesis and expression of transmembrane peptides and proteins.
5.1 Introduction.
5.1.1 Fragmentation approach to study membrane protein folding.
5.2 Domain fragments of membrane proteins: Application to the Cystic Fibrosis Transmembrane Conductance Regulator.
5.2.1 The Cystic Fibrosis Transmembrane Conductance Regulator.
5.2.2 Disease causing mutations in CFTR.
5.2.3 Fragments of CFTR as a minimal tertiary model.
5.3 Triple strand construct from the Cystic Fibrosis Transmembrane Conductance Regulator transmembrane domain.
5.3.1 Construct information.
5.3.1.1 Cloning of CFTR TM2/3/4 fragment: Methodology.
5.3.2 Protein Expression of CFTR TM2/3/4 under successful CFTR TM3/4 conditions.
5.3.3 Heterologous expression of CFTR TM2/3/4.
5.3.3.1 E. coli strain.
5.3.3.2 Temperature of protein expression induction and concentration of IPTG.
5.3.3.3 Growth media.
5.3.4 Removing intracellular loop 1 between TM2 and TM3 to improve protein expression.
5.4 Characterization of expressed CFTR fragments.
5.4.1 Differential gel migration of TM2/3/4 mutants.
5.5 Discussion.
Chapter 6. Discussion.
6.1 Discussion.
6.1.1 Summary of contributions.
6.1.1.1 Beta-branched residues adjacent to GG4 motifs promote the efficient association of glycophorin A transmembrane helices.
6.1.1.2 Distinctions between hydrophobic helices in globular proteins and transmembrane segments as factors in protein sorting.
6.1.1.3 Converting a marginally hydrophobic globular protein into a membrane protein.
6.1.1.4 Optimizing synthesis and expression of transmembrane peptides and proteins.
6.2 Membrane mimetic micelles versus bilayers.
6.2.1 SDS as a membrane mimetic.
6.3 Significant content of hydrophobic, β-branched residues in transmembrane segments relative to δ-helices.
6.4 Membrane insertion propensity of transmembrane segments.
6.4.1 Transmembrane segments with low hydrophobicity.
6.4.2 Importance of charged residues to translocon-mediated membrane insertion of δ-helices.
6.4.3 Importance of secondary structure to translocon-mediated membrane insertion of δ-helices.
6.4.4 Biological role of δ-helices.
6.5 Helix-helix interactions.
6.5.1 Sequence specific dimerization motifs.
6.5.2 Prediction of helix-helix interactions.
6.6 Insights from high resolution structures of membrane proteins.
6.7 Future directions of membrane protein folding.
Chapter 7. Literature Cited.
Appendices.
List of Figures
Figure 1.1 The hydrophobic interior of the membrane bilayer compels membrane proteins to adopt specific tertiary and quaternary structures.
Figure 1.2 Example structures of α-helical bundle and β-barrel membrane proteins.
Figure 1.3 The structural diversity of α-helical membrane proteins.
Figure 1.4 The structures and functions of β-barrel membrane proteins are diverse.
Figure 1.5 The distribution of certain amino acids which comprise TM segments has a skewed distribution.
Figure 1.6 The lipid bilayer is highly heterogeneous.
Figure 1.7 X-ray crystal structure of the protein conducting channel SecY from Methanocaldococcus jannaschii (PDB ID 1RH5).
Figure 1.8 X-ray crystal structure of the Mammalian Shaker Kv1.2 potassium channel (PDB ID 2A79).
Figure 1.9 The two-stage folding model for membrane proteins.
Figure 1.10 van der Waals interactions in the close packing of TM α-helices.
Figure 1.11 Hydrogen-bond interactions in the folding of membrane proteins.
Figure 1.12 Cation-π interactions.
Figure 1.13 Acyl chain conformations can occur to accommodate a TM segment within the membrane bilayer.
Figure 1.14 The TOXCAT assay for detecting in vivo TM association within the E. coli inner membrane.
Figure 2.1 Substitutions for WT Val80 and/or Val84 with Ile, Leu, and Ala combinations, shown in alignment with the local GpA sequence.
Figure 2.2 TOXCAT assays of GpA and mutant sequences.
Figure 2.3 TOXCAT assay of GpA and mutant Thr sequences.
Figure 2.4 Circular dichroism spectra of synthetic peptides corresponding to the GpA TM sequence.
Figure 2.5 GpA dimer affinity is inversely correlated to interfacial lipid accessibility.
Figure 2.6 Models of the structure of the GG4 'ridge motif' involved in GpA dimerization, with the II, VV (WT) and LL mutants shown in order of increasing lipid accessibility and decreasing dimer strength.
Figure 3.1 Comparison of globular helix, δ-helix and TM helix amino acid composition.
Figure 3.2 Solvent accessibility of globular vs. δ-helices.
Figure 3.3 Ribbon diagrams of helical globular proteins containing δ-helix regions studied in this work.
Figure 3.4 Circular dichroism spectra of δ-helix peptides in various media.
Figure 3.5 δ-helix secondary structure compared with Chou-Fasman α-helix propensity.
Figure 3.6 TOXCAT assay of δ-helix peptides in the E. coli inner membrane.
Figure 3.7 Charged residue positioning in δ- vs. TM helices.
Figure 4.1 Evaluation of the membrane integration properties of selected δ-helices.
Figure 4.2 Conversion of δ-helices into TM segments.
Figure 4.3 Conversion of a soluble protein into a membrane protein.
Figure 5.1 Structure of CFTR.
Figure 5.2 Construct of the pET32a(TM2/3/4) designed for the expression of the Trx-CFTR TM2/3/4 fusion protein.
Figure 5.3 Western blot of expression trial of CFTR TM2/3/4 in conditions which were successful for CFTR TM3/4.
Figure 5.4 Flow chart showing the conditions used to optimize protein expression for CFTR TM2/3/4.
Figure 5.5 Expression of CFTR TM2/3/4 in various E. coli cell lines: BL21, BL21 (codon plus) and C43.
Figure 5.6 Western blot showing the effect of different induction temperatures on protein expression of Trx-TM2/3/4, at two different concentrations of IPTG: 0.1 mM and 1.0 mM.
Figure 5.7 Western blot showing the effect of different growth media on protein expression of Trx-TM2/3/4, at two different induction temperatures: 25°C and 37°C.
Figure 5.8 Construct of the pET32a(TM2/3/4)-Loop designed for the expression of the Trx-TM2/3/4Loop fusion protein.
Figure 5.9 Expression trials showing the effect of different growth media on protein expression of Trx-TM2/3/4-Loop, at two different induction temperatures: 25°C (Lanes 2-9) and 37°C (Lanes 10-17) in BL21 (DE3) E. coli cells.
Figure 5.10 Differential migration of WT CFTR-TM2/3/4 relative to mutants.
Figure 5.11 CFTR TM3/4 hairpin sequence and SDS-PAGE analysis.
List of Tables
Table 1.1 Marginally hydrophobic TM segments from CFTR, P-gp, AQP1 and KAT1 that do not insert into the membrane independently.
Table 2.1 Sequence of peptide mutant of the GpA transmembrane region
Table 3.1 Percent occurrence of hydrophobic, polar and charged residues per helix.
Table 3.2 Sequences of synthesized δ-helix peptides.
Table 3.3 Predicted secondary structure of δ-helix peptides.
Table 3.4 Tryptophan emission maxima of δ-helix peptides in various media.
Table 4.1 δ-helices and their predicted Liu-Deber and ΔGapp hydrophobicities.
Table 5.1 Primers used in the PCR amplification of the human CFTR cDNA.
Table 5.2 Predicted membrane spanning regions of CFTR TM2/3/4.
Table 6.1 Liu-Deber and ΔGapp hydrophobicity predictions for δ-helices and mutants.
Table 6.2 Comparison of segmental apolar α-helicity for the 1ECA and 2AAI δ-helices.
List of Appendices
Appendix 1.
Table A1.1 Database of globular helix sequences (n = 122).
Table A1.2 Database of δ-helix sequences (n = 51).
Table A1.3 Database of TM helix sequences (n = 212).
Table A1.4 Oligonucleotides used in this work.
Appendix 2.
List of Abbreviations
GPCRs - G-protein-coupled receptors
PDB - Protein Data Bank
TM - Transmembrane
P-gp - P-glycoprotein
CFTR - cystic fibrosis transmembrane conductance regulator
ER - endoplasmic reticulum
WT - wild type
NMR - Nuclear Magnetic Resonance
GpA - glycophorin A
MscL - mechanosensitive channel of large conductance
PE - phosphatidylethanolamine
SDS - sodium dodecyl sulfate
MCP - bacteriophage M13 coat protein
DPC - dodecylphosphocholine
PC - phosphatidylcholine
MBP - maltose binding protein
ELISA - enzyme-linked immunosorbent assay
CAT - chloramphenicol acetyltransferase
SDS-PAGE - Sodium dodecyl sulfate - polyacrylamide gel electrophoresis
FRET - Förster Resonance Energy Transfer
CHI - CNS searching of helix interactions
CNS - crystallography and NMR system software suite
CD - circular dichroism
SPFO - sodium perfluorooctanoate
Pα - Chou-Fasman α-helix secondary structure propensity
Pβ - Chou-Fasman β-strand structural propensity
∆Gapp - calculated apparent free energy of membrane insertion
RSA - relative solvent accessibility
endoH - endoglycosidase H
RMs - rough microsomes
PCR - polymerase chain reaction
NBD - nucleotide binding domains
R domain - regulatory domain
MSD - membrane spanning domain
Trx - Thioredoxin
IPTG - isopropyl β-D-1-thiogalactopyranoside
CKB - chicken liver 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase
Chapter 1. Introduction
1.1 Introduction to membrane proteins.
The boundaries of a cell, as well as the interior intracellular organelles, are defined by
biological membranes consisting of lipids and proteins. These lipid bilayers represent a barrier
to the passage of polar molecules into such organelles, and organize various biological processes
through compartmentalization. The proteins associated with the membrane bilayer catalyze
numerous chemical reactions and, much like other proteins, come in a tremendous variety of sizes
and shapes. Membrane proteins can be classified roughly by their mode of interaction with the
membrane as integral, peripheral or lipid-linked proteins. These different membrane
protein categories are responsible for a variety of key cellular functions including transport of
essential substrates across the membrane, signal transduction, acting as recognition elements linking the
cell to its surroundings, and maintenance of cellular morphology. To compound their
importance, it has been estimated that α-helical membrane proteins constitute approximately
27% of the total human proteome (Almen et al. 2009), and they attract great interest as targets for
pharmaceutical intervention, as currently the majority of drug targets are associated
with the cell membrane. For example, G-protein-coupled receptors (GPCRs) play multiple roles
in clinical medicine. This group of proteins has various functions, including mediating the actions
of hormones and neurotransmitters. As a group, GPCRs have been the most
successful drug targets, and GPCR agonists and antagonists are used in the treatment of
diseases of every major organ system, including the central nervous, cardiovascular, respiratory, metabolic
and urogenital systems (Insel et al. 2007).
A requirement for residing in the membrane bilayer, and a differentiating factor that
separates membrane proteins from their soluble counterparts, is that membrane proteins contain
inherent stretches of highly hydrophobic sequences required for their membrane insertion.
However, it is this intrinsic hydrophobicity that provides a challenge when studying membrane
proteins and has impeded the collection of structural data. Relatively few high-resolution
structures exist for membrane proteins compared to their soluble counterparts, whose structures
currently number in the thousands (Lundstrom 2004). The importance of gathering structural information
on membrane proteins is highlighted by the fact that they have been implicated in many diseases
such as cystic fibrosis, Alzheimer's disease, retinitis pigmentosa and hereditary hearing loss
(Partridge et al. 2002b). Despite this intense interest, structural information on membrane
proteins has generally remained elusive because their intrinsic hydrophobicity complicates
routine analysis.
Due to the issue of high hydrophobicity, it remains important to investigate membrane
protein structure via understanding the first principles of membrane protein folding and assembly
within the unique, low dielectric environment of the membrane bilayer (Fig. 1.1). Fortunately,
the protein folding problem for membrane-embedded species is simplified by the fact that
membrane proteins are limited in their tertiary and quaternary folding patterns due to constraints
presented by the hydrophobic core of the membrane bilayer (Popot and Engelman 1990; 2000).
Figure 1.1. The hydrophobic interior of the membrane bilayer compels membrane proteins to
adopt specific tertiary and quaternary structures. The manner in which membrane proteins fold
in this unique environment is limited by the lipid bilayer. Figure adapted from (Engelman 2005).
1.2 Properties of transmembrane segments.
1.2.1 Secondary structure of transmembrane segments.
At this time, only ~1-2% of the > 65,260 structures deposited in the Protein Data Bank
(PDB) are of membrane proteins (White 2009). This translates to just over 180 unique
structures, and this is inclusive of homologous protein structures from different species
(http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html). Significant progress in the
determination of high resolution membrane protein structures is being made, however, with the
number of solved structures increasing exponentially (White 2009).
Even though the high resolution structural elucidation of membrane proteins lags behind
that of soluble proteins, two major structural classes of integral membrane proteins have emerged
(Fig. 1.2). In one category, proteins embedded in eukaryotic cell membranes and the inner
membranes of bacteria and mitochondria generally adopt structures characterized by bundles of
α-helical TM segments (Fig. 1.2A). An alternative arrangement where β-strands span the
membrane and assemble into barrel-like structures is common to proteins embedded in the
bacterial and mitochondrial outer membranes (Fig. 1.2B) (Galdiero et al. 2007). The formation
of these two types of structures is favorable in the low dielectric environment of the membrane
bilayer, as both structures satisfy the hydrogen bonding requirements of the peptide backbone
and prevent the exposure of polar backbone groups to the lipid environment (White and Wimley
1999).
Figure 1.2. Example structures of α-helical bundle and β-barrel membrane proteins. A)
Bacteriorhodopsin is an example of an α-helical membrane protein (PDB ID 2NTU). The α-helix
structure satisfies the hydrogen-bonding requirements of the helix backbone: bonds are formed
between each backbone N-H group and the backbone C=O group of the amino
acid four residues earlier (i, i + 4). B) Porin is an example of a β-barrel membrane protein
(PDB ID 2POR). The β-barrel structure is formed from a large β-sheet that coils to form a
closed structure where the first strand is hydrogen bonded to the last. The individual strands are
typically arranged in an antiparallel fashion that also satisfies the hydrogen bonding requirements
of the backbone in a membrane environment. Figure generated with PyMOL.
1.2.2 Structural features of α-helical membrane proteins
The predominant class of membrane spanning proteins consists of α-helical structures,
which are found in the inner membrane of bacteria and in eukaryotic membranes.
This class of membrane proteins is composed of sequences which traverse the membrane via
α-helical structures, and can consist of a single-pass TM α-helix or of multiple transmembrane (TM)
α-helices that are connected by extramembranous loops of varying length. These α-helical
structures are ideal for spanning membrane bilayers, as the hydrogen bonding potential of the
backbone atoms is completely satisfied. The backbone hydrogen bonds are arranged such that
the peptide C = O bond at the i position points along the helix axis towards the N-H group at
the i + 4 position. This α-helical conformation additionally projects the side chains of non-polar
amino acids into the lipid bilayer, where they can favorably interact with the lipid acyl chains.
The complexity of this group of proteins is vast, ranging from single α-helix TM
proteins to multi-spanning α-helix structures containing complex intra- and extracellular
domains (Fig. 1.3).
Helical membrane proteins are also extremely diverse in their function. α-Helical
membrane proteins can include receptors (e.g. rhodopsin and the β2 adrenergic
G-protein-coupled receptor (Okada et al. 2004; Rasmussen et al. 2007)), membrane pores (e.g. aquaporin
(Murata et al. 2000)), ion channels (e.g. CorA, which is a divalent metal ion transporter, and
voltage dependent K+ channels (Jiang et al. 2003; Eshaghi et al. 2006)) and metabolite
transporters (e.g. AmtB and Rh50 (Khademi et al. 2004; Lupo et al. 2007)), as well as proteins
involved in accumulation and transduction of energy, and proteins responsible for cell adhesion.
Figure 1.3. The structural diversity of α-helical membrane proteins. α-Helical membrane
proteins vary in the number of membrane spanning segments, oligomeric state and the presence
of extramembranous regions. The individual chains of each structure are coloured differently.
A) GpA (PDB ID 1AFO). B) Phospholamban (PDB ID 2HYN). C) Sav1866 (PDB ID
2HYD). D) Aquaporin 5 (PDB ID 3D9S). Figures were generated using PyMol.
1.2.3 Structural features of β-barrel membrane proteins.
The second class of membrane proteins consists of β-barrel structures that are found in
the outer membranes of bacteria, mitochondria and chloroplasts (Wimley 2003). The unique
structure of a β-barrel protein is formed from a large β-sheet that closes on itself, such that
the first strand is hydrogen bonded to the last. In the β-sheet conformation, formation of a
cylindrical β-barrel, in which the β-strands are laterally hydrogen bonded in a circular pattern,
satisfies the folding constraints imposed by the membrane bilayer (Wimley 2003). In this
arrangement the inter-strand hydrogen bonds stabilize the core of the barrel, producing a
structure which is unlikely to unfold in membranes. Typically, β-barrels are comprised of an
even number of β-strands, with the individual strands connected by alternating tight turns
and longer loops, producing an asymmetric structure (Wimley 2003). Generally, β-barrels
feature these tight turns on the periplasmic side of the outer membrane with the flexible, longer
loops on the extracellular side of the membrane (Tamm et al. 2004). An additional feature of
β-barrel proteins is the alternating arrangement of hydrophobic and polar residues, with the
amino acid residues facing the interior of the barrel being mostly polar. The individual
β-strands are rich in Gly and in the aromatic residues Trp and Tyr, which are frequently found in two
rings that contact the lipid bilayer interfaces at both ends of the barrels (Tamm et al. 2004).
β-Strands in β-barrels are typically arranged in an antiparallel fashion, and the
currently available high resolution structures show that the number of strands included in
the structure can vary from 8 to 22 (Tamm et al. 2004). The average length of these individual
strands can vary from 9 to 11 residues, with such strands tilting 20-40° with respect to the
membrane (Wimley 2003; Tamm et al. 2004). The oligomeric state of β-barrel proteins can also
vary, with high resolution examples of monomeric, dimeric and trimeric species deposited in the
PDB (Fig. 1.4).
Along with their structural diversity, β-barrels are also functionally diverse. In bacteria, β-barrel
proteins can be classified into six families according to their function. These include (1) general porins such as
OmpC, OmpF, and PhoE; (2) passive transporters such as LamB, ScrY, and FadL; (3) active
transporters of siderophores and vitamin B12 such as FepA and BtuB, respectively; (4) enzymes
such as the phospholipase OmpLA or the protease OmpT; and (5) structural proteins such as
OmpA (Koebnik et al. 2000). This functional diversity is largely dictated by the variability of the
loop sequences, which carry most of the functional characteristics of the β-barrel protein (Koebnik et
al. 2000).
Figure 1.4. The structures and functions of β-barrel membrane proteins are diverse. Small, tight
loop structures are observed for β-barrel membrane proteins on the periplasmic side of the
membrane (bottom of diagram), while larger, more elaborate loops appear on the extracellular
side of the membrane (top of diagram). The antiparallel arrangement of the individual strands
can be seen, and the number of strands per structure varies between proteins. A) Monomeric
FepA (PDB ID 1FEP). B) Dimeric OMPLA (PDB ID 1QD6). C) Trimeric LamB (PDB ID
1AF6). The β-strands and loop regions are colored purple. α-Helical secondary structure
segments are colored in cyan. Figure generated using PyMol.
1.2.4 Amino acid composition of α-helical transmembrane segments.
In order to span the 30 Å thickness of the membrane bilayer, α-helical TM segments
must be between 20 and 30 residues long along their entire length. From a database of 160 TM α-helices,
the average length of the hydrophobic stretch which physically traverses the membrane was
shown to be 17.3 ± 3.1 (ranging from 6 to 25) residues or 26 Å in length. The average rise per
residue in an α-helical structure is 1.50 Å, with 3.7 residues per turn (Hildebrand et al.
2004). Transmembrane α-helices additionally protrude beyond the membrane bilayer, adding
residues to their total length. An average of 4.7 ± 3.5 residues, or 1.3 turns of the α-helix,
extends the helix into the polar exterior of the membrane. These cap sections of TM α-helices
consist of only 60% hydrophobic residues (Hildebrand et al. 2004).
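The length arithmetic in this paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the constants are the values quoted above (1.50 Å rise per residue, 3.7 residues per turn, 30 Å bilayer), and the function names are hypothetical.

```python
import math

# Values quoted in the text above (Hildebrand et al. 2004); treated here as inputs.
RISE_PER_RESIDUE = 1.50    # Å travelled along the helix axis per residue
RESIDUES_PER_TURN = 3.7    # residues per helical turn, as quoted
BILAYER_THICKNESS = 30.0   # Å, nominal bilayer thickness used in the text


def helix_length(n_residues):
    """Axial length (Å) of an ideal alpha-helix with n_residues residues."""
    return n_residues * RISE_PER_RESIDUE


def residues_to_span(thickness=BILAYER_THICKNESS):
    """Minimum number of residues an untilted helix needs to span `thickness` Å."""
    return math.ceil(thickness / RISE_PER_RESIDUE)


def helical_turns(n_residues):
    """Number of helical turns formed by n_residues residues."""
    return n_residues / RESIDUES_PER_TURN


print(round(helix_length(17.3)))      # mean hydrophobic stretch, in Å
print(residues_to_span())             # residues needed to cross the bilayer
print(round(helical_turns(4.7), 1))   # turns contributed by the cap residues
```

The three printed values correspond to the 26 Å hydrophobic stretch, the lower bound of the 20-30 residue range, and the 1.3 cap turns stated in the text.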
Statistical analysis of natural TM segments indicates that the mean number of residues
per TM helix is 26.3 (± 5.6), and that TM α-helices are often tilted with respect to the membrane
(Ulmschneider et al. 2005). Several groups have examined the distributions of amino acids
across TM segments in order to determine the composition of these unique membrane spanning
segments. In one such study, hydrophobic residues such as Leu, Ile, Val, Ala, Gly, and Phe were
found to comprise the majority of residues in TM α-helices, with Leu being the most
frequently occurring residue in TM segments (Ulmschneider et al. 2005). Taken together, these
six hydrophobic residues account for approximately 63% of the amino acids in the segments that
span the membrane bilayer, and also constitute half of all residues in total membrane protein
sequences. These compositional values were determined using 46 α-helical membrane proteins
containing 440 non-redundant TM helices. This study also found that the distribution of
certain amino acids within TM segments is saddle-like, i.e., TM segments
have a roughly hydrophobic core, and two peaks at the interfacial regions consisting of aromatic,
charged, and polar groups (Fig. 1.5). Such positional dependence indicates an importance for
these residues at specific locations within the TM α-helix. Apart
from the charged residues in TM segments, the distributions of all other residues were found to be symmetric
(Ulmschneider et al. 2005).
Figure 1.5. The amino acids that comprise TM segments are not uniformly
distributed. TM segments have a concentration of aromatic, charged, and polar residues at the
membrane interface region (the helix termini), while the core or central portion is roughly
hydrophobic.
As mentioned, the distribution of aromatic and charged residues is generally not uniform
across membrane spanning segments (Senes et al. 2000; Ulmschneider et al. 2005). The N- and
C-termini of TM segments are located in chemically heterogeneous environments compared to
the core of the membrane bilayer (Fig. 1.6) and are often enriched in Trp and Tyr residues.
These residues are ideal for this boundary position as they can accommodate the membrane-
water interfacial region of the lipid bilayer by favourably interacting with both the phospholipid
head groups and the aqueous environment (Killian and von Heijne 2000). In contrast, Phe
residues have no preferred positions in TM α-helices and can occur at both the core and
interfacial regions (Landolt-Marticorena et al. 1993; Killian and von Heijne 2000; Ulmschneider
and Sansom 2001). However, when Phe is found at the helix periphery, it can have strong
bilayer anchoring characteristics due to its aromaticity (Yuen et al. 2000). For example,
removing the Phe residues from the C-terminus of the bacteriophage M13 coat protein (MCP)
TM spanning segment can shift the entire helix out of the membrane by several angstroms
(Meijer et al. 2001).
Figure 1.6. The lipid bilayer is highly heterogeneous. The tails of the lipids comprising the
bilayer form the hydrophobic core of the membrane and create an environment unsuitable for
charged, polar and aromatic residues (purple). The polar head groups of the lipids (red) reside at
the lipid-water interface, an environment very different from the core and much more
hydrophilic. Charged, polar and aromatic residues preferentially reside in this region.
Positively charged residues were also found to have a positional preference with
regard to the membrane bilayer. The prevalence of residues such as Arg and Lys was found to
be higher on the “inside”, or cytosolic side, of the membrane compared to the “outside”, or
periplasmic side of the membrane (Ulmschneider et al. 2005). This “positive inside rule” is
thought to affect the directional insertion of membrane proteins into bilayers in vivo (von Heijne
1992). The occurrence of charged residues at the helix termini suggests favourable, stabilizing electrostatic
interactions with the polar head groups of the lipids and the aqueous external
environment.
Energetic considerations of the membrane bilayer suggest that charged and polar amino
acids should generally be excluded from TM segments; however, approximately 25% of all
residues found in TM α-helices are polar (Ulmschneider et al. 2005). As noted above, the
existence of charged residues within the membrane bilayer may help to direct the orientation of
membrane insertion, but their placement within the membrane spanning sequence may also play a
key role in either the structure or the function of membrane proteins. For example, side chains of
polar residues such as Ser and Thr have been noted to form hydrogen bonds to the carbonyl
oxygen of the preceding turn of the helix, which would energetically enable such side chains to
occur within the TM region (Ulmschneider et al. 2005). Side chain hydrogen bonds between
neighbouring TM α-helices have also been postulated to play essential roles in the formation
of the structures of aquaporin-1 and human nucleoside triphosphate diphosphohydrolase 3 (Buck
et al. 2007; Gaddie and Kirley 2009), and have been observed in such high resolution structures
as the large mechanosensitive channel and the glycerol facilitator (Chang et al. 1998; Fu et al.
2000). Additionally, polar residues in TM segments line the pores of channel membrane proteins
to facilitate function. Substrates such as ions pass through the membrane bilayer via these polar
residue lined pores, with examples including the voltage-gated potassium channel (Morais-
Cabral et al. 2001), and the rotor ring of the F1Fo ATP synthase (Meier et al. 2005). Other roles
for polar residues in TM segments include the binding of prosthetic groups, as in the
photosynthetic reaction centre (Nogi et al. 2000) and bacteriorhodopsin (Luecke et al. 1999).
Transmembrane segments are also enriched in specific residues at helix-helix contact
points. For example, Gly has a high frequency of occurrence in TM segments (9%), and it has
been reported that Gly residues occur frequently at helix-helix interfaces and crossing points
(Rath and Deber 2008). It has been suggested that these small residues may facilitate close
packing of TM helices through van der Waals interactions, and especially in motifs combining
Gly and β-branched side chains (MacKenzie et al. 1997).
1.2.5 Insertion of membrane proteins into the membrane bilayer is mediated by the
translocon.
α-Helical membrane proteins are inserted into the membrane bilayer in vivo through the
translocon. This channel has the unique property of being able to open in two directions:
perpendicular to the plane of the membrane to allow a polypeptide segment to pass through the
channel, and within the membrane to allow a hydrophobic TM segment of a membrane protein to
exit laterally into the lipid phase (Osborne et al. 2005). The translocon, termed the SecYEG or
Sec61 complex in bacteria and eukaryotes, respectively, acts as a switching station for discerning
between membrane-embedded and secreted proteins (Osborne et al. 2005) (Fig. 1.7).
Deciphering the code that the translocon uses for selecting TM segments for insertion is of
fundamental importance for understanding the folding of membrane proteins, and is thought to
follow an equilibrium process where there is a direct interaction between the TM segment and
the surrounding lipid. Results of studies by Hessa et al. have suggested that direct protein-lipid
interactions are essential for the recognition of TM helices by the translocon (Hessa et al. 2005a;
Hessa et al. 2005b; Hessa et al. 2007), and that α-helical TM segments partition into the
surrounding lipid bilayer based on the free energy of interaction between the TM segment and
the lipid. The specific details of this partitioning process have yet to be determined, but it is
thought that an “open” state of the translocon allows for this sampling process to occur.
Segmental hydrophobicity, as well as the positioning of charged residues along the TM helix, has
also been implicated in the identification of TM segments by the translocon, but interestingly,
marginally hydrophobic TM segments and α-helical segments from globular proteins with high
hydrophobicity have also been shown to insert into the membrane (Hessa et al. 2007;
Cunningham et al. 2009).
Figure 1.7. The X-ray crystal structure of the protein conducting channel SecY from
Methanocaldococcus jannaschii (PDB ID 1RH5). Helices 2B (red) and 7 (blue) form the lateral
gate through which TM segments pass from the pore into the lipid bilayer. All other TM helices
and connecting loop structures are shown in green. A) Top view of SecY. B) Side view.
Figure generated via PyMol.
1.3 Prediction of transmembrane segments from the primary amino acid sequence.
Prediction of the segments that span the membrane bilayer is an important first step
in the prediction of membrane protein structure. Identification of membrane spanning
segments is made easier because TM segments can span the bilayer in only a limited number of
ways: as either β-sheets or α-helices. In the case of α-helical
membrane proteins, once the locations of the TM and loop segments are accurately predicted, the rough
topology of the protein can be predicted from the amino acid sequence alone.
Prediction of TM segments from the primary amino acid sequence is based on two basic
observations of membrane spanning segments: the regions that traverse the membrane bilayer
are composed primarily of hydrophobic residues, and the regions of multi-spanning membrane
proteins facing the cytoplasm are generally enriched in positively-charged residues (von Heijne
1989). From here, one can possibly predict the regions of the membrane spanning segments, as
well as the orientation of the protein within the membrane bilayer.
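The second observation (enrichment of positively charged residues on the cytoplasmic side) can be turned into a simple orientation guess. The sketch below is a minimal illustration, not a published algorithm: it assumes the TM segments have already been located, so that the extramembranous segments can be supplied in sequence order and alternate between the two sides of the membrane. The function names and the example loop sequences are hypothetical.

```python
def count_positive(segment):
    """Count Lys (K) and Arg (R) residues in a one-letter-code sequence."""
    segment = segment.upper()
    return segment.count("K") + segment.count("R")


def predict_n_terminus(extramembranous_segments):
    """Guess on which side of the membrane the N-terminus lies.

    `extramembranous_segments` lists the N-terminal tail, each loop, and the
    C-terminal tail in sequence order. Consecutive entries are separated by
    one TM segment, so they alternate sides: even indices share the side of
    the N-terminus, odd indices lie on the opposite side.
    """
    same_side = sum(count_positive(s) for s in extramembranous_segments[0::2])
    other_side = sum(count_positive(s) for s in extramembranous_segments[1::2])
    if same_side > other_side:
        return "in"            # N-terminus predicted cytoplasmic
    if other_side > same_side:
        return "out"           # N-terminus predicted non-cytoplasmic
    return "undetermined"      # charge counts give no preference


# Hypothetical two-TM protein: N-tail, one loop, C-tail.
print(predict_n_terminus(["MKRS", "DGE", "RKKL"]))
```

Real topology predictors weigh many more features; this sketch only counts Lys and Arg and reports the side with the larger total as cytoplasmic.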
1.3.1 Tools for the prediction of transmembrane segments.
A straightforward method for identifying the hydrophobic character of a protein and
predicting membrane spanning segments was devised by Kyte and Doolittle (Kyte and
Doolittle 1982). In this method, a computer program progressively evaluates a protein sequence
along its length, and identifies potential TM segments based on a hydropathy scale. The scale
takes into consideration both the hydrophobicity and hydrophilicity of each of the twenty amino
acid side chains, and the program identifies large uninterrupted areas on the hydrophobic side of the scale.
The primary sequences of membrane proteins are scanned with a sliding window 19 amino acids
long, and potential TM segments of 20-30 amino acids are identified based on their segmental
hydrophobicity (Kyte and Doolittle 1982). While this basic process is useful in its simplicity, improvements and
extensions to it have increased the accuracy of TM
prediction programs. Including additional observations beyond the segmental hydrophobicity
requirement, such as the residence of aromatic residues at helix termini (Ulmschneider et al.
2005) and a limit to the number of charged residues in a membrane spanning segment (Deber et
al. 2001), has improved the prediction process. Additionally, based on this simple principle
more advanced prediction methods have been developed, such as PHDhtm, which uses probability
to define TM segments based on local sequence patterns (Rost et al. 1996). Additional
prediction methods have also been developed which consider global sequence patterns:
identification of repeats of TM segments and loop-TM segment patterns was implemented in both
the TMHMM (Sonnhammer et al. 1998) and HMMTOP (Tusnady and Simon 1998) TM
prediction programs.
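The sliding-window scan described for the Kyte and Doolittle method can be sketched as follows. This is a minimal illustration rather than the original program: the dictionary holds the Kyte-Doolittle hydropathy values, the 19-residue window is the one quoted above, and the cutoff of 1.6 is an assumption taken from the threshold commonly quoted for this window size. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle 1982), one-letter code.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(sequence, window=19):
    """Mean hydropathy of every full window along the sequence."""
    values = [KD_SCALE[residue] for residue in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_windows(sequence, window=19, threshold=1.6):
    """Return (start, end) index pairs (0-based, end exclusive) of windows
    whose mean hydropathy reaches the assumed threshold."""
    return [
        (i, i + window)
        for i, mean in enumerate(windowed_hydropathy(sequence, window))
        if mean >= threshold
    ]


# Hypothetical test sequence: a poly-Leu core flanked by charged residues.
print(candidate_windows("DDDDD" + "L" * 19 + "KKKKK"))
# KAT1 S4 segment from Table 1.1, which is rich in Arg residues.
print(candidate_windows("LGFRILSMLRLWRLRRVSS"))
```

Under these assumptions the poly-Leu core is flagged, whereas the Arg-rich KAT1 S4 segment yields no candidate window. This is consistent with the difficulty, discussed in Section 1.3.2, of detecting marginally hydrophobic TM segments by hydrophobicity alone.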
While the tools available to predict TM segments all vary slightly within a common
theme based on hydrophobicity, comparisons of TM prediction programs have suggested that no
particular program has an overall advantage over another (Cuthbertson et al. 2005). Some TM
prediction programs are suited for prediction of the number and position of TM segments
(TMHMM2, TMAP, HMMTOP2), while others are more suited for predicting helix start and
end points (TMHMM2 or SPLIT4) or weeding false positives from prediction outputs (TMHMM
or SOSUI) (Cuthbertson et al. 2005). To circumvent the uncertainty of which prediction
program to use when attempting to identify TM segments from the primary amino acid sequence,
a consensus of programs can be used to determine the most likely TM α-helix boundary points. The
accurate determination of TM α-helix boundaries is important, as differences in the
determined α-helix end points can ultimately affect experimental outcomes. For example,
in a comparison of peptides corresponding to TM2 from the myelin proteolipid protein, with α-helix
boundaries determined either experimentally or by agreement of 13 TM prediction programs, the latter approach led
to a peptide with two additional C-terminal residues, which affected the
oligomerization and helicity of the construct in apolar environments (Ng and Deber 2010).
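One simple way to form such a consensus is a per-residue majority vote across programs. The sketch below is an assumption made for illustration and is not the procedure used in the cited studies; the function name, the voting threshold, and the example predictions are all hypothetical.

```python
def consensus_segments(predictions, sequence_length, min_fraction=0.5):
    """Combine TM predictions from several programs by per-residue voting.

    `predictions` holds one list per program of (start, end) residue numbers
    (1-based, inclusive). A residue is called TM when more than
    `min_fraction` of the programs place it inside a TM segment.
    """
    votes = [0] * sequence_length
    for segments in predictions:
        for start, end in segments:
            for position in range(max(start, 1), min(end, sequence_length) + 1):
                votes[position - 1] += 1

    needed = min_fraction * len(predictions)
    consensus = []
    segment_start = None
    for position, count in enumerate(votes, start=1):
        if count > needed and segment_start is None:
            segment_start = position                       # segment opens here
        elif count <= needed and segment_start is not None:
            consensus.append((segment_start, position - 1))  # segment closes
            segment_start = None
    if segment_start is not None:
        consensus.append((segment_start, sequence_length))
    return consensus


# Hypothetical output of three programs for one 40-residue sequence.
predictions = [[(10, 30)], [(12, 32)], [(11, 29)]]
print(consensus_segments(predictions, sequence_length=40))
```

With these three hypothetical predictions, the majority vote keeps residues 11 to 30, illustrating how the consensus boundaries can differ by a residue or two from those of any single program.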
1.3.2 Unique marginally hydrophobic transmembrane segments: breaking the rules.
While the main identifier for TM segments from the primary amino acid sequences is
hydrophobicity (Hedin et al. 2010), unique examples exist in membrane proteins that are not
easily identifiable as TM segments via prediction methods. These “marginally hydrophobic”
TM segments, or false negatives, raise questions regarding the function and insertion pathways
for the proteins containing these segments. A small number of such marginally hydrophobic TM
segments have been identified in P-glycoprotein (P-gp), the cystic fibrosis transmembrane
conductance regulator (CFTR) (Sadlish and Skach 2004), aquaporin-1 (AQP1) (Pitonzo and
Skach 2006), and the plant Kv channel KAT1 that do not insert into the membrane by themselves
(Sato et al. 2003) (Table 1.1). This interesting feature suggests that some TM helices in
multispanning proteins may depend upon other parts of the same protein for efficient insertion
and folding (Hedin et al. 2010). In order to investigate the dependence of insertion of marginally
hydrophobic segments on the remainder of the protein, and to determine whether these
segments are in fact capable of isolated membrane insertion, individual TM segments from CFTR and
P-gp were tested for their ability to insert into the ER membrane (Enquist et al.
2009). When the individual TM segments from CFTR and P-gp were cloned into a construct
designed to measure membrane insertion via glycosylation of the membrane embedded
construct, it was shown that 10/12 TM segments from CFTR, and only 3/12 segments from P-gp
were capable of unaided membrane insertion (Enquist et al. 2009). The insertion propensities of
each TM segment followed predictions of their insertion based on the ΔGapp scale. These results
highlight that membrane insertion can vary widely even for related proteins such as CFTR and
P-gp, and that although they are membrane embedded in the native protein, not all TM segments are individually capable
of directing insertion.
Table 1.1. Marginally hydrophobic TM segments from CFTR, P-gp, AQP1 and KAT1 that
do not insert into the membrane independently.
Name          Position (a)  Sequence              ΔGapp (b)  Liu-Deber (c)
CFTR - TM6    331-349       IILRKIFTTISFCIVLRMAG  0.95       1.58
CFTR - TM8    902-920       SYAVIITSTSSYYVFYIYV   2.99       1.26
CFTR - TM12   1129-1147     VGIILTLAMNIMSTLQWAV   2.10       1.60
P-gp - TM1    55-73         TLAAIIHGAGLPLMMLVFGG  1.58       0.95
P-gp - TM2    115-133       AYYYSGIGAGVLVAAYIQV   1.05       0.83
P-gp - TM3    192-210       MFFQSMATFFTGFIVGFTRG  2.36       1.13
P-gp - TM4    214-232       GLTLVILAISPVLGLSAAVW  0.36       1.44
P-gp - TM6    328-346       IGQVLTVFFSVLIGAFSVG   1.62       1.38
P-gp - TM7    706-724       TEWPYFVVGVFCAIINGGL   1.55       1.10
P-gp - TM9    840-856       IANLGTGIIISFIYGWQ     2.01       1.09
P-gp - TM11   947-965       AMMYFSYAGCFRFGAYLVA   2.31       1.38
P-gp - TM12   973-991       DVLLVFSAVVFGAMAVGQV   1.09       1.40
AQP1 - TM2    52-68         SLAFGLSIATLAQSVG      3.51       0.52
KAT1 - S3     136-154       TWFAFDVCSTAPFQPLSLL   4.21       0.90
KAT1 - S4     162-180       LGFRILSMLRLWRLRRVSS   3.39       0.98
(a) Residues are numbered according to their position in the full length protein sequence.
(b) The ΔGapp was calculated with the online program: http://dgpred.cbr.su.se/
(c) Mean Liu-Deber segmental hydrophobicity of each segment.
Marginally hydrophobic TM segments were first identified by the “ΔG predictor”
developed in the von Heijne laboratory, which predicts membrane spanning segments based on
segmental hydrophobicity and the position of charged residues along the α-helix (Hessa et al.
2007). The apparent free energy of insertion (ΔGapp) of these segments into the endoplasmic reticulum (ER)
membrane was then experimentally measured, both with and without inclusion of their
immediate flanking loop segments. In agreement with computational predictions, the marginally
hydrophobic TM segments do not insert into the ER membrane by themselves, but the inclusion
of the flanking loops, both upstream and downstream of the TM segment, improves membrane
insertion for some of these segments. For those marginally hydrophobic TM segments whose
insertion did not improve with addition of flanking amino acids, insertion improves significantly in the presence of
neighboring TM segments. This result suggests that the insertion of TM
segments can depend on the local environment of the protein, and that TM segments in
multispanning proteins may depend on other parts of the same protein for efficient insertion and
folding (Hedin et al. 2010).
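To give a feel for what the ΔGapp values in Table 1.1 imply, they can be converted into an expected inserted fraction. The sketch below rests on explicit assumptions that are not stated in this section: that ΔGapp is expressed in kcal/mol, that insertion behaves as a two-state equilibrium with ΔGapp = -RT ln(f_inserted / f_non-inserted), and that the temperature is 298.15 K. The function name is hypothetical.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal/(mol*K)


def inserted_fraction(delta_g_app, temperature=298.15):
    """Fraction of molecules inserted for a given apparent free energy.

    Assumes a two-state equilibrium in which
    delta_g_app = -RT * ln(f_inserted / f_non_inserted), with delta_g_app in
    kcal/mol. Positive values therefore favour the non-inserted state.
    """
    rt = GAS_CONSTANT * temperature
    return 1.0 / (1.0 + math.exp(delta_g_app / rt))


# Lowest and highest ΔGapp values listed in Table 1.1.
for name, dg in [("P-gp - TM4", 0.36), ("KAT1 - S3", 4.21)]:
    print(f"{name}: {inserted_fraction(dg):.3f}")
```

Under these assumptions a ΔGapp of zero corresponds to 50% insertion, so every segment in Table 1.1, all of which have positive ΔGapp values, would be expected to insert less than half of the time in isolation.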
The idea that neighboring TM segments can mediate the membrane
insertion of marginally hydrophobic segments raises the question of how the translocon in
the ER membrane can handle multiple TM segments concurrently. The high resolution structure
of the SecA-bound translocon channel from Thermotoga maritima provided a clue as to how this may be
possible: in the open conformation, the translocon may be able to fit two or three TM segments
at once (Zimmer et al. 2008). Presumably, if a sufficiently hydrophobic two- or three-helix
bundle can form within the translocon channel or possibly in the gate-lipid bilayer interface, the
whole bundle may be able to partition into the bilayer in tandem despite the presence of
marginally hydrophobic TM segments (Hedin et al. 2010).
1.3.3 Charged residues in transmembrane segments: importance for membrane
integration.
The location and number of charged residues within a membrane spanning sequence is
thought to affect membrane insertion (Hessa et al. 2007), but unique examples exist of TM
segments that contain large numbers of charged residues. Charged residues in TM segments
reduce the segmental hydrophobicity, but can also be critically important for protein function and
for promoting the correct topology of the protein. The voltage-dependent K+ (Kv) channels are
an example of a family of membrane proteins with conserved charged residues central to their
function. Each Kv channel subunit contains six TM segments (S1-S6): segments S5-S6 contribute to the
membrane embedded ion-selective pore domain, and segments S1-S4 form the voltage sensor
domain (Sakata et al.). The voltage-sensor domain contains negatively
charged residues in the S2 and S3 TM α-helices, and four or more positively charged residues in
the S4 helix (Fig. 1.8). The role of these charged residues in voltage gating has been extensively
studied [see (MacKinnon 2003) for a review], but of interest to membrane protein folding is the
manner in which these highly charged segments are inserted into the membrane bilayer.
In vitro translation and translocation experiments of the individual helices in the plant
voltage-dependent K+ (Kv) channel KAT1 have identified specific interactions between charged
residues that contribute to KAT1 membrane topology (Sato et al. 2003). Interactions between
positively charged residues on S4 and negatively charged residues on S2 may be formed
transiently during membrane integration, and constitute posttranslational electrostatic
interactions between charged residues that are required to achieve correct topology (Sato et al.
2003). Measurements of the membrane insertion propensity of the α-helices and TM fragments
from the voltage sensor domain suggest that TM segments insert cooperatively, and that the
degree of cooperativity observed depends on the balance between electrostatic and hydrophobic
forces (Zhang et al. 2007). This example highlights the importance of understanding
membrane protein integration and folding into the membrane bilayer in the context of the
remainder of the protein, and shows that the identification of TM segments is a complicated process.
Figure 1.8. X-ray crystal structure of the mammalian Shaker Kv1.2 potassium channel (PDB ID
2A79). A) Tetrameric structure of Kv1.2, side view. B) Kv1.2 monomer with helix S4 colored
red. The S4 helix contains several positively charged residues and forms part of the voltage
sensor. Figure generated using PyMol.
1.4 Folding of α-helical membrane proteins.
Due to the constraints imposed on membrane proteins by the lipid bilayer, the folding of
membrane proteins is considered to be simplified compared to that of their water-soluble counterparts.
On this basis, a two-stage model for membrane protein folding was proposed by Popot and
Engelman (Popot and Engelman 1990; 2000). This model separates the membrane protein
folding process into two distinct steps: the first is the spontaneous insertion of TM segments into
the membrane bilayer, while the second stage is concerned with the lateral associations of TM
segments within the membrane bilayer (Fig. 1.9). More specifically, the first stage of Popot and
Engelman's membrane protein folding model is synonymous with the adoption of
α-helical conformations by TM stretches, an event which is primarily driven by the hydrophobic effect. The
second stage of membrane protein folding is the association of these established TM α-helices,
which is primarily driven by folding motifs defined on the α-helical surfaces. Because
established α-helices are already within an apolar environment, the hydrophobic effect does not
play a specific role in the second stage of membrane protein folding. Instead, van der Waals
packing and interhelical hydrogen bonds are the driving forces behind the stage-two folding
process.
Figure 1.9. The two-stage folding model for membrane proteins. A) The first stage of
membrane protein folding involves the insertion of α-helices into the membrane bilayer, an
event which is driven by hydrophobicity. B) The second stage of membrane protein folding
involves the lateral association of these helices within the bilayer. This association is driven by
motifs on the α-helix surface. Figure adapted from (Popot and Engelman 1990).
1.4.1 Stage 1: Insertion of transmembrane segments into the membrane bilayer.
The first stage of membrane protein folding involves the insertion of hydrophobic
segments into the lipid bilayer, which is primarily a result of the hydrophobic effect. The net
hydrophobicity of a potential TM segment is important for determining the likelihood of
membrane insertion. Ideally, an amino acid stretch will be of sufficient hydrophobicity and
length in order to span the bilayer and take on a TM conformation. If a segment possesses an
overall hydrophobicity above a certain threshold value, it will spontaneously insert into a
membrane environment and fold into an α-helix structure. This threshold value has been
determined both in vitro (Liu et al. 1996) and in vivo (Nilsson et al. 2003), and is roughly
equivalent to a stretch of poly-Ala residues.
Statistical analysis has shown that the average length of a TM segment is approximately
26 residues, but helix length can vary greatly. A survey of high resolution membrane protein
structures shows that TM α-helices can vary in length from 14 to 39 residues, with twenty
residues considered optimal to span the membrane bilayer (Bowie 1997). Helices shorter than
this optimal length can be accommodated by inducing disorder in the packing of the acyl chains in
the lipid bilayer, which decreases bilayer thickness (Killian and Nyholm 2006). On the other
hand, α-helices that are longer than the optimal length may be accommodated by acyl chain
ordering, which results in an increase in membrane bilayer thickness, or by tilting of the helix within the
membrane (Killian and Nyholm 2006).
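The tilting option can be illustrated geometrically. The sketch below is a deliberately idealized model introduced here for illustration: it assumes a rigid helix with a rise of 1.5 Å per residue, a fixed 30 Å bilayer that does not adapt, and that all excess length is absorbed by tilting away from the bilayer normal. The function name is hypothetical.

```python
import math


def tilt_angle_deg(n_residues, bilayer_thickness=30.0, rise_per_residue=1.5):
    """Tilt (degrees from the bilayer normal) at which a rigid helix of
    n_residues exactly spans the bilayer.

    A helix of length L tilted by angle t spans L * cos(t) along the normal,
    so the required tilt is acos(thickness / L). Helices that are not longer
    than the bilayer thickness need no tilt in this simplified model.
    """
    length = n_residues * rise_per_residue
    if length <= bilayer_thickness:
        return 0.0
    return math.degrees(math.acos(bilayer_thickness / length))


# Optimal, average, and longest helix lengths mentioned in the text.
for n in (20, 26, 39):
    print(n, round(tilt_angle_deg(n), 1))
```

In this simplified picture a 20-residue helix needs no tilt, whereas longer helices require progressively larger tilt angles. In real membranes, tilting and changes in bilayer thickness can act together, so these angles are upper-bound estimates.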
1.4.2 Stage 2: Formation of tertiary contacts between transmembrane α-helices.
Helix-helix associations drive the second stage of membrane protein folding, and involve
the formation of tertiary and/or quaternary structure. This can involve contacts between helices
within multi-spanning membrane proteins, or between helices of separate chains to form
higher-order oligomers. Correct helix contacts are required to form the final protein structure, and
changes to protein folding through mutation have implications in disease.
The packing or folding of TM helices within the membrane bilayer is a highly specific
process, and forces that contribute to correct folding can involve protein-protein, protein-lipid
and lipid-lipid interactions. The protein-protein interactions, or contacts observed between
helices, can be mediated by either van der Waals interactions or electrostatic interactions
between residues on separate helices. Favorable van der Waals interactions have been most
accurately described as “knobs-into-holes” packing where there is a specific fit of the helices
involved, and the surface included in the contact area is maximized. Helix associations arising
from van der Waals contacts are derived from a series of contacting amino acids or an
“interactive face”, rather than just a single residue. Alternatively, helix contacts arising from
electrostatic interactions can arise from a single amino acid. For example, strongly polar
residues such as Asp, Asn, Glu and Gln have been implicated in driving the association of TM
α-helices within the membrane bilayer through the formation of interhelical hydrogen bonds. A
single polar residue in the middle of a TM α-helix is able to drive association in order to satisfy
the hydrogen bonding potential of the residue (Zhou et al. 2001; Johnson et al. 2004; Rath and
Deber 2008). Lipids can also contribute to the folding and association of TM α-helices. The
entropy of lipids interacting with protein structures rather than with other lipids can promote or
discourage helix contacts. These forces and their contributions to the second stage of membrane
protein folding will be described in more detail in the upcoming sections. Specific examples of
packing motifs will also exemplify the contribution of these forces to protein folding within a
membrane environment.
1.4.3 Fragmentation approach to studying membrane proteins.
The two stages of membrane protein folding describe the adoption of α-helical structure
in a membrane environment, and the lateral association of these helices to form higher order
structures. Innate to this model is that the established α-helices possess all structural information
within their amino acid sequence required to form higher-order structures. This fact has been
exemplified by numerous studies showing fragments of membrane proteins combining in
membrane environments to form functional proteins (Ridge et al. 1995; Hankamer et al. 1999;
Martin et al. 1999; Liu et al. 2005). For example, the folding and assembly of rhodopsin was
investigated by co-expression of protein fragments, which were designed to correspond to
proteolytic cleavage sites within the loop regions. Individual expression of the protein fragments
yielded no functional results, while co-expression of the separate fragments covering the full
length protein sequence formed constructs with spectral properties similar to the wild type (WT)
protein (Ridge et al. 1995).
As suggested through the two-stage folding model, small segments can be considered to
define structural characteristics of membrane proteins. More importantly, individual TM
segments can largely be viewed as independent recognition elements of overall folding domains
(Johnson et al. 2004; Rath and Deber 2008). Both heterologous protein expression and the
fragmentation approach have been used successfully to investigate membrane proteins.
1.4.4 Efforts to generate membrane protein high resolution structures.
The first high-resolution structure of a membrane protein was of a bacterial
photosynthetic reaction centre (Deisenhofer et al. 1984). Until this initial structure was solved,
the analysis of membrane proteins by X-ray crystallography was thought to be an unattainable
challenge. Since then, however, more than 280 unique membrane protein structures have
been deciphered. The importance of gathering high-resolution structures of membrane proteins
is highlighted by the fact that greater than 40% of pharmaceutical drugs are targeted at
membrane proteins, and each high-resolution structure is highly anticipated by both academia
and the pharmaceutical industry.
The elucidation of available high-resolution structures of membrane proteins has revealed
interesting insights into the molecular functions of various TM processes, but to begin these
kinds of structural studies, researchers require a large supply of purified protein, solubilized in a
detergent in which it retains function; this is an objective that is by no means trivial. Primarily,
three different methods have been used successfully for the gathering of high-resolution
structures of membrane proteins: X-ray crystallography, electron crystallography, and nuclear
magnetic resonance (NMR) spectroscopy. However, all three of these methods combined have
yielded a rather modest set of structures that have been solved at the atomic level. For detailed
mechanistic insights into the function of membrane proteins, a resolution of at least 2.0 Å is
required, particularly to observe small conformational changes (Rosenbusch et al. 2001). An
example of a membrane protein solved to high atomic resolution via X-ray crystallography is
bacteriorhodopsin, a membrane protein commonly regarded as the simplest proton pump (Lanyi
and Schobert 2002). The use of X-ray crystallography to solve high-resolution structures of
membrane proteins is facilitated by proteins of structural similarity to bacteriorhodopsin; in fact,
proteins that have small extramembranous regions are ideal for the formation of the crystals
required by this methodology. While reporting at much lower atomic resolutions, electron
microscopy is also useful in the study of membrane protein function – particularly in the
structural determination of large membrane protein complexes. The structure of the yeast ATP
synthase complex has been determined via electron microscopy which has aided in the functional
studies of the synthesis of ATP by this complex. At 24 Å resolution, the 3-D model of the
protein is highly useful as a framework for studying the functional mechanism of the protein
(Lau et al. 2008). Nuclear magnetic resonance can also be used in the structural determination of
membrane proteins. Both solution NMR and solid-state NMR can be used to determine
structure, and this technique is especially useful to determine associations of individual residues
or internuclear distances (Rosenbusch et al. 2001).
1.5 Forces driving membrane protein folding.
As the hydrophobic effect is generally considered the major driving force for generating
compact structures in soluble proteins, the question of how membrane proteins form their folded
compact structure in the apolar environment of the lipid bilayer merits interest (White and
Wimley 1999). The manner in which membrane proteins fold speaks to the stability of these
segments within the membrane, where the tight association of TM helices results from a
favorable energetic environment (Liu et al. 2003). Stably folded membrane proteins reside in a
free energy minimum determined by the net energetics of the peptide chains' interactions
with water, each other, the lipid bilayer, and cofactors (White and Wimley 1999).
In order for two stable α-helices to associate within the membrane, the interactions that
permit the close packing must overcome the energetics that favor helix separation. In the case
of TM segments that are established across the membrane bilayer and associate and form
oligomers, peptide-lipid contacts are lost in favor of peptide-peptide contacts. The balance
between monomer and oligomer is determined by a balance of entropy and enthalpy. In the
following sections, forces directing the packing of TM segments will be discussed in further
detail.
1.5.1 van der Waals interactions.
One of the main forces involved in TM helix associations within the membrane bilayer is
van der Waals interactions. These noncovalent interactions arise within the membrane bilayer
from permanent or induced dipoles, or a fluctuating electron cloud (Fig. 1.10A). These
complementary dipoles can induce packing of TM segments within a membrane bilayer by
providing a weak electrostatic attraction. In membrane proteins, the TM helices associate such
that there is geometric complementarity between the two α-helices. This type of
“knobs-into-holes” packing provides a good fit and allows the close approach of the helix backbone (Fig.
1.10B). As this close packing can occur over the length of the TM helix, the cumulative van der
Waals forces can greatly contribute to the folding of membrane proteins (Rath et al. 2009b).
Figure 1.10. van der Waals interactions in the close packing of TM α-helices. A) Non-covalent
interactions between neutral molecules form between permanent or induced dipoles. At
any instant, nonpolar molecules have small randomly oriented dipole moments resulting from the
rapid fluctuating motion of their electrons that produces a weak electrostatic interaction when the
dipoles are in close proximity. B) The van der Waals packing interface of GpA (PDB ID
1AFO). “Knobs-into-holes” packing (red line) along the helix length stabilizes the interaction
between the two α-helices.
1.5.2 Electrostatic interactions.
Another force involved in the association of TM α-helices within the membrane bilayer
is electrostatic interactions. Unlike the induced-dipole attractions that underlie van der
Waals forces, an electrostatic interaction drives oligomerization through the formation of a
hydrogen bond between two polar side chains, or between a polar side chain
and the α-helix backbone. A hydrogen bond occurs when two electronegative atoms interact
with the same hydrogen atom, and serves to cancel out the effect of a polar group in the
non-polar environment of the lipid bilayer. If strongly polar residues such as Asp, Asn, Glu, and Gln
occur within TM α-helices, the electrostatic interaction between two such polar side chains can
be sufficient to drive the association of the α-helices (Fig. 1.11). In the low dielectric
environment of the membrane, electrostatic interactions between polar side chains can be quite
strong. Hydrogen bonds can also form between the Cα-H group of an amino acid and the
carbonyl oxygen on the α-helix backbone. This has been observed in the glycophorin A (GpA)
homodimer as evidenced through selective labeling of the Gly residues in the oligomerization
interface, which promotes the close approach of the dimeric helices (Arbely and Arkin 2004).
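The effect of the low dielectric environment on electrostatic interactions can be illustrated with Coulomb's law. The sketch below is a deliberately crude model, not a description of real hydrogen bonds. It treats two side chains as point charges in a uniform medium and assumes textbook dielectric constants of about 2 for the hydrocarbon core of the membrane and about 80 for bulk water. The function name is hypothetical.

```python
# Coulomb constant in kcal·Å/(mol·e^2), for charges in units of the elementary
# charge and distances in Å.
COULOMB_CONSTANT = 332.06


def coulomb_energy(q1, q2, distance, dielectric):
    """Interaction energy (kcal/mol) of two point charges in a uniform dielectric."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)


# Two opposite unit charges 3 Å apart, in the membrane core versus in water
# (dielectric values are assumptions for illustration).
energy_membrane = coulomb_energy(+1, -1, distance=3.0, dielectric=2.0)
energy_water = coulomb_energy(+1, -1, distance=3.0, dielectric=80.0)
```

Because the energy scales as the inverse of the dielectric constant, the same pair of charges interacts about forty times more strongly in this model membrane core than in water, which is the qualitative point made in the text.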
Figure 1.11. Hydrogen-bond interactions in the folding of membrane proteins. A) A hydrogen
bond is the attractive interaction of a hydrogen atom with an electronegative atom, such as
nitrogen or oxygen. Two electronegative atoms interact with the same hydrogen to form the
hydrogen-bond. B) A single strongly polar amino acid can drive the association of two
α-helices within the membrane bilayer through formation of an intermolecular hydrogen-bond.
1.5.3 Cation-π interactions.
A third biologically relevant electrostatic force that can drive the oligomerization of
membrane proteins is the non-covalent interaction between an aromatic residue and a positively
charged amino acid on an opposing helix. These cation-π interactions can occur between such
aromatic residues as Trp, Phe and Tyr and positively charged residues such as Arg, Lys and His
(Dougherty 2007). The π-electron density in the aromatic ring provides a surface of
negative electrostatic potential that can bind to a wide range of cations through a predominantly
electrostatic interaction (Fig. 1.12). Evaluating the frequency of cation-π interactions in protein
structures can be challenging, but estimates indicate that all proteins of significant size have at
least one cation-π interaction, and Arg is more frequently found than Lys as the positively
charged partner (Dougherty 2007). As an example, the open conformation of the M2 coat
protein from the influenza A virus is structurally stabilized by a cation-π interaction (Haupt et al.
2005). Cation-π interactions have also been implicated in functional roles within protein
structures. For example, the Escherichia coli Kdp-ATPase nucleotide binding activity is
mediated via a cation-π interaction.
Figure 1.12. Cation-π interactions. A) The electrostatic basis for the cation-π interaction. The
6 bond dipoles that create the overall electrostatic potential for the bond are shown. Figure
adapted from (Dougherty 2007). B) The cation-π interaction between the face of a benzene ring
and a sodium cation. The overall negative charge (blue) on the face of the benzene ring interacts
electrostatically with the positive charge of the sodium ion (red).
1.5.4 Helix-lipid interactions.
The lipid bilayer itself can have dramatic effects on how TM helices interact, as the lipid
chains can contribute to helix-helix associations, either through TM-lipid or lipid-lipid contacts.
Depending on the composition of the bilayer in question, hydrophobic mismatch can occur to
accommodate TM helices of greater or lesser length than the bilayer thickness. In order to
accommodate a mismatched length, the lipid, protein, or both can undergo some sort of structural
rearrangement. The acyl chains can adopt a more or less extended conformation, which
thereby adjusts the membrane thickness. If the helix length is longer
than bilayer thickness, the helix can tilt within the membrane plane. The amount of helix tilting,
or tilt angle, will be determined by the amount of surface area available for contact with other
membrane embedded helices (Fig. 1.13). For example, the GpA homodimer has a tilt angle of
forty degrees with respect to the membrane bilayer in order to accommodate the helix length and
dimeric nature of the protein (MacKenzie et al. 1997). Alternatively, short -helices may be
28
able to adapt to the larger membrane span by extending their side chains into the lipid head
group - water interface. The snorkelling action accommodates charged residues within these
short helices. Aggregation within the membrane is an additional method to accommodate short
helices. The actual length of the helix and its interactions within the bilayer can have dramatic
effects on how the helices associate and interact with one another; however, the magnitude of the
contribution that lipids make to protein folding remains poorly understood.
Figure 1.13. Altered acyl chain conformations can occur to accommodate a TM segment within
the membrane bilayer. A) Acyl chains can extend or order to contain a TM segment longer than
the bilayer width. B) Acyl chains can compress or disorder to accommodate a TM segment
shorter than the bilayer. C) Helices can tilt with respect to the membrane bilayer, and the tilt
angle determines the amount of lipid-TM segment contact.
Interactions between a membrane protein and the surrounding lipid molecules in the
membrane are important in determining the structure and function of the protein. The exact
contribution of lipids to the final membrane protein fold is, however, difficult to determine as
X-ray crystallographic structures of integral membrane proteins generally include few lipid
molecules. An example of a protein for which lipids are an essential structural component is the
mechanosensitive channel of large conductance (MscL) from Mycobacterium tuberculosis. A
cluster of Lys and Arg residues in the protein sequence was found to associate with anionic
phospholipids with high affinity (Powl et al. 2005), and altering the lipid identity has
consequences on MscL conformation (Elmore and Dougherty 2003).
1.5.5 Lipids as membrane protein chaperones.
The identification of molecular chaperones and their role in protein folding in vivo
indicates that acquiring the final folded structure of a protein is a complicated process.
Molecular chaperones are involved in the conformational maturation of soluble and membrane
proteins where they direct folding, prevent misfolding (Frydman and Hartl 1996), and even
unfold protein structures (Martin and Hartl 1997). In addition to protein chaperones, lipids may
also act as non-protein chaperones in the folding of membrane proteins within the membrane
bilayer (Bogdanov and Dowhan 1999). For example, full function of the E. coli membrane
transporter LacY is dependent on exposure to phosphatidylethanolamine (PE) during the in vivo
assembly process. The absence of PE during the in vivo folding pathway of LacY results in the
misfolding of the protein, without actually affecting insertion into the membrane (Bogdanov and
Dowhan 1995). In the folding of LacY, PE was identified as a molecular chaperone because the
functional conformation of LacY was retained after partial folding by SDS, complete removal of
PE and refolding in the absence of PE. These studies indicate that once the final PE-directed
structure of LacY is established in vivo, the presence of PE is no longer required to maintain the
proper conformation of the protein (Ellis 1997). Both the properties of the ionic headgroup and
the organization of the lipid tail, or hydrophobic domain, of PE mimic critical components of
protein chaperones, and endow lipids with chaperone function (Bogdanov et al. 1999). The
molecular chaperone effect of lipids is thought to be highly specific for lipid chemical
composition, beyond providing a non-specific detergent-like phase.
1.6 Membrane protein oligomerization motifs.
The placement of specific residues along the α-helical axis can constitute a specific TM
folding or oligomerization motif, which can determine whether two helices will interact or retain
a monomeric status. Specific amino acid patterns in numerous membrane proteins have been
identified that, when folded into an α-helix across a membrane bilayer, will form an interaction
face that is sufficient to drive association of α-helices. Mutagenesis studies have been
exceptionally useful in determining the residues critical to these specific residue motifs, where
by changing the residue identity, oligomerization can be modulated. Many studies identifying
critical residues to membrane protein folding have been performed on single pass membrane
systems (Rath et al. 2009b), and additionally interactions between multiple helices that drive the
formation of the final membrane protein structure have been elucidated (Buck et al. 2007).
Several helix oligomerization motifs have been identified to date, which can be
categorized into three main classes: polar residue side chain-side chain interactions that occur
between neighboring TM segments (Gratkowski et al. 2001; Zhou et al. 2001); left handed (or
GASleft) folding motifs consisting of heptad repeats of small or large residues (Lear et al. 2004;
Poulsen et al. 2009); and perhaps the most widely characterized promoter of oligomerization,
the GG4 (or GASright) motif, which is most often defined by an i, i + 4 separation of “small”
residues (Gly, Ala, and Ser) (Deber et al. 1993; MacKenzie et al. 1997; Melnyk et al. 2002;
Sulistijo et al. 2003; Sulistijo and MacKenzie 2006; Bocharov et al. 2007; Plotkowski et al.
2007; Roth et al. 2008). In the following sections, these TM folding motifs will be described in
greater detail.
1.6.1 Polar residue motifs.
Experimental studies involving simple TM helices have shown that a single polar residue,
such as Asn, Asp, Glu, Gln, or His is sufficient to mediate homooligomerization of α-helices in
membrane bilayers and membrane mimetic systems (Gratkowski et al. 2001; Zhou et al. 2001).
The role of these specific polar residues in the oligomerization of TM segments was first
demonstrated via model peptides where a polar residue placed in the center of a model TM
segment was shown to drive helix association via the formation of an interhelical hydrogen
bond. As an example, a polar Asn residue introduced into a model TM α-helix
consisting of a poly-Leu background resulted in the formation of stable dimers as measured by
both in vivo and in vitro assays (Zhou et al. 2001). This helix interaction is considered strong as
it is resistant to denaturation by the detergent sodium dodecyl sulfate (SDS) (Choma et al. 2000;
Zhou et al. 2000; Lear et al. 2001; Zhou et al. 2001).
Unlike strongly polar amino acids, weakly polar amino acids such as Ser and Thr most
often occur in sequence patterns to drive oligomerization. High-affinity homooligomerizing
sequences have been identified via in vivo oligomerization screens, where the two most
frequently occurring motifs are SxxSSxxT and SxxxSSxxT (Dawson et al. 2002). Mutations of
any of the Ser or Thr residues in these motifs to non-polar residues abolished oligomerization,
indicating that the interaction between these positions is specific and requires an extended motif
of Ser and Thr hydroxyl groups. This study found that single Ser or Thr groups do not appear to
promote helix association on their own, but can drive strong and specific association through a
cooperative network of interhelical hydrogen bonds (Dawson et al. 2002).
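Locating the two Ser/Thr patterns named above in a sequence can be sketched with regular expressions, where each "x" is taken to mean any residue. This is a minimal illustration, not the screening procedure of the cited study. The function name and the test sequences are hypothetical.

```python
import re

# Each 'x' in the published motif notation is treated as "any residue" ('.').
# Lookaheads are used so that overlapping occurrences are all reported.
SER_THR_MOTIFS = {
    "SxxSSxxT": re.compile(r"(?=(S..SS..T))"),
    "SxxxSSxxT": re.compile(r"(?=(S...SS..T))"),
}


def find_ser_thr_motifs(sequence):
    """Return (motif_name, zero_based_start, matched_text) for every occurrence."""
    hits = []
    for name, pattern in SER_THR_MOTIFS.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start(), match.group(1)))
    return hits
```

For a made-up sequence such as "LLSLLSSLLTLL", the function reports a single SxxSSxxT occurrence starting at index 2. A match only indicates that the spacing is present and says nothing about whether the segment actually oligomerizes.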
1.6.2 Left-handed transmembrane folding motifs: heptad repeats of small or large
residues.
Examination of protein crystal structures found that almost all TM helices have interhelical
hydrogen bonds (Adamian and Liang 2002); however, examples exist of TM helices that
oligomerize via sequence specific motifs that are clearly less dependent on hydrogen bonds. One
example is the left-handed folding motif, which has a characteristic seven-residue spacing of
small or large residues that mediates TM packing. A synthesized model TM peptide containing
Gly at the a and d positions was found to associate via analytical ultracentrifugation, where the
spacing of the Gly residue allowed for the close approach of the α-helix backbone and efficient
van der Waals packing of the construct (Lear et al. 2004). An example of an actual TM segment
that uses this motif to associate in vivo is found in TM4 of the Halobacterium salinarum small
multidrug resistance protein, Hsmr, where residues within the segment
85-VAGVVGLALIVAGVVVLNVAS-105 contribute to TM oligomerization. Oligomerization of Hsmr through
this motif is critical to protein function, as the protein must form at least dimers to retain its drug
effluxing properties (Poulsen et al. 2009).
Heptad repeats can also be formed from large, hydrophobic residues such as Leu and Val.
One of the most widely recognized examples of a TM segment which uses this mode of
oligomerization is the TM domain of phospholamban. This SDS-resistant pentamer binds to,
and inhibits, the Ca2+-ATPase. Mutagenesis and modeling experiments have identified that the
phospholamban TM segment oligomerizes via its LxxIxxxLxxIxxxL motif, which in turn forms
a Leu/Ile zipper-like coiled-coil structure (Oxenoid and Chou 2005).
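The seven-residue periodicity behind these motifs can be checked with a short sketch that labels positions with the conventional heptad register (a to g) and lists where the a and d positions fall. The function names are hypothetical, and starting the register at position a is an assumption made for illustration.

```python
HEPTAD_REGISTER = "abcdefg"


def heptad_positions(length, letters="ad", start_register="a"):
    """Zero-based indices whose heptad label is in `letters`, for a segment of
    the given length whose first residue carries `start_register`."""
    offset = HEPTAD_REGISTER.index(start_register)
    return [i for i in range(length)
            if HEPTAD_REGISTER[(i + offset) % 7] in letters]


def fixed_positions(pattern):
    """Indices of the specified (non-'x') residues in a motif such as LxxIxxxL."""
    return [i for i, symbol in enumerate(pattern) if symbol != "x"]
```

With the register starting at a, the a and d positions of a 15-residue stretch fall at indices 0, 3, 7, 10 and 14, which are exactly the positions of the Leu and Ile residues in the LxxIxxxLxxIxxxL pattern quoted above.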
1.6.3 Right-handed packing motifs: small-xxx-small motifs.
One of the best characterized TM folding motifs is the right-handed, small-xxx-small or
GG4 motif. This TM oligomerization motif consists of a small residue (Gly, Ala or Ser)
separated by any three residues at intervening positions, with Gly being the most prevalent. The
importance of this motif was initially implied by amino acid compositional searches of TM
domains, where an enrichment of small residues was observed to be of statistical importance
(Senes et al. 2000). This particular oligomerization motif places the small residues on the same
side or face of the α-helix, at an i, i + 4 spacing, and this produces a concave helical surface that
is optimized for protein folding (MacKenzie et al. 1997). Transmembrane dimers containing this
motif are further stabilized by hydrogen bonding between the Cα-H and backbone carbonyl. The
exact role of the Gly residues involved in this oligomerization motif is, however, not entirely
clear. Some investigators have found that the Gly residues are necessary, but not sufficient for
homodimerization (Schneider and Engelman 2004); while others have reported that the Gly
residues are neither necessary nor sufficient to mediate oligomerization events within the
membrane (Doura and Fleming 2004).
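The i, i + 4 spacing of small residues described above can be searched for with a short sketch. It takes Gly, Ala and Ser as the "small" set, as in the text, and uses a lookahead so that overlapping pairs are reported. The function name and the example peptide are hypothetical. As the preceding discussion makes clear, finding the pattern does not by itself establish that a segment dimerizes.

```python
import re

# Two small residues (Gly, Ala or Ser) separated by any three residues.
# The lookahead allows overlapping pairs such as G-xxx-G-xxx-A to be found.
SMALL_XXX_SMALL = re.compile(r"(?=([GAS])...([GAS]))")


def small_xxx_small_pairs(sequence):
    """Return (i, residue_i, i + 4, residue_i_plus_4) for every small-xxx-small pair,
    using zero-based indices."""
    return [
        (match.start(), match.group(1), match.start() + 4, match.group(2))
        for match in SMALL_XXX_SMALL.finditer(sequence)
    ]
```

For a made-up poly-Leu peptide "LLGLLLGLLLALL", the function returns the pairs (2, G, 6, G) and (6, G, 10, A).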
One of the best characterized TM domains which homodimerizes through a GG4 motif is
the membrane spanning region of GpA. This single-pass membrane protein is an obligate dimer
optimized for high affinity association that is mediated by van der Waals interactions between
helices within the membrane bilayer (MacKenzie et al. 1997). Another example of a membrane
protein which uses the GG4 motif to oligomerize is the MCP; however, this homodimerizing unit
has relatively moderate stability compared to GpA (Melnyk et al. 2002). The exact basis for the
difference in α-helix affinity between GpA and MCP remains unclear, as replacement of the
residues involved in the GpA oligomerization interface with MCP interfacial residues still
retained helix-helix interactions, albeit at a reduced strength (Melnyk et al. 2004).
1.6.3.1 The human erythrocyte protein Glycophorin A.
The TM domain of GpA serves as an excellent model system by which to study
determinants of membrane protein folding; this single pass membrane protein has been used
extensively in the study of membrane protein folding as viewed through the two-stage folding
model. A relatively small protein, GpA contains 131 amino acids, 23 of which span the
membrane bilayer (residues 73-95) (MacKenzie et al. 1997). The functional role of GpA is to
determine the blood group antigenic specificity, where the amino terminal domain of the protein
determines the MNS blood group type (Marchesi and Andrews 1971), and it is the major
sialoglycoprotein of the human red blood cell membrane. The amino terminal domain of
GpA contains a large oligosaccharide component, and the carboxy terminal domain, which
contains the membrane spanning region, associates within the membrane. The TM domain
alone is responsible for the oligomerization of GpA. Glycophorin A is also implicated in the
biosynthesis and plasma membrane trafficking of another abundant erythrocyte membrane
protein, Band 3 or the Anion Exchanger 1 (Williamson and Toye 2008).
Structural characterization to determine the mode of GpA TM mediated oligomerization
was conducted via mutagenesis of a GpA chimeric protein; the TM segment of GpA was fused to
the C-terminus of staphylococcal nuclease via a flexible linker and the oligomeric status of GpA
mutant chimeras was established via migration on SDS-PAGE (Lemmon et al. 1992a; Lemmon
et al. 1992b). This technique led to the identification of residues within the TM sequence that
significantly modulated dimerization upon mutation, and to the determination of seven residues
specifically involved in the GpA dimerization interface: L75I76xxG79V80xxG83V84xxT87. From this
work it was determined that the individual GpA TM segments associated within SDS as a
parallel, right-handed supercoil (Lemmon et al. 1992a; Lemmon et al. 1992b). This finding was
supported by the eventual high resolution structural determination of GpA within
dodecylphosphocholine (DPC) micelles, where it was shown that the interactions of individual
GpA TM helices are stabilized by van der Waals interactions (MacKenzie et al. 1997).
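The seven-residue interface pattern reported above can be mapped onto residue numbers with a short sketch. The GpA TM sequence used here (residues 73-95) is not given in this section and is an assumption based on the commonly reported sequence. The function name and the offsets list are likewise introduced only for illustration.

```python
import re

# Assumed GpA TM sequence for residues 73-95 (not stated in the text above).
GPA_TM = "ITLIIFGVMAGVIGTILLISYGI"
FIRST_RESIDUE = 73

# L75 I76 xx G79 V80 xx G83 V84 xx T87, with 'x' treated as any residue.
INTERFACE = re.compile(r"LI..GV..GV..T")
# Offsets of the seven specified residues within a match.
INTERFACE_OFFSETS = [0, 1, 4, 5, 8, 9, 12]


def interface_residues(sequence=GPA_TM, first=FIRST_RESIDUE):
    """Return labels such as 'L75' for the interface residues, or [] when the
    pattern is absent from the sequence."""
    match = INTERFACE.search(sequence)
    if match is None:
        return []
    return [
        f"{sequence[match.start() + k]}{first + match.start() + k}"
        for k in INTERFACE_OFFSETS
    ]
```

With the assumed wild-type sequence the function recovers L75, I76, G79, V80, G83, V84 and T87. A sequence in which one of these positions is substituted, for example Gly83 replaced by Ala, no longer matches the pattern and returns an empty list.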
1.7 Methods to study the oligomerization of transmembrane segments.
As described above, obtaining large amounts of full-length membrane proteins, or even
smaller fragments from heterologous expression, can be difficult. Our laboratory has worked
towards optimized synthesis and expression of TM segments to facilitate the study of packing
interactions between TM helices and to obtain structural information regarding these segments.
Solid phase synthesis of hydrophobic TM peptides tagged with solubilizing residues as well as
heterologous expression in E. coli of small membrane protein fragments have facilitated these
structural studies for a number of proteins including CFTR (Therien et al. 2001; Choi et al.
2004), GpA (Melnyk et al. 2004), MCP (Wang and Deber 2000; Melnyk et al. 2004), and the
gamma subunit of the Na, K-ATPase or sodium pump (Therien and Deber 2002). We have also
explored computational methods in order to identify potential TM α-helix interaction sites.
This has been exceptionally useful in situations where structural data is available for comparison
(Cunningham et al. 2010).
With the availability of tools to produce large quantities of membrane proteins - or
fragments thereof - several systems have been developed in order to study the close packing of
TM segments within the membrane bilayer, for both in vivo and in vitro systems. The following
sections describe in further detail such assays, and their utility in understanding membrane
protein folding.
1.7.1 Membrane mimetic systems: micelles versus bilayers.
Membrane bilayers and detergent micelles can both be used to study the secondary and
tertiary structures of membrane proteins. As a necessary step in solubilizing hydrophobic
membrane proteins, the choice of detergent or lipid can have different structural consequences,
so the choice should not be made in an arbitrary fashion.
1.7.1.1 Membrane bilayers.
Membrane bilayers are advantageous in the study of membrane proteins as they offer a
native environment in which to investigate secondary and tertiary structure. The lipid structures
forming biological membranes are made of two layers of lipid molecules, which are mostly
composed of phospholipids which have a hydrophilic head group and two hydrophobic tails.
When these lipids are exposed to water, they aggregate to form a two-layered sheet, with their
tails pointing towards the centre of the sheet. In mammalian membrane bilayers, the most
common phospholipid is phosphatidylcholine (PC), which accounts for almost half of the lipids
present. Phosphatidylcholine has a zwitterionic head group, as there is a negative charge on the
phosphate group and a positive charge on the amine. An alternative to biological membranes is
liposomes. Liposomes have lipids organized as a bilayer resembling biological membranes but
they also allow comparisons among membranes enriched in particular lipid components of
choice. For example, if a particular lipid is required as a cofactor for an enzymatic process, it
can usually be identified in a liposome assay.
1.7.1.2 Detergent micelles.
Detergent micelles are advantageous in studying membrane proteins as they are a simple
and convenient system. Micelles are aggregates of detergent molecules that have hydrophilic
head regions and hydrophobic tail regions. A typical micelle in aqueous solution forms this
aggregate with the head region in contact with the surrounding solvent and the tail region
sequestered into the micelle centre. In this manner, detergent micelles can provide an
environment which is similar to the membrane bilayer; in both cases the centre region of the
micelle and bilayer has a low dielectric environment which is suitable for depositing
hydrophobic TM segments. Transmembrane segments can be solvated efficiently by detergent
micelles where they adopt α-helical structures in order to satisfy the hydrogen bonding potential
of the backbone. Structural characterization of membrane proteins is often carried out in
detergent micelles, as this system is amenable to high resolution structural determination
techniques such as crystallization and solution NMR (le Maire et al. 2000; Prive 2007; Carpenter
et al. 2008).
1.7.2 The TOXCAT assay.
To measure the extent of folding of single pass TM segments within a natural membrane
environment, the TOXCAT assay was developed. This assay can measure the
homooligomerization of any TM segment, and is based on the TM mediated association of the
dimerization-dependent ToxR transcriptional activation domain. In the TOXCAT assay, the TM
segment of interest is cloned into the pccKAN expression vector, which places the TM segment
between an N-terminal fusion of the DNA binding domain of ToxR – a transcriptional activator
that is only functional in the dimeric form (Kolmar et al. 1995; Ottemann and Mekalanos 1995) –
and a C-terminal fusion of maltose binding protein (MBP) – which ensures the proper orientation
of the construct within the membrane bilayer (Russ and Engelman 1999). Homooligomerization
via the TM domain leads to dimerization of the ToxR domain, resulting in transcriptional
activation of a reporter gene, and the concentration of the reporter is determined via an enzyme-
linked immunosorbent assay (ELISA). In the case of the TOXCAT assay, the reporter is
chloramphenicol acetyltransferase (CAT). In vivo, this bacterial enzyme detoxifies the antibiotic
chloramphenicol by covalently attaching an acetyl group from acetyl-CoA to chloramphenicol,
which prevents chloramphenicol from binding to the ribosome and inhibiting protein
synthesis (Lovett 1996). The TOXCAT assay is useful in studying the oligomerization of
membrane proteins, as the amount of CAT expressed in vivo corresponds to the level of affinity
of the dimerizing construct being tested (Fig. 1.14) (Russ and Engelman 1999). The TOXCAT
assay is additionally advantageous in studying membrane protein association as it is sensitive to
differences in affinity among mutant constructs, and can be used to select and identify strongly
dimerizing constructs (Russ and Engelman 1999). Lastly, this assay has been used to select
oligomeric TM segments from a library of randomized sequences (Russ and Engelman 1999).
Figure 1.14. The TOXCAT assay for detecting in vivo TM association within the E. coli inner
membrane. Homooligomerization mediated through the TM domains drives the association of
the ToxR' domains, which drives expression of the reporter gene CAT. The periplasmically located
MBP anchors the chimera in the inner membrane, controlling orientation.
The TOXCAT assay has proven extremely useful in identifying the residues responsible for
TM segment dimerization and oligomerization within the E. coli inner membrane. For example,
TOXCAT was successfully used to identify the residues involved
in the homooligomerization of the DEP-1 protein tyrosine phosphatase, which is mediated by
specific Gly residues within the sequence. Non-conservative replacement of these small residues
with large residues disrupted dimerization (Chin et al. 2005).
1.7.3 Sodium dodecyl sulfate polyacrylamide gel electrophoresis.
Sodium dodecyl sulfate - polyacrylamide gel electrophoresis (SDS-PAGE) is one of the
most commonly used tools in studying the association of TM segments (Rath et al. 2009b).
Transmembrane segments that retain oligomerizing capabilities in this detergent can migrate at
apparent molecular weights ≥ two times their actual molecular weight and are examples of
strongly associating segments. Based on migration through the gel and comparisons to
molecular weight standards, SDS-PAGE can provide an estimate of the oligomeric state of the
TM segment; however, the migration does depend heavily on the amount of detergent bound to
the segment, indicating the importance of hydrophobicity and conformation of the sample of
interest (Rath et al. 2009a).
SDS-PAGE has found utility in identifying high affinity dimers as well as the residues
involved in oligomerizing species (Lemmon et al. 1992a; Melnyk et al. 2001). In the case of the
dimeric single pass membrane protein GpA, the seven residue dimerization motif
(L75IxxGVxxGVxxT87) centered around the Gly GG4 motif was pin-pointed as the mediator of
dimerization, with replacements at these residues significantly modulating oligomerization.
Fusion constructs consisting of the GpA TM segment and the C-terminus of staphylococcal
nuclease were expressed and purified, and changes to the oligomeric status with mutation were
determined via SDS-PAGE. Amino acids with aliphatic side chains defined much of the
interface, indicating that a precise packing interaction between the helices provides the energy
for association (Lemmon et al. 1992b).
1.7.4 Förster Resonance Energy Transfer.
Förster Resonance Energy Transfer (FRET) is another useful in vitro technique to study
the association of TM segments in membrane mimetic systems. In order to observe FRET
between two separately labelled donor and acceptor TM peptides, the labels must be within a
given radius. This separation distance is a property of the FRET pairs, and is referred to as the
Förster radius. A commonly used FRET pair in the study of TM peptides is dansyl chloride as
the donor chromophore and dabsyl chloride as the acceptor. The emission spectrum of dansyl
chloride overlaps the absorbance spectrum of dabsyl chloride, and association of the peptides
results in energy transfer and quenching of the total dansyl chloride fluorescence when the two
labels are within 40 Å of each other (Rath et al. 2009b). FRET finds additional utility in that this
technique can be extended to intact bilayers, as well as allowing for ease in identification of
hetero-oligomers in TM interactive species.
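The distance dependence described above can be illustrated with the standard Förster relation, in which transfer efficiency falls off with the sixth power of the donor-acceptor separation and equals 50% at the Förster radius. The following is a minimal sketch only; the function name and the numerical distances are illustrative assumptions and are not values measured in this work.

```python
def fret_efficiency(distance, forster_radius):
    """Standard Förster relation: E = 1 / (1 + (r / R0)^6).

    distance and forster_radius must share the same length unit (e.g. Å).
    """
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


# Hypothetical Förster radius, chosen for illustration only.
R0 = 40.0
for r in (20.0, 40.0, 80.0):
    print(r, round(fret_efficiency(r, R0), 3))
```

Because of the sixth-power dependence, efficiency changes steeply around the Förster radius, which is why FRET discriminates well between associated and dissociated labelled peptides.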
The asymmetric oligomeric interface of the anthrax toxin receptor-1 ANTXR1 TM
domain was identified via FRET studies in SDS micelles. Mutations were made to residues
corresponding to those predicted to be involved in association via computer modeling, and
differences in dimerization were observed via FRET (Go et al. 2006). Additionally, FRET was
successfully used to determine that different detergents can affect the energetics of peptide
association. A series of detergents including a range of alkyl chain lengths, combined with ionic,
zwitterionic, and nonionic head groups, showed wide variations in GpA dimer stability, indicating that
detergents might be selected that drive association rather than dissociation of peptide dimers
(Fisher et al. 2003).
1.7.5 Analytical ultracentrifugation.
Analytical ultracentrifugation finds its utility in the study of TM association as the
oligomeric status of TM constructs can be estimated based on the mass of TM peptides
solubilized in a membrane mimetic environment. Ultracentrifugation can also be applied to
estimate the energetics of peptide association via titrating peptide concentration to separate
monomeric and oligomeric TM peptide species (Rath et al. 2009b).
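The concentration dependence exploited in such titrations can be sketched with a simple monomer-dimer equilibrium. This is an illustrative simplification, assuming an ideal two-state equilibrium with dissociation constant Kd = [M]^2/[D]; it is not the fitting procedure used in sedimentation equilibrium analysis, and the function name and concentrations are hypothetical.

```python
import math


def dimer_mass_fraction(total_conc, kd):
    """Fraction of peptide (by monomer units) present as dimer.

    Assumes an ideal monomer-dimer equilibrium:
        Kd = [M]^2 / [D],   C_total = [M] + 2[D]
    Solving the quadratic for [M] gives
        [M] = Kd * (sqrt(1 + 8 * C_total / Kd) - 1) / 4
    """
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_conc


# Hypothetical values (same concentration units for both arguments).
KD = 1.0
for c in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(c, round(dimer_mass_fraction(c, KD), 3))
```

Under this model, half of the peptide is dimeric when the total concentration equals Kd, so titrating across that range separates predominantly monomeric from predominantly dimeric populations.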
Homodimerization of the mouse erythropoietin receptor, which is thought to be an initial
regulatory step in erythrocyte formation, was determined via sedimentation equilibrium
analytical ultracentrifugation in detergent. The sequence dependence of the peptide interaction
was also highlighted by comparison to the human version: the human erythropoietin receptor
differs by three residues but has a significantly lower interaction propensity, which is only slightly
more favorable than that expected for non-preferential binding (Ebie and Fleming 2007).
1.7.6 Computational methods.
Another tool for predicting oligomerization interfaces of TM α-helices, including modeling
of TM dimers, is the CNS searching of helix interactions (CHI) software suite of the
crystallography and NMR system (CNS) (Adams et al. 1995; Adams et al. 1996; Brunger et al.
1998). In this method, two identical energy minimized α-helices are generated from the primary
sequence, and potential interactions are identified by global computational searching. This
method of interfacial recognition can be conducted in either parallel or anti-parallel orientations
(Poulsen et al. 2009), and the lowest energy structures are considered potential candidates for
dimers. This relatively crude method of helix interaction prediction only takes into account
optimized structural interactions, and validation of the models produced from the predictive
output is still required.
Algorithms have also been developed to identify interacting TM helices in membrane
proteins that incorporate features beyond those used to develop contact maps for soluble
proteins. One such predictive method developed by Fuchs et al. is a neural-network based
approach that integrates protein sequence, correlated or co-evolving residues, protein topology,
residue position within the TM segment, and orientation toward the lipophilic environment to
generate contact maps for membrane proteins (Fuchs et al. 2009). Predicting protein structure,
however, is a challenging task. When tested on a dataset of 62 non-homologous membrane
proteins with known structure, a prediction accuracy of 26% was achieved. While this value
initially seems low, prediction of TM helical contacts via this neural network performs with
equal accuracy to contact predictors designed for soluble proteins (Fuchs et al. 2009). The
prediction of protein structures and α-helical contacts is still an evolving process, but the
identification of interactive helices is a useful exercise that may lead to the assignment of a
membrane protein sequence to a family of related folds and hence provide clues related to
function, especially in a field where available structures are limited.
1.8 Thesis hypothesis and outline.
As discussed above, helical membrane protein folding is thought to follow a two-stage
process. The first stage is defined as α-helix insertion into the membrane, followed by the
second stage involving lateral association of helices within the membrane to form higher order
structures. In order to fully understand the overall process of membrane protein folding, each of
these stages must be further defined. Unfortunately, studies of this process can be hampered by
the challenge of working with intact membrane proteins, whose size and high
hydrophobicity, combined with demanding expression protocols, have often precluded
biophysical studies. In order to overcome this challenge and facilitate the study of membrane
proteins, fragments or peptides corresponding to individual TM segments are often utilized in
place of the full length structure. This largely empirical process has been tackled extensively in
our laboratory, with the goal of producing constructs that can address both stage-one and stage-
two of membrane protein folding.
Chapter 2 of this thesis focuses on the second stage of membrane protein folding and our
understanding of sequence determinants leading to α-helix association within the membrane
bilayer, utilizing GpA as the model system. If specific sequence determinants are responsible for
directing the association of TM segments within the membrane bilayer, then modulating the
sequence would act to promote or obstruct dimerization (Cunningham et al. 2010). Chapter 3
will focus on stage one of membrane protein folding, specifically the factors which distinguish
hydrophobic segments of globular proteins from bona fide TM segments, with a goal
towards understanding how the recognition of actual TM segments occurs. By investigating a
group of δ-helices related to TM segments through their overall hydrophobicity, we can
determine differences between the groups based on composition and structure (Cunningham et
al. 2009). Expanding on the recognition process of TM segments by the cellular machinery
leading to the membrane integration of amino acid stretches, Chapter 4 will focus on identifying
determinants required for the successful membrane integration of polar, α-helical segments such
as δ-helices into the membrane bilayer, and specifically identifying determinants of membrane
integration in vivo. Finally, the production of higher order membrane protein constructs in
quantities sufficient for study must be addressed before investigations of the folding patterns
within membrane environments can be initiated. In Chapter 5 of this thesis, routes to optimal
expression and synthesis of membrane protein fragments will be described, along with efforts
toward preparation and characterization of these hydrophobic segments (Cunningham and Deber
2007).
Chapter 2. Beta-branched residues adjacent to GG4 motifs promote
the efficient association of glycophorin A transmembrane helices.
The contents of this chapter have been published, in part, by Cunningham, F., Poulsen, B.E., Ip,
W., and Deber, C.M., Biopolymers (2010).
Author Contributions: FC and WI designed research. WI and BEP assisted in mutagenesis of
TOXCAT constructs and peptide synthesis and purification of peptides, respectively. FC
analyzed data. FC and CMD wrote the paper.
2.1 Introduction.
A major driving force for correct folding and assembly of membrane proteins derives
from interactions between segments created by specific residue motifs on the helix
surface. One such motif is the GG4 (or 'small-xxx-small') motif, that is defined by an i, i + 4
separation of 'small' residues (Gly, Ala, and Ser) (Popot and Engelman 1990; Rath et al. 2009b)
with large hydrophobic residues (Ile, Val or Leu) statistically noted to reside at adjacent
positions (Senes et al. 2000).
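The definition of the small-xxx-small motif can be made concrete with a short sequence scan. The sketch below is illustrative only: the function name is hypothetical, the TM sequence is the WT GpA peptide sequence listed in Table 2.1 (without Lys tags), and the numbering offset (first residue = 73) is an assumption inferred from the positions of Val80 and Val84 in that sequence.

```python
import re

# WT GpA TM sequence as listed in Table 2.1 (Lys tags omitted).
GPA_TM = "ITLIIFGVMAGVIGTILLISYGI"
# Assumed residue number of the first position, inferred from Val80/Val84.
FIRST_RESIDUE = 73


def find_small_xxx_small(seq, first_residue=1, small="GAS"):
    """Return (residue number, 5-residue window) for every i, i + 4 pair
    of small residues. A lookahead is used so overlapping motifs are found."""
    pattern = re.compile(r"(?=([%s].{3}[%s]))" % (small, small))
    return [(m.start() + first_residue, m.group(1)) for m in pattern.finditer(seq)]


motifs = find_small_xxx_small(GPA_TM, FIRST_RESIDUE)
for position, window in motifs:
    print(position, window)
```

With the assumed numbering, the scan reports the Gly79/Gly83 pair together with an overlapping Ala82/Gly86 pair, consistent with the i, i + 4 definition given above.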
In an analysis of residues in TM segments, a pattern emerged in which small
residues (Gly, Ala and Ser) separated by three intervening residues along the amino acid
sequence occur
frequently. The most over-represented of the pairs of small residues was Gly at the i and i + 4
positions, creating a GG4 motif; this GG4 motif occurs 32% more often than a random
expectation (Senes et al. 2000). See reference (Rath et al. 2009b) for a list of membrane proteins
containing this small residue pattern that is involved in TM segment oligomerization. A GG4
motif is the key feature directing dimerization of GpA, which serves as an excellent model system
to further our understanding of the forces driving α-helix association within a membrane. The
interaction motif of the GpA homodimer has been defined as L75IxxGVxxGVxxT87, where
favorable van der Waals interactions between monomers facilitate dimerization (Lemmon et al.
1992b). The structure of the GpA homodimer determined by solution NMR indicates that the
side chains of Gly79 and Gly83 form a 'groove' that packs against a 'ridge' formed by the
sequentially-adjacent hydrophobic β-branched residues Val80 and Val84, with additional folding
contributions by the surrounding residues (MacKenzie et al. 1997).
While the role of small residues in the GpA dimerization motif has been extensively
studied, the importance of the large adjacent Val residues has not been correspondingly explored.
Compositional analysis has indicated a statistical enrichment in large, hydrophobic β-branched
residues such as Ile and Val in TM segments versus simple hydrophobic segments from globular
proteins, suggesting that these residues perform a structural role beyond
maintaining levels of hydrophobicity (Cunningham et al. 2009). Here we sought to determine
the nature and relative extent of the specific contribution(s) to TM-TM packing of the Val80 and
Val84 residues through systematic substitution(s) with Leu, Ile, and Ala residues. Results of the
analysis suggest that replacement of Val with even single Ile, Leu, or Ala residues can have
profound effects on GpA helix-helix interactions.
2.2 Results.
2.2.1 Mutations at Val80 and Val84 in the glycophorin A dimerization motif can modulate
the strength of oligomerization in vivo.
To investigate the contribution of the large residues in the homodimerization motif of
GpA TM α-helices, a systematic mutagenesis of Val80 and Val84 was performed. A total of 16
constructs (the WT and 15 single and double mutants) were generated in the GpA TM domain of
the TOXCAT vector via
site-directed mutagenesis at these sites with all possible combinations of Val, Ile, Leu, and Ala
(Fig. 2.1). Dimerization affinities were determined via the TOXCAT assay, where the extent of
dimerization within the E. coli inner membrane is indicated by the amount of CAT reporter
produced (Fig. 2.2A,B) (Russ and Engelman 1999). The amount of CAT expressed within
the cells ([CAT]) is a measurement of construct dimerization and can be compared relative to the
WT GpA construct after normalization for construct expression within NT326 cells (Fig.
2.2A,B) (Russ and Engelman 1999). The correct membrane insertion of each construct upon
expression was confirmed via growth of NT326 cells on M9-maltose plates (data not shown). E.
coli cells expressing the WT and mutant TOXCAT chimeras were streaked onto agar plates with
maltose (0.4%) as the only carbon source. Cells expressing constructs which span the membrane
bilayer and
target MBP to the periplasm are capable of utilizing maltose as a carbon source for growth. This
maltose complementation assay verifies the periplasmic location of MBP and correct orientation
and insertion of these constructs into the E. coli inner membrane. Transformant growth was
evaluated for all constructs after incubation for 2 days at 37°C.
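The set of constructs described above can be enumerated programmatically as all pairwise combinations of Val, Ile, Leu, and Ala at positions 80 and 84. The sketch below is illustrative: the function name is hypothetical, the WT sequence is taken from Table 2.1 (Lys tags omitted), and the numbering offset (first residue = 73) is an assumption inferred from the positions of Val80 and Val84.

```python
from itertools import product

# WT GpA TM sequence from Table 2.1 (Lys tags omitted).
WT_TM = "ITLIIFGVMAGVIGTILLISYGI"
# Assumed residue number of the first position, inferred from Val80/Val84.
FIRST_RESIDUE = 73


def build_constructs(wt=WT_TM, first=FIRST_RESIDUE, sites=(80, 84), residues="VILA"):
    """Map two-letter construct names (e.g. 'IV' = Ile80/Val84) to sequences."""
    constructs = {}
    for pair in product(residues, repeat=len(sites)):
        seq = list(wt)
        for site, amino_acid in zip(sites, pair):
            seq[site - first] = amino_acid
        constructs["".join(pair)] = "".join(seq)
    return constructs


constructs = build_constructs()
print(len(constructs))
for name in ("VV", "IV", "LL"):
    print(name, constructs[name])
```

The two-letter names follow the notation used in the figures, where the first letter denotes the residue at position 80 and the second the residue at position 84.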
Figure 2.1. Substitutions for WT Val80 and/or Val84 with Ile, Leu, and Ala combinations, shown
in alignment with the local GpA sequence (Lemmon et al. 1992b). The gray boxes represent the
mutational positions in the GpA sequence. Figure adapted from (Cunningham et al. 2010).
We observed that substitution of the WT Val residues by various combinations of Ile and
Leu residues in the GpA TM segment significantly modulates the dimerization affinity of the
construct. Immediately apparent is the improvement in dimerization upon mutation of a WT Val
to Ile (IV and/or VI, p < 0.01 and p < 0.05, respectively, Fig. 2.2A) as measured within the E.
coli inner membrane. A corresponding result was observed for the double Ile mutant (II), where
dimerization was significantly elevated above the WT VV GpA construct (p < 0.05, Fig. 2.2A).
From these results, Ile appears to increase dimerization of GpA in the presence of a second
β-branched residue (either Ile or Val), an effect which is not tied to position within the sequence; the
dimerization of Ile-containing GpA mutants (IV, VI and II) is statistically higher than the WT
(VV), although these samples are not statistically distinct from each other (p > 0.1). Therefore,
regardless of position, the presence of at least one Ile residue in the dimerization interface of
GpA improves the dimerization propensity of the construct (Fig. 2.2A).
Figure 2.2. TOXCAT assays of GpA and mutant sequences. Bars indicate relative CAT
expression for each construct normalized to WT, with standard deviation shown. The notation
VV designates the WT Val80 and Val84 residues, respectively; other constructs are
correspondingly designated. Significant differences in CAT concentration are denoted by * (p < 0.05)
and ** (p < 0.01). A) CAT expression of WT and conservative mutant GpA constructs. B) CAT
expression of WT and Ala mutant GpA constructs. Expression levels of GpA mutant constructs
are shown below, and reported [CAT] values are normalized for expression levels. Figure adapted from
(Cunningham et al. 2010).
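The two-step normalization referred to in the caption (first for construct expression, then relative to WT) can be sketched as follows. This is an assumption about the arithmetic, taking normalization to be a simple ratio; the function name and all numerical values are hypothetical and are not data from this work.

```python
def relative_cat(cat, expression, wt="VV"):
    """Normalize [CAT] for expression level, then express relative to WT.

    cat, expression: dicts keyed by construct name (same keys in both).
    Assumes normalization is a simple ratio, which is an illustrative choice.
    """
    per_expression = {name: cat[name] / expression[name] for name in cat}
    return {name: value / per_expression[wt] for name, value in per_expression.items()}


# Hypothetical raw readings in arbitrary units, for illustration only.
raw_cat = {"VV": 10.0, "II": 18.0, "LL": 3.0}
raw_expression = {"VV": 1.0, "II": 1.5, "LL": 0.75}

normalized = relative_cat(raw_cat, raw_expression)
for name, value in normalized.items():
    print(name, round(value, 2))
```

By construction the WT value is 1, so values above or below 1 indicate stronger or weaker reporter signal per unit of expressed construct.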
On the other hand, mutation to Leu had the opposite effect on dimerization of the GpA
TM compared to Ile and Val. The double Leu mutant (LL) displayed a statistically significant
decrease in dimerization propensity relative to the WT (p < 0.01, Fig. 2.2A), indicating the
importance of a side chain β-branch in the oligomerization of GpA. However, the dimerization
of the LL construct was still above the background (Russ and Engelman 1999) (c2x, p < 0.01),
indicating that oligomerization is not completely abolished in the presence of this large,
hydrophobic residue. The efficiency of a β-branched residue in promoting dimerization of the
GpA TM is highlighted by the fact that constructs containing Leu in combination with either Val or Ile
(VL, LV, IL, and LI) retained dimerization levels statistically indistinguishable from
WT and each other (Fig. 2.2A) (p > 0.1). These results show that at least one large, hydrophobic
β-branched residue is required to maintain dimerization affinity at levels comparable to WT
(Fig. 2.2A). The importance of large, hydrophobic β-branched residues is additionally
highlighted as hydrophobic segments from globular proteins (δ-helices) are depleted in Ile and
Val, even though these segments share equivalent hydrophobicity to TM segments (Cunningham
et al. 2009). The statistical enrichment of Ile and Val, along with the increased dimerization data
presented here, highlights the structural role of these residues and how they are optimal for
driving the packing of TM segments (Cunningham et al. 2009).
To further explore the importance of having at least one large, hydrophobic β-branched
residue in the dimerization interface of GpA, less conservative mutations to Ala were made with
all combinations of Ile, Val and Leu (Fig. 2.1). All possible Ala/Val and Ala/Ile mutants were
nonetheless able to maintain dimerization affinities at least equivalent to the WT VV construct,
with the VA mutant exhibiting dimerization affinity above the other Ala/Val and Ala/Ile
combinations (Fig. 2.2B). Conversely, the Ala/Ala and Ala/Leu combinations significantly
reduced the dimerization propensity of the construct relative to WT (p < 0.01, Fig. 2.2B).
A preliminary mutational cycle of Thr was also considered to understand the importance
of β-branching vs. hydrophobicity of the amino acid side chain in the context of the GpA GG4
dimerization motif. Mutational cycles of Thr, in combination with Val, Ile and Leu, indicate
that mutation to a β-branched, yet more polar residue can also modulate dimerization of GpA.
Preliminary results show that dimerization of the TI and LT mutants is comparable to WT, while
the IT and TL mutants produced little to no dimer. From this initial set of mutations, no
discernible trend emerged on the effect of introducing a Thr residue into the GpA dimerization
interface. There appears to be no consistent positional dependence of Thr on construct
dimerization, as the
pattern of dimerization for the TI and IT mutants compared to the TL and LT mutants is opposite
in nature (Fig. 2.3). These results do show, however, the importance of a balance of
hydrophobicity and a β-branched side chain, in concert with the electrostatic/H-bonding
interactions offered by the Thr residue, in promoting efficient packing of the GpA homodimer.
Figure 2.3. TOXCAT assay of GpA and mutant Thr sequences. Bars indicate relative CAT
expression for each construct normalized to WT, with standard deviation shown. The notation
VV designates the WT Val80 and Val84 residues, respectively; other constructs are
correspondingly designated. The relative expression of each mutant construct is shown below.
[CAT] values are normalized for expression levels.
2.2.2 Mutations at Val80 and Val84 do not alter secondary structure.
Conservative mutations at Val80 and Val84 that significantly altered in vivo
homooligomerization of the GpA TM helix compared to WT (Fig. 2.2) were selected for peptide
synthesis. The boundaries of the GpA TM segment were chosen based on the available high
resolution structure of WT GpA (MacKenzie et al. 1997); multiple Lys residues were
incorporated at both the N- and C-termini to facilitate characterization and solubilization of the
peptides (Table 2.1). Previous work has shown that such solubilizing techniques do not interfere
with the native oligomerization capabilities of GpA peptides (Melnyk et al. 2001). Secondary
structures of WT and GpA mutant peptides in the membrane-mimetic environment of SDS were
determined by circular dichroism (CD) spectroscopy. In agreement with previously reported
experiments of GpA in α-helix inducing solvents such as SDS (Melnyk et al. 2001), all peptides
investigated had α-helical CD spectra with minima at 208 and 222 nm (Fig. 2.4). SDS has been
shown to be capable of discerning variations in helicity among libraries of mutants in related TM
segments (Melnyk et al. 2001; Wehbi et al. 2007; Rath et al. 2009b), and allows one to draw
conclusions as to the relative helicity of TM peptide constructs. While slight differences were
discernible in the amount of α-helical structure observed for GpA vs. some mutant peptides in
SDS, the variations were not statistically significant (p > 0.05).
Table 2.1. Sequences of the mutant peptides of the GpA transmembrane region.
Mutant    Sequence a
VVpep     KKKK-ITLIIFGVMAGVIGTILLISYGI-KKKK
VIpep     KKKK-ITLIIFGVMAGIIGTILLISYGI-KKKK
IVpep     KKKK-ITLIIFGIMAGVIGTILLISYGI-KKKK
IIpep     KKKK-ITLIIFGIMAGIIGTILLISYGI-KKKK
LLpep     KKKK-ITLIIFGLMAGLIGTILLISYGI-KKKK
a Lys residues (offset by hyphens) were added to each peptide to enhance aqueous solubility
(Melnyk et al. 2001).
b Mean Liu-Deber segmental hydrophobicity of each peptide, excluding Lys tags.
Figure 2.4. Circular dichroism spectra of synthetic peptides corresponding to the GpA TM
sequence. Spectra were recorded on solutions of 25 μM peptide in 25 mM SDS buffer. Peptide
notations in the diagram correspond to the residue positions 80 and 84 of the GpA TM sequence,
where VVpep = WT. See Materials and Methods and Table 2.1 for further details of peptide
synthesis and sequence. Figure adapted from (Cunningham et al. 2010).
2.2.3 Lipid accessibility of the ridge residues correlates inversely with tightness of dimer.
Previously it has been shown that helix-lipid interactions contribute to the overall
stability of the GpA dimer in consideration of the small residues of the dimerization interface; an
increase in the lipid accessible surface area (contact between lipid and protein) as a result of
mutation is inversely correlated with tightness of dimer (Johnson et al. 2006). These findings
suggest that helix-lipid interactions may also contribute to changes in the oligomerization status
with mutation to the large “ridge” residues involved in the GpA packing. To assess the role of
the large, hydrophobic residues in promoting the efficient packing of GpA and to provide an
estimate of lipid chain solvation, the lipid accessible surface area to a methylene-sized probe
(1.88 Å) was calculated for the residues comprising the dimerization interface of energy
minimized GpA helices and conservative mutants thereof (Shrake and Rupley 1973; Chothia
1975; Johnson et al. 2006). An inverse correlation was observed between lipid accessibility and
oligomerization as measured by the TOXCAT assay for all constructs (VV, VI, VL, IV, II, IL, LV,
LI, LL, VA, AV, IA, AI, LA, AL, AA): as lipid accessibility increased, oligomerization
decreased (r value of -0.54, p < 0.05, Fig. 2.5). This trend implies that as the 'ridge' in GpA
packing becomes more accessible to lipid through mutation, a weaker GpA dimer is produced.
The moderate strength of the trend also implies that lipid solvation offers only a partial
explanation for differences in GpA dimerization, but it remains a contributing factor.
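The accessible surface calculation cited above (Shrake and Rupley 1973) can be illustrated with a minimal numerical sketch: test points are distributed over a sphere of radius (atomic radius + probe radius) around each atom, and the accessible area is the fraction of points not buried inside the expanded sphere of any neighbouring atom. The code below is a simplified illustration and not the software used in this work; the function names, the golden-spiral point distribution, and the atomic coordinates and radii are all assumptions. Only the 1.88 Å probe radius comes from the text.

```python
import math


def sphere_points(n=200):
    """Quasi-uniform points on a unit sphere (golden-spiral construction)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        ring = math.sqrt(1.0 - z * z)
        points.append((ring * math.cos(golden_angle * k),
                       ring * math.sin(golden_angle * k), z))
    return points


def accessible_area(atoms, probe=1.88, n_points=200):
    """Per-atom accessible surface area (Å^2) for atoms given as (x, y, z, radius)."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        shell = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + shell * ux, yi + shell * uy, zi + shell * uz
            # A test point is buried if it lies inside any neighbour's expanded sphere.
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms) if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * shell * shell * exposed / n_points)
    return areas


# Hypothetical coordinates and radii (Å), for illustration only.
isolated = accessible_area([(0.0, 0.0, 0.0, 1.9)])
pair = accessible_area([(0.0, 0.0, 0.0, 1.9), (3.0, 0.0, 0.0, 1.9)])
print(round(isolated[0], 1), [round(a, 1) for a in pair])
```

In this picture, a mutation that exposes more of the interfacial 'ridge' to a methylene-sized probe increases the summed accessible area of the interface residues, which is the quantity plotted against the TOXCAT signal.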
Figure 2.5. GpA dimer affinity is inversely correlated to interfacial lipid accessibility.
Interfacial lipid accessibility of mutants to the large WT Val residues of GpA (VV, VI, VL, IV,
II, IL, LV, LI, LL, VA, AV, IA, AI, LA, AL, AA) plotted vs. TOXCAT signal ([CAT]). The
regression line of best fit and correlation coefficient are shown; the correlation meets
significance levels (p < 0.05). Figure adapted from (Cunningham et al. 2010).
2.3 Discussion.
2.3.1 β-Branched residues are required to mediate efficient association of the glycophorin
A homodimer in the membrane bilayer.
Large hydrophobic residues constitute nearly half of the amino acids in TM α-helices and
it has previously been shown that these residues (Ile, Val, or Leu) are often associated with Gly
at the i ± 1 positions in interacting α-helices (Russ and Engelman 2000; Senes et al. 2000).
Additionally, there is a tendency for large, hydrophobic amino acids to occur three residues
apart along TM α-helices, with VV, VI, VL, IV, II and IL constituting over-represented pairs, as
shown statistically by Senes et al. (Senes et al. 2000). The high content and structural similarities
among Ile, Leu, and Val raise the question not only of how these residues affect GpA
dimerization, but more generally membrane protein folding and helix-helix packing. For
example, the mutations studied in the present work represent relatively modest changes to the net
hydrophobicity of the TM segment (Table 2.1); in fact, three pairs of such mutants (VI and IV;
VL and LV; II and LL) are each “iso-hydrophobic”. Additionally, Ile, Leu and Val represent the
amino acids with the highest propensity to form α-helical structures in apolar environments (Liu
and Deber 1998b; 1999).
Experimentally, studies addressing the relative importance of residues that are included
within the GpA dimerization interface have indicated that Val residues are important in
maintaining the ability of GpA to form a tight dimer in a membrane environment. Thus, in a
study performed by Lemmon et al., which made use of a heterologously expressed GpA chimeric
protein as a fusion with staphylococcal nuclease, changes to the migration of chimeric constructs
on SDS-PAGE were observed through single amino acid mutations (Lemmon et al. 1992a). For
example, it was shown that VI, VL, VA and AV mutations at positions 80 and 84 were able to
retain measurable levels of dimerization on SDS-PAGE, albeit at different levels from each
other, while the LV mutant was equivalent to WT (Lemmon et al. 1992a). Additionally, single
mutations to the residues involved in the GpA dimerization interface were investigated by
Duong et al., who utilized activity of expressed CAT reporter protein in a TOXCAT assay to
determine the relative levels of dimerization, and the change in apparent free energy of
self-association of the GpA TM segment and single mutants thereof (Duong et al. 2007). These
authors reported a rank order of association propensities for different GpA single mutants,
describing such mutants as possessing association levels similar to WT (IV and LV); less than
WT (AV and VI); and no detectable dimer (VA and VL). Analytical ultracentrifugation on GpA
chimeric constructs has also been used to identify a hierarchy of GpA dimerization propensities,
where single GpA mutants at the 80 or 84 position retained oligomerization as the WT construct
(AV, IV, and LV) or at a reduced level (VA, IV and VL) (Doura and Fleming 2004).
The TOXCAT system has several advantages in measuring the dimerization of TM
segments: it is easy to use, has proven extremely practical in establishing the sequence
determinants of dimerization (Johnson et al. 2006; Johnson et al. 2007), and offers a comparable
alternative to studying membrane protein folding in a mammalian membrane bilayer. The
TOXCAT assay also imposes a register to potentially interacting TM segments, which may serve
to weed out non-native TM α-helix contacts; this alignment of peptide orientation within the
membrane bilayer would not necessarily occur in detergent micelles, where relative rotational
freedom may occur. There are, however, limitations to the use of the TOXCAT assay in
studying the association of TM segments within a membrane bilayer. By design, the TOXCAT
assay only reports on the monomeric or dimeric status of a TM segment. It does not report on
higher order structures which can be formed from TM segments such as trimers, tetramers or
pentamers as in phospholamban, a membrane protein involved in the regulation of Ca2+ transport
(Li et al. 2001). The TOXCAT assay is most useful when reporting on homodimeric
structures such as the GpA dimer studied here; it cannot provide information regarding the
hetero-oligomerization of TM segments. In multi-spanning membrane proteins the final folded
structure is generated from helix-helix contacts between non-equivalent TM segments. For
example, interactions between TM2 and TM5 have been identified in aquaporin-1 that form a
polar, quaternary structural motif that influences multiple stages of folding (Buck et al. 2007).
An additional complication of the TOXCAT assay is its limited utility in reporting on the relative
insertion of TM segments and mutants thereof into the membrane bilayer. How a mutation in a
TM segment can affect the percent insertion of a construct into the membrane bilayer is unclear,
and the maltose complementation assay may not be sufficiently sensitive to report on
such differences.
This body of work specifically focused on identifying the residues important to GpA
dimerization by cataloguing the effects of individual mutations along the dimerization interface.
Differences in dimerization propensities exist among these published results, perhaps a
consequence of the fact that the methods of evaluation are spread across multiple techniques
(Lemmon et al. 1992b; Langosch et al. 1996; Fleming and Engelman 2001; Doura and Fleming
2004; Doura et al. 2004; Duong et al. 2007). In the present work, we undertook a systematic
evaluation of the role in GpA dimerization of the large residues specifically neighboring the GG4
motif to determine the hierarchy of dimerization promotion among these large aliphatic residues.
Our experimental study utilizes mutagenesis of Ile, Leu, and Val as ‘ridge’ residues in all
pairwise combinations at positions 80 and 84 of the GpA dimerization motif to define their role
as specific modulators of oligomerization. Our results suggest that at least one large,
hydrophobic β-branched residue is required to maintain GpA dimerization affinity at levels
equivalent to WT. Thus, mutation of the WT Val80 and Val84 residues along the GpA TM segment
in the context of the TOXCAT assay to various β-branched combinations (VV, VI, IV, II)
indicates that these mutations are capable of maintaining levels of dimerization equal to, and in
some cases significantly greater than, WT. Both Val and Ile are additionally capable of
promoting effective dimerization in combination with single conservative mutations to Leu (VL,
LV, IL, LI) and with non-conservative mutations to Ala (VA, AV, IA, AI). However, the
‘absolute’ requirement of a β-branched residue in promoting dimerization is most vividly
apparent in mutational cycles of Leu (LL), Ala (AA) and Leu/Ala pairs (AL, LA), as these four
constructs showed a statistically significant reduction in propensity to dimerize relative to the WT (Fig. 2.2).
That the variations observed in dimer affinity by TOXCAT experiments do not stem directly
from altered secondary structures is confirmed by CD spectroscopy, wherein spectra of synthetic
peptides corresponding to selected GpA TM segments (VV, VI, IV, II, and LL) were found to be
uniformly helical in SDS media (Fig. 2.3).
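The set of ‘ridge’ constructs discussed above can be enumerated programmatically. The following is a minimal sketch in Python and was not part of the original analysis; the function and variable names are illustrative assumptions, and the residue set (Val, Ile, Leu, Ala) is taken from the constructs named in the text.

```python
from itertools import product

# Residues substituted at positions 80 and 84 (from the constructs named in the text).
RIDGE_RESIDUES = "VILA"
# Beta-branched members of that set; Thr is also beta-branched but is treated separately.
BETA_BRANCHED = {"V", "I"}


def ridge_constructs(residues=RIDGE_RESIDUES):
    """Return every two-letter construct name (position 80, then position 84)."""
    return ["".join(pair) for pair in product(residues, repeat=2)]


def has_beta_branch(construct):
    """True if position 80 or position 84 carries a beta-branched residue."""
    return any(residue in BETA_BRANCHED for residue in construct)


constructs = ridge_constructs()
lacking = sorted(c for c in constructs if not has_beta_branch(c))
print(len(constructs), lacking)  # 16 ['AA', 'AL', 'LA', 'LL']
```

The four constructs without a β-branched residue are the same four (LL, AA, AL, LA) reported above as showing reduced dimerization.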
The work presented here primarily addresses differences in GpA dimerization with
conservative mutations centered around the GG4 motif. Mutations at Val80/84 including all
combinations of Val, Ile and Leu affect GpA dimerization, and the importance of the β-branch in
the side chain to dimerization was shown through the TOXCAT assay. These mutants do not,
however, address the importance of both hydrophobicity and a β-branched side chain to GpA
dimerization. To address this, a preliminary analysis was conducted by replacing the WT Val
residues with Thr. The non-conservative mutation of Val80 or Val84 to Thr either maintained
dimerization at levels equivalent to WT, or completely disrupted it. The drastic differences in
dimerization depending on the location of the Thr residue are difficult to explain, especially as the
IT and TI mutants compared to the LT and TL mutants yielded opposite results. Thr87 is
considered part of the GpA dimerization interface, where it contributes to dimerization by
formation of a hydrogen bond with the backbone of the opposite helix (MacKenzie et al. 1997).
Previous studies investigating GpA dimerization have indicated that relatively polar residues in
the dimerization interface (G79, G83 and T87) can be replaced with other relatively polar residues (G, S
and T) with little disruption to dimerization (Lemmon et al. 1992b); however, non-conservative
mutations of polar residues such as T87 to a hydrophobic side chain are disruptive to dimerization
(Lemmon et al. 1992b). To maintain efficient dimerization of the GpA TM segment it appears
that a combination of hydrophobicity and a β-branch in the side chain is required.
2.3.2 Hydrophobic β-branched residues may be structurally optimized for
transmembrane segment folding.
The insertion of TM segments into the membrane bilayer, which can be considered the
first stage in membrane protein folding, is largely driven by hydrophobicity via the cellular
machinery. The second stage of membrane protein folding involves the formation of tertiary or
quaternary structure, and results from association of two ‘preformed’ α-helices within the
membrane bilayer (Popot and Engelman 1990). Our finding that the amount of lipid solvation of
the ‘ridge’ structures involved in GpA packing inversely correlates with dimer affinity implies
that the inability of lipid to solvate the α-helical structure may help drive the helix-helix
interactions. Previously it has been noted that contact between lipid and protein is altered with
mutation of the Gly residues in the GpA dimerization motif, which in turn affects TM segment
dimerization. Our results extend that analysis and indicate the importance of lipid solvation to
additional residues (Johnson et al. 2006). In the present work, we show that the dimer affinity
values in a library of 16 GpA ‘ridge’ mutants experimentally determined by the TOXCAT assay
inversely correlate with non-polar group lipid accessibility. As observed through
energy-minimized models of II, VV (WT), and LL GpA α-helices, mutation does alter the ‘ridge’
structure, which in turn affects the local surface topology of the construct (Fig. 2.5). This altered
ridge structure produced through mutations to the WT Val residues at GpA positions 80 and 84
changes the dimerization propensity of the construct, as helices with larger lipid-accessible
surface area form weaker dimers due to greater contact with lipid. Conversely, efficient packers
such as Ile decrease the lipid-accessible surface area and promote helix-helix contacts instead.
Figure 2.6. Models of the structure of the GG4 ‘ridge motif’ involved in GpA dimerization,
with the II, VV (WT) and LL mutants shown in order of increasing lipid accessibility and
decreasing dimer strength. The van der Waals radii of residues 75 – 90 are shown with the Gly
residues at positions 79 and 83 colored red. Side chain mutations at positions 80 and 84 to A) II
(green); B) WT VV (red); or C) LL (yellow) alter the local ‘ridge’ structure. This figure was
produced using energy-minimized α-helices (see Materials and Methods) and PyMOL. Figure
adapted from (Cunningham et al. 2010).
While these lipid-helix interactions contribute to GpA dimerization, the strength of the
correlation implies that additional forces beyond ‘ridge’ lipid accessibility affect GpA dimer
strength: a combination of forces must therefore influence the dimerization of GpA. In
structural terms, Ile and Val represent optimal candidates for formation of a rigid ‘ridge’
structure to drive GpA dimerization, as β-branched residues such as Ile, Val and Thr only have
one populated rotamer as a consequence of residing in a membrane-embedded α-helix, and
create optimal packing surfaces for TM oligomerization without additional losses in entropy
(MacKenzie et al. 1997; Senes et al. 2000; Liu et al. 2003). In contrast, Leu is not as
conformationally restricted in this environment, and can rapidly sample a range of conformers in
an α-helix template (MacKenzie et al. 1996). An additional loss of entropy is therefore probable
upon dimerization via Leu, a consequence that is not shared by Ile or Val. While the TOXCAT
assay does not report on such biophysical considerations, our results are consistent with the
rotational freedom of Leu within a helix context precluding successful oligomerization events
relative to those observed for Ile and Val. Since neither raw hydropathy, β-branching, nor entropic
considerations significantly separate the innate oligomerization capabilities of Ile and Val, the
source of increased dimerization likely relates to a ridge structure that is less solvated by lipid,
therefore producing a better packing surface within the locus of helix contact arising from
improved van der Waals packing. In this context, we noted that some Ile-containing mutants,
including VI, IV, and II, do show a statistically significantly higher TOXCAT signal than the WT
VV construct.
The optimized structural surface of GpA that promotes dimerization is additionally an
interesting model of membrane protein folding as in vivo, GpA is not required to dimerize for its
function. The role that GpA plays in identifying individual blood groups and contributing to the
overall negative charge of the cell is not likely dependent on dimerization. Additionally, GpA
plays a role in the trafficking of the Anion Exchanger 1 (Band 3) to the cell surface, but this is
also not dependent on dimerization of GpA (Young et al. 2000). For example, mutations made
to the GpA dimerization interface to create a monomeric GpA protein resulted in the successful
trafficking of Band 3 to the Xenopus oocyte plasma membrane in the same way as wild-type GpA,
showing that the GpA monomer is sufficient to mediate this process (Young et al. 2000). The
lack of importance that dimerization has to GpA function indicates that this structure may simply
provide an optimized folding surface, and GpA as a model system may provide information
regarding how TM proteins generally fold when an optimized surface is available to do so. From
a structural standpoint, the dimerization of GpA could be a measure of protection against
unwanted proliferative oligomerization within the membrane bilayer with other TM segments.
The dimerization of GpA through its optimized interface may be an example of a mechanism
which constrains assembly of TM segments in the membrane, providing a balance in the
production of different structures necessary for homeostasis while evading aggregation (Rath and
Deber 2007).
2.3.3 Modulating helix interactions.
A considerable amount of work has gone into identifying residues involved in promoting
the dimerization of TM segments, but the ability to modulate dimerization with
mutation, and thus understand the nuances of membrane protein folding, has gone largely
unexplored. Several examples of TM segments which dimerize via a GG4 motif have been
investigated including MCP, MZP, and BNIP3, but these examples also contain large
hydrophobic residues at the ± i, ± i + 4 positions relative to their GG4 motif, and their
importance in oligomerization has yet to be investigated [see (Rath et al. 2009b) for a review].
The ability to modulate helix interactions via single amino acid mutations has implications in
disease states and highlights the importance of investigating even conservative mutations. For
example, differences in oligomerization affinities among family members, or proteins which fold
in a structurally similar manner yet with different affinities may be explained by studies
considering various amino acid mutations.
It also is likely that membrane proteins have evolved to include variations in helix contact
strength with regard to both structure and function of the protein. For example, α-helices
involved in maintaining protein structure or rigid portions of the protein may have evolved tight
packing loci between helices. Weaker helix interactions would be advantageous in TM segments
that undergo structural or rotational changes as part of the functional process. A variety of helix
interaction strengths is compatible with the overall protein fold when viewed from a
structure/function perspective, and is most likely essential for function.
2.3.4 Conclusion
In the present work, we have performed a systematic experimental evaluation of the
nuanced roles for Val, Ile, Leu, and Ala when these amino acids occur in the positions 80 and 84
adjacent to the Gly79/Gly83 oligomerization motif of GpA. We found that minor changes to the
oligomeric interface (e.g., Val to Ile) can modulate the oligomerization status of GpA and
accordingly have implications for local protein folding. Notably, our results demonstrate that at
least one β-branched residue at position 80 or 84 is essential for significant GpA dimerization.
Given that the membrane domains of proteins are essentially devoid of disulfide bonds, and
feature a limited content of stabilizing side chain-side chain hydrogen bonding sites, the
widespread occurrence of β-branched residues in TM segments may well stem from a
requirement for the enhanced structural stability provided by the space-filling capacity of their
side chains.
2.4 Materials and Methods.
2.4.1 TOXCAT Assay.
The expression vector pccKAN and the MBP-deficient (malE-) E. coli strain NT326 were
kindly provided by Dr. Donald M. Engelman, Yale University (Russ and Engelman 1999). The
E. coli strain NT326 lacks the endogenous MBP, resulting in the inability of the cells to transport
maltose into the cytoplasm and utilize this compound as a carbon source (Schneider and
Engelman 2003). TOXCAT chimeras fusing the TM sequence of GpA between ToxR and
MBP have been previously described (Johnson et al. 2006). Mutants of GpA were produced by
mutating the WT GpA construct via the QuikChange site-directed mutagenesis kit (Stratagene).
The identity of all constructs was confirmed with DNA sequencing before further
characterization. Constructs were transformed into NT326 cells, grown to an OD600nm of 0.6 and
assayed for construct expression along with expression of the reporter gene CAT as previously
described (Johnson et al. 2006). CAT measurements and construct expression measurements
were performed in at least triplicate and were normalized for the relative expression level of each
construct using Western blotting as described (Johnson et al. 2006). Densitometry was used to
measure differences in construct expression, and was performed using the program ImageJ
(http://imagej.nih.gov/ij/). A cytoplasmic version of MBP, c2x (pMAL-c2x MBP fusion protein
containing no TM or ToxR domain, New England Biolabs) was used as a negative control for
oligomerization, as it has been used in previous studies (Russ and Engelman 1999) and
represents background CAT expression (Riggs 2001). Significant differences in oligomerization
relative to WT (VV) were tested via the t-test, while significant differences in oligomerization
among mutant constructs were compared via an online one-way ANOVA test
(http://faculty.vassar.edu/lowry//anova1u.html). NT326 cells expressing TOXCAT chimeras
with WT and mutant GpA sequences were also streaked onto M9 minimal plates with 0.4%
maltose as the only carbon source to confirm membrane insertion. Constructs that grow under
these conditions target the MBP into the periplasm and utilize maltose as the sole carbon source
(Russ and Engelman 1999). Transformant growth was evaluated for all constructs after
incubation for 2 days at 37°C.
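The arithmetic behind the expression normalization and the one-way ANOVA described above can be sketched in Python as follows. This is an illustration only and not the procedure used in this work, which relied on an online calculator; the function names and all numerical values are hypothetical, and the normalization formula (CAT signal divided by densitometry signal, expressed relative to WT) is an assumption, since the exact formula is not given in the text. Converting the F statistic to a p-value requires the F distribution and is omitted.

```python
from statistics import mean


def normalize_cat(cat_signal, expression, wt_cat, wt_expression):
    """CAT signal per unit construct expression, relative to WT (WT = 1.0).

    Assumed formula: (CAT / expression) for the mutant divided by the same
    ratio for the WT construct.
    """
    return (cat_signal / expression) / (wt_cat / wt_expression)


def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over lists of replicates."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of replicates around their group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical triplicate measurements for three constructs (arbitrary units).
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_statistic = one_way_anova_f(example_groups)
relative_signal = normalize_cat(50.0, 2.0, 100.0, 1.0)
print(f_statistic, relative_signal)
```

A large F statistic indicates that the variation among construct means is large relative to the variation among replicates of the same construct.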
2.4.2 Peptide Synthesis.
GpA peptides corresponding to amino acids 73-94 of the full-length protein were
synthesized using a PS3 peptide synthesizer (Protein Technologies, Inc.) via Fmoc chemistry.
Four lysine residues were added to both the N- and C-termini to increase the solubility of the
peptide (Melnyk et al. 2003). A 0.1-mmol scale synthesis was used with the O-(7-
azabenzotriazol-1-yl)-N,N,N’,N’-tetramethyl-uronium hexafluorophosphate; N,N-
diisopropylethylamine activator pair, with a 4-fold amino acid excess. Peptide cleavage and
deprotection were carried out using a solution of 88% trifluoroacetic acid, 5% phenol, 5%
ultrapure water, and 2% triisopropylsilane. The cleavage product was precipitated into ice-cold
diethyl ether, dried, and resuspended in ultrapure water. An amidated C terminus upon peptide
cleavage was produced by utilizing a low load (0.18–0.22 mmol/g) Fmoc-PAL-polyethylene
glycol-polystyrene resin. Cleaved peptides were purified by reverse phase high performance
liquid chromatography on a C4 preparative column (Phenomenex) with a water/acetonitrile
gradient in the presence of 0.01% trifluoroacetic acid. Peptide molecular weights were
confirmed by mass spectrometry. The Micro BCA assay (Pierce) was used to determine peptide
concentration.
2.4.3 Circular Dichroism.
Circular dichroism spectra of peptides were recorded on a Jasco J-720 CD spectrometer
at room temperature. Spectra in SDS (25 mM SDS, 10 mM Tris, pH 7.2, 10 mM NaCl) were
recorded using a 1 mm path length cuvette at peptide concentrations of 25 µM. All spectra were
background subtracted and converted to mean residue molar ellipticity (MRE [deg cm² dmol⁻¹ × 10⁻³]).
Mean residue ellipticities shown are the average of three separate scans, and statistical
differences at 222 nm were determined relative to the WT peptide (VV) via a t-test.
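The conversion from observed ellipticity to mean residue ellipticity can be illustrated with a short Python sketch. It uses the standard relation MRE = θ / (10 · c · l · n), with θ in millidegrees, c in mol/L, l in cm and n the number of residues; the text does not state whether n was taken as the number of residues or of peptide bonds, so the use of residues here is an assumption, as are the function name, the 30-residue peptide length and the example reading of -15 mdeg.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to MRE in deg cm^2 dmol^-1.

    Assumes n_residues is the number of residues in the peptide; some
    conventions use the number of peptide bonds (n - 1) instead.
    """
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical reading: -15 mdeg at 222 nm for a 30-residue peptide,
# assuming 25 uM peptide in a 1 mm (0.1 cm) cuvette.
mre = mean_residue_ellipticity(-15.0, 25e-6, 0.1, 30)
print(mre * 1e-3)  # value on the x10^-3 scale used for reporting
```

With these example inputs the result is approximately -20 on the × 10⁻³ scale.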
2.4.4 Glycophorin A Helix Solvation Calculations.
The sequences of the WT and mutant GpA TM segments (Fig. 1) were modeled as single
monomeric energy-minimized α-helices using the CNS program suite (Brunger et al. 1998). The
corresponding PDB files generated by CNS each contained a single α-helix structure and were
analyzed using NACCESS to estimate the relative lipid accessibility of each monomer as
described previously (Johnson et al. 2006). Interfacial lipid accessibility was determined by
dividing the sum of the lipid accessibility values calculated for each residue defined in the GpA
dimerization interface (Leu75, Ile76, Gly79, Val80, Gly83, Val84, Thr87) (MacKenzie et al. 1997) by
the number of residues included in the interface. Correlation analysis between TOXCAT and
lipid accessibility was performed using the Prism software program.
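The interfacial averaging and correlation steps described above can be sketched in Python as follows. This is a minimal illustration and not the code used in this work; the function names are hypothetical, the per-residue accessibility values and TOXCAT signals are invented for the example, and the Pearson coefficient is assumed as the correlation measure because the text does not name one.

```python
from math import sqrt

# Residue numbers defining the GpA dimerization interface (from the text).
INTERFACE = (75, 76, 79, 80, 83, 84, 87)


def interfacial_accessibility(per_residue, interface=INTERFACE):
    """Sum the accessibility of the interface residues and divide by their number."""
    return sum(per_residue[number] for number in interface) / len(interface)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Invented per-residue accessibility values (e.g. parsed from NACCESS output).
example_helix = {75: 40.0, 76: 35.0, 79: 10.0, 80: 30.0, 83: 12.0, 84: 28.0, 87: 20.0}
print(interfacial_accessibility(example_helix))

# Invented accessibility and TOXCAT values showing a perfect inverse relation.
print(pearson_r([1.0, 2.0, 3.0], [6.0, 4.0, 2.0]))  # -1.0
```

A negative coefficient corresponds to the inverse relation between lipid accessibility and dimer affinity discussed in Section 2.3.2.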
Chapter 3: Distinctions between hydrophobic helices in globular
proteins and transmembrane segments as factors in protein sorting.
This work was published, in part, by Cunningham, F., Rath, A., Johnson, R.M., Deber, C.M. J
Biol Chem 284: 5395-402 (2009).
Author Contributions: FC and AR designed research. FC performed research and AR
contributed to database construction, analysis of position of charged residues within sequences
and provided assistance with statistical analysis. FC, AR and CMD analyzed the data. FC and
CMD wrote the paper.
3.1 Introduction
As high-resolution structural determination of membrane proteins is not yet routine,
computer simulation methodologies are often used to evaluate helical protein-protein
(Cuthbertson et al. 2006), and protein-lipid interactions (Choi et al. 2004; Johnson et al. 2006).
As a pre-requisite to such studies, protein helical TM segments and their boundaries must be
defined. Hydropathy plots are commonly utilized to identify from primary sequence both the
location of TM segments and their approximate entry/exit points (Kyte and Doolittle 1982;
Engelman et al. 1986; Cserzo et al. 2002; Zhao and London 2006). The hydropathy values
assigned to each residue to create such plots are drawn from one or more scales, among them the
Liu-Deber values developed in our laboratory (Rath and Deber 2008). Transmembrane segments,
and peptides derived from them, that meet or exceed a segmentally-averaged ‘threshold’
hydropathy on this scale (approximately equivalent to a poly-Ala strand) have been shown to
spontaneously insert into micellar environments (Liu and Deber 1999). The average hydropathy
levels of ~96% of natural TM segments were also found to exceed this threshold value (Liu and
Deber 1999).
From these observations, in conjunction with measurements of residue helical propensity
in n-butanol (Liu and Deber 1998b), our laboratory developed the TM segment prediction
program TM Finder (Deber et al. 2001) that uses segmental Liu-Deber hydropathy and non-polar
phase helicity values to query primary sequences for potential TM segments. TM Finder
demonstrated a 98% predictive value in pinpointing TM segments in a training set of known
membrane proteins (Deber et al. 2001). TM Finder was additionally specifically trained to limit
the occurrence of false positives, i.e., globular (soluble) protein regions mispredicted as
membrane-embedded. For this purpose, the initial TM Finder code was applied to a sequence
database of globular proteins of known tertiary structure to assess helices that were of sufficient
length (estimated at 19 residues) to span the membrane bilayer. Of a total of 174 α-helical
globular protein regions from 134 different proteins in this database, we observed that ~30%
were identified as potential TM segments from the primary sequence alone, but could be
separated computationally from bona fide TM sequences based on the presence of ≥ 3 charged
residues. We subsequently termed these TM-like sequences that occur within helical globular
proteins “δ-helices” to reflect their intermediacy between TM properties and
extramembranous localization (Wang 2000). The work presented within this Chapter provides
an opportunity to improve the characterization of bona fide TM domains: we sought to extend
this preliminary computational separation of δ-helices, globular helices, and TM segments with a
view towards understanding the criteria that may serve to distinguish these sequence features of
globular proteins from TM segments in vitro, or even within the cell.
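The preliminary computational separation described above (a hydropathy threshold followed by a charged-residue count) can be sketched in Python as follows. This is not the TM Finder algorithm. The Kyte-Doolittle scale is substituted for the Liu-Deber scale, whose values are not reproduced in this section, and the cutoff of 0.5, the function names and the example sequences are hypothetical. Only the criterion of ≥ 3 charged residues (Asp, Glu, Lys, Arg) is taken from the text.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
CHARGED = set("DEKR")


def mean_hydropathy(sequence, scale=KYTE_DOOLITTLE):
    """Average per-residue hydropathy over the whole segment."""
    return sum(scale[residue] for residue in sequence) / len(sequence)


def count_charged(sequence):
    """Number of Asp, Glu, Lys and Arg residues in the segment."""
    return sum(residue in CHARGED for residue in sequence)


def screen_segment(sequence, threshold, min_charged=3, scale=KYTE_DOOLITTLE):
    """Label a helical segment using a hydropathy cutoff and a charge count."""
    if mean_hydropathy(sequence, scale) < threshold:
        return "globular helix"
    if count_charged(sequence) >= min_charged:
        return "delta-helix candidate"
    return "TM-like"


# Hypothetical 19-residue segments; 0.5 is an arbitrary illustrative cutoff.
for segment in ("LLLLLLLLLLLLLLLLLLL", "ALDALLKALLEALLKALAA", "DEKRDEKRDEKRDEKRDEK"):
    print(segment, screen_segment(segment, threshold=0.5))
```

The second example passes the hydropathy cutoff but carries four charged residues, so it is labeled as a δ-helix candidate rather than as TM-like.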
3.2 Results
3.2.1 Hydropathy of δ-helices.
The globular protein sequences ≥ 19 residues in length considered in our initial study
were divided into globular helix and δ-helix categories based on their mean segmental Liu-Deber
hydropathy values (see Methods). A database of confirmed TM α-helical segments was also
assembled from a non-redundant set of protein TM segments with available high-resolution
structures (Appendix 1, Table A1.1); this TM helix database contains 212 TM segments from 37
non-redundant membrane proteins. We note that the length criterion of ≥ 19 residues was not
applied to the TM helix database because each TM segment has been identified as residing in the
membrane bilayer via high-resolution structure determination. The mean Liu-Deber hydropathy
of each sequence in the globular, δ- and TM helix classes was calculated (see Appendix 1,
Tables A.1 – A.3) and averaged for each group. We found that the mean hydropathy values of
the δ- and TM helix classes (0.72 ± 0.26 and 1.31 ± 0.69, respectively) each exceeded the
Liu-Deber threshold value for membrane insertion [≥ 0.4, see ref (Liu and Deber 1998a)], while the
mean hydropathy value of globular helices (-0.28 ± 0.41) was below this threshold. Therefore,
on average, δ-helices have intermediate hydropathy: greater than that observed for globular
helices but less than the TM helix value (p ≤ 0.01 in both comparisons).
3.2.2 Hydrophobic and charged/polar residue content in δ-helices.
The amino acid compositions of δ-, globular, and TM helix groups were determined (see
Methods) to investigate whether the intermediate hydropathy of δ-helix segments vs. other
globular helices and TM segments arose from decreased numbers of hydrophobic residues,
increased numbers of polar residues, or both. As expected, TM helix segments contained ~1.4-fold
more hydrophobic, ~2-fold fewer polar, and ~3-fold fewer charged residues than globular
helices (p < 0.0001, see Table 3.1). δ-Helix percentage occurrence values in these residue
categories, however, differed from those of both the TM and globular groups (p < 0.0001), and
appeared to be intermediate between them (Table 3.1). For example, the average percentage
occurrence of hydrophobic residues per δ-helix (59.2% ± 6.2%) lies between the globular and
TM helix values (47.1% ± 7.8% and 67.3% ± 9.2%, respectively). The distribution of
hydrophobic and polar/charged amino acid residue types in δ-helices thus appears to be
transitional between TM and globular helix segments.
Table 3.1. Percent occurrence of hydrophobic, polar and charged residues per helix.
Segment           % Hydrophobic a    % Charged      % Polar
TM helix          67.3 ± 9.2         7.6 ± 6.8      32.7 ± 9.4
δ-helix           59.2 ± 6.2         19.2 ± 8.5     40.8 ± 6.1
Globular helix    47.1 ± 7.8         26.7 ± 9.0     59.2 ± 7.7
p-value b         < 0.0001           < 0.0001       < 0.0001
a Percentage occurrence values may not sum to 100% due to inclusion of residues D, E, K, and R
in both the polar and charged categories. See Methods for details.
b Statistical significance between categories determined by ANOVA testing. Errors represent
one standard deviation of group compositions from the mean.
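The per-helix percentages in Table 3.1 can be illustrated with a short Python sketch. The grouping of residues into hydrophobic and polar classes is defined in the Methods and is not reproduced in this section, so the hydrophobic set used below is an assumption made for illustration only, as are the function name and the choice of example. The treatment of D, E, K and R as both polar and charged follows footnote a. The example sequence is the 1MBA core segment listed in Table 3.2.

```python
# Assumed grouping for illustration only; the Methods define the actual classes.
HYDROPHOBIC = set("ACFILMVWY")
# D, E, K and R are counted as both polar and charged (Table 3.1, footnote a).
CHARGED = set("DEKR")


def percent_composition(sequence):
    """Percent hydrophobic, polar and charged residues in one helix."""
    length = len(sequence)
    hydrophobic = sum(residue in HYDROPHOBIC for residue in sequence)
    charged = sum(residue in CHARGED for residue in sequence)
    polar = length - hydrophobic  # all non-hydrophobic residues, charged included
    return {
        "hydrophobic": 100.0 * hydrophobic / length,
        "polar": 100.0 * polar / length,
        "charged": 100.0 * charged / length,
    }


composition = percent_composition("ADAAWTKLFGLIIDALKAA")
print({name: round(value, 1) for name, value in composition.items()})
```

Averaging these per-helix values over every segment in a class, and taking the standard deviation, would give entries of the form reported in the table.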
3.2.3 Amino acid composition of δ-helices vs. transmembrane and other globular helices.
To further probe the origins of the compositional distinctness of δ-helices vs. globular
and TM segments, we compared individual amino acid percentage occurrence frequency among
the three helix categories (Fig. 3.1). Consistent with the results of other groups (Senes et al.
2000), we observed that certain hydrophobic residues (i.e. Phe, Ile, Leu, Val, and Trp; Fig. 3.1A)
were significantly enriched (p ≤ 0.01) in TM helices vs. globular helical segments; the situation
was reversed for polar and charged residues (i.e. Asp, Asn, Glu, Gln, Gly, His, Lys, and Arg;
Fig. 3.1A). These trends are readily rationalized given the respective intramembranous vs.
cytoplasmic localization of these sequences in their native proteins. Similarly, we noted that
δ-helices contained significantly more Asp, Glu, and Arg residues than TM segments (p ≤ 0.05 or
p ≤ 0.01, see Fig. 3.1B), results consistent with our previous observation that the presence of three
charged residues could delineate δ-helices as extramembranous (Deber et al. 2001).
Interestingly, the observed decrease in hydropathy of δ-helices vs. TM segments could be traced
to two individual residues, viz., fewer Ile and Val residues are present in δ-helix sequences than
in TM segments (p ≤ 0.01); conversely, the content of all other residues classed as hydrophobic
is statistically indistinguishable (p ≥ 0.05, Fig. 3.1B).
Overall, the individual residue composition of δ-helices appeared to be more similar to
globular than TM segments (compare Fig. 3.1B and 3.1C); however, δ-helices were significantly
enriched in the hydrophobic residues Leu and Phe, and depleted in Glu, Lys, and Gln compared
to their counterparts in globular proteins (Fig. 3.1C). In terms of hydropathy, δ-helices appear to
be distinguished from bona fide TM segments based largely on a decreased content of
β-branched hydrophobic residues.
Figure 3.1. Comparison of globular helix, δ-helix and TM helix amino acid composition. Mean
residue percent occurrence values are indicated with blue, red, and green bars, respectively.
Error bars represent one standard error of measurement. Residue percentage occurrence values
of A) globular and TM helices; B) δ-helices and TM helices; and C) globular and δ-helices are
shown. Comparisons were made among categories with a one-way ANOVA, then individual
t-tests. Asterisks above the bars indicate statistically significant differences (p ≤ 0.05, *; p ≤ 0.01,
**). Figure adapted from (Cunningham et al. 2009).
3.2.4 δ-helices are more buried within their native folds than other globular helices.
Because δ-helices contain greater percentages of hydrophobic residues and lower
percentages of polar/charged residues than other globular helices, we hypothesized that δ-helices
might be more buried within their native protein folds. To determine if this was the case, the
locations of δ- and globular helices within their native structures were mapped using a
water-sized probe to calculate residue solvent accessibility (see Methods). Helices within the TM
database were excluded from this analysis due to their established burial within the membrane
bilayer. δ-Helices were found on average to be more buried than other globular helices within
their native folds (mean solvent accessibility values of 21.0% ± 7.7% vs. 28.9% ± 8.4%,
respectively, p ≤ 0.01). Moreover, the distribution of solvent accessible residues differs between
the δ-helix and globular helix databases (Fig. 3.2A); the majority of δ-helix segments are 15-20%
solvent accessible, while the majority of globular helices have 25-30% accessibility. Rather
than being confined to a single residue category, however, the increased accessibility of globular
vs. δ-helices is reflected across all residue groupings (Fig. 3.2B), i.e., hydrophobic, polar, and
charged residues are ~1.3-1.4x more exposed in globular vs. δ-helices.
Figure 3.2. Solvent accessibility of globular vs. δ-helices. A) Solvent accessibility distribution
of all residues. Residues in δ-helices are more buried than other globular helix residues, with
mean solvent accessibility values of 21.0 ± 7.7 % vs. 28.9 ± 8.4 %, respectively (mean ± S.D., p
≤ 0.01). B) Solvent accessibility of hydrophobic, polar, and charged residues in globular vs.
δ-helices. Mean solvent accessibility values are indicated with blue and green bars, respectively.
Error bars represent one standard deviation of individual compositions from the mean value. All
residue categories exhibit reduced mean solvent accessibility in δ- vs. globular helices (p ≤ 0.001
in t-tests). See Methods for details of solvent accessibility calculations. Figure adapted from
(Cunningham et al. 2009).
3.2.5 Folding of the δ-helix peptides in aqueous and membrane mimetic environments.
Since δ-helix segments represent sequences that are mispredicted as transmembranous,
we initiated experiments to examine the physical properties of δ-helices vs. TM helices in vitro.
Accordingly, we synthesized peptides corresponding to five δ-helix segments selected from the
database based on the presence of an intrinsic Trp residue for anticipated fluorescence
experiments. Segments were selected from erythrocruorin (1ECA), myoglobin (1MBA),
hemocyanin A chain (1HC1), mandelate racemase (2MNR), and L-lactate dehydrogenase
(5LDH); see Table 3.2 for corresponding δ-helix sequences. The hydrophobicity of the δ-helix
sequences necessitated the introduction of Lys residues at their termini in order to facilitate
synthesis and characterization (Melnyk et al. 2003); such ‘Lys-tags’ have been shown not to
interfere with the core peptide sequence of interest (Liu and Deber 1998b; Melnyk et al. 2001).
The location of each of these δ-helix segments in their native protein structures is shown in Fig.
3.3.
Table 3.2. Sequences of synthesized δ-helix peptides.
PDB ID    Protein Name               Sequence a                    H b
1ECA      Erythrocruorin             K-FAGAEAAWGATLDTFFGMIF-KK     1.10
1HC1      Hemocyanin A chain         KK-ELFFWVHHQLTARFDFERL-K      0.77
1MBA      Myoglobin                  K-ADAAWTKLFGLIIDALKAA-K       0.96
2MNR      Mandelate racemase         K-GLIRMAAAGIDMAAWDALGKV-K     0.53
5LDH      L-lactate dehydrogenase    KK-GYTNWAIGLSVADLIESMLK       0.52
a Lys residues (offset by hyphens) were added to each peptide to enhance aqueous solubility. See
Methods for details.
b Mean Liu-Deber segmental hydrophobicity of each peptide, excluding Lys tags.
Figure 3.3. Ribbon diagrams of helical globular proteins containing δ-helix regions studied in
this work. Alpha-carbon backbones of each protein are indicated in grey, with δ-helices shaded
in green. Proteins and PDB identifiers are as follows: Erythrocruorin (1ECA); Hemocyanin A
chain (1HC1); Myoglobin (1MBA); Mandelate racemase (2MNR); L-lactate dehydrogenase
(5LDH). Figure adapted from (Cunningham et al. 2009).
Circular dichroism spectra were obtained for each δ-helix peptide in aqueous buffer, and
in buffer containing SDS or sodium perfluorooctanoate (SPFO). Although each sequence
demonstrably adopts α-helical structure within its soluble protein tertiary fold, none of the
peptides exhibited a large amount of helical structure in aqueous buffer (Fig. 3.4). The 1MBA
and 1HC1 peptides, however, displayed a degree of helical character in aqueous solution
(minima at 208 nm and 222 nm; Fig. 3.4) that was not observed in the 1ECA, 2MNR, and 5LDH
δ-helix peptides. We noted that there is a general trend for the α-helical content of the δ-helix
peptides to follow their Chou-Fasman secondary structure propensity (Pα) in aqueous solvent
(Chou and Fasman 1978) (Fig. 3.5). For example, 1MBA and 1HC1 have the highest calculated
α-helical structural propensity (Table 3.3), and also the greatest amount of helical structure (Fig.
3.5). The lack of strong aqueous helicity in the 1ECA and 2MNR δ-helix peptides may be
similarly rationalized by the relatively high Chou-Fasman β-strand structural propensity (Pβ)
predicted for these segments (Table 3.3). Interestingly, each δ-helix peptide sequence exhibits
regions with overlapping α and β secondary structure prediction (Table 3.3), suggesting that they
may represent segments with competing secondary structure preferences. Sodium
perfluorooctanoate was used in these studies as it represents a ‘mild’ detergent that is thought to
preserve helix-helix interactions denatured by SDS. Sodium perfluorooctanoate has been shown
to preserve native quaternary structures of membrane proteins (Rath et al. 2006). Since SPFO
allows retention of native contacts, it presumably preserves native secondary structure,
so a comparison to SDS is useful. The results in Figure 3.4 indicate that the δ-helix
peptides are helical in both SDS and SPFO.
Figure 3.4. Circular dichroism spectra of δ-helix peptides in various media. Spectra in aqueous
buffer (left), buffer containing SDS (centre), and buffer containing SPFO (right) are shown.
Changes in mean residue ellipticity values observed at 208 nm and 222 nm for each δ-helix
peptide in detergent solution are consistent with increased α-helical structure. Figure
adapted from (Cunningham et al. 2009).
Table 3.3. Predicted secondary structure of δ-helix peptides.
δ-helix   Predicted α-region a   Pα     Predicted β-region   Pβ
1ECA      1 – 19                 1.11   5 – 11               1.17
1HC1      1 – 18                 1.17   5 – 10               1.17
1MBA      1 – 18                 1.17   5 – 13               1.21
2MNR      2 – 20                 1.15   2 – 12               1.20
5LDH      5 – 19                 1.13   2 – 18               1.09
a Residues in each δ-helix peptide are numbered beginning with the first residue after the Lys tag
(see Table 2 for sequences). Pα and Pβ were computed using Chou-Fasman structure
propensities (Chou and Fasman 1978). Lys tags were excluded from propensity calculations.
Figure 3.5. δ-Helix secondary structure compared with Chou-Fasman α-helix propensity. The
mean residue ellipticity at 222 nm determined from the CD spectrum of each peptide in aqueous
buffer is given as a function of the Chou-Fasman aqueous α-helix propensity of the 1ECA,
1HC1, 1MBA, 2MNR and 5LDH peptides. The regression line of best fit, correlation coefficient
and associated p-values are shown; there is a trend that the aqueous helicity of δ-helix peptides
follows the Chou-Fasman prediction of aqueous α-helix propensity (0.05 ≤ p ≤ 0.10). Figure
adapted from (Cunningham et al. 2009).
All δ-helix peptides increased in α-helix content when exposed to the membrane-mimetic
environments of SDS or SPFO micelles (Fig. 3.4). Induction of helical structure in apolar media
has been observed with TM proteins such as GpA and the epidermal growth factor receptor
(Melnyk et al. 2001), but can also occur when intact globular proteins are exposed to SDS
micelles [reviewed in (Imamura 2006)]. In both instances, ordered secondary structures such as
α-helices are thought to arise in the hydrophobic environment of the detergent micelles in order
to satisfy the hydrogen-bonding requirement of the peptide backbone in the low-dielectric
environment of detergent acyl chains. δ-Helix peptide exposure to an environment of reduced
polarity in SDS micelles was confirmed by examining the Trp fluorescence emission spectra of
the δ-helix peptides. Trp has a characteristic fluorescence emission maximum of approximately
350 nm in an aqueous environment; blue shifting of this maximum to a lower wavelength
accompanies accommodation of the Trp side chain in a more hydrophobic environment (Netz et
al. 2002). Indeed, blue shifts in Trp emission maxima were observed for each δ-helix peptide in
the presence of SDS (Table 3.4), suggesting that each is solvated in the apolar SDS micelle
interior.
Table 3.4. Tryptophan emission maxima of δ-helix peptides in various media.
           λmax (nm)
δ-helix    Aqueous   SDS   Blue shift a
1ECA       348       336   12
1MBA       348       330   18
1HC1       344       335   9
2MNR       348       337   11
5LDH       348       335   13
a (λmax, aqueous – λmax, SDS)
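The blue shift defined in the footnote to Table 3.4 is a simple difference of emission maxima. As an illustration only (this is not part of the original analysis), the following Python sketch recomputes the last column from the tabulated aqueous and SDS values; the names EMISSION_MAXIMA, blue_shift and BLUE_SHIFTS are chosen here for the example.

```python
# Trp emission maxima in nm, transcribed from Table 3.4 as (aqueous, SDS).
EMISSION_MAXIMA = {
    "1ECA": (348, 336),
    "1MBA": (348, 330),
    "1HC1": (344, 335),
    "2MNR": (348, 337),
    "5LDH": (348, 335),
}


def blue_shift(aqueous_nm, sds_nm):
    """Return the blue shift in nm (positive = shift to shorter wavelength)."""
    return aqueous_nm - sds_nm


# Blue shift for every peptide, keyed by PDB identifier.
BLUE_SHIFTS = {name: blue_shift(*pair) for name, pair in EMISSION_MAXIMA.items()}

for name, shift in BLUE_SHIFTS.items():
    print(f"{name}: {shift} nm")
```

A positive value for every peptide corresponds to the statement that each Trp residue moves to a less polar environment in SDS.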
3.2.6 Competence of δ-helix segments for in vivo membrane insertion.
The behavior of the δ-helix peptides in SDS micelles implies that these sequences are
competent for solvation by micelles in vitro, but offers no information as to whether or not they
meet the requirements for in vivo membrane incorporation. We therefore undertook to
investigate whether δ-helix segments could mimic TM sequences by assessing their ability to
insert and self-associate in the E. coli inner membrane using the TOXCAT assay (Russ and
Engelman 1999) (see Methods). Of the five δ-helix sequences evaluated in the TOXCAT assay,
only 1ECA was capable of self-association at levels higher than the monomeric GpA G83I
control, with a self-association strength ~60% of the GpA dimer (Fig. 3.6). The remaining
δ-helix sequences reported lower levels of association strength than the monomeric control, with
the possible exception of the 1MBA sequence. Since the correct membrane insertion of each
fusion protein was confirmed by growth of NT326 cells expressing each fusion protein construct
on M9-maltose plates (not shown), the latter data suggest that whether or not self-association
occurs, at least a portion of each δ-helix TOXCAT construct is correctly incorporated in the E.
coli inner membrane.
Figure 3.6. TOXCAT assay of δ-helix peptides in the E. coli inner membrane. Mean levels of
CAT expression from δ-helices relative to the wild-type GpA dimer are shown ± S.D. (top)
along with a representative Western blot used to evaluate protein expression levels (bottom).
Blot bands excerpted from separate gels are indicated by solid lines between lanes. G83I denotes
the GpA G83I mutant, used as a monomeric control (Russ and Engelman 1999). The mean CAT
expression levels were compared in t-tests; symbols above the bars denote significance level:
0.05 ≤ p ≤ 0.10, +; p ≤ 0.05, *; p ≤ 0.01, **. See Methods for assay details. Figure adapted from
(Cunningham et al. 2009).
3.2.7 Charged residue distribution distinguishes δ-helix and transmembrane sequences.
Given that δ-helices are comparable in length and in overall hydropathy/helicity to native
TM segments, we inquired whether the TOXCAT results might be explained by other
characteristics of δ-helix sequences. For example, the bilayer integration efficiency of model
TM segments in vivo has been shown to depend strongly on the position of charged residues, i.e.,
when they are placed towards the ends of sequences, insertion efficiency is increased (Hessa et
al. 2007). As such, it is possible that δ-helix sequences might be fundamentally distinguished
from TM sequences based on charged residue positioning. The δ-helix, TM helix, and globular
helix databases were accordingly queried for charged residue position (Fig. 3.7A). We found
that charged residues were essentially evenly distributed along the lengths of δ- and globular
helix sequences (p ≥ 0.1 compared to expected frequencies based on an even distribution). In
contrast, charged residues were more abundant within the first 20% or last 20% of segment
length than in the middle of TM segments (i.e. near the ends of TM helices). The sequences of
δ-helices and TM segments can thus be computationally distinguished based on charged residue
distribution. The notion that factors other than raw hydropathy must influence the membrane
insertion propensity of a given δ-helix segment is also supported by the lack of correlation
between segmental hydropathy and the apparent free energy of membrane insertion (∆Gapp),
calculated using biological partitioning measurements (Hessa et al. 2005a; Hessa et al. 2007)
(Fig. 3.7B).
Figure 3.7. Charged residue positioning in δ- vs. TM helices. A) Distribution of charged
residues in globular, δ-, and TM helix sequences. The frequency of occurrence of charged
residues (D, E, K, and R) at various positions along the length of the helix sequence is indicated.
Charged residues are distributed equally (p ≥ 0.1) along the lengths of globular and δ-helix
sequences, but distributed unevenly over TM helix lengths (p ≤ 0.0001). B) Comparison of
δ-helix partitioning under ‘in vivo’ vs. ‘in vitro’ conditions. The insertion efficiency of δ-helix
sequences under in vivo conditions is given as the apparent free energy of membrane insertion
(∆Gapp), calculated using biological partitioning measurements (Hessa et al. 2005a; Hessa et al.
2007); the in vitro condition is represented by Liu-Deber segmental-averaged hydropathy (Rath
and Deber 2008). Figure adapted from (Cunningham et al. 2009).
3.3 Discussion.
3.3.1 Role in globular proteins.
δ-Helices represent ~30% of the globular-based α-helices investigated. This relatively
frequent occurrence implies that these segments may be of some utility in the proteins that
contain them, and considerable evidence supports a major role for hydrophobic interactions in
globular protein folding (Dill et al. 2008). We nevertheless could discern no trend in terms of
localization of δ-helix segments to known surfaces of protein-protein interaction, or segregation
into a particular globular protein type (data not shown). However, δ-helices are by definition
highly apolar, and display increased burial within their native protein folds vs. other globular
helices (Fig. 3.2A, B). As such, we suspect that δ-helix sequences may be important to the
stability of proteins that contain them, perhaps via sequestration of their hydrophobic residues in
the protein interior during folding.
δ-Helix segments are nevertheless unable to adopt their helical native structures without
input from the remainder of the protein, perhaps because of their competing secondary structure
preferences (Fig. 3.4, Table 3.3). In fact, δ-helix sequences generally display strong propensities
to exist as β-sheet type segments when considered in the absence of the constraints imposed by
globular protein tertiary structure (Table 3.3); this mixed potential is manifested in CD
experiments, where several segments do not develop significant helical structure, even in
membrane-mimetic environments. This disparity of structural propensity vs. conformation of
δ-helix segments mimics that of ‘discordant helices’, globular protein segments that undergo an
α-to-β structure transition and form amyloid-like fibrils (Paivio et al. 2004), and suggests that
these sequences may exhibit structural instability in their native folds.
3.3.2 Role of residue content.
The high intrinsic hydropathy of δ-helix segments is imparted by a different set of
hydrophobic residues than that of TM segments. Ile and Val residues are depleted in δ-helices
compared to TM segments (Fig. 3.1), and this may be rationalized by the requirement for such
amino acids to be evolutionarily retained in the hydrophobic, restrictive environment of the
membrane bilayer vs. aqueous solvent. The β-branched residues such as Ile and Val have only
one populated rotamer as a result of residing in membrane-induced α-helices, where they are
structurally optimized for the folding and helix-helix interaction requirements of membrane
proteins (Senes et al. 2000). There may thus be considerable selective pressure to retain Ile and
Val in TM segments relative to δ-helices as they may retain a structural role beyond maintaining
levels of hydrophobicity. Ile and Val are additionally better β-sheet-formers than helix-formers
in aqueous solvent (Chou and Fasman 1978), perhaps necessitating their depletion in natively
α-helical δ-helix sequences.
3.3.3 Recognition of hydrophobic segments.
δ-Helix peptides are sufficiently similar to bona fide TM segments in terms of their
segmental hydrophobicity and apolar helicity to be competent for membrane insertion, and our
TOXCAT results indicate that certain δ-helix sequences are not only capable of membrane
insertion but also of self-association within the bilayer when placed in the correct protein context
(Fig. 3.6). How, then, are hydrophobic segments destined for the interior of globular proteins
distinguished from those that become incorporated into the interior of the membrane bilayer?
Correct δ-helix vs. TM segment sorting must rely on factors distinct from bulk biochemical
properties. Based on our results, it appears that charged residue distribution and/or protein
context may act to exclude δ-helix segments from the bilayer. Thus, authentic TM segments
have a skewed distribution of charged residues with such residues appearing at helix termini
compared to δ- and soluble α-helical segments (Fig. 3.7A). As well, proteins containing δ-helix
segments appear to lack any additional TM-mimic sequences that could aid their membrane
integration in a manner similar, for example, to the bilayer integration of voltage-sensor domains
of the voltage-dependent potassium channels via ‘helper helices’ (Zhang et al. 2007); of the 51
globular helical proteins containing δ-helix sequences, only six have ≥ two δ-helix regions. It is
also possible that the potential sequestration of δ-helix sequences in the protein interior at an
early stage in folding may prevent recognition of their high intrinsic hydrophobicity.
Examination of the estimated efficiency of δ-helix sequence partitioning into the lipid
bilayer in vivo and into membrane-mimetic media in vitro further reinforces the notion that bulk
biochemical properties are not sufficient to predict the non-TM vs. TM location of a δ- vs. TM
helix sequence. The two sets of values do not correlate, although a regression line and correlation
coefficient are shown for illustrative purposes (Fig. 3.7B). Charged residues, and their location
in the δ-helix segments, may therefore help to predict these hydrophobic protein regions as
destined to be globular.
3.3.4 Conclusions.
The present study shows that (i) nearly 30% of globular protein helices of sufficient
length to span a membrane bilayer (≥ 19 amino acids) have mean hydropathy values equivalent
to or greater than those of known TM segments; and (ii) differentiation between these
TM-mimicking portions of helical globular proteins and bona fide TM segments could generally be
achieved in the first instance by flagging sequences with three or more charged residues as
non-TM regions. We further observed that although significant hydrophobicity is a
necessary feature in identifying potential TM segments, additional factors, such as the location
of the charged residues and an increased occurrence of Ile and Val residues, should also be
considered. While δ-helix segments in globular proteins may embody important hydrophobic
structural features of the in vivo protein fold, the work described herein provides additional clues
as to how proteins may be sorted by the cell. Our current examination of factors that act to divert
δ-helices from membrane insertion to an aqueous phase indicates that TM Finder and similar
software could exploit these specific features to improve the prediction of TM segments in
proteins.
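The first-pass rule stated in conclusion (ii) can be written down in a few lines. The Python sketch below is only an illustration of that rule, not an implementation used in this work or in TM Finder; the function names and the two test sequences are hypothetical, and the charged residue set (D, E, K, R) follows the definition given in the Methods.

```python
# Charged residues as defined in this chapter.
CHARGED = set("DEKR")


def count_charged(segment):
    """Count D, E, K and R residues in a one-letter amino acid sequence."""
    return sum(1 for residue in segment.upper() if residue in CHARGED)


def flag_non_tm(segment, max_charged=2):
    """Return True when the segment has three or more charged residues.

    The default cut-off of two tolerated charges reflects the 'three or
    more' wording of the conclusion; it is exposed as a parameter because
    any real predictor would need to tune it.
    """
    return count_charged(segment) > max_charged


# Hypothetical 19-residue segments, invented for illustration only.
charged_example = "LLALLAKLLEALLRALLAL"   # contains K, E and R
apolar_example = "ALLLAVLLGALLAVLLALL"    # contains no charged residues

print(flag_non_tm(charged_example), flag_non_tm(apolar_example))
```

Such a filter ignores where the charges sit, so it captures only the count-based part of the argument; positional information would have to be added separately.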
3.4 Materials and Methods.
3.4.1 Database construction.
The databases of globular and δ-helices were initially compiled by searching
SwissProt release 34 with the keyword “helix”. The ~4200 sequences returned by this query
were subsequently refined to 174 entries (Wang 2000), as follows: (i) removing all redundant
sequences (defined as those > 25% identical); (ii) removing sequences with non-standard amino
acids; (iii) removing sequences without high-resolution structure coordinates deposited in the
PDB; and (iv) removing segments with < 19 residues. The 174 remaining sequences were then
divided into globular helices (see Appendix 1, Table A1.1) and δ-helices (see Appendix 1, Table
A1.2) via submission to TM Finder; those segments with segmental hydropathy values at or
above the Liu-Deber insertion ‘threshold’ [≥ 0.4 on the Liu-Deber scale, see reference (Liu and
Deber 1998b)] were deemed δ-helices. The database of TM segment sequences was compiled
from a non-redundant list (sequences > 25% identical were treated as redundant) of TM α-helices
with available high-resolution structures. Thirty-seven non-homologous TM proteins were
identified containing a total of 212 different TM segments (see Appendix 1, Table A1.13).
Unlike the globular and δ-helix databases, segments shorter than 19 amino acids were retained
in the TM database because the membrane localization of each was confirmed in a high
resolution structure (Rath and Deber 2008).
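The two numerical criteria used to partition the helix database (a minimum length of 19 residues and a Liu-Deber segmental hydropathy of at least 0.4) can be sketched as follows. This is a minimal illustration, assuming the segmental hydropathy has already been obtained (in this work, from TM Finder); the function name classify_helix, the labels it returns and the example inputs are hypothetical.

```python
MIN_LENGTH = 19             # segments with fewer residues were removed
LIU_DEBER_THRESHOLD = 0.4   # hydropathy at or above this value -> delta-helix


def classify_helix(length, hydropathy):
    """Assign a helix to 'excluded', 'delta' or 'globular'.

    length      -- number of residues in the helical segment
    hydropathy  -- precomputed Liu-Deber segmental hydropathy
    """
    if length < MIN_LENGTH:
        return "excluded"
    if hydropathy >= LIU_DEBER_THRESHOLD:
        return "delta"
    return "globular"


# Hypothetical (length, hydropathy) pairs, for illustration only.
for length, hydropathy in [(19, 0.4), (25, 0.39), (18, 1.0)]:
    print(length, hydropathy, classify_helix(length, hydropathy))
```

The redundancy, non-standard residue and structure-availability filters are not reproduced here, since they depend on external database lookups.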
3.4.2 Amino acid composition analysis.
Amino acid composition was determined for each segment in each database by counting
the number of each residue and/or group of residues in each individual helix and dividing by
helix length to obtain a normalized residue frequency. Mean residue and/or group residue
composition values were then calculated for each of the globular helix, TM helix, and δ-helix
datasets. For group comparisons, hydrophobic residues were defined as (A, C, F, I, L, M, V, W,
Y), as determined previously (Liu and Deber 1998b); polar residues as (D, E, G, H, K, N, P, Q,
R, S, T); and charged residues as (D, E, K, R). The overall mean amino acid compositions of
globular, TM and δ-helices were compared with an online one-way ANOVA test (Prisim).
Pair-wise comparisons of amino acid frequencies among the globular, TM, and δ-helix datasets were
performed using t-tests (Prisim). As counts of amino acids do not represent continuous data it is
appropriate to use an ANOVA assay to determine differences in amino acid composition among
globular, δ- and TM-helix categories.
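The normalization described in this section (residue counts divided by helix length, then averaged over a dataset) can be sketched in Python as below. The residue groups are those defined above; the function names and the example sequences are hypothetical and serve only to show the arithmetic, and the statistical tests are not reproduced.

```python
from collections import Counter

# Residue groups as defined in this section.
HYDROPHOBIC = set("ACFILMVWY")
POLAR = set("DEGHKNPQRST")
CHARGED = set("DEKR")


def residue_frequencies(helix):
    """Per-residue frequency: count of each amino acid divided by helix length."""
    counts = Counter(helix.upper())
    return {residue: n / len(helix) for residue, n in counts.items()}


def group_frequency(helix, group):
    """Fraction of positions in the helix occupied by members of a residue group."""
    return sum(1 for residue in helix.upper() if residue in group) / len(helix)


def mean_group_frequency(helices, group):
    """Mean of the per-helix group frequencies over a dataset of helices."""
    return sum(group_frequency(h, group) for h in helices) / len(helices)


# Hypothetical sequences, invented for illustration only.
example_dataset = ["AAAAKKKKLL", "LLLLLLLLLL"]
print(residue_frequencies(example_dataset[0]))
print(mean_group_frequency(example_dataset, HYDROPHOBIC))
```

Because each helix is normalized before averaging, short and long helices contribute equally to the dataset mean.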
3.4.3 Solvent accessibility analysis.
The solvent accessibility of amino acids in globular helices and δ-helices was evaluated
by application of the program NACCESS (Hubbard 1993) to structure coordinate files obtained
from the PDB as described previously (Rath and Deber 2008), with the following modifications:
(i) A probe size of 1.40 Å was used to approximate the radius of a water molecule; and (ii) the
relative solvent accessibility (RSA) for each amino acid was calculated by comparison of
calculated solvent accessible surface areas to the default reference set supplied with NACCESS.
Helix solvent accessibility was calculated as the sum of the individual residue RSA values in
each helix, divided by the helix length. Helix solvent accessibility distributions were determined
for the globular and δ-helix groups by sorting individual helix solvent accessibility values into
bins [0-5%, >5-10%, >10-15%, >15-20%, >20-25%, >25-30%, >30-35%, >35-40%, >40-50%,
>50-55%, >55%-60%]; the mean solvent accessibility values for the globular and δ-helix groups
were also calculated from the individual solvent accessibility data. Within the globular and
δ-helix groups, the RSA values of residues in the hydrophobic, polar, and charged residue
categories defined above were averaged to obtain mean solvent accessibility values.
Comparisons of the mean solvent accessibility of hydrophobic, polar, and charged residues
between globular and δ-helix segments were performed using t-tests.
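The helix-level accessibility calculation can be illustrated with a short Python sketch. It assumes that per-residue RSA values (in percent) have already been extracted from NACCESS output, and, as a simplification, it uses uniform 5% bins with inclusive upper edges rather than the exact bin list given above. The function names and the example values are hypothetical.

```python
import math


def helix_accessibility(residue_rsa):
    """Sum of per-residue RSA values (%) divided by the helix length."""
    return sum(residue_rsa) / len(residue_rsa)


def bin_index(value, width=5.0):
    """Index of the bin holding `value`: 0 for 0-5%, 1 for >5-10%, and so on.

    Uniform bin width is an assumption made for this sketch.
    """
    if value <= 0:
        return 0
    return math.ceil(value / width) - 1


# Hypothetical per-residue RSA values for one short helix.
example_rsa = [10.0, 20.0, 30.0]
mean_rsa = helix_accessibility(example_rsa)
print(mean_rsa, bin_index(mean_rsa))
```

Running the same two functions over every helix in a dataset and tallying the bin indices gives the accessibility distribution for that dataset.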
3.4.4 Residue position analysis.
Residues in each helix sequence were sequentially numbered from 1 to n, beginning at
the N-terminal residue and ending at the C terminal residue, such that n represents the total
peptide length in residues. The number assigned to each charged residue (D, E, K, R) in each
helix sequence was divided by n to obtain a positional value normalized to helix length.
Positional values were sorted into bins corresponding to fractions of helix length [≤ 20%, >20%-
40%, >40%-60%, >60%-80%, >80%-100%]. Numbers of charged residues in each bin were
totaled for the globular helix, δ-helix, and TM helix groups. Chi-squared tests were used to
analyze residue distributions.
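The positional binning described in this section can be sketched as follows. The Python code below numbers residues from 1 to n, assigns each charged residue to one of five bins by its position divided by n, and computes a chi-squared statistic against an even distribution across bins. The choice of equal expected counts per bin is an assumption made for this sketch, as are the function names and the example sequence; no p-value is computed.

```python
CHARGED = set("DEKR")
N_BINS = 5  # <=20%, >20-40%, >40-60%, >60-80%, >80-100% of helix length


def position_bin(position, length):
    """Bin index (0-4) for a 1-based position in a helix of the given length.

    Integer arithmetic is used so that boundary positions (for example
    position 4 of 20, exactly 20%) fall in the lower bin.
    """
    return (N_BINS * position - 1) // length


def charged_bin_counts(helices):
    """Total number of charged residues per positional bin over all helices."""
    counts = [0] * N_BINS
    for helix in helices:
        n = len(helix)
        for position, residue in enumerate(helix.upper(), start=1):
            if residue in CHARGED:
                counts[position_bin(position, n)] += 1
    return counts


def chi_squared_even(counts):
    """Chi-squared statistic against equal expected counts in every bin."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no charged residues to analyze")
    expected = total / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# Hypothetical 20-residue helix with charges only at the two termini.
example = ["K" + "A" * 18 + "E"]
counts = charged_bin_counts(example)
print(counts, chi_squared_even(counts))
```

A TM-like dataset would be expected to load the first and last bins, as in this toy example, whereas an even distribution would give a statistic near zero.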
3.4.5 Peptide synthesis and purification.
Five δ-helix segments were selected from the database as candidates for in vitro study
based on the presence of an intrinsic Trp residue: 1ECA (residues 114-133 of the full length
protein), 1HC1 (residues 218-236), 1MBA (residues 127-145), 2MNR (residues 96-116) and
5LDH (residues 247-266). The boundaries of the δ-helix peptides of 1ECA, 1HC1, 1MBA,
2MNR and 5LDH were chosen as prescribed by examination of the high resolution structure and
the TM Finder output (Deber et al. 2001). Peptides with sequences corresponding to these
δ-helix segments were synthesized with a PS3 peptide synthesizer (Protein Technologies, Inc.)
using standard Fmoc chemistry. Additional lysine residues were incorporated into the peptide
sequences to increase aqueous solubility as previously described (Melnyk et al. 2003). A four-
fold amino acid excess on a 0.1 mmol scale synthesis was used with the HATU/DIEA activator
pair. Synthesis utilized a low-load (0.18-0.22 mmol/g) PAL-PEG-PS resin that produced an
amidated C-terminus upon peptide cleavage. Peptide cleavage and deprotection was achieved
using a cocktail of 88% TFA/5% phenol/5% ultrapure water/2% TIPS, followed by precipitation
with ice-cold diethyl ether, drying, and resuspension in ultrapure water. Crude peptides were
purified by RP-HPLC on a C4 preparative column (Phenomenex) with a water/acetonitrile
gradient in the presence of 0.01% TFA. Peptide molecular weights were confirmed by mass
spectrometry and the Micro BCA assay (Pierce) was used to determine peptide concentration.
3.4.6 Circular dichroism and fluorescence spectroscopy.
Circular dichroism spectra were recorded on a Jasco J-720 CD spectrometer at room
temperature. Spectra in aqueous buffer (10 mM Tris pH 7.2, 10 mM NaCl) and aqueous buffer
with 10 mM SDS were taken using a 0.1 cm path length cuvette at peptide concentrations of 25
µM. A 0.01 cm path length cuvette was used for secondary structure determination at peptide
concentrations of 100 µM in aqueous buffer with 50 mM SPFO. Fluorescence measurements
were carried out in the aqueous and SDS buffer conditions described above on a Hitachi F-400
Photon Technology International C-60 fluorescence spectrometer at an excitation wavelength of
295 nm. Emission spectra were recorded at room temperature from 305 to 405 nm using a
peptide concentration of 5 µM. The correlation between Chou-Fasman α-helical
propensity (Pα) and peptide mean residue ellipticity at 222 nm was evaluated using the R
statistical software package.
3.4.7 Plasmid construction.
The expression vector pccKAN and the MBP-deficient (malE-) E. coli strain NT326 were
kindly provided by Dr. Donald M. Engelman, Yale University (Russ and Engelman 1999).
TOXCAT chimeras fusing the TM sequence of GpA and the G83I GpA mutant between the
ToxR and MBP domains have been previously described (Johnson et al. 2006); chimeras
encoding δ-helix sequences instead of the GpA sequence were constructed in an essentially
identical manner via restriction digestion of oligonucleotide cassettes encoding each δ-helix
segment with NheI and BamHI and subsequent ligation into the NheI and BamHI sites of the
pccKAN plasmid. The identity of all constructs was confirmed with DNA sequencing prior to
further characterization.
3.4.8 MalE complementation test.
NT326 cells expressing TOXCAT chimeras with δ-helix sequences, the wild-type GpA
TM sequence, or the G83I GpA mutant were streaked onto M9 minimal plates with 0.4%
maltose as the only carbon source. Under these conditions, transformants capable of growth
must target a portion of the chimeric TOXCAT protein into the cytoplasmic membrane (Russ
and Engelman 1999). Transformant growth was evaluated for all constructs after incubation for
2 days at 37°C.
3.4.9 Chloramphenicol acetyltransferase enzyme-linked immunosorbent assay.
NT326 cells harboring TOXCAT chimeras were grown at 37 °C, harvested into 1 mL
fractions at an A600 of 0.6, pelleted, and stored at -80 °C. Cell lysates were prepared from cell
pellets as previously described (Johnson et al. 2006) and assayed for CAT concentration using
the CAT ELISA kit (Roche Applied Science). A standard curve was generated with CAT
provided by the manufacturer. Cells expressing the wild-type GpA and G83I GpA sequences
were included in each CAT assay as positive and negative controls, respectively. CAT
measurements were performed in at least triplicate, and were normalized for the relative
expression level of each construct using Western blotting as described (Johnson et al. 2006).
Chapter 4. Converting a Marginally Hydrophobic Soluble Protein
into a Membrane Protein.
This work was published, in part, by Nørholm, M.H., Cunningham, F., Deber, C.M., von Heijne,
G. J Mol Biol: 407: 171-179 (2011).
Author contributions: MN designed research plan including experimental membrane insertion
assays and mutations. FC performed database construction and analysis, and designed
mutations. MH, FC, CMD and GVH analyzed data. MH wrote the paper with FC, CMD and
GVH providing input.
4.1 Introduction.
What distinguishes TM helices from helices in soluble proteins? Hydrophobicity is a
major determinant influencing the membrane insertion of a given helical segment, but marginally
hydrophobic protein segments are observed to form both TM helices and parts of globular
proteins (Hessa et al. 2007; Cunningham et al. 2009): additional factors must contribute to
ensure their proper localization.
A well-studied example is the membrane-embedded voltage-sensor domain present in
voltage-gated ion channels (Swartz 2008). Voltage-sensor domains contain an unusual highly
charged S4 α-helix, where membrane insertion of the S4 helix is aided by electrostatic
interactions with charged residues in neighbouring helices (Sato et al. 2002; Zhang et al. 2007).
In the opposite case, unusually hydrophobic so-called δ-helices exist in proteins that are
not integrated into membranes. δ-Helices are α-helices defined as having a segmental
hydrophobicity value ≥ 0.4 on the Liu–Deber hydrophobicity scale (Liu and Deber 1998a) and
were first identified in approximately 30% of a test group of 174 solved crystal structures of
globular proteins (Cunningham et al. 2009). Overall, bioinformatics analysis of these δ-helices
showed that although they are markedly hydrophobic, they contain more charged residues and
fewer Ile and Val residues than typical TM helices. Furthermore, charged residues in δ-helices
are more evenly distributed than in TM helices where they typically are found near the
membrane-water interface. Nevertheless, synthetic δ-helix peptides exhibit TM-helix-like
behaviour in vitro; CD and fluorescence spectroscopy show an increase in helical content when
they are exposed to membrane mimetic environments, and a δ-helix derived from Chironomus
thummi thummi erythrocruorin (1ECA) can insert and self-associate in the inner membrane in E.
coli (Cunningham et al. 2009).
Given that δ-helices have many of the characteristics of TM helices yet have evolved to
not insert into membranes in their normal context, we find them interesting as representatives of
polypeptide segments that are near the threshold for membrane insertion. This is particularly
pertinent for δ-helices in secreted proteins, as these sequence segments must be able to
translocate through the Sec61 translocon in the ER membrane without being recognized as TM
helices. With this in mind, we have studied selected δ-helices using a well-established assay for
measuring membrane insertion efficiency in dog pancreas microsomes (Hessa et al. 2005a) and
have determined what sequence alterations are required to convert δ-helices into TM segments
both in chimeric model proteins and in the native protein context.
4.2 Results.
4.2.1 δ-Helix hydrophobicity.
Our current collection of δ-helices (Table 4.1), subdivided according to predicted or
experimentally verified subcellular protein localization, contains 51 α-helical protein segments
with a Liu–Deber score ≥ 0.4 (Liu and Deber 1998a). Our definition of a “secreted” protein is
one that has a predicted signal sequence (and hence encounters the Sec61/SecYEG translocon
during its translation (White and von Heijne 2008)) and no TM segment. δ-Helices from
mitochondrial, chloroplastic and viral proteins are classified as cytosolic under this scheme,
given that they do not encounter the Sec61/SecYEG translocon.
In Table 4.1, we compare the predicted Liu–Deber scores for the δ-helices with the
apparent free energy of membrane insertion (∆Gapp) calculated by the “∆G predictor” software
(Hessa et al. 2007). By the latter measure, most of the δ-helices are predicted to insert poorly or
not at all into a biological membrane, the average ∆Gapp being +4.1 kcal/mol. For comparison,
the average predicted free energy of insertion for a set of TM helices in membrane proteins with
solved crystal structures is around -1 kcal/mol (Hessa et al. 2007).
Table 4.1. δ-helices and their predicted Liu-Deber and ∆Gapp hydrophobicities.
Non-Secreted Proteins a             Secreted Proteins b
PDB ID   Liu-Deber c   ∆Gapp d      PDB ID    Liu-Deber c   ∆Gapp d
1BGD     0.84          3.9          1BGC      0.58          6.0
1BIA     0.97          3.9          1BGC      0.49          4.2
1DXI     0.47          4.0          1CF3      0.91          4.0
1FDH     0.46          2.2          1ECA      1.10          4.1
1FHA     0.43          3.2          1EZM      0.49          4.1
1FIA     1.19          5.1          1GLM      0.67          4.6
1GPA     0.48          4.2          1GLM      0.72          4.4
1GUH     1.11          4.1          1OVA      0.53          5.3
1HC1     0.96          5.0          2ACH      0.48          5.4
1HDS     0.77          1.3          3HHR      0.56          4.4
1LTH     0.56          4.4          3HHR      0.54          6.2
1MAT     0.88          4.5          3INK      0.42          4.9
1MBA     0.77          3.4          2AAI      0.84          2.8
1MRR     0.61          3.1          2ACE      0.54          6.4
1PFK     0.73          3.7          Average   0.63          4.8
1PHH     0.80          4.4          SD        0.19          1.0
1PHH     1.07          4.3
1SRY     1.62          2.4
1SRY     0.42          7.6
2ALD     0.65          5.0
2ATI     0.88          3.3
2BMH     0.74          1.3
2LDB     0.66          5.4
2MNR     0.52          3.0
2TSI     1.16          4.3
3PFK     0.90          4.5
3TMS     0.66          3.4
4TMS     0.59          4.0
5LDH     0.52          3.4
9LDB     0.45          5.4
1CSC     0.61          5.7
1CSC     0.98          1.8
1CSC     0.42          4.3
1CPC     1.06          3.3
1TIS     0.81          3.9
2TMV     0.63          4.7
8RUC     0.51          3.9
Average  0.75          3.9
SD       0.27          1.2
a Non-secreted proteins do not encounter the translocon in their translation pathway. These include cytosolic, viral
and mitochondrial proteins.
b Secreted proteins encounter the translocon in their translation pathway.
c Average hydropathy of the segment calculated by the Liu-Deber hydropathy scale (Liu and Deber 1998a).
d Predicted ∆Gapp value (in kcal/mol) for the segment calculated by an in vivo membrane insertion scale (Hessa et al.
2007).
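The summary rows of Table 4.1 can be checked directly from the listed values. The Python sketch below does so for the secreted-protein columns, using the standard library only. The use of the sample standard deviation is an assumption, since the table does not state which estimator was applied; the variable and function names are chosen here for illustration.

```python
from statistics import mean, stdev

# Secreted-protein columns transcribed from Table 4.1 (14 segments).
SECRETED_LIU_DEBER = [0.58, 0.49, 0.91, 1.10, 0.49, 0.67, 0.72,
                      0.53, 0.48, 0.56, 0.54, 0.42, 0.84, 0.54]
SECRETED_DG_APP = [6.0, 4.2, 4.0, 4.1, 4.1, 4.6, 4.4,
                   5.3, 5.4, 4.4, 6.2, 4.9, 2.8, 6.4]


def summarize(values, digits):
    """Return (mean, sample SD), both rounded to the given number of decimals."""
    return round(mean(values), digits), round(stdev(values), digits)


print("Liu-Deber:", summarize(SECRETED_LIU_DEBER, 2))
print("dG_app (kcal/mol):", summarize(SECRETED_DG_APP, 1))
```

The same function can be applied to the 37 non-secreted entries to check the remaining Average and SD rows.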
Comparison of the mean Liu–Deber hydropathy scores of δ-helices from secreted and
cytosolic proteins shows that there is no significant difference between the groups (0.63 ± 0.19
and 0.75 ± 0.27; p > 0.05). In contrast, when judged by their ∆Gapp values, δ-helices in secreted
proteins have significantly higher ∆Gapp values on average than their cytosolic counterparts (4.8
± 1.0 kcal/mol versus 3.9 ± 1.2 kcal/mol; p < 0.05). Despite this difference in the ∆Gapp values
between secreted and cytosolic proteins, both types of δ-helices are characterized by ∆Gapp
values above the threshold for membrane insertion (Hessa et al. 2007). It is likely the
positioning of the charged residues within these segments, as reflected in their ∆Gapp values,
that prevents the insertion of these segments into the bilayer in vivo. The appearance
of charged and/or polar residues in the central portion of membrane spanning segments can be
detrimental to membrane insertion by the cellular machinery (Hessa et al. 2007).
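The group comparisons quoted above can be examined from the summary statistics alone. The Python sketch below computes a Welch (unequal-variance) t statistic and its approximate degrees of freedom from the means, standard deviations and group sizes (14 secreted and 37 non-secreted segments, counted from Table 4.1). The choice of Welch's test is an assumption made here for illustration; the text does not state which form of t-test was used, and no p-value is computed.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a = sd_a ** 2 / n_a   # squared standard error of group a
    var_b = sd_b ** 2 / n_b   # squared standard error of group b
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


# Secreted vs. non-secreted dG_app (kcal/mol), summary values from the text.
t_dg, df_dg = welch_t(4.8, 1.0, 14, 3.9, 1.2, 37)

# Secreted vs. non-secreted Liu-Deber hydropathy.
t_ld, df_ld = welch_t(0.63, 0.19, 14, 0.75, 0.27, 37)

print(f"dG_app:    t = {t_dg:.2f}, df = {df_dg:.1f}")
print(f"Liu-Deber: t = {t_ld:.2f}, df = {df_ld:.1f}")
```

Under these assumptions the ∆Gapp comparison gives a t statistic of about 2.7 and the Liu-Deber comparison about -1.8, which is in line with the reported p < 0.05 and p > 0.05, respectively.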
4.2.2 Choice of δ-helices for membrane insertion studies.
To investigate the ability of δ-helices to insert into the ER membrane via the Sec61
translocon, we generated chimeric constructs based on E. coli leader peptidase (LepB) (Hessa et
al. 2007) with inserts corresponding to five δ-helices (italicized in Table 4.1) that were chosen
from the δ-helix collection based on the following considerations:
• The δ-helix from C. thummi thummi erythrocruorin (1ECA) is capable of in vivo
oligomerization in the E. coli inner membrane bilayer when tested in the TOXCAT assay
(Cunningham et al. 2009).
• Synthetic peptides corresponding to the Aplysia limacina myoglobin (1MBA) and 1ECA
δ-helices can be successfully solvated by detergent in a membrane mimetic system
(Cunningham et al. 2009).
• The δ-helix from Thermus thermophilus seryl-tRNA synthetase (1SRY) has the highest
Liu–Deber score in the δ-helix database.
• The δ-helices from Bacillus megaterium cytochrome P450BM-3 (2BMH) and from
Ricinus communis ricin (2AAI) have relatively low predicted ∆Gapp values (Table 4.1).
These examples also illustrate the potential biological roles of δ-helices. Both erythrocruorin
(1ECA) and myoglobin (1MBA) are members of the globin family of heme-binding proteins.
The corresponding δ-helices in these structures are identical with the so-called H-helices that are
central elements in the folding of the hydrophobic core in globins (Nishimura et al. 2000;
Lecomte et al. 2005). Similarly, the 2BMH δ-helix (termed the I helix in the full-length protein)
is a prominent hydrophobic feature of the three-dimensional structure of cytochrome P450BM-3
(Ravichandran et al. 1993), and the charged Glu 267 residue present in this δ-helix appears to
play a central role in the reaction catalyzed by the protein (Gerber and Sligar 1992;
Ravichandran et al. 1993). Ricin is a toxin produced in the seeds of R. communis and is toxic
because the ricin A chain inactivates eukaryotic ribosomes. The hydrophobic δ-helix in
ricin A, also termed helix E, is shielded from solvent by the amphipathic helix D that turns its
hydrophobic face toward helix E and its hydrophilic face toward the solvent (Morris and Wool
1994). In this case, it is apparently a necessary feature in the active-site architecture (in concert
with interactions with the ribosome) that charged residues in the δ-helix are placed on a
hydrophobic scaffold.
4.2.3 Experimental quantification of membrane insertion of selected δ-helices.
Putting the theoretical hydrophobicity scores to a test, we examined the δ-helices in a
quantitative biological assay for measuring the efficiency of insertion into the ER membrane. In
this method, the segments to be tested are inserted into engineered versions of the LepB protein
where they are flanked by NXT consensus sites for N-linked glycosylation. The membrane
insertion efficiency is then determined by expressing the proteins in vitro in the presence of dog
pancreas microsomes, followed by quantification of the relative amounts of mono- and di-
glycosylated molecules (Hessa et al. 2005a) (Fig. 4.1).
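The quantification step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mono-glycosylated band is taken to report membrane insertion and the di-glycosylated band to report translocation, the apparent free energy is computed as -RT ln(mono/di), and the temperature of 30 °C is assumed rather than taken from the text. The function names and the band intensities are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def insertion_fraction(mono, di):
    """Fraction of glycosylated molecules scored as membrane-inserted.

    Assumption: the mono-glycosylated band reports insertion and the
    di-glycosylated band reports translocation (hypothetical helper).
    """
    total = mono + di
    if mono < 0 or di < 0 or total <= 0:
        raise ValueError("band intensities must be non-negative with a positive sum")
    return mono / total


def apparent_free_energy(mono, di, temp_k=303.15):
    """Apparent free energy of insertion in kcal/mol, -RT ln(mono/di).

    temp_k = 303.15 K (30 degrees C) is an assumed reaction temperature.
    """
    if mono <= 0 or di <= 0:
        raise ValueError("both bands must be positive to take the logarithm")
    return -R_KCAL_PER_MOL_K * temp_k * math.log(mono / di)


if __name__ == "__main__":
    # Illustrative band intensities only, not measured data.
    print(insertion_fraction(10.0, 90.0))
    print(apparent_free_energy(10.0, 90.0))
```

Under these assumptions, equal band intensities correspond to an apparent free energy of zero, and a segment that is mostly translocated gives a positive value.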
Figure 4.1. Evaluation of the membrane integration properties of selected δ-helices. A) DNA
encoding two δ-helices originating from secreted proteins was inserted in the leader peptidase
(Lep) H3 model construct (left panel schematic), in vitro transcribed and translated in the
presence of dog pancreas microsomes. Control reactions were performed in the absence of RMs.
Membrane insertion efficiency was quantified as the ratio between mono- and double-glycosylated,
[35S]-Met-labeled proteins on SDS-PAGE gels. B) Similar to the experiments in (A),
membrane insertion of three δ-helices from non-secreted proteins was measured in the Lep H2
model system (left panel schematic). Positively (blue) and negatively charged (red) residues are
highlighted. Bands originating from mono- and double-glycosylated proteins are indicated with
one and two dots, respectively. Figure adapted from (Norholm et al. 2011).
The five selected δ-helices were evaluated for membrane insertion in either the Nout–Cin
or the Nin–Cout orientation using the appropriate LepB construct. Segments originating from
secreted proteins were tested in the Nout–Cin orientation (i.e., in the orientation they have when
traversing the translocon channel), and segments from cytosolic proteins were tested in the Nin–Cout
orientation (i.e., in the orientation they would have if they were functioning as signal-anchor
sequences). Residues that flank TM helices have previously been shown to affect membrane
insertion efficiency (Lerch-Bader et al. 2008), and therefore, whenever possible (when the
δ-helix was not too close to the N- or C-terminus), five or six of the native residues flanking each
side of the δ-helices were included in the test constructs.
All five δ-helices inserted poorly into the ER membrane (7–18% insertion efficiency, Fig.
4.1), with only the δ-helix from 1SRY being clearly above background level (the amount of
mono-glycosylated molecules is roughly 10% in constructs with a fully translocated test
segment). We also tested the ability of the δ-helices to serve as signal sequences for targeting to
the signal recognition particle–Sec61 translocation pathway by using a LepB construct that
lacks both of the native TM segments; none of the constructs showed any membrane targeting (data
not shown).
In summary, as predicted by the ΔGapp values, none of the five tested δ-helices became
efficiently integrated into microsomal membranes when inserted downstream of bona fide TM
helices, nor did they function on their own as translocon-targeting signal-anchor sequences.
4.2.4 Converting δ-helices into transmembrane segments.
Generally, δ-helices are of intermediate hydrophobicity and contain three or more
charged residues, ostensibly rendering them unfavourable for membrane insertion (Cunningham
et al. 2009). Are the charged residues critical for determining the fate of δ-helices or is the
protein context in which the δ-helices reside more important? We first addressed these questions
by substituting selected charged residues in three different δ-helices in the context of the
chimeric LepB constructs. To put the question in an evolutionary perspective, we chose to
make only single-nucleotide changes in order to estimate the minimum number of evolutionary steps
it would take to convert the δ-helix into a TM helix. Our choice of test δ-helices was based on
ΔGapp predictions of the minimum number of nucleotide mutations that would turn the δ-helix
into a TM helix (i.e., to reach ΔGapp values < 0 kcal/mol).
In the two chosen δ-helices originating from the secreted 1ECA and 2AAI proteins, we
changed stepwise one to three charged residues into hydrophobic residues and obtained high
membrane insertion efficiencies only when all three charged residues were mutated (Fig. 4.2A).
With the amino acid substitutions E6V, G10V and D14V (GAA→GTA, GGT→GTT and
GAC→GTC) in the 1ECA δ-helix, membrane insertion increased from 11% to 38%, whereas the
three substitutions R8C, E19V and R22I (CGT→TGT, GAA→GTA and AGA→ATA) in the
2AAI δ-helix resulted in an increase from 7% to 87%. In contrast, the δ-helix from the cytosolic
protein cytochrome P450BM-3 (2BMH) was converted into an almost fully inserted TM helix
(from 10% to 79% insertion) upon just the single E20V substitution (GAA→GTA) (Fig. 4.2B).
The insertion efficiency was further increased to 92% by the H19L mutation (CAC→CTC). We
conclude that the two δ-helices in the secreted proteins are mutationally rather distant from
becoming TM helices.
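As a small check on the codon changes quoted above, the following Python sketch translates each wild-type and mutant codon with the standard genetic code and counts the nucleotide differences. The helper name `describe_change` and the list `CODON_CHANGES` are introduced here for illustration only; the codon pairs are copied from the text.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Codon pairs quoted in the text (wild type, mutant).
CODON_CHANGES = [
    ("GAA", "GTA"), ("GGT", "GTT"), ("GAC", "GTC"),   # 1ECA
    ("CGT", "TGT"), ("GAA", "GTA"), ("AGA", "ATA"),   # 2AAI
    ("GAA", "GTA"), ("CAC", "CTC"),                   # 2BMH
]


def describe_change(wild_type, mutant):
    """Return (wild-type residue, mutant residue, number of differing bases)."""
    differences = sum(a != b for a, b in zip(wild_type, mutant))
    return CODON_TABLE[wild_type], CODON_TABLE[mutant], differences


if __name__ == "__main__":
    for wt, mut in CODON_CHANGES:
        print(wt, mut, describe_change(wt, mut))
```

Each pair differs at exactly one position, which matches the stated design constraint of single-nucleotide changes.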
Figure 4.2. Conversion of δ-helices into TM segments. Single-nucleotide mutations leading to
the replacement of amino acid residues unfavourable for TM segments were gradually
introduced into selected δ-helices in the Lep H3 and H2 model systems. A) Two δ-helices from
secreted proteins tested in Lep H3 and B) a single non-secreted δ-helix tested in Lep H2.
Underlined residues are those chosen for mutation. Figure adapted from (Norholm et al. 2011).
4.2.5 Converting a soluble protein into a membrane protein.
The ease with which the 2BMH δ-helix was converted into a TM helix in the context of
the LepB protein motivated us to investigate whether the corresponding full-length cytochrome
P450BM-3 protein could be as readily converted into a membrane protein. For this to happen, the
δ-helix would need to be able both to serve as a targeting signal to the signal recognition particle–Sec61
translocation pathway and to insert efficiently into the ER membrane. To monitor
targeting and membrane insertion, we expressed the full-length protein with engineered
glycosylation sites on either the N- or the C-terminal side of the δ-helix segment (Fig. 4.3A). As
a control, we expressed the unmodified protein in parallel. As expected, wild-type cytochrome
P450BM-3 showed no sign of membrane insertion, but in contrast to our findings with the LepB
constructs, neither did the corresponding singly or doubly mutated versions. This result suggests
that additional factors besides the hydrophobicity of the δ-helix help prevent targeting of the
protein to the Sec61 translocon.
We speculated that upstream sequence elements might be a determining factor in preventing
membrane targeting, for example, by sequestering a downstream hydrophobic segment in a
folding nucleus and hence preventing it from acting as a targeting signal. To test this hypothesis,
we made two deletions from the N-terminal portion of cytochrome P450BM-3, either including
or excluding a small helical segment that immediately precedes the δ-helix (Fig. 4.3B). Again,
glycosylation sites were engineered on both sides of the δ-helix to follow membrane integration,
and the native as well as the singly and doubly mutated δ-helices were tested. The only construct
exhibiting a detectable amount of glycosylation was the doubly mutated construct with all
upstream secondary structural elements deleted and with an engineered glycosylation site at the
C-terminal side (Fig. 4.3A). Glycosylation of the protein was confirmed by treating the sample
with endoglycosidase H (endoH) (Fig. 4.3C).
Together, these results suggest that the presence of charged residues in the δ-helix as well
as upstream sequence elements efficiently prevent targeting to the ER but that, upon removal of
these elements, it is possible to convert a soluble protein into a membrane protein.
Figure 4.3. Conversion of a soluble protein into a membrane protein. A) Full-length or two
different truncated versions (Δ1–226 and Δ1–240, numbered according to the deleted amino
acids in the protein) of cytochrome P450BM-3 were expressed with zero to two mutations in the
δ-helices. Membrane integration was assessed by engineering a glycosylation site on the N-terminal
or C-terminal side of the δ-helix (indicated by N or C below the gel lanes), followed by
in vitro transcription and translation in the presence of dog pancreas microsomes. Results were
compared with constructs without glycosylation sites (-) or without RMs. A membrane-integrated,
glycosylated protein is indicated by the red arrow. B) Two views of the placement of
the δ-helix (coloured violet) and the immediate upstream helical structure (coloured red) in the
three-dimensional structure of cytochrome P450BM-3 [PDB ID 2BMH]. C) Endoglycosidase
treatment of the Δ1–240 truncated, double-mutated version of cytochrome P450BM-3. The
glycosylated proteins are indicated by red arrows. Figure adapted from (Norholm et al. 2011).
4.3 Discussion.
Some marginally hydrophobic protein segments, such as the voltage sensor S4 helix,
clearly insert and function as TM helices, whereas other segments that display many of the
characteristics of a bona fide TM helix are found in globular proteins and do not form TM
helices. What are the roles of these so-called δ-helices in soluble proteins, and how is
mislocalization of such segments to a membrane prevented?
We have tested five δ-helices for translocon-mediated insertion into dog pancreas rough
microsomes (RMs). In the context of chimeric LepB proteins, none of the δ-helices were
inserted into the membrane of the rough microsomes, in agreement with their predicted ΔGapp
values. However, it should be noted that some protein segments integrate into membranes,
entering and exiting on the same side without spanning the entire bilayer (so-called re-entrant
regions; Viklund et al. 2006), and the experimental setup used here does not rule out such
partial membrane integration. Three δ-helices (in the secreted 1ECA and 2AAI proteins and in
the cytosolic 2BMH protein) were readily converted into TM segments by one- to three-nucleotide
mutations, but in the native 2BMH protein, the corresponding mutations had no effect
in terms of membrane targeting. Only when all secondary structure upstream from the 2BMH
δ-helix was removed did a small fraction of the truncated protein with a mutated δ-helix insert into
microsomes.
Upstream TM helices aid insertion of marginally hydrophobic TM helices in the Shaker
and KAT1 potassium channels (Sato et al. 2002; Zhang et al. 2007), and our results indicate that,
in globular proteins, upstream secondary structure may be important for preventing membrane
insertion of marginally hydrophobic δ-helices. Our findings suggest that both secreted and
cytosolic globular proteins containing marginally hydrophobic helical segments are well
protected, from both a mechanistic and an evolutionary perspective, from being inserted into the
ER.
Interestingly, we note that the charged residues that must be mutated to convert the ricin
(2AAI) δ-helix into a TM segment constitute the active site (Fig. 4.2) (Ready et al. 1991). This
is one example illustrating the role of a hydrophobic helix in a soluble protein and how such a
situation can be stabilized by interactions with neighbouring amphipathic structures. It is also
noteworthy that ricin is not only secreted from the site of synthesis but also actually internalized
by cells, passed into the lumen of the ER and reverse translocated into the cytoplasm where it
targets the ribosome (Ready et al. 1991; Wales et al. 1992; Wales et al. 1993). This situation
further emphasizes the notion that the δ-helix must be well protected from becoming membrane
integrated.
In summary, our findings suggest that when hydrophobic character is localized in
δ-helices within soluble proteins, these helices may modulate the folding, three-dimensional
structure or active site of the proteins. Evenly distributed charged residues on these δ-helices
play key roles, along with upstream secondary structure, in helping to prevent unintended
membrane integration. The observation that a single residue of increased hydrophobicity was
able to modify the insertion potential of a δ-helix from the cytosolic protein 2BMH, while up to
three residues were required to achieve elevated membrane insertion in the secreted examples
(Fig. 4.2), may reflect the fact that δ-helices derived from cytosolic proteins display relatively
lower ΔGapp values because the latter proteins do not encounter the translocon. Hence, δ-helices
from secreted proteins remain mutationally and perhaps evolutionarily more distinct both from
bona fide TM helices and from δ-helices from cytosolic proteins. Finally, our analysis of the
2BMH δ-helix represents one of the few examples [see also (Chen and Kendall 1995)] of a
situation where a δ-helix in a soluble protein has been mutated into a segment that behaves,
albeit with limited efficiency, as a TM helix of an integral membrane protein.
4.4 Materials and Methods.
4.4.1 DNA engineering.
Oligonucleotides used in this work were purchased from Eurofins MWG Operon and are
listed in Appendix 1, Table A1.4. The three LepH3, H2 and H1 constructs described previously
(Hessa et al. 2005a; Lundin et al. 2008) were prepared for uracil-excision cloning (Nour-Eldin et
al. 2006) by the polymerase chain reaction (PCR) with the oligonucleotides LepH3-F and
LepH3-R, LepH2-F and LepH2-R or LepH1-F and LepH1-R. The empty pGEM vector was
prepared for cloning by PCR with the oligonucleotides pGEM-F and pGEM-R. DNAs encoding
the δ-helices were ordered as synthetic oligonucleotides and inserted into the different Lep
model constructs using uracil-excision DNA engineering (Nour-Eldin et al. 2006; Norholm
2010). Site-directed mutagenesis was performed with the QuikChange Multi Site-directed
Mutagenesis kit (Agilent Technologies). Genomic DNA was isolated from B. megaterium using
the ChargeSwitch kit (Invitrogen) and used as a PCR template to amplify the cytochrome
P450BM-3 (2BMH) gene. Full-length and truncated versions of 2BMH were cloned into the
pGEM vector by uracil excision as described above.
4.4.2 Membrane insertion assay and endoH treatment.
The membrane insertion efficiency assay was performed using the TnT SP6 Quick Coupled
Transcription/Translation System (Promega) in the presence of dog pancreas microsomes,
followed by quantification of [35S]-Met-labeled, differentially glycosylated products using
SDS-PAGE, as previously described (Hessa et al. 2005a). EndoH (New England Biolabs) treatment
was performed according to the manufacturer's instructions.
Chapter 5. Optimizing synthesis and expression of transmembrane
peptides and proteins.
This work was published, in part, by Cunningham, F., Deber, C.M. Methods 41: 370-80 (2007).
Author contributions: FC designed and performed research on optimization of construct
expression. FC and CMD analyzed data. FC and CMD wrote the paper.
5.1 Introduction.
5.1.1 Fragmentation approach to study membrane protein folding.
The need to gather structural information on membrane proteins is highlighted by the fact
that they are implicated in a number of diseases; however, high-resolution structural data remain
elusive due to their high hydrophobicity. A detailed understanding of how individual α-helical
components interact within the membrane environment to form larger structures is critical in
understanding the bigger picture of membrane protein folding, and ultimately their function.
Fortunately, the study of membrane protein folding is simplified by the fact that individual TM
segments, or at least small segments of TM proteins, represent independent folding domains and
can be studied as such. The two-stage model for membrane protein folding suggests that
individual TM α-helices, or small membrane protein segments, can represent autonomous
folding domains; for example, a successful strategy to gather structural information would
involve studying minimal structural units of TM segments consisting of two α-helices and the
connecting loop. As TM segment folding appears to be independent of its neighboring segments,
one can thus utilize these hairpin constructs to study helix-helix interactions, and the detailed
manner of α-helix segment association within the membrane bilayer (Johnson et al. 2004).
Our laboratory has worked towards optimizing synthesis and expression of TM segments
and fragments of membrane proteins in order to facilitate the study of packing interactions
between TM helices, and to obtain structural information regarding these segments (Melnyk et
al. 2001; Johnson et al. 2004; Rath et al. 2009a). Success in expressing hairpin constructs of
CFTR consisting of TM3, TM4 and the connecting loop has provided clues regarding the
solvation of the construct in membrane mimetic environments (Rath et al. 2009a), and the
importance of sequence context on hairpin structure (Wehbi et al. 2008). As well, hairpin
constructs have been utilized in combination with reverse-phase high performance liquid
chromatography to mimic the process of TM segment partitioning into the membrane bilayer
(Mulvihill and Deber 2010). The importance of this work can be extended to disease systems, as
numerous disease-causing mutations in membrane proteins, including CFTR, occur in the
membrane domain, where they can alter both the structure of the construct and the manner in
which these mini-membrane proteins are coated with lipids and detergents (Wehbi et al. 2008;
Rath et al. 2009a).
To date, studies of membrane proteins have generally been impeded by the inherent
difficulty of producing the protein in the quantities required for structural analysis. Some
success, however, has been observed via heterologous expression in E. coli. Examples include
TM3/4 of the CFTR (Peng et al. 1998; Therien et al. 2001; Choi et al. 2004; Choi et al. 2005),
GpA (Melnyk et al. 2004), MCP (Wang and Deber 2000; Melnyk et al. 2004), and the gamma
subunit of the Na,K-ATPase or sodium pump (Therien and Deber 2002). This largely empirical
process of heterologous membrane protein expression is complex and a variety of conditions
must be considered in order to obtain significant quantities of protein. This chapter focuses on
the optimized expression of fragments of the membrane protein CFTR that are expanded from those that
have previously been studied (Peng et al. 1998; Therien et al. 2002) in order to understand
the packing interactions of the helices that comprise the membrane-spanning region. While
changes in tertiary contacts and lipid solvation have been observed due to hydrophobic-to-polar
amino acid mutations in a CFTR two-TM system (Wehbi et al. 2008; Rath et al. 2009a), this
circumstance raises the question of the consequences of polar mutations in a larger system,
building toward an intact TM domain. To expand on the two-TM helical hairpin studies, the
structural consequences of introducing a polar residue in a largely hydrophobic system will be
studied in a three-TM system consisting of TM2, TM3 and TM4 (TM2/3/4) with the
interconnecting loop regions.
5.2 Domain fragments of membrane proteins: Application to the Cystic Fibrosis
Transmembrane Conductance Regulator.
5.2.1 The Cystic Fibrosis Transmembrane Conductance Regulator.
The Cystic Fibrosis Transmembrane Conductance Regulator was first identified as the
gene responsible for cystic fibrosis in 1989 (Kerem et al. 1989; Riordan et al. 1989; Rommens et
al. 1989). Since that time, it has been determined that CFTR is a membrane protein that is
expressed in the membranes of ciliated epithelial cells of airway passages (Kreda et al. 2005),
and in the fluid-secreting cells of the submucosal glands (Wu et al. 2007). This 1480-amino acid
protein is a member of the ABC transporter family and functional studies have revealed that
CFTR is a cAMP-regulated Cl⁻ channel (Kartner et al. 1992). CFTR in the airway
epithelium regulates ion transport, which controls the volume of airway surface liquid, which in
turn affects the levels of mucus hydration (Zhang et al. 2009). It contains five domains, arranged
in two homologous halves: two TM domains (Rommens et al. 1989) and two nucleotide binding
domains (NBDs) (Lewis et al. 2004; Atwell et al. 2010) separated by a regulatory domain (R
domain) (Baker et al. 2007) (Fig. 5.1). CFTR is unusual as an ABC family member, as it is the
only member known to function as an ion channel, with the pore being formed through the
association of the TM domains (Riordan 2008). Transitions between the open and closed states of
CFTR are driven by conformational movements within and between the two NBDs, which in turn cause
rearrangements among the membrane spanning segments, resulting in a shift in the equilibrium
between the open and closed conformations (Cheung et al. 2008). The gating process of CFTR is
regulated through phosphorylation of the R domain by protein kinase A, and the probability
of channel opening is related to the extent of phosphorylation in the R domain (Riordan 2008).
Figure 5.1. Structure of CFTR. The protein is arranged in two homologous halves with each
half consisting of a TM domain and a nucleotide binding domain. The two halves are separated
by a regulatory domain.
The membrane spanning domains (MSDs) of CFTR form the anion channel portion of
the protein where the chloride translocation pathway is located, most likely at the interface
between the two MSDs. Long-range signals initiated through conformational changes in the
NBDs are likely responsible for conformational changes in the membrane domains that are
required for transporter function (Ramjeesingh et al. 2003). It has been shown separately that
CFTR constructs consisting of either MSD1 or MSD2 alone are capable of mediating anion flux,
with these constructs forming dimeric structures to re-create a pore structure presumably similar to
that in the intact CFTR (Schwiebert et al. 1998; Ramjeesingh et al. 2003). Mutations in TM5
and TM6 in MSD1 have indicated that these TMs are particularly important in CFTR function,
forming an essential part of the channel pore (Schwiebert et al. 1998).
5.2.2 Disease-causing mutations in CFTR.
Cystic fibrosis (CF) is the most common autosomal recessive disease among the Caucasian
population with a prevalence of 1 in 2500 (Cheung and Deber 2008). While CF primarily affects
the lungs, pancreas and reproductive organs of individuals, the root cause of these symptoms lies
in the malfunctioning of CFTR. In secretory epithelial cells, the resulting reduction in chloride permeability
impairs fluid and electrolyte secretion, causing luminal dehydration, which leads to excessive
mucus accumulation in the lungs. The life-limiting aspect of CF is related to the loss of
pulmonary function as a result of these clogged airways, with persistent inflammation and recurring
respiratory infections that are difficult to treat with antibiotics. To date, over 1500 CF-causing
mutations have been identified throughout the protein
(http://www.genet.sickkids.on.ca/cftr), causing varying levels of severity, and approximately 300
of these mutations occur in the TM domains. The principal CF-related defect in CFTR is caused,
however, by a single amino acid deletion in NBD1, which is found on at least one allele in
approximately 90% of CF patients. This deletion of a single Phe residue (ΔF508) results in the
failure of CFTR expression at the cell surface, as this mutant protein is structurally less stable
than the WT (Riordan 2008). The ΔF508 mutation is thought to decrease CFTR structural
stability along the folding pathway, primarily affecting trafficking of the protein and leading to
retention in the ER and eventual degradation by the proteasome (Cheng et al. 1990).
ATP-binding cassette (ABC) transporters actively transport chemically diverse substrates
across the lipid bilayers of cellular membranes. While no high-resolution structure is currently
available for CFTR, advances have been made in determining structures of CFTR family
members, including Sav1866 from Staphylococcus aureus (PDB ID: 2HYD) (Dawson and
Locher 2006) and MsbA from Salmonella typhimurium (Ward et al. 2007). The central ABC
transporter structure consists of two transmembrane domains (TMDs) that provide a
translocation pathway, and two cytoplasmic, water-exposed nucleotide-binding domains that
hydrolyse ATP. Bacterial ABC transporters are generally expressed as 'half-transporters' that
contain one TM domain fused to an NBD, which describes the structure of Sav1866 and MsbA: a
homodimer within the membrane bilayer, with each monomer containing six TM segments
(Dawson and Locher 2006; Ward et al. 2007). Sequence and biochemical similarities between
CFTR and ABC transporters for which structures are available indicate that there are likely
strong structural similarities among these proteins (Serohijos et al. 2008). A structural model
of CFTR was designed from the full-length structure of Sav1866, as both functional proteins
contain 12 TM helices and their intracellular loops are of similar lengths (Serohijos et al. 2008).
5.2.3 Fragments of CFTR as a minimal tertiary model.
Optimally, structural investigations on intact CFTR would be performed to understand
the effects of mutations on the protein structure, but this remains a challenge as CFTR is difficult
to express in quantities useful for high-resolution study, and local structural effects of
point mutations in the large protein may fall below the limit of biophysical detection (Wehbi et
al. 2008). The fragmentation approach to studying small segments of membrane spanning
domains can find application with a protein such as CFTR, as numerous CF-causing amino acid
mutations have been identified in the TM domain. Individual peptides corresponding to the TM
segments of CFTR membrane domain 1 have been investigated structurally (Wigley et al. 1998),
along with TM hairpin constructs and mutants of CFTR TM3/4 (Wehbi et al. 2008; Rath et al.
2009a; Mulvihill and Deber 2010) and a double-spanning peptide consisting of CFTR TM5/6
(Choi et al. 2005). These peptides and hairpin constructs represent the minimal tertiary structural
units for study of membrane protein folding, and as mentioned, the effect of mutation along these
segments can affect the hairpin construct in in vitro systems in several ways. The information
made available by the CF mutation database indicates that a large number of CF-phenotypic
mutations are non-conservative, with nonpolar amino acids mutated to polar residues and vice
versa. The consequences of introducing polar mutations can be seen through producing non-
native side chain−side chain hydrogen bonds between TM helices (Therien et al. 2001), or by
inhibiting membrane insertion of TM helices due to reduced hydrophobicity (Choi et al. 2005;
Rath et al. 2009a). As such, these mini-membrane protein models represent excellent choices to
investigate local structural changes due to mutation.
5.3 Triple strand construct from the Cystic Fibrosis Transmembrane Conductance
Regulator transmembrane domain.
5.3.1 Construct information.
To assess structural changes in a TM domain system larger than two α-helices, a triple
strand fragment containing CFTR TM helices 2, 3 and 4 and the intervening loop regions was
constructed. The limits of the construct were chosen to include all residues identified as
membrane spanning by a hydrophobic moment plot (Eisenberg et al. 1984; Riordan et al. 1989),
as well as the annotated TM segments on Swissprot (http://expasy.org/sprot/). Amino acids 110-
245 were chosen for inclusion in the CFTR-TM2/3/4 construct as this residue stretch
encompassed all three TM segments (Fig. 5.2). We utilized the TM segment boundaries as
annotated by Swissprot (CFTR accession #P13569) in the design of our construct, but it has been
shown previously that defining the boundary residues differently can affect the ability of TM
α-helices to associate within membrane environments (Ng and Deber 2010). To assess the putative
TM2/3/4 residues in CFTR we also submitted the amino acid sequence to several TM-predicting
programs that are currently available (Table 5.1). Although there is a consistent overlap of
hydrophobic residues calculated as the core of the TM segment, the predicted locations of the
helix N- and C-termini vary depending on the program used. Predicted residues with a percent
consensus among the programs greater than 80% were considered as a portion of the TM
segment (Table 5.1). Since the residues given by the prediction consensus closely match those
annotated by Swissprot and include all residues predicted in this manner, the Swissprot
numbering was used for ease and similarity to previously published materials.
Table 5.1. Predicted membrane spanning regions of CFTR TM2/3/4
TM      Prediction consensus a      Swissprot numbering
TM2     121-138                     118-138
TM3     196-214                     195-215
TM4     221-240                     221-241
a The TM prediction programs used in this analysis were: MemBrain (Shen and Chou 2008), TOPPred II (Claros
and von Heijne 1994), SPLIT4 (Juretic et al. 2002), DAS (Cserzo et al. 1997), MEMSAT3 (Jones 2007),
HMMTOP2 (Tusnady and Simon 2001), TMHMM2 (Krogh et al. 2001), TMPred (Hofmann and Stoffel 1993),
PHDhtm (Rost et al. 1996), SOSUI (Hirokawa et al. 1998) and TM Finder (Deber et al. 2001). Default values were
used in all programs.
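The consensus rule described above (a residue is counted as part of a TM segment when more than 80% of the programs predict it) can be illustrated with a short Python sketch. The function `consensus_segments` is a hypothetical helper, and the five sets of segment boundaries in the example are invented for illustration; they are not the outputs of the programs listed in Table 5.1.

```python
def consensus_segments(predictions, threshold=0.8):
    """Return residue ranges predicted by more than `threshold` of the programs.

    predictions: one entry per program, each a list of (start, end) residue
    ranges, inclusive. Returns a list of (start, end) consensus ranges.
    """
    n_programs = len(predictions)
    if n_programs == 0:
        return []

    # Count, for every residue, how many programs place it in a TM segment.
    votes = {}
    for segments in predictions:
        covered = set()
        for start, end in segments:
            covered.update(range(start, end + 1))
        for residue in covered:
            votes[residue] = votes.get(residue, 0) + 1

    # Keep residues whose agreement is strictly greater than the threshold.
    kept = sorted(r for r, v in votes.items() if v / n_programs > threshold)

    # Merge consecutive residues into contiguous ranges.
    runs = []
    for residue in kept:
        if runs and residue == runs[-1][1] + 1:
            runs[-1][1] = residue
        else:
            runs.append([residue, residue])
    return [tuple(run) for run in runs]


if __name__ == "__main__":
    # Invented boundaries for a single helix from five hypothetical programs.
    example = [[(118, 138)], [(120, 139)], [(121, 138)], [(121, 140)], [(119, 138)]]
    print(consensus_segments(example))
```

In this invented example the hydrophobic core is shared by all five predictions while the termini differ, which mirrors the pattern described in the text.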
By cloning the cDNA of WT human CFTR coding for the region of the protein from
amino acids 110-245 into the inducible pET32a(+) expression vector (Novagen), a soluble
thioredoxin (Trx) fusion protein was produced. Low intracellular levels of heterologous
eukaryotic protein expression are often observed in bacteria, and may be caused by protein
misfolding, resulting in recognition by the host as a foreign protein which leads to rapid
degradation (Marston 1986). Expression of eukaryotic proteins in bacteria has been shown to
increase dramatically by fusing the eukaryotic gene of interest to a bacterial gene at the N-
terminus (Marston 1986), where the bacterial protein is usually a highly expressible, soluble
protein. Overexpression of this chimeric construct then exploits the efficient translation of the
bacterial fusion partner by the host machinery.
Thioredoxin (Trx) is a small, ubiquitous, heat-stable protein which participates in various
redox reactions and catalyzes dithiol-disulfide exchange reactions. More recently, it has been
described that the E. coli Trx interacts with unfolded and denatured proteins in a manner similar
to molecular chaperones that are involved in protein refolding after cellular stress (Kern et al.
2003). The Trx-TM2/3/4 construct also contains His-tags and an S-tag for nickel-affinity
column purification, and immunological detection purposes, respectively (Fig. 5.2).
Figure 5.2. Construct of the pET32a(TM2/3/4) designed for the expression of the Trx-CFTR
TM2/3/4 fusion protein. The cDNA corresponding to amino acids 110-247 of CFTR was
subcloned into the NcoI/XhoI restriction sites of the pET32a vector. The sequence of amino
acids 110-245 is provided, with the individual TM segments underlined. These residues
correspond to those identified as membrane spanning at http://expasy.org/sprot/. The Cys
residues in the construct (Cys 125 and Cys 225) were mutated to Ala and are indicated in red.
Cleavage of the purified soluble protein with thrombin will result in the final TM2/3/4
construct consisting of TM2/3/4 of CFTR, the S-tag, and one His-tag. The goal of this construct
design is to produce quantities of the protein in E. coli suitable for biophysical characterization
(Peng et al. 1998; Therien et al. 2002) (Fig. 5.2).
5.3.1.1 Cloning of CFTR TM2/3/4 fragment: Methodology.
The pET32a(+) (Novagen) vector was chosen as the vehicle for heterologous expression
of TMs 2-4 of human CFTR in E. coli. This vector contains several unique restriction enzyme
sites in its multiple cloning region, allowing for insertion of the DNA fragment of interest into
the vector. The restriction sites NcoI and XhoI were chosen for cloning, which results in the
insertion of the DNA encoding TM2/3/4 downstream of the E. coli protein Trx, creating an N-terminal
fusion construct.
The 5′ and 3′ primers for PCR amplification of the desired segment of the human CFTR
cDNA were designed to contain restriction sites for NcoI and XhoI, respectively, with the goal
of adding these sites to the PCR-amplified fragment (Table 5.1). As suggested by the supplier of
the restriction enzymes (New England Biolabs), six additional base pairs were added to the 5′
end of the restriction sites to allow for maximum efficiency of the restriction enzyme when
cleaving the PCR-amplified fragment. PCR amplification of the human CFTR cDNA
yields DNA encoding amino acids 110-245, inclusive.
Table 5.2. Primers used in the PCR amplification of the human CFTR cDNA.
Primer Name    Primer Sequence
5′ primer      5′ TAGCTTCCATGGACCCGGATAACAAGG 3′ (a)
3′ primer      5′ CTCTGAACTCGAGATCCCATCATTCTCCC 3′ (b)
(a) Contains the NcoI restriction enzyme site (CCATGG). (b) Contains the XhoI restriction
enzyme site (CTCGAG).
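As a quick check on the primer design, the restriction sites can be located in the primer sequences programmatically. The following is a minimal sketch: the primer sequences are copied from the table above, the recognition sequences are the standard ones for NcoI and XhoI, and the function name is hypothetical.

```python
# Recognition sequences for the two enzymes named in the text.
SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG"}

# Primer sequences as listed in the primer table (written 5' to 3').
PRIMERS = {
    "5' primer": ("TAGCTTCCATGGACCCGGATAACAAGG", "NcoI"),
    "3' primer": ("CTCTGAACTCGAGATCCCATCATTCTCCC", "XhoI"),
}


def find_site(primer: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return primer.upper().find(site.upper())


if __name__ == "__main__":
    for name, (seq, enzyme) in PRIMERS.items():
        idx = find_site(seq, SITES[enzyme])
        # idx is also the number of extra bases lying 5' of the site.
        print(f"{name}: {enzyme} site starts at base {idx + 1}, "
              f"with {idx} bases 5' of it")
```

The index returned for each primer equals the number of bases preceding the site, which is the overhang that helps the enzyme cut close to the end of a PCR product.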
PCR amplification reactions were performed using 5 ng of template DNA (pET32a(+)
containing the full length CFTR cDNA), 1 μg of each oligonucleotide primer (Table 5.1), 1 U
Vent Polymerase (New England Biolabs), 0.2 mM dNTP mix (New England Biolabs), 100 mM
MgSO4, and 1 x ThermoPol Reaction buffer (supplied by New England Biolabs) with a total
reaction volume of 50 μl. The reactions were overlaid with 30 μl of mineral oil. Reactions were
initially incubated at 95°C for 30 seconds, followed by 35 cycles of: i) 95°C for 30 seconds; ii)
45°C for 60 seconds; and iii) 72°C for 60 seconds. A final extension at 72°C was carried out for
10 minutes. The PCR amplification product was purified from the PCR reaction mixture using
the QIAquick PCR purification kit (Qiagen).
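The thermal cycling program described above can be written down as data, which makes the total programmed hold time easy to compute. This is a minimal sketch using only the temperatures and times stated in the text; the class and function names are hypothetical, and ramp times between steps (which depend on the instrument) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time


INITIAL_DENATURATION = Step(95, 30)
CYCLE = (Step(95, 30), Step(45, 60), Step(72, 60))  # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 10 * 60)


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding instrument ramp times."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds
            + N_CYCLES * per_cycle
            + FINAL_EXTENSION.seconds)


if __name__ == "__main__":
    print(f"Programmed hold time: {total_hold_seconds() / 60:.0f} min")
```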
The PCR amplified fragment containing amino acids 110-245 of human CFTR was
digested with NcoI and XhoI in order to produce a fragment that could be cloned into the
pET32a(+) vector to form the Trx-TM2/3/4 expression construct. The empty pET32a(+) was
also treated with NcoI and XhoI to produce a linearized vector.
The isolated and purified PCR fragment was then ligated into the linearized pET32a(+)
vector using T4 DNA Ligase (New England Biolabs) with a 3:1 insert to vector mass ratio.
Ligation reactions were then used to transform supercompetent XL1-Blue E. coli cells
(Stratagene), and colonies were selected for screening of recombinant plasmids. DNA
sequencing was used to confirm the successful ligation reaction and that no mutations were
introduced into the construct. Additionally, the native Cys residues, Cys 125 and Cys 225, were
mutated to Ala to facilitate working with the construct under non-reducing conditions (Fig. 5.2).
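Because the ligation ratio above is given by mass, it can be useful to see what it implies in molar terms. The sketch below is illustrative only: the insert length (136 codons, about 408 bp, ignoring the short primer overhangs) and the nominal vector length (about 5900 bp for pET32a(+)) are assumptions introduced here, not values stated in the text, and the function names are hypothetical.

```python
def insert_mass_ng(vector_ng: float, mass_ratio: float = 3.0) -> float:
    """Mass of insert needed for a given insert:vector mass ratio."""
    return vector_ng * mass_ratio


def molar_ratio(mass_ratio: float, insert_bp: int, vector_bp: int) -> float:
    """Convert an insert:vector mass ratio into a molar ratio.

    For double-stranded DNA, moles are proportional to mass / length,
    so the molar ratio is the mass ratio scaled by vector_bp / insert_bp.
    """
    return mass_ratio * vector_bp / insert_bp


if __name__ == "__main__":
    # Assumed lengths (see lead-in); adjust to the actual fragment sizes.
    print(f"{insert_mass_ng(50.0):.0f} ng insert per 50 ng vector")
    print(f"molar ratio ~ {molar_ratio(3.0, 408, 5900):.0f}:1")
```

Under these assumed lengths, a short insert at a 3:1 mass ratio corresponds to a much larger molar excess over the vector.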
5.3.2 Protein Expression of CFTR TM2/3/4 under successful CFTR TM3/4 conditions.
While our lab and others have had success with heterologous expression of membrane
proteins, various parameters of E. coli expression must be empirically optimized for each protein
as the best possible expression conditions might not be universal, and modulating expression
conditions can drastically alter protein yields (Tate 2001; Cunningham and Deber 2007).
Growth temperature, concentration of induction agents, a variety of growth media and the E. coli
strain employed for expression are all factors contributing to protein expression levels, and were
systematically explored for the CFTR TM2/3/4 fragment.
Based on previous success in our lab of expression of CFTR TM3/4 in E. coli in
milligram quantities (Peng et al. 1998; Therien et al. 2002), optimization of CFTR TM2/3/4
expression was carried out. Initial expression trials of CFTR TM2/3/4 focused on growth and
induction conditions which were successful for the CFTR TM3/4 construct: transformation into
BL21(DE3) cells, growth in TB media (0.4% (v/v) glycerol, 2.4% (w/v) Bacto yeast
extract, 1.2% (w/v) Bacto tryptone, 17 mM KH2PO4, 72 mM K2HPO4, Appendix 2) until mid-log
phase (OD600 ≈ 0.6) at 37⁰C in the presence of ampicillin with shaking at 250 rpm. The BL21
(DE3) E. coli strain carries a chromosomal copy of the gene for the T7 RNA polymerase under
control of lacUV5 promoter (Studier and Moffatt 1986). Induction of the T7 polymerase by
isopropyl β-D-1-thiogalactopyranoside (IPTG) allows controlled expression of the Trx-TM2/3/4
gene which is placed downstream of the T7 RNA polymerase-binding site on the expression
vector. Protein expression was induced with 1 mM IPTG, followed by overnight growth at
25⁰C. The E. coli cells were then harvested via centrifugation (Peng et al. 1998; Therien et al.
2002) and lysed, and the expressed Trx-TM2/3/4 construct was purified via nickel affinity
chromatography. Unfortunately, for the Trx-TM2/3/4 construct, levels of protein expression
were barely detectable by Coomassie staining and Western blotting (Fig. 5.3). To visualize the
levels of protein expression, an antibody to the S-tag was used (Fig. 5.2). The calculated
molecular weight of the Trx-TM2/3/4 full length fusion construct is 33.5 kDa (Fig. 5.3, Lane 1
and 2). The lower major band on the blot, with an apparent molecular weight of 20.9 kDa (Fig.
5.3, Lane 1) is most likely a degraded version of the full length product with cleavage occurring
in the large 56 amino acid loop between TM segments 2 and 3. Expression of the full-length
Trx-TM2/3/4 was not detectable on gels stained with Coomassie Blue.
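The TB medium composition quoted earlier in this section is given as percentages and millimolar concentrations; converting these to weighable amounts is simple arithmetic. The sketch below assumes anhydrous phosphate salts with standard molar masses, and the function name is hypothetical. Appendix 2 remains the authoritative recipe.

```python
# Molar masses in g/mol for the anhydrous phosphate salts (assumed forms).
MOLAR_MASS_G_PER_MOL = {"KH2PO4": 136.09, "K2HPO4": 174.18}


def tb_recipe(volume_l: float) -> dict:
    """Amounts needed for volume_l litres of TB at the stated concentrations."""
    return {
        # % (w/v) is grams per 100 mL, i.e. 10 g/L per percentage point.
        "yeast extract (g)": 2.4 * 10 * volume_l,
        "tryptone (g)": 1.2 * 10 * volume_l,
        # % (v/v) is mL per 100 mL, i.e. 10 mL/L per percentage point.
        "glycerol (mL)": 0.4 * 10 * volume_l,
        # mM -> mol/L -> g/L via the molar mass.
        "KH2PO4 (g)": 17e-3 * MOLAR_MASS_G_PER_MOL["KH2PO4"] * volume_l,
        "K2HPO4 (g)": 72e-3 * MOLAR_MASS_G_PER_MOL["K2HPO4"] * volume_l,
    }


if __name__ == "__main__":
    for component, amount in tb_recipe(1.0).items():
        print(f"{component}: {amount:.2f}")
```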
Figure 5.3. Western blot of expression trial of CFTR TM2/3/4 in conditions which were
successful for CFTR TM3/4 (Therien et al. 2002). TM2/3/4 expresses poorly under these
expression conditions. Expression trial was conducted with BL21 (DE3) E. coli cells in TB
media, OD600 ≈ 0.6. Lane 1: [IPTG] = 1.0 mM with 25⁰C post-induction temperature. Lane 2:
uninduced sample. Positions of SeeBlue molecular weight markers (Invitrogen) are indicated in
kDa.
5.3.3 Heterologous expression of CFTR TM2/3/4.
In order to optimize the expression of the Trx-TM2/3/4 construct, various expression
parameters were explored. The effects of varying the post-induction temperature, the concentration
of induction agent, the type of E. coli cell utilized for expression, and the expression media
were all investigated. As the conditions used to produce milligram quantities of TM3/4 were
unsuccessful in producing biophysically useful amounts of TM2/3/4 (Fig. 5.3), a series of
experiments optimizing the expression of the construct were carried out, as depicted in the flow
chart in Figure 5.4.
Figure 5.4. Flow chart showing the conditions used to optimize protein expression for CFTR
TM2/3/4. Optimization of culture growth can be divided into four sets of experiments: 1) the E.
coli cell type used for heterologous protein expression; 2) the investigation of temperature at
which E. coli cultures should be induced for protein expression; 3) the investigation of the IPTG
concentration; and 4) the media type used for protein expression. The 'X' at the end of each
optimization pathway indicates insufficient quantities of TM2/3/4 construct produced for
biophysical analysis.
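The four sets of experiments in the flow chart define a small search space over expression conditions. The sketch below enumerates that space in two ways: as a full factorial grid, and as a one-factor-at-a-time series around a baseline, which is closer to how the optimization is described here. The factor levels are taken from the text; the baseline choice and all function names are hypothetical.

```python
from itertools import product

# Factor levels as named in the text.
FACTORS = {
    "strain": ("BL21(DE3)", "BL21 codon (+)", "C43(DE3)"),
    "post_induction_temp_c": (15, 25, 37),
    "iptg_mm": (0.1, 1.0),
    "medium": ("LB", "TB", "M9 rich", "M9 minimal"),
}

# Hypothetical baseline condition used only for illustration.
BASELINE = {
    "strain": "BL21(DE3)",
    "post_induction_temp_c": 25,
    "iptg_mm": 0.1,
    "medium": "M9 minimal",
}


def full_factorial() -> list:
    """Every combination of every factor level."""
    names = list(FACTORS)
    return [dict(zip(names, combo)) for combo in product(*FACTORS.values())]


def one_factor_at_a_time(baseline: dict) -> list:
    """The baseline plus each condition that differs from it in one factor."""
    trials = [dict(baseline)]
    for name, levels in FACTORS.items():
        for level in levels:
            if level != baseline[name]:
                trial = dict(baseline)
                trial[name] = level
                trials.append(trial)
    return trials


if __name__ == "__main__":
    print(len(full_factorial()), "conditions in the full grid")
    print(len(one_factor_at_a_time(BASELINE)), "conditions varying one factor")
```

Varying one factor at a time needs far fewer cultures than the full grid, at the cost of missing interactions between factors (for example, between medium and temperature).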
5.3.3.1 E. coli strain.
The majority of membrane proteins are found naturally in very small quantities, resulting
in the need to artificially over-express these proteins in systems that are capable of generating
levels sufficient for biophysical studies after purification. In order to determine the structure of
an integral membrane protein, amounts on the order of mg quantities of purified protein are
required. Additionally, heterologous or homologous overexpression is advantageous as the use
of natural sources to isolate membrane proteins prevents the possibility of genetically modifying
these proteins to facilitate detection and purification, as well as preventing the efficient labeling
for nuclear magnetic resonance and crystallographic studies.
E. coli is often the most popular choice for high-level protein expression for both
homologous (Rastogi and Girvin 1999) and heterologous membrane protein expression
(Hawkins et al. 2005), although several other systems are available. Yeast expression systems
(Pedersen et al. 2007), cell-free expression systems (Maslennikov et al.) and mammalian cells
(Hunter et al. 2005) have all been successfully used in the production of membrane proteins
towards generating high resolution structures of the proteins of interest. While the route to
successful membrane protein over-expression can be complicated, the bacterial E. coli system
has the advantage of effective and efficient recombinant technology for plasmid construction and
protein expression. In addition, E. coli is a well-studied system and cell transformation and
culture growths are rapid and inexpensive.
The commercially available BL21 E. coli (Novagen) is the most commonly used bacterial
host for heterologous expression as it is lon and ompT protease deficient and is known to
promote plasmid stability. This specific type of E. coli cell was used in the optimization of
TM2/3/4 expression (Fig. 5.4) as it was successful in the expression of TM3/4 (Therien et al.
2002). Secondly, the BL21 codon (+) series (Stratagene) was tested for TM2/3/4 expression as it
contains extra copies of tRNA genes for codons that are rare in E. coli, which compensates for
codon bias and may improve heterologous expression. A third cell type tested was the C43 (DE3)
strain, a BL21 derivative that has had considerable success at over-expressing membrane proteins
normally toxic to the cells. While the genetic mutations resulting in improved protein expression are
unknown, current hypotheses suggest the mutations may affect the amount of T7 RNA
polymerase production, slowing synthesis (Dumon-Seignovert et al. 2004).
Figure 5.5 shows a Western blot detection of the Trx-TM2/3/4 expression optimization in
three different cell lines: BL21 (Novagen), BL21 (Codon Plus) (Stratagene) and C43 cells.
Expression trials were completed in M9 rich media, with a pre-induction temperature of 37⁰C,
protein expression induction with 0.1 mM IPTG, and a post-induction temperature of 25⁰C. The
cells were induced for protein expression with IPTG at OD600 ≈ 0.6, the mid-log phase of E. coli
growth. Two major protein expression bands appeared in this Western blot of cell lysates
expressing the chimeric protein: the higher molecular weight band corresponding to the full
length Trx-TM2/3/4 fusion with an apparent molecular weight of 34.4 kDa (Fig. 5.5, lanes 2, 4
and 6), and a lower band which is most likely a degraded version of the full length product (Fig.
5.5, lanes 2 and 4). Varying the E. coli cell type appears to have an effect on expression of the
Trx-TM2/3/4 construct with BL21 cells producing the largest amount of protein, albeit with the
greatest amount of protein degradation (Fig. 5.3, Lane 1). BL21 codon (+) cells biased for E.
coli codon preferences produced less chimeric construct compared to BL21 E. coli cells (Fig.
5.5, Lanes 2 and 4). C43 cells, an E. coli strain developed and optimized for membrane
protein overexpression (Miroux and Walker 1996), appear to express intact
chimeric protein, with no visible protein degradation (Fig. 5.5, Lane 6). In all cell
types investigated, no protein expression was observed prior to expression induction with IPTG
(Fig. 5.5, Lanes 3, 5 and 7). Similar results were observed when the expression post-induction
temperature was shifted upwards to 37⁰C, except that at this higher temperature, larger amounts
of protein degradation were observed (data not shown).
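Apparent molecular weights such as those quoted above are typically read from a gel by calibrating band positions against the marker lane. The following is a minimal sketch of that standard calibration, assuming that log10(MW) varies linearly with relative migration distance (Rf) over the marker range. The marker positions in the usage lines are hypothetical placeholders, not measurements from Figure 5.5, and the function names are illustrative.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_mw_kda(rf: float, markers) -> float:
    """Estimate apparent MW (kDa) of a band at relative migration rf.

    markers is a sequence of (rf, mw_kda) pairs from the marker lane.
    """
    xs = [marker_rf for marker_rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    slope, intercept = fit_line(xs, ys)
    return 10 ** (slope * rf + intercept)


if __name__ == "__main__":
    # Hypothetical marker lane: (Rf, kDa). Replace with measured values.
    hypothetical_markers = [(0.10, 100.0), (0.30, 50.0), (0.50, 25.0), (0.80, 10.0)]
    print(f"{apparent_mw_kda(0.45, hypothetical_markers):.1f} kDa")
```

The linear approximation only holds within the range bracketed by the markers, so bands outside that range should not be extrapolated.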
Figure 5.5. Expression of CFTR TM2/3/4 in various E. coli cell lines: BL21, BL21 (codon
plus) and C43. All other protein expression conditions were identical: 0.1 mM IPTG, M9 rich
growth media, and 25⁰C expression induction. Cells were induced at mid-log phase of growth
(A600 ≈ 0.6). The calculated molecular weight of Trx-TM2/3/4 fusion is 33.5 kDa. Lane 1: SeeBlue
molecular weight marker (kDa). Lane 2: BL21 E. coli cells, overnight induction. Lane 3:
Pre-induction of BL21 E. coli cells. Lane 4: BL21 codon (+) E. coli cells, overnight induction.
Lane 5: Pre-induction of BL21 codon (+) E. coli cells. Lane 6: C43 E. coli cells, overnight
induction. Lane 7: Pre-induction of C43 E. coli cells.
There is, however, no necessity to limit bacterial overexpression of membrane proteins
to E. coli. Several other
promising bacterial expression systems have been developed: Lactococcus lactis is a Gram-
positive lactic acid bacterium that has been used to express a limited number of membrane
proteins including ABC transporters, major facilitator superfamily transporters,
mechanosensitive channels, and lipoproteins (Kunji et al. 2003). The bacterium Halobacterium
salinarum has also been successfully used to produce integral membrane proteins in quantities
suitable for high resolution studies (Lanyi and Schobert 2002). Both of these systems are ideal
for membrane protein expression trials, as they have well studied genetic systems, are easy to
culture and are cost effective.
5.3.3.2 Temperature of protein expression induction and concentration of IPTG.
Growth cultures were incubated at various temperatures post-induction in order to
increase the production of the chimeric Trx-TM2/3/4, as well as to reduce the amount of
degradation observed. A reduction in expression temperature can be associated with an increase
in the amount of soluble protein expressed by bacterial cells (Cunningham and Deber 2007), and
can be further beneficial as an increase in proteolytic degradation is associated with high
temperature expression (Quick and Wright 2002). It has been noted that at lower temperatures,
heat shock proteases normally induced in E. coli over-expression conditions are partially
eliminated (Sorensen and Mortensen 2005b).
Protein expression was carried out in BL21 E. coli cells in M9 minimal growth media, with
expression induced with either 0.1 or 1.0 mM IPTG and post-induction temperatures of 15°C,
25°C and 37°C. As in the example above, at higher temperatures the expression of Trx-TM2/3/4
appears to improve, but this is associated with an increase in the amount of cleavage product,
rendering a 37°C post-induction temperature unsuitable for expression (Fig. 5.6, Lanes 9 and 10).
Lowering the post-induction temperature from 37°C to 25°C (Fig. 5.6, Lanes 6 and 7) decreases
the amount of cleavage product and appears to yield a larger amount of full-length
construct. Expression of Trx-TM2/3/4 at 15°C produced the least amount of cleavage product
compared to protein expression induction at 25°C and 37°C, but expression levels were still
unsuitable for biophysical analysis (Fig. 5.6, Lanes 3 and 4). Protein expression could not be
detected by Coomassie blue staining at any temperature indicated (data not shown). At all
post-induction expression temperatures investigated, no expression of Trx-TM2/3/4 was observed
without the addition of IPTG (Fig. 5.6, Lanes 2, 5 and 8).
Figure 5.6. Western blot showing the effect of different induction temperatures on protein
expression of Trx-TM2/3/4, at two different concentrations of IPTG: 0.1 mM and 1.0 mM.
Cells were induced at mid-log phase of growth (A600 ≈ 0.6). The calculated molecular weight of
Trx-TM2/3/4 fusion is 33.5 kDa. Lane 1: SeeBlue molecular weight marker (kDa). Lane 2: 15
⁰C, (-) IPTG, pre-induction. Lane 3: 15 ⁰C, 0.1 mM IPTG. Lane 4: 15 ⁰C, 1.0 mM IPTG. Lane
5: 25⁰C, (-) IPTG, pre-induction. Lane 6: 25⁰C, 0.1 mM IPTG. Lane 7: 25⁰C, 1.0 mM IPTG.
Lane 8: 37⁰C, (-) IPTG, pre-induction. Lane 9: 37⁰C, 0.1 mM IPTG. Lane 10: 37⁰C, 1.0 mM
IPTG. Increasing the induction temperature from 15⁰C to 37⁰C does not increase the amount of
fusion protein expressed by the E. coli BL21 cells, but increases the amount of cleavage product
observed.
Additionally, it can be noted from the Western blot of Trx-TM2/3/4 expression shown in
Figure 5.6 that the concentration of the IPTG used to induce protein expression appears to have
no effect on the levels of protein produced. Increasing the concentration of IPTG from 0.1 mM
to 1.0 mM does not improve expression of the fusion construct at any of the post-induction
temperatures tested.
For membrane protein expression constructs that are controlled by inducible promoters,
varying the concentration of induction agent can, in some cases, affect protein expression.
Lowering IPTG concentrations may improve protein expression, as a reduction in protein
synthesis has been observed at low IPTG concentrations. This may favor protein folding and
improve stability of the desired product (Yang et al. 1997). In the case of the chicken liver
6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase (CKB), the optimum IPTG concentration for
expression was in the range of 0.1 to 1 μM, irrespective of growth media or
temperature. At IPTG concentrations above these levels, inefficient protein production was
observed (Yang et al. 1997).
IPTG is commonly used as an induction agent in E. coli protein expression, but additional
induction strategies in the production of membrane proteins are available. One such example is
utilizing the promoter of the E. coli cold-shock protein CspA, whose expression is dramatically
increased at temperatures in the region of 15°C (Goldstein et al. 1990). Utilizing the cspA
promoter to drive the transcription and production of the first topological domain of the E. coli
transenvelope protein TolA increased the amount of protein produced relative to
expression trials conducted with IPTG (Mujacic et al. 1999).
5.3.3.3 Growth media.
Various growth media were also used to explore protein expression optimization of Trx-
TM2/3/4. Commonly used media and their ingredients are listed in Appendix 2. For the
optimized expression of Trx-TM2/3/4, the following media were used for bacterial growth: LB
medium, TB medium, M9 rich and M9 minimal media (Fig. 5.4). LB is a nutritionally rich
medium, and it continues to be one of the most common media used for maintaining and
cultivating recombinant strains of E. coli. TB medium is a complex medium much like LB, but
it has been shown to support higher cell densities, as it contains higher amounts of yeast extract
and tryptone, as well as glycerol as a carbon source. M9 medium has the added benefit that it can
be supplemented to produce higher growth rates or to allow growth of strains that require
additives (e.g. thiamine or casamino acids). It has also been noted that supplementing media
with glucose (0.2-1%) can increase protein expression (Douglas et al. 2005; Sorensen and
Mortensen 2005a); increasing glucose concentrations is believed to decrease promoter repression
resulting in improved protein synthesis (Yang et al. 1997).
The results of optimizing Trx-TM2/3/4 protein expression in various media are detailed
in Figure 5.7. Poor expression of the construct is observed in both LB media (Fig. 5.7, Lanes 2
and 3) and TB media (Fig. 5.7, Lanes 6 and 7). This poor expression is surprising as both of
these media types were used successfully in expressing the hairpin of CFTR TM3/4 in quantities
suitable for biophysical analysis (Peng et al. 1998; Therien et al. 2002). For both LB and TB
media, the observable protein expression is very low, and the majority of the protein expression
appears to be of a TM2/3/4 cleavage product, with cleavage likely happening in the large loop
between TM2 and TM3. Protein expression improves with use of M9 media (Fig. 5.7, Lanes
9-10, 12-13): in this case, protein expression is slowed as the bacteria are not supplemented
with amino acids and are required to produce their own. This is contrary to conditions provided
by rich media such as TB and LB, where post-induction, the majority of cellular resources are
directed towards protein expression, instead of required cellular functions. As seen in Lane 9,
expression of Trx-TM2/3/4 in M9 minimal media with an induction temperature of 25⁰C
produces the largest quantity of protein among the various media tested; however, at the higher
induction temperature of 37⁰C, a greater proportion of cleavage is observed for the construct
(Fig. 5.7, Lane 10). Expression of Trx-TM2/3/4 in M9 rich media, supplemented with amino
acids through the addition of casein enzymatic hydrolysate (Appendix 2) is poorer than for M9
minimal media. These results suggest that a relatively reduced rate of protein expression is
required for more efficient production of Trx-TM2/3/4 by E. coli.
Figure 5.7. Western blot showing the effect of different growth media on protein expression of
Trx-TM2/3/4, at two different induction temperatures: 25⁰C and 37⁰C. Protein expression was
induced with the addition of 0.1 mM IPTG. Optimal Trx-TM2/3/4 protein expression appears to
occur in M9 minimal media (Lane 9). Cells were induced at mid-log phase of growth (A600 ≈
0.6). The calculated molecular weight of Trx-TM2/3/4 fusion is 33.5 kDa. Lane 1: SeeBlue
molecular weight marker (kDa). Lane 2: LB media, (-) IPTG, pre-induction. Lane 3: LB media,
25⁰C. Lane 4: LB media, 37⁰C. Lane 5: TB media, (-) IPTG, pre-induction. Lane 6: TB
media, 25⁰C. Lane 7: TB media, 37⁰C. Lane 8: M9 minimal media, (-) IPTG, pre-induction.
Lane 9: M9 minimal media, 25⁰C. Lane 10: M9 minimal media, 37⁰C. Lane 11: M9 rich
media, (-) IPTG, pre-induction. Lane 12: M9 rich media, 25⁰C. Lane 13: M9 rich media, 37⁰C.
Varying the growth media in membrane protein expression can have large effects on the
production of membrane proteins. Several different types of media are available for use in the
bacterial expression of membrane proteins (Appendix 2), but finding the medium that maximizes
the quality and amount of protein produced is largely an empirical process. Optimizing the expression of
membrane proteins in various media also has utility beyond the primary goal of protein
overexpression; expression in minimal media allows for isotopic labeling of proteins. This is not
possible when proteins are isolated from natural sources.
5.3.4 Removing intracellular loop 1 between TM2 and TM3 to improve protein
expression.
E. coli BL21 cells are widely used for protein over-expression as this host strain is
deficient in two proteases encoded by the lon and ompT genes; however, significant proteolytic
cleavage of the Trx-TM2/3/4 construct during expression remained (Fig. 5.5-5.7). This
degradation of over-expressed protein is, however, not uncommon, and degradation by the host
proteases guarantees that abnormal polypeptides do not accumulate within the cell and also
allows for amino acid recycling. Expressed proteins targeted for degradation can include
prematurely terminated polypeptides, arising both from proteolysis and from vulnerable folding
intermediates (Baneyx and Mujacic 2004).
In an effort to reduce the degradation of the Trx-TM2/3/4 construct and improve
expression levels, a modified construct was engineered where 86% of the loop sequence was
removed and replaced with a soluble loop (KSPGSK) which included Pro and Gly residues as
these amino acids have high β-turn propensities according to Chou-Fasman rules (Monne et al.
1999) (Fig. 5.8); this construct was termed Trx-TM2/3/4-Loop. The WT TM2/3/4 construct
contains the 56-residue intracellular loop-1 of CFTR that may be targeting the full-length
construct for degradation, which can be seen as the lower band on a Western blot (Fig. 5.5-5.7).
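The 86% figure follows directly from the residue numbering: replacing residues 143-190 (the range given in the Figure 5.8 caption) removes 48 of the 56 loop residues. The sketch below works through that arithmetic and, as a rough cross-check, estimates the resulting mass change using an assumed rule-of-thumb average of 0.110 kDa per residue. That average and the function names are assumptions introduced here, not values from the text.

```python
WT_LOOP_LENGTH = 56            # residues in the TM2-TM3 loop, per the text
REPLACED_RANGE = (143, 190)    # residues replaced, per the Figure 5.8 caption
LINKER = "KSPGSK"              # soluble replacement loop
AVG_RESIDUE_MASS_KDA = 0.110   # assumption: rule-of-thumb average residue mass


def n_residues(first: int, last: int) -> int:
    """Number of residues in an inclusive numbering range."""
    return last - first + 1


def loop_summary() -> dict:
    removed = n_residues(*REPLACED_RANGE)
    net_change = len(LINKER) - removed
    return {
        "removed": removed,
        "percent_removed": 100.0 * removed / WT_LOOP_LENGTH,
        "new_loop_length": WT_LOOP_LENGTH - removed + len(LINKER),
        "net_residue_change": net_change,
        "approx_mass_change_kda": net_change * AVG_RESIDUE_MASS_KDA,
    }


if __name__ == "__main__":
    for key, value in loop_summary().items():
        print(f"{key}: {value}")
```

The estimated mass change is of the same order as the difference between the calculated molecular weights reported for the two fusion constructs (33.5 kDa and 28.6 kDa).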
Figure 5.8. Construct of the pET32a(TM2/3/4)-Loop designed for the expression of the Trx-
TM2/3/4Loop fusion protein. The cDNA corresponding to amino acids 110-247 of human
CFTR with amino acids 143-190 replaced with a soluble loop sequence “KSPGSK” was
subcloned into the NcoI/XhoI restriction sites of the pET32a vector. The amino acid sequence
of the altered construct is shown, with the individual TM segments underlined. The Cys residues
in the construct (Cys 125 and Cys 225) were mutated to Ala and are indicated in red. The loop
construct is indicated in blue.
Expression optimization was conducted for the Trx-TM2/3/4-Loop construct in a
similar manner to the Trx-TM2/3/4 construct, where E. coli cell type, concentration of IPTG,
expression induction temperature, and/or expression media were varied and the resulting
expression levels probed via Western blot. Expressing Trx-TM2/3/4-Loop in different E. coli
cell types including BL21(DE3), BL21 codon (+), and the C43 (DE3) as well as altering the
concentration of protein expression induction agent (IPTG) resulted in only small changes in
expression among trials (data not shown). Altering the growth temperature post-induction did
affect the expression of the construct, in that there are still greater amounts of observable
cleavage at the higher expression temperature of 37⁰C (Fig. 5.9, Lanes 2-9 vs. Lanes 10-17).
Similar to the Trx-TM2/3/4 construct, the loop-deleted Trx-TM2/3/4-Loop experiences
increased intracellular proteolytic cleavage by host E. coli proteases at higher expression
temperatures (37⁰C vs. 25⁰C). Exploring expression conditions for the Trx-TM2/3/4-Loop
construct with different growth media types did reveal differences in expression levels: the best
construct expression was observed for the Trx-TM2/3/4-Loop construct in TB media (Fig. 5.9,
Lane 5). While attempts at protein expression for Trx-TM2/3/4-Loop appear to be more
successful than expression optimization of the full-length Trx-TM2/3/4, expression was not
detectable via Coomassie staining (Fig. 5.9B).
Figure 5.9. Expression trials showing the effect of different growth media on protein expression
of Trx-TM2/3/4-Loop, at two different induction temperatures: 25⁰C (Lanes 2-9) and 37⁰C
(Lanes 10-17) in BL21 (DE3) E. coli cells. Protein expression was induced with the addition of
0.1 mM IPTG. Optimal Trx-TM2/3/4-Loop protein expression appears to occur in TB media
with a lower protein expression induction temperature of 25⁰C (Lane 5). Cells were induced at
mid-log phase of growth (A600 ≈ 0.6). A) Western blot. B) Coomassie stained duplicate gel.
The calculated molecular weight of Trx-TM2/3/4-Loop fusion is 28.6 kDa. Lane 1: SeeBlue
molecular weight marker (kDa). Lane 2: LB media, (-) IPTG, pre-induction. Lane 3: LB media,
25⁰C. Lane 4: TB media, (-) IPTG, pre-induction. Lane 5: TB media, 25⁰C. Lane 6: M9 rich
media, (-) IPTG, pre-induction. Lane 7: M9 rich media, 25⁰C. Lane 8: M9 minimal media, (-)
IPTG, pre-induction. Lane 9: M9 minimal media, 25⁰C. Lane 10: LB media, (-) IPTG, pre-
induction. Lane 11: LB media, 37⁰C. Lane 12: TB media, (-) IPTG, pre-induction. Lane 13: TB
media, 37⁰C. Lane 14: M9 rich media, (-) IPTG, pre-induction. Lane 15: M9 rich media, 37⁰C.
Lane 16: M9 minimal media, (-) IPTG, pre-induction. Lane 17: M9 minimal media, 37⁰C.
5.4 Characterization of the expressed cystic fibrosis transmembrane conductance
regulator fragments.
Previous work has shown that there can be structural consequences related to the addition
or removal of polar residues in TM segments and fragments of membrane proteins (Therien et al.
2001; Zhou et al. 2001; Wehbi et al. 2008; Rath et al. 2009a). Through the use of TM2/3/4 of
CFTR as a model, the consequences of such non-native residues in a larger TM system could be
investigated and the results of analyzing fragments of membrane proteins may provide a
potential explanation. For example, while much biochemical data has been collected on the
cytoplasmic domains of CFTR and how mutations in these soluble domains can lead to disease
(Riordan 2005; Frelet and Klein 2006), there exists a relative paucity of information regarding
CF causing mutations in the TM regions. While the heterologous expression of CFTR-TM2/3/4
remained below levels suitable for full biophysical characterization, investigating the migration
of the CFTR-TM2/3/4 construct on SDS-PAGE via Western Blot can still find utility and
highlight resulting structural differences. The work presented below expands on the
investigation of helical hairpin systems, with a view towards assessing the structural
consequences of introducing non-conservative mutations into a three-TM system.
5.4.1 Differential gel migration of TM2/3/4 mutants.
The migration of this three-strand construct and its mutants could be assessed via SDS-
PAGE and Western blotting with antibodies directed towards the S-tag on the construct (Fig.
5.2). As with the CFTR TM3/4 construct, mutations in the TM regions of CFTR-TM2/3/4
could cause changes to the structure, which may be detected as differences in migration on a
gel. This method of evaluating structural changes to TM-containing constructs is quite sensitive:
single point mutations have been observed to shift migration relative to WT by as much as
30-40% of the actual molecular weight of the protein (Rath et al. 2009a). Accordingly,
several TM mutations were generated, each chosen for a specific reason. G126D was chosen as it is
a CF phenotypic mutation located in the middle portion of TM2 (Fig. 5.10A) (Wagner et al.
1994). This mutation was identified in a cystic fibrosis male patient using gradient gel
electrophoresis as a rapid method for screening a large number of CF patients for point mutations
in the CFTR exons (Wagner et al. 1994). Currently the consequence of the G126D mutation on
the folding pathway of CFTR in the full-length protein is unknown. I231E was chosen since a
closely related mutation (I231D) was shown to have the greatest percentage molecular weight
change relative to WT in migration studies of CFTR TM3/4 (Choi et al. 2004), and that mutation
to Glu (I231E) amplified this effect (unpublished results) (Fig. 5.10B). While I231E is not a
cystic fibrosis related mutation, the neighboring phenotypic mutation V232D appears to result in
decreased glycosylation with no apparent maturation product at the cell surface. When run on
SDS-PAGE, TM2/3/4 constructs treated with thrombin to remove the Trx fusion had, in some cases,
altered migration properties relative to WT. This suggests that mutation to the residues
comprising the TM regions can affect the structure of the construct either through altered
protein-detergent complexes, altered secondary structure, or altered TM-TM contacts.
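Migration differences of this kind are usually expressed as a percentage, either of the formula molecular weight or relative to the WT band. The sketch below shows both calculations; the function names are hypothetical, and the only numbers used in the usage line are the calculated (33.5 kDa) and apparent (34.4 kDa) values reported earlier in this chapter for the full-length fusion.

```python
def gel_shift_percent(apparent_kda: float, formula_kda: float) -> float:
    """Percent deviation of apparent MW on the gel from the formula MW.

    Positive values mean the band runs slower (appears heavier) than expected.
    """
    return 100.0 * (apparent_kda - formula_kda) / formula_kda


def relative_to_wt_percent(mutant_apparent_kda: float,
                           wt_apparent_kda: float) -> float:
    """Percent change in apparent MW of a mutant relative to the WT band.

    Negative values mean the mutant migrates faster than WT.
    """
    return 100.0 * (mutant_apparent_kda - wt_apparent_kda) / wt_apparent_kda


if __name__ == "__main__":
    # Full-length Trx-TM2/3/4 fusion: calculated 33.5 kDa, apparent 34.4 kDa.
    print(f"{gel_shift_percent(34.4, 33.5):+.1f}% versus formula MW")
```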
Figure 5.10. Differential migration of WT CFTR-TM2/3/4 relative to mutants. A) Western
blot of TM2/3/4-WT (Lane 1) and TM2/3/4-G126D mutant (Lane 2). The TM2/3/4-G126D
mutant migrates faster relative to the TM2/3/4-WT construct. B) Western blot of TM2/3/4-WT
(Lane 1) and TM2/3/4-I231E mutant (Lane 2). The TM2/3/4-I231E mutant also migrates faster
relative to the TM2/3/4-WT construct. Mark 12 molecular weight markers are indicated on each
blot.
Originally, it was suggested that introduction of a polar residue into a TM hairpin
construct could alter helical interactions, potentially through the introduction of a non-native
hydrogen bond (Therien et al. 2001). Subsequent studies have shown that altered migration on
SDS-PAGE can also be a result of altered protein-detergent complexes, or altered secondary
structure. While it is possible that the addition of a charged residue to a TM hairpin construct
can contribute to increased gel migration on SDS-PAGE, the majority of the difference in gel
migration can be explained by altered detergent binding, rather than the addition of, or mutation
to, a charged residue. For example, mutation of the WT residue Val (V232) in the CFTR-TM3/4
hairpin to either Asp (V232D) or Lys (V232K) produces migration on SDS-PAGE that is
significantly faster than WT, and both of these constructs bind significantly less detergent than
WT (Fig. 5.11) (Rath et al. 2009a). If the addition of a charged residue were the predominant
factor in dictating gel migration rates, then the V232D and V232K mutations should have
opposite effects on gel migration: V232D should migrate faster than WT, while V232K should
move slower. Additionally, mutation to a charged residue in a hydrophobic construct
decreases the amount of detergent bound to the construct, and this decreases the mass-to-charge
ratio of the construct. If the overall charge of the construct, including the charges provided by
bound SDS molecules, were the dominant factor in determining gel migration, then the
CFTR-TM3/4 V232D and V232K mutants should both run slower than the wild type. As both of these
mutants bind less detergent than the WT construct yet migrate faster, it is solvation by detergent,
and the subsequent structural consequence of this binding, that is the primary effect dictating
migration in SDS-PAGE. Molecules of SDS aggregate at hydrophobic sites on the protein: migration
through a gel is inversely proportional to the amount of detergent bound to the protein (Rath et al. 2009a).
Figure 5.11. CFTR TM3/4 hairpin sequence and SDS-PAGE analysis. A) Amino acid sequence
of the WT TM3/4 hairpin. Residues predicted to be helical in the CFTR TM3/4 construct are
shown in green text and the predicted loop regions are shown in black text (Riordan et al. 1989).
The V232 residue where mutations are made is underlined. B) Representative SDS-PAGE of
helical hairpin mutants. Positions of MW standards (in kDa) are indicated. This figure is adapted
from (Rath et al. 2009a).
Increasing or decreasing the amount of hairpin secondary structure affects SDS-PAGE
migration by affecting the relative compactness of a migrating construct (Wehbi et al. 2007).
Protein tertiary structure can also affect SDS-PAGE migration rates. For example, disulfide
bonds within or between polypeptide chains can affect both the compactness of a protein
structure and the detergent loading capabilities of the construct. An increase in the migration on
SDS-PAGE was observed for a compact disulfide (S-S) bridged conformation of CFTR-TM3/4
(Therien et al. 2001). Unfortunately, the amounts of CFTR TM2/3/4 and mutants thereof obtained were
insufficient to study changes to the secondary structure of the construct.
5.5 Discussion.
The work presented here provides a framework for membrane protein expression
optimization and indicates an approach that can be adopted. Several optimization strategies were
outlined and followed for Trx-TM2/3/4 that included investigations of the effects of
temperature, bacterial cell strain, culture media and concentration of induction agent, with the
goal of improving protein expression.
Our work reaffirms that to improve the expression of membrane proteins utilizing
bacteria as the expression vehicle, a number of expression temperatures must be tested to achieve
optimal results. In the case of Trx-TM2/3/4, a reduction in temperature was found to increase
expression, but this effect may not hold for every protein. In fact, successful
overexpression of membrane proteins has been observed at a variety of temperatures, including
bacteriorhodopsin at a relatively high temperature of 37°C (Faham et al. 2004),
CFTR-TM3/4 at an intermediate temperature of 25°C (Therien et al. 2002), and the
human Na+/glucose co-transporter (hSGLT1) at 16°C (Quick and Wright
2002). Often, decreasing the temperature can be associated with an increase in the amount of
protein expressed: high temperature expression can lead to an increased rate of proteolytic
degradation (Quick and Wright 2002), and can trigger cellular stress situations. Cell stresses can
then lead to protein aggregation and inclusion body formation, which may not be ideal for
membrane protein overexpression and subsequent protein re-folding (Cunningham and Deber
2007).
While the work described herein focuses on the use of E. coli as the heterologous
expression vehicle, other systems are available for the over-production of membrane proteins.
The yeast systems Saccharomyces cerevisiae and Pichia pastoris are widely used and, like
bacteria, yeast are well characterized and straightforward to modify genetically. They can be
easily and inexpensively grown, and most importantly they are capable of protein processing and
post-translational modification mechanisms related to those found in mammalian cells. While
this processing may not recapitulate mammalian post-translational modifications exactly, the
ease of use of this system makes yeast an attractive expression tool (Midgett and Madden 2007).
The Shaker family voltage dependent potassium channel is an example of a membrane protein
that was overexpressed, and whose structure was solved at high resolution, using P. pastoris as the
expression vehicle (Long et al. 2007).
Membrane proteins can also be overexpressed in insect cells via the baculovirus
expression system. Insect cells are simpler to maintain than mammalian cells, and they offer
membrane composition and protein processing machinery closer to those of mammalian cells
than yeast do. The baculovirus expression system has been used successfully in the expression of
G protein-coupled receptors (Akermoun et al. 2005) as well as human Aquaporin-4 (Hiroaki et
al. 2006).
Mammalian cells, although notoriously challenging hosts, offer an additional choice for the
overexpression of membrane proteins, as they possess
cellular machinery most closely associated with human physiology and disease. Their use
has been limited by difficulty and expense, with a few exceptions: the
crystallographic structure of recombinant rhodopsin was successfully solved following its
expression and purification from COS-1 cells (Standfuss et al. 2007). Mammalian cells are also
chosen as the expression tool when lower organisms are not suitable for expressing the
protein of interest. In experiments designed to define the molecular basis for the inability of E.
coli to express the complete liver H+/Pi transporter, in vitro transcription and translation assays
showed that the complete transporter is only expressed with eukaryotic ribosomes, and
inefficiently expressed in the presence of prokaryotic ribosomes (Ferreira and Pedersen 1992).
Care must be taken when choosing the expression vehicle for membrane protein over-
expression. For example, it has been shown that post-translational modifications such as
glycosylation are important for the functional expression of membrane proteins such as the
serotonin transporter. Without in vivo N-glycosylation, the serotonin transporter fails to fold
normally, and aggregates within the cell (Tate 2001). As a result, the serotonin transporter cannot be
overexpressed in bacterial or yeast systems.
There are additional features of the expression process that can be tailored during
optimization. In the case of membrane proteins, several choices are available when targeting
protein expression to a specific location in cells. Expression in the lipid bilayer, soluble
expression in concert with a solubilizing fusion partner targeted to either the cytoplasm or the
periplasm, and targeting of expression to inclusion bodies within E. coli, can each be utilized to increase
membrane protein expression. Success with expressing hairpin fragments of CFTR in our lab as
soluble fusions to an E. coli protein (Therien et al. 2002; Rath et al. 2009a) has made this
an attractive route for the further characterization of the remainder of the CFTR TM domain. In
addition to Trx, several other fusion domains have been successfully used to increase membrane
protein expression: glutathione-S-transferase, maltose-binding protein (MBP), and the chitin-binding
domain are all popular choices (Laage and Langosch 2001).
Modification of the construct can also affect expression levels: extension or
truncation of the N- and/or C-termini can affect expression of the construct of interest.
The work described in this chapter also highlights the importance of construct design. As
a model to study α-helical interactions within the membrane bilayer, expression of helical
constructs such as TM2/3/4 would provide useful details on the sequence determinants of helical
contacts, and perhaps when these contacts occur. Examination of the structural model of CFTR
and the structure of the related ABC transporter Sav1866 indicates that TMs 2-4 of CFTR may
not be in tandem contact in the final fold of the protein. Further research would need to be
completed to determine if TM2 would contact a TM3/4 hairpin structure in an in vitro setting.
A strategy for the extensive optimization of membrane protein expression leading to
amounts of Trx-TM2/3/4 suitable for study has been outlined here. Following systematic
methodology in the optimization of membrane protein expression, and taking into consideration
the complex host requirements for expression, the approaches described in this Chapter should
ultimately lead to the successful overexpression of a membrane protein of interest.
Chapter 6. Discussion.
6.1 Discussion.
Membrane proteins account for a large proportion of the total protein content in cells
(Boyd et al. 1998), where they are responsible for a variety of functions including transportation
of essential cellular substrates across the membrane, signal transduction and cellular recognition.
Due to their inherent hydrophobicity, elucidating the structural and functional aspects of
membrane proteins has proven a unique challenge for researchers. As membrane proteins are of
great medical relevance (Yildirim et al. 2007), it would be extremely useful to be able to predict
the final folded structure of a membrane protein from first principles associated with its primary
amino acid sequence. Membrane proteins have been implicated in many diseases such as cystic
fibrosis, Alzheimer's disease, retinitis pigmentosa and hereditary hearing loss (Partridge et al.
2002b).
In order to accomplish this task, however, improvements must be made to the individual
components of membrane protein production, prediction, and ultimately to a detailed definition
of the aspects contributing to a final folded structure. The work presented in this thesis
investigates several features of this process, from examining the role of TM amino acid sequence
and their contributions to TM-TM packing, accurate prediction of TM segments from the
primary sequence, determinants for TM segment selection by the cellular machinery in vivo,
conversion of marginally hydrophobic segments from soluble proteins into membrane-spanning
segments, and the optimization of membrane protein expression. To these ends, we evaluated
various parameters of TM segment selection, membrane protein expression, and structure. The
dependence of TM helix-helix interactions on amino acid sequence, as well as contributions by
surrounding lipids to TM oligomerization, was investigated to determine TM protein folding
determinants using the single-pass TM protein GpA (Chapter 2). Comparisons of TM-like
segments from soluble proteins - which we termed δ-helices - to actual TM segments helped to
define compositional differences between these groups and highlighted the distribution of
charged residues along the helical axis, which can be used as a discriminating factor to remove
false positives from prediction outputs (Chapter 3). The requirements for converting a
hydrophobic δ-helix from a soluble protein to a membrane-spanning segment were investigated
in a model system (Chapter 4). Finally, utilizing TM2/3/4 of CFTR as a model, it was found that
a number of factors that contribute to achieving successful protein overexpression can be
extensively optimized (Chapter 5). The major insights of this thesis are discussed below:
6.1.1 Summary of contributions.
6.1.1.1 Beta-branched residues adjacent to GG4 motifs promote the efficient association of
glycophorin A transmembrane helices.
Interactive sites between TM α-helices commonly contain small residue patterns (termed
GG4 or 'small-xxx-small' motifs) at i and i + 4 positions along the helical axis. This small
residue pattern often occurs with β-branched aliphatic residues at adjacent positions, as typified
by the GpA dimerization sequence (L75IxxGVxxGVxxT87). In Chapter 2 we explored the
importance of local β-branched character on GpA dimerization by making systematic
replacements to all 16 combinations of Val, Ile, Leu, and Ala residues at the Val80 and Val84
positions. Using the TOXCAT system to assay self-oligomerization in the E. coli inner
membrane, we observed that combinations of Val and Ile residues maintained or improved
dimerization levels; single Ala or Leu mutant combinations with Val or Ile maintained near-WT
dimerization affinities; and in the absence of β-branching, i.e., Leu/Leu, Ala/Ala and Ala/Leu
combinations, GpA dimerization was significantly diminished. Our results in Chapter 2 indicate
an apparent capacity of Ile-containing mutants to increase GpA dimerization vs. WT, which
likely arises from improved van der Waals packing (vs. Val). This is also consistent with
correlations we noted in lipid accessibility measurements. Examination of several synthetic
peptides with sequences corresponding to selected GpA mutants (VV, VI, IV, II, and LL)
confirmed their dimerization on SDS-PAGE. The results presented in Chapter 2 reinforce the
importance of a β-branch-containing 'ridge' residue to complement a 'small-xxx-small groove'
in promotion of TM-TM interactions and highlight the sequence dependence of TM segment
association.
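The motif search and the mutant library described above can be sketched in a few lines of Python. This is an illustrative sketch only and not the analysis used in Chapter 2: the function names are hypothetical, the set of residues counted as small (Gly, Ala, Ser) is an assumption, and the input is the GpA motif as written above, with x standing for unspecified positions.

```python
from itertools import product

SMALL = set("GAS")  # assumption: Gly, Ala and Ser are treated as "small"

def find_small_xxx_small(seq, small=SMALL):
    """Return 0-based indices i where positions i and i + 4 are both small."""
    return [i for i in range(len(seq) - 4)
            if seq[i] in small and seq[i + 4] in small]

def pair_library(residues="VILA"):
    """All ordered residue pairs for two positions (here Val80 and Val84)."""
    return [a + b for a, b in product(residues, repeat=2)]

motif = "LIxxGVxxGVxxT"            # GpA residues 75-87 as written in the text
hits = find_small_xxx_small(motif)
print([75 + i for i in hits])      # residue number of the first small residue
print(len(pair_library()))         # number of Val/Ile/Leu/Ala combinations
```

For the motif as written, the scan reports position 79 (paired with position 83), and the library contains the 16 pairs examined at Val80 and Val84.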
6.1.1.2 Distinctions between hydrophobic helices in globular proteins and transmembrane
segments as factors in protein sorting.
Generally, TM segments can be distinguished in the primary amino acid sequence as
continuous stretches of hydrophobic residues above a specific hydrophobicity threshold;
however, a database we created of helical globular proteins revealed that nearly one-third of the
proteins in the database contained helices of sufficient length to span a bilayer (≥ 19 residues),
and in many instances, had mean hydrophobicity greater than actual TM segments. We termed
these hydrophobic segments from globular proteins "δ-helices". In Chapter 3 we found that
peptides corresponding to selected δ-helix segments behave similarly to native TM sequences as
they readily insert into membrane mimetic environments in helical conformations. As well,
certain δ-helix sequences can integrate into the membrane bilayer when placed into a membrane-
targeted TOXCAT chimeric protein. Computationally, we established that δ-helices can be
distinguished from bona fide TM segments by the decreased frequency of occurrence of Ile/Val
residues, and by their relatively decreased solvent accessibilities (vs. other globular helices)
within tertiary structure. δ-helices generally contain three or more charged residues, and they
display relatively even distributions of these charged residues along their lengths, rather than
concentration near their N- and C-termini as observed for TM segments. These distinctions may
constitute key recognition factors in diverting δ-helices from the membrane in vivo. The results
presented in Chapter 3 identify additional factors that may be important in the correct selection
of TM segments by the cellular machinery, and suggest that δ-helices may be required for
globular protein folding.
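The two discriminating features named above, the positioning of charged residues and the Ile/Val content, can be tallied with a short Python sketch. It is a minimal illustration under stated assumptions rather than the procedure used in Chapter 3: the function names, the choice of Asp/Glu/Lys/Arg as the charged set, and the five-residue terminal windows are all hypothetical. The example input is the 2AAI wild-type sequence as listed in Table 6.1, which includes flanking residues.

```python
CHARGED = set("DEKR")  # assumption: His is not counted as charged

def charge_profile(seq, flank=5):
    """Count charged residues in the N-terminal flank, the core, and the
    C-terminal flank of a segment (flank must be >= 1)."""
    n_term = sum(res in CHARGED for res in seq[:flank])
    core = sum(res in CHARGED for res in seq[flank:-flank])
    c_term = sum(res in CHARGED for res in seq[-flank:])
    return n_term, core, c_term

def ile_val_fraction(seq):
    """Fraction of residues that are Ile or Val."""
    return sum(res in "IV" for res in seq) / len(seq)

segment = "TQLPTLARSFIICIQMISEAARFQYIEGEMR"  # 2AAI - WT, from Table 6.1
print(charge_profile(segment))
print(round(ile_val_fraction(segment), 3))
```

A segment whose charged residues fall mainly in the core count, rather than in the two terminal counts, would show the even distribution described for δ-helices.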
This work can also expand on the prediction process that is used to identify TM segments
from the primary amino acid sequence. In Chapter 3, two features of TM segments were
identified that will aid in the separation of TM α-helices from α-helices in globular proteins
with significant hydrophobic character (δ-helices): the skewed positioning of charged residues
along the helical axis; and the significant content of large, hydrophobic β-branched residues
(Ile/Val) in the sequence of TM segments (Cunningham et al. 2009). The weeding of false
positives such as δ-helices from prediction programs is an important goal, as identification of
TM proteins from volumes of sequence data is not yet routinely possible in the large-scale study
of proteins.
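A threshold-based scan of the kind used by hydropathy prediction programs can be sketched as follows. This sketch does not reproduce TM Finder or the Liu-Deber scale; it swaps in the widely published Kyte-Doolittle hydropathy values, and the 19-residue window, the 1.6 cutoff and the function name are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (used here in place of the Liu-Deber scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_windows(seq, window=19, threshold=1.6):
    """Return (start, mean hydropathy) for every window whose mean
    hydropathy meets the threshold; start is 0-based."""
    hits = []
    for start in range(len(seq) - window + 1):
        mean = sum(KD[res] for res in seq[start:start + window]) / window
        if mean >= threshold:
            hits.append((start, round(mean, 2)))
    return hits

# Hypothetical test sequence: a poly-Leu core flanked by Lys residues.
print(hydrophobic_windows("KKK" + "L" * 19 + "KKK"))
```

A scan of this type flags any sufficiently long hydrophobic stretch, which is why δ-helices appear as false positives unless further criteria, such as charge positioning or Ile/Val content, are applied.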
6.1.1.3 Converting a marginally hydrophobic globular protein into a membrane protein.
Marginally hydrophobic α-helical segments such as δ-helices exhibit certain sequence
characteristics of TM helices (Chapter 3). To better understand the distinctions between
δ-helices and TM α-helices, we investigated the insertion of five δ-helices into dog pancreas
microsomal membranes. Model constructs in which an isolated δ-helix was engineered into a
bona fide membrane protein indicated that for two δ-helices selected from secreted proteins, at
least three single-nucleotide mutations are necessary to obtain efficient membrane insertion,
whereas one mutation is sufficient in a δ-helix from the cytosolic protein P450BM-3.
Additionally, we found that when the entire upstream region of the mutated δ-helix in the intact
cytochrome P450BM-3 is deleted, a small fraction of the truncated protein inserts into
microsomal membranes. Our results in Chapter 4 suggest that upstream portions of the
polypeptide and embedded charged residues protect δ-helices in globular proteins from being
recognized by the SRP-Sec61 ER-targeting machinery. The results further indicate that δ-helices
in secreted proteins are mutationally more distant from TM helices than δ-helices in cytosolic
proteins, and illustrate how difficult it is to convert a soluble segment to one that traverses the
bilayer with some efficiency.
6.1.1.4 Optimizing synthesis and expression of transmembrane peptides and proteins.
The over-expression of membrane proteins – which in most cases is a requirement for
high resolution structural determination – is a highly empirical process. A delicate balance exists
for heterologous protein expression of eukaryotic proteins in bacteria, where various parameters
of the process can and should be optimized to achieve efficient results. In Chapter 5 we outlined
a heterologous expression strategy for a fragment of the TM domain of CFTR consisting of
TMs 2, 3 and 4 with the interconnecting loops between TM2/3 and TM3/4. Variations in the
protein expression process included altering the expression construct, bacterial strain,
growth media, protein expression temperature, as well as induction temperature. While it was
found that optimizing various parameters of CFTR TM2/3/4 expression definitely affected
protein expression levels, concentrations suitable for biophysical analysis remained elusive. The
results presented in this Chapter provide researchers with a clear plan for the optimization of the
complicated process of membrane protein production, such that one may progress towards the
successful overexpression of a membrane protein.
6.2 Membrane mimetic micelles versus bilayers.
In the work presented in this thesis, both detergent micelles and native membrane
bilayers were used to study the secondary and tertiary structures of δ-helices and TM segments.
While use of detergents in the study of membrane proteins and fragments of larger membrane
proteins has often proved to be both useful and necessary, both in vitro and in vivo environments
have advantages and disadvantages, and care must be taken when interpreting results.
6.2.1 SDS as a membrane mimetic.
The hydrophobic nature of membrane proteins and TM segments requires the use of
membrane mimetic systems for their study. In Chapters 2, 3 and 5, we chose to use SDS as our
membrane mimetic. Sodium dodecyl sulfate is an anionic detergent that has a tail of 12 carbon
atoms attached to a sulfate group, providing the molecule with the amphiphilic properties
required for micelle formation. An added benefit of SDS is the anionic nature of the detergent,
as it is able to neutralize the effects of the positively charged Lys tags which are placed at
synthesized peptide termini for solubilization purposes. SDS is a detergent commonly used in
labs around the world, and in our hands has resulted in the successful secondary structure
determination of CFTR TM hairpins (Choi et al. 2004), TM peptides (Deber et al. 1993;
Partridge et al. 2002a; Liu et al. 2003; Cunningham et al. 2009), as well as designed TM peptide
segments (Johnson et al. 2004; Tulumello and Deber 2009).
While detergents offer a simplifying solution in the handling of membrane proteins, care
must be taken when using membrane mimetics or interpreting results derived from these
systems: previous work can be conflicting regarding the effects of detergents on membrane
proteins. For example, it was shown that point mutations in the GpA TM segment have similar
energetic consequences in the detergent C8E5 as compared to the TOXCAT assay which is
conducted in the inner membrane of E. coli (Fleming and Engelman 2001). The work presented
in this thesis shows that the specifics of TM segment oligomerization in SDS may vary
somewhat from those observed in an intact membrane bilayer (Chapter 4). An SDS micellar
environment is quite different from a biological membrane such as the E. coli inner membrane
which is composed of the lipids phosphatidylethanolamine, phosphatidylglycerol and cardiolipin.
In the case of peptides corresponding to selected mutants of GpA (Chapter 2), all
segments investigated for their oligomerization capabilities on SDS-PAGE retained their dimeric
status; however, the rate of migration on SDS-PAGE did not always correlate with the
dimerization results found via the TOXCAT assay. In a dynamic environment such as an SDS
micelle, changes to the register of a dimerization interface are permitted, as are
anti-parallel interactions. These types of TM oligomerization patterns would not
be available in the TOXCAT system, as the nature of the construct would force interaction of
GpA TM segments in a certain orientation and register. SDS-PAGE migration additionally reports on
several features beyond oligomeric state, such as detergent binding and hydropathy (Rath et al. 2009a).
Despite some limitations, SDS was still considered a reasonable solubilizing agent for
our TM peptide and protein studies, as secondary structure determination via CD of our GpA
peptides ruled out the possibility that differences in SDS-PAGE migration were caused by
differences in secondary structure (Chapter 2). Differences in SDS-PAGE migration
among GpA mutants are most likely influenced by factors beyond secondary structure changes to the
peptides, such as peptide crossing angles or detergent binding (Rath et al. 2009a).
6.3 Significant content of hydrophobic, β-branched residues in transmembrane
segments relative to δ-helices.
Compositional analysis of the residues comprising TM segments highlights an abundance
of large β-branched residues in TM segments, relative to the amino acid content of δ-helices
(Cunningham et al. 2009). The evident requirement of hydrophobic, β-branched residues in TM
segments is directly related to the apolar environment provided by the membrane bilayer, as
these residues meet hydrophobicity criteria imposed by the membrane bilayer. While δ-helices
retain equivalent hydrophobicity to TM segments according to the Liu-Deber hydrophobicity
scale, the decreased amount of Ile and Val in δ-helices must be a result of the structural
preferences of amino acids in different environments: large, hydrophobic β-branched residues
such as Ile and Val have differential structural preferences depending on their environment. In
an apolar environment such as the membrane bilayer, these residues take on the preferred
structure of an α-helix (Liu and Deber 1999). In an aqueous environment, these residues would
"prefer" to form β-sheet structures and the propensity to form α-helices is relatively low (Chou
and Fasman 1978). Notably, the remaining large hydrophobic residue Leu is statistically highly
represented in both of these environments. Leu contributes to α-helical structures in an apolar
environment like the membrane, but has a similar propensity to form both α-helices and β-sheets
in an aqueous environment (Chou and Fasman 1978; Liu and Deber 1999). Our identification of
the preference of Ile and Val to occur in hydrophobic segments that span the membrane rather
than the core of globular proteins highlights the fact that the interior of these globular proteins
must not in fact resemble the membrane bilayer. Put in evolutionary terms, the low abundance
of Ile and Val in globular proteins is an attempt to preserve the secondary structure of the protein
in an α-helical form.
6.4 Membrane insertion propensity of transmembrane segments.
Hydrophobicity is generally considered the overriding characteristic directing the
insertion of TM α-helices into the membrane bilayer. Examination of the segmental
hydrophobicity of TM spanning segments with an in vivo hydrophobicity scale developed by
Hessa et al. shows that there is a threshold hydrophobicity for membrane insertion. The average
predicted free energy of insertion (ΔGapp), as calculated for TM α-helices with available high
resolution structures, is around -1 kcal/mol (Hessa et al. 2007). Efficient recognition by the
translocon for insertion into the membrane bilayer, as experimentally determined by this scale,
corresponds to ΔGapp < 0 kcal/mol (Hessa et al. 2007). This calculated energetic value appears to hold most true
for single-spanning TM proteins, but this is not the case for all TM segments - especially TMs
from multi-spanning TM proteins. Approximately 25% of TMs from multi-spanning membrane
proteins have a predicted membrane insertion propensity that is not considered favorable for
insertion, ΔGapp > 0 (Hessa et al. 2007). The lowered relative hydrophobicity of these TM
segments from multi-spanning proteins suggests that efficient membrane insertion of these
segments may depend on contacts with other regions of the protein (Hedin et al. 2010). These
TM segments with below-threshold hydrophobicity would most likely not be recognized by the
translocon as TM α-helices if they were the only membrane-embedded sequence in the protein.
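The meaning of the 0 kcal/mol threshold can be illustrated by converting an apparent free energy into an expected insertion fraction. The sketch below assumes a simple two-state equilibrium between inserted and non-inserted forms, ΔGapp = -RT ln(f_inserted / f_non-inserted); the function name and the temperature of 298.15 K are assumptions for illustration, not values taken from this thesis.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def insertion_fraction(dg_app, temp_k=298.15):
    """Fraction inserted for a two-state equilibrium with apparent free
    energy dg_app (kcal/mol); the temperature is an assumed value."""
    return 1.0 / (1.0 + math.exp(dg_app / (R_KCAL * temp_k)))

for dg in (-1.0, 0.0, 1.0):
    print(f"dG_app = {dg:+.1f} kcal/mol -> {insertion_fraction(dg):.2f}")
```

Under this assumption a segment with ΔGapp of 0 kcal/mol is half inserted, so segments with positive values are predicted to insert in the minority of molecules unless other factors assist.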
6.4.1 Transmembrane segments with low hydrophobicity.
While marginally hydrophobic TM segments are a common theme in membrane proteins,
hydrophobic segments in water-soluble proteins (δ-helices) are also common (Cunningham
et al. 2009; Enquist et al. 2009; Hedin et al. 2010). Our lab uses peptides corresponding to TM
segments and fragments of membrane proteins to study the insertion of hydrophobic segments
into membrane bilayers and their associations within these environments. Beyond these studies,
the δ-helices also provide a unique opportunity to investigate the insertion of marginally
hydrophobic non-TM segments into the membrane bilayer in vivo and to potentially dissect
factors involved in membrane insertion by the cellular machinery.
6.4.2 Importance of charged residues to translocon-mediated membrane insertion of
δ-helices.
Although δ-helices are not bona fide TM segments, the work presented in Chapter 3
indicates that δ-helices are capable of solvation by detergent micelles where they adopt α-helical
structure. While the δ-helices tested for in vivo membrane insertion largely failed to do so in
their wild-type form, introducing polar-to-hydrophobic mutations allowed for their membrane
insertion (Chapter 4). What is most interesting from the results of these membrane insertion
studies is how removing charged residues from the sequence of secreted δ-helix examples did
not result in equal membrane insertion abilities of the segments. The recognition of TM
segments by the translocon is completed co-translationally, and is thought to be based on a
thermodynamic partitioning into the anisotropic environment of the lipid bilayer (Hessa et al.
2007). Relatively little work has been done exploring the sequence dependence of membrane
insertion, but investigations of model segments show that the location of charged residues and/or
aromatic residues within TM segments can greatly affect the insertion propensity; the translocon-
mediated insertion of TM segments into the membrane mirrors the physical properties of the
lipid bilayer (Hessa et al. 2007). The insertion of δ-helices into the membrane bilayer via the
host cellular machinery furthers these studies by the use of actual hydrophobic segments beyond
ideal model systems, and highlights as yet unknown factors in the membrane integration process.
The δ-helix segment from 2AAI reached maximal membrane insertion with three polar-
to-hydrophobic mutations, while the δ-helix from 1ECA had a relative insertion of 50% with
three mutations (Chapter 4; Norholm et al. 2011). A comparison of the segmental Liu-Deber
hydrophobicity suggests that the 1ECA - E6V,D14V,G10V triple mutant is more hydrophobic than
the 2AAI - R8C,E19V,R21I triple mutant, both with and without inclusion of flanking residues
(Table 6.1). However, a comparison of the ΔGapp for both these examples indicates that the in
vivo hydrophobicities (ΔGapp) are similar (Table 6.1). Based on experimental measurements
relating hydrophobicity and the position of charged residues in membrane-spanning segments,
these two secreted δ-helix examples would be predicted to insert into the membrane with equal
propensity (Hessa et al. 2007). The fact that these two segments do not insert into the membrane
with equal efficiency warrants further investigation and implies a sequence dependence of
membrane insertion.
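The comparison between the two triple mutants can be made explicit by subtracting the values listed for them in Table 6.1. The short Python sketch below is only a restatement of that arithmetic; the dictionary keys and the function name are hypothetical labels introduced here.

```python
# Values transcribed from Table 6.1 for the two triple mutants.
# ld = Liu-Deber hydrophobicity, dg = dG_app; "flank" = helix plus flanking region.
TRIPLE_MUTANTS = {
    "1ECA": {"ld_flank": 1.30, "dg_flank": 1.053, "ld_helix": 1.92, "dg_helix": 0.64},
    "2AAI": {"ld_flank": 1.06, "dg_flank": 1.056, "ld_helix": 1.69, "dg_helix": 0.70},
}

def scale_differences(a, b):
    """Per-column difference a - b, rounded to three decimals."""
    return {key: round(a[key] - b[key], 3) for key in a}

diff = scale_differences(TRIPLE_MUTANTS["1ECA"], TRIPLE_MUTANTS["2AAI"])
for key, value in diff.items():
    print(f"{key}: {value:+.3f}")
```

The Liu-Deber differences are both positive by more than 0.2 units, whereas the ΔGapp differences are within 0.06 units of zero, which is the contrast drawn in the paragraph above.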
In comparison to these secreted protein examples (1ECA and 2AAI), the δ-helix from the
cytosolic 2BMH protein reached high levels of membrane insertion after one polar-to-
hydrophobic mutation: 2BMH-E20V. The segmental hydrophobicity of the 2BMH δ-helix
segment, as well as the inclusion of flanking residues, is close to the experimentally determined
threshold for in vitro membrane insertion which is greater than, or equal to 0.4 on the Liu-Deber
hydrophobicity scale (Liu and Deber 1999). As the highest level of membrane insertion is seen
for the most hydrophobic δ-helix investigated, the importance of hydrophobicity in selection of
membrane-spanning segments by the cellular machinery is clearly highlighted. Polar-to-
hydrophobic mutations improve membrane insertion, also reinforcing the significance of the
location of polar/charged residues within the TM sequence (Hessa et al. 2007).
Table 6.1. Liu-Deber and ΔGapp hydrophobicity predictions for δ-helices and mutants.
                                                                  δ-helix plus flanking region  δ-helix
Mutant                  Sequence                                  Liu-Deber a    ΔGapp b        Liu-Deber a    ΔGapp b
1ECA - WT               DFAGAEAAWGATLDTFFGMIFSKM                  0.62           4.51           1.10           4.15
1ECA - D14V             DFAGAEAAWGATLVTFFGMIFSKM                  0.85           2.393          1.38           2.04
1ECA - E6V              DFAGAVAAWGATLDTFFGMIFSKM                  0.81           3.597          1.33           3.16
1ECA - E6V,D14V         DFAGAVAAWGATLVTFFGMIFSKM                  1.04           1.557          1.60           1.20
1ECA - E6V,D14V,G10V    DFAGAVAAWVATLVTFFGMIFSKM                  1.30           1.053          1.92           0.64
2AAI - WT               TQLPTLARSFIICIQMISEAARFQYIEGEMR           0.51           5.06           0.85           2.98
2AAI - R8C,R21I         TQLPTLACSFIICIQMISEAAIFQYIEGEMR           0.91           3.019          1.47           1.98
2AAI - E19V             TQLPTLARSFIICIQMISVAARFQYIEGEMR           0.66           3.103          1.07           1.70
2AAI - R8C,E19V         TQLPTLACSFIICIQMISVAARFQYIEGEMR           0.83           2.45           1.34           0.86
2AAI - R8C,E19V,R21I    TQLPTLACSFIICIQMISVAAIFQYIEGEMR           1.06           1.056          1.69           0.70
2BMH - WT               PLDDENIRYQIITFLIAGHETTSGLLSFALYFLVKNPHV   0.28           7.090          0.75           3.48
2BMH - E20V             PLDDENIRYQIITFLIAGHVTTSGLLSFALYFLVKNPHV   0.39           4.983          0.89           1.38
2BMH - H19L,E20V        PLDDENIRYQIITFLIAGLVTTSGLLSFALYFLVKNPHV   0.63           3.647          1.18           0.04
a Segmental hydrophobicity of the δ-helices as calculated by the Liu-Deber hydrophobicity scale (Liu and Deber
1999).
b Hydrophobicity of the δ-helices as calculated by the ΔGapp scale (Hessa et al. 2007).
The underlined regions of sequence represent the δ-helices as identified by TM Finder (Deber et al. 2001).
The non-underlined regions of sequence represent native residues added to the sequence, as residues flanking TM
helices have previously been shown to affect membrane insertion efficiency (Hessa et al. 2005a).
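The mutant sequences in Table 6.1 can be regenerated from the wild-type segments by applying each point mutation in turn. The following is a minimal sketch of that bookkeeping, not a tool used in this work. The function name apply_mutations is hypothetical, and the sketch assumes that a label such as E6V counts positions from the first residue of the sequence as printed in the table, which holds for the 1ECA and 2BMH entries shown here but should be checked for any other row.

```python
import re


def apply_mutations(sequence, mutations):
    """Apply point mutations such as "E6V" to a sequence.

    Assumption: positions are 1-based and count from the first residue
    of the sequence as written, not from the full-length protein.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation label: {mutation}")
        wild_type, position, replacement = match.groups()
        index = int(position) - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation}: position outside the sequence")
        # Guard against numbering mismatches before changing anything.
        if residues[index] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Wild-type segments (with flanking residues) copied from Table 6.1.
ECA_WT = "DFAGAEAAWGATLDTFFGMIFSKM"
BMH_WT = "PLDDENIRYQIITFLIAGHETTSGLLSFALYFLVKNPHV"

print(apply_mutations(ECA_WT, ["E6V", "D14V", "G10V"]))
print(apply_mutations(BMH_WT, ["H19L", "E20V"]))
```

The wild-type check is deliberate: a label whose numbering does not match the displayed sequence raises an error rather than silently mutating the wrong residue.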
6.4.3 Importance of secondary structure to translocon-mediated membrane insertion of
δ-helices.
As the work highlighted in Chapter 4 indicates that efficient membrane insertion may be
sequence dependent, an additional feature affecting the integration process may be
secondary structure formation within the ribosome and translocon, since the efficiency with
which secondary structure forms depends on the amino acid sequence. It has been shown that
nascent chain folding inside the ribosome may be an important regulatory mechanism for the
topogenesis and integration of single-spanning membrane proteins (Mingarro et al. 2000;
Woolhead et al. 2004), as α-helix formation can occur within the exit tunnel of the translating
ribosome (Lu and Deutsch 2005; Daniel et al. 2008), although the exact mechanisms promoting
helix formation are unknown (Woolhead et al. 2004; Lu and Deutsch 2005). It has been proposed
that nonpolar surfaces in the ribosome exit tunnel induce α-helices, and that TM α-helices
initially formed in the ribosome tunnel persist into the translocon pore (Woolhead et al. 2004),
which could be a contributing factor to the differences in membrane insertion between 1ECA and
2AAI. A folded α-helix could potentially be optimized for membrane insertion relative
to a polypeptide whose polar backbone is exposed to the membrane bilayer.
Hydrophobic, β-branched residues such as Ile and Val may preferentially appear in TM
segments compared to hydrophobic segments from soluble proteins (δ-helices) for reasons
beyond structurally optimizing TM α-helices within the membrane bilayer alone. It has been
shown that, on average, TM segments are comprised of residues with relatively high apolar
helicity. A comparison of the intrinsic α-helical propensity of amino acids in apolar environments
shows that there is a rank order of amino acids in their tendency to form α-helical secondary
structures (Liu and Deber 1999). When applied to the 1ECA and 2AAI δ-helix segments, the
2AAI δ-helix has a greater predicted ability to form α-helical structures in apolar environments,
for both the WT and the triple mutant constructs (Table 6.2). It must be noted, however, that with
the addition of flanking residues to the δ-helix segments investigated for eukaryotic membrane
insertion (Table 6.1), it is not known exactly what portion of the δ-helix segment is actually
traversing the membrane, and comparing the segmental hydrophobicity of these segments remains
somewhat empirical. There may be a sliding window effect with the addition of flanking residues
that can help or hinder membrane insertion, and that makes it difficult to identify precisely which
amino acids would be inserting into the membrane (Table 6.1).
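The sliding window effect described above can be illustrated with a short sketch that scans a segment and reports the window with the highest mean hydrophobicity. This is not the TM Finder or ΔGapp procedure. The Liu-Deber scale values are not reproduced in this chapter, so the well-known Kyte-Doolittle hydropathy scale is used purely as a stand-in, and the 19-residue window is an assumed, adjustable parameter. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in scale
# (assumption: the Liu-Deber values would be substituted in practice).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(segment, scale=KYTE_DOOLITTLE):
    """Mean per-residue hydrophobicity of a segment on the given scale."""
    return sum(scale[residue] for residue in segment) / len(segment)


def most_hydrophobic_window(sequence, window=19, scale=KYTE_DOOLITTLE):
    """Return (start index, subsequence, mean) of the best-scoring window.

    The start index is 0-based; ties resolve to the earliest window.
    """
    if window > len(sequence):
        raise ValueError("window is longer than the sequence")
    starts = range(len(sequence) - window + 1)
    best = max(
        starts,
        key=lambda s: mean_hydrophobicity(sequence[s:s + window], scale),
    )
    segment = sequence[best:best + window]
    return best, segment, mean_hydrophobicity(segment, scale)


# 1ECA segment with flanking residues, wild type and E6V (Table 6.1).
for name, seq in [("WT", "DFAGAEAAWGATLDTFFGMIFSKM"),
                  ("E6V", "DFAGAVAAWGATLDTFFGMIFSKM")]:
    start, segment, score = most_hydrophobic_window(seq)
    print(name, start + 1, segment, round(score, 2))
```

Because the best-scoring window can shift when a flanking residue or a mutation changes the local average, the residues predicted to span the membrane need not be the same from one construct to the next.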
Table 6.2. Comparison of segmental apolar α-helicity for the 1ECA and 2AAI δ-helices.
δ-helix segment a          Apolar segmental helicity b
1ECA                       27.46
1ECA - E6V,D14V,G10V       28.38
2AAI                       33.77
2AAI - R8C,E19V,R21I       34.37
a δ-helix segments include flanking residues (see Table 6.1 for sequences).
b Calculated as per Liu and Deber (1999).
The membrane insertion of δ-helices also raises the question of what happens during the
synthesis of proteins that contain multiple TMs, and of their structure in the translocon. Studies
measuring the membrane insertion of aquaporins and CFTR have indicated that these TM
segments likely have α-helical structures within the translocon pore, and that movement through
the translocon can be directly influenced by the structure of the nascent polypeptide within the
ribosome exit tunnel (Daniel et al. 2008; Pitonzo et al. 2009). The high content of hydrophobic,
β-branched residues in TM segments suggests that TM segment helicity is important in
promoting the membrane insertion of TM segments through the lateral opening of the translocon,
and the hydrophobic constriction within the hourglass-shaped pore may act to
maintain these helical structures (Junne et al. 2010). It is likely that an optimized α-helical
structure in TM segments aids membrane integration, as contacts with the translocon occur
prior to movement into the membrane bilayer.
While the exact mechanism of translocon recognition is currently unknown, the insertion
of non-native segments into the membrane bilayer suggests that there may be specific sequence
features of hydrophobic segments used by the cellular machinery to discriminate between
membrane insertion and translocation of segments through the pore. The results presented in this
thesis suggest a hierarchy for translocon-assisted membrane insertion of TM segments. First, as
shown via the 2BMH construct, hydrophobicity is likely the most important factor directing
membrane insertion, as this segment has the highest segmental hydrophobicity (Table 4.1) and
the highest measured levels of insertion (Chapter 4). Second, the positioning of charged
residues plays an important role in the membrane insertion process: removing charged residues
from non-ideal positions, such as the centre of a δ-helix, drastically improves the measured
membrane insertion. Third, when these factors are equivalent, as in the case of the 1ECA and
2AAI δ-helices, an additional factor such as the secondary structure of the hydrophobic segment
may affect membrane insertion. In concert, these factors could play a role in discriminating TM
segments from non-TM segments in the case of single-spanning membrane proteins, or of segments
that have limited contact with the remainder of the membrane-spanning segments.
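The second tier of this hierarchy, the position of charged residues, can be made concrete with a small sketch that lists the charged residues in a segment and flags those lying in its central region. The definitions here are assumptions for illustration only: Asp, Glu, Lys and Arg are treated as charged (His is not), and the "centre" is taken as the middle half of the segment. Neither choice is a criterion established in this work, and the function names are hypothetical.

```python
# Assumption: only Asp, Glu, Lys and Arg are counted as charged.
CHARGED = set("DEKR")


def charged_positions(segment):
    """Return (1-based position, residue) for each charged residue."""
    return [(i + 1, residue) for i, residue in enumerate(segment)
            if residue in CHARGED]


def central_charges(segment, core_fraction=0.5):
    """Charged residues within the central core_fraction of the segment.

    Assumption: the "centre" is the middle half by default; this is an
    illustrative cut-off, not a measured boundary.
    """
    length = len(segment)
    margin = (length - length * core_fraction) / 2
    return [(position, residue)
            for position, residue in charged_positions(segment)
            if margin < position <= length - margin]


# 1ECA wild type and triple mutant, with flanking residues (Table 6.1).
wild_type = "DFAGAEAAWGATLDTFFGMIFSKM"
triple_mutant = "DFAGAVAAWVATLVTFFGMIFSKM"

print(charged_positions(wild_type))
print(central_charges(wild_type))
print(central_charges(triple_mutant))
```

With these assumed definitions, the 1ECA triple mutant retains charged residues only near its termini, which is the arrangement the text associates with improved insertion.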
Perhaps most importantly, the results presented in Chapter 4 highlight the promiscuity of
the membrane insertion process: the δ-helices can be inserted into biological membranes to
some degree, but they are not TM segments in vivo. Natural TM segments come in all varieties,
from hydrophobic to relatively hydrophilic, which is a natural consequence of membrane-
spanning segments with different functions. The translocon and associated proteins must have a
threshold for TM segment recognition that is based on generic properties, but still retain enough
specificity to avoid incorrect membrane insertion events. This ubiquitous process requires a
delicate balance between rules for membrane insertion, and promiscuity so that only segments
destined for the membrane are integrated. The research elaborated here works towards defining
the characteristics of membrane insertion in greater detail, which in turn, may be used to improve
TM segment prediction.
6.4.4 Biological role of δ-helices.
The detailed mechanism by which a polypeptide chain folds to a specific three-
dimensional protein structure is difficult to determine; however, native states of proteins almost
always correspond to the structures that are most thermodynamically stable under physiological
conditions. This usually means the incorporation of hydrophobic portions of proteins into the
folded interior (Dobson 2003). In water-soluble proteins containing δ-helices, we found that
δ-helix segments have higher percentages of burial within the protein interior than relatively
hydrophobic α-helices of comparable length (Chapter 3). Whether or not the δ-helix segments
are, as a rule, critical to the actual folding of the full-length protein remains to be determined, but
based on the location of the δ-helices within the overall fold, it is a reasonable hypothesis. As an
example, the δ-helix segments from 1ECA and 1MBA studied in this body of work have been
shown to be central in the folding pathways of the full-length proteins in which they reside.
1ECA and 1MBA are members of the globin family of heme-binding proteins, in which all family
members contain the basic globin fold of 7 helices, labelled A, B, C, E, F, G and H (Aronson et
al. 1994). The δ-helix segments within 1ECA and 1MBA correspond to the H-helix, which is
thought to form one of the first structural elements of the globin fold, on which the remainder of
the protein docks to form the final, folded structure (Nishimura et al. 2000).
In a grand scheme, the δ-helices may represent situations similar to the events that resulted in
the evolution of TM proteins. The work highlighted in Chapter 4 indicates how it is possible to
incorporate a hydrophobic δ-helix into the membrane bilayer. The actual transformation of a
δ-helix to a membrane-spanning segment was difficult, requiring polar-to-hydrophobic amino acid
mutations as well as removal of upstream structural elements, but the end result was a change to
the functional location of the protein. In evolutionary terms, such mutational events and
structural rearrangements were perhaps at the forefront of generating TM segments and
membrane-spanning proteins, even though the origin of membranes and membrane proteins
remains enigmatic: for example, a lipid membrane would be of little function without membrane
proteins to connect membrane-bound contents to the outside world, but how could
membrane proteins have evolved without functional membranes (Mulkidjanian et al. 2009)?
Membrane proteins that traverse the membrane bilayer contain long stretches of
hydrophobic amino acids. Assuming that water-soluble proteins contain a somewhat random
distribution of polar and non-polar amino acids, unlike TM proteins that contain long stretches of
hydrophobic residues, a gradual evolution from soluble proteins to membrane proteins with long
hydrophobic stretches must be considered (Mulkidjanian et al. 2009). Evolution could also have
proceeded in the opposite direction: proteins with hydrophobic stretches spanning primordial
membranes would lose their apolar nature to become water-soluble.
The first membrane protein could have evolved from a soluble protein that attaches to
membranes and inserts into the bilayer. One protein that could serve as an example of this
route is vinculin, which is involved in regulating cell adhesion,
spreading, and motility (Goldmann et al. 1996) and is thought to insert into the membrane
bilayer following attachment to membranes, where it can bind acidic phospholipids (Bakolitsa et al.
1999). Amphipathic helices, where hydrophobic and hydrophilic residues are located on
opposite faces when the peptide folds into an α-helix, could also be ancestors of membrane-
spanning segments and are of particular interest, as these peptides can easily adopt an orientation
where the hydrophilic face is buried in water while the hydrophobic face is exposed to the
nonpolar environment formed by the hydrocarbon tails of the lipids (Pohorille et al. 2005). Most
importantly, the match between the polarity of the α-helix and its environment is stable, and this
appears to be more important than the specific identity of the amino acids in the amphipathic
sequence (Pohorille et al. 2005). In a similar way, membrane-spanning segments could have
evolved from sequences similar to gramicidin. This simple membrane channel inserts into the
membrane in a stepwise manner that likely includes formation of a water-insoluble gramicidin
aggregate, dissociation from the aggregate, partitioning of the peptide to the membrane surface,
oligomerization on the surface, and insertion and folding of the peptide into its double-helical
form (Hicks et al. 2008). The sequence of gramicidin is hydrophobic, and primordial segments
such as these may have adopted a membrane-spanning structure in an equivalent manner.
Hydrophobic or amphipathic helices as mentioned here may spontaneously insert into the
membrane bilayer in a manner unassisted by the translocon, but because of the nature of the
constructs used in the membrane insertion assays in this work, this is not a likely
scenario for the δ-helix segments. In the case of the TOXCAT assay, the δ-helix segment of
interest is fused to MBP, which is normally a periplasmic protein. Due to the presence of a large
fusion protein, spontaneous insertion of δ-helices into the membrane to establish a membrane-
inserted orientation would likely be difficult (Chapter 3). In the case of the membrane insertion
assay employed in Chapter 4 of this thesis, the segment of interest is cloned behind a natural TM
segment from the integral membrane protein leader peptidase, which would direct protein
expression to the membrane. Segments that follow this model TM segment would either be
identified for membrane insertion by the cellular machinery, or not, and spontaneous insertion is
not probable due to the nature of the construct (Fig 4.1). As for the truncated and mutated
2BMH protein, for which a limited amount of membrane insertion was measured, spontaneous
membrane insertion is possible, but unlikely given the glycosylation pattern observed for the
construct (Fig 4.3).
6.5 Helix-helix interactions.
Many TM α-helices associate to form functional membrane proteins or domains of larger
structures, but when do the TM α-helices of multi-spanning membrane proteins first interact?
One possibility is that the translocon identifies TM segments in a linear manner and carries out
each membrane integration event independently and in sequential progression (Pitonzo and
Skach 2006), with any structural contacts leading to higher order structures forming afterwards
within the membrane bilayer. Although this simplistic model is appealing, it was shown several
years ago that the TMs of many polytopic membrane proteins integrate into the bilayer in pairs or
groups (Skach and Lingappa 1993). Membrane proteins that contain short loops may form their
helical interactions within the translocon pore, or very soon after exiting laterally into the
bilayer: the physical constraints imposed by short loops would make this a possibility, and the
early environment experienced by TM segments may play a role in determining how and when
TM segments begin to associate. Studies to date have primarily focused on TM segment
associations with the translocon machinery, and it has been shown that different α-helices
contact the translocon machinery for different lengths of time, which could contribute to structural
contacts within the protein (Sadlish et al. 2005). Constructs such as CFTR TM2/3/4 would
be useful in addressing questions relating to this area. In the structural model of CFTR,
based on the high-resolution structure of Sav1866, it does not appear that TM2 contacts TM3 and
TM4 in the final folded structure (Serohijos et al. 2008). If helix contacts are established
between helices in the translocon for polytopic membrane proteins, then no helical contacts for
TM2 to the rest of the protein should be observed. The model consisting of CFTR TM2/3/4 is an
ideal system to address the formation of early helical contacts in a translocating system.
Beyond helical contacts created in the translocon, it has been established that specific
sequence determinants direct α-helix association, and that the sequence of amino acids in TM
segments is critical to the formation of higher order structures. The work presented in Chapter 2
of this thesis highlights how a sequence-specific oligomerization motif can be modulated to
influence TM α-helix association. Gathering structural information on the folding of membrane
proteins is useful towards the eventual prediction of structural contacts between membrane-
spanning segments in the formation of tertiary and quaternary structure.
6.5.1 Sequence specific dimerization motifs.
The most widely studied motif directing TM segment dimerization is the GG4 segment in
GpA. This structural motif separates two Gly residues by three amino acids, and creates a
concave surface allowing for the close approach of interacting helices (MacKenzie et al. 1997).
While GpA is thought to dimerize principally by using a central GG4 motif (Lemmon et al.
1992b; MacKenzie et al. 1997), it is likely that residues surrounding GG4 motifs
also shape the affinity of the helix-helix interactions. For example, the presence of a GG4 motif
does not guarantee a tight interaction between helices: MCP from the M13 bacteriophage utilizes
a GG4 motif to direct dimerization, but its association is only moderately stable
compared to GpA (Melnyk et al. 2002).
The importance of the Val residues at the i + 1 and i + 5 positions relative to the GG4 motif was
first recognized in a systematic replacement of GpA TM segment residues to identify interfacial
amino acids (Lemmon et al. 1992b). The data presented in this thesis expand on this work to
identify relative differences in dimerization strength upon mutation, and to propose a structural
basis for these differences in oligomerization among mutants. We observed that the strength of
GpA dimerization can be modulated by replacement of the Val residues at positions
neighbouring the GG4 motif. Mutation to Ile, a similarly large, hydrophobic β-branched residue,
significantly improves dimerization, while mutation to Leu significantly reduces dimerization.
These mutations are considered relatively conservative, yet they result in strong differences in
dimerization among mutants. As conservative mutations are often made to highlight the
importance of specific side-chain characteristics, the work presented here provides an example
where small changes to the side-chain profile through mutation can have drastic consequences.
At times it may not be possible to make conservative mutations in membrane proteins, as the
structural features in TM segments are finely tuned to carry out specific functions.
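The spacing that defines the GG4 motif, and the neighbouring positions discussed above, lend themselves to a simple sequence scan. The sketch below locates every Gly-x-x-x-Gly pair in a sequence and reports the residues at chosen offsets from the first Gly. It is an illustration only: the function names are hypothetical, the 13-residue stretch used as input is the GpA TM core as commonly written in the literature rather than a sequence quoted in this chapter, and "ridge" residues are restricted to Val and Ile by assumption.

```python
import re

# Assumption: the large, hydrophobic beta-branched residues of interest
# are limited to Val and Ile.
LARGE_BETA_BRANCHED = set("VI")


def find_gg4_motifs(sequence):
    """Return 0-based indices i where residues i and i + 4 are both Gly.

    A zero-width lookahead is used so that overlapping motifs
    (e.g. GxxxGxxxG) are all reported.
    """
    return [match.start() for match in re.finditer(r"(?=G...G)", sequence)]


def neighbouring_residues(sequence, i, offsets=(1, 5)):
    """Residues at the given offsets from the first Gly of a motif.

    Offsets falling outside the sequence are returned as None.
    """
    return tuple(
        sequence[i + offset] if 0 <= i + offset < len(sequence) else None
        for offset in offsets
    )


# Assumed input: GpA TM core stretch as commonly cited in the literature.
gpa_core = "LIIFGVMAGVIGT"

for start in find_gg4_motifs(gpa_core):
    ridge = neighbouring_residues(gpa_core, start)
    both_branched = all(r in LARGE_BETA_BRANCHED for r in ridge)
    print(start + 1, ridge, both_branched)
```

Passing offsets=(-1, 3) examines the alternative neighbouring positions instead of i + 1 and i + 5, so the same scan could be applied to other GG4-containing TM segments.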
6.5.2 Prediction of helix-helix interactions.
The challenge in gaining insights into membrane protein folding and predicting
interactions between TM helices lies in the fact that membrane proteins display a higher than
expected structural complexity in their mode of crossing the membrane. Membrane proteins can
contain non-α-helical elements, such as 3₁₀-helices, π-helices or intra-helical kinks (Riek et al.
2001), or may include loops which enter the membrane and turn back, as in the membrane pore
aquaporin (Murata et al. 2000; Viklund et al. 2006). The complication of predicting helix
contact points is best highlighted by segments such as δ-helices, which are capable of interactions
within a membrane bilayer even though they are not membrane-spanning segments. It has also
been shown that computational methods trained to predict residue contacts in globular proteins
perform only moderately well when applied to membrane proteins (Fuchs et al. 2009), so a
separate set of contact criteria must be considered.
Several methods have been developed to predict helix interaction points, ranging from
measurements of lipid accessibility, to determination of the contact points of energy-minimized
helices, to neural networks that consider actual amino acid sequences, all with limited success.
Measurements of lipid accessibility, which highlight the contribution of the environment to the
promotion of higher order structures in membrane proteins, were used in this thesis to partially
explain differences in oligomerization among GpA mutants. This procedure involves measuring
the accessibility of an energy-minimized α-helix to a methylene-sized probe as
an estimate of solvation by membrane components. This probe mimics the lipid acyl chain
radius in size, although it has considerably more conformational freedom than a methylene group
covalently linked in membrane lipids (Johnson et al. 2006). The correlation that we observed
between GpA dimerization and the lipid-accessible surface area implies the existence of nonpolar
cavities on the surface of the α-helical structure, which have a role in determining and promoting
dimer affinity. These cavities essentially represent areas of the protein structure that do not
easily contact lipid, creating unfavourable voids in the monomeric state.
When lipid accessibility was used to explain differences among GpA mutants in which the
GG4 motif was specifically mutated to other small residues (Gly, Ala and Ser), a strong inverse
correlation was observed between dimerization as measured through TOXCAT and lipid
accessibility (R = -0.75) (Johnson et al. 2006). An inverse correlation between TOXCAT dimer
affinity and lipid accessibility was also observed in the present work when considering the large
β-branched residues in GpA at the i + 1, i + 5 positions; however, the correlation was not as
strong (R = -0.54) (Cunningham et al. 2010). As seen in the NMR structure of the GpA
dimer produced in DPC micelles, the large β-branched Val residues surrounding the GG4 motif
are involved in creating a ridge structure that likely promotes dimerization through
complementary van der Waals packing into the groove created by the small residues (MacKenzie
et al. 1997). The weaker correlation observed when considering mutations to the large ridge
residues implies that changing the lipid solvation of the ridge structure does not have such a strong
component in determining dimerization. Rather, creating optimized ridge surfaces, and therefore
optimized van der Waals packing surfaces, plays more of a role in dimerization for these positions
than for the small Gly residues. Our results thus highlight the different contributions that amino
acids make in determining the interactive surfaces of TM α-helices and how nuanced these packing
interactions are. Relatively simple mutations at neighbouring positions affect dimerization
differently, and at the present stage of the study of membrane protein folding, extensive
mutagenesis is still required to successfully determine packing surfaces. For the eventual
prediction of membrane protein structures, gathering experimental data relating to factors
involved in driving TM segment association remains critical.
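The R values quoted above are correlation coefficients between a dimerization readout and lipid accessibility across a set of mutants. As a reminder of what such a value summarizes, the following sketch computes a Pearson correlation coefficient from paired measurements. It assumes that R denotes the Pearson coefficient, which is not stated explicitly here. The numbers in the example are hypothetical placeholders chosen only to show an inverse trend; they are not data from this thesis or from Johnson et al. (2006).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical values for illustration only; NOT measurements from this work.
lipid_accessibility = [10.0, 20.0, 30.0, 40.0]   # arbitrary units
relative_dimerization = [0.9, 0.7, 0.8, 0.2]     # arbitrary units

r = pearson_r(lipid_accessibility, relative_dimerization)
print(f"R = {r:.2f}")
```

A value near -1 indicates that dimerization falls almost linearly as lipid accessibility rises, whereas a value closer to 0, as for the ridge-residue mutants, indicates that lipid accessibility accounts for less of the variation.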
6.6 Insights from high resolution structures of membrane proteins.
Available high-resolution crystal structures have revealed important insights into
membrane insertion events, and contacts between helices in multi-spanning membrane proteins.
The high-resolution structure of the mammalian Shaker Kv1.2 potassium channel (2A79) helped
to clarify a complicated folding pathway that includes interactions between highly charged TM
segments, allowing for insertion into the membrane (Long et al. 2007). The S4 α-helix,
containing several positively charged residues, interacts with negatively charged side chains in
the S1 and S2 α-helices which form the voltage sensor. These segments are critical to the
function of this multi-spanning membrane protein, and when removed from the full-length
protein context, are not capable of membrane insertion on their own (Hessa et al. 2005b).
The outline for membrane protein over-expression in Chapter 5 of this thesis works
towards an optimized strategy for studying membrane proteins. Production of large quantities of
membrane proteins is required for biophysical analysis, and obtaining protein in amounts
sufficient to make these types of studies feasible will ultimately be essential to understanding the
detailed rules of membrane insertion and folding.
6.7 Future directions of membrane protein folding.
To arrive at the ability to accurately predict the insertion propensity of TM segments into
the membrane bilayer, along with tertiary and quaternary interactions between TM α-helices, it
is ultimately necessary to identify all of the energetic contributors to both of these cellular
processes. To accomplish this task, both in vitro and in vivo experimental approaches will be
useful. Synthesis of peptides corresponding to both TM segments and hydrophobic δ-helices, and
subsequent analysis via SDS-PAGE, size exclusion chromatography and FRET, would provide
information regarding the behaviour of these segments in membrane-mimetic environments.
Expression of TM segments, and study of their insertion and interactions within a native-like
membrane bilayer environment through use of assays such as TOXCAT, would provide an in vivo
comparison to these studies.
Previous work has highlighted the subtleties of TM segment integration into the
membrane bilayer, with segmental hydrophobicity and position of charged residues likely
dominating the insertion process (Hessa et al. 2005a; Hessa et al. 2007; Cunningham et al. 2009),
and experiments designed to determine basic rules of membrane integration have been based on
model systems consisting of designed TM segments. However, limited work has been done to
address the importance of the sequence context in the process. For example, can the presence of
a charged residue in a TM segment be offset or buffered by increasing or decreasing surrounding
hydrophobicity? Further study of the membrane integration of δ-helices would be well suited to
answer this question. δ-helices contain charged residues placed regularly throughout their
sequence, unlike TM segments, which have charged residues concentrated at the helix termini
(Cunningham et al. 2009). Because of their high segmental hydrophobicity, δ-helices are
capable of membrane insertion in the context of the TOXCAT assay, but introducing mutations
to change the local hydrophobicity surrounding existing charged residues in δ-helices would be
useful in answering questions as to how the cellular machinery “handles” charged residues.
A statistical study identifying the residues that surround native charged residues in actual TM
segments, and their nature, would aid in the choice of mutations to test in δ-helices
to further define the membrane insertion of TM and TM-like segments.
The determination of forces directing the association of TM segments within the
membrane bilayer is another area worthy of further investigation, as the rules of TM segment
association have yet to be fully defined. For example, several TM segments which
dimerize via a GG4 motif also have large, hydrophobic, β-branched residues at positions
neighbouring the small residues (Rath et al. 2009b). The ability of these residues to modulate the
folding of GpA by either increasing (Ile) or decreasing (Leu) dimerization relative to the WT
(Val) was discovered (Cunningham et al. 2010), but would the same type of mutational analysis
hold true for other proteins that dimerize with a similar sequence space? In these
situations, a systematic investigation of large, hydrophobic, β-branched
residues at positions neighbouring GG4 motifs, in either the i + 1, i + 5 or the i - 1, i + 3
positions, would be useful to fully evaluate the importance of these neighbouring ridge residues in
promoting dimerization. Mutational analysis and testing of oligomerization propensities via the
TOXCAT assay would be an informative experimental setup.
Extensive research has been directed towards understanding the
homo-oligomerization of membrane-spanning segments. While this kind of information is
extremely useful for identifying sequence specific dimerization motifs, or the mechanism of
dimerization of single-pass TM proteins like GpA, it does not address the helix-helix contacts
observed in multi-spanning membrane proteins that form the final, folded structure. A method of
evaluating hetero-oligomerization of TM segments would be useful, as little information has
been generated in this regard. To this end, some progress has been made with the development
of the GALLEX assay (Schneider and Engelman 2003). A variation of the TOXCAT assay,
GALLEX can be used to study hetero-dimerization, and work is currently being carried out in
our lab to understand the folding interactions of both single-pass and multi-pass membrane
proteins.
The work presented in this thesis has dealt with optimizing membrane protein expression
for biophysical and biochemical study, understanding determinants of membrane insertion based
on the primary amino acid sequence, as well as identifying contributors to oligomerization events
within the membrane bilayer. With the eventual promise of producing correct high-resolution
structures of membrane proteins, our results have identified several basic rules of membrane
protein folding that can be applied to future biophysical and biochemical studies of this unique
group of structures.
Chapter 7: Literature Cited.
Literature Cited
Adamian, L., and Liang, J. 2002. Interhelical hydrogen bonds and spatial motifs in membrane
proteins: polar clamps and serine zippers. Proteins 47: 209-218.
Adams, P.D., Arkin, I.T., Engelman, D.M., and Brunger, A.T. 1995. Computational searching
and mutagenesis suggest a structure for the pentameric transmembrane domain of
phospholamban. Nature structural biology 2: 154-162.
Adams, P.D., Engelman, D.M., and Brunger, A.T. 1996. Improved prediction for the structure of
the dimeric transmembrane domain of glycophorin A obtained through global searching.
Proteins 26: 257-261.
Akermoun, M., Koglin, M., Zvalova-Iooss, D., Folschweiller, N., Dowell, S.J., and Gearing,
K.L. 2005. Characterization of 16 human G protein-coupled receptors expressed in
baculovirus-infected insect cells. Protein expression and purification 44: 65-74.
Almen, M.S., Nordstrom, K.J., Fredriksson, R., and Schioth, H.B. 2009. Mapping the human
membrane proteome: a majority of the human membrane proteins can be classified
according to function and evolutionary origin. BMC biology 7: 50.
Arbely, E., and Arkin, I.T. 2004. Experimental measurement of the strength of a C alpha-H...O
bond in a lipid bilayer. Journal of the American Chemical Society 126: 5362-5363.
Aronson, H.E., Royer, W.E., Jr., and Hendrickson, W.A. 1994. Quantification of tertiary
structural conservation despite primary sequence drift in the globin fold. Protein Sci 3:
1706-1711.
Atwell, S., Brouillette, C.G., Conners, K., Emtage, S., Gheyi, T., Guggino, W.B., Hendle, J.,
Hunt, J.F., Lewis, H.A., Lu, F., et al. 2010. Structures of a minimal human CFTR first
nucleotide-binding domain as a monomer, head-to-tail homodimer, and pathogenic
mutant. Protein Eng Des Sel 23: 375-384.
Baker, J.M., Hudson, R.P., Kanelis, V., Choy, W.Y., Thibodeau, P.H., Thomas, P.J., and
Forman-Kay, J.D. 2007. CFTR regulatory region interacts with NBD1 predominantly via
multiple transient helices. Nature structural & molecular biology 14: 738-745.
Bakolitsa, C., de Pereda, J.M., Bagshaw, C.R., Critchley, D.R., and Liddington, R.C. 1999.
Crystal structure of the vinculin tail suggests a pathway for activation. Cell 99: 603-613.
Baneyx, F., and Mujacic, M. 2004. Recombinant protein folding and misfolding in Escherichia
coli. Nature biotechnology 22: 1399-1408.
Bocharov, E.V., Pustovalova, Y.E., Pavlov, K.V., Volynsky, P.E., Goncharuk, M.V., Ermolyuk,
Y.S., Karpunin, D.V., Schulga, A.A., Kirpichnikov, M.P., Efremov, R.G., et al. 2007.
Unique dimeric structure of BNip3 transmembrane domain suggests membrane
permeabilization as a cell death trigger. The Journal of biological chemistry 282: 16256-
16266.
Bogdanov, M., and Dowhan, W. 1995. Phosphatidylethanolamine is required for in vivo function
of the membrane-associated lactose permease of Escherichia coli. The Journal of
biological chemistry 270: 732-739.
Bogdanov, M., and Dowhan, W. 1999. Lipid-assisted protein folding. The Journal of biological
chemistry 274: 36827-36830.
Bogdanov, M., Umeda, M., and Dowhan, W. 1999. Phospholipid-assisted refolding of an
integral membrane protein. Minimum structural features for phosphatidylethanolamine to
act as a molecular chaperone. The Journal of biological chemistry 274: 12339-12345.
Bowie, J.U. 1997. Helix packing in membrane proteins. Journal of molecular biology 272: 780-
789.
Boyd, D., Schierle, C., and Beckwith, J. 1998. How many membrane proteins are there? Protein
Sci 7: 201-205.
Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W.,
Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., et al. 1998. Crystallography & NMR
system: A new software suite for macromolecular structure determination. Acta
crystallographica 54: 905-921.
Buck, T.M., Wagner, J., Grund, S., and Skach, W.R. 2007. A novel tripartite motif involved in
aquaporin topogenesis, monomer folding and tetramerization. Nature structural &
molecular biology 14: 762-769.
Carpenter, E.P., Beis, K., Cameron, A.D., and Iwata, S. 2008. Overcoming the challenges of
membrane protein crystallography. Current opinion in structural biology 18: 581-586.
Chang, G., Spencer, R.H., Lee, A.T., Barclay, M.T., and Rees, D.C. 1998. Structure of the MscL
homolog from Mycobacterium tuberculosis: a gated mechanosensitive ion channel.
Science (New York, N.Y.) 282: 2220-2226.
Chen, H., and Kendall, D.A. 1995. Artificial transmembrane segments. Requirements for stop
transfer and polypeptide orientation. The Journal of biological chemistry 270: 14115-
14122.
Cheng, S.H., Gregory, R.J., Marshall, J., Paul, S., Souza, D.W., White, G.A., O'Riordan, C.R.,
and Smith, A.E. 1990. Defective intracellular transport and processing of CFTR is the
molecular basis of most cystic fibrosis. Cell 63: 827-834.
Cheung, J.C., and Deber, C.M. 2008. Misfolding of the cystic fibrosis transmembrane
conductance regulator and disease. Biochemistry 47: 1465-1473.
Cheung, J.C., Kim Chiaw, P., Pasyk, S., and Bear, C.E. 2008. Molecular basis for the ATPase
activity of CFTR. Archives of biochemistry and biophysics 476: 95-100.
Chin, C.N., Sachs, J.N., and Engelman, D.M. 2005. Transmembrane homodimerization of
receptor-like protein tyrosine phosphatases. FEBS letters 579: 3855-3858.
Choi, M.Y., Cardarelli, L., Therien, A.G., and Deber, C.M. 2004. Non-native interhelical
hydrogen bonds in the cystic fibrosis transmembrane conductance regulator domain
modulated by polar mutations. Biochemistry 43: 8077-8083.
Choi, M.Y., Partridge, A.W., Daniels, C., Du, K., Lukacs, G.L., and Deber, C.M. 2005.
Destabilization of the transmembrane domain induces misfolding in a phenotypic mutant
of cystic fibrosis transmembrane conductance regulator. The Journal of biological
chemistry 280: 4968-4974.
Choma, C., Gratkowski, H., Lear, J.D., and DeGrado, W.F. 2000. Asparagine-mediated self-
association of a model transmembrane helix. Nature structural biology 7: 161-166.
Chothia, C. 1975. Structural invariants in protein folding. Nature 254: 304-308.
Chou, P.Y., and Fasman, G.D. 1978. Empirical predictions of protein conformation. Annual
review of biochemistry 47: 251-276.
Claros, M.G., and von Heijne, G. 1994. TopPred II: an improved software for membrane protein
structure predictions. Comput Appl Biosci 10: 685-686.
Cserzo, M., Eisenhaber, F., Eisenhaber, B., and Simon, I. 2002. On filtering false positive
transmembrane protein predictions. Protein engineering 15: 745-752.
Cserzo, M., Wallin, E., Simon, I., von Heijne, G., and Elofsson, A. 1997. Prediction of
transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment
surface method. Protein engineering 10: 673-676.
Cunningham, F., and Deber, C.M. 2007. Optimizing synthesis and expression of transmembrane
peptides and proteins. Methods (San Diego, Calif.) 41: 370-380.
Cunningham, F., Poulsen, B.E., Ip, W., and Deber, C.M. 2010. Beta-branched residues adjacent
to GG4 motifs promote the efficient association of glycophorin A transmembrane helices.
Biopolymers Epub ahead of print.
Cunningham, F., Rath, A., Johnson, R.M., and Deber, C.M. 2009. Distinctions between
hydrophobic helices in globular proteins and transmembrane segments as factors in
protein sorting. The Journal of biological chemistry 284: 5395-5402.
Cuthbertson, J.M., Bond, P.J., and Sansom, M.S. 2006. Transmembrane helix-helix interactions:
comparative simulations of the glycophorin A dimer. Biochemistry 45: 14298-14310.
Cuthbertson, J.M., Doyle, D.A., and Sansom, M.S. 2005. Transmembrane helix prediction: a
comparative evaluation and analysis. Protein Eng Des Sel 18: 295-308.
Daniel, C.J., Conti, B., Johnson, A.E., and Skach, W.R. 2008. Control of translocation through
the Sec61 translocon by nascent polypeptide structure within the ribosome. The Journal
of biological chemistry 283: 20864-20873.
Dawson, J.P., Weinger, J.S., and Engelman, D.M. 2002. Motifs of serine and threonine can drive
association of transmembrane helices. Journal of molecular biology 316: 799-805.
Dawson, R.J., and Locher, K.P. 2006. Structure of a bacterial multidrug ABC transporter. Nature
443: 180-185.
Deber, C.M., Khan, A.R., Li, Z., Joensson, C., Glibowicka, M., and Wang, J. 1993. Val→Ala
mutations selectively alter helix-helix packing in the transmembrane segment of phage
M13 coat protein. Proceedings of the National Academy of Sciences of the United States
of America 90: 11648-11652.
Deber, C.M., Wang, C., Liu, L.P., Prior, A.S., Agrawal, S., Muskat, B.L., and Cuticchia, A.J.
2001. TM Finder: a prediction program for transmembrane protein segments using a
combination of hydrophobicity and nonpolar phase helicity scales. Protein Sci 10: 212-
219.
Deisenhofer, J., Epp, O., Miki, K., Huber, R., and Michel, H. 1984. X-ray structure analysis of a
membrane protein complex. Electron density map at 3 A resolution and a model of the
chromophores of the photosynthetic reaction center from Rhodopseudomonas viridis.
Journal of molecular biology 180: 385-398.
Dill, K.A., Ozkan, S.B., Shell, M.S., and Weikl, T.R. 2008. The protein folding problem. Annual
review of biophysics 37: 289-316.
Dobson, C.M. 2003. Protein folding and misfolding. Nature 426: 884-890.
Dougherty, D.A. 2007. Cation-pi interactions involving aromatic amino acids. The Journal of
nutrition 137: 1504S-1508S; discussion 1516S-1517S.
Douglas, J.L., Trieber, C.A., Afara, M., and Young, H.S. 2005. Rapid, high-yield expression and
purification of Ca2+-ATPase regulatory proteins for high-resolution structural studies.
Protein expression and purification 40: 118-125.
Doura, A.K., and Fleming, K.G. 2004. Complex interactions at the helix-helix interface stabilize
the glycophorin A transmembrane dimer. Journal of molecular biology 343: 1487-1497.
Doura, A.K., Kobus, F.J., Dubrovsky, L., Hibbard, E., and Fleming, K.G. 2004. Sequence
context modulates the stability of a GxxxG-mediated transmembrane helix-helix dimer.
Journal of molecular biology 341: 991-998.
Dumon-Seignovert, L., Cariot, G., and Vuillard, L. 2004. The toxicity of recombinant proteins in
Escherichia coli: a comparison of overexpression in BL21(DE3), C41(DE3), and
C43(DE3). Protein expression and purification 37: 203-206.
Duong, M.T., Jaszewski, T.M., Fleming, K.G., and MacKenzie, K.R. 2007. Changes in apparent
free energy of helix-helix dimerization in a biological membrane due to point mutations.
Journal of molecular biology 371: 422-434.
Ebie, A.Z., and Fleming, K.G. 2007. Dimerization of the erythropoietin receptor transmembrane
domain in micelles. Journal of molecular biology 366: 517-524.
Eisenberg, D., Schwarz, E., Komaromy, M., and Wall, R. 1984. Analysis of membrane and
surface protein sequences with the hydrophobic moment plot. Journal of molecular
biology 179: 125-142.
Ellis, R.J. 1997. Do molecular chaperones have to be proteins? Biochemical and biophysical
research communications 238: 687-692.
Elmore, D.E., and Dougherty, D.A. 2003. Investigating lipid composition effects on the
mechanosensitive channel of large conductance (MscL) using molecular dynamics
simulations. Biophysical journal 85: 1512-1524.
Engelman, D.M. 2005. Membranes are more mosaic than fluid. Nature 438: 578-580.
Engelman, D.M., Steitz, T.A., and Goldman, A. 1986. Identifying nonpolar transbilayer helices
in amino acid sequences of membrane proteins. Annual review of biophysics and
biophysical chemistry 15: 321-353.
Enquist, K., Fransson, M., Boekel, C., Bengtsson, I., Geiger, K., Lang, L., Pettersson, A.,
Johansson, S., von Heijne, G., and Nilsson, I. 2009. Membrane-integration characteristics
of two ABC transporters, CFTR and P-glycoprotein. Journal of molecular biology 387:
1153-1164.
Eshaghi, S., Niegowski, D., Kohl, A., Martinez Molina, D., Lesley, S.A., and Nordlund, P. 2006.
Crystal structure of a divalent metal ion transporter CorA at 2.9 angstrom resolution.
Science (New York, N.Y.) 313: 354-357.
Faham, S., Yang, D., Bare, E., Yohannan, S., Whitelegge, J.P., and Bowie, J.U. 2004. Side-chain
contributions to membrane protein structure and stability. Journal of molecular biology
335: 297-305.
Ferreira, G.C., and Pedersen, P.L. 1992. Overexpression of higher eukaryotic membrane proteins
in bacteria. Novel insights obtained with the liver mitochondrial proton/phosphate
symporter. The Journal of biological chemistry 267: 5460-5466.
Fisher, L.E., Engelman, D.M., and Sturgis, J.N. 2003. Effect of detergents on the association of
the glycophorin A transmembrane helix. Biophysical journal 85: 3097-3105.
Fleming, K.G., and Engelman, D.M. 2001. Specificity in transmembrane helix-helix interactions
can define a hierarchy of stability for sequence variants. Proceedings of the National
Academy of Sciences of the United States of America 98: 14340-14344.
Frelet, A., and Klein, M. 2006. Insight in eukaryotic ABC transporter function by mutation
analysis. FEBS letters 580: 1064-1084.
Frydman, J., and Hartl, F.U. 1996. Principles of chaperone-assisted protein folding: differences
between in vitro and in vivo mechanisms. Science (New York, N.Y.) 272: 1497-1502.
Fu, D., Libson, A., Miercke, L.J., Weitzman, C., Nollert, P., Krucinski, J., and Stroud, R.M.
2000. Structure of a glycerol-conducting channel and the basis for its selectivity. Science
(New York, N.Y.) 290: 481-486.
Fuchs, A., Kirschner, A., and Frishman, D. 2009. Prediction of helix-helix contacts and
interacting helices in polytopic membrane proteins using neural networks. Proteins 74:
857-871.
Gaddie, K.J., and Kirley, T.L. 2009. Conserved polar residues stabilize transmembrane domains
and promote oligomerization in human nucleoside triphosphate diphosphohydrolase 3.
Biochemistry 48: 9437-9447.
Galdiero, S., Galdiero, M., and Pedone, C. 2007. beta-Barrel membrane bacterial proteins:
structure, function, assembly and interaction with lipids. Current protein & peptide
science 8: 63-82.
Gerber, N.C., and Sligar, S.G. 1992. Catalytic mechanism of Cytochrome-P-450 - evidence for a
distal charge relay. Journal of the American Chemical Society 114: 8742-8743.
Go, M.Y., Kim, S., Partridge, A.W., Melnyk, R.A., Rath, A., Deber, C.M., and Mogridge, J.
2006. Self-association of the transmembrane domain of an anthrax toxin receptor.
Journal of molecular biology 360: 145-156.
Goldmann, W.H., Ezzell, R.M., Adamson, E.D., Niggli, V., and Isenberg, G. 1996. Vinculin,
talin and focal adhesions. Journal of muscle research and cell motility 17: 1-5.
Goldstein, J., Pollitt, N.S., and Inouye, M. 1990. Major cold shock protein of Escherichia coli.
Proceedings of the National Academy of Sciences of the United States of America 87:
283-287.
Gratkowski, H., Lear, J.D., and DeGrado, W.F. 2001. Polar side chains drive the association of
model transmembrane peptides. Proceedings of the National Academy of Sciences of the
United States of America 98: 880-885.
Hankamer, B., Morris, E.P., and Barber, J. 1999. Revealing the structure of the oxygen-evolving
core dimer of photosystem II by cryoelectron crystallography. Nature structural biology
6: 560-564.
Haupt, M., Bramkamp, M., Coles, M., Kessler, H., and Altendorf, K. 2005. Prokaryotic Kdp-
ATPase: recent insights into the structure and function of KdpB. Journal of molecular
microbiology and biotechnology 10: 120-131.
Hawkins, C.A., de Alba, E., and Tjandra, N. 2005. Solution structure of human saposin C in a
detergent environment. Journal of molecular biology 346: 1381-1392.
Hedin, L.E., Ojemalm, K., Bernsel, A., Hennerdal, A., Illergard, K., Enquist, K., Kauko, A.,
Cristobal, S., von Heijne, G., Lerch-Bader, M., et al. 2010. Membrane insertion of
marginally hydrophobic transmembrane helices depends on sequence context. Journal of
molecular biology 396: 221-229.
Hessa, T., Kim, H., Bihlmaier, K., Lundin, C., Boekel, J., Andersson, H., Nilsson, I., White,
S.H., and von Heijne, G. 2005a. Recognition of transmembrane helices by the
endoplasmic reticulum translocon. Nature 433: 377-381.
Hessa, T., Meindl-Beinker, N.M., Bernsel, A., Kim, H., Sato, Y., Lerch-Bader, M., Nilsson, I.,
White, S.H., and von Heijne, G. 2007. Molecular code for transmembrane-helix
recognition by the Sec61 translocon. Nature 450: 1026-1030.
Hessa, T., White, S.H., and von Heijne, G. 2005b. Membrane insertion of a potassium-channel
voltage sensor. Science (New York, N.Y.) 307: 1427.
Hicks, M.R., Damianoglou, A., Rodger, A., and Dafforn, T.R. 2008. Folding and membrane
insertion of the pore-forming peptide gramicidin occur as a concerted process. Journal of
molecular biology 383: 358-366.
Hildebrand, P.W., Preissner, R., and Frommel, C. 2004. Structural features of transmembrane
helices. FEBS letters 559: 145-151.
Hiroaki, Y., Tani, K., Kamegawa, A., Gyobu, N., Nishikawa, K., Suzuki, H., Walz, T., Sasaki,
S., Mitsuoka, K., Kimura, K., et al. 2006. Implications of the aquaporin-4 structure on
array formation and cell adhesion. Journal of molecular biology 355: 628-639.
Hirokawa, T., Boon-Chieng, S., and Mitaku, S. 1998. SOSUI: classification and secondary
structure prediction system for membrane proteins. Bioinformatics (Oxford, England) 14:
378-379.
Hofmann, K., and Stoffel, W. 1993. TMbase - A database of membrane spanning protein
segments. Biol. Chem. Hoppe-Seyler 374: 166.
http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html.
Hubbard, S.J., and Thornton, J.M. 1993. "NACCESS", 2.1.1 ed, London.
Hunter, H.N., Demcoe, A.R., Jenssen, H., Gutteberg, T.J., and Vogel, H.J. 2005. Human
lactoferricin is partially folded in aqueous solution and is better stabilized in a membrane
mimetic solvent. Antimicrobial agents and chemotherapy 49: 3387-3395.
Imamura, T. 2006. Protein-Surfactant Interactions. In Encyclopedia of Surface and Colloid
Science, 2nd ed. (ed. P. Somasundaran), pp. 5251-5263. Taylor & Francis, New
York.
Insel, P.A., Tang, C.M., Hahntow, I., and Michel, M.C. 2007. Impact of GPCRs in clinical
medicine: monogenic diseases, genetic variants and drug targets. Biochimica et
biophysica acta 1768: 994-1005.
Jiang, Y., Lee, A., Chen, J., Ruta, V., Cadene, M., Chait, B.T., and MacKinnon, R. 2003. X-ray
structure of a voltage-dependent K+ channel. Nature 423: 33-41.
Johnson, R.M., Hecht, K., and Deber, C.M. 2007. Aromatic and cation-pi interactions enhance
helix-helix association in a membrane environment. Biochemistry 46: 9208-9214.
Johnson, R.M., Heslop, C.L., and Deber, C.M. 2004. Hydrophobic helical hairpins: design and
packing interactions in membrane environments. Biochemistry 43: 14361-14369.
Johnson, R.M., Rath, A., Melnyk, R.A., and Deber, C.M. 2006. Lipid solvation effects contribute
to the affinity of Gly-xxx-Gly motif-mediated helix-helix interactions. Biochemistry 45:
8507-8515.
Jones, D.T. 2007. Improving the accuracy of transmembrane protein topology prediction using
evolutionary information. Bioinformatics (Oxford, England) 23: 538-544.
Junne, T., Kocik, L., and Spiess, M. 2010. The hydrophobic core of the Sec61 translocon defines
the hydrophobicity threshold for membrane integration. Molecular biology of the cell 21:
1662-1670.
Juretic, D., Zoranic, L., and Zucic, D. 2002. Basic charge clusters and predictions of membrane
protein topology. Journal of chemical information and computer sciences 42: 620-632.
Kartner, N., Augustinas, O., Jensen, T.J., Naismith, A.L., and Riordan, J.R. 1992.
Mislocalization of delta F508 CFTR in cystic fibrosis sweat gland. Nature genetics 1:
321-327.
Kerem, B., Rommens, J.M., Buchanan, J.A., Markiewicz, D., Cox, T.K., Chakravarti, A.,
Buchwald, M., and Tsui, L.C. 1989. Identification of the cystic fibrosis gene: genetic
analysis. Science (New York, N.Y.) 245: 1073-1080.
Kern, R., Malki, A., Holmgren, A., and Richarme, G. 2003. Chaperone properties of Escherichia
coli thioredoxin and thioredoxin reductase. The Biochemical journal 371: 965-972.
Khademi, S., O'Connell, J., 3rd, Remis, J., Robles-Colmenares, Y., Miercke, L.J., and Stroud,
R.M. 2004. Mechanism of ammonia transport by Amt/MEP/Rh: structure of AmtB at
1.35 A. Science (New York, N.Y.) 305: 1587-1594.
Killian, J.A., and Nyholm, T.K. 2006. Peptides in lipid bilayers: the power of simple models.
Current opinion in structural biology 16: 473-479.
Killian, J.A., and von Heijne, G. 2000. How proteins adapt to a membrane-water interface.
Trends in biochemical sciences 25: 429-434.
Koebnik, R., Locher, K.P., and Van Gelder, P. 2000. Structure and function of bacterial outer
membrane proteins: barrels in a nutshell. Molecular microbiology 37: 239-253.
Kolmar, H., Hennecke, F., Gotze, K., Janzer, B., Vogt, B., Mayer, F., and Fritz, H.J. 1995.
Membrane insertion of the bacterial signal transduction protein ToxR and requirements
of transcription activation studied by modular replacement of different protein
substructures. The EMBO journal 14: 3895-3904.
Kreda, S.M., Mall, M., Mengos, A., Rochelle, L., Yankaskas, J., Riordan, J.R., and Boucher,
R.C. 2005. Characterization of wild-type and deltaF508 cystic fibrosis transmembrane
regulator in human respiratory epithelia. Molecular biology of the cell 16: 2154-2167.
Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E.L. 2001. Predicting transmembrane
protein topology with a hidden Markov model: application to complete genomes. Journal
of molecular biology 305: 567-580.
Kunji, E.R., Slotboom, D.J., and Poolman, B. 2003. Lactococcus lactis as host for
overproduction of functional membrane proteins. Biochimica et biophysica acta 1610:
97-108.
Kyte, J., and Doolittle, R.F. 1982. A simple method for displaying the hydropathic character of a
protein. Journal of molecular biology 157: 105-132.
Laage, R., and Langosch, D. 2001. Strategies for prokaryotic expression of eukaryotic membrane
proteins. Traffic (Copenhagen, Denmark) 2: 99-104.
Landolt-Marticorena, C., Williams, K.A., Deber, C.M., and Reithmeier, R.A. 1993. Non-random
distribution of amino acids in the transmembrane segments of human type I single span
membrane proteins. Journal of molecular biology 229: 602-608.
Langosch, D., Brosig, B., Kolmar, H., and Fritz, H.J. 1996. Dimerisation of the glycophorin A
transmembrane segment in membranes probed with the ToxR transcription activator.
Journal of molecular biology 263: 525-530.
Lanyi, J., and Schobert, B. 2002. Crystallographic structure of the retinal and the protein after
deprotonation of the Schiff base: the switch in the bacteriorhodopsin photocycle. Journal
of molecular biology 321: 727-737.
Lau, W.C., Baker, L.A., and Rubinstein, J.L. 2008. Cryo-EM structure of the yeast ATP
synthase. Journal of molecular biology 382: 1256-1264.
le Maire, M., Champeil, P., and Moller, J.V. 2000. Interaction of membrane proteins and lipids
with solubilizing detergents. Biochimica et biophysica acta 1508: 86-111.
Lear, J.D., Gratkowski, H., and DeGrado, W.F. 2001. De novo design, synthesis and
characterization of membrane-active peptides. Biochemical Society transactions 29: 559-
564.
Lear, J.D., Stouffer, A.L., Gratkowski, H., Nanda, V., and DeGrado, W.F. 2004. Association of a
model transmembrane peptide containing gly in a heptad sequence motif. Biophysical
journal 87: 3421-3429.
Lecomte, J.T., Vuletich, D.A., and Lesk, A.M. 2005. Structural divergence and distant
relationships in proteins: evolution of the globins. Current opinion in structural biology
15: 290-301.
Lemmon, M.A., Flanagan, J.M., Hunt, J.F., Adair, B.D., Bormann, B.J., Dempsey, C.E., and
Engelman, D.M. 1992a. Glycophorin A dimerization is driven by specific interactions
between transmembrane alpha-helices. The Journal of biological chemistry 267: 7683-
7689.
Lemmon, M.A., Flanagan, J.M., Treutlein, H.R., Zhang, J., and Engelman, D.M. 1992b.
Sequence specificity in the dimerization of transmembrane alpha-helices. Biochemistry
31: 12719-12725.
Lerch-Bader, M., Lundin, C., Kim, H., Nilsson, I., and von Heijne, G. 2008. Contribution of
positively charged flanking residues to the insertion of transmembrane helices into the
endoplasmic reticulum. Proceedings of the National Academy of Sciences of the United
States of America 105: 4127-4132.
Lewis, H.A., Buchanan, S.G., Burley, S.K., Conners, K., Dickey, M., Dorwart, M., Fowler, R.,
Gao, X., Guggino, W.B., Hendrickson, W.A., et al. 2004. Structure of nucleotide-binding
domain 1 of the cystic fibrosis transmembrane conductance regulator. The EMBO journal
23: 282-293.
Li, H., Cocco, M.J., Steitz, T.A., and Engelman, D.M. 2001. Conversion of phospholamban into
a soluble pentameric helical bundle. Biochemistry 40: 6636-6645.
Liu, L.P., and Deber, C.M. 1998a. Guidelines for membrane protein engineering derived from de
novo designed model peptides. Biopolymers 47: 41-62.
Liu, L.P., and Deber, C.M. 1998b. Uncoupling hydrophobicity and helicity in transmembrane
segments. Alpha-helical propensities of the amino acids in non-polar environments. The
Journal of biological chemistry 273: 23645-23648.
Liu, L.P., and Deber, C.M. 1999. Combining hydrophobicity and helicity: a novel approach to
membrane protein structure prediction. Bioorganic & medicinal chemistry 7: 1-7.
Liu, L.P., Li, S.C., Goto, N.K., and Deber, C.M. 1996. Threshold hydrophobicity dictates helical
conformations of peptides in membrane environments. Biopolymers 39: 465-470.
Liu, M., Tarsio, M., Charsky, C.M., and Kane, P.M. 2005. Structural and functional separation of
the N- and C-terminal domains of the yeast V-ATPase subunit H. The Journal of
biological chemistry 280: 36978-36985.
Liu, W., Crocker, E., Siminovitch, D.J., and Smith, S.O. 2003. Role of side-chain conformational
entropy in transmembrane helix dimerization of glycophorin A. Biophysical journal 84:
1263-1271.
Long, S.B., Tao, X., Campbell, E.B., and MacKinnon, R. 2007. Atomic structure of a voltage-
dependent K+ channel in a lipid membrane-like environment. Nature 450: 376-382.
Lovett, P.S. 1996. Translation attenuation regulation of chloramphenicol resistance in bacteria--a
review. Gene 179: 157-162.
Lu, J., and Deutsch, C. 2005. Secondary structure formation of a transmembrane segment in Kv
channels. Biochemistry 44: 8230-8243.
Luecke, H., Schobert, B., Richter, H.T., Cartailler, J.P., and Lanyi, J.K. 1999. Structure of
bacteriorhodopsin at 1.55 A resolution. Journal of molecular biology 291: 899-911.
Lundin, C., Kim, H., Nilsson, I., White, S.H., and von Heijne, G. 2008. Molecular code for
protein insertion in the endoplasmic reticulum membrane is similar for N(in)-C(out) and
N(out)-C(in) transmembrane helices. Proceedings of the National Academy of Sciences of
the United States of America 105: 15702-15707.
Lundstrom, K. 2004. Structural genomics on membrane proteins: mini review. Combinatorial
chemistry & high throughput screening 7: 431-439.
Lupo, D., Li, X.D., Durand, A., Tomizaki, T., Cherif-Zahar, B., Matassi, G., Merrick, M., and
Winkler, F.K. 2007. The 1.3-A resolution structure of Nitrosomonas europaea Rh50 and
mechanistic implications for NH3 transport by Rhesus family proteins. Proceedings of
the National Academy of Sciences of the United States of America 104: 19303-19308.
MacKenzie, K.R., Prestegard, J.H., and Engelman, D.M. 1996. Leucine side-chain rotamers in a
glycophorin A transmembrane peptide as revealed by three-bond carbon-carbon
couplings and 13C chemical shifts. Journal of biomolecular NMR 7: 256-260.
MacKenzie, K.R., Prestegard, J.H., and Engelman, D.M. 1997. A transmembrane helix dimer:
structure and implications. Science (New York, N.Y.) 276: 131-133.
MacKinnon, R. 2003. Potassium channels. FEBS letters 555: 62-65.
Marchesi, V.T., and Andrews, E.P. 1971. Glycoproteins: isolation from cell membranes with
lithium diiodosalicylate. Science (New York, N.Y.) 174: 1247-1248.
Marston, F.A. 1986. The purification of eukaryotic polypeptides synthesized in Escherichia coli.
The Biochemical journal 240: 1-12.
Martin, J., and Hartl, F.U. 1997. Chaperone-assisted protein folding. Current opinion in
structural biology 7: 41-52.
Martin, N.P., Leavitt, L.M., Sommers, C.M., and Dumont, M.E. 1999. Assembly of G protein-
coupled receptors from fragments: identification of functional receptors with
discontinuities in each of the loops connecting transmembrane segments. Biochemistry
38: 682-695.
Maslennikov, I., Klammt, C., Hwang, E., Kefala, G., Okamura, M., Esquivies, L., Mors, K.,
Glaubitz, C., Kwiatkowski, W., Jeon, Y.H., et al. 2010. Membrane domain structures of
three classes of histidine kinase receptors by cell-free expression and rapid NMR
analysis. Proceedings of the National Academy of Sciences of the United States of
America 107: 10902-10907.
Meier, T., Polzer, P., Diederichs, K., Welte, W., and Dimroth, P. 2005. Structure of the rotor ring
of F-Type Na+-ATPase from Ilyobacter tartaricus. Science (New York, N.Y.) 308: 659-662.
Meijer, A.B., Spruijt, R.B., Wolfs, C.J., and Hemminga, M.A. 2001. Membrane-anchoring
interactions of M13 major coat protein. Biochemistry 40: 8815-8820.
Melnyk, R.A., Kim, S., Curran, A.R., Engelman, D.M., Bowie, J.U., and Deber, C.M. 2004. The
affinity of GXXXG motifs in transmembrane helix-helix interactions is modulated by
long-range communication. The Journal of biological chemistry 279: 16591-16597.
Melnyk, R.A., Partridge, A.W., and Deber, C.M. 2001. Retention of native-like oligomerization
states in transmembrane segment peptides: application to the Escherichia coli aspartate
receptor. Biochemistry 40: 11106-11113.
Melnyk, R.A., Partridge, A.W., and Deber, C.M. 2002. Transmembrane domain mediated self-
assembly of major coat protein subunits from Ff bacteriophage. Journal of molecular
biology 315: 63-72.
Melnyk, R.A., Partridge, A.W., Yip, J., Wu, Y., Goto, N.K., and Deber, C.M. 2003. Polar
residue tagging of transmembrane peptides. Biopolymers 71: 675-685.
Midgett, C.R., and Madden, D.R. 2007. Breaking the bottleneck: eukaryotic membrane protein
expression for high-resolution structural studies. Journal of structural biology 160: 265-
274.
Mingarro, I., Nilsson, I., Whitley, P., and von Heijne, G. 2000. Different conformations of
nascent polypeptides during translocation across the ER membrane. BMC cell biology 1:
3.
Miroux, B., and Walker, J.E. 1996. Over-production of proteins in Escherichia coli: mutant hosts
that allow synthesis of some membrane proteins and globular proteins at high levels.
Journal of molecular biology 260: 289-298.
Monne, M., Nilsson, I., Elofsson, A., and von Heijne, G. 1999. Turns in transmembrane helices:
determination of the minimal length of a "helical hairpin" and derivation of a fine-grained
turn propensity scale. Journal of molecular biology 293: 807-814.
Morais-Cabral, J.H., Zhou, Y., and MacKinnon, R. 2001. Energetic optimization of ion
conduction rate by the K+ selectivity filter. Nature 414: 37-42.
Morris, K.N., and Wool, I.G. 1994. Analysis of the contribution of an amphiphilic alpha-helix to
the structure and to the function of ricin A chain. Proceedings of the National Academy of
Sciences of the United States of America 91: 7530-7533.
Mujacic, M., Cooper, K.W., and Baneyx, F. 1999. Cold-inducible cloning vectors for low-
temperature protein expression in Escherichia coli: application to the production of a
toxic and proteolytically sensitive fusion protein. Gene 238: 325-332.
Mulkidjanian, A.Y., Galperin, M.Y., and Koonin, E.V. 2009. Co-evolution of primordial
membranes and membrane proteins. Trends in biochemical sciences 34: 206-215.
Mulvihill, C.M., and Deber, C.M. 2010. Evidence that the translocon may function as a
hydropathy partitioning filter. Biochimica et biophysica acta 1798: 1995-1998.
Murata, K., Mitsuoka, K., Hirai, T., Walz, T., Agre, P., Heymann, J.B., Engel, A., and Fujiyoshi,
Y. 2000. Structural determinants of water permeation through aquaporin-1. Nature 407:
599-605.
Netz, D.J., Bastos Mdo, C., and Sahl, H.G. 2002. Mode of action of the antimicrobial peptide
aureocin A53 from Staphylococcus aureus. Applied and environmental microbiology 68:
5274-5280.
Ng, D.P., and Deber, C.M. 2010. Deletion of a terminal residue disrupts oligomerization of a
transmembrane alpha-helix. Biochemistry and cell biology = Biochimie et biologie
cellulaire 88: 339-345.
Nilsson, I., Johnson, A.E., and von Heijne, G. 2003. How hydrophobic is alanine? The Journal of
biological chemistry 278: 29389-29393.
Nishimura, C., Prytulla, S., Jane Dyson, H., and Wright, P.E. 2000. Conservation of folding
pathways in evolutionarily distant globin sequences. Nature structural biology 7: 679-
686.
Nogi, T., Fathir, I., Kobayashi, M., Nozawa, T., and Miki, K. 2000. Crystal structures of
photosynthetic reaction center and high-potential iron-sulfur protein from
Thermochromatium tepidum: thermostability and electron transfer. Proceedings of the
National Academy of Sciences of the United States of America 97: 13561-13566.
Norholm, M.H. 2010. A mutant Pfu DNA polymerase designed for advanced uracil-excision
DNA engineering. BMC biotechnology 10: 21.
Norholm, M.H., Cunningham, F., Deber, C.M., and von Heijne, G. 2011. Converting a
marginally hydrophobic soluble protein into a membrane protein. Journal of molecular
biology 407: 171-179.
Nour-Eldin, H.H., Hansen, B.G., Norholm, M.H., Jensen, J.K., and Halkier, B.A. 2006.
Advancing uracil-excision based cloning towards an ideal technique for cloning PCR
fragments. Nucleic acids research 34: e122.
Okada, T., Sugihara, M., Bondar, A.N., Elstner, M., Entel, P., and Buss, V. 2004. The retinal
conformation and its environment in rhodopsin in light of a new 2.2 A crystal structure.
Journal of molecular biology 342: 571-583.
Osborne, A.R., Rapoport, T.A., and van den Berg, B. 2005. Protein translocation by the
Sec61/SecY channel. Annual review of cell and developmental biology 21: 529-550.
Ottemann, K.M., and Mekalanos, J.J. 1995. Analysis of Vibrio cholerae ToxR function by
construction of novel fusion proteins. Molecular microbiology 15: 719-731.
Oxenoid, K., and Chou, J.J. 2005. The structure of phospholamban pentamer reveals a channel-
like architecture in membranes. Proceedings of the National Academy of Sciences of the
United States of America 102: 10870-10875.
Paivio, A., Nordling, E., Kallberg, Y., Thyberg, J., and Johansson, J. 2004. Stabilization of
discordant helices in amyloid fibril-forming proteins. Protein Sci 13: 1251-1259.
Partridge, A.W., Melnyk, R.A., and Deber, C.M. 2002a. Polar residues in membrane domains of
proteins: molecular basis for helix-helix association in a mutant CFTR transmembrane
segment. Biochemistry 41: 3647-3653.
Partridge, A.W., Therien, A.G., and Deber, C.M. 2002b. Polar mutations in membrane proteins
as a biophysical basis for disease. Biopolymers 66: 350-358.
Pedersen, B.P., Buch-Pedersen, M.J., Morth, J.P., Palmgren, M.G., and Nissen, P. 2007. Crystal
structure of the plasma membrane proton pump. Nature 450: 1111-1114.
Peng, S., Liu, L.P., Emili, A.Q., and Deber, C.M. 1998. Cystic fibrosis transmembrane
conductance regulator: expression and helicity of a double membrane-spanning segment.
FEBS letters 431: 29-33.
Pitonzo, D., and Skach, W.R. 2006. Molecular mechanisms of aquaporin biogenesis by the
endoplasmic reticulum Sec61 translocon. Biochimica et biophysica acta 1758: 976-988.
Pitonzo, D., Yang, Z., Matsumura, Y., Johnson, A.E., and Skach, W.R. 2009. Sequence-specific
retention and regulated integration of a nascent membrane protein by the endoplasmic
reticulum Sec61 translocon. Molecular biology of the cell 20: 685-698.
Plotkowski, M.L., Kim, S., Phillips, M.L., Partridge, A.W., Deber, C.M., and Bowie, J.U. 2007.
Transmembrane domain of myelin protein zero can form dimers: possible implications
for myelin construction. Biochemistry 46: 12164-12173.
Pohorille, A., Schweighofer, K., and Wilson, M.A. 2005. The origin and early evolution of
membrane channels. Astrobiology 5: 1-17.
Popot, J.L., and Engelman, D.M. 1990. Membrane protein folding and oligomerization: the two-
stage model. Biochemistry 29: 4031-4037.
Popot, J.L., and Engelman, D.M. 2000. Helical membrane protein folding, stability, and
evolution. Annual review of biochemistry 69: 881-922.
Poulsen, B.E., Rath, A., and Deber, C.M. 2009. The assembly motif of a bacterial small
multidrug resistance protein. The Journal of biological chemistry 284: 9870-9875.
Powl, A.M., Carney, J., Marius, P., East, J.M., and Lee, A.G. 2005. Lipid interactions with
bacterial channels: fluorescence studies. Biochemical Society transactions 33: 905-909.
Prive, G.G. 2007. Detergents for the stabilization and crystallization of membrane proteins.
Methods (San Diego, Calif.) 41: 388-397.
Quick, M., and Wright, E.M. 2002. Employing Escherichia coli to functionally express, purify,
and characterize a human transporter. Proceedings of the National Academy of Sciences
of the United States of America 99: 8597-8601.
Ramjeesingh, M., Ugwu, F., Li, C., Dhani, S., Huan, L.J., Wang, Y., and Bear, C.E. 2003. Stable
dimeric assembly of the second membrane-spanning domain of CFTR (cystic fibrosis
transmembrane conductance regulator) reconstitutes a chloride-selective pore. The
Biochemical journal 375: 633-641.
Rasmussen, S.G., Choi, H.J., Rosenbaum, D.M., Kobilka, T.S., Thian, F.S., Edwards, P.C.,
Burghammer, M., Ratnala, V.R., Sanishvili, R., Fischetti, R.F., et al. 2007. Crystal
structure of the human beta2 adrenergic G-protein-coupled receptor. Nature 450: 383-
387.
Rastogi, V.K., and Girvin, M.E. 1999. Structural changes linked to proton translocation by
subunit c of the ATP synthase. Nature 402: 263-268.
Rath, A., and Deber, C.M. 2007. Membrane protein assembly patterns reflect selection for non-
proliferative structures. FEBS letters 581: 1335-1341.
Rath, A., and Deber, C.M. 2008. Surface recognition elements of membrane protein
oligomerization. Proteins 70: 786-793.
Rath, A., Glibowicka, M., Nadeau, V.G., Chen, G., and Deber, C.M. 2009a. Detergent binding
explains anomalous SDS-PAGE migration of membrane proteins. Proceedings of the
National Academy of Sciences of the United States of America 106: 1760-1765.
Rath, A., Melnyk, R.A., and Deber, C.M. 2006. Evidence for assembly of small multidrug
resistance proteins by a "two-faced" transmembrane helix. The Journal of biological
chemistry 281: 15546-15553.
Rath, A., Tulumello, D.V., and Deber, C.M. 2009b. Peptide models of membrane protein
folding. Biochemistry 48: 3036-3045.
Ravichandran, K.G., Boddupalli, S.S., Hasermann, C.A., Peterson, J.A., and Deisenhofer, J.
1993. Crystal structure of hemoprotein domain of P450BM-3, a prototype for microsomal
P450's. Science (New York, N.Y.) 261: 731-736.
Ready, M.P., Kim, Y., and Robertus, J.D. 1991. Site-directed mutagenesis of ricin A-chain and
implications for the mechanism of action. Proteins 10: 270-278.
Ridge, K.D., Lee, S.S., and Yao, L.L. 1995. In vivo assembly of rhodopsin from expressed
polypeptide fragments. Proceedings of the National Academy of Sciences of the United
States of America 92: 3204-3208.
Riek, R.P., Rigoutsos, I., Novotny, J., and Graham, R.M. 2001. Non-alpha-helical elements
modulate polytopic membrane protein architecture. Journal of molecular biology 306:
349-362.
Riggs, P. 2001. Expression and purification of maltose-binding protein fusions. Current
protocols in molecular biology / edited by Frederick M. Ausubel ... [et al.] Chapter 16:
Unit16 16.
Riordan, J.R. 2005. Assembly of functional CFTR chloride channels. Annual review of
physiology 67: 701-718.
Riordan, J.R. 2008. CFTR function and prospects for therapy. Annual review of biochemistry 77:
701-726.
Riordan, J.R., Rommens, J.M., Kerem, B., Alon, N., Rozmahel, R., Grzelczak, Z., Zielenski, J.,
Lok, S., Plavsic, N., Chou, J.L., et al. 1989. Identification of the cystic fibrosis gene:
cloning and characterization of complementary DNA. Science (New York, N.Y.) 245: 1066-1073.
Rommens, J.M., Iannuzzi, M.C., Kerem, B., Drumm, M.L., Melmer, G., Dean, M., Rozmahel,
R., Cole, J.L., Kennedy, D., Hidaka, N., et al. 1989. Identification of the cystic fibrosis
gene: chromosome walking and jumping. Science (New York, N.Y.) 245: 1059-1065.
Rosenbusch, J.P., Lustig, A., Grabo, M., Zulauf, M., and Regenass, M. 2001. Approaches to
determining membrane protein structures to high resolution: do selections of
subpopulations occur? Micron 32: 75-90.
Rost, B., Fariselli, P., and Casadio, R. 1996. Topology prediction for helical transmembrane
proteins at 86% accuracy. Protein Sci 5: 1704-1718.
Roth, L., Nasarre, C., Dirrig-Grosch, S., Aunis, D., Cremel, G., Hubert, P., and Bagnard, D.
2008. Transmembrane domain interactions control biological functions of neuropilin-1.
Molecular biology of the cell 19: 646-654.
Russ, W.P., and Engelman, D.M. 1999. TOXCAT: a measure of transmembrane helix
association in a biological membrane. Proceedings of the National Academy of Sciences
of the United States of America 96: 863-868.
Russ, W.P., and Engelman, D.M. 2000. The GxxxG motif: a framework for transmembrane
helix-helix association. Journal of molecular biology 296: 911-919.
Sadlish, H., Pitonzo, D., Johnson, A.E., and Skach, W.R. 2005. Sequential triage of
transmembrane segments by Sec61alpha during biogenesis of a native multispanning
membrane protein. Nature structural & molecular biology 12: 870-878.
Sadlish, H., and Skach, W.R. 2004. Biogenesis of CFTR and other polytopic membrane proteins:
new roles for the ribosome-translocon complex. The Journal of membrane biology 202:
115-126.
Sakata, S., Kurokawa, T., Norholm, M.H., Takagi, M., Okochi, Y., von Heijne, G., and
Okamura, Y. 2010. Functionality of the voltage-gated proton channel truncated in S4.
Proceedings of the National Academy of Sciences of the United States of America 107:
2313-2318.
Sato, Y., Sakaguchi, M., Goshima, S., Nakamura, T., and Uozumi, N. 2002. Integration of
Shaker-type K+ channel, KAT1, into the endoplasmic reticulum membrane: synergistic
insertion of voltage-sensing segments, S3-S4, and independent insertion of pore-forming
segments, S5-P-S6. Proceedings of the National Academy of Sciences of the United
States of America 99: 60-65.
Sato, Y., Sakaguchi, M., Goshima, S., Nakamura, T., and Uozumi, N. 2003. Molecular dissection
of the contribution of negatively and positively charged residues in S2, S3, and S4 to the
final membrane topology of the voltage sensor in the K+ channel, KAT1. The Journal of
biological chemistry 278: 13227-13234.
Schneider, D., and Engelman, D.M. 2003. GALLEX, a measurement of heterologous association
of transmembrane helices in a biological membrane. The Journal of biological chemistry
278: 3105-3111.
Schneider, D., and Engelman, D.M. 2004. Motifs of two small residues can assist but are not
sufficient to mediate transmembrane helix interactions. Journal of molecular biology
343: 799-804.
Schwiebert, E.M., Morales, M.M., Devidas, S., Egan, M.E., and Guggino, W.B. 1998. Chloride
channel and chloride conductance regulator domains of CFTR, the cystic fibrosis
transmembrane conductance regulator. Proceedings of the National Academy of Sciences
of the United States of America 95: 2674-2679.
Senes, A., Gerstein, M., and Engelman, D.M. 2000. Statistical analysis of amino acid patterns in
transmembrane helices: the GxxxG motif occurs frequently and in association with beta-
branched residues at neighboring positions. Journal of molecular biology 296: 921-936.
Serohijos, A.W., Hegedus, T., Aleksandrov, A.A., He, L., Cui, L., Dokholyan, N.V., and
Riordan, J.R. 2008. Phenylalanine-508 mediates a cytoplasmic-membrane domain
contact in the CFTR 3D structure crucial to assembly and channel function. Proceedings
of the National Academy of Sciences of the United States of America 105: 3256-3261.
Shen, H., and Chou, J.J. 2008. MemBrain: improving the accuracy of predicting transmembrane
helices. PloS one 3: e2399.
Shrake, A., and Rupley, J.A. 1973. Environment and exposure to solvent of protein atoms.
Lysozyme and insulin. Journal of molecular biology 79: 351-371.
Skach, W.R., and Lingappa, V.R. 1993. Amino-terminal assembly of human P-glycoprotein at
the endoplasmic reticulum is directed by cooperative actions of two internal sequences.
The Journal of biological chemistry 268: 23552-23561.
Sonnhammer, E.L., von Heijne, G., and Krogh, A. 1998. A hidden Markov model for predicting
transmembrane helices in protein sequences. Proceedings / ... International Conference
on Intelligent Systems for Molecular Biology ; ISMB 6: 175-182.
Sorensen, H.P., and Mortensen, K.K. 2005a. Advanced genetic strategies for recombinant
protein expression in Escherichia coli. Journal of biotechnology 115: 113-128.
Sorensen, H.P., and Mortensen, K.K. 2005b. Soluble expression of recombinant proteins in the
cytoplasm of Escherichia coli. Microbial cell factories 4: 1.
Standfuss, J., Xie, G., Edwards, P.C., Burghammer, M., Oprian, D.D., and Schertler, G.F. 2007.
Crystal structure of a thermally stable rhodopsin mutant. Journal of molecular biology
372: 1179-1188.
Studier, F.W., and Moffatt, B.A. 1986. Use of bacteriophage T7 RNA polymerase to direct
selective high-level expression of cloned genes. Journal of molecular biology 189: 113-
130.
Sulistijo, E.S., Jaszewski, T.M., and MacKenzie, K.R. 2003. Sequence-specific dimerization of
the transmembrane domain of the "BH3-only" protein BNIP3 in membranes and
detergent. The Journal of biological chemistry 278: 51950-51956.
Sulistijo, E.S., and MacKenzie, K.R. 2006. Sequence dependence of BNIP3 transmembrane
domain dimerization implicates side-chain hydrogen bonding and a tandem GxxxG motif
in specific helix-helix interactions. Journal of molecular biology 364: 974-990.
Swartz, K.J. 2008. Sensing voltage across lipid membranes. Nature 456: 891-897.
Tamm, L.K., Hong, H., and Liang, B. 2004. Folding and assembly of beta-barrel membrane
proteins. Biochimica et biophysica acta 1666: 250-263.
Tate, C.G. 2001. Overexpression of mammalian integral membrane proteins for structural
studies. FEBS letters 504: 94-98.
Therien, A.G., and Deber, C.M. 2002. Oligomerization of a peptide derived from the
transmembrane region of the sodium pump gamma subunit: effect of the pathological
mutation G41R. Journal of molecular biology 322: 583-550.
Therien, A.G., Glibowicka, M., and Deber, C.M. 2002. Expression and purification of two
hydrophobic double-spanning membrane proteins derived from the cystic fibrosis
transmembrane conductance regulator. Protein expression and purification 25: 81-86.
Therien, A.G., Grant, F.E., and Deber, C.M. 2001. Interhelical hydrogen bonds in the CFTR
membrane domain. Nature structural biology 8: 597-601.
Tulumello, D.V., and Deber, C.M. 2009. SDS micelles as a membrane-mimetic environment for
transmembrane segments. Biochemistry 48: 12096-12103.
Tusnady, G.E., and Simon, I. 1998. Principles governing amino acid composition of integral
membrane proteins: application to topology prediction. Journal of molecular biology
283: 489-506.
Tusnady, G.E., and Simon, I. 2001. The HMMTOP transmembrane topology prediction server.
Bioinformatics (Oxford, England) 17: 849-850.
Ulmschneider, M.B., and Sansom, M.S. 2001. Amino acid distributions in integral membrane
protein structures. Biochimica et biophysica acta 1512: 1-14.
Ulmschneider, M.B., Sansom, M.S., and Di Nola, A. 2005. Properties of integral membrane
protein structures: derivation of an implicit membrane potential. Proteins 59: 252-265.
Viklund, H., Granseth, E., and Elofsson, A. 2006. Structural classification and prediction of
reentrant regions in alpha-helical transmembrane proteins: application to complete
genomes. Journal of molecular biology 361: 591-603.
von Heijne, G. 1989. Control of topology and mode of assembly of a polytopic membrane
protein by positively charged residues. Nature 341: 456-458.
von Heijne, G. 1992. Membrane protein structure prediction. Hydrophobicity analysis and the
positive-inside rule. Journal of molecular biology 225: 487-494.
Wagner, K., Greil, I., Schneditz, P., and Rosenkranz, W. 1994. A new missense mutation G126D
in exon 4 of the cystic fibrosis transmembrane conductance regulator (CFTR) gene.
Human heredity 44: 56-57.
Wales, R., Chaddock, J.A., Roberts, L.M., and Lord, J.M. 1992. Addition of an ER retention
signal to the ricin A chain increases the cytotoxicity of the holotoxin. Experimental cell
research 203: 1-4.
Wales, R., Roberts, L.M., and Lord, J.M. 1993. Addition of an endoplasmic reticulum retrieval
sequence to ricin A chain significantly increases its cytotoxicity to mammalian cells. The
Journal of biological chemistry 268: 23986-23990.
Wang, C., and Deber, C.M. 2000. Peptide mimics of the M13 coat protein transmembrane
segment. Retention of helix-helix interaction motifs. The Journal of biological chemistry
275: 16155-16159.
Wang, C., Liu, L.P., and Deber, C.M. 2000. Delta regions in proteins: helices mispredicted as
transmembrane segments by the threshold hydrophobicity requirement. In Proceedings
of the 16th American Peptide Symposium (eds. G.B. Fields, J.P. Tam, and G.
Barany), pp. 367-369. Springer, Minneapolis, Minnesota, USA.
Ward, A., Reyes, C.L., Yu, J., Roth, C.B., and Chang, G. 2007. Flexibility in the ABC
transporter MsbA: Alternating access with a twist. Proceedings of the National Academy
of Sciences of the United States of America 104: 19005-19010.
Wehbi, H., Gasmi-Seabrook, G., Choi, M.Y., and Deber, C.M. 2008. Positional dependence of
non-native polar mutations on folding of CFTR helical hairpins. Biochimica et biophysica
acta 1778: 79-87.
Wehbi, H., Rath, A., Glibowicka, M., and Deber, C.M. 2007. Role of the extracellular loop in the
folding of a CFTR transmembrane helical hairpin. Biochemistry 46: 7099-7106.
White, S.H. 2009. Biophysical dissection of membrane proteins. Nature 459: 344-346.
White, S.H., and von Heijne, G. 2008. How translocons select transmembrane helices. Annual
review of biophysics 37: 23-42.
White, S.H., and Wimley, W.C. 1999. Membrane protein folding and stability: physical
principles. Annual review of biophysics and biomolecular structure 28: 319-365.
Wigley, W.C., Vijayakumar, S., Jones, J.D., Slaughter, C., and Thomas, P.J. 1998.
Transmembrane domain of cystic fibrosis transmembrane conductance regulator: design,
characterization, and secondary structure of synthetic peptides m1-m6. Biochemistry 37:
844-853.
Williamson, R.C., and Toye, A.M. 2008. Glycophorin A: Band 3 aid. Blood cells, molecules &
diseases 41: 35-43.
Wimley, W.C. 2003. The versatile beta-barrel membrane protein. Current opinion in structural
biology 13: 404-411.
Woolhead, C.A., McCormick, P.J., and Johnson, A.E. 2004. Nascent membrane and secretory
proteins differ in FRET-detected folding far inside the ribosome and in their exposure to
ribosomal proteins. Cell 116: 725-736.
Wu, J.V., Krouse, M.E., and Wine, J.J. 2007. Acinar origin of CFTR-dependent airway
submucosal gland fluid secretion. American journal of physiology 292: L304-311.
Yang, Q.H., Wu, C.L., Lin, K., and Li, L. 1997. Low concentration of inducer favors production
of active form of 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase in Escherichia
coli. Protein expression and purification 10: 320-324.
Yildirim, M.A., Goh, K.I., Cusick, M.E., Barabasi, A.L., and Vidal, M. 2007. Drug-target
network. Nature biotechnology 25: 1119-1126.
Young, M.T., Beckmann, R., Toye, A.M., and Tanner, M.J. 2000. Red-cell glycophorin A-band
3 interactions associated with the movement of band 3 to the cell surface. The
Biochemical journal 350 Pt 1: 53-60.
Yuen, C.T., Davidson, A.R., and Deber, C.M. 2000. Role of aromatic residues at the lipid-water
interface in micelle-bound bacteriophage M13 major coat protein. Biochemistry 39:
16155-16162.
Zhang, L., Button, B., Gabriel, S.E., Burkett, S., Yan, Y., Skiadopoulos, M.H., Dang, Y.L.,
Vogel, L.N., McKay, T., Mengos, A., et al. 2009. CFTR delivery to 25% of surface
epithelial cells restores normal rates of mucus transport to human cystic fibrosis airway
epithelium. PLoS biology 7: e1000155.
Zhang, L., Sato, Y., Hessa, T., von Heijne, G., Lee, J.K., Kodama, I., Sakaguchi, M., and
Uozumi, N. 2007. Contribution of hydrophobic and electrostatic interactions to the
membrane integration of the Shaker K+ channel voltage sensor domain. Proceedings of
the National Academy of Sciences of the United States of America 104: 8263-8268.
Zhao, G., and London, E. 2006. An amino acid "transmembrane tendency" scale that approaches
the theoretical limit to accuracy for prediction of transmembrane helices: relationship to
biological hydrophobicity. Protein Sci 15: 1987-2001.
Zhou, F.X., Cocco, M.J., Russ, W.P., Brunger, A.T., and Engelman, D.M. 2000. Interhelical
hydrogen bonding drives strong interactions in membrane proteins. Nature structural
biology 7: 154-160.
Zhou, F.X., Merianos, H.J., Brunger, A.T., and Engelman, D.M. 2001. Polar residues drive
association of polyleucine transmembrane helices. Proceedings of the National Academy
of Sciences of the United States of America 98: 2250-2255.
Zimmer, J., Nam, Y., and Rapoport, T.A. 2008. Structure of a complex of the ATPase SecA and
the protein-translocation channel. Nature 455: 936-943.
Appendices
Appendix 1: Additional Data tables.
Table A1.1. Database of globular helix sequences (n = 122).
PDBID Residues a Length Sequence Hb
1AEP 128-154 27 APVQSALQEAAEKTKEAAANLQNSIQS -1.1
1AEP 36-63 28 EALNLLTEQANAFKTKIAEVTTSLKQEA -0.32
1AEP 7-30 24 IAEAVQQLNHTIVNAAHELHETLG -0.3
1AJA 329-353 25 PCGQIGETVDLDEAVQRALEFAKKE -0.55
1AL7 300-332 33 RPVVFSLAAEGEAGVKKVLQMMRDEFELTMALS 0.2
1ALD 317-337 21 KENLKAAQEEYVKRALANSLA -0.73
1APA 203-221 19 AKVLNLEESWGKISTAIHN -0.46
1ATN 507-528 22 PSDAVAEINSLYDVYLDVQQKW -0.08
1BBH 107-128 22 AEAVKTAFGDVGAACKSCHEKY -0.76
1BBH 81-102 22 MEDVGKIAREFVGAANTLAEVA 0.05
1BBH 5-31 27 PEEQIETRQAGYEFMGWNMGKIKANLE -0.63
1BGD 137-164 28 AFQRRAGGVLVASNLQSFLELAYRALRH 0.14
1BGD 101-124 24 APTLDTLQLDTTDFAINIWQQMED 0.16
1BGD 12-40 29 QSFLLKCLEQMRKVQADGTALQETLCATH -0.1
1BTC 257-284 28 EKGKFFLTWYSNKLLNHGDQILDEANKA -0.6
1CGM 111-133 23 VKRTDDASTAARAEIDNLIESIS -0.6
1CGP 110-136 27 PDILMRLSAQMARRLQVTSEKVGNLAF -0.01
1CHM 190-208 19 PEYEVALHATQAMVRAIAD -0.02
1CHM 160-183 24 SAEEHVMIRHGARIADIGGAAVVE -0.34
1CHM 271-290 20 SDDHLRLWQVNVEVHEAGLK -0.47
1CHR 98-116 19 ASAKAAVEMALLDLKARAL 0.36
1CPC 78-101 24 QTGKDKCVRDIGYYLRMVTYCLVV 0.34
1CPC 21-46 26 STEIQTAFGRFRQASASLAAAKALTE -0.35
1CPT 192-214 23 AARRFHETIATFYDYFNGFTVDR 0.09
1CPT 254-285 32 DKYINAYYVAIATAGHDTTSSSSGGAIIGLSR -0.62
1CPT 123-145 23 PASIRKLEENIRRIAQASVQRLL -0.35
1CSG 68-86 19 GSLTKLKGPLTMMASHYKQ -0.99
1DHR 96-121 26 LFKNCDLMWKQSIWTSTISSHLATKH -0.16
1FBP 29-49 21 EMTQLLNSLCTAVKAISTAVR 0.29
1GDH 293-311 19 TQAREDMAHQANDLIDALF -0.21
1GLA 524-549 26 ANHIIRATLESIAYQTRDVLEAMQAD -0.03
1GPB 528-553 26 EAFIRDVAKVKQENKLKFAAYLEREY -0.23
1GPB 614-632 19 HMAKMIIKLITAIGDVVNH 0.28
1GPB 48-77 30 PRDYYFALAHTVRDHLVGRWIRTQQHYYEK -0.47
1GPB 397-417 21 PRHLQIIYEINQRFLNRVAAA 0.04
1GRC 147-170 24 EDDITARVQTQEHAIYPLVISWFA 0.23
1HBG 124-145 22 AAAKDAWAAAYADISGALISGL 0.2
1HBG 100-121 22 AQYFEPLGASLLSAMEHRIGGK -0.42
1HBG 53-71 19 PGVAALGAKVLAQIGVAVS 0.07
1ITH 59-77 19 PAYKAQTLTVINYLDKVVD -0.07
1LAP 84-102 19 EGKENIRAAVAAGCRQIQD -0.89
1LAP 148-169 22 QEAWQRGVLFASGQNLARRLME -0.09
1LE2 87-123 37 EETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQ -0.53
1LE2 55-78 24 QVTQELRALMDETMKELKAYKSEL -0.25
1LH1 104-125 22 DAHFPVVKEAILKTIKEVVGAK -0.38
1LIS 13-41 29 KAFEVALKVQIIAGFDRGLVKWLRVHGRT 0.17
1LLA 177-198 22 KGELFYYMHQQMCARYDCERLS -0.01
1LMB 9-29 21 QEQLEDARRLKAIYEKKKNEL -1.06
1LTH 297-315 19 DKELAALKRSAETLKETAA -0.77
1LTS 197-222 26 DTCNEETQNLSTIYLREYQSKVKRQI -0.75
1LVL 48-66 19 CIPSKALIHVAEQFHQASR -0.53
1LVL 84-108 25 IGQSVAWKDGIVDRLTTGVAALLKK -0.12
1MAT 8-28 21 PEDIEKMRVAGRLAAEVLEMI 0.19
1MYT 58-76 19 AAISAHGATVLKKLGELLK -0.24
1MYT 122-146 25 AGGQTALRNVMGIIIADLEANYKEL 0.07
1P2P 90-108 19 ACEAFICNCDRNAAICFSK 0.38
1PBX 4-35 32 DKDKAAVRALWSKIGKSADAIGNDALSRMIVV -0.42
1PBX 120-140 21 PEAHVSLDKFLSGVALALAER -0.05
1RVE 37-58 22 TKVLSTIFELFSRPIINKIAEK 0.14
1SBP 14-32 19 RELYEQYNKAFSAHWKQET -0.85
1SCM 778-821 44 LSKIISMFQAHIRGYLIRKAYKKLQDQRIGLSVIQRNIRKWLVL 0.16
1THG 185-204 20 AGLHDQRKGLEWVSDNIANF -0.58
1TML 127-147 21 QHVQQEVLETMAYAGKALKAG -0.58
1TRB 295-313 19 AITSAGTGCMAALDAERYL 0.21
1VSG 23-55 33 DQPKGALFTLQAAASKIQKMRDAALRASIYAEI -0.34
1VSG 87-114 28 LSSQEVTATATASYLKGRIDEYLNLLLQ 0.1
1VSG 60-85 26 NRAKAAVIVANHYAMKADSGLEALKQ -0.64
1WRP 14-32 19 QRHQEWLRFVDLLKNAYQN -0.3
1XIS 296-322 27 FDGVWASAAGCMRNYLILKERAAAFRA 0.37
1XIS 109-128 20 RDVRRYALRKTIRNIDLAVE -0.18
1XIS 151-172 22 VRDALDRMKEAFDLLGEYVTSQ -0.01
1YPI 178-196 19 PEDAQDIHASIRKFLASKL -0.71
1ZEI 1-35 35 LQRMKQLEDKVEELLSKNYHLENEVARLKKLVGER -1.12
256B 24-42 19 AQVKDALTKMRAAALDAQK -0.66
256B 58-80 23 MKDFRHGFDILVGQIDDALKLAN -0.04
2ADA 126-144 19 PDDVVDLVNQGLQEGEQAF -0.54
2AK3 164-188 25 PETVVKRLKAYEAQTEPVLEYYRKK -0.86
2BPA 192-210 19 IMGLQAAYANLHTDQERDY -0.31
2CCY 40-58 19 AAQRAENMAMVAKLAPIGW 0.02
2CCY 104-125 22 PDALKAQAAATGKVCKACHEEF -0.84
2CCY 5-30 26 PEDLLKLRQGLMQTLKSQWVPIAGFA -0.03
2CCY 79-102 24 SAEFLEGWKALATESTKLAAAAKA -0.22
2CHB 59-78 20 DSQKKAIERMKDTLRIAYLT -0.55
2CMD 196-217 22 EQEVADLTKRIQNAGTEVVEAK -0.79
2CMD 87-108 22 RSDLFNVNAGIVKNLVQQVAKT -0.37
2CPK 140-159 20 EPHARFYAAQIVLTFEYLHS 0.24
2CPP 193-213 21 FAEAKEALYDYLIPIIEQRRQ 0.2
2CPP 121-145 25 MPVVDKLENRIQELACSLIESLRPQ -0.1
2GST 90-114 25 EEERIRADIVENQVMDNRMQLIMLC 0.35
2LIV 16-35 20 AQYGDQEFTGAEQAVADINA -0.63
2LZM 61-79 19 DEAEKLFNQDVDAAVRGIL -0.14
2LZM 94-113 19 RRCALINMVFQMGETGVAG 0.26
2MHR 41-64 24 APNLATLVKVTTNHFTHEEAMMDA -0.37
2MHR 19-37 19 EQLDEEHKKIFKGIFDCIR -0.39
2MIN 318-346 29 ESIQKKCEEVIAKYKPEWEAVVAKYRPRL -0.65
2PGD 178-206 29 AGHFVKMVHNGIEYGDMQLICEAYHLMKD -0.09
2PGD 392-415 24 DFFKSAVENCQDSWRRAISTGVQA -0.46
2PGD 315-348 34 KKSFLEDIRKALYASKIISYAQGFMLLRQAATEF 0.17
2REB 158-177 20 LAARMMSQAMRKLAGNLKQS -0.46
2SAS 48-69 22 DADYKSMQASLEDEWRDLKGRA -0.92
2WSY 254-272 19 QILMPALNQLEEAFVRAQK 0.14
3COX 403-424 22 QSQNQKGIDMAKKVFDKINQKE -1.64
3CPA 216-234 19 KTELNQVAKSAVEALKSLY -0.45
3ENL 107-125 19 ANAILGVSLAASRAAAAEK -0.2
3ENL 179-200 22 FAEALRIGSEVYHNLKSLTKKR -0.59
3ENL 403-422 20 SERLAKLNQLLRIEEELGDN -0.36
3GPD 315-333 19 NEFGYSERVVDLMAHMASK -0.32
3HSC 230-249 20 GEDFDNRMVNHFIAEFKRKH -0.89
3HSC 116-135 20 PEEVSSMVLTKMKEIAEAYL 0.05
3ICD 38-57 20 GVDVTPAMLKVVDAAVEKAY 0
3MDS 67-89 23 QDIQTAVRNNGGGHLNHSLFWRL -0.71
3SDH 64-82 19 DKLRGHSITLMYALQNFID 0.12
4GR1 96-120 25 WRVIKEKRDAYVSRLNAIYQNNLTK -0.47
4MDH 306-329 24 DFSREKMDLTAKELAEEKETAFEF -0.36
4TLN 159-179 21 NESGAINEAISDIFGTLVEFY 0.25
4TLN 65-88 24 SYDAPAVDAHYYAGVTYDYYKNVH -0.67
4TNC 75-105 31 FEEFLVMMVRQMKEDAKGKSEEELANCFRIF 0.25
5ABP 110-128 19 ATKIGERQGQELYKEMQKR -1.39
6TAA 357-375 19 ELYKLIASANAIRNYAISK -0.01
7AAT 271-289 19 AEEAKRVESQLKILIRPMY -0.19
7AAT 307-338 32 PELRKEWLVEVKGMADRIISMRTQLVSNLKKE -0.31
7API 23-44 22 FNKITPNLAEFAFSLYRQLAHQ 0.01
a Residues are numbered according to the PDB coordinate file.
b Mean residue hydropathy calculated with the Liu-Deber scale (Liu and Deber 1998a). See
Materials and Methods.
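The H column is described in the table footnote as a mean residue hydropathy, that is, the sum of per-residue hydropathy values divided by the segment length. A minimal Python sketch of that averaging step follows. The function name and the `TOY_SCALE` values are hypothetical and chosen only for illustration; they are not the Liu-Deber scale, whose values must be taken from Liu and Deber (1998a), so this sketch will not reproduce the tabulated H values as written.

```python
from typing import Mapping


def mean_residue_hydropathy(sequence: str, scale: Mapping[str, float]) -> float:
    """Return the average per-residue hydropathy of `sequence` under `scale`.

    `scale` maps one-letter amino acid codes to hydropathy values and is
    supplied by the caller (for example, values from a published scale).
    """
    # Ignore whitespace so that sequences broken by spaces are handled.
    residues = [aa for aa in sequence.upper() if not aa.isspace()]
    if not residues:
        raise ValueError("sequence is empty")
    missing = sorted({aa for aa in residues if aa not in scale})
    if missing:
        raise KeyError("scale has no value for: " + ", ".join(missing))
    return sum(scale[aa] for aa in residues) / len(residues)


# Hypothetical toy values for illustration only; NOT the Liu-Deber scale.
TOY_SCALE = {"A": 1.0, "L": 2.0, "K": -2.0, "E": -1.0}

toy_mean = mean_residue_hydropathy("ALKE", TOY_SCALE)
print(toy_mean)
```

Passing the scale as an argument keeps the averaging logic independent of any particular hydropathy scale, so the same function can be reused once the published values are entered.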
Table A1.2. Database of δ-helix sequences (n = 51).
PDBID Residues a Length Sequence Hb
1BGC 101-125 25 APTLDTLQLDVTDFATNIWLQMEDL 0.58
1BGC 72-92 21 LRGCLNQLHGGLFLYQGLLQA 0.49
1BGD 72-92 21 LMGCLRQLHSGLFLYQGLLQA 0.84
1BIA 235-254 20 RNTLAAMLIRELRAALELFE 0.97
1CF3 561-581 21 MTVFYAMALKISDAILEDYAS 0.91
1CPC 78-101 24 SRRMAACLRDMEIILRYVTYAIFA 1.06
1CSC 386-410 25 MNYYTVLFGVSRALGVLAQLIWSRA 0.61
1CSC 341-360 20 PMFKLVAQLYKIVPNVLLEQ 0.98
1CSC 163-193 31 RTKYWEMVYESAMDLIAKLPCVAAKIYRNLY 0.42
1DXI 296-320 25 FDGVWASAAGCMRNYLILKDRAAAF 0.47
1ECA 114-133 20 FAGAEAAWGATLDTFFGMIF 1.1
1EZM 72-91 20 PLNDAHFFGGVVFKLYRDWF 0.49
1FDH 101-121 21 ENFKLLGNVLVTVLAIHFGKE 0.46
1FHA 14-41 28 QDSEAAINRQINLELYASYVYLSMSYYF 0.43
1FIA 50-70 21 LYELVLAEVEQPLLDMVMQYT 1.19
1GLM 186-206 21 FFTIAVQHRALVEGSAFATAV 0.72
1GLM 318-338 21 FLCTLAAAEQLYDALYQWDKQ 0.67
1GPA 290-312 23 ELRLKQEYFVVAATLQDIIRRFK 0.48
1GUH 86-111 26 IKERALIDMYIEGIADLGEMILLLPV 1.11
1HCI 218-236 19 ELFFWVHHQLTARFDFERL 0.96
1HDS 99-117 19 PENFRLLGNVLVVVLARNF 0.77
1LTH 93-114 22 RLELVGATVNILKAIMPNLVKV 0.56
1MAT 120-139 20 IMGERLCRITQESLYLALRM 0.88
1MBA 126-144 19 ADAAWTKLFGLIIDALKAA 0.77
1MRR 186-205 20 LRELKKKLYLCLMSVNALEA 0.61
1OVA 26-44 19 IGAASMEFCFDVFKELKVH 0.53
1PFK 258-276 19 PYDRILASRMGAYAIDLLL 0.73
1PHH 299-317 19 LNLAASDVSTLYRLLLKAY 0.8
1PHH 328-350 23 YSAICLRRIWKAERFSWWMTSVL 1.07
1SRY 164-182 19 DLALYELALLRFAMDFMAR 1.62
1SRY 288-308 21 LEASDRAFQELLENAEEILRL 0.42
1TIS 184-202 19 LPFNIASYATLVHIVAKMC 0.81
2AAI 161-180 20 LPTLARSFIICIQMISEAAR 0.84
2ACE 168-186 19 VGLLDQRMALQWVHDNIQF 0.54
2ACH 25-44 20 GLASANVDFAFSLYKQLVLK 0.48
2ALD 160-178 19 ALAIMENANVLARYASICQ 0.65
2ATI 285-304 20 YFQQAGNGIFARQALLALVL 0.88
2BMH 251-282 32 DENIRYQIITFLIAGHETTSGLLSFALYFLVK 0.74
2LDB 107-127 21 LDLVDKNIAIFRSIVESVMAS 0.66
2MNR 98-118 21 GLIRMAAAGIDMAAWDALGKV 0.52
2TMV 114-134 21 VDDATVAIRSAINNLIVELIR 0.63
2TSI 164-184 21 FTEFSYMMLQAYDFLRLYETE 1.16
3HHR 156-183 28 LLKNYGLLYCFRKDMDKVETFLRIVQCR 0.56
3HHR 110-128 19 VYDLLKDLEEGIQTLMGRL 0.54
3INK 7-28 22 TKKTQLQLEHLLLDLQMILNGI 0.42
3PFK 258-276 19 AFDRVLASRLGARAVELLL 0.9
3TMS 173-192 20 GLPFNIASYALLVHMMAQQC 0.66
4TMS 226-244 19 VPFNIASYALLTHLVAHEC 0.59
5LDH 244-263 20 GYTNWAIGLSVADLIESMLK 0.52
8RUC 214-232 19 WRDRFLFCAEALYKAQAET 0.51
9LDB 109-130 22 RLNLVQRNVNIFKFIIPNIVKY 0.45
a Residues are numbered according to the PDB coordinate file.
b Mean residue hydropathy calculated with the Liu-Deber scale (Liu and Deber 1998a). See
Materials and Methods.
Table A1.3. Database of TM helix sequences (n = 212).
PDBID Residues a Length Sequence Hb
1AFO 72-96 25 EITLIIFGVMAGVIGTILLISYGIR 1.56
1FX8-1 6-35 30 LIFFGVGCVAALKVA 0.98
1FX8-2 40-60 21 GQWEISVIWGLGVAMAIYLTA 1.25
1FX8-3 68-77 10 NPAVTIALWL 1.51
1FX8-4 85-106 22 KVIPFIVSQVAGAFCAAALVYG 0.86
1FX8-5 144-169 26 NFVQAFAVEMVITAILMGLILALTDD 1.54
1FX8-6 178-194 17 LAPLLIGLLIAVIGASM 1.73
1FX8-7 203-216 14 NPARDFGPKVFAWL -0.3
1FX8-8 232-254 23 YFLVPLFGPIVGAIVGAFAYRKL 1.05
1L7V-1 2-32 30 LTLARQQQRQNIRWLLCLSVLLLALLLSLC 1.5
1L7V-2 47-81 34 RGELFVWQIRLPRTLAVLLVGAALAISGAVQALF 1.06
1L7V-3 93-107 15 VSNGAGVGLIAAVLL 0.78
1L7V-4 114-138 25 NWALGLCAIAGALIITLILLRFARR 1.58
1L7V-5 142-166 24 TSRLLLAGVALGIICSALTWAIYF 1.58
1L7V-6 191-206 15 SWLLALIPVLLWICC 2.85
1L7V-7 229-249 20 WFWRNVLVAATGWVGVSVAL 1.38
1L7V-8 258-267 10 LVIPHILRLC 1.63
1L7V-9 272-296 25 HRVLLPGCALAGASALLLADIVARL 0.81
1L7V-10 305-324 20 IGVVTATLGAPVFIWLLLKA 1.43
1LNQ-1 22-40 19 RILLLVLAVIIYGTAGFHF 1.87
1LNQ-2 46-57 12 WTVSLYWTFVTI 2.16
1LNQ-3 71-97 27 LGMYFTVTLIVLGIGTFAVAVERLLEF 1.72
1ORQ-1 26-50 25 LVELGVSYAALLSVIVVVVECTMQL 1.66
1ORQ-2 54-78 25 YLVRLYLVDLILVIILWADYAYRAY 2.23
1ORQ-3 84-92 9 AGYVKKTLY -0.27
1ORQ-4 96-112 17 ALVPAGLLALIEGHLAG 0.64
1ORQ-5 116-132 17 FRLVRLLRFLRILLIIS 2.41
1ORQ-6 139-147 9 SAIADAADK -0.86
1ORQ-7 148-173 26 IRFYHLFGAVMLTVLYGAFAIYIVEY 1.8
1ORQ-8 183-194 12 VFDALWWAVVTA 2.13
1ORQ-9 208-240 33 IGKVIGIAVMLTGISALTLLIGTVSNMFQKILV 1.07
1OTS-1 32-70 39 PLAILFMAAVVGTLVGLAAVAFDKGVAWLQNQRMGALVH 0.78
1OTS-2 75-100 26 YPLLLTVAFLCSAVLAMFGYFLVRKY 1.73
1OTS-3 109-117 9 IPEIEGALE 0.12
1OTS-4 124-140 17 WWRVLPVKFFGGLGTLG 0.77
1OTS-5 148-165 18 EGPTVQIGGNIGRMVLDI -0.29
1OTS-6 172-189 18 EARHTLLATGAAAGLAAA -0.11
1OTS-7 193-200 8 PLAGILFI 1.91
1OTS-8 215-230 16 IKAVFIGVIMSTIMYR 1.39
1OTS-9 250-282 33 NTLWLYLILGIIFGIFGPIFNKWVLGMQDLLHR 1.33
1OTS-10 288-305 18 ITKWVLMGGAIGGLCGLL 1.06
1OTS-11 321-325 5 PIATA -0.25
1OTS-12 330-349 20 MGMLVFIFVARVITTLLCFS 2.26
1OTS-13 357-378 22 FAPMLALGTVLGTAFGMVAVEL 1.22
1OTS-14 387-401 15 GTFAIAGMGALLAAS 0.61
1OTS-15 405-416 12 PLTGIILVLEMT 1.46
1OTS-16 423-438 16 LPMIITGLGATLLAQF 1.25
1PW4-1 37-57 21 GYAAYYLVRKNFALAMPYLVE 0.76
1PW4-2 64-84 21 DLGFALSGISIAYGFSKFIMG 0.67
1PW4-3 94-112 19 VFLPAGLILAAAVMLFMGF 2.11
1PW4-4 121-141 21 AVMFVLLFLCGWFQGMGWPPC 1.63
1PW4-5 160-180 21 VWNCAHNVGGGIPPLLFLLGM 0.47
1PW4-6 190-207 18 LYMPAFCAILVALFAFAM 2.42
1PW4-7 264-282 19 FVYLLRYGILDWSPTYLKE 0.97
1PW4-8 288-309 22 LDKSSWAYFLYEYAGIPGTLLC 0.68
1PW4-9 322-341 20 GATGVFFMTLVTIATIVYWM 1.77
1PW4-10 347-367 21 PTVDMICMIVIGFLIYGPVML 1.68
1PW4-11 389-409 21 GLFGYLGGSVAASAIVGYTVD 0.32
1PW4-12 415-434 20 GGFMVMIGGSILAVILLIVV 1.98
1RH5_A-1 23-42 20 FKEKLKWTGIVLVLYFIMGC 1.19
1RH5_A-2 57-91 23 EFWQTITGPIVTAGIIMQLLVGS 0.81
1RH5_A-3 105-129 25 ALFQGCQKLLSIIMCFVEAVLFVGA 1.57
1RH5_A-4 138-162 25 LLAFLVIIQIAFGSIILIYLDEIVS 2.29
1RH5_A-5 169-187 19 GIGLFIAAGVSQTIFVGAL 1.02
1RH5_A-6 211-228 18 APIIGTIIVFLMVVYAEC 1.87
1RH5_A-7 257-276 20 IPVILAAALFANIQLWGLAL 1.8
1RH5_A-8 314-336 23 IHAIVYMIAMIITCVMFGIFWVE 2.37
1RH5_A-9 377-395 19 LTVMSSAFVGFLATIANFI 1.48
1RH5_A-10 401-415 15 GTGVLLTVSIVYRMY 1.06
1RH5_B 32-62 31 EYLAVAKVTALGISLLGIIGYIIHVPATYIK 0.82
1RH5_C 30-49 20 PEHVIGVTVAFVIIEAILTY 1.19
1XL4-1 46-68 23 WPVFITLITGLYLVTNALFALAY 1.86
1XL4-2 108-135 28 LANTLVTLEALCGMLGLAVAASLIYARF 1.35
1YCE-1 5-38 34 FAKTVVLAASAVGAGTAMIAGIGPGVGQGYAAGK -0.35
1YCE-2 55-80 26 STMVLGQAVAESTGIYSLVIALILLY 1.24
1YEW_A-1 185-207 23 EGNTYFWHAFWFAIGVAWIGYWS 1.18
1YEW_A-2 233-257 25 RKVAMGFLAATILIVVMAMSSANSK 0.54
1YEW_B-1 11-43 33 HAEAVQVSRTIDWMALFVVFFVIVGSYHIHAML 1.1
1YEW_B-2 58-82 25 RLWVTVTPIVLVTFPAAVQSYLWER 1.01
1YEW_B-3 88-111 24 GATVCVLGLLLGEWINRYFNFWGW 1.36
1YEW_B-4 126-136 11 PGAIILDTVLM 1.18
1YEW_B-5 141-165 25 YLFTAIVGAMGWGLIFYPGNWPIIA 1.19
1YEW_B-6 200-212 13 VEKGTLRTFGKDV -0.75
1YEW_B-7 229-242 14 FMWHFIGRWFSNER 0.77
1YEW_C-1 51-71 21 LTFALAIYTVFYLWVRWYEGV 2.1
1YEW_C-2 84-113 30 EFETYWMNFLYTEIVLEIVTASILWGYLWK 1.61
1YEW_C-3 132-160 29 FTHLVWLVAYAWAIYWGASYFTEQDGTWH 0.95
1YEW_C-4 172-197 26 SHIIEFYLSYPIYIITGFAAFIYAKT 1.06
1YEW_C-5 244-256 13 LHYGFVIFGWLAL 2.09
1Z98-1 35-64 30 WSFWRAAIAEFIATLLFLYITVATVIGHSK 1.32
1Z98-2 74-90 17 LLGIAWAFGGMIFVLVY 2.33
1Z98-3 102-111 10 PAVTFGLFLA 1.36
1Z98-4 116-139 24 LLRALVYMIAQCLGAICGVGLVKA 1.34
1Z98-5 161-180 20 KGTALGAEIIGTFVLVYTVF 1
1Z98-6 200-215 16 LPIGFAVFMVHLATIP 1.19
1Z98-7 223-232 10 PARSFGAAVI -0.09
1Z98-8 237-262 26 KVWDDQWIFWVGPFIGAAVAAAYHQY 0.6
1ZLL 24-52 29 ARQKLQNLFINFCLILICLLLICIIVMLL 2.49
2A79-1 222-242 21 FFIVETLCIIWFSFEFLVRFF 2.93
2A79-2 290-309 20 LAILRVIRLVRVFRIFKLSR 1.49
2A79-3 312-321 10 KGLQILGQTL 0.05
2A79-4 322-348 27 KASMRELGLLIFFLFIGVILFSSAVYF 1.82
2A79-5 362-372 11 PDAFWWAVVSM 1.28
2A79-6 388-419 32 KIVGSLCAIAGVLTIALPVPVIVSNFNYFYHR 0.65
2AHY-1 22-42 21 KEFQVLFVLTILTLISGTIFY 1.75
2AHY-2 50-62 13 PIDALYFSVVTLT 1.13
2AHY-3 75-97 23 FGKIFTILYIFIGIGLVFGFIHK 1.61
2B2F-1 2-28 27 SDGNVAWILASTALVMLMVPGVGFFYA 1
2B2F-2 33-64 32 RKNAVNMIALSFISLIITVLLWIFYGYSVSFG 1.35
2B2F-3 86-110 25 DLLFMMYQMMFAAVTIAILTSAIAE 1.78
2B2F-4 114-138 25 VSSFILLSALWLTFVYAPFAHWLWG 1.76
2B2F-5 151-170 20 AGGMVVHISSGFAALAVAMT 0.46
2B2F-6 182-210 29 IEPHSIPLTLIGAALLWFGWFGFNGGSAL 0.66
2B2F-7 215-241 27 VAINAVVVTNTSAAVAGFVWMVIGWIK 1.07
2B2F-8 246-260 15 SLGIVSGAIAGLAAI 0.72
2B2F-9 269-294 26 VKGAIVIGLVAGIVCYLAMDFRIKKK 0.66
2B2F-10 299-319 21 LDAWAIHGIGGLWGSVAVGIL 0.82
2B2F-11 334-366 33 NPQLLVSQLIAVASTTAYAFLVTLILAKAVDAA 0.82
2BG9_A-1 218-237 20 VIIPCLLFSFLTVLVFYLPT 2.32
2BG9_A-2 243-261 19 MTLSISVLLSLTVFLLVIV 2.47
2BG9_A-3 280-300 21 FTMIFVISSIIVTVVVINTHH 1.35
2BG9_A-4 408-427 20 HILLCVFMLICIIGTVSVFA 2.38
2BL2-1 11-46 36 GMVFAVLAMATATIFSGIGSAKGVGMTGEAAAALTT 0.31
2BL2-2 51-79 29 KFGQALILQLLPGTQGLYGFVIAFLIFIN 1.22
2BL2-3 87-122 36 VQGLNFLGASLPIAFTGLFSGIAQGKVAAAGIQILA 0.42
2BL2-4 127-155 29 HATKGIIFAAMVETYAILGFVISFLLVLN 1.38
2CFP-1 9-35 27 FWMFGLFFFFYFFIMGAYFPFFPIWLH 2.69
2CFP-2 44-68 25 DTGIIFAAISLFSLLFQPLFGLLSD 1.33
2CFP-3 74-99 26 KYLLWIITGMLVMFAPFFIFIFGPLL 2.32
2CFP-4 105-131 27 VGSIVGGIYLGFCFNAGAPAVEAFIEK 0.41
2CFP-5 140-162 23 FGRARMFGCVGWALGASIVGIMF 1.04
2CFP-6 168-188 21 FVFWLGSGCALILAVLLFFAK 2.27
2CFP-7 221-250 30 KLWFLSLYVIGVSCTYDVFDQQFANFFTSF 1.2
2CFP-8 259-278 20 RVFGYVTTMGELLNASIMFF 1.2
2CFP-9 289-308 20 KNALLLAGTIMSVRIIGSSF 0.57
2CFP-10 313-334 22 LEVVILKTLHMFEVPFLLVGCF 1.78
2CFP-11 347-373 27 ATIYLVCFCFFKQLAMIFMSVLAGNMY 1.84
2CFP-12 378-398 21 FQGAYLVLGLVALGFTLISVF 1.81
2GIF-1 10-29 20 IFAWVIAIIIMLAGGLAILK 2.3
2GIF-2 338-356 19 HEVVKTLVEAIILVFLVMY 1.84
2GIF-3 363-386 24 RATLIPTIAVPVVLLGTFAVLAAF 1.32
2GIF-4 393-419 26 LTMFGMVLAIGLLVDDAIVVVENVER 1.37
2GIF-5 433-459 27 KSMGQIQGALVGIAMVLSAVFVPMAFF 1.28
2GIF-6 467-495 29 YRQFSITIVSAMALSVLVALILTPALCAT 1.29
2GIF-7 540-558 19 RYLVLYLIIVVGMAYLFVR 2.39
2GIF-8 873-892 20 APSLYAISLIVVFLCLAALY 2.01
2GIF-9 895-918 24 WSIPFSVMLVVPLGVIGALLAATF 1.47
2GIF-10 926-950 25 YFQVGLLTTIGLSAKNAILIVEFAK 0.85
2GIF-11 960-991 32 LIEATLDAVRMRLRPILMTSLAFILGVMPLVI 1.4
2GIF-12 999-1030 32 AQNAVGTGVMGGMVTATVLAIFFVPVFFVVVR 1.02
2HAC 2-23 22 SKLCYLLDGILFIYGVILTALF 1.97
2HYD-1 12-39 28 YKYRIFATIIVGIIKFGIPMLIPLLIKY 1.31
2HYD-2 57-90 34 HHLTIAIGIALFIFVIVRPPIEFIRQYLAQWTSN 0.88
2HYD-3 132-158 27 KDFILTGLMNIWLDCITIIIALSIMFF 2.1
2HYD-4 162-191 30 KLTLAALFIFPFYILTVYVFFGRLRKLTRE 1.38
2HYD-5 235-268 34 TRALKHTRWNAYSFAAINTVTDIGPIIVIGVGAY 0.1
2HYD-6 277-312 36 VGTLAAFVGYLELLFGPLRRLVASFTTLTQSFASMD 0.79
2IC8-1 94-114 21 GPVTWVMMIACVVVFIAMQIL 2.07
2IC8-2 147-168 22 SLMHILFNLLWWWYLGGAVEKR 1.32
2IC8-3 171-192 22 SGKLIVITLISALLSGYVQQKF 0.63
2IC8-4 200-217 18 LSGVVYALMGYVWLRGER 0.88
2IC8-5 226-241 16 QRGLIIFALIWIVAGW 2.07
2IC8-6 250-269 20 ANGAHIAGLAVGLAMAFVDS 1.69
2IUB-1 291-312 22 MKVLTIIATIFMPLTFIAGIYG 1.53
2IUB-2 327-349 23 YPVVLAVMGVIAVIMVVYFKKKK 0.97
2NWL-1 15-31 17 KILIGLILGAIVGLILG 1.81
2NWL-2 36-68 33 AHAVHTYVKPFGDLFVRLLKMLVMPIVFASLVV 0.96
2NWL-3 78-108 31 LGRVGVKIVVYYLLTSAFAVTLGIIMARLFN 1.31
2NWL-4 130-168 39 LVHILLDIVPTNPFGALANGQVLPTIFFAIILGIAITYL 1.17
2NWL-5 195-218 24 YKIVNGVMQYAPIGVFALIAYVMA 1.05
2NWL-6 228-254 27 LAKVTAAVYVGLTLQILLVYFVLLKIY 1.87
2NWL-7 298-329 32 IYSFTLPLGATINMDGTALYQGVCTFFIANAL 0.81
2NWL-8 391-414 24 AILDMGRTMVNVTGDLTGTAIVAK 0.15
2OAR-1 15-43 29 VDLAVAVVIGTAFTALVTKFTDSIITPLI 1.08
2OAR-2 69-89 21 LNVLLSAAINFFLIAFAVYFL 2.42
2OAU-1 29-57 29 VNIVAALAIIIVGLIIARMISNAVNRLMI 1.58
2OAU-2 68-91 24 FLSALVRYGIIAFTLIAALGRVGV 1.44
2OAU-3 96-127 32 VIAVLGAAGLAVGLALQGSLSNLAAGVLLVMF 1.2
2ONK-1 2-29 28 RLLFSALLALLSSIILLFVLLPVAATVT 2.06
2ONK-2 48-78 31 WKVVLTTYYAALISTLIAVIFGTPLAYILAR 1.42
2ONK-3 84-107 24 KSVVEGIVDLPVVIPHTVAGIALL 0.5
2ONK-4 127-152 25 LPGIVVAMLFVSVPIYINQAKEGFA 0.73
2ONK-5 183-205 23 RHIVAGAIMSWARGISEFGAVVV 0.51
2ONK-6 231-250 20 PVAAILILLSLAVFVALRII 2.28
2HG9-1 35-56 22 GVATFFFAALGIILIAWSAVLQ 1.86
2HG9-2 84-111 28 GLWQIITICATGAFVSWALREVEICRKL 1.08
2HG9-3 116-198 24 HIPFAFAFAILAYLTLVLFRPVMM 1.86
2HG9-4 171-198 28 PAHMIAISFFFTNALALALHGALVLSAA 0.97
2HG9-5 226-248 24 TLGIHRLGLLLSLSAVFFSALCMI 1.57
2HG9-6 150-162/209-219 24 IWTHLDWVSNTGY PDHEDTFFRDL -0.12
2NTU-1 12-30 22 EWIWLALGTALMGLGTLYFLVK 1.72
2NTU-2 37-62 26 PDAKKFYAITTLVPAIAFTMYLSMLL 0.91
2NTU-3 81-101 21 ARYADWLFTTPLLLLDLALLV 1.84
2NTU-4 105-127 23 QGTILALVGADGIMIGTGLVGAL 0.64
2NTU-5 131-153 23 YSYRFVWWAISTAAMLYILYVLF 2.22
2NTU-6 165-191 27 PEVASTFKVLRNVTVVLWSAYPVVWLI 0.97
2NTU-7 201-224 24 LNIETLLFMVLDVSAKVGFGLILL 1.72
2UUI-1 7-27 21 LLAAVTLLGVLLQAYFSLQVI 1.98
2UUI-2 68-88 21 WVAGIFFHEGAAALCGLVYLF 1.61
2UUI-3 115-135 21 LWLLVALAALGLLAHFLPAAL 2.09
2HJF_C-1 28-50 23 AAGAATVLLVIVLLAGSYLAVLA 1.64
2HJF_C-2 88-111 24 GRCVAVVVMVAGITSFGLVTAALA 1.08
2NQ2-1 7-24 18 PKILFGLTLLLVITAVIS 1.67
2NQ2-2 61-85 25 VRLPRILTALCVGAGLALSGVVLQG 0.71
2NQ2-3 98-113 16 GVTSGSAFGGTLAIFF 0.4
2NQ2-4 117-139 23 LYGLFTSTILFGFGTLALVFLFS 1.93
2NQ2-5 147-171 25 LLMLILIGMILSGLFSALVSLLQYI 2.37
2NQ2-6 194-212 19 WEKLLFFFVPFLLCSSILL 2.44
2NQ2-7 235-256 22 MAPLRWLVIFLSGSLVACQVAI 1.53
2NQ2-8 264-274 11 GLIIPHLSRML 0.71
2NQ2-9 279-297 18 HQSLLPCTMLVGATYMLL 0.96
2NQ2-10 312-328 17 SILTALIGAPLFGVLVY 1.52
a Residues are numbered according to the PDB coordinate file.
b Mean residue hydropathy calculated with the Liu-Deber scale (Liu and Deber 1998a). See
Materials and Methods.
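To illustrate footnote b, the mean residue hydropathy of a segment is the sum of the per-residue scale values divided by the segment length. The following is a minimal Python sketch of that calculation. The Liu-Deber values are not reproduced in this appendix, so the sketch uses the Kyte-Doolittle scale purely as a stand-in; its output will therefore not match the values tabulated above. The function name and the `scale` parameter are illustrative assumptions.

```python
# Kyte-Doolittle hydropathy values, used here ONLY as a stand-in scale.
# The table above was computed with the Liu-Deber scale, whose values
# are not listed in this appendix.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence, scale=KYTE_DOOLITTLE):
    """Return the mean per-residue hydropathy of a one-letter sequence."""
    residues = sequence.strip().upper()
    if not residues:
        raise ValueError("sequence is empty")
    unknown = set(residues) - set(scale)
    if unknown:
        raise ValueError(f"residues missing from scale: {sorted(unknown)}")
    return sum(scale[r] for r in residues) / len(residues)


# Example: segment 2B2F-8 from the table above (stand-in scale).
print(round(mean_hydropathy("SLGIVSGAIAGLAAI"), 2))
```

Passing a dictionary of Liu-Deber values as `scale` would be the way to reproduce the tabulated numbers, assuming the same averaging is used.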
Table A1.4. Oligonucleotides used in this work.
Oligonucleotide Sequence (5′–3′)
pGEM constructs
LepH3-F ACCGGGdUGGGgtaccAGGGCAAC
LepH3-R ACCTGGdUCCACCACTAGTCTCGGAAAG
LepH2-F ACCGGGdUGGGgtaccgatc
LepH2-R ACCTGGdUCCACCACTAGTCTTCGGCGCAAC
LepH1-F Same as LepH2-F
LepH1-R ACCTGGdUCCACCACTAGTCTGTGCGTTGATCGGTTG
pGEM-F AGCCATCTdUCGTTCACGTTTGC
pGEM-R ATGGTGGCdUCTAGAGTCGACCTG
δ-Helix-encoding oligonucleotides
1ECA-F ACCAGGdUGACTTCGCTGGAGCTGAAGCAGCCTGGGGTGCAACTCTTGACACTTTCT
1ECA-R ACCCGGdUCCCATCTTTGAGAAGATCATTCCGAAGAAAGTGTCAAGAGTTGCAC
1MBA-F ACCAGGdUCCCGCCGGCGCCGACGCTGCATGGACCAAGCTCTTCGGACTCATC
1MBA-R ACCCGGdUCCTTTGCCGGCGGCTTTGAGGGCATCGATGATGAGTCCGAAGAGCTTG
1SRY-F ACCAGGdUGCACTGAAAGGCGACCTGGCACTGTACGAATTAGCACTGTTACGTTTCGCTATGGAC
1SRY-R ACCCGGdUCCCGGCAAGGTCATCGGCAAAAAGCCACGACGAGCCATGAAGTCCATAGCGAAACGTAACAG
2BMH-F ACCAGGdUCCGCTTGATGACGAGAACATTCGCTATCAAATTATTACATTCTTAATTGCGGGACACGAAAC
2BMH-R ACCCGGdUCCTACATGTGGATTTTTCACTAAGAAATACAGCGCAAATGATAAAAGACCACTTGTTGTTTCGTGTCCCGCA
1HDS-F ACCAGGdUGTTGATCCTGAAAATTTTCGTCTTCTTGGTAATGTTCTTGTTGTTGTTCTTG
1HDS-R ACCCGGdUCCAGTAAATTCACCACCAAAATTACGAGCAAGAACAACAACAAGAACATTAC
2AAI-F ACCAGGdUACTCAGCTTCCAACTCTGGCTCGTTCCTTTATAATTTGCATCCAAATGATTTCAGAAGCAGC
2AAI-R ACCCGGdUCCGCGCATTTCTCCCTCAATATATTGGAATCTTGCTGCTTCTGAAATCATTTGGATGC
Site-directed mutagenesis
1ECA-E6V TCGCTGGAGCTGTAGCAGCCTGGGG
1ECA-G10V TGAAGCAGCCTGGGTTGCAACTCTTGACAC
1ECA-D14V CTGGGGTGCAACTCTTGTCACTTTCTTCGGAATGA
2AAI-R8C TCAGCTTCCAACTCTGGCTTGTTCCTTTATAATTTGCATC
2AAI-E19V AATTTGCATCCAAATGATTTCAGTAGCAGCAAGATTCCAATATATTG
2AAI-R22I CCAAATGATTTCAGAAGCAGCAATATTCCAATATATTGAGGGAGAAA
2BMH-E20V CATTCTTAATTGCGGGACACGTAACAACAAGTGGTCTTTTATC
2AAI-E19V-R22I CCAAATGATTTCAGTAGCAGCAATATTCCAATATATTGAGGGAGAAA
2BMH-E20V-H19L TTATTACATTCTTAATTGCGGGACTCGTAACAACAAGTGGTCTTTTATCATT
2BMH-P378S-F CGTCCAGAGCGTTTTGAAAATTCAAGTGCGATTCC
2BMH-P378S-R GGAATCGCACTTGAATTTTCAAAACGCTCTGGACG
Full-length and truncated 2BMH
pGEM-2BMH-F AGCCACCAdUGACAATTAAAGAAATGCCTCAGCCAAAAAC
pGEMgly2BMH-F AGCCACCAdUGACAATTAACTCCACAAAAGAAATGCCTCAGCCAAAAAC
pGEM-2BMH-R AAGATGGCdUACCCAGCCCACACGTCTTTTG
2BMHΔ1-226-F AGCCACCAdUGACAATTGGTGAACAAAGCGATGATTTATTAAC
gly2BMHΔ1-226-F AGCCACCAdUGACAATTAACTCCACAGGTGAACAAAGCGATGATTTATTAAC
2BMHΔ1-240-F AGCCACCAdUGACAATTAAAGATCCAGAAACGGGTGAG
gly2BMHΔ1-240-F AGCCACCAdUGACAATTAACTCCACAAAAGATCCAGAAACGGGTGAG
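The forward and reverse site-directed mutagenesis primers listed for 2BMH-P378S should be reverse complements of each other. The minimal Python sketch below checks this for the two sequences copied from the table above. The function name is illustrative, and the helper handles only plain A/C/G/T sequences, not the dU-containing primers.

```python
# Translation table mapping each base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a plain A/C/G/T DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Sequences copied from the 2BMH-P378S-F and 2BMH-P378S-R rows.
forward = "CGTCCAGAGCGTTTTGAAAATTCAAGTGCGATTCC"
reverse = "GGAATCGCACTTGAATTTTCAAAACGCTCTGGACG"

print(reverse_complement(forward) == reverse)
```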
Appendix 2: Commonly used media for bacterial expression.
TB medium
0.4% (v/v) glycerol
2.4% (w/v) Bacto yeast extract
1.2% (w/v) Bacto tryptone
17 mM KH2PO4
72 mM K2HPO4
LB
1% (w/v) Tryptone
0.5% (w/v) Yeast Extract
0.5% (w/v) NaCl
0.01% (v/v) 1N NaOH
M9 Media (5x)
3% (w/v) Na2HPO4
1.5% (w/v) KH2PO4
0.5% (w/v) NH4Cl
0.25% (w/v) NaCl
0.0015% (w/v) CaCl2 (optional)
For M9 Minimal media (1x) supplement with:
1 mM MgSO4
mM CaCl2
0.04 mM Biotin
0.03 mM Vitamin B
0.003% (w/v) Glucose
For M9 Rich media (1x) supplement with:
1 mM MgSO4
mM CaCl2
5.93 µM Vitamin B
0.004% (w/v) Glucose
0.005% (w/v) Casein Enzymatic Hydrolysate
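For bench use, the % (w/v) entries above can be converted to grams for a chosen batch volume, since 1% (w/v) corresponds to 1 g per 100 mL. The Python sketch below does this for the M9 (5x) salts. The function name and the 1 L batch volume are illustrative assumptions, and the optional CaCl2 entry is left out.

```python
def grams_for_percent_wv(percent_wv, volume_ml):
    """Grams of solute for a given % (w/v) and final volume in mL.

    By definition, 1% (w/v) is 1 g of solute per 100 mL of solution.
    """
    return percent_wv * volume_ml / 100.0


# M9 (5x) salts as listed above, in % (w/v).
m9_5x = {"Na2HPO4": 3.0, "KH2PO4": 1.5, "NH4Cl": 0.5, "NaCl": 0.25}

batch_volume_ml = 1000  # assumed batch size: 1 L of 5x stock
for salt, percent in m9_5x.items():
    grams = grams_for_percent_wv(percent, batch_volume_ml)
    print(f"{salt}: {grams:.2f} g per {batch_volume_ml} mL")
```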
Copyright Acknowledgements (if any)