Identification and analysis of the folding determinants of membrane proteins.

Fiona Cunningham

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Department of Biochemistry University of Toronto

Abstract

Membrane proteins are responsible for a variety of key cellular functions including

transport of essential substrates across the membrane, signal transduction, and maintenance of

cellular morphology. However, given the size and high hydrophobicity of membrane proteins,

along with demanding expression and solubilization protocols that often preclude biophysical

studies, novel approaches must be devised for studies of their structure and function. This thesis

addresses these issues through several sets of inter-related experiments. We first examine

sequence motifs directing α-helix packing, wherein the determinants of glycophorin A (GpA)

dimerization were identified via TOXCAT assay and the evaluation of GpA-derived peptides.

We found that (i) conservative mutations can have significant effects on the oligomerization of

glycophorin A; and (ii) residues that introduce more efficiently packed structures that are poorly

solvated by lipid lead to improved transmembrane segment dimerization. In a further study, we

inquired into the criteria for selection of membrane-spanning α-helices by cellular machinery

through investigation of hydrophobic helical segments (termed δ-helices) that we identified in

soluble proteins. We found that the number and location of charged residues in a given

hydrophobic helix are related to their insertion propensity as membrane-spanning segments.

When we applied this criterion to δ-helices in their intact protein structures, we successfully

determined the extent of δ-helix mutations necessary to convert a soluble protein, in part, to a

membrane-inserted protein. Finally, using a three-transmembrane segment construct from the

cystic fibrosis transmembrane conductance regulator (CFTR), we performed experiments aimed

at optimizing criteria for protein overexpression, including construct design, choice of expression

system, growth media, and expression temperature. The overall findings are interpreted in terms

of progress towards defining the fundamental characteristics of membrane-spanning α-helices,

from their primary amino acid sequence to the helix-helix interactions they display in the

assembly of biologically functional membrane protein structures.

Table of Contents

Abstract
Table of Contents
List of Figures
List of Tables
List of Appendices
List of Abbreviations

Chapter 1. Introduction.
1.1 Introduction to membrane proteins.
1.2 Properties of transmembrane segments.
1.2.1 Secondary structure of transmembrane segments.
1.2.2 Structural features of α-helical membrane proteins.
1.2.3 Structural features of β-barrel membrane proteins.
1.2.4 Amino acid composition of α-helical TM segments.
1.2.5 Insertion of membrane proteins into the membrane bilayer is mediated by the translocon.
1.3 Prediction of transmembrane segments from the primary amino acid sequence.
1.3.1 Tools for the prediction of transmembrane segments.
1.3.2 Unique marginally hydrophobic transmembrane segments: breaking the rules.
1.3.3 Charged residues in transmembrane segments: importance for membrane integration.
1.4 Folding of α-helical membrane proteins.
1.4.1 Stage 1: Insertion of transmembrane segments into the membrane bilayer.
1.4.2 Stage 2: Formation of tertiary contacts between transmembrane α-helices.
1.4.3 Fragmentation approach to studying membrane proteins.
1.4.4 Efforts to generate membrane protein high resolution structures.
1.5 Forces driving membrane protein folding.
1.5.1 van der Waals interactions.
1.5.2 Electrostatic interactions.
1.5.3 Cation-π interactions.
1.5.4 Helix lipid interactions.
1.5.5 Lipids as membrane protein chaperones.
1.6 Membrane protein oligomerization motifs.
1.6.1 Polar residue motifs.
1.6.2 Left-handed transmembrane folding motifs: heptad repeats of small or large residues.
1.6.3 Right-handed packing motifs: small-xxx-small motifs.
1.6.3.1 The human erythrocyte protein Glycophorin A.
1.7 Methods to study the oligomerization of transmembrane segments.
1.7.1 Membrane mimetic systems: micelles versus bilayers.
1.7.1.1 Membrane bilayers.
1.7.1.2 Detergent micelles.
1.7.2 The TOXCAT assay.
1.7.3 Sodium dodecyl sulfate polyacrylamide gel electrophoresis.
1.7.4 Förster Resonance Energy Transfer.
1.7.5 Analytical ultracentrifugation.
1.7.6 Computational methods.
1.8 Thesis hypothesis and outline.

Chapter 2. Beta-branched residues adjacent to GG4 motifs promote the efficient association of glycophorin A transmembrane helices.
2.1 Introduction.
2.2 Results.
2.2.1 Mutations at Val80 and Val84 in the glycophorin A dimerization motif can modulate the strength of oligomerization in vivo.
2.2.2 ... and Val84 do not alter secondary structure.
2.2.3 Lipid accessibility of the ridge residues correlates inversely with tightness of dimer.
2.3 Discussion.
2.3.1 β-branched residues are required to mediate efficient association of the GpA homodimer in the membrane bilayer.
2.3.2 Hydrophobic β-branched residues may be structurally optimized for transmembrane segment folding.
2.3.3 Modulating helix interactions.
2.3.4 Conclusion.
2.4 Materials and Methods.
2.4.1 TOXCAT Assay.
2.4.2 Peptide Synthesis.
2.4.3 Circular Dichroism.
2.4.4 Glycophorin A Helix Solvation Calculations.

Chapter 3: Distinctions between hydrophobic helices in globular proteins and transmembrane segments as factors in protein sorting.
3.1 Introduction.
3.2 Results.
3.2.1 Hydropathy of δ-helices.
3.2.2 Hydrophobic and charged/polar residue content in δ-helices.
3.2.3 Amino acid composition of δ-helices vs. transmembrane and other globular helices.
3.2.4 δ-helices are more buried within their native folds than other globular helices.
3.2.5 Folding of the δ-helix peptides in aqueous and membrane mimetic environments.
3.2.6 Competence of δ-helix segments for in vivo membrane insertion.
3.2.7 Charged residue distribution distinguishes δ-helix and transmembrane sequences.
3.3 Discussion.
3.3.1 Role in globular proteins.
3.3.2 Role of residue content.
3.3.3 Recognition of hydrophobic segments.
3.3.4 Conclusions.
3.4.1 Database construction.
3.4.2 Amino acid composition analysis.
3.4.3 Solvent accessibility analysis.
3.4.4 Residue position analysis.
3.4.5 Peptide synthesis and purification.
3.4.6 Circular dichroism and fluorescence spectroscopy.
3.4.7 Plasmid construction.
3.4.8 MalE complementation test.
3.4.9 Chloramphenicol acetyltransferase enzyme-linked immunosorbent assay.

Chapter 4. Converting a Marginally Hydrophobic Soluble Protein into a Membrane Protein.
4.1 Introduction.
4.2 Results.
4.2.1 δ-Helix hydrophobicity.
4.2.2 Choice of δ-helices for membrane insertion studies.
4.2.3 Experimental quantification of membrane insertion of selected δ-helices.
4.2.4 Converting δ-helices into transmembrane segments.
4.2.5 Converting a soluble protein into a membrane protein.
4.3 Discussion.
4.4.1 DNA engineering.
4.4.2 Membrane insertion assay and endoH treatment.

Chapter 5. Optimizing synthesis and expression of transmembrane peptides and proteins.
5.1 Introduction.
5.1.1 Fragmentation approach to study membrane protein folding.
5.2 Domain fragments of membrane proteins: Application to the Cystic Fibrosis Transmembrane Conductance Regulator.
5.2.1 The Cystic Fibrosis Transmembrane Conductance Regulator.
5.2.2 Disease causing mutations in CFTR.
5.2.3 Fragments of CFTR as a minimal tertiary model.
5.3 Triple strand construct from the Cystic Fibrosis Transmembrane Conductance Regulator transmembrane domain.
5.3.1 Construct information.
5.3.1.1 Cloning of CFTR TM2/3/4 fragment: Methodology.
5.3.2 Protein Expression of CFTR TM2/3/4 under successful CFTR TM3/4 conditions.
5.3.3 Heterologous expression of CFTR TM2/3/4.
5.3.3.1 E. coli strain.
5.3.3.2 Temperature of protein expression induction and concentration of IPTG.
5.3.3.3 Growth media.
5.3.4 Removing intracellular loop 1 between TM2 and TM3 to improve protein expression.
5.4 Characterization of expressed CFTR fragments.
5.4.1 Differential gel migration of TM2/3/4 mutants.
5.5 Discussion.

Chapter 6. Discussion.
6.1 Discussion.
6.1.1 Summary of contributions.
6.1.1.1 Beta-branched residues adjacent to GG4 motifs promote the efficient association of glycophorin A transmembrane helices.
6.1.1.2 Distinctions between hydrophobic helices in globular proteins and transmembrane segments as factors in protein sorting.
6.1.1.3 Converting a marginally hydrophobic globular protein into a membrane protein.
6.1.1.4 Optimizing synthesis and expression of transmembrane peptides and proteins.
6.2 Membrane mimetic micelles versus bilayers.
6.2.1 SDS as a membrane mimetic.
6.3 Significant content of hydrophobic, β-branched residues in transmembrane segments relative to δ-helices.
6.4 Membrane insertion propensity of transmembrane segments.
6.4.1 Transmembrane segments with low hydrophobicity.
6.4.2 Importance of charged residues to translocon-mediated membrane insertion of δ-helices.
6.4.3 Importance of secondary structure to translocon-mediated membrane insertion of δ-helices.
6.4.4 Biological role of δ-helices.
6.5 Helix-helix interactions.
6.5.1 Sequence specific dimerization motifs.
6.5.2 Prediction of helix-helix interactions.
6.6 Insights from high resolution structures of membrane proteins.
6.7 Future directions of membrane protein folding.

Chapter 7: Literature Cited.
Appendices.

List of Figures

Figure 1.1 The hydrophobic interior of the membrane bilayer compels membrane proteins to adopt specific tertiary and quaternary structures.
Figure 1.2 Example structures of α-helical bundle and β-barrel membrane proteins.
Figure 1.3 The structural diversity of α-helical membrane proteins.
Figure 1.4 The structures and function of β-barrel membrane proteins is diverse.
Figure 1.5 The distribution of certain amino acids which comprise TM segments has a skewed distribution.
Figure 1.6 The lipid bilayer is highly heterogeneous.
Figure 1.7 X-ray crystal structure of the protein conducting channel SecY from Methanocaldococcus jannaschii (PDB ID 1RH5).
Figure 1.8 X-ray crystal structure of the Mammalian Shaker Kv1.2 potassium channel (PDB ID 2A79).
Figure 1.9 The two-stage folding model for membrane proteins.
Figure 1.10 van der Waals interactions in the close packing of TM α-helices.
Figure 1.11 Hydrogen-bond interactions in the folding of membrane proteins.
Figure 1.12 Cation-π interactions.
Figure 1.13 Acyl chain conformations can occur to accommodate a TM segment within the membrane bilayer.
Figure 1.14 The TOXCAT assay for detecting in vivo TM association within the E. coli inner membrane.
Figure 2.1 Substitutions for WT Val80 and/or Val84 with Ile, Leu, and Ala combinations, shown in alignment with the local GpA sequence.
Figure 2.2 TOXCAT assays of GpA and mutant sequences.
Figure 2.3 TOXCAT assay of GpA and mutant Thr sequences.
Figure 2.4 Circular dichroism spectra of synthetic peptides corresponding to the GpA TM sequence.
Figure 2.5 GpA dimer affinity is inversely correlated to interfacial lipid accessibility.
Figure 2.6 Models of the structure of the GG4 'ridge motif' involved in GpA dimerization, with the II, VV (WT) and LL mutants shown in order of increasing lipid accessibility and decreasing dimer strength.
Figure 3.1 Comparison of globular helix, δ-helix and TM helix amino acid composition.
Figure 3.2 Solvent accessibility of globular vs. δ-helices.
Figure 3.3 Ribbon diagrams of helical globular proteins containing δ-helix regions studied in this work.
Figure 3.4 Circular dichroism spectra of δ-helix peptides in various media.
Figure 3.5 δ-helix secondary structure compared with Chou-Fasman α-helix propensity.
Figure 3.6 TOXCAT assay of δ-helix peptides in the E. coli inner membrane.
Figure 3.7 Charged residue positioning in δ- vs. TM helices.
Figure 4.1 Evaluation of the membrane integration properties of selected δ-helices.
Figure 4.2 Conversion of δ-helices into TM segments.
Figure 4.3 Conversion of a soluble protein into a membrane protein.
Figure 5.1 Structure of CFTR.
Figure 5.2 Construct of the pET32a(TM2/3/4) designed for the expression of the Trx-CFTR TM2/3/4 fusion protein.
Figure 5.3 Western blot of expression trial of CFTR TM2/3/4 in conditions which were successful for CFTR TM3/4.
Figure 5.4 Flow chart showing the conditions used to optimize protein expression for CFTR TM2/3/4.
Figure 5.5 Expression of CFTR TM2/3/4 in various E. coli cell lines: BL21, BL21 (codon plus) and C43.
Figure 5.6 Western blot showing the effect of different induction temperatures on protein expression of Trx-TM2/3/4, at two different concentrations of IPTG: 0.1 mM and 1.0 mM.
Figure 5.7 Western blot showing the effect of different growth media on protein expression of Trx-TM2/3/4, at two different induction temperatures: 25°C and 37°C.
Figure 5.8 Construct of the pET32a(TM2/3/4)-Loop designed for the expression of the Trx- TM2/3/4Loop fusion protein.
Figure 5.9 Expression trials showing the effect of different growth media on protein expression of Trx-TM2/3/4-Loop, at two different induction temperatures: 25°C (Lanes 2-9) and 37°C (Lanes 10-17) in BL21 (DE3) E. coli cells.
Figure 5.10 Differential migration of WT CFTR-TM2/3/4 relative to mutants.
Figure 5.11 CFTR TM3/4 hairpin sequence and SDS-PAGE analysis.

List of Tables

Table 1.1 Marginally hydrophobic TM segments from CFTR, P-gp, AQP1 and KAT1 that do not insert into the membrane independently.
Table 2.1 Sequence of peptide mutant of the GpA transmembrane region.
Table 3.1 Percent occurrence of hydrophobic, polar and charged residues per helix.
Table 3.2 Sequences of synthesized δ-helix peptides.
Table 3.3 Predicted secondary structure of δ-helix peptides.
Table 3.4 Tryptophan emission maxima of δ-helix peptides in various media.
Table 4.1 δ-helices and their predicted Liu-Deber and ∆Gapp hydrophobicities.
Table 5.1 Primers used in the PCR amplification of the human CFTR cDNA.
Table 5.2 Predicted membrane spanning regions of CFTR TM2/3/4.
Table 6.1 Liu-Deber and ∆Gapp hydrophobicity predictions for δ-helices and mutants.
Table 6.2 Comparison of segmental apolar α-helicity for the 1ECA and 2AAI δ-helices.

List of Appendices

Appendix 1.
Table A1.1 Database of globular helix sequences (n = 122).
Table A1.2 Database of δ-helix sequences (n = 51).
Table A1.3 Database of TM helix sequences (n = 212).
Table A1.4 Oligonucleotides used in this work.
Appendix 2.

List of Abbreviations

GPCRs - G-protein-coupled receptors

PDB - Protein Data Bank

TM - Transmembrane

P-gp - P-glycoprotein

CFTR - cystic fibrosis transmembrane conductance regulator

ER - endoplasmic reticulum

WT - wild type

NMR - Nuclear Magnetic Resonance

GpA - glycophorin A

MscL - mechanosensitive channel of large conductance

PE - phosphatidylethanolamine

SDS - sodium dodecyl sulfate

MCP - bacteriophage M13 coat protein

DPC - dodecylphosphocholine

PC - phosphatidylcholine

MBP - maltose binding protein

ELISA - enzyme-linked immunosorbent assay

CAT - chloramphenicol acetyltransferase

SDS-PAGE - Sodium dodecyl sulfate - polyacrylamide gel electrophoresis

FRET - Förster Resonance Energy Transfer

CHI - CNS searching of helix interactions

CNS - crystallography and NMR system software suite

CD - circular dichroism

SPFO - sodium perfluorooctanoate

Pα - Chou-Fasman secondary structure propensity

Pβ - Chou-Fasman β-strand structural propensity

∆Gapp - calculated apparent free energy membrane insertion

RSA - relative solvent accessibility

endoH - endoglycosidase H

RMs - rough microsomes

PCR - polymerase chain reaction

NBD - nucleotide binding domains

R domain - regulatory domain

MSD - membrane spanning domain

Trx - Thioredoxin

IPTG - isopropyl β-D-1-thiogalactopyranoside

CKB - chicken liver 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase

Chapter 1. Introduction

1.1 Introduction to membrane proteins.

The boundaries of a cell, as well as those of its intracellular organelles, are defined by

biological membranes consisting of lipids and proteins. These lipid bilayers represent a barrier

to the passage of polar molecules into such organelles, and organize various biological processes

through compartmentalization. The proteins associated with the membrane bilayer catalyze

numerous chemical reactions and, much like other proteins, come in a tremendous variety of sizes

and shapes. Membrane proteins can be classified roughly by their mode of interaction with the

membrane as integral, peripheral, or lipid-linked proteins. These different membrane

protein categories are responsible for a variety of key cellular functions including transport of

essential substrates across the membrane, signal transduction, recognition elements linking the

cell to its surroundings, and maintenance of cellular morphology. To compound their

importance, it has been estimated that α-helical membrane proteins constitute approximately

27% of the total human proteome (Almen et al. 2009), and they attract considerable interest for

therapeutic intervention, as the majority of current drug targets are associated

with the cell membrane. For example, G-protein-coupled receptors (GPCRs) play multiple roles

in clinical medicine. This group of proteins has various functions, including mediating the actions

of hormones and neurotransmitters. As a group, GPCRs have been the most

successful drug targets, and GPCR agonists and antagonists are used in the treatment of

diseases of every major organ system, including the central nervous, cardiovascular, respiratory, metabolic

and urogenital systems (Insel et al. 2007).

A requirement for residing in the membrane bilayer, and a differentiating factor that

separates membrane proteins from their soluble counterparts, is that membrane proteins contain

inherent stretches of highly hydrophobic sequences required for their membrane insertion.

However, it is this intrinsic hydrophobicity that provides a challenge when studying membrane

proteins and has impeded the collection of structural data. Relatively few high-resolution

structures exist for membrane proteins compared with soluble proteins, whose structures currently

number in the thousands (Lundstrom 2004). The importance of gathering structural information

on membrane proteins is highlighted by the fact that they have been implicated in many diseases

such as cystic fibrosis, Alzheimer's disease, retinitis pigmentosa and hereditary hearing loss

(Partridge et al. 2002b). Despite this intense interest, structural information on membrane

proteins has generally remained elusive because their intrinsic hydrophobicity complicates

routine analysis.

Due to the issue of high hydrophobicity, it remains important to investigate membrane

protein structure via understanding the first principles of membrane protein folding and assembly

within the unique, low dielectric environment of the membrane bilayer (Fig. 1.1). Fortunately,

the protein folding problem for membrane-embedded species is simplified by the fact that

membrane proteins are limited in their tertiary and quaternary folding patterns due to constraints

presented by the hydrophobic core of the membrane bilayer (Popot and Engelman 1990; 2000).

Figure 1.1. The hydrophobic interior of the membrane bilayer compels membrane proteins to

adopt specific tertiary and quaternary structures. The manner in which membrane proteins fold

in this unique environment is limited by the lipid bilayer. Figure adapted from (Engelman 2005).

1.2 Properties of transmembrane segments.

1.2.1 Secondary structure of transmembrane segments.

At this time, only ~1-2% of the > 65,260 structures deposited in the Protein Data Bank

(PDB) are of membrane proteins (White 2009). This translates to just over 180 unique

structures, and this is inclusive of homologous protein structures from different species

(http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html). Significant progress in the

determination of high resolution membrane protein structures is being made, however, with the

number of solved structures increasing exponentially (White 2009).

Even though the high resolution structural elucidation of membrane proteins lags behind

that of soluble proteins, two major structural classes of integral membrane proteins have emerged

(Fig. 1.2). In one category, proteins embedded in eukaryotic cell membranes and the inner

membranes of bacteria and mitochondria generally adopt structures characterized by bundles of

α-helical TM segments (Fig. 1.2A). An alternative arrangement where β-strands span the

membrane and assemble into barrel-like structures is common to proteins embedded in the

bacterial and mitochondrial outer membranes (Fig. 1.2B) (Galdiero et al. 2007). The formation

of these two types of structures is favorable in the low dielectric environment of the membrane

bilayer, as both structures satisfy the hydrogen bonding requirements of the peptide backbone

and prevent the exposure of polar backbone groups to the lipid environment (White and Wimley

1999).

Figure 1.2. Example structures of α-helical bundle and β-barrel membrane proteins. A)

Bacteriorhodopsin is an example of an α-helical membrane protein (PDB ID 2NTU). The α-helix

structure satisfies the hydrogen-bonding requirements of the helix backbone: each

bond forms between a backbone N-H group and the backbone C=O group of the amino

acid four residues earlier (i, i + 4). B) Porin is an example of a β-barrel membrane protein

(PDB ID 2POR). The β-barrel structure is formed from a large β-sheet that coils to form a

closed structure where the first strand is hydrogen bonded to the last. The individual strands are

typically arranged in an antiparallel fashion that also satisfies the hydrogen bonding requirements

of the backbone in a membrane environment. Figure generated with PyMol.

1.2.2 Structural features of α-helical membrane proteins.

The predominant class of membrane spanning proteins consists of α-helical structures,

which are found in the inner membrane of bacteria, along with eukaryotic membranes.

This class of membrane proteins is composed of sequences which traverse the membrane via

α-helical structures, and can consist of simple single pass, or multi-spanning transmembrane (TM)

α-helices that are connected by extramembranous loops of varying length. These α-helical

structures are ideal for spanning membrane bilayers, as the hydrogen bonding potential of the

backbone atoms is completely satisfied. The backbone hydrogen bonds are arranged such that

the peptide C = O bond at the i position points along the helix axis towards the N – H group at

the i + 4 position. This α-helical conformation additionally projects the side chains of non-polar

amino acids into the lipid bilayer, where they can favorably interact with the lipid acyl chains.

The complexity of this group of proteins is vast, ranging from single α-helix TM

proteins to multi-spanning α-helix structures containing complex intra- and extracellular

domains (Fig. 1.3).

Helical membrane proteins are also extremely diverse in their function. α-Helical

membrane proteins can include receptors (e.g. rhodopsin and the β2 adrenergic G-protein-coupled

receptor (Okada et al. 2004; Rasmussen et al. 2007)), membrane pores (e.g. aquaporin

(Murata et al. 2000)), ion channels (e.g. CorA, which is a divalent metal ion transporter, and

voltage dependent K+ channels (Jiang et al. 2003; Eshaghi et al. 2006)) and metabolite

transporters (e.g. AmtB and Rh50 (Khademi et al. 2004; Lupo et al. 2007)), as well as proteins

involved in accumulation and transduction of energy, and proteins responsible for cell adhesion.

Figure 1.3. The structural diversity of α-helical membrane proteins. α-Helical membrane

proteins vary in the number of membrane spanning segments, oligomeric state and the presence

of extramembranous regions. The individual chains of each structure are coloured differently.

A) GpA (PDB ID 1AFO). B) Phospholamban (PDB ID 2HYN). C) Sav1866 (PDB ID

2HYD). D) Aquaporin 5 (PDB ID 3D9S). Figures were generated using PyMol.

1.2.3 Structural features of β-barrel membrane proteins.

The second class of membrane proteins consists of β-barrel structures that are found in

the outer membranes of bacteria, mitochondria and chloroplasts (Wimley 2003). The unique

structure of a β-barrel protein is formed from a large β-sheet that closes into a structure in

which the first strand is hydrogen bonded to the last. In the β-sheet conformation, formation of a

cylindrical β-barrel satisfies the folding constraints imposed by the membrane bilayer in which

the β-strands are laterally hydrogen bonded in a circular pattern (Wimley 2003). In this

arrangement the inter-strand hydrogen bonds stabilize the core of the barrel, producing a

structure which is unlikely to unfold in membranes. Typically, β-barrels are composed of an

even number of β-strands where these individual strands are connected by alternating tight turns

and longer loops producing an asymmetric structure (Wimley 2003). Generally, β-barrels

feature these tight turns on the periplasmic side of the outer membrane with the flexible, longer

loops on the extracellular side of the membrane (Tamm et al. 2004). An additional feature of

β-barrel proteins is the alternating arrangement of hydrophobic and polar residues, with the

amino acid residues facing the interior of the barrel being mostly polar. The individual

β-strands are rich in Gly and aromatic Trp and Tyr residues which are frequently found in two

rings that contact the lipid bilayer interfaces at both ends of the barrels (Tamm et al. 2004).

β-Strands in β-barrels are typically arranged in an antiparallel fashion, and the currently

available high resolution structures show that the number of strands included in

the structure can vary from 8 – 22 (Tamm et al. 2004). The average length of these individual

strands varies from 9 – 11 residues, with such strands tilting 20 – 40° with respect to the

membrane (Wimley 2003; Tamm et al. 2004). The oligomeric state of β-barrel proteins can also

vary, with high resolution examples of monomeric, dimeric and trimeric species deposited in the

PDB (Fig. 1.4).

Along with structural diversity, the functional range of β-barrels is also vast. In bacteria, β-barrel

proteins can be classified into six families according to their function: (1) general porins such as

OmpC, OmpF, and PhoE; (2) passive transporters such as LamB, ScrY, and FadL; (3) active

transporters of siderophores and vitamin B12 such as FepA and BtuB, respectively; (4) enzymes

such as the phospholipase OmpLA or the protease OmpT; and (5) structural proteins such as

OmpA (Koebnik et al. 2000). This functional diversity is largely dictated by variability in the loop

sequences, which carry most of the functional characteristics of the β-barrel protein (Koebnik et

al. 2000).

Figure 1.4. The structures and functions of β-barrel membrane proteins are diverse. Small, tight

loop structures are observed for β-barrel membrane proteins on the periplasmic side of the

membrane (bottom of diagram), while larger, more elaborate loops appear on the extracellular

side of the membrane (top of diagram). The antiparallel arrangement of the individual strands

can be seen, and the number of strands per structure varies between proteins. A) Monomeric

FepA (PDB ID 1FEP). B) Dimeric OMPLA (PDB ID 1QD6). C) Trimeric LamB (PDB ID

1AF6). The β-strands and loop regions are colored purple. α-Helical secondary structure

segments are colored in cyan. Figure generated using PyMol.

1.2.4 Amino acid composition of α-helical transmembrane segments.

In order to span the 30 Å thickness of the membrane bilayer, α-helical TM segments

must be 20-30 residues in total length. From a database of 160 TM α-helices,

the average length of the hydrophobic stretch which physically traverses the membrane was

shown to be 17.3 ± 3.1 (ranging from 6 to 25) residues or 26 Å in length. The average rise per

residue in an α-helical structure is 1.50 Å, with 3.7 residues per turn (Hildebrand et al.

2004). Transmembrane α-helices additionally protrude beyond the membrane bilayer, adding

residues to their total length. An average of 4.7 ± 3.5 residues, or 1.3 turns of the α-helix,

extends the helix into the polar exterior of the membrane. These cap sections of TM α-helices

consist of only 60% hydrophobic residues (Hildebrand et al. 2004).
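
The arithmetic linking residue counts to helix lengths can be checked with a short sketch. It assumes an ideal, untilted α-helix and uses only the rise per residue and residues per turn quoted above; the function names are illustrative and do not come from the cited study.

```python
import math

RISE_PER_RESIDUE = 1.50   # Å per residue, value quoted in the text
RESIDUES_PER_TURN = 3.7   # residues per turn, value quoted in the text


def helix_length(n_residues):
    """Length in Å of an ideal alpha-helix containing n_residues residues."""
    return n_residues * RISE_PER_RESIDUE


def residues_to_span(thickness):
    """Smallest whole number of residues whose helix length reaches thickness (Å)."""
    return math.ceil(thickness / RISE_PER_RESIDUE)


def helix_turns(n_residues):
    """Number of helical turns made by n_residues residues."""
    return n_residues / RESIDUES_PER_TURN


# Values discussed in the text: a 30 Å bilayer, a 17.3-residue hydrophobic
# stretch, and a 4.7-residue helix cap.
print(residues_to_span(30.0))
print(f"{helix_length(17.3):.2f} Å")
print(f"{helix_turns(4.7):.2f} turns")
```

Under these assumptions, a perpendicular helix needs about 20 residues to cross a 30 Å bilayer, which is the lower bound of the 20-30 residue range given above.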

Statistical analysis of natural TM segments indicates that the mean number of residues

per TM helix is 26.3 (± 5.6), and that TM α-helices are often tilted with respect to the membrane

(Ulmschneider et al. 2005). Several groups have examined the distributions of amino acids

across TM segments in order to determine the composition of these unique membrane spanning

segments. In one such study, hydrophobic residues such as Leu, Ile, Val, Ala, Gly, and Phe were

found to comprise the majority of residues in TM α-helices, with Leu being calculated as the most

frequently occurring residue in TM segments (Ulmschneider et al. 2005). Taken together, these

six hydrophobic residues account for approximately 63% of the amino acids in the segments that

span the membrane bilayer, and also constitute half of all residues in total membrane protein

sequences. These compositional values were determined using 46 α-helical membrane proteins

containing 440 non-redundant TM helices. This study also identified that the distribution of

certain amino acids within TM segments follows a saddle-like distribution, i.e., TM segments

have a roughly hydrophobic core, and two peaks at the interfacial regions consisting of aromatic,

charged, and polar groups (Fig. 1.5). Such positional dependence indicates an importance for

these residues at specific locations within the TM α-helix. This study also identified that apart

from the charged residues in TM segments, the distributions of all other residues are symmetric

(Ulmschneider et al. 2005).
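
A compositional measure of this kind is simple to compute for any single segment. The sketch below is illustrative only: it counts the six hydrophobic residues named above (Leu, Ile, Val, Ala, Gly, Phe) in one sequence and makes no attempt to reproduce the database analysis of the cited study. The function name is hypothetical, and the example sequence is P-gp TM4 as listed in Table 1.1.

```python
HYDROPHOBIC_SIX = set("LIVAGF")  # Leu, Ile, Val, Ala, Gly, Phe


def hydrophobic_fraction(sequence):
    """Fraction of residues in sequence that belong to the six-residue set."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    hits = sum(1 for residue in seq if residue in HYDROPHOBIC_SIX)
    return hits / len(seq)


# P-gp TM4 (Table 1.1): 15 of its 20 residues fall in the set, giving 0.75.
pgp_tm4 = "GLTLVILAISPVLGLSAAVW"
print(hydrophobic_fraction(pgp_tm4))
```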

Figure 1.5. The distribution of certain amino acids that comprise TM segments is

skewed. TM segments have a concentration of aromatic, charged, and polar residues at the

membrane interface region, or the helix termini, while the core or central portion is roughly

hydrophobic.

As mentioned, the distribution of aromatic and charged residues is generally not uniform

across membrane spanning segments (Senes et al. 2000; Ulmschneider et al. 2005). The N- and

C-termini of TM segments are located in chemically heterogeneous environments compared to

the core of the membrane bilayer (Fig. 1.6) and are often enriched in Trp and Tyr residues.

These residues are ideal for this boundary position as they can accommodate the membrane-

water interfacial region of the lipid bilayer by favourably interacting with both the phospholipid

head groups and the aqueous environment (Killian and von Heijne 2000). In contrast, Phe

residues have no preferred positions in TM α-helices and can occur at both the core and

interfacial regions (Landolt-Marticorena et al. 1993; Killian and von Heijne 2000; Ulmschneider

and Sansom 2001). However, when Phe is found at the helix periphery, it can have strong

bilayer anchoring characteristics due to its aromaticity (Yuen et al. 2000). For example,

removing the Phe residues from the C-terminus of the bacteriophage M13 coat protein (MCP)

TM segment can shift the entire helix out of the membrane by several angstroms

(Meijer et al. 2001).

Figure 1.6. The lipid bilayer is highly heterogeneous. The tails of the lipids comprising the

bilayer form the hydrophobic core of the membrane and create an environment unsuitable for

charged, polar and aromatic residues (purple). The polar head groups of the lipids (red) reside at

the lipid-water interface, an environment very different from the core and much more

hydrophilic. Charged, polar and aromatic residues preferentially reside in this region.

Positively charged residues were also shown to have a positional preference with

regard to the membrane bilayer. The prevalence of residues such as Arg and Lys was found to

be higher on the “inside”, or cytosolic side, of the membrane compared to the “outside”, or

periplasmic side of the membrane (Ulmschneider et al. 2005). This “positive inside rule” is

thought to affect the directional insertion of membrane proteins into bilayers in vivo (von Heijne

1992). This occurrence of charged residues at the helix termini suggests stabilizing electrostatic

interactions with the polar head groups of the lipids and the aqueous external

environment.

Energetic considerations of the membrane bilayer suggest that charged and polar amino

acids should generally be excluded from TM segments; however, approximately 25% of all

residues found in TM α-helices are polar (Ulmschneider et al. 2005). As noted above, the

existence of charged residues within the membrane bilayer may help to direct the orientation of

membrane insertion, but their placement within the membrane spanning sequence may also play a

key role in either the structure or the function of membrane proteins. For example, side chains of

polar residues such as Ser and Thr have been noted to form hydrogen bonds to the carbonyl

oxygen of the preceding turn of the helix which would energetically enable such side chains to

occur within the TM region (Ulmschneider et al. 2005). Side chain hydrogen bonds between

neighbouring TM α-helices have also been postulated as playing essential roles in the formation

of the structures of aquaporin-1 and human nucleoside triphosphate diphosphohydrolase 3 (Buck

et al. 2007; Gaddie and Kirley 2009), and have been observed in such high resolution structures

as the large mechanosensitive channel and the glycerol facilitator (Chang et al. 1998; Fu et al.

2000). Additionally, polar residues in TM segments line the pores of channel membrane proteins

to facilitate function. Substrates such as ions pass through the membrane bilayer via these polar

residue lined pores, with examples including the voltage-gated potassium channel (Morais-

Cabral et al. 2001), and the rotor ring of the F1Fo ATP synthase (Meier et al. 2005). Other roles

for polar residues in TM segments include the binding of prosthetic groups, as in the

photosynthetic reaction centre (Nogi et al. 2000) and bacteriorhodopsin (Luecke et al. 1999).

Transmembrane segments are also enriched in specific residues at helix-helix contact

points. For example, Gly has a high frequency of occurrence in TM segments (9%), and it has

been reported that Gly residues occur frequently at helix-helix interfaces and crossing points

(Rath and Deber 2008). It has been suggested that these small residues may facilitate close

packing of TM helices through van der Waals interactions, particularly in motifs combining

Gly and β-branched side chains (MacKenzie et al. 1997).

1.2.5 Insertion of membrane proteins into the membrane bilayer is mediated by the

translocon.

α-Helical membrane proteins are inserted into the membrane bilayer in vivo through the

translocon. This channel has the unique property of being able to open in two directions:

perpendicular to the plane of the membrane to allow a polypeptide segment to pass through the

channel, and within the membrane to allow a hydrophobic TM segment of a membrane protein to

exit laterally into the lipid phase (Osborne et al. 2005). The translocon, termed the SecYEG or

Sec61 complex in bacteria and eukaryotes, respectively, acts as a switching station for discerning

between membrane-embedded and secreted proteins (Osborne et al. 2005) (Fig. 1.7).

Deciphering the code that the translocon uses for selecting TM segments for insertion is of

fundamental importance for understanding the folding of membrane proteins. Selection is thought to

follow an equilibrium process in which there is a direct interaction between the TM segment and

the surrounding lipid. Results of studies by Hessa et al. have suggested that direct protein-lipid

interactions are essential for the recognition of TM helices by the translocon (Hessa et al. 2005a;

Hessa et al. 2005b; Hessa et al. 2007), and that α-helical TM segments partition into the

surrounding lipid bilayer based on the free energy of interaction between the TM segment and

the lipid. The specific details of this partitioning process have yet to be determined, but it is

thought that an “open” state of the translocon allows for this sampling process to occur.

Segmental hydrophobicity, as well as the positioning of charged residues along the TM helix has

also been implicated in the identification of TM segments by the translocon, but interestingly,

marginally hydrophobic TM segments and α-helical segments from globular proteins with high

hydrophobicity have also been shown to insert into the membrane (Hessa et al. 2007;

Cunningham et al. 2009).

Figure 1.7. The X-ray crystal structure of the protein conducting channel SecY from

Methanocaldococcus jannaschii (PDB ID 1RH5). Helices 2B (red) and 7 (blue) form the lateral

gate through which TM segments pass from the pore into the lipid bilayer. All other TM helices

and connecting loop structures are shown in green. A) Top view of SecY. B) Side view.

Figure generated via PyMol.

1.3 Prediction of transmembrane segments from the primary amino acid sequence.

Prediction of the segments that span the membrane bilayer is an important first step

in the prediction of membrane protein structure. Identification of membrane spanning

segments is made easier because TM segments can span the bilayer in only a limited

number of ways: as either β-sheets or α-helices. In the case of α-helical

membrane proteins, given accurate prediction of the TM and loop segment locations, the rough

topology of membrane proteins can be predicted from the amino acid sequence alone.

Prediction of TM segments from the primary amino acid sequence is based on two basic

observations of membrane spanning segments: the regions that traverse the membrane bilayer

are composed primarily of hydrophobic residues, and the regions of multi-spanning membrane

proteins facing the cytoplasm are generally enriched in positively-charged residues (von Heijne

1989). From these observations, one can predict the likely regions of the membrane spanning segments, as

well as the orientation of the protein within the membrane bilayer.
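
The second observation can be illustrated with a toy orientation call for a single TM segment. This is a minimal sketch of the idea only, not the algorithm of any published topology predictor: it counts Lys and Arg in the two flanking regions and assigns the more positively charged flank to the cytoplasmic side. The function names and the example flanking sequences are hypothetical.

```python
def positive_charge(segment):
    """Number of Lys (K) and Arg (R) residues in a sequence segment."""
    seq = segment.upper()
    return seq.count("K") + seq.count("R")


def cytoplasmic_flank(n_flank, c_flank):
    """Guess which flank of a TM segment faces the cytoplasm.

    The flank with more Lys/Arg residues is assigned to the cytoplasmic
    side; a tie is reported as undetermined.
    """
    n_pos = positive_charge(n_flank)
    c_pos = positive_charge(c_flank)
    if n_pos > c_pos:
        return "N-terminal flank"
    if c_pos > n_pos:
        return "C-terminal flank"
    return "undetermined"


# Hypothetical flanking sequences chosen only to exercise the function.
print(cytoplasmic_flank("MKRLSK", "DGSEQ"))
```

A real predictor would weigh this charge bias across all loops of a multi-spanning protein together with the hydrophobicity signal, rather than treating one segment in isolation.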

1.3.1 Tools for the prediction of transmembrane segments.

A straightforward method for identifying the hydrophobic character of a protein and the

prediction of membrane spanning segments was devised by Kyte and Doolittle (Kyte and

Doolittle 1982). In this method, a computer program progressively evaluates a protein sequence

along its length, and identifies potential TM segments based on a hydropathy scale. This scale

takes into consideration both the hydrophobicity and hydrophilicity of each of the twenty amino

acid side chains and identifies large uninterrupted areas on the hydrophobic side of the scale.

The primary sequences of membrane proteins are scanned with a sliding window 19 amino acids

long, and potential TM segments of 20-30 amino acids are identified based on their segmental

hydrophobicity (Kyte and Doolittle 1982). While this basic process is useful in its simplicity,

improvements and additions to it have been made that have increased the accuracy of TM

prediction programs. Including observations beyond a segmental hydrophobicity

requirement, such as the residence of aromatic residues at helix termini (Ulmschneider et al.

2005) and a limit to the number of charged residues in a membrane spanning segment (Deber et

al. 2001), has improved the prediction process. Additionally, based on this simple principle

more advanced prediction methods have been developed such as PHDhtm that uses probability

to define TM segments based on local sequence patterns (Rost et al. 1996). Additional

prediction methods have also been developed which consider global sequence patterns:

identifying repeats of TM segments and loop-TM segment patterns was implemented into both

the TMHMM (Sonnhammer et al. 1998) and HMMTOP (Tusnady and Simon 1998) TM

prediction programs.
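
The sliding-window scan described above can be sketched in a few lines. The hydropathy values are those of the Kyte-Doolittle scale and the 19-residue window is the one mentioned in the text. The cutoff of 1.6 is the value commonly cited for that window size and is treated here as an adjustable assumption, and the function names and test sequence are illustrative only.

```python
# Kyte-Doolittle hydropathy values for the twenty amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full window along the sequence.

    Element i of the result is the average over residues i .. i+window-1.
    """
    scores = [KD_SCALE[residue] for residue in sequence.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]  # slide the window by one
        profile.append(total / window)
    return profile


def candidate_windows(sequence, window=19, threshold=1.6):
    """Start indices (0-based) of windows whose mean hydropathy meets the cutoff."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score >= threshold]


# Hypothetical test sequence: a poly-Leu stretch flanked by charged residues.
toy = "D" * 10 + "L" * 19 + "K" * 10
print(candidate_windows(toy))
```

Runs of consecutive start indices returned by `candidate_windows` mark one candidate TM region; merging them into a single segment with explicit boundaries is the step at which the more elaborate programs discussed above differ.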

While the tools available to predict TM segments all vary slightly within a common

theme based on hydrophobicity, comparisons of TM prediction programs have suggested that no

particular program has an advantage over another (Cuthbertson et al. 2005). Some TM

prediction programs are suited for prediction of the number and position of TM segments

(TMHMM2, TMAP, HMMTOP2), while others are more suited for predicting helix start and

end points (TMHMM2 or SPLIT4) or weeding false positives from prediction outputs (TMHMM

or SOSUI) (Cuthbertson et al. 2005). To circumvent the uncertainty of which prediction

program to use when attempting to identify TM segments from the primary amino acid sequence,

a consensus of programs can determine the most likely TM α-helix boundary points. The

accurate determination of TM α-helix boundaries becomes important, as differential

determination of α-helix end points can ultimately affect experimental outcomes. For example,

comparisons of peptides corresponding to TM2 from the myelin proteolipid protein with α-helix

boundaries determined experimentally, versus an agreement of 13 TM prediction programs, led

to a peptide with two additional C-terminal residues for the latter, which affected the

oligomerization and helicity of the construct in apolar environments (Ng and Deber 2010).
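
One simple way to form such a consensus is a per-residue vote across programs. The sketch below assumes that each program's output has already been reduced to a list of (start, end) residue ranges, 1-based and inclusive, with no overlapping ranges from a single program. The program labels and boundary values are hypothetical, and this is not the procedure used in the study cited above.

```python
def consensus_segments(predictions, length, min_votes):
    """Residue ranges predicted as TM by at least min_votes programs.

    predictions maps a program label to a list of (start, end) ranges,
    1-based and inclusive; length is the length of the protein sequence.
    """
    votes = [0] * (length + 1)  # index 0 is unused
    for segments in predictions.values():
        for start, end in segments:
            for position in range(start, end + 1):
                votes[position] += 1

    consensus, open_start = [], None
    for position in range(1, length + 1):
        supported = votes[position] >= min_votes
        if supported and open_start is None:
            open_start = position                       # a segment begins
        elif not supported and open_start is not None:
            consensus.append((open_start, position - 1))  # a segment ends
            open_start = None
    if open_start is not None:
        consensus.append((open_start, length))
    return consensus


# Hypothetical outputs from three programs for a 50-residue sequence.
example = {
    "program_1": [(10, 30)],
    "program_2": [(12, 31)],
    "program_3": [(11, 29)],
}
print(consensus_segments(example, length=50, min_votes=2))
```

Raising `min_votes` trims the consensus helix towards the residues on which all programs agree, which makes explicit how sensitive the predicted end points are to the choice of programs.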

1.3.2 Unique marginally hydrophobic transmembrane segments: breaking the rules.

While the main identifier for TM segments from the primary amino acid sequences is

hydrophobicity (Hedin et al. 2010), unique examples exist in membrane proteins that are not

easily identifiable as TM segments via prediction methods. These “marginally hydrophobic”

TM segments, or false negatives, raise questions regarding the function and insertion pathways

for the proteins containing these segments. A small number of such marginally hydrophobic TM

segments have been identified in P-glycoprotein (P-gp), the cystic fibrosis transmembrane

conductance regulator (CFTR) (Sadlish and Skach 2004), aquaporin-1 (AQP1) (Pitonzo and

Skach 2006), and the plant Kv channel KAT1 that do not insert into the membrane by themselves

(Sato et al. 2003) (Table 1.1). This interesting feature suggests that some TM helices in

multispanning proteins may depend upon other parts of the same protein for efficient insertion

and folding (Hedin et al. 2010). In order to investigate the dependence of insertion of marginally

hydrophobic segments on the remainder of the protein - and to determine in fact if these

segments are capable of isolated membrane insertion - individual TM segments from CFTR and

P-gp were tested for their membrane insertion capabilities into ER membrane (Enquist et al.

2009). When the individual TM segments from CFTR and P-gp were cloned into a construct

designed to measure membrane insertion via glycosylation of the membrane embedded

construct, it was shown that 10/12 TM segments from CFTR, and only 3/12 segments from P-gp

were capable of unaided membrane insertion (Enquist et al. 2009). The insertion propensities of

each TM segment followed predictions of their insertion based on the ΔGapp scale. These results

highlight that membrane insertion can vary widely even for related proteins such as CFTR and

P-gp, and that, although membrane embedded in the native protein, not all TM segments are

individually capable of directing insertion.

Table 1.1. Marginally hydrophobic TM segments from CFTR, P-gp, AQP1 and KAT1 that

do not insert into the membrane independently.

Name         Position(a)  Sequence              ΔGapp(b)  Liu-Deber(c)
CFTR - TM6   331-349      IILRKIFTTISFCIVLRMAG  0.95      1.58
CFTR - TM8   902-920      SYAVIITSTSSYYVFYIYV   2.99      1.26
CFTR - TM12  1129-1147    VGIILTLAMNIMSTLQWAV   2.10      1.60
P-gp - TM1   55-73        TLAAIIHGAGLPLMMLVFGG  1.58      0.95
P-gp - TM2   115-133      AYYYSGIGAGVLVAAYIQV   1.05      0.83
P-gp - TM3   192-210      MFFQSMATFFTGFIVGFTRG  2.36      1.13
P-gp - TM4   214-232      GLTLVILAISPVLGLSAAVW  0.36      1.44
P-gp - TM6   328-346      IGQVLTVFFSVLIGAFSVG   1.62      1.38
P-gp - TM7   706-724      TEWPYFVVGVFCAIINGGL   1.55      1.10
P-gp - TM9   840-856      IANLGTGIIISFIYGWQ     2.01      1.09
P-gp - TM11  947-965      AMMYFSYAGCFRFGAYLVA   2.31      1.38
P-gp - TM12  973-991      DVLLVFSAVVFGAMAVGQV   1.09      1.40
AQP1 - TM2   52-68        SLAFGLSIATLAQSVG      3.51      0.52
KAT1 - S3    136-154      TWFAFDVCSTAPFQPLSLL   4.21      0.90
KAT1 - S4    162-180      LGFRILSMLRLWRLRRVSS   3.39      0.98

(a) Residues are numbered according to their position in the full length protein sequence.

(b) ΔGapp (kcal/mol) was calculated with the online program: http://dgpred.cbr.su.se/

(c) Mean Liu-Deber segmental hydrophobicity of each segment.

Marginally hydrophobic TM segments were first identified by the “ΔG predictor”

developed in the von Heijne laboratory, which predicts membrane spanning segments based on

segmental hydrophobicity and the position of charged residues along the α-helix (Hessa et al.

2007). The apparent free energy of insertion (ΔGapp) of these segments into the endoplasmic reticulum (ER)

membrane was then experimentally measured, both with and without inclusion of their

immediate flanking loop segments. In agreement with computational predictions, the marginally

hydrophobic TM segments do not insert into the ER membrane by themselves, but the inclusion

of the flanking loops, both upstream and downstream of the TM segment improves membrane

insertion for some of the unique segments. For those marginally hydrophobic TM segments whose

insertion did not improve with addition of flanking amino acids, insertion in the presence of

neighboring TM segments improves significantly. This result suggests that the insertion of TM

segments can depend on the local environment of the protein, and that TM segments in

multispanning proteins may depend on other parts of the same protein for efficient insertion and

folding (Hedin et al. 2010).
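
To give a feel for what a given ΔGapp value implies, an apparent free energy can be converted into an expected inserted fraction. The sketch below assumes a simple two-state equilibrium between inserted and non-inserted forms, ΔGapp expressed in kcal/mol with negative values favouring insertion, and a temperature of 298.15 K. These assumptions, and the function names, are ours rather than details taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def insertion_fraction(dg_app, temperature=298.15):
    """Fraction inserted at two-state equilibrium for a given dG_app (kcal/mol)."""
    return 1.0 / (1.0 + math.exp(dg_app / (R_KCAL * temperature)))


def dg_from_fraction(fraction, temperature=298.15):
    """Apparent free energy (kcal/mol) corresponding to an inserted fraction."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return -R_KCAL * temperature * math.log(fraction / (1.0 - fraction))


# dG_app values listed in Table 1.1 for P-gp TM4 and KAT1 S3.
for label, dg in [("P-gp TM4", 0.36), ("KAT1 S3", 4.21)]:
    print(label, f"{insertion_fraction(dg):.3f}")
```

On this reading, a segment with ΔGapp near zero inserts about half of the time, whereas values of several kcal/mol correspond to almost no unaided insertion.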

The idea that neighboring TM segments can mediate the membrane

insertion of marginally hydrophobic segments raises the question of how the translocon in

the ER membrane can handle multiple TM segments concurrently. The high resolution structure

of the SecA-bound SecY translocon channel from Thermotoga maritima provided a clue as to how this was

possible: in the open conformation, the translocon may be able to fit two or three TM segments

at once (Zimmer et al. 2008). Presumably, if a sufficiently hydrophobic two- or three-helix

bundle can form within the translocon channel or possibly in the gate-lipid bilayer interface, the

whole bundle may be able to partition into the bilayer in tandem despite the presence of

marginally hydrophobic TM segments (Hedin et al. 2010).

1.3.3 Charged residues in transmembrane segments: importance for membrane

integration.

The location and number of charged residues within a membrane spanning sequence are

thought to affect membrane insertion (Hessa et al. 2007), but unique examples exist of TM

segments that contain large numbers of charged residues. Charged residues in TM segments

reduce the segmental hydrophobicity, but can also be critically important for protein function and

for promoting the correct topology of the protein. The voltage-dependent K+ (Kv) channels are

an example of a family of membrane proteins with conserved charged residues central to their

function. Each Kv channel subunit contains six TM segments (S1-S6): a voltage sensor domain with four

membrane spanning segments (S1-S4), and a membrane embedded ion-selective pore

domain formed by S5 and S6 (Sakata et al.). The voltage-sensor domain contains negatively

charged residues in the S2 and S3 TM α-helices, and four or more positively charged residues in

the S4 helix (Fig. 1.8). The role of these charged residues in voltage gating has been extensively

studied [see (MacKinnon 2003) for a review], but of interest to membrane protein folding is the

manner in which these highly charged segments are inserted into the membrane bilayer.
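
The charge content of such segments is easy to tabulate directly from sequence. The sketch below counts Lys/Arg as positive and Asp/Glu as negative (His is ignored, which is a simplifying assumption) and applies the count to the KAT1 S3 and S4 sequences listed in Table 1.1. The function name is illustrative.

```python
def charge_counts(sequence):
    """Return (positive, negative) residue counts for a sequence.

    Lys and Arg are counted as positive, Asp and Glu as negative.
    """
    seq = sequence.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive, negative


kat1_segments = {
    "KAT1 S3": "TWFAFDVCSTAPFQPLSLL",
    "KAT1 S4": "LGFRILSMLRLWRLRRVSS",
}
for name, seq in kat1_segments.items():
    positive, negative = charge_counts(seq)
    print(f"{name}: {positive} positive, {negative} negative")
```

For these two sequences the count gives five Arg residues in S4 and a single Asp in S3, in line with the charge pattern described above for the voltage sensor domain.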

In vitro translation and translocation experiments of the individual helices in the plant

voltage-dependent K+ (Kv) channel KAT1 have identified specific interactions between charged

residues that contribute to KAT1 membrane topology (Sato et al. 2003). Interactions between

positively charged residues on S4 and negatively charged residues on S2 may be formed

transiently during membrane integration, and constitute posttranslational electrostatic

interactions between charged residues that are required to achieve correct topology (Sato et al.

2003). Measurements of the membrane insertion propensity of the α-helices and TM fragments

from the voltage sensor domain suggest that TM segments insert cooperatively, and that the

degree of cooperativity observed depends on the balance between electrostatic and hydrophobic

forces (Zhang et al. 2007). This example highlights the importance of understanding

membrane protein integration and folding into the membrane bilayer in the context of the

remainder of the protein, and shows that the identification of TM segments is a complicated process.

Figure 1.8. X-ray crystal structure of the mammalian Shaker Kv1.2 potassium channel (PDB ID

2A79). A) Tetrameric structure of Kv1.2, side view. B) Kv1.2 monomer with helix S4 colored

red. The S4 helix contains several positively charged residues and forms part of the voltage

sensor. Figure generated using PyMol.

1.4 Folding of α-helical membrane proteins.

Due to the constraints imposed on membrane proteins by the lipid bilayer, the folding of

membrane proteins is considered to be simplified compared to their water-soluble counterparts.

Based on this fact, a two-stage model for membrane protein folding was proposed by Popot and

Engelman (Popot and Engelman 1990; 2000). This model separates the membrane protein

folding process into two distinct steps: the first is the spontaneous insertion of TM segments into

the membrane bilayer, while the second stage is concerned with the lateral associations of TM

segments within the membrane bilayer (Fig. 1.9). More specifically, the first stage of Popot and

Engelman's membrane protein folding model is synonymous with the adoption of α-helical

conformations by TM stretches, an event which is primarily driven by the hydrophobic effect. The

second stage of membrane protein folding is the association of these established TM α-helices,

which is primarily driven by folding motifs defined on the α-helical surfaces. Because

established α-helices are already within an apolar environment, the hydrophobic effect does not

play a specific role in the second stage of membrane protein folding. Instead, van der Waals

packing and interhelical hydrogen bonds are the driving forces behind the stage-two folding

process.

Figure 1.9. The two-stage folding model for membrane proteins. A) The first stage of

membrane protein folding involves the insertion of α-helices into the membrane bilayer, an

event which is driven by hydrophobicity. B) The second stage of membrane protein folding

involves the lateral association of these helices within the bilayer. This association is driven by

motifs on the α-helix surface. Figure adapted from (Popot and Engelman 1990).

1.4.1 Stage 1: Insertion of transmembrane segments into the membrane bilayer.

The first stage of membrane protein folding involves the insertion of hydrophobic

segments into the lipid bilayer, which is primarily a result of the hydrophobic effect. The net

hydrophobicity of a potential TM segment is important for determining the likelihood of

membrane insertion. Ideally, an amino acid stretch will be of sufficient hydrophobicity and

length in order to span the bilayer and take on a TM conformation. If a segment possesses an

overall hydrophobicity above a certain threshold value, it will spontaneously insert into a

membrane environment and fold into an α-helix structure. This threshold value has been

determined both in vitro (Liu et al. 1996) and in vivo (Nilsson et al. 2003), and is roughly

equivalent to a stretch of poly-Ala residues.

Statistical analysis has shown that the average length of a TM segment is approximately

26 residues, but helix length can vary greatly. A survey of high resolution membrane protein

structures shows that TM α-helices can vary in length from 14 – 39 residues, with twenty

residues considered optimal to span the membrane bilayer (Bowie 1997). Helices shorter than

this ideal average can be accommodated by inducing disorder in the packing of the acyl chains in

the lipid bilayer which decreases bilayer thickness (Killian and Nyholm 2006). On the other

hand, α-helices that are longer than the ideal average may be accommodated by acyl chain

ordering, which results in an increase in membrane bilayer thickness, or tilting within the

membrane plane (Killian and Nyholm 2006).
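
The tilt needed to accommodate a long helix can be estimated with elementary geometry. The sketch below treats the helix as a rigid rod with a rise of 1.50 Å per residue, takes the bilayer thickness as 30 Å (both values quoted earlier in this chapter), and ignores any adaptation of the lipids. The function name is illustrative, and the result should be read as an upper-bound estimate rather than a prediction for any particular protein.

```python
import math


def tilt_angle_deg(n_residues, thickness=30.0, rise=1.50):
    """Tilt from the bilayer normal (degrees) at which a rigid helix of
    n_residues residues has a projected length equal to thickness (Å).

    Helices that are no longer than the bilayer is thick need no tilt,
    so 0.0 is returned for them.
    """
    length = n_residues * rise
    if length <= thickness:
        return 0.0
    return math.degrees(math.acos(thickness / length))


# A 20-residue helix fits a 30 Å bilayer upright; longer helices must tilt.
for n in (20, 26, 39):
    print(n, f"{tilt_angle_deg(n):.1f}")
```

For the average TM helix of about 26 residues, this idealized model gives a tilt of roughly 40° if the bilayer does not thicken, which illustrates why tilting and acyl chain ordering are both plausible ways of relieving the mismatch.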

1.4.2 Stage 2: Formation of tertiary contacts between transmembrane α-helices.

Helix-helix associations drive the second stage of membrane protein folding, and involve

the formation of tertiary and/or quaternary structure. This can involve contacts between helices

within multi-spanning membrane proteins, or between helices of separate chains to form

higher-order oligomers. Correct helix contacts are required to form the final protein structure, and

changes to protein folding through mutation have implications in disease.

The packing or folding of TM helices within the membrane bilayer is a highly specific

process, and forces that contribute to correct folding can involve protein-protein, protein-lipid

and lipid-lipid interactions. The protein-protein interactions, or contacts observed between

helices, can be mediated by either van der Waals interactions or electrostatic interactions

between residues on separate helices. Favorable van der Waals interactions have been most

accurately described as “knobs-into-holes” packing where there is a specific fit of the helices

involved, and the surface included in the contact area is maximized. Helix associations arising

from van der Waals contacts are derived from a series of contacting amino acids or an

“interactive face”, rather than just a single residue. Alternatively, helix contacts arising from

electrostatic interactions can arise from a single amino acid. For example, strongly polar

residues such as Asp, Asn, Glu and Gln have been implicated in driving the association of TM

α-helices within the membrane bilayer through the formation of interhelical hydrogen bonds. A

single polar residue in the middle of a TM α-helix is able to drive association in order to satisfy

the hydrogen bonding potential of the residue (Zhou et al. 2001; Johnson et al. 2004; Rath and

Deber 2008). Lipids can also contribute to the folding and association of TM α-helices. The

entropy of lipids interacting with protein structures rather than with other lipids can promote or

discourage helix contacts. These forces and their contributions to the second stage of membrane

protein folding will be described in more detail in the upcoming sections. Specific examples of

packing motifs will also exemplify the contribution of these forces to protein folding within a

membrane environment.

1.4.3 Fragmentation approach to studying membrane proteins.

The two stages of membrane protein folding describe the adoption of α-helical structure

in a membrane environment, and the lateral association of these helices to form higher order

structures. Innate to this model is that the established α-helices possess all structural information

within their amino acid sequence required to form higher-order structures. This fact has been

exemplified by numerous studies showing fragments of membrane proteins combining in

membrane environments to form functional proteins (Ridge et al. 1995; Hankamer et al. 1999;

Martin et al. 1999; Liu et al. 2005). For example, the folding and assembly of rhodopsin was

investigated by co-expression of protein fragments, which were designed to correspond to

proteolytic cleavage sites within the loop regions. Individual expression of the protein fragments

yielded no functional results, while co-expression of the separate fragments covering the full

length protein sequence formed constructs with spectral properties similar to the wild type (WT)

protein (Ridge et al. 1995).

As suggested through the two-stage folding model, small segments can be considered to

define structural characteristics of membrane proteins. More importantly, individual TM

segments can largely be viewed as independent recognition elements of overall folding domains

(Johnson et al. 2004; Rath and Deber 2008). Both heterologous protein expression and the

fragmentation approach have been used successfully to investigate membrane proteins.

1.4.4 Efforts to generate membrane protein high resolution structures.

The first high-resolution structure of a membrane protein was of a bacterial

photosynthetic reaction centre (Deisenhofer et al. 1984). Until this initial structure was solved,

the analysis of membrane proteins by X-ray crystallography was thought to be an unattainable

challenge. Since that time, however, more than 280 unique membrane protein structures have

been deciphered. The importance of gathering high-resolution structures of membrane proteins

is highlighted by the fact that greater than 40% of pharmaceutical drugs are targeted at

membrane proteins, and each high-resolution structure is highly anticipated by both academia

and the pharmaceutical industry.

The elucidation of available high-resolution structures of membrane proteins has revealed

interesting insights into the molecular functions of various TM processes, but to begin these

kinds of structural studies, researchers require a large supply of purified protein, solubilized in a

detergent in which it retains function; this is an objective that is by no means trivial. Primarily,

three different methods have been used successfully for the gathering of high-resolution

structures of membrane proteins: X-ray crystallography, electron crystallography, and nuclear

magnetic resonance (NMR) spectroscopy. However, all three of these methods combined have

yielded a rather modest set of structures that have been solved at the atomic level. For detailed

mechanistic insights into the function of membrane proteins, a resolution of at least 2.0 Å is

required, particularly to observe small conformational changes (Rosenbusch et al. 2001). An

example of a membrane protein solved to high atomic resolution via X-ray crystallography is

bacteriorhodopsin, a membrane protein commonly regarded as the simplest proton pump (Lanyi

and Schobert 2002). The use of X-ray crystallography to solve high-resolution structures of

membrane proteins is facilitated by proteins of structural similarity to bacteriorhodopsin; in fact,

proteins that have small extramembranous regions are ideal for the formation of the crystals

required by this methodology. While reporting at much lower atomic resolutions, electron

microscopy is also useful in the study of membrane protein function – particularly in the

structural determination of large membrane protein complexes. The structure of the yeast ATP

synthase complex has been determined via electron microscopy which has aided in the functional

studies of the synthesis of ATP by this complex. At 24 Å resolution, the 3-D model of the

protein is highly useful as a framework for studying the functional mechanism of the protein

(Lau et al. 2008). Nuclear magnetic resonance can also be used in the structural determination of

membrane proteins. Both solution NMR and solid-state NMR can be used to determine

structure, and this technique is especially useful to determine associations of individual residues

or internuclear distances (Rosenbusch et al. 2001).

1.5 Forces driving membrane protein folding.

As the hydrophobic effect is generally considered the major driving force for generating

compact structures in soluble proteins, the question of how membrane proteins form their folded

compact structure in the apolar environment of the lipid bilayer merits interest (White and

Wimley 1999). The manner in which membrane proteins fold speaks to the stability of these

segments within the membrane, where the tight association of TM helices results from a

favorable energetic environment (Liu et al. 2003). Stably folded membrane proteins reside in a

free energy minimum determined by the net energetics among the peptide chain's interactions

with water, each other, the lipid bilayer, and cofactors (White and Wimley 1999).

In order for two stable α-helices to associate within the membrane, the interactions that

permit the close packing must overcome the energetics that favor helix separation. In the case

of TM segments that are established across the membrane bilayer and associate and form

oligomers, peptide-lipid contacts are lost in favor of peptide-peptide contacts. The balance

between monomer and oligomer is determined by a balance of entropy and enthalpy. In the

following sections, forces directing the packing of TM segments will be discussed in further

detail.
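
The enthalpy-entropy balance described above can be written as ΔG = ΔH − TΔS, and the resulting monomer-dimer distribution follows from the dissociation constant. The sketch below is a simplified illustration that assumes an ideal 2M ⇌ D equilibrium with Kd = [M]²/[D] and a total peptide concentration expressed in the same units as Kd; in membranes or micelles these concentrations are usually expressed relative to the lipid or detergent phase. The function names and numerical inputs are hypothetical.

```python
import math


def delta_g(delta_h, delta_s, temperature=298.15):
    """Free energy change from enthalpy and entropy: dG = dH - T*dS."""
    return delta_h - temperature * delta_s


def fraction_dimer(total_peptide, kd):
    """Fraction of peptide chains found in dimers for 2M <=> D.

    Mass balance: total = [M] + 2[D] with [D] = [M]^2 / Kd, which gives the
    quadratic 2[M]^2/Kd + [M] - total = 0; the positive root is used.
    """
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_peptide / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_peptide


# Hypothetical numbers: when total peptide equals Kd, half the chains are dimeric.
print(fraction_dimer(1e-6, 1e-6))
print(delta_g(delta_h=-10.0, delta_s=-0.02, temperature=300.0))
```

A favorable enthalpy of association can therefore be partly offset by an unfavorable entropy term, and the dimer fraction rises as the total concentration exceeds Kd.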

1.5.1 van der Waals interactions.

One of the main forces involved in TM helix associations within the membrane bilayer is

van der Waals interactions. These noncovalent interactions arise within the membrane bilayer

from permanent or induced dipoles, or a fluctuating electron cloud (Fig. 1.10A). These

complementary dipoles can induce packing of TM segments within a membrane bilayer by

providing a weak electrostatic attraction. In membrane proteins, the TM helices associate such

that there is geometric complementarity between the two α-helices. This type of

“knobs-into-holes” packing provides a good fit and allows the close approach of the helix backbone (Fig.

1.10B). As this close packing can occur over the length of the TM helix, the cumulative van der

Waals forces can greatly contribute to the folding of membrane proteins (Rath et al. 2009b).

Figure 1.10. van der Waals interactions in the close packing of TM α-helices. A) Non-covalent

interactions between neutral molecules form between permanent or induced dipoles. At

any instant, nonpolar molecules have small randomly oriented dipole moments resulting from the

rapid fluctuating motion of their electrons that produces a weak electrostatic interaction when the

dipoles are in close proximity. B) The van der Waals packing interface of GpA (PDB ID

1AFO). “Knobs-into-holes” packing (red line) along the helix length stabilizes the interaction

between the two α-helices.

1.5.2 Electrostatic interactions.

Another force involved in the association of TM α-helices within the membrane bilayer

is electrostatic interactions. Unlike the induced-dipole attractions that underlie van der

Waals forces, an electrostatic interaction drives oligomerization through the

formation of a hydrogen bond between two polar side chains, or between a polar side chain

and the α-helix backbone. A hydrogen bond occurs when two electronegative atoms interact

with the same hydrogen atom, and serves to cancel out the effect of a polar group in the non-

polar environment of the lipid bilayer. If strongly polar residues such as Asp, Asn, Glu, and Gln

occur within TM α-helices, the electrostatic interaction between two such polar side chains can

be sufficient to drive the association of the α-helices (Fig. 1.11). In the low dielectric

environment of the membrane, electrostatic interactions between polar side chains can be quite

strong. Hydrogen bonds can also form between the Cα hydrogen of an amino acid and the

carbonyl oxygen on the α-helix backbone. This has been observed in the glycophorin A (GpA)

homodimer as evidenced through selective labeling of the Gly residues in the oligomerization

interface, which promotes the close approach of the dimeric helices (Arbely and Arkin 2004).
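
The effect of the low dielectric environment can be illustrated with Coulomb's law, E = 332 q1 q2 / (ε r), in kcal/mol with r in Å. This is only a rough sketch: hydrogen bonds are not purely Coulombic, and the dielectric constants used here (about 2 for the bilayer core and about 80 for water) are commonly quoted approximations introduced as assumptions, not values from this chapter.

```python
COULOMB_CONSTANT = 332.06  # kcal * Å / (mol * e^2)


def coulomb_energy(q1, q2, distance, dielectric):
    """Coulomb interaction energy (kcal/mol) between two point charges
    (in units of e) separated by `distance` Å in a uniform dielectric."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)


# Assumed dielectric constants for illustration only.
MEMBRANE_CORE = 2.0
WATER = 80.0

in_membrane = coulomb_energy(+0.4, -0.4, 3.0, MEMBRANE_CORE)
in_water = coulomb_energy(+0.4, -0.4, 3.0, WATER)
print(in_membrane, in_water, in_membrane / in_water)
```

With these assumptions the same pair of partial charges interacts forty times more strongly in the bilayer core than in water, which is the sense in which a single polar residue can dominate helix association.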

Figure 1.11. Hydrogen-bond interactions in the folding of membrane proteins. A) A hydrogen

bond is the attractive interaction of a hydrogen atom with an electronegative atom, such as

nitrogen or oxygen. Two electronegative atoms interact with the same hydrogen to form the

hydrogen-bond. B) A single strongly polar amino acid can drive the association of two

α-helices within the membrane bilayer through formation of an intermolecular hydrogen-bond.

1.5.3 Cation-π interactions.

A third biologically relevant electrostatic force that can drive the oligomerization of

membrane proteins is the non-covalent interaction between an aromatic residue and a positively

charged amino acid on an opposing helix. These cation-π interactions can occur between such

aromatic residues as Trp, Phe and Tyr and positively charged residues such as Arg, Lys and His

(Dougherty 2007). The negative π-electron density in the aromatic ring provides a surface of

negative electrostatic potential that can bind to a wide range of cations through a predominantly

electrostatic interaction (Fig. 1.12). Evaluating the frequency of cation-π interactions in protein

structures can be challenging, but estimates indicate that all proteins of significant size have at

least one cation-π interaction, and Arg is more frequently found than Lys as the positively

charged partner (Dougherty 2007). As an example, the open conformation of the M2 coat

protein from the influenza A virus is structurally stabilized by a cation-π interaction (Haupt et al.

2005). Cation-π interactions have also been implicated in functional roles within protein

structures. For example, the Escherichia coli Kdp-ATPase nucleotide binding activity is

mediated via a cation-π interaction.

Figure 1.12. Cation-π interactions. A) The electrostatic basis for the cation-π interaction. The

6 bond dipoles that create the overall electrostatic potential for the bond are shown. Figure

adapted from (Dougherty 2007). B) The cation-π interaction between the face of a benzene ring

and a sodium cation. The overall negative charge (blue) on the face of the benzene ring interacts

electrostatically with the positive charge of the sodium ion (red).

1.5.4 Helix-lipid interactions.

The lipid bilayer itself can have dramatic effects on how TM helices interact, as the lipid

chains can contribute to helix-helix associations, either through TM-lipid or lipid-lipid contacts.

Depending on the composition of the bilayer in question, hydrophobic mismatch can occur to

accommodate TM helices of greater or lesser length than the actual membrane. In order to

accommodate a mismatched length, the lipid, protein, or both can undergo some sort of structural

rearrangement. The acyl chain conformation can change, with the chains becoming more or less

extended and thereby adjusting the membrane thickness. If the helix length is longer

than bilayer thickness, the helix can tilt within the membrane plane. The amount of helix tilting,

or tilt angle, will be determined by the amount of surface area available for contact with other

membrane embedded helices (Fig. 1.13). For example, the GpA homodimer has a tilt angle of

forty degrees with respect to the membrane bilayer in order to accommodate the helix length and

dimeric nature of the protein (MacKenzie et al. 1997). Alternatively, short α-helices may be

able to adapt to the larger membrane span by extending their side chains into the lipid head

group - water interface. The snorkelling action accommodates charged residues within these

short helices. Aggregation within the membrane is an additional method to accommodate short

helices. The actual length of the helix and its interactions within the bilayer can have dramatic

effects on how the helices associate and interact with one another; however, the magnitude of the

contribution that lipids make to protein folding remains poorly understood.

Figure 1.13. Altered acyl chain conformations can occur to accommodate a TM segment within

the membrane bilayer. A) Acyl chains can extend or order to contain a TM segment longer than

the bilayer width. B) Acyl chains can compress or disorder to accommodate a TM segment

shorter than the bilayer. C) Helices can tilt with respect to the membrane bilayer, and the tilt

angle determines the amount of lipid-TM segment contact.

Interactions between a membrane protein and the surrounding lipid molecules in the

membrane are important in determining the structure and function of the protein. The exact

contribution of lipids to the final membrane protein fold is however difficult to determine as X-

ray crystallographic structures of integral membrane proteins generally include few lipid

molecules. An example of a protein for which lipids are an essential structural component is the

mechanosensitive channel of large conductance (MscL) from Mycobacterium tuberculosis. A

cluster of Lys and Arg residues in the protein sequence was found to associate with anionic

phospholipids with high affinity (Powl et al. 2005), and altering the lipid identity has

consequences on MscL conformation (Elmore and Dougherty 2003).

1.5.5 Lipids as membrane protein chaperones.

The identification of molecular chaperones and their role in protein folding in vivo

indicates that acquiring the final folded structure of a protein is a complicated process.

Molecular chaperones are involved in the conformational maturation of soluble and membrane

proteins where they direct folding, prevent misfolding (Frydman and Hartl 1996), and even

unfold protein structures (Martin and Hartl 1997). In addition to protein chaperones, lipids may

also act as non-protein chaperones in the folding of membrane proteins within the membrane

bilayer (Bogdanov and Dowhan 1999). For example, full function of the E. coli membrane

transporter LacY is dependent on exposure to phosphatidylethanolamine (PE) during the in vivo

assembly process. The absence of PE during the in vivo folding pathway of LacY results in the

misfolding of the protein, without actually affecting insertion into the membrane (Bogdanov and

Dowhan 1995). In the folding of LacY, PE was identified as a molecular chaperone because the

functional conformation of LacY was retained after partial unfolding by SDS, complete removal of

PE and refolding in the absence of PE. These studies indicate that once the final PE-directed

structure of LacY is established in vivo, the presence of PE is no longer required to maintain the

proper conformation of the protein (Ellis 1997). Both the properties of the ionic headgroup and

the organization of the lipid tail, or hydrophobic domain, of PE mimic critical components of

protein chaperones, and endow lipids with chaperone function (Bogdanov et al. 1999). The

molecular chaperone effect of lipids is thought to be highly specific for lipid chemical

composition, beyond providing a non-specific detergent-like phase.

1.6 Membrane protein oligomerization motifs.

The placement of specific residues along the α-helical axis can constitute a specific TM

folding or oligomerization motif, which can determine whether two helices will interact or retain

a monomeric status. Specific amino acid patterns in numerous membrane proteins have been

identified that, when folded into an α-helix across a membrane bilayer, will form an interaction

face that is sufficient to drive association of α-helices. Mutagenesis studies have been

exceptionally useful in determining the residues critical to these specific residue motifs, where

by changing the residue identity, oligomerization can be modulated. Many studies identifying

critical residues to membrane protein folding have been performed on single pass membrane

systems (Rath et al. 2009b), and additionally interactions between multiple helices that drive the

formation of the final membrane protein structure have been elucidated (Buck et al. 2007).

Several helix oligomerization motifs have been identified to date, which can be

categorized into three main classes: polar residue side chain-side chain interactions that occur

between neighboring TM segments (Gratkowski et al. 2001; Zhou et al. 2001); left-handed (or

GASleft) folding motifs consisting of heptad repeats of small or large residues (Lear et al. 2004;

Poulsen et al. 2009); and perhaps the most widely characterized promoter of oligomerization,

the GG4 (or GASright) motif that is most often defined by an i, i + 4 separation of “small”

residues (Gly, Ala, and Ser) (Deber et al. 1993; MacKenzie et al. 1997; Melnyk et al. 2002;

Sulistijo et al. 2003; Sulistijo and MacKenzie 2006; Bocharov et al. 2007; Plotkowski et al.

2007; Roth et al. 2008). In the following sections, these TM folding motifs will be described in

greater detail.

1.6.1 Polar residue motifs.

Experimental studies involving simple TM helices have shown that a single polar residue,

such as Asn, Asp, Glu, Gln, or His is sufficient to mediate homooligomerization of α-helices in

membrane bilayers and membrane mimetic systems (Gratkowski et al. 2001; Zhou et al. 2001).

The role of these specific polar residues in the oligomerization of TM segments was first

demonstrated via model peptides where a polar residue placed in the center of a model TM

segment was shown to drive helix association via the formation of an interhelical hydrogen

bond. As an example, a polar Asn residue introduced into a model TM α-helix

consisting of a poly-Leu background resulted in the formation of stable dimers as measured by

both in vivo and in vitro assays (Zhou et al. 2001). This helix interaction is considered strong as

it is resistant to denaturation by the detergent sodium dodecyl sulfate (SDS) (Choma et al. 2000;

Zhou et al. 2000; Lear et al. 2001; Zhou et al. 2001).

Unlike strongly polar amino acids, weakly polar amino acids such as Ser and Thr most

often occur in sequence patterns to drive oligomerization. High-affinity homooligomerizing

sequences have been identified via in vivo oligomerization screens, where the two most

frequently occurring motifs are SxxSSxxT and SxxxSSxxT (Dawson et al. 2002). Mutations of

any of the Ser or Thr residues in these motifs to non-polar residues abolished oligomerization,

indicating that the interaction between these positions is specific and requires an extended motif

of Ser and Thr hydroxyl groups. This study found that single Ser or Thr groups do not appear to

promote helix association on their own, but can drive strong and specific association through a

cooperative network of interhelical hydrogen bonds (Dawson et al. 2002).
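
Motif shorthand such as SxxSSxxT can be turned into a simple sequence search. The sketch below is illustrative only: it treats "x" as any residue, reports overlapping occurrences, and is run on a hypothetical poly-Leu test sequence that is not taken from the cited study. The helper names are assumptions.

```python
import re


def motif_to_regex(motif):
    """Translate motif shorthand (lowercase 'x' = any residue) into a regex.
    The pattern is wrapped in a lookahead so overlapping hits are reported."""
    pattern = "".join("." if ch == "x" else re.escape(ch) for ch in motif)
    return re.compile(f"(?=({pattern}))")


def find_motif(sequence, motif):
    """Return the 0-based start index of every occurrence of the motif."""
    return [match.start() for match in motif_to_regex(motif).finditer(sequence)]


# Hypothetical test sequence: poly-Leu background carrying one SxxSSxxT motif.
toy_tm = "LLLLSLLSSLLTLLLL"
print(find_motif(toy_tm, "SxxSSxxT"))
print(find_motif(toy_tm, "SxxxSSxxT"))
```

A sequence hit only shows that the residues are present with the right spacing; whether the hydroxyl groups actually form a cooperative interhelical network has to be established experimentally, for example by the mutagenesis described above.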

1.6.2 Left-handed transmembrane folding motifs: heptad repeats of small or large

residues.

Examination of protein crystal structures found almost all TM helices have interhelical

hydrogen bonds (Adamian and Liang 2002); however, examples exist of TM helices that

oligomerize via sequence specific motifs that are clearly less dependent on hydrogen bonds. One

example is the left-handed folding motif, which has a characteristic seven-residue spacing of

small or large residues that mediates TM packing. A synthesized model TM peptide containing

Gly at the a and d positions was found to associate via analytical ultracentrifugation, where the

spacing of the Gly residue allowed for the close approach of the α-helix backbone and efficient

van der Waals packing of the construct (Lear et al. 2004). An example of an actual TM segment

that uses this motif to associate in vivo is found in TM4 of the Halobacterium salinarum small

multidrug resistance protein, Hsmr, where the underlined residues contribute to TM

oligomerization (85-VAGVVGLALIVAGVVVLNVAS-105). Oligomerization of Hsmr through

this motif is critical to protein function, as the protein must form at least dimers to retain its drug

effluxing properties (Poulsen et al. 2009).

Heptad repeats can also be formed from large, hydrophobic residues such as Leu and Val.

One of the most widely recognized examples of a TM segment which uses this mode of

oligomerization is the TM domain of phospholamban. This SDS-resistant pentamer binds to,

and inhibits, the Ca2+-ATPase. Mutagenesis and modeling experiments have identified that the

phospholamban TM segment oligomerizes via its LxxIxxxLxxIxxxL motif, which in turn forms

a Leu/Ile zipper-like coiled-coil structure (Oxenoid and Chou 2005).

1.6.3 Right-handed packing motifs: small-xxx-small motifs.

One of the best characterized TM folding motifs is the right-handed, small-xxx-small or

GG4 motif. This TM oligomerization motif consists of a small residue (Gly, Ala or Ser)

separated by any three residues at intervening positions, with Gly being the most prevalent. The

importance of this motif was initially implied by amino acid compositional searches of TM

domains, where an enrichment of small residues was observed to be of statistical importance

(Senes et al. 2000). This particular oligomerization motif places the small residues on the same

side or face of the α-helix, at an i, i + 4 spacing, and this produces a concave helical surface that

is optimized for protein folding (MacKenzie et al. 1997). Transmembrane dimers containing this

motif are further stabilized by hydrogen bonding between the Cα hydrogen and the backbone carbonyl. The

exact role of the Gly residues involved in this oligomerization motif is, however, not entirely

clear. Some investigators have found that the Gly residues are necessary, but not sufficient for

homodimerization (Schneider and Engelman 2004); while others have reported that the Gly

residues are neither necessary nor sufficient to mediate oligomerization events within the

membrane (Doura and Fleming 2004).
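
Why an i, i + 4 spacing places residues on the same helical face can be checked with a helical-wheel calculation. The sketch below assumes an ideal, straight α-helix with 3.6 residues per turn (100° per residue); supercoiling in real helix pairs shifts these values somewhat, and the function name is hypothetical.

```python
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn (ideal helix)


def angular_offset(spacing):
    """Signed angle in [-180, 180) between residue i and residue i + spacing
    when both are projected onto a helical wheel."""
    return (spacing * DEGREES_PER_RESIDUE + 180.0) % 360.0 - 180.0


for spacing in (1, 2, 3, 4, 7):
    print(f"i, i+{spacing}: {angular_offset(spacing):+.0f} degrees")
```

Residues separated by four or seven positions lie within 40° and 20° of each other on the wheel, respectively, so they contribute to one face, whereas neighbors one or two positions apart point in very different directions.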

One of the best characterized TM domains which homodimerizes through a GG4 motif is

the membrane spanning region of GpA. This single-pass membrane protein is an obligate dimer

optimized for high affinity association that is mediated by van der Waals interactions between

helices within the membrane bilayer (MacKenzie et al. 1997). Another example of a membrane

protein which uses the GG4 motif to oligomerize is the MCP; however, this homodimerizing unit

has relatively moderate stability compared to GpA (Melnyk et al. 2002). The exact basis for the

difference in α-helix affinity between GpA and MCP remains unclear, as replacement of the

residues involved in the GpA oligomerization interface with MCP interfacial residues still

retained helix-helix interactions, albeit at reduced strength (Melnyk et al. 2004).
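
A small-xxx-small scan of a TM sequence can be sketched as follows. The GpA sequence string below is an assumption: it is the commonly cited TM segment for residues 73-95 and is not given in this chapter, so it should be checked against a sequence database before being relied upon. The function name is hypothetical.

```python
SMALL_RESIDUES = set("GAS")  # Gly, Ala and Ser, as defined for this motif


def small_xxx_small(sequence, first_residue_number=1):
    """Return (i, i+4) residue-number pairs in which both positions
    are occupied by a small residue."""
    pairs = []
    for idx in range(len(sequence) - 4):
        if sequence[idx] in SMALL_RESIDUES and sequence[idx + 4] in SMALL_RESIDUES:
            pairs.append((first_residue_number + idx,
                          first_residue_number + idx + 4))
    return pairs


# Assumed GpA TM sequence (residues 73-95); verify before use.
GPA_TM = "ITLIIFGVMAGVIGTILLISYGI"
print(small_xxx_small(GPA_TM, first_residue_number=73))
```

As the conflicting reports above suggest, finding such a pair in a sequence does not by itself predict dimerization; the surrounding residues and the lipid environment also contribute.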

1.6.3.1 The human erythrocyte protein Glycophorin A.

The TM domain of GpA serves as an excellent model system by which to study

determinants of membrane protein folding; this single pass membrane protein has been used

extensively in the study of membrane protein folding as viewed through the two-stage folding

model. A relatively small protein, GpA contains 131 amino acids, 23 of which span the

membrane bilayer (residues 73-95) (MacKenzie et al. 1997). The functional role of GpA is to

determine the blood group antigenic specificity, where the amino terminal domain of the protein

determines the MNS blood group type (Marchesi and Andrews 1971), and it is the major

sialoglycoprotein of the human red blood cell membrane. The amino terminal domain of

GpA contains a large oligosaccharide component, and the carboxy terminal domain - which

contains the membrane spanning region - associates within the membrane. The TM domain

alone is responsible for the oligomerization of GpA. Glycophorin A is also implicated in the

biosynthesis and plasma membrane trafficking of another abundant erythrocyte membrane

protein, Band 3 or the Anion Exchanger 1 (Williamson and Toye 2008).

Structural characterization to determine the mode of GpA TM mediated oligomerization

was conducted via mutagenesis of a GpA chimeric protein; the TM segment of GpA was fused to

the C-terminus of staphylococcal nuclease via a flexible linker and the oligomeric status of GpA

mutant chimeras was established via migration on SDS-PAGE (Lemmon et al. 1992a; Lemmon

et al. 1992b). This technique led to the identification of residues within the TM sequence that

significantly modulated dimerization upon mutation, and to the determination of seven residues

specifically involved in the GpA dimerization interface: L75I76xxG79V80xxG83V84xxT87. From this

work it was determined that the individual GpA TM segments associated within SDS as a

parallel, right-handed supercoil (Lemmon et al. 1992a; Lemmon et al. 1992b). This finding was

supported by the eventual high resolution structural determination of GpA within

dodecylphosphocholine (DPC) micelles, where it was shown that the interactions of individual

GpA TM helices are stabilized by van der Waals interactions (MacKenzie et al. 1997).

1.7 Methods to study the oligomerization of transmembrane segments.

As described above, obtaining large amounts of full-length membrane proteins, or even

smaller fragments from heterologous expression, can be difficult. Our laboratory has worked

towards optimized synthesis and expression of TM segments to facilitate the study of packing

interactions between TM helices and to obtain structural information regarding these segments.

Solid phase synthesis of hydrophobic TM peptides tagged with solubilizing residues as well as

heterologous expression in E. coli of small membrane protein fragments have facilitated these

structural studies for a number of proteins including CFTR (Therien et al. 2001; Choi et al.

2004), GpA (Melnyk et al. 2004), MCP (Wang and Deber 2000; Melnyk et al. 2004), and the

gamma subunit of the Na, K-ATPase or sodium pump (Therien and Deber 2002). We have also

explored computational methods in order to identify potential TM α-helix interaction sites.

This has been exceptionally useful in situations where structural data is available for comparison

(Cunningham et al. 2010).

With the availability of tools to produce large quantities of membrane proteins - or

fragments thereof - several systems have been developed in order to study the close packing of

TM segments within the membrane bilayer, for both in vivo and in vitro systems. The following

sections describe in further detail such assays, and their utility in understanding membrane

protein folding.

1.7.1 Membrane mimetic systems: micelles versus bilayers.

Membrane bilayers and detergent micelles can both be used to study the secondary and

tertiary structures of membrane proteins. As a necessary step in solubilizing hydrophobic

membrane proteins, the choice of detergent or lipid can have different structural consequences,

so the choice should not be made in an arbitrary fashion.

1.7.1.1 Membrane bilayers.

Membrane bilayers are advantageous in the study of membrane proteins as they offer a

native environment in which to investigate secondary and tertiary structure. The lipid structures

forming biological membranes are made of two layers of lipid molecules, which are mostly

composed of phospholipids which have a hydrophilic head group and two hydrophobic tails.

When these lipids are exposed to water, they aggregate to form a two-layered sheet, with their

tails pointing towards the centre of the sheet. In mammalian membrane bilayers, the most

common phospholipid is phosphatidylcholine (PC), which accounts for almost half of the lipids

present. Phosphatidylcholine has a zwitterionic head group, as there is a negative charge on the

phosphate group and a positive charge on the amine. An alternative to biological membranes is

liposomes. Liposomes have lipids organized as a bilayer resembling biological membranes but

they also allow comparisons among membranes enriched in particular lipid components of

choice. For example, if a particular lipid is required as a cofactor for an enzymatic process, it

can usually be identified in a liposome assay.

1.7.1.2 Detergent Micelles.

Detergent micelles are advantageous in studying membrane proteins as they are a simple

and convenient system. Micelles are aggregates of detergent molecules that have hydrophilic

head regions and hydrophobic tail regions. A typical micelle in aqueous solution forms this

aggregate with the head region in contact with the surrounding solvent and the tail region

sequestered into the micelle centre. In this manner, detergent micelles can provide an

environment which is similar to the membrane bilayer; in both cases the centre region of the

micelle and bilayer has a low dielectric environment which is suitable for depositing

hydrophobic TM segments. Transmembrane segments can be solvated efficiently by detergent

micelles where they adopt α-helical structures in order to satisfy the hydrogen bonding potential

of the backbone. Structural characterization of membrane proteins is often carried out in

detergent micelles, as this system is amenable to high resolution structural determination

techniques such as crystallization and solution NMR (le Maire et al. 2000; Prive 2007; Carpenter

et al. 2008).

1.7.2 The TOXCAT assay.

To measure the extent of folding of single pass TM segments within a natural membrane

environment, the TOXCAT assay was developed. This assay can measure the

homooligomerization of any TM segment, and is based on the TM mediated association of the

dimerization-dependent ToxR transcriptional activation domain. In the TOXCAT assay, the TM

segment of interest is cloned into the pccKAN expression vector, which places the TM segment

between an N-terminal fusion of the DNA binding domain of ToxR – a transcriptional activator

that is only functional in the dimeric form (Kolmar et al. 1995; Ottemann and Mekalanos 1995) –

and a C-terminal fusion of maltose binding protein (MBP) – which ensures the proper orientation

of the construct within the membrane bilayer (Russ and Engelman 1999). Homooligomerization

via the TM domain leads to dimerization of the ToxR domain, resulting in transcriptional

activation of a reporter gene, and the concentration of the reporter is determined via an enzyme-

linked immunosorbent assay (ELISA). In the case of the TOXCAT assay, the reporter is

chloramphenicol acetyltransferase (CAT). In vivo, this bacterial enzyme detoxifies the antibiotic

chloramphenicol by covalently attaching an acetyl group from acetyl-CoA to chloramphenicol,

which prevents chloramphenicol from binding to the ribosome and thereby from inhibiting protein

synthesis (Lovett 1996). The TOXCAT assay is useful in studying the oligomerization of

membrane proteins, as the amount of CAT expressed in vivo corresponds to the level of affinity

of the dimerizing construct being tested (Fig. 1.14) (Russ and Engelman 1999). The TOXCAT

assay is additionally advantageous in studying membrane protein association as it is sensitive to

differences in affinity among mutant constructs, and can be used to select and identify strongly

dimerizing constructs (Russ and Engelman 1999). Lastly, this assay has been used to select

oligomeric TM segments from a library of randomized sequences (Russ and Engelman 1999).

Figure 1.14. The TOXCAT assay for detecting in vivo TM association within the E. coli inner

membrane. Homooligomerization mediated through the TM domains drives the association of

the ToxR′ domain, which drives expression of the reporter gene CAT. The periplasmically located

MBP protein anchors the chimera in the inner membrane, controlling orientation.

The TOXCAT assay has proven extremely useful in identifying the residues responsible for

mediating TM segment self-association within the E. coli inner membrane. For example,

TOXCAT was successfully used to identify the residues involved

in the homooligomerization of the DEP-1 protein tyrosine phosphatase, which is mediated by

specific Gly residues within the sequence. Non-conservative replacement of these small residues

with large residues disrupted dimerization (Chin et al. 2005).

1.7.3 Sodium dodecyl sulfate polyacrylamide gel electrophoresis.

Sodium dodecyl sulfate - polyacrylamide gel electrophoresis (SDS-PAGE) is one of the

most commonly used tools in studying the association of TM segments (Rath et al. 2009b).

Transmembrane segments that retain oligomerizing capabilities in this detergent can migrate at

apparent molecular weights ≥ two times their actual molecular weight and are examples of

strongly associating segments. Based on migration through the gel and comparisons to

molecular weight standards, SDS-PAGE can provide an estimate of the oligomeric state of the

TM segment; however, the migration does depend heavily on the amount of detergent bound to

the segment, indicating the importance of hydrophobicity and conformation of the sample of

interest (Rath et al. 2009a).

SDS-PAGE has found utility in identifying high affinity dimers as well as the residues

involved in oligomerizing species (Lemmon et al. 1992a; Melnyk et al. 2001). In the case of the

dimeric single pass membrane protein GpA, the seven-residue dimerization motif

(L75IxxGVxxGVxxT87) centered around the GG4 motif was pin-pointed as the mediator of

dimerization, with replacements at these residues significantly modulating oligomerization.

Fusion constructs consisting of the GpA TM segment and the C-terminus of staphylococcal

nuclease were expressed and purified, and changes to the oligomeric status with mutation were

determined via SDS-PAGE. Amino acids with aliphatic side chains defined much of the

interface, indicating that a precise packing interaction between the helices provides the energy

for association (Lemmon et al. 1992b).

1.7.4 Förster Resonance Energy Transfer.

Förster Resonance Energy Transfer (FRET) is another useful in vitro technique to study

the association of TM segments in membrane mimetic systems. In order to observe FRET

between two separately labelled donor and acceptor TM peptides, the labels must be in close

proximity. The separation distance at which energy transfer is 50% efficient is a property of the

FRET pair, and is referred to as the Förster radius. A commonly used FRET pair in the study of TM peptides is dansyl chloride as

the donor chromophore and dabsyl chloride as the acceptor. The emission spectrum of dansyl

chloride overlaps the absorbance spectrum of dabsyl chloride, and association of the peptides

results in energy transfer and quenching of the total dansyl chloride fluorescence when the two

labels are within 40 Å of each other (Rath et al. 2009b). FRET finds additional utility in that this

technique can be extended to intact bilayers, as well as allowing for ease in identification of

hetero-oligomers in TM interactive species.
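
The distance dependence described above follows the standard Förster relation, E = 1 / (1 + (r/R0)^6). The following is a minimal sketch of that relation only; the function name is illustrative, and the Förster radius of 40 Å is an assumed placeholder chosen to match the distance scale quoted in the text, not a measured value for the dansyl/dabsyl pair.

```python
def fret_efficiency(distance_angstrom: float, forster_radius_angstrom: float) -> float:
    """Return the FRET efficiency E = 1 / (1 + (r / R0)^6)."""
    if distance_angstrom < 0 or forster_radius_angstrom <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)


# Assumed, illustrative Forster radius (hypothetical value, in Angstrom).
R0_ASSUMED = 40.0

for r in (20.0, 40.0, 60.0):
    # Efficiency falls steeply once the labels are further apart than R0.
    print(f"r = {r:4.0f} A  ->  E = {fret_efficiency(r, R0_ASSUMED):.3f}")
```

Because of the sixth-power dependence, the efficiency is exactly one half at r = R0 and drops off sharply beyond it, which is why donor quenching is a sensitive reporter of peptide association.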

The asymmetric oligomeric interface of the anthrax toxin receptor-1 (ANTXR1) TM

domain was identified via FRET studies in SDS micelles. Mutations were made to residues

corresponding to those predicted to be involved in association via computer modeling, and

differences in dimerization were observed via FRET (Go et al. 2006). Additionally, FRET was

successfully used to determine that different detergents can affect the energetics of peptide

association. A series of detergents spanning a range of alkyl chain lengths and ionic,

zwitterionic, and nonionic head groups showed wide variations in GpA dimer stability, suggesting that

detergents might be selected that drive association rather than dissociation of peptide dimers

(Fisher et al. 2003).

1.7.5 Analytical ultracentrifugation.

Analytical ultracentrifugation finds its utility in the study of TM association as the

oligomeric status of TM constructs can be estimated based on the mass of TM peptides

solubilized in a membrane mimetic environment. Ultracentrifugation can also be applied to

estimate the energetics of peptide association via titrating peptide concentration to separate

monomeric and oligomeric TM peptide species (Rath et al. 2009b).
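
The concentration titration described above can be illustrated with a simple monomer-dimer equilibrium. The sketch below assumes an ideal two-state model (Kd = [M]^2/[D], total peptide T = [M] + 2[D]) with concentrations in molar units and a 1 M standard state; in detergent systems concentrations are often expressed relative to the detergent phase instead, and that refinement is omitted here. All function names and numerical inputs are illustrative assumptions.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def monomer_concentration(total: float, kd: float) -> float:
    """Free monomer concentration for 2M <-> D.

    Solves 2*M^2 + Kd*M - Kd*T = 0 for the positive root.
    """
    if total <= 0 or kd <= 0:
        raise ValueError("total concentration and Kd must be positive")
    return (-kd + math.sqrt(kd * kd + 8.0 * kd * total)) / 4.0


def fraction_in_dimer(total: float, kd: float) -> float:
    """Fraction of all peptide chains that are present in dimers."""
    return 1.0 - monomer_concentration(total, kd) / total


def association_free_energy(kd: float, temperature_k: float = 298.15) -> float:
    """Free energy of association (kcal/mol), dG = RT ln(Kd), 1 M standard state."""
    return R_KCAL * temperature_k * math.log(kd)


# Hypothetical Kd, used only to show how the dimer fraction changes on titration.
KD_ASSUMED = 1e-6
for total in (1e-7, 1e-6, 1e-5):
    print(f"T = {total:.0e} M  ->  fraction in dimer = {fraction_in_dimer(total, KD_ASSUMED):.2f}")
```

In this model half of the peptide chains are dimeric when the total concentration equals Kd, so titrating through that range is what separates the monomeric and dimeric populations.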

Homodimerization of the mouse erythropoietin receptor, which is thought to be an initial

regulatory step in erythrocyte formation, was determined via sedimentation equilibrium

analytical ultracentrifugation in detergent. The sequence dependence of the peptide interaction

was also highlighted by comparison to the human version, as the human erythropoietin receptor

differs by three residues but has significantly lower interaction propensity and is only slightly

more favorable than that expected for non-preferential binding (Ebie and Fleming 2007).

1.7.6 Computational methods.

Another tool for predicting oligomerization interfaces of TM α-helices, including modeling

of TM dimers, is the CNS searching of helix interactions (CHI) software suite of the

crystallography and NMR system (CNS) (Adams et al. 1995; Adams et al. 1996; Brunger et al.

1998). In this method, two identical energy minimized α-helices are generated from the primary

sequence, and potential interactions are identified by global computational searching. This

method of interfacial recognition can be conducted in either parallel or anti-parallel orientations

(Poulsen et al. 2009), and the lowest energy structures are considered potential candidates for

dimers. This relatively crude method of helix interaction prediction only takes into account

optimized structural interactions, and validation of the models produced from the predictive

output is still required.

Algorithms have also been developed to identify interacting TM helices in membrane

proteins that incorporate features beyond those used to develop contact maps for soluble

proteins. One such predictive method developed by Fuchs et al. is a neural-network-based

approach that integrates protein sequence, correlated or co-evolving residues, protein topology,

residue position within the TM segment, and orientation toward the lipophilic environment to

generate contact maps for membrane proteins (Fuchs et al. 2009). Predicting protein structure,

however, is a challenging task. When tested on a dataset of 62 non-homologous membrane

proteins with known structure, a prediction accuracy of 26% was achieved. While this value

initially seems low, prediction of TM helical contacts via this neural-network performs with

equal accuracy to contact predictors designed for soluble proteins (Fuchs et al. 2009). Obviously

the prediction of protein structures and α-helical contacts is still an evolving process, but the

identification of interactive helices is a useful exercise that may lead to the assignment of a

membrane protein sequence to a family of related folds and hence provide clues related to

function, especially in a field where available structures are limited.

1.8 Thesis hypothesis and outline.

As discussed above, helical membrane protein folding is thought to follow a two-stage

process. The first stage is defined as α-helix insertion into the membrane, followed by the

second stage involving lateral association of helices within the membrane to form higher order

structures. In order to fully understand the overall process of membrane protein folding, each of

these stages must be further defined. Unfortunately, studies of this process can be hampered by

the challenge of working with intact membrane proteins, whose size and high

hydrophobicity, combined with demanding expression protocols, have often precluded

biophysical studies. In order to overcome this challenge and facilitate the study of membrane

proteins, fragments or peptides corresponding to individual TM segments are often utilized in

place of the full length structure. This largely empirical process has been tackled extensively in

our laboratory, with the goal of producing constructs that can address both stage-one and stage-

two of membrane protein folding.

Chapter 2 of this thesis focuses on the second stage of membrane protein folding and our

understanding of sequence determinants leading to α-helix association within the membrane

bilayer utilizing GpA as the model system. If specific sequence determinants are responsible for

directing the association of TM segments within the membrane bilayer, then modulating the

sequence would act to promote or obstruct dimerization (Cunningham et al. 2010). Chapter 3

will focus on stage-one of membrane protein folding – specifically, factors which distinguish

hydrophobic segments of globular proteins from bona fide TM segments – with a goal

towards understanding how the recognition of actual TM segments occurs. By investigating a

group of δ-helices related to TM segments through their overall hydrophobicity, we can

determine differences between the groups based on composition and structure (Cunningham et

al. 2009). Expanding on the recognition process of TM segments by the cellular machinery

leading to the membrane integration of amino acid stretches, Chapter 4 will focus on identifying

determinants required for the successful membrane integration of polar, α-helical segments such

as δ-helices into the membrane bilayer, and specifically identifying determinants of membrane

integration in vivo. Finally, the production of higher order membrane protein constructs in

quantities sufficient for study must be addressed before investigations of the folding patterns

within membrane environments can be initiated. In Chapter 5 of this thesis, routes to optimal

expression and synthesis of membrane protein fragments will be described, along with efforts

toward preparation and characterization of these hydrophobic segments (Cunningham and Deber

2007).

Chapter 2. Beta-branched residues adjacent to GG4 motifs promote

the efficient association of glycophorin A transmembrane helices.

The contents of this chapter have been published, in part, by Cunningham, F., Poulsen, B.E., Ip,

W., and Deber, C.M., Biopolymers: (2010).

Author Contributions: FC and WI designed research. WI and BEP assisted in mutagenesis of

TOXCAT constructs and peptide synthesis and purification of peptides, respectively. FC

analyzed data. FC and CMD wrote the paper.

2.1 Introduction.

A major driving force for correct folding and assembly of membrane proteins derives

from interactions between segments created by specific residue motifs on the helix

surface. One such motif is the GG4 (or ‘small-xxx-small’) motif, which is defined by an i, i + 4

separation of ‘small’ residues (Gly, Ala, and Ser) (Popot and Engelman 1990; Rath et al. 2009b)

with large hydrophobic residues (Ile, Val or Leu) statistically noted to reside at adjacent

positions (Senes et al. 2000).

In an analysis of residues in TM segments, a pattern emerged in which small

residues (Gly, Ala and Ser) separated by three intervening residues along the amino acid sequence occur

frequently. The most over-represented of the pairs of small residues was Gly at the i and i + 4

positions, creating a GG4 motif; this GG4 motif occurs 32% more often than a random

expectation (Senes et al. 2000). See reference (Rath et al. 2009b) for a list of membrane proteins

containing this small residue pattern that is involved in TM segment oligomerization. A GG4

motif is the key feature directing dimerization of GpA, which therefore serves as an excellent model system

to further our understanding of the forces driving α-helix association within a membrane. The

interaction motif of the GpA homodimer has been defined as L75IxxGVxxGVxxT87, where

favorable van der Waals interactions between monomers facilitate dimerization (Lemmon et al.

1992b). The structure of the GpA homodimer determined by solution NMR indicates that the

side chains of Gly79 and Gly83 form a ‘groove’ that packs against a ‘ridge’ formed by the

sequentially-adjacent hydrophobic β-branched residues Val80 and Val84, with additional folding

contributions by the surrounding residues (MacKenzie et al. 1997).
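
The i, i + 4 definition of the small-xxx-small pattern given above can be applied directly to a sequence. The sketch below scans the WT GpA TM sequence listed in Table 2.1; the function name is illustrative, and the numbering offset (first residue = 73) is an assumption inferred from the positions of Gly79, Val80, Gly83 and Val84 quoted in the text.

```python
SMALL_RESIDUES = set("GAS")  # Gly, Ala, Ser


def find_small_xxx_small(sequence: str, first_residue_number: int = 1):
    """Return (residue_i, residue_i+4) pairs where both positions are small."""
    hits = []
    for i in range(len(sequence) - 4):
        first, second = sequence[i], sequence[i + 4]
        if first in SMALL_RESIDUES and second in SMALL_RESIDUES:
            hits.append(
                (f"{first}{first_residue_number + i}",
                 f"{second}{first_residue_number + i + 4}")
            )
    return hits


# WT GpA TM sequence from Table 2.1 (Lys tags omitted).
GPA_TM = "ITLIIFGVMAGVIGTILLISYGI"

# Assumed numbering: the first residue of the peptide is taken as position 73.
for pair in find_small_xxx_small(GPA_TM, first_residue_number=73):
    print(pair)
```

With this numbering the scan recovers the Gly79/Gly83 pair that defines the GG4 motif, together with a second small-residue pair (Ala82/Gly86) one turn further along the helix.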

While the role of small residues in the GpA dimerization motif has been extensively

studied, the importance of the large adjacent Val residues has not been correspondingly explored.

Compositional analysis has indicated a statistical enrichment of large, hydrophobic β-branched

residues such as Ile and Val in TM segments versus simple hydrophobic segments from globular

proteins, suggesting that these residues perform a structural role beyond

maintaining levels of hydrophobicity (Cunningham et al. 2009). Here we sought to determine

the nature and relative extent of the specific contribution(s) to TM-TM packing of the Val80 and Val84

residues through systematic substitution(s) with Leu, Ile, and Ala residues. Results of the

analysis suggest that replacement of Val with even single Ile, Leu, or Ala residues can have

profound effects on GpA helix-helix interactions.

2.2 Results.

2.2.1 Mutations at Val80 and Val84 in the glycophorin A dimerization motif can modulate

the strength of oligomerization in vivo.

To investigate the contribution of the large residues in the homodimerization motif of

GpA TM α-helices, a systematic mutagenesis of Val80 and Val84 was performed. A total of 16

constructs (WT plus 15 single and double mutants) were prepared in the GpA TM domain of the TOXCAT vector via

site-directed mutagenesis at these sites, covering all possible combinations of Val, Ile, Leu, and Ala

(Fig. 2.1). Dimerization affinities were determined via the TOXCAT assay, where the extent of

dimerization within the E. coli inner membrane is reported by the amount of CAT reporter

protein produced (Fig. 2.2A,B) (Russ and Engelman 1999). The amount of CAT expressed within

the cells ([CAT]) is a measurement of construct dimerization and can be compared relative to the

WT GpA construct after normalization for construct expression within NT326 cells (Fig.

2.2A,B) (Russ and Engelman 1999). The correct membrane insertion of each construct upon

expression was confirmed via growth of NT326 cells on M9-maltose plates (data not shown). E.

coli cells expressing the WT and mutant TOXCAT chimeras were streaked onto agar plates with

maltose (0.4%) as the only carbon source. Cells expressing constructs which span the membrane bilayer and

target MBP to the periplasm are capable of utilizing maltose as a carbon source for growth. This

maltose complementation assay verifies the periplasmic location of MBP and correct orientation

and insertion of these constructs into the E. coli inner membrane. Transformant growth was

evaluated for all constructs after incubation for 2 days at 37°C.
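
The normalization described above, in which [CAT] is corrected for construct expression and then expressed relative to WT, can be sketched as a simple ratio. The exact arithmetic used in this work is not spelled out in the text, so the formula below is an assumption, and the readings are placeholder numbers in arbitrary units rather than experimental data.

```python
def relative_dimerization(cat: float, expression: float,
                          wt_cat: float, wt_expression: float) -> float:
    """Expression-normalized CAT signal of a construct, relative to WT.

    Assumed form: (CAT / expression) / (CAT_WT / expression_WT).
    """
    if expression <= 0 or wt_expression <= 0 or wt_cat <= 0:
        raise ValueError("expression levels and WT CAT signal must be positive")
    return (cat / expression) / (wt_cat / wt_expression)


# Placeholder readings (arbitrary units); NOT measurements from this work.
readings = {
    "WT": {"cat": 1.0, "expression": 1.0},
    "mutant_x": {"cat": 1.5, "expression": 0.75},  # hypothetical construct
}

wt = readings["WT"]
for name, values in readings.items():
    signal = relative_dimerization(values["cat"], values["expression"],
                                   wt["cat"], wt["expression"])
    print(f"{name}: relative [CAT] = {signal:.2f}")
```

Dividing by expression before comparing to WT prevents a poorly expressed construct from being scored as a weak dimer simply because less chimera is present in the membrane.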

Figure 2.1. Substitutions for WT Val80 and/or Val84 with Ile, Leu, and Ala combinations, shown

in alignment with the local GpA sequence (Lemmon et al. 1992b). The gray boxes represent the

mutational positions in the GpA sequence. Figure adapted from (Cunningham et al. 2010).
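
The full set of constructs at positions 80 and 84 can be enumerated programmatically. The sketch below is illustrative only: it builds every pairwise combination of Val, Ile, Leu and Ala on the WT TM sequence from Table 2.1, using the two-letter naming convention of the text (e.g. VI = Val80, Ile84). The numbering offset (first residue = 73) is an assumption inferred from the quoted positions of Val80 and Val84.

```python
from itertools import product

WT_TM = "ITLIIFGVMAGVIGTILLISYGI"  # WT GpA TM sequence from Table 2.1
FIRST_RESIDUE = 73                 # assumed numbering of the first residue
POSITIONS = (80, 84)               # mutated 'ridge' positions
RESIDUES = "VILA"                  # Val, Ile, Leu, Ala


def build_constructs() -> dict:
    """Map each two-letter construct name to its TM sequence."""
    constructs = {}
    for res80, res84 in product(RESIDUES, repeat=2):
        sequence = list(WT_TM)
        sequence[POSITIONS[0] - FIRST_RESIDUE] = res80
        sequence[POSITIONS[1] - FIRST_RESIDUE] = res84
        constructs[res80 + res84] = "".join(sequence)
    return constructs


constructs = build_constructs()
print(len(constructs), "constructs, including WT (VV)")
for name in ("VV", "VI", "IV", "II", "LL"):
    # Lys tags are added as in Table 2.1 to give the peptide form.
    print(f"{name}pep  KKKK-{constructs[name]}-KKKK")
```

The five sequences printed at the end correspond to the peptides listed in Table 2.1, which provides a quick consistency check on the indexing.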

We observed that substitution of the WT Val residues by various combinations of Ile and

Leu residues in the GpA TM segment significantly modulates the dimerization affinity of the

construct. Immediately apparent is the improvement in dimerization upon mutation of a WT Val

to Ile (IV and VI, p < 0.01 and p < 0.05, respectively, Fig. 2.2A) as measured within the E.

coli inner membrane. A corresponding result was observed for the double Ile mutant (II), where

dimerization was significantly elevated above the WT VV GpA construct (p < 0.05, Fig. 2.2A).

From these results, Ile appears to increase dimerization of GpA in the presence of a second

β-branched residue (either Ile or Val), an effect which is not tied to position within the sequence; the

dimerization of Ile-containing GpA mutants (IV, VI and II) is statistically higher than the WT

(VV), although these samples are not statistically distinct from each other (p > 0.1). Therefore,

regardless of position, the presence of at least one Ile residue in the dimerization interface of

GpA improves the dimerization propensity of the construct (Fig. 2.2A).

Figure 2.2. TOXCAT assays of GpA and mutant sequences. Bars indicate relative CAT

expression for each construct normalized to WT, with standard deviation shown. The notation

VV designates the WT Val80 and Val84 residues, respectively; other constructs are

correspondingly designated. Differences in CAT concentration are denoted by * (p < 0.05); **

(p < 0.01). A) CAT expression of WT and conservative mutant GpA constructs. B) CAT

expression of WT and Ala mutant GpA constructs. Expression levels of GpA mutant constructs

are shown below, and reported [CAT] values are normalized for expression levels. Figure adapted from

On the other hand, mutation to Leu had the opposite effect on dimerization of the GpA

TM compared to Ile and Val. The double Leu mutant (LL) displayed a statistically significant

decrease in dimerization propensity relative to the WT (p < 0.01, Fig. 2.2A) indicating the

importance of a side chain β-branch in the oligomerization of GpA. However, the dimerization

of the LL construct was still above the background (Russ and Engelman 1999) (c2x, p < 0.01)

indicating that oligomerization is not completely abolished in the presence of this large,

hydrophobic residue. The efficiency of a β-branched residue in promoting dimerization of the

GpA TM is highlighted by the fact that constructs containing Leu in combination with either Val or Ile

(VL, LV, IL, and LI) retained dimerization levels statistically indistinguishable from

WT and each other (Fig. 2.2A) (p > 0.1). These results show that at least one large, hydrophobic

β-branched residue is required to maintain dimerization affinity at levels comparable to WT

(Fig. 2.2A). The importance of large, hydrophobic β-branched residues is additionally

highlighted as hydrophobic segments from globular proteins (δ-helices) are depleted in Ile and

Val, even though these segments share equivalent hydrophobicity to TM segments (Cunningham

et al. 2009). The statistical enrichment of Ile and Val, along with the increased dimerization data

presented here, highlights the structural role of these residues and how they are optimal for

driving the packing of TM segments (Cunningham et al. 2009).

To further explore the importance of having at least one large hydrophobic, β-branched

residue in the dimerization interface of GpA, less conservative mutations to Ala were made with

all combinations of Ile, Val and Leu (Fig. 2.1). All possible Ala/Val and Ala/Ile mutants were

nonetheless able to maintain dimerization affinities at least equivalent to the WT VV construct,

with the VA mutant exhibiting dimerization affinity above the other Ala/Val and Ala/Ile

combinations (Fig. 2.2B). Conversely, the Ala/Ala and Ala/Leu combinations significantly

reduced the dimerization propensity of the construct relative to WT (p < 0.01, Fig. 2.2B).

A preliminary mutational cycle of Thr was also considered to understand the importance

of β-branching vs. hydrophobicity of the amino acid side chain in the context of the GpA GG4

dimerization motif. Mutational cycles of Thr, in combination with Val, Ile and Leu, indicate

that mutation to a β-branched, yet more polar residue can also modulate dimerization of GpA.

Preliminary results show that dimerization of the TI and LT mutants is comparable to WT, while

the IT and TL mutants produced little to no dimer. From this initial set of mutations, no

discernible trend emerged on the effect of introducing a Thr residue into the GpA dimerization

interface; there appears to be no consistent positional dependence of Thr on construct dimerization, as the

pattern of dimerization for the TI and IT mutants is opposite to that for the TL and LT mutants

(Fig. 2.3). These results do show, however, the importance of a balance of

hydrophobicity and a β-branched side chain – in concert with the electrostatic/H-bonding

interactions offered by the Thr residue – in promoting efficient packing of the GpA homodimer.

Figure 2.3. TOXCAT assay of GpA and mutant Thr sequences. Bars indicate relative CAT

expression for each construct normalized to WT, with standard deviation shown. The notation

VV designates the WT Val80 and Val84 residues, respectively; other constructs are

correspondingly designated. The relative expression of each mutant construct is shown below.

[CAT] values are normalized for expression levels.

2.2.2 Mutations at Val80 and Val84 do not alter secondary structure.

Conservative mutations at Val80 and Val84 that significantly altered in vivo

homooligomerization of the GpA TM helix compared to WT (Fig. 2.2) were selected for peptide

synthesis. The boundaries of the GpA TM segment were chosen based on the available high

resolution structure of WT GpA (MacKenzie et al. 1997); multiple Lys residues were

incorporated at both the N- and C-termini to facilitate characterization and solubilization of the

peptides (Table 2.1). Previous work has shown that such solubilizing techniques do not interfere

with the native oligomerization capabilities of GpA peptides (Melnyk et al. 2001). Secondary

structures of WT and GpA mutant peptides in the membrane-mimetic environment of SDS were

determined by circular dichroism (CD) spectroscopy. In agreement with previously reported

experiments of GpA in α-helix inducing solvents such as SDS (Melnyk et al. 2001), all peptides

investigated had α-helical CD spectra with minima at 208 and 222 nm (Fig. 2.4). SDS has been

shown to be capable of discerning variations in helicity among libraries of mutants in related TM

segments (Melnyk et al. 2001; Wehbi et al. 2007; Rath et al. 2009b), and allows one to draw

conclusions as to the relative helicity of TM peptide constructs. While slight differences were

discernible in the amount of α-helical structure observed for GpA vs. some mutant peptides in

SDS, the variations were not statistically significant (p > 0.05).

Table 2.1. Sequences of peptide mutants of the GpA transmembrane region.

Mutant     Sequence (a)

VVpep      KKKK-ITLIIFGVMAGVIGTILLISYGI-KKKK

VIpep      KKKK-ITLIIFGVMAGIIGTILLISYGI-KKKK

IVpep      KKKK-ITLIIFGIMAGVIGTILLISYGI-KKKK

IIpep      KKKK-ITLIIFGIMAGIIGTILLISYGI-KKKK

LLpep      KKKK-ITLIIFGLMAGLIGTILLISYGI-KKKK

a Lys residues (offset by hyphens) were added to each peptide to enhance aqueous solubility (Melnyk et al. 2001).

b Mean Liu-Deber segmental hydrophobicity of each peptide, excluding Lys tags.

Figure 2.4. Circular dichroism spectra of synthetic peptides corresponding to the GpA TM

sequence. Spectra were recorded on solutions of 25 μM peptide in 25 mM SDS buffer. Peptide

notations in the diagram correspond to the residue positions 80 and 84 of the GpA TM sequence,

where VVpep = WT. See Materials and Methods and Table 2.1 for further details of peptide

synthesis and sequence. Figure adapted from (Cunningham et al. 2010).

2.2.3 Lipid accessibility of the ridge residues correlates inversely with tightness of dimer.

Previously it has been shown that helix-lipid interactions contribute to the overall

stability of the GpA dimer in consideration of the small residues of the dimerization interface; an

increase in the lipid accessible surface area (contact between lipid and protein) as a result of

mutation is inversely correlated with tightness of dimer (Johnson et al. 2006). These findings

suggest that helix-lipid interactions may also contribute to changes in the oligomerization status

with mutation to the large “ridge” residues involved in the GpA packing. To assess the role of

the large, hydrophobic residues in promoting the efficient packing of GpA and to provide an

estimate of lipid chain solvation, the lipid accessible surface area to a methylene-sized probe

(1.88 Å) was calculated for the residues comprising the dimerization interface of energy

minimized GpA helices and conservative mutants thereof (Shrake and Rupley 1973; Chothia

1975; Johnson et al. 2006). An inverse correlation was observed for lipid accessibility and

oligomerization as measured by the TOXCAT assay for all mutants (VV, VI, VL, IV, II, IL, LV,

LI, LL, VA, AV, IA, AI, LA, AL, AA): as lipid accessibility increased, oligomerization

decreased (r = -0.54, p < 0.05, Fig. 2.5). This trend implies that as the ‘ridge’ in GpA

packing becomes more accessible to lipid through mutation, a weaker GpA dimer is produced.

The modest strength of the correlation also implies that lipid solvation offers only a partial

explanation for differences in GpA dimerization, but it remains a contributing factor.
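
The correlation coefficient quoted above is a Pearson r between interfacial lipid accessibility and TOXCAT signal. The following is a minimal sketch of how such a coefficient is computed; the function name is illustrative and the paired values are toy numbers chosen only to show an inverse relationship, not the accessibility or [CAT] values measured in this work.

```python
import math


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length (>= 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Toy values (arbitrary units); NOT data from this work.
toy_accessibility = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_cat_signal = [5.0, 3.0, 4.0, 2.0, 1.0]

print(f"r = {pearson_r(toy_accessibility, toy_cat_signal):.2f}")
```

A negative r indicates that the signal tends to fall as accessibility rises; its magnitude, not just its sign, determines how much of the variation in dimerization the accessibility term can account for.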

Figure 2.5. GpA dimer affinity is inversely correlated to interfacial lipid accessibility.

Interfacial lipid accessibility of mutants to the large WT Val residues of GpA (VV, VI, VL, IV,

II, IL, LV, LI, LL, VA, AV, IA, AI, LA, AL, AA) plotted vs. TOXCAT signal ([CAT]). The

regression line of best fit and correlation coefficient are shown; the correlation meets

significance levels (p < 0.05). Figure adapted from (Cunningham et al. 2010).

2.3 Discussion.

2.3.1 β-Branched residues are required to mediate efficient association of the glycophorin

A homodimer in the membrane bilayer.

Large hydrophobic residues constitute nearly half of the amino acids in TM α-helices and

it has previously been shown that these residues (Ile, Val, or Leu) are often associated with Gly

at the i ± 1 positions in interacting α-helices (Russ and Engelman 2000; Senes et al. 2000).

Additionally, there is a tendency for large, hydrophobic amino acids to occur three residues

apart along TM α-helices, with VV, VI, VL, IV, II and IL constituting over-represented pairs, as

shown statistically by Senes et al. (Senes et al. 2000). The high content and structural similarities

among Ile, Leu, and Val raise the question not only as to how these residues affect GpA

dimerization, but more generally membrane protein folding and helix-helix packing. For

example, the mutations studied in the present work represent relatively modest changes to the net

hydrophobicity of the TM segment (Table 2.1); in fact, three pairs of such mutants (VI and IV;

VL and LV; II and LL) are each “iso-hydrophobic”. Additionally, Ile, Leu and Val represent the

amino acids with the highest propensity to form α-helical structures in apolar environments (Liu

and Deber 1998b; 1999).

Experimentally, studies addressing the relative importance of residues that are included

within the GpA dimerization interface have indicated that Val residues are important in

maintaining the ability of GpA to form a tight dimer in a membrane environment. Thus, in a

study performed by Lemmon et al., which made use of a heterologously expressed GpA chimeric

protein as a fusion with staphylococcal nuclease, changes to the migration of chimeric constructs

on SDS-PAGE were observed through single amino acid mutations (Lemmon et al. 1992a). For

example, it was shown that VI, VL, VA and AV mutations at positions 80 and 84 were able to

retain measurable levels of dimerization on SDS-PAGE, albeit at different levels from each

other, while the LV mutant was equivalent to WT (Lemmon et al. 1992a). Additionally, single

mutations to the residues involved in the GpA dimerization interface were also investigated by

Duong et al., who utilized activity of expressed CAT reporter protein in a TOXCAT assay to

determine the relative levels of dimerization, and the change in apparent free energy of self-

association of the GpA TM segment and single mutants thereof (Duong et al. 2007). These

authors reported a rank order of association propensities for different GpA single mutants,

describing such mutants as possessing association levels similar to WT (IV and LV); less than

WT (AV and VI); and no detectable dimer (VA and VL). Analytical ultracentrifugation on GpA

chimeric constructs has also been used to identify a hierarchy of GpA dimerization propensities,

where single GpA mutants at the 80 or 84 position retained oligomerization at the level of the WT construct

(AV, IV, and LV) or at a reduced level (VA, IV and VL) (Doura and Fleming 2004).

The TOXCAT system has several advantages in measuring the dimerization of TM

segments as it is easy to use, has proven extremely practical in establishing the sequence

determinants of dimerization (Johnson et al. 2006; Johnson et al. 2007), and provides a natural membrane

environment as an alternative to studying membrane protein folding in a mammalian membrane bilayer. The

TOXCAT assay also imposes a register to potentially interacting TM segments, which may serve

to weed out non-native TM α-helix contacts; this alignment of peptide orientation within the

membrane bilayer would not necessarily occur in detergent micelles as relative rotational

freedom may occur. There are, however, limitations to the use of the TOXCAT assay in

studying the association of TM segments within a membrane bilayer. By design, the TOXCAT

assay only reports on the monomeric or dimeric status of a TM segment. It does not report on

higher order structures which can be formed from TM segments such as trimers, tetramers or

pentamers as in phospholamban, a membrane protein involved in the regulation of Ca2+ transport

(Li et al. 2001). The TOXCAT assay is most useful when reporting on homodimeric

structures such as the GpA dimer studied here; it cannot provide information regarding the

hetero-oligomerization of TM segments. In multi-spanning membrane proteins the final folded

structure is generated from helix-helix contacts between non-equivalent TM segments. For

example, interactions between TM2 and TM5 have been identified in aquaporin-1 that form a

polar, quaternary structural motif that influences multiple stages of folding (Buck et al. 2007).

An additional complication of the TOXCAT assay is its limited utility in reporting on the relative

insertion of TM segments and mutants thereof into the membrane bilayer. How a mutation in a

TM segment can affect the percent insertion of a construct into the membrane bilayer is unclear,

and the maltose complementation assay may not therefore be sufficiently sensitive to report on

such differences.

This body of work specifically focused on identifying the residues important to GpA

dimerization by cataloguing the effects of individual mutations along the dimerization interface.

Differences in dimerization propensities exist among these published results, perhaps a

consequence of the fact that the methods of evaluation are spread across multiple techniques

(Lemmon et al. 1992b; Langosch et al. 1996; Fleming and Engelman 2001; Doura and Fleming

2004; Doura et al. 2004; Duong et al. 2007). In the present work, we undertook a systematic

evaluation of the role in GpA dimerization of the large residues specifically neighboring the GG4

motif to determine the hierarchy of dimerization promotion among these large aliphatic residues.

Our experimental study utilizes mutagenesis of Ile, Leu, and Val as ‘ridge’ residues in all

pairwise combinations at positions 80 and 84 of the GpA dimerization motif to define their role

as specific modulators of oligomerization. Our results suggest that at least one large,

hydrophobic β-branched residue is required to maintain GpA dimerization affinity at levels

equivalent to WT. Thus, mutation of the WT Val80 and Val84 residues along the GpA TM segment

in the context of the TOXCAT assay to various β-branched combinations (VV, VI, IV, II)

maintains levels of dimerization equal to, and in

some cases significantly greater than, WT. Both Val and Ile are additionally capable of

promoting effective dimerization in combination with single conservative mutations to Leu (VL,

LV, IL, LI) and with non-conservative mutations to Ala (VA, AV, IA, AI). However, the

‘absolute’ requirement of a β-branched residue in promoting dimerization is most vividly

apparent in mutational cycles of Leu (LL), Ala (AA) and Leu/Ala pairs (AL, LA), as these four

constructs showed a statistically significant reduction in propensity to dimerize relative to the WT (Fig. 2.2).

That the variations observed in dimer affinity by TOXCAT experiments do not stem directly

from altered secondary structures is confirmed by CD spectroscopy, wherein spectra of synthetic

peptides corresponding to selected GpA TM segments (VV, VI, IV, II, and LL) were found to be

uniformly helical in SDS media (Fig. 2.3).

The work presented here primarily addresses differences in GpA dimerization with

conservative mutations centered around the GG4 motif. Mutations at Val80/84 including all

combinations of Val, Ile and Leu affect GpA dimerization, and the importance of the β-branch in

the side chain to dimerization was shown through the TOXCAT assay. These mutants do not,

however, address the importance of both hydrophobicity and a β-branched side chain to GpA

dimerization. To address this, a preliminary analysis was conducted by replacing the WT Val

residues with Thr. The non-conservative mutation of Val80 or Val84 to Thr either maintained

dimerization at levels equivalent to WT, or completely disrupted it. The drastic differences in

dimerization depending on the location of the Thr residue are difficult to explain, especially as the

IT and TI mutants compared to the LT and TL mutants yielded opposite results. Thr87 is

considered part of the GpA dimerization interface, where it contributes to dimerization by

formation of a hydrogen bond with the backbone of the opposite helix (MacKenzie et al. 1997).

Previous studies investigating GpA dimerization have indicated that relatively polar residues in

the dimerization interface (G79 and T87) can be replaced with other relatively polar residues (G, S

and T) with little disruption to dimerization (Lemmon et al. 1992b); however, non-conservative

mutations of polar residues such as T87 to a hydrophobic side chain are disruptive to dimerization

(Lemmon et al. 1992b). To maintain efficient dimerization of the GpA TM segment it appears

that a combination of hydrophobicity and a β-branch in the side chain is required.

2.3.2 Hydrophobic β-branched residues may be structurally optimized for

transmembrane segment folding.

The insertion of TM segments into the membrane bilayer - which can be considered the

first stage in membrane protein folding – is largely driven by hydrophobicity via the cellular

machinery. The second stage of membrane protein folding involves the formation of tertiary or

quaternary structure, and results from association of two ‘preformed’ α-helices within the

membrane bilayer (Popot and Engelman 1990). Our finding that the amount of lipid solvation of

the ‘ridge’ structures involved in GpA packing inversely correlates with dimer affinity implies

that the inability of lipid to solvate the α-helical structure may help drive the helix-helix

interactions. Previously it has been noted that contact between lipid and protein is altered with

mutation of the Gly residues in the GpA dimerization motif, which in turn affects TM segment

dimerization. Our results extend that analysis and indicate the importance of lipid-solvation to

additional residues (Johnson et al. 2006). In the present work, we show that the dimer affinity

values in a library of 16 GpA ‘ridge’ mutants experimentally determined by the TOXCAT assay

inversely correlate with non-polar group lipid accessibility. As observed through energy-

minimized models of II, VV (WT), and LL GpA α-helices, mutation does alter the ‘ridge’

structure which in turn affects the local surface topology of the construct (Fig. 2.5). This altered

ridge structure produced through mutations to the WT Val residues at GpA positions 80 and 84

changes the dimerization propensity of the construct as helices with larger lipid accessible

surface area form weaker dimers due to greater contact with lipid. Conversely, efficient packers

such as Ile decrease the lipid accessible surface area and promote helix-helix contacts instead.

Figure 2.6. Models of the structure of the GG4 ‘ridge motif’ involved in GpA dimerization,

with the II, VV (WT) and LL mutants shown in order of increasing lipid accessibility and

decreasing dimer strength. The van der Waals radii of residues 75 – 90 are shown with the Gly

residues at positions 79 and 83 colored red. Side chain mutations at positions 80 and 84 to A) II

(green); B) WT VV (red); or C) LL (yellow) alter the local ‘ridge’ structure. This figure was

produced using energy minimized α-helices (see Materials and Methods) and PyMOL. Figure

adapted from (Cunningham et al. 2010).

While these lipid-helix interactions contribute to GpA dimerization, the strength of the

correlation implies that additional forces beyond ‘ridge’ lipid accessibility affect GpA dimer

strength: a combination of forces must therefore influence the dimerization of GpA. In

structural terms, Ile and Val represent optimal candidates for formation of a rigid ‘ridge’

structure to drive GpA dimerization, as β-branched residues such as Ile, Val and Thr only have

one populated rotamer as a consequence of residing in a membrane embedded α-helix, and

create optimal packing surfaces for TM oligomerization without additional losses in entropy

(MacKenzie et al. 1997; Senes et al. 2000; Liu et al. 2003). Alternatively, Leu is not as

conformationally restricted in this environment, and can rapidly sample a range of conformers in

an α-helix template (MacKenzie et al. 1996). An additional loss of entropy is therefore probable

upon dimerization via Leu – a consequence that is not shared by Ile or Val. While the TOXCAT

assay does not report on such biophysical considerations, our results are consistent with the

rotational freedom of Leu within a helix context precluding successful oligomerization events

relative to those observed for Ile and Val. Since neither raw hydropathy, β-branching nor entropic

considerations significantly separate the innate oligomerization capabilities of Ile and Val, the

source of increased dimerization likely relates to a ridge structure that is less solvated by lipid,

therefore producing a better packing surface within the locus of helix contact arising from

improved van der Waals packing. In this context, we noted that some Ile-containing mutants,

including VI, IV, and II, do show a statistically significant higher TOXCAT signal than the WT

VV construct.

The optimized structural surface of GpA that promotes dimerization is additionally an

interesting model of membrane protein folding as in vivo, GpA is not required to dimerize for its

function. The role that GpA plays in identifying individual blood groups and contributing to the

overall negative charge of the cell is not likely dependent on dimerization. Additionally, GpA

plays a role in the trafficking of the Anion Exchanger 1 (Band 3) to the cell surface, but this is

also not dependent on dimerization of GpA (Young et al. 2000). For example, mutations made

to the GpA dimerization interface to create a monomeric GpA protein resulted in the successful

trafficking of Band 3 to the Xenopus oocyte plasma membrane in the same way as wild-type GpA,

showing that the GpA monomer is sufficient to mediate this process (Young et al. 2000). The

lack of importance that dimerization has to GpA function indicates that this structure may simply

provide an optimized folding surface, and GpA as a model system may provide information

regarding how TM proteins generally fold when an optimized surface is available to do so. From

a structural standpoint, the dimerization of GpA could be a measure of protection against

unwanted proliferative oligomerization within the membrane bilayer with other TM segments.

The dimerization of GpA through its optimized interface may be an example of a mechanism

which constrains assembly of TM segments in the membrane, providing a balance in the

production of different structures necessary for homeostasis while evading aggregation (Rath and

Deber 2007).

2.3.3 Modulating helix interactions.

A considerable amount of work has gone into identifying residues involved in promoting

the dimerization of TM segments, but the ability to modulate dimerization with

mutation, and thus understand the nuances of membrane protein folding, has gone largely

unexplored. Several examples of TM segments which dimerize via a GG4 motif have been

investigated including MCP, MZP, and BNIP3, but these examples also contain large

hydrophobic residues at the ± i, ± i + 4 positions relative to their GG4 motif, and their

importance in oligomerization has yet to be investigated [see (Rath et al. 2009b) for a review].

The ability to modulate helix interactions via single amino acid mutations has implications in

disease states and highlights the importance of investigating even conservative mutations. For

example, differences in oligomerization affinities among family members, or proteins which fold

in a structurally similar manner yet with different affinities may be explained by studies

considering various amino acid mutations.

It also is likely that membrane proteins have evolved to include variations in helix contact

strength with regard to both structure and function of the protein. For example, α-helices

involved in maintaining protein structure or rigid portions of the protein may have evolved tight

packing loci between helices. Weaker helix interactions would be advantageous in TM segments

that undergo structural or rotational changes as part of the functional process. A variety of helix

interaction strengths is compatible with the overall protein fold when viewed from a

structure/function relationship, and is most likely essential for function.

2.3.4 Conclusion

In the present work, we have performed a systematic experimental evaluation of the

nuanced roles for Val, Ile, Leu, and Ala when these amino acids occur in the positions 80 and 84

adjacent to the Gly79/Gly83 oligomerization motif of GpA. We found that minor changes to the

oligomeric interface (i.e., Val to Ile) can modulate the oligomerization status of GpA and

accordingly have implications for local protein folding. Notably, our results demonstrate that at

least one β-branched residue at position 80 or 84 is essential for significant GpA dimerization.

Given that the membrane domains of proteins are essentially devoid of disulfide bonds, and

feature a limited content of stabilizing side chain-side chain hydrogen bonding sites, the

widespread occurrence of β-branched residues in TM segments may well stem from a

requirement for the enhanced structural stability provided by the space-filling capacity of their

side chains.

2.4 Materials and Methods.

2.4.1 TOXCAT Assay.

The expression vector pccKAN and the MBP-deficient (malE-) E. coli strain NT326 were

kindly provided by Dr. Donald M. Engelman, Yale University (Russ and Engelman 1999). The

E. coli strain NT326 lacks the endogenous MBP, resulting in the inability of the cells to transport

maltose into the cytoplasm and utilize this compound as a carbon source (Schneider and

Engelman 2003). TOXCAT chimeras fusing the TM sequence of GpA between the ToxR and

MBP have been previously described (Johnson et al. 2006). Mutants of GpA were produced by

mutating the WT GpA construct via the QuikChange site-directed mutagenesis kit (Stratagene).

The identity of all constructs was confirmed with DNA sequencing before further

characterization. Constructs were transformed into NT326 cells, grown to an OD600nm of 0.6 and

assayed for construct expression along with expression of the reporter gene CAT as previously

described (Johnson et al. 2006). CAT measurements and construct expression measurements

were performed in at least triplicate and were normalized for the relative expression level of each

construct using Western blotting as described (Johnson et al. 2006). Densitometry was used to

measure differences in construct expression, and was performed using the program ImageJ

(http://imagej.nih.gov/ij/). A cytoplasmic version of MBP, c2x (pMAL-c2x MBP fusion protein

containing no TM or ToxR domain, New England Biolabs) was used as a negative control for

oligomerization, as it has been used in previous studies (Russ and Engelman 1999) and

represents background CAT expression (Riggs 2001). Significant differences in oligomerization

relative to WT (VV) were tested via the t-test, while significant differences in oligomerization

among mutant constructs were compared via an online one-way ANOVA test

(http://faculty.vassar.edu/lowry//anova1u.html). NT326 cells expressing TOXCAT chimeras

with WT and mutant GpA sequences were also streaked onto M9 minimal plates with 0.4%

maltose as the only carbon source to confirm membrane insertion. Constructs that grow under

these conditions target the MBP into the periplasm and utilize maltose as the sole carbon source

(Russ and Engelman 1999). Transformant growth was evaluated for all constructs after

incubation for 2 days at 37°C.
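
The normalization step described above can be sketched as follows. This is a minimal illustration, assuming that each CAT reading is simply divided by the relative expression level of the same replicate (from densitometry) and that a mutant is then reported as a fraction of the WT mean; the function names and all numbers are hypothetical and are not data from this work.

```python
from statistics import mean


def normalized_cat(cat_activity, expression):
    """Divide each CAT reading by the relative expression level of that replicate."""
    if len(cat_activity) != len(expression):
        raise ValueError("one expression value is needed per CAT reading")
    return [c / e for c, e in zip(cat_activity, expression)]


def relative_to_wt(mutant_norm, wt_norm):
    """Mean normalized signal of a mutant expressed as a fraction of the WT mean."""
    return mean(mutant_norm) / mean(wt_norm)


# Hypothetical triplicate readings, for illustration only.
wt = normalized_cat([100.0, 110.0, 90.0], [1.0, 1.1, 0.9])
mutant = normalized_cat([60.0, 45.0, 55.0], [1.2, 0.9, 1.1])
print(round(relative_to_wt(mutant, wt), 2))
```

Significance testing relative to WT (t-test) or among mutants (one-way ANOVA) would then be applied to the normalized replicate values rather than to the raw CAT readings.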

2.4.2 Peptide Synthesis.

GpA peptides corresponding to amino acids 73-94 of the full-length protein were

synthesized using a PS3 peptide synthesizer (Protein Technologies, Inc.) via Fmoc chemistry.

Four lysine residues were added to both the N- and C-termini to increase the solubility of the

peptide (Melnyk et al. 2003). A 0.1-mmol scale synthesis was used with the O-(7-

azabenzotriazol-1-yl)-N,N,N’,N’-tetramethyl-uronium hexafluorophosphate; N,N-

diisopropylethylamine activator pair, with a 4-fold amino acid excess. Peptide cleavage and

deprotection were carried out using a solution of 88% trifluoroacetic acid, 5% phenol, 5%

ultrapure water, and 2% triisopropylsilane. The cleavage product was precipitated into ice-cold

diethyl ether, dried, and resuspended into ultrapure water. An amidated C terminus upon peptide

cleavage was produced by utilizing a low load (0.18–0.22 mmol/g) Fmoc-PAL-polyethylene

glycol-polystyrene resin. Cleaved peptides were purified by reverse phase high performance

liquid chromatography on a C4 preparative column (Phenomenex) with a water/acetonitrile

gradient in the presence of 0.01% trifluoroacetic acid. Peptide molecular weights were

confirmed by mass spectrometry. The Micro BCA assay (Pierce) was used to determine peptide

concentration.

2.4.3 Circular Dichroism.

Circular dichroism spectra of peptides were recorded on a Jasco J-720 CD spectrometer

at room temperature. Spectra in SDS (25 mM SDS, 10 mM Tris, pH 7.2, 10 mM NaCl) were

recorded using a 1 mm path length cuvette at peptide concentrations of 25 μM. All spectra were

background subtracted and converted to mean residue molar ellipticity (MRE [deg cm² dmol⁻¹]).

Mean residue ellipticities shown are the average of three separate scans, and statistical

differences at 222 nm were determined relative to the WT peptide (VV) via a t-test.
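
The conversion to mean residue ellipticity can be illustrated with the standard relation MRE = θ / (10 · c · l · n), where θ is the background-subtracted signal in millidegrees, c the molar peptide concentration, l the path length in cm, and n the number of residues. The sketch below is an assumption about how the conversion would typically be done, not a record of the instrument software used here; the observed ellipticity, the micromolar concentration, and the residue count in the example are hypothetical (some laboratories use the number of peptide bonds, n - 1, instead of n).

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to MRE in deg cm^2 dmol^-1."""
    if min(conc_molar, path_cm, n_residues) <= 0:
        raise ValueError("concentration, path length and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical reading at 222 nm: -15 mdeg, 25 micromolar peptide (assumed),
# 1 mm (0.1 cm) cuvette, 30-residue peptide (assumed length).
print(mean_residue_ellipticity(-15.0, 25e-6, 0.1, 30))
```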

2.4.4 Glycophorin A Helix Solvation Calculations.

The sequences of the WT and mutant GpA TM segments (Fig. 1) were modeled as single

monomeric energy-minimized α-helices using the CNS program suite (Brunger et al. 1998). The

corresponding PDB files generated by CNS each contained a single α-helix structure and were

analyzed using NACCESS to estimate the relative lipid accessibility of each monomer as

described previously (Johnson et al. 2006). Interfacial lipid accessibility was determined by

dividing the sum of the lipid accessibility values calculated for each residue defined in the GpA

dimerization interface (Leu75, Ile76, Gly79, Val80, Gly83, Val84, Thr87) (MacKenzie et al. 1997), by

the number of residues included in the interface. Correlation analysis between TOXCAT and

lipid accessibility was performed using the Prism software program.
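
The interfacial accessibility calculation and the subsequent correlation can be sketched as below. This is an illustrative reconstruction only: per-residue accessibility values are assumed to have been parsed from the NACCESS output into a dictionary keyed by residue number (the parsing itself is not shown), and a Pearson coefficient is used as a stand-in for the correlation analysis, whose exact type is not specified in the text. All numbers are hypothetical.

```python
from math import sqrt

# Residue numbers of the GpA dimerization interface listed in the text.
INTERFACE = (75, 76, 79, 80, 83, 84, 87)


def interfacial_accessibility(per_residue, interface=INTERFACE):
    """Sum the accessibility of the interface residues and divide by their number."""
    return sum(per_residue[pos] for pos in interface) / len(interface)


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed here; the text names only the software)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-residue values for one construct; residue 90 is ignored.
example = {75: 10.0, 76: 20.0, 79: 0.0, 80: 30.0, 83: 0.0, 84: 30.0, 87: 15.0, 90: 99.0}
print(interfacial_accessibility(example))
```

An inverse relationship between accessibility and dimer affinity, as reported in this Chapter, would appear as a negative coefficient when the per-construct accessibility values are paired with the corresponding TOXCAT signals.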

Chapter 3: Distinctions between hydrophobic helices in globular

proteins and transmembrane segments as factors in protein sorting.

This work was published, in part, by Cunningham, F., Rath, A., Johnson, R.M., Deber, C.M. J

Biol Chem 284: 5395-402 (2009).

Author Contributions: FC and AR designed research. FC performed research and AR

contributed to database construction, analysis of position of charged residues within sequences

and provided assistance with statistical analysis. FC, AR and CMD analyzed the data. FC and

CMD wrote the paper.

3.1 Introduction

As high-resolution structural determination of membrane proteins is not yet routine,

computer simulation methodologies are often used to evaluate helical protein-protein

(Cuthbertson et al. 2006), and protein-lipid interactions (Choi et al. 2004; Johnson et al. 2006).

As a pre-requisite to such studies, protein helical TM segments and their boundaries must be

defined. Hydropathy plots are commonly utilized to identify from primary sequence both the

location of TM segments and their approximate entry/exit points (Kyte and Doolittle 1982;

Engelman et al. 1986; Cserzo et al. 2002; Zhao and London 2006). The hydropathy values

assigned to each residue to create such plots are drawn from one or more scales, among them the

Liu-Deber values developed in our laboratory (Rath and Deber 2008). Transmembrane segments

– and peptides derived from them – that meet or exceed a segmentally-averaged ‘threshold’

hydropathy on this scale (approximately equivalent to a poly-Ala strand) have been shown to

spontaneously insert into micellar environments (Liu and Deber 1999). The average hydropathy

levels of ~96% of natural TM segments were also found to exceed this threshold value (Liu and

Deber 1999).

From these observations, in conjunction with measurements of residue helical propensity

in n-butanol (Liu and Deber 1998b), our laboratory developed the TM segment prediction

program TM Finder (Deber et al. 2001) that uses segmental Liu-Deber hydropathy and non-polar

phase helicity values to query primary sequences for potential TM segments. TM Finder

demonstrated a 98% predictive value in pinpointing TM segments in a training set of known

membrane proteins (Deber et al. 2001). TM Finder was additionally specifically trained to limit

the occurrence of false positives, i.e., globular (soluble) protein regions mispredicted as

membrane-embedded. For this purpose, the initial TM Finder code was applied to a sequence

database of globular proteins of known tertiary structure to assess helices that were of sufficient

length (estimated at 19 residues) to span the membrane bilayer. Of a total of 174 α-helical

globular protein regions from 134 different proteins in this database, we observed that ~30%

were identified as potential TM segments from the primary sequence alone, but could be

separated computationally from bona fide TM sequences based on the presence of ≥ 3 charged

residues. We subsequently termed these TM-like sequences that occur within helical globular

proteins as “δ-helices” to reflect their intermediacy between TM properties and

extramembranous localization (Wang 2000). The work presented within this Chapter provides

an opportunity to improve the characterization of bona fide TM domains: we sought to extend

this preliminary computational separation of δ-helices, globular helices, and TM segments with a

view towards understanding the criteria that may serve to distinguish these sequence features of

globular proteins from TM segments in vitro, or even within the cell.
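
The preliminary charged-residue criterion mentioned above (≥ 3 charged residues) can be expressed as a short filter. The sketch below is illustrative rather than the TM Finder implementation: it assumes that Asp, Glu, Lys and Arg are the residues counted as charged, and the function names are hypothetical.

```python
# Residues treated as charged (assumption: D, E, K and R only; His is not counted).
CHARGED = frozenset("DEKR")


def count_charged(sequence):
    """Number of charged residues in a one-letter-code sequence."""
    return sum(1 for residue in sequence.upper() if residue in CHARGED)


def flagged_as_extramembranous(sequence, min_charged=3):
    """True when the segment meets the >= 3 charged residue criterion."""
    return count_charged(sequence) >= min_charged


# A hydrophobic helix from a soluble protein (Lys tags removed) versus a
# purely non-polar, hypothetical segment.
print(flagged_as_extramembranous("ADAAWTKLFGLIIDALKAA"))
print(flagged_as_extramembranous("LLLLAAAALLLLAAAALLL"))
```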

3.2 Results

3.2.1 Hydropathy of δ-helices.

The globular protein sequences ≥ 19 residues in length considered in our initial study

were divided into globular helix and δ-helix categories based on their mean segmental Liu-Deber

hydropathy values (see Methods). A database of confirmed TM α-helical segments was also

assembled from a non-redundant set of protein TM segments with available high-resolution

structures (Appendix 1, Table A1.1); this TM helix database contains 212 TM segments from 37

non-redundant membrane proteins. We note that the length criterion of ≥19 residues was not

applied to the TM helix database because each TM segment has been identified as residing in the

membrane bilayer via high-resolution structure determination. The mean Liu-Deber hydropathy

of each sequence in the globular, δ- and TM helix classes was calculated (see Appendix 1,

Tables A.1 – A.3) and averaged for each group. We found that the mean hydropathy values of

the δ- and TM helix classes (0.72 ± 0.26 and 1.31 ± 0.69, respectively) each exceeded the Liu-

Deber threshold value for membrane insertion [≥ 0.4, see ref (Liu and Deber 1998a)], while the

mean hydropathy value of globular helices (-0.28 ± 0.41) was below this threshold. Therefore,

on average, δ-helices have intermediate hydropathy: greater than that observed for globular

helices but less than the TM helix value (p ≤ 0.01 in both comparisons).
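
The mean segmental hydropathy calculation can be sketched as follows. This is a hedged illustration: the hydropathy scale is passed in as a dictionary, and the small scale used in the example contains placeholder numbers that are NOT the Liu-Deber values. The rule that a globular-protein helix is classed as a δ-helix when its mean hydropathy meets the 0.4 threshold is an assumption based on the description above; the exact procedure is given in Methods.

```python
def mean_hydropathy(sequence, scale):
    """Average the per-residue hydropathy values over the whole segment."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(scale[residue] for residue in sequence) / len(sequence)


def classify_globular_segment(sequence, scale, threshold=0.4):
    """Assumed rule: mean hydropathy at or above threshold -> delta-helix."""
    if mean_hydropathy(sequence, scale) >= threshold:
        return "delta-helix"
    return "globular helix"


# Placeholder scale for illustration only; these are not Liu-Deber values.
toy_scale = {"L": 2.0, "A": 0.5, "K": -2.0}
print(classify_globular_segment("LLAK", toy_scale))
print(classify_globular_segment("AKKA", toy_scale))
```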

3.2.2 Hydrophobic and charged/polar residue content in δ-helices.

The amino acid compositions of δ-, globular, and TM helix groups were determined (see

Methods) to investigate whether the intermediate hydropathy of δ-helix segments vs. other

globular helices and TM segments arose from decreased numbers of hydrophobic residues,

increased numbers of polar residues, or both. As expected, TM helix segments contained ~1.4-

fold more hydrophobic, ~2-fold fewer polar, and ~3-fold fewer charged residues than globular

helices (p < 0.0001, see Table 3.1). δ-Helix percentage occurrence values in these residue

categories, however, presented as different from the TM and globular groups (p < 0.0001), and

appeared to be intermediate between them (Table 3.1). For example, the average percentage

occurrence of hydrophobic residues per δ-helix (59.2% ± 6.2%) lies between the globular and

TM helix values (47.1% ± 7.8% and 67.3% ± 9.2%, respectively). The distribution of

hydrophobic and polar/charged amino acid residue types in δ-helices thus appears to be

transitional between TM and globular helix segments.

Table 3.1. Percent occurrence of hydrophobic, polar and charged residues per helix.

Segment           % Hydrophobic a    % Charged      % Polar
TM helix          67.3 ± 9.2         7.6 ± 6.8      32.7 ± 9.4
δ-helix           59.2 ± 6.2         19.2 ± 8.5     40.8 ± 6.1
Globular helix    47.1 ± 7.8         26.7 ± 9.0     59.2 ± 7.7
p-value b         < 0.0001           < 0.0001       < 0.0001

a Percentage occurrence values may not sum to 100% due to inclusion of residues D, E, K, and R in both the polar and charged categories. See Methods for details.

b Statistical significance between categories determined by ANOVA testing. Errors represent one standard deviation of group compositions from the mean.
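
The per-helix percentages summarized in Table 3.1 could be computed along the following lines. This is a sketch under stated assumptions: D, E, K and R are counted in both the charged and polar categories (as noted in the table footnote), every residue not classed as hydrophobic is treated as polar, and the membership of the hydrophobic set shown here is a placeholder, since the actual residue groupings are defined in Methods.

```python
CHARGED = frozenset("DEKR")
# Assumed hydrophobic set for illustration; see Methods for the real grouping.
HYDROPHOBIC = frozenset("ACFILMVW")


def percent_composition(sequence, hydrophobic=HYDROPHOBIC, charged=CHARGED):
    """Percent hydrophobic, charged and polar residues in one helix."""
    n = len(sequence)
    if n == 0:
        raise ValueError("empty sequence")
    n_hydrophobic = sum(1 for r in sequence if r in hydrophobic)
    n_charged = sum(1 for r in sequence if r in charged)
    n_polar = n - n_hydrophobic  # charged residues are also counted as polar
    return {
        "hydrophobic": 100.0 * n_hydrophobic / n,
        "charged": 100.0 * n_charged / n,
        "polar": 100.0 * n_polar / n,
    }


# Hypothetical eight-residue segment.
print(percent_composition("LLLLKDSG"))
```

Group means and standard deviations would then be taken over the per-helix values within each of the three helix databases.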

3.2.3 Amino acid composition of δ-helices vs. transmembrane and other globular helices.

To further probe the origins of the compositional distinctness of δ-helices vs. globular

and TM segments, we compared individual amino acid percentage occurrence frequency among

the three helix categories (Fig. 3.1). Consistent with the results of other groups (Senes et al.

2000), we observed that certain hydrophobic residues (i.e. Phe, Ile, Leu, Val, and Trp; Fig. 3.1A)

were significantly enriched (p ≤ 0.01) in TM helices vs. globular helical segments; the situation

was reversed for polar and charged residues (i.e. Asp, Asn, Glu, Gln, Gly, His, Lys, and Arg;

Fig. 3.1A). These trends are readily rationalized given the respective intramembranous vs.

cytoplasmic localization of these sequences in their native proteins. Similarly, we noted that

δ-helices contained significantly more Asp, Glu, and Arg residues than TM segments (p ≤ 0.05,

see Fig. 3.1B), results consistent with our previous observation that the presence of three

charged residues could delineate δ-helices as extramembranous (Deber et al. 2001).

Interestingly, the observed decrease in hydropathy of δ-helices vs. TM segments could be traced

to two individual residues, viz., fewer Ile and Val residues are present in δ-helix sequences than

in TM segments (p ≤ 0.01); conversely, the content of all other residues classed as hydrophobic

is statistically indistinguishable (p ≥ 0.05, Fig. 3.1B).

Overall, the individual residue composition of δ-helices appeared to be more similar to

globular than TM segments (compare Fig. 3.1B and 3.1C); however, δ-helices were significantly

enriched in the hydrophobic residues Leu and Phe, and depleted in Glu, Lys, and Gln compared

to their counterparts in globular proteins (Fig. 3.1C). In terms of hydropathy, δ-helices appear to

be distinguished from bona fide TM segments based largely on a decreased content of β-branched

hydrophobic residues.

Figure 3.1. Comparison of globular helix, δ-helix and TM helix amino acid composition. Mean

residue percent occurrence values are indicated with blue, red, and green bars, respectively.

Error bars represent one standard error of measurement. Residue percentage occurrence values

of A) globular and TM helices; B) δ-helices and TM helices; and C) globular and δ-helices are

shown. Comparisons were made among categories with a one way ANOVA, then individual t-

tests. Asterisks above the bars indicate statistically significant differences (p ≤ 0.05, *; p ≤ 0.01,

**). Figure adapted from (Cunningham et al. 2009).

3.2.4 δ-helices are more buried within their native folds than other globular helices.

Because δ-helices contain greater percentages of hydrophobic residues and lower

percentages of polar/charged residues than other globular helices, we hypothesized that δ-helices

might be more buried within their native protein folds. To determine if this was the case, the

locations of δ- and globular helices within their native structures were mapped using a water-

sized probe to calculate residue solvent accessibility (see Methods). Helices within the TM

database were excluded from this analysis due to their established burial within the membrane

bilayer. δ-Helices were found on average to be more buried than other globular helices within

their native folds (mean solvent accessibility values of 21.0% ± 7.7% vs. 28.9% ± 8.4%,

respectively, p ≤ 0.01). Moreover, the distribution of solvent accessible residues differs between

the δ-helix and globular helix databases (Fig. 3.2A); the majority of δ-helix segments are 15-

20% solvent accessible, while the majority of globular helices have 25-30% accessibility. Rather

than being confined to a single residue category, however, the increased accessibility of globular

vs. δ-helices is reflected across all residue groupings (Fig. 3.2B), i.e., hydrophobic, polar, and

charged residues are ~1.3-1.4x more exposed in globular vs. δ-helices.

Figure 3.2. Solvent accessibility of globular vs. δ-helices. A) Solvent accessibility distribution

of all residues. Residues in δ-helices are more buried than other globular helix residues, with

mean solvent accessibility values of 21.0 ± 7.7 % vs. 28.9 ± 8.4 %, respectively (mean ± S.D., p

≤ 0.01). B) Solvent accessibility of hydrophobic, polar, and charged residues in globular vs.

δ-helices. Mean solvent accessibility values are indicated with blue and green bars, respectively.

Error bars represent one standard deviation of individual compositions from the mean value. All

residue categories exhibit reduced mean solvent accessibility in δ- vs. globular helices (p ≤ 0.001

in t-tests). See Methods for details of solvent accessibility calculations. Figure adapted from

3.2.5 Folding of the δ-helix peptides in aqueous and membrane mimetic environments.

Since δ-helix segments represent sequences that are mispredicted as transmembranous,

we initiated experiments to examine the physical properties of δ-helices vs. TM helices in vitro.

Accordingly, we synthesized peptides corresponding to five δ-helix segments selected from the

database based on the presence of an intrinsic Trp residue for anticipated fluorescence

experiments. Segments were selected from erythrocruorin (1ECA), myoglobin (1MBA),

hemocyanin A chain (1HC1), mandelate racemase (2MNR), and L-lactate dehydrogenase

(5LDH); see Table 3.2 for corresponding δ-helix sequences. The hydrophobicity of the δ-helix

sequences necessitated the introduction of Lys residues at their termini in order to facilitate

synthesis and characterization (Melnyk et al. 2003); such ‘Lys-tags’ have been shown not to

interfere with the core peptide sequence of interest (Liu and Deber 1998b; Melnyk et al. 2001).

The location of each of these δ-helix segments in their native protein structures is shown in Fig.

Table 3.2. Sequences of synthesized δ-helix peptides.

PDB ID    Protein Name               Sequence a                    H b
1ECA      Erythrocruorin             K-FAGAEAAWGATLDTFFGMIF-KK     1.10
1HC1      Hemocyanin A chain         KK-ELFFWVHHQLTARFDFERL-K      0.77
1MBA      Myoglobin                  K-ADAAWTKLFGLIIDALKAA-K       0.96
2MNR      Mandelate racemase         K-GLIRMAAAGIDMAAWDALGKV-K     0.53
5LDH      L-lactate dehydrogenase    KK-GYTNWAIGLSVADLIESMLK       0.52

a Lys residues (offset by hyphens) were added to each peptide to enhance aqueous solubility. See Methods for details.

b Mean Liu-Deber segmental hydrophobicity of each peptide, excluding Lys tags.

Figure 3.3. Ribbon diagrams of helical globular proteins containing δ-helix regions studied in

this work. Alpha-carbon backbones of each protein are indicated in grey, with δ-helices shaded

in green. Proteins and PDB identifiers are as follows: Erythrocruorin (1ECA); Hemocyanin A

chain (1HC1); Myoglobin (1MBA); Mandelate racemase (2MNR); L-lactate dehydrogenase

(5LDH). Figure adapted from (Cunningham et al. 2009).

Circular dichroism spectra were obtained for each δ-helix peptide in aqueous buffer, and

in buffer containing SDS or sodium perfluorooctanoate (SPFO). Although each sequence

demonstrably adopts α-helical structure within its soluble protein tertiary fold, none of the

peptides exhibited a large amount of helical structure in aqueous buffer (Fig. 3.4). The 1MBA

and 1HC1 peptides, however, displayed a degree of helical character in aqueous solution

(minima at 208 nm and 222 nm; Fig. 3.4) that was not observed in the 1ECA, 2MNR, and 5LDH

δ-helix peptides. We noted that there is a general trend for the α-helical content of the δ-helix

peptides to follow their Chou-Fasman secondary structure propensity (Pα) in aqueous solvent

(Chou and Fasman 1978) (Fig. 3.5). For example, 1MBA and 1HC1 have the highest calculated

α-helical structural propensity (Table 3.3), and also the greatest amount of helical structure (Fig.

3.5). The lack of strong aqueous helicity in the 1ECA and 2MNR δ-helix peptides may be

similarly rationalized by the relatively high Chou-Fasman β-strand structural propensity (Pβ)

predicted for these segments (Table 3.3). Interestingly, each δ-helix peptide sequence exhibits

regions with overlapping α and β secondary structure prediction (Table 3.3), suggesting that they

may represent segments with competing secondary structure preferences. Sodium

perfluorooctanoate was used in these studies as it represents a ‘mild’ detergent that is thought to

preserve helix-helix interactions denatured by SDS. Sodium perfluorooctanoate has been shown

to preserve native quaternary structures of membrane proteins (Rath et al. 2006). Since SPFO

allows retention of native contacts, it presumably also preserves native secondary structure,

so a comparison to SDS is useful. The results in Figure 3.4 indicate that the δ-helix

peptides are helical in both SDS and SPFO.

Figure 3.4. Circular dichroism spectra of δ-helix peptides in various media. Spectra in aqueous
buffer (left), buffer containing SDS (centre), and buffer containing SPFO (right) are shown.
Changes in mean residue ellipticity values observed at 208 nm and 222 nm for each δ-helix
peptide in detergent solution are consistent with increased α-helical structure. Figure

Table 3.3. Predicted secondary structure of δ-helix peptides.

δ-helix   Predicted α-region a   Pα     Predicted β-region   Pβ
1ECA      1 – 19                 1.11   5 – 11               1.17
1HC1      1 – 18                 1.17   5 – 10               1.17
1MBA      1 – 18                 1.17   5 – 13               1.21
2MNR      2 – 20                 1.15   2 – 12               1.20
5LDH      5 – 19                 1.13   2 – 18               1.09

a Residues in each δ-helix peptide are numbered beginning with the first residue after the Lys tag
(see Table 3.2 for sequences). Pα and Pβ were computed using Chou-Fasman structure
propensities (Chou and Fasman 1978). Lys tags were excluded from propensity calculations.

Figure 3.5. δ-helix secondary structure compared with Chou-Fasman α-helix propensity. The
mean residue ellipticity at 222 nm determined from the CD spectrum of each peptide in aqueous
buffer is given as a function of the Chou-Fasman aqueous α-helix propensity of the 1ECA,
1HC1, 1MBA, 2MNR and 5LDH peptides. The regression line of best fit, correlation coefficient
and associated p-values are shown; there is a trend that the aqueous helicity of δ-helix peptides
follows the Chou-Fasman prediction of aqueous α-helix propensity (0.05 ≤ p ≤ 0.10). Figure

All δ-helix peptides increased in α-helix content when exposed to the membrane-mimetic
environments of SDS or SPFO micelles (Fig. 3.4). Induction of helical structure in apolar media
has been observed with TM proteins such as GpA and the epidermal growth factor receptor
(Melnyk et al. 2001), but can also occur when intact globular proteins are exposed to SDS
micelles [reviewed in (Imamura 2006)]. In both instances, ordered secondary structures such as
α-helices are thought to arise in the hydrophobic environment of the detergent micelles in order
to satisfy the hydrogen-bonding requirement of the peptide backbone in the low-dielectric
environment of detergent acyl chains. δ-Helix peptide exposure to an environment of reduced
polarity in SDS micelles was confirmed by examining the Trp fluorescence emission spectra of
the δ-helix peptides. Trp has a characteristic fluorescence emission maximum of approximately
350 nm in an aqueous environment; blue shifting of this maximum to a lower wavelength
accompanies accommodation of the Trp side chain in a more hydrophobic environment (Netz et
al. 2002). Indeed, blue shifts in Trp emission maxima were observed for each δ-helix peptide in
the presence of SDS (Table 3.4), suggesting that each is solvated in the apolar SDS micelle
interior.

Table 3.4. Tryptophan emission maxima of δ-helix peptides in various media.

δ-helix   λmax, aqueous (nm)   λmax, SDS (nm)   Blueshift a (nm)
1ECA      348                  336              12
1MBA      348                  330              18
1HC1      344                  335              9
2MNR      348                  337              11
5LDH      348                  335              13

a Blueshift = (λmax, aqueous – λmax, SDS).

3.2.6 Competence of δ-helix segments for in vivo membrane insertion.

The behavior of the δ-helix peptides in SDS micelles implies that these sequences are
competent for solvation by micelles in vitro, but offers no information as to whether or not they
meet the requirements for in vivo membrane incorporation. We therefore undertook to
investigate whether δ-helix segments could mimic TM sequences by assessing their ability to
insert and self-associate in the E. coli inner membrane using the TOXCAT assay (Russ and
Engelman 1999) (see Methods). Of the five δ-helix sequences evaluated in the TOXCAT assay,
only 1ECA was capable of self-association at levels higher than the monomeric GpA G83I
control, with a self-association strength ~60% of the GpA dimer (Fig. 3.6). The remaining
δ-helix sequences reported lower levels of association strength than the monomeric control, with
the possible exception of the 1MBA sequence. Since the correct membrane insertion of each
fusion protein was confirmed by growth of NT326 cells expressing each fusion protein construct
on M9-maltose plates (not shown), the latter data suggest that whether or not self-association
occurs, at least a portion of each δ-helix TOXCAT construct is correctly incorporated in the E.
coli inner membrane.

Figure 3.6. TOXCAT assay of δ-helix peptides in the E. coli inner membrane. Mean levels of
CAT expression from δ-helices relative to the wild-type GpA dimer are shown ± S.D. (top)
along with a representative Western blot used to evaluate protein expression levels (bottom).
Blot bands excerpted from separate gels are indicated by solid lines between lanes. G83I denotes
the GpA G83I mutant, used as a monomeric control (Russ and Engelman 1999). The mean CAT
expression levels were compared in t-tests; symbols above the bars denote significance level:
0.05 ≤ p ≤ 0.10, +; p ≤ 0.05, *; p ≤ 0.01, **. See Methods for assay details. Figure adapted from

3.2.7 Charged residue distribution distinguishes δ-helix and transmembrane sequences.

Given that δ-helices are comparable in length and in overall hydropathy/helicity to native
TM segments, we inquired whether the TOXCAT results might be explained by other
characteristics of δ-helix sequences. For example, the bilayer integration efficiency of model
TM segments in vivo has been shown to depend strongly on the position of charged residues, i.e.,
when they are placed towards the ends of sequences, insertion efficiency is increased (Hessa et
al. 2007). As such, it is possible that δ-helix sequences might be fundamentally distinguished
from TM sequences based on charged residue positioning. The δ-helix, TM helix, and globular
helix databases were accordingly queried for charged residue position (Fig. 3.7A). We found
that charged residues were essentially evenly distributed along the lengths of δ- and globular
helix sequences (p ≥ 0.1 compared to expected frequencies based on an even distribution). In
contrast, charged residues in TM segments were more abundant within the first 20% or last 20%
of segment length than in the middle (i.e. near the ends of TM helices). The sequences of
δ-helices and TM segments can thus be computationally distinguished based on charged residue
distribution. The notion that factors other than raw hydropathy must influence the membrane
insertion propensity of a given δ-helix segment is also supported by the lack of correlation
between segmental hydropathy and the apparent free energy of membrane insertion (∆Gapp),
calculated using biological partitioning measurements (Hessa et al. 2005a; Hessa et al. 2007)
(Fig. 3.7B).

Figure 3.7. Charged residue positioning in δ- vs. TM helices. A) Distribution of charged
residues in globular, δ-, and TM helix sequences. The frequency of occurrence of charged
residues (D, E, K, and R) at various positions along the length of the helix sequence is indicated.
Charged residues are distributed equally (p ≥ 0.1) along the lengths of globular and δ-helix
sequences, but distributed unevenly over TM helix lengths (p ≤ 0.0001). B) Comparison of
δ-helix partitioning under ‘in vivo’ vs. ‘in vitro’ conditions. The insertion efficiency of δ-helix
sequences under in vivo conditions is given as the apparent free energy of membrane insertion
(∆Gapp), calculated using biological partitioning measurements (Hessa et al. 2005a; Hessa et al.
2007); the in vitro condition is represented by Liu-Deber segmental-averaged hydropathy (Rath
and Deber 2008). Figure adapted from (Cunningham et al. 2009).

3.3 Discussion.

3.3.1 Role in globular proteins.

δ-Helices represent ~30% of the globular-based α-helices investigated. This relatively
frequent occurrence implies that these segments may be of some utility in the proteins that
contain them, and considerable evidence supports a major role for hydrophobic interactions in
globular protein folding (Dill et al. 2008). We nevertheless could discern no trend in terms of
localization of δ-helix segments to known surfaces of protein-protein interaction, or segregation
into a particular globular protein type (data not shown). However, δ-helices are by definition
highly apolar, and display increased burial within their native protein folds vs. other globular
helices (Fig. 3.2A, B). As such, we suspect that δ-helix sequences may be important to the
stability of proteins that contain them, perhaps via sequestration of their hydrophobic residues in
the protein interior during folding.

δ-Helix segments are nevertheless unable to adopt their helical native structures without
input from the remainder of the protein, perhaps because of their competing secondary structure
preferences (Fig. 3.4, Table 3.3). In fact, δ-helix sequences generally display strong propensities
to exist as β-sheet type segments when considered in the absence of the constraints imposed by
globular protein tertiary structure (Table 3.3); this mixed potential is manifested in CD
experiments, where several segments do not develop significant helical structure, even in
membrane-mimetic environments. This disparity of structural propensity vs. conformation of
δ-helix segments mimics that of ‘discordant helices’ – globular protein segments that undergo an
α-to-β structure transition and form amyloid-like fibrils (Paivio et al. 2004), and suggests that
these sequences may exhibit structural instability in their native folds.

3.3.2 Role of residue content.

The high intrinsic hydropathy of δ-helix segments is imparted by a different set of
hydrophobic residues than that of TM segments. Ile and Val residues are depleted in δ-helices
compared to TM segments (Fig. 3.1), and this may be rationalized by the requirement for such
amino acids to be evolutionarily retained in the hydrophobic, restrictive environment of the
membrane bilayer vs. aqueous solvent. The β-branched residues such as Ile and Val have only
one populated rotamer as a result of residing in membrane-induced α-helices, where they are
structurally optimized for the folding and helix-helix interaction requirements of membrane
proteins (Senes et al. 2000). There may thus be considerable selective pressure to retain Ile and
Val in TM segments relative to δ-helices, as they may serve a structural role beyond maintaining
levels of hydrophobicity. Ile and Val are additionally better β-sheet-formers than helix-formers
in aqueous solvent (Chou and Fasman 1978), perhaps necessitating their depletion in natively
α-helical δ-helix sequences.

3.3.3 Recognition of hydrophobic segments.

δ-Helix peptides are sufficiently similar to bona fide TM segments in terms of their
segmental hydrophobicity and apolar helicity to be competent for membrane insertion, and our
TOXCAT results indicate that certain δ-helix sequences are not only capable of membrane
insertion but also of self-association within the bilayer when placed in the correct protein context
(Fig. 3.6). How, then, are hydrophobic segments destined for the interior of globular proteins
distinguished from those that become incorporated into the interior of the membrane bilayer?
Correct δ-helix vs. TM segment sorting must rely on factors distinct from bulk biochemical
properties. Based on our results, it appears that charged residue distribution and/or protein
context may act to exclude δ-helix segments from the bilayer. Thus, authentic TM segments
have a skewed distribution of charged residues with such residues appearing at helix termini
compared to δ- and soluble α-helical segments (Fig. 3.7A). As well, proteins containing δ-helix
segments appear to lack any additional TM-mimic sequences that could aid their membrane
integration in a manner similar, for example, to the bilayer integration of voltage-sensor domains
of the voltage-dependent potassium channels via ‘helper helices’ (Zhang et al. 2007); of the 51
globular helical proteins containing δ-helix sequences, only six have ≥ two δ-helix regions. It is
also possible that the potential sequestration of δ-helix sequences in the protein interior at an
early stage in folding may prevent recognition of their high intrinsic hydrophobicity.

Examination of the estimated efficiency of δ-helix sequence partitioning into the lipid
bilayer in vivo and into membrane-mimetic media in vitro further reinforces the notion that bulk
biochemical properties are not sufficient to predict the non-TM vs. TM location of a δ- vs. TM
helix sequence. The two sets of values do not correlate, although a regression line and correlation
coefficient are shown for illustrative purposes (Fig. 3.7B). Charged residues, and their location
in the δ-helix segments, may therefore help to predict these hydrophobic protein regions as
destined to be globular.

3.3.4 Conclusions.

The present study shows that (i) nearly 30% of globular protein helices of sufficient
length to span a membrane bilayer (≥ 19 amino acids) have mean hydropathy values equivalent
to or greater than those of known TM segments; and (ii) differentiation between these
TM-mimicking portions of helical globular proteins and bona fide TM segments could generally be
achieved in the first instance by flagging sequences with three or more charged residues as
non-TM regions. We further observed that although significant hydrophobicity is a
necessary feature in identifying potential TM segments, additional factors, such as the location
of the charged residues and an increased occurrence of Ile and Val residues, should also be
considered. While δ-helix segments in globular proteins may embody important hydrophobic
structural features of the in vivo protein fold, the work described herein provides additional clues
as to how proteins may be sorted by the cell. Our current examination of factors that act to divert
δ-helices from membrane insertion to an aqueous phase indicates that TM Finder and similar
software could exploit these specific features to improve the prediction of TM segments in
proteins.
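The charged-residue criterion in conclusion (ii) can be illustrated with a minimal Python sketch. This is not the TM Finder implementation; the function names are hypothetical, the threshold of three charged residues is taken from the text above, and the sequences are the five δ-helix peptides of Table 3.2 with their Lys tags removed.

```python
# Minimal sketch of the "three or more charged residues" flag described above.
# Function names are illustrative assumptions, not part of TM Finder.

CHARGED = set("DEKR")  # charged residues as defined in this chapter


def count_charged(sequence: str) -> int:
    """Count D, E, K and R residues in a one-letter amino acid sequence."""
    return sum(1 for residue in sequence.upper() if residue in CHARGED)


def flag_non_tm(sequence: str, threshold: int = 3) -> bool:
    """Return True when the segment would be flagged as a non-TM region."""
    return count_charged(sequence) >= threshold


# Core sequences from Table 3.2 (Lys tags excluded).
PEPTIDES = {
    "1ECA": "FAGAEAAWGATLDTFFGMIF",
    "1HC1": "ELFFWVHHQLTARFDFERL",
    "1MBA": "ADAAWTKLFGLIIDALKAA",
    "2MNR": "GLIRMAAAGIDMAAWDALGKV",
    "5LDH": "GYTNWAIGLSVADLIESMLK",
}

if __name__ == "__main__":
    for pdb_id, seq in PEPTIDES.items():
        print(pdb_id, count_charged(seq), flag_non_tm(seq))
```

Under these assumptions only the 1ECA segment, which carries two charged residues, falls below the threshold; the flag is a first-pass filter and does not account for where the charged residues sit in the segment.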

3.4.1 Database construction.

The databases of globular and δ-helices were initially compiled by searching
SwissProt release 34 with the keyword “helix”. The ~4200 sequences returned by this query
were subsequently refined to 174 entries (Wang 2000), as follows: (i) removing all redundant
sequences (defined as those > 25% identical); (ii) removing sequences with non-standard amino
acids; (iii) removing sequences without high-resolution structure coordinates deposited in the
PDB; and (iv) removing segments with < 19 residues. The 174 remaining sequences were then
divided into globular helices (see Appendix 1, Table A1.1) and δ-helices (see Appendix 1, Table
A1.2) via submission to TM Finder; those segments with segmental hydropathy values at or
above the Liu-Deber insertion ‘threshold’ [≥ 0.4 on the Liu-Deber scale, see reference (Liu and
Deber 1998b)] were deemed δ-helices. The database of TM segment sequences was compiled
from a non-redundant list (redundancy defined as > 25% identity) of TM α-helices with available
high-resolution structures. Thirty-seven non-homologous TM proteins were identified containing a
total of 212 different TM segments (see Appendix 1, Table A1.13). Unlike the globular and
δ-helix databases, segments shorter than 19 amino acids were retained in the TM database because
the membrane localization of each was confirmed in a high-resolution structure (Rath and Deber
2008).

3.4.2 Amino acid composition analysis.

Amino acid composition was determined for each segment in each database by counting
the number of each residue and/or group of residues in each individual helix and dividing by
helix length to obtain a normalized residue frequency. Mean residue and/or group residue
composition values were then calculated for each of the globular helix, TM helix, and δ-helix
datasets. For group comparisons, hydrophobic residues were defined as (A, C, F, I, L, M, V, W,
Y), as determined previously (Liu and Deber 1998b); polar residues as (D, E, G, H, K, N, P, Q,
R, S, T); and charged residues as (D, E, K, R). The overall mean amino acid compositions of
globular, TM and δ-helices were compared with an online one-way ANOVA test (Prisim).
Pair-wise comparisons of amino acid frequencies among the globular, TM, and δ-helix datasets were
performed using t-tests (Prisim). As counts of amino acids do not represent continuous data it is
appropriate to use an ANOVA assay to determine differences in amino acid composition among
globular, δ- and TM-helix categories.
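The normalization described above (residue or residue-group count divided by helix length, then averaged over a dataset) can be sketched as follows. The function names are hypothetical; the residue groups are those defined in the text, and the example sequence is the 1MBA core segment from Table 3.2.

```python
# Sketch of the per-helix composition calculation; names are illustrative.
from collections import Counter

HYDROPHOBIC = set("ACFILMVWY")
POLAR = set("DEGHKNPQRST")
CHARGED = set("DEKR")


def residue_frequencies(helix: str) -> dict:
    """Normalized frequency of each residue type in one helix."""
    counts = Counter(helix.upper())
    return {aa: n / len(helix) for aa, n in counts.items()}


def group_frequency(helix: str, group: set) -> float:
    """Fraction of residues in one helix that belong to a residue group."""
    return sum(1 for aa in helix.upper() if aa in group) / len(helix)


def mean_group_frequency(helices: list, group: set) -> float:
    """Mean of the per-helix group frequencies over a dataset of helices."""
    return sum(group_frequency(h, group) for h in helices) / len(helices)


if __name__ == "__main__":
    mba = "ADAAWTKLFGLIIDALKAA"  # 1MBA core sequence, Table 3.2
    print(group_frequency(mba, HYDROPHOBIC))  # 13 of 19 residues
    print(group_frequency(mba, CHARGED))      # 4 of 19 residues
```

Averaging per-helix frequencies, rather than pooling all residues, gives each helix equal weight regardless of its length, which appears to be what the description above intends.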

3.4.3 Solvent accessibility analysis.

The solvent accessibility of amino acids in globular helices and δ-helices was evaluated
by application of the program NACCESS (Hubbard 1993) to structure coordinate files obtained
from the PDB as described previously (Rath and Deber 2008), with the following modifications:
(i) A probe size of 1.40 Å was used to approximate the radius of a water molecule; and (ii) the
relative solvent accessibility (RSA) for each amino acid was calculated by comparison of
calculated solvent accessible surface areas to the default reference set supplied with NACCESS.
Helix solvent accessibility was calculated as the sum of the individual residue RSA values in
each helix, divided by the helix length. Helix solvent accessibility distributions were determined
for the globular and δ-helix groups by sorting individual helix solvent accessibility values into
bins [0-5%, >5-10%, >10-15%, >15-20%, >20-25%, >25-30%, >30-35%, >35-40%, >40-50%,
>50-55%, >55-60%]; the mean solvent accessibility values for the globular and δ-helix groups
were also calculated from the individual solvent accessibility data. Within the globular and
δ-helix groups, the RSA values of residues in the hydrophobic, polar, and charged residue
categories defined above were averaged to obtain mean solvent accessibility values.
Comparisons of the mean solvent accessibility of hydrophobic, polar, and charged residues
between globular and δ-helix segments were performed using t-tests.
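The helix-level accessibility value and its binning can be sketched as below. This is an illustration only: it assumes that per-residue RSA percentages have already been extracted from NACCESS output (parsing is not shown), it uses uniform 5% bins as a simplification of the bin list given above, and the function names and example values are hypothetical.

```python
# Sketch: mean RSA per helix and assignment to a 5%-wide bin.
# Uniform bins are a simplifying assumption; example RSA values are invented.
import math


def helix_accessibility(rsa_values: list) -> float:
    """Sum of per-residue RSA values (%) divided by helix length."""
    return sum(rsa_values) / len(rsa_values)


def bin_label(value: float, width: float = 5.0) -> str:
    """Label of the bin (lower, upper] containing value; 0 goes in the first bin."""
    index = max(math.ceil(value / width) - 1, 0)
    lower, upper = index * width, (index + 1) * width
    if index == 0:
        return f"{lower:g}-{upper:g}%"
    return f">{lower:g}-{upper:g}%"


if __name__ == "__main__":
    example_rsa = [10.0, 20.0, 30.0]  # hypothetical per-residue RSA values
    accessibility = helix_accessibility(example_rsa)
    print(accessibility, bin_label(accessibility))
```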

3.4.4 Residue position analysis.

Residues in each helix sequence were sequentially numbered from 1 to n, beginning at
the N-terminal residue and ending at the C-terminal residue, such that n represents the total
peptide length in residues. The number assigned to each charged residue (D, E, K, R) in each
helix sequence was divided by n to obtain a positional value normalized to helix length.
Positional values were sorted into bins corresponding to fractions of helix length [≤ 20%, >20%-
40%, >40%-60%, >60%-80%, >80%-100%]. Numbers of charged residues in each bin were
totaled for the globular helix, δ-helix, and TM helix groups. Chi-squared tests were used to
analyze residue distributions.
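The positional binning and chi-squared comparison can be sketched as follows. The function names are hypothetical, and the test shown assumes that the expected counts correspond to an even distribution across the five bins, as described for Fig. 3.7A. The p-value uses the closed-form chi-squared survival function for four degrees of freedom (five bins minus one), so no statistics library is needed.

```python
# Sketch of charged-residue position binning and a chi-squared test
# against an even distribution. Names are illustrative assumptions.
import math

CHARGED = set("DEKR")
N_BINS = 5  # bins of 20% of helix length


def charged_bin_counts(helix: str) -> list:
    """Count charged residues per bin; position i (1-based) maps to i/n."""
    n = len(helix)
    counts = [0] * N_BINS
    for i, aa in enumerate(helix.upper(), start=1):
        if aa in CHARGED:
            # Integer form of ceil(N_BINS * i / n) - 1, avoiding float rounding;
            # a position exactly on a boundary falls in the lower bin.
            counts[(N_BINS * i - 1) // n] += 1
    return counts


def pooled_counts(helices: list) -> list:
    """Total charged residues per bin over a group of helices."""
    totals = [0] * N_BINS
    for helix in helices:
        for b, c in enumerate(charged_bin_counts(helix)):
            totals[b] += c
    return totals


def chi_squared_uniform(counts: list) -> float:
    """Chi-squared statistic against equal expected counts in every bin."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def p_value_df4(stat: float) -> float:
    """Upper-tail probability of a chi-squared variable with 4 d.f."""
    return math.exp(-stat / 2) * (1 + stat / 2)


if __name__ == "__main__":
    counts = pooled_counts(["ADAAWTKLFGLIIDALKAA"])  # 1MBA core, Table 3.2
    stat = chi_squared_uniform(counts)
    print(counts, stat, p_value_df4(stat))
```

A single helix has far too few charged residues for the test to be meaningful; in practice the counts would be pooled over an entire database before computing the statistic.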

3.4.5 Peptide synthesis and purification.

Five δ-helix segments were selected from the database as candidates for in vitro study
based on the presence of an intrinsic Trp residue: 1ECA (residues 114-133 of the full length
protein), 1HC1 (residues 218-236), 1MBA (residues 127-145), 2MNR (residues 96-116) and
5LDH (residues 247-266). The boundaries of the δ-helix peptides of 1ECA, 1HC1, 1MBA,
2MNR and 5LDH were chosen as prescribed by examination of the high resolution structure and
the TM Finder output (Deber et al. 2001). Peptides with sequences corresponding to these
δ-helix segments were synthesized with a PS3 peptide synthesizer (Protein Technologies, Inc.)
using standard Fmoc chemistry. Additional lysine residues were incorporated into the peptide
sequences to increase aqueous solubility as previously described (Melnyk et al. 2003). A
four-fold amino acid excess on a 0.1 mmol scale synthesis was used with the HATU/DIEA activator
pair. Synthesis utilized a low-load (0.18-0.22 mmol/g) PAL-PEG-PS resin that produced an
amidated C-terminus upon peptide cleavage. Peptide cleavage and deprotection were achieved
using a cocktail of 88% TFA/5% phenol/5% ultrapure water/2% TIPS, followed by precipitation
with ice-cold diethyl ether, drying, and resuspension in ultrapure water. Crude peptides were
purified by RP-HPLC on a C4 preparative column (Phenomenex) with a water/acetonitrile
gradient in the presence of 0.01% TFA. Peptide molecular weights were confirmed by mass
spectrometry and the Micro BCA assay (Pierce) was used to determine peptide concentration.

3.4.6 Circular dichroism and fluorescence spectroscopy.

Circular dichroism spectra were recorded on a Jasco J-720 CD spectrometer at room
temperature. Spectra in aqueous buffer (10 mM Tris pH 7.2, 10 mM NaCl) and aqueous buffer
with 10 mM SDS were taken using a 0.1 cm path length cuvette at peptide concentrations of 25
µM. A 0.01 cm path length cuvette was used for secondary structure determination at peptide
concentrations of 100 µM in aqueous buffer with 50 mM SPFO. Fluorescence measurements
were carried out in the aqueous and SDS buffer conditions described above on a Hitachi F-400
Photon Technology International C-60 fluorescence spectrometer at an excitation wavelength of
295 nm. Emission spectra were recorded at room temperature from 305 to 405 nm using a
peptide concentration of 5 µM. The correlation between Chou-Fasman α-helical
propensity (Pα) and peptide mean residue ellipticity at 222 nm was evaluated using the R
statistical software package.

3.4.7 Plasmid construction.

The expression vector pccKAN and the MBP-deficient (malE-) E. coli strain NT326 were
kindly provided by Dr. Donald M. Engelman, Yale University (Russ and Engelman 1999).
TOXCAT chimeras fusing the TM sequence of GpA and the G83I GpA mutant between the
ToxR and MBP domains have been previously described (Johnson et al. 2006); chimeras
encoding δ-helix sequences instead of the GpA sequence were constructed in an essentially
identical manner via restriction digestion of oligonucleotide cassettes encoding each δ-helix
segment with NheI and BamHI and subsequent ligation into the NheI and BamHI sites of the
pccKAN plasmid. The identity of all constructs was confirmed with DNA sequencing prior to
further characterization.

3.4.8 MalE complementation test.

NT326 cells expressing TOXCAT chimeras with δ-helix sequences, the wild-type GpA
TM sequence, or the G83I GpA mutant were streaked onto M9 minimal plates with 0.4%
maltose as the only carbon source. Under these conditions, transformants capable of growth
must target a portion of the chimeric TOXCAT protein into the cytoplasmic membrane (Russ
and Engelman 1999). Transformant growth was evaluated for all constructs after incubation for
2 days at 37 °C.

3.4.9 Chloramphenicol acetyltransferase enzyme-linked immunosorbent assay.

NT326 cells harboring TOXCAT chimeras were grown at 37 °C, harvested into 1 mL
fractions at an A600 of 0.6, pelleted, and stored at -80 °C. Cell lysates were prepared from cell
pellets as previously described (Johnson et al. 2006) and assayed for CAT concentration using
the CAT ELISA kit (Roche Applied Science). A standard curve was generated with CAT
provided by the manufacturer. Cells expressing the wild-type GpA and G83I GpA sequences
were included in each CAT assay as positive and negative controls, respectively. CAT
measurements were performed in at least triplicate, and were normalized for the relative
expression level of each construct using Western blotting as described (Johnson et al. 2006).

Chapter 4. Converting a Marginally Hydrophobic Soluble Protein

into a Membrane Protein.

This work was published, in part, by Nørholm, M.H., Cunningham, F., Deber, C.M., von Heijne,

G. J Mol Biol: 407: 171-179 (2011).

Author contributions: MN designed research plan including experimental membrane insertion

assays and mutations. FC performed database construction and analysis, and designed

mutations. MH, FC, CMD and GVH analyzed data. MH wrote the paper with FC, CMD and

GVH providing input.

4.1 Introduction.

What distinguishes TM helices from helices in soluble proteins? Hydrophobicity is a

major determinant influencing the membrane insertion of a given helical segment, but marginally

hydrophobic protein segments are observed to form both TM helices and parts of globular

proteins (Hessa et al. 2007; Cunningham et al. 2009): additional factors must contribute to

ensure their proper localization.

A well-studied example is the membrane-embedded voltage-sensor domain present in
voltage-gated ion channels (Swartz 2008). Voltage-sensor domains contain an unusual highly
charged S4 α-helix, where membrane insertion of the S4 helix is aided by electrostatic
interactions with charged residues in neighbouring helices (Sato et al. 2002; Zhang et al. 2007).

In the opposite case, unusually hydrophobic so-called δ-helices exist in proteins that are
not integrated into membranes. δ-Helices are α-helices defined as having a segmental
hydrophobicity value ≥ 0.4 on the Liu–Deber hydrophobicity scale (Liu and Deber 1998a) and
were first identified in approximately 30% of a test group of 174 solved crystal structures of
globular proteins (Cunningham et al. 2009). Overall, bioinformatics analysis of these δ-helices
showed that although they are markedly hydrophobic, they contain more charged residues and
fewer Ile and Val residues than typical TM helices. Furthermore, charged residues in δ-helices
are more evenly distributed than in TM helices, where they typically are found near the
membrane-water interface. Nevertheless, synthetic δ-helix peptides exhibit TM-helix-like
behaviour in vitro; CD and fluorescence spectroscopy show an increase in helical content when
they are exposed to membrane mimetic environments, and a δ-helix derived from Chironomus
thummi thummi erythrocruorin (1ECA) can insert and self-associate in the inner membrane in E.
coli (Cunningham et al. 2009).

Given that δ-helices have many of the characteristics of TM helices yet have evolved to
not insert into membranes in their normal context, we find them interesting as representatives of
polypeptide segments that are near the threshold for membrane insertion. This is particularly
pertinent for δ-helices in secreted proteins, as these sequence segments must be able to
translocate through the Sec61 translocon in the ER membrane without being recognized as TM
helices. With this in mind, we have studied selected δ-helices using a well-established assay for
measuring membrane insertion efficiency in dog pancreas microsomes (Hessa et al. 2005a) and
have determined what sequence alterations are required to convert δ-helices into TM segments
both in chimeric model proteins and in the native protein context.

4.2 Results.

4.2.1 δ-Helix hydrophobicity.

Our current collection of δ-helices (Table 4.1), subdivided according to predicted or
experimentally verified subcellular protein localization, contains 51 α-helical protein segments
with a Liu–Deber score ≥ 0.4 (Liu and Deber 1998a). Our definition of a “secreted” protein is
one that has a predicted signal sequence (and hence encounters the Sec61/SecYEG translocon
during its translation (White and von Heijne 2008)) and no TM segment. δ-Helices from
mitochondrial, chloroplastic and viral proteins are classified as cytosolic under this scheme,
given that they do not encounter the Sec61/SecYEG translocon.

In Table 4.1, we compare the predicted Liu–Deber scores for the δ-helices with the
apparent free energy of membrane insertion (ΔGapp) calculated by the “ΔG predictor” software
(Hessa et al. 2007). By the latter measure, most of the δ-helices are predicted to insert poorly or
not at all into a biological membrane, the average ΔGapp being +4.1 kcal/mol. For comparison,
the average predicted free energy of insertion for a set of TM helices in membrane proteins with
solved crystal structures is around -1 kcal/mol (Hessa et al. 2007).

Table 4.1. δ-helices and their predicted Liu-Deber and ΔGapp hydrophobicities.

Non-secreted proteins a              Secreted proteins b
PDB ID    Liu-Deber c   ΔGapp d      PDB ID    Liu-Deber c   ΔGapp d
1BGD      0.84          3.9          1BGC      0.58          6.0
1BIA      0.97          3.9          1BGC      0.49          4.2
1DXI      0.47          4.0          1CF3      0.91          4.0
1FDH      0.46          2.2          1ECA      1.10          4.1
1FHA      0.43          3.2          1EZM      0.49          4.1
1FIA      1.19          5.1          1GLM      0.67          4.6
1GPA      0.48          4.2          1GLM      0.72          4.4
1GUH      1.11          4.1          1OVA      0.53          5.3
1HC1      0.96          5.0          2ACH      0.48          5.4
1HDS      0.77          1.3          3HHR      0.56          4.4
1LTH      0.56          4.4          3HHR      0.54          6.2
1MAT      0.88          4.5          3INK      0.42          4.9
1MBA      0.77          3.4          2AAI      0.84          2.8
1MRR      0.61          3.1          2ACE      0.54          6.4
1PFK      0.73          3.7          Average   0.63          4.8
1PHH      0.80          4.4          SD        0.19          1.0
1PHH      1.07          4.3
1SRY      1.62          2.4
1SRY      0.42          7.6
2ALD      0.65          5.0
2ATI      0.88          3.3
2BMH      0.74          1.3
2LDB      0.66          5.4
2MNR      0.52          3.0
2TSI      1.16          4.3
3PFK      0.9           4.5
3TMS      0.66          3.4
4TMS      0.59          4.0
5LDH      0.52          3.4
9LDB      0.45          5.4
1CSC      0.61          5.7
1CSC      0.98          1.8
1CSC      0.42          4.3
1CPC      1.06          3.3
1TIS      0.81          3.9
2TMV      0.63          4.7
8RUC      0.51          3.9
Average   0.75          3.9
SD        0.27          1.2

a Non-secreted proteins do not encounter the translocon in their translation pathway. These include cytosolic, viral
and mitochondrial proteins.
b Secreted proteins encounter the translocon in their translation pathway.
c Average hydropathy of the segment calculated by the Liu-Deber hydropathy scale (Liu and Deber 1998a).
d Predicted ΔGapp value (in kcal/mol) for the segment calculated by an in vivo membrane insertion scale (Hessa et al.
2007).

Comparison of the mean Liu–Deber hydropathy scores of δ-helices from secreted and
cytosolic proteins shows that there is no significant difference between the groups (0.63 ± 0.19
and 0.75 ± 0.27, respectively; p > 0.05). In contrast, when judged by their ΔGapp values, δ-helices
in secreted proteins have significantly higher ΔGapp values on average than their cytosolic
counterparts (4.8 ± 1.0 kcal/mol versus 3.9 ± 1.2 kcal/mol; p < 0.05). Despite this difference in
the ΔGapp values between secreted and cytosolic proteins, both types of δ-helices are
characterized by ΔGapp values above the threshold for membrane insertion (Hessa et al. 2007).
It is likely the positioning of the charged residues within these segments, as reflected in their
ΔGapp values, that prevents the insertion of these segments into the bilayer in vivo. The appearance
of charged and/or polar residues in the central portion of membrane spanning segments can be
detrimental to membrane insertion by the cellular machinery (Hessa et al. 2007).
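As a check on the group statistics quoted above, the secreted-protein averages and standard deviations can be recomputed from the values listed in Table 4.1. The sketch below uses the sample standard deviation, which is an assumption about how the reported SD was calculated; the function name is hypothetical.

```python
# Recompute mean and SD for the secreted-protein columns of Table 4.1.
# Use of the sample SD (n - 1 denominator) is an assumption.
from statistics import mean, stdev

# Values transcribed from the "Secreted proteins" half of Table 4.1.
SECRETED_LIU_DEBER = [0.58, 0.49, 0.91, 1.10, 0.49, 0.67, 0.72,
                      0.53, 0.48, 0.56, 0.54, 0.42, 0.84, 0.54]
SECRETED_DG_APP = [6.0, 4.2, 4.0, 4.1, 4.1, 4.6, 4.4,
                   5.3, 5.4, 4.4, 6.2, 4.9, 2.8, 6.4]  # kcal/mol


def summarize(values: list) -> tuple:
    """Return (mean, sample standard deviation) for a list of values."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    ld_mean, ld_sd = summarize(SECRETED_LIU_DEBER)
    dg_mean, dg_sd = summarize(SECRETED_DG_APP)
    print(f"Liu-Deber: {ld_mean:.2f} ± {ld_sd:.2f}")
    print(f"ΔGapp:     {dg_mean:.1f} ± {dg_sd:.1f} kcal/mol")
```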

4.2.2 Choice of δ-helices for membrane insertion studies.

To investigate the ability of δ-helices to insert into the ER membrane via the Sec61
translocon, we generated chimeric constructs based on E. coli leader peptidase (LepB) (Hessa et
al. 2007) with inserts corresponding to five δ-helices (italicized in Table 4.1) that were chosen
from the δ-helix collection based on the following considerations:

The δ-helix from C. thummi thummi erythrocruorin (1ECA) is capable of in vivo
oligomerization in the E. coli inner membrane bilayer when tested in the TOXCAT assay

Synthetic peptides corresponding to the Aplysia limacina myoglobin (1MBA) and 1ECA
δ-helices can be successfully solvated by detergent in a membrane mimetic system

The δ-helix from Thermus thermophilus seryl-tRNA synthetase (1SRY) has the highest
Liu–Deber score in the δ-helix database.

The δ-helices from Bacillus megaterium cytochrome P450BM-3 (2BMH) and from
Ricinus communis ricin (2AAI) have relatively low predicted ΔGapp values (Table 4.1).

These examples also illustrate the potential biological roles of δ-helices. Both erythrocruorin
(1ECA) and myoglobin (1MBA) are members of the globin family of heme-binding proteins.
The corresponding δ-helices in these structures are identical to the so-called H-helices that are
central elements in the folding of the hydrophobic core in globins (Nishimura et al. 2000;
Lecomte et al. 2005). Similarly, the 2BMH δ-helix (termed the I helix in the full-length protein)
is a prominent hydrophobic feature of the three-dimensional structure of cytochrome P450BM-3
(Ravichandran et al. 1993), and the charged Glu 267 residue present in this δ-helix appears to
play a central role in the reaction catalyzed by the protein (Gerber and Sligar 1992;
Ravichandran et al. 1993). Ricin is a toxin produced in the seeds of R. communis and is toxic
because the ricin A chain inactivates eukaryotic ribosomes. The hydrophobic δ-helix in
ricin A, also termed helix E, is shielded from solvent by the amphipathic helix D, which turns its
hydrophobic face toward helix E and its hydrophilic face toward the solvent (Morris and Wool
1994). In this case, the placement of charged residues in the δ-helix on a hydrophobic scaffold
is apparently a necessary feature of the active-site architecture (in concert with interactions
with the ribosome).

4.2.3 Experimental quantification of membrane insertion of selected δ-helices.

To put the theoretical hydrophobicity scores to the test, we examined the δ-helices in a
quantitative biological assay that measures the efficiency of insertion into the ER membrane. In
this method, the segments to be tested are inserted into engineered versions of the LepB protein,
where they are flanked by NXT consensus sites for N-linked glycosylation. The membrane
insertion efficiency is then determined by expressing the proteins in vitro in the presence of dog
pancreas microsomes, followed by quantification of the relative amounts of mono- and
di-glycosylated molecules (Hessa et al. 2005a) (Fig. 4.1).
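The quantification step can be sketched as follows. This is a minimal illustration, not the analysis pipeline used for Fig. 4.1: the function names, the band intensities and the 30 °C default temperature are assumptions. Which glycoform (mono- or di-glycosylated) reports on the membrane-inserted state depends on the LepB construct, so the caller passes the intensities already assigned to "inserted" and "non-inserted". The apparent free energy follows the usual definition ΔGapp = −RT ln Kapp, with Kapp the ratio of inserted to non-inserted molecules.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def insertion_fraction(inserted, non_inserted):
    """Fraction of glycosylated molecules that are membrane inserted."""
    total = inserted + non_inserted
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return inserted / total


def apparent_free_energy(inserted, non_inserted, temperature_k=303.15):
    """Apparent free energy of insertion in kcal/mol.

    Computed as -RT ln(K_app), with K_app = inserted / non_inserted.
    The default temperature (30 degrees C) is an assumption.
    """
    if inserted <= 0 or non_inserted <= 0:
        raise ValueError("both band intensities must be positive")
    k_app = inserted / non_inserted
    return -R_KCAL * temperature_k * math.log(k_app)


if __name__ == "__main__":
    # Hypothetical band intensities (arbitrary units) from a gel scan.
    print(insertion_fraction(10.0, 90.0))
    print(round(apparent_free_energy(10.0, 90.0), 2))
```

On this definition, equal band intensities correspond to ΔGapp = 0 kcal/mol, and a segment that inserts in only a small fraction of molecules has a positive ΔGapp.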

Figure 4.1. Evaluation of the membrane integration properties of selected δ-helices. A) DNA
encoding two δ-helices originating from secreted proteins was inserted in the leader peptidase
(Lep) H3 model construct (left panel schematic), in vitro transcribed and translated in the
presence of dog pancreas rough microsomes (RMs). Control reactions were performed in the
absence of RMs. Membrane insertion efficiency was quantified as the ratio between mono- and
double-glycosylated, 35S-Met-labeled proteins on SDS-PAGE gels. B) Similar to the experiments
in (A), membrane insertion of three δ-helices from non-secreted proteins was measured in the
Lep H2 model system (left panel schematic). Positively (blue) and negatively charged (red)
residues are highlighted. Bands originating from mono- and double-glycosylated proteins are
indicated with one and two dots, respectively. Figure adapted from (Norholm et al. 2011).

The five selected δ-helices were evaluated for membrane insertion in either the Nout–Cin
or the Nin–Cout orientation using the appropriate LepB construct. Segments originating from
secreted proteins were tested in the Nout–Cin orientation (i.e., in the orientation they have when
traversing the translocon channel), and segments from cytosolic proteins were tested in the
Nin–Cout orientation (i.e., in the orientation they would have if they were functioning as
signal-anchor sequences). Residues that flank TM helices have previously been shown to affect
membrane insertion efficiency (Lerch-Bader et al. 2008), and therefore, whenever possible (when
the δ-helix was not too close to the N- or C-terminus), five or six of the native residues flanking
each side of the δ-helices were included in the test constructs.

All five δ-helices inserted poorly into the ER membrane (7–18% insertion efficiency, Fig.
4.1), with only the δ-helix from 1SRY being clearly above background level (the amount of
mono-glycosylated molecules is roughly 10% in constructs with a fully translocated test
segment). We also tested the ability of the δ-helices to serve as signal sequences for targeting to
the signal recognition particle–Sec61 translocation pathway by using a LepB construct that
lacks both of the native TM segments; none of the constructs showed any membrane targeting
(data not shown).

In summary, as predicted by the ΔGapp values, none of the five tested δ-helices became
efficiently integrated into microsomal membranes when inserted downstream of bona fide TM
helices, nor did they function on their own as translocon-targeting signal-anchor sequences.

4.2.4 Converting δ-helices into transmembrane segments.

Generally, δ-helices are of intermediate hydrophobicity and contain three or more
charged residues, ostensibly rendering them unfavourable for membrane insertion (Cunningham
et al. 2009). Are the charged residues critical for determining the fate of δ-helices, or is the
protein context in which the δ-helices reside more important? We first addressed these questions
by substituting selected charged residues in three different δ-helices in the context of the
chimeric LepB constructs. To put the question in an evolutionary perspective, we chose to make
only single-nucleotide changes in order to estimate the minimum number of evolutionary steps
it would take to convert the δ-helix into a TM helix. Our choice of test δ-helices was based on
ΔGapp predictions of the minimum number of nucleotide mutations that would turn the δ-helix
into a TM helix (i.e., to reach ΔGapp values < 0 kcal/mol).

In the two chosen δ-helices originating from the secreted 1ECA and 2AAI proteins, we
changed one to three charged residues stepwise into hydrophobic residues and obtained high
membrane insertion efficiencies only when all three residues were mutated (Fig. 4.2A).
With the amino acid substitutions E6V, G10V and D14V (GAA→GTA, GGT→GTT and
GAC→GTC) in the 1ECA δ-helix, membrane insertion increased from 11% to 38%, whereas the
three substitutions R8C, E19V and R22I (CGT→TGT, GAA→GTA and AGA→ATA) in the
2AAI δ-helix resulted in an increase from 7% to 87%. In contrast, the δ-helix from the cytosolic
protein cytochrome P450BM-3 (2BMH) was converted into an almost fully inserted TM helix
(from 10% to 79% insertion) upon just the single E20V substitution (GAA→GTA) (Fig. 4.2B).
The insertion efficiency was further increased to 92% by the H19L mutation (CAC→CTC). We
conclude that the two δ-helices in the secreted proteins are mutationally rather distant from
becoming TM helices.

Figure 4.2. Conversion of δ-helices into TM segments. Single-nucleotide mutations leading to
the replacement of amino acid residues unfavourable for TM segments were gradually
introduced into selected δ-helices in the Lep H3 and H2 model systems. A) Two δ-helices from
secreted proteins tested in Lep H3 and B) a single non-secreted δ-helix tested in Lep H2.
Underlined residues are those chosen for mutation. Figure adapted from (Norholm et al. 2011).

4.2.5 Converting a soluble protein into a membrane protein.

The ease with which the 2BMH δ-helix was converted into a TM helix in the context of
the LepB protein motivated us to investigate whether the corresponding full-length cytochrome
P450BM-3 protein could be as readily converted into a membrane protein. For this to happen, the
δ-helix would need to be able both to serve as a targeting signal to the signal recognition
particle–Sec61 translocation pathway and to insert efficiently into the ER membrane. To monitor
targeting and membrane insertion, we expressed the full-length protein with engineered
glycosylation sites on either the N- or the C-terminal side of the δ-helix segment (Fig. 4.3A). As
a control, we expressed the unmodified protein in parallel. As expected, wild-type cytochrome
P450BM-3 showed no sign of membrane insertion, but in contrast to our findings with the LepB
constructs, neither did the corresponding singly or doubly mutated versions. This result suggests
that additional factors besides the hydrophobicity of the δ-helix help prevent targeting of the
protein to the Sec61 translocon.

We speculated that upstream sequence elements would be a determining factor in preventing
membrane targeting, for example, by sequestering a downstream hydrophobic segment in a
folding nucleus and hence preventing it from acting as a targeting signal. To test this hypothesis,
we made two deletions from the N-terminal portion of cytochrome P450BM-3, either including
or excluding a small helical segment that immediately precedes the δ-helix (Fig. 4.3B). Again,
glycosylation sites were engineered on both sides of the δ-helix to follow membrane integration,
and the native as well as the singly and doubly mutated δ-helices were tested. The only construct
exhibiting a detectable amount of glycosylation was the doubly mutated construct with all
upstream secondary structural elements deleted and with an engineered glycosylation site at the
C-terminal side (Fig. 4.3A). Glycosylation of the protein was confirmed by treating the sample
with endoglycosidase H (endoH) (Fig. 4.3C).

Together, these results suggest that charged residues in the δ-helix, in combination with
upstream sequence elements, efficiently prevent targeting to the ER but that, upon removal of
these elements, it is possible to convert a soluble protein into a membrane protein.

Figure 4.3. Conversion of a soluble protein into a membrane protein. A) Full-length or two
different truncated versions (Δ1–226 and Δ1–240, numbered according to the deleted amino
acids in the protein) of cytochrome P450BM-3 were expressed with zero to two mutations in the
δ-helices. Membrane integration was assessed by engineering a glycosylation site on the
N-terminal or C-terminal side of the δ-helix (indicated by N or C below the gel lanes), followed
by in vitro transcription and translation in the presence of dog pancreas microsomes. Results were
compared with constructs without glycosylation sites (-) or without RMs. A membrane-integrated,
glycosylated protein is indicated by the red arrow. B) Two views of the placement of
the δ-helix (coloured violet) and the immediate upstream helical structure (coloured red) in the
three-dimensional structure of cytochrome P450BM-3 [PDB ID 2BMH]. C) Endoglycosidase
treatment of the Δ1–240 truncated, double-mutated version of cytochrome P450BM-3. The
glycosylated proteins are indicated by red arrows. Figure adapted from (Norholm et al. 2011).

4.3 Discussion.

Some marginally hydrophobic protein segments, such as the voltage sensor S4 helix,
clearly insert and function as TM helices, whereas other segments that display many of the
characteristics of a bona fide TM helix are found in globular proteins and do not form TM
helices. What are the roles of these so-called δ-helices in soluble proteins, and how is
mislocalization of such segments to a membrane prevented?

We have tested five δ-helices for translocon-mediated insertion into dog pancreas rough
microsomes (RMs). In the context of chimeric LepB proteins, none of the δ-helices were
efficiently inserted into the membrane of the rough microsomes, in agreement with their
predicted ΔGapp values. However, it should be noted that some protein segments integrate in
membranes by entering and exiting on the same side without spanning the entire bilayer
(so-called re-entrant regions; Viklund et al. 2006), and the experimental setup used here does
not rule out such a partial membrane integration. Three δ-helices (in the secreted 1ECA and
2AAI proteins and in the cytosolic 2BMH protein) were readily converted into TM segments by
one- to three-nucleotide mutations, but in the native 2BMH protein, the corresponding mutations
had no effect in terms of membrane targeting. Only when all secondary structure upstream from
the 2BMH δ-helix was removed did a small fraction of the truncated protein with a mutated
δ-helix insert into microsomes.

Upstream TM helices aid insertion of marginally hydrophobic TM helices in the Shaker
and KAT1 potassium channels (Sato et al. 2002; Zhang et al. 2007), and our results indicate that,
in globular proteins, upstream secondary structure may be important for preventing membrane
insertion of marginally hydrophobic δ-helices. Our findings suggest that both secreted and
cytosolic globular proteins containing marginally hydrophobic helical segments are well
protected, from both a mechanistic and an evolutionary perspective, from being inserted into the
membrane.

Interestingly, we note that the charged residues that must be mutated to convert the ricin
(2AAI) δ-helix into a TM segment constitute the active site (Fig. 4.2) (Ready et al. 1991). This
is one example illustrating the role of a hydrophobic helix in a soluble protein and how such a
situation can be stabilized by interactions with neighbouring amphipathic structures. It is also
noteworthy that ricin is not only secreted from the site of synthesis but is also internalized
by cells, passed into the lumen of the ER and reverse translocated into the cytoplasm, where it
targets the ribosome (Ready et al. 1991; Wales et al. 1992; Wales et al. 1993). This situation
further emphasizes the notion that the δ-helix must be well protected from becoming membrane
integrated.

In summary, our findings suggest that when hydrophobic character is localized in
δ-helices within soluble proteins, these helices may modulate the folding, three-dimensional
structure or active site of the proteins. Evenly distributed charged residues on these δ-helices
play key roles, along with upstream secondary structure, in helping to prevent unintended
membrane integration. The observation that a single residue of increased hydrophobicity was
able to modify the insertion potential of a δ-helix from the cytosolic protein 2BMH, while up to
three residues were required to achieve elevated membrane insertion in the secreted examples
(Fig. 4.2), may reflect the fact that δ-helices derived from cytosolic proteins display relatively
lower ΔGapp values because the latter proteins do not encounter the translocon. Hence, δ-helices
from secreted proteins remain mutationally, and perhaps evolutionarily, more distinct both from
bona fide TM helices and from δ-helices from cytosolic proteins. Finally, our analysis of the
2BMH δ-helix represents one of the few examples [see also (Chen and Kendall 1995)] of a
situation where a δ-helix in a soluble protein has been mutated into a segment that behaves,
albeit with limited efficiency, as a TM helix of an integral membrane protein.

4.4.1 DNA engineering.

Oligonucleotides used in this work were purchased from Eurofins MWG Operon and are

listed in Appendix 1, Table A1.4. The three LepH3, H2 and H1 constructs described previously

(Hessa et al. 2005a; Lundin et al. 2008) were prepared for uracil-excision cloning (Nour-Eldin et

al. 2006) by the polymerase chain reaction (PCR) with the oligonucleotides LepH3-F and

LepH3-R, LepH2-F and LepH2-R or LepH1-F and LepH1-R. The empty pGEM vector was

prepared for cloning by PCR with the oligonucleotides pGEM-F and pGEM-R. DNAs encoding

the δ-helices were ordered as synthetic oligonucleotides and inserted into the different Lep

model constructs using uracil-excision DNA engineering (Nour-Eldin et al. 2006; Norholm

2010). Site-directed mutagenesis was performed with the QuikChange Multi Site-directed

Mutagenesis kit (Agilent Technologies). Genomic DNA was isolated from B. megaterium using

the Chargeswitch kit (Invitrogen) and used as a PCR template to amplify the cytochrome

P450BM-3 (2BMH) gene. Full-length and truncated versions of 2BMH were cloned into the

pGEM vector by uracil excision as described above.

4.4.2 Membrane insertion assay and endoH treatment.

The membrane insertion efficiency assay was performed using the TnT SP6 Quick Coupled
Transcription/Translation System (Promega) in the presence of dog pancreas microsomes,
followed by quantification of 35S-Met-labeled, differentially glycosylated products using
SDS-PAGE, as previously described (Hessa et al. 2005a). EndoH (New England Biolabs) treatment
was performed according to the manufacturer's instructions.

Chapter 5. Optimizing synthesis and expression of transmembrane

peptides and proteins.

This work was published, in part, by Cunningham, F., Deber, C.M. Methods 41: 370-80 (2007).

Author contributions: FC designed and performed research on optimization of construct

expression. FC and CMD analyzed data. FC and CMD wrote the paper.

5.1 Introduction.

5.1.1 Fragmentation approach to study membrane protein folding.

The need to gather structural information on membrane proteins is highlighted by the fact
that they are implicated in a number of diseases; however, high-resolution structural data remain
elusive due to their high hydrophobicity. A detailed understanding of how individual α-helical
components interact within the membrane environment to form larger structures is critical to
understanding the bigger picture of membrane protein folding and, ultimately, function.
Fortunately, the study of membrane protein folding is simplified by the two-stage model, which
suggests that individual TM α-helices, or small membrane protein segments, can represent
autonomous folding domains and can be studied as such. For example, a successful strategy to
gather structural information would involve studying minimal structural units of TM segments
consisting of two α-helices and the connecting loop. As TM segment folding appears to be
independent of its neighboring segments, one can utilize these hairpin constructs to study
helix-helix interactions and the detailed manner of α-helix segment association within the
membrane bilayer (Johnson et al. 2004).

Our laboratory has worked towards optimizing synthesis and expression of TM segments

and fragments of membrane proteins in order to facilitate the study of packing interactions

between TM helices, and to obtain structural information regarding these segments (Melnyk et

al. 2001; Johnson et al. 2004; Rath et al. 2009a). Success in expressing hairpin constructs of

CFTR consisting of TM3, TM4 and the connecting loop has provided clues regarding the

solvation of the construct in membrane mimetic environments (Rath et al. 2009a), and the

importance of sequence context on hairpin structure (Wehbi et al. 2008). As well, hairpin

constructs have been utilized in combination with reverse-phase high performance liquid

chromatography to mimic the process of TM segment partitioning into the membrane bilayer

(Mulvihill and Deber 2010). The importance of this work can be extended to disease systems, as

numerous disease-causing mutations in membrane proteins - including CFTR - occur in the

membrane domain, where they can alter both the structure of the construct, and the manner in

which these mini-membrane proteins are coated with lipids and detergents (Wehbi et al. 2008;

Rath et al. 2009a).

To date, studies of membrane proteins have generally been impeded by the inherent
difficulty of producing the protein in the quantities required for structural analysis. Some
success, however, has been observed via heterologous expression in E. coli. Examples include
TM3/4 of the CFTR (Peng et al. 1998; Therien et al. 2001; Choi et al. 2004; Choi et al. 2005),
GpA (Melnyk et al. 2004), MCP (Wang and Deber 2000; Melnyk et al. 2004), and the gamma
subunit of the Na,K-ATPase or sodium pump (Therien and Deber 2002). This largely empirical
process of heterologous membrane protein expression is complex, and a variety of conditions
must be considered in order to obtain significant quantities of protein. This chapter focuses on
the optimized expression of fragments of the membrane protein CFTR that are larger than those
previously studied (Peng et al. 1998; Therien et al. 2002), in order to understand
the packing interactions of the helices that comprise the membrane-spanning region. While
changes in tertiary contacts and lipid solvation have been observed due to hydrophobic-to-polar
amino acid mutations in a CFTR two-TM system (Wehbi et al. 2008; Rath et al. 2009a), this
circumstance raises the question of the consequences of polar mutations in a larger system,
building toward an intact TM domain. To expand on the two-TM helical hairpin studies, the
structural consequences of introducing a polar residue in a largely hydrophobic system will be
studied in a three-TM system consisting of TM2, TM3 and TM4 (TM2/3/4) with the
interconnecting loop regions.

5.2 Domain fragments of membrane proteins: Application to the Cystic Fibrosis

Transmembrane Conductance Regulator.

5.2.1 The Cystic Fibrosis Transmembrane Conductance Regulator.

The gene encoding the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) was
identified as the gene responsible for cystic fibrosis in 1989 (Kerem et al. 1989; Riordan et al.
1989; Rommens et al. 1989). Since that time, it has been determined that CFTR is a membrane
protein that is expressed in the membranes of ciliated epithelial cells of airway passages (Kreda
et al. 2005), and in the fluid-secreting cells of the submucosal glands (Wu et al. 2007). This
1480-amino acid protein is a member of the ABC transporter family, and functional studies have
revealed that CFTR is a cAMP-regulated Cl– channel (Kartner et al. 1992). CFTR in the airway
epithelium regulates ion transport, which controls the volume of airway surface liquid, which in
turn affects the levels of mucus hydration (Zhang et al. 2009). It contains five domains, arranged
in two homologous halves: two TM domains (Rommens et al. 1989) and two nucleotide binding
domains (NBDs) (Lewis et al. 2004; Atwell et al. 2010) separated by a regulatory domain (R
domain) (Baker et al. 2007) (Fig. 5.1). CFTR is unusual as an ABC family member, as it is the
only member known to function as an ion channel, with the pore being formed through the
association of the TM domains (Riordan 2008). Transitions between the open and closed states
of CFTR are driven by conformational movements within and between the two NBDs, which in
turn cause rearrangements among the membrane-spanning segments, resulting in a shift in the
equilibrium between the open and closed conformations (Cheung et al. 2008). The gating process
of CFTR is regulated through phosphorylation of the R domain by protein kinase A, and the
probability of channel opening is related to the extent of phosphorylation in the R domain
(Riordan 2008).

Figure 5.1. Structure of CFTR. The protein is arranged in two homologous halves with each

half consisting of a TM domain and a nucleotide binding domain. The two halves are separated

by a regulatory domain.

The membrane spanning domains (MSDs) of CFTR form the anion channel portion of
the protein where the chloride translocation pathway is located, most likely at the interface
between the two MSDs. Long-range signals initiated by conformational changes in the
NBDs are likely responsible for conformational changes in the membrane domains that are
required for transporter function (Ramjeesingh et al. 2003). It has been shown separately that
CFTR constructs consisting of MSD1 or MSD2 alone are each capable of mediating anion flux,
with these constructs forming dimeric structures that re-create a pore structure presumably
similar to that in the intact CFTR (Schwiebert et al. 1998; Ramjeesingh et al. 2003). Mutations
in TM5 and TM6 in MSD1 have indicated that these TMs are particularly important in CFTR
function, forming an essential part of the channel pore (Schwiebert et al. 1998).

5.2.2 Disease-causing mutations in CFTR.

Cystic fibrosis (CF) is the most common autosomal recessive disease among the Caucasian
population, with a prevalence of 1 in 2500 (Cheung and Deber 2008). While CF primarily affects
the lungs, pancreas and reproductive organs of individuals, the root cause of these symptoms lies
in the malfunctioning of CFTR. In secretory epithelial cells, the resulting reduction in chloride
permeability impairs fluid and electrolyte secretion, causing luminal dehydration, which leads to
excessive mucus accumulation in the lungs. The life-limiting aspect of CF is related to the loss of
pulmonary function as a result of these clogged airways, with persistent inflammation and
recurring respiratory infections that are difficult to treat with antibiotics. To date, over 1500
CF-causing mutations have been identified throughout the protein
(http://www.genet.sickkids.on.ca/cftr), causing varying levels of severity, and approximately 300
of these mutations occur in the TM domains. The principal CF-related defect in CFTR is caused,
however, by a single amino acid deletion in NBD1, which is found on at least one allele in
approximately 90% of CF patients. This deletion of a single Phe residue (ΔF508) results in the
failure of CFTR expression at the cell surface, as this mutant protein is structurally less stable
than the WT (Riordan 2008). The ΔF508 mutation is thought to decrease CFTR structural
stability along the folding pathway, primarily affecting trafficking of the protein and leading to
retention in the ER and eventual degradation by the proteasome (Cheng et al. 1990).

ATP-binding cassette (ABC) transporters actively transport chemically diverse substrates
across the lipid bilayers of cellular membranes. While no high-resolution structure is currently
available for CFTR, advances have been made in determining structures of CFTR family
members, including Sav1866 from Staphylococcus aureus (PDB ID: 2HYD) (Dawson and
Locher 2006) and MsbA from Salmonella typhimurium (Ward et al. 2007). The central ABC
transporter structure consists of two transmembrane domains (TMDs) that provide a
translocation pathway, and two cytoplasmic, water-exposed nucleotide-binding domains that
hydrolyse ATP. Bacterial ABC transporters are generally expressed as 'half-transporters' that
contain one TM domain fused to an NBD, which describes the structure of Sav1866 and MsbA: a
homodimer within the membrane bilayer, with each monomer containing six TM segments
(Dawson and Locher 2006; Ward et al. 2007). Sequence and biochemical similarities between
CFTR and ABC transporters for which structures are available indicate that there are likely
strong structural similarities among these proteins (Serohijos et al. 2008). A structural model
of CFTR was designed from the full-length structure of Sav1866, as both functional proteins
contain 12 TM helices and their intracellular loops are of similar lengths (Serohijos et al. 2008).

5.2.3 Fragments of CFTR as a minimal tertiary model.

Optimally, structural investigations on intact CFTR would be performed to understand

the effects of mutations on the protein structure, but this remains a challenge as CFTR is difficult

to express in quantities useful for high-resolution study, and local structural effects of

point mutations in the large protein may fall below the limit of biophysical detection (Wehbi et

al. 2008). The fragmentation approach to studying small segments of membrane spanning

domains can find application with a protein such as CFTR, as numerous CF-causing amino acid

mutations have been identified in the TM domain. Individual peptides corresponding to the TM

segments of CFTR membrane domain 1 have been investigated structurally (Wigley et al. 1998),

along with TM hairpin constructs and mutants of CFTR TM3/4 (Wehbi et al. 2008; Rath et al.

2009a; Mulvihill and Deber 2010) and a double-spanning peptide consisting of CFTR TM5/6

(Choi et al. 2005). These peptides and hairpin constructs represent the minimal tertiary structural

units for study of membrane protein folding, and as mentioned, the effect of mutation along these

segments can affect the hairpin construct in in vitro systems in several ways. The information

made available by the CF mutation database indicates that a large number of CF-phenotypic

mutations are non-conservative as nonpolar amino acids are mutated to a polar residue and vice

versa. The consequences of introducing polar mutations can be seen through producing non-

native side chain−side chain hydrogen bonds between TM helices (Therien et al. 2001), or by

inhibiting membrane insertion of TM helices due to reduced hydrophobicity (Choi et al. 2005;

Rath et al. 2009a). As such, these mini-membrane protein models represent excellent choices to

investigate local structural changes due to mutation.

5.3 Triple strand construct from the Cystic Fibrosis Transmembrane Conductance

Regulator transmembrane domain.

5.3.1 Construct information.

To assess structural changes in a TM domain system larger than two α-helices, a triple
strand fragment containing CFTR TM helices 2, 3 and 4 and the intervening loop regions was
constructed. The limits of the construct were chosen to include all residues identified as
membrane spanning by a hydrophobic moment plot (Eisenberg et al. 1984; Riordan et al. 1989),
as well as the annotated TM segments on Swissprot (http://expasy.org/sprot/). Amino acids
110-245 were chosen for inclusion in the CFTR-TM2/3/4 construct as this residue stretch
encompasses all three TM segments (Fig. 5.2). We utilized the TM segment boundaries as
annotated by Swissprot (CFTR accession #P13569) in the design of our construct, but it has been
shown previously that defining the boundary residues differently can affect the ability of TM
α-helices to associate within membrane environments (Ng and Deber 2010). To assess the putative
TM2/3/4 residues in CFTR, we also submitted the amino acid sequence to several TM-predicting
programs that are currently available (Table 5.1). Although there is a consistent overlap of
hydrophobic residues calculated as the core of the TM segment, the predicted locations of the
helix N- and C-termini vary depending on the program used. Predicted residues with a percent
consensus among the programs greater than 80% were considered to be part of the TM
segment (Table 5.1). Since the residues given by the prediction consensus closely match those
annotated by Swissprot, and the Swissprot segments include all residues predicted in this manner,
the Swissprot numbering was used for ease and similarity to previously published materials.
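The consensus rule described above can be sketched as a per-residue vote across programs. This is an illustration only: the function name is hypothetical, the input format (one inclusive residue range per program for a given TM segment) is an assumption, and the example ranges are invented rather than taken from the output of the programs used in this chapter.

```python
def consensus_segments(predictions, threshold=0.8):
    """Return residue ranges predicted by more than `threshold` of programs.

    predictions: list of (start, end) inclusive ranges, one per program,
                 all describing the same putative TM segment.
    """
    n_programs = len(predictions)
    first = min(start for start, _ in predictions)
    last = max(end for _, end in predictions)

    segments = []
    run_start = None
    for residue in range(first, last + 1):
        votes = sum(start <= residue <= end for start, end in predictions)
        if votes / n_programs > threshold:
            if run_start is None:
                run_start = residue  # a consensus run begins here
        elif run_start is not None:
            segments.append((run_start, residue - 1))
            run_start = None
    if run_start is not None:
        segments.append((run_start, last))
    return segments


# Hypothetical output of five programs for one TM segment.
EXAMPLE_PREDICTIONS = [(10, 30), (12, 31), (13, 30), (13, 32), (11, 30)]

if __name__ == "__main__":
    print(consensus_segments(EXAMPLE_PREDICTIONS))
```

Because the threshold is strict (greater than 80%), a residue predicted by exactly four of five programs is excluded, so the consensus core is typically shorter than any single prediction.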

Table 5.1 Predicted membrane spanning regions of CFTR TM2/3/4

TM      Prediction consensus a    Swissprot numbering
TM2     121-138                   118-138
TM3     196-214                   195-215
TM4     221-240                   221-241

a The TM prediction programs used in this analysis were: MemBrain (Shen and Chou 2008), TOPPred II (Claros
and von Heijne 1994), SPLIT4 (Juretic et al. 2002), DAS (Cserzo et al. 1997), MEMSAT3 (Jones 2007),
HMMTOP2 (Tusnady and Simon 2001), TMHMM2 (Krogh et al. 2001), TMPred (Hofmann and Stoffel 1993),
PHDhtm (Rost et al. 1996), SOSUI (Hirokawa et al. 1998) and TM Finder (Deber et al. 2001). Default values were
used in all programs.

By cloning the cDNA of WT human CFTR coding for the region of the protein from

amino acids 110-245 into the inducible pET32a(+) expression vector (Novagen), a soluble

thioredoxin (Trx) fusion protein was produced. Low intracellular levels of heterologous

eukaryotic protein expression are often observed in bacteria, and may be caused by protein

misfolding, resulting in recognition by the host as a foreign protein which leads to rapid

degradation (Marston 1986). Expression of eukaryotic proteins in bacteria has been shown to

increase dramatically by fusing the eukaryotic gene of interest to a bacterial gene at the N-

terminus (Marston 1986), where the bacterial protein is usually a highly expressible, soluble

protein. Overexpression of this chimeric construct then exploits the efficient translation of the

bacterial fusion partner by the host machinery.

Thioredoxin (Trx) is a small, ubiquitous, heat-stable protein which participates in various

redox reactions and catalyzes dithiol-disulfide exchange reactions. More recently, it has been

described that the E. coli Trx interacts with unfolded and denatured proteins in a manner similar

to molecular chaperones that are involved in protein refolding after cellular stress (Kern et al.

2003). The Trx-TM2/3/4 construct also contains His-tags and an S-tag for nickel-affinity

column purification, and immunological detection purposes, respectively (Fig. 5.2).

Figure 5.2. Construct of the pET32a(TM2/3/4) designed for the expression of the Trx-CFTR

TM2/3/4 fusion protein. The cDNA corresponding to amino acids 110-247 of CFTR was

subcloned into the NcoI/XhoI restriction sites of the pET32a vector. The sequence of amino

acids 110-245 is provided, with the individual TM segments underlined. These residues

correspond to those identified as membrane spanning at http://expasy.org/sprot/. The Cys

residues in the construct (Cys 125 and Cys 225) were mutated to Ala and are indicated in red.

Cleavage of the purified soluble protein with thrombin will result in the final TM2/3/4

construct consisting of TM2/3/4 of CFTR, the S-tag, and one His-tag. The goal of this construct

design is to produce quantities of the protein in E. coli suitable for biophysical characterization

(Peng et al. 1998; Therien et al. 2002) (Fig. 5.2).

5.3.1.1 Cloning of CFTR TM2/3/4 fragment: Methodology.

The pET32a(+) (Novagen) vector was chosen as the vehicle for heterologous expression

of TMs 2-4 of human CFTR in E. coli. This vector contains several unique restriction enzyme

sites in its multiple cloning region, allowing for insertion of the DNA fragment of interest into

the vector. The restriction sites NcoI and XhoI were chosen for cloning which result in the

insertion of the DNA encoding TM2/3/4 downstream of the E. coli protein Trx, creating an N-

terminal fusion construct.

Both 5′ and 3′ primers for PCR amplification of the desired segment of human CFTR's

cDNA were designed to contain restriction sites for NcoI and XhoI, respectively, with the goal

of adding these sites to the PCR amplified fragment (Table 5.2). As suggested by the supplier of

the restriction enzymes (New England Biolabs), six additional base pairs were added to the 5′

end of the restriction sites to allow for maximum efficiency of the restriction enzyme when

cleaving the PCR amplified fragment. PCR amplification of the cDNA of human CFTR would

result in the amplification of DNA encoding amino acids 110-245, inclusive.

Table 5.2. Primers used in the PCR amplification of the human CFTR cDNA.

Primer Name    Primer Sequence

5′ primer      5′ TAGCTTCCATGGACCCGGATAACAAGG 3′ a

3′ primer      5′ CTCTGAACTCGAGATCCCATCATTCTCCC 3′ b

The primers contain the restriction enzyme sites of NcoI (CCATGG) a and XhoI (CTCGAG) b, respectively.
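As a quick check on the primer design, the position of each restriction site within its primer can be located programmatically. The sketch below is illustrative only: the primer sequences are those listed in Table 5.2, the recognition sequences are the standard NcoI (CCATGG) and XhoI (CTCGAG) sites, and the helper name `site_offset` is an assumption introduced here.

```python
# Standard recognition sequences for the two enzymes used for cloning.
RECOGNITION_SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG"}

# Primer sequences as listed in Table 5.2 (written 5' to 3').
PRIMERS = {
    "5' primer": ("TAGCTTCCATGGACCCGGATAACAAGG", "NcoI"),
    "3' primer": ("CTCTGAACTCGAGATCCCATCATTCTCCC", "XhoI"),
}


def site_offset(primer: str, site: str) -> int:
    """Number of bases upstream (5') of the first occurrence of `site`.

    Returns -1 if the recognition sequence is absent from the primer.
    """
    return primer.upper().find(site.upper())


for name, (sequence, enzyme) in PRIMERS.items():
    site = RECOGNITION_SITES[enzyme]
    offset = site_offset(sequence, site)
    annealing = sequence[offset + len(site):]
    print(f"{name}: {enzyme} site {site} preceded by {offset} bases; "
          f"{len(annealing)} bases follow the site")
```

The bases that follow the site are the portion of each primer intended to anneal to the CFTR cDNA, while the bases preceding it are the overhang added for efficient digestion.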

PCR amplification reactions were performed using 5 ng of template DNA (pET32a(+)

containing the full length CFTR cDNA), 1 μg of each oligonucleotide primer (Table 5.2), 1 U

Vent Polymerase (New England Biolabs), 0.2 mM dNTP mix (New England Biolabs), 100 mM

MgSO4, and 1 x ThermoPol Reaction buffer (supplied by New England Biolabs) with a total

reaction volume of 50 μl. The reactions were overlaid with 30 μl of mineral oil. Reactions were

initially incubated at 95°C for 30 seconds, followed by 35 cycles of: i) 95°C for 30 seconds; ii)

45°C for 60 seconds; and iii) 72°C for 60 seconds. A final extension at 72°C was continued for

10 minutes. The PCR amplification product was purified from the PCR reaction mixture using

the QIAquick PCR purification kit (Qiagen).
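For planning purposes, the total programmed hold time of the cycling protocol above can be tallied as follows. This is a simple arithmetic sketch using the step durations stated in the text; it ignores thermocycler ramp times, and the function name is introduced here only for illustration.

```python
def total_hold_time_s(initial_s: int, cycle_steps_s: list,
                      n_cycles: int, final_extension_s: int) -> int:
    """Sum of programmed hold times in seconds (ramp times not included)."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_extension_s


# Durations taken from the protocol described in the text.
pcr_seconds = total_hold_time_s(
    initial_s=30,               # initial denaturation at 95 C
    cycle_steps_s=[30, 60, 60], # 95 C, 45 C and 72 C steps of each cycle
    n_cycles=35,
    final_extension_s=10 * 60,  # final extension at 72 C
)
print(f"{pcr_seconds} s = {pcr_seconds / 60:.0f} min of programmed hold time")
```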

The PCR amplified fragment containing amino acids 110-245 of human CFTR was

digested with NcoI and XhoI in order to produce a fragment that could be cloned into the

pET32a(+) vector to form the Trx-TM2/3/4 expression construct. The empty pET32a(+) was

also treated with NcoI and XhoI to produce a linearized vector.

The isolated and purified PCR fragment was then ligated into the linearized pET32a(+)

vector using T4 DNA Ligase (New England Biolabs) with a 3:1 insert to vector mass ratio.

Ligation reactions were then used to transform supercompetent XL1-Blue E. coli cells

(Stratagene), and colonies were selected for screening of recombinant plasmids. DNA

sequencing was used to confirm the successful ligation reaction and that no mutations were

introduced into the construct. Additionally, the native Cys residues, Cys 125 and Cys 225, were

mutated to Ala to facilitate working with the construct under non-reducing conditions (Fig. 5.2).

5.3.2 Protein Expression of CFTR TM2/3/4 under successful CFTR TM3/4 conditions.

While our lab and others have had success with heterologous expression of membrane

proteins, various parameters of E. coli expression must be empirically optimized for each protein

as the best possible expression conditions might not be universal, and modulating expression

conditions can drastically alter protein yields (Tate 2001; Cunningham and Deber 2007).

Growth temperature, concentration of induction agents, a variety of growth media and the E. coli

strain employed for expression are all factors contributing to protein expression levels, and were

systematically explored for the CFTR TM2/3/4 fragment.

Based on previous success in our lab of expression of CFTR TM3/4 in E. coli in

milligram quantities (Peng et al. 1998; Therien et al. 2002), optimization of CFTR TM2/3/4

expression was carried out. Initial expression trials of CFTR TM2/3/4 focused on growth and

induction conditions which were successful for the CFTR TM3/4 construct: transformation into

BL21(DE3) cells, growth in TB media (0.4% (v/v) glycerol, 2.4% (w/v) Bacto yeast

extract, 1.2% (w/v) Bacto tryptone, 17 mM KH2PO4, 72 mM K2HPO4, Appendix 2) until mid-log

phase (OD600 ≈ 0.6) at 37⁰C in the presence of ampicillin with shaking at 250 rpm. The BL21

(DE3) E. coli strain carries a chromosomal copy of the gene for the T7 RNA polymerase under

control of lacUV5 promoter (Studier and Moffatt 1986). Induction of the T7 polymerase by

isopropyl β-D-1-thiogalactopyranoside (IPTG) allows controlled expression of the Trx-TM2/3/4

gene which is placed downstream of the T7 RNA polymerase-binding site on the expression

vector. Protein expression was induced with 1 mM IPTG, followed by overnight growth at

25⁰C. The E. coli cells were then harvested via centrifugation (Peng et al. 1998; Therien et al.

2002) and lysed, and the expressed Trx-TM2/3/4 construct was then purified via nickel affinity

chromatography. Unfortunately, for the Trx-TM2/3/4 construct, levels of protein expression

were barely detectable by Coomassie staining and Western blotting (Fig. 5.3). To visualize the

levels of protein expression, an antibody to the S-tag was used (Fig. 5.2). The calculated

molecular weight of the Trx-TM2/3/4 full length fusion construct is 33.5 kDa (Fig. 5.3, Lane 1

and 2). The lower major band on the blot, with an apparent molecular weight of 20.9 kDa (Fig.

5.3, Lane 1) is most likely a degraded version of the full length product with cleavage occurring

in the large 56 amino acid loop between TM segments 2 and 3. Expression of the full-length

Trx-TM2/3/4 was not detectable on gels stained with Coomassie Blue.

Figure 5.3. Western blot of expression trial of CFTR TM2/3/4 in conditions which were

successful for CFTR TM3/4 (Therien et al. 2002). TM2/3/4 expresses poorly under these

expression conditions. Expression trial was conducted with BL21 (DE3) E. coli cells in TB

media, OD600 ≈ 0.6. Lane 1: [IPTG] = 1.0 mM with 25⁰C post-induction temperature. Lane 2:

uninduced sample. Positions of See blue molecular weight markers (Invitrogen) are indicated in

5.3.3 Heterologous expression of CFTR TM2/3/4.

In order to optimize the expression of the Trx-TM2/3/4 construct, various expression

parameters were explored. The effects of varying the induction temperature, the concentration

of induction agent, the type of E. coli cell utilized for expression, and the expression media

were all investigated. As the conditions used to produce milligram quantities of TM3/4 were

unsuccessful in producing biophysically useful amounts of TM2/3/4 (Fig. 5.3), a series of

experiments optimizing the expression of the construct were carried out, as depicted in the flow

chart in Figure 5.4.

Figure 5.4. Flow chart showing the conditions used to optimize protein expression for CFTR

TM2/3/4. Optimization of culture growth can be divided into four sets of experiments: 1) the E.

coli cell type used for heterologous protein expression; 2) the investigation of temperature at

which E. coli cultures should be induced for protein expression; 3) the investigation of the IPTG

concentration; and 4) the media type used for protein expression. The 'X' at the end of each

optimization pathway indicates insufficient quantities of TM2/3/4 construct produced for

biophysical analysis.
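The four sets of variables in the flow chart can be written out as a screening grid. The sketch below enumerates every combination of the factor levels mentioned in this chapter; it is an illustration of the size of the parameter space, and a full factorial design is an assumption made here, since the trials reported in this chapter did not necessarily cover every combination.

```python
from itertools import product

# Factor levels as named in the text of this chapter.
STRAINS = ["BL21(DE3)", "BL21 codon (+)", "C43(DE3)"]
POST_INDUCTION_TEMPS_C = [15, 25, 37]
IPTG_MM = [0.1, 1.0]
MEDIA = ["LB", "TB", "M9 rich", "M9 minimal"]

# Full factorial grid (illustrative assumption; not the experimental design).
conditions = [
    {"strain": s, "temp_C": t, "iptg_mM": c, "medium": m}
    for s, t, c, m in product(STRAINS, POST_INDUCTION_TEMPS_C, IPTG_MM, MEDIA)
]

print(f"{len(conditions)} possible expression conditions")
print(conditions[0])
```

Listing the conditions this way makes clear why the factors are usually varied one set at a time rather than exhaustively.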

5.3.3.1 E. coli strain.

The majority of membrane proteins are found naturally in very small quantities, resulting

in the need to artificially over-express these proteins in systems that are capable of generating

levels sufficient for biophysical studies after purification. In order to determine the structure of

an integral membrane protein, amounts on the order of mg quantities of purified protein are

required. Additionally, heterologous or homologous overexpression is advantageous as the use

of natural sources to isolate membrane proteins prevents the possibility of genetically modifying

these proteins to facilitate detection and purification, as well as preventing the efficient labeling

for nuclear magnetic resonance and crystallographic studies.

E. coli is often the most popular choice for high-level protein expression for both

homologous (Rastogi and Girvin 1999) and heterologous membrane protein expression

(Hawkins et al. 2005), although several other systems are available. Yeast expression systems

(Pedersen et al. 2007), cell-free expression systems (Maslennikov et al.) and mammalian cells

(Hunter et al. 2005) have all been successfully used in the production of membrane proteins

towards generating high resolution structures of the proteins of interest. While the route to

successful membrane protein over-expression can be complicated, the bacterial E. coli system

has the advantage of effective and efficient recombinant technology for plasmid construction and

protein expression. In addition, E. coli is a well-studied system and cell transformation and

culture growths are rapid and inexpensive.

The commercially available BL21 E. coli (Novagen) is the most commonly used bacterial

host for heterologous expression as it is lon and ompT protease deficient and is known to

promote plasmid stability. This specific type of E. coli cell was used in the optimization of

TM2/3/4 expression (Fig. 5.4) as it was successful in the expression of TM3/4 (Therien et al.

2002). Secondly, the BL21 codon (+) series (Stratagene) was tested for TM2/3/4 expression as it

contains extra copies of genes encoding tRNAs that are rare in E. coli, which compensates for codon bias and may improve

heterologous expression. A third cell type which was used is a BL21 derivative with

considerable success at over-expressing membrane proteins normally toxic to the cells, termed

the C43 (DE3) strain. While the genetic mutations resulting in improved protein expression are

unknown, current hypotheses suggest the mutations may affect the amount of T7 RNA

polymerase production, slowing synthesis (Dumon-Seignovert et al. 2004).

Figure 5.5 shows a Western blot detection of the Trx-TM2/3/4 expression optimization in

three different cell lines: BL21 (Novagen), BL21 (Codon Plus) (Stratagene) and C43 cells.

Expression trials were completed in M9 rich media, with a pre-induction temperature of 37⁰C,

protein expression induction with 0.1 mM IPTG, and a post-induction temperature of 25⁰C. The

cells were induced for protein expression with IPTG at OD600 ≈ 0.6, the mid-log phase of E. coli

growth. Two major protein expression bands appeared in this Western blot of cell lysates

expressing the chimeric protein: the higher molecular weight band corresponding to the full

length Trx-TM2/3/4 fusion with an apparent molecular weight of 34.4 kDa (Fig. 5.5, lanes 2, 4

and 6), and a lower band which is most likely a degraded version of the full length product (Fig.

5.5, lanes 2 and 4). Varying the E. coli cell type appears to have an effect on expression of the

Trx-TM2/3/4 construct with BL21 cells producing the largest amount of protein, albeit with the

greatest amount of protein degradation (Fig. 5.3, Lane 1). BL21 codon (+) cells biased for E.

coli codon preferences produced less chimeric construct compared to BL21 E. coli cells (Fig.

5.5, Lane 2 and 4). C43 cells are an E. coli strain that has been developed and optimized for membrane

protein overexpression (Miroux and Walker 1996); these cells appear to have intact

chimeric protein expression, with no visible protein degradation (Fig. 5.5, Lane 6). In all cell

types investigated, no protein expression was observed prior to expression induction with IPTG

(Fig. 5.5, Lanes 3, 5 and 7). Similar results were observed when the expression post-induction

temperature was shifted upwards to 37⁰C, except that at this higher temperature, larger amounts

of protein degradation were observed (data not shown).

Figure 5.5. Expression of CFTR TM2/3/4 in various E. coli cell lines: BL21, BL21 (codon

plus) and C43. All other protein expression conditions were identical: 0.1 mM IPTG, M9 rich

growth media, and 25⁰C expression induction. Cells were induced at mid-log phase of growth

(A600 ≈ 0.6). The calculated molecular weight of Trx-TM2/3/4 fusion is 33.5 kDa. Lane 1: See

blue molecular weight marker (kDa). Lane 2: BL21 E. coli cells, overnight induction. Lane 3:

Pre-induction of BL21 E. coli cells. Lane 4: BL21 codon (+) E. coli cells, overnight induction.

Lane 5: Pre-induction of BL21 codon (+) E. coli cells. Lane 6: C43 E. coli cells, overnight

induction. Lane 7: Pre-induction of C43 E. coli cells.

There is, however, no necessity to restrict bacterial membrane protein overexpression to

E. coli. Several other

promising bacterial expression systems have been developed: Lactococcus lactis is a Gram-

positive lactic acid bacterium that has been used to express a limited number of membrane

proteins including ABC transporters, major facilitator superfamily transporters,

mechanosensitive channels, and lipoproteins (Kunji et al. 2003). The archaeon Halobacterium

salinarum has also been successfully used to produce integral membrane proteins in quantities

suitable for high resolution studies (Lanyi and Schobert 2002). Both of these systems are ideal

for membrane protein expression trials, as they have well studied genetic systems, are easy to

culture and are cost effective.

5.3.3.2 Temperature of protein expression induction and concentration of IPTG.

Growth cultures were incubated at various temperatures post-induction in order to

increase the production of the chimeric Trx-TM2/3/4, as well as to reduce the amount of

degradation observed. A reduction in expression temperature can be associated with an increase

in the amount of soluble protein expressed by bacterial cells (Cunningham and Deber 2007), and

can be further beneficial as an increase in proteolytic degradation is associated with high

temperature expression (Quick and Wright 2002). It has been noted that at lower temperatures,

heat shock proteases normally induced in E. coli over-expression conditions are partially

eliminated (Sorensen and Mortensen 2005b).

Protein expression was carried out in BL21 E. coli cells in M9 minimal growth media,

expression induction with either 0.1 or 1.0 mM IPTG with post-induction temperatures of 15⁰C,

25⁰C and 37⁰C. As in the example above, at higher temperatures the expression of Trx-TM2/3/4

appears to improve, but this is associated with an increase in cleavage product produced,

rendering 37⁰C induction temperature unsuitable for expression (Fig. 5.6, Lane 9 and 10).

Lowering the post-induction temperature from 37⁰C to 25⁰C (Fig. 5.6, Lanes 6 and 7) decreases

the amount of cleavage product produced and appears to yield a larger amount of full length

construct. Expression of Trx-TM2/3/4 at 15⁰C produced the least amount of cleavage product

compared to protein expression induction at 25⁰C and 37⁰C, but expression levels were still

unsuitable for biophysical analysis (Fig. 5.6, Lane 3 and 4). Protein expression could not be

detected by Coomassie blue staining at any temperature indicated (data not shown). At all post-

induction expression temperatures investigated, no expression of Trx-TM2/3/4 was observed

without the addition of IPTG (Fig. 5.6, Lane 2, 5 and 8).

Figure 5.6. Western blot showing the effect of different induction temperatures on protein

expression of Trx-TM2/3/4, at two different concentrations of IPTG: 0.1 mM and 1.0 mM.

Cells were induced at mid-log phase of growth (A600 ≈ 0.6). The calculated molecular weight of

Trx-TM2/3/4 fusion is 33.5 kDa. Lane 1: See blue molecular weight marker (kDa). Lane 2: 15°C,

(-) IPTG, pre-induction. Lane 3: 15°C, 0.1 mM IPTG. Lane 4: 15°C, 1.0 mM IPTG. Lane

5: 25⁰C, (-) IPTG, pre-induction. Lane 6: 25⁰C, 0.1 mM IPTG. Lane 7: 25⁰C, 1.0 mM IPTG.

Lane 8: 37⁰C, (-) IPTG, pre-induction. Lane 9: 37⁰C, 0.1 mM IPTG. Lane 10: 37⁰C, 1.0 mM

IPTG. Increasing the induction temperature from 15⁰C to 37⁰C does not increase the amount of

fusion protein expressed by the E. coli BL21 cells, but increases the amount of cleavage product

observed.

Additionally, it can be noted from the Western blot of Trx-TM2/3/4 expression shown in

Figure 5.6 that the concentration of the IPTG used to induce protein expression appears to have

no effect on the levels of protein produced. Increasing the concentration of IPTG from 0.1 mM

to 1.0 mM does not improve expression of the fusion construct at any of the post-induction

temperatures tested.

For membrane protein expression constructs that are controlled by inducible promoters,

varying the concentration of induction agent can, in some cases, affect protein expression.

Lowering IPTG concentrations may improve protein expression, as it has been observed that the

rate of protein synthesis is reduced at low IPTG concentrations. This may favor protein folding, and

improve stability of the desired product (Yang et al. 1997). In the case of chicken liver

6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase (CKB), the optimum IPTG concentration for

expression was in the range of 0.1 to 1 μM, irrespective of growth media or

temperature. At IPTG concentrations above these levels, inefficient protein production was

observed (Yang et al. 1997).

IPTG is commonly used as an induction agent in E. coli protein expression, but additional

induction strategies in the production of membrane proteins are available. One such example is

utilizing the promoter of the E. coli cold-shock protein CspA, whose expression is dramatically

increased at temperatures in the region of 15°C (Goldstein et al. 1990). Utilizing the cspA

promoter to drive the transcription and production of the first topological domain of the E. coli

transenvelope protein TolA saw an increase of the amount of protein produced relative to

expression trials conducted with IPTG (Mujacic et al. 1999).

5.3.3.3 Growth media.

Various growth media were also used to explore protein expression optimization of Trx-

TM2/3/4. Commonly used media and their ingredients are listed in Appendix 2. For the

optimized expression of Trx-TM2/3/4, the following media were used for bacterial growth: LB

medium, TB medium, M9 rich and M9 minimal media (Fig. 5.4). LB is a nutritionally rich

medium, and it continues to be one of the most common media used for maintaining and

cultivating recombinant strains of E. coli. TB medium is a complex medium much like LB, but

it has been shown to support higher cell densities, as it contains higher amounts of yeast extract

and tryptone, and contains glycerol as a carbon source. M9 medium has the added benefit that it can

be supplemented to produce higher growth rates or to allow growth of strains that require

additives (e.g. thiamine or casamino acids). It has also been noted that supplementing media

with glucose (0.2-1%) can increase protein expression (Douglas et al. 2005; Sorensen and

Mortensen 2005a); increasing glucose concentrations is believed to decrease promoter repression

resulting in improved protein synthesis (Yang et al. 1997).

The results of optimizing Trx-TM2/3/4 protein expression in various media are detailed

in Figure 5.7. Poor expression of the construct is observed in both LB media (Fig. 5.7, Lanes 2

and 3) and TB media (Fig. 5.7, Lanes 6 and 7). This poor expression is surprising as both of

these media types were used successfully in expressing the hairpin of CFTR TM3/4 in quantities

suitable for biophysical analysis (Peng et al. 1998; Therien et al. 2002). For both LB and TB

media, the observable protein expression is very low, and the majority of the protein expression

appears to be of a TM2/3/4 cleavage product, with cleavage likely happening in the large loop

between TM2 and TM3. Protein expression improves with use of M9 media (Fig. 5.7, Lanes

9-10 and 12-13): in this case, protein expression is slowed as the bacteria are not supplemented

with amino acids and are required to produce their own. This is contrary to conditions provided

by rich media such as TB and LB, where post-induction, the majority of cellular resources are

directed towards protein expression, instead of required cellular functions. As seen in Lane 9,

expression of Trx-TM2/3/4 in M9 minimal media with an induction temperature of 25⁰C

produces the largest quantity of protein among the various media tested; however, at the higher

induction temperature of 37⁰C, a greater proportion of cleavage is observed for the construct

(Fig. 5.7, Lane 10). Expression of Trx-TM2/3/4 in M9 rich media, supplemented with amino

acids through the addition of casein enzymatic hydrolysate (Appendix 2) is poorer than for M9

minimal media. These results suggest that a relatively reduced rate of protein expression is

required for more efficient production of Trx-TM2/3/4 by E. coli.

Figure 5.7. Western blot showing the effect of different growth media on protein expression of

Trx-TM2/3/4, at two different induction temperatures: 25⁰C and 37⁰C. Protein expression was

induced with the addition of 0.1 mM IPTG. Optimal Trx-TM2/3/4 protein expression appears to

occur in M9 minimal media (Lane 9). Cells were induced at mid-log phase of growth (A600 ≈

0.6). The calculated molecular weight of Trx-TM2/3/4 fusion is 33.5 kDa. Lane 1: See blue

molecular weight marker (kDa). Lane 2: LB media, (-) IPTG, pre-induction. Lane 3: LB media,

25⁰C. Lane 4: LB media, 37⁰C. Lane 5: TB media, (-) IPTG, pre-induction. Lane 6: TB

media, 25⁰C. Lane 7: TB media, 37⁰C. Lane 8: M9 minimal media, (-) IPTG, pre-induction.

Lane 9: M9 minimal media, 25⁰C. Lane 10: M9 minimal media, 37⁰C. Lane 11: M9 rich

media, (-) IPTG, pre-induction. Lane 12: M9 rich media, 25⁰C. Lane 13: M9 rich media, 37⁰C.

Varying the growth media in membrane protein expression can have large effects on the

production of membrane proteins. Several different types of media are available for use in the

bacterial expression of membrane proteins (Appendix 2), but the quality and amount of protein

produced by varying the media is largely an empirical process. Optimizing the expression of

membrane proteins in various media also has utility beyond the primary goal of protein

overexpression; expression in minimal media allows for isotopic labeling of proteins. This is not

possible when proteins are isolated from natural sources.

5.3.4 Removing intracellular loop 1 between TM2 and TM3 to improve protein

expression.

E. coli BL21 cells are widely used for protein over-expression as this host strain is

deficient in two proteases encoded by the lon and ompT genes; however, significant proteolytic

cleavage of the Trx-TM2/3/4 construct during expression remained (Fig. 5.5-5.7). This

degradation of over-expressed protein is, however, not uncommon, and degradation by the host

proteases guarantees that abnormal polypeptides do not accumulate within the cell and also

allows for amino acid recycling. Expressed proteins targeted for degradation can include

prematurely terminated polypeptides, both proteolytically and as a result of vulnerable folding

intermediates (Baneyx and Mujacic 2004).

In an effort to reduce the degradation of the Trx-TM2/3/4 construct and improve

expression levels, a modified construct was engineered where 86% of the loop sequence was

removed and replaced with a soluble loop (KSPGSK) which included Pro and Gly residues as

these amino acids have high β-turn propensities according to Chou-Fasman rules (Monne et al.

1999) (Fig. 5.8); this construct was termed Trx-TM2/3/4-Loop. The WT TM2/3/4 construct

contains the 56-residue intracellular loop-1 of CFTR that may be targeting the full-length

construct for degradation, which can be seen as the lower band on a Western blot (Fig. 5.5-5.7).

Figure 5.8. Construct of the pET32a(TM2/3/4)-Loop designed for the expression of the Trx-

TM2/3/4Loop fusion protein. The cDNA corresponding to amino acids 110-247 of human

CFTR with amino acids 143-190 replaced with a soluble loop sequence “KSPGSK” was

subcloned into the NcoI/XhoI restriction sites of the pET32a vector. The amino acid sequence

of the altered construct is shown, with the individual TM segments underlined. The Cys residues

in the construct (Cys 125 and Cys 225) were mutated to Ala and are indicated in red. The loop

construct is indicated in blue.
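The figures quoted for the loop replacement can be cross-checked with a short calculation. The sketch below uses only numbers stated in this chapter (residues 143-190 replaced by KSPGSK, a 56-residue loop, and calculated fusion masses of 33.5 and 28.6 kDa); the variable names are introduced here for illustration, and the per-residue mass is derived from those figures rather than measured.

```python
# Values stated in the text.
LOOP_LENGTH = 56                  # residues in intracellular loop 1
REMOVED_FIRST, REMOVED_LAST = 143, 190
REPLACEMENT_LOOP = "KSPGSK"
MASS_WT_KDA = 33.5                # calculated mass, Trx-TM2/3/4
MASS_LOOP_KDA = 28.6              # calculated mass, loop-replaced construct

removed = REMOVED_LAST - REMOVED_FIRST + 1          # inclusive residue count
percent_removed = 100.0 * removed / LOOP_LENGTH
net_residues_lost = removed - len(REPLACEMENT_LOOP)
mass_difference_kda = MASS_WT_KDA - MASS_LOOP_KDA
da_per_residue = 1000.0 * mass_difference_kda / net_residues_lost

print(f"{removed} of {LOOP_LENGTH} loop residues removed "
      f"({percent_removed:.0f}%)")
print(f"net change: {net_residues_lost} residues, "
      f"{mass_difference_kda:.1f} kDa "
      f"(about {da_per_residue:.0f} Da per residue)")
```

The implied mass per residue is close to the value typically assumed for an average amino acid residue, so the residue counts and the two calculated masses are mutually consistent.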

Expression optimization was conducted for the Trx-TM2/3/4-Loop construct in a

similar manner to the Trx-TM2/3/4 construct, where E. coli cell type, concentration of IPTG,

expression induction temperature, and/or expression media were varied and the resulting

expression levels probed via Western blot. Expressing Trx-TM2/3/4-Loop in different E. coli

cell types including BL21(DE3), BL21 codon (+), and the C43 (DE3) as well as altering the

concentration of protein expression induction agent (IPTG) resulted in only small changes in

expression among trials (data not shown). Altering the growth temperature post-induction did

affect the expression of the construct, in that there are still greater amounts of observable

cleavage at the higher expression temperature of 37⁰C (Fig. 5.9, Lanes 2-9 vs. Lanes 10-17).

Similarly to the Trx-TM2/3/4 construct, the loop deleted Trx-TM2/3/4-Loop experiences

increased intracellular proteolytic cleavage by host E. coli proteases at higher expression

temperatures (37⁰C vs. 25⁰C). Exploring expression conditions for the Trx-TM2/3/4-Loop

construct with different growth media types did reveal differences in expression levels: the best

construct expression was observed for the Trx-TM2/3/4-Loop construct in TB media (Fig. 5.9,

Lane 5). While attempts at protein expression for Trx-TM2/3/4-Loop appear to be more

successful than expression optimization of the full-length Trx-TM2/3/4, expression was not

detectable via Coomassie staining (Fig. 5.9B).

Figure 5.9. Expression trials showing the effect of different growth media on protein expression

of Trx-TM2/3/4-Loop, at two different induction temperatures: 25⁰C (Lanes 2-9) and 37⁰C

(Lanes 10-17) in BL21 (DE3) E. coli cells. Protein expression was induced with the addition of

0.1 mM IPTG. Optimal Trx-TM2/3/4-Loop protein expression appears to occur in TB media

with a lower protein expression induction temperature of 25⁰C (Lane 5). Cells were induced at

mid-log phase of growth (A600 ≈ 0.6). A) Western blot. B) Coomassie stained duplicate gel.

The calculated molecular weight of Trx-TM2/3/4-Loop fusion is 28.6 kDa. Lane 1: See blue

molecular weight marker (kDa). Lane 2: LB media, (-) IPTG, pre-induction. Lane 3: LB media,

25⁰C. Lane 4: TB media, (-) IPTG, pre-induction. Lane 5: TB media, 25⁰C. Lane 6: M9 rich

media, (-) IPTG, pre-induction. Lane 7: M9 rich media, 25⁰C. Lane 8: M9 minimal media, (-)

IPTG, pre-induction. Lane 9: M9 minimal media, 25⁰C. Lane 10: LB media, (-) IPTG, pre-

induction. Lane 11: LB media, 37⁰C. Lane 12: TB media, (-) IPTG, pre-induction. Lane 13: TB

media, 37⁰C. Lane 14: M9 rich media, (-) IPTG, pre-induction. Lane 15: M9 rich media, 37⁰C.

Lane 16: M9 minimal media, (-) IPTG, pre-induction. Lane 17: M9 minimal media, 37⁰C.

5.4 Characterization of the expressed cystic fibrosis transmembrane conductance

regulator fragments.

Previous work has shown that there can be structural consequences related to the addition

or removal of polar residues in TM segments and fragments of membrane proteins (Therien et al.

2001; Zhou et al. 2001; Wehbi et al. 2008; Rath et al. 2009a). Through the use of TM2/3/4 of

CFTR as a model, the consequences of such non-native residues in a larger TM system could be

investigated and the results of analyzing fragments of membrane proteins may provide a

potential explanation. For example, while much biochemical data has been collected on the

cytoplasmic domains of CFTR and how mutations in these soluble domains can lead to disease

(Riordan 2005; Frelet and Klein 2006), there exists a relative paucity of information regarding

CF causing mutations in the TM regions. While the heterologous expression of CFTR-TM2/3/4

remained below levels suitable for full biophysical characterization, investigating the migration

of the CFTR-TM2/3/4 construct on SDS-PAGE via Western Blot can still find utility and

highlight resulting structural differences. The work presented below expands on the

investigation of helical hairpin systems, with a view towards assessing the structural

consequences of introducing non-conservative mutations into a three-TM system.

5.4.1 Differential gel migration of TM2/3/4 mutants.

The migration of this three-strand construct and its mutants could be assessed via SDS-

PAGE and Western blotting with antibodies directed towards the S-tag on the construct (Fig.

5.2). Similarly to the CFTR TM3/4 construct, mutants in the TM regions of CFTR-TM2/3/4

could reflect changes to the structure, which may be detected as differences in migration on a

gel. This method of evaluating structural changes to TM containing constructs is quite sensitive,

and differences in migration relative to WT of as much as 30-40% of the actual molecular weight of

the protein have been observed for single point mutations (Rath et al. 2009a). Accordingly,

several TM mutations were generated, each chosen for a specific reason. G126D was chosen as it is

a CF phenotypic mutation located in the middle portion of TM2 (Fig. 5.10A) (Wagner et al.

1994). This mutation was identified in a cystic fibrosis male patient using gradient gel

electrophoresis as a rapid method for screening a large number of CF patients for point mutations

in the CFTR exons (Wagner et al. 1994). Currently the consequence of the G126D mutation on

the folding pathway of CFTR in the full-length protein is unknown. I231E was chosen since a

closely related mutation (I231D) was shown to have the greatest percentage molecular weight

change relative to WT in migration studies of CFTR TM3/4 (Choi et al. 2004), and that mutation

to Glu (I231E) amplified this effect (unpublished results) (Fig. 5.10B). While I231E is not a

cystic fibrosis-related mutation, the neighboring phenotypic mutation V232D appears to result in

decreased glycosylation with no apparent maturation product at the cell surface. When run on

SDS-PAGE, TM2/3/4 constructs treated with thrombin to remove the Trx fusion in some cases

had altered migration properties relative to WT. This suggests that mutation of the residues

comprising the TM regions can affect the structure of the construct either through altered

protein-detergent complexes, altered secondary structure, or altered TM-TM contacts.

Figure 5.10. Differential migration of WT CFTR-TM2/3/4 relative to mutants. A) Western

blot of TM2/3/4-WT (Lane 1) and TM2/3/4-G126D mutant (Lane 2). The TM2/3/4-G126D

mutant migrates faster relative to the TM2/3/4-WT construct. B) Western blot of TM2/3/4-WT

(Lane 1) and TM2/3/4-I231E mutant (Lane 2). The TM2/3/4-I231E mutant also migrates faster

relative to the TM2/3/4-WT construct. Mark 12 molecular weight markers are indicated on each

Originally, it was suggested that introduction of a polar residue into a TM hairpin

construct could alter helical interactions, potentially through the introduction of a non-native

hydrogen bond (Therien et al. 2001). Subsequent studies have shown that altered migration on

SDS-PAGE can also be a result of altered protein-detergent complexes, or altered secondary

structure. While it is possible that the addition of a charged residue to a TM hairpin construct

can contribute to increased gel migration on SDS-PAGE, the majority of the difference in gel

migration can be explained by altered detergent binding, rather than the addition of, or mutation

to, a charged residue. For example, mutations in the CFTR-TM3/4 hairpin from the WT residue

Val (V232) to either Asp (V232D) or Lys (V232K) both display migration on SDS-PAGE that is

significantly faster than WT, and both of these constructs bind significantly less detergent than

WT (Fig. 5.11) (Rath et al. 2009a). If the addition of a charged residue was the predominant

factor in dictating gel migration rates, then the V232D and V232K mutations should have

opposite effects on gel migration: V232D should migrate faster than WT, while V232K should

move slower. Additionally, the mutation to a charged residue into a hydrophobic construct

decreases the amount of detergent bound to the construct, and this decreases the mass-to-charge

ratio of the construct. If the overall charge of the construct, including that provided by the SDS

detergent molecules, was the dominant factor in determining gel migration, then the CFTR-

TM3/4 V232D and V232K mutants should both run slower than the wild type. As both of these

mutants bind less detergent than the WT construct, it is solvation by detergent, and the subsequent

structural consequence of this binding, that is the primary effect in dictating migration in SDS-

PAGE. Molecules of SDS aggregate at hydrophobic sites on the protein: migration through a gel

is inversely proportional to the amount of detergent bound to the protein (Rath et al. 2009a).

Figure 5.11. CFTR TM3/4 hairpin sequence and SDS-PAGE analysis. A) Amino acid sequence

of the WT TM3/4 hairpin. Residues predicted to be helical in the CFTR TM3/4 construct are

shown in green text and the predicted loop regions are shown in black text (Riordan et al. 1989).

The V232 residue where mutations are made is underlined. B) Representative SDS-PAGE of

helical hairpin mutants. Positions of MW standards (in kDa) are indicated. This figure is adapted

from (Rath et al. 2009a).

Increasing or decreasing amounts of hairpin secondary structure affects SDS-PAGE

migration by affecting the relative compactness of a migrating construct (Wehbi et al. 2007).

Protein tertiary structure can also affect SDS-PAGE migration rates. For example, disulfide

bonds within or between polypeptide chains can affect both the compactness of a protein

structure and the detergent loading capabilities of the construct. An increase in the migration on

SDS-PAGE was observed for a compact disulfide (S-S) bridged conformation of CFTR-TM3/4

(Therien et al. 2001). Unfortunately, the amounts of CFTR TM2/3/4 and mutants thereof were

insufficient to study changes to the secondary structure of the construct.

5.5 Discussion.

The work presented here provides a framework for membrane protein expression

optimization and indicates an approach that can be adopted. Several optimization strategies were

outlined and followed for Trx-TM2/3/4 that included investigations on the effects of

temperature, bacterial cell strain, culture media and concentration of induction agent, with the

goal towards improving protein expression.

Our work reaffirms that to improve the expression of membrane proteins utilizing

bacteria as the expression vehicle, a number of expression temperatures must be tested to achieve

optimal results. In the case of Trx-TM2/3/4, a reduction in temperature was found to increase

expression, but this may not be universal depending on the protein. In fact, successful

overexpression of membrane proteins has been observed at a variety of temperatures including

the expression of bacteriorhodopsin at a relatively high temperature of 37°C (Faham et al. 2004),

CFTR-TM3/4 at an intermediate temperature of 25°C (Therien et al. 2002), and successful

overexpression of the human Na+/glucose co-transporter (hSGLT1) at 16°C (Quick and Wright

2002). Often, decreasing the temperature can be associated with an increase in the amount of

protein expressed: high temperature expression can lead to an increased rate of proteolytic

degradation (Quick and Wright 2002), and can trigger cellular stress situations. Cell stresses can

then lead to protein aggregation and inclusion body formation, which may not be ideal for

membrane protein overexpression and subsequent protein re-folding (Cunningham and Deber

2007).

While the work described herein focuses on the use of E. coli as the heterologous

expression vehicle, other systems are available for the over-production of membrane proteins.

The yeast systems Saccharomyces cerevisiae and Pichia pastoris are widely used, and similarly

to bacteria, yeast are well characterized and straightforward to modify genetically. They can be

easily and inexpensively grown, and most importantly they are capable of protein processing and

post-translational modification mechanisms related to those found in mammalian cells. While

this processing may not recapitulate mammalian post-translational modifications exactly, the

ease of use of this system makes yeast an attractive expression tool (Midgett and Madden 2007).

The Shaker family voltage dependent potassium channel is an example of a membrane protein

that was overexpressed and structurally solved at high resolution via the use of P. pastoris as the

expression vehicle (Long et al. 2007).

Membrane proteins can also be overexpressed in insect cells via the baculovirus

expression system. Insect cells are simpler to maintain than mammalian cells, and they offer

membrane composition and protein processing machinery closer to those of mammalian cells

than yeast. The baculovirus expression system has been used successfully in the expression of

G protein-coupled receptors (Akermoun et al. 2005) as well as human Aquaporin-4 (Hiroaki et

al. 2006).

While notoriously challenging to work with, mammalian

cells offer an additional choice for the overexpression of membrane proteins as they possess

cellular machinery most closely associated with human physiology and disease; however, use of

mammalian cells has been limited by difficulty and expense, with a few exceptions: the

crystallographic structure of recombinant rhodopsin was successfully solved following its

expression and purification from COS-1 cells (Standfuss et al. 2007). Mammalian cells are also

chosen as the expression tool when lower organisms are not compatible for expressing the

protein of interest. In experiments designed to define the molecular basis for the inability of E.

coli to express the complete liver H+/Pi transporter, in vitro transcription and translation assays

showed that the complete transporter is only expressed with eukaryotic ribosomes, and

inefficiently expressed in the presence of prokaryotic ribosomes (Ferreira and Pedersen 1992).

Care must be taken when choosing the expression vehicle for membrane protein over-

expression. For example, it has been shown that post-translational modifications such as

glycosylation are important for the functional expression of membrane proteins such as the

serotonin transporter. Without in vivo N-glycosylation, the serotonin transporter fails to fold

normally, and aggregates within the cell (Tate 2001). As a result, this example cannot be

overexpressed in bacterial or yeast systems.

There are additional features of the expression process that can be tailored during

optimization. In the case of membrane proteins, several choices are available when targeting

protein expression to a specific location in cells. Expression in the lipid bilayer, soluble

expression in concert with a solubilizing fusion partner to both the cytoplasm and the periplasm,

and targeting expression to inclusion bodies within E. coli, can each be utilized to increase

membrane protein expression. Success with expressing hairpin fragments of CFTR in our lab as

soluble fusion partners to an E. coli protein (Therien et al. 2002; Rath et al. 2009a) has made this

an attractive route for the further characterization of the remainder of the CFTR TM domain. In

addition to Trx, several other fusion domains have been successfully used to increase membrane

protein expression: glutathione-S-transferase, MBP, and the chitin-binding domain are all

popular choices in the effort to increase protein expression (Laage and Langosch 2001).

Additionally, modification of the construct can also affect expression rates. Extension or

truncation of the N- and/or C-termini can affect expression of the construct of interest.

The work described in this chapter also highlights the importance of construct design. As

a model to study α-helical interactions within the membrane bilayer, expression of helical

constructs such as TM2/3/4 would provide useful details on the sequence determinants of helical

contacts, and perhaps when these contacts occur. Examination of the structural model of CFTR

and the structure of the related ABC transporter Sav1866 indicates that TMs 2-4 of CFTR may

not be in tandem contact in the final fold of the protein. Further research would need to be

completed to determine if TM2 would contact a TM3/4 hairpin structure in an in vitro setting.

A strategy for the extensive optimization of membrane protein expression leading to

amounts of Trx-TM2/3/4 suitable for study has been outlined here. Following systematic

methodology in the optimization of membrane protein expression, and taking into consideration

the complex host requirements for expression, the approaches described in this Chapter should

ultimately lead to the successful overexpression of a membrane protein of interest.

Chapter 6. Discussion.

6.1 Discussion.

Membrane proteins account for a large proportion of the total protein content in cells

(Boyd et al. 1998), where they are responsible for a variety of functions including transportation

of essential cellular substrates across the membrane, signal transduction and cellular recognition.

Due to their inherent hydrophobicity, elucidating the structural and functional aspects of

membrane proteins has proven a unique challenge for researchers. As membrane proteins are of

great medical relevance (Yildirim et al. 2007), it would be extremely useful to be able to predict

the final folded structure of a membrane protein from first principles associated with its primary

amino acid sequence. Membrane proteins have been implicated in many diseases such as cystic

fibrosis, Alzheimer's disease, retinitis pigmentosa and hereditary hearing loss (Partridge et al.

2002b).

In order to accomplish this task, however, improvements must be made to the individual

components of membrane protein production, prediction, and ultimately to a detailed definition

of the aspects contributing to a final folded structure. The work presented in this thesis

investigates several features of this process, from examining the role of TM amino acid sequence

and their contributions to TM-TM packing, accurate prediction of TM segments from the

primary sequence, determinants for TM segment selection by the cellular machinery in vivo,

conversion of marginally hydrophobic segments from soluble proteins into membrane-spanning

segments, and the optimization of membrane protein expression. To these ends, we evaluated

various parameters of TM segment selection, membrane protein expression, and structure. The

dependence of amino acid sequence on TM helix-helix interactions as well as contributions by

surrounding lipids to TM oligomerization was investigated to determine TM protein folding

determinants using the single-pass TM protein GpA (Chapter 2). Comparisons of TM-like

segments from soluble proteins - which we termed δ-helices - to actual TM segments, helped to

define compositional differences between these groups as well as highlighted the distribution of

charged residues along the helical axis which can be used as a discriminating factor to remove

false positives from prediction outputs (Chapter 3). The requirements of converting a

hydrophobic δ-helix from a soluble protein to a membrane-spanning segment were investigated

in a model system (Chapter 4). Finally, utilizing TM2/3/4 of CFTR as a model, it was found that

a number of factors can be extensively optimized that contribute to achieving successful protein

overexpression (Chapter 5). The overall major insights of this thesis are discussed below:

6.1.1 Summary of contributions.

6.1.1.1 Beta-branched residues adjacent to GG4 motifs promote the efficient association of

glycophorin A transmembrane helices.

Interactive sites between TM α-helices commonly contain small residue patterns (termed

GG4 or 'small-xxx-small' motifs) at i and i + 4 positions along the helical axis. This small

residue pattern often occurs with β-branched aliphatic residues at adjacent positions, as typified

by the GpA dimerization sequence (L75IxxGVxxGVxxT87). In Chapter 2 we explored the

importance of local β-branched character on GpA dimerization by making systematic

replacements to all 16 combinations of Val, Ile, Leu, and Ala residues at the Val80 and Val84

positions. Using the TOXCAT system to assay self-oligomerization in the E. coli inner

membrane, we observed that combinations of Val and Ile residues maintained, or improved

dimerization levels; single Ala or Leu mutant combinations with Val or Ile maintained near-WT

dimerization affinities; and in the absence of β-branching, i.e., Leu/Leu, Ala/Ala and Ala/Leu

combinations, GpA dimerization was significantly diminished. Our results in Chapter 2 indicate

an apparent capacity of Ile-containing mutants to increase GpA dimerization vs. WT, which

likely arises from improved van der Waals packing (vs. Val). This is also consistent with

correlations we noted in lipid accessibility measurements. Examination of several synthetic

peptides with sequences corresponding to selected GpA mutants (VV, VI, IV, II, and LL)

confirmed their dimerization on SDS-PAGE. The results presented in Chapter 2 reinforce the

importance of a β-branch-containing 'ridge' residue to complement a 'small-xxx-small groove'

in promotion of TM-TM interactions and highlight the sequence dependence of TM segment

association.
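The 16-member substitution set described above can be enumerated programmatically. The following is a minimal sketch rather than the procedure used in Chapter 2: the function and variable names are hypothetical, and the grouping simply counts β-branched residues (Val, Ile) at the two positions.

```python
from itertools import product

# Residues substituted at positions 80 and 84 of the GpA TM segment (from the text).
RESIDUES = ("V", "I", "L", "A")
# Val and Ile are beta-branched; Leu and Ala are not.
BETA_BRANCHED = {"V", "I"}


def enumerate_mutants():
    """Return every (position 80, position 84) residue pair with its beta-branch count."""
    mutants = []
    for res80, res84 in product(RESIDUES, repeat=2):
        n_beta = sum(res in BETA_BRANCHED for res in (res80, res84))
        mutants.append((res80 + res84, n_beta))
    return mutants


if __name__ == "__main__":
    for name, n_beta in enumerate_mutants():
        print(f"{name}: {n_beta} beta-branched residue(s)")
```

Pairs with a count of zero (LL, LA, AL and AA) correspond to the combinations for which dimerization was reported above to be significantly diminished.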

6.1.1.2 Distinctions between hydrophobic helices in globular proteins and transmembrane

segments as factors in protein sorting.

Generally, TM segments can be distinguished in the primary amino acid sequence as

continuous stretches of hydrophobic residues above a specific hydrophobicity threshold;

however, a database we created of helical globular proteins revealed that nearly one-third of the

proteins in the database contained helices of sufficient length to span a bilayer (≥ 19 residues),

and in many instances, had mean hydrophobicity greater than actual TM segments. We termed

these hydrophobic segments from globular proteins “δ-helices”. In Chapter 3 we found that

peptides corresponding to selected δ-helix segments behave similarly to native TM sequences as

they readily insert into membrane mimetic environments in helical conformations. As well,

certain δ-helix sequences can integrate into the membrane bilayer when placed into a membrane-

targeted TOXCAT chimeric protein. Computationally, we established that δ-helices can be

distinguished from bona fide TM segments by the decreased frequency of occurrence of Ile/Val

residues, and by their relatively decreased solvent accessibilities (vs. other globular helices)

within tertiary structure. δ-helices generally contain three or more charged residues, and they

display relatively even distributions of these charged residues along their lengths – rather than

concentration near their N- and C-termini as observed for TM segments. This distinction may

constitute a key recognition factor in diverting δ-helices from the membrane in vivo. The results

presented in Chapter 3 identify additional factors that may be important in the correct selection

of TM segments by the cellular machinery, and suggest that δ-helices may be required for

globular protein folding.
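The charged-residue criterion described above lends itself to a simple calculation. The sketch below is illustrative only and is not the analysis performed in Chapter 3: it assumes that Asp, Glu, Lys and Arg are the charged residues (His is excluded), and the terminal window of five residues is an arbitrary cutoff chosen for the example. The sequence is the 1ECA wild-type segment with flanking residues as listed in Table 6.1.

```python
# Charged residues at neutral pH; His is excluded here (an assumption of this sketch).
CHARGED = set("DEKR")


def charged_positions(seq):
    """Return 1-based positions of charged residues in seq."""
    return [i for i, aa in enumerate(seq, start=1) if aa in CHARGED]


def terminal_fraction(seq, window=5):
    """Fraction of charged residues lying within `window` residues of either end.

    `window` is an arbitrary illustrative cutoff, not a value from the thesis.
    Returns None when the segment has no charged residues.
    """
    positions = charged_positions(seq)
    if not positions:
        return None
    n = len(seq)
    terminal = [p for p in positions if p <= window or p > n - window]
    return len(terminal) / len(positions)


if __name__ == "__main__":
    # 1ECA wild-type segment with flanking residues, as listed in Table 6.1.
    seq = "DFAGAEAAWGATLDTFFGMIFSKM"
    print("charged positions:", charged_positions(seq))
    print("fraction near termini:", terminal_fraction(seq))
```

A segment whose charged residues all sit near the termini would give a fraction of 1, whereas charges spread evenly along the segment give a lower value.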

This work can also expand on the prediction process that is used to identify TM segments

from the primary amino acid sequence. In Chapter 3, two features of TM segments were

identified that will aid in the separation of TM α-helices from α-helices from globular proteins

with significant hydrophobic character (δ-helices): the skewed positioning of charged residues

along the helical axis; and the significant content of large, hydrophobic β-branched residues

(Ile/Val) in the sequence of TM segments (Cunningham et al. 2009). The weeding of false

positives such as δ-helices from prediction programs is an important goal, as identification of

TM proteins from volumes of sequence data is not yet routinely possible in the large-scale study

of proteins.
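The Ile/Val content feature can be computed directly from a sequence. The following sketch is a hypothetical helper, not part of any prediction program discussed here, and it implies no particular cutoff value; the two sequences are the 2AAI wild-type and triple-mutant segments as listed in Table 6.1.

```python
def ile_val_fraction(seq):
    """Fraction of residues in seq that are Ile or Val."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(aa in "IV" for aa in seq) / len(seq)


if __name__ == "__main__":
    # Sequences as listed in Table 6.1 (segment plus flanking residues).
    segments = {
        "2AAI - WT": "TQLPTLARSFIICIQMISEAARFQYIEGEMR",
        "2AAI - R8C,E19V,R21I": "TQLPTLACSFIICIQMISVAAIFQYIEGEMR",
    }
    for name, seq in segments.items():
        print(f"{name}: Ile/Val fraction = {ile_val_fraction(seq):.3f}")
```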

6.1.1.3 Converting a marginally hydrophobic globular protein into a membrane protein.

Marginally hydrophobic α-helical segments such as δ-helices exhibit certain sequence

characteristics of TM helices (Chapter 3). To better understand the distinctions between

δ-helices and TM α-helices, we investigated the insertion of five δ-helices into dog pancreas

microsomal membranes. Model constructs in which an isolated δ-helix was engineered into a

bona fide membrane protein indicated that for two δ-helices selected from secreted proteins, at

least three single-nucleotide mutations are necessary to obtain efficient membrane insertion,

whereas one mutation is sufficient in a δ-helix from the cytosolic protein P450BM-3.

Additionally, we found that when the entire upstream region of the mutated δ-helix in the intact

cytochrome P450BM-3 is deleted, a small fraction of the truncated protein inserts into

microsomal membranes. Our results in Chapter 4 suggest that upstream portions of the

polypeptide and embedded charged residues protect δ-helices in globular proteins from being

recognized by the SRP-Sec61 ER-targeting machinery. The results further indicate that δ-helices

in secreted proteins are mutationally more distant from TM helices than δ-helices in cytosolic

proteins, and illustrate how difficult it is to convert a soluble segment to one that traverses the bilayer with

some efficiency.

6.1.1.4 Optimizing synthesis and expression of transmembrane peptides and proteins.

The over-expression of membrane proteins – which in most cases is a requirement for

high resolution structural determination – is a highly empirical process. A delicate balance exists

for heterologous protein expression of eukaryotic proteins in bacteria, where various parameters

of the process can and should be optimized to achieve efficient results. In Chapter 5 we outlined

a heterologous expression strategy for a fragment of the TM domain of CFTR consisting of

TMs 2, 3 and 4 with the interconnecting loops between TM2/3 and TM3/4. Variations in the

protein expression process included altering the expression construct, bacterial strain,

growth media, protein expression temperature, as well as induction temperature. While it was

found that optimizing various parameters of CFTR TM2/3/4 expression definitely affected

protein expression levels, concentrations suitable for biophysical analysis remained elusive. The

results presented in this Chapter provide researchers with a clear plan in the optimization of the

complicated process of membrane protein production such that one may progress towards the

successful overexpression of a membrane protein.

6.2 Membrane mimetic micelles versus bilayers.

In the work presented in this thesis, both detergent micelles and native membrane

bilayers were used to study the secondary and tertiary structures of δ-helices and TM segments.

While use of detergents in the study of membrane proteins and fragments of larger membrane

proteins has often proved to be both useful and necessary, both in vitro and in vivo environments

have advantages and disadvantages, and care must be taken when interpreting results.

6.2.1 SDS as a membrane mimetic.

The hydrophobic nature of membrane proteins and TM segments requires the use of

membrane mimetic systems for their study. In Chapters 2, 3 and 5, we chose to use SDS as our

membrane mimetic. Sodium dodecyl sulfate is an anionic detergent that has a tail of 12 carbon

atoms attached to a sulfate group, providing the molecule with the amphiphilic properties

required for micelle formation. An added benefit of SDS is the anionic nature of the detergent,

as it is able to neutralize the effects of the positively charged Lys tags which are placed at

synthesized peptide termini for solubilization purposes. SDS is a detergent commonly used in

labs around the world, and in our hands has resulted in the successful secondary structure

determination of CFTR TM hairpins (Choi et al. 2004), TM peptides (Deber et al. 1993;

Partridge et al. 2002a; Liu et al. 2003; Cunningham et al. 2009), as well as designed TM peptide

segments (Johnson et al. 2004; Tulumello and Deber 2009).

While detergents offer a simplifying solution in the handling of membrane proteins, care

must be taken when using membrane mimetics or interpreting results derived from these

systems: previous work can be conflicting regarding the effects of detergents on membrane

proteins. For example, it was shown that point mutations in the GpA TM segment have similar

energetic consequences in the detergent C8E5 as compared to the TOXCAT assay which is

conducted in the inner membrane of E. coli (Fleming and Engelman 2001). The work presented

in this thesis shows that the specifics of TM segment oligomerization in SDS may vary

somewhat from those observed in an intact membrane bilayer (Chapter 4). An SDS micellar

environment is quite different from a biological membrane such as the E. coli inner membrane

which is composed of the lipids phosphatidylethanolamine, phosphatidylglycerol and cardiolipin.

In the case of peptides corresponding to selected mutants of GpA (Chapter 2), all

segments investigated for their oligomerization capabilities on SDS-PAGE retained their dimeric

status; however, the rate of migration on SDS-PAGE did not always correlate with the

dimerization results found via the TOXCAT assay. In a dynamic environment such as an SDS

micelle, changes to the register of a dimerization interface are permitted, as well as the

possibility of anti-parallel interactions. These types of TM oligomerization patterns would not

be available in the TOXCAT system as the nature of the construct would force interaction of

GpA TM segments in a certain orientation and register. SDS-PAGE additionally reports on

several features beyond migration such as detergent binding and hydropathy (Rath et al. 2009a).

Despite some limitations, SDS was still considered a reasonable solubilizing agent for

our TM peptide and protein studies, as secondary structure determination via CD of our GpA

peptides ruled out the possibility that differences in SDS-PAGE migration were caused by

differences in secondary structure (Chapter 2). The basis of differences in SDS-PAGE migration

among GpA mutants is most likely influenced by factors beyond structural changes to the

peptides such as peptide crossing angles, or detergent binding (Rath et al. 2009a).

6.3 Significant content of hydrophobic, β-branched residues in transmembrane

segments relative to δ-helices.

Compositional analysis of the residues comprising TM segments highlights an abundance

of large β-branched residues in TM segments, relative to the amino acid content of δ-helices

(Cunningham et al. 2009). The evident requirement of hydrophobic, β-branched residues in TM

segments is directly related to the apolar environment provided by the membrane bilayer, as

these residues meet hydrophobicity criteria imposed by the membrane bilayer. While δ-helices

retain equivalent hydrophobicity to TM segments according to the Liu-Deber hydrophobicity

scale, the decreased amount of Ile and Val in δ-helices must be a result of the structural

preferences of amino acids in different environments: large, hydrophobic β-branched residues

such as Ile and Val have differential structural preferences depending on their environment. In

an apolar environment such as the membrane bilayer, these residues take on the preferred

structure of an α-helix (Liu and Deber 1999). In an aqueous environment, these residues would

“prefer” to form β-sheet structures and the propensity to form α-helices is relatively low (Chou

and Fasman 1978). Notably, the remaining large hydrophobic residue Leu is statistically highly

represented in both of these environments. Leu contributes to α-helical structures in an apolar

environment like the membrane, but has a similar propensity to form both α-helices and β-sheets

in an aqueous environment (Chou and Fasman 1978; Liu and Deber 1999). Our identification of

the preference of Ile and Val to occur in hydrophobic segments that span the membrane rather

than the core of globular proteins highlights the fact that the interior of these globular proteins

must not in fact resemble the membrane bilayer. Put in evolutionary terms, the low abundance

of Ile and Val in globular proteins is an attempt to preserve the secondary structure of the protein

in an α-helical form.

6.4 Membrane insertion propensity of transmembrane segments.

Hydrophobicity is generally considered the overriding characteristic directing the

insertion of TM α-helices into the membrane bilayer. Examination of the segmental

hydrophobicity of TM-spanning segments with an in vivo hydrophobicity scale developed by

Hessa et al. shows that there is a threshold hydrophobicity for membrane insertion. The average

predicted free energy of insertion (ΔGapp), as calculated for TM α-helices with available high

resolution structures, is around -1 kcal/mol (Hessa et al. 2007). Efficient recognition by the

translocon for insertion into the membrane bilayer, as experimentally determined by this scale, corresponds to

ΔGapp < 0 kcal/mol (Hessa et al. 2007). This calculated energetic value appears to hold most true

for single-spanning TM proteins, but this is not the case for all TM segments - especially TMs

from multi-spanning TM proteins. Approximately 25% of TMs from multi-spanning membrane

proteins have a predicted membrane insertion propensity that is not considered favorable for

insertion, ΔGapp > 0 (Hessa et al. 2007). The lowered relative hydrophobicity of these TM

segments from multi-spanning proteins suggests that efficient membrane insertion of these

segments may depend on contacts with other regions of the protein (Hedin et al. 2010). These

TM segments with below-threshold hydrophobicity would most likely not be recognized by the

translocon as TM α-helices if they were the only membrane-embedded sequence in the protein.
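The threshold of 0 kcal/mol can be read as an insertion probability if the apparent free energy is treated as that of a two-state equilibrium between inserted and non-inserted forms. The sketch below makes that assumption explicitly, together with an assumed temperature of 298 K; the function name and default values are illustrative and are not taken from the cited work.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def insertion_fraction(dg_app, temperature=298.0):
    """Predicted fraction inserted for a given apparent free energy (kcal/mol).

    Assumes a two-state equilibrium: K = exp(-dG/RT), fraction = K / (1 + K).
    The temperature default is an assumption of this sketch.
    """
    k_app = math.exp(-dg_app / (R_KCAL * temperature))
    return k_app / (1.0 + k_app)


if __name__ == "__main__":
    for dg in (-1.0, 0.0, 1.0):
        frac = insertion_fraction(dg)
        print(f"dG_app = {dg:+.1f} kcal/mol -> fraction inserted = {frac:.2f}")
```

Under this reading, a value of 0 kcal/mol corresponds to half of the molecules inserting, and negative values to a majority inserting.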

6.4.1 Transmembrane segments with low hydrophobicity.

While marginally hydrophobic TM segments are a common theme in membrane proteins,

hydrophobic segments in water-soluble proteins – or δ-helices – are also common (Cunningham

et al. 2009; Enquist et al. 2009; Hedin et al. 2010). Our lab uses peptides corresponding to TM

segments and fragments of membrane proteins to study the insertion of hydrophobic segments

into membrane bilayers and their associations within these environments. Beyond these studies,

the δ-helices also provide a unique opportunity to investigate the insertion of marginally

hydrophobic non-TM segments into the membrane bilayer in vivo and to potentially dissect

factors involved in membrane insertion by the cellular machinery.

6.4.2 Importance of charged residues to translocon-mediated membrane insertion of

δ-helices.

Although δ-helices are not bona fide TM segments, the work presented in Chapter 3

indicates that δ-helices are capable of solvation by detergent micelles where they adopt α-helical

structure. While the δ-helices tested for in vivo membrane insertion largely failed to do so in

their wild-type form, introducing polar-to-hydrophobic mutations allowed for their membrane

insertion (Chapter 4). What is most interesting from the results of these membrane insertion

studies is how removing charged residues from the sequence of secreted δ-helix examples did

not result in equal membrane insertion abilities of the segments. The recognition of TM

segments by the translocon is completed co-translationally, and is thought to be based on a

thermodynamic partitioning into the anisotropic environment of the lipid bilayer (Hessa et al.

2007). Relatively little work has been done exploring the sequence dependence of membrane

insertion, but investigations of model segments show that the location of charged residues and/or

aromatic residues within TM segments can greatly affect the insertion propensity; the translocon-

mediated insertion of TM segments into the membrane mirrors the physical properties of the

lipid bilayer (Hessa et al. 2007). The insertion of δ-helices into the membrane bilayer via the

host cellular machinery furthers these studies by the use of actual hydrophobic segments beyond

ideal model systems, and highlights as yet unknown factors in the membrane integration process.

The δ-helix segment from 2AAI reached maximal membrane insertion with three polar-

to-hydrophobic mutations, while the δ-helix from 1ECA had a relative insertion of 50% with

three mutations (Chapter 4; Norholm et al. 2011). A comparison of the segmental Liu-Deber

hydrophobicity suggests that the 1ECA - E6V,D14V,G10V triple mutant is more hydrophobic than

the 2AAI - R8C,E19V,R21I triple mutant, both with and without inclusion of flanking residues

(Table 6.1). However, a comparison of the ΔGapp for both these examples indicates that the in

vivo hydrophobicities (ΔGapp) are similar (Table 6.1). Based on experimental measurements

relating hydrophobicity and the position of charged residues in membrane-spanning segments,

these two secreted δ-helix examples would be predicted to insert into the membrane with equal

propensity (Hessa et al. 2007). The fact that these two segments do not insert into the membrane

with equal efficiency warrants further investigation and implies a sequence dependence of

membrane insertion.
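The comparison above rests on two simple operations: applying point mutations to a segment, and averaging a per-residue hydrophobicity scale over the result. The sketch below illustrates both. It is an illustration under stated assumptions only: the Kyte-Doolittle hydropathy values are swapped in as a stand-in scale, because the Liu-Deber and ΔGapp scale values are not reproduced in this chapter, so the numbers it prints are not comparable to those in Table 6.1. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here ONLY as a stand-in scale.
# They are not the Liu-Deber or dGapp values behind Table 6.1.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def apply_mutations(sequence, mutations):
    """Apply point mutations written as e.g. 'E6V' (1-based positions)."""
    residues = list(sequence)
    for mutation in mutations:
        wild_type = mutation[0]
        position = int(mutation[1:-1])
        replacement = mutation[-1]
        found = residues[position - 1]
        # Guard against labels whose numbering does not match the sequence.
        if found != wild_type:
            raise ValueError(f"{mutation}: found {found} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


def mean_hydrophobicity(sequence, scale=KYTE_DOOLITTLE):
    """Average the per-residue scale values over the whole segment."""
    return sum(scale[residue] for residue in sequence) / len(sequence)


# 1ECA segment including flanking residues, as listed in Table 6.1.
ECA_WT = "DFAGAEAAWGATLDTFFGMIFSKM"
ECA_TRIPLE = apply_mutations(ECA_WT, ["E6V", "D14V", "G10V"])

if __name__ == "__main__":
    for name, seq in [("1ECA - WT", ECA_WT), ("1ECA - E6V,D14V,G10V", ECA_TRIPLE)]:
        print(f"{name:22s} {seq} {mean_hydrophobicity(seq):6.2f}")
```

The wild-type check in `apply_mutations` matters in practice: a mutation label is only meaningful if its numbering matches the sequence supplied, and the function raises an error rather than silently mutating the wrong residue.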

In comparison to these secreted protein examples (1ECA and 2AAI), the δ-helix from the
cytosolic 2BMH protein reached high levels of membrane insertion after one
polar-to-hydrophobic mutation: 2BMH-E20V. The segmental hydrophobicity of the 2BMH δ-helix
segment, alone or with flanking residues included, is close to the experimentally determined
threshold for in vitro membrane insertion, which is greater than or equal to 0.4 on the Liu-Deber
hydrophobicity scale (Liu and Deber 1999). As the highest level of membrane insertion is seen
for the most hydrophobic δ-helix investigated, the importance of hydrophobicity in selection of
membrane-spanning segments by the cellular machinery is clearly highlighted.
Polar-to-hydrophobic mutations improve membrane insertion, also reinforcing the significance of
the location of polar/charged residues within the TM sequence (Hessa et al. 2007).

Table 6.1. Liu-Deber and ΔGapp hydrophobicity predictions for δ-helices and mutants.

                                                               δ-helix plus
                                                               flanking region       δ-helix
Mutant                Sequence                                 Liu-Deber a ΔGapp b   Liu-Deber a ΔGapp
1ECA - WT             DFAGAEAAWGATLDTFFGMIFSKM                 0.62        4.51      1.10        4.15
1ECA - D14V           DFAGAEAAWGATLVTFFGMIFSKM                 0.85        2.393     1.38
1ECA - E6V            DFAGAVAAWGATLDTFFGMIFSKM                 0.81        3.597     1.33        3.16
1ECA - E6V,D14V       DFAGAVAAWGATLVTFFGMIFSKM                 1.04        1.557     1.60        1.20
1ECA - E6V,D14V,G10V  DFAGAVAAWVATLVTFFGMIFSKM                 1.30        1.053     1.92        0.64
2AAI - WT             TQLPTLARSFIICIQMISEAARFQYIEGEMR          0.51        5.06      0.85        2.98
2AAI - R8C,R21I       TQLPTLACSFIICIQMISEAAIFQYIEGEMR          0.91        3.019     1.47        1.98
2AAI - E19V           TQLPTLARSFIICIQMISVAARFQYIEGEMR          0.66        3.103     1.07        1.70
2AAI - R8C,E19V       TQLPTLACSFIICIQMISVAARFQYIEGEMR          0.83        2.45      1.34        0.86
2AAI - R8C,E19V,R21I  TQLPTLACSFIICIQMISVAAIFQYIEGEMR          1.06        1.056     1.69        0.70
2BMH - WT             PLDDENIRYQIITFLIAGHETTSGLLSFALYFLVKNPHV  0.28        7.090     0.75        3.48
2BMH - E20V           PLDDENIRYQIITFLIAGHVTTSGLLSFALYFLVKNPHV  0.39        4.983     0.89        1.38
2BMH - H19L,E20V      PLDDENIRYQIITFLIAGLVTTSGLLSFALYFLVKNPHV  0.63        3.647     1.18        0.04

a Segmental hydrophobicity of the δ-helices as calculated by the Liu-Deber hydrophobicity scale (Liu and Deber 1999).
b Hydrophobicity of the δ-helices as calculated by the ΔGapp scale (Hessa et al. 2007).
The underlined regions of sequence represent the δ-helices as identified by TM Finder (Deber et al. 2001).
The non-underlined regions of sequence represent native residues added to the sequence, as residues flanking TM
helices have previously been shown to affect membrane insertion efficiency (Hessa et al. 2005a).

6.4.3 Importance of secondary structure to translocon-mediated membrane insertion of δ-helices.

As the work highlighted in Chapter 4 indicates that there may be sequence dependence to
efficient membrane insertion, an additional feature affecting the integration process may be
secondary structure formation within the ribosome and translocon, since efficient formation of
secondary structure is related to the amino acid sequence. It has been shown that nascent chain
folding inside the ribosome may be an important regulatory mechanism for the topogenesis and
integration of single-spanning membrane proteins (Mingarro et al. 2000; Woolhead et al. 2004),
as α-helix formation can occur within the translocating ribosome exit tunnel (Lu and Deutsch
2005; Daniel et al. 2008), although the exact mechanisms promoting helix formation are
unknown (Woolhead et al. 2004; Lu and Deutsch 2005). It has been proposed that nonpolar
surfaces in the ribosome exit tunnel induce α-helices, and that these TM α-helices initially
formed in the ribosome tunnel also persist into the translocon pore (Woolhead et al. 2004), which
could be a contributing factor to the differences in membrane insertion between 1ECA and
2AAI. A folded α-helix structure could potentially be optimized for membrane insertion relative
to a polypeptide with backbone polarity exposed to the membrane bilayer.

Hydrophobic, β-branched residues such as Ile and Val may preferentially appear in TM
segments compared to hydrophobic segments from soluble proteins (δ-helices) for reasons
beyond structurally optimizing TM α-helices within the membrane bilayer alone. On average,
it has been shown that TM segments are comprised of residues with relatively high apolar
helicity. A comparison of the intrinsic α-helical structure of amino acids in apolar environments
shows that there is a rank order of amino acids in their tendency to form α-helical secondary
structures (Liu and Deber 1999). When applied to the 1ECA and 2AAI δ-helix segments, the
2AAI δ-helix has a greater predicted ability to form α-helical structures in apolar environments,
for both the WT and the triple mutant constructs (Table 6.2). It must be noted, however, that with
the addition of flanking residues to the δ-helix segment investigated for eukaryotic membrane
insertion (Table 6.1), it is not known exactly what portion of the δ-helix segment is actually
traversing the membrane, and comparing segmental hydrophobicity of these segments remains
slightly empirical. There may be a sliding window effect with the addition of flanking residues
that can help or hinder membrane insertion, and that makes it difficult to identify precisely which
amino acids would be inserting into the membrane (Table 6.1).

Table 6.2. Comparison of segmental apolar α-helicity for the 1ECA and 2AAI δ-helices.

δ-helix segment a       Apolar segmental helicity b
1ECA                    27.46
1ECA - E6V,D14V,G10V    28.38
2AAI                    33.77
2AAI - R8C,E19V,R21I    34.37

a δ-helix segments include flanking residues (see Table 6.1 for sequences).
b Calculated as per Liu and Deber (1999).

The membrane insertion of δ-helices also raises the question of what happens during the
synthesis of proteins that contain multiple TMs, and of their structure in the translocon. Studies
measuring the membrane insertion of aquaporins and CFTR have indicated that these TM
segments likely have α-helical structures within the translocon pore, and that movement through
the translocon can be directly influenced by the structure of the nascent polypeptide within the
ribosome exit tunnel (Daniel et al. 2008; Pitonzo et al. 2009). The high content of hydrophobic,
β-branched residues in TM segments suggests that TM segment helicity is important in
promoting the membrane insertion of TM segments through the lateral opening of the translocon,
and the hydrophobic constriction within the hour-glass shaped structure of the pore may act to
maintain these helical structures (Junne et al. 2010). It is likely that an optimized α-helical
structure in TM segments aids in membrane integration, as contacts with the translocon occur
prior to movement into the membrane bilayer.

While the exact mechanism of translocon recognition is currently unknown, the insertion
of non-native segments into the membrane bilayer suggests that there may be specific sequence
features of hydrophobic segments used by the cellular machinery to discriminate between
membrane insertion and translocation of segments through the pore. The results presented in this
thesis suggest a hierarchy for translocon-assisted membrane insertion of TM segments. As
shown via the 2BMH construct, hydrophobicity is likely the most important factor directing
membrane insertion, as this segment has the highest segmental hydrophobicity (Table 4.1) and
the highest measured levels of insertion (Chapter 4). Secondly, the positioning of charged
residues plays an important role in the membrane insertion process. Removing charged residues
from non-ideal positions such as the centre of a δ-helix drastically improves the measured
membrane insertion. When these factors are equivalent, as in the case of the 1ECA and 2AAI
δ-helices, an additional factor such as the secondary structure of the hydrophobic segment may
affect membrane insertion. In concert, these factors could play a role in discriminating TM
segments from non-TM segments in the case of single-spanning membrane proteins, or those that
have limited contact with the remainder of membrane-spanning segments.
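The second tier of this hierarchy, the positioning of charged residues, can be made concrete with a small helper that lists where the charges sit relative to the centre of a segment. The sketch below is illustrative only. The function name and the normalized offset (-1 at the N-terminal end, 0 at the centre, +1 at the C-terminal end) are assumptions introduced here, and only Asp, Glu, Lys and Arg are treated as charged (His is left out as a simplification).

```python
# Residues treated as charged in this sketch (His deliberately excluded).
CHARGED = set("DEKR")


def charged_positions(sequence):
    """Return (1-based position, residue, offset) for each charged residue.

    The offset runs from -1.0 (first residue) through 0.0 (segment centre)
    to +1.0 (last residue), so values near 0 mark charges in the core.
    """
    if len(sequence) < 2:
        raise ValueError("segment must contain at least two residues")
    centre = (len(sequence) + 1) / 2
    half_span = (len(sequence) - 1) / 2
    return [
        (position, residue, round((position - centre) / half_span, 2))
        for position, residue in enumerate(sequence, start=1)
        if residue in CHARGED
    ]


if __name__ == "__main__":
    # 1ECA wild-type segment with flanking residues, from Table 6.1.
    for position, residue, offset in charged_positions("DFAGAEAAWGATLDTFFGMIFSKM"):
        print(f"{residue}{position:<3d} offset {offset:+.2f}")
```

For the 1ECA segment this flags D14 as the charge closest to the centre, which is one of the positions targeted by the polar-to-hydrophobic mutations listed in Table 6.1.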

Perhaps most importantly, the results presented in Chapter 4 highlight the promiscuity of
the membrane insertion process: the δ-helices can be inserted into biological membranes to
some degree, but they are not TM segments in vivo. Natural TM segments come in all varieties,
from hydrophobic to relatively hydrophilic, which is a natural consequence of
membrane-spanning segments having different functions. The translocon and associated proteins
must have a threshold for TM segment recognition that is based on generic properties, but still
retain enough specificity to avoid incorrect membrane insertion events. This ubiquitous process
requires a delicate balance between rules for membrane insertion and promiscuity, so that only
segments destined for the membrane are integrated. The research elaborated here works towards
defining the characteristics of membrane insertion in greater detail, which in turn may be used to
improve TM segment prediction.

6.4.4 Biological role of δ-helices.

The detailed mechanism by which a polypeptide chain folds to a specific three-dimensional
protein structure is difficult to determine; however, native states of proteins almost
always correspond to the structures that are most thermodynamically stable under physiological
conditions. This usually means the incorporation of hydrophobic portions of proteins into the
folded interior (Dobson 2003). In water-soluble proteins containing δ-helices, we found that
δ-helix segments have higher percentages of burial within the protein interior than relatively
hydrophobic α-helices of comparable length (Chapter 3). Whether or not the δ-helix segments
are critical to the actual folding of the full-length protein as a rule remains to be determined, but
based on the location of the δ-helices within the overall fold, it is a reasonable hypothesis. As an
example, the δ-helix segments from 1ECA and 1MBA studied in this body of work have been
shown to be central in the folding pathways of the full-length proteins in which they reside.
1ECA and 1MBA are members of the globin family of heme-binding proteins, where all family
members contain the basic globin fold of 7 helices, labelled A, B, C, E, F, G and H (Aronson et
al. 1994). The δ-helix segments within 1ECA and 1MBA correspond to the H-helix, which is
thought to form one of the first structural elements of the globin fold, on which the remainder of
the protein docks to form the final, folded structure (Nishimura et al. 2000).

In a grand scheme, the δ-helices may represent situations similar to events that resulted in
the evolution of TM proteins. The work highlighted in Chapter 4 indicates how it is possible to
incorporate a hydrophobic δ-helix into the membrane bilayer. The actual transformation of a
δ-helix to a membrane-spanning segment was difficult, requiring polar-to-hydrophobic amino acid
mutations as well as removal of upstream structural elements, but the end result was a change to
the functional location of the protein. In evolutionary terms, such mutational events and
structural rearrangements were perhaps at the forefront of generating TM segments and
membrane-spanning proteins, even though the origin of membranes and membrane proteins
remains enigmatic: for example, a lipid membrane would be of little function without membrane
proteins to connect membrane-bound contents to the outside world, but how could
membrane proteins have evolved without functional membranes (Mulkidjanian et al. 2009)?

Membrane proteins that traverse the membrane bilayer contain long stretches of
hydrophobic amino acids. Assuming that water-soluble proteins contain a somewhat random
distribution of polar and non-polar amino acids, unlike TM proteins that contain long stretches of
hydrophobic residues, a gradual evolution from soluble proteins to membrane proteins with long
hydrophobic stretches must be considered (Mulkidjanian et al. 2009). Evolution could also have
proceeded in the opposite direction: proteins with hydrophobic stretches spanning primordial
membranes would lose their apolar nature to become water-soluble.

The first membrane protein could have evolved from a soluble protein that attaches to
membranes and inserts into the bilayer. One such protein that could be used as an example of
the evolution of membrane proteins is vinculin, which is involved in regulating cell adhesion,
spreading, and motility (Goldmann et al. 1996) and is thought to insert into the membrane
bilayer via attachment to membranes where it can bind acidic phospholipids (Bakolitsa et al.
1999). Amphipathic helices, where hydrophobic and hydrophilic residues are located on
opposite faces when the peptide folds into an α-helix, could also be ancestors of
membrane-spanning segments and are of particular interest, as these peptides can easily adopt an
orientation where the hydrophilic face is buried in water while the hydrophobic face is exposed to
the nonpolar environment formed by the hydrocarbon tails of the lipids (Pohorille et al. 2005). Most
importantly, the match between the polarity of the α-helix and its environment is stable, and this
appears to be more important than the specific identity of the amino acids in the amphipathic
sequence (Pohorille et al. 2005). In a similar way, membrane-spanning segments could have
evolved from sequences similar to gramicidin. This simple membrane channel inserts into the
membrane in a stepwise manner that likely includes formation of a water-insoluble gramicidin
aggregate, dissociation from the aggregate, partitioning of peptide to the membrane surface,
oligomerization on the surface, and insertion and folding of the peptide into its double-helical
form (Hicks et al. 2008). The sequence of gramicidin is hydrophobic, and primordial segments
such as these may have adopted a membrane-spanning structure in an equivalent manner.
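The amphipathic character described above, with hydrophobic and hydrophilic residues on opposite faces of an α-helix, is commonly quantified with the Eisenberg hydrophobic moment, which sums per-residue hydrophobicities as vectors spaced 100° apart around the helix axis. The sketch below is an illustration rather than an analysis performed in this thesis. It assumes an ideal α-helix (100° per residue) and uses the Kyte-Doolittle scale as a stand-in, and the helper that builds an idealized amphipathic Leu/Lys sequence is hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values, used as a stand-in scale (an assumption).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(sequence, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Mean hydrophobic moment per residue for an ideal helix.

    Each residue contributes a vector of length scale[residue] pointing at
    angle (index * angle_deg) around the helix axis; the moment is the
    length of the vector sum divided by the number of residues.
    """
    step = math.radians(angle_deg)
    sin_sum = sum(scale[r] * math.sin(i * step) for i, r in enumerate(sequence))
    cos_sum = sum(scale[r] * math.cos(i * step) for i, r in enumerate(sequence))
    return math.hypot(sin_sum, cos_sum) / len(sequence)


def ideal_amphipathic(length, angle_deg=100.0):
    """Hypothetical test sequence: Leu on one helical face, Lys on the other."""
    step = math.radians(angle_deg)
    return "".join("L" if math.cos(i * step) > 0 else "K" for i in range(length))


if __name__ == "__main__":
    uniform = "L" * 18               # purely hydrophobic, no preferred face
    amphipathic = ideal_amphipathic(18)
    print(uniform, round(hydrophobic_moment(uniform), 3))
    print(amphipathic, round(hydrophobic_moment(amphipathic), 3))
```

A uniformly hydrophobic 18-residue segment covers exactly five helical turns, so its contributions cancel and the moment is essentially zero, whereas the segregated Leu/Lys pattern gives a large moment. A high mean hydrophobicity with a low moment is the profile expected of a membrane-spanning segment rather than a surface-bound amphipathic helix.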

Hydrophobic or amphipathic helices as mentioned here may spontaneously insert into the
membrane bilayer in a manner unassisted by the translocon, but because of the nature of the
constructs used in the membrane insertion assays in this work, this is not a likely
scenario for the δ-helix segments. In the case of the TOXCAT assay, the δ-helix segment of
interest is fused to MBP, which is normally a periplasmic protein. Due to the presence of a large
fusion protein, spontaneous insertion of δ-helices into the membrane to establish a
membrane-inserted orientation would likely be difficult (Chapter 3). In the case of the membrane
insertion assay employed in Chapter 4 of this thesis, the segment of interest is cloned behind a
natural TM segment from the integral membrane protein leader peptidase, which would direct
protein expression to the membrane. Segments that follow this model TM segment would either be
identified for membrane insertion by the cellular machinery, or not, and spontaneous insertion is
not probable due to the nature of the construct (Fig 4.1). As for the truncated and mutated
2BMH protein, for which a limited amount of membrane insertion was measured, spontaneous
membrane insertion is possible, but unlikely given the glycosylation pattern observed for the
construct (Fig 4.3).

6.5 Helix-helix interactions.

Many TM α-helices associate to form functional membrane proteins or domains of larger
structures, but when do the TM α-helices of multi-spanning membrane proteins first interact?
One possibility is that the translocon identifies TM segments in a linear manner and carries out
each membrane integration event independently and in sequential progression (Pitonzo and
Skach 2006), followed by any structural contacts leading to higher order structures within
the membrane bilayer. Although this simplistic model is appealing, it was shown several years
ago that the TMs of many polytopic membrane proteins integrate into the bilayer in pairs or
groups (Skach and Lingappa 1993). Membrane proteins that contain short loops may form their
helical interactions within the translocon pore, or very soon after exiting laterally into the
bilayer: the physical constraints imposed by short loops would make this a possibility, and the
early environment experienced by TM segments may play a role in determining how and when
TM segments begin to associate. Studies to date have primarily focused on TM segment
associations with the translocon machinery, and it has been shown that different α-helices
contact translocon machinery for different lengths of time, which could contribute to structural
contacts within the protein (Sadlish et al. 2005). Constructs such as CFTR TM2/3/4 would
provide utility in addressing questions relating to this area. In the structural model of CFTR,
based on the high-resolution structure of Sav1866, it does not appear that TM2 contacts TM3 and
TM4 in the final folded structure (Serohijos et al. 2008). If helix contacts are established
between helices in the translocon for polytopic membrane proteins, then no helical contacts for
TM2 to the rest of the protein should be observed. The model consisting of CFTR TM2/3/4 is an
ideal system to address the formation of early helical contacts in a translocating system.

Beyond helical contacts created in the translocon, it has been established that specific
sequence determinants direct α-helix association, and that the sequence of amino acids in TM
segments is critical to the formation of higher order structures. The work presented in Chapter 2
of this thesis highlights how a sequence-specific oligomerization motif can be modulated to
influence TM α-helix association. Gathering structural information on the folding of membrane
proteins is useful towards the eventual prediction of structural contacts between
membrane-spanning segments in the formation of tertiary and quaternary structure.

6.5.1 Sequence specific dimerization motifs.

The most widely studied motif directing TM segment dimerization is the GG4 segment in
GpA. This structural motif separates two Gly residues by three amino acids, and creates a
concave surface allowing for the close approach of interacting helices (MacKenzie et al. 1997).
While GpA is thought to dimerize principally by using a central GG4 motif (Lemmon et al.
1992b; MacKenzie et al. 1997), it is likely that residues surrounding GG4 motifs
also shape the affinity of the helix-helix interactions. For example, the presence of a GG4 motif
does not guarantee a tight interaction between helices: MCP from the M13 bacteriophage utilizes
a GG4 motif to direct dimerization, but the strength of association is of relatively moderate
stability compared to GpA (Melnyk et al. 2002).
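Because the GG4 motif is defined purely by spacing (two Gly residues separated by three amino acids), candidate motifs and their neighbouring residues can be located with a simple pattern search. The sketch below is illustrative only. The function name is hypothetical, and the example sequence is an assumption standing in for the GpA TM segment, which is not reproduced in this chapter. Positions are reported as 0-based indices within the supplied string, not as protein residue numbers.

```python
import re


def find_gg4_motifs(sequence):
    """Locate G-x-x-x-G motifs, including overlapping ones.

    Returns a list of (i, residue at i+1, residue at i+5), where i is the
    0-based index of the first Gly. The i+5 entry is None when the motif
    ends at the last residue of the sequence.
    """
    hits = []
    # A lookahead matches without consuming characters, so overlapping
    # motifs such as G-x-x-x-G-x-x-x-G are each reported.
    for match in re.finditer(r"(?=G...G)", sequence):
        i = match.start()
        after_first_gly = sequence[i + 1]
        after_second_gly = sequence[i + 5] if i + 5 < len(sequence) else None
        hits.append((i, after_first_gly, after_second_gly))
    return hits


if __name__ == "__main__":
    # Assumed stand-in for the GpA TM region (illustrative sequence).
    example = "ITLIIFGVMAGVIGTILLISYGI"
    for i, first, second in find_gg4_motifs(example):
        print(f"motif at index {i}: i+1 = {first}, i+5 = {second}")
```

Reporting the residues at i + 1 and i + 5 alongside each hit makes it straightforward to compare the neighbouring positions across different GG4-containing segments.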

The importance of the Val residues at the i + 1, i + 5 positions relative to the GG4 motif was
first recognized in a systematic replacement of GpA TM segment residues to identify interfacial
amino acids (Lemmon et al. 1992b). The data presented in this thesis expand on this work to
identify relative differences in dimerization strength with mutation, and to propose a structural
basis for this difference in oligomerization among mutants. We observed that the strength of
GpA dimerization can be modulated by replacement of the Val residues at positions
neighbouring the GG4 motif. Mutation to Ile, a similar large, hydrophobic, β-branched residue,
significantly improves dimerization, while mutation to Leu significantly reduces dimerization.
These mutations are considered relatively conservative, yet they result in strong differences in
dimerization among mutants. As conservative mutations are often made to highlight the
importance of specific side-chain characteristics, the work presented here provides an example
where small changes to the side chain profile through mutation can have drastic consequences.
At times it may not be possible to make conservative mutations in membrane proteins, as the
structural features in TM segments are finely tuned to carry out specific functions.

6.5.2 Prediction of helix-helix interactions.

The challenge in gaining insights into membrane protein folding and predicting
interactions between TM helices lies in the fact that membrane proteins show a higher than
expected structural complexity in their mode of crossing the membrane. Membrane proteins can
contain non-α-helical elements, such as 3₁₀-helices, π-helices or intra-helical kinks (Riek et al.
2001), or may include loops which enter the membrane and turn back, as in the membrane pore
aquaporin (Murata et al. 2000; Viklund et al. 2006). The complication of predicting helix
contact points is best highlighted by segments such as δ-helices, which are capable of interactions
within a membrane bilayer even though they are not membrane-spanning segments. It has also
been shown that computational methods trained to predict residue contacts in globular proteins
perform only moderately well when applied to membrane proteins (Fuchs et al. 2009), so a
separate set of contact criteria must be considered.

Several methods have been developed to predict helix interaction points, ranging from
measurements of lipid accessibility, to determining the contact points of energy-minimized
helices, to neural networks that consider actual amino acid sequences, all with limited success.
Measurements of lipid accessibility, which highlight the contribution of the environment to the
promotion of higher order structures in membrane proteins, were used in this thesis to partially
explain differences in oligomerization among GpA mutants. This procedure involves
measuring the accessibility of an energy-minimized α-helix to a methylene-sized probe as
an estimate of solvation by membrane components. This probe mimics the lipid acyl chain
radius in size, although it has considerably more conformational freedom than a methylene group
covalently linked in membrane lipids (Johnson et al. 2006). The correlation that we observed
between GpA dimerization and the lipid accessible surface area implies the existence of nonpolar
cavities on the surface of the α-helical structure, which have a role in determining and promoting
dimer affinity. These cavities essentially represent areas of the protein structure that do not
easily contact lipid, creating unfavourable voids in the monomeric state.

When lipid accessibility was used to explain differences in GpA mutants made
specifically at the GG4 motif to other small residues (Gly, Ala and Ser), a strong inverse
correlation was observed between dimerization as measured through TOXCAT and the lipid
accessibility (R = -0.75) (Johnson et al. 2006). An inverse correlation between TOXCAT dimer
affinity and lipid accessibility was also observed in the present work when considering the large
β-branched residues in GpA at the i + 1, i + 5 positions; however, the correlation was not as
strong (R = -0.54) (Cunningham et al. 2010). As seen through the NMR structure of the GpA
dimer produced in DPC micelles, the large β-branched Val residues surrounding the GG4 motif
are involved in creating a ridge structure that likely promotes dimerization through
complementary van der Waals packing into the groove created by the small residues (MacKenzie
et al. 1997). The weaker correlation observed when considering mutations to the large ridge
residues implies that changing the lipid solvation of the ridge structure does not have such a
strong component in determining dimerization. Rather, creating optimized ridge surfaces, and
therefore optimized van der Waals packing surfaces, plays more of a role in dimerization for the
ridge residues than for the small Gly residues. Our results thus highlight the different
contributions that amino acids make in determining the interactive surfaces of TM α-helices and
how nuanced these packing interactions are. Relatively simple mutations at neighbouring
positions affect dimerization differently, and at the present stage of the study of membrane
protein folding, extensive mutagenesis is still required to successfully determine packing
surfaces. For the eventual prediction of membrane protein structures, gathering experimental data
relating to factors involved in driving TM segment association remains critical.
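The R values quoted above are correlation coefficients between a dimerization readout and lipid accessibility. As a reminder of how such a coefficient is obtained, the sketch below computes a Pearson correlation from paired measurements. It is an assumption that the reported R values are Pearson coefficients, and the numbers in the example are placeholder values chosen to be perfectly anti-correlated. They are not data from this thesis or from the cited studies, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # PLACEHOLDER values for illustration only, not experimental data.
    lipid_accessibility = [10.0, 20.0, 30.0, 40.0]
    dimerization_signal = [4.0, 3.0, 2.0, 1.0]
    print(round(pearson_r(lipid_accessibility, dimerization_signal), 2))
```

A negative coefficient means that mutants with greater lipid-accessible surface area tend to dimerize less, and the magnitude indicates how much of the variation in dimerization tracks lipid accessibility.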

6.6 Insights from high resolution structures of membrane proteins.

Available high-resolution crystal structures have revealed important insights into
membrane insertion events, and contacts between helices in multi-spanning membrane proteins.
The high-resolution structure of the mammalian Shaker Kv1.2 potassium channel (2A79) helped
to clarify a complicated folding pathway that includes interactions between highly charged TM
segments, allowing for insertion into the membrane (Long et al. 2007). The S4 α-helix,
containing several positively charged residues, interacts with negatively charged side chains in
the S1 and S2 α-helices, which form the voltage sensor. These segments are critical to the
function of this multi-spanning membrane protein, and when removed from the full-length
protein context, are not capable of membrane insertion on their own (Hessa et al. 2005b).

The outline for membrane protein over-expression in Chapter 5 of this thesis works
towards an optimized strategy for studying membrane proteins. Production of large quantities of
membrane proteins is required for biophysical analysis, and obtaining enough protein to make
these types of studies feasible will ultimately be essential to understanding the
detailed rules of membrane insertion and folding.

6.7 Future directions of membrane protein folding.

To arrive at the ability to accurately predict the insertion propensity of TM segments into
the membrane bilayer, along with tertiary and quaternary interactions between TM α-helices, it
is ultimately necessary to identify all of the energetic contributors to both of these cellular
processes. To accomplish this task, both in vitro and in vivo experimental approaches will be
useful. Synthesis of peptides corresponding to both TM segments and hydrophobic δ-helices, and
subsequent analysis via SDS-PAGE, size exclusion chromatography and FRET, would provide
information regarding the behaviour of these segments in membrane mimetic environments.
Expression of TM segments, and study of insertion and interactions within a native-like
membrane bilayer environment through use of assays such as TOXCAT, provides an in vivo
comparison to these studies.

Previous work has highlighted the subtleties of TM segment integration into the
membrane bilayer, with segmental hydrophobicity and position of charged residues likely
dominating the insertion process (Hessa et al. 2005a; Hessa et al. 2007; Cunningham et al. 2009),
and experiments designed to determine basic rules of membrane integration have been based on
model systems consisting of designed TM segments. However, limited work has been done to
address the importance of the sequence context in the process. For example, can the presence of
a charged residue in a TM segment be offset or buffered by increasing or decreasing surrounding
hydrophobicity? Further study of the membrane integration of δ-helices would be well-suited to
answer this question. δ-helices contain charged residues placed regularly throughout their
sequence, unlike TM segments that have charged residues concentrated at the helix termini
(Cunningham et al. 2009). Because of their high segmental hydrophobicity, δ-helices are
capable of membrane insertion in the context of the TOXCAT assay, but introducing mutations
to change the local hydrophobicity surrounding existing charged residues in δ-helices would be
useful in answering questions as to how the cellular machinery “handles” charged residues.
Accompanied by a statistical study, the identification and nature of residues surrounding native
charged residues in actual TM segments would aid in choosing mutations to test in
δ-helices to further define the membrane insertion of TM and TM-like segments.

The determination of forces directing the association of TM segments within the
membrane bilayer is another area worthy of further investigation, as the rules of TM segment
association have yet to be fully defined. For example, several TM segments which
dimerize via a GG4 motif also have large, hydrophobic, β-branched residues at positions
neighbouring the small residues (Rath et al. 2009b). The ability of these residues to modulate the
folding of GpA by either increasing (Ile) or decreasing dimerization (Leu) relative to the WT
(Val) was discovered (Cunningham et al. 2010), but would the same type of mutational analysis
hold true in examples of other proteins that dimerize with a similar sequence space? In these
situations, a systematic investigation of the importance of large, hydrophobic, β-branched
residues at positions neighbouring GG4 motifs (either in the i + 1, i + 5 or i - 1, i + 3 positions)
would be useful to fully evaluate the importance of these neighbouring ridge residues in
promoting dimerization. Mutational analysis and testing of oligomerization propensities via the
TOXCAT assay would be an informative experimental setup.

Extensive research focus has been directed towards understanding the
homo-oligomerization of membrane-spanning segments. While this kind of information is
extremely useful for identifying sequence-specific dimerization motifs, or the mechanism of
dimerization of single-pass TM proteins like GpA, it does not address the helix-helix contacts
observed in multi-spanning membrane proteins that form the final, folded structure. A method of
evaluating hetero-oligomerization of TM segments would be useful, as little information has
been generated in this regard. To this end, some progress has been made with the development
of the GALLEX assay (Schneider and Engelman 2003). A variation of the TOXCAT assay,
GALLEX can be used to study hetero-dimerization, and work is currently being carried out in
our lab to understand the folding interactions of both single-pass and multi-pass membrane
proteins.

The work presented in this thesis has dealt with optimizing membrane protein expression

for biophysical and biochemical study, understanding determinants of membrane insertion based

on the primary amino acid sequence, as well as identifying contributors to oligomerization events

within the membrane bilayer. With the eventual promise of producing correct high-resolution

structures of membrane proteins, our results have identified several basic rules of membrane

protein folding that can be applied to future biophysical and biochemical studies of this unique

group of structures.

Chapter 7: Literature Cited.


Adamian, L., and Liang, J. 2002. Interhelical hydrogen bonds and spatial motifs in membrane

proteins: polar clamps and serine zippers. Proteins 47: 209-218.

Adams, P.D., Arkin, I.T., Engelman, D.M., and Brunger, A.T. 1995. Computational searching

and mutagenesis suggest a structure for the pentameric transmembrane domain of

phospholamban. Nature structural biology 2: 154-162.

Adams, P.D., Engelman, D.M., and Brunger, A.T. 1996. Improved prediction for the structure of

the dimeric transmembrane domain of glycophorin A obtained through global searching.

Proteins 26: 257-261.

Akermoun, M., Koglin, M., Zvalova-Iooss, D., Folschweiller, N., Dowell, S.J., and Gearing,

K.L. 2005. Characterization of 16 human G protein-coupled receptors expressed in

baculovirus-infected insect cells. Protein expression and purification 44: 65-74.

Almen, M.S., Nordstrom, K.J., Fredriksson, R., and Schioth, H.B. 2009. Mapping the human

membrane proteome: a majority of the human membrane proteins can be classified

according to function and evolutionary origin. BMC biology 7: 50.

Arbely, E., and Arkin, I.T. 2004. Experimental measurement of the strength of a C alpha-H...O

bond in a lipid bilayer. Journal of the American Chemical Society 126: 5362-5363.

Aronson, H.E., Royer, W.E., Jr., and Hendrickson, W.A. 1994. Quantification of tertiary

structural conservation despite primary sequence drift in the globin fold. Protein Sci 3:

1706-1711.

Atwell, S., Brouillette, C.G., Conners, K., Emtage, S., Gheyi, T., Guggino, W.B., Hendle, J.,

Hunt, J.F., Lewis, H.A., Lu, F., et al. 2010. Structures of a minimal human CFTR first

nucleotide-binding domain as a monomer, head-to-tail homodimer, and pathogenic

mutant. Protein Eng Des Sel 23: 375-384.

Baker, J.M., Hudson, R.P., Kanelis, V., Choy, W.Y., Thibodeau, P.H., Thomas, P.J., and

Forman-Kay, J.D. 2007. CFTR regulatory region interacts with NBD1 predominantly via

multiple transient helices. Nature structural & molecular biology 14: 738-745.

Bakolitsa, C., de Pereda, J.M., Bagshaw, C.R., Critchley, D.R., and Liddington, R.C. 1999.

Crystal structure of the vinculin tail suggests a pathway for activation. Cell 99: 603-613.

Baneyx, F., and Mujacic, M. 2004. Recombinant protein folding and misfolding in Escherichia

coli. Nature biotechnology 22: 1399-1408.

Bocharov, E.V., Pustovalova, Y.E., Pavlov, K.V., Volynsky, P.E., Goncharuk, M.V., Ermolyuk,

Y.S., Karpunin, D.V., Schulga, A.A., Kirpichnikov, M.P., Efremov, R.G., et al. 2007.

Unique dimeric structure of BNip3 transmembrane domain suggests membrane

permeabilization as a cell death trigger. The Journal of biological chemistry 282: 16256-

16266.

Bogdanov, M., and Dowhan, W. 1995. Phosphatidylethanolamine is required for in vivo function

of the membrane-associated lactose permease of Escherichia coli. The Journal of

biological chemistry 270: 732-739.

Bogdanov, M., and Dowhan, W. 1999. Lipid-assisted protein folding. The Journal of biological

chemistry 274: 36827-36830.

Bogdanov, M., Umeda, M., and Dowhan, W. 1999. Phospholipid-assisted refolding of an

integral membrane protein. Minimum structural features for phosphatidylethanolamine to

act as a molecular chaperone. The Journal of biological chemistry 274: 12339-12345.

Bowie, J.U. 1997. Helix packing in membrane proteins. Journal of molecular biology 272: 780-

Boyd, D., Schierle, C., and Beckwith, J. 1998. How many membrane proteins are there? Protein

Sci 7: 201-205.

Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W.,

Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., et al. 1998. Crystallography & NMR

system: A new software suite for macromolecular structure determination. Acta

crystallographica 54: 905-921.

Buck, T.M., Wagner, J., Grund, S., and Skach, W.R. 2007. A novel tripartite motif involved in

aquaporin topogenesis, monomer folding and tetramerization. Nature structural &

molecular biology 14: 762-769.

Carpenter, E.P., Beis, K., Cameron, A.D., and Iwata, S. 2008. Overcoming the challenges of

membrane protein crystallography. Current opinion in structural biology 18: 581-586.

Chang, G., Spencer, R.H., Lee, A.T., Barclay, M.T., and Rees, D.C. 1998. Structure of the MscL

homolog from Mycobacterium tuberculosis: a gated mechanosensitive ion channel.

Science (New York, N.Y.) 282: 2220-2226.

Chen, H., and Kendall, D.A. 1995. Artificial transmembrane segments. Requirements for stop

transfer and polypeptide orientation. The Journal of biological chemistry 270: 14115-

14122.

Cheng, S.H., Gregory, R.J., Marshall, J., Paul, S., Souza, D.W., White, G.A., O'Riordan, C.R.,

and Smith, A.E. 1990. Defective intracellular transport and processing of CFTR is the

molecular basis of most cystic fibrosis. Cell 63: 827-834.

Cheung, J.C., and Deber, C.M. 2008. Misfolding of the cystic fibrosis transmembrane

conductance regulator and disease. Biochemistry 47: 1465-1473.

Cheung, J.C., Kim Chiaw, P., Pasyk, S., and Bear, C.E. 2008. Molecular basis for the ATPase

activity of CFTR. Archives of biochemistry and biophysics 476: 95-100.

Chin, C.N., Sachs, J.N., and Engelman, D.M. 2005. Transmembrane homodimerization of

receptor-like protein tyrosine phosphatases. FEBS letters 579: 3855-3858.

Choi, M.Y., Cardarelli, L., Therien, A.G., and Deber, C.M. 2004. Non-native interhelical

hydrogen bonds in the cystic fibrosis transmembrane conductance regulator domain

modulated by polar mutations. Biochemistry 43: 8077-8083.

Choi, M.Y., Partridge, A.W., Daniels, C., Du, K., Lukacs, G.L., and Deber, C.M. 2005.

Destabilization of the transmembrane domain induces misfolding in a phenotypic mutant

of cystic fibrosis transmembrane conductance regulator. The Journal of biological

chemistry 280: 4968-4974.

Choma, C., Gratkowski, H., Lear, J.D., and DeGrado, W.F. 2000. Asparagine-mediated self-

association of a model transmembrane helix. Nature structural biology 7: 161-166.

Chothia, C. 1975. Structural invariants in protein folding. Nature 254: 304-308.

Chou, P.Y., and Fasman, G.D. 1978. Empirical predictions of protein conformation. Annual

review of biochemistry 47: 251-276.

Claros, M.G., and von Heijne, G. 1994. TopPred II: an improved software for membrane protein

structure predictions. Comput Appl Biosci 10: 685-686.

Cserzo, M., Eisenhaber, F., Eisenhaber, B., and Simon, I. 2002. On filtering false positive

transmembrane protein predictions. Protein engineering 15: 745-752.

Cserzo, M., Wallin, E., Simon, I., von Heijne, G., and Elofsson, A. 1997. Prediction of

transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment

surface method. Protein engineering 10: 673-676.

Cunningham, F., and Deber, C.M. 2007. Optimizing synthesis and expression of transmembrane

peptides and proteins. Methods (San Diego, Calif.) 41: 370-380.

Cunningham, F., Poulsen, B.E., Ip, W., and Deber, C.M. 2010. Beta-branched residues adjacent

to GG4 motifs promote the efficient association of glycophorin A transmembrane helices.

Biopolymers Epub ahead of print.

Cunningham, F., Rath, A., Johnson, R.M., and Deber, C.M. 2009. Distinctions between

hydrophobic helices in globular proteins and transmembrane segments as factors in

protein sorting. The Journal of biological chemistry 284: 5395-5402.

Cuthbertson, J.M., Bond, P.J., and Sansom, M.S. 2006. Transmembrane helix-helix interactions:

comparative simulations of the glycophorin A dimer. Biochemistry 45: 14298-14310.

Cuthbertson, J.M., Doyle, D.A., and Sansom, M.S. 2005. Transmembrane helix prediction: a

comparative evaluation and analysis. Protein Eng Des Sel 18: 295-308.

Daniel, C.J., Conti, B., Johnson, A.E., and Skach, W.R. 2008. Control of translocation through

the Sec61 translocon by nascent polypeptide structure within the ribosome. The Journal

of biological chemistry 283: 20864-20873.

Dawson, J.P., Weinger, J.S., and Engelman, D.M. 2002. Motifs of serine and threonine can drive

association of transmembrane helices. Journal of molecular biology 316: 799-805.

Dawson, R.J., and Locher, K.P. 2006. Structure of a bacterial multidrug ABC transporter. Nature

443: 180-185.

Deber, C.M., Khan, A.R., Li, Z., Joensson, C., Glibowicka, M., and Wang, J. 1993. Val-->Ala

mutations selectively alter helix-helix packing in the transmembrane segment of phage

M13 coat protein. Proceedings of the National Academy of Sciences of the United States

of America 90: 11648-11652.

Deber, C.M., Wang, C., Liu, L.P., Prior, A.S., Agrawal, S., Muskat, B.L., and Cuticchia, A.J.

2001. TM Finder: a prediction program for transmembrane protein segments using a

combination of hydrophobicity and nonpolar phase helicity scales. Protein Sci 10: 212-

Deisenhofer, J., Epp, O., Miki, K., Huber, R., and Michel, H. 1984. X-ray structure analysis of a

membrane protein complex. Electron density map at 3 Å resolution and a model of the

chromophores of the photosynthetic reaction center from Rhodopseudomonas viridis.

Journal of molecular biology 180: 385-398.

Dill, K.A., Ozkan, S.B., Shell, M.S., and Weikl, T.R. 2008. The protein folding problem. Annual

review of biophysics 37: 289-316.

Dobson, C.M. 2003. Protein folding and misfolding. Nature 426: 884-890.

Dougherty, D.A. 2007. Cation-pi interactions involving aromatic amino acids. The Journal of

nutrition 137: 1504S-1508S; discussion 1516S-1517S.

Douglas, J.L., Trieber, C.A., Afara, M., and Young, H.S. 2005. Rapid, high-yield expression and

purification of Ca2+-ATPase regulatory proteins for high-resolution structural studies.

Protein expression and purification 40: 118-125.

Doura, A.K., and Fleming, K.G. 2004. Complex interactions at the helix-helix interface stabilize

the glycophorin A transmembrane dimer. Journal of molecular biology 343: 1487-1497.

Doura, A.K., Kobus, F.J., Dubrovsky, L., Hibbard, E., and Fleming, K.G. 2004. Sequence

context modulates the stability of a GxxxG-mediated transmembrane helix-helix dimer.

Dumon-Seignovert, L., Cariot, G., and Vuillard, L. 2004. The toxicity of recombinant proteins in

Escherichia coli: a comparison of overexpression in BL21(DE3), C41(DE3), and

C43(DE3). Protein expression and purification 37: 203-206.

Duong, M.T., Jaszewski, T.M., Fleming, K.G., and MacKenzie, K.R. 2007. Changes in apparent

free energy of helix-helix dimerization in a biological membrane due to point mutations.

Ebie, A.Z., and Fleming, K.G. 2007. Dimerization of the erythropoietin receptor transmembrane

domain in micelles. Journal of molecular biology 366: 517-524.

Eisenberg, D., Schwarz, E., Komaromy, M., and Wall, R. 1984. Analysis of membrane and

surface protein sequences with the hydrophobic moment plot. Journal of molecular

biology 179: 125-142.

Ellis, R.J. 1997. Do molecular chaperones have to be proteins? Biochemical and biophysical

research communications 238: 687-692.

Elmore, D.E., and Dougherty, D.A. 2003. Investigating lipid composition effects on the

mechanosensitive channel of large conductance (MscL) using molecular dynamics

simulations. Biophysical journal 85: 1512-1524.

Engelman, D.M. 2005. Membranes are more mosaic than fluid. Nature 438: 578-580.

Engelman, D.M., Steitz, T.A., and Goldman, A. 1986. Identifying nonpolar transbilayer helices

in amino acid sequences of membrane proteins. Annual review of biophysics and

biophysical chemistry 15: 321-353.

Enquist, K., Fransson, M., Boekel, C., Bengtsson, I., Geiger, K., Lang, L., Pettersson, A.,

Johansson, S., von Heijne, G., and Nilsson, I. 2009. Membrane-integration characteristics

of two ABC transporters, CFTR and P-glycoprotein. Journal of molecular biology 387:

1153-1164.

Eshaghi, S., Niegowski, D., Kohl, A., Martinez Molina, D., Lesley, S.A., and Nordlund, P. 2006.

Crystal structure of a divalent metal ion transporter CorA at 2.9 angstrom resolution.

Science (New York, N.Y.) 313: 354-357.

Faham, S., Yang, D., Bare, E., Yohannan, S., Whitelegge, J.P., and Bowie, J.U. 2004. Side-chain

contributions to membrane protein structure and stability. Journal of molecular biology

335: 297-305.

Ferreira, G.C., and Pedersen, P.L. 1992. Overexpression of higher eukaryotic membrane proteins

in bacteria. Novel insights obtained with the liver mitochondrial proton/phosphate

symporter. The Journal of biological chemistry 267: 5460-5466.

Fisher, L.E., Engelman, D.M., and Sturgis, J.N. 2003. Effect of detergents on the association of

the glycophorin A transmembrane helix. Biophysical journal 85: 3097-3105.

Fleming, K.G., and Engelman, D.M. 2001. Specificity in transmembrane helix-helix interactions

can define a hierarchy of stability for sequence variants. Proceedings of the National

Academy of Sciences of the United States of America 98: 14340-14344.

Frelet, A., and Klein, M. 2006. Insight in eukaryotic ABC transporter function by mutation

analysis. FEBS letters 580: 1064-1084.

Frydman, J., and Hartl, F.U. 1996. Principles of chaperone-assisted protein folding: differences

between in vitro and in vivo mechanisms. Science (New York, N.Y.) 272: 1497-1502.

Fu, D., Libson, A., Miercke, L.J., Weitzman, C., Nollert, P., Krucinski, J., and Stroud, R.M.

2000. Structure of a glycerol-conducting channel and the basis for its selectivity. Science

(New York, N.Y.) 290: 481-486.

Fuchs, A., Kirschner, A., and Frishman, D. 2009. Prediction of helix-helix contacts and

interacting helices in polytopic membrane proteins using neural networks. Proteins 74:

857-871.

Gaddie, K.J., and Kirley, T.L. 2009. Conserved polar residues stabilize transmembrane domains

and promote oligomerization in human nucleoside triphosphate diphosphohydrolase 3.

Biochemistry 48: 9437-9447.

Galdiero, S., Galdiero, M., and Pedone, C. 2007. beta-Barrel membrane bacterial proteins:

structure, function, assembly and interaction with lipids. Current protein & peptide

science 8: 63-82.

Gerber, N.C., and Sligar, S.G. 1992. Catalytic mechanism of Cytochrome-P-450 - evidence for a

distal charge relay. Journal of the American Chemical Society 114: 8742-8743.

Go, M.Y., Kim, S., Partridge, A.W., Melnyk, R.A., Rath, A., Deber, C.M., and Mogridge, J.

2006. Self-association of the transmembrane domain of an anthrax toxin receptor.

Goldmann, W.H., Ezzell, R.M., Adamson, E.D., Niggli, V., and Isenberg, G. 1996. Vinculin,

talin and focal adhesions. Journal of muscle research and cell motility 17: 1-5.

Goldstein, J., Pollitt, N.S., and Inouye, M. 1990. Major cold shock protein of Escherichia coli.

Proceedings of the National Academy of Sciences of the United States of America 87:

283-287.

Gratkowski, H., Lear, J.D., and DeGrado, W.F. 2001. Polar side chains drive the association of

model transmembrane peptides. Proceedings of the National Academy of Sciences of the

United States of America 98: 880-885.

Hankamer, B., Morris, E.P., and Barber, J. 1999. Revealing the structure of the oxygen-evolving

core dimer of photosystem II by cryoelectron crystallography. Nature structural biology

6: 560-564.

Haupt, M., Bramkamp, M., Coles, M., Kessler, H., and Altendorf, K. 2005. Prokaryotic Kdp-

ATPase: recent insights into the structure and function of KdpB. Journal of molecular

microbiology and biotechnology 10: 120-131.

Hawkins, C.A., de Alba, E., and Tjandra, N. 2005. Solution structure of human saposin C in a

detergent environment. Journal of molecular biology 346: 1381-1392.

Hedin, L.E., Ojemalm, K., Bernsel, A., Hennerdal, A., Illergard, K., Enquist, K., Kauko, A.,

Cristobal, S., von Heijne, G., Lerch-Bader, M., et al. 2010. Membrane insertion of

marginally hydrophobic transmembrane helices depends on sequence context. Journal of

Hessa, T., Kim, H., Bihlmaier, K., Lundin, C., Boekel, J., Andersson, H., Nilsson, I., White,

S.H., and von Heijne, G. 2005a. Recognition of transmembrane helices by the

endoplasmic reticulum translocon. Nature 433: 377-381.

Hessa, T., Meindl-Beinker, N.M., Bernsel, A., Kim, H., Sato, Y., Lerch-Bader, M., Nilsson, I.,

White, S.H., and von Heijne, G. 2007. Molecular code for transmembrane-helix

recognition by the Sec61 translocon. Nature 450: 1026-1030.

Hessa, T., White, S.H., and von Heijne, G. 2005b. Membrane insertion of a potassium-channel

voltage sensor. Science (New York, N.Y.) 307: 1427.

Hicks, M.R., Damianoglou, A., Rodger, A., and Dafforn, T.R. 2008. Folding and membrane

insertion of the pore-forming peptide gramicidin occur as a concerted process. Journal of

Hildebrand, P.W., Preissner, R., and Frommel, C. 2004. Structural features of transmembrane

helices. FEBS letters 559: 145-151.

Hiroaki, Y., Tani, K., Kamegawa, A., Gyobu, N., Nishikawa, K., Suzuki, H., Walz, T., Sasaki,

S., Mitsuoka, K., Kimura, K., et al. 2006. Implications of the aquaporin-4 structure on

array formation and cell adhesion. Journal of molecular biology 355: 628-639.

Hirokawa, T., Boon-Chieng, S., and Mitaku, S. 1998. SOSUI: classification and secondary

structure prediction system for membrane proteins. Bioinformatics (Oxford, England) 14:

378-379.

Hofmann, K., and Stoffel, W. 1993. TMbase - A database of membrane spanning protein

segments. Biol. Chem. Hoppe-Seyler 374: 166.

http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html.

Hubbard, S.J., and Thornton, J.M. 1993. "NACCESS", 2.1.1 ed, London.

Hunter, H.N., Demcoe, A.R., Jenssen, H., Gutteberg, T.J., and Vogel, H.J. 2005. Human

lactoferricin is partially folded in aqueous solution and is better stabilized in a membrane

mimetic solvent. Antimicrobial agents and chemotherapy 49: 3387-3395.

Imamura, T. 2006. Protein-Surfactant Interactions. In Encyclopedia of Surface and Colloid

Science, 2nd ed. (ed. P. Somasundaran), pp. 5251-5263. Taylor & Francis, New

Insel, P.A., Tang, C.M., Hahntow, I., and Michel, M.C. 2007. Impact of GPCRs in clinical

medicine: monogenic diseases, genetic variants and drug targets. Biochimica et

biophysica acta 1768: 994-1005.

Jiang, Y., Lee, A., Chen, J., Ruta, V., Cadene, M., Chait, B.T., and MacKinnon, R. 2003. X-ray

structure of a voltage-dependent K+ channel. Nature 423: 33-41.

Johnson, R.M., Hecht, K., and Deber, C.M. 2007. Aromatic and cation-pi interactions enhance

helix-helix association in a membrane environment. Biochemistry 46: 9208-9214.

Johnson, R.M., Heslop, C.L., and Deber, C.M. 2004. Hydrophobic helical hairpins: design and

packing interactions in membrane environments. Biochemistry 43: 14361-14369.

Johnson, R.M., Rath, A., Melnyk, R.A., and Deber, C.M. 2006. Lipid solvation effects contribute

to the affinity of Gly-xxx-Gly motif-mediated helix-helix interactions. Biochemistry 45:

8507-8515.

Jones, D.T. 2007. Improving the accuracy of transmembrane protein topology prediction using

evolutionary information. Bioinformatics (Oxford, England) 23: 538-544.

Junne, T., Kocik, L., and Spiess, M. 2010. The hydrophobic core of the Sec61 translocon defines

the hydrophobicity threshold for membrane integration. Molecular biology of the cell 21:

1662-1670.

Juretic, D., Zoranic, L., and Zucic, D. 2002. Basic charge clusters and predictions of membrane

protein topology. Journal of chemical information and computer sciences 42: 620-632.

Kartner, N., Augustinas, O., Jensen, T.J., Naismith, A.L., and Riordan, J.R. 1992.

Mislocalization of delta F508 CFTR in cystic fibrosis sweat gland. Nature genetics 1:

321-327.

Kerem, B., Rommens, J.M., Buchanan, J.A., Markiewicz, D., Cox, T.K., Chakravarti, A.,

Buchwald, M., and Tsui, L.C. 1989. Identification of the cystic fibrosis gene: genetic

analysis. Science (New York, N.Y.) 245: 1073-1080.

Kern, R., Malki, A., Holmgren, A., and Richarme, G. 2003. Chaperone properties of Escherichia

coli thioredoxin and thioredoxin reductase. The Biochemical journal 371: 965-972.

Khademi, S., O'Connell, J., 3rd, Remis, J., Robles-Colmenares, Y., Miercke, L.J., and Stroud,

R.M. 2004. Mechanism of ammonia transport by Amt/MEP/Rh: structure of AmtB at

1.35 Å. Science (New York, N.Y.) 305: 1587-1594.

Killian, J.A., and Nyholm, T.K. 2006. Peptides in lipid bilayers: the power of simple models.

Current opinion in structural biology 16: 473-479.

Killian, J.A., and von Heijne, G. 2000. How proteins adapt to a membrane-water interface.

Trends in biochemical sciences 25: 429-434.

Koebnik, R., Locher, K.P., and Van Gelder, P. 2000. Structure and function of bacterial outer

membrane proteins: barrels in a nutshell. Molecular microbiology 37: 239-253.

Kolmar, H., Hennecke, F., Gotze, K., Janzer, B., Vogt, B., Mayer, F., and Fritz, H.J. 1995.

Membrane insertion of the bacterial signal transduction protein ToxR and requirements

of transcription activation studied by modular replacement of different protein

substructures. The EMBO journal 14: 3895-3904.

Kreda, S.M., Mall, M., Mengos, A., Rochelle, L., Yankaskas, J., Riordan, J.R., and Boucher,

R.C. 2005. Characterization of wild-type and deltaF508 cystic fibrosis transmembrane

regulator in human respiratory epithelia. Molecular biology of the cell 16: 2154-2167.

Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E.L. 2001. Predicting transmembrane

protein topology with a hidden Markov model: application to complete genomes. Journal

of molecular biology 305: 567-580.

Kunji, E.R., Slotboom, D.J., and Poolman, B. 2003. Lactococcus lactis as host for

overproduction of functional membrane proteins. Biochimica et biophysica acta 1610:

97-108.

Kyte, J., and Doolittle, R.F. 1982. A simple method for displaying the hydropathic character of a

protein. Journal of molecular biology 157: 105-132.

Laage, R., and Langosch, D. 2001. Strategies for prokaryotic expression of eukaryotic membrane

proteins. Traffic (Copenhagen, Denmark) 2: 99-104.

Landolt-Marticorena, C., Williams, K.A., Deber, C.M., and Reithmeier, R.A. 1993. Non-random

distribution of amino acids in the transmembrane segments of human type I single span

membrane proteins. Journal of molecular biology 229: 602-608.

Langosch, D., Brosig, B., Kolmar, H., and Fritz, H.J. 1996. Dimerisation of the glycophorin A

transmembrane segment in membranes probed with the ToxR transcription activator.

Lanyi, J., and Schobert, B. 2002. Crystallographic structure of the retinal and the protein after

deprotonation of the Schiff base: the switch in the bacteriorhodopsin photocycle. Journal

of molecular biology 321: 727-737.

Lau, W.C., Baker, L.A., and Rubinstein, J.L. 2008. Cryo-EM structure of the yeast ATP

synthase. Journal of molecular biology 382: 1256-1264.

le Maire, M., Champeil, P., and Moller, J.V. 2000. Interaction of membrane proteins and lipids

with solubilizing detergents. Biochimica et biophysica acta 1508: 86-111.

Lear, J.D., Gratkowski, H., and DeGrado, W.F. 2001. De novo design, synthesis and

characterization of membrane-active peptides. Biochemical Society transactions 29: 559-

Lear, J.D., Stouffer, A.L., Gratkowski, H., Nanda, V., and Degrado, W.F. 2004. Association of a

model transmembrane peptide containing gly in a heptad sequence motif. Biophysical

journal 87: 3421-3429.

Lecomte, J.T., Vuletich, D.A., and Lesk, A.M. 2005. Structural divergence and distant

relationships in proteins: evolution of the globins. Current opinion in structural biology

15: 290-301.

Lemmon, M.A., Flanagan, J.M., Hunt, J.F., Adair, B.D., Bormann, B.J., Dempsey, C.E., and

Engelman, D.M. 1992a. Glycophorin A dimerization is driven by specific interactions

between transmembrane alpha-helices. The Journal of biological chemistry 267: 7683-

Lemmon, M.A., Flanagan, J.M., Treutlein, H.R., Zhang, J., and Engelman, D.M. 1992b.

Sequence specificity in the dimerization of transmembrane alpha-helices. Biochemistry

31: 12719-12725.

Lerch-Bader, M., Lundin, C., Kim, H., Nilsson, I., and von Heijne, G. 2008. Contribution of

positively charged flanking residues to the insertion of transmembrane helices into the

endoplasmic reticulum. Proceedings of the National Academy of Sciences of the United

States of America 105: 4127-4132.

Lewis, H.A., Buchanan, S.G., Burley, S.K., Conners, K., Dickey, M., Dorwart, M., Fowler, R.,

Gao, X., Guggino, W.B., Hendrickson, W.A., et al. 2004. Structure of nucleotide-binding

domain 1 of the cystic fibrosis transmembrane conductance regulator. The EMBO journal

23: 282-293.

Li, H., Cocco, M.J., Steitz, T.A., and Engelman, D.M. 2001. Conversion of phospholamban into

a soluble pentameric helical bundle. Biochemistry 40: 6636-6645.

Liu, L.P., and Deber, C.M. 1998a. Guidelines for membrane protein engineering derived from de

novo designed model peptides. Biopolymers 47: 41-62.

Liu, L.P., and Deber, C.M. 1998b. Uncoupling hydrophobicity and helicity in transmembrane

segments. Alpha-helical propensities of the amino acids in non-polar environments. The

Journal of biological chemistry 273: 23645-23648.

Liu, L.P., and Deber, C.M. 1999. Combining hydrophobicity and helicity: a novel approach to

membrane protein structure prediction. Bioorganic & medicinal chemistry 7: 1-7.

Liu, L.P., Li, S.C., Goto, N.K., and Deber, C.M. 1996. Threshold hydrophobicity dictates helical

conformations of peptides in membrane environments. Biopolymers 39: 465-470.

Liu, M., Tarsio, M., Charsky, C.M., and Kane, P.M. 2005. Structural and functional separation of

the N- and C-terminal domains of the yeast V-ATPase subunit H. The Journal of

Liu, W., Crocker, E., Siminovitch, D.J., and Smith, S.O. 2003. Role of side-chain conformational

entropy in transmembrane helix dimerization of glycophorin A. Biophysical journal 84:

1263-1271.

Long, S.B., Tao, X., Campbell, E.B., and MacKinnon, R. 2007. Atomic structure of a voltage-

dependent K+ channel in a lipid membrane-like environment. Nature 450: 376-382.

Lovett, P.S. 1996. Translation attenuation regulation of chloramphenicol resistance in bacteria--a

review. Gene 179: 157-162.

Lu, J., and Deutsch, C. 2005. Secondary structure formation of a transmembrane segment in Kv

channels. Biochemistry 44: 8230-8243.

Luecke, H., Schobert, B., Richter, H.T., Cartailler, J.P., and Lanyi, J.K. 1999. Structure of

bacteriorhodopsin at 1.55 Å resolution. Journal of molecular biology 291: 899-911.

Lundin, C., Kim, H., Nilsson, I., White, S.H., and von Heijne, G. 2008. Molecular code for

protein insertion in the endoplasmic reticulum membrane is similar for N(in)-C(out) and

N(out)-C(in) transmembrane helices. Proceedings of the National Academy of Sciences of

the United States of America 105: 15702-15707.

Lundstrom, K. 2004. Structural genomics on membrane proteins: mini review. Combinatorial

chemistry & high throughput screening 7: 431-439.

Lupo, D., Li, X.D., Durand, A., Tomizaki, T., Cherif-Zahar, B., Matassi, G., Merrick, M., and

Winkler, F.K. 2007. The 1.3-Å resolution structure of Nitrosomonas europaea Rh50 and

mechanistic implications for NH3 transport by Rhesus family proteins. Proceedings of

the National Academy of Sciences of the United States of America 104: 19303-19308.

MacKenzie, K.R., Prestegard, J.H., and Engelman, D.M. 1996. Leucine side-chain rotamers in a

glycophorin A transmembrane peptide as revealed by three-bond carbon-carbon

couplings and 13C chemical shifts. Journal of biomolecular NMR 7: 256-260.

MacKenzie, K.R., Prestegard, J.H., and Engelman, D.M. 1997. A transmembrane helix dimer:

structure and implications. Science (New York, N.Y.) 276: 131-133.

MacKinnon, R. 2003. Potassium channels. FEBS letters 555: 62-65.

Marchesi, V.T., and Andrews, E.P. 1971. Glycoproteins: isolation from cell membranes with

lithium diiodosalicylate. Science (New York, N.Y.) 174: 1247-1248.

Marston, F.A. 1986. The purification of eukaryotic polypeptides synthesized in Escherichia coli.

The Biochemical journal 240: 1-12.

Martin, J., and Hartl, F.U. 1997. Chaperone-assisted protein folding. Current opinion in

structural biology 7: 41-52.

Martin, N.P., Leavitt, L.M., Sommers, C.M., and Dumont, M.E. 1999. Assembly of G protein-

coupled receptors from fragments: identification of functional receptors with

discontinuities in each of the loops connecting transmembrane segments. Biochemistry

38: 682-695.

Maslennikov, I., Klammt, C., Hwang, E., Kefala, G., Okamura, M., Esquivies, L., Mors, K.,

Glaubitz, C., Kwiatkowski, W., Jeon, Y.H., et al. 2010. Membrane domain structures of

three classes of histidine kinase receptors by cell-free expression and rapid NMR

analysis. Proceedings of the National Academy of Sciences of the United States of

America 107: 10902-10907.

Meier, T., Polzer, P., Diederichs, K., Welte, W., and Dimroth, P. 2005. Structure of the rotor ring

of F-Type Na+-ATPase from Ilyobacter tartaricus. Science (New York, N.Y.) 308: 659-662.

Meijer, A.B., Spruijt, R.B., Wolfs, C.J., and Hemminga, M.A. 2001. Membrane-anchoring

interactions of M13 major coat protein. Biochemistry 40: 8815-8820.

Melnyk, R.A., Kim, S., Curran, A.R., Engelman, D.M., Bowie, J.U., and Deber, C.M. 2004. The

affinity of GXXXG motifs in transmembrane helix-helix interactions is modulated by

long-range communication. The Journal of biological chemistry 279: 16591-16597.

Melnyk, R.A., Partridge, A.W., and Deber, C.M. 2001. Retention of native-like oligomerization

states in transmembrane segment peptides: application to the Escherichia coli aspartate

receptor. Biochemistry 40: 11106-11113.

Melnyk, R.A., Partridge, A.W., and Deber, C.M. 2002. Transmembrane domain mediated self-

assembly of major coat protein subunits from Ff bacteriophage. Journal of molecular

biology 315: 63-72.

Melnyk, R.A., Partridge, A.W., Yip, J., Wu, Y., Goto, N.K., and Deber, C.M. 2003. Polar

residue tagging of transmembrane peptides. Biopolymers 71: 675-685.

Midgett, C.R., and Madden, D.R. 2007. Breaking the bottleneck: eukaryotic membrane protein

expression for high-resolution structural studies. Journal of structural biology 160: 265-

Mingarro, I., Nilsson, I., Whitley, P., and von Heijne, G. 2000. Different conformations of

nascent polypeptides during translocation across the ER membrane. BMC cell biology 1:

Miroux, B., and Walker, J.E. 1996. Over-production of proteins in Escherichia coli: mutant hosts

that allow synthesis of some membrane proteins and globular proteins at high levels.

Monne, M., Nilsson, I., Elofsson, A., and von Heijne, G. 1999. Turns in transmembrane helices:

determination of the minimal length of a "helical hairpin" and derivation of a fine-grained

turn propensity scale. Journal of molecular biology 293: 807-814.

Morais-Cabral, J.H., Zhou, Y., and MacKinnon, R. 2001. Energetic optimization of ion

conduction rate by the K+ selectivity filter. Nature 414: 37-42.

Morris, K.N., and Wool, I.G. 1994. Analysis of the contribution of an amphiphilic alpha-helix to

the structure and to the function of ricin A chain. Proceedings of the National Academy of

Sciences of the United States of America 91: 7530-7533.

Mujacic, M., Cooper, K.W., and Baneyx, F. 1999. Cold-inducible cloning vectors for low-

temperature protein expression in Escherichia coli: application to the production of a

toxic and proteolytically sensitive fusion protein. Gene 238: 325-332.

Mulkidjanian, A.Y., Galperin, M.Y., and Koonin, E.V. 2009. Co-evolution of primordial

membranes and membrane proteins. Trends in biochemical sciences 34: 206-215.

Mulvihill, C.M., and Deber, C.M. 2010. Evidence that the translocon may function as a

hydropathy partitioning filter. Biochimica et biophysica acta 1798: 1995-1998.

Murata, K., Mitsuoka, K., Hirai, T., Walz, T., Agre, P., Heymann, J.B., Engel, A., and Fujiyoshi,

Y. 2000. Structural determinants of water permeation through aquaporin-1. Nature 407:

599-605.

Netz, D.J., Bastos Mdo, C., and Sahl, H.G. 2002. Mode of action of the antimicrobial peptide

aureocin A53 from Staphylococcus aureus. Applied and environmental microbiology 68:

5274-5280.

Ng, D.P., and Deber, C.M. 2010. Deletion of a terminal residue disrupts oligomerization of a

transmembrane alpha-helix. Biochemistry and cell biology = Biochimie et biologie

cellulaire 88: 339-345.

Nilsson, I., Johnson, A.E., and von Heijne, G. 2003. How hydrophobic is alanine? The Journal of

Nishimura, C., Prytulla, S., Jane Dyson, H., and Wright, P.E. 2000. Conservation of folding

pathways in evolutionarily distant globin sequences. Nature structural biology 7: 679-

Nogi, T., Fathir, I., Kobayashi, M., Nozawa, T., and Miki, K. 2000. Crystal structures of

photosynthetic reaction center and high-potential iron-sulfur protein from

Thermochromatium tepidum: thermostability and electron transfer. Proceedings of the

National Academy of Sciences of the United States of America 97: 13561-13566.

Norholm, M.H. 2010. A mutant Pfu DNA polymerase designed for advanced uracil-excision

DNA engineering. BMC biotechnology 10: 21.

Norholm, M.H., Cunningham, F., Deber, C.M., and von Heijne, G. 2011. Converting a

marginally hydrophobic soluble protein into a membrane protein. Journal of molecular

biology 407: 171-179.

Nour-Eldin, H.H., Hansen, B.G., Norholm, M.H., Jensen, J.K., and Halkier, B.A. 2006.

Advancing uracil-excision based cloning towards an ideal technique for cloning PCR

fragments. Nucleic acids research 34: e122.

Okada, T., Sugihara, M., Bondar, A.N., Elstner, M., Entel, P., and Buss, V. 2004. The retinal

conformation and its environment in rhodopsin in light of a new 2.2 Å crystal structure.

Osborne, A.R., Rapoport, T.A., and van den Berg, B. 2005. Protein translocation by the

Sec61/SecY channel. Annual review of cell and developmental biology 21: 529-550.

Ottemann, K.M., and Mekalanos, J.J. 1995. Analysis of Vibrio cholerae ToxR function by

construction of novel fusion proteins. Molecular microbiology 15: 719-731.

Oxenoid, K., and Chou, J.J. 2005. The structure of phospholamban pentamer reveals a channel-

like architecture in membranes. Proceedings of the National Academy of Sciences of the

United States of America 102: 10870-10875.

Paivio, A., Nordling, E., Kallberg, Y., Thyberg, J., and Johansson, J. 2004. Stabilization of

discordant helices in amyloid fibril-forming proteins. Protein Sci 13: 1251-1259.

Partridge, A.W., Melnyk, R.A., and Deber, C.M. 2002a. Polar residues in membrane domains of

proteins: molecular basis for helix-helix association in a mutant CFTR transmembrane

segment. Biochemistry 41: 3647-3653.

Partridge, A.W., Therien, A.G., and Deber, C.M. 2002b. Polar mutations in membrane proteins

as a biophysical basis for disease. Biopolymers 66: 350-358.

Pedersen, B.P., Buch-Pedersen, M.J., Morth, J.P., Palmgren, M.G., and Nissen, P. 2007. Crystal

structure of the plasma membrane proton pump. Nature 450: 1111-1114.

Peng, S., Liu, L.P., Emili, A.Q., and Deber, C.M. 1998. Cystic fibrosis transmembrane

conductance regulator: expression and helicity of a double membrane-spanning segment.

FEBS letters 431: 29-33.

Pitonzo, D., and Skach, W.R. 2006. Molecular mechanisms of aquaporin biogenesis by the

endoplasmic reticulum Sec61 translocon. Biochimica et biophysica acta 1758: 976-988.

Pitonzo, D., Yang, Z., Matsumura, Y., Johnson, A.E., and Skach, W.R. 2009. Sequence-specific

retention and regulated integration of a nascent membrane protein by the endoplasmic

reticulum Sec61 translocon. Molecular biology of the cell 20: 685-698.

Plotkowski, M.L., Kim, S., Phillips, M.L., Partridge, A.W., Deber, C.M., and Bowie, J.U. 2007.

Transmembrane domain of myelin protein zero can form dimers: possible implications

for myelin construction. Biochemistry 46: 12164-12173.

Pohorille, A., Schweighofer, K., and Wilson, M.A. 2005. The origin and early evolution of

membrane channels. Astrobiology 5: 1-17.

Popot, J.L., and Engelman, D.M. 1990. Membrane protein folding and oligomerization: the two-

stage model. Biochemistry 29: 4031-4037.

Popot, J.L., and Engelman, D.M. 2000. Helical membrane protein folding, stability, and

evolution. Annual review of biochemistry 69: 881-922.

Poulsen, B.E., Rath, A., and Deber, C.M. 2009. The assembly motif of a bacterial small

multidrug resistance protein. The Journal of biological chemistry 284: 9870-9875.

Powl, A.M., Carney, J., Marius, P., East, J.M., and Lee, A.G. 2005. Lipid interactions with

bacterial channels: fluorescence studies. Biochemical Society transactions 33: 905-909.

Prive, G.G. 2007. Detergents for the stabilization and crystallization of membrane proteins.

Methods (San Diego, Calif.) 41: 388-397.

Quick, M., and Wright, E.M. 2002. Employing Escherichia coli to functionally express, purify,

and characterize a human transporter. Proceedings of the National Academy of Sciences

of the United States of America 99: 8597-8601.

Ramjeesingh, M., Ugwu, F., Li, C., Dhani, S., Huan, L.J., Wang, Y., and Bear, C.E. 2003. Stable

dimeric assembly of the second membrane-spanning domain of CFTR (cystic fibrosis

transmembrane conductance regulator) reconstitutes a chloride-selective pore. The

Biochemical journal 375: 633-641.

Rasmussen, S.G., Choi, H.J., Rosenbaum, D.M., Kobilka, T.S., Thian, F.S., Edwards, P.C.,

Burghammer, M., Ratnala, V.R., Sanishvili, R., Fischetti, R.F., et al. 2007. Crystal

structure of the human beta2 adrenergic G-protein-coupled receptor. Nature 450: 383-

Rastogi, V.K., and Girvin, M.E. 1999. Structural changes linked to proton translocation by

subunit c of the ATP synthase. Nature 402: 263-268.

Rath, A., and Deber, C.M. 2007. Membrane protein assembly patterns reflect selection for non-

proliferative structures. FEBS letters 581: 1335-1341.

Rath, A., and Deber, C.M. 2008. Surface recognition elements of membrane protein

oligomerization. Proteins 70: 786-793.

Rath, A., Glibowicka, M., Nadeau, V.G., Chen, G., and Deber, C.M. 2009a. Detergent binding

explains anomalous SDS-PAGE migration of membrane proteins. Proceedings of the

National Academy of Sciences of the United States of America 106: 1760-1765.

Rath, A., Melnyk, R.A., and Deber, C.M. 2006. Evidence for assembly of small multidrug

resistance proteins by a "two-faced" transmembrane helix. The Journal of biological

chemistry 281: 15546-15553.

Rath, A., Tulumello, D.V., and Deber, C.M. 2009b. Peptide models of membrane protein

folding. Biochemistry 48: 3036-3045.

Ravichandran, K.G., Boddupalli, S.S., Hasermann, C.A., Peterson, J.A., and Deisenhofer, J.

1993. Crystal structure of hemoprotein domain of P450BM-3, a prototype for microsomal

P450's. Science (New York, N.Y.) 261: 731-736.

Ready, M.P., Kim, Y., and Robertus, J.D. 1991. Site-directed mutagenesis of ricin A-chain and

implications for the mechanism of action. Proteins 10: 270-278.

Ridge, K.D., Lee, S.S., and Yao, L.L. 1995. In vivo assembly of rhodopsin from expressed

polypeptide fragments. Proceedings of the National Academy of Sciences of the United

Riek, R.P., Rigoutsos, I., Novotny, J., and Graham, R.M. 2001. Non-alpha-helical elements

modulate polytopic membrane protein architecture. Journal of molecular biology 306:

349-362.

Riggs, P. 2001. Expression and purification of maltose-binding protein fusions. Current

protocols in molecular biology / edited by Frederick M. Ausubel ... [et al.] Chapter 16:

Unit16 16.

Riordan, J.R. 2005. Assembly of functional CFTR chloride channels. Annual review of

physiology 67: 701-718.

Riordan, J.R. 2008. CFTR function and prospects for therapy. Annual review of biochemistry 77:

701-726.

Riordan, J.R., Rommens, J.M., Kerem, B., Alon, N., Rozmahel, R., Grzelczak, Z., Zielenski, J.,

Lok, S., Plavsic, N., Chou, J.L., et al. 1989. Identification of the cystic fibrosis gene:

cloning and characterization of complementary DNA. Science (New York, N.Y.) 245: 1066-

Rommens, J.M., Iannuzzi, M.C., Kerem, B., Drumm, M.L., Melmer, G., Dean, M., Rozmahel,

R., Cole, J.L., Kennedy, D., Hidaka, N., et al. 1989. Identification of the cystic fibrosis

gene: chromosome walking and jumping. Science (New York, N.Y.) 245: 1059-1065.

Rosenbusch, J.P., Lustig, A., Grabo, M., Zulauf, M., and Regenass, M. 2001. Approaches to

determining membrane protein structures to high resolution: do selections of

subpopulations occur? Micron 32: 75-90.

Rost, B., Fariselli, P., and Casadio, R. 1996. Topology prediction for helical transmembrane

proteins at 86% accuracy. Protein Sci 5: 1704-1718.

Roth, L., Nasarre, C., Dirrig-Grosch, S., Aunis, D., Cremel, G., Hubert, P., and Bagnard, D.

2008. Transmembrane domain interactions control biological functions of neuropilin-1.

Molecular biology of the cell 19: 646-654.

Russ, W.P., and Engelman, D.M. 1999. TOXCAT: a measure of transmembrane helix

association in a biological membrane. Proceedings of the National Academy of Sciences

Russ, W.P., and Engelman, D.M. 2000. The GxxxG motif: a framework for transmembrane

helix-helix association. Journal of molecular biology 296: 911-919.

Sadlish, H., Pitonzo, D., Johnson, A.E., and Skach, W.R. 2005. Sequential triage of

transmembrane segments by Sec61alpha during biogenesis of a native multispanning

membrane protein. Nature structural & molecular biology 12: 870-878.

Sadlish, H., and Skach, W.R. 2004. Biogenesis of CFTR and other polytopic membrane proteins:

new roles for the ribosome-translocon complex. The Journal of membrane biology 202:

115-126.

Sakata, S., Kurokawa, T., Norholm, M.H., Takagi, M., Okochi, Y., von Heijne, G., and

Okamura, Y. 2010. Functionality of the voltage-gated proton channel truncated in S4.

Proceedings of the National Academy of Sciences of the United States of America 107:

2313-2318.

Sato, Y., Sakaguchi, M., Goshima, S., Nakamura, T., and Uozumi, N. 2002. Integration of

Shaker-type K+ channel, KAT1, into the endoplasmic reticulum membrane: synergistic

insertion of voltage-sensing segments, S3-S4, and independent insertion of pore-forming

segments, S5-P-S6. Proceedings of the National Academy of Sciences of the United

Sato, Y., Sakaguchi, M., Goshima, S., Nakamura, T., and Uozumi, N. 2003. Molecular dissection

of the contribution of negatively and positively charged residues in S2, S3, and S4 to the

final membrane topology of the voltage sensor in the K+ channel, KAT1. The Journal of

Schneider, D., and Engelman, D.M. 2003. GALLEX, a measurement of heterologous association

of transmembrane helices in a biological membrane. The Journal of biological chemistry

278: 3105-3111.

Schneider, D., and Engelman, D.M. 2004. Motifs of two small residues can assist but are not

sufficient to mediate transmembrane helix interactions. Journal of molecular biology

343: 799-804.

Schwiebert, E.M., Morales, M.M., Devidas, S., Egan, M.E., and Guggino, W.B. 1998. Chloride

channel and chloride conductance regulator domains of CFTR, the cystic fibrosis

transmembrane conductance regulator. Proceedings of the National Academy of Sciences

Senes, A., Gerstein, M., and Engelman, D.M. 2000. Statistical analysis of amino acid patterns in

transmembrane helices: the GxxxG motif occurs frequently and in association with beta-

branched residues at neighboring positions. Journal of molecular biology 296: 921-936.

Serohijos, A.W., Hegedus, T., Aleksandrov, A.A., He, L., Cui, L., Dokholyan, N.V., and

Riordan, J.R. 2008. Phenylalanine-508 mediates a cytoplasmic-membrane domain

contact in the CFTR 3D structure crucial to assembly and channel function. Proceedings

of the National Academy of Sciences of the United States of America 105: 3256-3261.

Shen, H., and Chou, J.J. 2008. MemBrain: improving the accuracy of predicting transmembrane

helices. PloS one 3: e2399.

Shrake, A., and Rupley, J.A. 1973. Environment and exposure to solvent of protein atoms.

Lysozyme and insulin. Journal of molecular biology 79: 351-371.

Skach, W.R., and Lingappa, V.R. 1993. Amino-terminal assembly of human P-glycoprotein at

the endoplasmic reticulum is directed by cooperative actions of two internal sequences.

The Journal of biological chemistry 268: 23552-23561.

Sonnhammer, E.L., von Heijne, G., and Krogh, A. 1998. A hidden Markov model for predicting

transmembrane helices in protein sequences. Proceedings / ... International Conference

on Intelligent Systems for Molecular Biology ; ISMB 6: 175-182.

Sorensen, H.P., and Mortensen, K.K. 2005a. Advanced genetic strategies for recombinant

protein expression in Escherichia coli. Journal of biotechnology 115: 113-128.

Sorensen, H.P., and Mortensen, K.K. 2005b. Soluble expression of recombinant proteins in the

cytoplasm of Escherichia coli. Microbial cell factories 4: 1.

Standfuss, J., Xie, G., Edwards, P.C., Burghammer, M., Oprian, D.D., and Schertler, G.F. 2007.

Crystal structure of a thermally stable rhodopsin mutant. Journal of molecular biology

372: 1179-1188.

Studier, F.W., and Moffatt, B.A. 1986. Use of bacteriophage T7 RNA polymerase to direct

selective high-level expression of cloned genes. Journal of molecular biology 189: 113-

Sulistijo, E.S., Jaszewski, T.M., and MacKenzie, K.R. 2003. Sequence-specific dimerization of

the transmembrane domain of the "BH3-only" protein BNIP3 in membranes and

detergent. The Journal of biological chemistry 278: 51950-51956.

Sulistijo, E.S., and MacKenzie, K.R. 2006. Sequence dependence of BNIP3 transmembrane

domain dimerization implicates side-chain hydrogen bonding and a tandem GxxxG motif

in specific helix-helix interactions. Journal of molecular biology 364: 974-990.

Swartz, K.J. 2008. Sensing voltage across lipid membranes. Nature 456: 891-897.

Tamm, L.K., Hong, H., and Liang, B. 2004. Folding and assembly of beta-barrel membrane

proteins. Biochimica et biophysica acta 1666: 250-263.

Tate, C.G. 2001. Overexpression of mammalian integral membrane proteins for structural

studies. FEBS letters 504: 94-98.

Therien, A.G., and Deber, C.M. 2002. Oligomerization of a peptide derived from the

transmembrane region of the sodium pump gamma subunit: effect of the pathological

mutation G41R. Journal of molecular biology 322: 583-550.

Therien, A.G., Glibowicka, M., and Deber, C.M. 2002. Expression and purification of two

hydrophobic double-spanning membrane proteins derived from the cystic fibrosis

transmembrane conductance regulator. Protein expression and purification 25: 81-86.

Therien, A.G., Grant, F.E., and Deber, C.M. 2001. Interhelical hydrogen bonds in the CFTR

membrane domain. Nature structural biology 8: 597-601.

Tulumello, D.V., and Deber, C.M. 2009. SDS micelles as a membrane-mimetic environment for

transmembrane segments. Biochemistry 48: 12096-12103.

Tusnady, G.E., and Simon, I. 1998. Principles governing amino acid composition of integral

membrane proteins: application to topology prediction. Journal of molecular biology

283: 489-506.

Tusnady, G.E., and Simon, I. 2001. The HMMTOP transmembrane topology prediction server.

Bioinformatics (Oxford, England) 17: 849-850.

Ulmschneider, M.B., and Sansom, M.S. 2001. Amino acid distributions in integral membrane

protein structures. Biochimica et biophysica acta 1512: 1-14.

Ulmschneider, M.B., Sansom, M.S., and Di Nola, A. 2005. Properties of integral membrane

protein structures: derivation of an implicit membrane potential. Proteins 59: 252-265.

Viklund, H., Granseth, E., and Elofsson, A. 2006. Structural classification and prediction of

reentrant regions in alpha-helical transmembrane proteins: application to complete

genomes. Journal of molecular biology 361: 591-603.

von Heijne, G. 1989. Control of topology and mode of assembly of a polytopic membrane

protein by positively charged residues. Nature 341: 456-458.

von Heijne, G. 1992. Membrane protein structure prediction. Hydrophobicity analysis and the

positive-inside rule. Journal of molecular biology 225: 487-494.

Wagner, K., Greil, I., Schneditz, P., and Rosenkranz, W. 1994. A new missense mutation G126D

in exon 4 of the cystic fibrosis transmembrane conductance regulator (CFTR) gene.

Human heredity 44: 56-57.

Wales, R., Chaddock, J.A., Roberts, L.M., and Lord, J.M. 1992. Addition of an ER retention

signal to the ricin A chain increases the cytotoxicity of the holotoxin. Experimental cell

research 203: 1-4.

Wales, R., Roberts, L.M., and Lord, J.M. 1993. Addition of an endoplasmic reticulum retrieval

sequence to ricin A chain significantly increases its cytotoxicity to mammalian cells. The

Journal of biological chemistry 268: 23986-23990.

Wang, C., and Deber, C.M. 2000. Peptide mimics of the M13 coat protein transmembrane

segment. Retention of helix-helix interaction motifs. The Journal of biological chemistry

275: 16155-16159.

Wang, C., Liu, L.P., and Deber, C.M. 2000. Delta Regions in proteins: Helices mispredicted as

transmembrane segments by the threshold hydrophobicity requirement. In Proceedings

of the 16th American Peptide Symposium (eds. G.B. Fields, J.P. Tam, and G.

Barany), pp. 367-369. Springer, Minneapolis, Minnesota, USA.

Ward, A., Reyes, C.L., Yu, J., Roth, C.B., and Chang, G. 2007. Flexibility in the ABC

transporter MsbA: Alternating access with a twist. Proceedings of the National Academy

of Sciences of the United States of America 104: 19005-19010.

Wehbi, H., Gasmi-Seabrook, G., Choi, M.Y., and Deber, C.M. 2008. Positional dependence of

non-native polar mutations on folding of CFTR helical hairpins. Biochimica et biophysica

acta 1778: 79-87.

Wehbi, H., Rath, A., Glibowicka, M., and Deber, C.M. 2007. Role of the extracellular loop in the

folding of a CFTR transmembrane helical hairpin. Biochemistry 46: 7099-7106.

White, S.H. 2009. Biophysical dissection of membrane proteins. Nature 459: 344-346.

White, S.H., and von Heijne, G. 2008. How translocons select transmembrane helices. Annual

review of biophysics 37: 23-42.

White, S.H., and Wimley, W.C. 1999. Membrane protein folding and stability: physical

principles. Annual review of biophysics and biomolecular structure 28: 319-365.

Wigley, W.C., Vijayakumar, S., Jones, J.D., Slaughter, C., and Thomas, P.J. 1998.

Transmembrane domain of cystic fibrosis transmembrane conductance regulator: design,

characterization, and secondary structure of synthetic peptides m1-m6. Biochemistry 37:

844-853.

Williamson, R.C., and Toye, A.M. 2008. Glycophorin A: Band 3 aid. Blood cells, molecules &

diseases 41: 35-43.

Wimley, W.C. 2003. The versatile beta-barrel membrane protein. Current opinion in structural

biology 13: 404-411.

Woolhead, C.A., McCormick, P.J., and Johnson, A.E. 2004. Nascent membrane and secretory

proteins differ in FRET-detected folding far inside the ribosome and in their exposure to

ribosomal proteins. Cell 116: 725-736.

Wu, J.V., Krouse, M.E., and Wine, J.J. 2007. Acinar origin of CFTR-dependent airway

submucosal gland fluid secretion. American journal of physiology 292: L304-311.

Yang, Q.H., Wu, C.L., Lin, K., and Li, L. 1997. Low concentration of inducer favors production

of active form of 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase in Escherichia

coli. Protein expression and purification 10: 320-324.

Yildirim, M.A., Goh, K.I., Cusick, M.E., Barabasi, A.L., and Vidal, M. 2007. Drug-target

network. Nature biotechnology 25: 1119-1126.

Young, M.T., Beckmann, R., Toye, A.M., and Tanner, M.J. 2000. Red-cell glycophorin A-band

3 interactions associated with the movement of band 3 to the cell surface. The

Biochemical journal 350 Pt 1: 53-60.

Yuen, C.T., Davidson, A.R., and Deber, C.M. 2000. Role of aromatic residues at the lipid-water

interface in micelle-bound bacteriophage M13 major coat protein. Biochemistry 39:

16155-16162.

Zhang, L., Button, B., Gabriel, S.E., Burkett, S., Yan, Y., Skiadopoulos, M.H., Dang, Y.L.,

Vogel, L.N., McKay, T., Mengos, A., et al. 2009. CFTR delivery to 25% of surface

epithelial cells restores normal rates of mucus transport to human cystic fibrosis airway

epithelium. PLoS biology 7: e1000155.

Zhang, L., Sato, Y., Hessa, T., von Heijne, G., Lee, J.K., Kodama, I., Sakaguchi, M., and

Uozumi, N. 2007. Contribution of hydrophobic and electrostatic interactions to the

membrane integration of the Shaker K+ channel voltage sensor domain. Proceedings of

the National Academy of Sciences of the United States of America 104: 8263-8268.

Zhao, G., and London, E. 2006. An amino acid "transmembrane tendency" scale that approaches

the theoretical limit to accuracy for prediction of transmembrane helices: relationship to

biological hydrophobicity. Protein Sci 15: 1987-2001.

Zhou, F.X., Cocco, M.J., Russ, W.P., Brunger, A.T., and Engelman, D.M. 2000. Interhelical

hydrogen bonding drives strong interactions in membrane proteins. Nature structural

biology 7: 154-160.

Zhou, F.X., Merianos, H.J., Brunger, A.T., and Engelman, D.M. 2001. Polar residues drive

association of polyleucine transmembrane helices. Proceedings of the National Academy

of Sciences of the United States of America 98: 2250-2255.

Zimmer, J., Nam, Y., and Rapoport, T.A. 2008. Structure of a complex of the ATPase SecA and

the protein-translocation channel. Nature 455: 936-943.

Appendices

Appendix 1: Additional Data tables.

Table A1.1. Database of globular helix sequences (n = 122).

PDBID Residues a Length Sequence H b

1AEP 128-154 27 APVQSALQEAAEKTKEAAANLQNSIQS -1.1

1AEP 36-63 28 EALNLLTEQANAFKTKIAEVTTSLKQEA -0.32

1AEP 7-30 24 IAEAVQQLNHTIVNAAHELHETLG -0.3

1AJA 329-353 25 PCGQIGETVDLDEAVQRALEFAKKE -0.55

1AL7 300-332 33 RPVVFSLAAEGEAGVKKVLQMMRDEFELTMALS 0.2

1ALD 317-337 21 KENLKAAQEEYVKRALANSLA -0.73

1APA 203-221 19 AKVLNLEESWGKISTAIHN -0.46

1ATN 507-528 22 PSDAVAEINSLYDVYLDVQQKW -0.08

1BBH 107-128 22 AEAVKTAFGDVGAACKSCHEKY -0.76

1BBH 81-102 22 MEDVGKIAREFVGAANTLAEVA 0.05

1BBH 5-31 27 PEEQIETRQAGYEFMGWNMGKIKANLE -0.63

1BGD 137-164 28 AFQRRAGGVLVASNLQSFLELAYRALRH 0.14

1BGD 101-124 24 APTLDTLQLDTTDFAINIWQQMED 0.16

1BGD 12-40 29 QSFLLKCLEQMRKVQADGTALQETLCATH -0.1

1BTC 257-284 28 EKGKFFLTWYSNKLLNHGDQILDEANKA -0.6

1CGM 111-133 23 VKRTDDASTAARAEIDNLIESIS -0.6

1CGP 110-136 27 PDILMRLSAQMARRLQVTSEKVGNLAF -0.01

1CHM 190-208 19 PEYEVALHATQAMVRAIAD -0.02

1CHM 160-183 24 SAEEHVMIRHGARIADIGGAAVVE -0.34

1CHM 271-290 20 SDDHLRLWQVNVEVHEAGLK -0.47

1CHR 98-116 19 ASAKAAVEMALLDLKARAL 0.36

1CPC 78-101 24 QTGKDKCVRDIGYYLRMVTYCLVV 0.34

1CPC 21-46 26 STEIQTAFGRFRQASASLAAAKALTE -0.35

1CPT 192-214 23 AARRFHETIATFYDYFNGFTVDR 0.09

1CPT 254-285 32 DKYINAYYVAIATAGHDTTSSSSGGAIIGLSR -0.62

1CPT 123-145 23 PASIRKLEENIRRIAQASVQRLL -0.35

1CSG 68-86 19 GSLTKLKGPLTMMASHYKQ -0.99

1DHR 96-121 26 LFKNCDLMWKQSIWTSTISSHLATKH -0.16

1FBP 29-49 21 EMTQLLNSLCTAVKAISTAVR 0.29

1GDH 293-311 19 TQAREDMAHQANDLIDALF -0.21

1GLA 524-549 26 ANHIIRATLESIAYQTRDVLEAMQAD -0.03

1GPB 528-553 26 EAFIRDVAKVKQENKLKFAAYLEREY -0.23

1GPB 614-632 19 HMAKMIIKLITAIGDVVNH 0.28

1GPB 48-77 30 PRDYYFALAHTVRDHLVGRWIRTQQHYYEK -0.47

1GPB 397-417 21 PRHLQIIYEINQRFLNRVAAA 0.04

1GRC 147-170 24 EDDITARVQTQEHAIYPLVISWFA 0.23

1HBG 124-145 22 AAAKDAWAAAYADISGALISGL 0.2

1HBG 100-121 22 AQYFEPLGASLLSAMEHRIGGK -0.42

1HBG 53-71 19 PGVAALGAKVLAQIGVAVS 0.07

1ITH 59-77 19 PAYKAQTLTVINYLDKVVD -0.07

1LAP 84-102 19 EGKENIRAAVAAGCRQIQD -0.89

1LAP 148-169 22 QEAWQRGVLFASGQNLARRLME -0.09

1LE2 87-123 37 EETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQ -0.53

1LE2 55-78 24 QVTQELRALMDETMKELKAYKSEL -0.25

1LH1 104-125 22 DAHFPVVKEAILKTIKEVVGAK -0.38

1LIS 13-41 29 KAFEVALKVQIIAGFDRGLVKWLRVHGRT 0.17

1LLA 177-198 22 KGELFYYMHQQMCARYDCERLS -0.01

1LMB 9-29 21 QEQLEDARRLKAIYEKKKNEL -1.06

1LTH 297-315 19 DKELAALKRSAETLKETAA -0.77

1LTS 197-222 26 DTCNEETQNLSTIYLREYQSKVKRQI -0.75

1LVL 48-66 19 CIPSKALIHVAEQFHQASR -0.53

1LVL 84-108 25 IGQSVAWKDGIVDRLTTGVAALLKK -0.12

1MAT 8-28 21 PEDIEKMRVAGRLAAEVLEMI 0.19

1MYT 58-76 19 AAISAHGATVLKKLGELLK -0.24

1MYT 122-146 25 AGGQTALRNVMGIIIADLEANYKEL 0.07

1P2P 90-108 19 ACEAFICNCDRNAAICFSK 0.38

1PBX 4-35 32 DKDKAAVRALWSKIGKSADAIGNDALSRMIVV -0.42

1PBX 120-140 21 PEAHVSLDKFLSGVALALAER -0.05

1RVE 37-58 22 TKVLSTIFELFSRPIINKIAEK 0.14

1SBP 14-32 19 RELYEQYNKAFSAHWKQET -0.85

1SCM 778-821 44 LSKIISMFQAHIRGYLIRKAYKKLQDQRIGLSVIQRNIRKWLVL 0.16

1THG 185-204 20 AGLHDQRKGLEWVSDNIANF -0.58

1TML 127-147 21 QHVQQEVLETMAYAGKALKAG -0.58

1TRB 295-313 19 AITSAGTGCMAALDAERYL 0.21

1VSG 23-55 33 DQPKGALFTLQAAASKIQKMRDAALRASIYAEI -0.34

1VSG 87-114 28 LSSQEVTATATASYLKGRIDEYLNLLLQ 0.1

1VSG 60-85 26 NRAKAAVIVANHYAMKADSGLEALKQ -0.64

1WRP 14-32 19 QRHQEWLRFVDLLKNAYQN -0.3

1XIS 296-322 27 FDGVWASAAGCMRNYLILKERAAAFRA 0.37

1XIS 109-128 20 RDVRRYALRKTIRNIDLAVE -0.18

1XIS 151-172 22 VRDALDRMKEAFDLLGEYVTSQ -0.01

1YPI 178-196 19 PEDAQDIHASIRKFLASKL -0.71

1ZEI 1-35 35 LQRMKQLEDKVEELLSKNYHLENEVARLKKLVGER -1.12

256B 24-42 19 AQVKDALTKMRAAALDAQK -0.66

256B 58-80 23 MKDFRHGFDILVGQIDDALKLAN -0.04

2ADA 126-144 19 PDDVVDLVNQGLQEGEQAF -0.54

2AK3 164-188 25 PETVVKRLKAYEAQTEPVLEYYRKK -0.86

2BPA 192-210 19 IMGLQAAYANLHTDQERDY -0.31

2CCY 40-58 19 AAQRAENMAMVAKLAPIGW 0.02

2CCY 104-125 22 PDALKAQAAATGKVCKACHEEF -0.84

2CCY 5-30 26 PEDLLKLRQGLMQTLKSQWVPIAGFA -0.03

2CCY 79-102 24 SAEFLEGWKALATESTKLAAAAKA -0.22

2CHB 59-78 20 DSQKKAIERMKDTLRIAYLT -0.55

2CMD 196-217 22 EQEVADLTKRIQNAGTEVVEAK -0.79

2CMD 87-108 22 RSDLFNVNAGIVKNLVQQVAKT -0.37

2CPK 140-159 20 EPHARFYAAQIVLTFEYLHS 0.24

2CPP 193-213 21 FAEAKEALYDYLIPIIEQRRQ 0.2

2CPP 121-145 25 MPVVDKLENRIQELACSLIESLRPQ -0.1

2GST 90-114 25 EEERIRADIVENQVMDNRMQLIMLC 0.35

2LIV 16-35 20 AQYGDQEFTGAEQAVADINA -0.63

2LZM 61-79 19 DEAEKLFNQDVDAAVRGIL -0.14

2LZM 94-113 19 RRCALINMVFQMGETGVAG 0.26

2MHR 41-64 24 APNLATLVKVTTNHFTHEEAMMDA -0.37

2MHR 19-37 19 EQLDEEHKKIFKGIFDCIR -0.39

2MIN 318-346 29 ESIQKKCEEVIAKYKPEWEAVVAKYRPRL -0.65

2PGD 178-206 29 AGHFVKMVHNGIEYGDMQLICEAYHLMKD -0.09

2PGD 392-415 24 DFFKSAVENCQDSWRRAISTGVQA -0.46

2PGD 315-348 34 KKSFLEDIRKALYASKIISYAQGFMLLRQAATEF 0.17

2REB 158-177 20 LAARMMSQAMRKLAGNLKQS -0.46

2SAS 48-69 22 DADYKSMQASLEDEWRDLKGRA -0.92

2WSY 254-272 19 QILMPALNQLEEAFVRAQK 0.14

3COX 403-424 22 QSQNQKGIDMAKKVFDKINQKE -1.64

3CPA 216-234 19 KTELNQVAKSAVEALKSLY -0.45

3ENL 107-125 19 ANAILGVSLAASRAAAAEK -0.2

3ENL 179-200 22 FAEALRIGSEVYHNLKSLTKKR -0.59

3ENL 403-422 20 SERLAKLNQLLRIEEELGDN -0.36

3GPD 315-333 19 NEFGYSERVVDLMAHMASK -0.32

3HSC 230-249 20 GEDFDNRMVNHFIAEFKRKH -0.89

3HSC 116-135 20 PEEVSSMVLTKMKEIAEAYL 0.05

3ICD 38-57 20 GVDVTPAMLKVVDAAVEKAY 0

3MDS 67-89 23 QDIQTAVRNNGGGHLNHSLFWRL -0.71

3SDH 64-82 19 DKLRGHSITLMYALQNFID 0.12

4GR1 96-120 25 WRVIKEKRDAYVSRLNAIYQNNLTK -0.47

4MDH 306-329 24 DFSREKMDLTAKELAEEKETAFEF -0.36

4TLN 159-179 21 NESGAINEAISDIFGTLVEFY 0.25

4TLN 65-88 24 SYDAPAVDAHYYAGVTYDYYKNVH -0.67

4TNC 75-105 31 FEEFLVMMVRQMKEDAKGKSEEELANCFRIF 0.25

5ABP 110-128 19 ATKIGERQGQELYKEMQKR -1.39

6TAA 357-375 19 ELYKLIASANAIRNYAISK -0.01

7AAT 271-289 19 AEEAKRVESQLKILIRPMY -0.19

7AAT 307-338 32 PELRKEWLVEVKGMADRIISMRTQLVSNLKKE -0.31

7API 23-44 22 FNKITPNLAEFAFSLYRQLAHQ 0.01

a Residues are numbered according to the PDB coordinate file.

b Mean residue hydropathy calculated with the Liu-Deber scale (Liu and Deber 1998a). See

Materials and Methods.
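
As an illustration of footnote b, the mean residue hydropathy of a helix is the arithmetic mean of the per-residue scale values over its sequence. The following is a minimal sketch under stated assumptions: the function name `mean_residue_hydropathy` and the dictionary `TOY_SCALE` are hypothetical, and the numbers in `TOY_SCALE` are placeholders chosen for the arithmetic only; they are not the published Liu-Deber values, which should be taken from Liu and Deber (1998a).

```python
from typing import Mapping


def mean_residue_hydropathy(sequence: str, scale: Mapping[str, float]) -> float:
    """Return the arithmetic mean of per-residue hydropathy values.

    `scale` maps one-letter amino acid codes to hydropathy values;
    the caller supplies it (for example, the Liu-Deber scale).
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    # Refuse to guess a value for residues the scale does not cover.
    missing = sorted(set(sequence) - set(scale))
    if missing:
        raise KeyError(f"no scale value for residues: {missing}")
    return sum(scale[res] for res in sequence) / len(sequence)


# Hypothetical placeholder values, NOT the published Liu-Deber scale.
TOY_SCALE = {"A": 1.0, "L": 2.0, "K": -2.0}

# (1.0 + 2.0 + 2.0 - 2.0) / 4 residues
h = mean_residue_hydropathy("ALLK", TOY_SCALE)
```

Passing the scale as an argument keeps the averaging step separate from the choice of hydropathy scale, so the same function could be applied to any row of the tables once the real scale values are supplied.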

Table A1.2. Database of δ-helix sequences (n = 51).

PDBID Residues a Length Sequence H b

1BGC 101-125 25 APTLDTLQLDVTDFATNIWLQMEDL 0.58

1BGC 72-92 21 LRGCLNQLHGGLFLYQGLLQA 0.49

1BGD 72-92 21 LMGCLRQLHSGLFLYQGLLQA 0.84

1BIA 235-254 20 RNTLAAMLIRELRAALELFE 0.97

1CF3 561-581 21 MTVFYAMALKISDAILEDYAS 0.91

1CPC 78-101 24 SRRMAACLRDMEIILRYVTYAIFA 1.06

1CSC 386-410 25 MNYYTVLFGVSRALGVLAQLIWSRA 0.61

1CSC 341-360 20 PMFKLVAQLYKIVPNVLLEQ 0.98

1CSC 163-193 31 RTKYWEMVYESAMDLIAKLPCVAAKIYRNLY 0.42

1DXI 296-320 25 FDGVWASAAGCMRNYLILKDRAAAF 0.47

1ECA 114-133 20 FAGAEAAWGATLDTFFGMIF 1.1

1EZM 72-91 20 PLNDAHFFGGVVFKLYRDWF 0.49

1FDH 101-121 21 ENFKLLGNVLVTVLAIHFGKE 0.46

1FHA 14-41 28 QDSEAAINRQINLELYASYVYLSMSYYF 0.43

1FIA 50-70 21 LYELVLAEVEQPLLDMVMQYT 1.19

1GLM 186-206 21 FFTIAVQHRALVEGSAFATAV 0.72

1GLM 318-338 21 FLCTLAAAEQLYDALYQWDKQ 0.67

1GPA 290-312 23 ELRLKQEYFVVAATLQDIIRRFK 0.48

1GUH 86-111 26 IKERALIDMYIEGIADLGEMILLLPV 1.11

1HCI 218-236 19 ELFFWVHHQLTARFDFERL 0.96

1HDS 99-117 19 PENFRLLGNVLVVVLARNF 0.77

1LTH 93-114 22 RLELVGATVNILKAIMPNLVKV 0.56

1MAT 120-139 20 IMGERLCRITQESLYLALRM 0.88

1MBA 126-144 19 ADAAWTKLFGLIIDALKAA 0.77

1MRR 186-205 20 LRELKKKLYLCLMSVNALEA 0.61

1OVA 26-44 19 IGAASMEFCFDVFKELKVH 0.53

1PFK 258-276 19 PYDRILASRMGAYAIDLLL 0.73

1PHH 299-317 19 LNLAASDVSTLYRLLLKAY 0.8

1PHH 328-350 23 YSAICLRRIWKAERFSWWMTSVL 1.07

1SRY 164-182 19 DLALYELALLRFAMDFMAR 1.62

1SRY 288-308 21 LEASDRAFQELLENAEEILRL 0.42

1TIS 184-202 19 LPFNIASYATLVHIVAKMC 0.81

2AAI 161-180 20 LPTLARSFIICIQMISEAAR 0.84

2ACE 168-186 19 VGLLDQRMALQWVHDNIQF 0.54

2ACH 25-44 20 GLASANVDFAFSLYKQLVLK 0.48

2ALD 160-178 19 ALAIMENANVLARYASICQ 0.65

2ATI 285-304 20 YFQQAGNGIFARQALLALVL 0.88

2BMH 251-282 32 DENIRYQIITFLIAGHETTSGLLSFALYFLVK 0.74

2LDB 107-127 21 LDLVDKNIAIFRSIVESVMAS 0.66

2MNR 98-118 21 GLIRMAAAGIDMAAWDALGKV 0.52

2TMV 114-134 21 VDDATVAIRSAINNLIVELIR 0.63

2TSI 164-184 21 FTEFSYMMLQAYDFLRLYETE 1.16

3HHR 156-183 28 LLKNYGLLYCFRKDMDKVETFLRIVQCR 0.56

3HHR 110-128 19 VYDLLKDLEEGIQTLMGRL 0.54

3INK 7-28 22 TKKTQLQLEHLLLDLQMILNGI 0.42

3PFK 258-276 19 AFDRVLASRLGARAVELLL 0.9

3TMS 173-192 20 GLPFNIASYALLVHMMAQQC 0.66

4TMS 226-244 19 VPFNIASYALLTHLVAHEC 0.59

5LDH 244-263 20 GYTNWAIGLSVADLIESMLK 0.52

8RUC 214-232 19 WRDRFLFCAEALYKAQAET 0.51

9LDB 109-130 22 RLNLVQRNVNIFKFIIPNIVKY 0.45

a Residues are numbered according to the PDB coordinate file.

Table A1.3. Database of TM helix sequences (n = 212).

PDBID Residues a Length Sequence H b

1AFO 72-96 25 EITLIIFGVMAGVIGTILLISYGIR 1.56

1FX8-1 6-35 30 LIFFGVGCVAALKVA 0.98

1FX8-2 40-60 21 GQWEISVIWGLGVAMAIYLTA 1.25

1FX8-3 68-77 10 NPAVTIALWL 1.51

1FX8-4 85-106 22 KVIPFIVSQVAGAFCAAALVYG 0.86

1FX8-5 144-169 26 NFVQAFAVEMVITAILMGLILALTDD 1.54

1FX8-6 178-194 17 LAPLLIGLLIAVIGASM 1.73

1FX8-7 203-216 14 NPARDFGPKVFAWL -0.3

1FX8-8 232-254 23 YFLVPLFGPIVGAIVGAFAYRKL 1.05

1L7V-1 2-32 30 LTLARQQQRQNIRWLLCLSVLLLALLLSLC 1.5

1L7V-2 47-81 34 RGELFVWQIRLPRTLAVLLVGAALAISGAVQALF 1.06

1L7V-3 93-107 15 VSNGAGVGLIAAVLL 0.78

1L7V-4 114-138 25 NWALGLCAIAGALIITLILLRFARR 1.58

1L7V-5 142-166 24 TSRLLLAGVALGIICSALTWAIYF 1.58

1L7V-6 191-206 15 SWLLALIPVLLWICC 2.85

1L7V-7 229-249 20 WFWRNVLVAATGWVGVSVAL 1.38

1L7V-8 258-267 10 LVIPHILRLC 1.63

1L7V-9 272-296 25 HRVLLPGCALAGASALLLADIVARL 0.81

1L7V-10 305-324 20 IGVVTATLGAPVFIWLLLKA 1.43

1LNQ-1 22-40 19 RILLLVLAVIIYGTAGFHF 1.87

1LNQ-2 46-57 12 WTVSLYWTFVTI 2.16

1LNQ-3 71-97 27 LGMYFTVTLIVLGIGTFAVAVERLLEF 1.72

1ORQ-1 26-50 25 LVELGVSYAALLSVIVVVVECTMQL 1.66

1ORQ-2 54-78 25 YLVRLYLVDLILVIILWADYAYRAY 2.23

1ORQ-3 84-92 9 AGYVKKTLY -0.27

1ORQ-4 96-112 17 ALVPAGLLALIEGHLAG 0.64

1ORQ-5 116-132 17 FRLVRLLRFLRILLIIS 2.41

1ORQ-6 139-147 9 SAIADAADK -0.86

1ORQ-7 148-173 26 IRFYHLFGAVMLTVLYGAFAIYIVEY 1.8

1ORQ-8 183-194 12 VFDALWWAVVTA 2.13

1ORQ-9 208-240 33 IGKVIGIAVMLTGISALTLLIGTVSNMFQKILV 1.07

1OTS-1 32-70 39 PLAILFMAAVVGTLVGLAAVAFDKGVAWLQNQRMGALVH 0.78

1OTS-2 75-100 26 YPLLLTVAFLCSAVLAMFGYFLVRKY 1.73

1OTS-3 109-117 9 IPEIEGALE 0.12

1OTS-4 124-140 17 WWRVLPVKFFGGLGTLG 0.77

1OTS-5 148-165 18 EGPTVQIGGNIGRMVLDI -0.29

1OTS-6 172-189 18 EARHTLLATGAAAGLAAA -0.11

1OTS-7 193-200 8 PLAGILFI 1.91

1OTS-8 215-230 16 IKAVFIGVIMSTIMYR 1.39

1OTS-9 250-282 33 NTLWLYLILGIIFGIFGPIFNKWVLGMQDLLHR 1.33

1OTS-10 288-305 18 ITKWVLMGGAIGGLCGLL 1.06

1OTS-11 321-325 5 PIATA -0.25

1OTS-12 330-349 20 MGMLVFIFVARVITTLLCFS 2.26

1OTS-13 357-378 22 FAPMLALGTVLGTAFGMVAVEL 1.22

1OTS-14 387-401 15 GTFAIAGMGALLAAS 0.61

1OTS-15 405-416 12 PLTGIILVLEMT 1.46

1OTS-16 423-438 16 LPMIITGLGATLLAQF 1.25

1PW4-1 37-57 21 GYAAYYLVRKNFALAMPYLVE 0.76

1PW4-2 64-84 21 DLGFALSGISIAYGFSKFIMG 0.67

1PW4-3 94-112 19 VFLPAGLILAAAVMLFMGF 2.11

1PW4-4 121-141 21 AVMFVLLFLCGWFQGMGWPPC 1.63

1PW4-5 160-180 21 VWNCAHNVGGGIPPLLFLLGM 0.47

1PW4-6 190-207 18 LYMPAFCAILVALFAFAM 2.42

1PW4-7 264-282 19 FVYLLRYGILDWSPTYLKE 0.97

1PW4-8 288-309 22 LDKSSWAYFLYEYAGIPGTLLC 0.68

1PW4-9 322-341 20 GATGVFFMTLVTIATIVYWM 1.77

1PW4-10 347-367 21 PTVDMICMIVIGFLIYGPVML 1.68

1PW4-11 389-409 21 GLFGYLGGSVAASAIVGYTVD 0.32

1PW4-12 415-434 20 GGFMVMIGGSILAVILLIVV 1.98

1RH5_A-1 23-42 20 FKEKLKWTGIVLVLYFIMGC 1.19

1RH5_A-2 57-91 23 EFWQTITGPIVTAGIIMQLLVGS 0.81

1RH5_A-3 105-129 25 ALFQGCQKLLSIIMCFVEAVLFVGA 1.57

1RH5_A-4 138-162 25 LLAFLVIIQIAFGSIILIYLDEIVS 2.29

1RH5_A-5 169-187 19 GIGLFIAAGVSQTIFVGAL 1.02

1RH5_A-6 211-228 18 APIIGTIIVFLMVVYAEC 1.87

1RH5_A-7 257-276 20 IPVILAAALFANIQLWGLAL 1.8

1RH5_A-8 314-336 23 IHAIVYMIAMIITCVMFGIFWVE 2.37

1RH5_A-9 377-395 19 LTVMSSAFVGFLATIANFI 1.48

1RH5_A-10 401-415 15 GTGVLLTVSIVYRMY 1.06

1RH5_B 32-62 31 EYLAVAKVTALGISLLGIIGYIIHVPATYIK 0.82

1RH5_C 30-49 20 PEHVIGVTVAFVIIEAILTY 1.19

1XL4-1 46-68 23 WPVFITLITGLYLVTNALFALAY 1.86

1XL4-2 108-135 28 LANTLVTLEALCGMLGLAVAASLIYARF 1.35

YCE-1 5-38 34 FAKTVVLAASAVGAGTAMIAGIGPGVGQGYAAGK -0.35

YCE-2 55-80 26 STMVLGQAVAESTGIYSLVIALILLY 1.24

1YEW_A-1 185-207 23 EGNTYFWHAFWFAIGVAWIGYWS 1.18

1YEW_A-2 233-257 25 RKVAMGFLAATILIVVMAMSSANSK 0.54

1YEW_B-1 11-43 33 HAEAVQVSRTIDWMALFVVFFVIVGSYHIHAML 1.1

1YEW_B-2 58-82 25 RLWVTVTPIVLVTFPAAVQSYLWER 1.01

1YEW_B-3 88-111 24 GATVCVLGLLLGEWINRYFNFWGW 1.36

1YEW_B-4 126-136 11 PGAIILDTVLM 1.18

1YEW_B-5 141-165 25 YLFTAIVGAMGWGLIFYPGNWPIIA 1.19

1YEW_B-6 200-212 13 VEKGTLRTFGKDV -0.75

1YEW_B-7 229-242 14 FMWHFIGRWFSNER 0.77

1YEW_C-1 51-71 21 LTFALAIYTVFYLWVRWYEGV 2.1

1YEW_C-2 84-113 30 EFETYWMNFLYTEIVLEIVTASILWGYLWK 1.61

1YEW_C-3 132-160 29 FTHLVWLVAYAWAIYWGASYFTEQDGTWH 0.95

1YEW_C-4 172-197 26 SHIIEFYLSYPIYIITGFAAFIYAKT 1.06

1YEW_C-5 244-256 13 LHYGFVIFGWLAL 2.09

1Z98-1 35-64 30 WSFWRAAIAEFIATLLFLYITVATVIGHSK 1.32

1Z98-2 74-90 17 LLGIAWAFGGMIFVLVY 2.33

1Z98-3 102-111 10 PAVTFGLFLA 1.36

1Z98-4 116-139 24 LLRALVYMIAQCLGAICGVGLVKA 1.34

1Z98-5 161-180 20 KGTALGAEIIGTFVLVYTVF 1

1Z98-6 200-215 16 LPIGFAVFMVHLATIP 1.19

1Z98-7 223-232 10 PARSFGAAVI -0.09

1Z98-8 237-262 26 KVWDDQWIFWVGPFIGAAVAAAYHQY 0.6

1ZLL 24-52 29 ARQKLQNLFINFCLILICLLLICIIVMLL 2.49

2A79-1 222-242 21 FFIVETLCIIWFSFEFLVRFF 2.93

2A79-2 290-309 20 LAILRVIRLVRVFRIFKLSR 1.49

2A79-3 312-321 10 KGLQILGQTL 0.05

2A79-4 322-348 27 KASMRELGLLIFFLFIGVILFSSAVYF 1.82

2A79-5 362-372 11 PDAFWWAVVSM 1.28

2A79-6 388-419 32 KIVGSLCAIAGVLTIALPVPVIVSNFNYFYHR 0.65

2AHY-1 22-42 21 KEFQVLFVLTILTLISGTIFY 1.75

2AHY-2 50-62 13 PIDALYFSVVTLT 1.13

2AHY-3 75-97 23 FGKIFTILYIFIGIGLVFGFIHK 1.61

2B2F-1 2-28 27 SDGNVAWILASTALVMLMVPGVGFFYA 1

2B2F-2 33-64 32 RKNAVNMIALSFISLIITVLLWIFYGYSVSFG 1.35

2B2F-3 86-110 25 DLLFMMYQMMFAAVTIAILTSAIAE 1.78

2B2F-4 114-138 25 VSSFILLSALWLTFVYAPFAHWLWG 1.76

2B2F-5 151-170 20 AGGMVVHISSGFAALAVAMT 0.46

2B2F-6 182-210 29 IEPHSIPLTLIGAALLWFGWFGFNGGSAL 0.66

2B2F-7 215-241 27 VAINAVVVTNTSAAVAGFVWMVIGWIK 1.07

2B2F-8 246-260 15 SLGIVSGAIAGLAAI 0.72

2B2F-9 269-294 26 VKGAIVIGLVAGIVCYLAMDFRIKKK 0.66

2B2F-10 299-319 21 LDAWAIHGIGGLWGSVAVGIL 0.82

2B2F-11 334-366 33 NPQLLVSQLIAVASTTAYAFLVTLILAKAVDAA 0.82

2BG9_A-1 218-237 20 VIIPCLLFSFLTVLVFYLPT 2.32

2BG9_A-2 243-261 19 MTLSISVLLSLTVFLLVIV 2.47

2BG9_A-3 280-300 21 FTMIFVISSIIVTVVVINTHH 1.35

2BG9_A-4 408-427 20 HILLCVFMLICIIGTVSVFA 2.38

B2L2-1 11-46 36 GMVFAVLAMATATIFSGIGSAKGVGMTGEAAAALTT 0.31

B2L2-2 51-79 29 KFGQALILQLLPGTQGLYGFVIAFLIFIN 1.22

B2L2-3 87-122 36 VQGLNFLGASLPIAFTGLFSGIAQGKVAAAGIQILA 0.42

B2L2-4 127-155 29 HATKGIIFAAMVETYAILGFVISFLLVLN 1.38

2CFP-1 9-35 27 FWMFGLFFFFYFFIMGAYFPFFPIWLH 2.69

2CFP-2 44-68 25 DTGIIFAAISLFSLLFQPLFGLLSD 1.33

2CFP-3 74-99 26 KYLLWIITGMLVMFAPFFIFIFGPLL 2.32

2CFP-4 105-131 27 VGSIVGGIYLGFCFNAGAPAVEAFIEK 0.41

2CFP-5 140-162 23 FGRARMFGCVGWALGASIVGIMF 1.04

2CFP-6 168-188 21 FVFWLGSGCALILAVLLFFAK 2.27

2CFP-7 221-250 30 KLWFLSLYVIGVSCTYDVFDQQFANFFTSF 1.2

2CFP-8 259-278 20 RVFGYVTTMGELLNASIMFF 1.2

2CFP-9 289-308 20 KNALLLAGTIMSVRIIGSSF 0.57

2CFP-10 313-334 22 LEVVILKTLHMFEVPFLLVGCF 1.78

2CFP-11 347-373 27 ATIYLVCFCFFKQLAMIFMSVLAGNMY 1.84

2CFP-12 378-398 21 FQGAYLVLGLVALGFTLISVF 1.81

2GIF-1 10-29 20 IFAWVIAIIIMLAGGLAILK 2.3

2GIF-2 338-356 19 HEVVKTLVEAIILVFLVMY 1.84

2GIF-3 363-386 24 RATLIPTIAVPVVLLGTFAVLAAF 1.32

2GIF-4 393-419 26 LTMFGMVLAIGLLVDDAIVVVENVER 1.37

2GIF-5 433-459 27 KSMGQIQGALVGIAMVLSAVFVPMAFF 1.28

2GIF-6 467-495 29 YRQFSITIVSAMALSVLVALILTPALCAT 1.29

2GIF-7 540-558 19 RYLVLYLIIVVGMAYLFVR 2.39

2GIF-8 873-892 20 APSLYAISLIVVFLCLAALY 2.01

2GIF-9 895-918 24 WSIPFSVMLVVPLGVIGALLAATF 1.47

2GIF-10 926-950 25 YFQVGLLTTIGLSAKNAILIVEFAK 0.85

2GIF-11 960-991 32 LIEATLDAVRMRLRPILMTSLAFILGVMPLVI 1.4

2GIF-12 999-1030 32 AQNAVGTGVMGGMVTATVLAIFFVPVFFVVVR 1.02

2HAC 2-23 22 SKLCYLLDGILFIYGVILTALF 1.97

2HYD-1 12-39 28 YKYRIFATIIVGIIKFGIPMLIPLLIKY 1.31

2HYD-2 57-90 34 HHLTIAIGIALFIFVIVRPPIEFIRQYLAQWTSN 0.88

2HYD-3 132-158 27 KDFILTGLMNIWLDCITIIIALSIMFF 2.1

2HYD-4 162-191 30 KLTLAALFIFPFYILTVYVFFGRLRKLTRE 1.38

2HYD-5 235-268 34 TRALKHTRWNAYSFAAINTVTDIGPIIVIGVGAY 0.1

2HYD-6 277-312 36 VGTLAAFVGYLELLFGPLRRLVASFTTLTQSFASMD 0.79

2IC8-1 94-114 21 GPVTWVMMIACVVVFIAMQIL 2.07

2IC8-2 147-168 22 SLMHILFNLLWWWYLGGAVEKR 1.32

2IC8-3 171-192 22 SGKLIVITLISALLSGYVQQKF 0.63

2IC8-4 200-217 18 LSGVVYALMGYVWLRGER 0.88

2IC8-5 226-241 16 QRGLIIFALIWIVAGW 2.07

2IC8-6 250-269 20 ANGAHIAGLAVGLAMAFVDS 1.69

2IUB-1 291-312 22 MKVLTIIATIFMPLTFIAGIYG 1.53

2IUB-2 327-349 23 YPVVLAVMGVIAVIMVVYFKKKK 0.97

2NWL-1 15-31 17 KILIGLILGAIVGLILG 1.81

2NWL-2 36-68 33 AHAVHTYVKPFGDLFVRLLKMLVMPIVFASLVV 0.96

2NWL-3 78-108 31 LGRVGVKIVVYYLLTSAFAVTLGIIMARLFN 1.31

2NWL-4 130-168 39 LVHILLDIVPTNPFGALANGQVLPTIFFAIILGIAITYL 1.17

2NWL-5 195-218 24 YKIVNGVMQYAPIGVFALIAYVMA 1.05

2NWL-6 228-254 27 LAKVTAAVYVGLTLQILLVYFVLLKIY 1.87

2NWL-7 298-329 32 IYSFTLPLGATINMDGTALYQGVCTFFIANAL 0.81

2NWL-8 391-414 24 AILDMGRTMVNVTGDLTGTAIVAK 0.15

2OAR-1 15-43 29 VDLAVAVVIGTAFTALVTKFTDSIITPLI 1.08

2OAR-2 69-89 21 LNVLLSAAINFFLIAFAVYFL 2.42

2OAU-1 29-57 29 VNIVAALAIIIVGLIIARMISNAVNRLMI 1.58

2OAU-2 68-91 24 FLSALVRYGIIAFTLIAALGRVGV 1.44

2OAU-3 96-127 32 VIAVLGAAGLAVGLALQGSLSNLAAGVLLVMF 1.2

2ONK-1 2-29 28 RLLFSALLALLSSIILLFVLLPVAATVT 2.06

2ONK-2 48-78 31 WKVVLTTYYAALISTLIAVIFGTPLAYILAR 1.42

2ONK-3 84-107 24 KSVVEGIVDLPVVIPHTVAGIALL 0.5

2ONK-4 127-152 25 LPGIVVAMLFVSVPIYINQAKEGFA 0.73

2ONK-5 183-205 23 RHIVAGAIMSWARGISEFGAVVV 0.51

2ONK-6 231-250 20 PVAAILILLSLAVFVALRII 2.28

2HG9-1 35-56 22 GVATFFFAALGIILIAWSAVLQ 1.86

2HG9-2 84-111 28 GLWQIITICATGAFVSWALREVEICRKL 1.08

2HG9-3 116-198 24 HIPFAFAFAILAYLTLVLFRPVMM 1.86

2HG9-4 171-198 28 PAHMIAISFFFTNALALALHGALVLSAA 0.97

2HG9-5 226-248 24 TLGIHRLGLLLSLSAVFFSALCMI 1.57

2HG9-6 150-162/209-219 24 IWTHLDWVSNTGY PDHEDTFFRDL -0.12

2NTU-1 12-30 22 EWIWLALGTALMGLGTLYFLVK 1.72

2NTU-2 37-62 26 PDAKKFYAITTLVPAIAFTMYLSMLL 0.91

2NTU-3 81-101 21 ARYADWLFTTPLLLLDLALLV 1.84

2NTU-4 105-127 23 QGTILALVGADGIMIGTGLVGAL 0.64

2NTU-5 131-153 23 YSYRFVWWAISTAAMLYILYVLF 2.22

2NTU-6 165-191 27 PEVASTFKVLRNVTVVLWSAYPVVWLI 0.97

2NTU-7 201-224 24 LNIETLLFMVLDVSAKVGFGLILL 1.72

2UUI-1 7-27 21 LLAAVTLLGVLLQAYFSLQVI 1.98

2UUI-2 68-88 21 WVAGIFFHEGAAALCGLVYLF 1.61

2UUI-3 115-135 21 LWLLVALAALGLLAHFLPAAL 2.09

2HJF_C-1 28-50 23 AAGAATVLLVIVLLAGSYLAVLA 1.64

2HJF_C-2 88-111 24 GRCVAVVVMVAGITSFGLVTAALA 1.08

2NQ2-1 7-24 18 PKILFGLTLLLVITAVIS 1.67

2NQ2-2 61-85 25 VRLPRILTALCVGAGLALSGVVLQG 0.71

2NQ2-3 98-113 16 GVTSGSAFGGTLAIFF 0.4

2NQ2-4 117-139 23 LYGLFTSTILFGFGTLALVFLFS 1.93

2NQ2-5 147-171 25 LLMLILIGMILSGLFSALVSLLQYI 2.37

2NQ2-6 194-212 19 WEKLLFFFVPFLLCSSILL 2.44

2NQ2-7 235-256 22 MAPLRWLVIFLSGSLVACQVAI 1.53

2NQ2-8 264-274 11 GLIIPHLSRML 0.71

2NQ2-9 279-297 18 HQSLLPCTMLVGATYMLL 0.96

2NQ2-10 312-328 17 SILTALIGAPLFGVLVY 1.52

a Residues are numbered according to the PDB coordinate file.
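Each row of the table above lists a segment name, a residue range, a length, a sequence, and a score. As a minimal sketch (not part of the original analysis), a row can be checked for internal consistency by comparing the stated length with both the sequence length and the span of the residue range. The function names `parse_row` and `check_row` are hypothetical, and the sketch assumes whitespace-separated columns with discontinuous ranges joined by `/`.

```python
def parse_row(row):
    """Split one table row into (segment, residue spans, length, sequence, score).

    Assumes whitespace-separated columns; a discontinuous segment is written
    as 'start-end/start-end' and its sequence may be split by a space.
    """
    name, ranges, length, *seq_parts, score = row.split()
    spans = [tuple(int(n) for n in part.split("-")) for part in ranges.split("/")]
    return name, spans, int(length), "".join(seq_parts), float(score)


def check_row(row):
    """Report whether the stated length agrees with the sequence and the range."""
    name, spans, length, seq, _ = parse_row(row)
    # Residue ranges are inclusive, so each span covers end - start + 1 residues.
    span_len = sum(end - start + 1 for start, end in spans)
    return {
        "segment": name,
        "length_matches_sequence": length == len(seq),
        "length_matches_range": length == span_len,
    }


if __name__ == "__main__":
    print(check_row("1OTS-7 193-200 8 PLAGILFI 1.91"))
```

A row that fails the range check while passing the sequence check most likely has a mistyped residue range rather than a wrong sequence, but that would need to be confirmed against the PDB coordinate file.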

Table A1.4. Oligonucleotides used in this work.

pGEM constructs Sequence (5′–3′)

pGEM constructs

LepH3-F ACCGGGdUGGGgtaccAGGGCAAC

LepH3-R ACCTGGdUCCACCACTAGTCTCGGAAAG

LepH2-F ACCGGGdUGGGgtaccgatc

LepH2-R ACCTGGdUCCACCACTAGTCTTCGGCGCAAC

LepH1-F Same as LepH2-F

LepH1-R ACCTGGdUCCACCACTAGTCTGTGCGTTGATCGGTTG

pGEM-F AGCCATCTdUCGTTCACGTTTGC

pGEM-R ATGGTGGCdUCTAGAGTCGACCTG

δ-Helix-encoding oligonucleotides

1ECA-F ACCAGGdUGACTTCGCTGGAGCTGAAGCAGCCTGGGGTGCAACTCTTGACACTTTCT

1ECA-R ACCCGGdUCCCATCTTTGAGAAGATCATTCCGAAGAAAGTGTCAAGAGTTGCAC

1MBA-F ACCAGGdUCCCGCCGGCGCCGACGCTGCATGGACCAAGCTCTTCGGACTCATC

1MBA-R ACCCGGdUCCTTTGCCGGCGGCTTTGAGGGCATCGATGATGAGTCCGAAGAGCTTG

1SRY-F ACCAGGdUGCACTGAAAGGCGACCTGGCACTGTACGAATTAGCACTGTTACGTTTCGCTATGGAC

1SRY-R ACCCGGdUCCCGGCAAGGTCATCGGCAAAAAGCCACGACGAGCCATGAAGTCCATAGCGAAACGTAACAG

2BMH-F ACCAGGdUCCGCTTGATGACGAGAACATTCGCTATCAAATTATTACATTCTTAATTGCGGGACACGAAAC

2BMH-R ACCCGGdUCCTACATGTGGATTTTTCACTAAGAAATACAGCGCAAATGATAAAAGACCACTTGTTGTTTCGTGTCCCGCA

1HDS-F ACCAGGdUGTTGATCCTGAAAATTTTCGTCTTCTTGGTAATGTTCTTGTTGTTGTTCTTG

1HDS-R ACCCGGdUCCAGTAAATTCACCACCAAAATTACGAGCAAGAACAACAACAAGAACATTAC

2AAI-F ACCAGGdUACTCAGCTTCCAACTCTGGCTCGTTCCTTTATAATTTGCATCCAAATGATTTCAGAAGCAGC

2AAI-R ACCCGGdUCCGCGCATTTCTCCCTCAATATATTGGAATCTTGCTGCTTCTGAAATCATTTGGATGC

Site-directed mutagenesis

1ECA-E6V TCGCTGGAGCTGTAGCAGCCTGGGG

1ECA-G10V TGAAGCAGCCTGGGTTGCAACTCTTGACAC

1ECA-D14V CTGGGGTGCAACTCTTGTCACTTTCTTCGGAATGA

2AAI-R8C TCAGCTTCCAACTCTGGCTTGTTCCTTTATAATTTGCATC

2AAI-E19V AATTTGCATCCAAATGATTTCAGTAGCAGCAAGATTCCAATATATTG

2AAI-R22I CCAAATGATTTCAGAAGCAGCAATATTCCAATATATTGAGGGAGAAA

2BMH-E20V CATTCTTAATTGCGGGACACGTAACAACAAGTGGTCTTTTATC

2AAI-E19V-R22I CCAAATGATTTCAGTAGCAGCAATATTCCAATATATTGAGGGAGAAA

2BMH-E20V-H19L TTATTACATTCTTAATTGCGGGACTCGTAACAACAAGTGGTCTTTTATCATT

2BMH-P378S-F CGTCCAGAGCGTTTTGAAAATTCAAGTGCGATTCC

2BMH-P378S-R GGAATCGCACTTGAATTTTCAAAACGCTCTGGACG

Full-length and truncated 2BMH

pGEM-2BMH-F AGCCACCAdUGACAATTAAAGAAATGCCTCAGCCAAAAAC

pGEMgly2BMH-F AGCCACCAdUGACAATTAACTCCACAAAAGAAATGCCTCAGCCAAAAAC

pGEM-2BMH-R AAGATGGCdUACCCAGCCCACACGTCTTTTG

2BMHD1-226-F AGCCACCAdUGACAATTGGTGAACAAAGCGATGATTTATTAAC

gly2BMHD1-226-F AGCCACCAdUGACAATTAACTCCACAGGTGAACAAAGCGATGATTTATTAAC

2BMHD1-240-F AGCCACCAdUGACAATTAAAGATCCAGAAACGGGTGAG

gly2BMHD1-240-F AGCCACCAdUGACAATTAACTCCACAAAAGATCCAGAAACGGGTGAG

Appendix 2: Commonly used media for bacterial expression.

TB medium

0.4% (v/v) glycerol

2.4% (w/v) Bacto yeast extract

1.2% (w/v) Bacto tryptone

17 mM KH2PO4

72 mM K2HPO4

LB medium

1% (w/v) Tryptone

0.5% (w/v) Yeast Extract

0.5% (w/v) NaCl

0.01% (v/v) 1N NaOH

M9 Media (5x)

3% (w/v) Na2HPO4

1.5% (w/v) KH2PO4

0.5% (w/v) NH4Cl

0.25% (w/v) NaCl

0.0015% (w/v) CaCl2 (optional)

For M9 Minimal media (1x) supplement with:

1 mM MgSO4

mM CaCl2

0.04 mM Biotin

0.03 mM Vitamin B

0.003% (w/v) Glucose

For M9 Rich media (1x) supplement with:

1 mM MgSO4

mM CaCl2

5.93 µM Vitamin B

0.004% (w/v) Glucose

0.005% (w/v) Casein Enzymatic Hydrolysate
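The recipes above give concentrations as % (w/v) or as molarity. As a minimal sketch of the arithmetic for scaling them to a working volume, the helpers below convert each form to grams. The function names are hypothetical, and the molar masses for KH2PO4 (136.09 g/mol) and K2HPO4 (174.18 g/mol) are standard values assumed here, not taken from the text.

```python
def grams_from_percent_wv(percent, volume_ml):
    """Grams of solute for a % (w/v) concentration: 1% (w/v) = 1 g per 100 mL."""
    return percent / 100.0 * volume_ml


def grams_from_millimolar(conc_mm, molar_mass, volume_ml):
    """Grams of solute for a millimolar concentration.

    conc_mm    -- concentration in mM (mmol/L)
    molar_mass -- molar mass in g/mol (assumed value, supplied by the caller)
    volume_ml  -- final volume in mL
    """
    moles = conc_mm / 1000.0 * (volume_ml / 1000.0)
    return moles * molar_mass


if __name__ == "__main__":
    volume_ml = 1000.0  # one litre of TB medium
    # Assumed molar masses (g/mol); not stated in the recipe itself.
    kh2po4_mw, k2hpo4_mw = 136.09, 174.18
    print("Yeast extract (g):", grams_from_percent_wv(2.4, volume_ml))
    print("Tryptone (g):", grams_from_percent_wv(1.2, volume_ml))
    print("KH2PO4 (g):", round(grams_from_millimolar(17, kh2po4_mw, volume_ml), 2))
    print("K2HPO4 (g):", round(grams_from_millimolar(72, k2hpo4_mw, volume_ml), 2))
```

The same two functions apply to the other recipes. Liquid components given as % (v/v), such as glycerol, scale by volume in the same way as the % (w/v) case, with the result read in millilitres instead of grams.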

