harpers biochem



Chapter 1. Biochemistry & Medicine

Robert K. Murray, MD, PhD

INTRODUCTION

Biochemistry can be defined as the science concerned with the chemical basis of life (Gk bios “life”). The cell is the structural unit of living systems. Thus, biochemistry can also be described as the science concerned with the chemical constituents of living cells and with the reactions and processes they undergo. By this definition, biochemistry encompasses large areas of cell biology, of molecular biology, and of molecular genetics.

The Aim of Biochemistry Is to Describe & Explain, in Molecular Terms, All Chemical Processes of Living Cells

The major objective of biochemistry is the complete understanding, at the molecular level, of all of the chemical processes associated with living cells. To achieve this objective, biochemists have sought to isolate the numerous molecules found in cells, determine their structures, and analyze how they function. Many techniques have been used for these purposes; some of them are summarized in Table 1–1.

A Knowledge of Biochemistry Is Essential to All Life Sciences

The biochemistry of the nucleic acids lies at the heart of genetics; in turn, the use of genetic approaches has been critical for elucidating many areas of biochemistry. Physiology, the study of body function, overlaps with biochemistry almost completely. Immunology employs numerous biochemical techniques, and many immunologic approaches have found wide use by biochemists. Pharmacology and pharmacy rest on a sound knowledge of biochemistry and physiology; in particular, most drugs are metabolized by enzyme-catalyzed reactions. Poisons act on biochemical reactions or processes; this is the subject matter of toxicology. Biochemical approaches are being used increasingly to study basic aspects of pathology (the study of disease), such as inflammation, cell injury, and cancer. Many workers in microbiology, zoology, and botany employ biochemical approaches almost exclusively. These relationships are not surprising, because life as we know it depends on biochemical reactions and processes. In fact, the old barriers among the life sciences are breaking down, and biochemistry is increasingly becoming their common language.

A Reciprocal Relationship Between Biochemistry & Medicine Has Stimulated Mutual Advances

The two major concerns for workers in the health sciences—and particularly physicians—are the understanding and maintenance of health and the understanding and effective treatment of diseases. Biochemistry impacts enormously on both of these fundamental concerns of medicine. In fact, the interrelationship of biochemistry and medicine is a wide, two-way street. Biochemical studies have illuminated many aspects of health and disease, and conversely, the study of various aspects of health and disease has opened up new areas of biochemistry. Some examples of this two-way street are shown in Figure 1–1. For instance, a knowledge of protein structure and function was necessary to elucidate the single biochemical difference between normal hemoglobin and sickle cell hemoglobin. On the other hand, analysis of sickle cell hemoglobin has contributed significantly to our understanding of the structure and function of both normal hemoglobin and other proteins. Analogous examples of reciprocal benefit between biochemistry and medicine could be cited for the other paired items shown in Figure 1–1.

Another example is the pioneering work of Archibald Garrod, a physician in England during the early 1900s. He studied patients with a number of relatively rare disorders (alkaptonuria, albinism, cystinuria, and pentosuria; these are described in later chapters) and established that these conditions were genetically determined. Garrod designated these conditions as inborn errors of metabolism. His insights provided a major foundation for the development of the field of human biochemical genetics. More recent efforts to understand the basis of the genetic disease known as familial hypercholesterolemia, which results in severe atherosclerosis at an early age, have led to dramatic progress in understanding of cell receptors and of mechanisms of uptake of cholesterol into cells. Studies of oncogenes in cancer cells have directed attention to the molecular mechanisms involved in the control of normal cell growth. These and many other examples emphasize how the study of disease can open up areas of cell function for basic biochemical research.

The relationship between medicine and biochemistry has important implications for the former. As long as medical treatment is firmly grounded in a knowledge of biochemistry and other basic sciences, the practice of medicine will have a rational basis that can be adapted to accommodate new knowledge. This contrasts with unorthodox health cults and at least some “alternative medicine” practices, which are often founded on little more than myth and wishful thinking and generally lack any intellectual basis.

NORMAL BIOCHEMICAL PROCESSES ARE THE BASIS OF HEALTH

The World Health Organization (WHO) defines health as a state of “complete physical, mental and social well-being and not merely the absence of disease or infirmity.” From a strictly biochemical viewpoint, health may be considered that situation in which all of the many thousands of intra- and extracellular reactions that occur in the body are proceeding at rates commensurate with the organism’s maximal survival in the physiologic state. However, this is an extremely reductionist view, and it should be apparent that caring for the health of patients requires a wide knowledge not only of biologic principles but also of psychologic and social principles.

Biochemical Research Has Impact on Nutrition & Preventive Medicine

One major prerequisite for the maintenance of health is that there be optimal dietary intake of a number of chemicals; the chief of these are vitamins, certain amino acids, certain fatty acids, various minerals, and water. Because much of the subject matter of both biochemistry and nutrition is concerned with the study of various aspects of these chemicals, there is a close relationship between these two sciences. Moreover, more emphasis is being placed on systematic attempts to maintain health and forestall disease, ie, on preventive medicine. Thus, nutritional approaches to—for example—the prevention of atherosclerosis and cancer are receiving increased emphasis. Understanding nutrition depends to a great extent on a knowledge of biochemistry.

Most & Perhaps All Disease Has a Biochemical Basis

We believe that most if not all diseases are manifestations of abnormalities of molecules, chemical reactions, or biochemical processes. The major factors responsible for causing diseases in animals and humans are listed in Table 1–2. All of them affect one or more critical chemical reactions or molecules in the body. Numerous examples of the biochemical bases of diseases will be encountered in this text; the majority of them are due to causes 5, 7, and 8. In most of these conditions, biochemical studies contribute to both the diagnosis and treatment. Some major uses of biochemical investigations and of laboratory tests in relation to diseases are summarized in Table 1–3.

Additional examples of many of these uses are presented in various sections of this text.

Table 1–1. The principal methods and preparations used in biochemical laboratories.

Methods for Separating and Purifying Biomolecules¹
  Salt fractionation (eg, precipitation of proteins with ammonium sulfate)
  Chromatography: Paper; ion exchange; affinity; thin-layer; gas-liquid; high-pressure liquid; gel filtration
  Electrophoresis: Paper; high-voltage; agarose; cellulose acetate; starch gel; polyacrylamide gel; SDS-polyacrylamide gel
  Ultracentrifugation

Methods for Determining Biomolecular Structures
  Elemental analysis
  UV, visible, infrared, and NMR spectroscopy
  Use of acid or alkaline hydrolysis to degrade the biomolecule under study into its basic constituents
  Use of a battery of enzymes of known specificity to degrade the biomolecule under study (eg, proteases, nucleases, glycosidases)
  Mass spectrometry
  Specific sequencing methods (eg, for proteins and nucleic acids)
  X-ray crystallography

Preparations for Studying Biochemical Processes
  Whole animal (includes transgenic animals and animals with gene knockouts)
  Isolated perfused organ
  Tissue slice
  Whole cells
  Homogenate
  Isolated cell organelles
  Subfractionation of organelles
  Purified metabolites and enzymes
  Isolated genes (including polymerase chain reaction and site-directed mutagenesis)

¹Most of these methods are suitable for analyzing the components present in cell homogenates and other biochemical preparations. The sequential use of several techniques will generally permit purification of most biomolecules. The reader is referred to texts on methods of biochemical research for details.


[Figure 1–1: diagram pairing biochemical molecules (BIOCHEMISTRY, top) with diseases (MEDICINE, bottom): nucleic acids and genetic diseases; proteins and sickle cell anemia; lipids and atherosclerosis; carbohydrates and diabetes mellitus.]

Figure 1–1. Examples of the two-way street connecting biochemistry and medicine. Knowledge of the biochemical molecules shown in the top part of the diagram has clarified our understanding of the diseases shown in the bottom half—and conversely, analyses of the diseases shown below have cast light on many areas of biochemistry. Note that sickle cell anemia is a genetic disease and that both atherosclerosis and diabetes mellitus have genetic components.

Table 1–2. The major causes of diseases. All of the causes listed act by influencing the various biochemical mechanisms in the cell or in the body.¹

1. Physical agents: Mechanical trauma, extremes of temperature, sudden changes in atmospheric pressure, radiation, electric shock.
2. Chemical agents, including drugs: Certain toxic compounds, therapeutic drugs, etc.
3. Biologic agents: Viruses, bacteria, fungi, higher forms of parasites.
4. Oxygen lack: Loss of blood supply, depletion of the oxygen-carrying capacity of the blood, poisoning of the oxidative enzymes.
5. Genetic disorders: Congenital, molecular.
6. Immunologic reactions: Anaphylaxis, autoimmune disease.
7. Nutritional imbalances: Deficiencies, excesses.
8. Endocrine imbalances: Hormonal deficiencies, excesses.

¹Adapted, with permission, from Robbins SL, Cotran RS, Kumar V: The Pathologic Basis of Disease, 3rd ed. Saunders, 1984.

Table 1–3. Some uses of biochemical investigations and laboratory tests in relation to diseases.

1. Use: To reveal the fundamental causes and mechanisms of diseases.
   Example: Demonstration of the nature of the genetic defects in cystic fibrosis.
2. Use: To suggest rational treatments of diseases based on (1) above.
   Example: A diet low in phenylalanine for treatment of phenylketonuria.
3. Use: To assist in the diagnosis of specific diseases.
   Example: Use of the plasma enzyme creatine kinase MB (CK-MB) in the diagnosis of myocardial infarction.
4. Use: To act as screening tests for the early diagnosis of certain diseases.
   Example: Use of measurement of blood thyroxine or thyroid-stimulating hormone (TSH) in the neonatal diagnosis of congenital hypothyroidism.
5. Use: To assist in monitoring the progress (eg, recovery, worsening, remission, or relapse) of certain diseases.
   Example: Use of the plasma enzyme alanine aminotransferase (ALT) in monitoring the progress of infectious hepatitis.
6. Use: To assist in assessing the response of diseases to therapy.
   Example: Use of measurement of blood carcinoembryonic antigen (CEA) in certain patients who have been treated for cancer of the colon.

Impact of the Human Genome Project (HGP) on Biochemistry & Medicine

Remarkable progress was made in the late 1990s in sequencing the human genome. This culminated in July 2000, when leaders of the two groups involved in this effort (the International Human Genome Sequencing Consortium and Celera Genomics, a private company) announced that over 90% of the genome had been sequenced. Draft versions of the sequence were published in early 2001. It is anticipated that the entire sequence will be completed by 2003. The implications of this work for biochemistry, all of biology, and for medicine are tremendous, and only a few points are mentioned here. Many previously unknown genes have been revealed; their protein products await characterization. New light has been thrown on human evolution, and procedures for tracking disease genes have been greatly refined. The results are having major effects on areas such as proteomics, bioinformatics, biotechnology, and pharmacogenomics. Reference to the human genome will be made in various sections of this text. The Human Genome Project is discussed in more detail in Chapter 54.

SUMMARY

• Biochemistry is the science concerned with studying the various molecules that occur in living cells and organisms and with their chemical reactions. Because life depends on biochemical reactions, biochemistry has become the basic language of all biologic sciences.

• Biochemistry is concerned with the entire spectrum of life forms, from relatively simple viruses and bacteria to complex human beings.

• Biochemistry and medicine are intimately related. Health depends on a harmonious balance of biochemical reactions occurring in the body, and disease reflects abnormalities in biomolecules, biochemical reactions, or biochemical processes.

• Advances in biochemical knowledge have illuminated many areas of medicine. Conversely, the study of diseases has often revealed previously unsuspected aspects of biochemistry. The determination of the sequence of the human genome, nearly complete, will have a great impact on all areas of biology, including biochemistry, bioinformatics, and biotechnology.

• Biochemical approaches are often fundamental in illuminating the causes of diseases and in designing appropriate therapies.

• The judicious use of various biochemical laboratory tests is an integral component of diagnosis and monitoring of treatment.

• A sound knowledge of biochemistry and of other related basic disciplines is essential for the rational practice of medical and related health sciences.

REFERENCES

Fruton JS: Proteins, Enzymes, Genes: The Interplay of Chemistry and Biology. Yale Univ Press, 1999. (Provides the historical background for much of today’s biochemical research.)

Garrod AE: Inborn errors of metabolism. (Croonian Lectures.) Lancet 1908;2:1, 73, 142, 214.

International Human Genome Sequencing Consortium: Initial sequencing and analysis of the human genome. Nature 2001;409:860. (The issue [15 February] consists of articles dedicated to analyses of the human genome.)

Kornberg A: Basic research: The lifeline of medicine. FASEB J 1992;6:3143.

Kornberg A: Centenary of the birth of modern biochemistry. FASEB J 1997;11:1209.

McKusick VA: Mendelian Inheritance in Man. Catalogs of Human Genes and Genetic Disorders, 12th ed. Johns Hopkins Univ Press, 1998. [Abbreviated MIM]

Online Mendelian Inheritance in Man (OMIM): Center for Medical Genetics, Johns Hopkins University and National Center for Biotechnology Information, National Library of Medicine, 1997. http://www.ncbi.nlm.nih.gov/omim/

(The numbers assigned to the entries in MIM and OMIM will be cited in selected chapters of this work. Consulting this extensive collection of diseases and other relevant entries—specific proteins, enzymes, etc—will greatly expand the reader’s knowledge and understanding of various topics referred to and discussed in this text. The online version is updated almost daily.)

Scriver CR et al (editors): The Metabolic and Molecular Bases of Inherited Disease, 8th ed. McGraw-Hill, 2001.

Venter JC et al: The Sequence of the Human Genome. Science 2001;291:1304. (The issue [16 February] contains the Celera draft version and other articles dedicated to analyses of the human genome.)

Williams DL, Marks V: Scientific Foundations of Biochemistry in Clinical Practice, 2nd ed. Butterworth-Heinemann, 1994.


Chapter 2. Water & pH

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

Water is the predominant chemical component of living organisms. Its unique physical properties, which include the ability to solvate a wide range of organic and inorganic molecules, derive from water’s dipolar structure and exceptional capacity for forming hydrogen bonds. The manner in which water interacts with a solvated biomolecule influences the structure of each. An excellent nucleophile, water is a reactant or product in many metabolic reactions. Water has a slight propensity to dissociate into hydroxide ions and protons. The acidity of aqueous solutions is generally reported using the logarithmic pH scale. Bicarbonate and other buffers normally maintain the pH of extracellular fluid between 7.35 and 7.45. Suspected disturbances of acid-base balance are verified by measuring the pH of arterial blood and the CO2 content of venous blood. Causes of acidosis (blood pH < 7.35) include diabetic ketosis and lactic acidosis. Alkalosis (pH > 7.45) may, for example, follow vomiting of acidic gastric contents. Regulation of water balance depends upon hypothalamic mechanisms that control thirst, on antidiuretic hormone (ADH), on retention or excretion of water by the kidneys, and on evaporative loss. Nephrogenic diabetes insipidus, which involves the inability to concentrate urine or adjust to subtle changes in extracellular fluid osmolarity, results from the unresponsiveness of renal tubular osmoreceptors to ADH.
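The pH thresholds quoted above can be illustrated with a minimal Python sketch. It assumes the standard definition pH = −log10[H+]; the function names and the sample hydrogen ion concentrations are hypothetical, chosen only for illustration, and the sketch is not a clinical tool.

```python
import math

# Reference limits quoted in the text for extracellular fluid and blood.
PH_LOW, PH_HIGH = 7.35, 7.45


def ph_from_hydrogen_ion(conc_mol_per_l: float) -> float:
    """Return pH, the negative base-10 logarithm of [H+] in mol/L."""
    if conc_mol_per_l <= 0:
        raise ValueError("hydrogen ion concentration must be positive")
    return -math.log10(conc_mol_per_l)


def classify_blood_ph(ph: float) -> str:
    """Label a blood pH using the thresholds given in the text."""
    if ph < PH_LOW:
        return "acidosis"
    if ph > PH_HIGH:
        return "alkalosis"
    return "normal"


if __name__ == "__main__":
    # Hypothetical hydrogen ion concentrations, for illustration only.
    for conc in (6.3e-8, 4.0e-8, 3.0e-8):
        ph = ph_from_hydrogen_ion(conc)
        print(f"[H+] = {conc:.1e} mol/L -> pH {ph:.2f} ({classify_blood_ph(ph)})")
```

Because the scale is logarithmic, the narrow normal range of 0.1 pH unit corresponds to a change of roughly 25% in hydrogen ion concentration.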

WATER IS AN IDEAL BIOLOGIC SOLVENT

Water Molecules Form Dipoles

A water molecule is an irregular, slightly skewed tetrahedron with oxygen at its center (Figure 2–1). The two hydrogens and the unshared electrons of the remaining two sp3-hybridized orbitals occupy the corners of the tetrahedron. The 105-degree angle between the hydrogens differs slightly from the ideal tetrahedral angle, 109.5 degrees. Ammonia is also tetrahedral, with a 107-degree angle between its hydrogens. Water is a dipole, a molecule with electrical charge distributed asymmetrically about its structure. The strongly electronegative oxygen atom pulls electrons away from the hydrogen nuclei, leaving them with a partial positive charge, while its two unshared electron pairs constitute a region of local negative charge.

Water, a strong dipole, has a high dielectric constant. As described quantitatively by Coulomb’s law, the strength of interaction F between oppositely charged particles is inversely proportionate to the dielectric constant ε of the surrounding medium. The dielectric constant for a vacuum is unity; for hexane it is 1.9; for ethanol it is 24.3; and for water it is 78.5. Water therefore greatly decreases the force of attraction between charged and polar species relative to water-free environments with lower dielectric constants. Its strong dipole and high dielectric constant enable water to dissolve large quantities of charged compounds such as salts.
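The inverse dependence on the dielectric constant can be made concrete with a short Python sketch. It assumes Coulomb's law in the form F = q1·q2 / (ε·r²), with ε taken relative to a vacuum, and compares the same pair of charges at the same separation in each medium; the function name is hypothetical.

```python
# Dielectric constants quoted in the text (relative to a vacuum).
DIELECTRIC = {"vacuum": 1.0, "hexane": 1.9, "ethanol": 24.3, "water": 78.5}


def relative_force(medium: str) -> float:
    """Force between two fixed charges in `medium`, relative to a vacuum.

    With F = q1*q2 / (eps * r**2), the charges and distance cancel in
    the ratio, leaving F_medium / F_vacuum = 1 / eps.
    """
    return DIELECTRIC["vacuum"] / DIELECTRIC[medium]


if __name__ == "__main__":
    for name in DIELECTRIC:
        print(f"{name:8s} force relative to vacuum: {relative_force(name):.4f}")
```

On these figures the attraction between two ions in water is about 1/78.5 of its value in a vacuum, which is consistent with the ability of water to keep the ions of a salt apart in solution.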

Water Molecules Form Hydrogen Bonds

An unshielded hydrogen nucleus covalently bound to an electron-withdrawing oxygen or nitrogen atom can interact with an unshared electron pair on another oxygen or nitrogen atom to form a hydrogen bond. Since water molecules contain both of these features, hydrogen bonding favors the self-association of water molecules into ordered arrays (Figure 2–2). Hydrogen bonding profoundly influences the physical properties of water and accounts for its exceptionally high viscosity, surface tension, and boiling point. On average, each molecule in liquid water associates through hydrogen bonds with 3.5 others. These bonds are both relatively weak and transient, with a half-life of about one microsecond. Rupture of a hydrogen bond in liquid water requires only about 4.5 kcal/mol, less than 5% of the energy required to rupture a covalent O–H bond.
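The "less than 5%" comparison can be checked against the single-bond energies listed in Table 2–1. The sketch below is a simple arithmetic illustration in Python; the dictionary and function names are hypothetical.

```python
# Energy to rupture a hydrogen bond in liquid water, as quoted in the text.
H_BOND_KCAL_PER_MOL = 4.5

# Single-bond energies (kcal/mol) as listed in Table 2-1.
COVALENT_KCAL_PER_MOL = {
    "O-O": 34, "S-S": 51, "C-N": 70, "S-H": 81, "C-C": 82,
    "C-O": 84, "N-H": 94, "C-H": 99, "O-H": 110,
}


def percent_of_covalent(bond: str) -> float:
    """Hydrogen bond energy as a percentage of the named covalent bond."""
    return 100.0 * H_BOND_KCAL_PER_MOL / COVALENT_KCAL_PER_MOL[bond]


if __name__ == "__main__":
    for bond in COVALENT_KCAL_PER_MOL:
        print(f"{bond}: hydrogen bond is {percent_of_covalent(bond):.1f}% as strong")
```

For the O–H bond the ratio is 4.5/110, or about 4.1%, which agrees with the figure given in the text.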

Hydrogen bonding enables water to dissolve many organic biomolecules that contain functional groups which can participate in hydrogen bonding. The oxygen atoms of aldehydes, ketones, and amides provide pairs of electrons that can serve as hydrogen acceptors. Alcohols and amines can serve both as hydrogen acceptors and as donors of unshielded hydrogen atoms for formation of hydrogen bonds (Figure 2–3).


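[Figure 2–1: tetrahedral water molecule, showing the 105° angle between the two hydrogens and the two unshared electron pairs (2e each) at the remaining corners.]

Figure 2–1. The water molecule has tetrahedral geometry.

[Figure 2–2: structural diagrams of hydrogen-bonded water molecules.]

Figure 2–2. Left: Association of two dipolar water molecules by a hydrogen bond (dotted line). Right: Hydrogen-bonded cluster of four water molecules. Note that water can serve simultaneously both as a hydrogen donor and as a hydrogen acceptor.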

[Figure 2–3: structural diagrams of hydrogen bonds involving ethanol, water, and peptide groups.]

Figure 2–3. Additional polar groups participate in hydrogen bonding. Shown are hydrogen bonds formed between an alcohol and water, between two molecules of ethanol, and between the peptide carbonyl oxygen and the peptide nitrogen hydrogen of an adjacent amino acid.

Table 2–1. Bond energies for atoms of biologic significance.

Bond Type   Energy (kcal/mol)   Bond Type   Energy (kcal/mol)
O–O          34                 O=O          96
S–S          51                 C–H          99
C–N          70                 C=S         108
S–H          81                 O–H         110
C–C          82                 C=C         147
C–O          84                 C=N         147
N–H          94                 C=O         164

INTERACTION WITH WATER INFLUENCES THE STRUCTURE OF BIOMOLECULES

Covalent & Noncovalent Bonds Stabilize Biologic Molecules

The covalent bond is the strongest force that holds molecules together (Table 2–1). Noncovalent forces, while of lesser magnitude, make significant contributions to the structure, stability, and functional competence of macromolecules in living cells. These forces, which can be either attractive or repulsive, involve interactions both within the biomolecule and between it and the water that forms the principal component of the surrounding environment.

Biomolecules Fold to Position Polar & Charged Groups on Their Surfaces

Most biomolecules are amphipathic; that is, they possess regions rich in charged or polar functional groups as well as regions with hydrophobic character. Proteins tend to fold with the R-groups of amino acids with hydrophobic side chains in the interior. Amino acids with charged or polar amino acid side chains (eg, arginine, glutamate, serine) generally are present on the surface in contact with water. A similar pattern prevails in a phospholipid bilayer, where the charged head groups of phosphatidyl serine or phosphatidyl ethanolamine contact water while their hydrophobic fatty acyl side chains cluster together, excluding water. This pattern maximizes the opportunities for the formation of energetically favorable charge-dipole, dipole-dipole, and hydrogen bonding interactions between polar groups on the biomolecule and water. It also minimizes energetically unfavorable contact between water and hydrophobic groups.

Hydrophobic Interactions

Hydrophobic interaction refers to the tendency of nonpolar compounds to self-associate in an aqueous environment. This self-association is driven neither by mutual attraction nor by what are sometimes incorrectly referred to as “hydrophobic bonds.” Self-association arises from the need to minimize energetically unfavorable interactions between nonpolar groups and water.


While the hydrogens of nonpolar groups such as the methylene groups of hydrocarbons do not form hydrogen bonds, they do affect the structure of the water that surrounds them. Water molecules adjacent to a hydrophobic group are restricted in the number of orientations (degrees of freedom) that permit them to participate in the maximum number of energetically favorable hydrogen bonds. Maximal formation of multiple hydrogen bonds can be maintained only by increasing the order of the adjacent water molecules, with a corresponding decrease in entropy.

It follows from the second law of thermodynamics that the optimal free energy of a hydrocarbon-water mixture is a function of both maximal enthalpy (from hydrogen bonding) and maximum entropy (maximum degrees of freedom). Thus, nonpolar molecules tend to form droplets with minimal exposed surface area, reducing the number of water molecules affected. For the same reason, in the aqueous environment of the living cell the hydrophobic portions of biopolymers tend to be buried inside the structure of the molecule, or within a lipid bilayer, minimizing contact with water.
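The link between droplet formation and exposed surface area can be illustrated geometrically. The Python sketch below assumes idealized, equal-sized spherical droplets that merge into a single sphere with the same total volume; these simplifications and the function names are assumptions made for illustration, not part of the text.

```python
import math


def sphere_area(radius: float) -> float:
    """Surface area of a sphere."""
    return 4.0 * math.pi * radius ** 2


def merged_area_ratio(n_droplets: int, radius: float = 1.0) -> float:
    """Area of one merged droplet divided by the total area of n equal droplets.

    Volume is conserved on merging, so the merged radius is
    radius * n**(1/3) and the ratio works out to n**(-1/3).
    """
    total_volume = n_droplets * (4.0 / 3.0) * math.pi * radius ** 3
    merged_radius = (3.0 * total_volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return sphere_area(merged_radius) / (n_droplets * sphere_area(radius))


if __name__ == "__main__":
    for n in (1, 8, 1000):
        print(f"{n:5d} droplets -> merged area is {merged_area_ratio(n):.3f} of the original")
```

Under these assumptions, merging 8 droplets halves the surface exposed to water, and merging 1000 reduces it to one tenth, so fewer water molecules are forced into ordered arrangements.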

Electrostatic Interactions

Interactions between charged groups shape biomolecular structure. Electrostatic interactions between oppositely charged groups within or between biomolecules are termed salt bridges. Salt bridges are comparable in strength to hydrogen bonds but act over larger distances. They thus often facilitate the binding of charged molecules and ions to proteins and nucleic acids.

Van der Waals Forces

Van der Waals forces arise from attractions between transient dipoles generated by the rapid movement of electrons on all neutral atoms. Significantly weaker than hydrogen bonds but potentially extremely numerous, van der Waals forces decrease as the sixth power of the distance separating atoms. Thus, they act over very short distances, typically 2–4 Å.
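A brief Python sketch shows why a sixth-power dependence confines van der Waals interactions to short distances. It takes the sixth-power falloff as stated in the text and, for contrast, the inverse-square distance dependence of Coulomb's law between fixed charges; the function name and the choice of 2 Å as the reference distance are assumptions made for illustration.

```python
def relative_strength(distance: float, reference: float, power: int) -> float:
    """Strength at `distance` relative to `reference` for a 1/r**power falloff."""
    if distance <= 0 or reference <= 0:
        raise ValueError("distances must be positive")
    return (reference / distance) ** power


if __name__ == "__main__":
    # Compare a sixth-power falloff (van der Waals, per the text) with the
    # inverse-square falloff of Coulomb's law, both relative to 2 angstroms.
    for d in (2.0, 3.0, 4.0, 8.0):
        vdw = relative_strength(d, 2.0, 6)
        coulomb = relative_strength(d, 2.0, 2)
        print(f"{d:.0f} A: van der Waals {vdw:.5f}, Coulomb {coulomb:.5f}")
```

Doubling the separation from 2 Å to 4 Å cuts the sixth-power term to 1/64 of its value, whereas an inverse-square interaction falls only to 1/4.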

Multiple Forces Stabilize Biomolecules

The DNA double helix illustrates the contribution of multiple forces to the structure of biomolecules. While each individual DNA strand is held together by covalent bonds, the two strands of the helix are held together exclusively by noncovalent interactions. These noncovalent interactions include hydrogen bonds between nucleotide bases (Watson-Crick base pairing) and van der Waals interactions between the stacked purine and pyrimidine bases. The helix presents the charged phosphate groups and polar deoxyribose sugars of the backbone to water while burying the relatively hydrophobic nucleotide bases inside. The extended backbone maximizes the distance between negatively charged backbone phosphates, minimizing unfavorable electrostatic interactions.

WATER IS AN EXCELLENT NUCLEOPHILE

Metabolic reactions often involve the attack by lone pairs of electrons residing on electron-rich molecules termed nucleophiles upon electron-poor atoms called electrophiles. Nucleophiles and electrophiles do not necessarily possess a formal negative or positive charge. Water, whose two lone pairs of sp3 electrons bear a partial negative charge, is an excellent nucleophile. Other nucleophiles of biologic importance include the oxygen atoms of phosphates, alcohols, and carboxylic acids; the sulfur of thiols; the nitrogen of amines; and the imidazole ring of histidine. Common electrophiles include the carbonyl carbons in amides, esters, aldehydes, and ketones and the phosphorus atoms of phosphoesters.

Nucleophilic attack by water generally results in the cleavage of the amide, glycoside, or ester bonds that hold biopolymers together. This process is termed hydrolysis. Conversely, when monomer units are joined together to form biopolymers such as proteins or glycogen, water is a product, as shown below for the formation of a peptide bond between two amino acids.

[Reaction scheme: alanine and valine condense to form a dipeptide joined by a peptide bond, releasing H2O.]

While hydrolysis is a thermodynamically favored reaction, the amide and phosphoester bonds of polypeptides and oligonucleotides are stable in the aqueous environment of the cell. This seemingly paradoxic behavior reflects the fact that the thermodynamics governing the equilibrium of a reaction do not determine the rate at which it will take place. In the cell, protein catalysts called enzymes are used to accelerate the rate of hydrolytic reactions when needed. Proteases catalyze the hydrolysis of proteins into their component amino acids, while nucleases catalyze the hydrolysis of the phosphoester bonds in DNA and RNA. Careful control of the activities of these enzymes is required to ensure that they act only on appropriate target molecules.

Many Metabolic Reactions Involve Group Transfer

In group transfer reactions, a group G is transferred from a donor D to an acceptor A, forming an acceptor group complex A–G:

$$\mathrm{D{-}G + A \rightleftharpoons A{-}G + D}$$

The hydrolysis and phosphorolysis of glycogen represent group transfer reactions in which glucosyl groups are transferred to water or to orthophosphate. The equilibrium constant for the hydrolysis of covalent bonds strongly favors the formation of split products. The biosynthesis of macromolecules also involves group transfer reactions in which the thermodynamically unfavored synthesis of covalent bonds is coupled to favored reactions so that the overall change in free energy favors biopolymer synthesis. Given the nucleophilic character of water and its high concentration in cells, why are biopolymers such as proteins and DNA relatively stable? And how can synthesis of biopolymers occur in an apparently aqueous environment? Central to both questions are the properties of enzymes. In the absence of enzymic catalysis, even thermodynamically highly favored reactions do not necessarily take place rapidly. Precise and differential control of enzyme activity and the sequestration of enzymes in specific organelles determine under what physiologic conditions a given biopolymer will be synthesized or degraded. Newly synthesized polymers are not immediately hydrolyzed, in part because the active sites of biosynthetic enzymes sequester substrates in an environment from which water can be excluded.

Water Molecules Exhibit a Slight but Important Tendency to Dissociate

The ability of water to ionize, while slight, is of central importance for life. Since water can act both as an acid and as a base, its ionization may be represented as an intermolecular proton transfer that forms a hydronium ion (H₃O⁺) and a hydroxide ion (OH⁻):

$$\mathrm{H_2O + H_2O \rightleftharpoons H_3O^+ + OH^-}$$

The transferred proton is actually associated with a cluster of water molecules. Protons exist in solution not only as H₃O⁺, but also as multimers such as H₅O₂⁺ and H₇O₃⁺. The proton is nevertheless routinely represented as H⁺, even though it is in fact highly hydrated.

Since hydronium and hydroxide ions continuously recombine to form water molecules, an individual hydrogen or oxygen cannot be stated to be present as an ion or as part of a water molecule. At one instant it is an ion. An instant later it is part of a molecule. Individual ions or molecules are therefore not considered. We refer instead to the probability that at any instant in time a hydrogen will be present as an ion or as part of a water molecule. Since 1 g of water contains approximately 3.3 × 10²² molecules, the ionization of water can be described statistically. To state that the probability that a hydrogen exists as an ion is 0.01 means that a hydrogen atom has one chance in 100 of being an ion and 99 chances out of 100 of being part of a water molecule. The actual probability of a hydrogen atom in pure water existing as a hydrogen ion is approximately 1.8 × 10⁻⁹. The probability of its being part of a molecule thus is almost unity. Stated another way, for every hydrogen ion and hydroxyl ion in pure water there are roughly 5.6 × 10⁸ (about 560 million) water molecules. Hydrogen ions and hydroxyl ions nevertheless contribute significantly to the properties of water.
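The statistical picture above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: it assumes Avogadro's number, the rounded molar mass of 18 g/mol for water, and the standard value of 1.0 × 10⁻⁷ mol/L for the hydrogen ion concentration of pure water at 25 °C. The variable names are chosen here for illustration.

```python
# Assumed constants (not derived in the text above)
AVOGADRO = 6.022e23        # molecules per mole
MOLAR_MASS_WATER = 18.0    # g/mol, rounded
H_ION_MOLARITY = 1.0e-7    # mol/L of H+ in pure water at 25 °C

# Number of water molecules in 1 g of water
molecules_per_gram = AVOGADRO / MOLAR_MASS_WATER

# Molar concentration of water itself (1 L taken as 1000 g)
water_molarity = 1000.0 / MOLAR_MASS_WATER

# Probability that a hydrogen is present as an ion, and its reciprocal:
# the number of water molecules per hydrogen ion
ion_probability = H_ION_MOLARITY / water_molarity
waters_per_ion = 1.0 / ion_probability

print(f"molecules in 1 g of water: {molecules_per_gram:.2e}")
print(f"molarity of pure water:    {water_molarity:.2f} mol/L")
print(f"probability of ionization: {ion_probability:.1e}")
print(f"water molecules per ion:   {waters_per_ion:.1e}")
```

The reciprocal of the ionization probability gives the number of water molecules per ion, which is why a probability near 1.8 × 10⁻⁹ corresponds to several hundred million molecules per ion.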

For dissociation of water,

$$K = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]}$$

where brackets represent molar concentrations (strictly speaking, molar activities) and K is the dissociation constant. Since one mole (mol) of water weighs 18 g, one liter (L) (1000 g) of water contains 1000 ÷ 18 = 55.56 mol. Pure water thus is 55.56 molar. Since the probability that a hydrogen in pure water will exist as a hydrogen ion is 1.8 × 10⁻⁹, the molar concentration of H⁺ ions (or of OH⁻ ions) in pure water is the product of the probability, 1.8 × 10⁻⁹, times the molar concentration of water, 55.56 mol/L. The result is 1.0 × 10⁻⁷ mol/L.

We can now calculate K for water:

$$K = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]} = \frac{[10^{-7}][10^{-7}]}{[55.56]} = 0.018 \times 10^{-14} = 1.8 \times 10^{-16}\ \mathrm{mol/L}$$

The molar concentration of water, 55.56 mol/L, is too great to be significantly affected by dissociation. It therefore is considered to be essentially constant. This constant may then be incorporated into the dissociation constant K to provide a useful new constant Kw termed the ion product for water. The relationship between Kw and K is shown below:

$$K = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]} = 1.8 \times 10^{-16}\ \mathrm{mol/L}$$

$$K_w = (K)[\mathrm{H_2O}] = [\mathrm{H^+}][\mathrm{OH^-}] = (1.8 \times 10^{-16}\ \mathrm{mol/L})(55.56\ \mathrm{mol/L}) = 1.00 \times 10^{-14}\ (\mathrm{mol/L})^2$$

Note that the dimensions of K are moles per liter and those of Kw are moles² per liter². As its name suggests, the ion product Kw is numerically equal to the product of the molar concentrations of H⁺ and OH⁻:

$$K_w = [\mathrm{H^+}][\mathrm{OH^-}]$$

At 25 °C, Kw = (10⁻⁷)², or 10⁻¹⁴ (mol/L)². At temperatures below 25 °C, Kw is somewhat less than 10⁻¹⁴; and at temperatures above 25 °C it is somewhat greater than 10⁻¹⁴. Within the stated limitations of the effect of temperature, Kw equals 10⁻¹⁴ (mol/L)² for all aqueous solutions, even solutions of acids or bases. We shall use Kw to calculate the pH of acidic and basic solutions.

pH IS THE NEGATIVE LOG OF THE HYDROGEN ION CONCENTRATION

The term pH was introduced in 1909 by Sörensen, who defined pH as the negative log of the hydrogen ion concentration:

$$\mathrm{pH} = -\log\,[\mathrm{H^+}]$$

This definition, while not rigorous, suffices for many biochemical purposes. To calculate the pH of a solution:

1. Calculate hydrogen ion concentration [H⁺].
2. Calculate the base 10 logarithm of [H⁺].
3. pH is the negative of the value found in step 2.

For example, for pure water at 25 °C,

$$\mathrm{pH} = -\log\,[\mathrm{H^+}] = -\log 10^{-7} = -(-7) = 7.0$$

Low pH values correspond to high concentrations of H⁺ and high pH values correspond to low concentrations of H⁺.

Acids are proton donors and bases are proton acceptors. Strong acids (eg, HCl or H₂SO₄) completely dissociate into anions and cations even in strongly acidic solutions (low pH). Weak acids dissociate only partially in acidic solutions. Similarly, strong bases (eg, KOH or NaOH), but not weak bases (eg, Ca[OH]₂), are completely dissociated at high pH. Many biochemicals are weak acids. Exceptions include phosphorylated intermediates, whose phosphoryl group contains two dissociable protons, the first of which is strongly acidic.

The following examples illustrate how to calculate the pH of acidic and basic solutions.

Example 1: What is the pH of a solution whose hydrogen ion concentration is 3.2 × 10⁻⁴ mol/L?

$$\begin{aligned}
\mathrm{pH} &= -\log\,[\mathrm{H^+}] \\
&= -\log\,(3.2 \times 10^{-4}) \\
&= -\log\,(3.2) - \log\,(10^{-4}) \\
&= -0.5 + 4.0 \\
&= 3.5
\end{aligned}$$

Example 2: What is the pH of a solution whose hydroxide ion concentration is 4.0 × 10⁻⁴ mol/L? We first define a quantity pOH that is equal to −log [OH⁻] and that may be derived from the definition of Kw:

$$K_w = [\mathrm{H^+}][\mathrm{OH^-}] = 10^{-14}$$

Therefore:

$$\log\,[\mathrm{H^+}] + \log\,[\mathrm{OH^-}] = \log 10^{-14}$$

or

$$\mathrm{pH} + \mathrm{pOH} = 14$$

To solve the problem by this approach:

$$\begin{aligned}
[\mathrm{OH^-}] &= 4.0 \times 10^{-4} \\
\mathrm{pOH} &= -\log\,[\mathrm{OH^-}] \\
&= -\log\,(4.0 \times 10^{-4}) \\
&= -\log\,(4.0) - \log\,(10^{-4}) \\
&= -0.60 + 4.0 \\
&= 3.4
\end{aligned}$$

Now:

$$\mathrm{pH} = 14 - \mathrm{pOH} = 14 - 3.4 = 10.6$$

Example 3: What are the pH values of (a) 2.0 × 10⁻² mol/L KOH and of (b) 2.0 × 10⁻⁶ mol/L KOH? The OH⁻ arises from two sources, KOH and water. Since pH is determined by the total [H⁺] (and pOH by the total [OH⁻]), both sources must be considered. In the first case (a), the contribution of water to the total [OH⁻] is negligible. The same cannot be said for the second case (b):

Concentration (mol/L)     (a)                 (b)
Molarity of KOH           2.0 × 10⁻²          2.0 × 10⁻⁶
[OH⁻] from KOH            2.0 × 10⁻²          2.0 × 10⁻⁶
[OH⁻] from water          1.0 × 10⁻⁷          1.0 × 10⁻⁷
Total [OH⁻]               2.00001 × 10⁻²      2.1 × 10⁻⁶

Once a decision has been reached about the significance of the contribution by water, pH may be calculated as above.

The above examples assume that the strong base KOH is completely dissociated in solution and that the concentration of OH⁻ ions is thus equal to that of the KOH. This assumption is valid for dilute solutions of strong bases or acids but not for weak bases or acids. Since weak electrolytes dissociate only slightly in solution, we must use the dissociation constant to calculate the concentration of H⁺ (or OH⁻) produced by a given molarity of a weak acid (or base) before calculating total [H⁺] (or total [OH⁻]) and subsequently pH.
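The three worked examples can be reproduced with a short Python sketch. It is a minimal illustration, assuming Kw = 1.0 × 10⁻¹⁴ at 25 °C and complete dissociation of KOH; the function names are hypothetical, and the treatment of water's contribution simply adds 1.0 × 10⁻⁷ mol/L to the hydroxide from KOH, as in the table of Example 3, rather than solving the full equilibrium.

```python
import math

KW = 1.0e-14  # ion product of water at 25 °C, (mol/L)^2


def ph_from_hydrogen(h_conc):
    """pH from the hydrogen ion concentration in mol/L."""
    return -math.log10(h_conc)


def ph_from_hydroxide(oh_conc):
    """pH from the hydroxide ion concentration, using pH + pOH = 14."""
    poh = -math.log10(oh_conc)
    return -math.log10(KW) - poh


def total_hydroxide(koh_molarity):
    """Total [OH-]: fully dissociated KOH plus 1.0e-7 mol/L from water
    (the simple additive approximation used in Example 3)."""
    return koh_molarity + 1.0e-7


print(f"Example 1: pH = {ph_from_hydrogen(3.2e-4):.1f}")
print(f"Example 2: pH = {ph_from_hydroxide(4.0e-4):.1f}")
for label, molarity in (("a", 2.0e-2), ("b", 2.0e-6)):
    ph = ph_from_hydroxide(total_hydroxide(molarity))
    print(f"Example 3({label}): pH = {ph:.2f}")
```

In case (a) the water term changes the result only far beyond the second decimal place, whereas in case (b) it shifts the total hydroxide concentration by about 5%.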

Functional Groups That Are Weak Acids Have Great Physiologic Significance

Many biochemicals possess functional groups that are weak acids or bases. Carboxyl groups, amino groups, and the second phosphate dissociation of phosphate esters are present in proteins and nucleic acids, most coenzymes, and most intermediary metabolites. Knowledge of the dissociation of weak acids and bases thus is basic to understanding the influence of intracellular pH on structure and biologic activity. Charge-based separations such as electrophoresis and ion exchange chromatography also are best understood in terms of the dissociation behavior of functional groups.

We term the protonated species (eg, HA or R—NH₃⁺) the acid and the unprotonated species (eg, A⁻ or R—NH₂) its conjugate base. Similarly, we may refer to a base (eg, A⁻ or R—NH₂) and its conjugate acid (eg, HA or R—NH₃⁺). Representative weak acids (left), their conjugate bases (center), and the pKa values (right) include the following:

R—CH₂—COOH      R—CH₂—COO⁻      pKa = 4–5
R—CH₂—NH₃⁺      R—CH₂—NH₂       pKa = 9–10
H₂CO₃           HCO₃⁻           pKa = 6.4
H₂PO₄⁻          HPO₄²⁻          pKa = 7.2

We express the relative strengths of weak acids and bases in terms of their dissociation constants. Shown below are the expressions for the dissociation constant (Ka) for two representative weak acids, R—COOH and R—NH₃⁺.

$$\mathrm{R{-}COOH \rightleftharpoons R{-}COO^- + H^+} \qquad K_a = \frac{[\mathrm{R{-}COO^-}][\mathrm{H^+}]}{[\mathrm{R{-}COOH}]}$$

$$\mathrm{R{-}NH_3^+ \rightleftharpoons R{-}NH_2 + H^+} \qquad K_a = \frac{[\mathrm{R{-}NH_2}][\mathrm{H^+}]}{[\mathrm{R{-}NH_3^+}]}$$

Since the numeric values of Ka for weak acids are negative exponential numbers, we express Ka as pKa, where

$$\mathrm{p}K_a = -\log K_a$$

Note that pKa is related to Ka as pH is to [H⁺]. The stronger the acid, the lower its pKa value.

pKa is used to express the relative strengths of both acids and bases. For any weak acid, its conjugate is a strong base. Similarly, the conjugate of a strong base is a weak acid. The relative strengths of bases are expressed in terms of the pKa of their conjugate acids. For polyprotic compounds containing more than one dissociable proton, a numerical subscript is assigned to each in order of relative acidity. For a dissociation of the type

$$\mathrm{R{-}NH_3^+ \rightarrow R{-}NH_2}$$

the pKa is the pH at which the concentration of the acid R—NH₃⁺ equals that of the base R—NH₂.

From the above equations that relate Ka to [H⁺] and to the concentrations of undissociated acid and its conjugate base, when

$$[\mathrm{R{-}COO^-}] = [\mathrm{R{-}COOH}]$$

or when

$$[\mathrm{R{-}NH_2}] = [\mathrm{R{-}NH_3^+}]$$

then

$$K_a = [\mathrm{H^+}]$$

Thus, when the associated (protonated) and dissociated (conjugate base) species are present at equal concentrations, the prevailing hydrogen ion concentration [H⁺] is numerically equal to the dissociation constant, Ka. If the logarithms of both sides of the above equation are taken and both sides are multiplied by −1, the expressions would be as follows:

$$K_a = [\mathrm{H^+}]$$

$$-\log K_a = -\log\,[\mathrm{H^+}]$$

Since −log Ka is defined as pKa, and −log [H⁺] defines pH, the equation may be rewritten as

$$\mathrm{p}K_a = \mathrm{pH}$$

ie, the pKa of an acid group is the pH at which the protonated and unprotonated species are present at equal concentrations. The pKa for an acid may be determined by adding 0.5 equivalent of alkali per equivalent of acid. The resulting pH will be the pKa of the acid.

The Henderson-Hasselbalch Equation Describes the Behavior of Weak Acids & Buffers

The Henderson-Hasselbalch equation is derived below. A weak acid, HA, ionizes as follows:

$$\mathrm{HA \rightleftharpoons H^+ + A^-}$$

The equilibrium constant for this dissociation is

$$K_a = \frac{[\mathrm{H^+}][\mathrm{A^-}]}{[\mathrm{HA}]}$$

Cross-multiplication gives

$$[\mathrm{H^+}][\mathrm{A^-}] = K_a[\mathrm{HA}]$$

Divide both sides by [A⁻]:

$$[\mathrm{H^+}] = K_a\,\frac{[\mathrm{HA}]}{[\mathrm{A^-}]}$$

Take the log of both sides:

$$\log\,[\mathrm{H^+}] = \log\left(K_a\,\frac{[\mathrm{HA}]}{[\mathrm{A^-}]}\right) = \log K_a + \log\frac{[\mathrm{HA}]}{[\mathrm{A^-}]}$$

Multiply through by −1:

$$-\log\,[\mathrm{H^+}] = -\log K_a - \log\frac{[\mathrm{HA}]}{[\mathrm{A^-}]}$$

Substitute pH and pKa for −log [H⁺] and −log Ka, respectively; then:

$$\mathrm{pH} = \mathrm{p}K_a - \log\frac{[\mathrm{HA}]}{[\mathrm{A^-}]}$$

Inversion of the last term removes the minus sign and gives the Henderson-Hasselbalch equation:

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{[\mathrm{A^-}]}{[\mathrm{HA}]}$$

The Henderson-Hasselbalch equation has great predictive value in protonic equilibria. For example,

(1) When an acid is exactly half-neutralized, [A⁻] = [HA]. Under these conditions,

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{[\mathrm{A^-}]}{[\mathrm{HA}]} = \mathrm{p}K_a + \log\frac{1}{1} = \mathrm{p}K_a + 0$$

Therefore, at half-neutralization, pH = pKa.

(2) When the ratio [A⁻]/[HA] = 100:1,

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{[\mathrm{A^-}]}{[\mathrm{HA}]} = \mathrm{p}K_a + \log\frac{100}{1} = \mathrm{p}K_a + 2$$

(3) When the ratio [A⁻]/[HA] = 1:10,

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{1}{10} = \mathrm{p}K_a + (-1)$$

If the equation is evaluated at ratios of [A⁻]/[HA] ranging from 10³ to 10⁻³ and the calculated pH values are plotted, the resulting graph describes the titration curve for a weak acid (Figure 2–4).

[Figure 2–4. Titration curve for an acid of the type HA. The heavy dot in the center of the curve indicates the pKa, 5.0. Horizontal axis: pH. Vertical axes: meq of alkali added per meq of acid; net charge.]

Table 2–2. Relative strengths of selected acids of biologic significance. Tabulated values are the pKa values (−log of the dissociation constant) of selected monoprotic, diprotic, and triprotic acids.

Monoprotic Acids
Formic           pK    3.75
Lactic           pK    3.86
Acetic           pK    4.76
Ammonium ion     pK    9.25

Diprotic Acids
Carbonic         pK1   6.37     pK2   10.25
Succinic         pK1   4.21     pK2   5.64
Glutaric         pK1   4.34     pK2   5.41

Triprotic Acids
Phosphoric       pK1   2.15     pK2   6.82     pK3   12.38
Citric           pK1   3.08     pK2   4.74     pK3   5.40

Solutions of Weak Acids & Their Salts Buffer Changes in pH

Solutions of weak acids or bases and their conjugates exhibit buffering, the ability to resist a change in pH following addition of strong acid or base. Since many metabolic reactions are accompanied by the release or uptake of protons, most intracellular reactions are buffered. Oxidative metabolism produces CO₂, the anhydride of carbonic acid, which if not buffered would produce severe acidosis. Maintenance of a constant pH involves buffering by phosphate, bicarbonate, and proteins, which accept or release protons to resist a change in pH. For experiments using tissue extracts or enzymes, constant pH is maintained by the addition of buffers such as MES ([2-N-morpholino]ethanesulfonic acid, pKa 6.1), inorganic orthophosphate (pKa2 7.2), HEPES (N-hydroxyethylpiperazine-N′-2-ethanesulfonic acid, pKa 6.8), or Tris (tris[hydroxymethyl]aminomethane, pKa 8.3). The value of pKa relative to the desired pH is the major determinant of which buffer is selected.

Buffering can be observed by using a pH meter while titrating a weak acid or base (Figure 2–4). We can also calculate the pH shift that accompanies addition of acid or base to a buffered solution. In the example, the buffered solution (a weak acid, pKa = 5.0, and its conjugate base) is initially at one of four pH values. We will calculate the pH shift that results when 0.1 meq of KOH is added to 1 meq of each solution:

Initial pH                  5.00     5.37     5.60     5.86
[A⁻]initial                 0.50     0.70     0.80     0.88
[HA]initial                 0.50     0.30     0.20     0.12
([A⁻]/[HA])initial          1.00     2.33     4.00     7.33

Addition of 0.1 meq of KOH produces
[A⁻]final                   0.60     0.80     0.90     0.98
[HA]final                   0.40     0.20     0.10     0.02
([A⁻]/[HA])final            1.50     4.00     9.00     49.0
log ([A⁻]/[HA])final        0.176    0.602    0.95     1.69
Final pH                    5.18     5.60     5.95     6.69
ΔpH                         0.18     0.23     0.35     0.83

Notice that the change in pH per milliequivalent of OH⁻ added depends on the initial pH. The solution resists changes in pH most effectively at pH values close to the pKa. A solution of a weak acid and its conjugate base buffers most effectively in the pH range pKa ± 1.0 pH unit.
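The buffering example can be recomputed directly from the Henderson-Hasselbalch equation. The sketch below is a hedged illustration rather than a general buffer calculator: it assumes 1 meq of total buffer with pKa 5.0, that each milliequivalent of added KOH converts an equal amount of HA to A⁻, and that volume changes can be ignored. The function names are hypothetical.

```python
import math


def buffer_ph(pka, conjugate_base, acid):
    """Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA])."""
    return pka + math.log10(conjugate_base / acid)


def ph_shift(pka, base_initial, koh_added, total=1.0):
    """Return (initial pH, final pH, change) when strong base is added.

    base_initial is the initial amount of A- (meq); the remainder of
    `total` is HA. Added KOH converts HA to A- one for one.
    """
    acid_initial = total - base_initial
    initial = buffer_ph(pka, base_initial, acid_initial)
    final = buffer_ph(pka, base_initial + koh_added, acid_initial - koh_added)
    return initial, final, final - initial


print("[A-]init  initial pH  final pH  delta pH")
for base_initial in (0.50, 0.70, 0.80, 0.88):
    initial, final, delta = ph_shift(5.0, base_initial, 0.1)
    print(f"{base_initial:8.2f}  {initial:10.2f}  {final:8.2f}  {delta:8.2f}")
```

The change in pH grows as the starting pH moves away from the pKa, which is the behavior the text describes for a buffer approaching the edge of its useful range.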

Figure 2–4 also illustrates the net charge on one molecule of the acid as a function of pH. A fractional charge of −0.5 does not mean that an individual molecule bears a fractional charge, but rather that the probability that a given molecule has a unit negative charge is 0.5. Consideration of the net charge on macromolecules as a function of pH provides the basis for separatory techniques such as ion exchange chromatography and electrophoresis.
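The fractional charge described above follows from rearranging the Henderson-Hasselbalch equation to give the fraction of molecules present as A⁻. The following sketch assumes a monoprotic acid HA with pKa 5.0, as in Figure 2–4; the function name is hypothetical.

```python
def fraction_deprotonated(ph, pka=5.0):
    """Fraction of HA present as A-, from pH = pKa + log10([A-]/[HA]).

    Since [A-]/[HA] = 10**(pH - pKa), the fraction [A-]/([A-] + [HA])
    equals 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


print("pH   fraction as A-   average net charge")
for ph in range(2, 9):
    fraction = fraction_deprotonated(float(ph))
    # Each A- carries one negative charge; HA is uncharged.
    print(f"{ph:2d}   {fraction:14.4f}   {-fraction:18.4f}")
```

At pH equal to the pKa the fraction is exactly 0.5, so the average net charge is −0.5 even though every individual molecule carries a charge of either 0 or −1.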

Acid Strength Depends on Molecular Structure

Many acids of biologic interest possess more than one dissociating group. The presence of adjacent negative charge hinders the release of a proton from a nearby group, raising its pKa. This is apparent from the pKa values for the three dissociating groups of phosphoric acid and citric acid (Table 2–2). The effect of adjacent charge decreases with distance. The second pKa for succinic acid, which has two methylene groups between its carboxyl groups, is 5.6, whereas the second pKa for glutaric acid, which has one additional methylene group, is 5.4.

pKa Values Depend on the Properties of the Medium

The pKa of a functional group is also profoundly influenced by the surrounding medium. The medium may either raise or lower the pKa depending on whether the undissociated acid or its conjugate base is the charged species. The effect of dielectric constant on pKa may be observed by adding ethanol to water. The pKa of a carboxylic acid increases, whereas that of an amine decreases because ethanol decreases the ability of water to solvate a charged species. The pKa values of dissociating groups in the interiors of proteins thus are profoundly affected by their local environment, including the presence or absence of water.

SUMMARY

• Water forms hydrogen-bonded clusters with itself and with other proton donors or acceptors. Hydrogen bonds account for the surface tension, viscosity, liquid state at room temperature, and solvent power of water.

• Compounds that contain O, N, or S can serve as hydrogen bond donors or acceptors.

• Macromolecules exchange internal surface hydrogen bonds for hydrogen bonds to water. Entropic forces dictate that macromolecules expose polar regions to an aqueous interface and bury nonpolar regions.

• Salt bonds, hydrophobic interactions, and van der Waals forces participate in maintaining molecular structure.

• pH is the negative log of [H⁺]. A low pH characterizes an acidic solution, and a high pH denotes a basic solution.

• The strength of weak acids is expressed by pKa, the negative log of the acid dissociation constant. Strong acids have low pKa values and weak acids have high pKa values.

• Buffers resist a change in pH when protons are produced or consumed. Maximum buffering capacity occurs within ± 1 pH unit of the pKa. Physiologic buffers include bicarbonate, orthophosphate, and proteins.


SECTION I: Structures & Functions of Proteins & Enzymes

Chapter 3: Amino Acids & Peptides

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

In addition to providing the monomer units from which the long polypeptide chains of proteins are synthesized, the L-α-amino acids and their derivatives participate in cellular functions as diverse as nerve transmission and the biosynthesis of porphyrins, purines, pyrimidines, and urea. Short polymers of amino acids called peptides perform prominent roles in the neuroendocrine system as hormones, hormone-releasing factors, neuromodulators, or neurotransmitters. While proteins contain only L-α-amino acids, microorganisms elaborate peptides that contain both D- and L-α-amino acids. Several of these peptides are of therapeutic value, including the antibiotics bacitracin and gramicidin A and the antitumor agent bleomycin. Certain other microbial peptides are toxic. The cyanobacterial peptides microcystin and nodularin are lethal in large doses, while small quantities promote the formation of hepatic tumors. Neither humans nor any other higher animals can synthesize 10 of the 20 common L-α-amino acids in amounts adequate to support infant growth or to maintain health in adults. Consequently, the human diet must contain adequate quantities of these nutritionally essential amino acids.

PROPERTIES OF AMINO ACIDS

The Genetic Code Specifies 20 L-α-Amino Acids

Of the over 300 naturally occurring amino acids, 20 constitute the monomer units of proteins. While a nonredundant three-letter genetic code could accommodate more than 20 amino acids, its redundancy limits the available codons to the 20 L-α-amino acids listed in Table 3–1, classified according to the polarity of their R groups. Both one- and three-letter abbreviations for each amino acid can be used to represent the amino acids in peptides (Table 3–1). Some proteins contain additional amino acids that arise by modification of an amino acid already present in a peptide. Examples include conversion of peptidyl proline and lysine to 4-hydroxyproline and 5-hydroxylysine; the conversion of peptidyl glutamate to γ-carboxyglutamate; and the methylation, formylation, acetylation, prenylation, and phosphorylation of certain aminoacyl residues. These modifications extend the biologic diversity of proteins by altering their solubility, stability, and interaction with other proteins.

Only L-α-Amino Acids Occur in Proteins

With the sole exception of glycine, the α-carbon of amino acids is chiral. Although some protein amino acids are dextrorotatory and some levorotatory, all share the absolute configuration of L-glyceraldehyde and thus are L-α-amino acids. Several free L-α-amino acids fulfill important roles in metabolic processes. Examples include ornithine, citrulline, and argininosuccinate that participate in urea synthesis; tyrosine in formation of thyroid hormones; and glutamate in neurotransmitter biosynthesis. D-Amino acids that occur naturally include free D-serine and D-aspartate in brain tissue, D-alanine and D-glutamate in the cell walls of gram-positive bacteria, and D-amino acids in some nonmammalian peptides and certain antibiotics.

ch03.qxd 2/13/2003 1:35 PM Page 14

Table 3–1. L-α-Amino acids present in proteins. [Structural formulas omitted.]

Name             Symbol     pK1 (α-COOH)   pK2 (α-NH₃⁺)   pK3 (R group)

With Aliphatic Side Chains
Glycine          Gly [G]    2.4            9.8
Alanine          Ala [A]    2.4            9.9
Valine           Val [V]    2.2            9.7
Leucine          Leu [L]    2.3            9.7
Isoleucine       Ile [I]    2.3            9.8

With Side Chains Containing Hydroxylic (OH) Groups
Serine           Ser [S]    2.2            9.2            about 13
Threonine        Thr [T]    2.1            9.1            about 13
Tyrosine         Tyr [Y]    See below.

With Side Chains Containing Sulfur Atoms
Cysteine         Cys [C]    1.9            10.8           8.3
Methionine       Met [M]    2.1            9.3

With Side Chains Containing Acidic Groups or Their Amides
Aspartic acid    Asp [D]    2.0            9.9            3.9
Asparagine       Asn [N]    2.1            8.8
Glutamic acid    Glu [E]    2.1            9.5            4.1
Glutamine        Gln [Q]    2.2            9.1

With Side Chains Containing Basic Groups
Arginine         Arg [R]    1.8            9.0            12.5
Lysine           Lys [K]    2.2            9.2            10.8
Histidine        His [H]    1.8            9.3            6.0

Containing Aromatic Rings
Histidine        His [H]    See above.
Phenylalanine    Phe [F]    2.2            9.2
Tyrosine         Tyr [Y]    2.2            9.1            10.1
Tryptophan       Trp [W]    2.4            9.4

Imino Acid
Proline          Pro [P]    2.0            10.6

Amino Acids May Have Positive, Negative, or Zero Net Charge

Charged and uncharged forms of the ionizable —COOH and —NH₃⁺ weak acid groups exist in solution in protonic equilibrium:

$$\mathrm{R{-}COOH \rightleftharpoons R{-}COO^- + H^+}$$

$$\mathrm{R{-}NH_3^+ \rightleftharpoons R{-}NH_2 + H^+}$$

While both R—COOH and R—NH₃⁺ are weak acids, R—COOH is a far stronger acid than R—NH₃⁺. At physiologic pH (pH 7.4), carboxyl groups exist almost entirely as R—COO⁻ and amino groups predominantly as R—NH₃⁺. Figure 3–1 illustrates the effect of pH on the charged state of aspartic acid.

Molecules that contain an equal number of ionizable groups of opposite charge and that therefore bear no net charge are termed zwitterions. Amino acids in blood and most tissues thus should be represented as in A, below.

A: R—CH(NH₃⁺)—COO⁻          B: R—CH(NH₂)—COOH

Structure B cannot exist in aqueous solution because at any pH low enough to protonate the carboxyl group the amino group would also be protonated. Similarly, at any pH sufficiently high for an uncharged amino group to predominate, a carboxyl group will be present as R—COO⁻. The uncharged representation B (above) is, however, often used for reactions that do not involve protonic equilibria.

[Figure 3–2. Resonance hybrids of the protonated forms of the R groups of histidine and arginine.]

pKa Values Express the Strengths of Weak Acids

The acid strengths of weak acids are expressed as their pKa (Table 3–1). The imidazole group of histidine and the guanidino group of arginine exist as resonance hybrids with positive charge distributed between both nitrogens (histidine) or all three nitrogens (arginine) (Figure 3–2). The net charge on an amino acid, the algebraic sum of all the positively and negatively charged groups present, depends upon the pKa values of its functional groups and on the pH of the surrounding medium. Altering the charge on amino acids and their derivatives by varying the pH facilitates the physical separation of amino acids, peptides, and proteins (see Chapter 4).
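The dependence of net charge on pH and pKa can be made concrete with a small calculation. The Python sketch below is an illustration under simplifying assumptions: each ionizable group is treated as independent, and the pKa values are those listed for the free amino acids in Table 3–1. The function and constant names are hypothetical.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of a group in its conjugate-base form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def net_charge(ph, groups):
    """Average net charge: the algebraic sum over all ionizable groups.

    groups is a list of (pKa, charge of the protonated form). Losing the
    proton lowers the charge of a group by one unit.
    """
    return sum(charge - fraction_deprotonated(ph, pka) for pka, charge in groups)


# (pKa, charge when protonated), pKa values taken from Table 3-1
ASPARTIC_ACID = [(2.0, 0), (9.9, +1), (3.9, 0)]   # alpha-COOH, alpha-NH3+, R-group COOH
LYSINE = [(2.2, 0), (9.2, +1), (10.8, +1)]        # alpha-COOH, alpha-NH3+, epsilon-NH3+

for name, groups in (("aspartic acid", ASPARTIC_ACID), ("lysine", LYSINE)):
    for ph in (1.0, 7.4, 12.0):
        print(f"{name:14s} pH {ph:4.1f}: net charge {net_charge(ph, groups):+.2f}")
```

At pH 7.4 the calculation gives a net charge close to −1 for aspartic acid and close to +1 for lysine, which is the difference exploited when amino acids are separated by charge.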

At Its Isoelectric pH (pI), an Amino Acid Bears No Net Charge

The isoelectric species is the form of a molecule that has an equal number of positive and negative charges and thus is electrically neutral. The isoelectric pH, also called the pI, is the pH midway between pKa values on either side of the isoelectric species. For an amino acid such as alanine that has only two dissociating groups, there is no ambiguity. The first pKa (R—COOH) is 2.35 and the second pKa (R—NH₃⁺) is 9.69. The isoelectric pH (pI) of alanine thus is

$$\mathrm{pI} = \frac{\mathrm{p}K_1 + \mathrm{p}K_2}{2} = \frac{2.35 + 9.69}{2} = 6.02$$

For polyfunctional acids, pI is also the pH midway between the pKa values on either side of the isoionic species. For example, the pI for aspartic acid is

$$\mathrm{pI} = \frac{\mathrm{p}K_1 + \mathrm{p}K_2}{2} = \frac{2.09 + 3.86}{2} = 2.98$$

For lysine, pI is calculated from:

$$\mathrm{pI} = \frac{\mathrm{p}K_2 + \mathrm{p}K_3}{2}$$

Similar considerations apply to all polyprotic acids (eg, proteins), regardless of the number of dissociating groups present. In the clinical laboratory, knowledge of the pI guides selection of conditions for electrophoretic separations. For example, electrophoresis at pH 7.0 will separate two molecules with pI values of 6.0 and 8.0 because at pH 7.0 the molecule with a pI of 6.0 will have a net negative charge, and that with a pI of 8.0 a net positive charge. Similar considerations apply to understanding chromatographic separations on ionic supports such as DEAE cellulose (see Chapter 4).

[Figure 3–1. Protonic equilibria of aspartic acid. A: in strong acid (below pH 1), net charge = +1. B: around pH 3, net charge = 0. C: around pH 6–8, net charge = −1. D: in strong alkali (above pH 11), net charge = −2. The successive dissociations are pK1 = 2.09 (α-COOH), pK2 = 3.86 (β-COOH), and pK3 = 9.82 (—NH₃⁺).]
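The rule that pI lies midway between the pKa values flanking the isoelectric species can be written as a short function. This is a sketch under stated assumptions: the caller supplies the pKa values and the net charge of the fully protonated form, and the function name is hypothetical. The alanine values are those quoted in the text, the aspartic acid values are those shown in Figure 3–1, and the lysine values come from Table 3–1.

```python
def isoelectric_ph(pkas, charge_fully_protonated):
    """pI as the mean of the two pKa values flanking the neutral species.

    Starting from the fully protonated form with net charge +n, the
    molecule becomes neutral after losing n protons, so the neutral
    species lies between the n-th and (n+1)-th pKa in ascending order.
    """
    ordered = sorted(pkas)
    n = charge_fully_protonated
    if not 1 <= n < len(ordered):
        raise ValueError("no neutral species flanked by two pKa values")
    return (ordered[n - 1] + ordered[n]) / 2.0


examples = {
    "alanine": ([2.35, 9.69], 1),
    "aspartic acid": ([2.09, 3.86, 9.82], 1),
    "lysine": ([2.2, 9.2, 10.8], 2),
}
for name, (pkas, charge) in examples.items():
    print(f"{name:14s} pI = {isoelectric_ph(pkas, charge):.2f}")
```

For lysine the fully protonated form carries a charge of +2, so two protons must be lost to reach the neutral species and the relevant pair is pK2 and pK3, as stated above.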


Table 3–2. Typical range of pKa values for ionizable groups in proteins.

Dissociating Group              pKa Range
α-Carboxyl                      3.5–4.0
Non-α COOH of Asp or Glu        4.0–4.8
Imidazole of His                6.5–7.4
SH of Cys                       8.5–9.0
OH of Tyr                       9.5–10.5
α-Amino                         8.0–9.0
ε-Amino of Lys                  9.8–10.4
Guanidinium of Arg              ~12.0

pKa Values Vary With the Environment

The environment of a dissociable group affects its pKa. The pKa values of the R groups of free amino acids in aqueous solution (Table 3–1) thus provide only an approximate guide to the pKa values of the same amino acids when present in proteins. A polar environment favors the charged form (R—COO⁻ or R—NH₃⁺), and a nonpolar environment favors the uncharged form (R—COOH or R—NH₂). A nonpolar environment thus raises the pKa of a carboxyl group (making it a weaker acid) but lowers that of an amino group (making it a stronger acid). The presence of adjacent charged groups can reinforce or counteract solvent effects. The pKa of a functional group thus will depend upon its location within a given protein. Variations in pKa can encompass whole pH units (Table 3–2). pKa values that diverge from those listed by as much as three pH units are common at the active sites of enzymes. An extreme example, a buried aspartic acid of thioredoxin, has a pKa above 9, a shift of more than five pH units from the value of 3.9 listed for the free amino acid in Table 3–1.

The Solubility and Melting Points of Amino Acids Reflect Their Ionic Character

The charged functional groups of amino acids ensure that they are readily solvated by, and thus soluble in, polar solvents such as water and ethanol but insoluble in nonpolar solvents such as benzene, hexane, or ether. Similarly, the high amount of energy required to disrupt the ionic forces that stabilize the crystal lattice accounts for the high melting points of amino acids (> 200 °C).

Amino acids do not absorb visible light and thus are colorless. However, tyrosine, phenylalanine, and especially tryptophan absorb high-wavelength (250–290 nm) ultraviolet light. Tryptophan therefore makes the major contribution to the ability of most proteins to absorb light in the region of 280 nm.

THE α-R GROUPS DETERMINE THE PROPERTIES OF AMINO ACIDS

Since glycine, the smallest amino acid, can be accommodated in places inaccessible to other amino acids, it often occurs where peptides bend sharply. The hydrophobic R groups of alanine, valine, leucine, and isoleucine and the aromatic R groups of phenylalanine, tyrosine, and tryptophan occur primarily in the interior of cytosolic proteins. The charged R groups of basic and acidic amino acids stabilize specific protein conformations via ionic interactions, or salt bonds. These bonds also function in “charge relay” systems during enzymatic catalysis and electron transport in respiring mitochondria. Histidine plays unique roles in enzymatic catalysis. The pKa of its imidazole proton permits it to function at neutral pH as either a base or an acid catalyst. The primary alcohol group of serine and the primary thioalcohol (—SH) group of cysteine are excellent nucleophiles and can function as such during enzymatic catalysis. However, the secondary alcohol group of threonine, while a good nucleophile, does not fulfill an analogous role in catalysis. The —OH groups of serine, tyrosine, and threonine also participate in regulation of the activity of enzymes whose catalytic activity depends on the phosphorylation state of these residues.

FUNCTIONAL GROUPS DICTATE THE CHEMICAL REACTIONS OF AMINO ACIDS

Each functional group of an amino acid exhibits all of its characteristic chemical reactions. For carboxylic acid groups, these reactions include the formation of esters, amides, and acid anhydrides; for amino groups, acylation, amidation, and esterification; and for –OH and –SH groups, oxidation and esterification. The most important reaction of amino acids is the formation of a peptide bond (shaded blue).

Amino Acid Sequence Determines Primary Structure

The number and order of all of the amino acid residues in a polypeptide constitute its primary structure. Amino acids present in peptides are called aminoacyl residues and are named by replacing the -ate or -ine suffixes of free amino acids with -yl (eg, alanyl, aspartyl, tyrosyl). Peptides are then named as derivatives of the carboxyl terminal aminoacyl residue. For example, Lys-Leu-Tyr-Gln is called lysyl-leucyl-tyrosyl-glutamine. The -ine ending on glutamine indicates that its α-carboxyl group is not involved in peptide bond formation.

[Structure omitted: the tripeptide alanyl-cysteinyl-valine.]

Figure 3–3. Glutathione (γ-glutamyl-cysteinyl-glycine). Note the non-α peptide bond that links Glu to Cys.
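The naming convention for peptides (every residue except the carboxyl terminal one takes its -yl form) can be sketched as a small lookup. The table below covers only a handful of residues and is an illustrative assumption, not a complete nomenclature.

```python
# Three-letter code -> (aminoacyl name, free amino acid name).
# Deliberately partial; extend as needed.
NAMES = {
    "Ala": ("alanyl", "alanine"),
    "Asp": ("aspartyl", "aspartate"),
    "Cys": ("cysteinyl", "cysteine"),
    "Gln": ("glutaminyl", "glutamine"),
    "Glu": ("glutamyl", "glutamate"),
    "Gly": ("glycyl", "glycine"),
    "Leu": ("leucyl", "leucine"),
    "Lys": ("lysyl", "lysine"),
    "Tyr": ("tyrosyl", "tyrosine"),
    "Val": ("valyl", "valine"),
}


def peptide_name(sequence: str) -> str:
    """Name a peptide written amino terminus first, eg 'Lys-Leu-Tyr-Gln'."""
    residues = sequence.split("-")
    # All residues but the last contribute their aminoacyl (-yl) name.
    parts = [NAMES[code][0] for code in residues[:-1]]
    # The carboxyl terminal residue keeps its free amino acid name.
    parts.append(NAMES[residues[-1]][1])
    return "-".join(parts)
```

For example, `peptide_name("Lys-Leu-Tyr-Gln")` gives the name used in the text, lysyl-leucyl-tyrosyl-glutamine.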

Peptide Structures Are Easy to Draw

Prefixes like tri- or octa- denote peptides with three or eight residues, respectively, not those with three or eight peptide bonds. By convention, peptides are written with the residue that bears the free α-amino group at the left. To draw a peptide, use a zigzag to represent the main chain or backbone. Add the main chain atoms, which occur in the repeating order: α-nitrogen, α-carbon, carbonyl carbon. Now add a hydrogen atom to each α-carbon and to each peptide nitrogen, and an oxygen to the carbonyl carbon. Finally, add the appropriate R groups (shaded) to each α-carbon atom.

[Structural drawings omitted: the peptide backbone built up in the steps just described.]

Three-letter abbreviations linked by straight lines represent an unambiguous primary structure. Lines are omitted for single-letter abbreviations.

Glu - Ala - Lys - Gly - Tyr - Ala

E A K G Y A

Where there is uncertainty about the order of a portion of a polypeptide, the questionable residues are enclosed in brackets and separated by commas.

Glu - Lys - (Ala, Gly, Tyr) - His - Ala

Some Peptides Contain Unusual Amino Acids

In mammals, peptide hormones typically contain only the α-amino acids of proteins linked by standard peptide bonds. Other peptides may, however, contain nonprotein amino acids, derivatives of the protein amino acids, or amino acids linked by an atypical peptide bond. For example, the amino terminal glutamate of glutathione, which participates in protein folding and in the metabolism of xenobiotics (Chapter 53), is linked to cysteine by a non-α peptide bond (Figure 3–3). The amino terminal glutamate of thyrotropin-releasing hormone (TRH) is cyclized to pyroglutamic acid, and the carboxyl group of the carboxyl terminal prolyl residue is amidated. Peptides elaborated by fungi, bacteria, and lower animals can contain nonprotein amino acids. The antibiotics tyrocidin and gramicidin S are cyclic polypeptides that contain D-phenylalanine and ornithine. The heptapeptide opioids dermorphin and deltorphin in the skin of South American tree frogs contain D-tyrosine and D-alanine.

Peptides Are Polyelectrolytes

The peptide bond is uncharged at any pH of physiologic interest. Formation of peptides from amino acids is therefore accompanied by a net loss of one positive and one negative charge per peptide bond formed. Peptides nevertheless are charged at physiologic pH owing to their carboxyl and amino terminal groups and, where present, their acidic or basic R groups. As for amino acids, the net charge on a peptide depends on the pH of its environment and on the pKa values of its dissociating groups.
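The dependence of net charge on pH and pKa can be sketched with the Henderson-Hasselbalch relation applied to each dissociating group. The pKa values below are assumed, typical approximations; in a real peptide they shift with the local environment, so the result is an estimate only.

```python
# Approximate pKa values (assumed, typical figures; not from the text).
PKA_POSITIVE = {"N_term": 8.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"C_term": 3.1, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a one-letter peptide sequence at a given pH."""
    positive = ["N_term"] + [r for r in sequence if r in PKA_POSITIVE]
    negative = ["C_term"] + [r for r in sequence if r in PKA_NEGATIVE]
    # Fraction protonated (charge +1) for basic groups.
    charge = sum(1 / (1 + 10 ** (ph - PKA_POSITIVE[g])) for g in positive)
    # Fraction deprotonated (charge -1) for acidic groups.
    charge -= sum(1 / (1 + 10 ** (PKA_NEGATIVE[g] - ph)) for g in negative)
    return charge


def isoelectric_point(sequence: str, low: float = 0.0, high: float = 14.0) -> float:
    """Find by bisection the pH at which the estimated net charge is zero."""
    for _ in range(60):
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2
```

Bisection works here because the estimated net charge can only fall as the pH rises.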

The Peptide Bond Has Partial Double-Bond Character

Although peptides are written as if a single bond linked the α-carboxyl and α-nitrogen atoms, this bond in fact exhibits partial double-bond character:

[Resonance forms: C(=O)–NH ↔ C(–O–)=N+H]

There thus is no freedom of rotation about the bond that connects the carbonyl carbon and the nitrogen of a peptide bond. Consequently, all four of the colored atoms of Figure 3–4 are coplanar. The imposed semirigidity of the peptide bond has important consequences for higher orders of protein structure. Encircling arrows (Figure 3–4) indicate free rotation about the remaining bonds of the polypeptide backbone.

Figure 3–4. Dimensions of a fully extended polypeptide chain. The four atoms of the peptide bond (colored blue) are coplanar. The unshaded atoms are the α-carbon atom, the α-hydrogen atom, and the α-R group of the particular amino acid. Free rotation can occur about the bonds that connect the α-carbon with the α-nitrogen and with the α-carbonyl carbon (blue arrows). The extended polypeptide chain is thus a semirigid structure with two-thirds of the atoms of the backbone held in a fixed planar relationship one to another. The distance between adjacent α-carbon atoms is 0.36 nm (3.6 Å). The interatomic distances and bond angles, which are not equivalent, are also shown. (Redrawn and reproduced, with permission, from Pauling L, Corey RB, Branson HR: The structure of proteins: Two hydrogen-bonded helical configurations of the polypeptide chain. Proc Natl Acad Sci U S A 1951;37:205.)

Noncovalent Forces Constrain Peptide Conformations

Folding of a peptide probably occurs coincident with its biosynthesis (see Chapter 38). The physiologically active conformation reflects the amino acid sequence, steric hindrance, and noncovalent interactions (eg, hydrogen bonding, hydrophobic interactions) between residues. Common conformations include α-helices and β-pleated sheets (see Chapter 5).

ANALYSIS OF THE AMINO ACID CONTENT OF BIOLOGIC MATERIALS

In order to determine the identity and quantity of each amino acid in a sample of biologic material, it is first necessary to hydrolyze the peptide bonds that link the amino acids together by treatment with hot HCl. The resulting mixture of free amino acids is then treated with 6-amino-N-hydroxysuccinimidyl carbamate, which reacts with their α-amino groups, forming fluorescent derivatives that are then separated and identified using high-pressure liquid chromatography (see Chapter 5). Ninhydrin, also widely used for detecting amino acids, forms a purple product with α-amino acids and a yellow adduct with the imine groups of proline and hydroxyproline.

SUMMARY

• Both D-amino acids and non-α-amino acids occur in nature, but only L-α-amino acids are present in proteins.

• All amino acids possess at least two weakly acidic functional groups, R–NH3+ and R–COOH. Many also possess additional weakly acidic functional groups such as –OH, –SH, guanidino, or imidazole groups.

• The pKa values of all functional groups of an amino acid dictate its net charge at a given pH. pI is the pH at which an amino acid bears no net charge and thus does not move in a direct current electrical field.

• Of the biochemical reactions of amino acids, the most important is the formation of peptide bonds.

• The R groups of amino acids determine their unique biochemical functions. Amino acids are classified as basic, acidic, aromatic, aliphatic, or sulfur-containing based on the properties of their R groups.

• Peptides are named for the number of amino acid residues present, and as derivatives of the carboxyl terminal residue. The primary structure of a peptide is its amino acid sequence, starting from the amino-terminal residue.

• The partial double-bond character of the bond that links the carbonyl carbon and the nitrogen of a peptide renders four atoms of the peptide bond coplanar and restricts the number of possible peptide conformations.



4. Proteins: Determination of Primary Structure

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

Proteins perform multiple critically important roles. An internal protein network, the cytoskeleton (Chapter 49), maintains cellular shape and physical integrity. Actin and myosin filaments form the contractile machinery of muscle (Chapter 49). Hemoglobin transports oxygen (Chapter 6), while circulating antibodies search out foreign invaders (Chapter 50). Enzymes catalyze reactions that generate energy, synthesize and degrade biomolecules, replicate and transcribe genes, process mRNAs, etc (Chapter 7). Receptors enable cells to sense and respond to hormones and other environmental cues (Chapters 42 and 43). An important goal of molecular medicine is the identification of proteins whose presence, absence, or deficiency is associated with specific physiologic states or diseases. The primary sequence of a protein provides both a molecular fingerprint for its identification and information that can be used to identify and clone the gene or genes that encode it.

PROTEINS & PEPTIDES MUST BE PURIFIED PRIOR TO ANALYSIS

Highly purified protein is essential for determination of its amino acid sequence. Cells contain thousands of different proteins, each in widely varying amounts. The isolation of a specific protein in quantities sufficient for analysis thus presents a formidable challenge that may require multiple successive purification techniques. Classic approaches exploit differences in relative solubility of individual proteins as a function of pH (isoelectric precipitation), polarity (precipitation with ethanol or acetone), or salt concentration (salting out with ammonium sulfate). Chromatographic separations partition molecules between two phases, one mobile and the other stationary. For separation of amino acids or sugars, the stationary phase, or matrix, may be a sheet of filter paper (paper chromatography) or a thin layer of cellulose, silica, or alumina (thin-layer chromatography; TLC).

Column Chromatography

Column chromatography of proteins employs as the stationary phase a column containing small spherical beads of modified cellulose, acrylamide, or silica whose surface typically has been coated with chemical functional groups. These stationary phase matrices interact with proteins based on their charge, hydrophobicity, and ligand-binding properties. A protein mixture is applied to the column and the liquid mobile phase is percolated through it. Small portions of the mobile phase or eluant are collected as they emerge (Figure 4–1).

Partition Chromatography

Column chromatographic separations depend on the relative affinity of different proteins for a given stationary phase and for the mobile phase. Association between each protein and the matrix is weak and transient. Proteins that interact more strongly with the stationary phase are retained longer. The length of time that a protein is associated with the stationary phase is a function of the composition of both the stationary and mobile phases. Optimal separation of the protein of interest from other proteins thus can be achieved by careful manipulation of the composition of the two phases.

Size Exclusion Chromatography

Size exclusion (or gel filtration) chromatography separates proteins based on their Stokes radius, the radius of the sphere they occupy as they tumble in solution. The Stokes radius is a function of molecular mass and shape. A tumbling elongated protein occupies a larger volume than a spherical protein of the same mass. Size exclusion chromatography employs porous beads (Figure 4–2). The pores are analogous to indentations in a river bank. As objects move downstream, those that enter an indentation are retarded until they drift back into the main current. Similarly, proteins with Stokes radii too large to enter the pores (excluded proteins) remain in the flowing mobile phase and emerge before proteins that can enter the pores (included proteins).

Figure 4–1. Components of a simple liquid chromatography apparatus. R: Reservoir of mobile phase liquid, delivered either by gravity or using a pump. C: Glass or plastic column containing stationary phase. F: Fraction collector for collecting portions, called fractions, of the eluant liquid in separate test tubes.

Proteins thus emerge from a gel filtration column in descending order of their Stokes radii.

Adsorption Chromatography

For adsorption chromatography, the protein mixture is applied to a column under conditions where the protein of interest associates with the stationary phase so tightly that its partition coefficient is essentially unity. Nonadhering molecules are first eluted and discarded. Proteins are then sequentially released by disrupting the forces that stabilize the protein-stationary phase complex, most often by using a gradient of increasing salt concentration. The composition of the mobile phase is altered gradually so that molecules are selectively released in descending order of their affinity for the stationary phase.

Ion Exchange Chromatography

In ion exchange chromatography, proteins interact with the stationary phase by charge-charge interactions. Proteins with a net positive charge at a given pH adhere to beads with negatively charged functional groups such as carboxylates or sulfates (cation exchangers). Similarly, proteins with a net negative charge adhere to beads with positively charged functional groups, typically tertiary or quaternary amines (anion exchangers). Proteins, which are polyanions, compete against monovalent ions for binding to the support, hence the term “ion exchange.” For example, proteins bind to diethylaminoethyl (DEAE) cellulose by replacing the counter-ions (generally Cl− or CH3COO−) that neutralize the protonated amine. Bound proteins are selectively displaced by gradually raising the concentration of monovalent ions in the mobile phase. Proteins elute in inverse order of the strength of their interactions with the stationary phase.

Figure 4–2. Size-exclusion chromatography. A: A mixture of large molecules (diamonds) and small molecules (circles) is applied to the top of a gel filtration column. B: Upon entering the column, the small molecules enter pores in the stationary phase matrix from which the large molecules are excluded. C: As the mobile phase flows down the column, the large, excluded molecules flow with it while the small molecules, which are temporarily sheltered from the flow when inside the pores, lag farther and farther behind.

Since the net charge on a protein is determined by the pH (see Chapter 3), sequential elution of proteins may be achieved by changing the pH of the mobile phase. Alternatively, a protein can be subjected to consecutive rounds of ion exchange chromatography, each at a different pH, such that proteins that co-elute at one pH elute at different salt concentrations at another pH.

Hydrophobic Interaction Chromatography

Hydrophobic interaction chromatography separates proteins based on their tendency to associate with a stationary phase matrix coated with hydrophobic groups (eg, phenyl Sepharose, octyl Sepharose). Proteins with exposed hydrophobic surfaces adhere to the matrix via hydrophobic interactions that are enhanced by a mobile phase of high ionic strength. Nonadherent proteins are first washed away. The polarity of the mobile phase is then decreased by gradually lowering the salt concentration. If the interaction between protein and stationary phase is particularly strong, ethanol or glycerol may be added to the mobile phase to decrease its polarity and further weaken hydrophobic interactions.

Affinity Chromatography

Affinity chromatography exploits the high selectivity of most proteins for their ligands. Enzymes may be purified by affinity chromatography using immobilized substrates, products, coenzymes, or inhibitors. In theory, only proteins that interact with the immobilized ligand adhere. Bound proteins are then eluted either by competition with soluble ligand or, less selectively, by disrupting protein-ligand interactions using urea, guanidine hydrochloride, mildly acidic pH, or high salt concentrations. Stationary phase matrices available commercially contain ligands such as NAD+ or ATP analogs. Among the most powerful and widely applicable affinity matrices are those used for the purification of suitably modified recombinant proteins. These include a Ni2+ matrix that binds proteins with an attached polyhistidine “tag” and a glutathione matrix that binds a recombinant protein linked to glutathione S-transferase.

Peptides Are Purified by Reversed-Phase High-Pressure Chromatography

The stationary phase matrices used in classic column chromatography are spongy materials whose compressibility limits flow of the mobile phase. High-pressure liquid chromatography (HPLC) employs incompressible silica or alumina microbeads as the stationary phase and pressures of up to a few thousand psi. Incompressible matrices permit both high flow rates and enhanced resolution. HPLC can resolve complex mixtures of lipids or peptides whose properties differ only slightly. Reversed-phase HPLC exploits a hydrophobic stationary phase of aliphatic polymers 3–18 carbon atoms in length. Peptide mixtures are eluted using a gradient of a water-miscible organic solvent such as acetonitrile or methanol.

Protein Purity Is Assessed by Polyacrylamide Gel Electrophoresis (PAGE)

The most widely used method for determining the purity of a protein is SDS-PAGE, that is, polyacrylamide gel electrophoresis (PAGE) in the presence of the anionic detergent sodium dodecyl sulfate (SDS). Electrophoresis separates charged biomolecules based on the rates at which they migrate in an applied electrical field. For SDS-PAGE, acrylamide is polymerized and cross-linked to form a porous matrix. SDS denatures and binds to proteins at a ratio of one molecule of SDS per two peptide bonds. When used in conjunction with 2-mercaptoethanol or dithiothreitol to reduce and break disulfide bonds (Figure 4–3), SDS separates the component polypeptides of multimeric proteins. The large number of anionic SDS molecules, each bearing a charge of −1, on each polypeptide overwhelms the charge contributions of the amino acid functional groups. Since the charge-to-mass ratio of each SDS-polypeptide complex is approximately equal, the physical resistance each peptide encounters as it moves through the acrylamide matrix determines the rate of migration. Since large complexes encounter greater resistance, polypeptides separate based on their relative molecular mass (Mr). Individual polypeptides trapped in the acrylamide gel are visualized by staining with dyes such as Coomassie blue (Figure 4–4).
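Because migration in SDS-PAGE depends on Mr, the mass of an unknown polypeptide is commonly estimated from standards run on the same gel. The sketch below assumes, as a working approximation not stated in the text, that log10(Mr) varies linearly with relative mobility; the function names and the example standards are hypothetical.

```python
import math


def fit_calibration(standards):
    """Least-squares line log10(Mr) = intercept + slope * mobility.

    `standards` is a list of (relative_mobility, Mr) pairs.
    """
    xs = [mobility for mobility, _ in standards]
    ys = [math.log10(mr) for _, mr in standards]
    n = len(standards)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_mr(mobility, standards):
    """Interpolate the Mr of a band from its relative mobility."""
    intercept, slope = fit_calibration(standards)
    return 10 ** (intercept + slope * mobility)
```

The slope of the fitted line is negative, reflecting the slower migration of larger SDS-polypeptide complexes.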

Isoelectric Focusing (IEF)

Ionic buffers called ampholytes and an applied electric field are used to generate a pH gradient within a polyacrylamide matrix. Applied proteins migrate until they reach the region of the matrix where the pH matches their isoelectric point (pI), the pH at which a peptide’s net charge is zero. IEF is used in conjunction with SDS-PAGE for two-dimensional electrophoresis, which separates polypeptides based on pI in one dimension and based on Mr in the second (Figure 4–5). Two-dimensional electrophoresis is particularly well suited for separating the components of complex mixtures of proteins.

SANGER WAS THE FIRST TO DETERMINE THE SEQUENCE OF A POLYPEPTIDE

Mature insulin consists of the 21-residue A chain and the 30-residue B chain linked by disulfide bonds. Frederick Sanger reduced the disulfide bonds (Figure 4–3), separated the A and B chains, and cleaved each chain into smaller peptides using trypsin, chymotrypsin, and pepsin. The resulting peptides were then isolated and treated with acid to hydrolyze peptide bonds and generate peptides with as few as two or three amino acids. Each peptide was reacted with 1-fluoro-2,4-dinitrobenzene (Sanger’s reagent), which derivatizes the exposed α-amino group of amino terminal residues. The amino acid content of each peptide was then determined. While the ε-amino group of lysine also reacts with Sanger’s reagent, amino-terminal lysines can be distinguished from those at other positions because they react with 2 mol of Sanger’s reagent. Working backwards to larger fragments enabled Sanger to determine the complete sequence of insulin, an accomplishment for which he received a Nobel Prize in 1958.

Figure 4–3. Oxidative cleavage of adjacent polypeptide chains linked by disulfide bonds (shaded) by performic acid (left) or reductive cleavage by β-mercaptoethanol (right) forms two peptides that contain cysteic acid residues or cysteinyl residues, respectively.

Figure 4–4. Use of SDS-PAGE to observe successive purification of a recombinant protein. The gel was stained with Coomassie blue. Shown are protein standards (lane S) of the indicated mass, crude cell extract (E), high-speed supernatant liquid (H), and the DEAE-Sepharose fraction (D). The recombinant protein has a mass of about 45 kDa.

Figure 4–5. Two-dimensional IEF-SDS-PAGE. The gel was stained with Coomassie blue. A crude bacterial extract was first subjected to isoelectric focusing (IEF) in a pH 3–10 gradient. The IEF gel was then placed horizontally on the top of an SDS gel, and the proteins then further resolved by SDS-PAGE. Notice the greatly improved resolution of distinct polypeptides relative to ordinary SDS-PAGE gel (Figure 4–4).

THE EDMAN REACTION ENABLES PEPTIDES & PROTEINS TO BE SEQUENCED

Pehr Edman introduced phenylisothiocyanate (Edman’s reagent) to selectively label the amino-terminal residue of a peptide. In contrast to Sanger’s reagent, the phenylthiohydantoin (PTH) derivative can be removed under mild conditions to generate a new amino terminal residue (Figure 4–6). Successive rounds of derivatization with Edman’s reagent can therefore be used to sequence many residues of a single sample of peptide. Edman sequencing has been automated, using a thin film or solid matrix to immobilize the peptide and HPLC to identify PTH amino acids. Modern gas-phase sequencers can analyze as little as a few picomoles of peptide.

Large Polypeptides Are First Cleaved Into Smaller Segments

While the first 20–30 residues of a peptide can readily be determined by the Edman method, most polypeptides contain several hundred amino acids. Consequently, most polypeptides must first be cleaved into smaller peptides prior to Edman sequencing. Cleavage also may be necessary to circumvent posttranslational modifications that render a protein’s α-amino group “blocked”, or unreactive with the Edman reagent.

It usually is necessary to generate several peptides using more than one method of cleavage. This reflects both inconsistency in the spacing of chemically or enzymatically susceptible cleavage sites and the need for sets of peptides whose sequences overlap so one can infer the sequence of the polypeptide from which they derive (Figure 4–7). Reagents for the chemical or enzymatic cleavage of proteins include cyanogen bromide (CNBr), trypsin, and Staphylococcus aureus V8 protease (Table 4–1). Following cleavage, the resulting peptides are purified by reversed-phase HPLC (or occasionally by SDS-PAGE) and sequenced.
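The overlap reasoning of Figure 4–7 can be sketched in a few lines: a peptide Z from a second digest orders two peptides X and Y if it begins with the end of one and finishes with the start of the other. The function names and example sequences are hypothetical, and the sketch ignores complications such as repeated sequences.

```python
def bridges(first: str, second: str, overlap: str) -> bool:
    """True if `overlap` spans the junction first -> second."""
    for i in range(1, len(overlap)):
        # overlap[:i] must end `first`; overlap[i:] must begin `second`.
        if first.endswith(overlap[:i]) and second.startswith(overlap[i:]):
            return True
    return False


def order_peptides(x: str, y: str, z: str):
    """Return the joined sequence implied by overlap peptide z, or None."""
    if bridges(x, y, z):
        return x + y
    if bridges(y, x, z):
        return y + x
    return None
```

For instance, with X = EAKGY, Y = AHMLS and Z = GYAH, the only arrangement consistent with Z is X followed by Y.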

MOLECULAR BIOLOGY HAS REVOLUTIONIZED THE DETERMINATION OF PRIMARY STRUCTURE

Knowledge of DNA sequences permits deduction of the primary structures of polypeptides. DNA sequencing requires only minute amounts of DNA and can readily yield the sequence of hundreds of nucleotides. To clone and sequence the DNA that encodes a particular protein, some means of identifying the correct clone (eg, knowledge of a portion of its nucleotide sequence) is essential. A hybrid approach thus has emerged. Edman sequencing is used to provide a partial amino acid sequence. Oligonucleotide primers modeled on this partial sequence can then be used to identify clones or to amplify the appropriate gene by the polymerase chain reaction (PCR) (see Chapter 40). Once an authentic DNA clone is obtained, its oligonucleotide sequence can be determined and the genetic code used to infer the primary structure of the encoded polypeptide.

Figure 4–6. The Edman reaction. Phenylisothiocyanate derivatizes the amino-terminal residue of a peptide as a phenylthiohydantoic acid. Treatment with acid in a nonhydroxylic solvent releases a phenylthiohydantoin, which is subsequently identified by its chromatographic mobility, and a peptide one residue shorter. The process is then repeated.

Figure 4–7. The overlapping peptide Z is used to deduce that peptides X and Y are present in the original protein in the order X → Y, not Y → X. [Diagram: peptide Z spans the carboxyl terminal portion of peptide X and the amino terminal portion of peptide Y.]

The hybrid approach enhances the speed and efficiency of primary structure analysis and the range of proteins that can be sequenced. It also circumvents obstacles such as the presence of an amino-terminal blocking group or the lack of a key overlap peptide. Only a few segments of primary structure must be determined by Edman analysis.
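The final step of the hybrid approach, using the genetic code to infer the encoded polypeptide from a DNA sequence, can be sketched as follows. The example assumes the standard genetic code, a coding-strand sequence already in the correct reading frame, and no introns; the example sequence is hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence until the first stop codon."""
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[start:start + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)
```

Translating `ATGGAAGCTAAAGGTTATGCTTAA` yields MEAKGYA, that is, an initiating methionine followed by the hexapeptide Glu-Ala-Lys-Gly-Tyr-Ala.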

DNA sequencing reveals the order in which amino acids are added to the nascent polypeptide chain as it is synthesized on the ribosomes. However, it provides no information about posttranslational modifications such as proteolytic processing, methylation, glycosylation, phosphorylation, hydroxylation of proline and lysine, and disulfide bond formation that accompany maturation. While Edman sequencing can detect the presence of most posttranslational events, technical limitations often prevent identification of a specific modification.

Table 4–1. Methods for cleaving polypeptides.

Method                   Bond Cleaved
-----------------------  ------------------------------------------
CNBr                     Met-X
Trypsin                  Lys-X and Arg-X
Chymotrypsin             Hydrophobic amino acid-X
Endoproteinase Lys-C     Lys-X
Endoproteinase Arg-C     Arg-X
Endoproteinase Asp-N     X-Asp
V8 protease              Glu-X, particularly where X is hydrophobic
Hydroxylamine            Asn-Gly
o-Iodosobenzene          Trp-X
Mild acid                Asp-Pro
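Several of the rules in Table 4–1 have the same form (cleave on the carboxyl side of a given residue), so they can be simulated with a few lines of code. This is a simplified sketch: it covers only those entries, uses hypothetical method keys, and ignores real-world exceptions and the preference noted for V8 protease.

```python
# Residues after which each method cleaves (one-letter codes).
CLEAVE_AFTER = {
    "CNBr": "M",
    "trypsin": "KR",
    "Lys-C": "K",
    "Arg-C": "R",
    "V8": "E",
}


def digest(sequence: str, method: str) -> list:
    """Split a one-letter sequence on the carboxyl side of the target residues."""
    targets = CLEAVE_AFTER[method]
    peptides, start = [], 0
    for index, residue in enumerate(sequence):
        # A target at the carboxyl terminus has no bond after it to cleave.
        if residue in targets and index < len(sequence) - 1:
            peptides.append(sequence[start:index + 1])
            start = index + 1
    peptides.append(sequence[start:])
    return peptides
```

Digesting the same sequence with two methods, for example `digest(seq, "trypsin")` and `digest(seq, "CNBr")`, produces the kind of overlapping peptide sets needed to reconstruct the original order.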

Figure 4–8. Basic components of a simple mass spectrometer. A mixture of molecules is vaporized in an ionized state in the sample chamber S. These molecules are then accelerated down the flight tube by an electrical potential applied to accelerator grid A. An adjustable electromagnet, E, applies a magnetic field that deflects the flight of the individual ions until they strike the detector, D. The greater the mass of the ion, the higher the magnetic field required to focus it onto the detector.

MASS SPECTROMETRY DETECTS COVALENT MODIFICATIONS

Mass spectrometry, which discriminates molecules based solely on their mass, is ideal for detecting the phosphate, hydroxyl, and other groups on posttranslationally modified amino acids. Each adds a specific and readily identified increment of mass to the modified amino acid (Table 4–2). For analysis by mass spectrometry, a sample in a vacuum is vaporized under conditions where protonation can occur, imparting positive charge. An electrical field then propels the cations through a magnetic field which deflects them at a right angle to their original direction of flight and focuses them onto a detector (Figure 4–8). The magnetic force required to deflect the path of each ionic species onto the detector, measured as the current applied to the electromagnet, is recorded. For ions of identical net charge, this force is proportionate to their mass. In a time-of-flight mass spectrometer, a briefly applied electric field accelerates the ions towards a detector that records the time at which each ion arrives. For molecules of identical charge, the velocity to which they are accelerated decreases as their mass increases, so heavier ions take longer to reach the detector.
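The time-of-flight principle can be made concrete with an idealized calculation. The sketch assumes that all of the electrical energy (charge times accelerating voltage) is converted to kinetic energy, so that the flight time over a tube of length L is L × sqrt(m / (2 z e V)). The voltage and tube length are hypothetical defaults, not instrument specifications.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def flight_time(mass_da: float, charge: int = 1,
                voltage: float = 20000.0, length_m: float = 1.0) -> float:
    """Ideal flight time in seconds for an ion of the given mass and charge."""
    mass_kg = mass_da * DALTON
    # z * e * V = (1/2) * m * v**2, solved for v.
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * voltage / mass_kg)
    return length_m / velocity
```

Under these idealized assumptions, quadrupling the mass doubles the flight time, and a doubly charged ion arrives sooner than a singly charged ion of the same mass.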

Conventional mass spectrometers generally are used to determine the masses of molecules of 1000 Da or less, whereas time-of-flight mass spectrometers are suited for determining the large masses of proteins. The analysis of peptides and proteins by mass spectrometry initially was hindered by difficulties in volatilizing large organic molecules. However, matrix-assisted laser-desorption (MALDI) and electrospray dispersion (eg, nanospray) permit the masses of even large polypeptides (> 100,000 Da) to be determined with extraordinary accuracy (± 1 Da). Using electrospray dispersion, peptides eluting from a reversed-phase HPLC column are introduced directly into the mass spectrometer for immediate determination of their masses.

Peptides inside the mass spectrometer are broken down into smaller units by collisions with neutral helium atoms (collision-induced dissociation), and the masses of the individual fragments are determined. Since peptide bonds are much more labile than carbon-carbon bonds, the most abundant fragments will differ from one another by units equivalent to one or two amino acids. Since, with the exception of leucine and isoleucine, the molecular mass of each amino acid is unique, the sequence of the peptide can be reconstructed from the masses of its fragments.
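The idea of reading a sequence from fragment masses can be illustrated with a simplified model. The sketch assumes an idealized, complete ladder of fragments whose masses are cumulative sums of residue masses (terminal groups and charge carriers are ignored), and uses assumed monoisotopic residue masses that are not given in the text. Leucine and isoleucine share one entry because their masses are identical.

```python
# Monoisotopic residue masses in Da (assumed standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def ladder_from(residues):
    """Build the idealized fragment ladder (cumulative masses) for a residue list."""
    total, masses = 0.0, []
    for residue in residues:
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


def read_sequence(fragment_masses, tolerance=0.01):
    """Assign a residue to each successive mass difference in the ladder."""
    residues, previous = [], 0.0
    for mass in sorted(fragment_masses):
        step = mass - previous
        matches = [name for name, value in RESIDUE_MASS.items()
                   if abs(value - step) <= tolerance]
        residues.append(matches[0] if len(matches) == 1 else "?")
        previous = mass
    return residues
```

A step that matches no single residue is reported as "?", which is where real spectra with missing or two-residue fragments would need further interpretation.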

Tandem Mass Spectrometry

Complex peptide mixtures can now be analyzed without prior purification by tandem mass spectrometry, which employs the equivalent of two mass spectrometers linked in series. The first spectrometer separates individual peptides based upon their differences in mass. By adjusting the field strength of the first magnet, a single peptide can be directed into the second mass spectrometer, where fragments are generated and their masses determined. As the sensitivity and versatility of mass spectrometry continue to increase, it is displacing Edman sequencers for the direct analysis of protein primary structure.

Table 4–2. Mass increases resulting from common posttranslational modifications.

Modification        Mass Increase (Da)
------------------  ------------------
Phosphorylation      80
Hydroxylation        16
Methylation          14
Acetylation          42
Myristylation       210
Palmitoylation      238
Glycosylation       162
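The mass increments of Table 4–2 lend themselves to a simple lookup: given the measured mass of a peptide and the mass expected for its unmodified form, list the modifications whose increment accounts for the difference. The function name and the tolerance are illustrative assumptions, and the sketch considers only a single modification per peptide.

```python
# Mass increases in Da, taken from Table 4-2.
MASS_INCREASE = {
    "Phosphorylation": 80,
    "Hydroxylation": 16,
    "Methylation": 14,
    "Acetylation": 42,
    "Myristylation": 210,
    "Palmitoylation": 238,
    "Glycosylation": 162,
}


def candidate_modifications(observed: float, unmodified: float,
                            tolerance: float = 0.5) -> list:
    """Return the modifications consistent with the observed mass shift."""
    shift = observed - unmodified
    return [name for name, increase in MASS_INCREASE.items()
            if abs(shift - increase) <= tolerance]
```

For example, a peptide observed 80 Da above its expected mass is consistent with phosphorylation.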


GENOMICS ENABLES PROTEINS TO BE IDENTIFIED FROM SMALL AMOUNTS OF SEQUENCE DATA

Primary structure analysis has been revolutionized by genomics, the application of automated oligonucleotide sequencing and computerized data retrieval and analysis to sequence an organism’s entire genetic complement. The first genome sequenced was that of Haemophilus influenzae, in 1995. By mid 2001, the complete genome sequences for over 50 organisms had been determined. These include the human genome and those of several bacterial pathogens; the results and significance of the Human Genome Project are discussed in Chapter 54. Where genome sequence is known, the task of determining a protein’s DNA-derived primary sequence is materially simplified. In essence, the second half of the hybrid approach has already been completed. All that remains is to acquire sufficient information to permit the open reading frame (ORF) that encodes the protein to be retrieved from an Internet-accessible genome database and identified. In some cases, a segment of amino acid sequence only four or five residues in length may be sufficient to identify the correct ORF.

Computerized search algorithms assist the identification of the gene encoding a given protein and clarify uncertainties that arise from Edman sequencing and mass spectrometry. By exploiting computers to solve complex puzzles, the spectrum of information suitable for identification of the ORF that encodes a particular polypeptide is greatly expanded. In peptide mass profiling, for example, a peptide digest is introduced into the mass spectrometer and the sizes of the peptides are determined. A computer is then used to find an ORF whose predicted protein product would, if broken down into peptides by the cleavage method selected, produce a set of peptides whose masses match those observed by mass spectrometry.
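Peptide mass profiling as described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the cleavage method is taken to be trypsin (cutting after lysine or arginine unless the next residue is proline), the residue masses are standard monoisotopic values rounded to five decimals, the three ORF products are invented, and the "observed" masses are simulated from one of them rather than measured.

```python
# Monoisotopic residue masses (Da), rounded; a free peptide adds one water.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(protein):
    """Cleave after K or R, except when the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Mass of a linear peptide: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def matched_masses(observed, protein, tolerance=0.5):
    """Count observed masses that match any predicted peptide of this protein."""
    predicted = [peptide_mass(p) for p in tryptic_peptides(protein)]
    return sum(any(abs(m - q) <= tolerance for q in predicted) for m in observed)


def best_orf(observed, database, tolerance=0.5):
    """Return the ORF whose predicted digest explains the most observed masses."""
    return max(database, key=lambda name: matched_masses(observed, database[name], tolerance))


# Hypothetical ORF products; the "observed" masses are simulated from orfB.
ORFS = {
    "orfA": "MKTAYIAKQRQISFVK",
    "orfB": "MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIR",
    "orfC": "MAHHWGYGKHNGPEHWHK",
}
observed = [peptide_mass(p) for p in tryptic_peptides(ORFS["orfB"])]
print(best_orf(observed, ORFS))  # orfB
```

Real search programs also allow for missed cleavages, modified residues, and measurement error, all of which this sketch ignores.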

PROTEOMICS & THE PROTEOME

The Goal of Proteomics Is to Identify the Entire Complement of Proteins Elaborated by a Cell Under Diverse Conditions

While the sequence of the human genome is known, the picture provided by genomics alone is both static and incomplete. Proteomics aims to identify the entire complement of proteins elaborated by a cell under diverse conditions. As genes are switched on and off, proteins are synthesized in particular cell types at specific times of growth or differentiation and in response to external stimuli. Muscle cells express proteins not expressed by neural cells, and the types of subunits present in the hemoglobin tetramer change pre- and postpartum. Many proteins undergo posttranslational modifications during maturation into functionally competent forms or as a means of regulating their properties. Knowledge of the human genome therefore represents only the beginning of the task of describing living organisms in molecular detail and understanding the dynamics of processes such as growth, aging, and disease. As the human body contains thousands of cell types, each containing thousands of proteins, the proteome (the set of all the proteins expressed by an individual cell at a particular time) represents a moving target of formidable dimensions.

Two-Dimensional Electrophoresis & Gene Array Chips Are Used to Survey Protein Expression

One goal of proteomics is the identification of proteins whose levels of expression correlate with medically significant events. The presumption is that proteins whose appearance or disappearance is associated with a specific physiologic condition or disease will provide insights into root causes and mechanisms. Determination of the proteomes characteristic of each cell type requires the utmost efficiency in the isolation and identification of individual proteins. The contemporary approach utilizes robotic automation to speed sample preparation and large two-dimensional gels to resolve cellular proteins. Individual polypeptides are then extracted and analyzed by Edman sequencing or mass spectrometry. While only about 1000 proteins can be resolved on a single gel, two-dimensional electrophoresis has a major advantage in that it examines the proteins themselves. An alternative and complementary approach employs gene arrays, sometimes called DNA chips, to detect the expression of the mRNAs which encode proteins. While changes in the expression of the mRNA encoding a protein do not necessarily reflect comparable changes in the level of the corresponding protein, gene arrays are more sensitive probes than two-dimensional gels and thus can examine more gene products.

Bioinformatics Assists Identification of Protein Functions

The functions of a large proportion of the proteins encoded by the human genome are presently unknown. Recent advances in bioinformatics permit researchers to compare amino acid sequences to discover clues to potential properties, physiologic roles, and mechanisms of action of proteins. Algorithms exploit the tendency of nature to employ variations of a structural theme to perform similar functions in several proteins (eg, the Rossmann nucleotide binding fold to bind NAD(P)H, nuclear targeting sequences, and EF hands to bind Ca²⁺). These domains generally are detected in the primary structure by conservation of particular amino acids at key positions. Insights into the properties and physiologic role of a newly discovered protein thus may be inferred by comparing its primary structure with that of known proteins.
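Detecting a domain by conservation of particular residues at key positions amounts to pattern matching on the primary structure. The sketch below scans a sequence for the glycine-rich pattern G-x-G-x-x-G, a consensus commonly cited for Rossmann-type nucleotide-binding folds; the choice of that pattern, the example sequence, and the function name are assumptions for illustration and are not taken from this chapter.

```python
import re

# Glycine-rich consensus often associated with Rossmann-type nucleotide-binding
# folds (assumed here for illustration); "." stands for any residue.
GLYCINE_RICH_MOTIF = re.compile(r"G.G..G")


def find_motif(sequence, motif=GLYCINE_RICH_MOTIF):
    """Return (1-based start position, matched segment) for each motif occurrence."""
    return [(match.start() + 1, match.group()) for match in motif.finditer(sequence)]


# Hypothetical amino terminal segment of a newly discovered protein.
print(find_motif("MKVAVIGAGSWGTALA"))  # [(7, 'GAGSWG')]
```

A match of this kind is only a clue: short patterns occur by chance, so practical tools weigh many conserved positions and compare whole families of aligned sequences.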

SUMMARY

• Long amino acid polymers or polypeptides constitute the basic structural unit of proteins, and the structure of a protein provides insight into how it fulfills its functions.

• The Edman reaction enabled amino acid sequence analysis to be automated. Mass spectrometry provides a sensitive and versatile tool for determining primary structure and for the identification of posttranslational modifications.

• DNA cloning and molecular biology coupled with protein chemistry provide a hybrid approach that greatly increases the speed and efficiency for determination of primary structures of proteins.

• Genomics, the analysis of the entire oligonucleotide sequence of an organism’s complete genetic material, has provided further enhancements.

• Computer algorithms facilitate identification of the open reading frames that encode a given protein by using partial sequences and peptide mass profiling to search sequence databases.

• Scientists are now trying to determine the primary sequence and functional role of every protein expressed in a living cell, known as its proteome.

• A major goal is the identification of proteins whose appearance or disappearance correlates with physiologic phenomena, aging, or specific diseases.


5. Proteins: Higher Orders of Structure

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

Proteins catalyze metabolic reactions, power cellular motion, and form macromolecular rods and cables that provide structural integrity to hair, bones, tendons, and teeth. In nature, form follows function. The structural variety of human proteins therefore reflects the sophistication and diversity of their biologic roles. Maturation of a newly synthesized polypeptide into a biologically functional protein requires that it be folded into a specific three-dimensional arrangement, or conformation. During maturation, posttranslational modifications may add new chemical groups or remove transiently needed peptide segments. Genetic or nutritional deficiencies that impede protein maturation are deleterious to health. Examples of the former include Creutzfeldt-Jakob disease, scrapie, Alzheimer’s disease, and bovine spongiform encephalopathy (mad cow disease). Scurvy represents a nutritional deficiency that impairs protein maturation.

CONFORMATION VERSUS CONFIGURATION

The terms configuration and conformation are often confused. Configuration refers to the geometric relationship between a given set of atoms, for example, those that distinguish L- from D-amino acids. Interconversion of configurational alternatives requires breaking covalent bonds. Conformation refers to the spatial relationship of every atom in a molecule. Interconversion between conformers occurs without covalent bond rupture, with retention of configuration, and typically via rotation about single bonds.

PROTEINS WERE INITIALLY CLASSIFIED BY THEIR GROSS CHARACTERISTICS

Scientists initially approached structure-function relationships in proteins by separating them into classes based upon properties such as solubility, shape, or the presence of nonprotein groups. For example, the proteins that can be extracted from cells using solutions at physiologic pH and ionic strength are classified as soluble. Extraction of integral membrane proteins requires dissolution of the membrane with detergents.

Globular proteins are compact, are roughly spherical or ovoid in shape, and have axial ratios (the ratio of their longest to shortest dimensions) of not over 3. Most enzymes are globular proteins, whose large internal volume provides ample space in which to construct cavities of the specific shape, charge, and hydrophobicity or hydrophilicity required to bind substrates and promote catalysis. By contrast, many structural proteins adopt highly extended conformations. These fibrous proteins possess axial ratios of 10 or more.

Lipoproteins and glycoproteins contain covalently bound lipid and carbohydrate, respectively. Myoglobin, hemoglobin, cytochromes, and many other proteins contain tightly associated metal ions and are termed metalloproteins. With the development and application of techniques for determining the amino acid sequences of proteins (Chapter 4), more precise classification schemes have emerged based upon similarity, or homology, in amino acid sequence and structure. However, many early classification terms remain in common use.

PROTEINS ARE CONSTRUCTED USING MODULAR PRINCIPLES

Proteins perform complex physical and catalytic functions by positioning specific chemical groups in a precise three-dimensional arrangement. The polypeptide scaffold containing these groups must adopt a conformation that is both functionally efficient and physically strong. At first glance, the biosynthesis of polypeptides composed of tens of thousands of individual atoms would appear to be extremely challenging. When one considers that a typical polypeptide can adopt ≥ 10⁵⁰ distinct conformations, folding into the conformation appropriate to its biologic function would appear to be even more difficult. As described in Chapters 3 and 4, synthesis of the polypeptide backbones of proteins employs a small set of common building blocks or modules, the amino acids, joined by a common linkage, the peptide bond. A stepwise modular pathway simplifies the folding and processing of newly synthesized polypeptides into mature proteins.

Figure 5–1. Ramachandran plot of the main chain phi (Φ) and psi (Ψ) angles for approximately 1000 nonglycine residues in eight proteins whose structures were solved at high resolution. The dots represent allowable combinations and the spaces prohibited combinations of phi and psi angles. (Reproduced, with permission, from Richardson JS: The anatomy and taxonomy of protein structures. Adv Protein Chem 1981;34:167.)

THE FOUR ORDERS OF PROTEIN STRUCTURE

The modular nature of protein synthesis and folding is embodied in the concept of orders of protein structure: primary structure, the sequence of the amino acids in a polypeptide chain; secondary structure, the folding of short (3- to 30-residue), contiguous segments of polypeptide into geometrically ordered units; tertiary structure, the three-dimensional assembly of secondary structural units to form larger functional units such as the mature polypeptide and its component domains; and quaternary structure, the number and types of polypeptide units of oligomeric proteins and their spatial arrangement.

SECONDARY STRUCTURE

Peptide Bonds Restrict Possible Secondary Conformations

Free rotation is possible about only two of the three covalent bonds of the polypeptide backbone: the α-carbon (Cα) to the carbonyl carbon (Co) bond and the Cα to nitrogen bond (Figure 3–4). The partial double-bond character of the peptide bond that links Co to the α-nitrogen requires that the carbonyl carbon, carbonyl oxygen, and α-nitrogen remain coplanar, thus preventing rotation. The angle about the Cα-N bond is termed the phi (Φ) angle, and that about the Co-Cα bond the psi (Ψ) angle. For amino acids other than glycine, most combinations of phi and psi angles are disallowed because of steric hindrance (Figure 5–1). The conformations of proline are even more restricted due to the absence of free rotation of the N-Cα bond.

Regions of ordered secondary structure arise when a series of aminoacyl residues adopt similar phi and psi angles. Extended segments of polypeptide (eg, loops) can possess a variety of such angles. The angles that define the two most common types of secondary structure, the α helix and the β sheet, fall within the lower and upper left-hand quadrants of a Ramachandran plot, respectively (Figure 5–1).
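The quadrant rule just stated can be expressed as a small classifier. This is a deliberately coarse sketch: it assumes phi is plotted horizontally and psi vertically with both axes centered on 0 degrees, and it treats a whole quadrant as one region, whereas the allowed areas of a real Ramachandran plot are much smaller. The function names and the β-strand example angles are illustrative assumptions.

```python
def ramachandran_quadrant(phi, psi):
    """Name the quadrant of a (phi, psi) pair, with both angles in degrees."""
    horizontal = "left" if phi < 0 else "right"
    vertical = "upper" if psi >= 0 else "lower"
    return f"{vertical} {horizontal}"


def likely_secondary_structure(phi, psi):
    """Coarse guess based only on the quadrant rule stated in the text."""
    regions = {
        "lower left": "alpha helix region",
        "upper left": "beta sheet region",
    }
    return regions.get(ramachandran_quadrant(phi, psi), "other")


print(likely_secondary_structure(-57, -47))   # alpha helix region
print(likely_secondary_structure(-120, 130))  # beta sheet region (illustrative angles)
print(likely_secondary_structure(60, 45))     # other
```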

The Alpha Helix

The polypeptide backbone of an α helix is twisted by an equal amount about each α-carbon with a phi angle of approximately −57 degrees and a psi angle of approximately −47 degrees. A complete turn of the helix contains an average of 3.6 aminoacyl residues, and the distance it rises per turn (its pitch) is 0.54 nm (Figure 5–2). The R groups of each aminoacyl residue in an α helix face outward (Figure 5–3). Proteins contain only L-amino acids, for which a right-handed α helix is by far the more stable, and only right-handed α helices occur in nature. Schematic diagrams of proteins represent α helices as cylinders.

The stability of an α helix arises primarily from hydrogen bonds formed between the oxygen of the peptide bond carbonyl and the hydrogen atom of the peptide bond nitrogen of the fourth residue down the polypeptide chain (Figure 5–4). The ability to form the maximum number of hydrogen bonds, supplemented by van der Waals interactions in the core of this tightly packed structure, provides the thermodynamic driving force for the formation of an α helix. Since the peptide bond nitrogen of proline lacks a hydrogen atom to contribute to a hydrogen bond, proline can only be stably accommodated within the first turn of an α helix. When present elsewhere, proline disrupts the conformation of the helix, producing a bend. Because of its small size, glycine also often induces bends in α helices.
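The helix dimensions and the hydrogen-bonding pattern given above lend themselves to a short worked calculation. The sketch below derives the rise per residue from the pitch and the number of residues per turn (0.54 nm / 3.6 = 0.15 nm), estimates the length of an ideal helical segment, and lists the carbonyl-to-amide hydrogen bond partners (residue i to residue i + 4). The function names are illustrative, and the numbering assumes an idealized helix with no bends.

```python
PITCH_NM = 0.54          # rise per complete turn, from the text
RESIDUES_PER_TURN = 3.6  # average residues per turn, from the text


def rise_per_residue_nm():
    """Axial rise contributed by each residue (0.54 nm / 3.6 = 0.15 nm)."""
    return PITCH_NM / RESIDUES_PER_TURN


def helix_length_nm(n_residues):
    """Approximate axial length of an ideal alpha helix of n residues."""
    return n_residues * rise_per_residue_nm()


def hydrogen_bond_pairs(n_residues):
    """Pairs (i, i + 4): C=O of residue i bonded to N-H of residue i + 4 (1-based)."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(round(rise_per_residue_nm(), 2))  # 0.15
print(round(helix_length_nm(36), 2))    # 5.4
print(hydrogen_bond_pairs(8))           # [(1, 5), (2, 6), (3, 7), (4, 8)]
```

The list of pairs also shows why the amide groups of the first few residues and the carbonyl groups of the last few have no partner within the helix.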

Figure 5–2. Orientation of the main chain atoms of a peptide about the axis of an α helix. (Dimensions labeled in the figure: 0.15 nm; 0.54-nm pitch, 3.6 residues.)

Figure 5–3. View down the axis of an α helix. The side chains (R) are on the outside of the helix. The van der Waals radii of the atoms are larger than shown here; hence, there is almost no free space inside the helix. (Slightly modified and reproduced, with permission, from Stryer L: Biochemistry, 3rd ed. Freeman, 1995. Copyright © 1995 by W.H. Freeman and Co.)

Many α helices have predominantly hydrophobic R groups on one side of the axis of the helix and predominantly hydrophilic ones on the other. These amphipathic helices are well adapted to the formation of interfaces between polar and nonpolar regions such as the hydrophobic interior of a protein and its aqueous environment. Clusters of amphipathic helices can create a channel, or pore, that permits specific polar molecules to pass through hydrophobic cell membranes.
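Whether a helical segment is amphipathic can be estimated from its sequence alone. With 3.6 residues per turn, successive residues are offset by 360/3.6 = 100 degrees around the helix axis, so the angular positions of the hydrophobic residues can be averaged to see whether they cluster on one face. The sketch below is a simplified helical-wheel calculation; the set of residues treated as hydrophobic and the two test sequences are assumptions for illustration.

```python
import math

DEGREES_PER_RESIDUE = 360 / 3.6  # 100 degrees between successive residues
HYDROPHOBIC = set("AVLIMFW")     # assumed hydrophobic residues (illustrative)


def hydrophobic_face_score(sequence):
    """Return a value from 0 to 1; higher means hydrophobic residues share one face.

    Each hydrophobic residue contributes a unit vector pointing at its position
    on the helical wheel; the score is the length of the mean vector.
    """
    x_sum = y_sum = 0.0
    count = 0
    for index, residue in enumerate(sequence):
        if residue in HYDROPHOBIC:
            angle = math.radians(index * DEGREES_PER_RESIDUE)
            x_sum += math.cos(angle)
            y_sum += math.sin(angle)
            count += 1
    return math.hypot(x_sum, y_sum) / count if count else 0.0


# Leucines spaced i, i+3, i+4 fall on one face; strictly alternating ones do not.
print(hydrophobic_face_score("LKKLLKKLLKKL") > hydrophobic_face_score("LKLKLKLKLKLK"))  # True
```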

The Beta Sheet

The second (hence “beta”) recognizable regular secondary structure in proteins is the β sheet. The amino acid residues of a β sheet, when viewed edge-on, form a zigzag or pleated pattern in which the R groups of adjacent residues point in opposite directions. Unlike the compact backbone of the α helix, the peptide backbone of the β sheet is highly extended. But like the α helix, β sheets derive much of their stability from hydrogen bonds between the carbonyl oxygens and amide hydrogens of peptide bonds. However, in contrast to the α helix, these bonds are formed with adjacent segments of β sheet (Figure 5–5).

Interacting β sheets can be arranged either to form a parallel β sheet, in which the adjacent segments of the polypeptide chain proceed in the same direction amino to carboxyl, or an antiparallel sheet, in which they proceed in opposite directions (Figure 5–5). Either configuration permits the maximum number of hydrogen bonds between segments, or strands, of the sheet. Most β sheets are not perfectly flat but tend to have a right-handed twist. Clusters of twisted strands of β sheet form the core of many globular proteins (Figure 5–6). Schematic diagrams represent β sheets as arrows that point in the amino to carboxyl terminal direction.

Loops & Bends

Roughly half of the residues in a “typical” globular protein reside in α helices and β sheets and half in loops, turns, bends, and other extended conformational features. Turns and bends refer to short segments of amino acids that join two units of secondary structure, such as two adjacent strands of an antiparallel β sheet. A β turn involves four aminoacyl residues, in which the first residue is hydrogen-bonded to the fourth, resulting in a tight 180-degree turn (Figure 5–7). Proline and glycine often are present in β turns.

Loops are regions that contain residues beyond the minimum number necessary to connect adjacent regions of secondary structure. Irregular in conformation, loops nevertheless serve key biologic roles. For many enzymes, the loops that bridge domains responsible for binding substrates often contain aminoacyl residues that participate in catalysis. Helix-loop-helix motifs provide the oligonucleotide-binding portion of DNA-binding proteins such as repressors and transcription factors. Structural motifs such as the helix-loop-helix motif that are intermediate between secondary and tertiary structures are often termed supersecondary structures. Since many loops and bends reside on the surface of proteins and are thus exposed to solvent, they constitute readily accessible sites, or epitopes, for recognition and binding of antibodies.

Figure 5–4. Hydrogen bonds (dotted lines) formed between H and O atoms stabilize a polypeptide in an α-helical conformation. (Reprinted, with permission, from Haggis GH et al: Introduction to Molecular Biology. Wiley, 1964.)

Figure 5–5. Spacing and bond angles of the hydrogen bonds of antiparallel and parallel pleated β sheets. Arrows indicate the direction of each strand. The hydrogen-donating α-nitrogen atoms are shown as blue circles. Hydrogen bonds are indicated by dotted lines. For clarity in presentation, R groups and hydrogens are omitted. Top: Antiparallel β sheet. Pairs of hydrogen bonds alternate between being close together and wide apart and are oriented approximately perpendicular to the polypeptide backbone. Bottom: Parallel β sheet. The hydrogen bonds are evenly spaced but slant in alternate directions.

While loops lack apparent structural regularity, they exist in a specific conformation stabilized through hydrogen bonding, salt bridges, and hydrophobic interactions with other portions of the protein. However, not all portions of proteins are necessarily ordered. Proteins may contain “disordered” regions, often at the extreme amino or carboxyl terminal, characterized by high conformational flexibility. In many instances, these disordered regions assume an ordered conformation upon binding of a ligand. This structural flexibility enables such regions to act as ligand-controlled switches that affect protein structure and function.

Tertiary & Quaternary Structure

The term “tertiary structure” refers to the entire three-dimensional conformation of a polypeptide. It indicates, in three-dimensional space, how secondary structural features (helices, sheets, bends, turns, and loops) assemble to form domains and how these domains relate spatially to one another. A domain is a section of protein structure sufficient to perform a particular chemical or physical task such as binding of a substrate or other ligand. Other domains may anchor a protein to a membrane or interact with a regulatory molecule that modulates its function. A small polypeptide such as triose phosphate isomerase (Figure 5–6) or myoglobin (Chapter 6) may consist of a single domain. By contrast, protein kinases contain two domains. Protein kinases catalyze the transfer of a phosphoryl group from ATP to a peptide or protein. The amino terminal portion of the polypeptide, which is rich in β sheet, binds ATP, while the carboxyl terminal domain, which is rich in α helix, binds the peptide or protein substrate (Figure 5–8). The groups that catalyze phosphoryl transfer reside in a loop positioned at the interface of the two domains.

Figure 5–6. Examples of tertiary structure of proteins. Top: The enzyme triose phosphate isomerase. Note the elegant and symmetrical arrangement of alternating β sheets and α helices. (Courtesy of J Richardson.) Bottom: Two-domain structure of the subunit of a homodimeric enzyme, a bacterial class II HMG-CoA reductase. As indicated by the numbered residues, the single polypeptide begins in the large domain, enters the small domain, and ends in the large domain. (Courtesy of C Lawrence, V Rodwell, and C Stauffacher, Purdue University.)

Figure 5–7. A β-turn that links two segments of antiparallel β sheet. The dotted line indicates the hydrogen bond between the first and fourth amino acids of the four-residue segment Ala-Gly-Asp-Ser.

In some cases, proteins are assembled from more than one polypeptide, or protomer. Quaternary structure defines the polypeptide composition of a protein and, for an oligomeric protein, the spatial relationships between its subunits or protomers. Monomeric proteins consist of a single polypeptide chain. Dimeric proteins contain two polypeptide chains. Homodimers contain two copies of the same polypeptide chain, while in a heterodimer the polypeptides differ. Greek letters (α, β, γ, etc) are used to distinguish different subunits of a heterooligomeric protein, and subscripts indicate the number of each subunit type. For example, α₄ designates a homotetrameric protein, and α₂β₂γ a protein with five subunits of three different types.
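The subunit notation described above is regular enough to be parsed mechanically. The sketch below reads a formula in which the subscripts are typed as plain digits (for example "α2β2γ"), counts each subunit type, and reports the total number of subunits and whether the protein is homo- or heterooligomeric. The function names and the plain-digit convention are assumptions made for this illustration.

```python
import re


def parse_subunits(formula):
    """Map each Greek subunit letter to its copy number; a missing number means 1."""
    counts = {}
    for letter, number in re.findall(r"([α-ω])(\d*)", formula):
        counts[letter] = counts.get(letter, 0) + (int(number) if number else 1)
    return counts


def describe(formula):
    """Return (total number of subunits, 'homo' or 'hetero')."""
    counts = parse_subunits(formula)
    kind = "homo" if len(counts) == 1 else "hetero"
    return sum(counts.values()), kind


print(parse_subunits("α2β2γ"))  # {'α': 2, 'β': 2, 'γ': 1}
print(describe("α4"))           # (4, 'homo')
print(describe("α2β2γ"))        # (5, 'hetero')
```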

Since even small proteins contain many thousands of atoms, depictions of protein structure that indicate the position of every atom are generally too complex to be readily interpreted. Simplified schematic diagrams thus are used to depict key features of a protein’s tertiary and quaternary structure. Ribbon diagrams (Figures 5–6 and 5–8) trace the conformation of the polypeptide backbone, with cylinders and arrows indicating regions of α helix and β sheet, respectively. In an even simpler representation, line segments that link the α carbons indicate the path of the polypeptide backbone. These schematic diagrams often include the side chains of selected amino acids that emphasize specific structure-function relationships.

Figure 5–8. Domain structure. Protein kinases contain two domains. The upper, amino terminal domain binds the phosphoryl donor ATP (light blue). The lower, carboxyl terminal domain is shown binding a synthetic peptide substrate (dark blue).

MULTIPLE FACTORS STABILIZE TERTIARY & QUATERNARY STRUCTURE

Higher orders of protein structure are stabilized primarily, and often exclusively, by noncovalent interactions. Principal among these are hydrophobic interactions that drive most hydrophobic amino acid side chains into the interior of the protein, shielding them from water. Other significant contributors include hydrogen bonds and salt bridges between the carboxylates of aspartic and glutamic acid and the oppositely charged side chains of protonated lysyl, argininyl, and histidyl residues. While individually weak relative to a typical covalent bond of 80–120 kcal/mol, collectively these numerous interactions confer a high degree of stability to the biologically functional conformation of a protein, just as a Velcro fastener harnesses the cumulative strength of multiple plastic loops and hooks.

Some proteins contain covalent disulfide (S-S) bonds that link the sulfhydryl groups of cysteinyl residues. Formation of disulfide bonds involves oxidation of the cysteinyl sulfhydryl groups and requires oxygen. Intrapolypeptide disulfide bonds further enhance the stability of the folded conformation of a peptide, while interpolypeptide disulfide bonds stabilize the quaternary structure of certain oligomeric proteins.

THREE-DIMENSIONAL STRUCTURE IS DETERMINED BY X-RAY CRYSTALLOGRAPHY OR BY NMR SPECTROSCOPY

X-Ray Crystallography

Since the determination of the three-dimensional structure of myoglobin over 40 years ago, the three-dimensional structures of thousands of proteins have been determined by x-ray crystallography. The key to x-ray crystallography is the precipitation of a protein under conditions in which it forms ordered crystals that diffract x-rays. This is generally accomplished by exposing small drops of the protein solution to various combinations of pH and precipitating agents such as salts and organic solutes such as polyethylene glycol. A detailed three-dimensional structure of a protein can be constructed from its primary structure using the pattern by which it diffracts a beam of monochromatic x-rays. While the development of increasingly capable computer-based tools has rendered the analysis of complex x-ray diffraction patterns increasingly facile, a major stumbling block remains the requirement of inducing highly purified samples of the protein of interest to crystallize. Several lines of evidence, including the ability of some crystallized enzymes to catalyze chemical reactions, indicate that the vast majority of the structures determined by crystallography faithfully represent the structures of proteins in free solution.

Nuclear Magnetic Resonance Spectroscopy

Nuclear magnetic resonance (NMR) spectroscopy, a powerful complement to x-ray crystallography, measures the absorbance of radio frequency electromagnetic energy by certain atomic nuclei. “NMR-active” isotopes of biologically relevant atoms include ¹H, ¹³C, ¹⁵N, and ³¹P. The frequency, or chemical shift, at which a particular nucleus absorbs energy is a function of both the functional group within which it resides and the proximity of other NMR-active nuclei. Two-dimensional NMR spectroscopy permits a three-dimensional representation of a protein to be constructed by determining the proximity of these nuclei to one another. NMR spectroscopy analyzes proteins in aqueous solution, obviating the need to form crystals. It thus is possible to observe changes in conformation that accompany ligand binding or catalysis using NMR spectroscopy. However, only the spectra of relatively small proteins, ≤ 20 kDa in size, can be analyzed with current technology.

Molecular Modeling

An increasingly useful adjunct to the empirical determination of the three-dimensional structure of proteins is the use of computer technology for molecular modeling. The types of models created take two forms. In the first, the known three-dimensional structure of a protein is used as a template to build a model of the probable structure of a homologous protein. In the second, computer software is used to manipulate the static model provided by crystallography to explore how a protein’s conformation might change when ligands are bound or when temperature, pH, or ionic strength is altered. Scientists also are examining the library of available protein structures in an attempt to devise computer programs that can predict the three-dimensional conformation of a protein directly from its primary sequence.

PROTEIN FOLDING

The Native Conformation of a Protein Is Thermodynamically Favored

The number of distinct combinations of phi and psi angles specifying potential conformations of even a relatively small (15-kDa) polypeptide is unbelievably vast. Proteins are guided through this vast labyrinth of possibilities by thermodynamics. Since the biologically relevant (or native) conformation of a protein generally is that which is most energetically favored, knowledge of the native conformation is specified in the primary sequence. However, if one were to wait for a polypeptide to find its native conformation by random exploration of all possible conformations, the process would require billions of years to complete. Clearly, protein folding in cells takes place in a more orderly and guided fashion.
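The scale of the random-search problem can be checked with simple arithmetic. The sketch below takes the figure this chapter uses for a typical polypeptide, on the order of 10 to the 50th power conformations, and assumes purely for illustration that a chain could sample 10 to the 13th power conformations per second; that sampling rate is an assumption, not a value from the text.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def random_search_years(conformations, samples_per_second):
    """Years needed to visit every conformation once at a fixed sampling rate."""
    return conformations / samples_per_second / SECONDS_PER_YEAR


# 1e50 conformations (chapter's figure); 1e13 per second is an assumed rate.
years = random_search_years(1e50, 1e13)
print(f"{years:.1e} years")  # 3.2e+29 years
```

Even with this generous sampling rate, the result exceeds "billions of years" by some twenty orders of magnitude, which is the point of the argument: folding cannot proceed by exhaustive random search.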

Folding Is Modular

Protein folding generally occurs via a stepwise process. In the first stage, the newly synthesized polypeptide emerges from ribosomes, and short segments fold into secondary structural units that provide local regions of organized structure. Folding is now reduced to the selection of an appropriate arrangement of this relatively small number of secondary structural elements. In the second stage, the forces that drive hydrophobic regions into the interior of the protein away from solvent drive the partially folded polypeptide into a “molten globule” in which the modules of secondary structure rearrange to arrive at the mature conformation of the protein. This process is orderly but not rigid. Considerable flexibility exists in the ways and in the order in which elements of secondary structure can be rearranged. In general, each element of secondary or supersecondary structure facilitates proper folding by directing the folding process toward the native conformation and away from unproductive alternatives. For oligomeric proteins, individual protomers tend to fold before they associate with other subunits.

Auxiliary Proteins Assist Folding

Under appropriate conditions, many proteins will spontaneously refold after being previously denatured (ie, unfolded) by treatment with acid or base, chaotropic agents, or detergents. However, unlike the folding process in vivo, refolding under laboratory conditions is a far slower process. Moreover, some proteins fail to spontaneously refold in vitro, often forming insoluble aggregates, disordered complexes of unfolded or partially folded polypeptides held together by hydrophobic interactions. Aggregates represent unproductive dead ends in the folding process. Cells employ auxiliary proteins to speed the process of folding and to guide it toward a productive conclusion.

Chaperones

Chaperone proteins participate in the folding of over half of mammalian proteins. The hsp70 (70-kDa heat shock protein) family of chaperones binds short sequences of hydrophobic amino acids in newly synthesized polypeptides, shielding them from solvent. Chaperones prevent aggregation, thus providing an opportunity for the formation of appropriate secondary structural elements and their subsequent coalescence into a molten globule. The hsp60 family of chaperones, sometimes called chaperonins, differ in sequence and structure from hsp70 and its homologs. Hsp60 acts later in the folding process, often together with an hsp70 chaperone. The central cavity of the donut-shaped hsp60 chaperone provides a sheltered environment in which a polypeptide can fold until all hydrophobic regions are buried in its interior, eliminating aggregation. Chaperone proteins can also “rescue” proteins that have become thermodynamically trapped in a misfolded dead end by unfolding hydrophobic regions and providing a second chance to fold productively.

Protein Disulfide Isomerase

Disulfide bonds between and within polypeptides stabilize tertiary and quaternary structure. However, disulfide bond formation is nonspecific. Under oxidizing conditions, a given cysteine can form a disulfide bond with the –SH of any accessible cysteinyl residue. By catalyzing disulfide exchange, the rupture of an S–S bond and its reformation with a different partner cysteine, protein disulfide isomerase facilitates the formation of the disulfide bonds that stabilize a protein's native conformation.

Proline-cis,trans-Isomerase

All X-Pro peptide bonds, where X represents any residue, are synthesized in the trans configuration. However, of the X-Pro bonds of mature proteins, approximately 6% are cis. The cis configuration is particularly common in β-turns. Isomerization from trans to cis is catalyzed by the enzyme proline-cis,trans-isomerase (Figure 5–9).

SEVERAL NEUROLOGIC DISEASES RESULT FROM ALTERED PROTEIN CONFORMATION

Prions

The transmissible spongiform encephalopathies, or prion diseases, are fatal neurodegenerative diseases characterized by spongiform changes, astrocytic gliosis, and neuronal loss resulting from the deposition of insoluble protein aggregates in neural cells. They include Creutzfeldt-Jakob disease in humans, scrapie in sheep, and bovine spongiform encephalopathy (mad cow disease) in cattle. Prion diseases may manifest themselves as infectious, genetic, or sporadic disorders. Because no viral or bacterial gene encoding the pathologic prion protein could be identified, the source and mechanism of transmission of prion disease long remained elusive. Today it is believed that prion diseases are protein conformation diseases transmitted by altering the conformation, and hence the physical properties, of proteins endogenous to the host. Human prion-related protein, PrP, a glycoprotein encoded on the short arm of chromosome 20, normally is monomeric and rich in α helix. Pathologic prion proteins serve as the templates for the conformational transformation of normal PrP, known as PrPc, into PrPSc. PrPSc is rich in β sheet with many hydrophobic aminoacyl side chains exposed to solvent. PrPSc molecules therefore associate strongly with one another, forming insoluble protease-resistant aggregates. Since one pathologic prion or prion-related protein can serve as template for the conformational transformation of many times its number of PrPc molecules, prion diseases can be transmitted by the protein alone without involvement of DNA or RNA.

Alzheimer’s Disease

Refolding or misfolding of another protein endogenous to human brain tissue, β-amyloid, is also a prominent feature of Alzheimer’s disease. While the root cause of Alzheimer’s disease remains elusive, the characteristic senile plaques and neurofibrillary bundles contain aggregates of the protein β-amyloid, a 4.3-kDa polypeptide produced by proteolytic cleavage of a larger protein known as amyloid precursor protein. In Alzheimer’s disease patients, levels of β-amyloid become elevated, and this protein undergoes a conformational transformation from a soluble α helix–rich state to a state rich in β sheet and prone to self-aggregation. Apolipoprotein E has been implicated as a potential mediator of this conformational transformation.

COLLAGEN ILLUSTRATES THE ROLE OF POSTTRANSLATIONAL PROCESSING IN PROTEIN MATURATION

Protein Maturation Often Involves Making & Breaking Covalent Bonds

The maturation of proteins into their final structural state often involves the cleavage or formation (or both) of covalent bonds, a process termed posttranslational modification. Many polypeptides are initially synthesized as larger precursors, called proproteins. The “extra” polypeptide segments in these proproteins often serve as leader sequences that target a polypeptide to a particular organelle or facilitate its passage through a membrane. Others ensure that the potentially harmful activity of a protein such as the proteases trypsin and chymotrypsin remains inhibited until these proteins reach their final destination. However, once these transient requirements are fulfilled, the now superfluous peptide regions are removed by selective proteolysis. Other covalent modifications may take place that add new chemical functionalities to a protein. The maturation of collagen illustrates both of these processes.

Figure 5–9. Isomerization of the N-α1 prolyl peptide bond from a cis to a trans configuration relative to the backbone of the polypeptide.

Collagen Is a Fibrous Protein

Collagen is the most abundant of the fibrous proteins that constitute more than 25% of the protein mass in the human body. Other prominent fibrous proteins include keratin and myosin. These proteins represent a primary source of structural strength for cells (ie, the cytoskeleton) and tissues. Skin derives its strength and flexibility from a crisscrossed mesh of collagen and keratin fibers, while bones and teeth are buttressed by an underlying network of collagen fibers analogous to the steel strands in reinforced concrete. Collagen also is present in connective tissues such as ligaments and tendons. The high degree of tensile strength required to fulfill these structural roles requires elongated proteins characterized by repetitive amino acid sequences and a regular secondary structure.

Collagen Forms a Unique Triple Helix

Tropocollagen consists of three fibers, each containing about 1000 amino acids, bundled together in a unique conformation, the collagen triple helix (Figure 5–10). A mature collagen fiber forms an elongated rod with an axial ratio of about 200. Three intertwined polypeptide strands, which twist to the left, wrap around one another in a right-handed fashion to form the collagen triple helix. The opposing handedness of this superhelix and its component polypeptides makes the collagen triple helix highly resistant to unwinding, the same principle used in the steel cables of suspension bridges. A collagen triple helix has 3.3 residues per turn and a rise per residue nearly twice that of an α helix. The R groups of each polypeptide strand of the triple helix pack so closely that in order to fit, one must be glycine. Thus, every third amino acid residue in collagen is a glycine residue. Staggering of the three strands provides appropriate positioning of the requisite glycines throughout the helix. Collagen is also rich in proline and hydroxyproline, yielding a repetitive Gly-X-Y pattern (Figure 5–10) in which Y generally is proline or hydroxyproline.
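The requirement that every third residue be glycine lends itself to a simple sequence check. The following is a minimal sketch, not a validated bioinformatics tool: the function name `glycine_register` and both test peptides are hypothetical, and sequences are assumed to be written in one-letter code.

```python
from typing import Optional


def glycine_register(sequence: str) -> Optional[int]:
    """Return the offset (0, 1, or 2) at which every third residue is glycine.

    Returns None when no offset satisfies the Gly-X-Y repeat.
    """
    seq = sequence.upper()
    for offset in range(3):
        positions = range(offset, len(seq), 3)
        # An empty range means the sequence is too short to test this offset.
        if len(positions) > 0 and all(seq[i] == "G" for i in positions):
            return offset
    return None


# Hypothetical test peptides, not real collagen sequences.
collagen_like = "GPAGPKGPAGPP"   # Gly at positions 0, 3, 6, 9
globular_like = "MKVLAAGVLDE"    # no regular glycine repeat

print(glycine_register(collagen_like))
print(glycine_register(globular_like))
```

A returned offset identifies which position of the triplet holds glycine in the supplied fragment; a result of None indicates that the strand could not pack into the triple helix as described above.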

Collagen triple helices are stabilized by hydrogen bonds between residues in different polypeptide chains. The hydroxyl groups of hydroxyprolyl residues also participate in interchain hydrogen bonding. Additional stability is provided by covalent cross-links formed between modified lysyl residues both within and between polypeptide chains.

Collagen Is Synthesized as a Larger Precursor

Collagen is initially synthesized as a larger precursor polypeptide, procollagen. Numerous prolyl and lysyl residues of procollagen are hydroxylated by prolyl hydroxylase and lysyl hydroxylase, enzymes that require ascorbic acid (vitamin C). Hydroxyprolyl and hydroxylysyl residues provide additional hydrogen bonding capability that stabilizes the mature protein. In addition, glucosyl and galactosyl transferases attach glucosyl or galactosyl residues to the hydroxyl groups of specific hydroxylysyl residues.

The central portion of the precursor polypeptide then associates with other molecules to form the characteristic triple helix. This process is accompanied by the removal of the globular amino terminal and carboxyl terminal extensions of the precursor polypeptide by selective proteolysis. Certain lysyl residues are modified by lysyl oxidase, a copper-containing protein that converts ε-amino groups to aldehydes. The aldehydes can either undergo an aldol condensation to form a C=C double bond or form a Schiff base (eneimine) with the ε-amino group of an unmodified lysyl residue, which is subsequently reduced to form a C–N single bond. These covalent bonds cross-link the individual polypeptides and imbue the fiber with exceptional strength and rigidity.

Nutritional & Genetic Disorders Can Impair Collagen Maturation

The complex series of events in collagen maturation provides a model that illustrates the biologic consequences of incomplete polypeptide maturation. The best-known defect in collagen biosynthesis is scurvy, a result of a dietary deficiency of vitamin C required by prolyl and lysyl hydroxylases. The resulting deficit in the number of hydroxyproline and hydroxylysine residues undermines the conformational stability of collagen fibers, leading to bleeding gums, swelling joints, poor wound healing, and ultimately to death. Menkes’ syndrome, characterized by kinky hair and growth retardation, reflects a deficiency of the copper required by lysyl oxidase, which catalyzes a key step in formation of the covalent cross-links that strengthen collagen fibers.

Figure 5–10. Primary, secondary, and tertiary structures of collagen. (Panels: amino acid sequence – Gly – X – Y – Gly – X – Y – Gly – X – Y –; 2° structure; triple helix.)

Genetic disorders of collagen biosynthesis include several forms of osteogenesis imperfecta, characterized by fragile bones. In Ehlers-Danlos syndrome, a group of connective tissue disorders that involve impaired integrity of supporting structures, defects in the genes that encode α collagen-1, procollagen N-peptidase, or lysyl hydroxylase result in mobile joints and skin abnormalities.

SUMMARY

• Proteins may be classified on the basis of their solubility, shape, or function or of the presence of a prosthetic group such as heme. Proteins perform complex physical and catalytic functions by positioning specific chemical groups in a precise three-dimensional arrangement that is both functionally efficient and physically strong.

• The gene-encoded primary structure of a polypeptide is the sequence of its amino acids. Its secondary structure results from folding of polypeptides into hydrogen-bonded motifs such as the α helix, the β-pleated sheet, β bends, and loops. Combinations of these motifs can form supersecondary motifs.

• Tertiary structure concerns the relationships between secondary structural domains. Quaternary structure of proteins with two or more polypeptides (oligomeric proteins) is a feature based on the spatial relationships between various types of polypeptides.

• Primary structures are stabilized by covalent peptide bonds. Higher orders of structure are stabilized by weak forces: multiple hydrogen bonds, salt (electrostatic) bonds, and association of hydrophobic R groups.

• The phi (Φ) angle of a polypeptide is the angle about the Cα–N bond; the psi (Ψ) angle is that about the Cα–Co bond. Most combinations of phi-psi angles are disallowed due to steric hindrance. The phi-psi angles that form the α helix and the β sheet fall within the lower and upper left-hand quadrants of a Ramachandran plot, respectively.

• Protein folding is a poorly understood process. Broadly speaking, short segments of newly synthesized polypeptide fold into secondary structural units. Forces that bury hydrophobic regions from solvent then drive the partially folded polypeptide into a “molten globule” in which the modules of secondary structure are rearranged to give the native conformation of the protein.

• Proteins that assist folding include protein disulfide isomerase, proline-cis,trans-isomerase, and the chaperones that participate in the folding of over half of mammalian proteins. Chaperones shield newly synthesized polypeptides from solvent and provide an environment for elements of secondary structure to emerge and coalesce into molten globules.

• Techniques for study of higher orders of protein structure include x-ray crystallography, NMR spectroscopy, analytical ultracentrifugation, gel filtration, and gel electrophoresis.

• Silk fibroin and collagen illustrate the close linkage of protein structure and biologic function. Diseases of collagen maturation include Ehlers-Danlos syndrome and the vitamin C deficiency disease scurvy.

• Prions, protein particles that lack nucleic acid, cause fatal transmissible spongiform encephalopathies such as Creutzfeldt-Jakob disease, scrapie, and bovine spongiform encephalopathy. Prion diseases involve an altered secondary-tertiary structure of a naturally occurring protein, PrPc. When PrPc interacts with its pathologic isoform PrPSc, its conformation is transformed from a predominantly α-helical structure to the β-sheet structure characteristic of PrPSc.



6. Proteins: Myoglobin & Hemoglobin

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

The heme proteins myoglobin and hemoglobin maintain a supply of oxygen essential for oxidative metabolism. Myoglobin, a monomeric protein of red muscle, stores oxygen as a reserve against oxygen deprivation. Hemoglobin, a tetrameric protein of erythrocytes, transports O2 to the tissues and returns CO2 and protons to the lungs. Cyanide and carbon monoxide kill because they disrupt the physiologic function of the heme proteins cytochrome oxidase and hemoglobin, respectively. The secondary-tertiary structure of the subunits of hemoglobin resembles myoglobin. However, the tetrameric structure of hemoglobin permits cooperative interactions that are central to its function. For example, 2,3-bisphosphoglycerate (BPG) promotes the efficient release of O2 by stabilizing the quaternary structure of deoxyhemoglobin. Hemoglobin and myoglobin illustrate both protein structure-function relationships and the molecular basis of genetic diseases such as sickle cell disease and the thalassemias.

HEME & FERROUS IRON CONFER THE ABILITY TO STORE & TO TRANSPORT OXYGEN

Myoglobin and hemoglobin contain heme, a cyclic tetrapyrrole consisting of four molecules of pyrrole linked by α-methylene bridges. This planar network of conjugated double bonds absorbs visible light and colors heme deep red. The substituents at the β-positions of heme are methyl (M), vinyl (V), and propionate (Pr) groups arranged in the order M, V, M, V, M, Pr, Pr, M (Figure 6–1). One atom of ferrous iron (Fe2+) resides at the center of the planar tetrapyrrole. Other proteins with metal-containing tetrapyrrole prosthetic groups include the cytochromes (Fe and Cu) and chlorophyll (Mg) (see Chapter 12). Oxidation and reduction of the Fe and Cu atoms of cytochromes is essential to their biologic function as carriers of electrons. By contrast, oxidation of the Fe2+ of myoglobin or hemoglobin to Fe3+ destroys their biologic activity.

Myoglobin Is Rich in α Helix

Oxygen stored in red muscle myoglobin is released during O2 deprivation (eg, severe exercise) for use in muscle mitochondria for aerobic synthesis of ATP (see Chapter 12). A 153-aminoacyl residue polypeptide (MW 17,000), myoglobin folds into a compact shape that measures 4.5 × 3.5 × 2.5 nm (Figure 6–2). An unusually high proportion, about 75%, of the residues is present in eight right-handed, 7–20 residue α helices. Starting at the amino terminal, these are termed helices A–H. Typical of globular proteins, the surface of myoglobin is polar, while, with only two exceptions, the interior contains only nonpolar residues such as Leu, Val, Phe, and Met. The exceptions are His E7 and His F8, the seventh and eighth residues in helices E and F, which lie close to the heme iron where they function in O2 binding.

Histidines F8 & E7 Perform Unique Roles in Oxygen Binding

The heme of myoglobin lies in a crevice between helices E and F oriented with its polar propionate groups facing the surface of the globin (Figure 6–2). The remainder resides in the nonpolar interior. The fifth coordination position of the iron is linked to a ring nitrogen of the proximal histidine, His F8. The distal histidine, His E7, lies on the side of the heme ring opposite to His F8.

The Iron Moves Toward the Plane of the Heme When Oxygen Is Bound

The iron of unoxygenated myoglobin lies 0.03 nm (0.3 Å) outside the plane of the heme ring, toward His F8. The heme therefore “puckers” slightly. When O2 occupies the sixth coordination position, the iron moves to within 0.01 nm (0.1 Å) of the plane of the heme ring. Oxygenation of myoglobin thus is accompanied by motion of the iron, of His F8, and of residues linked to His F8.

Figure 6–1. Heme. The pyrrole rings and methylene bridge carbons are coplanar, and the iron atom (Fe2+) resides in almost the same plane. The fifth and sixth coordination positions of Fe2+ are directed perpendicular to, and directly above and below, the plane of the heme ring. Observe the nature of the substituent groups on the β carbons of the pyrrole rings, the central iron atom, and the location of the polar side of the heme ring (at about 7 o’clock) that faces the surface of the myoglobin molecule.

Apomyoglobin Provides a Hindered Environment for Heme Iron

When O2 binds to myoglobin, the bond between the first oxygen atom and the Fe2+ is perpendicular to the plane of the heme ring. The bond linking the first and second oxygen atoms lies at an angle of 121 degrees to the plane of the heme, orienting the second oxygen away from the distal histidine (Figure 6–3, left). Isolated heme binds carbon monoxide (CO) 25,000 times more strongly than oxygen. Since CO is present in small quantities in the atmosphere and arises in cells from the catabolism of heme, why is it that CO does not completely displace O2 from heme iron? The accepted explanation is that the apoproteins of myoglobin and hemoglobin create a hindered environment. While CO can bind to isolated heme in its preferred orientation, ie, with all three atoms (Fe, C, and O) perpendicular to the plane of the heme, in myoglobin and hemoglobin the distal histidine sterically precludes this orientation. Binding at a less favored angle reduces the strength of the heme-CO bond to about 200 times that of the heme-O2 bond (Figure 6–3, right) at which level the great excess of O2 over CO normally present dominates. Nevertheless, about 1% of myoglobin typically is present combined with carbon monoxide.
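The effect of the hindered environment can be illustrated with a rough calculation. The sketch below assumes a simple competitive model in which every heme site holds either CO or O2, so that the ratio of CO-bound to O2-bound sites equals the relative affinity multiplied by the ratio of the two gas pressures. This model and the function names are illustrative assumptions; only the affinity ratios (200 and 25,000) and the 1% figure come from the text.

```python
def co_bound_fraction(relative_affinity: float, co_to_o2_ratio: float) -> float:
    """Fraction of heme sites holding CO when CO and O2 compete for one site.

    Assumes every site is occupied by either CO or O2, so the ratio of
    CO-bound to O2-bound sites equals relative_affinity * (pCO / pO2).
    """
    weighted = relative_affinity * co_to_o2_ratio
    return weighted / (1.0 + weighted)


def ratio_for_fraction(relative_affinity: float, fraction: float) -> float:
    """pCO/pO2 ratio that yields the given CO-bound fraction."""
    return fraction / ((1.0 - fraction) * relative_affinity)


HINDERED = 200.0      # CO vs O2 affinity in myoglobin (from the text)
ISOLATED = 25_000.0   # CO vs O2 affinity for isolated heme (from the text)

# Gas ratio at which 1% of myoglobin carries CO under the assumed model.
ratio = ratio_for_fraction(HINDERED, 0.01)
print(f"pCO/pO2 giving 1% CO-myoglobin: {ratio:.2e}")
print("CO-bound fraction at that ratio without hindrance: "
      f"{co_bound_fraction(ISOLATED, ratio):.2f}")
```

Under these assumptions, the same trace of CO that occupies about 1% of myoglobin would occupy more than half of the sites if the heme bound CO as strongly as isolated heme does.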

Figure 6–2. A model of myoglobin at low resolution. Only the α-carbon atoms are shown. The α-helical regions are named A through H. (Based on Dickerson RE in: The Proteins, 2nd ed. Vol 2. Neurath H [editor]. Academic Press, 1964. Reproduced with permission.)

Figure 6–3. Angles for bonding of oxygen and carbon monoxide to the heme iron of myoglobin. The distal E7 histidine hinders bonding of CO at the preferred (180 degree) angle to the plane of the heme ring.

THE OXYGEN DISSOCIATION CURVES FOR MYOGLOBIN & HEMOGLOBIN SUIT THEIR PHYSIOLOGIC ROLES

Why is myoglobin unsuitable as an O2 transport protein but well suited for O2 storage? The relationship between the concentration, or partial pressure, of O2 (PO2) and the quantity of O2 bound is expressed as an O2 saturation isotherm (Figure 6–4). The oxygen-binding curve for myoglobin is hyperbolic. Myoglobin therefore loads O2 readily at the PO2 of the lung capillary bed (100 mm Hg). However, since myoglobin releases only a small fraction of its bound O2 at the PO2 values typically encountered in active muscle (20 mm Hg) or other tissues (40 mm Hg), it represents an ineffective vehicle for delivery of O2. When strenuous exercise lowers the PO2 of muscle tissue to about 5 mm Hg, however, myoglobin releases O2 for mitochondrial synthesis of ATP, permitting continued muscular activity.

Figure 6–4. Oxygen-binding curves of both hemoglobin and myoglobin (percent saturation plotted against gaseous pressure of oxygen in mm Hg). Arterial oxygen tension is about 100 mm Hg; mixed venous oxygen tension is about 40 mm Hg; capillary (active muscle) oxygen tension is about 20 mm Hg; and the minimum oxygen tension required for cytochrome oxidase is about 5 mm Hg. Association of chains into a tetrameric structure (hemoglobin) results in much greater oxygen delivery than would be possible with single chains. (Modified, with permission, from Scriver CR et al [editors]: The Molecular and Metabolic Bases of Inherited Disease, 7th ed. McGraw-Hill, 1995.)
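The storage behavior of myoglobin follows directly from the shape of a hyperbolic binding curve, Y = PO2 / (P50 + PO2). The sketch below evaluates that expression at the four oxygen tensions quoted in the text. The myoglobin P50 of 2 mm Hg is an assumed, approximate value that does not appear in the text, and the function name is hypothetical.

```python
def hyperbolic_saturation(po2_mmhg: float, p50_mmhg: float) -> float:
    """Fractional saturation of a single, noncooperative O2-binding site."""
    return po2_mmhg / (p50_mmhg + po2_mmhg)


MYOGLOBIN_P50 = 2.0  # mm Hg; ASSUMED approximate value, not given in the text

# Oxygen tensions quoted in the text (mm Hg).
conditions = [
    ("lung capillary bed", 100.0),
    ("other tissues", 40.0),
    ("active muscle", 20.0),
    ("strenuous exercise", 5.0),
]

for label, po2 in conditions:
    y = hyperbolic_saturation(po2, MYOGLOBIN_P50)
    print(f"{label:>20}: PO2 = {po2:5.1f} mm Hg, saturation = {y:.3f}")
```

With any P50 in this low range, saturation changes little between 100 and 20 mm Hg and falls appreciably only near 5 mm Hg, which is the pattern the text describes.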

THE ALLOSTERIC PROPERTIES OF HEMOGLOBINS RESULT FROM THEIR QUATERNARY STRUCTURES

The properties of individual hemoglobins are consequences of their quaternary as well as of their secondary and tertiary structures. The quaternary structure of hemoglobin confers striking additional properties, absent from monomeric myoglobin, which adapt it to its unique biologic roles. The allosteric (Gk allos “other,” steros “space”) properties of hemoglobin provide, in addition, a model for understanding other allosteric proteins (see Chapter 11).

Hemoglobin Is Tetrameric

Hemoglobins are tetramers composed of pairs of two different polypeptide subunits. Greek letters are used to designate each subunit type. The subunit compositions of the principal hemoglobins are α2β2 (HbA; normal adult hemoglobin), α2γ2 (HbF; fetal hemoglobin), α2S2 (HbS; sickle cell hemoglobin), and α2δ2 (HbA2; a minor adult hemoglobin). The primary structures of the β, γ, and δ chains of human hemoglobin are highly conserved.

Myoglobin & the β Subunits of Hemoglobin Share Almost Identical Secondary and Tertiary Structures

Despite differences in the kind and number of amino acids present, myoglobin and the β polypeptide of hemoglobin A have almost identical secondary and tertiary structures. Similarities include the location of the heme and the eight helical regions and the presence of amino acids with similar properties at comparable locations. Although it possesses seven rather than eight helical regions, the α polypeptide of hemoglobin also closely resembles myoglobin.

Oxygenation of Hemoglobin Triggers Conformational Changes in the Apoprotein

Hemoglobins bind four molecules of O2 per tetramer, one per heme. A molecule of O2 binds to a hemoglobin tetramer more readily if other O2 molecules are already bound (Figure 6–4). Termed cooperative binding, this phenomenon permits hemoglobin to maximize both the quantity of O2 loaded at the PO2 of the lungs and the quantity of O2 released at the PO2 of the peripheral tissues. Cooperative interactions, an exclusive property of multimeric proteins, are critically important to aerobic life.

P50 Expresses the Relative Affinities of Different Hemoglobins for Oxygen

The quantity P50, a measure of O2 concentration, is the partial pressure of O2 that half-saturates a given hemoglobin. Depending on the organism, P50 can vary widely, but in all instances it will exceed the PO2 of the peripheral tissues. For example, values of P50 for HbA and fetal HbF are 26 and 20 mm Hg, respectively. In the placenta, this difference enables HbF to extract oxygen from the HbA in the mother’s blood. However, HbF is suboptimal postpartum since its high affinity for O2 dictates that it can deliver less O2 to the tissues.
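The consequences of the different P50 values can be explored with the Hill equation, Y = PO2^n / (P50^n + PO2^n), a common empirical description of cooperative binding. The sketch below is illustrative only: the Hill coefficient of 2.8 is an assumed typical value that does not come from the text, the same coefficient is assumed for both hemoglobins, and the function and constant names are hypothetical. The P50 values and oxygen tensions are those quoted in the text.

```python
def hill_saturation(po2_mmhg: float, p50_mmhg: float, n: float = 2.8) -> float:
    """Fractional saturation from the Hill equation.

    n is the Hill coefficient; 2.8 is an ASSUMED typical value for hemoglobin.
    """
    return po2_mmhg ** n / (p50_mmhg ** n + po2_mmhg ** n)


P50_HBA = 26.0  # mm Hg, from the text
P50_HBF = 20.0  # mm Hg, from the text
LUNG_PO2 = 100.0    # mm Hg, arterial oxygen tension quoted in the text
TISSUE_PO2 = 40.0   # mm Hg, mixed venous oxygen tension quoted in the text


def delivered_fraction(p50_mmhg: float) -> float:
    """Fraction of sites that unload O2 between lung and tissue PO2."""
    return (hill_saturation(LUNG_PO2, p50_mmhg)
            - hill_saturation(TISSUE_PO2, p50_mmhg))


for name, p50 in [("HbA", P50_HBA), ("HbF", P50_HBF)]:
    print(f"{name}: saturation at {TISSUE_PO2:.0f} mm Hg = "
          f"{hill_saturation(TISSUE_PO2, p50):.3f}, "
          f"delivered fraction = {delivered_fraction(p50):.3f}")
```

In this model HbF is more saturated than HbA at every PO2, consistent with its ability to extract O2 from maternal HbA, and it unloads a smaller fraction between lung and tissue PO2, consistent with the statement that HbF is suboptimal postpartum.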

The subunit composition of hemoglobin tetramers undergoes complex changes during development. The human fetus initially synthesizes a ζ2ε2 tetramer. By the end of the first trimester, ζ and ε subunits have been replaced by α and γ subunits, forming HbF (α2γ2), the hemoglobin of late fetal life. While synthesis of β subunits begins in the third trimester, β subunits do not completely replace γ subunits to yield adult HbA (α2β2) until some weeks postpartum (Figure 6–5).

Figure 6–5. Developmental pattern of the quaternary structure of fetal and newborn hemoglobins. Globin chain synthesis (% of total) is plotted against gestation and postnatal age in months for the α chain, the γ chain (fetal), the β chain (adult), the δ chain, and the ε and ζ chains (embryonic). (Reproduced, with permission, from Ganong WF: Review of Medical Physiology, 20th ed. McGraw-Hill, 2001.)

Oxygenation of Hemoglobin Is Accompanied by Large Conformational Changes

The binding of the first O2 molecule to deoxyHb shifts the heme iron toward the plane of the heme ring from a position about 0.06 nm (0.6 Å) beyond it (Figure 6–6). This motion is transmitted to the proximal (F8) histidine and to the residues attached thereto, which in turn causes the rupture of salt bridges between the carboxyl terminal residues of all four subunits. As a consequence, one pair of α/β subunits rotates 15 degrees with respect to the other, compacting the tetramer (Figure 6–7). Profound changes in secondary, tertiary, and quaternary structure accompany the O2-induced transition of hemoglobin from the low-affinity T (taut) state to the high-affinity R (relaxed) state. These changes significantly increase the affinity of the remaining unoxygenated hemes for O2, as subsequent binding events require the rupture of fewer salt bridges (Figure 6–8). The terms T and R also are used to refer to the low-affinity and high-affinity conformations of allosteric enzymes, respectively.

Figure 6–6. The iron atom moves into the plane of the heme on oxygenation. Histidine F8 and its associated residues are pulled along with the iron atom. (Slightly modified and reproduced, with permission, from Stryer L: Biochemistry, 4th ed. Freeman, 1995.)

Figure 6–7. During transition of the T form to the R form of hemoglobin, one pair of subunits (α2/β2) rotates through 15 degrees relative to the other pair (α1/β1). The axis of rotation is eccentric, and the α2/β2 pair also shifts toward the axis somewhat. In the diagram, the unshaded α1/β1 pair is shown fixed while the colored α2/β2 pair both shifts and rotates.

After Releasing O2 at the Tissues, Hemoglobin Transports CO2 & Protons to the Lungs

In addition to transporting O2 from the lungs to peripheral tissues, hemoglobin transports CO2, the byproduct of respiration, and protons from peripheral tissues to the lungs. Hemoglobin carries CO2 as carbamates formed with the amino terminal nitrogens of the polypeptide chains.

$$\mathrm{CO_2 + Hb{-}NH_3^+ \rightleftharpoons 2\,H^+ + Hb{-}NH{-}COO^-}$$

Carbamates change the charge on amino terminals from positive to negative, favoring salt bond formation between the α and β chains.

Hemoglobin carbamates account for about 15% of the CO2 in venous blood. Much of the remaining CO2 is carried as bicarbonate, which is formed in erythrocytes by the hydration of CO2 to carbonic acid (H2CO3), a process catalyzed by carbonic anhydrase. At the pH of venous blood, H2CO3 dissociates into bicarbonate and a proton.

Figure 6–8. Transition from the T structure to the R structure. In this model, salt bridges (thin lines) linking the subunits in the T structure break progressively as oxygen is added, and even those salt bridges that have not yet ruptured are progressively weakened (wavy lines). The transition from T to R does not take place after a fixed number of oxygen molecules have been bound but becomes more probable as each successive oxygen binds. The transition between the two structures is influenced by protons, carbon dioxide, chloride, and BPG; the higher their concentration, the more oxygen must be bound to trigger the transition. Fully oxygenated molecules in the T structure and fully deoxygenated molecules in the R structure are not shown because they are unstable. (Modified and redrawn, with permission, from Perutz MF: Hemoglobin structure and respiratory transport. Sci Am [Dec] 1978;239:92.)

Deoxyhemoglobin binds one proton for every two O2 molecules released, contributing significantly to the buffering capacity of blood. The somewhat lower pH of peripheral tissues, aided by carbamation, stabilizes the T state and thus enhances the delivery of O2. In the lungs, the process reverses. As O2 binds to deoxyhemoglobin, protons are released and combine with bicarbonate to form carbonic acid. Dehydration of H2CO3, catalyzed by carbonic anhydrase, forms CO2, which is exhaled. Binding of oxygen thus drives the exhalation of CO2 (Figure 6–9). This reciprocal coupling of proton and O2 binding is termed the Bohr effect. The Bohr effect is dependent upon cooperative interactions between the hemes of the hemoglobin tetramer. Myoglobin, a monomer, exhibits no Bohr effect.

Protons Arise From Rupture of Salt Bonds When O2 Binds

Protons responsible for the Bohr effect arise from rupture of salt bridges during the binding of O2 to T state hemoglobin. Conversion to the oxygenated R state breaks salt bridges involving β-chain residue His 146. The subsequent dissociation of protons from His 146 drives the conversion of bicarbonate to carbonic acid (Figure 6–9). Upon the release of O2, the T structure and its salt bridges re-form. This conformational change increases the pKa of the β-chain His 146 residues, which bind protons. By facilitating the re-formation of salt bridges, an increase in proton concentration enhances the release of O2 from oxygenated (R state) hemoglobin. Conversely, an increase in PO2 promotes proton release.

2,3-Bisphosphoglycerate (BPG) Stabilizes the T Structure of Hemoglobin

A low PO2 in peripheral tissues promotes the synthesis in erythrocytes of 2,3-bisphosphoglycerate (BPG) from the glycolytic intermediate 1,3-bisphosphoglycerate.

The hemoglobin tetramer binds one molecule of BPG in the central cavity formed by its four subunits. However, the space between the H helices of the β chains lining the cavity is sufficiently wide to accommodate BPG only when hemoglobin is in the T state. BPG forms salt bridges with the terminal amino groups of both β chains via Val NA1 and with Lys EF6 and His H21 (Figure 6–10). BPG therefore stabilizes deoxygenated (T state) hemoglobin by forming additional salt bridges that must be broken prior to conversion to the R state.

Residue H21 of the γ subunit of fetal hemoglobin (HbF) is Ser rather than His. Since Ser cannot form a salt bridge, BPG binds more weakly to HbF than to HbA. The lower stabilization afforded to the T state by BPG accounts for HbF having a higher affinity for O2 than HbA.

Figure 6–9. The Bohr effect. Carbon dioxide generated in peripheral tissues combines with water to form carbonic acid, which dissociates into protons and bicarbonate ions. Deoxyhemoglobin acts as a buffer by binding protons and delivering them to the lungs. In the lungs, the uptake of oxygen by hemoglobin releases protons that combine with bicarbonate ion, forming carbonic acid, which when dehydrated by carbonic anhydrase becomes carbon dioxide, which then is exhaled. [Diagram: two panels, PERIPHERAL TISSUES and LUNGS, linked by Hb • 4O2 and Hb • 2H+ (buffer); carbonic anhydrase interconverts 2CO2 + 2H2O and 2H2CO3 in both panels.]
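The stoichiometry labeled in Figure 6–9 can be written out as paired reactions. This is only a transcription of the figure's labels; the grouping into two aligned blocks is an editorial choice, and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\text{Peripheral tissues:}\quad
  & 2\,\mathrm{CO_2} + 2\,\mathrm{H_2O}
    \xrightarrow{\text{carbonic anhydrase}} 2\,\mathrm{H_2CO_3}
    \longrightarrow 2\,\mathrm{H^+} + 2\,\mathrm{HCO_3^-} \\
  & \mathrm{Hb\cdot 4O_2} + 2\,\mathrm{H^+}
    \longrightarrow \mathrm{Hb\cdot 2H^+} + 4\,\mathrm{O_2} \\[4pt]
\text{Lungs:}\quad
  & \mathrm{Hb\cdot 2H^+} + 4\,\mathrm{O_2}
    \longrightarrow \mathrm{Hb\cdot 4O_2} + 2\,\mathrm{H^+} \\
  & 2\,\mathrm{H^+} + 2\,\mathrm{HCO_3^-}
    \longrightarrow 2\,\mathrm{H_2CO_3}
    \xrightarrow{\text{carbonic anhydrase}} 2\,\mathrm{CO_2} + 2\,\mathrm{H_2O}
\end{align*}
```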

Figure 6–10. Mode of binding of 2,3-bisphosphoglycerate to human deoxyhemoglobin. BPG interacts with three positively charged groups on each β chain: the α-NH3+ group of Val NA1, Lys EF6, and His H21. (Based on Arnone A: X-ray diffraction study of binding of 2,3-diphosphoglycerate to human deoxyhemoglobin. Nature 1972;237:146. Reproduced with permission.)


Adaptation to High Altitude

Physiologic changes that accompany prolonged exposure to high altitude include an increase in the number of erythrocytes and in their concentrations of hemoglobin and of BPG. Elevated BPG lowers the affinity of HbA for O2 (increases P50), which enhances release of O2 at the tissues.
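The link between P50 and O2 delivery can be illustrated with the Hill equation, a common empirical description of sigmoidal binding. This is a minimal sketch under stated assumptions: the Hill coefficient (2.8), the two P50 values, and the lung and tissue PO2 values are illustrative numbers chosen for the example, not data from this chapter.

```python
def hill_saturation(po2_mmhg: float, p50_mmhg: float, n: float = 2.8) -> float:
    """Fractional O2 saturation from the Hill equation: Y = pO2^n / (P50^n + pO2^n)."""
    return po2_mmhg ** n / (p50_mmhg ** n + po2_mmhg ** n)


def delivered_fraction(p50_mmhg: float, lung_po2: float = 100.0,
                       tissue_po2: float = 30.0) -> float:
    """Fraction of binding sites that unload O2 between lungs and tissues."""
    return hill_saturation(lung_po2, p50_mmhg) - hill_saturation(tissue_po2, p50_mmhg)


# Assumed, illustrative P50 values (mm Hg): baseline versus BPG-elevated.
for label, p50 in [("baseline", 26.0), ("elevated BPG", 31.0)]:
    print(f"{label}: P50={p50} mm Hg, delivered fraction={delivered_fraction(p50):.2f}")
```

With these assumed inputs, the higher P50 (lower affinity) costs little saturation in the lungs but unloads noticeably more O2 at tissue PO2, which is the qualitative point made above.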

NUMEROUS MUTANT HUMAN HEMOGLOBINS HAVE BEEN IDENTIFIED

Mutations in the genes that encode the α or β subunits of hemoglobin potentially can affect its biologic function. However, almost all of the over 800 known mutant human hemoglobins are both extremely rare and benign, presenting no clinical abnormalities. When a mutation does compromise biologic function, the condition is termed a hemoglobinopathy. The URL http://globin.cse.psu.edu/ (Globin Gene Server) provides information about, and links for, normal and mutant hemoglobins.

Methemoglobin & Hemoglobin M

In methemoglobinemia, the heme iron is ferric rather than ferrous. Methemoglobin thus can neither bind nor transport O2. Normally, the enzyme methemoglobin reductase reduces the Fe3+ of methemoglobin to Fe2+. Methemoglobin can arise by oxidation of Fe2+ to Fe3+ as a side effect of agents such as sulfonamides, from hereditary hemoglobin M, or consequent to reduced activity of the enzyme methemoglobin reductase.

In hemoglobin M, histidine F8 (His F8) has been replaced by tyrosine. The iron of HbM forms a tight ionic complex with the phenolate anion of tyrosine that stabilizes the Fe3+ form. In α-chain hemoglobin M variants, the R-T equilibrium favors the T state. Oxygen affinity is reduced, and the Bohr effect is absent. β-Chain hemoglobin M variants exhibit R-T switching, and the Bohr effect is therefore present.

Mutations (eg, hemoglobin Chesapeake) that favor the R state increase O2 affinity. These hemoglobins therefore fail to deliver adequate O2 to peripheral tissues. The resulting tissue hypoxia leads to polycythemia, an increased concentration of erythrocytes.

Hemoglobin S

In HbS, the nonpolar amino acid valine has replaced the polar surface residue Glu6 of the β subunit, generating a hydrophobic “sticky patch” on the surface of the β subunit of both oxyHbS and deoxyHbS. Both HbA and HbS contain a complementary sticky patch on their surfaces that is exposed only in the deoxygenated, T state. Thus, at low PO2, deoxyHbS can polymerize to form long, insoluble fibers. Binding of deoxyHbA terminates fiber polymerization, since HbA lacks the second sticky patch necessary to bind another Hb molecule (Figure 6–11). These twisted helical fibers distort the erythrocyte into a characteristic sickle shape, rendering it vulnerable to lysis in the interstices of the splenic sinusoids. They also cause multiple secondary clinical effects. A low PO2 such as that at high altitudes exacerbates the tendency to polymerize.

Figure 6–11. Representation of the sticky patch on hemoglobin S and its “receptor” on deoxyhemoglobin A and deoxyhemoglobin S. The complementary surfaces allow deoxyhemoglobin S to polymerize into a fibrous structure, but the presence of deoxyhemoglobin A will terminate the polymerization by failing to provide sticky patches. [Panels: oxy A, deoxy A, oxy S, and deoxy S, with α and β subunits labeled.] (Modified and reproduced, with permission, from Stryer L: Biochemistry, 4th ed. Freeman, 1995.)


BIOMEDICAL IMPLICATIONS

Myoglobinuria

Following massive crush injury, myoglobin released from damaged muscle fibers colors the urine dark red. Myoglobin can be detected in plasma following a myocardial infarction, but assay of serum enzymes (see Chapter 7) provides a more sensitive index of myocardial injury.

Anemias

Anemias, reductions in the number of red blood cells or of hemoglobin in the blood, can reflect impaired synthesis of hemoglobin (eg, in iron deficiency; Chapter 51) or impaired production of erythrocytes (eg, in folic acid or vitamin B12 deficiency; Chapter 45). Diagnosis of anemias begins with spectroscopic measurement of blood hemoglobin levels.

Thalassemias

The genetic defects known as thalassemias result from the partial or total absence of one or more α or β chains of hemoglobin. Over 750 different mutations have been identified, but only three are common. Either the α chain (alpha thalassemias) or β chain (beta thalassemias) can be affected. A superscript indicates whether a subunit is completely absent (α0 or β0) or whether its synthesis is reduced (α+ or β+). Apart from marrow transplantation, treatment is symptomatic.

Certain mutant hemoglobins are common in many populations, and a patient may inherit more than one type. Hemoglobin disorders thus present a complex pattern of clinical phenotypes. The use of DNA probes for their diagnosis is considered in Chapter 40.

Glycosylated Hemoglobin (HbA1c)

When blood glucose enters the erythrocytes it glycosylates the ε-amino group of lysine residues and the amino terminals of hemoglobin. The fraction of hemoglobin glycosylated, normally about 5%, is proportionate to blood glucose concentration. Since the half-life of an erythrocyte is typically 60 days, the level of glycosylated hemoglobin (HbA1c) reflects the mean blood glucose concentration over the preceding 6–8 weeks. Measurement of HbA1c therefore provides valuable information for management of diabetes mellitus.

SUMMARY

• Myoglobin is monomeric; hemoglobin is a tetramer of two subunit types (α2β2 in HbA). Despite having different primary structures, myoglobin and the subunits of hemoglobin have nearly identical secondary and tertiary structures.

• Heme, an essentially planar, slightly puckered, cyclic tetrapyrrole, has a central Fe2+ linked to all four nitrogen atoms of the heme, to histidine F8, and, in oxyMb and oxyHb, also to O2.

• The O2-binding curve for myoglobin is hyperbolic, but for hemoglobin it is sigmoidal, a consequence of cooperative interactions in the tetramer. Cooperativity maximizes the ability of hemoglobin both to load O2 at the PO2 of the lungs and to deliver O2 at the PO2 of the tissues.

• Relative affinities of different hemoglobins for oxygen are expressed as P50, the PO2 that half-saturates them with O2. Hemoglobins saturate at the partial pressures of their respective respiratory organ, eg, the lung or placenta.

• On oxygenation of hemoglobin, the iron, histidine F8, and linked residues move toward the heme ring. Conformational changes that accompany oxygenation include rupture of salt bonds and loosening of quaternary structure, facilitating binding of additional O2.

• 2,3-Bisphosphoglycerate (BPG) in the central cavity of deoxyHb forms salt bonds with the β subunits that stabilize deoxyHb. On oxygenation, the central cavity contracts, BPG is extruded, and the quaternary structure loosens.

• Hemoglobin also functions in CO2 and proton transport from tissues to lungs. Release of O2 from oxyHb at the tissues is accompanied by uptake of protons due to an increase in the pKa of histidine residues.

• In sickle cell hemoglobin (HbS), Val replaces the β6 Glu of HbA, creating a “sticky patch” that has a complement on deoxyHb (but not on oxyHb). DeoxyHbS polymerizes at low O2 concentrations, forming fibers that distort erythrocytes into sickle shapes.

• Alpha and beta thalassemias are anemias that result from reduced production of α and β subunits of HbA, respectively.



Enzymes: Mechanism of Action 7


Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

Enzymes are biologic polymers that catalyze the chemical reactions which make life as we know it possible. The presence and maintenance of a complete and balanced set of enzymes is essential for the breakdown of nutrients to supply energy and chemical building blocks; the assembly of those building blocks into proteins, DNA, membranes, cells, and tissues; and the harnessing of energy to power cell motility and muscle contraction. With the exception of a few catalytic RNA molecules, or ribozymes, the vast majority of enzymes are proteins. Deficiencies in the quantity or catalytic activity of key enzymes can result from genetic defects, nutritional deficits, or toxins. Defective enzymes can result from genetic mutations or infection by viral or bacterial pathogens (eg, Vibrio cholerae). Medical scientists address imbalances in enzyme activity by using pharmacologic agents to inhibit specific enzymes and are investigating gene therapy as a means to remedy deficits in enzyme level or function.

ENZYMES ARE EFFECTIVE & HIGHLY SPECIFIC CATALYSTS

The enzymes that catalyze the conversion of one or more compounds (substrates) into one or more different compounds (products) enhance the rates of the corresponding noncatalyzed reaction by factors of at least 10^6. Like all catalysts, enzymes are neither consumed nor permanently altered as a consequence of their participation in a reaction.
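To give a feel for what a millionfold (10^6) rate enhancement means energetically, the sketch below converts a rate ratio into an equivalent lowering of the activation free energy. It assumes an Arrhenius-type relationship with identical pre-exponential factors for the catalyzed and uncatalyzed reactions, and a temperature of 37 °C; neither assumption comes from the text, and the function name is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def barrier_reduction_kj_per_mol(rate_enhancement: float,
                                 temp_k: float = 310.15) -> float:
    """Activation free-energy lowering equivalent to a given rate enhancement.

    Assumes k_cat / k_uncat = exp(ddG / (R * T)), i.e. the catalyzed and
    uncatalyzed reactions differ only in barrier height.
    """
    return R * temp_k * math.log(rate_enhancement) / 1000.0


print(f"{barrier_reduction_kj_per_mol(1e6):.1f} kJ/mol")
```

Under these assumptions a 10^6-fold enhancement corresponds to a barrier reduction in the mid-30s of kJ/mol, on the order of a few hydrogen bonds' worth of stabilization.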

In addition to being highly efficient, enzymes are also extremely selective catalysts. Unlike most catalysts used in synthetic chemistry, enzymes are specific both for the type of reaction catalyzed and for a single substrate or a small set of closely related substrates. Enzymes are also stereospecific catalysts and typically catalyze reactions only of specific stereoisomers of a given compound, for example, D- but not L-sugars, L- but not D-amino acids. Since they bind substrates through at least “three points of attachment,” enzymes can even convert nonchiral substrates to chiral products. Figure 7–1 illustrates why the enzyme-catalyzed reduction of the nonchiral substrate pyruvate produces L-lactate rather than a racemic mixture of D- and L-lactate. The exquisite specificity of enzyme catalysts imbues living cells with the ability to simultaneously conduct and independently control a broad spectrum of chemical processes.

ENZYMES ARE CLASSIFIED BY REACTION TYPE & MECHANISM

A system of enzyme nomenclature that is comprehensive, consistent, and at the same time easy to use has proved elusive. The common names for most enzymes derive from their most distinctive characteristic: their ability to catalyze a specific chemical reaction. In general, an enzyme’s name consists of a term that identifies the type of reaction catalyzed followed by the suffix -ase. For example, dehydrogenases remove hydrogen atoms, proteases hydrolyze proteins, and isomerases catalyze rearrangements in configuration. One or more modifiers usually precede this name. Unfortunately, while many modifiers name the specific substrate involved (xanthine oxidase), others identify the source of the enzyme (pancreatic ribonuclease), specify its mode of regulation (hormone-sensitive lipase), or name a distinguishing characteristic of its mechanism (a cysteine protease). When it was discovered that multiple forms of some enzymes existed, alphanumeric designators were added to distinguish between them (eg, RNA polymerase III; protein kinase Cβ). To address the ambiguity and confusion arising from these inconsistencies in nomenclature and the continuing discovery of new enzymes, the International Union of Biochemistry (IUB) developed a complex but unambiguous system of enzyme nomenclature. In the IUB system, each enzyme has a unique name and code number that reflect the type of reaction catalyzed and the substrates involved. Enzymes are grouped into six classes, each with several subclasses. For example, the enzyme commonly called “hexokinase” is designated “ATP:D-hexose-6-phosphotransferase E.C. 2.7.1.1.” This identifies hexokinase as a member of class 2 (transferases), subclass 7 (transfer of a phosphoryl group), sub-subclass 1 (alcohol is the phosphoryl acceptor). Finally, the term “hexose-6” indicates that the alcohol phosphorylated is that of carbon six of a hexose. Listed below are the six IUB classes of enzymes and the reactions they catalyze.

1. Oxidoreductases catalyze oxidations and reductions.

2. Transferases catalyze transfer of groups such as methyl or glycosyl groups from a donor molecule to an acceptor molecule.

3. Hydrolases catalyze the hydrolytic cleavage of C–C, C–O, C–N, P–O, and certain other bonds, including acid anhydride bonds.

4. Lyases catalyze cleavage of C–C, C–O, C–N, and other bonds by elimination, leaving double bonds, and also add groups to double bonds.

5. Isomerases catalyze geometric or structural changes within a single molecule.

6. Ligases catalyze the joining together of two molecules, coupled to the hydrolysis of a pyrophosphoryl group in ATP or a similar nucleoside triphosphate.

Figure 7–1. Planar representation of the “three-point attachment” of a substrate to the active site of an enzyme. Although atoms 1 and 4 are identical, once atoms 2 and 3 are bound to their complementary sites on the enzyme, only atom 1 can bind. Once bound to an enzyme, apparently identical atoms thus may be distinguishable, permitting a stereospecific chemical change.

Despite the many advantages of the IUB system, texts tend to refer to most enzymes by their older and shorter, albeit sometimes ambiguous names.
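The hierarchical structure of an E.C. code can be made concrete with a small parser. This is a minimal sketch: the function name, the returned field names, and the dictionary are hypothetical, and only the six class names listed above are encoded (subclass meanings are not looked up).

```python
# The six IUB classes listed above, keyed by the first field of the code.
EC_CLASSES = {
    1: "oxidoreductase",
    2: "transferase",
    3: "hydrolase",
    4: "lyase",
    5: "isomerase",
    6: "ligase",
}


def parse_ec_number(ec: str) -> dict:
    """Split an E.C. code such as 'E.C. 2.7.1.1' into its four hierarchical fields."""
    fields = ec.replace("E.C.", "").replace("EC", "").strip().split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    cls, subclass, subsubclass, serial = (int(f) for f in fields)
    if cls not in EC_CLASSES:
        raise ValueError(f"unknown class {cls}")
    return {
        "class": cls,
        "class_name": EC_CLASSES[cls],
        "subclass": subclass,
        "sub_subclass": subsubclass,
        "serial": serial,
    }


# Hexokinase, the example used in the text.
print(parse_ec_number("E.C. 2.7.1.1"))
```

For hexokinase the parser reports class 2 (a transferase), subclass 7, and sub-subclass 1, matching the breakdown given in the text.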

PROSTHETIC GROUPS, COFACTORS, & COENZYMES PLAY IMPORTANT ROLES IN CATALYSIS

Many enzymes contain small nonprotein molecules and metal ions that participate directly in substrate binding or catalysis. Termed prosthetic groups, cofactors, and coenzymes, these extend the repertoire of catalytic capabilities beyond those afforded by the limited number of functional groups present on the aminoacyl side chains of peptides.

Prosthetic Groups Are Tightly Integrated Into an Enzyme’s Structure

Prosthetic groups are distinguished by their tight, stable incorporation into a protein’s structure by covalent or noncovalent forces. Examples include pyridoxal phosphate, flavin mononucleotide (FMN), flavin adenine dinucleotide (FAD), thiamin pyrophosphate, biotin, and the metal ions of Co, Cu, Mg, Mn, Se, and Zn. Metals are the most common prosthetic groups. The roughly one-third of all enzymes that contain tightly bound metal ions are termed metalloenzymes. Metal ions that participate in redox reactions generally are complexed to prosthetic groups such as heme (Chapter 6) or iron-sulfur clusters (Chapter 12). Metals also may facilitate the binding and orientation of substrates, the formation of covalent bonds with reaction intermediates (Co2+ in coenzyme B12), or interaction with substrates to render them more electrophilic (electron-poor) or nucleophilic (electron-rich).

Cofactors Associate Reversibly With Enzymes or Substrates

Cofactors serve functions similar to those of prosthetic groups but bind in a transient, dissociable manner either to the enzyme or to a substrate such as ATP. Unlike the stably associated prosthetic groups, cofactors therefore must be present in the medium surrounding the enzyme for catalysis to occur. The most common cofactors also are metal ions. Enzymes that require a metal ion cofactor are termed metal-activated enzymes to distinguish them from the metalloenzymes for which metal ions serve as prosthetic groups.

Coenzymes Serve as Substrate Shuttles

Coenzymes serve as recyclable shuttles, or group transfer reagents, that transport many substrates from their point of generation to their point of utilization. Association with the coenzyme also stabilizes substrates such as hydrogen atoms or hydride ions that are unstable in the aqueous environment of the cell. Other chemical moieties transported by coenzymes include methyl groups (folates), acyl groups (coenzyme A), and oligosaccharides (dolichol).

Many Coenzymes, Cofactors, & Prosthetic Groups Are Derivatives of B Vitamins

The water-soluble B vitamins supply important components of numerous coenzymes. Many coenzymes contain, in addition, the adenine, ribose, and phosphoryl moieties of AMP or ADP (Figure 7–2). Nicotinamide and riboflavin are components of the redox coenzymes NAD and NADP and FMN and FAD, respectively. Pantothenic acid is a component of the acyl group carrier coenzyme A. As its pyrophosphate, thiamin participates in decarboxylation of α-keto acids, and folic acid and cobamide coenzymes function in one-carbon metabolism.

Figure 7–2. Structure of NAD+ and NADP+. For NAD+, R = H. For NADP+, R = PO3^2−.

Figure 7–3. Two-dimensional representation of a dipeptide substrate, glycyl-tyrosine, bound within the active site of carboxypeptidase A. [Labeled active-site groups: Arg 145, Tyr 248, Glu 72, His 69, His 196, and Zn2+.]

CATALYSIS OCCURS AT THE ACTIVE SITE

The extreme substrate specificity and high catalytic efficiency of enzymes reflect the existence of an environment that is exquisitely tailored to a single reaction. Termed the active site, this environment generally takes the form of a cleft or pocket. The active sites of multimeric enzymes often are located at the interface between subunits and recruit residues from more than one monomer. The three-dimensional active site both shields substrates from solvent and facilitates catalysis. Substrates bind to the active site at a region complementary to a portion of the substrate that will not undergo chemical change during the course of the reaction. This simultaneously aligns portions of the substrate that will undergo change with the chemical functional groups of peptidyl aminoacyl residues. The active site also binds and orients cofactors or prosthetic groups. Many aminoacyl residues drawn from diverse portions of the polypeptide chain (Figure 7–3) contribute to the extensive size and three-dimensional character of the active site.

ENZYMES EMPLOY MULTIPLE MECHANISMS TO FACILITATE CATALYSIS

Four general mechanisms account for the ability of enzymes to achieve dramatic catalytic enhancement of the rates of chemical reactions.

Catalysis by Proximity

For molecules to react, they must come within bond-forming distance of one another. The higher their concentration, the more frequently they will encounter one another and the greater will be the rate of their reaction. When an enzyme binds substrate molecules in its active site, it creates a region of high local substrate concentration. This environment also orients the substrate molecules spatially in a position ideal for them to interact, resulting in rate enhancements of at least a thousandfold.

Acid-Base Catalysis

The ionizable functional groups of aminoacyl side chains and (where present) of prosthetic groups contribute to catalysis by acting as acids or bases. Acid-base catalysis can be either specific or general. By “specific” we mean only protons (H3O+) or OH– ions. In specific acid or specific base catalysis, the rate of reaction is sensitive to changes in the concentration of protons but independent of the concentrations of other acids (proton donors) or bases (proton acceptors) present in solution or at the active site. Reactions whose rates are responsive to all the acids or bases present are said to be subject to general acid or general base catalysis.

Catalysis by Strain

Enzymes that catalyze lytic reactions which involve breaking a covalent bond typically bind their substrates in a conformation slightly unfavorable for the bond that will undergo cleavage. The resulting strain stretches or distorts the targeted bond, weakening it and making it more vulnerable to cleavage.

Covalent Catalysis

The process of covalent catalysis involves the formation of a covalent bond between the enzyme and one or more substrates. The modified enzyme then becomes a reactant. Covalent catalysis introduces a new reaction pathway that is energetically more favorable, and therefore faster, than the reaction pathway in homogeneous solution. The chemical modification of the enzyme is, however, transient. On completion of the reaction, the enzyme returns to its original unmodified state. Its role thus remains catalytic. Covalent catalysis is particularly common among enzymes that catalyze group transfer reactions. Residues on the enzyme that participate in covalent catalysis generally are cysteine or serine and occasionally histidine. Covalent catalysis often follows a “ping-pong” mechanism, one in which the first substrate is bound and its product released prior to the binding of the second substrate (Figure 7–4).
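The ordering constraint of a ping-pong mechanism can be sketched as a two-state machine, using the transamination labels of Figure 7–4 (E-CHO, E-CH2NH2, Ala, Pyr, KG, Glu). This is a toy model under stated assumptions: the class and method names are hypothetical, only the forward direction is modeled, and no kinetics are included.

```python
class Transaminase:
    """Toy ping-pong enzyme that alternates between two covalent forms."""

    # (enzyme form, substrate) -> (new enzyme form, released product)
    STEPS = {
        ("E-CHO", "Ala"): ("E-CH2NH2", "Pyr"),
        ("E-CH2NH2", "KG"): ("E-CHO", "Glu"),
    }

    def __init__(self) -> None:
        self.form = "E-CHO"  # enzyme-pyridoxal phosphate complex

    def react(self, substrate: str) -> str:
        """Bind one substrate, release its product, and switch enzyme form."""
        key = (self.form, substrate)
        if key not in self.STEPS:
            raise ValueError(f"{self.form} cannot accept {substrate}")
        self.form, product = self.STEPS[key]
        return product


enzyme = Transaminase()
products = [enzyme.react("Ala"), enzyme.react("KG")]
print(products, enzyme.form)
```

The first product leaves before the second substrate can bind, and after one full cycle the enzyme is back in its starting form, which is why its role remains catalytic.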

SUBSTRATES INDUCE CONFORMATIONAL CHANGES IN ENZYMES

Early in the last century, Emil Fischer compared the highly specific fit between enzymes and their substrates to that of a lock and its key. While the “lock and key model” accounted for the exquisite specificity of enzyme-substrate interactions, the implied rigidity of the enzyme’s active site failed to account for the dynamic changes that accompany catalysis. This drawback was addressed by Daniel Koshland’s induced fit model, which states that when substrates approach and bind to an enzyme they induce a conformational change, a change analogous to placing a hand (substrate) into a glove (enzyme) (Figure 7–5). A corollary is that the enzyme induces reciprocal changes in its substrates, harnessing the energy of binding to facilitate the transformation of substrates into products. The induced fit model has been amply confirmed by biophysical studies of enzyme motion during substrate binding.

HIV PROTEASE ILLUSTRATES ACID-BASE CATALYSIS

Enzymes of the aspartic protease family, which includes the digestive enzyme pepsin, the lysosomal cathepsins, and the protease produced by the human immunodeficiency virus (HIV), share a common catalytic mechanism. Catalysis involves two conserved aspartyl residues which act as acid-base catalysts. In the first stage of the reaction, an aspartate functioning as a general base (Asp X, Figure 7–6) extracts a proton from a water molecule, making it more nucleophilic. This resulting nucleophile then attacks the electrophilic carbonyl carbon of the peptide bond targeted for hydrolysis, forming a tetrahedral transition state intermediate. A second aspartate (Asp Y, Figure 7–6) then facilitates the decomposition of this tetrahedral intermediate by donating a proton to the amino group produced by rupture of the peptide bond. Two different active site aspartates thus can act simultaneously as a general base or as a general acid. This is possible because their immediate environment favors ionization of one but not the other.
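How two chemically identical side chains can sit in opposite ionization states follows from the Henderson-Hasselbalch relationship once their local pKa values differ. The sketch below is illustrative only: the pKa values assigned to Asp X and Asp Y and the working pH are hypothetical numbers chosen to show the effect, not measured values for any aspartic protease.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of an acid present as its conjugate base."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH = 5.0  # assumed working pH, for illustration only

# Hypothetical pKa values for the two active-site aspartates.
for name, pka in [("Asp X", 3.5), ("Asp Y", 6.5)]:
    frac = fraction_deprotonated(PH, pka)
    print(f"{name} (pKa {pka}): {frac:.2f} deprotonated at pH {PH}")
```

With these assumed values, Asp X is almost entirely ionized (able to accept a proton, acting as a base) while Asp Y is almost entirely protonated (able to donate one, acting as an acid).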

CHYMOTRYPSIN & FRUCTOSE-2,6-BISPHOSPHATASE ILLUSTRATE COVALENT CATALYSIS

Chymotrypsin

While catalysis by aspartic proteases involves the direct hydrolytic attack of water on a peptide bond, catalysis by the serine protease chymotrypsin involves prior formation of a covalent acyl enzyme intermediate. A highly reactive seryl residue, serine 195, participates in a charge-relay network with histidine 57 and aspartate 102. Far apart in primary structure, in the active site these residues are within bond-forming distance of one another. Aligned in the order Asp 102-His 57-Ser 195, they constitute a “charge-relay network” that functions as a “proton shuttle.”

Figure 7–4. Ping-pong mechanism for transamination. E-CHO and E-CH2NH2 represent the enzyme-pyridoxal phosphate and enzyme-pyridoxamine complexes, respectively. (Ala, alanine; Pyr, pyruvate; KG, α-ketoglutarate; Glu, glutamate). [Sequence shown: E-CHO binds Ala and releases Pyr, leaving E-CH2NH2; E-CH2NH2 binds KG and releases Glu, regenerating E-CHO.]

Figure 7–5. Two-dimensional representation of Koshland’s induced fit model of the active site of a lyase. Binding of the substrate A–B induces conformational changes in the enzyme that align catalytic residues which participate in catalysis and strain the bond between A and B, facilitating its cleavage.

Figure 7–6. Mechanism for catalysis by an aspartic protease such as HIV protease. Curved arrows indicate directions of electron movement. (1) Aspartate X acts as a base to activate a water molecule by abstracting a proton. (2) The activated water molecule attacks the peptide bond, forming a transient tetrahedral intermediate. (3) Aspartate Y acts as an acid to facilitate breakdown of the tetrahedral intermediate and release of the split products by donating a proton to the newly formed amino group. Subsequent shuttling of the proton on Asp X to Asp Y restores the protease to its initial state.

Binding of substrate initiates proton shifts that in effect transfer the hydroxyl proton of Ser 195 to Asp 102 (Figure 7–7). The enhanced nucleophilicity of the seryl oxygen facilitates its attack on the carbonyl carbon of the peptide bond of the substrate, forming a covalent acyl-enzyme intermediate. The hydrogen on Asp 102 then shuttles through His 57 to the amino group liberated when the peptide bond is cleaved. The portion of the original peptide with a free amino group then leaves the active site and is replaced by a water molecule. The charge-relay network now activates the water molecule by withdrawing a proton through His 57 to Asp 102. The resulting hydroxide ion attacks the acyl-enzyme intermediate and a reverse proton shuttle returns a proton to Ser 195, restoring its original state. While modified during the process of catalysis, chymotrypsin emerges unchanged on completion of the reaction. Trypsin and elastase employ a similar catalytic mechanism, but the numbers of the residues in their Ser-His-Asp proton shuttles differ.

Figure 7–7. Catalysis by chymotrypsin. (1) The charge-relay system removes a proton from Ser 195, making it a stronger nucleophile. (2) Activated Ser 195 attacks the peptide bond, forming a transient tetrahedral intermediate. (3) Release of the amino terminal peptide is facilitated by donation of a proton to the newly formed amino group by His 57 of the charge-relay system, yielding an acyl-Ser 195 intermediate. (4) His 57 and Asp 102 collaborate to activate a water molecule, which attacks the acyl-Ser 195, forming a second tetrahedral intermediate. (5) The charge-relay system donates a proton to Ser 195, facilitating breakdown of the tetrahedral intermediate to release the carboxyl terminal peptide (6).

Fructose-2,6-Bisphosphatase

Fructose-2,6-bisphosphatase, a regulatory enzyme of gluconeogenesis (Chapter 19), catalyzes the hydrolytic release of the phosphate on carbon 2 of fructose 2,6-bisphosphate. Figure 7–8 illustrates the roles of seven active site residues. Catalysis involves a “catalytic triad” of one Glu and two His residues and a covalent phosphohistidyl intermediate.

CATALYTIC RESIDUES ARE HIGHLY CONSERVED

Members of an enzyme family such as the aspartic or serine proteases employ a similar mechanism to catalyze a common reaction type but act on different substrates. Enzyme families appear to arise through gene duplication events that create a second copy of the gene which encodes a particular enzyme. The proteins encoded by the two genes can then evolve independently to recognize different substrates, resulting, for example, in chymotrypsin, which cleaves peptide bonds on the carboxyl terminal side of large hydrophobic amino acids; and trypsin, which cleaves peptide bonds on the carboxyl terminal side of basic amino acids. The common ancestry of enzymes can be inferred from the presence of specific amino acids in the same position in each family member. These residues are said to be conserved residues. Proteins that share a large number of conserved residues are said to be homologous to one another. Table 7–1 illustrates the primary structural conservation of two components of the charge-relay network for several serine proteases. Among the most highly conserved residues are those that participate directly in catalysis.
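The different substrate preferences of chymotrypsin and trypsin can be illustrated with a simple in-silico digest. This is a simplified sketch: the set of “large hydrophobic” residues (Phe, Tyr, Trp) is an assumption standing in for the text's general description, the test peptide is a made-up sequence, and real proteases show additional sequence-context effects that are ignored here.

```python
from typing import List

# Residues after which each protease cleaves (one-letter codes).
# The chymotrypsin set (Phe, Tyr, Trp) is an assumed stand-in for
# "large hydrophobic amino acids"; trypsin's set is Lys and Arg (basic).
CLEAVE_AFTER = {
    "chymotrypsin": set("FYW"),
    "trypsin": set("KR"),
}


def digest(peptide: str, protease: str) -> List[str]:
    """Cut a peptide on the carboxyl side of each residue the protease recognizes."""
    targets = CLEAVE_AFTER[protease]
    fragments, start = [], 0
    for i, residue in enumerate(peptide):
        # Skip the last residue: there is no peptide bond after it.
        if residue in targets and i < len(peptide) - 1:
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


peptide = "GAKLFSRWE"  # hypothetical sequence, for illustration only
print("trypsin:     ", digest(peptide, "trypsin"))
print("chymotrypsin:", digest(peptide, "chymotrypsin"))
```

The same peptide yields different fragment sets with the two enzymes, even though both use the same catalytic mechanism.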

ISOZYMES ARE DISTINCT ENZYME FORMS THAT CATALYZE THE SAME REACTION

Higher organisms often elaborate several physically distinct versions of a given enzyme, each of which catalyzes the same reaction. Like the members of other protein families, these protein catalysts or isozymes arise through gene duplication. Isozymes may exhibit subtle differences in properties such as sensitivity to particular regulatory factors (Chapter 9) or substrate affinity (eg, hexokinase and glucokinase) that adapt them to specific tissues or circumstances. Some isozymes may also enhance survival by providing a “backup” copy of an essential enzyme.

Figure 7–8. Catalysis by fructose-2,6-bisphosphatase. (1) Lys 356 and Arg 257, 307, and 352 stabilize the quadruple negative charge of the substrate by charge-charge interactions. Glu 327 stabilizes the positive charge on His 392. (2) The nucleophile His 392 attacks the C-2 phosphoryl group and transfers it to His 258, forming a phosphoryl-enzyme intermediate. Fructose 6-phosphate leaves the enzyme. (3) Nucleophilic attack by a water molecule, possibly assisted by Glu 327 acting as a base, forms inorganic phosphate. (4) Inorganic orthophosphate is released from Arg 257 and Arg 307. [Panel labels: (1) E • Fru-2,6-P2; (2) E-P • Fru-6-P; (3) E-P • H2O; (4) E • Pi.] (Reproduced, with permission, from Pilkis SJ et al: 6-Phosphofructo-2-kinase/fructose-2,6-bisphosphatase: A metabolic signaling enzyme. Annu Rev Biochem 1995;64:799.)

THE CATALYTIC ACTIVITY OF ENZYMES FACILITATES THEIR DETECTION

The minute quantities of enzymes present in cells complicate determination of their presence and concentration. However, the ability to rapidly transform thousands of molecules of a specific substrate into products imbues each enzyme with the ability to reveal its presence. Assays of the catalytic activity of enzymes are frequently used in research and clinical laboratories. Under appropriate conditions (see Chapter 8), the rate of the catalytic reaction being monitored is proportionate to the amount of enzyme present, which allows its concentration to be inferred.

Enzyme-Linked Immunoassays

The sensitivity of enzyme assays can also be exploited to detect proteins that lack catalytic activity. Enzyme-linked immunoassays (ELISAs) use antibodies covalently linked to a “reporter enzyme” such as alkaline phosphatase or horseradish peroxidase, enzymes whose products are readily detected. When serum or other samples to be tested are placed in a plastic microtiter plate, the proteins adhere to the plastic surface and are immobilized. Any remaining absorbing areas of the well are then “blocked” by adding a nonantigenic protein such as bovine serum albumin. A solution of antibody covalently linked to a reporter enzyme is then added. The antibodies adhere to the immobilized antigen and are thereby themselves immobilized. Excess free antibody molecules are then removed by washing. The presence and quantity of bound antibody are then determined by adding the substrate for the reporter enzyme.

Table 7–1. Amino acid sequences in the neighborhood of the catalytic sites of several bovine proteases. Regions shown are those on either side of the catalytic site seryl (S) and histidyl (H) residues, which are enclosed in brackets.

Enzyme            Sequence Around Serine [S]           Sequence Around Histidine [H]
Trypsin           D S C Q D G [S] G G P V V C S G K    V V S A A [H] C Y K S G
Chymotrypsin A    S S C M G D [S] G G P L V C K K N    V V T A A [H] G G V T T
Chymotrypsin B    S S C M G D [S] G G P L V C Q K N    V V T A A [H] C G V T T
Thrombin          D A C E G D [S] G G P F V M K S P    V L T A A [H] C L L Y P


NAD(P)+-Dependent Dehydrogenases Are Assayed Spectrophotometrically

The physicochemical properties of the reactants in an enzyme-catalyzed reaction dictate the options for the assay of enzyme activity. Spectrophotometric assays exploit the ability of a substrate or product to absorb light. The reduced coenzymes NADH and NADPH, written as NAD(P)H, absorb light at a wavelength of 340 nm, whereas their oxidized forms NAD(P)+ do not (Figure 7–9). When NAD(P)+ is reduced, the absorbance at 340 nm therefore increases in proportion to, and at a rate determined by, the quantity of NAD(P)H produced. Conversely, for a dehydrogenase that catalyzes the oxidation of NAD(P)H, a decrease in absorbance at 340 nm will be observed. In each case, the rate of change in optical density at 340 nm will be proportionate to the quantity of enzyme present.
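As an illustration of how a change in absorbance at 340 nm translates into a quantity of NAD(P)H, the sketch below applies the Beer-Lambert law. The molar absorption coefficient of 6220 L/mol/cm is a commonly quoted literature value and is an assumption here, not a figure given in this chapter; the function name and the sample reading are hypothetical.

```python
# Assumed molar absorption coefficient of NAD(P)H at 340 nm
# (a commonly quoted literature value; not stated in this chapter).
EPSILON_NADH_340 = 6220.0  # L mol^-1 cm^-1


def nadh_rate_from_absorbance(delta_a340_per_min, path_cm=1.0,
                              epsilon=EPSILON_NADH_340):
    """Convert a rate of absorbance change at 340 nm (per minute) into
    micromolar NAD(P)H formed or consumed per minute (Beer-Lambert law)."""
    molar_per_min = delta_a340_per_min / (epsilon * path_cm)
    return molar_per_min * 1e6  # mol/L -> micromol/L


# Hypothetical reading: absorbance rises by 0.0622 per minute in a 1 cm cell.
rate = nadh_rate_from_absorbance(0.0622)
print(f"{rate:.1f} micromolar NAD(P)H per minute")
```

A negative absorbance change passed to the same function would describe a dehydrogenase that oxidizes NAD(P)H.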

Many Enzymes Are Assayed by Coupling to a Dehydrogenase

The assay of enzymes whose reactions are not accompanied by a change in absorbance or fluorescence is generally more difficult. In some instances, the product or remaining substrate can be transformed into a more readily detected compound. In other instances, the reaction product may have to be separated from unreacted substrate prior to measurement, a process facilitated by the use of radioactive substrates. An alternative strategy is to devise a synthetic substrate whose product absorbs light. For example, p-nitrophenyl phosphate is an artificial substrate for certain phosphatases and for chymotrypsin that does not absorb visible light. However, following hydrolysis, the resulting p-nitrophenylate anion absorbs light at 419 nm.

Another quite general approach is to employ a “coupled” assay (Figure 7–10). Typically, a dehydrogenase whose substrate is the product of the enzyme of interest is added in catalytic excess. The rate of appearance or disappearance of NAD(P)H then depends on the rate of the enzyme reaction to which the dehydrogenase has been coupled.

THE ANALYSIS OF CERTAIN ENZYMES AIDS DIAGNOSIS

Of the thousands of different enzymes present in the human body, those that fulfill functions indispensable to cell vitality are present throughout the body tissues. Other enzymes or isozymes are expressed only in specific cell types, during certain periods of development, or in response to specific physiologic or pathophysiologic changes. Analysis of the presence and distribution of enzymes and isozymes whose expression is normally tissue-, time-, or circumstance-specific often aids diagnosis.

Figure 7–9. Absorption spectra of NAD+ and NADH. Densities are for a 44 mg/L solution in a cell with a 1 cm light path. NADP+ and NADPH have spectra analogous to those of NAD+ and NADH, respectively. [Plot: optical density (0 to 1.0) versus wavelength (200 to 400 nm) for NADH and NAD+.]

Figure 7–10. Coupled enzyme assay for hexokinase activity. The production of glucose 6-phosphate by hexokinase is coupled to the oxidation of this product by glucose-6-phosphate dehydrogenase in the presence of added enzyme and NADP+. When an excess of glucose-6-phosphate dehydrogenase is present, the rate of formation of NADPH, which can be measured at 340 nm, is governed by the rate of formation of glucose 6-phosphate by hexokinase. [Scheme: glucose + ATP, Mg2+ → (HEXOKINASE) → glucose 6-phosphate + ADP, Mg2+; glucose 6-phosphate + NADP+ → (GLUCOSE-6-PHOSPHATE DEHYDROGENASE) → 6-phosphogluconolactone + NADPH + H+.]


Nonfunctional Plasma Enzymes Aid Diagnosis & Prognosis

Certain enzymes, proenzymes, and their substrates are present at all times in the circulation of normal individuals and perform a physiologic function in the blood. Examples of these functional plasma enzymes include lipoprotein lipase, pseudocholinesterase, and the proenzymes of blood coagulation and blood clot dissolution (Chapters 9 and 51). The majority of these enzymes are synthesized in and secreted by the liver.

Plasma also contains numerous other enzymes that perform no known physiologic function in blood. These apparently nonfunctional plasma enzymes arise from the routine normal destruction of erythrocytes, leukocytes, and other cells. Tissue damage or necrosis resulting from injury or disease is generally accompanied by increases in the levels of several nonfunctional plasma enzymes. Table 7–2 lists several enzymes used in diagnostic enzymology.

Table 7–2. Principal serum enzymes used in clinical diagnosis. Many of the enzymes are not specific for the disease listed.

Serum Enzyme                                   Major Diagnostic Use
Aminotransferases
  Aspartate aminotransferase (AST, or SGOT)    Myocardial infarction
  Alanine aminotransferase (ALT, or SGPT)      Viral hepatitis
Amylase                                        Acute pancreatitis
Ceruloplasmin                                  Hepatolenticular degeneration (Wilson’s disease)
Creatine kinase                                Muscle disorders and myocardial infarction
γ-Glutamyl transpeptidase                      Various liver diseases
Lactate dehydrogenase (isozymes)               Myocardial infarction
Lipase                                         Acute pancreatitis
Phosphatase, acid                              Metastatic carcinoma of the prostate
Phosphatase, alkaline (isozymes)               Various bone disorders, obstructive liver diseases

Isozymes of Lactate Dehydrogenase Are Used to Detect Myocardial Infarctions

L-Lactate dehydrogenase is a tetrameric enzyme whose four subunits occur in two isoforms, designated H (for heart) and M (for muscle). The subunits can combine as shown below to yield catalytically active isozymes of L-lactate dehydrogenase:

Lactate Dehydrogenase
Isozyme    Subunits
I1         HHHH
I2         HHHM
I3         HHMM
I4         HMMM
I5         MMMM
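The five isozymes listed above are simply the distinct ways of choosing four subunits from two types when order does not matter. A minimal sketch of that counting follows; the function name is hypothetical, and the code says nothing about how the subunits actually assemble.

```python
from itertools import combinations_with_replacement


def tetramer_compositions(subunits=("H", "M"), size=4):
    """List every distinct subunit composition of an oligomer,
    ignoring the order in which the subunits are arranged."""
    combos = combinations_with_replacement(subunits, size)
    return ["".join(combo) for combo in combos]


isozymes = tetramer_compositions()
for number, composition in enumerate(isozymes, start=1):
    print(f"I{number}: {composition}")
```

Two subunit types and four positions give five compositions, matching isozymes I1 through I5.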

Distinct genes whose expression is differentially regulated in various tissues encode the H and M subunits. Since heart expresses the H subunit almost exclusively, isozyme I1 predominates in this tissue. By contrast, isozyme I5 predominates in liver. Small quantities of lactate dehydrogenase are normally present in plasma. Following a myocardial infarction or in liver disease, the damaged tissues release characteristic lactate dehydrogenase isoforms into the blood. The resulting elevation in the levels of the I1 or I5 isozymes is detected by separating the different oligomers of lactate dehydrogenase by electrophoresis and assaying their catalytic activity (Figure 7–11).

ENZYMES FACILITATE DIAGNOSIS OF GENETIC DISEASES

While many diseases have long been known to result from alterations in an individual’s DNA, tools for the detection of genetic mutations have only recently become widely available. These techniques rely upon the catalytic efficiency and specificity of enzyme catalysts. For example, the polymerase chain reaction (PCR) relies upon the ability of enzymes to serve as catalytic amplifiers to analyze the DNA present in biologic and forensic samples. In the PCR technique, a thermostable DNA polymerase, directed by appropriate oligonucleotide primers, produces thousands of copies of a sample of DNA that was present initially at levels too low for direct detection.

The detection of restriction fragment length polymorphisms (RFLPs) facilitates prenatal detection of hereditary disorders such as sickle cell trait, beta-thalassemia, infant phenylketonuria, and Huntington’s disease. Detection of RFLPs involves cleavage of double-stranded DNA by restriction endonucleases, which can detect subtle alterations in DNA that affect their recognized sites. Chapter 40 provides further details concerning the use of PCR and restriction enzymes for diagnosis.
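The principle behind an RFLP can be sketched with a toy digest: a single base change that destroys a recognition site merges two fragments into one, altering the pattern of fragment lengths. The example assumes the EcoRI recognition sequence GAATTC, cut after the first G; the two DNA sequences are invented for illustration and do not correspond to any real gene, and only one strand of a linear molecule is modeled.

```python
def fragment_lengths(dna, site="GAATTC", cut_offset=1):
    """Return the fragment lengths produced by cutting one strand of a
    linear DNA molecule at every occurrence of `site`.
    `cut_offset` is the cut position within the site (1 = after the first base)."""
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical sequences: the variant differs by one base (A -> G),
# which abolishes the second recognition site.
normal = "ATGCGAATTCTTAGGCCGAATTCAAGT"
variant = "ATGCGAATTCTTAGGCCGAGTTCAAGT"

normal_fragments = fragment_lengths(normal)
variant_fragments = fragment_lengths(variant)
print("normal :", normal_fragments)
print("variant:", variant_fragments)
```

In practice the fragments are separated by electrophoresis, so the difference appears as a change in band positions rather than as a list of numbers.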


Figure 7–11. Normal and pathologic patterns of lactate dehydrogenase (LDH) isozymes in human serum. LDH isozymes of serum were separated by electrophoresis and visualized using the coupled reaction scheme shown on the left. (NBT, nitroblue tetrazolium; PMS, phenazine methylsulfate). At right is shown the stained electropherogram. Pattern A is serum from a patient with a myocardial infarct; B is normal serum; and C is serum from a patient with liver disease. Arabic numerals denote specific LDH isozymes. [Scheme: SH2 (lactate) + NAD+ → (LACTATE DEHYDROGENASE) → S (pyruvate) + NADH + H+, linked through oxidized and reduced PMS to the conversion of oxidized NBT (colorless) to reduced NBT (blue formazan). Electropherogram: patterns A (heart), B (normal), and C (liver), with isozyme positions labeled 5, 4, 3, 2, 1 along an axis marked + and –.]

RECOMBINANT DNA PROVIDES AN IMPORTANT TOOL FOR STUDYING ENZYMES

Recombinant DNA technology has emerged as an important asset in the study of enzymes. Highly purified samples of enzymes are necessary for the study of their structure and function. The isolation of an individual enzyme, particularly one present in low concentration, from among the thousands of proteins present in a cell can be extremely difficult. If the gene for the enzyme of interest has been cloned, it generally is possible to produce large quantities of its encoded protein in Escherichia coli or yeast. However, not all animal proteins can be expressed in active form in microbial cells, nor do microbes perform certain posttranslational processing tasks. For these reasons, a gene may be expressed in cultured animal cell systems employing the baculovirus expression vector to transform cultured insect cells. For more details concerning recombinant DNA techniques, see Chapter 40.

Recombinant Fusion Proteins Are Purified by Affinity Chromatography

Recombinant DNA technology can also be used to create modified proteins that are readily purified by affinity chromatography. The gene of interest is linked to an oligonucleotide sequence that encodes a carboxyl or amino terminal extension to the encoded protein. The resulting modified protein, termed a fusion protein, contains a domain tailored to interact with a specific affinity support. One popular approach is to attach an oligonucleotide that encodes six consecutive histidine residues. The expressed “His tag” protein binds to chromatographic supports that contain an immobilized divalent metal ion such as Ni2+. Alternatively, the substrate-binding domain of glutathione S-transferase (GST) can serve as a “GST tag.” Figure 7–12 illustrates the purification of a GST-fusion protein using an affinity support containing bound glutathione. Fusion proteins also often encode a cleavage site for a highly specific protease such as thrombin in the region that links the two portions of the protein. This permits removal of the added fusion domain following affinity purification.

Site-Directed Mutagenesis Provides Mechanistic Insights

Once the ability to express a protein from its cloned gene has been established, it is possible to employ site-directed mutagenesis to change specific aminoacyl residues by altering their codons. Used in combination with kinetic analyses and x-ray crystallography, this approach facilitates identification of the specific roles of given aminoacyl residues in substrate binding and catalysis. For example, the inference that a particular aminoacyl residue functions as a general acid can be tested by replacing it with an aminoacyl residue incapable of donating a proton.


Figure 7–12. Use of glutathione S-transferase (GST) fusion proteins to purify recombinant proteins. (GSH, glutathione.) [Scheme: a plasmid encoding GST with a thrombin site (T) and cloned DNA encoding the enzyme are ligated together to encode a GST–T–enzyme fusion; cells are transfected, an inducing agent is added, and the cells are broken; the extract is applied to a glutathione (GSH) affinity column, where the fusion protein binds to GSH on Sepharose beads; the column is eluted with GSH and the eluate treated with thrombin, separating GST–T from the enzyme.]

SUMMARY

• Enzymes are highly effective and extremely specific catalysts.

• Organic and inorganic prosthetic groups, cofactors, and coenzymes play important roles in catalysis. Coenzymes, many of which are derivatives of B vitamins, serve as “shuttles.”

• Catalytic mechanisms employed by enzymes include the introduction of strain, approximation of reactants, acid-base catalysis, and covalent catalysis.

• Aminoacyl residues that participate in catalysis are highly conserved among all classes of a given enzyme activity.

• Substrates and enzymes induce mutual conformational changes in one another that facilitate substrate recognition and catalysis.

• The catalytic activity of enzymes reveals their presence, facilitates their detection, and provides the basis for enzyme-linked immunoassays.

• Many enzymes can be assayed spectrophotometrically by coupling them to an NAD(P)+-dependent dehydrogenase.

• Assay of plasma enzymes aids diagnosis and prognosis. For example, a myocardial infarction elevates serum levels of lactate dehydrogenase isozyme I1.

• Restriction endonucleases facilitate diagnosis of genetic diseases by revealing restriction fragment length polymorphisms.

• Site-directed mutagenesis, used to change residues suspected of being important in catalysis or substrate binding, provides insights into the mechanisms of enzyme action.

• Recombinant fusion proteins such as His-tagged or GST fusion enzymes are readily purified by affinity chromatography.



Chapter 8. Enzymes: Kinetics

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

Enzyme kinetics is the field of biochemistry concerned with the quantitative measurement of the rates of enzyme-catalyzed reactions and the systematic study of factors that affect these rates. Kinetic analyses permit scientists to reconstruct the number and order of the individual steps by which enzymes transform substrates into products. The study of enzyme kinetics also represents the principal way to identify potential therapeutic agents that selectively enhance or inhibit the rates of specific enzyme-catalyzed processes. Together with site-directed mutagenesis and other techniques that probe protein structure, kinetic analysis can also reveal details of the catalytic mechanism. A complete, balanced set of enzyme activities is of fundamental importance for maintaining homeostasis. An understanding of enzyme kinetics thus is important for understanding how physiologic stresses such as anoxia, metabolic acidosis or alkalosis, toxins, and pharmacologic agents affect that balance.

CHEMICAL REACTIONS ARE DESCRIBED USING BALANCED EQUATIONS

A balanced chemical equation lists the initial chemical species (substrates) present and the new chemical species (products) formed for a particular chemical reaction, all in their correct proportions or stoichiometry. For example, balanced equation (1) below describes the reaction of one molecule each of substrates A and B to form one molecule each of products P and Q.

$$A + B \rightleftharpoons P + Q \tag{1}$$

The double arrows indicate reversibility, an intrinsic property of all chemical reactions. Thus, for reaction (1), if A and B can form P and Q, then P and Q can also form A and B. Designation of a particular reactant as a “substrate” or “product” is therefore somewhat arbitrary since the products for a reaction written in one direction are the substrates for the reverse reaction. The term “products” is, however, often used to designate the reactants whose formation is thermodynamically favored. Reactions for which thermodynamic factors strongly favor formation of the products to which the arrow points often are represented with a single arrow as if they were “irreversible”:

$$A + B \rightarrow P + Q \tag{2}$$

Unidirectional arrows are also used to describe reactions in living cells where the products of reaction (2) are immediately consumed by a subsequent enzyme-catalyzed reaction. The rapid removal of product P or Q therefore precludes occurrence of the reverse reaction, rendering equation (2) functionally irreversible under physiologic conditions.

CHANGES IN FREE ENERGY DETERMINE THE DIRECTION & EQUILIBRIUM STATE OF CHEMICAL REACTIONS

The Gibbs free energy change ∆G (also called either the free energy or Gibbs energy) describes both the direction in which a chemical reaction will tend to proceed and the concentrations of reactants and products that will be present at equilibrium. ∆G for a chemical reaction equals the sum of the free energies of formation of the reaction products ∆GP minus the sum of the free energies of formation of the substrates ∆GS. ∆G0 denotes the change in free energy that accompanies transition from the standard state, one-molar concentrations of substrates and products, to equilibrium. A more useful biochemical term is ∆G0′, which defines ∆G0 at a standard state of 10−7 M protons, pH 7.0 (Chapter 10). If the free energy of the products is lower than that of the substrates, the signs of ∆G0 and ∆G0′ will be negative, indicating that the reaction as written is favored in the direction left to right. Such reactions are referred to as spontaneous. The sign and the magnitude of the free energy change determine how far the reaction will proceed. Equation (3)

$$\Delta G^{0} = -RT \ln K_{eq} \tag{3}$$

illustrates the relationship between the equilibrium constant Keq and ∆G0, where R is the gas constant (1.98 cal/mol/K or 8.31 J/mol/K) and T is the absolute temperature in kelvins. Keq is equal to the product of the concentrations of the reaction products, each raised to the power of their stoichiometry, divided by the product of the concentrations of the substrates, each raised to the power of their stoichiometry.
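The relationship between ∆G0 and Keq in equation (3) can be explored numerically. The sketch below uses the gas constant quoted in the text (8.31 J/mol/K); the temperature of 298.15 K and the function names are assumptions chosen for illustration.

```python
import math

R_JOULES = 8.31  # gas constant in J/mol/K, as quoted in the text


def delta_g0_from_keq(keq, temperature_k=298.15, r=R_JOULES):
    """Standard free energy change (J/mol) from the equilibrium constant,
    following equation (3): delta G0 = -RT ln Keq."""
    return -r * temperature_k * math.log(keq)


def keq_from_delta_g0(delta_g0, temperature_k=298.15, r=R_JOULES):
    """Equilibrium constant from the standard free energy change (J/mol)."""
    return math.exp(-delta_g0 / (r * temperature_k))


for keq in (0.1, 1.0, 10.0):
    print(f"Keq = {keq:5.1f}  ->  delta G0 = {delta_g0_from_keq(keq):9.1f} J/mol")
```

A Keq greater than 1 gives a negative ∆G0, a Keq of exactly 1 gives zero, and a Keq below 1 gives a positive ∆G0. At this assumed temperature each tenfold change in Keq corresponds to roughly 5.7 kJ/mol.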


For the reaction A + B ⇌ P + Q,

$$K_{eq} = \frac{[P][Q]}{[A][B]} \tag{4}$$

and for reaction (5)

$$A + A \rightleftharpoons P \tag{5}$$

$$K_{eq} = \frac{[P]}{[A]^{2}} \tag{6}$$

∆G0 may be calculated from equation (3) if the concentrations of substrates and products present at equilibrium are known. If ∆G0 is a negative number, Keq will be greater than unity and the concentration of products at equilibrium will exceed that of substrates. If ∆G0 is positive, Keq will be less than unity and the formation of substrates will be favored.

Notice that, since ∆G0 is a function exclusively of the initial and final states of the reacting species, it can provide information only about the direction and equilibrium state of the reaction. ∆G0 is independent of the mechanism of the reaction and therefore provides no information concerning rates of reactions. Consequently, and as explained below, although a reaction may have a large negative ∆G0 or ∆G0′, it may nevertheless take place at a negligible rate.

THE RATES OF REACTIONS ARE DETERMINED BY THEIR ACTIVATION ENERGY

Reactions Proceed via Transition States

The concept of the transition state is fundamental to understanding the chemical and thermodynamic basis of catalysis. Equation (7) depicts a displacement reaction in which an entering group E displaces a leaving group L, attached initially to R.

$$E + R{-}L \rightleftharpoons E{-}R + L \tag{7}$$

Midway through the displacement, the bond between R and L has weakened but has not yet been completely severed, and the new bond between E and R is as yet incompletely formed. This transient intermediate, in which neither free substrate nor product exists, is termed the transition state, E···R···L. Dotted lines represent the “partial” bonds that are undergoing formation and rupture.

Reaction (7) can be thought of as consisting of two “partial reactions,” the first corresponding to the formation (F) and the second to the subsequent decay (D) of the transition state intermediate. As for all reactions, characteristic changes in free energy, ∆GF and ∆GD, are associated with each partial reaction.

$$E + R{-}L \rightleftharpoons E \cdots R \cdots L \qquad \Delta G_{F} \tag{8}$$

$$E \cdots R \cdots L \rightleftharpoons E{-}R + L \qquad \Delta G_{D} \tag{9}$$

$$E + R{-}L \rightleftharpoons E{-}R + L \qquad \Delta G = \Delta G_{F} + \Delta G_{D} \tag{10}$$

For the overall reaction (10), ∆G is the sum of ∆GF and ∆GD. As for any equation of two terms, it is not possible to infer from ∆G either the sign or the magnitude of ∆GF or ∆GD.

Many reactions involve multiple transition states, each with an associated change in free energy. For these reactions, the overall ∆G represents the sum of all of the free energy changes associated with the formation and decay of all of the transition states. Therefore, it is not possible to infer from the overall ∆G the number or type of transition states through which the reaction proceeds. Stated another way: overall thermodynamics tells us nothing about kinetics.

∆GF Defines the Activation Energy

Regardless of the sign or magnitude of ∆G, ∆GF for the overwhelming majority of chemical reactions has a positive sign. The formation of transition state intermediates therefore requires surmounting of energy barriers. For this reason, ∆GF is often termed the activation energy, Eact, the energy required to surmount a given energy barrier. The ease, and hence the frequency, with which this barrier is overcome is inversely related to Eact. The thermodynamic parameters that determine how fast a reaction proceeds thus are the ∆GF values for formation of the transition states through which the reaction proceeds. For a simple reaction, where ∝ means “proportionate to,”

$$\text{Rate} \propto e^{-E_{act}/RT} \tag{11}$$

The activation energy for the reaction proceeding in the opposite direction to that drawn is equal to −∆GD.

NUMEROUS FACTORS AFFECT THE REACTION RATE

The kinetic theory of chemical kinetics, also called the collision theory, states that for two molecules to react they must (1) approach within bond-forming distance of one another, or “collide”; and (2) possess sufficient kinetic energy to overcome the energy barrier for reaching the transition state. It therefore follows that anything which increases the frequency or energy of collision between substrates will increase the rate of the reaction in which they participate.

Figure 8–1. The energy barrier for chemical reactions. [Plot: number of molecules versus kinetic energy for three distributions labeled A, B, and C, with a vertical bar marking the energy barrier.]

Temperature

Raising the temperature increases the kinetic energy of molecules. As illustrated in Figure 8–1, the total number of molecules whose kinetic energy exceeds the energy barrier Eact (vertical bar) for formation of products increases from low (A), through intermediate (B), to high (C) temperatures. Increasing the kinetic energy of molecules also increases their motion and therefore the frequency with which they collide. This combination of more frequent and more highly energetic and productive collisions increases the reaction rate.
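The temperature dependence described above can be made concrete with the exponential term of equation (11). The sketch below evaluates exp(−Eact/RT) at two temperatures; the activation energy of 50 kJ/mol is a hypothetical value chosen for illustration, and the calculation ignores the smaller effect of temperature on collision frequency.

```python
import math

R = 8.31  # gas constant in J/mol/K, as quoted in the text


def boltzmann_factor(e_act, temperature_k):
    """Exponential term of equation (11): exp(-Eact / RT)."""
    return math.exp(-e_act / (R * temperature_k))


E_ACT = 50_000.0  # hypothetical activation energy, J/mol

ratio = boltzmann_factor(E_ACT, 310.0) / boltzmann_factor(E_ACT, 300.0)
print(f"Relative rate at 310 K versus 300 K: {ratio:.2f}")
```

With this assumed barrier, a 10-degree rise nearly doubles the exponential factor; a higher barrier would make the rate more sensitive to temperature, and a lower barrier less so.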

Reactant Concentration

The frequency with which molecules collide is directly proportionate to their concentrations. For two different molecules A and B, the frequency with which they collide will double if the concentration of either A or B is doubled. If the concentrations of both A and B are doubled, the probability of collision will increase fourfold.

For a chemical reaction proceeding at constant temperature that involves one molecule each of A and B,

$$A + B \rightarrow P \tag{12}$$

the number of molecules that possess kinetic energy sufficient to overcome the activation energy barrier will be a constant. The number of collisions with sufficient energy to produce product P therefore will be directly proportionate to the number of collisions between A and B and thus to their molar concentrations, denoted by square brackets.

$$\text{Rate} \propto [A][B] \tag{13}$$

Similarly, for the reaction represented by

$$A + 2B \rightarrow P \tag{14}$$

which can also be written as

$$A + B + B \rightarrow P \tag{15}$$

the corresponding rate expression is

$$\text{Rate} \propto [A][B][B] \tag{16}$$

or

$$\text{Rate} \propto [A][B]^{2} \tag{17}$$

For the general case when n molecules of A react with m molecules of B,

$$nA + mB \rightarrow P \tag{18}$$

the rate expression is

$$\text{Rate} \propto [A]^{n}[B]^{m} \tag{19}$$

Replacing the proportionality sign with an equals sign by introducing a proportionality or rate constant k characteristic of the reaction under study gives equations (20) and (21), in which the subscripts 1 and −1 refer to the rate constants for the forward and reverse reactions, respectively.

$$\text{Rate}_{1} = k_{1}[A]^{n}[B]^{m} \tag{20}$$

$$\text{Rate}_{-1} = k_{-1}[P] \tag{21}$$

Keq Is a Ratio of Rate Constants

While all chemical reactions are to some extent reversible, at equilibrium the overall concentrations of reactants and products remain constant. At equilibrium, the rate of conversion of substrates to products therefore equals the rate at which products are converted to substrates.

$$\text{Rate}_{1} = \text{Rate}_{-1} \tag{22}$$

Therefore,

$$k_{1}[A]^{n}[B]^{m} = k_{-1}[P] \tag{23}$$

and

$$\frac{k_{1}}{k_{-1}} = \frac{[P]}{[A]^{n}[B]^{m}} \tag{24}$$

The ratio of k1 to k−1 is termed the equilibrium constant, Keq. The following important properties of a system at equilibrium must be kept in mind:

(1) The equilibrium constant is a ratio of the reaction rate constants (not the reaction rates).

(2) At equilibrium, the reaction rates (not the rate constants) of the forward and back reactions are equal.

(3) Equilibrium is a dynamic state. Although there is no net change in the concentration of substrates or products, individual substrate and product molecules are continually being interconverted.

(4) The numeric value of the equilibrium constant Keq can be calculated either from the concentrations of substrates and products at equilibrium or from the ratio k1/k−1.
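Property (4) can be checked with a small numerical experiment. The sketch below integrates the forward and reverse rate expressions for A + B ⇌ P (n = m = 1) with simple Euler steps until the concentrations stop changing, then compares [P]/([A][B]) with k1/k−1. The rate constants, starting concentrations, and step size are arbitrary illustrative values.

```python
def simulate_to_equilibrium(k1, k_minus1, a0, b0, p0=0.0,
                            dt=1e-3, steps=50_000):
    """Integrate A + B <-> P with explicit Euler steps.
    Forward rate = k1[A][B]; reverse rate = k_minus1[P]."""
    a, b, p = a0, b0, p0
    for _ in range(steps):
        net = (k1 * a * b - k_minus1 * p) * dt  # net conversion in this step
        a -= net
        b -= net
        p += net
    return a, b, p


K1, K_MINUS1 = 2.0, 0.5  # arbitrary rate constants for illustration
a, b, p = simulate_to_equilibrium(K1, K_MINUS1, a0=1.0, b0=1.0)

keq_from_concentrations = p / (a * b)
keq_from_rate_constants = K1 / K_MINUS1
print(f"[P]/([A][B]) at equilibrium: {keq_from_concentrations:.4f}")
print(f"k1/k-1:                      {keq_from_rate_constants:.4f}")
```

At the end of the run the forward and reverse rates are equal even though neither is zero, which is the dynamic equilibrium described in property (3).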

THE KINETICS OF ENZYMATIC CATALYSIS

Enzymes Lower the Activation Energy Barrier for a Reaction

All enzymes accelerate reaction rates by lowering ∆GF for formation of the transition states. However, they may differ in the way this is achieved. Where the mechanism or the sequence of chemical steps at the active site is essentially the same as those for the same reaction proceeding in the absence of a catalyst, the environment of the active site lowers ∆GF by stabilizing the transition state intermediates. As discussed in Chapter 7, stabilization can involve (1) acid-base groups suitably positioned to transfer protons to or from the developing transition state intermediate, (2) suitably positioned charged groups or metal ions that stabilize developing charges, or (3) the imposition of steric strain on substrates so that their geometry approaches that of the transition state. HIV protease (Figure 7–6) illustrates catalysis by an enzyme that lowers the activation barrier by stabilizing a transition state intermediate.

Catalysis by enzymes that proceeds via a unique reaction mechanism typically occurs when the transition state intermediate forms a covalent bond with the enzyme (covalent catalysis). The catalytic mechanism of the serine protease chymotrypsin (Figure 7–7) illustrates how an enzyme utilizes covalent catalysis to provide a unique reaction pathway.

ENZYMES DO NOT AFFECT Keq

Enzymes accelerate reaction rates by lowering the acti-vation barrier ∆GF. While they may undergo transientmodification during the process of catalysis, enzymesemerge unchanged at the completion of the reaction.The presence of an enzyme therefore has no effect on∆G0 for the overall reaction, which is a function solelyof the initial and final states of the reactants. Equation(25) shows the relationship between the equilibriumconstant for a reaction and the standard free energychange for that reaction:

(25)

If we include the presence of the enzyme (Enz) in the calculation of the equilibrium constant for a reaction,

$$A + B + \mathrm{Enz} \rightleftharpoons P + Q + \mathrm{Enz} \tag{26}$$

the expression for the equilibrium constant,

$$K_{eq} = \frac{[P][Q][\mathrm{Enz}]}{[A][B][\mathrm{Enz}]} \tag{27}$$

reduces to one identical to that for the reaction in the absence of the enzyme:

$$K_{eq} = \frac{[P][Q]}{[A][B]} \tag{28}$$

Enzymes therefore have no effect on Keq.

MULTIPLE FACTORS AFFECT THE RATES OF ENZYME-CATALYZED REACTIONS

Temperature

Raising the temperature increases the rate of both uncatalyzed and enzyme-catalyzed reactions by increasing the kinetic energy and the collision frequency of the reacting molecules. However, heat energy can also increase the kinetic energy of the enzyme to a point that exceeds the energy barrier for disrupting the noncovalent interactions that maintain the enzyme’s three-dimensional structure. The polypeptide chain then begins to unfold, or denature, with an accompanying rapid loss of catalytic activity. The temperature range over which an enzyme maintains a stable, catalytically competent conformation depends upon, and typically moderately exceeds, the normal temperature of the cells in which it resides. Enzymes from humans generally exhibit stability at temperatures up to 45–55 °C. By contrast, enzymes from the thermophilic microorganisms that reside in volcanic hot springs or undersea hydrothermal vents may be stable up to or above 100 °C.

The Q10, or temperature coefficient, is the factor by which the rate of a biologic process increases for a 10 °C increase in temperature. For the temperatures over which enzymes are stable, the rates of most biologic processes typically double for a 10 °C rise in temperature (Q10 = 2). Changes in the rates of enzyme-catalyzed reactions that accompany a rise or fall in body temperature constitute a prominent survival feature for “cold-blooded” life forms such as lizards or fish, whose body temperatures are dictated by the external environment. However, for mammals and other homeothermic organisms, changes in enzyme reaction rates with temperature assume physiologic importance only in circumstances such as fever or hypothermia.
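The Q10 relationship described above can be sketched numerically. The following is a minimal illustration, assuming a constant Q10 over the whole temperature range and ignoring denaturation; the function name and the reference rate of 1.0 at 37 °C are hypothetical choices made only for this example.

```python
def rate_at_temperature(rate_ref, temp_ref_c, temp_c, q10=2.0):
    """Scale a rate measured at temp_ref_c to temp_c, assuming a constant Q10.

    Only meaningful within the range where the enzyme is stable;
    thermal denaturation is not modeled here.
    """
    return rate_ref * q10 ** ((temp_c - temp_ref_c) / 10.0)


# Hypothetical reference point: 1.0 rate unit at 37 degrees C.
for temp in (27.0, 37.0, 40.0, 47.0):
    rate = rate_at_temperature(1.0, 37.0, temp)
    print(f"{temp:4.1f} C -> relative rate {rate:.2f}")
```

With Q10 = 2, a 10 °C rise doubles the rate and a 10 °C fall halves it, while a 3 °C fever gives a more modest increase.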

$$\Delta G^{\circ} = -RT \ln K_{eq}$$


Figure 8–2. Effect of pH on enzyme activity. Consider, for example, a negatively charged enzyme (EH−) that binds a positively charged substrate (SH+). Shown is the proportion (%, from 0 to 100) of SH+ [\\\] and of EH− [///] as a function of pH (from low to high). Only in the cross-hatched area do both the enzyme and the substrate bear an appropriate charge.

Figure 8–3. Effect of substrate concentration on the initial velocity of an enzyme-catalyzed reaction. [Plot of vi (y axis) versus [S] (x axis), with Vmax, Vmax/2, Km, and points A, B, and C marked.]

Hydrogen Ion Concentration

The rate of almost all enzyme-catalyzed reactions exhibits a significant dependence on hydrogen ion concentration. Most intracellular enzymes exhibit optimal activity at pH values between 5 and 9. The relationship of activity to hydrogen ion concentration (Figure 8–2) reflects the balance between enzyme denaturation at high or low pH and effects on the charged state of the enzyme, the substrates, or both. For enzymes whose mechanism involves acid-base catalysis, the residues involved must be in the appropriate state of protonation for the reaction to proceed. The binding and recognition of substrate molecules with dissociable groups also typically involves the formation of salt bridges with the enzyme. The most common charged groups are the negatively charged carboxylate groups and the positively charged groups of protonated amines. Gain or loss of critical charged groups will therefore adversely affect substrate binding and thereby retard or abolish catalysis.
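The charge argument illustrated in Figure 8–2 can be sketched with the Henderson-Hasselbalch relationship. This is a simplified illustration, not a model of any particular enzyme: it assumes a single ionizable group on the enzyme and one on the substrate, treats the two ionizations as independent, and uses hypothetical pKa values (5.0 and 9.0) chosen only for the example.

```python
def fraction_protonated(ph, pka):
    """Fraction of a group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def fraction_deprotonated(ph, pka):
    """Fraction of a group in its deprotonated form."""
    return 1.0 - fraction_protonated(ph, pka)


def productive_fraction(ph, pka_enzyme=5.0, pka_substrate=9.0):
    """Fraction of enzyme-substrate pairs in which the enzyme group is
    deprotonated (negative) and the substrate is protonated (positive).

    The pKa defaults are hypothetical values used only for illustration.
    """
    return fraction_deprotonated(ph, pka_enzyme) * fraction_protonated(ph, pka_substrate)


for ph in (3.0, 5.0, 7.0, 9.0, 11.0):
    print(f"pH {ph:4.1f}: appropriately charged fraction = {productive_fraction(ph):.3f}")
```

Under these assumptions the appropriately charged fraction is highest between the two pKa values and falls off on either side, which mirrors the cross-hatched region of Figure 8–2.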

ASSAYS OF ENZYME-CATALYZED REACTIONS TYPICALLY MEASURE THE INITIAL VELOCITY

Most measurements of the rates of enzyme-catalyzed reactions employ relatively short time periods, conditions that approximate initial rate conditions. Under these conditions, only traces of product accumulate, hence the rate of the reverse reaction is negligible. The initial velocity (vi) of the reaction thus is essentially that of the rate of the forward reaction. Assays of enzyme activity almost always employ a large (10^3 to 10^7) molar excess of substrate over enzyme. Under these conditions, vi is proportionate to the concentration of enzyme. Measuring the initial velocity therefore permits one to estimate the quantity of enzyme present in a biologic sample.

SUBSTRATE CONCENTRATION AFFECTS REACTION RATE

In what follows, enzyme reactions are treated as if they had only a single substrate and a single product. While most enzymes have more than one substrate, the principles discussed below apply with equal validity to enzymes with multiple substrates.

For a typical enzyme, as substrate concentration is increased, vi increases until it reaches a maximum value Vmax (Figure 8–3). When further increases in substrate concentration do not further increase vi, the enzyme is said to be “saturated” with substrate. Note that the shape of the curve that relates activity to substrate concentration (Figure 8–3) is hyperbolic. Two considerations explain this shape. First, at any given instant, only substrate molecules that are combined with the enzyme as an ES complex can be transformed into product. Second, the equilibrium constant for the formation of the enzyme-substrate complex is not infinitely large. Therefore, even when the substrate is present in excess (points A and B of Figure 8–4), only a fraction of the enzyme may be present as an ES complex. At points A or B, increasing or decreasing [S] therefore will increase or decrease the number of ES complexes with a corresponding change in vi. At point C (Figure 8–4), essentially all the enzyme is present as the ES complex. Since no free enzyme remains available for forming ES, further increases in [S] cannot increase the rate of the reaction. Under these saturating conditions, vi depends solely on, and thus is limited by, the rapidity with which free enzyme is released to combine with more substrate.
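The behavior at points A, B, and C can be sketched with the hyperbolic rate law that the chapter goes on to formalize as the Michaelis-Menten equation. The values of Vmax and Km below are hypothetical and carry arbitrary units; the function name is likewise chosen only for this illustration.

```python
def initial_velocity(s, vmax, km):
    """Hyperbolic saturation curve: v_i = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


VMAX = 100.0  # hypothetical maximal velocity, arbitrary units
KM = 2.0      # hypothetical Michaelis constant, same units as [S]

points = (
    ("A, [S] far below Km", 0.02),
    ("B, [S] equal to Km", 2.0),
    ("C, [S] far above Km", 2000.0),
)
for label, s in points:
    v = initial_velocity(s, VMAX, KM)
    print(f"{label}: v_i = {v:.2f} ({100.0 * v / VMAX:.1f}% of Vmax)")
```

At low [S] the velocity is nearly proportional to [S], at [S] = Km it is half-maximal, and at very high [S] it approaches Vmax without exceeding it.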


Figure 8–4. Representation of an enzyme at low (A) and at high (C) substrate concentration, and at a substrate concentration equal to Km (B). Points A, B, and C correspond to those points in Figure 8–3. [Panels A, B, and C; separate symbols denote substrate S and enzyme E.]

THE MICHAELIS-MENTEN & HILL EQUATIONS MODEL THE EFFECTS OF SUBSTRATE CONCENTRATION

The Michaelis-Menten Equation

The Michaelis-Menten equation (29) illustrates in mathematical terms the relationship between initial reaction velocity vi and substrate concentration [S], shown graphically in Figure 8–3.

$$v_i = \frac{V_{max}[S]}{K_m + [S]} \tag{29}$$

The Michaelis constant Km is the substrate concentration at which vi is half the maximal velocity (Vmax/2) attainable at a particular concentration of enzyme. Km thus has the dimensions of substrate concentration. The dependence of initial reaction velocity on [S] and Km may be illustrated by evaluating the Michaelis-Menten equation under three conditions.

(1) When [S] is much less than Km (point A in Figures 8–3 and 8–4), the term Km + [S] is essentially equal to Km. Replacing Km + [S] with Km reduces equation (29) to

$$v_i = \frac{V_{max}[S]}{K_m + [S]} \approx \frac{V_{max}[S]}{K_m} \approx \left(\frac{V_{max}}{K_m}\right)[S] \tag{30}$$

where ≈ means “approximately equal to.” Since Vmax and Km are both constants, their ratio is a constant. In other words, when [S] is considerably below Km, vi ≈ k[S], where k is the constant Vmax/Km. The initial reaction velocity therefore is directly proportionate to [S].

(2) When [S] is much greater than Km (point C in Figures 8–3 and 8–4), the term Km + [S] is essentially equal to [S]. Replacing Km + [S] with [S] reduces equation (29) to

$$v_i = \frac{V_{max}[S]}{K_m + [S]} \approx \frac{V_{max}[S]}{[S]} \approx V_{max} \tag{31}$$

Thus, when [S] greatly exceeds Km, the reaction velocity is maximal (Vmax) and unaffected by further increases in substrate concentration.

(3) When [S] = Km (point B in Figures 8–3 and 8–4),

$$v_i = \frac{V_{max}[S]}{K_m + [S]} = \frac{V_{max}[S]}{2[S]} = \frac{V_{max}}{2} \tag{32}$$

Equation (32) states that when [S] equals Km, the initial velocity is half-maximal. Equation (32) also reveals that Km is, and may be determined experimentally from, the substrate concentration at which the initial velocity is half-maximal.

A Linear Form of the Michaelis-Menten Equation Is Used to Determine Km & Vmax

The direct measurement of the numeric value of Vmax and therefore the calculation of Km often requires impractically high concentrations of substrate to achieve saturating conditions. A linear form of the Michaelis-Menten equation circumvents this difficulty and permits Vmax and Km to be extrapolated from initial velocity data obtained at less than saturating concentrations of substrate. Starting with equation (29),

$$v_i = \frac{V_{max}[S]}{K_m + [S]} \tag{29}$$


Figure 8–5. Double reciprocal or Lineweaver-Burk plot of 1/vi versus 1/[S] used to evaluate Km and Vmax. [The line has slope Km/Vmax, y intercept 1/Vmax, and x intercept −1/Km.]

invert

$$\frac{1}{v_i} = \frac{K_m + [S]}{V_{max}[S]} \tag{33}$$

factor

$$\frac{1}{v_i} = \frac{K_m}{V_{max}[S]} + \frac{[S]}{V_{max}[S]} \tag{34}$$

and simplify

$$\frac{1}{v_i} = \left(\frac{K_m}{V_{max}}\right)\frac{1}{[S]} + \frac{1}{V_{max}} \tag{35}$$

Equation (35) is the equation for a straight line, y = ax + b, where y = 1/vi and x = 1/[S]. A plot of 1/vi as y as a function of 1/[S] as x therefore gives a straight line whose y intercept is 1/Vmax and whose slope is Km/Vmax. Such a plot is called a double reciprocal or Lineweaver-Burk plot (Figure 8–5). Setting the y term equal to zero and solving for x, as in equation (36), reveals that the x intercept is −1/Km.

$$0 = ax + b; \quad \text{therefore, } x = \frac{-b}{a} = \frac{-1}{K_m} \tag{36}$$

Km is thus most easily calculated from the x intercept.

Km May Approximate a Binding Constant

The affinity of an enzyme for its substrate is the inverse of the dissociation constant Kd for dissociation of the enzyme-substrate complex ES.

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \tag{37}$$

$$K_d = \frac{k_{-1}}{k_1} \tag{38}$$

Stated another way, the smaller the tendency of the enzyme and its substrate to dissociate, the greater the affinity of the enzyme for its substrate. While the Michaelis constant Km often approximates the dissociation constant Kd, this is by no means always the case. For a typical enzyme-catalyzed reaction,

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{39}$$

the value of [S] that gives vi = Vmax/2 is

$$[S] = \frac{k_{-1} + k_2}{k_1} = K_m \tag{40}$$

When k−1 ≫ k2, then

$$k_{-1} + k_2 \approx k_{-1} \tag{41}$$

and

$$[S] \approx \frac{k_{-1}}{k_1} \approx K_d \tag{42}$$

Hence, 1/Km only approximates 1/Kd under conditions where the association and dissociation of the ES complex is rapid relative to the rate-limiting step in catalysis. For the many enzyme-catalyzed reactions for which k−1 + k2 is not approximately equal to k−1, 1/Km will underestimate 1/Kd.

The Hill Equation Describes the Behavior of Enzymes That Exhibit Cooperative Binding of Substrate

While most enzymes display the simple saturation kinetics depicted in Figure 8–3 and are adequately described by the Michaelis-Menten expression, some enzymes bind their substrates in a cooperative fashion analogous to the binding of oxygen by hemoglobin (Chapter 6). Cooperative behavior may be encountered for multimeric enzymes that bind substrate at multiple sites. For enzymes that display positive cooperativity in binding substrate, the shape of the curve that relates changes in vi to changes in [S] is sigmoidal (Figure 8–6). Neither the Michaelis-Menten expression nor its derived double-reciprocal plots can be used to evaluate cooperative saturation kinetics. Enzymologists therefore employ a graphic representation of the Hill equation originally derived to describe the cooperative binding of O2 by hemoglobin. Equation (43) represents the Hill equation arranged in a form that predicts a straight line, where k′ is a complex constant.


Figure 8–7. A graphic representation of a linear form of the Hill equation is used to evaluate S50, the substrate concentration that produces half-maximal velocity, and the degree of cooperativity n. [Plot of log vi/(Vmax − vi) versus log [S]; the slope is n, and S50 is read where the line crosses zero on the y axis.]

Figure 8–6. Representation of sigmoid substrate saturation kinetics. [Plot of vi versus [S].]

$$\log \frac{v_i}{V_{max} - v_i} = n \log [S] - \log k' \tag{43}$$

Equation (43) states that when [S] is low relative to k′, the initial reaction velocity increases as the nth power of [S].

A graph of log vi/(Vmax − vi) versus log[S] gives a straight line (Figure 8–7), where the slope of the line n is the Hill coefficient, an empirical parameter whose value is a function of the number, kind, and strength of the interactions of the multiple substrate-binding sites on the enzyme. When n = 1, all binding sites behave independently, and simple Michaelis-Menten kinetic behavior is observed. If n is greater than 1, the enzyme is said to exhibit positive cooperativity. Binding of the first substrate molecule then enhances the affinity of the enzyme for binding additional substrate. The greater the value for n, the higher the degree of cooperativity and the more sigmoidal will be the plot of vi versus [S]. A perpendicular dropped from the point where the y term log vi/(Vmax − vi) is zero intersects the x axis at a substrate concentration termed S50, the substrate concentration that results in half-maximal velocity. S50 thus is analogous to the P50 for oxygen binding to hemoglobin (Chapter 6).
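The Hill plot analysis described above can be sketched as follows. The example generates noise-free synthetic velocities from the rate law vi = Vmax[S]^n / (k′ + [S]^n), which is the form implied by equation (43), and then recovers n and S50 from a least-squares line through the transformed points. The parameter values and function names are hypothetical, and Vmax is assumed to be known in advance; real data would carry experimental error.

```python
import math


def hill_velocity(s, vmax, k_prime, n):
    """Hill-type rate law: v_i = Vmax * [S]^n / (k' + [S]^n)."""
    return vmax * s ** n / (k_prime + s ** n)


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hill_parameters(substrates, velocities, vmax):
    """Return (n, S50) from a plot of log[v/(Vmax - v)] versus log[S]."""
    xs = [math.log10(s) for s in substrates]
    ys = [math.log10(v / (vmax - v)) for v in velocities]
    slope, intercept = linear_fit(xs, ys)
    # The y term is zero where log[S] = -intercept / slope, which defines S50.
    s50 = 10.0 ** (-intercept / slope)
    return slope, s50


# Hypothetical parameters used to generate synthetic data.
VMAX, K_PRIME, N = 10.0, 16.0, 2.0
substrates = [1.0, 2.0, 4.0, 8.0, 16.0]
velocities = [hill_velocity(s, VMAX, K_PRIME, N) for s in substrates]

n_fit, s50_fit = hill_parameters(substrates, velocities, VMAX)
print(f"Hill coefficient n = {n_fit:.2f}, S50 = {s50_fit:.2f}")
```

Because the slope of the line is n and the intercept is −log k′, S50 equals k′ raised to the power 1/n under this rate law.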

KINETIC ANALYSIS DISTINGUISHES COMPETITIVE FROM NONCOMPETITIVE INHIBITION

Inhibitors of the catalytic activities of enzymes provide both pharmacologic agents and research tools for study of the mechanism of enzyme action. Inhibitors can be classified based upon their site of action on the enzyme, on whether or not they chemically modify the enzyme, or on the kinetic parameters they influence. Kinetically, we distinguish two classes of inhibitors based upon whether raising the substrate concentration does or does not overcome the inhibition.

Competitive Inhibitors Typically Resemble Substrates

The effects of competitive inhibitors can be overcome by raising the concentration of the substrate. Most frequently, in competitive inhibition the inhibitor, I, binds to the substrate-binding portion of the active site and blocks access by the substrate. The structures of most classic competitive inhibitors therefore tend to resemble the structures of a substrate and thus are termed substrate analogs. Inhibition of the enzyme succinate dehydrogenase by malonate illustrates competitive inhibition by a substrate analog. Succinate dehydrogenase catalyzes the removal of one hydrogen atom from each of the two methylene carbons of succinate (Figure 8–8). Both succinate and its structural analog malonate (−OOC—CH2—COO−) can bind to the active site of succinate dehydrogenase, forming an ES or an EI complex, respectively. However, since malonate contains only one methylene carbon, it cannot undergo dehydrogenation. The formation and dissociation of the EI complex is a dynamic process described by

$$\mathrm{EnzI} \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \mathrm{Enz} + I \tag{44}$$

for which the equilibrium constant Ki is

$$K_i = \frac{[\mathrm{Enz}][I]}{[\mathrm{EnzI}]} = \frac{k_1}{k_{-1}} \tag{45}$$

In effect, a competitive inhibitor acts by decreasing the number of free enzyme molecules available to bind substrate, ie, to form ES, and thus eventually to form product, as described below:

$$E\text{-}I \overset{\pm I}{\rightleftharpoons} E \overset{\pm S}{\rightleftharpoons} E\text{-}S \longrightarrow E + P \tag{46}$$

A competitive inhibitor and substrate exert reciprocal effects on the concentration of the EI and ES complexes. Since binding substrate removes free enzyme available to combine with inhibitor, increasing the [S] decreases the concentration of the EI complex and raises the reaction velocity. The extent to which [S] must be increased to completely overcome the inhibition depends upon the concentration of inhibitor present, its affinity for the enzyme Ki, and the Km of the enzyme for its substrate.

Figure 8–8. The succinate dehydrogenase reaction. [Succinate, −OOC—CH2—CH2—COO−, loses 2H to form fumarate, −OOC—CH=CH—COO−.]

Double Reciprocal Plots Facilitate the Evaluation of Inhibitors

Double reciprocal plots distinguish between competitive and noncompetitive inhibitors and simplify evaluation of inhibition constants Ki. vi is determined at several substrate concentrations both in the presence and in the absence of inhibitor. For classic competitive inhibition, the lines that connect the experimental data points meet at the y axis (Figure 8–9). Since the y intercept is equal to 1/Vmax, this pattern indicates that when 1/[S] approaches 0, vi is independent of the presence of inhibitor. Note, however, that the intercept on the x axis does vary with inhibitor concentration, and that since −1/K′m lies closer to zero than −1/Km, K′m (the “apparent Km”) becomes larger in the presence of increasing concentrations of inhibitor. Thus, a competitive inhibitor has no effect on Vmax but raises K′m, the apparent Km for the substrate.

Figure 8–9. Lineweaver-Burk plot of competitive inhibition. Note the complete relief of inhibition at high [S] (ie, low 1/[S]). [The lines with and without inhibitor share the y intercept 1/Vmax; the x intercept is −1/Km without inhibitor and −1/K′m with inhibitor.]

For simple competitive inhibition, the intercept on the x axis is

$$x = \frac{-1}{K_m\left(1 + \frac{[I]}{K_i}\right)} \tag{47}$$

Once Km has been determined in the absence of inhibitor, Ki can be calculated from equation (47). Ki values are used to compare different inhibitors of the same enzyme. The lower the value for Ki, the more effective the inhibitor. For example, the statin drugs that act as competitive inhibitors of HMG-CoA reductase (Chapter 26) have Ki values several orders of magnitude lower than the Km for the substrate HMG-CoA.

Simple Noncompetitive Inhibitors Lower Vmax but Do Not Affect Km

In noncompetitive inhibition, binding of the inhibitor does not affect binding of substrate. Formation of both EI and EIS complexes is therefore possible. However, while the enzyme-inhibitor complex can still bind substrate, its efficiency at transforming substrate to product, reflected by Vmax, is decreased. Noncompetitive inhibitors bind enzymes at sites distinct from the substrate-binding site and generally bear little or no structural resemblance to the substrate.

For simple noncompetitive inhibition, E and EI possess identical affinity for substrate, and the EIS complex generates product at a negligible rate (Figure 8–10). More complex noncompetitive inhibition occurs when binding of the inhibitor does affect the apparent affinity of the enzyme for substrate, causing the lines to intersect in either the second or third quadrant of a double reciprocal plot (not shown).
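The double reciprocal analysis of inhibitors can be sketched as follows. The example fits a least-squares line to 1/vi versus 1/[S] for noise-free synthetic data and reads off the apparent Km and Vmax; Ki is then recovered from the apparent Km using the relationship in equation (47). The rate law for simple noncompetitive inhibition, in which Vmax is divided by (1 + [I]/Ki), is the standard textbook form and is an assumption here, since the chapter does not write it out. All numeric values and function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def velocity(s, vmax, km, inhibitor, ki, mode):
    """Initial velocity in the presence of a reversible inhibitor.

    competitive:    apparent Km = Km * (1 + [I]/Ki), Vmax unchanged
    noncompetitive: apparent Vmax = Vmax / (1 + [I]/Ki), Km unchanged
    """
    factor = 1.0 + inhibitor / ki
    if mode == "competitive":
        return vmax * s / (km * factor + s)
    if mode == "noncompetitive":
        return (vmax / factor) * s / (km + s)
    raise ValueError(f"unknown mode: {mode}")


def lineweaver_burk(substrates, velocities):
    """Return (apparent Km, apparent Vmax) from a double reciprocal fit."""
    slope, intercept = linear_fit(
        [1.0 / s for s in substrates], [1.0 / v for v in velocities]
    )
    vmax_app = 1.0 / intercept   # y intercept is 1/Vmax
    km_app = slope * vmax_app    # slope is Km/Vmax
    return km_app, vmax_app


# Hypothetical kinetic constants and assay conditions.
VMAX, KM, KI, INHIBITOR = 100.0, 2.0, 1.5, 3.0
substrates = [0.5, 1.0, 2.0, 4.0, 8.0]


def fitted(mode, inhibitor):
    rates = [velocity(s, VMAX, KM, inhibitor, KI, mode) for s in substrates]
    return lineweaver_burk(substrates, rates)


km_0, vmax_0 = fitted("competitive", 0.0)        # no inhibitor present
km_c, vmax_c = fitted("competitive", INHIBITOR)
km_n, vmax_n = fitted("noncompetitive", INHIBITOR)

# Equation (47): apparent Km = Km * (1 + [I]/Ki), solved for Ki.
ki_from_fit = INHIBITOR / (km_c / km_0 - 1.0)

print(f"no inhibitor:   Km = {km_0:.2f}, Vmax = {vmax_0:.1f}")
print(f"competitive:    Km' = {km_c:.2f}, Vmax = {vmax_c:.1f}, Ki = {ki_from_fit:.2f}")
print(f"noncompetitive: Km = {km_n:.2f}, Vmax' = {vmax_n:.1f}")
```

In practice, reciprocal transformation magnifies the error in measurements made at low [S], so fits to real data are often weighted or done by nonlinear regression instead; the unweighted fit is used here only to mirror the graphic method in the text.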


Figure 8–10. Lineweaver-Burk plot for simple noncompetitive inhibition. [The lines with and without inhibitor share the x intercept −1/Km; the y intercept is 1/Vmax without inhibitor and 1/V′max with inhibitor.]

Figure 8–11. Representations of three classes of Bi-Bi reaction mechanisms. Horizontal lines represent the enzyme. Arrows indicate the addition of substrates and departure of products. Top: An ordered Bi-Bi reaction, characteristic of many NAD(P)H-dependent oxidoreductases. Center: A random Bi-Bi reaction, characteristic of many kinases and some dehydrogenases. Bottom: A ping-pong reaction, characteristic of aminotransferases and serine proteases.

Irreversible Inhibitors “Poison” Enzymes

In the above examples, the inhibitors form a dissociable, dynamic complex with the enzyme. Fully active enzyme can therefore be recovered simply by removing the inhibitor from the surrounding medium. However, a variety of other inhibitors act irreversibly by chemically modifying the enzyme. These modifications generally involve making or breaking covalent bonds with aminoacyl residues essential for substrate binding, catalysis, or maintenance of the enzyme’s functional conformation. Since these covalent changes are relatively stable, an enzyme that has been “poisoned” by an irreversible inhibitor remains inhibited even after removal of the remaining inhibitor from the surrounding medium.

MOST ENZYME-CATALYZED REACTIONS INVOLVE TWO OR MORE SUBSTRATES

While many enzymes have a single substrate, many others have two, and sometimes more than two, substrates and products. The fundamental principles discussed above, while illustrated for single-substrate enzymes, apply also to multisubstrate enzymes. The mathematical expressions used to evaluate multisubstrate reactions are, however, complex. While detailed kinetic analysis of multisubstrate reactions exceeds the scope of this chapter, two-substrate, two-product reactions (termed “Bi-Bi” reactions) are considered below.

Sequential or Single Displacement Reactions

In sequential reactions, both substrates must combine with the enzyme to form a ternary complex before catalysis can proceed (Figure 8–11, top). Sequential reactions are sometimes referred to as single displacement reactions because the group undergoing transfer is usually passed directly, in a single step, from one substrate to the other. Sequential Bi-Bi reactions can be further distinguished based on whether the two substrates add in a random or in a compulsory order. For random-order reactions, either substrate A or substrate B may combine first with the enzyme to form an EA or an EB complex (Figure 8–11, center). For compulsory-order reactions, A must first combine with E before B can combine with the EA complex. One explanation for a compulsory-order mechanism is that the addition of A induces a conformational change in the enzyme that aligns residues which recognize and bind B.

Ping-Pong Reactions

The term “ping-pong” applies to mechanisms in which one or more products are released from the enzyme before all the substrates have been added. Ping-pong reactions involve covalent catalysis and a transient, modified form of the enzyme (Figure 7–4). Ping-pong Bi-Bi reactions are double displacement reactions. The group undergoing transfer is first displaced from substrate A by the enzyme to form product P and a modified form of the enzyme (F). The subsequent group transfer from F to the second substrate B, forming product Q and regenerating E, constitutes the second displacement (Figure 8–11, bottom).

Figure 8–12. Lineweaver-Burk plot for a two-substrate ping-pong reaction. An increase in concentration of one substrate (S1) while that of the other substrate (S2) is maintained constant changes both the x and y intercepts, but not the slope. [Plot of 1/vi versus 1/[S1] at increasing [S2].]

Most Bi-Bi Reactions Conform to Michaelis-Menten Kinetics

Most Bi-Bi reactions conform to a somewhat more complex form of Michaelis-Menten kinetics in which Vmax refers to the reaction rate attained when both substrates are present at saturating levels. Each substrate has its own characteristic Km value which corresponds to the concentration that yields half-maximal velocity when the second substrate is present at saturating levels. As for single-substrate reactions, double-reciprocal plots can be used to determine Vmax and Km. vi is measured as a function of the concentration of one substrate (the variable substrate) while the concentration of the other substrate (the fixed substrate) is maintained constant. If the lines obtained for several fixed-substrate concentrations are plotted on the same graph, it is possible to distinguish between a ping-pong enzyme, which yields parallel lines, and a sequential mechanism, which yields a pattern of intersecting lines (Figure 8–12).
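The contrast between parallel and intersecting double reciprocal lines can be sketched numerically. The two rate equations below are the standard steady-state forms for ping-pong and sequential (ternary complex) Bi-Bi mechanisms; they are not given in this chapter and are assumed here for illustration, as are all constants and function names. The slope of 1/vi versus 1/[A] is computed at several fixed concentrations of B.

```python
def ping_pong_rate(a, b, vmax=100.0, km_a=1.0, km_b=2.0):
    """Assumed ping-pong Bi-Bi rate law (no constant term in the denominator)."""
    return vmax * a * b / (km_b * a + km_a * b + a * b)


def sequential_rate(a, b, vmax=100.0, km_a=1.0, km_b=2.0, ki_a=0.5):
    """Assumed sequential Bi-Bi rate law (extra Kia * KmB term)."""
    return vmax * a * b / (ki_a * km_b + km_b * a + km_a * b + a * b)


def double_reciprocal_slope(rate, b, a_low=0.5, a_high=5.0):
    """Slope of 1/v versus 1/[A] at a fixed [B].

    Both rate laws are exactly linear in 1/[A], so two points suffice.
    """
    x_low, x_high = 1.0 / a_high, 1.0 / a_low
    y_low, y_high = 1.0 / rate(a_high, b), 1.0 / rate(a_low, b)
    return (y_high - y_low) / (x_high - x_low)


for b in (1.0, 2.0, 8.0):
    pp = double_reciprocal_slope(ping_pong_rate, b)
    seq = double_reciprocal_slope(sequential_rate, b)
    print(f"[B] = {b:3.1f}: ping-pong slope = {pp:.5f}, sequential slope = {seq:.5f}")
```

Under these assumed rate laws the ping-pong slope stays at KmA/Vmax whatever the fixed [B], giving parallel lines, whereas the sequential slope changes with [B], so the lines intersect.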

Product inhibition studies are used to complement kinetic analyses and to distinguish between ordered and random Bi-Bi reactions. For example, in a random-order Bi-Bi reaction, each product will be a competitive inhibitor regardless of which substrate is designated the variable substrate. However, for a sequential mechanism (Figure 8–11, bottom), only product Q will give the pattern indicative of competitive inhibition when A is the variable substrate, while only product P will produce this pattern with B as the variable substrate. The other combinations of product inhibitor and variable substrate will produce forms of complex noncompetitive inhibition.

SUMMARY

• The study of enzyme kinetics (the factors that affect the rates of enzyme-catalyzed reactions) reveals the individual steps by which enzymes transform substrates into products.

• ∆G, the overall change in free energy for a reaction, is independent of reaction mechanism and provides no information concerning rates of reactions.

• Enzymes do not affect Keq. Keq, a ratio of reaction rate constants, may be calculated from the concentrations of substrates and products at equilibrium or from the ratio k1/k−1.

• Reactions proceed via transition states in which ∆GF is the activation energy. Temperature, hydrogen ion concentration, enzyme concentration, substrate concentration, and inhibitors all affect the rates of enzyme-catalyzed reactions.

• A measurement of the rate of an enzyme-catalyzed reaction generally employs initial rate conditions, for which the essential absence of product precludes the reverse reaction.

• A linear form of the Michaelis-Menten equation simplifies determination of Km and Vmax.

• A linear form of the Hill equation is used to evaluate the cooperative substrate-binding kinetics exhibited by some multimeric enzymes. The slope n, the Hill coefficient, reflects the number, nature, and strength of the interactions of the substrate-binding sites. A value of n greater than 1 indicates positive cooperativity.

• The effects of competitive inhibitors, which typically resemble substrates, are overcome by raising the concentration of the substrate. Noncompetitive inhibitors lower Vmax but do not affect Km.

• Substrates may add in a random order (either substrate may combine first with the enzyme) or in a compulsory order (substrate A must bind before substrate B).

• In ping-pong reactions, one or more products are released from the enzyme before all the substrates have added.



Enzymes: Regulation of Activities 9


Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

The 19th-century physiologist Claude Bernard enunciated the conceptual basis for metabolic regulation. He observed that living organisms respond in ways that are both quantitatively and temporally appropriate to permit them to survive the multiple challenges posed by changes in their external and internal environments. Walter Cannon subsequently coined the term “homeostasis” to describe the ability of animals to maintain a constant intracellular environment despite changes in their external environment. We now know that organisms respond to changes in their external and internal environment by balanced, coordinated changes in the rates of specific metabolic reactions. Many human diseases, including cancer, diabetes, cystic fibrosis, and Alzheimer’s disease, are characterized by regulatory dysfunctions triggered by pathogenic agents or genetic mutations. For example, many oncogenic viruses elaborate protein-tyrosine kinases that modify the regulatory events which control patterns of gene expression, contributing to the initiation and progression of cancer. The toxin from Vibrio cholerae, the causative agent of cholera, disables sensor-response pathways in intestinal epithelial cells by ADP-ribosylating the GTP-binding proteins (G-proteins) that link cell surface receptors to adenylyl cyclase. The consequent activation of the cyclase triggers the flow of water into the intestines, resulting in massive diarrhea and dehydration. Yersinia pestis, the causative agent of plague, elaborates a protein-tyrosine phosphatase that hydrolyzes phosphoryl groups on key cytoskeletal proteins. Knowledge of factors that control the rates of enzyme-catalyzed reactions thus is essential to an understanding of the molecular basis of disease. This chapter outlines the patterns by which metabolic processes are controlled and provides illustrative examples. Subsequent chapters provide additional examples.

REGULATION OF METABOLITE FLOW CAN BE ACTIVE OR PASSIVE

Enzymes that operate at their maximal rate cannot respond to an increase in substrate concentration, and can respond only to a precipitous decrease in substrate concentration. For most enzymes, therefore, the average intracellular concentration of their substrate tends to be close to the Km value, so that changes in substrate concentration generate corresponding changes in metabolite flux (Figure 9–1). Responses to changes in substrate level represent an important but passive means for coordinating metabolite flow and maintaining homeostasis in quiescent cells. However, they offer limited scope for responding to changes in environmental variables. The mechanisms that regulate enzyme activity in an active manner in response to internal and external signals are discussed below.
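The point made by Figure 9–1 can be sketched with a short calculation: the same increment in substrate concentration produces a much larger change in velocity near Km than far above it. The example assumes simple Michaelis-Menten kinetics, with Vmax and Km both set to 1 in arbitrary units; these values and the function names are hypothetical.

```python
def velocity(s, vmax=1.0, km=1.0):
    """Michaelis-Menten velocity, with Vmax and Km in arbitrary units."""
    return vmax * s / (km + s)


def velocity_change(s, delta_s, vmax=1.0, km=1.0):
    """Change in velocity when [S] rises from s to s + delta_s."""
    return velocity(s + delta_s, vmax, km) - velocity(s, vmax, km)


DELTA_S = 0.5  # the same increment applied at both starting concentrations
change_near_km = velocity_change(1.0, DELTA_S)   # starting at [S] = Km
change_far_above = velocity_change(20.0, DELTA_S)  # starting at [S] = 20 x Km

print(f"change in V starting at Km:      {change_near_km:.4f}")
print(f"change in V starting at 20 x Km: {change_far_above:.4f}")
```

With these numbers the response near Km is roughly two orders of magnitude larger, which is why an enzyme operating close to saturation responds poorly to changes in substrate supply.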

Metabolite Flow Tends to Be Unidirectional

Despite the existence of short-term oscillations in metabolite concentrations and enzyme levels, living cells exist in a dynamic steady state in which the mean concentrations of metabolic intermediates remain relatively constant over time (Figure 9–2). While all chemical reactions are to some extent reversible, in living cells the reaction products serve as substrates for, and are removed by, other enzyme-catalyzed reactions. Many nominally reversible reactions thus occur unidirectionally. This succession of coupled metabolic reactions is accompanied by an overall change in free energy that favors unidirectional metabolite flow (Chapter 10). The unidirectional flow of metabolites through a pathway with a large overall negative change in free energy is analogous to the flow of water through a pipe in which one end is lower than the other. Bends or kinks in the pipe simulate individual enzyme-catalyzed steps with a small negative or positive change in free energy. Flow of water through the pipe nevertheless remains unidirectional due to the overall change in height, which corresponds to the overall change in free energy in a pathway (Figure 9–3).

COMPARTMENTATION ENSURES METABOLIC EFFICIENCY & SIMPLIFIES REGULATION

In eukaryotes, anabolic and catabolic pathways that interconvert common products may take place in specific subcellular compartments. For example, many of the enzymes that degrade proteins and polysaccharides reside inside organelles called lysosomes. Similarly, fatty acid biosynthesis occurs in the cytosol, whereas fatty acid oxidation takes place within mitochondria (Chapters 21 and 22). Segregation of certain metabolic pathways within specialized cell types can provide further physical compartmentation. Alternatively, possession of one or more unique intermediates can permit apparently opposing pathways to coexist even in the absence of physical barriers. For example, despite many shared intermediates and enzymes, both glycolysis and gluconeogenesis are favored energetically. This could not be true if all the reactions were the same. If one pathway were favored energetically, the other would be accompanied by a change in free energy ∆G equal in magnitude but opposite in sign. Simultaneous spontaneity of both pathways results from substitution of one or more reactions by different reactions favored thermodynamically in the opposite direction. The glycolytic enzyme phosphofructokinase (Chapter 17) is replaced by the gluconeogenic enzyme fructose-1,6-bisphosphatase (Chapter 19). The ability of enzymes to discriminate between the structurally similar coenzymes NAD+ and NADP+ also results in a form of compartmentation, since it segregates the electrons of NADH that are destined for ATP generation from those of NADPH that participate in the reductive steps in many biosynthetic pathways.

Figure 9–1. Differential response of the rate of an enzyme-catalyzed reaction, ∆V, to the same incremental change in substrate concentration at a substrate concentration of Km (∆VA) or far above Km (∆VB). [Plot of V versus [S], with the increment ∆S marked at both positions.]

Figure 9–3. Hydrostatic analogy for a pathway with a rate-limiting step (A) and a step with a ∆G value near zero (B).

Controlling an Enzyme That Catalyzes a Rate-Limiting Reaction Regulates an Entire Metabolic Pathway

While the flux of metabolites through metabolic pathways involves catalysis by numerous enzymes, active control of homeostasis is achieved by regulation of only a small number of enzymes. The ideal enzyme for regulatory intervention is one whose quantity or catalytic efficiency dictates that the reaction it catalyzes is slow relative to all others in the pathway. Decreasing the catalytic efficiency or the quantity of the catalyst for the “bottleneck” or rate-limiting reaction immediately reduces metabolite flux through the entire pathway. Conversely, an increase in either its quantity or catalytic efficiency enhances flux through the pathway as a whole. For example, acetyl-CoA carboxylase catalyzes the synthesis of malonyl-CoA, the first committed reaction of fatty acid biosynthesis (Chapter 21). When synthesis of malonyl-CoA is inhibited, subsequent reactions of fatty acid synthesis cease due to lack of substrates. Enzymes that catalyze rate-limiting steps serve as natural “governors” of metabolic flux. Thus, they constitute efficient targets for regulatory intervention by drugs. For example, inhibition by “statin” drugs of HMG-CoA reductase, which catalyzes the rate-limiting reaction of cholesterogenesis, curtails synthesis of cholesterol.

REGULATION OF ENZYME QUANTITY

Figure 9–2. An idealized cell in steady state. Note that metabolite flow is unidirectional. [Figure not reproduced; labels: Nutrients, Small molecules, Large molecules, Wastes, ~P.]

The catalytic capacity of the rate-limiting reaction in a metabolic pathway is the product of the concentration of enzyme molecules and their intrinsic catalytic efficiency. It therefore follows that catalytic capacity can be influenced both by changing the quantity of enzyme present and by altering its intrinsic catalytic efficiency.
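The statement that catalytic capacity is a product of two factors can be written as a single expression. As a hedged reading, “intrinsic catalytic efficiency” is identified here with the turnover number k_cat at saturating substrate; the symbols k_cat and [E]_t are assumptions of this sketch and are not defined in this section.

```latex
V_{\max} = k_{\text{cat}} \, [E]_t
% Doubling either factor doubles the capacity:
% 2\,[E]_t \;\Rightarrow\; 2\,V_{\max}  (more enzyme, e.g. by induction)
% 2\,k_{\text{cat}} \;\Rightarrow\; 2\,V_{\max}  (more efficient enzyme, e.g. by activation)
```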

Control of Enzyme Synthesis

Enzymes whose concentrations remain essentially constant over time are termed constitutive enzymes. By contrast, the concentrations of many other enzymes depend upon the presence of inducers, typically substrates or structurally related compounds, that initiate their synthesis. Escherichia coli grown on glucose will, for example, only catabolize lactose after addition of a β-galactoside, an inducer that initiates synthesis of a β-galactosidase and a galactoside permease (Figure 39–3). Inducible enzymes of humans include tryptophan pyrrolase, threonine dehydrase, tyrosine-α-ketoglutarate aminotransferase, enzymes of the urea cycle, HMG-CoA reductase, and cytochrome P450. Conversely, an excess of a metabolite may curtail synthesis of its cognate enzyme via repression. Both induction and repression involve cis elements, specific DNA sequences located upstream of regulated genes, and trans-acting regulatory proteins. The molecular mechanisms of induction and repression are discussed in Chapter 39.

Control of Enzyme Degradation

The absolute quantity of an enzyme reflects the net balance between enzyme synthesis and enzyme degradation,

$$\text{Amino acids} \xrightarrow{\;k_s\;} \text{Enzyme} \xrightarrow{\;k_{\text{deg}}\;} \text{Amino acids}$$

where ks and kdeg represent the rate constants for the overall processes of synthesis and degradation, respectively. Changes in both the ks and kdeg of specific enzymes occur in human subjects.

Protein turnover represents the net result of enzyme synthesis and degradation. By measuring the rates of incorporation of 15N-labeled amino acids into protein and the rates of loss of 15N from protein, Schoenheimer deduced that body proteins are in a state of “dynamic equilibrium” in which they are continuously synthesized and degraded. Mammalian proteins are degraded both by ATP and ubiquitin-dependent pathways and by ATP-independent pathways (Chapter 29). Susceptibility to proteolytic degradation can be influenced by the presence of ligands such as substrates, coenzymes, or metal ions that alter protein conformation. Intracellular ligands thus can influence the rates at which specific enzymes are degraded.

Enzyme levels in mammalian tissues respond to a wide range of physiologic, hormonal, or dietary factors. For example, glucocorticoids increase the concentration of tyrosine aminotransferase by stimulating ks, and glucagon, despite its antagonistic physiologic effects, increases ks fourfold to fivefold. Regulation of liver arginase can involve changes either in ks or in kdeg. After a protein-rich meal, liver arginase levels rise and arginine synthesis decreases (Chapter 29). Arginase levels also rise in starvation, but here arginase degradation decreases, whereas ks remains unchanged. Similarly, injection of glucocorticoids and ingestion of tryptophan both elevate levels of tryptophan oxygenase. While the hormone raises ks for oxygenase synthesis, tryptophan specifically lowers kdeg by stabilizing the oxygenase against proteolytic digestion.
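How changes in ks or kdeg alter the enzyme level can be illustrated with a minimal turnover model. It assumes, for illustration only, that synthesis proceeds at a constant rate k_s and that degradation is first order in enzyme concentration, so that dE/dt = k_s − k_deg·E and the steady-state level is k_s/k_deg. The function names and all numerical values are hypothetical.

```python
def steady_state_level(k_s, k_deg):
    """Enzyme level at which synthesis (k_s) balances degradation (k_deg * E)."""
    return k_s / k_deg


def enzyme_level(e0, k_s, k_deg, t_end, dt=0.01):
    """Integrate dE/dt = k_s - k_deg * E from E(0) = e0 with forward Euler."""
    e = e0
    for _ in range(int(round(t_end / dt))):
        e += (k_s - k_deg * e) * dt
    return e


# Hypothetical baseline: k_s = 1 unit/h, k_deg = 0.1 per h.
print("baseline:", steady_state_level(1.0, 0.1))
# A hormone that raises k_s fourfold (compare glucagon and tyrosine aminotransferase).
print("k_s raised fourfold:", steady_state_level(4.0, 0.1))
# A ligand that halves k_deg (compare tryptophan and tryptophan oxygenase).
print("k_deg halved:", steady_state_level(1.0, 0.05))
# Time course after the fourfold rise in k_s, starting from the baseline level.
print("level after 200 h:", round(enzyme_level(10.0, 4.0, 0.1, 200.0), 3))
```

Under these assumptions either route raises the enzyme level, but the approach to the new steady state is set by k_deg alone, which is consistent with induction requiring hours rather than seconds.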

MULTIPLE OPTIONS ARE AVAILABLE FOR REGULATING CATALYTIC ACTIVITY

In humans, the induction of protein synthesis is a complex multistep process that typically requires hours to produce significant changes in overall enzyme level. By contrast, changes in intrinsic catalytic efficiency effected by binding of dissociable ligands (allosteric regulation) or by covalent modification achieve regulation of enzymic activity within seconds. Changes in protein level serve long-term adaptive requirements, whereas changes in catalytic efficiency are best suited for rapid and transient alterations in metabolite flux.

ALLOSTERIC EFFECTORS REGULATE CERTAIN ENZYMES

Feedback inhibition refers to inhibition of an enzyme in a biosynthetic pathway by an end product of that pathway. For example, for the biosynthesis of D from A catalyzed by enzymes Enz1 through Enz3,

$$A \xrightarrow{\;\text{Enz}_1\;} B \xrightarrow{\;\text{Enz}_2\;} C \xrightarrow{\;\text{Enz}_3\;} D$$

high concentrations of D inhibit conversion of A to B. Inhibition results not from the “backing up” of intermediates but from the ability of D to bind to and inhibit Enz1. Typically, D binds at an allosteric site spatially distinct from the catalytic site of the target enzyme. Feedback inhibitors thus are allosteric effectors and typically bear little or no structural similarity to the substrates of the enzymes they inhibit. In this example, the feedback inhibitor D acts as a negative allosteric effector of Enz1.
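A minimal numerical sketch, under stated assumptions, of why inhibition of Enz1 by D stabilizes the level of D. It is not taken from the text: the hyperbolic inhibition term, the first-order consumption of D, and the names `vmax`, `ki`, and `k_use` are all illustrative assumptions, and the intermediates B and C are assumed not to accumulate.

```python
import math


def steady_state_d(vmax, k_use, ki=None):
    """Steady-state [D] when Enz1 supplies D at rate vmax / (1 + D/ki) and the
    cell consumes D at rate k_use * D. ki=None models a pathway without feedback."""
    if ki is None:
        return vmax / k_use
    # vmax / (1 + D/ki) = k_use * D  ->  (k_use/ki) * D**2 + k_use * D - vmax = 0
    a = k_use / ki
    b = k_use
    return (-b + math.sqrt(b * b + 4.0 * a * vmax)) / (2.0 * a)


# Hypothetical parameters in arbitrary units.
VMAX, KI = 10.0, 1.0
for demand in (1.0, 0.5):
    without = steady_state_d(VMAX, demand)
    with_feedback = steady_state_d(VMAX, demand, ki=KI)
    print(f"demand {demand}: no feedback D = {without:.2f}, "
          f"with feedback D = {with_feedback:.2f}")
```

With these illustrative numbers, halving the demand for D doubles its steady-state level in the absence of feedback but raises it by only about half when D inhibits Enz1.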


Figure 9–4. Sites of feedback inhibition in a branched biosynthetic pathway. S1–S5 are intermediates in the biosynthesis of end products A–D. Straight arrows represent enzymes catalyzing the indicated conversions. Curved arrows represent feedback loops and indicate sites of feedback inhibition by specific end products.

Figure 9–5. Multiple feedback inhibition in a branched biosynthetic pathway. Superimposed on simple feedback loops (dashed, curved arrows) are multiple feedback loops (solid, curved arrows) that regulate enzymes common to biosynthesis of several end products.

In a branched biosynthetic pathway, the initial reactions participate in the synthesis of several products. Figure 9–4 shows a hypothetical branched biosynthetic pathway in which curved arrows lead from feedback inhibitors to the enzymes whose activity they inhibit. The sequences S3 → A, S4 → B, S4 → C, and S3 → → D each represent linear reaction sequences that are feedback-inhibited by their end products. The pathways of nucleotide biosynthesis (Chapter 34) provide specific examples.

The kinetics of feedback inhibition may be competitive, noncompetitive, partially competitive, or mixed. Feedback inhibitors, which frequently are the small molecule building blocks of macromolecules (eg, amino acids for proteins, nucleotides for nucleic acids), typically inhibit the first committed step in a particular biosynthetic sequence. A much-studied example is inhibition of bacterial aspartate transcarbamoylase by CTP (see below and Chapter 34).

Multiple feedback loops can provide additional fine control. For example, as shown in Figure 9–5, the presence of excess product B decreases the requirement for substrate S2. However, S2 is also required for synthesis of A, C, and D. Excess B should therefore also curtail synthesis of all four end products. To circumvent this potential difficulty, each end product typically only partially inhibits catalytic activity. The effect of an excess of two or more end products may be strictly additive or, alternatively, may be greater than their individual effect (cooperative feedback inhibition).

Aspartate Transcarbamoylase Is a Model Allosteric Enzyme

Aspartate transcarbamoylase (ATCase), the catalyst for the first reaction unique to pyrimidine biosynthesis (Figure 34–7), is feedback-inhibited by cytidine triphosphate (CTP). Following treatment with mercurials, ATCase loses its sensitivity to inhibition by CTP but retains its full activity for synthesis of carbamoyl aspartate. This suggests that CTP is bound at a different (allosteric) site from either substrate. ATCase consists of multiple catalytic and regulatory subunits. Each catalytic subunit contains four aspartate (substrate) sites and each regulatory subunit at least two CTP (regulatory) sites (Chapter 34).

Allosteric & Catalytic Sites Are Spatially Distinct

The lack of structural similarity between a feedback inhibitor and the substrate for the enzyme whose activity it regulates suggests that these effectors are not isosteric with a substrate but allosteric (“occupy another space”). Jacques Monod therefore proposed the existence of allosteric sites that are physically distinct from the catalytic site. Allosteric enzymes thus are those whose activity at the active site may be modulated by the presence of effectors at an allosteric site. This hypothesis has been confirmed by many lines of evidence, including x-ray crystallography and site-directed mutagenesis, demonstrating the existence of spatially distinct active and allosteric sites on a variety of enzymes.

Allosteric Effects May Be on Km or on Vmax

To refer to the kinetics of allosteric inhibition as “competitive” or “noncompetitive” with substrate carries misleading mechanistic implications. We refer instead to two classes of regulated enzymes: K-series and V-series enzymes. For K-series allosteric enzymes, the substrate saturation kinetics are competitive in the sense that Km is raised without an effect on Vmax. For V-series allosteric enzymes, the allosteric inhibitor lowers Vmax without affecting the Km. Alterations in Km or Vmax probably result from conformational changes at the catalytic site induced by binding of the allosteric effector at the allosteric site. For a K-series allosteric enzyme, this conformational change may weaken the bonds between substrate and substrate-binding residues. For a V-series allosteric enzyme, the primary effect may be to alter the orientation or charge of catalytic residues, lowering Vmax. Intermediate effects on Km and Vmax, however, may be observed consequent to these conformational changes.
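The practical difference between K-series and V-series inhibition can be seen by comparing rates at low and at saturating substrate. The sketch below assumes simple hyperbolic (Michaelis-Menten) kinetics for clarity, although many allosteric enzymes show sigmoid saturation curves; the fivefold changes and all parameter values are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Rate v = Vmax * [S] / (Km + [S]) for substrate concentration s."""
    return vmax * s / (km + s)


VMAX, KM = 100.0, 1.0                          # uninhibited enzyme, arbitrary units
K_SERIES = {"vmax": VMAX, "km": 5.0 * KM}      # inhibitor raises Km, Vmax unchanged
V_SERIES = {"vmax": VMAX / 5.0, "km": KM}      # inhibitor lowers Vmax, Km unchanged

for s in (0.5, 1.0, 10.0, 1000.0):
    free = michaelis_menten(s, VMAX, KM)
    k_inh = michaelis_menten(s, **K_SERIES)
    v_inh = michaelis_menten(s, **V_SERIES)
    print(f"[S] = {s:7.1f}: uninhibited {free:6.2f}, "
          f"K-series {k_inh:6.2f}, V-series {v_inh:6.2f}")
```

In this sketch the K-series effect is overcome at high substrate concentration, whereas the V-series effect persists however much substrate is supplied.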

FEEDBACK REGULATION IS NOT SYNONYMOUS WITH FEEDBACK INHIBITION

In both mammalian and bacterial cells, end products “feed back” and control their own synthesis, in many instances by feedback inhibition of an early biosynthetic enzyme. We must, however, distinguish between feedback regulation, a phenomenologic term devoid of mechanistic implications, and feedback inhibition, a mechanism for regulation of enzyme activity. For example, while dietary cholesterol decreases hepatic synthesis of cholesterol, this feedback regulation does not involve feedback inhibition. HMG-CoA reductase, the rate-limiting enzyme of cholesterogenesis, is affected, but cholesterol does not feedback-inhibit its activity. Regulation in response to dietary cholesterol involves curtailment by cholesterol or a cholesterol metabolite of the expression of the gene that encodes HMG-CoA reductase (enzyme repression) (Chapter 26).

MANY HORMONES ACT THROUGH ALLOSTERIC SECOND MESSENGERS

Nerve impulses and the binding of hormones to cell surface receptors elicit changes in the rate of enzyme-catalyzed reactions within target cells by inducing the release or synthesis of specialized allosteric effectors called second messengers. The primary or “first” messenger is the hormone molecule or nerve impulse. Second messengers include 3′,5′-cAMP, synthesized from ATP by the enzyme adenylyl cyclase in response to the hormone epinephrine, and Ca2+, which is stored inside the endoplasmic reticulum of most cells. Membrane depolarization resulting from a nerve impulse opens a membrane channel that releases calcium ion into the cytoplasm, where it binds to and activates enzymes involved in the regulation of contraction and the mobilization of stored glucose from glycogen. Glucose then supplies the increased energy demands of muscle contraction. Other second messengers include 3′,5′-cGMP and polyphosphoinositols, produced by the hydrolysis of inositol phospholipids by hormone-regulated phospholipases.

REGULATORY COVALENT MODIFICATIONS CAN BE REVERSIBLE OR IRREVERSIBLE

In mammalian cells, the two most common forms of covalent modification are partial proteolysis and phosphorylation. Because cells lack the ability to reunite the two portions of a protein produced by hydrolysis of a peptide bond, proteolysis constitutes an irreversible modification. By contrast, phosphorylation is a reversible modification process. The phosphorylation of proteins on seryl, threonyl, or tyrosyl residues, catalyzed by protein kinases, is thermodynamically spontaneous. Equally spontaneous is the hydrolytic removal of these phosphoryl groups by enzymes called protein phosphatases.

PROTEASES MAY BE SECRETED AS CATALYTICALLY INACTIVE PROENZYMES

Certain proteins are synthesized and secreted as inactive precursor proteins known as proproteins. The proproteins of enzymes are termed proenzymes or zymogens. Selective proteolysis converts a proprotein by one or more successive proteolytic “clips” to a form that exhibits the characteristic activity of the mature protein, eg, its enzymatic activity. Proteins synthesized as proproteins include the hormone insulin (proprotein = proinsulin), the digestive enzymes pepsin, trypsin, and chymotrypsin (proproteins = pepsinogen, trypsinogen, and chymotrypsinogen, respectively), several factors of the blood clotting and blood clot dissolution cascades (see Chapter 51), and the connective tissue protein collagen (proprotein = procollagen).

Proenzymes Facilitate Rapid Mobilization of an Activity in Response to Physiologic Demand

The synthesis and secretion of proteases as catalytically inactive proenzymes protects the tissue of origin (eg, the pancreas) from autodigestion, such as can occur in pancreatitis. Certain physiologic processes such as digestion are intermittent but fairly regular and predictable. Others such as blood clot formation, clot dissolution, and tissue repair are brought “on line” only in response to pressing physiologic or pathophysiologic need. The processes of blood clot formation and dissolution clearly must be temporally coordinated to achieve homeostasis. Enzymes needed intermittently but rapidly often are secreted in an initially inactive form since the secretion process or new synthesis of the required proteins might be insufficiently rapid for response to a pressing pathophysiologic demand such as the loss of blood.


Activation of Prochymotrypsin Requires Selective Proteolysis

Selective proteolysis involves one or more highly specific proteolytic clips that may or may not be accompanied by separation of the resulting peptides. Most importantly, selective proteolysis often results in conformational changes that “create” the catalytic site of an enzyme. Note that while His 57 and Asp 102 reside on the B peptide of α-chymotrypsin, Ser 195 resides on the C peptide (Figure 9–6). The conformational changes that accompany selective proteolysis of prochymotrypsin (chymotrypsinogen) align the three residues of the charge-relay network, creating the catalytic site. Note also that contact and catalytic residues can be located on different peptide chains but still be within bond-forming distance of bound substrate.

REVERSIBLE COVALENT MODIFICATION REGULATES KEY MAMMALIAN ENZYMES

Mammalian proteins are the targets of a wide range of covalent modification processes. Modifications such as glycosylation, hydroxylation, and fatty acid acylation introduce new structural features into newly synthesized proteins that tend to persist for the lifetime of the protein. Among the covalent modifications that regulate protein function (eg, methylation, adenylylation), the most common by far is phosphorylation-dephosphorylation. Protein kinases phosphorylate proteins by catalyzing transfer of the terminal phosphoryl group of ATP to the hydroxyl groups of seryl, threonyl, or tyrosyl residues, forming O-phosphoseryl, O-phosphothreonyl, or O-phosphotyrosyl residues, respectively (Figure 9–7). Some protein kinases target the side chains of histidyl, lysyl, arginyl, and aspartyl residues. The unmodified form of the protein can be regenerated by hydrolytic removal of phosphoryl groups, catalyzed by protein phosphatases.

Figure 9–6. Selective proteolysis and associated conformational changes form the active site of chymotrypsin, which includes the Asp102-His57-Ser195 catalytic triad. Successive proteolysis forms prochymotrypsin (pro-CT), π-chymotrypsin (π-CT), and ultimately α-chymotrypsin (α-CT), an active protease whose three peptides remain associated by covalent inter-chain disulfide bonds.

Figure 9–7. Covalent modification of a regulated enzyme by phosphorylation-dephosphorylation of a seryl residue. [Scheme: KINASE, with Mg2+: Enz-Ser-OH + ATP → Enz-Ser-O-PO3(2−) + ADP; PHOSPHATASE, with Mg2+: Enz-Ser-O-PO3(2−) + H2O → Enz-Ser-OH + Pi.]

Table 9–1. Examples of mammalian enzymes whose catalytic activity is altered by covalent phosphorylation-dephosphorylation.

                               Activity State¹
Enzyme                         Low      High
Acetyl-CoA carboxylase         EP       E
Glycogen synthase              EP       E
Pyruvate dehydrogenase         EP       E
HMG-CoA reductase              EP       E
Glycogen phosphorylase         E        EP
Citrate lyase                  E        EP
Phosphorylase b kinase         E        EP
HMG-CoA reductase kinase       E        EP

¹E, dephosphoenzyme; EP, phosphoenzyme.

A typical mammalian cell possesses over 1000 phosphorylated proteins and several hundred protein kinases and protein phosphatases that catalyze their interconversion. The ease of interconversion of enzymes between their phospho- and dephospho- forms in part accounts for the frequency of phosphorylation-dephosphorylation as a mechanism for regulatory control. Phosphorylation-dephosphorylation permits the functional properties of the affected enzyme to be altered only for as long as it serves a specific need. Once the need has passed, the enzyme can be converted back to its original form, poised to respond to the next stimulatory event. A second factor underlying the widespread use of protein phosphorylation-dephosphorylation lies in the chemical properties of the phosphoryl group itself. In order to alter an enzyme’s functional properties, any modification of its chemical structure must influence the protein’s three-dimensional configuration. The high charge density of protein-bound phosphoryl groups (generally −2 at physiologic pH) and their propensity to form salt bridges with arginyl residues make them potent agents for modifying protein structure and function. Phosphorylation generally targets amino acids distant from the catalytic site itself. Consequent conformational changes then influence an enzyme’s intrinsic catalytic efficiency or other properties. In this sense, the sites of phosphorylation and other covalent modifications can be considered another form of allosteric site. However, in this case the “allosteric ligand” binds covalently to the protein.

PROTEIN PHOSPHORYLATION IS EXTREMELY VERSATILE

Protein phosphorylation-dephosphorylation is a highly versatile and selective process. Not all proteins are subject to phosphorylation, and of the many hydroxyl groups on a protein’s surface, only one or a small subset are targeted. While the most common enzyme function affected is the protein’s catalytic efficiency, phosphorylation can also alter the affinity for substrates, location within the cell, or responsiveness to regulation by allosteric ligands. Phosphorylation can increase an enzyme’s catalytic efficiency, converting it to its active form in one protein, while phosphorylation of another converts it into an intrinsically inefficient, or inactive, form (Table 9–1).

Many proteins can be phosphorylated at multiple sites or are subject to regulation both by phosphorylation-dephosphorylation and by the binding of allosteric ligands. Phosphorylation-dephosphorylation at any one site can be catalyzed by multiple protein kinases or protein phosphatases. Many protein kinases and most protein phosphatases act on more than one protein and are themselves interconverted between active and inactive forms by the binding of second messengers or by covalent modification by phosphorylation-dephosphorylation.

The interplay between protein kinases and protein phosphatases, between the functional consequences of phosphorylation at different sites, or between phosphorylation sites and allosteric sites provides the basis for regulatory networks that integrate multiple environmental input signals to evoke an appropriate coordinated cellular response. In these sophisticated regulatory networks, individual enzymes respond to different environmental signals. For example, if an enzyme can be phosphorylated at a single site by more than one protein kinase, it can be converted from a catalytically efficient to an inefficient (inactive) form, or vice versa, in response to any one of several signals. If the protein kinase is activated in response to a signal different from the signal that activates the protein phosphatase, the phosphoprotein becomes a decision node. The functional output, generally catalytic activity, reflects the phosphorylation state. This state or degree of phosphorylation is determined by the relative activities of the protein kinase and protein phosphatase, a reflection of the presence and relative strength of the environmental signals that act through each. The ability of many protein kinases and protein phosphatases to target more than one protein provides a means for an environmental signal to coordinately regulate multiple metabolic processes. For example, the enzymes 3-hydroxy-3-methylglutaryl-CoA reductase and acetyl-CoA carboxylase, the rate-controlling enzymes for cholesterol and fatty acid biosynthesis, respectively, are phosphorylated and inactivated by the AMP-activated protein kinase. When this protein kinase is activated either through phosphorylation by yet another protein kinase or in response to the binding of its allosteric activator 5′-AMP, the two major pathways responsible for the synthesis of lipids from acetyl-CoA both are inhibited. Interconvertible enzymes and the enzymes responsible for their interconversion do not act as mere on and off switches working independently of one another. They form the building blocks of biomolecular computers that maintain homeostasis in cells that carry out a complex array of metabolic processes that must be regulated in response to a broad spectrum of environmental factors.
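The “decision node” described above can be sketched numerically. The model below assumes, purely for illustration, first-order interconversion of E and EP, so that the steady-state fraction phosphorylated is k_kinase / (k_kinase + k_phosphatase). The function names and rate values are hypothetical; the assignment of low or high activity to the phospho form follows Table 9–1.

```python
def phospho_fraction(k_kinase, k_phosphatase):
    """Steady-state fraction of enzyme in the phospho form (EP) for the
    first-order cycle E -> EP (kinase) and EP -> E (phosphatase)."""
    if k_kinase < 0 or k_phosphatase < 0 or k_kinase + k_phosphatase == 0:
        raise ValueError("rate constants must be non-negative and not both zero")
    return k_kinase / (k_kinase + k_phosphatase)


def relative_activity(fraction_phosphorylated, active_when_phosphorylated):
    """Fraction of the enzyme population that is in its high-activity form."""
    if active_when_phosphorylated:
        return fraction_phosphorylated
    return 1.0 - fraction_phosphorylated


# Per Table 9-1, both targets of the AMP-activated protein kinase are
# less active in the phospho form.
targets = {"HMG-CoA reductase": False, "Acetyl-CoA carboxylase": False}

for signal, k_kinase in (("kinase signal weak", 1.0), ("kinase signal strong", 9.0)):
    fraction = phospho_fraction(k_kinase, k_phosphatase=1.0)
    for enzyme, active_as_ep in targets.items():
        activity = relative_activity(fraction, active_as_ep)
        print(f"{signal}: {enzyme} relative activity = {activity:.2f}")
```

Because one kinase acts on both targets in this sketch, a single change in kinase activity shifts both pathways together, while the phosphatase rate sets how strong the kinase signal must be to do so.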

Covalent Modification Regulates Metabolite Flow

Regulation of enzyme activity by phosphorylation-dephosphorylation has analogies to regulation by feedback inhibition. Both provide for short-term, readily reversible regulation of metabolite flow in response to specific physiologic signals. Both act without altering gene expression. Both act on early enzymes of a protracted, often biosynthetic metabolic sequence, and both act at allosteric rather than catalytic sites. Feedback inhibition, however, involves a single protein and lacks hormonal and neural features. By contrast, regulation of mammalian enzymes by phosphorylation-dephosphorylation involves several proteins and ATP and is under direct neural and hormonal control.

SUMMARY

• Homeostasis involves maintaining a relatively constant intracellular and intra-organ environment despite wide fluctuations in the external environment via appropriate changes in the rates of biochemical reactions in response to physiologic need.

• The substrates for most enzymes are usually present at a concentration close to Km. This facilitates passive control of the rates of product formation in response to changes in levels of metabolic intermediates.

• Active control of metabolite flux involves changes in the concentration, catalytic activity, or both of an enzyme that catalyzes a committed, rate-limiting reaction.

• Selective proteolysis of catalytically inactive proenzymes initiates conformational changes that form the active site. Secretion as an inactive proenzyme facilitates rapid mobilization of activity in response to injury or physiologic need and may protect the tissue of origin (eg, from autodigestion by proteases).

• Binding of metabolites and second messengers to sites distinct from the catalytic site of enzymes triggers conformational changes that alter Vmax or the Km.

• Phosphorylation by protein kinases of specific seryl, threonyl, or tyrosyl residues, and subsequent dephosphorylation by protein phosphatases, regulates the activity of many human enzymes. The protein kinases and phosphatases that participate in regulatory cascades that respond to hormonal or second messenger signals constitute a “bio-organic computer” that can process and integrate complex environmental information to produce an appropriate and comprehensive cellular response.
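The summary point that substrate concentrations near Km favor passive control can be made quantitative. The derivation below assumes Michaelis-Menten kinetics; the symbol ε, the fractional change in rate per fractional change in substrate concentration, is introduced here for illustration and is not used in the text.

```latex
v = \frac{V_{\max}\,[S]}{K_m + [S]}
\qquad\Longrightarrow\qquad
\varepsilon \equiv \frac{d \ln v}{d \ln [S]}
  = \frac{[S]}{v}\,\frac{dv}{d[S]}
  = \frac{K_m}{K_m + [S]}
% Limiting cases:
% [S] \ll K_m : \varepsilon \to 1      (rate proportional to [S], but far below V_max)
% [S] =   K_m : \varepsilon = 1/2      (rate responsive, with half of V_max in use)
% [S] \gg K_m : \varepsilon \to 0      (rate insensitive to [S])
```

At [S] = Km a 10% change in substrate concentration therefore changes the rate by roughly 5%, in either direction, without any change in the enzyme itself.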



Chapter 40. Molecular Genetics, Recombinant DNA, & Genomic Technology

Daryl K. Granner, MD, & P. Anthony Weil, PhD

BIOMEDICAL IMPORTANCE*

The development of recombinant DNA, high-density, high-throughput screening, and other molecular genetic methodologies has revolutionized biology and is having an increasing impact on clinical medicine. Much has been learned about human genetic disease from pedigree analysis and study of affected proteins, but in many cases where the specific genetic defect is unknown, these approaches cannot be used. The new technologies circumvent these limitations by going directly to the DNA molecule for information. Manipulation of a DNA sequence and the construction of chimeric molecules (so-called genetic engineering) provides a means of studying how a specific segment of DNA works. Novel molecular genetic tools allow investigators to query and manipulate genomic sequences as well as to examine both cellular mRNA and protein profiles at the molecular level.

Understanding this technology is important for several reasons: (1) It offers a rational approach to understanding the molecular basis of a number of diseases (eg, familial hypercholesterolemia, sickle cell disease, the thalassemias, cystic fibrosis, muscular dystrophy). (2) Human proteins can be produced in abundance for therapy (eg, insulin, growth hormone, tissue plasminogen activator). (3) Proteins for vaccines (eg, hepatitis B) and for diagnostic testing (eg, AIDS tests) can be obtained. (4) This technology is used to diagnose existing diseases and predict the risk of developing a given disease. (5) Special techniques have led to remarkable advances in forensic medicine. (6) Gene therapy for sickle cell disease, the thalassemias, adenosine deaminase deficiency, and other diseases may be devised.

* See glossary of terms at the end of this chapter.

ELUCIDATION OF THE BASIC FEATURES OF DNA LED TO RECOMBINANT DNA TECHNOLOGY

DNA Is a Complex Biopolymer Organized as a Double Helix

The fundamental organizational element is the sequence of purine (adenine [A] or guanine [G]) and pyrimidine (cytosine [C] or thymine [T]) bases. These bases are attached to the C-1′ position of the sugar deoxyribose, and the bases are linked together through joining of the sugar moieties at their 3′ and 5′ positions via a phosphodiester bond (Figure 35–1). The alternating deoxyribose and phosphate groups form the backbone of the double helix (Figure 35–2). These 3′–5′ linkages also define the orientation of a given strand of the DNA molecule, and, since the two strands run in opposite directions, they are said to be antiparallel.

Base Pairing Is a Fundamental Concept of DNA Structure & Function

Adenine and thymine always pair, by hydrogen bonding, as do guanine and cytosine (Figure 35–3). These base pairs are said to be complementary, and the guanine content of a fragment of double-stranded DNA will always equal its cytosine content; likewise, the thymine and adenine contents are equal. Base pairing and hydrophobic base-stacking interactions hold the two DNA strands together. These interactions can be reduced by heating the DNA to denature it. The laws of base pairing predict that two complementary DNA strands will reanneal exactly in register upon renaturation, as happens when the temperature of the solution is slowly reduced to normal. Indeed, the degree of base-pair matching (or mismatching) can be estimated from the temperature required for denaturation-renaturation. Segments of DNA with high degrees of base-pair matching require more energy input (heat) to accomplish denaturation; to put it another way, a closely matched segment will withstand more heat before the strands separate. This reaction is used to determine whether there are significant differences between two DNA sequences, and it underlies the concept of hybridization, which is fundamental to the processes described below.
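The base-pairing rules lend themselves to a short illustration. The sketch below builds the antiparallel complementary strand, checks that A = T and G = C in the resulting duplex, and scores the degree of base-pair matching between two strands. The function names and example sequences are hypothetical, and the score is only a count of matched positions, not a prediction of melting temperature.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def duplex_base_counts(strand):
    """Count each base in the duplex formed by a strand and its complement."""
    duplex = strand.upper() + reverse_complement(strand)
    return {base: duplex.count(base) for base in "ATGC"}


def matched_fraction(strand_a, strand_b):
    """Fraction of positions forming A-T or G-C pairs when strand_b (5' to 3')
    is aligned antiparallel to strand_a (5' to 3'). Equal lengths are assumed."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    partner = reversed(strand_b.upper())
    pairs = sum(COMPLEMENT[x] == y for x, y in zip(strand_a.upper(), partner))
    return pairs / len(strand_a)


top = "ATGCCG"
print(reverse_complement(top))
print(duplex_base_counts(top))
print(matched_fraction(top, reverse_complement(top)))
print(matched_fraction(top, "CGGCAA"))  # one mismatched position out of six
```

A probe with a lower matched fraction would be expected to separate from its target at a lower temperature, which is the basis of using hybridization to compare sequences.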

There are about 3 × 10⁹ base pairs (bp) in each human haploid genome. If an average gene length is 3 × 10³ bp (3 kilobases [kb]), the genome could consist of 10⁶ genes, assuming that there is no overlap and that transcription proceeds in only one direction. It is thought that there are < 10⁵ genes in the human and that only 1–2% of the DNA codes for proteins. The exact function of the remaining ~98% of the human genome has not yet been defined.

The double-helical DNA is packaged into a more compact structure by a number of proteins, most notably the basic proteins called histones. This condensation may serve a regulatory role and certainly has a practical purpose. The DNA present within the nucleus of a cell, if simply extended, would be about 1 meter long. The chromosomal proteins compact this long strand of DNA so that it can be packaged into a nucleus with a volume of a few cubic micrometers.
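The two estimates in the preceding paragraphs follow from simple arithmetic. The gene-count bound uses only the figures given in the text; the length estimate additionally assumes a rise of about 0.34 nm per base pair for B-form DNA, a value not stated in this section, and refers to one haploid set of 3 × 10⁹ bp.

```latex
% Upper bound on the number of non-overlapping genes of average length 3 kb:
N_{\text{genes}} \le \frac{3 \times 10^{9}\ \text{bp}}{3 \times 10^{3}\ \text{bp/gene}} = 10^{6}\ \text{genes}

% Extended length of one haploid genome, assuming 0.34 nm per base pair:
L \approx \left( 3 \times 10^{9}\ \text{bp} \right) \times \left( 0.34 \times 10^{-9}\ \text{m/bp} \right) \approx 1.0\ \text{m}
```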

DNA Is Organized Into Genes

In general, prokaryotic genes consist of a small regulatory region (100–500 bp) and a large protein-coding segment (500–10,000 bp). Several genes are often controlled by a single regulatory unit. Most mammalian genes are more complicated in that the coding regions are interrupted by noncoding regions that are eliminated when the primary RNA transcript is processed into mature messenger RNA (mRNA). The coding regions (those regions that appear in the mature RNA species) are called exons, and the noncoding regions, which interpose or intervene between the exons, are called introns (Figure 40–1). Introns are always removed from precursor RNA before transport into the cytoplasm occurs. The process by which introns are removed from precursor RNA and by which exons are ligated together is called RNA splicing. Incorrect processing of the primary transcript into the mature mRNA can result in disease in humans (see below); this underscores the importance of these posttranscriptional processing steps. The variation in size and complexity of some human genes is illustrated in Table 40–1. Although there is a 300-fold difference in the sizes of the genes illustrated, the mRNA sizes vary only about 20-fold. This is because most of the DNA in genes is present as introns, and introns tend to be much larger than exons. Regulatory regions for specific eukaryotic genes are usually located in the DNA that flanks the transcription initiation site at its 5′ end (5′ flanking-sequence DNA). Occasionally, such sequences are found within the gene itself or in the region that flanks the 3′ end of the gene. In mammalian cells, each gene has its own regulatory region. Many eukaryotic genes (and some viruses that replicate in mammalian cells) have special regions, called enhancers, that increase the rate of transcription. Some genes also have DNA sequences, known as silencers, that repress transcription. Mammalian genes are obviously complicated, multicomponent structures.
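The removal of introns and joining of exons can be pictured as a string operation. The following is a toy sketch only: the sequence, the intron coordinates, and the function name are invented for illustration, and real splicing depends on splice-site signals and the spliceosome, which are not modeled here.

```python
def splice(primary_transcript, introns):
    """Join the exons of a transcript.

    `introns` is a list of (start, end) positions, 0-based and half-open,
    marking the stretches to be removed from the primary transcript.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(primary_transcript[position:start])  # keep the exon
        position = end                                     # skip the intron
    exons.append(primary_transcript[position:])            # final exon
    return "".join(exons)


# Hypothetical transcript: exons in upper case, one intron in lower case.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUUAA"
mature = splice(pre_mrna, [(6, 18)])
print(mature)
```

In this made-up example the intron is twice the length of either exon, echoing the point that most of the DNA in many genes is present as introns.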

Genes Are Transcribed Into RNA

Information generally flows from DNA to mRNA to protein, as illustrated in Figure 40–1 and discussed in more detail in Chapter 39. This is a rigidly controlled process involving a number of complex steps, each of which no doubt is regulated by one or more enzymes or factors; faulty function at any of these steps can cause disease.

RECOMBINANT DNA TECHNOLOGY INVOLVES ISOLATION & MANIPULATION OF DNA TO MAKE CHIMERIC MOLECULES

Isolation and manipulation of DNA, including end-to-end joining of sequences from very different sources to make chimeric molecules (eg, molecules containing both human and bacterial DNA sequences in a sequence-independent fashion), is the essence of recombinant DNA research. This involves several unique techniques and reagents.

Restriction Enzymes Cut DNA Chains at Specific Locations

Certain endonucleases, enzymes that cut DNA at specific DNA sequences within the molecule (as opposed to exonucleases, which digest from the ends of DNA molecules), are a key tool in recombinant DNA research. These enzymes were called restriction enzymes because their presence in a given bacterium restricted the growth of certain bacterial viruses called bacteriophages. Restriction enzymes cut DNA of any source into short pieces in a sequence-specific manner, in contrast to most other enzymatic, chemical, or physical methods, which break DNA randomly. These defensive enzymes (hundreds have been discovered) protect the host bacterial DNA against DNA from foreign organisms (primarily infective phages). However, they are present only in cells that also have a companion enzyme which methylates the host DNA, rendering it an unsuitable substrate for digestion by the restriction enzyme. Thus, site-specific DNA methylases and restriction enzymes always exist in pairs in a bacterium.

[Figure 40–1 diagram: DNA drawn 5′ to 3′, showing the regulatory region, the basal promoter region (CAAT and TATA elements), the transcription start site, the 5′ noncoding region, two exons separated by an intron, the 3′ noncoding region, and the poly(A) addition site (AATAAA). In the nucleus, transcription gives the primary RNA transcript; modification of the 5′ and 3′ ends gives the modified transcript (cap and poly(A) tail); removal of introns and splicing of exons gives the processed nuclear mRNA. After transmembrane transport to the cytoplasm, the mRNA is translated into protein (NH2 to COOH).]

Figure 40–1. Organization of a eukaryotic transcription unit and the pathway of eukaryotic gene expression. Eukaryotic genes have structural and regulatory regions. The structural region consists of the coding DNA and 5′ and 3′ noncoding DNA sequences. The coding regions are divided into two parts: (1) exons, which eventually are ligated together to become mature RNA, and (2) introns, which are processed out of the primary transcript. The structural region is bounded at its 5′ end by the transcription initiation site and at its 3′ end by the polyadenylate addition or termination site. The promoter region, which contains specific DNA sequences that interact with various protein factors to regulate transcription, is discussed in detail in Chapters 37 and 39. The primary transcript has a special structure, a cap, at the 5′ end and a stretch of As at the 3′ end. This transcript is processed to remove the introns; and the mature mRNA is then transported to the cytoplasm, where it is translated into protein.

Restriction enzymes are named after the bacterium from which they are isolated. For example, EcoRI is from Escherichia coli, and BamHI is from Bacillus amyloliquefaciens (Table 40–2). The first three letters in the restriction enzyme name consist of the first letter of the genus (E) and the first two letters of the species (co). These may be followed by a strain designation (R) and a roman numeral (I) to indicate the order of discovery (eg, EcoRI, EcoRII). Each enzyme recognizes and cleaves a specific double-stranded DNA sequence that is 4–7 bp long. These DNA cuts result in blunt ends (eg, HpaI) or overlapping (sticky) ends (eg, BamHI) (Figure 40–2), depending on the mechanism used by the enzyme. Sticky ends are particularly useful in constructing hybrid or chimeric DNA molecules (see below). If the four nucleotides are distributed randomly in a given DNA molecule, one can calculate how frequently a given enzyme will cut a length of DNA. For each position in the DNA molecule, there are four possibilities (A, C, G, and T); therefore, a restriction enzyme that recognizes a 4-bp sequence cuts, on average, once every 256 bp (4⁴), whereas another enzyme that recognizes a 6-bp sequence cuts once every 4096 bp (4⁶). A given piece of DNA has a characteristic linear array of sites for the various enzymes dictated by the linear sequence of its bases; hence, a restriction map can be constructed. When DNA is digested with a given enzyme, the ends of all the fragments have the same DNA sequence. The fragments produced can be isolated by electrophoresis on agarose or polyacrylamide gels (see the discussion of blot transfer, below); this is an essential step in cloning and a major use of these enzymes.

Table 40–1. Variations in the size and complexity of some human genes and mRNAs.¹

Gene                     Gene Size (kb)   Number of Introns   mRNA Size (kb)
β-Globin                 1.5              2                   0.6
Insulin                  1.7              2                   0.4
β-Adrenergic receptor    3                0                   2.2
Albumin                  25               14                  2.1
LDL receptor             45               17                  5.5
Factor VIII              186              25                  9.0
Thyroglobulin            300              36                  8.7

¹The sizes are given in kilobases (kb). The sizes of the genes include some proximal promoter and regulatory region sequences; these are generally about the same size for all genes. Genes vary in size from about 1500 base pairs (bp) to over 2 × 10⁶ bp. There is also great variation in the number of introns and exons. The β-adrenergic receptor gene is intronless, and the thyroglobulin gene has 36 introns. As noted by the smaller difference in mRNA sizes, introns comprise most of the gene sequence.

Table 40–2. Selected restriction endonucleases and their sequence specificities.¹

Endonuclease   Sequence Recognized        Bacterial Source
               (Cleavage Sites Shown)

BamHI          5′ G↓GATCC 3′              Bacillus amyloliquefaciens H
               3′ CCTAG↑G 5′

BglII          5′ A↓GATCT 3′              Bacillus globigii
               3′ TCTAG↑A 5′

EcoRI          5′ G↓AATTC 3′              Escherichia coli RY13
               3′ CTTAA↑G 5′

EcoRII         5′ ↓CCTGG 3′               Escherichia coli R245
               3′ GGACC↑ 5′

HindIII        5′ A↓AGCTT 3′              Haemophilus influenzae Rd
               3′ TTCGA↑A 5′

HhaI           5′ GCG↓C 3′                Haemophilus haemolyticus
               3′ C↑GCG 5′

HpaI           5′ GTT↓AAC 3′              Haemophilus parainfluenzae
               3′ CAA↑TTG 5′

MstII          5′ CC↓TNAGG 3′             Microcoleus strain
               3′ GGANT↑CC 5′

PstI           5′ CTGCA↓G 3′              Providencia stuartii 164
               3′ G↑ACGTC 5′

TaqI           5′ T↓CGA 3′                Thermus aquaticus YTI
               3′ AGC↑T 5′

¹A, adenine; C, cytosine; G, guanine; T, thymine. Arrows show the site of cleavage; depending on the site, sticky ends (BamHI) or blunt ends (HpaI) may result. The length of the recognition sequence can be 4 bp (TaqI), 5 bp (EcoRII), 6 bp (EcoRI), or 7 bp (MstII) or longer. By convention, these are written in the 5′ to 3′ direction for the upper strand of each recognition sequence, and the lower strand is shown with the opposite (ie, 3′ to 5′) polarity. Note that most recognition sequences are palindromes (ie, the sequence reads the same in opposite directions on the two strands). A residue designated N means that any nucleotide is permitted.
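The cutting-frequency estimate and the idea of a restriction digest can be sketched in a few lines. This is a minimal illustration, not a laboratory tool: the test sequence is invented, only the upper strand is modeled, and the cut offsets are the standard positions for these enzymes (one base into the site for EcoRI, BamHI, and TaqI; the middle of the site for HpaI).

```python
import re

# (recognition site, offset of the cut within the site on the upper strand)
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HpaI": ("GTTAAC", 3),
    "TaqI": ("TCGA", 1),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(site):
    """True if the site reads the same 5' to 3' on both strands."""
    return site == reverse_complement(site)


def expected_spacing(site):
    """Average distance between sites in random DNA with equal base frequencies."""
    return 4 ** len(site)


def digest(dna, enzyme):
    """Cut the upper strand of `dna` at every site for `enzyme`."""
    site, offset = ENZYMES[enzyme]
    # The lookahead lets overlapping sites be found as well.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", dna)]
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


dna = "AAGAATTCTTGGATCCAAGAATTCGG"  # hypothetical sequence
for name, (site, _) in ENZYMES.items():
    print(name, site, is_palindrome(site), expected_spacing(site), digest(dna, name))
```

Listing the cut positions for several enzymes on the same molecule is, in essence, how a restriction map is assembled.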

A number of other enzymes that act on DNA and RNA are an important part of recombinant DNA technology. Many of these are referred to in this and subsequent chapters (Table 40–3).

Restriction Enzymes & DNA Ligase Are Used to Prepare Chimeric DNA Molecules

Sticky-end ligation is technically easy, but some special techniques are often required to overcome problems inherent in this approach. Sticky ends of a vector may reconnect with themselves, with no net gain of DNA. Sticky ends of fragments can also anneal, so that tandem heterogeneous inserts form. Also, sticky-end sites may not be available or in a convenient position. To circumvent these problems, an enzyme that generates blunt ends is used, and new ends are added using the enzyme terminal transferase. If poly d(G) is added to the 3′ ends of the vector and poly d(C) is added to the 3′ ends of the foreign DNA, the two molecules can only anneal to each other, thus circumventing the problems listed above. This procedure is called homopolymer tailing. Sometimes, synthetic blunt-ended duplex oligonucleotide linkers with a convenient restriction enzyme sequence are ligated to the blunt-ended DNA. Direct blunt-end ligation is accomplished using the enzyme bacteriophage T4 DNA ligase. This technique, though less efficient than sticky-end ligation, has the advantage of joining together any pairs of ends. The disadvantages are that there is no control over the orientation of insertion or the number of molecules annealed together, and there is no easy way to retrieve the insert.

[Figure 40–2 diagram: (A) Sticky or staggered ends: BamHI cleaves 5′-G↓GATCC-3′ on each strand, leaving single-stranded GATC overhangs. (B) Blunt ends: HpaI cleaves 5′-GTT↓AAC-3′ straight across both strands.]

Figure 40–2. Results of restriction endonuclease digestion. Digestion with a restriction endonuclease can result in the formation of DNA fragments with sticky, or cohesive, ends (A) or blunt ends (B). This is an important consideration in devising cloning strategies.
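The difference between sticky and blunt ends, and the logic of homopolymer tailing, can be sketched as follows. This is a simplified model under stated assumptions: each overhang is written 5′ to 3′ on its own strand, the cut offsets are the standard ones for the named enzymes, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def overhang(site, offset):
    """Single-stranded overhang left by cutting a palindromic site.

    `offset` is the cut position on the upper strand; the lower strand is
    cut at the mirror-image position. An empty string means a blunt end.
    """
    low, high = sorted((offset, len(site) - offset))
    return site[low:high]


def end_type(site, offset):
    return "sticky" if overhang(site, offset) else "blunt"


def can_anneal(overhang_a, overhang_b):
    """Two single-stranded ends anneal if they are reverse complements."""
    return bool(overhang_a) and overhang_a == reverse_complement(overhang_b)


ecori = overhang("GAATTC", 1)   # EcoRI: G^AATTC
hpai = overhang("GTTAAC", 3)    # HpaI: GTT^AAC
print("EcoRI:", end_type("GAATTC", 1), ecori)
print("HpaI:", end_type("GTTAAC", 3), repr(hpai))

# Ends cut by the same enzyme are self-complementary, so a cut vector can
# simply reclose on itself.
print("EcoRI end with EcoRI end:", can_anneal(ecori, ecori))

# Homopolymer tailing: poly d(G) on the vector, poly d(C) on the insert.
print("vector with insert:", can_anneal("GGGG", "CCCC"))
print("vector with vector:", can_anneal("GGGG", "GGGG"))
```

The last two lines show why tailing prevents the vector from reconnecting with itself: a poly d(G) tail pairs only with a poly d(C) tail.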

Cloning Amplifies DNA

A clone is a large population of identical molecules, bacteria, or cells that arise from a common ancestor. Molecular cloning allows for the production of a large number of identical DNA molecules, which can then be characterized or used for other purposes. This technique is based on the fact that chimeric or hybrid DNA molecules can be constructed in cloning vectors (typically bacterial plasmids, phages, or cosmids), which then continue to replicate in a host cell under their own control systems. In this way, the chimeric DNA is amplified. The general procedure is illustrated in Figure 40–3.

Bacterial plasmids are small, circular, duplex DNA molecules whose natural function is to confer antibiotic resistance to the host cell. Plasmids have several properties that make them extremely useful as cloning vectors. They exist as single or multiple copies within the bacterium and replicate independently from the bacterial DNA. The complete DNA sequence of many plasmids is known; hence, the precise location of restriction enzyme cleavage sites for inserting the foreign DNA is available. Plasmids are smaller than the host chromosome and are therefore easily separated from the latter, and the desired plasmid-inserted DNA is readily removed by cutting the plasmid with the enzyme specific for the restriction site into which the original piece of DNA was inserted.

Table 40–3. Some of the enzymes used in recombinant DNA research.¹

Enzyme: Alkaline phosphatase
Reaction: Dephosphorylates 5′ ends of RNA and DNA.
Primary use: Removal of 5′-PO4 groups prior to kinase labeling to prevent self-ligation.

Enzyme: BAL 31 nuclease
Reaction: Degrades both the 3′ and 5′ ends of DNA.
Primary use: Progressive shortening of DNA molecules.

Enzyme: DNA ligase
Reaction: Catalyzes bonds between DNA molecules.
Primary use: Joining of DNA molecules.

Enzyme: DNA polymerase I
Reaction: Synthesizes double-stranded DNA from single-stranded DNA.
Primary use: Synthesis of double-stranded cDNA; nick translation; generation of blunt ends from sticky ends.

Enzyme: DNase I
Reaction: Under appropriate conditions, produces single-stranded nicks in DNA.
Primary use: Nick translation; mapping of hypersensitive sites; mapping protein-DNA interactions.

Enzyme: Exonuclease III
Reaction: Removes nucleotides from 3′ ends of DNA.
Primary use: DNA sequencing; mapping of DNA-protein interactions.

Enzyme: λ exonuclease
Reaction: Removes nucleotides from 5′ ends of DNA.
Primary use: DNA sequencing.

Enzyme: Polynucleotide kinase
Reaction: Transfers terminal phosphate (γ position) from ATP to 5′-OH groups of DNA or RNA.
Primary use: ³²P labeling of DNA or RNA.

Enzyme: Reverse transcriptase
Reaction: Synthesizes DNA from RNA template.
Primary use: Synthesis of cDNA from mRNA; RNA (5′ end) mapping studies.

Enzyme: S1 nuclease
Reaction: Degrades single-stranded DNA.
Primary use: Removal of “hairpin” in synthesis of cDNA; RNA mapping studies (both 5′ and 3′ ends).

Enzyme: Terminal transferase
Reaction: Adds nucleotides to the 3′ ends of DNA.
Primary use: Homopolymer tailing.

¹Adapted and reproduced, with permission, from Emery AEH: Page 41 in: An Introduction to Recombinant DNA. Wiley, 1984.

[Figure 40–3 diagram: circular plasmid DNA is cut with EcoRI restriction endonuclease to give linear plasmid DNA with sticky ends (single-stranded AATT overhangs); human DNA is cut with the same restriction endonuclease to give a piece of human DNA containing the same sticky ends; the two are annealed and sealed with DNA ligase to give a plasmid DNA molecule with a human DNA insert (recombinant DNA molecule).]

Figure 40–3. Use of restriction nucleases to make new recombinant or chimeric DNA molecules. When inserted back into a bacterial cell (by the process called transformation), typically only a single plasmid is taken up by a single cell, and the plasmid DNA replicates not only itself but also the physically linked new DNA insert. Since recombining the sticky ends, as indicated, regenerates the same DNA sequence recognized by the original restriction enzyme, the cloned DNA insert can be cleanly cut back out of the recombinant plasmid circle with this endonuclease. If a mixture of all of the DNA pieces created by treatment of total human DNA with a single restriction nuclease is used as the source of human DNA, a million or so different types of recombinant DNA molecules can be obtained, each pure in its own bacterial clone. (Modified and reproduced, with permission, from Cohen SN: The manipulation of genes. Sci Am [July] 1975;233:34.)

Phages usually have linear DNA molecules into which foreign DNA can be inserted at several restriction enzyme sites. The chimeric DNA is collected after the phage proceeds through its lytic cycle and produces mature, infective phage particles. A major advantage of phage vectors is that while plasmids accept DNA pieces about 6–10 kb long, phages can accept DNA fragments 10–20 kb long, a limitation imposed by the amount of DNA that can be packed into the phage head.

Larger fragments of DNA can be cloned in cosmids, which combine the best features of plasmids and phages. Cosmids are plasmids that contain the DNA sequences, so-called cos sites, required for packaging lambda DNA into the phage particle. These vectors grow in the plasmid form in bacteria, but since much of the unnecessary lambda DNA has been removed, more chimeric DNA can be packaged into the particle head. It is not unusual for cosmids to carry inserts of chimeric DNA that are 35–50 kb long. Even larger pieces of DNA can be incorporated into bacterial artificial chromosome (BAC), yeast artificial chromosome (YAC), or E. coli bacteriophage P1-based (PAC) vectors. These vectors will accept and propagate DNA inserts of several hundred kilobases or more and have largely replaced the plasmid, phage, and cosmid vectors for some cloning and gene mapping applications. A comparison of these vectors is shown in Table 40–4.

Table 40–4. Cloning capacities of common cloning vectors.

Vector               DNA Insert Size
Plasmid pBR322       0.01–10 kb
Lambda charon 4A     10–20 kb
Cosmids              35–50 kb
BAC, P1              50–250 kb
YAC                  500–3000 kb
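The capacities in Table 40–4 amount to a lookup table. The sketch below simply encodes those ranges and reports which vectors could hold an insert of a given size; the function name is illustrative, and the gaps between ranges (for example, 20–35 kb) are reproduced exactly as the table gives them.

```python
# Insert-size ranges in kilobases, transcribed from Table 40-4.
VECTOR_CAPACITY_KB = [
    ("Plasmid pBR322", 0.01, 10),
    ("Lambda charon 4A", 10, 20),
    ("Cosmid", 35, 50),
    ("BAC, P1", 50, 250),
    ("YAC", 500, 3000),
]


def candidate_vectors(insert_kb):
    """Vectors whose listed capacity range includes the insert size."""
    return [name for name, low, high in VECTOR_CAPACITY_KB
            if low <= insert_kb <= high]


for size in (2, 15, 40, 200, 1000):
    print(size, "kb:", candidate_vectors(size))
```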

Because insertion of DNA into a functional region of the vector will interfere with the action of this region, care must be taken not to interrupt an essential function of the vector. This concept can be exploited, however, to provide a selection technique. For example, the common plasmid vector pBR322 has both tetracycline (tet) and ampicillin (amp) resistance genes. A single PstI restriction enzyme site within the amp resistance gene is commonly used as the insertion site for a piece of foreign DNA. In addition to having sticky ends (Table 40–2 and Figure 40–2), the DNA inserted at this site disrupts the amp resistance gene and makes the bacterium carrying this plasmid amp-sensitive (Figure 40–4). Thus, the parental plasmid, which provides resistance to both antibiotics, can be readily separated from the chimeric plasmid, which is resistant only to tetracycline. YACs contain replication and segregation functions that work in both bacteria and yeast cells and therefore can be propagated in either organism.
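The selection scheme just described is a small truth table, which can be written out as a sketch. The function and its return strings are illustrative; the only facts assumed are those in the text (pBR322 confers resistance to both antibiotics, and an insert at the PstI site abolishes ampicillin resistance).

```python
def classify_colony(grows_on_tetracycline, grows_on_ampicillin):
    """Interpret the growth of a colony plated on each antibiotic."""
    if grows_on_tetracycline and grows_on_ampicillin:
        return "parental pBR322 (no insert)"
    if grows_on_tetracycline and not grows_on_ampicillin:
        return "chimeric pBR322 (insert in amp resistance gene)"
    if not grows_on_tetracycline and not grows_on_ampicillin:
        return "no plasmid"
    return "unexpected phenotype"


for tet in (True, False):
    for amp in (True, False):
        print(f"tet growth={tet}, amp growth={amp}: {classify_colony(tet, amp)}")
```

Colonies that grow on tetracycline but not on ampicillin are the ones worth picking.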

In addition to the vectors described in Table 40–4 that are designed primarily for propagation in bacterial cells, vectors for mammalian cell propagation and insert gene (cDNA)/protein expression have also been developed. These vectors are all based upon various eukaryotic viruses that are composed of RNA or DNA genomes. Notable examples of such viral vectors are those utilizing adenoviral (DNA-based) and retroviral (RNA-based) genomes. Though somewhat limited in the size of DNA sequences that can be inserted, such mammalian viral cloning vectors make up for this shortcoming because they will efficiently infect a wide range of different cell types. For this reason, various mammalian viral vectors are being investigated for use in gene therapy experiments.

A Library Is a Collection of Recombinant Clones

The combination of restriction enzymes and various cloning vectors allows the entire genome of an organism to be packed into a vector. A collection of these different recombinant clones is called a library. A genomic library is prepared from the total DNA of a cell line or tissue. A cDNA library comprises complementary DNA copies of the population of mRNAs in a tissue. Genomic DNA libraries are often prepared by performing partial digestion of total DNA with a restriction enzyme that cuts DNA frequently (eg, a four-base cutter such as TaqI). The idea is to generate rather large fragments so that most genes will be left intact. The BAC, YAC, and P1 vectors are preferred since they can accept very large fragments of DNA and thus offer a better chance of isolating an intact gene on a single DNA fragment.

A vector in which the protein coded by the gene introduced by recombinant DNA technology is actually synthesized is known as an expression vector. Such vectors are now commonly used to detect specific cDNA molecules in libraries and to produce proteins by genetic engineering techniques. These vectors are specially constructed to contain very active inducible promoters, proper in-phase translation initiation codons, both transcription and translation termination signals, and appropriate protein processing signals, if needed. Some expression vectors even contain genes that code for protease inhibitors, so that the final yield of product is enhanced.

Probes Search Libraries for Specific Genes or cDNA Molecules

A variety of molecules can be used to “probe” libraries in search of a specific gene or cDNA molecule or to define and quantitate DNA or RNA separated by electrophoresis through various gels. Probes are generally pieces of DNA or RNA labeled with a ³²P-containing nucleotide or, more commonly now, with fluorescently labeled nucleotides. Importantly, neither modification (³²P or fluorescent label) affects the hybridization properties of the resulting labeled nucleic acid probes. The probe must recognize a complementary sequence to be effective. A cDNA synthesized from a specific mRNA can be used to screen either a cDNA library for a longer cDNA or a genomic library for a complementary sequence in the coding region of a gene. A popular technique for finding specific genes entails taking a short amino acid sequence and, employing the codon usage for that species (see Chapter 38), making an oligonucleotide probe that will detect the corresponding DNA fragment in a genomic library. If the sequences match exactly, probes 15–20 nucleotides long will hybridize. cDNA probes are used to detect DNA fragments on Southern blot transfers and to detect and quantitate RNA on Northern blot transfers. Specific antibodies can also be used as probes provided that the vector used synthesizes protein molecules that are recognized by them.
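When an oligonucleotide probe is designed from an amino acid sequence, the degeneracy of the genetic code determines how many different oligonucleotides could encode the peptide. The sketch below counts that number from the codons available for each amino acid in the standard genetic code and picks the least degenerate stretch of a peptide. The peptide and function names are hypothetical, and species-specific codon usage, which the text mentions as a way to narrow the choice, is not modeled.

```python
from math import prod

# Number of codons for each amino acid in the standard genetic code
# (one-letter amino acid symbols).
CODONS_PER_AA = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "V": 4, "P": 4, "T": 4, "A": 4, "G": 4,
    "L": 6, "S": 6, "R": 6,
}


def degeneracy(peptide):
    """Number of distinct oligonucleotides that could encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


def least_degenerate_window(peptide, n_residues):
    """Return (start, window, degeneracy) for the best stretch of the peptide."""
    windows = [(degeneracy(peptide[i:i + n_residues]), i)
               for i in range(len(peptide) - n_residues + 1)]
    best_degeneracy, start = min(windows)
    return start, peptide[start:start + n_residues], best_degeneracy


# Five residues correspond to a 15-nucleotide probe.
print(least_degenerate_window("LSRMWKDCHEL", 5))
```

Stretches rich in methionine and tryptophan are favored because each has a single codon, whereas leucine, serine, and arginine each have six.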

[Figure 40–4 diagram: host pBR322 (ampicillin-resistant, tetracycline-resistant), showing the ampicillin resistance gene, the tetracycline resistance gene, and the PstI, EcoRI, HindIII, BamHI, and SalI sites. The plasmid is cut open with PstI and PstI-cut DNA is inserted, giving chimeric pBR322 (ampicillin-sensitive, tetracycline-resistant), in which the insert is flanked by PstI sites within the ampicillin resistance gene.]

Figure 40–4. A method of screening recombinants for inserted DNA fragments. Using the plasmid pBR322, a piece of DNA is inserted into the unique PstI site. This insertion disrupts the gene coding for a protein that provides ampicillin resistance to the host bacterium. Hence, a bacterium carrying the chimeric plasmid will no longer survive when plated on a substrate medium that contains this antibiotic. The differential sensitivity to tetracycline and ampicillin can therefore be used to distinguish clones of plasmid that contain an insert. A similar scheme relying upon production of an in-frame fusion of a newly inserted DNA producing a peptide fragment capable of complementing an inactive, deleted form of the enzyme β-galactosidase allows for blue-white colony formation on agar plates containing a dye hydrolyzable by β-galactosidase. β-Galactosidase-positive colonies are blue.

Blotting & Hybridization Techniques Allow Visualization of Specific Fragments

Visualization of a specific DNA or RNA fragment among the many thousands of “contaminating” molecules requires the convergence of a number of techniques, collectively termed blot transfer. Figure 40–5 illustrates the Southern (DNA), Northern (RNA), and Western (protein) blot transfer procedures. (The first is named for the person who devised the technique, and the other names began as laboratory jargon but are now accepted terms.) These procedures are useful in determining how many copies of a gene are in a given tissue or whether there are any gross alterations in a gene (deletions, insertions, or rearrangements). Occasionally, if a specific base is changed and a restriction site is altered, these procedures can detect a point mutation. The Northern and Western blot transfer techniques are used to size and quantitate specific RNA and protein molecules, respectively. A fourth hybridization technique, the Southwestern blot, examines protein-DNA interactions. Proteins are separated by electrophoresis, renatured, and analyzed for an interaction by hybridization with a specific labeled DNA probe.

Colony or plaque hybridization is the method by which specific clones are identified and purified. Bacteria are grown as colonies on an agar plate and overlaid with nitrocellulose filter paper. Cells from each colony stick to the filter and are permanently fixed thereto by heat, which with NaOH treatment also lyses the cells and denatures the DNA so that it will hybridize with the probe. A radioactive probe is added to the filter, and (after washing) the hybrid complex is localized by exposing the filter to x-ray film. By matching the spot on the autoradiograph to a colony, the latter can be picked from the plate. A similar strategy is used to identify fragments in phage libraries. Successive rounds of this procedure result in a clonal isolate (bacterial colony) or individual phage plaque.

All of the hybridization procedures discussed in this section depend on the specific base-pairing properties of complementary nucleic acid strands described above. Perfect matches hybridize readily and withstand high temperatures in the hybridization and washing reactions. Specific complexes also form in the presence of low salt concentrations. Less than perfect matches do not tolerate these stringent conditions (ie, elevated temperatures and low salt concentrations); thus, hybridization either never occurs or is disrupted during the washing step. Gene families, in which there is some degree of homology, can be detected by varying the stringency of the hybridization and washing steps. Cross-species comparisons of a given gene can also be made using this approach. Hybridization conditions capable of detecting just a single base pair mismatch between probe and target have been devised.

[Figure 40–5 diagram: three parallel columns, Southern (DNA, probed with labeled cDNA*), Northern (RNA, probed with labeled cDNA*), and Western (protein, probed with labeled antibody*), each proceeding through gel electrophoresis, transfer to paper, addition of probe, and autoradiography.]

Figure 40–5. The blot transfer procedure. In a Southern, or DNA, blot transfer, DNA isolated from a cell line or tissue is digested with one or more restriction enzymes. This mixture is pipetted into a well in an agarose or polyacrylamide gel and exposed to a direct electrical current. DNA, being negatively charged, migrates toward the anode; the smaller fragments move the most rapidly. After a suitable time, the DNA is denatured by exposure to mild alkali and transferred to nitrocellulose or nylon paper, in an exact replica of the pattern on the gel, by the blotting technique devised by Southern. The DNA is bound to the paper by exposure to heat, and the paper is then exposed to the labeled cDNA probe, which hybridizes to complementary fragments on the filter. After thorough washing, the paper is exposed to x-ray film, which is developed to reveal several specific bands corresponding to the DNA fragment that recognized the sequences in the cDNA probe. The RNA, or Northern, blot is conceptually similar. RNA is subjected to electrophoresis before blot transfer. This requires some different steps from those of DNA transfer, primarily to ensure that the RNA remains intact, and is generally somewhat more difficult. In the protein, or Western, blot, proteins are electrophoresed and transferred to nitrocellulose and then probed with a specific antibody or other probe molecule. (Asterisks signify labeling, either radioactive or fluorescent.)
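The effect of base composition and mismatches on hybrid stability can be illustrated with a rough calculation. The sketch below uses the Wallace rule, a common rule of thumb for short oligonucleotides (about 2 °C for each A or T and 4 °C for each G or C); this rule is not taken from the text and is only an approximation that ignores salt concentration and probe concentration. The sequences and function names are invented.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def wallace_tm(oligo):
    """Approximate melting temperature (degrees C) of a short perfect duplex."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def hybrid_mismatches(probe, target_3_to_5):
    """Count mispaired positions between a probe (written 5' to 3') and a
    target strand written 3' to 5', so that position i faces position i."""
    return sum(1 for p, t in zip(probe, target_3_to_5) if PAIRS[p] != t)


probe = "AGTCTTGGAGCTAGGCA"             # hypothetical 17-nucleotide probe
perfect_target = "TCAGAACCTCGATCCGT"     # fully complementary strand
related_target = "TCAGAACCTTGATCCGT"     # one position altered

print("Estimated Tm of perfect hybrid:", wallace_tm(probe), "C")
print("Mismatches, perfect target:", hybrid_mismatches(probe, perfect_target))
print("Mismatches, related target:", hybrid_mismatches(probe, related_target))
```

Washing close to the estimated melting temperature of the perfect hybrid is what allows a single mismatch to be discriminated; washing at lower stringency lets related sequences, such as members of a gene family, remain bound.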

Manual & Automatic Techniques Are Available to Determine the Sequence of DNA

The segments of specific DNA molecules obtained by recombinant DNA technology can be analyzed to determine their nucleotide sequence. This method depends upon having a large number of identical DNA molecules. This requirement can be satisfied by cloning the fragment of interest, using the techniques described above. The manual enzymatic method (Sanger) employs specific dideoxynucleotides that terminate DNA strand synthesis at specific nucleotides as the strand is synthesized on purified template nucleic acid. The reactions are adjusted so that a population of DNA fragments representing termination at every nucleotide is obtained. By having a radioactive label incorporated at the end opposite the termination site, one can separate the fragments according to size using polyacrylamide gel electrophoresis. An autoradiograph is made, and each of the fragments produces an image (band) on an x-ray film. These are read in order to give the DNA sequence (Figure 40–6). Another manual method, that of Maxam and Gilbert, employs chemical methods to cleave the DNA molecules where they contain the specific nucleotides. Techniques that do not require the use of radioisotopes are commonly employed in automated DNA sequencing. Most commonly employed is an automated procedure in which four different fluorescent labels (one representing each nucleotide) are used. Each emits a specific signal upon excitation by a laser beam, and this can be recorded by a computer.
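The logic of reading a Sanger sequencing gel can be simulated in a few lines. In this simplified sketch each of the four dideoxynucleotide reactions is assumed to produce one fragment for every position at which its base occurs in the newly synthesized strand, and the gel is read from the shortest fragment to the longest. The function names are illustrative, and the example strand is the 12-nucleotide sequence shown in Figure 40–6.

```python
def sanger_lanes(new_strand):
    """Fragment lengths expected in each dideoxynucleotide reaction.

    A fragment of length n appears in the lane for base X whenever the
    nth nucleotide of the synthesized strand is X.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read bands from shortest to longest fragment across all four lanes."""
    bands = sorted((length, base)
                   for base, lengths in lanes.items()
                   for length in lengths)
    return "".join(base for _, base in bands)


lanes = sanger_lanes("AGTCTTGGAGCT")
for base in "GATC":
    print(f"dd{base}TP lane, fragment lengths:", lanes[base])
print("Sequence read from the gel:", read_gel(lanes))
```

Because every fragment length occurs in exactly one lane, ordering the bands by size recovers the sequence unambiguously.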

Oligonucleotide Synthesis Is Now Routine

The automated chemical synthesis of moderately long oligonucleotides (about 100 nucleotides) of precise sequence is now a routine laboratory procedure. Each synthetic cycle takes but a few minutes, so an entire molecule can be made by synthesizing relatively short segments that can then be ligated to one another. Oligonucleotides are now indispensable for DNA sequencing, library screening, protein-DNA binding, DNA mobility shift assays, the polymerase chain reaction (see below), site-directed mutagenesis, and numerous other applications.

[Figure 40–6 diagram: a slab gel with four lanes, one for each reaction containing radiolabel (ddGTP, ddATP, ddTTP, ddCTP; bases terminated G, A, T, C), with the direction of electrophoresis marked. The sequence of the original strand read from the gel is A–G–T–C–T–T–G–G–A–G–C–T, ending at the 3′ end.]

Figure 40–6. Sequencing of DNA by the method devised by Sanger. The ladder-like arrays represent from bottom to top all of the successively longer fragments of the original DNA strand. Knowing which specific dideoxynucleotide reaction was conducted to produce each mixture of fragments, one can determine the sequence of nucleotides from the labeled end (asterisk) toward the unlabeled end by reading up the gel. Automated sequencing involves the reading of chemically modified deoxynucleotides. The base-pairing rules of Watson and Crick (A–T, G–C) dictate the sequence of the other (complementary) strand. (Asterisks signify radiolabeling.)

The Polymerase Chain Reaction (PCR) Amplifies DNA Sequences

The polymerase chain reaction (PCR) is a method of amplifying a target sequence of DNA. PCR provides a sensitive, selective, and extremely rapid means of amplifying a desired sequence of DNA. Specificity is based on the use of two oligonucleotide primers that hybridize to complementary sequences on opposite strands of DNA and flank the target sequence (Figure 40–7). The DNA sample is first heated to separate the two strands; the primers are allowed to bind to the DNA; and each strand is copied by a DNA polymerase, starting at the primer site. The two DNA strands each serve as a template for the synthesis of new DNA from the two primers. Repeated cycles of heat denaturation, annealing of the primers to their complementary sequences, and extension of the annealed primers with DNA polymerase result in the exponential amplification of DNA segments of defined length. Early PCR reactions used an E coli DNA polymerase that was destroyed by each heat denaturation cycle. Substitution of a heat-stable DNA polymerase from Thermus aquaticus (or the corresponding DNA polymerase from other thermophilic bacteria), an organism that lives and replicates at 70–80 °C, obviates this problem and has made possible automation of the reaction, since the polymerase reactions can be run at 70 °C. This has also improved the specificity and the yield of DNA.

DNA sequences as short as 50–100 bp and as long as 10 kb can be amplified. Twenty cycles provide an amplification of 10⁶ and 30 cycles of 10⁹. The PCR allows the DNA in a single cell, hair follicle, or spermatozoon to be amplified and analyzed. Thus, the applications of PCR to forensic medicine are obvious. The PCR is also used (1) to detect infectious agents, especially latent viruses; (2) to make prenatal genetic diagnoses; (3) to detect allelic polymorphisms; (4) to establish precise tissue types for transplants; and (5) to study evolution, using DNA from archeological samples. After RNA copying, PCR is also used for mRNA quantitation by the so-called RT-PCR method (cDNA copies of mRNA generated by a retroviral reverse transcriptase). There are an equal number of applications of PCR to problems in basic science, and new uses are developed every year.

[Figure 40–7 diagram: the targeted sequence shown at START and after CYCLE 1, CYCLE 2, CYCLE 3, and CYCLES 4–n.]

Figure 40–7. The polymerase chain reaction is used to amplify specific gene sequences. Double-stranded DNA is heated to separate it into individual strands. These bind two distinct primers that are directed at specific sequences on opposite strands and that define the segment to be amplified. DNA polymerase extends the primers in each direction and synthesizes two strands complementary to the original two. This cycle is repeated several times, giving an amplified product of defined length and sequence. Note that the two primers are present in excess.

PRACTICAL APPLICATIONS OF RECOMBINANT DNA TECHNOLOGY ARE NUMEROUS

The isolation of a specific gene from an entire genome requires a technique that will discriminate one part in a million. The identification of a regulatory region that may be only 10 bp in length requires a sensitivity of one part in 3 × 10⁸; a disease such as sickle cell anemia is caused by a single base change, or one part in 3 × 10⁹. Recombinant DNA technology is powerful enough to accomplish all these things.
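The ratios quoted above follow from dividing the size of the genome by the size of the target. The sketch below assumes a genome of 3 × 10^9 bp, the figure implied by the text; the function name and the 3-kb gene size are hypothetical and are chosen only to reproduce the "one part in a million" case.

```python
GENOME_BP = 3e9  # genome size implied by the text (3 x 10^9 bp)


def one_part_in(target_bp: float, genome_bp: float = GENOME_BP) -> float:
    """Return N such that a target of target_bp is 'one part in N' of the genome."""
    return genome_bp / target_bp


print(f"hypothetical 3-kb gene:   1 part in {one_part_in(3000):.0e}")
print(f"10-bp regulatory region:  1 part in {one_part_in(10):.0e}")
print(f"single base change:       1 part in {one_part_in(1):.0e}")
```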

Gene Mapping Localizes Specific Genes to Distinct Chromosomes

Gene localizing thus can define a map of the human genome. This is already yielding useful information in the definition of human disease. Somatic cell hybridization and in situ hybridization are two techniques used to accomplish this. In in situ hybridization, the simpler and more direct procedure, a radioactive probe is added to a metaphase spread of chromosomes on a glass slide. The exact area of hybridization is localized by layering photographic emulsion over the slide and, after exposure, lining up the grains with some histologic identification of the chromosome. Fluorescence in situ hybridization (FISH) is a very sensitive technique that is also used for this purpose. This often places the gene at a location on a given band or region on the chromosome. Some of the human genes localized using these techniques are listed in Table 40–5. This table represents only a sampling, since thousands of genes have been mapped as a result of the recent sequencing of the human genome. Once the defect is localized to a region of DNA that has the characteristic structure of a gene (Figure 40–1), a synthetic gene can be constructed and expressed in an appropriate vector and its function can be assessed, or the putative peptide, deduced from the open reading frame in the coding region, can be synthesized. Antibodies directed against this peptide can be used to assess whether this peptide is expressed in normal persons and whether it is absent in those with the genetic syndrome.

Table 40–5. Localization of human genes.¹

Gene                                            Chromosome    Disease
Insulin                                         11p15
Prolactin                                       6p23-q12
Growth hormone                                  17q21-qter    Growth hormone deficiency
α-Globin                                        16p12-pter    α-Thalassemia
β-Globin                                        11p12         β-Thalassemia, sickle cell
Adenosine deaminase                             20q13-qter    Adenosine deaminase deficiency
Phenylalanine hydroxylase                       12q24         Phenylketonuria
Hypoxanthine-guanine phosphoribosyltransferase  Xq26-q27      Lesch-Nyhan syndrome
DNA segment G8                                  4p            Huntington’s chorea

¹This table indicates the chromosomal location of several genes and the diseases associated with deficient or abnormal production of the gene products. The chromosome involved is indicated by the first number or letter. The other numbers and letters refer to precise localizations, as defined in McKusick VA: Mendelian Inheritance in Man, 6th ed. Johns Hopkins Univ Press, 1983.

Proteins Can Be Produced for Research & Diagnosis

A practical goal of recombinant DNA research is the production of materials for biomedical applications. This technology has two distinct merits: (1) It can supply large amounts of material that could not be obtained by conventional purification methods (eg, interferon, tissue plasminogen activator). (2) It can provide human material (eg, insulin, growth hormone). The advantages in both cases are obvious. Although the primary aim is to supply products (generally proteins) for treatment (insulin) and diagnosis (AIDS testing) of human and other animal diseases and for disease prevention (hepatitis B vaccine), there are other potential commercial applications, especially in agriculture. An example of the latter is the attempt to engineer plants that are more resistant to drought or temperature extremes, more efficient at fixing nitrogen, or that produce seeds containing the complete complement of essential amino acids (rice, wheat, corn, etc).

Recombinant DNA Technology Is Used in the Molecular Analysis of Disease

A. NORMAL GENE VARIATIONS

There is a normal variation of DNA sequence just as is true of more obvious aspects of human structure. Variations of DNA sequence, polymorphisms, occur approximately once in every 500 nucleotides, or about 10⁷ times per genome. There are without doubt deletions and insertions of DNA as well as single-base substitutions. In healthy people, these alterations obviously occur in noncoding regions of DNA or at sites that cause no change in function of the encoded protein. This heritable polymorphism of DNA structure can be associated with certain diseases within a large kindred and can be used to search for the specific gene involved, as is illustrated below. It can also be used in a variety of applications in forensic medicine.

B. GENE VARIATIONS CAUSING DISEASE

Classic genetics taught that most genetic diseases were due to point mutations which resulted in an impaired protein. This may still be true, but if on reading the initial sections of this chapter one predicted that genetic disease could result from derangement of any of the steps illustrated in Figure 40–1, one would have made a proper assessment. This point is nicely illustrated by examination of the β-globin gene. This gene is located in a cluster on chromosome 11 (Figure 40–8), and an expanded version of the gene is illustrated in Figure 40–9. Defective production of β-globin results in a variety of diseases and is due to many different lesions in and around the β-globin gene (Table 40–6).

Figure 40–8. Schematic representation of the β-globin gene cluster and of the lesions in some genetic disorders. The β-globin gene is located on chromosome 11 in close association with the two γ-globin genes and the δ-globin gene. The β-gene family is arranged in the order 5′-ε-Gγ-Aγ-ψβ-δ-β-3′. The ε locus is expressed in early embryonic life (as α₂ε₂). The γ genes are expressed in fetal life, making fetal hemoglobin (HbF, α₂γ₂). Adult hemoglobin consists of HbA (α₂β₂) or HbA₂ (α₂δ₂). The ψβ is a pseudogene that has sequence homology with β but contains mutations that prevent its expression. A locus control region (LCR) located upstream (5′) from the ε gene controls the rate of transcription of the entire β-globin gene cluster. Deletions (solid bar) of the β locus cause β-thalassemia (deficiency or absence [β⁰] of β-globin). A deletion of δ and β causes hemoglobin Lepore (only hemoglobin α is present). An inversion (Aγδβ)⁰ in this region (colored bar) disrupts gene function and also results in thalassemia (type III). Each type of thalassemia tends to be found in a certain group of people, eg, the (Aγδβ)⁰ deletion inversion occurs in persons from India. Many more deletions in this region have been mapped, and each causes some type of thalassemia. (Diagram labels: 10-kb scale bar; 5′ LCR; genes Gγ, Aγ, ψβ, δ, β; 3′; hemoglobinopathies shown: β⁰-thalassemia (two deletions), hemoglobin Lepore, and the inverted (Aγδβ)⁰-thalassemia.)

C. POINT MUTATIONS

The classic example is sickle cell disease, which is caused by mutation of a single base out of the 3 × 10⁹ in the genome, a T-to-A substitution in the template strand of DNA, which in turn results in an A-to-U change in the mRNA corresponding to the sixth codon of the β-globin gene. The altered codon specifies a different amino acid (valine rather than glutamic acid), and this causes a structural abnormality of the β-globin molecule. Other point mutations in and around the β-globin gene result in decreased production or, in some instances, no production of β-globin; β-thalassemia is the result of these mutations. (The thalassemias are characterized by defects in the synthesis of hemoglobin subunits, and so β-thalassemia results when there is insufficient production of β-globin.) Figure 40–9 illustrates that point mutations affecting each of the many processes involved in generating a normal mRNA (and therefore a normal protein) have been implicated as a cause of β-thalassemia.

Figure 40–9. Mutations in the β-globin gene causing β-thalassemia. The β-globin gene is shown in the 5′ to 3′ orientation. The cross-hatched areas indicate the 5′ and 3′ nontranslated regions. Reading from the 5′ to 3′ direction, the shaded areas are exons 1–3 and the clear spaces are introns 1 (I1) and 2 (I2). Mutations that affect transcription control (•) are located in the 5′ flanking-region DNA. Examples of nonsense mutations, mutations in RNA processing, and RNA cleavage mutations have been identified and are indicated. In some regions, many mutations have been found. These are indicated by the brackets.

Table 40–6. Structural alterations of the β-globin gene.

Alteration        Function Affected                   Disease
Point mutations   Protein folding                     Sickle cell disease
                  Transcriptional control             β-Thalassemia
                  Frameshift and nonsense mutations   β-Thalassemia
                  RNA processing                      β-Thalassemia
Deletion          mRNA production                     β⁰-Thalassemia, hemoglobin Lepore
Rearrangement     mRNA production                     β-Thalassemia type III
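The path from the single-base change to the amino acid substitution in sickle cell disease can be traced in a few lines. The sketch below is only an illustration: it uses the seven-base template sequences shown for the sixth-codon region later in this chapter (GGACTCC normal, GGACACC sickle, written 3′ to 5′), assumes a reading frame in which the first triplet is codon 5 and the second is codon 6, and includes only the three codons needed rather than the full genetic code. The function names are hypothetical.

```python
# Partial codon table: only the codons used in this example, not the full code.
CODON_TABLE = {"CCU": "Pro", "GAG": "Glu", "GUG": "Val"}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') copied from a template strand written 3'->5'."""
    coding = "".join(COMPLEMENT[base] for base in template_3_to_5)
    return coding.replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Translate complete codons; a trailing partial codon is ignored."""
    usable = len(mrna) - len(mrna) % 3
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, usable, 3)]


normal_template = "GGACTCC"  # template strand, normal (A) allele
sickle_template = "GGACACC"  # template strand, sickle (S) allele: T -> A

for label, template in (("normal", normal_template), ("sickle", sickle_template)):
    mrna = transcribe(template)
    print(label, mrna, translate(mrna))
```

The T-to-A change in the template appears as an A-to-U change in the mRNA (GAG becomes GUG), so the second translated codon changes from glutamic acid to valine.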

D. DELETIONS, INSERTIONS, & REARRANGEMENTS OF DNA

Studies of bacteria, viruses, yeasts, and fruit flies show that pieces of DNA can move from one place to another within a genome. The deletion of a critical piece of DNA, the rearrangement of DNA within a gene, or the insertion of a piece of DNA within a coding or regulatory region can all cause changes in gene expression resulting in disease. Again, a molecular analysis of β-thalassemia produces numerous examples of these processes (particularly deletions) as causes of disease (Figure 40–8). The globin gene clusters seem particularly prone to this lesion. Deletions in the α-globin cluster, located on chromosome 16, cause α-thalassemia. There is a strong ethnic association for many of these deletions, so that northern Europeans, Filipinos, blacks, and Mediterranean peoples have different lesions all resulting in the absence of hemoglobin A and α-thalassemia.

A similar analysis could be made for a number of other diseases. Point mutations are usually defined by sequencing the gene in question, though occasionally, if the mutation destroys or creates a restriction enzyme site, the technique of restriction fragment analysis can be used to pinpoint the lesion. Deletions or insertions of DNA larger than 50 bp can often be detected by the Southern blotting procedure.

E. PEDIGREE ANALYSIS

Sickle cell disease again provides an excellent example of how recombinant DNA technology can be applied to the study of human disease. The substitution of A for T in the template strand of DNA in the β-globin gene (T for A in the coding strand) changes the sequence in the region that corresponds to the sixth codon from CCTGAGG to CCTGTGG (coding strand) and destroys a recognition site for the restriction enzyme MstII (CCTNAGG; denoted by the small vertical arrows; Table 40–2). Other MstII sites 5′ and 3′ from this site (Figure 40–10) are not affected and so will be cut. Therefore, incubation of DNA from normal (AA), heterozygous (AS), and homozygous (SS) individuals with MstII results in three different patterns on Southern blot transfer (Figure 40–10). This illustrates how a DNA pedigree can be established using the principles discussed in this chapter. Pedigree analysis has been applied to a number of genetic diseases and is most useful in those caused by deletions and insertions or the rarer instances in which a restriction endonuclease cleavage site is affected, as in the example cited in this paragraph. The analysis is facilitated by PCR, which can provide sufficient DNA for analysis from just a few nucleated red blood cells.
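The logic of the MstII analysis can be sketched in code. The example below is a simplified illustration, not a laboratory protocol: it checks for the CCTNAGG recognition sequence with a regular expression and lists the Southern blot bands expected for each genotype, using the fragment sizes given in Figure 40–10 (1.15 kb and 0.2 kb for the normal allele, 1.35 kb for the sickle allele). The detection threshold `min_detectable_kb` is a hypothetical parameter standing in for the fact that the 0.2-kb fragment runs off the gel.

```python
import re

# MstII recognition sequence CCTNAGG, where N is any base.
MSTII_SITE = re.compile(r"CCT[ACGT]AGG")

# Fragment sizes (kb) produced by MstII for each allele, from Figure 40-10.
FRAGMENTS_KB = {"A": (1.15, 0.2), "S": (1.35,)}


def has_mstii_site(coding_strand: str) -> bool:
    """True if the coding-strand sequence contains an MstII site."""
    return MSTII_SITE.search(coding_strand.upper()) is not None


def southern_bands(genotype: str, min_detectable_kb: float = 0.5) -> list[float]:
    """Distinct bands (kb, largest first) expected for a genotype such as 'AS'."""
    bands = {
        size
        for allele in genotype
        for size in FRAGMENTS_KB[allele]
        if size >= min_detectable_kb
    }
    return sorted(bands, reverse=True)


print(has_mstii_site("CCTGAGG"))  # normal sequence: site present
print(has_mstii_site("CCTGTGG"))  # sickle sequence: site destroyed
for genotype in ("AA", "AS", "SS"):
    print(genotype, southern_bands(genotype))
```

Under these assumptions, AA shows a single 1.15-kb band, SS a single 1.35-kb band, and AS both bands, which are the three patterns described in the text.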

F. PRENATAL DIAGNOSIS

If the genetic lesion is understood and a specific probe is available, prenatal diagnosis is possible. DNA from cells collected from as little as 10 mL of amniotic fluid (or by chorionic villus biopsy) can be analyzed by Southern blot transfer. A fetus with the restriction pattern AA in Figure 40–10 does not have sickle cell disease, nor is it a carrier. A fetus with the SS pattern will develop the disease. Probes are now available for this type of analysis of many genetic diseases.

Sequence in the region of the sixth codon of the β-globin gene (MstII recognition site CCTNAGG; arrows mark the cleavage points):

Normal (A):   CC↓TGAGG   Coding strand
              GGACT↑CC   Template strand

Sickle (S):   CCTGTGG    Coding strand
              GGACACC    Template strand

Figure 40–10. Pedigree analysis of sickle cell disease. The top part of the figure (A) shows the first part of the β-globin gene and the MstII restriction enzyme sites in the normal (A) and sickle cell (S) β-globin genes. Digestion with the restriction enzyme MstII results in DNA fragments 1.15 kb and 0.2 kb long in normal individuals. The T-to-A change in individuals with sickle cell disease abolishes one of the three MstII sites around the β-globin gene; hence, a single restriction fragment 1.35 kb in length is generated in response to MstII. This size difference is easily detected on a Southern blot. (The 0.2-kb fragment would run off the gel in this illustration.) (B) Pedigree analysis shows three possibilities: AA = normal (open circle); AS = heterozygous (half-solid circles, half-solid square); SS = homozygous (solid square). This approach allows for prenatal diagnosis of sickle cell disease (dash-sided square). (Panel labels: A. MstII restriction sites around and in the β-globin gene, with fragments of 1.15 kb and 0.2 kb for normal [A] and 1.35 kb for sickle [S]; B. Pedigree analysis, lanes AS, AS, SS, AA, AS, AS, with bands at 1.35 kb and 1.15 kb.)

G. RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP)

The differences in DNA sequence cited above can result in variations of restriction sites and thus in the length of restriction fragments. An inherited difference in the pattern of restriction (eg, a DNA variation occurring in more than 1% of the general population) is known as a restriction fragment length polymorphism, or RFLP. An extensive RFLP map of the human genome has been constructed. This is proving useful in the human genome sequencing project and is an important component of the effort to understand various single-gene and multigenic diseases. RFLPs result from single-base changes (eg, sickle cell disease) or from deletions or insertions of DNA into a restriction fragment (eg, the thalassemias) and have proved to be useful diagnostic tools. They have been found at known gene loci and in sequences that have no known function; thus, RFLPs may disrupt the function of the gene or may have no biologic consequences.

Figure 40–11. The technique of chromosome walking. Gene X is to be isolated from a large piece of DNA. The exact location of this gene is not known, but a probe (*) directed against a fragment of DNA (shown at the 5′ end in this representation) is available, as is a library containing a series of overlapping DNA fragments. For the sake of simplicity, only five of these are shown. The initial probe will hybridize only with clones containing fragment 1, which can then be isolated and used as a probe to detect fragment 2. This procedure is repeated until fragment 4 hybridizes with fragment 5, which contains the entire sequence of gene X. (Diagram labels: intact DNA, 5′ to 3′, with gene X near the 3′ end; initial probe; overlapping fragments 1–5.)

RFLPs are inherited, and they segregate in a mendelian fashion. A major use of RFLPs (thousands are now known) is in the definition of inherited diseases in which the functional deficit is unknown. RFLPs can be used to establish linkage groups, which in turn, by the process of chromosome walking, will eventually define the disease locus. In chromosome walking (Figure 40–11), a fragment representing one end of a long piece of DNA is used to isolate another that overlaps but extends the first. The direction of extension is determined by restriction mapping, and the procedure is repeated sequentially until the desired sequence is obtained. The X chromosome-linked disorders are particularly amenable to this approach, since only a single allele is expressed. Hence, 20% of the defined RFLPs are on the X chromosome, and a reasonably complete linkage map of this chromosome exists. The gene for the X-linked disorder, Duchenne-type muscular dystrophy, was found using RFLPs. Likewise, the defect in Huntington’s disease was localized to the terminal region of the short arm of chromosome 4, and the defect that causes polycystic kidney disease is linked to the α-globin locus on chromosome 16.
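The stepwise logic of chromosome walking can be mimicked with a toy model. In the sketch below, each cloned fragment is represented by hypothetical start and end coordinates (in kb) on the intact DNA, and "hybridization" is modeled simply as overlap between fragments. In the laboratory the coordinates are not known in advance; overlap is detected by hybridization and the direction of extension by restriction mapping. All names and numbers are illustrative assumptions.

```python
from typing import NamedTuple


class Clone(NamedTuple):
    name: str
    start: int  # kb, hypothetical coordinate on the intact DNA
    end: int    # kb


def walk(library: list[Clone], probe_pos: int, gene_start: int, gene_end: int) -> list[str]:
    """Return the clones visited while walking 5'->3' from the probe to the gene."""
    hits = [c for c in library if c.start <= probe_pos <= c.end]
    if not hits:
        raise ValueError("initial probe does not hybridize with any clone")
    current = max(hits, key=lambda c: c.end)
    path = [current.name]
    # Continue until one clone contains the whole gene.
    while not (current.start <= gene_start and gene_end <= current.end):
        # Clones that overlap the current one and extend beyond its 3' end.
        overlapping = [c for c in library if c.start <= current.end < c.end]
        if not overlapping:
            raise ValueError("gap in the library; the walk cannot continue")
        current = max(overlapping, key=lambda c: c.end)
        path.append(current.name)
    return path


library = [
    Clone("fragment 1", 0, 40),
    Clone("fragment 2", 30, 70),
    Clone("fragment 3", 60, 100),
    Clone("fragment 4", 90, 130),
    Clone("fragment 5", 120, 170),
]
print(walk(library, probe_pos=5, gene_start=135, gene_end=160))
```

With these five overlapping fragments the walk visits fragments 1 through 5 in order and stops at fragment 5, which contains the whole of the hypothetical gene, mirroring the sequence of steps in Figure 40–11.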

H. MICROSATELLITE DNA POLYMORPHISMS

Short (2–6 bp), inherited, tandem repeat units of DNA occur about 50,000–100,000 times in the human genome (Chapter 36). Because they occur more frequently, and in view of the routine application of sensitive PCR methods, they are replacing RFLPs as the marker loci for various genome searches.

I. RFLPS & VNTRS IN FORENSIC MEDICINE

Variable numbers of tandemly repeated (VNTR) units are one common type of “insertion” that results in an RFLP. The VNTRs can be inherited, in which case they are useful in establishing genetic association with a disease in a family or kindred; or they can be unique to an individual and thus serve as a molecular fingerprint of that person.

J. GENE THERAPY

Diseases caused by deficiency of a gene product (Table 40–5) are amenable to replacement therapy. The strategy is to clone a gene (eg, the gene that codes for adenosine deaminase) into a vector that will readily be taken up and incorporated into the genome of a host cell. Bone marrow precursor cells are being investigated for this purpose because they presumably will resettle in the marrow and replicate there. The introduced gene would begin to direct the expression of its protein product, and this would correct the deficiency in the host cell.

K. TRANSGENIC ANIMALS

The somatic cell gene replacement described above would obviously not be passed on to offspring. Other strategies to alter germ cell lines have been devised but have been tested only in experimental animals. A certain percentage of genes injected into a fertilized mouse ovum will be incorporated into the genome and found in both somatic and germ cells. Hundreds of transgenic animals have been established, and these are useful for analysis of tissue-specific effects on gene expression and effects of overproduction of gene products (eg, those from the growth hormone gene or oncogenes) and in discovering genes involved in development, a process that heretofore has been difficult to study. The transgenic approach has recently been used to correct a genetic deficiency in mice. Fertilized ova obtained from mice with genetic hypogonadism were injected with DNA containing the coding sequence for the gonadotropin-releasing hormone (GnRH) precursor protein. This gene was expressed and regulated normally in the hypothalamus of a certain number of the resultant mice, and these animals were in all respects normal. Their offspring also showed no evidence of GnRH deficiency. This is, therefore, evidence of somatic cell expression of the transgene and of its maintenance in germ cells.

Targeted Gene Disruption or Knockout

In transgenic animals, one is adding one or more copies of a gene to the genome, and there is no way to control where that gene eventually resides. A complementary (and much more difficult) approach involves the selective removal of a gene from the genome. Gene knockout animals (usually mice) are made by creating a mutation that totally disrupts the function of a gene. This is then used to replace one of the two genes in an embryonic stem cell that can be used to create a heterozygous transgenic animal. The mating of two such animals will, by mendelian genetics, result in a homozygous mutation in 25% of offspring. Several hundred strains of mice with knockouts of specific genes have been developed.
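The 25% figure follows from enumerating the four equally likely allele combinations in a cross between two heterozygotes. The sketch below is a minimal illustration; the allele symbols ("W" for the wild-type allele, "k" for the knockout allele) and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Expected genotype fractions among offspring of two diploid parents.

    Each parent is written as a two-letter genotype, eg "Wk".
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Mating of two heterozygous knockout animals.
for genotype, fraction in cross("Wk", "Wk").items():
    print(genotype, fraction)
```

The enumeration gives one quarter wild type (WW), one half heterozygous (Wk), and one quarter homozygous for the knockout (kk).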

RNA Transcript & Protein Profiling

The “-omic” revolution of the last several years has culminated in the determination of the nucleotide sequences of entire genomes, including those of budding and fission yeasts, various bacteria, the fruit fly, the worm Caenorhabditis elegans, the mouse and, most notably, humans. Additional genomes are being sequenced at an accelerating pace. The availability of all of this DNA sequence information, coupled with engineering advances, has led to the development of several revolutionary methodologies, most of which are based upon high-density microarray technology. We now have the ability to deposit thousands of specific, known, definable DNA sequences (more typically now synthetic oligonucleotides) on a glass microscope-style slide in the space of a few square centimeters. By coupling such DNA microarrays with highly sensitive detection of hybridized fluorescently labeled nucleic acid probes derived from mRNA, investigators can rapidly and accurately generate profiles of gene expression (eg, specific cellular mRNA content) from cell and tissue samples as small as 1 gram or less. Thus entire transcriptome information (the entire collection of cellular mRNAs) for such cell or tissue sources can readily be obtained in only a few days. Transcriptome information allows one to predict the collection of proteins that might be expressed in a particular cell, tissue, or organ in normal and disease states based upon the mRNAs present in those cells.

Complementing this high-throughput, transcript-profiling method is the recent development of high-sensitivity, high-throughput mass spectrometry of complex protein samples. Newer mass spectrometry methods allow one to identify hundreds to thousands of proteins in extracts from very small numbers of cells (< 1 g). This critical information tells investigators which of the many mRNAs detected in transcript microarray mapping studies are actually translated into protein, generally the ultimate dictator of phenotype. Microarray techniques and mass spectrometric protein identification experiments both lead to the generation of huge amounts of data. Appropriate data management and interpretation of the deluge of information forthcoming from such studies has relied upon statistical methods; and this new technology, coupled with the flood of DNA sequence information, has led to the development of the field of bioinformatics, a new discipline whose goal is to help manage, analyze, and integrate this flood of biologically important information. Future work at the intersection of bioinformatics and transcript-protein profiling will revolutionize our understanding of biology and medicine.

SUMMARY

• A variety of very sensitive techniques can now be applied to the isolation and characterization of genes and to the quantitation of gene products.

• In DNA cloning, a particular segment of DNA is removed from its normal environment using one of many restriction endonucleases. This is then ligated into one of several vectors in which the DNA segment can be amplified and produced in abundance.

• The cloned DNA can be used as a probe in one of several types of hybridization reactions to detect other related or adjacent pieces of DNA, or it can be used to quantitate gene products such as mRNA.

• Manipulation of the DNA to change its structure, so-called genetic engineering, is a key element in cloning (eg, the construction of chimeric molecules) and can also be used to study the function of a certain fragment of DNA and to analyze how genes are regulated.

• Chimeric DNA molecules are introduced into cells to make transfected cells or into the fertilized oocyte to make transgenic animals.

• Techniques involving cloned DNA are used to locate genes to specific regions of chromosomes, to identify the genes responsible for diseases, to study how faulty gene regulation causes disease, to diagnose genetic diseases, and increasingly to treat genetic diseases.

GLOSSARY

ARS: Autonomously replicating sequence; the origin of replication in yeast.

Autoradiography: The detection of radioactive molecules (eg, DNA, RNA, protein) by visualization of their effects on photographic film.

Bacteriophage: A virus that infects a bacterium.

Blunt-ended DNA: Two strands of a DNA duplex having ends that are flush with each other.

cDNA: A single-stranded DNA molecule that is complementary to an mRNA molecule and is synthesized from it by the action of reverse transcriptase.

Chimeric molecule: A molecule (eg, DNA, RNA, protein) containing sequences derived from two different species.

Clone: A large number of organisms, cells, or molecules that are identical with a single parental organism, cell, or molecule.

Cosmid: A plasmid into which the DNA sequences from bacteriophage lambda that are necessary for the packaging of DNA (cos sites) have been inserted; this permits the plasmid DNA to be packaged in vitro.

Endonuclease: An enzyme that cleaves internal bonds in DNA or RNA.

Excinuclease: The excision nuclease involved in nucleotide excision repair of DNA.

Exon: The sequence of a gene that is represented (expressed) as mRNA.

Exonuclease: An enzyme that cleaves nucleotides from either the 3′ or 5′ ends of DNA or RNA.

Fingerprinting: The use of RFLPs or repeat sequence DNA to establish a unique pattern of DNA fragments for an individual.

Footprinting: DNA with protein bound is resistant to digestion by DNase enzymes. When a sequencing reaction is performed using such DNA, a protected area, representing the “footprint” of the bound protein, will be detected.

Hairpin: A double-helical stretch formed by base pairing between neighboring complementary sequences of a single strand of DNA or RNA.

Hybridization: The specific reassociation of complementary strands of nucleic acids (DNA with DNA, DNA with RNA, or RNA with RNA).

Insert: An additional length of base pairs in DNA, generally introduced by the techniques of recombinant DNA technology.

Intron: The sequence of a gene that is transcribed but excised before translation.

Library: A collection of cloned fragments that represents the entire genome. Libraries may be either genomic DNA (in which both introns and exons are represented) or cDNA (in which only exons are represented).

Ligation: The enzyme-catalyzed joining in phosphodiester linkage of two stretches of DNA or RNA into one; the respective enzymes are DNA and RNA ligases.

Lines: Long interspersed repeat sequences.

Microsatellite polymorphism: Heterozygosity of a certain microsatellite repeat in an individual.

Microsatellite repeat sequences: Dispersed or group repeat sequences of 2–5 bp repeated up to 50 times. May occur at 50–100 thousand locations in the genome.

Nick translation: A technique for labeling DNA based on the ability of the DNA polymerase from E coli to degrade a strand of DNA that has been nicked and then to resynthesize the strand; if a radioactive nucleoside triphosphate is employed, the rebuilt strand becomes labeled and can be used as a radioactive probe.

Northern blot: A method for transferring RNA from an agarose gel to a nitrocellulose filter, on which the RNA can be detected by a suitable probe.

Oligonucleotide: A short, defined sequence of nucleotides joined together in the typical phosphodiester linkage.

Ori: The origin of DNA replication.

PAC: A high capacity (70–95 kb) cloning vector based upon the lytic E coli bacteriophage P1 that replicates in bacteria as an extrachromosomal element.

Palindrome: A sequence of duplex DNA that is the same when the two strands are read in opposite directions.

Plasmid: A small, extrachromosomal, circular molecule of DNA that replicates independently of the host DNA.

Polymerase chain reaction (PCR): An enzymatic method for the repeated copying (and thus amplification) of the two strands of DNA that make up a particular gene sequence.


Primosome: The mobile complex of helicase and primase that is involved in DNA replication.

Probe: A molecule used to detect the presence of a specific fragment of DNA or RNA in, for instance, a bacterial colony that is formed from a genetic library or during analysis by blot transfer techniques; common probes are cDNA molecules, synthetic oligodeoxynucleotides of defined sequence, or antibodies to specific proteins.

Proteome: The entire collection of expressed proteins in an organism.

Pseudogene: An inactive segment of DNA arising by mutation of a parental active gene.

Recombinant DNA: The altered DNA that results from the insertion of a sequence of deoxynucleotides not previously present into an existing molecule of DNA by enzymatic or chemical means.

Restriction enzyme: An endodeoxynuclease that causes cleavage of both strands of DNA at highly specific sites dictated by the base sequence.

Reverse transcription: RNA-directed synthesis of DNA, catalyzed by reverse transcriptase.

RT-PCR: A method used to quantitate mRNA levels that relies upon a first step of cDNA copying of mRNAs prior to PCR amplification and quantitation.

Signal: The end product observed when a specific sequence of DNA or RNA is detected by autoradiography or some other method. Hybridization with a complementary radioactive polynucleotide (eg, by Southern or Northern blotting) is commonly used to generate the signal.

Sines: Short interspersed repeat sequences.

SNP: Single nucleotide polymorphism. Refers to the fact that single nucleotide genetic variation in genome sequence exists at discrete loci throughout the chromosomes. Measurement of allelic SNP differences is useful for gene mapping studies.

snRNA: Small nuclear RNA. This family of RNAs is best known for its role in mRNA processing.

Southern blot: A method for transferring DNA from an agarose gel to nitrocellulose filter, on which the DNA can be detected by a suitable probe (eg, complementary DNA or RNA).

Southwestern blot: A method for detecting protein-DNA interactions by applying a labeled DNA probe to a transfer membrane that contains a renatured protein.

Spliceosome: The macromolecular complex responsible for precursor mRNA splicing. The spliceosome consists of at least five small nuclear RNAs (snRNA; U1, U2, U4, U5, and U6) and many proteins.

Splicing: The removal of introns from RNA accompanied by the joining of its exons.

Sticky-ended DNA: Complementary single strands of DNA that protrude from opposite ends of a DNA duplex or from the ends of different duplex molecules (see also Blunt-ended DNA, above).

Tandem: Used to describe multiple copies of the same sequence (eg, DNA) that lie adjacent to one another.

Terminal transferase: An enzyme that adds nucleotides of one type (eg, deoxyadenonucleotidyl residues) to the 3′ end of DNA strands.

Transcription: Template DNA-directed synthesis of nucleic acids; typically DNA-directed synthesis of RNA.

Transcriptome: The entire collection of expressed mRNAs in an organism.

Transgenic: Describing the introduction of new DNA into germ cells by its injection into the nucleus of the ovum.

Translation: Synthesis of protein using mRNA as template.

Vector: A plasmid or bacteriophage into which foreign DNA can be introduced for the purposes of cloning.

Western blot: A method for transferring protein to a nitrocellulose filter, on which the protein can be detected by a suitable probe (eg, an antibody).
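The glossary definition of a palindrome (a duplex sequence that reads the same when the two strands are read in opposite directions) amounts to saying that the sequence equals its own reverse complement. The sketch below checks this for two sequences: GAATTC, the well-known EcoRI recognition site, and CCTGAGG, the normal β-globin sequence at the MstII site discussed in this chapter. The function names are hypothetical.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if both strands read the same in the 5'->3' direction."""
    return seq.upper() == reverse_complement(seq)


for site in ("GAATTC", "CCTGAGG"):
    print(site, reverse_complement(site), is_palindrome(site))
```

GAATTC is a perfect palindrome. CCTGAGG is not, because the central position of the MstII site (the N in CCTNAGG) can be any base, so only the flanking bases are symmetric.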



Intracellular Traffic & Sorting of Proteins 46

Robert K. Murray, MD, PhD

BIOMEDICAL IMPORTANCE

Proteins must travel from polyribosomes to many different sites in the cell to perform their particular functions. Some are destined to be components of specific organelles, others for the cytosol or for export, and yet others will be located in the various cellular membranes. Thus, there is considerable intracellular traffic of proteins. Many studies have shown that the Golgi apparatus plays a major role in the sorting of proteins for their correct destinations. A major insight was the recognition that for proteins to attain their proper locations, they generally contain information (a signal or coding sequence) that targets them appropriately. Once a number of the signals were defined, it became apparent that certain diseases result from mutations that affect these signals. In this chapter we discuss the intracellular traffic of proteins and their sorting and briefly consider some of the disorders that result when abnormalities occur.

MANY PROTEINS ARE TARGETED BY SIGNAL SEQUENCES TO THEIR CORRECT DESTINATIONS

The protein biosynthetic pathways in cells can be considered to be one large sorting system. Many proteins carry signals (usually but not always specific sequences of amino acids) that direct them to their destination, thus ensuring that they will end up in the appropriate membrane or cell compartment; these signals are a fundamental component of the sorting system. Usually the signal sequences are recognized and interact with complementary areas of proteins that serve as receptors for the proteins that contain them.

A major sorting decision is made early in protein biosynthesis, when specific proteins are synthesized either on free or on membrane-bound polyribosomes. This results in two sorting branches called the cytosolic branch and the rough endoplasmic reticulum (RER) branch (Figure 46–1). This sorting occurs because proteins synthesized on membrane-bound polyribosomes contain a signal peptide that mediates their attachment to the membrane of the ER. Further details on the signal peptide are given below. Proteins synthesized on free polyribosomes lack this particular signal peptide and are delivered into the cytosol. There they are directed to mitochondria, nuclei, and peroxisomes by specific signals, or remain in the cytosol if they lack a signal. Any protein that contains a targeting sequence that is subsequently removed is designated as a preprotein. In some cases a second peptide is also removed, and in that event the original protein is known as a preproprotein (eg, preproalbumin; Chapter 50).
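The early sorting decision described above can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `sort_protein` and the signal labels are hypothetical, and real targeting depends on the specific signals discussed later in the chapter.

```python
from typing import Optional

# Destinations reachable from the cytosolic branch, keyed by a hypothetical
# label for the kind of targeting signal the protein carries.
CYTOSOLIC_DESTINATIONS = {
    "mitochondrial": "mitochondrion",
    "nuclear": "nucleus",
    "peroxisomal": "peroxisome",
}


def sort_protein(has_signal_peptide: bool,
                 cytosolic_signal: Optional[str] = None) -> str:
    """Return the sorting branch or destination for a newly made protein."""
    # A signal peptide attaches the polyribosome to the ER membrane.
    if has_signal_peptide:
        return "rough ER branch"
    # Proteins made on free polyribosomes stay in the cytosol unless
    # they carry a further targeting signal.
    if cytosolic_signal is None:
        return "cytosol"
    return CYTOSOLIC_DESTINATIONS[cytosolic_signal]


print(sort_protein(True))
print(sort_protein(False, "nuclear"))
```

The sketch treats the two branches as mutually exclusive, which matches the simplified picture in Figure 46–1 rather than every known exception.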

Proteins synthesized and sorted in the rough ER branch (Figure 46–2) include many destined for various membranes (eg, of the ER, Golgi apparatus, lysosomes, and plasma membrane) and for secretion. Lysosomal enzymes are also included. Thus, such proteins may reside in the membranes or lumens of the ER or follow the major transport route of intracellular proteins to the Golgi apparatus. Further signal-mediated sorting of certain proteins occurs in the Golgi apparatus, resulting in delivery to lysosomes, membranes of the Golgi apparatus, and other sites. Proteins destined for the plasma membrane or for secretion pass through the Golgi apparatus but generally are not thought to carry specific sorting signals; they are believed to reach their destinations by default.

The entire pathway of ER → Golgi apparatus → plasma membrane is often called the secretory or exocytotic pathway. Events along this route will be given special attention. Most of the proteins reaching the Golgi apparatus or the plasma membrane are carried in transport vesicles; a brief description of the formation of these important particles will be given subsequently. Other proteins destined for secretion are carried in secretory vesicles (Figure 46–2). These are prominent in the pancreas and certain other glands. Their mobilization and discharge are regulated and often referred to as “regulated secretion,” whereas the secretory pathway involving transport vesicles is called “constitutive.”

Experimental approaches that have afforded major insights into the processes described in this chapter include (1) use of yeast mutants; (2) application of recombinant DNA techniques (eg, mutating or eliminating particular sequences in proteins, or fusing new sequences onto them); and (3) development of in vitro systems (eg, to study translocation in the ER and mechanisms of vesicle formation).

[Figure 46–1 diagram: polyribosomes sort proteins into (1) the cytosolic branch (mitochondrial, nuclear, peroxisomal, and cytosolic proteins) and (2) the rough ER branch (ER membrane, GA membrane, plasma membrane, secretory proteins, and lysosomal enzymes).]

Figure 46–1. Diagrammatic representation of the two branches of protein sorting occurring by synthesis on (1) cytosolic and (2) membrane-bound polyribosomes. The mitochondrial proteins listed are encoded by nuclear genes. Some of the signals used in further sorting of these proteins are listed in Table 46–4. (ER, endoplasmic reticulum; GA, Golgi apparatus.)

The sorting of proteins belonging to the cytosolic branch referred to above is described next, starting with mitochondrial proteins.

THE MITOCHONDRION BOTH IMPORTS & SYNTHESIZES PROTEINS

Mitochondria contain many proteins. Thirteen proteins (mostly membrane components of the electron transport chain) are encoded by the mitochondrial genome and synthesized in that organelle using its own protein-synthesizing system. However, the majority (at least several hundred) are encoded by nuclear genes, are synthesized outside the mitochondria on cytosolic polyribosomes, and must be imported. Yeast cells have proved to be a particularly useful system for analyzing the mechanisms of import of mitochondrial proteins, partly because it has proved possible to generate a variety of mutants that have illuminated the fundamental processes involved. Most progress has been made in the study of proteins present in the mitochondrial matrix, such as the F1 ATPase subunits. Only the pathway of import of matrix proteins will be discussed in any detail here.

Matrix proteins must pass from cytosolic polyribosomes through the outer and inner mitochondrial membranes to reach their destination. Passage through the two membranes is called translocation. Matrix proteins have an amino terminal leader sequence (presequence), about 20–80 amino acids in length, which is not highly conserved but contains many positively charged amino acids (eg, Lys or Arg). The presequence is equivalent to a signal peptide mediating attachment of polyribosomes to membranes of the ER (see below), but in this instance it targets proteins to the matrix; if the leader sequence is cleaved off, potential matrix proteins will not reach their destination.
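The two stated features of a presequence (a length of about 20–80 residues and an abundance of Lys or Arg) can be expressed as a rough check. This is a minimal sketch, not a validated predictor: the helper names, the 0.15 cutoff for the positively charged fraction, and the example leader string are all assumptions chosen for illustration.

```python
POSITIVE_RESIDUES = set("KR")  # Lys and Arg, one-letter codes


def positive_fraction(leader: str) -> float:
    """Fraction of residues in the leader that are Lys or Arg."""
    seq = leader.upper()
    return sum(res in POSITIVE_RESIDUES for res in seq) / len(seq)


def looks_like_presequence(leader: str, min_fraction: float = 0.15) -> bool:
    """Apply the length range from the text plus an assumed charge cutoff."""
    return 20 <= len(leader) <= 80 and positive_fraction(leader) >= min_fraction


# Illustrative 25-residue leader containing 5 Lys/Arg residues.
example = "MLSLRQSIRFFKPATRTLCSSRYLL"
print(len(example), positive_fraction(example), looks_like_presequence(example))
```

Because presequences are not highly conserved, a composition check like this can only flag candidates; it says nothing about whether import actually occurs.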

Translocation is believed to occur posttranslationally, after the matrix proteins are released from the cytosolic polyribosomes. Interactions with a number of cytosolic proteins that act as chaperones (see below) and as targeting factors occur prior to translocation.

Two distinct translocation complexes are situated in the outer and inner mitochondrial membranes, referred to (respectively) as TOM (translocase-of-the-outer membrane) and TIM (translocase-of-the-inner membrane). Each complex has been analyzed and found to be composed of a number of proteins, some of which act as receptors for the incoming proteins and others as components of the transmembrane pores through which these proteins must pass. Proteins must be in the unfolded state to pass through the complexes, and this is made possible by ATP-dependent binding to several chaperone proteins. The roles of chaperone proteins in protein folding are discussed later in this chapter. In mitochondria, they are involved in translocation, sorting, folding, assembly, and degradation of imported proteins. A proton-motive force across the inner membrane is required for import; it is made up of the electric potential across the membrane (inside negative) and the pH gradient (see Chapter 12). The positively charged leader sequence may be helped through the membrane by the negative charge in the matrix. The presequence is split off in the matrix by a matrix-processing peptidase (MPP). Contact with other chaperones present in the matrix is essential to complete the overall process of import. Interaction with mt-Hsp70 (Hsp = heat shock protein) ensures proper import into the matrix and prevents misfolding or aggregation, while interaction with the mt-Hsp60-Hsp10 system ensures proper folding. The latter proteins resemble the bacterial GroEL chaperonins, a subclass of chaperones that form complex cage-like assemblies made up of heptameric ring structures. The interactions of imported proteins with the above chaperones require hydrolysis of ATP to drive them.

The details of how preproteins are translocated have not been fully elucidated. It is possible that the electric potential associated with the inner mitochondrial membrane causes a conformational change in the unfolded preprotein being translocated and that this helps to pull it across. Furthermore, the fact that the matrix is more negative than the intermembrane space may “attract” the positively charged amino terminal of the preprotein to enter the matrix. Close contact between the membrane sites in the outer and inner membranes involved in translocation is necessary.

[Figure 46–2 diagram labels: nuclear envelope; endoplasmic reticulum; Golgi apparatus (CGN, cis, medial, trans, TGN); secretory storage granule; constitutive (excretory) transport vesicle; early endosome; prelysosome (or late endosome); lysosome; cytosol; plasma membrane.]

Figure 46–2. Diagrammatic representation of the rough endoplasmic reticulum branch of protein sorting. Newly synthesized proteins are inserted into the ER membrane or lumen from membrane-bound polyribosomes (small black circles studding the cytosolic face of the ER). Those proteins that are transported out of the ER (indicated by solid black arrows) do so from ribosome-free transitional elements. Such proteins may then pass through the various subcompartments of the Golgi until they reach the TGN, the exit side of the Golgi. In the TGN, proteins are segregated and sorted. Secretory proteins accumulate in secretory storage granules from which they may be expelled as shown in the upper right-hand side of the figure. Proteins destined for the plasma membrane or those that are secreted in a constitutive manner are carried out to the cell surface in transport vesicles, as indicated in the upper middle area of the figure. Some proteins may reach the cell surface via late and early endosomes. Other proteins enter prelysosomes (late endosomes) and are selectively transferred to lysosomes. The endocytic pathway illustrated in the upper left-hand area of the figure is considered elsewhere in this chapter. Retrieval from the Golgi apparatus to the ER is not considered in this scheme. (CGN, cis-Golgi network; TGN, trans-Golgi network.) (Courtesy of E Degen.)

The above describes the major pathway of proteins destined for the mitochondrial matrix. However, certain proteins insert into the outer mitochondrial membrane facilitated by the TOM complex. Others stop in the intermembrane space, and some insert into the inner membrane. Yet others proceed into the matrix and then return to the inner membrane or intermembrane space. A number of proteins contain two signaling sequences: one to enter the mitochondrial matrix and the other to mediate subsequent relocation (eg, into the inner membrane). Certain mitochondrial proteins do not contain presequences (eg, cytochrome c, which locates in the intermembrane space), and others contain internal presequences. Overall, proteins employ a variety of mechanisms and routes to attain their final destinations in mitochondria.

General features that apply to the import of proteins into organelles, including mitochondria and some of the other organelles to be discussed below, are summarized in Table 46–1.

IMPORTINS & EXPORTINS ARE INVOLVED IN TRANSPORT OF MACROMOLECULES IN & OUT OF THE NUCLEUS

It has been estimated that more than a million macromolecules per minute are transported between the nucleus and the cytoplasm in an active eukaryotic cell. These macromolecules include histones, ribosomal proteins and ribosomal subunits, transcription factors, and mRNA molecules. The transport is bidirectional and occurs through the nuclear pore complexes (NPCs). These are complex structures with a mass approximately 30 times that of a ribosome and are composed of about 100 different proteins. The diameter of an NPC is approximately 9 nm but can increase up to approximately 28 nm. Molecules smaller than about 40 kDa can pass through the channel of the NPC by diffusion, but special translocation mechanisms exist for larger molecules. These mechanisms are under intensive investigation, but some important features have already emerged.

Here we shall mainly describe nuclear import of certain macromolecules. The general picture that has emerged is that proteins to be imported (cargo molecules) carry a nuclear localization signal (NLS). One example of an NLS is the amino acid sequence (Pro)2-(Lys)4-Ala-Lys-Val, which is markedly rich in basic lysine residues. Depending on which NLS it contains, a cargo molecule interacts with one of a family of soluble proteins called importins, and the complex docks at the NPC. Another family of proteins called Ran plays a critical regulatory role in the interaction of the complex with the NPC and in its translocation through the NPC. Ran proteins are small monomeric nuclear GTPases and, like other GTPases, exist in either GTP-bound or GDP-bound states. They are themselves regulated by guanine nucleotide exchange factors (GEFs; eg, the protein RCC1 in eukaryotes), which are located in the nucleus, and Ran GTPase-activating proteins (GAPs), which are predominantly cytoplasmic. The GTP-bound state of Ran is favored in the nucleus and the GDP-bound state in the cytoplasm. The conformations and activities of Ran molecules vary depending on whether GTP or GDP is bound to them (the GTP-bound state is active; see discussion of G proteins in Chapter 43). The asymmetry between nucleus and cytoplasm, with respect to which of these two nucleotides is bound to Ran molecules, is thought to be crucial in understanding the roles of Ran in transferring complexes unidirectionally across the NPC. When cargo molecules are released inside the nucleus, the importins recirculate to the cytoplasm to be used again. Figure 46–3 summarizes some of the principal features in the above process.
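The size threshold for diffusion and the example NLS given above can be combined into a toy decision rule. This is a sketch under simplifying assumptions: the function name `nuclear_entry_route` is hypothetical, only the single example NLS from the text is recognized, and the roughly 40 kDa limit is treated as a sharp cutoff.

```python
# One-letter rendering of (Pro)2-(Lys)4-Ala-Lys-Val from the text.
EXAMPLE_NLS = "PPKKKKAKV"
DIFFUSION_LIMIT_KDA = 40.0  # approximate limit stated in the text


def nuclear_entry_route(mass_kda: float, sequence: str) -> str:
    """Classify how a protein might reach the nucleus in this toy model."""
    if mass_kda < DIFFUSION_LIMIT_KDA:
        return "diffusion"
    if EXAMPLE_NLS in sequence.upper():
        return "importin-mediated import"
    return "no NLS found"


print(nuclear_entry_route(20.0, "MAAA"))
print(nuclear_entry_route(90.0, "MAPPKKKKAKVGG"))
```

A fuller model would also track the Ran-GTP/Ran-GDP asymmetry that gives transport its direction; that is deliberately left out here.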

Other small monomeric GTPases (eg, ARF, Rab, Ras, and Rho) are important in various cellular processes such as vesicle formation and transport (ARF and Rab; see below), certain growth and differentiation processes (Ras), and formation of the actin cytoskeleton (Rho). A process involving GTP and GDP is also crucial in the transport of proteins across the membrane of the ER (see below).

Table 46–1. Some general features of protein import to organelles.1

• Import of a protein into an organelle usually occurs in three stages: recognition, translocation, and maturation.

• Targeting sequences on the protein are recognized in the cytoplasm or on the surface of the organelle.

• The protein is unfolded for translocation, a state maintained in the cytoplasm by chaperones.

• Threading of the protein through a membrane requires energy and organellar chaperones on the trans side of the membrane.

• Cycles of binding and release of the protein to the chaperone result in pulling of its polypeptide chain through the membrane.

• Other proteins within the organelle catalyze folding of the protein, often attaching cofactors or oligosaccharides and assembling them into active monomers or oligomers.

1Data from McNew JA, Goodman JM: The targeting and assembly of peroxisomal proteins: some old rules do not apply. Trends Biochem Sci 1998;21:54.

[Figure 46–3 diagram: numbered steps 1–8 showing targeting, docking, translocation, termination, and recycling of factors between cytoplasm and nucleus, with α and β factors, Ran-GDP and Ran-GTP, RanGEF, RanGAP, RanBP1, and Pi. Inset: Ran exchange switch, OFF (Ran-GDP) in the cytoplasm and ON (Ran-GTP) in the nucleus.]

Figure 46–3. Schematic representation of the proposed role of Ran in the import of cargo carrying an NLS signal. (1) The targeting complex forms when the NLS receptor (α, an importin) binds NLS cargo and the docking factor (β). (2) Docking occurs at filamentous sites that protrude from the NPC. Ran-GDP docks independently. (3) Transfer to the translocation channel is triggered when a RanGEF converts Ran-GDP to Ran-GTP. (4) The NPC catalyzes translocation of the targeting complex. (5) Ran-GTP is recycled to Ran-GDP by docked RanGAP. (6) Ran-GTP disrupts the targeting complex by binding to a site on β that overlaps with a binding site. (7) NLS cargo dissociates from α, and Ran-GTP may dissociate from β. (8) α and β factors are recycled to the cytoplasm. Inset: The Ran translocation switch is off in the cytoplasm and on in the nucleus. Ran-GTP promotes NLS- and NES-directed translocation. However, cytoplasmic Ran is enriched in Ran-GDP (OFF) by an active RanGAP, and nuclear pools are enriched in Ran-GTP (ON) by an active GEF. RanBP1 promotes the contrary activities of these two factors. Direct linkage of nuclear and cytoplasmic pools of Ran occurs through the NPC by an unknown shuttling mechanism. Pi, inorganic phosphate; NLS, nuclear localization signal; NPC, nuclear pore complex; GEF, guanine nucleotide exchange factor; GAP, GTPase-activating protein; NES, nuclear export signal; BP, binding protein. (Reprinted, with permission, from Goldfarb DS: Whose finger is on the switch? Science 1997;276:1814.)

Proteins similar to importins, referred to as exportins, are involved in export of many macromolecules from the nucleus. Cargo molecules for export carry nuclear export signals (NESs). Ran proteins are involved in this process also, and it is now established that the processes of import and export share a number of common features.

MOST CASES OF ZELLWEGER SYNDROME ARE DUE TO MUTATIONS IN GENES INVOLVED IN THE BIOGENESIS OF PEROXISOMES

The peroxisome is an important organelle involved in aspects of the metabolism of many molecules, including fatty acids and other lipids (eg, plasmalogens, cholesterol, bile acids), purines, amino acids, and hydrogen peroxide. The peroxisome is bounded by a single membrane and contains more than 50 enzymes; catalase and urate oxidase are marker enzymes for this organelle. Its proteins are synthesized on cytosolic polyribosomes and fold prior to import. The pathways of import of a number of its proteins and enzymes have been studied, some being matrix components and others membrane components. At least two peroxisomal-matrix targeting sequences (PTSs) have been discovered. One, PTS1, is a tripeptide (ie, Ser-Lys-Leu [SKL], but variations of this sequence have been detected) located at the carboxyl terminal of a number of matrix proteins, including catalase. Another, PTS2, consisting of about 26–36 amino acids, has been found in at least four matrix proteins (eg, thiolase) and, unlike PTS1, is cleaved after entry into the matrix. Proteins containing PTS1 sequences form complexes with a soluble receptor protein (PTS1R) and proteins containing PTS2 sequences complex with another, PTS2R. The resulting complexes then interact with a membrane receptor, Pex14p. Proteins involved in further transport of proteins into the matrix are also present. Most peroxisomal membrane proteins have been found to contain neither of the above two targeting sequences, but apparently contain others. The import system can handle intact oligomers (eg, tetrameric catalase). Import of matrix proteins requires ATP, whereas import of membrane proteins does not.
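Because PTS1 is defined by its position (the carboxyl terminal) as well as its sequence, a check for it is naturally a suffix test. The following is a minimal sketch; the function name `has_pts1` is hypothetical, and only the canonical SKL tripeptide is tested, not the variants the text mentions.

```python
def has_pts1(sequence: str) -> bool:
    """True if the protein ends with the canonical PTS1 tripeptide Ser-Lys-Leu."""
    return sequence.upper().endswith("SKL")


print(has_pts1("MADGTSKL"))  # SKL at the carboxyl terminal
print(has_pts1("SKLMADGT"))  # SKL present, but not at the carboxyl terminal
```

The second call illustrates why position matters: the same three residues elsewhere in the chain do not count as PTS1.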

Interest in import of proteins into peroxisomes has been stimulated by studies on Zellweger syndrome. This condition is apparent at birth and is characterized by profound neurologic impairment, victims often dying within a year. The number of peroxisomes can vary from being almost normal to being virtually absent in some patients. Biochemical findings include an accumulation of very long chain fatty acids, abnormalities of the synthesis of bile acids, and a marked reduction of plasmalogens. The condition is believed to be due to mutations in genes encoding certain proteins (so-called peroxins) involved in various steps of peroxisome biogenesis (such as the import of proteins described above), or in genes encoding certain peroxisomal enzymes themselves. Two closely related conditions are neonatal adrenoleukodystrophy and infantile Refsum disease. Zellweger syndrome and these two conditions represent a spectrum of overlapping features, with Zellweger syndrome being the most severe (many proteins affected) and infantile Refsum disease the least severe (only one or a few proteins affected). Table 46–2 lists some features of these and related conditions.

THE SIGNAL HYPOTHESIS EXPLAINS HOW POLYRIBOSOMES BIND TO THE ENDOPLASMIC RETICULUM

As indicated above, the rough ER branch is the second of the two branches involved in the synthesis and sorting of proteins. In this branch, proteins are synthesized on membrane-bound polyribosomes and translocated into the lumen of the rough ER prior to further sorting (Figure 46–2).

The signal hypothesis was proposed by Blobel and Sabatini partly to explain the distinction between free and membrane-bound polyribosomes. They found that proteins synthesized on membrane-bound polyribosomes contained a peptide extension (signal peptide) at their amino terminals which mediated their attachment to the membranes of the ER. As noted above, proteins whose entire synthesis occurs on free polyribosomes lack this signal peptide. An important aspect of the signal hypothesis was that it suggested (as turns out to be the case) that all ribosomes have the same structure and that the distinction between membrane-bound and free ribosomes depends solely on the former’s carrying proteins that have signal peptides. Much evidence has confirmed the original hypothesis. Because many membrane proteins are synthesized on membrane-bound polyribosomes, the signal hypothesis plays an important role in concepts of membrane assembly. Some characteristics of signal peptides are summarized in Table 46–3.

Table 46–2. Disorders due to peroxisomal abnormalities.1

Disorder                                  MIM Number2
Zellweger syndrome                        214100
Neonatal adrenoleukodystrophy             202370
Infantile Refsum disease                  266510
Hyperpipecolic acidemia                   239400
Rhizomelic chondrodysplasia punctata      215100
Adrenoleukodystrophy                      300100
Pseudo-neonatal adrenoleukodystrophy      264470
Pseudo-Zellweger syndrome                 261510
Hyperoxaluria type 1                      259900
Acatalasemia                              115500
Glutaryl-CoA oxidase deficiency           231690

1Reproduced, with permission, from Seashore MR, Wappner RS: Genetics in Primary Care & Clinical Medicine. Appleton & Lange, 1996.
2MIM = Mendelian Inheritance in Man. Each number specifies a reference in which information regarding each of the above conditions can be found.

Table 46–3. Some properties of signal peptides.

• Usually, but not always, located at the amino terminal
• Contain approximately 12–35 amino acids
• Methionine is usually the amino terminal amino acid
• Contain a central cluster of hydrophobic amino acids
• Contain at least one positively charged amino acid near their amino terminal
• Usually cleaved off at the carboxyl terminal end of an Ala residue by signal peptidase
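The properties of signal peptides listed in Table 46–3 can be turned into a rough checklist. This is a sketch only, with several assumptions: the function name `check_signal_peptide`, the set of residues counted as hydrophobic, the window of six consecutive hydrophobic residues, the "first five residues" reading of "near the amino terminal", and the example peptide are all illustrative choices, not values from the text.

```python
HYDROPHOBIC = set("AVLIMFWC")  # assumed hydrophobic set, one-letter codes
POSITIVE = set("KR")           # Lys and Arg


def check_signal_peptide(peptide: str, core_len: int = 6, n_region: int = 5) -> dict:
    """Report which Table 46-3 style properties a candidate peptide satisfies."""
    seq = peptide.upper()
    # Look for any run of core_len consecutive hydrophobic residues.
    has_core = any(
        all(res in HYDROPHOBIC for res in seq[i:i + core_len])
        for i in range(len(seq) - core_len + 1)
    )
    return {
        "length_12_to_35": 12 <= len(seq) <= 35,
        "starts_with_met": seq.startswith("M"),
        "hydrophobic_cluster": has_core,
        "positive_near_n_terminus": any(res in POSITIVE for res in seq[:n_region]),
        "ends_with_ala": seq.endswith("A"),
    }


# Invented 16-residue peptide that satisfies every check.
print(check_signal_peptide("MKTLLLLAVLLGSAQA"))
```

Since the table qualifies most properties with "usually", a failed check should be read as a flag for closer inspection rather than as evidence that a peptide is not a signal peptide.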

Figure 46–4 illustrates the principal features in relation to the passage of a secreted protein through the membrane of the ER. It incorporates features from the original signal hypothesis and from subsequent work. The mRNA for such a protein encodes an amino terminal signal peptide (also variously called a leader sequence, a transient insertion signal, a signal sequence, or a presequence). The signal hypothesis proposed that the protein is inserted into the ER membrane at the same time as its mRNA is being translated on polyribosomes, so-called cotranslational insertion. As the signal peptide emerges from the large subunit of the ribosome, it is recognized by a signal recognition particle (SRP) that blocks further translation after about 70 amino acids have been polymerized (40 buried in the large ribosomal subunit and 30 exposed). The block is referred to as elongation arrest. The SRP contains six proteins and has a 7S RNA associated with it that is closely related to the Alu family of highly repeated DNA sequences (Chapter 36). The SRP-imposed block is not released until the SRP-signal peptide-polyribosome complex has bound to the so-called docking protein (SRP-R, a receptor for the SRP) on the ER membrane; the SRP thus guides the signal peptide to the SRP-R and prevents premature folding and expulsion of the protein being synthesized into the cytosol.

The SRP-R is an integral membrane protein composed of α and β subunits. The α subunit binds GDP and the β subunit spans the membrane. When the SRP-signal peptide complex interacts with the receptor, the exchange of GDP for GTP is stimulated. This form of the receptor (with GTP bound) has a high affinity for the SRP and thus releases the signal peptide, which binds to the translocation machinery (translocon) also present in the ER membrane. The α subunit then hydrolyzes its bound GTP, restoring GDP and completing a GTP-GDP cycle. The unidirectionality of this cycle helps drive the interaction of the polyribosome and its signal peptide with the ER membrane in the forward direction.

The translocon consists of a number of membrane proteins that form a protein-conducting channel in the ER membrane through which the newly synthesized protein may pass. The channel appears to be open only when a signal peptide is present, preserving conductance across the ER membrane when it closes. The conductance of the channel has been measured experimentally. Specific functions of a number of components of the translocon have been identified or suggested. TRAM (translocating chain-associated membrane) protein may bind the signal sequence as it initially interacts with the translocon, and the Sec61p complex (consisting of three proteins) binds the large subunit of the ribosome.

The insertion of the signal peptide into the conducting channel, while the other end of the parent protein is still attached to ribosomes, is termed “cotranslational insertion.” The process of elongation of the remaining portion of the protein probably facilitates passage of the nascent protein across the lipid bilayer as the ribosomes remain attached to the membrane of the ER. Thus, the rough (or ribosome-studded) ER is formed. It is important that the protein be kept in an unfolded state prior to entering the conducting channel; otherwise, it may not be able to gain access to the channel.

Ribosomes remain attached to the ER during synthesis of signal peptide-containing proteins but are released and dissociated into their two types of subunits when the process is completed. The signal peptide is hydrolyzed by signal peptidase, located on the luminal side of the ER membrane (Figure 46–4), and then is apparently rapidly degraded by proteases.

Cytochrome P450 (Chapter 53), an integral ER membrane protein, does not completely cross the membrane. Instead, it resides in the membrane with its signal peptide intact. Its passage through the membrane is prevented by a sequence of amino acids called a halt- or stop-transfer signal.

Secretory proteins and proteins destined for membranes distal to the ER completely traverse the membrane bilayer and are discharged into the lumen of the ER. N-Glycan chains, if present, are added (Chapter 47) as these proteins traverse the inner part of the ER membrane, a process called “cotranslational glycosylation.” Subsequently, the proteins are found in the lumen of the Golgi apparatus, where further changes in glycan chains occur (Figure 47–9) prior to intracellular distribution or secretion. There is strong evidence that the signal peptide is involved in the process of protein insertion into ER membranes. Mutant proteins, containing altered signal peptides in which a hydrophobic amino acid is replaced by a hydrophilic one, are not inserted into ER membranes. Nonmembrane proteins (eg, α-globin) to which signal peptides have been attached by genetic engineering can be inserted into the lumen of the ER or even secreted.

[Figure 46–4 diagram labels: mRNA (5′ to 3′); AUG; signal codons; signal peptide; SRP; signal receptor; ribosome receptor; signal peptidase.]

Figure 46–4. Diagram of the signal hypothesis for the transport of secreted proteins across the ER membrane. The ribosomes synthesizing a protein move along the messenger RNA specifying the amino acid sequence of the protein. (The messenger is represented by the line between 5′ and 3′.) The codon AUG marks the start of the message for the protein; the hatched lines that follow AUG represent the codons for the signal sequence. As the protein grows out from the larger ribosomal subunit, the signal sequence is exposed and bound by the signal recognition particle (SRP). Translation is blocked until the complex binds to the “docking protein,” also designated SRP-R (represented by the solid bar) on the ER membrane. There is also a receptor (open bar) for the ribosome itself. The interaction of the ribosome and growing peptide chain with the ER membrane results in the opening of a channel through which the protein is transported to the interior space of the ER. During translocation, the signal sequence of most proteins is removed by an enzyme called the “signal peptidase,” located at the luminal surface of the ER membrane. The completed protein is eventually released by the ribosome, which then separates into its two components, the large and small ribosomal subunits. The protein ends up inside the ER. See text for further details. (Slightly modified and reproduced, with permission, from Marx JL: Newly made proteins zip through the cell. Science 1980;207:164. Copyright © 1980 by the American Association for the Advancement of Science.)

There is considerable evidence that a second translocon in the ER membrane is involved in retrograde transport of various molecules from the ER lumen to the cytosol. These molecules include unfolded or misfolded glycoproteins, glycopeptides, and oligosaccharides. At least some of these molecules are degraded in proteasomes. Thus, there is two-way traffic across the ER membrane.

PROTEINS FOLLOW SEVERAL ROUTES TO BE INSERTED INTO OR ATTACHED TO THE MEMBRANES OF THE ENDOPLASMIC RETICULUM

The routes that proteins follow to be inserted into the membranes of the ER include the following.

A. COTRANSLATIONAL INSERTION

Figure 46–5 shows a variety of ways in which proteinsare distributed in the plasma membrane. In particular,the amino terminals of certain proteins (eg, the LDL re-ceptor) can be seen to be on the extracytoplasmic face,whereas for other proteins (eg, the asialoglycoprotein re-ceptor) the carboxyl terminals are on this face. To ex-plain these dispositions, one must consider the initialbiosynthetic events at the ER membrane. The LDL re-ceptor enters the ER membrane in a manner analogousto a secretory protein (Figure 46–4); it partly traverses


[Figure 46–5 labels: phospholipid bilayer with cytoplasmic and extracytoplasmic faces; proteins shown: LDL receptor, HLA-A heavy chain, and influenza hemagglutinin; influenza neuraminidase, asialoglycoprotein receptor, transferrin receptor, and HLA-DR invariant chain; various transporters (eg, glucose); G protein–coupled receptors; insulin and IGF-I receptors. N and C mark the amino and carboxyl terminals.]

Figure 46–5. Variations in the way in which proteins are inserted into membranes. This schematic representation, which illustrates a number of possible orientations, shows the segments of the proteins within the membrane as α-helices and the other segments as lines. The LDL receptor, which crosses the membrane once and has its amino terminal on the exterior, is called a type I transmembrane protein. The asialoglycoprotein receptor, which also crosses the membrane once but has its carboxyl terminal on the exterior, is called a type II transmembrane protein. The various transporters indicated (eg, glucose) cross the membrane a number of times and are called type III transmembrane proteins; they are also referred to as polytopic membrane proteins. (N, amino terminal; C, carboxyl terminal.) (Adapted, with permission, from Wickner WT, Lodish HF: Multiple mechanisms of protein insertion into and across membranes. Science 1985;230:400. Copyright © 1985 by the American Association for the Advancement of Science.)

the ER membrane, its signal peptide is cleaved, and its amino terminal protrudes into the lumen. However, it is retained in the membrane because it contains a highly hydrophobic segment, the halt- or stop-transfer signal. This sequence forms the single transmembrane segment of the protein and is its membrane-anchoring domain. The small patch of ER membrane in which the newly synthesized LDL receptor is located subsequently buds off as a component of a transport vesicle, probably from the transitional elements of the ER (Figure 46–2). As described below in the discussion of asymmetry of proteins and lipids in membrane assembly, the disposition of the receptor in the ER membrane is preserved in the vesicle, which eventually fuses with the plasma membrane. In contrast, the asialoglycoprotein receptor possesses an internal insertion sequence, which inserts into the membrane but is not cleaved. This acts as an anchor, and its carboxyl terminal is extruded through the membrane. The more complex disposition of the transporters (eg, for glucose) can be explained by the fact that alternating transmembrane α-helices act as uncleaved insertion sequences and as halt-transfer signals, respectively. Each pair of helical segments is inserted as a hairpin. Sequences that determine the structure of a protein in a membrane are called topogenic sequences. As explained in the legend to Figure 46–5, the above three proteins are examples of type I, type II, and type III transmembrane proteins.
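The classification given in the legend to Figure 46–5 can be restated as a small decision rule. The following Python sketch is illustrative only: the function name `classify_topology` and its two parameters are hypothetical, and it encodes nothing beyond the three cases named in the text (one crossing with the amino terminal outside, one crossing with the carboxyl terminal outside, and multiple crossings).

```python
def classify_topology(crossings: int, exterior_terminal: str = "N") -> str:
    """Return the transmembrane type as defined in the Figure 46-5 legend.

    crossings: number of membrane-spanning segments (hypothetical input).
    exterior_terminal: "N" or "C", the terminal on the extracytoplasmic
        face; only consulted for proteins that cross the membrane once.
    """
    if crossings < 1:
        raise ValueError("a transmembrane protein crosses the membrane at least once")
    if crossings > 1:
        # Polytopic proteins such as the glucose transporter.
        return "type III"
    if exterior_terminal not in ("N", "C"):
        raise ValueError("exterior_terminal must be 'N' or 'C'")
    # Single-pass proteins: LDL receptor (N outside) vs
    # asialoglycoprotein receptor (C outside).
    return "type I" if exterior_terminal == "N" else "type II"
```

For example, `classify_topology(1, "N")` corresponds to the LDL receptor case and `classify_topology(1, "C")` to the asialoglycoprotein receptor case.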

B. SYNTHESIS ON FREE POLYRIBOSOMES & SUBSEQUENT ATTACHMENT TO THE ENDOPLASMIC RETICULUM MEMBRANE

An example is cytochrome b5, which enters the ER membrane spontaneously.

C. RETENTION AT THE LUMINAL ASPECT OF THE ENDOPLASMIC RETICULUM BY SPECIFIC AMINO ACID SEQUENCES

A number of proteins possess the amino acid sequence KDEL (Lys-Asp-Glu-Leu) at their carboxyl terminal.


This sequence specifies that such proteins will be attached to the inner aspect of the ER in a relatively loose manner. The chaperone BiP (see below) is one such protein. Actually, KDEL-containing proteins first travel to the Golgi, interact there with a specific KDEL receptor protein, and then return in transport vesicles to the ER, where they dissociate from the receptor.

D. RETROGRADE TRANSPORT FROM THE GOLGI APPARATUS

Certain other non-KDEL-containing proteins destined for the membranes of the ER also pass to the Golgi and then return, by retrograde vesicular transport, to the ER to be inserted therein (see below).

The foregoing paragraphs demonstrate that a variety of routes are involved in assembly of the proteins of the ER membranes; a similar situation probably holds for other membranes (eg, the mitochondrial membranes and the plasma membrane). Precise targeting sequences have been identified in some instances (eg, KDEL sequences).

The topic of membrane biogenesis is discussed further later in this chapter.

PROTEINS MOVE THROUGH CELLULAR COMPARTMENTS TO SPECIFIC DESTINATIONS

A scheme representing the possible flow of proteins along the ER → Golgi apparatus → plasma membrane route is shown in Figure 46–6. The horizontal arrows denote transport steps that may be independent of targeting signals, whereas the vertical open arrows represent steps that depend on specific signals. Thus, flow of certain proteins (including membrane proteins) from the ER to the plasma membrane (designated “bulk flow,” as it is nonselective) probably occurs without any targeting sequences being involved, ie, by default. On the other hand, insertion of resident proteins into the ER and Golgi membranes is dependent upon specific signals (eg, KDEL or halt-transfer sequences for the ER). Similarly, transport of many enzymes to lysosomes is dependent upon the Man 6-P signal (Chapter 47), and a signal may be involved for entry of proteins into secretory granules. Table 46–4 summarizes information on sequences that are known to be involved in targeting various proteins to their correct intracellular sites.

CHAPERONES ARE PROTEINS THAT PREVENT FAULTY FOLDING & UNPRODUCTIVE INTERACTIONS OF OTHER PROTEINS

Exit from the ER may be the rate-limiting step in the secretory pathway. In this context, it has been found that certain proteins play a role in the assembly or proper folding of other proteins without themselves being components of the latter. Such proteins are called molecular chaperones; a number of important properties of these proteins are listed in Table 46–5, and the names of some of particular importance in the ER are listed in Table 46–6. Basically, they stabilize unfolded

[Figure 46–6 labels: ER; cis Golgi; medial Golgi; trans Golgi; cell surface; secretory storage vesicles; lysosomes.]

Figure 46–6. Flow of membrane proteins from the endoplasmic reticulum (ER) to the cell surface. Horizontal arrows denote steps that have been proposed to be signal independent and thus represent bulk flow. The open vertical arrows in the boxes denote retention of proteins that are resident in the membranes of the organelle indicated. The open vertical arrows outside the boxes indicate signal-mediated transport to lysosomes and secretory storage granules. (Reproduced, with permission, from Pfeffer SR, Rothman JE: Biosynthetic protein transport and sorting by the endoplasmic reticulum and Golgi. Annu Rev Biochem 1987;56:829.)


Table 46–4. Some sequences or compounds that direct proteins to specific organelles.

Targeting Sequence or Compound                        Organelle Targeted
Signal peptide sequence                               Membrane of ER
Carboxyl terminal KDEL sequence (Lys-Asp-Glu-Leu)     Luminal surface of ER
Amino terminal sequence (20–80 residues)              Mitochondrial matrix
NLS1 (eg, Pro2-Lys2-Ala-Lys-Val)                      Nucleus
PTS1 (eg, Ser-Lys-Leu)                                Peroxisome
Mannose 6-phosphate                                   Lysosome

1NLS, nuclear localization signal; PTS, peroxisomal-matrix targeting sequence.
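Table 46–4 is in effect a lookup from targeting signal to destination. A minimal Python sketch of that lookup follows; the dictionary keys are shortened labels chosen here for illustration, and the helper `organelle_for` is hypothetical rather than part of any established library.

```python
from typing import Optional

# Signal -> organelle pairs transcribed from Table 46-4.
# The key strings are abbreviated labels (an assumption of this sketch).
TARGETING_SIGNALS = {
    "signal peptide": "membrane of ER",
    "KDEL": "luminal surface of ER",
    "amino terminal sequence (20-80 residues)": "mitochondrial matrix",
    "NLS": "nucleus",
    "PTS": "peroxisome",
    "mannose 6-phosphate": "lysosome",
}


def organelle_for(signal: str) -> Optional[str]:
    """Return the organelle targeted by a signal, or None if not in the table."""
    return TARGETING_SIGNALS.get(signal)
```

A signal that is not listed returns `None`; the sketch deliberately does not guess a default destination.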

Table 46–5. Some properties of chaperone proteins.

• Present in a wide range of species from bacteria to humans
• Many are so-called heat shock proteins (Hsp)
• Some are inducible by conditions that cause unfolding of newly synthesized proteins (eg, elevated temperature and various chemicals)
• They bind to predominantly hydrophobic regions of unfolded and aggregated proteins
• They act in part as a quality control or editing mechanism for detecting misfolded or otherwise defective proteins
• Most chaperones show associated ATPase activity, with ATP or ADP being involved in the protein-chaperone interaction
• Found in various cellular compartments such as cytosol, mitochondria, and the lumen of the endoplasmic reticulum

Table 46–6. Some chaperones and enzymes involved in folding that are located in the rough endoplasmic reticulum.

• BiP (immunoglobulin heavy chain binding protein)
• GRP94 (glucose-regulated protein)
• Calnexin
• Calreticulin
• PDI (protein disulfide isomerase)
• PPI (peptidyl prolyl cis-trans isomerase)

or partially folded intermediates, allowing them time to fold properly, and prevent inappropriate interactions, thus combating the formation of nonfunctional structures. Most chaperones exhibit ATPase activity and bind ADP and ATP. This activity is important for their effect on folding. The ADP-chaperone complex often has a high affinity for the unfolded protein, which, when bound, stimulates release of ADP with replacement by ATP. The ATP-chaperone complex, in turn, releases segments of the protein that have folded properly, and the cycle involving ADP and ATP binding is repeated until the folded protein is released.
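The ADP/ATP cycle just described can be pictured as a loop that repeats until nothing remains unfolded. The Python sketch below is a toy model under stated assumptions, not a mechanistic simulation: the function `chaperone_cycle` is hypothetical, the protein is treated as a count of unfolded segments, and the number of segments released per cycle (`folded_per_cycle`) is an arbitrary illustrative parameter.

```python
def chaperone_cycle(unfolded_segments: int, folded_per_cycle: int = 1) -> list:
    """Return the ordered (nucleotide_state, event) pairs of the cycle.

    Assumption: each pass through the cycle releases `folded_per_cycle`
    properly folded segments; the text gives no such number.
    """
    if unfolded_segments < 0 or folded_per_cycle < 1:
        raise ValueError("segment counts must be non-negative and cycles productive")
    events = []
    remaining = unfolded_segments
    while remaining > 0:
        # The ADP-chaperone complex has high affinity for unfolded protein.
        events.append(("ADP", "chaperone binds unfolded protein"))
        # Binding stimulates release of ADP with replacement by ATP.
        events.append(("ATP", "ADP released and replaced by ATP"))
        # The ATP-chaperone complex releases segments that have folded.
        released = min(folded_per_cycle, remaining)
        remaining -= released
        events.append(("ATP", f"{released} folded segment(s) released"))
    events.append(("none", "folded protein released"))
    return events
```

Calling `chaperone_cycle(3)` walks through three rounds of ADP and ATP binding before the final release event.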

Several examples of chaperones were introduced above when the sorting of mitochondrial proteins was discussed. The immunoglobulin heavy chain binding protein (BiP) is located in the lumen of the ER. This protein will bind abnormally folded immunoglobulin heavy chains and certain other proteins and prevent them from leaving the ER, in which they are degraded. Another important chaperone is calnexin, a calcium-binding protein located in the ER membrane. This protein binds a wide variety of proteins, including major histocompatibility complex (MHC) antigens and a variety of serum proteins. As mentioned in Chapter 47, calnexin binds the monoglycosylated species of glycoproteins that occur during processing of glycoproteins, retaining them in the ER until the glycoprotein has folded properly. Calreticulin, which is also a calcium-binding protein, has properties similar to those of calnexin; it is not membrane-bound. Chaperones are not the only proteins in the ER lumen that are concerned with proper folding of proteins. Two enzymes are present that play an active role in folding. Protein disulfide isomerase (PDI) promotes rapid reshuffling of disulfide bonds until the correct set is achieved. Peptidyl prolyl isomerase (PPI) accelerates folding of proline-containing proteins by catalyzing the cis-trans isomerization of X-Pro bonds, where X is any amino acid residue.

TRANSPORT VESICLES ARE KEY PLAYERS IN INTRACELLULAR PROTEIN TRAFFIC

Most proteins that are synthesized on membrane-bound polyribosomes and are destined for the Golgi apparatus or plasma membrane reach these sites inside transport vesicles. The precise mechanisms by which proteins synthesized in the rough ER are inserted into these vesicles are not known. Those involved in transport from the ER to the Golgi apparatus and vice versa, and from the Golgi to the plasma membrane, are mainly clathrin-free, unlike the coated vesicles involved in endocytosis (see discussions of the LDL receptor in Chapters 25 and 26). For the sake of clarity, the non-clathrin-coated vesicles will be referred to in


Table 46–7. Factors involved in the formation of non-clathrin-coated vesicles and their transport.

• ARF: ADP-ribosylation factor, a GTPase
• Coatomer: A family of at least seven coat proteins (α, β, γ, δ, ε, β′, and ζ). Different transport vesicles have different complements of coat proteins.
• SNAP: Soluble NSF attachment factor
• SNARE: SNAP receptor
• v-SNARE: Vesicle SNARE
• t-SNARE: Target SNARE
• GTP-γ-S: A nonhydrolyzable analog of GTP, used to test the involvement of GTP
• NEM: N-Ethylmaleimide, a chemical that alkylates sulfhydryl groups
• NSF: NEM-sensitive factor, an ATPase
• Rab proteins: A family of ras-related proteins first observed in rat brain; they are GTPases and are active when GTP is bound
• Sec1: A member of a family of proteins that attach to t-SNAREs and are displaced from them by Rab proteins, thereby allowing v-SNARE–t-SNARE interactions to occur.

this text as transport vesicles. There is evidence that proteins destined for the membranes of the Golgi apparatus contain specific signal sequences. On the other hand, most proteins destined for the plasma membrane or for secretion do not appear to contain specific signals, reaching these destinations by default.

The Golgi Apparatus Is Involved in Glycosylation & Sorting of Proteins

The Golgi apparatus plays two important roles in membrane synthesis. First, it is involved in the processing of the oligosaccharide chains of membrane and other N-linked glycoproteins and also contains enzymes involved in O-glycosylation (see Chapter 47). Second, it is involved in the sorting of various proteins prior to their delivery to their appropriate intracellular destinations. All parts of the Golgi apparatus participate in the first role, whereas the trans-Golgi is particularly involved in the second and is very rich in vesicles. Because of the central role of transport vesicles in protein transport, considerable research has been conducted in recent years concerning their formation and fate.

A Model of Non-Clathrin-Coated Vesicles Involves SNAREs & Other Factors

Vesicles lie at the heart of intracellular transport of many proteins. Recently, significant progress has been made in understanding the events involved in vesicle formation and transport. This progress has come from the use of a number of approaches, including the establishment of cell-free systems with which to study vesicle formation. For instance, it is possible to observe, by electron microscopy, budding of vesicles from Golgi preparations incubated with cytosol and ATP. The development of genetic approaches for studying vesicles in yeast has also been crucial. The picture is complex, with its own nomenclature (Table 46–7), and involves a variety of cytosolic and membrane proteins, GTP, ATP, and accessory factors.

Based largely on a proposal by Rothman and colleagues, anterograde vesicular transport can be considered to occur in eight steps (Figure 46–7). The basic concept is that each transport vesicle bears a unique address marker consisting of one or more v-SNARE proteins, while each target membrane bears one or more complementary t-SNARE proteins with which the former interact specifically.

Step 1: Coat assembly is initiated when ARF is activated by binding GTP, which is exchanged for GDP. This leads to the association of GTP-bound ARF with its putative receptor (hatched in Figure 46–7) in the donor membrane.

Step 2: Membrane-associated ARF recruits the coat proteins that comprise the coatomer shell from the cytosol, forming a coated bud.

Step 3: The bud pinches off in a process involving acyl-CoA (and probably ATP) to complete the formation of the coated vesicle.

Step 4: Coat disassembly (involving dissociation of ARF and coatomer shell) follows hydrolysis of bound GTP; uncoating is necessary for fusion to occur.

Step 5: Vesicle targeting is achieved via members of a family of integral proteins, termed v-SNAREs, that tag the vesicle during its budding. v-SNAREs pair with cognate t-SNAREs in the target membrane to dock the vesicle.

It is presumed that steps 4 and 5 are closely coupled and that step 4 may follow step 5, with ARF and the coatomer shell rapidly dissociating after docking.

Step 6: The general fusion machinery then assembles on the paired SNARE complex; it includes an ATPase (NSF; NEM-sensitive factor) and the SNAP (soluble NSF attachment factor) proteins. SNAPs bind to the SNARE (SNAP receptor) complex, enabling NSF to bind.

Step 7: Hydrolysis of ATP by NSF is essential for fusion, a process that can be inhibited by NEM (N-ethylmaleimide). Certain other proteins and calcium are also required.


[Figure 46–7 labels: donor membrane (eg, ER); target membrane (eg, CGN); ARF with bound GDP or GTP; coatomer; budding; coated bud; coated vesicle; v-SNARE; t-SNARE; SNAPs; NSF; 20S fusion particle; fusion; ATP; acyl-CoA; Pi; Ca2+; inhibitors BFA, GTP-γ-S, NEM, and nocodazole; steps numbered 1 to 8.]

Figure 46–7. Model of the steps in a round of anterograde vesicular transport. The cycle starts in the bottom left-hand side of the figure, where two molecules of ARF are represented as small ovals containing GDP. The steps in the cycle are described in the text. Most of the abbreviations used are explained in Table 46–7. The roles of Rab and Sec1 proteins (see text) in the overall process are not dealt with in this figure. (CGN, cis-Golgi network; BFA, Brefeldin A.) (Adapted from Rothman JE: Mechanisms of intracellular protein transport. Nature 1994;372:55.) (Courtesy of E Degen.)

Step 8: Retrograde transport occurs to restart the cycle. This last step may retrieve certain proteins or recycle v-SNAREs. Nocodazole, a microtubule-disrupting agent, inhibits this step.

Brefeldin A Inhibits the Coating Process

The following points expand and clarify the above.

(a) To participate in step 1, ARF must first be modified by addition of myristic acid (C14:0), employing myristoyl-CoA as the acyl donor. Myristoylation is one of a number of enzyme-catalyzed posttranslational modifications, involving addition of certain lipids to specific residues of proteins, that facilitate the binding of proteins to the cytosolic surfaces of membranes or vesicles. Others are addition of palmitate, farnesyl, and geranylgeranyl; the two latter molecules are polyisoprenoids containing 15 and 20 carbon atoms, respectively.

(b) At least three different types of coated vesicles have been distinguished: COPI, COPII, and clathrin-coated vesicles; the first two are referred to here as transport vesicles. Many other types of vesicles no doubt remain to be discovered. COPI vesicles are involved in bidirectional transport from the ER to the Golgi and in the reverse direction, whereas COPII vesicles are involved mainly in transport in the former direction. Clathrin-containing vesicles are involved in transport from the trans-Golgi network to prelysosomes and from the plasma membrane to endosomes. Selection of cargo molecules by vesicles appears to be primarily a function of the coat proteins of the vesicles. Cargo molecules may interact with coat proteins either directly or via intermediary proteins that attach to coat proteins, and they then become enclosed in their appropriate vesicles.

(c) The fungal metabolite brefeldin A prevents GTP from binding to ARF in step 1 and thus inhibits the entire coating process. In its presence, the Golgi apparatus appears to disintegrate, and fragments are lost. It may do this by inhibiting the guanine nucleotide exchanger involved in step 1.

(d) GTP-γ-S (a nonhydrolyzable analog of GTP often used in investigations of the role of GTP in biochemical processes) blocks disassembly of the coat from coated vesicles, leading to a build-up of coated vesicles.


(e) A family of Ras-like proteins, called the Rab protein family, are required in several steps of intracellular protein transport, regulated secretion, and endocytosis. They are small monomeric GTPases that attach to the cytosolic faces of membranes via geranylgeranyl chains. They attach in the GTP-bound state (not shown in Figure 46–7) to the budding vesicle. Another family of proteins (Sec1) binds to t-SNAREs and prevents interaction between them and their complementary v-SNAREs. When a vesicle interacts with its target membrane, Rab proteins displace Sec1 proteins and the v-SNARE–t-SNARE interaction is free to occur. It appears that the Rab and Sec1 families of proteins regulate the speed of vesicle formation, opposing each other. Rab proteins have been likened to throttles and Sec1 proteins to dampers on the overall process of vesicle formation.

(f) Studies using v- and t-SNARE proteins reconstituted into separate lipid bilayer vesicles have indicated that they form SNAREpins, ie, SNARE complexes that link two membranes (vesicles). SNAPs and NSF are required for formation of SNAREpins, but once they have formed they can apparently lead to spontaneous fusion of membranes at physiologic temperature, suggesting that they are the minimal machinery required for membrane fusion.

(g) The fusion of synaptic vesicles with the plasma membrane of neurons involves a series of events similar to that described above. For example, one v-SNARE is designated synaptobrevin and two t-SNAREs are designated syntaxin and SNAP 25 (synaptosome-associated protein of 25 kDa). Botulinum B toxin is one of the most lethal toxins known and the most serious cause of food poisoning. One component of this toxin is a protease that appears to cleave only synaptobrevin, thus inhibiting release of acetylcholine at the neuromuscular junction and possibly proving fatal, depending on the dose taken.

(h) Although the above model describes non-clathrin-coated vesicles, it appears likely that many of the events described above apply, at least in principle, to clathrin-coated vesicles.
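The eight steps and the points at which the experimental agents act can be collected into a small table. The Python sketch below is a simplification under stated assumptions: it treats the steps as strictly sequential (the text notes that step 4 may in fact follow step 5), it assumes that a block at one step prevents all later steps, and the names `STEPS`, `BLOCKED_STEP`, and `steps_completed` are hypothetical.

```python
# One-line summaries of the eight steps described in the text.
STEPS = {
    1: "ARF binds GTP and associates with the donor membrane",
    2: "ARF recruits coatomer, forming a coated bud",
    3: "bud pinches off (acyl-CoA, probably ATP)",
    4: "coat disassembly after hydrolysis of bound GTP",
    5: "v-SNARE pairs with t-SNARE to dock the vesicle",
    6: "SNAPs and NSF assemble on the SNARE complex",
    7: "ATP hydrolysis by NSF; fusion",
    8: "retrograde transport restarts the cycle",
}

# Step at which each agent named in the text acts.
BLOCKED_STEP = {
    "brefeldin A": 1,   # prevents GTP from binding to ARF
    "GTP-gamma-S": 4,   # blocks coat disassembly
    "NEM": 7,           # inhibits NSF-dependent fusion
    "nocodazole": 8,    # inhibits retrograde transport
}


def steps_completed(inhibitor=None):
    """List the step numbers that finish before the inhibited step."""
    if inhibitor is None:
        return list(STEPS)
    blocked = BLOCKED_STEP[inhibitor]  # KeyError for agents not in the text
    return [n for n in STEPS if n < blocked]
```

Under these assumptions, `steps_completed("GTP-gamma-S")` stops after step 3, which matches the build-up of coated vesicles described in point (d).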

THE ASSEMBLY OF MEMBRANES IS COMPLEX

There are many cellular membranes, each with its own specific features. No satisfactory scheme describing the assembly of any one of these membranes is available. How various proteins are initially inserted into the membrane of the ER has been discussed above. The transport of proteins, including membrane proteins, to various parts of the cell inside vesicles has also been described. Some general points about membrane assembly remain to be addressed.

Asymmetry of Both Proteins & Lipids Is Maintained During Membrane Assembly

Vesicles formed from membranes of the ER and Golgi apparatus, either naturally or pinched off by homogenization, exhibit transverse asymmetries of both lipid and protein. These asymmetries are maintained during fusion of transport vesicles with the plasma membrane. The inside of the vesicles after fusion becomes the outside of the plasma membrane, and the cytoplasmic side of the vesicles remains the cytoplasmic side of the membrane (Figure 46–8). Since the transverse asymmetry of the membranes already exists in the vesicles of the ER well before they are fused to the plasma membrane, a major problem of membrane assembly becomes understanding how the integral proteins are inserted into the lipid bilayer of the ER. This problem was addressed earlier in this chapter.
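The rule that orientation is preserved on fusion amounts to a fixed mapping between sides: the vesicle lumen becomes the cell exterior, and the cytoplasmic side stays cytoplasmic. A minimal Python sketch of that mapping follows; the function `after_fusion` and the dictionary representation of a protein's orientation are assumptions made for illustration.

```python
# Vesicle side -> plasma membrane side after fusion, as stated in the text.
SIDE_AFTER_FUSION = {"lumen": "exterior", "cytoplasm": "cytoplasm"}


def after_fusion(orientation: dict) -> dict:
    """Map each terminal's vesicle side to its plasma membrane side.

    orientation: eg {"N": "lumen", "C": "cytoplasm"} (hypothetical format).
    """
    return {terminal: SIDE_AFTER_FUSION[side] for terminal, side in orientation.items()}
```

For a protein whose amino terminal faces the vesicle lumen, `after_fusion({"N": "lumen", "C": "cytoplasm"})` places the amino terminal on the exterior while the carboxyl terminal remains in the cytoplasm.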

Phospholipids are the major class of lipid in membranes. The enzymes responsible for the synthesis of phospholipids reside in the cytoplasmic surface of the cisternae of the ER. As phospholipids are synthesized at that site, they probably self-assemble into thermodynamically stable bimolecular layers, thereby expanding the membrane and perhaps promoting the detachment of so-called lipid vesicles from it. It has been proposed that these vesicles travel to other sites, donating their lipids to other membranes; however, little is known about this matter. As indicated above, cytosolic proteins that take up phospholipids from one membrane and release them to another (ie, phospholipid exchange proteins) have been demonstrated; they probably play a role in contributing to the specific lipid composition of various membranes.

Lipids & Proteins Undergo Turnover at Different Rates in Different Membranes

It has been shown that the half-lives of the lipids of the ER membranes of rat liver are generally shorter than those of its proteins, so that the turnover rates of lipids and proteins are independent. Indeed, different lipids have been found to have different half-lives. Furthermore, the half-lives of the proteins of these membranes vary quite widely, some exhibiting short (hours) and others long (days) half-lives. Thus, individual lipids and proteins of the ER membranes appear to be inserted into it relatively independently; this is the case for many other membranes.
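To see what a difference between half-lives of hours and days means in practice, one can compute the fraction of a component that remains after a given time. The Python sketch below assumes simple first-order (exponential) turnover, which the text does not state, and the half-life values used in the example are invented for illustration; `fraction_remaining` is a hypothetical helper.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of a component left after elapsed_hours.

    Assumes first-order turnover: the amount halves every half_life_hours.
    """
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


# Illustrative (invented) half-lives: 6 hours versus 3 days, compared at 24 hours.
short_lived = fraction_remaining(24, 6)    # four half-lives have elapsed
long_lived = fraction_remaining(24, 72)    # one third of a half-life has elapsed
```

With these invented values, the short-lived component is almost entirely replaced within a day while most of the long-lived component is still present, which is the sense in which components turn over independently.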

The biogenesis of membranes is thus a complex process about which much remains to be learned. One indication of the complexity involved is to consider the number of posttranslational modifications that membrane proteins may be subjected to prior to attaining their mature state. These include proteolysis, assembly


[Figure 46–8 labels: exterior surface; plasma membrane; cytoplasm; vesicle membrane; lumen; membrane protein; integral protein. N and C mark the amino and carboxyl terminals.]

Figure 46–8. Fusion of a vesicle with the plasma membrane preserves the orientation of any integral proteins embedded in the vesicle bilayer. Initially, the amino terminal of the protein faces the lumen, or inner cavity, of such a vesicle. After fusion, the amino terminal is on the exterior surface of the plasma membrane. That the orientation of the protein has not been reversed can be perceived by noting that the other end of the molecule, the carboxyl terminal, is always immersed in the cytoplasm. The lumen of a vesicle and the outside of the cell are topologically equivalent. (Redrawn and modified, with permission, from Lodish HF, Rothman JE: The assembly of cell membranes. Sci Am [Jan] 1979;240:43.)

Table 46–8. Major features of membrane assembly.

• Lipids and proteins are inserted independently into membranes.
• Individual membrane lipids and proteins turn over independently and at different rates.
• Topogenic sequences (eg, signal [amino terminal or internal] and stop-transfer) are important in determining the insertion and disposition of proteins in membranes.
• Membrane proteins inside transport vesicles bud off the endoplasmic reticulum on their way to the Golgi; final sorting of many membrane proteins occurs in the trans-Golgi network.
• Specific sorting sequences guide proteins to particular organelles such as lysosomes, peroxisomes, and mitochondria.

into multimers, glycosylation, addition of a glycophosphatidylinositol (GPI) anchor, sulfation on tyrosine or carbohydrate moieties, phosphorylation, acylation, and prenylation; this list is undoubtedly not complete. Nevertheless, significant progress has been made; Table 46–8 summarizes some of the major features of membrane assembly that have emerged to date.

Table 46–9. Some disorders due to mutations in genes encoding proteins involved in intracellular membrane transport.1

Disorder2                                             Protein Involved
Chédiak-Higashi syndrome, 214500                      Lysosomal trafficking regulator
Combined deficiency of factors V and VIII, 227300     ERGIC-53, a mannose-binding lectin
Hermansky-Pudlak syndrome, 203300                     AP-3 adaptor complex β3A subunit
I-cell disease, 252500                                N-Acetylglucosamine 1-phosphotransferase
Oculocerebrorenal syndrome, 309000                    OCRL-1, an inositol polyphosphate 5-phosphatase

1Modified from Olkkonen VM, Ikonen E: Genetic defects of intracellular-membrane transport. N Engl J Med 2000;343:1095. Certain related conditions not listed here are also described in this publication. I-cell disease is described in Chapter 47. The majority of the disorders listed above affect lysosomal function; readers should consult a textbook of medicine for information on the clinical manifestations of these conditions.
2The numbers after each disorder are the OMIM numbers.


Various Disorders Result From Mutations in Genes Encoding Proteins Involved in Intracellular Transport

Some of these are listed in Table 46–9; the majority affect lysosomal function. A number of other mutations affecting intracellular protein transport have been reported but are not included here.

SUMMARY

• Many proteins are targeted to their destinations by signal sequences. A major sorting decision is made when proteins are partitioned between cytosolic and membrane-bound polyribosomes by virtue of the absence or presence of a signal peptide.

• The pathways of protein import into mitochondria, nuclei, peroxisomes, and the endoplasmic reticulum are described.

• Many proteins synthesized on membrane-bound polyribosomes proceed to the Golgi apparatus and the plasma membrane in transport vesicles.

• A number of glycosylation reactions occur in compartments of the Golgi, and proteins are further sorted in the trans-Golgi network.

• Most proteins destined for the plasma membrane and for secretion appear to lack specific signals (a default mechanism).

• The role of chaperone proteins in the folding of proteins is presented, and a model describing budding and attachment of transport vesicles to a target membrane is summarized.

• Membrane assembly is discussed and shown to be complex. Asymmetry of both lipids and proteins is maintained during membrane assembly.

• A number of disorders have been shown to be due to mutations in genes encoding proteins involved in various aspects of protein traffic and sorting.

REFERENCES

Fuller GM, Shields DL: Molecular Basis of Medical Cell Biology. McGraw-Hill, 1998.

Gould SJ et al: The peroxisome biogenesis disorders. In: The Metabolic and Molecular Bases of Inherited Disease, 8th ed. Scriver CR et al (editors). McGraw-Hill, 2001.

Graham JM, Higgins JA: Membrane Analysis. BIOS Scientific, 1997.

Griffith J, Sansom C: The Transporter Facts Book. Academic Press, 1998.

Lodish H et al: Molecular Cell Biology, 4th ed. Freeman, 2000. (Chapter 17 contains comprehensive coverage of protein sorting and organelle biogenesis.)

Olkkonen VM, Ikonen E: Genetic defects of intracellular-membrane transport. N Engl J Med 2000;343:1095.

Reithmeier RAF: Assembly of proteins into membranes. In: Biochemistry of Lipids, Lipoproteins and Membranes. Vance DE, Vance JE (editors). Elsevier, 1996.

Sabatini DD, Adesnik MB: The biogenesis of membranes and organelles. In: The Metabolic and Molecular Bases of Inherited Disease, 8th ed. Scriver CR et al (editors). McGraw-Hill, 2001.
