polypeptides and proteinsemployees.oneonta.edu/knauerbr/322lects/polypep.pdf · segments of...
TRANSCRIPT
1
N-terminus C-terminusamino acidresidue
planar sections
The peptide linkage is planar ---
C C
O
NH
CC C
O
NH
C
N C
H
R
C
O
O
H
N C
H
R
C
O
H
H3N C
H
R
C
O
H
O
C
R
H
CN N C
H
R
C
O
H H
O
C
R
H
CN
Polypeptides and Proteins
These molecules are composed, at least in part, of chainsof amino acids. Each amino acid is joined to the next onethrough an amide or peptide bond from the carbonylcarbon of one amino acid “residue” to the α-amino groupof the next. At one end of the chain there will be a free orprotonated amino group: the N-terminus. At the other endthere will be a free carboxyl or carboxylate group: the C-terminus. By convention the N-terminus is drawn at theleft end, as shown below in a generic hexapeptide (6amino acid residues).
Note that the backbone chain of amino acid residuesconsists of a series of flat “plates” that “join” at theα-carbons of the amino acid residues. It is the tetrahedralα-carbons that allow the chain to bend and have some
2
N C
H
C
O
O
HH
O
C
CH2
H
CNN C
H
CH2
C
O
H
O
C
H
CN
OH
NH
CHN NH2
N C
H
C
O
H3N C
H
C
O O
C
H
CN N C
H
H
C
O
H H
O
C
CH2
H
CN
NH
CHN NH2
flexibility. The plates are flat because the amide (orpeptide) C-N bond is planar owing to its double bondcharacter as shown by resonance in the figure above.
Because peptide molecules are large a shorthand methodis used to indicate their structures, eg, bradykinin —
As you can see, it is simpler to use the three letterabbreviations for the amino acid residues than it is to drawthe usual structure:Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg.
3
Note that the N-terminus is at the left end and theC-terminus is at the right end; it is critical that we hew tothis convention when writing sturctures using the threeletter abbreviations — Arg-Phe-Pro-Ser-Phe-Gly-Pro-Pro-Arg would be a differentmolecule.
Determination of a polypeptide structure —
1. Which amino acid residues are present?
2. How many of each?
3. What is their sequence?
(1) may be determined by complete acid hydrolysis andqualitative analysis of the hydrolysate.(2) may be determined by complete acid hydrolysis,quantitative analysis of the hydrolysate, and thedetermination of the MW of the polypeptide. Consider a polypeptide, MW = 637. Complete hydrolysisof 637 mg (1.00 mmole) gives: Asp, 133 mg (1.00 mmole);Phe, 165 mg (1.00 mmole); Val, 117 mg (1.00 mmole);Glu, 294 mg (2.00 mmole). Therefore, polypeptide = Asp-Glu2-Phe-Val, not in sequence.
The sequence (3) may be determined by partial hydrolysis. Suppose the following units (among others) are found afterpartial hydrolysis of the above polypeptide: Asp-Glu, Val-Asp, Phe-Val, Glu-Glu.
4
S S
S S
A
B
We can extrapolate to the overall sequence as follows:
�����������������������������
�����������������������
���������������� �������������������
���������������������������������
���������������� ��������������
Therefore, the sequence is Phe-Val-Asp-Glu-Glu.
Another --- real life --- example would be thedetermination of the sequence of amino acidresidues in the B-chain of insulin. [Insulinconsists of two polypeptide chains, A and B, heldtogether by two disulfide bridges, C-S-S-C. Adisulfide bridge results from oxidative coupling ofthe S-H groups of two cysteine residues.]
First the disulfide bridges would be cleavedreductively, regenerating cysteine SH groups,and separating the two chains. Then the chain ofinterest, B in this case, would be partially hydrolyzed. Theresult obtained when this was done follows:
�������������� ����������������� ��������� ��������
������������������������������
�������������������������
����������������������������������������������������
������������������������������������������������������������������������������������������������������������������������������������������������������� ����������������� ������������������������������������������������������������������������������
5
Ph N C S + H2NCHCNHCHC
R O R'
base
Ph N C
S
H
OR
NCHC NHCHC
R' O
HH2OHCl
NHCHC
R' O
H
C C
NC
N
S
H
HRO
Ph+
degraded peptide
One of 20 possible phenylthiohydantoins,identified via chromatography, therebyidentifying the terminal a a residue.
phenyl isothiocyanate
polypeptide
N-terminal tagged polypeptide
But how is the sequence in the fragments determined?
Answer: Terminal residue analysis.Some reagents will react specifically with one terminus orthe other, thus enabling one to determine the nature of theterminal residue. In some cases, the terminal residue canbe cleaved off and the remainder of the polypeptideremains (more or less) intact. The remaining polypeptidecan be subjected to terminal residue analysis, etc.The Edman Degradation —
6
���������������� ����� ������������
��� ���
� �
Polypeptides: Smaller than proteins. MW < ~7-10,000 or < ~50 amino acid residues.Polypeptides are usually flexible: they usually do not havethe secondary structure characteristic of proteins. (Nor dothey have tertiary or quarternary structure.)
Examples:
bradykinin: helps regulate blood pressure; pain.
methionine enkephalin: Tyr-Gly-Gly-Phe-Met,morphine like analgesic.
oxytocin:
causes uterinecontraction andlactation;
not species specific—labor induced via chicken oxytocin.
7
�������� ������� ����� ��������������
� �
��� ���
vasopressin (note the structural similarity to oxytocin):
antidiuretic.
Proteins —
Classification by Gross Chemical Structure
Simple: consist only of amino acid residues.
Conjugated: consist of polypeptide portion(s) and non-polypeptide portion(s) called prosthetic group(s), eglipoproteins, glycoproteins, nucleoproteins,metalloproteins.
Classification by Gross Morphology
Fibrous: insoluble in water, resist hydrolysis, eg,sericine (silk fibroin), hard keratin (hair, nails, claws), softkeratin (skin), myosin (muscle), collagen (connectivetissue, bones).
Globular: soluble or colloidal in water, eg, hemoglobin,enzymes, antibodies, some hormones.
8
Protein Structure
Primary: sequence of amino acid residues.
Secondary: way in whichsegments of polypeptidebackbone are oriented inspace; depends on planarity ofpeptide bond and, often, H-bonding. The only places in theprotein chain where there istorsional flexibility is at theα-carbons: the angles Φ and Ψcan change causing one planeto rotate relative to the other.
One could imagine that all ofthe planar “plates” in the proteinbackbone lie in the same plane. This would produce a flatstrand. By snaking backand forth other parts of thestrand could lie downalongside the first in parallel(or antiparallel) fashioncreating a flat sheet. Orother protein moleculescould lie down parallel (orantiparallel) to the first.
9
���� �������β� ���������������������������������������� ����!�"������� �#���������$�
This flat sheet would allow hydrogen bonding betweenadjacent strands of the protein (favorable for formation ofthe sheet). However, the R groups sterically interfere witheach other in this setup. This hypothetical structure hasnot been observed in nature.
The steric interference between the R groups in the flatsheet can be reduced if the sheet becomes “pleated”. Inthis arrangement the angle between adjacent plates is notzero. Hydrogen bonding is still possible and now the Rgroups can splay away from each other to some extent. This setup works if the R groups are small. This is thecase for silk fibroin which does form the pleated sheetstructure.
10
���%�������������������������������������!�� � ���� �
Another geometry which ispossible is the helix. Thisarrangement allowshydrogen bonding indirections roughly parallel tothe helical axis (dottedbonds) and the R groupsare oriented outward fromthe helix, away from eachother. This happens innature. Several varieties ofhelix are known; the mostcommon form is known asthe α-helix. It is a right-handed helix (if it were awood screw, turning itclockwise would cause it tobe driven into the wood). [The mirror image of a right-handed helix is a lefthanded one. The mirrorimages of the L-aminoacids in the right-handedhelix would be D-aminoacids. The chirality of theamino acid residuesdetermines the chirality ofthe helix!]
11
&�� ����� ���'��������''������ ���������������������
Tertiary: way in which protein as a whole is twisted in threedimensions; may depend on disulfide bonds, H-bonds,electrostatic attractions, hydrophobic interactions.
Quaternary: aggregate structures held together bynonbonded interactions or occasionally disulfide links.
Let's look at some proteins!
12
The Primary Structure of Bovine Insulin — Insulin is a protein hormone produced in the pancreas. Itis essential for the regulation of carbohydrate metabolism.
Species Amino Acid Residues
A-8 A-9 A-10 B-30
Cow Ala Ser Val Ala
Sheep Ala Gly Val Ala
Horse Thr Gly Ileu Ala
Human Thr Ser Ileu Thr
Pig Thr Ser Ileu Ala
Whale Thr Ser Ileu Ala
13
I have never seen a discussion of the secondary or tertiarystructure of insulin. It is a small protein so its secondarystructure may be somewhat flexible, although one wouldexpect it to have conformations of lower and higher energywith one or several of lower energy predominating. Withregard to tertiary structure the disulfide links wouldcertainly be important.
Collagen — Collagen is found in all multicellularanimals and is the most abundantprotein of vertebrates. It is a fibrousprotein, the fibers of which havegreat tensile strength. It is found inthe connective tissue of teeth, bones,ligaments, tendons and cartilage.
Primary structure —Collagen is rich in glycine (R=H). This enables the polypeptide strandsto fit tightly to other strands withoutsteric difficulties. The other majorcomponents are proline andhydroxyproline.
Secondary structure —A left-handed helix.
Tertiary structure —A right-handed helix.
14
Quaternary structure —A cabling of three right-handed helicies. However, thiscontinues to higher levels of organization as we can see inthe following photomicrographs.
15
Fe++N N
NN
CHH2C CH3
CH
CH2
CH2CH2COO--OOCCH2CH2
H3C
H3C
CH3
O
N
O
NCH2
CHC
OHN
H
Oxygen
Heme
Histidine A AResidueof Globin
Hemoglobin —
Hemoglobin is a complex, conjugated, metalloprotein. It iswhat makes red blood cells red. It carries oxygenthroughout the body from the lungs.
Each hemoglobin protein consists of four units, two �-units
and two �-units. This is its quaternary structure.
Each unit consists of a globin and a heme. The globin is achain of amino acidresidues. The heme isa porphyrin containingone iron atom. The ironatom carries the oxygenmolecule. Thepresence of the non-polypeptide hememakes hemoglobin aconjugated protein. Each type of unit, α andβ, has its own particularshape composed ofvarious twists and turns. This is its tertiary structure.
Within each unit the polypeptide backbone adopts certainconformations going down the chain – in many places thisis a helix. This is the secondary structure of the protein.
16
!%��������(����������&���(��������������������������
�(��������������������������������)*�����������(������&���
�(����������������������������������%�����+������������
�����������
Finally, the sequence of the amino acids in the α- (141residues) and β-globins (146 residues) is the primarystructure.
17
Over 100 abnormal varieties of hemoglobin have beenreported in humans. The abnormalities are in the globins. One of these is the replacement of the glutamic acid
residue at position #6 in the �-globins by a valine residue:
Normal: Val-His-Leu-Thr-Pro-Glu-Lys ---> 146Abnormal: Val-His-Leu-Thr-Pro-Val-Lys ---> 146
This change causes the hemes associated with the �-globins to be unable to carry oxygen. As a result cellsbecome oxygen starved.
This alteration also causes the hemoglobin to crystallizeand this, in turn, causes some red blood cells to stiffen anddeform into a crescent shape. These cells can block bloodflow through capillaries. Furthermore, the body recognizesthese cells as abnormal and destroys them, resulting inanemia. (Life of abnormal cells ~ 16 days; normal ~ 120days.) This condition, along with the shape of the cellsgives rise to the name of this genetic disease: sickle cellanemia.
18
People whohave inheritedtwo sickle cellgenes (onefrom mother,one fromfather) willhave thedisease. Those whohave inheritedone normaland onedefective genewill have someabnormalhemoglobin,but usually livenormal lives;they are saidto have thetrait, and canpass the geneon to theirchildren.
19
� �N,s N,s
� � � �N,N N,s N,s s,s
N=normal genes=defective geneN,N=completely normalN,s=sickle traits,s=sickle cell disease
20
Why hasn't this gene disappeared from the gene pool? Wouldn't it be removed by (Darwinian) natural selection?
Who gets the s gene?
People who live, or whose ancestors lived, in areas wheremalaria is, or was, prevalent.
Malaria?
The malarial parasite spends part of its life cycle in redblood cells. People who are N,s are somewhat resistant tomalaria. Thus, the presence of malaria confers someadvantage to the s gene. This gene did not survive long-term in populations where malaria is not endemic. Themap below shows the relationship between malaria andthe presence of the sickle gene.
21
�����������