![Page 1: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/1.jpg)
The Structure and Function of Proteins
Bioinformatics Ch 7
![Page 2: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/2.jpg)
The many functions of proteins
• Mechanoenzymes: myosin, actin
• Rhodopsin: allows vision
• Globins: transport oxygen
• Antibodies: immune system
• Enzymes: pepsin, renin, carboxypeptidase A
• Receptors: transmit messages through membranes
• Vitellogenin: molecular velcro
– And hundreds of thousands more…
![Page 3: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/3.jpg)
Complex Chemistry Tutorial
• Molecules are made of atoms!
• There is a lot of hydrogen out there!
• Atoms make a “preferred” number of covalent (strong) bonds
– C – 4
– N – 3
– O, S – 2
• Atoms will generally “pick up” enough hydrogens to “fill their valence capacity” in vivo.
• Molecules also “prefer” to have a neutral charge
![Page 4: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/4.jpg)
Biochemistry
• In the context of a protein…
– Oxygen tends to exhibit a slight negative charge
– Nitrogen tends to exhibit a slight positive charge
– Carbon tends to remain neutral/uncharged
• Atoms can “share” a hydrogen atom, each making “part” of a covalent bond with the hydrogen (a hydrogen bond, or H-bond)
– Oxygen: H-bond donor or acceptor
– Nitrogen: H-Bond donor
– Carbon: Neither
![Page 5: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/5.jpg)
Proteins are chains of amino acids
• Polymer – a molecule composed of repeating units
![Page 6: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/6.jpg)
Amino acid composition
• Basic amino acid structure:
– The side chain, R, varies for each of the 20 amino acids
– Amino & carboxyl groups, plus the central carbon, make the “backbone” of the amino acid
[Figure: generic amino acid, with the amino group, carboxyl group, and side chain (R) attached to the central carbon]
![Page 7: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/7.jpg)
The Peptide Bond
• Dehydration synthesis
• Repeating backbone: N–Cα–C(=O)–N–Cα–C(=O)
– Convention – start at amino terminus and proceed to carboxy terminus
![Page 8: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/8.jpg)
Peptidyl polymers
• A few amino acids in a chain are called a polypeptide. A protein is usually composed of 50 to 400+ amino acids.
• Since part of the amino acid is lost during dehydration synthesis, we call the units of a protein amino acid residues.
[Figure labels: carbonyl carbon, amide nitrogen]
![Page 9: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/9.jpg)
Side chain properties
• Recall that the electronegativity of carbon is at about the middle of the scale for light elements
– Carbon does not make hydrogen bonds with water easily – hydrophobic
– O and N are generally more likely than C to H-bond to water – hydrophilic
• We group the amino acids into three general groups:
– Hydrophobic
– Charged (positive/basic & negative/acidic)
– Polar
![Page 10: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/10.jpg)
The Hydrophobic Amino Acids
Proline severely limits allowable conformations!
![Page 11: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/11.jpg)
The Charged Amino Acids
![Page 12: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/12.jpg)
The Polar Amino Acids
![Page 13: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/13.jpg)
More Polar Amino Acids
And then there’s…
![Page 14: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/14.jpg)
Planarity of the peptide bond
Phi (φ) – the angle of rotation about the N–Cα bond.
Psi (ψ) – the angle of rotation about the Cα–C bond.
The planar bond angles and bond lengths are fixed.
![Page 15: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/15.jpg)
Phi and psi
φ = ψ = 180° is the extended conformation
φ: Cα to N–H; ψ: C=O to Cα
[Figure labels: Cα, C=O, N–H]
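Phi and psi are both torsion (dihedral) angles, so each can be computed from the coordinates of four consecutive backbone atoms. The following is a minimal sketch, not taken from the slides: the function name `dihedral` and the test coordinates are illustrative assumptions, and in practice the atom coordinates would come from a structure file.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond.

    For phi, the four points would be C(i-1), N(i), CA(i), C(i);
    for psi, N(i), CA(i), C(i), N(i+1). Hypothetical helper, not from the slides.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)          # first bond vector
    b1 = sub(p2, p1)          # central bond (rotation axis)
    b2 = sub(p3, p2)          # third bond vector
    n1 = cross(b0, b1)        # normal of the plane p0, p1, p2
    n2 = cross(b1, b2)        # normal of the plane p1, p2, p3
    length = math.sqrt(dot(b1, b1))
    axis = tuple(x / length for x in b1)
    x = dot(n1, n2)
    y = dot(axis, cross(n1, n2))
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the fourth point is rotated a quarter turn about the z axis.
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

With the fourth point placed opposite the first (a trans arrangement), the function returns a magnitude of 180°, which corresponds to the extended conformation described on this slide.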
![Page 16: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/16.jpg)
The Ramachandran Plot
• G. N. Ramachandran – first calculations of sterically allowed regions of phi and psi
• Note the structural importance of glycine
[Figure: Ramachandran plots: observed (non-glycine), observed (glycine), calculated]
![Page 17: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/17.jpg)
Primary & Secondary Structure
• Primary structure = the linear sequence of amino acids comprising a protein: AGVGTVPMTAYGNDIQYYGQVT…
• Secondary structure
– Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every protein structure known: the α-helix and the β-sheet
– The location and direction of these periodic, repeating structures is known as the secondary structure of the protein
![Page 18: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/18.jpg)
The alpha helix: φ ≈ ψ ≈ −60°
![Page 19: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/19.jpg)
Properties of the alpha helix: φ ≈ ψ ≈ −60°
• Hydrogen bonds between C=O of residue n and NH of residue n+4
• 3.6 residues/turn
• 1.5 Å/residue rise
• 100°/residue turn
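The numbers on this slide are tied together by simple arithmetic: 3.6 residues per turn at 1.5 Å per residue gives a pitch of about 5.4 Å per turn, and 360° / 3.6 gives the 100° turn per residue. The sketch below works through this; the function names are illustrative and not part of the slides.

```python
RESIDUES_PER_TURN = 3.6    # from the slide
RISE_PER_RESIDUE = 1.5     # in Å, from the slide
TURN_PER_RESIDUE = 100     # in degrees, from the slide (360 / 3.6)

def helix_pitch():
    """Distance along the helix axis covered by one full turn, in Å."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE

def helix_length(n_residues):
    """Approximate length in Å of an ideal helix with n_residues residues."""
    return n_residues * RISE_PER_RESIDUE

def wheel_angle(i):
    """Angular position in degrees of residue i, viewed down the helix axis."""
    return (i * TURN_PER_RESIDUE) % 360

print(helix_pitch())                        # about 5.4 Å per turn
print(helix_length(10))                     # a 10-residue helix is 15 Å long
print([wheel_angle(i) for i in range(5)])   # positions of residues n .. n+4
```

Residue n+4 lands only 40° away from residue n on the wheel, which fits the n to n+4 hydrogen-bonding pattern listed above. The angular pattern repeats exactly every 18 residues (five full turns).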
![Page 20: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/20.jpg)
Properties of α-helices
• 4 – 40+ residues in length
• Often amphipathic or “dual-natured”
– Half hydrophobic and half hydrophilic
– Mostly when surface-exposed
• If we examine many α-helices, we find trends…
– Helix formers: Ala, Glu, Leu, Met
– Helix breakers: Pro, Gly, Tyr, Ser
![Page 21: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/21.jpg)
The beta strand (& sheet): φ ≈ −135°, ψ ≈ +135°
![Page 22: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/22.jpg)
Properties of beta sheets
• Formed of stretches of 5-10 residues in extended conformation
• Pleated – each Cα a bit above or below the previous
• Parallel/antiparallel, contiguous/non-contiguous
![Page 23: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/23.jpg)
Parallel and anti-parallel β-sheets
• Anti-parallel is slightly energetically favored
[Figure panels: anti-parallel, parallel]
![Page 24: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/24.jpg)
Turns and Loops
• Secondary structure elements are connected by regions of turns and loops
• Turns – short regions of non-α, non-β conformation
• Loops – larger stretches with no secondary structure. Often disordered.
– “Random coil”
– Sequences vary much more than secondary structure regions
![Page 25: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/25.jpg)
Levels of Protein Structure
• Secondary structure elements combine to form tertiary structure
• Quaternary structure occurs in multienzyme complexes
– Many proteins are active only as homodimers, homotetramers, etc.
![Page 26: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/26.jpg)
Secondary Structure Prediction
• Based on backbone flexibility
• Various methods
– Statistical, neural networks, evolutionary computation.
– Conserved aligned sequences as input (degree calculated)
– PHD can get 70-75% accuracy
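The slide does not say how accuracy is measured. A common choice, assumed here, is the per-residue three-state score (often called Q3): the percentage of residues whose predicted state (helix H, strand E, coil C) matches the observed one. The function name and both example strings below are hypothetical.

```python
def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted and observed states agree.

    Both arguments are strings over the states H (helix), E (strand), C (coil).
    Assumed metric; the slides do not define how accuracy is computed.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)

# Hypothetical 10-residue example: 8 of 10 states agree.
print(q3_accuracy("HHHHCCEEEE", "HHHCCCEEEC"))
```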
![Page 27: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/27.jpg)
Chou-Fasman Parameters

| Name | Abbrv | P(a) | P(b) | P(turn) | f(i) | f(i+1) | f(i+2) | f(i+3) |
|---|---|---|---|---|---|---|---|---|
| Alanine | A | 142 | 83 | 66 | 0.06 | 0.076 | 0.035 | 0.058 |
| Arginine | R | 98 | 93 | 95 | 0.07 | 0.106 | 0.099 | 0.085 |
| Aspartic Acid | D | 101 | 54 | 146 | 0.147 | 0.11 | 0.179 | 0.081 |
| Asparagine | N | 67 | 89 | 156 | 0.161 | 0.083 | 0.191 | 0.091 |
| Cysteine | C | 70 | 119 | 119 | 0.149 | 0.05 | 0.117 | 0.128 |
| Glutamic Acid | E | 151 | 37 | 74 | 0.056 | 0.06 | 0.077 | 0.064 |
| Glutamine | Q | 111 | 110 | 98 | 0.074 | 0.098 | 0.037 | 0.098 |
| Glycine | G | 57 | 75 | 156 | 0.102 | 0.085 | 0.19 | 0.152 |
| Histidine | H | 100 | 87 | 95 | 0.14 | 0.047 | 0.093 | 0.054 |
| Isoleucine | I | 108 | 160 | 47 | 0.043 | 0.034 | 0.013 | 0.056 |
| Leucine | L | 121 | 130 | 59 | 0.061 | 0.025 | 0.036 | 0.07 |
| Lysine | K | 114 | 74 | 101 | 0.055 | 0.115 | 0.072 | 0.095 |
| Methionine | M | 145 | 105 | 60 | 0.068 | 0.082 | 0.014 | 0.055 |
| Phenylalanine | F | 113 | 138 | 60 | 0.059 | 0.041 | 0.065 | 0.065 |
| Proline | P | 57 | 55 | 152 | 0.102 | 0.301 | 0.034 | 0.068 |
| Serine | S | 77 | 75 | 143 | 0.12 | 0.139 | 0.125 | 0.106 |
| Threonine | T | 83 | 119 | 96 | 0.086 | 0.108 | 0.065 | 0.079 |
| Tryptophan | W | 108 | 137 | 96 | 0.077 | 0.013 | 0.064 | 0.167 |
| Tyrosine | Y | 69 | 147 | 114 | 0.082 | 0.065 | 0.114 | 0.125 |
| Valine | V | 106 | 170 | 50 | 0.062 | 0.048 | 0.028 | 0.053 |
![Page 28: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/28.jpg)
Chou-Fasman Algorithm
• Identify α-helices
– Find 4 out of 6 contiguous amino acids that have P(a) > 100
– Extend the region until 4 amino acids with P(a) < 100 are found
– Compute ΣP(a) and ΣP(b); if the region is >5 residues and ΣP(a) > ΣP(b), identify it as a helix
• Repeat for β-sheets [use P(b)]
• If an α and a β region overlap, the overlapping region is predicted according to ΣP(a) and ΣP(b)
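The helix steps above can be sketched in a few lines of Python. This is a simplified reading of the slide rather than a reference implementation: the function names are illustrative, the region is extended only toward the C-terminus (as in the CAENKLDHV example), "4 amino acids with P(a) < 100" is read as four contiguous residues, and overlaps between helix and sheet regions are not resolved. The P(a) and P(b) values are copied from the parameter table.

```python
# P(a) and P(b) per residue, copied from the Chou-Fasman parameter table.
P_ALPHA = {"A": 142, "R": 98, "D": 101, "N": 67, "C": 70, "E": 151, "Q": 111,
           "G": 57, "H": 100, "I": 108, "L": 121, "K": 114, "M": 145, "F": 113,
           "P": 57, "S": 77, "T": 83, "W": 108, "Y": 69, "V": 106}
P_BETA = {"A": 83, "R": 93, "D": 54, "N": 89, "C": 119, "E": 37, "Q": 110,
          "G": 75, "H": 87, "I": 160, "L": 130, "K": 74, "M": 105, "F": 138,
          "P": 55, "S": 75, "T": 119, "W": 137, "Y": 147, "V": 170}

def is_breaker(tetrad, prop):
    """True if all four residues have a propensity below 100 (assumed reading)."""
    return len(tetrad) == 4 and all(prop[r] < 100 for r in tetrad)

def find_regions(seq, prop, other, window=6, needed=4):
    """Return (start, end) index pairs, end exclusive, of predicted regions.

    prop is the propensity being searched for (P_ALPHA for helices),
    other is the competing propensity (P_BETA for helices).
    """
    regions = []
    i = 0
    while i + window <= len(seq):
        nucleus = seq[i:i + window]
        if sum(prop[r] > 100 for r in nucleus) < needed:
            i += 1
            continue
        # Extend toward the C-terminus until the next four residues are all breakers.
        end = i + window
        while end < len(seq) and not is_breaker(seq[end:end + 4], prop):
            end += 1
        region = seq[i:end]
        total = sum(prop[r] for r in region)
        total_other = sum(other[r] for r in region)
        if len(region) > 5 and total > total_other:
            regions.append((i, end))
            i = end          # continue scanning after the accepted region
        else:
            i += 1
    return regions

seq = "CAENKLDHVRGPTCILFMTWYNDGP"
for start, end in find_regions(seq, P_ALPHA, P_BETA):
    print(seq[start:end], sum(P_ALPHA[r] for r in seq[start:end]),
          sum(P_BETA[r] for r in seq[start:end]))
```

Calling `find_regions(seq, P_BETA, P_ALPHA)` applies the same scan with the roles of the two tables swapped, which is one way to read "repeat for sheets".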
![Page 29: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/29.jpg)
Chou-Fasman, cont’d
• Identify hairpin turns:
– P(t) = f(i) of the residue * f(i+1) of the next residue * f(i+2) of the following residue * f(i+3) of the residue at position (i+3)
– Predict a hairpin turn starting at positions where:
  • P(t) > 0.000075
  • The average P(turn) for the four residues > 100
  • ΣP(a) < ΣP(turn) > ΣP(b) for the four residues
• Accuracy 60-65%
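The turn rule can be sketched the same way. The values below are copied from the parameter table; the function names and the tuple layout of `PARAMS` are illustrative assumptions. Comparing averages over the four residues is equivalent to comparing sums, since all three are divided by four.

```python
# residue: (P(a), P(b), P(turn), f(i), f(i+1), f(i+2), f(i+3)), from the parameter table
PARAMS = {
    "A": (142, 83, 66, 0.06, 0.076, 0.035, 0.058),
    "R": (98, 93, 95, 0.07, 0.106, 0.099, 0.085),
    "D": (101, 54, 146, 0.147, 0.11, 0.179, 0.081),
    "N": (67, 89, 156, 0.161, 0.083, 0.191, 0.091),
    "C": (70, 119, 119, 0.149, 0.05, 0.117, 0.128),
    "E": (151, 37, 74, 0.056, 0.06, 0.077, 0.064),
    "Q": (111, 110, 98, 0.074, 0.098, 0.037, 0.098),
    "G": (57, 75, 156, 0.102, 0.085, 0.19, 0.152),
    "H": (100, 87, 95, 0.14, 0.047, 0.093, 0.054),
    "I": (108, 160, 47, 0.043, 0.034, 0.013, 0.056),
    "L": (121, 130, 59, 0.061, 0.025, 0.036, 0.07),
    "K": (114, 74, 101, 0.055, 0.115, 0.072, 0.095),
    "M": (145, 105, 60, 0.068, 0.082, 0.014, 0.055),
    "F": (113, 138, 60, 0.059, 0.041, 0.065, 0.065),
    "P": (57, 55, 152, 0.102, 0.301, 0.034, 0.068),
    "S": (77, 75, 143, 0.12, 0.139, 0.125, 0.106),
    "T": (83, 119, 96, 0.086, 0.108, 0.065, 0.079),
    "W": (108, 137, 96, 0.077, 0.013, 0.064, 0.167),
    "Y": (69, 147, 114, 0.082, 0.065, 0.114, 0.125),
    "V": (106, 170, 50, 0.062, 0.048, 0.028, 0.053),
}

def turn_score(tetrad):
    """P(t): product of f(i), f(i+1), f(i+2), f(i+3) over four consecutive residues."""
    score = 1.0
    for offset, residue in enumerate(tetrad):
        score *= PARAMS[residue][3 + offset]
    return score

def averages(tetrad):
    """Average P(a), P(b), and P(turn) over the four residues."""
    return tuple(sum(PARAMS[r][k] for r in tetrad) / 4 for k in range(3))

def is_turn(tetrad, cutoff=0.000075):
    """Apply the three conditions from the slide to one four-residue window."""
    avg_a, avg_b, avg_turn = averages(tetrad)
    return (turn_score(tetrad) > cutoff
            and avg_turn > 100
            and avg_turn > avg_a
            and avg_turn > avg_b)

def predict_turns(seq):
    """Start indices of every four-residue window that satisfies the turn rule."""
    return [i for i in range(len(seq) - 3) if is_turn(seq[i:i + 4])]

print(round(turn_score("VRGP"), 6), averages("VRGP"), is_turn("VRGP"))
```

For the window VRGP this gives P(t) ≈ 0.000085 with averages of 79.5, 98.25, and 113.25 for P(a), P(b), and P(turn), so all three conditions hold.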
![Page 30: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/30.jpg)
Chou-Fasman Example
• CAENKLDHVRGPTCILFMTWYNDGP
• CAENKL – potential helix (4 of 6 residues have P(a) > 100; C and N do not)
• Residues with P(a) < 100: RNCGPSTY
– Extend: When we reach RGPT, we must stop
– CAENKLDHV: ΣP(a) = 972, ΣP(b) = 843
– Declare alpha helix
• Identifying a hairpin turn
– VRGP: P(t) = 0.000085
– Average P(turn) = 113.25
– Avg P(a) = 79.5, Avg P(b) = 98.25
![Page 31: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/31.jpg)
Protein Structure Examples
![Page 32: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/32.jpg)
Views of a protein
[Figure panels: wireframe, ball and stick]
![Page 33: The Structure and Function of Proteins Bioinformatics Ch 7](https://reader036.vdocument.in/reader036/viewer/2022062417/5515d7c1550346dd6f8b493d/html5/thumbnails/33.jpg)
Views of a protein
[Figure panels: spacefill, cartoon]
CPK colors:
Carbon = green, black, or grey
Nitrogen = blue
Oxygen = red
Sulfur = yellow
Hydrogen = white