protein folds and structure
-
8/6/2019 Protein Folds and Structure
-
The central dogma
DNA -------> RNA -------> Protein
{A,C,G,T}    {A,C,G,U}    {A,C,D,...,Y} (20 amino acids)
DNA bases: Adenine, Cytosine, Guanine, Thymine
T -> U: in RNA, Uracil replaces Thymine
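The mapping between the three alphabets above can be sketched in a few lines. This is a minimal illustration, assuming the standard genetic code and a coding-strand DNA sequence read from its first base; the function names `transcribe` and `translate` are made up for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons enumerated in TCAG order
# at each position ("*" marks a stop codon).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Coding-strand DNA -> mRNA: same sequence with T replaced by U."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """mRNA -> protein (one-letter codes), stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        # The lookup table is keyed on DNA letters, so map U back to T.
        codon = rna[i:i + 3].upper().replace("U", "T")
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
peptide = translate(mrna)
```

Here `ATG` codes for methionine (M), `GCC` for alanine (A), and `TAA` is a stop codon, so the peptide is written amino end first as `MA`.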
-
Biology/Chemistry of Protein Structure
STRUCTURE      PROCESS
Primary        Assembly
Secondary      Folding
Tertiary       Packing
Quaternary     Interaction
-
Primary Structure
linear
ordered
1 dimensional
sequence of amino acid polymer
by convention, written from amino end to carboxyl end
a perfectly linear amino acid polymer is neither functional nor energetically favorable -> folding!
-
Protein Folding
tumbles towards conformations that reduce E (this process is thermodynamically favorable)
yields secondary structure
occurs in the cytosol
involves localized spatial interaction among primary structure elements, i.e. the amino acids
may or may not involve chaperone proteins
-
Secondary Structure
non-linear
3 dimensional
localized to regions of an amino acid chain
formed and stabilized by hydrogen bonding, electrostatic and van der Waals interactions
-
Ramachandran Plot
Pauling built models based on the following principles, codified by Ramachandran:
(1) bond lengths and angles should be similar to those found in individual amino acids and small peptides
(2) peptide bond should be planar
(3) overlaps not permitted, pairs of atoms no closer than sum of their covalent radii
(4) stabilization requires sterics that permit hydrogen bonding
Two degrees of freedom:
(1) φ (phi) angle = rotation about N-Cα
(2) ψ (psi) angle = rotation about Cα-C
A linear amino acid polymer with some folds is better but still neither functional nor completely energetically favorable -> packing!
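The two backbone angles can be computed from atom coordinates as signed dihedral angles. The sketch below assumes the usual four-atom definitions (φ from C of the previous residue, N, Cα, C; ψ from N, Cα, C, N of the next residue) and takes coordinates as plain (x, y, z) tuples; the function names are illustrative and degenerate (collinear) geometries are not handled.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for rotation about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """phi: rotation about the N-CA bond."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """psi: rotation about the CA-C bond."""
    return dihedral(n, ca, c, n_next)
```

With this convention a cis arrangement of the outer atoms gives 0 degrees and a trans arrangement gives 180 degrees (up to sign). Plotting each residue's (φ, ψ) pair gives the Ramachandran plot.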
-
Protein Packing
occurs in the cytosol (~60% bulk water, ~40% water of hydration)
involves interaction between secondary structure elements and solvent
may be promoted by chaperones, membrane proteins
tumbles into molten globule states
overall entropy loss is small enough so enthalpy determines the sign of the change in E, which decreases (loss in entropy from packing counteracted by gain from desolvation and reorganization of water, i.e. hydrophobic effect)
yields tertiary structure
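The entropy/enthalpy trade-off described for packing can be made concrete with a small calculation. This sketch assumes the slide's "E" behaves like the Gibbs free energy, ΔG = ΔH - TΔS; that reading is an assumption, and every number below is invented purely for illustration, not measured.

```python
def free_energy_change(delta_h, delta_s, temperature=310.0):
    """Return dG = dH - T * dS.

    delta_h in kJ/mol, delta_s in kJ/(mol*K), temperature in K.
    """
    return delta_h - temperature * delta_s


# Hypothetical values, chosen only to show the bookkeeping:
chain_entropy_loss = -0.50    # packing orders the chain (unfavorable)
solvent_entropy_gain = 0.45   # released water gains freedom (hydrophobic effect)
net_delta_s = chain_entropy_loss + solvent_entropy_gain  # small net loss

delta_h = -40.0               # new contacts formed in the core (favorable)

delta_g = free_energy_change(delta_h, net_delta_s)
```

Because the two entropy terms nearly cancel, the -TΔS penalty (about +15.5 kJ/mol here) is smaller than the enthalpy gain, so the overall change stays negative (about -24.5 kJ/mol with these made-up inputs).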
-
Tertiary Structure
non-linear
3 dimensional
global but restricted to the amino acid polymer
formed and stabilized by hydrogen bonding, covalent (e.g. disulfide) bonding, hydrophobic packing toward core and hydrophilic exposure to solvent
A globular amino acid polymer folded and compacted is somewhat functional (catalytic) and energetically favorable -> interaction!
-
Protein Interaction
occurs in the cytosol, in close proximity to other folded and packed proteins
involves interaction among tertiary structure elements of separate polymer chains
may be promoted by chaperones, membrane proteins, cytosolic and extracellular elements as well as the proteins' own propensities
E decreases further due to further desolvation and reduction of surface area
globular proteins, e.g. hemoglobin (oxygen transport), largely involved in catalytic and transport roles
fibrous proteins, e.g. collagen, largely involved in structural roles
yields quaternary structure
-
Quaternary Structure
non-linear
3 dimensional
global, and across distinct amino acid polymers
formed by hydrogen bonding, covalent bonding, hydrophobic packing and hydrophilic exposure
favorable, functional structures occur frequently and have been categorized
-
Class/Motif
class = secondary structure composition,
e.g. all α, all β, segregated α+β, mixed α/β
motif = small, specific combinations of secondary structure elements,
e.g. β-α-β loop
both subset of fold/architecture/domains
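Since class is defined by secondary structure composition, a rough class label can be derived from a per-residue secondary structure string. The sketch below assumes a three-state encoding (H = helix, E = strand, anything else = coil) and a 15% cutoff; both the encoding and the cutoff are illustrative assumptions, not values used by any particular database.

```python
def secondary_structure_class(ss: str, threshold: float = 0.15) -> str:
    """Label a chain by its helix/strand content.

    ss: one character per residue, 'H' = helix, 'E' = strand, other = coil.
    threshold: minimum fraction of residues for a type to count as present.
    """
    if not ss:
        raise ValueError("empty secondary structure string")
    helix = ss.count("H") / len(ss)
    strand = ss.count("E") / len(ss)
    has_alpha = helix >= threshold
    has_beta = strand >= threshold
    if has_alpha and has_beta:
        return "alpha and beta"
    if has_alpha:
        return "all alpha"
    if has_beta:
        return "all beta"
    return "little secondary structure"
```

Composition alone cannot separate the segregated and mixed alpha-plus-beta cases; that distinction depends on how helices and strands are ordered along the chain, which this sketch deliberately ignores.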
-
Fold/Architecture/Domains
fold = architecture = the overall shape and orientation of the secondary structures, ignoring connectivity between the structures,
e.g. α/β barrel, TIM barrel
domain = the functional property of such a fold or architecture,
e.g. binding, cleaving, spanning sites
subset of topology/fold families/superfamilies
-
Experimental Determination and Analysis
Repositories
Protein Data Bank
Molecular Modeling DataBase
Resolution
X-Ray Crystallography
NMR Spectroscopy
Mass Spectrometry (next week)
Fluorescence Resonance Energy Transfer
-
Protein Data Bank (PDB)
A repository for 3-D biological macromolecular structure.
Established in 1971 at Brookhaven National Lab (7 structures)
It includes proteins, nucleic acids and viruses.
Obtained by X-Ray crystallography (80%) or NMR spectroscopy (16%).
Submitted by biologists and biochemists from around the world.
-
Computational Determination and Analysis
Databases
CATH (Class, Architecture, Topology, Homologous superfamily)
SCOP (Structural Classification Of Proteins)
FSSP (Fold classification based on Structure-Structure alignment of Proteins)
Prediction
Ab-initio, theoretical modeling, and conformation space search
Homology modeling and threading
Energy minimization, simulation and Monte Carlo
-
CATH
a combination of manual and automated hierarchical classification
four major levels:
Class (C) based on secondary structure content
Architecture (A) based on gross orientation of secondary structures
Topology (T) based on connections and numbers of secondary structures
Homologous superfamily (H) based on structure/function evolutionary commonalities
provides useful geometric information (e.g. architecture)
partial automation may result in examples near fixed thresholds being assigned inaccurately
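The four levels nest, so a CATH assignment can be treated as a path in a tree. The sketch below assumes the dotted four-number form in which CATH superfamily codes are commonly written (C.A.T.H); the names `CathNode`, `parse_cath_code` and `shared_level` are made up for this example, and the codes used are only sample inputs.

```python
from typing import NamedTuple


class CathNode(NamedTuple):
    class_: int
    architecture: int
    topology: int
    superfamily: int


LEVEL_NAMES = ["none", "Class", "Architecture", "Topology", "Homologous superfamily"]


def parse_cath_code(code: str) -> CathNode:
    """Split a dotted 'C.A.T.H' code into its four integer levels."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathNode(*(int(p) for p in parts))


def shared_level(a: CathNode, b: CathNode) -> str:
    """Name the deepest level down to which two assignments agree."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return LEVEL_NAMES[depth]


first = parse_cath_code("3.20.20.70")
second = parse_cath_code("3.20.20.80")
third = parse_cath_code("3.40.50.300")
```

Two domains that agree down to Topology but not Homologous superfamily share connectivity without clear evidence of common ancestry, which is the distinction the hierarchy is built to express.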
-
SCOP
a purely manual hierarchical classification
three major levels:
Family based on clear evolutionary relationship (pairwise residue identities between proteins are >30%)
Superfamily based on probable evolutionary origin (low sequence identity but common structure/function features)
Fold based on major structural similarity (major secondary structures in same arrangement and topology)
provides detailed evolutionary information
manual process limits update frequency and makes equally exhaustive examination of every entry difficult
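The >30% pairwise identity criterion for a Family can be illustrated with a small helper. This sketch assumes the two sequences are already aligned (equal length, `-` for gaps) and counts identity over columns where neither sequence has a gap; other definitions of the denominator exist, and SCOP's curators weigh more than this one number.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a, aligned_b)
        if x != "-" and y != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def passes_family_identity(aligned_a: str, aligned_b: str, cutoff: float = 30.0) -> bool:
    """True when pairwise identity exceeds the cutoff (30% on the slide)."""
    return percent_identity(aligned_a, aligned_b) > cutoff
```

For example, `percent_identity("ACDEFGHIKL", "ACDEFGHIKV")` compares ten columns with nine matches, giving 90.0.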
-
FSSP
a purely automated hierarchical classification
three major levels:
representative set: 330 protein chains (less than 30% sequence identity)
clustering: based on structural alignment into fold families
convergence: cutting at a high statistical significance level increases the number of distinct families, gradually approaching one family per protein chain
continually updated, presents data and lets user assess
Without sufficient knowledge, user may not assess data appropriately
list of representative set
clustering dendrogram
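The convergence behavior (a stricter cut gives more families, eventually one per chain) can be seen in a toy clustering. The sketch below uses single-linkage clustering with a union-find structure as a simple stand-in; it is not FSSP's actual procedure, and the pairwise similarity scores are invented for illustration.

```python
def count_families(n_chains, scores, cutoff):
    """Count single-linkage clusters at a given similarity cutoff.

    scores: dict mapping (i, j) chain index pairs to a similarity score.
    Chains i and j land in the same family when their score >= cutoff,
    directly or through a chain of such links.
    """
    parent = list(range(n_chains))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), score in scores.items():
        if score >= cutoff:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(n_chains)})


# Hypothetical structural similarity scores between four chains.
toy_scores = {(0, 1): 12.0, (1, 2): 6.0, (2, 3): 3.0, (0, 3): 2.5}

families_by_cutoff = {
    cutoff: count_families(4, toy_scores, cutoff)
    for cutoff in (2.0, 5.0, 10.0, 15.0)
}
```

With these made-up scores, raising the cutoff from 2 to 15 takes the family count from 1 to 2 to 3 to 4, which is one family per chain. This is why the choice of cut level matters when reading the dendrogram.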