AST734
Andrew Boal
25 November 2004
Image credits: NASA, NPS, and Protein Data Bank
Coming Soon to a Lab Near you…
Extreme environments and astrobiology
Numerous extreme terrestrial habitats are seen as potential analogs to life-bearing niches in the solar system
Extreme environments are those which exist outside of the conditions of a “mesophilic environment” (T ~30-40 °C, salt concentration <3%, etc.)
These “extreme” environments might model conditions found on Mars, Europa, Titan, elsewhere
Terrestrial examples include hot springs (high temp.), salt lakes (high salt), deep sea vents (high pressure), deserts (low water)
Image credits: NASA, NPS
Extreme environments: microbes in residence
Extremophiles are defined by the type of environment required for growth
There is no overall consensus on the definition of an extreme environment
Organisms that can survive in an extreme environment but do not require those conditions for growth are extremotolerant
Mesophile: Lives in an ambient environment
Image credit: CDC
Thermophile: Temp. > ~45 °C
Psychrophile: Temp. < ~20 °C
Barophile: High pressure
Xerophile: Low water content
Halophile: Salt content > 3-10%
Acidophile: pH < 5
Alkaliphile: pH > 9
Radiophile: High amounts of radiation
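The thresholds in the list above can be read as a simple lookup. The sketch below is only an illustration: the function name, its parameters, and the choice of 3% as the halophile cutoff (the low end of the 3-10% range quoted on the slide) are assumptions, and the categories without a numeric threshold (barophile, xerophile, radiophile) are left out.

```python
def classify_environment(temp_c=None, salt_percent=None, ph=None):
    """Return the extremophile labels whose slide thresholds are met.

    Thresholds follow the slide: >45 C thermophile, <20 C psychrophile,
    pH <5 acidophile, pH >9 alkaliphile. The 3% salt cutoff is an
    assumption (the slide gives a 3-10% range).
    """
    labels = []
    if temp_c is not None:
        if temp_c > 45:
            labels.append("thermophile")
        elif temp_c < 20:
            labels.append("psychrophile")
    if salt_percent is not None and salt_percent > 3:
        labels.append("halophile")
    if ph is not None:
        if ph < 5:
            labels.append("acidophile")
        elif ph > 9:
            labels.append("alkaliphile")
    # No threshold met: treat the environment as mesophilic.
    return labels or ["mesophile"]


if __name__ == "__main__":
    # A hot, acidic spring versus an ambient freshwater pond.
    print(classify_environment(temp_c=80, ph=2))
    print(classify_environment(temp_c=30, salt_percent=1, ph=7))
```

An environment can carry more than one label, which is why the function returns a list rather than a single category.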
Biogeography
Biogeography is the study of the environmental distribution of species
Exploring several isolated, analogous extreme environments, between which microbes may not be able to travel, can lead to a better understanding of microbial evolution
But what about a deeper look?
Map credit: CIA World Factbook; Image credits: NPS
Molecular components of cells
The predominant components of the molecular makeup of cells include lipids, nucleotides, and proteins
The ability of these molecules to function is directly related to molecular shape, which is influenced by the environment, so…
Nucleotides: protein blueprints and fabrication
Proteins: do the work of the cell
Lipids: provide cell membranes
Biomolecular structural endemism
The Big Questions:
Are there molecular structures which are endemic in an environment?
If so, how and why are those structures arrived at?
Photo Credits: National Park Service Web pages
What are biomolecules?
Biomolecular structure
Biomolecular structure is determined by a combination of covalent and noncovalent bonds
Covalent bonds are static entities which are little affected by environment
Noncovalent bonds (hydrophobic interactions, hydrogen bonding, and electrostatic attraction) exist in a dynamic equilibrium, and thus can be attenuated by factors such as temperature, ion content, and pH
Biomolecules must be both somewhat flexible and somewhat rigid to function properly; therefore the forces that hold the molecular shape must attain a balance with the environment
Too static: function is compromised
Balance: structure and function preserved
Too dynamic: structure is compromised
Lipid structure
Lipids are made up of a hydrophilic (water-loving) head group and a hydrophobic (water-fearing) tail
In cell membranes, lipids pack to form a bilayer so that the heads are in water and the tails are mixed together
[Figure: chemical structure of a phospholipid and its packing into a bilayer in water (H2O), labeled with the hydrophilic head group and the hydrophobic tail]
Lipids in thermal environments
Lipids from thermophilic archaea have a dramatically different chemical structure
[Figure: mesophile lipid bilayer, thermophile Archaea bilayer, and hyperthermophile Archaea lipid]
Thermophile Archaea bilayer:
Increased hydrocarbon branching gives increased hydrophobicity
Head-tail linkages are ethers, not esters, and are chemically more robust
Hyperthermophile Archaea lipid:
Backbone of both layers is chemically connected, again increasing stability
DNA and RNA: chemistry
DNA and RNA are polymers of nucleotides (oligo- or polynucleotides)
Nucleotides are composed of a nucleobase attached to a sugar, which is linked into the backbone
Sugars:
Ribose is in RiboNucleic Acid (RNA)
Deoxyribose is in DeoxyriboNucleic Acid (DNA)
[Figure: ribose and deoxyribose sugars, each shown with its backbone connections and attached nucleobase]
The extra -OH (alcohol) in ribose makes RNA much less chemically stable than DNA
[Figure: chemical structures of the five nucleobases, each attached to a sugar]
Nucleobases:
Thymine (DNA only)
Uracil (RNA only)
Adenine
Guanine
Cytosine
Nucleobases are cyclic structures which are basic (like ammonia)
DNA and RNA: polynucleotide structure
DNA and RNA structure is based on hydrophobic interactions and hydrogen bonding
Hydrogen bonding is a weak interaction where two electronegative elements “share” a hydrogen atom (note that carbon-hydrogen bonds do not partake in hydrogen bonding)
[Figure: chemical structures of the T:A and G:C base pairs attached to the sugar-phosphate backbone]
Dashed lines indicate hydrogen bonds
Polynucleotide backbone has charged phosphate groups which are hydrophilic
Thymine:Adenine (T:A) base pair
Guanine:Cytosine (G:C) base pair
Center of duplex is hydrophobic
DNA: secondary structure
Base pairing determines the nature of the secondary structure
The basic elements of DNA secondary structure are the duplex (which is by far the most prevalent), the junction, and the hairpin
[Figure: DNA secondary structure elements, drawn with example sequences: junction, duplex, and hairpin]
DNA melting
One of the easiest ways to measure DNA stability is to obtain a “melting curve” which is a spectroscopic measurement of duplex unzipping
Representation of DNA melting by duplex unzipping or unwinding
Figure taken from: Drukker, K., et al. J. Phys. Chem. B. 2000, 104, 6108-6111
Example of a DNA melting curve obtained spectroscopically
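The sigmoidal shape of a melting curve can be sketched with a two-state model. This is a simplification and not the treatment used in the cited figure: it assumes a unimolecular transition (as for a hairpin; real duplex melting is bimolecular and concentration dependent) and a temperature-independent unfolding enthalpy. The function name and the example parameter values are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(temp_c, tm_c, delta_h_kcal):
    """Fraction of molecules unfolded at temp_c in a two-state model.

    tm_c is the melting temperature (where half the molecules are unfolded)
    and delta_h_kcal is the unfolding enthalpy, assumed constant.
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    # van't Hoff relation: ln K = -(dH/R) * (1/T - 1/Tm), so K = 1 at T = Tm.
    k_unfold = math.exp(-(delta_h_kcal / R_KCAL) * (1.0 / t - 1.0 / tm))
    return k_unfold / (1.0 + k_unfold)


if __name__ == "__main__":
    # Hypothetical molecule: Tm = 60 C, unfolding enthalpy = 50 kcal/mol.
    for temp in range(30, 100, 10):
        print(temp, round(fraction_unfolded(temp, 60.0, 50.0), 3))
```

A larger enthalpy makes the transition sharper; the midpoint stays at the chosen melting temperature.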
Stability of DNA in extreme environments
The main determinant of DNA stability is the fraction of G:C base pairs in a given oligonucleotide sequence
The primary difference between an A:T and a G:C base pair is that G:C has three hydrogen bonds (A:T has two), and is thus more stable
[Figure: melting temperature (Tm, axis range 40-90) versus GC content of the DNA (fGC, 0.1-0.9) at 69 mM, 220 mM, and 1020 mM NaCl]
Data taken from: Owczarzy, R., et al. Biochemistry, 2004, 43, 3537-3554.
Other factors include hydrophobicity and interaction between salt and the DNA backbone
[Figure: chemical structures of the A:T and G:C base pairs]
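The two quantities discussed on this slide, the GC fraction and the number of base-pair hydrogen bonds, are easy to compute from one strand of a duplex. The sketch below assumes a perfectly complementary DNA duplex (every base paired) and uses hypothetical function names; it does not attempt to predict Tm, which also depends on salt and stacking as noted above.

```python
# Hydrogen bonds contributed by the base pair that each base forms:
# A:T pairs have two, G:C pairs have three.
H_BONDS_PER_PAIR = {"A": 2, "T": 2, "G": 3, "C": 3}


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(1 for base in seq if base in ("G", "C")) / len(seq)


def duplex_hydrogen_bonds(seq):
    """Total Watson-Crick hydrogen bonds in a fully paired duplex of seq."""
    return sum(H_BONDS_PER_PAIR[base] for base in seq.upper())


if __name__ == "__main__":
    for strand in ("AATTAATT", "ATGCATGC", "GGCCGGCC"):
        print(strand, gc_fraction(strand), duplex_hydrogen_bonds(strand))
```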
The many faces of RNA
RNA is primarily involved in protein synthesis and comes in three major types:
Messenger RNA (mRNA) is made by transcription of DNA and lists the amino acid sequence of a protein
Ribosomal RNA (rRNA) forms the skeleton of the ribosome, the machine which makes proteins
Transfer RNA (tRNA) transports amino acids into the ribosome
[Figure labels: growing protein, amino acid]
Structure of tRNA
tRNA is a good molecule to explore for environmental studies
Like DNA, RNA secondary structure has elements such as duplexes, loops, bulges, and hairpins
tRNA molecules are usually fairly small (fewer than 100 nucleotides)
tRNA has a relatively simple secondary structure
tRNA usually exists as a free molecule in the cell
[Figure: secondary structure of a tRNA, drawn from its sequence]
Stability of tRNA
The stability of tRNA can be measured spectroscopically, as for DNA, but can also be calculated
Calculated free energy is obtained by factoring in the strength of noncovalent interactions in a folded and unfolded tRNA and is expressed as the free energy of complex formation, ∆Gf (NOTE: lower ∆Gf value indicates increased stability, formation is more favorable)
Initial ∆Gf values and predicted secondary structure can be calculated from raw sequence data:
E. coli: GGGGCTATAGCTCAGCTGGGAGAGCGCTTGCATGGCATGCAAGAGGtCAGCGGTTCGATCCCGCTTAGCTCCACCA
E. coli: ∆Gf = -28.9 kcal/mol
Thermoplasma acidophilum: GGGCCGGTAGATCAGAGGTAGATCGCTTCCTTGGCATGGAAGAGGcCAGGGGTTCAAATCCCCTCCGGTCCA
T. acidophilum: ∆Gf = -30 kcal/mol
Calculated ∆Gf values of the GGC codon tRNA from E. coli and T. acidophilum
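As a small worked comparison of the two calculated values above, the sketch below computes the difference in ∆Gf and, because the two sequences differ in length, a per-nucleotide value. The per-nucleotide normalization is an illustrative choice and not something stated on the slide; the function names are hypothetical. The sequences and ∆Gf values are copied from the slide.

```python
# (sequence, calculated ∆Gf in kcal/mol), as given on the slide.
TRNA = {
    "E. coli": (
        "GGGGCTATAGCTCAGCTGGGAGAGCGCTTGCATGGCATGCAAGAGGtCAGCGGTTCGATCCCGCTTAGCTCCACCA",
        -28.9,
    ),
    "T. acidophilum": (
        "GGGCCGGTAGATCAGAGGTAGATCGCTTCCTTGGCATGGAAGAGGcCAGGGGTTCAAATCCCCTCCGGTCCA",
        -30.0,
    ),
}


def per_nucleotide(delta_gf, sequence):
    """Folding free energy divided by sequence length (illustrative only)."""
    return delta_gf / len(sequence)


def stability_difference(name_a, name_b, table=TRNA):
    """∆Gf(a) - ∆Gf(b); a negative result means a is the more stable fold."""
    return table[name_a][1] - table[name_b][1]


if __name__ == "__main__":
    for name, (seq, dg) in TRNA.items():
        print(name, len(seq), "nt", round(per_nucleotide(dg, seq), 3), "kcal/mol per nt")
    print("difference:", round(stability_difference("T. acidophilum", "E. coli"), 2), "kcal/mol")
```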
Proteins: amino acids and primary structure
Primary structure is determined by covalent amide bonds between individual amino acids
Amino acids
[Figure: general structure of an amino acid, labeled with the amino functionality, the acid functionality, and the side chain]
Side chain (“R-group”): defines the chemical and physical nature of the amino acid
Examples of amino acids
Alanine (A, Ala): slightly hydrophobic
Leucine (L, Leu): strongly hydrophobic
Serine (S, Ser): hydrophilic
Glutamic acid (E, Glu): hydrophilic, negative
Lysine (K, Lys): hydrophilic, positive
[Figure: chemical structures of the example amino acids and of a peptide chain]
A peptide or protein is a chain of 10-1000 amino acids
Proteins: secondary structure
Secondary structures (folds) are defined by hydrogen bonding and steric interactions of the side chains
The β-sheet is a linear arrangement of amino acids
Structure is defined by inter-strand hydrogen bonds, less by sterics of side chains
Sheets can be parallel or anti-parallel, defined by orientation of the backbone
The α-helix is a coil of a peptide chain and has 3.6 residues per helical turn
Primary interactions are hydrogen bonding between residues along the helical axis and steric interactions between side chains
Other, but far less common, peptide folds include the coiled-coil, random coil, bulge, β-turn, 3₁₀ helix, 2₇ helix, π-helix, β-barrel, and so on…
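The figure of 3.6 residues per helical turn quoted on this slide gives a quick way to estimate the size of a helix from its residue count. The sketch below also uses a rise of 1.5 Å per residue along the helix axis; that number is the standard textbook value for an ideal helix of this type and does not come from the slides. The function name is hypothetical.

```python
RESIDUES_PER_TURN = 3.6          # from the slide
RISE_PER_RESIDUE_ANGSTROM = 1.5  # textbook value, an assumption here


def helix_geometry(n_residues):
    """Return (number of turns, axial length in angstroms) for an ideal helix."""
    turns = n_residues / RESIDUES_PER_TURN
    length = n_residues * RISE_PER_RESIDUE_ANGSTROM
    return turns, length


if __name__ == "__main__":
    turns, length = helix_geometry(18)
    print(f"18 residues: {turns:.1f} turns, {length:.1f} angstroms long")
```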
Proteins: tertiary and quaternary structure
Tertiary Structure: the overall shape of a folded protein
Quaternary Structure: the assembly of multiple protein units into a larger structure
Tertiary and quaternary structure are defined almost entirely by noncovalent interactions
[Figure: protein structure shown in top view and side view]
Protein model systems: α-helices
The α-helix is a common protein structural element which can be readily studied
α-Helices are the secondary structural element most susceptible to sequence and environmental factors, and the stability of helices is related to the stability of the overall protein
Structural stability is measured by spectroscopically observing helix unfolding
Like DNA melting, helix (and protein) stability is related to a structural denaturation
Graph taken from: Whitington, S. J., et al. Biochemistry, 2003, 42, 14690-14695.
As for tRNA, ∆Gf can be calculated for helices or can be measured using circular dichroism spectroscopy by employing the relationship ∆Gf = -RT ln K, where K can be measured from the spectrum
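The relationship ∆Gf = -RT ln K can be applied directly once K is known. The sketch below is a minimal illustration: it assumes a two-state equilibrium, so that K is the ratio of folded to unfolded populations, and it takes the folded fraction as an input rather than deriving it from a spectrum. The function names and the example numbers are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def k_from_fraction_folded(fraction_folded):
    """Equilibrium constant K = [folded]/[unfolded] in a two-state model."""
    return fraction_folded / (1.0 - fraction_folded)


def delta_g_from_k(k, temp_c):
    """Free energy of formation, ∆Gf = -RT ln K, in kcal/mol."""
    return -R_KCAL * (temp_c + 273.15) * math.log(k)


def k_from_delta_g(delta_g, temp_c):
    """Inverse relation: K = exp(-∆Gf / RT)."""
    return math.exp(-delta_g / (R_KCAL * (temp_c + 273.15)))


if __name__ == "__main__":
    # Hypothetical helix that is 90% folded at 25 C.
    k = k_from_fraction_folded(0.9)
    print("K =", round(k, 2), " ∆Gf =", round(delta_g_from_k(k, 25.0), 2), "kcal/mol")
```

A mostly folded population (K > 1) gives a negative ∆Gf, matching the convention above that a lower ∆Gf means a more stable structure.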
Example of environment-related structural differences
One example is the study of the helices of RecA
RecA is a protein involved with DNA repair, cell division, and other processes and is found in all environments
This work was published as: Petukov, M. et al. Proteins: Structure, Function, and Genetics 1997, 29, 309-320
Crystal structure of RecA from E. coli
Crystal structure of RecA from E. coli was used as a template
RecA sequences from 29 proteins were aligned with that of E. coli, allowing for the determination of helical fragments
∆Gf values for these sequences were calculated and analyzed
There are 10 helical regions in RecA
Thermophile helices are more stable
Calculated ∆Gf values indicated that helices of thermophile origin were more stable than mesophile helices
Eight of the thermophile helices were found to be more stable; these helices are likely related to STRUCTURAL stability
No change was found for two helices, both of which are directly involved in interactions with DNA and other proteins; these helices likely need to retain flexibility for FUNCTIONAL stability
Interestingly, total helix stability was found to be the same if each protein is evaluated at its optimal temperature for activity; this is again related to the need for molecular flexibility
[Figure: total helix ∆Gf calculated at 20 °C, 37 °C, and 80 °C for T. thermophilus (80 °C), E. coli (37 °C), and P. aeruginosa (20 °C)]
Biomolecular structural endemism
The Big Questions:
Are there molecular structures which are endemic in an environment?
If so, how and why are those structures arrived at?
Photo Credits: National Park Service Web pages
Study roadmap
Bioinformatics
Develop comprehensive listing of known protein/RNA sequences from public databases
Search for environment-specific structural elements
Sample Collection
Identify environments for study (Hawaii lakes; Chile: Andes and Patagonia?)
Travel/Sample Collection/Data Analysis
Model Studies
Synthesis of short RNA and peptide sequences
Study structure of these molecules in lab-generated extreme (thermal/salt/pressure) environments
Computer models of these systems
Environments to be explored
Initial work will be carried out in Hawaiian lakes
These include Lake Kauhako (Moloka’i), Lake Wai’ele’ele (Maui), Green Lake and Lake Waiau (Hawai’i)
These lakes are relatively accessible and will provide a ready data set that we will use to develop our sampling and analysis methodologies
This data set will also establish part of the mesophile baseline
South American Lake Environments
South American lakes are less well studied from the biogeographical viewpoint, so we will be able to describe new environments
These environments are also geographically isolated from other extreme environments, which will allow for greater geographic variability
South America, specifically the Andes and Patagonia, has numerous extremophilic environments
Other possible environments include deep sea trenches and subglacial lakes (UH collaborations)
What we will look at: “adaptive” proteins
Proteins which serve a function adapted to the environment
Antifreeze protein: inhibits ice crystal formation
Potassium channel: transports K+ into the cell
Mechanosensitive channel: responds to osmotic stress
ATP sulfurylase: critical in sulfate-reducing bacteria
What we will look at: “conserved” proteins
Conserved proteins are those expected to be similar across environments because their function is ubiquitous
ATPase: synthesis of ATP, a cell energy source
DNA gyrase: involved in DNA packaging
Rhodopsin: light sensing and transduction
Pyruvate kinase: involved in glycolysis
Planned Methodologies: Bioinformatics and Sample Collection
Data that will be collected in the environment
Environmental DNA: will be used to establish the biodiversity of a site as well as provide information regarding molecular sequences
Physical factors will also be taken into account, including temperature, salinity, nutrient composition, etc.
Bioinformatics is the term used to describe the mining of biological sequence and structural databases
The initial work here will be to develop a database of molecular sequences correlated with the organism of origin (which will tell us the nature of the environments they came from)
These sequences will then be examined for environment-specific structural motifs
This database will help to establish environmental targets and can be modified by biogeographical studies
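One way to picture the planned database is as a list of records that tie each sequence to its organism and environment, which can then be scanned for a motif. The sketch below is purely illustrative: the record fields, the function name, the organism labels, and the toy sequences are all hypothetical placeholders, and a plain substring match stands in for whatever structural-motif search would actually be used.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    organism: str      # organism of origin
    environment: str   # e.g. "thermal", "mesophilic" (labels are placeholders)
    molecule: str      # e.g. "tRNA", "protein"
    sequence: str


def motif_counts_by_environment(records, motif):
    """Return {environment: (sequences containing motif, total sequences)}."""
    counts = defaultdict(lambda: [0, 0])
    for rec in records:
        entry = counts[rec.environment]
        entry[1] += 1
        if motif.upper() in rec.sequence.upper():
            entry[0] += 1
    return {env: (hits, total) for env, (hits, total) in counts.items()}


if __name__ == "__main__":
    # Toy records with made-up sequences, for illustration only.
    toy_records = [
        SequenceRecord("organism_A", "thermal", "tRNA", "GGGCCGGTAG"),
        SequenceRecord("organism_B", "thermal", "tRNA", "AAGGCTT"),
        SequenceRecord("organism_C", "mesophilic", "tRNA", "ATATTTAA"),
    ]
    print(motif_counts_by_environment(toy_records, "GGC"))
```

Keeping the environment on each record is what allows the later per-environment comparisons; a motif that appears in nearly every sequence from one environment and rarely elsewhere would be a candidate environment-specific element.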
Methodologies: Model systems and computations
Synthesis and physical or computational characterization of model and natural peptides or nucleotide sequences
These studies will provide us with a numerical quantity (∆Gf) for stability as well as molecular-level insights into the mechanism of stability
Other variants of this work include the study of the folding of proteins isolated from the environment and the study of peptide-oligonucleotide interactions
α-Helices
tRNA sequences
More complicated peptides such as the helix bundle (common in membrane proteins)
Bringing it all together
We will attempt to establish a relationship between the physical environment, biodiversity, and molecular structure
One way this can be accomplished is to generate plots of stability vs. structural similarity for individual environments
[Figure: schematic plot with increasing structural similarity on the vertical axis and increasing stability or protein activity on the horizontal axis]
This range will indicate the stability window
This range will indicate the variance of structures which are capable of surviving
A small stability range would indicate that there are rigorous energetic requirements
A small structural similarity range would indicate environment-specific structures
If both ranges are small, it may indicate that structures evolved to meet the specific requirements of that environment
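The two ranges described above can be computed for each environment once stability and structural-similarity values are in hand. The sketch below is only an illustration of that bookkeeping: the function names, the cutoff arguments, and the example numbers are hypothetical, and what counts as a "small" range is left as an input because the slides do not define it.

```python
def value_range(values):
    """Spread of a collection of numbers (max minus min)."""
    values = list(values)
    return max(values) - min(values)


def summarize_environment(points):
    """points: (stability, structural_similarity) pairs for one environment."""
    points = list(points)
    return {
        "stability_range": value_range(s for s, _ in points),
        "similarity_range": value_range(sim for _, sim in points),
    }


def both_ranges_small(summary, max_stability_range, max_similarity_range):
    """True when both ranges fall within the (user-chosen) cutoffs."""
    return (
        summary["stability_range"] <= max_stability_range
        and summary["similarity_range"] <= max_similarity_range
    )


if __name__ == "__main__":
    # Placeholder values: (∆Gf in kcal/mol, similarity score between 0 and 1).
    hot_spring = [(-30.0, 0.90), (-29.0, 0.95), (-31.0, 0.92)]
    summary = summarize_environment(hot_spring)
    print(summary, both_ranges_small(summary, 3.0, 0.1))
```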