Human Genetics
Translation of RNA into Protein
Central Dogma
[Figure: DNA (replication) is transcribed into RNA in the nucleus; RNA is translated into protein in the cytoplasm.]
Human Genome
3.2 billion DNA base pairs
1.5% encode proteins; 98.5% do not encode proteins
~ 31,000 genes encoding 100,000 - 200,000 proteins
How are 100,000 to 200,000 proteins produced from 31,000 genes?
What is the 98.5% of the human genome that does not encode proteins?
Noncoding portion of the human genome
Type of sequence                        Function or characteristic
Noncoding RNAs                          Translation (tRNA, rRNA)
Pseudogenes
                                        RNA processing
Introns                                 Removed with RNA processing
Promoters and other regulatory regions  Determine when and where transcription occurs
Repeats:
  Transposons                           DNA that moves around genome
  Telomeres                             Chromosome tips
  Centromeres                           Important for attachment to spindle
  Duplications                          Unknown
  Simple short repeats                  Unknown
Two types of nucleic acids
RNA
Usually single-stranded
Has uracil as a base
Ribose as the sugar
Carries protein-encoding information
Can be catalytic
DNA
Usually double-stranded
Has thymine as a base
Deoxyribose as the sugar
Carries RNA-encoding information
Not catalytic
RNA Structure Depends on Sequence
A can pair with U, and C with G, via hydrogen bonding, just as A pairs with T and C with G in DNA.
Secondary RNA structure is critical in how it performs its function.
RNA Structure and RNA Sequence enable an RNA to interact specifically with proteins.
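The pairing rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the hairpin sequence are hypothetical, and only A-U and C-G pairs are considered (real RNA also forms G-U wobble pairs, which this sketch ignores).

```python
# Watson-Crick pairs in RNA (G-U wobble pairs are deliberately ignored here).
PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}

def reverse_complement(rna):
    """Return the strand that would pair with `rna`, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(rna))

def can_form_stem(left, right):
    """True if `left` and `right` (both written 5' to 3') pair along their whole length."""
    return right == reverse_complement(left)

# Hypothetical hairpin: stem GGCAU, loop UUCG, stem AUGCC.
print(reverse_complement("GGCAU"))      # AUGCC
print(can_form_stem("GGCAU", "AUGCC"))  # True
```

Because the two stem segments are reverse complements, the single strand can fold back on itself, which is one way sequence determines secondary structure.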
RNA Processing
mRNA transcripts are modified before use as a template for translation:
- Addition of capping nucleotide at the 5’ end
- Addition of polyA tail to 3’ end
Important for moving the transcript out of the nucleus and for regulating when translation occurs
Splicing - the removal of internal sequences
- introns are sequences removed
- exons are sequences remaining
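Splicing can be pictured as cutting out the intron intervals and joining what remains. The sketch below assumes the intron positions are already known; the `splice` function, the transcript, and the coordinates are all hypothetical, and the cap and polyA tail are not modeled.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) index pairs (0-based, end exclusive)."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)

# Hypothetical transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuucag" + "UUCAAA" + "guaugcacag" + "GGAUAA"
introns = [(6, 17), (23, 33)]
print(splice(pre_mrna, introns))  # AUGGCCUUCAAAGGAUAA
```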
Protein Structure Was Solved Before DNA Was Known to Be Genetic Material
Linus Pauling and the alpha helix led to model building by Watson and Crick.
Proteins
Most abundant type of molecules in cells
Responsible for most biological functions:
muscle contraction - myosin and actin
oxygen transport - hemoglobin
immune system - antibodies
connective tissue - cartilage
hair/skin - keratin
metabolism - enzymes
Protein Basics
Proteins are polymers assembled from amino acids
20 different amino acids are used.
The bond between amino acids is called the "peptide bond".
A peptide bond is formed between the carboxyl group of one amino acid and the alpha amino group of another amino acid.
mRNAs have a 5' end and a 3' end - they have Polarity.
Proteins also have polarity.
Protein Folding is Critical
How is protein folding directed within cells?
This is still an active area of research, but to a large degree, protein sequence determines protein folding.
Protein Polarity
The Amino acid at one end of a protein chain has a free Alpha amino group.
Called "Amino-Terminus" or "N-terminus" of the protein.
Amino acid at other end has a free Alpha carboxyl group.
Called "Carboxy-Terminus" or "C-terminus" of the protein.
Direction of Protein Synthesis is from N-terminus to C-terminus.
The Genetic Code
There is a 3 to 1 correspondence between RNA nucleotides and amino acids.
The three nucleotides used to encode one amino acid are called a codon.
The genetic code refers to which codons encode which amino acids.
How do we know it is a 3 letter code?
How Do the mRNA Nucleotides Direct Formation of the Amino Acids in a Protein?
Proteins are formed from 20 amino acids in humans.
Codons of one nucleotide:
A G C U
Can only encode 4 amino acids
Codons of two nucleotides:
AA GA CA UA
AG GG CG UG
AC GC CC UC
AU GU CU UU
Can only encode 16 amino acids
Codons of three nucleotides:
AAA AGA ACA AUA AAG AGG ACG AUG
AAC AGC ACC AUC AAU AGU ACU AUU
GAA GGA GCA GUA GAG GGG GCG GUG
GAC GGC GCC GUC GAU GGU GCU GUU
CAA CGA CCA CUA CAG CGG CCG CUG
CAC CGC CCC CUC CAU CGU CCU CUU
UAA UGA UCA UUA UAG UGG UCG UUG
UAC UGC UCC UUC UAU UGU UCU UUU
Allows for 64 potential codons => sufficient!
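The counting argument above is just 4 raised to the codon length. A small sketch that enumerates the possibilities (the helper name `codons_of_length` is our own):

```python
from itertools import product

BASES = "AGCU"

def codons_of_length(length):
    """List every possible codon of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]

for length in (1, 2, 3):
    count = len(codons_of_length(length))
    verdict = "sufficient" if count >= 20 else "too few"
    print(length, count, verdict)
# 1 4 too few
# 2 16 too few
# 3 64 sufficient
```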
The process of reading the RNA sequence of an mRNA and creating the amino acid sequence of a protein is called translation.
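Transcription and Translation
[Figure: a DNA template strand is transcribed into mRNA, which is read as three codons and translated into a polypeptide.]
DNA template strand: T T C A G T C A G
Messenger RNA (mRNA): A A G U C A G U C
Codons: AAG UCA GUC
Protein (polypeptide, amino acid sequence): Lysine - Serine - Valine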
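The example above (template strand T T C A G T C A G) can be reproduced with a short sketch. The standard genetic code is written here in a compact form: codons are enumerated with the bases in the order U, C, A, G, and each is paired with a one-letter amino acid symbol, with `*` for stop. The function names are our own, and the template is read position by position, as the figure draws it.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in U, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}

# Each template base specifies the complementary RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template):
    """Build the mRNA base by base from the DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)

def translate(mrna):
    """Read the mRNA three bases at a time from its first base."""
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)]

mrna = transcribe("TTCAGTCAG")
print(mrna)             # AAGUCAGUC
print(translate(mrna))  # ['K', 'S', 'V'], i.e. Lys, Ser, Val
```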
Universal Code?
In some organisms, a few of the 64 possible "words" of the genetic code are different.
Do a few different words mean that the code is not universal?
Perhaps: if you're willing to say that the US and Britain don't share a common language because elevators in the UK are called "lifts" and they spell the word "color" with a "u."
The Genetic Code Is
Linear: uses mRNA which is complementary to DNA sequence.
Triplet: the unit of information is the codon, a series of three ribonucleotides.
Unambiguous: each codon specifies only one amino acid (AA).
Degenerate: more than one codon exists for most amino acids.
The Genetic Code Is:
Punctuated: there are codons that indicate “start” and “stop.”
Commaless: there is no punctuation within a mRNA sequence.
Nonoverlapping: any one ribonucleotide is part of only one codon (some exceptions exist).
Universal: the same code is used, with a few exceptions, by viruses, bacteria, archaea, and eukaryotes.
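Two of these properties, "unambiguous" and "degenerate", can be checked directly against the standard codon table. In this sketch the table is a dictionary, so each codon maps to exactly one amino acid, and counting the values shows how many codons share each amino acid. The variable names are our own.

```python
from collections import Counter
from itertools import product

# Standard genetic code; codons enumerated with bases in U, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}

# Degenerate: count how many codons specify each amino acid ("*" = stop).
codons_per_amino_acid = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))            # 64 codons in total
print(codons_per_amino_acid["L"])  # 6 codons for leucine
print(codons_per_amino_acid["M"])  # 1 codon for methionine (AUG)
print(codons_per_amino_acid["*"])  # 3 stop codons
```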
Point Mutations
A single base change can alter the protein product.
Missense: results in one amino acid change.
Nonsense: results in a stop codon.
Frame-shift: changes the "reading frame" of the genetic message.
Silent mutations: point mutations that DON’T alter the protein product because of the degenerate nature of the genetic code.
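The silent, missense, and nonsense categories can be told apart by comparing what the codon encodes before and after the change. A minimal sketch, assuming the original codon is not itself a stop codon; the function name and the example codons are our own choices.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in U, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}

def classify_point_mutation(codon, mutant_codon):
    """Classify a base substitution within one codon (original assumed not a stop)."""
    original = CODON_TABLE[codon]
    mutated = CODON_TABLE[mutant_codon]
    if mutated == original:
        return "silent"      # same amino acid, thanks to degeneracy
    if mutated == "*":
        return "nonsense"    # the change creates a stop codon
    return "missense"        # a different amino acid is inserted

# UUA encodes leucine; change one base at a time.
for mutant in ("UUG", "UCA", "UAA"):
    print("UUA ->", mutant, classify_point_mutation("UUA", mutant))
# UUA -> UUG silent
# UUA -> UCA missense
# UUA -> UAA nonsense
```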
Frame Shift
Within a gene, small deletions or insertions of a number of bases not divisible by 3 will result in a frame shift. For example, given the coding sequence:
AGA UCG ACG UUA AGC
corresponding to the protein:
arginine - serine - threonine - leucine - serine
The insertion of a C-G base pair between bases 6 and 7 would result in the following new code, which would result in a non-functional protein. Every amino acid after the insertion will be wrong.
AGA UCG CAC GUU AAG C
corresponding to the protein:
arginine - serine - histidine - valine - lysine
The frame shift could generate a stop codon which would prematurely end the protein.
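The frame shift example above can be replayed in code: translate the original sequence, insert a C after base 6, and translate again. This sketch uses one-letter amino acid symbols (R = arginine, S = serine, T = threonine, L = leucine, H = histidine, V = valine, K = lysine); the function name is our own.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in U, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}

def translate(mrna):
    """Translate from the first base; stop at a stop codon, ignore leftover bases."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

original = "AGAUCGACGUUAAGC"
shifted = original[:6] + "C" + original[6:]  # insert C between bases 6 and 7

print(translate(original))  # RSTLS
print(shifted)              # AGAUCGCACGUUAAGC
print(translate(shifted))   # RSHVK
```

The first two amino acids are unchanged; everything after the insertion point is read in a different frame.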
How to Recognize Protein Information in DNA
Don't assume that a dsDNA molecule will be read from left to right on the top strand.
Every dsDNA sequence has six possible translations:
top / bottom strand each with a 1st / 2nd / 3rd reading frame
Not every AUG or "stop" sequence is a start or stop codon.
An ORF (Open Reading Frame) has an ATG in frame with a stop codon. It could encode a protein.
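A rough sketch of scanning all six reading frames for candidate ORFs follows. It reports every ATG that is followed in frame by a stop codon; as noted above, such a candidate is not necessarily a real gene. The function names and the DNA sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}

def reverse_complement(dna):
    """Return the bottom strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def find_orfs(dna):
    """Yield (strand, frame, orf) for each ATG...stop found in the six frames."""
    for strand, seq in (("top", dna), ("bottom", reverse_complement(dna))):
        for frame in range(3):
            codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
            start = None
            for index, codon in enumerate(codons):
                if codon == "ATG" and start is None:
                    start = index            # first ATG opens a candidate ORF
                elif codon in STOPS and start is not None:
                    yield strand, frame + 1, "".join(codons[start:index + 1])
                    start = None             # look for another ORF downstream

dna = "GGATGGCCTAAGG"  # hypothetical top strand, 5' to 3'
print(list(find_orfs(dna)))  # [('top', 3, 'ATGGCCTAA')]
```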
Comma free and non-overlapping are correct.
The living cell decodes messenger RNAs by a kind of dead reckoning.
Ribosomes march along the messenger RNA in strides of three bases, translating as they go.
Except for signals that mark where the ribosome is supposed to start, there is nothing in the code itself to enforce the correct reading frame.
Three codons serve as stop signs: UAA, UAG or UGA
What reading frame should be used?
In any mRNA sequence, there are three ways triplet codons can be read.
Each way to read the codons is called a "Reading Frame".
It is very important for the ribosome to find the correct reading frame.
If the wrong reading frame is used, translation generates a protein with the wrong amino acid sequence, which is not functional.
At what codon in the mRNA does the ribosome begin translation?
Recall there is a 5’ untranslated region of the messenger RNA.
The solution is that the ribosome begins translation at a specific AUG codon within the mRNA template termed the "Start Codon".
This is a methionine codon, so the first amino acid in proteins is almost always methionine.
Translation has Three Steps
Initiation - translation begins at start codon (AUG=methionine)
Elongation - the ribosome uses the tRNA anticodon to match codons to amino acids and adds those amino acids to the growing peptide chain
Termination - translation ends at the stop codon UAA, UAG or UGA
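The three steps can be mirrored in a short function. This is a simplification, not a model of the ribosome: it treats the first AUG in the sequence as the start codon, and a table lookup stands in for tRNA anticodon matching. The mRNA and the function name are hypothetical.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in U, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}

def translate(mrna):
    # Initiation: locate the start codon (assumed here to be the first AUG).
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Elongation: read one codon at a time and extend the chain.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        # Termination: a stop codon (UAA, UAG or UGA) ends translation.
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

# Hypothetical mRNA: leader, coding region ending in UAA, trailing bases.
mrna = "GGACU" + "AUGGGAUGUAAGCGAUAA" + "GCU"
print(translate(mrna))  # MGCKR, i.e. Met-Gly-Cys-Lys-Arg
```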
Translation Initiation
[Figure: the small ribosomal subunit binds the leader sequence of the mRNA (5’ to 3’). The initiator tRNA (anticodon UAC) carrying Met pairs with the start codon AUG, and the large ribosomal subunit joins to form the ribosome.]
Translation Elongation
[Figure: tRNAs bring amino acids (Met, Gly, Cys, Lys) to the ribosome one codon at a time, lengthening the polypeptide (amino acid chain).]
Translation Termination
[Figure: the ribosome reaches the stop codon (UAA) and a release factor binds; the chain Met, Gly, Cys, Lys, Arg is complete.]
Once the stop codon is reached, the elements disassemble.
5'- G T A A T C C T C -3' DNA sense (partner) strand
3’- C A T T A G G A G -5’ DNA template (antisense) strand
5'- G U A A U C C U C -3' mRNA
N - val - ile - leu - C protein
By convention, amino acid sequences are written and numbered left-to-right from N-terminus to C-terminus.
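The relationship between the three nucleic acid strands shown above can be checked with a brief sketch: the mRNA has the same sequence as the sense strand with U in place of T, and it is complementary to the template strand. The function names are our own, and the template is kept in the 3' to 5' orientation in which it is written above.

```python
DNA_PAIR = str.maketrans("ACGT", "TGCA")  # DNA base pairing
RNA_PAIR = str.maketrans("ACGT", "UGCA")  # DNA template base -> RNA base

def template_from_sense(sense_5_to_3):
    """Template strand, written 3' to 5' so it lines up under the sense strand."""
    return sense_5_to_3.translate(DNA_PAIR)

def mrna_from_template(template_3_to_5):
    """mRNA (5' to 3') built by pairing with the template strand."""
    return template_3_to_5.translate(RNA_PAIR)

sense = "GTAATCCTC"
template = template_from_sense(sense)
mrna = mrna_from_template(template)

print(template)                         # CATTAGGAG
print(mrna)                             # GUAAUCCUC
print(mrna == sense.replace("T", "U"))  # True
```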
RNA Splicing Depends on Sequence and Structure
http://bcs.whfreeman.com/thelifewire/content/chp14/1402001.html