Human Genetics

Translation of RNA into Protein

Cytoplasm

Nucleus

DNA

Central Dogma

RNA

Protein

Replication

Transcription

Translation

Human Genome

3.2 billion DNA base pairs

1.5% encode proteins < = > 98.5% not protein encoding

~ 31,000 genes encoding 100,000 - 200,000 proteins

How are 100,000 to 200,000 proteins produced from 31,000 genes?

What is the 98.5% of the human genome that does not encode proteins?

Noncoding portion of the human genome

Type of sequence Function or characteristic

Noncoding RNAs Translation (tRNA, rRNA)

Pseudogenes

RNA processing

Introns Removed with RNA processing

Promoters and other regulatory regions Determine when and where transcription occurs

Repeats:

Transposons DNA that moves around genome

Telomeres Chromosome tips

Centromeres Important for attachment to spindle

Duplications Unknown

Simple short repeats Unknown

Two types of nucleic acids

RNA

Usually single-stranded

Has uracil as a base

Ribose as the sugar

Carries protein-encoding information

Can be catalytic

DNA

Usually double-stranded

Has thymine as a base

Deoxyribose as the sugar

Carries RNA-encoding information

Not catalytic

# of strands

kind of sugar

bases used

RNA Structure Depends on Sequence

In RNA, A can pair with U and C with G via hydrogen bonding, much as A pairs with T and C with G in DNA.

Secondary RNA structure is critical in how it performs its function.

RNA Structure and RNA Sequence enable an RNA to interact specifically with proteins.

RNA Processing

mRNA transcripts are modified before use as a template for translation:

- Addition of a capping nucleotide at the 5' end

- Addition of a poly(A) tail to the 3' end

Both are important for moving the transcript out of the nucleus and for regulating when translation occurs.

Splicing - the removal of internal sequences

- introns are the sequences removed

- exons are the sequences remaining

Protein structure was solved before DNA was known to be the genetic material.

Linus Pauling's work on the alpha helix led to model building by Watson and Crick.

Proteins

Most abundant type of molecule in cells

Responsible for most biological functions:

muscle contraction - myosin and actin

oxygen transport - hemoglobin

immune system - antibodies

connective tissue - collagen (as in cartilage)

hair/skin - keratin

metabolism - enzymes

Gene Expression changes in Proteins during Development

Protein Basics

Proteins are polymers assembled from amino acids

20 different amino acids are used.

The bond between amino acids is called the "peptide bond".

A peptide bond is formed between the carboxyl group of one amino acid and the alpha amino group of another amino acid.

mRNAs have a 5' end and a 3' end - they have Polarity.

Proteins also have polarity.

Protein Folding is Critical

How is protein folding directed within cells?

This is still an active area of research, but to a large degree, protein sequence determines protein folding.

Misfolding of Protein Impairs Function

Protein Polarity

The amino acid at one end of a protein chain has a free alpha amino group.

This end is called the "Amino-Terminus" or "N-terminus" of the protein.

The amino acid at the other end has a free alpha carboxyl group.

This end is called the "Carboxy-Terminus" or "C-terminus" of the protein.

Direction of Protein Synthesis is from N-terminus to C-terminus.

The Genetic Code

There is a 3 to 1 correspondence between RNA nucleotides and amino acids.

The three nucleotides used to encode one amino acid are called a codon.

The genetic code refers to which codons encode which amino acids.

How do we know it is a 3 letter code?

How Do the mRNA Nucleotides Direct Formation of the Amino Acids in a Protein?

Proteins are formed from 20 amino acids in humans.

Codons of one nucleotide:

A G C U

Can only encode 4 amino acids

Codons of two nucleotides:

AA GA CA UA
AG GG CG UG
AC GC CC UC
AU GU CU UU

Can only encode 16 amino acids

Codons of three nucleotides:

AAA AGA ACA AUA AAG AGG ACG AUG
AAC AGC ACC AUC AAU AGU ACU AUU
GAA GGA GCA GUA GAG GGG GCG GUG
GAC GGC GCC GUC GAU GGU GCU GUU
CAA CGA CCA CUA CAG CGG CCG CUG
CAC CGC CCC CUC CAU CGU CCU CUU
UAA UGA UCA UUA UAG UGG UCG UUG
UAC UGC UCC UUC UAU UGU UCU UUU

Allows for 64 potential codons => sufficient!
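The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function name codons_of_length and the constant AMINO_ACIDS_NEEDED are made up for this example and do not come from the slides.

```python
from itertools import product

BASES = "AGCU"            # the four RNA nucleotides, in the order used above
AMINO_ACIDS_NEEDED = 20   # number of amino acids that must be encoded

def codons_of_length(n):
    """Return every possible codon made of n nucleotides."""
    return ["".join(p) for p in product(BASES, repeat=n)]

for n in (1, 2, 3):
    count = len(codons_of_length(n))   # equals 4 ** n
    verdict = "sufficient" if count >= AMINO_ACIDS_NEEDED else "too few"
    print(f"codon length {n}: {count} possible codons ({verdict})")
```

A codon length of three is the smallest that gives at least 20 combinations, which is the point the slide makes.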

Theoretical Codes

The Genetic Code

Three Conceivable Kinds of Genetic Codes

The process of reading the RNA sequence of an mRNA and creating the amino acid sequence of a protein is called translation.

DNA template strand: T T C A G T C A G

(Transcription)

Messenger RNA (mRNA): A A G U C A G U C, read as three codons: AAG UCA GUC

(Translation)

Polypeptide (amino acid sequence): Lysine - Serine - Valine
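The two steps in the diagram above can be sketched in Python. The sketch assumes the standard genetic code with one-letter amino acid abbreviations (K = lysine, S = serine, V = valine, * = stop), and it pairs the template strand with the mRNA position by position, exactly as the diagram is drawn. The function names transcribe and translate are illustrative only.

```python
from itertools import product

# Standard genetic code, built in the conventional U, C, A, G order.
# One-letter amino acid codes; '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Each DNA template base pairs with one RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template):
    """Build the mRNA base paired with each template base, in the same order."""
    return "".join(DNA_TO_RNA[base] for base in template)

def translate(mrna):
    """Read the mRNA three bases at a time and look up each codon."""
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)]

mrna = transcribe("TTCAGTCAG")
print(mrna)             # AAGUCAGUC
print(translate(mrna))  # ['K', 'S', 'V']
```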

Translation

How do we know a 3 nucleotide codon determines amino acid choice?

Prediction of Amino Acid Sequence from Synthetic RNA molecules

The genetic code is non-overlapping

Universal Code?

In some organisms, a few of the 64 possible "words" of the genetic code are different.

Do a few different words mean that the code is not universal?

Perhaps: if you're willing to say that the US and Britain don't share a common language because elevators in the UK are called "lifts" and they spell the word "color" with a "u."

The Genetic Code Is

Linear: uses mRNA which is complementary to DNA sequence.

Triplet: the unit of information is the codon, a series of three ribonucleotides.

Unambiguous: each codon specifies only one amino acid (AA).

Degenerate: more than one codon exists for most amino acids.

The Genetic Code Is:

Punctuated: there are codons that indicate “start” and “stop.”

Commaless: there is no punctuation within an mRNA sequence.

Nonoverlapping: any one ribonucleotide is part of only one codon (some exceptions exist).

Universal: the same code is used by viruses, bacteria, archaea, and eukaryotes.
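Two of the properties listed above, "unambiguous" and "degenerate", can be illustrated by tallying the standard codon table. This is a sketch under the assumption that the standard code applies; the variable names are illustrative and '*' stands for a stop codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code, one-letter amino acid codes, '*' = stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Unambiguous: the table is a dictionary, so each codon maps to exactly one entry.
# Degenerate: most amino acids appear as the value of more than one codon.
codons_per_amino_acid = Counter(CODON_TABLE.values())

amino_acids = [aa for aa in codons_per_amino_acid if aa != "*"]
single_codon = sorted(aa for aa in amino_acids if codons_per_amino_acid[aa] == 1)

print(len(CODON_TABLE), "codons for", len(amino_acids), "amino acids")
print("stop codons:", codons_per_amino_acid["*"])
print("amino acids with only one codon:", single_codon)
```

In the standard code only methionine (M) and tryptophan (W) have a single codon, which is why the slide says "most" rather than "all" amino acids have more than one.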

Point Mutations

A single base change can alter the protein product.

Missense: results in one amino acid change.

Nonsense: results in a stop codon.

Frame-shift: changes the "reading frame" of the genetic message.

Silent mutations: point mutations that DON'T alter the protein product because of the degenerate nature of the genetic code.
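As a rough illustration of the categories above, a single-base substitution within one codon can be classified by comparing the amino acid before and after the change. The function classify_substitution is a hypothetical helper written for this example; it assumes the standard genetic code and that the original codon is not itself a stop codon.

```python
from itertools import product

# Standard genetic code, one-letter amino acid codes, '*' = stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def classify_substitution(codon, position, new_base):
    """Classify a one-base change in a sense codon (position is 0, 1, or 2)."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if after == before:
        return "silent"      # same amino acid, thanks to degeneracy
    if after == "*":
        return "nonsense"    # the change created a stop codon
    return "missense"        # a different amino acid is inserted

print(classify_substitution("UUA", 2, "G"))  # UUA -> UUG, leucine either way
print(classify_substitution("UUA", 1, "C"))  # UUA -> UCA, leucine to serine
print(classify_substitution("UUA", 1, "A"))  # UUA -> UAA, a stop codon
```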

Frame Shift

Within a gene, small deletions or insertions of a number of bases not divisible by 3 will result in a frame shift. For example, given the coding sequence:

AGA UCG ACG UUA AGC

corresponding to the protein

arginine - serine - threonine - leucine - serine

The insertion of a C-G base pair between bases 6 and 7 would result in the following new code, which would result in a non-functional protein. Every amino acid after the insertion will be wrong.

AGA UCG CAC GUU AAG C

corresponding to the protein:

arginine - serine - histidine - valine - lysine

The frame shift could generate a stop codon which would prematurely end the protein.
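The frame-shift example above can be reproduced in a short Python sketch. It assumes the standard genetic code with one-letter amino acid codes (R = arginine, S = serine, T = threonine, L = leucine, H = histidine, V = valine, K = lysine); the function name translate is illustrative.

```python
from itertools import product

# Standard genetic code, one-letter amino acid codes, '*' = stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Translate complete codons only; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))

original = "AGAUCGACGUUAAGC"
# Insert a C between bases 6 and 7 (after index 5 when counting from 0).
shifted = original[:6] + "C" + original[6:]

print(translate(original))  # RSTLS
print(translate(shifted))   # RSHVK
```

The first two amino acids are unchanged because they lie before the insertion; everything after it is read in a shifted frame.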

How to Recognize Protein Information in DNA

Don't assume that a dsDNA molecule will be read from left to right on the top strand.

Every dsDNA sequence has six possible translations:

top / bottom strand each with a 1st / 2nd / 3rd reading frame

Not every AUG or "stop" sequence is a start or stop codon.

An ORF (Open Reading Frame) has an ATG in frame with a stop codon. It could encode a protein.
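A minimal sketch of the six-frame idea follows. It is not a real gene finder: it simply lists the three reading frames of each strand and reports any stretch that starts with ATG and reaches an in-frame stop codon. The helper names six_frames and find_orfs and the example sequence are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(dna):
    """Return the bottom strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def six_frames(dna):
    """Map (strand, frame number) to the sequence read in that frame."""
    frames = {}
    for name, strand in (("top", dna), ("bottom", reverse_complement(dna))):
        for offset in range(3):
            frames[(name, offset + 1)] = strand[offset:]
    return frames

def find_orfs(dna):
    """List (strand, frame, sequence) for each ATG followed by an in-frame stop."""
    orfs = []
    for (strand, frame), seq in six_frames(dna).items():
        codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
        start = None
        for index, codon in enumerate(codons):
            if start is None and codon == "ATG":
                start = index                      # first ATG opens the frame
            elif start is not None and codon in STOP_CODONS:
                orfs.append((strand, frame, "".join(codons[start:index + 1])))
                start = None                       # look for a later ORF
    return orfs

print(find_orfs("GGATGAAACCCTAAGG"))
# [('top', 3, 'ATGAAACCCTAA')]
```

In this made-up sequence only the third frame of the top strand contains an ATG in frame with a stop codon, which illustrates why every frame has to be examined.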

Comma free and non-overlapping are correct.

The living cell decodes messenger RNAs by a kind of dead reckoning.

Ribosomes march along the messenger RNA in strides of three bases, translating as they go.

Except for signals that mark where the ribosome is supposed to start, there is nothing in the code itself to enforce the correct reading frame.

Three codons serve as stop signs: UAA, UAG or UGA

What reading frame should be used?

In any mRNA sequence, there are three ways triplet codons can be read.

Each way to read the codons is called a "Reading Frame".

It is very important for the ribosome to find the correct reading frame.

If the wrong reading frame is used, translation generates a protein with the wrong amino acid sequence, which is not functional.

At what codon in the mRNA does the ribosome begin translation?

Recall there is a 5’ untranslated region of the messenger RNA.

The solution is that the ribosome begins translation at a specific AUG codon within the mRNA template termed the "Start Codon".

This is a methionine codon, so the first amino acid in proteins is almost always methionine.

Translation has Three Steps

Initiation - translation begins at start codon (AUG=methionine)

Elongation - the ribosome uses the tRNA anticodon to match codons to amino acids and adds those amino acids to the growing peptide chain

Termination - translation ends at the stop codon UAA, UAG or UGA
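The three steps above can be mimicked in a simplified Python sketch. It assumes the standard genetic code and, as a simplification, treats the first AUG in the sequence as the start codon (in cells the ribosome selects a specific AUG). The function translate_mrna and the example mRNA are hypothetical.

```python
from itertools import product

# Standard genetic code, one-letter amino acid codes, '*' = stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate_mrna(mrna):
    """Return the peptide encoded from the first AUG to the first in-frame stop."""
    # Initiation: locate the start codon; bases before it are untranslated.
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Elongation: step along the mRNA three bases at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        # Termination: UAA, UAG, or UGA ends the chain.
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

# Hypothetical mRNA: a short leader (GGC), then AUG GGA UGU AAG CGA UAA.
print(translate_mrna("GGCAUGGGAUGUAAGCGAUAAGG"))  # MGCKR
```

The result, M-G-C-K-R, corresponds to Met-Gly-Cys-Lys-Arg in one-letter codes.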

Translation Initiation

[Figure: Assembling to begin translation. The small ribosomal subunit binds the mRNA at the leader sequence near the 5' end. The initiator tRNA, carrying Met and bearing the anticodon UAC, pairs with the start codon. The large ribosomal subunit then joins to complete the ribosome, and tRNAs carrying further amino acids (Gly with anticodon CCU, then Cys) arrive.]

Translation Elongation

[Figure: Successive panels show the ribosome moving along the mRNA from 5' to 3' as tRNAs deliver Gly, Cys, Lys, and Arg. Each amino acid is added to the lengthening polypeptide (amino acid chain), giving Met - Gly - Cys - Lys - Arg. In the last panel the ribosome reaches the stop codon UAA and a release factor approaches.]

Translation Termination

[Figure: The ribosome reaches the stop codon (UAA). A release factor binds, and the completed polypeptide Met - Gly - Cys - Lys - Arg is released.]

Once stop codon is reached, elements disassemble.


Translation In the Cell

Multiple copies of a protein are made simultaneously

5'- G T A A T C C T C -3' DNA sense (partner) strand

3’- C A T T A G G A G -5’ DNA template (antisense) strand

5'- G U A A U C C U C -3' mRNA

N - val - ile - leu - C protein

By convention, amino acid sequences are written and numbered left-to-right from N-terminus to C-terminus.

tRNA is a connection between anticodon and amino acid

5'-AUG-3' codon in mRNA
   |||
3'-UAC-5' anticodon in tRNA

5'-CAU-3' if the anticodon is written 5'->3'
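The relationship between a codon and its anticodon can be sketched as complementing each base and, to write the result 5'->3', reversing it. This sketch assumes strict Watson-Crick pairing only; the function names are illustrative.

```python
# RNA base pairing: A with U, G with C.
PAIRING = str.maketrans("AUGC", "UACG")

def anticodon_3_to_5(codon):
    """Anticodon written 3'->5', aligned base by base under a 5'->3' codon."""
    return codon.translate(PAIRING)

def anticodon_5_to_3(codon):
    """The same anticodon written in the conventional 5'->3' direction."""
    return anticodon_3_to_5(codon)[::-1]

print(anticodon_3_to_5("AUG"))  # UAC
print(anticodon_5_to_3("AUG"))  # CAU
```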

RNA Splicing Depends on Sequence and Structure

http://bcs.whfreeman.com/thelifewire/content/chp14/1402001.html

Alternative splicing of exons forms distinct proteins: one gene, many proteins

Exon shuffling forms distinct proteins:
