Post on 17-Dec-2015

Organisation of human genome

Nuclear genome (3.2 Gbp): 24 types of chromosomes; Y: 51 Mb and chr1: 279 Mb

Mitochondrial genome

Intergenic regions (junk)

Introns (junk)

Exons: 1.5%

The genome is empty?

Estimated number of genes:

• Saccharomyces cerevisiae (baker’s yeast): 6,034

• Drosophila melanogaster (fruit fly): 13,061

• Caenorhabditis elegans (roundworm): 19,099

• Arabidopsis thaliana (mustard plant): 25,000

INCREASING BIOLOGICAL COMPLEXITY REQUIRES GENOMIC CHANGES THAT INCREASE THE INFORMATIONAL CAPACITY OF THE SYSTEM...

...BUT THE NUMBER OF GENES IN THE VARIOUS SEQUENCED GENOMES DOES NOT MATCH EXPECTATIONS (APPARENTLY)

Amphimedon queenslandica 18693

Nasonia vitripennis 17279

Bos taurus >22790

Homo sapiens 21527

Mus musculus 22083

Trichoplax adhaerens 11514

Nematostella vectensis 18000

Danio rerio 21413

Drosophila melanogaster 13781

Ciona intestinalis 16000

Caenorhabditis elegans 20224

Gallus gallus <17000

Takifugu rubripes 18500

Xenopus tropicalis 18000

Strongylocentrotus purpuratus 23300

Anolis carolinensis 17000

Gorilla gorilla 21000

Pan troglodytes 21000

Oryza sativa 50000

Arabidopsis thaliana 26000

Glycine max 75778

Populus trichocarpa 45550
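The mismatch between gene number and apparent complexity is easy to see by sorting the counts listed above. The following is a minimal sketch, not part of the original slides: it uses only the entries given as plain numbers (those marked with < or > are left out), and the function and variable names are illustrative.

```python
# Gene counts copied from the list above; entries given only as
# bounds (e.g. ">22790", "<17000") are deliberately omitted.
gene_counts = {
    "Amphimedon queenslandica": 18693,
    "Nasonia vitripennis": 17279,
    "Homo sapiens": 21527,
    "Mus musculus": 22083,
    "Trichoplax adhaerens": 11514,
    "Danio rerio": 21413,
    "Drosophila melanogaster": 13781,
    "Caenorhabditis elegans": 20224,
    "Oryza sativa": 50000,
    "Arabidopsis thaliana": 26000,
    "Glycine max": 75778,
    "Populus trichocarpa": 45550,
}


def rank_by_gene_count(counts):
    """Return (species, count) pairs sorted from most to fewest genes."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_gene_count(gene_counts)
human_rank = [species for species, _ in ranking].index("Homo sapiens") + 1

for species, count in ranking:
    print(f"{species:28s}{count:>7d}")
print(f"Homo sapiens ranks {human_rank} of {len(ranking)}")
```

In this subset the four plants come out on top, and human sits next to mouse and zebrafish rather than at the head of the list.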

Why doesn’t (coding) gene number matter?

• More sophisticated regulation of expression?

• Proteome vastly larger than genome?

– Alternative splicing

– RNA editing

• Post-translational modifications

• Cellular location

…but, remember there are other genes

Genes in the genome:

• Protein-coding genes (mRNA): around 20500 (as of 10/2012)

• Non-coding RNAs

– Ribosomal RNA (rRNA)

– Transfer RNA (tRNA)

– Small nuclear RNA (snRNA)

– Small nucleolar RNA (snoRNA)

– microRNA (miRNA)

– Other non-coding RNAs (Xist, 7SK, etc.)

• Pseudogenes

Non-polypeptide-coding genes: RNA-encoding

Statistics about the current Gencode freeze (version 13)

* The statistics derive from the gtf files, which include only the main chromosomes of the human reference genome.

Version 13 (March 2012 freeze, GRCh37)

General stats

• Total No of Genes: 55123

• Protein-coding genes: 20670

• Long non-coding RNA genes: 12393

• Small non-coding RNA genes: 9173

• Pseudogenes: 13123

• Total No of Transcripts: 182967

• Protein-coding transcripts: 77901

• Long non-coding RNA loci transcripts: 19835

• Total No of distinct translations: 78119

• Genes that have more than one distinct translations: 14235
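A few ratios make the Gencode figures easier to read. The sketch below is an illustration added here, not part of the Gencode release. It uses only the numbers quoted above, and the dictionary keys are made-up names.

```python
# Figures from the Gencode version 13 statistics quoted above.
gencode_v13 = {
    "total_genes": 55123,
    "protein_coding_genes": 20670,
    "total_transcripts": 182967,
    "protein_coding_transcripts": 77901,
    "genes_with_multiple_translations": 14235,
}

# Share of all annotated genes that are protein-coding.
coding_share = (
    gencode_v13["protein_coding_genes"] / gencode_v13["total_genes"]
)

# Average number of annotated transcripts per protein-coding gene.
transcripts_per_coding_gene = (
    gencode_v13["protein_coding_transcripts"]
    / gencode_v13["protein_coding_genes"]
)

# Share of protein-coding genes with more than one distinct translation.
multi_translation_share = (
    gencode_v13["genes_with_multiple_translations"]
    / gencode_v13["protein_coding_genes"]
)

print(f"Protein-coding share of genes:      {coding_share:.1%}")
print(f"Transcripts per protein-coding gene: {transcripts_per_coding_gene:.2f}")
print(f"Genes with >1 distinct translation:  {multi_translation_share:.1%}")
```

On these figures only a little over a third of annotated genes are protein-coding, and each of those has several annotated transcripts on average, which is the sense in which the proteome can be larger than the gene count suggests.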

Protein-coding genes (mRNA):

Human genes and their homology to genes from other organisms

Noncoding regions in coding genes

• Regulatory regions

– RNA polymerase binding site

– Transcription factor binding sites

– Polyadenylation [poly(A)] sites

– Enhancers

• 5’- and 3’-UTRs

CODING GENES

DNA as a series of ‘docking’ sites

It is the relative location of these docking sites to one another that permits genes to be transcribed, spliced, and translated properly and in specific spatial and temporal patterns.

…some more statistics

• Gene density: 1 gene per 100 kb (varies widely)

• On average 9 exons per gene

• 363 exons in the titin gene

• Many genes are intronless

• Largest intron: 800 kb (WWOX gene)

• Smallest introns: 10 bp

• Average 5’ UTR: 0.2-0.3 kb

• Average 3’ UTR: 0.77 kb, but underestimated…

• Largest protein: titin (38,138 aa)

• Largest gene: dystrophin
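Two of these figures can be turned into quick back-of-the-envelope numbers. The following sketch is an added illustration under stated assumptions: it applies the quoted average density uniformly across the 3.2 Gbp nuclear genome (the slide itself notes that density varies widely), and it treats all 363 titin exons as coding, which ignores untranslated exons. The constant names are illustrative.

```python
GENOME_SIZE_BP = 3_200_000_000   # nuclear genome, 3.2 Gbp
BP_PER_GENE = 100_000            # quoted density: 1 gene per 100 kb
TITIN_LENGTH_AA = 38_138         # largest protein
TITIN_EXONS = 363

# Gene count implied by applying the average density genome-wide.
genes_implied_by_density = GENOME_SIZE_BP // BP_PER_GENE

# Coding nucleotides needed for titin: 3 nt per residue plus a stop codon.
titin_coding_nt = TITIN_LENGTH_AA * 3 + 3

# Mean coding length per exon, assuming every exon is coding (an assumption).
titin_mean_coding_nt_per_exon = titin_coding_nt / TITIN_EXONS

print(f"Genes implied by 1 gene/100 kb: {genes_implied_by_density}")
print(f"Titin coding sequence: {titin_coding_nt} nt")
print(f"Mean coding length per titin exon: {titin_mean_coding_nt_per_exon:.0f} nt")
```

The density-based estimate is a rough figure covering all gene types, so it should not be expected to match the protein-coding gene count.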

Human genes vary enormously in size and exon content

An example of a complex human gene locus: INK4a-ARF

From: Prof. Gordon Peters’ website

Genes within genes

Intron 26 of the neurofibromatosis gene (NF1) encodes:

• OMGP (oligodendrocyte myelin glycoprotein)

• EVI2A and EVI2B (homologues of ecotropic viral integration sites in mouse)

Why doesn’t gene number matter?

• More sophisticated regulation of expression

• Proteome vastly larger than genome

– Alternative splicing

– RNA editing…

• Post-translational modifications

• Co-option

• Connectivity of gene regulatory networks (GRNs)

DYNAMIC NETWORKS

Table 1. Levels of regulation--loci of control constraints--above the genome.

Levels and transitions: Dynamic regulatory system

1. Genome to transcriptome: Epigenetic regulation of gene expression (5). Includes pathways that detect energy levels (redox levels) and repress DNA transcription when cellular NADH levels are increased.

2. Transcriptome to proteome: Regulatory constraints include post-translational modification of proteins.

3. Proteome to dynamic system: Metabolic networks of glycolysis and mitochondrial oxidation-reduction are the dynamic systems presently the best understood in terms of both mechanism of formation and operating principles. They display control distributed over all enzymes of a network, and their phenotype includes cellular redox potential.

4. Dynamic systems to phenotype: Control of global phenotype such as disease may be localized to a single regulatory system (such as metabolic, hormone signaling, etc.) or be distributed over many systems and levels.

Gene Expression

• The products of genes may be RNA or protein

• RNA and protein synthesis occur in many steps

• These steps are regulated and controlled

UCSC

Location of CpG islands in the gene

CpG islands do NOT have a deficit of CpG dinucleotides
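"No deficit of CpG" is usually quantified as the ratio of observed to expected CpG dinucleotides. The sketch below is an added illustration, not taken from the slides. The default thresholds (length of at least 200 bp, GC content of at least 50%, observed/expected of at least 0.6) follow commonly cited criteria and should be treated as adjustable assumptions. The function names are illustrative.

```python
def cpg_stats(sequence):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence.

    Expected CpG count is (number of C * number of G) / length, i.e. what
    base composition alone would predict.
    """
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself
    gc_content = (c_count + g_count) / length
    expected = c_count * g_count / length
    obs_exp = cpg_count / expected if expected else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(sequence, min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the (assumed) length, GC and observed/expected thresholds."""
    gc_content, obs_exp = cpg_stats(sequence)
    return (
        len(sequence) >= min_length
        and gc_content >= min_gc
        and obs_exp >= min_obs_exp
    )


# Toy sequences: one saturated with CpG, one GC-rich but with no CpG at all.
cpg_rich = "CG" * 100
cpg_depleted = "GCATGCAT" * 30

print(cpg_stats(cpg_rich), looks_like_cpg_island(cpg_rich))
print(cpg_stats(cpg_depleted), looks_like_cpg_island(cpg_depleted))
```

The second toy sequence has the same C and G content per position as many genomic regions, yet no CpG dinucleotides. This depleted pattern is what bulk genomic DNA tends towards, whereas CpG islands do not.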

How epigenetics works

[Figure labels: Promoter Region, Gene, CpG Island; legend: CpG, methylated CpG]

Unmethylated CpGs relax chromatin

[Figure labels: Gene, RNA, Proteins]

Methylated CpGs constrain chromatin

[Figure labels: Gene, RNA, Proteins]

Chromatin Modification

• Chromatin remodeling: SNF/SWI

• Histone modification: acetylation, ubiquitination, sumoylation, methylation, phosphorylation

• DNA methylation: CpG dinucleotides, MeCP2

• Histone substitution: H2AZ, H2Ax, H3.3

• Transcription factor modification: acetylation, phosphorylation

Eukaryotic transcription regulation

Modular construction and combinatorial control

• The regulatory sequence (cis element) on DNA consists of multiple motifs specific for transcription factors.

• Multiple transcription factors can bind simultaneously to the regulatory sequences and act together on the transcription of the gene.
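Combinatorial control means that a handful of factors can specify many distinct regulatory states. The sketch below is an added illustration with hypothetical factor names (TF_A, TF_B, TF_C). It simply enumerates every non-empty set of factors that could be bound together, without modelling which sets are biologically allowed.

```python
from itertools import combinations


def factor_combinations(factors):
    """Yield every non-empty combination of transcription factors."""
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            yield combo


# Hypothetical factor names, used only for illustration.
factors = ["TF_A", "TF_B", "TF_C"]
combos = list(factor_combinations(factors))

for combo in combos:
    print(" + ".join(combo))
print(f"{len(factors)} factors -> {len(combos)} possible bound combinations")
```

For n factors the count is 2^n - 1, so the number of possible combinations grows much faster than the number of factors.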

Regulated Transcription

[Figure labels: transcriptional activators binding to promoter region, co-activator protein, general transcription factors, TBP, TATA, -35, Gene X]

Activators stimulate the highly cooperative assembly of initiation complexes

Figure 10-60

Binding sites for activators that control transcription of the mouse TTR gene

Model for cooperative assembly of an activated transcription-initiation complex in the TTR promoter

Figure 10-61

(TTR = transthyretin)

Distant Cis-Acting Elements

• Locus Control Region: regulatory site required for optimal expression of an adjacent group of genes

• Insulator Element: prevents activation/repression extending to an adjacent regulatory sequence

ALTERNATIVE PROMOTERS

SEX-SPECIFIC REGULATION IN THE DNMT1 (METHYLTRANSFERASE) GENE: OOCYTE, SOMATIC, OR SPERMATOCYTE PROMOTERS

Posttranscriptional control

• Regulation of RNA processing

• Regulation of mRNA degradation

• Regulation of translation

mRNA: many places for variation, modification, regulation

• transcription

– initiation

– elongation

– termination

• 5’ capping

• 3’ polyA addition

– alternative sites

• splicing

– alternative exons

– self-splicing, spliceosome-mediated

• editing

– changing bases and codons

• nuclear export

– mature mRNA only

• stability

– nonsense-mediated decay

– degradation signals

• sequestration

– localization in cytoplasmic compartments

– access to translation machinery

• antisense/RNA interference

– inhibit translation

The PolyA Site (PAS)

[Figure labels: 3’ exon, stop, UTR, poly(A) tail (AAAA), PAS, PolyA signal AATAAA, ~17 nt]
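Candidate polyA signals can be located with a simple hexamer scan. The following is a minimal sketch added for illustration: it looks for the AATAAA signal named above plus the AUUAAA variant that appears in the alternative-PAS material. The example sequence is made up, and a real analysis would also weigh downstream elements and position relative to the cleavage site.

```python
import re

# Hexamers to scan for, written as DNA; RNA input is converted (U -> T).
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(sequence):
    """Return sorted (0-based position, hexamer) pairs for each signal hit."""
    seq = sequence.upper().replace("U", "T")
    hits = []
    for signal in POLYA_SIGNALS:
        # A lookahead reports overlapping occurrences as well.
        for match in re.finditer(f"(?={signal})", seq):
            hits.append((match.start(), signal))
    return sorted(hits)


# Made-up 3' UTR fragment containing one copy of each hexamer.
example_utr = "GCUGAAUAAAGCCUUGACAUUAAAGGC"
hits = find_polya_signals(example_utr)
print(hits)
```

Several hits within one 3’ UTR are what make alternative polyadenylation possible: each signal defines a potential end for the transcript.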

Alternative polyadenylation sites

Alternative PAS & Post-transcriptional (de)regulation

[Figure: a coding sequence followed by a 3’ UTR carrying several AUUAAA polyA signals; the region between alternative sites is a possible regulatory element (stability, translation, transport)]

Use of an abnormal polyA site is associated with various diseases:

• α/β thalassemia (globin)

• Mantle cell lymphoma (Cyclin CCND1)

• Teratocarcinoma (PDGF)

• Hypertension (Ca2+ ATPase)

Consensus nucleotides at intron/exon junctions

Alternative splicing is a mechanism for generating functional diversity
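The diversity generated by splicing can be illustrated by counting the transcripts a gene could produce if some exons are optional. The sketch below is an added illustration with a hypothetical four-exon gene. It models only independent cassette exons (each either included or skipped) and ignores other modes such as alternative splice sites or mutually exclusive exons.

```python
from itertools import product


def enumerate_isoforms(exons):
    """List the exon combinations for a gene with optional cassette exons.

    exons: list of (name, is_constitutive) pairs in transcript order.
    Constitutive exons are always included; others may be kept or skipped.
    """
    choices = [
        (True,) if constitutive else (True, False)
        for _, constitutive in exons
    ]
    isoforms = []
    for included in product(*choices):
        isoforms.append(
            tuple(name for (name, _), keep in zip(exons, included) if keep)
        )
    return isoforms


# Hypothetical gene: E1 and E4 constitutive, E2 and E3 cassette exons.
gene = [("E1", True), ("E2", False), ("E3", False), ("E4", True)]
isoforms = enumerate_isoforms(gene)

for isoform in isoforms:
    print("-".join(isoform))
```

With k independent cassette exons this gives 2^k isoforms from a single gene.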

Alternative processing example

RNA editing is a rare form of post-transcriptional processing whereby base-specific changes are enzymatically introduced at the RNA level. Types of RNA editing in humans:

(i) C → U, catalysed by a specific cytidine deaminase

e.g. The expression of the human apolipoprotein B gene in the intestine involves tissue-specific RNA editing

(ii) A → I, the amino group on carbon 6 of adenine is replaced by a carbonyl group. Inosine (I) then acts as a G. Occurs in some ligand-gated ion channels.

(iii) U → C, in mRNA of the WT1 Wilms’ tumor gene

(iv) U → A, in alpha-galactosidase mRNA
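The effect of the first two editing types on a codon can be sketched in a few lines. This is an added illustration: the codons used (CAA → UAA for apolipoprotein B, CAG → CIG for a glutamate receptor subunit) are the standard textbook examples rather than details given in the slides, and the codon table contains only the four entries the example needs.

```python
# Minimal codon table: only the codons used in this example.
CODON_TABLE = {
    "CAA": "Gln",
    "UAA": "Stop",
    "CAG": "Gln",
    "CGG": "Arg",
}


def edit_base(codon, position, new_base):
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


def read_codon(codon):
    """Translate a codon, reading inosine (I) as G."""
    return CODON_TABLE[codon.replace("I", "G")]


# (i) C -> U editing: a glutamine codon becomes a stop codon.
apob_unedited = "CAA"
apob_edited = edit_base(apob_unedited, 0, "U")

# (ii) A -> I editing: inosine is read as G, changing Gln to Arg.
receptor_unedited = "CAG"
receptor_edited = edit_base(receptor_unedited, 1, "I")

print(apob_unedited, read_codon(apob_unedited), "->", apob_edited, read_codon(apob_edited))
print(receptor_unedited, read_codon(receptor_unedited), "->", receptor_edited, read_codon(receptor_edited))
```

The premature stop introduced by C → U editing is how one gene yields both the full-length Apo B-100 and the shorter Apo B-48.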

RNA editing

[Figure labels: Apo B-100, Apo B-48]

Gene Expression

• The products of genes may be RNA or protein

• RNA and protein synthesis occur in many steps

• These steps are frequently regulated

1. Proteolysis

2. Glycosylation

3. Attachment of lipids:

– myristoylation

– prenylation (farnesyl or geranylgeranyl)

– palmitoylation

4. Attachment of glycolipids

5. Protein phosphorylation

Post-translational modifications that alter activity of the p53 protein. Enzymes that have been shown to modify specific amino acid residues of p53 are shown. Enzymes that inhibit the covalent modifications are indicated in red. P, phosphorylation; R, ribosylation; Ac, acetylation.

…increasing informational capability of the genome, but there are other genes….
