Post on 21-Dec-2015
The Transcriptional Landscape of the Mammalian Genome
RIKEN and FANTOM Consortium Carninci, et al (2005)
Chris Chander, Luke Adea
BioSci D145
Feb. 12, 2015
How many “genes” do we have?
● Do humans only have 30,000 genes?
● Raises the question of what constitutes a gene
● There is a significant portion of the genome that does not encode protein
○ what transcripts are derived from those regions?
Why do we want to analyze RNA transcripts?
● Identify protein-coding transcripts
○ as well as the function of noncoding RNAs (ncRNAs)
● Understand transcriptional regulation
○ both in differentiation and development
● Understand transcript conservation
○ if a sequence is conserved, then it may have functional relevance
The Transcriptional Landscape
● Pattern of transcriptional control signals and the transcripts generated
Ditag Technologies
● Ditags are short sequences at the 5’ and 3’ ends of a DNA fragment
○ Gene identification signature (GIS)
○ Cap-analysis gene expression (CAGE)
○ Gene signature cloning (GSC)
allows acquisition of rare genes
Approach
● Combined full-length cDNA isolation with ditag technologies
● to identify initiation and termination sites
Cap-Analysis Gene Expression (CAGE)
Methods
● 1 million CAGE tags produced from 2 HepG2 CAGE libraries
○ one constructed with random primers
○ the other with oligo-dT primers
● CAGE tags mapped to genome
○ identified likely promoters
○ transcription start sites (TSS)
○ genomic span of primary transcript
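The tag-mapping step can be illustrated with a minimal sketch. It assumes each mapped CAGE tag has already been reduced to the genomic position of its 5’ end, and it merges nearby tags on the same strand into candidate TSS clusters. The function names, the tuple layout, and the 20 bp `max_gap` are hypothetical choices for illustration, not values taken from the paper.

```python
from collections import Counter, defaultdict


def summarize(chrom, strand, positions):
    """Describe one cluster of sorted 5' tag positions."""
    counts = Counter(positions)
    # Representative TSS: best-supported position, ties go to the smallest coordinate.
    peak = min(counts, key=lambda p: (-counts[p], p))
    return {
        "chrom": chrom,
        "strand": strand,
        "start": positions[0],
        "end": positions[-1],
        "tags": len(positions),
        "peak": peak,
    }


def cluster_tags(tags, max_gap=20):
    """Group (chrom, strand, pos) tags into clusters separated by more than max_gap bp.

    max_gap is an illustrative threshold, not a published parameter.
    """
    by_strand = defaultdict(list)
    for chrom, strand, pos in tags:
        by_strand[(chrom, strand)].append(pos)

    clusters = []
    for (chrom, strand), positions in sorted(by_strand.items()):
        positions.sort()
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] <= max_gap:
                current.append(pos)
            else:
                clusters.append(summarize(chrom, strand, current))
                current = [pos]
        clusters.append(summarize(chrom, strand, current))
    return clusters


# Made-up tags: three close together on the + strand, one distant, one on the - strand.
example_tags = [
    ("chr1", "+", 1000),
    ("chr1", "+", 1003),
    ("chr1", "+", 1003),
    ("chr1", "+", 1500),
    ("chr1", "-", 1002),
]
tss_clusters = cluster_tags(example_tags)
```

Tags on opposite strands are never merged, so the tag at position 1002 on the minus strand forms its own cluster even though it sits inside the plus-strand cluster.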
Genome-Transcriptome Relation
● Full-length cDNA and GSC ditags distribute together
● Mega-transcripts found at upper end of distribution
Transcriptional unit vs transcriptional framework
● Transcriptional unit (TU)
○ mRNAs that share at least one nucleotide
○ same genomic location
○ same genomic orientation
○ However, TU fusion can join unrelated transcripts
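The TU definition above can be sketched as a grouping problem. This is a minimal illustration, assuming transcripts are given as exon intervals with half-open coordinates and that "sharing a nucleotide" is tested on exons; the data layout and function names are hypothetical. Transcripts on the same chromosome and strand are linked whenever they share a base, and linked transcripts are merged transitively.

```python
from collections import defaultdict
from itertools import combinations


def exons_overlap(exons_a, exons_b):
    """True if any half-open [start, end) exon of a shares a base with an exon of b."""
    return any(s1 < e2 and s2 < e1 for s1, e1 in exons_a for s2, e2 in exons_b)


def group_into_tus(transcripts):
    """transcripts: dict of id -> (chrom, strand, [(start, end), ...])."""
    parent = {tid: tid for tid in transcripts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Only transcripts at the same location and orientation can join a TU.
    by_locus = defaultdict(list)
    for tid, (chrom, strand, _) in transcripts.items():
        by_locus[(chrom, strand)].append(tid)

    for ids in by_locus.values():
        for a, b in combinations(ids, 2):
            if exons_overlap(transcripts[a][2], transcripts[b][2]):
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for tid in transcripts:
        groups[find(tid)].add(tid)
    return sorted(sorted(group) for group in groups.values())


# Made-up transcripts: t1 and t4 never overlap each other, but both overlap t2.
example_transcripts = {
    "t1": ("chr1", "+", [(100, 200), (300, 400)]),
    "t2": ("chr1", "+", [(350, 500)]),
    "t3": ("chr1", "-", [(150, 250)]),
    "t4": ("chr1", "+", [(450, 600)]),
}
tus = group_into_tus(example_transcripts)
```

In this toy input, t1 and t4 end up in the same TU only because t2 bridges them, which is the fusion effect noted in the last bullet. The antisense transcript t3 stays in its own TU despite overlapping t1 in genomic coordinates.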
Transcriptional unit vs transcriptional framework cont...
● Transcriptional framework (TK)
○ group of transcripts that share
■ common expressed regions
■ splicing events
■ transcriptional start sites
■ termination events
Genome has much more transcription than expected
● TKs are closely associated in what are called transcriptional forests (TFs).
● Transcription in TFs occurs without gaps and can occur on either strand.
○ transcripts vary in size, some up to 1 Mb
● Based on the number of transcripts detected with the CAGE method, there is an order of magnitude more transcripts than “genes” in mice
● Genome tiling arrays suggest 10x more transcripts encoded than the number of “genes” in humans
Transcript diversity in start, splicing and termination sites
● Transcription initiation can occur in any region of a gene.
● 65% of TUs contain multiple splice variants
● Alternative termination sites discovered via analysis of transcript 3’ ends
Intergenic distances
● Compares distance between TUs
● Shorter distance between genes in the tail-to-tail configuration (3’ end of one strand to 3’ end on antisense strand)
● Suggests antisense regulatory mechanism for downstream genes
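The comparison above can be sketched as follows. This is a minimal illustration that assumes each TU is reduced to a single span `(chrom, start, end, strand)` with half-open coordinates; the function name and example coordinates are hypothetical. Neighboring TUs are classified by how their ends face each other, and the gap between them is reported.

```python
def adjacent_gaps(tus):
    """Return (configuration, gap in bp) for each pair of neighboring TUs.

    tus: list of (chrom, start, end, strand) with half-open coordinates.
    Neighbors are taken from coordinate-sorted order, which is a simplification
    when one TU is nested inside another.
    """
    # Left TU on + and right TU on -: the two 3' ends face each other.
    # Left TU on - and right TU on +: the two 5' ends face each other.
    labels = {("+", "-"): "tail-to-tail", ("-", "+"): "head-to-head"}
    result = []
    ordered = sorted(tus)
    for left, right in zip(ordered, ordered[1:]):
        if left[0] != right[0]:
            continue  # different chromosomes, no shared intergenic region
        gap = right[1] - left[2]
        if gap < 0:
            continue  # overlapping TUs have no intergenic gap
        config = labels.get((left[3], right[3]), "head-to-tail")
        result.append((config, gap))
    return result


# Made-up coordinates chosen so that each configuration appears once.
example_tus = [
    ("chr1", 100, 500, "+"),
    ("chr1", 600, 900, "-"),
    ("chr1", 2000, 2500, "+"),
    ("chr1", 4000, 4800, "+"),
]
gaps = adjacent_gaps(example_tus)
```

Collecting the gaps per configuration over a whole genome annotation would give the distributions that the slide compares; the toy values here carry no biological meaning.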
Conservation of Promoters
● Promoter regions of ncRNAs are more conserved than those of coding RNAs
● Suggests that ncRNAs display positional conservation
Implications
● Suggests much more transcription and transcript diversity than previously thought
○ numerous transcript variants in one gene
○ at least 10 times more transcripts than number of genes
● Will lead to further study of ncRNA function
○ ncRNAs contain multiple regulatory elements
● Transcription occurs on both strands
○ Genome manipulation in mice may affect more than one TK
Critiques and Limitations
● Extensive use of acronyms and new terminology
● The figure legends do not explain the graphs clearly
○ ambiguous color schemes
Further Reading
● Long noncoding RNA function, NEAT1
Imamura, K. Long noncoding RNA NEAT1-dependent SFPQ relocation from promoter region to paraspeckle mediates IL8 expression upon immune stimuli. 2014. Mol Cell. 6;53(3):393-406
Future Directions
● Further characterize functions for ncRNA
● Establish a more encompassing definition of a gene (that includes or distinguishes ncRNA)