Alternative Splicing (a review by Liliana Florea, 2005)
CS 498 SS
Saurabh Sinha
11/30/06
What is alternative splicing?
• The first result of transcription is “pre-mRNA”
• This undergoes “splicing”, i.e., introns are excised and exons remain, to form mRNA
• This splicing process may involve different combinations of exons, leading to different mRNAs, and different proteins
• This is alternative splicing
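A minimal sketch of the idea, assuming a toy pre-mRNA whose exon coordinates are already known. The sequence, the coordinates and the helper name `splice` are all made up for illustration; real splicing is carried out by the spliceosome, not by string slicing.

```python
def splice(pre_mrna, exons, keep):
    """Join the chosen exons (by index) in order; everything else is dropped."""
    return "".join(
        pre_mrna[start:end]
        for i, (start, end) in enumerate(exons)
        if i in keep
    )

# Toy pre-mRNA: exons in upper case, introns in lower case (display convention only).
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUCGGA" + "gugagucccuag" + "CAUUGA"
exons = [(0, 6), (18, 24), (36, 42)]  # half-open [start, end) coordinates

full = splice(pre_mrna, exons, keep={0, 1, 2})   # all three exons included
skipped = splice(pre_mrna, exons, keep={0, 2})   # middle exon skipped

print(full)     # AUGGCCUUCGGACAUUGA
print(skipped)  # AUGGCCCAUUGA
```

The same precursor gives two different mRNAs depending on which exons are kept, which is the essence of alternative splicing.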
Alternative splicing
• Important regulatory mechanism, for modulating gene and protein content in the cell
• Large-scale genomic data suggest that as many as 60% of human genes undergo alternative splicing
Significance
• Number of human genes has recently been estimated to be about 20-25 K.
• Not significantly greater than in much less complex organisms
• Alternative splicing is a potential explanation of how a large variety of proteins can be achieved with a small number of genes
• Errors in the splicing mechanism are implicated in diseases such as cancer
What happens in alternative splicing?
• Different combinations of exons within a gene are spliced from the RNA precursor, to be included in mRNA
• The combination depends on tissue type, developmental stage, disease etc.
• Thus different proteins are produced under these different conditions
• Different types of alternative splicing on next slide
http://bib.oxfordjournals.org/cgi/content/full/7/1/55/F1
exon inclusion/exclusion
alternative 5’ exon
alternative 3’ exon
intron retention
5’ alternative UTR
3’ alternative UTR
Bioinformatics of Alt. splicing
• Two main goals:
– Find cases of alt. splicing
  • What are the different forms (“isoforms”) of a gene?
– Find out how alt. splicing is regulated
  • What are the sequence motifs controlling alt. splicing, and deciding which isoform will be produced?
Identification of splice variants
• All cells have the same genome
• But not all cells have the same “transcriptome” (i.e., set of transcripts)
– Different cells may express different (alternative) transcripts of the same gene
• Goal of bioinformatics is to find “splice forms”, i.e., what are the alternative splicing events?
Identification of splice variants
• Direct comparison between sequences of different cDNA isoforms
– Q: What is cDNA? How is this different from a gene’s DNA?
– cDNA is “complementary DNA”, obtained by reverse transcription from mRNA. It has no introns
• Direct comparison reveals differences in the isoforms
• But this difference could be part of an exon, a whole exon, or a set of exons
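A rough sketch of direct comparison, assuming two toy cDNA isoforms that differ by a single contiguous insertion or deletion. The function name `differing_region` and the sequences are hypothetical; real tools use full sequence alignment rather than this prefix/suffix trick.

```python
def differing_region(iso_a, iso_b):
    """Return the parts of two isoforms left after trimming shared ends."""
    shortest = min(len(iso_a), len(iso_b))

    # Longest common prefix.
    p = 0
    while p < shortest and iso_a[p] == iso_b[p]:
        p += 1

    # Longest common suffix that does not overlap the prefix.
    s = 0
    while s < shortest - p and iso_a[-1 - s] == iso_b[-1 - s]:
        s += 1

    return iso_a[p:len(iso_a) - s], iso_b[p:len(iso_b) - s]

long_isoform = "ATGGCCTTCGGACATTGA"
short_isoform = "ATGGCCCATTGA"

print(differing_region(long_isoform, short_isoform))  # ('TTCGGA', '')
```

The output shows that six bases are present in one isoform and missing in the other, but not whether they form part of an exon, a whole exon, or several exons. That ambiguity is why the comparison is next done on exon-intron structures.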
[Figure: Bioinformatics methods for identifying alternative splicing; panel shown: direct comparison. Source: Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005. Copyright restrictions may apply.]
Identification of splice variants
• Comparison of exon-intron structures (the gene’s architecture)
• Where do the exon-intron structures come from?
– Align cDNA (no introns) with genomic sequence (with introns)
– This gives us the intron and exon structure
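The following toy sketch recovers an exon-intron structure by placing a cDNA on a genomic sequence. It assumes exact matches, exons in genomic order, and seeds that are unique in the genome; the function name `exon_structure` and the `seed` parameter are illustrative only. Real spliced-alignment programs must also tolerate sequencing errors and polymorphisms.

```python
def exon_structure(cdna, genome, seed=4):
    """Greedily place a cDNA on the genome; return exon blocks as (start, end)."""
    blocks = []
    c = 0  # current position in the cDNA
    g = 0  # leftmost genomic position still allowed
    while c < len(cdna):
        start = genome.find(cdna[c:c + seed], g)
        if start < 0:
            raise ValueError("no exact match for cDNA position %d" % c)
        # Extend the exact match as far as it goes.
        length = 0
        while (c + length < len(cdna) and start + length < len(genome)
               and cdna[c + length] == genome[start + length]):
            length += 1
        blocks.append((start, start + length))
        c += length
        g = start + length
    return blocks

genome = "ATGGCC" + "GTAAGTTTTCAG" + "TTCGGA" + "GTGAGTCCCTAG" + "CATTGA"
cdna = "ATGGCCTTCGGACATTGA"

exons = exon_structure(cdna, genome)
introns = [(a_end, b_start) for (_, a_end), (b_start, _) in zip(exons, exons[1:])]
print(exons)    # [(0, 6), (18, 24), (36, 42)]
print(introns)  # [(6, 18), (24, 36)]
```

Running this for several cDNAs of the same gene and comparing the resulting block lists is one simple way to spot alternative splicing events.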
[Figure: Bioinformatics methods for identifying alternative splicing; panel shown: comparison of exon-intron structures. Source: Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005]
Identification of splice variants
• Alignment tools
• Align cDNA sequence to genomic sequence
• Why shouldn’t this be a perfect match with gaps (introns)?
– Sequencing errors, polymorphisms, etc.
• Special-purpose alignment programs exist for this task
Identifying full-length alt. spliced transcripts
• The previous methods identify only parts of an alt. spliced transcript
• It is much more difficult to identify full-length alternatively spliced transcripts
• Methods for this include “gene indices”
Gene indices
• Compare all EST sequences against one another
• Identify significant overlaps
• Group and assemble sequences with compatible overlaps into clusters
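The three steps above can be sketched as an all-against-all overlap check followed by single-linkage grouping. This is an illustrative toy, not the procedure of any particular gene index: the names `overlap_len` and `cluster_ests`, the `min_overlap` threshold and the sequences are assumptions, overlaps must be exact, and no assembly step is performed.

```python
from itertools import combinations

def overlap_len(a, b):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def cluster_ests(ests, min_overlap=5):
    """Group ESTs linked by exact end overlaps or containment (union-find)."""
    parent = list(range(len(ests)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # All-against-all comparison: quadratic in the number of ESTs.
    for i, j in combinations(range(len(ests)), 2):
        a, b = ests[i], ests[j]
        linked = (max(overlap_len(a, b), overlap_len(b, a)) >= min_overlap
                  or a in b or b in a)
        if linked:
            parent[find(i)] = find(j)

    clusters = {}
    for i, est in enumerate(ests):
        clusters.setdefault(find(i), []).append(est)
    return sorted(clusters.values())

ests = ["ATGGCCTTCG", "CTTCGGACAT", "GACATTGA", "GGGTTTAAACCC", "AAACCCGGGT"]
for cluster in cluster_ests(ests):
    print(cluster)
```

The pairwise loop is where the quadratic cost comes from, and single-linkage grouping is also what lets similar sequences from different genes end up in one cluster.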
Problems with gene indices
• Overclustering: paralogs may get clustered together
– What are paralogs?
– Related but distinct genes in the same species
• Underclustering: if the number of ESTs is not sufficient
• Computationally expensive:
– Quadratic time complexity
Splice graphs
• Nodes: Exons
• Edges: Introns
• Gene: directed acyclic graph
• Each path in this DAG is an alternative transcript
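A small sketch of a splice graph as an adjacency list, with every source-to-sink path enumerated by depth-first search. The exon names and the graph itself are invented for illustration.

```python
def enumerate_transcripts(graph, source, sink):
    """Return every path from source to sink in a directed acyclic graph."""
    paths = []

    def walk(node, path):
        if node == sink:
            paths.append(path)
            return
        for nxt in graph.get(node, []):
            walk(nxt, path + [nxt])

    walk(source, [source])
    return paths

# Nodes are exons; an edge X -> Y means an intron joining X to Y was observed.
splice_graph = {
    "E1": ["E2", "E3"],    # E2 can be skipped
    "E2": ["E3"],
    "E3": ["E4a", "E4b"],  # two alternative versions of exon 4
    "E4a": ["E5"],
    "E4b": ["E5"],
}

transcripts = enumerate_transcripts(splice_graph, "E1", "E5")
for t in transcripts:
    print("-".join(t))
# E1-E2-E3-E4a-E5
# E1-E2-E3-E4b-E5
# E1-E3-E4a-E5
# E1-E3-E4b-E5
```

Two independent binary choices already give four candidate transcripts, so the number of paths can grow quickly with the number of alternative events.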
Splice graphs
• Combinatorially generate all possible alt. transcripts
• But not all such transcripts are going to be present
• Need scores for candidate transcripts, in order to differentiate between the biologically relevant ones and the artifactual ones
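One possible scoring scheme, shown only as an assumption since the slides do not specify one: score each candidate transcript by its weakest splice junction, i.e. the smallest number of ESTs supporting any exon-to-exon edge on the path. The support counts below are made up.

```python
def bottleneck_score(path, support):
    """Score a candidate transcript by its least-supported splice junction."""
    return min(support.get(edge, 0) for edge in zip(path, path[1:]))

# Hypothetical number of ESTs spanning each exon-exon junction.
support = {
    ("E1", "E2"): 12,
    ("E2", "E3"): 10,
    ("E1", "E3"): 1,
    ("E3", "E4"): 15,
}

candidates = [
    ["E1", "E2", "E3", "E4"],
    ["E1", "E3", "E4"],
]

ranked = sorted(candidates, key=lambda p: bottleneck_score(p, support), reverse=True)
for path in ranked:
    print(bottleneck_score(path, support), "-".join(path))
# 10 E1-E2-E3-E4
# 1 E1-E3-E4
```

A candidate supported by a single EST at one junction ranks low here and may be an artifact, although a low count can also reflect a rare but genuine isoform.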
Splice variants from microarray data
• Affymetrix GeneChip technology uses 22 probes collected from exons or straddling exon boundaries
• When an exon is alternatively spliced, expression level of its probes will be different in different experiments
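A rough sketch of how such a signal might be picked out, assuming probe intensities are already grouped by exon for two conditions. The intensities, the tissue names, the threshold of 1 and the normalisation by the gene-wide mean are illustrative choices, not a description of any published method.

```python
import math
from statistics import mean

def exon_log_ratios(cond_a, cond_b):
    """log2 ratio of each exon's gene-normalised signal between two conditions."""
    gene_a = mean(v for probes in cond_a.values() for v in probes)
    gene_b = mean(v for probes in cond_b.values() for v in probes)
    return {
        exon: math.log2((mean(cond_a[exon]) / gene_a) / (mean(cond_b[exon]) / gene_b))
        for exon in cond_a
    }

# Hypothetical probe intensities per exon in two tissues.
brain = {"E1": [100, 110], "E2": [95, 105], "E3": [100, 90]}
liver = {"E1": [200, 220], "E2": [20, 30], "E3": [190, 210]}

ratios = exon_log_ratios(brain, liver)
flagged = [exon for exon, r in ratios.items() if abs(r) > 1.0]
print(flagged)  # ['E2']
```

Dividing by the gene-wide mean separates a change in one exon from a change in overall expression of the gene; here the gene is more highly expressed in liver, yet only E2 changes relative to the rest.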
[Figure: Bioinformatics methods for identifying alternative splicing; panel shown: splice variants from microarray data. Source: Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005]
Part 2: Regulation of alternative splicing
Biological mechanism
• Splicing of pre-mRNA is a complex cellular process
• “Spliceosome” is a complex of several molecules that assembles onto each intron and catalyzes the excision of the intron
• Splice sites (5’ or donor splice site and 3’ or acceptor splice site) play a major role in splicing
• Additional sites in introns and exons, apart from the splice signals, also contribute to splicing
![Page 25: Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06](https://reader035.vdocument.in/reader035/viewer/2022062423/5697bfc61a28abf838ca6f4b/html5/thumbnails/25.jpg)
Biological mechanism
• Cis-regulatory elements (again !)
• Promote (“splicing enhancers”) or repress (“splicing silencers”) the inclusion of the exon in the mRNA
• Can be located in exons or introns
Bioinformatics methods
• Goal: find the cis-regulatory elements that mediate splicing (alternative splicing)
• Early work: find consensus sequences (motifs) of splicing enhancers
• More advanced work: Position weight matrices (PWMs)
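A minimal sketch of a PWM, assuming a handful of aligned example sites of equal length. The sites are invented, and the pseudocount and uniform background of 0.25 per base are illustrative choices. Each column stores a log-odds score per base, and a window is scored by summing one entry per column.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score(pwm, window):
    """Sum the log-odds of the observed base at each position."""
    return sum(col[base] for col, base in zip(pwm, window))

def best_hit(pwm, seq):
    """Return (score, position) of the highest-scoring window in seq."""
    k = len(pwm)
    return max((score(pwm, seq[i:i + k]), i) for i in range(len(seq) - k + 1))

# Hypothetical aligned enhancer-like sites.
sites = ["GAAGAA", "GAAGAT", "GACGAA", "GAAGAC"]
pwm = build_pwm(sites)

print(score(pwm, "GAAGAA") > score(pwm, "CCCTTT"))  # True
print(best_hit(pwm, "TTTTGAAGAATTTT")[1])           # 4
```

Unlike a consensus string, the matrix still gives partial credit to a site such as GACGAA that differs from the consensus at a tolerated position.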
[Figure: Bioinformatics representations of splicing regulatory motifs: (a) consensus sequence and (b) position weight matrix (PWM). Source: Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005]
Motif finding (again !)
• Statistical overrepresentation
• Find k-mers that occur more often in one class of sequences than in another
• Should be statistically significant
• Exonic splicing enhancers (ESE) are more likely to occur in exons than in introns; hence find 6-mers (k=6) statistically overrepresented in exons compared to introns
• Calculate z-score of count
– (Count - mean)/(standard deviation)
– Homework 1
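A sketch of the z-score calculation under one simple null model, stated here as an assumption: each k-mer position in the exons is an independent trial whose success probability is the k-mer's (pseudocount-smoothed) frequency in introns, so the count is binomial with mean n·p and standard deviation sqrt(n·p·(1-p)). The toy sequences use k=3 to keep the example small; the slides use k=6.

```python
import math
from collections import Counter

def kmer_counts(seqs, k):
    """Count every overlapping k-mer in a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts

def kmer_zscores(exons, introns, k=6, pseudocount=1.0):
    """z = (count - mean) / sd for each exon k-mer, with introns as background."""
    ex, intr = kmer_counts(exons, k), kmer_counts(introns, k)
    n_ex, n_in = sum(ex.values()), sum(intr.values())
    scores = {}
    for kmer, count in ex.items():
        # Background probability estimated from introns, smoothed over 4**k k-mers.
        p = (intr[kmer] + pseudocount) / (n_in + pseudocount * 4 ** k)
        expected = n_ex * p
        sd = math.sqrt(n_ex * p * (1 - p))
        scores[kmer] = (count - expected) / sd
    return scores

# Toy data, not real exons or introns.
exon_seqs = ["GAAGAAGAAGAA", "CCGAAGAACC"]
intron_seqs = ["TTTTTTTTTTTT", "CTCTCTCTCTCT"]

z = kmer_zscores(exon_seqs, intron_seqs, k=3)
top = max(z, key=z.get)
print(top, round(z[top], 2))
```

With real data, testing thousands of k-mers at once calls for a correction for multiple testing before a z-score is treated as significant.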
Motif finding
• Other standard approaches to motif finding have also been adopted:
– MEME & Gibbs sampling
• Comparative genomics
– Find conserved sites in introns
– Find conserved sites in exons. This has to be done carefully, because exons are already under selective pressure
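A toy sketch of the comparative step, assuming two orthologous intronic sequences that are already aligned to the same length. The sequences, the window width and the identity cutoff are hypothetical.

```python
def conserved_windows(seq_a, seq_b, width=6, min_identity=1.0):
    """Start positions of windows whose identity meets the cutoff."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    for i in range(len(seq_a) - width + 1):
        matches = sum(
            a == b and a != "-"  # alignment gaps never count as matches
            for a, b in zip(seq_a[i:i + width], seq_b[i:i + width])
        )
        if matches / width >= min_identity:
            hits.append(i)
    return hits

# Hypothetical aligned intron fragments from two species.
human = "TTACGGAAGAACTTAGC"
mouse = "CTGCAGAAGAAGTCAGT"

print(conserved_windows(human, mouse))  # [5]
```

In introns, a perfectly conserved window stands out against a poorly conserved background. In exons the coding constraint keeps most positions conserved anyway, so the same test is far less informative there.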
Summary
• Alternative splicing is very important
• Bioinformatics for finding alternatively spliced forms
• Bioinformatics for finding regulatory mechanisms