7/31/2008 1
Computational Bioinformatics: Software and Databases
Jason T. L. Wang, Professor
Bioinformatics Program and Computer Science Department
New Jersey Institute of Technology
http://web.njit.edu/~wangj
Work supported by NSF grant IIS-0707571
Presentation for NSF-Sponsored C2PRISM Program
Outline
• Introduction to Bioinformatics
• Introduction to Computational RNA Genomics (Our Current Project)
• RNA Informatics Tools
• RNA Databases
• Bioinformatics Center
• Conclusion and Future Work
Gene:
• Genetic information-containing elements
• Distributed to each cell when the cell divides
• Made of deoxyribonucleic acid (DNA)
Gene Structure:
• Promoter
• Start codon
• Introns
• Exons
• Stop codon
• etc.
Gene:
• Transcription: DNA to RNA (pre-mRNA)
• RNA splicing: remove introns to produce mRNA
• Translation: mRNA to protein
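The three steps above can be sketched in a few lines of Python. This is a toy illustration, not part of the original slides: the exon and intron strings, the helper names (transcribe, splice, translate), and the intron coordinates are all assumptions made up for the example. Transcription is simplified to copying the coding strand with T replaced by U.

```python
# Standard genetic code, built from the conventional UCAG codon ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """Simplified transcription: coding-strand DNA to RNA (T becomes U)."""
    return dna.replace("T", "U")


def splice(pre_mrna, introns):
    """Remove introns given as (start, end) half-open index pairs."""
    parts, pos = [], 0
    for start, end in sorted(introns):
        parts.append(pre_mrna[pos:start])  # keep the exon before this intron
        pos = end                          # skip over the intron
    parts.append(pre_mrna[pos:])           # keep the final exon
    return "".join(parts)


def translate(mrna):
    """Translate from the first AUG until a stop codon (shown as '*')."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene: three short exons separated by two made-up introns.
exon1, intron1 = "ATGGTGCAC", "GTAAGTCTTAG"
exon2, intron2 = "CTGACTCCT", "GTGAGTTTCAG"
exon3 = "GAGTAA"
gene = exon1 + intron1 + exon2 + intron2 + exon3

# Intron coordinates are computed from the toy pieces above.
i1_start = len(exon1)
i1_end = i1_start + len(intron1)
i2_start = i1_end + len(exon2)
i2_end = i2_start + len(intron2)

pre_mrna = transcribe(gene)
mrna = splice(pre_mrna, [(i1_start, i1_end), (i2_start, i2_end)])
protein = translate(mrna)
print(pre_mrna)
print(mrna)
print(protein)
```

In a real gene the intron boundaries are not known in advance; they are recognized through donor and acceptor splice sites, which this sketch does not model.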
Exon 1 Intron 1 Exon 2 Intron 2 Exon 3
CCCTGTGGAGCCACACCCTAGGGTTGGCCAATCTACTCCCAGGAGCAGGGAGGGCAGGAGCCAGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGACACAACTGTGTTCACTAGCAACTCAAACAGACACCATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCATGTGGAGACAGAGAAGACTCTTGGGTTTCTGATAGGCACTGACTCTCTCTGCCTATTGGTCTATTTTCCCACCCTTAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGGTGAGTCTATGGGACCCTTGATGTTTTCTTTCCCCTTCTTTTCTATGGTTAAGTTCATGTCATAGGAAGGGGAGAAGTAACAGGGTACAGTTTAGAATGGGAAACAGACGAATGATTGCATCAGTGTGGAAGTCTCAGGATCGTTTTAGTTTCTTTTATTGCTGTTCATAACAATTGTTTTCTTTTGTTTAATTCTTGCTTTCTTTTTTTTTCTTCTCCGCAATTTTTACTATTATACTTAATGCCTTAACATTGTGTATAACAAAAGGAAATATCTCTGAGATACATTAAGTAACTTAAAAAAAAACTTTACACAGTCTGCCTAGTACATTACTATTTGGAATATATGTGTGCTTATTTGCATATTCATAATCTCCCTACTTTATTTTCTTTTATTTTTAATTGATACATAATCATTATACATATTTATGGGTTAAAGTGTAATGTTTTAATATGTGTACACATATTGACCAAATCAGGGTAATTTTGCATTTGTAAATTTTAAAAAATGCTTTCTTCTTTTAATATACTTTTTTGTTTATCTTATTTCTAATACTTTCCCTAATCTCTTTCTTTCAGGGCAATAATGATACAATGTATCATGCCTCTTTGCACCATTCTAAAGAATAACAGTGATAATTTCTGGGTTAAGGCAATAGCAATATTTCTGCATATAAATATTTCTGCATATAAATTGTAACTGATGTAAGAGGTTTCATATTGCTAATAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTTTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTATCTTCCTCCCACAGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACTGGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGCAATGATGTATTTAAATTATTTCTGAATATTTTACTAAAAAGGGAATGTGGGAGGTCAGTGCATTTAAAACATAAAGAAATGATGAGCTGTTCAAACCTTGGGAAAATACACTATATCTTAAACTCCATGAAAGAAGGTGAGGCTGCAACCAGCTAATGCACATTGGCAACAGCCCCTGATGCCTAATGCACATTGGCAACAGCCCCTGATGCCTATGCCTTATTCATCCCTCAGAAAAGGATTCTTGTAGAGGCTTGATTTGCAGGTTAAAGTTTTGCTATGCTGTATTTTACATT
ACTTATTGTTTTAGCTGTCCTCATGAATGTCTTTTC
Gene Structure and Gene Expression
Figure labels: the gene (promoter, start codon, Exon 1, Intron 1, Exon 2, Intron 2, Exon 3, stop codon) is transcribed into pre-mRNA; splicing removes the introns at their donor and acceptor sites and rejoins the exons into mRNA; translation of the mRNA yields protein.
Computational RNA Genomics
• Biochemical and genetic studies have demonstrated many functions associated with the UTRs in mRNAs.
• Unlike for proteins, sequence search alone is insufficient for detecting similarity between RNAs, because RNAs with different sequences can share the same secondary structure.
Sequence Similarity vs. Structural Similarity
>NM_000032
UUCGUUCGUCCUCAGUGCAGGGCAACAGGA
((((((.(((((......)))))))).)))
>NM_014585
CAACUUCAGCUACAGUGUUAGCUAAGUUUG
((((((.(((((......)))))))).)))
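The contrast on this slide can be checked with a short Python sketch. It compares the two sequences position by position and compares the two dot-bracket structures by their sets of base pairs. The function names (base_pairs, identity) are assumptions introduced for illustration; this is a simple comparison, not the RSmatch or RADAR algorithm.

```python
def base_pairs(structure):
    """Return the set of (i, j) paired positions in a dot-bracket string."""
    stack, pairs = [], set()
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def identity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# The two records shown on the slide.
seq_1 = "UUCGUUCGUCCUCAGUGCAGGGCAACAGGA"      # NM_000032
struct_1 = "((((((.(((((......)))))))).)))"
seq_2 = "CAACUUCAGCUACAGUGUUAGCUAAGUUUG"      # NM_014585
struct_2 = "((((((.(((((......)))))))).)))"

seq_identity = identity(seq_1, seq_2)
same_pairs = base_pairs(struct_1) == base_pairs(struct_2)

print(f"sequence identity: {seq_identity:.0%}")  # 12 of 30 positions match
print(f"identical base pairs: {same_pairs}")
```

The sequences agree at well under half of their positions, yet both fold into exactly the same set of 11 base pairs, which is why a structure-aware comparison is needed.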
RSmatch (BMC Bioinformatics 2005) and RADAR (Nucleic Acids Research 2007)
Alignment of two RNA secondary structures, where the local matches found by RSmatch are in green.
Multiple Structural Alignment
GLEAN-UTR Database (BMC Genomics 2008)
• Uses RADAR, hierarchical clustering, and Gene Ontology to mine RNA motifs in the untranslated regions (UTRs) that are conserved between human and mouse orthologs across multiple genes sharing common biological pathways.
• GLEAN-UTR DB contains 90 RNA motifs (structure groups) from 698 genes. The top two motifs are the iron response element (IRE) and the histone 3’-UTR stem-loop structure.
http://datalab.njit.edu/biodata/GLEAN-UTR-DB/
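How structures might be grouped into motif families can be illustrated with a small Python sketch. This is not the GLEAN-UTR pipeline: RADAR alignment scores are replaced here by a simple base-pair distance between equal-length dot-bracket strings, the clustering is plain single-linkage with a hypothetical cutoff, and the four structures and their names are made up for the example.

```python
from itertools import combinations


def base_pairs(structure):
    """Return the set of (i, j) paired positions in a dot-bracket string."""
    stack, pairs = [], set()
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            pairs.add((stack.pop(), i))
    return pairs


def bp_distance(s1, s2):
    """Number of base pairs present in one structure but not the other."""
    return len(base_pairs(s1) ^ base_pairs(s2))


def single_linkage(items, distance, cutoff):
    """Merge the closest clusters until the closest pair exceeds the cutoff."""
    clusters = [[name] for name in items]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: cluster distance is the closest member pair.
            d = min(distance(items[x], items[y])
                    for x in clusters[a] for y in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        if d > cutoff:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical toy structures (not taken from GLEAN-UTR DB).
structures = {
    "hairpin_a":   "((((....))))",
    "hairpin_b":   "(((......)))",
    "two_stems_a": "((..))((..))",
    "two_stems_b": "((..))(....)",
}

clusters = single_linkage(structures, bp_distance, cutoff=2)
print(clusters)
```

With this cutoff the two hairpins end up in one group and the two double-stem structures in another, which mirrors the idea of "structure groups" on the slide.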
Conclusion
• We have developed a warehouse of informatics tools and databases for RNA genomics.
• We want to invite high school students to our research team to conduct interesting research (Liberty Science Center Model)
• Contact Dr. Jason Wang ([email protected])
• http://web.njit.edu/~wangj