Computational Bioinformatics: Software and Databases (wangj/rna/slides/HighSchool-7-31-2008.pdf, Jul 31, 2008)

7/31/2008 1

Computational Bioinformatics: Software and Databases

Jason T. L. Wang, Professor

Bioinformatics Program and Computer Science Department
New Jersey Institute of Technology

http://web.njit.edu/~wangj

Work supported by NSF grant IIS-0707571

Presentation for NSF-Sponsored C2PRISM Program

7/31/2008 2

Outline
• Introduction to Bioinformatics
• Introduction to Computational RNA Genomics (Our Current Project)
• RNA Informatics Tools
• RNA Databases
• Bioinformatics Center
• Conclusion and Future Work

7/31/2008 4

Gene:
• Genetic information-containing elements
• Distributed to each cell when the cell divides
• Made of deoxyribonucleic acid (DNA)

Gene Structure:
• Promoter
• Start codon
• Introns
• Exons
• Stop codon
• etc.

Gene Expression:
• Transcription: DNA to RNA (pre-mRNA)
• RNA splicing: remove introns from the pre-mRNA to produce mRNA
• Translation: mRNA to protein
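The three steps listed on this slide (transcription, splicing, translation) can be sketched in a few lines of Python. This is a simplified illustration, not a tool from the presentation: the toy gene, the intron coordinates, and the function names are all assumptions made for the example. The DNA is treated as the coding strand, so transcription is modeled as replacing T with U.

```python
from itertools import product

# Standard genetic code, built from the usual U, C, A, G ordering of codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def transcribe(dna):
    """Model transcription of the coding strand: DNA to pre-mRNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def splice(pre_mrna, introns):
    """Remove introns, given as half-open (start, end) index ranges, and rejoin exons."""
    exons, pos = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[pos:start])
        pos = end
    exons.append(pre_mrna[pos:])
    return "".join(exons)


def translate(mrna):
    """Read codons from the first AUG until a stop codon; return one-letter amino acids."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene (not from the slides): exon, intron, exon.
exon1, intron1, exon2 = "ATGGTGCAC", "GTAAGTCTAG", "CTGACTTAA"
gene = exon1 + intron1 + exon2
introns = [(len(exon1), len(exon1) + len(intron1))]

pre_mrna = transcribe(gene)
mrna = splice(pre_mrna, introns)
protein = translate(mrna)
print(pre_mrna, mrna, protein, sep="\n")
```

In a real gene the intron positions are not given in advance; they have to be predicted, for example from the donor and acceptor sites at the intron boundaries.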

7/31/2008 5

Exon 1 Intron 1 Exon 2 Intron 2 Exon 3

CCCTGTGGAGCCACACCCTAGGGTTGGCCAATCTACTCCCAGGAGCAGGGAGGGCAGGAGCCAGGGCTGGGCATAAAAGTCAGGGCAGAGCCATCTATTGCTTACATTTGCTTCTGACACAACTGTGTTCACTAGCAACTCAAACAGACACCATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCATGTGGAGACAGAGAAGACTCTTGGGTTTCTGATAGGCACTGACTCTCTCTGCCTATTGGTCTATTTTCCCACCCTTAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGGTGAGTCTATGGGACCCTTGATGTTTTCTTTCCCCTTCTTTTCTATGGTTAAGTTCATGTCATAGGAAGGGGAGAAGTAACAGGGTACAGTTTAGAATGGGAAACAGACGAATGATTGCATCAGTGTGGAAGTCTCAGGATCGTTTTAGTTTCTTTTATTGCTGTTCATAACAATTGTTTTCTTTTGTTTAATTCTTGCTTTCTTTTTTTTTCTTCTCCGCAATTTTTACTATTATACTTAATGCCTTAACATTGTGTATAACAAAAGGAAATATCTCTGAGATACATTAAGTAACTTAAAAAAAAACTTTACACAGTCTGCCTAGTACATTACTATTTGGAATATATGTGTGCTTATTTGCATATTCATAATCTCCCTACTTTATTTTCTTTTATTTTTAATTGATACATAATCATTATACATATTTATGGGTTAAAGTGTAATGTTTTAATATGTGTACACATATTGACCAAATCAGGGTAATTTTGCATTTGTAAATTTTAAAAAATGCTTTCTTCTTTTAATATACTTTTTTGTTTATCTTATTTCTAATACTTTCCCTAATCTCTTTCTTTCAGGGCAATAATGATACAATGTATCATGCCTCTTTGCACCATTCTAAAGAATAACAGTGATAATTTCTGGGTTAAGGCAATAGCAATATTTCTGCATATAAATATTTCTGCATATAAATTGTAACTGATGTAAGAGGTTTCATATTGCTAATAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTTTATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTATCTTCCTCCCACAGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACTGGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGCAATGATGTATTTAAATTATTTCTGAATATTTTACTAAAAAGGGAATGTGGGAGGTCAGTGCATTTAAAACATAAAGAAATGATGAGCTGTTCAAACCTTGGGAAAATACACTATATCTTAAACTCCATGAAAGAAGGTGAGGCTGCAACCAGCTAATGCACATTGGCAACAGCCCCTGATGCCTAATGCACATTGGCAACAGCCCCTGATGCCTATGCCTTATTCATCCCTCAGAAAAGGATTCTTGTAGAGGCTTGATTTGCAGGTTAAAGTTTTGCTATGCTGTATTTTACATT
ACTTATTGTTTTAGCTGTCCTCATGAATGTCTTTTC

Gene Structure and Gene Expression

[Figure: the gene sequence above annotated with promoter, start codon, exon 1, intron 1, exon 2, intron 2, exon 3, and stop codon, with a donor site and an acceptor site marked on each intron. Flow shown: gene, transcription, pre-mRNA, splicing (intron removal and RNA rejoining), mRNA, translation, protein.]

7/31/2008 6

Computational RNA Genomics

• Biochemical and genetic studies have demonstrated many functions associated with the UTRs in mRNAs.

• Unlike for proteins, sequence search alone is insufficient for detecting similarity between RNAs: RNAs with dissimilar sequences can share the same secondary structure.

7/31/2008 7

Sequence Similarity vs. Structural Similarity

>NM_000032
UUCGUUCGUCCUCAGUGCAGGGCAACAGGA
((((((.(((((......)))))))).)))
>NM_014585
CAACUUCAGCUACAGUGUUAGCUAAGUUUG
((((((.(((((......)))))))).)))
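The point of this slide can be checked directly: the two sequences differ at many positions, yet their dot-bracket structures are identical. The sketch below takes the sequences and structures from the slide; the function names are assumptions made for the example, and position-by-position identity stands in for a real sequence alignment score.

```python
def sequence_identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def base_pairs(structure):
    """Parse dot-bracket notation into a set of (open, close) index pairs."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


# Sequences and structures as shown on the slide.
seq1 = "UUCGUUCGUCCUCAGUGCAGGGCAACAGGA"     # NM_000032
seq2 = "CAACUUCAGCUACAGUGUUAGCUAAGUUUG"     # NM_014585
struct1 = "((((((.(((((......)))))))).)))"
struct2 = "((((((.(((((......)))))))).)))"

identity = sequence_identity(seq1, seq2)
same_structure = base_pairs(struct1) == base_pairs(struct2)
print(f"sequence identity: {identity:.2f}")
print(f"identical base pairing: {same_structure}")
```

A search that scores only the letters would rank these two RNAs as weak matches, while a comparison of the base-pair sets shows that they fold the same way.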

7/31/2008 8

RSmatch and RADAR
(BMC Bioinformatics 2005)
(Nucleic Acids Research 2007)

Alignment of two RNA secondary structures, where the local matches found by RSmatch are in green.

7/31/2008 11

Multiple Structural Alignment

7/31/2008 14

GLEAN-UTR Database (BMC Genomics 2008)

• Use RADAR, hierarchical clustering, and Gene Ontology to mine RNA motifs in the untranslated regions (UTRs) that are conserved between human and mouse orthologs across multiple genes sharing common biological pathways.

• GLEAN-UTR DB contains 90 RNA motifs (structure groups) from 698 genes. The top two motifs are the iron response element (IRE) and the histone 3’-UTR stem-loop structure.

http://datalab.njit.edu/biodata/GLEAN-UTR-DB/
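To give a feel for how hierarchical clustering can group RNA structures into motifs, here is a minimal sketch. It is not the GLEAN-UTR pipeline and does not use RADAR scores: the distance (one minus the Jaccard similarity of the base-pair sets), the single-linkage rule, the threshold, and the four toy structures are all assumptions chosen for illustration.

```python
from itertools import combinations


def base_pairs(structure):
    """Return the set of (open, close) index pairs in a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def pair_distance(a, b):
    """Hypothetical distance: 1 - Jaccard similarity of the two base-pair sets."""
    pa, pb = base_pairs(a), base_pairs(b)
    union = pa | pb
    return 1.0 - len(pa & pb) / len(union) if union else 0.0


def single_linkage(structures, threshold):
    """Merge the two closest clusters repeatedly until none is within threshold."""
    clusters = [[s] for s in structures]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = min(pair_distance(a, b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


# Hypothetical toy structures (not from the database).
toy = ["((((....))))", "(((......)))", "..((....))..", "((..))((..))"]
groups = single_linkage(toy, threshold=0.6)
for group in groups:
    print(group)
```

The three single-hairpin structures end up in one group and the two-hairpin structure stays on its own, which is the kind of grouping a structure-based motif search relies on.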

7/31/2008 20

Conclusion

• We have developed a warehouse of informatics tools and databases for RNA genomics.

• We invite high school students to join our research team and conduct interesting research (Liberty Science Center model).

• Contact Dr. Jason Wang ([email protected])
• http://web.njit.edu/~wangj