Introduction to BioComputing: Biology in silico
3rd February 2010
Carrie Iwema, PhD, MLS
Molecular Biology Information Specialist
Health Sciences Library System, University of Pittsburgh
[email protected]
http://www.hsls.pitt.edu/guides/genetics
General Topics
Information Overload
Genome, Gene, Protein
Specific Topics
Information Overload: PubMed; Alternatives to PubMed (GoPubMed, Novoseek, PubGet); Molecular Databases; HSLS Molecular Biology Information Service
Genome, Gene, Protein: Genome Biology; Genome Browsers (UCSC Genome Browser, NCBI MapViewer); Entrez Gene; UniProt
Information Overload
• Breast Cancer: 209K
• Colon Cancer: 84K
• p53: 52K
• STAT1: 4K
5,394 journals
1.3 billion searches in 2009
Alternatives to PubMed
Growth of Molecular Databases
Source: Nodal Point Blog
2008: 1075
2009: 1170
2010: 1230
Molecular Databases
Nucleic Acids Research (Oxford Journals): Annual Database Issue; Annual Web Server Issue
Journals: Bioinformatics (Oxford Journals); BMC Bioinformatics (BioMed Central); Database (Oxford Journals) *new in 2009*
Articles on “genetic databases”: PubMed: 21,851 results; MeSH: 16,398 results
HSLS Molecular Biology Information Service
Workshops
Website
Software Licensing
Bioinformatics Consultations
HSLS OBRC in Science
HSLS OBRC
2441 links to databases and software
~3000 hits/day
search.HSLS.MolBio
Integrated search system covering: Databases & Software; Articles on Databases & Software; Genes/Proteins; Pathways; Protocols; Videos; Recommended Articles
Features: tabbed browsing; clustered search results
Hands-on exercises
Locate databases on: natural antisense, UTR, copy number variation
Retrieve gene information for: your favorite gene, BRCA1, STAT1
Find a suitable protocol for: methylation PCR, in situ hybridization, primer design
Identify videos on: protein structure prediction, human genome project
From Cell to Gene
Human Genome Project Video
Genome Biology Time Line
1976: RNA Bacteriophage MS2
1995: Haemophilus influenzae
2001: Human Genome Draft Sequence
2003: Published Complete Human Reference Genome
2007: Diploid Genome Sequence of an Individual Human
2008: Jim Watson Genome
2010: Published Complete Genomes: 1191 organisms
Human Genome Project Video
Genome Resources
NCBI: Genomes Resources : Link
Genome Project Genome: 6108 species
Genomes OnLine Database (GOLD): Link
JGI: Integrated Microbial Genomes: Link
NCBI Genome Resources
Practice Question
Query: Check the status of genome sequencing for an organism, such as rabbit.
Answer: Pick an organism or metagenome project name. Search the Genome Project database. To get the most precise results, specify the organism field when searching with an organism name, for example: human[orgn]. If more than one result is returned, click on the desired Genome Project. The Genome Project summary page provides information on available projects and sequencing status.
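The organism-field search can also be written as a programmatic query. The sketch below only builds a request URL for NCBI's E-utilities ESearch service and does not send it. The database name `bioproject` is an assumption (NCBI later renamed the Genome Project database to BioProject), and the helper names are illustrative, not part of any NCBI library.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def organism_query(organism):
    """Restrict a search term to the organism field, e.g. 'human[orgn]'."""
    return f"{organism.strip()}[orgn]"


def esearch_url(db, organism):
    """Build (but do not send) an ESearch request URL.

    The db value passed in is an assumption made for this sketch.
    """
    params = {"db": db, "term": organism_query(organism)}
    return f"{ESEARCH}?{urlencode(params)}"


url = esearch_url("bioproject", "rabbit")
print(url)
```

Pasting the printed URL into a browser should return an XML list of matching record IDs, which can then be opened one by one as in the manual steps.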
NCBI Genome Project
A collection of complete and in-progress large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. The database is organized into organism-specific overviews that function as portals for browsing and retrieving projects pertaining to each organism.
Click: Rabbit
NCBI Genome Project : Rabbit Genome
NCBI Entrez Genome:
Genomes Online Database (GOLD) http://genomesonline.org/index2.htm
Global resource for comprehensive access to information regarding complete and ongoing genome projects, metagenomes, and metadata.
“genome sequencing has come of age, and genomics will become central to microbiology's future. It may appear at the moment that the human genome is the main focus and primary goal of genome sequencing, but do not be deceived. The real justification in the long run, is microbial genomics”
Carl Woese, 1998
Genome Browsers: What are they?
Genome browsers enable researchers to visualize and browse entire genomes with annotated data, including gene prediction and structure, proteins, expression, regulation, variation, comparative analysis, etc.
Genome Browsers: The Big Three
NCBI MapViewer, UCSC Genome Browser, EBI Ensembl
Generic Genome Browser (GBrowse); JBrowse (Ajax-based, like Google Maps)
Display: Vertical
Display: Horizontal
Tutorial Articles
Tutorial/Seminar Videos
UCSC Genome Browser
Navigating the Human Genome
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
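As a rough sketch, the region above can be turned into the `chrN:start-end` position string that the UCSC Genome Browser search box accepts, along with the size of the region. The function names are illustrative, and treating both coordinates as 1-based and inclusive is an assumption.

```python
def to_int(bp):
    """Convert a coordinate written with thousands separators to an int."""
    return int(bp.replace(",", ""))


def ucsc_position(chrom, start, end):
    """Format a region as a 'chrN:start-end' position string."""
    s, e = to_int(start), to_int(end)
    if s > e:
        raise ValueError("start must not be greater than end")
    return f"chr{chrom}:{s}-{e}"


def region_length(start, end):
    """Length in bp, assuming both coordinates are 1-based and inclusive."""
    return to_int(end) - to_int(start) + 1


position = ucsc_position("7", "54,318,043", "55,974,438")
length = region_length("54,318,043", "55,974,438")
print(position)  # chr7:54318043-55974438
print(length)    # 1656396
```

The region spans roughly 1.66 Mb, so the initial view is zoomed far out and individual genes appear as compact features.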
UCSC Genome Browser
UCSC Genome Browser: navigating a genomic region
Set up basic browser parameters
UCSC Genome Browser: navigating a genomic region
Start fresh
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
What genes are present in this region?
NCBI sequence databases
RefSeq: non-redundant, expert-verified database of reference sequences, based on GenBank records (Link)
GenBank: archival database of nucleotide sequences from >160,000 organisms (Link)
International Nucleotide Sequence Database Collaboration
Primary vs. Derivative Databases
RefSeq Scope & Accessions
Genomic DNA:
NC_123456 - complete genome, chromosome, plasmid
NG_123456 - genomic region
NT_123456 - genomic contig
mRNA: NM_123456
Protein: NP_123456
more about RefSeq scope and accessions...
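The prefix scheme above lends itself to a small lookup table. The sketch below covers only the five prefixes named on this slide (RefSeq defines others), and the function name is illustrative.

```python
# Molecule type for each RefSeq accession prefix listed on the slide.
REFSEQ_PREFIXES = {
    "NC": "genomic DNA: complete genome, chromosome, or plasmid",
    "NG": "genomic DNA: genomic region",
    "NT": "genomic DNA: genomic contig",
    "NM": "mRNA",
    "NP": "protein",
}


def classify_refseq(accession):
    """Return the molecule type implied by a RefSeq accession prefix."""
    prefix, sep, _ = accession.strip().upper().partition("_")
    if not sep or prefix not in REFSEQ_PREFIXES:
        return "not one of the prefixes listed here"
    return REFSEQ_PREFIXES[prefix]


# Placeholder accessions taken from the slide, not real records.
for acc in ("NC_123456", "NM_123456", "NP_123456"):
    print(acc, "->", classify_refseq(acc))
```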
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
Zoom in and display only the EGFR gene
UCSC Genome Browser: navigating a genomic region
Select the gene region from the “Scale” track to zoom in
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
Display all single nucleotide polymorphisms (SNPs) present in this gene
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
Retrieve the nucleotide sequence of this genomic region, showing all exons in blue and all SNPs in red, bold, and underlined.
UCSC Genome Browser: navigating a genomic region: sequence view
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
Look in the probable promoter region and see if there’s anything interesting…
Zoom out
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
What transcription factors bind in this region?
NCBI MapViewer
How To: View/download features around an object or between two objects on a chromosome
Starting with... CHROMOSOMAL COORDINATES
Begin on the Map Viewer home page. Click the "R" icon under Tools for the desired organism and build.
Select the chromosome, enter the coordinates in the From and To boxes, and click Go. Use either exact coordinates, e.g., 61551076, or shorthand values such as 61M or 61551K.
If necessary, use the Maps & Options dialog box to change displayed maps; the maps and region displayed determine the data available.
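To make the coordinate shorthand concrete, here is a minimal sketch of how values such as 61M or 61551K expand to base-pair positions. It assumes K means thousands and M means millions of base pairs and handles whole numbers only. The function is illustrative and is not Map Viewer's own parser.

```python
# Multipliers for the shorthand suffixes (assumed: K = 1,000 bp, M = 1,000,000 bp).
MULTIPLIERS = {"K": 1_000, "M": 1_000_000}


def parse_coordinate(text):
    """Expand '61551076', '61M', or '61551K' into a base-pair position."""
    value = text.strip().replace(",", "").upper()
    if value and value[-1] in MULTIPLIERS:
        return int(value[:-1]) * MULTIPLIERS[value[-1]]
    return int(value)


for raw in ("61551076", "61M", "61551K"):
    print(raw, "->", parse_coordinate(raw))
```

Under these assumptions, 61551K resolves to 61,551,000, within 76 bp of the exact coordinate 61551076, so the shorthand is a convenient way to land near a position of interest.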
Common Questions
What is its function?
What are its neighboring genes?
What is its genomic sequence?
How many splice variants are there?
What is its intron-exon architecture?
What diseases are associated with it?
In which tissues is it expressed?
How can I get its cDNA clone?
SNP
Genomic Sequence
Expression Profile
Interacting Partners
3D Structure
mRNA Sequence
Chromosomal Localization
Disease
Amino acid Sequence
Homologous Sequences
NCBI: Entrez Gene
Entrez Gene
Find:
gene symbols and aliases
sequences: genomic, mRNA, protein
intron-exon architecture
genomic context: neighboring and antisense genes
interacting partners
associated gene ontology terms: function, cellular component, and biological process
Entrez Gene
a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer
each record represents a single gene from a given organism
Entrez Gene Sequences
mRNA Seq
Protein Seq
Genomic Seq
Gene Ontology (GO)
Controlled vocabulary tagging
Function, Biological Process, Cellular Component
Entrez Gene: Gene Table
Introns/Exons
Try it!
Find mRNA sequence for your gene of interest
Find mRNA Sequence for Reelin Gene
FASTA vs GenBank records
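For orientation, a FASTA record is a single header line starting with ">" followed by the sequence itself, whereas a GenBank record is an annotated flat file with labelled sections such as LOCUS, DEFINITION, and ORIGIN. The sketch below reads the FASTA layout. The example record is made up for illustration and is not a real sequence.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins; drop the leading '>' from the header.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            records[header] += line
    return records


# Made-up record, used only to show the layout.
example = """>example_1 made-up record for illustration
ATGGCC
TTAA
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq), seq)
```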
NCBI Entrez Gene Tutorials
Information page with wiki, video, blog etc.
Entrez gene: A Directory of Genes, NCBI Handbook
Short Video Tutorial (MIT)
UCSC Genome Browser: find a gene in the genome
Bioinformatics Databases & Software Providers
NCBI: Home page, Site map, Resource Guide
EBI: Home page, Databases, Software
UniProt
The world's most comprehensive catalog of information on proteins; a central repository of protein sequence and function created by joining the information contained in Swiss-Prot, TrEMBL, and PIR.
Thank you! Any questions?
Carrie Iwema: [email protected], 412-383-6887
Ansuman: [email protected], 412-648-1297