TRANSCRIPT
Tools in Bioinformatics
Genome Browsers
Retrieving genomic information
Previous lesson(s): annotation-based perspective of search/data
Today: genomic-based perspective: look at all the data through the prism of a specific chromosome location
Next: sequence-based searches
Genome browsers
NCBI Map Viewer http://www.ncbi.nih.gov/mapview
Ensembl http://www.ensembl.org/
UCSC Genome Browser http://genome.ucsc.edu/
Copyright OpenHelix. No use or reproduction without express written consent 5
Version 16a_0209
The UCSC Genome Browser: Introduction
Materials prepared by Mary Mangan, Ph.D., www.openhelix.com
Updated: Q1 2009
UCSC Genome Browser Agenda
UCSC Genome Browser: http://genome.ucsc.edu
Introduction and Credits
Basic Searches
Understanding Displays
Get Details or Sequences
Sequence Searches (BLAT)
Summary
Exercises
Introduction and Credits
Organization of Genomic Data
Genome backbone: base position number, sequence
Annotation Tracks
chromosome band
known genes
predicted genes
evolutionary conservation
SNPs
sts sites
gap locations
repeated regions
microarray/expression data
more…
Links out to more data
A Sample of the UCSC Genome Browser
gene details
Annotation Tracks
official sequence
comparisons
SNPs
Basic Searches
The UCSC Homepage: http://genome.ucsc.edu
navigate
General information
Specific information—new features, current status, etc.
Genome Browser Gateway: start page, basic search
text/ID searches
Helpful search examples
samples provided
Use this Gateway to search by:
Gene names, symbols, IDs
Chromosome number: chr7, or region: chr11:1038475-1075482
Keywords: kinase, receptor
See lower part of page for help with format
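The two location formats shown above (a whole chromosome such as chr7, or a region such as chr11:1038475-1075482) can be told apart with a small parser. This is only an illustrative sketch: the `Position` type and `parse_position` function are hypothetical names, and the pattern covers just the formats on this slide, not everything the Gateway accepts.

```python
import re
from typing import NamedTuple, Optional


class Position(NamedTuple):
    """A parsed location; start/end are None for a whole chromosome."""
    chrom: str
    start: Optional[int]
    end: Optional[int]


# Matches "chr7" or "chr11:1038475-1075482" (commas in numbers are tolerated).
_POSITION_RE = re.compile(r"^(chr[0-9A-Za-z_]+)(?::([\d,]+)-([\d,]+))?$")


def parse_position(text: str) -> Position:
    """Parse a chromosome or chromosome-region string (hypothetical helper)."""
    match = _POSITION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome or region: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return Position(chrom, None, None)
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError(f"start is after end in {text!r}")
    return Position(chrom, start_i, end_i)
```

Anything that does not match, such as a gene symbol or a keyword like kinase, raises `ValueError` here; in the Gateway itself those are handled as text/ID searches.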
The Genome Browser Gateway
Make your Gateway choices:
1. Select Clade
2. Select genome = species: search 1 species at a time
3. Assembly: the official backbone DNA sequence
4. Position: location in the genome to examine
5. Image width: how many pixels in display window; 5000 max
6. Configure: make fonts bigger + other choices
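The Gateway choices can also be expressed as a link. The sketch below assumes the browser's `hgTracks` page accepts `db`, `position` and `pix` as URL parameters; that path and those parameter names are not on the slides, so check them against the current UCSC documentation. `hg18` is the UCSC identifier for the human March 2006 assembly, and the default width of 800 is an arbitrary illustrative value. `gateway_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

MAX_IMAGE_WIDTH = 5000  # maximum image width in pixels, per the slide


def gateway_url(db: str, position: str, pix: int = 800) -> str:
    """Build a browser link from Gateway-style choices (hypothetical helper).

    db       -- assembly identifier, e.g. "hg18" (assumed parameter name)
    position -- gene symbol, chromosome, or region (assumed parameter name)
    pix      -- image width in pixels (assumed parameter name)
    """
    if not 1 <= pix <= MAX_IMAGE_WIDTH:
        raise ValueError(f"image width must be 1..{MAX_IMAGE_WIDTH} pixels")
    query = urlencode({"db": db, "position": position, "pix": pix})
    return f"http://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


if __name__ == "__main__":
    print(gateway_url("hg18", "chr11:1038475-1075482"))
```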
The Genome Browser Gateway: sample search for Human TP53
Sample search: human, March 2006 assembly, tp53
Select from results list
ID search may go right to a viewer page, if unique
Understanding Displays
Overview of the Whole Genome Browser Page
(mature release)
Genome viewer section
Groups of data (Tracks):
Mapping and Sequencing Tracks
mRNA and EST Tracks
Expression (such as microarray)
Comparative Genomics
• As a group
• Individual species
Variation and Repeats (including SNPs, copy number variation)
Genes and Gene Prediction Tracks (including sno/miRNA data)
ENCODE Tracks
Phenotype and Disease Tracks
Regulation (including TFBS)
Different Species, Different Tracks, Same Software
Species may have different data tracks
Layout, software, functions the same
Sample Genome Viewer Image, TP53 Region
base position
UCSC genes
RefSeq genes
mRNAs & ESTs
repeats
many species compared
SNPs
single species compared
MGC clones
Visual Cues on the Genome Browser
Track colors may have meaning. For example, in the UCSC Gene track:
• If there is a corresponding PDB entry = black
• If there is a corresponding reviewed/validated seq = dark blue
• If there is a non-RefSeq seq = lightest blue
Tick marks: a single location (STS, SNP)
For some tracks, the height of a bar indicates increased likelihood of an evolutionary relationship (conservation track)
Intron and direction of transcription: <<< or >>>
Gene diagram labels: exon, 5' UTR, 3' UTR
Alignment indications (Conservation pairs: “chain” or “net” style)
• Alignments = boxes, Gaps = lines
Options for Changing Images: Upper Section
Change your view or location with controls at the top
Use “base” to get right down to the nucleotides
Configure: to change font, window size, more…
Next item, next exon navigation assistance can be turned on
Specify a position
Fonts, window, next item, more
Walk left or right
Zoom in
Zoom out
Click to zoom 3x and re-center
Annotation Track Display Options
Some data is ON or OFF by default
Menu links to info about the tracks: content, methods
You change the view with pulldown menus
After making changes, REFRESH to enforce the change
Enforce changes
Change track view
Links to info and/or filters
Annotation Track Options Defined
Hide: removes a track from view
Dense: all items collapsed into a single line
Squish: each item = separate line, but 50% height + packed
Pack: each item separate, but efficiently stacked (full height)
Full: each item on separate line
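To make the difference between the display options concrete, the sketch below counts how many display rows a set of items would occupy in each mode. It is an illustration only: the greedy stacking in `pack_rows` is one simple way to stack items "efficiently" and is not claimed to be the browser's actual layout code. Items are hypothetical (start, end) pairs with an exclusive end.

```python
from typing import List, Tuple

Item = Tuple[int, int]  # (start, end) on the chromosome, end exclusive


def pack_rows(items: List[Item]) -> List[List[Item]]:
    """Greedily stack items so that items sharing a row never overlap."""
    rows: List[List[Item]] = []
    for item in sorted(items):
        for row in rows:
            # Reuse a row if its last item ends before this one starts.
            if row[-1][1] <= item[0]:
                row.append(item)
                break
        else:
            rows.append([item])
    return rows


def rows_needed(items: List[Item], mode: str) -> int:
    """Number of display rows for a track in the given mode."""
    if mode == "hide":
        return 0
    if mode == "dense":
        return 1 if items else 0   # everything collapsed into one line
    if mode == "full":
        return len(items)          # one line per item
    if mode in ("pack", "squish"):
        return len(pack_rows(items))  # squish: same packing, half height
    raise ValueError(f"unknown display mode: {mode!r}")


if __name__ == "__main__":
    demo = [(0, 10), (5, 15), (12, 20), (20, 30)]
    for mode in ("hide", "dense", "squish", "pack", "full"):
        print(mode, rows_needed(demo, mode))
```

With the four demo items, full uses four rows, pack and squish use two, and dense uses one, which is why pack is usually the more readable choice for crowded tracks.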
Mid-page Options to Change Settings
You control the views
Use pulldown menus
Configure options page
Reset, back to defaults
Start from scratch
Enforce any changes (hide, full, squish…)
Flip display to Genomic 3’ to 5’
Cookies and Sessions
Your browser remembers where you were (cookies)
To clear your “cart” or parameters, click default tracks or reset
OR
Save your setup as “sessions” and store/share them
Get Details or Sequences
Click Any Viewer Object for Details
Example: click your mouse anywhere on the TP53 line
Click the item
New description web page opens
Many details and links to more data about TP53
Click Annotation Track Item for Details Pages
Not all genes have this much detail.
Different annotation tracks carry different data.
informative description
other resource links
microarray data
mRNA secondary structure
links to sequences
protein domains/structure
orthologs in other species
Gene Ontology™ descriptions
mRNA descriptions
pathways
genetic association studies
comparative toxicology
gene model
Get DNA, with Extended Case/Color Options
Use the DNA link at the top
Plain or Extended options
Change colors, fonts, etc.
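As a rough picture of what a case option does to retrieved DNA, the sketch below prints a sequence in lower case and switches the bases covered by chosen features (for example, exons) to upper case. This is an assumption-laden illustration, not the browser's code: `case_by_feature` is a hypothetical helper, and features are 0-based (start, end) pairs with an exclusive end.

```python
from typing import Iterable, Tuple


def case_by_feature(sequence: str, features: Iterable[Tuple[int, int]]) -> str:
    """Lower-case the sequence, then upper-case bases inside each feature."""
    chars = list(sequence.lower())
    for start, end in features:
        # Clamp to the sequence so out-of-range features are ignored safely.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].upper()
    return "".join(chars)


if __name__ == "__main__":
    print(case_by_feature("ACGTACGTAC", [(2, 5)]))
```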
Get Sequence from Details Pages
Click a track, go to Sequence section of details page
Click the item
sequence section on detail page
Sequence Searches (BLAT)
Accessing the BLAT Tool
Rapid searches by INDEXING the entire genome
Works best with high similarity matches
See documentation and publication for details
Kent, WJ. Genome Res. 2002. 12:656
BLAT = BLAST-like Alignment Tool
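The indexing idea can be shown with a toy example: build a lookup table of non-overlapping k-mers from the genome once, then look up every k-mer of the query to find candidate match positions. This is a greatly simplified sketch and not the BLAT implementation; the function names, the tiny k, and the exact-match-only lookup are illustrative assumptions. It also hints at why high-similarity matches work best: a query needs exact k-mer hits to be found at all.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_index(genome: str, k: int) -> Dict[str, List[int]]:
    """Index non-overlapping k-mers of the genome by their start position."""
    index: Dict[str, List[int]] = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_hits(index: Dict[str, List[int]], query: str, k: int) -> List[Tuple[int, int]]:
    """Return (query_offset, genome_position) pairs for exact k-mer matches."""
    hits: List[Tuple[int, int]] = []
    for q in range(len(query) - k + 1):  # every overlapping k-mer of the query
        for g in index.get(query[q:q + k], []):
            hits.append((q, g))
    return hits


if __name__ == "__main__":
    genome = "ACGTTGCAAGGC"
    idx = build_index(genome, k=4)
    print(find_hits(idx, "TTGCAA", k=4))
```

In a real tool, hits like these are only seeds; they are then extended and stitched into full alignments.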
BLAT Tool Overview: www.openhelix.com/sampleseqs.html
Make choices
DNA limit 25000 bases
Protein limit 10000 aa
25 total sequences
Paste one or more sequences
Or upload
submit
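Before pasting or uploading, a submission can be checked against the limits listed above. The sketch below assumes FASTA-formatted input and treats the base and amino-acid limits as per-sequence limits; the slide does not say whether they apply per sequence or in total, so that reading is an assumption. `parse_fasta` and `check_submission` are hypothetical helper names.

```python
from typing import List, Tuple

MAX_DNA_BASES = 25000   # from the slide
MAX_PROTEIN_AA = 10000  # from the slide
MAX_SEQUENCES = 25      # from the slide


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Split FASTA text into (name, sequence) records."""
    records: List[Tuple[str, str]] = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            if name is None:
                name = "unnamed"  # sequence pasted without a header line
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_submission(text: str, is_protein: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the input fits the limits."""
    records = parse_fasta(text)
    limit = MAX_PROTEIN_AA if is_protein else MAX_DNA_BASES
    problems: List[str] = []
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences exceeds the limit of {MAX_SEQUENCES}")
    for name, seq in records:
        if len(seq) > limit:  # assumption: limit applies to each sequence
            problems.append(f"{name}: length {len(seq)} exceeds the limit of {limit}")
    return problems
```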
BLAT Results with Hyperlinks
Results with demo sequences, settings default; sort = Query, Score
Score is a count of matches: higher number, better match
Click browser to go to Genome Browser image location (next slide)
Click details to see the alignment to genomic sequence (2nd slide)
sorting
go to browser/viewer
go to alignment detail
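The "Query, Score" ordering can be pictured as a two-level sort: group hits by query name, then list the best-scoring hit for each query first. The sketch below uses made-up hit records with hypothetical `query`, `score` and `chrom` fields; putting higher scores first within a query is an assumption based on "higher number, better match".

```python
from typing import Dict, List, Union

Hit = Dict[str, Union[str, int]]


def sort_hits(hits: List[Hit]) -> List[Hit]:
    """Order hits by query name, then by score from highest to lowest."""
    return sorted(hits, key=lambda h: (h["query"], -h["score"]))


if __name__ == "__main__":
    # Made-up example records, not real BLAT output.
    demo = [
        {"query": "seq2", "score": 120, "chrom": "chr7"},
        {"query": "seq1", "score": 45, "chrom": "chr3"},
        {"query": "seq1", "score": 980, "chrom": "chr17"},
    ]
    for hit in sort_hits(demo):
        print(hit["query"], hit["score"], hit["chrom"])
```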
BLAT Results: Browser
From browser click in BLAT results
A new line with Your Sequence from BLAT Search appears!
Base position = “full” menu and zoomed in enough to see amino acids in 3 frame translation
query
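A 3 frame translation reads the same DNA starting at the first, second and third base, so every possible codon grouping on one strand is shown. The sketch below uses the standard genetic code and the forward strand only; it is a minimal illustration, not the browser's implementation, and codons containing anything other than A, C, G or T are shown as X.

```python
from itertools import product
from typing import List

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str, frame: int) -> str:
    """Translate dna starting at offset frame (0, 1 or 2); '*' marks a stop."""
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def three_frame(dna: str) -> List[str]:
    """Translations of the forward strand in all three reading frames."""
    return [translate(dna, frame) for frame in range(3)]


if __name__ == "__main__":
    print(three_frame("ATGGCC"))
```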
BLAT Results, Alignment Details
Your query
Genomic match, color cues
Side by Side Alignment
yours
genomic
Summary
Introduction Summary
UCSC Genome Browser
Visual cues and genomic context
Many ways to alter your views
Access to deeper data
Access and use sequence data