Post on 20-Jan-2016
http://cs273a.stanford.edu 1
UCSC Genome Browser Tutorial
http://genome.ucsc.edu/
http://genome-test.cse.ucsc.edu/
The UCSC Toolset & Portal to the Human Genome
• Genome Browser
• Table Browser
“I was blind and now I can see”
2
UCSC Genome Browser
[version9a]
http://www.openhelix.com/downloads/ucsc/ucsc_home.shtml
3
The UCSC Homepage: http://genome.ucsc.edu
navigate
General information
Specific information—new features, current status, etc.
4
The Genome Browser Gateway: start page choices, December 2006
Make your Gateway choices:
1. Select Clade
2. Select species: search 1 species at a time
3. Assembly: the official backbone DNA sequence
Practically speaking, there is no such thing as a genome; there is only a genome assembly.
Assemblies update frequently. Think moving target...
5
Everything in Genomics is a Moving Target
• The genomes
• Their annotations
• The portals
• Our understanding of biology
Conclusion:
write code
that can be
run...
and rerun
and rerun
and rerun
and rerun
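One way to act on this conclusion is to keep the assembly and the region as parameters rather than constants buried in the code. The sketch below is only an illustration: the function name `browser_url` is hypothetical, and it assumes the Genome Browser accepts `db` and `position` query parameters on its `hgTracks` page.

```python
from urllib.parse import urlencode

# Assumed entry point of the Genome Browser viewer.
BROWSER_CGI = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_url(assembly: str, position: str) -> str:
    """Build a viewer link for one assembly and one position."""
    return BROWSER_CGI + "?" + urlencode({"db": assembly, "position": position})


if __name__ == "__main__":
    # Rerunning against a newer assembly means changing this list only.
    for assembly in ("hg17", "hg18"):
        print(browser_url(assembly, "chr7"))
```

When an assembly is updated, the same script can be rerun with the new assembly name and nothing else changes.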
6
The Genome Browser Gateway: start page, basic search
7
The Genome Browser Gateway: start page choices, December 2006
Make your Gateway choices:
1. Select Clade
2. Select species: search 1 species at a time
3. Assembly: the official backbone DNA sequence
4. Position: location in the genome to examine
5. Image width: how many pixels in display window; 5000 max
6. Configure: make fonts bigger + other choices
8
The Genome Browser Gateway: start page, basic search
text/ID searches
Helpful search examples and suggestions below
Use this Gateway to search by:
• Gene names, symbols
• Chromosome number: chr7, or region: chr11:1038475-1075482
• Keywords: kinase, receptor
• IDs: NP, NM, OMIM, and more…
See lower part of page for help with format
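The chromosome and region formats above (chr7, chr11:1038475-1075482) are easy to check in a script before sending them to the Gateway. This is a minimal sketch; `Position` and `parse_position` are hypothetical names, and the pattern covers only these two formats, not gene names, keywords, or IDs.

```python
import re
from typing import NamedTuple, Optional


class Position(NamedTuple):
    chrom: str
    start: Optional[int]  # None when only a chromosome was given
    end: Optional[int]


# "chr7" or "chr11:1038475-1075482"; commas inside numbers are tolerated.
_POSITION = re.compile(r"^(chr\w+)(?::(\d[\d,]*)-(\d[\d,]*))?$")


def parse_position(text: str) -> Position:
    """Split a chromosome or region string into its parts."""
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome or region: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return Position(chrom, None, None)
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError(f"start is after end in {text!r}")
    return Position(chrom, start_i, end_i)
```

Anything that fails to parse (a keyword such as kinase, for instance) raises `ValueError`, so a script can fall back to a text search.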
9
The Genome Browser Gateway: sample search for Human TP53
Sample search: human, March 2006 assembly, tp53
select
Select from results list
ID search may go right to a viewer page, if unique
10
Overview of the whole Genome Browser page
(mature release)
Genome viewer section
mRNA and EST Tracks
Expression and Regulation
Comparative Genomics
Variation and Repeats
Groups of data
Mapping and Sequencing Tracks
Genes and Gene Prediction Tracks
ENCODE Tracks
11
Different species, different tracks, same software
Species may have different data tracks
Layout, software, functions the same
12
Sample Genome Viewer image, TP53 region
base position
STS markers
Known genes
RefSeq genes
GenBank seqs
repeats
17 species compared
SNPs
single species compared
13
Visual Cues on the Genome Browser
Track colors may have meaning; for example, in the Known Gene track:
• If there is a corresponding PDB entry: black
• If there is a corresponding NCBI Reviewed seq: dark blue
• If there is a corresponding NCBI Provisional seq: light blue
Tick marks: a single location (STS, SNP)
For some tracks, the height of a bar indicates increased likelihood of an evolutionary relationship (conservation track)
Intron, and direction of transcription: <<< or >>>
[Figure: gene diagram with exons joined by intron arrows (< < <), labeled 5' UTR and 3' UTR]
14
Options for Changing Images: Upper Section
Change your view or location with controls at the top
Use “base” to get right down to the nucleotides
Configure: to change font, window size, more…
Specify a position
fonts, window, more
Walk left or right
Zoom in
Zoom out
click to zoom 3x and re-center
15
Annotation Track display options
Some data is ON or OFF by default
Links to info and/or filters
Menu links to info about the tracks: content, methods
You change the view with pulldown menus
enforce changes
After making changes, REFRESH to enforce the change
Change track view
16
Annotation Track options, defined
Hide: removes a track from view
Dense: all items collapsed into a single line
Squish: each item = separate line, but 50% height + packed
Pack: each item separate, but efficiently stacked (full height)
Full: each item on separate line
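The five display modes can also be requested from a script. The sketch below assumes that the viewer URL accepts one `trackName=mode` parameter per track alongside `db` and `position`; the function name and the track names used in the usage note (`knownGene`, `refGene`) are illustrative assumptions, not taken from the slides.

```python
from urllib.parse import urlencode

# The five display modes defined above.
VISIBILITY_MODES = ("hide", "dense", "squish", "pack", "full")


def browser_url_with_tracks(assembly: str, position: str, tracks: dict) -> str:
    """Build a viewer link that also sets a display mode per track."""
    for name, mode in tracks.items():
        if mode not in VISIBILITY_MODES:
            raise ValueError(f"unknown display mode {mode!r} for track {name!r}")
    params = {"db": assembly, "position": position, **tracks}
    return "http://genome.ucsc.edu/cgi-bin/hgTracks?" + urlencode(params)
```

For example, `browser_url_with_tracks("hg18", "chr17", {"knownGene": "pack", "refGene": "dense"})` would ask for one track in pack mode and another in dense mode.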
17
Reset, Hide, Configure or Refresh to change settings
You control the views
Use pulldown menus
Configure options page
reset: back to defaults, start from scratch
refresh: enforce any changes (hide, full, squish…)
18
Annotation Track options, if altered… important point: the browser remembers!
• Session information (the position you were examining)
• Track choices (squish, pack, full, etc.)
• Filter parameters (if you changed the colors of any items, or the subset to be displayed)
…are all saved on your computer. When you come back in a couple of days to use it again, these will still be set. You may, or may not, intend this.
To clear your “cart” or parameters, click default tracks
OR
19
Saved Sessions
20
Click Any Viewer Object for Details
Example: click your mouse anywhere on the TP53 line
Click the item
New web page
opens
Many details and links to more data about TP53
21
Click annotation track item for details pages
Not all genes have this much detail.
Different annotation tracks carry different data.
informative description
other resource links
microarray data
mRNA secondary structure
links to sequences
protein domains/structure
homologs in other species
Gene Ontology™ descriptions
mRNA descriptions
pathways
22
Get DNA, with Extended Case/Color Options
Use the DNA link at the top
Plain or Extended options
Change colors, fonts, etc.
23
Get Sequence from Details Pages
Click a track, go to Sequence section of details page
Click the line
Click the item
sequence section on detail page
24
Accessing the BLAT tool
Rapid searches by INDEXING the entire genome
Works best with high similarity matches
See documentation and publication for details
Kent, WJ. Genome Res. 2002. 12:656
BLAT = BLAST-like Alignment Tool
25
BLAT tool overview: www.openhelix.com/sampleseqs.html
Make choices
DNA limit 25000 bases
Protein limit 10000 aa
25 total sequences
Paste one or more sequences
Or upload
submit
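The limits above can be checked locally before pasting or uploading. This is a minimal sketch with hypothetical function names; it assumes the base and amino-acid limits apply to each sequence individually, which the slide does not spell out.

```python
MAX_DNA_BASES = 25_000         # "DNA limit 25000 bases"
MAX_PROTEIN_RESIDUES = 10_000  # "Protein limit 10000 aa"
MAX_SEQUENCES = 25             # "25 total sequences"


def read_fasta(text: str) -> list:
    """Return (name, sequence) pairs from FASTA-formatted text."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], []])
        elif records:
            records[-1][1].append(line)
        else:
            # Bare sequence with no header line.
            records.append(["unnamed", [line]])
    return [(name, "".join(parts)) for name, parts in records]


def check_blat_input(text: str, is_protein: bool = False) -> list:
    """Return a list of human-readable problems; empty means within limits."""
    records = read_fasta(text)
    limit = MAX_PROTEIN_RESIDUES if is_protein else MAX_DNA_BASES
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences, limit is {MAX_SEQUENCES}")
    for name, seq in records:
        if len(seq) > limit:
            problems.append(f"{name}: length {len(seq)}, limit is {limit}")
    return problems
```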
26
BLAT results, with links
Results with demo sequences, settings default; sort = Query, Score
Score is a count of matches: higher number, better match
Click browser to go to Genome Browser image location (next slide)
Click details to see the alignment to genomic sequence (2nd slide)
sorting
go to browser/viewer
go to alignment detail
27
BLAT results, browser link
From browser click in BLAT results
A new line with your Sequence from BLAT Search appears!
query
click to flip frame
Watch out for reading frame! Click - - - > to flip frame
Base position = full and zoomed in enough to see amino acids
28
BLAT results, alignment details
Your query
Genomic match, color cues
Side-by-side alignment
yours
genomic
29
Understand Blat’s Limitation
Blat was designed to rapidly align sequence from one genome back to itself (e.g., EST/cDNA data)
It can and does miss clear hits at times
Blat allows for a single mismatch, but it also removes k-mers with excessive counts for efficiency.
Not suitable for cross-species mapping.
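To see why dropping k-mers with excessive counts can cost a clear hit, consider the toy index below. It is an illustration of the idea only, not Blat's actual implementation: the function names, the overlapping k-mer scheme, and the `max_count` threshold are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(genome: str, k: int, max_count: int) -> dict:
    """Map each k-mer to its genome positions, dropping over-represented k-mers."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    # k-mers seen more than max_count times are removed to keep lookups fast.
    return {kmer: hits for kmer, hits in index.items() if len(hits) <= max_count}


def seed_hits(index: dict, query: str, k: int) -> list:
    """Return (query_pos, genome_pos) pairs where a query k-mer is in the index."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for gpos in index.get(query[qpos:qpos + k], ()):
            hits.append((qpos, gpos))
    return hits
```

In this toy model, a query that lies entirely inside a highly repeated stretch produces no seeds at all, even though it matches the genome exactly; that is one way a clear hit can be missed.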
30
Bunch More Goodies – Click Around
31
Bibliography:
http://genome.ucsc.edu/goldenPath/pubs.html
The UCSC Genome Browser Database: update 2008, update 2007, and earlier.
UCSC Genome Browser Tutorial
UCSC Genome Browser: Deep support for molecular biomedical research
The UCSC Known Genes, 2006.
The UCSC Gene Sorter, 2007.
Piloting the Zebrafish Genome Browser, 2006.
32
UCSC Genome Browser
[version9a]
33
Genome Browser Database
Primary table: positions, names, etc.
Underlying Database (MySQL)
Auxiliary table: related data
visualize search & download
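The primary/auxiliary split can be pictured as two tables joined on a shared name. The sketch below uses Python's built-in `sqlite3` in place of MySQL so that it runs anywhere; the table names, column names, and rows are invented for illustration and do not reflect the real schema.

```python
import sqlite3


def build_toy_database() -> sqlite3.Connection:
    """Create an in-memory database with one primary and one auxiliary table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Primary table: positions, names, etc.
        CREATE TABLE gene_positions (
            name TEXT PRIMARY KEY, chrom TEXT, tx_start INTEGER, tx_end INTEGER
        );
        -- Auxiliary table: related data keyed by the same name.
        CREATE TABLE gene_notes (name TEXT, note TEXT);
        INSERT INTO gene_positions VALUES ('geneA', 'chr4', 100, 500);
        INSERT INTO gene_positions VALUES ('geneB', 'chr4', 900, 1500);
        INSERT INTO gene_notes VALUES ('geneA', 'toy description A');
        INSERT INTO gene_notes VALUES ('geneB', 'toy description B');
        """
    )
    return conn


def genes_in_region(conn: sqlite3.Connection, chrom: str, start: int, end: int) -> list:
    """Return (name, tx_start, tx_end, note) for genes overlapping the region."""
    query = """
        SELECT p.name, p.tx_start, p.tx_end, n.note
        FROM gene_positions AS p
        JOIN gene_notes AS n ON n.name = p.name
        WHERE p.chrom = ? AND p.tx_start < ? AND p.tx_end > ?
        ORDER BY p.tx_start
    """
    return conn.execute(query, (chrom, end, start)).fetchall()
```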
34
The Table Browser
Open browser
http://genome.ucsc.edu/
35
Table Browser: Choose Genome
In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.
Choose Genome
36
Table Browser: Choose Table to Search
In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.
Choose Data Table
37
Table Browser: Describe Table
Describe table
38
Table Browser: Choose Region to Search
In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.
Choose Region to Search
39
Table Browser: Upload Locations to Search
Paste Upload
40
Table Browser: Filter to Refine Search
Create Filter
In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.
Submit Filter
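The region choice and the filter amount to two conditions on each row: it must overlap the chromosome 4 region, and its copy number must exceed 10. The sketch below expresses that in Python on a downloaded table; the record type, its field names, and the function name are assumptions for illustration, not the table's actual column names.

```python
from typing import List, NamedTuple


class SimpleRepeat(NamedTuple):
    chrom: str
    chrom_start: int
    chrom_end: int
    sequence: str    # the repeated unit
    copy_num: float  # number of copies of the unit


def filter_repeats(rows: List[SimpleRepeat], chrom: str, start: int, end: int,
                   min_copy_num: float) -> List[SimpleRepeat]:
    """Keep repeats overlapping chrom:start-end with copy number above the threshold."""
    return [
        row for row in rows
        if row.chrom == chrom
        and row.chrom_start < end
        and row.chrom_end > start
        and row.copy_num > min_copy_num
    ]
```

Keeping the filter in a script means the same query can be repeated unchanged when the assembly or the table is updated.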
41
Table Browser: Output Data
Output data
In the Human genome (hg16), search for simple repeats on a chromosome 4 location with copy number more than 10 and download the sequence.
42
Table Browser: Output Formats
Output formats
Text Fields
43
Table Browser: Fasta Sequence Output
Sequence
44
Table Browser: Database Format Outputs
Database
45
Table Browser: Custom Track Output
Custom Track
46
Table Browser: Hyperlinks Output
Hyperlinks
47
Table Browser: Obtaining Output
Adding name creates file on desktop; leaving blank creates output in browser.
(exception: custom track)
Data Summary
48
Table Browser: Output configuration
Sequence Format
Get Sequence
49
Table Browser: Intersecting Data
Find simple repeats (copy number > 10) within known genes
and download the sequence.
Intersect
2nd Table
Any Overlap
Submit
50
Table Browser: Intersecting Data Narrows Search
Filtered simple repeats, intersected (overlapping) with known genes
Summary
Filtered simple repeats
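The “Any Overlap” intersection keeps each item from the first table that overlaps at least one item in the second. A minimal sketch of that rule follows, with hypothetical names and a simple all-pairs comparison; it says nothing about how the Table Browser implements it.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True when the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def intersect_any_overlap(primary: List[Interval], secondary: List[Interval]) -> List[Interval]:
    """Keep primary items that overlap any secondary item."""
    return [item for item in primary
            if any(overlaps(item, other) for other in secondary)]
```

The result can never be larger than the first table, which is why intersecting narrows the search.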
51
Table Browser: Downloading Sequence Data
Sequence Format
Get Sequence
52
Table Browser: Correlating Data Tables
Correlate 2 Datasets
Get Results
53
Custom Tracks: Table Browser Searches
Create Track
Get Output
54
Custom Tracks: Name and Configure Track
Download track file to desktop
Name Track: SRepeatKGenes
Describe Track: Intersection …
Choose default view in browser
In Genome Browser
55
Custom Tracks: Open Track in Genome Browser
Open Details
Compare
“…caused by an expanded, unstable trinucleotide repeat…”
56
Custom Tracks: Track in Table Browser
Custom tracks also are available for filtering and intersections
on the Table Browser
57
Custom Tracks: User-generated Data in Track
Custom Tracks Link
Custom Track How-to
58
Custom Tracks: Four Steps to Create Track
Four steps to create a custom track:
1. Define track characteristics
2. Define browser characteristics
3. Format your data
4. Upload and view your track
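The first three steps can be sketched as a small function that writes the text to upload: a `browser` line for the browser characteristics, a `track` line for the track characteristics, and one data row per item. The function name is hypothetical, the data rows assume a four-column BED-style layout (chromosome, start, end, name), and the description in the usage note is a placeholder; check the format page for the exact attributes supported.

```python
VISIBILITY_MODES = ("hide", "dense", "full", "pack", "squish")


def custom_track_text(name: str, description: str, visibility: str,
                      position: str, bed_rows: list) -> str:
    """Assemble custom track text: browser line, track line, then data rows."""
    if visibility not in VISIBILITY_MODES:
        raise ValueError(f"unknown visibility {visibility!r}")
    lines = [
        # Browser characteristics: where the viewer opens.
        f"browser position {position}",
        # Track characteristics: label, description, default display mode.
        f'track name="{name}" description="{description}" visibility={visibility}',
    ]
    # Data: one tab-separated row per item.
    for chrom, start, end, label in bed_rows:
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"
```

For example, `custom_track_text("SRepeatKGenes", "example track", "pack", "chr4:1-1000", [("chr4", 100, 160, "repeat1")])` yields text that can be pasted or saved to a file for the fourth step, upload and view.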
59
Custom Tracks: Submit Track
Copy and paste small or simple tracks
Submit File
http://genome.ucsc.edu/FAQ/FAQformat
60
Custom Tracks: Track Appears in Genome Browser
61
Custom Tracks: Track Characteristics
Default view of custom track is “pack”
Default view of other tracks set
62
Custom Tracks: Track Appears in Table Browser
Custom Track also appears in Table Browser
63
Custom Tracks from Outside Sources
Custom Tracks Link
Contributed Track
64
Bibliography:
http://genome.ucsc.edu/goldenPath/pubs.html
The UCSC Table Browser, 2004.
Bejerano et al., Nature Methods, 2005.
The UCSC Proteome Browser
Phylogenomic Resources at the UCSC Genome Browser