biological databases ucsc genome...

Post on 05-Oct-2020


BIOLOGICAL DATABASES

UCSC GENOME BROWSER

Letizia Marullo, MSc, PhD

NCBI GENE, GENBANK, NUCLEOTIDE

FASTA format

UCSC GENOME BROWSER

• University of California Santa Cruz http://genome.ucsc.edu/

Santa Cruz, CA

THE UCSC HOME PAGE

• Navigate

• General information

• Specific information: new features, current status, etc.

Inga Prokopenko, MSc, PhD, Senior PostDoc, Senior Lecturer in Human Genomics, Imperial College of London

UCSC THE GENOME BROWSER GATEWAY

START PAGE

Make your Gateway choices:

1. Select Clade

2. Select species: search 1 species at a time

3. Assembly: the official backbone DNA sequence

4. Position: location in the genome to examine

5. Image width: how many pixels in display window; 5000 max

6. Configure: make fonts bigger + other choices


Search “BRCA1”

UCSC THE GENOME BROWSER GATEWAY

START PAGE: text/ID searches

Use this Gateway to search by:

• Gene names, symbols

• Chromosome number: chr7, or region: chr11:1038475-1075482

• Keywords: kinase, receptor

• IDs: NP, NM, OMIM, and more…
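The search types listed above can be told apart with a few lines of code. The sketch below is only an illustration of the distinction between a region, a whole chromosome, and free text. The function name `classify_query` and the patterns are assumptions made for this example, not how the Gateway itself resolves a search.

```python
import re

# Chromosome names such as chr7, chrX, chr11 (optionally with a suffix like chr1_random).
CHROM_NAME = r"chr(?:\d+|[XYM])(?:_\w+)?"
REGION = re.compile(rf"^({CHROM_NAME}):([\d,]+)-([\d,]+)$")
CHROM = re.compile(rf"^{CHROM_NAME}$")


def classify_query(query):
    """Return a (kind, value) pair describing a Gateway-style search string."""
    query = query.strip()
    match = REGION.match(query)
    if match:
        # Thousands separators, as in chr1:128,284,800-128,290,000, are tolerated.
        start = int(match.group(2).replace(",", ""))
        end = int(match.group(3).replace(",", ""))
        if start > end:
            raise ValueError("start must not exceed end")
        return ("region", (match.group(1), start, end))
    if CHROM.match(query):
        return ("chromosome", query)
    # Gene symbols, keywords and accession IDs are left for the browser to resolve.
    return ("text", query)


for example in ["chr7", "chr11:1038475-1075482", "BRCA1", "kinase"]:
    print(example, "->", classify_query(example))
```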


EXERCISE 1: SEARCHING IN UCSC GENOME BROWSER

• Group: mammals

• Genome: human

• Assembly: hg19

• Search “LCT”
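The same Gateway choices can also be written as a link. The sketch below assumes the browser's `hgTracks` address with the `db`, `position` and `pix` URL parameters; the helper name `browser_url` is hypothetical, and the `https` address is an assumption based on the site URL given earlier.

```python
from urllib.parse import urlencode

# Assumed entry point of the browser display page.
BASE_URL = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_url(db, position, pix=None):
    """Build a browser link for an assembly and a search term or position."""
    params = {"db": db, "position": position}
    if pix is not None:
        # The Gateway slide gives 5000 pixels as the maximum image width.
        params["pix"] = min(int(pix), 5000)
    return BASE_URL + "?" + urlencode(params)


# Exercise 1: human, assembly hg19, search term LCT.
print(browser_url("hg19", "LCT"))
```

Opening the printed link in a web browser should be equivalent to filling in the Gateway form by hand.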

UCSC OVERVIEW OF THE WHOLE GENOME BROWSER PAGE

• Genome viewer section

• Track and image controls


UCSC MOVING IN THE BROWSER

Change your view or location with controls at the top

Use “base” to get right down to the nucleotides

Configure: to change font, window size, more…

• Specify a position

• Walk left or right

• Zoom in

• Zoom out

• Click to zoom 3x and re-center
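The zoom and walk controls amount to simple arithmetic on the start and end of the displayed window. The sketch below shows one way to compute the new window. The function names and the rounding are assumptions for illustration; the browser's own rounding and its handling of chromosome ends may differ.

```python
def zoom(start, end, factor, center=None):
    """Zoom in (factor > 1) or out (factor < 1) around a center base.

    Coordinates are 1-based and inclusive, as shown in the position box.
    """
    width = end - start + 1
    if center is None:
        center = (start + end) // 2
    new_width = max(1, round(width / factor))
    new_start = max(1, center - new_width // 2)
    return new_start, new_start + new_width - 1


def walk(start, end, fraction):
    """Shift the window by a fraction of its width (negative moves left)."""
    width = end - start + 1
    shift = int(width * fraction)
    new_start = max(1, start + shift)
    return new_start, new_start + width - 1


window = (1001, 4000)
print(zoom(*window, 3))                # zoom in 3x around the middle of the window
print(zoom(*window, 3, center=1100))   # click-style zoom 3x, re-centered on base 1100
print(walk(*window, 0.5))              # walk right by half a window
```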


UCSC OVERVIEW OF THE WHOLE GENOME BROWSER PAGE

• Genome viewer section

• Groups of data

UCSC ANNOTATION TRACK DISPLAY OPTIONS

Some data is ON or OFF by default

• Menu links to info about the tracks (content, methods) and/or filters

• You change the track view with pulldown menus

• After making changes, REFRESH to enforce the changes


UCSC NEW INFO: ENCODE

• Data from the ENCODE project

• Data lifted from other builds

http://www.nature.com/encode/#/threads

EXERCISE 2: CONFIGURE VISUALISATION IN UCSC

• Mapping and Sequencing:

• Base position: DENSE

• Genes and gene predictions:

• UCSC genes: PACK

• RefSeq genes: DENSE

• GENCODE: SHOW

• Phenotype and literature:

• Publications: DENSE

• GWAS catalog: DENSE

• mRNAs and ESTs:

• Human mRNAs: DENSE

• Spliced ESTs: HIDE

• Expression: hide all

• Regulation:

• ENCODE regulation: SHOW

• Set what to show: Transcription FULL, DNaseI DENSE

• Comparative Genomics:

• Conservation: FULL

• Neandertal Assembly and Denisova Assembly: hide all

• Variation:

• Common SNPs: PACK

• Repeats:

• RepeatMasker: DENSE
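Part of the configuration in this exercise can also be expressed as URL parameters, where each track is given one of the visibility modes hide, dense, squish, pack or full. The sketch below is a partial illustration only. The track identifiers (`knownGene`, `refGene`, `mrna`, `intronEst`, `rmsk`) are assumed to be the hg19 table names behind the corresponding menu labels, the `hideTracks` parameter is assumed to switch off everything not listed, and `tracks_url` is a hypothetical helper. SHOW applies to grouped tracks and is not covered here.

```python
from urllib.parse import urlencode

VISIBILITIES = {"hide", "dense", "squish", "pack", "full"}


def tracks_url(db, position, tracks, hide_others=True):
    """Build a browser link that sets the visibility of the given tracks."""
    for name, visibility in tracks.items():
        if visibility not in VISIBILITIES:
            raise ValueError(f"unknown visibility {visibility!r} for track {name!r}")
    params = {"db": db, "position": position}
    if hide_others:
        params["hideTracks"] = 1  # assumed: hide every track not listed below
    params.update(tracks)
    return "https://genome.ucsc.edu/cgi-bin/hgTracks?" + urlencode(params)


# A subset of the exercise settings; identifiers are assumptions (see above).
exercise_tracks = {
    "knownGene": "pack",   # UCSC genes
    "refGene": "dense",    # RefSeq genes
    "mrna": "dense",       # Human mRNAs
    "intronEst": "hide",   # Spliced ESTs
    "rmsk": "dense",       # RepeatMasker
}

print(tracks_url("hg19", "LCT", exercise_tracks))
```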

UCSC WHAT WE WILL FIND


UCSC ANNOTATION TRACK

• informative description

• other resource links

• microarray data

• mRNA secondary structure

• links to sequences

• protein domains/structure

• homologs in other species

• Gene Ontology™ descriptions

• mRNA descriptions

• pathways

Not all genes have this much detail. Different annotation tracks carry different detail data.


UCSC GET SEQUENCES

• Click the line

• Click the item

• Sequence section on detail page


EXERCISE 3: GET THE SEQUENCE

• Get the first 15 lines of the LCT genomic sequence in FASTA format and save them in a text file.
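Once the FASTA text has been copied from the browser, trimming it to its first 15 lines takes only a few lines of code. The sketch below uses a synthetic record rather than real genomic sequence; the function name `save_first_lines` and the output file name are hypothetical. Note that the header line starting with ">" counts as one of the 15 lines.

```python
import tempfile
from pathlib import Path


def save_first_lines(fasta_text, path, n_lines=15):
    """Write the first n_lines lines of FASTA text to path and return them."""
    lines = fasta_text.splitlines()[:n_lines]
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


# Synthetic record for demonstration only: one header and 30 sequence lines.
demo = ">demo_record\n" + "\n".join("ACGT" * 10 for _ in range(30)) + "\n"

out_path = Path(tempfile.gettempdir()) / "demo_first15.fa"
kept = save_first_lines(demo, out_path)
print(kept[0])
print(len(kept), "lines written to", out_path)
```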

UCSC GET SEQUENCES WITH EXTENDED OPTIONS

• Use the DNA link at the top

• Plain or Extended options

• Change colors, fonts, etc.

EXERCISE 3: GET THE SEQUENCE

• Get the sequence chr1:128,284,800-128,290,000 in Mus musculus and save it in FASTA format in a text file.
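A region like this can also be requested programmatically. The sketch below assumes the UCSC REST API endpoint `getData/sequence`, which takes a 0-based start coordinate and returns JSON with a `dna` field, and it assumes the mouse assembly `mm10` because the exercise does not name one. The helper names and the 60-character line width are also assumptions. The network call is defined but not executed here.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://api.genome.ucsc.edu/getData/sequence"  # assumed endpoint


def parse_position(position):
    """Split 'chr1:128,284,800-128,290,000' into (chrom, start, end)."""
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", position.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {position!r}")
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    return match.group(1), start, end


def sequence_url(genome, position):
    """Build the request URL for a browser-style position."""
    chrom, start, end = parse_position(position)
    # Browser positions are 1-based and inclusive; the API start is 0-based.
    query = urlencode({"genome": genome, "chrom": chrom, "start": start - 1, "end": end})
    return f"{API}?{query}"


def to_fasta(header, sequence, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{header}\n" + "\n".join(rows) + "\n"


def fetch_fasta(genome, position):
    """Download the region and return it as FASTA text (requires network access)."""
    with urlopen(sequence_url(genome, position)) as response:
        dna = json.load(response)["dna"]
    return to_fasta(f"{genome}_{position.replace(',', '')}", dna)


print(sequence_url("mm10", "chr1:128,284,800-128,290,000"))
```

Calling `fetch_fasta` and writing its result to a text file would complete the exercise without the web interface, under the assumptions stated above.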

UCSC BLAT TOOL

• Rapid searches by INDEXING the entire genome

• Works best with high similarity matches

BLAT = BLAST-like Alignment Tool
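The idea of searching by indexing the genome can be shown on a toy scale. The sketch below builds a table of non-overlapping k-mers from a short made-up "genome", looks up every k-mer of a query in that table, and counts hits per diagonal (genome offset minus query offset). It is a simplified illustration of index-based search, not BLAT's actual implementation; the sequence, the value of k and the function names are all assumptions.

```python
from collections import Counter, defaultdict


def build_index(genome, k):
    """Map each non-overlapping k-mer of the genome to its start offsets."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def best_diagonal(index, query, k):
    """Return (diagonal, hit_count) for the diagonal with the most k-mer hits."""
    diagonals = Counter()
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], ()):
            diagonals[g - q] += 1
    return diagonals.most_common(1)[0] if diagonals else None


# Toy data: the query is an exact copy of genome positions 10..29 (0-based).
genome = "TTGACCGTAAGCTAGGCTTACGATCGGATCCATGCAAGTC"
query = genome[10:30]

index = build_index(genome, 4)
print(best_diagonal(index, query, 4))
```

Here the winning diagonal is 10, the offset at which the query was cut from the toy genome. Exact k-mer hits become rarer as the query diverges from the genome, which is consistent with the note that the tool works best with high similarity matches.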


UCSC BLAT TOOL

• Paste one or more sequences, or upload

• Make choices

• Submit

UCSC BLAT RESULTS, ALIGNMENT DETAILS

• Your query

• Genomic match, color cues

• Side-by-side alignment


EXERCISE 4: ALIGNMENT

• Align the two FASTA sequences to the Mus musculus genome.
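Before pasting several sequences into the BLAT form, it can help to check that the text really contains the expected records. The sketch below is a minimal multi-record FASTA reader; the function name `parse_fasta` and the example records are made up for illustration.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from multi-record FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Two made-up records standing in for the sequences saved earlier.
example = ">first\nACGTAC\nGGTT\n>second\nTTGACC\n"
for name, sequence in parse_fasta(example):
    print(name, len(sequence))
```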

UCSC WHOLE GENOMES DATA
