BIOLOGICAL DATABASES UCSC GENOME BROWSER Letizia Marullo MSc, PhD

Post on 05-Oct-2020


Page 1: BIOLOGICAL DATABASES UCSC GENOME BROWSERm.docente.unife.it/silvia.fuselli/dispense-corsi/BAG_UC... · 2014. 11. 25. · BIOLOGICAL DATABASES UCSC GENOME BROWSER Letizia Marullo MSc,

BIOLOGICAL DATABASES

UCSC GENOME BROWSER

Letizia Marullo, MSc, PhD

NCBI: GENE, GENBANK, NUCLEOTIDE

FASTA format

UCSC GENOME BROWSER

• University of California Santa Cruz http://genome.ucsc.edu/

Santa Cruz, CA

THE UCSC HOME PAGE

• Navigate

• General information

• Specific information: new features, current status, etc.

Inga Prokopenko, MSc, PhD, Senior PostDoc, Senior Lecturer in Human Genomics, Imperial College of London

UCSC THE GENOME BROWSER GATEWAY START PAGE

Make your Gateway choices:

1. Select Clade

2. Select species: search 1 species at a time

3. Assembly: the official backbone DNA sequence

4. Position: location in the genome to examine

5. Image width: how many pixels in display window; 5000 max

6. Configure: make fonts bigger + other choices

Search “BRCA1”

UCSC THE GENOME BROWSER GATEWAY START PAGE

Text/ID searches

Use this Gateway to search by:

• Gene names, symbols

• Chromosome number: chr7, or region: chr11:1038475-1075482

• Keywords: kinase, receptor

• IDs: NP, NM, OMIM, and more…
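The region syntax in the list above (`chr7`, `chr11:1038475-1075482`) can be checked programmatically before it is pasted into the Gateway. The following is a minimal sketch, not part of the Genome Browser itself; the function name `parse_position` is hypothetical, and tolerating thousands separators (as in `chr1:128,284,800-128,290,000`) is an assumption based on how positions are written elsewhere in these slides.

```python
import re

# chrN alone, or chrN:start-end; digits may carry thousands separators.
_POSITION = re.compile(r"^(chr[\w.]+)(?::([\d,]+)-([\d,]+))?$")


def parse_position(text):
    """Split 'chr11:1038475-1075482' into (chrom, start, end).

    A bare chromosome such as 'chr7' returns (chrom, None, None).
    Anything else (gene names, keywords, IDs) raises ValueError.
    """
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome position: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return chrom, None, None
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


print(parse_position("chr11:1038475-1075482"))
print(parse_position("chr7"))
```

Gene names, keywords and IDs are deliberately rejected here: they are resolved by the Gateway search, not by simple pattern matching.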

EXERCISE 1: SEARCHING IN UCSC GENOME BROWSER

• Group: mammals

• Genome: human

• Assembly: hg19

• Search “LCT”
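The same Gateway choices can also be written as a link. The sketch below assumes the `hgTracks` address with `db` and `position` query parameters, which is how Genome Browser links are commonly built; the helper name `browser_url` is hypothetical, and the clade/genome menus have no counterpart in the link because the assembly name (`hg19`) already identifies the species.

```python
from urllib.parse import urlencode

# Assumed entry point of the browser display.
BROWSER = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_url(db, position):
    """Build a browser link for an assembly and a search term or region."""
    return f"{BROWSER}?{urlencode({'db': db, 'position': position})}"


# Exercise 1: human, assembly hg19, search term LCT.
url = browser_url("hg19", "LCT")
print(url)
```

Opening the printed address in a web browser should give the same view as filling in the Gateway form by hand.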

UCSC OVERVIEW OF THE WHOLE GENOME BROWSER PAGE

Genome viewer section

Track and image controls

UCSC MOVING IN THE BROWSER

Change your view or location with controls at the top

Use “base” to get right down to the nucleotides

Configure: to change font, window size, more…

• Specify a position

• Walk left or right

• Zoom in

• Zoom out

• Click to zoom 3x and re-center

• Configure: fonts, window, more
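The walk and zoom controls are simple arithmetic on the displayed interval. The sketch below illustrates that arithmetic only; it is not the browser's own code. The function names `zoom` and `walk` are hypothetical, coordinates are taken as 1-based and inclusive, and the window is clamped at position 1 but not at the chromosome end, because the chromosome length is not known here.

```python
def zoom(start, end, factor, center=None):
    """Return a window `factor` times narrower (factor > 1) or wider (factor < 1).

    The new window is centred on `center`, or on the current midpoint.
    """
    width = end - start + 1
    new_width = max(1, round(width / factor))
    if center is None:
        center = (start + end) // 2
    new_start = max(1, center - new_width // 2)
    return new_start, new_start + new_width - 1


def walk(start, end, fraction):
    """Shift the window right (fraction > 0) or left (fraction < 0) by a share of its width."""
    width = end - start + 1
    new_start = max(1, start + int(width * fraction))
    return new_start, new_start + width - 1


window = (1001, 4000)
print(zoom(*window, 3))               # zoom in 3x around the midpoint
print(zoom(*window, 3, center=1200))  # click to zoom 3x and re-center
print(walk(*window, 0.5))             # walk right by half a screen
```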

UCSC OVERVIEW OF THE WHOLE GENOME BROWSER PAGE

Genome viewer section

Groups of data

UCSC ANNOTATION TRACK DISPLAY OPTIONS

Some data is ON or OFF by default

• Menu links to info about the tracks: content, methods

• You change the view with pulldown menus

• After making changes, REFRESH to enforce the change

Links to info and/or filters

Change track view

Enforce changes

Data from the ENCODE project

Data lifted from other builds

UCSC NEW INFO: ENCODE

http://www.nature.com/encode/#/threads

EXERCISE 2: CONFIGURE VISUALISATION IN UCSC

• Mapping and Sequencing:
    • Base position: DENSE

• Genes and gene predictions:
    • UCSC genes: PACK
    • RefSeq genes: DENSE
    • GENCODE: SHOW

• Phenotype and literature:
    • Publications: DENSE
    • GWAS catalog: DENSE

• mRNAs and ESTs:
    • Human mRNAs: DENSE
    • Spliced ESTs: HIDE

• Expression: hide all

• Regulation:
    • ENCODE regulation: SHOW
    • Set what to show: Transcription FULL, DNaseI DENSE

• Comparative Genomics:
    • Conservation: FULL
    • Neandertal Assembly and Denisova Assembly: hide all

• Variation:
    • Common SNPs: PACK

• Repeats:
    • RepeatMasker: DENSE
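Some of these display settings can also be carried in a browser link, which makes a configuration easy to share. The sketch below assumes that the link accepts one `trackName=mode` pair per track, with the modes hide, dense, squish, pack and full. The internal track names used here (`knownGene`, `refGene`, `rmsk`) are assumptions for hg19 and should be checked against each track's description page; the helper `track_settings_url` is hypothetical. The SHOW setting of grouped tracks is left out because it is not one of the five display modes.

```python
from urllib.parse import urlencode

# Display modes offered by the pulldown menus.
VISIBILITY = {"hide", "dense", "squish", "pack", "full"}


def track_settings_url(db, position, tracks):
    """Build a browser link that sets the display mode of the given tracks."""
    for name, mode in tracks.items():
        if mode not in VISIBILITY:
            raise ValueError(f"unknown display mode for {name}: {mode!r}")
    params = {"db": db, "position": position, **tracks}
    return "https://genome.ucsc.edu/cgi-bin/hgTracks?" + urlencode(params)


# Part of Exercise 2; the track names are assumed, not taken from the slides.
exercise_tracks = {"knownGene": "pack", "refGene": "dense", "rmsk": "dense"}
print(track_settings_url("hg19", "LCT", exercise_tracks))
```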

UCSC WHAT WE WILL FIND...

UCSC ANNOTATION TRACK

• informative description

• other resource links

• microarray data

• mRNA secondary structure

• links to sequences

• protein domains/structure

• homologs in other species

• Gene Ontology™ descriptions

• mRNA descriptions

• pathways

Not all genes have this much detail. Different annotation tracks carry different levels of detail.

UCSC GET SEQUENCES

• Click the line

• Click the item

• Sequence section on detail page

EXERCISE 3: GET THE SEQUENCE

• Get the first 15 lines of the LAC genomic sequence in FASTA format and save them in a text file.
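Once the FASTA text has been copied from the browser, trimming it to the first lines is a small text operation. The sketch below is one way to do it, under the assumption that the `>` header counts as the first of the 15 lines; the function name `head_fasta`, the output file name and the sample record are placeholders, not real LAC sequence.

```python
from pathlib import Path


def head_fasta(fasta_text, n_lines=15):
    """Return the first `n_lines` lines of a FASTA record, header included."""
    lines = fasta_text.splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA text must start with a '>' header line")
    return "\n".join(lines[:n_lines]) + "\n"


# Placeholder record standing in for the text copied from the browser.
sample = ">placeholder_record\n" + "\n".join(["ACGTACGTAC"] * 20) + "\n"

# Hypothetical output file name.
Path("first_15_lines.txt").write_text(head_fasta(sample))
```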

UCSC GET SEQUENCES WITH EXTENDED OPTIONS

• Use the DNA link at the top

• Plain or Extended options

• Change colors, fonts, etc.

EXERCISE 3: GET THE SEQUENCE

• Get the sequence chr1:128,284,800-128,290,000 in Mus musculus and save it in FASTA format in a text file.
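The exercise is meant to be done with the DNA link, but the same region can be requested through the UCSC REST API, which was introduced after these slides were written. The sketch below only builds the request address and formats a returned sequence as FASTA; it does not contact the server. The assembly name `mm10` is an assumption (the slides do not name the mouse assembly), the helper names are hypothetical, and the conversion to a 0-based start reflects the API's coordinate convention as understood here, which should be confirmed in its documentation.

```python
def sequence_api_url(genome, chrom, start, end):
    """Build a getData/sequence request from 1-based, inclusive browser coordinates."""
    # Assumption: the API expects a 0-based start and a 1-based end.
    return ("https://api.genome.ucsc.edu/getData/sequence?"
            f"genome={genome};chrom={chrom};start={start - 1};end={end}")


def to_fasta(header, sequence, width=50):
    """Format a sequence string as a FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{header}\n" + "\n".join(lines) + "\n"


# mm10 is an assumed assembly name for Mus musculus.
print(sequence_api_url("mm10", "chr1", 128284800, 128290000))
```

The sequence returned by the request can then be passed to `to_fasta` and written to a text file.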

UCSC BLAT TOOL

• Rapid searches by INDEXING the entire genome

• Works best with high similarity matches

BLAT = BLAST-like Alignment Tool
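The indexing idea can be illustrated with a toy example. The sketch below stores the positions of non-overlapping k-mers of a made-up "genome" and then looks up every overlapping k-mer of the query. It is a simplified illustration of the principle, not BLAT's implementation: real BLAT uses longer k-mers, then extends and joins hits into alignments, which is omitted here. The function names and sequences are invented for the example.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map each non-overlapping k-mer of the genome to its start offsets."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_hits(index, query, k):
    """Return (query_offset, genome_offset) pairs for every exact k-mer match."""
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], ()):
            hits.append((q, g))
    return hits


genome = "ACGTACGTTTGACCAGTAGGCATCGA"  # made-up sequence
index = build_index(genome, 4)
print(find_hits(index, "GACCAGTAGG", 4))
```

Because a hit requires an exact k-mer match, a query that differs from the genome at many positions produces few or no hits, which is why the approach works best for highly similar sequences.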

UCSC BLAT TOOL

• Paste one or more sequences, or upload

• Make choices

• Submit

UCSC BLAT RESULTS, ALIGNMENT DETAILS

• Your query

• Genomic match, color cues

• Side-by-side alignment

EXERCISE 4: ALIGNMENT

• Align the two FASTA sequences to the Mus musculus genome.

UCSC WHOLE GENOMES DATA