NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center
![Page 1: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/1.jpg)
A Field Guide part 2
August 30, 2005 University of Colorado Health Sciences Center
![Page 2: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/2.jpg)
Part 2
Entrez: text searching
• a GenBank record
• preview/index
BLAST: sequence searching
• pre-computed searches
• algorithms
• what’s new?
VAST: structure searching
Example: mapping oligos to a genome
![Page 3: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/3.jpg)
GenBank Records
Header
Feature Table
Sequence
The Flatfile Format
![Page 4: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/4.jpg)
A Typical GenBank Record
LOCUS NM_019570 4279 bp mRNA linear INV 28-OCT-2004
DEFINITION Mus musculus REV1-like (S. cerevisiae) (Rev1l), mRNA
ACCESSION NM_019570
VERSION NM_019570.3 GI:50811869
KEYWORDS .
DEFINITION = Title
![Page 5: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/5.jpg)
GenBank Record: Feature Table
![Page 6: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/6.jpg)
GenBank Record: Feature Table, cont.
GenPept identifier
![Page 7: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/7.jpg)
GenBank Record: Sequence
skip
![Page 8: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/8.jpg)
Indexing for Nucleotide UID 59958365

| Field | Short form | Indexed Terms |
|---|---|---|
| [primary accession] | [accn] | NM_001012399 |
| [title] | | Bos taurus hemochromatosis (hfe), mRNA. |
| [organism] | [orgn] | Bos taurus |
| [sequence length] | | 1168 |
| [modification date] | [mdat] | 2005/02/19 |
| [properties] | [prop] | biomol mrna; gbdiv mam; srcdb refseq |
![Page 9: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/9.jpg)
Global Entrez Search: HFE
HFE
![Page 10: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/10.jpg)
Entrez Nucleotide: HFE 137 records
Not HFE [Title]
![Page 11: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/11.jpg)
Smarter Query
hfe[title] AND human[orgn]
42 records
Curated HFE splice variants (11 total)
![Page 12: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/12.jpg)
hfe[title] AND human[orgn] (cont.)
Primary data
![Page 13: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/13.jpg)
Preview/Index
![Page 14: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/14.jpg)
Preview/Index
![Page 15: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/15.jpg)
Preview/Index: Properties, srcdb
srcdb
Properties
![Page 16: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/16.jpg)
Preview/Index: Properties, srcdb
…AND srcdb refseq[Properties]
![Page 17: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/17.jpg)
Preview/Index: Properties, srcdb
…AND srcdb ddbj/embl/genbank[Properties]
![Page 18: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/18.jpg)
Database Queries

| Query | Search | Records |
|---|---|---|
| #1 | hfe | 137 |
| #2 | hfe[title] AND human[orgn] | 42 |
| #3 | #2 AND srcdb refseq[prop] | 11 |
| #4 | #2 AND srcdb ddbj/embl/genbank[prop] | 31 |
| #5 | #4 AND gbdiv pri[prop] | 29 |
| #6 | #4 AND gbdiv est[prop] | 2 |

Primate division: gbdiv pri[prop]
EST division: gbdiv est[prop]
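The numbered history queries above are built by AND-ing field-tagged terms onto an earlier query. A minimal Python sketch of that composition follows. The helper names `and_terms` and `esearch_url` are hypothetical, and the E-utilities `esearch.fcgi` endpoint with its `db` and `term` parameters is an assumption that does not appear on the slides, which use the Entrez web interface. The sketch only builds strings and does not contact NCBI.

```python
from urllib.parse import urlencode

def and_terms(*terms):
    """Join Entrez search terms with the Boolean AND operator."""
    return " AND ".join(terms)

# Mirror the slide's history: each query refines an earlier one.
base = and_terms("hfe[title]", "human[orgn]")                 # query #2
refseq = and_terms(base, "srcdb refseq[prop]")                # query #3
primary = and_terms(base, "srcdb ddbj/embl/genbank[prop]")    # query #4
primate = and_terms(primary, "gbdiv pri[prop]")               # query #5

# Assumed endpoint (not from the slides): NCBI E-utilities esearch.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(term, db="nucleotide"):
    """Build, but do not fetch, a search URL for the given term."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term})

print(primate)
print(esearch_url(refseq))
```

Keeping each refinement as its own string makes it easy to compare record counts step by step, as the slide does.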
![Page 19: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/19.jpg)
Molecule Queries

| Query | Search | Records |
|---|---|---|
| #1 | hfe | 116 |
| #2 | hfe[title] AND human[orgn] | 42 |
| #3 | #2 AND biomol mrna[prop] | 29 |
| #4 | #2 AND biomol genomic[prop] | 13 |

Genomic DNA: biomol genomic[prop]
cDNA: biomol mrna[prop]
![Page 20: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/20.jpg)
More Queries…
Fields are database-specific
Entrez Nucleotide
Reviewed RefSeqs with transcript variants:
srcdb refseq reviewed[prop] AND transcript[title] AND variant[title]
Topoisomerase genes from Archaea:
topoisomerase[gene name] AND archaea[organism]
Entrez Gene
Genes on human chromosome 2 with OMIM links
2[chromosome] AND human[organism] AND “gene omim”[filter]
Membrane proteins linked to cancer:
“integral to plasma membrane”[gene ontology] AND cancer[dis]
![Page 21: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/21.jpg)
Other Entrez Databases
UniSTS: markers on the Genethon map of human chromosome 12
Genethon[Map Name] AND human[organism] AND 12[chromosome]
UniGene: rat clusters that have at least one mRNA
rat[organism] NOT 0[mrna count]
Structure: structures of bacterial kinases with resolutions below 2 Å
bacteria[organism] AND kinase AND 000.00:002.00[resolution]
SNP: uniquely mapped microsatellites on human chr2
(microsat[SNP Class] AND 1[Map Weight] AND 2[Chromosome]) AND human[orgn]
![Page 22: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/22.jpg)
Basic Local Alignment Search Tool
![Page 23: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/23.jpg)
BLAST Web Searches, 2005
200,000
![Page 24: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/24.jpg)
Precomputed BLAST Services
Nucleotide or protein: Related Sequences
BLAST link: BLink
Transcript clusters: UniGene
Protein homologs: HomoloGene
![Page 25: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/25.jpg)
Link to Related Sequences
![Page 26: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/26.jpg)
Related Sequences
Most similar
Least similar
![Page 27: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/27.jpg)
BLink (BLAST Link)
![Page 28: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/28.jpg)
BLink Output
Best hits
3D structures
CDD-Search
![Page 29: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/29.jpg)
Global vs Local Alignment
[Figure: Seq 1 and Seq 2 drawn twice, once as a global alignment and once as a local alignment]
![Page 30: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/30.jpg)
Global vs Local Alignment
Seq1: WHEREISWALTERNOW (16aa)
Seq2: HEWASHEREBUTNOWISHERE (21aa)
Global
Seq1: 1 W--HEREISWALTERNOW 16
W HERE
Seq2: 1 HEWASHEREBUTNOWISHERE 21
Local (first alignment)
Seq1: 1 W--HERE 5
W HERE
Seq2: 3 WASHERE 9
Local (second alignment)
Seq1: 1 W--HERE 5
W HERE
Seq2: 15 WISHERE 21
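The contrast between the two alignment modes can be sketched with the textbook dynamic-programming recurrences: Needleman-Wunsch for global alignment and Smith-Waterman for local alignment. This is an illustration only. The scoring values (match +1, mismatch -1, linear gap -1) are assumptions and not the scheme used on the slide, and BLAST itself uses a heuristic rather than full dynamic programming.

```python
def alignment_scores(a, b, match=1, mismatch=-1, gap=-1):
    """Return (global_score, local_score) for sequences a and b.

    Global (Needleman-Wunsch): both sequences are aligned end to end.
    Local (Smith-Waterman): best-scoring pair of substrings; scores
    are floored at zero so an alignment can start anywhere.
    """
    rows, cols = len(a) + 1, len(b) + 1
    glob = [[0] * cols for _ in range(rows)]
    loc = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        glob[i][0] = i * gap          # leading gaps are charged in global mode
    for j in range(1, cols):
        glob[0][j] = j * gap
    best_local = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            glob[i][j] = max(glob[i - 1][j - 1] + s,
                             glob[i - 1][j] + gap,
                             glob[i][j - 1] + gap)
            loc[i][j] = max(0,
                            loc[i - 1][j - 1] + s,
                            loc[i - 1][j] + gap,
                            loc[i][j - 1] + gap)
            best_local = max(best_local, loc[i][j])
    return glob[-1][-1], best_local

seq1 = "WHEREISWALTERNOW"
seq2 = "HEWASHEREBUTNOWISHERE"
global_score, local_score = alignment_scores(seq1, seq2)
print(global_score, local_score)
```

Because the local recurrence never drops below zero, the local score is always at least as high as the global score for the same pair and scoring scheme.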
![Page 31: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/31.jpg)
The Flavors of BLAST
• Standard BLAST
– nucleotide, protein and translations (blastn, blastp, blastx, tblastn, tblastx)
– traditional “contiguous” word hit
• Megablast
– optimized for large batch searches
– can use discontiguous words
• PSI-BLAST
– constructs PSSMs automatically; uses as query
– very sensitive protein search
• RPS BLAST
– searches a database of PSSMs
– tool for conserved domain searches
[Figure: “contiguous” vs discontiguous word hits]
![Page 32: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/32.jpg)
Why Is BLAST So Popular?
• Fast: heuristic approach based on Smith-Waterman
• Local alignments
• Statistical significance: Expect value
• Versatile
– blastn, blastp, blastx, tblastn, tblastx, rps-blast, psi-blast
– www, standalone, and network clients
![Page 33: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/33.jpg)
How BLAST Works
• Make lookup table of “words” for query
• Scan database for hits
• Ungapped extensions of hits (initial HSPs)
• Gapped extensions (no traceback)
• Gapped extensions (traceback; alignment details)
![Page 34: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/34.jpg)
Nucleotide Words
Query: GTACTGGACATGGACCCTACAGGAA
Make a lookup table of words (11-mers):
GTACTGGACAT
TACTGGACATG
ACTGGACATGG
CTGGACATGGA
TGGACATGGAC
GGACATGGACC
GACATGGACCC
ACATGGACCCT
. . .
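The word list above comes from sliding an 11-base window along the query one position at a time. A minimal Python sketch of such a lookup table is shown here. The function name `word_table` is hypothetical, and a plain dictionary stands in for whatever data structure BLAST uses internally.

```python
from collections import defaultdict

def word_table(query, w=11):
    """Map each length-w word in the query to the offsets where it starts."""
    table = defaultdict(list)
    for i in range(len(query) - w + 1):
        table[query[i:i + w]].append(i)
    return table

query = "GTACTGGACATGGACCCTACAGGAA"
table = word_table(query)

# Print the words in query order, as on the slide.
for word, offsets in sorted(table.items(), key=lambda kv: kv[1][0]):
    print(offsets[0], word)
```

The same function yields the protein words of a later slide when called with `w=3` on an amino-acid query, although protein searches also add neighborhood words, which this sketch does not do.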
![Page 35: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/35.jpg)
Protein Words
Query: GTQITVEDLFYNIATRRKALKN
Make a lookup table of words
Word size = 3 (default); word size can only be 2 or 3
GTQ
TQI
QIT
ITV
TVE
VED
EDL
DLF
...
Neighborhood words: LTV, MTV, ISV, LSV, etc.
[ -f 11 = blastp default ]
![Page 36: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/36.jpg)
Minimum Requirements for a Hit
• Nucleotide BLAST requires one exact match
• Protein BLAST requires two neighboring matches within 40 aa
[ -A 40 = blastp default ]
Nucleotide example (one exact match):
ATCGCCATGCTTAATTGGGCTT
CATGCTTAATT
Protein example (two matches, neighborhood words):
GTQITVEDLFYNI
SEI YYN
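Both hit rules can be sketched in a few lines of Python. The first function finds exact word matches between a query and a subject sequence. The second keeps only pairs of hits that lie on the same diagonal and start within a window of each other, which is a simplified reading of the two-hit requirement: real BLAST also handles overlapping hits and neighborhood words, which this sketch ignores. The function names and the sample hit coordinates are hypothetical.

```python
def find_seed_hits(query, subject, w=11):
    """Return (query_offset, subject_offset) for every exact w-mer match."""
    table = {}
    for i in range(len(query) - w + 1):
        table.setdefault(query[i:i + w], []).append(i)
    hits = []
    for j in range(len(subject) - w + 1):
        for i in table.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits

def two_hit_pairs(hits, window=40):
    """Return (diagonal, first_q, second_q) for consecutive hits on the
    same diagonal whose query offsets differ by at most `window`."""
    by_diag = {}
    for q, s in hits:
        by_diag.setdefault(s - q, []).append(q)
    pairs = []
    for diag, offsets in by_diag.items():
        offsets.sort()
        for first, second in zip(offsets, offsets[1:]):
            if 0 < second - first <= window:
                pairs.append((diag, first, second))
    return pairs

# Nucleotide example from the slide: one exact 11-mer match is enough.
print(find_seed_hits("CATGCTTAATT", "ATCGCCATGCTTAATTGGGCTT"))

# Hypothetical protein word hits: two share diagonal 100 and are 15 apart.
print(two_hit_pairs([(5, 105), (20, 120), (30, 200)]))
```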
![Page 37: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/37.jpg)
BLASTP Summary
Query: IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILEV…
Example query words: YLS, HFL
Neighborhood words (neighborhood score threshold T (-f) = 11):
YLS 15, YLT 12, YVS 12, YIT 10, etc. …
HFL 18, HFV 15, HFS 14, HWL 13, NFL 13, DFL 12, HWV 10, etc. …
High-scoring pair (HSP), with the word hits YLS and HFL highlighted:
Query 1 IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESI 47
+E YA YL K F+ L +SP+ +DVNVHP+K V +++ I
Sbjct 287 LEETYAKYLHKGASYFVYLSLNMSPEQLDVNVHPSKRIVHFLYDQEI 333
Gapped extension with trace back
Final HSP
Query 1 IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESI-LEV… 50
+E YA YL K F+YLSL +SP+ +DVNVHP+K VHFL+++ I + +
Sbjct 287 LEETYAKYLHKGASYFVYLSLNMSPEQLDVNVHPSKRIVHFLYDQEIATSI… 337
![Page 38: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/38.jpg)
Scoring Systems - Nucleotides
Identity matrix

| | A | G | C | T |
|---|---|---|---|---|
| A | +1 | -3 | -3 | -3 |
| G | -3 | +1 | -3 | -3 |
| C | -3 | -3 | +1 | -3 |
| T | -3 | -3 | -3 | +1 |

CAGGTAGCAAGCTTGCATGTCA
|| ||||||||||||  |||||
CACGTAGCAAGCTTG-GTGTCA
raw score = 19 - 9 = 10
[ -r 1 -q -3 ]
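The raw score of 10 can be reproduced by walking the two aligned rows column by column. In this sketch the reward (+1) and penalty (-3) come from the slide. Charging the single gap column -3 as well is an assumption made only to match the slide's 19 - 9 arithmetic, since the slide does not state a separate gap cost here.

```python
def raw_score(top, bottom, reward=1, penalty=-3, gap=-3):
    """Sum per-column scores for two equal-length aligned rows ('-' = gap)."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    score = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            score += gap        # assumed: gap column charged like a mismatch
        elif x == y:
            score += reward
        else:
            score += penalty
    return score

top = "CAGGTAGCAAGCTTGCATGTCA"
bottom = "CACGTAGCAAGCTTG-GTGTCA"
print(raw_score(top, bottom))  # 19 matches, 2 mismatches, 1 gap column
```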
![Page 39: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/39.jpg)
NC
BI
Fie
ldG
uid
eScoring Systems - Proteins
Position Independent MatricesPAM Matrices (Percent Accepted Mutation)
• Derived from observation; small dataset of alignments• Implicit model of evolution• All calculated from PAM1• PAM250 widely used
BLOSUM Matrices (BLOck SUbstitution Matrices)• Derived from observation; large dataset of highly
conserved blocks• Each matrix derived separately from blocks with a
defined percent identity cutoff• BLOSUM62 - default matrix for BLAST
Position Specific Score Matrices (PSSMs)PSI- and RPS-BLAST
![Page 40: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/40.jpg)
BLOSUM62
A 4
R -1 5
N -2 0 6
D -2 -2 1 6
C 0 -3 -3 -3 9
Q -1 1 0 0 -3 5
E -1 0 0 2 -4 2 5
G 0 -2 0 -1 -3 -2 -2 6
H -2 0 1 -1 -3 0 0 -2 8
I -1 -3 -3 -3 -1 -3 -3 -4 -3 4
L -1 -2 -3 -4 -1 -2 -3 -4 -3 2 4
K -1 2 0 -1 -3 1 1 -2 -1 -3 -2 5
M -1 -1 -2 -3 -1 0 -2 -3 -2 1 2 -1 5
F -2 -3 -3 -3 -2 -3 -3 -3 -1 0 0 -3 0 6
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4 7
S 1 -1 1 0 -1 0 0 0 -1 -2 -2 0 -1 -2 -1 4
T 0 -1 0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1 1 5
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1 1 -4 -3 -2 11
Y -2 -2 -2 -3 -2 -1 -2 -3 2 -1 -1 -2 -1 3 -3 -2 -2 2 7
V 0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4
X 0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2 0 0 -2 -1 -1 -1
(columns, left to right: A R N D C Q E G H I L K M F P S T W Y V X)
D/F (-3): negative for less likely substitutions
Y/F (3): positive for more likely substitutions
![Page 41: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/41.jpg)
Position-Specific Score Matrix
DAF-1
Serine/Threonine protein kinases catalytic loop
PSSM scores (figure labels): 1, 7, 4, 5, 4
![Page 42: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/42.jpg)
Position-Specific Score Matrix
(columns, left to right: A R N D C Q E G H I L K M F P S T W Y V)
435 K -1 0 0 -1 -2 3 0 3 0 -2 -2 1 -1 -1 -1 -1 -1 -1 -1 -2
436 E 0 1 0 2 -1 0 2 -1 0 -1 -1 0 0 0 -1 0 0 -1 -1 -1
437 S 0 0 -1 0 1 1 0 1 1 0 -1 0 0 0 2 0 -1 -1 0 -1
438 N -1 0 -1 -1 1 0 -1 3 3 -1 -1 1 -1 0 0 -1 -1 1 1 -1
439 K -2 1 1 -1 -2 0 -1 -2 -2 -1 -2 5 1 -2 -2 -1 -1 -2 -2 -1
440 P -2 -2 -2 -2 -3 -2 -2 -2 -2 -1 -2 -1 0 -3 7 -1 -2 -3 -1 -1
441 A 3 -2 1 -2 0 -1 0 1 -2 -2 -2 0 -1 -2 3 1 0 -3 -3 0
442 M -3 -4 -4 -4 -3 -4 -4 -5 -4 7 0 -4 1 0 -4 -4 -2 -4 -1 2
443 A 4 -4 -4 -4 0 -4 -4 -3 -4 4 -1 -4 -2 -3 -4 -1 -2 -4 -3 4
444 H -4 -2 -1 -3 -5 -2 -2 -4 10 -6 -5 -3 -4 -3 -2 -3 -4 -5 0 -5
445 R -4 8 -3 -4 0 -1 -2 -3 -2 -5 -4 0 -3 -2 -4 -3 -3 0 -4 -5
446 D -4 -4 -1 8 -6 -2 0 -3 -3 -5 -6 -3 -5 -6 -4 -2 -3 -7 -5 -5
447 I -4 -5 -6 -6 -3 -4 -5 -6 -5 3 5 -5 1 1 -5 -5 -3 -4 -3 1
448 K 0 0 1 -3 -5 -1 -1 -3 -3 -5 -5 7 -4 -5 -3 -1 -2 -5 -4 -4
449 S 0 -3 -2 -3 0 -2 -2 -3 -3 -4 -4 -2 -4 -5 2 6 2 -5 -4 -4
450 K 0 3 0 1 -5 0 0 -4 -1 -4 -3 4 -3 -2 2 1 -1 -5 -4 -4
451 N -4 -3 8 -1 -5 -2 -2 -3 -1 -6 -6 -2 -4 -5 -4 -1 -2 -6 -4 -5
452 I -3 -5 -5 -6 0 -5 -5 -6 -5 6 2 -5 2 -2 -5 -4 -3 -5 -3 3
453 M -4 -4 -6 -6 -3 -4 -5 -6 -5 0 6 -5 1 0 -5 -4 -3 -4 -3 0
454 V -3 -3 -5 -6 -3 -4 -5 -6 -5 3 3 -4 2 -2 -5 -4 -3 -5 -3 5
455 K -2 1 1 4 -5 0 -1 -2 1 -4 -2 4 -3 -2 -3 0 -1 -5 -2 -3
456 N 1 1 3 0 -4 -1 1 0 -3 -4 -4 3 -2 -5 -2 2 -2 -5 -4 -4
457 D -3 -2 5 5 -1 -1 1 -1 0 -5 -4 0 -2 -5 -1 0 -2 -6 -4 -5
458 L -3 -1 0 -3 0 -3 -2 3 -4 -2 3 0 1 1 -2 -2 -3 5 -1 -3
catalytic loop
![Page 43: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/43.jpg)
Local Alignment Statistics
High scores of local alignments between two random sequences follow the Extreme Value Distribution (applies to ungapped alignments).
[Figure: number of alignments plotted against score (S), marking your score and the expected number of random hits]
Expect Value: E = number of database hits you expect to find by chance with score ≥ S
$E = Kmn\,e^{-\lambda S}$ or $E = mn\,2^{-S'}$
m, n = lengths of the query and the database; K = scale for search space; λ = scale for scoring system; S' = bit score = $(\lambda S - \ln K)/\ln 2$
More info: www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
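The two forms of the Expect value formula are equivalent once the raw score is converted to a bit score, which a short Python sketch can check numerically. Here `lam` stands for the scoring-system scale (usually written λ) and `k` for the search-space scale K. The values lam = 0.267 and K = 0.041 are assumptions, the parameters commonly reported for gapped BLOSUM62 searches with gap costs 11 and 1. With them, the raw score of 741 reported on a later slide converts to about 290 bits, matching that slide. The lengths `m` and `n` are hypothetical.

```python
import math

def bit_score(raw, lam, k):
    """S' = (lam * S - ln K) / ln 2"""
    return (lam * raw - math.log(k)) / math.log(2)

def expect_from_raw(raw, m, n, lam, k):
    """E = K * m * n * exp(-lam * S)"""
    return k * m * n * math.exp(-lam * raw)

def expect_from_bits(bits, m, n):
    """E = m * n * 2 ** (-S')"""
    return m * n * 2.0 ** (-bits)

lam, k = 0.267, 0.041        # assumed gapped BLOSUM62 (11, 1) parameters
m, n = 300, 1_000_000        # hypothetical query and database lengths

bits = bit_score(741, lam, k)
print(round(bits))
print(expect_from_raw(741, m, n, lam, k))
print(expect_from_bits(bits, m, n))
```

Bit scores are convenient because they fold λ and K into the score, so E can be computed from the bit score and the search-space size alone.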
![Page 44: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/44.jpg)
Gapped Alignments
Gapping provides more biologically realistic alignments
Gapped BLAST parameters are simulated for each scoring matrix
Affine gap costs = -(a+bk)
a = gap open penalty, b = gap extend penalty, k = gap length
A gap of length 1 receives the score -(a+b)
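As a worked example of the affine formula, the following Python sketch evaluates the gap score for a few gap lengths. The penalties a = 11 and b = 1 are illustrative assumptions, not values stated on the slide.

```python
def gap_score(k, a, b):
    """Score of a gap of length k under affine costs: -(a + b*k)."""
    if k < 1:
        raise ValueError("gap length must be at least 1")
    return -(a + b * k)

a, b = 11, 1                      # assumed open and extend penalties
for k in (1, 2, 5):
    print(k, gap_score(k, a, b))
```

Under affine costs one long gap is cheaper than several short ones of the same total length, because the open penalty is paid only once.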
![Page 45: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/45.jpg)
An Alignment BLAST Cannot Make
1 GAATATATGAAGACCAAGATTGCAGTCCTGCTGGCCTGAACCACGCTATTCTTGCTGTTG
|| | || || || | || || || || | ||| |||||| | | || | ||| |
1 GAGTGTACGATGAGCCCGAGTGTAGCAGTGAAGATCTGGACCACGGTGTACTCGTTGTCG
61 GTTACGGAACCGAGAATGGTAAAGACTACTGGATCATTAAGAACTCCTGGGGAGCCAGTT
| || || || ||| || | |||||| || | |||||| ||||| | |
61 GCTATGGTGTTAAGGGTGGGAAGAAGTACTGGCTCGTCAAGAACAGCTGGGCTGAATCCT
121 GGGGTGAACAAGGTTATTTCAGGCTTGCTCGTGGTAAAAAC
|||| || ||||| || || | | |||| || |||
121 GGGGAGACCAAGGCTACATCCTTATGTCCCGTGACAACAAC
Reason: no contiguous exact match of 7 bp.
![Page 46: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/46.jpg)
An Alignment BLAST Can Make
Solution: compare protein sequences; BLASTX
BLAST 2 Sequences (blastx) output:
Score = 290 bits (741), Expect = 7e-77
Identities = 147/331 (44%), Positives = 206/331 (61%), Gaps = 8/331 (2%)
Frame = +3
![Page 47: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/47.jpg)
Other BLAST Algorithms
• Megablast
• Discontiguous Megablast
• PSI-BLAST
• PHI-BLAST
![Page 48: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/48.jpg)
Megablast: NCBI’s Genome Annotator
• Long alignments of similar DNA sequences
• Greedy algorithm
• Concatenation of query sequences
• Faster than blastn; less sensitive
![Page 49: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/49.jpg)
MegaBLAST & Word Size
Trade-off: sensitivity vs speed
Too fast for you?
![Page 50: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/50.jpg)
MegaBLAST & Word Size
Trade-off: sensitivity vs speed

| WORD SIZE | default | minimum |
|---|---|---|
| blastn | 11 | 7 |
| megablast | 28 | 8 |
| blastp | 3 | 2 |
![Page 51: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/51.jpg)
Discontiguous Megablast
• Uses discontiguous word matches
• Better for cross-species comparisons
![Page 52: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/52.jpg)
Templates for Discontiguous Words
W = 11, t = 16, coding: 1101101101101101
W = 11, t = 16, non-coding: 1110010110110111
W = 12, t = 16, coding: 1111101101101101
W = 12, t = 16, non-coding: 1110110110110111
W = 11, t = 18, coding: 101101100101101101
W = 11, t = 18, non-coding: 111010010110010111
W = 12, t = 18, coding: 101101101101101101
W = 12, t = 18, non-coding: 111010110010110111
W = 11, t = 21, coding: 100101100101100101101
W = 11, t = 21, non-coding: 111010010100010010111
W = 12, t = 21, coding: 100101101101100101101
W = 12, t = 21, non-coding: 111010010110010010111
Reference: Ma, B, Tromp, J, Li, M. PatternHunter: faster and more sensitive homology search. Bioinformatics March, 2002; 18(3):440-5
W = word size; # matches in template
t = template length
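A template marks which positions in a window must match (1) and which are ignored (0). The Python sketch below applies the W = 11, t = 16 coding template to two windows that differ only at ignored positions. The function name `masked_word` and both example sequences are hypothetical. The sketch shows only how a template selects positions, not how megablast indexes or scans a database.

```python
def masked_word(seq, start, template):
    """Return the bases under the template's '1' positions, from `start`."""
    window = seq[start:start + len(template)]
    if len(window) < len(template):
        raise ValueError("window runs past the end of the sequence")
    return "".join(base for base, flag in zip(window, template) if flag == "1")

CODING_11_16 = "1101101101101101"   # W = 11, t = 16, coding (from the slide)

# Two hypothetical 16-base windows differing at three ignored positions.
a = "ATGGCTAAGGTTCTGA"
b = "ATAGCCAAGGTTCTCA"

print(CODING_11_16.count("1"), len(CODING_11_16))
print(masked_word(a, 0, CODING_11_16))
print(masked_word(b, 0, CODING_11_16))
```

The coding template skips every third position, so two coding sequences that differ mainly at the third codon position can still produce a word hit even though they share no contiguous 11-mer.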
![Page 53: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/53.jpg)
Discontiguous (Cross-species) MegaBLAST
![Page 54: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/54.jpg)
Discontiguous Word Options
![Page 55: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/55.jpg)
MegaBLAST vs Discontiguous MegaBLAST
NM_017460 Homo sapiens cytochrome P450, family 3, subfamily A, polypeptide 4 (CYP3A4), transcript variant 1, mRNA (2768 letters)
vs Drosophila
![Page 56: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/56.jpg)
MegaBLAST vs Discontiguous MegaBLAST
MegaBLAST = “No significant similarity found.”
Discontiguous megaBLAST =
![Page 57: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/57.jpg)
Another Example . . .
Query: NM_078651
Drosophila melanogaster CG18582-PA (mbt) mRNA, (3244 bp)
/note= mushroom bodies tiny; synonyms: Pak2, STE20, dPAK2
Database: nr (nt), Mammalia[orgn]
MegaBLAST = “No significant similarity found.”
Discontiguous megaBLAST = numerous hits . . .
![Page 58: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/58.jpg)
Ex: Discontiguous MegaBLAST
![Page 59: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/59.jpg)
Ex: BLASTN
![Page 60: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/60.jpg)
PSI-BLAST
Position-specific Iterated BLAST
Example: Confirming relationships of purine nucleotide metabolism proteins
![Page 61: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/61.jpg)
PSI-BLAST
>gi|113340|sp|P03958|ADA_MOUSE ADENOSINE DEAMINASE (ADENOSINE
MAQTPAFNKPKVELHVHLDGAIKPETILYFGKKRGIALPADTVEELRNIIGMDKPLSLPGFVIAGCREAIKRIAYEFVEMKAKEGVVYVEVRYSPHLLANSKVDPMPWNQTEGDVTPDDVVDEQAFGIKVRSILCCMRHQPSWSLEVLELCKKYNQKTVVAMDLAGDETIEGSSLFPGHVEAYRTVHAGEVGSPEVVREAVDILKTERVGHGYHTIEDEALYNRLLKENMHFEVCPWSSYLTGAVRFKNDKANYSLNTDDPLIFKSTLDTDYQMTKKDMGFTEEEFKRLNINAAKSSFLPEEEKK
0.005 E value cutoff for PSSM
![Page 62: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/62.jpg)
RESULTS: Initial BLASTP
Same results as protein-protein BLAST; different format
![Page 63: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/63.jpg)
Results of First PSSM Search
Other purine nucleotide metabolizing enzymes not found by ordinary BLAST
![Page 64: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/64.jpg)
Tenth PSSM Search: Convergence
Just below threshold, another nucleotide metabolism enzyme
Check to add to PSSM
![Page 65: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/65.jpg)
Reverse PSI-BLAST (RPS-BLAST)
![Page 66: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/66.jpg)
Adenosine/AMP Deaminase Domain
AMP Deaminases
. . .
![Page 67: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/67.jpg)
PHI-BLAST
>gi|231729|sp|P30429|CED4_CAEEL CELL DEATH PROTEIN 4
MLCEIECRALSTAHTRLIHDFEPRDALTYLEGKNIFTEDHSELISKMSTRLERIANFLRIYRRQASELIDFFNYNNQSHLADFLEDYIDFAINEPDLLRPVVIAPQFSRQMLDRKLLLGNVPKQMTCYIREYHVIKKLDEMCDLDSFFLFLHGRAGSGKSVIASQALSKSDQLIGINYDSIVWLKDSGTAPKSTFDLFTDILKSEDDLLNFPSVEHVTSVVLKRMICNALIDRPNTLFVFDDVVQEETIRWAQELRLRCLVTTRDVEIASQTCEFIEVTSLEIDECYDFLEAYGMPMPVGEKEEDVLNKTIELSSGNPATLMMFFKSCEPKTFEK
[GA]xxxxGK[ST]
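The pattern reads as: G or A, any four residues, then G, K, and S or T. A minimal Python sketch can locate it with a regular expression. The translation used here, replacing each lowercase `x` with the regex wildcard, is an assumption that covers only the notation shown on the slide and is not a full parser for PHI-BLAST pattern syntax. The test fragment is copied from the CED4 sequence above.

```python
import re

def pattern_to_regex(pattern):
    """Translate the slide's notation, where 'x' means any residue."""
    return re.compile(pattern.replace("x", "."))

p_loop = pattern_to_regex("[GA]xxxxGK[ST]")

# Fragment taken from the CED4_CAEEL sequence on the slide.
fragment = "DSFFLFLHGRAGSGKSVIASQ"
match = p_loop.search(fragment)
print(match.start(), match.group())
```

Note that the regular expression only finds where the pattern occurs in one sequence. PHI-BLAST additionally requires database hits to contain the pattern and to align well with the query around it.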
![Page 68: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/68.jpg)
Genome BLAST
![Page 69: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/69.jpg)
Genome BLAST via Map Viewer
![Page 70: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/70.jpg)
Example Search Pathways: Hemochromatosis
[Figure: pathway diagram with nodes OMIM, Gene, “hemochromatosis”, HFE, nucleotide sequence, Genome BLAST, Map Viewer, SNP, Protein, Domains; legend: text search, sequence search]
![Page 71: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/71.jpg)
Example: Human Genome BLAST
TGCCTCCTTTGGTGAAGGTGACACATCATGTGACCTCTTCAGTGACCACTCTACGGTGTCGGGCCTTGAACTACTACCCCCAGAACATCACCATGAAGTGGCTGAAGGATAAGCAGCCAATGGATGCCAAGGAGTTCGAACCTAAAGACGTATTGCCCAATGGGGATGGGACCTACCAGGGCTGGATAACCTTGGCTGTACCCCCTGGGGAAGAGC
Human EST
![Page 72: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/72.jpg)
Human Genome BLAST: Results
![Page 73: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/73.jpg)
Human Genome BLAST: MapViewer
Entrez Gene
![Page 74: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/74.jpg)
What’s New?
![Page 75: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/75.jpg)
BLAST Databases
Nucleotide
• refseq_rna = NM_*, XM_*
• refseq_genomic = NC_*, NG_*
• env_nt: environmental sample[filter], e.g., 16S rRNA
Protein
• refseq = NP_*, XP_*
• env_nr
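The database names above line up with RefSeq accession prefixes. A small sketch of that mapping in Python, using only the prefixes and database labels listed on the slide; the function name `blast_db_for_accession` is hypothetical and exists only for this example.

```python
# Accession prefix -> BLAST database label, as listed on the slide.
REFSEQ_PREFIX_TO_DB = {
    "NM_": "refseq_rna",
    "XM_": "refseq_rna",
    "NC_": "refseq_genomic",
    "NG_": "refseq_genomic",
    "NP_": "refseq",  # protein
    "XP_": "refseq",  # protein
}

def blast_db_for_accession(accession):
    """Return the slide's database label for a RefSeq accession, or None if unlisted."""
    return REFSEQ_PREFIX_TO_DB.get(accession[:3].upper())

print(blast_db_for_accession("NM_019570"))
```

Prefixes not on the slide return `None` rather than a guess.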
![Page 76: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/76.jpg)
New Formatter
Select lower case Select red
![Page 77: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/77.jpg)
New Formatter
• gray line = same database hit
• HSPs are color-coded independently
![Page 78: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/78.jpg)
BLAST Output: Alignments & Filter
low complexity sequence filtered
![Page 79: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/79.jpg)
Advanced Options
Limit to Organism
all[filter] NOT ma
Example Entrez Queries
• all[Filter] NOT mammalia[Organism]
• ray finned fishes[Organism]
• srcdb refseq[Properties]
Nucleotide only:
• biomol mrna[Properties]
• biomol genomic[Properties]
Other Advanced
• -e 10000 expect value
• -v 2000 descriptions
• -b 2000 alignments
-e 10000 -v 2000
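The option string typed into the "Other Advanced" box can be read back against the legend on this slide (-e expect value, -v descriptions, -b alignments). The following Python sketch does only that. It is an illustration with hypothetical names (`OPTION_MEANINGS`, `parse_advanced_options`) and does not run BLAST or validate the values.

```python
# Meanings taken from the slide's "Other Advanced" legend.
OPTION_MEANINGS = {
    "-e": "expect value",
    "-v": "descriptions",
    "-b": "alignments",
}

def parse_advanced_options(text):
    """Map each flag/value pair in the option string to its meaning on the slide."""
    # Slides often carry a typographic en dash; normalize it to a plain hyphen.
    tokens = text.replace("\u2013", "-").split()
    settings = {}
    for flag, value in zip(tokens[0::2], tokens[1::2]):
        settings[OPTION_MEANINGS.get(flag, flag)] = float(value)
    return settings

print(parse_advanced_options("-e 10000 -v 2000"))
```

Unknown flags are kept under their own name so nothing is silently dropped.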
![Page 80: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/80.jpg)
Searching by Structure
Why search for similar structures?
• Find homologs with low sequence similarity
• Explore protein evolution: similar protein folds can support different functions
• Identify conserved core elements to model related proteins of unknown structure
![Page 81: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/81.jpg)
Indexing into MMDB
Structure
id 1 , name "helix 1" , type helix , location subgraph residues interval { { molecule-id 1 , from 49 , to 61 } } } ,
Add secondary structure
inter-residue-bonds { { atom-id-1 { molecule-id 1 , residue-id 1 , atom-id 1 } , atom-id-2 { molecule-id 1 , residue-id 2 , atom-id 9 } } ,
Add chemical bonds
• Import only experimentally determined structures
• Convert to ASN.1
• Verify sequences
• Create “backbone” model (Cα, P only)
• Create single-conformer model
MMDB: Molecular Modeling Data Base
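To make the "backbone" model step concrete: keeping only Cα and P atoms reduces each protein residue or nucleotide to a single point. The sketch below shows that filtering on PDB-style ATOM records in Python. It is an illustration only, since MMDB itself stores structures in ASN.1 and its actual processing is not shown here. The coordinates are invented, and `backbone_only` is a hypothetical helper.

```python
# Invented PDB-style ATOM records; the atom name sits in columns 13-16.
PDB_LINES = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
    "ATOM      3  C   MET A   1      26.913  26.639   3.531  1.00  9.62           C",
    "ATOM      4  P     G B   1      10.000  11.000  12.000  1.00 20.00           P",
]

def backbone_only(pdb_lines):
    """Keep ATOM records whose atom name is CA (protein) or P (nucleic acid)."""
    kept = []
    for line in pdb_lines:
        # Only ATOM records are considered, so HETATM calcium ions are not kept.
        if line.startswith("ATOM") and line[12:16].strip() in ("CA", "P"):
            kept.append(line)
    return kept

for line in backbone_only(PDB_LINES):
    print(line[12:16].strip(), line[17:20].strip())
```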
![Page 82: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/82.jpg)
Structure Summary
Conserved Domains
3D Domain Neighbors
Structure Neighbors
![Page 83: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/83.jpg)
3D Domains
(Figure: structure with 3D domains labeled 1 to 4.)
![Page 84: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/84.jpg)
Conserved Domains
TyrKc
SH3
SH2
![Page 85: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/85.jpg)
VAST: Alignment
For each protein chain,
• locate SSEs (secondary structure elements),
• represent SSEs as individual vectors,
• align the vectors.
(Figure: Human IL-4 with SSE vectors numbered 1 to 6; IL-4 & Leptin aligned.)
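The idea of treating each SSE as a vector can be sketched in a few lines of Python. This example only builds a vector from the first and last point of an element and measures the angle between two such vectors. It is a toy illustration under that assumption, not VAST's actual alignment or scoring, and the coordinates and function names are invented.

```python
import math

def sse_vector(start, end):
    """Vector from the first to the last point of a secondary structure element."""
    return tuple(e - s for s, e in zip(start, end))

def angle_deg(u, v):
    """Angle between two SSE vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding just outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

# Two invented helices: one along x, one along y.
helix_a = sse_vector((0.0, 0.0, 0.0), (10.0, 0.0, 0.0))
helix_b = sse_vector((2.0, 1.0, 0.0), (2.0, 6.0, 0.0))
print(round(angle_deg(helix_a, helix_b), 1))
```

Comparing a handful of vectors per chain, rather than every atom, is what makes this representation convenient for structure comparison.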
![Page 86: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/86.jpg)
VAST
Structure neighbors
Taq DNA polymerase
![Page 87: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/87.jpg)
VAST Results for the Chain
Table view
![Page 88: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/88.jpg)
VAST
Vector Alignment Search Tool
3D Domain structure neighbors
![Page 89: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/89.jpg)
VAST Results for Domain 1
Not found with Chain query!
![Page 90: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/90.jpg)
Best way to convert PDB files to MMDB format for viewing with Cn3D:
submit the file to PDB
![Page 91: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/91.jpg)
Example: Mapping Oligos Onto a Genome
>forward
CCATGGCGACCCTGGAAAAGC
>reverse
CAGCAGCGGCTGTGCCTGCGG
![Page 92: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/92.jpg)
Map Oligos Onto Genome
>CCATGGCGACCCTGGAAAAGCNNNNNNNNNNCAGCAGCGGCTGTGCCTGCGG
-W 7 -e 1000
forward primer reverse primer
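The query on this slide joins the two primers with a run of N's so that both can be searched in a single submission. In the legacy BLAST options shown, -W sets the word size and -e the expect threshold; short oligos generally need a small word size and a permissive expect value to return hits at all. A minimal Python sketch of building such a query follows. The spacer length of ten and the name `build_oligo_query` are assumptions made for this example.

```python
FORWARD = "CCATGGCGACCCTGGAAAAGC"
REVERSE = "CAGCAGCGGCTGTGCCTGCGG"

def build_oligo_query(forward, reverse, spacer_len=10):
    """Join two primers with a run of N's so they form one BLAST query."""
    return forward + "N" * spacer_len + reverse

query = build_oligo_query(FORWARD, REVERSE)
print(">oligos")
print(query)
```

Because the N's cannot seed or extend an alignment, each primer is expected to align as its own segment, which is why the results list the forward and reverse primers separately.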
![Page 93: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/93.jpg)
Genome BLAST Results
![Page 94: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/94.jpg)
Primer Alignments
forward primer
reverse primer
![Page 95: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/95.jpg)
MapViewer
![Page 96: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/96.jpg)
MapViewer
![Page 97: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/97.jpg)
Sequence View (sv)
forward
reverse
![Page 98: NCBI FieldGuide A Field Guide part 2 August 30, 2005 University of Colorado Health Sciences Center](https://reader036.vdocument.in/reader036/viewer/2022062802/56649e905503460f94b950da/html5/thumbnails/98.jpg)
Service Addresses
• BLAST [email protected]
• General Help [email protected]
• Wayne Matten [email protected]