WSSP-10 Chapter 7 BLASTN: DNA vs DNA searches


Post on 11-May-2022


DSAP: BLASTn Page

p. 7-1

p. 7-1

NCBI BLAST Home Page


p. 7-2

NCBI BLASTN search page

p. 7-2

Copy the sequence from DSAP or the waveform program


p. 7-3

Choose a database (nr/nt or est)

p. 7-4

Search options (Use defaults)
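For readers who later want to script the same submission instead of filling in the web form, the request can be sketched as below. The sketch only builds the form body and sends nothing. The parameter names (CMD, PROGRAM, DATABASE, QUERY) are those of NCBI's BLAST URL API; the endpoint constant, the database identifier "nt", and the helper name build_blastn_request are assumptions made for illustration, so check NCBI's current documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint of the NCBI BLAST URL API (not taken from the slides).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastn_request(sequence: str, database: str = "nt") -> str:
    """Return the URL-encoded form body for a BLASTN submission.

    Hypothetical helper: it cleans the pasted sequence and leaves every
    other search option at the server default, as the slide recommends.
    """
    cleaned = "".join(sequence.split()).upper()  # drop spaces and line breaks
    if not cleaned or set(cleaned) - set("ACGTN"):
        raise ValueError("query must be a non-empty DNA sequence")
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastn",   # DNA query against a DNA database
        "DATABASE": database,  # assumed identifier; the slides name nr/nt and est
        "QUERY": cleaned,
    }
    return urlencode(params)


body = build_blastn_request("atgttggaag ggagggcgag")
print("POST", BLAST_URL)
print(body)
```

Keeping the builder separate from any network call makes it easy to inspect exactly what would be submitted before a search is run.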


p. 7-5

BLASTN progress report (search may take a few minutes)

p. 7-5

Format options (use defaults)


p. 7-6

EX1.10 BLASTN nr/nt database

Graphic report of EX2.09

p. 7-7


p. 7-7

BLASTN list of matches for EX1.10

EX2.09 BLASTN

p. 7-9


Best match to EX1.10

p. 7-9

>gi|226493893|ref|NM_001157047.1| Zea mays dynein light chain LC6, flagellar outer arm (LOC100284150), mRNA
Length=606

Score = 221 bits (244), Expect = 5e-54
Identities = 218/282 (77%), Gaps = 0/282 (0%)
Strand=Plus/Plus

Query  11   ATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGAG  70
            ||||||||||| | ||||   || || |||||||||||||||  ||||||||| ||  ||
Sbjct  104  ATGTTGGAAGGAAAGGCGGTGGTGGAGGACACCGACATGCCGGCGAAGATGCAAGCCCAG  163

Query  71   GCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCAAGAGCCTC  130
            || |||   || || ||    || || ||||  |||||||||   ||||||  |||| ||
Sbjct  164  GCGATGTCGGCGGCGTCCAGGGCCCTGGATCGCTTCGACGTCCTCGACTGCCGGAGCATC  223

Query  131  GCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGTGCGTCGTC  190
            ||  | || ||||||||||| |||||   |||| | || || |||||||| ||||| ||
Sbjct  224  GCGTCCCACATCAAGAAGGAGTTTGACGCGATCCATGGCCCCGGATGGCAATGCGTGGTT  283

Query  191  GGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACTTCCGCCTG  250
            |||||| |||||||||| | | |||| ||||  || || |||||||||||||||||||||
Sbjct  284  GGCTCCGGCTTCGGCTGCTACATCACGCACAGCAAGGGGAGCTTCATCTACTTCCGCCTG  343

Query  251  GAGACGCTCCACTTCCTCATCTTCAAAGGCGCGGCCGCTTGA  292
            ||| |||||   |||||| |||||||||| ||||| || |||
Sbjct  344  GAGTCGCTCAGGTTCCTCGTCTTCAAAGGGGCGGCAGCATGA  385

Our Seq.

Database Seq.

Length of sequence

Mismatch

Match
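The labels above (our sequence, database sequence, match, mismatch) can be tied to the numbers on the Identities line with a short sketch. It rebuilds the row of match bars for the first row of the best match and counts identical columns. The function name alignment_stats is hypothetical, and the sketch assumes the two rows are already aligned and of equal length; it does not perform the alignment itself.

```python
def alignment_stats(query: str, sbjct: str):
    """Compare two aligned rows column by column.

    Returns (matches, alignment_length, gaps, match_line), where
    match_line has '|' under identical bases and ' ' elsewhere.
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have the same length")
    matches = gaps = 0
    match_line = []
    for q, s in zip(query.upper(), sbjct.upper()):
        if q == "-" or s == "-":   # a gap in either row
            gaps += 1
            match_line.append(" ")
        elif q == s:               # identical bases
            matches += 1
            match_line.append("|")
        else:                      # mismatch
            match_line.append(" ")
    return matches, len(query), gaps, "".join(match_line)


# First row of the best match to EX1.10 (Query 11-70, Sbjct 104-163).
query = "ATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGAG"
sbjct = "ATGTTGGAAGGAAAGGCGGTGGTGGAGGACACCGACATGCCGGCGAAGATGCAAGCCCAG"

matches, length, gaps, line = alignment_stats(query, sbjct)
print(query)
print(line)
print(sbjct)
print(f"Identities = {matches}/{length} ({100 * matches // length}%), Gaps = {gaps}/{length}")
```

Summing the per-row match counts over all five rows of the alignment gives the 218/282 reported by BLAST.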

Perfect, but short, matches are not usually meaningful

>gi|14250883|emb|AL583809.3|CNS07EFY Human chromosome 14 DNA sequence BAC R-736L22 of library RPCI-11 from chromosome 14 of Homo sapiens (Human), complete sequence
Score = 40.1 bits (20), Expect = 4.6
Identities = 20/20 (100%)

Query: 189    ttttctgaatattcataata  208
              ||||||||||||||||||||
Sbjct: 60645  ttttctgaatattcataata  60626
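Why a perfect 20-base match can still be unconvincing follows from how the Expect value depends on the bit score. In BLAST statistics, E is approximately m * n * 2^(-S'), where S' is the bit score and m * n is the size of the search space. The sketch below back-computes a search space from the two numbers on this slide (40.1 bits, Expect = 4.6) and shows how quickly E falls as the bit score rises. The function names are hypothetical, and treating the search space as a single number ignores the effective-length corrections that BLAST applies, so the output is only indicative.

```python
def expect_value(bit_score: float, search_space: float) -> float:
    """Approximate E-value: chance hits expected with at least this bit score."""
    return search_space * 2.0 ** (-bit_score)


def implied_search_space(bit_score: float, expect: float) -> float:
    """Invert the same relation to estimate m * n from a reported hit."""
    return expect * 2.0 ** bit_score


# Values from the short human match on this slide.
space = implied_search_space(40.1, 4.6)

# Every extra bit halves the number of matches expected by chance.
for bits in (40.1, 50.0, 60.0, 80.0):
    print(f"{bits:5.1f} bits -> E = {expect_value(bits, space):.2g}")
```

An Expect of 4.6 means several matches this good are expected by chance alone in a database of that size, which is why short perfect matches are usually not meaningful.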

7-11


Examine the best alignments: Are they significant?

7-9

       C   R   E   L   L   I   L   D   A
Query TGT CGT GAA CTC CTA ATT CTC GAC GCC
      ||| ||| ||| ||  ||  ||  ||  ||  ||
Sbjct TGT CGT GAA CTT CTG ATC CTT GAT GCA
       C   R   E   L   L   I   L   D   A

Mismatches
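The codon example above can be checked with a short translation sketch: the two DNA sequences differ at six positions, yet both translate to the same peptide because every difference falls on the third base of a codon. The sketch uses the standard genetic code; the function name translate is hypothetical, and the code assumes the sequence is read in frame from its first base.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# The two rows from the slide, with the spaces between codons removed.
query = "TGTCGTGAACTCCTAATTCTCGACGCC"
sbjct = "TGTCGTGAACTTCTGATCCTTGATGCA"

# 1-based positions where the DNA differs.
diffs = [i + 1 for i, (q, s) in enumerate(zip(query, sbjct)) if q != s]

print(translate(query))  # CRELLILDA
print(translate(sbjct))  # CRELLILDA
print(diffs)             # every position is a multiple of 3
```

Mismatches concentrated at third codon positions suggest the two organisms carry the same protein-coding gene with silent differences, which is one reason a 77% DNA identity can still be a meaningful match.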

p. 7-12

Query  69   ATGAACAAGGAGAAGATTCTGAAGCTGGCGAAGGGCTTCCGGGGGAGGGCGAAGAACTGC  128
            ||||||||||  |||||| | ||||| || ||||| ||| | || |||||||| || |||
Sbjct  242  ATGAACAAGGGAAAGATTTTTAAGCTAGCTAAGGGATTCAGAGGAAGGGCGAAAAATTGC  301

Query  129  ATCCGGATCGCGAGGGAGCGGGTGGAGAAGGCGCTCCAGTACTCGTACCGCGATCGCCGC  188
            ||  |||| || |||||| ||||||| ||||| || || || || ||| | ||||| |||
Sbjct  302  ATAAGGATAGCAAGGGAGAGGGTGGAAAAGGCACTGCAATATTCATACAGGGATCGACGC  361

Bad sequence on our part

Bad sequence on their part

Differences in the sequence of the two organisms


       C  R   R  T  P  D  P  *
Query TGTCGT-CGAACTCCTGATCCTTGA
      |||||| ||||||||||||||||||
Sbjct TGTCGTCCGAACTCCTGATCCTTGA
       C  R  E  L  L  I  L  D

p. 7-13

Small gaps alter the reading frame of the protein
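A frameshift is easy to see by translating a sequence before and after a single-base change. The sketch below translates the query row from the slide (gap character removed), which gives the C R R T P D P * reading, and then drops one base to show how every downstream codon changes. Which base to drop is a hypothetical choice made for illustration; it happens to restore the C R E L L I L reading shown on the slide. The function name translate is also hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


read = "TGTCGTCGAACTCCTGATCCTTGA"   # query row from the slide, '-' removed
shifted = read[:6] + read[7:]       # hypothetical edit: drop the seventh base

print(translate(read))     # CRRTPDP*
print(translate(shifted))  # CRELLIL
```

Only one base differs between the two strings, but every amino acid after the second one changes, and a premature stop codon appears in one of the readings.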

Query: 179   TTCGAGCTACCAGATGATC-GATTGGAACAT-T-C--TGTCATTG-AC-CTTC-AGGTAA  230
             |||||||  || | | ||  ||||  || || | |  | | |||  |  |||| |||| |
Sbjct: 4684  TTCGAGCG-CC-GTTAATATGATTACAATATCTACAATATTATTATATGCTTCCAGGTGA  4741

Query: 231   TCAACCATGACCGTGTCAACCGAAACGACGTTATCGGCCGTGCACTATTGAACATGGAGG  290
             |||| ||||||||||| ||||| || || || || |||||||| ||  | || ||||| |
Sbjct: 4742  TCAATCATGACCGTGTTAACCGTAATGATGTAATTGGCCGTGCCCTTCTTAATATGGAAG  4801

An example of a match with and without gaps.

p. 7-13


>gi|241990611|dbj|AK330768.1| Triticum aestivum cDNA, clone: SET5_E05, cultivar: Chinese Spring
Length=650

Score = 219 bits (242), Expect = 2e-53
Identities = 211/271 (77%), Gaps = 0/271 (0%)

Query  10   GATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGA  69
            |||| ||||||||| |||||  || || |||||||||||||||   |||||||||  | |
Sbjct  78   GATGCTGGAAGGGAAGGCGACGGTGGAGGACACCGACATGCCGGCCAAGATGCAGCTGCA  137

Query  70   GGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCAAGAGCCT  129
            |||||     || || ||    |||||||| |  |||||||||   ||||||  |||| |
Sbjct  138  GGCCACCTCGGCGGCGTCCAGGGCGCTCGAACGCTTCGACGTCCTCGACTGCCGGAGCAT  197

Query  130  CGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGTGCGTCGT  189
            ||| ||||| ||||||||||| || || | |||| |||| ||||| ||||||||||| ||
Sbjct  198  CGCGGCGCACATCAAGAAGGAGTTCGACACGATCCACGGCCCGGGGTGGCAGTGCGTGGT  257

Query  190  CGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACTTCCGCCT  249
             |||| |||||||||||| | |||||| ||||  || || |||||||| ||||||   ||
Sbjct  258  GGGCTGCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATATACTTCAAGCT  317

Query  250  GGAGACGCTCCACTTCCTCATCTTCAAAGGC  280
             ||| ||||||  |||||| |||||||||||
Sbjct  318  CGAGTCGCTCCGGTTCCTCGTCTTCAAAGGC  348

Alignment of the second best match to EX1.10

p. 7-14

p. 7-14

Alignments near the end of EX1.10

>gi|254826767|ref|NG_012498.1| Homo sapiens glypican 4 (GPC4), RefSeqGene on chromosome X
Length=121142
Score = 71.6 bits (78), Expect = 6e-09
Identities = 42/44 (95%), Gaps = 0/44 (0%)

Query  665    CTAGCTTTTCTTAACaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  708
              || ||||||||||| |||||||||||||||||||||||||||||
Sbjct  72886  CTTGCTTTTCTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA  72929
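One way to judge an alignment like this is to ask how much of it is a single repeated base. The sketch below measures the trailing run of A's in the aligned part of the query; here 29 of the 44 aligned bases are a poly-A run. Reading this as a match driven by the poly-A tail of the cDNA, rather than by the gene itself, is an interpretation offered here and not a statement from the slides. The function name trailing_run is hypothetical.

```python
def trailing_run(seq: str, base: str = "A") -> int:
    """Length of the run of `base` at the 3' end of seq (case-insensitive)."""
    seq = seq.upper()
    return len(seq) - len(seq.rstrip(base.upper()))


# Aligned query region 665-708 from the slide: 15 bases followed by 29 a's.
aligned_query = "CTAGCTTTTCTTAAC" + "a" * 29

run = trailing_run(aligned_query)
share = run / len(aligned_query)
print(f"{run} of {len(aligned_query)} aligned bases ({share:.0%}) are a trailing poly-A run")
```

A high share of a single base suggests checking whether the hit would survive if the low-complexity region were ignored.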


p. 7-15

Fill in the table listing the best matches from three different organisms.

List Wolffia if there is a match

Use the clone report to obtain more information about the gene
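Picking the best match from each of three organisms amounts to grouping hits by organism and keeping the lowest Expect value in each group. The sketch below does this for the three hits shown on the earlier slides. The tuple layout and the function name best_per_organism are hypothetical, and the sketch only ranks hits; whether a given hit is meaningful enough to enter in the table still has to be judged from its alignment.

```python
# (accession, organism, Expect) for hits shown on the earlier slides.
hits = [
    ("NM_001157047.1", "Zea mays", 5e-54),
    ("AK330768.1", "Triticum aestivum", 2e-53),
    ("NG_012498.1", "Homo sapiens", 6e-09),
]


def best_per_organism(hits, limit=3):
    """Keep the lowest-Expect hit per organism, then return the top `limit`."""
    best = {}
    for accession, organism, expect in hits:
        if organism not in best or expect < best[organism][1]:
            best[organism] = (accession, expect)
    ranked = sorted(best.items(), key=lambda item: item[1][1])
    return [(organism, acc, e) for organism, (acc, e) in ranked[:limit]]


top = best_per_organism(hits)
for organism, accession, expect in top:
    print(f"{organism:20s} {accession:16s} E = {expect:g}")
```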

p. 7-15


3) Perform a BLASTn search of the est database

Change the database

p. 7-17

p. 7-17

BLASTn report of the EX1.10 search of the est database


>gi|198335694|gb|GD004539.1| CCHY28888.g1 CCHY Panicum virgatum callus (N) Panicum virgatum cDNA clone CCHY28888 3', mRNA sequence.
Length=624

Score = 246 bits (272), Expect = 1e-61
Identities = 226/286 (79%), Gaps = 0/286 (0%)
Strand=Plus/Minus

Query  3    GAGAGAAGATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGC  62
            |||| |  ||| ||||||||| |||||  || || ||||| |||||||||  ||||||||
Sbjct  527  GAGACACCATGCTGGAAGGGAAGGCGATGGTGGAGGACACGGACATGCCGGCGAAGATGC  468

Query  63   AGGCGGAGGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCA  122
            ||||| |||| |||   || || ||    || ||||| |  |||||||||   ||||||
Sbjct  467  AGGCGCAGGCGATGGCGGCGGCGTCCAGGGCCCTCGACCGCTTCGACGTCCTCGACTGCC  408

Query  123  AGAGCCTCGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGT  182
             |||| |||| ||||| ||||||||||| ||||| | |||| |||| || || ||||| |
Sbjct  407  GGAGCATCGCGGCGCACATCAAGAAGGAGTTTGACACGATCCACGGCCCCGGGTGGCAAT  348

Query  183  GCGTCGTCGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACT  242
            |||| || ||||||||||||||||| | |||||| ||||  || || |||||||||||||
Sbjct  347  GCGTGGTGGGCTCCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATCTACT  288

Query  243  TCCGCCTGGAGACGCTCCACTTCCTCATCTTCAAAGGCGCGGCCGC  288
            |||| || ||| |||||   ||||||||||||||||| ||||| ||
Sbjct  287  TCCGGCTCGAGTCGCTCAGGTTCCTCATCTTCAAAGGGGCGGCAGC  242

Alignment of the best match to EX1.09 from the est search
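In a Strand=Plus/Minus alignment the query matches the reverse complement of the database record, which is why the Sbjct coordinates count down (527 to 242) while the Query coordinates count up. The sketch below shows the reverse-complement operation itself. The function name reverse_complement is hypothetical, and only the four unambiguous bases are handled.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# First ten subject bases as displayed in the alignment (Sbjct 527-518).
displayed = "GAGACACCAT"

# The same ten bases as they would read on the plus strand of the record
# (positions 518-527).
print(reverse_complement(displayed))  # ATGGTGTCTC
```

Applying the function twice returns the original string, which is a quick sanity check when moving between strands.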

p. 7-17

Fill out the DSAP table of the BLASTn search of the est database

p. 7-18


Query  61     CAAGGTCTAAGTACTGAAAAGGAAAGTCTACTAATTACAAAGAAGTTATTGTTTGTACCT  120
              |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Sbjct  13166  CAAGGTCTAAGTACTGAAAAGGAAAGTCCACTAATTACAAAGAAGTTATTGTTTGTACCT  13107

Query  121    TTTGTATCAGGGTTTATTAAATTTCAATCTTTATTGCTGAATCCCGAAACAAGGTGATCT  180
              |||||||||||||||||||||||| |||||| ||||||||||||||||||||||||||||
Sbjct  13106  TTTGTATCAGGGTTTATTAAATTTTAATCTTCATTGCTGAATCCCGAAACAAGGTGATCT  13047
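To go back to the waveform and inspect each difference, it helps to list the mismatches with their coordinates in both sequences. The sketch below does this for the first row of the alignment above. The function name mismatch_positions is hypothetical; it assumes an ungapped alignment, and it steps the subject coordinate downward because the subject numbering in this alignment decreases.

```python
def mismatch_positions(query, sbjct, query_start, sbjct_start, sbjct_step=-1):
    """List (query_pos, query_base, sbjct_pos, sbjct_base) for each mismatch.

    Assumes the two rows are aligned without gaps. Use sbjct_step=1 when the
    subject coordinates increase along the row.
    """
    found = []
    for i, (q, s) in enumerate(zip(query, sbjct)):
        if q != s:
            found.append((query_start + i, q, sbjct_start + sbjct_step * i, s))
    return found


# First row of the alignment (Query 61-120, Sbjct 13166-13107).
query = "CAAGGTCTAAGTACTGAAAAGGAAAGTCTACTAATTACAAAGAAGTTATTGTTTGTACCT"
sbjct = "CAAGGTCTAAGTACTGAAAAGGAAAGTCCACTAATTACAAAGAAGTTATTGTTTGTACCT"

for q_pos, q_base, s_pos, s_base in mismatch_positions(query, sbjct, 61, 13166):
    print(f"query {q_pos} {q_base}  vs  subject {s_pos} {s_base}")
```

Looking up each reported query position in the trace shows whether the base call there was clean or ambiguous.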

Open Question: Why are there differences in the sequences?