DNA Sequence Analysis

Post on 12-Jan-2016


Page 1: DNA Sequence Analysis

DNA Sequence Analysis

David Shiuan

Department of Life Science

Institute of Biotechnology and

Interdisciplinary Program of Bioinformatics

National Dong Hwa University

Page 2: DNA Sequence Analysis

DNA Sequence Analysis (I)

BLAST comparison

ORF (open reading frame) Finder

Promoter Search

- Promoter Prediction (BCM)

- EPD (Eukaryote Promoter Database)

- NNPP prokaryote promoter prediction (BCM)

- ProtScan (BIMAS)

Page 3: DNA Sequence Analysis

DNA Sequence Analysis (II)

Sequence Alignment (Clustal W)

Tree Analysis (MEGA, PAUP, UPGMA)

Motif Prediction

Restriction Analysis (TCGA)

RNAFOLD (GCG)

Page 4: DNA Sequence Analysis

Basic Local Alignment Search Tool

A sequence comparison algorithm, optimized for speed, that is used to search sequence databases for optimal local alignments to a query.

Algorithm: A fixed procedure embodied in a computer program.

Page 5: DNA Sequence Analysis

Basic Local Alignment Search Tool

The initial search is done for a word of length "W" that scores at least "T" when compared to the query using a substitution matrix. Word hits are then extended in either direction in an attempt to generate an alignment with a score exceeding the threshold of "S". The "T" parameter dictates the speed and sensitivity of the search.
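The word-hit and extension steps described above can be sketched in a few lines of Python. This is a simplified illustration, not the NCBI implementation: it compares query words directly with database words instead of building a neighborhood word list, it uses a +1/-1 match/mismatch score as a stand-in for a substitution matrix, and the function names and the `x_drop` stopping rule are assumptions made for the example.

```python
def pair_score(a, b):
    """Stand-in for a substitution matrix: +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


def word_score(u, v):
    """Sum of per-position scores for two equal-length words."""
    return sum(pair_score(a, b) for a, b in zip(u, v))


def extend_hit(query, subject, qi, si, w, x_drop):
    """Extend a word hit without gaps in both directions.

    Returns (best_score, query_start, query_end) of the best segment found.
    Extension in a direction stops once the running score has fallen more
    than x_drop below the best score seen so far.
    """
    best = cur = word_score(query[qi:qi + w], subject[si:si + w])
    q_start, q_end = qi, qi + w

    # Extend to the right of the word.
    i, j = qi + w, si + w
    while i < len(query) and j < len(subject) and best - cur <= x_drop:
        cur += pair_score(query[i], subject[j])
        i += 1
        j += 1
        if cur > best:
            best, q_end = cur, i

    # Extend to the left, starting again from the best score so far.
    cur = best
    i, j = qi - 1, si - 1
    while i >= 0 and j >= 0 and best - cur <= x_drop:
        cur += pair_score(query[i], subject[j])
        if cur > best:
            best, q_start = cur, i
        i -= 1
        j -= 1

    return best, q_start, q_end


def seed_and_extend(query, subject, w=3, t=3, s=6, x_drop=2):
    """Return sorted (score, query_start, query_end, subject_start) tuples."""
    found = set()
    for qi in range(len(query) - w + 1):
        for si in range(len(subject) - w + 1):
            if word_score(query[qi:qi + w], subject[si:si + w]) < t:
                continue  # word pair scores below T: not a hit
            score, q_start, q_end = extend_hit(query, subject, qi, si, w, x_drop)
            if score >= s:  # keep only alignments reaching the threshold S
                found.add((score, q_start, q_end, si - (qi - q_start)))
    return sorted(found, reverse=True)


hits = seed_and_extend("ACGTACGT", "TTACGTACGTTT")
print(hits)
```

Raising `t` reduces the number of word hits that must be extended, which makes the search faster but can miss weaker similarities; lowering it has the opposite effect.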

Page 6: DNA Sequence Analysis

Calculating alignment scores

Page 8: DNA Sequence Analysis

BLOSUM62 Substitution Scoring Matrix

The BLOSUM62 matrix shown here is a 20 x 20 matrix in which every possible identity and substitution is assigned a score based on the observed frequencies of such occurrences in alignments of related proteins.

Identities are assigned the most positive scores.
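As a small illustration of how such a matrix is used, the sketch below scores an ungapped alignment by summing one matrix entry per aligned residue pair. Only a five-residue excerpt of BLOSUM62 (I, L, V, K, R) is hard-coded to keep the example short; the function names are hypothetical, and a real analysis would load the full matrix from a library or file.

```python
# Excerpt of BLOSUM62 covering five residues only; the full matrix is 20 x 20.
# The matrix is symmetric, so each pair is stored once.
BLOSUM62_EXCERPT = {
    ("I", "I"): 4, ("L", "L"): 4, ("V", "V"): 4, ("K", "K"): 5, ("R", "R"): 5,
    ("I", "L"): 2, ("I", "V"): 3, ("L", "V"): 1, ("K", "R"): 2,
    ("I", "K"): -3, ("I", "R"): -3, ("L", "K"): -2, ("L", "R"): -2,
    ("V", "K"): -2, ("V", "R"): -3,
}


def substitution_score(a, b):
    """Look up a symmetric matrix entry; raises KeyError outside the excerpt."""
    if (a, b) in BLOSUM62_EXCERPT:
        return BLOSUM62_EXCERPT[(a, b)]
    return BLOSUM62_EXCERPT[(b, a)]


def ungapped_alignment_score(seq1, seq2):
    """Sum the substitution scores of two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped alignment needs sequences of equal length")
    return sum(substitution_score(a, b) for a, b in zip(seq1, seq2))


print(ungapped_alignment_score("ILVK", "ILVK"))  # identities: 4 + 4 + 4 + 5 = 17
print(ungapped_alignment_score("ILVK", "LLIR"))  # 2 + 4 + 3 + 2 = 11
```

Conservative substitutions such as I/V or K/R still score positively, while dissimilar pairs such as I/K score negatively.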

Page 10: DNA Sequence Analysis

The NCBI BLAST family of programs

blastp compares an amino acid query sequence against a protein sequence database

blastn compares a nucleotide query sequence against a nucleotide sequence database

blastx compares a nucleotide query sequence translated in all reading frames against a protein sequence database

tblastn compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames

tblastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
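The phrase "translated in all reading frames" refers to the three frames of the given strand plus the three frames of its reverse complement. A minimal sketch, assuming the standard genetic code and uppercase A/C/G/T input; the helper names are invented for this example and are not part of BLAST:

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their protein translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for label, protein in six_frame_translations("ATGGCCTAA").items():
    print(label, protein)
```

Stop codons appear as `*`. blastx applies this kind of translation to the query, tblastn to the database, and tblastx to both.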

Page 11: DNA Sequence Analysis

Peptide Sequence Databases for BLAST search

nr: All non-redundant GenBank CDS translations + PDB + SwissProt + PIR + PRF

month: All new or revised GenBank CDS translations + PDB + SwissProt + PIR + PRF released in the last 30 days

swissprot: Last major release of the SWISS-PROT protein sequence database (no updates)

Page 12: DNA Sequence Analysis

Filtering of low-complexity segments

Page 13: DNA Sequence Analysis

E-value for the score S

The expected number of HSPs with score at least S is given by the formula

$E = K m n e^{-\lambda S}$

HSP: high-scoring segment pair

m and n: sequence lengths

K and lambda: parameters
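To make the formula concrete, the sketch below evaluates it with the exponent read as minus lambda times S (the standard Karlin-Altschul form, in which m and n are the query and database lengths). The function name and the values of K and lambda are placeholders chosen for illustration; real values depend on the scoring system in use.

```python
import math


def expected_hsps(score, m, n, k, lam):
    """Expected number of HSPs with score >= `score`: E = K * m * n * exp(-lambda * S)."""
    return k * m * n * math.exp(-lam * score)


# Hypothetical parameters, for illustration only.
K, LAMBDA = 0.1, 0.3
query_len, db_len = 250, 1_000_000

for s in (20, 40, 60, 80):
    e = expected_hsps(s, query_len, db_len, K, LAMBDA)
    print(f"S = {s:3d}  E = {e:.3g}")
```

E falls exponentially as the score rises and grows in proportion to the size of the search space m x n, so the same score is less significant in a larger database.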

Page 43: DNA Sequence Analysis

Promoter Search

- ProtScan (at BIMAS)

- EPD (Eukaryote Promoter Database)

- Promoter Prediction (BCM)

- NNPP (Prokaryote Promoter Prediction at BCM)

Page 57: DNA Sequence Analysis

About the neural network method

NNPP is a method that finds eukaryotic and prokaryotic promoters in a DNA sequence.

It has been shown that multiple functional sites in the primary DNA are involved in the polymerase binding process.

These promoter elements, such as the TATA-box and the transcription start site ("Initiator") in eukaryotes, are present in various combinations separated by various distances in the sequence.
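The idea of elements occurring at variable distances from each other can be illustrated with a plain pattern scan. This is not the NNPP neural network: it only matches two commonly cited consensus patterns (TATA-box as TATA[AT]A[AT], Initiator as YYANWYY) and reports pairs whose spacing falls inside a window. The function name and the `min_gap`/`max_gap` defaults are assumptions made for the example.

```python
import re

# Commonly cited consensus patterns; real promoters often deviate from them.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")
INITIATOR = re.compile(r"[CT][CT]A[ACGT][AT][CT][CT]")  # YYANWYY


def find_element_pairs(seq, min_gap=15, max_gap=35):
    """Return (tata_start, initiator_start, gap) for each TATA-box that is
    followed by an Initiator match within the allowed gap (in bases)."""
    seq = seq.upper()
    pairs = []
    for tata in TATA_BOX.finditer(seq):
        # Search for Initiator matches only downstream of this TATA-box.
        for inr in INITIATOR.finditer(seq, tata.end()):
            gap = inr.start() - tata.end()
            if min_gap <= gap <= max_gap:
                pairs.append((tata.start(), inr.start(), gap))
    return pairs


example = "GGG" + "TATAAAA" + "G" * 20 + "CCATTCC" + "GGG"
print(find_element_pairs(example))
```

A fixed pattern and a fixed window miss promoters that lack one element or use unusual spacing, which is the kind of variation a trained model such as NNPP is meant to accommodate.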
