Sequence Analysis

Upload: terence-west

Post on 22-Dec-2015


Page 1: Sequence Analysis. DNA and Protein sequences are biological information that are well suited for computer analysis Fundamental Axiom: homologous sequences

Sequence Analysis


Sequence Analysis

• DNA and protein sequences are biological information that is well suited to computer analysis

• Fundamental Axiom: homologous sequences share an evolutionary ancestor and almost certainly perform the same or a similar function


Sequence Analysis topics for today

• Restriction enzyme sites for diagnostics and cloning

• Open reading frame analysis

• Conceptual translation

• Oligo primer design

• Sequence alignments


Sequence Analysis

• Alignments document homologous relationships

• DNA sequence alignments: best for showing identity

• Protein sequence alignments: best for showing similarity
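
As a small illustration of "identity" in an alignment, the sketch below counts matching columns between two already-aligned sequences. The function name `percent_identity` and the choice to skip gap columns are assumptions made for this example, not something the slides specify.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of non-gap alignment columns where both sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap ("-").
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


print(percent_identity("ATGC-A", "ATGGTA"))
```

Other conventions divide by the full alignment length or by the shorter sequence, so the number depends on which definition a given program uses.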


Types of Alignments


In Class Tutorial

• Introduction to File Formats
– Examples of file formats
– Utilities to change formats

• Restriction Analysis
– Web tools for restriction analysis
– Local programs
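
A "local program" for restriction analysis can be as simple as a pattern search. The sketch below is a minimal illustration, not one of the tools used in the tutorial; the three enzymes in `SITES` and the function name `find_sites` were chosen for the example. All three recognition sites are palindromic, so searching one strand is enough.

```python
import re

# Recognition sequences for three common enzymes (5'->3').
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        # A lookahead match reports overlapping occurrences as well.
        hits[enzyme] = [m.start() + 1 for m in re.finditer(f"(?={site})", seq)]
    return hits


print(find_sites("TTGAATTCAAGGATCCGAATTC"))
```

The positions mark where each recognition site starts, not the exact cut position within the site.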


In Class Tutorial

• Open reading frame analysis
• Reverse complement
• Capturing output to an MS Word doc
• Oligo Primer Design for PCR and sequencing
• Alignments
– global and local
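
The reverse complement and open reading frame steps can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses the standard genetic code, treats an ORF as ATG through the next in-frame stop codon, and the names `reverse_complement`, `translate`, `find_orfs` and the `min_codons` parameter are made up for the example.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Conceptual translation; unknown codons become 'X', stops become '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_orfs(seq, min_codons=2):
    """Return (strand, frame, protein) for ATG...stop ORFs in all six frames."""
    seq = seq.upper()
    orfs = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            protein = translate(template[frame:])
            start = None
            for pos, aa in enumerate(protein):
                if aa == "M" and start is None:
                    start = pos  # first start codon of a candidate ORF
                elif aa == "*" and start is not None:
                    if pos - start >= min_codons:
                        orfs.append((strand, frame, protein[start:pos]))
                    start = None
    return orfs


print(reverse_complement("ATGGCTTGA"))
print(find_orfs("ATGGCTTGA"))
```

ORFs that run off the end of the sequence without a stop codon are ignored here; real ORF finders usually offer an option to report them.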


Sequence File Formats

• FASTA
– Simplest format
– Easy to create by hand on a word processor


FASTA

• First line must start with > followed by the sequence name

• Second line to end = sequence

• No numbers or spaces

• Sequence can be UPPER or lower case
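
The FASTA rules above are simple enough to read with a few lines of code. The sketch below is a minimal reader, assuming the text may hold one or more records; the function name `parse_fasta` and the example record `my_seq` are hypothetical. It is deliberately forgiving: stray digits and spaces are dropped rather than rejected, and sequences are normalized to upper case.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping sequence name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Header line: everything after ">" is the sequence name.
            name = line[1:].strip()
            records[name] = []
        elif name is None:
            raise ValueError("FASTA text must begin with a '>' header line")
        else:
            # Keep letters plus '*' and '-'; drop digits and spaces.
            cleaned = "".join(ch for ch in line if ch.isalpha() or ch in "*-")
            records[name].append(cleaned.upper())
    return {n: "".join(parts) for n, parts in records.items()}


example = """>my_seq
atggccattg
TAATGGGCCG
"""
print(parse_fasta(example))
```

For real work, an established parser such as the one in Biopython is a safer choice than a hand-rolled reader.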


File Formats

• Some sequence analysis programs take input sequences in FASTA format ONLY

• ReadSeq is a web-based utility that converts many file formats to FASTA

• More and more programs will accept multiple file formats as input


Mono-Space Fonts

• Every character uses the same space = mono space

• A, T, G and C use the same space on a line

• W and . use the same space on a line

• Critical for sequence alignments to stay aligned


Mono-Space Fonts

NOT a Monospace font


Primer Design

• Primers are chemically synthesized oligonucleotides

• Used for sequencing and PCR

• Bad primer design can result in reaction failures


Primer Design Matters


Primer Design

• Tm 55-60 °C: PCR primer pairs need to have similar Tm’s

• GC content 40-60% (biased to 5’ end)

• Length = 17-25 nt

• Low self-complementarity (palindromes)

• Fewer than 3 of the 5 bases at the 3’ end are G/C (no GC clamp at 3’ end)

• Low complementarity between primers (avoid primer dimer)

• BLAST search primers: avoid repetitive DNA

• Small amplicon size increases PCR efficiency

• Avoid runs of one base
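
Several of the primer rules above can be checked mechanically. The sketch below is a rough screen, not a replacement for primer design software. Its assumptions: Tm is estimated with the simple Wallace rule, 2(A+T) + 4(G+C), which the slides do not name and which is only approximate for primers of this length; a "run" is flagged at four or more identical bases, a threshold picked for the example; and self-complementarity and primer-dimer checks are left out. The function names and the example primer are made up.

```python
def gc_content(primer):
    """Percent of bases that are G or C."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough Tm estimate in degrees C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def longest_run(primer):
    """Length of the longest stretch of one repeated base."""
    best = run = 1
    for prev, cur in zip(primer, primer[1:]):
        run = run + 1 if prev == cur else 1
        best = max(best, run)
    return best


def check_primer(primer):
    """Screen a non-empty primer against the length, GC, 3' end and run rules."""
    primer = primer.upper()
    three_prime_gc = sum(1 for base in primer[-5:] if base in "GC")
    return {
        "length_ok": 17 <= len(primer) <= 25,
        "gc_ok": 40 <= gc_content(primer) <= 60,
        "three_prime_ok": three_prime_gc < 3,      # fewer than 3 of last 5 are G/C
        "no_long_runs": longest_run(primer) < 4,   # threshold is an assumption
        "tm_estimate": wallace_tm(primer),
    }


print(check_primer("AGCTGACCTGAAGCTCATTA"))
```

For a primer pair, running `check_primer` on both primers and comparing the two `tm_estimate` values gives a quick check that the Tm’s are similar.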


Primer Design: GC Clamps cause false priming