bnfo 602 lecture 1

BNFO 602 Lecture 1 Usman Roshan


Post on 22-Jan-2016


Page 1: BNFO 602 Lecture 1

BNFO 602 Lecture 1

Usman Roshan

Page 2: BNFO 602 Lecture 1

Bio background

• DNA

• Transcription and translation

• Proteins: folding and structure

• SNPs

• SNP genotyping, sequencing

Page 3: BNFO 602 Lecture 1

Representing DNA in a format manipulatable by computers

• DNA is a double-helix molecule made up of four nucleotides, distinguished by their bases:
– Adenine (A)
– Cytosine (C)
– Thymine (T)
– Guanine (G)

• Since A (adenine) always pairs with T (thymine) and C (cytosine) always pairs with G (guanine), knowing only one side of the ladder is enough to determine the other.

• We represent DNA as a sequence of letters, where each letter is A, C, G, or T.

• For example, the helix shown here would be represented as CAGT.
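The pairing rule above means the second strand can be computed from the first. The following is a minimal Python sketch of that idea; the names `COMPLEMENT`, `complement`, and `reverse_complement` are illustrative choices, not part of the lecture.

```python
# Watson-Crick pairing from the slide: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-by-base partner of each letter in the strand."""
    return "".join(COMPLEMENT[base] for base in strand)

def reverse_complement(strand: str) -> str:
    """Return the partner strand written in the opposite direction."""
    return complement(strand)[::-1]

print(complement("CAGT"))          # GTCA
print(reverse_complement("CAGT"))  # ACTG
```

The reversed form is included because the two strands of the helix run in opposite directions, so the partner strand is conventionally written reversed.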

Page 4: BNFO 602 Lecture 1

Transcription and translation

Page 5: BNFO 602 Lecture 1

Amino acids

Proteins are chains of amino acids. There are twenty different amino acids that chain in different ways to form different proteins.

For example, FLLVALCCRFGH (this is how we could store it in a file).

This sequence of amino acids folds to form a 3-D structure.

Page 6: BNFO 602 Lecture 1

Protein folding

Page 7: BNFO 602 Lecture 1

Protein folding

• The protein folding problem is to determine the 3-D protein structure from the sequence.

• Experimental techniques are very expensive.

• Computational techniques are cheap, but the problem is difficult to solve.

• By comparing sequences we can deduce the evolutionarily conserved portions, which are also functional (most of the time).

Page 8: BNFO 602 Lecture 1

Protein structure

• Primary structure: sequence of amino acids.

• Secondary structure: parts of the chain organize themselves into alpha helices, beta sheets, and coils. Helices and sheets are usually evolutionarily conserved and can aid sequence alignment.

• Tertiary structure: 3-D structure of the entire chain.

• Quaternary structure: complex of several chains.

Page 9: BNFO 602 Lecture 1

Key points

• DNA can be represented as strings over four letters: A, C, G, and T. These strings can be very long, e.g. thousands or even millions of letters.

• Proteins are also represented as strings, over an alphabet of 20 letters (each letter is an amino acid). Their 3-D structure determines their function to a large extent.
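As a rough illustration of the two alphabets, the sketch below checks whether a string uses only DNA letters or only amino-acid letters. The one-letter codes listed are those of the 20 standard amino acids; the function names are hypothetical and chosen just for this example.

```python
DNA_ALPHABET = set("ACGT")
# One-letter codes of the 20 standard amino acids.
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")

def is_dna(seq: str) -> bool:
    """True if seq is non-empty and uses only A, C, G, T."""
    return len(seq) > 0 and set(seq) <= DNA_ALPHABET

def is_protein(seq: str) -> bool:
    """True if seq is non-empty and uses only the 20 amino-acid letters."""
    return len(seq) > 0 and set(seq) <= PROTEIN_ALPHABET

print(is_dna("CAGT"))              # True
print(is_dna("FLLVALCCRFGH"))      # False
print(is_protein("FLLVALCCRFGH"))  # True
print(is_protein("CAGT"))          # True
```

Note that A, C, G, and T are also valid amino-acid letters, so a string alone does not say which kind of sequence it is; files normally record that separately.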

Page 10: BNFO 602 Lecture 1

SNPs

• SNPs are DNA sequence variations that occur when a single nucleotide is altered.

• A variation must be present in at least 1% of the population to be considered a SNP.

• SNPs occur every 100 to 300 bases along the 3-billion-base human genome.

• Many have no effect on cell function but some could affect disease risk and drug response.
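The figures above imply a rough total: one SNP every 100 to 300 bases over 3 billion bases gives about 10 to 30 million SNPs. The sketch below does that arithmetic and applies the 1% rule; the function names and the sample counts are made up for illustration.

```python
GENOME_LENGTH = 3_000_000_000  # bases in the human genome, as stated on the slide

def estimated_snp_count(spacing: int, genome_length: int = GENOME_LENGTH) -> int:
    """Rough number of SNPs if one occurs every `spacing` bases."""
    return genome_length // spacing

def is_common_enough(carriers: int, population_size: int,
                     threshold: float = 0.01) -> bool:
    """True if the variation is present in at least `threshold` of the population."""
    return carriers / population_size >= threshold

print(estimated_snp_count(300))    # 10000000
print(estimated_snp_count(100))    # 30000000
print(is_common_enough(5, 1000))   # False (0.5%)
print(is_common_enough(20, 1000))  # True (2%)
```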

Page 11: BNFO 602 Lecture 1

Toy example

Page 12: BNFO 602 Lecture 1

SNPs on the chromosome

SNP

Chromosome

Gene

Page 13: BNFO 602 Lecture 1

Bi-allelic SNPs

• Most SNPs have one of two nucleotides at a given position

• For example:
– A/G denotes the varying nucleotide as either A or G. We call each of these an allele.
– Most SNPs have two alleles (bi-allelic).
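To make the allele notation concrete, here is a small sketch that scans a set of equal-length sequences and reports the positions where more than one nucleotide appears. The sequences are invented toy data, `find_variable_sites` is a hypothetical helper, and the sketch ignores the 1% population-frequency requirement.

```python
def find_variable_sites(sequences):
    """Map each varying position (0-based) to the sorted alleles seen there."""
    sites = {}
    # zip(*sequences) walks the sequences column by column.
    for position, column in enumerate(zip(*sequences)):
        alleles = sorted(set(column))
        if len(alleles) > 1:
            sites[position] = alleles
    return sites

# Toy data: the same short region from four individuals (not real sequences).
toy = ["ACGTA", "ACATA", "ACGTA", "ACATC"]

for position, alleles in find_variable_sites(toy).items():
    kind = "bi-allelic" if len(alleles) == 2 else "multi-allelic"
    print(position, "/".join(alleles), kind)
# 2 A/G bi-allelic
# 4 A/C bi-allelic
```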

Page 14: BNFO 602 Lecture 1

SNP genotype

• We inherit two copies of each chromosome (one from each parent)

• For a given SNP the genotype defines the type of alleles we carry

• Example: for the SNP A/G one’s genotype may be
– AA if both copies of the chromosome have A
– GG if both copies of the chromosome have G
– AG or GA if one copy has A and the other has G
– The first two cases are called homozygous and the latter two are heterozygous
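The genotype cases above can be sketched in a few lines of Python. Counting the copies of one chosen allele (0, 1, or 2) is a common way to turn genotypes into numbers, but that encoding is an assumption added here and does not come from the lecture; the function names are illustrative.

```python
def zygosity(genotype: str) -> str:
    """Classify a two-letter genotype such as 'AA' or 'AG'."""
    first, second = genotype  # one allele per chromosome copy
    return "homozygous" if first == second else "heterozygous"

def allele_count(genotype: str, allele: str) -> int:
    """Number of copies (0, 1, or 2) of `allele` in the genotype."""
    return genotype.count(allele)

for genotype in ["AA", "GG", "AG", "GA"]:
    print(genotype, zygosity(genotype), allele_count(genotype, "G"))
# AA homozygous 0
# GG homozygous 2
# AG heterozygous 1
# GA heterozygous 1
```

Under this encoding AG and GA both map to 1, which matches the slide's treatment of them as the same heterozygous case.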

Page 15: BNFO 602 Lecture 1

SNP genotyping

Page 16: BNFO 602 Lecture 1

Real SNPs

• SNP consortium: snp.cshl.org

• SNPedia: www.snpedia.com