SNPs and microarray

Post on 13-Jun-2015

DESCRIPTION

Lecture notes for Genomics 300

TRANSCRIPT

SNPs and Gene expression

Sucheta Tripathy

schedule

• 6 classes
• 2 for SNPs and 4 for gene expression
• 2 students per group; groups will present in each class.
• Subjects and groups will be chosen randomly.
• Today's class is introductory; we will discuss fundamental aspects.

Terminologies

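• Forward genetics – starting from a phenotype and identifying the genotype responsible for it.

• Reverse genetics – starting from a known gene or sequence and determining the phenotype that results when it is altered.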

How often do SNPs occur?

• About one in every 300 bases – roughly 10 million SNPs across the ~3 billion bases of the human genome.

Not all single-nucleotide changes are SNPs, though. To be classified as a SNP, two or more versions of a sequence must each be present in at least one percent of the general population.
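The one-percent rule can be sketched as a small check. This is only an illustration: the function name `is_snp` and the allele counts are hypothetical and do not come from the lecture.

```python
def is_snp(allele_counts, threshold=0.01):
    """Return True if at least two alleles each reach the frequency threshold.

    allele_counts maps an allele (e.g. "A") to the number of sampled
    chromosomes carrying it. The 1% default follows the rule stated above.
    """
    total = sum(allele_counts.values())
    if total == 0:
        return False
    # Keep the alleles whose frequency in the sample is at least the threshold.
    common = [a for a, n in allele_counts.items() if n / total >= threshold]
    return len(common) >= 2


# Hypothetical counts from 1,000 sampled chromosomes.
print(is_snp({"A": 950, "G": 50}))  # G is at 5% of the sample
print(is_snp({"A": 998, "G": 2}))   # G is at 0.2% of the sample
```

In practice the frequencies would be estimated from a population sample, so the result depends on how that sample was drawn.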

Each combination of alleles is a haplotype. Not all 8 possible haplotypes necessarily exist.
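A count of 8 is what three SNPs with two alleles each would give (2 × 2 × 2). The sketch below enumerates the combinations under that assumption; the alleles and the set of "observed" haplotypes are made up for illustration.

```python
from itertools import product

# Hypothetical alleles at three biallelic SNP positions.
snp_alleles = [("A", "G"), ("C", "T"), ("G", "T")]

# Every way of picking one allele per SNP is a possible haplotype.
possible = ["".join(combo) for combo in product(*snp_alleles)]
print(len(possible), possible)

# Hypothetical haplotypes actually seen in a population sample.
observed = {"ACG", "GTT", "ATG"}
missing = [h for h in possible if h not in observed]
print("possible but not observed:", missing)
```

The gap between possible and observed haplotypes is why a few SNPs can stand in for a whole block.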

dbSNP and HapMap project

• dbSNP: 2.5 million variations
• http://www.ncbi.nlm.nih.gov/SNP/

• Haplotypes are blocks – HapMap focuses on those blocks

• HapMap project (http://hapmap.ncbi.nlm.nih.gov/thehapmap.html.en) – started in 2002, with samples from Nigeria, Japan, China, and the USA

Gene Expression

• Yeast, 1997: genome-wide expression profiling with microarrays.

Cy3: emission at about 570 nm (green channel)

Cy5: emission at about 670 nm (red channel)

Two-channel and single-channel microarrays
• Two-channel: two conditions
• Spike-in control probes are included and used for normalization
• Examples: Agilent Dual-Mode; Eppendorf DualChip

• Single-channel: one condition at a time
• The absolute abundance of a transcript is not known, only its relative abundance
• Examples: Affymetrix GeneChip; Illumina BeadChip

Microarray and Bioinformatics

• Experimental design.
• Standardization.
• Statistical data analysis.
• Data storage and visualization.

Contd…

• Experimental design:
– Biological replicates.
– Technical replicates.
– Randomization.

• Standardization:
– Difficult – results cannot be easily replicated.
– "Minimum Information About a Microarray Experiment" (MIAME); 2001, Nature Genetics
• http://fged.org/projects/miame/

Contd..

• Data analysis:
– Image analysis – gridding of the spots
– Data processing:
• Background correction
• Visualization (MA plot)
– M is the log ratio of the two channel intensities and A is the mean of their log intensities
– Most genes should not change, so M (the y-axis) should be centered on 0
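The M and A values can be computed per spot as in the sketch below. It assumes the common convention of base-2 logs with the red (Cy5) channel over the green (Cy3) channel; the function name `ma_values` and the intensities are hypothetical.

```python
import math


def ma_values(red, green):
    """Compute (M, A) for one spot from its two channel intensities.

    M = log2(red) - log2(green)        log ratio, plotted on the y-axis
    A = (log2(red) + log2(green)) / 2  mean log intensity, on the x-axis
    """
    log_r, log_g = math.log2(red), math.log2(green)
    return log_r - log_g, (log_r + log_g) / 2


# Hypothetical background-corrected intensities (Cy5, Cy3) for four spots.
spots = [(1024, 1024), (2048, 512), (256, 1024), (4096, 4096)]
for red, green in spots:
    m, a = ma_values(red, green)
    print(f"red={red:5d} green={green:5d}  M={m:+.1f}  A={a:.1f}")
```

Spots with equal intensity in both channels land on M = 0, which is where most genes are expected to fall.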

Contd..

• Data processing:
– Normalization (removes non-biological variation)
• Simplest way: assume all arrays have the same median gene expression
• Subtract the median from each array
• Quantile normalization:
– Order the values in each array
– Take the average across arrays at each rank
– Substitute each probe intensity with the average for its rank
– Restore the original order
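Both normalization schemes can be sketched in a few lines of plain Python. This is a minimal illustration, assuming each array is a list of log intensities of equal length and breaking ties by original position; the function names and the numbers are hypothetical.

```python
from statistics import mean, median


def median_normalize(array):
    """Subtract the array's own median so every array is centered on 0."""
    med = median(array)
    return [x - med for x in array]


def quantile_normalize(arrays):
    """Quantile-normalize a list of equal-length arrays (one list per array)."""
    n = len(arrays[0])
    # 1. Order the values in each array, remembering each value's position.
    orders = [sorted(range(n), key=lambda i: arr[i]) for arr in arrays]
    # 2. Average across arrays at each rank.
    rank_means = [
        mean(arr[order[rank]] for arr, order in zip(arrays, orders))
        for rank in range(n)
    ]
    # 3-4. Substitute each value with its rank mean, in the original order.
    result = []
    for order in orders:
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        result.append(out)
    return result


# Two hypothetical arrays with three probes each.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(median_normalize(arrays[0]))
print(quantile_normalize(arrays))
```

After quantile normalization every array contains the same set of values, so the arrays share one distribution and differ only in which probe holds which value.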

Contd..

• Class discovery
– Unsupervised methods.
– Supervised methods.

• Draw hypotheses

Next class…

• We will discuss this paper: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2805859/
