Factors to Consider in Selecting a Genotyping Platform

Elizabeth Pugh, June 22, 2007


Page 1: Factors to Consider in Selecting a Genotyping Platform

Factors to Consider in Selecting a Genotyping Platform

Elizabeth Pugh

June 22, 2007

Page 2: Factors to Consider in Selecting a Genotyping Platform

GWA Studies

• Genotype 300,000 to 1,000,000 SNPs

• 3 platforms, multiple products Affymetrix Illumina Perlegen

• How to choose?

Page 3: Factors to Consider in Selecting a Genotyping Platform

What I can cover

• Basics of calling genotypes

• Examples of good and bad data

• Some things to consider

Page 4: Factors to Consider in Selecting a Genotyping Platform

Basics of how it works

• Skipping chemistry…

• Generate intensity data for 2 alleles

• Assign genotypes based on clustering

• These are ‘phenotypes’ – there is measurement error

• No manual review of data – too many SNPs
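The calling step above can be sketched in a few lines. This is only an illustration, not any vendor's algorithm: the function names (`theta`, `call_genotype`), the fixed cluster centers, and both thresholds are assumptions made for the example. Real callers fit the clusters to the data for each SNP.

```python
import math

def theta(a, b):
    """Map allele intensities (A, B) to an angle in [0, 1]: 0 = all A, 1 = all B."""
    return (2.0 / math.pi) * math.atan2(b, a)

# Hypothetical cluster centers on the theta axis for the three genotypes.
CENTERS = {"AA": 0.05, "AB": 0.50, "BB": 0.95}

def call_genotype(a, b, max_dist=0.15, min_total=0.2):
    """Return 'AA', 'AB', 'BB', or None (no call)."""
    if a + b < min_total:                    # too little signal to call
        return None
    t = theta(a, b)
    best = min(CENTERS, key=lambda g: abs(CENTERS[g] - t))
    if abs(CENTERS[best] - t) > max_dist:    # point falls between clusters
        return None
    return best

# Five made-up samples at one SNP: three clean calls, one between
# clusters, one with almost no intensity.
points = [(1.0, 0.05), (0.6, 0.6), (0.04, 0.9), (0.9, 0.35), (0.05, 0.05)]
calls = [call_genotype(a, b) for a, b in points]
```

The last two samples get no call, which is the "measurement error" point: the genotype is inferred from a noisy signal, so some points will not fit any cluster.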

Page 5: Factors to Consider in Selecting a Genotyping Platform

A good SNP

Page 6: Factors to Consider in Selecting a Genotyping Platform

Same SNP different view

Page 7: Factors to Consider in Selecting a Genotyping Platform

Same SNP different view

Page 8: Factors to Consider in Selecting a Genotyping Platform

Another Good SNP

Page 9: Factors to Consider in Selecting a Genotyping Platform

And Another Good SNP

Page 10: Factors to Consider in Selecting a Genotyping Platform

Data Quality

• Most of the data is good for all platforms

• Some samples, SNPs and genotypes fail

• Have to find them without manual review

Page 11: Factors to Consider in Selecting a Genotyping Platform

Ways to find bad data

• Use summary statistics across SNPs, samples

• Include investigator and control replicates

• Include control and where possible investigator trios

• If you use HapMap controls, you can compare (with caution) to HapMap genotypes – there are some errors in the HapMap data
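Replicates and trios give two direct checks, sketched below. The function names and the two-letter genotype encoding (`'AA'`, `'AB'`, `'BB'`, with `None` for a no-call) are assumptions for illustration; the Mendelian check applies to autosomal SNPs only.

```python
def mendel_consistent(father, mother, child):
    """True if the child's genotype can be formed from one allele of each parent.

    Genotypes are two-character strings such as 'AB'. A trio with any
    missing genotype (None) cannot be tested and is treated as consistent.
    """
    if None in (father, mother, child):
        return True
    target = sorted(child)
    return any(sorted(f + m) == target for f in father for m in mother)

def replicate_discordance(calls_1, calls_2):
    """Fraction of SNPs called in both replicates where the two calls differ."""
    pairs = [(x, y) for x, y in zip(calls_1, calls_2) if x and y]
    if not pairs:
        return 0.0
    return sum(sorted(x) != sorted(y) for x, y in pairs) / len(pairs)
```

For example, `mendel_consistent("AA", "AA", "AB")` is `False`: neither parent carries a B allele, so either a genotype is wrong or the pedigree is.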

Page 12: Factors to Consider in Selecting a Genotyping Platform

Finding Bad SNPs

• Use QC checks: call rate, Mendelian inheritance, replicates, HWE, quality score, clustering

• Note some bad SNPs will pass any qc filter

• Some good SNPs may fail qc
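As one concrete instance of these checks, the HWE test can be written as a 1-df chi-square on genotype counts. This is a minimal sketch with hypothetical function names; exact tests are often preferred for rare alleles, and any cutoff on the p-value is a study-specific choice.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts to HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNPs).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

def hwe_p_value(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))
```

A SNP with counts 25/50/25 gives a statistic of 0, while 50/0/50 (no heterozygotes at all) gives 100, the kind of result a bad cluster split can produce.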

Page 13: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 14: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 15: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 16: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 17: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 18: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 19: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 20: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 21: Factors to Consider in Selecting a Genotyping Platform

Bad SNP caught by qc filter

Page 22: Factors to Consider in Selecting a Genotyping Platform

Yikes! Some of those are awful!

• Yes

• We can find many, hopefully most of them but…

• Use the intensity data to plot your most significant SNPs

• Look at them before you publish

Page 23: Factors to Consider in Selecting a Genotyping Platform

Use a lab that will give you intensity data

• If you have intensity data you can
  – Plot the intensities to check clustering
  – Cluster with a different algorithm
  – Recluster as algorithms get better
  – Recluster subsets or supersets of the data
  – Create your own metrics (e.g. number of samples with no or very low intensity)
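The example metric in the last item is simple to compute once the raw intensities are available. A minimal sketch, assuming intensities arrive as (A, B) pairs per sample; the function name and the `min_total` cutoff are hypothetical.

```python
def low_intensity_count(intensities, min_total=0.2):
    """Count samples at one SNP whose combined intensity (A + B) is below min_total."""
    return sum((a + b) < min_total for a, b in intensities)

# Made-up intensities for five samples at one SNP.
snp_intensities = [(1.0, 0.05), (0.6, 0.6), (0.02, 0.03), (0.0, 0.1), (0.05, 0.9)]
n_low = low_intensity_count(snp_intensities)
```

A SNP where many samples have little or no signal may sit on a deletion or have a failing probe, which a call-rate filter alone would not distinguish.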

Page 24: Factors to Consider in Selecting a Genotyping Platform

Finding Bad Samples

• Look at sample level metrics starting with call rate

• Bad samples - even water will have some genotypes

• May want to remove possibly bad samples before clustering the data, then make final sample decisions
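The first pass over samples can be sketched as a call-rate screen. The function names, the data layout (sample ID mapped to a list of calls, `None` for a no-call), and the 0.95 cutoff are assumptions for illustration, not a recommended threshold.

```python
def sample_call_rates(genotypes):
    """genotypes: dict of sample_id -> list of calls (None = no call)."""
    return {
        sample: sum(call is not None for call in calls) / len(calls)
        for sample, calls in genotypes.items()
    }

def flag_samples(genotypes, min_call_rate=0.95):
    """Return sample IDs whose call rate falls below min_call_rate, sorted."""
    rates = sample_call_rates(genotypes)
    return sorted(s for s, r in rates.items() if r < min_call_rate)
```

Samples flagged here would be set aside before reclustering, then reviewed again afterwards for the final decision.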

Page 25: Factors to Consider in Selecting a Genotyping Platform

Sample plot: all SNPs for one sample, sample call rate 99.8%

Page 26: Factors to Consider in Selecting a Genotyping Platform

Sample plot – Failed sample: low intensity, call freq 41%

Page 27: Factors to Consider in Selecting a Genotyping Platform

Failed samples tend to fall outside of clusters for many SNPs

Page 28: Factors to Consider in Selecting a Genotyping Platform

Failed samples tend to fall outside of clusters for many SNPs

Page 29: Factors to Consider in Selecting a Genotyping Platform

Can I use WGA samples?

• Whole Genome Amplified DNA performance ranges from awful to very good

• Even WGA samples that work very well may perform poorly for some SNPs

• Extra attention needed for clustering decisions and for analysis

• Make sure lab knows sample type for each sample

Page 30: Factors to Consider in Selecting a Genotyping Platform

WGA clustering with other samples

Page 31: Factors to Consider in Selecting a Genotyping Platform

WGA lower intensity, call freq 98%

Page 32: Factors to Consider in Selecting a Genotyping Platform

WGA failure, call rate 93%

Page 33: Factors to Consider in Selecting a Genotyping Platform

Multiple sample types in study

• Look at data by sample type (metrics and plots)

• If they are not performing equivalently, do lots of extra QC by sample type

• If you have to cluster separately, even more QC and checks are needed

• If sample type is not random, it may cause more headaches (e.g. different types for cases and controls)

Page 34: Factors to Consider in Selecting a Genotyping Platform

Preventing Bad Data

• Discuss sample types with the lab: what is their experience? May want to test some before starting the project

• Discuss plating with the lab: may wish to place controls uniquely or arrange males and females uniquely by plate

Page 35: Factors to Consider in Selecting a Genotyping Platform

Preventing Bad Data

• Differences in intensity (batch effects) are not common but possible

• May only be present for subset of SNPs

• May want to mix cases and controls across plates to minimize the impact of a plate effect if it happens
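One way to mix cases and controls across plates is to shuffle within each group and deal samples out round-robin. The sketch below is an assumption about how this could be done, not a description of any lab's plating procedure; `assign_plates` is a hypothetical name, and plate capacity and control wells are ignored.

```python
import random

def assign_plates(samples, n_plates, seed=0):
    """samples: list of (sample_id, status) pairs, e.g. status 'case' or 'control'.

    Returns a dict of plate index -> list of sample IDs. Samples are shuffled
    within each status group and dealt round-robin, so every plate receives a
    similar mix of statuses.
    """
    rng = random.Random(seed)                # fixed seed keeps the layout reproducible
    by_status = {}
    for sample_id, status in samples:
        by_status.setdefault(status, []).append(sample_id)

    plates = {i: [] for i in range(n_plates)}
    slot = 0
    for status in sorted(by_status):
        group = by_status[status]
        rng.shuffle(group)
        for sample_id in group:
            plates[slot % n_plates].append(sample_id)
            slot += 1
    return plates
```

With 8 cases, 8 controls and 4 plates, each plate ends up with 2 of each, so a plate-level intensity shift affects cases and controls equally.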

Page 36: Factors to Consider in Selecting a Genotyping Platform

Genotypes

• For good SNPs and samples some genotypes will fail
  – May not be called
  – May be called with low confidence or quality score
  – May be called wrong

Page 37: Factors to Consider in Selecting a Genotyping Platform

1 genotype not called

Page 38: Factors to Consider in Selecting a Genotyping Platform

1 wrong genotype

Page 39: Factors to Consider in Selecting a Genotyping Platform

Copy number

• With Affymetrix and Illumina, intensity information can be used to infer copy number

• Works very well with small numbers of samples and manual review

• Not really a high throughput system – software not sensitive or specific enough … Yet
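The basic idea behind intensity-based copy number can be sketched as a log2 ratio of a sample's total intensity to a reference, summarized over a run of neighboring SNPs. The function names and the idealized numbers are assumptions; real data is far noisier and a single-copy region does not usually reach a ratio of exactly -1.

```python
import math
from statistics import median

def log2_ratio(sample_total, reference_total):
    """log2 of observed total intensity (A + B) over the reference total for one SNP."""
    return math.log2(sample_total / reference_total)

def segment_median_log_ratio(sample_totals, reference_totals):
    """Median log2 ratio across a run of neighboring SNPs."""
    return median(log2_ratio(s, r) for s, r in zip(sample_totals, reference_totals))
```

In this idealized form, a male chrX segment measured against a two-copy reference would sit near -1 and a normal autosomal segment near 0; the median over many SNPs is what makes the signal visible despite per-SNP noise.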

Page 40: Factors to Consider in Selecting a Genotyping Platform

Genome viewer

Page 41: Factors to Consider in Selecting a Genotyping Platform

Female Chr X

Page 42: Factors to Consider in Selecting a Genotyping Platform

Male chrX

Page 43: Factors to Consider in Selecting a Genotyping Platform

Known Frequent CNV chr 10

Page 44: Factors to Consider in Selecting a Genotyping Platform

Known Frequent CNV chr 10

Page 45: Factors to Consider in Selecting a Genotyping Platform

Choosing a Platform and Product

Factors to Consider

• Your study
  – Population
  – Study design
  – Sample types
  – Combining data with other studies
  – Interest in CNVs

• Product
  – Coverage of the genome
  – How many SNPs
  – Which SNPs (tagging, in or near genes)
  – Quality of data
  – Performance on your sample types
  – Information on CNVs

Page 46: Factors to Consider in Selecting a Genotyping Platform

Comparing Platforms: Make sure the numbers are comparable!

• QC rates reported – denominators can differ
  – Mendel errors per trio or per sample
  – Replicate errors per pair or per sample
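A small worked example shows how much the denominator matters. All numbers here are hypothetical and chosen only to make the arithmetic easy to follow; `error_rates` is an illustrative helper.

```python
def error_rates(n_errors, n_trios, n_snps):
    """Express one Mendelian-error count under three different denominators."""
    return {
        "per_trio": n_errors / n_trios,
        "per_sample": n_errors / (3 * n_trios),              # 3 samples per trio
        "per_trio_genotype": n_errors / (n_trios * n_snps),  # one test per trio per SNP
    }

# Hypothetical study: 1,500 Mendelian errors in 30 trios typed at 500,000 SNPs.
rates = error_rates(n_errors=1500, n_trios=30, n_snps=500_000)
```

The same data can be reported as 50 errors per trio, about 16.7 errors per sample, or an error rate of 0.01% per trio genotype, so two vendors' figures are only comparable once the denominator is known.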

Page 47: Factors to Consider in Selecting a Genotyping Platform

Comparing Platforms: Make sure the numbers are comparable!

• SNPs on the chip are correlated with many others – often very strong correlation

• There are multiple measures of the strength of the correlation
  – Lists of SNPs to use as proxy for ‘Genome’
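Two commonly used measures, D' and r², can give very different answers for the same pair of SNPs, which is why coverage claims need the measure and threshold stated. The sketch below computes both from haplotype and allele frequencies; the function name and the example frequencies are assumptions for illustration, and both SNPs are assumed polymorphic.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared

# Made-up case: allele B (frequency 0.2) only ever occurs with allele A (frequency 0.5).
d, d_prime, r_squared = ld_measures(p_ab=0.2, p_a=0.5, p_b=0.2)
```

Here D' is 1 (no recombinant haplotype observed) but r² is only 0.25, so the first SNP is a weak proxy for the second despite "complete" D'.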

Page 48: Factors to Consider in Selecting a Genotyping Platform

Cost?

• Hard to say

• Changing rapidly

• Generally increases with the number of SNPs on a chip

• May decrease with number of samples in a study

• Reagents (the chips) are only part of the cost

Page 49: Factors to Consider in Selecting a Genotyping Platform

New Stuff!

Page 50: Factors to Consider in Selecting a Genotyping Platform

New GWA Arrays: Affymetrix and Illumina

• ~ 1 million SNPs

• Enhanced copy number content
  – Different strategies

• Improved coverage in YRI population

• Illumina 1M – still pre-release
  – Same chemistry, same software, same probe designs, same lab workflow as other Infinium products

• Affymetrix 6.0 – just released
  – Same chemistry & lab workflow as 5.0
  – Changes in probe design & software

Page 51: Factors to Consider in Selecting a Genotyping Platform

More SNPs are better, right?

• Maybe not always

• Methods exist that use the genotypes on samples plus HapMap data to infer ungenotyped SNPs
  – Can use inferred genotypes in analysis
  – Can combine data from studies that used different SNPs

• More samples on fewer genotypes may give more power
  – Need enough genotypes for your population to infer SNPs

Page 52: Factors to Consider in Selecting a Genotyping Platform

One or Two Stage Designs

• A year ago everyone was thinking about 2 stage designs

• GWA scan on part of sample

• Follow up a subset of significant results in rest of sample

• Now may cost less to do GWA scan on all samples

Page 53: Factors to Consider in Selecting a Genotyping Platform

Effect Size of 1.2 !!!!

• Recent GWA studies have found small effect sizes

• May need many, many samples to have reasonable power
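A rough sample-size calculation makes "many, many" concrete. This sketch uses the standard normal-approximation formula for comparing allele frequencies between cases and controls; the function name, the significance level of 1e-7 (about 0.05 divided by 500,000 tests), the 80% power target, and the equal numbers of cases and controls are all assumptions for illustration.

```python
import math
from statistics import NormalDist

def cases_needed(odds_ratio, control_freq, alpha=1e-7, power=0.8):
    """Approximate number of cases (with an equal number of controls) for an allelic test.

    odds_ratio: per-allele odds ratio of the risk allele
    control_freq: risk-allele frequency in controls
    """
    p0 = control_freq
    # Case allele frequency implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)       # two-sided test
    z_beta = z.inv_cdf(power)
    alleles = (z_alpha + z_beta) ** 2 * (p0 * (1 - p0) + p1 * (1 - p1)) / (p1 - p0) ** 2
    return math.ceil(alleles / 2)            # two alleles per person
```

Under these assumptions, `cases_needed(1.2, 0.3)` comes out at several thousand cases plus as many controls, and the requirement grows quickly for rarer alleles or smaller effects.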

Page 54: Factors to Consider in Selecting a Genotyping Platform

Choosing a platform

• Must balance coverage, QC and cost per sample to design the most powerful study you can

• Costs, products, clustering, qc and analysis methods are changing rapidly

• What is best will change

Page 55: Factors to Consider in Selecting a Genotyping Platform

www.cidr.jhmi.edu

Page 56: Factors to Consider in Selecting a Genotyping Platform

The end
