Genome-Wide Association Study (GWAS) Presented by Karen Xu

Upload: eddy

Post on 22-Feb-2016


Page 1: Genome-Wide Association Study (GWAS)

Genome-Wide Association Study (GWAS)

Presented by Karen Xu

Page 2: Genome-Wide Association Study (GWAS)

What you need to know

Basic genetic concepts behind GWAS

Genotyping technologies and common study designs

Statistical concepts for GWAS analysis

Replication, interpretation and follow-up of association results

Page 3: Genome-Wide Association Study (GWAS)

Central Goal of Human Genetics

To identify genetic risk factors for common, complex diseases

Page 4: Genome-Wide Association Study (GWAS)

Goal of GWAS

To use genetic risk factors to predict who is at risk

To identify the biological underpinnings of disease susceptibility in order to develop new prevention and treatment strategies

Page 5: Genome-Wide Association Study (GWAS)

Application in pharmacology

Identifying DNA sequence variations associated with drug metabolism, efficacy, and adverse effects

Example: warfarin (determining the appropriate dose)

Personalized medicine

Page 6: Genome-Wide Association Study (GWAS)

Concepts underlying the study design

SNP: single nucleotide polymorphism

Single base-pair changes in the DNA sequence that occur with high frequency in the human genome

SNP (common) vs. mutation (rare)

Cystic fibrosis: mutations in the CFTR gene

Linkage analysis: genotyping families affected by cystic fibrosis using a collection of genetic markers across the genome and examining how these genetic markers segregate with the disease across multiple families

Page 7: Genome-Wide Association Study (GWAS)

Common Disease Common Variant Hypothesis

Common disorders are likely influenced by genetic variation that is also common in the population

1. If common genetic variants influence disease, the effect size (or penetrance) for any one variant must be small relative to that found for rare disorders.

2. If common alleles have small genetic effects (low penetrance), but common disorders show heritability (inheritance in families), then multiple common alleles must influence disease susceptibility.

Page 8: Genome-Wide Association Study (GWAS)

Figure 1. Spectrum of Disease Allele Effects.

Bush WS, Moore JH (2012) Chapter 11: Genome-Wide Association Studies. PLoS Comput Biol 8(12): e1002822. doi:10.1371/journal.pcbi.1002822 http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002822

Page 9: Genome-Wide Association Study (GWAS)

Capturing Common Variation

1. The location and density of commonly occurring SNPs are needed to identify the genomic regions and individual sites that must be examined by genetic studies

2. Population-specific differences in genetic variation must be cataloged so that studies of phenotypes in different populations can be conducted with the proper design

3. Correlations among common genetic variants must be determined so that genetic studies do not collect redundant information

Page 10: Genome-Wide Association Study (GWAS)

International HapMap Project

Used a variety of sequencing techniques to discover and catalog SNPs in populations of European descent, the Yoruba population of African origin, Han Chinese individuals from Beijing, and Japanese individuals from Tokyo

Has since been expanded to include 11 human populations

Page 11: Genome-Wide Association Study (GWAS)

Linkage Disequilibrium

A property of SNPs on a contiguous stretch of genomic sequence that describes the degree to which an allele of one SNP is inherited or correlated with an allele of another SNP within a population

Linkage between markers on a population scale
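The degree of correlation described above is usually summarized with the statistics D, D' and r². The sketch below computes them for two biallelic SNPs from phased haplotype counts. The function name `ld_stats`, the allele labels A/a and B/b, and the toy counts are illustrative assumptions, not part of the original slides.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, |D'|, r^2) for two biallelic SNPs.

    Inputs are counts of the four phased haplotypes (hypothetical labels:
    alleles A/a at the first SNP, B/b at the second).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n                 # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n         # frequency of allele A
    p_B = (n_AB + n_aB) / n         # frequency of allele B

    # D: departure of the haplotype frequency from independence
    d = p_AB - p_A * p_B

    # D' rescales D by the largest magnitude it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the two allele indicators
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Toy data: the two SNPs always travel together (complete LD)
print(ld_stats(50, 0, 0, 50))
# Toy data: all four haplotypes equally common (no LD)
print(ld_stats(25, 25, 25, 25))
```

An r² close to 1 means that genotyping one SNP carries nearly all the information about the other, which is the basis for avoiding redundant markers.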

Page 12: Genome-Wide Association Study (GWAS)

Figure 2. Linkage and Linkage Disequilibrium.

Bush WS, Moore JH (2012) Chapter 11: Genome-Wide Association Studies. PLoS Comput Biol 8(12): e1002822. doi:10.1371/journal.pcbi.1002822 http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002822

Page 13: Genome-Wide Association Study (GWAS)

Direct vs. Indirect Association

LD creates two possible positive outcomes from a genetic association study

1. Direct association: the SNP influencing a biological system that leads to the phenotype is directly genotyped in the study

2. Indirect association: the influential SNP is not directly typed, but instead a tag SNP in high LD with the influential SNP is typed

Therefore, a significant SNP association from a GWAS should not be assumed to be the causal variant

Page 14: Genome-Wide Association Study (GWAS)

Genotyping Technologies

Chip-based microarray technology

Illumina: DNA molecules and primers are first attached on a slide and amplified with polymerase so that local clonal DNA colonies, later called "DNA clusters", are formed. To determine the sequence, four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. A camera takes images of the fluorescently labeled nucleotides, then the dye, along with the terminal 3' blocker, is chemically removed from the DNA, allowing the next cycle to begin.

Page 15: Genome-Wide Association Study (GWAS)

Study Design

Case-control vs. quantitative design

Two primary classes of phenotypes: categorical or quantitative

From the statistical perspective, quantitative traits are preferred, but not required, for a successful study

Page 16: Genome-Wide Association Study (GWAS)

Association Test

1. Single-locus analysis

When a well-defined phenotype has been selected for a study population, and genotypes are collected using sound techniques, the statistical analysis can begin

Quantitative traits are analyzed using ANOVA (analysis of variance); the null hypothesis is that there is no difference between the trait means of any genotype group

Dichotomous case/control traits are analyzed using logistic regression; the null hypothesis is that there is no association between the phenotype and genotype

http://luna.cas.usf.edu/~mbrannic/files/regression/Logistic.html
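For the quantitative-trait case, the ANOVA test at a single SNP can be sketched as follows. Individuals are grouped by genotype and the F statistic compares variation between the group means with variation within the groups. The function name `anova_f` and the trait values are made up for illustration, and converting F to a p-value (which requires the F distribution) is left out of this sketch.

```python
def anova_f(groups):
    """One-way ANOVA F statistic.

    groups: list of lists, one list of trait values per genotype group
    (for example AA, Aa, aa). Returns (F, df_between, df_within).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n

    # Variation of the genotype-group means around the overall mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individuals around their own genotype-group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n - k
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical trait values for the three genotype groups at one SNP
trait_by_genotype = [
    [1.0, 2.0, 3.0],  # AA
    [2.0, 3.0, 4.0],  # Aa
    [3.0, 4.0, 5.0],  # aa
]
f_stat, df1, df2 = anova_f(trait_by_genotype)
print(f_stat, df1, df2)  # F = 3.0 with 2 and 6 degrees of freedom
```

Under the null hypothesis the group means are equal and F is expected to be near 1; a large F is evidence that the trait mean differs by genotype.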

Page 17: Genome-Wide Association Study (GWAS)

Statistical replication

Replication studies should be conducted in an independent dataset drawn from the same population as the GWAS

Once an effect is confirmed in the target population, other populations may be sampled to determine if the SNP has an ethnic-specific effect

Identical phenotype criteria should be used in both GWAS and replication studies

A similar effect should be seen in the replication set for the same SNP, or for a SNP in high LD with the GWAS-identified SNP

Page 18: Genome-Wide Association Study (GWAS)

Meta-analysis of multiple analysis results

Meta-analysis was developed to examine and refine significance and effect size estimates from multiple studies examining the same hypothesis in the published literature

However, it is rare to find multiple studies that match perfectly on all criteria

Study heterogeneity is often statistically quantified in a meta-analysis to determine the degree to which studies differ.
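One common way to combine per-study results and quantify heterogeneity is an inverse-variance fixed-effect meta-analysis with Cochran's Q and the I² statistic. The slides do not say which method was intended, so the sketch below is an assumed example; the function name `fixed_effect_meta` and the effect sizes are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis.

    betas: per-study effect estimates for the same allele (e.g. log odds ratios)
    ses:   their standard errors
    Returns (pooled_beta, pooled_se, cochran_q, i_squared).
    """
    # More precise studies (smaller standard error) receive more weight
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)

    # Cochran's Q: weighted squared deviations of each study from the pooled effect
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of the total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled_beta, pooled_se, q, i2


# Hypothetical log odds ratios for one SNP from two studies
beta, se, q, i2 = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
print(beta, se, q, i2)
```

All studies must report the effect for the same allele before pooling; otherwise the signs of the estimates are not comparable.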

Page 19: Genome-Wide Association Study (GWAS)

Data Imputation

To conduct a meta-analysis properly, the effect of the same allele across multiple distinct studies must be assessed. This can prove difficult if different studies use different genotyping platforms (which use different SNP marker sets). As this is often the case, GWAS datasets can be imputed to generate results for a common set of SNPs across all studies. Genotype imputation exploits known LD patterns and haplotype frequencies from the HapMap or 1000 Genomes project to estimate genotypes for SNPs not directly genotyped in the study [50].

Page 20: Genome-Wide Association Study (GWAS)

Logistic regression

Predicting the likelihood that Y is equal to 1 (rather than 0) given certain values of X

Example: we try to predict whether or not a small business will succeed based on the number of years of experience the owner has in the field prior to starting the business. We presume that those people who have more experience will be more likely to succeed

As X (the number of years of experience) increases, the probability that Y will be equal to 1 (success in the business) will tend to increase
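The same model applies to a case/control GWAS test. A minimal sketch, assuming X is the number of copies of a risk allele (0, 1 or 2, an additive coding not stated in the slides) and Y is case status: it fits P(Y=1|X) = 1 / (1 + exp(-(b0 + b1*X))) by Newton-Raphson. The function name `fit_logistic` and the genotype counts are made up for illustration.

```python
import math


def fit_logistic(xs, ys, n_iter=25):
    """Fit a one-predictor logistic regression by Newton-Raphson.

    xs: predictor values (here, risk-allele counts 0/1/2)
    ys: binary outcomes (1 = case, 0 = control)
    Returns (b0, b1), the intercept and the slope on the log-odds scale.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))  # predicted P(Y=1)
            w = p * (1.0 - p)
            # Gradient of the log-likelihood
            g0 += y - p
            g1 += (y - p) * x
            # Entries of the 2x2 information matrix
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0:
            break
        # Newton step: solve the 2x2 system explicitly
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical counts: (risk-allele copies, controls, cases)
counts = [(0, 30, 10), (1, 20, 20), (2, 10, 30)]
xs, ys = [], []
for copies, n_controls, n_cases in counts:
    xs += [copies] * (n_controls + n_cases)
    ys += [0] * n_controls + [1] * n_cases

b0, b1 = fit_logistic(xs, ys)
odds_ratio = math.exp(b1)  # odds ratio per additional allele copy, about 3 for these counts
print(b0, b1, odds_ratio)
```

A slope b1 of zero corresponds to the null hypothesis of no association; exp(b1) is the estimated odds ratio per additional copy of the allele.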

Page 21: Genome-Wide Association Study (GWAS)

Logistic Regression

Page 22: Genome-Wide Association Study (GWAS)

Logistic Regression

Page 23: Genome-Wide Association Study (GWAS)

Logistic Regression