Gil McVean


Post on 05-Jan-2016


Page 2: Gil  McVean

What makes us different?

Image: Wikimedia commons

Page 3: Gil  McVean

The genetic axes

Axes: Strong / Weak; Inherited / Somatic

Placed on the axes: Cancer, Complex disease, Genetic disorders, Aging

Images:Wikimedia commons

Page 4: Gil  McVean

Characterising individual genomes

Image: Illumina Cambridge Ltd

Images: Wikimedia commons

Page 5: Gil  McVean

Why 1000 genomes?

• To find all common (>5%) variants in the accessible human genome

• To find at least 95% of variants at 1% in populations of medical genetics interest

– 95% of variants at 0.1% in genes

• To provide a fully public framework for interpreting rare genetic variation in the context of disease

– Screening

– Imputation
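The frequency targets above can be related to sample size with a simple sampling calculation. The sketch below is an illustration only, not the project's own power analysis: it assumes each sampled chromosome carries the variant independently with probability equal to its population frequency (binomial sampling) and ignores sequencing coverage and genotype-calling error. The function name `prob_variant_sampled` and its parameters are hypothetical.

```python
from math import comb


def prob_variant_sampled(freq, n_individuals, min_copies=1):
    """Chance that a variant at population frequency `freq` is carried by at
    least `min_copies` of the 2 * n_individuals sampled chromosomes.

    Assumption: idealised binomial sampling; no sequencing or calling error.
    """
    n_chrom = 2 * n_individuals  # each diploid individual contributes two copies
    # Binomial probability of seeing fewer than `min_copies` copies.
    p_too_few = sum(
        comb(n_chrom, k) * freq**k * (1.0 - freq) ** (n_chrom - k)
        for k in range(min_copies)
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    for freq in (0.05, 0.01, 0.001):
        for n in (100, 1000):
            p = prob_variant_sampled(freq, n)
            print(f"freq={freq:.3f}  n={n:4d}  P(sampled)={p:.3f}")
```

Under these assumptions a 1% variant has roughly an 87% chance of being carried by at least one of 100 sampled individuals, and is all but certain to be present among 1000. Whether it is then detected depends on sequencing depth, which this sketch does not model.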

Page 6: Gil  McVean

The 1000 Genomes Project

Page 7: Gil  McVean

1000 Genomes Project design

Page 8: Gil  McVean

Figure labels: Haplotypes; 2x; 10x; Population sequencing

Page 9: Gil  McVean

A map of shared variation

Page 10: Gil  McVean

http://browser.1000genomes.org

www.1000genomes.org

Page 11: Gil  McVean

Good, but not perfect

Variant type        Validation methods              Estimated FDR
Low-coverage SNPs   Sequenom, 454, PacBio           1.8%
Exome SNPs          454                             1.6%
LOF variants        454                             5.2%
Short indels        PCR, Sanger, array genotypes    36% -> 5.4%
Large deletions     PCR, array CGH, SNP genotype    2.1%
Other large SVs     PCR, array CGH, SNP genotype    1.4% – 3.7%

Post-hoc filtering

Not genotyped

Page 12: Gil  McVean
Page 13: Gil  McVean

4 million sites that differ from the human reference genome

12,000 changes to proteins

100 changes that knock out gene function

5 rare variants that are known to cause disease

Page 14: Gil  McVean

Most variation is common – Most common variation is cosmopolitan

Number of variants in typical genome

Found only in Europe: 0.3%

Found in all continents: 92%

Found only in the UK: 0.1%

Found only in you: 0.002%
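To give the percentages above a sense of scale, the sketch below converts them to approximate variant counts. It assumes, and this is an assumption rather than something the slide states, that the percentages are shares of the roughly 4 million sites per genome quoted on the earlier slide. The names `TOTAL_SITES` and `SHARES_PERCENT` are hypothetical.

```python
from fractions import Fraction

# Assumption: the shares apply to the ~4 million sites that differ from the
# reference in a typical genome (figure taken from the earlier slide).
TOTAL_SITES = 4_000_000

# Percentages as written on the slide, kept as strings for exact arithmetic.
SHARES_PERCENT = {
    "Found in all continents": "92",
    "Found only in Europe": "0.3",
    "Found only in the UK": "0.1",
    "Found only in you": "0.002",
}


def approximate_counts(total_sites, shares_percent):
    """Convert percentage shares into whole-number variant counts."""
    return {
        label: int(Fraction(pct) / 100 * total_sites)
        for label, pct in shares_percent.items()
    }


if __name__ == "__main__":
    for label, count in approximate_counts(TOTAL_SITES, SHARES_PERCENT).items():
        print(f"{label}: ~{count:,} variants")
```

On this reading, even the smallest category ("Found only in you") corresponds to tens of variants per genome rather than none.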

Page 15: Gil  McVean

Imputation from 1000 Genomes

• Imputation similar for all variant types across populations

• Comparable to imputation from high quality SNP haplotypes

Page 16: Gil  McVean

…but it can work for common variants

Page 17: Gil  McVean

The 1000 Genomes Sampling design

Page 18: Gil  McVean

The 1000 Genomes Sampling design

Page 19: Gil  McVean

What have we learned about low-frequency genetic variation from the 1000 Genomes Project?

• How many rare (<0.5%) and low-frequency (0.5-5%) variants are there, how do their numbers vary between populations, and what does this tell us about demography?

• To what extent has natural selection shaped the distribution of rare variants within and between populations?

• What are the implications of these findings for the interpretation of genetic variation in individual genomes?
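The frequency classes used in these questions can be written down as a small helper. This is a minimal sketch, assuming the input is the frequency of the variant allele in one population. The function name `frequency_class` is hypothetical, and the treatment of values exactly at 0.5% and 5% is an assumption, chosen to agree with "rare (<0.5%)" and "common (>5%)" as used in the slides.

```python
from collections import Counter


def frequency_class(allele_freq):
    """Label a variant as rare, low-frequency, or common by allele frequency."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    if allele_freq < 0.005:   # rare: below 0.5%
        return "rare"
    if allele_freq <= 0.05:   # low-frequency: 0.5% to 5% (boundaries assumed inclusive)
        return "low-frequency"
    return "common"           # common: above 5%


if __name__ == "__main__":
    # Made-up frequencies, for illustration only.
    example_freqs = [0.0004, 0.003, 0.01, 0.04, 0.12, 0.45]
    print(Counter(frequency_class(f) for f in example_freqs))
```

Tallying the classes per population in this way gives the kind of count that the first question asks about.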

Page 20: Gil  McVean

Populations differ in load of rare and common variants

Page 21: Gil  McVean

Most rare variation is private

Page 22: Gil  McVean

Rare variant differentiation within ancestry groupings increases as variant frequency decreases

Page 23: Gil  McVean

Not all populations are equal

Page 24: Gil  McVean

Rare variants identify recent historical links between populations

48% of IBS variants shared with American populations

ASW shows stronger sharing with YRI than LWK

Page 25: Gil  McVean

What about variants that affect gene function?

Page 26: Gil  McVean

Conserved variant load per individual

Page 27: Gil  McVean

The proportion of rare variants is predicted by conservation, with the exception of splice-disrupting and STOP+ variants

Page 28: Gil  McVean

KEGG ‘pathways’ show variation in excess rare-variant load

Page 29: Gil  McVean

Patterns of variation inform about selective constraint

CTCF-binding motif

Page 30: Gil  McVean

Variants under selection showed elevated levels of population differentiation

Proportion of pairwise comparisons where nonsynonymous variants are more differentiated than synonymous ones
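The summary statistic named above can be sketched as follows. The slide does not say which differentiation measure was used, so the code takes precomputed values for each population pair and only does the comparison. The function name, the population labels, and the numbers are all invented for illustration and are not results from the project.

```python
def proportion_more_differentiated(comparisons):
    """Fraction of population pairs in which nonsynonymous variants are more
    differentiated than synonymous ones.

    `comparisons` maps a population pair to a tuple
    (nonsynonymous_differentiation, synonymous_differentiation).
    """
    if not comparisons:
        raise ValueError("need at least one pairwise comparison")
    n_higher = sum(1 for nonsyn, syn in comparisons.values() if nonsyn > syn)
    return n_higher / len(comparisons)


# Toy input: made-up differentiation values for three hypothetical pairs.
toy_comparisons = {
    ("pop_A", "pop_B"): (0.012, 0.010),
    ("pop_A", "pop_C"): (0.020, 0.021),
    ("pop_B", "pop_C"): (0.015, 0.011),
}

if __name__ == "__main__":
    print(proportion_more_differentiated(toy_comparisons))
```

A value well above one half across many pairs would be in line with the slide's claim that variants under selection show elevated differentiation.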

Page 31: Gil  McVean

Rare variant differentiation can confound the genetic study of disease

Mathieson and McVean (2012)

Page 32: Gil  McVean

Implications

• Rare variants have spatial and ancestry-related distributions that reflect recent demographic events and selection.

• Purifying selection elevates local differentiation of rare variants.

• The functional and aetiological interpretation of rare variants in the context of disease needs to be aware of the local genetic background.

Page 33: Gil  McVean

AFRICA

Gambian in Western Division, The Gambia (GWD)

Malawian in Blantyre, Malawi (MAB)

Mende in Sierra Leone (MSL)

Esan in Nigeria (ESN)

SOUTH ASIAN

Punjabi in Lahore, Pakistan (PJL)

Bengali in Bangladesh (BEB)

Sri Lankan Tamil in the UK (STU)

Indian Telugu in the UK (ITU)

AMERICAS

African American in Jackson, MS (AJM)

Numbers shown on the slide (not matched to populations in this transcript): 100, 200, 100, 100, 100, 100, 80

The final resource – mid 2013

Page 34: Gil  McVean

What more could we learn about human population genetics?

• There is a need for continuing the programme of developing public resources describing genetic variation across new populations, with high resolution spatial information.

– This will not just shed light on population history and selection, but be important for interpreting (rare) genetic variation in individual genomes.

• The Phase 1 1000 Genomes data has made clear the extent of variation in conserved regulatory sequence within genomes

– How does this relate to variation in function in different cell types?

• Many of the most interesting parts of the genome (for the study of selection) are still poorly covered by HTS data

– Need to collect ‘bespoke’ data types for some genomic regions

Page 35: Gil  McVean

The 1000 Genomes Project Consortium

http://www.1000genomes.org/