genetic variation - ut · what is genetic variation? • differences in dna content or structure...

47
Genetic Variation Applied Computational Genomics, Lecture 05 https://github.com/quinlan-lab/applied-computational-genomics Aaron Quinlan Departments of Human Genetics and Biomedical Informatics USTAR Center for Genetic Discovery University of Utah quinlanlab.org

Upload: others

Post on 17-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Genetic VariationApplied Computational Genomics, Lecture 05

https://github.com/quinlan-lab/applied-computational-genomicsAaron Quinlan

Departments of Human Genetics and Biomedical InformaticsUSTAR Center for Genetic Discovery

University of Utahquinlanlab.org

Page 2: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

What is genetic variation?• Differences in DNA content or structure among individuals

• Any two individuals have ~99.5% identical DNA. • But the human genome is big - each haploid set of 23 chromosomes has 3.1 billion

nucleotides.• There are >100,000,000 know genetic variants in the human genome

• Effectively infinite combinations of alleles. The details matter.

99% identical DNA~99.8% identical DNA (differ at 1/ 620 - 1/750 bp

Drosophila ~ 1/180

Page 3: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

CGCAAATTTGCCGGATTTCCTTTGCTGTTCCTGCATGTAGTTTAAACGAGATTGCCAGCACCGGGTATCATTCACCATTTTTCTTTTCGTTAACTTGCCGTCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCAGACTTCCCGTGTCCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTTCATCTGCAGGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTGTGGAGAGGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATACCACAACCAGGCATAGGGGAAAGATTGGAGGAAAGATGAGTGAGAGCATCAACTTCTCTCACAACCTAGGCCAGTAAGTAGTGCTTGTGCTCATCTCCTTGGCTGTGATACGTGGCCGGCCCTCGCTCCAGCAGCTGGACCCCTACCTGCCGTCTGCTGCCA/TCGGAGCCCAAAGCCGGGCTGTGACTGCTCAGACCAGCCGGCTGGAGGGAGGGCC/GCTCAGCAGGTCTGGCTTTGGCCCTGGGAGAGCAGGTGGAAGATCAGGCAGGCCATCGCTGCCACAGAACCCAGTGGATTGGCCTAGGTGGGATCTCTGAGCTCAACAAGCCCTCTCTGGGTGGTAGGTGCAGAGACGGGAGGGGCAGAGCCGCAGGCACAGCCAAGAGGGCTGAAGAAATGGTAGAACGGAGCAGCTGGTGATGTGTGGGCCCACCGGCCCCAGGCTCCTGTCTCCCCCCAGGTGTGTGGTGATGCCAGGCATGCCCTTCCCCAGCATCAGGTCTCCAGAGCTGCAGAAGACGACGGCCGACTTGGATCACACTCTTGTG/AAGTGTCCCCAGTGTTGCAGAGGTGAGAGGAGAGTAGACAGTGAGTGGGAGTGGCGTCGCCCCTAGGGCTCTACGGGGCCGGCGTCTCCTGTCTCCTGGAGAGGCTTCGATGCCCCTCCACACCCTCTTGATCTTCCCTGTGATGTCATCTGGAGCCCTGCTGCTTGCGGTGGCCTATAAAGCCTCCTAGTCTGGCTCCAAGGCCTGGCAGAGTCTTTCCCAGGGAAAGCTACA/TAGCAGCAAACAGTCTGCATGGGTCATCCCCTTCACTCCCAGCTCAGAGCCCAGGCCAGGGGCCCCCAAGAAAGGCTCTGGTGGAGAACCTGTGCATGAAGGCTGTCAACCAGTCCATAGGCAAGCCTGGCTGCCTCCAGCTGGGTCGACAGACAGGGGCTGGAGAAGGGGAGAAGAGGAAAGTGAGGTTGCCTGCCCTGTCTCCTACCTGAGGCTGAGGAAGGAGAAGGGGATGCACTGTTGGGGAGGCAGCTGTAACTCAAAGCCTTAGCCTCTGTTCCCACGAAGGCAGGGCCATCAGGCACCAAAGGGATTCTGCCAGCATAGTGCTCCTGGACCAGTGATACACCCGGCACCCTGTCCTGGACACGCTGTTGGCCTGGATCTGAGCCCTGGTGGAGGTCAAAGCCACCTTTGGTTCTGCCATTGCTGCTGTGTGGAAGTTCACTCCTGCCTTTTCCTTTCCCTAGAGCCTCCACCACCCCGAGATCACATTTCTCACTGCCTTTTGTCTGCCCAGTTTCACCAGAAGTAGGCCTCTTCCTGACAGGC/TAGCTGCACCACTGCCTGGCGCTGTGCCCTTCCTTTGCTCTGCCCGCTGGAGACGGTGTTTGTCATGGGCCTGGTCTGCAGGGATCCTGCTACAAAGGTGAAACCCAGGAGAGTGTGGAGTCCAGAGTGTTGCCAGGACCCAGGCACAGGCATTAGTGCCCGTTGGAGAAAACAGGGGAATCCCGAAGAAATGGTGGGTCCTGGCCATCCGTGAGATCTTCCCAGGTGTGCCGTTTTCTCTGGAAGCCTCTTAAGAACACAGTGGCGCAGGCTGGGTGGAGCCGTCCCCCCATGGAGCACAGGCA/GGACAGAAGTCCCCGCCCCAGCTGTGTGGCCTCAAGCCAGCCTTCCGCTCCTTGAAGCTGGTCTCCACACAGTGCTGGTTCCGTCACCCCCTCCCAAGGAAGTAGGTCTGAGCAGCTTGTCCTGGCTGTGTCCATGTCAGAGCAACGGCCCAAGTCTGGGTCTGGGGGGGAAGGTGTCATGGAGCCCCCTACGATTCCCAGTCGTCCTCGTCCTCCTCTGCCTGTGGCTGCTGCGGTGGCGGCAGAGGAGGGATGGAGTCTGACACGCGGGCAAAGGCTCCTCCGGGCCCCTCACCAGCCCCAGGTCCTTTCCCAGAGATGCCTGGAGGGAAAAGGCTGAGTGAGGGTGGTTGGTGGGAAACCCTGGTTCCCCCAGCCCCCGGA/CGACTTAAATACAGGAAGAAAAAGGCAGGACAGAATTACAAGGTGCTGGCCCAGGGCGGGCAGCGGCCCTGCCTCCTACCCTTGCGCCTCATGACCGGAGCCATAGCCCAGGCAGGAGGGCTGAGGACCTCTGGTGGCGGCCCAGGGCTTCCAGCATGTGCCCTAGGGGAAGCAGGGGCCAGCTGGCAAGAGCAGGGGGTGGGCAGAAAGCACCCGGTGGACTCAGGGCTGGAGGGGAGGAGGCGATCTTGCCCAAGGCCCTCCGACTGCAAGCTCCAGGGCCCGCTCACCTTGCTCCTGCTCCTTCTGCTGCTGCTTCTCCAGCTTTCGCTCCTTCATGCTGCGCAGCTTGGCCTTGCCGATGCCCCCAGCTTGGCGGATGGACTCTAGCAGAGTGGCCAGCCACCGGAGGGGTCAACCACTTCCC

Page 4: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Types of genetic variation

ctctgagctccgag

Single-nucleotidepolymorphisms

(SNPs)

ctctgagctc--ag

Insertion-deletionpolymorphisms

(INDELs)

ctc agctcaag

“DNA spelling mistakes” “extra or missingDNA”

“Large blocks of extra, missingor rearranged

DNA”

Structural variants(SVs)

Page 5: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

A typical human genome"We find that a typical [human] genome differs from the reference human genome at 4.1 million to 5.0 million sites. Although >99.9% of variants consist of SNPs and short indels, structural variants affect more bases: the typical genome contains an estimated 2,100 to 2,500 structural variants (~1,000 large deletions, ~160 copy-number variants, ~915 Alu insertions, ~128 L1 insertions, ~51 SVA insertions, ~4 NUMTs, and ~10 inversions), affecting ~20 million bases of sequence.

Nucleotide diversity ( ):1/756 bp to 1/620 bp

Page 6: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Cases(have disease)

Controls(no disease)

v.

Why do we care?

Understanding the relationship between genetic variation and traits or disease phenotypes

Complex diseases(multiple genes contribute to risk)

Page 7: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

http://www.fos.auckland.ac.nz/~howardross/RecreateTheResearch/RTR/images/Coalescence.png

Why do we care?

Genetic variants are the "bread crumbs" for tracking evolution

Page 8: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Mutation != Polymorphism (or SNP)

Page 9: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

It all starts with mutation

acctccgagta acctccgagtaacctccgagta acctccgagtaacctccgagta acctccgagtaacctccgagta acctccgagtaacctccgagta acctccgagta

a toy population of 10 identical chromosomes

Page 10: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

acctccgagta acctccgagtaacctccgagta acctccgagtaacctccgagta acctccgagtaacctccgagta acctccgagtaacctccgagta acctcTgagta

mutation: private to this chromosome / individual

Mutation creates genetic diversity

Page 11: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

acctccgagta acctcTgagtaacctcTgagta acctccgagtaacctccgagta acctcTgagta

acctccgagtaacctccgagta acctcTgagtaacctccgagta

From mutation to polymorphism

Page 12: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

♂♀

ATCGGGTACCATCCAATCATTACC

Humans are diploid.

ATCGGGAACCATCCAATCATTACC♂♀

Our genome is comprised of a paternal and a maternal "haplotype". Together, they form our "genotype"

Page 13: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

How existing (germline) variation is inherited

ctctgagctccgag

ctctgagctccgag

XExample: Mom and dad are

heterozygous;that is, the zygote

from which they developed was comprised of a sperm

and egg with two different alleles

ctctgagctccgag

ctctgagctccgag

Kid is heterozygous(C/T)

♂♀

♀♂

or

Kid is homozygous(C/C)

ctccgagctccgag♂

Kid is homozygous(T/T)

ctctgagctctgag♂

Page 14: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

de novo mutation: the birth of new variation

ctccgagctccgag

ctccgagctccgag

X Example: Mom and dad are homozygous for the same alleles.

♂♂

♀♀

Kid is heterozygous owing to de novo mutation.(C/T)

ctccgagctctgag♂

New mutation occurs in father’s or mother’s germ cell

Note: This is a derivative chromosomeof the one the father inherited from

His parents. The mutation occurred inhis gamete (sperm) and was passed on to

the child.

♂ ctccgag ♂ ctctgag

Page 15: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

How frequently do de novo mutations (DNMs) occur?human mutation rate

~ 1.1 x 10-8 / bp / generation

haploid genome size~ 3.1 x 109 nucleotides

X

~30-40 DNM per haploid genome,

So approximately 80 per diploid genome

=

(But what is the human mutation rate, really?)Roach et al. (2010) Science

estimate 28 new mutations (per haploid genome)

per generation via quartet sequencing

Nachman et al. (2000) Genetics

estimate nearly double that number via phylogenetic

comparisonSlide from Tom Sasani

Page 16: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

DNMs are more likely to occur inthe paternal germline, and correlate with paternal age

Yuen et al. (2016) Nature Genomic Medicine(data from 200 ASD trios)Slide from Tom Sasani

Page 17: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

somatic mutations

Germline mutation- occur in sperm or egg.

- are heritable

Somatic mutation- non-germline tissues.

- are not heritable

Somatic mutationscommon in cancer

compare DNA from cancer cells tohealthy cells from same individual

vs.

Page 18: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

acctccgagtaacctctgagta

Chromosome with new allele

All other chromosomes

What if the mutation is:

beneficial

Time

Positive selection

neutral

freq. = freq. ⬇ freq. ⬆Genetic “drift” - random process

Selection and genetic drift

Negative selection

deleterious

Page 19: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Genetic bottleneck“Mixed” population

Page 20: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

“Mixed” population

Bottleneck event (migration, war, disease, nearly extinct animals

(e.g., elephant seals in Baja) greatly reduces population size

Genetic bottleneck

Page 21: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

“Mixed” population

Greatly reduced population diversityTime

Genetic bottleneck

Page 22: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Out of Africa Theory bottlenecks & reduced diversity

Nature Reviews Genetics 3, 611-621 (August 2002)

Page 23: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Population stratification

Page 24: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

The 1000 (2,504) Genomes Project

Page 25: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Site Frequency Spectrum - most variants are rare

Page 26: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

ExAC: exome sequencing of 60,706 humans

1 variant every 8 bases among 60,706 humans! Also, >50% of the 9 million variants discovered were present as a heterozygote in 1 out of the 60706 individuals (a "singleton")!!!

http://www.nature.com/nature/journal/v536/n7616/pdf/nature19057.pdf

Page 27: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Allele frequency spectrum: most variants are rare

http://www.nature.com/nature/journal/v536/n7616/pdf/nature19057.pdf

>50% of the 9 million variants discovered were present as a heterozygote in 1 out of the 60706 individuals (a "singleton"). >90% present on <=10 chromosomes sampled

Page 28: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Each chromosome is a mosaic of alleles.You inherit two haplotypes: one from mom, one from dad

A

G

C

T

G

G

Haplotypes

Page 29: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Meiotic recombination shuffles alleles and generates new haplotypes

A

G

G

G

G CG

C

T

Page 30: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Recombination frequency increases with distance between two "markers" (polymorphic sites). T.H. Morgan

A G C

Recombination more likely

Lesslikely

Page 31: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

One centimorgan (cM) is the equivalent to a recombination frequency of 0.01 (1%)

A G C

0.02Recomb.

freq.

0.08Recomb.

freq.

2 cM8 cM

10 cM

In humans, 1 cM corresponds to approximately 1 million bp on average

Page 32: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Linkage equilibrium: random association of alleles at different loci.

A

A

C

T

G

G

G TG

G CG

25%

25%

25%

25%

Page 33: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Linkage disequilibrium: non-random association of alleles at different loci.

A

A

C

T

G

G

G TG

G CG

80%

10%

7%

3%

Page 34: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Therefore, knowing one allele (e.g., the first A) is a strong predictor of other alleles on a haplotype

A

A

C

T

G

G

G TG

G CG

80%

10%

7%

3%

Page 35: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

VCF format, Hardy-Weinberg Equilibrium, & VCF toolkits

Applied Computational Genomics, Lecture 11https://github.com/quinlan-lab/applied-computational-genomics

Aaron QuinlanDepartments of Human Genetics and Biomedical Informatics

USTAR Center for Genetic DiscoveryUniversity of Utah

quinlanlab.org

Page 36: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

VCF format

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3137218/pdf/btr330.pdf

Page 37: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

VCF format

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3137218/pdf/btr330.pdf

Page 38: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Genotypes

#CHROM POS ID REF ALT QUALFILTER INFO FORMAT LF1396chr7 117175373 . A G 90 PASS AF=0.0 GT 0/0

chr7 117175373 . A G 90 PASS AF=0.5 GT 0/1

chr7 117175373 . A G 90 PASS AF=1.0 GT 1/1

chr7 117175373 . A G 0 PASS AF=0.0 GT ./.

Hom. Ref.

Het.

Hom. Alt.

Unknown

Why would a genotype be unknown?

Page 39: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

VCF format. A basic example

#CHROM POS ID REF ALT QUALFILTER INFO FORMAT LF1396chr7 117175373 . A G 90 PASS AF=0.5 GT 0/1

Heterozygous A/G. The REF allele is allele "0", ALT is allele "1"

Page 40: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Multi-sample VCF

#CHROM POS ID REF ALT QUALFILTER INFO FORMAT MOM KIDchr7 2194169 . C T 210 PASS AF=0.5 GT 0/1 0/1

Heterozygous C/T.Mom

Kid

Page 41: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Hardy-Weinberg Equilibrium

Polymorphic loci that are biallelic (e.g., A and G alleles) have two allele frequencies, p and q.

f(A) = p = 4/6 = 0.67f(G) = q = 2/6 = 0.33

p + q = 1

AAGAGA

Page 42: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Hardy-Weinberg EquilibriumIn the absence of evolutionary forces such as selection, drift, or bottlenecks, Hardy–Weinberg equilibrium states

that allele and genotype frequencies in a population will remain constant from generation to generation. If we know the allele frequencies, p and q, we can predict the genotype frequencies that should be observed

(binomial expectation).

f(A) = p = 4/6 = 0.67f(G) = q = 2/6 = 0.33

https://en.wikipedia.org/wiki/Hardy%E2%80%93Weinberg_principle#Deviations_from_Hardy.E2.80.93Weinberg_equilibrium

f(AA) = p2 = (0.67)2 = 0.4489f(AG) = 2pq = 2(0.67)(0.33) = 0.4422f(GG) = q2 = (0.33)2 = 0.1089

p2 + 2pq + q2 = 1

Page 43: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Hardy-Weinberg Equilibrium: expected genotype freqs

https://en.wikipedia.org/wiki/Hardy%E2%80%93Weinberg_principle#Deviations_from_Hardy.E2.80.93Weinberg_equilibrium

f(AA) = p2 = (0.5)2 = 0.25f(AG) = 2pq = 2(0.5)(0.5) = 0.5f(GG) = q2 = (0.5)2 = 0.25

p = 0.5, q = 0.5

f(AA) = p2 = (0.1)2 = 0.01f(AG) = 2pq = 2(0.1)(0.9) = 0.18f(GG) = q2 = (0.9)2 = 0.81

p = 0.1, q = 0.9

f(AA) = p2 = (0.01)2 = 0.0001f(AG) = 2pq = 2(0.01)(0.99) = 0.0198f(GG) = q2 = (0.99)2 = 0.9801

p = 0.01, q = 0.99

f(AA) = p2 = (0.001)2 = 0.000001f(AG) = 2pq = 2(0.001)(0.999) = 0.001998f(GG) = q2 = (0.999)2 = 0.998001

p = 0.001, q = 0.999

Page 44: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Hardy-Weinberg Equilibrium

https://en.wikipedia.org/wiki/Hardy%E2%80%93Weinberg_principle#Deviations_from_Hardy.E2.80.93Weinberg_equilibrium

f(AA) = p2 = (0.1)2 = 0.01f(AG) = 2pq = 2(0.1)(0.9) = 0.18f(GG) = q2 = (0.9)2 = 0.81

p = 0.1, q = 0.9

If we sequenced 100 individuals, how many

A/G heterozygotes would we expect?

How many A/A homozygotes?

Page 45: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Hardy-Weinberg Equilibrium

http://www.nature.com/scitable/knowledge/library/the-hardy-weinberg-principle-13235724

Page 46: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Hardy-Weinberg Equilibrium - example

The frequency of allele "Z" at a given locus on chromosome 7 is 0.3. What proportion of individuals do we expect to be heterozygous for the Z and Q alleles?

Page 47: Genetic Variation - ut · What is genetic variation? • Differences in DNA content or structure among individuals • Any two individuals have ~99.5% identical DNA. • But the human

Deviations from Hardy-Weinberg Equilibrium

http://www.nature.com/scitable/knowledge/library/the-hardy-weinberg-principle-13235724

● Inbreeding● Population bottlenecks● Positive selection● Purifying selection● Random genetic drift

Example: a recessive, disease causing allele.Expect p2 homozygotes for the recessive allele, yet observe significantly less than p2 * the number of individuals tested