Short introduction to Human Variation
Lasse Folkersen
Center for Biological Sequence Analysis
Technical University of Denmark
Human 1  AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
         AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
Human 2  ACGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
         AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
Human 3  AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
         AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
Human 4  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
         AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
Human 5  ACGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
         AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTTGA
Each sequence is flanked by 33M letters on one side and 147M letters on the other.
SNP
SNP1 SNP2 SNP3?
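The three marked sites can be recovered by comparing the ten displayed sequences column by column. The following is a minimal sketch, not part of the original slides: the `a`/`b` suffixes on the names and the helper `variable_sites` are hypothetical, and the sequences are typed in from the alignment with the spaces around the middle letter removed. The counts are for these ten copies only, so they are not population frequencies.

```python
from collections import Counter

# Flanks shared by most rows of the alignment (copied from the slide).
LEFT = "AGGAAAACACGGAGTTGATGCA"
RIGHT = "AAGCCCCAACATCCAACCTCGA"
LEFT_ALT = "ACGAAAACACGGAGTTGATGCA"   # first sequence of Human 2 and Human 5
RIGHT_ALT = "AAGCCCCAACATCCAACCTTGA"  # second sequence of Human 5

# Two sequences per human; the "a"/"b" labels are hypothetical names.
copies = {
    "Human 1a": LEFT + "G" + RIGHT,
    "Human 1b": LEFT + "G" + RIGHT,
    "Human 2a": LEFT_ALT + "G" + RIGHT,
    "Human 2b": LEFT + "C" + RIGHT,
    "Human 3a": LEFT + "G" + RIGHT,
    "Human 3b": LEFT + "C" + RIGHT,
    "Human 4a": LEFT + "C" + RIGHT,
    "Human 4b": LEFT + "C" + RIGHT,
    "Human 5a": LEFT_ALT + "G" + RIGHT,
    "Human 5b": LEFT + "G" + RIGHT_ALT,
}


def variable_sites(sequences):
    """Return {0-based position: Counter of letters} for columns that vary."""
    sites = {}
    for position, column in enumerate(zip(*sequences)):
        counts = Counter(column)
        if len(counts) > 1:
            sites[position] = counts
    return sites


sites = variable_sites(list(copies.values()))
for position, counts in sites.items():
    # Report 1-based positions within the 45-letter window.
    print(position + 1, dict(counts))
```

Only three columns vary in this window: one near the start, the middle letter, and one near the end.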
SNP1 SNP2 SNP3? (50%) (0.1%) (1%)
SNP: One nucleotide difference occurring in at least 1% of a population
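The 1% rule can be written as a one-line check. This is a small illustrative sketch: the function name `is_snp` is hypothetical, and the three test values are the frequencies shown on the slide (50%, 1% and 0.1%), without assuming which site each belongs to.

```python
SNP_THRESHOLD = 0.01  # "at least 1% of a population", from the definition above


def is_snp(frequency, threshold=SNP_THRESHOLD):
    """Return True if a variant frequency meets the 1% definition."""
    return frequency >= threshold


for frequency in (0.50, 0.01, 0.001):
    label = "SNP" if is_snp(frequency) else "below the SNP threshold"
    print(f"{frequency:.1%}: {label}")
```

Under this definition, a variant seen in only 0.1% of a population would not count as a SNP, which is one way to read the question mark on the third site.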
rs2278007 rs16891982 mutation
Question: How many SNPs differ on average between two people?
1. 0.01% (e.g. 300k)
2. 0.1% (e.g. 3 million)
3. 1% (e.g. 30 million)
4. 10% (e.g. 300 million)
Human: 3 billion base pairs. About one base pair out of every 1,000 differs between any two individuals.
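The arithmetic behind the four options can be checked directly. A minimal sketch, using only the genome size and the percentages given on the slide; the variable names are illustrative.

```python
GENOME_SIZE = 3_000_000_000  # base pairs, as stated on the slide

# Each option expressed as "one difference per N base pairs".
options = {
    "0.01%": 10_000,
    "0.1%": 1_000,
    "1%": 100,
    "10%": 10,
}

expected_differences = {
    label: GENOME_SIZE // one_in for label, one_in in options.items()
}

for label, count in expected_differences.items():
    print(f"{label}: {count:,} differing base pairs")
```

One difference per 1,000 base pairs corresponds to the 0.1% option, that is, about 3 million differing positions between two individuals.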
Another similar project?
Human 1: European ancestry
  AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
  AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
Human 2: Mixed ancestry
  AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
Human 3: Asian ancestry
  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
SNP: rs16891982 – ethnicity dependent
C-frequency Scandinavia: 2%
C-frequency China: 99.9%
Practically no Chinese have GG
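The statement about GG can be made quantitative if one assumes Hardy-Weinberg proportions, that is, that the two copies a person carries are drawn independently from the population. That assumption is not stated on the slides; the sketch below only illustrates what the two C-frequencies would imply under it. The function name `hardy_weinberg` is hypothetical.

```python
def hardy_weinberg(c_frequency):
    """Expected genotype frequencies for a C/G site, assuming
    Hardy-Weinberg proportions (an assumption, not from the slides)."""
    g_frequency = 1.0 - c_frequency
    return {
        "CC": c_frequency ** 2,
        "CG": 2 * c_frequency * g_frequency,
        "GG": g_frequency ** 2,
    }


# C-frequencies taken from the slide.
populations = {"Scandinavia": 0.02, "China": 0.999}

for name, c_frequency in populations.items():
    genotypes = hardy_weinberg(c_frequency)
    print(name, {g: f"{freq:.6f}" for g, freq in genotypes.items()})
```

Under this assumption, a C-frequency of 99.9% gives an expected GG frequency of roughly one in a million, while a C-frequency of 2% gives an expected GG frequency of about 96%.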
Human 1: European ancestry
  AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
  AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
Human 2: Mixed ancestry
  AGGAAAACACGGAGTTGATGCA G AAGCCCCAACATCCAACCTCGA
  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
Human 3: Asian ancestry
  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
Gorilla
  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
  AGGAAAACACGGAGTTGATGCA C AAGCCCCAACATCCAACCTCGA
Dog
  AGGAAAACACGGAATTGATGCA G AAGCCCCAACATCCAACCTCGA
  AGGAAAACACGGAATTGATGCA G AAGCCCCAACATCCAACCTCGA
Flanking lengths shown on the slide: 33M and 147M letters for the human rows; 59M, 73M, 34M and 14M letters for the gorilla and dog rows.
SNP: rs16891982 – non-synonymous coding
Amino acids: E V G C W G F/L C I N S V F S
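The amino-acid row can be reproduced from the displayed DNA. The sketch below is an illustration under two assumptions that are inferred, not stated on the slide: the gene is read from the strand opposite to the one shown (so the sequence is reverse-complemented), and the reading frame starts at the third letter of that reverse complement. Both were chosen because they reproduce the slide's amino acids under the standard genetic code. The names `protein_for` and `FRAME_OFFSET` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Flanks around rs16891982 as displayed on the slide.
LEFT = "AGGAAAACACGGAGTTGATGCA"
RIGHT = "AAGCCCCAACATCCAACCTCGA"
FRAME_OFFSET = 2  # assumption: frame inferred from the amino-acid row


def reverse_complement(seq):
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; trailing letters are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def protein_for(allele):
    """Amino acids encoded around the SNP for a displayed allele (G or C)."""
    coding = reverse_complement(LEFT + allele + RIGHT)
    return translate(coding[FRAME_OFFSET:])


print("G allele:", protein_for("G"))
print("C allele:", protein_for("C"))
```

With these assumptions, the displayed G allele yields F at the variable position and the displayed C allele yields L, matching the F/L on the slide. Because the encoded amino acid changes, the SNP is non-synonymous.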
SNP: rs16891982 – hair colour associated
Unravelling a hair-colour SNP
Hair colour and genome-wide association studies (GWAS)
Results from largest hair colour GWAS: there are many hair-colour SNPs
Hair colour and genome-wide association studies (GWAS)
Results and conclusions
Human 1
Phenotype: average Scandinavian
Genotype (rs16891982): light homozygous
Genotype (others): mixed

Human 3
Phenotype: black
Genotype (rs16891982): dark homozygous
Genotype (others): almost all light colour

Human 2
Phenotype: average Scandinavian
Genotype (rs16891982): heterozygous
Genotype (others): half mixed, half complete light