relatedness and differentiation in arbitrary population
TRANSCRIPT
●●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●● ●●
●●
●
●●
●
●
●●
●●
●
●●
●●
●
●
●
●●
●●
●●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
● ●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●●
●● ●
●● ●
●
●
●
●
●
●
0.2
0.3
0.4
Inbr
eedi
ng c
oeff.
Relatedness and differentiationin arbitrary population structures
Alejandro Ochoa, John D. Storey Lab, Princeton University7 DrAlexOchoa � viiia.org/research/ R [email protected]
Why study relatedness?
Human geneticsis fascinating!
Pop. structureconfoundsassociationstudies (GWAS)
Heritability ofcomplex traits
Animal and plantbreeding
Why study relatedness?
Human geneticsis fascinating!
Pop. structureconfoundsassociationstudies (GWAS)
Heritability ofcomplex traits
Animal and plantbreeding
Why study relatedness?
Human geneticsis fascinating!
Pop. structureconfoundsassociationstudies (GWAS)
Heritability ofcomplex traits
Animal and plantbreeding
Why study relatedness?
Human geneticsis fascinating!
Pop. structureconfoundsassociationstudies (GWAS)
Heritability ofcomplex traits
Animal and plantbreeding
The kinship coefficient for parent-child: 14
The kinship coefficient for parent-child: 14
The kinship coefficient for parent-child: 14
The kinship coefficient for siblings: 14 on average
Visscher et al. (2006)
The kinship coefficient for siblings: 14 on average
Visscher et al. (2006)
The kinship coefficient for siblings: 14 on average
Visscher et al. (2006)
The inbreeding coefficient in populations
Measurements relativeto a reference pop.:
Inbreeding = 0 in thelocal population
Inbreeding ≥ 0 relativeto a distant ancestralpopulation
Better measured usingcovariance
The inbreeding coefficient in populations
Measurements relativeto a reference pop.:
Inbreeding = 0 in thelocal population
Inbreeding ≥ 0 relativeto a distant ancestralpopulation
Better measured usingcovariance
The inbreeding coefficient in populations
Measurements relativeto a reference pop.:
Inbreeding = 0 in thelocal population
Inbreeding ≥ 0 relativeto a distant ancestralpopulation
Better measured usingcovariance
Dataset: 1000 Genomes Project (2013)
●
●
●
●
●
● ●
●
●
●●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
ASW
ACB
ESN
GWD
LWK
MSLYRI
GBR
FIN
IBS
TSI
CEU
BEB
GIH
ITU
PJL
STU
CDX
CHB
JPT
KHV
CHS
CLM
MXL
PEL
PUR
Subpopulations
SAfricaEuropeSAsiaEAsiaAmr
2,504 indivs. from 26 locs. — 20,417,698 loci (asc. in YRI) — WGS trios, etc.
Dataset: Human Origins (2016)
●
●
●●
●
●
●●●
●●●
●
●
●
●●●
●●
●● ●
● ●
● ●●
●●
●●
●
●●● ●
●●●
●
●●●
●●
●●
●●●●
●
●
●
● ●●
●
●
●●●
●●●
●
●
●
●●
●●
●
●●
●●
●
●● ●
●●●●
●
●
●
●●●
●●
●
●●●
●●
●
●●
●
●
● ●
●
●
●
●
●
●
●
●●
●
● ●
●●
●●
●
●●
●
●
●●
●
● ●●
●
●
● ●
●
● ●
●
●●
●●
●●
●●
●
●● ●●
●
●●
●●
●
●●
Ju_hoan_North
Khomani
Mbuti
Biaka
BantuSA
BantuKenya
Luhya
YorubaEsan
MandenkaGambian
Mende
Luo
Hadza
AA
KikuyuMasai
Datog
Somali
Jew_Ethiopian
Saharawi
MoroccanMozabite
Algerian Tunisian
Libyan Egyptian
Jew_Libyan
Jew_TunisianJew_Moroccan
Jew_Yemenite
Yemeni
Saudi
BedouinBBedouinA
PalestinianJordanian
SyrianLebanese_Muslim
Lebanese_Christian
Jew_Turkish
Assyrian
Druze
Lebanese
Jew_iraqi
Iranian_Bandari
Iranian
Jew_Iranian
TurkishItalian_SouthSicilian
Maltese
SpanishSW
Canary_Islander
SpanishNE
FrenchS
Italian_North
Sardinian
Basque
FrenchN
Greek
BulgarianCroatian
English
Scottish
Orcadian
Icelandic
Hungarian
Norwegian
Czech
Belarusian
Estonian
Lithuanian
Ukrainian
RussianFinnish
MordovianJew_Ashkenazi
Cypriot
Romanian
Albanian North_Ossetian
Jew_GeorgianLezgin
KumykBalkar
Tajik
Nogai
Chuvash
Turkmen
ChechenAbkhasian
Georgian
Adygei
Armenian
BalochiBrahui
Makrani
Kalash
Pathan
Sindhi
Burusho
Punjabi
Hazara
Jew_Cochin
Gujarati Bengali
Uzbek
Mansi
Kusunda
Uygur
Altaian
Dolgan
Even
Kalmyk
Kyrgyz
Selkup
Tubalar Tuvinian
ThaiCambodian
Ami
Atayal
Dai
Han
Japanese
Kinh
Korean
Lahu
Miao
Mongola
Naxi She
Tujia
Xibo
Yi
Daur Hezhen
Tu
OroqenUlchi
Yakut
Nganasan
Yukagir
ItelmenKoryak
Chukchi
Eskimo
AleutAleut_Tlingit
Pima
Mixe
Mixtec
Zapotec
Mayan
Piapoco
Surui
Karitiana
Bolivian
Quechua
Australian
Nasoi
Papuan
Subpopulations
SAfricaNAfricaMiddleEastEuropeCaucasusSAsiaNAsiaEAsiaAmrOc
2,066 indivs. from 163 locs. — 595,911 loci — SNP chip
Recently admixed populations
African-Americans
Baharian, et al. (2016)
Hispanics
Moreno-Estrada, et al. (2013)
Admixed siblings from different populations?
Lucy and Maria, UK
Ochoa brothers, MX
High Admixture LD:
Moreno-Estrada, et al. (2013)
Solution: treat every individual as its own population!
Admixed siblings from different populations?
Lucy and Maria, UK
Ochoa brothers, MX
High Admixture LD:
Moreno-Estrada, et al. (2013)
Solution: treat every individual as its own population!
Admixed siblings from different populations?
Lucy and Maria, UK
Ochoa brothers, MX
High Admixture LD:
Moreno-Estrada, et al. (2013)
Solution: treat every individual as its own population!
Admixed siblings from different populations?
Lucy and Maria, UK
Ochoa brothers, MX
High Admixture LD:
Moreno-Estrada, et al. (2013)
Solution: treat every individual as its own population!
FST in the independent subpopulation model
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
0.4
0.6
Alle
le fr
eque
ncy
Illustration.
FST =Var(pSi∣∣T)
pTi(1− pTi
) .
FST in the independent subpopulation model
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
0.4
0.6
Alle
le fr
eque
ncy
Illustration.
FST =Var(pSi∣∣T)
pTi(1− pTi
) .
FST measures population structure / differentiation
●●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●● ●●
●●
●
●●
●
●
●●
●●
●
●●
●●
●
●
●
●●
●●
●●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
● ●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●●
●● ●
●● ●
●
●
●
●
●
●
0.1
0.2
0.3
0.4
Alle
le fr
eque
ncy
Median diff. SNP in Human Origins (rs7626601; given MAF ≥ 10%).
F̂WCST ≈ 0.0712 using Weir-Cockerham estimator and K = 163.
FST measures population structure / differentiation
●●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●● ●●
●●
●
●●
●
●
●●
●●
●
●●
●●
●
●
●
●●
●●
●●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
● ●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●●
●● ●
●● ●
●
●
●
●
●
●
0.1
0.2
0.3
0.4
Alle
le fr
eque
ncy
Median diff. SNP in Human Origins (rs7626601; given MAF ≥ 10%).
F̂WCST ≈ 0.0712 using Weir-Cockerham estimator and K = 163.
Single Nucleotide Polymorphism (SNP) dataExample: Genotype CC CT TT
xij 0 1 2Genome\Wide(SNP(Data(
0 2 2 1 1 0 1 0 2 1 0 1 2 . . .
Individuals
SN
Ps
X
Increased homozygosity in a structured population
“Hardy-Weinberg Equilibrium”
0 1 2
pi = 0.25
Genotype ( xij)
Pro
babi
lity
0.0
0.2
0.4
0.6
A structured population
0 1 2
pi = 0.25, fj = 0.5
Genotype ( xij)
Pro
babi
lity
0.0
0.2
0.4
0.6
Increased homozygosity in a structured population
“Hardy-Weinberg Equilibrium”
0 1 2
pi = 0.25
Genotype ( xij)
Pro
babi
lity
0.0
0.2
0.4
0.6
A structured population
0 1 2
pi = 0.25, fj = 0.5
Genotype ( xij)
Pro
babi
lity
0.0
0.2
0.4
0.6
Kinship model for genotypes
symbol meaningT ref ancestral populationi locus index
j , k individual indexespTi ref allele frequencyxij genotype (num ref alleles)ϕTjk kinship of j , k
f Tj inbreeding of j
Statistical model:
E[xij |T ] = 2pTi ,
Var(xij |T ) = 2pTi(1− pTi
)(1 + f Tj ),
Cov(xij , xik |T ) = 4pTi(1− pTi
)ϕTjk .
(Wright 1921, 1951; Malécot 1948; Jacquard 1970).
We developed a new kinship estimation framework that works forarbitrary population structures!
Kinship model for genotypes
symbol meaningT ref ancestral populationi locus index
j , k individual indexespTi ref allele frequencyxij genotype (num ref alleles)ϕTjk kinship of j , k
f Tj inbreeding of j
Statistical model:
E[xij |T ] = 2pTi ,
Var(xij |T ) = 2pTi(1− pTi
)(1 + f Tj ),
Cov(xij , xik |T ) = 4pTi(1− pTi
)ϕTjk .
(Wright 1921, 1951; Malécot 1948; Jacquard 1970).
We developed a new kinship estimation framework that works forarbitrary population structures!
Wright’s FST
Total inbreeding: FIT =1|S |∑j∈S
f Tj ,
Local inbreeding: FIS =1|S |∑j∈S
f Sj ,
Structural inbreeding: FST =FIT − FIS
1− FIS.
The generalized FST
Need new “local” subpopulations Lj (separates total from local inbreeding):(1− f Tj
)=(1− f
Ljj
)(1− f TLj
).
Generalized FST: applicable to arbitrary population structures, equals previousdefinition for non-overlapping subpopulations:
FST =n∑
j=1
wj fTLj.
Mean heterozygosity in a structured population:
H̄i = 2pTi(1− pTi
)(1− FST) .
The generalized FST
Need new “local” subpopulations Lj (separates total from local inbreeding):(1− f Tj
)=(1− f
Ljj
)(1− f TLj
).
Generalized FST: applicable to arbitrary population structures, equals previousdefinition for non-overlapping subpopulations:
FST =n∑
j=1
wj fTLj.
Mean heterozygosity in a structured population:
H̄i = 2pTi(1− pTi
)(1− FST) .
The generalized FST
Need new “local” subpopulations Lj (separates total from local inbreeding):(1− f Tj
)=(1− f
Ljj
)(1− f TLj
).
Generalized FST: applicable to arbitrary population structures, equals previousdefinition for non-overlapping subpopulations:
FST =n∑
j=1
wj fTLj.
Mean heterozygosity in a structured population:
H̄i = 2pTi(1− pTi
)(1− FST) .
Our admixture simulation
Intermediate subpop. diff.
FS
T
01
A
0.0
0.2
Intermediate subpop. spread
dens
ity
B
Admixture proportions
ance
stry
01
C
Discrete subpop. approx.
ance
stry
01
D
position in 1D geography
Comparison of population structures in simulation
T
S1
f TS1
S2
f TS2
...SK
f TSK
Indep. Subpops.T
S1 S2 ... SK
A1 A2 ... An
Admixture
00.
10.
2
Kin
ship
Indi
vidu
als
Bias in standard kinship estimator
Standard kinship estimator:
ϕ̂T ,stdjk =
m∑i=1
(xij − 2p̂Ti
) (xik − 2p̂Ti
)4
m∑i=1
p̂Ti(1− p̂Ti
) , p̂Ti =12
n∑j=1
wjxij .
Estimator has a distorted bias (varies for every pair of individuals j , k):
ϕ̂T ,stdjk
a.s.−−−→m→∞
ϕTjk − ϕ̄T
j − ϕ̄Tk + ϕ̄T
1− ϕ̄T.
Bias in standard kinship estimator
Standard kinship estimator:
ϕ̂T ,stdjk =
m∑i=1
(xij − 2p̂Ti
) (xik − 2p̂Ti
)4
m∑i=1
p̂Ti(1− p̂Ti
) , p̂Ti =12
n∑j=1
wjxij .
Estimator has a distorted bias (varies for every pair of individuals j , k):
ϕ̂T ,stdjk
a.s.−−−→m→∞
ϕTjk − ϕ̄T
j − ϕ̄Tk + ϕ̄T
1− ϕ̄T.
New estimator: two stepsStep 1: “pre-adjusted” kinship estimator with uniform bias.
ϕ̂T ,preadjjk =
m∑i=1
(xij − 1)(xik − 1)− 1
4m∑i=1
p̂Ti(1− p̂Ti
) + 1 a.s.−−−→m→∞
ϕTjk − ϕ̄T
1− ϕ̄T,
Step 2: Estimate minimum kinship, use to unbias “step 1” estimates.
ϕ̂T ,preadjmin
a.s.−−−→m→∞
− ϕ̄T
1− ϕ̄T, ϕ̂T ,new
jk =ϕ̂T ,preadjjk − ϕ̂T ,preadj
min
1− ϕ̂T ,preadjmin
a.s.−−−→m→∞
ϕTjk .
f̂ T ,newj = 2ϕ̂T ,new
jj − 1 a.s.−−−→m→∞
f Tj , F̂ newST =
n∑j=1
wj f̂T ,newj
a.s.−−−→m→∞
FST
New estimator: two stepsStep 1: “pre-adjusted” kinship estimator with uniform bias.
ϕ̂T ,preadjjk =
m∑i=1
(xij − 1)(xik − 1)− 1
4m∑i=1
p̂Ti(1− p̂Ti
) + 1 a.s.−−−→m→∞
ϕTjk − ϕ̄T
1− ϕ̄T,
Step 2: Estimate minimum kinship, use to unbias “step 1” estimates.
ϕ̂T ,preadjmin
a.s.−−−→m→∞
− ϕ̄T
1− ϕ̄T, ϕ̂T ,new
jk =ϕ̂T ,preadjjk − ϕ̂T ,preadj
min
1− ϕ̂T ,preadjmin
a.s.−−−→m→∞
ϕTjk .
f̂ T ,newj = 2ϕ̂T ,new
jj − 1 a.s.−−−→m→∞
f Tj , F̂ newST =
n∑j=1
wj f̂T ,newj
a.s.−−−→m→∞
FST
New estimator: two stepsStep 1: “pre-adjusted” kinship estimator with uniform bias.
ϕ̂T ,preadjjk =
m∑i=1
(xij − 1)(xik − 1)− 1
4m∑i=1
p̂Ti(1− p̂Ti
) + 1 a.s.−−−→m→∞
ϕTjk − ϕ̄T
1− ϕ̄T,
Step 2: Estimate minimum kinship, use to unbias “step 1” estimates.
ϕ̂T ,preadjmin
a.s.−−−→m→∞
− ϕ̄T
1− ϕ̄T, ϕ̂T ,new
jk =ϕ̂T ,preadjjk − ϕ̂T ,preadj
min
1− ϕ̂T ,preadjmin
a.s.−−−→m→∞
ϕTjk .
f̂ T ,newj = 2ϕ̂T ,new
jj − 1 a.s.−−−→m→∞
f Tj , F̂ newST =
n∑j=1
wj f̂T ,newj
a.s.−−−→m→∞
FST
Performance of proposed estimator
A) Truth B) Proposed C) Existing
−0.1
00.
10.
20.
3
Kin
ship
Indi
vidu
als
Bias in FST estimators for independent subpopulationsPrevious estimator for n subpopulations, simplified for known AFs (πij):
F̂ indepST =
m∑i=1
σ̂2i
m∑i=1
p̂Ti(1− p̂Ti
)+ 1
nσ̂2i
,
p̂Ti =1n
n∑j=1
πij , σ̂2i =
1n − 1
n∑j=1
(πij − p̂Ti
)2.
Estimator is biased in dependent subpopulations:
F̂ indepST
a.s.−−−→m→∞
FST − 1n−1
(nθ̄T − FST
)1− 1
n−1
(nθ̄T − FST
) .
Bias in FST estimators for independent subpopulationsPrevious estimator for n subpopulations, simplified for known AFs (πij):
F̂ indepST =
m∑i=1
σ̂2i
m∑i=1
p̂Ti(1− p̂Ti
)+ 1
nσ̂2i
,
p̂Ti =1n
n∑j=1
πij , σ̂2i =
1n − 1
n∑j=1
(πij − p̂Ti
)2.
Estimator is biased in dependent subpopulations:
F̂ indepST
a.s.−−−→m→∞
FST − 1n−1
(nθ̄T − FST
)1− 1
n−1
(nθ̄T − FST
) .
Bias estimating the generalized FST
0.00
0.05
0.10
Ind. Subpops.A
Wei
r−C
ocke
rham
Hud
sonK
Bay
eSca
n
Sta
ndar
dK
insh
ip
New
− − −−
− AdmixtureB
Wei
r−C
ocke
rham
Hud
sonK
Bay
eSca
n
Sta
ndar
dK
insh
ip
New
− −−
−
−
−
True FST
FSTindep limit
Estimates−
FS
T e
stim
ate
FST estimator
Ju_h
oan_
Nor
thK
hom
ani
Mbu
tiB
iaka
Ban
tuS
AB
antu
Ken
yaLu
hya
Yoru
baE
san
Man
denk
aG
ambi
anM
ende
Luo
Had
za AA
Kik
uyu
Mas
aiD
atog
Som
ali
Jew
_Eth
iopi
anS
ahar
awi
Mor
occa
nM
ozab
iteA
lger
ian
Tuni
sian
Liby
anE
gypt
ian
Jew
_Lib
yan
Jew
_Tun
isia
nJe
w_M
oroc
can
Jew
_Yem
enite
Yem
eni
Sau
diB
edou
inB
Bed
ouin
AP
ales
tinia
nJo
rdan
ian
Syr
ian
Leba
nese
_Mus
limLe
bane
se_C
hris
tian
Jew
_Tur
kish
Ass
yria
nD
ruze
Leba
nese
Jew
_ira
qiIr
ania
n_B
anda
riIr
ania
nJe
w_I
rani
anTu
rkis
hM
alte
seC
anar
y_Is
land
erIta
lian_
Sou
thS
icili
anF
renc
hNS
pani
shS
WS
pani
shN
EF
renc
hSIta
lian_
Nor
thS
ardi
nian
Bas
que
Sco
ttish
Orc
adia
nE
nglis
hIc
elan
dic
Nor
weg
ian
Cze
chC
roat
ian
Hun
garia
nA
lban
ian
Gre
ekB
ulga
rian
Lith
uani
anB
elar
usia
nU
krai
nian
Est
onia
nR
ussi
anF
inni
shM
ordo
vian
Rom
ania
nJe
w_A
shke
nazi
Cyp
riot
Arm
enia
nJe
w_G
eorg
ian
Geo
rgia
nA
bkha
sian
Ady
gei
Nor
th_O
sset
ian
Che
chen
Lezg
inK
umyk
Bal
kar
Nog
aiTa
jikK
alas
hP
atha
nB
aloc
hiB
rahu
iM
akra
niS
indh
iB
urus
hoG
ujar
ati
Pun
jabi
Ben
gali
Jew
_Coc
hin
Turk
men
Uzb
ekH
azar
aU
ygur
Chu
vash
Man
siS
elku
pTu
bala
rK
yrgy
zA
ltaia
nTu
vini
anN
gana
san
Dol
gan
Yaku
tX
ibo
Hez
hen
Oro
qen
Ulc
hiE
ven
Yuka
gir
Itelm
enK
orya
kC
hukc
hiE
skim
oA
leut
Ale
ut_T
lingi
tK
usun
daC
ambo
dian
Tha
iA
mi
Ata
yal
Kin
hD
aiLa
huN
axi
Yi
Tujia
Han
Mia
oS
he TuM
ongo
laK
alm
ykD
aur
Japa
nese
Kor
ean
Pim
aM
ixe
Mix
tec
Zap
otec
May
anP
iapo
coS
urui
Kar
itian
aB
oliv
ian
Que
chua
Aus
tral
ian
Nas
oiP
apua
n
Ju_hoan_NorthKhomani
MbutiBiaka
BantuSABantuKenya
LuhyaYoruba
EsanMandenka
GambianMende
LuoHadza
AAKikuyuMasaiDatog
SomaliJew_Ethiopian
SaharawiMoroccanMozabiteAlgerianTunisian
LibyanEgyptian
Jew_LibyanJew_Tunisian
Jew_MoroccanJew_Yemenite
YemeniSaudi
BedouinBBedouinA
PalestinianJordanian
SyrianLebanese_Muslim
Lebanese_ChristianJew_Turkish
AssyrianDruze
LebaneseJew_iraqi
Iranian_BandariIranian
Jew_IranianTurkishMaltese
Canary_IslanderItalian_South
SicilianFrenchN
SpanishSWSpanishNE
FrenchSItalian_North
SardinianBasque
ScottishOrcadian
EnglishIcelandic
NorwegianCzech
CroatianHungarian
AlbanianGreek
BulgarianLithuanianBelarusian
UkrainianEstonianRussianFinnish
MordovianRomanian
Jew_AshkenaziCypriot
ArmenianJew_Georgian
GeorgianAbkhasian
AdygeiNorth_Ossetian
ChechenLezginKumykBalkarNogai
TajikKalashPathan
BalochiBrahui
MakraniSindhi
BurushoGujaratiPunjabiBengali
Jew_CochinTurkmen
UzbekHazaraUygur
ChuvashMansi
SelkupTubalarKyrgyzAltaian
TuvinianNganasan
DolganYakutXibo
HezhenOroqen
UlchiEven
YukagirItelmenKoryak
ChukchiEskimo
AleutAleut_Tlingit
KusundaCambodian
ThaiAmi
AtayalKinh
DaiLahuNaxi
YiTujiaHan
MiaoShe
TuMongola
KalmykDaur
JapaneseKorean
PimaMixe
MixtecZapotec
MayanPiapoco
SuruiKaritianaBolivian
QuechuaAustralian
NasoiPapuan
SAfrica NAfrica MiddleEast Europe Caucasus SAsia NAsia EAsia Amr Oc
SA
fric
aN
Afr
ica
Mid
dleE
ast
Eur
ope
Cau
casu
sS
Asi
aN
Asi
aE
Asi
aA
mr
Oc
00.
10.
20.
3
Kin
ship
Indi
vidu
als
New kinshipestimates
Genotypes from “Human Origins” (Lazaridis
et al. 2014, 2016)
Edited from Ephert [CC BY-SA 3.0], via
Wikimedia Commons
Ju_h
oan_
Nor
thK
hom
ani
Mbu
tiB
iaka
Ban
tuS
AB
antu
Ken
yaLu
hya
Yoru
baE
san
Man
denk
aG
ambi
anM
ende
Luo
Had
za AA
Kik
uyu
Mas
aiD
atog
Som
ali
Jew
_Eth
iopi
anS
ahar
awi
Mor
occa
nM
ozab
iteA
lger
ian
Tuni
sian
Liby
anE
gypt
ian
Jew
_Lib
yan
Jew
_Tun
isia
nJe
w_M
oroc
can
Jew
_Yem
enite
Yem
eni
Sau
diB
edou
inB
Bed
ouin
AP
ales
tinia
nJo
rdan
ian
Syr
ian
Leba
nese
_Mus
limLe
bane
se_C
hris
tian
Jew
_Tur
kish
Ass
yria
nD
ruze
Leba
nese
Jew
_ira
qiIr
ania
n_B
anda
riIr
ania
nJe
w_I
rani
anTu
rkis
hM
alte
seC
anar
y_Is
land
erIta
lian_
Sou
thS
icili
anF
renc
hNS
pani
shS
WS
pani
shN
EF
renc
hSIta
lian_
Nor
thS
ardi
nian
Bas
que
Sco
ttish
Orc
adia
nE
nglis
hIc
elan
dic
Nor
weg
ian
Cze
chC
roat
ian
Hun
garia
nA
lban
ian
Gre
ekB
ulga
rian
Lith
uani
anB
elar
usia
nU
krai
nian
Est
onia
nR
ussi
anF
inni
shM
ordo
vian
Rom
ania
nJe
w_A
shke
nazi
Cyp
riot
Arm
enia
nJe
w_G
eorg
ian
Geo
rgia
nA
bkha
sian
Ady
gei
Nor
th_O
sset
ian
Che
chen
Lezg
inK
umyk
Bal
kar
Nog
aiTa
jikK
alas
hP
atha
nB
aloc
hiB
rahu
iM
akra
niS
indh
iB
urus
hoG
ujar
ati
Pun
jabi
Ben
gali
Jew
_Coc
hin
Turk
men
Uzb
ekH
azar
aU
ygur
Chu
vash
Man
siS
elku
pTu
bala
rK
yrgy
zA
ltaia
nTu
vini
anN
gana
san
Dol
gan
Yaku
tX
ibo
Hez
hen
Oro
qen
Ulc
hiE
ven
Yuka
gir
Itelm
enK
orya
kC
hukc
hiE
skim
oA
leut
Ale
ut_T
lingi
tK
usun
daC
ambo
dian
Tha
iA
mi
Ata
yal
Kin
hD
aiLa
huN
axi
Yi
Tujia
Han
Mia
oS
he TuM
ongo
laK
alm
ykD
aur
Japa
nese
Kor
ean
Pim
aM
ixe
Mix
tec
Zap
otec
May
anP
iapo
coS
urui
Kar
itian
aB
oliv
ian
Que
chua
Aus
tral
ian
Nas
oiP
apua
n
Ju_hoan_NorthKhomani
MbutiBiaka
BantuSABantuKenya
LuhyaYoruba
EsanMandenka
GambianMende
LuoHadza
AAKikuyuMasaiDatog
SomaliJew_Ethiopian
SaharawiMoroccanMozabiteAlgerianTunisian
LibyanEgyptian
Jew_LibyanJew_Tunisian
Jew_MoroccanJew_Yemenite
YemeniSaudi
BedouinBBedouinA
PalestinianJordanian
SyrianLebanese_Muslim
Lebanese_ChristianJew_Turkish
AssyrianDruze
LebaneseJew_iraqi
Iranian_BandariIranian
Jew_IranianTurkishMaltese
Canary_IslanderItalian_South
SicilianFrenchN
SpanishSWSpanishNE
FrenchSItalian_North
SardinianBasque
ScottishOrcadian
EnglishIcelandic
NorwegianCzech
CroatianHungarian
AlbanianGreek
BulgarianLithuanianBelarusian
UkrainianEstonianRussianFinnish
MordovianRomanian
Jew_AshkenaziCypriot
ArmenianJew_Georgian
GeorgianAbkhasian
AdygeiNorth_Ossetian
ChechenLezginKumykBalkarNogai
TajikKalashPathan
BalochiBrahui
MakraniSindhi
BurushoGujaratiPunjabiBengali
Jew_CochinTurkmen
UzbekHazaraUygur
ChuvashMansi
SelkupTubalarKyrgyzAltaian
TuvinianNganasan
DolganYakutXibo
HezhenOroqen
UlchiEven
YukagirItelmenKoryak
ChukchiEskimo
AleutAleut_Tlingit
KusundaCambodian
ThaiAmi
AtayalKinh
DaiLahuNaxi
YiTujiaHan
MiaoShe
TuMongola
KalmykDaur
JapaneseKorean
PimaMixe
MixtecZapotec
MayanPiapoco
SuruiKaritianaBolivian
QuechuaAustralian
NasoiPapuan
SAfrica NAfrica MiddleEast Europe Caucasus SAsia NAsia EAsia Amr Oc
SA
fric
aN
Afr
ica
Mid
dleE
ast
Eur
ope
Cau
casu
sS
Asi
aN
Asi
aE
Asi
aA
mr
Oc
−0.0
50
0.05
0.1
0.15
Kin
ship
Indi
vidu
als
Standard kinshipestimates
Genotypes from “Human Origins” (Lazaridis
et al. 2014, 2016)
Edited from Ephert [CC BY-SA 3.0], via
Wikimedia Commons
Population-level inbreeding
●●
●●
●●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●● ●●
●●
●
●●
●
●
●●
●●
●
●●
●●
●
●
●
●●
●●
●●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
● ●
●
●
●
●●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●●
●●
●● ●
●● ●
●
●
●
●
●
●
0.2
0.3
0.4
Inbr
eedi
ng c
oeff.
Differentiation (FST) previously underestimated
0.0 0.1 0.2 0.3 0.4 0.5
020
40
Inbreeding Coefficient
Den
sity
Subpopulations
SAfricaNAfricaMiddleEastEuropeCaucasusSAsiaNAsiaEAsiaAmrOc
FST estimates
WCBayeScanHudsonKNew
00.
10.
20.
30.
4
Kin
ship
Indi
vidu
als
0.0
0.5
1.0
Individuals
Anc
estr
y fr
ac.
x
PU
RC
LMP
EL
MX
L
Pop
ulat
ion
x
AF
RA
MR
EU
R
Anc
estr
y
Kinship driven byadmixture in Hispanics
New kinship estimates
Genotypes from the 1000 Genomes Project (2013)
Improved relatedness has repercussions across genetics!
Easy to measureroutinely
Search fordisease-causinggenetic variants
Heritability ofcomplex traits
Animal and plantbreeding
Acknowledgments
John D. StoreyAndrew BassIrineo CabrerosWei HaoRiley Skeen-Gaar
Neo Christopher ChungUniversity of Warsaw
Funding:National Institutes of HealthOtsuka Pharmaceuticals
Lewis-Sigler Institute for Integrative Genomics