Presentation, UCLA, April 8, 2009
Nuragic Sardinians are still among us, and the Etruscans too. Two genealogical studies
Guido Barbujani
Dip. Biologia ed Evoluzione Università di [email protected]
UCLA, April 8, 2009
1. The Etruscans do not resemble most modern Tuscans
A bit of history
Etruscan: a non-Indo-European language
Documented from the end of the VIII century BC
Etruscan cities were independent states
Common culture and language, but never a political unit
Maximum territorial expansion: VI century BC
Military defeats, Roman assimilation in the II century BC
Dionysius of Halicarnassus: the Etruscans were an Italic population
Herodotus: the Etruscans were seamen from Lydia, escaping famine
[Map: sampled necropoleis, labelled V, A, S, PM, T, C, with Rome for reference]
Adria (17, 5), Volterra (6, 3), Castelfranco di Sotto (2, 1), Castelluccio di Pienza (1, 1), Magliano and Marsiliana (25, 6), Tarquinia (18, 5), Capua (8, 6)
80 bone samples from 8 Etruscan necropoleis
Etruscans: 27 individuals, 22 different haplotypes, h=0.946
Tuscans: 49 individuals from Francalacci et al. (1996)
Shared sequences between the Etruscans and modern populations
[Figure: map with the number of shared sequences for each modern population]
Genetic distances (Fst x 1000) between the Etruscans and modern populations
[Figure: map with the Fst value for each modern population]
2. Testing hypotheses by serial coalescent simulation
Reconstructing (proceeding backwards in time) the maternal genealogy of a sample
Two possibilities: either each individual has a different mom,
or two individuals have the same mom (coalescence)
Coalescence probability is a function of population size N and sample size n
[Figure: genealogy traced from present to past; N = 10, N constant]
Genealogies, MRCA
[Figure: genealogy reaching the MRCA; N = 10, N constant, n = 6, 9 generations]
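The backward-in-time reasoning on these slides can be sketched in a few lines. This is a minimal illustration, assuming a haploid Wright-Fisher population of constant size N in which every sampled lineage picks its mother uniformly at random. The function names are hypothetical, n = 6 is an example value, and this is not the Serial SimCoal implementation.

```python
import random

def prob_no_coalescence(N, n):
    """Probability that n lineages all have different mothers one generation back."""
    p = 1.0
    for i in range(1, n):
        p *= (N - i) / N
    return p

def generations_to_mrca(N, n, rng):
    """Trace n lineages backwards until one ancestor (the MRCA) remains."""
    lineages = n
    t = 0
    while lineages > 1:
        # Each lineage picks a mother among N; lineages sharing a mother coalesce.
        mothers = {rng.randrange(N) for _ in range(lineages)}
        lineages = len(mothers)
        t += 1
    return t

rng = random.Random(1)
# Assumed example values: N = 10 females, n = 6 sampled individuals.
print(prob_no_coalescence(10, 6))
times = [generations_to_mrca(10, 6, rng) for _ in range(2000)]
print(sum(times) / len(times))  # average number of generations back to the MRCA
```

With a larger N the probability of no coalescence per generation rises, so genealogies get deeper; with a larger n it falls, so the first coalescences happen quickly.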
Mutation (sequences are arbitrary, their differences are not)
[Figure: genealogy with mutations placed on its branches, producing the binary sequences 00000, 00001, 00010, 00101, 10101, 01010]
The Model: Serial SimCoal
Serial coalescence
[Figure: population of size N=20 with a modern sample (n=5) and an ancient sample (n=2); time axis in generations, from 0 to 100]
Anderson C.N.K., Ramakrishnan U., Chan Y.L. and Hadly E.A. (2005) Bioinformatics
INPUT
• Population size
• Population genealogy
• Population growth rate
• Migration matrix
• Mutation model and rate
• Sample sizes and ages
OUTPUT
• N haplotypes
• Haplotype diversity
• Nucleotide diversity
• Mismatch distribution
• Haplotype sharing
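Several of the output statistics listed above can be computed directly from aligned sequences. The sketch below uses standard definitions for haplotype diversity and mean pairwise difference; the definition of haplotype sharing (fraction of distinct haplotypes of one sample also found in the other) is an assumption, and the toy binary sequences and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))

def mean_pairwise_difference(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)

def nucleotide_diversity(seqs):
    """Mean pairwise difference per site."""
    return mean_pairwise_difference(seqs) / len(seqs[0])

def haplotype_sharing(sample_a, sample_b):
    """Assumed definition: share of distinct haplotypes in sample_a also present in sample_b."""
    a, b = set(sample_a), set(sample_b)
    return len(a & b) / len(a)

# Toy data, not real mtDNA sequences.
ancient = ["00000", "00001"]
modern = ["00000", "00010", "00101", "10101", "01010"]
print(len(set(modern)), haplotype_diversity(modern))
print(mean_pairwise_difference(modern), nucleotide_diversity(modern))
print(haplotype_sharing(ancient, modern))
```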
Observed population statistics
                        Etruscans   Tuscans   Murlo
Sample size             27          49        86
Haplotype n             22          40        60
Haplotype diversity     0.946       0.949     0.960
Nucleotide diversity    0.011       0.014     0.012
Avg. mismatch           3.91        5.03      4.50
Haplotype sharing       -           0.09      0.14
Fst                     -           0.024     0.028
Consistency criterion: overlap between the 95% confidence intervals of observed and simulated statistics
[Figure: distribution of simulated values, with the median observed value marked]
The posterior probability (two-tailed) of a simulated statistic is represented by the gray area in the graph
Two ways to combine the results: 1. estimate a joint posterior probability for all statistics; 2. count the number of statistics with P<0.05.
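The second way of combining results can be sketched as follows. This assumes the two-tailed probability is taken empirically, as twice the smaller tail of the simulated distribution relative to the observed value; that definition, the function names and the toy numbers are assumptions for illustration. The joint probability of the first approach is not shown.

```python
def two_tailed_p(observed, simulated):
    """Empirical two-tailed probability of the observed value under the simulations."""
    n = len(simulated)
    lower = sum(s <= observed for s in simulated) / n
    upper = sum(s >= observed for s in simulated) / n
    return min(1.0, 2 * min(lower, upper))

def count_rejected(observed_stats, simulated_stats, alpha=0.05):
    """Number of statistics whose observed value is improbable under the model."""
    return sum(
        two_tailed_p(observed_stats[name], simulated_stats[name]) < alpha
        for name in observed_stats
    )

# Toy simulated distribution: the integers 1..100 stand in for simulated statistics.
simulated = list(range(1, 101))
observed = {"stat_a": 1, "stat_b": 50}
sims = {"stat_a": simulated, "stat_b": simulated}
print(two_tailed_p(1, simulated), two_tailed_p(50, simulated))
print(count_rejected(observed, sims))
```

A model would then be judged by how many statistics fall in the tails: the fewer, the more consistent the model is with the data.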
Simulation parameters
• Population sizes: Etruscans: 292,000 / 12 ≈ 25,000; Tuscans: 3,500,000 / 12 ≈ 300,000
• Growth rate: $N_t = N_0 e^{rt}$, $r = \frac{1}{100} \ln \frac{300{,}000}{25{,}000} \approx 0.025$
• Mutation rate: 1 mutation per million years per nucleotide; with 360 nucleotides and 25 years per generation, $10^{-6} \times 360 \times 25 / 2 = 0.0045$ per sequence per generation
• 360 nucleotides
• Transition bias: 0.94
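The arithmetic behind these parameters can be checked with a short script. Two readings are assumptions made here: that the census sizes are divided by 12 to obtain the simulated sizes, and that the value 0.0045 is half of the per-sequence, per-generation product because the stated rate is treated as a divergence rate. Variable names are hypothetical.

```python
import math

# Census sizes and the assumed divisor of 12.
print(round(292_000 / 12), round(3_500_000 / 12))  # close to 25,000 and 300,000

# Exponential growth N_t = N_0 * exp(r * t) over 100 generations.
n_etruscans = 25_000
n_tuscans = 300_000
generations = 100
r = math.log(n_tuscans / n_etruscans) / generations
print(round(r, 4))

def size_at(n0, rate, t):
    """Population size after t generations of exponential growth."""
    return n0 * math.exp(rate * t)

print(round(size_at(n_etruscans, r, generations)))

# Mutation rate per sequence per generation.
rate_per_site_per_year = 1e-6
sites = 360
years_per_generation = 25
per_sequence_per_generation = rate_per_site_per_year * sites * years_per_generation
mu = per_sequence_per_generation / 2  # assumption: stated rate is a divergence rate
print(per_sequence_per_generation, mu)
```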
Etruscans and Tuscans a single population?
Model 1: Small population, constant size
[Figure: Tuscans at generation 0 (Nf=25,000), Etruscans at generation 100 (Nf=25,000), r = 0]
• Allele sharing: 4.2% (1.4-8.1) OK
• Hapl. diversity:
- Etruscans: > Obs.
- Tuscans: > Obs.
Etruscans and Tuscans a single population?
Model 3: Expanding population
[Figure: Tuscans at generation 0 (Nf=300,000), Etruscans at generation 100 (Nf=25,000), r = -0.025]
• Allele sharing: 5.0% (1.3-9.1) OK
• Hapl. diversity:
- Etruscans: > Obs.
- Tuscans: > Obs.
Only models in which modern Tuscans and Etruscans belong to distinct genealogies are consistent with the data (2<31)
Interpretations, doubts
• Unless mutation rate is much higher than currently believed, the Etruscans left very few modern mitochondrial descendants in Tuscany (Belle et al. 2006)
• Did they all go extinct?
• Was the sample studied only representative of a social elite?
• Did massive immigration dilute a component of Etruscan origin in the Tuscans’ mtDNA gene pool?
Postmortem DNA modifications and/or technical problems affected the Etruscan mtDNA sequences (Achilli et al. 2007)
The similarity between the modern Tuscans and the Near East/Turkey suggests that the Etruscans came from there (Achilli et al. 2007)
No evidence of sequence errors in the Etruscan dataset
61 tooth samples from medieval Tuscany (Guimaraes et al., submitted)
Joint analysis of:
• 11 Etruscan sequences
• 27 Medieval sequences (900-1300 A.D.), from 6 cemeteries
• 322 (Achilli et al.) and 49 (Francalacci et al.) modern Tuscan sequences from Murlo, Volterra, Casentino
Only the model in which medieval Tuscans and Etruscans belong to the same genealogy and modern Tuscans don’t is consistent with the data (Guimaraes et al., submitted)
[Figure: seven candidate genealogies, Models 1 to 7, relating the samples labelled C, E and M]
3. Or maybe the Etruscans are still among us, hiding somewhere?
Excoffier et al. (2005) Genetics 169:1727-1738
Estimating parameters and comparing models by ABC (Approximate Bayesian Computations)
Parameters: priors and posterior distributions
Parameters Priors
Ne Modern Tuscans 50 000 – 500 000 | 10 000 – 70 000
μ 0.0003 – 0.0075
T estimated (bottleneck) 101 – 1500
Ne Generation 26 100 – 10 000
Ne Generation 27 10 000 – 100 000
Ne at split 100 – 2000
Ne Medieval Tuscans 10 000 – 50 000
Ne Etruscans 4000 – 21 000
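Model comparison by ABC with straightforward rejection can be sketched as below. This is a toy illustration, not the analysis in the slides: `simulate_stat` is a hypothetical stand-in for a coalescent simulator, the two model names and their distributions are invented, and a single summary statistic replaces the full set. The logistic regression variant is not shown.

```python
import random

def simulate_stat(model, rng):
    """Hypothetical simulator: returns one summary statistic for the given model."""
    if model == "continuity":
        return rng.gauss(0.40, 0.10)  # e.g. high haplotype sharing
    return rng.gauss(0.10, 0.10)      # e.g. low haplotype sharing

def abc_model_choice(observed, models, n_sims, n_keep, rng):
    """Keep the n_keep simulations closest to the observed value; the share of
    each model among them approximates its posterior probability."""
    sims = []
    for _ in range(n_sims):
        model = rng.choice(models)  # uniform prior over models
        stat = simulate_stat(model, rng)
        sims.append((abs(stat - observed), model))
    sims.sort(key=lambda pair: pair[0])
    kept = [model for _, model in sims[:n_keep]]
    return {model: kept.count(model) / n_keep for model in models}

rng = random.Random(42)
posterior = abc_model_choice(0.40, ["continuity", "replacement"], 20000, 100, rng)
print(posterior)
```

In a full analysis each simulation would also draw its demographic parameters from the priors listed above, and the distance would be computed over several standardized statistics.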
Method  Thresh.  Mod 1  Mod 2  Mod 3
LR      50000    0.972  0.028  0.000
SR      100      0.990  0.010  0.000

Method  Thresh.  Mod 1  Mod 2  Mod 3
LR      50000    0.000  1      0.000
SR      100      0      1      0

Method  Thresh.  Mod 1  Mod 2  Mod 3
LR      50000    0.000  1      0.000
SR      100      0      1      0

Casentino, Murlo, Volterra
SR = Straightforward rejection; LR = Logistic regression
[Figure: genealogies for Mod 1, Mod 2 and Mod 3 relating the samples labelled C, E and M, with generations 26 and 27 and parameters a1, a2 marked]
Parameters: priors and posterior distributions
Parameters Priors
Ne Modern Tuscans 20 000 – 200 000
Ne Modern Tuscans from Casentino valley 10 000 – 70 000
μ 0.0003 – 0.0075
T estimated (bottleneck) 101 – 1500
Ne Generation 26 100 – 10 000
Ne Generation 27 10 000 – 100 000
Ne at split 100 – 2000
Ne Medieval Tuscans 10 000 – 50 000
SR = Straightforward rejection; LR = Logistic regression
Method  Thresh.  A      B      C      D
LR      50000    0.000  0.003  0.912  0.056
LR      100      0      0      1      0
SR      100      0.023  0.011  0.966  0.000
[Figure: four genealogies, A to D, relating the Casentino (Ca), Murlo (Mu) and Volterra (Vo) samples to the samples labelled E and M, with generations 26 and 27 and parameters a1, a2 marked]
4. Nuragic Sardinians resemble some, but not all, modern Sardinians
A genetic map of Europe (Menozzi, Piazza, Cavalli-Sforza 1978)
53 tooth samples from 6 nuragic sites
Elimination of samples that do not comply with the strictest quality standards
10 different sequences in 23 nuragic individuals
h, haplotype diversity = 0.83
Etruscans: 0.95
Tuscans: 0.96
Basques: 0.96
Greeks: 0.98
Sicilians: 0.96
Ogliastra: 0.78
Gallura: 0.93
Shared sequences among Nuragic people and other modern and ancient populations
North Africa: 13.9 4
Near East: 10.7 6
Europe: 18.3 8
Iberians: 29.4 2
Etruscans: 22.2 4
Gallura: 18.5 1
Ogliastra: 54.6 4
[Figure: Models 1 to 3 relate Ogliastra and Gallura; Models 4 to 6 add Latium; generations 0 and 126 are marked on the time axis]
Six models describing the genealogical relationships among Nuragic people and modern Sardinians
Parameters: priors and posterior distributions
Parameters Priors
Ne Ogliastra 500 – 20 000
Ne Gallura 1000 – 40 000
Ne split 100 – 6 000
Ne Ogliastra, Gallura at split 100 – 6 000
Ne Latium 400 000
Migration rate from Latium 0 – 0.01
T split (Ogliastra vs. Gallura) 127 – 1000 [1, 2, 4, 5] or 1- 125 [3, 6]
T split (Sardinia vs. Latium) 1000
μ 0.06 - 1.3 per million years per site
Observed summary statistics describing genetic variation in the Sardinia study
                            Bronze Age   Ogliastra   Gallura   Latium
Haplotype number            10           26          21        36
N of segregating sites      10           22          31        45
Mean pairwise difference    1.39         2.49        4.42      4.07
Haplotype diversity         0.83         0.79        0.97      0.95
Tajima’s D                  -1.64        -0.97       -1.66     -2.02
Fst                         0.0218
Haplotype sharing:
  Ogliastra / Bronze Age = 0.400
  Gallura / Bronze Age = 0.100
  Ogliastra / Gallura = 0.095
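The Tajima's D values in this table follow from the sample size, the number of segregating sites and the mean pairwise difference. The sketch below uses the standard formula of Tajima (1989); the function name is hypothetical, and n = 23 for the Bronze Age sample is taken from the "23 nuragic individuals" stated earlier in the slides.

```python
import math

def tajimas_d(n, s, pi):
    """Tajima's D from sample size n, segregating sites s, mean pairwise difference pi."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two estimators of diversity, scaled by its standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

# Bronze Age sample: n = 23, 10 segregating sites, mean pairwise difference 1.39.
print(round(tajimas_d(23, 10, 1.39), 2))  # -1.64
```

A negative D means there are more segregating sites than the mean pairwise difference would predict under constant size and neutrality, a pattern often associated with population expansion.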
Posterior probabilities of the models, with and without immigration (best 50 000 simulations)
[Figure: Models 1 to 6 with their posterior probabilities: 0.983, 0.002, 0.015 and 0.813, 0.081, 0.106]
Model 4 beats Model 1 >77% of times
What happened in Italy between the Bronze-Age and now?
Many things.
Major demographic changes in the last few centuries documented by mtDNA in the Netherlands (Manni et al. 2002), in the British Isles (Töpf et al. 2007) and in Iceland (Helgason et al. 2008), but not in the Iberian peninsula (Sampietro et al. 2005).
Relatively recent immigration may have deeply changed the genetic structure of the population in part of Tuscany and in Gallura, but not in Casentino and Ogliastra.
In studies of admixture, genealogical continuity between past and present is no longer an inevitable assumption, but rather a testable hypothesis (only at the mtDNA level, at present).
Many thanks to
David Caramelli
Giorgio Bertorelle
Andrea Benazzo, Silvia Ghirotto
Loredana Castrì
Elise Belle
Enza Colonna
Stefano Mona
Silvia Guimaraes
Erica Fumagalli