

Page 1: Languages and genes: recent work and emerging results Aussois: 22-25 September 2005 The formation of East Asian Language families: a partial scenario

Languages and genes: recent work and emerging results

Aussois: 22-25 September 2005

The formation of East Asian language families: a partial scenario

L. Sagart (1), with the collaboration of Alicia Sanchez-Mazas (2), Estella 'Sim' Poloni (2) and Barbara Arredi (2,3)

1 CNRS, Paris; 2 Dept. of Anthropology and Ecology, University of Geneva; 3 Dept. of Histology, Microbiology and Medical Biotechnologies, University of Padova

EUROPEAN SCIENCE FOUNDATION EUROCORES (EUROpean Science Foundation COllaborative RESearch) Programme Workshop, organized with the support of the SHS department of CNRS

Page 2

This presentation

► Reflects my ideas on East Asian language history

► Makes crucial use of results obtained within the OHLL project "Languages and Genes in East Asia".

► Project members:
    E. 'Sim' Poloni (co-director), U. of Geneva
    A. Sanchez-Mazas, U. of Geneva
    G. Jacques, U. of Paris 5
    recent collaborator: B. Arredi, U. of Padova; U. of Geneva

Page 3

Main productions of our group:

Sanchez-Mazas, A., E. S. Poloni, G. Jacques and L. Sagart (2005) HLA genetic diversity and linguistic variation in East Asia. In: L. Sagart, R. Blench, A. Sanchez-Mazas (eds): The peopling of East Asia: putting together archaeology, linguistics and genetics, 273-296. London: RoutledgeCurzon.

Poloni, E. S., A. Sanchez-Mazas, G. Jacques, L. Sagart (2005) Comparing linguistic and genetic relationships among East Asian populations: a study of the Rh and GM polymorphisms. In: L. Sagart, R. Blench, A. Sanchez-Mazas (eds): The peopling of East Asia: putting together archaeology, linguistics and genetics, 252-272. London: RoutledgeCurzon.

Page 4

[Figure] MDS of genetic distances among 102 population samples, computed on GM frequency distributions (stress value 0.085)

Source: Poloni, E. S., A. Sanchez-Mazas, G. Jacques, L. Sagart (2005) Comparing linguistic and genetic relationships among East Asian populations: a study of the Rh and GM polymorphisms. In: L. Sagart, R. Blench, A. Sanchez-Mazas (eds): The peopling of East Asia: putting together archaeology, linguistics and genetics, 252-272. London: RoutledgeCurzon.

Clusters labelled on the plot:

Northern Tibeto-Burman (Tibetan)

Northern Mandarin samples

Southern Chinese (southwestern Mandarin and other southern dialects), southern Tibeto-Burman (Bodo-Garo, Kuki-Chin, Kiranti, Loloish, Bai, Tujia samples)

Wu and southwestern Mandarin samples
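The stress value quoted for an MDS plot measures how far the distances in the plot depart from the genetic distances they are meant to represent (0 means a perfect fit). The sketch below computes Kruskal's stress-1 directly against the raw input distances. This is an assumption for illustration: the slides do not say which MDS variant or stress formula was used, and non-metric MDS would compare against monotonically transformed distances instead. The three populations and their coordinates are hypothetical.

```python
import math
from itertools import combinations


def kruskal_stress(distances, coords):
    """Kruskal stress-1 between target distances and distances in the plot.

    distances: square matrix (list of lists) of target distances.
    coords: one (x, y) tuple per population, in the same order.
    """
    squared_error = 0.0
    squared_fitted = 0.0
    for i, j in combinations(range(len(coords)), 2):
        fitted = math.dist(coords[i], coords[j])  # distance in the plot
        squared_error += (distances[i][j] - fitted) ** 2
        squared_fitted += fitted ** 2
    return math.sqrt(squared_error / squared_fitted)


# Hypothetical genetic distances among three populations A, B, C.
toy_distances = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]

perfect_plot = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]    # reproduces every distance
distorted_plot = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]  # C drawn too far away

stress_perfect = kruskal_stress(toy_distances, perfect_plot)
stress_distorted = kruskal_stress(toy_distances, distorted_plot)
```

On this reading, lower values indicate a more faithful two-dimensional picture of the distance matrix.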

Page 5

A genetic boundary across Sino-Tibetan

SAMOVA analysis of GM data

► SAMOVA: Dupanloup, I., Schneider, S., Excoffier, L. (2002) A simulated annealing approach to define the genetic structure of populations. Molecular Ecology 11(12):2571-81

► GM data

► 118 East Asian populations

Page 6

[Figure] GM: SAMOVA on 118 population samples (search for genetic differentiation between geographic groups)

Legend: Altaic, Austronesian, Austro-Asiatic, Hmong-Mien, Japanese-Ainu, Tai-Kadai, Korean, Sino-Tibetan

Thanks to Estella 'Sim' Poloni!

Page 7

[Figure] GM: SAMOVA on 118 population samples (search for genetic differentiation between geographic groups)

Legend: Altaic, Austronesian, Austro-Asiatic, Hmong-Mien, Japanese-Ainu, Tai-Kadai, Korean, Sino-Tibetan

Thanks to Estella 'Sim' Poloni!

Page 8

[Figure] GM: SAMOVA on 118 population samples (search for genetic differentiation between geographic groups)

Annotation on the figure: genetic boundary

Legend: Altaic, Austronesian, Austro-Asiatic, Hmong-Mien, Japanese-Ainu, Tai-Kadai, Korean, Sino-Tibetan

Thanks to Estella 'Sim' Poloni!

Page 9

[Figure] GM: SAMOVA on 118 population samples (search for genetic differentiation between geographic groups)

Annotation on the figure: genetic boundary; separation into 2 groups: FCT = 24.6% (P < 0.001)

Legend: Altaic, Austronesian, Austro-Asiatic, Hmong-Mien, Japanese-Ainu, Tai-Kadai, Korean, Sino-Tibetan

Thanks to Estella 'Sim' Poloni!

Page 10

Boundary is stable

► whether or not Altaic populations are included;

► regardless of number of output groups asked for (2, 3, 4, 5).

Page 11

This boundary

► corresponds closely to the linguistic boundary between N and SW/SE Mandarin

► shown by Zavjalova (1983) to follow the political boundary between the Jin (Jurchen, Altaic-speaking) and southern Song (Chinese) territories in the 12th-13th centuries CE and later (14th century) between the Yuan (Mongolian-speaking) and southern Song.

Page 12

ANOVAs on GM data

► FCT: proportion of the total genetic variation (here GM) that is due to differences between East Asian groups compared 2 by 2.

► 128 East Asian populations

► Linguistically and geographically defined groups as in preceding MDS
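The definition of FCT above can be written out in the usual AMOVA notation. The variance-component symbols below are not on the slides; they follow the standard hierarchical decomposition (among groups, among populations within groups, within populations) and are given here only as a reading aid, including the companion index FSC.

```latex
\begin{align*}
\sigma_T^2 &= \sigma_a^2 + \sigma_b^2 + \sigma_c^2
  && \text{(total = among groups + among populations within groups + within populations)} \\
F_{CT} &= \frac{\sigma_a^2}{\sigma_T^2}
  && \text{(share of total variation due to differences between groups)} \\
F_{SC} &= \frac{\sigma_b^2}{\sigma_b^2 + \sigma_c^2}
  && \text{(differentiation among populations within the same group)}
\end{align*}
```

Under this reading, an FCT of 24.6% means that about a quarter of the total GM variation lies between the two groups being compared.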

Page 13

          NC (19)    SC (22)    WSE (17)   NTB (5)    STB (7)    AA (4)     DA (11)    HM (3)     TW (13)    ALT (14)   JAK (13)
NC (19)   -          0.009***   0.008***   0.005***   0.013***   0.011***   0.012***   0.005***   0.008***   0.033***   0.008***
SC (22)   0.143***   -          0.014***   0.016***   0.025***   0.025***   0.024***   0.018***   0.020***   0.045***   0.013***
WSE (17)  0.022***   0.059***   -          0.012***   0.022***   0.020***   0.020***   0.013***   0.016***   0.042***   0.012***
NTB (5)   0.025***   0.264***   0.086***   -          0.037***   0.045***   0.033***   0.016***   0.024***   0.067***   0.011***
STB (7)   0.125***   0.0002     0.048***   0.239**    -          0.073***   0.052***   0.052***   0.046***   0.074***   0.018***
AA (4)    0.273***   0.053**    0.176***   0.442**    0.057*     -          0.066***   0.085***   0.058***   0.086***   0.017***
DA (11)   0.304***   0.060***   0.203***   0.500***   0.077***   -0.0110    -          0.045***   0.043***   0.072***   0.017***
HM (3)    0.222***   0.024**    0.129**    0.369*     0.0260     -0.0140    0.0070     -          0.033***   0.078***   0.012***
TW (13)   0.267***   0.045***   0.169***   0.452***   0.062***   -0.0003    0.015*     -0.0040    -          0.070***   0.014***
ALT (14)  0.018***   0.220***   0.067***   -0.0014    0.191***   0.350***   0.397***   0.292**    0.358***   -          0.037***
JAK (13)  0.021***   0.239***   0.080***   0.0004     0.218***   0.375***   0.405***   0.324**    0.371***   0.004*     -

Below diagonal = Fct; above diagonal = Fsc

Number of permutations = 100,000

* 0.01 < P < 0.05; ** 0.001 < P < 0.01; *** P < 0.001

Key: not significant; significant at 5% but not 1%; significant at 1% or 0.1%; in bold: Fct > Fsc

North-south differentiation

Thanks to Alicia Sanchez-Mazas!
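The north-south pattern can be read off the Fct values by comparing a few cells. The sketch below transcribes only the below-diagonal (Fct) entries of the table above, with the significance stars dropped, and looks up pairs of groups. The helper name `fct` and the variable names are illustrative choices, not part of the original analysis.

```python
GROUPS = ["NC", "SC", "WSE", "NTB", "STB", "AA", "DA", "HM", "TW", "ALT", "JAK"]

# Below-diagonal cells of the table (Fct), one row per group in GROUPS order.
# Row k holds the Fct between GROUPS[k] and each of GROUPS[0..k-1].
FCT_BELOW_DIAGONAL = [
    [],
    [0.143],
    [0.022, 0.059],
    [0.025, 0.264, 0.086],
    [0.125, 0.0002, 0.048, 0.239],
    [0.273, 0.053, 0.176, 0.442, 0.057],
    [0.304, 0.060, 0.203, 0.500, 0.077, -0.0110],
    [0.222, 0.024, 0.129, 0.369, 0.0260, -0.0140, 0.0070],
    [0.267, 0.045, 0.169, 0.452, 0.062, -0.0003, 0.015, -0.0040],
    [0.018, 0.220, 0.067, -0.0014, 0.191, 0.350, 0.397, 0.292, 0.358],
    [0.021, 0.239, 0.080, 0.0004, 0.218, 0.375, 0.405, 0.324, 0.371, 0.004],
]


def fct(group_a, group_b):
    """Return the tabulated Fct between two different groups, in either order."""
    i, j = sorted((GROUPS.index(group_a), GROUPS.index(group_b)))
    if i == j:
        raise ValueError("Fct is defined between two different groups")
    return FCT_BELOW_DIAGONAL[j][i]


# Northern groups versus Altaic / Japanese-Korean, and versus their southern counterparts.
for north, south in [("NC", "SC"), ("NTB", "STB")]:
    print(north, "vs ALT:", fct(north, "ALT"),
          "| vs JAK:", fct(north, "JAK"),
          "| vs", south + ":", fct(north, south))
```

In both comparisons the tabulated Fct between the northern group and ALT or JAK is smaller than the Fct between the northern group and its southern counterpart.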

Page 14

[GM Fct/Fsc table repeated from page 13]

Northern ST closer to Altaic and Japanese/Korean than to southern ST

Thanks to Alicia Sanchez-Mazas!

Page 15

[Figure] MDS GM, 143 populations (stress = 0.108)

Legend: Malayo-Polynesian, Miao-Yao, Altaic, Japanese, Korean, Tai-Kadai, Austronesian-Taiwan, Austro-Asiatic, Han-North, Han-South, TB North, TB South

Labels on the plot: Tai-Kadai; Altaics, Japanese, Koreans, Tibeto-Burmans N, Han N; Taiwan Austronesians; East coast; Centre and west coast

Thanks to Alicia Sanchez-Mazas!

Page 16

Closeness of northern ST and Altaic or Japanese-Korean looked at from other systems:

► HVS1 (mtDNA)

► Y chromosome SNPs

► HLA-DRB1

Page 17

[Figure] MDS HVS1 (mtDNA), 115 populations (stress = 0.183)

Legend: Altaic, Austronesian-Taiwan, Malayo-Polynesian, Austro-Asiatic, Japanese, Tai-Kadai, Korean, Hmong-Mien, Han North, Han South, TB North, TB South

Labels on the plot: Taiwan Austronesians; mostly Altaics, Japanese, Koreans, Tibeto-Burmans N, Han N; mostly Tai-Kadai and Hmong-Mien

Thanks to Estella 'Sim' Poloni!

Page 18

[Figure] MDS Y chromosome SNPs, 76 populations (stress = 0.218)

Legend: Malayo-Polynesian, Hmong-Mien, Altaic, Japanese, Korean, Tai-Kadai, Austronesian-Taiwan, Austro-Asiatic, Han North, Han South, TB North, TB South

Labels on the plot: mostly Altaics, Japanese, Koreans; Tai-Kadai; Atayal; Hui

Thanks to Barbara Arredi!

Page 19

[Figure] MDS analysis of 27 East Asian populations based on the HLA-DRB1 polymorphism (S = 0.291)

Source: Sanchez-Mazas et al. (2005), p. 279

Labels on the plot: Northern Chinese; Southern Chinese

Page 20

HLA-DRB1

► In the northern Chinese group:
    Guanxian undifferentiated from Manchu
    Urumqi Chinese undifferentiated from Manchu
    Urumqi Chinese undifferentiated from Khalk (Mongol)
    Urumqi Chinese undifferentiated from Khazak (Turkic)

(FST among populations tested by 10,000 random permutations)

Alicia Sanchez-Mazas, p.c. Sept 15, 2005
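"Undifferentiated" here means that the FST between two samples was not significantly greater than what random reshuffling produces. The sketch below illustrates the general idea of such a permutation test. It is not the procedure actually used for the HLA-DRB1 data: it assumes a simple heterozygosity-based FST, FST = (HT - HS) / HT, it permutes individual gene copies between two samples, and the allele labels and sample sizes are hypothetical.

```python
import random
from collections import Counter


def heterozygosity(alleles):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    n = len(alleles)
    return 1.0 - sum((count / n) ** 2 for count in Counter(alleles).values())


def fst(pop_a, pop_b):
    """Simplified FST = (HT - HS) / HT for two samples of gene copies."""
    pooled = pop_a + pop_b
    h_total = heterozygosity(pooled)
    if h_total == 0.0:
        return 0.0  # no variation at all, so nothing to partition
    n_a, n_b = len(pop_a), len(pop_b)
    h_within = (n_a * heterozygosity(pop_a) + n_b * heterozygosity(pop_b)) / (n_a + n_b)
    return (h_total - h_within) / h_total


def permutation_p_value(pop_a, pop_b, n_permutations=10_000, seed=0):
    """Observed FST and the share of random reshuffles with FST at least as large."""
    rng = random.Random(seed)
    observed = fst(pop_a, pop_b)
    pooled = pop_a + pop_b
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign gene copies to the two samples at random
        if fst(pooled[:len(pop_a)], pooled[len(pop_a):]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical samples: identical allele frequencies, so no differentiation.
same_a = ["x"] * 10 + ["y"] * 10
same_b = ["x"] * 10 + ["y"] * 10
fst_same, p_same = permutation_p_value(same_a, same_b, n_permutations=1000)

# Hypothetical samples with no shared alleles: complete differentiation.
fst_apart, p_apart = permutation_p_value(["x"] * 20, ["y"] * 20, n_permutations=1000)
```

A large P value, as in the first toy pair, is what "undifferentiated" corresponds to in this framework.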

Page 21

Proximity of southern ST to other southern groups

► Long observed (Cavalli-Sforza for Chinese)

► Usual explanation:
    ST homeland is in northern China
    Northern Chinese/TB best reflects original ST
    Southern Chinese has diverged because of 'Austric' gene flow following colonization of south China, c. 2000 BP.

Page 22

Problems for the 'usual' interpretation:

1. Northern ST closer to Altaic than to southern ST: strange.

2. Most of the ST linguistic diversity is in the southern group.

Page 23

Gene flow from Austric?

► L. Reid (2005), principal proponent of 'Austric' theory:

"With the accumulation of evidence presented by Sagart in this volume and elsewhere, that Austronesian can also be shown to be genetically related to the Sino-Tibetan family of languages (…) the possibility exists that the relationship between Austroasiatic and Austronesian is more remote than earlier considered. The concept of Austric as a language family may eventually need to be abandoned in favour of a wider language family, which can be shown to include both AN and AA language families, but not necessarily as sisters of a common ancestor"

Source: Reid, L. (2005) The current status of Austric. In: L. Sagart, R. Blench and A. Sanchez-Mazas (eds.) The Peopling of East Asia, pp. 17-30. London: RoutledgeCurzon.

Page 24

Is closeness to Altaic an original characteristic of ST populations?

Reasons for thinking that northern Chinese closeness to Altaic is not original

Page 25

Exhibit 1: ancient mtDNA study of 2 Shandong populations

Two early Shandong populations (c. 2500 BP; c. 2000 BP) closer to modern southern Chinese than to modern northern Chinese, incl. Shandong.

Yong-Gang Yao, Qing-Peng Kong, Xiao-Yong Man, Hans-Jürgen Bandelt, and Ya-Ping Zhang (2003) Reconstruction of the evolutionary history of China: A caveat about inferences drawn from Ancient DNA, Mol Biol Evol 20(2): 214-219

Page 26

Exhibit 2: episodes of Altaic domination of N. China

► Sixteen Kingdoms (Toba: early Mongolians): 300-430 CE

► Northern Wei dynasty (Xianbei: early Mongolian): 386-534 CE

► Liao dynasty (Khitan: Tungusic?): 907-1119 CE

► Jin dynasty (Jurchen: early Manchu?): 1115-1234 CE

► Yuan dynasty (Mongol): 1271-1368 CE

► Qing dynasty (Manchu): 1644-1911 CE

Page 27

Results on N. Chinese populations:

► very high wartime mortality of Chinese populations in the north

► large-scale N. Chinese migrations to south China

► settling of N. China by Altaic-speaking populations

► settled Altaic populations and ruling class become bilingual in Chinese, then shift to Chinese

Page 28

Consequences of language shift: Altaic substratum in northern Mandarin

► in grammar
    Hashimoto 1984 (higher incidence of verb-final patterns in N. Mandarin)

► in pronunciation
    Cheng 2002 (in N. Mandarin, elimination of vowel sequences violating Altaic vowel harmony)

Page 29

Evidence for an Altaic substratum in northern TB

Gong 2002 (Altaic case endings in TB languages, especially northern: Tibetan, Tangut)

Page 30

Conclusions for part I

► The convergence of:
    Historical evidence
    Linguistic evidence
    Ancient DNA evidence

► suggests that Northern ST populations
    are genetically close to Altaic because of massive Altaic gene flow in the past 2000 years

► Southern ST
    Has most of the ST linguistic diversity
    Is closer to 'original ST'

Page 31

Part II: focus on the south

► proximity between southern ST and
    Austroasiatic
    Hmong-Mien
    Austronesian (Taiwan)
    Tai-Kadai

manifested for the GM system in low Fct values between them:

Page 32

[GM Fct/Fsc table repeated from page 13]

proximity between ST and AA, TK, AN, Hm-M

Thanks to Alicia Sanchez-Mazas!

Page 33

In short:

southern Sino-Tibetans
Taiwan Austronesians
Tai-Kadais
less reliably Austroasiatics and Hmong-Miens

show:

► significant but low group-to-group differentiations

Page 34

Sino-Tibetan-Austronesian linguistic theory

Sagart, L. (2005) Sino-Tibetan-Austronesian: an updated and improved argument. In L. Sagart, R. Blench and A. Sanchez-Mazas (eds) The peopling of East Asia: Putting together Archaeology, Linguistics and Genetics, 161-176. London: RoutledgeCurzon.

[Diagram] Sino-Tibetan; Austronesian

Proto-Sino-Tibetan-Austronesian: c. 8500 BP, NE China

Page 35

Sound correspondences: general case

Gloss        Proto-AN   Old Chinese            TB
to shoot     panaq      弩 ᵃnaʔ 'crossbow'
brain        punuq      腦 ᵃnuʔ                Benedict PTB nuk
vomit/spit   utaq       吐 ᵃthaʔ               Lushai chāk < tak
earth        -taq       土 ᵃthaʔ
breast       nunuh      乳 ᵇnoʔ                Benedict *nuw
head         quluh      首 ᵇhluʔ               Lushai lu

Page 36

The Swadesh 100-word list (in green on the slide, marked * here: 13 words shared by Chinese and PAN)

*I, you (sg.), we, *this, that, who, what, not, all, many, *one, two, big, long, small, woman, man, human (n), fish, bird, dog, louse, tree, seed, leaf, root, bark (of tree), skin, flesh, blood, bone, fat (n.), *egg, *horn, tail, feather, hair (of head), *head, ear, eye, nose, mouth, tongue, tooth, claw, foot, knee, hand, neck, belly, *breast(s), heart, liver, drink, eat, bite, hear, see, know, *sleep (vb.), *die, kill, swim, fly (vb.), walk, come, lie (recline), sit, stand, give, *say, sun, moon, star, water, rain (n.), stone, sand, *earth, cloud, smoke, fire, ash(es), burn (intr.), path, mountain, red, green, yellow, white, black, night, *hot, cold, full, new, good, round, *dry, name.

Page 37

13 basic vocabulary items shared by Old Chinese and PAN

Meaning       Proto-Austronesian   Old Chinese   TB
I             ku                   ᵃnga          ka, nga
this          di                   ᵇdï           Tib. Ndi
one           is-a                 ᵇʔit          it
egg           qiCeluR              ᵃlorʔ         twiy
horn          quRung               ᵃk-rok        rung
head          quluh                ᵇhluʔ         Lushai lu
breast(s)     nunuh                ᵇnoʔ          nuw
sleep (vb.)   -zem                 ᵇtshimʔ       Tib. gzim
die           maCay                ᵇsijʔ         siy
say           kawaS                ᵇwat          Tib. s-go < w-
hot           qanget               ᵇnget
earth         -taq                 ᵃthaʔ
dry           -kaR                 ᵃkar          Burmese kân < -r

Page 38

Shared Morphology 1: prefix s- 'valency increaser'

► Austronesian: Atayal
    m- 'to be afraid'
    s- 'to frighten'

► Old Chinese
    順 *ᵇm-lun-s 'to be pliant, obedient'
    馴 *ᵇs-m-lun 'to tame'

► Tibetan
    'bar 'to burn, catch fire, be ignited'
    s-bar 'to light, to kindle, to inflame'

Page 39

Shared Morphology 2: prefix m-/N- 'intransitive'

► Proto-Austronesian:
    pa-Cay 'to kill' (pa- causative)
    ma-Cay 'to die, dead'

► Old Chinese
    夾 ᵃkrep 'to press between'
    狹 ᵃN-krep 'narrow'

► TB: Gyarong
    k- 'to split'
    k- 'to be rent'

Page 40

Shared Morphology 3: -n nominalizer of verbs

► Tibetan
    za-ba 'to eat'
    za-n 'food, pap, porridge'

► Austronesian: Paiwan
    kan 'eat'
    kan-en 'food'

Page 41

Formation of the STAN phylum

► Bellwood/Renfrew farming/language hypothesis

► The STAN phylum as a farming expansion based on rice and foxtail millet (Setaria italica)

Page 42

A field of Setaria italica in N. China (courtesy: Tracey Lu)

Page 43

Neolithic transition(s) in N. China

domestication of rice, c. 10,000 BP

domestication of Setaria italica, c. 8500 BP

Illustration from Lu 2005, modified

Page 44

Bellwood's recent hypothesis on East Asia

1. Only one neolithic transition in east Asia: domestication of rice, c. 10,000 BP;

2. followed by population expansion

3. The northernmost farmers were obliged to domesticate a second cereal: Setaria italica, c. 8500 BP

[in Sagart, Blench and Sanchez-Mazas (eds) The Peopling of East Asia. London: RoutledgeCurzon]

Page 45

Distribution of Setaria italica (foxtail millet) c. 5000 BP (source: Lu 2005, slightly modified)

Page 46

Distribution of millet cultivation c. 5000 BP:

► North China (nuclear area)

► Tibet

► Taiwan

Precisely the area of Sino-Tibetan-Austronesian

Page 47

STAN cereal-related terms

Meaning | PAN | Chinese | TB
husked rice | beRas | 糲 ᵇmə-rat-s | Tib. 'bras 'rice' < m-ras
rice in grains / cooked grains of rice | Semay | 米 ᵃmijʔ | Bodo-Garo may 'cooked rice; rice; paddy'
Setaria italica | beCeŋ | 稷 ᵇtsək | Lhokpu cək

Page 48

Tai-Kadai as a branch of Austronesian

Sagart, L. (2004) The higher phylogeny of Austronesian and the position of Tai-Kadai. Oceanic Linguistics 43,2: 411-444.

Page 49

Sagart's phylogeny for STAN

[Tree diagram. Node labels: PSTAN, PST, PAN, Pituish, Enemish, Walu_Siwaish, Puluqish, Pangish, Muish, NE_Formosan, Western Plains, Atayalic, Rukai_Tsouic, Tsouic, MaRish, PMP, PTK. Language labels: Pazeh, Luilang, Saisiat, Atayal, Sediq, Siraya, Bunun, Rukai, Paiwan, Puyuma, Amis, Kavalan, Ketagalan.]

Innovations marked on the tree:

► Old additive expression meaning '5+2' is reduced to pitu 'seven'

► New word for 'six': enem; new word for 'year': kawaS

► Additive expressions meaning '5+3' and '5+4' reduced to new words walu 'eight' and Siwa 'nine'

► New word for 'thou'; new word for 'bird'

► New word for 'ten'

► New morphological process: Pang-V > instrumental noun

Page 50

Proposed:

► Bellwood's northern farmers, c. 8500 BP:
spoke Proto-Sino-Tibetan-Austronesian;
lived in north-eastern China (Yellow River Valley, Huai Valley);
had millet, rice, chickens;
expanded:

► An Eastern branch reached the eastern seaboard c. 7000 BP and eventually Taiwan c. 5500 BP, Philippines 4000 BP, N. Vietnam 4000 BP (Tai-Kadai)

► The stay-at-homes evolved into the ST family, expanding westward, reaching Tibet in the 6th mill. BP
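The chronology proposed on this slide can be laid out as a small timeline. The following is only an illustrative sketch: the dates are the approximate BP figures given on the slide, while the event labels, the data layout and the `timeline` helper are assumptions introduced here, not part of the presentation. The arrival in Tibet is left out because the slide gives only a millennium ("6th mill. BP"), not a point date.

```python
# Approximate dates (years BP) as stated on the slide.
# Labels and structure are hypothetical, chosen for illustration only.
events = [
    ("PSTAN farmers in north-eastern China", 8500),
    ("Eastern branch reaches the eastern seaboard", 7000),
    ("Taiwan", 5500),
    ("Philippines", 4000),
    ("N. Vietnam (Tai-Kadai)", 4000),
]


def timeline(events):
    """Order events oldest first and add the years elapsed since the earliest one."""
    ordered = sorted(events, key=lambda e: e[1], reverse=True)
    start = ordered[0][1]
    return [(label, bp, start - bp) for label, bp in ordered]


for label, bp, elapsed in timeline(events):
    print(f"c. {bp} BP  (+{elapsed} yr)  {label}")
```

Read this way, the scenario places roughly three millennia between the proposed homeland and the arrival in Taiwan, and a further millennium and a half before the two movements out of Taiwan.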

Page 51

[Map: expansion of Sino-Tibetan-Austronesian Setaria farmers. Base map: East Asia (Encarta 2000). Numbered markers 1-4 appear on the map.]

Map labels:

► domestication of Setaria italica, c. 8500 BP: Cishan-Peiligang, Jiahu cultures

► Yangshao culture: Proto-Sino-Tibetan, c. 7000 BP

► Beixin-Dawenkou: pre-Austronesian culture, c. 7000-6000 BP

► W. Taiwan, Dapenkeng culture, 5500 BP

► Karuo, Tibet, c. 5300 BP

► Out of Taiwan I: Malayo-Polynesian, c. 4000 BP

► Out of Taiwan II: Tai-Kadai, c. 4000 BP

Page 52

Markers for the southward coastal expansion of the pre-Austronesians:

1. grains of Setaria italica in archaeological contexts:

► Dawenkou culture, c. 6000-5000 BP

► Taiwan west coast, c. 5000 BP

Page 53

carbonized grains of Setaria from Tainan, Taiwan c. 5000 BP

source: Tsang, Cheng-hwa (2005) Recent discoveries at a Tapenkeng culture site in Taiwan: implications for the problem of Austronesian origins. In L. Sagart, R. Blench and A. Sanchez-Mazas (eds) The peopling of East Asia: Putting together Archaeology, Linguistics and Genetics. London: RoutledgeCurzon.

Page 54

Markers for the southward coastal expansion of the pre-Austronesians:

2. Tooth evulsion: ritual extraction of upper lateral incisors; in boys and girls, in adolescence:

► Dawenkou culture ca. 6500- BP

► Taiwan west coast ca. 5000 BP

► Nowhere else at those early dates

Page 55

tooth evulsion

Page 56

A Y-chromosome mutation with a correlated distribution:

The M119 mutation (and corresponding O1 haplotype) is carried by many more individuals in the Eastern branch of STAN than elsewhere:

Page 57

M119

Highest frequency on the Eastern Chinese seaboard:

► speakers of Chinese dialects

► Taiwan Austronesians

► Tai-Kadais (really Austronesians)

Low frequency among:

► Tibeto-Burmans

► Altaic

► Japanese-Korean

Page 58

[Map: O1-M119 in East Asia]

Thanks to Estella ‘Sim’ Poloni!

Page 59

[Map: O1-M119 among non-ST. Population labels: Japanese, Korean, Manchu_1, Manchu_2, Buryat, Mongolian_1, Mongolian_2, Siberian Evenk, Chinese_Evenk, Oroqen, Uzbek, Altai, Kazakh, Uygur, She, Miao, Yao, Vietnamese, Cambodian, Zhuang_1, Zhuang_2, Bouyei, Thailandese, Atayal, Taiwanese_Aborigene, Batak, Malaysian, Balinese, East_Indonesian, W_Indonesian, Philippino]

Thanks to Estella ‘Sim’ Poloni!

Page 60

[Map: O1-M119 among ST]

Thanks to Estella ‘Sim’ Poloni!

Page 61

Conclusions

► northern ST is linguistically and genetically "altaicized"

► southern ST is 'original ST'

► southern ST is genetically close to southern groups: Austronesian, Tai-Kadai, Hmong-Mien, Austroasiatic

► But results for Hmong-Mien and Austroasiatic need to be confirmed on a larger number of population samples

► Genetic data do not contradict Sino-Austronesian theory in a major way

Page 62

Thank you for your attention

This presentation will be posted on the conference website

comments:

[email protected]@ehess.fr