
Genetic relationships I

David Willis
Li2 Language variation: Historical linguistics

*Proto-Indo-European
    *Proto-Celtic
        ‡Old Irish: Gaelic, Irish, ‡Manx
        *British: Breton, ‡Cornish, Welsh
    ‡Latin
        ROMANCE: Catalan, French, Italian, Occitan, Portuguese, Romanian, Romansh, Spanish
    *Proto-Germanic
        WEST: Afrikaans, Dutch, English, Frisian, German, Yiddish
        NORTH: Danish, Faroese, Icelandic, Norwegian, Swedish
        EAST: ‡Gothic
    *Proto-Balto-Slavic
        *Proto-Baltic: Latvian, Lithuanian, ‡Prussian
        *Proto-Slavic
            WEST: Czech, Polish, Slovak, Sorbian
            SOUTH: Bulgarian, Croatian, Macedonian, ‡Old Church Slavonic, Serbian, Slovene
            EAST: Belorussian, Russian, Ukrainian
    Albanian
    ‡Ancient Greek
        Modern Greek
    Armenian
    *Proto-Indo-Iranian
        *Proto-Iranian: Kurdish, Ossetian, Persian, Pashto, Tadzhik
        INDO-ARYAN: ‡Sanskrit, Assamese, Bengali, Gujarati, Hindi/Urdu, Maldivian, Marathi, Punjabi, Sindhi, Sinhalese
    ‡Tocharian

* = reconstructed language

‡ = dead language

The Comparative Method

How do we establish that languages are related? Similarities between languages can be due to:

    accident (chance resemblance or independent parallel development)
    borrowing / language contact
    onomatopoeia
    universals and typology
    genetic relationship

The first four need to be eliminated to demonstrate the last.

Eliminating borrowings and chance similarities

Borrowing:

FINNISH         ENGLISH
abstraktinen    abstract
almanakka       almanac
arkkitehti      architect

Chance resemblance:

         MODERN GREEK    MALAY
‘eye’    ['mati]         [mata]

Eliminating borrowings and chance similarities

HAWAIIAN            ANCIENT GREEK
aeto ‘eagle’        aetos ‘eagle’
noonoo ‘thought’    nous ‘thought’
manao ‘think’       manthano ‘learn’
mele ‘sing’         melos ‘melody’
lahui ‘people’      laos ‘people’
meli ‘honey’        meli ‘honey’
kau ‘summer’        kauma ‘heat’
mahina ‘month’      men ‘moon’

Correspondence sets

systematic sound correspondences between cognates (items in two languages that have been continuously transmitted from a single parent-language item) are strong evidence for linguistic affinity (assuming regularity of sound change)

some linguists regard sound correspondences as essential for demonstrating it

           ENGLISH    DUTCH      GERMAN     DANISH      SWEDISH
‘house’    [haʊs]     [høys]     [haʊs]     [huːs]      [huːs]
‘mouse’    [maʊs]     [møys]     [maʊs]     [muːs]      [muːs]
‘louse’    [laʊs]     [løys]     [laʊs]     [luːs]      [luːs]
‘out’      [aʊt]      [øyt]      [aʊs]      [u ]        [uːt]
‘brown’    [braʊn]    [brøyn]    [braʊn]    [b au n]    [bruːn]
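The recurring vowel pattern in the first three rows can be tallied mechanically. The following is a minimal Python sketch, not part of the original slides. The vowel nuclei are segmented by hand, the readings [aʊ] (English, German) and [uː] (Danish, Swedish) are assumptions about the intended transcriptions, and the names `NUCLEI` and `correspondence_sets` are hypothetical.

```python
from collections import Counter

LANGUAGES = ("English", "Dutch", "German", "Danish", "Swedish")

# Stressed-vowel nuclei, segmented by hand from the table above.
# Assumption: English/German have [aʊ] and Danish/Swedish have [uː].
NUCLEI = {
    "house": ("aʊ", "øy", "aʊ", "uː", "uː"),
    "mouse": ("aʊ", "øy", "aʊ", "uː", "uː"),
    "louse": ("aʊ", "øy", "aʊ", "uː", "uː"),
}


def correspondence_sets(nuclei):
    """Count how many cognate sets show each cross-language vowel pattern."""
    return Counter(nuclei.values())


for pattern, count in correspondence_sets(NUCLEI).items():
    row = " : ".join(f"{lang} {seg}" for lang, seg in zip(LANGUAGES, pattern))
    print(f"{row}  ({count} cognate sets)")
```

A single pattern recurring across several unrelated meanings is what makes the correspondence systematic rather than a one-off resemblance.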

Correspondence sets

use basic vocabulary: body parts, close kin, natural world, low numbers (but even basic vocabulary can sometimes be borrowed, e.g. Finnish borrowed from Baltic and Germanic the words for ‘mother’, ‘daughter’, ‘sister’, ‘tooth’, ‘navel’, ‘neck’, ‘thigh’, ‘fur’)

similarity due to onomatopoeia is not reliable evidence of relatedness, but can usually be readily identified

cognacy is often not recognised until the systematic correspondences are understood:

English wheel : Hindi cakkā

English horn : Hindi sĩg

English sister : Hindi bahan

French cinq : Russian pjat’ : Armenian hing : English five (< PIE *penkʷe)

(regular but non-obvious correspondences, e.g. Armenian hing shows *p > h; cf. Armenian het : Greek ped- ‘foot’, hour : pyr ‘fire’ etc.)

False correspondence sets due to borrowing

regular correspondences may sometimes be found in loans:

English : French normally shows f : p (father : père, foot : pied, for : pour)

Romance loans show p : p (paternal : paternel, pedestal : piédestal)

use of basic vocabulary may help us to overcome this, but not always:

Massive borrowing may create false correspondence sets e.g. Welsh /p/ : Latin /p/:

            WELSH     LATIN
‘bridge’    pont      pōns
‘fish’      pysg      piscis
‘tent’      pabell    pāpiliō
‘pan’       padell    patella
‘people’    pobl      populus

False correspondence sets due to borrowing

can be hard to distinguish from real correspondence sets e.g. Welsh /p/ : Old Irish /k/:

            WELSH     OLD IRISH
‘four’      pedwar    cethair
‘worm’      pryf      cruim
‘I buy’     prynaf    crenaim
‘head’      penn      cenn

Armenian was widely believed to be an Iranian language until 1875, when Heinrich Hübschmann showed it to be an independent branch of Indo-European by demonstrating that its similarities with Iranian were due to borrowing (Armenian showed three systems of sound correspondence with Iranian) and were not supported by morphological irregularity.
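The difficulty can be made concrete by tallying word-initial correspondences in the two Welsh word lists. The following is a minimal Python sketch, not part of the original slides. It compares orthographic initials only (Old Irish c spells /k/), the Latin forms are written without macrons, and the names `WELSH_LATIN`, `WELSH_OLD_IRISH` and `initial_correspondences` are hypothetical.

```python
from collections import Counter

# Word pairs copied from the two Welsh tables (orthographic forms;
# Latin written without macrons). Old Irish <c> spells /k/.
WELSH_LATIN = [
    ("pont", "pons"), ("pysg", "piscis"), ("pabell", "papilio"),
    ("padell", "patella"), ("pobl", "populus"),
]
WELSH_OLD_IRISH = [
    ("pedwar", "cethair"), ("pryf", "cruim"),
    ("prynaf", "crenaim"), ("penn", "cenn"),
]


def initial_correspondences(pairs):
    """Tally the pairing of word-initial letters across two word lists."""
    return Counter((first[0], second[0]) for first, second in pairs)


print(initial_correspondences(WELSH_LATIN))
print(initial_correspondences(WELSH_OLD_IRISH))
```

Each tally contains a single pattern, p : p for the Latin loans and p : c for the inherited Old Irish cognates. Regularity is therefore present in both sets, and on its own it cannot separate borrowed from inherited vocabulary.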

‘Shared aberrancy’ / shared grammatical irregularity

The presence of similar (highly arbitrary) morphological alternation in two languages is a very good indicator of genetic relationship (e.g. English good, better, best and German gut, besser, best-). Antoine Meillet termed this ‘shared aberrancy’.

            3rd sing. of ‘be’    3rd plur. of ‘be’    1st sing. of ‘be’
LATIN       est                  sunt                 sum
SANSKRIT    ásti                 sánti                asmi
GREEK       esti                 eisi                 eimi
GOTHIC      ist                  sind                 im
HITTITE     ešzi                 ašanzi               ešmi
PIE         *es-ti               *s-enti              *es-mi

Typological evidence

the value of typological evidence is disputed

early linguists often used shared morphological type as evidence of relatedness

usually considered to be of little value today, despite some attempts to use it

The subgrouping problem

lines on family trees indicate a period of shared innovation (development) of all the daughter languages

splits indicate the beginning of independent innovation

nodes indicate intermediate parent languages (whether or not these are attested or even have names) e.g. Proto-Germanic, Latin etc.

subgrouping is done by establishing that a group of languages share an innovation that is not found in some other portion of the family, e.g. Grimm’s Law defines Germanic (English father, German Vater vs. French père (Latin pater), Ancient Greek patēr, Sanskrit pitṛ); loss of PIE *p defines Celtic (Old Irish lán, Welsh llawn vs. French plein, Russian polon, English full)

genetic similarities between languages may be due either to shared retention or to shared innovation: only shared innovation is indicative of subgrouping

shared parallel development is a possible confounding factor

The subgrouping problem

L1    L2    L3    L4    L5
p     p     f     f     zero

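Most linguists would prefer to posit two changes in the histories of these languages, namely p > f and (subsequently) f > ø, hence a tree in which L3, L4 and L5 form a subgroup defined by the shared innovation p > f, with f > ø then applying in L5 alone.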

Note that this means that, perhaps surprisingly, L1 and L2 share no particularly close relationship.

(Harrison 2003)
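The reasoning behind this grouping can be sketched in code. The following minimal Python sketch is not part of the original slides. It assumes that the proto-sound was *p and that the two changes applied in the order p > f, then f > ø. The names `REFLEXES`, `CHANGES`, `innovations` and `subgroup` are hypothetical.

```python
# Reflexes of one proto-sound in five languages ("" stands for zero).
REFLEXES = {"L1": "p", "L2": "p", "L3": "f", "L4": "f", "L5": ""}

# Proposed changes in chronological order. Assumption: the proto-sound is *p.
CHANGES = [("p", "f"), ("f", "")]


def innovations(reflex, proto="p", changes=CHANGES):
    """Return the changes a language must have undergone to reach its reflex."""
    undergone = []
    sound = proto
    for old, new in changes:
        if sound == reflex:
            break  # reflex already reached: later changes did not apply
        if sound == old:
            sound = new
            undergone.append((old, new))
    if sound != reflex:
        raise ValueError(f"no derivation of {reflex!r} from {proto!r}")
    return undergone


def subgroup(change):
    """Return the languages that share a given innovation."""
    return sorted(lang for lang, reflex in REFLEXES.items()
                  if change in innovations(reflex))


print(subgroup(("p", "f")))
print(subgroup(("f", "")))
print([lang for lang, reflex in REFLEXES.items() if not innovations(reflex)])
```

On these assumptions L3, L4 and L5 share the innovation p > f, and only L5 shows f > ø. L1 and L2 show no innovation at all: their common p is a shared retention, which is why it gives no evidence that they form a subgroup.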

The establishment of Indo-European

establishment of Indo-European is often attributed to William Jones’s third discourse to the Royal Asiatic Society of Bengal (February 1786):

The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either; yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three without believing them to have sprung from some common source, which, perhaps, no longer exists. There is a similar reason, though not quite so forcible, for supposing that both the Gothic and Celtick, though blended with a very different idiom, had the same origin with the Sanscrit; and the old Persian might be added to the same family…

Sir William Jones (1746–94)


connections between Indo-European languages had been observed long before Jones

the relationship between Sanskrit and other Indo-European languages had been observed before Jones

Jones did not use the comparative method

Jones interpreted linguistic relationship as part of ethnic / genealogical relationships in a biblical framework, the history of the human races

Jones made a number of errors that were not made by everyone at the time:

Of these cursory observations on the Hindus … this is the result; that they had an immemorial affinity with the old Persians, Ethiopians, and the Egyptians; the Phenicians, Greeks, and Tuscans [= Etruscans]; the Scythians or Goths, and Celts; the Chinese, Japanese, and Peruvians; whence, as no reason appears for believing that they were a colony from any one of those nations, or any of those nations from them, we may fairly conclude that they all proceeded from some central country, to investigate which will be the object of my future Discourses. (Third discourse to the Royal Asiatic Society of Bengal, 1786)


early attempts at language classification in the sixteenth century saw all languages as descended from Hebrew

even in the sixteenth century, many of the branches of Indo-European were recognised (e.g. by Konrad Gesner in 1555)

early attempts at etymology allowed free substitution of sounds (cf. Voltaire’s alleged remark that ‘etymology is a science in which the vowels count for nothing and the consonants for very little’)

an improvement is the notion that certain sound changes recur in different languages


the idea that various European languages might go back to a common ancestor originates with the ‘Scythian hypothesis’, first propounded by Johannes Goropius Becanus (Jan van Gorp van der Beke) in 1569 and later associated with Claude Saumaise (1588–1653) (Indo-Scythian), Marc Boxhorn (1602–53) and Andreas Jäger (Scytho-Celtic):

An ancient language, once spoken in the distant past in the area of the Caucasus mountains and spreading by waves of migration throughout Europe and Asia, had itself ceased to be spoken and had left no linguistic monuments behind, but had as a ‘mother’ generated a host of ‘daughter languages’, many of which in turn had become ‘mothers’ to further ‘daughters’. (For a language tends to develop dialects, and these dialects in the course of time become independent, mutually unintelligible languages.) Descendants of the ancestral languages include Persian, Greek, Italic (whence Latin and in time the modern Romance tongues), the Slavonic languages, Celtic, and finally Gothic and the other Germanic tongues. (Andreas Jäger, De lingua vetustissima Europae, Scytho-Celtica et Gothica, 1686)


the discovery of Sanskrit added weight to such investigations. Thomas Stephens had noted similarities between Indian languages and Latin and Greek in 1583 (but he meant typological similarity and not common historical origin):

The languages of these regions are very numerous. They have a not unpleasing pronunciation and a structure similar to Latin and Greek.

The idea of common origin was suggested by Gaston Laurent Coeurdoux in 1767 (but he meant borrowing). Such ideas were commonplace in the mid-18th century.

Edward Lhuyd (1707) was perhaps the first to identify sound correspondences and understand that regularity of sound correspondences was proof of common origin e.g. he identified that Greek, Romance, Celtic /k/ corresponds to Germanic /h/ (German hundert ‘hundred’: Latin centum, German Hund ‘dog (hound)’ : Latin canis, German Hals ‘neck’ : Latin collum = Grimm’s Law).
