Genetic relationships I
David Willis
Li2 Language variation: historical linguistics
Indo-European family tree:
*Proto-Indo-European
    ‡Tocharian
    *Proto-Celtic
        ‡Old Irish: Gaelic, Irish, ‡Manx
        *British: Breton, ‡Cornish, Welsh
    *Proto-Germanic
        WEST: Afrikaans, Dutch, English, Frisian, German, Yiddish
        NORTH: Danish, Faroese, Icelandic, Norwegian, Swedish
        EAST: ‡Gothic
    ‡Latin
        ROMANCE: Catalan, French, Italian, Occitan, Portuguese, Romanian, Romansh, Spanish
    Albanian
    ‡Ancient Greek
        Modern Greek
    Armenian
    *Proto-Balto-Slavic
        *Proto-Baltic: Latvian, Lithuanian, ‡Prussian
        *Proto-Slavic
            WEST: Czech, Polish, Slovak, Sorbian
            SOUTH: Bulgarian, Croatian, Macedonian, ‡Old Church Slavonic, Serbian, Slovene
            EAST: Belorussian, Russian, Ukrainian
    *Proto-Indo-Iranian
        *Proto-Iranian: Kurdish, Ossetian, Persian, Pashto, Tadzhik
        INDO-ARYAN: ‡Sanskrit, Assamese, Bengali, Gujarati, Hindi/Urdu, Maldivian, Marathi, Punjabi, Sindhi, Sinhalese
* = reconstructed language
‡ = dead language
The Comparative Method
How do we establish that languages are related? Similarities between languages can be due to:
accident (chance resemblance or independent parallel development)
borrowing / language contact
onomatopoeia
universals and typology
genetic relationship
The first four need to be eliminated to demonstrate the last.
Eliminating borrowings and chance similarities
FINNISH         ENGLISH
abstraktinen    abstract
almanakka       almanac
arkkitehti      architect

          MODERN GREEK    MALAY
‘eye’     ['mati]         [mata]
Eliminating borrowings and chance similarities
HAWAIIAN              ANCIENT GREEK
aeto ‘eagle’          aetos ‘eagle’
noonoo ‘thought’      nous ‘thought’
manao ‘think’         manthano ‘learn’
mele ‘sing’           melos ‘melody’
lahui ‘people’        laos ‘people’
meli ‘honey’          meli ‘honey’
kau ‘summer’          kauma ‘heat’
mahina ‘month’        men ‘moon’
Correspondence sets
systematic sound correspondences between cognates (items in two languages that have been continuously transmitted from a single parent-language item) are strong evidence for linguistic affinity (assuming regularity of sound change)
some linguists regard sound correspondences as essential for demonstrating it
           ENGLISH    DUTCH      GERMAN     DANISH     SWEDISH
‘house’    [haʊs]     [høys]     [haʊs]     [huːs]     [huːs]
‘mouse’    [maʊs]     [møys]     [maʊs]     [muːs]     [muːs]
‘louse’    [laʊs]     [løys]     [laʊs]     [luːs]     [luːs]
‘out’      [aʊt]      [øyt]      [aʊs]      [uːð]      [uːt]
‘brown’    [braʊn]    [brøyn]    [braʊn]    [bʁaun]    [bruːn]
Correspondence sets
use basic vocabulary: body parts, close kin, natural world, low numbers (but even basic vocabulary can sometimes be borrowed, e.g. Finnish borrowed the words for ‘mother’, ‘daughter’, ‘sister’, ‘tooth’, ‘navel’, ‘neck’, ‘thigh’ and ‘fur’ from Baltic and Germanic)
similarity due to onomatopoeia is not reliable evidence of relatedness, but can usually be readily identified
cognacy is often not recognised until the systematic correspondences are understood:
English wheel : Hindi cakkā
English horn : Hindi sĩg
English sister : Hindi bahan
French cinq : Russian pjat’ : Armenian hing : English five (< PIE *penkʷe)
(regular but non-obvious correspondences e.g. Armenian hing shows *p > h in Armenian cf. Armenian het : Greek ped- ‘foot’, hour : pyr ‘fire’ etc.)
False correspondence sets due to borrowing
regular correspondences may sometimes be found in loans:
English : French normally shows f : p (father : père, foot : pied, for : pour)
Romance loans show p : p (paternal : paternel, pedestal : piédestal)
use of basic vocabulary may help us to overcome this, but not always:
massive borrowing may create false correspondence sets e.g. Welsh /p/ : Latin /p/:
            WELSH     LATIN
‘bridge’    pont      pōns
‘fish’      pysg      piscis
‘tent’      pabell    pāpiliō
‘pan’       padell    patella
‘people’    pobl      populus
False correspondence sets due to borrowing
can be hard to distinguish from real correspondence sets e.g. Welsh /p/ : Old Irish /k/:
            WELSH     OLD IRISH
‘four’      pedwar    cethair
‘worm’      pryf      cruim
‘I buy’     prynaf    crenaim
‘head’      penn      cenn
Armenian was widely believed to be an Iranian language until 1875, when Heinrich Hübschmann showed it to be an independent branch of Indo-European by demonstrating that its similarities with Iranian were due to borrowing (Armenian showed three systems of sound correspondence with Iranian) and were not supported by morphological irregularity.
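The two Welsh tables above (Welsh : Latin and Welsh : Old Irish) can be compared with a small tally of word-initial correspondences. This is a minimal sketch, not a method from the slides: it assumes that the first letter of each orthographic form stands in for the initial sound (Old Irish <c> spells /k/), it restores the Latin long vowels in pōns and pāpiliō, and the function name `initial_correspondences` is purely illustrative.

```python
from collections import Counter

# Welsh : Latin pairs (loans from Latin into Welsh, as listed on the slide).
WELSH_LATIN = [
    ("pont", "pōns"),       # 'bridge'
    ("pysg", "piscis"),     # 'fish'
    ("pabell", "pāpiliō"),  # 'tent'
    ("padell", "patella"),  # 'pan'
    ("pobl", "populus"),    # 'people'
]

# Welsh : Old Irish pairs (inherited cognates, as listed on the slide).
WELSH_OLD_IRISH = [
    ("pedwar", "cethair"),  # 'four'
    ("pryf", "cruim"),      # 'worm'
    ("prynaf", "crenaim"),  # 'I buy'
    ("penn", "cenn"),       # 'head'
]

def initial_correspondences(pairs):
    """Count how often each pair of word-initial letters co-occurs."""
    return Counter((a[0], b[0]) for a, b in pairs)

latin_counts = initial_correspondences(WELSH_LATIN)
irish_counts = initial_correspondences(WELSH_OLD_IRISH)

for label, counts in [("Latin", latin_counts), ("Old Irish", irish_counts)]:
    for (welsh, other), n in counts.most_common():
        print(f"Welsh {welsh} : {label} {other}  ({n} items)")
```

Both sets come out perfectly regular, which is the point of the slide: counting recurrences alone cannot tell a loan layer from inherited vocabulary, so further evidence (such as shared morphological irregularity) is needed.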
‘Shared aberrancy’ / shared grammatical irregularity
The presence of similar (highly arbitrary) morphological alternation in two languages is a very good indicator of genetic relationship (e.g. English good, better, best and German gut, besser, best-). Antoine Meillet termed this ‘shared aberrancy’
            3rd sing. of ‘be’    3rd plur. of ‘be’    1st sing. of ‘be’
LATIN       est                  sunt                 sum
SANSKRIT    ásti                 sánti                asmi
GREEK       esti                 eisi                 eimi
GOTHIC      ist                  sind                 im
HITTITE     ešzi                 ašanzi               ešmi
PIE         *es-ti               *s-enti              *es-mi
Typological evidence
the value of typological evidence is disputed
early linguists often used shared morphological type as evidence of relatedness
usually considered to be of little value today, despite some attempts to use it
The subgrouping problem
lines on family trees indicate a period of shared innovation (development) of all the daughter languages
splits indicate the beginning of independent innovation
nodes indicate intermediate parent languages (whether or not these are attested or even have names) e.g. Proto-Germanic, Latin etc.
subgrouping is done by establishing that a group of languages share an innovation that is not found in some other portion of the family e.g. Grimm’s Law defines Germanic (English father, German Vater vs. French père (Latin pater), Ancient Greek patēr, Sanskrit pitṛ); loss of PIE *p defines Celtic (Old Irish lán, Welsh llawn vs. French plein, Russian polon, English full)
genetic similarities between languages may be due either to shared retention or to shared innovation: only shared innovation is indicative of subgrouping
shared parallel development is a possible confounding factor
The subgrouping problem
L1    L2    L3    L4    L5
p     p     f     f     zero
Most linguists would prefer to posit two changes in the histories of these languages, namely, p > f and (subsequently) f > ø, hence the tree:
Note that this means that, perhaps surprisingly, L1 and L2 share no particularly close relationship.
(Harrison 2003)
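The reasoning behind this analysis can be sketched in a few lines of code. This is an illustrative sketch only: it assumes the ordered chain of changes p > f > ø given above, and the names `REFLEXES`, `CHAIN`, `innovations` and `subgroups` are hypothetical. Each language is assigned the changes it must have undergone, and languages are then grouped by the innovations they share.

```python
# Reflexes of one proto-sound in the five hypothetical languages ("ø" = zero).
REFLEXES = {"L1": "p", "L2": "p", "L3": "f", "L4": "f", "L5": "ø"}

# Assumed chain of changes: p > f, then (subsequently) f > ø.
CHAIN = ["p", "f", "ø"]

def innovations(reflex):
    """List the changes a language must have undergone to show `reflex`."""
    steps = CHAIN.index(reflex)
    return [f"{CHAIN[i]} > {CHAIN[i + 1]}" for i in range(steps)]

def subgroups(reflexes):
    """Map each innovation to the set of languages that share it."""
    groups = {}
    for lang, reflex in reflexes.items():
        for change in innovations(reflex):
            groups.setdefault(change, set()).add(lang)
    return groups

groups = subgroups(REFLEXES)
for change, langs in groups.items():
    print(change, sorted(langs))
```

L1 and L2 appear in no group: their shared p is a retention, not an innovation, so it gives no evidence that they form a subgroup. L3, L4 and L5 are grouped by p > f, with L5 innovating further.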
The establishment of Indo-European
establishment of Indo-European is often attributed to William Jones’s third discourse to the Royal Asiatic Society of Bengal (February 1786):
The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either; yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three without believing them to have sprung from some common source, which, perhaps, no longer exists. There is a similar reason, though not quite so forcible, for supposing that both the Gothic and Celtick, though blended with a very different idiom, had the same origin with the Sanscrit; and the old Persian might be added to the same family…
(Sir William Jones, 1746–94)
The establishment of Indo-European
connections between Indo-European languages had been observed long before Jones
the relationship between Sanskrit and other Indo-European languages had been observed before Jones
Jones did not use the comparative method
Jones interpreted linguistic relationship as part of ethnic / genealogical relationships in a biblical framework, the history of the human races
Jones made a number of errors that were not made by everyone at the time:
Of these cursory observations on the Hindus … this is the result; that they had an immemorial affinity with the old Persians, Ethiopians, and the Egyptians; the Phenicians, Greeks, and Tuscans [= Etruscans]; the Scythians or Goths, and Celts; the Chinese, Japanese, and Peruvians; whence, as no reason appears for believing that they were a colony from any one of those nations, or any of those nations from them, we may fairly conclude that they all proceeded from some central country, to investigate which will be the object of my future Discourses. (Third discourse to the Royal Asiatic Society of Bengal, 1786)
The establishment of Indo-European
early attempts at language classification in the sixteenth century saw all languages as descended from Hebrew
even in the sixteenth century, many of the branches of Indo-European were recognised (e.g. by Konrad Gesner in 1555)
early attempts at etymology allowed free substitution of sounds (cf. Voltaire’s alleged remark that ‘etymology is a science in which the vowels count for nothing and the consonants for very little’)
an improvement is the notion that certain sound changes recur in different languages
The establishment of Indo-European
the idea that various European languages might go back to a common ancestor originates with the ‘Scythian hypothesis’, first propounded by Johannes Goropius Becanus (Jan van Gorp van der Beke) in 1569 and later associated with Claude Saumaise (1588–1653) (Indo-Scythian), Marc Boxhorn (1602–53) and Andreas Jäger (Scytho-Celtic):
An ancient language, once spoken in the distant past in the area of the Caucasus mountains and spreading by waves of migration throughout Europe and Asia, had itself ceased to be spoken and had left no linguistic monuments behind, but had as a ‘mother’ generated a host of ‘daughter languages’, many of which in turn had become ‘mothers’ to further ‘daughters’. (For a language tends to develop dialects, and these dialects in the course of time become independent, mutually unintelligible languages.) Descendants of the ancestral languages include Persian, Greek, Italic (whence Latin and in time the modern Romance tongues), the Slavonic languages, Celtic, and finally Gothic and the other Germanic tongues. (Andreas Jäger, De lingua vetustissima Europae, Scytho-Celtica et Gothica, 1686)
The establishment of Indo-European
the discovery of Sanskrit added weight to such investigations. Thomas Stephens had noted similarities between Indian languages and Latin and Greek in 1583 (but he meant typological similarity and not common historical origin):
The languages of these regions are very numerous. They have a not unpleasing pronunciation and a structure similar to Latin and Greek.
The idea of common origin was suggested by Gaston Laurent Coeurdoux in 1767 (but he meant borrowing). Such ideas were commonplace in the mid 18th century.
Edward Lhuyd (1707) was perhaps the first to identify sound correspondences and understand that regularity of sound correspondences was proof of common origin e.g. he identified that Greek, Romance, Celtic /k/ corresponds to Germanic /h/ (German hundert ‘hundred’: Latin centum, German Hund ‘dog (hound)’ : Latin canis, German Hals ‘neck’ : Latin collum = Grimm’s Law).