Voices from the past Prof. John Coleman Phonetics Laboratory University of Oxford @sounds_ancient British Science Festival, Bradford, 7/9/15


What would historical linguistics be like if we worked with (audio) sounds instead of letters?

Could we bring back to life the sounds of dead languages?

Part 1: An old story

This old story starts in 1786, with Sir William Jones's Third Anniversary Discourse to the [Bengal Asiatic] Society in Calcutta (in print from 1788)

Sanskrit

In the 17th-18th C., the ancient language of Hindu religious texts such as the Rig Veda became familiar to Western scholars

Sir William Jones, The Third Anniversary Discourse
Bengal Asiatic Society, Calcutta, 2nd Feb 1786

"The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin ... yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists"

Jones can't help being struck by the similarities between the languages of India and the oldest languages of Europe (especially the similarities between Sanskrit, Greek and Latin)

Digits       2          3          4          5          6          7          8          9          10
Greek        dúō        treîs      téssares   pénte      heks       heptá      oktṓ       ennéa      déka
Latin        duo        trēs       quattuor   quīnque    sex        septem     octō       novem      decem
Sanskrit     dvā        tráyas     chatvāras  páñcha     shash      saptá      ashtā      náva       dásha

Sir William Jones went on:

"There is similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic, though blended with a very different idiom, had the same origin with the Sanscrit; and the old Persian might be added to the same family"

Gothic

The old Germanic language of the Visigoths, mostly known from 6th century copies of a 4th century translation of the Bible, from Crimea

Digits       2          3          4          5          6          7          8          9          10
Greek        dúō        treîs      téssares   pénte      heks       heptá      oktṓ       ennéa      déka
Latin        duo        trēs       quattuor   quīnque    sex        septem     octō       novem      decem
Sanskrit     dvā        tráyas     chatvāras  páñcha     shash      saptá      ashtā      náva       dásha
Gothic       twai       threis     fidwōr     fimf       saíhs      sibun      ahtau      niun       taíhun

Celtic

Irish, Scottish Gaelic, Welsh, Breton, Manx

Gaulish (spoken in Gaul, in what is now France, at the time of the Roman Empire)

Book of Leinster, 1160, Trinity College Dublin

Digits       2          3          4          5          6          7          8          9          10
Greek        dúō        treîs      téssares   pénte      heks       heptá      oktṓ       ennéa      déka
Latin        duo        trēs       quattuor   quīnque    sex        septem     octō       novem      decem
Sanskrit     dvā        tráyas     chatvāras  páñcha     shash      saptá      ashtā      náva       dásha
Gothic       twai       threis     fidwōr     fimf       saíhs      sibun      ahtau      niun       taíhun
Irish        dó         trí        ceathair   cuig       sé         seacht     ocht       naoi       deich

Old Persian

Ancient Iranian language, known from (a) the Avesta, the Zoroastrian religious texts, and (b) ancient monumental inscriptions

Inscription of Darius I at Behistun

Babylonian, Elamite and Old Persian

Digits       2          3          4          5          6          7          8          9          10
Greek        dúō        treîs      téssares   pénte      heks       heptá      oktṓ       ennéa      déka
Latin        duo        trēs       quattuor   quīnque    sex        septem     octō       novem      decem
Sanskrit     dvā        tráyas     chatvāras  páñcha     shash      saptá      ashtā      náva       dásha
Gothic       twai       threis     fidwōr     fimf       saíhs      sibun      ahtau      niun       taíhun
Irish        dó         trí        ceathair   cuig       sé         seacht     ocht       naoi       deich
Old Persian  dva        thray      chathwār   pancha     xshvash    hapta      ashtma     nava       dasa

You can see that Old Persian is particularly similar to Sanskrit. For this reason they are believed to descend from a common origin.

Digits       2          3          4          5          6          7          8          9          10
Sanskrit     dvā        tráyas     chatvāras  páñcha     shash      saptá      ashtā      náva       dásha
Greek        dúō        treîs      téssares   pénte      heks       heptá      oktṓ       ennéa      déka
Latin        duo        trēs       quattuor   quīnque    sex        septem     octō       novem      decem
Gothic       twai       threis     fidwōr     fimf       saíhs      sibun      ahtau      niun       taíhun
Irish        dó         trí        ceathair   cuig       sé         seacht     ocht       naoi       deich
Old Persian  dva        thray      chathwār   pancha     xshvash    hapta      ashtma     nava       dasa

Regular similarities → Shared ancestry
Dissimilarities → Historical divergence

The Family Tree Model of Language History

August Schleicher (1860) Deutsche Sprache

Regular similarities → Shared ancestry
Dissimilarities → Historical divergence

Main first languages of Pakistan

   Language  2008 estimate
1  Punjabi   76,367,360    44.17%
2  Pashto    26,692,890    15.44%
3  Sindhi    24,410,910    14.12%
4  Saraiki   18,019,610    10.42%
5  Urdu      13,120,540     7.59%
6  Balochi    6,204,540     3.59%
Source: Pakistan Bureau of Statistics

Well, here we are in Bradford, where 20% of the population is of Pakistani heritage, according to the 2011 census

Main first languages of Pakistan

   Language  2008 estimate            Related to
1  Punjabi   76,367,360    44.17%     Sanskrit
2  Pashto    26,692,890    15.44%     Old Persian
3  Sindhi    24,410,910    14.12%     Sanskrit
4  Saraiki   18,019,610    10.42%     Sanskrit
5  Urdu      13,120,540     7.59%     Sanskrit (and Arabic)
6  Balochi    6,204,540     3.59%     Old Persian

Most languages of Pakistan are related to, that is descended from, either Sanskrit or Old Persian

Digits       1          2          3          4          5          6          7          8          9          10
Indo-Aryan:
Sanskrit     ékas       dvā        tráyas     chatvāras  páñcha     shash      saptá      ashtā      náva       dásha
Urdu-Hindi   ek         do         tin        char       panch      che        saht       aht        nau        das
Sindhi       hik'       buh        te         char       panj       cheh       saptuh     aktuh      navuh      dh
Punjabi      ik         do         tin        char       panj       che        sat        ath        no         das
Iranian:
Old Persian  ava        dva        thray      chathwār   pancha     xshvash    hapta      ashtma     nava       dasa
Balochi      yak        do         se         char       panch      shash      aft        hasht      no         dah
Pashto       yau        dwa        dhre       tsalor     pinz       shpag      uw         at         na         las

NB these spellings are approximate!

Play audio recordings of one to ten

Similarities to European languages

2. Balochi do : Irish do, Spanish dos, French deux; Pashto dwa : Russian (Polish/Serbian/Croatian) dva, Italian due, English (from Latin) duo, dual, Scots/Old English twa
3. Pashto dre : Hessian German drei, Italian tre, English three, (from Greek) tri(pod)
4. char : Irish ceathair, Croatian chetiri, Belarusian chatyru
5. Urdu panch : Polish pięć (pyench), Russian pyach, English (from Greek) penta(gon)

NB these spellings are approximate!

Balochi do/Irish do

Similarities to European languages

6. Balochi shash : Polish sześć (sheshch), Russian shest; che : Irish se (shey)
7. Sindhi saptuh : Romanian sapta, Latvian septini, English (from Latin) Sept(ember), French sept; Balochi aft : Greek efta
8. hasht, ata, aht, aktuh : Lithuanian ashtuoni, Icelandic ahta, German acht, English (from Latin and Greek) October, octopus
9. no, nau, navuh : Catalan nou, Provencal no, French neuf, Breton nav, Welsh nau, English (from Latin) November
10. das : Northern Portuguese des, French dix, English (from Latin) December

NB these spellings are approximate!

In fact, because Sanskrit and Old Persian are related to European language families, the words in languages of Pakistan are related to, and sometimes turn out to be quite similar to, words in European languages

Why?

Some hypotheses:

Coincidence?

Borrowing?

Common origins?

Accidental coincidence?

This would imply that only a few words were similar in each pair of languages. But as we have seen, this is not the case: many of the same patterns of correspondences are found time and time again. The digits are just 10 examples, but we could present many more.

Borrowing?

This does happen. Urdu, for example, has borrowed a lot from Arabic, and many Iranian languages borrowed certain words from Turkish. But the languages of Pakistan, India and Europe are far more similar to one another than they are to Arabic, Turkish, or Dravidian languages

Borrowing?

         1           2           3           4           5           6           7           8           9           10
Arabic   wahid       itnan       talatah     'arba`ah    khamsah     sittah      sab`ah      tamaniyyah  tis`ah      `asharah
Turkish  bir         iki         üch         dört        besh        altuh       yedi        sekiz       dokuz       on
Tamil    onru        irantu      munru       nanku       aintu       aru         elu         ettu        onpatu      paththi
Urdu     ek          do          tin         char        panch       che         sat         aat         nau         das

Arabic sittah and sab`ah are similar to Indo-European words for six and seven, but none of the other numbers are. So that's probably coincidence.

Turkish iki and sekiz look a bit like Indo-Iranian words for 1 and 6, but they are 2 and 8!

Tamil onru and ettu look a bit like English one and eight, but that's as far as it goes.

So no, the similarities between languages of Pakistan and Western Europe are not because of borrowing

Common origins?

d in languages of Pakistan often corresponds to t in English (Grimm's law):

Balochi do : English two
Balochi draxt : English tree

Balochi dohtir : English daughter
Urdu, Punjabi dena : English donate

Regular similarities → Shared ancestry
Dissimilarities → Historical divergence

The similarities in sound between languages of Pakistan, India and Europe are not just accidental, and not idiosyncratic (word-by-word), but more general; there are regularities to the similarities

Common origins?

p and t in languages of Pakistan (and Latin and Greek) correspond to f and th in English (Grimm's law again):

Urdu panch : English five, penta(gon)
Urdu pura : English full, (re)pl(enish)
Balochi phadh : English foot, pace, (tri)pod

Regular similarities → Shared ancestry
Dissimilarities → Historical divergence

Common origins?

zn, jn in languages of Pakistan (and Slavonic languages!) correspond to kn in English spelling and gn in Latin/Greek:

Urdu janna : English know, (co)gn(itive); Polish/Russian etc. znaty

Balochi zanuk: English knee, genu(flect)
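What "regular" means in the correspondences above can be sketched in a few lines of Python. This is only an illustration: the rule table and the helper name fits_rule are made up for this example, the check looks at nothing but the first letter of each word, and the word pairs are copied from the slides in their approximate spellings.

```python
# Initial-consonant correspondences named on the slides
# (left: a language of Pakistan, right: English). Hypothetical rule table.
RULES = {"d": "t", "p": "f", "j": "k", "z": "k"}

# Word pairs copied from the slides (approximate spellings).
PAIRS = [
    ("do", "two"), ("draxt", "tree"),
    ("panch", "five"), ("pura", "full"), ("phadh", "foot"),
    ("janna", "know"), ("zanuk", "knee"),
]

def fits_rule(source_word, english_word, rules=RULES):
    """True if the English word starts with the consonant the rule predicts."""
    expected = rules.get(source_word[0])
    return expected is not None and english_word.startswith(expected)

matches = [pair for pair in PAIRS if fits_rule(*pair)]
print(f"{len(matches)} of {len(PAIRS)} pairs follow the rules")
```

A real comparison would work on sounds rather than spellings and on far more words; the point is only that one small rule table covers many pairs at once, which is hard to explain by chance.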

Regular similarities → Shared ancestry
Dissimilarities → Historical divergence

Since the correspondences between one language and another are regular, we infer that the similar features are due to a common ancestor, and the differences are due to divergence over time from the ancestor.

Proto-Indo-European: our common ancestor language

Words in Indo-European languages are similar because they originate from a common, prehistoric culture

Proto-Indo-European: our common ancestor language

Pashto duwa from dwoH, Balochi doo from dwoH, Sindhi buh from dwoH, and English two from dwoH

Languages spread, not necessarily people

Languages spreading does not imply vast movements of populations. Mr Targett and his ancestors have lived in Cheddar, Somerset, for at least 9000 years. They have seen the arrival and demise of British Celtic, Latin, Anglo-Saxon, Danish, and Norman French, and they spoke some Pre-Indo-European language before that.

How do we know how words of ancient languages were pronounced?

Marius Victorinus: Q differs from c by having lip protrusion
quarum utramque exprimi faucibus, alteram distento, alteram producto rictu manifestum est.

Gellius: ng and nc had velar [ŋ], not [n]
inter litteram n et g est alia uis, ut in nomine anguis et angari et ancorae et increpat et incurrit et ingenuus. In omnibus his non uerum n, sed adulterinum ponitur. nam n non esse lingua indicio est; nam si ea littera esset, lingua palatum tangeret.

How do we know how words of ancient languages were pronounced, anyway? Because as long as there has been writing, scholars have been writing about language, grammar and pronunciation.

How do we know how words of ancient languages were pronounced?

Marius Victorinus: Q differs from c by having lip protrusion
They are both produced in the throat, the one with protrusion, the other pronounced rightly

Gellius: ng and nc had velar [ŋ], not [n]
there is another sound of the letter n with g, as in the nouns anguis and angari and increpat and incurrit and ingenuus. In all these it is not a true n, but is modified. This n is not made by the tongue tip; because of the following letter, the tongue touches the palate.

PART 2. Why does the sound of words change/diverge over time?

A game of Chinese whispers

ashfuf, bidzar

Language transmission across the generations is like Chinese whispers

Other factors of variation: people move away from home, meet different people, kids are born in a different place

Variation + generations leads to dialects + division of languages

Dialects are patterns of speaking that are different but still mutually understandable
e.g. you say eether, I say ayther
You say baahth, I say bath

When dialects become so different that they are not mutually understandable, we call them languages
e.g. you say se, I say three. Even though they are both from a common historical origin, *treyes, they've changed so much that it's not at all obvious, and we cannot understand each other without learning the other's language

PART 3. Modelling sound change using speech synthesis

First, what are sounds?

Sound is waves of variation in the air pressure (vibrations)

The vibrations must be at a rate that our ears can detect (between about 50 Hz and 18,000 Hz)

We can detect, measure and record sound waves using a microphone

What does a microphone do?

A microphone converts variations in air pressure to (similar) variations in voltage

[Figure: voltage plotted against time]

Digitization: turning sound into numbers

The computer's sound card contains an analogue-to-digital converter, which measures the voltage tens of thousands of times per second and stores the measurements as a (very long) list of numbers
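As a rough sketch of that measuring step, the Python below samples a made-up voltage signal and rounds each reading to a whole number. The sample rate, the 16-bit range, the 440 Hz test tone and the function names are all assumptions chosen for the illustration, not details of any particular sound card.

```python
import math

SAMPLE_RATE = 16_000   # measurements per second (assumed)
FREQ_HZ = 440.0        # pitch of the stand-in test tone (assumed)
DURATION_S = 0.01      # 10 ms, to keep the list short

def microphone_voltage(t):
    """Stand-in for the microphone signal: a pure tone between -1 and +1."""
    return math.sin(2 * math.pi * FREQ_HZ * t)

def digitize(voltage, duration_s, sample_rate, bits=16):
    """Measure the voltage at regular instants and round to integers."""
    full_scale = 2 ** (bits - 1) - 1           # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate)
    return [round(voltage(n / sample_rate) * full_scale)
            for n in range(n_samples)]

samples = digitize(microphone_voltage, DURATION_S, SAMPLE_RATE)
print(len(samples), "numbers for", DURATION_S, "seconds of sound")
print(samples[:5])
```

At this assumed rate, one second of sound becomes 16,000 numbers, which is why the list is "very long".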

Digitization: turning sound into numbers

This vector is a sound file, e.g. a wav file

Some things we can do with numbers

Addition: sound 1 + sound 2 = mixing
Subtraction: (noise + speech) − noise = speech (noise cancellation)
Multiplication: 2 × sound = louder (amplification)
Division: sound ÷ 2 = quieter (attenuation)
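The four operations can be tried directly on lists of samples. The sketch below uses two pure tones as stand-ins for "speech" and "noise"; the function names, frequencies and sample rate are assumptions made for the example.

```python
import math

def tone(freq_hz, n_samples=160, sample_rate=16_000):
    """A short pure tone, used here as a stand-in for a recorded sound."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def mix(a, b):
    """Addition: add the two sounds sample by sample."""
    return [x + y for x, y in zip(a, b)]

def subtract(a, b):
    """Subtraction: remove sound b from sound a sample by sample."""
    return [x - y for x, y in zip(a, b)]

def scale(a, factor):
    """Multiplication or division: factor > 1 is louder, factor < 1 quieter."""
    return [factor * x for x in a]

speech = tone(200.0)     # stand-in for speech
noise = tone(3000.0)     # stand-in for background noise

noisy = mix(speech, noise)
cleaned = subtract(noisy, noise)   # works because the noise is known exactly
louder = scale(speech, 2)
quieter = scale(speech, 0.5)

print(max(abs(c - s) for c, s in zip(cleaned, speech)))  # tiny rounding error
```

Subtraction recovers the speech here only because the noise samples are known exactly; cancelling real noise means estimating it first.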

Some things we can do with numbers

More ambitious: can we make a sound that is half-way between two other sounds? What is the average of [u] and []? i.e. (sound 1 + sound 2) ÷ 2

Is it []?

No: to get something in between two sounds, we need to convert the 1-dimensional sound waves into 2-D surfaces: spectrograms
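A small experiment shows why averaging the raw waves does not give an in-between sound. Two pure tones stand in for the two sounds (a big simplification of real vowels), and strength_at is a hypothetical helper that measures how much of one frequency a signal contains. All numbers are assumptions chosen so the arithmetic comes out cleanly.

```python
import cmath
import math

SAMPLE_RATE = 16_000   # samples per second (assumed)
N = 160                # 10 ms of sound, so 100 Hz steps in frequency

def tone(freq_hz):
    """A pure tone standing in for one of the two sounds."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(N)]

def strength_at(samples, freq_hz):
    """How strongly freq_hz is present (1.0 = a full-scale pure tone)."""
    k = round(freq_hz * len(samples) / SAMPLE_RATE)
    total = sum(x * cmath.exp(-2j * math.pi * k * n / len(samples))
                for n, x in enumerate(samples))
    return abs(total) / (len(samples) / 2)

low, high = tone(300.0), tone(900.0)
average = [(a + b) / 2 for a, b in zip(low, high)]   # (sound 1 + sound 2) / 2

for freq in (300, 600, 900):
    print(freq, "Hz:", round(strength_at(average, freq), 3))
```

The averaged wave still contains 300 Hz and 900 Hz, each at half strength, and essentially nothing at 600 Hz: sample-by-sample averaging is just quiet mixing, not a sound half-way between the two.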

Spectrograms: sounds as surfaces

Averaging spectrograms

[Figure: (spectrogram 1 + spectrogram 2) ÷ 2 = averaged spectrogram]

This surface is half-way between the other two
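The averaging step itself is simple once each sound is a surface, that is, a grid of numbers indexed by time and frequency. The sketch below is a minimal illustration under stated assumptions: it swaps in a plain Fourier-based magnitude spectrogram of two pure tones, which may well differ from how the talk's own figures were computed, and the names spectrogram and blend are made up for the example.

```python
import cmath
import math

SAMPLE_RATE = 16_000
FRAME_LEN = 160   # 10 ms frames, frequency bins 100 Hz apart (assumed values)

def tone(freq_hz, n_samples):
    """A pure tone used as a stand-in for a recorded sound."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]

def spectrogram(samples, frame_len=FRAME_LEN):
    """Cut the wave into frames; return magnitudes as rows (time x frequency)."""
    rows = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        row = []
        for k in range(frame_len // 2 + 1):
            total = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            row.append(abs(total) / (frame_len / 2))
        rows.append(row)
    return rows

def blend(spec1, spec2, share1):
    """Point-by-point mix: share1 of spec1 plus (1 - share1) of spec2."""
    return [[share1 * a + (1 - share1) * b for a, b in zip(row1, row2)]
            for row1, row2 in zip(spec1, spec2)]

spec1 = spectrogram(tone(300.0, 3 * FRAME_LEN))
spec2 = spectrogram(tone(900.0, 3 * FRAME_LEN))
halfway = blend(spec1, spec2, 0.5)    # (spectrogram 1 + spectrogram 2) / 2

print(len(halfway), "frames x", len(halfway[0]), "frequency bins")
print(round(spec1[0][3], 3), round(spec2[0][3], 3), round(halfway[0][3], 3))
```

Every point of the blended surface lies half-way between the corresponding points of the two inputs, and other values of share1 give surfaces nearer to one input or the other. Turning a blended surface back into audible sound is a separate synthesis step that this sketch does not attempt.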

Sound change continua

[Figure: Sound 1 and Sound 2 are each converted to an LPC spectrogram (spectrogram 1 and spectrogram 2), with a series of interpolated LPC spectrograms between them]

100% spec 1 + 0% spec 2
90% spec 1 + 10% spec 2
50% spec 1 + 50% spec 2
0% spec 1 + 100% spec 2

Generation of Synthetic Speech, D. Moore and J. Coleman, Eur. Patent 1,504,443

Voices from the past

one comes from Proto-Indo-European oinos, about 6000 years ago (like S. German oins)

three comes from PIE treyes (like Spanish tres)

five (like Urdu panch) comes from PIE penkwe (like Lithuanian penki)

eight comes from PIE hokto (like Greek okto)

ten comes from PIE dekmt (like Lithuanian deshmt)

We previously heard how English two, Pashto duwa, Balochi doo and Sindhi buh all come from Proto-Indo-European dwoH, about 6000 years ago