edinburgh research explorer · p t k b d g house & fairbanks 1953 127.9 127.1 127.2 120.9 120.6...

2
Edinburgh Research Explorer On the mechanics of tonogenesis Citation for published version: Kirby, J & Ladd, R 2014, 'On the mechanics of tonogenesis: Evidence from prevoicing', 3rd Workshop on Sound Change, California, United States, 28/05/14 - 31/05/14. Link: Link to publication record in Edinburgh Research Explorer Document Version: Other version General rights Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorer content complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Download date: 30. May. 2020

Upload: others

Post on 27-May-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Edinburgh Research Explorer · p t k b d g House & Fairbanks 1953 127.9 127.1 127.2 120.9 120.6 122.8 Lehiste & Peterson 1961 175 176 176 165 163 163 Mohr 1968 130.7 129.8 131.1 125.1

Edinburgh Research Explorer

On the mechanics of tonogenesis

Citation for published version:Kirby, J & Ladd, R 2014, 'On the mechanics of tonogenesis: Evidence from prevoicing', 3rd Workshop onSound Change, California, United States, 28/05/14 - 31/05/14.

Link:Link to publication record in Edinburgh Research Explorer

Document Version:Other version

General rightsCopyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s)and / or other copyright owners and it is a condition of accessing these publications that users recognise andabide by the legal requirements associated with these rights.

Take down policyThe University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorercontent complies with UK legislation. If you believe that the public display of this file breaches copyright pleasecontact [email protected] providing details, and we will remove access to the work immediately andinvestigate your claim.

Download date: 30. May. 2020

Page 2: Edinburgh Research Explorer · p t k b d g House & Fairbanks 1953 127.9 127.1 127.2 120.9 120.6 122.8 Lehiste & Peterson 1961 175 176 176 165 163 163 Mohr 1968 130.7 129.8 131.1 125.1

A COARTICULATORY PATH TO SOUND CHANGE 789

when followed by an oral stop, but the nasal is largely absent and the vowel more fullynasalized when followed by a voiceless fricative (Hattori et al. 1958).

2.1. COVARIATION LINKED TO CODA VOICING. Evidence of covariation between vowelnasalization and nasal consonant duration suggests that speakers might produce aroughly constant-sized nasal gesture across VNC contexts, but variably align that ges-ture relative to the oral articulators due to the influences of C. The upper panel ofFigure 1 schematically represents overlapping vowel, nasal (lowered velum), and oral-cavity closure gestures for VNCvoiced, superimposed over the acoustic waveform of atoken of [bεnd] with partial vowel nasalization. (The dashed vertical lines show acousticsegmentation of V, N, and C.) If, as in the lower panel, timing remains constant inVNCvoiceless except for earlier onset of the (same-sized) lowered velum gesture, theoutcome would be temporally more extensive vowel nasalization (shown by greateroverlap of the lowered velum gesture with the vowel gesture), a shorter acoustic nasalconsonant (b–c), and a longer postnasal oral constriction (c–d).

vowel lowered velum

oral closure

V N voicedC

V Nvoiceless

C

same-sized velum gesture initiated earlier

a b c d

FIGURE 1. Schematic representation of the consequences for vowel nasalization, the nasal consonant,and the postnasal oral constriction if the velum gesture is initiated earlier in voiceless (bottom)

than in voiced (top) contexts. Dashed lines indicate acoustic segmentation.

The hypothesis of variable temporal alignment of an otherwise (roughly) stable velumgesture was tested for voiced and voiceless obstruent contexts in English. Becausethe pattern of more extensively nasalized vowels cooccurring with shorter nasal conso-nants has already been reported in the literature on English and other languages, thecurrent focused, small-scale experiment explored this pattern in depth for a singlevowel context. Nine native American English-speaking members of the University ofMichigan community read a randomized word list consisting of multiple repetitions ofC(C)V(N)C words where V ! /ε/, N ! /n/, and coda C ! one of /t d s z/ (e.g. bet,bent, bed, bend). Data from only six speakers, however, are reported because three

•  If onset F0 is under speaker control, it might be in a trading relation with VOT (Repp, 1982)

Your  text  would  go  here.     Your  text  would  go  here.    

Your  text  would  go  here.    

Your  text  would  go  here.    

Linguis3cs  &  English  Language,  University  of  Edinburgh  James  Kirby  &  D.  Robert  Ladd  On  the  mechanics  of  tonogenesis:  Evidence  from  prevoicing  

INTRODUCTION  

held constant at 120 Hz and 45 dB, respectively. For all tokens, the burst had a dura-tion of 4 ms with 4 frication formants, each with a frequency of 300 Hz and 100 Hzbandwidth. Onset f0 began at 90, 110, 130, or 150 Hz and converged to 120 Hz overthe next 50 ms. Between the burst and the onset of the vowel, aspiration amplitudewas zero for prevoiced tokens and linearly correlated with positive VOT values.

2.3.2 Task

Stimuli were presented using MATLAB 7.10 (MathWorks, 2010) via a Soundblaster Live!soundcard on a Dell Optiplex/Windows XP computer through headphones (SennheiserHD 280 pro) at a comfortable listening level.

Participants heard one token at a time, using the MATLAB interface to click onone of two buttons labeled “BA” and “PA” to indicate what they heard. After eachresponse, there was a 400 ms pause before the beginning of the next trial. On each trialthe mouse pointer was re- centered between the two response buttons so as not to biasresponses. Participants were not limited in response time, but were instructed torespond as quickly and accurately as possible. The left–right order of the response but-tons was counterbalanced. There were 11 blocks of 40 tokens, for a total of 440 totaltrials.

3. Analysis and results

The relative weighting of the 2 acoustic cues in production was computed using dis-criminant analysis for each of the 25 participants, providing standardized canonicalcoefficients such that a larger coefficient denotes a stronger weighting of a variable.Results [Fig. 1(a)], showed a significant negative Pearson’s product moment correla-tion, r (23)¼"0.42, p¼ 0.037, between VOT and onset f0. Discriminant analysis waschosen over logistic regression, which would otherwise have afforded a more direct

Fig. 1. Scatter plots illustrating comparisons discussed in text. Dark line indicates linear regression line. (a) Pro-duction: Total-sample canonical coefficients across VOT and onset f0 for each participant, r (23)¼"0.42,p¼ 0.037. (b) Perception: b weights across VOT and onset f0 for each participant, r (23)¼ 0.36, p¼ 0.076. (c)VOT: Weightings across perception and production for each participant, r (23)¼ 0.1, p¼ 0.623. (d) Onset f0:Weightings across perception and production for each participant, r (23)¼"0.34, p¼ 0.101.

Shultz et al.: JASA Express Letters [http://dx.doi.org/10.1121/1.4736711] Published Online 10 July 2012

EL98 J. Acoust. Soc. Am. 132 (2), August 2012 Shultz et al.: Voicing cues in perception and production

Downloaded 11 Jul 2012 to 128.210.208.105. Redistribution subject to ASA license or copyright; see http://asadl.org/journals/doc/ASALIB-home/info/terms.jsp

•  Shultz et al. (2012) present evidence of just such a relation between VOT and F0 for (phonetically voiceless) English stops

•  Replicated (by us) in pilot

PHONETIC EXPLANATIONS FOR THE DEVELOPMENT OF TONES

2.2. THE 'SEEDS' OF THE SOUND CHANGE: PHONETIC DATA. Phonetic studies by House & Fairbanks 1953, Lehiste & Peterson 1961, Mohr 1968, Lea 1973, and Lofqvist 1975-among others-show how a voicing distinction in prevocalic position can affect the Fo of the following vowel. Some of the data from these studies are summarized in Table 1.

p t k b d g House & Fairbanks 1953 127.9 127.1 127.2 120.9 120.6 122.8 Lehiste & Peterson 1961 175 176 176 165 163 163 Mohr 1968 130.7 129.8 131.1 125.1 124.8 125 TABLE 1. Fundamental frequency (in Hz) of vowels as a function of the preceding consonant,

as determined by three studies.

Although the number of subjects and the methods used to measure and average the data differ in these studies, it is clear that the Fo of a vowel is higher after voiceless (aspirated) than after voiced stops, and that it does not vary in any con- sistent way as a function of the place of articulation of the stops. Unfortunately, these data give only an averaged or peak value for Fo, making it impossible to deduce the time course of the perturbation caused by the preceding consonant.

Data gathered by Hombert 1975a remedy this. He recorded five adult male American English speakers' pronunciation of the sentence 'Say C[i] again', where C= [p t k b d g] (and for three of the speakers, [w m] as well). Each token was repeated ten times. For each token he sampled Fo at 20 ms intervals from vowel onset to 100 ms after vowel onset. The results are given in Figures 1-2. Fig. 1 shows the frequency-normalized Fo curves on the vowels following the voiced and voice-

Fo 140Hz_

130 - p

^ b

120 /-

I I I I time 20 100ms

FIGURE 1. Average fundamental frequency values (in Hz) of vowels following English stops (data from five speakers). The curves labeled [p] and [b] represent the values associated with all voiceless and voiced stops, respectively-regardless of place of articulation. The zero point on the abscissa represents the moment of voice onset; with respect to stop release, this occurs later in real time in voiceless aspirated stops.

39

MATERIALS  &  METHODS  •  5 female speakers each of French & Italian

•  3 repetitions each of 11 (near-) minimal triplets: two bilabial target items (e.g. balla ~ palla, boule ~ poule) and one rhyming distractor (e.g. stalla, foule) in initial and medial positions

•  Vowels were mainly /a/, with a few /ɛ/s and (for French) /u/s

•  Positive correlation between duration of voicing lead and onset F0 in both languages; onset F0 for [+voice] lower in medial context

•  F0 ~ VOT + voicing + context + (VOT:voicing) + (VOT:context) + (1+VOT|subject) + (1|item)!

!!

italian, voiced italian, voiceless

160

180

200

220

-150 -100 -50 10 20 30 40

vot

f0

context init med

DISCUSSION  

No evidence for trading relation: speakers tend to use more low F0 as voicing lead time increases

french, voiced french, voiceless

140

160

180

200

220

240

-150 -100 -50 50 100 150

vot

f0

•  Implicated in emergence and evolution of tone systems

•  What factor(s) might contribute to exaggeration of this effect?

•  Kingston & Diehl (1994): onset F0 is actively lowered to enhance the perception of the [voice] contrast

(Hombert et al., 1979)

REFERENCES  Beddor,  P.  S.  2009.  A  coar3culatory  path  to  sound  change.  Language    85(4):  785-­‐821.  /  Hombert.,  J.-­‐M.,  Ohala,  J.  J.,  Ewan,  W.G.  1979.  Phone3c  explana3ons  for  the  development  of  tones.  Language  55(1):37-­‐58.  /  Honda,  K.  2004.  Physiological  factors  causing  tonal  characteris3cs  of  speech:  from  global  to  local  prosody.  In  Proc.  Speech  Prosody  2004,  739-­‐744.  /  Kent,  R.D.,  Moll,  K.  L.  1969.  Vocal  tract  characteris3cs  of  the  stop  consonants.  JASA  59:1548-­‐1555.  /  Kingston,  J.,  Diehl,  R.  1994.  Phone3c  knowledge.  Language  70(3):  419-­‐454.  /  Repp,  B.  1982.  Phone3c  trading  rela3ons  and  context  effects:  new  experimental  evidence  for  a  speech  mode  of  percep3on.  Psychological  Bulle9n    92(1):  81-­‐110.  /  Shultz,  A.A.,  Francis,  A.  L.,  Llanos,  F.  2012.  Differen3al  cue  weigh3ng  in  percep3on  and  produc3on  of  consonant  voicing.  JASA  Express  Le<ers  132(2):  EL95-­‐EL101.  /  Solé,  M.-­‐J.  2014.  The  percep3on  of  voice-­‐ini3a3ng  gestures.  Laboratory  Phonology  5(1):37-­‐68.  

RESULTS  

•  Durational measures: voicing onset, burst onset/offset, duration of following vowel

•  Spectral measures: F0 at seven timepoints starting 10ms after vowel onset (start of f)

•  If informativity of F0 increases as that of VOT decreases, listeners might misattribute the source of the effect (Beddor, 2009; Solé, 2014)

!"#$%&%'( !)#$*+'(

,+

,%

''

-

-,

,

-.

,/01203

1"/405"67

,7/801"5

52/32909

Figure 6: Vertical larynx movement and its effect on F0. As thelarynx is lowered, the cricoid cartilage moves downward keep-ing its posterior plate parallel to the curvature of the cervicalspine, and thus rotates to shorten the vocal folds.

of the larynx and hyoid bone are lower for voiced stops thanfor voiceless stops [38], and thus F0 is lower for voiced stops.This account seems reasonable because larynx lowering for avoiced stop allows glottal airflow during stop closure of the vo-cal tract [39]. Another account is derived from the activity ofthe CT muscle in voicing: the CT is used to apply a suddenstretch to the vocal folds to halt vocal fold vibration for voice-less stops [40], and thus F0 at voicing is higher after voicelessstops. Although this account proposes a reasonable mechanismfor F0 variation due to voicing, a different account is also pos-sible from the viewpoint of perceptual requirement: speakersintend to produce higher F0 after voiceless stops to realize theauditory cue of the sounds. Since the pars recta is recorded inmany EMG studies, this part of the CT may contribute to thelocal F0 pattern at voicing (but see Figure 4).

Intrinsic vowel F0, a language-universal phenomenon forhigh vowels to have higher F0, is known as a vocal manifes-tation of tongue-larynx interaction. This phenomenon has ledspeech researchers to search for its causal mechanisms [41]. Acontemporary view of the underlying mechanism is the biome-chanical coupling between the larynx and the supra-laryngealarticulators. The classical ”tongue pull” theory [42] has beenrefined by a few physiological studies [43] [29]. The tongue de-formation for high-vowel articulation produces forward move-ment of the hyoid bone, which applies a force to the cricothyroidjoint to stretch the vocal folds. Contrary to this biomechanicalexplanation, the intrinsic vowel F0 may be a deliberate produc-tion to enhance vowel identification [44]. As evidence, EMGstudies have revealed the contribution of CT activity to intrinsicvowel F0 [45] [46]. Although the vowel-dependent F0 varia-tion in speech is not large enough to interact with prosody, itshould provide an additional feature to complete vowel-specifictonal characteristics and contributes to the expansion of audi-tory vowel space by changing the F0-F1 relationship.

6. SummaryThis report described contemporary issues of physiologicalmechanisms of F0 control. Although experimental evidence sofar is not complete and needs to be accumulated by further stud-ies, the physiological mechanisms deriving the units of speechprosody may be summarized as follows.

Local fluctuations: The CT (pars recta?) contributes toF0 patterns of voicing, and the tongue-larynx interactioncauses vowel-dependent F0 variation.Lexical accent: The CT (pars obliqua?), along with otherintrinsic and strap muscles, determines F0 patterns oflexical accent.Phrasal pattern: The extrinsic laryngeal muscles in co-ordination with the respiratory muscles regulate F0 pat-terns in phrasal declination.

7. AcknowledgementThis research was supported as part of ”Research on HumanCommunication” with funding from the TelecommunicationsAdvancement Organization of Japan.

8. References[1] Fink, B.R.; Demarest, R.J., 1978. Laryngeal Biomechan-

ics. Cambridge, Mass: Harvard University Press.[2] Mayet, V.A.; Mundnich, K., 1958. Beitrag zur Anatomie

und zur Funktion des M. Cricothyroideus und derCricothyreoidgelenke. Acta Anat., 33, 273-288.

[3] Sonninen, A., 1968. The external frame function in thecontrol of pitch in the human voice. Ann. New York Acad.Sci., 68-90.

[4] Takano, S.; Honda, K.; Masaki, S.; Shimada, Y.; Fuji-moto, I., 2003. High-resolution imaging of vocal gestureusing a laryngeal MRI coil and a synchronized imagingmethod with external triggering. In Proc. Spring Meeting,Acoust. Soc. Jpn., 291-292. (in Japanese)

[5] Arnold, G.E., 1961. Physiology and pathology of thecricothyroid muscle. Laryngoscope, 71, 687-753.

[6] Hirano, M.; Ohala, J.; Vennard, W., 1969. The functionof laryngeal muscles in regulating fundamental frequencyand intensity of phonation. J. Speech and Hearing Re-search, 12, 616-628.

[7] Gay, T.; Strome, M.; Hirose, H.; Sawashima, M., 1972.Electromyography of the intrinsic laryngeal muscles. Ann.Otol. Rhinol. Laryngol., 81, 401-409.

[8] Atkinson, J.E., 1978. Correlation analysis of the phys-iological features controlling fundamental frequency. J.Acoust. Soc. Am., 63, 211-222.

[9] Shipp, T.; Doherty, E.T.; Morrissey, P., 1979. Predictingvocal frequency from selected physiological measures. J.Acoust. Soc. Am., 66, 678-684.

[10] Honda, K., 1983. Variability analysis of laryngeal mus-cle activities. In Vocal Fold Physiology: Biomechanics,Acoustics, and Phonatory Control, I. Titze; R. Scherer(eds.). Denver: The Denver Center for the PerformingArts, 286-297.

[11] Honda, K., 1988. Various laryngeal mechanisms in con-trolling the voice fundamental frequency. J. Acoust. Soc.Am., 84, Suppl. 1, S82.

•  Larynx and hyoid bone are consistently lower for voiced stops than for voiceless (Kent & Moll, 1969)

•  Accompanied by downward movement of the cricoid cartilage and slackening of the vocal folds (Honda, 2004)

(Honda 2004, Fig. 6)

(Beddor, 2009)

Does there exist a similar trading relation between voicing lead and F0?

F0 at 10ms after voicing onset by context

Sound  Change  In  Interac0ng  Human  Systems,  UC-­‐Berkeley,  May  28-­‐31  2014  

Consistent with primarily automatic account of onset F0 effects, e.g.:

(Shultz et al., 2012)

Obstruent-intrinsic F0 effect: F0 of vowels following voiced stops lower than F0 following voiceless stops

Italian

French

However, relationship between voicing and F0 still important for the emergence and evolution of tone

•  Maintaining prevoicing is aerodynamically challenging; lowering of larynx could prolong prevoicing while also lowering onset F0, consistent with present findings

•  If the low-frequency pulsing is not perceived, percept could instead be of a low- vs high-F0 [-voice] stop

Misattribution of coarticulatory source need not involve active enhancement or trading relation

Finding in opposite direction to predicted effect

0

6000

b f

v

Time (s)0.04688 0.6246

frc-init-basta-1