Dept. for Speech, Music and Hearing

Quarterly Progress and Status Report

Acoustical features of hard and soft Russian consonants in connected speech: A spectrographic study

Shupljakov, V. and Fant, G. and de Serpa-Leitao, A.

journal: STL-QPSR
volume: 9
number: 4
year: 1968
pages: 001-006

http://www.speech.kth.se/qpsr

http://www.speech.kth.se

STL-QPSR 4/1968 3.

Experimental procedure

Three female and two male speakers of Modern Standard Russian took part in the experiments. The subjects pronounced lists of symmetrical sequences of VCV (vowel-consonant-vowel) type. These lists contained all Russian vowels and consonants, in all 180 syllables.

This speech material was recorded on an Ampex magnetic tape recorder. Spectrograms were made with a Kay Electric Sona-Graph, type 4691, operated with a broad filter of 300 Hz and a high-frequency emphasis of 6 dB per octave above 1000 Hz.

Measurements were made of F2 and F3 within the consonant, sampling at places of extreme values. If these were obscured, measurements were taken at the very beginning of a transition from the consonant to the vowel. The F2 and F3 data were quantized to the nearest 100 Hz values for the purpose of constructing diagrams.
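The quantization step described above can be illustrated with a minimal Python sketch. The function names, the half-up tie rule for values exactly between two bins, and the sample readings are all assumptions made for illustration; the text only states that values were rounded to the nearest 100 Hz.

```python
from collections import Counter


def quantize_hz(freq_hz, step=100):
    """Round a formant frequency to the nearest multiple of `step` Hz.

    Ties (e.g. 1650 Hz) are rounded upward; this tie rule is an
    assumption, since the text does not say how ties were handled.
    """
    return int((freq_hz + step / 2) // step) * step


def histogram(freqs_hz, step=100):
    """Count how many readings fall into each quantized frequency bin."""
    return Counter(quantize_hz(f, step) for f in freqs_hz)


# Hypothetical F2 readings in Hz (not data from the study).
f2_readings = [1720, 1760, 1840, 1650, 2130]
bins = histogram(f2_readings)
```

Counting the quantized values per bin gives the kind of distribution that the diagrams of hard and soft consonants are built from.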

The direction of the transitions was observed.

Results and conclusions

It was soon verified that hypothesis (3) did not hold. F2-transitions from adjacent vowels to the soft consonant were generally positive, see Fig. I-A-1, but could also be neutral or slightly negative, see the word [æl'æ] of Fig. I-A-2, where F2 = 2100 Hz in [æ] as well as in the [l']. The sequence [ælæ] of Fig. I-A-2 exemplifies a typical negative transition from the F2 = 1900 Hz of the [æ] to the F2min = 830 Hz of the dark [l].

However, in [odo] and [aza], Fig. I-A-2, the transition from the vowel to the hard consonant is neutral or positive.

Although the direction of transition can be an important perceptual parameter, Stevens (1967), Chistovich (1968), it apparently does not provide a simple invariance rule for the soft/hard distinction.

The ceteris paribus hypothesis (1) was consistently found to hold. The F2 of a soft consonant is always higher than the F2 of a hard consonant in the same context and assuming the same speaker. The next step was to see whether there existed an absolute criterion in F2 independent of vowel context as formulated in hypothesis (2) above. Fig. I-A-3 and Fig. I-A-4 show the distributions for each of the five speakers. Apparently the F2 distributions of hard and soft consonants show very little overlap. Their crossover points occur at 1700 Hz for the two male speakers and 1900 Hz, 2000 Hz, and 2000 Hz in the distributions of the female speakers. The number of mistakes that would occur when using a single threshold value in F2, individually determined for each speaker, would be 1.9 % in the 360 utterances of the male subjects and 4 % in the 540 utterances of the female group. A similar tendency was found in the results from the experiments on the perception of hard and soft stationary fricative consonants, Shupljakov (1968).

Fig. I-A-1. Spectrogram of the syllable uh' u (frequency axis in kHz).

Fig. I-A-3. The frequency of the second formant in hard (solid lines) and soft (dashed lines) consonants, pronounced by two male subjects (M-n and B-r). Horizontal axis: frequency of the second formant in Hz.

Fig. I-A-4. The frequency of the second formant in hard (solid lines) and soft (dashed lines) consonants, pronounced by three different female subjects. Horizontal axis: frequency of the second formant in Hz.
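The single-threshold rule and its error count can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the convention that an F2 exactly at the threshold counts as soft, the tie-breaking toward the lowest candidate, and the sample measurements are hypothetical and not taken from the study.

```python
def classify(f2_hz, threshold_hz):
    """Label a consonant 'soft' if its F2 reaches the threshold, else 'hard'.

    Treating F2 == threshold as soft is an assumed convention.
    """
    return "soft" if f2_hz >= threshold_hz else "hard"


def count_errors(samples, threshold_hz):
    """Count (f2_hz, label) pairs that the threshold rule mislabels."""
    return sum(1 for f2, label in samples if classify(f2, threshold_hz) != label)


def best_threshold(samples, candidates):
    """Return the candidate with the fewest errors (lowest value wins ties)."""
    return min(candidates, key=lambda t: (count_errors(samples, t), t))


# Hypothetical measurements for one speaker (not data from the study).
samples = [
    (1200, "hard"), (1400, "hard"), (1600, "hard"), (1800, "hard"),
    (1700, "soft"), (1900, "soft"), (2100, "soft"), (2300, "soft"),
]
candidates = range(1000, 2600, 100)  # thresholds on the 100 Hz grid

threshold = best_threshold(samples, candidates)
errors = count_errors(samples, threshold)
error_rate = 100.0 * errors / len(samples)  # percent of utterances mislabeled
```

Applied per speaker, the same count gives the percentages quoted above: 1.9 % of 360 utterances and 4 % of 540 utterances correspond to roughly 7 and 22 misclassified utterances.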

An analysis of the overlap in the F2 distributions shows that 18 out of the 29 cases originate from the sequences [atre] [ik 1 [ug' u] [uf u] [o J' o] which are rare or unnatural in Russian. The front vowel contexts ([i] [ɛ] [æ] and [e]) account for 17 of the overlap points.

The finite amount of overlap can be further reduced by taking into account the F3 distributions, see Figs. I-A-5 - I-A-9. This cue, however, is effective mainly in front vowel contexts, where F3 of the soft consonant is generally higher than in hard consonants. In [a] contexts the overlap is considerable. The same is found in [o] and [u] contexts but it is interesting to note that for one of the male subjects (M-n) there is a clear tendency of a lower F3 in soft than in hard consonants. This effect could be attributed to a coarticulation of the hard consonant with "mid-vocal tract" place of articulation conditioning a high F3.

The crossover points of F3 in front vowel contexts were found to be 2600 Hz for the male speakers and 3100 Hz, 3100 Hz, and 3000 Hz for the female speakers.

The question now arises whether the critical threshold values of second and third formants separating hard from soft consonants can be inferred from some general property of the speaker. One such parameter is the total length of the speaker's vocal tract which is inversely proportional to the average F3, Fant (1959). The average F3 of each speaker's front vowels [i] [ɛ] [e] and [æ] was accordingly calculated and was found to coincide with the hard/soft F3 boundary. The hard/soft F2 boundary was found to be 0.53 of the male speaker's average F3 and within the female group the F2 boundary was 0.59 of the speaker's average F3. Thus, with $\bar{F}_3$ denoting the speaker's average front-vowel F3 and the subscript t a threshold value,

$F_{2t} = 0.53\,\bar{F}_3$ (males)

$F_{2t} = 0.59\,\bar{F}_3$ (females)

$F_{3t} = \bar{F}_3$ (males and females)
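The threshold relations above amount to a small calculation, sketched here in Python. The ratios are those printed in the text; the function name, the dictionary layout, and the F3 values used in the example are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

# Ratios of the F2 threshold to the average front-vowel F3, as printed in the text.
F2_RATIO = {"male": 0.53, "female": 0.59}


def thresholds(front_vowel_f3_hz, sex):
    """Return (F2t, F3t) in Hz from a speaker's front-vowel F3 values.

    F3t is the average F3 itself; F2t is that average scaled by the
    ratio for the speaker group.
    """
    avg_f3 = mean(front_vowel_f3_hz)
    return F2_RATIO[sex] * avg_f3, avg_f3


# Hypothetical F3 values in Hz for four front vowels (not from the study).
f2t, f3t = thresholds([3100, 3000, 3000, 2900], "female")
```

With these illustrative values the average F3 is 3000 Hz, so the sketch places the F3 threshold at 3000 Hz and the F2 threshold at about 1770 Hz.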

Fig. I-A-5. The frequency of the third formant in hard (solid lines) and soft (dashed lines) consonants, followed by different vowels (Male speaker: M-n).

Fig. I-A-6. The frequency of the third formant in hard (solid lines) and soft (dashed lines) consonants, followed by different vowels (Male speaker: B-r).

Fig. I-A-7. The frequency of the third formant in hard (solid lines) and soft (dashed lines) consonants, followed by different vowels (Female speaker: H-a).

Fig. I-A-9. The frequency of the third formant in hard (solid lines) and soft (dashed lines) consonants, followed by different vowels (Female speaker: P-a).

Axis label in these figures: frequency of the third formant in Hz.

The normalizing function of the third formant determining F2 boundaries in vowel identification has been studied by Fujisaki and Kawashima (1967 and 1968). From their data and those of Fant (1959) it is known that the perceptual importance of F3 is much greater in front vowels than in back vowels. This general rule is related to our finding that F3 adds to the phonetic separation of soft and hard consonants in front vowel context only. In this connection it is also worth noting that in the experiments on the perception of sustained fricatives, Shupljakov (1968), the F2t was constant as long as the part of the noise spectrum above 2500 Hz was kept the same.

Summary. Perceptual model

Our final conclusion is accordingly that the speech wave characteristics separating Russian hard and soft consonants can be expressed by a single threshold in the consonant F2 with the support of a similar normalized threshold in F3 if F2 is high. These thresholds are simply related to the speaker's average formant pattern. Consonants with formant frequencies above the thresholds are soft and those with formant frequencies below the thresholds are hard. The speech production correlate is palatalization as opposed to a neutral or "mid tube" place of tongue-body articulation. Therefore a listener's hard/soft identification could be based on a formant frequency boundary categorization as the primary cue, as was suggested for vowels by Chistovich, Fant, de Serpa-Leitao, and Tjernlund (1966).

More detailed data on the F-pattern of various hard and soft consonants in different contexts will be given in a forthcoming publication.

Acknowledgments

We are thankful to Mrs. S. Felicetti and Dr. B. Lindblom for the help during the preparation of the paper and to Miss E. Agelfors for the drawings.

References

Chistovich, L. A. (1968): "Direction of Transition as a Perceptual Parameter of Time-Varying Stimuli", paper B-3-7, 6th International Congr. on Acoustics, Tokyo 1968.

Chistovich, L., Fant, G., de Serpa-Leitao, A., and Tjernlund, P. (1966): "Mimicking and Perception of Synthetic Vowels. Part II", STL-QPSR 3/1966, pp. 1-3.

Fant, G. (1959): "Acoustic Analysis and Synthesis of Speech with Applications to Swedish", Ericsson Technics 15, No. 1 (1959).

Fant, G. (1960): Acoustic Theory of Speech Production ('s-Gravenhage).

Fujisaki, H. and Kawashima, T. (1967): "The Roles of Pitch and Higher Formants in the Perception of Vowels", Conference on Speech Communication and Processing, Cambridge, Mass. 1967; publ. in IEEE Transactions on Audio and Electroacoustics AU-16, March 1968, pp. 73-77.

Kawashima, T. and Fujisaki, H. (1968): "The Influence of Various Factors on the Identification and Discrimination of Synthetic Speech Sounds", paper B-3-6, 6th International Congr. on Acoustics, Tokyo 1968.

Shupljakov, V. S.: "Pitch and Hard-Soft Distinction of Fricative and / /", Proc. of the Seminar on Speech Production and Perception, Leningrad 1966; Z. f. Phonetik, Sprachwissenschaft und Kommunikationsforschung 21, Heft 1/2, 1968.

Shupljakov, V. S. (1968): "Acoustical Feature of the Perception of Soft Stationary Fricative Consonants", 6th Allunion Acoustical Conference, Moscow (in Russian).

Stevens, K. N. (1967): "Acoustic Correlates of Certain Consonantal Features", 1967 Conf. on Speech Communication and Processing, Cambridge, Mass., paper C6 in Conference Preprints, pp. 177-184.

Öhman, S. (1964): "Note on Palatalization in Russian", Quart. Progr. Rep. Research Inst. of Electronics, Massachusetts Inst. of Techn. No. 73 (1964), pp. 167-172.

Chomsky, N. and Halle, M.: The Sound Pattern of English (New York 1968).
