Dept. for Speech, Music and Hearing
Quarterly Progress and Status Report
Acoustical features of hard and soft Russian consonants
in connected speech: A spectrographic study
Shupljakov, V. and Fant, G. and de Serpa-Leitao, A.
journal: STL-QPSR
volume: 9
number: 4
year: 1968
pages: 001-006
http://www.speech.kth.se/qpsr
STL-QPSR 4/1968 3.
Experimental procedure
Three female and two male speakers of Modern Standard Russian took
part in the experiments. The subjects pronounced lists of symmetrical
sequences of the VCV (vowel-consonant-vowel) type. These lists contained all
Russian vowels and consonants, in all 180 syllables.
This speech material was recorded on an Ampex magnetic tape recorder.
Spectrograms were made with a Kay Electric Sona-Graph, type 4691, operated
with a broad filter of 300 Hz and a high frequency emphasis of 6 dB
per octave above 1000 Hz.
Measurements were made of F2 and F3 within the consonant, sampling at
places of extreme values. If these were obscured, measurements were
taken at the very beginning of the transition from the consonant to the vowel.
The F2 and F3 data were quantized to the nearest 100 Hz values for the purpose
of constructing diagrams.
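The quantization step can be illustrated with a minimal sketch. The function name and the sample readings are hypothetical, and the handling of exact ties (rounding upward) is an assumption, since the text does not say how values halfway between two steps were treated.

```python
def quantize_to_100_hz(freq_hz: float) -> int:
    """Round a formant frequency to the nearest multiple of 100 Hz.

    Ties (e.g. 2150 Hz) are rounded upward; this is an assumption,
    the paper does not specify the tie rule.
    """
    return int((freq_hz + 50) // 100) * 100


# Hypothetical raw readings in Hz, not measurements from the paper.
readings = [830, 1870, 2149, 2150]
quantized = [quantize_to_100_hz(f) for f in readings]
print(quantized)
```

Adding half a step before the floor division avoids the round-half-to-even behaviour of Python's built-in `round`, which would otherwise send 2150 Hz and 2250 Hz to the same bin.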
The directions of the transitions were observed.
Results and conclusions
It was soon verified that hypothesis (3) did not hold. F2 transitions
from adjacent vowels to the soft consonant were generally positive, see
Fig. I-A-1, but could also be neutral or slightly negative, see the word
[æl'æ] of Fig. I-A-2, where F2 = 2100 Hz in [æ] as well as in the [l']. The
sequence [ælæ] of Fig. I-A-2 exemplifies a typical negative transition from
the F2 = 1900 Hz of the [æ] to the F2 min = 830 Hz of the dark [l].
However, in [odo] and [aza], Fig. I-A-2, the transition from the vowel
to the hard consonant is neutral or positive.
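The notion of a positive, neutral, or negative transition can be made concrete with a small sketch. The function name and the 100 Hz dead band are assumptions introduced for illustration (the dead band simply mirrors the 100 Hz quantization step); the paper gives no numeric criterion for "neutral". The two calls use F2 values quoted in the paragraph above.

```python
def transition_direction(f2_vowel_hz: float, f2_consonant_hz: float,
                         tolerance_hz: float = 100.0) -> str:
    """Classify the F2 movement from the vowel into the consonant.

    tolerance_hz is an assumed dead band; changes no larger than this
    are reported as "neutral".
    """
    delta = f2_consonant_hz - f2_vowel_hz
    if delta > tolerance_hz:
        return "positive"
    if delta < -tolerance_hz:
        return "negative"
    return "neutral"


# F2 = 2100 Hz in both the vowel and the soft consonant.
print(transition_direction(2100, 2100))
# F2 = 1900 Hz in the vowel, F2 min = 830 Hz in the dark consonant.
print(transition_direction(1900, 830))
```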
Although the direction of transition can be an important perceptual
parameter, Stevens (1967), Chistovich (1968), it apparently does not
provide a simple invariance rule for the soft/hard distinction.
The ceteris paribus hypothesis (1) was consistently found to hold. The
F2 of a soft consonant is always higher than the F2 of a hard consonant in
the same context for the same speaker. The next step was to
see whether there existed an absolute criterion in F2 independent of vowel
context, as formulated in hypothesis (2) above. Fig. I-A-3 and Fig. I-A-4
show the distributions for each of the five speakers. Apparently the F2
distributions of hard and soft consonants show very little overlap. Their
crossover points occur at 1700 Hz for the two male speakers and at 1900 Hz,
2000 Hz, and 2000 Hz in the distributions of the female speakers. The
number of mistakes that would occur when using a single threshold value
in F2, individually determined for each speaker, would be 1.9 % in the 360
utterances of the male subjects and 4 % in the 540 utterances of the female
group. A similar tendency was found in the results from the experiments
on the perception of hard and soft stationary fricative consonants,
Shupljakov (1968).

Fig. I-A-1. Spectrogram of the syllable uh'u (frequency scale in kHz).
Fig. I-A-3. The frequency of the second formant in hard (solid lines) and soft (dashed lines) consonants, pronounced by two male subjects (M-n and B-r). Horizontal axis: frequency of the second formant in Hz.
Fig. I-A-4. The frequency of the second formant in hard (solid lines) and soft (dashed lines) consonants, pronounced by three different female subjects. Horizontal axis: frequency of the second formant in Hz.
An analysis of the overlap in the F2 distributions shows that 18 out of
the 29 cases originate from the sequences [ætæ], [iki], [ug'u], [uf'u], and [oʃ'o],
which are rare or unnatural in Russian. The front vowel contexts ([i],
[ɛ], [æ], and [e]) account for 17 of the overlap points.
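The single-threshold rule and its error count can be sketched as follows. The function names and the sample F2 values are hypothetical, and assigning a value exactly on the threshold to the soft class is an assumption. The last lines check the reported percentages: the per-group counts 7 and 22 are not stated in the paper but are inferred here from 1.9 % of 360, 4 % of 540, and the total of 29 overlap cases.

```python
from typing import Iterable, Sequence


def count_threshold_errors(hard_f2: Sequence[float], soft_f2: Sequence[float],
                           threshold_hz: float) -> int:
    """Count tokens that fall on the wrong side of an F2 threshold.

    A hard consonant at or above the threshold, or a soft consonant
    below it, counts as one error (tie rule is an assumption).
    """
    hard_errors = sum(1 for f in hard_f2 if f >= threshold_hz)
    soft_errors = sum(1 for f in soft_f2 if f < threshold_hz)
    return hard_errors + soft_errors


def best_threshold(hard_f2: Sequence[float], soft_f2: Sequence[float],
                   candidates_hz: Iterable[float]) -> float:
    """Return the candidate with the fewest errors (the first one on ties)."""
    return min(candidates_hz,
               key=lambda t: count_threshold_errors(hard_f2, soft_f2, t))


# Hypothetical quantized F2 values in Hz for one speaker, not the paper's data.
hard = [900, 1100, 1300, 1500, 1800]
soft = [1600, 1900, 2100, 2200, 2400]
threshold = best_threshold(hard, soft, range(1000, 2500, 100))
errors = count_threshold_errors(hard, soft, threshold)
print(threshold, errors)

# Consistency check on the reported error rates (counts 7 and 22 are inferred).
male_rate = 100 * 7 / 360
female_rate = 100 * 22 / 540
print(round(male_rate, 1), round(female_rate, 1))
```

Scanning candidates on a 100 Hz grid matches the quantization used for the diagrams; a finer grid would not change the result for quantized data.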
The finite amount of overlap can be further reduced by taking into
account the F3 distributions, see Figs. I-A-5 - I-A-9. This cue, however,
is effective mainly in front vowel contexts, where F3 of the soft consonant
is generally higher than in hard consonants. In [a] contexts the overlap is
considerable. The same is found in [o] and [u] contexts, but it is
interesting to note that for one of the male subjects (M-n) there is a clear tendency
of a lower F3 in soft than in hard consonants. This effect could be attributed
to a coarticulation of the hard consonant with "mid-vocal tract" place of
articulation conditioning a high F3.
The crossover points of F3 in front vowel contexts were found to be
2600 Hz for the male speakers and 3100 Hz, 3100 Hz, and 3000 Hz for the
female speakers.
The question now arises whether the critical threshold values of second
and third formants separating hard from soft consonants can be inferred
from some general property of the speaker. One such parameter is the
total length of the speaker's vocal tract, which is inversely proportional to
the average F3, Fant (1959). The average F3 of each speaker's front
vowels [i], [ɛ], [e], and [æ] was accordingly calculated and was found to
coincide with the hard/soft F3 boundary. The hard/soft F2 boundary was found
to be 0.53 of the male speaker's average F3, and within the female group
the F2 boundary was 0.59 of the speaker's average F3. Thus

$$
\begin{aligned}
F_{2t} &= 0.53\,\bar{F}_3 && \text{(males)} \\
F_{2t} &= 0.59\,\bar{F}_3 && \text{(females)} \\
F_{3t} &= \bar{F}_3 && \text{(males and females)}
\end{aligned}
$$

where F2t and F3t are the hard/soft threshold values and $\bar{F}_3$ is the
speaker's average F3 in front vowels.
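As a minimal sketch, the three relations can be written as a function that derives both thresholds from a speaker's average front-vowel F3. The function name, the `sex` argument, and the input value of 3000 Hz are hypothetical; the coefficients 0.53 and 0.59 are taken as printed in the text.

```python
from typing import Tuple

# F2 threshold as a fraction of the average front-vowel F3, as printed in the text.
F2_RATIO = {"male": 0.53, "female": 0.59}


def hard_soft_thresholds(avg_front_f3_hz: float, sex: str) -> Tuple[float, float]:
    """Return (F2 threshold, F3 threshold) in Hz for one speaker.

    The F3 threshold equals the average front-vowel F3 itself.
    """
    f2_threshold = F2_RATIO[sex] * avg_front_f3_hz
    f3_threshold = avg_front_f3_hz
    return f2_threshold, f3_threshold


# Hypothetical average F3, chosen only to show the calculation.
f2t, f3t = hard_soft_thresholds(3000.0, "female")
print(round(f2t), round(f3t))
```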
Fig. I-A-5. The frequency of the third formant in hard (solid lines) and soft (dashed lines) consonants, followed by different vowels (male speaker M-n).
Fig. I-A-6. The frequency of the third formant in hard (solid lines) and soft (dashed lines) consonants, followed by different vowels (male speaker B-r).
Fig. I-A-7. The frequency of the third formant in hard (solid lines) and soft (dashed lines) consonants, followed by different vowels (female speaker H-a).
Fig. I-A-9. The frequency of the third formant in hard (solid lines) and soft (dashed lines) consonants, followed by different vowels (female speaker P-a).
Horizontal axis in Figs. I-A-5 to I-A-9: frequency of the third formant in Hz.
The normalizing function of the third formant determining F2 boundaries
in vowel identification has been studied by Fujisaki and Kawashima (1967
and 1968). From their data and those of Fant (1959) it is known that the
perceptual importance of F3 is much greater in front vowels than in back
vowels. This general rule is related to our finding that F3 adds to the
phonetic separation of soft and hard consonants in front vowel context only.
In this connection it is also worth noting that in the experiments on the
perception of sustained fricatives, Shupljakov (1968), the F2t was constant as
long as the part of the noise spectrum above 2500 Hz was kept the same.
Summary. Perceptual model
Our final conclusion is accordingly that the speech wave characteristics
separating Russian hard and soft consonants can be expressed by a single
threshold in the consonant F2, with the support of a similar normalized
threshold in F3 if F2 is high. These thresholds are simply related to the
speaker's average formant pattern. Consonants with formant frequencies
above the thresholds are soft and those with formant frequencies below
the thresholds are hard. The speech production correlate is palatalization
as opposed to a neutral or "mid tube" place of tongue-body articulation.
Therefore a listener's hard/soft identification could be based on a formant
frequency boundary categorization as the primary cue, as was suggested
for vowels by Chistovich, Fant, de Serpa-Leitão, and Tjernlund (1966).
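A boundary categorization of this kind might be sketched as below. This is one possible reading, not the authors' procedure: the text does not spell out how the F3 cue is combined with F2, so the sketch assumes that F3 is consulted only when F2 lies within an assumed margin of its threshold. The function name, the margin, and the F3 values in the examples are hypothetical; the thresholds of 1700 Hz and 2600 Hz are the male crossover points reported above.

```python
def classify_consonant(f2_hz: float, f3_hz: float,
                       f2_threshold_hz: float, f3_threshold_hz: float,
                       margin_hz: float = 100.0) -> str:
    """Label a consonant "soft" or "hard" from its F2 and F3.

    F2 is the primary cue. F3 is used only when F2 lies within
    margin_hz of its threshold (an assumed rule for combining the cues).
    """
    if f2_hz >= f2_threshold_hz + margin_hz:
        return "soft"
    if f2_hz <= f2_threshold_hz - margin_hz:
        return "hard"
    # Borderline F2: let the F3 threshold decide.
    return "soft" if f3_hz >= f3_threshold_hz else "hard"


# Male thresholds from the text; F3 inputs are hypothetical.
print(classify_consonant(830, 2400, 1700, 2600))
print(classify_consonant(2100, 2400, 1700, 2600))
print(classify_consonant(1700, 2800, 1700, 2600))
```

Keeping F2 as the first test reflects its role as the primary cue; the margin only controls how often the secondary F3 test is reached.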
More detailed data on the F-pattern of various hard and soft consonants
in different contexts will be given in a forthcoming publication.
Acknowledgments
We are thankful to Mrs. S. Felicetti and Dr. B. Lindblom for their help
during the preparation of the paper and to Miss E. Agelfors for the drawings.
References
Chistovich, L. A. (1968): "Direction of Transition as a Perceptual Parameter of Time-Varying Stimuli", paper B-3-7, 6th International Congr. on Acoustics, Tokyo 1968.
Chistovich, L., Fant, G., de Serpa-Leitão, A., and Tjernlund, P. (1966): "Mimicking and Perception of Synthetic Vowels. Part II", STL-QPSR 3/1966, pp. 1-3.
Fant, G. (1959): "Acoustic Analysis and Synthesis of Speech with Applications to Swedish", Ericsson Technics 15, No. 1 (1959).
Fant, G. (1960): Acoustic Theory of Speech Production ('s-Gravenhage).
Fujisaki, H. and Kawashima, T. (1967): "The Roles of Pitch and Higher Formants in the Perception of Vowels", Conference on Speech Communication and Processing, Cambridge, Mass. 1967; publ. in IEEE Transactions on Audio and Electroacoustics AU-16, March 1968, pp. 73-77.
Kawashima, T. and Fujisaki, H. (1968): "The Influence of Various Factors on the Identification and Discrimination of Synthetic Speech Sounds", paper B-3-6, 6th International Congr. on Acoustics, Tokyo 1968.
Shupljakov: "Pitch and Hard-Soft Distinction of Fricative and / /", Proc. of the Seminar on Speech Production and Perception, Leningrad 1966; Z. f. Phonetik, Sprachwissenschaft und Kommunikationsforschung 21, Heft 1/2, 1968.
Shupljakov, V. S. (1968): "Acoustical Feature of the Perception of Soft Stationary Fricative Consonants", 6th Allunion Acoustical Conference, Moscow (in Russian).
Stevens, K. N. (1967): "Acoustic Correlates of Certain Consonantal Features", 1967 Conf. on Speech Communication and Processing, Cambridge, Mass., paper C6 in Conference Preprints, pp. 177-184.
Öhman, S. (1964): "Note on Palatalization in Russian", Quart. Progr. Rep. Research Inst. of Electronics, Massachusetts Inst. of Techn. No. 73 (1964), pp. 167-172.
Chomsky, N. and Halle, M.: The Sound Pattern of English (New York 1968).