Dept. for Speech, Music and Hearing
Quarterly Progress and Status Report

Pitch of synthetic sung vowels
Sundberg, J.

journal: STL-QPSR, volume: 13, number: 1, year: 1972, pages: 034-044
http://www.speech.kth.se/qpsr



B. PITCH OF SYNTHETIC SUNG VOWELS*

J. Sundberg

Abstract

Experiments are reported in which musically trained subjects match the pitch of a stimulus tone with a variable response tone, both tones being syntheses of a sung vowel. The effects on pitch and pitch perception accuracy of vibratos of different kinds and of the "singing formant" are examined. The results appear to show that the pitch corresponds to the average fundamental frequency to a first approximation, even though deviations from this average occur that may be explained by postulating a time constant in the fundamental frequency measuring system of the ear. Adding a vibrato or a "singing formant" to a vowel sound affects neither the pitch perceived nor the accuracy of the pitch perception appreciably. The standard deviation values obtained from the experiments are considerably smaller than the difference limen for frequency, probably owing to the subjects' training and to the fact that complex waveforms were used in the experiments.

Introduction

In acoustic signals used for the communication of music, acoustic features can be recognized that appear to characterize a good musician. Such characteristics are probably not chosen at random within a given cultural tradition, for music, like the other arts, has a long history behind it, and its development through the centuries is likely to obey Darwinian principles of evolution: those acoustic features can be expected to survive which are in some sense adequate from the point of view of human perception. The fact that music can have a forceful impact on emotions can be interpreted as support for this hypothesis. An important task of musical acoustics would therefore be to map such features and to examine whether their usage has a motivation in the way in which human perception works. Investigations of this kind may contribute to the understanding of our sensory and mental behavior.

The present paper reports an exploratory study aiming at an understanding of this kind of two acoustic features characterizing the type of professional singing that can be heard in our operas and concert halls. One feature is the vibrato and the other is the spectrum envelope peak near 3 kHz that normally occurs in professional singing and is frequently called the "singing formant". Their effects on pitch and pitch perception accuracy will be investigated.

* Paper presented at the Psychoacoustics Symposium "Problems of auditive perception", May 1972, Toznah, Poland.


The vibrato corresponds to a periodic low-frequency modulation of the fundamental frequency. The waveform of this modulation is generally quite similar to a sinusoid and its rate varies between 5.5 and 8 cycles per second. The extent of the modulation is generally between ±1.5 % and ±3 % of the fundamental frequency, i.e. a total swing between about a quartertone and a semitone (Seashore, 1932). In singing this modulation of the fundamental frequency is accompanied by a modulation of the overall sound pressure. Some authors have found that the two kinds of modulation are in phase, but others report that this is not always the case (Schoen, 1923; Sacerdote, 1957). The disagreement can probably be explained with reference to the relation between the strongest partial in the spectrum and the formant lying closest to this partial in frequency. If the formant frequency is higher than the frequency of the partial, the amplitude and frequency modulations can be expected to be in phase, whereas the opposite would occur when the strongest partial is slightly higher than its closest formant frequency. In any case the amplitude modulation, whether a consequence of the frequency modulation or not, is believed to be of minor importance as regards the sound perceived (Backus, 1969).
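As a rough numerical check of the vibrato extents quoted above, a percentage deviation of the fundamental frequency can be converted into cents (1200 cents per octave, 100 cents per equally tempered semitone). The Python sketch below is only an illustration and not part of the original study. The function names are hypothetical, and the percentages are assumed to be deviations on either side of the mean.

```python
import math

def percent_to_cents(percent):
    """Interval in cents of a frequency raised by `percent` percent."""
    return 1200.0 * math.log2(1.0 + percent / 100.0)

def total_swing_cents(extent_percent):
    """Peak-to-peak swing in cents of a vibrato of +/- extent_percent."""
    upper = 1.0 + extent_percent / 100.0
    lower = 1.0 - extent_percent / 100.0
    return 1200.0 * math.log2(upper / lower)

# Assumed reading: the extents are deviations on either side of the mean F0.
for extent in (1.5, 3.0):
    print(extent, round(total_swing_cents(extent), 1))
```

Under this assumption a vibrato of ±1.5 % swings over roughly 52 cents in total and one of ±3 % over roughly 104 cents, i.e. close to a quartertone and a semitone respectively.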

The vibrato has been shown to give a pleasing quality to the sound provided that it has a rate of slightly more than 6 Hz and a total extent of approximately ±2 % of the fundamental frequency. In this case one single pitch is perceived and, even though this has not yet been strictly proved, this pitch is believed to correspond to the average of the fundamental frequency (Gibian, 1972; Ramsdell). It is also assumed that the pitch perceived from a vibrato tone is less certain than the pitch perceived from a steady state signal. It has been suggested that a function of the vibrato is to cover up slight errors in tuning, which, however, has not been brought to a strict scientific test (Stevens & Davis, 1938).

The "singing formant" has been studied previously mainly in male singing (Sundberg, 1970a and b; Sundberg, 1972). It appears to consist of a cluster of three formants. The lowest of these formants corresponds to the third formant in normal speech and the highest to the fourth. In between these formants an extra formant is found. This formant can be ascribed to the larynx tube, which may act as a separate Helmholtz resonator under certain conditions. These conditions involve a wide laryngeal ventricle and a wide pharynx. An additional condition for generating a "singing formant" is to adopt a special articulation of some vowels in singing as compared with speech. Thus, the "singing formant" is probably actively pursued by the singer at some cost as regards his normal articulatory habits.

The perceptual function of the "singing formant" is not well understood. It has been shown that it adds to the "beauty" of the tone and that it may be correlated with a perceptual quality labeled "placement in the head" among singing teachers (Gibian, 1972). An interesting circumstance is also that the frequency of the "singing formant" approximately coincides with the region of maximal sensitivity of the human ear. Consequently the presence of the "singing formant" may make the sung sound easier to perceive, as suggested by Winckel (1971). We might expect that it also makes its pitch easier to discern.

Experimental procedure

The influence of the vibrato and the "singing formant" on pitch and pitch perception accuracy was studied by means of test series in which subjects matched the pitch of a given stimulus tone with a variable response tone. The stimulus and response tones were repeatedly presented binaurally to the subjects by means of earphones. A pause of .5 sec separated the stimulus and the response tones, each of 1.9 sec duration. Between the response tone and the following stimulus tone the pause was 1.6 sec (Fig. III-B-1). The subjects adjusted the fundamental frequency of the response tone by turning a knob. When they found the pitch of the two signals identical they pressed a button interrupting the signal presentation and triggering two frequency counters, one for each signal. In the case of vibrato tones the average fundamental frequency was measured.

When not otherwise stated the stimulus and response tones had the following characteristics. Both were syntheses of a baritone voice singing an [a]. The synthesis was performed on a formant synthesizer using the following formant frequencies: F1 = 520 Hz, F2 = 955 Hz, F3 = 2335 Hz, F4 = 2550 Hz, and F5 = 3480 Hz. The synthesis was found to be quite "natural" by all subjects. The sound pressure level in the earphones varied around 70 ± 3 dB re 0.0002 dyn/cm² depending on the fundamental frequency. The rise and fall time of the stimulus and response tones was 60 msec. The vibrato, exclusively used for the stimulus tone, was generated by modulating the fundamental frequency sinusoidally at a rate of 6.5 cycles per sec, slowly growing to a maximal extent of ±1.7 % of the fundamental frequency (Fig. III-B-1). As indicated above the "singing formant" was generated by connecting an extra formant circuit resonating at a frequency close to the third formant (F4 = 2550 Hz). Four fundamental frequency values were used for the stimulus tone: 70, 115, 185, and 300 Hz. These frequencies are equally spaced over the range of a skilled professional bass singer. Each subject made 5 matchings of each of these fundamental frequencies in randomized order in one test series. Three or four such series, each lasting at most 15 min, could be run in one test session without extending its duration over 60 min. In all sessions a rest of some minutes was offered to the subjects after each series.

Fig. III-B-1. Oscillogram of the stimulus-response presentation (upper figure) and of the onset of the vibrato signal (lower figure).
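The vibrato described above (sinusoidal, 6.5 cycles per sec, growing slowly to its maximal extent) can be sketched as a fundamental frequency contour. The Python sketch below is a minimal illustration and not the synthesizer used in the study. The function names are hypothetical, and the linear growth and the ramp duration `ramp_s` are assumptions, since the text only says that the extent grows slowly.

```python
import math

def vibrato_f0(t, f0_mean_hz, rate_hz=6.5, extent=0.017, ramp_s=0.5):
    """Instantaneous F0 (Hz) at time t (s) of a sinusoidal vibrato.

    The extent grows linearly from 0 to `extent` during the first
    `ramp_s` seconds (assumed shape and duration of the onset).
    """
    growth = min(t / ramp_s, 1.0)
    phase = 2.0 * math.pi * rate_hz * t
    return f0_mean_hz * (1.0 + extent * growth * math.sin(phase))

def mean_f0(f0_mean_hz, start_s, n_cycles, rate_hz=6.5, extent=0.017,
            ramp_s=0.5, steps_per_cycle=1000):
    """Average F0 over a whole number of vibrato cycles from start_s."""
    n = n_cycles * steps_per_cycle
    dt = 1.0 / (rate_hz * steps_per_cycle)
    total = sum(
        vibrato_f0(start_s + i * dt, f0_mean_hz, rate_hz, extent, ramp_s)
        for i in range(n)
    )
    return total / n

# Once the extent is fully developed, averaging over whole cycles
# returns the nominal fundamental frequency.
print(mean_f0(115.0, start_s=1.0, n_cycles=4))
```

Averaging over a whole number of cycles matters here: a window that cuts a cycle short would bias the measured average toward whichever half of the swing it contains.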

The subjects were six musically trained persons aged 25 to 36 years. Three of them are professional musicians (MD, BT, NS) and all of them play or sing regularly in ensembles.

Experiment I: Test of method

Due to possible effects of auditory phenomena such as temporal masking etc., it cannot be taken for certain that subjects adjust the pitch of the response tone so that its fundamental frequency is identical with that of the stimulus tone in an experiment of the type described. Therefore, in one test series the stimulus and response tones were both presented without vibrato and with "singing formant", thus not differing from each other. Ten settings of each of the four fundamental frequencies were made by all subjects except two (one of them BT), who made only 5. The results are plotted in Fig. III-B-2.a and b in terms of the individual subjects' means.

Fig. III-B-2.a shows that, on the average, there is no difference between the fundamental frequency of the stimulus tone, F0S, and that of the response tone, F0R. In some cases, however, individual subjects give values of F0R that differ from F0S by more than one standard deviation. Moreover, the standard deviations vary considerably between subjects and fundamental frequencies, between .05 and .7 % (Fig. III-B-2.b). Therefore it seems motivated to judge the influence of a given stimulus property on the basis of the change in pitch and standard deviation that it imposes on the values of the individual subjects.

In the experiments to be described the stimulus and response tones were both generated by means of a formant synthesizer. Thus, the timbre and

[Figure III-B-2: EFFECT OF TEST SITUATION. Panel a: ON PITCH (‰); panel b: ON PITCH ACCURACY (‰); abscissa F0 (Hz).]

Fig. III-B-2. a. Difference in fundamental frequency between the stimulus and response tones observed when they were generated in exactly the same way.
b. Corresponding values of the standard deviations. Symbols represent subjects (JS, BN, NS, EJ, BT, MD).
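The two quantities plotted in Fig. III-B-2, the mean difference between response and stimulus and the standard deviation of the settings, can be computed from one subject's matchings as sketched below in Python. This is an illustration only: the response values are hypothetical, not data from the study, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

def matching_stats(stimulus_hz, responses_hz):
    """Return (mean difference, standard deviation) of the response
    settings, both in per mille of the stimulus fundamental frequency."""
    relative = [1000.0 * (r - stimulus_hz) / stimulus_hz for r in responses_hz]
    return mean(relative), stdev(relative)

# Hypothetical settings (Hz) of one subject for a 115 Hz stimulus.
responses = [115.1, 114.9, 115.2, 115.0, 114.8]
difference, spread = matching_stats(115.0, responses)
print(round(difference, 3), round(spread, 3))
```

Expressing both values relative to the stimulus frequency makes results at 70 Hz and at 300 Hz directly comparable.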


[Figure III-B-5: EFFECT OF SINUSOIDAL VIBRATO. Panel a: ON PITCH (‰); panel b: ON PITCH ACCURACY (‰); abscissa F0 (Hz), 50 to 400 Hz.]

Fig. III-B-5. a. Change in the fundamental frequency difference between the stimulus and response tones observed after adding a sinusoidal vibrato to the stimulus tone.
b. Corresponding changes in the standard deviations. The subject symbols are the same as in Fig. III-B-2.

As regards pitch, Fig. III-B-5.a demonstrates that the vibrato has a very small effect, if any. The individual differences ascribable to the vibrato do not reach even a 90 % level of confidence in any case but one. The appropriate conclusion therefore seems to be that adding a vibrato has no effect on the pitch perceived.

As regards pitch perception accuracy, the individual differences in the standard deviations given in Fig. III-B-5.b show that there might be a small effect when the fundamental frequency is as low as 70 or 115 Hz. For these frequencies the average standard deviation rises by .2 and .1 % respectively, thus by an extremely small amount. The extent of the vibrato used in the experiments is rather small: ±1.7 % of the fundamental frequency. Values of ±3 % do occur in singing. One test series was run with one of the subjects in which the extent of the vibrato was increased to ±3 %. However, no consistent changes in the standard deviations could be observed. Therefore the results can hardly be taken as support for the hypothesis that a perceptual function of the vibrato is to cover up slight errors in tuning.

It is interesting that the pitch perceived from a vibrato tone appears to correspond to the average fundamental frequency. As long as a sinusoidal vibrato is used it cannot be decided whether this average is computed as the frequency midway between the extremes of the fundamental frequency modulation curve or whether it is the linear mean. If the modulation waveforms are distorted as shown in Fig. III-B-6 these two averages differ.
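The distinction between the two averages can be made concrete with a small numerical sketch. The Python code below compares the frequency midway between the extremes with the linear mean for a sinusoidal cycle and for an asymmetric one. The asymmetric shape (a steady portion with a brief positive excursion) and the function names are hypothetical; the exact waveforms of Fig. III-B-6 are not reproduced here.

```python
import math

def midpoint_and_mean(samples):
    """Return (frequency midway between the extremes, linear mean)."""
    midpoint = (max(samples) + min(samples)) / 2.0
    linear_mean = sum(samples) / len(samples)
    return midpoint, linear_mean

def sinusoidal_cycle(n=1000):
    """One cycle of a sinusoidal modulation, normalized to +/- 1."""
    return [math.sin(2.0 * math.pi * i / n) for i in range(n)]

def pulsed_cycle(n=1000, duty=0.25):
    """Hypothetical distorted cycle: steady at 0 except for a half-sine
    positive excursion lasting the fraction `duty` of the cycle."""
    width = int(n * duty)
    return [math.sin(math.pi * i / width) if i < width else 0.0
            for i in range(n)]

def f0_cycle(shape, f0_base_hz=115.0, extent=0.017):
    """Scale a normalized modulation shape to an F0 contour in Hz."""
    return [f0_base_hz * (1.0 + extent * s) for s in shape]

print(midpoint_and_mean(f0_cycle(sinusoidal_cycle())))
print(midpoint_and_mean(f0_cycle(pulsed_cycle())))
```

For the sinusoid the two averages coincide at 115 Hz. For the pulsed shape the midpoint lies near 115.98 Hz while the linear mean lies near 115.31 Hz, so a pitch match can in principle discriminate between the two hypotheses.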

Two test series were run in which the stimulus had a vibrato obtained with the modulation waveforms shown in Fig. III-B-6. The rate of the vibrato was kept at 6.5 cycles per sec and the extent at ±1.7 % of the fundamental frequency.

The individual mean values obtained from these test series can be compared with those from the test where both stimulus and response lacked a vibrato in Fig. III-B-7.a and b. This figure shows the effects of adding a vibrato of these kinds to the stimulus tone. As regards pitch there seems to be an effect of these distortions of the sinusoidal vibrato type. When the fundamental frequency makes positive deviations the response tone was adjusted slightly lower than the linear average, and when the vibrato waveform was inverted the opposite applies. Thus, the fundamental frequency of the response tone in both cases deviated from the linear average in the direction of the steady state portion of the modulation waveform. This deviation may vary with the fundamental frequency since the mean curves separate more at higher frequencies. From Fig. III-B-7.b it can be seen that, on the average, there seems to be no effect on the standard deviation values. This indicates that the subjects did not find it more difficult to match the pitch of a stimulus with these types of vibrato than with a sinusoidal vibrato.

[Figure III-B-6: FUNDAMENTAL FREQUENCY MODULATION WAVEFORMS. Periodicity: 6.5 per second.]

Fig. III-B-6. Two waveforms used for the fundamental frequency modulation.

[Figure III-B-7: EFFECT OF VIBRATO TYPE. Panel a: ON PITCH (‰); panel b: ON PITCH ACCURACY (‰); abscissa F0 (Hz).]

Fig. III-B-7. a. Change in the difference in fundamental frequency between the stimulus and response tones observed after adding to the stimulus tone vibratos of the waveforms shown in Fig. III-B-6.
b. Corresponding changes in the standard deviations.

The results clearly show that the fundamental frequency determining the pitch of a vibrato tone is not the frequency midway between the extremes of the modulation curve. The linear average appears to be a better approximation, but the steady state portion of the modulation curve seems to be of somewhat greater importance than the rest of the curve.

Experiment III: Effect of the "singing formant"

In all experiments described above the stimulus and response tones contained a "singing formant". In one test series the stimulus and response tones were presented without this spectrum envelope peak. These signals were generated by simply omitting the extra formant at 2550 Hz in the synthesizer. The idealized spectrum envelopes are shown in Fig. III-B-8. The stimulus tone had a sinusoidal vibrato. The effect of adding a "singing formant" can therefore be studied by comparing the data collected in this experiment with the data obtained when the stimulus and response signals contained a "singing formant". Fig. III-B-9.a and b show such a comparison.

The intersubject mean curve indicates that there might be an effect on the pitch perceived and that this effect depends on the fundamental frequency. However, statistically significant differences were found in only two cases (out of 24). Consequently the correct conclusion seems to be that the "singing formant" does not affect the pitch. Similarly Fig. III-B-9.b indicates that it does not affect the accuracy with which the pitch is perceived.

[Figure III-B-8: IDEALIZED SPECTRUM ENVELOPE; abscissa FREQUENCY (kHz).]

Fig. III-B-8. Idealized spectral envelopes used for the stimulus and response tones in studying the effects of the "singing formant".

As previously mentioned it has been suggested that the "singing formant" makes the sound easier to perceive in large halls. Therefore it may be assumed that an effect could be observed if the stimulus tone is presented at a considerably lower sound pressure level than the response tone. In this case the "singing formant" would possibly facilitate the pitch perception. This hypothesis was tested. The stimulus tone was presented at a sound pressure level of 45 dB re 0.0002 dyn/cm², i.e. 25 dB below the level of the response tone. In one test series the stimulus and response tones contained a "singing formant", in another not. The stimulus tone was presented with a sinusoidal vibrato. All subjects but one (EJ) participated in these series.
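To give a feeling for the size of the 25 dB level difference between stimulus and response, the levels can be converted to sound pressures using the standard definition of sound pressure level with the reference 0.0002 dyn/cm². The Python sketch below is only a worked conversion; the function name is hypothetical and nothing in it comes from the study apart from the two levels.

```python
P_REF_DYN_CM2 = 0.0002  # reference sound pressure, dyn/cm^2

def spl_to_pressure(level_db, p_ref=P_REF_DYN_CM2):
    """Sound pressure (dyn/cm^2) corresponding to a level in dB SPL."""
    return p_ref * 10.0 ** (level_db / 20.0)

stimulus = spl_to_pressure(45.0)
response = spl_to_pressure(70.0)
print(stimulus, response, response / stimulus)
```

A difference of 25 dB corresponds to a sound pressure ratio of 10^(25/20), i.e. the stimulus pressure is about one eighteenth of the response pressure.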

Fig. III-B-10.a and b show the changes in the fundamental frequency of the response tone and in the standard deviations observed when a "singing formant" was added to the stimulus tone. Fig. III-B-10.a indicates that different subjects react very differently when the "singing formant" is introduced. At the same time there is no clear effect on the pitch accuracy. This is indicated by the very small changes in the standard deviation values seen in Fig. III-B-10.b. Nevertheless, significant changes in pitch occur only sporadically. Thus it does not seem likely that the "singing formant" has any effect either on pitch or on the accuracy with which the pitch is perceived, even when added to a vowel of weak intensity.

In a previous investigation it was observed that the dependence of the pitch on intensity appeared to be different in complex sounds as compared with sinusoids (Lindqvist & Sundberg, 1971). Whereas the pitch of a sinusoid below about .5 kHz rises if the intensity is decreased, the pitch of a complex wave was found to drop with decreasing intensity. If this observation is applicable also to vowel sounds, subjects would match the pitch of the weaker sounds with a lower fundamental frequency than in the case when the stimulus and response tones were equally loud. Fig. III-B-11.a confirms that this is the case in terms of intersubject means of the fundamental frequency difference between the response tone and the stimulus tone. When the stimulus and response tones have the same intensity they differ very little in fundamental frequency. When the stimulus is 25 dB weaker than the response tone the pitch appears to drop considerably, no matter if a "singing formant" is present or not. It is interesting that this large shift in pitch is not accompanied by any change in the accuracy of pitch perception, as indicated by the close agreement between the standard deviations given in Fig. III-B-11.b.

[Figure III-B-10: EFFECT OF "SINGING FORMANT" (stimulus 45 dB, response 70 dB). Panel a: ON PITCH; panel b: ON PITCH ACCURACY; abscissa F0 (Hz).]

Fig. III-B-10. a. Change in the fundamental frequency difference between the stimulus and response tones observed after adding a "singing formant" to the stimulus tone. The sound pressure level of the stimulus tone was 25 dB below that of the response tone.
b. Corresponding changes in the standard deviations. The subject symbols are the same as in Fig. III-B-2.

[Figure III-B-11: EFFECT OF SPL. Panel a: ON PITCH; panel b: ON PITCH ACCURACY; abscissa F0 (Hz). Curves: stimulus at 45 dB SPL and response at 70 dB SPL, with and without "singing formant" (S.F.); stimulus and response both at 70 dB SPL.]

Fig. III-B-11. a. Intersubject averaged difference between the stimulus and the response tones. The stimulus tone was presented at two intensities and at the lower intensity with and without a "singing formant".
b. Corresponding averages of the standard deviations.

Discussion

According to the findings reported in this investigation the pitch perceived from a vibrato tone appears to correspond approximately to the average fundamental frequency. Moreover, the vibrato does not appear to affect the certainty with which the pitch is perceived, since the vibrato was not found to increase the standard deviations in the pitch matching experiments appreciably. Consequently, the vibrato can hardly reduce the demands on intonation accuracy in singing. These findings agree with the results of a recent investigation in which the relations between fundamental frequency in singing and subjects' impression of bad intonation were examined: if the mean value of the fundamental frequency never coincides with the theoretically correct value during the course of a tone, musically trained subjects tend to judge the tone as "out of tune", no matter if the theoretically correct value lies within the frequency swing of the vibrato (Lindgren & Sundberg, 1972). On the other hand, the stimulus tones in our experiments certainly differ from the signals reaching the listeners' ears in a concert. For instance, one difference is that the average fundamental frequency of the stimulus tone remained constant (or, in the test with non-sinusoidal vibrato, almost constant) in the tests. This seems to be rare in musical performances. Transients in the average fundamental frequency may be of relevance to the effect of a vibrato as regards the pitch and the accuracy of pitch perception. Thus, a vibrato can be relevant when the singer's accuracy in tuning is uncertain. Another difference is that a voice is often accompanied by instruments or other voices in music. It is likely that a vibrato is capable of hiding beats between mistuned sounds under these conditions. Therefore, it may in such situations have a function of covering errors in tuning.

The average fundamental frequency used by the ear in pitch perception appears to be closer to the linear average than to the frequency lying midway between the extremes of the modulation curve. On the other hand, the experiments with the non-sinusoidal vibratos also showed that the steady state portion of the fundamental frequency modulation waveform seems to be of slightly greater importance than the rapidly changing portions when the ear computes the average determining the pitch perceived. An interpretation of this cannot be made on the basis of the available set of data. However, it seems reasonable to assume that the explanation involves some time constant in the fundamental frequency measuring system of the ear; this system might be too slow to follow the rapid changes in fundamental frequency occurring in the distorted modulation waveforms. This interpretation is in agreement with earlier findings where subjects matched the extremes in pitch between which a vibrato tone appeared to swing. These pitches correspond to frequencies slightly closer to the average frequency than the actual frequencies of the extremes (Makepeace, 1931, cit. in Vennard, 1967). Further experiments with systematically varied shapes of the fundamental frequency modulation waveform may elucidate the properties of the fundamental frequency measuring system of the ear.

The "singing formant" seems to have no appreciable influence either on the pitch or on the accuracy of pitch perception. Thus no explanation of its presence in professional singing has been found. Another hypothetical explanation to be tested in future research is that it matches the average spectrum of an orchestral accompaniment in such a way that the tone sung is not masked.

The standard deviation values in all experiments described in this investigation are astonishingly small and appear to be rather insensitive to variations of stimulus properties. The standard deviations in a test of this kind cannot be expected to be identical with the difference limen for frequency, but they could be expected to be of the same order of magnitude. However, the standard deviations are about 5 to 10 times smaller than the difference limen reported for sinusoids. One possibility of explaining this discrepancy is to assume that the difference limen for frequency reported in the literature is too large (Rakowski, 1972). It is also possible that the pitch of a complex sound is perceived with higher accuracy than the pitch of a sinusoid. In previous experiments musically trained subjects set the octave of given stimulus tones. In these experiments the standard deviations were only approximately doubled when the stimulus tone was shifted from a complex wave signal to a sinusoid (Lindqvist & Sundberg, 1971). Consequently it seems reasonable to assume that musical training is of appreciable importance to the difference limen for frequency. Most probably, then, the extremely small standard deviations found in the present test series are due to the musical training of the subjects and to the fact that the stimulus tone was a complex wave.

Conclusions

The pitch perceived from a vibrato tone appears to correspond to the average fundamental frequency to a first approximation. However, the measuring system of the ear is probably not capable of tracking very rapid changes in the fundamental frequency, so that slight discrepancies may occur between the physical average and the average giving the pitch. The pitch perceived does not seem to be affected by the presence of a "singing formant",