

Journal of Sound and Vibration (1976) 45(3), 317-328

A SUBJECTIVE RATING SCALE FOR TIMBRE

R. L. PRATT† AND P. E. DOAK

Institute of Sound and Vibration Research, University of Southampton, Southampton SO9 5NH, England

(Received 28 July 1975)

The factors governing timbre are discussed and a subjective rating scale for their quantitative assessment devised. This scale was used with success to differentiate between a limited number of sounds of varying harmonic content. By using this scale a quantitative measure of the suitability of selected words for timbre was obtained.

1. INTRODUCTION

The word timbre is borrowed from the French language and is usually synonymous with the “tone quality” of a musical instrument. The American Standards Association defines timbre as “that attribute of auditory sensation in terms of which a listener can judge that two sounds similarly presented and having the same loudness and pitch are dissimilar” [1]. This definition is unsatisfactory as it restricts one to judging timbre only under equal loudness and pitch conditions. Consider the following example: a listener hears a trombone play C4 (262 Hz) marked forte, followed by a flute playing A4 (440 Hz) marked piano. He would judge the second sound dissimilar from the first on three counts, namely that it was of higher pitch, quieter, and was produced by a different musical instrument. It is possible therefore for subjects to detect differences of timbre even when two sounds do not have the same loudness and pitch, and this fact must be reflected in any satisfactory definition. It is proposed therefore to define timbre as “that attribute of auditory sensation whereby a listener can judge that two sounds are dissimilar using any criteria other than pitch, loudness or duration”. Before discussing the formulation of a subjective rating scale for timbre, a review of the factors that are thought to govern timbre will be presented.

The first thorough investigation of timbre was carried out by Helmholtz and published in 1862 [2]. After conducting a series of experiments with a set of eight electrically driven tuning forks set at harmonically related frequencies, he concluded that “the quality of the musical portion of a compound tone depends solely on the number and relative strength of its partial simple tones and in no respect on their difference in phase” [2, p. 126]. Later investigators have not always agreed that the relative phase of the partials is unimportant [3], and Pratt [4, p. 37] has further suggested that the phase relationships of the partials may not be constant for musical instruments. Work by Berger [5] and George [6] has demonstrated the psychoacoustical importance of the starting transient, or “attack”, in identifying musical instrument notes. Richardson [7] has shown that the harmonic structure of a note during the attack is not merely a miniature of the steady state. Pratt [4, p. 2] proposed therefore to extend Helmholtz’ original model for timbre (which only applies to the “musical portion” or steady-state of a tone) by postulating that the timbre of a musical note would be more fully described by specifying the complete amplitude versus time history of all harmonics for that note. Recordings of clarinet notes were made in an anechoic room (see Figure 1) and subsequently analyzed to give this information.

† Present address: Department of Physics, University of Surrey, Guildford, Surrey, England.


Figure 1. The attack waveform of A4 (440 Hz) played forte on a clarinet.

The analysis performed on these waveforms was twofold. First, a power spectral density plot of the steady-state portion of the notes was computed to confirm that the spectrum contained energy only at harmonically related frequencies. Band-pass digital filtering then was applied to the attack of the note to yield the amplitude versus time history of each harmonic, and the results for the first four harmonics are shown in Figure 2. This extended model for timbre was then tested by comparing the original recordings with synthesized sounds that employed the data obtained from the digital filtering analysis. The sounds were synthesized by using two completely different techniques.
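The amplitude versus time history of a single harmonic can be illustrated with a short sketch. It does not reproduce the band-pass digital filters used in the study, whose design is not given here; instead it correlates successive short windows of the signal with a sine and cosine at the harmonic frequency, which is a simpler stand-in for the same idea. The function name, sample rate, window length and test tone are all assumptions chosen for illustration.

```python
import math

def harmonic_envelope(signal, sample_rate, freq, window):
    """Estimate the amplitude of the component at `freq` in successive
    non-overlapping windows (a windowed single-frequency correlation,
    used here in place of the band-pass filters of the original study)."""
    amps = []
    for start in range(0, len(signal) - window + 1, window):
        re = im = 0.0
        for n in range(window):
            phase = 2.0 * math.pi * freq * (start + n) / sample_rate
            re += signal[start + n] * math.cos(phase)
            im += signal[start + n] * math.sin(phase)
        amps.append(2.0 * math.hypot(re, im) / window)
    return amps

# Hypothetical test tone: a 440 Hz fundamental with a linear 50 ms attack,
# plus a second harmonic of constant amplitude 0.5.
sample_rate = 8800   # assumed, so that each window holds whole cycles
window = 200         # 200 samples = 10 cycles of 440 Hz
tone = []
for n in range(sample_rate // 5):          # 0.2 s of signal
    t = n / sample_rate
    rise = min(t / 0.05, 1.0)
    tone.append(rise * math.sin(2 * math.pi * 440 * t)
                + 0.5 * math.sin(2 * math.pi * 880 * t))

fundamental = harmonic_envelope(tone, sample_rate, 440.0, window)
second = harmonic_envelope(tone, sample_rate, 880.0, window)
```

In this sketch the estimate for the fundamental rises during the attack and settles near unity, while the second harmonic settles near 0.5. Window length trades time resolution against frequency resolution, as the bandwidth of a band-pass filter would.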

Figure 2. The attack of the first four harmonics of a clarinet note A4 (440 Hz) played forte. The amplitude of the fundamental is defined to be unity.

The first method involved the design and construction of a special purpose electronic sound synthesizer. This generated six harmonically related, frequency-locked sine waves which represented the first six harmonics of a complex musical tone. Each sine wave then was fed to its own attack and decay circuit which independently controlled its rise and fall times. The characteristics of this attack and decay circuit are shown in Figure 3. Note that they have the same general form as those of the harmonics of a real clarinet (Figure 2). Finally the components were mixed to produce the complex tone. The schematic diagram of this synthesizer is shown in Figure 4. Thus the amplitude versus time history of the first six harmonics of a synthesized tone could be set (within certain practical limitations) to correspond to those obtained by analysis, and the synthesized tone then compared subjectively to the original recording.

Figure 3. The characteristics of the attack/decay circuit (horizontal axis: t, seconds); 1 ms < τ < 100 ms.

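Figure 4. A schematic diagram of the electronic sound synthesizer. [Block diagram: a master oscillator drives dividers (÷N) that produce the components F, 2F, 3F, 4F, 5F and 6F, each of which passes through its own attack/decay circuit.]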

The second method involved use of a sound synthesis FORTRAN program called MUSIC V. This program was developed by Mathews [8] and others, and enables a digital computer, with digital-to-analogue (D/A) converters, to simulate a conventional analogue electronic sound synthesizer. By using this program it was possible to produce a complex tone whose harmonics could be set to give the appropriate amplitude versus time history.
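The principle shared by both synthesis methods, summing harmonics that each follow their own amplitude versus time history, can be sketched in a few lines. This is neither the analogue circuit nor MUSIC V. The exponential rise, the function name and every numerical value below are assumptions chosen for illustration only.

```python
import math

def synthesize(fundamental_hz, amplitudes, attack_taus, duration, sample_rate=16000):
    """Sum harmonics 1..N of `fundamental_hz`, each with its own final
    amplitude and its own attack envelope 1 - exp(-t / tau).
    The exponential envelope shape is an assumption."""
    samples = []
    for n in range(int(duration * sample_rate)):
        t = n / sample_rate
        value = 0.0
        for k, (amp, tau) in enumerate(zip(amplitudes, attack_taus), start=1):
            envelope = 1.0 - math.exp(-t / tau)
            value += amp * envelope * math.sin(2.0 * math.pi * k * fundamental_hz * t)
        samples.append(value)
    return samples

# Hypothetical settings for six harmonics of A4 (440 Hz); these are not
# the values measured in the study.
amplitudes = [1.0, 0.1, 0.6, 0.1, 0.3, 0.1]
attack_taus = [0.010, 0.020, 0.015, 0.030, 0.020, 0.040]   # seconds
tone = synthesize(440.0, amplitudes, attack_taus, duration=0.5)
```

A sample rate of 16 kHz is assumed so that the sixth harmonic (2640 Hz) lies well below half the sample rate. The frequencies here are perfectly steady, so the sketch does not include any fluctuation of pitch.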

Thus two different methods were employed to synthesize the original clarinet sounds. Upon listening to the electronic synthesizer (the first method) it became evident that it was not producing sounds at all similar subjectively to real clarinet tones. The most striking feature was that the sounds it produced were readily analyzable by the ear into the corresponding components. The timbre of the sounds produced by the synthesizer most nearly resembled that of an organ pipe. Notes synthesized by using MUSIC V, and the same data from the computer analysis, more closely corresponded to the sound of a clarinet, but were clearly distinguishable from real clarinet notes. The closer resemblance of sounds synthesized by using this technique manifested itself as better “blend”, where blend is defined as a measure of the inability of the ear to resolve a complex tone into its components. This increase in blend arose because the digital samples were not presented to the D/A converter at a uniform rate, which effectively introduced frequency modulation. This result is consistent with the computer study carried out by Risset [9], who noted that trumpet tones displayed “a quasi-random fluctuation of pitch”. He further demonstrated that synthesized tones which included some fluctuation of pitch were almost indistinguishable from the original trumpet tones.

Thus it may be concluded that the timbre of a note depends largely on the harmonic structure, but both the amplitude and frequency of the harmonics vary with time, and it is the precise nature of these fluctuations which is of critical importance for determining the timbre of a note. With the major factors that govern timbre thus identified, some quantitative assessment of their relative importance was next sought.

2. THE RATING SCALE

The perception of timbre has not been investigated as thoroughly as that of pitch or loudness. (However, an investigation of the verbal attributes of timbre has been published by von Bismarck since this study was completed and will be discussed more fully in section 5.) Perhaps the reason is that timbre is not a single readily identifiable factor like pitch or loudness, but a mixture of several factors which combine to give an overall subjective impression. Many adjectives are used informally to describe timbre, and it was decided to investigate whether it would be possible to evolve a more formal scale by using a limited vocabulary of such adjectives. A questionnaire, shown in Figure 5, was drawn up which listed nineteen adjectives commonly used for describing timbre. Subjects were asked to select the six adjectives they thought were the most useful for describing timbre for four categories of instruments. After completing this section they were asked to answer three further questions. The questionnaire was sent to all members (forty-two in total, both staff and students) of the Department of Music at the University of Southampton.

PLEASE HELP

The aim of this questionnaire is to discover whether the timbre of musical instruments can be adequately described by using a limited number of adjectives (six in this case) with a subjective rating scale for each adjective.

Below are listed some words commonly used to describe timbre. For each family of instruments indicated below, and for the combination of all three families, select the SIX words that you feel would be most useful for describing the timbre by ticking the word in the appropriate column. (Thus each column should contain six ticks.) Please fill in the table first and then answer the questions overleaf.

TABLE

               Strings   Woodwind   Brass   Combination of all three
Pure              3         3         1         1
Rich              7         3         6         [illegible]
Mellow            6         7         2         [illegible]
Rough             0         0         3         1
Sharp             1         0         2         1
Colourful         3         5         4         8
Harsh             1         1         5         0
Dull              2         2         1         1
Brilliant         5         2         6         5
Nasal             1         8         0         0
Clear             3         4         2         2
Sweet             4         1         0         1
Clean             2         2         2         0
Penetrating       2         4         7         5
Smooth            3         4         1         2
Bright            6         3         7         7
Transparent       1         0         1         0
Warm              4         3         2         6
Hollow            0         3         1         0

(1) What is your definition of the word timbre?

(2) Do you think that six adjectives are:

(a) too few? [ ]

(b) too many? [ ]

(3) Can you suggest any other useful adjectives for describing timbre?

THANK YOU FOR YOUR HELP

Figure 5. The questionnaire.

Before discussing the results of the main part of the questionnaire in detail, it is appropriate to examine the responses to the three additional questions. For question one, subjects agreed that timbre was the “quality” or “character” of an instrument. For question two, subjects were almost equally divided as to whether six adjectives were too few or too many. In response to question three, subjects generally were able to think of one or two more words to describe timbre but no one word was suggested by more than one subject. Despite the limited response (21%) the results, as displayed in Figure 5, do indicate that certain words were more popular than others. This trend was most noticeable when subjects were asked to pick six words for describing the timbre of the three families combined, and the following seven words emerged as clear favourites: rich, mellow, colourful, brilliant, penetrating, bright and warm. To reduce these words to a more manageable number it was decided that bright was synonymous with brilliant and could be grouped under brilliant. Likewise mellow was incorporated into warm. Colourful and penetrating were later discarded as being not very useful descriptions for the range of sounds the synthesizer produced. This left a basic vocabulary of rich, brilliant and warm. The way in which these words have been chosen (and discarded) has been very idiosyncratic. It may well be that the reader does not agree with some of the statements above regarding the suitability of certain words. Some degree of arbitrariness is inevitable, as those involved in the development of such scales realize [10].

Figure 6. The subjective rating scale for timbre. [Score sheet: for each sound, horizontal line scales with an adjective at each end; Cold/Warm and Pure/Rich are legible.]

With the vocabulary thus selected, the scales were devised by following the method evolved by Osgood, Suci and Tannenbaum, which will be briefly outlined here. In their book The Measurement of Meaning [10] Osgood et al. introduce the concept of the semantic differential. This is essentially a condition of controlled judgmental and rating procedures. The subject is provided with “concepts” to be differentiated (e.g., me, fire, God, America, lake, sword), and a set of pairs of antonymous adjectives which lie at either end of a seven-point “semantic scale”. Examples of such scales are as follows: Heavy/Light, Wise/Foolish, Safe/Dangerous, etc. The authors then postulate a semantic space, Euclidean and multidimensional. Each semantic scale is a straight line function which passes through the origin of this space. The semantic differential is thus “a highly generalizable technique of measurement which must be adapted to the requirements of each research problem to which it is applied. There are no standard concepts and no standard scales; rather the concepts and scales used in a particular study depend on the purpose of the research” [10, p. 76]. The basic vocabulary of rich, brilliant and warm therefore was complemented by pure, dull, and cold which were chosen as being suitable antonyms of the original vocabulary. In this study the “concepts” will be tones of differing harmonic content, and there will be three scales: Dull/Brilliant, Cold/Warm and Pure/Rich. Figure 6 shows the layout of the scales used in the experiment.


3. EXPERIMENTAL PROCEDURE

An experiment next was devised to examine the following two questions.

(1) How consistently can subjects use the subjective rating scale for timbre?

(2) How useful are the words that have been selected in providing a quantitative estimate of the relative importance of the factors governing timbre?

It was decided to present subjects with six sounds of differing harmonic content, but with the same loudness, pitch and envelope. The electronic synthesizer was used to create these six sounds and a tape recording was made with each sound repeated six times (in a balanced Latin square order) making a total of thirty-six presentations. Each sound lasted for one second, followed by a ten second gap in which to rate the sound on a score sheet (shown in Figure 6) by placing a cross at the point the subject thought most appropriate on each of the three scales. The six sounds were selected to give as much range of timbre as possible, and their spectral composition is shown in Figure 7. Twenty-one subjects, drawn mainly from staff and students at the University of Southampton, completed the experiment.
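The paper does not say which balanced Latin square was used for the tape. As a sketch under that caveat, one standard construction for an even number of conditions gives a square in which every sound appears once in each row and column and every ordered pair of different sounds occurs in adjacent positions exactly once. The function name and the idea of reading the rows end to end to form the tape are assumptions.

```python
def balanced_latin_square(n):
    """Return an n x n balanced Latin square (n even) as a list of rows,
    using the zig-zag first row 0, 1, n-1, 2, n-2, ... shifted cyclically."""
    if n % 2:
        raise ValueError("this construction needs an even n")
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    return [[(x + shift) % n for x in first] for shift in range(n)]

square = balanced_latin_square(6)
# Hypothetical tape order: rows read one after another, sounds numbered 1 to 6.
tape_order = [sound + 1 for row in square for sound in row]
```

Read this way, the square yields thirty-six presentations with each of the six sounds heard six times, which matches the counts given above.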

The experimental procedure was as follows. Subjects were presented with four score sheets and were asked to think for a moment about the words on the scale. It then was explained that these words represented opposite sensations, and that the line itself represented the gradual transition between the two extremes. In order to familiarize themselves with the scales, and the range of sounds they would be hearing, they were asked to rate eight sounds which were selected at random somewhere in the tape. No instructions were given as to how they should use the scales, since it was hoped to see if subjects naturally identified tones of a given harmonic content with certain words. After this practice trial they then rated the complete tape of thirty-six sounds. Thus each subject gave six responses to each of the six sounds.

4. RESULTS

From the results, there are two properties concerning the composite scale which can be investigated.

(1) Are the three scales Dull/Brilliant, Cold/Warm and Pure/Rich independent?

(2) Are subjects using the scale to make genuine distinctions between sounds?

To answer question one the results of a subject are presented in the following manner. Since the complete rating scale comprises three separate scales, any two scales may be selected and plotted against each other. Figure 8 shows such graphs for the subject who used the scales with the greatest consistency. Ideally the points should be clustered at a certain point indicating a consistent rating on all three scales for that sound. The trend that emerges here, and was seen to exist in varying degrees in other subjects, was to link Brilliant, Cold and Rich together, and Dull, Warm and Pure together.

Figure 8. The results of the most consistent subject. (a) sound 1; (b) sound 2; (c) sound 3; (d) sound 4; (e) sound 5; (f) sound 6. [Plots of one scale against another for each sound; axes labelled Dull/Brilliant, Cold/Warm and Pure/Rich.]

To answer question two consider just one subject and his rating of (say) sounds 1 and 2. The five inch line used by the subjects may be divided for convenience into the range 0-50, and the exact position of each cross measured in tenths of an inch from the left-hand end of the scale. (It should not be assumed that this is necessarily the resolution of the scale, however.) Since each subject rates each sound six times, the means and standard deviations for each scale then may be computed, and a test of significance (t test) applied to determine the likelihood of these two sets of data originating from the same population. Thus for a given subject consider the values of the means and standard deviations of the Dull/Brilliant scale for sounds 1 and 2. A t test significant at the 5% level was chosen as indicating that the subject could make a genuine distinction between sounds 1 and 2 using the Dull/Brilliant scale. By comparing each of the six sounds together in this way a subject can make a total of fifteen separate “distinctions”. A histogram showing the number of distinctions is shown in Figure 9. For the t test to be valid the standard deviations must be similar. This may be checked by using the F test. For six subjects the standard deviations were significantly different for more than one-third of the tests and these subjects are omitted from Figure 9. (For these omitted subjects the scores were more erratic than the rest and so the results displayed are somewhat higher than the average.) An interesting feature of Figure 9 is the subject who made only one distinction, but whose scoring was nevertheless consistent. It became apparent upon looking at his score sheet that he had used only the middle 20% of the scale, thus severely reducing the resolution and making distinctions very difficult. All subjects had difficulty in distinguishing between sounds 5 and 6, as sounds with little or no fundamental do sound very similar. The sine wave (sound 1) was generally rated Dull, Warm, and Pure. Sounds with little or no fundamental (5 and 6) were rated Brilliant, Cold and Rich (Figure 8). Sounds 2, 3 and 4 generally were rated more in the middle of the scales, but often with sufficient precision to make genuine distinctions possible.

Figure 9. The number of “distinctions” made by all subjects. [Histogram; horizontal axis: number of distinctions.]
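The counting of “distinctions” for one subject on one scale can be sketched as follows. The ratings are invented for illustration and are not data from the study. The pooled-variance form of the t test and the two-tailed critical value for ten degrees of freedom are assumptions about how the test was applied, and the F test check on the standard deviations is left out for brevity.

```python
import math
from itertools import combinations
from statistics import mean, variance

# Two-tailed 5% point of Student's t for 6 + 6 - 2 = 10 degrees of freedom.
T_CRITICAL = 2.228

def t_statistic(a, b):
    """Two-sample t statistic with pooled variance (assumes similar spreads)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

def count_distinctions(ratings):
    """Count the pairs of sounds whose ratings differ significantly."""
    count = 0
    for s1, s2 in combinations(sorted(ratings), 2):
        if abs(t_statistic(ratings[s1], ratings[s2])) > T_CRITICAL:
            count += 1
    return count

# Hypothetical Dull/Brilliant scores (0-50, tenths of an inch) for one subject:
# six ratings of each of the six sounds.
ratings = {
    1: [5, 7, 6, 8, 6, 7],
    2: [20, 22, 19, 23, 21, 20],
    3: [24, 26, 25, 27, 23, 25],
    4: [30, 28, 31, 29, 32, 30],
    5: [44, 45, 43, 46, 44, 45],
    6: [45, 44, 46, 43, 45, 44],
}
distinctions = count_distinctions(ratings)   # out of 15 possible pairs
```

With these invented scores, sounds 5 and 6 have the same mean and are not distinguished, while every other pair is, giving 14 of the 15 possible distinctions.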

Figure 10. The number of “distinctions” made with each scale. [Chart comparing the Dull/Brilliant, Pure/Rich and Cold/Warm scales.]

Finally the relative merit of the three scales when rating sounds of different harmonic content may be examined by displaying the total number of distinctions made by all subjects when using the scales. Subjects used Dull/Brilliant with the greatest reliability, as can be seen from Figure 10.

5. CONCLUSIONS

The most important factor governing the timbre of a musical note usually is taken to be the harmonic composition. However, the amplitude and frequency of the harmonics vary with time, and it is the detailed nature of these variations that is of great importance in determining timbre.

A limited number of words used to describe timbre were selected from a vocabulary of nineteen words, by subjects indicating their preference in a questionnaire. These selected words then were used to form a set of three scales which comprised pairs of antonymous adjectives at either end of a line representing the gradual transition between the two extremes. The three scales were Dull/Brilliant, Pure/Rich and Cold/Warm, and grouped collectively formed the Subjective Rating Scale for timbre. Subjects were able to differentiate between certain sounds of varying harmonic content, using the Dull/Brilliant scale with the greatest reliability.

From these preliminary results, it would appear that construction of a verbal Subjective Rating Scale for the timbre of musical sounds is a possibility. By use of such a scale, the musical sounds with the most distinctive timbres could be identified experimentally. The detailed time-dependent structure of the harmonics of these sounds then could be analyzed (as described in section 1 of this paper) and the results interpreted for purposes of further identification of the physical correlates of timbre.

A similar study by von Bismarck [11] has been published in English since the work reported here was undertaken. Subjects were asked to rate thirty-five sounds on thirty semantic scales, which they themselves had previously selected from a total repertoire of sixty-nine scales. The sounds, as those used in this study, differed only in spectral composition. Two groups of subjects, comprising musicians and non-musicians, took part in the experiments. By using factor analysis it was shown that, for the group of musicians, four scales, Dull/Sharp, Compact/Scattered, Full/Empty, and Colourful/Colourless, accounted for 90% of the variance. The scale Dull/Sharp alone was further shown to account for 44% of the variance.

Although von Bismarck’s original work was carried out in the German language [12], this last result seems consistent with the conclusions of this paper.

REFERENCES

1. Anon. 1960 American Standards Acoustic Terminology. New York: Acoustical Society of America, Inc. See p. 45.

2. H. VON HELMHOLTZ 1885 On the Sensation of Tone as a Physiological Basis for the Theory of Music. English translation by A. J. Ellis, reprinted by Dover Publications, 1954.

3. J. H. CRAIG and L. A. JEFFRIES 1962 Journal of the Acoustical Society of America 34, 1752-1760. Effect of phase on the quality of a two-component tone.

4. R. L. PRATT 1974 M.Sc. Dissertation, Institute of Sound and Vibration Research, University of Southampton. The physical basis of timbre of musical instruments.

5. K. W. BERGER 1964 Journal of the Acoustical Society of America 36, 1888-1891. Some factors in the recognition of timbre.

6. W. H. GEORGE 1957 Acustica 4, 224-225. A sound reversal technique applied to the study of tone quality.

7. E. G. RICHARDSON 1957 Journal of the Acoustical Society of America 26, 431-460. Transient tones of wind instruments.

8. M. V. MATHEWS 1969 The Technology of Computer Music. Massachusetts Institute of Technology Press.

9. J. C. RISSET 1968 Bell Telephone Laboratory Report. Computer study of trumpet tones.

10. C. OSGOOD, G. SUCI and P. TANNENBAUM 1957 The Measurement of Meaning. University of Illinois Press.

11. G. VON BISMARCK 1974 Acustica 30, 146-159. Timbre of steady sounds: a factorial investigation of its verbal attributes.

12. G. VON BISMARCK 1972 Dissertation Technische Universität München. Extraktion und Messung von Merkmalen der Klangfarbenwahrnehmung stationärer Schalle.