Timbral and Melodic Characteristics of the Persian Singing Style of Avaz

HAMA JINO BIGLARI

Master of Science Thesis
Stockholm, Sweden 2012


DT217X, Master’s Thesis in Music Acoustics (30 ECTS credits)
Single Subject Courses
Royal Institute of Technology, year 2012
Supervisor at CSC: Johan Sundberg
Examiner: Sten Ternström

TRITA-CSC-E 2012:026
ISRN-KTH/CSC/E--12/026--SE
ISSN-1653-5715

Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.csc.kth.se


Abstract

The floridly ornamented Persian singing style called avaz was studied, focusing on melodic characteristics of melismatic pitch transitions, phonation types, and formant-to-harmonic relationships. Audio and EGG signals were recorded simultaneously from a professional male singer with a tenor range, who sang a Persian avaz song, scales, rapid tone reiterations, and alternations between two neighbouring tones. Voice source parameters and formant settings (F1 & F2) were measured by inverse filtering the audio signal, using the custom-made DeCap and S-naq programs (Svante Granqvist) and the commercial Soundswell software. The fundamental frequency F0 was measured from the EGG signal using the Soundswell CORR tool.

In the melismatic embellishments, the pitch transitions between melody tones of the modal register were sung via remarkably short falsetto episodes, in which F0 quickly jumped up to a peak and then immediately dived towards the next modal tone. Being produced in this way, tone repetitions and alternations were sung with continuous phonation and had identical voice source data in their modal melody tones. Moreover, in his higher voice range, i.e. above about B♭3 (approximately 235 Hz), the singer tuned F1 to H2 (2 × F0) for most vowels, and sometimes also F2 to some higher harmonic. These findings are discussed in relation to Western operatic formant strategies and some melodic ornaments of early Italian Baroque singing.

Klangliga och melodiska egenskaper inom den persiska sångstilen avaz

Sammanfattning (Swedish summary)

The richly ornamented Persian singing style avaz was studied with a focus on phonation types, melismatic transitions between melody tones, and the relationship between formants and overtones. Audio and EGG signals were recorded simultaneously from a professional male tenor singer, who sang a piece of Persian avaz, scales, rapid tone repetitions, and alternations between two tones. Various voice source parameters and the formant positions (F1 & F2) were measured by inverse filtering the audio signal with the custom-made programs DeCap and S-naq (Svante Granqvist) and the commercial program Soundswell. The fundamental frequency F0 was obtained from the EGG signal with the Soundswell CORR tool.

In the melismatic ornaments, transitions between the melody tones of the modal register were sung via remarkably short falsetto episodes, in which F0 quickly jumped up to a peak and then immediately dived towards the next modal tone. Tone repetitions and alternations produced in this way were sung with continuous phonation and had identical voice sources in the modal tones. In the upper part of his range, i.e. above B♭3 (about 235 Hz), the singer placed F1 on H2 (2 × F0) for most vowels, and F2 was sometimes also placed on higher overtones. These observations are discussed in relation to formant strategies in Western operatic singing and to certain melodic ornaments in early Italian Baroque singing.


Contents

1 Introduction
1.1 Aim of the study
1.2 Voice Science
1.2.1 The Basics
1.2.2 Inverse Filtering
1.2.3 Voice Source Parameters
1.2.4 Research on Iranian singing
1.2.5 Repetitive Melodic Ornaments
1.3 Method
1.3.1 The subject
1.3.2 Protocol
1.3.3 Recording
1.3.4 Analysis
1.4 Definitions and scope

2 Results
2.1 The Fundamental Frequency (F0)
2.2 Voice source
2.3 Formants and Spectrum Harmonics
2.4 The Voice Assessment

3 Discussion
3.1 Discussion on F0
3.2 Discussion on Voice Source
3.3 Discussion on Formants

4 Conclusions

5 Acknowledgements

6 Bibliography


1 Introduction

1.1 Aim of the study

This study aims to investigate some acoustic aspects of a type of traditional Iranian singing. More specifically, it aims to describe characteristic timbral and melodic features of the ornamental singing style known as Persian avaz, with focus on the following three areas:

1) Characteristic features of the melodic line, focusing on the fundamental frequency in melismatic embellishments

2) Voice source characteristics

3) Formant frequencies and their relationship with the harmonics

1.2 Voice Science

1.2.1 The Basics

When singing or speaking, sound is generated as the air pressure is repeatedly disturbed by the oscillatory vibrations of the vocal folds. A vibration cycle in the glottis is usually divided into phases. During glottal closure we have the closed phase, which is followed by the opening phase, defined as the brief interval during which the glottis is moving towards being opened. Thereafter the open phase is entered, during which there is transglottal airflow. The closing phase starts when the glottis moves towards being closed again, and finally the glottis returns to the closed phase. The act of closing the glottis, as well as the act of maintaining glottal closure, is known as adduction. (Sundberg 2001; Vennard 1968)

During the open phase of each phonation cycle, an air pulse passes the vocal folds because of the relatively high pressure in the lungs, i.e. the subglottal pressure. This transglottal airflow is decreased, minimized or completely interrupted, depending on how complete the glottal closure is during the closed phase, i.e. on whether the vocal folds have full contact or not. Thus, the transglottal airflow consists of an air pulse followed by a phase of minimized or stopped air passage. The number of air pulses per unit of time yields the fundamental frequency (F0) of the air pressure disturbances radiated from the lips. The transglottal airflow is the voice source, i.e. the oscillatory component, in singing, and it provides an approximate picture of the vocal fold motion. (Sundberg 2001; Hall 2002)

Due to the complex and abrupt motion of the vocal folds, the resulting vibrations in the air contain not only the frequency of the vocal fold vibration but also higher frequencies that are whole-number multiples of F0, known as partials, overtones or harmonics. That is, because of the opening and closing movements of the vocal folds and the collisions involved in the muscular-membranous motion, the air particles move not only at F0 but also at integer multiples of F0, i.e. 2 × F0, 3 × F0, 4 × F0, etc. However, the energy level of the voice source partials decreases steeply with frequency, by about 12 dB for each doubling of frequency. For example, when singing the tone A4, which has F0 = 440 Hz, the overtones in the harmonic spectrum of the voice source will be at 880 Hz, 1320 Hz, 1760 Hz, 2200 Hz, etc. The air vibration, i.e. the air pressure disturbance, at 440 Hz will then be 12 dB stronger than at 880 Hz, which in turn will be 12 dB stronger than the 1760 Hz harmonic, etc. (Sundberg 2001 & 1989; Hall 2002)

Above the glottis we have sound; the disturbances of air pressure caused by the air pulses above the glottis are sound waves. On their way to the lips, the sound waves hit the inner walls of the space known as the vocal tract, which encompasses the throat as well as the mouth and the nasal cavities. The sound waves are thus reflected or refracted, so that parts of the waves travel back towards the glottis, where they are reflected outwards again. When a returning wave belonging to a higher partial reaches the glottis and is reflected outwards again, it may coincide with the next wave produced by the transglottal air pulse, so that the energy of the reflected wave is added to the energy of the newly produced wave.


Thus, the reflections in the vocal tract result in some partials being strengthened compared to their neighbouring partials. This phenomenon is known as resonance; it occurs whenever returning (reflected) waves are in synchrony with the new waves emerging from the oscillator. For a given voice source oscillating at frequency F0, the shape of the vocal tract at a certain point in time during phonation results in a specific set of partials being strengthened. In other words, the vocal tract functions as a filter that enhances some frequencies of the voice source partials, so that the vibrations at those frequencies become stronger through the addition of the returning waves to the new ones. The neighbouring partials are also somewhat strengthened, while partials further away from the resonance frequencies remain unaffected. (Sundberg 2001 & 1989; Hall 2002)

The specific frequencies that a vocal tract of a certain shape is ready to enhance are known as formants¹. The formants are more or less evenly distributed over the harmonic spectrum; a straight tube, i.e. a cylinder, produces resonances at 500 Hz, 1500 Hz, 2500 Hz, etc., and the vocal tract more or less does the same. However, the positions of the formants within each of the 1 kHz segments of the human voice spectrum vary, so that, for example, the two lowest formants can be at 300 and 2500 Hz. The two, and sometimes three, lowest formants determine which vowel is being pronounced. The first formant (F1), which roughly varies in the range 250-1300 Hz, is controlled by the amount of jaw lowering, so that a small opening produces a low F1 while a large opening gives a high F1. The second formant (F2) is controlled by the tongue position, so that pushing the tongue forward produces a high F2 while drawing the tongue back towards the throat gives a low F2. The third formant (F3) is usually related to the lip rounding and also to the cavity between the tongue and the lips. The rule of thumb is to count on 8-9 formants up to 8 kHz in the harmonic spectrum. (Sundberg 2001)

Thus, the harmonic spectrum of the resulting sound wave radiated from the lips differs from the spectrum of the voice source in that the formants strengthen some partials. The timbre, i.e. the sound quality, is then determined by the combination of the voice source and the formant setting.

One of the questions studied in this thesis is whether F1 and F2 are systematically and intentionally set equal or close to any harmonic for a given vowel and, if so, whether the formant sticks to that harmonic even when F0, and thereby also the higher partials, increase or decrease. A formant is considered to be tuned to a harmonic when it is on or close to the harmonic. When F0 is low, e.g. at about 100 Hz, the partials are closely spaced and the chances are much higher that at least some formant frequencies match some harmonics, whereas such matches are less likely to occur by chance at high F0 values, e.g. in the upper tenor octave and in soprano ranges.
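The numerical relationships described in this section can be illustrated with a minimal Python sketch. It is not part of the analysis tools used in this study. The tube length (17.5 cm) and the speed of sound (350 m/s) are assumed values chosen so that a uniform tube closed at one end reproduces the 500, 1500 and 2500 Hz resonances mentioned above, and the 30 Hz tolerance in `is_tuned` is a hypothetical stand-in for "on or close to the harmonic".

```python
import math


def source_partials(f0_hz, count, rolloff_db_per_octave=12.0):
    """Return (frequency in Hz, level in dB relative to H1) for the first partials."""
    partials = []
    for n in range(1, count + 1):
        # Each doubling of frequency (one octave) lowers the level by the roll-off.
        level_db = -rolloff_db_per_octave * math.log2(n)
        partials.append((n * f0_hz, level_db))
    return partials


def tube_resonances(count, length_m=0.175, speed_of_sound_m_s=350.0):
    """Resonances of a uniform tube closed at one end: (2k - 1) * c / (4 * L).

    length_m and speed_of_sound_m_s are assumed illustrative values.
    """
    return [(2 * k - 1) * speed_of_sound_m_s / (4.0 * length_m)
            for k in range(1, count + 1)]


def nearest_partial(formant_hz, f0_hz):
    """Return (partial number, formant minus partial in Hz) for the closest partial."""
    n = max(1, round(formant_hz / f0_hz))
    return n, formant_hz - n * f0_hz


def is_tuned(formant_hz, f0_hz, tolerance_hz=30.0):
    """Hypothetical criterion: the formant lies within tolerance_hz of a partial."""
    _, distance_hz = nearest_partial(formant_hz, f0_hz)
    return abs(distance_hz) <= tolerance_hz


if __name__ == "__main__":
    for freq, level in source_partials(440.0, 4):
        print(f"{freq:7.1f} Hz  {level:6.1f} dB")
    print(tube_resonances(3))
    print(nearest_partial(470.0, 235.0), is_tuned(470.0, 235.0))
```

With F0 = 235 Hz and F1 = 470 Hz, `nearest_partial` reports the second partial at zero distance, which is the kind of F1-to-H2 match that the formant question above is concerned with.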

1.2.2 Inverse Filtering

As mentioned above, the vocal tract filters the voice source through the formants, so that the air pulses leaving the lips differ from the voice source in that they have been filtered. This means that if the filtering could be undone, the air stream at the lips would be the same as at the glottis, i.e. the voice source. The voice source can indeed be recovered from the filtered air stream being radiated. Instead of undoing the filtering, the filtered air stream is taken as input and filtered with a function that is the opposite of the one given by the vocal tract. This process is known as inverse filtering, and it requires that the singer sings into a mask through which all air passes, so that the air flow can be measured. (Rothenberg 1973)

A similar and almost equivalent way of doing inverse filtering is to use the audio signal (recorded via microphone) instead of the air flow. The difference is that the air flow in the mask shows whether there is any air leakage during phonation, i.e. whether part of the transglottal air is unphonated, which makes the singer sound breathy, whereas the recorded audio signal misses such leakage. (Sundberg 2001)

The inverse filter function for a given formant frequency is nowadays calculated by software programs, such as DeCap, which is made by Svante Granqvist. A point in time in the signal captured from either the air flow or the radiated sound wave (hereafter the audio signal will be assumed) is selected, and the search for the formant frequencies is started. F1 and F2, and sometimes also F3, can be located within their typical ranges for the vowel being sung, and the higher formants can be distributed about 1 kHz apart over the spectrum range. When the user selects a formant frequency, the software program inverse filters the radiated audio signal, so that the harmonic spectrum of the audio signal is modified accordingly. The position of each formant is then iteratively changed back and forth by the user until the resulting harmonic spectrum is reasonably even and decreases by about 12 dB per octave.

Another criterion is that the resulting flow glottogram representing the voice source must have a reasonable shape, which often means that it must contain a pulse (of air) followed by a closed phase with little or no air flow, preferably with no ripples. Yet another criterion is to let the software program differentiate the EGG signal, so that the EGG derivative (dEGG) can be used to find a more reliable flow glottogram by fine-tuning the selected formant frequencies. However, this requires that the beginning of the closing phase in the flow glottogram is recognizable in that the rate of decrease of the curve suddenly slows down.

As a consequence, the final set of formant frequencies selected should be seen as approximate values. The flow glottogram may sometimes contain ripples in its closed phase. Moreover, the beginning of the closing phase may lack a sudden change and will then be impossible to locate, and a closing phase may be followed by an opening phase in which the curve increases and then decreases again before rising towards the next pulse peak. Therefore, the flow glottogram produced through inverse filtering can also sometimes be hard to verify, which in turn can jeopardize the reliability of the obtained voice source.

¹ Some scholars, for example in France, use different terminology and call such frequencies resonances.
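The principle of cancelling a formant can be sketched with a single two-pole resonance and its matching two-zero inverse. This is only an illustration under stated assumptions and does not reproduce how DeCap works internally; the function names, the bandwidth parameter and the numerical settings are hypothetical.

```python
import math


def _coefficients(formant_hz, bandwidth_hz, sample_rate_hz):
    """Map a formant frequency and bandwidth to second-order filter coefficients."""
    radius = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)
    angle = 2.0 * math.pi * formant_hz / sample_rate_hz
    return 2.0 * radius * math.cos(angle), radius * radius


def resonator(signal, formant_hz, bandwidth_hz, sample_rate_hz):
    """Two-pole filter standing in for one vocal tract formant."""
    a1, a2 = _coefficients(formant_hz, bandwidth_hz, sample_rate_hz)
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def inverse_filter(signal, formant_hz, bandwidth_hz, sample_rate_hz):
    """Two-zero filter that cancels a resonator with the same settings."""
    a1, a2 = _coefficients(formant_hz, bandwidth_hz, sample_rate_hz)
    out = []
    x1 = x2 = 0.0
    for x in signal:
        out.append(x - a1 * x1 + a2 * x2)
        x2, x1 = x1, x
    return out


if __name__ == "__main__":
    # Synthetic source: one impulse per cycle, 200 Hz at a 16 kHz sample rate.
    source = [1.0 if n % 80 == 0 else 0.0 for n in range(400)]
    radiated = resonator(source, 500.0, 80.0, 16000.0)
    recovered = inverse_filter(radiated, 500.0, 80.0, 16000.0)
    print(max(abs(a - b) for a, b in zip(source, recovered)))
```

When the inverse filter is set to the same frequency and bandwidth as the resonance, the source is recovered; a mistuned setting leaves a residual oscillation, comparable to the ripples that guide the manual adjustment described above.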

1.2.3 Voice Source Parameters

Data obtained from the air pulse amplitude, together with the closed quotient, reveal the amount of adduction (how hard the vocal folds are pressed together) and the subglottal air pressure level. These parameters in turn determine the phonation type, as there are different phonation types along a continuous scale extending from low adduction and low subglottal pressure to very high values of both parameters. Sundberg describes “pressed” phonation in terms of high adduction of the vocal folds and as being related to a high position of the larynx and high subglottal pressure. He also states that in pressed phonation the maximum width of the glottal opening, i.e. its horizontal vibration amplitude, is at its minimum. Meanwhile the closed phase, i.e. the time when the glottis is closed, is maximized, which means that the open phase has its shortest duration (Sundberg 2001:85).

The difference in SPL between the first and second partials is closely related to the air pulse amplitude and affects the timbre of the radiated sound. The first partial is the same as F0, but in this context it is denoted H1, and the second partial is denoted H2. The difference in SPL is therefore denoted H1-H2 and is measured in dB.

QClosed (Closed Quotient) is another recurring voice source parameter, denoting the ratio of the closed phase to the cycle period. The closed phase denotes the duration of glottal closure, which can be either complete (the vocal folds have full contact, so that the air flow through the glottis is stopped) or partial (there is some air leakage, as the vocal folds are not in full contact). The air pulse amplitude in a flow glottogram shows the momentary air volume passing through the glottis.

MFDR (Maximum Flow Declination Rate), given by the negative peak amplitude of the transglottal airflow derivative, corresponds to the maximum closing speed of the vocal folds. MFDR is of interest because it determines the SPL produced. It is also used in NAQ (Normalized Amplitude Quotient), which reflects the amount of adduction and is defined as the pulse amplitude of the transglottal airflow curve divided by the product of the period and MFDR. (Björkner et al 2006)
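As a hedged illustration of these definitions, the following Python sketch computes the pulse amplitude, QClosed, MFDR and NAQ for one cycle of a sampled flow glottogram. The function name and the 1 % threshold used to decide which samples belong to the closed phase are assumptions made for this example; they are not taken from the software used in this study.

```python
def voice_source_parameters(flow, sample_rate_hz, closed_fraction=0.01):
    """Return pulse amplitude, QClosed, MFDR and NAQ for one glottal cycle.

    flow holds the samples of exactly one period of the flow glottogram.
    closed_fraction is a hypothetical threshold: samples within this fraction
    of the pulse amplitude above the baseline count as closed phase.
    """
    period_s = len(flow) / sample_rate_hz
    baseline = min(flow)
    pulse_amplitude = max(flow) - baseline

    closed_level = baseline + closed_fraction * pulse_amplitude
    closed_samples = sum(1 for value in flow if value <= closed_level)
    q_closed = closed_samples / len(flow)

    # Flow derivative by first differences; MFDR is its negative peak.
    derivative = [(b - a) * sample_rate_hz for a, b in zip(flow, flow[1:])]
    mfdr = -min(derivative)

    # NAQ: pulse amplitude divided by the product of period and MFDR.
    naq = pulse_amplitude / (period_s * mfdr)
    return {"amplitude": pulse_amplitude, "q_closed": q_closed,
            "mfdr": mfdr, "naq": naq}


if __name__ == "__main__":
    # Synthetic triangular pulse: 100 Hz cycle sampled at 10 kHz.
    rising = [n / 40 for n in range(40)]
    falling = [1 - n / 20 for n in range(20)]
    closed = [0.0] * 40
    print(voice_source_parameters(rising + falling + closed, 10000.0))
```

For the synthetic triangular pulse in the example, the pulse falls from its peak to zero in 2 ms of a 10 ms period, so NAQ comes out at about 0.2; a steeper closing flank (higher MFDR) would give a lower NAQ.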

1.2.4 Research on Iranian singing

Some research has been conducted on different aspects of the various styles and traditions found in the music of Iran, by Iranian as well as Western and East Asian scholars and musicians, but those studies have mostly dealt with subjects other than technical or acoustic-physiological questions on singing. Simms (1996) transcribed and analyzed commercial recordings of the master of Persian avaz, Mohammad Reza Shajarian. He found the melodic repertoire to constitute a centonic style, in which some twenty basic blocks, each consisting of a few notes, recur throughout the quasi-improvised style. During (1984) and Miller (1999) both lived in Iran, where they studied different music styles, primarily the Persian radif, i.e. the transcribed canonizations of the traditional melodic repertoire. However, their presentations of the singing styles do not include acoustic-physiological investigations.

Nevertheless, there are two exceptions regarding acoustic-physiological investigations of Iranian singing. Margaret Caton studied recordings of male Iranian singers of Persian avaz as well as of folk music, and she published an article at UCLA in 1974 (see Caton 1974). Caton presented time domain spectrograms of a commercial avaz recording, focusing on the takiyah, i.e. the short falsetto episodes of 50-70 ms in which F0 quickly jumps upwards to a peak and then descends just as quickly towards the next melody tone. She discussed accentuated and non-accentuated takiyah and made statements on the differences between them, such as the usage of aspiration (phonation initiated with the phoneme [h]) and their strong and weak partials. She also discussed the provenance of takiyah types in Kurdistan and Azerbaijan, and she attempted to explain the glottal motion patterns for takiyah as well as for the modal melody tones. (Caton 1974)

Michèle Castellengo and her colleagues, among them Jean During, have investigated acoustic aspects of the falsetto episodes in Iranian tahrir. They studied the following parameters during the takiyah (without actually mentioning the term takiyah): the relationship between vowels and the sound intensity; the F0 jump interval as depending on dynamics, i.e. the relationship between the dynamics and the F0 interval between the takiyah peak and the modal tone; the F0 jump interval as depending on the starting frequency; open quotient comparisons between the takiyah and the modal tones; the duration of the takiyah; and the reproducibility of the takiyah peak frequency. (Castellengo et al 2009)

1.2.5 Repetitive Melodic Ornaments

Tone repetitions and alternations are considered the two basic melodic ornaments of late 16th-century and early Italian Baroque solo singing at the beginning of the 17th century. James Stark discusses the physiological implications of the 400-year-old Italian descriptions of those ornaments in terms of modern voice science, asking which vocal fold motion patterns are physiologically (im-)possible for each type of tone sequence (Stark 1999). Those repetitive ornaments are the tremolo of Zacconi, which according to Stark described tone repetitions, as did the trillo of Caccini. Greenlee also regarded tremolo and trillo as denoting tone repetitions, although some scholars would disagree, suggesting instead that tremolo and even trillo described vibrato (see Brown 1976). The ornament for alternation is the gruppo of Caccini, but that too has been open to debate, and some scholars have preferred to interpret even the gruppo as vibrato. (Greenlee 1985; Stark 1999; MacClintock 1976)

Stark combines modern voice science with Manuel Garcia’s theoretical model of the singing voice in his search for the physiological implications of the Italian sources describing tone repetitions and alternations as the two basic ornaments of the early Baroque. Stark considers rapid tone repetitions and alternations impossible to sing with the same phonation type, i.e. with the same glottal motion pattern. In his view, rapid tone repetitions require what he labels “loose phonation”, in which the arytenoid cartilages stay apart from each other, thereby creating an open triangle in the posterior part of the glottis during phonation (Stark 1999:24). Thus, there is unphonated transglottal airflow during Stark’s “loose phonation”, which is the phonation type that Sundberg labels “breathy” (Sundberg 2001).
In other words, Stark argues that rapid tone repetitions require breathy phonation, since each repeated tone must be preceded by full glottal closure, during which the arytenoid cartilages are pushed together so that the posterior 3/8 of the glottis is also closed (Stark 1999). Rapid alternations, on the other hand, are according to Stark not suitable for “loose” phonation, since the arytenoid cartilages would not be able to close and open quickly enough in combination with alternating pitch. Stark states that the suitable phonation mode for glottal alternation is what he calls “anterior phonation”, in which the arytenoid cartilages are brought together by the adductory muscles and remain fixed during phonation. In such phonation, only the anterior 3/5 (or in some cases 5/8) of the glottis length vibrates, while the posterior non-membranous part remains closed (Stark 1999). Stark ascribes this kind of glottal motion to what Sundberg (2001) labels “pressed” phonation, adding that Sundberg’s definition includes a raised larynx position (Stark 1999). However, Stark’s “anterior” phonation encompasses not only pressed phonation but also normal and flow phonation as defined by Sundberg (see Sundberg 2001).

In Biglari (2009), the repetitive ornaments of the early Italian Baroque were discussed, and some of the descriptions in the early sources, as well as their modern musicological interpretations, were compared to the repetitive melodic ornaments found in Iranian singing. Since the data on Iranian singing consisted solely of F0 curves and histograms for old commercial recordings, some questions remained unanswered. Margaret Caton’s statements on the glottal motion during modal and takiyah phonation were the only published account that directly illustrated the vocal folds in a study of the avaz vocal style. That alone would suffice as a reason to study the voice source in this thesis, but the author had also touched upon Stark’s discussion of the above-mentioned physiological possibilities regarding early Baroque singing, which was the reason for studying the voice source in tone repetitions and alternations in particular.

In the 2009 study, it was shown that the phonation in the commercial Iranian recordings was continuous, so that tone repetitions did not contain any pauses. The melody tones were instead preceded by takiyah episodes, and the same pattern was observed in alternations. Although it was not possible to determine whether the same phonation type was in use in both ornaments, it was shown that tone repetitions and alternations could also be produced in other ways than those suggested by Stark. In other words, the 2009 BSc thesis showed that tone repetitions and alternations were sung rapidly with continuous phonation. However, while the existence of interleaved takiyah episodes overruled Stark’s criterion of full glottal closure, it was still unclear whether the Iranian singers sang the repeated modal melody tones with the same glottal motion pattern as the alternating modal melody tones.

1.3 Method

1.3.1 The subject

This study is based on studio recordings made at KTH. The subject is a male professional singer and teacher of Persian avaz, who has learned the style in the traditional way, i.e. by taking lessons for many years from some of the acclaimed masters of the art in Iran.

1.3.2 Protocol

The main idea was to record an excerpt of a traditional avaz song with free meter, containing a variety of pitches and dynamics, sustained tones and typical ornamentations. In addition, scales sung on various vowels were added to the protocol so as to allow study of formant strategies, i.e. the relationships between the lower formants and the spectrum partials. Prompted by the author’s previous studies of melodic ornaments in commercial recordings of Persian avaz, the protocol was extended to include isolated execution of two ornaments, namely tone repetitions, i.e. reiteration of the same melody tone, and alternations between two adjacent notes.

The subject first sang the avaz piece that he had selected, which was in the mode of dastgah-e mahoor and therefore contained the steps of the major scale, or rather the Ionian mode. Thereafter, when asked to sing a shortened version of the song, he repeated the introductory part. Then the subject sang, on the vowels /ɑ, æ, i/, an ascending scale starting from F3 (the tone F at about 175 Hz, not the third formant), followed after a short pause by a descending scale starting on F4, i.e. at about 350 Hz.

When asked to sing tone repetitions, the subject sang two series of tone repetitions, each ending with an improvised phrase.

First series:

- A sustained F4 tone on the vowel /ɑ/, about 1 second long, followed by short repetitions of the same tone.

- A sequence of short repetitions on the vowel /e/ on the pitch of E4.

- An improvised phrase ending containing faster repetitions on the vowel /ɑ/ on the pitches G4, F4, E3, arranged in groups of three.

Second series:

- Repetitions of the tone F4 on the vowel /ɑ/, basically similar to phrase 1.

- A sequence of short repetitions on the vowels /ɑ/ and /e/ on the pitches of E4 and D4.

- An improvised phrase, ascending from C4 to G4 and then descending again, where the tone on each pitch was sung twice.

Upon being asked to sing alternations between two adjacent tones, the subject sang three series of repeated alternations, which were approximately on the following adjacent tone pairs: C♯4-D♯4, D♯4-E4, E4-F4. All the alternations were sung on the vowel /ɑ/.

1.3.3 Recording

The recordings were made in one session in a sound-treated studio at KTH. The audio signal was recorded with a head-mounted omnidirectional microphone (TCM 110, AV-JEFE) at a distance of 10 cm. To calibrate the microphone, a dynamically steady tone was sung and recorded while the SPL was measured by a level meter held next to the microphone; the difference between the recorded and the externally measured SPL could afterwards be used to adjust any recorded SPL value.

The two electrodes of a 2-channel electroglottograph (EGG; Glottal Enterprises, EG2) were attached to the subject’s throat in order to measure the fundamental frequency reliably. The EGG applies a high-frequency (2 MHz) voltage between the electrodes, so that electric current flows through the vocal folds whenever they are in contact. The signal thereby shows when the folds are in contact over time, and thus the frequency of the glottal motion, which is taken to be the same as the fundamental frequency. The audio and EGG signals were recorded on two channels in the commercial software program Swell, and the recordings were stored in .smp files.
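The calibration step described above amounts to applying one level offset to the recording. The following minimal Python sketch illustrates the arithmetic only; the function names and the example levels are hypothetical and are not taken from the recording session.

```python
def calibration_offset_db(reference_spl_db, recorded_level_db):
    """Offset (dB) between the level meter reading and the level read
    from the recording of the same steady calibration tone."""
    return reference_spl_db - recorded_level_db


def calibrated_spl_db(recorded_level_db, offset_db):
    """Apply the calibration offset to any level read from the recording."""
    return recorded_level_db + offset_db


# Hypothetical example values, not measurements from the thesis.
offset = calibration_offset_db(reference_spl_db=94.0, recorded_level_db=71.5)
example_spl = calibrated_spl_db(recorded_level_db=60.0, offset_db=offset)
```

The offset is valid only as long as the microphone distance and the recording gain stay the same as during the calibration tone.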

1.3.4 Analysis

1.3.4.1 Assessment of the Recording

In order to assess how representative our recordings were of Persian avaz (in other words, how typical the avaz excerpts sung by the subject sounded), a panel of experts on Persian music was asked to do an audio-visual listening test. 17 excerpts from our recordings and 21 other recordings of 11 singers (the author’s studio recordings of 5 singers, and commercial recordings of 6 singers) were rated on a non-graded horizontal line denoting “clearly typical” at one end and “clearly untypical” at the other end. 10 excerpts were duplicated so that they occurred at least twice in the test, thus making 51 sound excerpts in total. The duplications covered both typical and non-typical examples (as perceived and purposefully selected by the author) as well as our recordings of the subject. The 17 excerpts of the subject covered phrases on various pitches from the avaz song, including both sustained tones and series of fast melismatic ornaments, ascending and descending scales, and also the isolated melodic ornaments for tone repetition and alternation, as shown in Table I.


Table I The types of excerpts sung by the subject included in the audio-visual voice assessment.

Among the other 21 recordings listed in Table II were the author’s studio recordings covering isolated tone repetitions, alternations and scales along with Persian avaz as well as some non-Persian excerpts, namely typical Kurdish and Azerbaijani songs. The commercial recordings were excerpts of Persian folkloric traditions as well as Persian avaz, including episodes with tone repetitions and repeated alternations. The test participants were asked to neglect linguistic and dialectal characteristics and instead focus on timbre, the singing technique and the ornamentation style.

Table II The excerpts sung by other singers than the subject, either from commercial recordings or from studio recordings at KTH.

1.3.4.2 Acoustical Analysis

The commercial software Swell was used to convert the recorded EGG signal to an F0 curve, i.e. EGG frequency versus time. To obtain a reliable F0 curve, free of errors, the Swell autocorrelation tool CORR was applied to the EGG signal. The resulting data were also stored in an SMP file. F0 histograms were generated in Swell for selected sections of the F0 curves. Since both EGG and audio had been recorded by Swell and thereby were synchronized, it was possible to relate any point in the obtained F0 curve to its corresponding point in the audio signal.

The recorded audio signal was opened in the freeware Wavesurfer in order to obtain its SPL curve in dB versus time. Wavesurfer was also used for the frequency domain representation of the audio, i.e. the levels of the spectrum partials (in dB) at any given point in time in the audio signal.

DeCap, the custom-made software program by Svante Granqvist, was used for inverse filtering. The SMP files containing both audio and EGG signals were opened in DeCap, while the same file and its corresponding F0 curve were opened in Swell. The flow glottogram, the EGG derivative and the voice source partials were viewed in DeCap. The DeCap and Swell windows were synchronized through the link function in both programs, so that selecting a spot in the Swell audio channel automatically redrew the flow glottogram and the other curves for the same point in time in DeCap. Because of the distance of ca 27 cm between the glottis and the microphone (ca 17 cm from the glottis to the lips, plus another 10 cm to the microphone), the audio signal lags the EGG signal; the EGG derivative in DeCap was therefore delayed by 0.8 ms (∆t = 27 cm / (35000 cm/s) ≈ 0.0008 s).
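How an autocorrelation method derives F0 from a periodic signal such as the EGG can be illustrated with a short sketch. This is a generic textbook estimator written for illustration, not the implementation of the Swell CORR tool; the function name, the search range of 120-500 Hz and the synthetic test tone are assumptions.

```python
import math


def estimate_f0_autocorr(signal, sample_rate, f0_min=120.0, f0_max=500.0):
    """Estimate F0 (Hz) of one analysis frame from the lag of the
    strongest autocorrelation peak within the allowed F0 range."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]            # remove any DC offset
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), n - 1)
    if lag_min > lag_max:
        raise ValueError("frame too short for the requested F0 range")
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised sum: longer lags overlap less and are weighted down,
        # which discourages octave-too-low estimates.
        value = sum(x[i] * x[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Synthetic 200 Hz tone, 50 ms at 8 kHz (assumed values for the demo).
demo_rate = 8000
demo_frame = [math.sin(2 * math.pi * 200 * i / demo_rate) for i in range(400)]
demo_f0 = estimate_f0_autocorr(demo_frame, demo_rate)
```

Applied frame by frame, such an estimator yields an F0 curve versus time. The frequency resolution is limited by the integer lag, so a practical tool would typically interpolate around the peak.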

Table I

| Type of melody | Category | Number of unique recordings | Number of duplications |
|---|---|---|---|
| Alternation | Alternation | 1 | 0 |
| Scale | Scale | 4 | 0 |
| Tone repetitions | Tone repetition | 1 | 0 |
| Persian avaz | Song | 11 | 3 |

Table II

| Type of melody | Category | # Recordings (dupl. not incl.) | # Duplications | Type of recording |
|---|---|---|---|---|
| Persian avaz song, by Persian singer | Song | 3 | 2 | Commercial recordings |
| Persian avaz song, by non-Persian singer | Song | 1 | 1 | Author’s KTH recording |
| Kurdish song (on free meter, like avaz) | Song | 2 | 0 | Author’s KTH recordings |
| Azerbaijani song | Song | 2 | 0 | Author’s KTH recordings |
| Non-avaz genres of Persian singing | Song | 3 | 1 | Commercial recordings |
| Scales | Scale | 6 | 2 | Author’s KTH recordings |
| Different alternation by Azerbaijani singer | Alternation | 1 | 2 | Commercial recording |
| Alternation by Persian singer | Alternation | 1 | 1 | Commercial recording |
| Tone repetitions | Tone repetition | 2 | 1 | 1 commercial recording, 1 author’s KTH recording |
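As a consistency check on the listening test material, the counts in Tables I and II can be tallied. The sketch below copies the (unique recordings, duplications) pairs from the two tables; the variable and function names are illustrative only.

```python
# (unique recordings, duplications) per row, copied from Tables I and II.
table_1 = {
    "Alternation": (1, 0),
    "Scale": (4, 0),
    "Tone repetitions": (1, 0),
    "Persian avaz": (11, 3),
}
table_2 = {
    "Persian avaz song, by Persian singer": (3, 2),
    "Persian avaz song, by non-Persian singer": (1, 1),
    "Kurdish song": (2, 0),
    "Azerbaijani song": (2, 0),
    "Non-avaz genres of Persian singing": (3, 1),
    "Scales": (6, 2),
    "Different alternation by Azerbaijani singer": (1, 2),
    "Alternation by Persian singer": (1, 1),
    "Tone repetitions": (2, 1),
}


def totals(table):
    """Return (sum of unique recordings, sum of duplications)."""
    unique = sum(u for u, _ in table.values())
    duplications = sum(d for _, d in table.values())
    return unique, duplications


unique_1, dup_1 = totals(table_1)
unique_2, dup_2 = totals(table_2)
total_presented = unique_1 + unique_2 + dup_1 + dup_2
```

The tally gives 17 and 21 unique recordings, and the duplication columns sum to 3 + 10 = 13, which together account for the stated total of 51 presented excerpts.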


The points in time to be inverse filtered were selected by ear: the author listened through every vowel sung and selected one or several points distributed over the duration of the vowel, thereby covering the various colors and phases of long, shifting vowels. Upon selecting a point in time in the audio channel in Swell, the formants were set one by one in DeCap and adjusted until the following criteria were fulfilled, basically in the following order:

- The shape of the flow glottogram became satisfactory, which in most cases meant that the closed phase was as ripple-free as possible.
- The beginning of the closed phase in the flow glottogram and the negative peak of the EGG derivative coincided.
- The partial peaks in the harmonic spectrum descended more or less evenly.
- The two lowest formants were in accordance with the vowel sung.

For each inverse filtered point in time, eight or nine formants, i.e. F1-F8/F9, were set in the partial frequency range up to 8 kHz, and the resulting inverse filtered flow glottogram as well as the formant frequencies were stored. Thereafter, the two lowest formants were plotted versus F0 so as to reveal their relationship with the spectrum harmonics appearing at integer multiples of F0.

For the voice source characteristics to emerge, the voice source data derived from inverse filtering needed to be analyzed. Although some voice source parameters such as H1-H2 and QClosed can be measured visually in DeCap, it was more convenient to use another software program by Svante Granqvist for that task, namely S-naq. Each SMP file, containing the inverse filtered flow glottogram for a specific time coordinate, was opened in Swell and linked to S-naq, so that the EGG derivative and the flow glottogram could be made visible in the S-naq window. Finally, the period as well as the beginning and the end of the closed phase were marked. Then the voice source parameters NAQ, H1-H2, and QClosed were stored. In this way the vowels /ɑ, æ, i/ were analyzed for all modal pitches in the scale. The same parameters were compared also for the vowels /ɑ, æ, e, i/ in the avaz song recordings. In addition, QOpen was compared between modal and falsetto registers within series of tone repetitions.
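The voice source parameters named above follow from the marked period and closed phase together with the flow glottogram. The sketch below uses the commonly cited definitions of these parameters and is not the S-naq implementation; the function names and example numbers are hypothetical.

```python
import math


def closed_quotient(period_s, closed_start_s, closed_end_s):
    """QClosed: duration of the closed phase divided by the period."""
    return (closed_end_s - closed_start_s) / period_s


def open_quotient(q_closed):
    """QOpen is the complement of QClosed within one period."""
    return 1.0 - q_closed


def naq(flow_peak_to_peak, max_flow_declination_rate, period_s):
    """Normalized amplitude quotient: peak-to-peak flow amplitude divided by
    (magnitude of the negative peak of the flow derivative * period)."""
    return flow_peak_to_peak / (abs(max_flow_declination_rate) * period_s)


def h1_h2_db(h1_amplitude, h2_amplitude):
    """Level difference (dB) between the two lowest voice source partials."""
    return 20.0 * math.log10(h1_amplitude / h2_amplitude)


# Hypothetical example: F0 = 250 Hz (period 4 ms), closed phase from 1 to 3 ms.
q_closed = closed_quotient(0.004, 0.001, 0.003)
q_open = open_quotient(q_closed)
example_naq = naq(flow_peak_to_peak=0.4, max_flow_declination_rate=-1000.0,
                  period_s=0.004)
example_h1_h2 = h1_h2_db(2.0, 1.0)
```

With these definitions, a low NAQ together with a low H1-H2 and a high QClosed points towards pressed phonation, which is how the parameters are read in the results.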

1.4 Definitions and scope

Some of the F0 axes in the diagrams presented in this thesis show the absolute frequency in Hz. However, in most diagrams the frequency is represented relatively, as an interval above a certain reference frequency, which usually is 220 Hz, i.e. the tone A2. In such cases, the unit is written as [ST above A2], or [ST rel 201.5 Hz], where the abbreviation ST denotes semitones.

Subglottal pressure (Psub) was not measured during the recording, although the initial aim had been to measure it. The SPL was calibrated so as to compensate for the absence of Psub data, and the author intends to include Psub measurement in future studies. All dB values for SPL are given relative to a reference pressure of 20 µPa.

Tone repetitions and alternations were included in the protocol partly in reference to the author’s previous discussions of those repetitive melodic ornaments. In a BSc thesis in musicology, completed in 2009 at Uppsala University, the author analyzed F0 curves and histograms for excerpts of commercial recordings of Persian avaz as well as similar styles of singing from the Kurdish and Azerbaijani traditions. Some questions remained unanswered in the BSc thesis, and they will be addressed in this thesis. Therefore, some of the issues discussed in the BSc thesis will be briefly presented here so that the central (unanswered) questions can be discussed in light of the new data, including the idea that Iranian and some other Eastern singing techniques appear to be similar to Italian Baroque ornaments, as described in the historic sources.
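The relative frequency unit [ST above ...] defined at the beginning of this section is a logarithmic conversion of the frequency ratio. A minimal sketch of the conversion in both directions follows; the function names are illustrative, and the default reference of 220 Hz is the one stated in the text.

```python
import math


def semitones_above(frequency_hz, reference_hz=220.0):
    """Interval in semitones (ST) of frequency_hz above reference_hz."""
    return 12.0 * math.log2(frequency_hz / reference_hz)


def frequency_from_semitones(semitones, reference_hz=220.0):
    """Inverse conversion: frequency in Hz for an interval in ST."""
    return reference_hz * 2.0 ** (semitones / 12.0)


octave_st = semitones_above(440.0)            # one octave above 220 Hz
unison_st = semitones_above(201.5, 201.5)     # reference used in some figures
```

One octave corresponds to 12 ST, so an axis running from 0 to 24 ST above 220 Hz covers 220-880 Hz.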


Some musicological literature on early Italian singing is referred to in this thesis, mostly regarding issues discussed in the author’s BSc thesis, where the early Baroque vocal ornaments for tone repetition and alternation were discussed. Both the historic sources (Vicentino, Confurot, Bovicelli, Zacconi, Caccini) and the musicological literature discussing them (Greenlee, MacClintock, Galliver, Brown, Stark) can be skipped by a reader who is mainly interested in the main questions of this thesis, namely some melodic and timbral characteristics of Persian avaz in terms of F0, voice source and formant settings.

Caton’s (1974) account of genre-related, ethnic and geographical characteristics of various types of takiyah has not been considered in this thesis, which is specifically about the Persian singing style of avaz. Neither does this thesis deal with Caton’s division of takiyahs into accentuated and non-accentuated, based on differences in aspiration (phonation initiated with the phoneme [h]) and partial strength in the spectrogram. The author would be interested in examining different kinds of takiyah in future studies.


2 Results

2.1 The Fundamental Frequency (F0)

Modal and falsetto registers were used in the avaz song as well as in the scales and the repetitive ornaments, i.e. alternation and repetition. All melody tones were sung in modal register, while the melismatic transitions from one melody tone to the next were sung in falsetto: F0 quickly jumped up to a frequency peak well above the next melody tone and then quickly dived towards it. These short falsetto episodes are known as takiyah, which literally means leaning or support, i.e. appoggio (Caton 1974; spelled tekye in Simms 1996, in accordance with modern Farsi pronunciation). Some examples of takiyah are shown in the F0 curve in Fig. 2.1.1. Caton studied the takiyah as the basic building block of ornamentation, and she presented different kinds of takiyah, which she ascribed to different genres. This thesis follows Caton’s approach, focusing on takiyah rather than tahrir (see footnote 2).

Fig. 2.1.1 (Left): Examples of takiyah and tahrir marked on the F0 curve for a melismatically ornamented melody in the avaz song. (Right): Schematic view of melismatic pitch transition in avaz.

The modal melody tones in the avaz song covered the pitch range G3-Ab4. The lowest takiyah peak was 150 cents above G3 and occurred between the lowest modal tones F3 and G3 in the scales. The highest takiyah peak was at A4 in the avaz song. Occasionally, takiyahs preceding the same modal melody pitch had varying peak frequencies, especially in the scale recordings, for example the peaks between F3 and G3 varied from 150 cents above G3 to C4. Such differences in peak were seen between ascending and descending scales on the same vowel as well as between the different vowels /ɑ, æ, i/, and even between two recordings of descending scale on the same vowel, as shown in Fig. 2.1.2.
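The peak heights quoted above mix intervals in cents with note names. The sketch below expresses both on the same cent scale so that the spread between the lowest and highest peak can be read off. The equal-tempered frequencies assumed for G3 and C4 (A4 = 440 Hz) are an assumption made for this illustration and are not measured values from the recordings.

```python
import math

# Assumed equal-tempered frequencies (A4 = 440 Hz), for illustration only.
G3_HZ = 196.00
C4_HZ = 261.63


def cents_between(lower_hz, upper_hz):
    """Interval from lower_hz up to upper_hz in cents (100 cents = 1 ST)."""
    return 1200.0 * math.log2(upper_hz / lower_hz)


def frequency_at_cents_above(base_hz, cents):
    """Frequency lying the given number of cents above base_hz."""
    return base_hz * 2.0 ** (cents / 1200.0)


lowest_peak_hz = frequency_at_cents_above(G3_HZ, 150.0)
highest_peak_cents = cents_between(G3_HZ, C4_HZ)
peak_spread_cents = highest_peak_cents - 150.0
```

Under these assumptions, the peaks between F3 and G3 span roughly 150 to 500 cents above G3, i.e. a spread of about 350 cents.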

Fig. 2.1.2 F0 in scales. (Left): The takiyah peaks differ between ascending scales on the vowels /ɑ, æ, i/. (Right): The takiyah peaks differ also between two recordings of a descending scale on /i/.

2 In other studies not dealing with voice science, the discussion is rather tahrir-oriented, probably due to the fact that the takiyah is sometimes not discussed at all by singers and teachers of avaz. Even when takiyah occurrences are marked between the melody notes in avaz transcriptions, the discussion often remains tahrir-oriented, e.g. in Tatsumura (1980) and Simms (1996).

[Fig. 2.1.1, left panel: “F0 for melismatically ornamented melody in Persian avaz”. Axes: F0 [Hz] versus Time [ms]. Vowel labels: [a:], [o:], [ae], [i:]. Annotations: “Takiyah (lit: appoggio)”; “Tahrir = melismatic ornamented melody (here with 17 takiyahs)”.]

[Fig. 2.1.1, right panel: schematic, Freq versus Time. Annotations: “Melody tone, modal register”; “Takiyah, ornament, falsetto”; “Pitch transition from one modal tone to the next (mostly) goes via takiyah”; “Peak frequencies vary”; “Not a tone; no distinct pitch”; “Interval not significant; can be off scale”; “Register breaks”; “Continuous phonation”.]

[Fig. 2.1.2, left panel: “Ascending scales on /ɑ, æ, i/”. Axes: F0 [ST above A2] versus Time [ms]. Right panel: “Descending scales on /i/”. Axes: F0 [ST rel A2] versus Time [ms].]


The takiyah peak frequencies in the avaz song were more consistent, and thus appeared more controlled, when two recordings of the same episode were compared; the fundamental frequency in several phrases containing series of modal-takiyah-modal transitions was found to be more or less identical between the two avaz recordings, as shown in Fig. 2.1.3.

Fig. 2.1.3 The two recordings of the avaz song are rather similar. The difference in timing is minor and not significant, since the song is in free meter. The differences in takiyah peaks in the tahrir in the initial 4 seconds are not musically significant.

However, takiyah peaks preceding the same modal pitch varied much more when compared within the same recording, both within the same exhalation and between two different phrases. Sometimes even two takiyahs that were very close in time, e.g. preceding two consecutive modal tones of the same pitch, had different peaks. Variations in peak could be seen even between two takiyahs (T1 and T2) that not only preceded the same modal pitch M2 but also departed from the same modal pitch M1. Note that the two simplest cases of such scenarios are tone repetition (M1-T1-M1-T1-M1-…) and alternation (M1-T2-M2-T3-M1-T2-M2-…), which are shown in Fig. 2.1.4.

Fig. 2.1.4 Takiyah peak variations in tahrirs made of tone repetitions and alternations. a-b) The takiyah peaks in tone repetitions vary by about 100 cents, but the modal tone frequency is the only significant frequency according to the histogram. c-d) The takiyah peaks preceding the same pitch vary also in alternations, but again the histograms show that only the modal tones dominate.

In the excerpts presented in Fig. 2.1.4, the takiyah peaks preceding modal tones at the same pitch in the repetitive sequences mostly vary within about 100 cents. Nevertheless, the histograms confirm that tone repetitions and alternations were being sung: in the tone repetitions, the dominating frequency clearly belongs to the repeated modal tone. Similarly, the histograms for the alternations show that the two melody tones were sung with equal durations, although the upper tone produces a less stable histogram curve with longer tails, since both takiyah peaks occur above the upper tone. Thus, the histograms show that what was sung were tone repetitions and alternations, even though the F0 curves show that they were produced with interleaved takiyah episodes.
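The reasoning above, that the modal tone dominates an F0 histogram even when every repetition is separated by a takiyah, can be illustrated with a small sketch. It is not the Swell histogram function; the bin width of 0.25 ST, the function names and the synthetic F0 track are assumptions made for the illustration.

```python
import math
from collections import Counter


def f0_histogram(f0_track_hz, reference_hz=220.0, bin_width_st=0.25):
    """Relative occurrence of F0 samples, binned in semitones above
    reference_hz. Returns {bin centre in ST: fraction of samples}."""
    bins = Counter()
    for f0 in f0_track_hz:
        st = 12.0 * math.log2(f0 / reference_hz)
        bins[round(st / bin_width_st) * bin_width_st] += 1
    total = len(f0_track_hz)
    return {centre: count / total for centre, count in sorted(bins.items())}


def dominant_bin(histogram):
    """Bin centre (ST) with the highest relative occurrence."""
    return max(histogram, key=histogram.get)


# Synthetic tone repetition: long modal frames near 349 Hz, interleaved with
# short takiyah frames whose peak frequencies differ (hypothetical values).
track = [349.23] * 8 + [392.0, 415.3] + [349.23] * 8 + [400.0, 380.0]
hist = f0_histogram(track)
peak_bin = dominant_bin(hist)
```

Because the takiyah frames are few and spread over several bins, they only form low tails in the histogram, while the repeated modal tone forms one clear peak.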

[Fig. 2.1.3: “Two recordings of Avaz song (ca 700 ms silence added in REC 2)”. Axes: F0 [ST above A2] versus Time [ms]. Legend: REC 1, REC 2.]

[Fig. 2.1.4, panels a-d: “Tone repetition #1” and “Tone Repetition #3”, F0 [ST rel 201 Hz] or [ST rel A2] versus Time [ms], each with an F0 histogram (Occurrence versus F0 [ST above 201 Hz]); “Alternation #2”, F0 [ST rel 201.5 Hz] versus Time [ms], with an F0 histogram (Occurrence versus F0 [ST above 201.5 Hz]); “Alternation in Avaz song”, F0 [ST rel A2] versus Time [ms], with “F0 Histogram for alternation in Avaz song” (Occurrence versus F0 [ST rel A2]).]


Only a few takiyah peaks were more than 500 cents above the next melody tone in the avaz song. In the scales, the peaks were higher, sometimes as high as ca 700 cents above the next melody tone, whereas in tone repetitions and alternations they were well below 500 cents. The difference between the lowest and the highest takiyah peak frequencies within each sequence of tone repetitions was approximately 150-200 cents, with the lower peaks usually at the beginning and the highest peak usually somewhere after the initial 6-8 repetitions. The last few tone repetitions had decreasing takiyah peaks and were almost twice as slow as the initial ones. Thus, the envelope of the F0 curve resembled a parabola: it ascended to the maximum peak midway through the repetition sequence and descended somewhat thereafter.

The SPL dropped during the takiyah episodes, regardless of which vowel was being sung, as shown in Fig. 2.1.5. Whenever the F0 curve jumped up towards a takiyah peak, the SPL curve dropped to a local minimum, and it rose again when F0 was diving from the falsetto peak towards the next modal tone. Considering the modal melodic pattern only, however, the SPL moved in the same direction as the modal tones; the SPL increased and decreased in parallel with ascending and descending modal tone patterns.

Fig. 2.1.5 F0 and SPL in scales on the open vowel /ɑ/ and the closed vowel /i/, and also on /i/ in a tahrir in the avaz song. SPL dropped at all takiyah episodes regardless of vowel.

Our data did not indicate any clear correlation between SPL and takiyah peak intervals, neither during the modal tones nor during the takiyah episodes, as shown in Fig. 2.1.6. In the alternation, the SPL alternated along with F0 for the modal tones and dropped during the takiyah peaks. Still, the takiyah peaks mostly increased although the SPL of each modal tone pitch remained unchanged. It also seemed that the SPL mostly dropped less when the falling interval between the takiyah peak and its following modal tone increased. On the other hand, the SPL increased during the initial short tones of each tone repetition sequence and reached its maximum near the repeated modal tone that had the highest takiyah peak. That is, it seemed that increasing dynamics to some extent led to higher takiyah peaks when the modal tones had the same pitch. However, while the SPL stayed high throughout the few longer modal tones at the end, the takiyah peaks in those last repetitions decreased. It seemed that the SPL of the modal tones as well as of the takiyah episodes in some cases increased without affecting the jump intervals between the takiyah peaks and the modal tones.
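The observation above is a visual reading of the curves; no correlation coefficient is reported in the thesis. If one wished to quantify such a relationship, a Pearson coefficient between the SPL of each modal tone and the height of its preceding takiyah peak would be one option. The sketch below shows that computation on purely hypothetical numbers that are not data from the recordings.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not measured data):
spl_db = [100.0, 102.0, 104.0, 106.0]          # SPL of successive modal tones
peak_cents = [300.0, 250.0, 350.0, 280.0]      # takiyah peak above each tone
r_example = pearson_r(spl_db, peak_cents)
```

A coefficient near zero, as in this made-up example, would correspond to the absence of a clear linear relationship; with only a handful of tones per sequence, any such value should be read with caution.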

Fig. 2.1.6 SPL and F0 during tone repetitions and alternation.

[Fig. 2.1.5 panels: “Scale on /ɑ/, SPL & F0” and “Scale on /i/, SPL & F0”, axes F0 [Hz] and SPL [dB] versus Time [centiseconds]; “Avaz song, Tahrir on vowel /i/”, axes F0 [ST rel A2] and SPL [dB] versus Time [centiseconds].]

[Fig. 2.1.6 panels: “Alternation: F0 & SPL” and “Tone repetitions” (two panels). Axes: F0 [ST rel A2] and SPL [dB] versus Time [centiseconds].]


2.2 Voice source

The voice source data (NAQ, H1-H2, QClosed) for the scales sung in modal register on the vowels /ɑ, æ, i/ are shown versus F0 in Fig. 2.2.1, for the approximate pitch range F3-F4. On close inspection, the NAQ data were scattered for /æ/, while they remained low for /ɑ/. For /i/, NAQ tended to increase linearly with F0, such that the value nearly doubled when a tone was raised by one octave. The values of H1-H2 mostly remained in the range 4-6 dB for /ɑ/ and /i/. QClosed mostly tended to increase with F0 for /æ/, to decrease for /i/, and to remain within the range 0.5-0.6 for /ɑ/.

Fig. 2.2.1 Voice source data (NAQ, H1-H2, QClosed) in the ascending scales sung on the vowels /ɑ, æ, i/.

The same voice source parameters (NAQ, H1-H2, QClosed) for the vowels /ɑ, æ, i, e/ sung in the avaz song are presented versus F0 for the pitch range 220-380 Hz in Fig. 2.2.2. For /ɑ/, NAQ remained more or less constant around 0.1 over the entire F0 range, while for /i/ it tended to increase with F0, and for /æ, e/ it was more scattered. Thus, the NAQ values for /ɑ, æ, i/ in the avaz song showed the same tendencies as in the scales. The H1-H2 values in the avaz song tended to be higher than the scale values; some values for /æ/ in the lower part of the F0 range were above 12 dB. Also most other H1-H2 values in the avaz song were a couple of dB higher than in the scales. The QClosed values in the avaz song were mostly in the same range as in the scales, i.e. ca 0.3-0.6, with the difference that fewer values in the avaz song were in the lower part of the range, and also that /ɑ, i/ were more scattered and did not show any tendency to increase with F0.

Fig. 2.2.2 Voice source data (NAQ, H1-H2, QClosed) in the avaz song on the vowels /ɑ, æ, i, e/.

[Fig. 2.2.1 panels: “Scales”, showing NAQ, H1-H2 [dB] and QClosed versus F0 [Hz]. Legend: /i/, /æ/, /ɑ/.]

[Fig. 2.2.2 panels: “Vowels in avaz song”, showing NAQ, H1-H2 [dB] and QClosed versus F0 [Hz]. Legend: /ɑ/, /æ/, /i/, /e/.]

There may be several reasons why some voice source data from the scales and the song differ. The narrow and fragmented F0 coverage for the vowels /i, e/ makes it more difficult to discern a systematic variation with F0. It might also be relevant that the scale tones were sung in one phrase and each tone was sung once, producing basically one data point per pitch for each vowel. In the avaz song, on the other hand, tones at the same pitch occurred in several phrases, thus producing more than one value for a given F0 on a given vowel.

The NAQ values varied in the range 0.06-0.21, which indicates great variation in the level of adduction. The vowel /ɑ/, which covered the widest F0 range, approximately A3-G4, had the least variation in NAQ, which mostly stayed around 0.1, thereby indicating a consistently high level of adduction. It should be added, however, that at the approximate pitch F♯, the NAQ range was 0.06-0.12, which indicates a doubled adduction from one occasion to another. Compared to /ɑ/, the vowel /æ/ varied more: while it was the vowel with the highest NAQ value, 0.15, in the F0 range Bb3-Db4, it seemed to have been sung with twice as high adduction at the pitch Eb4. In the approximate F0 range D4-E4, the highest NAQ values were up to 0.21 for the vowels /i, e/, which indicated a much lower adduction level. This should not be interpreted as suggesting that the vowel per se determined the level of adduction; it rather suggests the opposite, namely that neither the vowel nor the pitch determined the singer’s choice of adduction.

The H1-H2 values should be viewed in light of these NAQ observations. The highest H1-H2 values belong to the vowel /æ/ sung in the relatively low F0 range Bb3-Db4, which in combination with the moderate adduction (NAQ was 0.15) indicates large air pulses, i.e. high amplitude in the flow glottogram. This is anything but pressed phonation, so the closed quotient should be expected to be low, which turns out to be the case. The vowel /æ/ in the pitch range D4-E4 turned out to have the highest level of pressedness, as H1-H2 was about 3 dB while QClosed almost reached 0.7. It is also interesting to see that not all high-pitched tones sung at about F♯4 were pressed; the vowels /æ, e/ at about 370 Hz had high adduction (low NAQ), low H1-H2 and high QClosed, which means that they were pressed, while the vowel /ɑ/ was sung with clearly pressed as well as non-pressed phonation at about the same pitch.

To give a better overview of the data, the mean values and standard deviations of the voice source parameters as well as of F0 were calculated for each vowel sung in the avaz song, as seen in Fig. 2.2.3. The figure also shows the standard deviations for all of the voice source values regardless of vowel.
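The per-vowel means and standard deviations mentioned above can be computed as in the following sketch. The NAQ values are hypothetical placeholders, not the measured data, and the use of the sample standard deviation (statistics.stdev) is an assumption, since the thesis does not state which form of SD was used.

```python
import statistics

# Hypothetical NAQ values per vowel, for illustration only.
naq_by_vowel = {
    "ɑ": [0.09, 0.10, 0.11],
    "i": [0.12, 0.21],
}


def mean_and_sd(values):
    """Mean and sample standard deviation of a list of measurements."""
    return statistics.mean(values), statistics.stdev(values)


summary = {vowel: mean_and_sd(values) for vowel, values in naq_by_vowel.items()}

# Pooled statistics over all vowels, corresponding to "regardless of vowel".
all_values = [v for values in naq_by_vowel.values() for v in values]
pooled_mean, pooled_sd = mean_and_sd(all_values)
```

The same helper can be applied to H1-H2, QClosed and F0, giving one mean ± SD pair per vowel and parameter.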

Fig. 2.2.3 Standard deviation from the mean values for the voice source data versus F0 in the avaz song on the vowels /ɑ, æ, i, e/.

Castellengo et al. (2009) compared the open quotient (QOpen) in modal and falsetto tones by measuring the EGG derivative (dEGG). They found QOpen to be 0.4 in mechanism M1 (the same as modal register) and 0.8 in M2 (falsetto register). For clarity of comparison, the author measured QOpen in both modal tones and falsetto episodes in sequences of tone repetitions, as shown in Fig. 2.2.4, although it would have been possible to obtain the QOpen values from the QClosed values (QOpen = 1 − QClosed) for all vowels of the avaz song presented in Fig. 2.2.3.

Fig. 2.2.4 (Left): Open Quotient values in falsetto (takiyah episodes) and in modal tones, in sequences of tone repetition. (Right): Open Quotient for modal tones on the vowel /ɑ/ in the avaz song.

[Fig. 2.2.3 panels: “Song”, showing NAQ, H1-H2 [dB] and QClosed versus F0 [Hz]. Legend: /ɑ/, /æ/, /i/, /e/, Mean ± SD (for all 4 vowels).]

[Fig. 2.2.4 panels: “Tone repetitions on vowel /ɑ/”, QOpen versus F0 [Hz], legend Falsetto, Modal; “Modal tones on /ɑ/ in Avaz song”, QOpen versus F0 [Hz].]

Fig. 2.2.4 also shows the avaz song values for modal tones on the vowel /ɑ/. In our tone repetitions, the mean QOpen value for modal tones was about 5/8 of the value for falsetto: the mean values were ca 0.44 and 0.69, with an SD of ±0.08 in both cases. This suggests that there is a clear difference in the open phase between modal and falsetto registers, which corroborates the result of Castellengo et al.

Fig. 2.2.5 shows the voice source parameters for our recordings of fast repeating and alternating modal tones as well as the interleaved falsetto episodes of the takiyahs. NAQ and H1-H2 were about the same in the modal tones of both ornaments, while QClosed was higher in the tone repetitions. This suggests that the tone repetitions in this case were slightly more pressed than the alternation. The falsetto episodes were measured only in the tone repetitions; they had clearly lower adduction and mostly lower closed quotient values, which means that they were not pressed at all.

Fig. 2.2.5 Voice source parameters for tone repetitions and alternations.

2.3 Formants and Spectrum Harmonics

The relationship between the spectrum harmonics and the two lowest formant frequencies will be presented for the tone repetitions and the scales as well as the avaz song. For any given point in time in each recording, the spectrum harmonics were easily calculated from F0, and the formants emerged as a result of the inverse filtering, which also yielded the voice source data. Thus, the data on the formants have the same time and frequency coordinates as the voice source data presented earlier. However, while the voice source data were presented as single points with standard deviation bars, the main focus in presenting the formant data will be how the first and second formants (F1 and F2) develop versus F0, and especially whether F1 lies on any of the spectrum harmonics over a certain F0 range.
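As a minimal sketch of how the harmonics follow from F0, the snippet below computes Hn = n * F0 and finds the harmonic closest to a given formant frequency. The function names and the numerical values (F0 = 250 Hz, F1 = 520 Hz) are hypothetical and chosen only for illustration; they are not measurements from the recordings.

```python
def harmonics(f0_hz, count):
    """Return the first `count` harmonic frequencies Hn = n * F0 in Hz."""
    return [n * f0_hz for n in range(1, count + 1)]


def nearest_harmonic(formant_hz, f0_hz):
    """Return (n, offset): the harmonic number closest to the formant and
    the signed distance formant - Hn in Hz."""
    n = max(1, int(formant_hz / f0_hz + 0.5))  # round to nearest, at least H1
    return n, formant_hz - n * f0_hz


# Hypothetical example values.
f0 = 250.0
f1 = 520.0

print(harmonics(f0, 4))
number, offset = nearest_harmonic(f1, f0)
print(f"F1 lies {offset:+.0f} Hz from H{number}")
```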

Fig. 2.3.1 Scale sung on the vowel /ɑ/. (Left): F1 and F2 in relation to the harmonics. (Right): F1 at the points measured, in relation to H2 as well as the threshold for being considered as tuned to H2, namely the dotted line which denotes H2+50 Hz.

[Plots not reproduced. Fig. 2.3.1 panels: "Scale, vowel /ɑ/", Fn [ST rel A2] versus Fo [ST rel A2], showing F1 and F2 (left) and F1 with H2 ±50 Hz (right). Fig. 2.2.5 panels: "Tone repetitions & alternation", QClosed, H1-H2 [dB] and NAQ versus F0 [Hz] for falsetto, modal (tone repetition) and modal (alternation).]

Fig. 2.3.1 shows F1 and F2 as well as the spectrum harmonics versus F0 for the scale tones sung on the vowel /ɑ/. The scale range, calculated as the interval between the mean F0 values of the lowest and the highest tones, was 168-342 Hz, i.e. one octave (≈ F3-F4). The figure suggests that, in the F0 range F3-A3, F1 was almost constant and increased very little with F0. By the scale tones A3 and Bb3, F1 approached H2, and in the F0 range D4-F4 it was again closer to H2, whereas on the pitch C4 it increased to values well above H2. Thus, one way of interpreting the data would be to ascribe to F1 the tendency of increasing in parallel with H2 in the F0 range A3-F4, so that C4 would be considered a special pitch on which formant tuning was difficult. However, another interpretation could be to regard the tuning on A3 and Bb3 as an inevitable and yet unintended side-effect, as F1 was constant while F0 was rising, which would then mean that the formant tuning occurred only in the F0 range D4-F4. The dotted line in Fig. 2.3.1 corresponds to the frequency 50 Hz above H2, and we can see that some F1 values were less than 50 Hz above H2, namely on the tones A3, Bb3, D4, E4 and F4. Since Henrich et al (2011) identified 50 Hz as the limit between tuning and non-tuning of F1 to H2, the harmonic spectra corresponding to some of the F1 values were studied and compared, as shown in Fig. 2.3.2.
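The 50 Hz criterion can be sketched numerically for the five points (a-e) whose F0, F1 and F2 values are listed with Fig. 2.3.2. The sketch rests on several assumptions that are not stated explicitly in the text: the values are read as semitones relative to A2 (as on the axes of Fig. 2.3.1), A2 is taken as 110 Hz, the assignment of values to the labels a-e follows the order in which they are printed, and a formant is treated as tuned when it lies within 50 Hz on either side of the harmonic. Function and variable names are illustrative.

```python
A2_HZ = 110.0  # assumed reference frequency for "ST rel A2"


def st_to_hz(semitones, ref_hz=A2_HZ):
    """Convert semitones relative to a reference frequency into Hz."""
    return ref_hz * 2 ** (semitones / 12)


def is_tuned(formant_hz, f0_hz, harmonic, limit_hz=50.0):
    """True if the formant lies within limit_hz of harmonic * F0."""
    return abs(formant_hz - harmonic * f0_hz) <= limit_hz


# (F0, F1, F2) in ST rel A2, as listed with Fig. 2.3.2.
POINTS = {
    "a": (16.83, 29.57, 40.36),
    "b": (15.33, 29.10, 39.59),
    "c": (12.42, 26.23, 37.94),
    "d": (14.49, 29.34, 39.06),
    "e": (7.37, 24.12, 37.65),
}

results = {}
for label, (f0_st, f1_st, f2_st) in POINTS.items():
    f0, f1, f2 = st_to_hz(f0_st), st_to_hz(f1_st), st_to_hz(f2_st)
    results[label] = (is_tuned(f1, f0, 2), is_tuned(f2, f0, 4))
    print(f"({label}) F0={f0:.0f} Hz  F1-H2={f1 - 2 * f0:+.0f} Hz  "
          f"F2-H4={f2 - 4 * f0:+.0f} Hz")
```

Under these assumptions, point (c) falls almost exactly on the 50 Hz limit, so its classification is sensitive to rounding of the listed values and should be read with caution.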

Fig. 2.3.2 The harmonic spectra for some of the points in Fig. 2.3.1 where F1 and F2 were presented versus F0.

When F1 was tuned to H2, the second partial was expected to be stronger than its neighbouring partials. But the relationship between F2 and higher partials should also be taken into consideration, as for example in case (b), where F1 was not within the 50 Hz limit from H2 while F2 was within 50 Hz of H4 or H5. In one case (a), both formants were within 50 Hz of H2 and H4, respectively. When F1 was within 50 Hz of H2, as in (a,c), the 2nd partial was clearly stronger than all other partials in the spectrum. When F1 was outside the 50 Hz limit, as in (b,d,e), the 2nd partial was not the only strong one and the fourth partial was usually equally strong, as F2 was usually tuned to H4. When both F1 and F2 were untuned, as in (e), neither the 2nd nor the 4th partial was the strongest, while tuning of both formants again made the 2nd partial the strongest one. Still, perhaps rather unexpectedly, the 2nd partial was the strongest even when F1 was untuned and F2 was tuned, as is seen in case (d).

Fig. 2.3.3 shows F1 and F2 as well as the spectrum harmonics versus F0 for the scale tones sung on the vowels /æ, i/. For the vowel /æ/, F1 and F2 did not seem to be constant for increasing F0. F1 started at H4 on the lowest tone F3, passed through H3 by the tones Bb3 and C4, and was close to H2 on the final tone F4, whereas for all the other tones it was somewhere between the partials. F2 also met the partials H6, H5 and H4 in passing as F0 was increasing. Nevertheless, it was rather clear that none of the formants showed any tendency to join the spectrum harmonics. The absence of formant tuning was most obvious for the vowel /i/, as F1 mostly stayed constant midway between H1 and H2 while the high values of F2 coincidentally crossed some partials without showing any formant tuning strategy.

[Fig. 2.3.2 panels, spectra not reproduced: partial intensity [dB] versus partial frequency [Hz], 0-4000 Hz, for five points of the scale on the vowel /ɑ/. F0, F1 and F2 values as in Fig. 2.3.1:

(a) F0=16.83, F1=29.57, F2=40.36: F1 - H2 < 50 Hz (F1 tuned on H2), H4 - F2 < 50 Hz (F2 tuned on H4)
(b) F0=15.33, F1=29.10, F2=39.59: F1 - H2 = 56 Hz (F1 not tuned on H2), H4 - F2 < 50 Hz (F2 tuned on H4)
(c) F0=12.42, F1=26.23, F2=37.94: F1 - H2 = 50 Hz (F1 tuned on H2), H4 - F2 > 50 Hz (F2 not tuned on H4)
(d) F0=14.49, F1=29.34, F2=39.06: F1 - H2 > 50 Hz (F1 not tuned on H2), H4 - F2 < 50 Hz (F2 tuned on H4)
(e) F0=7.37, F1=24.12, F2=37.65: F1 - H2 > 50 Hz (F1 not tuned on H2), H4 - F2 > 50 Hz (F2 not tuned on H4)]

Fig. 2.3.3 F1 and F2 versus F0 in scales.

The tone repetitions were sung on /ɑ, e/, occasionally with the /ɑ/ tending toward /o/. Despite the frequency jumps during the takiyah episodes and the resulting double register breaks preceding each repeated modal melody tone, the relationship between the formant frequencies and the spectrum harmonics remained unchanged throughout each repetition sequence, as shown in Fig. 2.3.4. Moreover, not only did F1 remain constant, it was also equal to the second harmonic H2 of the modal melody tone. Note that during the takiyah episodes the increase in F0 yields increasing frequencies of the harmonics. Hence, the F1-H2 coupling is broken, as H2 ascends while F1 remains constant.

Fig. 2.3.4 Formant to harmonics relationship for modal tones and falsetto episodes in tone repetitions. (Left): The measured points for each register are seen on the F0 curve. (Middle): F1=H2 for the modal tones that repeated a tone at the same pitch (F0 varied within a 50 cent range). (Right): F1 is not tuned to H2 during the falsetto episodes. This also shows that F1 and partially also F2 remained unchanged throughout the whole repetition sequence, while F0 and thereby also the higher harmonics increased and left the formants behind during the takiyah episodes.

In order to test the null hypothesis that the F1-H2 coupling occurred by coincidence, shorter series of tone repetitions on different pitches were also analyzed. The result clearly indicated the same relationship, namely F1=H2 during the repeated series of modal tones, while the rising second harmonic H2 left the constant F1 behind during the intervening takiyah episodes, as shown in Fig. 2.3.5.
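The way a takiyah breaks the F1=H2 coupling can be sketched with simple arithmetic. The values below (a modal tone at 350 Hz with F1 fixed at 700 Hz, and a takiyah peak three semitones higher) are hypothetical and serve only to illustrate the mechanism; the 50 Hz limit is applied symmetrically, which is an assumption.

```python
import math


def raise_by_semitones(f_hz, semitones):
    """Return the frequency `semitones` above f_hz."""
    return f_hz * 2 ** (semitones / 12)


def is_coupled(f1_hz, f0_hz, limit_hz=50.0):
    """True if F1 lies within limit_hz of the second harmonic 2 * F0."""
    return abs(f1_hz - 2 * f0_hz) <= limit_hz


def max_rise_st(f1_hz, f0_hz, limit_hz=50.0):
    """Largest upward F0 excursion (in semitones) for which H2 stays
    within limit_hz above a constant F1."""
    return 12 * math.log2((f1_hz + limit_hz) / (2 * f0_hz))


# Hypothetical modal tone with F1 tuned exactly to H2.
F0_MODAL = 350.0
F1 = 2 * F0_MODAL

f0_peak = raise_by_semitones(F0_MODAL, 3)  # assumed takiyah peak
coupled_modal = is_coupled(F1, F0_MODAL)
coupled_peak = is_coupled(F1, f0_peak)
rise_limit = max_rise_st(F1, F0_MODAL)

print(f"coupled on modal tone: {coupled_modal}")
print(f"coupled at takiyah peak: {coupled_peak}")
print(f"H2 leaves the 50 Hz band after about {rise_limit:.1f} semitones")
```

In this illustration, H2 leaves the 50 Hz band after a rise of little more than one semitone, which is consistent with the observation that the coupling is lost during the takiyah episodes while F1 stays constant.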

Fig. 2.3.5 Formant tuning in sequences of tone repetitions. (Left): The triangles on the F0 curve show points selected in the modal register in tone repetition series on 4 different pitches, while the small circles show the points selected in falsetto. (Middle): F1=H2 for the modal tones. F2 is untuned to H4 (is not within 50 Hz from H4) except for the lowest selected tone. (Right): The falsetto episodes did not have formant tuning. Both F1 and F2 are more than 50 Hz away from H2 and H3, respectively. On the other hand, both formants increase almost in parallel with the harmonics.

[Plots not reproduced. Fig. 2.3.3 panels: "Scale, vowel /æ/" and "Scale, vowel /i/", Fn [ST rel A2] versus Fo [ST rel A2], showing F1 and F2. Fig. 2.3.4 panels: "Tone repetition #1", F0 [ST rel 201 Hz] versus time [ms], with modal (melody) and falsetto (takiyah) points; "Repeated Melody Tone" and "Falsetto between modal tone repetitions", Fn versus Fo [ST rel 201 Hz], showing F1 and F2. Fig. 2.3.5 panels: "Tone repetition #1", F0 [ST rel 201 Hz] versus time [ms], with modal and falsetto points; "Repeated modal tones at 4 pitches" and "Formants during falsetto episodes between modal tone repetitions", Fn versus Fo [ST rel 201 Hz], showing F1 and F2.]

The formant to harmonics relationship in the modal tones of alternations between two neighbour notes is shown for F1 and F2 in Fig. 2.3.6. Obviously, both F1 and to some extent also F2 move in parallel with F0; the formants alternate as they move up and down along with the two modal tones of the alternation. Still, the most interesting aspect again appeared in F1, as the F1=H2 formant tuning was clearly seen in both the upper and the lower tones of alternations at various pitches.

Fig. 2.3.6 Formant tuning in alternation on the vowel /ɑ/. (Left): The triangles on the F0 curve show points selected in the modal register. (Right): F1 is clearly tuned to H2 for both the upper and the lower modal tone in alternations. F2 tends to increase mostly in parallel with F0, although avoiding formant tuning by keeping distance to H4.

Since the avaz song could be assumed to provide the most typical examples of the stylistically relevant voice quality, its formant settings were of primary interest. The relationship between F1 and F2 and the spectrum harmonics for the four vowels /ɑ, æ, i, e/ sung in the avaz song is presented in Fig. 2.3.7. The data for the vowel /ɑ/ clearly indicate that F1 was tuned to H2 in the approximate range D♯4-F♯4. The same condition was observed for /æ/ in the F0 range B3-F4, except for the pitches around Eb4. In the vowel /e/, F1 was clearly tuned to H2 from almost a quartertone above C4 up to a pitch slightly lower than F♯4. For the vowel /i/, on the other hand, F1 increased almost in parallel with F0 and was much closer to F0 than to H2, without reaching down to F0. F1 was mostly higher than the +50 Hz limit above F0 but seemed to stay close to the limit, and it was within the limit for the F0 range D♯4-E4. It was not clear, though, how to interpret the data, i.e. whether F1 for /i/ was primarily tending toward the F0+50 Hz line as a way of tuning without actually making F1 equal to F0, or whether the primary aim of increasing F1 was to stay above F0, and perhaps also above the F0+50 Hz line, in order to avoid the timbral consequence of F1=F0.

[Plots not reproduced. Fig. 2.3.6 panels: "Alternation #3", F0 [ST rel A2] versus time [ms], with modal (melody) points; "Alternating neighbor tones in modal register", Fn [ST rel A2] versus Fo [ST rel A2], showing F1 and F2. Fig. 2.3.7 panels: "F1 & F2 in avaz song" for the vowels /ɑ/ and /æ/, Fn [ST above A2] versus Fo [ST above A2].]

Fig. 2.3.7 The relationship between the harmonics and the formants F1 and F2, for the vowels /ɑ, æ, i, e/ in the avaz song.

2.4 The Voice Assessment

The voice assessment answers received from four of the participants in the audio-visual listening test will be presented and discussed here. The participants provided their answers either graphically, as requested by the author in the instructions accompanying the test, or numerically. In both cases, the answers were translated into percentages, so that the highest value, denoting “clearly typical”, is 100 and the lowest value, indicating “clearly untypical”, is 0.

The four scale excerpts sung by the subject were rated at about 30, i.e. almost as low as the value 29 given to the duplicated descending scale excerpt sung on the vowel /o/ by a Swedish choir singer, while the average score on scales sung by other Persian singers was 53. This indicates that the scales sung by the subject were quite untypical for the Persian style of avaz. But it must also be added that scales by definition are untypical for the style, simply because singers do not practice them. Traditionally, singing scales is neither part of the teaching nor practiced by students at home, and our subject was not used to singing them. It should also be mentioned that the assessment participants in some cases differed extremely in their ratings of the same scale excerpt. For example, the scale sung by the Swedish singer was scored 4, 10, 40 and 76 by the four participants. Similarly, scales sung by the subject were rated in the interval 2-69, and scales sung by a Kurdish singer on the vowel /ӕ/ were given ratings in the interval 2-95. Only the three scale excerpts sung by a highly acclaimed singer of Persian avaz were given slightly less scattered ratings. The isolated repetitive ornaments sung by the subject were indeed rated as very typical, with values around 90.
Regarding the alternation excerpts, the subject’s recording was the only one that was not taken from a song, and the author was assuming that the excerpt taken from commercial Persian avaz sung by a highly acclaimed singer would get the highest rating, but it landed at about 85. It was even more surprising to see that another alternation excerpt taken from an Azerbaijani commercial recording would be rated as less typical for Persian avaz. The F0 curve of the Azerbaijani alternation shows that it is not even produced as in Persian avaz. That is, the Azerbaijani alternation is not sung with takiyah episodes; the F0 curve just moves smoothly, so that the pitch alternation is achieved with glissando. Nevertheless, those alternations were also rated as clearly typical for Persian avaz. The tone repetition excerpts sung by the subject were also rated as highly typical, scoring about 90, which was also the score given to the tone repetition excerpt sung by one of the most highly acclaimed singers of Persian avaz. The tone repetition excerpt taken from an early commercial recording by the late master Reza Qoli Mirza Zelli (1906-1945) was rated at about 97. Thus, both the alternation and the tone repetition excerpts of the subject were assessed as highly typical for Persian avaz.

The avaz excerpts of the subject were also rated as highly typical, with a mean value of about 87. Two of the participants rated the subject’s excerpts in the intervals 80-95 and 79-98, while another participant gave the rating values 60, 80 and 100. The fourth participant’s ratings were in the interval 54-95. The excerpts sung in the higher voice range and the ornamented phrases, as well as the one containing a tahrir, were scored high by all four participants; the excerpts with ratings in the interval 54-60 turned out to be among the recordings in the lower range, i.e. G3-C4, sung without embellishments. Moreover, as was shown above, the higher range excerpts turned out to be sung with pressed phonation, as opposed to the lower range. This could be seen as suggesting that pressed phonation in the higher voice range is more typical for Persian avaz than non-pressed phonation in the lower range within the same octave. Excerpts taken from commercial recordings of the master of the Persian avaz, Mohammad Reza Shajarian, were rated at about 95, even though the excerpts were not from his highest range. This can be taken as sufficient grounds to claim that the subject sang in a manner highly typical for the Persian avaz style. On the other hand, some excerpts taken from commercial recordings of other Persian styles, as well as one of the excerpts of Kurdish song, were also assessed as highly typical for Persian avaz, as shown in Table III. Since the above mentioned excerpts of alternation and tone repetitions were taken from commercial recordings of Persian avaz, those excerpts are also included in Table III.

Language/Style  Recording      Mean rating  Comment on singer or genre
Persian         Commercial     97.5         Tone repetitions in Persian avaz, sung by Reza Qoli Mirza Zelli
Persian         Commercial     95.2         Sung by the male singer Shajarian, the most acclaimed master of Persian avaz
Persian         Commercial     89.2         Ghazal-Khani (traditional Persian style, different from avaz), male singer
Persian         Subject (KTH)  87.2         Persian avaz, sung by the subject
Persian         KTH            85.5         Persian avaz, sung by (non-Iranian) male Kurdish singer
Persian         Commercial     85.3         Alternation in Persian avaz, sung by Shajarian
Persian         Commercial     82.1         Persian folklore sung by the acclaimed female singer Sima Bina
Kurdish         KTH            79.4         Kurdish song style close to Persian avaz, sung by male Kurdish singer
Persian         Commercial     59.8         Sherwe-Khani (traditional Persian style, different from avaz), male singer
Kurdish         KTH            44.4         Kurdish song, sung by male Kurdish singer
Azeri           KTH            43.7         Azeri song, sung by Iranian Azeri female singer

Table III The audio-visual voice assessment result for song excerpts.

[Plots not reproduced. Fig. 2.3.7 panels: "F1 & F2 in avaz song, vowel /i/" and "F1 & F2 in avaz song, vowel /e/", Fn [ST above A2] versus Fo [ST above A2].]

On the whole, the ratings can be taken as indicating that the subject’s singing was highly typical for the Persian avaz. One of the participants, who rated the subject’s excerpts as 60 and 80, commented that his own ratings inclined towards the uniformity and the dominance in the commercial output as well as the aesthetic ideal of today. In his view, there has been a convergence towards Shajarian’s style in the last few decades, which affected the ratings in that Shajarian and Zelli were considered as models for clearly typical Persian avaz singing. He also added that his ratings of the subject would have been higher if he had also included the more individual and diverse styles of earlier masters.


3 Discussion

3.1 Discussion on F0

In a master’s thesis at UCLA in 1974, Margaret Caton studied recordings of male Iranian singers of Persian avaz along with what she labeled as “art music” and also her own recording of a folkloristic singer in a northern region of the country (see Caton 1974). While there is reason to criticize Caton’s claims about the glottal motion pattern during both takiyah and modal tones, our observations presented in 2.1 about takiyah as being a falsetto episode with varying, undefined and non-significant peaks corroborate her findings. Our data also confirm the findings of Castellengo et al, who presented a much more systematic study of the frequency and register transitions in the takiyah (see Castellengo et al 2009), although our findings do not corroborate some of their results. Here we shall discuss our results in relation to both of those previous studies.

Caton found that the takiyah episodes were sung in falsetto, and that the peak frequencies were undefined, varying and even off-scale at high tempi, while being more even and on-scale at slow tempo. However, it is not clear which parameters in slow tempo resulted in the improved control of takiyah peaks in the avaz genre (labelled as “chant” by Caton). She found the takiyah durations to be 50-70 ms in avaz, but she also wrote that folk songs with ca 100 ms takiyahs had more even and on-scale peaks. It thus remains unclear whether the takiyah peak intonation was improved due to fewer takiyahs per second or because of longer takiyah durations (Caton 1974). Before comparing with our data, the author is obliged to admit that it is rather difficult to measure the duration of the takiyah episode, as the beginning of the jump in the F0 curve seems to occur later than the change in the spectrum.
That is, it was not clear whether the beginning of the takiyah episode ought to be placed somewhere in the last part of the melody tone, while F0 was still either flat or beginning to rise, or whether the takiyah should be seen as starting when F0 began to jump upwards. Analogously, the end of the takiyah episode was hard to define, as the harmonics started to take the shape of a modal tone while F0 was still decreasing towards the intended frequency for the modal tone. Nevertheless, since Caton analyzed the spectrogram of audio recordings and did not report any measurement method for the takiyah duration other than the graphical representation of the frequency jump, it is reasonable to assume that her values correspond to the duration of the F0 jump in our recordings. It is also interesting to see that in some commercial recordings (see Fig. 3.1.1), the F0 curve is much flatter than in our recordings, so that the change in F0 from flat to rising occurs almost abruptly when the takiyah episode is entered, as does the change from decreasing F0 to flat at the end of the episode. Such sudden changes could probably be seen as reliable demarcations for the takiyah episode.

In our recordings, the durations of the sudden F0 jumps of the takiyah episodes were mostly 50-70 ms. The variations in takiyah peak frequencies were much better controlled in the avaz song than in the scales. The results also showed that the existing variation pattern was reproducible, i.e. the takiyah peaks preceding the melody tones of the same phrases in the two recordings were almost identical, while the variations were greater within each recording. However, the variations in takiyah peak did not show any tendency towards correlation with the duration of the takiyah episode, nor with the number of takiyahs per second. Caton’s statement on the relationship between takiyah speed and peak control also appears doubtful when we consider commercial recordings of highly acclaimed singers of avaz.
Analysis of commercial audio recordings reveals that the takiyah peak frequencies can be controlled even in very short and very dense takiyah episodes (duration about 50 ms, recurring about 6-7 times per second). On the other hand, it is plausible that in the specific commercial recording (sung by Zabihi) studied by Caton, the takiyah peak variations actually were dependent on the duration or the recurrence rate of the episodes.
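To make the measurement problem concrete, the following sketch marks a takiyah episode as any run of F0 samples lying more than a fixed threshold above the modal melody tone, and reports its start time, duration and peak. This threshold rule is an assumption introduced here for illustration and is not the procedure used in the thesis, where the demarcation was read from the F0 curve and the spectrum. The F0 track is synthetic, and the names `find_episodes`, `baseline_st` and `threshold_st` are hypothetical.

```python
def find_episodes(f0_st, dt_ms, baseline_st, threshold_st=1.0):
    """Return (start_ms, duration_ms, peak_st) for each run of samples
    that lies more than threshold_st above baseline_st."""
    episodes = []
    start = None
    for i, value in enumerate(f0_st):
        above = value > baseline_st + threshold_st
        if above and start is None:
            start = i  # first sample of a new episode
        elif not above and start is not None:
            segment = f0_st[start:i]
            episodes.append((start * dt_ms, (i - start) * dt_ms, max(segment)))
            start = None
    if start is not None:  # episode still open at the end of the track
        segment = f0_st[start:]
        episodes.append((start * dt_ms, len(segment) * dt_ms, max(segment)))
    return episodes


# Synthetic F0 track in semitones, sampled every 5 ms: a modal tone at
# 10 ST interrupted by two identical 12-sample (60 ms) excursions.
EPISODE = [11.5, 12.0, 12.5, 13.0, 13.5, 14.0,
           14.0, 13.5, 13.0, 12.5, 12.0, 11.5]
track = [10.0] * 20 + EPISODE + [10.0] * 20 + EPISODE + [10.0] * 20

episodes = find_episodes(track, dt_ms=5.0, baseline_st=10.0)
for start_ms, duration_ms, peak_st in episodes:
    print(f"start {start_ms:.0f} ms, duration {duration_ms:.0f} ms, "
          f"peak {peak_st:.1f} ST")
```

The measured duration depends directly on the chosen threshold, which mirrors the difficulty described above: a lower threshold moves the episode boundaries into the preceding and following melody tones.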


Fig. 3.1.1 The takiyah episodes in a commercial recording by the professional avaz singer Alireza Ghorbani (see Ghorbani 2004) are about 50 ms in duration, and yet the peaks are very even.

Caton also found the takiyah episodes to be longer in folk music, namely 100 ms, and that those takiyah peaks were more even and made smaller intervals to the modal melody tones. But the longer takiyahs in various folk music styles in Iran, including the Persian, Azerbaijani and Kurdish folk traditions, are usually also coarser and not at all more integrated with the melody, as claimed by Caton. In addition, some of the greatest masters of the avaz tradition also had varying peaks, either suddenly in single or few episodes, or more recurrently, such as in the example shown in Fig. 3.1.2. One could also argue that, for example, an increasing sequence of takiyah peaks preceding a series of repeated modal tones (of the same frequency) is the expressive, or the implicit, result of a drive towards some specific higher modal tone, as shown in Fig. 3.1.2. In fact, in Caton’s transcription of Zabihi’s commercial recording of avaz, a descending melody line is interleaved with takiyahs in which the peak frequencies decrease by about a quartertone for each episode (see Caton 1974).

Fig. 3.1.2 Takiyah peaks in commercial recordings. (Left) Tone repetitions with short episodes and yet mostly even takiyah peaks, except for some suddenly deviating peaks where the peaks increase by about 100 cents (see Zelli 2000). (Right) Tahrir with short takiyah episodes and varying peaks (see Shajarian: Sayeh Larzan).

One should also keep in mind that the interval between a takiyah peak and its following modal tone depends on whether the modal tone preceding the takiyah was higher or lower; if the modal tone transition is descending, the interval will be greater than in a rising melody. Thus, in tahrir phrases with both ascending and descending tone sequences such as C-D-E-D-E-D-C-D, the takiyah peaks will inevitably produce different intervals to the melody tones. In ascending melodies, the peaks preceding the higher melody range also tend to build smaller intervals, as shown in Fig. 3.1.3.
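The dependence on melodic direction can be illustrated with a deliberately simple model for the sequence C-D-E-D-E-D-C-D. The model assumes that each takiyah peak lies a fixed number of semitones above the preceding modal tone, and that C, D and E are a whole tone apart; both are simplifying assumptions made here for illustration (Persian modes also use other step sizes), and the jump of 5 semitones is a hypothetical value rather than a measurement.

```python
# Assumed pitches in semitones relative to C (whole-tone steps).
NOTE_ST = {"C": 0, "D": 2, "E": 4}


def peak_to_next_intervals(melody, jump_st):
    """For each transition, return the interval (in semitones) between the
    takiyah peak and the following modal tone, assuming that the peak lies
    jump_st above the preceding modal tone."""
    intervals = []
    for previous, following in zip(melody, melody[1:]):
        peak = NOTE_ST[previous] + jump_st
        intervals.append(peak - NOTE_ST[following])
    return intervals


melody = ["C", "D", "E", "D", "E", "D", "C", "D"]
intervals = peak_to_next_intervals(melody, jump_st=5)

for (previous, following), interval in zip(zip(melody, melody[1:]), intervals):
    print(f"{previous} -> {following}: peak is {interval} ST above {following}")
```

Under this model, every descending step yields a larger peak-to-tone interval than every ascending step, which matches the qualitative statement above; the model does not attempt to capture the further tendency towards smaller intervals in the higher melody range.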

Fig. 3.1.3 Tahrir in a commercial recording (Shajarian: Jan-e Oshagh) where each modal tone is sung twice in stepwise ascending and descending phrases. For each pitch, the two takiyahs have different peaks, as the peaks increase in the ascending phrase and vice versa in the descending.

[Plots not reproduced. Panels for Figs. 3.1.1-3.1.3, F0 versus time: "Tone repetition sequences in avaz (sung by Alireza Ghorbani)" and "Alternations in avaz (sung by Alireza Ghorbani)", F0 [ST rel A2] versus time [centiseconds]; "Tone repetitions in avaz (sung by R-Q M Zelli)", F0 versus time [ms]; "Varying takiyah peaks in Tahrir (sung by M R Shajarian)", F0 [ST rel A2] versus time [s]; "Tahrir (sung by Shajarian)", F0 [ST rel A3] versus time [ms].]

Caton stated that short takiyahs were perceived as demarcations and not as distinct pitches. Although this has not been investigated in this thesis, the author agrees with Caton. The author also agrees with Caton on the interval between the takiyah peak and its following melody tone as being neither well defined nor (musically) significant, and that it can be off-scale (Caton 1974). In a much more systematic and scientifically thorough study, Castellengo et al (2009) found the intervals to be individual and varying, and yet well reproducible, which is also the case in our recordings. In addition, they found that, depending on the singer, the intervals were constant or decreasing with starting frequency, i.e. with the preceding modal tone frequency, while they were increasing with dynamics for all singers, so that higher SPL yielded larger intervals. They also studied the intervals versus the vowels /ɑ, o, i/ (Castellengo et al 2009). While the takiyah peaks were not studied versus vowels or modal tone frequency in this thesis, no correlation with SPL could be found (see Fig. 2.1.6).

In general, scholars who have studied Iranian music and Persian singing, e.g. Simms (1996) and During (1984), do not speak of the varying, off-scale and undefined takiyah peak frequencies as musically significant, and especially not as an aesthetic-musical deficiency. Neither do musicians and singers speak of takiyah peaks, and in many cases not even of takiyahs. Nevertheless, Castellengo found that the Iranian singers in her study excellently reproduced constant peaks (Castellengo et al 2009), although the musical significance of such abilities is still open to discussion, and it is also unclear within what range the intervals were considered as constant. In addition, Castellengo et al (2009) found that the SPL dropped during takiyah episodes on the open vowels /ɑ, o/ and that it peaked on the closed vowels /u, i/.
In our recordings, too, the SPL dropped during the takiyah on the vowel /ɑ/, but it dropped for the vowel /i/ as well, both in the scales and in the avaz songs, as shown in Fig. 2.1.5; that is, the SPL dropped during the takiyah episodes regardless of which vowel was being sung. It is possible that the SPL peak measured by Castellengo et al was caused by accidental formant tuning during the takiyah episodes; the peak frequency may have come closer to F1, or some higher partial may have come closer to F2, which should increase the SPL of that specific partial and thereby increase the total SPL of the radiated sound. This will be discussed in 3.3 together with the formant settings.

In a BSc thesis in musicology in 2009, the author analyzed the F0 and histogram curves for commercial recording excerpts of avaz song containing alternations and tone repetitions, and referred to musicological literature addressing early Italian descriptions of similar repetitive ornaments. In the current thesis, the same properties were observed in the F0 and histogram data, and the crucial point was that the same F0 pattern constituted both ornaments: the results for tone repetitions and alternations showed that the modal tones were preceded by takiyahs, which in more general or rather Western terms could be seen as short grace notes. A brief account of the main arguments from the BSc thesis will be presented here. One of the most noteworthy accounts of tone repetitions is the Milanese singer Bovicelli’s description of tremolo in his Regole, passaggi di musica³ from 1594, where his musical notation for the ornament contained falling short grace notes preceding each repeated melody tone, and he clarified in the text that the score example showed how to sing on the same note. He also added that alternation between two neighbour notes should be sung in the same manner (MacClintock 1976; Greenlee 1985:65).
MacClintock adds that Bovicelli’s tremolo denoted the same ornament as Giulio Caccini’s trillo, i.e. quick repetitions of the same tone sung quickly with glottal articulation (MacClintock 1976). While some scholars have also discussed the idea that even Zacconi’s tremolo from 1592 might have designated tone repetitions (see Greenlee 1984; MacClintock 1976; Stark 1999), a more significant point has been made, namely that also Caccini advocated that trillo and gruppo (designated alternation, see Greenlee 1985) would be practiced and performed in the same way by rebeating the same vowel in the throat (Caccini 1983; Wistreich 2000). Thus, Bovicelli and Caccini both stated and even emphasized that tone repetitions should be produced in the same way as alternations. Analysis of tone repetitions sung by early music specialists and opera singers

3 The most important treatise of the 1590s on vocal ornamentation, according to MacClintock (1976).

have shown that F0 increased and almost formed a short peak at the beginning of each repeated tone. The jumps were, however, not as high as in takiyah, and no register breaks were involved, so that the whole series of tone repetitions was sung in modal register (Hakes et al 1987; Brown & Scherer 1992). Alternations, on the other hand, are produced like a vibrato, i.e. by continuously increasing and decreasing F0 between the two tones; the transition up and down between the two tones is made in legato, so that the alternating F0 moves in glissando (Sundberg 2001). It is therefore reasonable to assume that the traditional Western classical way of singing these ornaments, including the technique used by early music specialists, differs from that of the early 17th century. That is, today’s Western way of singing tone repetitions and alternations does not fit the descriptions and the musical notations made by Bovicelli and Caccini.

To arrive at a more convincing idea of how the aforementioned ornaments might have been sung by early Italian Baroque singers, and perhaps also by their predecessors, one should take Bovicelli’s notation with the short grace notes into account. Through detailed studies of the descriptions of vocal ornaments from the 16th and early 17th centuries, some musicologists have reached a new understanding and more robust interpretations of the musical and physiological implications of the basic ornaments of the period. These descriptions repeatedly emphasize two desirable properties: glottal articulation, and a proper balance between detached and bound tones, both of which would be achieved with glottal onset. That is, separation between the tones should not be achieved through pauses in phonation, yet the tones should not be too bound either, and each tone must be sung with glottal onset.
However, glottal onset and what the early Italian treatises advocate regarding detachment and binding of tones are outside the scope of this thesis (see Greenlee 1985 & 1987; Stark 1999; Galliver 1973; MacClintock 1976; Vicentino 1555/1982). Here it will suffice to mention that the register break after the takiyah episode makes the next modal melody tone sound as if it were produced with glottal onset, despite the fact that phonation is continuous. The nature of the register break and its effect upon the next melody tone are also outside the subject of this thesis. Nevertheless, it is worth mentioning the idea that is implied: if Bovicelli’s short grace notes were sung in falsetto and the melody tones in the modal register, then the tone repetition phrase would be sung with continuous phonation, and yet each repeated tone could in theory sound as if it were produced with glottal onset. One might thereby assume that the desirable balance between binding and detachment of the tones was achievable.
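The F0 histograms referred to in this section can be illustrated with a small sketch. It is not the Soundswell procedure used in the thesis; the semitone binning, the 440 Hz reference, the 10 ms frame size and the F0 values are all assumptions chosen for illustration. The point it shows is that short takiyah-like peaks contribute few frames, so the repeated modal tone dominates the histogram.

```python
import math
from collections import Counter

A4_HZ = 440.0  # assumed reference pitch; the thesis does not state one


def hz_to_semitone_bin(f0_hz, ref_hz=A4_HZ):
    """Round an F0 value to the nearest semitone step relative to ref_hz."""
    return round(12 * math.log2(f0_hz / ref_hz))


def f0_histogram(f0_track_hz):
    """Count voiced frames per semitone bin; frames with F0 <= 0 are skipped."""
    return Counter(hz_to_semitone_bin(f) for f in f0_track_hz if f > 0)


def dominant_bin(f0_track_hz):
    """Return the semitone bin that holds the most frames."""
    return f0_histogram(f0_track_hz).most_common(1)[0][0]


# Hypothetical F0 track at 10 ms per frame: three repetitions of a modal tone
# near D4 (294 Hz), each preceded by a takiyah-like peak lasting 60 ms.
takiyah_frames = [330.0, 392.0, 415.0, 392.0, 349.0, 311.0]
modal_frames = [294.0] * 20
track = (takiyah_frames + modal_frames) * 3

hist = f0_histogram(track)
print("dominant bin (semitones re A4):", dominant_bin(track))
print("frames in dominant bin:", hist[dominant_bin(track)], "of", sum(hist.values()))
```

With these invented values the modal tone accounts for 60 of 78 frames, while no takiyah bin holds more than 6.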

3.2 Discussion on Voice Source

In an attempt to describe the vocal fold motion, Caton presented an erroneous speculation, as she had misinterpreted Vennard’s schematic pictures of the glottal motion during a phonation cycle (see Vennard 1968; Caton 1974)⁴. It seems that Caton did not distinguish between the voice source and the articulators, and she ascribed to the vocal folds a certain shape in order to produce the closed back vowel /ω/ during the takiyah. The picture that she used for phonation during the takiyah was Vennard’s picture of the beginning of the opening phase, while her picture for phonation during modal tones was Vennard’s illustration of the open phase, when the vocal folds are wide apart.

The only voice source data that the author has found among the studies on Persian singing are in the work by Castellengo et al (2009), who presented QOpen values of 0.4 for modal tones and 0.8 for falsetto. It should be mentioned, though, that Castellengo et al and other French scholars use a different terminology for such phonation modes. Instead of the term register they use the notion of mechanism, so that Mechanism 1 (M1) corresponds to the modal register, while Mechanism 2 (M2) corresponds to falsetto (see Castellengo et al 2009; Henrich 2011). Although they measured QOpen by studying the EGG derivative (dEGG) and the author used inverse filtering, their values were more or less confirmed by our data. In our case, QOpen for modal tones was mostly about 0.4-0.5 in tone repetitions in the approximate F0 range D4-G4, and QOpen for modal tones in the avaz song was 0.43-0.57 for the F0 range Bb3-

4 The author has previously discussed that issue in a BSc thesis in musicology, but it will still be discussed briefly here, as the author has gained new insight thanks to deepened studies in voice science.

E4. For the falsetto episodes in tone repetitions, QOpen was mostly around 0.6-0.7 in the approximate F0 range F4-Ab4, as was shown in Fig. 2.2.4 (Castellengo et al 2009). In general, our voice source data varied from very low NAQ with high QClosed to moderate and high NAQ with much lower QClosed. That is, the voice varied between clearly pressed and non-pressed phonation. Our voice source data for tone repetitions and alternations also provide new information for recent Western research on early Italian Baroque singing, in terms of alternative physiological possibilities for the rapid production of ornamental repetitive sequences of modal tones. It has been shown that the same voice source values were measured in the melody tones, i.e. the modal tones, in both ornaments: both alternations and tone repetitions were sung with continuous pressed phonation.
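For readers less familiar with the voice source parameters discussed above, the sketch below restates their standard definitions as small functions. It is not the DeCap or S-naq processing used in the thesis: the inputs are assumed to have been measured already from an inverse-filtered flow pulse, and all numeric values are invented for illustration.

```python
def open_quotient(open_phase_s, period_s):
    """QOpen: fraction of the glottal cycle during which the glottis is open."""
    return open_phase_s / period_s


def closed_quotient(open_phase_s, period_s):
    """QClosed: the remaining fraction of the cycle, i.e. 1 - QOpen."""
    return 1.0 - open_quotient(open_phase_s, period_s)


def naq(flow_peak_to_peak, max_flow_declination_rate, period_s):
    """Normalized amplitude quotient: peak-to-peak flow divided by the
    magnitude of the steepest negative flow slope and by the period."""
    return flow_peak_to_peak / (abs(max_flow_declination_rate) * period_s)


# Hypothetical modal tone: period 3.4 ms (about 294 Hz), open phase 1.5 ms.
modal_qopen = open_quotient(1.5e-3, 3.4e-3)
modal_qclosed = closed_quotient(1.5e-3, 3.4e-3)

# Hypothetical falsetto episode: period 2.5 ms (400 Hz), open phase 1.7 ms.
falsetto_qopen = open_quotient(1.7e-3, 2.5e-3)

# Hypothetical flow pulse: 0.3 l/s peak-to-peak, steepest slope -600 l/s^2.
modal_naq = naq(0.3, -600.0, 3.4e-3)

print(f"modal    QOpen={modal_qopen:.2f} QClosed={modal_qclosed:.2f} NAQ={modal_naq:.3f}")
print(f"falsetto QOpen={falsetto_qopen:.2f}")
```

A low NAQ together with a high QClosed is the combination described above as pressed phonation.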

3.3 Discussion on Formants

There are no previous studies on the formant settings in the singing style of avaz. Our results for the vowels /ɑ, æ, e/ clearly showed that F1 was tuned to H2 in the singer’s upper F0 range, i.e. about B3-F♯4, in the avaz excerpts and also in isolated tahrir phrases consisting of tone repetitions and alternations.

Some voice scientists who have studied the formant settings of operatic male voices believe that the F1=H2 tuning is characteristic of Western classical male voices. According to some scholars, it is typical of the male passaggio specifically, i.e. of the register transition range of male opera singers. Others do not agree, as such tunings were not found in various recordings of highly acclaimed and clearly typical opera voices. Kenneth Bozeman⁵, Donald Miller, and Nathalie Henrich are among the proponents of the existence of formant to harmonic tuning as a characteristic property of bass and tenor voices, while Ingo Titze is an opponent of their ideas. The author’s supervisor, Johan Sundberg, is also sceptical towards the idea of regarding the F1=H2 tuning as characteristic of Western opera. The stances taken by these scholars will be discussed briefly below.

Miller has found that F2 is tuned to H4 at the beginning of the passaggio, and that the lowering of all formants in the passaggio results in a new tuning, namely of F2 to H3. For instance, by analyzing a climactic and sustained Bb4 tone towards the end of Aida in many commercial recordings from throughout the 20th century, he came to the conclusion that F2 was the crucial formant being tuned above the passaggio point, and that the F2=H3 tuning was only one acoustically possible alternative among many.
Its prevalent occurrence was intentional and not a side effect, for which the vowel modification due to the increased F2 (from approximately 900 Hz to 1400 Hz for the vowel /o/) and the singers’ hiding of the modification are clear evidence, according to Miller (Schutte et al 2005). He specifically refers to Pavarotti’s timbre at his highest tones and points out that the late singer’s mastery lay not solely in his ability to produce a powerful and sustained C5, but rather in his timbral consistency, which Miller ascribes to the formant tuning, i.e. the F2=H3 setting (Miller 2008). The author assumes that scholars opposing Miller would argue that he finds the formants merely by visually looking for the strongest partials in the harmonic spectrum, and that he claims to be able to find the formants in this way even in commercial recordings. Miller’s argument for determining the position of F2 in the example for the vowel /o/ is merely that H3 was strong and that F1 and F3 were too far away, so that F2 must have been equal to or near H3. But he does not discuss what is to be considered near.

Bozeman, who agrees with Miller, invites the reader to notice how the timbre changes as a result of the interaction between F1 and H2. For any given vowel, there is a low F0 range in which H2 is much lower than F1, and if F1 is high, as in /ɑ, a/, several partials may lie below F1. For example, if the vowel /ɑ/ with F1=800 Hz were sung at F0=150 Hz, the partials H2, H3, H4 and even H5 would be below F1. According to Bozeman, the timbre can be open as long as more than one partial is below F1;

5 I thank Dr Bozeman for sending me his own presentation notes when I contacted him after his inspiring presentation on F1-H2 interactions at PAS5 in 2010 in Stockholm, which became my starting point in studying formant tuning.

the more partials below F1, the more open the timbre. But at F1=H2, the timbre becomes more penetrating and yell-like, and Bozeman actually points out that the same coupling is in place when we yell for help. At this point, if the aim were to maintain the formant tuning at higher pitches as F0 increases, it would be necessary to raise F1 in parallel with H2. But maintaining the F1=H2 tuning is, in Bozeman’s view, not preferable in classical Western singing, since it would be done by raising the larynx, which would result in pressed phonation. Of course, F1 can also be raised by increasing the jaw opening, but Bozeman points out that this works only over an F0 range of about a major second, and that it cannot easily be used in some languages where the F1 range is strongly limited. Thus, F1 should not increase along with H2 and the yell must be avoided, which is done by letting H2 pass beyond F1 as F0 continues to increase. Thereby the timbre becomes darker, which in various singing jargons is known as turning over or covering, and which produces the desirable timbre in the classical Western singing idiom (Bozeman 2010).

However, not all proponents of the idea that formant tuning is characteristic of classical Western male voices would insist on avoiding the F1=H2 tuning. Henrich et al found various formant tuning strategies in both male and female voices, among them the F1=H2 tuning, which was systematically used in the tenors’ upper range and in the altos’ lower range. They also defined the above-mentioned 50 Hz threshold for tuning to H2, and they showed that individual singers used different tuning strategies, i.e. they tuned F1 to different partials. The singers reproduced their formant tunings very precisely, and the study clearly showed that the tuning range could in some cases be as wide as a tenth, which means that the tuning is not used only in the passaggio ranges of the various voice types.
But Henrich et al do not seem to regard the F1=H2 tuning as a problem, and they do not suggest any strategies to avoid it (Henrich et al 2011). It should be added that Henrich’s formant tuning criterion, that the formant must be within 50 Hz of the harmonic, does not support Miller’s visual determination of formant positions, since a partial may be clearly strong, i.e. stronger than its neighbouring partials, and yet be farther than 50 Hz from the nearest formant, as was the case in our recording (see Fig. 2.3.2 c).

Henrich et al (2007) also studied the formant to harmonic relationships in two stylistically typical sound qualities in Bulgarian women’s traditional singing on the four pitches F4, G4, A4, and B4. In the timbre described as a “loud, projected sound”, they found clear evidence of F1=H2 tuning for the vowels /e, o, œ/ at all four pitches. The vowel /ɑ/ was tuned in the same way, but only at the pitches A4 and B4. While /u/ was not formant tuned at all, the vowel /i/ had an F1=F0 tuning. Formant tuning was equally evident in the other timbre, which they described as lighter, more lyrical and similar to the female head voice in Western classical singing. Again, the vowels /ɑ, e, o, œ/ were formant tuned with F1=H2, while /i, u/ were tuned according to F1=F0. On the one hand, the existence of F1=H2 tuning in two clearly different female timbres could be taken as suggesting that such formant tuning is so common and widespread among styles and voice timbres that it cannot be regarded as characteristic of any of them, i.e. that it will not tell us anything significant about how such voices sound. On the other hand, knowing that the voice quality is determined by much more than one specific formant’s distance to a harmonic, it is indeed interesting to see that the F1=H2 tuning is pursued in apparently different timbres.
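The formant to harmonic relationships discussed so far can be made concrete with a short sketch. It reproduces Bozeman’s example (/ɑ/ with F1=800 Hz sung at F0=150 Hz) and applies the 50 Hz criterion attributed above to Henrich et al. The function names, the choice to treat exactly 50 Hz as still tuned, and the second set of F0 and F1 values are assumptions made for illustration only.

```python
def harmonics_below(f0_hz, formant_hz):
    """Harmonic numbers n (with H1 = F0) whose frequency n * F0 lies strictly
    below the given formant frequency."""
    result = []
    n = 1
    while n * f0_hz < formant_hz:
        result.append(n)
        n += 1
    return result


def nearest_harmonic(f0_hz, formant_hz):
    """Return (harmonic number, harmonic frequency) closest to the formant."""
    n = max(1, round(formant_hz / f0_hz))
    return n, n * f0_hz


def is_tuned(f0_hz, formant_hz, harmonic_number, tolerance_hz=50.0):
    """True if the formant lies within tolerance_hz of the chosen harmonic.
    Treating the boundary value as tuned is an assumption of this sketch."""
    return abs(formant_hz - harmonic_number * f0_hz) <= tolerance_hz


# Bozeman's example from the text: F1 = 800 Hz, F0 = 150 Hz.
print(harmonics_below(150.0, 800.0))  # H1..H5 lie below F1

# Hypothetical tenor tone at E4 (about 330 Hz) with two candidate F1 values.
print(nearest_harmonic(330.0, 690.0))
print(is_tuned(330.0, 690.0, 2))  # 30 Hz above H2
print(is_tuned(330.0, 800.0, 2))  # 140 Hz above H2
```

The second pair of calls also illustrates the objection raised above: H2 can be the partial nearest to F1 and still fall outside the 50 Hz zone.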
Titze is sceptical towards the advantages of formant tuning, and together with colleagues he does not find it to be characteristic of male operatic voices; in recordings of both skilled and less skilled singing students, they did not find any tuning of F1 or F2 to harmonics. They found that the singers achieved the desired timbre by concentrating energy in H2-H4 instead of H1 (F0), without tuning F1 or F2 to any harmonic. They also point out that the high male voice tends to break, i.e. to change suddenly from modal to falsetto, something which Western vocal pedagogy serves to prevent (Titze et al 1994). Elsewhere, Titze has stated that tuning F1 or F2 to any harmonic will make the voice unstable and create ripples in the voice source. Thus, one plausible interpretation of his argument would be that formant tuning promotes register breaks. In fact, he has stated that crossing of F0 and any harmonics can trigger register breaks (Titze 2007). This raises the question whether formant tuning is a necessary, or perhaps supportive, component in producing the frequent double register breaks in traditional Iranian singing, which may occur at any time on any pitch, including sustained long tones and scale exercises. While such a question is outside the scope of this thesis, it is relevant to mention that in the search for an answer, one should also consider the research done by Švec et al on abrupt register jumps between modal and falsetto, and examine whether formant tuning affects the voice source in a way that may be related to the physiological and biomechanical conditions studied by Švec et al (see Švec et al 1999). On the other hand, the register breaks occur only at the takiyah episodes, and sustained tones with steady timbre do exist in Persian avaz. In other words, it seems that the kind of formant tuning seen in Persian avaz, namely having F1 on or slightly above H2, does not necessarily imply instability and register breaks.

Titze argues against maintaining formant tuning from another point of view as well, namely the raising of the larynx and the increased pressedness of the phonation. All resonance frequencies rise when the tube is shortened, which means that all formants increase when the larynx is raised. And raising the larynx is what many singers do in non-classical genres, e.g. belting. But a raised larynx is strongly associated with pressed phonation, which in turn is seen as very unhealthy in today’s classical Western pedagogy and voice care (Titze 2007). However, in a study of 43 Lebanese singers who sang with pressed phonation, no correlation was found between their pressed phonation and voice damage (Hamdan et al 2006). It is therefore conceivable that raising the larynx might even be advocated as an efficient way of maintaining a certain voice quality characterized by pressed phonation and formant tuning, without concern about voice damage or unhealthy pedagogy. Titze stresses that the undesirable formant setting is when F1 or F2 is either equal to or slightly below any of H2-H4, whereas the beneficial setting would be to have the formant slightly above a harmonic.
And while he suspects that Schutte and Miller might have missed a small difference between F1 and a slightly “detuned” H2 in their claims about F1=H2, Titze sees his own results as corroborating their findings (Titze 2004). Considering Henrich’s tuning zone, i.e. the 50 Hz frequency zone defining the formant tuning distance from the harmonics, it would be interesting to find out whether Titze’s formant data match Henrich’s definition of formant tuning. It would also be interesting to see whether the formant values that Titze obtained slightly above the harmonics recur systematically, so that they should perhaps be regarded as characterizing the high ranges of classical Western male voices.
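The comparison suggested in the last paragraph, between Titze’s distinction (formant slightly above versus at or slightly below a harmonic) and a 50 Hz tuning zone, could be explored along the following lines. This is only a sketch: using the 50 Hz zone as the meaning of “slightly” is an assumption of this example and not something either author states, and the F0 and F1 values are invented.

```python
def offset_hz(f0_hz, formant_hz, harmonic_number):
    """Signed distance from the harmonic to the formant (positive: formant above)."""
    return formant_hz - harmonic_number * f0_hz


def classify(f0_hz, formant_hz, harmonic_number, zone_hz=50.0):
    """Label the formant position relative to one harmonic.
    zone_hz stands in for 'slightly'; this is an assumed threshold."""
    d = offset_hz(f0_hz, formant_hz, harmonic_number)
    if abs(d) > zone_hz:
        return "outside zone"
    if d > 0:
        return "formant above harmonic"
    return "formant at or below harmonic"


# Hypothetical case: F1 held at 650 Hz while F0 rises, so that H2 approaches,
# meets and then passes F1 (the situation Bozeman describes as turning over).
F1_HZ = 650.0
labels = {f0: classify(f0, F1_HZ, 2) for f0 in (280.0, 310.0, 325.0, 340.0, 370.0)}
for f0, label in labels.items():
    print(f"F0={f0:.0f} Hz  H2={2 * f0:.0f} Hz  ->  {label}")
```

Applied to measured formant data, a tabulation of this kind would show whether values cluster on one side of the harmonic or are spread across the zone.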

4 Conclusions

We saw that the pitch transition from one modal tone to the next is made via falsetto episodes known as takiyah, in which the frequency jumps up to a peak in order to dive to the next modal tone. The duration of the takiyahs mostly varied between 50 and 70 ms. Since the takiyahs are sung in falsetto, there are two register breaks whenever two consecutive modal melody tones are sung melismatically in the Persian singing style of avaz. The takiyah peaks are not well defined; they are demarcations before the modal tones rather than distinct pitches. The intervals between them and the modal tones vary and may even be off scale, but those intervals are not musically significant. These results corroborated Caton’s findings (1974).

Comparisons between our results and the F0 curves of some commercial recordings suggested that the intervals between the takiyah peaks and the modal tones can vary between individual singers. In some commercial recordings of highly acclaimed singers and masters of the Persian avaz, the intervals did show a tendency to decrease with the modal tone pitch; the takiyah peaks tended to decrease when the melody F0 increased. This corroborated the results presented by Castellengo et al (2009), who found that the peak intervals increased with dynamics and that they remained constant or decreased with modal F0. They also studied the peak intervals versus vowels. In our subject, there was no correlation between SPL and takiyah peaks, and neither the vowel nor the melody F0 determined the intervals. The SPL dropped during the takiyah episode for all vowels in our recordings, which contradicts the results presented by Castellengo et al, and we explained that the increased dynamics measured by Castellengo et al could be due to a decreased distance between F0 and the low F1 of the closed vowels /u, i/.
The tone repetitions and the alternations in our recordings were produced with continuous phonation and with takiyah episodes preceding each modal melody tone. In this way, both of these repetitive melodic ornaments could be produced with the same phonation type, as the QOpen values for the modal tones were the same in alternations and tone repetitions. The histograms showed that the repeated or alternating modal tones were the dominating tones produced. These findings indicated that Stark’s hypotheses on the plausible phonation models for tone repetitions and alternations are not the only conceivable options, and they also suggest that the early Italian descriptions of those ornamental melodic patterns can be interpreted differently. In fact, Stark’s dilemma and the physiological contradictions that he faced are not relevant when tone repetitions and alternations are produced as in Persian avaz. This means that there is no contradiction between the characteristics found in our recordings of the ornaments and the early Baroque ornaments as described and notated by Conforto, Bovicelli and Caccini (see Stark 1999).

Our subject’s voice source varied depending on range: the phonation was normal in the lower range of the avaz song and became clearly pressed in the higher range. Our QOpen comparison between modal and falsetto corroborated the findings of Castellengo et al (2009), as the values for modal were much lower. Our singer tuned F1 to H2 for the vowels /ɑ, æ, e/ in the avaz song. Except for some deviations in the F0 range A4-B4, this formant tuning was clearly in place above Bb3 (235 Hz, approximately). The vowel /i/ also showed tendencies towards formant tuning, as F1 stayed more or less in parallel with, and yet above, H1, i.e. F0.
Since the audio-visual voice assessment clearly indicated that the subject’s singing was highly typical of the Persian singing style of avaz, it is reasonable to state that the F1=H2 formant tuning is a timbral characteristic of the Persian avaz.
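The takiyah measures summarized in these conclusions, the interval between the peak and the modal tone and the 50-70 ms duration, can be expressed with two small formulas. The sketch below is illustrative only: the modal and peak frequencies are invented, and the cycle count assumes a constant F0 during the episode, which a real takiyah does not have, so it should be read as a rough upper bound.

```python
import math


def interval_semitones(f_ref_hz, f_hz):
    """Interval from f_ref_hz up to f_hz in equal-tempered semitones."""
    return 12.0 * math.log2(f_hz / f_ref_hz)


def cycles_in(duration_s, f0_hz):
    """Number of glottal cycles in a span of duration_s at a constant F0."""
    return duration_s * f0_hz


# Hypothetical values: modal melody tone at 294 Hz, takiyah peak at 400 Hz.
modal_hz = 294.0
peak_hz = 400.0

interval = interval_semitones(modal_hz, peak_hz)
print(f"peak lies {interval:.2f} semitones ({100 * interval:.0f} cents) above the modal tone")

# Durations reported in the conclusions: 50-70 ms.
for duration_ms in (50, 70):
    n = cycles_in(duration_ms / 1000.0, peak_hz)
    print(f"{duration_ms} ms at {peak_hz:.0f} Hz: about {n:.0f} cycles")
```

With these values the peak falls between a fourth and a tritone above the modal tone, and the whole episode spans only a few tens of glottal cycles.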

5 Acknowledgements

More than anything, I am grateful to my supervisor, Professor Johan Sundberg. The year that I have spent working on this thesis (it was planned at 50% of normal study time) has been filled with new learning. The length of the work and the generous amount of supervision time offered by Professor Sundberg gave me the rare opportunity to become aware of how I was growing and being shaped into a better student under his guidance. My hope is that this thesis is a step in the right direction and lives up to at least one of his crucial words of advice, namely that a researcher’s greatest virtue is to be sceptical.

Needless to say, this thesis would not have been possible without the friendly and generous contribution of the subject, Bahram Badjelan, who is both a performing artist and a teacher of Persian avaz at the Royal College of Music in Stockholm. His vast experience from taking private lessons in the canonized avaz repertoire from several of the legendary icons of Iranian music makes him a true carrier of the tradition, and a reliable and enriching source for studies of this kind.

I am grateful to Svante Granqvist for his invaluable software programs, which have added to the delight of the analysis work. I also want to thank the researchers and professors at the Speech, Music and Hearing department at KTH for giving me all kinds of help with books, recording equipment, booking of the studio room, etc.

As mentioned in the section on formant tuning, I was introduced to the phenomenon of formant tuning at Professor Kenneth Bozeman’s presentation at the PAS 5 conference (Physiology and Acoustics of Singing) at KTH, Stockholm. That was a few months before I started working on this thesis, and I want to thank Professor Bozeman for providing me with his presentation material after the conference, including his own additional notes.

6 Bibliography

Biglari HJ: Likheter mellan iransk sångteknik och tidig italiensk barock, BSc Thesis in Musicology, Uppsala University, 2009
Björkner E, Sundberg J, Alku P: “Subglottal pressure and normalized amplitude quotient variation in classically trained baritone singers”, Logopedics Phoniatrics Vocology, 2006; 1-9
Bozeman KW: “The Role of the First Formant in Training the Male Singing Voice”, Journal of Singing, 2010; Vol. 66
Brown LR & Scherer RC: “Laryngeal adduction in Trillo”, Journal of Voice, 1992; Vol. 6, No. 1, pp. 27-35
Caccini G: Le nuove musiche, Florence, 1601/1983
-- “Le nuove musiche”, Source Readings in Music History. From Classical Antiquity through the Romantic Era, Translation & Ed. Oliver Strunk, pp. 377-392, New York, 1601/1950

Caton M: “The Vocal Ornament Takiyah in Persian Music”, UCLA Selected Reports in Ethnomusicology, 1974; 2:1, pp. 42-53
Greenlee R: The techniques of Italian melismatic articulation in the latter half of the sixteenth century, Indiana University (Ph.D. dissertation), 1985

-- “Dispositione di voce: Passage to Florid Singing”, Early Music, 1987; 15:1, Oxford Univ. Press
During J: La musique Iranienne. Tradition et evolution, pp. 84-86, Paris, 1984

Fatemi S: [booklet in music CD box, M.CD-139], Voices from the Land of Iran, An Anthology of Vocal Styles and Techniques, Mahoor Institute of Culture and Art, Tehran, 2005
Ghorbani AR: [AC 107], Calligraphies Vocales. L’art du chant classique persan, Accords Croisés, 2004
Hamdan A-L, Sibai A, Moukarbel R, Deeb R: “Laryngeal Biomechanics in Middle Eastern Singing”, Journal of Voice, 2006; 20:4, pp. 579-584
Hall DE: Musical Acoustics, Pacific Grove (CA, USA), 2002, pp. 293-313
Henrich N, Kiek M, Smith J & Wolfe J: “Resonance strategies used in Bulgarian women’s singing style: A pilot study”, Logopedics Phoniatrics Vocology, 2007; 32, pp. 171-177
Henrich N, Smith J & Wolfe J: “Vocal tract resonances in singing: Strategies used by sopranos, altos, tenors, and baritones”, Journal of the Acoustical Society of America, 2011; Vol. 129, No. 2
MacClintock C: “Caccini’s Trillo, a Re-examination”, The NATS Bulletin (Oct., 1976), pp. 38-41
Miller LC: Music and Song in Persia. The Art of Âvâz, Surrey, 1999
Rothenberg M: “A new inverse-filtering technique for deriving the glottal air flow waveform during voicing”, Journal of the Acoustical Society of America, 1973; Vol. 53, Issue 6, pp. 1632-1645
Schutte HK, Miller DG, Duijnstee M: “Resonance Strategies Revealed in Recorded Tenor High Notes”, Folia Phoniatrica et Logopaedica, 2005; 57, pp. 292-307
Shajarian MR: [commercial CD recording] Jan-e Oshagh, Delawaz, Tehran

-- [commercial recording] Sayeh Larzan
Hakes J, Shipp T, Doherty ET: “Acoustic Properties of Straight Tone, Vibrato, Trill, and Trillo”, Journal of Voice, 1987; 1:2, New York, pp. 148-156
Simms R: Avaz in the Recordings of Mohammed Reza Shajarian, Ph.D. dissertation, University of Toronto, 1996
Stark J: Bel Canto: A History of Vocal Pedagogy, University of Toronto, 1999

Sundberg J: Röstlära – Fakta om rösten i tal och sång, Malmö, 2001
-- Musikens ljudlära, Visby, 1989
Švec J, Schutte HK and Miller DG: “On pitch jumps between chest and falsetto registers in voice: Data from living and excised human larynges”, Journal of the Acoustical Society of America, 1999; Vol. 106, No. 3
Tatsumura A: “Performance of Persian classical vocal music”, Musical voices of Asia. Report of ATPA [Asian Traditional Performing Arts] 1978, ed. Richard Emmert & Minegishi Yuki, Tokyo, 1980
Titze IR, Mapes S, Story B: “Acoustics of the tenor high voice”, Journal of the Acoustical Society of America, 1994; Vol. 95, No. 2, pp. 1133-1142
Titze IR: “Acoustic Interpretation of Resonant Voice”, Journal of Voice, 2001; Vol. 15, No. 4, pp. 519-528
-- “A theoretical Study of F0-F1 Interaction With Application to Resonant Speaking and Singing Voice”, Journal of Voice, 2004; Vol. 18, No. 3, pp. 292-298
-- “Belting and a High Larynx Position”, Journal of Singing, 2007; Vol. 63, No. 5, pp. 557-558
Vennard W: Singing: The Mechanism and the Technic, New York, 1968

Zacconi L: Prattica di Musica utile et necessaria si al compositore, si anco al cantore, New York, 1596/1982
Zelli RQM: [M.CD-052, Shur], Songs of Reza Qoli Mirza Zelli, Mahoor Institute of Culture and Art, Tehran, 2000

TRITA-CSC-E 2012:026 ISRN-KTH/CSC/E--12/026-SE

ISSN-1653-5715

www.kth.se