Perceiving Boundaries in Makam Music 1

Running Head: PERCEIVING BOUNDARIES IN MAKAM MUSIC

Gestalt Universals or Musical Enculturation:

Perceiving Boundaries in Unfamiliar Turkish Makam Music

Esra Mungan

Boğaziçi University

Z. Funda Yazıcı

Istanbul Technical University

Mustafa (Uğur) Kaya

Koç University


Abstract

According to Gestalt theory, our perception of the world is guided by certain basic principles of

grouping such as proximity and similarity (e.g., Wertheimer, 1923). As such, Gestalt theory

supports a universality perspective, which claims that these perceptual laws hold for any kind of

material and for any kind of subject population. The earliest application of Gestalt laws to music

was Lerdahl and Jackendoff’s (1983) model developed for Western hierarchical, tonal music.

This model has been put to the test extensively with Western classical, pop, and 20th century music

using musicians and nonmusicians (e.g., Bruderer et al., 2009; Deliège et al., 1996; Deliège & El

Ahmadi, 1990; Krumhansl, 1996). But only recently do we see first attempts to check the

validity of Gestalt laws in non-Western musical material (e.g., Ayari & McAdams, 2003;

Lartillot & Ayari, 2011). The goal of our study was to continue along that line using traditional

Turkish makam music pieces. In our study, makam music-trained musicians, nonmusicians and

Western listeners were asked to mark their segmentations as they listened to different 19th

century Turkish makam tunes, which were all recorded in a qānūn timbre on MIDI with retained

microtonal structure. Each participant was allowed to make three segmentation attempts per

piece. A slightly novel set-up was used to ensure (1) a lower load on working memory during

the task, and (2) some certainty about which of the three segmentation attempts to take as the

final one. We also had two experts segment the tunes in a free time setting. The number of

segmentations per tune did not differ across participant groups. Moreover, we found

considerable within- and across-group agreement, as well as good agreement with

the expert segmentations. After transforming each participant’s segmentations into ‘hits’ and

‘false alarms’ based on their match or mismatch with the experts, we observed that the only

significant difference was between musicians and the other two groups, which performed about the


same. Almost all “extra” segmentations (‘false alarms’) in all three groups were strongly driven

by duration-based proximity, or rather separation. This was also so for the other ‘hit’

segmentations. We believe these results to provide additional support for the universality stance

of Gestalt theory.


Gestalt Universals or Musical Enculturation: Perceiving Boundaries in

Unfamiliar Turkish Makam Music

Any study interested in perceptual segmentation has to address the Gestalt laws of

organization (Wertheimer, 1923). These are a group of principles that attempt to explain how we

perceive ‘figures’ apart from the backgrounds they are embedded in, as well as how we group

elements to perceive the ‘figures’ the way we do. One can probably call these laws of perception

the first cognitive-perceptual laws in psychology that shifted the focus from the characteristics of

the sensory organ to post-sensory processes. Irving Rock (1975) refers to Gestalt psychology as

the first, and for a long time, only “systematic theory about form perception” (p. 257). Allport

(1955) eloquently notes:

Any serious discussion of modern perceptual theories may well begin with

gestalt psychology. … Somewhat philosophical in its outlook, it is a theory of

imposing intellectual stature and in some respects of elegance and beauty.

It is most at home in the realm of configuration. ... The subject-matter and method

of the theory have been from the start, strongly phenomenological. The

immediate experiences of the subject are the matter to be described and explained.

(p. 112)

Gestalt psychologists were mostly occupied with visual form perception, and over the

period in which they were influential, more than a hundred laws of organization were set forth, which

Boring reduced to fourteen (Allport, 1955). One might say that the two most elementary laws of

grouping have come to be the law of proximity, according to which elements of a series that are,

in relative terms, spatio-temporally closer to each other will be processed as a group; and the law

of similarity, according to which elements in a series that are similar or equal to each other in


terms of their features will be processed as a group. It is these two laws that we will be focusing

on in this paper. The fact that we start perceiving groups due to proximity and/or similarity is a

phenomenon referred to as the “emergent properties” of a set of stimuli because none of the

individual stimuli in the set carries that property by itself (Palmer, 1999). Early Gestalt

psychologists looked at this peculiar phenomenon of grouping as an innate perceptual

phenomenon due to structural and functional aspects of our brains; thus, according to them, even

a newborn baby perceives its surroundings based on these laws of organization.

Even though Gestalt laws of organization are almost always described with visual examples,

we know that Wertheimer, in his 1923 article in the “Psychologischen Untersuchungen” (which

appeared as a Festschrift for Carl Stumpf), listed examples from music right in the sixth

paragraph of his introduction. As can be seen from Allport’s comment cited earlier, Gestalt

psychologists frequently made their postulations via simple, direct demonstrations instead of

meticulous experimentations. Interestingly, when looking at the scientific literature on Gestalt,

one gets the impression that this theory stimulated information-processing theorists and

computer scientists more than experimental psychologists, who seem to have accepted the

theory as is. Palmer (1999) mentions that this was partly related to the limited experimental

equipment that was at hand at the time and necessary particularly for the careful manipulation of

stimuli.

Most of the early studies on Gestalt laws of grouping in the auditory domain using music or

music-like tone-sequences appear in the early 1970s (for an overview see Deutsch, 1986). But

there are important exceptions, such as Miller and Heise’s careful experimentation on how much

frequency disparity is tolerated for a tonal sequence to be still perceived as a group (1950; cited

in Deutsch, 1986), or Schouten’s 1960 study, which looked at the effects of both frequency


proximity and presentation rate in perceptual grouping of tone sequences. Deutsch (1986) notes

that auditory research on Gestalt was heavily influenced by experimental work in vision and

language, but also by developments in music theory. We see, for instance, that in the 1970s,

almost all experimental research on auditory grouping used highly controlled, single-line tone

sequences (but again, there are exceptions, such as Dowling, 1973), whereas later on,

starting from the 1980s, we see an increasing use of melodic lines, excerpts from real musical

pieces and the like (e.g., Deliège, 1987; Krumhansl & Jusczyk, 1990; Serafine, Glassman, &

Overbeeke, 1989; Stoffer, 1985).

The increase in studies using real musical stimuli must have been strongly related to the

emergence of a major musical theory that incorporated Gestalt laws, i.e., Lerdahl and

Jackendoff’s (1983) Generative Theory of Tonal Music. This theory, originally developed to

model, among others, perceptual grouping in Western hierarchical music, received extensive

attention within both music theory and music perception research. Studies within the Western

musical context showed that local groupings predicted by this theory were indeed experimentally

validated for both musicians and nonmusicians with a variety of melodic material, such as

classical music pieces (e.g., Deliège, Mélen, Stammers, & Cross, 1996; Krumhansl, 1996),

popular songs from different subgenres (e.g., Bruderer, McKinney, & Kohlrausch, 2009), and

20th century modern music (e.g., Clarke & Krumhansl, 1990; Deliège & El Ahmadi, 1990;

Krumhansl, 1991). Almost all of these studies of segmentation used what Deliège (1996) calls a

“real-time listening” set-up, in which participants report their spontaneous segmentations as the

experimental tunes unfold. It is important to note that the routine procedure, particularly when

using more complex and extended melody lines, is to allow the listener two to three real-time


listening segmentation trials after an initial pure listening trial. We will come back to this issue

at a later point in this paper.

A major finding of these studies was that even musicians heavily rely on musical surface-

related cues rather than musical structure-related ones when segmenting (e.g., Deliège, Mélen,

Stammers, & Cross, 1996; but see Schaefer, Murre, & Bod, 2004). This finding comes as a

surprise since one would expect more top-down type expectation-based processing in musically

more experienced listeners.

All studies referred to so far used different musical stimuli which all came from a shared

Western musical culture. The first set of studies to explore non-Western music were the studies

by Ayari and McAdams (2003), and Lartillot and Ayari (2008, 2011), which showed some

segmentation similarities even between experienced Tunisian listeners and European musicians

who did not have any listening experience with Tunisian improvised modal music. Ayari and

McAdams, for instance, were able to show that the segmentation overlaps between Tunisian and

European listeners were predicted to a great extent by one of the most basic surface-feature laws

of Gestalt, proximity, particularly duration-based proximity. All three studies are also worth

mentioning because they allow us to look at the influences of musical enculturation. We know

about earlier studies that looked at tonal hierarchies and tonal hierarchical as well as schema-

based perception in non-Western tunes (e.g., Castellano, Bharucha, & Krumhansl, 1984; Kessler,

Hansen, & Shepard, 1984; Krumhansl, 2000), but to our knowledge, the three Ayari studies are

the first to study segmentation behavior in real-time listening of non-Western tunes. It was our

main goal to further investigate this rather interesting question of how much Gestalt laws of

grouping rather than culture-based knowledge predict perceptual segmentation in non-Western

music.


The Present Study

First and foremost, we wanted to follow up on the Ayari and McAdams (2003), and Lartillot

and Ayari (2008, 2011) studies by using slightly more structured non-Western music. A

peculiarity of the three Ayari studies was that instead of using a regular modal piece, they used a

complex, less structured, improvisation-type (taqsim) tune. We believe that part of the reason

why their findings were not as clear-cut as one would have wished was the very

nature of the tunes they used. We therefore aimed at getting potentially stronger concurrences

between same- and different-culture participants by using regular classical makam tunes.

Secondly, we also wanted to look at the effect of musical training within a culture on

perceptual segmentation and then compare all segmentations with the segmentations predicted

by the two most prominent grouping laws of Gestalt, proximity and similarity. We therefore

used three groups of participants: Turkish makam music-trained musicians, Turkish

nonmusicians, and Western listeners. If the universality proposition of Gestalt perception is

correct, we should see considerable overlaps between all three groups. If, on the other hand,

grouping is mostly driven by musical enculturation without necessitating any external formal

makam training, we should see high overlap between Turkish musicians and nonmusicians but

not between Turkish and Western listeners. And as a third option, if segmentation is mostly

driven by experience- and knowledge-based schemata rather than surface-level features, we

should expect only little overlap between Turkish makam musicians and Turkish nonmusicians.

Furthermore, if musical enculturation has no effect on real-time segmentation, Turkish

nonmusicians should be performing close to Western listeners.

Thirdly, we had two experts segment the pieces once on purely perceptual grounds and a

second time on purely musicological grounds. The main reason to have our experts do a


musicological segmentation as well was to ensure that their first segmentation would be on more

purely perceptual grounds. We were concerned that had we asked them to do a single perceptual

segmentation for each tune, some musicological segmentation might nonetheless infiltrate their

responses. Since participants’ groupings were to be perceptual groupings only, we had to make

sure that the experts’ segmentations were of the same nature. Collecting the perceptual

segmentations of two experts, in turn, allowed us to obtain a more solid and valid measure

against which to compare our three participant groups.

Last but not least, we decided to introduce a new “real-time listening” set-up that might open

a venue for future music segmentation studies. One of the main reasons that pushed us to

develop a new set-up was the high within-group variability in segmentations that the Ayari

studies found not only for Western but also Tunisian listeners (Ayari & McAdams, 2003;

Lartillot & Ayari, 2011). This variability issue also seems to have posed a problem in Bruderer,

McKinney, and Kohlrausch’s (2009) study, which used two methods to overcome it. One was to

have a second study, in which a new group of participants received the more or less overlapping

segmentations of a preceding group of participants with the task of rating the salience of each of

those segments. This way they were able to obtain not only a validation for the previous

empirical segmentation data, but also boundary strength data. A second method they used to

deal with the variability within and between participants was to look at between-trial

correlations per participant and use Gaussian smoothing measures, respectively. As mentioned

earlier, almost all real-time listening type segmentation studies give participants an initial free

listening trial which is followed by three trials during which they are asked to mark their

segmentations. Thus, each listener attempts to segment any given tune three times; hence, not

only between- but also considerable within-participant variability is to be expected. But aside


from this variability, another major concern is the problem of which segmentation attempt to take

as the valid one. The first one right after the free listening trial? Or the second one, or the third

one? Or some kind of a composite score of all three segmentation attempts? In the method

section we will introduce a slightly different set-up that attempts to overcome potential problems

of the typical segmentation set-up.

Method

Participants

Sixteen musicians, 14 nonmusicians, and 11 Western listeners served as participants for this

study and informed consent was obtained prior to their participation. Musicians were all

volunteering advanced undergraduate students from the Traditional Turkish Music State

Conservatory of Istanbul Technical University, who had a median of 10 yrs of music

conservatory training (ranging from 4 to 17 yrs) which included 4 yrs of intensive makam

musical training.

Nonmusicians, on the other hand, were Boğaziçi University students who had a median

of 0 yrs of general (not makam) music training (ranging from 0 to 3 yrs; the participant with 3

yrs of training reported to have quit training and playing 3 years ago, and 2 of the remaining 3

participants who played an instrument for 2 yrs reported to have quit playing their instrument at

least 2 yrs ago). Western listeners were a group of Erasmus and exchange students who had

come to study at Boğaziçi University for a term. We excluded any students with a Middle Eastern,

North African, or Balkan immigrant background. In addition, only those who reported absolute

unfamiliarity with Turkish makam music were eligible for the study. Their median musical

training was 4 yrs (with a range from 0 to 17 yrs), and all, including the three with 10+ yrs of

musical training, reported having stopped playing music at least 5 yrs earlier. All nonmusicians and


Western listeners had to first pass a melody discrimination test to participate in the experiment to

ensure they had no problems in discriminating melodic changes in a microtonal musical system.

Except for one person, all nonmusicians reported listening to music other than

Traditional Turkish Music. None of the Western listeners reported listening to Traditional

Turkish Music or, in fact, to any non-Western music. Both groups participated in return for course

credits or a small gift.

In addition to these three groups of participants, we consulted two makam experts to

provide us with perceptual and musicological markings of boundaries for all eight tunes in a free

listening set-up with the musical scores in front of them. Both experts were professors at the

Traditional Turkish Music State Conservatory of Istanbul Technical University; one of them was

the head of the Department of Music Theory and the other was a 30-yr musician and the head of

the Department for Traditional String Instruments.

Materials and Segmentation Set-Up

A total of eight musical excerpts were used, each of which had a duration ranging from

60 s to 70 s with a mean duration of 63 s (Table 1). Excerpts were taken from the first couple of

measures of eight different pieces that were written in five of the most common makams of

Turkish traditional music (Hicaz, Nihavend, Saba, Segah, Uşşak). Furthermore, to ensure

rhythmic variety, the eight pieces covered four of the most common rhythmic patterns used in

Turkish traditional music (Hafif (32/2 and 32/4), Aksak (9/8), Ağır Aksak (9/4), Yürük Semai

(6/4 and 6/8)). The purpose of having a range of different makams and rhythmic patterns was to

ensure a good representation of the variety within traditional Turkish music.

------------------------------------

Insert Table 1 about here


-------------------------------------

All tunes were written via Mus2, a specific application for the notation of microtonal

pieces which allows the user to play back the score with accurate intonation by using the sound

samples of acoustic instruments in Turkish Music. Tunes were then recorded with the qānūn (a

type of zither) sound sample. All the musical fragments created by this application

were initially recorded in .wav format and then converted into high-resolution mp3 files in order

to use them with the set-up programme.

The experimental set-up for the segmentation trials was programmed using JavaScript. It

was slightly different from the ones used in other segmentation studies. In traditional

experimental set-ups, participants only hear the melodies over headphones and then press a given

key to mark a boundary. Typically, they are given one or two additional trials to reattempt their

segmentations. A major problem with this kind of a set-up is that at each repetition participants

simply do the task anew, i.e., without benefitting from their earlier responses, except if they

retain some kind of a memory for it. This, in turn, is likely to incur a constant working memory

load per trial, hence preventing any chances of improving their performance per repetition trial.

Furthermore, another potential handicap of the traditional segmentation set-up is that one never

knows which of the segmentation attempts is to represent the most accurate one. Participants’

best segmentations per melody could be their segmentation in the first trial, the second trial, or

the third trial, or even worse, a mixed combination across trials. In other words, there is no way

to treat one segmentation attempt as superior to the other, unless one relies on participants’

personal reports. Personal reports, on the other hand, often tend to be unreliable (if we think of

the extensive literature on dissociations between explicit and implicit measures in a variety of

behavioral studies). Taking across-trial averages to determine each participant’s ‘representative’


segmentation, in turn, is problematic since averages may not coincide with any of the segmentations

a participant actually performed.
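The problem with across-trial averages can be made concrete with a small numerical sketch. The boundary times below are invented for illustration and are not data from this study. The sketch also assumes, unrealistically, that the listener marks the same number of boundaries in every trial, so that boundaries can be paired across trials.

```python
from statistics import mean

# Hypothetical boundary times (in seconds) marked by one listener in three trials.
# These numbers are invented for illustration; they are not data from the study.
trials = [
    [10.0, 30.0],  # first segmentation attempt
    [12.0, 30.0],  # second attempt
    [20.0, 31.0],  # third attempt
]

# Naive across-trial average: pair up the k-th boundary of every trial.
averaged = [round(mean(times), 2) for times in zip(*trials)]

# All positions the listener actually marked, pooled over the three trials.
marked = {t for trial in trials for t in trial}

# Averaged positions that coincide with none of the listener's own markings.
never_marked = [t for t in averaged if t not in marked]

print(averaged)      # averaged boundary positions
print(never_marked)  # those that the listener never marked in any trial
```

In this toy case both averaged positions fall at points the listener never marked. With unequal numbers of boundaries per trial, even the pairing step would require further assumptions.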

To deal with both the working memory load issue and the response validity issue

described above, we decided to add a visual component to the task without, however, providing

any visual information about the spectral and temporal/metric aspects of the tunes. In particular,

during the presentation of each tune, a red horizontal bar moved from left to right on a horizontal

axis and in real time with the tune towards a black bar which marked the ending point of the tune

(and was different for each tune depending on its exact duration). In the segmentation trials,

participants instantly saw their markings on the horizontal axis whenever they made a key press.

In addition, in the subsequent trials per tune, they saw their earlier segmentations as they were

attempting their new ones. A detailed description of the set-up from the participants’ perspective

is provided in the subsequent procedure section of this paper.
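The bookkeeping behind such a set-up can be sketched as follows. This is a minimal illustration in Python, not the JavaScript programme used in the study. The class name, method names, and millisecond values are all hypothetical, and the sketch assumes that the markings of every earlier attempt remain visible during a later attempt.

```python
from dataclasses import dataclass, field


@dataclass
class TuneSession:
    """Key-press times (ms) for the three segmentation attempts on one tune."""
    duration_ms: int
    attempts: list = field(default_factory=lambda: [[], [], []])

    def mark(self, attempt: int, time_ms: int) -> None:
        """Record a boundary for the given attempt (0, 1, or 2)."""
        if not 0 <= time_ms <= self.duration_ms:
            raise ValueError("key press outside the tune")
        self.attempts[attempt].append(time_ms)

    def visible_earlier_marks(self, attempt: int) -> list:
        """Markings of earlier attempts that stay on screen during this attempt."""
        return [sorted(marks) for marks in self.attempts[:attempt]]

    def final_segmentation(self) -> list:
        """The last attempt, treated as the participant's final segmentation."""
        return sorted(self.attempts[-1])


# Hypothetical usage with invented key-press times for a 63-s tune.
session = TuneSession(duration_ms=63_000)
session.mark(0, 15_000)
session.mark(0, 40_000)
session.mark(1, 15_200)
session.mark(2, 15_100)
session.mark(2, 41_000)

print(session.visible_earlier_marks(2))  # what is shown during the last attempt
print(session.final_segmentation())      # the segmentation carried into analysis
```

Keeping each attempt separate, rather than overwriting earlier markings, is what allows the last attempt to be read as a confirmation or correction of the previous ones.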

Procedure

Participants were tested individually. Each session lasted between 45 and 55 min. All

musical materials of the study were presented via binaural headsets. All participants had to first

sign a consent form. Nonmusicians and Western participants had to pass a melody

discrimination test to be eligible for the experiment. This test was used to ensure that our

nonmusicians and Western listeners had basic spectral (i.e., pitch-related) sensitivity. The test

consisted of three trials. In the first trial, participants heard one correct and one incorrect

transposition of a 4-sec long makam tune fragment. Their task was to tell which of the two

transpositions was correct or incorrect (both could have been correct or incorrect). In the

subsequent two trials, participants heard correct and incorrect repetitions of two different 3-sec

long makam tune fragments. One of the incorrect repetitions had a full note change, which also


changed its contour but not so much the “character” of the tune, whereas the other had a quarter-

note change, which maintained the contour of the original tune. Only participants who passed all

three tests were eligible for the study. This pre-assessment phase was skipped for musicians.

In the next phase, participants filled out a music background questionnaire with questions

about their music experience, their listening styles and musical preferences. After that they

received the instructions for the experiment. They were told that they were participating in a

music perception experiment, in which they would listen to a total of eight different makam

pieces. They were informed that they would listen to each tune four times. In the first trial, they

would simply listen to the tune to familiarize themselves. They were told that in the next three

repetitions, they would have to indicate all instances at which they perceived a melodic

boundary. The idea of musical segmentation was demonstrated to them on a well-known tune

(“Katibim”). They were told that when people hear musical tunes, they often tend to perceive

melodic subgroups within those tunes. After the experimenter felt sure that the participant

understood the idea of grouping and marking boundaries, the session continued with the practice

phase.

In the practice phase, participants were told that they would go through practice trials to

familiarize them with the procedure of the study. This phase included two 45-sec long makam

tunes (that were not used in the experimental phase). Participants were told that each tune would

be repeated four times, one for free listening and the next three times for marking their

segmentations. They were informed that during each repetition they would see a red bar moving

along a horizontal axis from left to right as the tune unfolded. In the second repetition, when

they had to mark their segmentations, they would see the moving bar again, but this time,

whenever they pressed the space bar, a green vertical line cutting across the horizontal axis


would appear right at the point where they had pressed the key. In the third repetition of the

tune, things would be just like in the second repetition except that together with the moving bar

they would also see their previous boundary markings on the horizontal axis. Participants were

told that in the third repetition, they should once again try to mark the boundaries they perceived.

It was explained to them that each repetition was a new attempt so each time they could either

confirm their earlier markings by pressing the space bar at exactly the same location or correct

them by pressing the space bar at new locations within the melody. In the fourth repetition,

participants had a third and last chance to change or confirm their boundaries. The boundary

markings of the second repetition appeared in blue whereas the final markings appeared in

purple, to allow participants to discriminate between their marking attempts.

After participants finished their three segmentation attempts for each of the two practice

tunes, they were asked if they now felt comfortable to execute the task for the actual tunes. Had

they needed more practice trials, the practice session would have been repeated. However,

all participants reported after two practice trials that they felt ready for the experimental trials.

The experimental session consisted of eight different makam tunes, each of which was repeated

four times. Unlike the earlier phases, in which occasional interruptions were allowed, the

experimental session was conducted in a strictly standardized fashion without any interruptions.

This session took 33 minutes and 36 seconds. After this session, participants were given a final

feedback form with various questions about the task, potential strategies they might have used

and a section in which they could write additional comments about the experiment. Participants

were then thanked for their participation.
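The reported length of the experimental session (33 minutes and 36 seconds) is consistent with the design figures given earlier, as the following arithmetic sketch suggests. It assumes that all presentations ran back to back without pauses and that the mean excerpt duration was exactly 63 s; neither assumption is stated explicitly in the text.

```python
N_TUNES = 8            # experimental tunes
N_PRESENTATIONS = 4    # one free-listening trial plus three segmentation trials
MEAN_DURATION_S = 63   # mean excerpt duration reported in the Materials section

# Total listening time if every presentation follows the previous one directly.
total_s = N_TUNES * N_PRESENTATIONS * MEAN_DURATION_S
minutes, seconds = divmod(total_s, 60)

print(f"{total_s} s = {minutes} min {seconds} s")
```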

Our experts’ segmentations were obtained differently and in two rounds with slightly

different instructions. They were not exposed to the regular experimental set-up and the fixed


time conditions. Instead, they were exempt from the “real-time listening” pressure and allowed to

see the musical scores in front of them while listening. That is, they simply listened to the tune

and marked the segmentations on the musical score. In the first round, they were asked to mark

a boundary at points that on purely perceptual grounds gave them a sense of closure, whereas in

the second round, they were asked to mark boundaries based on their musicological knowledge.

As mentioned earlier, this distinction was used to ensure that one set of segmentations would be

less influenced by their musicological knowledge. Our study will, however, only use the

perceptual segmentations of the experts for the upcoming analyses.

Results and Discussion

All analyses were done on the last segmentations unless otherwise reported. One

nonmusician was excluded from the data set because a participant-by-participant analysis of

segmentations showed that he was the only participant who did not have any match with the

experts and almost no match with the other participants either. There were participants in all

three groups who did not show high agreement with the experts or with others, but this

nonmusician was unique in that he showed almost no matches at all across all eight tunes over

the three trials, something we did not observe in any other participant. All other participants,

even those that performed with lower match rates, showed variation across tunes, always having

at least one tune (typically, Tune 6), for which they performed much better than the others. We

suspect that this participant simply did not cooperate, hence we excluded him from the analyses.

Feedback forms


Most of the nonmusicians and Western listeners reported that doing the segmentations

was a somewhat challenging task. To a lesser degree, quite a few musicians, too, reported that

the task was not easy, particularly for some tunes. Except for one nonmusician, all other

nonmusicians, and all Western listeners mentioned that their final (third) segmentations were

their best segmentations. Also the vast majority of musicians reported their last segmentations as

their best ones. One of the musicians who did not agree with this mentioned that he segmented

differently with each listening because with each listening different aspects of the tune became

salient to him. Almost all nonmusicians, Western listeners, and the majority of musicians

reported that having had a third chance for the segmentations was beneficial. Quite a lot of

nonmusicians and Western listeners tried to explain how they did the segmentations. They

reported that they tended to perceive boundaries whenever there was “a change in melody”,

“a quick sequence of notes”, “a change in rhythm”, an “extended note” or a “pause”. Musicians

reported very similar criteria with a more technical language. None of the participants reported

knowing any of the tunes.

Expert segmentations

Both experts showed considerable overlap in their perceptual (as well as musicological)

segmentations; in fact, the second expert (who was the head of the Department for Traditional

String Instruments and a professional musician with 30 years of musical experience) confirmed

all of the other expert’s markings but had about 3-4 times as many segmentations, which were all

very small, very locally based perceptual groupings. We therefore decided to use the

overlapping segmentations because their number much better reflected our participants’ number


of segmentations per tune. Thus, from now on, we will use the term “expert segmentations” to

refer to those intersecting segmentations of the two experts.
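Taking the intersection of two experts' markings can be sketched as below. The function name, the score positions, and the tolerance value are hypothetical and chosen only for illustration; the text does not specify how matching markings were identified.

```python
def intersect_boundaries(expert_a, expert_b, tolerance=0.5):
    """Keep the boundaries of expert_a that expert_b also marked within `tolerance`."""
    return [
        a for a in sorted(expert_a)
        if any(abs(a - b) <= tolerance for b in expert_b)
    ]


# Invented score positions (in beats); not the experts' actual markings.
expert_1 = [8.0, 16.0, 32.0]
expert_2 = [4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0]

shared = intersect_boundaries(expert_1, expert_2)
print(shared)
```

When one expert confirms all of the other's markings, as in this toy example, the intersection is simply the sparser expert's set of boundaries.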

Number of segmentations

Figure 1 shows the number of segmentations per tune for musicians, nonmusicians, and

Western listeners. For comparison’s sake, we also added the number of segmentations per tune of

the experts, who made an average of 7.3 segmentations across tunes. A 3 (musician,

nonmusician, Western listener) × 8 (Tune 1-8) mixed two-way ANOVA revealed that there was

no significant difference between musicians, nonmusicians and Western participants in the

average number of segmentations made per tune (M = 8.8, SD = 3.4 for musicians, M = 9.1, SD =

4.4 for nonmusicians, and M = 9.6, SD = 4.7 for Western listeners, F(2, 36) < 1.0; p > .10). Yet,

as can be seen in Table 1, nonmusicians showed more variation among each other in number of

segmentations per tune than musicians (with standard deviations across tunes ranging from 2.1 to

6.4 for nonmusicians, and from 1.6 to 4.8 for musicians). For Western listeners, the lowest

between-participants standard deviation for a given tune was 4.1 and the highest 5.4.

-----------------------------------------

Insert Figure 1 about here

------------------------------------------

Furthermore, there was no significant interaction effect between group and tune (p > .10).

There was, however, a significant main effect for tune, F(4.03, 149.11) = 9.58, MSe = 9.21, p <


.001, partial η² = .21, with Greenhouse-Geisser corrections for sphericity. Bonferroni post hoc

tests of comparison with a p-value set at .05 revealed that Tune 1 produced significantly more

segmentations than any other tune except Tune 6, and Tune 6 produced more segmentations than

Tunes 4 and 7. A quick look at Figure 1 shows that expert segmentations did not mirror these

differentiations. The highest number of expert segmentations was for Tunes 2 and 4 (nine

segmentations for each), and the lowest was for Tunes 1, 3, and 5 (six segmentations for each).

Since all participant groups made more segmentations for Tune 1 than for the other tunes, it is

likely that this was due to its being the first tune in the experimental series.
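As a consistency check on the reported effect size, partial eta squared can be recovered from the F value and its degrees of freedom. The sketch below uses the standard identity; the function name is ours, and the epsilon estimate assumes that the uncorrected effect degrees of freedom equal the number of tunes minus one.

```python
# Minimal sketch: recover partial eta squared from a reported F test.
# Identity: eta_p^2 = (F * df_effect) / (F * df_effect + df_error).

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F value and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported for the main effect of tune (Greenhouse-Geisser corrected).
eta_p2 = partial_eta_squared(9.58, 4.03, 149.11)
print(round(eta_p2, 2))  # 0.21

# Assumption: with 8 tunes the uncorrected effect df is 7, so the implied
# Greenhouse-Geisser epsilon is the corrected df divided by 7.
epsilon = 4.03 / (8 - 1)
print(round(epsilon, 3))  # 0.576
```

The recovered value agrees with the reported partial η2 of .21.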

Time plots: Segmentation locations in milliseconds

Figure 2 presents a time graph in which each colored line represents a given participant’s

segmentation at a given millisecond location as marked on the x-axis, with participants’ group

membership being color-coded as blue for musicians, red for nonmusicians, and green for

Western listeners. A general visual inspection of this time graph shows that for all eight tunes

there were considerably converging segmentations within each group, and more importantly,

across groups.

----------------------------------

Insert Figure 2 about here

----------------------------------

We were also curious to see to what extent such convergence existed already in their

preceding segmentation attempts. Figures 3 and 4 show the same time plots this time for

participants’ first- and second-trial segmentations. We were surprised to see very comparable

overlaps both within and between groups already in the first segmentation attempts, which were


performed right after a single free-listening trial of a given tune. We will inspect the three

segmentation attempts in more detail in our “hit” and “false alarm” analyses.

----------------------------------

Insert Figure 3 about here

----------------------------------

----------------------------------

Insert Figure 4 about here

----------------------------------

Histograms: Segmentation locations binned onto musical notes

Since a given millisecond location per segmentation does not really tell us anything about

the actual event going on in the music, we decided to do a transformation that would turn all

millisecond values into the notes they corresponded to. For example, Tune 1 was composed of

exactly 109 notes and had a length of 66,000 milliseconds; thus, the 66,000-millisecond temporal

space unfolded unevenly across the 109 notes depending on each note’s durational space. Since

all tunes were MIDI-recorded, we were able to perform this binning procedure. Some

pieces had pauses, but since all tunes were recorded in the Qānūn (a Turkish zither) timbre, the

pauses were not audible because of the lingering sound of the preceding note, just as in real-life

Qānūn performances. We therefore added all pause areas to the preceding note’s area. We did

the same thing for all the remaining tunes.
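The binning procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: each tune is available as a list of event durations in milliseconds together with a flag for pauses, and the function names and example values are hypothetical rather than taken from our materials.

```python
from bisect import bisect_right

# Minimal sketch, assuming each tune is a sequence of events (notes and
# pauses) with known durations in milliseconds. Names and data are
# hypothetical illustrations.

def build_note_onsets(durations_ms, is_pause):
    """Return the onset time of every sounded note.

    Pauses do not open a new bin, so their duration is absorbed by the
    preceding note, mirroring the lingering sound of that note.
    """
    onsets, t = [], 0
    for duration, pause in zip(durations_ms, is_pause):
        if not pause:
            onsets.append(t)
        t += duration
    return onsets


def note_for_timestamp(onsets, t_ms):
    """Return the 1-based index of the note whose bin contains t_ms."""
    return max(1, bisect_right(onsets, t_ms))


# Four notes followed by a pause; the pause belongs to the fourth note.
durations_ms = [500, 250, 250, 1000, 500]
is_pause = [False, False, False, False, True]
onsets = build_note_onsets(durations_ms, is_pause)

marks_ms = [480, 760, 1700]
binned = [note_for_timestamp(onsets, t) for t in marks_ms]
print(binned)  # [1, 3, 4]
```

A marking placed during the final pause (1700 ms) is assigned to the fourth note, since the pause area was added to the preceding note's area.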

We then constructed histograms to represent the percentage of participants per group who

put a segmentation on a given note/bin. Figures 5, 6, and 7 show three histograms each (for each

group) aligned along the x-axis for Tunes 1, 3, and 6. These tunes were picked with the purpose


of showing the most differing histograms. Tune 1 was picked because it was the tune with the

highest number of segmentations. Tune 3 was chosen because it was the tune which produced

the worst performance with respect to participants’ segmentations in comparison to the expert

segmentation locations as will be discussed in more detail later. Tune 6, on the other hand, was

chosen because it produced the highest degree of match within groups, between groups, and

with the experts, as will be seen in the later “signal detection-like analysis” section. We also

marked the experts’ perceptual (red) and musicological (black) segmentations as dotted vertical

lines declining from the top of the musicians’ histograms (again, with the perceptual

segmentations being the ones of primary interest). Since we did not want to compromise the

“readability” of the graphs, we split all overlapping expert markings into two closely adjacent

lines, which in reality meet in the middle right on top of the histogram bar for a given bin. The

following subsections will include a descriptive analysis of the converging segmentations within

each of the tunes, and also some interpretation of the segmentation points in terms of the most

basic Gestalt laws of grouping, foremost proximity, but also similarity and good continuation.
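The per-note percentages plotted in the histograms can be computed along the following lines. This is a sketch only, assuming that each participant's final segmentations have already been binned onto note numbers; the function name and the small example group are hypothetical.

```python
from collections import Counter

# Minimal sketch, assuming every participant's segmentations are already
# expressed as 1-based note numbers. The example group is hypothetical.

def percent_per_note(segmentations_by_participant, n_notes):
    """Percentage of participants in a group who marked each note."""
    n_participants = len(segmentations_by_participant)
    counts = Counter()
    for notes in segmentations_by_participant:
        counts.update(set(notes))  # count each participant once per note
    return [100.0 * counts[note] / n_participants
            for note in range(1, n_notes + 1)]


# Four hypothetical participants segmenting a five-note excerpt.
group = [[2, 5], [2], [2, 3, 5], [5]]
histogram = percent_per_note(group, n_notes=5)
print(histogram)  # [0.0, 75.0, 25.0, 0.0, 75.0]
```

Computing the percentages separately per group yields the three aligned histograms for a given tune.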

We also present the musical scores to make it easier for the reader to follow the

descriptive analyses. In each score, the expert segmentations are marked with a vertical red line

crossing through the staff, whereas the group performances can be seen as blue (for musicians),

red (nonmusicians), and green (Western listeners) proportions to mark the corresponding

percentages of participants per group who selected that note as a point of segmentation. We also

marked all specific locations that assumed certain musicological functions within the tune and

notated what they referred to, as side information for the interested and knowledgeable reader.

As can be seen from the histograms as well as the musical scores, there were quite a lot

of cases where part (usually the majority) of the participants put their segmentations right at the

expert points of segmentation and the other part (usually the minority) put theirs right after the

expert points of segmentation. These one-note shifts might have been due to a delayed motor

response, but considering that participants had three segmentation trials and were able to see

their previous segmentations (which means that they could correct their one-note delayed

responses to the actual spot), we may interpret them as a perceptually delayed grasping of a

boundary rather than a delay based on motor reaction.

----------------------------------


----------------------------------

----------------------------------


----------------------------------

----------------------------------


----------------------------------

In the next three subsections, we will take a closer look at the points of segmentations by

consulting the musical scores. We will attempt to understand both expert and participant

segmentations based on laws of Gestalt and other potential surface-feature related factors (such

as measure ends and the like). Here we will exclusively focus on the segmentations that matched

with the expert segmentations but we will also look at all the other potential locations that could

have received boundary responses because of similar surface-feature characteristics and see

whether at least some participants marked those points.


Tune 1. Overall, when we look at the histograms, we see good convergences within

groups, as marked by 50%+ participant agreements on the y-axis at given points of segmentation

on the x-axis. Moreover, we see good convergence also across groups as can be seen by the

overlapping segmentation points marked by 50%+ of the participants in each group. And finally,

when comparing those strongly overlapping segmentation points with the expert markings (in

this tune, all of the experts’ perceptual boundaries were at the same time their musicological

boundaries), we also see a good fit not only with the musicians but also the other two groups.

----------------------------------


----------------------------------

Clearly, the strongest convergence appeared on the 82nd note, marked by the experts and

agreed upon by 94% of the musicians and 100% of the nonmusicians and Western listeners. A

look at the musical score (Figure 8) shows us a half note partial flat B, followed by a quarter note

pause, and preceded by a quaver C. Thus, there seems to be a strongly implied segmentation

based on interonset interval (IOI), which from a Gestalt point of view refers to a grouping based

on durational proximity and hence a segmentation based on durational distance.

The expert marking for the 17th note also triggered considerably converging

segmentations within and across groups. This note was an extended A (a quarter note extended

by a quaver) preceded by a semiquaver G, once again implying a segmentation based on

durational distance.

The expert marking for the 34th note rendered 69% agreement among musicians, and

50% and 55% agreement among nonmusicians and Western listeners, respectively. Again, a

quick glance at the musical score reveals a segmentation based on durational distance. But the


34th note also marks the end of a 9/4 measure, so we cannot rule out that being a measure ending

might have triggered segmentation responses, even though it is hard to envision Western

listeners being able to pick up such an unfamiliar measure. In fact, quite a few Western

participants mentioned in the post-experimental feedback that they had a hard time figuring out

the temporal structure of the tunes (“After the first few trials I realized that the music did not

follow the common rhythm and melody tempo found in Western music. I treated them like

conversations of melodies; some were short, some were very long; I couldn't use a 4/4 to

determine the turn of switch.”). Even for our local nonmusicians it is typically very hard to

extract the metric structure even in multiple trials since the meters are much more varied and

complex compared to Western music. Furthermore, there were 3 out of 7 measure endings that

remained unmarked by all three groups and the experts, so a measure ending in itself appears to

be insufficient as a surface-level feature in triggering segmentations.

The expert segmentation for the 57th note was a quaver C on the musical score preceded

by two semiquaver notes (C-D) and followed by a quaver pause. Hence, that C corresponded to

a durational space of 2 quavers (the note and the inaudible pause) and it was indeed longer

compared to the preceding 2 semiquavers. This boundary was confirmed by 50% of the

musicians, but almost not at all by the nonmusicians (7%) and Western listeners (18%).

Interestingly, the latter groups had a fair number of participants who preferred to segment the

tune at slightly later points, which were the 58th and 61st notes for nonmusicians, and the 58th,

61st and 62nd notes for Western listeners. It is not clear why the 58th note triggered a

segmentation in some of the nonmusicians (36%) and Western listeners (45%) but the 61st note

(a dotted quarter E, which was preceded by a quaver D) appears to have triggered a segmentation

by 31% of musicians, 64% of nonmusicians and 45% of Western listeners because of durational


distance. The 62nd note (a quaver half flat A, preceded by a dotted quarter note E), on the other

hand, which was marked by 36% of the Western listeners (though almost not at all by

nonmusicians), suggests a very local grouping based on pitch interval.

The experts’ fifth segmentation, which was on the 99th note (a half note G with an

added quaver pause, preceded by a semiquaver F flat), once more implies a

segmentation based on durational separation. This boundary was picked up by musicians,

nonmusicians, and Western listeners at high degrees (69%, 79%, and 82%, respectively;

interestingly, musicians chose that point less convergently than the other groups).

Finally, the experts’ last segmentation, which was the final dotted half note C followed

by a full note pause and preceded by a semiquaver B, once more suggests a segmentation based

on durational separation. But this segmentation point also marked the end of a measure and,

moreover, the end of the tune. Here we have to remember that participants were able to

see the end of the tune marked by a black bar on the right-hand side (see Materials and

Segmentation Set-Up), hence that marking could have been due to that knowledge. We would

like to note though that participants were told that they would receive parts of classical Turkish

makam music, hence there was no hint that the 60-75 second long tunes would end with some

closure. Yet, when looking at the final notes of each of the eight tunes, we saw that the last note

was indeed consistently picked not only by the experts but also by a fair percentage of

participants varying from 60% to 100% across tunes. This finding suggests that making a

segmentation on the final note was a very common though not necessarily “default” behavior.

So far we tried to understand the possible surface-feature related factors that might have

triggered a sense of boundary at those six locations marked by the experts and the participants.

But it is also necessary to search the musical score for similar durational or pitch-wise


proximity/distance related sections that were not marked by the experts and see whether at least

some participants marked those locations. There were indeed eleven more locations (notes 2, 10, 21,

30, 47, 61, 68, 74, 86, 87, and 100) that could trigger additional segmentations. Three of them,

the 2nd, 47th, and 61st notes, were all dotted quarter notes that were preceded by quavers, and all

three of them were marked considerably within and across groups, which shows once more a

Gestalt-driven durational segmentation, at least in a real-time listening environment. The 47th

note also distinguished itself by being the ending point to a descending series of five quaver

notes, which might have evoked a sense of good continuation and local closure with the terminal

longer note. These two factors together might account for the particularly high degrees of

segmentations per group, varying from 64% to 79%. It is important to note that all three

locations were at the same time locations with critical musical functions within the piece.

The 10th, 68th, 87th, and 100th notes were potential locations that could have been

segmented due to pitch interval. But as can be seen by the rather low percentages, pitch interval

did not seem to be a strong driver of segmentation behavior in real time listening. The remaining

four notes, the 21st, 30th, 74th, and 86th notes, were once again points of longer duration (but with

less durational contrast compared to the dotted 2nd, 47th, and 61st notes), and except for the 21st

note, they all assumed a musical function as well within the tune. The segmentation behaviors at

the 30th and the 74th note seem worth discussing. Thirty-one percent of the musicians placed a

boundary on the 30th note, a quarter note A, whereas nonmusicians and Western listeners seemed

to have preferred the 31st note, a quaver C, either because of a slightly later “grasp” of the

durational lingering on the quarter note A or because of the pitch interval created by the C that

followed the A. Finally, we observed 55% of Western listeners (and only 19% of musicians and

29% of nonmusicians) segment at the 74th note, which was a quarter C followed by a quaver C


which was also the ending of a descending series of shorter notes. Hence, Western listeners

could have reacted to the ending of a good continuation, more so than nonmusicians and

musicians.

All those additional markings we just discussed (i.e., that were not shared by the experts

but appeared both within- and across- groups) were labelled “false alarms” in our upcoming

signal detection-like analyses, but strictly speaking, as will be seen for the next two tunes, these

were almost never randomly placed segmentations. As already pointed out, when we analyzed

the musical scores from a musicological perspective (e.g. the momentary pauses at the middle or

the end of the modal phrases to mark critical notes/pitches within the modal scale, all set to help

the listener understand the underlying main makam form or rhythmic pattern of the tune), we

noticed that a majority of those extra segmentations discussed in terms of Gestalt laws also

turned out to be musicologically critical locations. This is a point we will come back to later in

our paper.

Tune 3. Tune 3 was the one that showed the worst convergence with the expert

segmentation points (see Figure 6). The experts marked six locations, of which only three were

also marked as musicological segments, but, as mentioned earlier, our focus will be exclusively

on the perceptual boundaries.

----------------------------------


----------------------------------

The experts’ first boundary was for the 22nd note, which was a quarter note G preceded

by a quaver A, which in turn was preceded by a whole sequence of 12 overall descending quaver

notes. This segmentation point was chosen by 31% of the musicians, and only 7% and 9% of the


nonmusicians and Western listeners, respectively. A quarter note after a preceding quaver note

does imply some durational separation, but it is likely that a sense of good continuation and

closure might have directed the experts to group the whole measure of 12 quavers as a group and

hence set a boundary at the quarter note that started the next measure. Quite a few of the

participants (44%, 29%, and 55% for musicians, nonmusicians, and Western listeners, respectively)

put a boundary on the 23rd note, which was a B, probably triggered by the pitch interval from the

quarter note G to the quarter note B.

The experts’ next boundary was on the quarter note pause next to the quarter note D, thus

in the 35th note space. This quarter note D, in turn, was preceded by two more quarter note Ds,

which annexed to an ascending then descending sequence of four quavers, the last quaver being a

D. It is likely that the three quarter note Ds created a sense of grouping based on similarity but it

is also possible that the 35th note received a segmentation because of its being the final note of the

6/4 measure. Certainly, both similarity and end of measure might have played a role, but we

would like to remind the reader that, across the eight tunes, measure ends often did not trigger

segmentations. This 35th note was also marked by 44% of the musicians, 21% of nonmusicians

and 18% of Western listeners.

The third and fourth expert segmentations were the 65th and 91st notes which also

coincided with measure ends. Both were at the same time on half notes preceded by two pitch-

wise nearby quaver notes, thus a durational separation might have triggered those segmentations.

However, neither of the two was marked by more than about 20% of participants within

each group.

The fifth expert segmentation was on the 97th note, which was a dotted quarter note G

with an annexed quaver pause, preceded by a quarter note F. The note G hence had a total


duration of a half note, twice as long as the preceding note. But it also marked the end of a

measure. Interestingly, all participants “missed” that point of segmentation.1

The final expert segmentation was on the last note of the tune which was a half note

preceded by two pitch-wise close quaver notes. This segmentation was picked up by more than

half of the participants in each group. Again we note a durational separation that coincided with

a measure end, which this time was the end of the tune.

We once again inspected the tune for similar sections of duration- or pitch-wise

separations or similarity-based groupings that were not marked by the experts. We could

identify eleven such spots, seven of which were duration-based points of separation (notes 9, 62,

70, 71, 74, 88, and 123), and four, pitch-based points of separation (notes 10, 23, 36, and 46).

The 9th note could have received a segmentation because of its being a quarter note that was

preceded by shorter notes. Moreover, those shorter notes were descending ones, hence both

good continuation and durational separation could have acted as potential Gestalt phenomena to

lead to a segmentation on the quarter note D. Interestingly, none except for one single

nonmusician placed a boundary on that note. Tune 3 distinguished itself by its extremely

flowing, rather fast tempo so it is likely that this very flowing character of this piece made it so

difficult to segment in general and made the listener “miss” such duration-wise slightly longer

notes. The 62nd, 88th, and 123rd notes, on the other hand, were interesting points of

segmentations because they were very parallel segmentations to the experts’ segmentations at the

65th, 91st, and 126th notes, respectively. In all three cases there was a durational separation

caused by a half note B demiflat, F sharp, and B demiflat again, respectively. Whereas the

experts preferred to place the segmentation on the second repetition of those notes within the

measure, quite a lot of the participants, and particularly Western listeners, placed their


boundaries on the first appearance of those half notes. The remaining durational points of

separation, i.e., the 70th, 71st, and 74th notes, received some segmentation to varying degrees

across groups. The two notes with strong pitch intervals, notes 10 and 46, received only a few

segmentations, and more from Western listeners than from the other two groups. In contrast, the

23rd note, which was the one just after the note at which the experts had put a segment, was

marked by 44% of the musicians and 55% of Western listeners. This could once again be a

“spill over”, i.e., a delayed segmentation that was meant for the 22nd note. However, the 23rd

note is the B demiflat which is also the main note of this Segâh makam (the tune starts and ends

with that very note), so it is possible that a deeper-level abstraction of this “carrying note” in

addition to or instead of the pitch interval distance led to this segmentation. The pitch interval

from the 35th note D to the 36th note G produced quite varying numbers of segmentation within

groups (13% for musicians, 57% for nonmusicians, and 27% for Western listeners). But once

again it is critical to note that the 36th note was the note which came right after the expert

segmentation at the 35th note, so the segmentations of some of the participants might have

“spilled over” to the 36th note even though they were “meant” for the preceding note. This

seemed less so for the musicians, the majority of whom had marked the 35th rather than the 36th note,

and more so, particularly for the nonmusicians, who had marked the 36th note rather than the 35th

one.

In general, it is quite difficult to be able to clearly interpret all segmentation behaviors

only from a Gestalt perspective. There are many factors we are not yet aware of. And last but

not least, we once again notice that a great majority of those additional locations also happen to

be points at which crucial musicological events are taking place.


Tune 6. Tune 6 was the tune with the best match between participants and expert

markings (see Figure 7). The measure of this tune was 32/2, which means that the entire tune is

spread out over one single measure, which, in turn, is evenly subdivided into 32 two-half note

sub-measures. A look at the musical score suggests that the reason for such convergent

performance with the expert boundaries across groups was most likely this much

simpler metric structure, which must have allowed participants to more easily grasp the

beginnings and ends of (sub)measures, which were not too different from Western type music

(i.e., a sub-measure of two half notes corresponding to a 2/2 or 4/4 Western measure).

----------------------------------


----------------------------------

The first expert segmentation was on the quarter pause following the 21st note, which was

a quarter note B preceded by a sequence of descending quaver notes. As such, both the law of

good continuation and the law of durational distance (i.e., the distance from the quaver notes to

the half note durational space of the 21st note) might have led to this choice. This segmentation

point was picked up by 64% to 73% of the participants across the three groups.

The second, fifth and last expert markings (notes 34, 87, and 119) were similar to each

other in that they were the endpoints of a sub-measure consisting of one note that expanded over

an entire full-note space. These points were also clearly marked by a durational separation since

in all three cases it was preceded in the previous sub-measure by 3 to 4 quaver notes. Hence we

had once again a “confounding” situation with both measure end and durational distance steering

the listeners to exactly the same point of segmentation, which, in turn, might explain the very


high percentages of overlapping markings both within and across groups. The second expert

segmentation, for instance, was marked by 100% of the musicians, 86% of the nonmusicians,

and 91% of the Western listeners. For the fifth expert boundary we observed an agreement of

around 90% within each group whereas for the last expert boundary, which also marked the end

of the tune, we observed an agreement with 88%, 86%, and 73% of the musicians, nonmusicians,

and Western listeners, respectively.

The third, fourth, and sixth expert segmentations (notes 55, 64, and 108) were all placed

on a half note or a half note tied to a quarter note, which were again preceded by shorter notes.

All of these points coincided this time with the beginning of a measure. Hence, we cannot know

whether a grasping of the beginning of a measure or a durational separation, or both triggered

these boundaries.

In Tune 6, we were able to identify eight potential points of segmentation based on

Gestalt laws which were left unmarked by the experts. We excluded the immediate pitch interval

from the first to the second note in the piece since it seemed too early a position to place a

segmentation. Pitch intervals should stand out in a context of a series of neighboring notes, thus

the second note of a tune should be lacking this contrast effect. Three of the eight locations,

notes 65, 72, and 109, all had unique pitch intervals. Interestingly, all three of them were at the

same time the very next notes to a major segmentation marked both by the experts and the

participants. Hence the varying degrees of segmentation responses at those very locations might

once more be “spill overs” from the preceding segments rather than fresh segmentations based

on pitch interval. The remaining five locations (notes 3, 8, 24, 42, and 71), on the other hand,

were all duration-wise separated points, mostly half notes or dotted quarter notes preceded by

shorter notes. In three of the cases (notes 3, 8, and 71) those very locations also marked


significant musico-functional events, and it was those notes that did indeed produce a

considerable number of participant segmentations (except for the 3rd note, which once more

might have been too early an event to notice). This may make one wonder whether there is some

relationship between the presence or absence of musicological events in triggering or not

triggering participant segmentations. But, on the other hand, how should a Western listener have

any even implicit understanding of these musicological, typically more deep-structure related

aspects of music? We will slightly speculate on this issue in the upcoming section.

Summary. Overall, we observed that the segmentations marked by the experts and the

participants almost always referred to notes that had considerably longer durations than their

preceding notes. In a few cases these points also represented the ending point of a “good

continuation” series (such as a series of ascending or descending shorter notes). Both experts

and participants only very rarely, and if so, more Western listeners than any of the others,

segmented based on pitch intervals. For that matter, pitch intervals also never referred to

musico-functional events. Finally, we report one case (in Tune 3), where a grouping might have

been based on similarity, but that particular location also contained a durational separation and

an ending of a measure, thus it is hard to know which one of these surface features was

directing the participants to mark a boundary.2

When we analyzed the musical scores with respect to notes that could have been but were

not segmented by the experts based on those very laws, we observed that many were marked by

a varying percentage of participants, and almost always by all of the three groups. This once

more confirmed the power of surface-related Gestalt principles in guiding segmentation.

Finally, we also noticed that most of the time those very locations assumed significant musical

functions within the pieces. Since it is hard to imagine Western listeners having any such


knowledge even on implicit grounds, we believe that this is an interesting peculiarity of at least

19th century Turkish makam music (and probably of a lot of other, particularly older, musical

styles) that critical musicological events mostly seem to happen at Gestalt-wise critical points of

segmentation, particularly, at duration-wise longer notes. Out of curiosity, we inspected all

musical scores for locations that could have been segmented because of some duration-wise

separation (e.g. a quarter note preceded by one or two quaver notes) but were not segmented by

even a single participant. All eight tunes had such locations, and interestingly, none of them

marked any musicological event. Furthermore, we noticed that almost all of those slightly longer

notes appeared to be very local points of durational separation that were parts of bigger wholes,

e.g., parts in a bigger “good form” such as a repetitive rhythmic pattern. We wonder whether in

their fourth listening trials (which were their third segmentation trials), participants, even

Western listeners, might have started to use some surface-structures that were less local (such as

immediate durational separation) and more global. An example would be the recognition of a

repeating longer-scale rhythmic pattern that simply should not be split due to some very local

duration-wise separating event that occurs within that more global pattern, or “good form”.3

Signal detection-like analyses

Given the impressive convergences that were already visible in the high-resolution

millisecond-based time plots, and certainly in the histograms, for which all millisecond time

stamps were binned onto the notes they belonged to in their respective musical scores, we

thought it would be worthwhile to attempt a signal detection-like analysis, which would allow us

to display the statistics for all of the eight tunes. For this, we turned every participant’s data into

“hits” and “false alarms” by taking the expert segmentations as the “answer key”. As mentioned


earlier, these “hits” and “false alarms” were already visible in the histograms but the analyses we

were able to do were doomed to remain descriptive. Furthermore, since we also had the first and

second segmentation trial data at hand, we had a chance to perform analyses on those earlier

segmentations as well.

Transforming segmentations into “hits” and “false alarms”. Each participant’s hit rate

was calculated by counting the number of notes per piece and per segmentation trial that

overlapped with the choices of the experts. For instance, if the experts had marked six specific

locations as indexed by note number (e.g., 34th note etc.) and a given participant had correctly

marked 5 of them, her hit rate was calculated as 5/6 (≈ 83%) for a given segmentation trial.

False alarm rates, on the other hand, were calculated by counting all the remaining segmentations

of the participant (that were not marked by the experts) divided by the remaining possible

number of notes. For instance, if a tune had a total of 109 notes, six of which were points of

segmentation according to our experts, we would have a remaining 103 notes on which a

participant could have placed a segmentation. Thus, if a given participant had 13 extra

segmentation points (in addition to her 5 points that matched with the expert segmentation) we

would calculate her false alarm rate as 13/103 (≈ 13%). However, when we inspected our

histograms, we noticed that there were quite a lot of “gaps” on the x-axis, meaning that in each

tune there were a number of notes that were not marked by any of the 40 participants and 2

experts. We therefore decided to subtract the number of notes never marked with a segmentation

from each tune and recalculate participants’ false alarms accordingly. If, for instance, in a given

tune 45 notes out of the remaining 103 notes were never marked by anyone, we recalculated that

participant’s false alarm rate as 13/58 (≈ 22%). We believe this to be a much fairer

representation of participants’ false alarm tendencies.
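The calculation described above can be sketched in a few lines. The function name, argument names, and the specific note numbers are hypothetical and chosen only to reproduce the worked example in the text (109 notes, 6 expert boundaries, 5 matches, 13 extra marks, 45 never-marked notes).

```python
def hit_and_false_alarm_rates(participant_notes, expert_notes,
                              total_notes, never_marked=0):
    """Return (hit rate, false alarm rate) for one participant, tune and trial.

    never_marked: number of non-expert notes that nobody ever marked;
    these are removed from the false alarm denominator.
    """
    participant = set(participant_notes)
    expert = set(expert_notes)
    hits = len(participant & expert)          # marks that match the experts
    false_alarms = len(participant - expert)  # marks the experts did not make
    hit_rate = hits / len(expert)
    candidate_notes = total_notes - len(expert) - never_marked
    fa_rate = false_alarms / candidate_notes
    return hit_rate, fa_rate

# Hypothetical note numbers matching the counts in the worked example.
expert = [10, 22, 34, 51, 78, 109]
participant = [10, 22, 34, 51, 78,                      # 5 matches
               3, 7, 15, 19, 27, 40, 45, 58, 63, 70, 85, 92, 101]  # 13 extras

hr, fa = hit_and_false_alarm_rates(participant, expert, total_notes=109)
_, fa_adj = hit_and_false_alarm_rates(participant, expert,
                                      total_notes=109, never_marked=45)
print(round(hr, 2), round(fa, 2), round(fa_adj, 2))  # 0.83 0.13 0.22
```

Removing the never-marked notes shrinks the denominator from 103 to 58, which is why the adjusted false alarm rate is higher than the unadjusted one.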


Comparing segmentation accuracies across three attempts. Table 3 summarizes

participants’ hit and false alarm rates by segmentation trial and group membership.

----------------------------------


----------------------------------

A very interesting finding emerged, which we were able to see already when we plotted

the time graphs for the first and second segmentations as well. Quite to our surprise and in

contrast to participants’ self-reports, we saw that already at their very first segmentation

attempts, they performed very close to their final segmentation accuracy.

We therefore decided to run a 3 (Group: Musicians, Nonmusicians, Western listeners) x 3

(Segmentation trial: First, Second, Third) mixed design ANOVA on participants’ Hit minus

False Alarm accuracies, which are plotted in Figure 11.

----------------------------------


----------------------------------

As suspected, there was no main effect for segmentation trial (F(2, 74) < 1.0, p > .10),

which means that participants performed equally across the three segmentation trials.

Furthermore, there was also no significant interaction between segmentation trial and group (F(4,

74) < 1.5, p > .10), meaning that the fluctuations in segmentation accuracy across three trials did

not differ by group. On the other hand, we did find a main effect for group (F(2, 37) = 3.39, MSe

= .03, p < .05, partial η² = .16). A Bonferroni post hoc test with an α-value set at .05 showed that

musicians were significantly more accurate than both nonmusicians and Western listeners, with

the latter two not being significantly different from each other.
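As a rough plausibility check, partial eta squared can be recovered from an F value and its degrees of freedom with the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the reported group effect; the function name is ours, and because the reported F is itself rounded, the result can only approximate the reported effect size.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Reported main effect of group: F(2, 37) = 3.39
eta_p2 = partial_eta_squared(3.39, 2, 37)
print(round(eta_p2, 3))  # about 0.155, close to the reported .16
```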


Comparing segmentation accuracies across tunes. Since participants’ segmentation

accuracies did not differ significantly along their three segmentation trials, we averaged them

across trials and then calculated their average hits and false alarm rates tune by tune (Table 4).

----------------------------------


----------------------------------

As we had already noted when discussing the histograms, consistently, across all three

groups, Tune 6 (makam: “nihavend”; rhythmic pattern: hafif, 32/2) produced the best

segmentation performance whereas Tune 3 (makam: “segâh”; rhythmic pattern: yürük semai,

6/4) produced the worst performance when comparing participants’ segmentation matches to

expert segmentations. When comparing the groups we see that, on average, musicians and

Western listeners performed well not only in Tune 6 but also Tune 8 (makam: “hicaz”; rhythmic

pattern: hafif, 32/4), whereas, on average, the second best tune segmentation of nonmusicians

was for Tune 5 (makam: “saba”; rhythmic pattern: aksak, 9/8) instead of Tune 8. It remains

open as to why such more or less consistent differences4 occurred across tunes. Both Tune 6 and

8 had a rhythmic pattern that was both simpler and closer to the Western meters. This might

have been one reason why the highest success was shown on those tunes. Why Tune 3 fared as

the worst tune is harder to understand because both its rhythmic pattern and makam were shared

by other tunes which were segmented quite well on occasions.
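The ordering of tunes can be illustrated directly from the total averages in Table 4. The sketch below subtracts the total false alarm percentage from the total hit percentage for each tune and ranks the tunes; note that this works on group-averaged percentages only and is a descriptive shortcut, not the per-participant accuracy analysis reported above.

```python
# Total averages from Table 4 (percent), keyed by tune number.
total_hits = {1: 65, 2: 44, 3: 23, 4: 41, 5: 72, 6: 80, 7: 42, 8: 74}
total_fas = {1: 12, 2: 10, 3: 9, 4: 6, 5: 8, 6: 10, 7: 8, 8: 9}

# Hit minus false alarm, in percentage points, per tune.
accuracy = {tune: total_hits[tune] - total_fas[tune] for tune in total_hits}

# Tunes ordered from best to worst segmentation performance.
ranked = sorted(accuracy, key=accuracy.get, reverse=True)
print(ranked)                  # [6, 8, 5, 1, 4, 2, 7, 3]
print(accuracy[6], accuracy[3])  # 70 14
```

On these totals, Tune 6 comes out best and Tune 3 worst, with Tunes 8 and 5 close behind Tune 6.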

Comparing musician Western listeners to nonmusician ones. As noted in the

“participants” section, four of our Western participants were musicians (of 7, 11, 11, and 17 yrs

of musical training, respectively), all of whom had stopped playing music for at least 5 years. We

calculated their average hit and false alarm rates to see how they compared to their fellow


Western participants who were complete nonmusicians. Table 5 shows that musician Western

listeners did indeed have higher hit but also higher false alarm rates. A comparison with Table 3

shows that the average hit rate of the four musician Western listeners equaled the one of the 16

Turkish musicians, but Western musicians had once more higher false alarm rates. When

looking at the averages in terms of accuracies, i.e., hit minus false alarm rates, Turkish

musicians still had the highest average accuracy in descriptive terms. Since it is not appropriate

to compare an average coming from 16 participants to a much less representative one coming

from just four participants, we did not do any further analyses here.

----------------------------------


----------------------------------

We were, however, interested in seeing whether there would be some changes if we

performed a 3 x 3 mixed ANOVA on participants’ Hit minus False Alarm rate accuracies, this

time including only the nonmusician listeners of the Western group. We once more only found a

main effect for group (with observed power still at (1 – β) > .61, despite the decrease in

sample size). A Bonferroni post hoc test with an α-value set at .05 showed that there was still

only a significant difference between musicians and the other two groups, i.e., the accuracies of

Turkish and Western nonmusicians remained comparable.

About the participants.

We would also like to report some subjective observations that may shed some additional light

on the performances of the three groups. We noticed that our musicians showed a considerable

level of anxiety when doing the task. We had the impression from the way they behaved during


the experiment that they felt as if their musicianship was tested, which certainly was not the case

and was very explicitly communicated to them. We emphasized that we were interested in how

different people segment tunes differently; none of the participants were told about an expert

segmentation or the like, and we indeed only collected our expert segmentations after having

finished running all three groups.

Our nonmusicians, in turn, were regular university students of a university that does not

have a music department. The great majority listened to Turkish and foreign pop, Turkish and

foreign rock, and house music as their most preferred styles of music (but reported to be listening

to quite a few other styles as well, with almost all of them listening to at least 4-5 different music

types, which included predominantly jazz, and classical music, and in 2-3 cases Turkish folk

music), and they participated in a routine way, as students of an Introduction to Psychology mass

course, in return for credits. Quite a few of them enjoyed the experiment but there were also a

couple of them who probably saw it as one more experiment to earn credits from, no more. It is

important to note that Turkish classical makam music, particularly such an old and purely

instrumental version of it, is not very exciting or stimulating music for most young people in

Turkey.

Our Western listeners were one-term exchange students who reported to be listening

mostly to pop, rock, jazz, and electronic music. They were all very into the experiment and

experienced the whole thing as something very exotic and nicely fitting with the cultural

component of their exchange-student experience. They not only enjoyed doing the task but they

were also very concentrated during the entire session. It seemed to us that the very difficulty and

unusualness of the musical material challenged them into delivering their best possible

performance.


In conclusion, we believe that these subject and situational factors must also have had

some influence on our participants’ data.

General Discussion

The main goal of our study was to see to what extent musical segmentation is driven by

surface-level features, and to what extent musical training and enculturation had additional

influences on people’s segmentations in a real-time listening task. Moreover, we wanted to

inspect these dynamics in a type of music outside of the highly investigated Western musical

realm in order to get some sense of the generalizability of Gestalt laws of grouping in the context

of music. To our knowledge, the only published studies on segmentation using non-Western

music are those by Ayari and McAdams (2003) and Lartillot and Ayari (2008, 2011), all of

which used a Tunisian Arabic taqsīm piece, which, however, because of its taqsīm (i.e.,

improvisation-like) form might have been too challenging for the untrained listener to tackle.

We therefore decided to use again a music outside of the Western tonal hierarchical realm but

one that was sufficiently structured for untrained or foreign listeners to handle such a task as

segmenting in real time. We carefully chose our pieces from a generally unknown 19th century

Turkish classical music repertoire, and made sure to cover a certain variety of makam, form and

rhythmic pattern. Our participants, on the other hand, consisted of three groups: (1) advanced

musicians who were students of makam music, (2) nonmusicians, and (3) Western listeners who

did not have any exposure to Middle Eastern or North African makam-type music.

Overall, we found an astonishing overlap of segmentations not only within each group

but also across groups. Moreover, the real-time segmentations also showed considerable

convergence with the segmentations made by two experts, and this was the case not only for the


participants’ last (of their three) attempted segmentations, but also for their earlier

segmentations, including their very first ones. This is a very critical finding since segmenting a

tune for the third time while listening to it for the fourth time makes one wonder to what extent

memory-like representations start driving the segmentation process, at least in addition to

perceptual processes. It was interesting to see that although the majority of participants reported

that their last segmentations were the best ones and that it was necessary for them to have had that

third trial, we did not find significant differences in performance accuracies (as measured by hits

and false alarms) across segmentation trials in any of the three groups.

When we scrutinized the specific locations of those shared segmentations we almost

always found a duration-based separation from a preceding set of notes, only rarely a pitch-wise

separation, and only once a potentially similarity-based grouping (which, however, was

confounded by a coinciding duration-based separation). This finding, expressed in Gestalt terms,

suggests a predominantly durational (rather than pitch-wise) proximity-based grouping5. This

predominance of durational separation in determining segmentations nicely confirms Lartillot

and Ayari’s (2011) findings. We furthermore noticed that other locations, which were not

marked by the experts but had a similar duration-wise separation from a preceding set of shorter

notes, still triggered segmentations by a fair number of participants across groups. This shows,

once more, that surface-feature related aspects seem to be driving the segmentation process,

particularly when doing the task in a real-time listening setting. In some of the cases, durational

separation went hand in hand with a sort of good continuation that had come to a stop (such as 4-

5 ascending or descending quaver notes that finished on a half note), which is yet another more

local-level Gestalt law of grouping. On the other hand, it was interesting to observe that almost

all locations that had these separating, surface-level features happened to be locations that played


a critical functional, musicological role in the piece. They were oftentimes lingering notes that

hinted at the main makam of the piece or points of small modal shifts and the like. From an

attentional perspective one could argue that these points of separation attract the listener’s

attention and that this raised attention then allows him or her to extract a potentially critical piece

of information for a better understanding of the piece. We certainly do not know to what extent

these attentional processes are happening at an explicit or implicit level but it is likely that for

nonmusicians and Western listeners these are happening at a more implicit level whereas for

musicians they are happening at an implicit but certainly also at an explicit level (cf. Bigand &

Poulin-Charronnat, 2006). This might also explain why musicians overall performed better than

the other two groups with respect to their number of matches with the experts, i.e., their musical

knowledge may make them seek these points in a more explicit and top-down way.
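To make the notion of a duration-based separation concrete, the following toy sketch flags a note as a candidate boundary when it is markedly longer than the notes immediately preceding it. It is purely illustrative: the function, the window size, and the ratio threshold are our assumptions, and it is not the procedure used by the experts, nor any of the computer models referred to in this paper.

```python
def duration_boundaries(durations, window=3, ratio=2.0):
    """Return 1-based note numbers whose duration is at least `ratio` times
    the mean duration of the `window` notes immediately before them."""
    boundaries = []
    for i in range(window, len(durations)):
        preceding = durations[i - window:i]
        mean_preceding = sum(preceding) / window
        if durations[i] >= ratio * mean_preceding:
            boundaries.append(i + 1)  # convert 0-based index to note number
    return boundaries

# Hypothetical durations in quarter-note beats: runs of quavers (0.5),
# each run ending on a half note (2.0).
durations = [0.5, 0.5, 0.5, 0.5, 2.0, 0.5, 0.5, 0.5, 0.5, 2.0]
print(duration_boundaries(durations))  # [5, 10]
```

Such a purely local rule marks every long note after a run of short ones, which is in line with the observation that similar locations not chosen by the experts still attracted participant segmentations.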

We had also noted earlier that we had used a slightly different set-up, in which

participants had a chance to see their earlier segmentations on a real-time moving bar (that did

not provide any visual information about the tune) and be able to confirm or correct their earlier

segmentations at each successive trial. We had argued that segmentations during real-time

listening with only headphones and a keyboard to mark boundaries might be too challenging a

task, particularly with difficult or very novel music. Since participants could never relate to their

previous performances, any new segmentation attempt simply meant starting almost anew.

Hence, one could never be sure whether the last segmentation was indeed the most valid one for

the participant or whether it was the first, or the second one, or even some kind of a composite of

the three segmentation sets. One potential risk with our set-up, however, was that participants

might get distracted by the visual display or that they might end up anchoring to their first

segmentations (i.e., simply reconfirm them in the second and third segmentation trials). When


we individually inspected the data of all 40 participants we saw that with a few exceptions

(mostly for Tune 6, which was the most successfully mastered tune), participants did change

their segmentations over trials, sometimes to their benefit and sometimes to their disadvantage

and sometimes neither (i.e., when their changed segmentations still did not match those of the

experts). It was nonetheless critical for us to see that participants of all three groups were

already very accurate in their first segmentations. Furthermore, if the display had been

distracting, we should have expected very uneven performance due to the distraction. Instead we

observed (1) strong convergences within groups, (2) strong convergences across groups, (3)

strong convergences with the experts who performed the task with the musical score and without

real-time pressure, and (4) certain pieces consistently producing nonconverging segmentations and

other pieces consistently producing strongly converging segmentations. We believe that all of

these are important indications that we are dealing with a very real, ecologically valid, and

astonishingly successful segmentation performance, already in the first attempts. Since most

studies have used the typical protocol of having participants first listen to a tune and then try to

segment it over three trials with the last segmentation being counted as the to-be-analyzed one

(e.g., Ayari & McAdams, 2003; Deliège, Mélen, Stammers, & Cross, 1996; Deliège & El

Ahmadi, 1990; Krumhansl, 1996), we decided to stick to the same protocol and hence reported

our findings first only on the basis of the last segmentations. Only in the later sections did we

report performance accuracies for the earlier segmentations. Here, our findings for the first

segmentations suggested that in future studies we might do with a single segmentation trial after

one free listening trial per tune. As mentioned earlier, this would also overcome the issue of how

much of the processes guiding participants’ segmentation behaviors are memory-based rather

than, or in addition to, perception-based processes.


In conclusion, our segmentation study with 19th century unknown Turkish makam tunes

of varying makam and rhythmic forms confirmed the findings of a series of earlier studies, which

had shown that the way listeners segmented music in a real-time listening setting was strongly driven

by musical surface-related cues, and that this was so even for musicians (Deliège, Mélen,

Stammers, & Cross, 1996), and for music from a variety of Western musical genres (e.g.,

Bruderer, McKinney, & Kohlrausch, 2009; Clarke & Krumhansl, 1990; Deliège & El Ahmadi,

1990). Ayari and McAdams’ 2003 and Lartillot and Ayari’s 2008 and 2011 studies similarly

hinted at the strong role of durational proximity when segmenting music from a completely

foreign musical culture, and our present study confirmed this with somewhat more strength and

clarity, probably thanks to the more structured nature of our selected tunes. We would also like

to note that all three Ayari studies had used musician Western listeners whereas we had mostly

nonmusician Western listeners who were nonetheless surprisingly good at identifying the

boundaries, not only in agreement with each other but also in agreement with the expert

segmentations.

To our knowledge, this is the first study in the segmentation literature to look at the

effects of musical training and musical culture with non-Western musical material. We found an

effect of musical training and it is likely that musicians’ superior performance was due to some

“help” coming from a top-down, knowledge-based understanding, since almost all locations that

demanded a segmentation according to the experts were locations that had both Gestalt-based

grouping characteristics and musico-functional characteristics within the piece. And it may be

this very peculiar lack of a dissociation between musico-functional segments and Gestalt-based

segments in this 19th century Turkish makam music which made it possible for Western listeners

to perform so well and indistinguishably from the Turkish nonmusicians. This may also explain


why Schaefer, Murre, and Bod (2004) noted limits for the universality in segmentation when

using a special German folksong corpus with songs that had jump-phrases which violated Gestalt

principles, i.e., where musico-functional segments dissociated from Gestalt-based segments. It is

likely that post-Renaissance Western music has more of these dissociations compared to more

ancient, traditional music.

In conclusion, we might say that in real-time listening settings, in which one does not

know what comes next at any given moment, tones of longer duration are more often than not

important points of segmentation. But we also observed, in our more detailed descriptive per-

tune analyses of participants’ segmentations and lack of segmentations, that sometimes such

notes are not chosen even by musically and musico-culturally inexperienced listeners as points

of segmentations because of some early understanding or abstraction of bigger “wholes” that

simply should not be split. We would like to end this paper with the last sentences in Max

Wertheimer’s 1923 article “Untersuchungen zur Lehre von der Gestalt”:

Is it not phenomenally a huge difference when I hear the first three tones of a

melody as those in expectation of a continuation …, or in contrast, when there is

an “ending”, when I have the three as a (full-)motif? … There is nothing “added

upon”, instead there is essentially the flesh and blood of the given. If one truly

searches to grasp this sort of the phenomenally given, one sees that the individual

tones in the melody are given clearly, ‘prägnant’ as parts; and if one tries to grasp

this sort of the given before its continuation (in the situation, in which “one does

not yet know how and if it continues”), then something beautifully characteristic

arises: the tones … are not yet quite “finalized”, they are of an indefinite, still


unstable character, they are finalized, firm and definite only then when there is the

last one, e.g., the “closing tone”, through which everything is settled. (p. 350)6


References

Allport, F. H. (1955). Theories of perception and the concept of structure: A review and

critical analysis with an introduction to a dynamic-structural theory of behavior.

Hoboken, NJ: John Wiley & Sons.

Ayari, M., & McAdams, S. (2003). Aural analysis of Arabic improvised instrumental music

(taqsīm). Music Perception, 21, 159-216.

Bruderer, M. J., McKinney, M. F., & Kohlrausch, A. (2009). The perception of structural

boundaries in melody lines of Western popular music. Musicae Scientiae, 13, 273-313.

Castellano, M. A., Bharucha, J. J., & Krumhansl, C. L. (1984). Tonal hierarchies in the music

of North India. Journal of Experimental Psychology: General, 113, 394-412.

Clarke, E. F., & Krumhansl, C. L. (1990). Perceiving musical time. Music Perception, 7,

213-252.

Deliège, I. (1987). Grouping conditions in listening to music: An approach to Lerdahl &

Jackendoff’s grouping preference rules. Music Perception, 4, 325-360.

Deliège, I. (1996). Cue abstraction as a component of categorisation processes in music

listening. Psychology of Music, 24, 131-156.

Deliège, I., & El Ahmadi, A. (1990). Mechanisms of cue extraction in musical groupings: A

study of perception on Sequenza VI for viola solo by Luciano Berio. Psychology of Music,

18, 18-44.

Deliège, I., Mélen, M., Stammers, D., & Cross, I. (1996). Musical schemata in real-time

listening to a piece of music. Music Perception, 14, 117-160.

Deutsch, D. (1975). Musical illusions. Scientific American, 233, 92-104.

Deutsch, D. (1986). Auditory pattern recognition. In K. R. Boff, L. Kaufman, & J. P. Thomas


(Eds.), Handbook of perception and human performance: Vol. 2. Cognitive processes

and performance (pp. 1-49). Oxford, England: John Wiley & Sons.

Dowling, W. J. (1973). Rhythmic groups and subjective chunks in memory for melodies.

Perception & Psychophysics, 14, 37-40.

Jusczyk, P. W., & Krumhansl, C. L. (1993). Pitch and rhythmic patterns affecting infants’

sensitivity to musical phrase structure. Journal of Experimental Psychology: Human

Perception and Performance, 19, 627-640.

Kessler, E. J., Hansen, C., & Shepard, R. N. (1984). Tonal schemata in the perception of music

in Bali and in the West. Music Perception, 2, 131-165.

Krumhansl, C. L. (1991). Memory for musical surface. Memory & Cognition, 19, 401-411.

Krumhansl, C. L. (1996). A perceptual analysis of Mozart’s Piano Sonata K. 282:

Segmentation, tension, and musical ideas. Music Perception, 13, 401-432.

Krumhansl, C. L. (2000). Tonality induction: A statistical approach applied cross-culturally.

Music Perception, 17, 461-479.

Lartillot, O., & Ayari, M. (2008). Segmenting Arabic modal improvisation: Comparing

listeners’ responses with computer predictions. Proceedings of the Conference on

Interdisciplinary Musicology (CIM08), 1-10.

Lartillot, O., & Ayari, M. (2011). Cultural impact in listeners’ structural understanding of a

Tunisian traditional modal improvisation, studied with the help of computational

models. Journal of Interdisciplinary Music Studies, 5, 85-100.

Lerdahl, F., & Jackendoff, R. (1983). An overview of hierarchical structure in music. Music

Perception, 1, 229-252.

Palmer, S. E. (1999). Vision science: Photons to phenomenology. Cambridge, MA: MIT


Press.

Rock, I. (1975). An introduction to perception. New York City, NY: Macmillan Publishing.

Schaefer, R. S., Murre, J. M. J., & Bod, R. (2004). Limits to universality in segmentation of

simple melodies. In Proceedings of the 8th International Conference on Music

Perception & Cognition (ICMPC8), 1-4.

Serafine, M. L., Glassman, N., & Overbeeke, C. (1989). The cognitive reality of hierarchic

structure in music. Music Perception, 6, 397-430.

Stoffer, T. H. (1985). Representation of phrase structure in the perception of music. Music

Perception, 3, 191-220.

Wertheimer, M. (1923). Untersuchungen zur Lehre von der Gestalt. II. Psychologische

Forschung: Zeitschrift für Psychologie und ihre Grenzwissenschaften, 4, 301-350.


Acknowledgments

We thank Hasan Gürkan Tekman and Aysecan Boduroglu for their helpful comments and

suggestions as well as Olivier Lartillot for his extensive help for prior versions of visually

representing segmentation marks, and Cem Gözel who is a student at Mimar Sinan Fine Arts

University for helping us with the final markings on the musical notations. We also thank Prof.

Nilgün Doğrusöz and Prof. Nermin Kaygusuz who participated as our two musical experts. We

would like to note that this research was the experimental-musicological part of a larger project

with Olivier Lartillot, who together with the second author of this paper developed the computer

modelling and musicological component of it. In that component, which can be found in the

Proceedings of the Third International Workshop on Folk Music Analysis, Lartillot proposes a

new computer model of segmentation, which is compared to the models proposed by Tenney &

Polansky (1980) and Cambouropoulos’ Local Boundary Detection Model (2006).


Endnotes

1 Across all eight tunes there were only four more of such ‘misses’ (i.e., expert

segmentations that did not receive any participant segmentations), two in Tune 4 and two in

Tune 7. All ‘missed’ segmentations made sense from a Gestalt perspective (in two cases we

observed a separation based on duration and in the other two, a separation based on pitch). The

two duration-wise separating notes also marked an important musicological event. Yet none of

them were picked up by even a single participant.

2 We inspected all tunes for potential groupings based on similarity, i.e., we looked at

locations that had at least a unison. Overall, there were three such locations across all eight

tunes, all of which were marked by the experts. Interestingly, these three locations also referred

to critical musicological events in the piece, so once more we observed a Gestalt-based event that

coincided with a critical musicological event. One such event was in Tune 3, which we already

discussed. A second one was in Tune 7, a unison of two dotted quarter note Ds, which filled the

entire 6/8 measure. Thus, the second D was also the note that ended that measure, but

nonetheless, only 5% of musicians, 7% of nonmusicians, and 10% of Western listeners marked

that note (as mentioned, this location was marked by the experts though). The third similarity

event was in Tune 8 in its final measure which ended with two Ds, a dotted quarter note D

followed by a quarter note D with an annexed quarter pause (hence sounding as a half note rather

than a quarter note). The ending D received a segmentation by 62% of musicians, 50% of

nonmusicians, and 48% of Western listeners. We cannot know for sure whether it was

segmented due to similarity, due to being a measure end, due to being the final note of the tune, or

due to being the longest note in the measure. However, given that locations with at least a

unison, as well as locations that marked the ending of a measure did not consistently trigger


segmentations, we are inclined to believe that durational proximity in the sense of separation was a

much stronger predictor of segmentation.

3 We also had all tunes segmented by three different computer models which only

used those very local laws of Gestalt (Lartillot, Yazıcı, & Mungan, 2013). What we observed

was that all models produced a considerable number of extra segmentations which were not

shared either by our experts or our participants, including Western listeners. This indeed

suggests that even Western listeners, particularly in the fourth round of listening, might have

been able to understand slightly bigger forms and hence abstain from such extra segmentations,

something that the computer models may not yet be able to do.

4 We also looked at the best and worst segmentations, participant by participant within

each group, which revealed considerable stability across participants within each group. Yet, we

observed that almost without any exception, musicians and Western listeners performed worst on

the same tune (Tune 3), whereas nonmusicians showed considerable variance in that respect (in

some cases Tune 2, in some Tune 4 was the worst one). On the other hand, we observed that

almost without any exception, musicians performed equally often best on both Tunes 6 and 8,

whereas Western listeners mostly performed best on Tune 6 and only a few times also on Tune 8.

Nonmusicians, in turn, almost always performed best on Tune 6 and a few times also on Tune 1.

5 We would like to note that grouping on the basis of similarity did not much apply to our

tunes since (1) we had removed all repeat signs, meaning that phrases or measures were never

repeated, (2) our tunes only rarely had repeating same notes (see Endnote 2), and (3) there were

no timbre, articulation or dynamics changes due to the synthesized nature of the tunes.

6 Translated by one of the authors (E. M.).


Table 1

List of Tunes with Information about their Musical Characteristics

Tune  Makam     Rhythmic pattern & Meter (“Usul”)  Composer                            Duration

1     Saba      Ağır Aksak (9/4)                   Şevki Bey (1860 – 1891)             00:01:06

2     Uşşak     Ağır Aksak (9/4)                   Şevki Bey                           00:01:00

3     Segâh     Yürük Semai (6/4)                  Eyyubî Ebubekir Ağa ( ? – 1759)     00:01:00

4     Segâh     Aksak (9/8)                        Ahmet Rasim Bey (1864 – 1932)       00:01:00

5     Saba      Aksak (9/8)                        Zeki Arif Ataergin (1896 - )        00:01:08

6     Nihavend  Hafif (32/2)                       Hacı Faik Bey ( ? – 1891)           00:01:10

7     Uşşak     Yürük semai (6/8)                  Tanburî Ali Efendi (1836 – 1902)    00:01:00

8     Hicaz     Hafif (32/4)                       Tanburî Refik Fersan (1893 – 1965)  00:01:00


Table 2

Mean Number of Segmentations (and their Standard Deviations) per Tune According to Musicianship and Musical Culture

                        Tune 1     Tune 2     Tune 3     Tune 4     Tune 5     Tune 6     Tune 7     Tune 8

Group             N   Mean   SD  Mean   SD  Mean   SD  Mean   SD  Mean   SD  Mean   SD  Mean   SD  Mean   SD   Ave   SD  SD range

Musicians        16   10.3  4.8   9.1  3.6   8.1  3.4   8.2  3.7   8.9  3.5   9.5  3.0   7.9  2.9   8.2  1.6   8.8  3.4  1.6 - 4.8

Nonmusicians     13   12.5  6.4   8.5  4.1   8.3  6.2   7.4  2.9   9.5  3.4  10.1  3.8   8.1  2.1   8.8  3.0   9.1  4.4  2.1 - 6.4

Western List.    11   11.6  5.1   9.3  5.4   7.6  4.6   9.0  4.7   9.7  4.4   9.6  4.6   9.1  4.1  10.5  4.7   9.6  4.7  4.1 - 5.4


Table 3

Average “Hit” and “False Alarm” Percentages by Segmentation Trial Number and Group

                               Segmentation Trial

                               First   Second   Third   Group Averages

Musicians (n=16)

  Ave. Hits:                   58%     59%      60%     59%

  Ave. False Alarms:           8%      8%       8%      8%

Nonmusicians (n=13)

  Ave. Hits:                   51%     54%      53%     52%

Western Listeners (n=11)

  Ave. Hits:                   55%     52%      53%     53%

Total Averages for Hits:       55%     55%      55%     55%

Total Ave. for False Alarms:   9%      9%       9%      9%


Table 4

Average “Hit” and “False Alarm” Percentages across three Trials by Tune and Group

                               Tunes

                               1     2     3     4     5     6     7     8

Musicians (n=16)

  Ave. Hits:                   67%   53%   25%   45%   75%   81%   46%   81%

  Ave. False Alarms:           10%   8%    10%   6%    7%    9%    7%    5%

Nonmusicians (n=13)

  Ave. Hits:                   64%   35%   27%   34%   76%   78%   40%   66%

  Ave. False Alarms:           14%   11%   9%    6%    8%    10%   8%    10%

Western List. (n=11)

  Ave. Hits:                   64%   44%   17%   44%   65%   81%   40%   74%

  Ave. False Alarms:           13%   11%   10%   6%    9%    10%   9%    12%

Total Averages for Hits:       65%   44%   23%   41%   72%   80%   42%   74%

Total Ave. for F/As:           12%   10%   9%    6%    8%    10%   8%    9%


Table 5.

Average “Hit” and “False Alarm” Percentages for Musician and Nonmusician Western

Listeners

Segmentation Trial

First Second Third Group

Averages:

Musician Western List. (n=4)

Ave. Hits: 63% 55% 59% 59%


Nonmusician Western List. (n=7)

Ave. Hits: 51% 51% 50% 50%



Figure 1. Number of segmentations per tune by musicianship and musical culture.


[Figure 1 plot. x-axis: Tune (1 to 8); y-axis: Ave. no. of segmentations (0 to 14); legend: musicians (n=16), nonmusicians (n=13), Western list. (n=11), expert (n=1).]


Figure 2. Segmentation locations of third segmentations in milliseconds (x-axis) for musicians

(blue), nonmusicians (red), and Western listeners (green) for all eight makam tunes plotted in

separate but aligned x-axes.


Figure 3. Segmentation locations of first segmentations in milliseconds (x-axis) for musicians

(blue), nonmusicians (red), and Western listeners (green) for all eight makam tunes plotted in

separate but aligned x-axes.


Figure 4. Segmentation locations of second segmentations in milliseconds (x-axis) for

musicians (blue), nonmusicians (red), and Western listeners (green) for all eight makam tunes

plotted in separate but aligned x-axes.


Figure 5. Histograms for musicians (blue), nonmusicians (red), and Western listeners (green) by

note (bins, with their numbers being marked between ticks on the x-axis) for Tune 1. Dotted

black and red lines mark expert’s perceptual and musicological boundaries, respectively. To

visually differentiate them in cases of full overlap they were slightly separated out horizontally.


Figure 8. Musical score for Tune 1 with expert segmentations marked as red vertical lines,

participant percentages marked as proportions above the notes (only for cases in which at least

one participant group showed ≥ 30% agreement, and sometimes for immediately

neighboring cases, where participant segmentations “spilled over” to the next note), and note

numbers marked in white in a black ellipse below the notes. Musicological events are marked in

the original language above the notes and percentages.


Figure 11. Segmentation accuracies across three segmentation trials by group.


[Figure 11 plot. x-axis: Segmentation Trial (1st segm, 2nd segm, 3rd segm); y-axis: Hits minus False Alarms (0% to 60%); legend: Musicians, Nonmusicians, Western Listeners.]
