
Psychology of Music 2014, Vol. 42(4) 503–524
© The Author(s) 2013. Reprints and permissions: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/0305735613482023
pom.sagepub.com

The contributions of compositional structure and performance expression to the communication of emotion in music

Lena Quinto, William Forde Thompson and Alan Taylor
Macquarie University, Australia

Abstract
In this investigation, eight highly trained musicians communicated emotions through composition, performance expression, or the combination of the two. In the performance condition, they performed melodies with the intention of expressing six target emotions: anger, fear, happiness, neutral, sadness, and tenderness. In the composition condition, they composed melodies to express the same six emotions. The notated compositions were then played digitally without performance expression. In the combined condition, musicians performed the melodies they composed to convey the target emotions. Forty-two listeners heard the stimuli and attempted to decode the emotions in a forced-choice paradigm. Decoding accuracy varied significantly as a function of the channel of communication. Fear was comparatively well decoded in the composition condition, whereas anger was comparatively well decoded in the performance condition. Happiness and sadness were comparatively well decoded in all three channels of communication. A principal component analysis of cues used by musicians clarified the distinct approaches adopted in composition and performance to differentiate emotional intentions. The results confirm that composition and performance involve the manipulation of distinct cues and have different emotional capabilities.

Keywords
acoustic cues, composition, emotion perception, music cognition, performance expression

Music communicates emotion through a range of cues, including intensity (loudness), rate, pitch, modality, pitch height, timing, and timbre. There are no defining features that unambiguously communicate an emotion. Rather, emotions are probabilistically associated with many cues that vary depending on contextual factors such as the musician and the instrument (Brunswik, 1956; Juslin, 1997a).

Corresponding author:
Lena Quinto, Department of Psychology, Macquarie University, Sydney, NSW 2109, Australia. Email: [email protected]



In all musical traditions that involve the use of notation for preserving and disseminating music, composition and performance are channels of musical activity that carry emotional information. Trained musicians in such traditions typically develop high levels of competence in both skills, although they often specialize as composers or performers. Researchers have identified a range of emotional cues within both composition (e.g., Gabrielsson & Lindström, 2010; Thompson & Robitaille, 1992; Wedin, 1972) and performance (Gabrielsson & Juslin, 1996; Juslin & Timmers, 2010; Laukka & Gabrielsson, 2000; Sherman, 1928). Each channel is associated with a different set of musical cues, such that the same emotion would be expressed differently by composition and performance.

It is important to acknowledge that the distinction between composition and performance is not clear-cut and that there is often considerable overlap between the two. For example, the Suyá in Brazil do not make this distinction (Seeger, 2004), and some popular, folk, jazz, and electronic genres have no formal practice of composition distinct from performance. Even in genres that clearly differentiate the two activities, composers frequently provide expressive markings in their written scores, for example, indicating the expected pace of the music (e.g., andante, presto), and where a performer should introduce expressive actions such as a crescendo (a gradual increase in intensity) or a ritardando (a gradual decrease in the tempo of the music). In other words, the act of composition may be intertwined with conceptions of appropriate performance expression.

Nonetheless, within all musical genres that involve the use of notation for preserving musical ideas, the art of composition is associated with an abstract conception of musical structure that typically includes the set of pitches and durations that comprise the melodies, harmonies, rhythms, and timbres. Musicians understand that these primary attributes, operationally defined here as compositional structure, should remain largely constant from one performance to another. Musicians also recognize that they have freedom to manipulate other cues, such as intensity, tempo, articulation, vibrato, and timing. Such secondary attributes may hence vary from one performance to another. These actions are operationally defined here as performance expression. Together, the activities of composition and performance enable the use of a wide range of cues with which to communicate emotion.

Compositional structure
Composers of tonal and post-tonal music have the greatest control over pitch structure and temporal structure, which are typically represented through notation. Discussions of the relation between compositional structure and emotion have focused on cues such as modality, melodic structure, instrument choice, rhythm, meter, and harmony (Gabrielsson & Lindström, 2010). Pitch-related cues, especially modality, are particularly influential. Pieces written in a major key are described as happy, light, bright, and cheerful, whereas pieces written in a minor key are described as restless, sad, and mystical (Crowder, 1984; Gagnon & Peretz, 2003; Hevner, 1935). As reviewed by Gabrielsson and Lindström (2010), large pitch variability may suggest emotional states such as excitement, happiness, pleasantness, and surprise. In contrast, small pitch variability is associated with emotional states such as disgust, sadness, and fear. Both large and small variations in pitch have been associated with anger (Scherer & Oshinsky, 1977; Timmers & Ashley, 2007). Low pitch height has been associated with sadness and boredom, whereas high pitch might imply happiness, serenity, anger, and fear (Balkwill & Thompson, 1999; Gabrielsson & Lindström, 2010). These associations with pitch height are not always observed (Ilie & Thompson, 2006), suggesting that the emotional connotations of pitch height may depend on contextual cues (Gabrielsson & Lindström, 2010; Huron, Kinney, & Precoda, 2006; Schubert, 2004).

Performance expression
A musician adds performance expression to a composition by manipulating dynamics, timing, intonation, and articulation. The notes in a score are nondiscretionary attributes of any composition, whereas expressive markings are considered to be guidelines for the performer. The manner in which these guidelines are implemented will depend on the performer's interpretation of the composition. The contributions of the performer are significant: They convey varying levels of expressivity (Kendall & Carterette, 1990; Palmer & Hutchins, 2006) and they communicate specific emotions (Juslin, 2003; Sloboda, 2000).

Performers have direct control over cues such as articulation, intensity, variability in intensity, and high-frequency (HF) energy or brightness of the sound (for some instruments). Articulation may be defined as the proportion of sounded notes to silence within inter-onset intervals, and extends from legato (high proportion of sounded notes to silence) to staccato (low proportion of sounded notes to silence). Emotions such as tenderness and sadness are usually associated with legato performances, whereas emotions like happiness are often associated with staccato performances. Music played with low intensity has been associated with sadness or tenderness, and low arousal. Conversely, music played with high intensity is suggestive of anger or happiness, and high arousal (Gabrielsson & Lindström, 2010; Juslin & Laukka, 2003; Kotlyar & Morozov, 1976; Schubert, 2004). Music that contains high intensity variability has been associated with fear (Juslin, 1997b; Juslin & Laukka, 2003). The amount of HF energy in the spectrum, which for some instruments can be controlled by the musician, is related to the perceived brightness or sharpness of the tone. The degree of timbral brightness also carries emotional connotations. For example, bright or sharp timbres are associated with emotions such as anger and happiness, whereas dull timbres are associated with sadness and tenderness (Gabrielsson & Juslin, 1996; Juslin, 1997b).

Finally, tempo is under the performer's control, but the composer may provide recommendations to the performer within markings in the score. As such, we consider tempo to be both a compositional and a performance cue. Tempo may be defined as the rate at which the music is played, although attempts to model tempo illustrate the complexity of this cue in music (e.g., Cemgil, Kappen, Desain, & Honing, 2000; Honing, 2001, 2005). A fast tempo has been associated with happiness, anger and fear, whereas a slow tempo has been associated with sadness and tenderness (Hevner, 1935, 1937; Juslin & Laukka, 2003).

The present study
The aim of this investigation was to assess the emotional cues that are employed when emotion is expressed through compositional structure, performance expression and the combination of the two, and how such cues are extracted and interpreted by listeners. Eight highly trained musicians assisted in developing the stimuli. We focused on the communication of six categorical emotions that have been examined in previous investigations (Juslin & Laukka, 2003). It must be emphasized that emotional communication in music is not always captured by labels such as happy, sad, and fearful. Nonetheless, the ability to perceive categorical emotions in music appears early in development (Dalla Bella, Peretz, Rousseau, & Gosselin, 2001; Terwogt & Van Grinsven, 1991), and among adults emotional decoding in music is remarkably efficient, occurring within just a few seconds of sounded notes (Peretz, Gagnon, & Bouchard, 1998).

There were three communication channels. In the performance condition, musicians were provided with four emotionally ambiguous melodies and were asked to perform them in a manner that communicated each of six emotions: anger, fear, happiness, neutral (no emotional expression), sadness, and tenderness. These melodies were created to minimize emotional cues arising from compositional structure while still providing musicians with melodic materials onto which expressive actions could be added. In the composition condition, musicians composed their own melodies with the intention of communicating the same six emotions. Their compositions were then converted into a musical instrument digital interface (MIDI) format and recorded without performance expression. In the combined condition, musicians performed their own compositions expressively with the intention of reinforcing the emotion that was intended in the composition.

Table 1 illustrates the cues that were held constant in each communication channel and the number of stimuli in each channel. In the composition condition, melodies contained no variation in performance cues, such as intensity or expressive timing. In the performance condition, however, four emotionally ambiguous melodies were provided to musicians to control compositional structure. This strategy provided musicians with a range of melodic properties on which emotional performances were superimposed.1

This approach improved upon previous investigations that have employed musicians to create musical stimuli. Firstly, we recruited a larger sample of musicians compared to previous investigations. Studies using compositions developed for experimental purposes have typically used one musician to create the stimuli; the exception is Thompson and Robitaille (1992), who used five musicians. Secondly, previous research has mainly focused on emotional communication by either performance expression or composition, and has rarely compared the emotional capacities of these two channels. In contrast, our musicians provided emotional stimuli using composition, performance, and the combination of the two, allowing us to directly compare these channels of emotional communication.

The stimuli were decoded by a separate set of listeners to assess the success of emotional communication. For each communication channel, we expected that emotional decoding would occur at above-chance levels. Each channel may, however, be particularly well suited for expressing specific emotions because of the cues available. For example, anger is not always communicated well by compositional structure (Thompson & Robitaille, 1992), but it should be effectively conveyed by performance expression (Juslin & Laukka, 2003). Across communication channels, it was expected that happiness and sadness would be reliably decoded (Kallinen, 2005). We expected tenderness to be clearly decoded in the composition condition (Thompson & Robitaille, 1992), whereas it might be confused with sadness when only performance cues are available (Gabrielsson & Juslin, 1996). Sadness and tenderness are both expressed with low arousal (energy) levels but differ in valence (positive or negative evaluation) levels. Performance expression is unlikely to be used differently between sadness and tenderness because performance cues (tempo, intensity, and articulation) may be used similarly to communicate low arousal but not different valence levels. In contrast, compositional cues such as mode may be a better indication of valence and may therefore differentiate these two emotions.

Table 1. Characteristics of the communication channels showing the number of stimuli and the dimension that was held constant.

Channel       Held constant             Number of stimuli
Composition   Performance expression    48 exemplars (8 musicians × 6 emotions)
Performance   Compositional structure   192 exemplars (8 musicians × 6 emotions × 4 melodies)
Combined      Neither                   48 exemplars (8 musicians × 6 emotions)

Finally, it was expected that emotional decoding would be better when compositional structure and performance expression were combined than when either set of cues was presented in isolation. This prediction was based on the assumption that emotional decoding occurs by a process of evaluating all available cues and their association with different emotions (see Juslin, 2000). Combining cues from compositional structure with those from performance expression should provide listeners with a larger source of information from which all cues are used to decode emotional meaning.

Two analyses of musical cues were conducted. The first analysis identified cues that varied significantly as a function of emotional intention in each communication channel. Table 2 summarizes associations that have been observed between the emotional intentions of musicians and several cues that signal emotion: modality, pitch height, average interval size, interval range, articulation, intensity, intensity variability, HF energy, and tempo (see Gabrielsson & Lindström, 2010; Juslin & Laukka, 2003). These cues were expected to change similarly in our stimuli. In the second analysis, the cues associated with our musical stimuli were used to extract principal components. We then assessed the relationship between these principal components and listeners' responses.

Method

Materials

Musicians. Eight musicians participated in the production task: four violinists and four vocalists. Violin and voice were selected because both instruments have high potential to express emotion. In particular, they allow fine-grained control over a range of cues, with relatively few biomechanical constraints on their expressive capacity; other instruments, such as harpsichord, are more restricted in the cues that can be manipulated. We did not anticipate that emotional communication would differ between vocalists and violinists, and the study was not optimally designed for such a comparison. However, production constraints associated with these two groups of musicians are somewhat different, and vocalists and violinists may differ in their capacity to communicate emotion. For example, violinists are not constrained by their vocal range and may more readily communicate emotions that are associated with large pitch changes. Additionally, associations between particular instruments and emotions (e.g., xylophone and happiness; Schutz, Huron, Keeton, & Loewer, 2008) could lead to priming effects: Listeners may be primed to recognize certain emotions depending on the instrument.

Table 2. Predicted direction of the various cues.

Cue                    Anger     Fear      Happiness  Neutral  Sadness  Tenderness
Modality               Minor     Minor     Major      Major    Minor    Major
Mean F0                Unclear   High      High       Low      Low      High
Average interval size  Small     Unclear   Large      Small    Small    Small
Range                  Unclear   Large     Large      Small    Small    Small
Articulation           Unclear   Staccato  Staccato   Legato   Legato   Legato
Intensity              High      Low       High       Low      Low      Low
Intensity variability  High      High      Low        Low      Low      Low
HF energy              Greater   Lesser    Greater    Lesser   Lesser   Lesser
Tempo                  Fast      Fast      Fast       Slow     Slow     Slow

Musicians were recruited through advertisements to local performance groups (e.g., choirs, orchestras, musical theatre societies). Musicians who agreed to participate in the study all expressed comfort and enthusiasm at being able to express emotions through performance and composition. Originally, 15 musicians took part and recordings from eight were used for the investigation. The criteria for selecting the eight musicians were as follows: (a) they successfully completed the tasks for all conditions; (b) two judges with at least 10 years of music training independently confirmed that the intended emotion was expressed in every condition.

All musicians were currently performing and had completed higher-level examinations for their instruments. Four had reached the level of Associate of Music, which is the highest grade offered by the Australian Music Examinations Board (AMEB). One musician had reached the 8th grade (the penultimate level) and another had completed the 6th AMEB grade. Two of the vocalists who had not completed formal examinations had been actively singing for 17 and 20 years, respectively. The musicians had an average age of 27.63 (SD = 9.08) years, with 15 (SD = 3.89) years of formal training and an average of 21.15 (SD = 9.41) years of performing. All were paid for their participation.

Conditions. There were three communication channels: composition, performance, and combined. In each channel, musicians created stimuli with the intention of communicating the emotions of anger, fear, happiness, neutral, sadness, and tenderness. No instructions were given to communicate emotion using a particular strategy, or a specific style or genre. Musicians were recorded in a quiet room using a high-quality microphone (Røde K2 condenser microphone) and their output was saved in digital format using Cubase 4.

Composition stimuli. We asked musicians to compose brief melodies with the intention of expressing the aforementioned emotions. Compositions were limited to a maximum of 9 notes (range = 5–9 notes, average = 7.40 notes). The compositions were brief and monophonic to ensure that musicians would focus on emotional communication and not on compositional goals associated with large-scale structure. Musicians did not report feeling constrained by this task, and some chose to use fewer than the maximum 9 notes. An additional reason for the short phrases was to allow the listening component of the experiment to be completed in a timely manner. Generally, listeners are capable of identifying the emotional intention of a piece of music after hearing only a few tones (Vieillard et al., 2008). Recent research has shown that the consistency of emotional ratings for brief clips of music (under 1 s) is similar to excerpts that are 15 s in length (Krumhansl, 2009). Other work suggests that musical excerpts that are only 1 s in length can induce emotions that are similar to those induced by longer excerpts (Bigand, Vieillard, Madurell, Marozeau, & Dacquet, 2005).

Our procedure resulted in 48 compositions (8 musicians × 6 emotions). Compositions were manually notated in MIDI format using Cubase. A Roland Super JV-1080 64-voice synthesizer module with 4 expansion modules was used to select the timbre of either a violin or voice. The timbres used were 41 for the violin from the XPA preset bank and 54 for the voice from the D (GM) preset bank. The timbre was dependent on the original instrument of the musician. The tempo of each melody was matched to the tempo as performed by the musician in the combined condition. We excluded changes in dynamics, deviations in timing, articulation, and expressive markings. Figure 1 (a–f) shows notated exemplars for phrases composed by the musicians.

Performance stimuli. The first author composed four musical phrases, each between seven and nine notes in length (mean = 7.75 notes). The average length of the performance and composition stimuli differed because the experimenters created the melodies used for the performance condition, whereas composers created their own melodies and were free to use up to a maximum of nine notes. Four melodies were used to provide some melodic diversity for participants in this condition. The melodies were developed to be unfamiliar and emotionally ambiguous. Emotional ambiguity was achieved by constructing melodies that were compatible with both major and minor modes (e.g., no mediant) and exhibited minimal rhythmic complexity. They were characterized by an intermediate degree of pitch variation (mean range = 9.25 semitones), which is slightly larger than the neutral melodies composed by the musicians (8.15 semitones). Figure 1 (g & h) shows notated exemplars for two phrases intended to be emotionally ambiguous. Musicians performed the melodies with the intention of expressing the six indicated emotions. Overall, the performance condition yielded 192 performances (8 musicians × 4 melodies × 6 emotions).

Figure 1. Notated exemplars of the melodies: melodies composed by musicians to express anger (a), fear (b), happiness (c), neutral (d), sadness (e) and tenderness (f), and two experimental melodies, used in the performance condition, intended to be emotionally ambiguous (g & h).

Combined stimuli. Musicians performed their own compositions in a manner that reinforced the emotion that was intended in each composition. This procedure resulted in 48 performed compositions (8 musicians × 6 emotions).

Decoding experiment

Listeners. Forty-two undergraduate students at Macquarie University participated as listeners in the experiment (34 females and 8 males) in exchange for course credit. The average age was 21.93 (SD = 7.08) years, with an average of 5.47 (SD = 6.57) years of formal music training.

Procedure. Each participant was exposed to only half of the performance stimuli because there were a large number of trials using the same four melodies in the performance condition. Participants heard performances from two violinists and two vocalists, which were randomly and independently assigned (96 trials). They heard every stimulus in the composition (48 trials) and combined conditions (48 trials). The three communication channels were presented in a random order that was determined independently for each participant. Participants underwent six practice trials to ensure that they were familiar with the response format. After each trial was played, they judged which of the six emotions they believed the performer was attempting to convey.

Analysis of cues

Cues. All the stimuli that were created by the musicians underwent analysis. The cues were measured in Praat (Boersma & Weenink, 2009) and the MIRToolbox (Lartillot, Toiviainen, & Eerola, 2008). For some of the cues extracted by Praat, it was first necessary to manually identify the start and end time of notes and rests, along with any large pitch glides (using text grids).

Mode. Mode was assessed as an estimate of how major or minor the excerpt is. Values ranged from −1 (minor) to +1 (major).

Mean fundamental frequency (F0). Mean F0 or mean pitch refers to the average pitch height of a phrase.

Mean interval size. Mean interval size was measured by finding the average pitch of each tone and determining the musical interval (in semitones) between sequential tones. These distances were averaged over each phrase.
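
To make the semitone arithmetic concrete, the following illustrative Python sketch converts successive per-note F0 values into interval sizes via 12·log2(f2/f1) and averages them; the helper name and example frequencies are ours, not part of the original analysis pipeline.

```python
import numpy as np

def semitone_intervals(note_f0s):
    """Absolute melodic interval sizes (semitones) between successive tones,
    given each tone's mean F0 in Hz. Illustrative helper, not from the paper."""
    f0 = np.asarray(note_f0s, dtype=float)
    return 12.0 * np.abs(np.log2(f0[1:] / f0[:-1]))

# Example: A4 -> C5 -> G4 (440, 523.25, 392 Hz) gives intervals of ~3 and ~5
ivals = semitone_intervals([440.0, 523.25, 392.0])
print(ivals.mean())                   # mean interval size for the phrase
print(12 * np.log2(523.25 / 392.0))   # semitone range: highest over lowest tone
```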

Range. The range in semitones for each stimulus was measured by calculating the distance between the lowest and highest frequency.


Articulation. Articulation refers to whether a sequence of notes is played connected (legato) or detached (staccato). The articulation of a given tone was calculated by dividing the duration of the executed tone by the time between the onset of that tone and the onset of the subsequent tone (or rest). Note onsets and offsets were manually indicated with text grids in Praat.
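
A minimal sketch of the per-tone articulation ratio just described, assuming note onset and offset times (in seconds) have already been read from the annotated text grids; the array names are ours:

```python
import numpy as np

def articulation_ratios(onsets, offsets):
    """Sounded duration of each tone divided by its inter-onset interval.
    Values near 1.0 indicate legato; smaller values indicate staccato.
    The final tone has no following onset and is excluded."""
    onsets = np.asarray(onsets, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    ioi = np.diff(onsets)              # onset-to-onset intervals
    sounded = (offsets - onsets)[:-1]  # executed duration of each tone
    return sounded / ioi

print(articulation_ratios([0.0, 0.5, 1.0], [0.45, 0.75, 1.40]))
# -> [0.9, 0.5]: the first transition is legato-like, the second more detached
```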

Intensity level. Intensity level refers to the perceptual sense of loudness. It is a measure of the energy in an acoustic signal and was assessed in decibels using Praat. Intensity level was calculated as the difference in intensity between each emotion and the neutral performance for the same musician. This method was used because there may have been potential differences in the recording levels between musicians.

Intensity variability. Intensity variability (Intensity SD) refers to changes in loudness. It was calculated as the standard deviation of intensity level for each stimulus. Periods of silence were excluded with the use of text grids.
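
Both intensity measures can be approximated in Python with the parselmouth interface to Praat. In the sketch below, a simple decibel threshold stands in for the TextGrid-based silence exclusion described above, and the file names are hypothetical:

```python
import parselmouth  # Python interface to Praat; assumed available

def intensity_stats(wav_path, silence_floor_db=40.0):
    """Mean intensity and intensity SD (dB), dropping near-silent frames."""
    snd = parselmouth.Sound(wav_path)
    contour = snd.to_intensity().values.flatten()  # dB value per analysis frame
    voiced = contour[contour > silence_floor_db]   # crude silence exclusion
    return voiced.mean(), voiced.std()

# Relative intensity level: an emotion minus the same musician's neutral
m_anger, _ = intensity_stats("musician1_anger.wav")
m_neutral, _ = intensity_stats("musician1_neutral.wav")
print(m_anger - m_neutral)  # positive values indicate louder than neutral
```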

High-frequency energy. High-frequency energy (HF energy) refers to the percentage of HF energy in the spectrum above 3,000 Hz. This variable has been shown to contribute to the brightness of the sound. It was measured with the MIRToolbox using the mirbrightness command.
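
This brightness measure can be approximated as the proportion of spectral energy above the 3,000 Hz cut-off. The sketch below computes it from a single magnitude spectrum, which differs in detail from the MIRToolbox implementation:

```python
import numpy as np
from scipy.io import wavfile

def hf_energy_ratio(wav_path, cutoff_hz=3000.0):
    """Fraction of total spectral energy above cutoff_hz (0.0 to 1.0)."""
    rate, samples = wavfile.read(wav_path)
    samples = samples.astype(float)
    if samples.ndim > 1:               # mix stereo down to mono
        samples = samples.mean(axis=1)
    power = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    return power[freqs >= cutoff_hz].sum() / power.sum()
```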

Tempo. Tempo is the average speed at which the piece is performed. The tempo of each stimulus was calculated by dividing the number of beats by the duration of the performance up to the onset of the last note (in seconds), and then multiplying by 60 to give the number of beats per minute (bpm).
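
In code, this reduces to a one-line conversion; the example values are ours:

```python
def tempo_bpm(seconds_to_last_onset, n_beats):
    """Beats per minute: beats elapsed by the final note onset, divided by
    the elapsed time in seconds, scaled to one minute."""
    return 60.0 * n_beats / seconds_to_last_onset

print(tempo_bpm(4.2, 7))  # 7 beats over 4.2 s -> 100.0 bpm
```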

Table 3. Percentage of responses for each emotion and communication channel (rows sum to 100).

                          Listener response
Condition     Intended    Anger   Fear    Happiness  Neutral  Sadness  Tenderness
Composition   Anger       24.11   27.68    4.17      12.50    19.64    11.90
              Fear        12.80   47.92    4.17       8.63    16.37    10.12
              Happiness    3.87    3.57   50.89      17.56     4.17    19.94
              Neutral      5.06    5.95   16.96      29.46    15.77    26.79
              Sadness     11.01   20.24    3.87      16.67    33.93    14.29
              Tenderness   6.85   12.50   18.15      18.15    17.56    26.79
Performance   Anger       45.98   10.27   17.11      14.14     5.21     7.29
              Fear         8.04   31.70    5.80      16.37    24.11    13.99
              Happiness    8.48    5.36   43.60      21.13     6.25    15.18
              Neutral      3.72    5.36   10.71      34.82    20.83    24.55
              Sadness      1.19    7.44    2.38      12.50    52.08    24.40
              Tenderness   1.93    5.51    4.61      15.77    39.73    32.44
Combined      Anger       37.50   14.58    6.25      16.67    19.94     5.06
              Fear        13.69   49.70    2.68      12.80    16.37     4.76
              Happiness    2.08    2.98   71.43       9.82     4.17     9.52
              Neutral      2.38    3.57   16.07      38.10    17.26    22.62
              Sadness      2.38   10.12    4.46       9.82    57.14    16.07
              Tenderness   2.68    6.55    7.14      18.45    33.93    31.25

Note. Correct identifications, where the listener response matched the intended emotion, lie on the diagonal of each block.


Results

Decoding results
Rosenthal and Rubin's (1989) index, π, was used to determine whether the least well-decoded emotion was identified at above-chance levels. This procedure controls for the number of response options in multiple-choice-type data. If the least well-decoded emotion is chosen significantly more often than would be expected by chance, then it can be concluded that the other emotions were also decoded better than would be expected by guessing. The confusion matrix (proportion of listener responses) is shown in Table 3. The emotion of anger in the composition condition was the least well decoded, at 24.11% correctly identified. This value is significantly above that expected by chance, p < .001, with an estimated effect size of π = .62 ± .08 (95% confidence limits), given a chance level of 16.6% (one in six).
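
For reference, Rosenthal and Rubin's index converts a hit rate P obtained with k response alternatives into the proportion correct expected on an equivalent two-alternative task. Substituting the values reported above:

```latex
\pi = \frac{P\,(k - 1)}{1 + P\,(k - 2)}
\qquad\Rightarrow\qquad
\pi = \frac{.2411 \times (6 - 1)}{1 + .2411 \times (6 - 2)} \approx .61
```

At chance performance (P = 1/k) the index equals .50; the slight difference from the reported .62 is consistent with rounding of the hit rate.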

Accuracy
The average accuracy in the three communication channels and six emotions was calculated for each participant, averaged across the eight performers. Figure 2 displays the means and standard errors for the percentage of correct responses. To determine whether there were significant differences in accuracy across the three communication channels and six emotions, a repeated-measures 3 × 6 (communication channel [performance, composition, combined] × emotion [anger, fear, happiness, neutral, sadness, tenderness]) analysis of variance (ANOVA) was performed. The ANOVA with accuracy as the dependent variable revealed a significant main effect of communication channel, F(2, 82) = 38.87, p < .001, ηp² = .49. Across all emotion categories, the combined condition was best decoded (M = 47.19, SD = 31.61), the performance condition was intermediate (M = 39.58, SD = 25.15), and the composition condition was least well decoded (M = 35.47, SD = 30.82). This finding suggests that decoding occurs by evaluating the accumulation of evidence, rather than by using specific criteria. Participants had access to a greater number of cues in the combined condition than in the performance or composition conditions, and should be in a better position to determine the intended emotion.

Figure 2. The mean accuracy for all six emotions and three communication channels. Standard error bars are shown.
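
For readers wishing to reproduce this style of analysis, the sketch below runs the same 3 × 6 repeated-measures design with statsmodels; the file and column names are hypothetical, and this is not necessarily the software used originally.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format table: one accuracy score per participant for each
# channel x emotion cell, already averaged over the eight musicians.
scores = pd.read_csv("accuracy_long.csv")
# assumed columns: participant, channel, emotion, accuracy

anova = AnovaRM(scores, depvar="accuracy", subject="participant",
                within=["channel", "emotion"]).fit()
print(anova)  # F tests for channel, emotion, and their interaction
```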

There was also a main effect of emotion, F(5, 205) = 19.15, p < .001, ηp² = .32. Across communication channels, the emotions of happiness (M = 52.38, SD = 49.99) and sadness (M = 48.81, SD = 50.00) were the most reliably decoded; anger (M = 36.23, SD = 48.08) and fear (M = 40.25, SD = 49.01) were decoded at intermediate levels; and emotionally neutral (M = 32.38, SD = 46.81) and tender stimuli (M = 31.73, SD = 46.15) were the least well decoded. Differences in decoding accuracy between emotions have previously been observed in music (e.g., Kotlyar & Morozov, 1976; Terwogt & Van Grinsven, 1991).

The main effects were qualified by a significant emotion × communication channel interaction, F(10, 410) = 12.50, p < .001, ηp² = .23. Tests of simple effects with Bonferroni correction were performed to explore the interaction. In the composition condition, the emotions of happiness and fear were better decoded than the other emotional intentions. There was no difference in the decoding of happiness and fear. Happiness was significantly better decoded than sadness, t(41) = 4.47, p < .001; anger, t(41) = 6.77, p < .001; neutral, t(41) = 5.35, p < .001; and tenderness, t(41) = 5.47, p < .001. Fear was significantly better decoded than sadness, t(41) = 3.59, p = .02; anger, t(41) = 5.47, p < .001; neutral, t(41) = 4.51, p < .001; and tenderness, t(41) = 4.80, p < .001. No other comparisons were significantly different.

In the performance condition, the emotions of sadness and happiness were the most accurately decoded. Pairwise tests revealed that sadness was better decoded than fear, t(41) = 5.51, p < .001; neutral, t(41) = 5.08, p < .001; and tenderness, t(41) = 4.78, p < .001. Happiness was better decoded than fear, t(41) = 3.72, p = .01, and marginally better decoded than tenderness, t(41) = 3.29, p = .06. Decoding accuracy for sadness and happiness did not significantly differ. No other comparisons were significantly different.

In the combined condition, the emotions of happiness, sadness and fear were the most accurately decoded. Pairwise comparisons indicated that happiness was better decoded than fear, t(41) = 5.05, p < .001; anger, t(41) = 8.78, p < .001; neutral, t(41) = 7.57, p < .001; and tenderness, t(41) = 6.77, p < .001. Sadness was the second best decoded emotion and was significantly better decoded than anger, t(41) = 4.18, p = .004; neutral, t(41) = 4.04, p = .006; and tenderness, t(41) = 8.74, p < .001. Fear was better decoded than anger, t(41) = 5.04, p = .001, and tenderness, t(41) = 3.85, p = .007. No other significant differences were observed.

We next assessed whether adding performance expression to compositional structure benefited emotional communication. Decoding accuracy was better in the combined condition than in the composition condition for the emotions of anger, t(41) = 3.62, p = .05; happiness, t(41) = 5.39, p < .001; and sadness, t(41) = 6.27, p < .001. No other significant differences were observed between the composition and combined conditions.

Decoding accuracy for the performance condition was not compared to decoding accuracy for the other conditions because performers operated on different compositions and different numbers of stimuli. Nonetheless, our results confirm that when happiness, sadness or anger were expressed with compositional structure, adding performance expression enhanced communication of these emotions. Performance expression alone is remarkably effective for emotional communication, especially for the emotions of anger and sadness (Juslin & Laukka, 2003). Our results support the view that performance and composition are somewhat specialized for the expression of certain emotions. Interestingly, the addition of performance expression to compositional structure may not increase decoding accuracy for all emotions, but it may enhance the decoding of emotions that are difficult to convey compositionally, such as anger.


The role of timbre
We assessed whether there were differences in the expression of emotions between the two instruments. A repeated-measures 2 × 6 (instrument [violin and vocal] × emotion [anger, fear, happiness, neutral, sadness, tenderness]) ANOVA was performed. There was no significant main effect of instrument on accuracy, F(1, 41) = 2.41, p = .13, ηp² = .06, and the significant main effect of emotion remained, F(5, 205) = 15.25, p < .001, ηp² = .27. There was a significant interaction between emotion and instrument, F(5, 205) = 10.92, p < .001, ηp² = .21. Tests of simple effects showed that violinists expressed fear more effectively (M = 45.24, SD = 21.68) than vocalists (M = 35.27, SD = 17.98), t(41) = 3.57, p = .02. Similarly, violinists expressed happiness more effectively (M = 59.08, SD = 21.96) than vocalists (M = 45.68, SD = 19.15), t(41) = 4.46, p < .001. Tenderness, however, was better expressed by vocalists (M = 38.34, SD = 19.05) than violinists (M = 22.62, SD = 11.87), t(41) = 5.06, p < .001. These findings demonstrate that emotional decoding sometimes depends on the instrument, but both violinists and vocalists were equally capable of communicating emotion overall.

Analysis of cues
Cues used by musicians. A series of multilevel mixed linear model (MLM) analyses was conducted to determine whether the cues varied systematically depending on the intended emotion and communication channel. The eight musicians were collectively treated as a random factor. The degrees of freedom vary for some analyses due to the Satterthwaite approximation (SPSS, 2002). Each of the nine cues identified in our analysis of the stimuli (see Method section) was entered as a dependent variable into the model with emotion as an independent variable. All pairwise tests were performed with Bonferroni correction. Communication channel varied between analyses depending on the cue being examined. For example, the cue of intensity was only examined in the performance and combined conditions because it did not vary in the composition condition.
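
A minimal sketch of one such model in Python with statsmodels, under hypothetical file and column names (the original analyses were run in SPSS; note that statsmodels reports Wald z statistics rather than Satterthwaite-adjusted t tests):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format stimulus table: one row per exemplar, with the
# measured cue values, intended emotion, channel, and producing musician.
stimuli = pd.read_csv("stimulus_cues.csv")

# One cue as the dependent variable, emotion and channel as fixed effects,
# and a random intercept for each musician.
model = smf.mixedlm("articulation ~ C(emotion) * C(channel)",
                    data=stimuli, groups=stimuli["musician"])
print(model.fit().summary())
```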

Compositional cues. The compositional cues (mode, mean F0, average interval size, range) were first entered into a series of 2 × 6 (communication channel [composition, combined] × emotion [anger, fear, happiness, neutral, sadness, tenderness]) MLM analyses, with the eight musicians as a random variable. A preliminary analysis revealed that, not surprisingly, there were no significant interactions or differences between communication channels for any of the variables.2

Table 4. Means for compositional cues (composition and combined conditions) for each emotion (standard deviations in parentheses).

Cue                     Anger            Fear             Happiness        Neutral          Sadness          Tenderness
Mode                    −0.08 (0.10)     −0.07 (0.08)     0.09 (0.14)      0.09 (0.14)      −0.11 (0.14)     0.01 (0.12)
Mean F0 (Hz)            318.07 (70.27)   426.31 (81.30)   443.91 (44.37)   386.60 (58.39)   414.56 (70.23)   445.98 (48.92)
Average interval size   3.23 (1.28)      2.53 (1.59)      3.22 (1.36)      2.45 (0.88)      2.29 (0.42)      2.82 (0.97)
Semitone range          9.53 (4.45)      8.98 (3.76)      10.55 (2.53)     8.15 (2.98)      7.92 (1.99)      9.14 (3.34)

No significant interactions between communication channel and emotion were observed because performers adhered closely to the notated compositional structure. The communication channels were collapsed. The means and standard deviations for these variables are shown in Table 4. There were 96 exemplars (8 musicians × 6 emotions × 2 channels) in this analysis.

An MLM analysis with mode as the dependent variable revealed a significant main effect of emotion, F(5, 83) = 8.92, p < .001. Mode values were lower for the emotions of anger, fear, and sadness, and higher for the emotions of happiness, neutral, and tenderness.

The MLM analysis with mean pitch as the dependent variable also revealed a main effect of emotion, F(5, 83) = 10.99, p < .001. The emotion with the lowest mean pitch was anger, and this was significantly lower than fear, happiness, sadness and tenderness, ts(83) > 4.69, ps < .005. No other differences were significant.

There were no significant differences in interval size, F(7, 83) = 2.23, p = .059, or in range, F(7, 83) = 1.70, p = .14, between emotions. However, there was a trend whereby the average interval size was larger for happiness and for anger than for sadness (see Table 4).

Performance cues. The cues of articulation, intensity, intensity variability, and HF energy were subjected to a series of 2 × 6 (communication channel [performance, combined] × emotion [anger, fear, happiness, neutral, sadness, tenderness]) MLM analyses. There were 192 exemplars in the performance condition (8 musicians × 4 melodies × 6 emotions) and 48 in the combined condition (8 musicians × 6 emotions).

The MLM analysis with articulation as the dependent variable revealed a main effect of emotion, F(5, 221) = 25.50, p < .001. Sadness, tenderness, and neutral were played with higher articulation (more legato) than anger, fear, and happiness. This same analysis showed a main effect of condition, F(1, 221) = 4.98, p = .027, where the performance pieces (M = 95.96, SD = 6.44) were played with greater legato articulation than pieces in the combined condition (M = 93.84, SD = 11.63). Finally, there was a significant interaction, F(5, 221) = 4.86, p < .001. Happiness was played with greater legato articulation in the performance condition (M = 90.42, SD = 9.57) than in the combined condition (M = 78.16, SD = 18.96), t(221) = 4.81, p < .001.

Figure 3. The mean tempo for all six emotions and three communication channels. Standard error bars are shown.


The MLM analysis with intensity as the dependent variable revealed a main effect of emotion, F(5, 221) = 22.74, p < .001. The emotions of anger (M = 4.87, SD = 3.07) and happiness (M = 1.51, SD = 3.74) were played with greater intensity than neutral. The emotions of sadness (M = −2.25, SD = 3.49), fear (M = −1.81, SD = 3.63), and tenderness (M = −0.70, SD = 2.42) were played with less intensity than neutral. There was no main effect of communication channel, nor was there an interaction.

The MLM analysis with intensity variability as the dependent variable revealed a main effect of emotion, F(5, 221) = 17.81, p < .001. The emotion of fear (M = 4.73, SD = 1.57) was played with the greatest intensity variability, followed by the emotions of sadness (M = 5.81, SD = 1.67), tenderness (M = 4.11, SD = 1.07), and happiness (M = 3.64, SD = 1.27). Anger (M = 3.44, SD = 1.22) and neutral (M = 3.54, SD = 0.92) were played with less intensity variability than the other emotions. There was also a main effect of condition, F(1, 221) = 4.63, p = .03. There was more intensity variability in the performance condition (M = 3.99, SD = 1.24) than in the combined condition (M = 3.58, SD = 1.43). The interaction was not significant.

For the cue of HF energy, there was no significant main effect of condition, F(1, 221) = .78, p = .38, or emotion, F(5, 221) = 1.44, p = .21, nor was there a significant interaction, F(5, 221) = 1.35, p = .24.

Combined cue. The MLM analysis with tempo as a dependent variable, and the three communication channels and six emotions as the independent variables, revealed a marginally significant main effect of condition, F(2, 263) = 2.80, p = .06. The performance condition had a faster tempo (M = 108.77, SD = 28.79) than the composition (M = 102.83, SD = 23.72) and combined (M = 105.57, SD = 27.43) conditions. There was a significant main effect of emotion, F(5, 263) = 10.75, p < .001. Happiness was played the fastest, anger and neutral were intermediate, whereas fear, sadness and tenderness were played the slowest. There was also a significant interaction between communication channel and emotion, F(10, 263) = 2.88, p = .002. The results are shown in Figure 3. This interaction arose because anger was played with a faster tempo in the performance condition as compared to the composition, t(263) = 2.54, p = .01, and the combined conditions, t(263) = 2.38, p = .02. The other pairwise comparisons were not significant.

Principal component analysis
A principal component analysis (PCA) was performed to achieve a better understanding of how the cues were modified by the musicians. The PCA allowed us to reduce the number of variables, or cues, into fewer components to explain how musicians used these cues in combination. To compare stimuli across emotions and communication channels, all variables were centered with means of zero before being entered into the PCA with varimax rotation. Varimax rotation was used because the cues were not strongly correlated with each other. The use of varimax rotation meant that each component derived from the PCA would be uncorrelated with the other components, allowing us to focus on the unique predictive power of each component. The results revealed that four components reached an eigenvalue greater than one; these components were retained. Eigenvalues greater than one indicate that the component explains a larger proportion of the variance than is contributed by any one cue, and that the component meaningfully contributes to explaining the variance in the data. The retained components accounted for 62.04% of the variance in the use of cues by musicians. Table 6 shows the rotated component matrix and the loadings of the cues on each component.
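
For readers who wish to reproduce this kind of cue-space reduction, a compact numpy sketch is given below. The varimax routine follows the standard published algorithm; the cue matrix here is random placeholder data, and a correlation matrix is used so that the eigenvalue-greater-than-one retention rule has its usual meaning. This is an illustration, not the original analysis code.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a loading matrix (standard algorithm)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    last = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # One step of the varimax criterion, solved via SVD
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3
                          - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p))
        rotation = u @ vt
        if s.sum() < last * (1 + tol):
            break
        last = s.sum()
    return loadings @ rotation

rng = np.random.default_rng(0)
cues = rng.standard_normal((288, 9))        # placeholder: stimuli x nine cues

corr = np.corrcoef(cues, rowvar=False)      # standardized cue space
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]           # components by explained variance
keep = eigvals[order] > 1.0                 # retain eigenvalues > 1
loadings = eigvecs[:, order][:, keep] * np.sqrt(eigvals[order][keep])
print(varimax(loadings).round(3))           # rotated matrix, as in Table 6
```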


Factor loadings greater than .40 were considered to strongly load onto that component. The first component was most strongly correlated with high average interval size and large semitone range. This might reflect melodic excursion. The second component was most strongly correlated with high HF energy, low intensity, and legato articulation. We called this component soothing/connectedness. The third component was most closely related to a high pitch and a major mode and reflects pitch valence. Finally, the fourth component was most closely associated with a fast tempo and low intensity variability. This component might be conceptualized as drive.

The mean principal component scores for each emotion are shown in Table 5. This illustrates the strength of the components and associated cues across communication channels. Anger was associated with high melodic excursion, low soothing/connectedness, intermediate pitch valence and high drive. Happiness showed a similar pattern, except with a higher loading on pitch valence (the third component). Fear was associated with intermediate scores on most of the components except the fourth component. Fear was associated with less drive, or a slow tempo and high intensity variability. Sadness was associated with low melodic excursion, high soothing/connectedness, low pitch valence and low drive. Tenderness differed from sadness by having more intermediate values on these scales.

Cues used by listeners. The centered PCA scores were entered into an MLM analysis to determine which components were significantly associated with listeners' responses for each emotion. The dependent variable was the proportion of listener judgments for a given emotion, regardless of whether this response was correct. Table 7 shows the associated statistics and demonstrates that the components that emerged as significant generally matched the components associated with how musicians encoded emotions. However, not all components predicted listeners' responses. Soothing/connectedness and drive (articulation, intensity, HF energy, tempo, and intensity variation) were most frequently associated with listeners' responses.

Table 5. Average component scores for each emotion, including interpretation (standard deviations in parentheses).

Emotion      C1            C2            C3            C4            Description
Anger        .41 (1.90)    −.82 (1.23)   .09 (1.66)    .63 (1.06)    High melodic excursion, low soothing/connectedness, intermediate pitch valence, high drive
Fear         .02 (2.07)    .16 (1.36)    .03 (1.16)    −.40 (1.13)   Intermediate melodic excursion, intermediate soothing/connectedness, intermediate pitch valence, low drive
Happiness    .50 (1.81)    −.53 (1.70)   .38 (1.32)    .58 (0.91)    High melodic excursion, low soothing/connectedness, high pitch valence, high drive
Neutral      −.33 (1.57)   .28 (0.89)    .04 (1.07)    .38 (0.92)    Low melodic excursion, high soothing/connectedness, intermediate pitch valence, high drive
Sadness      −.47 (1.30)   .64 (1.27)    −.37 (1.14)   −.75 (1.23)   Low melodic excursion, high soothing/connectedness, low pitch valence, low drive
Tenderness   −.13 (1.58)   .28 (1.12)    .07 (1.11)    −.45 (1.03)   Low melodic excursion, high soothing/connectedness, intermediate pitch valence, low drive


Discussion
The current investigation examined the range and nature of cues used in performance and composition to communicate emotion, and the ability of listeners to decode emotion from these conditions separately and in combination. Our findings illustrate that (a) listeners decoded intended emotions at above-chance levels; (b) decoding was dependent on the intended emotion and communication channel; (c) the range of cues varied significantly with emotional intentions; (d) the cues were divided into unique components reflecting either performance or compositional cues; and (e) judgments of emotion were associated with these components.

Table 7. The results from the multilevel mixed linear analysis using the principal components to predict listeners' responses. C indicates the component, t is the associated t-test value (degrees of freedom in parentheses), and R² indicates the variance accounted for by the model.

Emotion      C   t (df)              R²      Description
Anger        2   4.91*** (246)       8.45    Low soothing/connectedness
             4   3.86*** (263)               High drive
Fear         4   2.91** (281)        7.99    Low drive
Happiness    1   3.30*** (280)       28.52   High melodic excursion
             2   1.91a (167)                 Low soothing/connectedness
             3   4.61*** (243)               High valence
             4   8.31*** (203)               High drive
Neutral      4   4.92*** (156)       6.91    High drive
Sadness      1   .35*** (280)        33.06   Low melodic excursion
             2   4.44*** (172.85)            High soothing/connectedness
             3   3.24*** (247)               Low valence
             4   8.66*** (208)               Low drive
Tenderness   2   3.09** (249)        16.07   High soothing/connectedness
             4   .197* (265)                 Low drive

Note. DF vary because of the Satterthwaite adjustment for mixed linear models. ***p < .001; **p < .01; *p < .05; a p = .057.

Table 6. Results of the principal components analysis.

Cue                    C1      C2      C3      C4
Average interval size  .904*   .001    .087    .002
Mean F0                .138    .154    .723*   .253
Semitone range         .914*   .036    .028    .047
Intensity              .207    .592*   .393    .189
Intensity SD           .207    .402    .315    .489*
Tempo                  .018    .066    .122    .861*
Mode                   .015    .004    .654*   .281
HF energy              .046    .687*   .228    .114
Articulation           .050    .587*   .068    .212

Note. Asterisks mark the highest loading for each cue across the four components.


Overall, the results confirm that composition and performance can be used independently or in combination to communicate emotion. As expected, not all emotions were decoded equally well (see also Terwogt & Van Grinsven, 1991; Thompson & Robitaille, 1992). In the composition condition, happiness and fear were the most successfully recognized. In the performance condition, happiness and sadness were the most clearly decoded. In the combined condition, it was these three emotions that were better decoded in comparison to anger, tenderness, and neutral. The addition of performance expression to compositional structure improved the decoding of anger, happiness, and sadness in the combined condition. This suggests that in some cases there may be an advantage to having more cues available.

Our investigation focused on emotional communication by violinists and vocalists because musical instruments vary in the range of cues that can be physically manipulated. Overall, listeners decoded the emotions expressed by both instruments equally well (for more work on timbre and emotional communication, see Hailstone et al., 2009; Schutz et al., 2008). Our findings revealed that fear and happiness were better decoded by listeners when expressed by violinists than vocalists. It may have been easier for violinists than for vocalists to execute the large pitch intervals associated with happiness. Violinists also may have expressed fear using performance cues such as tremolo, which are easier to produce on a violin than vocally. On the other hand, listeners were better able to decode tenderness when communicated by vocalists than violinists. Lullabies are often sung with a tender emotional expression. This association may have primed listeners to more readily identify tenderness in vocal performances.

The analysis of listeners' responses revealed the patterns that listeners relied on to decode emotions. For example, the decoding of anger was associated with the components of low soothing/connectedness and high drive. In other words, high intensity, staccato articulation, fast tempo, and little variation in intensity were associated with judgments of anger. This replicates previous findings regarding the expression of anger (Gabrielsson & Juslin, 1996; Juslin & Laukka, 2003). These results may explain the improved decoding observed in the combined condition as compared to the composition condition. The relevant cues to communicate anger were mainly provided through performance expression. The results suggest that for our stimuli, anger was difficult to convey with compositional cues alone but that the addition of performance expression assisted in its communication.

Fear was most clearly expressed with compositional cues. The addition of performance expression in the combined condition did not improve listeners' accuracy. Musicians primarily encoded fear with intermediate levels of each component and less drive (slow tempo and high intensity variability). Listeners' perceptions of fear were significantly related to drive but not to any of the other components. It is possible that compositional cues not examined in this investigation, such as the degree to which the music was chromatic and the presence of rests between groups of tones, helped listeners to decode this emotion.

Happiness was most clearly conveyed in the combined condition. Musicians encoded happiness with high melodic excursion, low soothing/connectedness, high pitch valence and high drive. When errors in decoding happiness occurred, confusions were with neutral or tenderness in the performance and composition conditions. One reason for this confusion is that musicians played with more legato articulation in the performance condition as compared to the combined condition; listeners' responses suggested that they associated less soothing/connectedness (more staccato articulation) with judgments of happiness.

Sadness was most clearly decoded in the combined and performance conditions as compared to the composition condition. This finding is surprising because melodies with compositional structure were typically composed in a minor mode, whereas melodies in the performance condition had an ambiguous mode. It was expected that a minor mode would be a key feature to allow listeners to discriminate the emotion. Our findings suggest that performance expression is effective at communicating sadness and enhances decoding accuracy when added to compositional structure. Judgments of sadness were associated with all four of the components in a manner that is consistent with musicians' encoding strategy.

Sadness and tenderness were associated with similar performance cues, which explains why these emotions were frequently confused with each other (Gabrielsson & Juslin, 1996; Juslin, 1997b). The PCA showed that the strength of the components was not as extreme for tenderness as for sadness. Tenderness, like sadness, was associated with few melodic excursions, higher soothing/connectedness, and low drive, but a more intermediate pitch valence than sadness.

    Musicians' communication of neutral tended to be low in melodic excursion, higher in soothing/connectedness, intermediate in pitch valence, and high to intermediate in drive. These values were generally not intermediate, contrary to what one might expect for a neutral intention. This may be because musicians tend to expressively shape each tone (e.g., Palmer, 1989; Repp, 1999), and attempting to express nothing may be an unnatural task for musicians. A musician's effectiveness is based on both technical skill and the ability to highlight structural and emotional information (Sloboda, 2000). The focus on emotional decoding in the experimental task may also have encouraged listeners to identify an emotion, even when the emotional intention was neutral.

    Our findings provide a greater understanding of emotional expression in performance and composition, and of listeners' confusion of certain emotions. As has previously been demonstrated, cues are used probabilistically (Juslin, 2000). In some cases, the same cue might be used to convey different emotions; for example, a low intensity can be used to express both sadness and tenderness. Each stimulus is expressed with a unique combination of cues, and it is through the probabilistic and redundant nature of these cues that listeners are able to decode emotional intentions (e.g., Juslin & Timmers, 2010).
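
    One way to see how probabilistic, partly redundant cues can still support decoding is a regression of listener judgments on cue values, in the spirit of the cue-utilization analyses cited above (e.g., Juslin, 2000). The sketch below uses invented data; the cue names and weights are illustrative assumptions, not estimates from this study.

```python
# Sketch: regress simulated "perceived anger" ratings on correlated cues.
# Redundant cues share predictive weight, so no single cue is necessary.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 120  # hypothetical stimuli
intensity = rng.normal(size=n)
tempo = 0.6 * intensity + rng.normal(scale=0.8, size=n)  # redundant with intensity
articulation = rng.normal(size=n)

# Judgments driven probabilistically by several cues plus noise.
anger_rating = (0.5 * intensity + 0.3 * tempo + 0.2 * articulation
                + rng.normal(scale=0.5, size=n))

X = sm.add_constant(np.column_stack([intensity, tempo, articulation]))
fit = sm.OLS(anger_rating, X).fit()
print(fit.params)    # cue weights: how each cue informs the judgment
print(fit.rsquared)  # achievement: how well the cue set predicts judgments
```

    Because the cues overlap in the information they carry, dropping any one of them lowers the model's fit only modestly, which is the sense in which redundancy protects communication.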

    The results illustrate that emotional communication arises somewhat differently in performance and composition. Performers and composers draw on a broad range of knowledge when communicating emotions through music, but each has a different level of control over the available emotional cues. Overall, combining cues from composition and performance yields slightly more accurate emotional decoding than using cues from just one channel alone, a finding that is consistent with Brunswik's lens model (Brunswik, 1956; Juslin, 1997a, 2000). However, additional cues only modestly increase decoding accuracy.

    The findings of this work extend the lens model approach with the inclusion of additional cues, in particular those associated with compositional structure. There are two ways that our results could be interpreted. The first is that combining compositional structure with performance expression does not impart additional information for the listener with respect to all emotional judgments because cues are redundant. However, while certain compositional cues may be related to performance expression, there is no reason to believe that performance cues are in all cases systematically intercorrelated with compositional structure. Indeed, our results suggested that composition and performance cues were not intercorrelated and that each channel provided independent information.
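
    This independence claim can be examined directly by cross-correlating compositional cue values with performance cue values across stimuli; near-zero correlations indicate that the channels carry separate information. A minimal sketch with invented data and hypothetical cue names:

```python
# Sketch: cross-correlations between composition cues and performance cues.
# Values are simulated; near-zero entries would suggest independent channels.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 48  # hypothetical stimuli
comp = pd.DataFrame(rng.normal(size=(n, 3)),
                    columns=["mode_majorness", "interval_size", "pitch_height"])
perf = pd.DataFrame(rng.normal(size=(n, 3)),
                    columns=["intensity", "tempo", "articulation"])

# Pearson correlations of every composition cue with every performance cue.
cross = pd.concat([comp, perf], axis=1).corr().loc[comp.columns, perf.columns]
print(cross.round(2))
```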

    A second interpretation is that listeners use all available cues to make emotional judgments. The addition of either performance expression or compositional structure, as in the combined condition, may not dramatically increase accuracy if one channel is uninformative. For example, this work and previous work have found that anger is difficult to communicate through composition but is easily decoded through performance expression. Adding compositional structure to renditions that communicate anger through performance expression may not increase accuracy beyond performance expression alone if the composition adds little useful evidence and instead implies another emotion. Some emotions are better communicated through performance expression, whereas others are better communicated through compositional structure. It is likely that differences in the cues available within each condition are responsible for this pattern of results.

    There may be adaptive reasons for the superior expression of some emotions through either performance expression or compositional structure. The probabilistic nature of the musical cues used to communicate emotion means that listeners attend to the most salient cues. Performance cues, such as intensity and timing, may be more salient and less culture-specific than compositional cues such as modality. Support for this idea is found in cross-cultural and developmental studies (Dalla Bella et al., 2001; Thompson & Balkwill, 2010). Emotions such as anger or threat are well expressed through changes in intensity and timing, and sensitivity to these cues may allow individuals to respond adaptively when other individuals express these emotions. In contrast, compositional structure may rely to a greater extent on learned associations and may also express emotions that are not as important to immediate survival, such as happiness. The pitch and rhythmic cues associated with compositional structure may require more time to unfold, and their processing may be more influenced by previous exposure, expectations, and conventions, as compared to performance cues.

    It is important to acknowledge that the observed decoding accuracy is unlikely to represent the potential for emotional communication in all contexts. Our composition task involved creating brief phrases consisting of a maximum of nine tones. If these melodies had been longer and had included harmony, dissonance, and varied choices of timbre, then musicians' capacity to communicate emotion through composition might have been greater. Moreover, our stimuli were based on eight musicians, and caution should be exercised when generalizing about the capacity of all composers and performers to convey specific emotions (an acknowledgement that tends to apply to all studies of emotion in music). Nonetheless, our results illustrate that emotional communication in music may be broken into distinct channels of communication that permit the manipulation of different cues, with unique emotional capacities.

    Future work may seek to understand the balance that performers achieve between adherence to performance norms and expressive actions that highlight their unique contributions to the musical experience. In Western music, a performer interprets the composer's intentions. The expressive decisions that a performer makes regarding interpretation are influenced by the compositional structure, among other factors. In the present investigation, musicians always performed their own compositions. As such, it is unclear whether emotional communication is more effective when performing one's own composition than when performing the music of another composer. A comparison of multiple performances of the same compositions may provide insight into the constraints that compositional structure imposes on performers and, hence, the radius of creativity available to musicians.

    A related issue is the way in which performers approach the task of interpreting the emotional connotations within a composition. During the most effective performances, it may sometimes appear that the performer is channelling the intentions of the composer, which implies that a successful performance involves achieving an altered sense of agency that blurs the boundary between performer and composer. Future work may compare the sense of agency achieved by a musician who performs her or his own composition with the sense of agency achieved when performing a composition by another musician.

    Stimuli in the composition condition were presented as MIDI sequences. Such sequences excluded performance expression and, consequently, sounded somewhat artificial. Conversely, sequences that included performance expression sounded less artificial and more suggestive of a human (biological) agent. An intriguing question is whether processes of emotional decoding are affected (e.g., heightened) when music is perceived to arise from a human agent. Emotion is crucial for social relationships, and there may be a natural tendency to infer emotion from signals that are perceived to be humanly produced. One important function of performance expression may be to bring life to a composition, enhancing the degree to which music is perceived to arise from a human agent and stimulating processes related to emotional communication.

    To conclude, a musical performance reflects the shared goal of performers and composers to communicate emotional intentions to listeners (along with other communicative intentions). The results of this investigation confirm that performers and composers have access to different subsets of emotional cues. In musical traditions that involve the use of notation for preserving and disseminating music, it is only through the joint intentions and capabilities of both performers and composers that listeners can perceive and experience the full palette of emotions that lie at the heart of musical experience.

    Acknowledgements
    We would like to thank Alissa Beath, Michael Connors, Genevieve McArthur, and Bruno Repp for their helpful comments on an earlier version of this manuscript. We would also like to thank Bojan Neskovic and Alex Chilvers for their technical support, and the musicians who volunteered and followed instructions.

    Funding
    This research was supported in part by a grant from the Australian Research Council (DP 0879017) awarded to the second author.

    Notes
    1. Balanced designs were also considered, but there were limitations with these designs. For example, in the performance condition, all compositional cues could have been held constant. That is, musicians could have been asked to play a single tone repeatedly. This would have resulted in the same number of stimuli in each condition, but the use of a single tone would have been problematic as it would not have been particularly musical, and it may have led to atypical uses of performance expression. We also considered providing only one emotionally ambiguous melody to musicians. Such a design would have resulted in repetitive experimental stimuli, and made it difficult to generalize our results to other melodic contexts.

    2. The results of this preliminary analysis revealed no significant interactions (mode, F(5, 77) = 0.63, p = .63; mean F0, F(5, 77) = 0.24, p = .994; average interval size, F(5, 77) = 0.13, p = .999; and range, F(5, 77) = 0.00, p = 1.00) or differences between communication channels (mode, F(1, 77) = 1.74, p = .191; mean F0, F(1, 77) = 0.02, p = .903; average interval size, F(1, 77) = 0.27, p = .604; and range, F(1, 77) = 0.000, p = 1.00) for all the variables.
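
    For readers who want to see the shape of such a preliminary analysis, the sketch below runs a channel × emotion ANOVA on one cue variable. It uses simulated data and the statsmodels formula interface; it is not the SPSS procedure used in the study, and the variable names are assumptions.

```python
# Sketch of a channel-by-emotion ANOVA on a single compositional cue.
# Data are simulated; eight hypothetical musicians per cell.
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)
emotions = ["anger", "fear", "happiness", "neutral", "sadness", "tenderness"]
rows = [{"channel": ch, "emotion": em, "interval_size": rng.normal()}
        for ch in ["composition", "combined"]
        for em in emotions
        for _ in range(8)]
df = pd.DataFrame(rows)

model = ols("interval_size ~ C(channel) * C(emotion)", data=df).fit()
print(anova_lm(model, typ=2))  # F and p for main effects and the interaction
```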

    References
    Balkwill, L.-L., & Thompson, W. F. (1999). A cross-cultural investigation of the perception of emotion in music: Psychophysical and cultural cues. Music Perception, 17(1), 43–64.
    Bigand, E., Vieillard, S., Madurell, F., Marozeau, J., & Dacquet, A. (2005). Multidimensional scaling of emotional responses to music: The effect of musical expertise and of the duration of the excerpts. Cognition and Emotion, 19, 1113–1139. doi:10.1080/02699930500204250
    Boersma, P., & Weenink, D. (2009). Praat: Doing phonetics by computer (Version 5.1.20) [Computer software]. Amsterdam, The Netherlands: Institute of Phonetic Sciences.
    Brunswik, E. (1956). Perception and the representative design of psychological experiments (2nd ed.). Berkeley: University of California Press.
    Cemgil, T., Kappen, B., Desain, P., & Honing, H. (2000). On tempo tracking: Tempogram representation and Kalman filtering. Journal of New Music Research, 29, 259–273. doi:10.1080/09298210008565462
    Crowder, R. G. (1984). Perception of the major/minor distinction: I. Historical and theoretical foundations. Psychomusicology, 4, 3–12.
    Dalla Bella, S., Peretz, I., Rousseau, L., & Gosselin, N. (2001). A developmental study of the affective value of tempo and mode in music. Cognition, 80, B1–B10. doi:10.1016/S0010-0277(00)00136-0
    Gabrielsson, A., & Juslin, P. N. (1996). Emotional expression in music performance: Between the performer's intention and the listener's experience. Psychology of Music, 24, 68–91. doi:10.1177/0305735696241007
    Gabrielsson, A., & Lindström, E. (2010). The role of structure in the musical expression of emotions. In P. Juslin & J. Sloboda (Eds.), Music and emotion (2nd ed., pp. 368–400). Oxford, UK: Oxford University Press.
    Gagnon, L., & Peretz, I. (2003). Mode and tempo relative contributions to "happy–sad" judgments in equitone melodies. Cognition and Emotion, 17, 25–40. doi:10.1080/02699930302279
    Hailstone, J. C., Omar, R., Henley, S. M., Frost, C., Kenward, M. G., & Warren, J. D. (2009). It's not what you play, it's how you play it: Timbre affects perception of emotion in music. Quarterly Journal of Experimental Psychology, 62, 2141–2155. doi:10.1080/17470210902765957
    Hevner, K. (1935). The affective character of the major and minor modes in music. American Journal of Psychology, 47, 103–118. doi:10.2307/1416710
    Hevner, K. (1937). The affective value of pitch and tempo in music. The American Journal of Psychology, 49, 621–630. doi:10.2307/1416385
    Honing, H. (2001). From time to time: The representation of timing and tempo. Computer Music Journal, 25, 50–61. doi:10.1162/014892601753189538
    Honing, H. (2005). Is there a perception-based alternative to kinematic models of tempo rubato? Music Perception, 23, 79–85. doi:10.1525/mp.2005.23.1.79
    Huron, D., Kinney, D., & Precoda, K. (2006). Influence of pitch height on the perception of submissiveness and threat in musical passages. Empirical Musicology Review, 1(3), 170–177.
    Ilie, G., & Thompson, W. F. (2006). A comparison of acoustic cues in music and speech for three dimensions of affect. Music Perception, 23, 319–329. doi:10.1525/mp.2006.23.4.319
    Juslin, P. N. (1997a). Emotional communication in music performance: A functionalist perspective and some data. Music Perception, 14(4), 383–418.
    Juslin, P. N. (1997b). Perceived emotional expression in synthesized performances of a short melody: Capturing the listener's judgment policy. Musicae Scientiae, 1(2), 225–256.
    Juslin, P. N. (2000). Cue utilization in communication of emotion in music performance: Relating performance to perception. Journal of Experimental Psychology: Human Perception and Performance, 26, 1797–1813. doi:10.1037/0096-1523.26.6.1797
    Juslin, P. N. (2003). Five facets of musical expression: A psychologist's perspective on music performance. Psychology of Music, 31, 273–302. doi:10.1177/03057356030313003
    Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129, 770–814. doi:10.1037/0033-2909.129.5.770
    Juslin, P. N., & Timmers, R. (2010). Expression and communication of emotion in music performance. In P. Juslin & J. Sloboda (Eds.), Music and emotion (2nd ed., pp. 453–489). Oxford, UK: Oxford University Press.
    Kallinen, K. (2005). Emotional ratings of music excerpts in the western art music repertoire and their self-organization in the Kohonen neural network. Psychology of Music, 33, 373–393. doi:10.1177/0305735605056147
    Kendall, R. A., & Carterette, E. C. (1990). The communication of musical expression. Music Perception, 8(2), 129–164.
    Kotlyar, G. M., & Morozov, V. P. (1976). Acoustical correlates of the emotional content of vocalized speech. Soviet Physics Acoustics, 22, 208–211.
    Krumhansl, C. (2010). Plink: Thin slices of music. Music Perception, 27(5), 337–354. doi:10.1525/MP.2010.27.5.337
    Lartillot, O., Toiviainen, P., & Eerola, T. (2008). A Matlab toolbox for music information retrieval. In C. Preisach, H. Burkhardt, L. Schmidt-Thieme, & R. Decker (Eds.), Data analysis, machine learning and applications: Studies in classification, data analysis, and knowledge organization (pp. 261–268). Berlin, Germany: Springer-Verlag.
    Laukka, P., & Gabrielsson, A. (2000). Emotional expression in drumming performance. Psychology of Music, 28, 181–189. doi:10.1177/0305735600282007
    Palmer, C. (1989). Mapping musical thought to musical performance. Journal of Experimental Psychology: Human Perception and Performance, 15, 331–346. doi:10.1037/0096-1523.15.2.331
    Palmer, C., & Hutchins, S. (2006). What is musical prosody? In B. H. Ross (Ed.), Psychology of learning and motivation (Vol. 46, pp. 245–278). Amsterdam, The Netherlands: Elsevier Press.
    Peretz, I., Gagnon, L., & Bouchard, B. (1998). Music and emotion: Perceptual determinants, immediacy and isolation after brain damage. Cognition, 68, 111–141. doi:10.1016/S0010-0277(98)00043-2
    Repp, B. H. (1999). Relationships between performance timing, perception of timing perturbations, and perceptual-motor synchronization in two Chopin preludes. Australian Journal of Psychology, 51, 188–203.
    Rosenthal, R., & Rubin, D. B. (1989). Effect size estimation for one-sample multiple-choice-type data: Design, analysis, and meta-analysis. Psychological Bulletin, 106, 332–337. doi:10.1037/0033-2909.106.2.332
    Scherer, K. R., & Oshinsky, J. S. (1977). Cue utilization in emotion attribution from auditory stimuli. Motivation and Emotion, 1, 331–346. doi:10.1007/BF00992539
    Schubert, E. (2004). Modeling perceived emotion with continuous musical features. Music Perception, 21, 561–585. doi:10.1525/mp.2004.21.4.561
    Schutz, M., Huron, D., Keeton, K., & Loewer, G. (2008). The happy xylophone: Acoustic affordances restrict an emotional palette. Empirical Musicology Review, 3(3), 126–135.
    Seeger, A. (2004). Why Suyá sing: A musical anthropology of an Amazonian people. Cambridge, UK: Cambridge University Press.
    Sherman, M. (1928). Emotional character of the singing voice. Journal of Experimental Psychology, 11, 495–497. doi:10.1037/h0075703
    Sloboda, J. A. (2000). Individual differences in music performance. Trends in Cognitive Sciences, 4, 397–403. doi:10.1016/S1364-6613(00)01531-X
    SPSS. (2002). Linear mixed effects modelling in SPSS: An introduction to the mixed procedure. Chicago, IL: Author. Retrieved from http://www.spss.ch/upload/1107355943_LinearMixedEffectsModelling.pdf
    Terwogt, M. M., & Van Grinsven, F. (1991). Musical expression of moodstates. Psychology of Music, 19, 99–109. doi:10.1177/0305735691192001
    Thompson, W. F., & Balkwill, L.-L. (2010). Cross-cultural similarities and differences. In P. Juslin & J. Sloboda (Eds.), Music and emotion (2nd ed., pp. 755–788). Oxford, UK: Oxford University Press.
    Thompson, W. F., & Robitaille, B. (1992). Can composers express emotions through music? Empirical Studies of the Arts, 10(1), 79–89.
    Timmers, R., & Ashley, R. (2007). Emotional ornamentation in performances of a Handel sonata. Music Perception, 25(2), 117–134.
    Vieillard, S., Peretz, I., Gosselin, N., Khalfa, S., Gagnon, L., & Bouchard, B. (2008). Happy, sad, scary and peaceful musical excerpts for research on emotions. Cognition and Emotion, 22, 720–752. doi:10.1080/02699930701503567
    Wedin, L. (1972). A multi-dimensional study of perceptual emotional qualities in music. Scandinavian Journal of Psychology, 13, 115–131. doi:10.1111/j.1467-9450.1972.tb00072.x