Processing emotions in sounds: cross-domain aftereffects





Cognition and Emotion

ISSN: 0269-9931 (Print) 1464-0600 (Online) Journal homepage: http://www.tandfonline.com/loi/pcem20

Processing emotions in sounds: cross-domain aftereffects of vocal utterances and musical sounds

Casady Bowman & Takashi Yamauchi

To cite this article: Casady Bowman & Takashi Yamauchi (2016): Processing emotions in sounds: cross-domain aftereffects of vocal utterances and musical sounds, Cognition and Emotion, DOI: 10.1080/02699931.2016.1255588

To link to this article: http://dx.doi.org/10.1080/02699931.2016.1255588


Published online: 16 Nov 2016.



Processing emotions in sounds: cross-domain aftereffects of vocal utterances and musical sounds

Casady Bowman and Takashi Yamauchi

Department of Psychology, Texas A&M University, College Station, TX, USA

ABSTRACT

Nonlinguistic signals in the voice and musical instruments play a critical role in communicating emotion. Although previous research suggests a common mechanism for emotion processing in music and speech, the precise relationship between the two domains is unclear due to the paucity of direct evidence. By applying the adaptation paradigm developed by Bestelmeyer, Rouger, DeBruine, and Belin [2010. Auditory adaptation in vocal affect perception. Cognition, 117(2), 217–223. doi:10.1016/j.cognition.2010.08.008], this study shows cross-domain aftereffects from vocal to musical sounds. Participants heard an angry or fearful sound four times, followed by a test sound, and judged whether the test sound was angry or fearful. Results show cross-domain aftereffects in one direction – from vocal utterances to musical sounds, not vice-versa. This effect occurred primarily for angry vocal sounds. It is argued that there is a unidirectional relationship between vocal and musical sounds in which emotion processing of vocal sounds encompasses musical sounds but not vice-versa.

ARTICLE HISTORY: Received 2 December 2015; Revised 19 October 2016; Accepted 24 October 2016

KEYWORDS: Auditory adaptation; music; speech; emotion; musical affect

Communicating emotion is crucial for social interactions, and two powerful means to express emotion are speech and music. Research on vocal acoustics (Bachorowski & Owren, 2008), infant-directed speech (Byrd, Bowman, & Yamauchi, 2012; Schachner & Hannon, 2011), and laughter (Bachorowski, Smoski, & Owren, 2001) indicates the presence of a common mechanism for processing emotions in music and speech. For instance, Aubé, Angulo-Perkins, Peretz, Concha, and Armony (2015) found correlations between amygdala responses to fearful music and vocalizations. Coutinho and Dibben (2012) also demonstrated that seven psychoacoustic features – loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness – are capable of explaining affect ratings given to both music and speech. These results indicate that emotions in music and speech are processed by shared general mechanisms (Juslin & Laukka, 2003). The findings of Thompson, Schellenberg, and Husain (2004) further reinforce this idea. The researchers compared emotion judgments of musicians and non-musicians. Participants were asked to make an emotion judgment after listening to vocal utterances spoken with emotional prosody or to tone sequences that mirrored the prosody of the vocal utterances. Results showed that musicians outperformed non-musicians in emotion identification, suggesting that processing emotion in auditory cues, such as music or vocal utterances, likely utilizes similar mechanisms. In the same vein, recent evidence shows that speech and music share underlying neural resources (Fedorenko, Patel, Casasanto, Winawer, & Gibson, 2009; Patel, 2003) such that certain acoustic features are used for emotion perception in both domains (see Juslin & Laukka, 2003).

Although these findings point to the notion that processing emotions in speech and music is mediated by a common mechanism, there are at least two critical problems with this view. First, the evidence supporting this claim comes almost exclusively from regression analyses (Byrd, Bowman, & Yamauchi, 2012; Eerola, Ferrer, & Alluri, 2012; Juslin & Laukka, 2003), which can only suggest an indirect associative relationship between these domains. Second, this research has focused on dimensional aspects of emotion (arousal and valence) while ignoring other important characteristics, such as adaptive characteristics of sound (e.g. approach and avoidance). Due to these limitations, the relationship between music and speech in emotion processing is still unclear (Bowman & Yamauchi, 2015a, 2015b; Ilie & Thompson, 2006). By applying the adaptation paradigm developed by Bestelmeyer, Rouger, DeBruine, and Belin (2010), the current study investigated the extent to which similar mechanisms mediate emotion perception of vocal and musical sounds.

Adaptation and emotion processing

Adaptation is a process whereby neural responses decrease after prolonged stimulation. Research shows that neurons respond to specific stimulus attributes and are active at early stages of information processing, particularly for high-level properties such as facial identity (Bestelmeyer, Maurage, Rouger, Latinus, & Belin, 2014; Bestelmeyer et al., 2010). Prolonged exposure to stimuli can also result in the opposite effect – sensitization. Sensitization results when an observer is repeatedly exposed to an angry face, for instance, and rates a subsequent face as angrier (Kandel, Schwartz, & Jessel, 2001, p. 1465; Wang et al., 2016). While the exact interpretation of what causes sensitization is still unclear, recent behavioral and neuropsychological research points to the idea that sensitization may be mediated by processes similar to those underlying adaptation. In particular, the occurrence of adaptation and/or sensitization in emotion processing depends on the adaptive value (approach vs. avoidance) of the stimuli. For example, when stimuli serve a salient adaptive purpose, they are likely to result in sensitization.

Frühholz and Grandjean (2013) demonstrated in a functional magnetic resonance imaging (fMRI) study that sub-regions of the amygdala are sensitive to threatening emotional voices. While being scanned, participants listened to speech-like non-word stimuli that were angry or neutral and judged whether the stimuli were angry or neutral (explicit task) and whether the voice was male or female (implicit task). Using blood-oxygen-level dependent (BOLD) contrast imaging, the researchers compared the neutral and angry conditions for both the explicit and implicit task types. Results showed increased BOLD activity in the bilateral superficial (SF) complex and the right laterobasal (LB) complex of the amygdala in response to emotional cues from speech prosody in the explicit task (angry–neutral judgment) but not in the implicit task (male–female judgment). The authors concluded that threatening vocalizations may cause sensitization when stimuli serve a salient adaptive purpose (Frühholz & Grandjean, 2013).

Similarly, Pollak, Cicchetti, Hornung, and Reed (2000) showed a sensitization aftereffect for angry sounds that may be mediated by adaptive values: children who had been maltreated showed enhanced event-related potential (ERP) responses to angry versus fearful or happy vocal expressions. In addition, Strauss et al. (2005) found that in adults, repeated exposure to angry expressions caused an increase in neural responses in emotion-processing circuits (e.g. the amygdala), whereas repeated presentations of other negative emotions (e.g. fear) led to adaptation (Strauss et al., 2005; Vaish, Grossmann, & Woodward, 2008). Although the correlations between physiological findings (neuroimaging or ERP) and behavioral judgments of adaptation are not yet well understood, and the role of specific brain regions remains unclear (Wang et al., 2016), it seems that the adaptive characteristics of emotional stimuli influence sensitization and adaptation effects.

While the adaptation paradigm has been used to explore the neural mechanisms underlying face and voice perception, it is not yet clear if these aftereffects exist for the processing of other types of nonlinguistic auditory information, such as instrumental sounds, or whether these aftereffects cross domains. To investigate the relationship between music and speech, we focus on the link between instrumental sounds and vocal sounds as initial proxies for music- and speech-like sounds.

Experiments 1a–1d: instrument and voice

Each experiment consisted of two phases – a baseline phase and an experimental phase. In both phases, participants judged whether a test stimulus was angry or fearful (i.e. the anger–fear judgment task). The experimental phase was analogous to the baseline phase except that participants received an adaptation sound four times (either an angry or a fearful sound) right before an anger–fear judgment trial (Figure 1). In Experiment 1a, participants were adapted to angry or fearful vocal sounds and tested on vocal sounds; in Experiment 1b, participants were adapted to angry or fearful instrumental sounds and tested on instrumental sounds.

If emotion processing for vocal and instrumental sounds makes use of shared neural mechanisms, prolonged exposure to voice sounds (e.g. an angry voice) should result in aftereffects (either adaptation or sensitization) in the processing of instrumental sounds, and vice-versa.

Method

Participants

Twenty undergraduates (13 female, mean age = 19.1, SD = 1.35; 7 male, mean age = 20.6, SD = 3.71) participated in Experiment 1a (adapt to voice, test on voice) and 21 undergraduates (14 female, mean age = 19.57, SD = 2.06; 7 male, mean age = 18.57, SD = 1.51) took part in Experiment 1b (adapt to instrument, test on instrument). Thirty-six undergraduate students (19 female, mean age = 18.7, SD = 0.82; 17 male, mean age = 19.7, SD = 2.02) participated in Experiment 1c (adapt to voice, test on instrument). Fifty-two undergraduate students (24 female, mean age = 18.96, SD = 0.91; 28 male, mean age = 19.32, SD = 1.09) took part in Experiment 1d (adapt to instrument, test on voice). All participants reported normal hearing and received course credit.¹

Materials

Test and baseline stimuli in Experiment 1a (adapted to voice; tested on voice) and Experiment 1d (adapted to instrument; tested on voice) were morphed continuum stimuli of seven steps from anger to fear. These continua were created from eight initial recordings of two male and two female voices from the Montreal Affective Voices (MAV; Belin, Fillion-Bilodeau, & Gosselin, 2008). The MAV are nonverbal affect bursts that correspond to anger, disgust, fear, pain, sadness, surprise, happiness, and pleasure, produced by actors instructed to make emotional interjections using the vowel /a/. These stimuli have been used in Bestelmeyer's studies (Bestelmeyer et al., 2010, 2014). We modified the MAV recordings to produce anger-to-fear morphed continua in seven steps corresponding to 5/95%, 20/80%, 35/65%, 50/50%, 65/35%, 80/20%, and 95/5% anger/fear. The program STRAIGHT (Kawahara & Matsui, 2003) was used to create the anger–fear voice morphed continua in Matlab R2007b (Mathworks, Inc.). There were 56 total stimulus sounds (4 voice sounds (2 male, 2 female) × 2 emotions per sound (anger/fear) × 7 steps for each morphed continuum = 56 total continuum stimuli).

Figure 1. A schematic illustration of the baseline phase and experimental phase for Experiments 1a–1d. The illustration depicts Experiment 1a with voice sounds; however, the procedure is the same for all experiments.

Test and baseline stimuli in Experiment 1b (adapted to instrument; tested on instrument) and Experiment 1c (adapted to voice; tested on instrument) were instrumental morphed continua of seven steps from anger to fear. The morphed continua were created similarly to the voice morphed stimuli. These continua were created from eight initial recordings of two classes of musical instruments, brass and woodwind; the selected instruments were the French horn, baritone, saxophone, and flute, recorded at 440 Hz. The eight instrumental sounds were recorded from musicians in the Army 395th Band, who were directed to play both an angry and a fearful sound on each instrument. Instrumental morphed continua (anger to fear in seven steps) were created using the same procedure as the MAV morphed continua. There were 56 total stimulus sounds (4 instrumental sounds × 2 emotions per sound (anger/fear) × 7 steps for each morphed continuum = 56 total continuum stimuli).

Adaptation sounds in Experiment 1a (adapted to voice; tested on voice) and 1c (adapted to voice; tested on instrument) were the original MAV (100% angry/fearful) sounds. Adaptation sounds in Experiments 1b (adapted to instrument; tested on instrument) and 1d (adapted to instrument; tested on voice) were the original instrument recordings (100% angry/fearful). All stimuli were normalized for root mean square (RMS) energy. For each sound file, the function mirrms from MIRToolbox was run to normalize RMS energy, which is computed by taking the root of the average squared amplitude of the sound. This ensured that the sounds participants heard were approximately the same loudness. Sound stimuli were presented in stereo via JVC Flats stereo headphones. These stimuli are available at http://people.tamu.edu/~takashi-yamauchi/resources.html. In addition, supplementary information is available, including a detailed description and a table of extracted physical acoustic properties of the sound stimuli (i.e. the anger/fear morphed stimuli).
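The RMS computation described above can be sketched compactly in code. The following Python fragment is only an illustration of the general idea (root of the mean squared amplitude, then rescaling to a common target level); the authors' actual normalization used the mirrms function of MIRToolbox in MATLAB, and the target_rms value below is an arbitrary placeholder.

```python
import numpy as np

def rms(signal: np.ndarray) -> float:
    """Root mean square: the root of the average squared amplitude."""
    return np.sqrt(np.mean(np.square(signal)))

def normalize_rms(signal: np.ndarray, target_rms: float = 0.1) -> np.ndarray:
    """Rescale a waveform so that its RMS energy equals target_rms."""
    current = rms(signal)
    if current == 0:
        return signal  # silent clip; nothing to scale
    return signal * (target_rms / current)

# Example: two synthetic "stimuli" end up with identical RMS energy,
# i.e. approximately the same loudness.
t = np.linspace(0, 1, 44100, endpoint=False)
loud = 0.9 * np.sin(2 * np.pi * 440 * t)
quiet = 0.1 * np.sin(2 * np.pi * 440 * t)
print(rms(normalize_rms(loud)), rms(normalize_rms(quiet)))  # both ~0.1
```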

Procedure

Table 1 summarizes the structure of the baseline and experimental phases of Experiments 1a–1d. Each experiment consisted of two phases – a baseline phase and an experimental phase. The baseline phase of Experiment 1a consisted of 168 trials in two blocks of 84 (each block was composed of either male or female vocal sounds). Participants were presented with voice sounds one at a time from an anger–fear morphed continuum of male or female voices and judged whether each test stimulus sounded angry or fearful by selecting one of two buttons on the computer monitor labeled "angry" or "fearful" (i.e. the anger–fear judgment task).² The sound for each voice (two male and two female) at each of the seven morph steps was repeated six times, leading to 84 trials per block and a total of 168 trials (4 voices × 7 anger–fear morphed steps × 6 repetitions = 168 trials). Following the baseline phase, the experimental phase of Experiment 1a began with the presentation of four consecutive adapting sounds (either an angry or a fearful voice). One second after the offset of the fourth affective voice sound, a test stimulus (voice) was presented. The test stimuli were the voice anger–fear morphed continua that were also used in the baseline phase of the experiment. Participants indicated whether the test stimulus was angry or fearful by pressing an "angry" or "fearful" button on the computer monitor. There were four experimental-phase blocks (2 emotions (anger or fear) × 2 voice genders (male or female)), and each of the seven test stimuli per voice was repeated six times, leading to 84 trials per block and a total of 336 trials (2 adapting voice sounds (angry or fearful) × 4 voices (2 male and 2 female) × 7 anger–fear morphed steps × 6 repetitions = 336 trials).
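To make the trial counts above concrete, the sketch below enumerates one possible trial list in Python. The voice labels and the flat list structure are illustrative assumptions; in the actual experiment, trials were organized into blocks, and (as noted later in this section) the adapting and test voices differed in gender.

```python
import itertools
import random

voices = ["male1", "male2", "female1", "female2"]   # hypothetical identifiers
morph_steps = range(1, 8)                           # seven anger-fear morph steps
repeats = 6

# Baseline phase: every voice at every morph step, six times = 168 trials.
baseline = [(v, s) for v, s, _ in itertools.product(voices, morph_steps, range(repeats))]
assert len(baseline) == 168

# Experimental phase: each trial is four repetitions of a 100% angry or 100%
# fearful adapting sound followed by one morphed test sound = 336 trials.
adaptor_emotions = ["anger", "fear"]
experimental = [(e, v, s) for e, v, s, _ in
                itertools.product(adaptor_emotions, voices, morph_steps, range(repeats))]
assert len(experimental) == 336

random.shuffle(baseline)  # in practice, order would be randomized within blocks
```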

The baseline phase of Experiment 1b was identical to that of Experiment 1a, except that instrumental sounds – anger–fear morphed stimuli from two instrument classes (2 brass: French horn and baritone; 2 woodwind: flute and alto saxophone) – were used as stimuli. The baseline phase of Experiment 1c was identical to that of Experiment 1b, using the same instrumental anger–fear morphed continuum sounds. The baseline phase of Experiment 1d was the same as that of Experiment 1a, using the voice anger–fear morphed continuum sounds.

Table 1. Stimuli used in the baseline and experimental (adaptation) phases in Experiments 1a–1d.

Exp. 1a – Baseline phase: morphed voice sounds (anger–fear judgment); Experimental phase exposure: voice sounds; Experimental phase test: morphed voice sounds (anger–fear judgment).
Exp. 1b – Baseline phase: morphed instrumental sounds (anger–fear judgment); Experimental phase exposure: instrumental sounds; Experimental phase test: morphed instrumental sounds (anger–fear judgment).
Exp. 1c – Baseline phase: morphed instrumental sounds (anger–fear judgment); Experimental phase exposure: voice sounds; Experimental phase test: morphed instrumental sounds (anger–fear judgment).
Exp. 1d – Baseline phase: morphed voice sounds (anger–fear judgment); Experimental phase exposure: instrumental sounds; Experimental phase test: morphed voice sounds (anger–fear judgment).

The experimental phase of Experiment 1b was identical to that of 1a, except that the adapting stimuli were angry or fearful instrumental sounds and the test stimuli were anger–fear morphs of instrumental sounds. The experimental phase of Experiment 1c was the same as that of Experiment 1a, except that the adapting sounds were angry or fearful voice sounds and the test stimuli were anger–fear instrumental morphs. The experimental phase of Experiment 1d was identical to that of Experiment 1b, except that the adapting sounds were angry or fearful instrumental sounds and the test stimuli were from an anger–fear voice morphed continuum. Test stimuli used in the baseline phase and the experimental phase of each experiment were the same. To avoid low-level adaptation, prolonged-exposure stimuli and test stimuli were opposite in gender or instrument class (e.g. a male prolonged-exposure voice and a female test voice).

Design

The dependent variable was the proportion of trials in which participants judged test sounds as fearful, and the independent variable was the exposure condition (baseline, anger, or fear). For all analyses, data were collapsed across the seven morph steps so that each participant had an average anger–fear judgment score for each exposure condition.³ A one-way repeated measures ANOVA (exposure condition: baseline, anger, fear; within-subjects factor) was applied to the averaged judgment data.
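As an illustration of this analysis step, the sketch below runs a one-way repeated measures ANOVA on simulated per-participant judgment scores using Python's statsmodels package; the values are random placeholders, not the published data, and the original analyses were not necessarily performed this way.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
n_participants = 20

# One averaged "proportion judged fearful" per participant and exposure
# condition, mimicking the collapsed data described above (simulated values).
data = pd.DataFrame({
    "participant": np.repeat(np.arange(n_participants), 3),
    "condition": np.tile(["baseline", "anger", "fear"], n_participants),
    "p_fearful": rng.uniform(0.3, 0.7, n_participants * 3),
})

# Exposure condition as the single within-subjects factor.
result = AnovaRM(data, depvar="p_fearful", subject="participant",
                 within=["condition"]).fit()
print(result)
```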

Results

Experiment 1a (adapted to voice; tested on voice)

Prolonged exposure to an angry voice led participants to judge voice sounds at test as less angry (more fearful), demonstrating an adaptation aftereffect. Adaptation occurs when neural responses decrease after prolonged stimulation. A one-way repeated measures ANOVA revealed a significant main effect of affective vocal sounds (Figure 2), F(2, 44) = 10.10, MSE = .036, p < .001, ηp² = .32.

Planned t-tests indicated that participants judged sounds as less angry after prolonged exposure to angry sounds (M = .61, SD = .09) relative to baseline (M = .52, SD = .07), t(22) = 4.63, p < .001, d = 1.05, 95% CId [.43, 1.69]. A significant difference was also present for the anger versus fear conditions, t(22) = 3.06, p < .01, d = .56, 95% CId [.04, 1.16]. The baseline versus fear comparison was not significant, t(22) = 1.54, p > .05, d = .40, 95% CId [.19, 1.00].
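For readers who want to reproduce this kind of comparison, the sketch below computes a paired t-test and an effect size on simulated judgment scores. The effect-size formula shown (Cohen's d from the average of the two condition SDs) is one common convention and is only an assumption; the paper does not state which formula was used for d or for its confidence interval.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 23  # 23 simulated participants (df = 22)

# Simulated per-participant mean "proportion fearful" scores (not real data).
baseline = rng.normal(0.52, 0.07, n)
anger = baseline + rng.normal(0.09, 0.05, n)

t_stat, p_value = stats.ttest_rel(anger, baseline)

# Cohen's d for a paired design, using the average of the two condition SDs.
d = (anger.mean() - baseline.mean()) / ((anger.std(ddof=1) + baseline.std(ddof=1)) / 2)
print(f"t({n - 1}) = {t_stat:.2f}, p = {p_value:.4f}, d = {d:.2f}")
```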

Figure 2. (a) Behavioral results from Experiment 1a (adapted to voice; tested on voice). The grand average of all participants is displayed. (b) Psychophysical curves (the hyperbolic tangent function) for the grand average of the three experimental conditions: baseline (solid), anger (light dashed), and fear (dark dashed). The points of subjective equality (PSE) are denoted with an asterisk and a vertical line to the x-axis. The PSEs were used to determine differences between conditions.

To further explore the direction of the effect, data were averaged as a function of the seven morph steps and a psychophysical curve (the hyperbolic tangent function) was fitted to the mean data for each adaptor type (baseline, anger, and fear). Good fits were obtained for all three conditions: baseline (R² = .97), anger (R² = .99), and fear (R² = .98). The point of inflection of the function (the point of subjective equality, PSE) was computed for each curve (baseline, anger, and fear), as illustrated with an asterisk in Figure 2(b). The point of inflection is the point on the test continuum (x-axis, Figure 2(b)) where the voice at test was equally likely to be labelled as angry or fearful (y-axis). The PSE was used to determine differences between conditions.
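The curve-fitting step can be illustrated with a short script. The following Python sketch fits a hyperbolic-tangent psychometric function to made-up mean response proportions and reads off the PSE as the inflection point where the curve crosses 0.5; the parameterization and fitting routine are plausible assumptions, not the authors' exact MATLAB implementation.

```python
import numpy as np
from scipy.optimize import curve_fit

def tanh_psychometric(x, pse, slope):
    """P(judged fearful) as a hyperbolic tangent of morph step.
    The curve passes through 0.5 at x = pse, its inflection point."""
    return 0.5 * (1.0 + np.tanh(slope * (x - pse)))

# Illustrative mean proportions of "fearful" responses at the seven
# anger-to-fear morph steps (made-up values, not the published data).
steps = np.arange(1, 8)
p_fearful = np.array([0.05, 0.12, 0.30, 0.55, 0.78, 0.92, 0.97])

(pse, slope), _ = curve_fit(tanh_psychometric, steps, p_fearful, p0=[4.0, 1.0])

# R-squared of the fit, analogous to the values reported for each condition.
residuals = p_fearful - tanh_psychometric(steps, pse, slope)
r_squared = 1 - residuals.var() / p_fearful.var()
print(f"PSE = {pse:.2f}, slope = {slope:.2f}, R^2 = {r_squared:.2f}")
```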

A one-way repeated measures ANOVA on the inflection (PSE) values revealed a significant main effect of adaptation to affective voices, F(2, 44) = 7.12, MSE = .529, p < .01, ηp² = .25. Planned t-tests showed that the PSE was significantly smaller in the anger condition (M = 2.65, SD = .97) than in the baseline condition (M = 3.45, SD = .88), t(22) = 3.35, p < .01. Additionally, the PSE in the fear condition was also significantly smaller (M = 2.99, SD = 2.13) than in the baseline condition (M = 3.45, SD = .88), t(22) = 2.32, p < .05 (Figure 2(b)).

Experiment 1b (adapted to instrument; tested on instrument)

Similar to Experiment 1a, prolonged exposure to an angry instrumental sound caused participants to judge instrumental sounds at test as less angry (more fearful) (i.e. an adaptation effect). There was no effect of prolonged exposure to fearful sounds. Experiment 1b thus revealed an adaptation effect for instrumental sounds, showing the same effect as Experiment 1a (adapted to voice, tested on voice) in a different domain.

A one-way repeated measures ANOVA indicated a significant main effect of instrumental sounds (Figure 3), F(2, 38) = 3.81, MSE = .019, p < .001, ηp² = .17. Planned t-tests indicated that after prolonged exposure to angry instrumental sounds, participants judged instrumental test sounds as less angry (M = .52, SD = .16) compared to the baseline condition (M = .41, SD = .07), t(19) = 2.52, p < .05, d = .80, 95% CId [.13, 1.45]. There was no significant difference between the baseline and fear conditions, t(19) = .86, p > .05, d = .27, 95% CId [−.92, .35]. The difference between the anger and fear conditions approached significance, t(19) = 1.96, p = .06, d = .45, 95% CId [.19, 1.10].

The data were fitted with a psychophysical curve (the hyperbolic tangent function), and good fits were obtained for all three conditions: baseline (R² = .99), anger (R² = .95), and fear (R² = .96) (Figure 3(b)). A one-way repeated measures ANOVA on PSE values revealed a significant main effect of adaptation to affective instrumental sounds, F(2, 44) = 7.65, MSE = 2.811, p < .001, ηp² = .26. Planned t-tests showed that the PSE was smaller in the anger condition (M = 3.45, SD = 2.13) compared to the baseline condition (M = 5.53, SD = 1.37), t(22) = 3.701, p < .001. In addition, the anger condition (M = 3.45, SD = 2.13) also differed significantly from the fear condition (M = 4.51, SD = 2.45), t(22) = 2.30, p < .05. These results suggest that prolonged exposure to an angry instrumental sound results in adaptation.

Figure 3. (a) Behavioral results from Experiment 1b (adapted to instrument; tested on instrument). (b) Psychophysical function for the grand average of the three experimental conditions: baseline (solid), anger (light dashed), and fear (dark dashed). PSE values are denoted with an asterisk and a vertical line to the x-axis. The PSEs were used to determine differences between conditions.

Experiment 1c (adapted to voice; tested on instrument)

Prolonged exposure to angry vocal sounds led participants to judge instrumental sounds as angrier (i.e. sensitization). Sensitization results when neural responses to a prolonged stimulus are increased. This cross-domain sensitization effect was observed when participants were exposed to angry vocal sounds but not when they were exposed to fearful vocal sounds.

A one-way repeated measures ANOVA revealed a significant main effect (Figure 4), F(2, 70) = 21.71, MSE = .070, p < .001, ηp² = .38. Planned t-tests indicated that participants judged instrumental sounds as angrier after prolonged exposure to an angry voice (M = .43, SD = .14) relative to the baseline condition (M = .55, SD = .12), t(35) = 4.61, p < .001, d = .91, 95% CId [.41, 1.40]. A significant difference was also present for the anger versus fear conditions, t(35) = 6.25, p < .001, d = 1.02, 95% CId [.52, 1.52]. The baseline versus fear comparison was not significant, t(35) = 1.38, p > .05, d = .26, 95% CId [.21, .72].

As in the previous experiments, a psychophysical curve (the hyperbolic tangent function) was fitted to the mean data for each adaptor type (baseline, anger, and fear), and good fits were obtained for all three conditions: baseline (R² = .76), anger (R² = .74), and fear (R² = .77). The PSEs are illustrated with an asterisk in Figure 4(b). A one-way repeated measures ANOVA on PSE values showed a significant main effect of adaptation to affective voices, F(2, 68) = 17.41, MSE = .07, p < .001, ηp² = .34. Planned t-tests showed that the PSE was significantly larger after prolonged exposure to an angry voice (M = 4.39, SD = 2.13) compared to the baseline condition (M = 3.31, SD = 1.41), t(35) = 3.11, p < .05. This is consistent with the preceding results showing that prolonged exposure to an angry voice causes sensitization. In addition, the PSE was significantly larger in the anger condition (M = 4.39, SD = 2.13) than in the fear condition (M = 2.69, SD = 2.10), t(35) = 6.41, p < .05.

Experiment 1d (adapted to instrument; tested on voice)

In contrast to the adaptation effects in Experiments 1a and 1b and the sensitization effect in Experiment 1c, there was no indication of adaptation or sensitization when participants were exposed to angry or fearful instrumental sounds and tested on vocal sounds, F(2, 102) = 1.53, MSE = .065, p = .221, ηp² = .029 (Figure 5).

Figure 4. (a) Behavioral results from Experiment 1c (adapted to voice sounds; tested on instrumental sounds). (b) Psychophysical function for the grand average of the three experimental conditions: baseline (solid), anger (light dashed), and fear (dark dashed). PSE values are denoted with an asterisk and a vertical line to the x-axis. The PSEs were used to determine differences between conditions.

Figure 5. (a) Behavioral results from Experiment 1d (adapted to instrumental sounds; tested on voice sounds). (b) Psychophysical function for the grand average of the three experimental conditions: baseline (solid), anger (light dashed), and fear (dark dashed). PSE values are denoted with an asterisk and a vertical line to the x-axis. The PSEs were used to determine differences between conditions.


Discussion

Results from Experiments 1a–1d showed both single-domain and cross-domain aftereffects of angry sounds. In Experiment 1a, exposure to angry voices made vocal test stimuli sound less angry (more fearful) (i.e. adaptation); similarly, in Experiment 1b, exposure to angry instrumental sounds made instrumental test stimuli sound less angry (more fearful) (i.e. adaptation). In Experiment 1c, exposure to angry voices made instrumental test stimuli sound angrier (sensitization). In Experiment 1d, we found no aftereffects. In all experiments, exposure to fearful sounds resulted in no aftereffects.

The results from Experiments 1a and 1b (adapted to voice, tested on voice; adapted to instrument, tested on instrument) support previous research indicating that adaptation can take place in more than one domain (see Bestelmeyer et al., 2014). When participants were tested across domains, there was a sensitization effect when angry vocal sounds were presented before instrumental sounds. The exact mechanism by which this sensitization occurs is not clear; however, it is likely that emotion processing of vocal sounds and instrumental sounds does not overlap completely. While emotion processing of vocal sounds influences instrumental sounds, the opposite is not the case.

It is possible that the instrumental sounds employed in Experiment 1d were not emotionally expressive enough to produce aftereffects. Likewise, exposure to fearful sounds may not have resulted in aftereffects because the fearful sounds were not fearful enough. Moreover, it is unknown whether the results of Experiments 1a–1d generalize to more complex situations. The vocal and instrumental sounds we employed in Experiments 1a–1d are not necessarily representative of speech and music; however, these sound stimuli do contain some of the same information as music and speech. To better understand how emotion is perceived in the domains of music and speech, it is crucial to re-examine cross-domain aftereffects with sounds that are more analogous to music and speech and emotionally more expressive.

Experiments 2a–2d employed instrumental three-note sound sequences and vocal utterances as stimuli. Using these stimuli, we expected to see results similar to those observed in Experiments 1a–1d, with adaptation aftereffects emerging in single domains, both for vocal utterances (Experiment 2a) and for three-note sound sequences (Experiment 2b), as well as across domains from vocal utterances to three-note sound sequences (Experiment 2c) but not from three-note sound sequences to vocal utterances (Experiment 2d), provided that what we found in Experiments 1a–1d was not a mere statistical anomaly.

Experiments 2a–2d: vocal utterances and three-note sound sequences

In Experiments 2a–2d, we employed three-note sound sequences and vocal utterances (two-phoneme vocal sounds) as sound stimuli. Vocal utterance sounds were created from recordings of voices using the phonemes gi/go, wo/wo, de/de, or te/te. Three-note sound sequence stimuli were recordings of instrumental tones combined to create three-note sound sequences. Except for the new stimuli, the structure and procedure of Experiments 2a–2d were analogous to those described for Experiments 1a–1d (Table 2).

If emotion processing for these two types of sound makes use of shared neural mechanisms, prolonged exposure to vocal utterances (e.g. an angry voice) should result in aftereffects (either adaptation or sensitization) in the processing of three-note sound sequences, and vice-versa.

Method

Participants

Seventeen undergraduate students took part in Experiment 2a (adapted to vocal utterance; tested on vocal utterance) (8 female, mean age = 19.00, SD = 0.53; 9 male, mean age = 19.67, SD = 1.41); 18 undergraduate students took part in Experiment 2b (adapted to three-note sound sequence; tested on three-note sound sequence) (10 female, mean age = 18.40, SD = 0.70; 8 male, mean age = 20.00, SD = 3.30); 20 undergraduate students participated in Experiment 2c (adapted to vocal utterance; tested on three-note sound sequence) (12 female, mean age = 19, SD = 1.12; 8 male, mean age = 20.4, SD = 2.56); and 20 undergraduate students participated in Experiment 2d (adapted to three-note sound sequence; tested on vocal utterance) (12 female, mean age = 19.20, SD = 1.94; 8 male, mean age = 20.37, SD = 2.77). All participants reported normal hearing and received course credit.

Table 2. Stimuli used in the baseline and adaptation phases of Experiments 2a–2d.

Exp. 2a – Baseline phase: morphed vocal utterance (anger–fear judgment); Experimental phase exposure: vocal utterance; Experimental phase test: morphed vocal utterance (anger–fear judgment).
Exp. 2b – Baseline phase: morphed three-note sound sequence (anger–fear judgment); Experimental phase exposure: three-note sound sequence; Experimental phase test: morphed three-note sound sequence (anger–fear judgment).
Exp. 2c – Baseline phase: morphed three-note sound sequence (anger–fear judgment); Experimental phase exposure: vocal utterance; Experimental phase test: morphed three-note sound sequence (anger–fear judgment).
Exp. 2d – Baseline phase: morphed vocal utterance (anger–fear judgment); Experimental phase exposure: three-note sound sequence; Experimental phase test: morphed vocal utterance (anger–fear judgment).

Materials

Stimuli used in the baseline phase of Experiments 2a (adapted to vocal utterance; tested on vocal utterance) and 2d (adapted to three-note sound sequence; tested on vocal utterance) were morphed continuum stimuli of seven steps from anger to fear. The vocal utterance morphed continua were created from eight initial recordings of speech-like vocal utterances. Two male actors recorded an angry and a fearful sound using the pseudo-syllable sounds /de/de/ or /te/te/, and two female actors recorded the speech-like sounds /gi/go/ or /wo/wo/, which were modified after those used by Klinge, Röder, and Büchel (2010). For the baseline phase, the vocal utterance stimuli were used to create anger-to-fear morphed continua in seven steps corresponding to 5/95%, 20/80%, 35/65%, 50/50%, 65/35%, 80/20%, and 95/5% anger/fear. There were 56 total vocal utterance morphed sounds (4 vocal utterances (2 male, 2 female) × 2 emotions per sound (anger/fear) × 7 steps for each morphed continuum = 56 total continuum stimuli). The program STRAIGHT (Kawahara & Matsui, 2003) was used to create the anger–fear voice morphed continua in Matlab R2007b (Mathworks, Inc.).

Stimuli used in the baseline phase of Experiment 2b (adapted to three-note sound sequence; tested on three-note sound sequence) and Experiment 2c (adapted to vocal utterance; tested on three-note sound sequence) were musical morphed continuum stimuli of seven steps from anger to fear; these were created similarly to the vocal utterance morphed stimuli. The morphed continua were created from eight initial recordings of two classes of musical instruments, brass and woodwind. The selected instruments were the French horn, baritone, saxophone, and flute. Instrumental sounds were recorded by members of the U.S. 395th Army Band. Instrumentalists were directed to play both an angry and a fearful three-note sound. From these three-note sound stimuli, anger-to-fear continua were created for each sound in seven steps corresponding to 5/95%, 20/80%, 35/65%, 50/50%, 65/35%, 80/20%, and 95/5% anger/fear.

Adaptation sounds used in the experimental phase of Experiments 2a (adapted to vocal utterance; tested on vocal utterance) and 2c (adapted to vocal utterance; tested on three-note sound sequence) were the initial vocal utterances (100% angry/fearful). Adaptation sounds used in Experiments 2b (adapted to three-note sound sequence; tested on three-note sound sequence) and 2d (adapted to three-note sound sequence; tested on vocal utterance) were the original musical three-note sound sequence recordings (100% angry/fearful). All stimuli were normalized in RMS energy and presented in stereo via JVC Flats stereo headphones.

Procedure

The procedure was analogous to that of Experiments 1a–1d, with the exception of the sounds presented. Experiment 2a was identical to 1a except that vocal utterances were used as stimuli. Experiment 2b was identical to 1a, except that three-note sound sequences were used to create anger–fear morphed stimuli from two instrument classes (2 brass: French horn and baritone; 2 woodwind: flute and alto saxophone). Experiment 2c was identical to 1c except that vocal utterances and three-note sound sequences were used as adapting and test stimuli, respectively. Experiment 2d was identical to 1d except that three-note sound sequences and vocal utterances were used as adapting and test stimuli, respectively (see Figure 1).

Design

Data were collapsed across the seven morph steps so that each participant had an average anger–fear judgment score for each exposure condition (baseline, anger, and fear). The dependent variable was the proportion of trials in which participants judged stimulus sounds as fearful, and the independent variable was the prolonged exposure condition (baseline, anger, or fear).


Results

Experiment 2a (adapted to vocal utterance; tested on vocal utterance)

After prolonged exposure to an angry vocal utterance, participants judged vocal utterance sounds at test as less angry (more fearful), showing an adaptation effect similar to Experiment 1a (adapt to voice, test on voice). A one-way repeated measures ANOVA revealed a significant main effect of affective vocal utterances when participants were tested on vocal utterance sounds (Figure 6), F(2, 32) = 4.19, MSE = .055, p < .05, ηp² = .21.

Participants judged sounds as less angry following prolonged exposure in the anger condition (M = .56, SD = .11) relative to baseline (M = .48, SD = .08), t(16) = 2.21, p < .05, d = .82, 95% CId [.08, 1.53]. A significant difference was also present for the anger versus fear conditions, t(16) = 5.18, p < .001, d = .72, 95% CId [.05, 1.43]. As in Experiment 1a, the baseline versus fear comparison was not significant, t(16) = .15, p > .05, d = .06, 95% CId [−.40, .52].

Data were averaged as a function of the seven morph steps and a psychophysical curve (the hyperbolic tangent function) was fitted to the mean data for each adaptor type (baseline, anger, and fear). Good fits were obtained for all three conditions: baseline (R² = .98), anger (R² = .99), and fear (R² = .98), and the point of inflection of the function was computed for each curve, as illustrated with an asterisk in Figure 6(b). A one-way repeated measures ANOVA on inflection values revealed no main effect of adaptation to affective speech, F(2, 32) = 2.69, MSE = 1.18, p > .05, ηp² = .14.

Experiment 2b (adapted to three-note sound sequence; tested on three-note sound sequence)

After prolonged exposure to an angry three-note sound sequence in Experiment 2b, participants consistently judged three-note sound sequences as less angry (more fearful), demonstrating an adaptation effect. Similarly, after prolonged exposure to a fearful three-note sound sequence, participants judged a three-note sound sequence at test as less fearful (more angry) (Figure 7), F(2, 30) = 18.10, MSE = .027, p < .001, ηp² = .55.

Paired t-tests indicated that participants judged sounds as less angry after prolonged exposure to angry sound sequences (M = .67, SD = .11) relative to the baseline condition (M = .61, SD = .08), t(15) = 3.35, p < .01, d = .65, 95% CId [.06, 1.39]. An adaptation effect was also present for the fear condition, where participants judged sounds as less fearful after prolonged exposure to fearful sound sequences (M = .54, SD = .15) compared to the baseline condition (M = .61, SD = .08), t(16) = 2.61, p < .01, d = .60, 95% CId [.13, 1.33]. A difference was also present between the anger and fear conditions, t(15) = 6.76, p < .001, d = 1.02, 95% CId [.26, 1.79].

Fitting the data to a psychophysical curve (the hyperbolic tangent function), good fits were obtained for all three conditions: baseline (R² = .99), anger (R² = .99), and fear (R² = .99). The PSEs for each condition are illustrated with an asterisk in Figure 7(b).

Figure 6. (a) Behavioral results from Experiment 2a (adapted to vocal utterances; tested on vocal utterances). (b) Psychophysical function for the grand average of the three experimental conditions: baseline (solid), anger (light dashed), and fear (dark dashed). PSE values are denoted with an asterisk and a vertical line to the x-axis. The PSEs were used to determine differences between conditions.

A one-way repeated measures ANOVA on inflection values revealed a significant main effect of adaptation to affective three-note sound sequences, F(2, 30) = 8.76, MSE = .87, p < .001, ηp² = .37. Follow-up t-tests showed that the PSE for the anger condition was significantly smaller (M = 2.60, SD = 1.22) compared to baseline (M = 3.35, SD = .91), t(15) = 2.70, p < .01. Additionally, the PSE in the fear condition was significantly larger (M = 3.98, SD = 1.71) compared to baseline, t(15) = 2.10, p < .05. These results are consistent with those obtained prior to curve fitting and demonstrate an adaptation effect both when participants were adapted to anger and when they were adapted to fear.

Experiment 2c (adapted to vocal utterance; tested on three-note sound sequence)

Prolonged exposure to angry vocal utterances produced an adaptation effect: participants judged three-note sound sequences as less angry (more fearful). After prolonged exposure to a fearful vocal utterance, participants also judged three-note sound sequences at test as more fearful, demonstrating a sensitization effect (Figure 8), F(2, 38) = 10.38, MSE = .068, p < .001, ηp² = .35.

Paired t-tests indicated that participants judged sounds as less angry after prolonged exposure to an angry vocal utterance (M = .54, SD = .14) relative to baseline (M = .45, SD = .10), t(19) = 2.94, p < .01, d = .69, 95% CId [.03, 1.34] (i.e. an adaptation effect). A sensitization effect was found after prolonged exposure to fearful vocal utterances, t(19) = 4.43, p < .001, d = 1.03, 95% CId [.35, 1.71], where sounds were judged as more fearful (M = .59, SD = .16) relative to the baseline condition (M = .45, SD = .10). A difference was not present between the anger and fear conditions, t(19) = 1.67, p > .05, d = .35, 95% CId [.28, 1.00].

Figure 7. (a) Behavioral results from Experiment 2b (adapted to three-note sound sequences; tested on three-note sound sequences). (b) Psychophysical function for the grand average of the three experimental conditions: baseline (solid), anger (light dashed), and fear (dark dashed). PSE values are denoted with an asterisk and a vertical line to the x-axis. The PSEs were used to determine differences between conditions.

Figure 8. (a) Behavioral results from Experiment 2c (adapted to vocal utterances; tested on three-note sound sequences). (b) Psychophysical function for the grand average of the three experimental conditions: baseline (solid), anger (light dashed), and fear (dark dashed). PSE values are denoted with an asterisk. The PSEs were used to determine differences between conditions.


Good fits were obtained after fitting the data to the hyperbolic tangent function: baseline (R² = .98), anger (R² = .92), and fear (R² = .98); the PSEs are illustrated with an asterisk in Figure 8(b). A one-way repeated measures ANOVA on PSE values revealed a significant main effect of adaptation to affective vocal utterances when tested on three-note sound sequences, F(2, 38) = 4.03, MSE = 2.23, p < .05, ηp² = .18. Follow-up t-tests showed that the PSE for the fear condition was significantly smaller (M = 2.93, SD = 2.01) compared to baseline (M = 4.21, SD = 1.71), t(19) = 2.77, p < .01. There was no difference between baseline and the anger condition or between the anger and fear conditions. These results agree with those obtained prior to curve fitting in showing an effect of adaptation when participants were exposed to an angry vocal utterance and tested on a three-note sound sequence, but they do not show the same sensitization effect. This could indicate that the sensitization effect is not as strong as the adaptation effect.

Experiment 2d (adapted to three-note sound sequences; tested on vocal utterance)

As in Experiment 1d, prolonged exposure to angry three-note sound sequences did not lead participants to judge vocal utterances as more angry or more fearful at test. Similarly, prolonged exposure to fearful three-note sound sequences did not cause participants to judge vocal utterances as more angry or more fearful at test, F(2, 38) = 2.92, MSE = .028, p > .05, ηp² = .13 (Figure 9).

Discussion

Overall, the pattern of results was analogous to that of Experiments 1a–1d: cross-domain aftereffects occurred from vocal utterances to three-note sound sequences but not vice versa (Table 3). These results indicate that adaptation aftereffects can occur across domains, but only from vocal utterances to three-note sound sequences and not vice-versa. Similar to the cross-domain findings of Experiments 1c and 1d (adapted to voice, tested on instrument, and vice-versa), we found a comparable pattern of results for exposure to vocal utterances when participants were tested on three-note sound sequences. As in Experiment 1d, we found no aftereffects from three-note sound sequences to vocal utterances in Experiment 2d.

Experiment 2a demonstrated an adaptation effect in which exposure to angry vocal utterances made vocal utterances sound less angry (more fearful). Experiment 2b also showed an adaptation effect for angry and for fearful three-note sound sequences when tested on three-note sound sequences. Similarly, Experiment 2c revealed adaptation to angry vocal utterances, where exposure to angry vocal utterances made three-note sound sequences sound more fearful (less angry), and sensitization, where exposure to a fearful vocal utterance made three-note sound sequences sound more fearful. There were no adaptation or sensitization effects in Experiment 2d, in which participants were exposed to three-note sound sequences and tested on vocal utterances.

One may argue that the absence of aftereffects in Experiment 2d can be attributed to the fact that the three-note sound sequences we employed were not emotionally salient enough. This interpretation is unlikely because these three-note sound sequences did produce large aftereffects in Experiment 2b.

Figure 9. (a) Behavioral results from Experiment 2d (adapted to three-note sound sequences; tested on vocal utterances). (b) Psychophysical function for the grand average of the three experimental conditions: baseline (solid), anger (light dashed), and fear (dark dashed). PSE values are denoted with an asterisk and a vertical line to the x-axis. The PSEs were used to determine differences between conditions.


Table 3. Summary of results for Experiments 1a–1d and Experiments 2a–2d.

Exp. 1a – Exposure: voice; Test: voice. Adaptation: exposure to angry sounds made test sounds less angry.
Exp. 1b – Exposure: instrument; Test: instrument. Adaptation: exposure to angry sounds made test sounds less angry.
Exp. 1c – Exposure: voice; Test: instrument. Sensitization: exposure to angry sounds made test sounds more angry.
Exp. 1d – Exposure: instrument; Test: voice. No effect.
Exp. 2a – Exposure: vocal utterance; Test: vocal utterance. Adaptation: exposure to angry sounds made test sounds less angry.
Exp. 2b – Exposure: three-note sequence; Test: three-note sequence. Adaptation: exposure to angry sounds made test sounds less angry; exposure to fearful sounds made test sounds less fearful.
Exp. 2c – Exposure: vocal utterance; Test: three-note sequence. Adaptation: exposure to angry sounds made test sounds less angry. Sensitization: exposure to fearful sounds made test sounds more fearful.
Exp. 2d – Exposure: three-note sequence; Test: vocal utterance. No effect.


General discussion

In two sets of four experiments, we examined the link between voice and instrumental sounds, as well as between vocal utterance and three-note sound sequence stimuli; we gauged whether adaptation would occur in single domains and across domains. Results showed a similar pattern of adaptation aftereffects for the perception of angry voice, instrumental, vocal utterance, and three-note sound sequence stimuli, with aftereffects found across domains from vocal to instrumental sounds and from vocal utterances to three-note sound sequences, but not vice-versa. Our results also highlight the complexity of cross-domain aftereffects: in Experiment 2c, exposure to angry vocal utterances made three-note sound sequences sound less angry (adaptation) and exposure to a fearful vocal utterance made three-note sound sequences sound more fearful (sensitization).

We interpret these results in terms of the neural resources used for processing emotion in music and speech sounds, and also in terms of differences between the perception of anger and fear. These results provide evidence that similar mechanisms are used for emotion perception of voice and instrumental sounds, as well as of vocal utterances and three-note sound sequences, for the emotion anger. Furthermore, the nature of this relationship is more complex than a simple shared mechanism. Specifically, there is likely a unidirectional relationship in which vocal sounds may modulate the perception of instrumental and three-note sound sequence stimuli, but not vice-versa.

Mechanisms: shared or separate?

A growing body of research has found support for a relationship between music and speech processing such that they share overlapping cognitive resources (Frühholz, Trost, & Grandjean, 2014). The suggestion of shared neural resources between vocal expression and music has a long history (von Helmholtz, 1885, p. 371; Patel, 2009; Rousseau & von Herder, 1986); however, recent work has indicated a separation of the neural resources involved in speech and music processing (see Rogalsky, Rong, Saberi, & Hickok, 2011). For instance, amusia, a musical disorder defined by a deficit in musical perception and production, provides a good example of the separability of music and speech. Often, individuals with congenital amusia are not able to sing or process pitch but can speak normally (Ayotte, Peretz, & Hyde, 2002; Peretz, 2012, p. 59).

Recent work by Rogalsky et al. (2011) corroborates the notion that there is a difference in the way music and speech sounds are processed. In an fMRI study, participants listened to sentences, scrambled sentences, and melodies. Results showed that sentences elicited activation in ventrolateral areas, whereas musical melodies elicited activation in dorsomedial and parietal areas. These findings indicate that the processing of music and speech does not share overlapping neural resources. The results of our studies support the findings of Rogalsky et al. (2011) and further show a difference in the way that emotion is processed in sounds representing music and speech.

There is considerable research supporting both overlap and non-overlap in the processing of music and speech. Because this delineation is not clear, it is plausible that these two domains share some resources but do not completely overlap. A recent review of neuroimaging studies by Peretz, Vuvan, Lagrois, and Armony (2015) indicates that neural regions are shared for the perception of music and language, for example, co-activation of brain regions during fMRI for music and speech stimuli. This overlap, however, does not mean that the same neural circuitries are shared (Peretz et al., 2015). Rather, neural circuits employed for music may interact with those used for a similar process in language (Peretz et al., 2015). Our findings reinforce this proposition in that aftereffects were present after participants were adapted to a voice sound and tested on an instrumental sound (Experiments 1c and 2c), but not vice versa.

One explanation for this finding is that music and speech may share a mechanism for emotion processing that works in one direction – from voice to instrument. For example, Levy, Granot, and Bentin (2001) compared responses to voices and musical instruments using ERPs. Their results show a voice-specific response (VSR), a positive peak at around 320 ms in electrophysiological recordings that is larger for voice stimuli than for non-vocal stimuli (Belin, Fecteau, & Bedard, 2004; Levy, Granot, & Bentin, 2003; Levy et al., 2001). The VSR is related to the salience of voice stimuli, reflecting the way attention is allocated, and suggests that emotion perception is mediated by vocalizations. This could explain why our results show adaptation aftereffects in single domains (e.g. vocal utterance to vocal utterance) and cross-domain aftereffects from vocal to instrumental sounds, but not from instrumental to vocal sounds.

Because we found cross-domain aftereffects (voice to instrumental sounds) in only one direction, we propose that the mechanisms of emotion perception are governed by those used for speech processing. This difference in processing may depend upon the adaptive value (approach versus avoidance) of emotions.

Are anger and fear different?

Anger and fear are both negatively valenced emotions; however, they have different motivational values – approach or avoidance. Recent studies show that different aftereffects occur depending on the adaptive value of a stimulus (see Frühholz & Grandjean, 2013). Aftereffects were found in the cross-domain experiments where responses were either significantly increased (sensitization in Experiment 1c) or decreased (adaptation in Experiment 2c) when participants were repeatedly exposed to angry vocal utterances. These results suggest a difference in the way anger and fear are perceived. Although we do not have a clear explanation for this disparity, we speculate that it is due to the different adaptive values attached to the emotions anger and fear (Strauss et al., 2005).

Both approach motivation and avoidance motivation are governed by motives that orient or direct behavior toward or away from desired or undesired states (the action-oriented view; e.g. Carver, Sutton, & Scheier, 2000; Eder, Elliot, & Harmon-Jones, 2013). This is demonstrated by Wilkowski and Meier (2010), who observed faster approach movements toward angry facial expressions, showing that anger is related to approach motivation rather than avoidance motivation. In contrast, Springer, Rosas, McGetrick, and Bowers (2007) argued that angry faces were associated with heightened defensive activation (startle response/avoidance) (see also Peretz et al., 2015).

Other researchers have shown that angry faces evoke approach or avoidance motivational reactions depending on characteristics of the observer (Strauss et al., 2005). Regardless of whether anger is associated with approach or avoidance, this offers evidence that more than one channel may be used to process emotion in sounds. We suggest that our finding that cross-domain aftereffects emerged primarily for angry vocal sounds reflects this adaptive asymmetry between angry and fearful sounds.

Limitations

Past research has questioned whether aftereffects are due to low-level adaptation (e.g. to pitch) or high-level adaptation (e.g. adaptation in areas of the brain responsible for face processing) (Bestelmeyer et al., 2014). In this vein, Bestelmeyer et al. (2010) adapted and tested participants on different voice identities to address this limitation. Following Bestelmeyer's methodological approach, we also adapted participants to different voice or instrument identities than those they were tested on. For instance, when adapting to a male voice, participants were tested on a female voice. However, it is not yet clear how effective our procedure was in reducing adaptation effects due to low-level features (e.g. the pitch of a voice) rather than high-level features (e.g. affect). Although information about the musicianship or musical experience of participants was not collected, this would be an interesting avenue to explore in future studies.

It is unclear why the fearful stimuli did not elicit an adaptation aftereffect. This could be because the stimuli did not accurately represent fear, such that a strong response did not occur after participants were exposed to fear. This is suggested by Experiment 2c (adapted to vocal utterance; tested on three-note sound sequence), where participants were sensitized to the fearful sounds during the prolonged exposure condition: instead of judging test sounds as angry, they judged them as more fearful. In addition, Experiment 1c (adapted to voice; tested on instrument) did not show aftereffects for the emotion fear. This could indicate that the stimuli were not effective in inducing adaptation to fear. A future study should clarify this ambiguity.

Conclusion

Results from these studies show that adaptation aftereffects occur in the same domain (e.g. voice, instrument, and music) and across domains, where vocal sounds (voice and vocal utterances) affect the perception of instrumental and three-note sound sequences. These results support a shared and potentially unidirectional mechanism of emotion processing for vocal and musical sounds. We suggest that this relationship is unidirectional: emotion processing of vocal sounds encompasses musical sounds, but not vice-versa.

Notes

1. Experiments 1c and 1d were run separately from 1a and 1b and did not have equal sample sizes. Experiments 1a and 1b tested adaptation with the same stimulus type (e.g. adapted to voice, tested on voice), whereas Experiments 1c and 1d examined cross-domain adaptation for voice and instrumental sounds (e.g. adapt to an instrumental sound and test on a voice sound). Because Experiment 1c was the first cross-domain adaptation study, we increased the number of participants to examine whether cross-domain adaptation would be present. Additional analyses with reduced sample sizes for Experiment 1d (n = 36 to equal Exp. 1c, and n = 20) and a reduced sample size for Experiment 1c (n = 20) were run, and the basic results did not change the overall interpretation of the data.

2. The baseline phase of each experiment (Exp. 1a–1d and 2a–2d) functioned as a stimulus emotion evaluation. In the baseline phase, participants listened to and judged whether each sound stimulus (all morphed sound stimuli, steps 1–7, including those used as adaptors in the adaptation phase of the experiments) sounded angry or fearful. In doing this, we can evaluate the emotion of the sound stimuli by examining the baseline phase of each experiment; see the baseline phase of Figures 2–9 (a schematic illustration of summarising these judgments is sketched after these notes).

3. In addition to the analysis in the main text, we performed a two-way ANOVA (3 (baseline, anger, fear) × 7 (morphing steps 1–7)) with morphing step (7 levels) as a within-subjects factor. We found a two-way interaction effect in Experiments 1a, 1c, and 1d, and the directions of these interaction effects were all consistent with the main effects described in the results for all experiments (a schematic illustration of such an analysis is sketched below).
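As an illustration only of the baseline evaluation described in note 2, the minimal sketch below (Python with pandas) shows one way binary angry/fearful judgments could be summarised per morphing step; the data frame and its column names (participant, morph_step, response) are hypothetical placeholders, not the authors' actual data or analysis code.

    import pandas as pd

    # Hypothetical long-format baseline data: one row per trial, with a
    # participant ID, the morphing step of the stimulus (1-7), and the
    # binary judgment (1 = "angry", 0 = "fearful").
    baseline = pd.DataFrame({
        "participant": [1, 1, 1, 2, 2, 2],
        "morph_step":  [1, 4, 7, 1, 4, 7],
        "response":    [0, 1, 1, 0, 0, 1],
    })

    # Mean proportion of "angry" judgments at each morphing step,
    # averaged first within and then across participants.
    per_participant = (baseline
                       .groupby(["participant", "morph_step"])["response"]
                       .mean()
                       .reset_index())
    per_step = per_participant.groupby("morph_step")["response"].mean()
    print(per_step)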
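Purely as a sketch of how the condition-by-step ANOVA described in note 3 might be set up (not the authors' actual code, software, or design), the example below uses Python with the pingouin package and simulated placeholder data; for illustration it treats the adaptor condition as a between-subjects factor and morphing step as the within-subjects factor.

    import numpy as np
    import pandas as pd
    import pingouin as pg

    rng = np.random.default_rng(0)

    # Simulated long-format data: 10 hypothetical subjects per adaptor
    # condition (baseline, anger, fear), each judging morphing steps 1-7;
    # p_angry stands in for the proportion of "angry" responses and is
    # filled with random placeholder values.
    rows = []
    for cond in ["baseline", "anger", "fear"]:
        for subj in range(10):
            for step in range(1, 8):
                rows.append({"subject": f"{cond}_{subj}",
                             "condition": cond,
                             "step": step,
                             "p_angry": rng.uniform(0, 1)})
    df = pd.DataFrame(rows)

    # 3 (condition) x 7 (morphing step) mixed ANOVA, with step as the
    # within-subjects factor and condition treated here as between-subjects.
    aov = pg.mixed_anova(data=df, dv="p_angry", within="step",
                         subject="subject", between="condition")
    print(aov.round(3))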

Disclosure statement

No potential conflict of interest was reported by the authors.

ORCID

Takashi Yamauchi http://orcid.org/0000-0002-6372-1118

References

Aubé, W., Angulo-Perkins, A., Peretz, I., Concha, L., & Armony, J. L. (2015). Fear across the senses: Brain responses to music, vocalizations and facial expressions. Social Cognitive and Affective Neuroscience, 10(3), 399–407. doi:10.1093/scan/nsu067

Ayotte, J., Peretz, I., & Hyde, K. (2002). Congenital amusia. Brain, 125(2), 238–251.

Bachorowski, J. A., & Owren, M. J. (2008). Vocal expression of emotion. In M. Lewis, J. Haviland-Jones, & L. F. Barrett (Eds.), Handbook of emotions (pp. 196–210). New York, NY: Guilford Press.

Bachorowski, J. A., Smoski, M. J., & Owren, M. J. (2001). The acoustic features of human laughter. The Journal of the Acoustical Society of America, 110(3), 1581. doi:10.1121/1.1391244

Belin, P., Fecteau, S., & Bedard, C. (2004). Thinking the voice: Neural correlates of voice perception. Trends in Cognitive Sciences, 8(3), 129–135.

Belin, P., Fillion-Bilodeau, S., & Gosselin, F. (2008). The Montreal affective voices: A validated set of nonverbal affect bursts for research on auditory affective processing. Behavior Research Methods, 40(2), 531–539. doi:10.3758/BRM.40.2.531

Bestelmeyer, P. E. G., Maurage, P., Rouger, J., Latinus, M., & Belin, P. (2014). Adaptation to vocal expressions reveals multistep perception of auditory emotion. The Journal of Neuroscience, 34(24), 8098–8105. doi:10.1523/JNEUROSCI.4820-13.2014

Bestelmeyer, P. E. G., Rouger, J., DeBruine, L. M., & Belin, P. (2010). Auditory adaptation in vocal affect perception. Cognition, 117(2), 217–223. doi:10.1016/j.cognition.2010.08.008

Bowman, C., & Yamauchi, T. (2015a, September 21–24). Emotion, voices and musical instruments: Repeated exposure to angry vocal sounds makes instrumental sounds angrier. International conference on Affective Computing and Intelligent Interaction (ACII) (pp. 670–675), IEEE Computer Society.

Bowman, C., & Yamauchi, T. (2015b). Perceiving categories of emotion in sound: The role of timbre. Psychomusicology: Music, Mind and Brain, 26(1), 15–25.

Byrd, M., Bowman, C., & Yamauchi, T. (2012). Cooing, crying, and babbling: A link between music and prelinguistic communication. In N. Miyake, D. Peebles, & R. P. Cooper (Eds.), Proceedings of the 34th annual conference of the cognitive science society (pp. 1392–1397). Sapporo: Cognitive Science Society.

Carver, C. S., Sutton, S. K., & Scheier, M. F. (2000). Action, emotion, and personality: Emerging conceptual integration. Personality and Social Psychology Bulletin, 26(6), 741–751. doi:10.1177/0146167200268008

Coutinho, E., & Dibben, N. (2012). Psychoacoustic cues to emotion in speech prosody and music. Cognition & Emotion, 27(4), 658–684. doi:10.1080/02699931.2012.732559

Eder, A. B., Elliot, A. J., & Harmon-Jones, E. (2013). Approach and avoidance motivation: Issues and advances. Emotion Review, 5(3), 227–229.

Eerola, T., Ferrer, R., & Alluri, V. (2012). Timbre and affect dimensions: Evidence from affect and similarity ratings and acoustic correlates of isolated instrument sounds. Music Perception, 30(1), 49–70.

Fedorenko, E., Patel, A., Casasanto, D., Winawer, J., & Gibson, E. (2009). Structural integration in language and music: Evidence for a shared system. Memory & Cognition, 37(1), 1–9.

Frühholz, S., & Grandjean, D. (2013). Amygdala subregions differentially respond and rapidly adapt to threatening voices. Cortex, 49(5), 1394–1403. doi:10.1016/j.cortex.2012.08.003

Frühholz, S., Trost, W., & Grandjean, D. (2014). The role of the medial temporal limbic system in processing emotions in voice and music. Progress in Neurobiology, 123, 1–17. doi:10.1016/j.pneurobio.2014.09.003

von Helmholtz, H. (1885). On the sensations of tone as a physiological basis for the theory of music (2nd ed.). London: Longman.

Ilie, G., & Thompson, W. F. (2006). A comparison of acoustic cues in music and speech for three dimensions of affect. Music Perception: An Interdisciplinary Journal, 23(4), 319–330. doi:10.1525/mp.2006.23.4.319

Juslin, P., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129(5), 770–814. doi:10.1037/0033-2909.129.5.770

Kandel, E. R., Schwartz, J. H., & Jessell, T. M. (2001). Principles of neural science. New York, NY: McGraw Hill.

Kawahara, H., & Matsui, H. (2003, April 6–10). Auditory morphing based on an elastic perceptual distance metric in an interference-free time-frequency representation. Proceedings of the IEEE international conference on Acoustics, Speech, and Signal Processing (ICASSP '03). IEEE. doi:10.1109/ICASSP.2003.1198766

Klinge, C., Röder, B., & Büchel, C. (2010). Increased amygdala activation to emotional auditory stimuli in the blind. Brain, 133(6), 1729–1736. doi:10.1093/brain/awq102

Levy, D. A., Granot, R., & Bentin, S. (2001). Processing specificity for human voice stimuli: Electrophysiological evidence. Neuroreport, 12(12), 2653–2657. doi:10.1097/00001756-200108280-00013

Levy, D. A., Granot, R., & Bentin, S. (2003). Neural sensitivity to human voices: ERP evidence of task and attentional influences. Psychophysiology, 40(2), 291–305.

Patel, A. (2009). Music and the brain: Three links to language. The Oxford handbook of music psychology. New York, NY: Oxford University Press.

Patel, A. D. (2003). Language, music, syntax and the brain. Nature Neuroscience, 6(7), 674–681.

Peretz, I. (2012). Music, language, and modularity in action. In P. Rebuschat, M. Rohrmeier, J. Hawkins, & I. Cross (Eds.), Language and music as cognitive systems (pp. 254–268). New York, NY: Oxford University Press.

Peretz, I., Vuvan, D., Lagrois, M. E., & Armony, J. (2015). Neural overlap in processing music and speech. Philosophical Transactions of the Royal Society of London Series B, Biological Sciences, 9, 370. doi:10.3389/fnhum.2015.00330

Pollak, S. D., Cicchetti, D., Hornung, K., & Reed, A. (2000). Recognizing emotion in faces: Developmental effects of child abuse and neglect. Developmental Psychology, 36(5), 679–688.

Rogalsky, C., Rong, F., Saberi, K., & Hickok, G. (2011). Functional anatomy of language and music perception: Temporal and structural factors investigated using functional magnetic resonance imaging. The Journal of Neuroscience, 31(10), 3843–3852.

Rousseau, J. J., & von Herder, J. G. (1986). On the origin of language. Chicago, IL: University of Chicago Press.

Schachner, A., & Hannon, E. E. (2011). Infant-directed speech drives social preferences in 5-month-old infants. Developmental Psychology, 47(1), 19–25. doi:10.1037/a0020740

Springer, U. S., Rosas, A., McGetrick, J., & Bowers, D. (2007). Differences in startle reactivity during the perception of angry and fearful faces. Emotion, 7(3), 516–525. doi:10.1037/1528-3542.7.3.516

Strauss, M. M., Makris, N., Aharon, I., Vangel, M. G., Goodman, J., Kennedy, D. N., … Breiter, H. C. (2005). fMRI of sensitization to angry faces. NeuroImage, 26(2), 389–413. doi:10.1016/j.neuroimage.2005.01.053

Thompson, W. F., Schellenberg, E., & Husain, G. (2004). Decoding speech prosody: Do music lessons help? Emotion, 4(1), 46–64.

Vaish, A., Grossmann, T., & Woodward, A. (2008). Not all emotions are created equal: The negativity bias in social-emotional development. Psychological Bulletin, 134(3), 383–403.

Wang, X., Guo, X., Chen, L., Liu, Y., Goldberg, M. E., & Xu, H. (2016). Auditory to visual cross-modal adaptation for emotion: Psychophysical and neural correlates. Cerebral Cortex, 2016, 1–10.

Wilkowski, B. M., & Meier, B. P. (2010). Bring it on: Angry facial expressions potentiate approach-motivated motor behavior. Journal of Personality and Social Psychology, 98(2), 201–210. doi:10.1037/a0017992
