THE SMUSE: A SITUATED COGNITION APPROACH TO INTERACTIVE MUSIC COMPOSITION

Sylvain Le Groux

GTCMT, Georgia Tech, Atlanta

Paul F.M.J. Verschure

SPECS, UPF, Barcelona
ICREA, Barcelona

ABSTRACT

The evolution of computer-based music systems has gone from computer-aided composition, which transposed the traditional paradigms of music composition to the digital realm, to complex feedback systems that allow for rich multimodal interactions. Yet, many interactive music systems still rely on principles that are outdated in the light of modern situated cognitive systems design. Moreover, the role of human emotional feedback, arguably an important feature of musical experience, is rarely taken into account in the interaction loop. We propose to address these limitations by introducing a novel situated synthetic interactive composition framework called the SMuSe (for Situated Music Server). The SMuSe is based on the principles of parallelism, situatedness, emergence and emotional feedback and is built on a cognitively plausible architecture. It addresses questions at the intersection of music perception and cognition while being used as a creative tool for interactive music composition.

1. BACKGROUND

Interactivity has now become a standard feature of many multimedia systems and plays a fundamental role in contemporary art practice. Specifically, real-time human/machine interactive music systems are now omnipresent as both composition and live performance tools. Yet, the term “interactive music system” is often misused. The interaction that takes place between a human and a system is a process that includes both control and feedback, where the real-world actions are interpreted into the virtual domain of the system [4]. If some parts of the interaction loop are missing (for instance the cognitive level in Figure 1), the system becomes only a reactive (vs. interactive) system.

As a matter of fact, in most current human-computer musical systems the human agent interacts whereas the machine, due to a lack of cognitive modeling, only reacts. Although the term interactivity is widely used in the new media arts, most systems are simply reactive systems [4]. Furthermore, the cognitive modeling of interactive multimedia systems, when it exists, often relies on a classical cognitive science approach to artificial systems where the different modules (e.g. perception, memory, action) are studied separately. This approach has since been challenged by modern cognitive science, which emphasizes the crucial role of the perception-action loop, the building of cognitive artifacts, and the interaction of the system with its environment [45]. In this paper we propose a novel approach to interactive music system design informed by modern cognitive science and present an implementation of such a system called the SMuSe.

Figure 1. Human-machine interaction (adapted from [4])

2. FROM EMBODIED COGNITIVE SCIENCE TO MUSIC SYSTEMS DESIGN

2.1. Classical View

A look at the evolution of our understanding of cognitive systems, put in parallel with the evolution of music composition practices, gives a particularly interesting perspective on some limitations of current interactive music systems.

The classical approach to cognitive science assumes that external behavior is mediated by internal representations [6] and that cognition is basically the manipulation of these mental representations by sets of rules. It mainly relies on the sense-think-act framework [34], where future actions are planned according to perceptual information.

Interestingly enough, a parallel can be drawn between classical cognitive science and the development of classical music, which also relies heavily on formal structures. It puts the emphasis on internal processes (composition theory) to the detriment of the environment or the body, with centralized control of the performance (the conductor).

Disembodiment in classical music composition can be seen at several levels. Firstly, by training, the composer is used to composing in his head and translating his mental representations into an abstract musical representation: the score. Secondly, the score is traditionally interpreted live by the orchestra's conductor, who “controls” the main aspects of the musical interpretation, whereas the orchestra musicians themselves are left with relatively little interpretative freedom. Moreover, the role of the audience as an active actor in a musical performance is mostly neglected.

2.2. Modern View

An alternative to classical cognitive science is the connectionist approach, which tries to build biologically plausible systems using neural networks. Unlike more traditional digital computation models based on serial processing and explicit manipulation of symbols, connectionist networks allow for fast parallel computation. Moreover, they do not rely on explicit rules but on emergent phenomena stemming from the interaction between simple neural units. Another related approach, called embodied cognitive science, puts the emphasis on the influence of the environment on internal processes. In some sense it replaced the view of cognition as representation with the view that cognition is an active process involving an agent acting in the environment. Consequently, the complexity of a generated structure is not the result of the complexity of the underlying system only, but partly due to the complexity of its environment [41].

A piece that gives a good illustration of the principles of situatedness, distributed processing, and emergence is In C by Terry Riley. In this piece, musicians are given a set of pitch sequences composed in advance, but each musician is left in charge of choosing when to start playing and repeating these sequences. The piece is formed by the combined decisions of the independent musicians, each of whom decides based on the collective musical output that emerges from all the possible variations.

Following this recent evolution of our understanding of cognitive systems, we emphasize the crucial role of emergence, distributed processes and situatedness (as opposed to rule-based, serial, central, internal models) in the design of interactive music composition systems.

2.3. Human-In-The-Loop

With the advent of new sensor technologies, the influence of the environment on music systems can be sensed via both explicit and implicit interfaces, which give access to the behavioral and physiological state of the user. Music is often referred to as the language of emotion, hence human emotion seems to be a natural feedback channel to take into account in the design of a situated music system. We believe that, in order to be complete, the design of a situated music system should take into consideration the emotional aspects of music.

2.3.1. Explicit Gestural Interfaces

The advent of new sensing technologies has fostered the development of new kinds of interfaces for musical expression. Graphical user interfaces, tangible interfaces and gestural interfaces have now become omnipresent in the design of live music performances and compositions [31]. Most of these are gesture-based interfaces that require explicit, conscious body movements from the user. They can give access to behavioral or self-reported information, but not to the implicit emotional states of the user.

2.3.2. Implicit Biosignal Interface

Thanks to the development of more robust and accurate biosignal technologies, it is now possible to derive emotion-related information from physiological data and use it as an input to interactive music systems. Although the idea is not new [15, 37], the past few years have witnessed a growing interest from the computer music community in using physiological data such as heart rate, electrodermal activity, electroencephalogram and respiration to generate or transform sound and music. Providing an emotion-based physiological interface is highly relevant for a number of applications including music therapy, diagnosis, interactive gaming, and emotion-aware musical instruments.

2.3.3. Emotional Mapping

Music and its effect on the listener have long been a subject of fascination and scientific exploration, from the Greeks speculating on the acoustic properties of the voice [14] to Muzak researchers designing “soothing” elevator music. Music has now become an omnipresent part of our day-to-day life. It is well known for affecting human emotional states, and most people enjoy music because of the emotions it evokes. Yet, although emotions seem to be a crucial aspect of music listening and performance, the scientific literature on music and emotion is scarce compared to that on music cognition or perception [28, 10, 17, 5]. The relationship between specific musical parameters and time-varying emotional responses is still not clear. Biofeedback-based interactive music systems appear to be an ideal paradigm to explore the complex relationship between emotion and music.

2.4. Music Perception and Cognition

2.4.1. Hierarchy

What are the most relevant dimensions of music and how should they be represented in the system? Here, we take a cognitive psychology approach and define a set of parameters that are the most salient perceptually and the most meaningful cognitively. Music is a real-world stimulus that is meant to be appreciated by a human listener. It involves a complex set of perceptive and cognitive processes that take place in the central nervous system. The fast advances in the neuroscience of music over the past twenty years have taught us that these processes are partly interdependent, are integrated in time and involve memory as well as emotional systems [16, 32, 33]. Their study sheds light on the structures and features that are involved in music processing and stand out as being perceptually and cognitively relevant. Experimental studies have found that musical perception happens at three different time scales: the event fusion level, where basic musical events such as pitch, intensity and timbre emerge (∼50 ms); the melodic and rhythmic grouping level, where patterns of those basic events are perceived (∼5 s); and finally the form level (from 5 s to 1 hour), which deals with large-scale sections of music [42] (Figure 2). This hierarchy of three time scales of music processing forms the basis on which we built the SMuSe's music processing chain.

Figure 2. The different levels of sequential grouping for musical material: event fusion, melodic and rhythmic grouping and formal sectioning (from [42])

2.4.2. Modularity

Research on the brain substrates underlying music processing has switched in the last twenty years from a classical view emphasizing a dichotomy between language (supposedly processed in the left hemisphere) and music (respectively the right hemisphere) to a more modular view [1]. There is some evidence that music processing modules are organized into two parallel but largely independent sub-modules that deal with pitch content (“What?”) and temporal content (“When?”) respectively [33, 18]. This evidence suggests that they can be treated separately in a computational framework. Additionally, studies involving music-related deficits in neurologically impaired individuals (e.g. subjects with amusia who can no longer recognize melodies) have shown that the music faculty is composed of a set of neurally isolable processing components for pitch, loudness and rhythm [32]. The common view is that pitch, rhythm and loudness are first processed separately by the brain and later form (around 25-50 ms) an impression of a unified musical object [27] (see [16] for a review of the neural basis of music perception). This modularity, as well as the three different levels and time scales of auditory memory (sound, groups, structure), forms a set of basic principles for the design of our computational framework.

Figure 3. The SMuSe is based on a hierarchy of musical agents

3. A COMPUTATIONAL MODEL BASED ON A SOCIETY OF MUSICAL AGENTS

3.1. The SMuSe’s Architecture

The architecture of the SMuSe follows a hierarchical and modular structure and has been implemented as a set of agents using data-flow programming. The musical material is represented at three different hierarchical levels, namely event fusion, event grouping and structure, corresponding to different memory constraints. From the generative point of view, SMuSe modules are divided into time modules (“when”) that generate rhythmic patterns of events, and content modules (“what”) that, for each time event, choose musical material such as pitch and dynamics (Figure 3).
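To make this division of labor concrete, the following minimal Python sketch illustrates the idea of separate “when” and “what” agents cooperating to produce a stream of note events. It is not the actual Max/MSP implementation (which is a data-flow patch); the class names, pattern values and register convention are ours, chosen purely for illustration.

import random

class RhythmAgent:
    """'When' module: decides the onset times of events (in beats)."""
    def __init__(self, pattern):
        self.pattern = pattern  # inter-onset durations, in beats

    def onsets(self, n_events):
        # Cycle through the stored rhythmic pattern.
        t = 0.0
        for i in range(n_events):
            yield t
            t += self.pattern[i % len(self.pattern)]

class PitchAgent:
    """'What' module: chooses the pitch content for each time event."""
    def __init__(self, pitch_classes, register=4):
        self.pitch_classes = pitch_classes
        self.register = register

    def next_pitch(self):
        pc = random.choice(self.pitch_classes)
        return 12 * self.register + pc  # rough MIDI note number

# A 'voice' combines one when-agent and one what-agent.
rhythm = RhythmAgent(pattern=[1.0, 1.0, 0.5, 0.25, 0.25, 1.0])
pitch = PitchAgent(pitch_classes=[0, 0, 5, 7, 10, 0], register=4)

for onset in rhythm.onsets(6):
    print(f"t={onset:4.2f}  midi={pitch.next_pitch()}")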

At the low event fusion level, the SMuSe provides a set of synthesis techniques validated by psychoacoustic tests [26, 23] that give perceptual control over the generation of timbre, as well as the use of MIDI information to define basic musical material such as pitch, velocity and duration. Inspired by previous work on musical performance modeling [7], the SMuSe also makes it possible to modulate the expressiveness of the music generation by varying parameters such as phrasing, articulation and performance noise [24].

At the medium melodic and rhythmic grouping level, the SMuSe implements various state-of-the-art algorithmic composition tools (e.g. generation of tonal, Brownian and serial series of pitches and rhythms, Markov chains, etc.). The time scale of this mid-level of processing is on the order of 5 s for a single grouping, i.e. the time limit of auditory short-term memory.
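As an illustration of one such generator, here is a short Python sketch of a Brownian (random-walk) pitch series of the kind mentioned above. The step size, range and starting pitch are arbitrary choices for the example, not the SMuSe's actual defaults.

import random

def brownian_pitches(start=60, n=16, max_step=2, low=48, high=84):
    """Generate a random-walk (Brownian) series of MIDI pitches."""
    pitch = start
    series = []
    for _ in range(n):
        pitch += random.randint(-max_step, max_step)  # small random step
        pitch = max(low, min(high, pitch))            # clamp to the allowed range
        series.append(pitch)
    return series

print(brownian_pitches())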

The form level concerns large groupings of events over a long period of time (longer than the short-term memory). It deals with entire sequences of music and relates to the structure and limits of long-term memory. Influenced by experiments in synthetic epistemology and situated robotics, this longer-term structure is accomplished via the interaction with the environment [45, 44].

The modularity of the music processing chain is also reflected in different SMuSe modules that specifically deal with time (“when”) or material (“what”).

3.2. Agent Framework

The agent framework is based on the principle that complex tasks can be accomplished through a society of simple, cross-connected, self-contained agents [29]. Here, an agent is understood as “anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors” [39]. In the context of cognitive science, this paradigm somewhat takes a stand against a unified theory of mind in which a diversity of phenomena would be explained by a single set of rules. The claim here is that surprising, complex and emergent results can be obtained through the interaction of simple non-linear agents. The agent framework is particularly suited to building flexible real-time interactive musical systems based on the principles of modularity, real-time interaction and situatedness.

3.3. Data-flow Programming

We chose to implement this hierarchy of musical agents in the SMuSe in a data-flow programming language called Max/MSP [46]. Data-flow programming conceptually models a program as a directed graph of data flowing between operations. This kind of model can easily represent parallel processing, which is common in biological systems, and is also convenient for representing an agent-based modular architecture.
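The essential abstraction can be sketched outside Max/MSP as well. The toy Python example below models a program as a directed graph of operations through which values are pushed; the node names and the particular graph are invented for illustration and do not correspond to an actual SMuSe patch.

class Node:
    """A data-flow node: applies a function and pushes the result downstream."""
    def __init__(self, func, name):
        self.func, self.name, self.outlets = func, name, []

    def connect(self, other):
        self.outlets.append(other)

    def receive(self, value):
        result = self.func(value)
        print(f"{self.name}: {value} -> {result}")
        for node in self.outlets:
            node.receive(result)

# Build a small graph: a MIDI pitch flows through two chained operations.
transpose = Node(lambda p: p + 7, "transpose")                    # add a fifth
to_freq = Node(lambda p: 440.0 * 2 ** ((p - 69) / 12), "to_freq")  # pitch to Hz
sink = Node(lambda x: x, "sink")

transpose.connect(to_freq)
to_freq.connect(sink)

transpose.receive(60)  # push a value into the graph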

3.4. Distributed Control

All the musical agents in the SMuSe can be controlled and accessed from anywhere (including over a network) at any time in a distributed fashion (via OSC commands). This gives great flexibility to the system and allows for shared collaborative compositions where several clients/performers can access and control a common music server/generator. In this collaborative composition paradigm, every performer builds on what the others have done. The action of one performer on the server (e.g. the modulation of a set of musical parameters) stimulates the next action of another performer. The result is a complex sound structure that keeps evolving as long as different performers contribute changes to its current shape. A parallel could be drawn with the stigmergic mechanisms of coordination between social insects like ants [41, 3, 11]. In ant colonies, the pheromonal trace left by one ant at a given time is used as a means to communicate and stimulate the action of the others. Hence they manage to collectively build complex networks of trails towards food sources. Similarly, in a distributed collective music paradigm, one performer leaves a musical trace (i.e. the result of a control command to the music server) in the shared composition via independent control, which in turn stimulates the other co-performers to react and build on top of it (via other control commands).

3.5. Concurrent and On-the-fly Control of Musical Processes

We have proposed a biologically inspired memory and process architecture for the SMuSe as well as a computational model based on software agents. The OSC communication protocol makes it easy to send text-based commands to specific agents in the hierarchy. It allows for flexible and intuitive time-based, concurrent and on-the-fly control of musical processes.

The different musical agents in the SMuSe all have a specific ID/address at which they receive commands and data. The addresses are divided into /global (affecting the whole hierarchy), /voice[n] (affecting specific voices), and /synth[n] (affecting specific sound generators). The OSC syntax supports regular expressions, which makes it possible to address several modules at the same time with a compact syntax.

Patterns of perceptually-grounded musical features are sent to the short-term (STM) and long-term memory (LTM) modules at any moment in time via specific commands.

/* Example: fill up the STM */
/voice1/rhythm/pattern 4n 4n 8n 16n 16n 4n
/voice1/pitch/pattern 0 0 5 7 10 0
/voice1/pitch/register 4 5 4 4 4 5
/voice1/velocity 12 12 16 16 12 32
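Because the agents are addressed over OSC, commands like these could in principle be sent from any networked client (see Section 3.4). The sketch below uses the third-party python-osc package to emit the same messages; the host, port and the assumption that the server accepts these argument types over UDP are placeholders that would depend on how a particular SMuSe instance is configured.

from pythonosc.udp_client import SimpleUDPClient

# Placeholder host/port: point this at the machine running the music server.
client = SimpleUDPClient("127.0.0.1", 9000)

# Fill up the short-term memory of voice 1 (same content as the example above).
client.send_message("/voice1/rhythm/pattern", ["4n", "4n", "8n", "16n", "16n", "4n"])
client.send_message("/voice1/pitch/pattern", [0, 0, 5, 7, 10, 0])
client.send_message("/voice1/pitch/register", [4, 5, 4, 4, 4, 5])
client.send_message("/voice1/velocity", [12, 12, 16, 16, 12, 32])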

Musical sequence generation follows different Selection principles, a term inspired by the reflections of Koenig on serial music and algorithmic composition [19]. It refers to the actions taken by the system to generate musical events using the available short- and long-term memory content.

Figure 4. The SMuSe's environment: the SMuSe can interact with its environment through different sensors such as biosignals, camera, gazers, lasers, pressure-sensitive floor, MIDI and audio, but also via OSC commands sent from client applications (such as a console terminal, IQR, the IanniX graphical score, the Torque game engine, etc.) to the music server over the network.

These actions can be deterministic (e.g. playback of a stored sequence) or based on the probability of occurrence of specific events (series, Markov chains, random events). This allows for a hybrid approach to algorithmic composition where complex stochastic processes are mixed with more deterministic repeating patterns (Cf. Table ??). Expressivity parameters such as articulation, tempo and dynamics can be continuously accessed and modulated.
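A hybrid selection principle of this kind could be sketched as follows: the Python code below mixes deterministic playback of a stored pattern with first-order Markov sampling over the same memory content. The blend parameter, the pattern and the Markov estimation are illustrative assumptions, not the SMuSe's actual selection machinery.

import random
from collections import defaultdict

def markov_table(sequence):
    """First-order transition table estimated from a stored sequence."""
    table = defaultdict(list)
    for a, b in zip(sequence, sequence[1:]):
        table[a].append(b)
    return table

def select(memory, n, p_stochastic=0.3):
    """Selection principle: mostly deterministic playback, sometimes Markov jumps."""
    table = markov_table(memory)
    out, i, current = [], 0, memory[0]
    for _ in range(n):
        if random.random() < p_stochastic and table[current]:
            current = random.choice(table[current])   # stochastic selection
        else:
            current = memory[i % len(memory)]         # deterministic playback
            i += 1
        out.append(current)
    return out

stm = [0, 0, 5, 7, 10, 0]   # pitch classes stored in short-term memory
print(select(stm, 12))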

3.6. Human in the Loop

We tested the SMuSe within different sensing environments ranging from physiology sensors that provide implicit emotional user interaction (heart rate, electrodermal activity, electroencephalogram), to virtual and mixed-reality sensors for behavioral interaction (camera, gazers, lasers, pressure-sensitive floors), and finally MIDI and audio (microphone) for direct musical interaction (Figure 4). The SMuSe integrates sensory data from the environment (which conveys information about the human participants interacting with the system) in real time and sends this interpreted data to the music generation processes after appropriate fixed or learned musical mappings. The initial musical material generated by the SMuSe is amplified, transformed and nuanced as the interaction between the system and the participant evolves.

3.7. Emotional Mappings

We have built a situated cognitive music system that is sensitive to its environment via musical, behavioral and physiological sensors. Thanks to a flexible architecture, the system is able to memorize, combine and generate complex musical structures in real time. We have described various sensate environments that provide feedback information from the human interactor and allow the interaction loop to be closed. As proposed previously, we take an approach that focuses on emotion-related feedback.

Table 1. The relationship between musical parameters and perceived emotion: a summary of the main results (adapted from [10])

3.7.1. Predefined Mappings

From advanced behavioral and physiological interfaces, we can infer emotion-related information about the interaction with the SMuSe. One possible way to take this feedback into account is to design fixed mappings based on previous results from experimental psychology studies that have investigated the emotional responses to specific musical parameters. This a priori knowledge can be used to drive the choice of musical parameters depending on the difference between the goal emotion to be expressed or induced and the emotional state detected by the system via its sensors. A number of reviews have proposed generic relationships between sound parameters and emotional responses [9, 13, 12]. These results serve as the basis for explicit emotional mappings (Cf. Table 1) and have been confirmed in the specific context of the SMuSe's parameter space. This explicit design approach is detailed in [26, 20].
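A fixed mapping of this kind might look like the following sketch, which nudges tempo and dynamics toward a target arousal level and selects mode from target valence, following coarse tendencies reported in the music-and-emotion literature (faster/louder for higher arousal, major mode for positive valence). The two-dimensional valence/arousal representation, the gains and the parameter ranges are our illustrative assumptions, not the mappings actually used in the SMuSe.

def predefined_mapping(target, sensed, params, gain=0.1):
    """
    Nudge musical parameters toward a target emotional state.
    target, sensed: dicts with 'valence' and 'arousal' in [-1, 1].
    params: dict with 'tempo' (BPM), 'dynamics' (MIDI velocity), 'mode'.
    """
    d_arousal = target["arousal"] - sensed["arousal"]

    # Higher arousal is generally associated with faster tempo and louder dynamics.
    params["tempo"] = max(40.0, min(200.0, params["tempo"] + gain * d_arousal * 60.0))
    params["dynamics"] = max(1.0, min(127.0, params["dynamics"] + gain * d_arousal * 40.0))

    # Positive valence is coarsely associated with major mode, negative with minor.
    params["mode"] = "major" if target["valence"] >= 0 else "minor"
    return params

state = {"tempo": 100.0, "dynamics": 64.0, "mode": "major"}
print(predefined_mapping({"valence": -0.5, "arousal": 0.7},
                         {"valence": 0.1, "arousal": -0.2}, state))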

3.7.2. Adaptive Mappings

In cases where the relationship between a change of musical parameters and the emotional response has to be learned, an explicit mapping is not possible. One solution in the SMuSe is to use a Reinforcement Learning (RL) agent that learns to adapt the choice of musical parameters based on the interaction of the system with the environment [43]. Reinforcement learning is a biologically plausible learning algorithm particularly suited to an explorative and adaptive approach to mapping, as it tries to find a sequence of parameter changes that optimizes a reward function (which, for instance, relates to a state of emotional stress) [25]. Unlike most previous uses of reinforcement learning in music systems, where the reward relates to some predefined musical rules or to the quality of an improvisation, we are interested in the emotional feedback from the listener (Figure 5).

Figure 5. The reinforcement-based music system is composed of three main components: the music engine (the SMuSe), the reinforcement learning agent and the listener who provides the reward signal.

This approach contrasts with expert systems such as the KTH rule system [7, 8], which can modulate the expressivity of music by applying a set of predefined rules inferred from extensive prior music and performance analysis. Here, we propose a paradigm where the system learns to autonomously tune its own parameters as a function of the desired reward (some emotional feedback) without using any a priori musical rules.
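The learning loop can be illustrated with a deliberately simplified sketch: an epsilon-greedy agent that learns which tempo setting yields the highest listener reward. The actual agent described in [25, 43] operates over a richer parameter space and a reward derived from measured emotional feedback; the reward function below is only a simulated stand-in, and the action set and learning rate are arbitrary.

import random

class EpsilonGreedyAgent:
    """Tiny bandit-style RL agent over a discrete set of parameter values."""
    def __init__(self, actions, epsilon=0.2, lr=0.1):
        self.actions = actions
        self.epsilon, self.lr = epsilon, lr
        self.values = {a: 0.0 for a in actions}   # estimated reward per action

    def choose(self):
        if random.random() < self.epsilon:            # explore
            return random.choice(self.actions)
        return max(self.values, key=self.values.get)  # exploit the best estimate

    def update(self, action, reward):
        self.values[action] += self.lr * (reward - self.values[action])

def listener_reward(tempo):
    """Stand-in for emotional feedback (e.g. derived from physiology).
    This simulated listener happens to prefer tempi near 90 BPM."""
    return -abs(tempo - 90) / 90.0 + random.gauss(0, 0.05)

agent = EpsilonGreedyAgent(actions=[60, 90, 120, 150])
for _ in range(500):
    tempo = agent.choose()                        # agent sets a musical parameter
    agent.update(tempo, listener_reward(tempo))   # listener's response acts as reward

print(agent.values)  # the 90 BPM action should accumulate the highest value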

Interestingly enough, the biological validity of RL is supported by numerous studies in psychology and neuroscience that have found various examples of reinforcement learning in animal behavior (e.g. the foraging behavior of bees [30] or the dopamine system in primate brains [40]).

4. ARTISTIC REALIZATIONS

We have further explored the purposive construction of interactive installations and performances using the SMuSe system. To name but a few, in the VRoboser installation [22], the sensory inputs (motion, color, distance) of a 3D virtual Khepera (http://www.k-team.com/) robot living in a game-like environment modulated musical parameters in real time, thus creating a never-ending musical soundscape in the spirit of Brian Eno's “Music for Airports”. In another context, the SMuSe generated automatic soundscapes and music which reacted to and influenced the spatial behavior of humans and avatars in the mixed-reality space called XIM (for eXperience Induction Machine) [2], thus emphasizing the role of the environment and interaction in the musical composition. Based on similar premises, Re(PER)curso, an interactive mixed-reality performance involving dance, percussion, interactive music and video, was presented at the ArtFutura Festival 07 and the Museum of Modern Art in Barcelona in the same year. The performance was composed of several interlaced layers of artistic and technological activities. The musical control had three components: a predefined soundscape, the percussionist who performed from a score, and the interactive composition system synchronized by the SMuSe. The physical actors, the percussionist and the dancer, were tracked by a video-based active tracking system that in turn controlled an array of moving lights illuminating the scene. The spatial information from the stage obtained by the tracking system was also projected onto the virtual world, where it modulated the avatar's behavior, allowing it to adjust body position, posture and gaze to the physical world. In 2009, the Brain Orchestra, a multimodal performance using brain-computer interfaces, explored the creative potential of a collection of brains directly interfaced to the world. During the performance, four “brain musicians” controlled a string quartet generated by the SMuSe using their brain activity alone. The orchestra was conducted by an “emotional conductor”, whose emotional reactions were recorded using biosignal interfaces and fed back to the system. The Brain Orchestra was premiered in Prague at the FET 09 meeting organized by the European Commission [21]. Finally, a live performance of a piece inspired by Terry Riley's “In C” served as an illustration of the principles of parallelism, situatedness and emergence exhibited by the SMuSe at the Ernst Strungmann Forum on Language, Music and the Brain: A Mysterious Relationship.

Figure 6. Artistic realizations: A) Re(PER)curso (ArtFutura and MACBA, Barcelona 2007); B) the Multimodal Brain Orchestra (FET, Prague 2009); C) XIM sonification (2007).

5. CONCLUSIONS

The SMuSe illustrates a novel situated approach to music composition systems. It is built on a cognitively plausible architecture that takes into account the different time frames of music processing, and uses an agent framework to model a society of simple distributed musical processes. It takes advantage of its interaction with the environment to go beyond the classic sense-think-act paradigm [38].

It combines cognitively relevant representations with perceptually grounded sound synthesis techniques and is based on modern data-flow audio programming practices [35, 36]. This provides an intuitive, flexible and distributed control environment that can easily generate complex musical structures in real time. The SMuSe can sense its environment via a variety of sensors, notably physiology-based sensors. The analysis and extraction of relevant information from sensor data allows emotion-based feedback to be re-injected into the system based on the responses of the human participant. The SMuSe proposes a set of “pre-wired” mappings from emotions to musical parameters grounded in the literature on music and emotion, as well as a reinforcement learning agent that performs online adaptive mapping. It provides a well-grounded approach towards the development of advanced synthetic aesthetic systems and a further understanding of the fundamental psychological processes on which they rely.

6. REFERENCES

[1] E. Altenmuller, "How many music centers are in the brain?" in Annals of the New York Academy of Sciences, vol. 930 (The Biological Foundations of Music). John Wiley & Sons, 2001, pp. 273–280.

[2] U. Bernardet, S. Bermúdez i Badia, A. Duff, M. Inderbitzin, S. Le Groux, J. Manzolli, Z. Mathews, A. Mura, A. Valjamae, and P. Verschure, "The experience induction machine: A new paradigm for mixed-reality interaction design and psychological experimentation," The Engineering of Mixed Reality Systems, pp. 357–379, 2010.

[3] E. Bonabeau, M. Dorigo, and G. Theraulaz, Swarm intelligence: from natural to artificial systems. Oxford University Press, USA, 1999.

[4] B. Bongers, "Physical interfaces in the electronic arts. Interaction theory and interfacing techniques for real-time performance," Trends in Gestural Control of Music, pp. 41–70, 2000.

[5] M. M. Bradley and P. J. Lang, "Affective reactions to acoustic stimuli," Psychophysiology, vol. 37, no. 2, pp. 204–215, March 2000.

[6] J. Fodor, The language of thought. Harvard University Press, 1975.

[7] A. Friberg, R. Bresin, and J. Sundberg, "Overview of the KTH rule system for musical performance," Advances in Cognitive Psychology, Special Issue on Music Performance, vol. 2, no. 2-3, pp. 145–161, 2006.

[8] A. Friberg, "pDM: An expressive sequencer with real-time control of the KTH music-performance rules," Computer Music Journal, vol. 30, no. 1, pp. 37–48, 2006.

[9] A. Gabrielsson and P. N. Juslin, "Emotional expression in music performance: Between the performer's intention and the listener's experience," Psychology of Music, vol. 24, no. 1, pp. 68–91, 1996.

[10] A. Gabrielsson and E. Lindström, "The influence of musical structure on emotional expression," in Music and Emotion: Theory and Research, ser. Series in Affective Science. New York: Oxford University Press, 2001.

[11] E. Hutchins and G. Lintern, Cognition in the Wild. Cambridge, MA: MIT Press, 1995.

[12] P. N. Juslin and P. Laukka, "Communication of emotions in vocal expression and music performance: different channels, same code?" Psychological Bulletin, vol. 129, no. 5, pp. 770–814, Sep 2003.

[13] P. N. Juslin and J. A. Sloboda, Eds., Music and emotion: theory and research. Oxford; New York: Oxford University Press, 2001.

[14] P. Kivy, Introduction to a Philosophy of Music. Oxford University Press, USA, 2002.

[15] B. R. Knapp and H. S. Lusted, "A bioelectric controller for computer music applications," Computer Music Journal, vol. 14, no. 1, pp. 42–47, 1990.

[16] S. Koelsch and W. Siebel, "Towards a neural basis of music perception," Trends in Cognitive Sciences, vol. 9, no. 12, pp. 578–584, 2005.

[17] C. Krumhansl, "An exploratory study of musical emotions and psychophysiology," Canadian Journal of Experimental Psychology, vol. 51, no. 4, pp. 336–353, 1997.

[18] ——, "Rhythm and pitch in music cognition," Psychological Bulletin, vol. 126, no. 1, pp. 159–179, 2000.

[19] O. Laske, "Composition theory in Koenig's Project One and Project Two," Computer Music Journal, pp. 54–65, 1981.

[20] S. Le Groux, M. Brotons, J. Manzolli, and P. F. M. J. Verschure, "Perceptual determinants of emotional responses to sound in dementia patients: Towards a sound-based diagnosis," in Music, Science and Medicine: Frontiers in Biomedical Research and Clinical Applications. New York Academy of Sciences, March 2011.

[21] S. Le Groux, J. Manzolli, M. Sanchez, A. Luvizotto, A. Mura, A. Valjamae, C. Guger, R. Prueckl, U. Bernardet, and P. F. M. J. Verschure, "Disembodied and collaborative musical interaction in the Multimodal Brain Orchestra," in NIME '10: Proceedings of the International Conference on New Interfaces for Musical Expression. Sydney, NSW, Australia: ACM Press, June 2010. [Online]. Available: http://www.educ.dab.uts.edu.au/nime/PROCEEDINGS/

[22] S. Le Groux, J. Manzolli, and P. F. M. J. Verschure, "VR-RoBoser: real-time adaptive sonification of virtual environments based on avatar behavior," in NIME '07: Proceedings of the 7th International Conference on New Interfaces for Musical Expression. New York, NY, USA: ACM Press, 2007, pp. 371–374. [Online]. Available: http://portal.acm.org/citation.cfm?id=1279822

[23] S. Le Groux and P. F. M. J. Verschure, "PERCEPTSYNTH: Mapping perceptual musical features to sound synthesis parameters," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, USA, April 2008.

[24] ——, "Situated interactive music system: Connecting mind and body through musical interaction," in Proceedings of the International Computer Music Conference. Montreal, Canada: McGill University, August 2009. [Online]. Available: http://www.cs.ucl.ac.uk/research/vr/Projects/PRESENCCIA/Public/presenccia_pub/sharedDocuments/presenccia_publications/Publications/app1/20_LeGroux-ICMC09.pdf

[25] ——, "Adaptive music generation by reinforcement learning of musical tension," in 7th Sound and Music Computing Conference, Barcelona, Spain, July 2010. [Online]. Available: http://smcnetwork.org/files/proceedings/2010/24.pdf

[26] ——, "Emotional responses to the perceptual dimensions of timbre: A pilot study using physically inspired sound synthesis," in Proceedings of the 7th International Symposium on Computer Music Modeling, Malaga, Spain, June 2010.

[27] D. Levitin and A. Tirovolas, "Current advances in the cognitive neuroscience of music," Annals of the New York Academy of Sciences, vol. 1156 (The Year in Cognitive Neuroscience 2009), pp. 211–231, 2009.

[28] L. B. Meyer, Emotion and Meaning in Music. The University of Chicago Press, 1956.

[29] M. Minsky, The society of mind. Simon and Schuster, 1988.

[30] P. Montague, P. Dayan, C. Person, and T. Sejnowski, "Bee foraging in uncertain environments using predictive Hebbian learning," Nature, vol. 377, no. 6551, pp. 725–728, 1995.

[31] J. Paradiso, "Electronic music: new ways to play," IEEE Spectrum, vol. 34, no. 12, pp. 18–30, 2002.

[32] I. Peretz and M. Coltheart, "Modularity of music processing," Nature Neuroscience, vol. 6, no. 7, pp. 688–691, 2003.

[33] I. Peretz and R. Zatorre, "Brain organization for music processing," Annual Review of Psychology, vol. 56, 2005.

[34] R. Pfeifer and C. Scheier, Understanding intelligence. The MIT Press, 2001.

[35] M. Puckette, "Pure Data: another integrated computer music environment," in Proceedings of the 2nd Intercollege Computer Music Concerts, Tachikawa, Japan, 1996.

[36] M. Puckette, T. Apel et al., "Real-time audio analysis tools for Pd and MSP," 1998.

[37] D. Rosenboom, "Biofeedback and the arts: Results of early experiments," Computer Music Journal, vol. 13, no. 4, pp. 86–88, 1989.

[38] R. Rowe, Interactive music systems: machine listening and composing. Cambridge, MA, USA: MIT Press, 1993.

[39] S. Russell and P. Norvig, Artificial intelligence: a modern approach. Prentice Hall, 2009.

[40] W. Schultz, P. Dayan, and P. Montague, "A neural substrate of prediction and reward," Science, vol. 275, no. 5306, p. 1593, 1997.

[41] H. Simon, The sciences of the artificial. Cambridge, MA, USA: MIT Press, 1981.

[42] B. Snyder, Music and memory: an introduction. The MIT Press, 2000.

[43] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning). The MIT Press, March 1998.

[44] P. F. M. J. Verschure, "Synthetic epistemology: The acquisition, retention, and expression of knowledge in natural and synthetic systems," in Proceedings of the 1998 IEEE International Conference on Fuzzy Systems (IEEE World Congress on Computational Intelligence), vol. 1. IEEE, 1998, pp. 147–152.

[45] P. F. M. J. Verschure, T. Voegtlin, and R. J. Douglas, "Environmentally mediated synergy between perception and behaviour in mobile robots," Nature, vol. 425, no. 6958, pp. 620–624, Oct 2003.

[46] D. Zicarelli, "How I learned to love a program that does nothing," Computer Music Journal, vol. 26, pp. 44–51, 2002.