
VIRTUALBAND: INTERACTING WITH STYLISTICALLY CONSISTENT AGENTS

    Names should be omitted for double-blind reviewing

    Affiliations should be omitted for double-blind reviewing

    ABSTRACT

VirtualBand is a multi-agent system dedicated to live computer-enhanced music performances. VirtualBand enables one or several musicians to interact in real-time with stylistically plausible virtual agents. The problem addressed is the generation of virtual agents that each represent the style of a given musician, while reacting to human players. The solution we propose is a generation framework that relies on feature interaction. Virtual agents exploit a style database, which consists of an audio signal from which a set of MIR features are extracted. Musical interactions are represented by directed connections between virtual agents. The connections are specified as database filters based on MIR features extracted from the agents' audio signal. We claim that such a connection framework makes it possible to implement meaningful musical interactions and to produce stylistically consistent musical output. We illustrate this concept through several examples in jazz improvisation, beatboxing and interactive mash-ups.

    1. INTRODUCTION

Collective improvisation is a group practice in which several musicians contribute their part to produce a coherent musical output. Each musician typically brings in musical knowledge, taste, and technical skills, and more generally his or her style, which makes him or her unique and recognizable. However, good improvisations are not only about putting together the competence of several individuals. Listening and interacting with each other is also crucial, as it enables each musician to adapt his or her playing to the global musical output in terms of rhythm, intensity, harmony, etc. The combination of individual styles with the definition of their interaction defines the quality of a music band. In short, group improvisation can be seen as interactions between stylistically consistent agents.

Previous works have attempted to simulate computationally the behavior of real musicians in a context of musical group practice. For instance, in [5] a MIDI-based model of an improviser's personality is proposed to build a virtual trio system, but no explicit interactions between


virtual agents and real musicians are proposed. Improtek [9] is a system for live improvisation that generates musical sequences built in real-time from a live source using feature similarity and concatenative synthesis. Improtek plays along with a musician improvising on harmonic and beat-based music, in the style of this musician, but interactions with the virtual agents are based solely on similarity measures, thereby limiting the scope of possible man-machine interactions. [3] describes the Jambot, a system that infers in real-time various features from an audio signal and then produces percussive responses, alternating between two behaviors: imitative and intelligent. In the imitative behavior, the system responds directly to a user onset with a percussive sound, which is limited both in terms of interaction and style. In the intelligent behavior, the system bases its output on a global understanding of the musical context (tempo, meter, etc.), but not directly on the user input: the system is not really interacting with the musician. Most importantly, even the style of the Jambot is not clearly defined. [6] presents Beatback, a system that generates a MIDI signal based on rhythmic patterns and using Markov models. Besides the fact that the sole use of MIDI signals is musically restrictive, interactions are limited to rhythmic elements only. In [8], Martin et al. propose a system that interacts in real-time with a human musician, adapting its behavior both to prior knowledge (a database of musical situations) and to parameters extracted during the current session, but this system forces the musician to deal with a graphical interface during the improvisation, which makes it hardly usable in a live performance.

In this paper we revisit the problem of musical interaction with stylistically consistent agents by taking a feature-based MIR approach. We introduce VirtualBand, a reactive music multi-agent system in which real musicians can control and interact with virtual musicians that retain their own style.

A virtual musician is a representation of an actual musician as a virtual agent. A virtual agent is designed to play or improvise like the actual musician it represents. To build the virtual agent, we record the musician in a musical situation that fits the targeted improvisation context. The resulting audio content is stored in a database. This database is organized in musically relevant chunks (usually beats and bars). To each chunk is associated a set of features that are used to define and implement the interaction schemes. During the performance, the virtual agent uses concatenative synthesis (see [13] for instance) to generate music streams as seamless concatenations of database chunks.

Figure 1. Architecture of VirtualBand. VirtualBand modules are in the rectangular frame at the bottom. Connections between agents are represented by red arrows.

The real musicians, those participating in the improvisation, are also represented by agents, called here real agents. Real agents do not relate to any prerecorded audio, but rather extract various acoustic and/or MIDI features from the musical output of the real musician. These features are used to implement the interactions with the virtual agents.

The interactions between the real and virtual musicians are implemented in VirtualBand by connections, which are the core contribution of our approach. A connection defines precisely the interaction scheme between two agents (real or virtual). Technically, a connection is a directed link from a master (real or virtual) agent to a slave virtual agent (see Figure 1). The link is defined by a mapping between feature values. These mappings, in turn, are specified as filters that define which database chunks slave agents should select from their database, in reaction to the analyzed input. In that sense our multi-agent system is an instance of a reactive system rather than a deliberative one [7]. Its main potential lies in the seemingly infinite possibilities provided by feature interaction, as illustrated in this paper.

In Section 2 we describe the main components of VirtualBand. In Section 3 we show how to build a reflexive virtual band from the interaction point of view, in the context of jazz improvisation, and illustrate the power of feature interaction on several configurations of the system. Section 4 describes applications of VirtualBand in two other musical contexts: beatboxing and automatic mash-up.

    2. SYSTEM DESCRIPTION

The core of the VirtualBand system (see Figure 1) consists of:

- A clock that establishes the tempo by sending notifications at each beat to every agent. We imposed a fixed tempo in the current version, mainly because time-stretching algorithms are still not compatible with the real-time constraints of the system.

- An audio engine and a mixer to play the outputs of the virtual agents.
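For illustration, here is a minimal Python sketch of this beat-synchronous scheduling (our own illustration, not the system's actual code; Clock, Agent and on_beat are hypothetical names):

    import time

    class Agent:
        """Base class: every agent receives a notification at each beat."""
        def on_beat(self, beat_index):
            print(f"agent {id(self)} reacts to beat {beat_index}")

    class Clock:
        """Fixed-tempo clock, as imposed in the current version."""
        def __init__(self, bpm, agents):
            self.period = 60.0 / bpm        # seconds per beat
            self.agents = agents

        def run(self, n_beats):
            for beat in range(n_beats):
                start = time.time()
                for agent in self.agents:
                    agent.on_beat(beat)     # each agent selects/plays its next chunk
                # sleep away the rest of the beat so the tempo stays fixed
                time.sleep(max(0.0, self.period - (time.time() - start)))

    Clock(bpm=120, agents=[Agent(), Agent()]).run(n_beats=4)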

    2.1 Agents

There are two types of agents: human and virtual agents, representing respectively actual and virtual musicians.

Human agents are responsible for analyzing the input (audio or MIDI) from the human musician they represent, and for extracting acoustic features from the signal in real-time. The values of these features are used as a means of controlling virtual agents via the connections, as explained below.

Virtual agents play music from a style database consisting of prerecorded audio content. The music stored in the database is organized in bars and beats, from which a set of feature values is extracted. These features are the means by which real agents control the virtual agents via connections.

Concretely, a virtual agent computes at every beat the next audio chunk to play by choosing among the available audio beats in its database (and possibly transforming them). To do so, the virtual agent applies a succession of filters to this database, as specified by its connections with other agents (see below). Once the next chunk to play (beat or bar) has been selected, the agent performs a cross-fade between the end of the beat currently being played and the beginning of the selected beat, and proceeds.

    2.2 Style Databases

A style database represents the musical skills of a musician in a given musical style/context. Concretely, a style database consists of an audio signal from which a set of MIR features is extracted. The construction of such a style database requires two steps: recording and extraction.

In the recording step, a musician plays so as to fully express his or her musical style, i.e., attempts to cover a large range of musical situations (e.g., playing with different intensities, in different moods, staccato or legato, using various patterns, etc.). Of course, it is difficult to express one's style exhaustively, so style databases are limited to specific musical situations, e.g., defined by a musical genre such as Bossa nova, Swing, etc., and a tempo. Note that the subject of individual style capture is still in its infancy and further studies should refine this concept.

In the feature extraction step, the recorded audio signal is segmented into bars and beats (in our current implementation the tempo is given a priori, so the segmentation task is easy), from which various features are extracted, either from the bars or from the beats. The extracted features are common MIR features (temporal features, e.g., R.M.S., number of onsets or rhythmic patterns, and spectral features, e.g., spectral centroid, harmonic-to-noise ratio or chroma), but they can also be imported from a specific musical context (e.g., harmony, if the musician played with a chord grid).
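A minimal sketch of this extraction step, assuming a fixed tempo and a 1-D numpy signal (the feature formulas are textbook definitions, not necessarily the exact ones used in the system):

    import numpy as np

    def beat_features(signal, sr, bpm):
        """Segment a recording into beats at a known fixed tempo and compute
        two common MIR features per beat: R.M.S. energy and spectral centroid."""
        beat_len = int(sr * 60.0 / bpm)
        beats = [signal[i:i + beat_len]
                 for i in range(0, len(signal) - beat_len + 1, beat_len)]
        feats = []
        for b in beats:
            rms = float(np.sqrt(np.mean(b ** 2)))
            spec = np.abs(np.fft.rfft(b))
            freqs = np.fft.rfftfreq(len(b), d=1.0 / sr)
            centroid = float(np.sum(freqs * spec) / (np.sum(spec) + 1e-12))
            feats.append({"audio": b, "rms": rms, "centroid": centroid})
        return feats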

    2.3 Connections

A connection between two agents represents a musical interaction between the corresponding musicians. A connection consists of two agents (a master and a slave), a pair of features (one for each agent), and a filter that selects from the database the best musical contents (bars/beats) to play depending on the features of each agent (see Figure 1). The filter can be designed in various ways, and offers a lot of flexibility to represent many diverse interaction scenarios. In general, an interaction is specified as follows.

Consider a master agent MA that controls a slave agent SA. The problem is to select a chunk in SA's database that is most appropriate for the context defined by MA. This selection is performed using 1) a master-feature type (e.g., spectralCentroid), 2) a slave-feature type (e.g., hitCount), and 3) a master-feature value, provided at time t (e.g., 0.8). One problem is that the feature types may operate in different spaces, with different distributions of values. Therefore, a mapping φ is needed to map values in the master feature space to values in the slave feature space. Such a mapping can be defined by:

    SA.f = φ(MA.f)

The audio chunk in SA is then determined by applying a succession of filters to the database, until a single value is found.

For instance, consider a real guitar agent (RG) that controls a virtual drummer (VD). The master feature of the connection can be the R.M.S. of the guitar, and the slave feature is the hit-count, i.e., the number of onsets in each bar, of the virtual drummer. To select the next beat to be played by the virtual drummer, the filter establishes a linear mapping from the input R.M.S. value to a target hit-count value, and then selects in the database the audio beat with the hit count closest to the target hit count.

Formally, such a filter can be specified as:

    RG.RMS matches VD.Hit-Count
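Such a filter could be sketched as follows, reusing the chunk dictionaries of the earlier sketches and assuming each chunk carries a hit_count feature; the two value ranges play the role of the mapping φ above (all names are ours):

    def rms_to_hitcount_filter(master_rms, rms_range, hit_range):
        """Linearly map the guitar's R.M.S. onto a target hit count, then
        keep the database beat whose hit count is closest to that target."""
        (lo, hi), (h_lo, h_hi) = rms_range, hit_range
        alpha = (master_rms - lo) / (hi - lo)        # normalize the master value
        target = h_lo + alpha * (h_hi - h_lo)        # map into the slave space
        def apply(candidates):
            return [min(candidates, key=lambda c: abs(c["hit_count"] - target))]
        return apply

    # e.g., the guitar's R.M.S. is 0.8 on a [0, 1] scale and the drummer's
    # hit counts range over [0, 16]:
    # filters = [rms_to_hitcount_filter(0.8, (0.0, 1.0), (0, 16))]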

    2.4 Harmony

In tonal musical contexts, such as standard jazz, musicians improvise on a chord sequence that is known to all musicians beforehand (e.g., Solar, by Miles Davis). In VirtualBand, harmony is implemented by a specific agent that represents the chord sequence and dispatches harmony to all other agents.

The chord sequence agent (CSA) provides a unique feature, called CHORD, whose value is the name of the next chord in the sequence. This agent is connected to every virtual agent that is harmony-dependent, for instance pitched instruments (piano, guitar, singing voice). For a virtual agent Vi, the connection is defined as follows: the master agent is CSA, its master feature is CHORD, and the slave agent is Vi. The slave feature and filter are left undetermined, but should be implemented so that the virtual agent Vi plays something that is harmonically consistent with the chord returned by the CSA.

For instance, consider a piano comping agent (PCA) as slave agent. Its database consists of various recorded beats, each containing a single audio chord. Each beat is annotated with a feature CHORD which represents the chord name. In this case the slave feature is the chord of PCA (PCA.CHORD). The filter is defined as:

    PCA.CHORD matches CSA.CHORD

The matching is not necessarily strict, but can use substitution rules, as explained in the next section.
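A sketch of such a harmonic filter, under the same conventions as the previous sketches (the fallback behavior when nothing matches is our own assumption):

    def chord_filter(csa_chord, subs):
        """Keep the chunks whose CHORD feature matches the chord requested by
        the chord sequence agent, allowing rule-based substitutions."""
        allowed = {csa_chord} | subs.get(csa_chord, set())
        def apply(candidates):
            kept = [c for c in candidates if c["chord"] in allowed]
            return kept or candidates    # fall back rather than go silent
        return apply

    # e.g., allow a G min 7 chunk to be played where C 7 is required:
    # filters = [chord_filter("C7", {"C7": {"Gmin7"}})]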

2.5 Reflexive Style Databases and Transformations

VirtualBand can also be used in a memoryless mode, i.e., without pre-recorded style databases. One motivation for getting rid of prerecorded material is to avoid canned-music effects arising from the fact that the audience does not know the content of the style databases. Instead, those databases can be built on-the-fly, in a manner typical of interactive systems such as the Continuator [10] or Omax [9]. Reflexive style databases can be used to generate various species of self-accompaniment such as duos or trios with oneself (see, e.g., [11]).

Conceptually, reflexive style databases do not differ from pre-recorded ones. Technically, they raise various real-time issues, since their segmentation and feature extraction must be performed in real-time without interfering with the main VirtualBand loop; these issues are not discussed in this paper.

More interestingly, reflexive style databases raise sparsity issues. Because the database contains only what the musician has played so far, its size will typically be much smaller than that of pre-recorded databases. As a consequence, the system has far fewer possibilities for generation. This is particularly visible in contexts using predetermined chord grids. In principle, the system requires a long feeding phase, to accumulate at least one chunk for every chord of the sequence, before it can start playing back relevant audio material. Such a feeding phase is usually boring for the musician as well as for the audience.

In order to reduce feeding, we propose to expand style databases automatically by using audio transformations, such as transpositions, as well as explicit musical knowledge. Pitch shifting is an effect that shifts an audio signal up or down pitch-wise [12]. In VirtualBand, the audio chunks of a database are pitch-shifted so that the transposed bars can be used in new harmonic situations.

We use so-called substitution rules to further expand the style database. Substitution rules consist in replacing a chord by another chord with an equivalent harmonic function. This operation is widely used in jazz to bring variety to standard pieces (see [1]). VirtualBand uses chord substitutions to play, in a certain harmonic context, audio that was recorded in another harmonic context, provided the two harmonic contexts may be substituted.

By combining transpositions and substitutions, the feeding phase can be drastically reduced. The three examples of Figure 2 show how to harmonize the song Solar from a small number of input chords, using transpositions (a), substitutions (b), and a combination of both (c). The input chords (highlighted) generate the remaining chords of Solar (marked in the same color). Without substitutions or transpositions, 12 input chords would be required in the feeding phase. In contrast, (a) requires 5 input chords, (b) requires 8, and (c) only 2. For instance, in (c), the input chord G min 7 can be substituted for C 7 using the substitution G min 7 : C 7, but also covers F min 7 using a one-tone downward transposition, and Bb 7 with a combination of these two operations.

Figure 2. Illustration of the feeding-phase reduction using (a) transpositions, (b) substitutions and (c) a combination of both.
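A sketch of how transposition and substitution enlarge the set of coverable chords; the chord-name parsing and the rule encoding are our own simplifications (here subs maps a recorded chord to the chords it can stand in for, the inverse view of the filter sketch in Section 2.4):

    NOTES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

    def transpose(chord, semitones):
        """Transpose a chord name, e.g. transpose('Gmin7', -2) -> 'Fmin7'."""
        root = chord[:2] if len(chord) > 1 and chord[1] == "b" else chord[:1]
        quality = chord[len(root):]
        return NOTES[(NOTES.index(root) + semitones) % 12] + quality

    def coverable(input_chords, subs):
        """All chords coverable from the recorded ones via substitution rules
        and transposition, as in the Figure 2 feeding-phase examples."""
        chords = set(input_chords)
        for c in list(chords):                  # close under substitution
            chords |= subs.get(c, set())
        return {transpose(c, k) for c in chords for k in range(12)}

    # Figure 2 (c): from 'Gmin7' with the rule Gmin7 : C7, both 'Fmin7'
    # (one tone down) and 'Bb7' (substitute, then transpose) are coverable.
    subs = {"Gmin7": {"C7"}}
    print({c for c in coverable(["Gmin7"], subs) if c in {"C7", "Fmin7", "Bb7"}})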

    3. EXAMPLES

A typical jazz guitar combo is a formation in which one guitarist improvises melodies on a harmonic grid, while the two others accompany with chords and bass (i.e., lower notes of the guitar). Each of these roles, melody, chords and bass, represents a different guitar playing mode. During an improvisation, guitarists typically shift roles in turn. When a musician takes the lead (solo), the other guitarists adapt their behavior so that each mode is always played by someone.

The following examples illustrate various configurations of VirtualBand with one real guitarist and two reflexive virtual agents. In the first and second examples we limit the formation to a duo, in which the virtual agent is driven respectively by a spectral centroid feature and by a rhythm pattern feature extracted from the real guitarist. Then, we extend the example to a guitar trio configuration with the presented settings. An accompanying web site illustrates all these configurations with videos of the corresponding performances.

    3.1 Spectral centroid based duo

For two jazz guitarists improvising together, a common interaction scheme is question-answer dialogues. In such a configuration, a guitarist plays a melody, and as soon as he finishes, the other guitarist plays another melody that borrows elements from the first one. Typically the answering melody is in the same pitch range, or shares similar rhythm patterns with the original melody. While a guitarist proposes a melody, the other one either stops playing, or accompanies the first one with, e.g., chord comping.

We can simulate such scenarios with various configurations of VirtualBand based on pitch features of audio content. A reflexive style database records the incoming audio of the real guitarist (RG) and extracts the spectral centroid of each bar. We chose the spectral centroid as an approximation of pitch, but more refined descriptors could be used interchangeably. A virtual agent (VG), associated with the reflexive database, is connected to the real guitarist. The connection filter specifies that at each bar, the virtual agent selects in the database the bar with the spectral centroid closest to the real guitarist's. This can be specified as:

    VG.SpectralCentroid matches RG.SpectralCentroid

Figure 3. Score of a spectral-centroid based duo: a real agent (RA) plays the melody of Solar. A virtual agent (VA) matches the spectral centroid of RA, with a one-bar delay. At bar 11 (Db maj7) VA uses a transposition of bar 5 (F maj7) of RA to match the spectral centroid of bar 10 of RA. At bar 8 (Bb 7), VA uses the substitution F min 7 : Bb 7 of bar 7 of RA.

In such a mode, the system behaves as a self-harmonizer: the virtual guitarist follows the spectral centroid of the real guitarist with a one-bar delay (see Figure 3), playing musical material that sounds like what the guitarist just played.

Another mode consists in doing the opposite, forcing the agent to play audio that is far away pitch-wise from the human's input (of course, still satisfying harmonic constraints). Such a configuration is easily specified as:

    VG.SpectralCentroid oppositeTo RG.SpectralCentroid

In this mode, the two outputs (real and virtual guitar) are more clearly distinct. Of course, it is possible to switch from one filter to another during the performance, e.g., by using specific controllers.
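Both filters can be sketched under the conventions of the Section 2 sketches (all names are ours):

    def matches(feature, master_value):
        """Keep the candidate whose feature value is closest to the master's."""
        def apply(candidates):
            return [min(candidates, key=lambda c: abs(c[feature] - master_value))]
        return apply

    def opposite_to(feature, master_value):
        """Keep the candidate whose feature value is furthest from the master's."""
        def apply(candidates):
            return [max(candidates, key=lambda c: abs(c[feature] - master_value))]
        return apply

    # at each bar, e.g.:
    # filters = [chord_filter(next_chord, subs), matches("centroid", rg_centroid)]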

    3.2 Rhythm-based duo

Question-answer musical dialogues can also be based on rhythmical similarities. A guitarist plays a rhythm pattern for a few bars, and the other one responds by playing a similar pattern. This can be emulated simply in VirtualBand by using other pairs of features.

For instance, we can introduce a feature that represents the rhythm pattern by using an R.M.S. profile. Such a profile is obtained by computing the R.M.S. 12 times per beat over one bar. The distance between two R.M.S. profiles is defined as the scalar product of the two sets. This mode can be specified as:

    VG.RhythmProfile closestTo RG.RhythmProfile

In such a configuration, the musical interaction is of course very different. Because the rhythm patterns played by the virtual agent are not identical to the real guitarist's, this mode can lead to fascinating interactions. However, it typically requires larger databases, so it is best used not right away during the session, but after some time, when the system has accumulated enough rhythm samples to play back interesting variations.
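A sketch of the profile computation, assuming 4/4 bars stored as 1-D numpy arrays. Normalizing the profiles, which the text does not specify, makes the scalar product act as a similarity measure, so closestTo amounts to maximizing it:

    import numpy as np

    def rhythm_profile(bar, points_per_beat=12, beats_per_bar=4):
        """R.M.S. computed 12 times per beat over one bar, as described above."""
        n = points_per_beat * beats_per_bar
        win = len(bar) // n
        prof = np.array([np.sqrt(np.mean(bar[i * win:(i + 1) * win] ** 2))
                         for i in range(n)])
        return prof / (np.linalg.norm(prof) + 1e-12)  # normalize for comparison

    def profile_similarity(p, q):
        """Scalar product of two profiles; with normalized profiles, the
        'closest' bar is the one maximizing this value."""
        return float(np.dot(p, q))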

Here again, another filter can be designed to play an opposite rhythm pattern. Such an opposite pattern allows the musician to play with the various playing modes, as explained in the introduction of this section: if the real guitarist plays (and the database records) gentle chords, and then melodies, the virtual agent will be able to accompany melodies of the guitarist with chords, and chords with melodies.

    3.3 Trio

In a guitar trio, each playing mode (melody, chords and bass) is typically represented by one musician. With VirtualBand, a real guitarist can be self-accompanied by two reflexive virtual agents. Both choose and play their audio beats from the same reflexive database.

Here, in addition to the R.M.S. profile, the reflexive database extracts the playing mode of each recorded beat, among four possible modes: melody, chord, bass and silence. Automatic playing mode identification is described, e.g., in [2].

Each virtual agent is associated with a unique playing mode (one with the bass, one with the chords), and each agent follows a mutually exclusive mode, i.e., it plays only if the real guitarist is not playing in the same mode, following the scheme used in [11]. Furthermore, as in the previous example, each agent follows the rhythmical patterns of the guitarist, with one bar of delay.
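The mode-exclusion rule itself is simple; a toy sketch (all names are ours):

    from dataclasses import dataclass

    @dataclass
    class ModeAgent:
        """A virtual agent tied to one playing mode (e.g. 'bass' or 'chord')."""
        mode: str

        def step(self, real_mode):
            # mutually exclusive modes: stay silent whenever the real
            # guitarist currently occupies this agent's mode
            if self.mode == real_mode:
                return "silence"
            return f"play a '{self.mode}' chunk matching last bar's rhythm"

    bass, chords = ModeAgent("bass"), ModeAgent("chord")
    print(bass.step("bass"), "|", chords.step("bass"))   # the bass agent yields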

Here the musician is free to play as he wants; the virtual agents adapt their mode and their rhythm. The real musician can trigger rhythm patterns he previously recorded by playing similar patterns. For instance, a walking bass, or comping chords (many notes per bar, regularly spaced), can be triggered by playing a fast and regular melody. The guitar trio with oneself follows more or less the rules of a standard guitar trio. A video showing an example of this trio can be seen on the accompanying web site¹.

¹ Hidden for anonymity.

    4. OTHER APPLICATIONS

In this section, we describe applications of VirtualBand in two other musical contexts.

    4.1 Reflexive Beatboxing

Beatboxing is a music style in which musicians use their mouth to simulate percussion. Beatboxing also involves humming, speech or singing (see [14]). Common beatboxers typically alternate between modes, but some great beatboxers are able to play two modes at the same time (e.g., percussion and humming).

Beatboxing with VirtualBand aims at augmenting the performance of a moderately good beatboxer by allowing him or her to play, via reflexive virtual agents, several modes at the same time. Using the same settings as above (Section 3.3), the system records and stores in a reflexive database the incoming audio of the real beatboxer. From this audio, the database distinguishes between two playing modes: the percussion mode and the humming mode. Then two virtual agents are connected to this database, one representing the percussion mode and the other the humming mode. The classification is performed following a scheme similar to [2], except for the set of features selected by the classifier (here harmonic-to-noise ratio, spectral centroid and Yin). Following a mutually exclusive principle, the virtual agents play alternately, depending on the mode of the real beatboxer, and following his or her rhythmic patterns. Table 1 illustrates a typical session with such an augmented beatbox.

            1    2    3    4
    human   t---|t-t-|kk--|t--t|
    Ap      ----|----|----|----|
    Ah      ----|----|----|----|

            5    6    7    8
    human   O-O-|O-O-|A-A-|AO--|
    Ap      ----|t-t-|t-t-|t-t-|
    Ah      ----|----|----|----|

            9    10   11   12
    human   tt--|tktk|OAOA|OAOA|
    Ap      kk--|----|----|tktk|
    Ah      ----|AO--|A-A-|----|

            13   14   15   16
    human   tktk|tktk|----|----|
    Ap      tktk|----|----|tktk|
    Ah      ----|OAOA|OAOA|OAOA|

Table 1. Notation of a 16-bar performance of the reflexive beatboxing system: t and k are percussive sounds, O and A are hummed vowels. Ap and Ah are the percussive and humming virtual agents. Virtual agents follow a mutually exclusive principle and try to match the human's rhythm with a one-bar delay.

    4.2 Mash-up

A mash-up is a song or composition created by blending two or more prerecorded songs, usually by overlaying a track of one song over the tracks of another (see [4]). Mash-ups usually exploit multi-track songs, i.e., songs whose tracks (e.g., drum track, guitar track, voice track) are available as separate audio files.

A mash-up requires two (or more) songs that are harmonically compatible, i.e., songs that share moments where the harmony is similar, or at least superimposable. Finding two such songs is difficult. We solve this problem in VirtualBand with transposition and substitution (see Section 2.5), as they enlarge the harmonic possibilities of a database.

We represent each track of a song by a virtual agent. The audio of each track is stored in a different database. Mashing-up with VirtualBand consists in playing a song where one agent is muted and replaced by the track agent of another song (for instance, the drum track is muted and replaced by the drum track of another song). The virtual agent representing the new track is connected to the muted agent of the original song. Note that the muted agent still plays with the rest of the group (even if it is not audible), in order to master the replacing agent in real-time.
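One bar of this process can be sketched by reusing the rhythm-profile machinery of Section 3.2; treating the rhythm profile as the master feature here is our own assumption, consistent with the example below:

    import numpy as np

    def mashup_step(original_track, replacement_db, bar_index):
        """One bar of the mash-up: the muted original drum track is still
        analyzed, and its rhythm profile (master feature) selects the most
        similar bar from the new drummer's database."""
        target = original_track[bar_index]["rhythm_profile"]
        best = max(replacement_db,
                   key=lambda c: float(np.dot(c["rhythm_profile"], target)))
        return best["audio"]   # the audible output replacing the muted track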

We illustrate mash-up with VirtualBand by replacing the drummer of the band The Police in Roxanne with another drummer. We use four different drummers: the drummer of Nirvana in Smells Like Teen Spirit, the drummer of the Pixies in Hey, and Bossa nova and Funk drums by the jazz drummer Jeff Boudreaux². Each drummer follows the rhythmical patterns of the muted original drummer. We can hear in the provided example that each drummer plays according to the global structure of the song, i.e., plays breaks and plays more intensively on the chorus or bridge.

² http://tinyurl.com/bu5hx6q

    5. CONCLUSION

We have presented VirtualBand, a reactive multi-agent system for live musical improvisation involving real and virtual musicians. VirtualBand revisits the problem of interacting with stylistically consistent agents by taking a MIR approach to the problem. Interactions are specified using feature pairs, taken from the vast library of features developed in MIR. VirtualBand agents are reactive and not deliberative, i.e., they do not attempt to exhibit autonomy, self-directed goal making, planning, etc. However, this paper shows that rich interactions can take place even with only reactive agents. This effect is mainly due to the complex interactions that may occur between two features computed on various audio signals, even simple ones like those used in this paper. We have illustrated the system with several jazz improvisation scenarios, such as a guitar jazz trio played by a single musician, as well as with other types of musical situations such as mash-up and beatboxing.

VirtualBand illustrates the power of feature interaction to represent the complex and meaningful musical interplay occurring in real performance. As such, it only scratches the surface of the new field of individual style modeling. Further work will focus on issues like how to saturate a style database, or how to predict the emergence of long-term structure from basic low-level feature interaction.

    6. REFERENCES

[1] Jerry Coker. Elements of the Jazz Language for the Developing Improvisor. Alfred Music Publishing, 1997.

[2] R. Foulon, F. Pachet, and P. Roy. Automatic classification of guitar playing modes. Submitted, 2013.

[3] T. Gifford and A. R. Brown. Beyond reflexivity: Mediating between imitative and intelligent action in an interactive music system. In 25th BCS Conference on Human-Computer Interaction, 2011.

[4] J. Grobelny. Mashups, sampling, and authorship: A mashupsampliography. Music Reference Services Quarterly, 11(3-4):229-239, 2008.

[5] M. Hamanaka, M. Goto, H. Asoh, and N. Otsu. A learning-based jam session system that imitates a player's personality model. In International Joint Conference on Artificial Intelligence, volume 18, pages 51-58, 2003.

[6] A. Hawryshkewich, P. Pasquier, and A. Eigenfeldt. Beatback: A real-time interactive percussion system for rhythmic practise and exploration. In Proceedings of the International Conference on New Interfaces for Musical Expression, pages 100-105, 2011.

[7] L. Iocchi, D. Nardi, and M. Salerno. Reactivity and deliberation: A survey on multi-robot systems. In Markus Hannebauer, Jan Wendler, and Enrico Pagello, editors, Balancing Reactivity and Social Deliberation in Multi-Agent Systems, volume 2103 of Lecture Notes in Computer Science, pages 9-34. Springer, 2000.

[8] A. Martin, A. McEwan, C. T. Jin, and W. L. Martens. A similarity algorithm for interactive style imitation. In Proc. ICMC, pages 571-574, 2011.

[9] J. Nika and M. Chemillier. Improtek: integrating harmonic controls into improvisation in the filiation of OMax. In Proceedings of the International Computer Music Conference 2012 (ICMC 2012), pages 180-187, 2012.

[10] F. Pachet. The Continuator: Musical interaction with style. Journal of New Music Research, 32(3):333-341, 2003.

[11] F. Pachet, P. Roy, J. Moreira, and M. d'Inverno. Reflexive loopers for solo musical improvisation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13, pages 2205-2208. ACM, 2013.

[12] C. Schörkhuber and A. Klapuri. Pitch shifting of audio signals using the constant-Q transform. In Proc. of the 15th Int. Conference on Digital Audio Effects (DAFx-12), 2012.

[13] D. Schwarz. Current research in concatenative sound synthesis. In Proceedings of the International Computer Music Conference (ICMC), pages 9-12, 2005.

[14] D. Stowell and M. D. Plumbley. Characteristics of the beatboxing vocal style. Technical Report C4DMTR-08-01, Centre for Digital Music, Queen Mary, University of London, 2008.