
Publisher: Routledge

Informa Ltd, registered in England and Wales. Registered number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Contemporary Music Review

Publication details, including instructions for authors and subscription information:

http://www.informaworld.com/smpp/title~content=t713455393

A computational model of rubato

Neil Todd (a)

(a) Department of Psychology, University of Exeter, Exeter, UK

To cite this article: Todd, Neil (1989) 'A computational model of rubato', Contemporary Music Review, 3: 1, 69-88

To link to this article: DOI: 10.1080/07494468900640061

URL: http://dx.doi.org/10.1080/07494468900640061


Contemporary Music Review

1989, Vol. 3, pp. 69-88

Photocopying permitted by license only

© 1989 Harwood Academic Publishers GmbH

Printed in the United Kingdom

A computational model of rubato

Neil Todd

Department of Psychology, University of Exeter, Exeter, UK

Presented is a model of rubato, implemented in Lisp, in which expression is viewed as the mapping of musical structure into the variables of expression. The basic idea is that the performer uses "phrase final lengthening" as a device to reflect some internal representation of the phrase structure. The representation is based on Lerdahl and Jackendoff's time-span reduction. The basic heuristic in the model is recursive, involving look-ahead and planning at a number of levels. The planned phrasings are superposed beat by beat and the output from the program is a list of durations which could easily be adapted to be sent to a synthesiser given a suitable system.

KEYWORDS: computational modelling, music cognition, musical performance, rubato, mental representation, mental process.

Introduction

One of the most ubiquitous expressive devices in musical performance is rubato. Most notably it is used in music of the romantic era, but it is also evident in a variety of other styles. Research on music performance (Seashore, 1938; Shaffer, 1981; Clarke, 1984; Todd, 1985; Bengtsson & Gabrielsson, 1980; Sundberg & Verrillo, 1980) involving the precise measurement of duration has shown that there are a number of basic observations which can be made. The first is that skilled performers can show a remarkable degree of reproducibility from one performance to the next (Shaffer, 1984; Gabrielsson, 1987). This precision in timing shows that the performance must involve the use of generative procedures and a precise internal representation of underlying expressive form. A second observation is the use of slowing to mark a phrase boundary (Todd, 1985), which has been shown to apply recursively at a number of levels (Shaffer & Todd, 1987).

In Todd (1985) a model of rubato was established which generated a duration structure from a structural description of a piece of music. The idea of the model was that the performer uses "phrase final lengthening" to signal a boundary, the degree of slowing being determined by the importance of the boundary. The input to the model was the time-span reduction of Lerdahl and Jackendoff's theory (1983). Whilst the model gave a reasonable description of the data from actual performances of some pieces, there were a number of objections to the model as it stood. This has led to the formulation of a new model. In this paper I will describe the new model and the reasoning which led to its formulation.

The reduction hypothesis and knowledge representation

The first problem with the Todd (1985) model stems from the fact that it inherits the reduction hypothesis of Lerdahl & Jackendoff's theory. That is, the listener, and therefore the performer, sees each event in a single coherent structure. This hypothesis places too high a demand on working memory to be psychologically plausible. In terms of the model it means that when computing a boundary strength, every event in the time-span reduction is taken into account, irrespective of how close, or how far apart, the events are in time. This leads to the prediction of more degrees of boundary strength, and therefore degrees of relative slowing, than can be discerned from the data. On the other hand, it is both psychologically plausible and musically necessary that the performer should have some kind of global overview of the piece as well as being able to look ahead to some degree in order to plan a phrase.

A solution to this problem, which is the first premise of the updated model, is to suppose that the internal representation, rather than being a single, simply connected tree, is composed of a set (or forest) of trees organised on a number of hierarchic levels, with each subset of trees at one level being bound by a tree at a higher level. This accords with Anderson's ACT* theory of cognition (1983). In the theory, knowledge comes in chunks or cognitive units, which can be such things as propositions, spatial images or temporal relations. A cognitive unit encodes a set of no more than about five elements. Larger structures are created by the hierarchical embedding of cognitive units. Of particular interest to us here are cognitive units encoding temporal information, which Anderson refers to as "temporal strings". The notion of temporal strings accords well with the idea of musical groups.
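The forest-of-trees representation can be made concrete with a small data-structure sketch. The Python below is illustrative only: the class name `Unit`, the constant `MAX_ELEMENTS`, and the example grouping are assumptions introduced here and are not taken from the Lisp implementation. It shows cognitive units of at most five elements, embedded hierarchically so that a higher-level unit binds a set of lower-level ones.

```python
from dataclasses import dataclass, field
from typing import List, Union

MAX_ELEMENTS = 5  # assumed cap, after "no more than about five elements"


@dataclass
class Unit:
    """A hypothetical cognitive unit: a short temporal string whose
    elements are either beat indices or embedded lower-level units."""
    label: str
    children: List[Union["Unit", int]] = field(default_factory=list)

    def __post_init__(self):
        if len(self.children) > MAX_ELEMENTS:
            raise ValueError(f"{self.label}: more than {MAX_ELEMENTS} elements")

    def beats(self) -> List[int]:
        """Flatten the unit into the beat indices it spans, in order."""
        out: List[int] = []
        for child in self.children:
            out.extend(child.beats() if isinstance(child, Unit) else [child])
        return out

    def depth(self) -> int:
        """Number of hierarchic levels, counting this unit."""
        sub = [c.depth() for c in self.children if isinstance(c, Unit)]
        return 1 + (max(sub) if sub else 0)


# Two four-beat phrases bound together by one higher-level unit.
phrase_a = Unit("phrase-a", [0, 1, 2, 3])
phrase_b = Unit("phrase-b", [4, 5, 6, 7])
piece = Unit("piece", [phrase_a, phrase_b])
```

Because every unit holds only a handful of elements, any one event is directly connected to few others, while the top-level unit still spans the whole piece.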

A model of performance constructed on this basis predicts a duration structure determined by the superposition of a number of hierarchic timing components, from a global component, spanning the whole piece, to a local component spanning a few beats, with each component corresponding to a structural level. This overcomes the objections discussed above because for any event at one level the number of other events directly connected is limited. At the same time it allows for look-ahead and gives the performer a global overview.
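The superposition of hierarchic timing components can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `component` and `superpose`, the linear lengthening shape, and all numeric values are hypothetical choices made here, and the text above does not specify the shape of a component.

```python
from typing import List, Tuple


def component(n_beats: int, spans: List[Tuple[int, int]], depth: float) -> List[float]:
    """One timing component. Within each span (start, end), lengthening
    rises linearly from 0 on the first beat to `depth` on the last beat.
    The linear shape is an assumption made purely for illustration."""
    out = [0.0] * n_beats
    for start, end in spans:
        length = end - start
        for i in range(start, end):
            out[i] = depth * (i - start) / (length - 1) if length > 1 else depth
    return out


def superpose(base: float, components: List[List[float]]) -> List[float]:
    """Add the components beat by beat onto a nominal beat duration."""
    return [base + sum(values) for values in zip(*components)]


# A global component spanning all 8 beats, and a local component
# covering two 4-beat phrases.
global_c = component(8, [(0, 8)], depth=0.14)
local_c = component(8, [(0, 4), (4, 8)], depth=0.06)
durations = superpose(0.5, [global_c, local_c])
```

In this toy example the last beat of the first phrase is longer than the first beat of the second, and the final beat of the piece is the longest, since both components peak there.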

The process of performance

A second objection to the Todd (1985) model is that it is "off line". In other words it does not describe the process of performance. Whilst it is reasonable to suppose that the performer can hold the whole structure in long-term memory (indeed a musician's ability to memorise is quite remarkable), it seems implausible that the performer could access the whole structure at any one time. In the early model the computations were done for each component and then added together. In an actual performance the computations are done as each phrase is accessed in turn and the components superposed note by note.
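The contrast between an off-line and an on-line computation can be illustrated with a generator that plans one phrase at a time and emits durations note by note. This is a sketch only: the `(number of notes, final lengthening)` encoding of a phrase, the names `ramp` and `perform`, and the linear shapes are assumptions introduced for this example and are not a description of the Lisp program.

```python
from typing import Iterator, List, Tuple

# Hypothetical phrase encoding: (number of notes, lengthening of the last note).
Phrase = Tuple[int, float]


def ramp(n: int, depth: float) -> List[float]:
    """Assumed shape: linear rise to `depth` on the last note of the phrase."""
    return [depth * i / (n - 1) if n > 1 else depth for i in range(n)]


def perform(phrases: List[Phrase], base: float, global_depth: float) -> Iterator[float]:
    """Yield note durations one at a time, planning only the current phrase."""
    total = sum(n for n, _ in phrases)
    position = 0
    for n, depth in phrases:          # each phrase is accessed in turn
        local = ramp(n, depth)        # look-ahead is limited to this phrase
        for i in range(n):
            # The global component is evaluated for the current note only.
            g = global_depth * position / (total - 1) if total > 1 else global_depth
            yield base + g + local[i]  # components superposed note by note
            position += 1


notes = list(perform([(4, 0.06), (4, 0.06)], base=0.5, global_depth=0.14))
```

Unlike an off-line version that builds a full array for every component before adding them, the generator never holds more than the plan for the phrase currently being played.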

The obvious answer, and this is the second premise of the new model, is that in order to describe the process of performance the model needs to be formulated in computational terms and implemented in a suitable high-level language such as Lisp. In particular, what is important here is the idea that a process should be cast in terms of an effective procedure (Longuet-Higgins, 1978, 1981; Johnson-Laird, 1983), thus enabling the theory to be precise and testable.

The indeterminism of individual performances

Whilst such a theory does make predictions, given a certain input, the goal of the theory is not the prediction of individual performances as such, but the principled explanation of performance data. This is so in psychology in general, and music psychology in particular, because if the theory were completely deterministic it would negate the creative aspect of performance. Johnson-Laird (1983) has expressed this indeterminism of individual performances in the language of computer science:

If human beings are at least as complicated as Turing machines and their individual processes of thought differ as a result of their genes and experience, then their behaviour is most unlikely to become wholly predictable, because there is no effective procedure that can predict the behaviour of an arbitrary Turing machine. There is thus little danger of creating a psychology capable of modelling an individual's thoughts - an eventuality likely to destroy the spontaneity and significance of life. But there are no a priori reasons for supposing that it is impossible to develop scientific theories of general psychological abilities.

[Johnson-Laird, 1983; p. 12]

The computational theory of an expression system

These two issues discussed above, of representation and process, are central to any information-processing type approach to cognition and cognitive modelling. Our main task, therefore, in the construction of such a model is to make explicit, in the form of an algorithm, the process of performance and its input. However, as David Marr (1982) has said, such a system can be viewed from three levels of explanation:


At one extreme, the top level, is the abstract computational theory of the device, in which the performance of the device is characterized as a mapping from one kind of information to another. ... In the centre is the choice of representation for the input and output and the algorithm to be used to transform one into the other. And at the other extreme are the details of how the algorithm and the representation are realized physically.

[Marr, 1982; p. 24]

At the level of computational theory, then, it is useful to express the various processes of music performance in symbolic terms. Let $N$ stand for the music notation or score, $P$ for performance, and $I$ for the internal representation. Thus we can think of the process of performance as a mapping:

$$\pi : I \to P \tag{1.a}$$

where the mapping is carried out by a performance procedure or function $\pi$. In the same way the process of sight-reading can be thought of as a double mapping:

$$N \to I \to P \tag{1.b}$$

Conversely, we can think of the process of perception as the mapping:

$$\lambda : P \to I \tag{2.a}$$

where the mapping is carried out by a listening procedure or function $\lambda$. Again in the same manner the process of notation can be thought of as:

$$P \to I \to N \tag{2.b}$$

So, at the algorithmic level, our task is twofold: (a) to find a suitable representation for $I$; and (b) to make explicit an algorithm for performing the mapping $I \to P$.
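Writing N for the score, I for the internal representation and P for the performance, the performance mapping and the sight-reading double mapping can be read as ordinary function composition. The sketch below is a toy illustration: the type aliases, the bar-line convention used in `analyse`, and the fixed durations in `pi` are all placeholders assumed here, chosen only so that the composition can be run.

```python
from typing import List

Notation = List[str]         # N: placeholder score, a list of tokens
Representation = List[int]   # I: placeholder, here just group lengths
Performance = List[float]    # P: a list of durations in seconds


def analyse(score: Notation) -> Representation:
    """N -> I. Toy stand-in: the token '|' closes a group."""
    groups: List[int] = []
    count = 0
    for token in score:
        if token == "|":
            if count:
                groups.append(count)
            count = 0
        else:
            count += 1
    if count:
        groups.append(count)
    return groups


def pi(rep: Representation) -> Performance:
    """pi: I -> P. Toy stand-in: the last note of each group is lengthened."""
    out: List[float] = []
    for n in rep:
        out.extend([0.5] * (n - 1) + [0.6])
    return out


def sight_read(score: Notation) -> Performance:
    """N -> I -> P: the double mapping, as the composition of the two steps."""
    return pi(analyse(score))


played = sight_read(["c", "d", "e", "|", "f", "g"])
```

The point of the decomposition is that `pi` never sees the score directly: everything expressive has to pass through the intermediate representation.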

Methodology

The methodology adopted in order to implement the twofold task outlined above is threefold:

(a) analysis, which involves finding a value for $I$, either from the score or the data;

(b) synthesis, which involves taking the value for $I$ and using it as an input to a performance algorithm which generates an output in the form of a graph or list of numbers; and

(c) evaluation, which involves the comparison of data with algorithm output.
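The three steps can be sketched as a small loop. Everything specific in the Python below is an assumption made for illustration: the toy synthesis rule, the use of root-mean-square error as the comparison, the selection among candidate groupings as a stand-in for analysis from the data, and the `measured` values, which are invented and are not performance data from any study.

```python
import math
from typing import List, Sequence


def synthesise(groups: Sequence[int], base: float, depth: float) -> List[float]:
    """(b) synthesis: turn a representation (group lengths) into durations.
    Assumed toy rule: the last beat of each group is lengthened by `depth`."""
    out: List[float] = []
    for n in groups:
        out.extend([base] * (n - 1) + [base + depth])
    return out


def evaluate(data: Sequence[float], output: Sequence[float]) -> float:
    """(c) evaluation: root-mean-square difference between data and output."""
    if len(data) != len(output):
        raise ValueError("data and output must have the same length")
    return math.sqrt(sum((d - o) ** 2 for d, o in zip(data, output)) / len(data))


def analyse(data: Sequence[float], candidates: Sequence[Sequence[int]],
            base: float, depth: float) -> Sequence[int]:
    """(a) analysis from the data: pick the candidate grouping whose
    synthesised output lies closest to the measured durations."""
    return min(candidates, key=lambda g: evaluate(data, synthesise(g, base, depth)))


# Invented durations for illustration only.
measured = [0.50, 0.51, 0.49, 0.61, 0.50, 0.50, 0.52, 0.60]
best = analyse(measured, [(4, 4), (2, 6), (8,)], base=0.5, depth=0.1)
error = evaluate(measured, synthesise(best, 0.5, 0.1))
```

With these invented numbers the two-phrase grouping `(4, 4)` gives the smallest error, which is the kind of comparison step (c) calls for.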

This method is, of course, similar to the analysis-by-synthesis method of Sundberg and his co-workers (Fryden & Sundberg, 1984) but perhaps more closely related to the method of Risset and Wessel in their work on timbre (Risset & Wessel, 1982). The differences with the Sundberg

Downl

oad

ed

By:

[Ingenta

Content

Di

strib

uti

on

Psy

Press

Ti

tl

es]

At:08

:065

Decemb

er2009

method are that the starting point here is actual performances, rather

than performer intuitions, and the evaluation process involves the direct

comparison of data and model, rather than the subjective rating of

generated output.

Analysis: score → representation vs. data → representation

We need to find a value for the internal representation $I$. A distinction is
made here between three possible representations. First, the analyst's
representation $I_A$, which is determined directly from the score; second, the
performer's representation $I_P$, which is also determined from the score but
which is unobservable; and third, the data-determined representation $I_D$. So,
we can represent the computational theory at this stage thus:

$$N \to I_P \to P_P \to I_D, \qquad N \to I_A \tag{3}$$

Finding a value for $I_A$ involves taking the score of the piece of music under
investigation and producing an analysis of the grouping or phrase
structure. At the moment the most useful analytic method is that
developed by Lerdahl and Jackendoff (1983), despite its deficiencies
(Slawson & Peel, 1985; Clarke, 1986; Baker, in press). After the analysis is
complete the grouping is converted to a Lisp representation, which
becomes the input to an algorithm for generating a duration structure
(see Figure 1).

(setq tsr ((A) (B) (A)))
(setq A ((a) (a)))
(setq B ((b) (b)))
(setq a (3 1 2 1))
(setq b (3 1 2 1))

Figure 1 A Lisp representation of Lerdahl and Jackendoff's bracket notation for grouping.
At the top level there are two groups A and B arranged symmetrically in the order ABA.
Group A contains the sub-group a repeated and group B contains the sub-group b repeated.
The integers represent the metrical strength of a beat.
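The grouping in Figure 1 can also be pictured as a nested list. The following is a minimal Python sketch of that idea, not the article's Lisp: the names `flatten` and `depth` are hypothetical helpers introduced here, and the sub-groups are referenced directly rather than wrapped in one-element lists as in the figure.

```python
# Nested-list picture of the grouping in Figure 1 (illustrative only).
a = [3, 1, 2, 1]   # metrical strengths of the four beats in sub-group a
b = [3, 1, 2, 1]   # metrical strengths of the four beats in sub-group b
A = [a, a]         # group A: sub-group a repeated
B = [b, b]         # group B: sub-group b repeated
tsr = [A, B, A]    # top level: groups in the order ABA


def flatten(group):
    """Return the beat strengths of a group in performance order."""
    if isinstance(group, int):
        return [group]
    beats = []
    for sub in group:
        beats.extend(flatten(sub))
    return beats


def depth(group):
    """Number of grouping levels above the individual beats."""
    if isinstance(group, int):
        return 0
    return 1 + max(depth(sub) for sub in group)
```

Under these assumptions the structure has three grouping levels and spans 24 beats, which is the length any duration list generated from it should have.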

A value for $I_D$ is determined by analysing the data from actual
performances. The basic idea is that a slowing indicates a group
boundary. This can be done systematically using an algorithm listen
which takes the data as input and returns a Lisp representation of the
grouping tsr (Todd, in press).
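The idea that a slowing marks a group boundary can be illustrated with a small Python sketch. This is not the listen algorithm, whose details are given elsewhere (Todd, in press); it is a single-level simplification under the assumption that a boundary is any beat whose duration is a local maximum. The function names and the sample durations are invented for illustration.

```python
def slowing_points(durations):
    """Indices of beats that are local maxima of duration (points of slowing)."""
    return [i for i in range(1, len(durations) - 1)
            if durations[i - 1] < durations[i] >= durations[i + 1]]


def segment(durations):
    """Split a list of beat durations into groups that each end on a slowing."""
    groups, start = [], 0
    for i in slowing_points(durations):
        groups.append(durations[start:i + 1])
        start = i + 1
    if start < len(durations):
        # Whatever remains after the last detected slowing forms the final group.
        groups.append(durations[start:])
    return groups


# Made-up beat durations in milliseconds, not measured data.
example = [500, 480, 470, 520, 490, 475, 480, 560]
groups = segment(example)
```

A hierarchical version would also have to rank the boundaries by the amount of slowing. That step is left out here because the source does not describe how listen does it.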

Synthesis: representation → performance

Having obtained tsr we need to make explicit the procedure $\pi$ for
mapping the representation into the performance. We can represent the
computational theory at this stage thus:

$$I_P \to P_P \to I_D \to P_D, \qquad I_A \to P_A \tag{4}$$

such that each representation $I_i$ has its corresponding performance $P_i$.
The performance is modelled using an algorithm play (see Appendix 1).
The basic heuristic of the algorithm is to look ahead and plan the phrasing
of a group at a given level, then move down to the next sub-group, look
ahead and plan, and so on recursively. The planned phrasings are
superposed onto an output plan (see output, Appendix 1) which
continuously evolves as the performance unfolds. When a surface-group
is reached the first element of the output plan is printed and discarded,
and so on. When the surface-group is completed the program backtracks
to the next level, and so on until all the surface groups are played. The
output from the program is a list of durations, which could easily be
adapted to be sent to a synthesiser given a suitable system (see Figure 2).
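The look-ahead-and-superpose heuristic can be sketched in Python as follows. This is a loose illustration under several assumptions and is not the play algorithm of Appendix 1: every level receives the same amplitude, boundary strength is ignored, the phrasing shape is a plain parabola normalised to reach the amplitude on the last beat of a group, and the plan is indexed by position rather than printed and discarded. All names are hypothetical.

```python
def beats(group):
    """Number of beats contained in a group (individual beats are integers)."""
    return 1 if isinstance(group, int) else sum(beats(sub) for sub in group)


def phrase_shape(t, length, amplitude, offset=0.52):
    """Assumed parabolic lengthening for beat t (1-based) of a phrase."""
    return amplitude / (1 - offset) ** 2 * (t / length - offset) ** 2


def play_sketch(group, tempo, amplitude):
    """Return a list of beat durations for a nested grouping."""
    plan = [float(tempo)] * beats(group)   # output plan, one slot per beat
    played = []

    def visit(g, start):
        n = beats(g)
        # Look ahead: superpose this group's phrasing onto the output plan.
        for t in range(1, n + 1):
            plan[start + t - 1] += phrase_shape(t, n, amplitude)
        if all(isinstance(x, int) for x in g):
            # Surface group: its planned durations are final, so emit them.
            played.extend(plan[start:start + n])
        else:
            pos = start
            for sub in g:                  # move down to each sub-group in turn
                visit(sub, pos)
                pos += beats(sub)

    visit(group, 0)
    return played


a = [3, 1, 2, 1]
tsr = [[a, a], [a, a], [a, a]]             # ABA grouping with equal sub-groups
durations = play_sketch(tsr, tempo=1000, amplitude=100)
```

Because the contributions from each level add up, the final beat, which closes a group at all three levels, comes out longest. That is the qualitative behaviour the superposition is meant to produce.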

The precise durations within a phrase are determined by a parabolic
function PB embedded within the performance procedure. This function
has the following form:

$$PB(t, a_i) = a_1 + \frac{a_2}{(1 - a_6)^2}\left\{\frac{t}{a_3} + \frac{a_4 - 1}{a_5} - a_6\right\}^2, \qquad t = 1, 2, \ldots, T \tag{5}$$

where $t$ is metrical time and $a_i$ is a vector of parameters such that:
$a_1$ = tempo,
$a_2$ = amplitude,
$a_3$ = length of phrase,
$a_4$ = boundary strength,
$a_5$ = upper limit of boundary strength,
$a_6$ = offset of parabola minimum.

$(1 - a_6)^{-2}$ is a normalisation factor such that if the boundary strength
$a_4 = 1$ and $t = a_3$ (i.e. at the end of the phrase) then $a_2$ represents the
true amplitude (Todd, 1985). As for the values of the parameters, $a_1$ and
$a_2$ are input at the start of the algorithm play (see functions start and
set_up_vars, Appendix 1); $a_3$ and $a_4$ are computed for each group as the
program runs (see functions plan and rubato, Appendix 1); and $a_5$ and $a_6$
are set within the program with $a_6 = 0.52$. In Todd (1985) $a_5 = 11$ but in the
new model $a_5 = 3$ because the number of possible boundary strengths is
reduced (see function set_up_vars, Appendix 1).
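The behaviour of the parabolic function can be checked numerically with a short Python sketch. It assumes the function has the form $a_1 + a_2 (1 - a_6)^{-2} \{t/a_3 + (a_4 - 1)/a_5 - a_6\}^2$, with the boundary-strength term added to the normalised time. The name `pb` and the sample values for tempo and amplitude are illustrative only; the defaults for $a_5$ and $a_6$ are the values quoted above for the new model.

```python
def pb(t, a1, a2, a3, a4, a5=3.0, a6=0.52):
    """Duration at metrical time t under the assumed parabolic form.

    a1: tempo, a2: amplitude, a3: length of phrase, a4: boundary strength,
    a5: upper limit of boundary strength, a6: offset of the parabola minimum.
    """
    norm = a2 / (1.0 - a6) ** 2            # normalisation factor times amplitude
    return a1 + norm * (t / a3 + (a4 - 1.0) / a5 - a6) ** 2


# Illustrative values: tempo 1000 ms, amplitude 100 ms, phrase of 8 beats.
end_weak = pb(8, 1000.0, 100.0, 8, 1)      # end of phrase, boundary strength 1
end_strong = pb(8, 1000.0, 100.0, 8, 3)    # end of phrase, boundary strength 3
```

With boundary strength 1 the last beat of the phrase is lengthened by the amplitude itself, which is the normalisation property stated in the text. Under this assumed form a stronger boundary gives a larger final lengthening.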



Evaluation: $P_A \approx P_P \approx P_D$?

Having generated $P_A$ or $P_D$ we need to compare them with an actual
performance $P_P$. In Todd (1985) the data and model (i.e. $P_P$ and $P_A$) were
compared visually using the criteria: (a) the position of peaks or points of
slowing; (b) the relative heights of the peaks. Whilst this method is useful
it is unsatisfactory for a number of reasons. First, the comparison of
relative heights is only qualitative. So, obviously a more systematic and
quantitative test is required. Hierarchical clustering (Johnson, 1967) is
such a method, which has been successfully applied in analysing speech
(Grosjean and Gee, 1983), and I am currently working on ways of applying
this to music performance.
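The two visual criteria can be turned into simple numbers, as in the Python sketch below. It is only an illustration of what a quantitative comparison might look like and is not the hierarchical clustering method mentioned above. The choice of measures (peak positions, Pearson correlation, root-mean-square error), the function names and the sample durations are all assumptions introduced here.

```python
import math


def peak_positions(durations):
    """Indices of local maxima, i.e. points of slowing."""
    return [i for i in range(1, len(durations) - 1)
            if durations[i - 1] < durations[i] >= durations[i + 1]]


def pearson(x, y):
    """Pearson correlation between two equal-length duration lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy)


def rms_error(x, y):
    """Root-mean-square difference between data and model durations."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)) / len(x))


# Made-up durations in milliseconds, not measured performance data.
data = [500, 480, 520, 490, 470, 560]
model = [505, 485, 515, 495, 480, 550]
same_peaks = peak_positions(data) == peak_positions(model)
```

Peak agreement addresses criterion (a), while the correlation and the RMS error give a rough numerical handle on criterion (b).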

A second problem lies in the indeterminism of individual performances
as discussed above. Whilst it is often possible to observe considerable
across-performer agreement (Shaffer & Todd, 1987) there are also many
differences. Also, there is no reason why the analyst's interpretation $I_A$
should be the same as the performer's $I_P$ since there is no such thing as a
single correct grouping. It is for these reasons that the input used is the
representation derived from the data $I_D$, which is obtained via the
algorithm listen. This procedure is certainly not intended to give licence
to adjust the theory post hoc to fit each set of data; on the contrary, the
same performance mapping play is used in each case. Remember the goal
of a theory of performance is the principled explanation of performance
data. So, whilst we cannot predict a performance with any certainty, what
we can say for each performance is that if the following three assumptions
produce a good match between $P_P$ and $P_D$ then the assumptions
constitute a reasonable explanation:

(a) the performer has used slowing to indicate a grouping boundary;
(b) the performer's grouping analysis corresponds to tsr;
(c) the performer's mapping procedure corresponds to play.

The model then is really an analytical theory of performance rather than a
prescriptive theory of performance. However, there is no reason, if
enough data are amassed, why probability weightings could not be
assigned to various performances as a function of style and instrument.

Examples

Presented in Figures 3 and 4 are two examples of data from actual
performances compared against the model. The first example is taken
from a performance of the Adagio from the Haydn Sonata in B-flat, which
was also used in Todd (1985) so that comparison with the old model is
possible. The second example is taken from two performances of the
Chopin Prelude in F# minor (Shaffer and Todd, 1987). The data were
obtained using the method of Shaffer (1981).


[Figure: Haydn Sonata, B-flat major; a1 = 3200 ms, a2 = 300 ms; axis ticks at 3000, 4000 and 5000. Plot not recoverable from the extraction.]
