Intelligent Service Robotics, ISSN 1861-2776, Volume 11, Number 2
Intel Serv Robotics (2018) 11:139–148, DOI 10.1007/s11370-017-0241-x

Dance motion generation by recombination of body parts from motion source

Minho Lee, Kyogu Lee, Mihee Lee & Jaeheung Park


Intel Serv Robotics (2018) 11:139–148
https://doi.org/10.1007/s11370-017-0241-x

ORIGINAL RESEARCH PAPER

Dance motion generation by recombination of body parts from motion source

Minho Lee1 · Kyogu Lee1,3 · Mihee Lee2 · Jaeheung Park1,3

Received: 8 August 2016 / Accepted: 27 October 2017 / Published online: 29 December 2017
© Springer-Verlag GmbH Germany 2017

Abstract In this paper, we propose an approach to synthesize new dance routines by combining body part motions from a human motion database. The proposed approach aims to provide a movement source to allow robots or animation characters to perform improvised dances to music, and also to inspire choreographers with the provided movements. Based on the observation that some body parts perform more appropriately than other body parts during dance performances, a correlation analysis of music and motion is conducted to identify the expressive body parts. We then combine the body part movement sources to create a new motion, which differs from all sources in the database. The generated performances are evaluated by a user questionnaire assessment, and the results are discussed to understand what is important in generating more appealing dance routines.

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11370-017-0241-x) contains supplementary material, which is available to authorized users.

✉ Jaeheung Park (corresponding author)
[email protected]

Minho Lee
[email protected]

Kyogu Lee
[email protected]

Mihee Lee
[email protected]

1 Department of Transdisciplinary Studies, Seoul National University, Seoul, Republic of Korea

2 World Dance Alliance (Korea Chapter), Seoul, Republic of Korea

3 Advanced Institute of Convergence Science and Technology, Suwon, Republic of Korea

Keywords Choreography · Motion synthesis · Music–motion correlation · Human motion database

1 Introduction

Importing human motion into humanoid robots or animated characters enables a natural imitation of human movement, and research on techniques to mimic human movement is therefore attracting growing attention. However, gathering human movement data is both time and labor intensive, which makes techniques for creating varied motions increasingly relevant. We are particularly interested in dancing and have therefore developed a system that synthesizes various dance movements. The new movements can be used to help choreographers find inspiration for their work. Additionally, the system can provide motion sources for generating the movements of robots and animated characters; for example, given a piece of music, a robot or character can dance with improvisation.

In this study, a dance performance is synthesized by analyzing the appropriateness of motion to music, using a pre-generated motion database. Among studies on matching motion to music, Nakahara et al. controlled motor speed by predicting the tempo of the music [1], and Grunberg et al. attempted to synchronize the movements of HUBO to the tempo of a given piece of music [2]. Both studies used a manually pre-generated database, and then modulated the joint speed to better match the music. They found that matching joint movements to musical features results in appealing dance routines.

Lee et al. [3] developed a motion capture database, andattempted to generate dance motion sequences by searchingfor appropriate motion data for music pieces. By using the


Fig. 1 Body part motion recombination to synthesize new dance performance

large database and a motion searching algorithm, they were able to generate well-matched musical motion sequences without modifying the movement speeds. However, even when a dance motion sequence was unique, each motion in the sequence was identical to one in the database. Synthesizing new motions is necessary to generate a greater variety of dance routines. According to Toiviainen et al., specific joint movements in dance play a critical role in representing music characteristics [4], and these roles depend on the characteristics of the music. Based on this study, we designed a dance motion synthesizing algorithm that recombines body part movements to generate various motions from our database.

This study was developed with the expectation that music and motion with similar change patterns would be appropriate for each other. The aim was to determine whether a motion synthesized from several body part movements appears to be a natural human movement. Both the expectation and the generated motions are verified by the user study.

The concept of our proposed algorithm is shown in Fig. 1. When a new musical piece is provided, the algorithm searches for appropriate body part motion sources by analyzing the cross-correlation scores between the music and the motion characteristics. The highest-scoring motion candidate is selected for each body part, and these candidates may come from different sources in the database. The selected body part movements are then combined to synthesize the new dance routine.

The workflow of the proposed approach to generate dance motion is presented in Sect. 2. The music and motion characteristic features used for the experiment are described in Sect. 3, and the section also explains the method used to analyze their correlations. Section 4 describes how we built the pre-constructed dance database for the study. In Sect. 5, we describe the process of selecting appropriate motion and explain how we synthesize the movement and synchronize the motion to the music. We present an evaluation of the generated dance performance in Sect. 6, as well as a discussion of


Fig. 2 Proposed approach

the factors that influence people when they appraise a dance performance. The conclusion is presented in Sect. 7.

2 Overview of dance motion generation system

This section outlines the workflow of the proposed system. The sequence of the proposed approach to generate dance motion is illustrated in Fig. 2. The system operates with a pre-built human dance motion database that is clustered according to the similarity of the corresponding music. The process comprises four steps, shown in sequence, beginning from the bottom of Fig. 2.

The first step extracts the audio characteristic features from the input music. In Fig. 2, the features are expressed by the colors on the audio signals. The input music is segmented into short pieces, based on the times where the musical characteristics change significantly. The system then searches for the appropriate motion for each music segment. The method of extracting the features to segment the music input will be described in Sect. 3.

The second step determines the motion candidates for each music segment, according to the musical similarities. For each music segment, the motion cluster group in the database with the closest musical characteristics is found. In Fig. 2, the color of a group specifies its musical characteristics, and the corresponding motion segments are from the same group. The number of motion segments may vary from group to group (e.g., group A in the figure contains three motion segments, while group B contains two segments). The recommended candidates will then be recombined to generate a new motion. The details of the motion groups, and the music similarity-based search algorithm, can be found in Sect. 4.

The third step recommends movement sources for five body parts. The motion candidates from the previous step are divided into body parts, and their characteristics are extracted. The motion characteristic features are analyzed by determining their correlation to the musical features in the first step, and the motion source with the highest correlation for the music segment is selected for each body part (e.g., for the first segment of input music in Fig. 2, the head motion of source A1 is recommended). The method for extracting characteristic features is described in Sect. 3, and the motion source selection process for body parts is detailed in Sect. 5.

The final step combines the recommended motion sources. For each given music segment, the recommended body part motions are combined to synthesize a dance routine for the segment. The dance routines are then connected to generate a complete choreographic sequence for the given music. This process is described in Sect. 5.

3 Music–motion features and correlation comparison method

In this section, we introduce the music and motion characteristic features used for the experiment, as well as the methods used to compare the correlations between the features. To extract the music features, we used feature sets that are typically used to analyze audio signals. The motion features were analyzed using the angular velocities most frequently used in Laban Movement Analysis (LMA). The technique to find similar music pieces was then applied to find the appropriate


Table 1 Musical attributes represented by acoustic features

Acoustic feature      Musical attributes
Chromagram            Tonal
Spectral flux         Spectral/temporal
Spectral centroid     Spectral
Spectral spread       Spectral
MFCC                  Spectral/timbral
Spectral rolloff      Spectral
Zero-crossing rate    Temporal

motion segments for the music. The methods described inthis section will be used consistently in subsequent sections.

3.1 Characteristic features

In this study, seven acoustic features are selected to represent the various attributes of music sources: chromagram, spectral flux/centroid/spread/rolloff, Mel-frequency cepstral coefficients (MFCC), and zero-crossing rate. These acoustic features represent the timbral, tonal and temporal characteristics of music, and are widely used to analyze audio signals [5,6]. Table 1 shows the attributes applicable to the various acoustic features.
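As an illustration, two of the listed features can be computed directly from a waveform. The sketch below is our own, not the authors' implementation (in practice a library such as librosa provides all seven features); it computes the zero-crossing rate and the spectral centroid with NumPy:

```python
import numpy as np

def zero_crossing_rate(frame):
    # Temporal feature: fraction of consecutive samples whose sign differs.
    signs = np.sign(frame)
    return float(np.mean(signs[:-1] != signs[1:]))

def spectral_centroid(frame, sr):
    # Spectral feature: magnitude-weighted mean frequency of the frame's spectrum.
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float(np.sum(freqs * mag) / np.maximum(np.sum(mag), 1e-12))

# Example: a pure 500 Hz tone at the 11025 Hz sampling rate used in the paper.
sr = 11025
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 500 * t)
```

For the pure tone, the spectral centroid lands near 500 Hz and the zero-crossing rate near 2 × 500 / 11025, which is a quick sanity check on both implementations.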

To analyze movements, we selected angular velocities of human joints to represent the movements. Unlike for music, there is no set of features commonly used for dance motion analysis. Although the concept of LMA is widely used to express the characteristics of motion, its quantitative use differs between studies [7]. In this study, the angular velocities of body parts were used, as they are frequently used in LMA to express the tempo and rhythm characteristics of movement.

3.2 Correlation comparison method

In order to compare the extracted features of music and motion, the features are transformed into novelty functions. Novelty functions represent the changes of the corresponding features, and are often used for segmenting audio signals by detecting significant changes (i.e., novelty) in the signals. In our study, we used the novelty function not only to compare the features of two music pieces, but also to compare the features between music and motion segments. This was based on the expectation that a motion is appropriate to music when both have similar change patterns.

When a music or motion source is provided, a self-similarity matrix is calculated to derive the novelty function [8,9]. From the N frames of the music or motion source, an N×N self-similarity matrix, defined as S, is generated. The matrix represents the measurements of the cosine distances between the frames of the provided source. The equation is as follows:

S(i, j) ≡ (v_i · v_j) / (‖v_i‖ ‖v_j‖)    (1)

where v_i and v_j are the feature vectors of frames i and j, respectively, and each frame is 0.01 s long. The similarities of features between each frame are shown in matrix S. We then use the self-similarity matrix to obtain the novelty function. The novelty function is derived by sliding a Gaussian kernel along the diagonals of the self-similarity matrix:

NoveltyScore(i) = Σ_{m=−L/2}^{L/2} Σ_{n=−L/2}^{L/2} C(m, n) S(i + m, i + n)    (2)

where C is the L × L Gaussian-tapered checkerboard kernel with two opposite regions, positive and negative, and m and n index the frames covered by the kernel. The kernel C is used as a filter on the self-similarity matrix S: it emphasizes the differences between each time stamp on the diagonal of S and the nearest time stamps, and yields a score for each time stamp [8,9]. The equation measures the correlation coefficient of C and S for each frame i; therefore, the measurements are calculated along the diagonal of S. The kernel size L is set at 200 frames (2 s), to compute the novelty measurements for ±1 s around each frame i. Before the calculation, S is zero-padded by L/2 on either side to avoid undefined values. The index i denotes each frame, and the time interval is 0.01 s. The novelty function generated by applying the kernel to the self-similarity matrix indicates the differences between each frame and the surrounding frames. The function yields a one-dimensional signal over time, and noticeable peaks indicate significant changes in the attributes.
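Equations (1) and (2) can be sketched as follows. This is our own minimal NumPy rendering; the taper width and the helper names are our choices, and only the frame length (0.01 s) and default kernel size (L = 200) come from the text:

```python
import numpy as np

def self_similarity(features):
    # Eq. (1): cosine similarity between every pair of per-frame feature
    # vectors. features has shape (N, d); the result S has shape (N, N).
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    v = features / np.maximum(norms, 1e-12)
    return v @ v.T

def checkerboard_kernel(L, sigma=0.5):
    # L x L Gaussian-tapered kernel with positive same-sign quadrants and
    # negative cross quadrants, as used for novelty detection [8,9].
    t = np.linspace(-1.0, 1.0, L)
    taper = np.exp(-t**2 / (2 * sigma**2)) * np.sign(t)
    return np.outer(taper, taper)

def novelty_score(S, L=200):
    # Eq. (2): correlate the kernel with S along its main diagonal.
    # S is zero-padded by L/2 on each side so every frame gets a score.
    N, half = S.shape[0], L // 2
    padded = np.zeros((N + L, N + L))
    padded[half:half + N, half:half + N] = S
    C = checkerboard_kernel(L)
    return np.array([np.sum(C * padded[i:i + L, i:i + L]) for i in range(N)])
```

With a feature sequence that switches regime at frame 100, the novelty function peaks at the regime boundary, which is exactly the behavior the segmentation step relies on.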

Examples of the self-similarity matrix and novelty scores are shown in Fig. 3. In the self-similarity matrix of Fig. 3a, brighter colors indicate closeness between the features of corresponding frames. The novelty scores in Fig. 3b are derived from the matrix in Fig. 3a. This example is an extract from a motion source segment in our database.

To compare two data sources (e.g., music or motion sources), we compute the cross-correlation of the novelty functions extracted from the sources. A high synchronous correlation is interpreted to show that the two given sources have similar characteristic flows. As novelty peaks represent significant change in music or dance movements, matching the music and dance movements with a correlation of this


Fig. 3 An example of the functions: a self-similarity matrix, b novelty score

novelty score means that the music and dance are matchedby similarities in their changes.
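The comparison step described above might look like the following minimal sketch. It is our own illustration; the paper does not specify normalization details, so the zero-mean, unit-norm treatment of the novelty functions is an assumption:

```python
import numpy as np

def normalized(x):
    # Zero-mean, unit-norm copy so scores are comparable across sources
    # (an assumption; the paper does not state its normalization).
    x = np.asarray(x, dtype=float) - np.mean(x)
    return x / np.maximum(np.linalg.norm(x), 1e-12)

def cross_correlation_score(novelty_a, novelty_b):
    # Highest cross-correlation over all lags between two novelty functions.
    # A high score suggests the two sources share similar change patterns.
    corr = np.correlate(normalized(novelty_a), normalized(novelty_b), mode="full")
    best = int(np.argmax(corr))
    lag = best - (len(novelty_b) - 1)
    return float(corr[best]), lag
```

For two novelty functions that are shifted copies of each other, the returned lag recovers the shift and the score approaches 1.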

4 Music–motion database

The motion generation sequences in this study are based on modifying the movements of a pre-constructed database collected from human dancers. We expected that applying professional choreographers' work to create new choreography would yield effective results. The database comprises dancing motion paired with music of the pop music genre. All dances in the database were created by professional choreographers, and the dances were choreographed to be appropriate to the paired music. These music–motion pairs are segmented and clustered according to their musical characteristics.

4.1 Music database

The process of constructing the music database is shown inFig. 4. When importing a piece of music to the database,

Fig. 4 Segmentation and clustering process of music–motion database

its characteristic features are extracted to calculate the self-similarity matrix and the novelty function. The discernible peaks above a defined threshold on the novelty function are defined as boundary points for segmentation, and the music piece is segmented into short pieces (e.g., a_1 to a_n of A′(n) in the figure). The short music segments are then clustered according to their music characteristics, and inserted into the database, DBα.

In this experiment, the music is inserted into the database in the format of the audio source file (mono, 8-bit, 11,025 Hz, WAV). Each music clip contains approximately 90 s of the first verse. Sixty-five pieces of music, resulting in 1283 segments, were used. The length of each segment differed, but the segments averaged approximately 3–6 s. The segments were classified into 200 clusters using the k-means clustering algorithm [10,11].
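A minimal sketch of the boundary-picking step, assuming simple thresholded local-maximum peak picking (the paper's exact threshold value and peak-picking rule are not specified, so both are our assumptions):

```python
import numpy as np

def boundary_frames(novelty_scores, threshold):
    # Local maxima of the novelty function that exceed the threshold become
    # segment boundaries (threshold value is an assumption, not the paper's).
    n = np.asarray(novelty_scores, dtype=float)
    return [i for i in range(1, len(n) - 1)
            if n[i] > threshold and n[i] >= n[i - 1] and n[i] >= n[i + 1]]

def segment(frames, boundaries):
    # Split a per-frame sequence at the boundary indices.
    cuts = [0] + list(boundaries) + [len(frames)]
    return [frames[cuts[k]:cuts[k + 1]] for k in range(len(cuts) - 1)]
```

The resulting segments would then be clustered (e.g., with a k-means implementation) on their mean feature vectors, as the paper does with 200 clusters.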

4.2 Motion database

The dancers who participated in the experiment were requested to watch the music video of the original singer, and then imitated the choreography. The movements of the dancers were recorded by an optical motion capture system, which comprised 29 Vicon T-160 cameras with Vicon Blade software. Sixty markers were attached to the professional dancers, to track the movements of their heads, torsos and


Fig. 5 Marker set, analyzed bones, and body part division

limbs. A marker set was used to analyze 19 joints of eachdancer, and the Euler angles of each joint were extracted bythe Vicon Blade software. The marker set used in the exper-iment, and the list of joints are shown in Fig. 5. Motion wascaptured at a frame rate of 100 Hz, and each dancer per-formed while the music played to ensure that the motion andthe music were synchronized.

As shown in Fig. 4, the captured dance motion data are segmented (B(n)) and clustered (DBβ) according to their corresponding music. Lee et al. [3] proved that, when the dances are segmented by the same boundaries as the music segments, the dance motion segments are meaningfully clustered by following the clustering rules of the paired music segments. Their experiment verified that the dances are also appropriate for other pieces of similar music. In this experiment, we aim to provide diversity in the generated motions. In order to create various movements, we divided the body into five parts, and combined motions from different movement sources to generate new motions. The divided body parts comprised the head, torso, left and right arms, and the lower body, as

Fig. 6 Motion selection and synthesis process

shown in Fig. 5. The legs were not split into two parts, to prevent the generated motion from becoming too awkward. The approach does not modify the motion sources, except for recombining the body parts from the motion sources.
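The five-part division could be represented as a simple mapping from body parts to joints. The joint names below are hypothetical placeholders, since the paper lists the 19 analyzed joints only in Fig. 5:

```python
# Hypothetical mapping of the paper's five body parts to joints. The joint
# names are illustrative, not taken from the paper's Vicon Blade skeleton.
BODY_PARTS = {
    "head":      ["neck", "head"],
    "torso":     ["spine", "chest"],
    "left_arm":  ["l_shoulder", "l_elbow", "l_wrist"],
    "right_arm": ["r_shoulder", "r_elbow", "r_wrist"],
    # The legs are kept together as a single lower-body part, as in the paper.
    "lower_body": ["hips", "l_hip", "l_knee", "l_ankle",
                   "r_hip", "r_knee", "r_ankle"],
}
```

Such a mapping lets each motion candidate be sliced into five independent channels before the per-part correlation analysis.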

5 Motion selection and synthesis

The process of synthesizing a new choreography to music is shown in Fig. 6. When new music is input, the music is segmented into short pieces (C′(n)) according to the novelty scores derived from the characteristic features of the music. For each segment, we then obtain the closest groups in the music database, DBα, by comparing the musical characteristics of the given music segments to the clustered groups in the database. Gα1 to Gαn of Gα indicate the closest groups for each of the n segments. The motion sources (Gβ1 to Gβn) of Gβ corresponding to the music groups are recommended


Fig. 7 Example of motion alignment process

as appropriate candidates for the input music segments. The motion sources are included in the database DBβ.

To find the appropriate body part movements from the motion candidates, angular velocities are extracted from each body part of the candidates. The novelty function is calculated from the angular velocities, where the novelty scores are represented as NGβ. The NGβ comprise the novelty scores of each body part from the candidates.

Then, the cross-correlation is calculated for NGβ with NC(n), where NC(n) are the novelty scores of the input music segments. When calculating the cross-correlations, the motion candidate is repeatedly concatenated until it is longer than the music segment, because the correlation requires the motion to cover the full segment. An example of motion repetition is shown in Fig. 7.
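The repetition-and-matching step can be sketched as follows. This is our own illustration: the paper computes a cross-correlation, which this simplified version approximates by scoring every music-length window of the tiled motion novelty:

```python
import numpy as np

def tile_motion(motion_novelty, target_len):
    # Repeat the motion candidate until it is at least target_len frames long.
    reps = -(-target_len // len(motion_novelty))  # ceiling division
    return np.tile(motion_novelty, reps)

def best_offset(music_novelty, motion_novelty):
    # Offset of the music-length window of the tiled motion novelty that
    # correlates most strongly with the music novelty.
    m = len(music_novelty)
    tiled = tile_motion(motion_novelty, m + len(motion_novelty))
    scores = [float(np.dot(music_novelty, tiled[o:o + m]))
              for o in range(len(tiled) - m + 1)]
    return int(np.argmax(scores))
```

The returned offset identifies the most synchronous region of the (possibly repeated) body part movement, which is the region used when the parts are combined.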

After calculating the cross-correlations, the motion candidate with the highest score is found for each body part of each input music segment. In Fig. 6, M1(n) indicates the highest-scoring motions, and the term Mn(1−5) indicates the movement sources of the 1st–5th body parts (head to lower body) that are strongly correlated with the nth music segment. The body part movements are combined (M2(n)) and connected to generate the choreographic sequence, M, for the input music.

When combining the selected body part motions, the most synchronous region of each body part movement is selected. As an example, in Fig. 7, the regions of body part movements with the highest cross-correlation have been selected to generate the combined motion.

MotionBuilder, developed by Autodesk, is used to combine the movements of the body parts. The root joint was set to the hip, which is located in the lower body part, and the other body part movements were linked to the movements of their parent body parts. At the boundaries of motion sources, a period of 0.2 s was used for interpolation between two segments to avoid abrupt changes.
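A minimal sketch of the boundary interpolation, assuming a linear cross-fade of per-frame joint angles. This is a simplification: MotionBuilder's actual blending is internal to the tool, and rotations are often blended with quaternion slerp in practice:

```python
import numpy as np

def blend_segments(seg_a, seg_b, overlap=20):
    # Cross-fade the last `overlap` frames of seg_a into the first `overlap`
    # frames of seg_b; 20 frames corresponds to 0.2 s at the 100 Hz capture
    # rate. Each segment has shape (frames, joint_angles).
    w = np.linspace(0.0, 1.0, overlap)[:, None]   # per-frame blend weights
    mixed = (1.0 - w) * seg_a[-overlap:] + w * seg_b[:overlap]
    return np.vstack([seg_a[:-overlap], mixed, seg_b[overlap:]])
```

The blended region starts at seg_a's value and ends at seg_b's value, so the concatenated sequence has no discontinuity at the segment boundary.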

To evaluate the system, we synthesized dance motionsfor four music pieces not included in our database. The test


music pieces were 60–70 s long, and divided into 17–22 segments. The video clips are attached to the manuscript.

6 Evaluation and discussions

When a choreography is generated by a specific method, it must be evaluated by certain criteria. Therefore, there have been numerous studies to develop methods for quantifying and evaluating choreography. Aesthetic Competence Evaluation (ACE) [12] and the Performance Competence Evaluation Measure (PCEM) [13] were developed to quantify choreographic factors. Labanotation [14] is used for the readable notation of body movements. Laban also developed theories and systems of spatial theory [15] and qualitative movement analysis [16]. Based on Laban's theory, Chi et al. developed a simulation model that deconstructs human motion data into criteria of changing inner intention values [17]. Torresani et al. presented a technique to learn motion style synthesis from artificially generated examples [18]. However, while these studies can quantify and evaluate the characteristics of movements, they are not suitable for evaluating the appropriateness of motion to music. Furthermore, the criteria for judging choreography vary according to the genre of the music. Therefore, we verified the performance of the proposed approach by a user study, and learned which factors should be considered to synthesize more appealing choreography.

For the user study, the subjects were asked to evaluate four types of dance clips: (1) a dance generated by the proposed approach (Dance I hereafter), (2) a dance comprising motion sources selected with the lowest correlation scores to the music (Dance II hereafter), (3) a dance comprising randomly selected motions from the database (Dance III hereafter), and (4) the original dance performance that was choreographed for the test music (Dance IV hereafter). These dance clips were provided for the four pieces of music selected for the user study. The questionnaire used five-level Likert scale scoring, as shown in Table 2. The first question asked how appropriate the dance is to the music. The other four questions regarded different aspects related to the appropriateness of the dances to the music. The questions were formulated so as to better understand the criteria people used when evaluating the dance performances. The presentation order of the video clips was randomly selected for each piece of music, and the order was not given to the subjects, in order to maintain fairness. A total of 43 subjects participated in the user study.

The mean scores for each question are presented in Table 3. The p values for pairwise comparison using a t test are also presented in the table. These values are used to verify statistically meaningful differences between the scores of Dance I and the scores of Dance II and Dance III. The results of

the first question show that the dance motion generated by the proposed approach (Dance I) appeals significantly more to the subjects than that generated from the lowest correlated motion sources (Dance II). However, the scores of Dance III show relatively small differences from the scores of Dance I. In the case of Music 2, the randomly generated motion (Dance III) received a slightly higher score than Dance I, although the p values indicate that this is not a significant result. The result between Dance I and Dance II indicates that the technique of selecting a motion with a high music correlation score has an effect on generating appealing choreography. However, the scores of Dance I and Dance III indicate that the proposed technique alone is not sufficient to create satisfactory dance movements. Therefore, we analyzed the answers to questions 2, 3, 4, and 5 to identify factors that are expected to affect the quality of dances.

Question 2 asked how well the dance matched the music beat, and the results were similar to those of question 1. Therefore, we can expect that a synchronous beat is important in making the dance appropriate to the music. In question 3, the smoothness of the movement connections was judged, and Dances I, II, and III received similar scores. According to the opinions we gathered from the subjects prior to the user study, the smoothness of motion connection was a factor that could affect the evaluation. However, as the proposed method used the same connection method for all dances, the similar scores for question 3 were expected. Question 4 regarded the dynamic movements in each dance clip. As in the first and second questions, the answers indicated that Dances I, III, and II received higher scores, in that order, except for Music 3. The method of correlating the novelty functions of music and motion, which we used for the proposed approach, would clearly influence the similarity of the dynamic characteristics by making the variation characteristics of the dance and music similar. However, the randomly generated motion for Music 3 received excellent results in terms of beat and dynamics matching, and obtained scores similar to our proposed method. The final question regarded repetition in the dance sequence. According to preliminary research, many people preferred to see variety in dance sequences, and this question was posed to better understand the effect of motion repetition. As our proposed approach was designed to apply similar motion sources to similar music, the results of this question were higher for Dances I and II than for Dance III. However, as the results of question 1 differ from the results of question 5, whether a dance comprising varied movements is judged appropriate to the music appears to be a matter of taste.

From the user study results, it was observed that the participants typically focused on the quality of the movement itself, rather than distinguishing the appropriateness of the dance to the given music. Therefore, the effect of selecting synchronous motions for the music, which was one of the


Table 2 Questionnaire used for user study

Music I, Clip A
  Q1. Is the dance appropriate to the music?             Score: 1–5
  Q2. Does the beat of the dance match the music well?   Score: 1–5
  Q3. Are the dance sequences connected smoothly?        Score: 1–5
  Q4. Is the dance dynamic when the music is dynamic?    Score: 1–5
  Q5. Do the dance sequences follow any rules?           Score: 1–5

Music I, Clip B
  Q1–Q5 as above, each scored 1–5

(Same format used for Clips C and D, and for Music II, III and IV)

Table 3 Scoring results of all subjects

                  Mean score                     p value (vs. Dance I)
Music  Clip       Q1    Q2    Q3    Q4    Q5    Q1     Q2     Q3    Q4     Q5

1      Dance I    3.33  2.88  3.42  3.37  3.91
       Dance II   2.02  1.47  3.49  1.67  3.42  0.013  0.015  0.31  0.003  0.21
       Dance III  2.77  2.12  3.51  2.81  1.91  0.11   0.082  0.28  0.015  0.022
       Dance IV   4.23  4.14  4.88  4.28  4.70

2      Dance I    3.12  2.74  3.28  3.12  3.79
       Dance II   2.37  1.88  3.37  1.98  3.28  0.021  0.023  0.34  0.008  0.30
       Dance III  3.23  2.67  3.53  3.32  1.88  0.24   0.15   0.32  0.20   0.011
       Dance IV   4.37  4.23  4.79  4.33  4.65

3      Dance I    3.42  3.05  3.74  3.51  3.86
       Dance II   1.81  1.98  3.30  1.47  3.63  0.005  0.007  0.33  0.006  0.28
       Dance III  3.37  2.88  3.28  3.33  1.65  0.28   0.19   0.36  0.27   0.004
       Dance IV   4.65  4.79  4.81  4.53  4.63

4      Dance I    3.28  2.88  3.44  3.21  3.53
       Dance II   2.37  1.77  3.19  1.88  3.56  0.033  0.018  0.36  0.013  0.37
       Dance III  2.81  2.33  3.26  2.95  1.81  0.21   0.12   0.23  0.11   0.002
       Dance IV   4.67  4.56  4.91  4.60  4.74

objectives of this study, was not as great as expected. However, there are significant differences between the scores of Dances I and II, as rated by the test subjects. This result signifies that matching movement to music based on their correlations clearly has an effect on the dance performance being seen as appropriate to the music. From questions 2–5, we found that beat matching and dynamic synchronization both play a significant role in generating better dance performances.

Another noticeable result is that few comments rated the movements generated by combining the movements of different parts of the body as awkward. This indicates that we can use the proposed approach to provide diverse dance motion outputs from the database.

7 Conclusion

In this study, a new approach is proposed to synthesize new dance motions by combining movements of body parts. The proposed approach is aimed at choreographic work. The recommendation of reference dance motions for similar music, as well as the recommendation of a variety of dance movements, can provide choreographers with inspiration. We also expect the approach to provide movement sources for robots or animated characters improvising dance to a given piece of music.

There are several advantages to this approach. First, by reconstructing motion from actual human movement sources, the generated motion data retain the natural characteristics of human movement. Secondly, the procedure by which motion suggestions are based on musical similarity significantly improves the appropriateness of motion sources to the music. Thirdly, synthesizing the motion of five body parts from different sources results in a motion that is not contained in the database. Therefore, the proposed approach is able to provide diverse motion outputs.

The purpose of the study was to demonstrate that a motion recombined from body part movements can create an appealing performance when each body part's movement source comprises movements highly synchronized to the music. We expected that the dance motion would be appropriate to the music when the motion characteristics have a strong correlation with the music characteristics, and this was confirmed by the user study. Moreover, we were able to investigate the effects of beat matching and dynamic synchronization on the appropriateness of the dance to the music. Additional factors should be considered when synthesizing an appealing dance performance, and these factors should be addressed in future studies.

Additionally, numerous problems remain to be solved before the synthesized dances can be applied to robots. For example, self-collision, balancing, and the kinematic differences between robots and humans need to be addressed. It is inevitable that the motion source will be modified while solving these problems. Therefore, in order to take advantage of the human-like dance movements resulting from this study, our future work must find a way to minimize modifications to the dance source while solving the problems listed above.

Acknowledgements This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2015R1A2A1A10055798).

