Tabla Gyan: Realtime tabla recognition and resynthesis
Parag Chordia (GTCMT), Alex Rae (GTCMT)
Overview
What: Stroke type
Transformation: Timbre, Rhythm
When: Stroke timing
Resynthesis
Video Demo
The Drum
• Dayan – treble drum
• Bayan – bass drum
Tabla Language
Recognition Architecture
• Training data: labeled strokes (ke, tun, dhe, ge, dha, te)
• Statistical model: SVM, Bayesian, neural net
• Input music → onset detection → statistical model
• Output: stroke label, rhythm
Build Model: Training Data
• Several datasets
  • Professional musician
  • Home recording
• Audio recordings manually edited and labeled
Build Model: Target Mapping
• Standardize idiosyncratic traditional naming conventions
• Map timbrally similar (or identical) strokes to the same category
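The target mapping described above can be sketched as a simple lookup table. This is a minimal illustration only: the table name `BOL_TO_CATEGORY`, the function `normalize_bol`, and the specific groupings are assumptions, not the mapping used by the authors.

```python
# Hypothetical mapping from traditional stroke names (bols) to model categories.
# The groupings are illustrative assumptions, not the authors' actual table.
BOL_TO_CATEGORY = {
    "na": "na",
    "ta": "na",   # assumed alternate name for the same stroke
    "ge": "ge",
    "ghe": "ge",  # assumed alternate spelling
}


def normalize_bol(name: str) -> str:
    """Trim and lower-case a bol name, then map it to its category."""
    key = name.strip().lower()
    # Unknown names pass through unchanged so they can be reviewed by hand.
    return BOL_TO_CATEGORY.get(key, key)
```

For example, `normalize_bol(" Ta ")` returns `"na"`, while a name missing from the table, such as `"dhe"`, is returned unchanged.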
Build Model: Feature Extraction
Spectral features
• MFCCs (24)
• Centroid
• Variance
• Skewness
• Kurtosis
• Slope
• Roll-off
Feature vector: [F1, F2, F3, ..., Fn] (e.g. variance, spectral centroid, kurtosis)
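Several of the spectral shape features listed above can be computed by treating the magnitude spectrum as a distribution over frequency. The sketch below assumes the spectrum has already been computed (for example by an FFT) and is passed in as parallel lists of bin frequencies and magnitudes. The function name, the 85% roll-off threshold, and the returned dictionary layout are assumptions; MFCCs and slope are omitted.

```python
def spectral_moments(freqs, mags, rolloff_fraction=0.85):
    """Return centroid, variance, skewness, kurtosis and roll-off of a spectrum.

    freqs: bin center frequencies in Hz; mags: non-negative magnitudes.
    The 0.85 roll-off fraction is an assumed, commonly used default.
    """
    total = sum(mags)
    if total == 0:
        return {"centroid": 0.0, "variance": 0.0, "skewness": 0.0,
                "kurtosis": 0.0, "rolloff": 0.0}
    # Normalize magnitudes so they sum to 1.
    p = [m / total for m in mags]
    centroid = sum(f * w for f, w in zip(freqs, p))
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, p))
    if variance > 0:
        std = variance ** 0.5
        skewness = sum((f - centroid) ** 3 * w for f, w in zip(freqs, p)) / std ** 3
        kurtosis = sum((f - centroid) ** 4 * w for f, w in zip(freqs, p)) / std ** 4
    else:
        # All energy in one bin: higher moments are undefined, report 0.
        skewness = kurtosis = 0.0
    # Roll-off: lowest frequency below which the given fraction of magnitude lies.
    rolloff = freqs[-1]
    cumulative = 0.0
    for f, w in zip(freqs, p):
        cumulative += w
        if cumulative >= rolloff_fraction:
            rolloff = f
            break
    return {"centroid": centroid, "variance": variance, "skewness": skewness,
            "kurtosis": kurtosis, "rolloff": rolloff}
```

The returned values would form part of the feature vector for one stroke, alongside the MFCCs.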
Build Model: Trained Model
• WEKA machine learning package
• Support Vector Machine
• Models trained on different datasets can be saved for future use
Audio: Input
• Live audio is taken from a close-mic’d tabla
• Stereo signal provides partial separation of drums
Audio: Segmentation
• Onset detection done in Max using bonk~
• More recent parallel project uses spectral flux algorithm in Java
• End of stroke marked by next onset (1 sec buffer size)
• Onset times stored
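The segmentation rule above (a stroke ends at the next onset) can be sketched as follows. This is a minimal sketch that assumes onset times in seconds have already been detected, and that the 1 sec buffer acts as an upper bound on stroke length; `segment_strokes` and `max_len` are hypothetical names.

```python
def segment_strokes(onsets, max_len=1.0):
    """Pair each onset with an end time.

    A stroke ends at the next onset, or after max_len seconds (assumed to
    correspond to the 1 sec buffer), whichever comes first.
    """
    segments = []
    for i, start in enumerate(onsets):
        limit = start + max_len
        if i + 1 < len(onsets):
            end = min(onsets[i + 1], limit)
        else:
            # Last stroke: no following onset, so use the full buffer.
            end = limit
        segments.append((start, end))
    return segments
```

With onsets at 0.0, 0.25 and 2.0 seconds, the second stroke is cut off at 1.25 seconds by the buffer limit, not at the next onset.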
Audio: Feature Extraction
Feature vector: [F1, F2, F3, ..., Fn] (e.g. variance, spectral centroid, kurtosis)
Output: Classification
• Feature vector is fed to previously trained model
• Single category label returned
feature vector → SVM → label
Output: Symbolic Score
• Stroke label combined with timing and amplitude information
• Score stored in temporary buffer in Max patch
.3204 .9665 2
.3527 .5715 6
.3031 .3648 6
.3325 .9827 6
.2970 .4762 2
.3865 .5928 1
.3496 .6603 8
.7046 .4621 1
.3144 .5024 6
.7152 .2990 6
.3387 .8891 2
.2902 .7342 6
.3868 .9051 7
.3049 .5727 1
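Rows like those above could be read into a small record type. The slide does not say which column is which, so the sketch below assumes the first column is timing, the second is amplitude, and the third is an integer stroke label; `ScoreEvent` and `parse_score` are hypothetical names.

```python
from typing import NamedTuple


class ScoreEvent(NamedTuple):
    timing: float     # assumed: first column
    amplitude: float  # assumed: second column
    label: int        # assumed: third column, stroke category index


def parse_score(text):
    """Parse whitespace-separated score rows, skipping malformed lines."""
    events = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        events.append(ScoreEvent(float(fields[0]), float(fields[1]), int(fields[2])))
    return events


events = parse_score(".3204 .9665 2\n.3527 .5715 6")
```

Here `events[0].label` is `2` and `events[1].label` is `6`, taken from the first two rows of the score.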
Output: Timbre Remapping
Stroke labels can be flexibly remapped
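Remapping can be pictured as a lookup applied to each recognized label before resynthesis. A minimal sketch, assuming integer stroke labels; `remap_labels` and the example mapping are hypothetical.

```python
def remap_labels(labels, mapping):
    """Replace each label using mapping; labels without an entry are kept."""
    return [mapping.get(label, label) for label in labels]


# Illustrative mapping only: swap categories 2 and 6.
swapped = remap_labels([2, 6, 6, 1], {2: 6, 6: 2})
```

In this example `swapped` is `[6, 2, 2, 1]`: the timing of the phrase is untouched, and only the timbre assigned to each stroke changes.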
Output: Conditional Repetition
Output: User Interface
Dangum
Future Directions
• Beat tracking
• Modeling specific types of improvisational forms (e.g. qaida, tihai …)
• Automate transformations
• Improve interface so it can be “played”
• Tracking of expressive parameters (e.g. bayan pitch modulation)
Conclusions
• Shown a realtime tabla interaction system
• Implemented as Max Java external using machine learning to identify strokes
• Supports flexible transformations
• Foundation for more general improvisation system