optical music recognition
DESCRIPTION
Optical Music Recognition. Ichiro Fujinaga, McGill University, 2003. Content: Optical Music Recognition; Levy Project (Levy Sheet Music Collection); Digital Workflow Management; Gamera; Guido / NoteAbility.
![Page 1: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/1.jpg)
Optical Music Recognition
Ichiro Fujinaga
McGill University, 2003
![Page 2: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/2.jpg)
Content
Optical Music Recognition
Levy Project
• Levy Sheet Music Collection
Digital Workflow Management
Gamera
Guido / NoteAbility
![Page 3: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/3.jpg)
Optical Music Recognition (OMR)
Trainable open-source OMR system in development since 1984
Staff recognition and removal
• Run-length coding
• Projections
Lyric removal / classifier
Stems and notehead removal
Music symbol classifier
Score reconstruction
Demo
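The run-length coding step can be illustrated with a minimal sketch. It assumes a binarized page stored as a list of rows of 0/1 values (1 = black) and estimates staff-line thickness and staff-space height as the most common black and white vertical run lengths. The function names and the synthetic test page are hypothetical, and this is one common way to use run lengths for staff detection, not necessarily the procedure used in the system described here.

```python
from collections import Counter


def vertical_runs(image, x):
    """Yield (value, length) runs going down column x of a 0/1 image."""
    value, length = image[0][x], 0
    for row in image:
        if row[x] == value:
            length += 1
        else:
            yield value, length
            value, length = row[x], 1
    yield value, length


def estimate_staff_metrics(image):
    """Return (line_thickness, space_height) as the most common
    black and white vertical run lengths over all columns."""
    black, white = Counter(), Counter()
    for x in range(len(image[0])):
        for value, length in vertical_runs(image, x):
            (black if value else white)[length] += 1
    return black.most_common(1)[0][0], white.most_common(1)[0][0]


def make_staff(width=20, thickness=2, space=5, margin=3):
    """Build a synthetic five-line staff for illustration only."""
    rows = [[0] * width for _ in range(margin)]
    for i in range(5):
        rows += [[1] * width for _ in range(thickness)]
        if i < 4:
            rows += [[0] * width for _ in range(space)]
    rows += [[0] * width for _ in range(margin)]
    return rows


thickness, space = estimate_staff_metrics(make_staff())
print(thickness, space)
```

On real scans the two estimates give a scale for the rest of the pipeline: runs much longer than the line thickness are likely symbols rather than staff lines.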
![Page 4: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/4.jpg)
OMR: Classifier
Connected-component analysis
Feature extraction, e.g.:
• Width, height, aspect ratio
• Number of holes
• Central moments
k-nearest neighbor classifier
Genetic algorithm
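As a rough illustration of connected-component analysis followed by feature extraction, the sketch below labels 8-connected black regions in a 0/1 image and computes the features named on the slide. All function names are hypothetical, the hole count uses 4-connected background regions inside a padded bounding box, and only the second-order central moments are shown.

```python
from collections import deque


def connected_components(image, connectivity8=True):
    """Return a list of components, each a set of (row, col) pixels equal to 1."""
    rows, cols = len(image), len(image[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity8:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen, components = set(), []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and (r, c) not in seen:
                seen.add((r, c))
                queue, pixels = deque([(r, c)]), set()
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    pixels.add((y, x))
                    for dy, dx in steps:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                components.append(pixels)
    return components


def count_holes(pixels):
    """Count background regions fully enclosed by the component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    top, left = min(ys) - 1, min(xs) - 1
    height, width = max(ys) - min(ys) + 3, max(xs) - min(xs) + 3
    # Inverted crop with a one-pixel border: 1 marks background.
    background = [[0 if (top + r, left + c) in pixels else 1
                   for c in range(width)] for r in range(height)]
    regions = connected_components(background, connectivity8=False)
    # The region containing the border corner is the outside, not a hole.
    return sum(1 for region in regions if (0, 0) not in region)


def central_moment(pixels, p, q):
    """mu_pq = sum over pixels of (x - mean_x)^p * (y - mean_y)^q."""
    n = len(pixels)
    mean_y = sum(y for y, _ in pixels) / n
    mean_x = sum(x for _, x in pixels) / n
    return sum((x - mean_x) ** p * (y - mean_y) ** q for y, x in pixels)


def features(pixels):
    """Build a small feature dictionary for one component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    width, height = max(xs) - min(xs) + 1, max(ys) - min(ys) + 1
    return {
        "width": width,
        "height": height,
        "aspect_ratio": width / height,
        "holes": count_holes(pixels),
        "mu20": central_moment(pixels, 2, 0),
        "mu02": central_moment(pixels, 0, 2),
    }


image = [
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1],
]
for component in connected_components(image):
    print(features(component))
```

The ring on the left has one hole and the bar on the right has none, which is the kind of difference that separates, for example, a hollow notehead from a filled one.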
![Page 5: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/5.jpg)
Overall Architecture for OMR
Diagram labels:
Image File
Staff removal, Segmentation
Recognition: K-NN Classifier
Output: Symbol Name
Knowledge Base: Feature Vectors
Optimization (off-line): Genetic Algorithm, K-NN Classifier
Best Weight Vector
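The interplay between the k-NN classifier and the off-line genetic algorithm can be sketched as follows. The slide only states that a genetic algorithm produces a best weight vector for the classifier; everything else here is an assumption made for illustration: the weighted squared Euclidean distance, leave-one-out accuracy as the fitness function, truncation selection, one-point crossover, Gaussian mutation, and the toy feature vectors and labels.

```python
import random
from collections import Counter


def knn_classify(sample, training, weights, k=1):
    """Label `sample` by majority vote among its k nearest training
    vectors under a feature-weighted squared Euclidean distance."""
    def distance(a, b):
        return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))
    nearest = sorted(training, key=lambda item: distance(sample, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_accuracy(training, weights, k=1):
    """Leave-one-out accuracy of the weighted k-NN on the knowledge base."""
    hits = 0
    for i, (vector, label) in enumerate(training):
        rest = training[:i] + training[i + 1:]
        hits += knn_classify(vector, rest, weights, k) == label
    return hits / len(training)


def evolve_weights(training, generations=20, population_size=12, seed=0):
    """Search for a weight vector with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    n = len(training[0][0])
    # Start from uniform weights plus random candidates in [0, 1].
    population = [[1.0] * n] + [[rng.random() for _ in range(n)]
                                for _ in range(population_size - 1)]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda w: loo_accuracy(training, w), reverse=True)
        parents = ranked[:population_size // 2]  # survivors are kept (elitism)
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            i = rng.randrange(n)       # mutate one weight, clamped to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: loo_accuracy(training, w))


# Toy knowledge base: feature 0 separates the classes, feature 1 is noise.
training = [
    ((1.0, 9.0), "notehead"), ((1.2, 1.0), "notehead"), ((0.9, 5.0), "notehead"),
    ((3.0, 1.2), "rest"), ((3.1, 9.1), "rest"), ((2.9, 5.1), "rest"),
]
best = evolve_weights(training)
print(best, loo_accuracy(training, best))
```

Because the search runs off-line against the stored feature vectors, its cost does not affect recognition time; only the resulting weight vector is used when classifying new symbols.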
![Page 6: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/6.jpg)
Lester S. Levy Collection
![Page 7: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/7.jpg)
Lester S. Levy Collection
North American sheet music (1780–1960)
Digitized 29,000 pieces, including “The Star-Spangled Banner” and “Yankee Doodle”
Database of:
• text index records
• images of music (8-bit gray)
• lyrics (first lines of verse and chorus)
• color images of cover sheets (32-bit)
http://levysheetmusic.mse.jhu.edu
![Page 8: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/8.jpg)
Digital Workflow Management
Reduce the manual intervention for large-scale digitization projects
Creation of data repository (text, image, sound)
Optical Music Recognition (OMR)
Gamera
XML-based metadata
• composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
• cross-references for various forms of names, pseudonyms
• authoritative versions of names and subject terms
Music and lyric search engines
Analysis toolkit
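To make the XML-based metadata item concrete, here is a minimal sketch of how one record with role-tagged names and name variants might be built. The element and attribute names (`record`, `title`, `name`, `role`, `authoritative`, `variant`) and the sample values are invented for illustration; the slide lists the kinds of information stored but does not give the project's actual schema.

```python
import xml.etree.ElementTree as ET

# Roles listed on the slide.
ROLES = ["composer", "lyricist", "arranger", "performer", "artist",
         "engraver", "lithographer", "dedicatee", "publisher"]


def build_record(title, names, variants=None):
    """Build one metadata record.

    names:    dict mapping a role to the authoritative form of a name
    variants: dict mapping an authoritative name to alternate forms
              (other spellings, pseudonyms) used for cross-references
    """
    record = ET.Element("record")
    ET.SubElement(record, "title").text = title
    for role in ROLES:
        if role in names:
            person = ET.SubElement(record, "name", role=role)
            ET.SubElement(person, "authoritative").text = names[role]
            for alternate in (variants or {}).get(names[role], []):
                ET.SubElement(person, "variant").text = alternate
    return record


record = build_record(
    "Example Title",
    {"composer": "Doe, Jane", "publisher": "Example & Co."},
    {"Doe, Jane": ["J. Doe", "Jane D."]},
)
print(ET.tostring(record, encoding="unicode"))
```

Keeping one authoritative form per name, with variants attached to it, lets a search for any variant resolve to the same person.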
![Page 9: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/9.jpg)
The problem
Suitable OCR for lyrics not found
Commercial OCR systems are often inadequate for non-standard documents
The market for specialized recognition of historical documents is very small
Researchers performing document recognition often “re-invent” the basic image processing wheel
![Page 10: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/10.jpg)
The solution
Provide easy-to-use tools that allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
Generalize OMR for structured documents
![Page 11: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/11.jpg)
Introducing Gamera
Framework for creation of structured document recognition systems
Designed for domain experts
Image processing tools (filters, binarizations, etc.)
Document segmentation and analysis
Symbol segmentation and classification
• Feature extraction and selection
• Classifier selection and combiners
Syntactical and semantic analysis
Gamera: Generalized Algorithms and Methods for Enhancement and Restoration of Archives
![Page 12: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/12.jpg)
Features of Gamera
Portability (Unix, Windows, Mac)
Extensibility (Python and C++ plugins)
Easy-to-use (experts and programmers)
Open source
Graphic User Interface
Interactive / Batchable (scripts)
![Page 13: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/13.jpg)
Architecture of Gamera
Graphic User Interface (wxWindows)
GAMERA Core (C++)
Scripting Environment (Python)
Plugins (Python)
Automatic Plugin Wrapper (Boost)
Plugins (C++)
![Page 14: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/14.jpg)
Example of C++ Plugin
    // Number of pixels in matrix
    #include "gamera.hh"
    #ifdef __area_wrap__
    #define NARGS 1
    #define ARG1_ONEBIT
    #endif
    using namespace Gamera;
    template <class T>
    feature_t area(T &m) {
      return feature_t(m.nrows() * m.ncols());
    }
![Page 15: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/15.jpg)
Example of Python Plugin
    # This filters a list of CC objects
    import gamera
    def filter_wide(ccs, max_width):
        tmp = []
        for x in ccs:
            if x.ncols() > max_width:
                x.fill_matrix(0)
            else:
                tmp.append(x)
        return tmp
![Page 16: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/16.jpg)
Gamera: Interface (screenshot in Linux)
![Page 17: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/17.jpg)
Gamera: Interface (screenshot in Linux)
![Page 18: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/18.jpg)
Histogram (screenshot in Linux)
![Page 19: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/19.jpg)
Thresholding (screenshot in Linux)
![Page 20: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/20.jpg)
Thresholding (screenshot in Linux)
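The histogram and thresholding steps shown in the screenshots can be approximated with a short sketch. The slides do not say which thresholding method Gamera applies, so Otsu's method is used here purely as a well-known stand-in. The input is assumed to be an 8-bit grayscale image stored as a list of rows, and all function names are hypothetical.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each gray level."""
    counts = [0] * levels
    for row in pixels:
        for value in row:
            counts[value] += 1
    return counts


def otsu_threshold(counts):
    """Return the gray level t that maximizes the between-class variance
    when pixels <= t form one class and pixels > t the other."""
    total = sum(counts)
    total_sum = sum(level * count for level, count in enumerate(counts))
    best_t, best_score = 0, -1.0
    weight0, sum0 = 0, 0
    for t, count in enumerate(counts):
        weight0 += count
        sum0 += t * count
        weight1 = total - weight0
        if weight0 == 0 or weight1 == 0:
            continue  # one class is empty, so no valid split here
        mean0 = sum0 / weight0
        mean1 = (total_sum - sum0) / weight1
        score = weight0 * weight1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(pixels, t):
    """Map dark pixels (<= t) to 1 (ink) and light pixels to 0 (paper)."""
    return [[1 if value <= t else 0 for value in row] for row in pixels]


gray = [
    [20, 200, 220],
    [30, 210, 200],
]
t = otsu_threshold(histogram(gray))
print(t, binarize(gray, t))
```

A global threshold like this works when ink and paper form two clear histogram peaks; degraded historical scans may need locally adaptive methods instead.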
![Page 21: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/21.jpg)
Staff removal: Lute tablature
![Page 22: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/22.jpg)
![Page 23: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/23.jpg)
Classifier: Lute (screenshot in Linux)
![Page 24: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/24.jpg)
Staff removal: Neums
![Page 25: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/25.jpg)
Classifier: Neums (screenshot in Linux)
![Page 26: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/26.jpg)
Greek example
![Page 27: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/27.jpg)
GUIDO Music Notation Format
H. Hoos, K. Renz, J. Kilian
“A formal language for score-level representation”
Plain text: readable, platform independent
Extensible and flexible
Adequate representation
NoteServer: Web/Windows
GUIDO/XML
NoteAbility (K. Hamel)
![Page 28: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/28.jpg)
GUIDO: An example
    { [ \beamsOff | \clef<"treble"> \key<"D"> f#*1/8. g*1/16 |
    a*1/4. d2*1/8 d*1/4. c#*1/8 |
    e1*1/2 _*1/4 f#*1/8. g*1/16 |
    c#2*1/4. b1*1/8 a*1/4. g*1/8 ||
    e#*1/2 f#*1/4 f#*1/8. g*1/16 |
    a*1/4. d2*1/8 d*1/4. c#*1/8 |
    e1*1/2 _*1/4 f#*1/8 g |
    c#2*1/4. b1*1/8 a*1/4. c#*1/8 ],
…
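The note syntax in the example can be read mechanically, as the following sketch of a partial reader shows. It handles only the forms that appear on the slide: a note name or `_` for a rest, optional `#` or `&` accidentals, an optional octave number, an optional `*n/d` duration, and trailing dots. It assumes, based on a general understanding of GUIDO rather than anything stated in the slides, that the octave and duration carry over from the previous note when omitted and start at octave 1 and 1/4. Tags such as `\clef` are skipped, and `parse_voice` is a hypothetical name.

```python
import re
from fractions import Fraction

NOTE = re.compile(
    r"^(?P<name>[a-h]|_)(?P<accidental>[#&]*)(?P<octave>-?\d+)?"
    r"(?:\*(?P<num>\d+)/(?P<den>\d+))?(?P<dots>\.*)$"
)


def parse_voice(text):
    """Return a list of bars; each bar is a list of
    (pitch, octave, duration) tuples, with octave None for rests."""
    octave, base, dots = 1, Fraction(1, 4), 0  # assumed defaults
    bars, current = [], []
    for token in text.replace("|", " | ").split():
        if token == "|":
            if current:
                bars.append(current)
                current = []
            continue
        match = NOTE.match(token)
        if match is None:
            continue  # tags, brackets and other tokens are ignored here
        if match["octave"] is not None:
            octave = int(match["octave"])
        if match["num"] is not None:
            base = Fraction(int(match["num"]), int(match["den"]))
            dots = len(match["dots"])
        # Each dot adds half of the previously added value.
        duration = base * (2 - Fraction(1, 2 ** dots))
        is_rest = match["name"] == "_"
        pitch = match["name"] + match["accidental"]
        current.append((pitch, None if is_rest else octave, duration))
    if current:
        bars.append(current)
    return bars


sample = r'\clef<"treble"> \key<"D"> f#*1/8. g*1/16 |a*1/4. d2*1/8 d*1/4. c#*1/8 |'
bars = parse_voice(sample)
for bar in bars:
    print(bar, "total:", sum(duration for _, _, duration in bar))
```

Summing the durations per bar gives a simple consistency check on recognized output: the pickup totals a quarter note and the first full bar totals a whole note.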
![Page 29: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/29.jpg)
![Page 30: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/30.jpg)
Conclusions
Gamera allows rapid development of domain-specific document recognition applications
Domain experts can customize and control all aspects of the recognition process
Includes an easy-to-use interactive environment for experimentation
Beta version available on Linux
OS X version in preparation
![Page 31: Optical Music Recognition](https://reader035.vdocument.in/reader035/viewer/2022062217/56814731550346895db46f02/html5/thumbnails/31.jpg)
Projections
X-projections
Y-projections
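A minimal sketch of the two projections follows, assuming a binarized image stored as rows of 0/1 values (1 = black). The naming convention is an assumption: here the X-projection is the profile along the x-axis (black pixels per column) and the Y-projection is the profile along the y-axis (black pixels per row). The `staff_line_rows` helper and its 0.8 fill fraction are hypothetical and show just one way such a profile could be used to find staff lines.

```python
def x_projection(image):
    """Number of black pixels in each column."""
    return [sum(column) for column in zip(*image)]


def y_projection(image):
    """Number of black pixels in each row."""
    return [sum(row) for row in image]


def staff_line_rows(image, fraction=0.8):
    """Rows whose black-pixel count reaches `fraction` of the image width."""
    width = len(image[0])
    return [y for y, count in enumerate(y_projection(image))
            if count >= fraction * width]


page = [
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
print(x_projection(page))
print(y_projection(page))
print(staff_line_rows(page))
```

Long horizontal lines show up as sharp peaks in the row profile, while vertical strokes such as stems and barlines stand out in the column profile.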