Post on 17-Jan-2018

Digitization of the Lester S. Levy Collection of Sheet Music

Ichiro Fujinaga
McGill University

with Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson

Levy Project II
Digital Knowledge Center
Sheridan Libraries
Johns Hopkins University

Contents

  Levy Project
  Levy Sheet Music Collection
  Digital Workflow Management
  Optical Music Recognition
  Gamera
  Guido / NoteAbility

Current goals

Digitization completed

Under development

Lester S. Levy Collection
levysheetmusic.mse.jhu.edu

  North American sheet music (1780–1960)
  Digitized 29,000 pieces (130,000 sheets)
  Began in 1994
  Includes “The Star-Spangled Banner” and “Yankee Doodle”

Database of:
  metadata
  images of music (8-bit gray)
  lyrics (first lines of verse and chorus)
  color images of cover sheets (32-bit)

Reduce the manual intervention for large-scale digitization projects
  Creation of data repository (text, image, sound)
  Optical Music Recognition (OMR)
  Gamera
  XML-based metadata
    composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
    cross-references for various forms of names, pseudonyms
    authoritative versions of names and subject terms
  Music and lyric search engines
  Analysis toolkit
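The XML-based metadata item can be illustrated with a minimal sketch. The element names (`record`, `name`, `authoritative`, `variant`) and the sample values are hypothetical placeholders, not the Levy Project schema; the point is only to show one authoritative name per role with cross-referenced variant forms.

```python
import xml.etree.ElementTree as ET

def build_record(title, roles, name_variants):
    """Build one metadata record.

    roles:         maps a role (e.g. "composer") to an authoritative name
    name_variants: maps an authoritative name to its other known forms
    All element names here are hypothetical, chosen for illustration.
    """
    record = ET.Element("record")
    ET.SubElement(record, "title").text = title
    for role, name in roles.items():
        person = ET.SubElement(record, "name", role=role)
        ET.SubElement(person, "authoritative").text = name
        # Cross-references: pseudonyms and alternative spellings
        for variant in name_variants.get(name, []):
            ET.SubElement(person, "variant").text = variant
    return record

if __name__ == "__main__":
    rec = build_record(
        "Example Title",
        {"composer": "Surname, Forename", "lyricist": "Other, Person"},
        {"Surname, Forename": ["F. Surname"]},
    )
    print(ET.tostring(rec, encoding="unicode"))
```

A search engine could then match a query against both the authoritative form and every variant, and report the authoritative form.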

Digital Workflow Management

Optical Music Recognition (OMR)

Trainable open-source OMR system, in development since 1984
  Staff recognition and removal
  Lyric removal
  Stems and notehead removal
  Music symbol classifier
  Score reconstruction
  Lyric classifier?
Optical Character Recognition (OCR)

The problem
  Suitable OCR for lyrics not found
  Commercial OCR systems are often inadequate for non-standard documents
  The market for specialized recognition of historical documents is very small
  Researchers performing document recognition often “re-invent” the basic image processing wheel

The solution
  Provide easy-to-use tools that allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
  Generalize OMR for structured documents

Introducing Gamera
  Framework for the creation of structured document recognition systems
  Designed for domain experts
  Image processing tools (filters, binarizations, …)
  Document segmentation and analysis
  Symbol segmentation and classification
  Syntactical and semantic analysis

Generalized Algorithms and Methods for Enhancement and Restoration of Archives

Features of Gamera
  Portability (Unix, Windows, Mac)
  Extensibility (Python and C++ plugins)
  Easy to use (experts and programmers)
  Open source
  Graphic User Interface
  Interactive / Batchable (scripts)

Gamera: Interface (screenshot in Linux)

Histogram (screenshot in Linux)

Thresholding (screenshot in Linux)
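The histogram and thresholding steps can be sketched in plain Python. This is not Gamera's API: the function names are hypothetical, and Otsu's method is swapped in as one common way to pick a threshold from the histogram of an 8-bit grayscale image. Which methods Gamera itself offers is not stated here.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each gray level (0 = black)."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts

def otsu_threshold(counts):
    """Pick the gray level that maximizes between-class variance (Otsu)."""
    total = sum(counts)
    sum_all = sum(level * c for level, c in enumerate(counts))
    best_level, best_score = 0, -1.0
    weight_dark = 0   # pixels at or below the candidate level
    sum_dark = 0      # sum of their gray values
    for level, c in enumerate(counts):
        weight_dark += c
        sum_dark += level * c
        if weight_dark == 0:
            continue
        weight_light = total - weight_dark
        if weight_light == 0:
            break
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        score = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if score > best_score:
            best_score, best_level = score, level
    return best_level

def binarize(pixels, threshold):
    """Map gray values to 1 (ink, dark) or 0 (paper, light)."""
    return [1 if p <= threshold else 0 for p in pixels]

if __name__ == "__main__":
    page = [10] * 50 + [200] * 50          # dark ink and light paper
    t = otsu_threshold(histogram(page))
    print(t, sum(binarize(page, t)))
```

A fixed global threshold is the simplest choice; aged or unevenly lit sheets may need a locally adaptive one instead.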

Staff removal: Lute tablature
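Staff removal can be sketched as follows. This is a deliberately naive approach, not the algorithm used in the project: it assumes a binarized image (1 = ink) with perfectly horizontal staff lines, finds rows that are mostly ink, and erases each staff pixel unless a symbol crosses the line at that column. All names are hypothetical.

```python
def find_staff_rows(image, min_fill=0.8):
    """Return the indices of rows whose ink fraction is at least min_fill."""
    width = len(image[0])
    return [y for y, row in enumerate(image) if sum(row) / width >= min_fill]

def remove_staff_lines(image, min_fill=0.8):
    """Return a copy of image with staff lines erased.

    A staff pixel is kept when there is ink directly above or below the
    line at the same column, so stems and noteheads are not cut in two.
    """
    staff = set(find_staff_rows(image, min_fill))
    height = len(image)
    result = [row[:] for row in image]
    for y in staff:
        # Nearest non-staff rows on either side (lines may be several pixels thick)
        above = y - 1
        while above in staff:
            above -= 1
        below = y + 1
        while below in staff:
            below += 1
        for x, pixel in enumerate(image[y]):
            if not pixel:
                continue
            ink_above = above >= 0 and image[above][x] == 1
            ink_below = below < height and image[below][x] == 1
            if not (ink_above or ink_below):
                result[y][x] = 0
    return result

if __name__ == "__main__":
    img = [[0] * 10 for _ in range(7)]
    img[3] = [1] * 10                  # one staff line
    for row in range(1, 6):
        img[row][4] = 1                # a stem crossing the line
    for row in remove_staff_lines(img):
        print("".join("#" if p else "." for p in row))
```

Real scans have skewed, curved, or broken staff lines, so a projection over whole rows is only a starting point.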

Classifier: Lute (screenshot in Linux)

Staff removal: Neumes

Classifier: Neumes (screenshot in Linux)

Greek example

GUIDO Music Notation Format
H. Hoos, K. Renz, J. Kilian

“A formal language for score-level representation”
  Plain text: readable, platform independent
  Extensible and flexible
  Adequate representation
  NoteServer: Web/Windows
  GUIDO/XML
  NoteAbility (K. Hamel)

Conclusions

Levy Collection
  Searchable metadata
  Online images (public domain) of music and cover
Digital Workflow Management
Optical Music Recognition
Gamera for domain experts
  Includes an easy-to-use interactive environment for experimentation
  Beta version available on Linux
  OS X and Windows version in preparation

Acknowledgements
  National Science Foundation
  National Endowment for the Humanities
  Institute of Museum and Library Services
  The Levy Family

OMR: Classifier

Connected-component analysis
Feature extraction, e.g.:
  Width, height, aspect ratio
  Number of holes
  Central moments
k-nearest neighbor classifier
Genetic algorithm
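The first three classifier steps can be sketched end to end in plain Python. The names are hypothetical and this is not Gamera's implementation. For brevity the feature vector holds width, height, aspect ratio, and a bounding-box fill ratio; the fill ratio is a stand-in chosen here, and the hole count and central moments are omitted.

```python
import math
from collections import deque

def connected_components(image):
    """Group ink pixels (1) into 8-connected components; return lists of (y, x)."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:                       # breadth-first flood fill
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components

def features(pixels):
    """Return [width, height, aspect ratio, fill ratio] of one component."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    w = max(xs) - min(xs) + 1
    h = max(ys) - min(ys) + 1
    return [float(w), float(h), w / h, len(pixels) / (w * h)]

def knn_classify(sample, training, k=1, weights=None):
    """Majority vote among the k nearest training vectors.

    training: list of (feature_vector, label); weights scale each feature
    in the squared Euclidean distance (all 1.0 when omitted).
    """
    weights = weights or [1.0] * len(sample)

    def distance(a, b):
        return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))

    nearest = sorted(training, key=lambda item: distance(sample, item[0]))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

The `weights` argument is where a genetic algorithm could plug in: it would search for the weight vector that gives the best recognition rate on the training set.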

Overall Architecture for OMR

[Diagram. Recognition path: Image File, Staff removal, Segmentation, Recognition (k-NN Classifier), Output (Symbol Name). Shared data: Knowledge Base (Feature Vectors). Off-line path: Optimization (Genetic Algorithm, k-NN Classifier), producing the Best Weight Vector.]
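The off-line optimization path can be sketched as a small genetic algorithm that searches for feature weights. Everything here is an assumption made for illustration: the fitness measure (leave-one-out accuracy of a weighted 1-nearest-neighbor classifier), truncation selection, one-point crossover, Gaussian mutation, and all parameter values. The source names only the components, not their details.

```python
import random

def loo_accuracy(weights, data):
    """Leave-one-out accuracy of a weighted 1-nearest-neighbor classifier.

    data: list of (feature_vector, label).
    """
    correct = 0
    for i, (x, label) in enumerate(data):
        best_dist, best_label = float("inf"), None
        for j, (y, other) in enumerate(data):
            if i == j:
                continue
            d = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))
            if d < best_dist:
                best_dist, best_label = d, other
        correct += best_label == label
    return correct / len(data)

def evolve_weights(data, n_features, generations=30, pop_size=20, seed=0):
    """Return the best weight vector found; weights stay within [0, 1]."""
    rng = random.Random(seed)
    # Start from uniform weights plus random candidates
    population = [[1.0] * n_features] + [
        [rng.random() for _ in range(n_features)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=lambda w: loo_accuracy(w, data), reverse=True)
        parents = population[: pop_size // 2]      # survivors, kept unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features) if n_features > 1 else 0
            child = a[:cut] + b[cut:]              # one-point crossover
            k = rng.randrange(n_features)          # mutate one weight
            child[k] = min(1.0, max(0.0, child[k] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: loo_accuracy(w, data))
```

Because the survivors are carried over unchanged and the uniform vector is in the first generation, the result is never worse than unweighted classification on the training data.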

Architecture of Gamera

[Diagram. Components: Graphic User Interface (wxWindows); Scripting Environment (Python); GAMERA Core (C++); Plugins (Python); Plugins (C++); Automatic Plugin Wrapper (Boost).]

GUIDO: An example

{ [ \beamsOff | \clef<"treble"> \key<"D"> f#*1/8. g*1/16 |
  a*1/4. d2*1/8 d*1/4. c#*1/8 |
  e1*1/2 _*1/4 f#*1/8. g*1/16 |
  c#2*1/4. b1*1/8 a*1/4. g*1/8 ||
  e#*1/2 f#*1/4 f#*1/8. g*1/16 |
  a*1/4. d2*1/8 d*1/4. c#*1/8 |
  e1*1/2 _*1/4 f#*1/8 g |
  c#2*1/4. b1*1/8 a*1/4. c#*1/8 ],
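Because GUIDO is plain text, a few lines of code are enough to read the notes of one bar. The sketch below is a hypothetical helper, not part of any GUIDO tool, and handles only the subset used in the example: a note name or `_` for a rest, optional `#`/`&` accidentals, an optional octave number, and an optional `*n/d` duration with trailing dots. It assumes that an omitted octave or duration carries over from the previous note.

```python
import re
from fractions import Fraction

# name, accidentals, octave, and "*n/d" with optional dots
NOTE = re.compile(r"^([a-h_])([#&]*)(-?\d+)?(?:\*(\d+)/(\d+)(\.*))?$")

def parse_bar(text, octave=1, duration=Fraction(1, 4)):
    """Parse whitespace-separated notes of one bar.

    Returns (events, octave, duration); each event is
    (pitch name, octave or None for a rest, duration as a Fraction).
    The returned octave and duration can be passed on to the next bar.
    """
    events = []
    for token in text.split():
        match = NOTE.match(token)
        if match is None:
            raise ValueError(f"unsupported token: {token!r}")
        name, accidentals, octave_text, num, den, dots = match.groups()
        if octave_text is not None:
            octave = int(octave_text)
        if num is not None:
            base = Fraction(int(num), int(den))
            # Each dot adds half of the previous increment: 1, 1.5, 1.75, ...
            duration = base * (2 - Fraction(1, 2 ** len(dots)))
        events.append((name + accidentals,
                       None if name == "_" else octave,
                       duration))
    return events, octave, duration

if __name__ == "__main__":
    events, _, _ = parse_bar("a*1/4. d2*1/8 d*1/4. c#*1/8")
    for event in events:
        print(event)
    print("bar length:", sum(e[2] for e in events))
```

Summing the durations of a bar is a quick consistency check on recognized output before it is passed on for rendering.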
