Musical Notes Reader
Anna Shmushkin & Lior Abramov
Introduction
Optical music recognition (OMR) has been the subject of research for decades. Many image
processing algorithms and techniques have been developed to address this problem, and yet
the problem still poses many challenges to scientists and researchers today.
The goal of this project is to parse a sheet of musical notes from a photographed image
and to provide a playback mechanism for it.
Algorithm
Image Alignment
1) Find the four corners of the page.
i. We find the boundary of the largest region, using the MATLAB function
bwboundaries, and then find its corners.
2) Apply a 2-D projective geometric transformation to the input image, using the
four corners of the page found in step 1.
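The projective transformation in step 2 can be illustrated with a minimal Python sketch. It assumes the four page corners are already known and solves the standard eight-unknown linear system for the homography that maps them onto an upright rectangle. The function names (solve, homography, apply_homography) and the example coordinates are hypothetical and are not taken from the project.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [v] for row, v in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 projective matrix mapping four src points (x, y) to four dst points (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(hm, x, y):
    """Map one point through the homography (with the perspective divide)."""
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)


# Hypothetical detected corners (clockwise from top-left) and target rectangle.
corners = [(10, 20), (200, 30), (190, 300), (5, 280)]
target = [(0, 0), (100, 0), (100, 150), (0, 150)]
H = homography(corners, target)
```

Warping the whole image would then amount to sampling the input at the inverse-mapped position of every output pixel; that step is omitted here.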
Reading the Note Sheet …
Staff Lines
1) Detection: The first step in processing an input image is to detect the individual
staff lines of the piece of music. We computed the Y projection of the image and
then found its peaks using findpeaks with MinPeakHeight=image_width/3.
2) Parameter Extraction: Once we had the staff line locations from the previous
step, we calculated the gap between the staff lines, g, as the median of the
interval lengths between adjacent staff lines. We also calculated the staff line
thickness, t, as the median of the peak widths from the previous step.
3) Removal: We removed the staff lines using the staff line locations and the
parameters g and t. We scanned across the rows of the staff lines, removing each
black pixel that has no black pixel directly above or below it.
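The three staff line steps can be sketched in plain Python on a binary image stored as a list of rows (1 = black). This is only an illustration under assumptions: a simple run-length scan stands in for MATLAB's findpeaks, "above or below" is read as the pixels just outside the detected line, and the function names are hypothetical.

```python
from statistics import median


def staff_rows(img, min_fraction=1 / 3):
    """Return (top, bottom) row runs whose Y projection exceeds width * min_fraction."""
    width = len(img[0])
    projection = [sum(row) for row in img]          # black pixels per row
    runs, start = [], None
    for y, count in enumerate(projection + [0]):    # trailing 0 closes a final run
        if count > width * min_fraction and start is None:
            start = y
        elif count <= width * min_fraction and start is not None:
            runs.append((start, y - 1))
            start = None
    return runs


def staff_parameters(runs):
    """Gap g = median distance between adjacent lines; thickness t = median run height."""
    centers = [(top + bottom) / 2 for top, bottom in runs]
    g = median(c2 - c1 for c1, c2 in zip(centers, centers[1:]))
    t = median(bottom - top + 1 for top, bottom in runs)
    return g, t


def remove_staff_lines(img, runs):
    """Clear staff line pixels unless a black pixel sits directly above or below the line."""
    out = [row[:] for row in img]
    height = len(img)
    for top, bottom in runs:
        for x in range(len(img[0])):
            above = top > 0 and img[top - 1][x]
            below = bottom < height - 1 and img[bottom + 1][x]
            if not above and not below:
                for y in range(top, bottom + 1):
                    out[y][x] = 0
    return out
```

Keeping line pixels that have ink above or below them preserves stems and note heads that cross a staff line, which is why the removal step is sensitive to a poorly aligned image.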
Segmentation
1) Staff Segmentation: We divided the image into horizontal strips, one strip per
staff (5 lines).
2) Note Segmentation: Vertical strips of each staff segment were identified as note
group segments. We computed the X projection and then grouped columns into
segments using the thresholds MIN_WIDTH_THRESHOLD=g*2 and
MIN_HEIGHT_SUM_THRESHOLD=t*2.
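One plausible reading of the note segmentation step is sketched below in Python. It assumes that a column belongs to a segment when its X projection reaches the height-sum threshold, and that a run of such columns is kept only when it is at least as wide as the width threshold. The report does not spell out how the two thresholds are combined, so this interpretation and the function name are assumptions.

```python
def segment_columns(img, min_width, min_height_sum):
    """Return (first_col, last_col) runs of columns that pass both thresholds.

    img is a binary staff strip (list of rows, 1 = black) with staff lines removed.
    Both thresholds are assumed to be positive.
    """
    projection = [sum(col) for col in zip(*img)]    # black pixels per column
    segments, start = [], None
    for x, total in enumerate(projection + [0]):    # trailing 0 closes a final run
        if total >= min_height_sum and start is None:
            start = x
        elif total < min_height_sum and start is not None:
            if x - start >= min_width:
                segments.append((start, x - 1))
            start = None
    return segments
```

With the thresholds from the text, the call would be segment_columns(strip, min_width=2 * g, min_height_sum=2 * t).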
Clef Detection
We identified the clef using template matching (MATLAB function normxcorr2). We
assumed there is one clef at the beginning of the staff.
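Template matching by normalized cross-correlation can be sketched in pure Python as follows. Unlike normxcorr2, which also scores partially overlapping positions, this simplified version only evaluates positions where the template fits entirely inside the image. The function names are hypothetical.

```python
from math import sqrt


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2-D lists, in [-1, 1]."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = sqrt(sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t))
    return num / den if den else 0.0                # flat patches score 0


def best_match(img, template):
    """Return (score, (row, col)) of the best-scoring template position."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, None
    for y in range(len(img) - th + 1):
        for x in range(len(img[0]) - tw + 1):
            patch = [row[x:x + tw] for row in img[y:y + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_score, best_pos
```

Because one clef is assumed at the beginning of each staff, the search could be restricted to the leftmost part of each staff strip, and a score threshold could decide whether a given clef template is present.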
Note Head Detection
Once the note segments had been identified, we were able to find the coordinates of the
note heads. A simple erosion would be the obvious way to do this, but on its own it
would not detect any half or whole notes, whose heads are hollow. Therefore, we first
filled holes with a radius smaller than g/2. At this point, we could detect quarter, half,
and whole note heads by performing an erosion with a disk structuring element. Since we
expected the note heads to have a diameter of approximately the staff line gap g, we
chose a radius of 0.75*g/2 for the structuring element.
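The hole filling and erosion steps can be sketched in pure Python on a binary image (list of rows, 1 = black). The sketch assumes that "holes with radius smaller than g/2" can be approximated by enclosed background regions whose area is at most that of a disk of radius g/2; the function names are hypothetical.

```python
from collections import deque
from math import pi


def fill_small_holes(img, max_area):
    """Fill background regions that do not touch the border and have area <= max_area."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] or seen[sy][sx]:
                continue
            region, touches_border = [], False
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:                             # flood fill one background region
                y, x = queue.popleft()
                region.append((y, x))
                if y in (0, h - 1) or x in (0, w - 1):
                    touches_border = True
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and not img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if not touches_border and len(region) <= max_area:
                for y, x in region:
                    out[y][x] = 1
    return out


def erode_disk(img, radius):
    """Binary erosion with a disk; pixels outside the image count as background."""
    h, w = len(img), len(img[0])
    r = int(radius)
    offsets = [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
               if dy * dy + dx * dx <= radius * radius]
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in offsets))
             for x in range(w)] for y in range(h)]


def note_head_mask(img, g):
    """Fill hollow heads, then erode with a disk of radius 0.75*g/2."""
    filled = fill_small_holes(img, max_area=pi * (g / 2) ** 2)
    return erode_disk(filled, 0.75 * g / 2)
```

Thin structures such as stems and flags are narrower than the disk and vanish under the erosion, so the connected regions that remain are the note head candidates.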
Note Identification
Given the eroded regions from the previous stage of the processing pipeline, we next
classified the note type, octave, and pitch of each note.
Note type
We currently support only 8th, quarter, half, and whole notes.
First we tried to detect 8th notes. An 8th note consists of 2 regions with centroids centroid1 and
centroid2 (sorted by y), where centroid2.y-centroid1.y ~ g*2 and centroid1.x-centroid2.x ~ g.
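The 8th note test on a pair of region centroids can be sketched as below. The tolerance tol (as a fraction of g) is a hypothetical parameter, since the report only says "approximately", and the sketch compares the absolute horizontal offset, which is an assumption that keeps it independent of which side the flag is on.

```python
def is_eighth_pair(c1, c2, g, tol=0.5):
    """True if two (x, y) centroids are about 2*g apart vertically and g horizontally."""
    (x1, y1), (x2, y2) = sorted((c1, c2), key=lambda c: c[1])   # sort by y
    vertical_ok = abs((y2 - y1) - 2 * g) <= tol * g
    horizontal_ok = abs(abs(x1 - x2) - g) <= tol * g
    return vertical_ok and horizontal_ok
```

For example, with g = 10, the centroids (30, 10) and (20, 30) satisfy both conditions, while two centroids stacked in the same column do not.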
Next we detected quarter, half, and whole notes.
We first determined whether the region around the note head was filled or empty. To do this,
we counted the number of filled pixels in a small circular region surrounding the centroid in
the original image (before erosion and small-region filling) and thresholded the count to
decide whether the note head was full or empty. If the note head was full, we classified the
note as a quarter note. If the note head was empty, we examined the X projection of the note
for the presence of a note stem, whose height we observed to be at least 2.5*g. If we found a
peak of this height or greater in the X projection, we assumed the presence of a stem and
classified the note as a half note. Otherwise we classified the note as a whole note.
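The decision rule above can be written as a short Python sketch. The sampling radius (g/4) and the fill threshold (0.9) are assumptions chosen for illustration, since the report does not give the values it used; the function names are hypothetical.

```python
def fill_ratio(img, cy, cx, radius):
    """Fraction of black pixels within `radius` of (cy, cx) in a binary image."""
    inside = filled = 0
    for y in range(max(0, int(cy - radius)), min(len(img), int(cy + radius) + 1)):
        for x in range(max(0, int(cx - radius)), min(len(img[0]), int(cx + radius) + 1)):
            if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                inside += 1
                filled += img[y][x]
    return filled / inside if inside else 0.0


def classify_note(segment, cy, cx, g, fill_threshold=0.9):
    """Classify one note segment (original pixels, before filling and erosion)."""
    if fill_ratio(segment, cy, cx, g / 4) >= fill_threshold:
        return "quarter"                             # solid note head
    tallest = max(sum(col) for col in zip(*segment)) # peak of the X projection
    return "half" if tallest >= 2.5 * g else "whole" # a stem gives a tall column
```

Sampling well inside the head keeps the outline of a hollow head out of the count, which is why a radius smaller than the head radius is assumed here.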
Octave and pitch
To determine the octave and pitch, we took the centroid of the region found previously and
compared it with the staff line locations in the original image, rounding its vertical
position to the nearest multiple of g/2 (each line or space is one step). This step count
was converted directly into an octave and pitch.
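A minimal sketch of this conversion, assuming a treble clef (so the bottom staff line is E4) and image coordinates in which y grows downward. The function name and the octave numbering (middle C = C4) are choices made for this illustration.

```python
NOTE_NAMES = "cdefgab"


def pitch_from_y(cy, bottom_line_y, g):
    """Map a note head centroid row to (note name, octave) for a treble clef staff."""
    steps = round((bottom_line_y - cy) / (g / 2))   # lines and spaces above the bottom line
    index = 4 * 7 + 2 + steps                       # diatonic index of E4, shifted by steps
    octave, degree = divmod(index, 7)
    return NOTE_NAMES[degree], octave


print(pitch_from_y(100, 100, 10))   # on the bottom line: ('e', 4)
print(pitch_from_y(110, 100, 10))   # first ledger line below: ('c', 4)
```

Rounding to the nearest half gap absorbs small centroid errors, but only as long as the error stays below g/4.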
MIDI Synthesis
Finally, given the list of notes with their durations and pitches, we generated a MIDI file
for the decoded sheet music. We used the LilyPond program to produce it.
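As an illustration of how a decoded note list could be turned into LilyPond input, the sketch below formats (name, octave, duration) tuples into a score with one staff per detected staff. The tuple layout and function names are hypothetical; the octave marks follow LilyPond's absolute pitch notation, in which c' is middle C.

```python
def lily_pitch(name, octave):
    """LilyPond absolute pitch: octave 3 has no mark, higher adds ', lower adds commas."""
    shift = octave - 3
    return name + ("'" * shift if shift >= 0 else "," * -shift)


def lily_score(staves):
    """Build a LilyPond \\score block; each staff is a list of (name, octave, duration)."""
    lines = ["\\score {<<"]
    for notes in staves:
        body = " ".join(f"{lily_pitch(name, octave)}{duration}"
                        for name, octave, duration in notes)
        lines.append("  \\new Staff { \\easyHeadsOn \\clef treble " + body + " }")
    lines += [">>\\midi {}", "\\layout {}", "}"]
    return "\n".join(lines)


print(lily_score([[("e", 4, 4), ("e", 4, 4), ("e", 4, 2)], [("c", 4, 1)]]))
```

Running LilyPond on the resulting file produces the engraved score from the \layout block and a MIDI file from the \midi block.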
Figure 1: Original Image
Figure 2: Aligned Image
Figure 3: Staff Line Detection
Interface
We used a client-server model.
1) On the client side, the user chooses an input image (from the gallery or the camera).
2) The input image is sent to the server.
3) The server returns a zip file containing the processed result, which consists of:
a. a MIDI file, output of the LilyPond program
b. a PNG image file, output of the LilyPond program
4) The client shows the PNG result and plays the MIDI file.
Visual Results
Figure 4: Staff Segmentation
Figure 5: Staff Lines Removal and Notes Segmentation
Figure 6: Notes Identification
Limitations of the solution:
We currently support only 8th, quarter, half, and whole notes.
When the image is not aligned correctly, the staff lines are jagged, and the
algorithm then behaves poorly in the staff line removal phase.
\score {<<
\new Staff { \easyHeadsOn
\clef treble e'4 e'4 e'2 e'4 e'4 e'2 e'4
g'4 c'4 d'8 }
\new Staff { \easyHeadsOn
\clef treble e'1 f'4 f'4 f'4 f'8 f'4 e'4
e'4 e'4 }
\new Staff { \easyHeadsOn
\clef treble e'4 d'4 d'4 e'4 d'2 g'2 e'4
e'4 e'2 e'4 e'4 e'2 }
\new Staff { \easyHeadsOn
\clef treble e'4 g'4 c'4 d'8 e'1 f'4 f'4
f'4 f'4 }
\new Staff { \easyHeadsOn
\clef treble f'4 e'4 e'4 e'4 g'4 g'4 f'4
d'4 c'1 }
>>\midi {}
\layout {}
}
Figure 7: LilyPond file