text detection strategies
Welcome to our first
Computer Vision Meetup
Sponsored by
Daniel Albertini Technical Director & Co-Founder
Anyline - a product of 9yards GmbH
Zirkusgasse 13/2b
1020 Wien
Agenda
- Overview talk about different text detection strategies.
- Feedback about possible future Meetup topics.
- Get-together, discussion and beer.
Text Detection Strategies Overview
SWT (Stroke Width Transformation)
Computes, for each pixel, the width of the most likely stroke containing that pixel.
Steps:
- Compute the edge map of the image.
- Compute the X and Y gradient maps.
- Cast a ray from every edge pixel in the direction given by the gradient maps, until it reaches another edge pixel.
- Set the value of each pixel on the ray to the minimum of its current value and the ray length.
- Group neighboring pixels with similar stroke width to find letter candidates.
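The ray-casting steps can be sketched in plain Python. This is a simplified illustration, not the reference implementation. It assumes dark text on a light background. A crude "darker than the neighbor along the gradient" test stands in for a real edge detector such as Canny. The ray length is counted in pixels. The names `stroke_width_transform`, `edge_threshold` and `max_ray` are made up for this sketch.

```python
INF = float("inf")

def sign(v):
    return (v > 0) - (v < 0)

def stroke_width_transform(gray, edge_threshold=100, max_ray=50):
    """Simplified SWT for dark text on a light background.

    gray is a list of rows of intensities. Returns a map of the same shape
    with the stroke width per pixel (INF where no accepted ray passed).
    """
    h, w = len(gray), len(gray[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbor.
        return gray[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # Gradient maps from central differences.
    gx = [[px(y, x + 1) - px(y, x - 1) for x in range(w)] for y in range(h)]
    gy = [[px(y + 1, x) - px(y - 1, x) for x in range(w)] for y in range(h)]

    # Crude edge map (assumption, stand-in for Canny): a pixel is an edge
    # pixel if it is much darker than its neighbor one step along the gradient.
    edge = [[px(y + sign(gy[y][x]), x + sign(gx[y][x])) - gray[y][x]
             >= edge_threshold for x in range(w)] for y in range(h)]

    swt = [[INF] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not edge[y][x]:
                continue
            # Walk against the gradient, i.e. into the dark stroke.
            # Dividing by the dominant component advances one pixel per step.
            scale = max(abs(gx[y][x]), abs(gy[y][x]))
            dy, dx = -gy[y][x] / scale, -gx[y][x] / scale
            ray = [(y, x)]
            for t in range(1, max_ray):
                qy, qx = round(y + t * dy), round(x + t * dx)
                if not (0 <= qy < h and 0 <= qx < w):
                    break
                ray.append((qy, qx))
                if edge[qy][qx]:
                    # Accept the ray only if the far edge faces the other way.
                    if gx[y][x] * gx[qy][qx] + gy[y][x] * gy[qy][qx] < 0:
                        # Keep the smallest ray length seen for each pixel.
                        for ry, rx in ray:
                            swt[ry][rx] = min(swt[ry][rx], len(ray))
                    break
    return swt
```

For a vertical dark bar that is three pixels wide, every pixel of the bar ends up with the value 3 and the background stays at `INF`. For light text on a dark background the ray direction would have to be flipped.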
SWT (Stroke Width Transformation)
Strategies for rejecting connected components:
- Variance of the stroke width.
- Aspect ratio.
- Components that are too large or too small.
- Components that are clearly not part of a word or text line.
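Grouping and the first three rejection rules can be sketched as follows. The input is a stroke width map with `float("inf")` where no stroke was found. The function `letter_candidates` and all threshold values are illustrative assumptions, not values given in the talk. The word or text line check is left out because it needs the relation between several components.

```python
from collections import deque
from statistics import mean, pvariance

INF = float("inf")

def letter_candidates(swt, max_ratio=3.0, max_var_ratio=0.5,
                      min_aspect=0.1, max_aspect=10.0,
                      min_height=3, max_height=300):
    """Return bounding boxes (x, y, width, height) of plausible letters."""
    h, w = len(swt), len(swt[0])
    seen = [[False] * w for _ in range(h)]
    kept = []
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx] or swt[sy][sx] == INF:
                continue
            # Flood fill: join 4-neighbors whose stroke widths are similar.
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    if seen[ny][nx] or swt[ny][nx] == INF:
                        continue
                    a, b = swt[y][x], swt[ny][nx]
                    if max(a, b) / min(a, b) <= max_ratio:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            widths = [swt[y][x] for y, x in pixels]
            ys = [y for y, _ in pixels]
            xs = [x for _, x in pixels]
            box_h = max(ys) - min(ys) + 1
            box_w = max(xs) - min(xs) + 1
            # Rule 1: the stroke width must not vary too much.
            if pvariance(widths) > max_var_ratio * mean(widths):
                continue
            # Rule 2: reject extremely elongated components.
            if not (min_aspect <= box_w / box_h <= max_aspect):
                continue
            # Rule 3: reject components that are too small or too large.
            if not (min_height <= box_h <= max_height):
                continue
            kept.append((min(xs), min(ys), box_w, box_h))
    return kept
```

The thresholds depend heavily on image resolution and font size, so in practice they would be tuned on sample images.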
SWT (Stroke Width Transformation)
Advantages:
- Accurately detects text in different sizes, styles and colors.
- Detects text independent of perspective and rotation.
- The first step of SWT is a good all-round thresholding method for images with text.
Disadvantages:
- Relatively slow (edge and gradient maps have to be computed).
- Needs to know whether the text or the background is darker (in the grayscale image).
MSER (Maximally Stable Extremal Regions)
Blob detection method suitable for detecting character features. This method detects regions which are considered stable over a large range of threshold values.
MSER
Threshold values: 10, 45, 75, 105, 135, 165
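The stability idea can be illustrated with a small sketch that follows a single dark region through the threshold values listed above. It is a deliberate simplification: a real MSER implementation tracks all regions at once in a component tree, while this version follows only the region around one seed pixel. The helpers `region_area` and `most_stable_threshold` and the `delta` parameter are names made up for this example.

```python
from collections import deque

def region_area(gray, seed, threshold):
    """Area of the 4-connected region of pixels <= threshold containing seed."""
    h, w = len(gray), len(gray[0])
    sy, sx = seed
    if gray[sy][sx] > threshold:
        return 0
    seen = {seed}
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and (ny, nx) not in seen
                    and gray[ny][nx] <= threshold):
                seen.add((ny, nx))
                queue.append((ny, nx))
    return len(seen)

def most_stable_threshold(gray, seed, thresholds, delta=1):
    """Return (growth, threshold, area) where the region changes least.

    growth is the relative area change between the thresholds delta steps
    below and above the current one; a small value means a stable region.
    """
    areas = [region_area(gray, seed, t) for t in thresholds]
    best = None
    for i in range(delta, len(thresholds) - delta):
        if areas[i] == 0:
            continue
        growth = (areas[i + delta] - areas[i - delta]) / areas[i]
        if best is None or growth < best[0]:
            best = (growth, thresholds[i], areas[i])
    return best
```

A character on a clean background keeps almost the same area over many thresholds, so its growth stays close to zero. A blurred character grows steadily as the threshold rises, which is why blur hurts this method.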
MSER (Maximally Stable Extremal Regions)
Advantages:
- Accurately detects text in different sizes, styles and colors.
- Detects text independent of perspective and rotation.
- Good performance.
Disadvantages:
- Sensitive to blur.
- Does not output a binary image (thresholding is still needed for OCR).
ER Variation for text detection
A sequential classifier trained for character detection selects the regions, instead of the maximal-stability criterion.
Advantages:
- Only character regions are found, so there is no need to analyze and reject components.
Disadvantages:
- Needs training for different fonts or character types.
- Slower performance.
The End
Sources
SWT: http://research.microsoft.com/pubs/149305/1509.pdf
MSER: http://www.icg.tugraz.at/pub/pubobjects/docvpr2006