OCR and OCV
Tom Brennan
Artemis Vision
781 Vallejo St
Denver, CO 80204
(303)832-1111
About Us
• Machine Vision Integrator
– Turnkey Systems
• OEM Vision Software
– Work with camera partners and their clients
www.artemisvision.com
Tom [email protected]/pub/tom-brennan/1b/2b7/984/
OCR and OCV
• Considerations for Deployment
• OCR vs OCV
• Technical Challenges
– Pre-Processing
– Segmentation
– Recognition
Written Language and Machine Vision
• Written Human Language
– Highly varied:
• Character based and letter based
• Fonts and Scripts
• Scale, Spacing, Directionality
• Machine Vision
– Doesn’t like variability:
• Difficult to test without stepping through examples
• Greater variability = greater costs
Barcodes vs Human Language
• Barcodes
– Highly regular
– Designed for Vision Readability
– Uniform global specifications
• Written Human Language
– Evolved over time
– Highly variable
– Many Languages, many fonts, many standards
OCR Applications
• Space or process constraints preclude barcode
• Human Readability Requirements
• Aesthetic concerns
• Too many legacy parts / labels in circulation
• Information cannot be readily barcoded (e.g. a labelled drawing or chart)
To OCR or Not to OCR?
• The barcode exists because OCR is difficult.
• Reading distorted text is typically used as a modern “Turing Test” (e.g. CAPTCHAs)
Hardware Setup
• Geometric Constraints
– Fixture text consistently in front of the camera
– Minimum 20x40 pixels per character
– Diffuse lighting – avoid hotspots – light scene evenly
– Correct for lens distortion, or prefer a longer focal length
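The 20x40 pixel minimum can be turned into a camera resolution budget. The sketch below is only an illustration: the function name, the field-of-view and character dimensions, and the assumption of square pixels are not from the slides.

```python
import math

MIN_CHAR_W_PX = 20  # minimum character width in pixels (from the slide)
MIN_CHAR_H_PX = 40  # minimum character height in pixels (from the slide)

def required_sensor_pixels(fov_w_mm, fov_h_mm, char_w_mm, char_h_mm):
    """Return the (width, height) in pixels the image needs so that each
    character covers at least MIN_CHAR_W_PX x MIN_CHAR_H_PX pixels."""
    px_per_mm_w = MIN_CHAR_W_PX / char_w_mm
    px_per_mm_h = MIN_CHAR_H_PX / char_h_mm
    # Square pixels are assumed, so one scale serves both axes:
    # take the stricter of the two requirements.
    px_per_mm = max(px_per_mm_w, px_per_mm_h)
    return math.ceil(fov_w_mm * px_per_mm), math.ceil(fov_h_mm * px_per_mm)

# Hypothetical example: 2 mm x 4 mm characters in a 100 mm x 60 mm field of view.
width_px, height_px = required_sensor_pixels(100, 60, 2.0, 4.0)
print(width_px, height_px)
```

If the required resolution exceeds what the camera offers, the usual options are a smaller field of view (tighter fixturing) or larger printed characters.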
OCR Fonts
• OCR fonts minimize segmentation and recognition challenges
– OCR-A
• Characters evenly spaced
• Characters slightly modified to all look unique
• Used on Bank Checks
• OCR fonts are engineered for easy OCR
OCR vs OCV
• OCR – Optical Character Recognition
– Attempts to read text
• OCV – Optical Character Verification
– Verifies text conforms to a standard
– Helps diagnose printer problems
• Missing Lines
• Low contrast
OCV
• Typically verifies known text
• Difficult to combine OCV and OCR.
– “Smudged” 6 or “Good” 8
– OCV for lot code verification, expiration date verification, etc.
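Because OCV checks known text, each character can be compared against the template of the character expected at that position. The following is a minimal sketch under stated assumptions: the function name, the thresholds, and the tiny 3x3 images are hypothetical, and a production system would use a vendor or library tool rather than this hand-rolled check.

```python
def verify_character(gray, template, ink_threshold=128,
                     min_contrast=80, min_match=0.9):
    """Check one printed character against the template expected here.

    gray:     2D list of grayscale values (0 = black, 255 = white)
    template: 2D list of 0/1 with the same shape (1 = ink expected)
    Returns (passed, contrast, match_fraction).
    """
    ink, paper, agree, total = [], [], 0, 0
    for g_row, t_row in zip(gray, template):
        for g, t in zip(g_row, t_row):
            (ink if t else paper).append(g)
            # A pixel agrees when "looks like ink" equals "ink expected".
            agree += int((g < ink_threshold) == bool(t))
            total += 1
    # Contrast: how much darker the expected ink is than the background.
    contrast = sum(paper) / len(paper) - sum(ink) / len(ink)
    match = agree / total
    passed = (contrast >= min_contrast) and (match >= min_match)
    return passed, contrast, match

TEMPLATE = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]          # a vertical bar
good = [[250, 20, 250], [250, 20, 250], [250, 20, 250]]
missing_line = [[250, 20, 250], [250, 250, 250], [250, 20, 250]]
low_contrast = [[250, 200, 250], [250, 200, 250], [250, 200, 250]]

for name, img in [("good", good), ("missing line", missing_line),
                  ("low contrast", low_contrast)]:
    print(name, verify_character(img, TEMPLATE))
```

Reporting contrast and match fraction separately is what lets the check point at a printer problem (faded ink versus dropped lines) instead of only returning pass or fail.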
OCR Steps
• Pre-process
– Reduce background noise
– Improve characters
• Segment
– Locate and divide into characters
• Recognize
– Identify Specific Characters
Pre-Processing
• Reduce Noise
– Erosion and Dilation
– Adaptive Thresholding
– Blur and sharpen
• Improve Character Consistency
– Compute Skeletons
– Compute Stroke Width
– Prune
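To illustrate adaptive thresholding, the sketch below compares each pixel with the mean of its neighbourhood instead of one global value, which helps when lighting is uneven. The window radius, the offset, and the sample row are assumptions chosen for the example; libraries such as OpenCV provide tuned implementations.

```python
def adaptive_threshold(gray, radius=1, offset=10):
    """Mark a pixel as ink (1) when it is darker than the mean of its
    (2*radius+1) x (2*radius+1) neighbourhood by more than `offset`.
    Windows are clipped at the image border."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(window) / len(window)
            if gray[y][x] < local_mean - offset:
                out[y][x] = 1
    return out

def global_threshold(gray, level=128):
    """Mark a pixel as ink (1) when it is darker than one fixed level."""
    return [[1 if g < level else 0 for g in row] for row in gray]

# Background brightens from left to right; strokes sit in columns 1 and 5.
row = [100, 40, 140, 160, 180, 140, 220]
image = [row[:] for _ in range(3)]

print(global_threshold(image)[0])    # dark background read as ink, stroke 5 missed
print(adaptive_threshold(image)[0])  # both strokes found
```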
Noise Reduction Techniques
• Dilation
– Expansion of light colored areas
• Erosion
– Shrinking of light colored areas
[Figure: the same image shown original, dilated, and eroded]
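Both operations can be sketched with a 3x3 neighbourhood: dilation takes the maximum (light areas grow), erosion takes the minimum (light areas shrink). The 3x3 window, the border clipping, and the sample image are assumptions for this example, with 1 standing for a light pixel.

```python
def _morph(img, pick):
    """Apply `pick` (max or min) over each pixel's 3x3 neighbourhood.
    Neighbourhoods are clipped at the image border."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = pick(window)
    return out

def dilate(img):
    return _morph(img, max)   # light areas expand

def erode(img):
    return _morph(img, min)   # light areas shrink

# A 3x3 light block plus a single-pixel speck of noise at row 2, column 5.
noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# Erosion followed by dilation ("opening") removes the speck
# and restores the block to its original size.
for row in dilate(erode(noisy)):
    print(row)
```

Running the two in the opposite order (dilate, then erode) fills small dark holes instead of removing small light specks.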
Character Consistency
• Skeleton
– All points equidistant from at least 2 edges
– Think “start a fire on the boundary, where fires meet, draw a point”
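The fire analogy can be sketched directly: a breadth-first "burn" from the background gives each stroke pixel the time at which the fire reaches it, and pixels where no neighbour burns later approximate the points where fronts meet. This is a rough approximation using 4-connected steps, not a full thinning algorithm; the function names and the sample shape are assumptions.

```python
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def burn_times(img):
    """Steps a fire started on the background (0) needs to reach each
    stroke pixel (1). The image is padded so edges count as background."""
    h, w = len(img), len(img[0])
    padded = ([[0] * (w + 2)]
              + [[0] + list(row) + [0] for row in img]
              + [[0] * (w + 2)])
    dist = [[0 if v == 0 else None for v in row] for row in padded]
    queue = deque((y, x) for y in range(h + 2) for x in range(w + 2)
                  if padded[y][x] == 0)
    while queue:
        y, x = queue.popleft()
        for dy, dx in STEPS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h + 2 and 0 <= nx < w + 2 and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return [row[1:-1] for row in dist[1:-1]]

def ridge_points(times):
    """Stroke pixels that burn no earlier than any 4-connected neighbour."""
    h, w = len(times), len(times[0])
    points = []
    for y in range(h):
        for x in range(w):
            if times[y][x] == 0:
                continue
            neighbours = [times[y + dy][x + dx] for dy, dx in STEPS
                          if 0 <= y + dy < h and 0 <= x + dx < w]
            if all(times[y][x] >= n for n in neighbours):
                points.append((y, x))
    return points

bar = [[1] * 7 for _ in range(3)]      # a thick horizontal stroke
times = burn_times(bar)
print(times[1])
print(ridge_points(times))
```

On this bar the ridge is the centre line plus the four corners. The corner points are the kind of short spur that a pruning step is meant to remove.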
Locating “Text”
• Easy for people.
• Can be a challenge for software.
– Logos
– Symbols
– Lines
• OCR applications will work best when text is consistently located.
Segmentation
• Splitting Text into Discrete Characters
• Critical to accurate OCR
• Issues
– Not all characters are the same width
– Not all characters can be split with vertical lines due to skew
– Sometimes characters touch
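The simplest splitter cuts at columns that contain no ink. The sketch below (function name and sample text line are assumptions) handles characters of different widths but, as the issues above predict, merges characters that touch; it would also fail on skewed text.

```python
def split_columns(binary):
    """Return (start, end) column ranges, end exclusive, for each run of
    columns that contains at least one ink pixel (1)."""
    width = len(binary[0])
    profile = [sum(row[x] for row in binary) for x in range(width)]
    segments, start = [], None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x                      # a character begins
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column ends it
            start = None
    if start is not None:
        segments.append((start, width))
    return segments

# Four characters: a narrow "I", a wider "L", then an "L" touching an "I".
rows = [
    "#.#...#..#..",
    "#.#...#..#..",
    "#.###.####..",
]
binary = [[1 if c == "#" else 0 for c in row] for row in rows]
print(split_columns(binary))
```

Only three segments come back for four characters, because the last two share ink along the baseline and no blank column separates them.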
Segmentation
• Adaptive Thresholding
• Detect Corners
• Estimate Stroke Width
• Edge detection
• Path detection
Recognition
• Can be easier than Locating and Segmenting
• However
– Similar Characters:
• l, 1, I, i, 7, /, \, (, )
• B, D, 8, 6, 9, S, Z, R, P
– Handwriting vs Type
– Scale and Orientation (Document Scan vs. Package on Conveyor)
Recognition Strategies
• Pattern Matching Techniques
– Match the actual image pattern
– Can be problematic on large character sets
• Artificial Intelligence Techniques
– Extract Features from the image
– Learn rules for features
– Neural nets, SVMs, kNN, AdaBoost, etc.
– Tesseract uses a feature distance method
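As a toy illustration of the feature-based approach, the sketch below extracts ink density per zone and classifies by nearest neighbour (kNN with k = 1). It is not how Tesseract works; the 4x4 glyphs, the 2x2 zoning, and the function names are assumptions made for the example, and glyph dimensions must be divisible by the zone count.

```python
import math

def zone_features(glyph, zones=2):
    """Ink density in each cell of a zones x zones grid laid over the glyph.
    `glyph` is a list of equal-length strings, '#' meaning ink."""
    rows = [[1 if c == "#" else 0 for c in line] for line in glyph]
    zh, zw = len(rows) // zones, len(rows[0]) // zones
    feats = []
    for zy in range(zones):
        for zx in range(zones):
            cells = [rows[y][x]
                     for y in range(zy * zh, (zy + 1) * zh)
                     for x in range(zx * zw, (zx + 1) * zw)]
            feats.append(sum(cells) / len(cells))
    return feats

def classify(glyph, training):
    """Return the label whose training features are closest (Euclidean)."""
    feats = zone_features(glyph)
    return min(training, key=lambda label: math.dist(feats, training[label]))

TRAINING = {
    "I": zone_features([".##.", ".##.", ".##.", ".##."]),
    "L": zone_features(["#...", "#...", "#...", "####"]),
    "T": zone_features(["####", ".##.", ".##.", ".##."]),
}

# An "L" with one pixel missing from its foot still lands nearest to "L".
damaged_l = ["#...", "#...", "#...", "###."]
print(classify(damaged_l, TRAINING))
```

Unlike pixel-for-pixel pattern matching, the features tolerate small defects, but look-alike characters still end up close together in feature space.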
Conclusions
• General Purpose OCR is challenging
• Consider shortcuts to make OCR easier
– Context?
– Character number known?
– Character size known?
– Font known? Can we train on that font?
– Eliminate hotspots, distortion
– Locate text consistently, control scale, orientation
– Preprocess to improve image / characters
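Context can be applied after recognition. When the number of characters and the kind of character at each position are known, look-alike reads can be corrected or rejected. In this sketch the format notation ("A" for a letter, "9" for a digit), the lot-code format, and the look-alike tables are all hypothetical.

```python
# Assumed look-alike pairs; extend these for the font actually in use.
TO_DIGIT = {"O": "0", "I": "1", "l": "1", "B": "8", "S": "5", "Z": "2"}
TO_LETTER = {"0": "O", "1": "I", "8": "B", "5": "S", "2": "Z"}

def apply_format(raw, pattern):
    """Force an OCR result into a known format.

    pattern: 'A' = letter, '9' = digit, anything else = that literal.
    Returns the corrected string, or None if the read cannot fit.
    """
    if len(raw) != len(pattern):
        return None                      # character count is known
    out = []
    for ch, kind in zip(raw, pattern):
        if kind == "9":
            ch = TO_DIGIT.get(ch, ch)
            if not ch.isdigit():
                return None
        elif kind == "A":
            ch = TO_LETTER.get(ch, ch)
            if not ch.isalpha():
                return None
        elif ch != kind:
            return None
        out.append(ch)
    return "".join(out)

# Hypothetical lot code format: two letters, a dash, four digits.
print(apply_format("L0-2Ol8", "AA-9999"))   # look-alikes corrected
print(apply_format("LO-20X8", "AA-9999"))   # 'X' cannot be a digit
print(apply_format("LO-2018X", "AA-9999"))  # wrong length
```

Rejecting a read outright is often preferable to guessing, since a rejected part can be re-imaged or routed to a person.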
Questions?
Tom Brennan
Artemis Vision
781 Vallejo St
Denver, CO 80204
(303)832-1111
www.artemisvision.com