![Page 1: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/1.jpg)
Spatial Business Detection and Recognition from Images
Alexander Darino
![Page 2: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/2.jpg)
Outline
• Project Overview
![Page 3: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/3.jpg)
PROJECT OVERVIEW
• Previous Work
• Project Objective
• Anticipated End Result
• Project Pipeline
![Page 4: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/4.jpg)
Previous Work: Where Am I?
Image → Where Am I? → Latitude, Longitude
![Page 5: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/5.jpg)
Project Objective
• Given:
  – Image
  – Geolocation
• Yield:
  – Spatial Identification of Businesses in Image
  – Addresses of Businesses in Image
  – Information about Businesses in Image
    • Ex. Reviews, Categories, Phone Number, etc.
![Page 6: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/6.jpg)
![Page 7: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/7.jpg)
Project Pipeline
Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
Image → Text Extraction → Detected Text
Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection
![Page 8: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/8.jpg)
Anticipated End Result
![Page 9: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/9.jpg)
Obtaining a List of Candidate Businesses in Image via
BUSINESS SEARCHING
![Page 10: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/10.jpg)
Business Searching
![Page 11: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/11.jpg)
Business Searching
• Business Search Services
  – Google
  – Yelp
  – CityGrid (Supplier for Yellow Pages, Super Pages)
• REST-based API
• Results in JSON or XML format
• Aggregate Results into Facade
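The aggregation step could look roughly like the sketch below. It is only an illustration: the `Business` record, the `BusinessSearchFacade` class, the provider callables and the name-based de-duplication rule are all assumptions, not the project's actual code, and the two `fake_*` functions stand in for the real REST calls so the example runs offline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Business:
    name: str
    source: str  # which directory service returned the record


class BusinessSearchFacade:
    """Single entry point that queries every provider and merges the results."""

    def __init__(self, providers):
        # providers: dict mapping a service name to a callable(lat, lon) -> list of dicts
        self.providers = providers

    def search(self, lat, lon):
        merged = {}
        for source, fetch in self.providers.items():
            for record in fetch(lat, lon):
                # Assumption: records whose names match after lower-casing and
                # whitespace normalization are the same business; first provider wins.
                key = " ".join(record["name"].lower().split())
                merged.setdefault(key, Business(record["name"], source))
        return list(merged.values())


# Hypothetical stand-ins for the real REST calls.
def fake_yelp(lat, lon):
    return [{"name": "Nicholas Coffee Co"}, {"name": "Starbucks Coffee"}]


def fake_citygrid(lat, lon):
    return [{"name": "Nicholas Coffee Co"}, {"name": "Jenny Lee Bakery"}]


facade = BusinessSearchFacade({"yelp": fake_yelp, "citygrid": fake_citygrid})
candidates = facade.search(40.441127, -80.002822)
```

Because each service caps the size of its result set, querying several and merging them gives a larger candidate list than any single service would.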
![Page 12: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/12.jpg)
{'businesses': [{'address1': '466 Haight St',
                 'address2': '',
                 'address3': '',
                 'avg_rating': 4.0,
                 'categories': [{'category_filter': 'danceclubs',
                                 'name': 'Dance Clubs',
                                 'search_url': 'http://yelp.com/search?find_loc=466+Haight+St%2C+San+Francisco%2C+CA&cflt=danceclubs'},
                                {'category_filter': 'lounges',
                                 'name': 'Lounges',
                                 'search_url': 'http://yelp.com/search?find_loc=466+Haight+St%2C+San+Francisco%2C+CA&cflt=lounges'},
                                {'category_filter': 'tradamerican',
                                 'name': 'American (Traditional)',
                                 'search_url': 'http://yelp.com/search?find_loc=466+Haight+St%2C+San+Francisco%2C+CA&cflt=tradamerican'}],
                 'city': 'San Francisco',
                 'distance': 1.8780401945114136,
                 'id': 'yyqwqfgn1ZmbQYNbl7s5sQ',
                 'is_closed': False,
                 'latitude': 37.772201000000003,
                 'longitude': -122.42992599999999,
                 'mobile_url': 'http://mobile.yelp.com/biz/yyqwqfgn1ZmbQYNbl7s5sQ',
                 'name': 'Nickies',
                 'nearby_url': 'http://yelp.com/search?find_loc=466+Haight+St%2C+San+Francisco%2C+CA',
                 'neighborhoods': [{'name': 'Hayes Valley',
                                    'url': 'http://yelp.com/search?find_loc=Hayes+Valley%
![Page 13: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/13.jpg)
Business Searching: Results
40.441127247181797 -80.002821624487595
• Denham & Company Salon
• Ullrich's Shoe Repairing
• Nicholas Coffee Co
• Bella Sera On the Square
• A & J Ribs
• Starbucks Coffee
• Jenny Lee Bakery
• Galardi's 30 Minute Cleaners
• Jimmy John's Gourmet Sandwiches
• Charley's Grilled Subs
• Fresh Corner
• Lagondola Pizzeria & Restaurant
• Camera Repair Service Inc
• Pittsburgh Cigar Bar
• Original Oyster House
• MixStirs
• 1902 Tavern
• Costanzo's
• Pittsburgh Silver Llc
• Graeme St
• Galardi's 30 Minute Cleaners
• Denham & Co Salon
• Bruegger's Bagel Bakery
• Nicholas Coffee Co
• Market Square
• Fat Tommy's Pizzeria
• Mixstirs Cafe
• Giggles
• Rycon Construction Inc
• Garbera, Dennis C, Dds - Emmert Dental Assoc
• Bella Sera on the Square
• Mancini's Bread Co
• Las Velas
• Ciao Baby
• Washington Reprographics Inc
• Highmark Life Insurance Co
• Fischer, Donald R, Md - Highmark Life Insurance Co
• Jimmy John's
• Lynx Energy Partners Inc
• Emmert Dental Assoc
![Page 14: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/14.jpg)
Business Searching: Evaluation
• Strengths
  – Aggregated results almost always found the business of interest
• Weaknesses
  – Each API limits query result set size (this is why we aggregate)
  – Only businesses are listed
  – Not all businesses are listed
• Limitations
  – Dependent on well-populated, accurate business directories
  – Tested on only 15 Pittsburgh images; result quality for rural areas is unknown
![Page 15: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/15.jpg)
Obtaining names of Businesses in Image by
EXTRACTING IDENTIFYING TEXT
![Page 16: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/16.jpg)
Extracting Identifying Text
![Page 17: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/17.jpg)
Extracting Identifying Text: OCR
• Used Two OCR APIs:
  – GNU OCR (Ocrad)
  – GOCR
• OCR APIs highly sensitive to:
  – Font (only works well with roman font)
  – Perspective
  – Scale
  – Binarization Threshold
  – Dark on Light vs. Light on Dark (inversion)
![Page 18: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/18.jpg)
Extracting Identifying Text: OCR
• OCR API evaluations
  – Ocrad: could not yield any meaningful data across over 200 scale/threshold/inversion combinations
  – GOCR: produced good results across 10 scales, with and without inversion, using a threshold automatically determined by Otsu's method
• 98% of Results are garbage!
• Examples of GOCR output (next slides)
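For reference, Otsu's method picks the binarization threshold that maximizes the between-class variance of the two pixel groups it creates. The sketch below is a plain-Python illustration on a flat list of 8-bit grey values; the function names and the `invert` flag are assumptions made for this example, not the code used in the project.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Pixels with value <= t form one class, the rest the other.
    """
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_t, best_variance = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(256):
        weight0 += histogram[t]
        sum0 += t * histogram[t]
        weight1 = total - weight0
        if weight0 == 0 or weight1 == 0:
            continue  # one class is empty, no split to evaluate
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        variance = weight0 * weight1 * (mean0 - mean1) ** 2
        if variance > best_variance:
            best_t, best_variance = t, variance
    return best_t


def binarize(pixels, threshold, invert=False):
    """Map pixels to 0/1; invert=True flips the result for light-on-dark text."""
    return [int((value > threshold) != invert) for value in pixels]
```

Running the OCR engine on both `binarize(..., invert=False)` and `binarize(..., invert=True)` at each scale corresponds to the "with and without inversion" sweep described above.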
![Page 19: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/19.jpg)
Extracting Identifying Text: OCR
![Page 20: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/20.jpg)
Extracting Identifying Text: OCR
n.c.......o.a...u..............oU..D.oa..e......_RuEGGE..KERy..J...w...........L........M.II.....c..
...i
.......l.
.J
.t...llt...lSHA.P.It..tllt.........._.l...Jy._.c_...._tt.._....t.._.r.........t.t_t.._.._.l..J.r.r.I.
![Page 21: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/21.jpg)
Extracting Identifying Text: OCR
![Page 22: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/22.jpg)
Extracting Identifying Text: OCR
u..........._nq......eoR.E.l.e...í....e...n....n....n.e.R.E...e....o._....E.R.E.IKE........I.ltlO.........rE..o......E.....I.K.E.o.....
J.n....c...E.R.E.I.E.......M..E.R.E...E...aJ...Gu.ge..geE.F.._.....E..gE.D...fUlI..lll.lll.IIi.l..Xl..
![Page 23: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/23.jpg)
Extracting Identifying Text: OCR
![Page 24: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/24.jpg)
Extracting Identifying Text: OCR
..e_..w.._......D.........uJ.....J.................n......n..........n_..r.l_d..J.ec.m._..n.......J.n.._...tn..ct..._.................D.u.v...e.n....u..
Y.._w.n.n....Jn.......G..o..r..._........J...ml.t..l.tt.l.._w....................._....l....t........j..ilI.i..
![Page 25: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/25.jpg)
Extracting Identifying Text: OCR
![Page 26: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/26.jpg)
Extracting Identifying Text: OCR
__.ncu_.l..._..._J...ne......._n._..v.....ra......d_..._.............i..n..UllREsT.unAN...r.c.....r...Tt.rJll......m...c.....n.......
..
.Jn.I..c...r.rESTAU.ANT.r.O....c.cc.
Note: Even though "Tambellini" is a roman font, it is too stretched to be picked up by GOCR
![Page 27: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/27.jpg)
OCR Evaluation
• Strengths
  – Applicable to expected input of orthogonal images
• Weaknesses
  – Only works well(-ish) for strictly roman font
• Limitations
  – Will perform poorly for artistic fonts and business signs
• Conclusion
  – By itself, OCR is not the best approach towards Business identification
    • Reasons: poor recognition, franchises, perspective, etc.
![Page 28: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/28.jpg)
Matching Identifying Text to Candidate Business Names via
BUSINESS NAME MATCHING
![Page 29: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/29.jpg)
Business Name Matching
![Page 30: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/30.jpg)
Business Name Matching
• Given: Unreliable fragments of ‘detected text’
• Yield: Matching Business Names
• Process:
  – Filter input: trim fragments and discard useless ones (< 2 letters)
  – Fuzzy String Matching
  – Voting Scheme: confidence of business appearing in image
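The filtering and fuzzy-matching steps might be sketched as follows. This is an illustration only: it uses `difflib.SequenceMatcher` from the Python standard library as the similarity measure, which is an assumption (the slides do not say which fuzzy matcher was used), and the function names are hypothetical.

```python
import difflib


def filter_fragments(fragments, min_letters=2):
    """Trim each OCR fragment and drop those with fewer than min_letters letters."""
    kept = []
    for fragment in fragments:
        text = fragment.strip()
        if sum(ch.isalpha() for ch in text) >= min_letters:
            kept.append(text)
    return kept


def similarity(fragment, word):
    """Case-insensitive similarity in [0, 1] between an OCR fragment and a word."""
    return difflib.SequenceMatcher(None, fragment.lower(), word.lower()).ratio()


def best_match(fragment, business_names):
    """Return (score, business name, matched name token) for the closest token."""
    best = (0.0, None, None)
    for name in business_names:
        for token in name.split():
            score = similarity(fragment, token)
            if score > best[0]:
                best = (score, name, token)
    return best
```

Matching against individual name tokens rather than whole names lets a fragment such as a partial "RESTAURANT" still score well against a long business name.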
![Page 31: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/31.jpg)
Business Name Detection
![Page 32: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/32.jpg)
Business Name Matching
• Developed Confidence Attribution Algorithm
  – Confidence of OCR Token being Name Token
    • Example: Confidence of “ESTUANT” representing “RESTAURANT”
    • Point-based system
  – Confidence of Name appearing in Image
    • Sum of points of matching OCR Text
    • Use logarithmically-normalized points to determine business inclusion threshold
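One way such a point-based vote could work is sketched below. The slides do not give the actual formula, so everything here is an assumption: token confidences are summed per business, the sums are normalized with `log1p` against the top-scoring business, and the `threshold` value is arbitrary.

```python
import math


def score_businesses(matches):
    """Sum token confidences per business.

    matches: iterable of (business name, token confidence in 0..1).
    """
    points = {}
    for name, confidence in matches:
        points[name] = points.get(name, 0.0) + confidence
    return points


def select_businesses(points, threshold=0.5):
    """Keep businesses whose log-normalized points reach the threshold."""
    top = max(points.values(), default=0.0)
    if top <= 0:
        return []
    selected = []
    for name, value in points.items():
        # Assumed normalization: log(1 + points) relative to the best business.
        normalized = math.log1p(value) / math.log1p(top)
        if normalized >= threshold:
            selected.append((name, normalized))
    return sorted(selected, key=lambda item: item[1], reverse=True)
```

The logarithm compresses large point totals, so a business matched by many fragments does not push every other candidate below the inclusion threshold.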
![Page 33: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/33.jpg)
Business Name Matching
![Page 34: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/34.jpg)
![Page 35: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/35.jpg)
Business Name Matching
![Page 36: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/36.jpg)
![Page 37: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/37.jpg)
Business Name Matching
![Page 38: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/38.jpg)
Business Name Matching
![Page 39: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/39.jpg)
Business Name Matching
Note: This originally did not appear because it did not exceed the confidence threshold. It now appears because it contributes to the Business Name Identification
![Page 40: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/40.jpg)
Isolating Identified Businesses in Image via
SPATIAL BUSINESS IDENTIFICATION
![Page 41: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/41.jpg)
Business Spatial Identification
![Page 42: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/42.jpg)
Business Spatial Identification
![Page 43: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/43.jpg)
Business Spatial Identification
Aiken George S Co
Category: Food, Grocery
Address: 218 Forbes Ave, Pittsburgh, PA 15222
Phone: (412) 391-6358
Rating: 4.5/5 (2 Reviews)
![Page 44: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/44.jpg)
Business Spatial Identification
![Page 45: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/45.jpg)
Business Spatial Identification
![Page 46: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/46.jpg)
Business Spatial Identification
Bruegger's Bagels
Category: Bagels
Address: Market Sq, Pittsburgh, PA 15222
Phone: (412) 281-2515
Rating: Not Rated
![Page 47: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/47.jpg)
V0.1: EVALUATION
![Page 48: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/48.jpg)
Current Approach
Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
Image → OCR → Detected Text
Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection
![Page 49: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/49.jpg)
Weaknesses to Current Approach
![Page 50: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/50.jpg)
Weaknesses to Current Approach
Lots of Garbage
![Page 51: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/51.jpg)
Weaknesses to Current Approach
Fragmented Word Detection
![Page 52: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/52.jpg)
Weaknesses to Current Approach
Fails with non-orthogonal perspective
Did I already mention lots of garbage?
![Page 53: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/53.jpg)
Weaknesses to Current Approach
Fails with non-roman text
Not scale-invariant
![Page 54: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/54.jpg)
ALTERNATIVES TO OCR
![Page 55: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/55.jpg)
Alternative #1: Image Matching
Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
Image + Nearby Businesses → Match to Storefront Image → Business Identification → Business Spatial Detection
![Page 56: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/56.jpg)
Alternative #1: Image Matching
![Page 57: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/57.jpg)
Alternative #1: Evaluation
• Weaknesses:
  – Low Availability of Storefront Images (< 50% Avg)
    • George Aiken area businesses with photos: 18/35
    • Bruegger's area businesses with photos: 22/40
    • Tambellini area businesses with photos: 8/22
  – Available Images too small (100 x 100)
  – Computationally Expensive
• Conclusion: Not a viable solution
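The "< 50% Avg" figure can be checked directly from the three counts above. The short calculation below uses only those counts; the variable names are illustrative.

```python
# (businesses with photos, businesses in area), taken from the slide
coverage = {
    "George Aiken": (18, 35),
    "Bruegger's": (22, 40),
    "Tambellini": (8, 22),
}

# Availability per test area.
per_area = {area: with_photos / total for area, (with_photos, total) in coverage.items()}

# Two ways to average: mean of the per-area ratios, and pooled over all businesses.
mean_of_areas = sum(per_area.values()) / len(per_area)
pooled = sum(w for w, _ in coverage.values()) / sum(t for _, t in coverage.values())
```

Both averages come out just under one half (about 47.6% as a mean of the areas, about 49.5% pooled), so fewer than half of the candidate businesses have any storefront image to match against.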
![Page 58: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/58.jpg)
Alternative #2: Template Matching
[Figure: the name “Tambellini” rendered in eight different fonts]
![Page 59: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/59.jpg)
Alternative #2: Template Matching
Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
Nearby Businesses → Render Templates of Business Names in Different Fonts → Template Images
Image + Template Images → Image Matching (e.g. SIFT, HAAR) → Business Identification → Business Spatial Detection
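As a rough illustration of the matching step, the sketch below slides a template over an image and scores each position with normalized cross-correlation (NCC). NCC is used here only as a simple stand-in for the SIFT/HAAR matching named in the pipeline; it is not scale-invariant, so in practice each rendered template would also need to be tried at several sizes. Images are plain 2D lists of grey values and all function names are hypothetical.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2D lists, in [-1, 1]."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return numerator / denominator if denominator else 0.0


def best_location(image, template):
    """Return (score, (row, col)) of the best-matching top-left position."""
    th, tw = len(template), len(template[0])
    best = (-2.0, None)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best
```

Because the templates come from the list of nearby businesses, the search is bounded: only a known set of names has to be rendered and tested, and the location of the best match doubles as the spatial detection.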
![Page 60: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/60.jpg)
Alternative #2: Template Matching
OCR
• Not Scale Invariant
• Unbounded Search
• Fragmented Recognition
• Roman-only font
Alternative #2
• Scale Invariant
• Bounded Search
• Whole-word recognition
• All fonts
![Page 61: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/61.jpg)
![Page 62: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/62.jpg)
![Page 63: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/63.jpg)
Subsequent Attempts
![Page 64: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/64.jpg)
Alternative #3: Scene Text Recognition
• State of the Art:
  – STR ≠ OCR
  – Far superior to our ‘naïve’ approaches to STR (i.e. OCR, Image matching, SIFT)
    • OCR only works for highly controlled environments
    • STR works for unconditioned environments
  – Scale invariant
  – Color/intensity invariant
  – Lexicon-Assisted
![Page 65: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/65.jpg)
Alternative #3: Scene Text Recognition
• No STR implementations readily available
• Have contacted several groups specialized in STR; none were able to provide an implementation for research purposes
• Had to resort to implementing STR from scratch
![Page 66: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/66.jpg)
The long and perilous journey of implementing
SCENE TEXT RECOGNITION
![Page 67: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/67.jpg)
STR Implementation
• STR Implementation: “Automatic Detection and Recognition of Signs From Natural Scenes”
  – Multiresolution-based potential characters detection
  – Character/layout geometry and color properties analysis
  – Local affine rectification
  – Refined Detection
![Page 68: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/68.jpg)
Candidate Text Detection via
MULTIRESOLUTION-BASED POTENTIAL CHARACTERS DETECTION
![Page 69: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/69.jpg)
![Page 70: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/70.jpg)
Multiresolution-based potential characters detection
• Laplacian-of-Gaussian Edge Detection
• Dice image/edges into Patches
  – Combine patches with similar properties into regions
  – Obtain bounding box of region as candidate text
  – Properties include:
    • Mean
    • Variance
    • Intensity
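A minimal sketch of the two building blocks named above, a Laplacian-of-Gaussian (LoG) kernel and per-patch statistics, is given below. It is illustrative only: the kernel size, `sigma`, and the function names are assumptions, the kernel omits the constant scale factor of the analytic LoG, and images are plain 2D lists.

```python
import math


def log_kernel(size=5, sigma=1.0):
    """Sampled Laplacian-of-Gaussian kernel, shifted so its entries sum to zero."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            # LoG up to a constant factor: (r^2 - 2*sigma^2) / sigma^4 * exp(-r^2 / (2*sigma^2))
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4 * math.exp(-r2 / (2 * sigma ** 2)))
        kernel.append(row)
    mean = sum(map(sum, kernel)) / size ** 2
    return [[value - mean for value in row] for row in kernel]


def convolve(image, kernel):
    """'Valid' 2D filtering of image with a symmetric square kernel."""
    k = len(kernel)
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
            for c in range(len(image[0]) - k + 1)
        ]
        for r in range(len(image) - k + 1)
    ]


def patch_stats(patch):
    """Mean and (population) variance of the values in a patch."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance
```

Patches whose statistics are close to those of a neighbouring patch would then be merged into a region, and the region's bounding box becomes a text candidate.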
![Page 71: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/71.jpg)
Multiresolution-based potential characters detection
![Page 72: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/72.jpg)
Multiresolution-based potential characters detection
Patches qualify if:
![Page 73: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/73.jpg)
Multiresolution-based potential characters detection
![Page 74: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/74.jpg)
Multiresolution-based potential characters detection
![Page 75: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/75.jpg)
Multiresolution-based potential characters detection
![Page 76: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/76.jpg)
Multiresolution-based potential characters detection
![Page 77: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/77.jpg)
Problems with Current Approach
• Too much “bleeding”
• Unstable edge data, because the location of an edge patch relative to the edge itself is unpredictable
![Page 78: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/78.jpg)
New Approach
• Each edge pixel gets an N x N edge patch (e.g. 3x3)
• Edge patches overlap
  – Tighter bounding boxes
  – More region consistency
  – More robust to resolution changes
  – Able to use tighter thresholds
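The idea of centring an N x N patch on every edge pixel can be sketched as below. This is a simplified illustration: it merges patches purely by overlap (4-connectivity of the covered pixels), whereas the approach described here also compares patch properties before merging; the function names are hypothetical.

```python
def patch_mask(edge_pixels, height, width, n=3):
    """Mark every pixel covered by an n x n patch centred on some edge pixel."""
    half = n // 2
    mask = [[False] * width for _ in range(height)]
    for r, c in edge_pixels:
        for rr in range(max(0, r - half), min(height, r + half + 1)):
            for cc in range(max(0, c - half), min(width, c + half + 1)):
                mask[rr][cc] = True
    return mask


def bounding_boxes(mask):
    """Bounding box (top, left, bottom, right) of each 4-connected covered region."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            stack = [(r, c)]
            seen[r][c] = True
            while stack:  # iterative flood fill over the covered pixels
                y, x = stack.pop()
                top, left = min(top, y), min(left, x)
                bottom, right = max(bottom, y), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Since every patch is anchored on an edge pixel, a box can extend at most `n // 2` pixels beyond the edges it contains, which is what keeps the bounding boxes tight.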
![Page 79: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/79.jpg)
New Approach
![Page 80: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/80.jpg)
New Approach
![Page 81: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/81.jpg)
New Approach
![Page 82: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/82.jpg)
New Approach
![Page 83: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/83.jpg)
New Approach
![Page 84: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/84.jpg)
New Approach
![Page 85: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/85.jpg)
New Approach
![Page 86: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/86.jpg)
New Approach
![Page 87: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/87.jpg)
New Approach
![Page 88: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/88.jpg)
New Approach
![Page 89: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/89.jpg)
New Approach
![Page 90: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/90.jpg)
New Approach
![Page 91: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/91.jpg)
New Challenges!
![Page 92: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/92.jpg)
Text Detection Problem #1
How do I know that two regions are close enough together that they might be part of the same character?
• Center of bounding box?
• Moment of regions?
• Nearest Neighbor?
• Connectedness?
All have severe weaknesses
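To make one of these weaknesses concrete, the sketch below compares two candidate measures on bounding boxes given as `(top, left, bottom, right)`: the distance between box centres and the gap between the nearest box edges. Both functions are illustrative assumptions, not the measures used in the project.

```python
import math


def center_distance(a, b):
    """Euclidean distance between the centres of two boxes."""
    ay, ax = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    by, bx = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return math.hypot(ay - by, ax - bx)


def box_gap(a, b):
    """Shortest distance between the edges of two boxes; 0 if they touch or overlap."""
    dy = max(0, max(a[0], b[0]) - min(a[2], b[2]))
    dx = max(0, max(a[1], b[1]) - min(a[3], b[3]))
    return math.hypot(dy, dx)


narrow = (0, 0, 10, 2)   # a tall, thin region
wide = (0, 3, 10, 30)    # a wide region right next to it
```

For `narrow` and `wide` the edge gap is a single pixel while the centres are 15.5 pixels apart, so a centre-based test would call two adjacent regions far apart. The edge gap has its own failure mode: unrelated regions that merely sit next to each other look just as close.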
![Page 93: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/93.jpg)
Text Detection Problem #2
How do I know that two characters are close enough to be considered a part of the same word?
Easier version of the last problem, but still hard!
![Page 94: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/94.jpg)
CHARACTER/LAYOUT GEOMETRY AND COLOR PROPERTIES ANALYSIS
![Page 95: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/95.jpg)
![Page 96: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/96.jpg)
Color Properties Analysis
• Implemented Gaussian Mixture Model (GMM) to obtain μ and σ of foreground/background for: R/G/B/H/I
• Calculated Confidences that component (RGBHI) can be used to recognize characters
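A two-component, one-dimensional GMM fitted with expectation-maximization (EM) is enough to estimate a foreground and a background mean and standard deviation for a single channel. The sketch below is an assumption-laden illustration: the min/max initialization, the fixed iteration count, the `sigma_floor` of 0.2 and the function name are choices made for this example, not details confirmed by the slides.

```python
import math


def fit_two_gaussians(values, iterations=50, sigma_floor=0.2):
    """Fit a two-component 1D Gaussian mixture with EM.

    Returns [(mu, sigma, weight), (mu, sigma, weight)] ordered by mu.
    """
    lo, hi = min(values), max(values)
    mu = [float(lo), float(hi)]                      # deterministic initialization
    sigma = [max(sigma_floor, (hi - lo) / 4.0)] * 2
    weight = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        # The 1/sqrt(2*pi) factor is omitted because it cancels in the ratio.
        resp = []
        for v in values:
            p = [weight[k] / sigma[k] * math.exp(-0.5 * ((v - mu[k]) / sigma[k]) ** 2)
                 for k in range(2)]
            total = p[0] + p[1]
            if total == 0.0:
                # Both densities underflowed: assign the value to the nearer mean.
                nearer = 0 if abs(v - mu[0]) <= abs(v - mu[1]) else 1
                resp.append((1.0 - nearer, float(nearer)))
            else:
                resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate mean, standard deviation and weight.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue
            mu[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, values)) / nk
            sigma[k] = max(sigma_floor, math.sqrt(var))
            weight[k] = nk / len(values)
    return sorted(zip(mu, sigma, weight))
```

The floor on `sigma` keeps a saturated component (for example, pure white pixels that all equal 255) from collapsing to zero variance. The fit would be run once per channel (R, G, B, H, I) on the pixels of a candidate region.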
![Page 97: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/97.jpg)
Color Properties Analysis
• Assumed Invariant: High contrast between foreground/background of characters in sign
• Choose the channel (R/G/B/H/I) that is best suited for use with character recognition
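The slides do not state how the per-channel confidence is computed. The per-channel figures reported on the slides that follow are, however, numerically consistent (to within the rounding of the printed σ values) with |μ1 − μ2| / (1000 · σ1), so the sketch below uses that expression as a working assumption. The function name and the dictionary layout are illustrative; the numbers are the ones printed on the slides.

```python
def channel_confidence(mu1, sigma1, mu2):
    """Assumed confidence: separation of the two means relative to sigma1, scaled by 1/1000."""
    return abs(mu1 - mu2) / (1000.0 * sigma1)


# (mu1, sigma1, mu2) per channel, as reported on the slides
channels = {
    "Green": (172.337447154472, 4.8463, 255.0),
    "Blue": (122.673512195122, 6.1478, 255.0),
    "Hue": (106.601736628811, 5.9561, 0.0),
    "Intensity": (145.658856368567, 2.1271, 255.0),
}

scores = {name: channel_confidence(*params) for name, params in channels.items()}
best_channel = max(scores, key=scores.get)
```

With these values the Intensity channel scores highest: its two means are well separated while its foreground spread is the smallest of the four channels.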
![Page 98: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/98.jpg)
Original
![Page 99: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/99.jpg)
Green: μ1 = 172.337447154472, σ1 = 4.8463; μ2 = 255, σ2 = 0.2000; Confidence = 0.017056947503074
![Page 100: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/100.jpg)
Blue: μ1 = 122.673512195122, σ1 = 6.1478; μ2 = 255, σ2 = 0.2000; Confidence = 0.021524159560500
![Page 101: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/101.jpg)
Hue: μ1 = 106.601736628811, σ1 = 5.9561; μ2 = 0, σ2 = 0.2000; Confidence = 0.017897920959170
![Page 102: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/102.jpg)
Intensity: μ1 = 145.658856368567, σ1 = 2.1271; μ2 = 255, σ2 = 0.2000; Confidence = 0.051403296762968
![Page 103: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/103.jpg)
Mistake: This should only be done on individual characters, not words
![Page 104: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/104.jpg)
Color Analysis: Evaluation
• The channel with the highest confidence was observed to be the one best suited for OCR…
• …Did I just say OCR?
YES! (I did.)
![Page 105: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/105.jpg)
A second shot at
OPTICAL CHARACTER RECOGNITION
![Page 106: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/106.jpg)
OPTICAL CHARACTER RECOGNITION II
(and this time, it’s personal)
![Page 107: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/107.jpg)
Refined Detection
• Generate alphabet templates in different fonts
• Resize templates; Divide into grid
• Apply several 2D Gabor filters to each grid patch
  – Different orientations, frequencies, variances
  – For each pixel, yields real/imaginary component of transformation
• Feed data into Linear Discriminant Analysis
  – Reduces features and forms classifier at same time
![Page 108: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/108.jpg)
2D Gabor Filter
• Kernel: Gaussian envelope × sinusoidal wave, applied to the image by convolution
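A complex 2D Gabor kernel can be generated as in the sketch below: a Gaussian envelope multiplied by a complex sinusoid oriented at angle `theta`. The parameter names and default values are assumptions chosen for illustration, and the envelope is isotropic for simplicity.

```python
import cmath
import math


def gabor_kernel(size=9, sigma=2.0, theta=0.0, wavelength=4.0):
    """Complex 2D Gabor kernel: Gaussian envelope times a complex sinusoid."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the direction in which the wave propagates.
            x_theta = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            carrier = cmath.exp(2j * math.pi * x_theta / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_response(patch, kernel):
    """Complex response of one patch (same size as the kernel) to the filter."""
    return sum(
        kernel[i][j] * patch[i][j]
        for i in range(len(kernel))
        for j in range(len(kernel[0]))
    )


# A small bank covering four orientations at one frequency.
bank = [gabor_kernel(theta=k * math.pi / 4) for k in range(4)]
```

The real and imaginary parts of each response are the two components mentioned on the previous slide. Stacking them over all grid patches and all filters in the bank gives the feature vector that would be passed to Linear Discriminant Analysis.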
![Page 109: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/109.jpg)
Live Demonstration
• Training
• Classification
![Page 110: Spatial Business Detection and Recognition from Images](https://reader035.vdocument.in/reader035/viewer/2022070500/5681689f550346895ddf31e9/html5/thumbnails/110.jpg)
Thank You!