Spatial Business Detection and Recognition from Images Alexander Darino

Post on 25-Feb-2016

Page 1: Spatial Business Detection and Recognition from Images

Spatial Business Detection and Recognition from Images

Alexander Darino

Page 2: Spatial Business Detection and Recognition from Images

Outline

• Project Overview

Page 3: Spatial Business Detection and Recognition from Images

PROJECT OVERVIEW

• Previous Work
• Project Objective
• Anticipated End Result
• Project Pipeline

Page 4: Spatial Business Detection and Recognition from Images

Previous Work: Where Am I?

Image → Where Am I? → Latitude, Longitude

Page 5: Spatial Business Detection and Recognition from Images

Project Objective

• Given:
– Image
– Geolocation
• Yield:
– Spatial Identification of Businesses in Image
– Addresses of Businesses in Image
– Information about Businesses in Image
• Ex. Reviews, Categories, Phone Number, etc.

Page 7: Spatial Business Detection and Recognition from Images

Project Pipeline

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Image → Text Extraction → Detected Text
• Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection

Page 8: Spatial Business Detection and Recognition from Images

Anticipated End Result

Page 9: Spatial Business Detection and Recognition from Images

Obtaining a List of Candidate Businesses in Image via
BUSINESS SEARCHING

Page 10: Spatial Business Detection and Recognition from Images

Business Searching

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Image → Text Extraction → Detected Text
• Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection

Page 11: Spatial Business Detection and Recognition from Images

Business Searching

• Business Search Services
– Google
– Yelp
– CityGrid (Supplier for Yellow Pages, Super Pages)
• REST-based API
• Results in JSON or XML format
• Aggregate Results into Facade
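The aggregation step could look roughly like the sketch below. It is an illustration only, not the project's code: the `Business` record, the `BusinessSearchFacade` class, and the two stub providers (with their sample data) are hypothetical, and no real Google, Yelp, or CityGrid calls are made.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Business:
    name: str
    address: str
    source: str


# A provider is any callable taking (latitude, longitude) and returning businesses.
Provider = Callable[[float, float], Iterable[Business]]


class BusinessSearchFacade:
    """Hypothetical facade: queries every provider and concatenates the results."""

    def __init__(self, providers: List[Provider]):
        self.providers = providers

    def search(self, lat: float, lon: float) -> List[Business]:
        merged: List[Business] = []
        for provider in self.providers:
            merged.extend(provider(lat, lon))
        return merged


# Stub providers standing in for the real REST services (assumed sample data).
def stub_provider_a(lat, lon):
    return [Business("Nicholas Coffee Co", "Market Sq", "a")]


def stub_provider_b(lat, lon):
    return [Business("Nicholas Coffee Co", "Market Sq", "b"),
            Business("Starbucks Coffee", "Market Sq", "b")]


facade = BusinessSearchFacade([stub_provider_a, stub_provider_b])
candidates = facade.search(40.441127, -80.002822)
```

Duplicates across providers are deliberately kept here, since the same business reported by several services may be useful evidence later in the pipeline.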

Page 12: Spatial Business Detection and Recognition from Images

{'businesses': [{'address1': '466 Haight St',
                 'address2': '',
                 'address3': '',
                 'avg_rating': 4.0,
                 'categories': [{'category_filter': 'danceclubs',
                                 'name': 'Dance Clubs',
                                 'search_url': 'http://yelp.com/search?find_loc=466+Haight+St%2C+San+Francisco%2C+CA&cflt=danceclubs'},
                                {'category_filter': 'lounges',
                                 'name': 'Lounges',
                                 'search_url': 'http://yelp.com/search?find_loc=466+Haight+St%2C+San+Francisco%2C+CA&cflt=lounges'},
                                {'category_filter': 'tradamerican',
                                 'name': 'American (Traditional)',
                                 'search_url': 'http://yelp.com/search?find_loc=466+Haight+St%2C+San+Francisco%2C+CA&cflt=tradamerican'}],
                 'city': 'San Francisco',
                 'distance': 1.8780401945114136,
                 'id': 'yyqwqfgn1ZmbQYNbl7s5sQ',
                 'is_closed': False,
                 'latitude': 37.772201000000003,
                 'longitude': -122.42992599999999,
                 'mobile_url': 'http://mobile.yelp.com/biz/yyqwqfgn1ZmbQYNbl7s5sQ',
                 'name': 'Nickies',
                 'nearby_url': 'http://yelp.com/search?find_loc=466+Haight+St%2C+San+Francisco%2C+CA',
                 'neighborhoods': [{'name': 'Hayes Valley',
                                    'url': 'http://yelp.com/search?find_loc=Hayes+Valley%

Page 13: Spatial Business Detection and Recognition from Images

Business Searching: Results

40.441127247181797 -80.002821624487595

Denham & Company Salon
Ullrich's Shoe Repairing
Nicholas Coffee Co
Bella Sera On the Square
A & J Ribs
Starbucks Coffee
Jenny Lee Bakery
Galardi's 30 Minute Cleaners
Jimmy John's Gourmet Sandwiches
Charley's Grilled Subs
Fresh Corner
Lagondola Pizzeria & Restaurant
Camera Repair Service Inc
Pittsburgh Cigar Bar
Original Oyster House
MixStirs
1902 Tavern
Costanzo's
Pittsburgh Silver Llc
Graeme St
Galardi's 30 Minute Cleaners
Denham & Co Salon
Bruegger's Bagel Bakery
Nicholas Coffee Co
Market Square
Fat Tommy's Pizzeria
Mixstirs Cafe
Giggles
Rycon Construction Inc
Garbera, Dennis C, Dds - Emmert Dental Assoc
Bella Sera on the Square
Mancini's Bread Co
Las Velas
Ciao Baby
Washington Reprographics Inc
Highmark Life Insurance Co
Fischer, Donald R, Md - Highmark Life Insurance Co
Jimmy John's
Lynx Energy Partners Inc
Emmert Dental Assoc

Page 14: Spatial Business Detection and Recognition from Images

Business Searching: Evaluation

• Strengths
– Aggregated results almost always found Business of interest
• Weaknesses
– Each API limits query result set size - this is why we aggregate
– Only businesses listed
– Not all businesses listed
• Limitations
– Dependent on well-populated, accurate Business Directories
– Have only tested for 15 Pittsburgh images - unknown result quality for rural areas.

Page 15: Spatial Business Detection and Recognition from Images

Obtaining names of Businesses in Image by
EXTRACTING IDENTIFYING TEXT

Page 16: Spatial Business Detection and Recognition from Images

Extracting Identifying Text

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Image → Text Extraction → Detected Text
• Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection

Page 17: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

• Used Two OCR APIs:
– GNU OCR (Ocrad)
– GOCR
• OCR APIs highly sensitive to:
– Font (only works well with roman font)
– Perspective
– Scale
– Binarization Threshold
– Dark on Light vs. Light on Dark (inversion)

Page 18: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

• OCR API evaluations
– Ocrad - could not yield any meaningful data across over 200 scale/threshold/inversion combinations
– GOCR - produced good results across 10 scales with and without inversion using threshold automatically determined by Otsu's method
• 98% of Results are garbage!
• Examples of GOCR output (next slides)
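For reference, Otsu's method picks the binarization threshold that maximizes the between-class variance of the two resulting pixel groups. The sketch below is a plain-Python illustration on a flat list of 8-bit grayscale values; the function names are hypothetical, and it says nothing about how the threshold was actually passed to GOCR.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Pixels with value <= t form the low class, the rest the high class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # sum of intensities in the low class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, t, invert=False):
    """Map pixels to 0/1 around threshold t; invert swaps dark-on-light and light-on-dark."""
    return [1 if (p > t) != invert else 0 for p in pixels]
```

Running the OCR once with `invert=False` and once with `invert=True` corresponds to the "with and without inversion" passes mentioned above.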

Page 19: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

Page 20: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

n.c.......o.a...u..............oU..D.oa..e......_RuEGGE..KERy..J...w...........L........M.II.....c..

...i

.......l.

.J

.t...llt...lSHA.P.It..tllt.........._.l...Jy._.c_...._tt.._....t.._.r.........t.t_t.._.._.l..J.r.r.I.

Page 21: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

Page 22: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

u..........._nq......eoR.E.l.e...í....e...n....n....n.e.R.E...e....o._....E.R.E.IKE........I.ltlO.........rE..o......E.....I.K.E.o.....

J.n....c...E.R.E.I.E.......M..E.R.E...E...aJ...Gu.ge..geE.F.._.....E..gE.D...fUlI..lll.lll.IIi.l..Xl..

Page 23: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

Page 24: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

..e_..w.._......D.........uJ.....J.................n......n..........n_..r.l_d..J.ec.m._..n.......J.n.._...tn..ct..._.................D.u.v...e.n....u..

Y.._w.n.n....Jn.......G..o..r..._........J...ml.t..l.tt.l.._w....................._....l....t........j..ilI.i..

Page 25: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

Page 26: Spatial Business Detection and Recognition from Images

Extracting Identifying Text: OCR

__.ncu_.l..._..._J...ne......._n._..v.....ra......d_..._.............i..n..UllREsT.unAN...r.c.....r...Tt.rJll......m...c.....n.......

..

.Jn.I..c...r.rESTAU.ANT.r.O....c.cc.

Note: Even though "Tambellini" is a roman font, it is too stretched to be picked up by GOCR

Page 27: Spatial Business Detection and Recognition from Images

OCR Evaluation

• Strengths
– Applicable to expected input of orthogonal images
• Weaknesses
– Only works well(-ish) for strictly roman font
• Limitations
– Will perform poorly for artistic fonts and business signs
• Conclusion
– By itself, OCR is not the best approach towards Business identification
• Reasons: poor recognition, franchises, perspective, etc

Page 28: Spatial Business Detection and Recognition from Images

Matching Identifying Text to Candidate Business Names via
BUSINESS NAME MATCHING

Page 29: Spatial Business Detection and Recognition from Images

Business Name Matching

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Image → Text Extraction → Detected Text
• Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection

Page 30: Spatial Business Detection and Recognition from Images

Business Name Matching

• Given: Unreliable fragments of ‘detected text’
• Yield: Matching Business Names
• Process:
– Filter input: trimming, uselessness (< 2 letters)
– Fuzzy String Matching
– Voting Scheme: confidence of business appearing in image
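The first two process steps might be sketched as follows. This is an assumption-laden illustration: the slides do not say which fuzzy matcher was used, so Python's standard `difflib.SequenceMatcher` stands in for it, and the function names and the set of trimmed characters (spaces, dots, underscores) are hypothetical.

```python
import difflib


def filter_tokens(raw_tokens):
    """Trim OCR fragments and drop those with fewer than 2 letters."""
    kept = []
    for tok in raw_tokens:
        tok = tok.strip(" ._")  # assumed noise characters, based on the GOCR samples
        if sum(ch.isalpha() for ch in tok) >= 2:
            kept.append(tok)
    return kept


def token_confidence(ocr_token, name_token):
    """Case-insensitive similarity in [0, 1] between an OCR fragment and a name word."""
    matcher = difflib.SequenceMatcher(None, ocr_token.lower(), name_token.lower())
    return matcher.ratio()
```

With this stand-in matcher, a fragment such as "ESTUANT" scores well against "RESTAURANT" and poorly against an unrelated word, which is the behavior the voting step relies on.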

Page 31: Spatial Business Detection and Recognition from Images

Business Name Detection

Page 32: Spatial Business Detection and Recognition from Images

Business Name Matching

• Developed Confidence Attribution Algorithm
– Confidence of OCR Token being Name Token
• Example: Confidence of “ESTUANT” representing “RESTAURANT”
• Point-based system
– Confidence of Name appearing in Image
• Sum of points of matching OCR Text
• Use logarithmically-normalized points to determine business inclusion threshold
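One way a point-based vote with logarithmic normalization could work is sketched below. The slides do not give the actual formula, so everything here is a guess for illustration: the per-token cutoff `min_token_conf`, the `log1p` normalization against the top-scoring business, the inclusion `threshold`, and the use of `difflib` as the matcher are all assumptions.

```python
import difflib
import math


def similarity(a, b):
    """Case-insensitive similarity in [0, 1] (stand-in for the real matcher)."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score_businesses(ocr_tokens, business_names, min_token_conf=0.6):
    """Sum, per business, the confidences of OCR tokens that match one of its name words."""
    points = {}
    for name in business_names:
        total = 0.0
        for tok in ocr_tokens:
            best = max((similarity(tok, word) for word in name.split()), default=0.0)
            if best >= min_token_conf:
                total += best
        points[name] = total
    return points


def select_businesses(points, threshold=0.5):
    """Log-normalize points to [0, 1] and keep businesses at or above the threshold."""
    top = max(points.values(), default=0.0)
    if top <= 0.0:
        return []
    norm = {n: math.log1p(p) / math.log1p(top) for n, p in points.items()}
    return [n for n, v in norm.items() if v >= threshold]


# Fragments taken from the GOCR sample output shown earlier.
points = score_businesses(["RuEGGE", "KERy"],
                          ["Bruegger's Bagel Bakery", "Starbucks Coffee"])
selected = select_businesses(points)
```

The logarithm compresses large point totals, so a business matched by many fragments does not push moderately matched businesses below the threshold as quickly as a linear normalization would.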

Page 33: Spatial Business Detection and Recognition from Images

Business Name Matching

Page 34: Spatial Business Detection and Recognition from Images

Page 35: Spatial Business Detection and Recognition from Images

Business Name Matching

Page 36: Spatial Business Detection and Recognition from Images

Page 37: Spatial Business Detection and Recognition from Images

Business Name Matching

Page 38: Spatial Business Detection and Recognition from Images

Business Name Matching

Page 39: Spatial Business Detection and Recognition from Images

Business Name Matching

Note: This originally did not appear because it did not exceed the confidence threshold. It now appears because it contributes to the Business Name Identification

Page 40: Spatial Business Detection and Recognition from Images

Isolating Identified Businesses in Image via
SPATIAL BUSINESS IDENTIFICATION

Page 41: Spatial Business Detection and Recognition from Images

Business Spatial Identification

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Image → OCR → Detected Text
• Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection

Page 42: Spatial Business Detection and Recognition from Images

Business Spatial Identification

Page 43: Spatial Business Detection and Recognition from Images

Business Spatial Identification

Aiken George S Co

Category: Food, Grocery
Address: 218 Forbes Ave, Pittsburgh, PA 15222
Phone: (412) 391-6358
Rating: 4.5/5 (2 Reviews)

Page 44: Spatial Business Detection and Recognition from Images

Business Spatial Identification

Page 45: Spatial Business Detection and Recognition from Images

Business Spatial Identification

Page 46: Spatial Business Detection and Recognition from Images

Business Spatial Identification

Bruegger's Bagels

Category: Bagels
Address: Market Sq, Pittsburgh, PA 15222
Phone: (412) 281-2515
Rating: Not Rated

Page 47: Spatial Business Detection and Recognition from Images

V0.1: EVALUATION

Page 48: Spatial Business Detection and Recognition from Images

Current Approach

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Image → OCR → Detected Text
• Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection

Page 49: Spatial Business Detection and Recognition from Images

Weaknesses to Current Approach

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Image → OCR → Detected Text
• Nearby Businesses + Detected Text → Business Name Matching → Business Identification → Business Spatial Detection

Page 50: Spatial Business Detection and Recognition from Images

Weaknesses to Current Approach

Lots of Garbage

Page 51: Spatial Business Detection and Recognition from Images

Weaknesses to Current Approach

Fragmented Word Detection

Page 52: Spatial Business Detection and Recognition from Images

Weaknesses to Current Approach

Fails with non-orthogonal perspective

Did I already mention lots of garbage?

Page 53: Spatial Business Detection and Recognition from Images

Weaknesses to Current Approach

Fails with non-roman text

Not scale-invariant

Page 54: Spatial Business Detection and Recognition from Images

ALTERNATIVES TO OCR

Page 55: Spatial Business Detection and Recognition from Images

Alternative #1: Image Matching

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Image → Match to Storefront Image → Business Identification → Business Spatial Detection

Page 56: Spatial Business Detection and Recognition from Images

Alternative #1: Image Matching

Page 57: Spatial Business Detection and Recognition from Images

Alternative #1: Evaluation

• Weaknesses:
– Low Availability of Storefront Images (< 50% Avg)
• George Aiken area businesses with photos: 18/35
• Brueggers area businesses with photos: 22/40
• Tambellini area businesses with photos: 8/22
– Available Images too small (100 x 100)
– Computationally Expensive
• Conclusion: Not a viable solution

Page 58: Spatial Business Detection and Recognition from Images

Alternative #2: Template Matching

[Figure: “Tambellini” rendered in eight different fonts]

Page 59: Spatial Business Detection and Recognition from Images

Alternative #2: Template Matching

• Latitude, Longitude → Geocoding / Reverse Geocoding → Nearby Businesses
• Nearby Businesses → Render Templates of Business Names in Different Fonts → Template Images
• Image + Template Images → Image Matching (eg. SIFT, HAAR) → Business Identification → Business Spatial Detection

Page 60: Spatial Business Detection and Recognition from Images

Alternative #2: Template Matching

OCR
• Not Scale Invariant
• Unbounded Search
• Fragmented Recognition
• Roman-only font

Alternative #2
• Scale Invariant
• Bounded Search
• Whole-word recognition
• All fonts

Page 61: Spatial Business Detection and Recognition from Images
Page 62: Spatial Business Detection and Recognition from Images
Page 63: Spatial Business Detection and Recognition from Images

Subsequent Attempts

Page 64: Spatial Business Detection and Recognition from Images

Alternative #3: Scene Text Recognition

• State of the Art:
– STR ≠ OCR
– Far superior to our ‘naïve’ approaches to STR (ie. OCR, Image matching, SIFT)
• OCR only works for highly controlled environments
• STR works for unconditioned environments
– Scale invariant
– Color/intensity invariant
– Lexicon-Assisted

Page 65: Spatial Business Detection and Recognition from Images

Alternative: Scene Text Recognition

• No STR implementations readily available
• Have contacted several groups specialized in STR – none were able to provide an implementation for research purposes
• Had to resort to implementing STR from scratch

Page 66: Spatial Business Detection and Recognition from Images

The long and perilous journey of implementing
SCENE TEXT RECOGNITION

Page 67: Spatial Business Detection and Recognition from Images

STR Implementation

• STR Implementation: “Automatic Detection and Recognition of Signs From Natural Scenes”
– Multiresolution-based potential characters detection
– Character/layout geometry and color properties analysis
– Local affine rectification
– Refined Detection

Page 68: Spatial Business Detection and Recognition from Images

Candidate Text Detection via
MULTIRESOLUTION-BASED POTENTIAL CHARACTERS DETECTION

Page 69: Spatial Business Detection and Recognition from Images

STR Implementation

• STR Implementation: “Automatic Detection and Recognition of Signs From Natural Scenes”
– Multiresolution-based potential characters detection
– Character/layout geometry and color properties analysis
– Local affine rectification
– Refined Detection

Page 70: Spatial Business Detection and Recognition from Images

Multiresolution-based potential characters detection

• Laplacian-of-Gaussian Edge Detection
• Dice image/edges into Patches
– Combine patches with similar properties into regions
– Obtain bounding box of region as candidate text
– Properties include:
• Mean
• Variance
• Intensity
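The edge-detection and patch-dicing steps can be sketched in plain Python as below. This is a simplified illustration, not the implementation behind the slides: the kernel size, sigma, border handling, and function names are assumptions, and the region-merging step is left out.

```python
import math


def log_kernel(size=5, sigma=1.0):
    """Build a size x size Laplacian-of-Gaussian kernel, shifted so it sums to zero."""
    half = size // 2
    k = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4
                       * math.exp(-r2 / (2 * sigma ** 2)))
        k.append(row)
    mean = sum(map(sum, k)) / (size * size)
    return [[v - mean for v in row] for row in k]


def filter_image(image, kernel):
    """Apply the (symmetric) kernel to a 2D list image, clamping coordinates at the border."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, row in enumerate(kernel):
                for kx, kv in enumerate(row):
                    yy = min(max(y + ky - half, 0), h - 1)
                    xx = min(max(x + kx - half, 0), w - 1)
                    acc += kv * image[yy][xx]
            out[y][x] = acc
    return out


def patch_stats(image, size):
    """Dice the image into size x size patches; return {(row, col): (mean, variance)}."""
    stats = {}
    for py in range(0, len(image) - size + 1, size):
        for px in range(0, len(image[0]) - size + 1, size):
            vals = [image[y][x]
                    for y in range(py, py + size) for x in range(px, px + size)]
            m = sum(vals) / len(vals)
            var = sum((v - m) ** 2 for v in vals) / len(vals)
            stats[(py // size, px // size)] = (m, var)
    return stats
```

Flat areas give a response near zero, while intensity steps give a strong response, so patches can then be compared by their mean and variance before being merged into regions.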

Page 71: Spatial Business Detection and Recognition from Images

Multiresolution-based potential characters detection

Page 72: Spatial Business Detection and Recognition from Images

Multiresolution-based potential characters detection

Patches qualify if:

Page 73: Spatial Business Detection and Recognition from Images

Multiresolution-based potential characters detection

Page 74: Spatial Business Detection and Recognition from Images

Multiresolution-based potential characters detection

Page 75: Spatial Business Detection and Recognition from Images

Multiresolution-based potential characters detection

Page 76: Spatial Business Detection and Recognition from Images

Multiresolution-based potential characters detection

Page 77: Spatial Business Detection and Recognition from Images

Problems with Current Approach

• Too much “bleeding”
• Unstable edge-data due to unpredictability of location of edge patch relative to edge itself

Page 78: Spatial Business Detection and Recognition from Images

New Approach

• Each edge pixel gets an N x N edge patch (eg. 3x3)
• Edge patches overlap
– Tighter bounding boxes
– More region consistency
– More robust to resolution changes
– Able to use tighter thresholds
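A minimal sketch of the per-pixel patch idea follows. It assumes a binary edge map as input and, as a simplification, groups patches purely by 4-connectivity rather than by similar patch properties; the function names are hypothetical.

```python
def edge_patch_mask(edges, n=3):
    """Mark the n x n neighborhood of every edge pixel (patches may overlap)."""
    h, w = len(edges), len(edges[0])
    half = n // 2
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if edges[y][x]:
                for yy in range(max(0, y - half), min(h, y + half + 1)):
                    for xx in range(max(0, x - half), min(w, x + half + 1)):
                        mask[yy][xx] = True
    return mask


def bounding_boxes(mask):
    """Return (top, left, bottom, right) for each 4-connected region of the mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # Flood-fill the region, tracking its extent as we go.
                stack = [(y, x)]
                seen[y][x] = True
                top, left, bottom, right = y, x, y, x
                while stack:
                    cy, cx = stack.pop()
                    top, bottom = min(top, cy), max(bottom, cy)
                    left, right = min(left, cx), max(right, cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes
```

Because every patch is centered on an actual edge pixel, the resulting boxes extend at most `n // 2` pixels beyond the edges themselves, which is where the tighter bounding boxes come from.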

Page 79: Spatial Business Detection and Recognition from Images

New Approach

Page 80: Spatial Business Detection and Recognition from Images

New Approach

Page 81: Spatial Business Detection and Recognition from Images

New Approach

Page 82: Spatial Business Detection and Recognition from Images

New Approach

Page 83: Spatial Business Detection and Recognition from Images

New Approach

Page 84: Spatial Business Detection and Recognition from Images

New Approach

Page 85: Spatial Business Detection and Recognition from Images

New Approach

Page 86: Spatial Business Detection and Recognition from Images

New Approach

Page 87: Spatial Business Detection and Recognition from Images

New Approach

Page 88: Spatial Business Detection and Recognition from Images

New Approach

Page 89: Spatial Business Detection and Recognition from Images

New Approach

Page 90: Spatial Business Detection and Recognition from Images

New Approach

Page 91: Spatial Business Detection and Recognition from Images

New Challenges!

Page 92: Spatial Business Detection and Recognition from Images

Text Detection Problem #1

How do I know that two regions are close enough together that they might be part of the same character?
• Center of bounding box?
• Moment of regions?
• Nearest Neighbor?
• Connectedness?
All have severe weaknesses
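To make one of these weaknesses concrete, the sketch below compares two candidate measures on boxes given as (top, left, bottom, right). Both functions are illustrative assumptions, not the measures actually tried in the project.

```python
import math


def center_distance(a, b):
    """Distance between the centers of two bounding boxes."""
    ay, ax = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    by, bx = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return math.hypot(ay - by, ax - bx)


def box_gap(a, b):
    """Smallest gap between two boxes; 0 if they touch or overlap."""
    dy = max(a[0] - b[2], b[0] - a[2], 0)
    dx = max(a[1] - b[3], b[1] - a[3], 0)
    return math.hypot(dy, dx)


wide = (0, 0, 10, 100)      # a long stroke
small = (0, 102, 10, 110)   # a small piece just to its right
```

For these two boxes the centers are far apart even though the boxes are separated by only 2 pixels, so a center-based rule would need a threshold that depends on region size. The gap measure has its own problem: it also links any unrelated region that happens to sit next to a character.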

Page 93: Spatial Business Detection and Recognition from Images

Text Detection Problem #2

How do I know that two characters are close enough to be considered a part of the same word?

Easier version of the last problem, but still hard!

Page 94: Spatial Business Detection and Recognition from Images

CHARACTER/LAYOUT GEOMETRY AND COLOR PROPERTIES ANALYSIS

Page 95: Spatial Business Detection and Recognition from Images

STR Implementation

• STR Implementation: “Automatic Detection and Recognition of Signs From Natural Scenes”
– Multiresolution-based potential characters detection
– Character/layout geometry and color properties analysis
– Local affine rectification
– Refined Detection

Page 96: Spatial Business Detection and Recognition from Images

Color Properties Analysis

• Implemented Gaussian Mixture Model (GMM) to obtain μ and σ of foreground/background for: R/G/B/H/I

• Calculated Confidences that component (RGBHI) can be used to recognize characters
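As a rough illustration of the GMM step, the sketch below fits a two-component 1-D Gaussian mixture to the values of a single channel using EM. It is not the project's code: the initialization at the extremes, the iteration count, and the `min_sigma` floor are assumptions (the floor of 0.2 is a guess suggested by the σ2=0.2000 values reported on the per-channel slides), and the confidence formula is not reproduced because the slides do not state it.

```python
import math


def fit_two_gaussians(values, iterations=50, min_sigma=0.2):
    """Fit a 2-component 1-D Gaussian mixture with EM.

    Returns [(mu, sigma, weight), (mu, sigma, weight)], low component first.
    """
    lo, hi = min(values), max(values)
    mu = [lo, hi]  # start the two components at the extremes
    sigma = [max((hi - lo) / 4, min_sigma)] * 2
    weight = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weight[k] / (sigma[k] * math.sqrt(2 * math.pi))
                 * math.exp(-((v - mu[k]) ** 2) / (2 * sigma[k] ** 2))
                 for k in range(2)]
            s = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weight, mean, and standard deviation.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weight[k] = nk / len(values)
            mu[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, values)) / nk
            sigma[k] = max(math.sqrt(var), min_sigma)
    return [(mu[k], sigma[k], weight[k]) for k in range(2)]
```

Running this once per channel (R, G, B, H, I) yields the foreground/background μ and σ pairs from which a per-channel confidence can then be computed.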

– Multiresolution-based potential characters detection
– Character/layout geometry and color properties analysis
– Local affine rectification
– Refined Detection

Page 97: Spatial Business Detection and Recognition from Images

Color Properties Analysis

• Assumed Invariant: High contrast between foreground/background of characters in sign

• Choose the channel (R/G/B/H/I) that is best suited for use with character recognition

Page 98: Spatial Business Detection and Recognition from Images

Original

Page 99: Spatial Business Detection and Recognition from Images

Green: μ1=172.337447154472, σ1=4.8463; μ2=255, σ2=0.2000; Confidence=0.017056947503074

Page 100: Spatial Business Detection and Recognition from Images

Blue: μ1=122.673512195122, σ1=6.1478; μ2=255, σ2=0.2000; Confidence=0.021524159560500

Page 101: Spatial Business Detection and Recognition from Images

Hue: μ1=106.601736628811, σ1=5.9561; μ2=0, σ2=0.2000; Confidence=0.017897920959170

Page 102: Spatial Business Detection and Recognition from Images

Intensity: μ1=145.658856368567, σ1=2.1271; μ2=255, σ2=0.2000; Confidence=0.051403296762968

Page 103: Spatial Business Detection and Recognition from Images

Mistake: This should only be done on individual characters, not words

Page 104: Spatial Business Detection and Recognition from Images

Color Analysis: Evaluation

• Highest confidence observed to be channel best suited for OCR…

• …Did I just say OCR?

YES! (I did.)

Page 105: Spatial Business Detection and Recognition from Images

A second shot at
OPTICAL CHARACTER RECOGNITION

Page 106: Spatial Business Detection and Recognition from Images

OPTICAL CHARACTER RECOGNITION II

(and this time, it’s personal)

Page 107: Spatial Business Detection and Recognition from Images

Refined Detection

• Generate alphabet templates in different fonts
• Resize templates; Divide into grid
• Apply several 2D Gabor filters to each grid patch
– Different orientations, frequencies, variances
– For each pixel, yields real/imaginary component of transformation
• Feed data into Linear Discriminant Analysis
– Reduces features and forms classifier at same time

Page 108: Spatial Business Detection and Recognition from Images

2D Gabor Filter

• Kernel: Gaussian envelope multiplied by a sine wave
• Applied to the image by convolution
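A complex 2D Gabor kernel can be sketched in a few lines. This is a simplified illustration with an isotropic Gaussian envelope; the parameter names (`wavelength`, `theta`, `sigma`) and both function names are assumptions rather than the project's actual code.

```python
import cmath
import math


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex 2D Gabor kernel: Gaussian envelope times a complex sinusoid."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the wave travels along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            carrier = cmath.exp(2j * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_response(patch, kernel):
    """Sum of elementwise products of a patch and a same-sized kernel (complex result)."""
    return sum(kv * pv
               for krow, prow in zip(kernel, patch)
               for kv, pv in zip(krow, prow))


# Vertical stripes with a period of 4 pixels.
stripes = [[math.cos(2 * math.pi * x / 4) for x in range(-2, 3)] for _ in range(5)]
r_along = gabor_response(stripes, gabor_kernel(5, 4.0, 0.0, 2.0))
r_across = gabor_response(stripes, gabor_kernel(5, 4.0, math.pi / 2, 2.0))
```

The real and imaginary parts of the response are the two components mentioned for each filter; a kernel tuned to the stripe orientation and frequency responds much more strongly than one rotated by 90 degrees, which is what makes a bank of such filters useful as character features.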

Page 109: Spatial Business Detection and Recognition from Images

Live Demonstration

• Training
• Classification

Page 110: Spatial Business Detection and Recognition from Images

Thank You!