WISE: Large Scale Content-Based Web Image Search Michael Isard Joint with: Qifa Ke, Jian Sun, Zhong Wu Microsoft Research Silicon Valley 1

Post on 19-Dec-2015


Page 1

WISE: Large Scale Content-Based Web Image Search

Michael Isard
Joint with: Qifa Ke, Jian Sun, Zhong Wu

Microsoft Research Silicon Valley

Page 2

Query by Images

“A picture is worth a thousand words.”

What leaf?

Artist? Higher resolution?

Bank website?

Email?

Ad on eBay?

……

©: Who else is using this?

Page 3

Partial-Duplicate Image Search

• Given a query image, find its partial duplicates from a database of web-images

Page 4

Two Major Challenges

• How to represent images
– No text annotations or labels
– Noise and modification

• How to efficiently index and query images
– Large number of images (millions)

[Diagram: Index]

Page 5

Image Representation: Bag-of-Words [CVPR’09, ICCV’09]

Pipeline steps:
1: Feature extraction: Bundle Features
2: Quantization
3: Representation

Figure labels:
– Detection [Lowe 2004, Matas et al 2002, Winder et al 2007]
– Description [Lowe 2004, Winder et al 2007]
– Normalization
– Code-book
– Descriptor ID (1, 3506, 999,999, 1,000,000)
– Bag-of-words [Sivic & Zisserman 2003]

[Figure: descriptor histograms mapped through the code-book to descriptor IDs such as 3506, 1, 999,999 and 206]
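The quantization and representation steps on this slide can be sketched roughly as follows. This is a toy illustration, not the WISE implementation: the 2-D code-book and descriptors are made-up values, `quantize` and `bag_of_words` are hypothetical helper names, and normalizing the histogram to sum to one is an assumption (the slide only says "Normalization"). A real system would use high-dimensional local descriptors and a code-book of about one million visual words.

```python
from collections import Counter
import math


def quantize(descriptor, codebook):
    """Return the ID of the nearest code-book entry (the visual word)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))


def bag_of_words(descriptors, codebook):
    """Map descriptors to visual word IDs and return a normalized histogram."""
    counts = Counter(quantize(d, codebook) for d in descriptors)
    total = sum(counts.values())
    # Assumption: L1 normalization, so the histogram sums to 1.
    return {word: n / total for word, n in counts.items()}


# Toy data (hypothetical): three visual words, four descriptors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]
hist = bag_of_words(descriptors, codebook)
print(hist)
```

The brute-force nearest-neighbor search is only workable for a toy code-book; with a million visual words an approximate search structure would be needed.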

Page 6

Page 7

Matching query to database

• Use an index
– Each visual word has a ‘posting list’
– Lists every image containing the word

• At query time
– Look up the posting list for each query word
– Merge lists to find candidate images

• Partial match: don’t need every word to be present
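The posting-list lookup and merge described on this slide can be sketched as below. This is a minimal in-memory illustration under stated assumptions: `build_index` and `query` are hypothetical names, the image and visual word IDs are made up, and candidates are ranked simply by the number of shared visual words.

```python
from collections import Counter, defaultdict


def build_index(images):
    """images: dict mapping image ID -> iterable of visual word IDs."""
    index = defaultdict(list)
    for image_id, words in images.items():
        for word in set(words):
            index[word].append(image_id)  # posting list for this word
    return index


def query(index, query_words, top_k=10):
    """Merge the posting lists of the query words into ranked candidates."""
    votes = Counter()
    for word in set(query_words):
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    # Partial match: an image only needs to share some of the query words.
    return votes.most_common(top_k)


# Toy data (hypothetical IDs).
images = {27: [3506, 206, 1], 42: [3506, 999999], 7: [5]}
index = build_index(images)
results = query(index, [3506, 206, 8])
print(results)
```

Image 27 shares two query words and image 42 shares one, so both are returned even though neither contains word 8.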

Page 8

How much work to query?

• Disk-based index, bottleneck is random reads
– One seek per posting list

• Also one seek per matching image
– To fetch thumbnail etc.

• Keep as little information as possible in posting lists, to keep index size small

Page 9

Index Pipeline

• Implemented in a large computer cluster
– 256 nodes, using Dryad/DryadLINQ

[Diagram components: Image Crawler, Content Chunk, Feature Extractor, Local Features, Bundled Features, Feature Quantizer, Visual Words, Indexer, Inverted Index, Media DB (Thumbnail, URL)]

Page 10

Query Pipeline

[Diagram components: Query Image, Feature Extractor, Bundled Features, Feature Quantizer, Visual Words, Index Server, Inverted Index, Search Results (chunkID, imgID), Media DB (Thumbnail, URL), GUI Results]

Page 11

Bag-of-Words: Limitations

• Quantization
– Loss of discriminative power
– Sensitive to image variations and noise
– Soft quantization [Philbin et al, CVPR 2008]
– Hamming embedding [Jegou et al, ECCV 2008]

[Figure: descriptor histograms quantized to descriptor IDs 1, 3506, 999,999, 1,000,000]
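One common reading of soft quantization is to assign each descriptor to its k nearest visual words rather than only the nearest one, so that small descriptor changes are less likely to lose a match. The sketch below illustrates that idea only; it is unweighted, uses a made-up 2-D code-book, and `soft_quantize` is a hypothetical name rather than the method of the cited paper.

```python
import math


def soft_quantize(descriptor, codebook, k=4):
    """Return the IDs of the k nearest code-book entries, nearest first."""
    ranked = sorted(range(len(codebook)),
                    key=lambda i: math.dist(descriptor, codebook[i]))
    return ranked[:k]


# Toy data (hypothetical): the descriptor lies between words 0 and 1.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 6.0)]
words = soft_quantize((0.4, 0.1), codebook, k=2)
print(words)
```

With hard quantization only word 0 would be indexed; here word 1 is kept as well, at the cost of longer posting lists.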

Page 12

Geometric verification

• In practice, bag of words is too weak
• Does not exploit any geometry
• Post-process to check spatial layout of matching features
• Requires a disk seek per image
– Only used as a re-ranking step on a shortlist of matched images

Page 13

Geometric Re-ranking

Re-rank top 300 images

[Plot: mAP vs. number of images (50,000 to 1,000,000). Curves: baseline (bag-of-words), baseline + reranking.]

Page 14

Geometry in the index

• Previous works:
– Jegou et al ECCV 2008
  • Try to match similar orientations and scales
– Perdoch et al CVPR 2009
  • Match oriented features more effectively

• Still feature-by-feature
– Global geometric consistency applied at the end

Page 15

Single Feature is Weak

Page 16

Neighboring Features?

Page 17

Define Neighboring Features

• Previous works
– kNN voting [Sivic & Zisserman 2003]
– Higher-order spatial features [Liu et al][Yuan et al][Tirilly et al][Quack et al]
– Post geometric spatial verification [Lowe 2004][Chum et al 2007][Nister 2006][Philbin et al 2007]……
– Geometric Min-Hash [Chum et al 2009]

• Challenges
– Repeatable
– Partial matching
– Scalable: simple enough to build into index

Page 18

Define Neighboring Features

DoG Features [Lowe 2004]
- point features
- repeatable

MSER Features [Matas et al 2002]
- region features
- repeatable

Page 19

Define Neighboring Features

DoG Features [Lowe 2004]
- point features
- repeatable

MSER Features [Matas et al 2002]
- region features
- repeatable

Region groups points?

Page 20

Bundled Feature: Definition

• Bundled Feature = a set of DoG features bundled by an MSER region

[Figure labels: DoG interest points; MSER region]

Page 21

Bundled Feature: Definition

[Figure: Bundled Features]

Page 22

Matching Bundles: Membership

Query bundle: $\mathbf{q} = \{q_j\}$
Matched bundle: $\mathbf{p} = \{p_i\}$

Membership score: $M_m(\mathbf{q};\mathbf{p}) = |\mathbf{q} \cap \mathbf{p}| = 4$

Voting weight: $v(q_j) = M_m(\mathbf{q};\mathbf{p}) = 4$

$\mathrm{Sim}(I_1, I_2) = \sum_{q_j \in \mathbf{q}} v(q_j) = \sum_{q_j \in \mathbf{q}} 4 = 16$

Page 23

Matching Bundles: Membership

Query bundle: $\mathbf{q} = \{q_j\}$
Matched bundles: $\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3$

Membership score:
$M_m(\mathbf{q};\mathbf{p}_1) = |\mathbf{q} \cap \mathbf{p}_1| = 2$
$M_m(\mathbf{q};\mathbf{p}_2) = |\mathbf{q} \cap \mathbf{p}_2| = 1$
$M_m(\mathbf{q};\mathbf{p}_3) = |\mathbf{q} \cap \mathbf{p}_3| = 2$

Voting weight:
$v(q_j) = \max_{\mathbf{p}_k} \{ M_m(\mathbf{q};\mathbf{p}_k) \mid q_j \in \mathbf{p}_k \}$

$v(q_1) = 2$, $v(q_2) = 2$, $v(q_3) = \max(1, 2) = 2$, $v(q_4) = 2$

$\mathrm{Sim}(I_1, I_2) = \sum_{q_j} v(q_j) = 8$
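The membership score and voting weight on this slide can be checked with a small sketch. It assumes a bundle is represented as a list of visual word IDs; the IDs below are made up, chosen so that the scores reproduce the numbers on the slide, and `membership` and `voting_weights` are hypothetical names.

```python
def membership(q, p):
    """M_m(q; p): number of visual words shared by bundles q and p."""
    return len(set(q) & set(p))


def voting_weights(q, matched_bundles):
    """v(q_j): best membership score among matched bundles containing q_j."""
    weights = {}
    for word in q:
        scores = [membership(q, p) for p in matched_bundles if word in p]
        weights[word] = max(scores, default=0)
    return weights


# Toy bundles (hypothetical visual word IDs).
q = [1, 2, 3, 4]
p1, p2, p3 = [1, 2], [3], [3, 4]

v = voting_weights(q, [p1, p2, p3])
sim = sum(v.values())
print(v, sim)
```

Word 3 appears in both p2 (score 1) and p3 (score 2), so it votes with the larger score, which matches the max(1, 2) = 2 entry on the slide.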

Page 24

Matching Bundles: Geometric Constraint

Penalize inconsistent relative orders (illustrated along y):

$M_g(\mathbf{q};\mathbf{p}) = -\sum_i \delta\left(O_q[p_i] > O_q[p_{i+1}]\right)$

Example 1 (query vs. candidate):
order in query img: 1 < 2 < 3 < 4
order in target img: 1 < 3 < 4 < 5
order inconsistency: 0 + 0 + 0 = 0

Example 2 (query vs. candidate):
order in query img: 1 < 2 < 3 < 4
matching order: 5 > 2 > 1 < 3
inconsistency: 1 + 1 + 0 = 2
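The order-inconsistency count in the two examples on this slide can be sketched as follows. The input is assumed to be the sequence of order values read off along one axis, as written on the slide; the function names are hypothetical, and negating the count to obtain the geometric term is an assumption based on the slide's "penalize" wording.

```python
def order_inconsistency(orders):
    """Count adjacent pairs that appear in decreasing order."""
    return sum(1 for a, b in zip(orders, orders[1:]) if a > b)


def geometric_score(orders):
    """Assumed geometric term: the inconsistency count as a penalty."""
    return -order_inconsistency(orders)


consistent = order_inconsistency([1, 3, 4, 5])    # 0 + 0 + 0
inconsistent = order_inconsistency([5, 2, 1, 3])  # 1 + 1 + 0
print(consistent, inconsistent, geometric_score([5, 2, 1, 3]))
```

Only adjacent pairs are compared, so the cost is linear in the number of matched features in the bundle.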

Page 25

Matching Bundles: Formulation

• Bundle matching score (membership term plus geometric constraint term):

$M(\mathbf{q};\mathbf{p}) = M_m(\mathbf{q};\mathbf{p}) + \lambda M_g(\mathbf{q};\mathbf{p})$

• Image matching score:

$\mathrm{Sim}(I_1, I_2) = \sum_{\{q_j\}} v(q_j)$

$v(q_j) = \max_{\mathbf{p}_k} \{ M(\mathbf{q};\mathbf{p}_k) \mid q_j \in \mathbf{p}_k \}$

- Repeatable
- Partial matching
- Scalable?

Page 26

Inverted Index (without Bundles)

[Diagram: each visual word points to a posting list of image IDs; image ID = 27 appears in the posting lists of the visual words it contains]

Page 27

Inverted Index with Bundles

[Diagram: each visual word points to a posting list; each posting holds an image ID plus bundle bits, e.g. 27,[3,1,1]; 27,[1,2,1]; 27,[2,2,2] for image ID = 27 with bundles p1, p2, p3]

Bundle bits:
Bundle ID: 9 bits
X-Order: 5 bits
Y-Order: 5 bits
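The 9 + 5 + 5 bit layout on this slide fits in 19 bits per posting. A minimal sketch of packing and unpacking such a field is shown below. The position of each field within the word, the function names, and reading a posting such as [3,1,1] as (bundle ID, X-order, Y-order) are all assumptions for illustration; the slide gives only the field widths.

```python
BUNDLE_ID_BITS, X_ORDER_BITS, Y_ORDER_BITS = 9, 5, 5


def pack_bundle_bits(bundle_id, x_order, y_order):
    """Pack the three fields into one 19-bit integer (assumed layout)."""
    if not 0 <= bundle_id < (1 << BUNDLE_ID_BITS):
        raise ValueError("bundle_id does not fit in 9 bits")
    if not 0 <= x_order < (1 << X_ORDER_BITS):
        raise ValueError("x_order does not fit in 5 bits")
    if not 0 <= y_order < (1 << Y_ORDER_BITS):
        raise ValueError("y_order does not fit in 5 bits")
    return ((bundle_id << (X_ORDER_BITS + Y_ORDER_BITS))
            | (x_order << Y_ORDER_BITS)
            | y_order)


def unpack_bundle_bits(bits):
    """Inverse of pack_bundle_bits."""
    y_order = bits & ((1 << Y_ORDER_BITS) - 1)
    x_order = (bits >> Y_ORDER_BITS) & ((1 << X_ORDER_BITS) - 1)
    bundle_id = bits >> (X_ORDER_BITS + Y_ORDER_BITS)
    return bundle_id, x_order, y_order


packed = pack_bundle_bits(3, 1, 1)
print(packed, unpack_bundle_bits(packed))
```

The field widths cap a single image at 512 bundle IDs and 32 order positions per axis within a bundle.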

Page 28

Retrieval

[Diagram labels: Query image I_q; Inverted index with bundle bits; Top candidate images]

[Example posting entries: 1,1,[2,5,9]; 3,1,[3,4,5]; 9,2,[3,2,5]; 10,1,[1,1,2]; 12,1,[1,1,2]; 10,1,[2,2,1]; bundles p1, p2, p3]

Page 29

Experimental Settings

• Image database:
– 1M web images from query-click log

• Ground truth partial duplicates
– 780 known partial duplicate images in 19 groups

• Baseline bag-of-words
– Visual word vocabulary size = 1M
– Soft quantization factor = 4
– 500 features per image

Page 30

Partial Duplicate Example

……

Page 31

Partial Duplicate Example

……

Page 32

Example Query Results

Challenging cases

Query

Page 33

Evaluation: Precision-Recall

• A query returns N images
– T: correct matches
– A: expected matches

Precision = T / N
Recall = T / A

[Plot: precision vs. recall]
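The two ratios on this slide translate directly into code. The sketch below assumes a query result is a list of image IDs and the ground truth is a set of expected IDs; the IDs and the function name `precision_recall` are made up for illustration.

```python
def precision_recall(returned, expected):
    """Return (T / N, T / A) for one query."""
    returned = list(returned)
    expected = set(expected)
    t = sum(1 for image_id in returned if image_id in expected)
    precision = t / len(returned) if returned else 0.0
    recall = t / len(expected) if expected else 0.0
    return precision, recall


# Toy query (hypothetical IDs): N = 4 returned, A = 5 expected, T = 2 correct.
precision, recall = precision_recall([27, 42, 7, 99], {27, 7, 13, 14, 15})
print(precision, recall)
```

Evaluating these ratios at every cut-off of the ranked list traces out the precision-recall curve.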

Page 34

Comparison: Precision-Recall

Baseline bag-of-words (started from 13th)

Bundled features (started from 13th)

Query image:

Page 35

More Precision-Recall Comparisons

Page 36

Evaluation: mAP

• Average Precision (AP) for one query:
– Area under Precision-Recall curve

• mAP: mean of APs from all testing queries

[Figure: AP shown as the area under the precision-recall curve]
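A common way to approximate the area under the precision-recall curve is to average the precision at each rank where a correct match appears, divided by the number of expected matches. The sketch below uses that approximation, which is an assumption here since the slide does not say how the area is computed. The function names and toy rankings are also hypothetical, and the ranked list is assumed to contain no duplicate IDs.

```python
def average_precision(ranked, expected):
    """Approximate area under the precision-recall curve for one query."""
    expected = set(expected)
    hits = 0
    total = 0.0
    for rank, image_id in enumerate(ranked, start=1):
        if image_id in expected:
            hits += 1
            total += hits / rank  # precision at this recall point
    return total / len(expected) if expected else 0.0


def mean_average_precision(queries):
    """queries: list of (ranked result IDs, expected IDs) pairs."""
    aps = [average_precision(ranked, expected) for ranked, expected in queries]
    return sum(aps) / len(aps) if aps else 0.0


# Toy queries (hypothetical IDs).
ap1 = average_precision([1, 9, 2, 8], {1, 2})  # (1/1 + 2/3) / 2
ap2 = average_precision([9, 1], {1})           # (1/2) / 1
m_ap = mean_average_precision([([1, 9, 2, 8], {1, 2}), ([9, 1], {1})])
print(ap1, ap2, m_ap)
```

Expected matches that are never returned contribute zero, so a method that misses duplicates is penalized even if everything it returns is correct.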

Page 37

mAP: Baseline Bag-of-Words

[Plot: mAP vs. number of images (50,000 to 1,000,000). Legend: baseline, HE, bundled (membership), bundled, bundled + HE.]

Page 38

mAP: Hamming Embedding (HE)

[Plot: mAP vs. number of images (50,000 to 1,000,000). Legend: baseline, HE, bundled (membership), bundled, bundled + HE.]

Page 39

mAP: Bundle (Membership)

[Plot: mAP vs. number of images (50,000 to 1,000,000). Legend: baseline, HE, bundled (membership), bundled, bundled + HE.]

Page 40

mAP: Bundle (both terms)

[Plot: mAP vs. number of images (50,000 to 1,000,000). Legend: baseline, HE, bundled (membership), bundled, bundled + HE. Annotations on plot: 26%, 40%.]

Page 41

mAP: Bundle + HE

[Plot: mAP vs. number of images (50,000 to 1,000,000). Legend: baseline, HE, bundled (membership), bundled, bundled + HE. Annotation on plot: 49%.]

Page 42

Bundle vs. Geometric Re-ranking

Re-rank top 300 images

[Plot: mAP vs. number of images (50,000 to 1,000,000). Legend: baseline (bag-of-words), bundle, baseline + reranking, bundle + reranking.]

Page 43

Bundle + Geometric Re-ranking

Re-rank top 300 images

[Plot: mAP vs. number of images (50,000 to 1,000,000). Legend: baseline (bag-of-words), bundle, baseline + reranking, bundle + reranking. Annotations on plot: 24%, 77%.]

Page 44

More Results

Page 45

Failure Case

Page 46

Demo Setup

Document Server

Web Server

Index Servers

Client

6 million images

Page 47

Demo

Query image

Results

Page 48

Conclusion

• Bundle feature
– More discriminative
– Enforce spatial constraints while traversing index
– Partial match
– Scalable: built into index

Bundle bits: Bundle ID (9 bits), X-Order (5 bits), Y-Order (5 bits)

Page 49

Thanks!