interactive image search with attributes

41
Interactive Image Search with Attributes Adriana Kovashka Department of Computer Science January 13, 2015 Joint work with Kristen Grauman and Devi Parikh

Upload: others

Post on 23-May-2022

14 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Interactive Image Search with Attributes

Interactive Image Search with Attributes

Adriana KovashkaDepartment of Computer Science

January 13, 2015

Joint work with Kristen Grauman and Devi Parikh

Page 2: Interactive Image Search with Attributes

We Need Search to Access Visual Data

• 144,000 hours of video uploaded to YouTube daily

• 4.5 million photos uploaded to Flickr daily

• 150 million top-level domains

• This data pool is too big to simply browse

Page 3: Interactive Image Search with Attributes

Some Example Image Search Problems

… “Who did the witness see at the crime scene?”

… “Find all images depictingshiny objects.”

…“Which plant

is this?”

Page 4: Interactive Image Search with Attributes

How Is Image Search Done Today?

• Keywords work well for categories with known name

Marionberry?

marionberry

Marionberry.jpg Marion-berries-2.jpg

Marionberry-jam.jpg

Page 5: Interactive Image Search with Attributes

How Is Image Search Done Today?

• Unclear how to search without name or image

• Keywords are not enough!

???

Page 6: Interactive Image Search with Attributes

Solution? Interactively Describing Visual Content

It’s a yellow fruit.

Is it a lemon?

It has spikes. Is it a durian?

It is green on the inside. Is it a horned

melon?

Page 7: Interactive Image Search with Attributes

Solution? Interactively Describing Visual Content

It’s a yellow fruit.

Is it a lemon?

It has spikes. Is it a durian?

It is green on the inside. Is it a horned

melon?

Page 8: Interactive Image Search with Attributes

• Potential to communicate more precisely the desired visual content

• Iteratively refine the set of retrieved images

My Goal: Precise Communication for Search

Feedback

Results

Page 9: Interactive Image Search with Attributes

Main Idea: Interactive Visual Search

Problem: User’s ability to communicate about visual content is severely limited

Key idea: New form of interaction using precise language-based feedback

“It was bigger and brighter

and more beautiful…”

Page 10: Interactive Image Search with Attributes

Key Questions

1) How to open up communication?

2) How to account for ambiguity of search terms?

“Bright” “Bright”

Target is more smiling than

Page 11: Interactive Image Search with Attributes

How is Interactive Search Done Today?

• Traditional binary feedback imprecise; allows only coarse communication between user and system

Keywords + binary relevance feedback

thin white male

[Rui et al. 1998, Zhou et al. 2003, Tong & Chang 2001, Cox et al. 2000, Ferecatu & Geman 2007, …]

relevantirrelevant

Page 12: Interactive Image Search with Attributes

Our Idea: Search via Comparisons

• Allow user to “whittle away” irrelevant images via comparative feedback on properties of results

“Like this… but with curlier hair”

Kovashka, Parikh, and Grauman, CVPR 2012

Page 13: Interactive Image Search with Attributes

Prior Work: Semantic Visual Attributes

• High-level descriptive properties shared by objects

• Human-understandable and machine-detectable

• Middle ground between user and system

smiling large-lips

long-hair

natural

perspective openyellow

spiky smooth

[Lampert et al. 2009, Farhadi et al. 2009, Kumar et al. 2009, Parikh & Grauman 2011, Berg et al. 2010, Branson et al. 2010, Endres et al. 2010, Curran et al. 2012 …]

high

heel

redornaments

metallic

14

Page 14: Interactive Image Search with Attributes

Prior Work: Attributes for Recognition and Search

Object recognition

• Low-dimensional image representation for classification

• Need not be semantic

Image search

• A form of keyword search

“men with dark hair” “middle-aged white men” FaceTracer,Kumar et al. 2008

[Lampert et al. 2009, Farhadi et al. 2009, Kumar et al. 2009, Parikh & Grauman 2011, Berg et al. 2010, Branson et al. 2010, Endres et al. 2010, Curran et al. 2012 …]

[Kumar et al. ECCV 2008, Siddiquie et al. CVPR 2011, Scheirer et al. CVPR 2012, Rastegari et al. CVPR 2013]

Page 15: Interactive Image Search with Attributes

We need ability to compare images by attribute “strength”

bright

smiling

natural

Relative Attributes

Page 16: Interactive Image Search with Attributes

Learning Relative Attributes

• We want to learn a spectrum (ranking model) for an attribute, e.g. “brightness”.

• Supervision from human annotators consists of:

Parikh and Grauman, ICCV 2011

Ordered pairs

Similar pairs

Page 17: Interactive Image Search with Attributes

We need ability to compare images by attribute “strength”

bright

smiling

natural

Relative Attributes

Page 18: Interactive Image Search with Attributes

WhittleSearch with Relative Attribute Feedback

Results Page 1

Update relevance

scores

score=7

score=5

score=4

score=4score=1

?User: “I want somethingmore natural than this.”

Kovashka, Parikh, and Grauman, CVPR 2012

Page 19: Interactive Image Search with Attributes

WhittleSearch with Relative Attribute Feedback

natural

perspective“I want

something more natural

than this.” “I want something less naturalthan this.”

“I want something with more perspective than this.”

Kovashka, Parikh, and Grauman, CVPR 2012

+1

+1

+1

+1

+1

+1

+1 +1

+1

+1 +1

Page 20: Interactive Image Search with Attributes

More open than

Qualitative Result (Relative Attribute Feedback)

More open than

Less ornaments than

Match

Round 1

Ro

un

d 2

Round 3

Query: “I want a bright, open shoe that is shorton the leg.”

Selected feedback

Page 21: Interactive Image Search with Attributes

Shoes [Berg10, Kovashka12]:14,658 shoe images;

10 attributes: “pointy”, “bright”, “high-heeled”, “feminine” etc.

OSR [Oliva01]:2,688 scene images;

6 attributes:“natural”, “perspective”,

“open-air”, “close-depth” etc.

PubFig [Kumar08]:772 face images;

11 attributes:“masculine”, “young”,

“smiling”, “round-face”, etc.

Datasets Data from 147 users

Page 22: Interactive Image Search with Attributes

WhittleSearch Results (Summary)

• Binary feedback represents status quo [Rui et al. 1998, Cox et al. 2000, Ferecatu & Geman 2007, …]

• WhittleSearch converges on target faster than traditional binary feedback

Page 23: Interactive Image Search with Attributes

Key Questions

1) How to open up communication?

2) How to account for ambiguity of search terms?

“Bright” “Bright”

Target is more smiling than

Page 24: Interactive Image Search with Attributes

Standard Attribute Learning: One Generic Model

• Learn a generic model by pooling training data regardless of the annotator identity

• Inter-annotator disagreement treated as noise

Vote on labels

“formal”

“not formal”

Crowd

Page 25: Interactive Image Search with Attributes

Problem: One Model Does Not Fit All

• There may be valid perceptual differences within an attribute, yet existing methods assume monolithic attribute sufficient

[Lampert et al. CVPR 2009, Farhadi et al. CVPR 2009,Branson et al. ECCV 2010, Kumar et al. PAMI 2011,Scheirer et al. CVPR 2012, Parikh & Grauman ICCV 2011, …]

Formal? User labels: 50% “yes”50% “no”

or

More ornamented? User labels: 50% “first”20% “second”30% “equally”

Binary attribute Relative attribute

Page 26: Interactive Image Search with Attributes

Is formal?

= formal wear for a conference? OR

= formal wear for a wedding?

Context

Imprecision of Attributes

Page 27: Interactive Image Search with Attributes

Is blue or green?

English: “blue”

Russian: “neither” (“голубой” vs. “синий”)

Japanese: “both”(“青” = blue and green)

Cultural

Imprecision of Attributes

Page 28: Interactive Image Search with Attributes

Idea: Learn User-Specific Attributes

• Treat learning perceived attributes as an adaptation problem

• Adapt generic attribute model with minimal user-specific labeled examples

Standard

approach:Vote on labels

Our idea:

“formal”

“not formal”

“formal”

“not

formal”

“formal”

“not formal”

Crowd

User

Kovashka and Grauman, ICCV 2013

Page 29: Interactive Image Search with Attributes

• Adapting binary attribute classifiers:

J. Yang et al. ICDM 2007.

Given user-labeled data

and generic model ,

learn adapted model ,

Learning Adapted Attributes

Page 30: Interactive Image Search with Attributes

Learning Adapted Attributes

“formal”

“not formal”

“formal”

“not formal”Generic boundary

Adapted boundary

Page 31: Interactive Image Search with Attributes

Learning Adapted Attributes

User-exclusive boundary

Adapted boundary

“formal”

“not formal”Generic boundary

Page 32: Interactive Image Search with Attributes

Adapted Attribute Accuracy

• Result over 3 datasets, 32 attributes, and 75 total users

• Our user-adaptive method most accurately captures perceived attributes

Page 33: Interactive Image Search with Attributes

0

10

20

30

40

50

60

70

Shoes-Binary SUN

generic generic+

user-exclusive user-adaptive

Mat

ch r

ate

Personalizing Image Searchwith Adapted Attributes

“white shiny heels”

“shinier than ”

Page 34: Interactive Image Search with Attributes

Attribute Model Spectrum

Generic ? User-adaptive

No personalization Personalized to each user

Robustness to noise via majority vote

No robustness to noise

Assumes all users have thesame notion of the attribute

Assumes each user has aunique notion of the attribute

Page 35: Interactive Image Search with Attributes

Problem: User-Specific Extreme

• Different groups of users might subscribe to different shades of meaning of an attribute term

• How can we discover these shades automatically?

Is

fashionable?

Yes

No

No

Page 36: Interactive Image Search with Attributes

Our Idea: Discovering Shades of Attributes

• Discover “schools of thought” among users based on latent factors behind their use of attribute terms

• Allows discovery of the attribute’s “shades of meaning”

Shade 2

Shade 3

Shade 1

Fashionable shoes

Page 37: Interactive Image Search with Attributes

Approach: Recovering Latent Factors

• Given: a partially observed attribute-specific label matrix

• Want to recover its latent factors via matrix factorization

• Gives a representation of each user

Image 1 Image 2 Image 3 Image 4

Annotator 1 1 ? 0 ?

Annotator 2 ? 1 0 ?

Annotator 3 1 ? ? 0

Annotator 4 0 ? 1 ?

Annotator 5 ? ? 1 1

Annotator 6 ? 0 ? 1

Annotator 7 1 0 1 ?

Factor 1 (toe?)

Factor 2(heel?)

Annotator 1 0.85 0.12

Annotator 2 0.72 0.21

Annotator 3 0.91 0.17

Annotator 4 0.07 0.95

Annotator 5 0.50 0.92

Annotator 6 0.15 0.75

Annotator 7 0.45 0.50

Is the attribute “open” present in the image?

Page 38: Interactive Image Search with Attributes

Approach: Discovering Shades

• Cluster users in the space of latent factor representations

• Use K-means; select K automatically

• Gives a representation of each shade for this attribute

Factor 1 (toe?)

Factor 2(heel?)

Annotator 1 0.85 0.12

Annotator 2 0.72 0.21

Annotator 3 0.91 0.17

Annotator 4 0.07 0.95

Annotator 5 0.50 0.92

Annotator 6 0.15 0.75

Annotator 7 0.45 0.50

Shade 1

Shade 2

Attribute: “open”

Page 39: Interactive Image Search with Attributes

Results: Discovered Shades

COMFORTABLE

Page 40: Interactive Image Search with Attributes

Results: Discovered Shades

OPEN AREA

Page 41: Interactive Image Search with Attributes

Conclusion

• New language-based interaction using comparisons boosts search accuracy

• Informative and perceptually precise feedback

• Exploits interplay between vision and language