Modeling visual clutter using proto-objects

TECH TALK @ SHUTTERSTOCK

Presenter: Chen-Ping Yu, PhD Candidate

Research Advisors: Dr. Dimitris Samaras (computer science), Dr. Greg Zelinsky (psychology)

Department of Computer Science, Stony Brook University

February 5, 2014

• Visual search
  • Examples

• Visual clutter
  • Models
  • Proto-objects

• Parametric proto-object segmentation
  • Superpixels
  • Graph and clustering

• Data

• Experiment and results

• Conclusion

AGENDA

• Visual search
  • Ubiquitous; happens every day.
  • Finding your car in a parking lot, finding your keys on a cluttered desk, etc.

• Modeling visual search performance
  • Can we predict how easy or hard a search task is?
  • Helps in advertisement design and item placement (e.g. shelf organization for supermarkets and electronics stores).

• Attributes that affect visual search performance
  • The similarity of the target to the distractor items (Wolfe, 1994, 1998).
  • The similarity of the distractors to each other (Duncan & Humphreys, 1989).
  • Set size: the number of items in an image (Wolfe, 1998).

VISUAL SEARCH

• Example: find the target patch in the query image

VISUAL SEARCH

Target patch

Source: M. Asher, D. Tolhurst, T. Troscianko and I. Gilchrist, “Regional effects of clutter on human target detection performance”, Journal of Vision, 2013

VISUAL SEARCH

• Another example

VISUAL SEARCH

Target patch

VISUAL SEARCH

• Set size effect example

VISUAL SEARCH

Target patch

Source: M. Neider and G. Zelinsky, “Cutting through the clutter: searching for targets in evolving complex scenes,” Journal of Vision, 2011.

VISUAL SEARCH

VISUAL CLUTTER

• Visual clutter
  • In general, it is a ‘‘confused collection’’ or a ‘‘crowded disorderly state’’.
  • Alternatively, it is the state in which excess items, or their representation or organization, lead to a degradation of performance at some task (Rosenholtz et al. 2007).

VISUAL CLUTTER

• Set size effect
  • Set size: the number of items/objects in an image.
  • Visual search task performance degrades as more objects are added to the display, e.g. looking for a particular building in a rural setting vs. in an urban setting (Neider et al. 2008, 2011).
  • The number of objects is proportional to the level of clutter.

• Set size in the real world
  • However, most “objects” in real-world scenes are not visually countable:
    • grass, rocks, patches of texture, shadows, etc.

• Alternative approach
  • Analysis in the feature space

VISUAL CLUTTER

Both contain 24 objects!

What are the objects in these scenes? How do the scenes rank by clutter?

CLUTTER MODELS

Segmenting objects is difficult, therefore:

• Edge density model (Mack et al. 2004)
  • Counts the edge pixels in a Canny edge-detected image (r = 0.83).
  • The result is very sensitive to the Canny edge detector's settings, e.g. smoothing and thresholding.
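The edge density idea can be sketched in a few lines. This is a minimal illustration only: a plain central-difference gradient with a single threshold stands in for Canny (no smoothing, non-maximum suppression, or hysteresis), and the function name, the `threshold` parameter, and the normalization by pixel count are assumptions, not details from Mack et al.

```python
def edge_density(gray, threshold=0.2):
    """Fraction of interior pixels whose gradient magnitude exceeds threshold.

    gray: 2D list of floats in [0, 1].
    NOTE: a simple gradient threshold is used here as a stand-in for Canny.
    """
    rows, cols = len(gray), len(gray[0])
    edge_pixels = 0
    total = 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences in x and y.
            gx = (gray[r][c + 1] - gray[r][c - 1]) / 2.0
            gy = (gray[r + 1][c] - gray[r - 1][c]) / 2.0
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edge_pixels += 1
            total += 1
    return edge_pixels / total if total else 0.0


# Toy inputs (hypothetical): a uniform image and a vertical step edge.
flat = [[0.5] * 6 for _ in range(6)]
step = [[0.0] * 3 + [1.0] * 3 for _ in range(6)]

density_flat = edge_density(flat)
density_step = edge_density(step)
density_step_strict = edge_density(step, threshold=0.6)
```

Raising the threshold from 0.2 to 0.6 makes the step edge disappear from the count, which is the kind of parameter sensitivity the slide points out.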

Top row: input images

Bottom row: edge density

CLUTTER MODELS

• Feature congestion model (Rosenholtz et al. 2007)
  • Compute the feature variances of color, luminance, and orientation.
  • Build a 3D ellipsoid from the feature variances; the volume of the ellipsoid is the clutter measure for that image.
  • State of the art, widely used as the gold standard for comparison (r = 0.75).
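To make the "volume of the ellipsoid" idea concrete, here is a rough sketch that follows only the slide's description. It assumes one global variance per feature and uses the standard deviations as semi-axes; the published model is more involved, and the function name and inputs are hypothetical.

```python
import math
from statistics import pvariance


def ellipsoid_volume_clutter(color, luminance, orientation):
    """Volume of an axis-aligned ellipsoid whose semi-axes are the
    standard deviations of the three feature samples.

    ASSUMPTION: one global variance per feature; this only illustrates
    the volume-as-clutter idea, not the full feature congestion model.
    """
    semi_axes = [math.sqrt(pvariance(values))
                 for values in (color, luminance, orientation)]
    return (4.0 / 3.0) * math.pi * semi_axes[0] * semi_axes[1] * semi_axes[2]


# Hypothetical feature samples with variances 1, 4, and 9.
volume = ellipsoid_volume_clutter([-1, 1], [-2, 2], [-3, 3])

# A feature with no variation collapses the ellipsoid to zero volume.
flat_volume = ellipsoid_volume_clutter([5, 5], [-2, 2], [-3, 3])
```

With semi-axes 1, 2, and 3 the volume is 8π; more feature variation in any dimension gives a larger clutter value.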

Left: input. Right: feature variance ellipses. (25 weather and US map dataset)

CLUTTER MODELS

• Power law model (Bravo et al. 2008)
  • Uses Felzenszwalb’s graph-based method to segment the input image (r = 0.62).

* Left images: from Wischnewski et al. 2010; Right image: from Bravo et al. 2008 (24 objects)

PROTO-OBJECTS

• Direct modeling of set size: proto-objects
  • Proto-objects are groupings of similar, spatially nearby low-level features, formed by the visual system before focused attention (Rensink 1997, 2000).
  • Focused attention then acts as a ‘‘hand’’ that grabs related proto-objects and binds them into a true, stable object.

• Directly related to set size.

• Better representation of set size than “objects”.

Proto-objects as color blobs

• Our clutter model
  • Quantify set size using the number of proto-objects instead of objects.
  • Segment proto-objects by clustering superpixels.

PROTO-OBJECT SEGMENTATION

Left to right: input image, superpixels, proto-objects.

Images from left to right: input image, mean-shift, graph-based, turbopixel, normalized-cut.

SUPERPIXEL SEGMENTATION

• Superpixel segmentation
  • Over-segments an image into regions of similar pixels while preserving boundaries.
  • As a pre-processing step, it can reduce the need to find boundaries.
  • Can provide region statistics.

• Superpixel graph
  • Neighboring superpixels are connected to form a graph structure.
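One way to build such a graph, sketched under assumptions: the superpixel segmentation is available as a 2D label map (one integer label per pixel), and two superpixels count as neighbors when they touch under 4-connectivity. The function name and the toy label map are hypothetical.

```python
def superpixel_adjacency(labels):
    """Return the set of undirected edges (a, b), a < b, between
    superpixels that share a boundary (4-connectivity).

    labels: 2D list of integer superpixel labels, one per pixel.
    """
    rows, cols = len(labels), len(labels[0])
    edges = set()
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            # Compare with the right and bottom neighbors only;
            # that visits every adjacent pixel pair exactly once.
            if c + 1 < cols and labels[r][c + 1] != a:
                b = labels[r][c + 1]
                edges.add((min(a, b), max(a, b)))
            if r + 1 < rows and labels[r + 1][c] != a:
                b = labels[r + 1][c]
                edges.add((min(a, b), max(a, b)))
    return edges


# Toy label map (hypothetical): three superpixels side by side.
label_map = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
]
graph_edges = superpixel_adjacency(label_map)
```

Superpixels 0 and 2 never touch, so the graph contains only the edges (0, 1) and (1, 2).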


[Figure: input image, SLIC superpixels (k = 1000), and the superpixel graph; each graph edge carries a weight between 0 and 1, e.g. 0.11, 0.77, 0.15.]

Compute a similarity threshold and remove the edges whose weights are higher than the threshold.

Merge each connected cluster of superpixels into a proto-object.

Within-cluster edge

Between-cluster edge (identify, then remove)
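The threshold-then-merge step can be sketched with a union-find structure. This assumes edge weights are dissimilarities (higher means less similar), as the slide's "remove edges that are higher than the threshold" suggests; the toy chain graph, the hand-picked threshold of 0.5, and the function name are illustrative only.

```python
def merge_superpixels(num_superpixels, edges, threshold):
    """Label each superpixel with its proto-object after dropping
    between-cluster edges.

    edges: list of (i, j, weight) tuples; weights above `threshold`
    are treated as between-cluster edges and ignored.
    """
    parent = list(range(num_superpixels))

    def find(x):
        # Find the root of x, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, weight in edges:
        if weight <= threshold:  # within-cluster edge: keep and merge
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i
    return [find(i) for i in range(num_superpixels)]


# Toy chain of six superpixels (hypothetical topology and weights).
edges = [(0, 1, 0.11), (1, 2, 0.77), (2, 3, 0.15), (3, 4, 0.86), (4, 5, 0.28)]
labels = merge_superpixels(6, edges, threshold=0.5)
num_proto_objects = len(set(labels))
```

Here the two heavy edges (0.77 and 0.86) are cut, leaving three connected clusters, i.e. three proto-objects.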

PARAMETRIC PROTO-OBJECT SEGMENTATION

Intensity

Color

Orientation

Weibull-Mixture Model (WMM):

Similarity Threshold – the crossing point between the two components:
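The slide's formulas did not survive in this text, so the following is only a sketch of the general idea under stated assumptions: a two-component mixture of standard two-parameter Weibull densities with already-fitted parameters, and a threshold found by bisection where the two weighted component densities are equal. All parameter values are hypothetical, and the fitting step is not shown.

```python
import math


def weibull_pdf(x, shape, scale):
    """Standard two-parameter Weibull density."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def crossing_point(w1, k1, s1, w2, k2, s2, lo, hi, iters=100):
    """Bisection for x in [lo, hi] where the weighted components cross."""
    def diff(x):
        return w1 * weibull_pdf(x, k1, s1) - w2 * weibull_pdf(x, k2, s2)

    if diff(lo) * diff(hi) > 0:
        raise ValueError("no sign change in [lo, hi]")
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


# Hypothetical fitted parameters: component 1 models small (within-cluster)
# edge weights, component 2 models large (between-cluster) edge weights.
threshold = crossing_point(0.6, 2.0, 0.2, 0.4, 4.0, 0.7, lo=0.2, hi=0.7)
```

Edges with weights above `threshold` would then be treated as between-cluster edges.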

PARAMETRIC PROTO-OBJECT SEGMENTATION

• Clutter model
  • Count the resulting number of proto-objects.
  • Divide the count by the initial number of superpixels to get a scale-invariant, normalized clutter measure.
  • The clutter measure lies between 0 and 1; the larger the value, the more cluttered the image.
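As a small worked example of the normalization (the function name and the counts are hypothetical):

```python
def clutter_measure(num_proto_objects, num_superpixels):
    """Normalized clutter: proto-object count over initial superpixel count."""
    if num_superpixels <= 0:
        raise ValueError("num_superpixels must be positive")
    if not 0 <= num_proto_objects <= num_superpixels:
        raise ValueError("proto-object count must lie in [0, num_superpixels]")
    return num_proto_objects / num_superpixels


# E.g. 1000 initial superpixels merged into 250 proto-objects.
score = clutter_measure(250, 1000)
```

Because merging can only reduce the number of regions, the ratio never exceeds 1.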

DATA

• 90 images from the SUN dataset
  • 800x600
  • Real-world images
  • 6 groups with 15 images each (total = 90 images)
    • Group 1: 1-10 objects
    • Group 2: 11-20 objects
    • …
    • Group 6: 51-60 objects

• Ranked by 15 human subjects, aged 18 to 30, from least to most cluttered.
  • Average correlation over all pairs of subjects: R = 0.6919 (p < 0.001)
  • The median ranked position of each image is used as the ground truth.
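The ground-truth construction and a rank correlation against it can be sketched as follows. The slides do not say which correlation coefficient was used; Spearman's rank correlation is assumed here, and the rankings and model scores are made-up toy data.

```python
from statistics import median


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def median_rank_ground_truth(subject_rankings):
    """subject_rankings[s][i] is the rank position subject s gave image i."""
    num_images = len(subject_rankings[0])
    return [median(r[i] for r in subject_rankings) for i in range(num_images)]


# Toy data (hypothetical): 3 subjects ranking 5 images.
subject_rankings = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]
ground_truth = median_rank_ground_truth(subject_rankings)
model_scores = [0.05, 0.10, 0.20, 0.40, 0.35]
rho = spearman(model_scores, ground_truth)
```

The same `spearman` function could be applied to each pair of subjects and averaged to get an inter-subject agreement figure.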

RESULTS

• Results
  • Achieved R = 0.7557 (p < 0.001) against the human-rated ground-truth clutter ordering.
  • 10-fold cross-validation gives an average test-set correlation of R = 0.6808.

**latest results:

RESULTS

Clutter measure: 0.1713



CONCLUSION

• Applications
  • Image-level feature for image retrieval.
  • Image-to-painting style transformation.
  • Quantitative analysis of advertisements, user interfaces, and item organization.

• Next steps
  • Apply our clutter model to target search task performance.
  • Explore proto-objects further for automatic object formation and detection.
  • Eye-movement related projects.

• Related papers
  • Chen-Ping Yu, Wen-Yu Hua, Dimitris Samaras, and Gregory Zelinsky, “Modeling clutter perception using parametric proto-object partitioning,” Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, Dec 2013.
  • Chen-Ping Yu, Dimitris Samaras, and Gregory Zelinsky, “Modeling visual clutter perception using proto-object segmentation,” Journal of Vision (to appear), 2014.

• For more information, please visit my project webpage: http://mysbfiles.stonybrook.edu/~cheyu/projects/proto-objects.html

• For full citation information for this presentation, please refer to the NIPS 2013 paper.
