Can you see it? Annotating Image Regions based on Users' Gaze Information
Ansgar Scherp, Tina Walber, Steffen Staab
Technical University of Vienna, October 2012

Upload: ansgar-scherp

Post on 26-May-2015


DESCRIPTION

Presentation on eye-tracking-based annotation of image regions that I gave in Vienna on Oct 19, 2012. Download the original PowerPoint file to enjoy all animations. For the papers, please refer to: http://www.ansgarscherp.net/publications

TRANSCRIPT

Page 1: Can you see it? Annotating Image Regions based on Users' Gaze Information

Can you see it? Annotating Image Regions based on Users' Gaze Information

Ansgar Scherp, Tina Walber, Steffen Staab

Technical University of Vienna, October 2012

Page 2

A. Scherp, T. Walber, S. Staab – Identifying Objects in Images Slide 2 of 40

Idea: Benefiting from Eye-Tracking Information for Image Region Annotation

Page 3

Eye-tracking Hardware

X60

Page 4

Recorded Data

Saccades and fixations

Page 5

Scenario: Image Tagging

Find specific objects in images by analyzing the user’s gaze path

Example tags: sidewalk, car, store, tree, people, girl

Page 6

Investigation in 3 Steps

Gaze + Manual Regions

Interactive Tagging Application

Gaze + Automatic Segments

3

2

1

Page 7

1st Step

1. What is the best fixation measure to find the correct image region given a specific tag?

2. Can we differentiate two regions in the same image?

Page 8

3 Steps Conducted by Users

Look at the red blinking dot
Decide whether the tag can be seen (“y” or “n”)

Page 9

Dataset: LabelMe community images

Manually drawn polygons; regions annotated with tags

182,657 images (August 2010)

High-quality segmentation and annotation

Used as ground truth

http://labelme.csail.mit.edu/Release3.0/

Page 10

Dataset (continued)

Page 11

Experiment Images and Tags

Randomly selected images from LabelMe
Each image: at least two regions, 1000 × 700 px

Created three sets of 51 images each Assigned a tag to each image

Keep subjects concentrated during experiment

Tags are either “true” or “false”: “true” = the object described by the tag can be seen; “false” = the object cannot be seen in the image

Page 12

Subjects & Experiment System

30 subjects: 21 male, 9 female (age: 22-45, Ø = 28.7); undergrads (10), PhD (17), office clerks (3)

Experiment system: simple web page in Internet Explorer; standard notebook, resolution 1680×1050; Tobii X60 eye-tracker (60 Hz, 0.5° accuracy)

Page 13

Conducting the Experiment

Each user looked at 51 tag-image-pairs; the first tag-image-pair was dismissed

94.6% correct answers Roughly equal for true/false tags ~2.8s avg. until decision (true), ~3.8s avg. (false)

Users felt comfortable during the experiment (avg.: 4.4, SD: 0.75)

The eye tracker did not noticeably influence comfort

Page 14

Pre-processing of Eye-Tracking Data

Obtained 799 gaze paths from 30 users where the image has a “true” tag assigned and the users gave correct answers

Fixation extraction: Tobii Studio’s velocity & distance thresholds (a fixation is a focus on a particular point on screen)

One fixation inside or near the correct region required; 656 gaze paths fulfill this requirement (82%)
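The fixation-extraction step can be sketched as a simple velocity-threshold (I-VT) filter: consecutive gaze samples below a velocity threshold are grouped into a fixation, and too-short groups are discarded. This is an illustrative reimplementation, not Tobii Studio's actual filter; the sample format, function names, and threshold values are assumptions.

```python
# Minimal I-VT-style fixation extraction sketch (not Tobii Studio's code).
# samples: list of (t, x, y) gaze samples at a fixed rate (e.g. 60 Hz).

def extract_fixations(samples, max_velocity=50.0, min_duration=0.1):
    """Group gaze samples into fixations.

    max_velocity: px per time unit below which a sample continues a fixation.
    min_duration: minimum fixation length; shorter groups are dropped.
    Returns a list of (t_start, t_end, cx, cy) fixation tuples.
    """
    fixations, current = [], []
    for t, x, y in samples:
        if current:
            t0, x0, y0 = current[-1]
            dt = t - t0
            v = ((x - x0) ** 2 + (y - y0) ** 2) ** 0.5 / dt if dt > 0 else 0.0
            if v > max_velocity:            # saccade: close the current group
                _flush(current, fixations, min_duration)
                current = []
        current.append((t, x, y))
    _flush(current, fixations, min_duration)
    return fixations

def _flush(group, fixations, min_duration):
    """Emit the group as a fixation (centroid position) if long enough."""
    if not group:
        return
    t_start, t_end = group[0][0], group[-1][0]
    if t_end - t_start >= min_duration:
        cx = sum(p[1] for p in group) / len(group)
        cy = sum(p[2] for p in group) / len(group)
        fixations.append((t_start, t_end, cx, cy))
```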

Page 15

Analysis of Gaze Fixations (1)

Applied 13 fixation measures on the 656 paths (2 new, 7 standard Tobii, 4 from the literature)

A fixation measure is a function on users’ gaze paths, calculated for each image region over all users viewing the same tag-image-pair

Page 16

Considered Fixation Measures

Nr | Name                    | Favorite region r                        | Origin
 1 | firstFixation           | No. of fixations before 1st on r         | Tobii
 2 | secondFixation          | No. of fixations before 2nd on r         | [13]
 3 | fixationsAfter          | No. of fixations after last on r         | [4]
 4 | fixationsBeforeDecision | fixationsAfter, but before decision      | New
 5 | fixationsAfterDecision  | fixationsBeforeDecision and after        | New
 6 | fixationDuration        | Total duration of all fixations on r     | Tobii
 7 | firstFixationDuration   | Duration of first fixation on r          | Tobii
 8 | lastFixationDuration    | Duration of last fixation on r           | [11]
 9 | fixationCount           | Number of fixations on r                 | Tobii
10 | maxVisitDuration        | Max time first fixation until outside r  | Tobii
11 | meanVisitDuration       | Mean time first fixation until outside r | Tobii
12 | visitCount              | No. of fixations until outside r         | Tobii
13 | saccLength              | Saccade length, before fixation on r     | [6]
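Two of the tabled measures, fixationDuration (6) and fixationCount (9), can be sketched for axis-aligned rectangular regions. The study uses the LabelMe polygons instead of rectangles, and the data formats here are illustrative assumptions, not the authors' implementation.

```python
# Sketch of two fixation measures from the table, for rectangular regions.
# A fixation is (t_start, t_end, x, y); a region is (x0, y0, x1, y1).

def in_region(region, x, y):
    """True if point (x, y) lies inside the axis-aligned rectangle."""
    x0, y0, x1, y1 = region
    return x0 <= x <= x1 and y0 <= y <= y1

def fixation_count(fixations, region):
    """Measure 9: number of fixations on region r."""
    return sum(1 for (t0, t1, x, y) in fixations if in_region(region, x, y))

def fixation_duration(fixations, region):
    """Measure 6: total duration of all fixations on region r."""
    return sum(t1 - t0 for (t0, t1, x, y) in fixations
               if in_region(region, x, y))
```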

Page 17

Analysis of Gaze Fixations (2)

For every image region (b), the fixation measure is calculated over all gaze paths (c)

Results are summed up per region; regions are ordered according to the fixation measure

If favorite region (d) and tag (a) match, the result is a true positive (tp), otherwise a false positive (fp)
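The evaluation loop described above can be sketched as follows: per tag-image pair, sum a measure over all users' gaze paths, pick the highest-scoring (favorite) region, and count it against the ground-truth region. Data structures and function names are assumptions for illustration, not the authors' code.

```python
# Sketch of favorite-region selection and precision over tag-image pairs.

def favorite_region(regions, gaze_paths, measure):
    """regions: dict name -> region; measure(fixations, region) -> score.
    Scores are summed over all users' gaze paths; the highest sum wins."""
    scores = {name: sum(measure(path, reg) for path in gaze_paths)
              for name, reg in regions.items()}
    return max(scores, key=scores.get)

def precision(pairs, measure):
    """pairs: list of (regions, gaze_paths, true_region_name).
    A pair is a tp if the favorite region matches the ground truth."""
    tp = sum(1 for regions, paths, truth in pairs
             if favorite_region(regions, paths, measure) == truth)
    return tp / len(pairs)
```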

Page 18

Precision per Fixation Measure

[Chart: precision P per fixation measure; y-axis: sum of tp and fp assignments. Best-performing measures: meanVisitDuration, fixationsBeforeDecision, lastFixationDuration, fixationDuration]

Page 19

Adding Boundaries and Weights

Take eye-tracker inaccuracies into account: extend region boundaries by 13 pixels

Larger regions are more likely to be fixated: give extra weight to regions < 5% of image size

lastFixationDuration increases to P = 0.65

Page 20

Weighted Measure Function

Measure function f_m(r) on region r, with m = 1…13

Relative region size: s_r

Threshold when weighting is applied: T

Maximum weighting value: M
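The listed parameters (relative size s_r, threshold T, maximum weight M) suggest a weighting function of roughly the following shape: regions smaller than T get their raw measure boosted, up to a factor M for vanishingly small regions. The exact formula is on the (graphical) slide and is not reproduced here; the linear ramp below is only one plausible form, an assumption for illustration.

```python
# Hypothetical weighted-measure sketch; the paper's exact formula may differ.

def weighted_measure(f_m, s_r, T=0.05, M=2.0):
    """f_m: raw measure value for region r.
    s_r: relative region size in [0, 1].
    T: size threshold below which weighting applies (slide: 5%).
    M: maximum weighting value, reached as s_r approaches 0."""
    if s_r >= T:
        return f_m                                  # large region: unweighted
    weight = 1.0 + (M - 1.0) * (T - s_r) / T        # ramps from 1 at T to M at 0
    return f_m * weight
```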

Page 21

Weighted Measure Function

Page 22

Examples: Tag-Region-Assignments

Page 23

Comparison with Baselines

Naïve baseline: largest region r is favorite
Saliency baseline: Itti et al., TPAMI, 20(11), Nov 1998
Random baseline: randomly select favorite r

Gaze / Gaze* significantly better (all tests: p < 0.0015); least significant result: χ²(1, N = 124) = 10.723

Page 24

Effect of Gaze Path Aggregation

[Chart: precision P for Gaze* over the number of aggregated gaze paths; aggregation improves precision by +46% and +7%]

Page 25

Research Questions

1. What is the best fixation measure to find the correct image region given a specific tag?

2. Can we differentiate two regions in the same image?

lastFixationDuration with precision of 65%

Page 26

Experiment Images and Tags

Randomly selected images from LabelMe; images contained at least two tagged regions; organized in three sets of 51 images each

Assigned a tag to each image

Tags are either “true” or “false”

Two of the image sets share the same images Thus, these images have two tags each

Page 27

Differentiate Two Objects

Use the first and second tag set to identify different objects in the same images

16 images (of our 51) have two “true” tags; 6 images had both correct regions identified

Proportion of 38%

Average precision for single object is 63%

Correct tag assignment for two images: 40%

Page 28

Correctly Differentiated Objects

Page 29

Research Questions

1. What is the best fixation measure to find the correct image region given a specific tag?

2. Can we differentiate two regions in the same image?

lastFixationDuration with precision of 65%

Accuracy of 38%

Page 30

Investigation in 3 Steps

Gaze + Manual Regions

Interactive Tagging Application

Gaze + Automatic Segments

3

2

1

Page 31

So far: tag (“car”) + image + gaze paths = annotated image region

For 63% of the images, we can identify the correct region.

T. Walber, A. Scherp, and S. Staab: Identifying Objects in Images from Analyzing the Users' Gaze Movements for Provided Tags, MMM, Klagenfurt, Austria, 2012.

Page 32

Now: tag (“car”) + image + gaze paths = annotated image region, with automatic segmentation; LabelMe segments only used as ground truth

T. Walber, A. Scherp, and S. Staab: Can you see it? Two Novel Eye-Tracking-Based Measures for Assigning Tags to Image Regions, MMM, Huangshan, China, 2013.

Page 33

2nd Step: New Measure

Automatic segmentation measure; Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500)

Berkeley’s gPb-owt-ucm algorithm: segmentation on different hierarchy levels; combination of contour detection and segmentation (Oriented Watershed Transform and Ultrametric Contour Map)

P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. IEEE TPAMI, 33(5):898–916, May 2011.
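The hierarchy-level idea can be illustrated without the real gPb-owt-ucm implementation: an Ultrametric Contour Map assigns every pixel a boundary strength, and thresholding it at increasing k merges more and more regions into larger segments. The pure-Python flood fill below is a toy sketch of that thresholding step; the array format and function name are assumptions.

```python
# Toy sketch: count segments in an Ultrametric Contour Map at level k.
# ucm: 2-D list of boundary strengths in [0, 1]; pixels with ucm <= k are
# segment interior, and 4-connected interior components are the segments.

def segments_at_level(ucm, k):
    """Return the number of segments at hierarchy level k."""
    h, w = len(ucm), len(ucm[0])
    seen = [[False] * w for _ in range(h)]
    n = 0
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx] or ucm[sy][sx] > k:
                continue
            n += 1                              # new segment: flood-fill it
            stack = [(sy, sx)]
            seen[sy][sx] = True
            while stack:
                y, x = stack.pop()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and ucm[ny][nx] <= k):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return n
```

Raising k weakens the boundary criterion, so neighboring segments merge, which is exactly why the slides can speak of segmentations "on different hierarchy levels" with k = 0 … 0.4.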

Page 34

Segmentation Example Segmentations with different k = 0 … 0.4

Page 35

Automatic Segments + Gaze Conducted same computations as before But on the automatically extracted segments

Page 36

Results for different k’s: P/R/F

[Chart: precision over k for the eye-tracking-based automatic segmentation measure vs. the golden sections rule baseline]

Page 37

Baseline: Golden Sections Rule

(a + b) / a = a / b
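The golden-section relation (a + b)/a = a/b defines the ratio φ = (1 + √5)/2 ≈ 1.618, which splits each image axis into two golden-section points. A baseline built on this rule could, for example, favor the region whose center lies nearest one of the four resulting points; since the slides do not spell out the scoring, the nearest-center rule below is an assumption for illustration.

```python
import math

# PHI solves (a + b) / a = a / b.
PHI = (1 + math.sqrt(5)) / 2

def golden_points(width, height):
    """The four golden-section points of an image: each axis is divided so
    that the longer part relates to the shorter part as PHI : 1."""
    gx = (width / PHI, width - width / PHI)
    gy = (height / PHI, height - height / PHI)
    return [(x, y) for x in gx for y in gy]

def golden_baseline(regions, width, height):
    """Hypothetical baseline: favor the region whose center lies nearest
    to any golden-section point. regions: dict name -> (x0, y0, x1, y1)."""
    points = golden_points(width, height)
    def dist_to_nearest(reg):
        x0, y0, x1, y1 = reg
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        return min(math.hypot(cx - px, cy - py) for px, py in points)
    return min(regions, key=lambda name: dist_to_nearest(regions[name]))
```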

Page 38

Best Precision & Best F-measure

Eye-tracking-based automatic segmentation measure significantly outperforms golden sections baseline

Also shown: eye-tracking-based heatmap measure (no automatic segmentation)

Page 39

Investigation in 3 Steps

Gaze + Manual Regions

Interactive Tagging Application

Gaze + Automatic Segments

3

2

1

Page 40

3rd Step: Interactive Application

► tree ; car ; house ; girl

Page 41

APPENDIX

Page 42

Influence of Red Dot

First 5 fixations, over all subjects and all images

Page 43

Experiment Data Cleaning

Manually replaced images with:

a) Tags that are incomprehensible, require expert knowledge, or are nonsense

b) Tag refers to multiple regions, but not all are drawn into the image (e.g., bicycle)

c) Obstructed objects (bicycle behind a car)

d) A “false” tag actually refers to a visible part of the image and thus was a “true” tag

Page 44

How to Compute P/R?

R_fav is calculated from the automatic segmentation measure and the baseline measure