

Source: olivalab.mit.edu/Papers/CogSci2011_EhingerOliva_poster.pdf

[Figure: View agreement vs. scene area. x-axis: Scene area (m²); y-axis: Agreement (s.e. of view angle); R² = 0.29. Legend: indoor, urban, natural.]

Canonical views of scenes depend on the shape of the space

Krista A. Ehinger & Aude Oliva

People show a preference for “canonical” views of objects in recognition, depiction, and imagery (Palmer, Rosch, & Chase, 1981).

Are there canonical views of scenes? What determines the canonical view of a scene?

1084 panoramic photos, each shown to 10 different workers on Amazon Mechanical Turk.

On each trial, workers performed two tasks:

1. Name the location shown in the image (e.g., “classroom”)

2. Rotate the image in a 360-degree viewer to show the “best view” of the location

Task window. The image appeared in an interactive viewer: users could rotate the view shown to simulate looking around in the scene.

[Figure: ROC curves. x-axis: False alarm rate; y-axis: Detection rate. Area map (AUC = 0.71); Navigational map (AUC = 0.59).]
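The poster reports AUC values but does not say how the ROC curves were computed. As a minimal sketch of one common approach, the snippet below treats each direction bin as one case, labels the bins observers chose as positives, and computes AUC as the probability that a chosen bin has a higher map value than an unchosen one (ties count half). The function name `map_auc`, the binning, and the toy values are all assumptions made for illustration.

```python
def map_auc(map_values, selected_bins):
    """Area under the ROC curve for a per-direction map.

    map_values: one score per direction bin (for example 360 one-degree bins).
    selected_bins: indices of the bins observers chose as the "best view".
    Both the binning and this scoring rule are assumptions, not the poster's
    documented procedure.
    """
    chosen = set(selected_bins)
    positives = [v for i, v in enumerate(map_values) if i in chosen]
    negatives = [v for i, v in enumerate(map_values) if i not in chosen]
    if not positives or not negatives:
        raise ValueError("need at least one chosen and one unchosen bin")

    # Rank-based AUC: fraction of (chosen, unchosen) pairs ordered correctly.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy map over four directions (hypothetical values).
toy_map = [0.1, 0.2, 0.9, 0.8]
auc_good = map_auc(toy_map, [2, 3])  # observers chose the high-valued bins
auc_bad = map_auc(toy_map, [0, 1])   # observers chose the low-valued bins
```

Under this definition an AUC of 0.5 is chance level, and a value below 0.5 means the map ranked the chosen directions lower than the unchosen ones.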

Introduction

Experiment

Results

The boundaries of the space were obtained by outlining the ground plane and calculating the area around the camera. Navigational paths were marked by Mechanical Turk workers.

The area map represents the percent of the space visible in each direction. The navigational map represents navigability in each direction.
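The area map described above can be sketched as follows, under several assumptions that are not stated on the poster: the boundary is given as the distance from the camera to the edge of the ground plane at evenly spaced angles, each sample is treated as a thin circular wedge with area (1/2) r² Δθ, and a view covers a fixed field of view (90 degrees here, chosen arbitrarily). The name `area_map` and its parameters are hypothetical.

```python
import math


def area_map(radii, fov_deg=90):
    """Percent of the total floor area that falls inside a view centred on
    each direction.

    radii: distance from the camera to the boundary at evenly spaced angles
           (assumed representation of the outlined ground plane).
    fov_deg: width of the view in degrees (assumed; not given on the poster).
    """
    n = len(radii)
    step = 2 * math.pi / n
    # Approximate each angular sample as a thin wedge: area = 0.5 * r^2 * dtheta.
    wedges = [0.5 * r * r * step for r in radii]
    total = sum(wedges)

    # The view spans `half` samples on each side of its centre (2*half + 1 in all).
    half = int(round(fov_deg / 360 * n / 2))
    percentages = []
    for center in range(n):
        visible = sum(wedges[(center + k) % n] for k in range(-half, half + 1))
        percentages.append(100.0 * visible / total)
    return percentages


# A circular space gives the same value in every direction.
round_room = area_map([5.0] * 360)

# A space that extends much farther around direction 100 (hypothetical shape).
long_room = [1.0] * 360
for d in range(90, 111):
    long_room[d] = 10.0
long_map = area_map(long_room)
```

In the second example the map peaks for views centred near direction 100, which is the direction that takes in most of the floor area.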

[Figure: Overhead view of the visible space, showing the camera, walking paths, and views selected by observers.]

Modeling the shape of the space

[Figure: Panoramic image with its area map (% area within view) and navigational map (% navigability).]

Palmer, S., Rosch, E., & Chase, P. (1981). Canonical perspective and the perception of objects. In Attention and Performance IX, Ed. J. Long, A. Baddeley (Hillsdale, NJ: Lawrence Erlbaum), pp. 135-151.

References

[Figure: “Best” views selected by observers. Columns: Panoramic image, Selected views.]

Agreement was generally high: Rayleigh’s test of nonuniformity returned p < 0.01 for 538 images (50% of images) and p < 0.05 for 694 images (64% of images).
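For readers unfamiliar with the Rayleigh test, here is a minimal sketch of how it could be applied to the view angles chosen for one image. The poster does not say which p-value approximation was used; this version uses a closed-form approximation commonly attributed to Zar's Biostatistical Analysis. The function name `rayleigh_test`, the use of degrees, and the example angles are assumptions.

```python
import math


def rayleigh_test(angles_deg):
    """Rayleigh test of nonuniformity for a sample of directions.

    Returns (mean resultant length, approximate p-value). A small p-value
    suggests the chosen view angles cluster around a common direction.
    """
    n = len(angles_deg)
    if n == 0:
        raise ValueError("need at least one angle")

    # Sum the unit vectors pointing in each chosen direction.
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    resultant = math.hypot(c, s)   # length of the summed vector, R
    r_bar = resultant / n          # mean resultant length, between 0 and 1

    # Closed-form approximation to the p-value (assumed choice of formula).
    inner = 1 + 4 * n + 4 * (n * n - resultant * resultant)
    p_value = math.exp(math.sqrt(inner) - (1 + 2 * n))
    return r_bar, min(p_value, 1.0)


# Ten observers who all chose the same direction (hypothetical data).
tight = rayleigh_test([90] * 10)

# Ten observers spread evenly around the circle (hypothetical data).
spread = rayleigh_test([i * 36 for i in range(10)])
```

With ten observers per image, as in the experiment, perfect agreement gives a mean resultant length of 1 and a very small p-value, while evenly spread choices give a mean resultant length near 0 and a p-value near 1.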

[Figure: Navigational model performance vs. scene area. x-axis: Scene area (m²); y-axis: Model AUC; R² = 0.00.]

[Figure: Area model performance vs. scene area. x-axis: Scene area (m²); y-axis: Model AUC; R² = 0.28. Legend: indoor, urban, natural.]

[Figure: Example where both models performed well. Area map AUC = 0.95; Navigational map AUC = 0.96. Axes: % area within view, % navigability.]

[Figure: Example where both models performed poorly. Area map AUC = 0.29; Navigational map AUC = 0.25. Axes: % area within view, % navigability.]

View agreement was highest in small, indoor / man-made spaces and lowest in natural scenes. The area model also performed better in smaller, indoor spaces. The navigational model’s performance was not related to scene area.

Model performance and scene area

There is high agreement on the “best view” of a scene.

The best view of a scene is the one that shows as much of the space as possible, not necessarily the functional view for navigating in that space.

Conclusion

[Figure: Examples (high agreement to low agreement).]