

Visual Berrypicking in Large Image Collections

Thomas Low
Data & Knowledge Engineering Group
Otto von Guericke University Magdeburg

Harald Sack
Hasso Plattner Institute for Software Systems Engineering
Potsdam, Germany

Christian Hentschel
Hasso Plattner Institute for Software Systems Engineering
Potsdam, Germany

Andreas Nürnberger
Data & Knowledge Engineering Group
Otto von Guericke University Magdeburg

Sebastian Stober
Data & Knowledge Engineering Group
Otto von Guericke University Magdeburg

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Copyright is held by the owner/author(s).

NordiCHI ’14, Oct 26-30 2014, Helsinki, Finland
ACM 978-1-4503-2542-4/14/10.
http://dx.doi.org/10.1145/2639189.2670271

Abstract
Exploring image collections using similarity-based two-dimensional maps is an ongoing research area that faces two main challenges: with increasing size of the collection and complexity of the similarity metric, projection accuracy rapidly degrades, and computational costs prevent online map generation. We propose a prototype that creates the impression of panning a large (global) map by aligning inexpensive small maps showing local neighborhoods. By directed hopping from one neighborhood to the next, the user is able to explore the whole image collection. Additionally, the similarity metric can be adapted by weighting image features, so that users benefit from a more informed navigation.

Author Keywords
Interactive Exploration, Image Retrieval, Multi-Dimensional Scaling, Procrustes Analysis

ACM Classification Keywords
H.5.m [Information interfaces and presentation (e.g., HCI)]: Miscellaneous; H.3.3 [Information storage and retrieval]: Information search and retrieval.

Introduction
When exploring large image collections, the user is interested in information about the collection as a whole: what kinds of images are available, how they relate to each other, and whether there are clusters of similar images. In this scenario, similarity-based two-dimensional maps are a suitable means of visualization, because they quickly convey information about the general structure of a collection and help to identify groups of similar images. However, with growing size of the collection and increasing complexity of the underlying similarity metric, three major problems need to be faced:

• projections become increasingly inaccurate

• generating large maps is computationally impractical

• adapting feature weights (which enables personalized views) requires a recalculation of the distance matrix

In order to avoid these problems, we propose to visualize only the set of k-nearest neighbors for a given seed image in a small map. By choosing another image as a seed, the user is able to hop from one neighborhood map to another. Consecutive maps are likely to overlap to some extent and thus can be aligned to create a consistent transition that is, ideally, perceived as panning a large (global) map. This is visualized in Fig. 1. Because of this transition between two maps, users are able to transfer knowledge about the content and the relevance of individual images, accumulated during the exploration process, from one visualization to the next. This allows the user to navigate step by step through the whole collection, which we call visual berrypicking.

Figure 1: Images are projected to a two-dimensional map based on their pairwise similarity. The user hops from one neighborhood map to the next by selecting new seed images (with large border). Maps are aligned to create the impression of panning a large (global) map.

Related Work
In document-based information retrieval, berrypicking describes the user’s behavior during the search process [1]. Instead of a single query, the user performs a series of modified queries in order to find relevant information. In our scenario, choosing a neighboring image as a new seed corresponds to modifying a search query. An overview of different browsing models for image retrieval is given in [4]. Among others, the author discusses dimensionality reduction methods for the presentation of search results when addressing users with only poorly defined information needs. Rubner et al. [6] were among the first to propose multidimensional scaling (MDS [5]) for iterated image search in large collections, a technique that we also follow in this paper. The authors propose a local MDS on the nearest neighbors of a query image. However, in contrast to our approach, consecutive maps are not aligned to each other.

In previous work [8], we reviewed and compared different dimensionality reduction algorithms for the visualization of large music collections. In a user study, MDS was favored by most participants as the best layout algorithm when the collection undergoes changes due to newly added items. Different from the approach proposed here, we pursued a global map approach for MDS. We also suggested using Procrustes analysis [3] to better align newly generated maps with their respective predecessors, which we adopt here as it reduces confusion for users.

Map Generation and Transitions
Visualization of image similarities on a two-dimensional map requires dimensionality reduction of the typically much higher-dimensional input feature space. We used MDS for that purpose. Despite the advantage of being distance preserving, MDS, like any projection into a lower-dimensional space, causes projection errors that increase with the number of dimensions to be reduced and the size of the collection. This also applies to one-dimensional lists, i.e., the visualized order of objects may not appropriately respect the order defined by the metric used. By limiting the number of images used to compute the projection, we reduce the impact of this error, and thus visualizations become more reasonable.
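As a minimal sketch of how such a local map could be computed, the following example selects the k nearest neighbors of a seed and embeds only those images in two dimensions. It is not the prototype's implementation: it assumes plain Euclidean distances on a single feature matrix and substitutes classical (Torgerson) MDS for the MDS variant used in the prototype, and the names `knn_indices`, `classical_mds`, and `local_map` are hypothetical.

```python
import numpy as np


def knn_indices(features, seed, k):
    """Return the seed index followed by the indices of its k nearest neighbors."""
    dists = np.linalg.norm(features - features[seed], axis=1)
    order = np.argsort(dists, kind="stable")
    order = order[order != seed]  # the seed itself is added explicitly below
    return np.concatenate(([seed], order[:k]))


def classical_mds(dist, dims=2):
    """Classical (Torgerson) MDS: embed a distance matrix into `dims` dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dims]  # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def local_map(features, seed, k):
    """Project only the seed and its k nearest neighbors, not the whole collection."""
    ids = knn_indices(features, seed, k)
    sub = features[ids]
    dist = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=-1)
    return ids, classical_mds(dist)


# Assumed toy data: 500 images with 16-dimensional feature vectors.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 16))
ids, coords = local_map(features, seed=0, k=27)
```

The distance matrix here has only (k + 1) × (k + 1) entries, so the cost of one map depends on k rather than on the size of the collection, apart from the neighbor search itself.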

Figure 2: Prototype showing k-nearest neighbors (k = 27) to a given seed image (see top left image) both in a traditional list-based view (left screenshot) and a map-based view (right screenshot). Images in the list view are ordered based on their similarity to the seed image (see numbers in percent below images).

Transitions between consecutive maps are animated with the aim of giving the user the impression of panning a large map representing the collection as a whole. In order to make these transitions as consistent as possible, we align consecutive maps on their common neighbors. We use Procrustes analysis to minimize the sum of squared differences between the positions of the images that remain visible in both maps, using translation, scaling, rotation, and reflection. As a result, the user benefits from continuity that allows them to transfer knowledge from one map to the next, and from more stable navigation directions, and thus is less likely to get lost.
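The alignment step can be illustrated with a short sketch of ordinary Procrustes fitting on the shared images. This is an assumption about how such an alignment might be coded, not the prototype's code: the function name `align_to_previous` and its arguments are hypothetical, and the sketch requires at least two distinct shared points.

```python
import numpy as np


def align_to_previous(prev_xy, new_xy, shared_prev, shared_new):
    """Fit the new map onto the previous one using the images both maps share.

    shared_prev[i] and shared_new[i] are the row indices of the same image
    in prev_xy and new_xy. Returns a transformed copy of new_xy.
    """
    target = prev_xy[shared_prev]
    source = new_xy[shared_new]
    mu_t, mu_s = target.mean(axis=0), source.mean(axis=0)
    t0, s0 = target - mu_t, source - mu_s  # translation: remove centroids
    # Orthogonal Procrustes: the orthogonal matrix that best maps s0 onto t0.
    # No determinant constraint is applied, so reflections are allowed.
    u, sing, vt = np.linalg.svd(s0.T @ t0)
    rot = u @ vt
    scale = sing.sum() / (s0 ** 2).sum()  # least-squares optimal uniform scale
    return scale * (new_xy - mu_s) @ rot + mu_t


# Toy check: the new map is a rotated, mirrored, scaled, shifted copy of the old one.
rng = np.random.default_rng(1)
prev_xy = rng.normal(size=(8, 2))
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
mirror = np.diag([1.0, -1.0])
new_xy = 2.5 * prev_xy @ rotation @ mirror + np.array([3.0, -1.0])

# Only the first five images are treated as shared between the two maps.
aligned = align_to_previous(prev_xy, new_xy, np.arange(5), np.arange(5))
```

The transformation is estimated from the shared images only but applied to every image of the new map, so newly appearing images are placed consistently with the ones the user has already seen.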

Interface Prototype
We have implemented a web-based interface prototype that supports a traditional list-based as well as the proposed map-based approach for image collection exploration, see Fig. 2 (left and right). The prototype renders thumbnails of the k-nearest neighbors of a given seed image. By clicking on a thumbnail, the user can select a new seed and thereby retrieves another subset of images most similar to the one selected. This enables the user to navigate to the desired image or explore the collection as a whole. Since very similar images are likely to overlap in the MDS-based visualization (see Fig. 2 right), we implemented a grid layout that preserves similarities and, in addition, avoids overlaps, see Fig. 3.

Figure 3: Map-based view with grid layout: Images are moved to the closest unoccupied cell in order to avoid overlaps. Similarities between images largely remain visible.
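One possible reading of the grid layout is a greedy assignment in which each image takes the nearest cell that is still free. The sketch below makes several assumptions that the text does not state: images are processed in input order, map coordinates are rescaled to fill the grid, and ties are broken by cell order. The name `snap_to_grid` is hypothetical.

```python
import numpy as np


def snap_to_grid(coords, rows, cols):
    """Greedily move each point to the closest unoccupied cell of a rows x cols grid.

    Returns an integer array of (column, row) cells, one per input point.
    """
    if len(coords) > rows * cols:
        raise ValueError("grid has fewer cells than images")
    # Rescale map coordinates to the grid range [0, cols - 1] x [0, rows - 1].
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    scaled = (coords - lo) / span * np.array([cols - 1, rows - 1])

    cells = np.array([(x, y) for y in range(rows) for x in range(cols)])
    free = np.ones(len(cells), dtype=bool)
    assigned = np.empty((len(coords), 2), dtype=int)
    for i, point in enumerate(scaled):
        dist = np.linalg.norm(cells - point, axis=1)
        dist[~free] = np.inf  # occupied cells can no longer be chosen
        j = int(np.argmin(dist))
        free[j] = False
        assigned[i] = cells[j]
    return assigned


# Assumed toy input: 27 map positions placed on a 6 x 6 grid.
rng = np.random.default_rng(2)
coords = rng.normal(size=(27, 2))
grid_cells = snap_to_grid(coords, rows=6, cols=6)
```

Because the assignment is greedy, the result depends on the processing order. Handling the seed and its most similar images first would keep them closest to their MDS positions.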

In our prototype, we used the Caltech101 data set [2] comprising 9144 images representing 101 categories. The employed similarity metric is based on a weighted sum over four individual visual image features: we extract the MPEG-7 descriptors Color Layout, Scalable Color and Edge Histogram [7] as well as a YCbCr Color Histogram. During exploration, the user is able to adapt feature weights to influence navigation directions.
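The weighted combination can be sketched as follows. The example assumes one precomputed distance matrix per feature and normalizes the weights to sum to one. The per-feature distances are plain Euclidean distances on random placeholder vectors, not the matching functions defined for the MPEG-7 descriptors, and all names are hypothetical.

```python
import numpy as np

FEATURES = ("color_layout", "scalable_color", "edge_histogram", "ycbcr_histogram")


def pairwise(x):
    """Euclidean distance matrix for the rows of x (placeholder per-feature distance)."""
    return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)


def combined_distance(per_feature, weights):
    """Weighted sum of per-feature distance matrices, with weights normalized to sum to 1."""
    total = sum(weights[name] for name in FEATURES)
    if total <= 0:
        raise ValueError("at least one feature weight must be positive")
    return sum((weights[name] / total) * per_feature[name] for name in FEATURES)


# Assumed toy data: 28 images (seed plus 27 neighbors), 8 values per feature.
rng = np.random.default_rng(3)
per_feature = {name: pairwise(rng.normal(size=(28, 8))) for name in FEATURES}

equal = combined_distance(per_feature, {name: 1.0 for name in FEATURES})
edges_only = combined_distance(
    per_feature,
    {"color_layout": 0.0, "scalable_color": 0.0,
     "edge_histogram": 1.0, "ycbcr_histogram": 0.0},
)
```

With per-feature matrices cached for the current neighborhood, a weight change only requires re-summing small matrices before the map is projected again.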

Conclusions
We implemented a prototype to demonstrate our proposed interaction scheme for exploring large image collections. Instead of computing a large, costly map for the whole collection, we propose to generate small maps using MDS that show local neighborhoods. By selecting a new seed image, the user is able to hop from one neighborhood to another. Consecutive maps are aligned using Procrustes analysis in order to create the impression of panning a large (global) map. A preliminary user study showed that users benefit from additional information provided by aligned maps. In a next step, we plan to conduct a user study that compares both list- and map-based visualizations in more detail. Also, we will investigate how to increase the overlap between consecutive maps such that the alignment can be improved.

References
[1] Bates, M. J. The design of browsing and berrypicking techniques for the online search interface. Online Information Review 13, 5 (1989), 407–424.

[2] Fei-Fei, L., Fergus, R., and Perona, P. One-shot learning of object categories. IEEE Trans. on Pattern Analysis & Machine Intelligence 28, 4 (2006), 594–611.

[3] Gower, J. Generalized procrustes analysis. Psychometrika 40, 1 (1975), 33–51.

[4] Heesch, D. A survey of browsing models for content based image retrieval. Multimedia Tools and Applications 40, 2 (Apr. 2008), 261–284.

[5] Kruskal, J. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29 (1964), 1–27.

[6] Rubner, Y., Guibas, L., and Tomasi, C. The earth mover’s distance, multi-dimensional scaling, and color-based image retrieval. Proc. of the ARPA Image Understanding Workshop (1997), 661–668.

[7] Salembier, P., and Sikora, T. Introduction to MPEG-7: Multimedia Content Description Interface. John Wiley & Sons, Inc., New York, NY, USA, 2002.

[8] Stober, S., Low, T., Gossen, T., and Nürnberger, A. Incremental visualization of growing music collections. In 14th Intl. Conf. on Music IR (2013), 433–438.