2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), San Jose, CA, ...

AN EVALUATION OF INTERACTIVE SEARCH WITH MODERN VIDEO PLAYERS

Klaus Schoeffmann, Claudiu Cobarzan

Institute of Information Technology, Klagenfurt University, 9020 Klagenfurt, Austria
{Klaus.Schoeffmann, Claudiu.Cobarzan}@aau.at

ABSTRACT

The navigation features of video players are often used for interactive search in videos when users want to find a specific segment. Non-experts in particular rely on these navigation facilities because they typically do not have a video retrieval tool at hand and, perhaps more importantly, because the navigation features of video players are very easy to use. However, in order to design professional video browsing tools that allow for better search performance while remaining easy to use, we need to know how users search with common video players. Therefore, we analyze logging data from a user study with 17 participants who performed Known Item Search tasks with an HTML5 video player. We classify search behavior by type of interaction and speed of interactive search and discuss what we can learn for the design and development of professional video search tools.

Index Terms— Video Browsing, Interactive Search, HCI

1. INTRODUCTION

Due to the ubiquitous availability of portable video recording devices and their ease of use, the number of archived videos has significantly increased over the last decade. This is particularly true for the private domain, where non-professionals record many videos, typically for sharing, affection or preservation purposes [1]. However, with the increasing number of archived videos and the need to find the proper video segment (e.g., to show it to friends), it becomes increasingly difficult for non-expert users to search in videos. One problem is that those users typically do not have access to professional video retrieval tools. Also, non-experts prefer simple and familiar features, as a recent study has shown [2].

Therefore, in practice non-professional users typically employ simple video players not only for video playback but also for searching within videos. The navigation features of video players are well known to most users and very intuitive to use. However, video players provide a very limited type of navigation: fast-forward/fast-reverse and random positioning with a seeker-bar. Modern video players, like the standard HTML5 video players, are even simpler and typically offer no fast-forward or reverse feature but provide a seeker-bar instead (see Figure 1). This is also true for many online services, even if they are not implemented with HTML5.

In this paper, we evaluate how well suited such simple video players are for interactive search tasks and what search strategies users employ. A similar study was done by Crockford and Agius [3], who evaluated search strategies for a VCR-like control set with play, pause, stop, fast-forward and reverse features. However, our study focuses on navigation with the features provided by HTML5 video players: play, pause, stop and a seeker-bar for random access. We present evaluation results from a user study performed with 17 participants who had to solve Known Item Search (KIS) tasks that were used for the Video Browser Showdown [4]. For these KIS tasks, users had to search for randomly selected segments with a duration of 20 seconds in one-hour-long videos within a time limit of three minutes. We classify search behavior by the type of interaction and the speed of interactive search and then discuss what we can learn for the design and development of professional video search tools.

2. RELATED WORK

Crockford and Agius [3] performed a user study on how users interact with a VCR-like control set that consists of Play, Stop, Pause, Fast Rewind, Fast Forward, Step Forward and Step Reverse. They used “Known Item Fact Retrieval” tasks, also known as “I’ll know it when I see it” [5], where the users were required to find video segments according to a semantic question. In contrast to our study, they used no seeker-bar at all and focused on interactive search in a small video archive, i.e., in several video files instead of one longer file. They identified four search strategies regarding video file selection: (1) incremental linear search (55%), (2) decremental linear search (10%), (3) educated guess (29%), and (4) random selection (6%). For the browsing behavior within one file they identified: (1) straight viewing (21%), (2) speed switching: linear viewing with switching back and forth between playback and fast-forward (46%), (3) inaccurate shuttle determination: fast-forward too far, fast-reverse, then play, or if too far back, fast-forward again (13%), (4) accurate shuttle determination: similar to (3) but with step-forward and step-backward (7%), (5) halt and refine: step-forward and play, but pause sometimes to reflect on the current position (13%). Finally, they found that speed switching was the fastest approach, followed by halt and refine, accurate shuttle determination, straight viewing, and inaccurate shuttle determination.

However, although many video browsing tools have been proposed (many of them improved or extended video players; see [6] for a detailed review), to the best of our knowledge there exists no recent study on interactive search performance and behavior with modern video players, which use play, pause, and a seeker-bar as their main navigation aids.

3. NAVIGATION PATTERNS

In order to analyze users’ search behavior with modern video players, we conducted an experiment with 17 users.

3.1. User Study

The participants of our study had to perform Known Item Search (KIS) tasks in Dutch news videos of one-hour duration each. We used the public dataset that was used for the Video Browser Showdown [4] in 2013. The dataset and target segments are publicly available at the website of the Video Browser Showdown [4].

17 daily computer users (two female) aged 23 to 52 years (mean 31.6, SD 8.75) participated in the study. The application ran locally on a MacBook Pro laptop with its 17-inch monitor set to a resolution of 1920×1200 pixels. The interface was presented in a Safari web browser window in full-screen mode. An optical wired mouse was used as the input device. Each of the 10 search tasks consisted of finding a short sequence of about 20 seconds within a long video of over an hour. For each task, the participants first watched the short sequence in an automatic playback player on the left side of a full-screen window, as shown in Figure 1a.

Fig. 1. Search interface with an HTML5 video player. (a) and (b): the interface during the first stage of a trial, with automatic playback of the target scene. (c) and (d): the second stage of a trial, during search. (e): close-up of the interaction possibilities provided by the video player.

No interaction was possible during the playback since all interaction elements were removed (see close-up in Figure 1b). After the playback ended, the participants were presented, on the right side of the window (see Figure 1c), with a countdown timer set to 3 minutes and the corresponding long video file within a player with basic controls (play, pause, and seeker-bar; see Figure 1d, and Figure 1e for a close-up). Those controls could be used to navigate within the long video in search of a frame belonging to the target segment. The participants used the submit button below the player to check whether the currently displayed frame actually belonged to the target segment. In case of a false submission the background turned red for 4 seconds; otherwise it turned green and the score for the trial was presented for 10 seconds. Starting the next test trial was possible only after a successful submission or after the countdown reached 0 without a correct frame having been submitted. In both cases a “start next test” button appeared.
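The submission rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the names `Target`, `in_target` and `submit` are hypothetical, positions are expressed in seconds rather than frames, and the handling of a submission after the countdown has expired is our own guess.

```python
from dataclasses import dataclass
from typing import Tuple

TIME_LIMIT_S = 180.0      # countdown of 3 minutes per trial (from the text)
RED_FEEDBACK_S = 4.0      # red background after a false submission
GREEN_FEEDBACK_S = 10.0   # green background and score after a correct one


@dataclass(frozen=True)
class Target:
    """Hypothetical description of a target segment, in seconds."""
    start_s: float
    duration_s: float = 20.0  # targets lasted about 20 seconds


def in_target(position_s: float, target: Target) -> bool:
    """True if the displayed position lies inside the target segment."""
    return target.start_s <= position_s < target.start_s + target.duration_s


def submit(position_s: float, elapsed_s: float, target: Target) -> Tuple[str, float]:
    """Return (feedback, feedback duration in seconds) for one submission."""
    if elapsed_s >= TIME_LIMIT_S:
        # Assumption: submissions after the countdown are not accepted.
        return ("timeout", 0.0)
    if in_target(position_s, target):
        return ("green", GREEN_FEEDBACK_S)
    return ("red", RED_FEEDBACK_S)
```

For example, with `Target(start_s=1234.0)`, a submission at position 1240.0 after 65 seconds yields green feedback, while one at position 1300.0 yields red feedback.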

3.2. Discussion

As shown in Figure 2, for 64% of all 170 tasks, the users started with playback from the beginning of the video file and performed a linear search that was followed by forward navigation. Interestingly, for 33% of tasks the users immediately jumped to about 30 seconds (20%) or 60 seconds (13%) after the beginning of the video. For only 3% of tasks, the users decided to jump to a specific position, apparently based on a “good guess”. When looking at Figure 3, we can see that the majority (83%) of navigation during the whole test consisted of playback, or of dragging or clicking to a future point in time while in playback state (Forward@Playback) or pause state (Forward). Only 17% of interactions were reverse positioning in pause state (10%) or in playback state (7%).
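A breakdown of positioning interactions like the one above could be derived from seek events in the log roughly as follows. This is a minimal sketch under assumptions: the `SeekEvent` record and its fields are hypothetical (the actual log format is not described), the labels `Reverse` and `Reverse@Playback` are chosen by analogy with `Forward` and `Forward@Playback`, and plain playback time is not covered.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class SeekEvent(NamedTuple):
    """Hypothetical log record for one positioning action on the seeker-bar."""
    from_s: float   # position before the seek, in seconds
    to_s: float     # position after the seek, in seconds
    playing: bool   # True if the player was in playback state


def classify(event: SeekEvent) -> str:
    """Label a seek by direction and by player state."""
    direction = "Forward" if event.to_s > event.from_s else "Reverse"
    return f"{direction}@Playback" if event.playing else direction


def shares(events: Iterable[SeekEvent]) -> Dict[str, float]:
    """Percentage of seeks per label; zero-distance seeks are ignored."""
    counts = Counter(classify(e) for e in events if e.to_s != e.from_s)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Applied to a full session log, `shares` returns one percentage per label, which can then be compared across participants or tasks.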

Fig. 2. Interaction methods used to start the search.

By studying the log data we noticed two main classes of user behavior: (i) Click & Play and (ii) Dragging. In the case of the Click & Play approach, the participants usually hit the play button and then continuously clicked the seeker-bar in order to navigate within the video, either towards its end (forward) or towards its beginning (reverse). Sometimes the pause button was used in order to assess a certain frame and submit it for evaluation.

Fig. 3. Navigation methods used in the whole test.

While employing the Dragging approach, the participants used the seeker-bar to browse within the content of the video in both directions, most of the time without even using the play and pause buttons. In those cases the logged number of clicks is significantly smaller than for Click & Play behavior.

However, the navigation strategies varied between the participants, as shown in Figure 4. While the first participant used no playback at all to answer the KIS tasks, participants 10 and 12 used playback for most of the search time. Some of the participants used reverse positioning for almost as long as forward positioning (participants 2 and 16). Moreover, we can see that the participants preferred positioning in paused mode over positioning in playback mode. When comparing Figure 4 with Figure 5, we can see that users who relied on Dragging visited significantly fewer frames (e.g., 2633, 5078, and 5197 frames for participants 1, 4, and 9) than participants who spent a large amount of time on playback (17170, 18475, and 25526 frames for participants 7, 10, and 12). Participants 1, 4, and 9 were also the users with the fastest submissions overall in our test, whereas participants 7, 10, and 12 were the slowest. Therefore, we can conclude that interactive search with a seeker-bar is most effective when used without playback.
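To make the size of this difference concrete, the six frame counts quoted above can be averaged per group. The short Python sketch below does only that; it uses no data beyond the numbers given in the text, and the variable names are our own.

```python
from statistics import mean

# Frames visited, as reported in the text for six participants.
dragging_frames = {1: 2633, 4: 5078, 9: 5197}       # relied on Dragging
playback_frames = {7: 17170, 10: 18475, 12: 25526}  # relied on playback

mean_dragging = mean(list(dragging_frames.values()))
mean_playback = mean(list(playback_frames.values()))
ratio = mean_playback / mean_dragging

print(f"mean frames (Dragging users): {mean_dragging:.0f}")
print(f"mean frames (playback users): {mean_playback:.0f}")
print(f"playback / Dragging ratio:    {ratio:.2f}")
```

For these six participants, the playback-oriented users visited on average roughly 4.7 times as many frames as the Dragging-oriented users.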

Fig. 4. Time spent for a specific search method (per user).

Although the actual test runs were preceded for each participant by a number of training runs (from one to four, depending on the user), the logged data for the first test run shows almost erratic behavior. Most users browsed hastily through the video from start towards end, employing one of the above-mentioned approaches. Because they moved too fast, most needed several such passes to find an appropriate frame, while some did not succeed within the allocated 3 minutes.

Fig. 5. Number of frames visited per participant by using a specific search method.

Fig. 6. Average distance for forward and backward navigation for every task.

After the first run, most of the users recognized that they were browsing/searching far too fast and adapted by slowing down. This behavior is also visible in Figure 6, which shows the average distance between two consecutive forward and backward navigation steps. After the first task, the users significantly decreased the step size for navigation for all tasks but task 5, which was a particularly hard task because the target segment appeared more than once in the video with only slight differences. After the first task, users also spent more time on inspecting the content during dragging, as revealed by Figure 7, which shows the average delay between two consecutive navigation steps.
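The two per-task measures discussed here, average step distance and average delay between consecutive navigation steps, could be computed from a chronological list of navigation steps as sketched below. The input format (pairs of time and position in seconds) and the function name `step_statistics` are assumptions for illustration; the exact definitions used for the figures may differ.

```python
from statistics import mean
from typing import Dict, List, Tuple


def step_statistics(steps: List[Tuple[float, float]]) -> Dict[str, float]:
    """Summarise navigation steps given as (time_s, position_s) pairs.

    Consecutive pairs are compared: a positive position change counts as a
    forward step, a negative one as a backward step, and the time difference
    is the delay between the two steps.
    """
    forward, backward, delays = [], [], []
    for (t0, p0), (t1, p1) in zip(steps, steps[1:]):
        delta = p1 - p0
        if delta > 0:
            forward.append(delta)
        elif delta < 0:
            backward.append(-delta)
        delays.append(t1 - t0)
    return {
        "avg_forward_s": mean(forward) if forward else 0.0,
        "avg_backward_s": mean(backward) if backward else 0.0,
        "avg_delay_s": mean(delays) if delays else 0.0,
    }
```

A user who slows down after the first task would show smaller `avg_forward_s` and `avg_backward_s` values and a larger `avg_delay_s` in later tasks.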

Fig. 7. Average delay between consecutive steps of forward and backward navigation for every task.

The log data for the next few test runs revealed two main approaches. Some users first tried to get an overview of the video content by browsing from the start to the end at an increased speed, and then refined the search by starting again from a certain point (not necessarily the beginning), usually at a lower speed. Others simply started from the beginning and searched/browsed slowly and linearly from the start towards the end. Surprisingly, some of the participants adopted an awkward approach in which they searched from the beginning to the end of the video and then reversed direction, searching from the end towards the beginning (see Figure 8). Some did this while applying Click & Play. Those approaches were unsuccessful and time consuming most of the time. One of the users recognized that this was a failing strategy and changed his approach in mid-session by switching to linear search. He kept using it even in a special case where the other participants used common knowledge to narrow the search interval. In this particular case (task 6), the target showed a football sequence, and most users started searching towards the end of the long video, since in news programs sports and weather are usually presented last.

Fig. 8. Navigation diagram of participant 1. Shows the visited frames over time for every task. While for most tasks the user navigated linearly, for tasks 1 and 7 he used forward search followed by reverse search (and followed by a quick forward search again for task 1).

4. CONCLUSION

We have presented results from a user study on interactive search with modern video players that offer limited navigation features. As expected, our results show that different users apply different search behaviors. However, most users applied linear forward search with seeker-bar positioning and rarely relied on reverse search. A few times reverse search was used in situations where linear search did not lead to success. In the same situation, however, the majority of the users preferred to go back to the beginning and start a forward search again. It seems that users like to search in a linear manner because with this strategy it is easier to remember visited segments. These observations support the interface design presented in [7], in which the user could immediately restart from the beginning in case the desired content was not found. Our study has also shown that linear forward search with seeker-bar dragging in non-playback state is most efficient in terms of search time. For this kind of search, users only need to concentrate on the content itself instead of on clicking targets. Based on these results, video search tools could simply rely on static images during interactive search and provide a playback feature only on demand.

Acknowledgements

This work was funded by the Federal Ministry for Transport, Innovation and Technology (bmvit) and Austrian Science Fund (FWF): TRP 273-N15 and the European Regional Development Fund and the Carinthian Economic Promotion Fund (KWF), supported by Lakeside Labs GmbH, Klagenfurt, Austria.

5. REFERENCES

[1] M. Lux and J. Huber, “Why did you record this video? An exploratory study on user intentions for video production,” in Image Analysis for Multimedia Interactive Services, 13th Int. Workshop on, 2012, pp. 1–4.

[2] David Scott, Frank Hopfgartner, Jinlin Guo, and Cathal Gurrin, “Evaluating novice and expert users on handheld video retrieval systems,” in Proc. of the 19th International Conference on Multimedia Modeling (MMM 2013), 2013, pp. 69–78, Springer Berlin Heidelberg.

[3] Chris Crockford and Harry Agius, “An empirical investigation into user navigation of digital video using the VCR-like control set,” International Journal of Human-Computer Studies, vol. 64, no. 4, pp. 340–355, 2006.

[4] Klaus Schoeffmann and Werner Bailer, “Video browser showdown,” SIGMultimedia Rec., vol. 4, no. 2, pp. 1–2, July 2012.

[5] Albert H. Huang, “Effects of multimedia on document browsing and navigation: an exploratory empirical investigation,” Information & Management, vol. 41, no. 2, pp. 189–198, 2003.

[6] Klaus Schoeffmann, Frank Hopfgartner, Oge Marques, Laszlo Boeszoermenyi, and Joemon M. Jose, “Video browsing interfaces and applications: a review,” SPIE Reviews, vol. 1, no. 1, pp. 018004, 2010.

[7] K. Schoeffmann and D. Ahlstrom, “Using a 3D cylindrical interface for image browsing to improve visual search performance,” in Image Analysis for Multimedia Interactive Services, 13th Int. Workshop on, 2012, pp. 1–4.