interactive video search - tutorial at acm multimedia 2015

Interactive Video Search Klaus Schoeffmann, PhD Klagenfurt University Institute of Information Technology Klagenfurt, Austria Frank Hopfgartner, PhD University of Glasgow School of Humanities Glasgow, UK


Post on 22-Jan-2018


TRANSCRIPT

Page 1: Interactive Video Search - Tutorial at ACM Multimedia 2015

Interactive Video Search

Klaus Schoeffmann, PhD

Klagenfurt University

Institute of Information Technology

Klagenfurt, Austria

Frank Hopfgartner, PhD

University of Glasgow

School of Humanities

Glasgow, UK

Page 2: Interactive Video Search - Tutorial at ACM Multimedia 2015

Outline

• Search in video content: motivation and challenges
• Video retrieval and its challenges
• What is interactive video search and how can it help?
• Video browsing
• Video navigation
• Break
• Video content visualization
• Ad-hoc similarity search / video exploration
• Sketch-based search in video
• Evaluation of interactive video search tools
• Visual lifelogging

Klaus Schoeffmann, Frank Hopfgartner ACM Multimedia 2015 Tutorial: Interactive Video Search 2

Page 3: Interactive Video Search - Tutorial at ACM Multimedia 2015

Motivation


Page 4: Interactive Video Search - Tutorial at ACM Multimedia 2015

Videos Everywhere

• Ubiquitous use of videos nowadays

Entertainment and commercials

Social gaming (screencasts)

Personal videos (family, kids, …)

Sports documentation and analysis (e.g., GoPro)

Product usage instructions (e.g., furniture)

Surveillance (buildings, places, street, …)

Lifelogging

Health care and medical science (endoscopic procedures)

• Enormous amount of data!


Page 5: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video as the Ultimate Media?


[Mary Meeker, Liang Wu, Internet Trends, D11 Conference, May, 2013]

As of 2014, 300 hours of video are uploaded to YouTube every minute!

Page 6: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Cameras

• Increasingly powerful

These days you can record 4K content with your mobile!

Video sensors use auto-focus, object tracking, color correction, and image stabilization

Storage space is not a big problem

Current smartphones have 128 GB of memory

NAS devices are cheaply available

Network bandwidth has also increased dramatically over the years

Video streaming on the go is simple and common

LTE connections provide 30 Mbit/s and more!

Page 7: Interactive Video Search - Tutorial at ACM Multimedia 2015


[Mary Meeker, Liang Wu, Internet Trends, D11 Conference, May, 2013]

Page 8: Interactive Video Search - Tutorial at ACM Multimedia 2015

Challenges

• Video data

are continuous media: the content depends on time!

often contain several media types: image, text, audio

cannot simply be stored and indexed in a database; they require their own indexing and search methods!

require a huge amount of storage space without compression!

may contain a lot of important information, which is, however, often very subjective!

Page 9: Interactive Video Search - Tutorial at ACM Multimedia 2015

Challenges

• Subjectivity of data

How many people use it?

The more the easier!?

• Different levels

Internet scale (YouTube)

Country/region

Company/organization/group

Individual

Available meta-data

How to query for a video clip?

How to efficiently retrieve results?

How to effectively present content to the user?

Page 10: Interactive Video Search - Tutorial at ACM Multimedia 2015

“Poor Man’s Video Search Tool”


VCR in the 1970s provided a similar functionality!

Page 11: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Data Still Tedious to Use

• Even with video retrieval tools it is still challenging to find desired video content

Especially if it is not publicly available (and popular)

Many problems with querying, in particular for novice users!

• The ultimate goal is to make using and searching video as effective as for text

Quickly find relevant content

Compare to the interactivity of a text book

Index, ToC, list of figures/tables, etc.

Change, extend, copy, bookmark, highlight, etc.

Page 12: Interactive Video Search - Tutorial at ACM Multimedia 2015

Traditional Video Retrieval

“Query and browse results”

Page 13: Interactive Video Search - Tutorial at ACM Multimedia 2015

Search Example (TRECVID KIS 2010)


Find the “video of President Bush standing near sea vessels with Coast Guard members talking about his pride of the Coast Guard, immigration, and security issues”.

Video from IACC public data set!

TRECVID: see later!

Page 14: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Clip Hidden in Huge Collection

Internet Archive with Creative Commons (IACC) data set, as used for TRECVID: 146,788 shots (~9,000 videos)

Page 1 2 3 …. 38 39 40


Page 15: Interactive Video Search - Tutorial at ACM Multimedia 2015

1st Trial at YouTube


Page 16: Interactive Video Search - Tutorial at ACM Multimedia 2015

2nd Trial at YouTube


Page 17: Interactive Video Search - Tutorial at ACM Multimedia 2015

[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November). Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]


Content-based

Feature

Example Image

Text

Ranked list of shots

In IACC about 5800 pages.

Temporal Context

Traditional Video Retrieval Tool


Page 18: Interactive Video Search - Tutorial at ACM Multimedia 2015

More Interactive Retrieval Tool


[A. Moumtzidou et al., “VERGE: A Multimodal Interactive Video Search Engine”, Proc. of the 21st International Conference on MultiMedia Modeling (MMM 2015), Sydney, 2015]

kNN similarity search based on VLAD vectors

Concept detection with SVM and five local descriptors (SIFT, SURF, ORB, ...) and PCA

Hierarchical keyframe clustering

Interaction details later
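
The kNN similarity search named on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the VERGE implementation: the toy 4-dimensional vectors stand in for VLAD descriptors, and `knn_search`, `euclidean`, and the shot ids are hypothetical names.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_search(query, index, k=3):
    """Return the ids of the k shots whose vectors are closest to the query."""
    ranked = sorted(index.items(), key=lambda item: euclidean(query, item[1]))
    return [shot_id for shot_id, _ in ranked[:k]]

# Hypothetical index: shot id -> feature vector (stand-in for a VLAD vector).
index = {
    "shot_001": [0.9, 0.1, 0.0, 0.0],
    "shot_002": [0.8, 0.2, 0.1, 0.0],
    "shot_003": [0.0, 0.0, 0.9, 0.4],
    "shot_004": [0.1, 0.9, 0.0, 0.1],
}

nearest = knn_search([1.0, 0.0, 0.0, 0.0], index, k=2)
```

A linear scan like this is fine for a demo; a collection of the IACC's size would likely need an approximate nearest-neighbour index instead.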

Page 19: Interactive Video Search - Tutorial at ACM Multimedia 2015

Challenges for Video Retrieval

Page 20: Interactive Video Search - Tutorial at ACM Multimedia 2015

Traditional Video Retrieval Approach: “Query-and-Browse-Results” Paradigm

Works well if (and only if)

users can properly express their needs.

content features can sufficiently describe visual content.

computer vision can accurately detect semantics.

Content-based Search

Ranked Results

Unfortunately, in practice these assumptions do not hold.

Page 21: Interactive Video Search - Tutorial at ACM Multimedia 2015

Challenges

Content-based features

How to understand semantics from pixels?

Semantic Gap

Both images show bears in front of a landscape.

Page 22: Interactive Video Search - Tutorial at ACM Multimedia 2015

Database affinity of concept classifiers

Low performance in broad domain

P(k): precision at level k (after k results)

rel(k): defines whether the kth retrieved document is relevant

Performance Gap


Challenges

TRECVID 2013 Semantic Indexing (SIN-500): median “inferred average precision” (infAP) < 0.13

In other words: 88% of the results are not correct!
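
To make P(k) and rel(k) concrete, here is a minimal sketch of precision at k and of the standard (non-interpolated) average precision. It assumes a fully judged ranked list; TRECVID's inferred average precision (infAP) is a sampling-based estimate of this quantity and is not reproduced here. The function names are illustrative.

```python
def precision_at_k(rel, k):
    """P(k): fraction of relevant items among the first k results.

    rel is a list of 0/1 relevance judgements in rank order, so rel[k-1]
    plays the role of rel(k) on the slide.
    """
    return sum(rel[:k]) / k

def average_precision(rel, num_relevant=None):
    """Mean of P(k) over all ranks k at which a relevant item appears."""
    if num_relevant is None:
        # Assumption: all relevant items occur somewhere in the ranked list.
        num_relevant = sum(rel)
    if num_relevant == 0:
        return 0.0
    total = sum(precision_at_k(rel, k)
                for k in range(1, len(rel) + 1) if rel[k - 1])
    return total / num_relevant

# Relevant items at ranks 1 and 3: AP = (1/1 + 2/3) / 2
ap = average_precision([1, 0, 1, 0, 0])
```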


Page 23: Interactive Video Search - Tutorial at ACM Multimedia 2015

Query-by-concept

Which concept to use? Choose from a long list of results…

Query-by-example

Typically no perfect example available.

Query-by-sketch

Users are no artists

Query-by-text

How to describe a desired image by text?

Usability Gap


A picture tells a 1000 words.

by marfis75

How to describe a video clip by text???

Challenges


Page 24: Interactive Video Search - Tutorial at ACM Multimedia 2015

Example: Query by Motion Sketch

• Matching based on trajectory descriptor

• Challenge: may differ a lot among different users


[Ghosal, Koustav, and Anoop Namboodiri. "A Sketch-Based Approach To Video Retrieval Using Qualitative Features." Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing. ACM, 2014.]

Page 25: Interactive Video Search - Tutorial at ACM Multimedia 2015

Well-Known Issues of “Query-and-Browse-Results” Paradigm

Users cannot formulate or have no query → provide exploratory search features!

For example: browsing, filtering, similarity search

Users expect good results (on first page!) → use relevance feedback / active learning instead of long lists!

Shots have a temporal context

Videos are dynamic → static thumbnails are not informative

Esp. true for long shots and self-similar content

→ skims and visual summaries (“smart playback”)

→ sophisticated navigation & content structure visualization

Grid interfaces are not always the best choice

Usability Gap

See later

Page 26: Interactive Video Search - Tutorial at ACM Multimedia 2015

Uniform Sampled Frames from a Video with High Self-Similarity


Page 27: Interactive Video Search - Tutorial at ACM Multimedia 2015

Needs Special Keyframe Extraction and Object Detection

[Klaus Schoeffmann, Manfred Del Fabro, Tibor Szkaliczki, Laszlo Böszörmenyi, and Jörg Keckstein, “Keyframe Extraction in Endoscopic Video“, in Multimedia Tools and Applications, Springer, August, 2014]

[Manfred J. Primus, Klaus Schoeffmann, and Laszlo Böszörmenyi, “Instrument Classification in Laparoscopic Videos“, in Proceedings of the International Workshop on Content-Based Multimedia Indexing (CBMI 2015), Prague, Czech Republic, IEEE, 2015, pp. 1-6]


Page 28: Interactive Video Search - Tutorial at ACM Multimedia 2015

And Special Browsing Tools / Visualization


Agglomerative clustering based on visual similarity and temporal information

[Jakub Lokoc, Klaus Schoeffmann, and Manfred Del Fabro, “Dynamic Hierarchical Visualization of Keyframes in Endoscopic Video“, in Proceedings of the 21st International Conference on MultiMedia Modelling 2015 (MMM 2015), Sydney, Australia, Lecture Notes on Computer Science (LNCS), Vol. 8936, Springer International Publishing, 2015, pp. 291-294]

Page 29: Interactive Video Search - Tutorial at ACM Multimedia 2015

Where Is the User in Multimedia Retrieval?


[Marcel Worring et al., „Where Is the User in Multimedia Retrieval?“, IEEE Multimedia, Vol. 19, No. 4, Oct.-Dec. 2012, pp. 6-10 ]


Page 30: Interactive Video Search - Tutorial at ACM Multimedia 2015

Interactive vs. “Traditional” Retrieval


Page 31: Interactive Video Search - Tutorial at ACM Multimedia 2015

Interactive Video Search: And how it can help…

Page 32: Interactive Video Search - Tutorial at ACM Multimedia 2015

Performance Gap

Interactive Video Search


• Mostly interactive search
• Human computation
• Simple-to-use
• Inflexible and tedious for archives
• Low performance (?)

• Mostly automatic search
• Retrieval engine
• Complicated to use
• Flexible and easier (?) for archives
• Limited performance too!

Usability Gap

Novices Experts

Combines HCI with CV and MIR


Page 33: Interactive Video Search - Tutorial at ACM Multimedia 2015

Interactive Video Search

Traditional video retrieval + interactive inspection/exploration/navigation and rich content visualization in order to satisfy an information need


Focuses on search and exploration in (i) single videos as well as (ii) video collections

Directed Search

Find a specific shot or segment in a video

Find a specific video in an archive

Undirected Search

Searching to discover information

E.g., browse through a video in order to

Learn what the content looks like

See if it is interesting

Not supported by trad. video retrieval

Page 34: Interactive Video Search - Tutorial at ACM Multimedia 2015

Interactive Video Search

• Tries to strongly integrate user into search process

Search – Inspect – Think – Repeat

Exploratory search (“will know it when I see it”)

Instead of “query-and-browse-results”

User controls search process

Inspects and interacts

Most meaningful tool for current need, e.g.

• Content Browsing/Navigation

• Content Visualization and Summarization

• Ad-hoc Querying (e.g., by sketch, filtering, ad-hoc example)

Page 35: Interactive Video Search - Tutorial at ACM Multimedia 2015

User Studies with Significance Tests!

• Many interfaces proposed without proper evaluation

• Interface A better than interface B? → comparative user study needed!

Perform search tasks in exactly the same setting (data, environment, etc.)

Logging of interaction behavior and task solve time

Questionnaire about subjective workloads

Statistical analysis with proper tests (e.g., t-test, ANOVA, Wilcoxon signed-rank, etc.)
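
As an illustration of one of the tests listed above, the following sketch computes the statistic of a dependent (paired-samples) t-test for two interfaces used by the same participants. The task times are made-up numbers, `paired_t_statistic` is a hypothetical helper, and only the t value and degrees of freedom are computed; obtaining a p-value requires the t distribution (for example from a statistics package), which is left out here.

```python
import math
import statistics

def paired_t_statistic(times_a, times_b):
    """Return (t, degrees_of_freedom) for paired measurements.

    times_a[i] and times_b[i] must come from the same participant.
    """
    diffs = [a - b for a, b in zip(times_a, times_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)      # sample standard deviation
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Made-up task solve times in seconds for four participants.
interface_a = [30, 28, 35, 25]
interface_b = [40, 38, 41, 35]
t_value, dof = paired_t_statistic(interface_a, interface_b)
```

A negative t here means interface A was faster on average; whether the difference is significant depends on comparing t against the critical value for the given degrees of freedom.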

• User simulations?

• Evaluation competitions

Same data set

Comparative evaluation

TRECVID, MediaEval, Video Browser Showdown (see later)

Page 36: Interactive Video Search - Tutorial at ACM Multimedia 2015

IVS Tools: Video Browsing


Page 37: Interactive Video Search - Tutorial at ACM Multimedia 2015


Video Browsing

[ F. Arman, R. Depommier, A. Hsu, and M-Y. Chiu, Content-based Browsing of Video Sequences, in Proc. of ACM International Conference on Multimedia, 1994, pp. 97-103 ]


Page 38: Interactive Video Search - Tutorial at ACM Multimedia 2015

The ThumbBrowser


[Marco Hudelist, Klaus Schoeffmann, Laszlo Böszörmenyi. “Mobile Video Browsing with the ThumbBrowser”, Proc. of the International Conference on Multimedia, 2013, pp. 405-406 ]

Page 39: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Browser for the Digital Native


[Adams, Brett, Stewart Greenhill, and Svetha Venkatesh. "Towards a video browser for the digital native." Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on. IEEE, 2012.]

“Temporal Semantic Compression” based on tempo function and shot popularity (insight)


Page 40: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Browser for the Digital Native

• User study with 8 participants

Test configuration elements by two tasks (after presentation + 5 minutes training)

(i) Browse a familiar movie to find scenes you remember

(ii) Browse an unfamiliar movie to get a feel for its story or structure

Questionnaire with Likert-scale ratings

[Adams, Brett, Stewart Greenhill, and Svetha Venkatesh. "Towards a video browser for the digital native." Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on. IEEE, 2012.]

Page 41: Interactive Video Search - Tutorial at ACM Multimedia 2015

Thread-Based Browsing of Retrieval Results

• Thread: linked sequence of shots in a specified order

Query results, visual similarity, semantic similarity, textual similarity, time, …

[De Rooij, Ork, Cees GM Snoek, and Marcel Worring. "Balancing thread based navigation for targeted video search." Proceedings of the 2008 international conference on Content-based image and video retrieval (CIVR). ACM, 2008.]

Page 42: Interactive Video Search - Tutorial at ACM Multimedia 2015

Thread-Based Browsing of Retrieval Results


[Ork de Rooij et al.]

Page 43: Interactive Video Search - Tutorial at ACM Multimedia 2015

IVS Tools: Video Navigation


Page 44: Interactive Video Search - Tutorial at ACM Multimedia 2015

Improving Navigation


e.g., on YouTube default window:

640 pixels = 640 frames (25 seconds)

Common seeker-bar limits navigation granularity
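
The granularity limit can be made concrete with a little arithmetic. The sketch below assumes a 640-pixel-wide seeker-bar and a frame rate of 25 fps (the frame rate is an assumption, chosen because 640 frames at 25 fps last about 25 seconds); the function names are illustrative.

```python
def seconds_per_pixel(duration_s, bar_width_px):
    """Length of video covered by one pixel of the seeker-bar."""
    return duration_s / bar_width_px

def frames_per_pixel(duration_s, bar_width_px, fps=25.0):
    """Number of frames that share one pixel of the seeker-bar."""
    return duration_s * fps / bar_width_px

# Longest video for which every frame still gets its own pixel:
max_frame_accurate_duration = 640 / 25.0           # 25.6 seconds

# For a one-hour video, a single pixel step already skips several seconds:
step_seconds = seconds_per_pixel(3600, 640)        # 5.625 seconds
step_frames = frames_per_pixel(3600, 640, 25.0)    # 140.625 frames
```

So for anything longer than roughly 25 seconds, a plain seeker-bar cannot address individual frames, which is the motivation for the improvements listed on this slide.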

[Huerst et al., ICME 2007]

ZoomSlider

Improvements (selected):


Page 45: Interactive Video Search - Tutorial at ACM Multimedia 2015

Navigation with a Seeker-Bar: Idea of the ZoomSlider

Wolfgang Hürst, Georg Götz, and Martina Welte, “Interactive video browsing on mobile devices”, in Proceedings of the 15th International Conference on Multimedia (MULTIMEDIA '07). ACM, pp. 247-256, 2007


Page 46: Interactive Video Search - Tutorial at ACM Multimedia 2015


Page 47: Interactive Video Search - Tutorial at ACM Multimedia 2015

Improving Navigation


e.g., on YouTube default window:

640 pixels = 640 frames (25 seconds)

Common seeker-bar limits navigation granularity

[Dragicevic et al., CHI 2008]

Direct Manipulation

[Huerst et al., ICME 2007]

ZoomSlider

Improvements (selected):


Page 48: Interactive Video Search - Tutorial at ACM Multimedia 2015

Relative Flow Dragging

Background Stabilization

Pierre Dragicevic, Gonzalo Ramos, Jacobo Bibliowitcz, Derek Nowrouzezahrai, Ravin Balakrishnan, and Karan Singh. “Video browsing by direct manipulation”, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '08). ACM, pp. 237-246, 2008

Video browsing by direct manipulation / relative flow dragging


Page 49: Interactive Video Search - Tutorial at ACM Multimedia 2015

Relative Flow Dragging

• Evaluation with a user study

16 participants (18-44 years old)

Direct comparison to seeker-bar navigation

Navigation tasks, 2 videos (ladybug, cars)

“Find the position where the ladybug passes over marker X”

“Find the moment when car X starts moving”

Flow dragging significantly faster (RM-ANOVA) by at least 250% (also significantly fewer errors)

Pierre Dragicevic, Gonzalo Ramos, Jacobo Bibliowitcz, Derek Nowrouzezahrai, Ravin Balakrishnan, and Karan Singh. “Video browsing by direct manipulation”, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '08). ACM, pp. 237-246, 2008


Page 50: Interactive Video Search - Tutorial at ACM Multimedia 2015

How Do Users Search in Video With a Common Video Player?

Page 51: Interactive Video Search - Tutorial at ACM Multimedia 2015

How do Users Search with Video Players (Navigate with Seeker-Bars)?

• User study with more than 30 participants

• Known Item Search Tasks


[Claudiu Cobarzan and Klaus Schoeffmann, “How do Users Search with Basic HTML5 Video Players?“, in Proceedings of The 20th International Conference on MultiMedia Modeling (MMM2014), 2014]


Page 52: Interactive Video Search - Tutorial at ACM Multimedia 2015

How do Users Search with Video Players (Navigate with Seeker-Bars)?


[Claudiu Cobarzan and Klaus Schoeffmann, “How do Users Search with Basic HTML5 Video Players?“, in Proceedings of The 20th International Conference on MultiMedia Modeling (MMM2014), 2014]


Page 53: Interactive Video Search - Tutorial at ACM Multimedia 2015

How do Users Search with Video Players (Navigate with Seeker-Bars)?


[Claudiu Cobarzan and Klaus Schoeffmann, “How do Users Search with Basic HTML5 Video Players?“, in Proceedings of The 20th International Conference on MultiMedia Modeling (MMM2014), 2014]

Vast amount of content was checked with normal playback!?


Page 54: Interactive Video Search - Tutorial at ACM Multimedia 2015

How do Users Search with Video Players (Navigate with Seeker-Bars)?

[Figure axis label: Time in video (ms)]

[Claudiu Cobarzan and Klaus Schoeffmann, “How do Users Search with Basic HTML5 Video Players?“, in Proceedings of The 20th International Conference on MultiMedia Modeling (MMM2014), 2014]

Users start with rough navigation and look more carefully after narrowing down the search area!

Target segment

Page 55: Interactive Video Search - Tutorial at ACM Multimedia 2015

Improving Video Navigation on Touch Devices

Session continues at 3:20 pm

Page 56: Interactive Video Search - Tutorial at ACM Multimedia 2015


Page 57: Interactive Video Search - Tutorial at ACM Multimedia 2015

Keyframe Navigation Tree

• Consider findings of study on navigation behavior

• Basic idea inspired by frame stripes (MO images)

• Goal

very compact visualization

not as fine as frames but not as coarse as keyframes of shots

provide different granularity levels for navigation

previous work has shown that users typically navigate in a coarse-to-fine grained manner


[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247-258). ACM. ]

[Mueller-Seelich, H., Tan, E.: Visualizing the semantic structure of film and video (2000) ]

compact overview but abstract & very high level of detail (frame-based!)

[Xiaoxiao Luo, Qing Xu, Mateu Sbert, Klaus Schoeffmann, “F-Divergences Driven Video Key Frame Extraction“, in Proc. of the IEEE Int. Conference on Multimedia & Expo (ICME 2014), Chengdu, China, 2014, pp. 6]


Page 58: Interactive Video Search - Tutorial at ACM Multimedia 2015

• Keyframe selection based on sub-shots (JSD with color histograms)

Cover all important scenes even for long shots (e.g. pans)

Excerpts with three levels of detail: L1: narrow, L2: wide, L3: full keyframe

Used as seeker-bar with synchronized interaction for all levels

Simple touch-based interaction (tap and wipe gestures)
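
The sub-shot idea (JSD with color histograms) can be sketched as follows. This is a simplified illustration, not the procedure of the cited work: it computes the Jensen-Shannon divergence between the color histograms of consecutive frames and starts a new sub-shot wherever the divergence exceeds a threshold. The threshold value, the two-bin toy histograms, and the function names are assumptions.

```python
import math

def normalize(hist):
    """Turn raw bin counts into a probability distribution (needs a non-empty histogram)."""
    total = sum(hist)
    return [h / total for h in hist]

def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(hist_a, hist_b):
    """Jensen-Shannon divergence of two histograms, in [0, 1] with log base 2."""
    p, q = normalize(hist_a), normalize(hist_b)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def subshot_boundaries(histograms, threshold=0.3):
    """Indices i at which frame i differs strongly from frame i - 1."""
    return [i for i in range(1, len(histograms))
            if jsd(histograms[i - 1], histograms[i]) > threshold]

# Toy two-bin histograms: a slow drift, then an abrupt change at frame 2.
frames = [[10, 0], [9, 1], [0, 10]]
boundaries = subshot_boundaries(frames)
```

One keyframe per detected sub-shot would then cover long shots such as pans with more than a single image.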


L1 (30 shots)

L2 (~ 12 shots)

L3 (~ 3 shots)

Keyframe Navigation Tree

[Hudelist, Marco A., Klaus Schoeffmann, and Qing Xu. "Improving interactive known-item search in video with the keyframe navigation tree." MultiMedia Modeling. Springer International Publishing, 2015.]


Page 59: Interactive Video Search - Tutorial at ACM Multimedia 2015


Page 60: Interactive Video Search - Tutorial at ACM Multimedia 2015

Keyframe Navigation Tree

• Video Player vs. KNT Browser (iPad, 4th generation, 9.7-inch)

• User study with 20 participants (15m/5f)

Age: 18-40 (mean 28.15, s.d. 6.08)

• Known-item search tasks

Given a 20-second target clip

Find the correct clip in a 1-hour video as fast as possible

200 search tasks in total

Each participant performed a random selection of 10 tasks (5/5)

Latin-square principle to avoid familiarization effects

Time-out after 3 minutes (“unanswered”)

Wrong results marked as “erroneous”

• Time measurement, logging, questionnaire (Likert-scale ratings)
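
The Latin-square principle mentioned above can be illustrated with a simple cyclic construction, in which every condition appears exactly once in every position across participants. How the study actually assigned tasks and interfaces is not stated on the slide, so `latin_square` and `task_order` are hypothetical helpers; note also that a cyclic square is not balanced for carry-over effects.

```python
def latin_square(n):
    """Cyclic n x n Latin square: row r is 0..n-1 rotated left by r."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]

def task_order(participant, conditions):
    """Order of conditions for a participant, cycling through the rows."""
    square = latin_square(len(conditions))
    row = square[participant % len(conditions)]
    return [conditions[i] for i in row]

square = latin_square(3)
order_for_second_participant = task_order(1, ["A", "B", "C"])
```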


Page 61: Interactive Video Search - Tutorial at ACM Multimedia 2015

Keyframe Navigation Tree: Search Time & Performance

• Task solve time, erroneous trials, unanswered trials…

28.11

44.18

KNT Browser statistically significantly faster acc. to dependent paired-samples t-test (t(19) = -3.937, p < 0.005)

7 trials

10 trials

5 trials

22 trials


[Hudelist, Marco A., Klaus Schoeffmann, and Qing Xu. "Improving interactive known-item search in video with the keyframe navigation tree." MultiMedia Modeling. Springer International Publishing, 2015.]

Page 62: Interactive Video Search - Tutorial at ACM Multimedia 2015

Keyframe Navigation Tree: Subjective Rating

• NASA Task-Load-Index (TLX) questionnaires

KNT browser significantly better in all 7 categories! Acc. to Wilcoxon signed-rank tests (for details see paper)

KNT browser is preferred search tool for 85% of tested users (17/20)

[Hudelist, Marco A., Klaus Schoeffmann, and Qing Xu. "Improving interactive known-item search in video with the keyframe navigation tree." MultiMedia Modeling. Springer International Publishing, 2015.]

Page 63: Interactive Video Search - Tutorial at ACM Multimedia 2015

IVS Tools: Content Visualization


Page 64: Interactive Video Search - Tutorial at ACM Multimedia 2015

Grid Interfaces Aren't Enough!

• Many video retrieval systems use a Grid interface!?

Moreover, a grid interface does not allow for fast human visual search (see later)!

A ranked list of results does not convey the temporal content structure!

• To which video does a shot belong?
• What is the sequence of shots?
• How long is a shot / scene?

Page 65: Interactive Video Search - Tutorial at ACM Multimedia 2015

Table of Video Content (TOVC)

[Goeau et al., ICME 2007]

Squeeze / Fisheye

Rapid Serial Visual Presentation (RSVP)

Improving Visualization aka “Video Surrogates”

[Wildemuth et al., 2003]

[Wittenburg et al., 2005]


Page 66: Interactive Video Search - Tutorial at ACM Multimedia 2015

VideoTree [Jansen et al., CBMI 2008]

However, outperformed by simple “grid of keyframes” in terms of search time.

Similar concept proposed later [Girgensohn et al., ICMR 2011]

• Split-based clustering algorithm with color correlograms.

• Tree not directly shown to the user (only one level).

Improving Visualization aka “Video Surrogates”

Page 67: Interactive Video Search - Tutorial at ACM Multimedia 2015

3D Ring Interface

• Utilization of screen real estate

Large set of images

Minor occlusion, slight distortion

• Intuitive interaction

Rotate and zoom

• Content-based sorting

• “Pop-out images” (in the back)

• Further advantages

Immediately continue on miss, scaling

Klaus Schoeffmann, David Ahlström, and Marco Andrea Hudelist, “3-D Interfaces to Improve the Performance of Visual Known-Item Search“, in IEEE Transactions on Multimedia, Vol. 16, No. 7, November, 2014, pp. 1942-1951.


Page 68: Interactive Video Search - Tutorial at ACM Multimedia 2015

3D Ring Interface - Perspectives

Preferred Design acc. to user study

25% Vertical 66% Horizontal 8.3% Frontal


Klaus Schoeffmann, David Ahlström, and Marco Andrea Hudelist, “3-D Interfaces to Improve the Performance of Visual Known-Item Search“, in IEEE Transactions on Multimedia, Vol. 16, No. 7, November, 2014, pp. 1942-1951.


Page 69: Interactive Video Search - Tutorial at ACM Multimedia 2015

3D interface significantly faster than grid by 12.7%

User Study: Grid vs. Ring (both sorted)

150 images, 12 participants, 1440 trials

Klaus Schoeffmann, David Ahlström, and Marco Andrea Hudelist, “3-D Interfaces to Improve the Performance of Visual Known-Item Search“, in IEEE Transactions on Multimedia, Vol. 16, No. 7, November, 2014, pp. 1942-1951.


Page 70: Interactive Video Search - Tutorial at ACM Multimedia 2015

Extension: Multiple Rings with Vertical Scrolling


Klaus Schoeffmann. 2014. The Stack-of-Rings Interface for Large-Scale Image Browsing on Mobile Touch Devices. In Proc. of the ACM Int. Conference on Multimedia (MM '14). ACM, New York, NY, USA, 1097-1100.

Significantly faster search (by about 48%) than common image browser on iPad!


Page 71: Interactive Video Search - Tutorial at ACM Multimedia 2015

IVS Tools: Ad-Hoc Similarity Search


Page 72: Interactive Video Search - Tutorial at ACM Multimedia 2015

The Video Explorer

[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247-258). ACM. ]

Page 73: Interactive Video Search - Tutorial at ACM Multimedia 2015

Interactive Navigation Summaries

Allows a user to quickly identify similar/repeating scenes

[ Schoeffmann, K., & Boeszoermenyi, L. (2009, June). Video browsing using interactive navigation summaries. In Content-Based Multimedia Indexing, 2009. CBMI'09. Seventh Int. Workshop on (pp. 243-248). IEEE. ]

Page 74: Interactive Video Search - Tutorial at ACM Multimedia 2015

Motion Layout: Direction + Intensity

Motion Vector (µ) classification into K=12 equidistant motion directions

Mapping to Hue channel
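The direction-to-hue mapping on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bin centring, the use of the HSV value channel for motion intensity, and the function names `direction_bin` and `bin_to_rgb` are assumptions made for the example.

```python
import colorsys
import math

K = 12  # number of equidistant direction bins, as stated on the slide


def direction_bin(dx: float, dy: float, k: int = K) -> int:
    """Return the index (0..k-1) of the direction bin for motion vector (dx, dy)."""
    angle = math.atan2(dy, dx) % (2 * math.pi)  # angle in [0, 2*pi)
    width = 2 * math.pi / k
    # Shift by half a bin so that bin 0 is centred on 0 degrees (assumption).
    return int(((angle + width / 2) % (2 * math.pi)) // width) % k


def bin_to_rgb(bin_index: int, intensity: float, k: int = K) -> tuple:
    """Map a direction bin to a hue; map intensity (0..1) to brightness (assumption)."""
    hue = bin_index / k
    value = max(0.0, min(1.0, intensity))
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return (round(r * 255), round(g * 255), round(b * 255))
```

With this convention, rightward motion lands in bin 0 and is drawn in red, and the colour darkens as the motion intensity drops.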


[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE Int. Conf. on (pp. 658-661). IEEE. ]


Page 75: Interactive Video Search - Tutorial at ACM Multimedia 2015


[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE Int. Conf. on (pp. 658-661). IEEE. ]

Similarity Search (SOI) with Motion Layout


Page 76: Interactive Video Search - Tutorial at ACM Multimedia 2015

• SOI Search: motion-based search by example sequence

Using motion direction histogram Db

User-selected sequence

Find most similar sequences: compute distance to any possible sequence of same length

Match if below specified threshold
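The sliding-window comparison described above can be sketched as follows. The slide does not name the distance function, so the L1 distance between per-frame direction histograms, the averaging over the window, and the name `soi_search` are assumptions for illustration.

```python
def histogram_distance(h1, h2):
    """L1 distance between two direction histograms of equal length (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def soi_search(histograms, start, length, threshold):
    """Find windows similar to the user-selected sequence histograms[start:start+length].

    Returns (offset, distance) pairs with distance below the threshold, best first.
    """
    query = histograms[start:start + length]
    matches = []
    for offset in range(len(histograms) - length + 1):
        if offset == start:
            continue  # skip the query sequence itself
        window = histograms[offset:offset + length]
        dist = sum(histogram_distance(q, w) for q, w in zip(query, window)) / length
        if dist < threshold:
            matches.append((offset, dist))
    return sorted(matches, key=lambda m: m[1])
```

Every candidate window has the same length as the query, so the cost grows with the number of frames times the query length. That is affordable for search within a single video.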


Motion Layout (Db)

Match 1 Match 2 Match 3

frame 1 frame n

Similarity Search (SOI) with Motion Layout

Page 77: Interactive Video Search - Tutorial at ACM Multimedia 2015

Region-of-Interest (ROI) Search: user selects spatial region-of-interest

On search: compute Euclidean distance of frame F to every other frame f (according to selected region)

Based on color layout descriptor

frame F

frame 1 frame k frame n

User-selected region (I)

d(F,1)=350 d(F,k)=8 d(F,n)=400
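The region-restricted distance can be sketched as follows; in the example values above, frame k (distance 8) would be the match. The real color layout descriptor is a compact DCT-based MPEG-7 feature. This sketch simplifies it to one average colour per grid cell, so the frame representation and the names `roi_distance` and `roi_search` are assumptions.

```python
import math


def roi_distance(frame_a, frame_b, region):
    """Euclidean distance between two frames, restricted to the grid cells in `region`.

    Each frame is a dict mapping a (row, col) grid cell to a colour tuple
    (a simplified stand-in for the color layout descriptor).
    """
    total = 0.0
    for cell in region:
        total += sum((a - b) ** 2 for a, b in zip(frame_a[cell], frame_b[cell]))
    return math.sqrt(total)


def roi_search(frames, query_index, region):
    """Rank all other frames by their ROI distance to frames[query_index]."""
    query = frames[query_index]
    ranked = [
        (roi_distance(query, frame, region), index)
        for index, frame in enumerate(frames)
        if index != query_index
    ]
    return sorted(ranked)
```

Only the selected cells contribute, so frames that differ outside the region of interest can still rank as close matches.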

[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247-258). ACM. ]

Similarity Search (ROI) with Color Layout

Page 78: Interactive Video Search - Tutorial at ACM Multimedia 2015

[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247-258). ACM. ]

Similarity Search (ROI) with Color Layout

Page 79: Interactive Video Search - Tutorial at ACM Multimedia 2015

IVS Tools: Sketch-Based Search


Page 80: Interactive Video Search - Tutorial at ACM Multimedia 2015

• Color sketches mapped to feature signatures

• Matched to those of keyframes


1. Sampling keypoints

2. Description through location (x,y), CIE Lab, contrast and entropy of surrounding pixels

3. K-means clustering
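The three steps above can be sketched as follows, assuming the keypoints have already been sampled and described as numeric feature vectors. This is plain k-means with a fixed number of clusters, which is a simplification for illustration rather than the cited extraction method; the name `kmeans_signature` and its parameters are hypothetical.

```python
import random


def kmeans_signature(features, k, iterations=10, seed=0):
    """Cluster feature vectors and return a signature [(centroid, weight), ...].

    `features` is a list of equal-length tuples, e.g. (x, y, L, a, b, contrast, entropy).
    The weight of a centroid is the fraction of keypoints assigned to it.
    """
    rng = random.Random(seed)
    centroids = rng.sample(features, k)  # initial centroids drawn from the data
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for f in features:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(f, centroids[c])),
            )
            clusters[nearest].append(f)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return [(c, len(cl) / len(features)) for c, cl in zip(centroids, clusters) if cl]
```

A colour sketch drawn by the user can be turned into the same kind of weighted centroid list and then compared with the signatures of the keyframes.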

Feature Signatures

[ Kruliš, M., Lokoč, J. and Skopal, T. (2013). Efficient Extraction of Feature Signatures Using Multi-GPU Architecture. Springer Berlin Heidelberg, LNCS 7733, pp.446-456. ]


Page 81: Interactive Video Search - Tutorial at ACM Multimedia 2015

Feature Signature-Based Video Browser

Color Sketch (Signature)

Player

Winner of Video Browser Showdown 2014 + 2015

Download demo at: http://siret.ms.mff.cuni.cz/lokoc/vbs.zip

2nd Color Sketch (optional)

[ Lokoč, J., Blažek, A., & Skopal, T. (2014, January). Signature-Based Video Browser. In MultiMedia Modeling (pp. 415-418). Springer International Publishing. ]


Page 82: Interactive Video Search - Tutorial at ACM Multimedia 2015

Compact visualization

Simple color-position sketch

Negative example

Matched key-frames

Time to 2nd sketch

2nd optional sketch

Interactive-navigation summary

On demand neighborhood expansion

[Slide: Adam Blazek et al. (siret research group, Czech Republic)]


Page 83: Interactive Video Search - Tutorial at ACM Multimedia 2015

Compact Visualization in Detail


[Courtesy of Jakub Lokoc et al.]

Page 84: Interactive Video Search - Tutorial at ACM Multimedia 2015

Another Example of a Color-Based Browser


Browsing Area

Map

Segment Inspection

Color Sketch

Categories

[Kai Uwe Barthel, Nico Hezel, Radek Mackowiak. Navigating a graph of scenes for exploring large video collections, in Proc. of 22nd International Conference on MultiMedia Modeling (MMM 2016), Lecture Notes in Computer Science (LNCS), Vol. tbd, Springer International Publishing, 2016, pp. 1-7]

Page 85: Interactive Video Search - Tutorial at ACM Multimedia 2015

Evaluation of IVS Tools

Page 86: Interactive Video Search - Tutorial at ACM Multimedia 2015

TRECVID

http://trecvid.nist.gov/

• International video retrieval competition evaluation

Annually performed by NIST (Gaithersburg, Maryland, USA)

Funded by NIST and other US government agencies

Benchmark for researchers using same data

Origin in TREC (Text REtrieval Conference, since 1992)

• Founded in 2003 by

Alan Smeaton (Dublin City University)

Wessel Kraaij (TNO-ICT, Delft)

• International advisory committee

Alex Hauptmann (CMU)

Michael Lew (Leiden Institute of Advanced Computer Science)

Georges Quenot (LIG, Grenoble)

John Smith (IBM Research)

…

• Local organisation

Paul Over (NIST)


Page 87: Interactive Video Search - Tutorial at ACM Multimedia 2015

TRECVID Known-item Search

TRECVID KIS (2010-2012) models the situation in which “someone knows of a video, has seen it before, believes it is contained in a collection, but doesn’t know where to look”

Automatic Search: text description of the video

Return ranked list of 100 videos (out of 9000)

Interactive Search: pre-processing based on text query

Searcher browses through result list (e.g., keyframes of shots)

• Interactively find target video as fast as possible

• Within 5 minutes


Page 88: Interactive Video Search - Tutorial at ACM Multimedia 2015

TRECVID Known-item Search

The Performance of State-of-The-Art Video Retrieval Tools

Known items not found by any team:

Year   Interactive      Automatic          Out of
2010   5 / 24 (21%)     69 / 300 (22%)     15 teams
2011   6 / 25 (24%)     142 / 391 (36%)    9 teams
2012   2 / 24 (17%)     108 / 361 (29%)    9 teams

From: [Alan Smeaton, Paul Over, “Known-Item Search @ TRECVID 2012”, NIST, 2012]


Page 89: Interactive Video Search - Tutorial at ACM Multimedia 2015

MediaEval 2015

• Search and Anchoring in Video Archives

“Search for Multimedia Content”: multi-modal textual and visual descriptions of content of interest

“Automatic Anchor Selection”: predict key elements of videos as anchor points for hyperlinking

Professional (BBC) and non-professional content (users)


http://www.multimediaeval.org

Page 90: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Browser Showdown (VBS)

• Annual performance evaluation competition

Live evaluation of search performance

Special session at Int. Conference on MultiMedia Modeling (MMM)

Demonstrates and evaluates state-of-the-art interactive video search tools

Idea influenced by VideOlympics (Snoek et al., IEEE Multimedia 2008)

• Focus

Known-item Search tasks

Target clips are presented on site

Teams search in shared data set

Highly interactive search

Should push research on interfaces and interaction/navigation

Experts and Novices

Easy-to-use tools and methods


Teams connected to a server that issues tasks and evaluates submitted results

http://videobrowsershowdown.org/


Page 91: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Browser Showdown (VBS)

• Scoring through VBS Server

• Score (s) [0-100] for task i and team k is based on:

Solve time (t)

Penalty (p) based on number of submissions (m)

Maximum solve time (Tmax) typically 3-5 minutes
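The slide names the inputs of the score but the formula itself is not reproduced in this transcript. The sketch below shows one scoring function that is consistent with the description. The linear decay from 100 to 50 points and the form of the penalty divisor are assumptions for illustration only; the cited paper is the authoritative source for the exact definition.

```python
def vbs_score(t, t_max, m):
    """Illustrative score in [0, 100] for a task solved after t seconds.

    t:     solve time in seconds
    t_max: maximum solve time (Tmax), typically 180 to 300 seconds
    m:     number of submissions, including the correct one
    """
    if m < 1 or t > t_max:
        return 0.0  # no submission or time limit exceeded
    penalty = max(1, m - 1)  # assumed penalty: grows with extra submissions
    return (100.0 - 50.0 * t / t_max) / penalty
```

Under these assumptions a team that answers instantly with one submission gets 100 points, a correct answer at the time limit still earns 50, and repeated guessing divides the score.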

[Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1-15. ]


Page 92: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Browser Showdown 2015

• Search in mid-sized video collections (2016: 200 hours)

Originally only single video search

• Two different kinds of tasks:

Visual: visual presentation of a 30s target clip

Textual: textual description of a 30s target clip

• Shared video data from BBC

2015: 153 video files, about 100,000 shots (9 million frames)

Participants:

• Need to find the target clips as quickly as possible

• Get points for each task (the faster the better)

• But only for submission of exact location of target clip

[Schoeffmann, Klaus. "A user-centric media retrieval competition: The video browser showdown 2012-2014." MultiMedia, IEEE 21.4 (2014): 8-13.]


Page 93: Interactive Video Search - Tutorial at ACM Multimedia 2015

2012: Klagenfurt, 11 teams

2013: Huangshan, 6 teams

2014: Dublin, 7 teams

2015: Sydney, 9 teams

VBS 2016: January 5, 2016, Miami, USA (MMM 2016)

http://www.videobrowsershowdown.org/

Page 94: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Browser Showdown 2012

Two examples (of the 11 tools; single video search only)

[Xiangyu Chen, Jin Yuan, Liqiang Nie, Zheng-Jun Zha, Shuicheng Yan, and Tat-Seng Chua, "TRECVID 2010 Known-item Search by NUS", in Proceedings of TRECVID 2010 workshop, NIST, Gaithersburg, USA, 2011

Jin Yuan, Huanbo Luan, Dejun Hou, Han Zhang, Yan-Tao Zheng, Zheng-Jun Zha, and Tat-Seng Chua, "Video Browser Showdown by NUS", in Proceedings of the 18th International Conference on Multimedia Modeling (MMM) 2012, Klagenfurt, Austria, pp. 642-645]

• Keyframe extraction (shots)

• ASR and OCR

• HLF (Concepts)

• RF with Related Samples

• Uniform sampled keyframes (with flexible distance)

• Parallel playback + navigation

[Manfred Del Fabro and Laszlo Böszörmenyi, "AAU Video Browser: Non-Sequential Hierarchical Video Browsing without Content Analysis", in Proceedings of the 18th International Conference on Multimedia Modeling (MMM) 2012, Klagenfurt, Austria, pp. 639-641]

Winner of VBS 2012


Page 95: Interactive Video Search - Tutorial at ACM Multimedia 2015

Winner 2014 and 2015 (2014: single video and collection search, 2015: collection only)

Color Sketch (Signature)

Player

2nd Color Sketch (optional)

[ Lokoč, J., Blažek, A., & Skopal, T. (2014, January). Signature-Based Video Browser. In MultiMedia Modeling (pp. 415-418). Springer International Publishing. ]


Page 96: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Browser Showdown 2015

Two examples (of the 9 tools, collection search only)

Moumtzidou, A., Avgerinakis, K., Apostolidis, E., Markatopoulou, F., Apostolidis, K., Mironidis, T., ... & Patras, I. (2015, January). VERGE: A Multimodal Interactive Video Search Engine. In MultiMedia Modeling (pp. 249-254). Springer International Publishing.

• Shot and scene detection

• HLF (Concepts) with SIFT/SURF and VLAD

• Similarity search

• Uniform sampled frames

• Human computation

Hürst, W., van de Werken, R., & Hoet, M. (2015, January). A Storyboard-Based Interface for Mobile Video Browsing. In MultiMedia Modeling (pp. 261-265). Springer International Publishing.

3rd place


Page 97: Interactive Video Search - Tutorial at ACM Multimedia 2015


URL: http://mklab-services.iti.gr/vss2015/ [Courtesy of Stefanos Vrochidis]

Page 98: Interactive Video Search - Tutorial at ACM Multimedia 2015


[Courtesy of Stefanos Vrochidis]

Page 99: Interactive Video Search - Tutorial at ACM Multimedia 2015

Similarity Search Results

[Courtesy of Stefanos Vrochidis]

Page 100: Interactive Video Search - Tutorial at ACM Multimedia 2015


[Courtesy of Stefanos Vrochidis]

Page 101: Interactive Video Search - Tutorial at ACM Multimedia 2015

Human vs. Machine

• Utrecht University @ VBS 2015

Wolfgang Huerst et al., The Netherlands

Strong experience in HCI

• Features

Uniformly sampled thumbs (1 second distance)

Huge storyboard on tablet

Vertical scrolling, paging

625 thumbnails in one screen
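To get a feeling for how much paging such a storyboard requires, the sampling and screen layout can be sketched as follows. The one-second interval and the 625 thumbnails per screen come from the slide; the function name `storyboard_pages` and the handling of the last partial screen are assumptions.

```python
def storyboard_pages(duration_s, interval_s=1.0, per_screen=625):
    """Split uniformly sampled thumbnail timestamps into storyboard screens.

    Returns a list of screens, each a list of timestamps in seconds.
    """
    count = int(duration_s // interval_s) + 1  # include the frame at t = 0
    timestamps = [i * interval_s for i in range(count)]
    return [timestamps[i:i + per_screen] for i in range(0, count, per_screen)]
```

Under these assumptions a one-hour video yields 3601 thumbnails on six screens, so the searcher relies purely on fast human inspection rather than on content analysis.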

[Hürst, W., van de Werken, R., & Hoet, M. (2015, January). A Storyboard-Based Interface for Mobile Video Browsing. In MultiMedia Modeling (pp. 261-265). Springer International Publishing.]


Page 102: Interactive Video Search - Tutorial at ACM Multimedia 2015

Visual Lifelogging


Slides by Frank Hopfgartner

Page 103: Interactive Video Search - Tutorial at ACM Multimedia 2015

What is The Quantified Self?

The Quantified Self is about obtaining self-knowledge through self-tracking.


Page 104: Interactive Video Search - Tutorial at ACM Multimedia 2015

What is The Quantified Self?

Self-tracking is also referred to as lifelogging, self-analysis, or self-hacking.


Page 105: Interactive Video Search - Tutorial at ACM Multimedia 2015

Memex

Bush, Vannevar. "As We May Think." The Atlantic Monthly. July 1945.

Images of Memex: http://trevor.smith.name/memex/

Page 106: Interactive Video Search - Tutorial at ACM Multimedia 2015

MyLifeBits

• Gordon Bell (Microsoft) digitized his life:

Books written

Personal documents

Photos

Posters, paintings, photo of things

Home movies and videos

CD collection

PC files

…

Gordon Bell and Jim Gemmell. Total Recall: How the E-Memory Revolution will change everything, New York, Dutton 2009

http://research.microsoft.com/en-us/projects/mylifebits/


Page 107: Interactive Video Search - Tutorial at ACM Multimedia 2015

Creating Personal Lifebraries

A lifebrary consists of heterogeneous data recorded using many different sensors.


Page 108: Interactive Video Search - Tutorial at ACM Multimedia 2015

Recording what I eat

Aizawa, Kiyoharu, Maruyama, Yutu, Li, He, and Morikawa, Chamin. “Food Balance Estimation by Using Personal Dietary Tendencies in a Multimedia Food Log." IEEE Transactions on Multimedia, 15(8):2176-2185, 2013.

Semantic Gap

http://foodlog.jp/

http://mealsnap.com/


Page 109: Interactive Video Search - Tutorial at ACM Multimedia 2015

Recording what I see

"LifeGlogging cameras 1998 2004 2006 2013 labeled" by Glogger - Own work. Licensed under CC BY-SA 3.0 via Commons - https://commons.wikimedia.org/wiki/File:LifeGlogging_cameras_1998_2004_2006_2013_labeled.jpg#/media/File:LifeGlogging_cameras_1998_2004_2006_2013_labeled.jpg

Page 110: Interactive Video Search - Tutorial at ACM Multimedia 2015

Visual Lifelogging


Page 111: Interactive Video Search - Tutorial at ACM Multimedia 2015

Example: Visual Lifelog of a day

5,500 pictures a day


[Slide: C. Gurrin, DCU]

Page 112: Interactive Video Search - Tutorial at ACM Multimedia 2015

Big Data

Cathal Gurrin, Alan F. Smeaton and Aiden R. Doherty (2014), "LifeLogging: Personal Big Data", Foundations and Trends® in Information Retrieval: Vol. 8: No. 1, pp 1-125.


Page 113: Interactive Video Search - Tutorial at ACM Multimedia 2015

Semantic Analysis

• Context cues help us to remember (Naaman et al.)

• Context in lifelogging data:

Location, bluetooth, time, date, …

Derived knowledge (e.g. activities)

• Approaches:

Combine cues from different sources

Perform content analysis to identify objects, people, events…

Annotate lifelogs in form of narrative text

Mor Naaman, Susumu Harada, QianYing Wang, Hector Garcia-Molina, Andreas Paepcke: Context data in geo-referenced digital photo collections. ACM Multimedia 2004: 196-203


[Slide: C. Gurrin, DCU]

Page 114: Interactive Video Search - Tutorial at ACM Multimedia 2015

Visual Feature Extraction

Steering wheel (72%)
Shopping (75%)
Inside of vehicle when not driving (airplane, taxi, car, bus) (60%)
Toilet/Bathroom (58%)
Giving Presentation / Teaching (29%)
View of Horizon (23%)
Door (62%)
Staircase (48%)
Hands (68%)
Holding a cup/glass (35%)
Holding a mobile phone (39%)
Eating food (41%)
Screen (computer/laptop/tv) (78%)
Reading paper/book (58%)
Meeting (34%)
Road (47%)
Vegetation (64%)
Office Scene (72%)
Faces (61%)
People (45%)
Grass (61%)
Sky (79%)
Tree (63%)

Byrne, Daragh, Doherty, Aiden R., Snoek, Cees G. M., Jones, Gareth J. F., Smeaton, Alan F. “Everyday concept detection in visual lifelogs: validation, relationships and trends." Multimedia Tools and Applications, 49(1):119-144, 2010.


Page 115: Interactive Video Search - Tutorial at ACM Multimedia 2015

A day

This does not work well… Let’s add event segmentation.


[Slide: C. Gurrin, DCU]

Page 116: Interactive Video Search - Tutorial at ACM Multimedia 2015

Event Segmentation & Annotation

• Segment 5,500 photos per day into a set of events

Similar to shot boundary detection (SBD) in digital video processing

We employ visual features and output of on-device sensors
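The analogy with shot boundary detection can be sketched as follows: a new event starts wherever consecutive images differ strongly. This is a minimal threshold-based illustration, not the method used for the slide. It assumes that each image is already represented by one numeric feature vector (visual features, sensor readings, or both), and the name `segment_events` is hypothetical.

```python
import math


def segment_events(features, threshold):
    """Split a chronological list of feature vectors into events.

    A boundary is placed between two consecutive images whose Euclidean
    distance exceeds the threshold. Returns lists of image indices.
    """
    if not features:
        return []
    events = [[0]]
    for i in range(1, len(features)):
        if math.dist(features[i - 1], features[i]) > threshold:
            events.append([i])  # strong change: start a new event
        else:
            events[-1].append(i)
    return events
```

One representative image per event, for example the middle one, could then serve as a keyframe for the summary of the day.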

Multiple Events

Finishing work in the lab

At the bus stop

Chatting at Skylon Hotel lobby

Moving to a room

Tea time

On the way back home

Event Segmentation

Summarization


[Slide: C. Gurrin, DCU]

Page 117: Interactive Video Search - Tutorial at ACM Multimedia 2015

Non-supervised Event Segmentation

2. Arriving in the office

6. Walking in the building

12. Leaving the office

Na Li et al. “Random Matrix Ensembles of Time Correlation Matrices to Analyze Visual Lifelogs." In Proc. Multimedia Modeling Conference, Dublin, Ireland, pp. 400-411, 2014.

Event Segmentation based on the extraction of low level features and computation of semantic concepts requires knowledge about dataset.

Alternative: Highlight “significant events” by performing time series analysis


Page 118: Interactive Video Search - Tutorial at ACM Multimedia 2015

MyLifeBits

Gordon Bell and Jim Gemmell. Total Recall: How the E-Memory Revolution will change everything, New York, Dutton 2009


Page 119: Interactive Video Search - Tutorial at ACM Multimedia 2015

MyLifeBits

Gordon Bell and Jim Gemmell. Total Recall: How the E-Memory Revolution will change everything, New York, Dutton 2009


Page 120: Interactive Video Search - Tutorial at ACM Multimedia 2015

Virtual reality

“Bad Trip is an immersive virtual reality installation […] that enables people to navigate the creator's mind using a game controller. Since November 2011, every moments of his life has been documented by a video camera mounted on glasses, producing an expanding database of digitalized visual memories. Using custom virtual reality software, he created a virtual mindscape where people could navigate, and experience his memories and dreams.”

Source: http://www.kwanalan.com

Page 121: Interactive Video Search - Tutorial at ACM Multimedia 2015

Art installations

Kelly, Philip and Doherty, Aiden R. and Smeaton, Alan F. and Gurrin, Cathal and O’Connor, Noel E. “The Colour of Life: Novel Visualisations of Population Lifestyles." In Proc. ACM Multimedia, pp. 1063-1066, 2010.

Image: Courtesy of C. Gurrin

Page 122: Interactive Video Search - Tutorial at ACM Multimedia 2015

Video Summary


[Courtesy of T. Plumbaum]

Page 123: Interactive Video Search - Tutorial at ACM Multimedia 2015

NTCIR

• Workshop series focusing on research on Information Access technologies (information retrieval, question answering, text summarisation, etc)

• Sponsored by Japan Society for Promotion of Science (JSPS)

• Organised since 1997 in an 18-month cycle

• NTCIR-12: January 2015 – June 2016

NII Test Collection for IR Systems


Page 124: Interactive Video Search - Tutorial at ACM Multimedia 2015

NTCIR-12 Tasks

Second round: Search-Intent Mining

Mobile Click

Temporal Information Access

Spoken Query & Spoken Document Retrieval

QA Lab for Entrance Exam

First round: Medical NLP for Clinical Documents

Personal Lifelog Access & Retrieval

Short Text Conversation


Page 125: Interactive Video Search - Tutorial at ACM Multimedia 2015

Encourage research advances in organising and retrieving from lifelog data.

LifeLog @ NTCIR-12


Page 126: Interactive Video Search - Tutorial at ACM Multimedia 2015

Multimodal dataset with information needs

Created by various individuals over 10+ days

TEST COLLECTION

1,500 images, location, GSR, heart-rate, others… per lifelogger per day

Accompanying output of 1,000 concepts

Data processed pre-release (removal of personal content; face blurring, translation of concepts)

Detailed user queries and judgments generated by the lifelogging data gatherers

Page 127: Interactive Video Search - Tutorial at ACM Multimedia 2015

Tasks

Evaluate different methods of retrieval and access.

T1: LIFELOG SEMANTIC ACCESS (LSAT)

Models the retrieval need from lifelogs (Known-item Search)

Retrieve N segments that match information need

Interactive or Automatic participation

Interactive: Time limit for fair and comparative evaluation in an interactive system with users

Automatic: Fully-automatic retrieval system. Automated query processing

T2: LIFELOG INSIGHT

Models the need for reflection over lifelog data

Exploratory task, the aim is to:

Encourage broad participation

Novel methods to visualize and explore lifelogs

Same data as LSAT task

Presented via demo/poster

Page 128: Interactive Video Search - Tutorial at ACM Multimedia 2015

Task 1: Lifelog Semantic Access

Find the moment(s) where I use my coffee machine.

Find the moment(s) where I am in the kitchen.

Find the moment(s) where I am playing with my phone.

Find the moment(s) where I am preparing breakfast.

Page 129: Interactive Video Search - Tutorial at ACM Multimedia 2015

Task 2: Lifelog Insight Task

Provide insights on the time I spend taking breakfast.

Provide insights on the time I spend driving to work.

Provide insights on the time I spend reading a paper.

Provide insights on the time I spend working on the computer.

Page 130: Interactive Video Search - Tutorial at ACM Multimedia 2015

Further information

http://ntcir-lifelog.computing.dcu.ie/

21 Sep 2015: Release of formal run collection and task data (topics)

15 Dec 2015: Deadline for formal run submissions

15 Jan 2016: Formal run evaluation results return

01 Mar 2016: Paper for the Proceedings

7-10 Jun 2016: NTCIR-12 conference

Page 131: Interactive Video Search - Tutorial at ACM Multimedia 2015

The End
