The CrowdSearch Framework


+ A Framework for Crowdsourced Multimedia Processing and Querying

Alessandro Bozzon, Ilio Catallo, Eleonora Ciceri, Piero Fraternali, Davide Martinenghi, Marco Tagliasacchi

+ CUbRIK Project

CUbRIK is a research project financed by the European Union.

Goals:
- Advance the architecture of multimedia search
- Exploit the human contribution in multimedia search
- Use open-source components provided by the community
- Start up a search business ecosystem

http://www.cubrikproject.eu/

+ Humans in Multimedia Information Retrieval

Problem: the uncertainty of analysis algorithms leads to low-confidence results and conflicting opinions on automatically extracted features.

Solution: humans have a superior capacity for understanding the content of audiovisual material.

State of the art: humans replace automatic feature extraction processes (human annotations).

Our contribution: integration of human judgment and algorithms.

Goal: improve the performance of multimedia content processing.

+ Example of CUbRIK Human-Enhanced Computation: Trademark Logo Detection

Problem statement: identifying occurrences of trademark logos in a video collection through keyword-based queries; a special case of the classic problem of object recognition.

Use case: a professional user wants to retrieve all the occurrences of logos in a large collection of video clips.

Applications: rating the effectiveness of advertising, subliminal advertising detection, automatic annotation, trademark violation detection.

+ Trademark Logo Detection: Problems in Automatic Logo Detection

- Object recognition is affected by the quality of the input set of images
- Uncertain matches, i.e., those with a low matching score, may not contain the searched logo (see the sketch below)
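To illustrate the second problem, here is a minimal sketch of how matches could be split by confidence, so that only the uncertain ones are routed to the crowd for validation. The `Match` type and the 0.8 threshold are illustrative assumptions, not CUbRIK code:

```python
# A sketch under assumptions: matches above a hypothetical score threshold
# are accepted automatically; low-score ones go to the crowd for validation.
from dataclasses import dataclass

@dataclass
class Match:
    video_id: str
    frame: int
    logo: str
    score: float  # normalized matching score in [0, 1]

CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off, not from the slides

def split_by_confidence(matches):
    """Separate confident matches from uncertain ones needing crowd validation."""
    confident = [m for m in matches if m.score >= CONFIDENCE_THRESHOLD]
    uncertain = [m for m in matches if m.score < CONFIDENCE_THRESHOLD]
    return confident, uncertain

confident, uncertain = split_by_confidence([
    Match("clip01", 120, "Aleve", 0.93),  # kept automatically
    Match("clip01", 480, "Aleve", 0.42),  # uncertain: routed to the crowd
])
```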

+ Trademark Logo Detection: Contribution of Human Computation

Humans contribute at three points (sketched below):
- Filter the input logos, eliminating the irrelevant ones
- Segment the input logos
- Validate the matching results
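A minimal sketch of where these three hooks could sit in the pipeline; all function names and data shapes are stand-ins for illustration, not CUbRIK components:

```python
# Illustrative pipeline: crowd hooks (filter, segment, validate) wrapped
# around a stand-in automatic matcher.

def crowd_filter(logos):
    # Crowd task: drop irrelevant input logos.
    return [logo for logo in logos if logo.get("relevant", True)]

def crowd_segment(logos):
    # Crowd task: crop each logo from its background.
    return [dict(logo, segmented=True) for logo in logos]

def automatic_matching(logos, videos):
    # Automatic component: content-based matching of logos in videos.
    return [{"logo": logo["name"], "video": v, "score": 0.9}
            for logo in logos for v in videos]

def crowd_validate(matches, threshold=0.5):
    # Crowd task: confirm or reject the uncertain matches.
    return [m for m in matches if m["score"] >= threshold]

def pipeline(logos, videos):
    return crowd_validate(
        automatic_matching(crowd_segment(crowd_filter(logos)), videos))
```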

+ Trademark Logo Detection: Pipeline

[Figure: the logo-detection pipeline]

+ The CrowdSearch Framework for Human Computation (HC) Task Management

+ CrowdSearch Framework in the Logo Detection Application

Types of tasks (modeled in the sketch below):
- Automatic tasks
- Crowd tasks: tasks that are executed by an open-ended community of performers
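A minimal sketch of this distinction; the types and field names are assumptions for illustration, not the framework's actual schema:

```python
# Two task categories in a CrowdSearch-style pipeline (illustrative types).
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class AutomaticTask:
    name: str
    run: Callable[[Dict], Dict]  # executed by a software component

@dataclass
class CrowdTask:
    name: str
    question: str
    answers: List[Dict] = field(default_factory=list)

    def collect(self, performer_id: str, answer: Dict) -> None:
        # Answers arrive from an open-ended community of performers.
        self.answers.append({"performer": performer_id, **answer})
```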

+ Community of Performers

- Deployment: the application is deployed as a Facebook application
- Seed community: the Information Technology department of Politecnico di Milano
- Task propagation: each user in the seed community can propagate tasks through their social networks

+ Design of “Validate Logo Images”

Two task variants (payloads sketched below):
- The “LIKE” variant asks performers to choose the relevant logos among a set of unfiltered images
- The “ADD” variant asks performers to submit the URLs of new relevant images

[UI mock-up of the “ADD” form: “Please add new relevant logos”, a URL field, and a Send button]
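A hedged sketch of what the two variants' payloads could look like; the field names and URLs are illustrative assumptions, not the actual CrowdSearch schema:

```python
# Illustrative payloads for the two task variants (assumed field names).

like_task = {
    "type": "LIKE",
    "question": "Which of these images show the searched logo?",
    "candidates": [
        "http://example.org/logo_a.jpg",  # hypothetical image URLs
        "http://example.org/logo_b.jpg",
    ],
    "likes": {},  # performer_id -> list of liked image URLs
}

add_task = {
    "type": "ADD",
    "question": "Please add new relevant logos",  # wording from the slide
    "submitted_urls": [],  # performers append image URLs here
}
```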

+ People-to-Task Matching & Task Assignment

Execution criteria: constraints on task execution
- Time budget for the experiment

Content affinity criteria: a query on a representation of the users’ capacities
- Current state: manual selection of users
- Future work: geocultural affinity

Questions are dispatched to the crowd according to the user’s experience in answering questions (see the routing sketch below):
- Expert user: a user who has already answered three questions
- New users answer “LIKE” questions
- Expert users answer “LIKE” + “ADD” questions
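A minimal sketch of this dispatch rule, assuming only an answered-question count per performer; the threshold of three comes from the slide, everything else is illustrative:

```python
# Dispatch rule from the slide: performers become "experts" after three
# answered questions and then receive both task variants.

EXPERT_THRESHOLD = 3  # answered questions needed to count as an expert

def eligible_task_types(answered_questions: int) -> list:
    """Return the task variants a performer may receive."""
    if answered_questions >= EXPERT_THRESHOLD:
        return ["LIKE", "ADD"]  # expert user
    return ["LIKE"]             # new user

assert eligible_task_types(0) == ["LIKE"]
assert eligible_task_types(3) == ["LIKE", "ADD"]
```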

+ Task Execution

[Screenshots: the “LIKE” task variant and the “ADD” task variant]

+ Output Aggregation

Aggregation rules (sketched below):
- “LIKE” task variant: the top-5 rated logos are selected as the relevant logos
- “ADD” task variant: new images are fed back into the “LIKE” tasks
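A sketch of both rules under assumptions about the data shapes: likes are tallied per image, the five most-liked images are kept, and URLs submitted through “ADD” tasks are queued as new “LIKE” candidates:

```python
# Aggregation sketch: top-k like counting plus ADD-to-LIKE feedback.
from collections import Counter

def aggregate_likes(votes, k=5):
    """votes: one image URL per like; returns the k most-liked images."""
    return [url for url, _ in Counter(votes).most_common(k)]

def feed_back_additions(added_urls, like_queue):
    """New images from ADD tasks become candidates in future LIKE tasks."""
    like_queue.extend(url for url in added_urls if url not in like_queue)
```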

+ Experimental Evaluation

Three experimental settings:
- No human intervention (“No Crowd”)
- Logo validation performed by two domain experts (“Experts”)
- Inclusion of the actual crowd knowledge (“Crowd”)

Crowd involvement:
- 40 people involved
- 50 task instances generated
- 70 collected answers
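Since the next slides report precision and recall, here is the standard computation of both metrics over retrieved vs. relevant logo occurrences; a generic sketch, not CUbRIK-specific code:

```python
# Standard precision/recall over sets of retrieved and relevant items.

def precision_recall(retrieved, relevant):
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# e.g. 8 of 10 retrieved occurrences correct, 8 of 16 relevant ones found:
assert precision_recall(set(range(10)), set(range(2, 18))) == (0.8, 0.5)
```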

+ Experimental Evaluation

[Figure: precision-recall plot for the Aleve, Chunky, and Shout logos under the No Crowd, Experts, and Crowd settings]

+ Experimental Evaluation

[Figure: precision-recall plot for the Aleve, Chunky, and Shout logos under the No Crowd, Experts, and Crowd settings]

Precision decreases. Reasons for the wrong inclusions:
- Geographical location of the users
- Expertise of the involved users

+ Experimental Evaluation

[Figure: precision-recall plot for the Aleve, Chunky, and Shout logos under the No Crowd, Experts, and Crowd settings]

Precision decreases:
- Similarity between two logos in the data set

+ Future Directions

- Task design: implement new task types (tag / comment / like / add / modify…); partition large task instances into several smaller instances dispatched to multiple users
- Task assignment: study how to associate the most suitable request with the most appropriate user; implement a ranking function over the worker pool, based on the expertise, geocultural information, and past work history of the performers
- Task execution: support multiple heterogeneous platforms (Facebook, LinkedIn, Twitter, stand-alone application)
- More use cases: breaking news, fashion trends
