improving the search experiencein a social network with cross media contents

21
Improving the Search Experience in a Social Network with Cross Media Contents Daniele Cenni, Paolo Nesi University of Florence Department of Systems and Informatics Distributed Systems and Internet Technology Laboratory [email protected] [email protected] , http://www.disit.dinfo.unifi.it DMS2013, August 2013, UK, Paolo Nesi 1

Upload: paolo-nesi

Post on 07-May-2015

59 views

Category:

Technology


0 download

DESCRIPTION

Indexing/Searching solution for ECLAP Social Network allowing: Indexing multilingual crossmedia content metadata and data (e.g. documents) Indexing portal blogs, forums, events, group pages, comments, etc. Efficient multilingual search (keyword search and advanced search) supporting: misspelled words (e.g. shespeare) partial word search Sorting and filtering search results re-index the whole data without blocking the system Log and monitor users activity … Evaluate the Indexing/Searching service Indexing & Search system Based on Apache Solr Multilingual aspects Translate the metadata or translate the query?.. both metadata translation Query translation Indexing schema Dublin Core + DCTerms (multi language) Performing Arts Technical (provider, content type, GPS, IPR, duration, quality, …) Groups associations (multi language) Taxonomy associations (multi language) Comments & multi language tags FullText of the textual digital resources

TRANSCRIPT

Page 1: Improving the Search Experiencein a Social Network with Cross Media Contents

Improving the Search Experience

in a Social Network with Cross

Media Contents

Daniele Cenni, Paolo Nesi

University of Florence Department of Systems and Informatics

Distributed Systems and Internet Technology Laboratory

[email protected]

[email protected] , http://www.disit.dinfo.unifi.it

DMS2013, August 2013, UK, Paolo Nesi 1

Page 2: Improving the Search Experiencein a Social Network with Cross Media Contents

ECLAP Social Network

ECLAP is a Digital Library on Performing

Arts connected with Europeana

ECLAP is a Best Practice and Social

Network (blogs, forums, comments,

tagging, voting, …)

DMS2013, August 2013, UK, Paolo Nesi 2

Page 3: Improving the Search Experiencein a Social Network with Cross Media Contents

Goals/Requirements Develop an Indexing/Searching solution for ECLAP Social

Network allowing: Indexing multilingual crossmedia content metadata and

data (e.g. documents)

Indexing portal blogs, forums, events, group pages, comments, etc.

Efficient multilingual search (keyword search and advanced search) supporting: misspelled words (e.g. shespeare)

partial word search

Sorting and filtering search results

re-index the whole data without blocking the system

Log and monitor users activity

Evaluate the Indexing/Searchig service

DMS2013, August 2013, UK, Paolo Nesi 3

Page 4: Improving the Search Experiencein a Social Network with Cross Media Contents

ECLAP ANY content kind Informative Content

Video, audio, images,

documents

3D, animations, Braille

Slide, Video-Slide, courses

eBook, ePub, Mpeg21,

intelligent

Aggregated Content:

Playlist, Collections

Annotations,

Synchronization

Support and networking

content:

Blog, WebPage, Events,

comments,

forum, votes, messages, … 4

comments

rating

relationships

technical

Dynamic

recommend

……………

• Performance

• Master classes

• Scene Sketches

• Scenography

• Scenes

• Private lives of

artists

• Scores

• Braille

• BackStage Stills

• Choreography

• Morals

• Poster

• Booklets

• Magazines Music

• Audio ballets

Page 5: Improving the Search Experiencein a Social Network with Cross Media Contents

ECLAP Semantic Model 1

DMS2013, August 2013, UK, Paolo Nesi

Media Object

Video Audio

Document

Group/Channel

CollectionPlaylist

0..n

0..n

1..n

0..n

Image

AVObjectAnnotation0..n 1..2

1..n

0..n

ForumWebPage

CommentContentTaxonomyTerm0..n 0..n 0..n1

0..n

0..n

Blog

Metadata

PerformingArts

Dublin Core

Technical

Main Annotation

Side Annotation

1..n

1

GeoName

Crossmedia

Archive

Event

epub

3D

IPR

Braille Music Score

5

Page 6: Improving the Search Experiencein a Social Network with Cross Media Contents

ECLAP Semantic Model 2

DMS2013, August 2013, UK, Paolo Nesi

User Group/Channel

Content

Media Object

Comment

Annotation

TaxonomyTerm

foaf:memberadmin

isProvidedBy

isFavouriteOf

dc:creator

dc:creator

foaf:topic_interest

isFeaturedBy

foaf:knows

6

Page 7: Improving the Search Experiencein a Social Network with Cross Media Contents

Indexing

Indexing & Search system Based on Apache Solr

Multilingual aspects Translate the metadata or translate the query?.. both

metadata translation

Query translation

Indexing schema Dublin Core + DCTerms (multi language)

Performing Arts

Technical (provider, content type, GPS, IPR, duration, quality, …)

Groups associations (multi language)

Taxonomy associations (multi language)

Comments & multi language tags

FullText of the textual digital resources

DMS2013, August 2013, UK, Paolo Nesi 7

Page 8: Improving the Search Experiencein a Social Network with Cross Media Contents

Indexing

DMS2013, August 2013, UK, Paolo Nesi 8

Page 9: Improving the Search Experiencein a Social Network with Cross Media Contents

Metadata Schema Indexing

DMS2013, August 2013, UK, Paolo Nesi 9

Page 10: Improving the Search Experiencein a Social Network with Cross Media Contents

Search Facilities Full text search

Uses the catch all fields to search for keywords in

most important fields in all languages (title,

description, text, body, subject,…)

Fuzzy search

Allows matching mistyped words

Deep search

Allows searching for partial words

Faceted Search

Maximasing Precision and Recall:

Relevance & boosting terms

DMS2013, August 2013, UK, Paolo Nesi 10

Page 11: Improving the Search Experiencein a Social Network with Cross Media Contents

Search Facilities vs Information

DMS2013, August 2013, UK, Paolo Nesi 11

Page 12: Improving the Search Experiencein a Social Network with Cross Media Contents

Searching Faceted search

DMS2013, August 2013, UK, Paolo Nesi 12

Page 13: Improving the Search Experiencein a Social Network with Cross Media Contents

Weighted Query Model

Where for the “q” query

Weights are boosting fields

Title is DC.Title, description DC.Description….,

Body is textual body, subject…,

taxonomy the full description of the taxonomy

branch

DMS2013, August 2013, UK, Paolo Nesi 13

Page 14: Improving the Search Experiencein a Social Network with Cross Media Contents

Model Optimization

Optimization of the Precision&Recall to

improve search quality

50 reference queries

Optimization Methods

Simulated Annealing

Genetic Algorithms

7 parameters

DMS2013, August 2013, UK, Paolo Nesi 14

Page 15: Improving the Search Experiencein a Social Network with Cross Media Contents

Monte Carlo Analysis

MAP: Mean Average Precision DMS2013, August 2013, UK, Paolo Nesi 15

Page 16: Improving the Search Experiencein a Social Network with Cross Media Contents

DMS2013, August 2013, UK, Paolo Nesi 16

Page 17: Improving the Search Experiencein a Social Network with Cross Media Contents

Some weights’ Trends

DMS2013, August 2013, UK, Paolo Nesi 17

Page 18: Improving the Search Experiencein a Social Network with Cross Media Contents

Comparative Results

MAP: Mean Average Precision DMS2013, August 2013, UK, Paolo Nesi 18

Page 19: Improving the Search Experiencein a Social Network with Cross Media Contents

Usage Results

Over than 500.000 visits

7.29 minutes of permanence on the

portal

DMS2013, August 2013, UK, Paolo Nesi 19

Page 20: Improving the Search Experiencein a Social Network with Cross Media Contents

Assessment of Search Facility

Distribution of performed clicks

First page

DMS2013, August 2013, UK, Paolo Nesi 20

Page 21: Improving the Search Experiencein a Social Network with Cross Media Contents

Conclusions indexing solution for

cross media for multilingual metadata and texts

Improved Searching & filtering results and thus user experience

quality

Providing: (full text, operators), advanced, faceted, etc.

Precision and Recall analysis allowed to tune the search services

Simulated Annealing and Genetic Algorithms produced similar

results

User behavior assessment has shown that search facility appreciation has been improved wrt to early previous

settings, grounded on common sense and classical

metadata relevance

DMS2013, August 2013, UK, Paolo Nesi 21