Download - Description and search of multimedia datacsu.unipv.it/wp-content/uploads/2020/01/10_search.pdf · Content-based retrieval on shape Classical methods of analysis of the shape of a

Description and search of multimedia data Digital Content Retrieval

Prof.ssa Maria Grazia Albanesi

Topics

The problem: searching in multimedia data

Why is it different from text search?

How to search a multimedia (MM) database

From data to information: what is the difference?

How to evaluate a system for finding information in MM data?

Case studies

References

Book:

H. Blanken, A. P. de Vries, H. E. Blok, L. Feng: “Multimedia Retrieval”, Springer, 2007.

Web

http://labs.exalead.com/applications

The problem

Subtitle of the book: Data-Centric Systems and Applications.

What does this mean?

Purpose of the lesson: answer the following questions:

Why is searching multimedia data different from searching text?

How can the media content (the object of the search) be described?

What is the quality of the search process?

How can the user interact with the search system, and under what constraints?

Diagram of a system for storage and retrieval of MM data

Terminology

Indexing

Search (better: retrieval)

Querying (executing a search query)

Browsing (searching with no criteria, or with only loosely binding criteria, by examining data presented in an automatic or semi-automatic way)

In which contexts is search a daily problem? Reference: textbook (introduction)

Journalism: a journalist must prepare an MM report on the effects of alcohol on driving

I am watching TV and I try to find a program in the broadcaster's archive.

Web search on a sports topic:

"Get information about U.S. tennis players, including video clips of games that show a player going to the net"

Other examples?

Retrieval of text vs. retrieval of MM

Text is stored in a relational database with a rigid (but dynamic) structure

Do you understand the difference?

Employees (Name: char(20), City: char(20), Photo: image)

SELECT Name FROM Employees WHERE City = 'Pavia'

There is a (standard) language, SQL, or slight variations of it (FQL)

But what if the target of the search were:

I want the names of all the bald employees?

First problem: the language and the relational structure cannot analyze the semantics of multimedia information; they analyze only the DATA.
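A minimal sketch of this first problem, using Python's built-in sqlite3 module. The table and column names follow the slide; the sample rows and the placeholder image bytes are invented for illustration. The query on City works because City is plain data, while the Photo column is an opaque sequence of bytes for which SQL offers no predicate such as "is bald".

```python
import sqlite3

# In-memory database with the Employees table from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, City TEXT, Photo BLOB)")

# Hypothetical rows; the photo bytes are placeholders, not real images.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("Rossi", "Pavia", b"\x89PNG..."),
        ("Bianchi", "Milano", b"\x89PNG..."),
        ("Verdi", "Pavia", b"\x89PNG..."),
    ],
)

# A query on alphanumeric data is easy to express and to execute.
names_in_pavia = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM Employees WHERE City = ? ORDER BY Name", ("Pavia",)
    )
]
print(names_in_pavia)

# There is no equivalent WHERE clause for "bald employees":
# the database can only compare the Photo bytes, not interpret them.
```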

What is a semantic aspect?

"Semantics is the part of linguistics that studies the meaning of words (lexical semantics), sets of words, phrases (phrasal semantics) and texts."

(Source: Wikipedia)

And the semantic web?

The term Semantic Web, coined by Tim Berners-Lee, denotes the transformation of the World Wide Web into an environment where the published documents (HTML pages, files, images, and so on) are associated with information and data (metadata) that specify their semantic context in a format suitable for querying, interpretation and, more generally, automatic retrieval.

Example: Beethoven's Sixth Symphony and 3D TV glasses: what is the connection?

Terminology

Multimedia = more than one medium (media)

The term medium can have multiple meanings:

Means of communicating information

Means of support for information (NOT our case)

Type of structured data

Text, web pages (HTML), audio, video, still images (with or without audio), graphics, topographic data, 3D views, virtual reality, augmented reality, ...

Multimedia: a collection of more than one type of media used together

At least one type of medium should be non-alphanumeric

Only digital media

Text can contain only alphanumeric characters. Is the converse true?

Another problem!

Compared to text, multimedia occupies more space

Storage problem

Search problem (the search space can be huge)

Taken from the textbook:

Storage:

A book of 500 pages: 2 MB

100 color images: 144 MB

1 hour of CD audio: 635 MB

1 hour of video: 68.4 GB

Comment: are these figures reliable?

This problem affects those involved in storage and those involved in providing search algorithms, not those involved in measuring the effectiveness of the search.
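A back-of-the-envelope check of the storage figures can be sketched as follows. The slides do not state how the figures were obtained, so every parameter here is an assumption: CD audio is taken as 44.1 kHz, 16-bit, stereo; the "100 color" entry is read as 100 colour images of 800x600 pixels at 24 bits per pixel; 1 MB is taken as 10^6 bytes. For video, the sketch only derives the data rate implied by the slide's figure.

```python
# Assumed CD audio parameters (not given in the slides).
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2      # 16 bits
CHANNELS = 2              # stereo
SECONDS_PER_HOUR = 3600

audio_bytes_per_hour = (
    SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * SECONDS_PER_HOUR
)

# Assumed image size: 800x600 pixels, 3 bytes (24 bits) per pixel.
image_bytes = 800 * 600 * 3
hundred_images_bytes = 100 * image_bytes

# Data rate implied by "1 hour of video = 68.4 GB" (uncompressed).
video_mb_per_s = 68.4e9 / SECONDS_PER_HOUR / 10**6

print(f"1 hour of CD audio: {audio_bytes_per_hour / 10**6:.2f} MB")
print(f"100 colour images:  {hundred_images_bytes / 10**6:.2f} MB")
print(f"Implied video rate: {video_mb_per_s:.1f} MB/s")
```

Under these assumptions the audio and image figures on the slide are reproduced, which suggests they refer to uncompressed data.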

How to reduce the problem to a simpler case: metadata

Metadata describes some aspect of the meaning of multimedia data, turning it

into "information."

Photos of an apple

Metadata:

Painting: painting, apple, Cézanne

Botany: apple, fruit, Stark

Metadata

Should describe the MM data as fully as possible

Should not introduce too much storage overhead

The comparison between two metadata items must be "fast"

Descriptiveness:

Descriptive information: concerns the format of the MM data or a fact connected with it

Author, creation date, length of the MM data, representation techniques (Dublin Core)

Content annotation: gives information on the content of the MM data

Semantic annotation: gives information on the meaning of the MM data
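The three kinds of metadata can be pictured as one record per MM item. The following is only an illustrative sketch: the class name MMMetadata, its field names and all the values are hypothetical, and only the keys creator, date and format are borrowed from Dublin Core element names.

```python
from dataclasses import dataclass, field


@dataclass
class MMMetadata:
    """Hypothetical metadata record for one MM item."""
    # Facts about the item and its format (keys follow Dublin Core names).
    descriptive: dict = field(default_factory=dict)
    # What is visible or audible in the item.
    content: list = field(default_factory=list)
    # What the item means to a human observer.
    semantic: list = field(default_factory=list)


# Invented example values, for illustration only.
record = MMMetadata(
    descriptive={"creator": "unknown", "date": "2020-01-01", "format": "image/jpeg"},
    content=["apple", "table"],
    semantic=["still life"],
)

# Keyword lists keep the record small and make comparisons cheap.
shares_apple = "apple" in record.content
print(record.descriptive["format"], shares_apple)
```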

Exercise

Which semantic annotations would you associate with a photo album of historical agricultural tools?

Problems of metadata-annotations

Annotations (both semantic and content) are added manually or in a semi-automatic way (dubious!)

The method is slow, expensive and prone to incompleteness and subjectivity

Example:

One day, searching for Pavia on Google Images, I found this in first place

And this in second place!!

Problems of metadata-annotations

Annotations are often entered manually (high cost)

The criterion by which the data is annotated is not clear

Problem of synonyms: two different words can have the same meaning (and the same word can have different meanings)

Updating costs are very high

The criteria for data entry are almost never uniform

...

Second problem: how to extract the information to put in the annotations?

Low-level features

Statistical analysis (e.g. the recurrence of certain words)

Analysis of color (e.g. color histogram)

Extraction from video clips (e.g. motion analysis, lighting, ...)

Extraction can be automated: these features are easy and quick to compute, but can be very limiting
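As an illustration of a low-level feature, a colour histogram can be sketched in a few lines. This is a minimal example, not the method of any specific system: the function name color_histogram, the choice of 4 bins per channel and the four-pixel "image" are all assumptions made for the example.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram of (r, g, b) pixels, each channel in 0..255.

    Each channel is quantized into `bins_per_channel` equal intervals, and
    the result maps each occupied (r_bin, g_bin, b_bin) cell to its fraction
    of the pixels.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {cell: n / total for cell, n in counts.items()}


# Hypothetical 4-pixel image: three reddish pixels and one blue pixel.
image = [(250, 10, 10), (240, 20, 5), (255, 0, 0), (0, 0, 200)]
hist = color_histogram(image)
print(hist)
```

The histogram says that 75% of the image is "red" and 25% is "blue", but nothing about what the image depicts, which is why such features are easy to compute and yet limiting.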

High-level features

They represent the meaning or content of the MM data from the user's point of view. Example: automatic translators

Semantic gap

Between low-level features and high-level features there is a semantic GAP

How vast is the semantic GAP?

It means the distance between the high-level features and the MM data

To understand how big the gap is, take the opposite route:

From the annotations to the MM data

Game: What's this?

It can have a single wave

A man can also be one

It can be compressed

If you give it to someone, it hurts!

It can be of milk

If it gets wet, it is thrown away

SOLUTION: ????????

In English it is completely meaningless!!

Semantic gap: the game in italian

Può avere l’onda semplice

Lo può essere anche un uomo

Può essere compresso

Se lo dai a qualcuno fa male!

Può essere di latte

Se lo bagni si butta

Solution: ????

CONCLUSION: semantics is language-dependent

Exercise: create a meaningful game of this kind in English (or French)

Content-Based Retrieval in databases

Meaning of the term Content-Based Retrieval

An attempt to address the semantic gap

Usefulness and application fields

Techniques for still images

Fundamental concepts for describing and evaluating the algorithms

Architecture of a CBR-System


Storage: the most obvious aspect of the system, responsible for delays and degradation of QoS. For this reason, the MM data and the metadata are often kept on separate servers.

Indexing: features can be derived from the records, the content or the semantics of the data. Features take up space and must in turn be indexed (in the classical database sense)

Metadata not only depend on the MM data: there can also be dependencies among the metadata themselves.

Maintenance of a MIRS

One of the most underestimated aspects. Maintenance is incremental

Why must a system provide for maintenance?

MM objects can change, and the features must then be updated accordingly. It is a recursive process.

The algorithms used to extract the features can change.

Dependencies between data can change.


Searching: the paradigms

What information is used?

Data on perceptual content (vision, hearing)

Data on low/intermediate-level features (content-dependent metadata, often perceptual)

Data on semantic content (content-descriptive metadata, relationships between the entities and attributes of the images and real-world objects)

Extraction procedure of the metadata

For every MM item in the DB, descriptors are pre-computed.

Queries are expressed in perceptual terms (visual, auditory)

The examples can be supplied by the user or taken from images offered by the IR system

To satisfy a query, the system checks the similarity between the descriptors of the visual content of the query and those of the DB items

Iterative relevance feedback techniques are often used

Extraction procedure of the metadata

Retrieval by content is based on the concept of similarity, which is very different from retrieval by exact matching:

Matching is a binary partitioning operation. Objective: to determine whether or not an item corresponds to a model (classification)

Similarity-based retrieval reorders the MM data in the DB according to their similarity to the query (ranking), even if none of the data has characteristics close to those of the given example.
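The difference between the two operations can be sketched as follows. The database contents, the feature vectors and the function names are hypothetical, and Euclidean distance is used only as one possible similarity measure.

```python
import math

# Hypothetical database: item name -> pre-computed feature vector.
database = {
    "img_a": (0.9, 0.1, 0.0),
    "img_b": (0.5, 0.4, 0.1),
    "img_c": (0.0, 0.2, 0.8),
}
query = (0.6, 0.3, 0.1)


def exact_match(db, q):
    """Binary partition: keep only the items whose descriptor equals the query."""
    return [name for name, vec in db.items() if vec == q]


def rank_by_similarity(db, q):
    """Ranking: order ALL items by increasing distance from the query."""
    return sorted(db, key=lambda name: math.dist(db[name], q))


matches = exact_match(database, query)
ranking = rank_by_similarity(database, query)
print(matches)   # may be empty: nothing equals the query exactly
print(ranking)   # always a full ordering of the database
```

Exact matching can return nothing, while ranking always returns every item in some order, including items that are in fact far from the query.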

Similarity

How similar are two stimuli?

The similarity between perceptual stimuli is determined by measuring an appropriate distance in a metric space

An appropriate distance function (or metric) can be used to measure the distance between two stimuli

Given two vectors V1 and V2 in n-dimensional space, some commonly used distance functions are:

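Euclidean distance:

$$D_E = \left[ \sum_{i=1}^{n} \left( V_1(i) - V_2(i) \right)^2 \right]^{1/2}$$

City-block distance:

$$D_C = \sum_{i=1}^{n} \left| V_1(i) - V_2(i) \right|$$

Weighted total distance:

$$D_T = \sum_{i} w_i D_i$$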
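A minimal sketch of the three distance functions, reading the formulas on this slide as the Euclidean distance D_E, the city-block distance D_C and a weighted sum D_T of partial distances. The function names, the example vectors and the equal weights are assumptions; in particular, treating each D_i as the distance computed on one feature is an interpretation, not something the slide states.

```python
import math


def euclidean(v1, v2):
    """D_E: square root of the sum of squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))


def city_block(v1, v2):
    """D_C: sum of absolute component differences."""
    return sum(abs(a - b) for a, b in zip(v1, v2))


def weighted_total(distances, weights):
    """D_T: weighted sum of partial distances D_i with weights w_i."""
    return sum(w * d for w, d in zip(weights, distances))


v1 = (0.0, 3.0, 1.0)
v2 = (4.0, 0.0, 1.0)

d_e = euclidean(v1, v2)                        # sqrt(16 + 9 + 0)
d_c = city_block(v1, v2)                       # 4 + 3 + 0
d_t = weighted_total([d_e, d_c], [0.5, 0.5])   # equal weights, chosen arbitrarily
print(d_e, d_c, d_t)
```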

Relevance Feedback

Browsing vs. searching

Frequently, the user cannot specify exactly what they are looking for.

However, they are able to recognize it if it appears in the output

This phenomenon motivates relevance feedback, but it is also the basis of browsing. We need to find a starting point.

Even with a query that has only approximate parameters:

The system is asked to propose a starting point

The MM data are classified, with successive subclassifications.

Presentation in a MIRS

The system should present the user with an ordered list of MM objects.

Does the user have the right to see them?

Icons, or reduced versions (thumbnails) of the objects, are used

Real-time and network constraints (streaming)

The interface must meet usability criteria.

Performance evaluation

                Relevant       Not relevant
Found           A (correct)    B (incorrectly retrieved)
Not found       C (missing)    D (correct)

Precision = A / (A + B)

Recall = A / (A + C)

Exercise

An image database contains 5000 images from a museum, divided into four classes as follows:

Twentieth-century paintings: 1200 images

Baroque statues: 1120 images

Other paintings: 1440 images

Miscellaneous items: 1240 images

A query-by-example is made by submitting to the search engine a picture of an eighteenth-century painting, and the engine returns the following 20 images:

3 Baroque statues, 2 twentieth-century paintings, 13 paintings from other eras, 2 pieces of jewelry.

What are the Precision and the Recall?
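One way to work through the exercise is sketched below, with Precision as A / (A + B) and Recall as A / (A + C). It rests on an assumption that the exercise leaves to the reader: since the query is an eighteenth-century painting, the relevant class is taken to be "Other paintings" (1440 images), so only the 13 paintings from other eras count as correct results. A different reading of relevance would give different numbers.

```python
def precision(a, b):
    """Fraction of the returned items that are relevant: A / (A + B)."""
    return a / (a + b)


def recall(a, c):
    """Fraction of the relevant items that were returned: A / (A + C)."""
    return a / (a + c)


# Assumption: the relevant class is "Other paintings".
relevant_in_db = 1440
returned = 20

a = 13                    # relevant and found
b = returned - a          # found but not relevant: 3 + 2 + 2 = 7
c = relevant_in_db - a    # relevant but not found

p = precision(a, b)
r = recall(a, c)
print(f"Precision = {p:.2f}, Recall = {r:.4f}")
```

The low recall is expected: with only 20 results returned out of 1440 relevant images, recall cannot exceed 20 / 1440.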

Visual Query: what perceptual stimulus?

In the case of images, there are three perceptual stimuli that can be used for a search based on the content of images (still or moving):

color

shape

texture

Given their simplicity and the absence of a temporal dimension, they are generally used on still images.

Colour

Colour descriptors:

Histograms

Dominant colours

Statistical moments computed on colour distributions

Application fields: photorealism, art,…

Content-based retrieval on texture

Content-based retrieval on shape

Classical methods of analysis of the shape of a region:

The description presupposes the segmentation of the image into regions.

The techniques depend closely on the application and on the type of images

The shape can be described through measures such as the area, the perimeter, the eccentricity, the circularity, the orientation and the size of the main diameter, the Fourier descriptors of the contour or of a part of it, ...

These descriptions all give rise to one or more numeric values (indices), and can therefore be conveniently used for retrieval
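Three of these measures can be sketched for a region whose contour is already available as a polygon. This is an illustrative example under stated assumptions: segmentation is taken as already done, the contour is a simple (non self-intersecting) polygon, and circularity is computed with one common definition, 4πA / P², which equals 1 for a circle. The function names and the square region are hypothetical.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_perimeter(points):
    """Sum of the lengths of the polygon's edges."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def circularity(points):
    """4 * pi * area / perimeter^2: 1 for a circle, smaller for other shapes."""
    return 4 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2


# Hypothetical region: a square with side 2.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
area = polygon_area(square)
perimeter = polygon_perimeter(square)
circ = circularity(square)
print(area, perimeter, round(circ, 3))
```

Each region is thus reduced to a few numbers, which can be compared with the same distance functions used for other descriptors.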
