4.3 Multimedia Data Mining

15/03/22 1 11CP913 – DATA MINING MULTIMEDIA DATA MINING DATA MINING

Upload: krishver2

Post on 08-Aug-2015


15/04/23 1

11CP913 – DATA MINING MULTIMEDIA DATA MINING

DATA MINING


MULTIMEDIA DATABASE

• Multimedia database system: stores and manages a large collection of multimedia objects
  • Audio data, image data, video data, sequence data, and hypertext data (which contains text, text markup, and linkages)
• Increasingly common with the spread of audio-video equipment, CD-ROMs, and the Internet
• Multimedia data mining here focuses on image data mining
• Multimedia data mining methods
  • Similarity search in multimedia data
  • Multidimensional analysis
  • Classification and prediction analysis
  • Mining associations in multimedia data


SIMILARITY SEARCH IN MULTIMEDIA DATA

• Two types of multimedia indexing and retrieval systems
  • Description-based retrieval system
  • Content-based retrieval system
• Description-based retrieval system
  • Builds indices and performs object retrieval based on image descriptions, such as:
    • Keywords
    • Caption
    • Size
    • Time of creation
  • Labor-intensive when the descriptions are produced manually
  • Results are typically of poor quality when the descriptions are generated automatically
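
A description-based system can be sketched as an inverted index from keywords to image identifiers. The sketch below is illustrative only: the image IDs, the `keywords` and `caption` fields, and the function names are assumptions, not something the slides specify.

```python
from collections import defaultdict

# Hypothetical image descriptions. Producing these by hand is the
# labor-intensive step; producing them automatically is the error-prone one.
descriptions = {
    "img1": {"keywords": ["sky", "beach"], "caption": "Beach at noon"},
    "img2": {"keywords": ["sky", "mountain"], "caption": "Alpine view"},
    "img3": {"keywords": ["forest"], "caption": "Pine forest"},
}

def build_keyword_index(descriptions):
    """Map each lowercased keyword to the set of image IDs described by it."""
    index = defaultdict(set)
    for image_id, description in descriptions.items():
        for keyword in description["keywords"]:
            index[keyword.lower()].add(image_id)
    return index

def search(index, *keywords):
    """Return the IDs of images whose descriptions contain every keyword."""
    matches = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*matches) if matches else set()

index = build_keyword_index(descriptions)
sky_images = search(index, "sky")
sky_mountain_images = search(index, "Sky", "mountain")
```

Retrieval quality here depends entirely on the descriptions: an image whose keywords were never entered cannot be found, whatever it actually shows.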


CONTENT-BASED RETRIEVAL SYSTEM

• Object retrieval is based on the image content, such as:
  • Color histogram
  • Texture
  • Pattern
  • Image topology
  • Shape of objects and their layouts and locations within the image
• Desirable in many applications
• Two kinds of queries
  • Image sample-based queries
  • Image feature specification queries
• Image sample-based queries
  • The search compares the feature vector (signature) extracted from the sample with the feature vectors of images already extracted and indexed in the image database
  • Images that are close to the sample image are returned
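
An image sample-based query can be sketched with a coarse color histogram as the feature vector. This is a minimal illustration under assumptions: the binning (2 bins per channel), the L1 distance, the file names, and the tiny pixel lists are all made up for the example, and a real system would use an index structure rather than a full scan.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=2):
    """Normalized histogram of (r, g, b) pixels over coarse color bins."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    return {color_bin: n / len(pixels) for color_bin, n in counts.items()}

def l1_distance(h1, h2):
    """Sum of absolute differences over all bins present in either histogram."""
    return sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in set(h1) | set(h2))

def query_by_sample(sample_pixels, indexed_signatures, top_k=2):
    """Rank indexed images by distance between their signature and the sample's."""
    sample_signature = color_histogram(sample_pixels)
    ranked = sorted(
        indexed_signatures,
        key=lambda name: l1_distance(sample_signature, indexed_signatures[name]),
    )
    return ranked[:top_k]

# Hypothetical database: signatures are extracted once and stored (indexed).
BLUE, GREEN, RED = (20, 40, 220), (30, 200, 40), (220, 30, 30)
indexed_signatures = {
    "sky.jpg": color_histogram([BLUE] * 9 + [GREEN]),
    "forest.jpg": color_histogram([GREEN] * 8 + [BLUE] * 2),
    "sunset.jpg": color_histogram([RED] * 7 + [BLUE] * 3),
}

# The sample is mostly blue, so the mostly blue image should rank first.
results = query_by_sample([BLUE] * 8 + [GREEN] * 2, indexed_signatures)
```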


CONTD…

• Image feature specification queries
  • The user specifies or sketches image features (color, texture, or shape)
  • The features are translated into a feature vector to be matched against the feature vectors of the images in the database
• Applications: medical diagnosis, weather prediction, web search engines for images
• QBIC (Query By Image Content)
  • Supports both sample-based and image feature specification queries
• Approaches for similarity-based retrieval in image databases, based on image signature
  • Color histogram-based signature
  • Multifeature composed signature
  • Wavelet-based signature
  • Wavelet-based signature with region-based granularity


CONTD…

• Color histogram-based signature
  • The image signature includes a color histogram based on the color composition of the image
  • Contains no information about shape, location, or texture
  • Two images with similar color composition may therefore have totally unrelated semantics
• Multifeature composed signature
  • The image signature includes a composition of multiple features: color histogram, shape, location, and texture
  • A separate distance function can be defined for each feature and the results combined
  • One or a few features are often used as probes to search for images with similar features
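
The difference between the two signatures above can be sketched as follows. Everything here is an assumption made for illustration: the feature vectors are invented, the choice of L1 for color and Euclidean distance for shape and texture is arbitrary, and the weighted sum is only one simple way to combine per-feature distances.

```python
import math

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# One distance function per feature (hypothetical choices).
FEATURE_DISTANCES = {"color": l1, "shape": euclidean, "texture": euclidean}

def combined_distance(sig_a, sig_b, weights):
    """Weighted sum of per-feature distances over the features named in weights."""
    return sum(
        weight * FEATURE_DISTANCES[feature](sig_a[feature], sig_b[feature])
        for feature, weight in weights.items()
    )

# Two made-up images with identical color composition but different content.
flag = {"color": [0.5, 0.5, 0.0], "shape": [1.0, 0.0], "texture": [0.1]}
checkerboard = {"color": [0.5, 0.5, 0.0], "shape": [0.0, 1.0], "texture": [0.9]}

# Color alone cannot tell them apart; adding shape and texture can.
color_only = combined_distance(flag, checkerboard, {"color": 1.0})
multifeature = combined_distance(
    flag, checkerboard, {"color": 0.4, "shape": 0.4, "texture": 0.2}
)
```

Passing a `weights` dictionary with only one or two entries corresponds to probing with a few selected features.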


CONTD…

• Wavelet-based signature
  • The image signature includes the wavelet coefficients of the image
  • Wavelets capture shape, texture, and location information in a single unified framework
  • Improves efficiency and reduces the need for multiple search primitives
  • Computes a single signature for the entire image, so it may miss images that contain similar objects at different locations or sizes
• Wavelet-based signature with region-based granularity
  • Signatures are computed and compared at the granularity of regions, not the entire image
  • Similar images may contain similar regions
  • A region in one image may be a translation or scaling of a matching region in the other
  • The similarity measure between the query image and a target image can be defined as the fraction of the area of the two images covered by matching pairs of regions
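
A whole-image wavelet signature can be sketched with the Haar wavelet, the simplest wavelet family. This is a sketch under stated assumptions: the slides do not name a wavelet, the image is a small grayscale grid whose side is a power of two, averaging and differencing are left unnormalized, and keeping the top-left block of coarse coefficients as the signature is just one plausible truncation.

```python
import math

def haar_1d(values):
    """Full 1-D Haar decomposition by repeated pairwise averaging and differencing."""
    out = [float(v) for v in values]
    n = len(out)  # assumed to be a power of two
    while n > 1:
        half = n // 2
        averages = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        details = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = averages + details
        n = half
    return out

def haar_2d(image):
    """Standard 2-D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in image]
    columns = [haar_1d(column) for column in zip(*rows)]
    return [list(row) for row in zip(*columns)]

def wavelet_signature(image, keep=2):
    """Keep the keep x keep block of coarsest coefficients as the signature."""
    coefficients = haar_2d(image)
    return [coefficients[y][x] for y in range(keep) for x in range(keep)]

def signature_distance(sig_a, sig_b):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))

flat = [[5] * 4 for _ in range(4)]             # uniform gray image
half_dark = [[0, 0, 8, 8] for _ in range(4)]   # dark left half, bright right half

flat_signature = wavelet_signature(flat)
half_dark_signature = wavelet_signature(half_dark)
```

The first coefficient is the overall average intensity; the others record where and in which direction the intensity changes, which is how one signature carries shape and location information. A region-based variant would apply the same computation to sliding windows or segmented regions instead of the whole grid.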


MULTIDIMENSIONAL ANALYSIS OF MULTIMEDIA DATA

• A multimedia data cube can contain additional dimensions and measures for multimedia information, such as color, texture, and shape
• MultiMediaMiner
  • Each image contains two descriptors: a feature descriptor and a layout descriptor
  • The original image is not stored directly in the database; only its descriptors are stored
  • Description information
    • Image file name
    • Image URL
    • Image type
    • List of keywords


CONTD…

• Feature descriptor
  • A set of vectors, one for each visual characteristic
  • Main vectors: color vector, MFC (Most Frequent Color) vector, and MFO (Most Frequent Orientation) vector
• Layout descriptor
  • Color layout vector: the MFC of each cell of a grid laid over the image
  • Edge layout vector: the number of edges for each orientation in each cell
• Dimensions of a multimedia data cube
  • Size of the image or video in bytes
  • Width and height of the frames
  • Date of creation (image or video)
  • Format type
  • Frame sequence duration in seconds
  • Keywords, color, and edge orientation
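
The color layout vector in the layout descriptor can be sketched as the most frequent color of each grid cell. The sketch assumes a toy 4×4 image whose pixels are already quantized to color names and a 2×2 grid; the grid size and function names are illustrative choices, not taken from the slides.

```python
from collections import Counter

def most_frequent_color(pixels):
    """Return the color label that occurs most often in a list of pixels."""
    return Counter(pixels).most_common(1)[0][0]

def color_layout_vector(image, grid=2):
    """Most frequent color of each grid cell, scanned row by row."""
    height, width = len(image), len(image[0])
    vector = []
    for gy in range(grid):
        for gx in range(grid):
            cell = [
                image[y][x]
                for y in range(gy * height // grid, (gy + 1) * height // grid)
                for x in range(gx * width // grid, (gx + 1) * width // grid)
            ]
            vector.append(most_frequent_color(cell))
    return vector

# Toy image: sky and cloud on top, grass and soil below.
image = [
    ["blue", "blue", "blue", "white"],
    ["blue", "blue", "white", "white"],
    ["green", "green", "brown", "brown"],
    ["green", "green", "green", "brown"],
]

layout = color_layout_vector(image)
```

Unlike a whole-image histogram, the layout vector keeps a rough record of where each dominant color occurs.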


CONTD…

• Construction of a multimedia data cube
  • Facilitates multidimensional analysis of multimedia data
  • Based primarily on visual content
  • Supports mining of multiple kinds of knowledge
    • Summarization
    • Comparison
    • Classification
    • Association
    • Clustering
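
Summarization over such a cube can be sketched as counting images per cell and rolling up along dimensions. The records, dimension names, and helper functions below are hypothetical, and a count is used as the only measure.

```python
from collections import Counter

# Hypothetical image descriptors with three single-valued dimensions.
images = [
    {"format": "jpeg", "color": "blue", "year": 2014},
    {"format": "jpeg", "color": "blue", "year": 2015},
    {"format": "jpeg", "color": "green", "year": 2015},
    {"format": "gif", "color": "blue", "year": 2015},
]
dimensions = ("format", "color", "year")

def build_cube(records, dimensions):
    """Base cuboid: number of images in each (format, color, year) cell."""
    return Counter(tuple(record[d] for d in dimensions) for record in records)

def roll_up(cube, dimensions, keep):
    """Aggregate the counts over every dimension that is not listed in keep."""
    positions = [dimensions.index(d) for d in keep]
    rolled = Counter()
    for cell, count in cube.items():
        rolled[tuple(cell[p] for p in positions)] += count
    return rolled

cube = build_cube(images, dimensions)
by_format_color = roll_up(cube, dimensions, ("format", "color"))
by_color = roll_up(cube, dimensions, ("color",))
```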


CONTD…

• It is difficult to implement a data cube efficiently when the number of dimensions is large
• Many attributes are set-oriented instead of single-valued
  • E.g., a single image may correspond to a set of keywords, and may contain a set of objects, each associated with a set of colors
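
The effect of set-oriented attributes can be sketched by letting each image contribute to every combination of its values. The records and the `cell_counts` helper are assumptions made for the example.

```python
from collections import Counter
from itertools import product

# Hypothetical images whose attributes are sets rather than single values.
records = [
    {"keywords": {"sky", "beach", "sunset"}, "colors": {"blue", "orange"}},
    {"keywords": {"forest"}, "colors": {"green"}},
]

def cell_counts(records, dimensions):
    """Count each image once in every cell formed by one value per dimension."""
    counts = Counter()
    for record in records:
        for cell in product(*(sorted(record[d]) for d in dimensions)):
            counts[cell] += 1
    return counts

counts = cell_counts(records, ("keywords", "colors"))
touched_cells = sum(counts.values())
```

The first image alone falls into 3 × 2 = 6 cells, so cell totals no longer add up to the number of images, and the number of touched cells grows multiplicatively with each set-valued dimension.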


CLASSIFICATION & PREDICTION ANALYSIS OF MULTIMEDIA DATA

• Used in scientific research, such as astronomy, seismology, and geoscientific research
• Decision tree classification is an essential data mining method here
• E.g., sky images classified by astronomers serve as the training set
  • Models are constructed for the recognition of galaxies and stars, based on properties such as magnitudes, areas, intensity, image moments, and orientation
  • Sky images taken by telescopes are then tested against the constructed models to identify new bodies
• Data preprocessing is important when mining image data
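
The sky-image example can be sketched with a small decision tree built by minimizing Gini impurity. This is a generic sketch, not the method used in any actual sky survey: the two features (area, intensity), their values, and the labels are invented, and the tree is deliberately tiny.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Return a label (leaf) or a (feature, threshold, left, right) tuple."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    feature, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[feature] > threshold]
    return (
        feature,
        threshold,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )

def classify(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree

# Hypothetical training set labeled by astronomers: (area, intensity).
training_rows = [(2, 0.9), (3, 0.8), (2.5, 0.95), (20, 0.3), (25, 0.4), (18, 0.35)]
training_labels = ["star", "star", "star", "galaxy", "galaxy", "galaxy"]

tree = build_tree(training_rows, training_labels)
prediction_small = classify(tree, (2.2, 0.85))
prediction_large = classify(tree, (22, 0.3))
```

The features would come out of the preprocessing step, which is why preprocessing matters so much for image data: the tree only ever sees the extracted properties, never the pixels.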


MINING ASSOCIATIONS IN MULTIMEDIA DATA

• Associations between image content and non-image content features
  • E.g., "If at least 50% of the upper part of the picture is blue, then it is likely to represent sky"
• Associations among image contents that are not related to spatial relationships
  • E.g., "If a picture contains two blue squares, then it is likely to contain one red circle as well"
• Associations among image contents related to spatial relationships
  • E.g., "If a red triangle is between two yellow squares, then it is likely that a big oval-shaped object is underneath"
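
The strength of a rule such as the first example can be sketched with the usual support and confidence measures. The item names and the five "transactions" below are invented; each transaction stands for the content features and keywords already extracted from one image.

```python
# Hypothetical per-image item sets: content features plus non-image keywords.
transactions = [
    {"upper_half_blue", "keyword:sky"},
    {"upper_half_blue", "keyword:sky", "keyword:beach"},
    {"upper_half_blue", "keyword:sky"},
    {"upper_half_blue", "keyword:pool"},
    {"keyword:forest"},
]

def count(transactions, itemset):
    """Number of images whose item set contains every item in itemset."""
    itemset = set(itemset)
    return sum(1 for items in transactions if itemset <= items)

def support(transactions, itemset):
    """Fraction of all images that contain the item set."""
    return count(transactions, itemset) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Fraction of images matching the antecedent that also match the consequent."""
    both = count(transactions, set(antecedent) | set(consequent))
    return both / count(transactions, antecedent)

rule_support = support(transactions, {"upper_half_blue", "keyword:sky"})
rule_confidence = confidence(transactions, {"upper_half_blue"}, {"keyword:sky"})
```

Here three of the five images satisfy both sides of the rule, and three of the four images with a blue upper half are labeled as sky.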


CONTD…

• An image may contain multiple objects, each with multiple features, so the number of possible associations is large
• It is essential to promote progressive resolution refinement
  • Mine frequently occurring patterns at a rough resolution level first, then focus on a finer resolution level only for the patterns that passed
  • Reduces the overall mining cost without loss of quality
• A picture containing multiple recurrent objects is an important feature in image analysis, so recurrence of the same object should not be ignored
• Relative spatial relationships among multimedia objects are important: above, beneath, between, nearby
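
The spatial relationships listed above can be sketched as predicates over bounding boxes. The definitions are one possible reading chosen for illustration (image coordinates with y growing downward, `between` meaning horizontally between, `nearby` based on center distance); the `Box` type and the thresholds are assumptions.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in image coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def center(self):
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)

def above(a, b):
    """a lies entirely above b."""
    return a.bottom <= b.top

def beneath(a, b):
    """a lies entirely beneath b."""
    return above(b, a)

def between(a, b, c):
    """a lies horizontally between b and c, in either order."""
    return (b.right <= a.left and a.right <= c.left) or (
        c.right <= a.left and a.right <= b.left
    )

def nearby(a, b, max_distance):
    """The centers of a and b are at most max_distance apart."""
    (ax, ay), (bx, by) = a.center, b.center
    return math.hypot(ax - bx, ay - by) <= max_distance

# Toy scene echoing the example rule: a triangle between two squares, an oval below.
square_1 = Box(0, 0, 2, 2)
triangle = Box(4, 0, 6, 2)
square_2 = Box(8, 0, 10, 2)
oval = Box(2, 4, 8, 7)

triangle_between = between(triangle, square_1, square_2)
oval_beneath = beneath(oval, triangle)
```

Predicates like these turn each image into items such as "red triangle between yellow squares", which can then be mined like any other item set.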


AUDIO & VIDEO DATA MINING

• There is demand for effective content-based retrieval and data mining methods for audio and video data
  • E.g., editing video clips, detecting suspicious scenes in videos
• MPEG and JPEG: typical compression schemes (MPEG for video and audio; JPEG for still images, also applied frame by frame to video)
• MPEG-7, formally named "Multimedia Content Description Interface"
  • Used in a broad range of applications
  • Supports audiovisual descriptions of still pictures, video, graphics, audio, and speech


CONTD…

• Elements in MPEG-7
  • A set of descriptors: each descriptor defines the syntax and semantics of a feature
  • A set of description schemes: each scheme specifies the structure and semantics of the relationships between its components
  • A set of coding schemes for the descriptors
  • DDL (Description Definition Language)
• Facilitates content-based video retrieval and video data mining
• Video clip: a collection of actions and events in time
• Shot: a group of frames or pictures
• Key frame
  • The most representative frame in a video shot
  • The sequence of key frames defines the sequence of events in the video clip
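
Shots and key frames can be sketched with per-frame color histograms. The slides do not say how shots are detected or how a key frame is chosen, so the heuristics below (a shot boundary wherever consecutive histograms differ by more than a threshold, and the frame closest to the shot's mean histogram as key frame) are assumptions, as are the toy histograms.

```python
def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def split_into_shots(frames, threshold=0.5):
    """Group consecutive frame indices; start a new shot at each large change."""
    if not frames:
        return []
    shots, current = [], [0]
    for i in range(1, len(frames)):
        if l1(frames[i - 1], frames[i]) > threshold:
            shots.append(current)
            current = []
        current.append(i)
    shots.append(current)
    return shots

def key_frame(frames, shot):
    """Index of the frame in the shot closest to the shot's mean histogram."""
    size = len(frames[shot[0]])
    mean = [sum(frames[i][d] for i in shot) / len(shot) for d in range(size)]
    return min(shot, key=lambda i: l1(frames[i], mean))

# Hypothetical normalized 3-bin color histograms, one per frame.
frames = [
    [0.90, 0.10, 0.00],
    [0.80, 0.20, 0.00],
    [0.85, 0.15, 0.00],
    [0.10, 0.10, 0.80],  # abrupt change: a new shot starts here
    [0.00, 0.20, 0.80],
]

shots = split_into_shots(frames)
key_frames = [key_frame(frames, shot) for shot in shots]
```

Reading `key_frames` in order gives a compact summary of the clip, which is the sense in which the sequence of key frames defines the sequence of events.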
