module 1 - introduction to multimedia databases
TRANSCRIPT
-
8/3/2019 Module 1 - Introduction to Multimedia Databases
Module 1
INTRODUCTION TO
MULTIMEDIA DATABASES
Prof. Dr. Naomie Salim
Faculty of Computer Science & Information Systems, Universiti Teknologi Malaysia
-
The Explosion of Digital
Multimedia Information
We interact with multimedia every day
Large amounts of text, images, speech & video are
converted to digital form
Advantages of digitized data over analog
Easy storage
Easy processing
Easy sharing
-
Give examples of multimedia
applications that deal with
storing, retrieving, processing and sharing multimedia data
-
Eg 1. Journalism
Journalist to write article about influence of
alcohol on driving
Investigation involved:
Collect news articles about accidents, scientific
reports, television commercials, police interviews,
medical experts interviews
Illustration:
Search photo archives, stock footage companies
for good photos: shocking, funny, etc.
-
Other examples
Searching movies
Based on taste of movies already seen
Based on movies a friend favors
Searching on web
Eg. searching Australian Open website
(http://www.ausopen.org)
Integrate conceptual terms + interesting events
Eg. give info about video segments showing female
American tennis players going to the net
-
Retrieval problems
EMPLOYEE (Name: char(20), City: Char(20),
Photo: Image)
How do you select employees in Skudai?
How do you select employees who wear a tudung (headscarf),
wear glasses, are fair-skinned and have a mole under the lips?
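
The contrast between the two questions can be made concrete. The following is a minimal sketch, assuming an in-memory SQLite table modeled on the EMPLOYEE schema; the table name, the sample rows and the placeholder photo bytes are invented for illustration. Selecting by city is an ordinary exact-match condition, while the photo is an opaque blob that SQL cannot look inside.

```python
import sqlite3

# Hypothetical sample data; the photo bytes are placeholders, not real images.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, city TEXT, photo BLOB)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Aminah", "Skudai", b"placeholder-image-bytes-1"),
        ("Bala", "Johor Bahru", b"placeholder-image-bytes-2"),
        ("Chong", "Skudai", b"placeholder-image-bytes-3"),
    ],
)

# Easy: City is alphanumeric, so an exact-match WHERE condition works.
in_skudai = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE city = ? ORDER BY name", ("Skudai",)
    )
]
print(in_skudai)

# Hard: to the DBMS the photo is only a sequence of bytes. There is no
# column that says "wears glasses", so no WHERE condition can express it
# unless such information is first added as metadata.
photo_types = {row[0] for row in conn.execute("SELECT typeof(photo) FROM employee")}
print(photo_types)
```

The second query only confirms the storage type of the Photo column; answering the content question needs the metadata discussed later in this module.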
-
Characteristics of Media Data
Medium - Information representation
Alphanumeric
Representation of audio, video and image
Static vs dynamic
Static: do not have time dimensions (alphanumeric data,
images, graphics)
Dynamic: have time dimensions (video, animation, audio)
Multimedia
Collection of media types used together
At least one media type must be non-alphanumeric
-
Digital representation of text
OCR techniques convert analog text to digital text
Eg. of digital representation: ASCII
Uses 7 bits per character, usually stored in one 8-bit byte
Chinese characters require more space
Storage requirements depend on number of characters
Structured documents becoming more popular
Docs consist of titles, chapters, sections, paragraphs, etc.
Standards like HTML and XML used to encode structured information
Compression of text
Huffman, arithmetic coding
Since storage requirements not too high, compression is less important
than for other multimedia data
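
As an illustration of the Huffman coding mentioned above, here is a small sketch that builds a code table for a short string and compares the encoded size with one byte per character. The function name huffman_codes and the sample text are assumptions made for this example; it expects non-empty input.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character of a non-empty text to a bit string."""
    freq = Counter(text)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {char: code built so far}).
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


text = "multimedia databases"
codes = huffman_codes(text)
encoded_bits = sum(len(codes[ch]) for ch in text)
print(encoded_bits, "bits encoded vs", 8 * len(text), "bits at one byte per character")
```

Frequent characters (here "a") receive the shortest codes, which is where the saving comes from.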
-
Digital representation of audio
Audio: air pressure waves with frequency and amplitude
Humans hear 20-20,000 Hertz
Low amplitude: soft sound
-
Digitizing pressure waveforms
Transform into electrical signal (by microphone)
Convert into discrete values
Sampling: continuous time axis divided into small, fixed
intervals
Quantization: determination of amplitude of audio signal at
beginning of each time interval
Humans cannot notice difference between analog &
digital with a high enough sampling rate and precise
quantization
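
The two steps can be sketched in a few lines. This is an illustrative example only: the function sample_and_quantize, the 440 Hz test tone and the chosen rates are assumptions, and the analog signal is modeled as a Python function returning values between -1 and 1.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, bits):
    """Sample signal(t) at fixed intervals and map each sample to one of
    2**bits evenly spaced integer levels (0 .. 2**bits - 1)."""
    levels = 2 ** bits
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for k in range(n_samples):
        t = k / sample_rate_hz          # sampling: start of the k-th interval
        amplitude = signal(t)           # "analog" value in [-1, 1]
        level = round((amplitude + 1) / 2 * (levels - 1))   # quantization
        samples.append(level)
    return samples


def tone(t):
    """Hypothetical analog input: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)


coarse = sample_and_quantize(tone, 0.01, 8000, 3)    # 8 possible levels
fine = sample_and_quantize(tone, 0.01, 8000, 16)     # 65,536 possible levels
print(len(coarse), "samples;", len(set(coarse)), "distinct coarse values;",
      len(set(fine)), "distinct fine values")
```

More bits per sample preserve more distinct amplitude values, and a higher sampling rate preserves faster changes; both increase the storage needed.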
-
Audio storage requirements
Example of a CD audio
16 bits per sample
44,100 samples per second
Two (stereo) channels
Requirements = 16 * 44,100 * 2 bits = 1.4 Mbit per second
Compression (examples)
Masking: discard soft sounds that are made inaudible by
louder sounds
Speech: coding of lower frequency sounds only
MPEG: audio compression standards
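
The storage arithmetic for uncompressed CD audio can be checked with a short calculation. This sketch uses the standard CD sampling rate of 44,100 samples per second; the function name audio_bit_rate is chosen for illustration.

```python
def audio_bit_rate(bits_per_sample, samples_per_second, channels):
    """Uncompressed audio bit rate in bits per second."""
    return bits_per_sample * samples_per_second * channels


cd_rate = audio_bit_rate(16, 44_100, 2)
print(cd_rate, "bits per second")
print(cd_rate / 1_000_000, "Mbit per second")
print(cd_rate * 60 / 8 / 1_000_000, "Mbytes for one minute of audio")
```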
-
Digital representation of image
Scan analog photos & pictures using scanner
Analog image approximated by rectangle of small dots
In digital camera, ADC is built-in
Image consists of many small dots or picture elements (pixels)
Gray scale: 1 byte (8 bits) per pixel
Color: 3 color components (RGB) of one byte each
Data required for 1 rectangular screen
A = x * y * b
A: number of bytes needed, x: # pixels per horizontal line,
y: # horizontal lines, b: # bytes per pixel
-
Image compression
Exploit redundancy in image & properties of
human perception
Spatial redundancy: pixels in certain area often
appear similar (golden sand, blue sky)
Human tolerance: error still allows effective
communication
Eg. of image compression
Transform coding
Fractal image coding
-
Digital representation of video
Sequence of frames or images presented at fixed rate
Digital video obtained by digitizing analog videos
or from digital cameras
Playing 25 frames per second gives illusion of
continuous view
Amount of data to represent video
1 second, image: 512 lines, 512 pixels per line, 24
bits per pixel, 25 frames per second
512 * 512 * 3 * 25 = 19,660,800 bytes (about 19 Mbytes)
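
The same calculation as a small sketch, reusing the per-image formula (pixels per line times lines times bytes per pixel) and multiplying by the frame rate. The function names are chosen for illustration.

```python
def frame_bytes(pixels_per_line, lines, bytes_per_pixel):
    """Bytes for one uncompressed image: x * y * b."""
    return pixels_per_line * lines * bytes_per_pixel


def video_bytes_per_second(pixels_per_line, lines, bits_per_pixel, frames_per_second):
    """Bytes for one second of uncompressed video."""
    return frame_bytes(pixels_per_line, lines, bits_per_pixel // 8) * frames_per_second


one_frame = frame_bytes(512, 512, 3)
one_second = video_bytes_per_second(512, 512, 24, 25)
print(one_frame, "bytes per frame")
print(one_second, "bytes per second")
print(one_second * 60 / 1_000_000_000, "Gbytes per minute")
```

At more than a gigabyte per minute of uncompressed video, the need for the compression techniques that follow is clear.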
-
Compression of video
Compressing frames of videos: similar to image
Reduce redundancy & exploit human perception properties
Temporal redundancy: neighboring frames normally similar, remove by
applying motion estimation & compression
Each image divided into fixed-sized blocks
For each block in image, the most similar block in previous image is determined & pixel difference computed
Together with displacement between the two blocks, this difference stored or
transmitted
MPEG-1 (VHS, pixel based coding): coding of video data up to speed of 1.5
Mbits per second
MPEG-2 (pixel based coding): coding of video data up to speed of 10 Mbits
per second
MPEG-4 (multimedia data, object based coding) : coding of video data up
to speed of 40 Mbits per second, tools for decoding & representing video
objects, support content-based indexing & retrieval
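
The block-matching step described above can be sketched as an exhaustive search. This is a simplified illustration, not the procedure of any MPEG standard: the similarity measure (sum of absolute differences), the search range and the tiny gray-scale test frames are all assumptions.

```python
def best_match(prev, cur, top, left, size, search):
    """Find the block in the previous frame most similar to the size x size
    block of the current frame at (top, left).
    Returns (dy, dx, sad): displacement and sum of absolute differences."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + size > height or c + size > width:
                continue  # candidate block would fall outside the frame
            sad = sum(
                abs(cur[top + i][left + j] - prev[r + i][c + j])
                for i in range(size)
                for j in range(size)
            )
            if best is None or sad < best[2]:
                best = (dy, dx, sad)
    return best


def frame_with_square(top, left, size=8):
    """Hypothetical 8 x 8 gray-scale frame: black with a bright 2 x 2 square."""
    frame = [[0] * size for _ in range(size)]
    for i in range(2):
        for j in range(2):
            frame[top + i][left + j] = 200
    return frame


prev_frame = frame_with_square(2, 1)
cur_frame = frame_with_square(2, 3)      # the square moved 2 pixels to the right
dy, dx, sad = best_match(prev_frame, cur_frame, top=2, left=2, size=4, search=2)
print("displacement:", (dy, dx), "remaining difference:", sad)
```

Here the block is found 2 pixels to the left in the previous frame with zero remaining difference, so only the displacement needs to be stored for it.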
-
How to search for images or
multimedia data?
Analyze one by one?
No! Takes too long!
Have to use metadata: instead of searching the objects directly,
search the metadata that have been added to them
Metadata requirements to be valuable for searching:
Description of multimedia object should be as complete as
possible
Storage of metadata must not take too much overhead
Comparison of two metadata values must be fast
-
Metadata of Multimedia Objects
Descriptive data
Give format or factual info about multimedia
object
Eg.: author name, creation date, length of
multimedia object, representation technique
Eg. standard for descriptive data: Dublin Core
Can use SQL (metadata condition in WHERE clause)
-
Metadata of Multimedia Objects
(cont.)
Annotations
Textual description of contents of objects
Eg.: photo description in Facebook
Either free format or sequence of keywords
Manual text annotations allow Information Retrieval
techniques to be used but
Time consuming, expensive
Subjective, incomplete
Structured concepts (eg semantic web, ER-like schema)
can be used to describe content through concepts, their
relationships to each other & MM object but
Also slow and expensive
-
Metadata of Multimedia Objects
(cont.)
Features
Derive characteristics from MM object itself
Need language to describe features, eg. MPEG-7
Process to capture features from MM object is
called feature extraction
Performed automatically, sometimes with human
support
Two feature classes
Low-level features
High-level features
-
Low-level Features
Grasp data patterns & statistics of MM object
Depend strongly on medium
Extraction performed automatically
Eg. for text
List of keywords with frequency indicators
Eg. for audio
Representation
Amplitude-time sequence: quantification of air pressure at each sample
Silence: 0, > silence: +ve amplitude, < silence: -ve amplitude
Eg. Low-level features derived
Energy (loudness of signal), ZCR (zero crossing rate: frequency of sign
change; high indicates speech), silence ratio (low indicates music)
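
The three audio features can be computed directly from an amplitude-time sequence. The sketch below uses simplified per-sample definitions chosen for illustration (in practice such features are often computed per frame or window); the function names, the silence threshold and the toy samples are assumptions.

```python
def energy(samples):
    """Mean squared amplitude, a simple measure of loudness."""
    return sum(s * s for s in samples) / len(samples)


def zero_crossing_rate(samples):
    """Fraction of neighboring sample pairs whose sign differs."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def silence_ratio(samples, threshold):
    """Fraction of samples whose absolute amplitude is below the threshold."""
    silent = sum(1 for s in samples if abs(s) < threshold)
    return silent / len(samples)


# Toy amplitude-time sequence: 0 is silence, positive and negative values
# are pressure above and below silence.
samples = [0, 3, -3, 3, -3, 0, 0, 0]
print(energy(samples), zero_crossing_rate(samples), silence_ratio(samples, 1))
```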
-
Low-level features (cont.)
Eg. for images
Color histograms: # pixels having color of certain range
Spatial relationships: eg. blue patterns appear above
yellow (beach photo)
Contrast: # dark spots neighboring light spots
Eg. for video
Use low-level features for image
Eg. of temporal dimension: shot change, when pixel
difference between two images is higher than certain
threshold
Shot: sequence of images taken with same camera position
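
A color histogram and a threshold-based shot-change test can be sketched as follows. Images are modeled as flat lists of (r, g, b) tuples; the number of bins, the threshold value and the toy "beach" and "forest" frames are assumptions made for this example.

```python
def color_histogram(pixels, bins=4):
    """Count pixels per quantized color range.
    Each channel (0..255) is split into `bins` ranges, giving bins**3 cells."""
    step = 256 // bins
    hist = [0] * (bins ** 3)
    for r, g, b in pixels:
        index = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[index] += 1
    return hist


def is_shot_change(frame_a, frame_b, threshold):
    """True when the mean per-pixel difference between two frames of equal
    size is higher than the threshold."""
    diff = sum(
        abs(ca - cb)
        for pa, pb in zip(frame_a, frame_b)
        for ca, cb in zip(pa, pb)
    )
    return diff / len(frame_a) > threshold


# Toy frames of 10 pixels each: blue sky over yellow sand, then a forest.
beach = [(30, 120, 220)] * 6 + [(230, 200, 90)] * 4
beach_next = [(32, 118, 221)] * 6 + [(228, 201, 92)] * 4
forest = [(20, 90, 30)] * 10

hist = color_histogram(beach)
print([(i, n) for i, n in enumerate(hist) if n])
print(is_shot_change(beach, beach_next, 50), is_shot_change(beach_next, forest, 50))
```

Consecutive frames of the same shot differ only slightly, while the cut to the forest frame exceeds the threshold.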
-
High-level features
Features which are meaningful to end user, such as
golf course, forest
How can we bridge semantic gap between low level
and high level features?
High level feature extraction from low level features
Eg. text containing words football, referee -> football
match text
Eg. Speech to text translators (low level audio features to text)
Eg. Video-Domain specific: loud sound from crowd, round
object passing white line, followed by sharp whistle -> goal
-
Multimedia Information Retrieval
System (MIRS)
-
Component of MIRS - Archiving
MM data stored separately from its metadata
Voluminous
Visible or audible delays in playback unacceptable
MM data managed separately in MM content
server
At storage time, objects get an identifier to be used by
other parts of MIRS
Have to deal with compression and protection
-
Component of MIRS
Feature Extraction (Indexing)
Extraction of metadata (annotations, descriptions, features)
from incoming multimedia object
Algorithms have to consider extraction dependencies. Eg.:
Video object segmented, choose key frame for each segment
Extract low-level features from key frame
Based on low-level features, classify into shots of audience, fields,
close-ups
For field shots, detect positions of players
Extract body related features of players
Determine where net playing begins and ends
Have to consider incremental maintenance (modification of
MM objects, extractors, extraction dependencies)
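
Extraction dependencies and incremental maintenance can be modeled as a dependency graph. The sketch below is only an illustration of the idea, not the ACOI architecture: the extractor names are invented from the tennis example above, and it relies on graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical extractors; each maps to the extractors whose output it needs.
dependencies = {
    "segment_video": set(),
    "select_key_frames": {"segment_video"},
    "low_level_features": {"select_key_frames"},
    "classify_shots": {"low_level_features"},
    "detect_players": {"classify_shots"},
    "body_features": {"detect_players"},
    "net_play_events": {"body_features"},
}

# A valid execution order runs every extractor after the ones it depends on.
order = list(TopologicalSorter(dependencies).static_order())
print(order)


def affected_by(changed, dependencies):
    """Extractors that must be rerun when `changed` is modified."""
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for name, needs in dependencies.items():
            if name not in affected and needs & affected:
                affected.add(name)
                grew = True
    return affected


print(sorted(affected_by("classify_shots", dependencies)))
```

When the shot classifier changes, only it and the extractors downstream of it need to run again; segmentation and low-level feature extraction can be reused.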
-
Incremental Maintenance in ACOI
Feature Extraction Architecture
-
Component of MIRS - Searching
Multimedia queries are diverse, can be
specified in many different ways
No exact match, many ways to describe MM
objects
Specifying information need
Direct: user specifies info need herself
Indirect: user relies on other users
-
Possible Querying Scenarios
-
Possible Querying Scenarios
(cont.)
Queries based on Profile
Users expose preferences in one way or another
Preferences stored in user profile in MIRS
Can use profile of a trusted friend if not sure
Queries based on Descriptive Data
Based on format and fact about MM object
Eg. all movies with Director = Steven Spielberg
-
Possible Querying Scenarios
(cont.)
Queries based on Annotations
Text-based: keywords or natural language
Eg. Show me video in which Barack Obama shakes hands with
Mahathir Mohamad
Set of keywords derived from query & compared with keywords in annotations of movies
Queries based on Features
Content-based queries
Features derived (semi) automatically from content of MM object
Low & high level features used
Eg. Find all photos with color distribution like this photo
Eg. Give me all football videos in which a goal is scored within last ten
minutes
Goal is a high-level feature that must be known to MIRS
-
Possible Querying Scenarios
(cont.)
Query by example
Give example MM object
MIRS extracts all kinds of features from the MM object
Resulting query based on these features
Similarity
Degree to which query & MM object of MIRS are similar
Similarity calculated by MIRS based on metadata of MM
object & query
Try to estimate value of relevance of MM object to user
Output is list of MM objects in descending order of
similarity value
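
Ranking by similarity can be sketched with feature vectors. Cosine similarity is used here as one possible measure; the slides do not prescribe a specific one. The object names and the three-value feature vectors (for example, counts of blue, yellow and green pixels) are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Similarity of two feature vectors: 1.0 for the same direction, 0.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_by_similarity(query_features, collection):
    """Return (similarity, object_id) pairs in descending order of similarity."""
    scored = [
        (cosine_similarity(query_features, features), object_id)
        for object_id, features in collection.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical metadata: features already extracted from stored objects.
collection = {
    "beach.jpg": [6, 4, 0],
    "forest.jpg": [0, 1, 9],
    "lake.jpg": [5, 1, 4],
}
query = [7, 3, 0]    # features extracted from the example object given by the user

ranking = rank_by_similarity(query, collection)
for similarity, object_id in ranking:
    print(f"{object_id}: {similarity:.3f}")
```

The comparison touches only the stored metadata, never the image data itself, which is what keeps it fast.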
-
General Retrieval Model
-
Relevance Feedback
Helps when user doesn't know exactly what he is looking
for, causing problem in query formulation
Interactive approach
User issues starting query, MIRS composes result set, user
judges output (relevant/not), MIRS uses feedback to
improve retrieval process
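
One common way to use such feedback is a Rocchio-style update, which moves the query vector toward the objects judged relevant and away from those judged not relevant. The slides do not name a specific method, so this technique, the weights alpha, beta and gamma, and the toy vectors are all assumptions.

```python
def refine_query(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update of a query feature vector from user judgments."""

    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    rel_center = centroid(relevant)
    non_center = centroid(non_relevant)
    return [
        alpha * q + beta * r - gamma * n
        for q, r, n in zip(query, rel_center, non_center)
    ]


# Starting query emphasizes the first feature; the user marks two objects
# dominated by the second feature as relevant and one as not relevant.
start_query = [1.0, 0.0]
judged_relevant = [[0.0, 1.0], [0.0, 1.0]]
judged_not_relevant = [[1.0, 0.0]]

new_query = refine_query(start_query, judged_relevant, judged_not_relevant)
print(new_query)
```

The refined query is then issued again, and the cycle repeats until the user is satisfied.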
-
Component of MIRS - Browsing
User sometimes cannot precisely specify what they want, but
can recognize what they want when they see it
Browsing lets user scan through objects
Exploits hyperlinks which lead user from one object to another
When object shown, user judges its relevance & proceeds accordingly
If objects are huge, icons are used
Starting point
Query that describes info need, or system provides starting point
User can ask for another starting point if not satisfied
Can classify object based on topics & subtopics
-
Component of MIRS
Output Presentation (Play)
When MIRS returns list of objects, system has to
decide whether user has right to see them
User interface should be able to show all kinds of
MM data
What if objects are huge and result set large?
Give user perception of content of object
Extract & present essential info for user to browse & select
objects
Text: title, summary, places where keywords occur
Audio: tune, start of song
Images: summary of images (thumbnails)
Video: cut into scenes and choose a prime image for each scene
-
Component of MIRS
Output Presentation (cont.)
Streaming
Content sent to client at specific rate and, except for
buffering, played directly
Audio & video are delivered as continuous stream of packets
When resources become scarce
Use switched Ethernet instead of shared Ethernet
Use disk striping
Skip frames during play-back
Fragment content over several content servers (need logical component between client & servers to direct client request to
corresponding server)
-
Quality of MIRS
Recall
r/R
Precision
r/n
Relevance judged by humans, refer to TREC
(Text Retrieval Conference)
r: # of relevant objects returned by system
n: # objects retrieved
R: # relevant objects in collection
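
Both measures follow directly from the three counts. A small sketch, with invented object identifiers, assuming the set of relevant objects has already been judged by humans:

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) = (r / R, r / n)."""
    retrieved, relevant = set(retrieved), set(relevant)
    r = len(retrieved & relevant)    # relevant objects returned by system
    n = len(retrieved)               # objects retrieved
    R = len(relevant)                # relevant objects in collection
    recall = r / R if R else 0.0
    precision = r / n if n else 0.0
    return recall, precision


# Hypothetical result set and human relevance judgments.
retrieved_objects = ["a", "b", "c", "d", "e"]
relevant_objects = ["a", "c", "f", "g"]

recall, precision = recall_and_precision(retrieved_objects, relevant_objects)
print("recall:", recall, "precision:", precision)
```

Returning more objects tends to raise recall but lower precision, so the two are usually reported together.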
-
Exercise
Discuss the role of DBMS in storing MM
objects
Discuss the role of Information Retrieval
systems in storing MM objects
-
End of Module 1