Module 1 - Introduction to Multimedia Databases

Upload: yanchan89

Post on 06-Apr-2018


  • 8/3/2019 Module 1 - Introduction to Multimedia Databases

    1/39

    Module 1

    INTRODUCTION TO

    MULTIMEDIA DATABASES

    Prof. Dr. Naomie Salim

    Faculty of Computer Science & Information Systems

    Universiti Teknologi Malaysia


    The Explosion of Digital

    Multimedia Information

    We interact with multimedia every day

    Large amounts of text, images, speech & video are converted to digital form

    Advantages of digitized data over analog

    Easy storage

    Easy processing

    Easy sharing


    Give examples of multimedia applications that deal with storing, retrieving, processing and sharing multimedia data


    Eg 1. Journalism

    A journalist has to write an article about the influence of alcohol on driving

    Investigation involved:

    Collect news articles about accidents, scientific reports, television commercials, police interviews and interviews with medical experts

    Illustration:

    Search photo archives and stock footage companies for good photos (shocking, funny, etc.)


    Other examples

    Searching movies

    Based on taste of movies already seen

    Based on movies a friend favors

    Searching on web

    Eg. searching the Australian Open website (http://www.ausopen.org)

    Integrate conceptual terms + interesting events

    Eg. give info about video segments showing female American tennis players going to the net


    Retrieval problems

    EMPLOYEE (Name: Char(20), City: Char(20), Photo: Image)

    How do you select employees in Skudai?

    How do you select employees who wear a tudung, wear glasses, are fair-skinned and have a mole under the lips?


    Characteristics of Media Data

    Medium - Information representation

    Alphanumeric

    Representation of audio, video and image

    Static vs dynamic

    Static: no time dimension (alphanumeric data, images, graphics)

    Dynamic: has a time dimension (video, animation, audio)

    Multimedia

    Collection of media types used together

    At least one media type must be non-alphanumeric


    Digital representation of text

    OCR techniques convert analog text to digital text

    Eg. of digital representation: ASCII

    7-bit code, usually stored as 8 bits (1 byte) per character

    Chinese characters require more space

    Storage requirements depend on number of characters

    Structured documents becoming more popular

    Docs consist of titles, chapters, sections, paragraphs, etc.

    Standards like HTML and XML used to encode structured information

    Compression of text

    Huffman, arithmetic coding

    Since storage requirements of text are not too high, compression is less important than for other media
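
To make the mention of Huffman coding concrete, here is a minimal sketch of how a Huffman code can be built from character frequencies. The function names `huffman_code` and `encoded_bits` and the sample string are illustrative assumptions, not part of the module.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Map each character of `text` to a prefix-free bit string (Huffman code)."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {char: code built so far}).
    heap = [(weight, i, {ch: ""})
            for i, (ch, weight) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two least frequent subtrees.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encoded_bits(text):
    """Number of bits needed to store `text` with its own Huffman code."""
    code = huffman_code(text)
    return sum(len(code[ch]) for ch in text)


sample = "abracadabra"
print(huffman_code(sample))
print(encoded_bits(sample), "bits vs", 8 * len(sample), "bits at 1 byte per character")
```

Frequent characters get short codes and rare ones get long codes, which is where the saving comes from. The code table itself must also be stored, which this sketch ignores.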


    Digital representation of audio

    Audio: air pressure waves characterized by frequency and amplitude

    Humans hear 20-20,000 Hertz

    Low amplitude: soft sound


    Digitizing pressure waveforms

    Transform into electrical signal (by microphone)

    Convert into discrete values

    Sampling: continuous time axis divided into small, fixed intervals

    Quantization: determination of amplitude of the audio signal at the beginning of each time interval

    Humans cannot notice the difference between analog & digital audio given a high enough sampling rate and precise enough quantization
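
The two digitizing steps can be sketched in a few lines. This is an illustrative example only: the function `sample_and_quantize`, the 1 Hz test tone and the tiny sampling rate of 8 samples per second are assumptions chosen to keep the output readable.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, bits):
    """Sample `signal(t)` (values in [-1, 1]) and quantize to signed integers."""
    levels = 2 ** (bits - 1) - 1            # largest amplitude, e.g. 127 for 8 bits
    n_samples = int(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz              # sampling: fixed time intervals
        samples.append(round(signal(t) * levels))  # quantization: nearest level
    return samples


def tone(t):
    """A 1 Hz sine wave standing in for the microphone signal."""
    return math.sin(2 * math.pi * 1.0 * t)


pcm = sample_and_quantize(tone, duration_s=1.0, sample_rate_hz=8, bits=8)
print(pcm)  # [0, 90, 127, 90, 0, -90, -127, -90]
```

A higher sampling rate gives more values per second, and more bits per sample gives finer amplitude steps. Both increase storage.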


    Audio storage requirements

    Example of a CD audio

    16 bits per sample

    44,100 samples per second

    Two (stereo) channels

    Requirements = 16 * 44,100 * 2 bits = 1.4 Mbit per second

    Compression (examples)

    Masking: discard soft sounds that are made inaudible by louder sounds

    Speech: coding of lower frequency sounds only

    MPEG: audio compression standards
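
The CD audio requirement can be checked with a small calculation. The sketch below assumes the standard CD sampling rate of 44,100 samples per second. The helper name `pcm_bit_rate` is made up for illustration.

```python
def pcm_bit_rate(bits_per_sample, samples_per_second, channels):
    """Uncompressed audio bit rate in bits per second."""
    return bits_per_sample * samples_per_second * channels


cd_bits_per_second = pcm_bit_rate(16, 44_100, 2)
one_minute_bytes = cd_bits_per_second * 60 // 8

print(cd_bits_per_second)  # 1411200, i.e. about 1.4 Mbit per second
print(one_minute_bytes)    # 10584000, i.e. about 10.6 Mbytes per minute
```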


    Digital representation of image

    Scan analog photos & pictures using scanner

    Analog image approximated by rectangle of small dots

    In digital camera, ADC is built-in

    Image consists of many small dots or picture elements (pixels)

    Gray scale: 1 byte (8 bits) per pixel

    Color: 3 color components (RGB) of one byte each

    Data required for 1 rectangular screen

    A = x * y * b

    A: number of bytes needed, x: # pixels per horizontal line, y: # horizontal lines, b: # bytes per pixel
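
As a quick worked example of the formula for A, the sketch below computes the storage needed for two uncompressed images. The resolutions are arbitrary choices for illustration.

```python
def image_bytes(x, y, b):
    """A = x * y * b: bytes for an uncompressed image.

    x: pixels per horizontal line, y: horizontal lines, b: bytes per pixel.
    """
    return x * y * b


print(image_bytes(512, 512, 1))   # gray scale: 262144 bytes
print(image_bytes(1024, 768, 3))  # RGB color: 2359296 bytes
```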


    Image compression

    Exploit redundancy in image & properties of

    human perception

    Spatial redundancy: pixels in certain area often

    appear similar (golden sand, blue sky)

    Human tolerance: some error is acceptable and still allows effective communication

    Eg. of image compression

    Transform coding

    Fractal image coding


    Digital representation of video

    Sequence of frames or images presented at fixed rate

    Digital video obtained by digitizing analog videos or directly from digital cameras

    Playing 25 frames per second gives illusion of continuous view

    Amount of data to represent video

    1 second of video, image: 512 lines, 512 pixels per line, 24 bits (3 bytes) per pixel, 25 frames per second

    512 * 512 * 3 * 25 = 19,660,800 bytes, approx. 19.7 Mbytes
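
The same arithmetic for one second of uncompressed video, as a sketch. The function name `video_bytes_per_second` is an assumption, and the figures are those of the example above.

```python
def video_bytes_per_second(lines, pixels_per_line, bytes_per_pixel, frames_per_second):
    """Bytes needed for one second of uncompressed video."""
    return lines * pixels_per_line * bytes_per_pixel * frames_per_second


per_second = video_bytes_per_second(512, 512, 3, 25)
print(per_second)               # 19660800 bytes per second
print(per_second * 8 / 1e6)     # bit rate in Mbit per second
print(per_second * 3600 / 1e9)  # Gbytes for one hour of video
```

At this rate one hour of video takes roughly 70 Gbytes, which is why video is rarely stored uncompressed.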


    Compression of video

    Compressing frames of videos: similar to image

    Reduce redundancy & exploit human perception properties

    Temporal redundancy: neighboring frames normally similar, removed by applying motion estimation & compensation

    Each image divided into fixed-sized blocks

    For each block in image, the most similar block in previous image is determined & pixel difference computed

    Together with displacement between the two blocks, this difference is stored or transmitted

    MPEG-1 (VHS quality, pixel based coding): coding of video data up to speed of 1.5 Mbits per second

    MPEG-2 (pixel based coding): coding of video data up to speed of 10 Mbits per second

    MPEG-4 (multimedia data, object based coding): coding of video data up to speed of 40 Mbits per second, tools for decoding & representing video objects, support for content-based indexing & retrieval
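
The block-matching step described above can be sketched as an exhaustive search. This is a simplified illustration, not how an MPEG encoder is actually implemented: frames are plain lists of gray values, the similarity measure is assumed to be the sum of absolute differences (SAD), and the names `sad` and `best_match` are made up.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute pixel differences between a block in `prev` and one in `cur`."""
    return sum(
        abs(cur[cy + j][cx + i] - prev[py + j][px + i])
        for j in range(size)
        for i in range(size)
    )


def best_match(prev, cur, cx, cy, size, search):
    """Return (cost, dx, dy) of the most similar block in the previous frame."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = cx + dx, cy + dy
            # Only consider candidate blocks that lie fully inside the frame.
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(prev, cur, px, py, cx, cy, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


# Two 6x6 frames: a small 2x2 pattern moves one pixel to the right.
prev_frame = [[0] * 6 for _ in range(6)]
cur_frame = [[0] * 6 for _ in range(6)]
prev_frame[1][1], prev_frame[1][2], prev_frame[2][1], prev_frame[2][2] = 9, 8, 7, 6
cur_frame[1][2], cur_frame[1][3], cur_frame[2][2], cur_frame[2][3] = 9, 8, 7, 6

print(best_match(prev_frame, cur_frame, cx=2, cy=1, size=2, search=2))  # (0, -1, 0)
```

Here the block is found one pixel to the left in the previous frame with zero difference, so only the displacement needs to be stored for it.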


    How to search for images or multimedia data?

    Analyze one by one?

    No! Takes too long!

    Have to use metadata: instead of searching the objects directly, search the metadata that have been added to them

    Metadata requirements to be valuable for searching:

    Description of multimedia object should be as complete as

    possible

    Storage of metadata must not take too much overhead

    Comparison of two metadata values must be fast


    Metadata of Multimedia Objects

    Descriptive data

    Give format or factual info about multimedia

    object

    Eg.: author name, creation date, length of

    multimedia object, representation technique

    Eg. standard for descriptive data: Dublin Core

    Can use SQL (metadata condition in WHERE clause)
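
A minimal sketch of querying descriptive data with SQL, using Python's built-in `sqlite3` module. The table `mm_object`, its columns and all rows are hypothetical sample data invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE mm_object (
           id INTEGER PRIMARY KEY,
           title TEXT,
           creator TEXT,
           created TEXT,        -- creation date, ISO format
           length_s INTEGER,    -- length of the object in seconds
           format TEXT          -- representation technique
       )"""
)
# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO mm_object (title, creator, created, length_s, format) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Lecture 1", "Author A", "2018-02-01", 3000, "MPEG-4"),
        ("Lecture 2", "Author A", "2018-02-08", 4200, "MPEG-4"),
        ("Interview", "Author B", "2018-03-15", 900, "MPEG-2"),
    ],
)

# The metadata condition goes in the WHERE clause.
rows = conn.execute(
    "SELECT title FROM mm_object WHERE creator = ? AND length_s < ? ORDER BY title",
    ("Author A", 3600),
).fetchall()
print(rows)  # [('Lecture 1',)]
```

The multimedia content itself is never inspected here: the query only touches the descriptive columns.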


    Metadata of Multimedia Objects (cont.)

    Annotations

    Textual description of contents of objects

    Eg.: photo description in Facebook

    Either free format or sequence of keywords

    Manual text annotations allow Information Retrieval techniques to be used but

    Time consuming, expensive

    Subjective, incomplete

    Structured concepts (eg. semantic web, ER-like schema) can be used to describe content through concepts, their relationships to each other & to the MM object but

    Also slow and expensive


    Metadata of Multimedia Objects

    (cont.)

    Features

    Derive characteristics from MM object itself

    Need language to describe features, eg. MPEG-7

    Process to capture features from MM object is

    called feature extraction

    Performed automatically, sometimes with human support

    Two feature classes

    Low-level features

    High-level features


    Low-level Features

    Grasp data patterns & statistics of MM object

    Depend strongly on medium

    Extraction performed automatically

    Eg. for text

    List of keywords with frequency indicators

    Eg. for audio

    Representation

    Amplitude-time sequence: quantification of air pressure at each sample

    Silence: 0, above silence: +ve amplitude, below silence: -ve amplitude

    Eg. low-level features derived

    Energy (loudness of signal)

    ZCR (zero crossing rate: frequency of sign change), high indicates speech

    Silence ratio, low indicates music
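
The three audio features can be computed directly from an amplitude-time sequence. The sketch below uses simplified per-sample definitions (for instance, silence is judged sample by sample against an assumed threshold rather than over periods), and the function names are illustrative.

```python
def energy(samples):
    """Average squared amplitude, a simple measure of loudness."""
    return sum(s * s for s in samples) / len(samples)


def zero_crossing_rate(samples):
    """Fraction of neighboring sample pairs whose sign differs."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def silence_ratio(samples, threshold):
    """Fraction of samples whose amplitude is at most `threshold`."""
    silent = sum(1 for s in samples if abs(s) <= threshold)
    return silent / len(samples)


# One period of a coarsely sampled sine wave (8-bit amplitudes).
samples = [0, 90, 127, 90, 0, -90, -127, -90]
print(energy(samples))             # 8082.25
print(zero_crossing_rate(samples))
print(silence_ratio(samples, 10))  # 0.25
```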


    Low-level features (cont.)

    Eg. for images

    Color histograms: # pixels having color in a certain range

    Spatial relationships: eg. blue patterns appear above yellow (beach photo)

    Contrast: # dark spots neighboring light spots

    Eg. for video

    Use low-level features for image

    Eg. of temporal dimension: shot change, detected when pixel difference between two consecutive images is higher than a certain threshold

    Shot: sequence of images taken with same camera position
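
A histogram and a threshold-based shot-change detector can be sketched as follows. For brevity the example assumes gray-scale frames stored as lists of rows, measures frame difference as mean absolute pixel difference, and uses tiny 2x2 frames. All names and values are illustrative.

```python
def gray_histogram(image, bins=4, max_value=256):
    """Count the pixels whose gray value falls into each of `bins` equal ranges."""
    hist = [0] * bins
    width = max_value // bins
    for row in image:
        for pixel in row:
            hist[min(pixel // width, bins - 1)] += 1
    return hist


def mean_abs_diff(frame_a, frame_b):
    """Average absolute pixel difference between two frames of equal size."""
    total = sum(
        abs(pa - pb)
        for row_a, row_b in zip(frame_a, frame_b)
        for pa, pb in zip(row_a, row_b)
    )
    return total / (len(frame_a) * len(frame_a[0]))


def shot_changes(frames, threshold):
    """Indices of frames that differ from their predecessor by more than `threshold`."""
    return [
        i for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


dark_1 = [[10, 12], [11, 13]]
dark_2 = [[11, 12], [10, 14]]
bright = [[200, 210], [220, 230]]

print(gray_histogram(dark_1))                      # [4, 0, 0, 0]
print(gray_histogram(bright))                      # [0, 0, 0, 4]
print(shot_changes([dark_1, dark_2, bright], 50))  # [2]
```

The small change between the two dark frames stays below the threshold, while the jump to the bright frame is reported as a shot change.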


    High-level features

    Features which are meaningful to end user, such as

    golf course, forest

    How can we bridge the semantic gap between low level and high level features?

    High level feature extraction from low level features

    Eg. text containing the words football, referee: football match text

    Eg. Speech to text translators (low level audio features to text)

    Eg. Video, domain specific: loud sound from crowd, round object passing white line, followed by sharp whistle: goal


    Multimedia Information Retrieval

    System (MIRS)


    Component of MIRS - Archiving

    MM data stored separately from its metadata

    Voluminous

    Visible or audible delays in playback unacceptable

    MM data managed separately in MM content

    server

    At storage time, objects get an identification to be used by other parts of MIRS

    Have to deal with compression and protection


    Component of MIRS - Feature Extraction (Indexing)

    Extraction of metadata (annotations, descriptions, features) from incoming multimedia object

    Algorithms have to consider extraction dependencies. Eg.:

    Video object segmented, choose key frame for each segment

    Extract low-level features from key frame

    Based on low-level features, classify into shots of audience, fields, close-ups

    For field shots, detect positions of players

    Extract body related features of players

    Determine where net playing begins and ends

    Have to consider incremental maintenance (modification of MM objects, extractors, extraction dependencies)
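
One way to model extraction dependencies is as a dependency graph that is processed in topological order. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The step names are invented labels for the tennis example above, and `affected_by` is a hypothetical helper showing which extractors would need to be rerun after a change.

```python
from graphlib import TopologicalSorter

# Each extraction step maps to the set of steps it depends on.
dependencies = {
    "segment_video": set(),
    "select_key_frames": {"segment_video"},
    "low_level_features": {"select_key_frames"},
    "classify_shots": {"low_level_features"},
    "detect_players": {"classify_shots"},
    "body_features": {"detect_players"},
    "detect_net_play": {"body_features"},
}

# A valid execution order: every step runs after the steps it depends on.
order = list(TopologicalSorter(dependencies).static_order())
print(order)


def affected_by(changed, deps):
    """Steps that must be rerun when the extractor `changed` is modified."""
    result = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for step, needs in deps.items():
            if current in needs and step not in result:
                result.add(step)
                frontier.append(step)
    return result


print(sorted(affected_by("classify_shots", dependencies)))
```

For incremental maintenance, only the steps downstream of a modified extractor have to be recomputed, not the whole pipeline.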


    Incremental Maintenance in ACOI

    Feature Extraction Architecture


    Component of MIRS - Searching

    Multimedia queries are diverse, can be

    specified in many different ways

    No exact match, many ways to describe MM

    objects

    Specifying information need

    Direct: user specifies information need herself

    Indirect: user relies on other users


    Possible Querying Scenarios


    Possible Querying Scenarios

    (cont.)

    Queries based on Profile

    Users expose preferences in one way or another

    Preferences stored in user profile in MIRS

    Can use profile of a trusted friend if not sure of own preferences

    Queries based on Descriptive Data

    Based on format and facts about MM object

    Eg. all movies with Director = 'Steven Spielberg'


    Possible Querying Scenarios (cont.)

    Queries based on Annotations

    Text-based: keywords or natural language

    Eg. Show me videos in which Barack Obama shakes hands with Mahathir Mohamad

    Set of keywords derived from query & compared with keywords in annotations of movies

    Queries based on Features

    Content-based queries

    Features derived (semi-)automatically from content of MM object

    Low & high level features used

    Eg. Find all photos with a color distribution like this photo

    Eg. Give me all football videos in which a goal is scored within the last ten minutes

    "Goal" is a high-level feature that must be known to MIRS


    Possible Querying Scenarios (cont.)

    Query by example

    Give example MM object

    MIRS extracts all kinds of features from the MM object

    Resulting query based on these features

    Similarity

    Degree to which query & MM object of MIRS are similar

    Similarity calculated by MIRS based on metadata of MM object & query

    Tries to estimate relevance of MM object to user

    Output is list of MM objects in descending order of similarity value
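
Ranking by similarity can be sketched with feature vectors and a similarity measure. The slides do not prescribe a particular measure. Cosine similarity is assumed here as one common choice, and the feature vectors and object names are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_features, collection):
    """Return (similarity, object_id) pairs in descending order of similarity."""
    scored = [
        (cosine_similarity(query_features, features), object_id)
        for object_id, features in collection.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical feature vectors, e.g. 3-bin color histograms of stored photos.
collection = {
    "photo_a": [2, 0, 0],
    "photo_b": [1, 1, 0],
    "photo_c": [0, 0, 5],
}
query = [1, 0, 0]  # features extracted from the example object

for similarity, object_id in rank(query, collection):
    print(object_id, round(similarity, 3))
```

With query by example, the query vector is simply the feature vector extracted from the object the user supplied.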


    General Retrieval Model


    Relevance Feedback

    Helps when user doesn't know exactly what he is looking for, causing problems in query formulation

    Interactive approach

    User issues starting query, MIRS composes result set, user judges output (relevant/not), MIRS uses feedback to improve retrieval process
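
The slides do not say how the MIRS uses the feedback. One classic technique from text retrieval, shown here purely as an illustration, is Rocchio's formula: the query vector is moved toward the objects judged relevant and away from those judged not relevant. The weights `alpha`, `beta` and `gamma` and the sample vectors are assumptions.

```python
def centroid(vectors, size):
    """Component-wise mean of `vectors` (all zeros if the list is empty)."""
    if not vectors:
        return [0.0] * size
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Refine a query vector from the user's relevance judgments."""
    rel = centroid(relevant, len(query))
    non = centroid(non_relevant, len(query))
    # Negative weights are clipped to zero, a common practical choice.
    return [
        max(0.0, alpha * q + beta * r - gamma * n)
        for q, r, n in zip(query, rel, non)
    ]


start_query = [1.0, 0.0]
judged_relevant = [[0.0, 2.0], [0.0, 4.0]]
judged_not_relevant = [[2.0, 0.0]]

new_query = rocchio(start_query, judged_relevant, judged_not_relevant,
                    alpha=1.0, beta=0.5, gamma=0.25)
print(new_query)  # [0.5, 1.5]
```

The refined query is then run again, and the loop repeats until the user is satisfied.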


    Component of MIRS - Browsing

    Users sometimes cannot precisely specify what they want, but can recognize it when they see it

    Browsing lets user scan through objects

    Exploits hyperlinks which lead user from one object to another

    When object shown, user judges its relevance & proceeds accordingly

    If objects are huge, icons are used

    Starting point

    Query that describes info need, or system provides starting point

    User can ask for another starting point if not satisfied

    Can classify objects based on topics & subtopics


    Component of MIRS - Output Presentation (Play)

    When MIRS returns list of objects, system has to decide whether user has right to see them

    User interface should be able to show all kinds of MM data

    What if objects are huge and result set large?

    Give user perception of content of object

    Extract & present essential info for user to browse & select objects

    Text: title, summary, places where keywords occur

    Audio: tune, start of song

    Images: thumbnails as summaries of images

    Video: cut into scenes and choose a prime image for each scene


    Component of MIRS - Output Presentation (cont.)

    Streaming

    Content sent to client at specific rate and, except for buffering, played directly

    Audio & video are delivered as continuous stream of packets

    When resources become scarce

    Use switched Ethernet instead of shared Ethernet

    Use disk striping

    Skip frames during playback

    Fragment content over several content servers (need logical component between client & servers to direct client request to corresponding server)


    Quality of MIRS

    Recall = r/R

    Precision = r/n

    r: # of relevant objects returned by system

    n: # objects retrieved

    R: # relevant objects in collection

    Relevance judged by humans, refer to TREC (Text Retrieval Conference)
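
Recall and precision follow directly from the definitions of r, n and R. The sketch below uses made-up object identifiers, and the function name `recall_precision` is an assumption.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    retrieved: objects returned by the system (n = number of them)
    relevant:  objects in the collection judged relevant (R = number of them)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    r = len(retrieved & relevant)  # relevant objects returned by the system
    recall = r / len(relevant) if relevant else 0.0
    precision = r / len(retrieved) if retrieved else 0.0
    return recall, precision


retrieved = ["a", "b", "c", "d"]      # n = 4
relevant = ["a", "c", "e", "f", "g"]  # R = 5
print(recall_precision(retrieved, relevant))  # (0.4, 0.5)
```

Here r = 2, so recall is 2/5 and precision is 2/4. Returning more objects tends to raise recall but lower precision.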


    Exercise

    Discuss the role of DBMS in storing MM

    objects

    Discuss the role of Information Retrieval

    systems in storing MM objects


    End of Module 1