Introduction to Multimedia 1
Introduction to Multimedia
Introduction
Multimedia Description
Why multimedia systems?
Classification of Media
Multimedia Systems
Data Stream Characteristics
Multimedia Description
Multimedia is an integration of continuous media (e.g. audio, video) and discrete media (e.g. text, graphics, images) through which digital information can be conveyed to the user in an appropriate way.
Multi: many, much, multiple
Medium: an intervening substance through which something is transmitted or carried on
Why Multimedia Computing?
Application driven, e.g. medicine, sports, entertainment, education
Information can often be better represented using audio/video/animation rather than using text, images and graphics alone.
Information is distributed using computer and telecommunication networks.
Integration of multiple media places demands on
  computation power
  storage requirements
  networking requirements
Multimedia Information Systems
Technical challenges
Sheer volume of data
  Need to manage huge volumes of data
Timing requirements
  among components of data computation and communication.
  Must work internally with given timing constraints - real-time performance is required.
Integration requirements
  need to process traditional media (text, images) as well as continuous media (audio/video).
  Media are not always independent of each other - synchronization among the media may be required.
High Data Volume of Multimedia Information
Speech: 8000 samples/s, 8 Kbytes/s
CD Audio: 44,100 samples/s, 2 bytes/sample, 176 Kbytes/s (stereo)
Satellite Imagery: 180 x 180 km^2 at 30 m^2 resolution, 600 MB/image (60 MB compressed)
NTSC Video: 30 fps, 640 x 480 pixels, 3 bytes/pixel, about 30 Mbytes/s (2-8 Mbits/s compressed)
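The figures on this slide can be reproduced with a short calculation. The sketch below is only illustrative: the function name is hypothetical, and it assumes 1 byte per speech sample and two channels for CD audio, which the slide does not state explicitly.

```python
def raw_data_rate(samples_per_second, bytes_per_sample, channels=1):
    """Return the uncompressed data rate in bytes per second."""
    return samples_per_second * bytes_per_sample * channels


# Assumption: telephone speech uses 1 byte per sample.
speech = raw_data_rate(8000, 1)
# Assumption: CD audio is stereo (2 channels).
cd_audio = raw_data_rate(44100, 2, channels=2)
# One "sample" per pixel: 30 frames/s of 640 x 480 pixels, 3 bytes each.
ntsc_video = raw_data_rate(30 * 640 * 480, 3)

print(speech, cd_audio, ntsc_video)
```

The video figure comes out at roughly 27.6 million bytes per second, which the slide rounds to 30 Mbytes/s.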
Technology Incentive
Growth in computational capacity
  MM workstations with audio/video processing capability
  Dramatic increase in CPU processing power
  Dedicated compression engines for audio, video etc.
Rise in storage capacity
  Large capacity disks (several gigabytes)
  Increase in storage bandwidth, e.g. disk array technology
Surge in available network bandwidth
  high speed fiber optic networks - gigabit networks
  fast packet switching technology
Application Areas
Residential Services
  video-on-demand
  video phone/conferencing systems
  multimedia home shopping (MM catalogs, product demos and presentation)
  self-paced education
Business Services
  Corporate training
  Desktop MM conferencing, MM e-mail
Application Areas
Education
  Distance education - MM repository of class videos
  Access to digital MM libraries over high speed networks
Science and Technology
  computational visualization and prototyping
  astronomy, environmental science
Medicine
  Diagnosis and treatment - e.g. MM databases that provide support for queries on scanned images, X-rays, assessments, response etc.
Classification of Media
Perception Medium
  How do humans perceive information in a computer?
    • Through seeing - text, images, video
    • Through hearing - music, noise, speech
Representation Medium
  How is the computer information encoded?
    • Using formats for representing information
    • ASCII (text), JPEG (image), MPEG (video)
Presentation Medium
  Through which medium is information delivered by the computer or introduced into the computer?
    • Via I/O tools and devices
    • paper, screen, speakers (output media)
    • keyboard, mouse, camera, microphone (input media)
Classification of Media (cont.)
Storage Medium
  • Where will the information be stored?
  • Storage media - floppy disk, hard disk, tape, CD-ROM etc.
Transmission Medium
  • Over what medium will the information be transmitted?
  • Using information carriers that enable continuous data transmission - networks
  • wire, coaxial cable, fiber optics
Information Exchange Medium
  • Which information carrier will be used for information exchange between different places?
  • Direct transmission using computer networks
  • Combined use of storage and transmission media (e.g. electronic mail).
Media Concepts
Each medium defines
  Representation values - determine the information representation of different media
    • Continuous representation values (e.g. electromagnetic waves)
    • Discrete representation values (e.g. text characters in digital form)
  Representation space - determines the surrounding where the media are presented.
    • Visual representation space (e.g. paper, screen)
    • Acoustic representation space (e.g. stereo)
Media Concepts (cont.)
Representation dimensions of a representation space are:
Spatial dimensions:
  two dimensional (2D graphics)
  three dimensional (holography)
Temporal dimensions:
  Time independent (document) - Discrete media
    • Information consists of a sequence of individual elements without a time component.
  Time dependent (movie) - Continuous media
    • Information is expressed not only by its individual value but also by its time of occurrence.
Multimedia Systems
Qualitative and quantitative evaluation of multimedia systems
Combination of media
  continuous and discrete.
Levels of media-independence
  some media types (audio/video) may be tightly coupled, others may not.
Computer supported integration
  timing, spatial and semantic synchronization
Communication capability
Data Streams
Distributed multimedia communication systems
  data of discrete and continuous media are broken into individual units (packets) and transmitted.
Data Stream
  sequence of individual packets that are transmitted in a time-dependent fashion.
  Transmission of information carrying different media leads to data streams with varying features
    • Asynchronous
    • Synchronous
    • Isochronous
Data Stream Characteristics
Asynchronous transmission mode
  • provides for communication with no time restriction
  • Packets reach receiver as quickly as possible, e.g. protocols for email transmission
Synchronous transmission mode
  • defines a maximum end-to-end delay for each packet of a data stream.
  • May require intermediate storage
  • E.g. audio connection established over a network.
Isochronous transmission mode
  • defines a maximum and a minimum end-to-end delay for each packet of a data stream. Delay jitter of individual packets is bounded.
  • E.g. transmission of video over a network.
  • Intermediate storage requirements reduced.
Data Stream Characteristics
Data stream characteristics for continuous media can be based on
Time intervals between complete transmission of consecutive packets
  • Strongly periodic data streams - constant time interval
  • Weakly periodic data streams - intervals follow a periodic function with finite period.
  • Aperiodic data streams
Data size - amount of data in consecutive packets
  • Strongly regular data streams - constant amount of data
  • Weakly regular data streams - varies periodically with time
  • Irregular data streams
Continuity
  • Continuous data streams
  • Discrete data streams
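The time-interval classification above can be sketched in a few lines. This is an illustrative sketch, not a definition from the slides: the function name is hypothetical, intervals are compared for exact equality (a real stream would need a tolerance), and "weakly periodic" is taken to mean that the interval sequence repeats at least twice within the observed list. The same test could be applied to packet sizes to distinguish strongly regular, weakly regular and irregular streams.

```python
def classify_periodicity(intervals):
    """Classify a list of inter-packet time intervals.

    Returns 'strongly periodic' if all intervals are equal,
    'weakly periodic' if the interval pattern repeats with a period
    of at most half the list length, and 'aperiodic' otherwise.
    """
    n = len(intervals)
    if n < 2:
        raise ValueError("need at least two intervals")
    if all(value == intervals[0] for value in intervals):
        return "strongly periodic"
    for period in range(2, n // 2 + 1):
        if all(intervals[i] == intervals[i % period] for i in range(n)):
            return "weakly periodic"
    return "aperiodic"


print(classify_periodicity([10, 10, 10, 10]))
print(classify_periodicity([10, 20, 30, 10, 20, 30]))
print(classify_periodicity([10, 25, 13, 40, 7, 31]))
```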
Classification based on time intervals
[Figure: timing diagrams of a strongly periodic data stream, a weakly periodic data stream and an aperiodic data stream; packet intervals are labeled T, T1, T2, T3.]
Classification based on packet size
[Figure: packet size plotted against time t for a strongly regular data stream, a weakly regular data stream and an irregular data stream; packet sizes are labeled D1, D2, D3, ..., Dn and the period T.]
Classification based on continuity
[Figure: packet diagrams of a continuous data stream and a discrete data stream, each showing packets D1 D2 D3 D4 of size D.]
Broadband Multimedia Communications
Audio/Image/Video Representation
Introduction
Basic Sound Concepts
Computer Representation of Sound
Basic Image Concepts
Image Representation and Formats
Video Signal Representation
Color Encoding
Computer Video Format
Basic Sound Concepts
Acoustics
  study of sound - generation, transmission and reception of sound waves.
Sound is produced by vibration of matter.
  During vibration, pressure variations are created in the surrounding air molecules.
  Pattern of oscillation creates a waveform
    • the wave is made up of pressure differences.
  Waveform repeats the same shape at intervals called a period.
    • Periodic sound sources - exhibit more periodicity, more musical - musical instruments, wind etc.
    • Aperiodic sound sources - less periodic - unpitched percussion, sneeze, cough.
Basic Sound Concepts
Sound Transmission
  Sound is transmitted by molecules bumping into each other.
  Sound is a continuous wave that travels through air.
  Sound is detected by measuring the pressure level at a point.
Receiving
  Microphone in sound field moves according to the varying pressure exerted on it.
  Transducer converts energy into a voltage level (i.e. energy of another form - electrical energy)
Sending
  Speaker transforms electrical energy into sound waves.
Frequency of a sound wave
[Figure: air pressure plotted against time, with the period and the amplitude of the wave marked.]
Frequency is the reciprocal value of the period.
Basic Sound Concepts
Wavelength is the distance travelled in one cycle
  20 Hz is 56 feet, 20 kHz is 0.7 in.
Frequency represents the number of periods in a second (measured in hertz, cycles/second).
  Frequency is the reciprocal value of the period.
  Human hearing frequency range: 20 Hz - 20 kHz, voice is about 500 Hz to 2 kHz.
Infrasound from 0 - 20 Hz
Human range from 20 Hz - 20 kHz
Ultrasound from 20 kHz - 1 GHz
Hypersound from 1 GHz - 10 THz
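The wavelength figures quoted above follow from dividing the speed of sound by the frequency. The sketch below assumes a speed of sound of 343 m/s (dry air at about 20 °C), a value not given on the slide; the function names are illustrative.

```python
# Assumption: speed of sound in dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_m(frequency_hz):
    """Distance travelled by the wave in one cycle, in metres."""
    return SPEED_OF_SOUND_M_PER_S / frequency_hz


def period_s(frequency_hz):
    """Period is the reciprocal of the frequency."""
    return 1.0 / frequency_hz


low_feet = wavelength_m(20) / 0.3048        # metres to feet
high_inches = wavelength_m(20000) / 0.0254  # metres to inches
print(round(low_feet, 1), round(high_inches, 2))
```

With this assumed speed the results are close to the 56 feet and 0.7 inch quoted on the slide.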
Basic Sound Concepts
Amplitude of a sound is the measure of the displacement of the air pressure wave from its mean or quiescent state.
Subjectively heard as loudness. Measured in decibels.
  0 dB - essentially no sound heard
  35 dB - quiet home
  70 dB - noisy street
  120 dB - discomfort
Computer Representation of Audio
A transducer converts pressure to voltage levels.
Convert analog signal into a digital stream by discrete sampling.
Discretization both in time and amplitude (quantization).
In a computer, we sample these values at intervals to get a vector of values.
A computer measures the amplitude of the waveform at regular time intervals to produce a series of numbers (samples).
Computer Representation of Audio
Sampling Rate:
  rate at which a continuous wave is sampled (measured in Hertz)
    • CD standard - 44100 Hz, Telephone quality - 8000 Hz.
Direct relationship between sampling rate, sound quality (fidelity) and storage space.
Question
  • How often do you need to sample a signal to avoid losing information?
Answer
  • To decide a sampling rate - must be aware of difference between playback rate and capturing (sampling) rate.
  • It depends on how fast the signal is changing. In reality - at least twice per cycle of the highest frequency (follows from the Nyquist sampling theorem).
Sampling
[Figure: a waveform with its samples marked along the horizontal axis; the vertical axis is the sample height.]
Nyquist Sampling Theorem
If a signal f(t) is sampled at regular intervals of time and at a rate higher than twice the highest significant signal frequency, then the samples contain all the information of the original signal.
Example
  The highest frequency reproduced in CD quality audio is 22050 Hz.
  Because of the Nyquist theorem we need to sample at twice that frequency, therefore the sampling frequency is 44100 Hz.
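A small experiment shows what goes wrong when the sampling rate is too low. The sketch below uses hypothetical function names and illustrative frequencies: a 3 Hz cosine sampled at only 4 Hz yields exactly the same samples as a 1 Hz cosine, so the two signals cannot be told apart after sampling (aliasing).

```python
import math


def sample_cosine(frequency_hz, sampling_rate_hz, count):
    """Return `count` samples of cos(2*pi*f*t) taken at the given rate."""
    return [
        math.cos(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(count)
    ]


def satisfies_nyquist(highest_frequency_hz, sampling_rate_hz):
    """True if the rate is higher than twice the highest frequency."""
    return sampling_rate_hz > 2 * highest_frequency_hz


undersampled = sample_cosine(3, 4, 8)  # 4 Hz is below 2 * 3 Hz
alias = sample_cosine(1, 4, 8)         # a 1 Hz cosine gives the same samples
print(satisfies_nyquist(3, 4), satisfies_nyquist(20000, 44100))
```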
Data Rate of a Channel
Noiseless Channel
  • Nyquist proved that if any arbitrary signal has been run through a low pass filter of bandwidth H, the filtered signal can be completely reconstructed by making only 2H (exact) samples per second. If the signal consists of V discrete levels, Nyquist's theorem states:
      max data rate = 2 H log_2 V bits/sec
  • A noiseless 3 kHz channel with a quantization level of 1 bit (V = 2) cannot transmit binary signals at a rate exceeding 6000 bits per second.
Noisy Channel
  • Thermal noise present is measured by the ratio of the signal power S to the noise power N (signal-to-noise ratio S/N).
  • max data rate = H log_2 (1 + S/N) bits/sec
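Both limits are easy to evaluate numerically. In the sketch below the function names are illustrative, and the signal-to-noise ratio of 1000 (30 dB) is an assumed example value that does not appear on the slide.

```python
import math


def nyquist_max_rate(bandwidth_hz, levels):
    """Noiseless channel: 2 * H * log2(V) bits per second."""
    return 2 * bandwidth_hz * math.log2(levels)


def shannon_max_rate(bandwidth_hz, signal_to_noise):
    """Noisy channel: H * log2(1 + S/N) bits per second."""
    return bandwidth_hz * math.log2(1 + signal_to_noise)


binary_3khz = nyquist_max_rate(3000, 2)   # the slide's 3 kHz binary example
noisy_3khz = shannon_max_rate(3000, 1000)  # assumed S/N of 1000 (30 dB)
print(binary_3khz, round(noisy_3khz))
```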
Quantization
Sample precision - the resolution of a sample value
Quantization depends on the number of bits used measuring the height of the waveform.
16 bit CD quality quantization results in 64K values.
Audio formats are described by sample rate and quantization.
  • Voice quality - 8 bit quantization, 8000 Hz mono (8 Kbytes/sec)
  • 22 kHz 8-bit mono (22 Kbytes/sec) and stereo (44 Kbytes/sec)
  • CD quality - 16 bit quantization, 44100 Hz linear stereo (176 Kbytes/sec)
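As a rough illustration of how the number of bits determines sample precision, the following sketch applies uniform (linear) quantization to samples in the range -1.0 to 1.0. The function names and the normalized input range are assumptions made for the example.

```python
def quantize(sample, bits):
    """Map a sample in [-1.0, 1.0] to an integer code in [0, 2**bits - 1]."""
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, sample))
    # Scale to [0, levels - 1] and round to the nearest integer code.
    return int((clipped + 1.0) / 2.0 * (levels - 1) + 0.5)


def dequantize(code, bits):
    """Map an integer code back to an approximate sample value."""
    levels = 2 ** bits
    return code / (levels - 1) * 2.0 - 1.0


print(2 ** 16)                            # number of values with 16 bits
print(quantize(0.3, 8), dequantize(quantize(0.3, 8), 8))
```

The reconstructed value differs from the original by at most half a quantization step, which is the reduction in precision that quantization introduces.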
Quantization and Sampling
[Figure: a sampled waveform whose sample heights are mapped to the quantization levels 0.25, 0.5 and 0.75.]
Audio Formats
Audio formats are characterized by four parameters
Sample rate: Sampling frequency
Encoding: audio data representation
  • μ-law encoding corresponds to CCITT G.711 - standard for voice data in telephone companies in USA, Canada, Japan
  • A-law encoding - used for telephony elsewhere.
  • A-law and μ-law are sampled at 8000 samples/second with precision of 12 bits, compressed to 8-bit samples.
  • Linear Pulse Code Modulation (PCM) - uncompressed audio where samples are proportional to audio signal voltage.
Precision: number of bits used to store audio sample
  • μ-law and A-law - 8 bit precision, PCM can be stored at various precisions, 16 bit PCM is common.
Channel: Multiple channels of audio may be interleaved at sample boundaries.
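The idea behind μ-law encoding can be illustrated with the continuous companding curve. This is a sketch under stated assumptions: it uses the textbook formula with μ = 255 on samples normalized to -1.0..1.0, whereas G.711 codecs use a piecewise-linear approximation that produces 8-bit codes; the function names are hypothetical.

```python
import math

MU = 255  # assumed value, the one commonly used for 8-bit mu-law


def mu_law_compress(x, mu=MU):
    """Compress a sample in [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


quiet = mu_law_compress(0.01)
print(quiet, mu_law_expand(quiet))
```

Quiet samples are spread over a larger part of the output range before being quantized, so they keep more relative precision than with linear PCM at the same number of bits.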
Audio Formats
Available on UNIX
  au (SUN file format), wav (Microsoft RIFF/waveform format), al (raw a-law), u (raw u-law)…
Available on Windows-based systems (RIFF formats)
  wav, midi (file format for standard MIDI files), avi
RIFF (Resource Interchange File Format)
  tagged file format (similar to TIFF). Allows multiple applications to read files in RIFF format
RealAudio, MP3 (MPEG Audio Layer 3)
Computer Representation of Voice
Best known technique for voice digitization is pulse-code-modulation (PCM).
  Consists of the 2 step process of sampling and quantization.
  Based on the sampling theorem.
    If voice data are limited to 4000 Hz, then PCM samples 8000 samples per second which is sufficient for input voice signal.
  PCM provides analog samples which must be converted to digital representation.
    Each of these analog samples must be assigned a binary code. Each sample is approximated by being quantized.
Computer Representation of Music
MIDI (Musical Instrument Digital Interface)
  standard that manufacturers of musical instruments use so that instruments can communicate musical information via computers.
The MIDI interface consists of:
  • Hardware - physical connection between instruments, specifies a MIDI port (plugs into the computer's serial port) and a MIDI cable.
  • Data format - has instrument specification, notion of beginning and end of note, frequency and sound volume. Data grouped into MIDI messages that specify a musical event.
  • An instrument that satisfies both is a MIDI device (e.g. synthesizer)
MIDI software applications include
  • music recording and performance applications, musical notations and printing applications, music education etc.
Computer Representation of Speech
Human ear is most sensitive in the range 600 Hz to 6000 Hz.
Speech Generation
  • real-time signal generation allows transformation of text into speech without lengthy processing
  • Limited vs. large vocabulary (depends on application)
  • Must be understandable, must sound natural
Speech Analysis
  • Identification and Verification - recognize speakers using acoustic fingerprint
  • Recognition and Understanding - analyze what has been said
  • How something was said - used in lie detectors.
Speech transmission - coding, recognition and synthesis methods - achieve minimal data rate for a given quality.
Basic Concepts (Digital Image Representation)
An image is a spatial representation of an object, a 2D or 3D scene etc.
Abstractly, an image is a continuous function defining a rectangular region of a plane
intensity image - proportional to radiant energy received by a sensor/detector
range image - line of sight distance from sensor position.
An image can be thought of as a function with resulting values of the light intensity at each point over a planar region.
Digital Image Representation
For computer representation, function (e.g. intensity) must be sampled at discrete intervals.
  Sampling quantizes the intensity values into discrete intervals.
    • Points at which an image is sampled are called picture elements or pixels.
    • Resolution specifies the distance between points - accuracy.
  A digital image is represented by a matrix of numeric values each representing a quantized intensity value.
    • I(r,c) - intensity value at position corresponding to row r and column c of the matrix.
    • Intensity value can be represented by 1 bit for black and white images (binary valued images), 8 bits for monochrome imagery to encode color or grayscale levels, 24 bits for color (RGB).
Image Formats
Captured Image Format
  format obtained from an image frame grabber
  Important parameters
    • Spatial resolution (pixels X pixels)
    • Color encoding (quantization level of a pixel - 8-bit, 24-bit)
    • e.g. “SunVideo” Video digitizer board allows pictures of 320 by 240 pixels with 8-bit grayscale or color resolution. Parallax-X video includes resolution of 640X480 pixels and 24-bit frame buffer.
Image Formats
Stored Image Format - format when images are stored
Images are stored as 2D array of values where each value represents the data associated with a pixel in the image.
  Bitmap - this value is a binary digit
  For a color image - this value may be a collection of
    • 3 values that represent intensities of RGB component at that pixel, 3 numbers that are indices to table of RGB intensities, index to some color data structure etc.
Image file formats include - GIF (Graphics Interchange Format), X11 bitmap, PostScript, JPEG, TIFF
Basic Concepts (Video Representation)
Human eye views video
  immanent properties of the eye determine essential conditions related to video systems.
Video signal representation consists of 3 aspects:
  Visual Representation
    • objective is to offer the viewer a sense of presence in the scene and of participation in the events portrayed.
  Transmission
    • Video signals are transmitted to the receiver through a single television channel
  Digitalization
    • analog to digital conversion, sampling of gray (color) level, quantization.
Visual Representation
The televised image should convey the spatial and temporal content of the scene
Vertical detail and viewing distance
  • Aspect ratio: ratio of picture width and height (4/3 = 1.33 is the conventional aspect ratio).
  • Viewing angle = viewing distance/picture height
Horizontal detail and picture width
  • Picture width (conventional TV service) = 4/3 * picture height
Total detail content of the image
  • Number of pixels presented separately in the picture height = vertical resolution
  • Number of pixels in the picture width = vertical resolution * aspect ratio
  • product equals total number of picture elements in the image.
Visual Representation
Perception of Depth
  • In natural vision, this is determined by angular separation of images received by the two eyes of the viewer
  • In the flat image of TV, focal length of lenses and changes in depth of focus in a camera influence depth perception.
Luminance and Chrominance
  • Color-vision - achieved through 3 signals, proportional to the relative intensities of RED, GREEN and BLUE.
  • Color encoding during transmission uses one LUMINANCE and two CHROMINANCE signals
Temporal Aspect of Resolution
  • Motion resolution is a rapid succession of slightly different frames. For visual reality, repetition rate must be high enough (a) to guarantee smooth motion and (b) that persistence of vision extends over the interval between flashes (light cutoff between frames).
Visual Representation
Continuity of motion
  • Motion continuity is achieved at a minimal 15 frames per second; is good at 30 frames/sec; some technologies allow 60 frames/sec.
  • NTSC standard provides 30 frames/sec - 29.97 Hz repetition rate.
  • PAL standard provides 25 frames/sec with 25 Hz repetition rate.
Flicker effect
  • Flicker effect is a periodic fluctuation of brightness perception. To avoid this effect, we need 50 refresh cycles/sec. Display devices have a display refresh buffer for this.
Temporal aspect of video bandwidth
  • depends on rate of the visual system to scan pixels and on human eye scanning capabilities.
Transmission (NTSC)
Video bandwidth is computed as follows
  700/2 pixels per line x 525 lines per picture x 30 pictures per second
  Visible number of lines is 480.
Intermediate delay between frames is
  1000 ms / 30 fps = 33.3 ms
Display time per line is
  33.3 ms / 525 lines = 63.4 microseconds
The transmitted signal is a composite signal
  consists of 4.2 MHz for the basic signal and 5 MHz for the color, intensity and synchronization information.
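The NTSC arithmetic on this slide can be checked with a few lines of code. The function name and its return layout are illustrative; the parameters are the ones quoted on the slide.

```python
def ntsc_timing(pixels_per_line=700, lines=525, frames_per_second=30):
    """Return (bandwidth in Hz, frame interval in ms, line time in microseconds)."""
    bandwidth_hz = pixels_per_line / 2 * lines * frames_per_second
    frame_interval_ms = 1000 / frames_per_second
    line_time_us = frame_interval_ms * 1000 / lines
    return bandwidth_hz, frame_interval_ms, line_time_us


bandwidth_hz, frame_interval_ms, line_time_us = ntsc_timing()
print(bandwidth_hz, round(frame_interval_ms, 1), round(line_time_us, 1))
```

The bandwidth comes out at about 5.5 MHz. Without rounding the frame interval to 33.3 ms first, the line time is about 63.5 microseconds rather than 63.4.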
Color Encoding
A camera creates three signals
  RGB (red, green and blue)
For transmission of the visual signal, we use three signals
  • 1 luminance (brightness-basic signal) and 2 chrominance (color signals).
In NTSC, luminance and chrominance are interleaved
Goal at receiver
  • separate luminance from chrominance components
  • avoid interference between them prior to recovery of primary color signals for display.
Color Encoding
RGB signal - for separate signal coding
  consists of 3 separate signals for red, green and blue colors. Other colors are coded as a combination of primary colors. (R+G+B = 1) --> neutral white color.
YUV signal
  separate brightness (luminance) component Y and
  color information (2 chrominance signals U and V)
    • Y = 0.3R + 0.59G + 0.11B
    • U = (B-Y) * 0.493
    • V = (R-Y) * 0.877
  Resolution of the luminance component is more important than U, V
  Coding ratio of Y, U, V is 4:2:2
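The YUV equations above translate directly into code. The sketch below assumes R, G and B are normalized to the range 0.0 to 1.0; the function name is illustrative.

```python
def rgb_to_yuv(r, g, b):
    """Convert normalized RGB to YUV using the coefficients from the slide."""
    y = 0.3 * r + 0.59 * g + 0.11 * b
    u = (b - y) * 0.493
    v = (r - y) * 0.877
    return y, u, v


white = rgb_to_yuv(1.0, 1.0, 1.0)  # full luminance, no chrominance
red = rgb_to_yuv(1.0, 0.0, 0.0)
print(white, red)
```

For white (and any gray) both chrominance components vanish, which is why a black-and-white receiver can use the Y signal alone.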
Color Encoding (cont.)
YIQ signal
  similar to YUV - used by NTSC format
    • Y = 0.3R + 0.59G + 0.11B
    • I = 0.60R - 0.28G - 0.32B
    • Q = 0.21R - 0.52G + 0.31B
Composite signal
  All information is composed into one signal
  To decode, need modulation methods for eliminating interference between luminance and chrominance components.
Digitization
Refers to sampling the gray/color level in the picture at MXN array of points.
Once points are sampled, they are quantized into pixels
  • sampled value is mapped into an integer
  • quantization level is dependent on number of bits used to represent resulting integer, e.g. 8 bits per pixel or 24 bits per pixel.
Need to create motion when digitizing video
  digitize pictures in time
  obtain sequence of digital images per second to approximate analog motion video.
Computer Video Format
Video Digitizer
  A/D converter
Important parameters resulting from a digitizer
  • digital image resolution
  • quantization
  • frame rate
E.g. Parallax X Video - camera takes the NTSC signal and the video board digitizes it. Resulting video has
  • 640X480 pixels spatial resolution
  • 24 bits per pixel resolution
  • 20fps (lower image resolution - more fps)
Output of digital video goes to raster displays with large video RAM memories.
  • Color lookup table used for presentation of color
Digital Transmission Bandwidth
Bandwidth requirement for images
  raw image transmission bandwidth = size of image = spatial resolution x pixel resolution
  compressed image - depends on compression scheme
  symbolic image transmission bandwidth = size of instructions and primitives carrying graphics variables
Bandwidth requirement for video
  uncompressed video = image size X frame rate
  compressed video - depends on compression scheme
  e.g. HDTV quality video uncompressed - 345.6 Mbps, compressed using MPEG (34 Mbps with some loss of quality).
Broadband Multimedia Communications
Multimedia Compression Techniques
Introduction
Coding Requirements
Entropy Encoding
  Content Dependent Coding
    • Run-length Coding
    • Diatomic Coding
  Statistical Encoding
    • Huffman Coding
    • Arithmetic Coding
Source Encoding
  Predictive Coding
    • Differential Pulse Code Modulation
    • Delta Modulation
  Adaptive Encoding
Coding Requirements
Storage Requirements
  Uncompressed audio:
    • 8 kHz, 8-bit quantization implies 64 Kbits to store per second
  CD quality audio:
    • 44.1 kHz, 16-bit quantization implies storing 705.6 Kbits/sec (per channel)
  PAL video format:
    • 640X480 pixels, 24 bit quantization, 25 fps, implies storing 184,320,000 bits/sec = 23,040,000 bytes/sec
Bandwidth Requirements
  uncompressed audio: 64 Kbps
  CD quality audio: 705.6 Kbps
  PAL video format: 184,320,000 bits/sec
COMPRESSION IS REQUIRED!
Coding Format Examples
JPEG for still images
H.261/H.263 for video conferencing, music and speech (dialog mode applications)
MPEG-1, MPEG-2, MPEG-4 for audio/video playback, VOD (retrieval mode applications)
DVI for still and continuous video applications (two modes of compression)
  • Presentation Level Video (PLV) - high quality compression, but very slow. Suitable for applications distributed on CD-ROMs
  • Real-time Video (RTV) - lower quality compression, but fast. Used in video conferencing applications.
Coding Requirements
Dialog mode applications
  End-to-end Delay (EED) should not exceed 150-200 ms
  Face-to-face application needs EED of 50 ms (including compression and decompression).
Retrieval mode applications
  Fast-forward and rewind data retrieval with simultaneous display (e.g. fast search for information in a multimedia database).
  Random access to single images and audio frames, access time should be less than 0.5 sec
  Decompression of images, video, audio - should not be linked to other data units - allows random access and editing
Coding Requirements
Requirements for both dialog and retrieval mode applications
  Support for scalable video in different systems.
  Support for various audio and video rates.
  Synchronization of audio-video streams (lip synchronization)
  Economy of solutions
    • Compression in software implies cheaper, slower and low quality solution.
    • Compression in hardware implies expensive, faster and high quality solution.
  Compatibility
    • e.g. tutoring systems available on CD should run on different platforms.
Classification of Compression Techniques
Entropy Coding
  • lossless encoding
  • used regardless of media's specific characteristics
  • data taken as a simple digital sequence
  • decompression process regenerates data completely
  • e.g. run-length coding, Huffman coding, Arithmetic coding
Source Coding
  • lossy encoding
  • takes into account the semantics of the data
  • degree of compression depends on data content.
  • E.g. content prediction technique - DPCM, delta modulation
Hybrid Coding (used by most multimedia systems)
  • combine entropy with source encoding
  • E.g. JPEG, H.263, DVI (RTV & PLV), MPEG-1, MPEG-2, MPEG-4
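Run-length coding is the simplest of the entropy coding techniques named above. The following is a minimal sketch that stores each run as a (value, count) pair; real formats pack runs into bytes with escape markers, which is not shown here, and the function names are illustrative.

```python
def rle_encode(data):
    """Encode a bytes object as a list of (value, run length) pairs."""
    runs = []
    for byte in data:
        if runs and runs[-1][0] == byte:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([byte, 1])    # start a new run
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Rebuild the original bytes from (value, run length) pairs."""
    return bytes(value for value, count in runs for _ in range(count))


encoded = rle_encode(b"AAAABBBCCD")
print(encoded, rle_decode(encoded))
```

Decoding regenerates the input exactly, which is what makes the technique lossless.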
Steps in Compression
Picture preparation
  • analog-to-digital conversion
  • generation of appropriate digital representation
  • image division into 8X8 blocks
  • fix the number of bits per pixel
Picture processing (compression algorithm)
  • transformation from time to frequency domain, e.g. DCT
  • motion vector computation for digital video.
Quantization
  • Mapping real numbers to integers (reduction in precision). E.g. μ-law encoding - 12 bits for real values, 8 bits for integer values
Entropy coding
  • compress a sequential digital stream without loss.
Compression Steps
[Figure: pipeline Uncompressed Picture -> Picture Preparation -> Picture Processing -> Quantization -> Entropy Coding -> Compressed Picture, with an adaptive feedback loop.]
Types of compression
Symmetric Compression
  • Same time needed for decoding and encoding phases
  • Used for dialog mode applications
Asymmetric Compression
  • Compression process is performed once and enough time is available, hence compression can take longer.
  • Decompression is performed frequently and must be done fast.
  • Used for retrieval mode applications
Broadband Multimedia Communications
JPEG Compression
Introduction
Requirements on JPEG implementations
JPEG Image Preparation
  • Blocks, Minimum Coded Units (MCU)
JPEG Image Processing
  • Discrete Cosine Transformation (DCT)
JPEG Quantization
  • Quantization Tables
JPEG Entropy Encoding
  • Run-length Coding/Huffman Encoding
Additional Requirements - JPEG
JPEG implementation is independent of image size and applicable to any image and pixel aspect ratio.
Image content may be of any complexity (with any statistical characteristics).
JPEG should achieve very good compression ratio and good quality image.
From the processing complexity of a software solution point of view: JPEG should run on as many available platforms as possible.
Sequential decoding (line-by-line) and progressive decoding (refinement of the whole image) should be possible.
Variants of Image Compression
Four different modes
  Lossy Sequential DCT based mode
    • Baseline process that must be supported by every JPEG implementation.
  Expanded Lossy DCT based mode
    • enhancements to baseline process
  Lossless mode
    • low compression ratio
    • allows perfect reconstruction of original image
  Hierarchical mode
    • accommodates images of different resolutions
JPEG Processing Steps
[Figure: block diagram. An uncompressed image passes through Image Preparation (pixel, block, MCU), Image Processing (prediction or FDCT), Quantization and Entropy Encoding (run-length, Huffman, arithmetic) to give the compressed image. The diagram also relates the four modes (baseline sequential, expanded lossy, lossless, hierarchical) to their sample precisions (8, 12 and 2-16 bits/pixel) and coding techniques (lossy DCT source coding with run-length/Huffman entropy coding, predictive coding, and layered coding that switches between lossy DCT and lossless technique).]
Broadband Multimedia Communications
MPEG Compression
Introduction
General Information about MPEG
MPEG/Video Standard
MPEG/Audio Standard
MPEG Systems
  • Multiplexing of Video/Audio Data Streams
General Information
MPEG-1 compresses data to a rate of about 1.5 Mbps.
  This is the data rate of audio CDs and DATs (Digital Audio Tapes).
MPEG considers explicitly functionalities of other standards, e.g. it uses JPEG.
MPEG defines standard video, audio coding and system data streams with synchronization.
MPEG Core Technology
  • includes many different patents
  • MPEG committee sets technical standards
General Information (cont.)
MPEG stream provides more information than a data stream compressed according to the JPEG standard.
  Aspect Ratio - 14 aspect ratios can be encoded.
    • 1:1 corresponds to computer graphics, 4:3 corresponds to 702X575 pixels (TV format), 16:9 corresponds to 625/525 (HDTV format).
  Refresh Frequency - 8 frequencies are encoded -
    • 23.976, 24, 25, 29.97, 30, 50, 59.94, 60 Hz.
Other Issues with frame rate
  Each frame must be built within a maximum of 41.7 (33.3) ms to keep display rate of 24 fps (30 fps).
  No need or possibility of defining MCUs in MPEG.
    • Implies sequential non-interleaving order.
  For MPEG, there is no advantage to progressive display over sequential display.
MPEG Overview
MPEG exploits temporal (i.e. frame-to-frame) redundancy present in all video sequences.
Two Categories: Intra-frame and inter-frame encoding
DCT based compression for the reduction of spatial redundancy (similar to JPEG)
Block-based motion compensation for exploiting temporal redundancy
  causal (predictive coding) - current picture is modeled as transformation of picture at some previous time
  non-causal (interpolative coding) - uses past and future reference
MPEG Image Preparation - Motion Representation
Predictive and interpolative coding
  Good compression but requires storage and information
  Often makes sense for parts of an image and not the whole image.
Each image is divided into areas called macro-blocks (motion compensation units)
Each macro-block is partitioned into 16x16 pixels for luminance, 8x8 for each of the chrominance components.
Choice of macro-block size is a tradeoff between gain from motion compensation and cost of motion estimation.
Macro-blocks are useful for compression based on motion estimation.
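Motion estimation for one macro-block can be sketched as an exhaustive block-matching search. This is a sketch under assumptions rather than the method prescribed by MPEG, which leaves the search strategy to the encoder: the cost measure (sum of absolute differences), the function names and the tiny 2x2 block in the example are all chosen for illustration.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
        for j in range(size)
        for i in range(size)
    )


def best_motion_vector(current, reference, x, y, size, search):
    """Find the displacement (dx, dy) into the reference frame that best
    matches the block at (x, y) in the current frame.

    Returns (dx, dy, cost), searching +/- `search` pixels in each direction.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(current, x, y, reference, rx, ry, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2], best[0]


# Example: a 2x2 patch moves one pixel right and one pixel down.
reference = [[0] * 8 for _ in range(8)]
current = [[0] * 8 for _ in range(8)]
patch = [[10, 20], [30, 40]]
for j in range(2):
    for i in range(2):
        reference[2 + j][3 + i] = patch[j][i]
        current[3 + j][4 + i] = patch[j][i]

print(best_motion_vector(current, reference, 4, 3, 2, 2))
```

The cost of this search grows with both the block size and the search range, which is the tradeoff between motion compensation gain and motion estimation cost mentioned above.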
MPEG Video Processing
MPEG stream includes 4 types of image coding for video processing
  I-frames - Intra-coded frames - access points for random access, yields moderate compression
  P-frames - Predictive-coded frames - encoded with reference to a previous I or P frame.
  B-frames - Bi-directionally predictive coded frames - encoded using previous/next I and P frame, maximum compression
  D-frames - DC coded frames
Motivation for types of frames
  Demand for efficient coding scheme and fast random access
  Goal to achieve high compression rate -
    • temporal redundancies of subsequent pictures (i.e. interframes) must be exploited
MPEG Audio Encoding Steps
[Figure: block diagram with the stages Filter Bank (transformation from time to frequency domain, 32 subbands), Psychoacoustic Model, Quantization with bit/noise allocation, Entropy Coder (Huffman coding) and Multiplexer, producing the compressed data.]
If noise level is too low --> finer quantization is applied
If noise level is too high --> rough quantization is applied
MPEG/System Data Stream
Video Stream is interleaved with audio.
Video Stream consists of 6 layers
  Sequence layer
    • Video Param - width, height, aspect ratio, picture rate
    • Bitstream Param - bitrate, bufsize
    • QT - intracoded blocks, intercoded blocks
  Group of pictures layer
    • Time code - hours, minutes, seconds
  Picture layer
    • Type - I, P, B
    • Buffer Param - decoder's bufsize
    • Encode Param - indicates info about motion vectors
  Slice layer
    • Vertical Position - what line does this slice start on?
    • Qscale - how is the quantization table scaled in this slice?
  Macro-block layer
  Block layer