multimedia a. wieczorkowska video acquisition and processing
Multimedia
A. Wieczorkowska
Video acquisition
and processing
A.Wieczorkowska /842
AV acquisition: equipment
• Camera: analog, digital
– All cameras manufactured today are equipped with CCD (charge-coupled device) technology. Light coming from the lens strikes the light-sensitive CCD chip, whose elements are arranged in a grid of pixels. The optical image is converted (encoded) into an electrical signal, which can then be broadcast, recorded to videotape, or recorded digitally on disk
AV acquisition: equipment
• Lens – up to 40% of the total cost of a camera
• Zoom
• Filters - pieces of glass which have been tinted, treated, or designed to hold some material between the subject and the camera
– Effects: polarization, diffusion, fog, star pattern, and graduated colors
– Always use a skylight or UV filter. It cuts down on the ultraviolet light entering the camera, which would otherwise create a bluish tinge in shadows, and it protects the expensive lens by keeping dust, dirt, and fingerprints off the main glass
– When cleaning a lens, never blow on it; that just pushes more dirt down past the seals. Always use proper lens cleaning fluid and paper. Never use alcohol, which can remove the special coatings on many lenses
AV acquisition: equipment
• White balance
– The way the operator tells the camera what type of light it is shooting in and, therefore, how it should render colors
– Focusing on a piece of white paper, the camera “reads” the color temperature (in kelvins) and adjusts accordingly
– Tungsten light is yellow while fluorescent light is blue. Daylight is blue at high noon on a cold November day, but orange at sunset in August. We call such lights cool or warm; these comparative terms refer to variations in the light’s color temperature
– It is possible to cheat the camera into a different color scale for effect. For instance, by white balancing to a cool light source and then using tungsten lighting during the shoot, the video will be warm, rich in yellows and oranges
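The white-paper procedure can be illustrated with a minimal sketch. It assumes pixels are plain (R, G, B) tuples in the 0..255 range and that balancing means scaling each channel so the reference white comes out neutral; the function names are hypothetical, and a real camera does this in its signal-processing hardware.

```python
def white_balance_gains(white_rgb):
    """Per-channel gains that make the measured white reference neutral."""
    r, g, b = white_rgb
    target = (r + g + b) / 3.0          # keep overall brightness roughly the same
    return (target / r, target / g, target / b)

def apply_gains(pixel, gains):
    """Scale one RGB pixel by the gains, clipping to the 0..255 range."""
    return tuple(min(255.0, max(0.0, c * k)) for c, k in zip(pixel, gains))

# White paper shot under warm (tungsten-like) light reads yellowish:
paper_warm = (240.0, 220.0, 170.0)
balanced = apply_gains(paper_warm, white_balance_gains(paper_warm))

# "Cheating": balance to a cool source, then shoot under the warm light.
paper_cool = (170.0, 220.0, 240.0)
cheated = apply_gains(paper_warm, white_balance_gains(paper_cool))
```

In this sketch `balanced` has three equal channels, while `cheated` ends up with much more red than blue, which is the warm look described above.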
Recording formats
– Bring the cleanest, highest-resolution picture possible into the computer. The computer cannot add detail; it can only lose it
• VHS and 8mm are analog formats intended for consumer use
• S-VHS and Hi-8 are prosumer formats
• ¾-inch, 1-inch, and Betacam are professional and broadcast analog formats
JONES H. F., Desktop Digital Video Production, Prentice Hall, Upper Saddle River, NJ, USA, 1999
Recording formats
• Digital formats for prosumer and consumer cameras are DVC, DVCAM from Sony, and DVCPRO from Panasonic
– The technical specifications of the Sony and Panasonic formats are interchangeable, but the physical tape formats are not. Both conform to the IEEE-1394 (FireWire) transfer protocol. These are compressed formats that rate between Hi-8 and Betacam in quality but allow a minimum of generational loss when transferred to and from digital edit suites
• Digital formats such as D1, D2, D3 and D5 are used in high-end post-production facilities, often for computer graphics and compositing
Tape
• Tape consists of a Mylar or other plastic film backing covered with a thin layer of either oxide or metal particles mixed with a binding agent
– During the recording process, these coatings store the video and audio information in magnetized patterns, which can then be read for playback
• The type of tape used and the way it is recorded can significantly affect performance
Tape
• All analog tape is susceptible to drop-out, which occurs when the oxide or metal particles flake off the tape because of stretching or repeated wear
– This is seen as sporadic white horizontal streaks wherever picture information is lost
• Oxide tape is more likely to have drop-out than metal tape
– Metal tape also provides higher quality video images, since it enhances recording sensitivity and increases the amount of recorded information
• Digital tape is inherently better than analog tape because, technically, it cannot have drop-out
– It can be copied without any loss over multiple generations of copies
Signal formats
• Composite - all the video information is encoded in a single channel
– Strong colors, such as red, tend to bleed
– In the USA and Japan, the composite format adopted by the TV and video industries is known as the NTSC signal
• S-video, also referred to as Y/C video, separates the luminance information from the chroma information
– They are recorded separately and then put back together during playback
• Component video separates the video signal into 3 elements: luminance, red minus luminance, and blue minus luminance, recorded in separate channels
– Component video produces the best picture quality
Signal formats
Format            A/D   Sig    Tape   Hres   S/N
¾”                A     C      OX     280    45
VHS               A     C      OX     240    46
8mm               A     C      OX     260    46
1”                A     C      OX     300    46
¾” SP             A     C      MP     340    47
S-VHS             A     Y/C    OX     400    47
Hi-8              A     Y/C    OX     400    47
DVC               D     1394          500    54
DVCAM             D     1394          500    54
DVCPRO            D     1394          500    54
Betacam           A     R-     OX     300    48
Betacam SP        A     R-     MP     340    51
Digital Betacam   D     R-     MP     450    54
D-2               D     R-     MP     450    54
D-3               D     R-     MP     450    54
D-1               D     R-     MP     460    56
A/D = Analog or Digital; Sig = signal type: C = Composite, Y/C = S-video, R- = Component, 1394 = IEEE-1394; Tape = tape type: OX = Oxide, MP = Metal Particle; Hres = horizontal resolution; S/N = luminance signal-to-noise ratio, average
Signal standards: frame
• A video frame is made up of 2 fields of scanned lines. Field 1 carries the video data for all the odd-numbered scan lines, Field 2 for the even-numbered scan lines. When interlaced, the two fields make up the entire image
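A small sketch of how a frame splits into two fields and is woven back together. It assumes a frame is simply a list of scan lines numbered from 1; the function names are illustrative only.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into two fields.
    Scan lines are numbered from 1, so field 1 (odd lines) starts at index 0."""
    return frame[0::2], frame[1::2]

def weave(field1, field2):
    """Interlace two fields back into a full frame."""
    frame = []
    for i in range(len(field1) + len(field2)):
        source = field1 if i % 2 == 0 else field2
        frame.append(source[i // 2])
    return frame

frame = [f"line {n}" for n in range(1, 8)]   # 7 scan lines, numbered 1..7
field1, field2 = split_fields(frame)
restored = weave(field1, field2)
```

Here `field1` holds lines 1, 3, 5, 7 and `field2` holds lines 2, 4, 6; weaving them restores the original frame.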
Signal standards: NTSC
• NTSC (National Television System Committee) - standard for the television broadcasting of color signals
– 525 scanned lines with timing originally synchronized to the 60 Hz standard for AC power in North America, producing 30 frames per second (later redefined to 29.97 fps)
– prevalent in North America, Japan, and many South American countries that have 60 Hz AC power
Signal standards: PAL and SECAM
• In Europe and much of the rest of the world the standard signal is PAL – 625 scanned lines and 25 frames per second
– PAL is becoming the standard in Europe as a whole as a result of the European Economic Community (EEC)
• SECAM is a similar standard used primarily in France and a few other French-influenced countries
• Most digital video edit systems can edit and output NTSC, PAL and SECAM if properly set
• Analog monitors, recorders and cameras cannot be switched between standards
Control track
• Recorded on each videotape for timing and synchronization purposes
– system for labeling video sequences and frames
• VHS, S-VHS, 8mm, and Hi-8 cameras record timing pulses on this track to control the speed of the videotape upon playback
– This timing is not frame accurate. Control-track timing pulses only approximate frame locations by indexing a resettable counter
Time code
• Timecode specifically identifies video frames and audio signals by time
– four 2-digit numbers which represent hours, minutes, seconds and frames on a 24-hour clock
– each frame of visual and aural material has a unique identifier, or address
• The most common standard in use was developed by SMPTE (Society of Motion Picture and Television Engineers) - SMPTE timecode
SMPTE time code
• SMPTE 25 EBU
– runs at 25 frames per second and is also known as SMPTE EBU (European Broadcasting Union); European television systems run at exactly 25 fps
• SMPTE 24 Film Sync
– runs at 24 frames per second and is also known as SMPTE Film Sync. This rate matches the nominal film rate of 24 fps (the slowest speed possible for apparent continuous motion)
SMPTE drop frame time code
• We consider NTSC timecode to run at 30 frames per second, but in reality it runs at 29.97 frames per second
• Drop-frame timecode tracks the 29.97 fps rate: one hour of indicated drop-frame timecode plays for almost exactly one hour
– It skips frame numbers 00 and 01 at the beginning of each minute, except every tenth minute (minutes 00, 10, 20, 30, 40 and 50)
– No actual video frames are dropped; only frame numbers are omitted from the timecode sequence
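The numbering rule can be sketched as a conversion from a count of real video frames to a drop-frame label. This is an illustrative sketch, not a reference implementation: the function name is hypothetical, and the semicolon before the frame field is only a common convention for marking drop-frame timecode.

```python
def frames_to_dropframe(frame_count):
    """Convert a count of real video frames (29.97 fps) to drop-frame timecode."""
    frames_per_min = 30 * 60 - 2            # 1798 numbered frames in a "dropping" minute
    frames_per_10min = 30 * 600 - 2 * 9     # 17982 frames in every 10-minute block
    blocks, rem = divmod(frame_count, frames_per_10min)
    # First minute of each block keeps all 1800 numbers; the other nine skip two each.
    if rem < 1800:
        skipped = 0
    else:
        skipped = 2 * (1 + (rem - 1800) // frames_per_min)
    n = frame_count + 18 * blocks + skipped   # frame number as if counting at 30 fps
    ff = n % 30
    ss = (n // 30) % 60
    mm = (n // 1800) % 60
    hh = n // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"

last_of_first_minute = frames_to_dropframe(1799)    # "00:00:59;29"
first_of_second_minute = frames_to_dropframe(1800)  # "00:01:00;02" (00 and 01 skipped)
tenth_minute = frames_to_dropframe(17982)           # "00:10:00;00" (nothing skipped)
one_hour = frames_to_dropframe(107892)              # "01:00:00;00"
```

The 107,892 frames labeled as one hour play for 107892 × 1001 / 30000 ≈ 3599.996 seconds, so the indicated hour and the real hour differ by only a few milliseconds.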
SMPTE non-drop frame time code
• Non-drop-frame timecode is typically used only for short programs of a few minutes' duration, where the time difference is negligible, or where exact length is not important
– 1 hour of indicated non-drop-frame timecode actually requires 1 hour and 3.6 seconds of play time (108,000 frames at 29.97 fps)
• Drop-frame timecode is used when the exact length of a program is important, particularly for programs exceeding 1 minute
Lighting: direct/indirect
• There are two basic types of light in the natural world, sunlight and overcast. All artificial lighting sources and styles mimic these two
– Sunlight is direct light: it comes from a single, identifiable source and creates clearly defined shadows. It tends to be harsher than indirect light, but shows texture and shape more effectively
– Overcast light is indirect light. It is diffused or reflected, coming from a larger source and casting softer shadows. It is flatter, more 2D, but can smooth out a scene and make it more pleasing to the eye
• Rarely will lighting set-ups be of only one type. Usually, a combination of set-ups is necessary, an interplay of types, to give the best effect
Quality of light
• The color of light is most often measured in kelvins (color temperature)
• Even the same light source can have different color output depending upon various circumstances
Kelvin ratings
Candle Flame                1,900 Kelvin
100 Watt Light Bulb         2,850 Kelvin
Quartz Studio Light         3,200 Kelvin
Warm White Fluorescent      4,500 Kelvin
Average Daylight            5,500 Kelvin
Overcast Sky                7,500 Kelvin
Clear Northern Skylight    25,000 Kelvin
Lighting instruments
• Light sources can be distinguished by the type of light they create (direct/indirect) and the quality or color of the light
– Direct lighting instruments: open-faced lights with exposed bulbs; fresnels, which have a focusing lens; and ellipsoidals, compound-lens spotlights used for spotlight effects
– Indirect lighting sources include scoops, softboxes, lensless reflectors with diffuser filters, and direct light that is bounced off or diffused through something else
– Warm lighting: low-angle sunlight, tungsten bulb fixtures
– Cool, whiter lights: direct, overhead sunlight, as well as fluorescent lights and special metal-halide lights known as HMIs
Lighting instruments
• Ellipsoidal Reflector Spotlights
• Fresnel Spotlights
• Parabolic Reflectors
• Strip Lights
• Cyclorama Lighting
Lighting instruments: wattage
• Portable lighting equipment ranges from 250 to 1000 watts, studio lights from 1000 to 10,000 watts
– A good all-purpose starter kit might contain three 1000 watt lights, with stands and assorted grip equipment
– Important: know the load your electrical circuits can handle
Lighting techniques: 3-point lighting
Lighting techniques: bounce lighting
• Bounce lighting refers here to the use of diffused sources as the primary means of lighting a subject. It creates soft shadows, with even fall-off (reduction of light or shadow receding from the center of the light source)
• The scene appears smoother, without drastic contrasts between light and dark areas
• It produces a lower light level than direct light and requires sufficiently sensitive cameras
Chroma key
• Chroma-key shooting and compositing is the technique of replacing a specific color with another keyed source
– Example: TV weather (a person stands in front of a big blue board while, in the TV control room, all of the video that is blue is replaced by images from another source: maps and moving images)
• It is possible to key out any color, so the blue screen may actually be some other color
– Blue and green are most commonly used because they are the hues most distant from skin tones
• It is most important to carefully prepare the set-up and lighting for a chroma-key shoot
– The smoother and more consistent the background color appears to the camera, the easier it will be to key out and replace. Allow plenty of time for testing
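A minimal sketch of the keying idea. It assumes images are equally sized lists of rows of (R, G, B) tuples and uses a hard distance threshold around the key color; the function name, the key color and the tolerance are illustrative assumptions, and production keyers use much softer decisions.

```python
def chroma_key(foreground, background, key=(0, 0, 255), tolerance=100):
    """Replace foreground pixels close to the key color with background pixels.
    Images are equally sized lists of rows of (R, G, B) tuples."""
    out = []
    for fg_row, bg_row in zip(foreground, background):
        row = []
        for fg, bg in zip(fg_row, bg_row):
            # Squared Euclidean distance in RGB space to the key color
            dist2 = sum((c - k) ** 2 for c, k in zip(fg, key))
            row.append(bg if dist2 <= tolerance ** 2 else fg)
        out.append(row)
    return out

BLUE, SKIN, MAP = (10, 20, 240), (224, 172, 105), (0, 128, 0)
studio = [[BLUE, SKIN, BLUE]]          # presenter between two patches of blue board
weather_map = [[MAP, MAP, MAP]]
composite = chroma_key(studio, weather_map)
```

In `composite` the two blue pixels are replaced by the map while the skin-tone pixel is kept. The sketch also shows why an even background matters: a badly lit patch of the board falls outside the tolerance and is not replaced.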
Chroma key - techniques
• Paper
• Paint
• Reflection
– Problem: fair skin, blond hair, light colored clothing
– Have a strong backlight or kicker coming from behind and to the side of the subject
Chroma key – Ultimatte Technology
• Total system of high-end chroma-keying (special paint, equipment run by trained operators)
• It also looks for a specific hue and saturation. Unlike with other systems, most blue and green clothing is acceptable in an Ultimatte shoot
• There are various configurations, depending on cost and the amount of fine control needed. Most of these systems are so expensive that they are rented out, with operators, at a daily rate
Chroma key – Ultimatte Technology
• The product of an Ultimatte shoot is 2 master tapes
– The 1st is the tape of the video just as it looks on the set: subject matter against a blue or green background
– The 2nd tape is exactly the same video, except it is black and white: black for the background, white for the subject. This becomes a matte with which to totally black out everything not to be seen in the original
• Frame for frame, the 2 tapes are combined. White areas define the subject matter, black the areas to be replaced by new background material. The key is the combined master video, the matte, and the video that replaces the matte in the new images
• Ultimatte technologies are arriving that are software based rather than hardware based
Chroma key – make-up and hair
• If the star actress is wearing blue eye shadow and, even worse, has baby blue eyes, either stay away from close-ups or redo the makeup and consider tinted contact lenses
• Minimize wispy strands, which may just disappear anyway, and keep hair close to the head, using hair spray as necessary. Beware of hairdos which are so inflated as to allow light to spill through, making a clean key nearly impossible
Sound equipment - microphones
• Types of microphone transducers
– Ribbon
– Dynamic
– Condenser
– Carbon - obsolete / not utilized in the manufacture of general purpose audio production microphones
– Crystal - obsolete / not utilized in the manufacture of general purpose audio production microphones
Sound equipment - microphones
• Ribbon
– smooth response, used for vocals and voice-over recording. Very sensitive to air movement and sibilance and can be easily damaged in use. Restricted to studio applications
• Dynamic
– moving-coil microphone, most widely used in AV field production. Produces a very high quality signal, particularly for voice and field recording, and is very rugged. A professional-quality dynamic microphone has very good frequency response and good sensitivity. Comes in a wide variety of directional formats. The least expensive of the professional microphone types. Requires no power supply
Sound equipment - microphones
• Condenser
– comes in many forms, from very expensive high quality studio mics to extremely inexpensive mics for amateur use. In general, professional condenser mics are designed as omnidirectional mics and are less rugged in field usage than dynamic mics. Very important for musical recording in the studio and for general purpose uses. Professional quality condenser mics tend to be much more expensive than dynamic ones. They require a power source: internal or external batteries, or power supplies
Directional quality of microphones
Omnidirectional, bi-directional, unidirectional
Directional quality of microphones
Cardioid, supercardioid, hypercardioid & unidirectional
Directional quality of microphones
• Parabolic
– receptor mounted at the focal point of a parabolic reflector. Can pick up sound from hundreds of yards away with high sensitivity. Used in sports, in nature video and for applications requiring acquisition of distant sounds
Proximity effects
• Supercardioid, parabolic and other mics used at a distance suffer from low-frequency loss due to the loss of sound energy through the air
• This generally doesn’t affect voice recording, but musical performances require post-production equalization to restore a more realistic sound balance
• Mics placed too close to the source of sound can have exaggerated bass response. This is called proximity effect and is more difficult to adjust in postproduction. The best remedy is proper mic placement
• It is always best to audition the sound using headphones before shooting
Interference problems
• Electrical interference can be a problem if cables are not in good shape or ground wires are not intact. During shooting, be aware of electrical hums and pops and RF (radio frequency) interference, and remedy them on the spot. They are virtually impossible to remove in post-production
Mic application and placement
• Room acoustics – the most common problem is sound reflections from hard surfaces. The use of absorptive materials such as packing blankets can reduce this problem. It can also be reduced by moving mics closer to the subjects. The problem is greater with fixed placement of microphones
Mic application and placement
• Personal mics
– attached to a subject: clip-on microphone (clipped to clothing), or lavaliere or lav microphone (suspended on a cord around the neck). Can be either wired or wireless. Designed to be placed within about a foot of the mouth
– Pay attention to jewelry or clothing that might brush against the mic or produce sound that might be picked up by the mic
• Hand-held mics
– cardioid; hold the mic at about a 30° angle, at a distance of 8-16 inches (or use a windscreen outdoors)
Mic application and placement
• PZ mics
– Pressure Zone mics, omnidirectional; rely on reflected sound. Placed on table tops or stage floors, they serve as pickups for group interviews or musical recording
• Headset mics – sports, off-camera narration
• Contact mics - on musical instruments
• Wireless mics
– May have self-contained transmitters or may require a standard microphone plugged into an independent transmitter, which can be used with any type of microphone desired. The receivers may be mounted on a camera or attached to a mixer or console. Their range is generally from 50 to 500 feet depending on conditions
– Interference from commercial radio and television transmitters can be a problem
Mic application and placement
• Off-camera mics
– eliminate the problem of seeing the microphone in dramatic works
• Hanging and slung mics
– hung on cords suspended over the set or recording area. Be sure that mics suspended from lighting equipment or grids don’t pick up electrical interference or filament hum
• Fishpoles
– mics attached to the end of metal or fiberglass rods and held by an operator, recording a single subject. This method requires the operator to use headphones to ensure that the pickup and direction of the microphone are correct. Keeping the mic close to the subject is difficult
Mic application and placement
• Booms
– mounted on either fixed or moveable stands
• Hidden mics
– may be incorporated into set elements. Be sure to turn down unused mics to eliminate phase cancellation of signals
• Line mics
– usually contained in a rubber or fur windscreen. They are usually very sensitive condenser mics, commonly used on location for recording at distances up to 20 feet. Normally mounted on fishpoles; they can also be mounted on cameras
Stereo
• Stereo Mics – for interviews
• X-Y Technique
– 2 mics arranged across each other at a 45° angle, forming an X. Does not translate as well into mono
• Using Pan Pots on a mixing console
– Used to position the apparent location of voices and sounds. A pan pot simultaneously mixes portions of a mono signal into the right and left tracks. The control, marked left and right on the panel, can range from all of the signal to the left, to all of the signal to the right, or any degree in between. This gives the illusion of changing the apparent location of the sound source in the stereo mix
– Generally it is best to hold dialog to the center of the stereo mix
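What a pan pot does to one mono sample can be sketched as follows. The constant-power (sine/cosine) pan law used here is an assumption chosen for illustration; mixing consoles differ in the exact law, and the function name and the -1..+1 position scale are hypothetical.

```python
import math

def pan(sample, position):
    """Split a mono sample into (left, right).
    position: -1.0 = hard left, 0.0 = center, +1.0 = hard right."""
    angle = (position + 1.0) * math.pi / 4.0   # maps -1..+1 to 0..pi/2
    return sample * math.cos(angle), sample * math.sin(angle)

hard_left = pan(1.0, -1.0)    # all of the signal goes to the left track
center = pan(1.0, 0.0)        # equal portions in both tracks (dialog position)
hard_right = pan(1.0, 1.0)    # all of the signal goes to the right track
```

With this law the sum of the squared left and right levels stays constant, so the perceived loudness does not change as the source is moved across the stereo image.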
MIDI
• Musical Instrument Digital Interface - command system that allows computers and electronic synthesizers to communicate and to record virtual performances of sound or musical compositions in a form that can be replicated without creating an actual sound recording
• Actual sound generation may be from a recorded collection of live or acoustic music, musical tones or samples stored in a computer or other device, or may be synthesized using a series of oscillators and sound processors entirely electronically
MIDI vs. wave form
• MIDI is an important music source when:
– Timing is critical - MIDI files can easily be time-adjusted to fit an exact spot in a presentation. Wave files are much less editable by the casual user in this respect
– Disc space is critical - MIDI files are much more compact
– Computer processing resources are critical
• Wave form files are important when:
– Voice recording, acoustical music or sound effects are essential and have to be sampled
Audio
• Basic audio signal recording parameters:
– sampling frequency
– number of bits used to record each sample
– data format (PCM, 1-bit, etc.)
Audio
• Selection of sound recording parameters depends on:
– Signal quality
– Efficiency in memory use
– Possibility of real-time recording (important in case of older systems with low computational power)
– Compatibility of formats
• In practice, selection of a digital sound format is a trade-off between signal quality and memory economy
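The quality/memory trade-off can be made concrete with a short calculation of uncompressed PCM data size from the basic recording parameters. The function name is hypothetical; the two parameter sets are the CD audio and telephone settings that appear in this lecture.

```python
def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size in bytes of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8

cd_minute = pcm_bytes(44100, 16, 2, 60)      # CD audio, stereo: 10,584,000 bytes
phone_minute = pcm_bytes(8000, 8, 1, 60)     # telephone-quality speech, mono: 480,000 bytes
ratio = cd_minute / phone_minute             # about 22 times more memory for CD quality
```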
Audio
• History: with the development of digital signal recording and the application of computer sound editors, many sound storage formats have been created (by various software and equipment producers)
• The variety of sound formats arose from the variety of operating systems
Audio
• Sound files can be divided into
– files for storage of broadband musical signals of high quality
– files for storage of speech signals, with limited bandwidth and dynamics
Dynamic range - difference between the maximum signal level, which cannot be exceeded without introducing unacceptable distortion, and the minimum level, at which the system noise becomes predominant
Sound file format conversion
• Sound file format conversion requires taking 2 issues into consideration:
– recording parameters conversion: sampling frequency, number of channels and number of bits used in recording
– file type conversion (data recording format)
• Converter programs can be applied to transfer sound files between systems
– sometimes they allow simple signal operations, for instance signal reversal, decimation, amplitude change, addition of echo, etc.
Sampling frequencies
• Sampling frequencies used in digital audio systems:
– 5500 Hz (Macintosh) (≈44100/8)
– 7333 Hz (≈44100/6)
– 8000 Hz – telephone standard for μ-law and A-law coding
– 8012.8210513 Hz – NeXT standard, used with Telco codec
– 11025 Hz (=22050/2)
– 16000 Hz – telephone standard G.722
Sampling frequencies
– 16726.8 Hz – NTSC TV = 7159090.5/(214×2)
– 18900 Hz – CD-ROM standard
– 22050 Hz – Macintosh standard, CD/2
– 22254.[54] Hz – Macintosh monitor connection standard 128k
– 32000 Hz – DAB (Digital Audio Broadcasting), NICAM (Nearly-Instantaneous Companded Audio Multiplex), for instance BBC; other systems: TV, HDTV, R-DAT
– 32768 Hz (=32×1024)
– 37800 Hz – high quality CD-ROM
Sampling frequencies
– 44056 Hz – sampling frequency used in professional equipment (NTSC-compatible)
– 44100 Hz – CD audio; the most popular frequency in professional and non-professional applications
– 48000 Hz – R-DAT
– 49152 Hz (=48×1024)
– above 50000 Hz – sometimes used in professional DSP systems
– 96000 Hz – high resolution R-DAT
Sampling frequency conversion
• 2-stage procedure:
– oversampling – additional samples are generated
– excessive samples are removed
• Oversampling frequency should be the lcm (least common multiple) of the source and final sampling frequencies
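A short sketch of planning the two stages, taking the oversampling frequency as the least common multiple of the two rates. The function name and the returned tuple are illustrative; the example converts CD audio (44100 Hz) to the R-DAT rate (48000 Hz).

```python
import math

def resampling_plan(f_source, f_target):
    """Return (oversampling frequency, up factor, down factor)."""
    f_over = f_source * f_target // math.gcd(f_source, f_target)   # least common multiple
    return f_over, f_over // f_source, f_over // f_target

f_over, up, down = resampling_plan(44100, 48000)
```

For this pair the oversampling frequency is 7,056,000 Hz: 160 samples are produced for every source sample, and then every 147th sample is kept.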
Sampling frequency conversion
Relationships between most popular sampling frequencies
Additional samples generation
• Additional samples are generated by means of various interpolation algorithms
• Depending on the required signal quality and system capabilities, either linear interpolation (simple home-use systems) or high-order polynomial interpolation (professional applications) is applied
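The simplest case, linear interpolation, can be sketched as below. The function name is hypothetical, and the sketch covers only the generation of additional samples, not the subsequent filtering and removal of excessive ones.

```python
def upsample_linear(samples, factor):
    """Insert factor-1 linearly interpolated samples between each pair of input samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])      # keep the final input sample
    return out

dense = upsample_linear([0.0, 4.0, 2.0], 4)
# dense == [0.0, 1.0, 2.0, 3.0, 4.0, 3.5, 3.0, 2.5, 2.0]
```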
Excessive samples removal
• Excessive samples are removed from the digital signal representation - decimation:
$$y(n) = x_a(nTM) = x_a(nT')$$
• To avoid aliasing (spectral overlap), the oversampled signal cannot contain frequencies above the Nyquist frequency (half of the final sampling frequency)
– The oversampled signal has to be filtered using a low-pass filter
$$Y\left(e^{j\omega T'}\right) = \frac{1}{M}\sum_{k=0}^{M-1} X\left(e^{j(\omega T' - 2\pi k)/M}\right)$$
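A sketch of decimation by a factor M with a low-pass filter in front of it. The moving-average filter is only a crude stand-in for a properly designed low-pass filter, and the function name is hypothetical.

```python
def decimate(samples, m):
    """Low-pass filter with an m-point moving average, then keep every m-th sample."""
    filtered = []
    for n in range(len(samples)):
        window = samples[max(0, n - m + 1): n + 1]
        filtered.append(sum(window) / len(window))
    return filtered[::m]

# A tone at the Nyquist frequency of the input rate: +1, -1, +1, -1, ...
tone = [1.0, -1.0] * 4
aliased = tone[::2]            # no filtering: the tone aliases to a constant
filtered = decimate(tone, 2)   # filtering first suppresses it
```

Without the filter the tone turns into the constant sequence 1, 1, 1, 1, which is exactly the spectral overlap described above; with the filter it is suppressed after the first sample, where the averaging window is not yet full.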
Sampling frequency conversion
• Remark: it is possible that output signal, obtained via oversampling, does not contain any samples from input signal (only samples generated during oversampling, i.e. interpolated ones)
Coding methods
• PCM
• ADPCM
• Compandor codecs:
– μ-law (American-Japanese standard)
– A-law (European standard)
• Source codecs
– vocoder (voice coder)
• Hybrid codecs
codec = encoder + decoder
PCM
• Linear Pulse Code Modulation (PCM) – the most popular method of sound coding
– The record current is a series of binary pulses or a carrier modulated by binary bits. Each analog sample is quantized and converted into an m-bit code word. Quantization steps are equal for all signal levels
– Signal-to-noise ratio (S/N, SNR) = amplitude difference [dB] between the maximum signal level and the noise in the absence of signal: S/N = 6.02m + 1.76 dB
– Advantage: high quality can be obtained (CD quality)
– Disadvantage: large files
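The signal-to-noise formula for linear PCM, S/N = 6.02m + 1.76 dB, can be evaluated for a few word lengths. The function name is illustrative.

```python
def pcm_snr_db(bits):
    """Theoretical S/N of linear PCM with m-bit code words, in dB."""
    return 6.02 * bits + 1.76

snr_8 = pcm_snr_db(8)     # about 49.9 dB
snr_16 = pcm_snr_db(16)   # about 98.1 dB (CD quality)
```

Each additional bit per sample adds about 6 dB of signal-to-noise ratio, at the cost of a proportionally larger file.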
Companded PCM and DPCM
• To reduce the number of bits, a nonlinear quantizer is used
– at low input levels the quantizing steps are small and S/N is high; at large input levels the steps are large
– companding techniques result in nonlinear distortion
• DPCM (Delta PCM, Differential PCM) – in an over-sampled signal, differences between successive samples are small; these differences are coded in only a few bits and then recorded
– quasi-periodicity of musical signals is utilized: prediction of the next sample value on the basis of previous sample values
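The effect of a nonlinear quantizer can be sketched with the continuous μ-law curve (μ = 255) followed by a uniform quantizer. This is an illustration of the companding principle only: telephone codecs use a segmented approximation of this curve, and the function names and the 8-bit mid-rise quantizer are assumptions made for the example.

```python
import math

MU = 255  # value used with the American-Japanese mu-law standard

def mu_compress(x):
    """Map x in [-1, 1] to [-1, 1], stretching the region near zero."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def quantize(v, bits):
    """Uniform quantizer on [-1, 1] with 2**bits levels (mid-rise)."""
    levels = 2 ** (bits - 1)
    q = math.floor(v * levels) + 0.5
    q = max(-levels + 0.5, min(levels - 0.5, q))
    return q / levels

x = 0.01                                              # a quiet sample
uniform_err = abs(quantize(x, 8) - x)
companded_err = abs(mu_expand(quantize(mu_compress(x), 8)) - x)
```

For the quiet sample the companded error is roughly ten times smaller than the uniform one, which is the "small steps at low input levels" behaviour described above; loud samples are quantized more coarsely in exchange.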
ADPCM
• ADPCM (Adaptive Differential Pulse Code Modulation) – adaptive prediction applied, i.e. prediction method is adjusted to individual characteristics of encoded signal
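The differential core that ADPCM builds on can be sketched with a fixed predictor: each sample is predicted to equal the previously reconstructed one, and only the clipped difference is stored. This sketch is deliberately non-adaptive; a real ADPCM coder additionally adjusts its predictor and step size to the signal. Function names and the 4-bit difference width are assumptions.

```python
def dpcm_encode(samples, bits=4):
    """Code each sample as a clipped difference from the predicted (previous decoded) value."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    codes, prediction = [], 0
    for s in samples:
        diff = max(lo, min(hi, s - prediction))   # difference fits in `bits` bits
        codes.append(diff)
        prediction += diff                        # track what the decoder will reconstruct
    return codes

def dpcm_decode(codes):
    """Rebuild samples by accumulating the differences."""
    out, value = [], 0
    for d in codes:
        value += d
        out.append(value)
    return out

smooth = [0, 3, 7, 10, 12, 11, 8]
codes = dpcm_encode(smooth)          # every difference fits in 4 bits
restored = dpcm_decode(codes)
steep = dpcm_decode(dpcm_encode([0, 20]))   # a jump too large for 4 bits is clipped
```

The slowly varying signal is restored exactly, while the sudden jump is reconstructed as 7 instead of 20. A fixed scheme copes badly with such signals, which is the motivation for adapting the prediction to the encoded signal.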
Main sound file formats
• .snd, .au (NeXT, Sun)
• .wav (Microsoft, IBM)
• .mp3
• .mid (MIDI)
• Perceptual compression standards:
– MPEG, AC-3 (HDTV), PASC (DCC tape recorder)
Image compression methods
• Lossless (exact data reproduction)
– Probabilistic, for instance statistical – a binary tree of codes is built; more frequent symbols are placed closer to the root
  • Huffman method – tree built from leaves to root
  • Shannon-Fano method - from root to leaves
– Dictionary (Ziv, Lempel – Haifa, late 1970s) - repeating symbol sequences are replaced by references to their first appearance
• Lossy - high compression level at the cost of loss of details
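The Huffman construction, building the code tree from the leaves towards the root, can be sketched as follows. This is a minimal illustration with a hypothetical function name; it works on the characters of a string rather than on image data and returns the code table instead of an explicit tree.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table {symbol: bitstring} from leaves to root."""
    freq = Counter(text)
    # Heap items: (weight, tie_breaker, {symbol: code_so_far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate case: a single symbol
        return {sym: "0" for sym in freq}
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
encoded_bits = sum(len(codes[ch]) for ch in "aaaabbc")
```

The most frequent symbol `a` receives a 1-bit code and the rarer `b` and `c` receive 2-bit codes, so the 7 symbols take 10 bits instead of the 14 needed with a fixed 2-bit code.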
Image compression
• For image compression applications, where the input files are usually measured in megabytes and where the loss of very minor graphic details is not critical, lossy methods are mostly used
• For certain applications, such as the compression of text files or executable codes, lossless compression is a necessity
Image compression methods
• Symmetrical
– the compression and decompression processes take roughly the same amount of time/effort; example: JPEG
• Asymmetrical
– compressing an image takes more time/effort than decompressing it; the idea is to do most of the work during compression, thus creating an output file that can be decompressed very quickly; example: FIC
• Most effective methods:
– GIF
– JPEG
– fractal image compression (FIC)
Image compression
• For a strictly black-and-white image, each pixel has only 1 data bit: 0 for white and 1 for black, say. The black pixels thus form a 2D set, so the image can be considered as a (compact) subset of R^2
• A gray-scale image can be considered as a (compact) subset of R^3: 2 spatial dimensions and a 3rd for gray-scale intensity (usually 0-255, 1 byte per pixel)
• A color bit-mapped image can be considered as a (compact) subset of R^5: in addition to the 2 spatial dimensions, there are 3 dimensions for the RGB color parameters
Lossy image compression - JPEG
• Joint Photographic Experts Group (JPEG) – develops ISO and ITU-T standards for continuous-tone image compression; official name: ISO/IEC JTC1 SC29 Working Group 1
http://www.jpeg.org/
• Standard description: http://www.jpeg.org/public/jpeglinks.htm
JPEG compression
• RGB image is converted into YCrCb:
$$\begin{bmatrix} Y \\ C_r \\ C_b \end{bmatrix} = \begin{bmatrix} 77/256 & 150/256 & 29/256 \\ 131/256 & -110/256 & -21/256 \\ -44/256 & -87/256 & 131/256 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}$$
• RGB colors quantized into 220 levels are changed into luminance Y and chrominance CrCb, also 220 levels
• 1 pair of chrominance values is coded for each 2 luminance values
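The color conversion can be sketched with the coefficients 77, 150, 29 / 131, 110, 21 / 44, 87, 131 (all over 256) and the offset of 128 on the two chrominance components. The signs are an assumption following the usual definition of luminance plus color differences: negative for G and B in Cr, negative for R and G in Cb, so that each chrominance row sums to zero. The function name is hypothetical.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one RGB pixel to luminance Y and chrominance Cr, Cb."""
    y = (77 * r + 150 * g + 29 * b) / 256
    cr = (131 * r - 110 * g - 21 * b) / 256 + 128
    cb = (-44 * r - 87 * g + 131 * b) / 256 + 128
    return y, cr, cb

gray = rgb_to_ycrcb(128, 128, 128)   # neutral color: (128.0, 128.0, 128.0)
red = rgb_to_ycrcb(255, 0, 0)        # Cr above 128, Cb below 128
```

A neutral gray maps to Y = 128 with both chrominance values at the 128 midpoint, so all color information of a pixel is carried by how far Cr and Cb deviate from 128.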
JPEG compression
• Discrete Cosine Transform (DCT) applied to 8x8 pixel blocks
• Quantization, depending on spatial frequency
• Run Length Encoding (RLE) and Huffman method, based on entropy calculation and prediction of the expected data pattern
• JPEG makes use of the relative insensitivity of the human eye to color shades, i.e. chrominance changes, in comparison with luminance. The quantization step can likewise be adjusted to spatial frequencies, i.e. a bigger step can represent less significant frequencies
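The first two steps, the DCT of an 8x8 block and frequency-dependent quantization, can be sketched in plain Python. The transform is the standard orthonormal 2-D DCT-II written in its direct form for clarity; the quantization rule (step growing with u + v) is a made-up stand-in for a real JPEG quantization table, and the function names are hypothetical.

```python
import math

N = 8

def dct_2d(block):
    """2-D DCT-II of an 8x8 block (direct O(N^4) form, for clarity not speed)."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, base_step=4):
    """Coarser steps for higher spatial frequencies (hypothetical step rule)."""
    return [[round(coeffs[u][v] / (base_step * (1 + u + v))) for v in range(N)]
            for u in range(N)]

flat_block = [[100] * N for _ in range(N)]     # a uniformly gray block
quantized = quantize(dct_2d(flat_block))
```

For the flat block only the DC coefficient `quantized[0][0]` is non-zero (it equals 200 here); the long run of zeros that follows is what run-length and Huffman coding then store very compactly.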
Graphics and video encoders
Image compression standards
JPEG compression
• Examples of comparison of original and JPEG-compressed images: http://www.bk.isy.liu.se/~svan/jpeg.html (original left, compressed right)
JPEG compression - examples
JPEG compression - artifacts
• Blocking
• Fringing
Moving pictures compression - MPEG
MPEG-1: for CD-I and Video-CD, allows transmission at 1.5 Mbps
MPEG-2: standard for digital TV and DVD (Digital Video Disc); advanced version of MPEG-1, with the possibility of coding interlaced images; 4 Mbps
MP3: audio compression standard MPEG Audio Layer 3 (MPEG-1 Part 3)
MPEG-4: standard for multimedia for the web and mobility
MPEG-7: Multimedia Content Description Interface - the aim is to create a standard for describing multimedia data content; in progress. MPEG-7 is not dedicated to specific applications
http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm
MPEG-21: Multimedia Framework - to enable transparent and augmented use of multimedia resources across a wide range of networks and devices used by different communities
http://mpeg.telecomitalialab.com/standards/mpeg-21/mpeg-21.htm
Compression techniques in MPEG
• Discrete Cosine Transform (DCT)
• Quantization
• Huffman coding
• Predictive coding – differences between frames are calculated and only the differences are encoded
• 2-way prediction – on the basis of previous and next frames
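The idea of predictive coding between frames can be sketched with plain per-pixel differences. This is a strong simplification: frames are flat lists of pixel values, the function names are hypothetical, and real MPEG coders predict blocks with motion compensation and also use the 2-way prediction mentioned above.

```python
def encode_sequence(frames):
    """Keep the first frame whole, then store only per-pixel differences."""
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for p, c in zip(prev, cur)])
    return encoded

def decode_sequence(encoded):
    """Rebuild every frame by adding the differences to the previous frame."""
    frames = [list(encoded[0])]
    for diff in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], diff)])
    return frames

video = [
    [10, 10, 10, 10],
    [10, 10, 12, 10],    # one pixel changed
    [10, 10, 12, 10],    # nothing changed
]
encoded = encode_sequence(video)
decoded = decode_sequence(encoded)
```

The difference frames consist mostly of zeros, which the subsequent quantization and Huffman coding stages represent far more compactly than the full frames.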
MPEG-4 - ISO/IEC 14496
• MPEG-4 - multimedia standard for networks, dedicated to the integration of digital TV production and distribution, interactive graphics and interactive multimedia techniques (WWW)
• Universality
– resolution
  • from below QCIF, quarter common intermediate format (176 x 144)
  • to above HDTV (about 1280 x 720)
– bit rate from 5 kb/s to 10 Mb/s
– video with or without interlacing
– others
• independence of the medium
• operations on scenes and objects
• interaction with the user
MPEG-4
• Main advantages:
– object-oriented conception
– scalability
– universality (hybrid scenes supported)
• Disadvantages:
– still limited popularity
– presence of many competitive technologies in some applications