Chapter 2 • Multimedia Information Representation

Upload: rahul-nyamangoudar

Post on 09-Nov-2015


  • Chapter 2: Multimedia Information Representation

  • Multimedia information representation

    2.1 Introduction; 2.2 Digitization principles; 2.3 Text; 2.4 Images; 2.5 Audio; 2.6 Video

  • Introduction

    The conversion of an analog signal into a digital form: signal encoder, sampling, signal decoder

  • 2.2 Digitization principles: 2.2.1 Analog signals

    Fourier analysis can be used to show that any time-varying analog signal is made up of a possibly infinite number of single-frequency sinusoidal signals whose amplitude and phase vary continuously with time relative to each other.

    Signal bandwidth (Fig. 2.1): the bandwidth of the transmission channel should be equal to or greater than the bandwidth of the signal; a channel with less bandwidth is a bandlimiting channel.

  • Multimedia Information Representation

    Multimedia information is stored and processed within a computer in a digital form.

    Codeword: combination of a fixed number of bits that represents each character, in the case of textual information.

    Analog signal: signal whose amplitude (magnitude of the sound/image intensity) varies continuously with time.

    Signal encoder: electrical circuit used for the conversion of an analog signal into a digital form.

    Signal decoder: electrical circuit that converts stored digitized samples into time-varying analogue form.

  • Analog Signals

    As mentioned earlier, the amplitude of an analog signal varies continuously with time. Fourier analysis can be used to show that any time-varying signal is made up of a possibly infinite number of single-frequency sinusoidal components. The range of frequencies of the sinusoidal components that make up the signal is called the signal bandwidth.

    Speech bandwidth: 50 Hz to 10 kHz. Music bandwidth: 15 Hz to 20 kHz.

  • Analog Signals Signal Properties

  • Analogue Signals: Signal Properties

    To transmit an analogue signal through a network, the bandwidth of the transmission channel should be equal to or greater than the signal bandwidth. If the bandwidth of the channel is less than the signal bandwidth, the channel is called a bandlimiting channel.

  • Encoder Design

  • 2.2.2 Encoder design

    The encoder consists of a bandlimiting filter and an analog-to-digital converter (ADC), the latter comprising a sample-and-hold circuit and a quantizer (Fig. 2.2).

    The bandlimiting filter removes selected higher-frequency components from the source signal (A). Its output (B) is then fed to the sample-and-hold circuit, which samples the amplitude of the filtered signal at regular time intervals (C) and holds the sample amplitude constant between samples (D).

  • 2.2.2 Encoder design

    The quantizer circuit converts each sample amplitude into a binary value known as a codeword (E). The signal must be sampled at a rate which is higher than the maximum rate of change of the signal amplitude, and the number of different quantization levels used should be as large as possible.

  • 2.2.2 Encoder design

    The Nyquist sampling theorem states that, in order to obtain an accurate representation of a time-varying analog signal, its amplitude must be sampled at a minimum rate that is equal to or greater than twice the highest sinusoidal frequency component that is present in the signal.

  • Encoder Design

    Bandlimiting filter: removes the selected higher-frequency components from the source signal.

    Sample-and-hold circuit: samples the amplitude of the filtered signal at regular intervals and holds the sampled amplitude constant between samples.

    Quantizer: converts the samples into their corresponding binary form.

  • Encoder Design Data representation

    The most significant bit of the codeword represents the sign of the sample: a binary 0 indicates a positive value and a binary 1 indicates a negative value. The signal must be sampled at a much higher rate than the maximum rate of change of the signal amplitude. The number of quantization levels should be as large as possible to represent the signal accurately.

  • Sampling Rate

    Nyquist sampling theorem: to obtain an accurate representation of a time-varying analogue signal, its amplitude must be sampled at a minimum rate that is equal to or greater than twice the highest sinusoidal frequency component that is present in the signal.

    The Nyquist rate is expressed either in Hz or, more correctly, in samples per second (sps).

    Antialiasing filter: another name for the bandlimiting filter, since it passes only those frequency components that the chosen sampling rate can represent without aliasing.
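The sampling theorem can be illustrated with a short numerical sketch. The function names `nyquist_rate` and `alias_frequency` are hypothetical helpers introduced here for illustration; the folding rule in `alias_frequency` is the standard result for an ideal sampler and is not taken from the slides.

```python
def nyquist_rate(f_max_hz: float) -> float:
    """Minimum sampling rate (samples per second) for a signal whose
    highest sinusoidal frequency component is f_max_hz."""
    return 2.0 * f_max_hz


def alias_frequency(f_hz: float, fs_sps: float) -> float:
    """Apparent frequency of a sinusoid at f_hz after sampling at fs_sps.

    Components above fs/2 fold back into the band 0..fs/2, which is
    how an undersampled component shows up as an alias signal.
    """
    f_folded = f_hz % fs_sps
    return min(f_folded, fs_sps - f_folded)


# Speech with components up to 10 kHz needs at least 20 ksps.
print(nyquist_rate(10_000))
# A 6 kHz component sampled at only 8 ksps appears as a 2 kHz alias.
print(alias_frequency(6_000, 8_000))
```

A component below half the sampling rate is returned unchanged by `alias_frequency`, which is the case the bandlimiting filter is meant to guarantee.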

  • Alias signal generation due to undersampling

    In reality, the transmission channel used often has a lower bandwidth than the source signal. To avoid distortion, the source signal is first passed through the bandlimiting filter, which is designed to pass only those frequency components that are within the channel bandwidth. This avoids alias signals caused by undersampling.

  • Representing the amplitude of each analogue sample exactly would require an infinite number of binary digits

    Quantization Intervals

  • Three bits are used to represent each sample (1 bit for the sign and 2 bits for the magnitude). If $V_{max}$ is the maximum positive and negative signal amplitude and $n$ is the number of binary bits used, then the quantization interval, $q$, is defined as $q = 2V_{max}/2^n$. A signal anywhere within a quantization interval is represented by the same binary codeword. Each codeword corresponds to a nominal amplitude at the centre of its quantization interval, so the actual signal level may differ from it by up to $\pm q/2$. This difference is known as the quantization error.

    Quantization Intervals
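A minimal sketch of the quantizer described above, assuming a sign-magnitude codeword (sign bit first, 0 for positive and 1 for negative) and at least 2 bits per sample. The function names and the clamping of full-scale samples to the top interval are illustrative choices, not taken from the slides.

```python
def quantization_interval(v_max: float, n_bits: int) -> float:
    """Quantization interval q = 2 * Vmax / 2**n."""
    return 2.0 * v_max / (2 ** n_bits)


def quantize(sample: float, v_max: float, n_bits: int):
    """Return (codeword, nominal_amplitude) for one sample.

    The codeword is sign-magnitude: the most significant bit is the
    sign, and the remaining n_bits - 1 bits select the interval.
    """
    q = quantization_interval(v_max, n_bits)
    intervals_per_polarity = 2 ** (n_bits - 1)
    sign_bit = 1 if sample < 0 else 0
    # Index of the interval containing |sample|, clamped at full scale.
    index = min(int(abs(sample) / q), intervals_per_polarity - 1)
    nominal = (index + 0.5) * q  # centre of the interval
    if sign_bit:
        nominal = -nominal
    codeword = f"{sign_bit}{index:0{n_bits - 1}b}"
    return codeword, nominal


# 3-bit example: 1 sign bit plus 2 magnitude bits, Vmax = 1.0
q = quantization_interval(1.0, 3)
codeword, nominal = quantize(0.6, v_max=1.0, n_bits=3)
print(q, codeword, nominal)
```

For any sample within the range -Vmax to +Vmax, the difference between the sample and the returned nominal amplitude stays within q/2, which is the quantization error bound stated above.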

  • Quantization noise polarity Quantization error is the difference between the actual signal amplitude and the corresponding nominal amplitude (also known as quantization noise since values vary randomly)

  • Dynamic Range

    With high-fidelity music it is important to be able to hear very quiet passages without any distortion created by quantization noise. Dynamic range is defined as the ratio of the maximum signal amplitude to the minimum: $D = 20 \log_{10}(V_{max}/V_{min})$ dB.
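The dynamic range formula can be evaluated directly. This is a small sketch; `dynamic_range_db` is a hypothetical helper name and the amplitudes used are arbitrary illustration values.

```python
import math


def dynamic_range_db(v_max: float, v_min: float) -> float:
    """Dynamic range D = 20 * log10(Vmax / Vmin), in dB."""
    return 20.0 * math.log10(v_max / v_min)


# A maximum amplitude 1000 times the minimum corresponds to about 60 dB.
print(round(dynamic_range_db(1.0, 0.001), 2))
# Each doubling of the amplitude ratio adds about 6 dB.
print(round(dynamic_range_db(2.0, 1.0), 2))
```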

  • Decoder Design

    A signal decoder is an electronic circuit that converts the stored digital codewords back into their analogue form prior to output, using a digital-to-analogue converter (DAC) followed by a low-pass filter.

    Low-pass filter: passes only those frequency components that were passed by the bandlimiting filter in the encoder.

    Encoder + decoder = codec.

  • Text

    Unformatted text: known as plain text; enables pages to be created which comprise strings of fixed-sized characters from a limited character set.

    Formatted text: known as rich text; enables pages to be created which comprise strings of characters of different styles, sizes and shapes, with tables, graphics, and images inserted at appropriate points.

    Hypertext: enables an integrated set of documents (each comprising formatted text) to be created which have defined linkages between them.

  • Unformatted Text: the basic ASCII character set

    The American Standard Code for Information Interchange (ASCII) is one of the most widely used character sets; the table includes the 7-bit binary codewords used to represent each character.

    Control characters: backspace, escape, delete, form feed, etc.

    Printable characters: alphabetic, numeric, and punctuation.

  • Unformatted Text Supplementary set of Mosaic characters The characters in columns 010/011 and 110/111 are replaced with the set of mosaic characters; and then used, together with the various uppercase characters illustrated, to create relatively simple graphical images

  • Unformatted Text Examples of Videotext/Teletext Although in practice the total page is made up of a matrix of symbols and characters which all have the same size, some simple graphical symbols and text of larger sizes can be constructed by the use of groups of the basic symbols

  • Formatted Text

    It is produced by most word processing packages and used extensively in the publishing sector for the preparation of papers, books, magazines, journals and so on. Documents can mix characters of different styles, fonts and shapes. Format control characters are used to specify these.

  • Hypertext Electronic Document in hypertext Hypertext can be used to create an electronic version of documents with the index, descriptions of departments, courses on offer, library, and other facilities all written in hypertext as pages with various defined hyperlinks

  • Hypertext: Electronic Document in hypertext

    An example of a hypertext language is HTML, which is used to describe how the contents of a document are presented on a printer or a display. Other mark-up languages are PostScript, SGML (Standard Generalized Markup Language), TeX and LaTeX.

  • Images

    Images include computer-generated images (referred to as computer graphics or simply graphics) and digitized images of both documents and pictures. All types of images are displayed in the form of a two-dimensional matrix of individual picture elements (pixels or pels), but they are represented differently within the computer memory (file). Each of these image types is created differently.

  • Graphics

    VGA is a common type of display that consists of a matrix of 640 horizontal pixels by 480 vertical pixels with, for example, 8 bits per pixel, which allows each pixel to have one of 256 different colours.

  • Graphics

    All objects are made up of a series of lines that are connected to each other; what appears as a curved line is, in practice, a series of short lines, each made up of a string of pixels. Each object has a number of attributes associated with it, including its shape, its size in terms of pixel positions, the colour of the border, etc. Colouring a solid block with the same colour is known as rendering.

  • Graphics: Conclusions

    There are two forms of representation of a graphic:

    - High-level representation (similar to the source code of a program): requires less memory to store the image and less bandwidth for transmission.

    - The actual pixel image of the graphic (similar to low-level machine code and generally known as bit-map format), e.g. GIF (Graphics Interchange Format) and TIFF (Tagged Image File Format).

    A graphic can be transferred over the network in either form. A software package called SRGP (Simple Raster Graphics Package) is used to convert the high-level form into a pixel-image form.

  • Digitized Documents: Fax Principles

    The scanner associated with a fax machine operates by scanning each complete page from left to right to produce a sequence of scan lines that start at the top of the page and end at the bottom. The vertical resolution is either 3.85 or 7.7 lines per mm (approximately 100 or 200 lines per inch).

  • Digitized Documents: Digitization format

    Fax machines use a single binary digit to represent each pel: a 0 for a white pel and a 1 for a black pel. Hence the digital representation of a scanned page produces a stream of about 2 million bits. Using a single binary digit per pel means that fax machines are best suited to bitonal images.

  • Colour Derivation Principles: additive colour mixing (R + G + B)

    Black is produced when all three primary colours (R, G, B) are zero. Useful for producing a colour image on a black surface, as is the case in display applications.

  • Digitised Pictures- Subtractive colour mixing White is produced when the three chosen primary colours cyan, magenta and yellow are all zero. Useful for producing a colour image on a white surface as is the case in printing applications.

  • Digitized Pictures- Television/computer monitor principles The picture tubes used in most television sets operate using what is known as a raster-scan; this involves a finely-focussed electron beam being scanned over the complete screen

  • Digitized Pictures: Raster Scan

    Progressive scanning is performed by repeating the scanning operation, which starts at the top left corner of the screen and ends at the bottom right corner, followed by the beam being deflected back again to the top left corner.

  • Digitized Pictures Raster scan display architecture

  • Digitized Pictures: Pixel format on each scan line

    The set of three related colour-sensitive phosphors associated with each pixel is called a phosphor triad, and the typical arrangement of the triads on each scan line is shown.

  • Digitized Pictures: Concepts

    Frame: each complete set of horizontal scan lines (either 525 for North and South America and most of Asia, or 625 for Europe and other countries).

    Flicker: caused by the previous image fading from the eye's retina before the following image is displayed, which happens when the refresh rate is too low (to avoid this, a refresh rate of 50 times per second is required).

    Pixel depth: number of bits per pixel; it determines the range of different colours that can be produced.

    Colour look-up table (CLUT): table that stores a selected subset of colours; each pixel value is used as an address to a location in the table, which reduces the amount of memory required to store an image.
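The memory saving from a colour look-up table can be estimated with a few lines of arithmetic. This is a sketch under assumed figures: a 640 x 480 image, 24-bit colours, and an 8-bit index per pixel. The function names are hypothetical.

```python
def colours_for_depth(pixel_depth_bits: int) -> int:
    """Number of distinct colours available at a given pixel depth."""
    return 2 ** pixel_depth_bits


def direct_image_bits(width: int, height: int, colour_bits: int) -> int:
    """Memory needed when every pixel stores its full colour value."""
    return width * height * colour_bits


def clut_image_bits(width: int, height: int,
                    index_bits: int, colour_bits: int) -> int:
    """Memory needed when every pixel stores an index into a CLUT.

    The table itself holds 2**index_bits entries of colour_bits each.
    """
    table_bits = (2 ** index_bits) * colour_bits
    return width * height * index_bits + table_bits


print(colours_for_depth(8))                 # colours selectable per pixel
print(direct_image_bits(640, 480, 24))      # full 24-bit colour per pixel
print(clut_image_bits(640, 480, 8, 24))     # 8-bit index plus the table
```

With these assumed figures the indexed image needs roughly one third of the memory, at the cost of limiting each image to 256 colours from the larger palette.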

  • Digitized Pictures Aspect Ratio: This is the ratio of the screen width to the screen height ( television tubes and PC monitors have an aspect ratio of 4/3 and wide screen television is 16/9)

  • Digitized Pictures: Screen Resolutions

    NTSC = 525 lines per frame (480 visible). PAL, CCIR, SECAM = 625 lines per frame (576 visible).

    Example display resolutions: VGA (640x480x8), XGA (1024x768x8) and SVGA (1024x768x24).

  • 2.4.3 Digitized pictures: Color principles

    A whole spectrum of colors, known as a color gamut, can be produced by using different proportions of red (R), green (G), and blue (B) (Fig. 2.12).

    Additive color mixing is used for producing a color image on a black surface; subtractive color mixing is used for producing a color image on a white surface (Fig. 2.13).

  • 2.4.3 Digitized pictures: Raster-scan principles

    Progressive scanning: each complete set of horizontal scan lines is called a frame. The number of bits per pixel is known as the pixel depth and determines the range of different colors that can be produced.

  • 2.4.3 Digitized pictures: Aspect ratio

    Both the number of pixels per scanned line and the number of lines per frame depend on the aspect ratio, which is the ratio of the screen width to the screen height.

    Standards: National Television Standards Committee (NTSC), PAL (UK), CCIR (Germany), SECAM (France) (Table 2.1).

  • 2.4.3 Digitized pictures

  • Digitized Pictures (5): Example 2.3

    Derive the time to transmit the following digitized images over both 64 kbps and 1.5 Mbps networks:

    - a 640 × 480 × 8 VGA-compatible image
    - a 1024 × 768 × 24 SVGA-compatible image

    Solution. The size of each image in bits is as follows:

    - VGA image = 640 × 480 × 8 = 2.46 Mbits
    - SVGA image = 1024 × 768 × 24 = 18.87 Mbits

    The time to transmit each image is as follows:

    - At 64 kbps: VGA = [2.46 × 10^6] / [64 × 10^3] = 38.4 s; SVGA = [18.87 × 10^6] / [64 × 10^3] = 295 s
    - At 1.5 Mbps: VGA = [2.46 × 10^6] / [1.5 × 10^6] = 1.64 s; SVGA = [18.87 × 10^6] / [1.5 × 10^6] = 12.58 s
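The arithmetic of Example 2.3 can be checked with a short script. This is a sketch using exact bit counts rather than rounded Mbit values; the function names are hypothetical, and the rates are taken as 64 × 10^3 and 1.5 × 10^6 bits per second as in the example.

```python
def image_size_bits(width: int, height: int, pixel_depth: int) -> int:
    """Size of an uncompressed image in bits."""
    return width * height * pixel_depth


def transmit_time_s(size_bits: int, rate_bps: float) -> float:
    """Time in seconds to send size_bits over a link of rate_bps."""
    return size_bits / rate_bps


images = {
    "VGA": image_size_bits(640, 480, 8),
    "SVGA": image_size_bits(1024, 768, 24),
}
rates = {"64 kbps": 64e3, "1.5 Mbps": 1.5e6}

for name, bits in images.items():
    for label, rate in rates.items():
        seconds = transmit_time_s(bits, rate)
        print(f"{name} ({bits} bits) at {label}: {seconds:.2f} s")
```

Small differences in the last digit compared with a hand calculation come from rounding the image size to two decimal places before dividing.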

  • 2.4.3 Digitized pictures: Digital cameras and scanners

    An image is captured within the camera/scanner using an image sensor, which is a two-dimensional grid of light-sensitive cells called photosites. A widely-used image sensor is a charge-coupled device (CCD) (Fig. 2.16).

  • Digitized Pictures Colour Image Capture: Schematic Typical arrangement that is used to capture and store a digital image produced by a scanner or a digital camera (either a still camera or a video camera)

  • Digitized Pictures: Colour Image Capture (Schematic)

    Photosites: the light-sensitive cells, arranged in a two-dimensional grid on a silicon chip, each of which stores the level of intensity of the light that falls on it.

    Charge-coupled device (CCD): image sensor that converts the level of light intensity on each photosite into an equivalent electrical charge.

  • 2.6 Video: 2.6.1 Broadcast television

    Scanning sequence: it is necessary to use a minimum refresh rate of 50 times per second to avoid flicker, but a refresh rate of 25 times per second is sufficient to produce smooth motion. Each frame is therefore transmitted as two fields, the first comprising only the odd scan lines and the second the even scan lines.

  • 2.6.1 Broadcast television

    The two fields are then integrated together in the television receiver using a technique known as interlaced scanning (Fig. 2.19).

    The three main properties of a color source are:

    Brightness.

    Hue: this represents the actual color of the source.

    Saturation: this represents the strength or vividness of the color.

  • 2.6.1 Broadcast television

    The term luminance is used to refer to the brightness of a source. The hue and saturation are referred to as its chrominance.

    $Y_s = 0.299R_s + 0.587G_s + 0.114B_s$

    where $Y_s$ is the amplitude of the luminance signal and $R_s$, $G_s$ and $B_s$ are the magnitudes of the three color component signals.

  • 2.6.1 Broadcast television

    The blue chrominance (Cb) and the red chrominance (Cr) are then used to represent hue and saturation. They are the two color difference signals: $C_b = B_s - Y_s$ and $C_r = R_s - Y_s$.
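The equations on these slides did not survive in the transcript. The sketch below assumes the standard luminance weighting (0.299, 0.587, 0.114) and the unscaled color differences Cb = Bs - Ys and Cr = Rs - Ys; the function names are hypothetical, and the inputs are taken as normalized values between 0 and 1.

```python
def luminance(r: float, g: float, b: float) -> float:
    """Luminance Ys from the three color component magnitudes.

    Assumes the standard weighting 0.299 R + 0.587 G + 0.114 B.
    """
    return 0.299 * r + 0.587 * g + 0.114 * b


def colour_difference(r: float, g: float, b: float):
    """Return (Ys, Cb, Cr) with Cb = Bs - Ys and Cr = Rs - Ys."""
    y = luminance(r, g, b)
    return y, b - y, r - y


# White has full luminance and (almost exactly) zero chrominance.
print(colour_difference(1.0, 1.0, 1.0))
# Pure red: low luminance, positive Cr, negative Cb.
print(colour_difference(1.0, 0.0, 0.0))
```

Any grey level (equal R, G and B) gives zero for both color difference signals, which is why a monochrome receiver can use the luminance signal alone.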

  • 2.6.1 Broadcast television

    In the PAL system, Cb and Cr are referred to as U and V respectively.

    The NTSC system forms two different signals referred to as I and Q.

  • 2.6.2 Digital video

    Experiments have shown that the eye is less sensitive to color than it is to luminance.

    4:2:2 format: the original digitization format used in Recommendation CCIR-601. A line sampling rate of 13.5 MHz is used for luminance and 6.75 MHz for each of the two chrominance signals. The number of samples per line is increased to 720.

  • 2.6.2 Digital video

    The corresponding number of samples for each of the two chrominance signals is 360 samples per active line. This results in 4 Y samples for every 2 Cb and 2 Cr samples. The numbers 480 and 576 are the number of active (visible) lines in the respective systems (Fig. 2.21, Example 2.7).
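The sampling figures above lead directly to a bit rate and a per-frame memory requirement. The sketch below assumes 8 bits per sample, which the slides do not state, and uses hypothetical function names; it is an illustration of the arithmetic, not a reproduction of Example 2.7.

```python
Y_RATE_HZ = 13_500_000        # luminance sampling rate
C_RATE_HZ = 6_750_000         # rate for each chrominance signal
Y_SAMPLES_PER_LINE = 720
C_SAMPLES_PER_LINE = 360      # for each of Cb and Cr


def bit_rate_422(bits_per_sample: int = 8) -> int:
    """Line bit rate: one luminance plus two chrominance streams."""
    return Y_RATE_HZ * bits_per_sample + 2 * C_RATE_HZ * bits_per_sample


def frame_bits_422(active_lines: int, bits_per_sample: int = 8) -> int:
    """Memory for one frame, counting only the active (visible) lines."""
    line_bits = (Y_SAMPLES_PER_LINE + 2 * C_SAMPLES_PER_LINE) * bits_per_sample
    return active_lines * line_bits


print(bit_rate_422() / 1e6, "Mbps")
print(frame_bits_422(480) / 8, "bytes per frame, 525-line system")
print(frame_bits_422(576) / 8, "bytes per frame, 625-line system")
```

Because the luminance rate equals the sum of the two chrominance rates, halving the chrominance resolution again (as in 4:2:0) reduces the total by a quarter rather than a half.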

  • Figure 2.21 Sample positions with 4:2:2 digitization format.

  • 2.6.2 Digital video

    The 4:2:0 format is used in digital video broadcast applications. Interlaced scanning is used, with chrominance samples absent in alternate lines. This gives the same luminance resolution but half the chrominance resolution (Fig. 2.22).

  • Figure 2.22 Sample positions in 4:2:0 digitization format.

  • 2.6.2 Digital video

    525-line system

    625-line system

  • 2.6.2 Digital video

    HDTV formats: the resolution for the newer 16/9 wide-screen tubes can be up to 1920 × 1152 pixels.

    The source intermediate format (SIF) gives a picture quality comparable with that of video cassette recorders (VCRs).

  • 2.6.2 Digital video

    The common intermediate format (CIF) is for use in videoconferencing applications (Fig. 2.23). The quarter CIF (QCIF) is for use in video telephony applications (Fig. 2.24, Table 2.2).

  • Figure 2.23 Sample positions for SIF and CIF.

  • Figure 2.24 Sample positions for QCIF.

  • 2.6.3 PC video

  • 2.5 Audio

    The bandwidth of a typical speech signal is from 50 Hz through to 10 kHz; that of a music signal is from 15 Hz through to 20 kHz.

    The sampling rate is therefore 20 ksps (2 × 10 kHz) for speech and 40 ksps (2 × 20 kHz) for music.

    Stereophonic (stereo) music results in a bit rate double that of a monaural (mono) signal (Example 2.4).

  • 2.5.2 CD-quality audio

    Bit rate per channel = sampling rate × bits per sample = 44.1 × 10^3 × 16 = 705.6 kbps

    Total bit rate = 2 × 705.6 kbps = 1.411 Mbps (Example 2.5)
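The CD-quality figures can be reproduced in a few lines. The sketch assumes the CD-DA parameters of 44.1 ksps and 16 bits per sample, which are consistent with the 705.6 kbps per-channel figure above; the function name and the one-hour storage example are illustrative additions.

```python
def pcm_bit_rate(sampling_rate_sps: int, bits_per_sample: int,
                 channels: int = 1) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sampling_rate_sps * bits_per_sample * channels


per_channel = pcm_bit_rate(44_100, 16)
stereo = pcm_bit_rate(44_100, 16, channels=2)
print(per_channel / 1e3, "kbps per channel")
print(stereo / 1e6, "Mbps for stereo")

# Storage for one hour of stereo audio, in megabytes (10**6 bytes).
one_hour_mbytes = stereo * 3600 / 8 / 1e6
print(round(one_hour_mbytes, 1), "MB per hour")
```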

  • Audio

    There are two types of audio signal: speech signals and music-quality audio.

    Audio is produced by a microphone or a synthesizer.

    A synthesizer produces audio in digital form, which can be stored directly in the computer.

    PCM speech:

    PCM (pulse code modulation) is a digitization process.

    It is defined in ITU-T Recommendation G.711.

    A PCM system consists of an encoder and a decoder.

  • The PCM system also includes a compressor and an expander.

    With the linear quantization described earlier, the quantization noise level is the same for both loud and quiet signals.

    As the ear is more sensitive to noise on quiet signals than on loud signals, a PCM system uses non-linear quantization, with narrower intervals for small amplitudes, implemented by the compressor.

    At the destination, an expander is used.

    The overall operation is known as companding.

    In the encoder, the signal is first passed through the compressor and then passed to the ADC, where it is quantized.

    At the receiver, each codeword is first passed to the DAC and then to the expander.

    Two compressor characteristics are in use: A-law and μ-law.
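As an illustration of companding, the sketch below implements the continuous μ-law compressor and expander curves for samples normalized to the range -1 to +1. It assumes μ = 255, the value commonly used with μ-law PCM, and it models only the analogue characteristic, not the segmented 8-bit encoding that G.711 actually specifies. The function names are hypothetical.

```python
import math

MU = 255.0  # assumed compression parameter


def compress(x: float, mu: float = MU) -> float:
    """μ-law compressor: expands small amplitudes, squeezes large ones."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y: float, mu: float = MU) -> float:
    """μ-law expander: the inverse of compress()."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)


for x in (0.01, 0.1, 0.5, 1.0):
    y = compress(x)
    print(f"x = {x:<5} compressed = {y:.3f} restored = {expand(y):.3f}")
```

A quiet sample such as 0.01 is mapped to a much larger compressed value, so after uniform quantization of the compressed signal the quiet parts of the input are effectively quantized with narrower intervals than the loud parts.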

  • CD-quality audio

    The standard for CD players and CD-ROMs is the CD-DA (compact disc digital audio) standard.

    Synthesized audio:

    Synthesized audio uses less memory than a digitized waveform.

    It is easier to edit synthesized audio.

    Several passages can be mixed together.

    The three components are a computer, a keyboard and sound generators.

    The keyboard sends commands to the computer, which passes them to the sound generators; these produce the sound waveform, via a DAC, to drive the speakers.

    With a synthesizer keyboard, pressing each key generates a different codeword, known as a message, which is read by the computer program.

    The control panel has switches and sliders that indicate to the program the volume and sound effects required.

  • The secondary storage interface allows audio to be stored on secondary storage devices.

    There are programs that allow the user to edit a previously entered passage or to mix several stored passages together.

    There is a range of other inputs, from other instruments.

    To discriminate between inputs from the different possible sources, a standard set of messages is defined for the corresponding sound generators.

    These are defined in a standard, the Musical Instrument Digital Interface (MIDI).

    It defines the format of the standardized set of messages used by a synthesizer, as well as the types of connectors, cables and electrical signals.
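To make the idea of a standardized message concrete, the sketch below builds the two most common MIDI channel messages as raw bytes. The byte layout (a status byte of 0x90 or 0x80 combined with a 4-bit channel number, followed by a 7-bit note number and a 7-bit velocity) comes from the MIDI 1.0 specification rather than from the slides, and the function names are hypothetical.

```python
def _check(channel: int, note: int, velocity: int) -> None:
    """Validate the ranges allowed by the message format."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not 0 <= note <= 127:
        raise ValueError("note must be 0..127")
    if not 0 <= velocity <= 127:
        raise ValueError("velocity must be 0..127")


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Three-byte Note On message (key pressed)."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Three-byte Note Off message (key released)."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])


# Middle C (note number 60) pressed and released on the first channel.
print(note_on(0, 60, 100).hex())
print(note_off(0, 60).hex())
```

Three bytes per key press, compared with tens of thousands of samples per second for a digitized waveform, is the reason synthesized audio needs so much less memory.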

  • Example 1

  • Example 2

  • Example 3

  • Example 4

  • Example 5

  • Example 6

  • Chapter 1- Example-1

  • Chapter 1 example 2
