Video Technology 2013-14 - Minor II

Upload: milind-mathur

Post on 03-Jun-2018


  • 8/12/2019 Video Technology 2013-14 - Minor II

    1/282

    Image & Video Processing


    B.Tech Sem VIII

  • Slide 2/282

    Course Contents

    1. Introduction to Image, Video
       - Human Visual System (HVS)
    2. Colours: Biology, Physics, Technology, Coding
    3. Image Processing (examples): capture, preprocessing, 1D and 2D Fourier transformation, 1D and 2D convolution, reconstruction, aliasing, filtering
       - Enhancement: noise reduction, filter masks; edge-detection, histograms; image segmentation
       (Topics up to here: Minor I)
    4. Image Compression (example)
    5. History and Basics of Video Technology (examples)
    6. Video Compression
       (Topics 4-6: Minor II)

  • Slide 3/282

    MINOR I

  • Slide 4/282

    Images

    Images, often called pictures, are represented by bitmaps.

    A bitmap is a spatial two-dimensional matrix made up of individual picture elements called pixels.

    Each pixel has a numerical value called amplitude.

    The number of bits available to code a pixel is called amplitude depth or pixel depth.

    A pixel depth may represent
    - a black or white dot in images,
    - a level of gray in continuous-tone, monochromatic images, or
    - the color attributes of the picture element in colored pictures.

  • Slide 5/282

    An Image is...

    (will discuss in later lectures)

    [Figure: three 7x7 matrices of hexadecimal amplitude values, the Red, Green and Blue planes of a small image patch.]

    An image is made up of pixels; each pixel is a combination of Red, Green and Blue color amplitudes.

    It is represented by I(x,y), a function of the two spatial coordinates of the image plane. I(x,y) is the intensity of the image at the point (x,y) on the image plane.

    A color image is represented by R(x,y), G(x,y), B(x,y).

  • Slide 6/282

    Image processing

    An image processing operation typically defines a new image g in terms of an existing image f.

    The simplest operations are those that transform each pixel in isolation. These pixel-to-pixel operations can be written:

        g(x, y) = t(f(x, y))

    Examples: threshold, RGB to grayscale.

    Note: a typical choice for mapping to grayscale is to apply the YIQ television matrix and keep the Y.

        | Y |   |  0.299   0.587   0.114 | | R |
        | I | = |  0.596  -0.275  -0.321 | | G |
        | Q |   |  0.212  -0.523   0.311 | | B |
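The two pixel-to-pixel operations named above can be sketched as plain functions on a single amplitude (illustrative names, not from the lecture; the threshold level of 128 is an assumed default):

```python
def to_grayscale(r, g, b):
    """Luma row of the YIQ matrix: Y = 0.299 R + 0.587 G + 0.114 B."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def threshold(value, level=128):
    """Pixel-to-pixel threshold: map an amplitude to black (0) or white (255)."""
    return 255 if value >= level else 0
```

Applying either function independently at every (x, y) is exactly the g(x, y) = t(f(x, y)) form above.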

  • Slide 7/282

    Pixel movement

    Some operations preserve intensities, but move pixels around in the image:

        g(x, y) = f(x + Δx(x, y), y + Δy(x, y))

    Examples: many amusing warps of images

    [Show image sequence.]
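A minimal sketch of such an intensity-preserving pixel movement, with the image as a list of rows and the displacements given as functions (out-of-range samples default to 0, an assumed boundary choice):

```python
def warp(image, dx, dy):
    """g(x, y) = f(x + dx(x, y), y + dy(x, y)); intensities are moved, not changed."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x + dx(x, y), y + dy(x, y)
            if 0 <= sx < w and 0 <= sy < h:   # samples outside the image stay 0
                out[y][x] = image[sy][sx]
    return out
```

A constant displacement simply shifts the image; spatially varying dx, dy give the "amusing warps" mentioned above.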

  • Slide 8/282

    Recommended Books

    R.C. Gonzalez and R.E. Woods, Digital Image Processing, Addison-Wesley, 3rd Edition.

    A.K. Jain, Fundamentals of Digital Image Processing, Prentice Hall India, 1989.

    K.R. Rao and J.J. Hwang, Techniques and Standards for Image, Video, and Audio Coding, Prentice Hall.

  • Slide 9/282

    Overview: Image Formats

    Uncompressed:
    - pgm (portable gray map) or ppm (portable pixel map): Unix
    - bmp (gray and color): Windows

    Compressed:
    - GIF (Graphics Interchange Format): average compression ratio 4:1. Versions GIF87a and GIF89a.
    - JPEG (Joint Photographic Experts Group): good for photos, not very good for small images or line art less than 100x100 pixels. Compression ratio 10:1 to 100:1.
    - PNG (Portable Network Graphics): more color depth (up to 48-bit) than GIF (8-bit). 10-30% smaller than GIF. Automatic anti-aliasing. Text-based metadata can be added.
    - tif, tiff, ps, pdf, eps etc.

  • Slide 10/282

    Clarifications: Dimension in different contexts

    Dimension of a signal ~ number of index variables
    - Audio and speech is a 1-D signal: over time or a sampled time index
    - Image is 2-D: over two spatial indices (horizontal and vertical)
    - Video is 3-D: over two spatial indices and one time index

    Dimension of an image ~ size of the digital image
    - How many pixels along each row and column: e.g. a 512x512 image
    - Also referred to as the resolution of an image

    Dimension of a vector space ~ number of basis vectors in it
    - [x(1), ..., x(N)]^T ~ number of elements in the vector

  • Slide 11/282

    Colours

    Vipan Kakkar

  • Slide 12/282

    Colours

    Upon completion of this unit, you should be able to explain what colour is and describe how it is perceived.

  • Slide 13/282

    What is colour?

    colour: The appearance of objects or light sources described in terms of the individual's perception of them, involving hue, brightness, and saturation. (Webster's New College Dictionary)

    Colour is a sensation -- a perception of the viewer.

  • Slide 14/282

    The Visible Spectrum

  • Slide 15/282

    The colour of Light


  • Slide 16/282

    Light

    Illuminating sources:
    - emit light (e.g. the sun, a light bulb, TV monitors)
    - perceived colour depends on the emitted frequencies
    - follow the additive rule: R + G + B = White

    Reflecting sources:
    - reflect an incoming light (e.g. colour dye, cloth, etc.)
    - perceived colour depends on the reflected frequencies (= emitted freq. - absorbed freq.)
    - follow the subtractive rule: R + G + B = Black

  • Slide 17/282

    Wavelength of the light

    (RGB)


  • Slide 18/282

    The colour Sensitivity of Light

    The eye responds to the visible region of the spectrum. This visible region is a very narrow segment of the electromagnetic spectrum, extending from ~440 nm in the extreme blue (near ultraviolet) to ~690 nm in the red region, with green in the middle at ~555 nm.

  • Slide 19/282

    Primary colours


    Red Green Blue

  • Slide 20/282

    Food for thought

    Why R, G, B?

    These are primary colours.

    Why are these considered primary colours?

  • Slide 21/282

    Food for thought

    Why R, G, B?

    These are primary colours.

    Why are these considered primary colours?

    Due to laws of physics

    Due to laws of biology

  • Slide 22/282

    Human Visual System (HVS)

    Eyes, optic nerve, parts of the

    brain

    Transforms electromagnetic energy into neural signals

  • Slide 23/282

    Human Visual System

    Image Formation
    - Cornea (focus, curvature, refractive index), sclera (nerves till optic nerve), pupil (entry point for light), iris (controls the quantity of light), lens (focus), retina (image), fovea (central vision, cones)

    Transduction
    - retina, rods, and cones

    Processing
    - optic nerve, brain

  • Slide 24/282

    Human Visual System

    From a HVS perspective, the retina is composed of three kinds of cone cells that have a high sensitivity to particular wavelengths. These are 630 nm (red), 530 nm (green) and 450 nm (blue).

    So, all image/video displays are based on this theory of vision to reproduce color.

  • Slide 25/282

    The Human Vision System (HVS)

    Naturally, an eye can adapt to a huge range of intensities, from the lowest visible light to the highest bearable glare.

    The eye uses two types of discrete light receptors:
    - 6-7 million centrally located cones are highly sensitive to colour and bright light.
    - 75-100 million rods across the surface of the retina are sensitive to light but not colour.

    Above : A cardiologist examining a coronary angiogram.

    Above right: http://www.wiu.edu/users/mfmrb/PSY343/SensPercVision.htm

  • Slide 26/282

    Transduction (Retina)

    Transform light to neural impulses:
    - Rods and cones convert light energy into electrical impulses
    - Bipolar cells signal ganglion cells
    - Axons of the ganglion cells form the optic nerve

    [Figure: retinal pathway: rods/cones, bipolar cells, ganglion cells, optic nerve]

  • Slide 27/282

    The Human Vision System (HVS)

    As in a camera, the image is projected upside down on the back surface.

    However, the human vision system (HVS) is an extremely sophisticated multi-stage process.

    The first steps in the sensory process of vision involve the stimulation of the light receptors.

    Electrical signals containing the vision information from each eye are transmitted to the brain through the optic nerves.

    The image information is processed in several stages, ultimately reaching the visual cortex of the cerebrum.

    We see with our brain, not our eyes!

  • Slide 28/282

    Rods vs Cones

    Rods:
    - Contain photo-pigment
    - Respond to low energy
    - Enhance sensitivity
    - Concentrated in the retina, but outside of the fovea
    - One type, sensitive to grayscale changes

    Cones:
    - Contain photo-pigment
    - Respond to high energy
    - Enhance perception
    - Concentrated in the fovea, exist sparsely in the retina
    - Three types, sensitive to different wavelengths

  • Slide 29/282

    Camera and Eye

  • Slide 30/282

    Food for thought

    Why R, G, B?

    These are primary colours.

    Why are these considered primary colours?

    Due to laws of physics

    Due to laws of biology

    Primary colours therefore are primary only because the cones in our eyes are sensitive to those three colours.

  • Slide 31/282

    Tri-stimulus Theorem (1/3)

    RGB Model: Different intensities of red, green, and blue are added to generate various colors.

    Luma/Chroma Representation (to be described later):
    - The luminance component (Y) contains the gray-scale information.
    - The chrominance components define the color (U) and the intensity (V) of the color.

    Advantage: The human eye is more sensitive to brightness than to color. A compression scheme can use gray-scale information to define detail and allow loss of color information to achieve higher rates of compression (e.g., JPEG).
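The luma/chroma split can be sketched as follows; this uses the common BT.601 luma weights and colour-difference scale factors as one concrete choice (the slides' U/V naming is kept, but the exact scale factors are an assumption, since they vary by standard):

```python
def rgb_to_yuv(r, g, b):
    """Separate brightness (Y) from colour (U, V) for an RGB pixel."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: the gray-scale information
    u = 0.492 * (b - y)                     # blue-difference chroma
    v = 0.877 * (r - y)                     # red-difference chroma
    return y, u, v
```

For a gray pixel (r = g = b) the chroma channels vanish, which is what lets a codec subsample or discard them with little visible loss.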

  • Slide 32/282

    Tri-stimulus Theorem (2/3)

    Primary colors cannot be obtained by mixing the other two primary colors.

    This is because there are three types of color receptors in a human eye.

  • Slide 33/282

    Tri-stimulus Theorem (3/3)

    3 types of cones (6 to 7 million of them): Red = L cones, Green = M cones, Blue = S cones

    The ratio differs from person to person:
    - E.g., Red (64%), Green (32%), rest S cones
    - E.g., L (75.8%), M (20%), rest S cones
    - E.g., L (50.6%), M (44.2%), rest S cones

    Source of information: see "cone cell" on Wikipedia; www.colorbasics.com/tristimulus/index.php

    Each type is most responsive to a narrow band: red and green absorb the most energy, blue the least.

    Light stimulates each set of cones differently, and the ratios produce the sensation of color.

  • Slide 34/282

    Color Specification Systems (Color Spaces)

    Spectral Power Distribution (SPD): a plot of the radiant energy of a color vs wavelength.

    The luminance, hue (color), and saturation of a color can be specified most accurately by its SPD. However, the SPD does not describe the relationship between the physical properties of a color and its visual perception.

    The International Commission on Illumination (CIE - Commission Internationale de l'Eclairage) system defines how to map an SPD to a triple of numerical components that are mathematical coordinates in a color space (more details during the video processing lectures).

  • Slide 35/282

    What is a color space?

    A model used to define a specified color. R, G and B represent lighting colors of red, green and blue respectively.

    Combining red, green and blue with different weights can produce any visible color. A numerical value is used to indicate the proportion of each color.

    Drawback: the 3 colors are equally important and must be stored with the same amount of data bits. Therefore another color representation may be required (described later).

  • Slide 36/282

    The CIE System (1/3)

    The CIE (Chromaticity System) was established to define an "average" human observer.

    The average human eye is most sensitive to green/yellow light and least sensitive to reds or blues (slide 11).

    Colour-matching experiments were performed with a standard apparatus in order to define a "standard observer". The results are shown here, and are called the "CIE color space".

  • Slide 37/282

    The CIE System (2/3)

    CIE 1931 XYZ system

    One of the color spaces

    The first mathematically defined color space

    Three parameters: X, Y, Z, or Y (brightness) and x, y (chroma)

  • Slide 38/282

    The CIE System (3/3)

    CIE Chromaticity

    Diagram

    Spectral Locus

    Parameters: x, y

  • Slide 39/282

    Refresher: Color Theory

    Examples

  • Slide 40/282

    colour

    [Figure: the three primary colours, Red, Green and Blue]

  • Slide 41/282

    colour

    [Figure: Red + Blue = Magenta]

  • Slide 42/282

    colour

    [Figure: Green + Red = Yellow]

  • Slide 43/282

    colour

    [Figure: Green + Blue = Cyan]

  • Slide 44/282

    colour

    [Figure: Red + Green + Blue = White]

  • Slide 45/282

    Additive Theory

    Black radiates no light; white (the sun) radiates all light.

    Video is the process of capturing and radiating light, therefore it uses Additive (Light) Theory, not Subtractive (Pigment) Theory.

    The primary colours in Additive Theory are Red (R), Green (G) and Blue (B).

    The primary colours add together to make white.

    Light Theory is also called Additive Theory.

    Light Theory is used in television, theater lighting, computer monitors, and video production.

  • Slide 46/282

    The Colour Wheel

    [Figure: how the colour wheel is formed]

  • Slide 47/282

    colour Perception (colour Theory)

    Hue
    - distinguishes named colours, e.g., R, G, B
    - the dominant wavelength of the light

    Saturation
    - perceived intensity of a specific colour
    - how far a colour is from a gray of equal intensity

    Brightness (lightness)
    - perceived intensity

    [Figure: hue scale, plus saturation and lightness variations of an original image. Source: Wikipedia]

  • Slide 48/282

    The Colour Wheel

    Colours on the wheel can be described using three parameters:
    1. Hue: degrees from 0 to 360
    2. Saturation: brightness or dullness
    3. Value: lightness or darkness

    (As suggested by Albert Henry Munsell in A Colour Notation, 1905)

  • Slide 49/282

    The Colour Wheel: Hue

    Hue or Spectral Colour is represented as an angle.

    Primary Colours: 0° = Red, 120° = Green, 240° = Blue

    Secondary Colours: 60° = Yellow, 180° = Cyan, 300° = Magenta
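These hue angles can be checked with Python's standard-library colorsys module (a real stdlib API; rgb_to_hsv takes components in 0..1 and returns hue in 0..1, here rescaled to degrees):

```python
import colorsys

def hue_degrees(r, g, b):
    """Hue angle (0-360°) of an RGB colour with 8-bit components."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0
```

Pure red maps to 0°, pure green to 120°, pure blue to 240°, and the secondaries land halfway between their two primaries.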

  • Slide 50/282

    The Colour Wheel: Saturation

    Saturation or Chroma is the intensity of a colour.

    A highly saturated colour is bright and appears closer to the edge of the wheel.

    A more unsaturated colour is dull.

    A colour with no saturation is achromatic, or in the grey scale.

  • Slide 51/282

    The Colour Wheel: Value

    "the quality by which we distinguish a light colour from a dark one." - Albert Henry Munsell, A Colour Notation, 1905

    Value represents the luminescent contrast between black and white.

  • Slide 52/282

    The Colour Wheel in 3D

    Three parameters to describe a colour: Hue, Chroma, Value

  • Slide 53/282

    MANY more scientific models exist based on different colour theories. (Example: the Colour Tree by American artist Albert Henry Munsell, from A Colour Notation, 1905.)

  • Slide 54/282

    Colour Schemes

    Systematic ways of selecting colours:
    - Monochromatic
    - Complementary
    - Analogous
    - Warm
    - Cool
    - Achromatic, Chromatic Grays

  • Slide 55/282

    Colour Schemes: Monochromatic

    Monochromatic: one hue, many values of tint and shade.

    Artist: Marc Chagall. Title: Les Amants Sur Le Toit.

  • Slide 56/282

    Colour Schemes: Complementary (note spelling -- NOT complimentary)

    Complementary: colours that are opposite on the wheel. High contrast.

    Artist: Paul Cezanne. Title: La Montagne Sainte-Victoire. Year: 1886-88.

  • Slide 57/282

    Colour Schemes: Analogous

    Analogous: a selection of colours that are adjacent. Minimal contrast.

    Artist: Vincent van Gogh. Title: The Iris. Year: 1889.

  • Slide 58/282

    Colour Schemes: Warm

    Warm: the first half of the wheel gives the warmer colours. The colours of fire.

    Artist: Jan Vermeer. Title: Girl Asleep at a Table. Year: 1657.

  • Slide 59/282

    Colour Schemes: Cool

    Cool: the second half of the wheel gives the cooler colours.

    Artist: Pablo Picasso. Title: Femme Allonge Lisant. Year: 1939.

  • Slide 60/282

    Colour Schemes: Achromatic, Chromatic Grays

    Achromatic: black and white with all the grays in-between.

    Chromatic Grays: also called neutral relief. Dull colours, low contrast.

  • Slide 61/282

    Additive Color Mixing

    The mixing of light.

    Primary: Red, Green, Blue

    The complementary colors: Cyan, Magenta, Yellow

    White means all three primaries are present at full intensity.

  • Slide 62/282

    Subtractive Color Mixing (1/2)

    The mixing of pigment.

    Primary: Cyan, Magenta, Yellow

    The complementary colors: Red, Green, Blue

    Why black?

  • Slide 63/282

    Subtractive Color Mixing (2/2)

    Why? Pigments absorb light.

    Thinking: the color filters.

    Question: Yellow + Cyan = ?
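The filter view of subtractive mixing can be sketched with a simple model (my own illustrative formulation, not from the slides): each pigment transmits a fraction of the light in each additive-primary channel, and mixing multiplies the transmittances channel-wise.

```python
def subtractive_mix(*pigments):
    """Mix pigments modelled as (r, g, b) transmittances of white light."""
    r, g, b = 1.0, 1.0, 1.0          # start from white light
    for pr, pg, pb in pigments:
        r, g, b = r * pr, g * pg, b * pb   # each pigment filters the light again
    return r, g, b

YELLOW = (1.0, 1.0, 0.0)  # transmits red and green, absorbs blue
CYAN = (0.0, 1.0, 1.0)    # transmits green and blue, absorbs red
```

Yellow absorbs blue and cyan absorbs red, so only green survives both filters: Yellow + Cyan = Green.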

  • Slide 64/282

    Image Acquisition

  • Slide 65/282

    Sensor: Charge-coupled device (CCD)

    A special sensor that captures an image: a light-sensitive silicon solid-state device composed of many cells.

    When exposed to light, each cell becomes electrically charged. This charge can then be converted to an 8-bit value, where 0 represents no exposure while 255 represents very intense exposure of that cell to light.

    The electromechanical shutter is activated to expose the cells to light for a brief moment.

    Some of the columns are covered with a black strip of paint. The light intensity of these pixels is used for zero-bias adjustments of all the cells.

    The electronic circuitry, when commanded, discharges the cells, activates the electromechanical shutter, and then reads the 8-bit charge value of each cell. These values can be clocked out of the CCD by external logic through a standard parallel bus interface.

    [Figure: CCD layout showing the lens area, pixel rows and columns, the covered columns, the electronic circuitry, and the electromechanical shutter]
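The zero-bias adjustment described above can be sketched as follows (the data layout and function name are my own assumptions; the idea is simply to subtract the average reading of the painted-over "dark" columns from every cell):

```python
def zero_bias_correct(cells, dark_columns):
    """Subtract the mean dark-column reading (sensor bias) from every cell,
    clamping the result back to the 8-bit range 0..255."""
    dark = [row[c] for row in cells for c in dark_columns]
    bias = sum(dark) / len(dark)
    return [[max(0, min(255, round(v - bias))) for v in row] for row in cells]
```

Since the covered cells receive no light, whatever they report is pure sensor bias, which this removes from the exposed pixels.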


  • Slide 67/282

    Image Sampling And Quantisation

    Remember that a digital image is always only

    an approximation of a real world scene

  • Slide 68/282

    An Image is... (1/2)

    [Figure: three 7x7 matrices of hexadecimal amplitude values, the Red, Green and Blue planes of a small image patch.]

    An image is made up of pixels; each pixel is a combination of Red, Green and Blue color amplitudes.

    It is represented by I(x,y), a function of the two spatial coordinates of the image plane. I(x,y) is the intensity of the image at the point (x,y) on the image plane. f(x,y) can also be used interchangeably as the function name.

    A color image is represented by R(x,y), G(x,y), B(x,y).

  • Slide 69/282

    An Image is... (2/2)

    A color image is just three functions pasted together. We can write this as a vector-valued function:

        f(x, y) = [ r(x, y), g(x, y), b(x, y) ]^T

    We'll focus on grayscale (scalar-valued) images for now.

  • Slide 70/282

    Spatial and Frequency Domains

    Spatial domain: refers to the planar region of intensity values at time t; written f(x,y).

    Frequency domain: treats an image plane as a sum of sinusoidal functions of changing intensity values, organizing pixels according to their changing intensity (frequency); written F(sx, sy).

    (Source: CS 414, Spring 2009)

  • Slide 71/282

  • Slide 72/282

    Image Transformation

    Fourier Transform

  • Slide 73/282

    Filtering in the Frequency Domain

  • Slide 74/282

    Fourier transforms

    We can represent a function as a linear combination (weighted sum) of sines and cosines.

    We can think of a function in two complementary ways: spatially, in the spatial domain, or in terms of its frequency content, in the frequency domain.

    The Fourier transform and its inverse convert between these two domains:

        F(s) = ∫ f(x) e^(-i2πsx) dx      (spatial to frequency)
        f(x) = ∫ F(s) e^(+i2πsx) ds      (frequency to spatial)
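The discrete counterpart of the forward transform can be sketched directly from the formula (a naive O(N²) DFT for illustration, not an efficient FFT):

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform: F[s] = sum_x f[x] e^(-i 2π s x / N)."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * s * x / n)
                for x in range(n))
            for s in range(n)]
```

For a constant signal all the energy lands in F[0], the DC term, matching the interpretation of F(0) as the average value.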

  • Slide 75/282

    2D Fourier transform

        F(sx, sy) = ∫∫ f(x, y) e^(-i2π sx x) e^(-i2π sy y) dx dy      (spatial to frequency)
        f(x, y) = ∫∫ F(sx, sy) e^(+i2π sx x) e^(+i2π sy y) dsx dsy    (frequency to spatial)

  • Slide 76/282

    Fourier transforms (cont'd)

    Where do the sines and cosines come in?

        F(s) = ∫ f(x) e^(-i2πsx) dx
        f(x) = ∫ F(s) e^(+i2πsx) ds

    f(x) is usually a real signal, but F(s) is generally complex:

        F(s) = Re(s) + i Im(s) = |F(s)| e^(iφ(s))

    If f(x) is symmetric, i.e., f(x) = f(-x), then F(s) = Re(s).

  • Slide 77/282

    Fourier transforms (cont'd)

        F(sx, sy) = ∫∫ f(x, y) e^(-i2π(sx x + sy y)) dx dy

    What if f(x, y) were separable? That is,

        f(x, y) = f1(x) f2(y)

    Then

        F(sx, sy) = ∫∫ f1(x) f2(y) e^(-i2π(sx x + sy y)) dx dy

    Breaking up the exponential,

        F(sx, sy) = ∫∫ f1(x) e^(-i2π sx x) f2(y) e^(-i2π sy y) dx dy

  • Slide 78/282

    Fourier transforms (cont'd)

    Separating the integrals,

        F(sx, sy) = [∫ f1(x) e^(-i2π sx x) dx] [∫ f2(y) e^(-i2π sy y) dy] = F1(sx) F2(sy)

    Using this separability, a 2D transform can be computed with 1D transforms:
    - the spatial domain image is first transformed into an intermediate image using N one-dimensional Fourier Transforms.
    - This intermediate image is then transformed into the final image, again using N one-dimensional Fourier Transforms.
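The row-column procedure above can be sketched by reusing a 1D transform twice (a naive 1D DFT is included so the sketch is self-contained; function names are illustrative):

```python
import cmath

def dft(signal):
    """Naive 1D DFT: F[s] = sum_x f[x] e^(-i 2π s x / N)."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * s * x / n)
                for x in range(n)) for s in range(n)]

def dft2(image):
    """Row-column 2D DFT: 1D transforms over every row, then over every column."""
    rows = [dft(row) for row in image]                   # intermediate image
    cols = [dft([rows[y][x] for y in range(len(rows))])  # transform each column
            for x in range(len(rows[0]))]
    # reassemble as result[y][x]
    return [[cols[x][y] for x in range(len(cols))] for y in range(len(cols[0]))]
```

For an N x N image this costs 2N one-dimensional transforms, which is exactly the two-pass scheme described above.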

  • Slide 79/282

  • Slide 80/282

    2D Fourier transform

        F(sx, sy) = ∫∫ f(x, y) e^(-i2π sx x) e^(-i2π sy y) dx dy      (spatial to frequency)
        f(x, y) = ∫∫ F(sx, sy) e^(+i2π sx x) e^(+i2π sy y) dsx dsy    (frequency to spatial)

    - The Fourier Transform is used to decompose an image into its sine and cosine components.
    - The output of the transformation represents the image in the frequency domain, while the input image is the spatial equivalent.
    - In the Fourier domain image, each point represents a frequency contained in the spatial domain image.

  • Slide 81/282

    Example - revisited

    [Figure: spatial domain f(x,y) and frequency domain F(sx, sy)]

    The Fourier image has which two basic components?

  • Slide 82/282

    Example - revisited

    [Figure: spatial domain f(x,y) and frequency domain F(sx, sy)]

    The Fourier image is shown in such a way that the DC value F(0,0) is displayed in the center of the image. The further away from the center an image point is, the higher is its corresponding frequency.

  • Slide 83/282

    Example - revisited

    [Figure: spatial domain f(x,y) and frequency domain F(sx, sy)]

    The Fourier image is shown in such a way that the DC value F(0,0) is displayed in the center of the image. The further away from the center an image point is, the higher is its corresponding frequency.

    We can see that the DC value is by far the largest component of the image. However, the dynamic range of the Fourier coefficients (i.e. the intensity values in the Fourier image) is too large to be displayed directly, as in the image above.
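The dynamic-range problem noted above is usually handled by displaying a logarithm of the magnitudes rather than the magnitudes themselves; a minimal sketch (the scaling constant and function name are my own choices):

```python
import math

def log_scale(magnitudes, max_display=255.0):
    """Compress the spectrum's dynamic range for display: D = c * log(1 + |F|),
    with c chosen so the largest coefficient maps to max_display."""
    peak = max(magnitudes)
    c = max_display / math.log(1.0 + peak)
    return [c * math.log(1.0 + m) for m in magnitudes]
```

The log curve boosts the small high-frequency coefficients relative to the dominant DC term, so structure away from the center becomes visible.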

  • Slide 84/282

    Why FT

        F(sx, sy) = ∫∫ f(x, y) e^(-i2π sx x) e^(-i2π sy y) dx dy
        f(x, y) = ∫∫ F(sx, sy) e^(+i2π sx x) e^(+i2π sy y) dsx dsy

    The Fourier Transform contains a set of samples which is large enough to fully describe the spatial domain image.

    The number of frequencies corresponds to the number of pixels in the spatial domain image, i.e. the image in the spatial and Fourier domains are of the same size.

  • Slide 85/282

    2D - FT

        F(sx, sy) = ∫∫ f(x, y) e^(-i2π sx x) e^(-i2π sy y) dx dy
        f(x, y) = ∫∫ F(sx, sy) e^(+i2π sx x) e^(+i2π sy y) dsx dsy

    Where:
    - f(x, y) is the image in the spatial domain, and
    - the exponential term is the basis function corresponding to each point F(sx, sy) in the Fourier space.

    The equation can be interpreted as: the value of each point F(sx, sy) is obtained by multiplying the spatial image with the corresponding basis function and summing the result.

  • Slide 86/282

    1D Fourier examples

  • Slide 87/282

    2D Fourier examples

    [Figure: spatial domain f(x,y) images and their frequency domain F(sx, sy) spectra]

  • Slide 88/282

    2D Fourier examples

    [Figure: DFTs of two images and of their combination 0.25 * A + 0.75 * B, illustrating the linearity of the transform]

  • Slide 89/282

    Summary

    We have looked at:
    - Human visual system
    - Light and the electromagnetic spectrum
    - Colors in imaging
    - Image capture, sensing and acquisition
    - Image representation: Fourier and spatial domains
    - Sampling, quantisation and resolution

    Next topic: image enhancement techniques

  • Slide 90/282

    Image Pre-processing

    Motivation: filtering and resizing

  • Slide 91/282

    Motivation: filtering and resizing

    Pre-processing

    What if we now want to:

    smooth an image?

    sharpen an image?

    shrink an image?

    Before we try these operations, let's revisit and think about images in a more mathematical way.

  • Slide 92/282

    Image Resolution

    How many pixels

    Spatial resolution

    How many shades of grey/colours

    92

    How many frames per second

    Temporal resolution

Nyquist's theorem

  • 8/12/2019 Video Technology 2013-14 - Minor II

    93/282

Nyquist's Theorem

A periodic signal can be reconstructed if the sampling interval is half the period (or less).

An object can be detected if two samples span it.

    93

  • 8/12/2019 Video Technology 2013-14 - Minor II

    94/282

    Spatial Resolution

    94

    n, n/2, n/4, n/8, n/16 and n/32 pixels on a side.

  • 8/12/2019 Video Technology 2013-14 - Minor II

    95/282

    Amplitude Resolution

    Humans can see:

    About 40 shades of brightness

    About 7.5 million shades of colour

    95

    Depends on signal to noise ratio

    40 dB equates to about 20 shades

    Images captured:

    256 shades

  • 8/12/2019 Video Technology 2013-14 - Minor II

    96/282

    Shades of Grey

    96

    256, 16, 4 and 2 shades.


  • 8/12/2019 Video Technology 2013-14 - Minor II

    97/282

    Temporal Resolution (will be done

    later)

Nyquist's theorem for temporal data

    How much does an object move between frames?

    Can motion be understood unambiguously?

    97


    Filtering in the Frequency Domain


  • 8/12/2019 Video Technology 2013-14 - Minor II

    101/282

    Some Basic Filters and Their Functions

Multiply all values of F(u,v) by the filter function (notch filter):

H(u, v) = 0 if (u, v) = (M/2, N/2), and 1 otherwise.

All this filter would do is set F(0,0) to zero (force the average value of the image to zero) and leave all other frequency components of the Fourier transform untouched.

  • 8/12/2019 Video Technology 2013-14 - Minor II

    102/282

    Basic 2D Filters and Their Functions

    Lowpass filter

    Highpass filter

  • 8/12/2019 Video Technology 2013-14 - Minor II

    103/282

    Convolution

  • 8/12/2019 Video Technology 2013-14 - Minor II

    104/282

    2D Convolution

    Lets Try it with Two-Dimensions!

  • 8/12/2019 Video Technology 2013-14 - Minor II

    105/282

Let's Try it with Two-Dimensions! This image exclusively has 32 cycles

    in the vertical direction.

    This image exclusively has 8 cycles in

    the horizontal direction.

    So what is going on here?

    The u axis runs from left to right and it represents

    the horizontal component of the frequency. The v

axis runs up and down and it corresponds to vertical components of the frequency.

    x-y coordinate system

    Fourier Transform

    You will notice that the second example is a

    little more smeared out. This is because the

    lines are more blurred so more sine waves are

    required to build it. The transform is weighted

    so brighter spots indicate sine waves more

    frequently used.

    The central dot is an average of all the sine waves

    so it is usually the brightest dot and used as a

    point of reference for the rest of the points.

    Since this is inverse space, dots close to the origin

    will be further apart in real space than dots that

    are far apart on the Fourier Transform. (Again

    keeping in mind that these dots refer to the

    frequency of a component wave.)

    u-v coordinate system


  • 8/12/2019 Video Technology 2013-14 - Minor II

    106/282

    Magnitude vs. Phase

The Fourier Transform is defined as an infinite integral. Since computers don't like infinite integrals, a Fast Fourier Transform makes it simpler:

f(u, v) = Σx Σy F(x, y) e^(-i2π(u·x + v·y)/N)

where F(x, y) is real and f(u, v) is complex.

So what do we do with this? Instead of representing the complex numbers as real and imaginary parts, we can represent them as Magnitude and Phase, defined as:

Magnitude(f) = sqrt(Re² + Im²)

Phase(f) = arctan(Im / Re)

Magnitude tells how much of a certain frequency component is in the image. Phase tells where that certain frequency lies in the image.

(The two example images are shifted by π with respect to each other.)
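A small sketch of these two quantities in code: np.abs and np.angle compute exactly sqrt(Re² + Im²) and arctan(Im/Re) of the complex spectrum; the 1D test signal is an assumption:

```python
import numpy as np

# A 1D signal with a single sine component: all its energy sits in one bin.
N = 64
n = np.arange(N)
signal = np.sin(2 * np.pi * 4 * n / N)

spectrum = np.fft.fft(signal)
magnitude = np.abs(spectrum)    # sqrt(Re^2 + Im^2): how much of each frequency
phase = np.angle(spectrum)      # arctan(Im / Re): where that frequency lies

# Shifting the signal leaves the magnitude unchanged but changes the phase.
shifted = np.roll(signal, 8)
mag_shifted = np.abs(np.fft.fft(shifted))
same_magnitude = bool(np.allclose(magnitude, mag_shifted))
```

Shifting moves information into the phase only, which is why the magnitude spectrum alone cannot say where a frequency lies in the image.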

  • 8/12/2019 Video Technology 2013-14 - Minor II

    107/282

    Examples

    Fourier

    Convolution

  • 8/12/2019 Video Technology 2013-14 - Minor II

    108/282

    Fourier Transform

  • 8/12/2019 Video Technology 2013-14 - Minor II

    109/282

    2D-Fourier Transform

  • 8/12/2019 Video Technology 2013-14 - Minor II

    110/282

    Fourier Transform

  • 8/12/2019 Video Technology 2013-14 - Minor II

    111/282

    Fourier Transform

  • 8/12/2019 Video Technology 2013-14 - Minor II

    112/282

    Image Pre-processing (2D Convolution)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    113/282

    Convolution

One of the most common methods for filtering a function is called convolution.

In 1D, convolution is defined as:

g(x) = f(x) * h(x) = ∫ f(x′) h(x - x′) dx′

  • 8/12/2019 Video Technology 2013-14 - Minor II

    114/282

    Convolution properties

Convolution exhibits a number of basic, but important properties.

Commutativity:   a(x) * b(x) = b(x) * a(x)

Associativity:   [a(x) * b(x)] * c(x) = a(x) * [b(x) * c(x)]

Linearity:       a(x) * [k·b(x)] = k·[a(x) * b(x)]
                 a(x) * (b(x) + c(x)) = a(x) * b(x) + a(x) * c(x)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    115/282

    Convolution in 2D

    In two dimensions, convolution becomes:

g(x, y) = f(x, y) * h(x, y) = ∬ f(x′, y′) h(x - x′, y - y′) dx′ dy′

  • 8/12/2019 Video Technology 2013-14 - Minor II

    116/282

    Discrete convolution

    For a digital signal, we define discrete convolution as:

g[i] = f[i] * h[i] = Σ_i′ f[i′] h[i - i′]
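A direct implementation of this sum, checked against NumPy's np.convolve (the signal and mask values are arbitrary test data):

```python
import numpy as np

def conv1d(f, h):
    """Discrete convolution g[i] = sum_k f[k] * h[i - k] (full output, len(f)+len(h)-1)."""
    g = np.zeros(len(f) + len(h) - 1)
    for i in range(len(g)):
        for k in range(len(f)):
            if 0 <= i - k < len(h):
                g[i] += f[k] * h[i - k]
    return g

f = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, -1.0])            # a difference mask

g = conv1d(f, h)
matches_numpy = bool(np.allclose(g, np.convolve(f, h)))
commutes = bool(np.allclose(np.convolve(f, h), np.convolve(h, f)))
```

The same loop also makes the commutativity property above easy to check numerically.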

  • 8/12/2019 Video Technology 2013-14 - Minor II

    117/282

    1D convolution theorem example


  • 8/12/2019 Video Technology 2013-14 - Minor II

    118/282

    Discrete convolution in 2D

    Similarly, discrete convolution in 2D becomes:

g[i, j] = f[i, j] * h[i, j] = Σ_i′ Σ_j′ f[i′, j′] h[i - i′, j - j′]
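The same double sum can be sketched with a 3x3 averaging mask; only fully-overlapping ("valid") output positions are computed, and the toy image is an assumption:

```python
import numpy as np

def conv2d_valid(f, h):
    """g[i,j] = sum_{k,l} f[k,l] * h[i-k, j-l], keeping only fully-overlapping positions."""
    hf = np.flip(h)  # convolution flips the mask (plain correlation would not)
    H, W = f.shape
    m, n = hf.shape
    out = np.zeros((H - m + 1, W - n + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(f[i:i+m, j:j+n] * hf)
    return out

avg_mask = np.ones((3, 3)) / 9.0                     # 3x3 average (box) filter
img = np.arange(25, dtype=float).reshape(5, 5)
smoothed = conv2d_valid(img, avg_mask)               # each output = mean of a 3x3 window
```

For the symmetric box mask, the flip has no effect; it matters for asymmetric masks such as derivative filters.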


  • 8/12/2019 Video Technology 2013-14 - Minor II

    119/282

    2D convolution theorem example

    *

    f(x,y) |F(sx,sy)|

    h(x,y)

    g(x,y)

    |H(sx,sy)|

    |G(sx,sy)|


  • 8/12/2019 Video Technology 2013-14 - Minor II

    120/282

    Reconstruction filters in 2D

    We can perform reconstruction in 2D

    Example problem

    Find the Fourier transform of

  • 8/12/2019 Video Technology 2013-14 - Minor II

    121/282

    Find the Fourier transform of

    Example problem: Answer.

    Find the Fourier transform of

  • 8/12/2019 Video Technology 2013-14 - Minor II

    122/282

    Find the Fourier transform of

f(x) = rect(x/4) - tri(x/2) + 0.5·tri(x)

Using the Fourier transforms of rect and tri and the linearity and scaling properties,

F(u) = 4sinc(4u) - 2sinc²(2u) + 0.5sinc²(u)

    Example problem: Alternative Answer.

    Find the Fourier transform of

  • 8/12/2019 Video Technology 2013-14 - Minor II

    123/282

    Find the Fourier transform of

f(x) = rect(x/4) - 0.5·(rect(x/3) * rect(x))

Using the Fourier transform of rect and the linearity, scaling, and convolution properties,

F(u) = 4sinc(4u) - 1.5sinc(3u)sinc(u)

    Plane waves

Let's get an intuitive feel for the plane wave e^(i2π(ux + vy)).

The period is the distance between successive maxima of the wave; lines of constant phase define the direction of the undulation (an undulation in the complex plane).

    Plane waves: sine and cosine waves

  • 8/12/2019 Video Technology 2013-14 - Minor II

    125/282

sin(2π·ν·x)

cos(2π·ν·x)

    Plane waves: sine waves in the complex plane.

  • 8/12/2019 Video Technology 2013-14 - Minor II

    126/282

sin(10·π·x)

sin(10·π·x + 4·π·y)

    Two-Dimensional Fourier Transform

  • 8/12/2019 Video Technology 2013-14 - Minor II

    127/282

Two-Dimensional Fourier Transform:

F(u, v) = ∬ f(x, y) e^(-i2π(ux + vy)) dx dy

Two-Dimensional Inverse Fourier Transform:

f(x, y) = ∬ F(u, v) e^(i2π(ux + vy)) du dv

where in f(x, y), x and y are real, not complex, variables. F(u, v) gives the amplitude and phase of the required basis functions.

    Separable Functions

  • 8/12/2019 Video Technology 2013-14 - Minor II

    128/282

What if f(x, y) were separable? That is, f(x, y) = f1(x) · f2(y).

Two-Dimensional Fourier Transform:

F(u, v) = ∬ f1(x) f2(y) e^(-i2π(ux + vy)) dx dy

Breaking up the exponential,

F(u, v) = ∬ f1(x) e^(-i2πux) f2(y) e^(-i2πvy) dx dy

Separating the integrals,

F(u, v) = [∫ f1(x) e^(-i2πux) dx] · [∫ f2(y) e^(-i2πvy) dy]

F(u, v) = F1(u) · F2(v)
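This factorization can be checked numerically: for a separable image, the 2D FFT equals the outer product of the two 1D FFTs (the random test profiles below are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
f1 = rng.standard_normal(16)    # row profile   f1(x)
f2 = rng.standard_normal(16)    # column profile f2(y)

# Separable image: f[y, x] = f2(y) * f1(x)
f = np.outer(f2, f1)

F_2d = np.fft.fft2(f)
F_sep = np.outer(np.fft.fft(f2), np.fft.fft(f1))   # F2(v) * F1(u)

separable_transform = bool(np.allclose(F_2d, F_sep))
```

This is also why separable filters (e.g., the Gaussian) can be applied as two cheap 1D passes instead of one 2D pass.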

  • 8/12/2019 Video Technology 2013-14 - Minor II

    130/282

    Fourier Transform

f(x, y) = cos(10πx) · 1

F(u, v) = 1/2 [δ(u+5, v) + δ(u-5, v)]

(Figure: Real and Imaginary parts of F(u, v) on u-v axes.)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    131/282

  • 8/12/2019 Video Technology 2013-14 - Minor II

    132/282

    Fourier Transform

f(x, y) = sin(40πx)

F(u, v) = i/2 [δ(u+20, v) - δ(u-20, v)]

(Figure: Real and Imaginary parts of F(u, v) on u-v axes.)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    133/282

    Fourier Transform

f(x, y) = sin(20πx + 10πy)

F(u, v) = i/2 [δ(u+10, v+5) - δ(u-10, v-5)]

(Figure: Real and Imaginary parts of F(u, v) on u-v axes.)

    Reconstruction filters in 2D

  • 8/12/2019 Video Technology 2013-14 - Minor II

    134/282

    Reconstruction filters in 2D

    We can perform reconstruction in 2D

  • 8/12/2019 Video Technology 2013-14 - Minor II

    135/282

    Image Pre-processing (Sampling)


  • 8/12/2019 Video Technology 2013-14 - Minor II

    136/282

    Image Sampling And Quantisation (cont)

    Remember that a digital image is always only

    an approximation of a real world scene

    Sampling

  • 8/12/2019 Video Technology 2013-14 - Minor II

    137/282

    Now, we can talk about sampling.

    Sampling

The Fourier spectrum gets replicated by spatial sampling!

    How do we recover the signal?

    Reconstruction filters

  • 8/12/2019 Video Technology 2013-14 - Minor II

    138/282

    Reconstruction filters

    The sinc filter, while ideal, has two drawbacks:

    It has large support (slow to compute)

    It introduces ringing in practice

    We can choose from many other filters


    Reconstruction filters in 2D

  • 8/12/2019 Video Technology 2013-14 - Minor II

    140/282

    Reconstruction filters in 2D

    We can also perform reconstruction in 2D

  • 8/12/2019 Video Technology 2013-14 - Minor II

    141/282

    MINOR I

  • 8/12/2019 Video Technology 2013-14 - Minor II

    142/282

    Image Pre-processing (Aliasing)

    Aliasing

  • 8/12/2019 Video Technology 2013-14 - Minor II

    143/282

    Aliasing

    Sampling rate is too low

    Aliasing

  • 8/12/2019 Video Technology 2013-14 - Minor II

    144/282

    Aliasing

    What if we go below the Nyquist frequency?

    Anti-aliasing

  • 8/12/2019 Video Technology 2013-14 - Minor II

    145/282

Anti-aliasing

Anti-aliasing is the process of removing the frequencies before they alias.

    Anti-aliasing by analytic prefiltering

  • 8/12/2019 Video Technology 2013-14 - Minor II

    146/282

We can fill the magic box with analytic pre-filtering of the signal:

    Why may this not generally be possible?
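A small sketch of what goes wrong without prefiltering: a 7 Hz tone sampled at only 10 Hz (below its Nyquist rate of 14 Hz) shows up at the alias frequency 10 - 7 = 3 Hz:

```python
import numpy as np

fs = 10                      # sampling rate in Hz: too low for a 7 Hz tone
f_true = 7
n = np.arange(fs)            # one second of samples
samples = np.sin(2 * np.pi * f_true * n / fs)

# The strongest bin of the sampled spectrum is the alias, not the true frequency.
spectrum = np.abs(np.fft.rfft(samples))
f_detected = int(np.argmax(spectrum))
```

Prefiltering removes the 7 Hz component before sampling; once the samples are taken, the 3 Hz alias is indistinguishable from a real 3 Hz tone.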

  • 8/12/2019 Video Technology 2013-14 - Minor II

    147/282

    MINOR II

  • 8/12/2019 Video Technology 2013-14 - Minor II

    148/282

    Image processing/Enhancement (Noise reduction)

    What Is Image Enhancement?

  • 8/12/2019 Video Technology 2013-14 - Minor II

    149/282

    What Is Image Enhancement?

    Image enhancement is the process of making

    images more useful

    The reasons for doing this include:

    g g ng n eres ng e a n mages

    Removing noise from images

    Making images more visually appealing

Noise

  • 8/12/2019 Video Technology 2013-14 - Minor II

    150/282

In signal processing, it is often desirable to be able to perform some kind of noise reduction on an image or signal.

Image processing is also useful for noise reduction and edge enhancement. We will focus on these applications for the remainder of the lecture.

Common types of noise:

Salt and pepper noise: contains random occurrences of black and white pixels

Impulse noise: contains random occurrences of white pixels

Gaussian noise: variations in intensity drawn from a Gaussian normal distribution
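These noise models are easy to synthesize for experiments; the amplitudes and densities below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(42)
clean = np.full((64, 64), 128.0)          # flat grey test image

# Gaussian noise: intensity perturbations drawn from a normal distribution.
gaussian = np.clip(clean + rng.normal(0, 10, clean.shape), 0, 255)

# Salt and pepper noise: random pixels forced to pure black (0) or white (255).
salt_pepper = clean.copy()
coords = rng.random(clean.shape)
salt_pepper[coords < 0.05] = 0            # pepper
salt_pepper[coords > 0.95] = 255          # salt

has_outliers = bool(((salt_pepper == 0) | (salt_pepper == 255)).any())
```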

    Ideal noise reduction

  • 8/12/2019 Video Technology 2013-14 - Minor II

    151/282

    Ideal noise reduction

    151

    Ideal noise reduction

  • 8/12/2019 Video Technology 2013-14 - Minor II

    152/282

    Ideal noise reduction

    152

    Practical noise reduction

  • 8/12/2019 Video Technology 2013-14 - Minor II

    153/282

    Practical noise reduction

    How can we smooth away noise in a single image?

    153

    Example revisited: 2D Convolution


  • 8/12/2019 Video Technology 2013-14 - Minor II

    154/282

    mask (average filter)

    Effect of average filters

  • 8/12/2019 Video Technology 2013-14 - Minor II

    155/282


    155

    Median Filtering

  • 8/12/2019 Video Technology 2013-14 - Minor II

    156/282

    Median Filtering

    The median filter is another digital filtering technique, often used to remove

    noise. Median filtering is very widely used in digital image processing because it

    preserves edges while removing noise.

    Median filters

  • 8/12/2019 Video Technology 2013-14 - Minor II

    157/282

It replaces the value of the center pixel with the median of the intensity values in the neighborhood of that pixel.

    Median filtering is an operation often used in image processing to reduce

    "salt and pepper" noise. A median filter is more effective than convolution

    when the goal is to simultaneously reduce noise and preserve edges.

Median filters are particularly effective in the presence of impulse noise,

    also called salt and pepper noise because of its appearance as white and

    black dots superimposed on an image.

    For every pixel, a 3x3 neighborhood with the pixel as center is considered.

    In median filtering, the value of the pixel is replaced by the median of the

    pixel values in the 3x3 neighborhood.
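A minimal sketch of this 3x3 median filter (border pixels are simply copied; the flat test image with one outlier is an assumption):

```python
import numpy as np

def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 neighborhood."""
    out = img.copy()
    H, W = img.shape
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            out[i, j] = np.median(img[i-1:i+2, j-1:j+2])
    return out

# A flat region of 10s with one "salt" outlier of 150 in the middle.
img = np.full((5, 5), 10)
img[2, 2] = 150
filtered = median_filter_3x3(img)   # the outlier is replaced by the median, 10
```

Because the 150 is a single outlier, it never reaches the middle of any sorted 3x3 neighborhood, so the filter removes it without blurring its surroundings.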


Median Filtering

  • 8/12/2019 Video Technology 2013-14 - Minor II

    159/282

Median filtering is useful for removing noise but usefully preserves edges.

The median is the central value in a range:

Median {4, 2, 0, 1, 3, 0, 5} = 2

Median filtering is a popular low-pass filtering method. Pixel values are sorted and the median (middle value) is output.

Median filtering removes sparse outliers. Sparse outliers appear as salt and pepper noise in images, i.e., dark pixels in light areas and light pixels in dark areas. This type of noise was common in analogue television.

You will use some simple filters in the laboratory. A median filter will be used to remove noise.

Passing a 3x3 median filter over the image pixels shown produces the output on the right. Notice how the outlier (the 150) is removed.

    Effect of median filters

  • 8/12/2019 Video Technology 2013-14 - Minor II

    160/282

    160

    Mean and Median Filtering

  • 8/12/2019 Video Technology 2013-14 - Minor II

    161/282

X1 X2 X3
X4 X0 X5
X6 X7 X8

Replacing X0 by the mean of X0~X8 is called mean filtering.

Replacing X0 by the median of X0~X8 is called median filtering.

    Gaussian filters

  • 8/12/2019 Video Technology 2013-14 - Minor II

    162/282

Gaussian filters weigh pixels based on their distance from the center of the convolution filter. In particular:

h[i, j] = e^(-(i² + j²)/(2σ²)) / C

    This does a decent job of blurring noise while preservingfeatures of the image.

    What parameter controls the width of the Gaussian?

    What happens to the image as the Gaussian filter kernel

    gets wider? What is the constant C? What should we set it to?
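Answering those questions in a sketch: σ controls the width, and C is conventionally the sum of the unnormalized weights, so that the mask sums to 1 and filtering does not change overall brightness (the radius and σ below are arbitrary):

```python
import numpy as np

def gaussian_kernel(radius, sigma):
    """h[i, j] = exp(-(i^2 + j^2) / (2 sigma^2)) / C, normalized to sum to 1."""
    i, j = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    h = np.exp(-(i**2 + j**2) / (2.0 * sigma**2))
    return h / h.sum()          # C = sum of the unnormalized weights

k = gaussian_kernel(radius=2, sigma=1.0)
center_is_max = bool(k[2, 2] == k.max())   # heaviest weight at the center pixel
weights_sum = float(k.sum())               # normalization makes this 1.0
```

A wider σ spreads the weights out, so the filtered image gets blurrier; as σ grows relative to the radius, the kernel approaches a plain box average.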

    Effect of Gaussian filters

  • 8/12/2019 Video Technology 2013-14 - Minor II

    163/282

    163

    Comparison: Gaussian noise

  • 8/12/2019 Video Technology 2013-14 - Minor II

    164/282


    164

    Comparison: salt and pepper noise

  • 8/12/2019 Video Technology 2013-14 - Minor II

    165/282

    165

  • 8/12/2019 Video Technology 2013-14 - Minor II

    166/282

    Image processing/Enhancement (Edge Detection)

    Edge detection

  • 8/12/2019 Video Technology 2013-14 - Minor II

    167/282

One of the most important uses of image processing is edge detection:

    Really easy for humans

    Really difficult for computers

    167

    Fundamental in computer vision

    Important in many graphics applications

    What is an edge?

  • 8/12/2019 Video Technology 2013-14 - Minor II

    168/282

    168

    Q: How might you detect an edge in 1D?

    Gradients

  • 8/12/2019 Video Technology 2013-14 - Minor II

    169/282

The gradient is the 2D equivalent of the derivative:

∇f(x, y) = (∂f/∂x, ∂f/∂y)

Properties of the gradient:

It's a vector

Points in the direction of maximum increase of f

Magnitude is the rate of increase

How can we approximate the gradient in a discrete image?
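One common approximation is finite differences, i.e., convolving with the mask [-1 1] along each axis; the ramp test image below is an assumption:

```python
import numpy as np

# A ramp image: intensity increases by 2 per column, constant along rows.
img = np.tile(np.arange(0, 12, 2, dtype=float), (4, 1))

# Finite-difference approximations of the two partial derivatives.
dfdx = img[:, 1:] - img[:, :-1]   # horizontal difference (mask [-1 1] along x)
dfdy = img[1:, :] - img[:-1, :]   # vertical difference   (mask [-1 1] along y)

grad_x = float(dfdx[0, 0])        # rate of increase along x
grad_y = float(dfdy[0, 0])        # zero: no change along y
```

The gradient vector here is (2, 0): it points along x, the direction of maximum increase, with magnitude equal to the ramp's slope.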

    Less than ideal edges

  • 8/12/2019 Video Technology 2013-14 - Minor II

    170/282

    170

    Edge Properties

  • 8/12/2019 Video Technology 2013-14 - Minor II

    171/282

An edge has two properties:

how steep it is

its direction, i.e., does the intensity A(x) step up to the left or to the right?

Edge Properties - gradient

  • 8/12/2019 Video Technology 2013-14 - Minor II

    172/282

Consider a 1-d continuous image of an edge, denoted by A(x).

Edge properties can be obtained from the gradient = ΔA/Δx, which becomes dA/dx as Δx → 0.

(Figure: the edge profile A(x) and its gradient.)

    Edge Properties

  • 8/12/2019 Video Technology 2013-14 - Minor II

    173/282

The gradient has two properties: magnitude and direction.

gradient1 = dA1/dx,  gradient2 = dA2/dx

Magnitude, or steepness, is given by |dA/dx|.

Direction, left or right, is given by the sign of dA/dx:

|dA1/dx| = |dA2/dx|,  sgn(dA1/dx) = -sgn(dA2/dx)

    Edge PropertiesEdge Properties--gradientgradient

  • 8/12/2019 Video Technology 2013-14 - Minor II

    174/282

The gradient is given by the first derivative dA/dx.

The second derivative, d²A/dx², generates two peaks, at the beginning and end of the edge. This is called ringing.

(Figure: A(x), dA/dx, and d²A/dx².)

    Edge Properties-discrete gradient

  • 8/12/2019 Video Technology 2013-14 - Minor II

    175/282

(Figure: a discrete edge profile A_i with its first and second differences.)

dA/dx ≈ A_(i+1) - A_i    (convolution with mask B = [-1 1])

d²A/dx² ≈ (A_(i+2) - A_(i+1)) - (A_(i+1) - A_i) = A_(i+2) - 2A_(i+1) + A_i    (mask B = [1 -2 1])

    Steps in edge detection

  • 8/12/2019 Video Technology 2013-14 - Minor II

    176/282

Edge detection algorithms typically proceed in three or four steps:

Filtering: cut down on noise

Enhancement: amplify the difference between edges and non-edges

Detection: use a threshold operation

Localization (optional): estimate geometry of edges beyond pixels

2-d Gradient Operator

Magnitude:  |∇A| = sqrt((dA/dx)² + (dA/dy)²)

Orientation = tan⁻¹((dA/dy) / (dA/dx))

    Review

  • 8/12/2019 Video Technology 2013-14 - Minor II

    178/282

    Image Enhancement / (Pre)Processing

    Noise reduction

    Information Detection

    Histograms

    Image Segmentation

    Image Compression (Next)

    Review

  • 8/12/2019 Video Technology 2013-14 - Minor II

    179/282

    Image Enhancement / (Pre)Processing

    Noise reduction

    Using Filters (coefficients of the filter mask (hi,j) to be

    computed ???)

    Information Detection

    Edge Detection

    Histograms

Image Segmentation

Threshold technique

    Image Compression (Next)

    Review

  • 8/12/2019 Video Technology 2013-14 - Minor II

    180/282

Image Enhancement / (Pre)Processing

Noise reduction

Using Filters (coefficients of the filter mask to be computed using rectangular, gaussian, triangular techniques etc.)

Average Filter mask

Median Filter

Gaussian Filter etc.

Information Detection

Edge Detection

Histograms

Image Segmentation

Threshold technique

  • 8/12/2019 Video Technology 2013-14 - Minor II

    181/282

  • 8/12/2019 Video Technology 2013-14 - Minor II

    182/282

Image gradient

The gradient of an image:

  • 8/12/2019 Video Technology 2013-14 - Minor II

    183/282

    The gradient points in the direction of most rapid change in intensity

    The gradient direction is given by:

    how does this relate to the direction of the edge?

    The edge strength/magnitude is given by the gradient magnitude

    Neighbourhood Operators

  • 8/12/2019 Video Technology 2013-14 - Minor II

    184/282

    First derivative can be calculated by

    convolving with mask ______

    Second derivative can be calculated by

    ________


    Neighbourhood Operators

  • 8/12/2019 Video Technology 2013-14 - Minor II

    186/282

    First derivative can be calculated by

    convolving with mask B=[-1 1].

Second derivative can be calculated by convolving with mask B=[1 -2 1].

    Discrete 2-d gradient operator

  • 8/12/2019 Video Technology 2013-14 - Minor II

    187/282

∇A(x, y):

∂A/∂x ≈ A_(i+1,j) - A_(i,j)

∂A/∂y ≈ A_(i,j+1) - A_(i,j)

(Neighbourhood operators: the difference mask B = [-1 1], applied along i and along j.)

    Gradient Operators for Images

  • 8/12/2019 Video Technology 2013-14 - Minor II

    188/282

Second-order gradient in an image, denoted by ∇²A.

∇²A = ∂²A/∂x² + ∂²A/∂y²    (a scalar)

Laplacian Operator

    Neighbourhood Operators

  • 8/12/2019 Video Technology 2013-14 - Minor II

    189/282

∂²A/∂x² ≈ A_(i+1,j) - 2A_(i,j) + A_(i-1,j)    (convolve along i with B_i = [1 -2 1])

∂²A/∂y² ≈ A_(i,j+1) - 2A_(i,j) + A_(i,j-1)    (convolve along j with B_j = [1 -2 1]ᵀ)

    Neighbourhood Operators

  • 8/12/2019 Video Technology 2013-14 - Minor II

    190/282

∇²A = A * B_i + A * B_j = A * B, where

B =
 0  1  0
 1 -4  1
 0  1  0
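Applying this mask directly: the response is zero on flat regions and strongly negative at an isolated bright pixel (the toy image is an assumption; the mask is symmetric, so correlation and convolution coincide):

```python
import numpy as np

B = np.array([[0,  1, 0],
              [1, -4, 1],
              [0,  1, 0]], dtype=float)

def apply_mask(img, mask):
    """Apply a 3x3 mask to the image interior; borders are left at zero."""
    out = np.zeros_like(img)
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = np.sum(img[i-1:i+2, j-1:j+2] * mask)
    return out

img = np.full((5, 5), 10.0)
img[2, 2] = 50.0                 # one bright pixel on a flat background
L = apply_mask(img, B)           # zero on flat regions, large at the outlier
```

At the bright pixel the response is 4·10 - 4·50 = -160, while over any constant patch the +1 neighbours exactly cancel the -4 centre, giving 0.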

    What is Edge-enhancement?

  • 8/12/2019 Video Technology 2013-14 - Minor II

    191/282

Psychophysical experiments indicate that an

    image with accentuated or crispened edges is

    often more subjectively pleasing than the original

    image.

    Edge enhancement

  • 8/12/2019 Video Technology 2013-14 - Minor II

    192/282

    Laplacian

    Canny edge detector

    Laplacian Image

  • 8/12/2019 Video Technology 2013-14 - Minor II

    193/282

L = ∇²A    (computed with the Laplacian mask B)

    LaplacianLaplacian

  • 8/12/2019 Video Technology 2013-14 - Minor II

    194/282

Add the Laplacian to A(x): the enhanced profile A(x) + L overshoots below and above the edge, which visually crispens it.

(Figure: the edge profile A(x), its second derivative d²A/dx², and the enhanced profile A(x) + L.)

    Neighbourhood Operations

  • 8/12/2019 Video Technology 2013-14 - Minor II

    195/282

A(x, y) + L:

Laplacian mask:        enhanced mask B (A FILTER):
 0  1  0                0 -1  0
 1 -4  1               -1  5 -1
 0  1  0                0 -1  0

(The enhanced mask is the identity mask minus the Laplacian mask.)

    Laplacian

  • 8/12/2019 Video Technology 2013-14 - Minor II

    196/282

    A(x)+L

    Edge

    Enhancement

    x

  • 8/12/2019 Video Technology 2013-14 - Minor II

    197/282

Original image; original image enhanced with the Laplacian.

    Edge detection

  • 8/12/2019 Video Technology 2013-14 - Minor II

    198/282

    original

    Edge detector

  • 8/12/2019 Video Technology 2013-14 - Minor II

    199/282

    thinning

    (non-maximum suppression)

Effect of σ (Gaussian kernel size)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    200/282

original; Canny with large σ; Canny with small σ

The choice of σ depends on desired behavior:

large σ detects large-scale edges

small σ detects fine features

    Gaussian Mask

  • 8/12/2019 Video Technology 2013-14 - Minor II

    201/282

Uses a Gaussian Mask: e^(-x²/(2σ²))

Sigma (σ) is a value chosen by the DESIGNER.

x and y are the distances away from the target pixel; they range from negative to positive across the Mask Radius.

    Example: Gaussian

  • 8/12/2019 Video Technology 2013-14 - Minor II

    202/282

  • 8/12/2019 Video Technology 2013-14 - Minor II

    203/282

    Image Enhancement (Histogram)

    Image Histograms

  • 8/12/2019 Video Technology 2013-14 - Minor II

    204/282

The histogram of an image shows us the distribution of grey levels in the image.

Massively useful, especially in segmentation.

(Figure: frequencies vs. grey levels.)

    Histogram Examples (cont)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    205/282

Images taken from Gonzalez & Woods, Digital Image Processing (2002)

    Histogram Examples (cont)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    206/282


    Histogram Examples (cont)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    207/282


    Histogram Examples (cont)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    208/282


    Histogram Examples (cont)

  • 8/12/2019 Video Technology 2013-14 - Minor II

    209/282

A selection of images and their histograms. Notice the relationships.

Note that the high contrast image has the most evenly spaced histogram.

    VARIANCE and STANDARD DEVIATION

  • 8/12/2019 Video Technology 2013-14 - Minor II

    210/282

of the histogram tell us about the average contrast of the image!

So, from the previous figure: the higher the VARIANCE (and hence the higher the STANDARD DEVIATION), the higher the image's contrast!
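A sketch of both ideas: the histogram via np.histogram, and contrast compared via the standard deviation of the grey levels (the two synthetic images are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
low_contrast = rng.integers(100, 156, size=(64, 64))   # grey levels bunched together
high_contrast = rng.integers(0, 256, size=(64, 64))    # grey levels spread out

# Histogram: frequency of each grey level, 256 bins over [0, 255].
hist, _ = np.histogram(high_contrast, bins=256, range=(0, 256))

# A wider histogram means a larger standard deviation, i.e., more contrast.
more_contrast = float(high_contrast.std()) > float(low_contrast.std())
```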

  • 8/12/2019 Video Technology 2013-14 - Minor II

    211/282

    Image Enhancement (Image Segmentation)

    Image Segmentation

  • 8/12/2019 Video Technology 2013-14 - Minor II

    212/282

Segmentation divides an image into its constituent regions or objects.

Segmentation of images is a difficult task in general. Segmentation allows us to extract objects in images.

    What it is useful for

  • 8/12/2019 Video Technology 2013-14 - Minor II

    213/282

Segmenting the image gives the contours of objects, which can be extracted using edge detection and/or border following techniques.

Shape of objects can be described.

Image segmentation techniques are extensively used in similarity searches, e.g.:

http://elib.cs.berkeley.edu/photos/blobworld/

    Segmentation Algorithms

  • 8/12/2019 Video Technology 2013-14 - Minor II

    214/282

Segmentation algorithms are based on one of two basic properties of color, gray values, or texture: discontinuity and similarity.

The first category partitions an image based on abrupt changes in the image.

The second category partitions an image into regions that are similar according to a predefined criterion. The histogram thresholding approach falls under this category.

    Clustering in Color Space

  • 8/12/2019 Video Technology 2013-14 - Minor II

    215/282

    1. Each image point is mapped to a point in a color space,e.g.:

    Color(i, j) = (R (i, j), G(i, j), B(i, j))

It is a many-to-one mapping.

    2. The points in the color space are grouped to clusters.

    3. The clusters are then mapped back to regions in the image.
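A minimal two-cluster sketch of these three steps, using plain k-means with Euclidean distance in RGB space (the tiny two-tone image and the initial cluster centers are assumptions):

```python
import numpy as np

# Step 1: map each pixel to a point in RGB color space.
img = np.zeros((4, 8, 3), dtype=float)
img[:, 4:] = [200.0, 30.0, 30.0]          # right half reddish, left half black
points = img.reshape(-1, 3)

# Step 2: group the color-space points into k=2 clusters (basic k-means).
centers = np.array([[10.0, 10.0, 10.0], [180.0, 50.0, 50.0]])
for _ in range(10):
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)             # nearest center in Euclidean distance
    centers = np.array([points[labels == k].mean(axis=0) for k in range(2)])

# Step 3: map cluster labels back to regions in the image.
segmented = labels.reshape(4, 8)
```

Real images need more clusters and a robust initialization; this only illustrates the map-cluster-map-back pipeline.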

    Displaying objects in the Segmented Image

  • 8/12/2019 Video Technology 2013-14 - Minor II

    216/282

The objects can be distinguished by assigning an arbitrary pixel value or the average pixel value to the pixels belonging to the same cluster.

    Thus, one needs clustering algorithms

    for image segmentation.


    Homework (still preparing):

    Implement in Matlab and test on some example images the

    clustering in the color space.

    Use Euclidean distance in RGB color space.

    You can use k-means, PAM, or some other clustering algorithm (links: k-means, PAM, data normalization).

    Test images: rose, plane, car, tiger, landscape


    Gray Scale Image Example


    Image of a Finger Print with light background

    Histogram


    Segmented Image


    Image after Segmentation

    Thresholding Bimodal Histograms


Basic Global Thresholding:
1) Select an initial estimate for T.
2) Segment the image using T. This will produce two groups of pixels: G1, consisting of all pixels with gray-level values > T, and G2, consisting of pixels with values <= T.
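The slide's listing is cut off here; in the usual iterative scheme the next steps recompute T as the average of the mean gray levels of G1 and G2 and repeat until T stabilizes. A minimal Python sketch under that assumption, with a made-up bimodal pixel list:

```python
def global_threshold(pixels, t0=None, eps=0.5):
    """Iterative global threshold selection: split at T, recompute T as
    the average of the two group means, repeat until T barely changes."""
    t = t0 if t0 is not None else sum(pixels) / len(pixels)  # initial estimate
    while True:
        g1 = [p for p in pixels if p > t]    # G1: values > T
        g2 = [p for p in pixels if p <= t]   # G2: values <= T
        m1 = sum(g1) / len(g1) if g1 else t
        m2 = sum(g2) / len(g2) if g2 else t
        t_new = (m1 + m2) / 2
        if abs(t_new - t) < eps:
            return t_new
        t = t_new

# Bimodal example: dark background around 20, bright object around 200
pix = [18, 20, 22, 19, 21, 198, 200, 202, 199, 201]
t = global_threshold(pix)   # settles midway between the two modes
```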


    Image of rice with black background


Basic Adaptive Thresholding:

Uneven illumination makes it difficult to segment an image using a single histogram threshold. This approach divides the original image into sub-images and applies the thresholding process to each of the sub-images.
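A minimal sketch of this idea, assuming square tiles and a simple per-tile mean threshold (a real implementation would derive each sub-image's threshold from its own histogram):

```python
import numpy as np

def adaptive_threshold(image, tile=8):
    """Split the image into tile x tile sub-images and threshold each one
    independently at its own mean, so a slow illumination gradient does
    not break the segmentation."""
    h, w = image.shape
    out = np.zeros_like(image, dtype=bool)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            block = image[y:y + tile, x:x + tile]
            out[y:y + tile, x:x + tile] = block > block.mean()
    return out

# Background brightness ramps from 0 (left) to 100 (right); one bright
# square sits in the dark region and one in the bright region.
ramp = np.tile(np.linspace(0, 100, 32), (32, 1))
img = ramp.copy()
img[4:8, 4:8] += 60      # object in the dark region
img[4:8, 24:28] += 60    # object in the bright region
mask = adaptive_threshold(img, tile=8)
```

Each tile picks a threshold suited to its local illumination, so both squares are detected despite the ramp.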

    Multimodal Histogram


If there are three or more dominant modes in the image histogram, the histogram has to be partitioned by multiple thresholds.

Multilevel thresholding classifies a point (x,y) as belonging to one object class if T1 < f(x,y) <= T2, and to the background if f(x,y) <= T1.
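With two assumed thresholds T1 and T2, this classification rule maps onto a two-bin `numpy.digitize`; the threshold and pixel values here are invented for illustration:

```python
import numpy as np

# Hypothetical thresholds picked from a histogram with three modes
T1, T2 = 85, 170

f = np.array([[10, 120, 250],
              [90, 200,  30]])

# right=True matches the rule above: background (0) if f <= T1,
# class 1 if T1 < f <= T2, class 2 if f > T2
classes = np.digitize(f, bins=[T1, T2], right=True)
```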


    Low noise image Thresholded at T=some value

    Greylevel thresholding


[Figure: grey-level histogram p(x), with threshold T separating the Object and Background modes]

    Greylevel thresholding


    Comparison with different thresholding


    High noise circle image Optimum threshold

    Relaxation - 20 i

    Example: Image Enhancement


    Assignment

    Color Image


    Given Image

    Statement


To perform an image segmentation:
1. Skin color

Assignment to be submitted: latest by March 28

Tools to be used: Matlab

Report: Matlab code; output images (results) with captions

    Example result: Segmented Image


    Example result: Segmented image, skin color is shown


    Assignment

    Review


Image Enhancement / (Pre)Processing
Noise reduction
  Using filters (coefficients of the filter mask to be computed using rectangular, Gaussian, triangular techniques, etc.)
  Average filter (mask)
  Median filter

Information Detection
Edge detection (to get salient features)
  High-pass filters
  Laplacian mask
Histograms
  Distribution of intensity levels
Image Segmentation (to extract different objects)
  Threshold technique

Next: Image Compression


    Image Compression

    Again: An Image is...

    52 6B 8C 6B 73 5A 63

    5A 6B 73 84 84 73 73

    5A 84 84 73 5A 84 84

    made up of pixels


    6B 6B 8C 5A 42 4A 42

    42 6B 6B 5A A5 DE D6

    5A 6B 5A 42 F7 F7 F7

    84 5A 6B 31 DE F7 F7

    45 71 82 7D 7D 55 5D

    55 75 7D 75 71 6D 65

    55 75 7D 6D 61 75 75

    each pixel is a

    combination of Red

    Green and Blue color

    amplitudes

    45 71 75 55 41 41 38

    34 51 55 55 9E CF CF

    51 55 51 38 FF FF FF

    71 51 59 34 DB FF FF

    42 7B 7B 9C 7B 63 4A

    63 7B 7B 73 73 63 63

    63 73 7B 63 63 73 73

    42 73 73 63 42 42 39

    31 4A 4A 63 94 DE DE

    4A 4A 42 39 FF FF FF

    73 42 5A 31 BD FF FF

An image is represented by I(x,y), a function of the two spatial coordinates of the image plane.

I(x,y) is the intensity of the image at the point (x,y) on the image plane.

A color image is represented by R(x,y), G(x,y), B(x,y).

    Image Compression: JPEG


    Summary: JPEG Compression

    DCT

    Quantization

    Zig-Zag Scan

    Sources: The JPEG website:

    http://www.jpeg.org


    RLE and DPCM

    Entropy Coding

    Why Compression?

The compression ratio of lossless methods (e.g., Huffman, Arithmetic, LZW) is not high enough for image and video compression.

JPEG uses transform coding; it is largely based on the following observations:

Observation 1: A large majority of useful image content changes relatively slowly across an image, i.e., it is unusual for intensity values to alter up and down several times in a small area, for example, within an 8 x 8 image block. Translated into the spatial frequency domain, this implies that, generally, lower spatial frequency components contain more information than the high frequency components, which often correspond to less useful details and noise.

Observation 2: Experiments suggest that humans are more immune to the loss of higher spatial frequency components than to the loss of lower frequency components.

Entropy Coding: DC Components (Contd..)

DC components are differentially coded as (SIZE, Value).

The code for a SIZE is derived from the following table:

    SIZE   Code Length   Code
    0      2             00
    1      3             010
    2      3             011
    3      3             100
    4      3             101
    5      3             110
    6      4             1110
    7      5             11110
    8      6             111110
    9      7             1111110
    10     8             11111110
    11     9             111111110

Example: If a DC component is 40 and the previous DC component is 48, the difference is -8. Therefore it is coded as: 1010111

101: the SIZE of -8 (from the Size_and_Value table) is 4; the corresponding code from the table above is 101.
0111: the value bits representing -8 (see the Size_and_Value table).

Huffman Table for DC component SIZE field
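The worked example can be checked with a short sketch. The SIZE-code dictionary is the table above (including the standard entry for SIZE 3); negative differences are stored as their SIZE-bit one's complement, which is why -8 becomes 0111:

```python
# Huffman codes for the DC SIZE field, per the table above
DC_SIZE_CODE = {0: '00', 1: '010', 2: '011', 3: '100', 4: '101', 5: '110',
                6: '1110', 7: '11110', 8: '111110', 9: '1111110',
                10: '11111110', 11: '111111110'}

def encode_dc(prev_dc, dc):
    """Differential (SIZE, Value) coding of a DC coefficient."""
    diff = dc - prev_dc
    size = abs(diff).bit_length()
    if size == 0:
        return DC_SIZE_CODE[0]               # a zero difference has no value bits
    # negative values: SIZE-bit one's complement (e.g. -8 -> 0111)
    value = diff if diff > 0 else diff + (1 << size) - 1
    return DC_SIZE_CODE[size] + format(value, '0{}b'.format(size))

# The slide's example: previous DC 48, current DC 40 -> diff -8
code = encode_dc(48, 40)   # '1010111'
```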


    Why Compression?


    A TV still image (frame) in India has 720 x 483 pixels

    Hence the total amount of data for the image

    = 720 x 483 x 3 bytes

    = 1043280 bytes

    Considering there are 25 such frames in a second, the data rate

    = 1043280 x 8 x 25 bits/sec

    = 208656000 bits/sec

    = 208 Mbits/sec !!!

    But, the cable that comes to our house can carry data rates of up to only 40 Mbits/sec :-((

    Compression


    Store more images
    Transmit images in less time

JPEG (Joint Photographic Experts Group)
  Popular standard format for representing digital images in a compressed form
  Provides for a number of different modes of operation
  Mode used in this class provides high compression ratios using DCT (discrete cosine transform)

    Image data divided into blocks of 8 x 8 pixels

    3 steps performed on each block

    DCT

    Quantization

    Huffman encoding

    Compression


    Lossless or lossy (widely used)

    Color Transform (RGB to YCbCr)

    Downsampling (4:2:2 to 4:2:0)

    Noise reduction

    Image captured

    (R,G,B)

    and digitised


    Color Transform RGB to YCbCr

    for Video class

Conversion from RGB:

Y  = 0.299(R - G) + G + 0.114(B - G)
Cb = 0.564(B - Y)
Cr = 0.713(R - Y)

The matrix form (coefficients obtained by substituting Y into the Cb and Cr equations):

[ Y  ]   [  0.299     0.587     0.114   ] [ R ]
[ Cb ] = [ -0.168636 -0.331068  0.499704] [ G ]
[ Cr ]   [  0.499813 -0.418531 -0.081282] [ B ]
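A direct transcription of these formulas into a small function (a sketch; practical codecs operate on whole arrays and may add an offset to keep Cb and Cr unsigned):

```python
def rgb_to_ycbcr(r, g, b):
    """YCbCr from the slide's formulas: luma Y as a weighted sum of
    R, G, B, then scaled color differences Cb and Cr."""
    y = 0.299 * (r - g) + g + 0.114 * (b - g)   # = 0.299R + 0.587G + 0.114B
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr

# A neutral gray has zero chroma: Y = 128, Cb = Cr = 0
y, cb, cr = rgb_to_ycbcr(128, 128, 128)
```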

    Downsampling in Y-Cr-Cb space

    for Video class


    Compression


    Lossless or lossy (widely used)

    Noise reduction

    Image Enhancement

    Image captured

    (grayscale)

    and digitised


Example: JPEG Coding

f(i,j) (8x8, YCbCr) --DCT--> F(u,v) (8x8) --Quantization (quant tables)--> Fq(u,v) --Zig-Zag Scan--> DPCM (DC) / RLC (AC) --> Entropy Coding --> Header, Tables, Data

Steps involved:

1. Discrete Cosine Transform of each 8x8 pixel array: f(x,y) -> F(u,v)
2. Quantization using a table or using a constant
3. Zig-Zag scan to exploit redundancy
4. Differential Pulse Code Modulation (DPCM) on the DC component, and Run-Length Coding of the AC components
5. Entropy coding (Huffman) of the final output


Exercise - DCT : Discrete Cosine Transform

DCT converts the information contained in a block (8x8) of pixels from the spatial domain to the frequency domain.

A simple analogy: Consider an unsorted list of 12 numbers between 0 and 3 -> (2, 3, 1, 2, 2, 0, 1, 1, 0, 1, 0, 0). Consider a transformation of the list involving two steps: (1) sort the list; (2) count the frequency of occurrence of each of the numbers -> (4, 4, 3, 1). Through this transformation we lost the spatial information but captured the frequency information.

There are other transformations which retain the spatial information, e.g., the Fourier transform, DCT, etc., therefore allowing us to move back and forth between the spatial and frequency domains.

1-D DCT (N = 8):
F(u) = a(u) sqrt(2/N) Σ_{n=0..N-1} f(n) cos[(2n+1)uπ / 16]
with a(0) = 1/√2 and a(p) = 1 for p ≠ 0

1-D Inverse DCT:
f'(n) = Σ_{u=0..N-1} a(u) sqrt(2/N) F(u) cos[(2n+1)uπ / 16]

Transform coding for images: transforming the data from the spatial domain to the spatial-frequency domain


Discrete Cosine Transform

1-D DISCRETE COSINE TRANSFORM (DCT):

C(u) = a(u) Σ_{x=0..N-1} f(x) cos[(2x+1)uπ / 2N],   u = 0, 1, ..., N-1

a(u) = sqrt(1/N)   for u = 0
a(u) = sqrt(2/N)   for u = 1, ..., N-1
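A direct, unoptimized evaluation of this formula (real encoders use fast DCT algorithms); a constant signal should put all of its energy into C(0):

```python
import math

def dct_1d(f):
    """Direct evaluation of the 1-D DCT formula above:
    C(u) = a(u) * sum_x f(x) * cos((2x+1) * u * pi / (2N))."""
    n = len(f)
    def a(u):
        return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    return [a(u) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                       for x in range(n))
            for u in range(n)]

# A constant signal: all the energy lands in the DC coefficient C(0)
c = dct_1d([10, 10, 10, 10, 10, 10, 10, 10])
```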

2-D DISCRETE COSINE TRANSFORM (DCT):

C(u,v) = a(u) a(v) Σ_{x=0..N-1} Σ_{y=0..N-1} f(x,y) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

Inverse:

f(x,y) = Σ_{u=0..N-1} Σ_{v=0..N-1} a(u) a(v) C(u,v) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

u, v = 0, 1, ..., N-1

    2-D DCT

Images are two-dimensional; how do you perform a 2-D DCT? Two series of 1-D transforms result in a 2-D transform, as demonstrated in the figure below.

f(i,j) [8x8] --1-D DCT, row-wise--> [8x8] --1-D DCT, column-wise--> F(u,v) [8x8]

    F(0,0) is called the DC component and the rest of the F(u,v) are called AC components.
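The row-wise/column-wise scheme can be sketched directly, reusing the 1-D DCT formula from above; for a flat 8x8 block, all the energy lands in the DC component F(0,0):

```python
import math

def dct_1d(f):
    """1-D DCT, direct evaluation (see the formula above)."""
    n = len(f)
    a = lambda u: math.sqrt((1 if u == 0 else 2) / n)
    return [a(u) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                       for x in range(n))
            for u in range(n)]

def dct_2d(block):
    """2-D DCT as two passes of the 1-D DCT: row-wise, then column-wise."""
    rows = [dct_1d(row) for row in block]                    # row-wise pass
    cols = [dct_1d([rows[i][j] for i in range(len(rows))])   # column-wise pass
            for j in range(len(rows[0]))]
    # transpose back so F[u][v] indexes row u, column v
    return [[cols[v][u] for v in range(len(cols))] for u in range(len(cols[0]))]

F = dct_2d([[100] * 8 for _ in range(8)])   # flat block -> only F(0,0) nonzero
```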


    Quality


    DCT step

Transforms original 8 x 8 block into a cosine-frequency domain
  Upper-left corner values represent more of the essence of the image
  Lower-right corner values represent finer details
  Can reduce precision of these values and retain reasonable image quality

Forward formula:
  C(h) = if (h == 0) then 1/sqrt(2) else 1.0
    Auxiliary function used in the main function F(u,v)
  F(u,v) = 1/4 x C(u) x C(v) x Σ_{x=0..7} Σ_{y=0..7} f_xy x cos((2x+1)uπ/16) x cos((2y+1)vπ/16)
    Gives encoded value at row u, column v
    f_xy is the original pixel value at row x, column y

IDCT (Inverse DCT)
  Reverses process to obtain original block (not needed for this design)

What happens when DCT is performed?


    Example: 2D signal


Energy is concentrated in the low-frequency region (using DCT)

    Basis of DCT


    2-D Basis Functions N=4

[Figure: the 4 x 4 grid of 2-D DCT basis functions, indexed by u = 0..3 (columns) and v = 0..3 (rows)]

    Quantization step

Achieve a high compression ratio by reducing image quality
Reduce bit precision of encoded data
  Fewer bits needed for encoding
One way is to divide all values by a factor of n
  Simple right shifts can do this
Dequantization would reverse the process for decompression

    1150 39 -43 -10 26 -83 11 41

    -81 -3 115 -73 -6 -2 22 -5

    14 -11 1 -42 26 -3 17 -38

    2 -61 -13 -12 36 -23 -18 5

    44 13 37 -4 10 -21 7 -8

    36 -11 -9 -4 20 -28 -21 14

    -19 -7 21 -6 3 3 12 -21

    -5 -13 -11 -17 -4 -1 7 -4

    144 5 -5 -1 3 -10 1 5

    -10 0 14 -9 -1 0 3 -1

    2 -1 0 -5 3 0 2 -5

    0 -8 -2 -2 5 -3 -2 1

    6 2 5 -1 1 -3 1 -1

    5 -1 -1 -1 3 -4 -3 2

    -2 -1 3 -1 0 0 2 -3

    -1 -2 -1 -2 -1 0 1 -1

Left: the block after the DCT encoding step. Right: after quantization (divide each cell's value by 8 and round).
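The divide-and-round step for a constant factor, applied to the first row of the block above (JPEG proper divides by a per-coefficient table q(u,v), as discussed next):

```python
def quantize(block, q=8):
    """Uniform quantization: divide every coefficient by a constant q
    and round, as in the before/after tables above."""
    return [[round(v / q) for v in row] for row in block]

def dequantize(block, q=8):
    """Approximate reconstruction; the rounding error is not recoverable."""
    return [[v * q for v in row] for row in block]

row = [1150, 39, -43, -10, 26, -83, 11, 41]   # first row of the DCT block above
q_row = quantize([row])[0]                    # matches the table's first row
```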

    Quantization

Why? -- To reduce the number of bits per sample

F^(u,v) = round( F(u,v) / q(u,v) )

Example: 101101 = 45 (6 bits).
Truncate to 4 bits: 1011 = 11. (Compare 11 x 4 = 44 against 45.)
Truncate to 3 bits: 101 = 5. (Compare 8 x 5 = 40 against 45.)
Note that the more bits we truncate, the more precision we lose.

Quantization error is the main source of the lossy compression.

Uniform Quantization: q(u,v) is a constant.

Non-uniform Quantization -- Quantization Tables
  The eye is most sensitive to low frequencies (upper left corner in the frequency matrix), less sensitive to high frequencies (lower right corner).
  Custom quantization tables can be put in the image/scan header.
  The JPEG Standard defines two default quantization tables, one each for luminance and chrominance.

Thresholding & Quantisation

Fact: the Human Visual System (HVS) does not distinguish fine details below a certain luminance level.
Fact: the HVS is less sensitive to high spatial frequency changes.

Therefore,
  replace values below a certain threshold (a function of frequency) by 0
  quantize the resultant values with an accuracy decreasing with increasing spatial frequencies

The result of Thresholding and Quantization


    The zig-zag scan


The resultant sequence:

(129) ; 1 ; 0 ; 0 ; 1 ; 1 ; 1 ; 1 ; 0 ; 0 ; -1 ; -1 ; 0 ; -1 ; 2 ; 1 ; -1 ; 0 ; 0 ; 0 ; 1 ; -1 ; 0 ; 1 ; 1 ; 0 ; 0 ; 0 ; 0 ; 0 ; -1 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 1 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 1 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0
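The scan order itself can be generated by walking the anti-diagonals of the block, alternating direction on each one (a sketch; it reproduces the standard start (0,0), (0,1), (1,0), (2,0), ...):

```python
def zigzag_order(n=8):
    """Index sequence for the zig-zag scan of an n x n block: traverse
    anti-diagonals, alternating direction, so low-frequency coefficients
    come first and trailing zeros bunch together for run-length coding."""
    order = []
    for s in range(2 * n - 1):                  # s = u + v selects one anti-diagonal
        diag = [(u, s - u) for u in range(n) if 0 <= s - u < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

scan = zigzag_order(8)   # 64 (row, column) pairs, starting at (0, 0)
```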

Huffman Coding

Huffman coding is the most popular technique for removing coding redundancy.
  Unique prefix property
  Instantaneous decoding property
  Optimality
  JPEG (fixed, not optimal)


Huffman encoding example
  Pixel frequencies on left
    Pixel value -1 occurs 15 times
    Pixel value 14 occurs 1 time
  Build Huffman tree from bottom up
    Create one leaf node for each pixel value and assign its frequency as the node's value
    Create an internal node by joining any two nodes whose sum is a minimal value; this sum is the internal node's value
    Repeat until a complete binary tree is formed
  Traverse tree from root to leaf to obtain the binary code for the leaf's pixel value
    Append 0 for left traversal, 1 for right traversal
  Huffman encoding is reversible
    No code is a prefix of another code
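The bottom-up construction described above can be sketched with a heap. The frequencies are a hypothetical subset of the slide's example, and since tie-breaking affects the exact bit patterns, only the code lengths and the prefix property are meaningful:

```python
import heapq

def huffman_codes(freqs):
    """Bottom-up Huffman tree: repeatedly join the two nodes whose
    frequency sum is minimal; prepend 0 for left, 1 for right."""
    # each heap entry: (frequency, tiebreak, {symbol: code-so-far})
    heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # the two minimal-sum nodes
        f2, _, right = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in left.items()}         # left -> 0
        merged.update({s: '1' + c for s, c in right.items()})  # right -> 1
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

codes = huffman_codes({-1: 15, 0: 8, -2: 6, 2: 5, 144: 1})
```

Frequent symbols get short codes (here -1 gets a 1-bit code), and no code is a prefix of another, so decoding is unambiguous.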

Pixel frequencies (from the quantized block): -1: 15x, 0: 8x, -2: 6x, 2: 5x, 3: 5x, 5: 5x, -3: 4x, -5: 3x, -10: 2x, 144: 1x, -9: 1x, -8: 1x, -4: 1x, 6: 1x, 14: 1x

[Figure: the Huffman tree built from these frequencies]

Resulting Huffman codes: -1 -> 00, 0 -> 100, -2 -> 110, 2 -> 1110, 3 -> 1010, 5 -> 0110, -3 -> 11110, -5 -> 10110, -10 -> 01110, -8 -> 10111, ...