1. image compression (brief overview) 2. hough transform...

Lecture 9

Video Data Analysis, B-IT Marina Kolesnik http://www.fit.fraunhofer.de/~kolesnik/

1. Image compression (brief overview)
2. Hough transform
3. Morphological operations on binary images

Video Data Analysis

Image compression

Reading: 1:8.1

Relative data redundancy $R_D$ of two data sets coding the same information:

$$R_D = 1 - \frac{1}{C_R}, \qquad \text{where } C_R = \frac{n_1}{n_2}$$

Here $n_1$ and $n_2$ are the numbers of information-carrying units in the two data sets, and $C_R$ is the compression rate.

Case 1: $n_2 = n_1 \Rightarrow C_R = 1,\ R_D = 0$. No redundancy.
Case 2: $n_2 \ll n_1 \Rightarrow C_R \to \infty,\ R_D \to 1$. Significant compression and highly redundant data.
Case 3: $n_2 \gg n_1 \Rightarrow C_R \to 0,\ R_D \to -\infty$. Data expansion in the second set.

Three types of data redundancy: coding, interpixel, psycho-visual. Data compression is achieved when some of these redundancies are reduced (eliminated).

Coding redundancy: more bits are used to code each gray level than is necessary.

Variable-length coding: compression is achieved if more probable gray values are coded with fewer bits.
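The definitions of $C_R$ and $R_D$ can be checked numerically. The sketch below is illustrative only: the four gray-level probabilities and the code-word lengths are hypothetical values chosen for the example, not data from the lecture.

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2 for two data sets carrying the same information."""
    return n1 / n2


def relative_redundancy(n1, n2):
    """R_D = 1 - 1/C_R."""
    return 1.0 - 1.0 / compression_ratio(n1, n2)


def average_code_length(probabilities, code_lengths):
    """Average number of bits per gray level for a variable-length code."""
    return sum(p * l for p, l in zip(probabilities, code_lengths))


# Hypothetical 4-level image: frequent gray levels get shorter code words.
probabilities = [0.5, 0.25, 0.125, 0.125]
fixed_bits = 2                    # natural binary code: 2 bits per level
variable_lengths = [1, 2, 3, 3]   # e.g. code words 0, 10, 110, 111

l_avg = average_code_length(probabilities, variable_lengths)
print("average bits per pixel:", l_avg)
print("C_R:", compression_ratio(fixed_bits, l_avg))
print("R_D:", relative_redundancy(fixed_bits, l_avg))
```

Here the bits per pixel play the role of the information-carrying units $n_1$ and $n_2$.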

Other redundancies

Interpixel redundancy is related to the correlation of pixels' gray levels within the image: the value of any given pixel can be reasonably predicted from the values of its neighbors.

Run-length coding: the image is represented by the value and length of its constant gray-level runs.
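Run-length coding of a single image row can be sketched as follows. The function names `rle_encode` and `rle_decode` are hypothetical, and each run is stored as a (value, length) pair, which is one possible layout among several.

```python
def rle_encode(row):
    """Represent a row by the (value, length) pairs of its constant runs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Rebuild the row from its (value, length) pairs."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


row = [0, 0, 0, 255, 255, 0]
runs = rle_encode(row)
print(runs)
print(rle_decode(runs) == row)
```

The coding is lossless; it only pays off when runs are long, as in binary or strongly quantized images.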

Reading: 1:8.1

Psychophysical (psycho-visual) redundancy is associated with quantifiable visual information which itself is not essential for visual processing. Elimination of psychophysically redundant information results in a loss of quantitative information and is referred to as quantization. The loss of information is measured by the root-mean-square (rms) error:

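$$e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x,y) - f(x,y) \right]^2 \right]^{1/2}$$

with the error in each pixel:

$$e(x,y) = \hat{f}(x,y) - f(x,y)$$

where $f$ is the original $M \times N$ image and $\hat{f}$ is the decompressed one.

The mean-square signal-to-noise ratio:

$$SNR_{ms} = \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x,y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x,y) - f(x,y) \right]^2}$$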
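Both fidelity measures are straightforward to compute. In this sketch images are plain lists of rows, and the 2x2 sample values are made up for illustration.

```python
import math


def rms_error(f, f_hat):
    """Root-mean-square error between original f and approximation f_hat."""
    M, N = len(f), len(f[0])
    total = sum((f_hat[x][y] - f[x][y]) ** 2
                for x in range(M) for y in range(N))
    return math.sqrt(total / (M * N))


def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio (undefined if f_hat equals f)."""
    M, N = len(f), len(f[0])
    signal = sum(f_hat[x][y] ** 2 for x in range(M) for y in range(N))
    noise = sum((f_hat[x][y] - f[x][y]) ** 2
                for x in range(M) for y in range(N))
    return signal / noise


f = [[10, 20], [30, 40]]        # hypothetical original
f_hat = [[12, 18], [30, 44]]    # hypothetical decompressed image
print(rms_error(f, f_hat), snr_ms(f, f_hat))
```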

Perception of image quality

The root-mean-square (rms) error and the mean-square SNR are not always good measures of the image quality as perceived by humans.

Reading: 1:8.1

Left: original image.
Center: uniform quantization to 16 levels. Compression rate is 2:1. RMS = 6.93; SNR = 10.25.
Right: improved gray-scale quantization. Compression rate is 2:1. RMS = 6.78; SNR = 10.39.

Error-free compression

Lossless predictive coding is based on eliminating the interpixel redundancy of closely spaced pixels by extracting and coding only the difference between the actual and the predicted value of each pixel. The system consists of an encoder and a decoder, each containing an identical predictor.

Reading: 1:8.4

The prediction error $e_n = f_n - \hat{f}_n$ is coded.

The decoder reconstructs $e_n$ and computes $f_n = e_n + \hat{f}_n$.
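A minimal sketch of this encoder/decoder pair, assuming the simplest possible predictor (the previous pixel, $\hat{f}_n = f_{n-1}$) and assuming the first sample is transmitted unchanged. Both choices are illustrative and are not prescribed by the slides.

```python
def predictive_encode(row):
    """Return prediction errors e_n = f_n - f_hat_n with f_hat_n = f_(n-1)."""
    errors = [row[0]]                      # first sample sent as is
    for n in range(1, len(row)):
        prediction = row[n - 1]
        errors.append(row[n] - prediction)
    return errors


def predictive_decode(errors):
    """Rebuild f_n = e_n + f_hat_n using the same predictor as the encoder."""
    row = [errors[0]]
    for n in range(1, len(errors)):
        prediction = row[n - 1]
        row.append(errors[n] + prediction)
    return row


row = [100, 102, 103, 103, 101]
errors = predictive_encode(row)
print(errors)
print(predictive_decode(errors) == row)
```

The errors cluster around zero, which is what makes them cheaper to code with a variable-length code than the original gray levels.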

Predictive image compression

Reading: 1:8.4.4

Entropy is a measure of the information content of the image:

$$H_e = -\sum_{k=0}^{L-1} p(r_k) \log_2 [p(r_k)]$$

Optimal predictor

The prediction is formed by a linear combination of $m$ previous pixels:

$$\hat{f}_n = \mathrm{round}\left[ \sum_{i=1}^{m} \alpha_i f_{n-i} \right]$$

The parameters $\alpha_i$ are set to minimize the mean-square prediction error:

$$E\{e_n^2\} = E\{[f_n - \hat{f}_n]^2\}$$

The result is the DPCM: Differential Pulse Code Modulation.

The decrease of the entropy reflects the removal of redundancy. Compression rate is about 2.

Figure: the prediction error image.
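The drop in entropy after prediction can be illustrated on a toy signal. The ramp below is a hypothetical example, and the predictor is the previous sample (a one-term linear predictor with $\alpha_1 = 1$, chosen for simplicity rather than optimality).

```python
import math
from collections import Counter


def entropy(values):
    """First-order entropy estimate in bits per sample."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


ramp = list(range(8))                                     # 0, 1, ..., 7
errors = [ramp[n] - ramp[n - 1] for n in range(1, len(ramp))]

print("entropy of the samples:", entropy(ramp))
print("entropy of the prediction errors:", entropy(errors))
```

Every gray level of the ramp occurs once, so the samples need 3 bits each, while the prediction errors are all identical and carry no information.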

Lossy image compression

Reading: 1:8.5

Lossy predictive coding

A quantizer, which absorbs the nearest-integer function of the error-free encoder, is added. It maps the prediction error into a limited range of outputs, which establish the amount of compression and distortion associated with lossy coding.

Predictions generated by the encoder and decoder must be equivalent; therefore the predictor input $\dot{f}_n$ is generated as a function of past predictions and quantized errors $\dot{e}_n$:

$$\dot{f}_n = \dot{e}_n + \hat{f}_n$$
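The feedback loop can be sketched as below. The uniform quantizer with step `step`, the previous-sample predictor and the initial predictor state of 0 are all assumptions made for this illustration; the slides do not specify them.

```python
def quantize(e, step):
    """Uniform quantizer (assumed): round the error to a multiple of step."""
    return step * round(e / step)


def lossy_encode(row, step):
    """Quantized prediction errors; the predictor sees reconstructed values."""
    quantized_errors = []
    reconstructed = 0                    # shared initial state (assumed)
    for f_n in row:
        prediction = reconstructed       # f_hat_n = previous reconstruction
        e_q = quantize(f_n - prediction, step)
        quantized_errors.append(e_q)
        reconstructed = e_q + prediction # f_dot_n = e_dot_n + f_hat_n
    return quantized_errors


def lossy_decode(quantized_errors):
    """Decoder runs the same predictor on the same quantized errors."""
    row, reconstructed = [], 0
    for e_q in quantized_errors:
        reconstructed = e_q + reconstructed
        row.append(reconstructed)
    return row


row = [100, 104, 109, 115, 110]
decoded = lossy_decode(lossy_encode(row, step=4))
print(decoded)
```

Because the encoder predicts from its own reconstruction rather than from the original samples, the quantization error does not accumulate: each decoded sample stays within half a step of the original.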

Transform coding

Reading: 1:8.5.2

Transform coding is based on modifying the transform coefficients of an image, which are then quantized and coded. Many coefficients are small and can be coarsely quantized. Compression is achieved during the quantization of the transformed coefficients.




Reading: 1:8.5

Approximations using the Fourier, Hadamard and cosine transforms, with the corresponding rms errors of 1.28, 0.86 and 0.68 gray levels.

The image is usually divided into sub-images (blocks), say 8x8. Each block is transformed, generating 64 coefficients, of which only the 32 with the largest magnitudes are retained. Compression by a factor of 2 is achieved during quantization of the transform coefficients.
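The block procedure can be sketched for a single 8x8 block with the cosine transform. This is a plain-Python illustration with an orthonormal DCT built from its definition; the smooth test block is hypothetical, and a real coder would also quantize and entropy-code the retained coefficients.

```python
import math

N = 8
# Orthonormal DCT-II matrix: C[u][x] = alpha(u) * cos((2x + 1) u pi / (2N))
C = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def dct2(block):
    """Forward 2-D DCT: T = C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2-D DCT: block = C^T * T * C."""
    return matmul(matmul(transpose(C), coeffs), C)


def keep_largest(coeffs, keep=32):
    """Zero all but the `keep` coefficients of largest magnitude."""
    positions = sorted(((u, v) for u in range(N) for v in range(N)),
                       key=lambda p: abs(coeffs[p[0]][p[1]]), reverse=True)
    kept = set(positions[:keep])
    return [[coeffs[u][v] if (u, v) in kept else 0.0 for v in range(N)]
            for u in range(N)]


def rms_error(f, f_hat):
    total = sum((f_hat[x][y] - f[x][y]) ** 2
                for x in range(N) for y in range(N))
    return math.sqrt(total / (N * N))


block = [[16 * x + 8 * y for y in range(N)] for x in range(N)]  # smooth block
approx = idct2(keep_largest(dct2(block), keep=32))
print("rms error with 32 of 64 coefficients:", rms_error(block, approx))
```

For a smooth block like this one the energy is packed into a few low-frequency coefficients, so discarding half of them costs almost nothing; textured blocks lose more.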




Reading: 1:8.5

Approximations using 25% of the DCT coefficients: a), b) 8x8 subimage results; c) zoomed original; d) 2x2 result; e) 4x4 result; f) 8x8 result.

Reconstruction error versus subimage size; 75% of the DCT coefficients were truncated.

DCT coding

Reading: 1:8.5

Approximations using threshold coding of the DCT coefficients. The compression rates are 34:1 and 67:1; the rms errors are 3.42 and 6.33 gray levels, respectively.

Wavelet coding

Reading: 1:8.5.3

Principle (similar to transform coding): transform coefficients can be coded more efficiently than the original image pixels. The compression rate depends on the fraction of truncated coefficients.

Procedure: a $2^J \times 2^J$ image is processed. The computed transform converts a large portion of the image into horizontal, vertical and diagonal decomposition coefficients with zero mean and Laplacian-like distributions. Since many coefficients carry little visual information, they are efficiently quantized and coded. Decoding is via the symbol decoder and the inverse transform.

Advantages:
- Because the wavelet transform is computationally efficient (simple basis functions!) and inherently local, subdivision of the original image into sub-images is unnecessary. This eliminates the blocking artifacts that are typical of DCT-based compression at high compression rates.
- On average, the compression rate is higher than for the DCT-based method at a similar level of rms error in the decompressed images.



Reading: 1:8.5.3


Wavelet-based approximations were reconstructed from encodings that compressed the original image by 34 to 1 and 67 to 1. The rms errors are 2.29 (as opposed to 3.42 for DCT) and 2.96 (6.33 for DCT) gray levels, respectively.



Reading: 1:8.5.3


Reconstruction from 108:1 and 167:1 wavelet-based encodings of the original image, with rms errors of 3.72 and 4.73, respectively. At more than twice the level of compression, the most highly compressed wavelet-based reconstruction has only 75% of the error of the less compressed transform-based result, and superior perceived quality.

Still image compression standards

JPEG standard

The compression is performed in three sequential steps: DCT computation, quantization and variable-length coding.
- Image blocks are of 8x8 size;
- Level shifting by subtracting $2^{n-1}$ (128 for 8 bits per pixel).

Reading: 1:8.6

Still image compression standards

JPEG standard
- When converted with the forward DCT, the matrix becomes:

Reading: 1:8.6

- Scaled and truncated coefficients are:

The rms error after the overall compression and reconstruction process is ~5.9 gray levels.
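The level-shifting and the scaling-and-truncation steps can be sketched in isolation. The coefficient values and the normalization entries in `Z` below are hypothetical inputs for illustration; the sketch does not reproduce the matrices from the slides or the tables of the standard.

```python
def level_shift(block, bits=8):
    """Subtract 2^(n-1) from every pixel (128 for 8 bits per pixel)."""
    offset = 2 ** (bits - 1)
    return [[p - offset for p in row] for row in block]


def quantize(coeffs, Z):
    """Scale each DCT coefficient by its normalization entry and round."""
    return [[round(c / z) for c, z in zip(crow, zrow)]
            for crow, zrow in zip(coeffs, Z)]


def dequantize(q, Z):
    """Decoder side: multiply back by the normalization entries."""
    return [[v * z for v, z in zip(qrow, zrow)] for qrow, zrow in zip(q, Z)]


pixels = [[0, 128, 255]]
coeffs = [[-415.0, 29.0]]   # hypothetical DCT coefficients
Z = [[16, 11]]              # hypothetical normalization entries

print(level_shift(pixels))
print(quantize(coeffs, Z))
print(dequantize(quantize(coeffs, Z), Z))
```

The rounding in `quantize` is the only lossy step: the decoder recovers multiples of the normalization entries, not the original coefficients.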

Video compression: MPEG standard

Reading: 1:4.4

Video compression standards extend the transform-based still image compression techniques to include methods for reducing temporal or frame-to-frame redundancies.

The MPEG standard (Moving Picture Experts Group) uses a motion estimator:
• Macro-blocks are compared to neighboring blocks of the previous frame and used to compute a motion-compensated prediction error.
• The prediction error is discrete cosine transformed in 8x8 blocks, quantized and coded.

The compression is performed in three sequential steps: 1) DCT computation, 2) quantization and 3) variable-length coding. The principal difference from JPEG is the input, which may be a conventional image block or the difference between the conventional block and a prediction of it based on similar blocks in previous video frames.

Three types of encoded output frames:
• Independent frame (I-frame): an independent compression of a still image frame is stored. All standards require the periodic insertion of such frames into the compressed codestream.
• Predictive frame (P-frame): the compressed difference between the frame and its prediction based on the previous frame.
• Bidirectional frame (B-frame): a similar compressed difference based on the previous and next frames.
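The motion estimation step can be sketched with an exhaustive block search. The sum of absolute differences as matching cost, the search range and the tiny synthetic frames are assumptions made for illustration; the standard leaves the estimator to the encoder designer.

```python
def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def best_match(prev_frame, cur_frame, top, left, size, search):
    """Return (cost, dy, dx) of the best matching block in prev_frame."""
    target = get_block(cur_frame, top, left, size)
    rows, cols = len(prev_frame), len(prev_frame[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > rows or l + size > cols:
                continue                      # candidate leaves the frame
            cost = sad(target, get_block(prev_frame, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


def prediction_error(prev_frame, cur_frame, top, left, size, dy, dx):
    """Difference between the current block and its motion-compensated match."""
    ref = get_block(prev_frame, top + dy, left + dx, size)
    cur = get_block(cur_frame, top, left, size)
    return [[c - r for c, r in zip(cr, rr)] for cr, rr in zip(cur, ref)]


def make_frame(top, left):
    """6x6 frame with a bright 2x2 patch at (top, left)."""
    return [[200 if top <= r < top + 2 and left <= c < left + 2 else 0
             for c in range(6)] for r in range(6)]


prev_frame, cur_frame = make_frame(1, 1), make_frame(2, 3)
cost, dy, dx = best_match(prev_frame, cur_frame, 2, 3, size=2, search=2)
print(cost, dy, dx)
print(prediction_error(prev_frame, cur_frame, 2, 3, 2, dy, dx))
```

When the match is good, the prediction error block is close to zero and compresses far better than the block itself.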

Hough transform

Reading: 5:5.2; 1:10.2.2

Idea: to map a pattern detection problem, for instance the detection of a curve, into a peak detection problem in the space of the parameters of the curve.

$$y' = ax' + b \;\Rightarrow\; b = (-x')a + y'$$

$$y_1 = a'x_1 + b' \;\Rightarrow\; b = (-x_1)a + y_1; \quad (a', b') \text{ is on the line}$$

$$y_2 = a'x_2 + b' \;\Rightarrow\; b = (-x_2)a + y_2; \quad (a', b') \text{ is on the line}$$

Hough transform

Keep the parameter space finite:

Both parameters a and b vary within the range (-∞, ∞), so there is no way to sample the whole parameter space. Moreover, the slope a in the equation y = ax + b is infinite for a vertical line, which means that a vertical line cannot be represented in the parameter space (a, b). The way to solve this problem is to use the polar representation of a line, in which both parameters are finite and any line can be represented:

$$\rho = x\cos\theta + y\sin\theta$$

Reading: 5:5.2; 1:10.2.2

Note: each image point is mapped into a sinusoid in the (ρ, θ) parameter space.

Q: what are the intervals for the angle and distance parameters?
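The voting scheme in the polar parameter space can be sketched as follows. The sampling is an assumption of this example: θ is sampled in 180 steps over [0, π) and ρ is rounded to the nearest integer, which is one common choice rather than the only one.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes in (rho bin, theta index) space."""
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta               # theta in [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), t)] += 1      # one vote per theta
    return votes


# Ten points on the vertical line x = 5, which y = ax + b cannot represent.
points = [(5, y) for y in range(10)]
votes = hough_lines(points)
print(votes[(5, 0)], max(votes.values()))
```

The bin for ρ = 5, θ = 0 collects a vote from every point on the line. With such a short segment the neighboring θ bins can tie with it, so a practical detector would look for local maxima rather than a single global peak.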

Hough transform

Reading: 5:5.2; 1:10.2.2

The HT can be generalized to the detection of circles:

$$(x - c_1)^2 + (y - c_2)^2 = r^2$$

Or any curve of the form:

$$f(x, y, \mathbf{a}) = 0, \qquad \mathbf{a} = [a_1, \ldots, a_p]^T \text{ is a vector of parameters}$$

Figure panels: original image; Laplace convolution; binary circle points; HT for circles.

Summary:

HT is a voting algorithm: each point votes for all combinations of parameters of a pattern that it could be part of.

Attractive features:
1. Points are processed independently, so partly occluded patterns can also be detected. It is more efficient than template matching.
2. Robust to noise, as spurious points are unlikely to contribute consistently to any single bin and only generate background noise.
3. Detects multiple instances of a model pattern in a single pass.

Basic morphological operators

Reading: 1:9.1; 2:11

The difference of two sets A and B:

$$A - B = \{ w \mid w \in A,\ w \notin B \} = A \cap B^c$$

Translation of A by z

Reflection of B
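These three operations map directly onto sets of pixel coordinates. The sketch below assumes the standard definitions of translation, $(A)_z = \{a + z \mid a \in A\}$, and reflection, $\hat{B} = \{-b \mid b \in B\}$, with pixels stored as (row, column) tuples.

```python
def difference(A, B):
    """A - B: elements of A that are not in B."""
    return {w for w in A if w not in B}


def translate(A, z):
    """(A)_z: every element of A shifted by z."""
    return {(a[0] + z[0], a[1] + z[1]) for a in A}


def reflect(B):
    """B_hat: every element of B mirrored about the origin."""
    return {(-b[0], -b[1]) for b in B}


A = {(0, 0), (0, 1), (1, 1)}
print(difference(A, {(0, 1)}))
print(translate(A, (2, 3)))
print(reflect({(0, 1), (1, 0)}))
```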

Logic operations

Reading: 1:9.1; 2:11

The principal logic operations are:
- AND
- OR
- NOT (Complement)
- XOR (exclusive OR)

Difference from set operations: logic operations are restricted to binary variables.

Dilation

Reading: 1:9.1; 2:11

The dilation of A by B:

$$A \oplus B = \{ z \mid (\hat{B})_z \cap A \neq \emptyset \}$$

The dilation is commutative and associative:

$$A \oplus B = B \oplus A$$

$$A \oplus (B \oplus D) = (A \oplus B) \oplus D$$

Figure: the structuring element of dilation. The solid line shows the limit beyond which any further displacement of the origin of B would cause the intersection of A and B to be empty.
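On coordinate sets, the definition is equivalent to collecting all sums a + b with a in A and b in B, since $(\hat{B})_z$ meets A exactly when z = a + b for some pair. The sketch below uses that equivalent form; the small sets are hypothetical examples, including a one-pixel gap that the dilation bridges.

```python
def dilate(A, B):
    """A (+) B as the set of all sums a + b, a in A, b in B."""
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}


A = {(0, 0), (0, 2)}                 # two pixels separated by a gap
B = {(0, -1), (0, 0), (0, 1)}        # horizontal 1x3 structuring element
D = {(0, 0), (1, 0)}

print(sorted(dilate(A, B)))                              # gap at (0, 1) is bridged
print(dilate(A, B) == dilate(B, A))                      # commutative
print(dilate(A, dilate(B, D)) == dilate(dilate(A, B), D))  # associative
```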

Dilation

Reading: 1:9.1; 2:11

Example: application of dilation for bridging gaps.

Figure: the structuring element of dilation.

Erosion

Reading: 1:9.1; 2:11

The erosion of A by B:

$$A \ominus B = \{ z \mid (B)_z \subseteq A \}$$

Figure: the solid line shows the limit beyond which any further displacement of the origin of B would cause the set to cease being completely contained in A.

Dilation and erosion are duals of each other with respect to set complementation and reflection:

$$(A \ominus B)^c = A^c \oplus \hat{B}$$
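Erosion and its duality with dilation can be checked on a small example. The complement of a finite set is infinite, so this sketch takes complements inside a finite 9x9 grid and compares both sides only at interior pixels, where the grid border has no influence. The grid, the set A and the asymmetric structuring element B are hypothetical choices.

```python
def dilate(A, B):
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}


def reflect(B):
    return {(-b[0], -b[1]) for b in B}


def erode(A, B):
    """A (-) B: all z such that B translated by z lies completely inside A."""
    candidates = {(a[0] - b[0], a[1] - b[1]) for a in A for b in B}
    return {z for z in candidates
            if all((b[0] + z[0], b[1] + z[1]) in A for b in B)}


grid = {(r, c) for r in range(9) for c in range(9)}
interior = {(r, c) for r in range(1, 8) for c in range(1, 8)}
A = {(r, c) for r in range(2, 7) for c in range(2, 7)}    # 5x5 square
B = {(0, 0), (0, 1), (1, 0)}                              # asymmetric element

eroded = erode(A, B)
left_side = interior - eroded                             # (A (-) B)^c
right_side = dilate(grid - A, reflect(B)) & interior      # A^c (+) B_hat
print(len(eroded), left_side == right_side)
```

Using an asymmetric B matters here: with a symmetric element the reflection would have no visible effect.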

Opening

Reading: 1:9.3; 2:11.3

Opening A by B is the erosion of A by B followed by a dilation of the result by B:

$$A \circ B = (A \ominus B) \oplus B$$

Erosion and dilation are not inverse transformations: if the image is eroded and then dilated, the original image is not re-obtained. Instead, the result is a less detailed version of the original image.

Opening: smooths borders, eliminates thin protrusions.

Geometric interpretation of the opening:

Closing

Reading: 1:9.3; 2:11.3

Closing A by B is the dilation of A by B followed by erosion of the result by B:

$$A \bullet B = (A \oplus B) \ominus B$$

The geometric interpretation of closing is similar, except that the ball rolls on the outside of the boundary.

Opening and closing are dual operators:

$$(A \bullet B)^c = A^c \circ \hat{B}$$

Reapplication of the opening or closing does not change the previous result:

$$(X \circ B) \circ B = X \circ B$$

$$(X \bullet B) \bullet B = X \bullet B$$
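Opening and closing, and the fact that reapplying them changes nothing, can be illustrated on coordinate sets. The 3x3 structuring element and the two test shapes (a square with a thin protrusion, a square with a one-pixel hole) are hypothetical examples.

```python
def dilate(A, B):
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}


def erode(A, B):
    candidates = {(a[0] - b[0], a[1] - b[1]) for a in A for b in B}
    return {z for z in candidates
            if all((b[0] + z[0], b[1] + z[1]) in A for b in B)}


def opening(A, B):
    """Erosion followed by dilation."""
    return dilate(erode(A, B), B)


def closing(A, B):
    """Dilation followed by erosion."""
    return erode(dilate(A, B), B)


B = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}   # 3x3 element
square = {(r, c) for r in range(5) for c in range(5)}

with_protrusion = square | {(2, 5), (2, 6), (2, 7)}        # 1-pixel-wide spur
with_hole = square - {(2, 2)}                              # 1-pixel hole

opened = opening(with_protrusion, B)
closed = closing(with_hole, B)
print(opened == square, closed == square)
print(opening(opened, B) == opened, closing(closed, B) == closed)
```

The opening removes the spur because the structuring element does not fit inside it; the closing fills the hole because the element does not fit inside the hole from the background side.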

Properties of opening & closing

Reading: 1:9.3; 2:11.3

Filtering by morphological operations

Note the elimination of noise

Reading: 1:9.3; 2:11.3

Morphological hit-or-miss operator

Reading: 1:9.4;

The objective is to find the location of a particular shape. The origin of each shape is located at its center of gravity.

Introduce a structuring element B associated with X, the shape sought: B is composed of X and its background (W - X), where W is a small window enclosing X. Then:

$$A \circledast B = (A \ominus X) \cap [A^c \ominus (W - X)]$$

Let $B = (B_1, B_2)$ with $B_1 = X$ and $B_2 = (W - X)$; then the hit-or-miss transformation becomes:

$$A \circledast B = (A \ominus B_1) \cap (A^c \ominus B_2)$$
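As an illustration, the hit-or-miss transform can locate isolated pixels: X is a single pixel and W is its 3x3 window, so the background W - X is the ring of eight neighbors. The complement is taken inside a finite 7x7 grid, which is an assumption of this sketch, and the image content is made up.

```python
def erode(A, B):
    candidates = {(a[0] - b[0], a[1] - b[1]) for a in A for b in B}
    return {z for z in candidates
            if all((b[0] + z[0], b[1] + z[1]) in A for b in B)}


def hit_or_miss(A, B1, B2, grid):
    """(A (-) B1) intersected with (A^c (-) B2), complement taken in grid."""
    return erode(A, B1) & erode(grid - A, B2)


grid = {(r, c) for r in range(7) for c in range(7)}
W = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}
X = {(0, 0)}                       # shape sought: a single pixel
B1, B2 = X, W - X                  # shape and its local background

A = {(2, 2)} | {(4, 4), (4, 5), (5, 4), (5, 5)}   # isolated pixel + 2x2 blob
print(hit_or_miss(A, B1, B2, grid))
```

Only the isolated pixel satisfies both conditions: it matches X, and all of its eight neighbors belong to the background.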

Boundary detection

Reading: 1:9.5;

$$\beta(A) = A - (A \ominus B)$$

Q: What is the width of the boundary given the structuring element B? What will happen if B is 5x5 pixels in size?
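One way to explore the question is to run the boundary formula on a small test shape with structuring elements of different sizes. The 5x5 square and the centered square elements below are hypothetical choices for this experiment.

```python
def erode(A, B):
    candidates = {(a[0] - b[0], a[1] - b[1]) for a in A for b in B}
    return {z for z in candidates
            if all((b[0] + z[0], b[1] + z[1]) in A for b in B)}


def boundary(A, B):
    """beta(A) = A - (A eroded by B)."""
    return A - erode(A, B)


def square_element(half):
    """Centered (2*half + 1) x (2*half + 1) structuring element."""
    span = range(-half, half + 1)
    return {(dr, dc) for dr in span for dc in span}


A = {(r, c) for r in range(5) for c in range(5)}      # 5x5 square
print(len(boundary(A, square_element(1))))            # 3x3 element
print(len(boundary(A, square_element(2))))            # 5x5 element
```

Comparing the two pixel counts with the 25 pixels of the square shows how the thickness of the extracted boundary grows with the size of B.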

Region filling

Reading: 1:9.5;

Start with a point p inside the region and assign it the value 1, so that $X_0 = p$. Proceed iteratively (conditional dilation):

$$X_k = (X_{k-1} \oplus B) \cap A^c, \qquad k = 1, 2, 3, \ldots$$

Stop criterion: $X_k = X_{k-1}$

Result of filling: $F(A) = X_k \cup A$
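The iteration can be sketched directly on coordinate sets. The cross-shaped (4-connected) structuring element, the finite 7x7 grid used for the complement and the square outline used as boundary are assumptions of this example.

```python
def dilate(A, B):
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}


def region_fill(A, p, B, grid):
    """Conditional dilation from seed p, limited by the complement of A."""
    A_c = grid - A
    X = {p}
    while True:
        X_next = dilate(X, B) & A_c
        if X_next == X:                 # stop criterion X_k = X_(k-1)
            break
        X = X_next
    return X | A                        # result F(A) = X_k united with A


grid = {(r, c) for r in range(7) for c in range(7)}
cross = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}
# Boundary A: outline of the square with corners (1, 1) and (5, 5)
A = {(r, c) for r in range(1, 6) for c in range(1, 6)
     if r in (1, 5) or c in (1, 5)}

filled = region_fill(A, (3, 3), cross, grid)
print(len(A), len(filled))
```

The intersection with the complement of A at every step is what keeps the growing set from crossing the boundary.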

Extraction of connected components

Reading: 1:9.5;

Start with a point p belonging to the component, $X_0 = p$. Proceed iteratively:

$$X_k = (X_{k-1} \oplus B) \cap A, \qquad k = 1, 2, 3, \ldots$$

Stop criterion: $X_k = X_{k-1}$
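The same conditional dilation, now intersected with A itself, grows a seed into its connected component. The 3x3 structuring element (8-connectivity) and the two small components below are hypothetical choices for this sketch.

```python
def dilate(A, B):
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}


def connected_component(A, p, B):
    """Grow the seed p inside A until X_k = X_(k-1)."""
    X = {p}
    while True:
        X_next = dilate(X, B) & A
        if X_next == X:
            return X
        X = X_next


B = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}   # 8-connectivity
A = {(0, 0), (0, 1), (1, 1)} | {(4, 4), (5, 5)}

print(sorted(connected_component(A, (0, 0), B)))
print(sorted(connected_component(A, (4, 4), B)))
```

Repeating the extraction from a seed in each not yet labeled part of A yields all components, whose sizes can then be used for inspection tasks.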

Extraction of connected components

Connected components are used frequently for automated inspection.

The task: detection of foreign objects in processed food before packaging

Reading: 1:9.5;
