1. image compression (brief overview) 2. hough transform...
Lecture 9
Video Data Analysis, B-IT Marina Kolesnik http://www.fit.fraunhofer.de/~kolesnik/
Video Data Analysis
1. Image compression (brief overview)
2. Hough transform
3. Morphological operations on binary images
Image compression
Reading: 1:8.1
Relative data redundancy $R_D$ of two datasets coding the same information:
$$R_D = 1 - \frac{1}{C_R}, \quad \text{where } C_R = \frac{n_1}{n_2}$$
Here $n_1$ and $n_2$ are the numbers of information-carrying units in the two datasets, and $C_R$ is the compression rate.
Case 1: $n_2 = n_1 \Rightarrow C_R = 1,\ R_D = 0$. No redundancy.
Case 2: $n_2 \ll n_1 \Rightarrow C_R \to \infty,\ R_D \to 1$. Significant compression and highly redundant data.
Case 3: $n_2 \gg n_1 \Rightarrow C_R \to 0,\ R_D \to -\infty$. Data expansion in the second set.
Data compression is achieved when data redundancies are reduced or eliminated.
Three types of data redundancy: coding, interpixel, psychovisual.
Coding redundancy: more bits are used to code each gray level than necessary.
Variable-length coding: compression is achieved when more probable gray values are coded with fewer bits.
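The effect of variable-length coding can be sketched numerically. The four gray levels, their probabilities and both code tables below are hypothetical values chosen for illustration; they are not taken from the lecture.

```python
# Hypothetical gray-level probabilities and code tables (illustrative only).
probs = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}
fixed_code = {0: "00", 1: "01", 2: "10", 3: "11"}       # 2 bits for every level
variable_code = {0: "0", 1: "10", 2: "110", 3: "111"}   # shorter codes for likelier levels


def average_length(code, probs):
    """Average number of bits per pixel: sum over levels of p(level) * code length."""
    return sum(probs[level] * len(bits) for level, bits in code.items())


n1 = average_length(fixed_code, probs)           # bits per pixel with the fixed-length code
n2 = average_length(variable_code, probs)        # bits per pixel with the variable-length code
compression_rate = n1 / n2                       # C_R = n1 / n2
relative_redundancy = 1 - 1 / compression_rate   # R_D = 1 - 1 / C_R
```

With these assumed probabilities the fixed code spends 2 bits per pixel and the variable-length code 1.75, so about 12.5% of the fixed-length data is redundant.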
Other redundancies
Interpixel redundancy is related to the correlation of pixels' gray levels within the image: the value of any given pixel can be reasonably predicted from the values of its neighbors.
Run-length coding: the image is represented by the value and length of its constant gray-level runs.
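A minimal sketch of run-length coding for a single image row. The function names and the sample row are illustrative assumptions, not part of the lecture.

```python
def run_length_encode(row):
    """Encode one image row as a list of (gray value, run length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((value, 1))               # start a new run
    return runs


def run_length_decode(runs):
    """Invert run_length_encode."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


row = [0, 0, 0, 0, 255, 255, 0, 0, 0]
runs = run_length_encode(row)   # [(0, 4), (255, 2), (0, 3)]
```

The nine pixels are represented by three (value, length) pairs, and decoding restores the row exactly.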
Psychovisual (psychophysical) redundancy is associated with visual information that is not essential for visual processing. Eliminating psychovisually redundant information results in a loss of quantitative information and is referred to as quantization. The loss of information is measured by the root-mean-square (rms) error:
Reading: 1:8.1
$$e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat f(x,y) - f(x,y)]^2 \right]^{1/2}$$
with the error in each pixel:
$$e(x,y) = \hat f(x,y) - f(x,y)$$
The mean-square signal-to-noise ratio:
$$SNR_{ms} = \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat f(x,y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat f(x,y) - f(x,y)]^2}$$
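Both fidelity measures can be computed directly from their definitions. The sketch below assumes images stored as lists of rows; the 2x2 sample values are made up for illustration.

```python
import math


def rms_error(f, f_hat):
    """Root-mean-square error between the original f and its approximation f_hat."""
    m, n = len(f), len(f[0])
    total = sum((f_hat[x][y] - f[x][y]) ** 2 for x in range(m) for y in range(n))
    return math.sqrt(total / (m * n))


def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the approximation f_hat."""
    m, n = len(f), len(f[0])
    signal = sum(f_hat[x][y] ** 2 for x in range(m) for y in range(n))
    noise = sum((f_hat[x][y] - f[x][y]) ** 2 for x in range(m) for y in range(n))
    return signal / noise   # raises ZeroDivisionError when f_hat equals f


f = [[10, 20], [30, 40]]        # hypothetical original image
f_hat = [[12, 20], [30, 38]]    # hypothetical decompressed image
```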
Perception of image quality
The root-mean-square (rms) error and the mean-square SNR are not always good measures of the image quality as perceived by humans.
Reading: 1:8.1
Left: original image.
Center: uniform quantization to 16 levels. Compression rate is 2:1. RMS=6.93; SNR=10.25.
Right: improved gray-scale quantization. Compression rate is 2:1. RMS=6.78; SNR=10.39.
Error-free compression
Lossless predictive coding
It is based on eliminating the interpixel redundancy of closely spaced pixels by extracting and coding only the difference between the actual and the predicted value of each pixel. The system consists of an encoder and a decoder, each containing an identical predictor.
Reading: 1:8.4
The prediction error $e_n = f_n - \hat f_n$ is coded.
The decoder reconstructs $e_n$ and computes $f_n = e_n + \hat f_n$.
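A minimal sketch of the encoder and decoder pair, assuming the simplest possible predictor (the previous pixel in the row). The lecture does not fix a particular predictor here, so that choice and the sample row are assumptions.

```python
def encode(pixels):
    """Return prediction errors e_n = f_n - f_hat_n, with f_hat_n = f_(n-1)."""
    errors = [pixels[0]]                  # first pixel has no predecessor, send it as is
    for n in range(1, len(pixels)):
        prediction = pixels[n - 1]        # previous-pixel predictor (an assumption)
        errors.append(pixels[n] - prediction)
    return errors


def decode(errors):
    """Rebuild f_n = e_n + f_hat_n using the same predictor as the encoder."""
    pixels = [errors[0]]
    for n in range(1, len(errors)):
        prediction = pixels[n - 1]
        pixels.append(errors[n] + prediction)
    return pixels


row = [100, 102, 103, 103, 101, 98]
errors = encode(row)   # [100, 2, 1, 0, -2, -3]
```

The errors cluster around zero, which is what makes them cheaper to code than the raw gray levels, and the decoder recovers the row without loss.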
Predictive image compression
Reading: 1:8.4.4
Optimal predictor
The prediction is formed by a linear combination of $m$ previous pixels:
$$\hat f_n = \mathrm{round}\left[ \sum_{i=1}^{m} \alpha_i f_{n-i} \right]$$
The parameters $\alpha_i$ are set to minimize the mean-square prediction error:
$$E\{e_n^2\} = E\{[f_n - \hat f_n]^2\}$$
The result is DPCM: Differential Pulse Code Modulation.
Entropy is a measure of the information content of the image:
$$H = -\sum_{k=0}^{L-1} p(r_k) \log_2 [p(r_k)]$$
where $p(r_k)$ is the probability of gray level $r_k$ and $L$ is the number of gray levels.
The decrease in entropy reflects the removal of redundancy. The compression rate is about 2.
Figure: the prediction error image.
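The drop in entropy after prediction can be checked on a toy signal. The sketch below estimates first-order entropy from symbol counts; the ramp data and the previous-pixel predictor are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """First-order entropy estimate in bits per symbol: -sum of p * log2(p)."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


row = [10, 11, 12, 13, 14, 15, 16, 17]                        # a smooth gray-level ramp
errors = [row[n] - row[n - 1] for n in range(1, len(row))]    # previous-pixel prediction errors

h_pixels = entropy(row)      # eight equally likely values
h_errors = entropy(errors)   # every error has the same value
```

For this ramp the pixel values need 3 bits per symbol while the prediction errors carry no information at all, an extreme case of the redundancy removal described above.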
Lossy image compression
Reading: 1:8.5
Lossy predictive coding
A quantizer, which absorbs the nearest-integer function of the error-free encoder, is added. It maps the prediction error into a limited range of outputs, which establishes the amount of compression and distortion associated with lossy coding.
Predictions generated by the encoder and decoder must be equivalent; therefore the predictor input is generated as a function of past predictions and quantized errors:
$$\dot f_n = \dot e_n + \hat f_n$$
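A minimal sketch of lossy predictive coding, assuming a uniform quantizer with a hypothetical step size of 4 and a previous-sample predictor. The key point is that the encoder predicts from reconstructed values, exactly as the decoder will, so quantization errors do not accumulate.

```python
def quantize(error, step=4):
    """Uniform quantizer (hypothetical step size): map the error to a multiple of step."""
    return step * round(error / step)


def lossy_encode(pixels, step=4):
    """Return quantized errors; predictions use reconstructed, not original, pixels."""
    quantized_errors = []
    prediction = 0                          # initial prediction (an assumption)
    for f_n in pixels:
        e_dot = quantize(f_n - prediction, step)
        quantized_errors.append(e_dot)
        prediction = e_dot + prediction     # reconstructed value = quantized error + prediction
    return quantized_errors


def lossy_decode(quantized_errors):
    """Mirror of the encoder's feedback loop."""
    pixels = []
    prediction = 0
    for e_dot in quantized_errors:
        f_dot = e_dot + prediction
        pixels.append(f_dot)
        prediction = f_dot
    return pixels


row = [100, 101, 107, 107]
decoded = lossy_decode(lossy_encode(row))   # [100, 100, 108, 108]
```

Each decoded pixel differs from the original by at most half the step size; a larger step gives more compression and more distortion.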
Transform coding
Reading: 1:8.5.2
Transform coding is based on modifying the transform coefficients of an image, which are then quantized and coded. Many coefficients are small and can be coarsely quantized. Compression is achieved during the quantization of the transformed coefficients.
Transform coding
Reading: 1:8.5
Approximations using the Fourier, Hadamard and cosine transforms, with the corresponding rms errors: 1.28, 0.86 and 0.68 gray levels.
The image is usually divided into sub-images (blocks), say 8x8. Each block is transformed, generating 64 coefficients, of which only the 32 with the largest magnitude are retained. Compression by a factor of 2 is achieved during quantization of the transform coefficients.
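The block procedure can be sketched with a plain-Python orthonormal DCT. The helper names are illustrative, the test blocks are made up, and the coefficient selection is a simple magnitude threshold (ties at the threshold may keep slightly more than `keep` coefficients).

```python
import math


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that coefficients = C * block * C^T."""
    c = []
    for u in range(n):
        scale = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        c.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)])
    return c


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def code_block(block, keep=32):
    """Forward DCT, retain the `keep` largest-magnitude coefficients, inverse DCT."""
    c = dct_matrix(len(block))
    coeffs = matmul(matmul(c, block), transpose(c))
    magnitudes = sorted((abs(v) for row in coeffs for v in row), reverse=True)
    threshold = magnitudes[keep - 1]
    kept = [[v if abs(v) >= threshold else 0.0 for v in row] for row in coeffs]
    return matmul(matmul(transpose(c), kept), c)   # C is orthonormal, so its inverse is C^T


ramp = [[100 + 2 * x + 3 * y for y in range(8)] for x in range(8)]   # smooth test block
flat = [[50] * 8 for _ in range(8)]                                  # constant test block
ramp_rec = code_block(ramp, keep=32)
flat_rec = code_block(flat, keep=1)
```

Smooth blocks concentrate their energy in a few coefficients, so both test blocks are reconstructed almost exactly even though most coefficients are discarded.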
Transform coding
Reading: 1:8.5
Approximations using 25% of the DCT coefficients. a), b) 8x8 subimage results; c) zoomed original; d) 2x2 result; e) 4x4 result; f) 8x8 result.
Reconstruction error versus subimage size. 75% of the DCT coefficients were truncated.
DCT coding
Reading: 1:8.5
Approximations using threshold coding of the DCT coefficients. The compression rates are 34:1 and 67:1. The rms errors are 3.42 and 6.33 gray levels, respectively.
Wavelet coding
Reading: 1:8.5.3
Principle (similar): transform coefficients can be coded more efficiently than the original image pixels. The compression rate depends on the fraction of truncated coefficients.
Procedure: a $2^J \times 2^J$ image is processed. The computed transform converts a large portion of the image into horizontal, vertical and diagonal decomposition coefficients with zero mean and Laplacian-like distributions. Since many coefficients carry little visual information, they are efficiently quantized and coded. Decoding is via the symbol decoder and the inverse transform.
Advantages:
- Because the wavelet transform is computationally efficient (simple basis functions!) and inherently local, subdivision of the original image into sub-images is unnecessary. This eliminates the blocking artifacts that are typical of DCT-based compression at high compression rates.
- On average, the compression rate is higher than for the DCT-based method at a similar level of rms error in the decompressed images.
Wavelet coding
Reading: 1:8.5.3
Wavelet-based approximations were reconstructed from encodings that compressed the original image by 34:1 and 67:1. The rms errors are 2.29 (as opposed to 3.42 for DCT) and 2.96 (6.33 for DCT) gray levels, respectively.
Wavelet coding
Reading: 1:8.5.3
Reconstructions from 108:1 and 167:1 wavelet-based encodings of the original image, with rms errors of 3.72 and 4.73 gray levels, respectively. At more than twice the level of compression, the most highly compressed wavelet-based reconstruction has only 75% of the error of the less compressed transform-based (DCT) result, and superior perceived quality.
Still image compression standards
JPEG standard
The compression is performed in three sequential steps: DCT computation, quantization and variable-length coding.
- Image blocks are of 8x8 size;
- Level shifting by subtracting $2^{n-1}$ (128 for n = 8 bits per pixel).
Reading: 1:8.6
Still image compression standards
JPEG standard
- When transformed with the forward DCT, the matrix becomes:
Reading: 1:8.6
JPEG standard
- Scaled and truncated coefficients are:
The rms error after the overall compression and reconstruction process is ~5.9 gray levels.
Video compression MPEG standard
Reading: 1:4.4
Video compression standards extend the transform-based still image compression techniques to include methods for reducing temporal or frame-to-frame redundancies.
The MPEG standard (Moving Picture Experts Group) uses a motion estimator:
• Macro-blocks are compared to neighboring blocks of the previous frame and used to compute a motion-compensated prediction error.
• The prediction error is discrete cosine transformed in 8x8 blocks, quantized and coded.
The compression is performed in three sequential steps: 1) DCT computation, 2) quantization and 3) variable-length coding. The principal difference from still image coding is the input, which may be a conventional image block or the difference between the conventional block and a prediction of it based on similar blocks in previous video frames.
Three types of encoded output frames:
Independent frame (I-frame): the independent compression of a still image frame is stored. All standards require the periodic insertion of such frames into the compressed codestream.
Predictive frame (P-frame): the compressed difference between the frame and its prediction based on the previous frame.
Bidirectional frame (B-frame): a similar compressed difference based on the previous and next frames.
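The motion estimation step can be sketched as exhaustive block matching. This is a simplified illustration, not the MPEG procedure itself: the matching criterion (sum of absolute differences), the 2x2 block size, the one-pixel search range and the two tiny frames are all assumptions.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def estimate_motion(prev_frame, cur_frame, top, left, size=2, search=1):
    """Find the displacement (dy, dx) into prev_frame that best predicts the current block."""
    target = get_block(cur_frame, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > len(prev_frame) or l + size > len(prev_frame[0]):
                continue   # candidate block would leave the frame
            cost = sad(target, get_block(prev_frame, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    _, dy, dx = best
    reference = get_block(prev_frame, top + dy, left + dx, size)
    # Motion-compensated prediction error: current block minus best-matching block.
    error = [[c - r for c, r in zip(cr, rr)] for cr, rr in zip(target, reference)]
    return (dy, dx), error


prev_frame = [[0, 0, 0, 0], [0, 9, 8, 0], [0, 7, 6, 0], [0, 0, 0, 0]]
cur_frame = [[0, 0, 0, 0], [0, 0, 9, 8], [0, 0, 7, 6], [0, 0, 0, 0]]   # object moved one pixel right
motion, error = estimate_motion(prev_frame, cur_frame, top=1, left=2)
```

Here the object is found one pixel to the left in the previous frame, and the prediction error that would be transformed and coded is all zeros.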
Hough transform
Reading: 5:5.2; 1:10.2.2
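Idea: to map a pattern detection problem, for instance the detection of a curve, into a peak detection problem in the space of the parameters of the curve.
A line in the image corresponds to a point in the parameter space (a, b), and an image point corresponds to a line in the parameter space:
$$y = ax + b \;\Rightarrow\; b = -xa + y$$
$(x_1, y_1)$ is on the line $(a', b')$: $y_1 = a'x_1 + b' \;\Rightarrow\; b' = -x_1 a' + y_1$
$(x_2, y_2)$ is on the line $(a', b')$: $y_2 = a'x_2 + b' \;\Rightarrow\; b' = -x_2 a' + y_2$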
Hough transform
Reading: 5:5.2; 1:10.2.2
Keep the parameter space finite:
Both parameters a, b vary within the range [-∞, ∞], so there is no way to sample the whole parameter space. Moreover, with the equation y = ax + b the slope of a vertical line is infinite, which means that vertical lines cannot be represented in the parameter space (a, b).
The way to solve this problem is to use the polar representation of a line, in which both parameters are finite and any line can be represented:
$$\rho = x \cos\theta + y \sin\theta$$
Note: an image point is mapped into a sinusoid in the parameter space; a line is mapped into a single point.
Q: what is the interval for the angle and distance parameters?
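A minimal voting sketch for the polar line representation. The discretization is an assumption made for this example: one-degree steps for the angle over [0, π) and integer bins for the distance, with the distance allowed to be negative; other choices are possible.

```python
import math


def hough_lines(points, width, height, n_theta=180):
    """Accumulate votes in (rho, theta) space for rho = x cos(theta) + y sin(theta)."""
    max_rho = int(math.ceil(math.hypot(width, height)))
    # accumulator[rho_index][theta_index], with rho_index = rho + max_rho
    accumulator = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for x, y in points:
        for t in range(n_theta):
            theta = t * math.pi / n_theta   # theta sampled in [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            accumulator[int(round(rho)) + max_rho][t] += 1
    return accumulator, max_rho


def strongest_line(accumulator, max_rho, n_theta=180):
    """Return (rho, theta, votes) of the accumulator bin with the most votes."""
    votes, r, t = max((v, r, t) for r, row in enumerate(accumulator)
                      for t, v in enumerate(row))
    return r - max_rho, t * math.pi / n_theta, votes


vertical = [(5, y) for y in range(100)]     # points on the line x = 5
horizontal = [(x, 7) for x in range(100)]   # points on the line y = 7
acc_v, max_rho = hough_lines(vertical, 100, 100)
acc_h, _ = hough_lines(horizontal, 100, 100)
```

The vertical line, which has no finite slope in the (a, b) form, shows up as an ordinary peak at angle 0 and distance 5.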
Hough transform
Reading: 5:5.2; 1:10.2.2
HT can be generalized to the detection of circles:
$$(x - c_1)^2 + (y - c_2)^2 = r^2$$
Or any curve of the form:
$$f(x, y, \mathbf{a}) = 0$$
where $\mathbf{a} = [a_1, \ldots, a_p]^T$ is a vector of parameters.
Hough transform
Reading: 5:5.2; 1:10.2.2
Figure panels: original image; Laplace convolution; binary circle points; HT for circles.
Summary:
HT is a voting algorithm: each point votes for all combinations of parameters of a pattern that it could be part of.
Attractive features:
1. Points are processed independently, so partly occluded patterns can also be detected. It is more efficient than template matching.
2. Robust to noise, as spurious points are unlikely to contribute consistently to any single bin and only generate background noise.
3. Detects multiple instances of a model pattern in a single pass.
Basic morphological operators
Reading: 1:9.1; 2:11
The difference of two sets A and B:
$$A - B = \{ w \mid w \in A,\ w \notin B \} = A \cap B^c$$
Translation of A by z:
$$(A)_z = \{ c \mid c = a + z,\ a \in A \}$$
Reflection of B:
$$\hat B = \{ w \mid w = -b,\ b \in B \}$$
Logic operations
The principal logic operations are:
AND
OR
NOT (Complement)
XOR (exclusive OR)
Difference from the set operations: logic operations are restricted to binary variables.
Reading: 1:9.1; 2:11
Dilation
Reading: 1:9.1; 2:11
The dilation of A by B:
$$A \oplus B = \{ z \mid (\hat B)_z \cap A \neq \emptyset \}$$
B is the structuring element of the dilation.
Figure: the solid line shows the limit beyond which any further displacement of the origin of B would cause the intersection of A and B to be empty.
The dilation is commutative and associative:
$$A \oplus B = B \oplus A, \qquad (A \oplus B) \oplus D = A \oplus (B \oplus D)$$
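Dilation can be sketched directly on coordinate sets. The representation is an assumption of this example: a binary image is a set of (row, column) foreground coordinates and the structuring element is a set of offsets around its origin. The sample sets are made up.

```python
def reflect(b):
    """Reflection of B about its origin."""
    return {(-r, -c) for r, c in b}


def translate(s, z):
    """Translation of the set s by the vector z."""
    return {(r + z[0], c + z[1]) for r, c in s}


def dilate(a, b):
    """All z for which the reflected B, translated by z, overlaps A."""
    # Only points of the form a + b can satisfy the condition, so test just those.
    candidates = {(ar + br, ac + bc) for ar, ac in a for br, bc in b}
    b_hat = reflect(b)
    return {z for z in candidates if translate(b_hat, z) & a}


A = {(0, 0), (0, 2)}              # two pixels separated by a one-pixel gap
B = {(0, -1), (0, 0), (0, 1)}     # horizontal 1x3 structuring element
dilated = dilate(A, B)
```

The gap pixel (0, 1) is part of the result, which is the gap-bridging effect, and swapping the arguments gives the same set.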
Dilation
Reading: 1:9.1; 2:11
Example: application of dilation for bridging gaps.
Figure: the structuring element of the dilation.
Erosion
Reading: 1:9.1; 2:11
The erosion of A by B:
$$A \ominus B = \{ z \mid (B)_z \subseteq A \}$$
Figure: the solid line shows the limit beyond which any further displacement of the origin of B would cause the set to cease being completely contained in A.
Dilation and erosion are duals of each other with respect to set complementation and reflection:
$$(A \ominus B)^c = A^c \oplus \hat B$$
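Erosion and its duality with dilation can be checked on small coordinate sets. The set representation, the finite frame W used for complements and the test shapes are assumptions of this sketch; dilation is written as the set of all sums a + b, which is equivalent to its definition.

```python
def reflect(b):
    return {(-r, -c) for r, c in b}


def translate(s, z):
    return {(r + z[0], c + z[1]) for r, c in s}


def dilate(a, b):
    """Dilation written as the set of all sums a + b."""
    return {(ar + br, ac + bc) for ar, ac in a for br, bc in b}


def erode(a, b):
    """All z for which B translated by z is completely contained in A."""
    b0 = next(iter(b))                                    # any element of B
    candidates = {(r - b0[0], c - b0[1]) for r, c in a}   # z + b0 must lie in A
    return {z for z in candidates if translate(b, z) <= a}


W = {(r, c) for r in range(7) for c in range(7)}              # finite frame for complements
A = {(r, c) for r in range(2, 5) for c in range(2, 5)}        # 3x3 square away from the border
B = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}      # 3x3 structuring element

eroded = erode(A, B)                       # only the center pixel survives
lhs = W - eroded                           # complement of the erosion, taken within W
rhs = dilate(W - A, reflect(B)) & W        # dilation of the complement, clipped to W
```

Both sides of the duality agree on the frame because A does not touch the border and B contains its origin.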
Opening
Reading: 1:9.3; 2:11.3
Opening A by B is the erosion of A by B followed by a dilation of the result by B:
$$A \circ B = (A \ominus B) \oplus B$$
Erosion and dilation are not inverse transformations: if the image is eroded and then dilated, the original image is not re-obtained. Instead, the result is a less detailed version of the original image.
Opening: smooths borders, eliminates thin protrusions.
Geometric interpretation of the opening: B is viewed as a ball rolling along the inside of the boundary of A.
Closing
Reading: 1:9.3; 2:11.3
Closing A by B is the dilation of A by B followed by erosion of the result by B:
$$A \bullet B = (A \oplus B) \ominus B$$
Geometric interpretation for closing is similar, except that the ball rolls on the outside of the boundary.
Opening and closing are dual operators:
$$(A \bullet B)^c = A^c \circ \hat B$$
Reapplication of the opening or closing does not change the previous result:
$$(X \circ B) \circ B = X \circ B, \qquad (X \bullet B) \bullet B = X \bullet B$$
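Opening and closing can be composed from set-based erosion and dilation. As before, the coordinate-set representation and the test shapes are assumptions of this sketch.

```python
def translate(s, z):
    return {(r + z[0], c + z[1]) for r, c in s}


def dilate(a, b):
    return {(ar + br, ac + bc) for ar, ac in a for br, bc in b}


def erode(a, b):
    b0 = next(iter(b))
    candidates = {(r - b0[0], c - b0[1]) for r, c in a}
    return {z for z in candidates if translate(b, z) <= a}


def opening(a, b):
    """Erosion followed by dilation."""
    return dilate(erode(a, b), b)


def closing(a, b):
    """Dilation followed by erosion."""
    return erode(dilate(a, b), b)


B = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}       # 3x3 structuring element
square = {(r, c) for r in range(1, 4) for c in range(1, 4)}
with_protrusion = square | {(2, 4)}                            # thin one-pixel protrusion
opened = opening(with_protrusion, B)                           # protrusion is removed

full = {(r, c) for r in range(5) for c in range(5)}
with_hole = full - {(2, 2)}                                    # one-pixel hole
closed = closing(with_hole, B)                                 # hole is filled
```

Applying either operator a second time with the same B leaves the result unchanged.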
Properties of opening & closing
Reading: 1:9.3; 2:11.3
Filtering by morphological operations
Note the elimination of noise
Reading: 1:9.3; 2:11.3
Morphological hit-or-miss operator
Reading: 1:9.4;
The objective is to find the location of a particular shape. The origin of each shape is located at its center of gravity.
Introduce a structuring element B associated with X, the shape sought: B is composed of X and its background, W - X, where W is a small window enclosing X. Then:
$$A \otimes B = (A \ominus X) \cap [A^c \ominus (W - X)]$$
Let $B = (B_1, B_2)$ with $B_1 = X$ and $B_2 = (W - X)$; then the hit-or-miss transformation becomes:
$$A \otimes B = (A \ominus B_1) \cap [A^c \ominus B_2]$$
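A minimal sketch of the hit-or-miss transform on coordinate sets, used here to locate isolated pixels. The shape X (a single pixel), the 3x3 window W, the finite frame used for the complement and the test image are all illustrative assumptions. Because the complement is taken within a finite frame, shapes touching the frame border are not detected.

```python
def translate(s, z):
    return {(r + z[0], c + z[1]) for r, c in s}


def erode(a, b):
    """All z for which B translated by z is completely contained in A."""
    b0 = next(iter(b))
    candidates = {(r - b0[0], c - b0[1]) for r, c in a}
    return {z for z in candidates if translate(b, z) <= a}


def hit_or_miss(a, x, window, frame):
    """Locations where X fits inside A and its background W - X fits inside the complement of A."""
    background = window - x      # W - X
    a_complement = frame - a     # complement of A within the finite frame
    return erode(a, x) & erode(a_complement, background)


frame = {(r, c) for r in range(6) for c in range(6)}
A = {(1, 1), (3, 3), (3, 4)}                                   # one isolated pixel, one 2-pixel blob
X = {(0, 0)}                                                   # shape sought: a single pixel
W = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}       # 3x3 window around X
matches = hit_or_miss(A, X, W, frame)
```

Only the isolated pixel at (1, 1) matches; the two blob pixels fail because each has a foreground neighbor inside the window.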
Boundary detection
Reading: 1:9.5;
$$\beta(A) = A - (A \ominus B)$$
Q: What is the width of the boundary given the structuring element B? What will happen if B is 5x5 pixels in size?
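Boundary extraction is a one-line combination of erosion and set difference. The sketch below uses coordinate sets and a hypothetical 5x5 square so that the effect of the structuring element size can be tried out.

```python
def translate(s, z):
    return {(r + z[0], c + z[1]) for r, c in s}


def erode(a, b):
    b0 = next(iter(b))
    candidates = {(r - b0[0], c - b0[1]) for r, c in a}
    return {z for z in candidates if translate(b, z) <= a}


def boundary(a, b):
    """A minus the erosion of A by B."""
    return a - erode(a, b)


def square_element(half_size):
    """Square structuring element of side 2 * half_size + 1, origin at the center."""
    offsets = range(-half_size, half_size + 1)
    return {(dr, dc) for dr in offsets for dc in offsets}


A = {(r, c) for r in range(5) for c in range(5)}   # filled 5x5 square
boundary_3x3 = boundary(A, square_element(1))      # 16 pixels: the outer ring
boundary_5x5 = boundary(A, square_element(2))      # 24 pixels: everything but the center
```

Comparing the two results shows how the boundary thickness grows with the size of B.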
Region filling
Reading: 1:9.5;
Start with a point p inside the region and assign it the value 1 ($X_0 = p$). Proceed iteratively (conditional dilation):
$$X_k = (X_{k-1} \oplus B) \cap A^c, \quad k = 1, 2, 3, \ldots$$
Stop criterion: $X_k = X_{k-1}$
Result of filling: $F(A) = X_k \cup A$
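The conditional dilation loop can be sketched on coordinate sets. The cross-shaped (4-connected) structuring element and the ring-shaped boundary are assumptions of this example; the lecture text does not specify B. The boundary must be closed, otherwise the loop below would grow without bound.

```python
def dilate(a, b):
    return {(ar + br, ac + bc) for ar, ac in a for br, bc in b}


def fill_region(a, seed, b):
    """Grow from the seed by repeated dilation, never entering the boundary set A."""
    x = {seed}
    while True:
        x_next = {z for z in dilate(x, b) if z not in a}   # dilate, then intersect with complement of A
        if x_next == x:                                    # stop criterion: no change
            return x | a                                   # filled interior united with the boundary
        x = x_next


B = {(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)}             # cross-shaped element (assumed)
full = {(r, c) for r in range(5) for c in range(5)}
ring = {(r, c) for r, c in full if r in (0, 4) or c in (0, 4)}   # closed boundary A
filled = fill_region(ring, seed=(2, 2), b=B)
```

Starting from the center, the region grows until it reaches the ring, and the final union is the solid 5x5 square.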
Extraction of connected components
Reading: 1:9.5;
Start with a point belonging to the component. Proceed iteratively:
$$X_k = (X_{k-1} \oplus B) \cap A, \quad k = 1, 2, 3, \ldots$$
Stop criterion: $X_k = X_{k-1}$
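The same iteration, intersected with A instead of its complement, extracts one connected component. The 3x3 structuring element (8-connectivity) and the sample image are assumptions of this sketch.

```python
def dilate(a, b):
    return {(ar + br, ac + bc) for ar, ac in a for br, bc in b}


def connected_component(a, seed, b):
    """Grow from the seed by repeated dilation, keeping only pixels that belong to A."""
    x = {seed}
    while True:
        x_next = dilate(x, b) & a
        if x_next == x:          # stop criterion: no change
            return x
        x = x_next


B = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}   # 3x3 element, 8-connectivity (assumed)
A = {(0, 0), (0, 1), (1, 1), (4, 4), (4, 5)}               # two separate components
first = connected_component(A, (0, 0), B)
second = connected_component(A, (4, 4), B)
```

Repeating the extraction from a seed in each remaining part of A labels all components one by one.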
Extraction of connected components
Connected components are used frequently for automated inspection.
The task: detection of foreign objects in processed food before packaging.
Reading: 1:9.5;