University of Sunderland CIS303 Advanced Forensic Computing BSc Forensic Computing
CIS303 Advanced Forensic Computing
Dr Giles Oatley
Object identification – part 1. Image representation
• Having segmented regions of interest out of an image, the next task is to identify them. Identification is usually divided into two tasks: evaluating suitable quantitative descriptors and then matching the descriptors to known objects. In this lecture we will look at object representation and description.
• Sub topics
– Topology & boundary descriptors
– Chain codes & shape numbers
– Minimum-Perimeter Polygons (MPPs)
– Fourier descriptors
– Statistical moments
An image region is said to be convex if when you draw any straight line connecting two points within the region the line lies completely within the region or along an edge of the region.
The convex hull of a region ‘A’ is the smallest convex region that encloses ‘A’. The convex deficiency is the difference between the convex hull and ‘A’.
[Figure: a convex region and a non-convex region. For the letter ‘P’, the red hatched region is the convex hull and the green hatched regions are the convex deficiency.]
The convex hull
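The definitions above can be tried out on a tiny pixel set. The following is a minimal sketch in plain Python (not the MATLAB used in the lecture), assuming a region is given as a list of integer (x, y) pixel coordinates and that the hull is taken around pixel centres. The function names and the `L_SHAPE` test region are illustrative only, and the numbers need not match what `regionprops` reports for the same shape.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_pixels(region):
    """All grid points lying inside or on the convex hull of the region."""
    hull = convex_hull(region)
    xs = [x for x, _ in region]
    ys = [y for _, y in region]
    n = len(hull)
    return {
        (x, y)
        for x in range(min(xs), max(xs) + 1)
        for y in range(min(ys), max(ys) + 1)
        if all(cross(hull[i], hull[(i + 1) % n], (x, y)) >= 0 for i in range(n))
    }


def convex_deficiency(region):
    """Pixels that belong to the convex hull but not to the region itself."""
    return hull_pixels(region) - set(region)


def solidity(region):
    """(Pixels in region) / (pixels in convex hull)."""
    return len(set(region)) / len(hull_pixels(region))


# A small L-shaped region: its hull is a triangle with one missing pixel.
L_SHAPE = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
print(convex_deficiency(L_SHAPE), solidity(L_SHAPE))
```

A convex region gives an empty deficiency set and a solidity of 1; the more "dented" the region, the lower the solidity.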
Properties of a region which are topologically invariant are clearly good candidates for region descriptors. One such property is the Euler number which we can define in two ways:
Using the usual definition of 4 or 8 connected regions we can define the Euler number ‘E’ as C – H where C is the number of connected regions and ‘H’ the total number of holes.
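Examples: for the characters ‘7’, ‘8’ and ‘9’, E = 1, E = -1 and E = 0 respectively.
[Figure: a region drawn as a network of straight-line segments, with labelled edges (Q), vertices (V), holes (H) and faces (F)]
E = C – H = V – Q + F
In the example above:
E = 7 – 11 + 2 = -2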
The Euler number
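The first definition, E = C – H, can be sketched directly by counting connected components. This is a minimal illustration in plain Python rather than MATLAB, assuming the image is a list of strings with `#` for object pixels, 8-connectivity for the object and 4-connectivity for the background. These conventions and all names are assumptions of the sketch, not part of the lecture code.

```python
from collections import deque

N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def count_components(cells, neighbours):
    """Count connected groups of cells using a breadth-first flood fill."""
    cells = set(cells)
    count = 0
    while cells:
        count += 1
        queue = deque([cells.pop()])
        while queue:
            r, c = queue.popleft()
            for dr, dc in neighbours:
                n = (r + dr, c + dc)
                if n in cells:
                    cells.remove(n)
                    queue.append(n)
    return count


def euler_number(rows):
    """E = C - H for a binary image given as strings ('#' marks the object)."""
    h, w = len(rows), len(rows[0])
    # Shift the object by one pixel so a ring of background surrounds it.
    fg = {(r + 1, c + 1) for r in range(h) for c in range(w) if rows[r][c] == "#"}
    bg = {(r, c) for r in range(h + 2) for c in range(w + 2)} - fg
    components = count_components(fg, N8)
    # Every background component except the outer one is a hole.
    holes = count_components(bg, N4) - 1
    return components - holes


SEVEN = ["###", "..#", "..#"]
EIGHT = ["###", "#.#", "###", "#.#", "###"]
NINE = ["###", "#.#", "###", "..#", "###"]
print(euler_number(SEVEN), euler_number(EIGHT), euler_number(NINE))
```

The three toy glyphs mirror the ‘7’, ‘8’ and ‘9’ example: no hole, two holes and one hole.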
MatLab example
function Eulertest
% Find the Euler number and convex deficiency of objects.
% First read in the image and convert to binary 'S'.
T = imread('ABCtest.jpg');
G = rgb2gray(T);
S = G > 128;
% Convert to a label image, as required by regionprops.
[L, n] = bwlabel(S);
% Get the region properties required and dump them into matrices.
stats = regionprops(L, 'EulerNumber', 'Solidity');
EN = [stats.EulerNumber];
SN = [stats.Solidity];
CD = 1 - SN;   % convex deficiency as a fraction of the convex hull area
% Region indices follow bwlabel's labelling order, which here puts B at 3 and C at 2.
fprintf('%s %s %2.0f %s %g\n', 'A ', 'Euler number ', EN(1), ' Convex deficiency ', CD(1));
fprintf('%s %s %2.0f %s %g\n', 'B ', 'Euler number ', EN(3), ' Convex deficiency ', CD(3));
fprintf('%s %s %2.0f %s %g\n', 'C ', 'Euler number ', EN(2), ' Convex deficiency ', CD(2));
The source image
Note: in the Matlab code, the convex deficiency is derived from the regionprops property ‘Solidity’ as CD = 1 – Solidity, where
Solidity = (Pixels in region)/(Pixels in convex hull)
The analysis :
A Euler number = 0 Solidity = 0.295291
B Euler number = -1 Solidity = 0.300041
C Euler number = 1 Solidity = 0.47308
Results
% For fun we will display the convex hull deficiency
H = regionprops(L, 'ConvexImage', 'BoundingBox');
BB = [H.BoundingBox];   % concatenated [x y width height] for each region
for J = 1:3
    term = 4*(J-1);
    % Upper-left and lower-right pixels of the bounding box
    ULX = round(BB(term+1));
    ULY = round(BB(term+2));
    LRX = round(BB(term+1) + BB(term+3) - 1);
    LRY = round(BB(term+2) + BB(term+4) - 1);
    SI = S(ULY:LRY, ULX:LRX);        % crop the letter from the binary image
    CDI = H(J).ConvexImage - SI;     % convex hull minus letter = deficiency
    termplot = 3*(J-1);
    subplot(3, 3, termplot+1), imshow(SI);
    subplot(3, 3, termplot+2), imshow(H(J).ConvexImage);
    subplot(3, 3, termplot+3), imshow(CDI);
end
Displaying the convex deficiency
[Figure columns: original letter, convex hull, deficiency]
Results
• Skeletonization: an approach to representing the structured shape of a planar region.
• Defined via the medial axis transformation (MAT):
To find the MAT of region R with border B, for each point p in R, find its closest neighbour in B. If p has more than one such neighbour, it belongs to the medial axis (skeleton) of R.
• Note that the concept of ‘closest’ (and hence the medial axis) depends upon the definition of ‘distance’ between pixels.
• Algorithms use morphological operators (thinning) – see texts (and last week) for details. They are often computationally expensive.
Skeleton of a region
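The MAT definition can be applied literally on a small region. Below is a brute-force sketch in plain Python, assuming Euclidean distance, a region given as (x, y) pixels, and a border B made of the region pixels that have a 4-neighbour outside the region. It is meant only to illustrate the definition; it is not one of the morphological thinning algorithms mentioned above, and all names are illustrative.

```python
def medial_axis(region):
    """Region pixels whose nearest border pixel is not unique (MAT definition)."""
    region = set(region)
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    # Border B: region pixels with at least one 4-neighbour outside the region.
    border = [
        (x, y)
        for (x, y) in region
        if any((x + dx, y + dy) not in region for dx, dy in steps)
    ]
    skeleton = set()
    for (x, y) in region:
        # Squared Euclidean distances are integers, so ties are detected exactly.
        d2 = [(x - bx) ** 2 + (y - by) ** 2 for bx, by in border]
        if d2.count(min(d2)) > 1:
            skeleton.add((x, y))
    return skeleton


# A rectangle 5 pixels wide and 3 pixels high.
RECT = [(x, y) for x in range(5) for y in range(3)]
print(sorted(medial_axis(RECT)))
```

For the small rectangle the interior pixels of the middle row are each equidistant from at least two border pixels, so they form the skeleton. The cost grows with (region size) × (border size), which hints at why practical algorithms take a different route.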
Region descriptors
The Euler number and the convex deficiency are examples of region descriptors because they define properties of the complete segmented region.
Boundary descriptors
A boundary descriptor, on the other hand, concentrates on describing properties of the region’s boundary.
Thus, the segmentation techniques studied earlier yield raw data in the form of pixels along a boundary or contained in a region.
Although this data is sometimes used directly, it is normal practice to compact the data into representations that are more useful in the computation of descriptors.
We will now look at some different approaches.
Region & boundary descriptors
[Figure: a boundary with sampled points (xi, yi)]
A boundary descriptor can be formed by simply recording the co-ordinates of sampled points on the boundary.
In practice this is rarely done because the descriptor produced can be very long and noise can seriously interfere with the result.
A better approach is to cover the boundary with a grid and record the grid points closest to the boundary. A 4 code or an 8 code descriptor can then be formed by recording the direction moved to connect up the recorded grid points.
[Figure: direction numbering for the 4 code (directions 0 to 3) and the 8 code (directions 0 to 7)]
Chain codes
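Once the grid points have been recorded in order, forming the code is a table look-up on the step between consecutive points. The sketch below is plain Python (not the lecture's MATLAB) and assumes the common convention that direction 0 points along +x and the numbers increase counter-clockwise in 45 degree steps, with the y axis pointing up. The direction table and function name are assumptions of this sketch.

```python
# Assumed convention: 0 = east, numbered counter-clockwise, y axis pointing up.
DIRECTIONS_8 = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code_8(points):
    """8-direction chain code of a closed boundary given as ordered grid points."""
    code = []
    # Pair each point with its successor, wrapping round to close the boundary.
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS_8:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-neighbours")
        code.append(DIRECTIONS_8[step])
    return code


SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]
DIAMOND = [(1, 0), (2, 1), (1, 2), (0, 1)]
print(chain_code_8(SQUARE), chain_code_8(DIAMOND))
```

A 4 code follows the same pattern with only the four axis-aligned steps in the table.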
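[Figure: 8 code direction numbering (0 to 7) and an example boundary with its starting point marked by a red dot]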
The basic chain code starting at the ‘red’ dot is :
0 7 6 7 6 6 5 5 4 4 2 3 3 1 2 2
Exercise: Calculate the 4-directional code for the above shape.
8 code example
A signature is a 1-D functional representation of a boundary.
It may be generated in several ways.
The simplest is to plot the distance from an interior point (e.g. the centroid) to the boundary as a function of angle:
Signatures
The basic idea is to reduce the boundary representation to a 1-D function, rather than the original 2-D representation.
It only works if the vector extending from the origin to the boundary intersects the boundary only once, thus yielding a single valued function of increasing angle.
It therefore normally excludes objects with deep, narrow concavities, or long thin protrusions.
Note that the signatures shown on the previous slide are invariant to translation, but depend on rotation and scaling. It is therefore necessary to remove the dependency on orientation and size, whilst preserving the fundamental shape of the waveforms.
For example, align the axis of rotation along the major axis of the object (see later), and normalize by scaling all functions so that they span the same range of values (e.g. [0, 1]). This normalization tends to be susceptible to noise, because it relies only on the minimum and maximum values.
Signatures
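The distance-versus-angle signature and its [0, 1] normalization can be sketched as follows. This is a minimal plain-Python illustration, assuming the boundary is a list of (x, y) samples and taking the mean of those samples as the interior point. Both choices and all names are assumptions of the sketch; rotation alignment along the major axis is not included.

```python
import math


def signature(boundary):
    """Distance from the centroid of the boundary samples as a function of angle."""
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    # One (angle, distance) pair per sample, sorted by angle in [0, 2*pi).
    return sorted(
        (math.atan2(y - cy, x - cx) % (2 * math.pi), math.hypot(x - cx, y - cy))
        for x, y in boundary
    )


def normalise(sig):
    """Rescale the distances to span [0, 1], removing the dependence on size."""
    rs = [r for _, r in sig]
    lo, hi = min(rs), max(rs)
    if hi == lo:  # constant signature, e.g. a circle
        return [(a, 0.0) for a, _ in sig]
    return [(a, (r - lo) / (hi - lo)) for a, r in sig]


SQUARE = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
BIG_SQUARE = [(3 * x, 3 * y) for x, y in SQUARE]
small = [round(r, 6) for _, r in normalise(signature(SQUARE))]
big = [round(r, 6) for _, r in normalise(signature(BIG_SQUARE))]
print(small == big)
```

The square and its three-times-larger copy give the same normalized signature, alternating between 0 at the edge midpoints and 1 at the corners.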
Length: Simply counting the number of pixels along the contour gives a rough approximation to the length.
Diameter: The diameter of a boundary B is given by:
$\mathrm{Diam}(B) = \max_{i,j} \, [ D(p_i, p_j) ]$
where D is a distance measure and $p_i$ and $p_j$ are points on the boundary.
Related to this are the major and minor axes (major – line segment connecting single pair of farthest points; minor – line perpendicular to major axis, such that box passing through outer 4 points of intersection with boundary completely encloses boundary).
Eccentricity – ratio of major to minor axes
Curvature: Rate of change of slope. For example, use the difference between the slopes of adjacent boundary segments (which have been represented as straight lines). Changes in slope can be characterised by ranges of the change in slope (e.g. < 10° ‘straight’, 80°–100° ‘corner’ etc.), or as ‘concave’, ‘convex’ etc.
Descriptors - simple boundary descriptors
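The diameter definition translates into a search over all pairs of boundary points. A minimal sketch in plain Python, assuming Euclidean distance for D and a boundary given as a list of (x, y) points; the function name is illustrative. The pair that attains the maximum gives the end points of the major axis.

```python
import math
from itertools import combinations


def diameter(boundary):
    """Return (Diam(B), p_i, p_j) for the farthest pair of boundary points."""
    p, q = max(combinations(boundary, 2), key=lambda pair: math.dist(*pair))
    return math.dist(p, q), p, q


# Corner points of a 4 x 3 rectangle: the diameter is its diagonal.
RECT_CORNERS = [(0, 0), (4, 0), (4, 3), (0, 3)]
length, p, q = diameter(RECT_CORNERS)
print(length, p, q)
```

The search is quadratic in the number of boundary points, which is usually acceptable for a single object boundary.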
■ The chain code discussed previously depends on the starting point.
■ The shape number is derived from the chain code and is independent of the starting point.
■ To calculate the shape number, take the first difference of the chain code, then circularly shift it to form the integer of smallest magnitude.
■ The order n of the shape number is the number of digits in its representation.
Chain code: 0 0 3 2 2 1
Difference: 3 0 3 3 0 3
Shape no.: 0 3 3 0 3 3
A more sophisticated descriptor – the shape number
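The two steps (circular first difference, then the smallest circular shift) can be sketched in a few lines of plain Python. The sketch assumes the chain code is a list of integers and that the difference counts counter-clockwise direction changes modulo the number of directions; the function names are illustrative.

```python
def first_difference(chain, directions=4):
    """Circular first difference: direction changes between adjacent codes.

    The first element uses the transition from the last code to the first.
    """
    return [(chain[i] - chain[i - 1]) % directions for i in range(len(chain))]


def shape_number(chain, directions=4):
    """Circular shift of the first difference that forms the smallest integer."""
    diff = first_difference(chain, directions)
    return min(diff[i:] + diff[:i] for i in range(len(diff)))


CHAIN = [0, 0, 3, 2, 2, 1]
print(first_difference(CHAIN), shape_number(CHAIN))

# Starting the same boundary at a different point gives the same shape number.
SHIFTED = CHAIN[2:] + CHAIN[:2]
print(shape_number(SHIFTED) == shape_number(CHAIN))
```

Comparing the shifted lists element by element is equivalent to comparing the integers they spell out, because all the shifts have the same number of digits.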
Note that:
• The difference is obtained by counting (counter clockwise) the number of directions that separate two adjacent directions of the chain code.
• The code is treated as a circular sequence so that the first element of the difference is calculated using the transition between the last and first components of the chain.
Exercise: Calculate the chain code, difference and shape number of the following:
Chain code: 0 3 0 3 2 2 1 1
Difference: 3 3 1 3 3 0 3 0
Shape number: 0 3 0 3 3 1 3 3
Shape number
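[Figure: 8 code direction numbering (0 to 7) and an example boundary with its starting point marked by a red dot]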
The basic chain code starting at the ‘red’ dot is :
0 7 6 7 6 6 5 5 4 4 2 3 3 1 2 2
Difference: 6 7 7 1 7 0 7 0 7 0 6 1 0 6 1 0
Shape number: 0 6 1 0 6 1 0 6 7 7 1 7 0 7 0 7
Using difference codes and reordering to make the smallest integer produces a code independent of the starting point and orientation of the boundary.
Back to the 8 code example
■ Although the first difference of a chain code is independent of rotation, in general the coded boundary depends upon the orientation of the grid.
■ It is usual to normalize the grid by aligning it with the basic rectangle i.e. the box enclosing the major and minor axes of an object.
■ Note that to implement these algorithms in practice is a non-trivial task! See Gonzalez and Woods (DIPUM) for a full treatment, including M-files.
Grid orientation
Further example (DIPUM)
[Figure: a star mask superimposed on a boundary, picking out adjacent boundary points (xj, yj) and (xj+1, yj+1)]
A Fourier descriptor begins with digitising the actual boundary. One way is to superimpose a ‘star mask’ over the boundary, with origin at the centroid of the region. The ‘AND’ operation will isolate the points of interest. Normally you would use a star with lines every 10 or 5 degrees, not 45 degrees as in the illustration.
Fourier descriptors
Each point is then regarded as a complex number, by treating the x-axis as the real axis and the y-axis as the imaginary axis:
$$s(k) = x(k) + j\,y(k)$$
where s(k) represents each coordinate pair [x(k), y(k)].
In other words we treat the plane of the image as an Argand diagram. The advantage of this is that we have reduced a 2D problem to a 1D problem.
Alternative representation
If we have a total of ‘K’ points we can calculate the coefficients of the discrete Fourier transform :
$$a(u) = \frac{1}{K} \sum_{k=0}^{K-1} s(k)\, e^{-j 2\pi u k / K}$$
The complex coefficients a(u) form the Fourier descriptors of the boundary, which can be reconstructed from the inverse Fourier transform:
$$s(k) = \sum_{u=0}^{K-1} a(u)\, e^{j 2\pi u k / K}$$
Suppose now that we use only the first ‘P’ coefficients as the descriptor, i.e. a(u) = 0 for u > (P – 1). The same number of points exists in the approximate boundary, but not as many terms are used in the reconstruction of each point:
$$\hat{s}(k) = \sum_{u=0}^{P-1} a(u)\, e^{j 2\pi u k / K}$$
Object identification
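The forward transform, the inverse and the truncated reconstruction can be sketched with the standard library alone. This is a direct, unoptimised evaluation of the sums in plain Python (a real implementation would use an FFT), assuming the boundary is a list of (x, y) samples; the function names are illustrative. The truncation keeps u = 0 … P – 1 exactly as on the slide; implementations that centre the spectrum keep frequencies on both sides of zero instead.

```python
import cmath


def fourier_descriptors(points):
    """a(u) = (1/K) * sum_k s(k) exp(-j 2 pi u k / K), with s(k) = x(k) + j y(k)."""
    K = len(points)
    s = [complex(x, y) for x, y in points]
    return [
        sum(s[k] * cmath.exp(-2j * cmath.pi * u * k / K) for k in range(K)) / K
        for u in range(K)
    ]


def reconstruct(a, P=None):
    """Rebuild all K boundary points using only the first P coefficients."""
    K = len(a)
    P = K if P is None else P
    return [
        sum(a[u] * cmath.exp(2j * cmath.pi * u * k / K) for u in range(P))
        for k in range(K)
    ]


SQUARE = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
a = fourier_descriptors(SQUARE)
full = reconstruct(a)          # all K coefficients: recovers the boundary
coarse = reconstruct(a, P=1)   # only a(0): every point collapses to the centroid
print(max(abs(z - complex(x, y)) for z, (x, y) in zip(full, SQUARE)))
```

With all K coefficients the reconstruction error is at the level of floating-point rounding; with P = 1 only the mean position of the boundary survives.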
As we have seen invariance to region orientation, position and scale are important attributes for a descriptor. The Fourier descriptor behaves very well in this respect, since changes in these parameters can be related to simple transformations on the descriptors:
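Rotation by θ: $s_r(k) = s(k)\,e^{j\theta}$, $a_r(u) = a(u)\,e^{j\theta}$
Translation by Δxy: $s_t(k) = s(k) + \Delta_{xy}$, $a_t(u) = a(u) + \Delta_{xy}\,\delta(u)$
Scaling by α: $s_s(k) = \alpha\,s(k)$, $a_s(u) = \alpha\,a(u)$
Starting point k0: $s_p(k) = s(k - k_0)$, $a_p(u) = a(u)\,e^{-j 2\pi k_0 u / K}$
where $\Delta_{xy} = \Delta x + j\,\Delta y$ and $\delta(u)$ is the unit impulse.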
Properties of the Fourier descriptor
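These relations are easy to check numerically. The sketch below, in plain Python, assumes the descriptors are defined with the 1/K factor on the forward transform, and uses an arbitrary six-point boundary plus arbitrary values for the rotation angle, translation, scale factor and start offset. All of these values and names are illustrative.

```python
import cmath


def dft(s):
    """a(u) = (1/K) * sum_k s(k) exp(-j 2 pi u k / K)."""
    K = len(s)
    return [
        sum(s[k] * cmath.exp(-2j * cmath.pi * u * k / K) for k in range(K)) / K
        for u in range(K)
    ]


def close(a, b, tol=1e-9):
    """True if two coefficient lists agree element by element."""
    return all(abs(x - y) < tol for x, y in zip(a, b))


s = [complex(x, y) for x, y in [(0, 0), (2, 0), (3, 1), (2, 2), (0, 2), (-1, 1)]]
K = len(s)
a = dft(s)
theta, delta, alpha, k0 = cmath.pi / 6, 3 + 4j, 2.5, 2

# Rotation multiplies every descriptor by exp(j * theta).
rotation_ok = close(dft([z * cmath.exp(1j * theta) for z in s]),
                    [c * cmath.exp(1j * theta) for c in a])
# Translation changes only the u = 0 term.
translation_ok = close(dft([z + delta for z in s]), [a[0] + delta] + a[1:])
# Scaling multiplies every descriptor by alpha.
scaling_ok = close(dft([alpha * z for z in s]), [alpha * c for c in a])
# Shifting the starting point applies a linear phase to the descriptors.
start_ok = close(dft([s[(k - k0) % K] for k in range(K)]),
                 [a[u] * cmath.exp(-2j * cmath.pi * k0 * u / K) for u in range(K)])
print(rotation_ok, translation_ok, scaling_ok, start_ok)
```

Since rotation and a change of starting point affect only the phase, the magnitudes |a(u)| for u ≥ 1 are unchanged by rotation, translation and starting point, and dividing them by one of their number also removes the scale.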
To test the effect of varying the number of descriptors used in the reconstruction, we will try running the Applet at the following URL:
http://www.s2.chalmers.se/research/image/Java/applets_list.htm
Reconstruction
The method begins by defining a segment(s) of the boundary, drawing a straight line to connect the ends of the segment and then rotating this segment until the line is horizontal.
The method treats the resulting curve as a statistical function g(x) of the random variable x. If the area under the curve is normalised to 1 then ‘g’ becomes the probability density function. Normal statistical moments can now be calculated and used as descriptors.
[Figure: boundary segment plotted as a curve g(x) against x]
Statistical moments
The general definition is:
$$\mu_n(x) = \sum_{i=0}^{K-1} (x_i - m)^n\, g(x_i)$$
where
$$m = \sum_{i=0}^{K-1} x_i\, g(x_i)$$
An attractive feature of these descriptors is that in many cases they have a direct physical interpretation.
For example, $\mu_2(x)$ measures the spread of the curve about the mean (the variance) and $\mu_3(x)$ its symmetry about the mean.
In Matlab, statistical moments can be computed using the function statmoments from the DIPUM toolbox (Gonzalez, Woods and Eddins); see its help text for further details.
Definition of the statistical moments
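The moment definition can be evaluated directly from sampled values of the curve. A minimal sketch in plain Python, assuming the curve is given as sample positions `xs` with heights `g`, and normalising the heights so that they sum to 1 before the moments are taken. The function name and the sample data are illustrative, and this is not the DIPUM `statmoments` routine.

```python
def boundary_moments(xs, g, orders=(2, 3)):
    """Mean m and central moments mu_n of a sampled curve g(x)."""
    total = sum(g)
    p = [v / total for v in g]  # normalise the area under the curve to 1
    m = sum(x * pi for x, pi in zip(xs, p))
    mu = {n: sum((x - m) ** n * pi for x, pi in zip(xs, p)) for n in orders}
    return m, mu


# A symmetric bump: the third moment (skew) should vanish.
m, mu = boundary_moments([0, 1, 2], [1, 2, 1])
print(m, mu[2], mu[3])

# A lopsided curve has a non-zero third moment.
_, mu_skewed = boundary_moments([0, 1, 2], [3, 2, 1])
print(mu_skewed[3])
```

Only a handful of low-order moments are normally needed, which keeps the descriptor short.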
• The representation and description of objects or regions that have been segmented out of an image are the early steps in the operation of most automated image analysis systems.
• Morphological operations (last session) are often used to extract elements of these regions.
• A range of description techniques exist – the choice is dictated by the problem under consideration.
• The aim is to use descriptors that capture essential differences between objects – or classes of objects – while maintaining independence from possible changes in location, size and orientation.
• A pattern is formed by one or more descriptors and pattern recognition is used in object recognition and interpretation.
Concluding remarks
In this week’s laboratory, investigate the following:
Using the ‘holes’ and ‘convex hull deficiency’ shape descriptors, explain how you would distinguish between the shapes of the characters ‘0’ ‘1’ ‘8’ ‘9’ and ‘X’.
Look up the Matlab IPT function regionprops
Using the descriptors ‘convex hull deficiency’/‘Solidity’ and ‘Euler number’ only, build a MatLab function that identifies each of the objects in image ‘lettertest.jpg’ (i.e. the characters described in bullet 1). How well separated are the object archetypes in the 2D vector space?
Repeat for all of the letters in the alphabet.
Tutorial