University of Sunderland CIS303 Advanced Forensic Computing BSc Forensic Computing

CIS303 Advanced Forensic Computing

Dr Giles Oatley

Object identification – part 1. Image representation

• Having segmented regions of interest out of an image, the next task is to identify them. Identification is usually divided into two tasks: evaluating suitable quantitative descriptors, and then matching the descriptors to known objects. In this lecture we will look at object representation and description.

• Sub topics
– Topology & boundary descriptors
– Chain codes & shape numbers
– Minimum-Perimeter Polygons (MPPs)
– Fourier descriptors
– Statistical moments

The convex hull

An image region is said to be convex if any straight line connecting two points within the region lies completely within the region or along an edge of the region.

[Figure: a convex region and a non-convex region]

The convex hull of a region ‘A’ is the smallest convex region that encloses ‘A’. The convex deficiency is the set difference between the convex hull and ‘A’.

[Figure: the letter ‘P’. The red hatched region is the convex hull and the green hatched regions are the convex deficiency of the letter ‘P’.]

The Euler number

Properties of a region which are topologically invariant are clearly good candidates for region descriptors. One such property is the Euler number, which we can define in two ways.

Using the usual definition of 4- or 8-connected regions, we can define the Euler number ‘E’ as C – H, where ‘C’ is the number of connected regions and ‘H’ the total number of holes.

[Figure: the digits 7, 8 and 9, with E = 1, E = -1 and E = 0 respectively]

For a region represented as a polygonal network with V vertices, Q edges and F faces, the same number is given by:

E = C – H = V – Q + F

[Figure: a polygonal network with an edge (Q), a vertex (V), a hole (H) and a face (F) labelled]

In the example above

E = 7 – 11 + 2 = -2
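The definition E = C – H can be checked on small binary images. The following is a minimal sketch in Python (not the MatLab used in the lecture), assuming foreground pixels are 8-connected and background pixels are 4-connected; the names `count_components`, `euler_number` and the three tiny test shapes are illustrative only.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def count_components(grid, value, neighbours):
    """Count connected groups of cells equal to `value` by flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count

def euler_number(rows):
    """E = C - H for a shape drawn with '#' (foreground) and '.' (background)."""
    width = len(rows[0])
    # Pad with background so everything outside the object is one component.
    padded = ['.' * (width + 2)] + ['.' + r + '.' for r in rows] + ['.' * (width + 2)]
    grid = [[1 if ch == '#' else 0 for ch in row] for row in padded]
    c = count_components(grid, 1, N8)        # connected regions (assumed 8-connected)
    h = count_components(grid, 0, N4) - 1    # background groups minus the outside
    return c - h

BAR = ["#", "#", "#"]                                # no holes, like '7'
RING = ["###", "#.#", "###"]                         # one hole, like '9'
DOUBLE_RING = ["###", "#.#", "###", "#.#", "###"]    # two holes, like '8'
```

With these shapes, `euler_number(BAR)`, `euler_number(RING)` and `euler_number(DOUBLE_RING)` give 1, 0 and -1, the same values as quoted for the digits 7, 9 and 8.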

MatLab example

function Eulertest
% Find the Euler number and convex deficiency of objects.
% First read in the image and convert it to the binary image 'S'.
T = imread('ABCtest.jpg');
G = rgb2gray(T);
S = G > 128;
% Convert to the label image required by regionprops.
[L, n] = bwlabel(S);
% Get the required region properties and collect them into vectors.
stats = regionprops(L, 'EulerNumber', 'Solidity');
EN = [stats.EulerNumber];
SN = [stats.Solidity];
CD = 1 - SN;
% In this image the labels are A = 1, C = 2, B = 3.
fprintf('%s %s %2.0f %s %g\n', 'A ', 'Euler number ', EN(1), ' Convex deficiency ', CD(1));
fprintf('%s %s %2.0f %s %g\n', 'B ', 'Euler number ', EN(3), ' Convex deficiency ', CD(3));
fprintf('%s %s %2.0f %s %g\n', 'C ', 'Euler number ', EN(2), ' Convex deficiency ', CD(2));

Results

[Figure: the source image]

Note: In the Matlab code, the convex deficiency is computed from ‘Solidity’ as 1 - Solidity, where

Solidity = (Pixels in region)/(Pixels in convex hull)

The analysis:

A Euler number = 0 Solidity = 0.295291
B Euler number = -1 Solidity = 0.300041
C Euler number = 1 Solidity = 0.47308
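The solidity ratio can also be computed by hand for a small pixel set. The sketch below is in Python rather than MatLab and assumes the convex hull is taken over pixel centres, with hull pixels counted as the grid points inside or on that hull; regionprops may treat pixel extents differently, so its values need not match. The function names are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def solidity(pixels):
    """(pixels in region) / (grid points inside the hull of the pixel centres)."""
    pixels = set(pixels)
    hull = convex_hull(pixels)
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    hull_count = sum(1
                     for x in range(min(xs), max(xs) + 1)
                     for y in range(min(ys), max(ys) + 1)
                     if inside_hull(hull, (x, y)))
    return len(pixels) / hull_count

def convex_deficiency(pixels):
    return 1 - solidity(pixels)

# An L-shaped region: 5 pixels, whose hull (a triangle) contains 6 grid points.
L_SHAPE = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
SQUARE = [(x, y) for x in range(3) for y in range(3)]
```

For `L_SHAPE` the solidity is 5/6, so the convex deficiency is 1/6; a filled square is already convex and has solidity 1.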

Displaying the convex deficiency

% For fun we will display the convex hull deficiency.
H = regionprops(L, 'ConvexImage', 'BoundingBox');
BB = [H.BoundingBox];
for J = 1:3
    % Each bounding box contributes four values: [x y width height].
    term = 4*(J-1);
    ULX = round(BB(term+1));
    ULY = round(BB(term+2));
    LRX = round(BB(term+1) + BB(term+3) - 1);
    LRY = round(BB(term+2) + BB(term+4) - 1);
    % Crop the object, then subtract it from its convex hull image.
    SI = S(ULY:LRY, ULX:LRX);
    CDI = H(J).ConvexImage - SI;
    % One row of plots per object: original, convex hull, deficiency.
    termplot = 3*(J-1);
    subplot(3,3,termplot+1), imshow(SI);
    subplot(3,3,termplot+2), imshow(H(J).ConvexImage);
    subplot(3,3,termplot+3), imshow(CDI);
end

Results

[Figure: one row of three panels per letter: original letter, convex hull, deficiency]

Skeleton of a region

• Skeletonization: an approach to representing the structural shape of a planar region.

• Defined via the medial axis transformation (MAT):

To find the MAT of region R with border B, for each point p in R, find its closest neighbour in B. If p has more than one such neighbour, it belongs to the medial axis (skeleton) of R.

• Note that the concept of ‘closest’ (and hence the medial axis) depends upon the definition of ‘distance’ between pixels.

• Algorithms use morphological operators (thinning) – see texts (and last week) for details. They are often computationally expensive.
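The MAT definition can be applied literally to a small region. The following Python sketch is a brute-force illustration, not the thinning algorithms referred to above; it assumes Euclidean distance and takes the border B to be the region pixels that have a 4-neighbour outside the region. The name `medial_axis` is hypothetical.

```python
def medial_axis(region):
    """Pixels of `region` that have more than one closest border pixel."""
    region = set(region)
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # Assumed border: region pixels with at least one 4-neighbour outside.
    border = [p for p in region
              if any((p[0] + dx, p[1] + dy) not in region for dx, dy in steps)]
    skeleton = set()
    for (x, y) in region:
        # Squared Euclidean distances keep the comparison exact on integers.
        d2 = [(x - bx) ** 2 + (y - by) ** 2 for bx, by in border]
        if d2.count(min(d2)) > 1:
            skeleton.add((x, y))
    return skeleton

# A filled rectangle, 7 pixels wide and 5 pixels high.
RECT = [(x, y) for x in range(7) for y in range(5)]
RECT_SKELETON = medial_axis(RECT)
```

For this rectangle the result is the centre line from (2, 2) to (4, 2) plus the four diagonal pixels (1, 1), (1, 3), (5, 1) and (5, 3). Every pixel is compared with every border pixel, which illustrates why the direct definition is expensive; choosing a different distance measure would change the result.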

Region & boundary descriptors

Region descriptors

The Euler number and the convex deficiency are examples of region descriptors because they define properties of the complete segmented region.

Boundary descriptors

A boundary descriptor, on the other hand, concentrates on describing properties of the region’s boundary.

Thus, the segmentation techniques studied earlier yield raw data in the form of pixels along a boundary or contained in a region.

Although this data is sometimes used directly, it is normal practice to compact the data into representations that are more useful in the computation of descriptors.

We will now look at some different approaches.

Chain codes

A boundary descriptor can be formed by simply recording the co-ordinates (xi, yi) of sampled points on the boundary.

In practice this is rarely done, because the chain code descriptor produced can be very long and noise can seriously interfere with the result.

A better approach is to cover the boundary with a grid and record the grid points closest to the boundary. A 4 code or an 8 code descriptor can then be formed by recording the direction moved to connect up the recorded grid points.

[Figure: direction numbering for the 4 code (directions 0 to 3) and the 8 code (directions 0 to 7)]
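Recording the direction moved between successive grid points can be sketched as follows, in Python rather than MatLab. The direction numbering is an assumption, since the figure is not reproduced here: direction 0 points along +x and the numbers increase counter-clockwise, with y pointing up. The names `STEPS_4`, `STEPS_8` and `chain_code` are illustrative.

```python
# Assumed numbering: 0 is the +x direction, increasing counter-clockwise (y up).
STEPS_4 = [(1, 0), (0, 1), (-1, 0), (0, -1)]
STEPS_8 = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def chain_code(points, steps):
    """Chain code of a closed boundary given as an ordered list of grid points.

    Raises ValueError if two successive points are not neighbours under `steps`.
    """
    code = []
    for i, (x0, y0) in enumerate(points):
        x1, y1 = points[(i + 1) % len(points)]   # wrap round to close the boundary
        code.append(steps.index((x1 - x0, y1 - y0)))
    return code

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
diamond = [(1, 0), (2, 1), (1, 2), (0, 1)]

square_4 = chain_code(square, STEPS_4)     # [0, 1, 2, 3]
square_8 = chain_code(square, STEPS_8)     # [0, 2, 4, 6]
diamond_8 = chain_code(diamond, STEPS_8)   # [1, 3, 5, 7]
```

The diamond has no 4 code on this grid, because its moves are all diagonal; that is the practical difference between the two numbering schemes.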

8 code example

[Figure: the 8 code direction numbers (0 to 7) and the example boundary, with the starting point marked by a red dot]

The basic chain code starting at the ‘red’ dot is:

0 7 6 7 6 6 5 5 4 4 2 3 3 1 2 2

Exercise: Calculate the 4-directional code for the above shape.

Signatures

A signature is a 1-D functional representation of a boundary.

It may be generated in several ways.

The simplest is to plot the distance from an interior point (e.g. the centroid) to the boundary as a function of angle.

[Figure: example boundaries and their distance-versus-angle signatures]

Signatures

The basic idea is to reduce the boundary representation to a 1-D function, rather than the original 2-D representation.

It only works if the vector extending from the origin to the boundary intersects the boundary only once, thus yielding a single-valued function of increasing angle.

It therefore normally excludes objects with deep, narrow concavities, or long thin protrusions.

Note that the signatures shown on the previous slide are invariant to translation, but depend on rotation and scaling. It is therefore necessary to remove the dependency on orientation and size, whilst preserving the fundamental shape of the waveforms.

For example, align the axis of rotation along the major axis of the object (see later), and normalize by scaling all functions so that they span the same range of values (e.g. [0, 1]). This scaling tends to be susceptible to noise, because it relies only on the minimum and maximum values.
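A distance-versus-angle signature and its [0, 1] normalisation might be sketched as below. This is Python rather than MatLab, and it makes two assumptions: the interior point is the mean of the sampled boundary points, and a constant signature (e.g. a circle) is mapped to all zeros. The function names are hypothetical, and rotation is not handled here.

```python
import math

def signature(boundary):
    """Return (angle, distance) pairs measured from the centroid, sorted by angle."""
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    pairs = []
    for x, y in boundary:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        pairs.append((angle, math.hypot(x - cx, y - cy)))
    return sorted(pairs)

def normalised_distances(boundary):
    """Distances rescaled to span [0, 1]; a constant signature maps to zeros."""
    r = [d for _, d in signature(boundary)]
    lo, hi = min(r), max(r)
    if hi == lo:
        return [0.0] * len(r)
    return [(d - lo) / (hi - lo) for d in r]

# Corners and edge midpoints of a square, then a scaled and shifted copy.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
big_shifted = [(3 * x + 10, 3 * y - 4) for x, y in square]

sig_small = normalised_distances(square)
sig_big = normalised_distances(big_shifted)
```

Both shapes give the same alternating pattern of 0 (edge midpoints) and 1 (corners), showing that the normalised signature no longer depends on position or size. Because only the minimum and maximum are used, a single noisy boundary point would rescale the whole signature.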

Descriptors - simple boundary descriptors

Length: Simply counting the number of pixels along the contour gives a rough approximation to the length.

Diameter: The diameter of a boundary B is given by:

Diam(B) = MAX [ D (pi, pj) ]

where D is a distance measure and pi and pj are points on the boundary.

Related to this are the major and minor axes (major – the line segment connecting the pair of farthest points; minor – the line perpendicular to the major axis, of such length that a box passing through the outer 4 points of intersection of the two axes with the boundary completely encloses the boundary).

Eccentricity: the ratio of the major to the minor axis.

Curvature: Rate of change of slope. For example, use the difference between the slopes of adjacent boundary segments (which have been represented as straight lines). Changes in slope can be characterised by ranges of the change in slope (e.g. <10° ‘straight’, 80°-100° ‘corner’ etc.), or as ‘concave’, ‘convex’ etc.
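The diameter and eccentricity definitions translate directly into a brute-force calculation. The Python sketch below assumes D is the Euclidean distance and takes the minor axis length as the extent of the boundary measured perpendicular to the major axis; the function names are illustrative, and a boundary whose points all lie on one line would give a zero minor axis.

```python
import math
from itertools import combinations

def diameter(boundary):
    """Return (Diam(B), p, q): the largest pairwise distance and its end points."""
    p, q = max(combinations(boundary, 2), key=lambda pair: math.dist(*pair))
    return math.dist(p, q), p, q

def eccentricity(boundary):
    """Ratio of the major axis length to the minor axis length."""
    major, p, q = diameter(boundary)
    # Unit vector along the major axis.
    ux, uy = (q[0] - p[0]) / major, (q[1] - p[1]) / major
    # Signed distance of every boundary point from the major axis.
    offsets = [-uy * (x - p[0]) + ux * (y - p[1]) for x, y in boundary]
    minor = max(offsets) - min(offsets)
    return major / minor

# Four extreme points of a flat, ellipse-like boundary.
flat = [(-3, 0), (3, 0), (0, 1), (0, -1)]
flat_diameter, end_a, end_b = diameter(flat)
flat_eccentricity = eccentricity(flat)
```

Here the diameter is 6 (between (-3, 0) and (3, 0)), the minor axis has length 2, and the eccentricity is 3. The pairwise search costs on the order of the square of the number of boundary points.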

A more sophisticated descriptor – the shape number

■ The chain code discussed previously is obviously dependent on the starting point.

■ The shape number is derived from the chain code and is independent of the start point.

■ To calculate the shape number, take the first difference of the chain code, then ‘shift it’ to form the integer of smallest magnitude.

■ The order n of the shape number is the number of digits in its representation.

Chain code: 0 0 3 2 2 1

Difference: 3 0 3 3 0 3

Shape no.: 0 3 3 0 3 3

Shape number

Note that:

• The difference is obtained by counting (counter clockwise) the number of directions that separate two adjacent directions of the chain code.

• The code is treated as a circular sequence, so that the first element of the difference is calculated using the transition between the last and first components of the chain.

Exercise: Calculate the chain code, difference and shape number of the following:

[Figure: boundary for the exercise]

Chain code: 0 3 0 3 2 2 1 1

Difference: 3 3 1 3 3 0 3 0

Shape number: 0 3 0 3 3 1 3 3
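The first difference and the shift to the smallest integer can be written in a few lines. This Python sketch assumes the counter-clockwise count between two directions is the difference modulo the number of directions (4 or 8); `first_difference` and `shape_number` are illustrative names.

```python
def first_difference(code, directions=4):
    """Circular first difference: counter-clockwise steps between neighbours."""
    # code[i - 1] with i = 0 is the last element, which closes the circle.
    return [(code[i] - code[i - 1]) % directions for i in range(len(code))]

def shape_number(code, directions=4):
    """Rotation of the first difference that forms the smallest integer."""
    diff = first_difference(code, directions)
    return min(diff[i:] + diff[:i] for i in range(len(diff)))

rectangle = [0, 0, 3, 2, 2, 1]
exercise = [0, 3, 0, 3, 2, 2, 1, 1]

rectangle_diff = first_difference(rectangle)   # [3, 0, 3, 3, 0, 3]
rectangle_shape = shape_number(rectangle)      # [0, 3, 3, 0, 3, 3]
exercise_shape = shape_number(exercise)        # [0, 3, 0, 3, 3, 1, 3, 3]
```

Starting the same rectangle at a different point, for example `[3, 2, 2, 1, 0, 0]`, changes the chain code but gives the same shape number. For an 8 code, pass `directions=8`.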

Back to the 8 code example

[Figure: the 8 code direction numbers (0 to 7) and the example boundary, with the starting point marked by a red dot]

The basic chain code starting at the ‘red’ dot is:

0 7 6 7 6 6 5 5 4 4 2 3 3 1 2 2

Difference: 6 7 7 1 7 0 7 0 7 0 6 1 0 6 1 0

Shape number: 0 6 1 0 6 1 0 6 7 7 1 7 0 7 0 7

Using difference codes and reordering to make the smallest integer produces a code independent of the starting point and, for rotations in multiples of 45°, of the orientation of the boundary.

Grid orientation

■ Although the first difference of a chain code is independent of rotation, in general the coded boundary depends upon the orientation of the grid.

■ It is usual to normalize the grid by aligning it with the basic rectangle, i.e. the box enclosing the major and minor axes of an object.

■ Note that implementing these algorithms in practice is a non-trivial task! See Gonzalez and Woods (DIPUM) for a full treatment, including M-files.


Further example (DIPUM)

Fourier descriptors

A Fourier descriptor begins with digitising the actual boundary. One way is to superimpose a ‘star mask’ over the boundary, with its origin at the centroid of the region. The ‘AND’ operation will isolate the points of interest. Normally you would use a star with lines every 10 or 5 degrees, not 45 degrees as in the illustration.

[Figure: a star mask superimposed on a boundary, with successive boundary points (xj, yj) and (xj+1, yj+1) marked]

Alternative representation

Each point is then regarded as a complex number, by treating the x-axis as the real axis and the y-axis as the imaginary axis:

$$s(k) = x(k) + j\,y(k)$$

where s(k) represents each coordinate pair [x(k), y(k)].

In other words we treat the plane of the image as an Argand diagram. The advantage of this is that we have reduced a 2D problem to a 1D problem.

Object identification

If we have a total of ‘K’ points we can calculate the coefficients of the discrete Fourier transform:

$$a(u) = \frac{1}{K} \sum_{k=0}^{K-1} s(k)\, e^{-j 2\pi u k / K}, \qquad u = 0, 1, \ldots, K-1$$

The complex coefficients a(u) form the Fourier descriptors of the boundary, which can be reconstructed from the inverse Fourier transform:

$$s(k) = \sum_{u=0}^{K-1} a(u)\, e^{j 2\pi u k / K}, \qquad k = 0, 1, \ldots, K-1$$

Suppose now that we use only the first ‘P’ coefficients as the descriptor, i.e. a(u) = 0 for u > (P – 1). The same number of points exists in the approximate boundary, but not as many terms are used in the reconstruction of each point:

$$\hat{s}(k) = \sum_{u=0}^{P-1} a(u)\, e^{j 2\pi u k / K}, \qquad k = 0, 1, \ldots, K-1$$
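The forward transform and the truncated reconstruction can be tried on a handful of boundary points. This is a direct, unoptimised Python sketch of the sums as they are usually written (forward transform scaled by 1/K with a negative exponent, reconstruction over the first P coefficients); it is not an FFT, and the names `fourier_descriptors` and `reconstruct` are hypothetical.

```python
import cmath

def fourier_descriptors(points):
    """a(u) = (1/K) * sum over k of s(k) * exp(-j*2*pi*u*k/K)."""
    K = len(points)
    s = [complex(x, y) for x, y in points]
    return [sum(s[k] * cmath.exp(-2j * cmath.pi * u * k / K) for k in range(K)) / K
            for u in range(K)]

def reconstruct(a, P=None):
    """Rebuild K boundary points from the first P descriptors (all K by default)."""
    K = len(a)
    P = K if P is None else P
    return [sum(a[u] * cmath.exp(2j * cmath.pi * u * k / K) for u in range(P))
            for k in range(K)]

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
a = fourier_descriptors(square)

full = reconstruct(a)         # recovers the four corners (up to rounding)
coarse = reconstruct(a, P=1)  # every point collapses to a(0), the centroid 1 + 1j
```

Using all K descriptors returns the original points, while P = 1 keeps only a(0), which is the mean of the boundary points. Intermediate values of P give progressively more detailed approximations with the same number of points.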

Properties of the Fourier descriptor

As we have seen, invariance to region orientation, position and scale is an important attribute for a descriptor. The Fourier descriptor behaves very well in this respect, since changes in these parameters can be related to simple transformations on the descriptors:

Rotation by θ:

$$s_r(k) = s(k)\, e^{j\theta} \quad\Longleftrightarrow\quad a_r(u) = a(u)\, e^{j\theta}$$

Translation by Δxy = Δx + jΔy:

$$s_t(k) = s(k) + \Delta_{xy} \quad\Longleftrightarrow\quad a_t(u) = a(u) + \Delta_{xy}\, \delta(u)$$

Scaling by α:

$$s_s(k) = \alpha\, s(k) \quad\Longleftrightarrow\quad a_s(u) = \alpha\, a(u)$$

Starting point k0:

$$s_p(k) = s(k - k_0) \quad\Longleftrightarrow\quad a_p(u) = a(u)\, e^{-j 2\pi k_0 u / K}$$

Here δ(u) is 1 for u = 0 and 0 otherwise, so translation changes only a(0).
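The four properties can be confirmed numerically on an arbitrary boundary. The Python sketch below assumes the forward transform carries the 1/K factor and the negative exponent; the test boundary and the values of `theta`, `delta`, `alpha` and `k0` are arbitrary choices for illustration.

```python
import cmath

def dft(s):
    """a(u) = (1/K) * sum over k of s(k) * exp(-j*2*pi*u*k/K)."""
    K = len(s)
    return [sum(s[k] * cmath.exp(-2j * cmath.pi * u * k / K) for k in range(K)) / K
            for u in range(K)]

def close(a, b, tol=1e-9):
    """True if two equal-length complex sequences agree element by element."""
    return len(a) == len(b) and all(abs(p - q) < tol for p, q in zip(a, b))

s = [complex(0, 0), complex(3, 0), complex(3, 1), complex(1, 2), complex(0, 1)]
K = len(s)
a = dft(s)

theta, delta, alpha, k0 = 0.7, complex(5, -2), 2.5, 2

# Rotation: every descriptor is multiplied by exp(j*theta).
rotation_ok = close(dft([z * cmath.exp(1j * theta) for z in s]),
                    [c * cmath.exp(1j * theta) for c in a])
# Translation: only a(0) changes, by delta.
translation_ok = close(dft([z + delta for z in s]),
                       [c + (delta if u == 0 else 0) for u, c in enumerate(a)])
# Scaling: every descriptor is multiplied by alpha.
scaling_ok = close(dft([alpha * z for z in s]), [alpha * c for c in a])
# Starting point: s(k - k0) multiplies a(u) by exp(-j*2*pi*k0*u/K).
start_ok = close(dft([s[(k - k0) % K] for k in range(K)]),
                 [c * cmath.exp(-2j * cmath.pi * k0 * u / K) for u, c in enumerate(a)])
```

All four flags evaluate to True. A practical consequence is that the magnitudes |a(u)| for u > 0 are unaffected by rotation, translation and starting point, and change only by the common factor α under scaling.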

Reconstruction

To test the effect of varying the number of descriptors used in the reconstruction, we will try running the Applet at the following URL:

http://www.s2.chalmers.se/research/image/Java/applets_list.htm

Statistical moments

The method begins by defining a segment of the boundary, drawing a straight line to connect the ends of the segment, and then rotating the segment until this line is horizontal.

The method treats the resulting curve as a statistical function g(x) of the random variable x. If the area under the curve is normalised to 1 then ‘g’ becomes the probability density function. Normal statistical moments can now be calculated and used as descriptors.

[Figure: the rotated boundary segment plotted as g(x) against x]

Definition of the statistical moments

The general definition is:

$$\mu_n(x) = \sum_{i=0}^{K-1} (x_i - m)^n\, g(x_i) \qquad \text{where} \qquad m = \sum_{i=0}^{K-1} x_i\, g(x_i)$$

An attractive feature of these descriptors is that in many cases they have a direct physical interpretation.

For example, μ2(x) measures the spread of the curve about the mean (the variance) and μ3(x) the symmetry of the curve about the mean.

In Matlab, statistical moments can be computed using the function statmoments, which is supplied with the DIPUM M-files rather than with Matlab itself. See its help text for further details.
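The moment definition can be evaluated on a sampled curve as follows. This Python sketch assumes equally spaced samples, so that normalising the area to 1 reduces to dividing g by its sum; it is not the statmoments function, and the name `moments` is hypothetical.

```python
def moments(xs, g, orders=(2, 3)):
    """Return the mean m and the central moments mu_n for each n in `orders`."""
    total = sum(g)
    p = [v / total for v in g]                  # normalise so the values sum to 1
    m = sum(x * pi for x, pi in zip(xs, p))     # mean
    mu = {n: sum((x - m) ** n * pi for x, pi in zip(xs, p)) for n in orders}
    return m, mu

xs = [0, 1, 2, 3, 4]
symmetric = [1, 2, 3, 2, 1]     # a symmetric bump
skewed = [4, 3, 1, 1, 0]        # most of the area lies to the left

m_sym, mu_sym = moments(xs, symmetric)
m_skew, mu_skew = moments(xs, skewed)
```

For the symmetric curve the mean is 2, μ2 is 12/9 and μ3 is zero to within rounding. The skewed curve has a positive μ3, because its long tail lies to the right of the mean.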

Concluding remarks

• The representation and description of objects or regions that have been segmented out of an image are the early steps in the operation of most automated image analysis systems.

• Morphological operations (last session) are often used to extract elements of these regions.

• A range of description techniques exists – the choice is dictated by the problem under consideration.

• The object is to use descriptors that capture essential differences between objects – or classes of objects – while maintaining independence of possible changes in location, size and orientation.

• A pattern is formed by one or more descriptors, and pattern recognition is used in object recognition and interpretation.

Tutorial

In this week’s laboratory, investigate the following:

• Using the ‘holes’ and ‘convex hull deficiency’ shape descriptors, explain how you would distinguish between the shapes of the characters ‘0’ ‘1’ ‘8’ ‘9’ and ‘X’.

• Look up the Matlab IPT function regionprops.

• Using the descriptors ‘convex hull deficiency’/‘Solidity’ and ‘Euler number’ only, build a MatLab function that identifies each of the objects in image ‘lettertest.jpg’ (i.e. the characters described in bullet 1). How well separated are the object archetypes in the 2D vector space?

• Repeat for all of the letters in the alphabet.