cs 376b introduction to computer vision 02 / 22 / 2008 instructor: michael eckmann

CS 376b Introduction to Computer Vision

02 / 22 / 2008

Instructor: Michael Eckmann

Michael Eckmann - Skidmore College - CS 376b - Spring 2008

Today’s Topics

• Comments/Questions

• binary morphological operators

– conditional dilation

– hit and miss transform

– histograms

– thresholding

• automatic thresholding using Otsu method


binary morphology

• binary image morphological operations

– conditional dilation (can be used to select certain regions)

• let's look at the example on page 72

• an image B is first eroded to keep only regions containing vertical lines of >= 3 1-pixels; this results in an image C (note: this erosion is not part of the conditional dilation itself)

• D is a 3x3 square structuring element

• then, C is conditionally dilated by D with respect to B, which

– repeatedly dilates C by D while only retaining 1's that are in B

– this is done until there are no further changes

– this is a nice operator to “select” only certain regions with a particular structural property (which was determined by the original erosion)
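The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the textbook's code: the small images `B` and `C`, the helper names `dilate` and `conditional_dilate`, and the representation of a structuring element as a list of (row, column) offsets are all assumptions made for the example. Here `C` is what eroding `B` with a vertical line of three 1-pixels would leave behind.

```python
def dilate(image, offsets):
    """Binary dilation: stamp the structuring element on every 1-pixel."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


def conditional_dilate(seed, offsets, mask):
    """Dilate `seed` repeatedly, keeping only 1's also in `mask`, until stable."""
    current = [[s & m for s, m in zip(srow, mrow)]
               for srow, mrow in zip(seed, mask)]
    while True:
        grown = dilate(current, offsets)
        nxt = [[g & m for g, m in zip(grow, mrow)]
               for grow, mrow in zip(grown, mask)]
        if nxt == current:  # no further changes
            return current
        current = nxt


# D: 3x3 square structuring element with its origin at the center
SQUARE_3X3 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

# Hypothetical image B with two regions; only the left one contains
# a vertical line of three 1-pixels.
B = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
# C: the single pixel that survives the erosion of B
C = [
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

selected = conditional_dilate(C, SQUARE_3X3, B)
```

In this example `selected` contains the whole left region of `B` and none of the right region, which is the "selection" behaviour described above.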



– hit and miss transform (for shape detection)

• given two structuring elements s1 and s2

• and given an input image I, a pixel in the output image O is ON iff s1 fits I AND s2 fits ~I, where ~I is the complement of I

• if we “place” s1 “over” s2 by making their origins line up, then s1 should NOT “intersect” s2 in ON pixels

– otherwise, no pixels will be on in the output image

• see handout
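As a rough illustration of the definition above (not the handout's example), the sketch below detects isolated pixels. The structuring elements `S1` and `S2`, the test image, and the function names are assumptions chosen for the example, and an element that extends past the image border is assumed not to fit.

```python
def fits(image, offsets, r, c):
    """True if every element pixel, with origin at (r, c), lands on a 1."""
    rows, cols = len(image), len(image[0])
    for dr, dc in offsets:
        rr, cc = r + dr, c + dc
        if not (0 <= rr < rows and 0 <= cc < cols):
            return False  # assumption: leaving the image means "does not fit"
        if not image[rr][cc]:
            return False
    return True


def hit_and_miss(image, s1, s2):
    """Output pixel is ON iff s1 fits the image AND s2 fits its complement."""
    rows, cols = len(image), len(image[0])
    complement = [[1 - p for p in row] for row in image]
    return [
        [1 if fits(image, s1, r, c) and fits(complement, s2, r, c) else 0
         for c in range(cols)]
        for r in range(rows)
    ]


# s1: the origin must be ON; s2: its four 4-neighbors must be OFF.
# The two elements share no offsets, so they do not intersect.
S1 = [(0, 0)]
S2 = [(-1, 0), (1, 0), (0, -1), (0, 1)]

I = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
O = hit_and_miss(I, S1, S2)
```

Only the pixel at row 1, column 1 is ON in `O`; the two adjacent pixels on the right are rejected because each has an ON neighbor.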



– boundary extraction

• if original image is I, and structuring element is S

– erode I by S to get E

– then do I - E

• example on board
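A small sketch of boundary extraction, assuming a 3x3 square structuring element and a made-up 5x5 image (neither comes from the board example). The helper names are hypothetical, and pixels outside the image are treated as OFF during erosion.

```python
def erode(image, offsets):
    """Binary erosion: keep a pixel only if the whole element fits there."""
    rows, cols = len(image), len(image[0])

    def fits(r, c):
        return all(
            0 <= r + dr < rows and 0 <= c + dc < cols and image[r + dr][c + dc]
            for dr, dc in offsets
        )

    return [[1 if fits(r, c) else 0 for c in range(cols)] for r in range(rows)]


def boundary(image, offsets):
    """I - E: pixels that are ON in I but OFF in the erosion E."""
    eroded = erode(image, offsets)
    return [[i & (1 - e) for i, e in zip(irow, erow)]
            for irow, erow in zip(image, eroded)]


SQUARE_3X3 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

I = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
edge = boundary(I, SQUARE_3X3)
```

The erosion keeps only the center pixel of the 3x3 block, so `edge` is the ring of eight pixels around it.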

• there are also ways to define the morphological operations for grey-scale images, where both the input image and the structuring elements have a range of values for their pixels, but our text does not cover this.


region properties of binary images

• Consider a region to be a set of connected ON pixels

• area of region is simply the count of the number of pixels in the region

• centroid = $(\mathrm{row}_c, \mathrm{col}_c)$, where $\mathrm{row}_c$ = mean row of all pixels in region and $\mathrm{col}_c$ = mean column of all pixels in region

• perimeter of a region without holes is the set of all pixels that are in the region AND have a neighbor outside of the region

– recall neighbor can be defined in several ways, among them, 4-connected and 8-connected

• length of perimeter can be computed by starting at a pixel and travelling around the whole perimeter and arriving back at the starting pixel

– add 1 if the adjacent pixels are 4 connected

– add sqrt(2) if the adjacent pixels are 8 connected but not 4 connected

– Note: if there are n pixels in the perimeter, then there are n pairs of adjacent pixels; hence, n numbers are added (each of which is either 1 or sqrt(2)).
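The definitions on this slide can be sketched as follows. This is an illustrative sketch under a few assumptions: a region is given as a list of (row, column) pixels, perimeter membership uses 4-connected neighbors, and `perimeter_length` expects the perimeter pixels already listed in travel order (tracing that order is not shown). All function names are hypothetical.

```python
import math

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def area(pixels):
    """Area = number of pixels in the region."""
    return len(pixels)


def centroid(pixels):
    """(mean row, mean column) of the region's pixels."""
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def perimeter_pixels(pixels, neighbors=N4):
    """Pixels in the region that have at least one neighbor outside it."""
    region = set(pixels)
    return {
        (r, c) for (r, c) in region
        if any((r + dr, c + dc) not in region for dr, dc in neighbors)
    }


def perimeter_length(ordered):
    """Sum 1 for 4-connected steps and sqrt(2) for diagonal steps,
    walking the closed chain back to the starting pixel."""
    total = 0.0
    n = len(ordered)
    for k in range(n):
        (r0, c0), (r1, c1) = ordered[k], ordered[(k + 1) % n]
        dr, dc = abs(r1 - r0), abs(c1 - c0)
        if dr + dc == 1:
            total += 1.0
        elif dr == 1 and dc == 1:
            total += math.sqrt(2)
        else:
            raise ValueError("consecutive perimeter pixels must be adjacent")
    return total


# A 3x3 square region and its perimeter in travel order
square = [(r, c) for r in range(3) for c in range(3)]
ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
# A 4-pixel diamond outline: every step is diagonal
diamond = [(0, 1), (1, 2), (2, 1), (1, 0)]

square_area = area(square)
square_centroid = centroid(square)
square_perimeter = perimeter_length(ring)
diamond_perimeter = perimeter_length(diamond)
```

For the square, eight steps of length 1 are added; for the diamond outline, four steps of length sqrt(2).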


region properties of binary images

• “Circularity”

– in other texts this is referred to as compactness and is defined as

– $C_1 = (\text{length of perimeter})^2 / \text{area}$

– this is a dimensionless number and in the analog world is minimized by a disk (Ballard & Brown 1982)

– however, for digital shapes, that measure is minimized when the shape is a diamond (when we use the length of perimeter definition as before)

– a different circularity measure (Haralick 1974) which is similar for both digital and analog shapes and monotonically increases as the shape becomes more circular is

• C2 = mean of radial distances / standard dev. of radial distances

• the radial distance of a pixel on the perimeter is the distance between that pixel's center and the centroid of the region
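As a quick check of the analog-world claim for C1, here is a worked comparison using a disk of radius r and a square of side s (both shapes chosen only for illustration):

$$C_1^{\text{disk}} = \frac{(2\pi r)^2}{\pi r^2} = 4\pi \approx 12.57, \qquad C_1^{\text{square}} = \frac{(4s)^2}{s^2} = 16$$

The size parameters cancel, which is why the measure is dimensionless, and the disk gives the smaller value.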


region properties of binary images

• “Circularity”

– C2 = mean of radial distances / standard dev. of radial distances

– the radial distance of a pixel on the perimeter is the distance between that pixel's center and the centroid of the region

– the mean and standard deviation are statistical measures

• the mean is what most people consider the average

• the standard deviation is a measure of how spread out the group is in values

• lower sd => more compact group, higher sd => more spread out

• standard deviation is the square root of the variance

• whatever units the original data are in, the variance is in those units squared; if you want a measure of the spread in the same units as the data, use the standard deviation
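A minimal sketch of the C2 computation follows. It assumes the perimeter pixels and centroid are already known, and it uses the population standard deviation (dividing by the number of perimeter pixels); the function name and the two test shapes are made up for the example.

```python
import math


def circularity_c2(perimeter, centroid):
    """Return (mean, standard deviation, C2) of the radial distances."""
    rc, cc = centroid
    dists = [math.hypot(r - rc, c - cc) for r, c in perimeter]
    mean = sum(dists) / len(dists)
    # assumption: population variance (divide by n, not n - 1)
    variance = sum((d - mean) ** 2 for d in dists) / len(dists)
    sd = math.sqrt(variance)  # same units as the distances
    c2 = mean / sd if sd > 0 else math.inf
    return mean, sd, c2


# Perimeter of a 3x3 square region, centroid at (1, 1)
ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
_, _, c2_square = circularity_c2(ring, (1.0, 1.0))

# A 1x5 bar: every pixel is on the perimeter, centroid at (0, 2)
bar = [(0, c) for c in range(5)]
_, _, c2_bar = circularity_c2(bar, (0.0, 2.0))
```

The square's radial distances are tightly grouped, so its C2 is larger than the elongated bar's, consistent with C2 increasing as a shape becomes more circular.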


histograms

• A histogram of a greyscale image is a function

– whose domain is a set of grey values and

– the value of the histogram function at a grey value is the number of pixels in the image with that grey value

• A bin is a set of contiguous grey values. The above definition uses one grey value per bin.

• Alternatively one can state the number of bins desired and divide up the range of grey values evenly among the bins.

• e.g. for an 8 bit greyscale image, the range of values is 0 to 255

– if we say we want 32 bins, then each bin will represent 256/32 = 8 grey values. 0-7 in bin0, 8-15 in bin1, ..., 248-255 in bin31.

– our histogram will be a function whose domain is the bins and whose value is the number of pixels in the image with any of the grey values corresponding to that bin
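The binning described above can be sketched as below. The function name, the tiny 2x3 image, and the requirement that the number of bins divide the number of grey levels evenly are assumptions made to keep the example short.

```python
def histogram(image, num_bins=256, levels=256):
    """Count pixels per bin; each bin covers levels // num_bins grey values."""
    if levels % num_bins != 0:
        raise ValueError("num_bins must divide the number of grey levels")
    width = levels // num_bins  # e.g. 256 / 32 = 8 grey values per bin
    counts = [0] * num_bins
    for row in image:
        for grey in row:
            counts[grey // width] += 1
    return counts


# Hypothetical 8-bit image with values at the edges of bins 0, 1 and 31
image = [
    [0, 7, 8],
    [15, 248, 255],
]
h32 = histogram(image, num_bins=32)
h256 = histogram(image)  # one grey value per bin
```

With 32 bins, the values 0 and 7 fall in bin 0, 8 and 15 in bin 1, and 248 and 255 in bin 31.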


histograms

• Let's take a look at a histogram of an image in GIMP.


thresholding

• thresholding is a way to make a binary image out of a greyscale image

– assume t and t_h are thresholds with t < t_h (for an 8-bit greyscale image, each is a value between 0 and 255)

– threshold above makes all pixels with greyvalue >= t be 1, others 0

– threshold below makes all pixels with greyvalue < t be 1, others 0

– threshold inside makes all pixels with t <= greyvalue < t_h be 1, others 0

– threshold outside makes all pixels with greyvalue < t OR greyvalue >= t_h be 1, others 0

• can use a histogram to compute these thresholds

– if the histogram is bimodal (has 2 modes) and the modes are distinct (separated by a valley with values near 0), then one can choose a threshold as a greyvalue in the valley (example on the board)
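The four threshold operations can be written directly from their definitions. In this sketch the function names, the one-row image, and the threshold values are illustrative assumptions; the upper threshold is named `t_h`.

```python
def threshold_above(image, t):
    return [[1 if g >= t else 0 for g in row] for row in image]


def threshold_below(image, t):
    return [[1 if g < t else 0 for g in row] for row in image]


def threshold_inside(image, t, t_h):
    return [[1 if t <= g < t_h else 0 for g in row] for row in image]


def threshold_outside(image, t, t_h):
    return [[1 if g < t or g >= t_h else 0 for g in row] for row in image]


# Hypothetical one-row greyscale image and thresholds
image = [[10, 100, 200]]
t, t_h = 50, 150

above = threshold_above(image, t)
below = threshold_below(image, t)
inside = threshold_inside(image, t, t_h)
outside = threshold_outside(image, t, t_h)
```

Note that "below" is the complement of "above", and "outside" is the complement of "inside", for the same thresholds.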


thresholding

• Otsu method of automatic threshold determination (1979)

• assumes a bimodal distribution (may not work well when this assumption does not hold)

• range of grey values: 0 to I

• Histogram probabilities for each grey value

– P(i) = # of pixels with greyvalue i / total # of pixels in image

• The procedure works as follows

– choose a threshold t

– compute the variance of

• the pixels with grey values less than or equal to t

• the pixels with grey values greater than t

– the best t is the one that minimizes the weighted sum of these two variances (the Within-Group variance)


thresholding

• assume group 1 is comprised of the pixels with grey values <= t and group 2 is comprised of the others

• the Within-Group variance is defined to be

$$q_1(t)\,\mathrm{Var}_1(t) + q_2(t)\,\mathrm{Var}_2(t)$$

where

$$q_1(t) = \sum_{i \le t} P(i), \qquad q_2(t) = \sum_{i > t} P(i)$$

• recomputing all of these quantities from scratch for every possible threshold t would be very inefficient. However, the search can be made more efficient by factoring out the computations that do not depend on t.
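For reference, the direct (inefficient) search can be sketched as below: every candidate t recomputes both group weights, means, and variances from the histogram. The function name and the six-level example histogram are assumptions for illustration; thresholds that leave one group empty are skipped.

```python
def otsu_within_group(hist):
    """Brute-force search for the t that minimizes q1*Var1 + q2*Var2."""
    total = sum(hist)
    P = [h / total for h in hist]  # P(i): fraction of pixels with grey value i
    levels = len(P)
    best_t, best_within = None, float("inf")
    for t in range(levels - 1):
        q1 = sum(P[: t + 1])   # weight of group 1 (grey values <= t)
        q2 = sum(P[t + 1:])    # weight of group 2 (grey values > t)
        if q1 == 0 or q2 == 0:
            continue  # one group is empty, variance undefined
        mean1 = sum(i * P[i] for i in range(t + 1)) / q1
        mean2 = sum(i * P[i] for i in range(t + 1, levels)) / q2
        var1 = sum((i - mean1) ** 2 * P[i] for i in range(t + 1)) / q1
        var2 = sum((i - mean2) ** 2 * P[i] for i in range(t + 1, levels)) / q2
        within = q1 * var1 + q2 * var2
        if within < best_within:
            best_t, best_within = t, within
    return best_t, best_within


# Hypothetical histogram over grey values 0..5 with a valley at value 2
hist = [10, 10, 1, 10, 10, 10]
best_t, best_within = otsu_within_group(hist)
```

Each candidate threshold costs time proportional to the number of grey levels here, so the whole search is quadratic in the number of levels.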


thresholding

• Instead of minimizing the Within-Group variance, one can maximize the Between-Group variance, which is a formula containing only $q_1(t)$ and the means of the two groups.

• $q_1(t+1)$ is easily computed from $q_1(t)$. How?

• $\mathrm{mean}_1(t+1)$ can be computed from $\mathrm{mean}_1(t)$ and $q_1(t)$

• similarly for $\mathrm{mean}_2(t+1)$
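One way to fill in these updates is sketched below, using the standard form of the Between-Group variance, q1(t) * (1 - q1(t)) * (mean1(t) - mean2(t))^2, and the recurrences q1(t+1) = q1(t) + P(t+1) and mean1(t+1) = (q1(t) * mean1(t) + (t+1) * P(t+1)) / q1(t+1). mean2 is recovered from the overall mean. The function name and example histogram are assumptions, and this is a sketch rather than the course's reference implementation.

```python
def otsu_threshold(hist):
    """Single pass over the histogram, maximizing the Between-Group variance."""
    total = sum(hist)
    P = [h / total for h in hist]
    mean_all = sum(i * p for i, p in enumerate(P))  # does not depend on t

    q1 = P[0]     # q1(0)
    mean1 = 0.0   # at t = 0, group 1 holds only grey value 0
    best_t, best_between = None, -1.0
    for t in range(len(P) - 1):
        if t > 0:
            q1_next = q1 + P[t]  # q1(t) = q1(t-1) + P(t)
            if q1_next > 0:
                mean1 = (q1 * mean1 + t * P[t]) / q1_next
            q1 = q1_next
        q2 = 1.0 - q1
        if q1 > 0 and q2 > 0:
            mean2 = (mean_all - q1 * mean1) / q2
            between = q1 * q2 * (mean1 - mean2) ** 2
            if between > best_between:
                best_t, best_between = t, between
    return best_t


# Hypothetical histogram over grey values 0..5 with a valley at value 2
hist = [10, 10, 1, 10, 10, 10]
best_t = otsu_threshold(hist)
```

Each step does a constant amount of work, so the search is linear in the number of grey levels. Because the total variance is the sum of the Within-Group and Between-Group variances and does not depend on t, maximizing one is equivalent to minimizing the other.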


thresholding

• Dynamic thresholding - If the intensities of the pixels are strongly dependent on the location in the image, then one can use local thresholds (and apply them only in the local area) as opposed to global thresholds

• Knowledge-based thresholding – uses prior knowledge of the shape/size of objects to determine the regions and the thresholds
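A very small sketch of the dynamic (local) thresholding idea follows. The slides do not say how each local threshold is chosen; here, purely as an assumption, the image is split into square blocks and each block is thresholded at its own mean. The function name, block size, and example image are hypothetical.

```python
def local_threshold(image, block):
    """Threshold each block x block tile at that tile's mean grey value."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            cells = [
                (r, c)
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            ]
            # assumption: the local threshold is the tile's mean
            t = sum(image[r][c] for r, c in cells) / len(cells)
            for r, c in cells:
                out[r][c] = 1 if image[r][c] >= t else 0
    return out


# Hypothetical image whose right half is much brighter than its left half
image = [
    [10, 30, 110, 130],
    [10, 30, 110, 130],
]
local = local_threshold(image, block=2)

# For comparison: one global threshold at the overall mean (70)
global_t = sum(map(sum, image)) / 8
global_result = [[1 if g >= global_t else 0 for g in row] for row in image]
```

The global threshold simply separates the dark half from the bright half, while the local thresholds pick out the brighter column within each half. A tile of uniform intensity would come out entirely ON under this simple rule, so a practical version would need a better local threshold choice, such as applying Otsu's method per tile.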
