
RANGE IMAGE SEGMENTATION THROUGH

PATTERN ANALYSIS OF THE MULTI-SCALE

WAVELET TRANSFORM

A Thesis

Presented for the

Master of Science

Degree

The University of Tennessee, Knoxville

Samuel G. Burgiss Jr.

August 1998

ACKNOWLEDGEMENTS

I would like to thank my parents, Samuel and Janet Burgiss, for their support. Much

gratitude goes to my fiancée, Heather, for her encouragement and patience, especially

during the writing of this document. Thanks to Dr. M. A. Abidi for selecting me for the

Graduate Research Assistant position in the Imaging, Robotics and Intelligent Systems

laboratory that made obtaining my master's degree financially possible. I also wish to

thank my advisors, Dr. R. T. Whitaker and Dr. M. A. Abidi, for their guidance throughout

my program. Thanks also to the members of my committee, Dr. R. T. Whitaker, Dr. M.

A. Abidi and Dr. J. Gregor for their help and constructive criticism.

The work in this thesis was supported by the DOE's University Research Program in

Robotics (Universities of Florida, Michigan, New Mexico, Tennessee, and Texas) under

grant DOE-DE-FG02-86NE37968.

Thanks to R. E. Barry and the Oak Ridge National Laboratory, Oak Ridge, Tennessee

37831, managed by Lockheed Martin Energy Research Corp. for the U.S. Department

of Energy under contract DE-AC05-96OR22464. They provided the Coleman laser range

data.


ABSTRACT

This work presents an image segmentation method for range data that uses multi-scale

wavelet analysis in combination with pattern recognition. To segment range images we

develop PASSEF (pattern analysis of scale space for the detection of features). PASSEF

creates a fuzzy edge map and we then apply a morphological watershed algorithm to this

map to create a segmentation.

The PASSEF system uses pattern recognition to classify points in an image based

on response to a feature detector over scale. A scale-space signature is the vector of

measurements at different scales taken at a single point in an image. We train PASSEF

with scale-space signatures from the edge points of a training image. Once trained, the

system can determine the degree of edgeness of points in a new image.

A feature-detection framework based on multi-scale analysis and pattern-recognition

has several potential advantages over other feature-detection systems. Our goal is to create

a system that exploits the advantages of a multi-scale, pattern-recognition framework.

These advantages are detection of features at different scales (i.e. features of all sizes),

robustness to noise, and few or no free parameters. We discuss these advantages in relation

to the development of the PASSEF system and provide a critical analysis of the system

based on these three goals. The PASSEF system achieves the stated goals for the detection

of step-edge features. Our results also show that this technique might be useful in the

detection of other features such as crease edges. We suggest future work for extending the

capabilities of the system.


Contents

1. Introduction
   1.1 Problem Statement
   1.2 Strategy
   1.3 Overview of Thesis

2. Related Work
   2.1 Segmentation Using Wavelets
      2.1.1 Region-Based Segmentation
      2.1.2 Hybrid Segmentation Using Wavelets
      2.1.3 Edge-based Segmentation Using Wavelets
   2.2 Multi-Scale Feature Extraction
      2.2.1 Scale Choice
      2.2.2 Scale Traversal
      2.2.3 Collective Analysis

3. Multi-Scale Feature Extraction
   3.1 Step-Edge Detection
   3.2 Crease-Edge Detection
   3.3 Segmentation from Feature Extraction
   3.4 Fuzzy Reasoning Approach to Scale-Space Fusion

4. Scale-Space Combination Algorithm
   4.1 Analysis of the Scale Space of the Step-Edge Operator
   4.2 Analysis of the Scale Space of the Crease-Edge Operator
   4.3 Motivation for Using Pattern-Recognition
   4.4 Modeling Scale Space Using Gaussian Blobs
   4.5 Fusion of the Step-Edge and Crease-Edge Detection Maps

5. Results and Analysis of System
   5.1 Training Data
   5.2 Detection of Objects at All Scales in the Presence of Noise
   5.3 Free Parameters in the PASSEF System
      5.3.1 Number of Scales, n
      5.3.2 Number of Blobs, N
      5.3.3 Noise Level in Training Data, σ
      5.3.4 Minimum Watershed Depth, d
   5.4 Images from Different Acquiring Devices
   5.5 Crease-Edge Detection

6. Conclusions

BIBLIOGRAPHY

APPENDICES

A. Background
   A.1 Wavelet Theory
      A.1.1 General
      A.1.2 Properties of Wavelets
      A.1.3 Examples of Wavelet Functions

VITA


List of Tables

5.1 Training Data used with the PASSEF System
5.2 Free parameters in the PASSEF system and segmentation algorithm


List of Figures

1.1 First and second derivatives of 1-D signal
1.2 Multi-scale derivative of Gaussian operator applied to 1-D signal containing step edges
3.1 Representations of the Wavelet Transform
3.2 Implementation of the Wavelet Transform
3.3 A synthetic image containing step edges
3.4 A synthetic image containing crease edges
3.5 Region formed from catchment basin
3.6 The step and step10 images
3.7 Scale space of the step-edge operator applied to the step10 image
3.8 Watershed algorithm applied to step-edge operator images from the step10 image
3.9 Fuzzy fusion (Bernoulli's Rule of Combination) applied to scale space of step10 image
3.10 Watershed algorithm applied to fuzzy fusion (Bernoulli's Rule of Combination) result for step10 image
3.11 The hydro3 image (size 256x256)
3.12 Scale space of the step-edge operator applied to the hydro3 image
3.13 Watershed algorithm applied to step-edge operator images from the hydro3 image
3.14 Fuzzy fusion (Bernoulli's Rule of Combination) applied to scale space of hydro3 image
3.15 Watershed algorithm applied to fuzzy fusion (Bernoulli's Rule of Combination) result for hydro3 image
4.1 Creation of a scale-space signature from scale-space data
4.2 The block image
4.3 Scale-space signatures of each pixel in the block image with added Gaussian noise (σ = 0.1)
4.4 Creation of a collective scale space from edge points in the step image
4.5 2-D scale-space projections from the step10 image
4.6 The pyramid image
4.7 Scale-space signatures of each pixel in the simple pyramid image with added Gaussian noise (σ = 0.1)
4.8 Synthetic 8-sided-cone image
4.9 2-D scale-space projections from the 8-sided-cone image
4.10 Pattern recognition to determine edgeness of scale-space signatures
4.11 2-D scale-space projections from the step10 and 8-sided-cone images
4.12 Flow of entire Pattern Analysis of Scale Space for Extraction of Features (PASSEF) system
4.13 Synthetic highbay image
4.14 2-D projection of scale space for edge points of the highbay image
4.15 K-means applied to scale space of step-edge points of highbay image
5.1 Training data from the step10 image
5.2 Training data from the 8-sided-cone image
5.3 Training data from the highbay image
5.4 Results of applying PASSEF and the step-edge operator to the step10 image
5.5 The step30 image (size 256x256)
5.6 Results of applying PASSEF and the step-edge operator to the step30 image
5.7 Application of the step-edge operator to the step10 image at various scales
5.8 Watershed algorithm applied to step-edge operator results of the step10 image
5.9 Application of the step-edge operator to the step30 image at various scales
5.10 Watershed algorithm applied to step-edge operator results of the step30 image
5.11 The hydro3 image (size 256x256)
5.12 Results of applying PASSEF and the step-edge operator to the hydro3 image
5.13 Application of the PASSEF system to the step10 image for varying values of n
5.14 2-D scale-space projections of training data sets
5.15 Application of the PASSEF system to the step10 image for varying values of N
5.16 The PASSEF system applied to highbay synthetic image for varying values of σ
5.17 The hydro6 image (size 256x256)
5.18 Results of applying PASSEF and the step-edge operator to the hydro6 image
5.19 The cone image (size 256x256)
5.20 Application of the step-edge operator to the cone image
5.21 Results of applying PASSEF to the cone image
5.22 Histogram equalization of the result of applying PASSEF to the cone image
5.23 The polyhedral1 and polyhedral2 images
5.24 Application of the step-edge operator to the polyhedral1 image
5.25 Results of applying PASSEF to the 8-sided-cone image
5.26 Results of applying PASSEF to the highbay image
5.27 Fusion of step-edge and crease-edge operator results using PASSEF
A.1 Representations of the Wavelet Transform
A.2 Haar wavelet
A.3 Daubechies wavelet
A.4 Derivative of a quadratic spline Wavelet
A.5 DOG Wavelet
A.6 4 Coefficient Wavelet Space


CHAPTER 1

Introduction

1.1 Problem Statement

Image segmentation is a difficult problem, but it is also an essential pre-processing

step in many computer vision systems [1]. In the case of range images, segmentation is

performed in order to locate objects, fit surfaces, and render volumes. Segmentation of a

range image is the process of dividing the image into areas that are associated with specific

objects or object components in a scene. Often, these objects or object components can

be identified in a range image as patches that are relatively uniform in range value or

surface shape.

The literature is replete with novel segmentation algorithms. Traditionally, these algorithms

must be tuned in order to segment images for a particular system (i.e. camera

or scanner) or under other varying conditions [2]. Segmentation occurs at

a particular scale. Some segmentation algorithms leave scale as an adjustable parameter,

whereas other algorithms attempt to find the best scale(s) for segmentation. Some methods

create segmentations based heavily on one scale and refined by other scales [3, 4, 5].

More advanced techniques extract relevant information by intelligently examining all scales

[6, 7]. Noise occurs in an image at a particular scale or set of scales; therefore, in order

to produce a proper segmentation, algorithms that choose scale must avoid scales that

contain relatively large amounts of noise.

Our goal is to develop an algorithm that possesses three qualities:

• Detection of objects at all scales in the presence of noise.

• Few free parameters.

• Applicability to images from different acquiring devices.

This thesis presents the results of our attempt to develop a system that adheres to

these criteria. We develop a system that can segment some different types of images

without parameter adjustment. Additionally, this system segments objects in images at

a range of scales in the presence of sensor noise. Attempts to generalize the system to

crease edges demonstrate some of the strengths and weaknesses of this approach.

1.2 Strategy

We propose to segment range images using multi-scale feature extraction. We detect

edges as features in a range image, and from this edge map we create a segmentation.

There are two concepts that are integral to our strategy: the first is image segmentation

through feature extraction and the second is the extraction of features at multiple scales.

In this section we describe how these two ideas lead to the development of our strategy.

Several strategies for segmenting images exist. Segmentation techniques are commonly

divided into three broad categories: edge-based, region-based, and hybrid methods.

Edge-based methods are derived from feature extraction ideas; these methods detect edges as

features in an image. Filters that find discontinuities in pixel value (e.g. intensity or range)

are commonly used to find edges. After edges are detected, regions that are outlined by

these edges are labeled as objects or regions. Region-based methods work by combining

areas of similar value. These methods usually grow or otherwise group pixels of similar

value. For example, initial seeds can be established and then grown to create the desired

regions. Hybrid methods use both region-based and edge-based tactics to create a final

segmentation.
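The seed-growing idea mentioned for region-based methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not a method used in this thesis: the function name grow_region, the tolerance tol, the 4-connected neighborhood, and the toy range values are all hypothetical choices made for the example.

```python
from collections import deque

import numpy as np


def grow_region(image, seed, tol):
    """Collect the 4-connected pixels whose value is within tol of the seed value."""
    rows, cols = image.shape
    seed_value = image[seed]
    region = np.zeros(image.shape, dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not region[nr, nc] and abs(image[nr, nc] - seed_value) <= tol:
                region[nr, nc] = True
                queue.append((nr, nc))
    return region


# Toy range image (assumed values): a near surface beside a far surface.
toy = np.array([[1.0, 1.1, 5.0, 5.1],
                [1.0, 0.9, 5.2, 5.0],
                [1.1, 1.0, 4.9, 5.0]])
near = grow_region(toy, seed=(0, 0), tol=0.5)
far = grow_region(toy, seed=(0, 3), tol=0.5)
```

Each seed grows only across pixels of similar range value, so the two regions stop at the range discontinuity between the surfaces.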

The Canny technique [5] and Marr-Hildreth technique [8] are two of the more popular

methods of edge-based segmentation. Both of these techniques estimate the derivative of

the image at each point by using a convolution mask and create an edge map based on the

result. Figure 1.1 shows the first and second derivatives of a 1-D signal containing edge

models. Notice that the maxima and minima of the first derivative of the signal indicate

the location of a step edge. The zero crossings of the second derivative also show where

the step edges occur.

[Figure 1.1: First and second derivatives of 1-D signal. Panels: 1-D signal; 1st derivative of signal; 2nd derivative of signal.]
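The relationship between the derivatives and the edge location can be checked numerically. The following Python sketch is a minimal illustration, assuming a tanh-shaped edge model and NumPy finite differences in place of the convolution masks used by the cited techniques; the signal length and edge position are arbitrary.

```python
import numpy as np

# Smooth step edge centered at index 20 (tanh is an assumed edge model).
x = np.arange(40)
signal = np.tanh((x - 20) / 3.0)

first = np.gradient(signal)   # finite-difference estimate of the first derivative
second = np.gradient(first)   # finite-difference estimate of the second derivative

# The extremum of the first derivative marks the edge.
edge_from_first = int(np.argmax(np.abs(first)))

# A positive-to-nonpositive sign change of the second derivative marks the same edge.
zero_crossings = np.where((second[:-1] > 0) & (second[1:] <= 0))[0]
```

The zero crossing of the second derivative falls within one sample of the first-derivative extremum, which is the behavior the figure illustrates.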

An important issue that figure 1.1 does not address is scale. Scale refers to the idea

that objects in the world exist or are relevant over a limited range of sizes or distances

[9]. For example, the branch of a tree is relevant only at the scale from a foot to a few

yards. At the scale of an inch the bark or other features are examined. At the distance

of fifty yards the entire tree is relevant. Scale is relevant to feature extraction because

feature extraction must occur at a particular scale, whether or not that scale is explicitly

represented as a free parameter in the algorithm.

The idea of scale space refers to a family of derived signals where the fine-scale information

is successively suppressed as scale increases [3]. Figure 1.2 shows the multi-scale

derivative of a 1-D signal containing step edges. This derivative is created by convolving

the derivative of a Gaussian (DOG) function with the sample signal. Each scale of the

result is created using a different standard deviation for the DOG function.

[Figure 1.2: Multi-scale derivative of Gaussian operator applied to 1-D signal containing step edges. Panels: 1-D signal containing edges; multiscale derivative.]

This figure demonstrates two concepts of scale that are vital to our work. First, the

figure shows that as scale increases, derivatives computed at these scales become more

smoothed. This means that in this scale space of derivatives there is a trade-off between

the level of noise reduction and the accuracy of the result. Secondly, notice that in figure

1.2 the last scales appear to present no new information. They are so smoothed that they

are almost flat. This observation suggests that only scales up to a certain finite size are

needed to analyze particular sets of data.
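A derivative scale space of this kind can be sketched as follows. This is an illustrative Python fragment and not the wavelet implementation used later in this thesis: the sampled derivative-of-Gaussian kernel, its truncation at four standard deviations, the box-shaped test signal, and the list of standard deviations are all assumptions.

```python
import numpy as np


def gaussian_derivative_kernel(sigma):
    """Sampled first derivative of a Gaussian, truncated at 4 * sigma."""
    radius = int(np.ceil(4 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    gauss = np.exp(-t**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return -t / sigma**2 * gauss


def derivative_scale_space(signal, sigmas):
    """Stack one smoothed-derivative row per standard deviation in sigmas."""
    rows = [np.convolve(signal, gaussian_derivative_kernel(s), mode="same") for s in sigmas]
    return np.stack(rows)


# Box signal with a rising step edge at index 200 and a falling one at index 300.
signal = np.zeros(512)
signal[200:300] = 1.0

sigmas = [1, 2, 4, 8, 16]
space = derivative_scale_space(signal, sigmas)

# Largest response magnitude at each scale; it shrinks as the scale grows.
peaks = np.abs(space).max(axis=1)

# The column at one pixel is the vector of responses over scale at that point.
signature_at_edge = space[:, 200]
```

The shrinking peak values mirror the flattening seen at the coarsest scales, and a single column of the array is the kind of per-point vector of measurements over scale that the abstract calls a scale-space signature.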

In image processing, multi-scale analysis provides a representation of an image that

allows information from each scale to be analyzed separately. The wavelet transform

provides a tool for creating such multi-scale data. In this work we use the derivative of a

cubic-spline wavelet [10], which has the following desirable properties: compact support,

symmetry, differentiability (to a finite degree), and one zero crossing. This wavelet creates

a scale space of first-order difference information. Edge-detection operators are derived

from this first-order difference information.

We want to segment objects of all scales in an image; therefore, using edge detection

as a means to segmentation requires that we extract edges at all scales. Conventional edge

detectors extract edges at only one scale [5, 8, 11]. Thus, combining information from

these edge detectors at multiple scales allows the detection of edges at all scales in an

image. In addition we want to segment these objects in the presence of noise. The noise

versus accuracy trade-off in a derivative scale space leads us to suppose that if a system

could locate noise in the scale space, the scale-space information could be weighted to

achieve proper multi-scale edge detection in the presence of this noise. Thus, multi-scale

analysis not only allows for the detection of multi-scale edges but also provides a way

to avoid improper detection of noise.

1.3 Overview of Thesis

Chapter 2 of this thesis gives an overview of work related to range image segmentation.

Wavelet segmentation and multi-scale segmentation methods are discussed. In chapter 3

we derive two edge operators from the wavelet transform. We examine the scale-space

properties of these edge operators. In chapter 4 we present a detailed description of the

range-image segmentation system that we have developed. We also give our motivation

for using a pattern-recognition system to analyze scale space. Issues that determine the

system's success are discussed. Chapter 5 is a presentation of results of the system. Results

for synthetic and real data are shown as well as an evaluation of the performance of the

system. We conclude with chapter 6 which summarizes our work and discusses ideas for

future investigations in these areas.


CHAPTER 2

Related Work

We are proposing the use of multi-scale wavelet information for the purpose of segmenting

range images. There are two areas in the literature that are relevant. The first

area of related work is the use of wavelets for segmentation. We describe region-based,

edge-based, and hybrid wavelet segmentation methods. The second area of related work

is multi-scale feature extraction. In our discussion of multi-scale feature extraction we

give an account of how multi-scale feature extraction ideas have evolved. We begin with

methods that utilize intelligent choice of scale, then proceed to edge focusing, and finally

describe schemes that extract information by examining the entire scale space.

2.1 Segmentation Using Wavelets

In this section we present three types of wavelet-based segmentation algorithms. These

are region-based, edge-based, and hybrid wavelet segmentation methods. Most region-based

segmentation methods that use wavelets segment images on the basis of texture.

We discuss a variety of edge-based algorithms, most of which are based on the work of

Mallat. We present only a short discussion of hybrid methods because these methods are

not very prevalent in the literature.

2.1.1 Region-Based Segmentation

Virtually all region-based wavelet segmentation methods are based on texture analysis.

In our discussion of texture-based methods we examine the motivation for texture-based

segmentation. A natural tool for texture analysis is the wavelet transform. We describe

how wavelet analysis is utilized to achieve segmentations based on texture and indicate

areas of application for texture-based segmentation using the wavelet transform.

The main idea of texture-based segmentation is that an image is made up of different

textures, "the set of local neighborhood properties of the gray levels of an image region"

[12], and each of these different textures can be described by a small number of characteristic

frequencies. The wavelet transform provides a tool that can extract these frequencies

locally within the regions of the image [13]. An overview of the use of wavelets for texture

analysis is provided in [12]. In that work the authors discuss the application of texture

analysis to feature extraction and the extension of texture-based feature extraction to

segmentation. Pixels in an image can be classified based on local texture characteristics.

Application of this classification to each pixel of an entire image produces regions in the

image that are invariant in local texture properties.

One example of fundamental work in this area comes from Laine et al. [14, 15, 16,

17]. Laine and Fan compare conventional texture analysis with wavelet-based texture

analysis [14, 16]. They develop a segmentation that clusters pixels of like texture by

grouping feature vectors in a wavelet space. They group feature vectors using a K-Means

algorithm. Many other authors use this approach with different wavelet transforms or

different clustering algorithms (e.g. [18, 19, 20, 21, 22, 23, 24, 20]).
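The clustering step in this family of methods can be illustrated with a plain K-means (Lloyd's algorithm) sketch in Python. It is not the implementation of Laine and Fan or of any cited work: the function name kmeans, the choice of the first k vectors as initial centers, and the two-component feature vectors are hypothetical stand-ins for per-pixel wavelet texture features.

```python
import numpy as np


def kmeans(features, k, iterations=20):
    """Plain Lloyd's algorithm; the first k feature vectors seed the centers."""
    centers = features[:k].astype(float).copy()
    for _ in range(iterations):
        # Assign each feature vector to its nearest center.
        distances = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each center to the mean of its members (unchanged if it has none).
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels, centers


# Hypothetical per-pixel features, e.g. local energy in two wavelet bands.
features = np.array([[0.1, 0.9], [0.9, 0.1], [0.2, 0.8],
                     [0.8, 0.2], [0.15, 0.85], [0.85, 0.15]])
labels, centers = kmeans(features, k=2)
```

Pixels that receive the same label would form one texture region in the segmentation.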

Texture analysis has been applied to a myriad of image types. For example, due

to the nature of some medical images (e.g. each tissue type in medical images might

exhibit a certain texture), this segmentation method has become popular in that field.

Texture analysis with wavelets has been used to segment various types of medical images

[25, 26, 27, 28]. In other fields this scheme has been applied to various radar images

[29, 30, 31].


2.1.2 Hybrid Segmentation Using Wavelets

As mentioned, wavelets are utilized in only a few hybrid segmentation methods. We

discuss only two examples here. The first is by Ramos, Hemami, and Tamburro [32]. These

authors attempt to model the psycho-visual system of humans. They assert that there

are three image components of distinct perceptual significance to humans: strong edges,

smooth regions, and textured regions. They divide an image into 8x8 or 16x16 blocks, then

classify these blocks as strong edge, smooth, or textured regions. They extract multi-scale

information using the wavelet transform discussed in [10]. Several rules are applied to the

wavelet transform response of all blocks in an image to classify these blocks into one of

the previously mentioned categories.

Another hybrid algorithm integrates local fractal dimension (derived using the wavelet

transform) and edge information into a region growing method [33]. The authors state

that the fractal dimension is a measure of surface roughness (i.e. texture). They use the

Sobel operator to detect edges in the image. The fractal dimension information and edge

information control region growing in the image.

2.1.3 Edge-based Segmentation Using Wavelets

Mallat is the major contributor to edge-based segmentation using wavelets [10]. Most edge-

based segmentation algorithms that utilize wavelets use the wavelet transform developed

by Mallat in [10] to detect discontinuities (i.e. edges) in an image. This section focuses on

his work in [10], and at the end of this section we present a few examples of other related

work.

Mallat's goal is actually to create an image compression scheme. For this he proposes

to use an edge map in conjunction with information that describes the regions created

by edge contours in the edge map [10]. He shows that using the wavelet transform to

detect edges is similar to the Canny edge-detection method [5]. If the basis function of


the wavelet transform is the derivative of a Gaussian (DOG), then this method of edge

detection is equivalent to Canny edge detection.

Mallat's first step in creating an edge map using his wavelet-based edge detection is to

determine which wavelet function to use. In choosing the wavelet function Mallat shows

that the desired properties are compact support, symmetry, and one vanishing moment.

He uses the derivative of a cubic spline, which is a quadratic spline, as the basis function

of his wavelet transform.

Mallat applies his quadratic-spline wavelet transform to several test images. He maps

the modulus maxima of the transform as the edges of an image. He uses three scales of the

transform to construct the characterization of the image. His characterizations of images

are suitable for only those applications in which accuracy is not critical; detailed areas of

the image become blurred.
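The modulus-maxima rule can be illustrated in one dimension. The Python sketch below is an assumption-laden stand-in and not Mallat's algorithm: a sampled derivative-of-Gaussian kernel replaces the quadratic-spline wavelet, a single scale is used, and the function names, the threshold, and the test profile are hypothetical.

```python
import numpy as np


def gaussian_derivative_kernel(sigma):
    """Sampled first derivative of a Gaussian, truncated at 4 * sigma."""
    radius = int(np.ceil(4 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    gauss = np.exp(-t**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return -t / sigma**2 * gauss


def modulus_maxima(coeffs, threshold):
    """Indices where |coeffs| is a local maximum larger than threshold."""
    modulus = np.abs(coeffs)
    center = modulus[1:-1]
    is_peak = (center > modulus[:-2]) & (center >= modulus[2:]) & (center > threshold)
    return np.flatnonzero(is_peak) + 1


# Box profile whose two edges pass through half height at indices 60 and 140.
signal = np.zeros(200)
signal[61:140] = 1.0
signal[60] = signal[140] = 0.5

coeffs = np.convolve(signal, gaussian_derivative_kernel(2.0), mode="same")
edges = modulus_maxima(coeffs, threshold=0.05)
```

The threshold discards the negligible responses in flat areas, so only the two edge locations remain as modulus maxima.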

Many authors use Mallat's wavelet transform to perform edge detection. For example,

Sheng and Chevrette use Mallat's wavelet transform for object recognition [34]. They

train a neural network with contour information for several objects. They apply the

neural network to contours extracted from an image via Mallat's wavelet transform. The

neural network classifies the contours as one of the training objects. Other examples are

found in [35, 36, 37, 38, 39, 40].

In his Master's thesis Neiroukh [41] examines the use of the wavelet transform for

segmentation of range images. His method of obtaining the edge map is similar to Mallat's

in that he maps the modulus maxima of the quadratic-spline wavelet transform as the

edges of the image. He also uses only one scale of the wavelet transform as Mallat did.

After obtaining an edge map, Neiroukh links edges, labels regions, and merges regions

below a certain size with larger regions. He concludes that his method creates an accurate

segmentation of synthetic and real images even in the presence of noise.

Some authors perform edge detection using wavelet transforms that are not discussed


by Mallat. Aydin et al. use an M-band wavelet transform because they wish to detect

edges using second-order information [42, 43]. Applying this wavelet transform to an

image yields a map where zero crossings indicate edge points in the image (as opposed to

Mallat's transform in which edges are indicated by maxima). Their approach is similar

to the Marr and Hildreth technique [8]. They apply Teager's energy operator to reduce the

number of unwanted zero crossings. Other examples of non-Mallat edge detection using

wavelets can be found in [44, 45, 46].

2.2 Multi-Scale Feature Extraction

Scale refers to the idea that objects in the world exist or are relevant over a limited

range of sizes or distances [9]. Scale is relevant to feature extraction because feature

extraction must occur at a particular scale. In this section we discuss in detail literature

that utilizes this multi-scale concept to achieve feature extraction for segmentation. The

breadth of ideas in the area of multi-scale feature extraction is substantial. We divide

multi-scale feature extraction methods into three broad categories: scale choice, scale traversal,

and collective analysis.

These three categories differ in the number of scales used to achieve a feature detection

and/or the manner in which the scale space is analyzed. We say that algorithms that utilize

one scale or a few scales are scale-choice methods. These algorithms may, for example,

determine the quality of edge detection at each scale, and then perform edge detection

at the best scale. We say that methods that perform an initial feature extraction at a

particular scale and then refine that feature extraction with many other scales are scale-traversal

algorithms. One common scheme in this area is edge focusing. Finally, we identify

algorithms that extract features by examining the entire scale space at once as collective

analysis methods.


2.2.1 Scale Choice

First we examine algorithms that use one or two scales. This category has the smallest

representation in the literature. We present three algorithms that all use quite different

techniques to choose scale, but one characteristic divides these scale choice methods into

two groups. There are two types of scale decision approaches: global scale decisions and

local scale decisions. Techniques using global scale decisions determine a best scale for

the entire image. Methods using local scale decisions determine a best scale for a pixel or

(although no examples of this were found) a region.

Khashman and Curtis [47] develop a system that chooses the best scale for edge detection

of a particular range image. They apply a Laplacian of a Gaussian (LOG) operator

at seven different scales to training images. They manually determine the best scale for

these training images. A multi-layer perceptron neural network is trained with the training

images and the best scales for each image. After training, the network is applied to

new range images to determine the best scale for edge detection of these images. Another

scheme that globally chooses scale is [48].

Lindeberg uses local scale decision for edge detection. In other words, a best scale is

determined for pixels on an individual basis [49]. Then edge detection is performed on each

pixel at that pixel's chosen scale. He states that the spatial extent of corresponding image

structures can be indicated by local maxima over scale of normalized derivatives. He uses

these maxima as a guide to locally choose the best scale for edge and ridge detection.

He does this by determining the scale at which the normalized derivative of a point in

an image gives a maximum response. Then he applies the derivative operator at this scale.

Performing this process at each point in the image creates an edge map. He determines

a significance measure for each edge in this edge map. He shows final edge maps that

contain a finite number of the most significant contours.
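The idea of choosing a scale per pixel can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `select_scale` and the response values are hypothetical, the responses are taken to be already scale-normalized, and the sketch simply takes the largest response over the sampled scales rather than reproducing the procedure of [49].

```python
def select_scale(scales, normalized_responses):
    """Return the scale at which the normalized derivative response is largest.

    Both arguments are sequences of equal length; the responses are assumed
    to be scale-normalized already (an assumption of this sketch).
    """
    best = max(range(len(scales)), key=lambda k: abs(normalized_responses[k]))
    return scales[best]


# Hypothetical responses of one pixel at five dyadic scales.
scales = [1, 2, 4, 8, 16]
responses_at_pixel = [0.2, 0.7, 1.1, 0.9, 0.3]
chosen_scale = select_scale(scales, responses_at_pixel)
```

Repeating this choice at every pixel and applying the derivative operator at the chosen scale would give the kind of per-pixel edge map described above.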


Elder also uses local scale decision [50]. He asserts that a minimum reliable scale can

be determined if sensor noise statistics are known a priori. He states that at this minimum

reliable scale and all larger scales the likelihood of error in edge detection due to sensor

noise is below a certain tolerance. So, the idea is to find the minimum scale that blurs the

sensor noise (a small scale feature).

Elder states that because smoothing delocalizes edges the smallest scale that avoids

improper detection of noise as edges is the best scale for edge detection. Therefore, the

minimum reliable scale is the best scale for edge detection. Elder finds the minimum

reliable scale for both first-order and second-order edge detection operators at each pixel.

He then applies these edge detection operators at the chosen scales to create an edge

map. He refines this edge map using various techniques (e.g. contour closure). Another

example of local scale choice is [51].

2.2.2 Scale Traversal

Among the large variety of scale combining edge detection schemes proposed in the lit-

erature there are two basic schools of thought. One school believes that an initial edge

map is created at one scale and then the map is altered according to information from

other scales. The other proposes that information can be better extracted by examining

the entire scale space to create an edge map. This section discusses the idea of creating

an edge map at one scale and then traversing the scale space in order to refine the map.

Bergholm [4] proposed fundamental concepts under the label edge focusing in 1987.

This algorithm combines edge information by traversing from a coarse to a fine scale (and

is therefore referred to as a coarse-to-fine algorithm). He applies the Canny edge detector

to a larger scale and then applies the detector at a smaller scale in the vicinity of edges

found at the larger scale. This process is performed iteratively through decreasing scale.

After Bergholm's work several authors have used and expounded on this same basic


strategy. Lindeberg uses Bergholm's idea in conjunction with his concept of scale blobs

[52]. Lindeberg considers a scale space created by convolving a two-dimensional image with

Laplacian of a Gaussian (LOG) functions at different standard deviations. He considers the

scale parameter as continuous. Zero crossings occur at edge points in an image when the

image is convolved with a LOG function. These zero crossings can be traced through the

scale space. The zero crossings create closed regions in the image plane and in the scale

space, thus, creating volumes or blobs in the three-dimensional space (two-dimensional

images over scale). Lindeberg creates significance values for the scale blobs in his scale

space. He traverses the scale space in a coarse-to-fine manner. He examines edges of the

most significant scale blobs. He traces or focuses these edges through the scale space to

create an edge detection.

One group, Williams and Shah [53], create contours at the largest scale by combining

what they determine to be the strongest edge points. They use four measures to determine

the strength of an edge point. One of these measures quantifies the level of noise around

the edge point. After determining the contours at the largest scale, end points of the

contours are examined in a direction similar to the end point of the contour at lower

scales. If edge points exist in the correct position at lower scales, the small scale edges are

added to the edge map.

Another group, Qian and Huang [54, 55], create an initial map from the smallest

scale and then add salient edges from larger scales. This method is referred to as a

fine-to-coarse algorithm. Qian and Huang begin with a small scale edge map and add

only salient edges from each larger scale to the final edge map. They find edge points

at multiple scales using LOG filters. A multi-step process determines the salience of an

edge contour. The process determines edge strength based on gradient magnitude of edge

points in the Gaussian blurred image at a particular scale. Edge strengths are normalized

based on edge contour length. Thresholding these strength values based on a global noise


approximation determines edge salience.

Another approach traverses scale, but uses a different blurring technique. Whitaker

and Pizer assert that the use of Gaussian functions for blurring, which are used in a

significant number of multi-scale edge extraction techniques, has difficulties [56]. Gaussian

blurring, especially at large scales, causes inaccurate edge detection. They use

edge-affected diffusion in order to more accurately detect edges. This diffusion technique limits

blurring according to the presence of edges. They apply this technique at successively

smaller scales. Results for synthetic images show that the edge-affected diffusion is an

appealing alternative to Gaussian blurring.

2.2.3 Collective Analysis

An alternative to methods that traverse scale space is to examine scale space collectively.

We found varied approaches to collective scale-space analysis. These different methods

apply techniques such as data fusion and pattern recognition to scale-space information

in order to extract features.

Wang [57] presents an algorithm which fuses scale-space information from a morpho-

logical gradient operator in an additive manner. Wang's work concentrates on the advan-

tages of using a morphological gradient operator and the application of a morphological

watershed algorithm to create a segmentation from his edge map. His gradient operator

is the difference of a signal eroded by a structuring element and the signal dilated by the

same structuring element of scale i. Before summing the scale-space information for this

operator the result of this operator is eroded by the structuring element of scale i - 1.

Because all scales are weighted equally in this method the maximum scale used greatly

affects the results.

Baxter and Coggins present a different approach [58]. They assert that pixels can be

classified into different regions based on properties in space and scale. They represent


pixels in an image by patterns or vectors in an n-dimensional feature space. The feature

vectors are the outputs of a series of n spatial filters at a particular spatial location. They

use filters that form clusters of pixels they wish to classify in the corresponding feature

spaces.

They apply their system to two different image types. They analyze objective prism

images, which are obtained by inserting a prism in a telescope. These images record the

stellar spectra of stars. The authors wish to segment the stellar spectra from the back-

ground. (Conventional thresholding methods are not applicable because of the magnitude

of the noise.) They apply two filters to the astronomical images to create a 2-dimensional

feature space. With this feature space they are able to differentiate spectra information

and noise. They also analyze cell images. They create a ten-dimensional space using

isotropic Gaussian filters. They attempt to segment nucleolar organizer regions (NORs)

and nuclei from the background of the cell images. (Conventional thresholding is not

applicable due to the presence of objects of similar value.) They use supervised pixel

classification to classify all the pixels in a cell image. A cell image is manually labeled

to serve as a training image. Training feature vectors form two classes (one representing

NORs and the other representing nuclei) in the ten-dimensional space. If a pixel in an

image being analyzed is close to a training class in the feature space the pixel is labeled

as being the respective class. If a pixel is not close to either class then it is labeled as

background. They conclude that classification for the cell images is not perfect.
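The labeling rule described for the cell images (assign the nearest training class when it is close, otherwise label as background) can be sketched as follows. The sketch makes several assumptions that are not taken from [58]: each class is summarized by a single mean vector, closeness is Euclidean distance, and `max_distance` is a hypothetical threshold. Two-dimensional vectors are used only to keep the example short.

```python
import math


def classify_pixel(feature, class_means, max_distance):
    """Label a feature vector with the nearest class mean.

    If no class mean lies within max_distance (hypothetical threshold),
    the pixel is labeled "background".
    """
    best_label, best_dist = "background", max_distance
    for label, mean in class_means.items():
        d = math.dist(feature, mean)  # Euclidean distance (assumed metric)
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical class means in a small feature space.
means = {"NOR": (1.0, 1.0), "nucleus": (5.0, 5.0)}
label = classify_pixel((1.2, 0.9), means, max_distance=1.0)
```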

Truchetet, Laligant, Bourennane, and Miteran [6] present an approach that uses a

statistical method to determine the class (edge vs. non-edge) of points in an image.

Directional multi-scale gradient information of known edge points in an image creates a

scale space which is characterized by a statistical classifier. Each training edge point is a

feature vector in an N-dimensional space (N being the number of scales used). Truchetet

fits hyperrectangles to the edge points in the scale space. After training, points in a new


image can be classified by the system as edge or non-edge.
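A minimal sketch of a hyperrectangle classifier is given below. It is a simplification: it fits one axis-aligned bounding box to all training edge vectors, whereas the method of [6] fits hyperrectangles (plural) in a way not detailed here. The function names and sample values are hypothetical.

```python
def fit_hyperrectangle(edge_vectors):
    """Return (lows, highs): the per-scale minimum and maximum of the
    training edge vectors, i.e. one axis-aligned bounding box."""
    columns = list(zip(*edge_vectors))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return lows, highs


def is_edge(vector, box):
    """A point is classified as edge if it lies inside the box on every scale."""
    lows, highs = box
    return all(lo <= v <= hi for v, lo, hi in zip(vector, lows, highs))


# Hypothetical training edge points, one gradient value per scale (N = 3).
training = [(0.8, 0.6, 0.5), (1.0, 0.9, 0.7), (0.9, 0.7, 0.6)]
box = fit_hyperrectangle(training)
```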

Pereira and Manolakos use a similar approach in that they classify pixels in an image

as edge or non-edge using scale-space information, but their method uses both a different

wavelet transform and a different classification method. They begin by applying the

wavelet transform with one of Daubechies' wavelets to create a scale space for a simple

synthetic training image. A hierarchical feed-forward neural network is trained with the

scale-space information of only the edge points of this training image. After training, the

network is ready to classify points in a new image as edge or non-edge points. Analysis of

their system includes application to two real images and an investigation of the effects of

noise in images being analyzed.

Haring, Viergever, and Kok [59] propose a similar method to Truchetet but they use

a classifier to create a segmentation rather than an edge detection. They analyze

multi-scale differential geometrical invariant information using a Kohonen network. They apply

several differential geometrical invariants to Gaussian smoothed images to create various

scale spaces. These scale spaces are used to create feature vectors for each pixel in an

image. Such feature vectors from a simple synthetic image are used to train the Kohonen

network. After training the network they segment synthetic and real images.

These examples demonstrate the breadth of methods that researchers have developed

based on the idea of multi-scale fusion for segmentation. Our work falls into the second

school of thought and is most closely related to the work of Truchetet et al. as well as

Haring et al. One main difference between these works and our work is that we use a

fuzzy classifier rather than a crisp classifier to determine the degree of membership in our

feature class. Haring et al. classify pixels into particular segment classes and Truchetet

et al. classify points as edges or non-edges.


CHAPTER 3

Multi-Scale Feature Extraction

The wavelet transform creates a set of multi-scale derivatives for an image. We com-

bine this derivative information to create two feature-detection operators: the step-edge

detector and the crease-edge detector. In this chapter we describe how these two operators

are derived from the wavelet transform. We introduce the watershed algorithm as a tool

for obtaining segmentation from these two edge-detection operators. We describe some

observations of the multi-scale properties for these operators. We also present a simple

method for combining multi-scale edge information. This method is based on fuzzy fu-

sion. We analyze the behavior of this multi-scale fusion method and give motivations for

developing a more sophisticated method to combine multi-scale information.

There are two basic schemes for representing the wavelet transform of a signal: the

pyramidal scheme [60] and the convolution scheme [10]. Most wavelet applications use the

pyramidal implementation of the wavelet transform, which involves subsampling at each

scale. We use the convolution implementation, which involves no subsampling of data.

The convolution implementation creates a group of signals (one signal representing each

scale) that are all the same size (figure 3.1).

[Figure 3.1 image: the Pyramidal Implementation and the Convolution Implementation, drawn with increasing scale.]

Figure 3.1: Representations of the Wavelet Transform.

The convolution representation of the wavelet transform of a signal f(x) is defined by

equation 3.1, where $*$ is the convolution operator and $\psi_s(x)$ is the chosen wavelet function

at a particular scale s. We use the quadratic-spline wavelet [10] with equation 3.1 to create

a scale space of first-order difference information. We choose the quadratic-spline wavelet

because it has the following desirable properties: compact support, symmetry,

differentiability (to a finite degree), and one zero crossing (see appendix A for more details). We

combine the scale spaces from this wavelet transform to create edge-detection operators.

$$ W_s[f(x)] = f(x) * \psi_s(x) \tag{3.1} $$

Equation 3.1 can be implemented by convolving a signal with a scaling function and

then applying a derivative operator [10]. A scaling function, $\phi_s(x)$, is a smoothing function

that meets the criteria

$$ \int_{-\infty}^{\infty} \phi(x)\,dx = 1 \tag{3.2} $$

and

$$ \exists\, \epsilon > 0 : \phi(x) = 0, \ \forall\, |x| > \epsilon. \tag{3.3} $$

For the quadratic-spline wavelet, the smoothing function is the cubic-spline function [10].

The quadratic-spline function is the derivative of the cubic-spline function. A signal f(x) is

smoothed to scale s by convolution with $\phi_s(x)$. The wavelet transform of a one-dimensional

signal f then becomes

$$ W_s[f(x)] = \frac{\partial (f * \phi_s)(x)}{\partial x} \tag{3.4} $$

We extend this implementation of the wavelet transform to two-dimensional signals.

Consider the signal f(i, j) to be an image. The indices of the image are i, the vertical

index, and j, the horizontal index. A two-dimensional scaling function $\phi(i, j)$ is employed

to find the wavelet transform. To create first-order derivative approximations, the scaling

function is applied to the image at scale s and then the derivative operator is applied to the

signal in the desired direction. Equation 3.5 shows application of the wavelet transform

for the two-dimensional case.

$$ W_s^i[f(i,j)] = \frac{\partial (f * \phi_s)(i,j)}{\partial i}, \qquad W_s^j[f(i,j)] = \frac{\partial (f * \phi_s)(i,j)}{\partial j} \tag{3.5} $$

Applying the wavelet transform can also create second-order derivative approxima-

tions. We apply the same cubic-spline scaling function at scale s and then apply the

proper partial derivative operator. Equation 3.6 shows second-order partial derivative

approximations obtained using the wavelet transform. Equations 3.5 and 3.6 are used to

create step-edge and crease-edge operators.

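$$ W_s^{ii}[f(i,j)] = \frac{\partial^2 (f * \phi_s)(i,j)}{\partial i^2}, \qquad W_s^{jj}[f(i,j)] = \frac{\partial^2 (f * \phi_s)(i,j)}{\partial j^2}, $$

$$ W_s^{ij}[f(i,j)] = W_s^{ji}[f(i,j)] = \frac{\partial^2 (f * \phi_s)(i,j)}{\partial i\,\partial j} \tag{3.6} $$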

We want to create a scale space of wavelet-transform information. Instead of creating

smoothing functions for each scale and convolving the smoothing functions with an image

we create a scale space using a recursive method [60]. The u filter is a smoothing kernel

and the v filter is a difference operator.

Applying these filters yields a scale space where the scale parameter is discrete. We

choose not to sample the scale space uniformly. A typical sampling that reduces the

information at larger scales in a natural manner is dyadic sampling [10, 60]. In dyadic

sampling, scale varies along the dyadic sequence $s = (2^y)_{y \in \mathbb{N}}$, where s is scale [10]. We

define scale one as application of v directly to the original image.

Figure 3.2 illustrates the application of these two filters to create a dyadic sampling

of the wavelet-transform scale space. This figure shows the creation of the scale space of

$W_s^i[f(i,j)]$. Applying v(j) in place of v(i) yields $W_s^j[f(i,j)]$. Applying v(i) twice rather

than once yields $W_s^{ii}[f(i,j)]$. All the wavelet transforms in equation 3.6 can be created

by changing the application of v.

[Figure 3.2 image: block diagram in which the image f(i,j) passes through blocks labeled 2 u(i,j), 2 u(i,j), 4 u(i,j), and 8 u(i,j) (N convolutions with filter u), with v(i) applied to produce the wavelet-transform outputs at scales 1, 2, 4, 8, and 16.]

Figure 3.2: Implementation of the Wavelet Transform.
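The recursive construction can be illustrated with a one-dimensional sketch. Everything specific here is an assumption made for illustration: the kernel `U` is a simple three-tap smoother standing in for the u filter, `difference` is a forward difference standing in for v, borders are handled by replication, and the reading that scale s corresponds to s accumulated smoothing passes is inferred from the block labels of figure 3.2 rather than stated in the text.

```python
U = [0.25, 0.5, 0.25]  # assumed smoothing kernel, not the thesis's u filter


def smooth(signal):
    """One pass of the smoothing kernel with replicated borders."""
    n = len(signal)
    return [
        sum(w * signal[min(max(i + k - 1, 0), n - 1)] for k, w in enumerate(U))
        for i in range(n)
    ]


def difference(signal):
    """Forward difference (stand-in for v); the last sample yields 0."""
    n = len(signal)
    return [signal[min(i + 1, n - 1)] - signal[i] for i in range(n)]


def dyadic_scale_space(signal, max_scale):
    """Return {scale: difference of the smoothed signal} for scales 1, 2, 4, ...

    Scale 1 applies the difference directly to the input; each later scale
    reuses the previous smoothing result and adds passes until `scale`
    passes have been applied in total.
    """
    result = {1: difference(signal)}
    smoothed, applied, scale = list(signal), 0, 2
    while scale <= max_scale:
        while applied < scale:
            smoothed = smooth(smoothed)
            applied += 1
        result[scale] = difference(smoothed)
        scale *= 2
    return result


step = [0.0] * 8 + [1.0] * 8
scale_space = dyadic_scale_space(step, 8)
```

All outputs have the same length as the input, which is the property of the convolution implementation that the rest of the chapter relies on.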

3.1 Step-Edge Detection

We define step edges in a range image as discontinuities in range. As shown in figure

1.1 a step edge in a 1-D signal can be detected using the described wavelet transform.

The result of the application of the wavelet transform in both the vertical and horizontal

directions is combined to create a step-edge detector (equation 3.7). Application of the

step-edge operator renders an image in which the extrema represent step-edge contours

(figure 3.3). We say that the image resulting from application of the step-edge operator

at scale s is the fuzzy edge map $g_s$:

$$ |\nabla f| = \left[ \left( W_s^i[f] \right)^2 + \left( W_s^j[f] \right)^2 \right]^{1/2}. \tag{3.7} $$

Figure 3.3: A synthetic image containing step edges: (a) a synthetic 2-D signal (i.e. image) containing step edges, (b) $g_2$ of (a).
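Equation 3.7 amounts to a per-pixel gradient magnitude. A minimal sketch, assuming the two directional transforms are already available as equally sized lists of rows (the function name `step_edge_map` is hypothetical):

```python
import math


def step_edge_map(w_i, w_j):
    """Combine the vertical (w_i) and horizontal (w_j) transform images
    into a gradient-magnitude image, pixel by pixel."""
    return [
        [math.hypot(a, b) for a, b in zip(row_i, row_j)]
        for row_i, row_j in zip(w_i, w_j)
    ]


# Hypothetical 1x2 transform images.
g = step_edge_map([[3.0, 0.0]], [[4.0, 0.0]])
```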

3.2 Crease-Edge Detection

We define crease edges in a range image as discontinuities in surface normal of range

[61]. The 1-D wavelet transform is applied to the image several times to find the gradient of

the surface normal, GSN. Assuming we have 3-D data as vectors in a Cartesian coordinate

system, we can calculate the GSN.

A three-dimensional position vector is

$$ \vec{p} = [\, x(i,j) \quad y(i,j) \quad z(i,j) \,] \tag{3.8} $$

Each point in a range image represents a vector in 3-D space. Some processing may be

required to extract these vectors from a range image [62]. We can think of all the vectors

for a single range image as three images, one image for each component of the vector.

From these three images we can calculate the GSN.

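An approximation to a surface normal vector at a point (i, j) in an image f is

$$ \vec{N}(i,j) = [\, N_x(i,j) \quad N_y(i,j) \quad N_z(i,j) \,] \tag{3.9} $$

where

$$ N_x(i,j) = \frac{n_x(i,j)}{\sqrt{n_x(i,j)^2 + n_y(i,j)^2 + n_z(i,j)^2}} \tag{3.10} $$

$$ N_y(i,j) = \frac{n_y(i,j)}{\sqrt{n_x(i,j)^2 + n_y(i,j)^2 + n_z(i,j)^2}} \tag{3.11} $$

$$ N_z(i,j) = \frac{n_z(i,j)}{\sqrt{n_x(i,j)^2 + n_y(i,j)^2 + n_z(i,j)^2}} \tag{3.12} $$

and

$$ n_x(i,j) = W_s^i[y(i,j)]\, W_s^j[z(i,j)] - W_s^i[z(i,j)]\, W_s^j[y(i,j)] \tag{3.13} $$

$$ n_y(i,j) = W_s^i[z(i,j)]\, W_s^j[x(i,j)] - W_s^i[x(i,j)]\, W_s^j[z(i,j)] \tag{3.14} $$

$$ n_z(i,j) = W_s^i[x(i,j)]\, W_s^j[y(i,j)] - W_s^i[y(i,j)]\, W_s^j[x(i,j)] \tag{3.15} $$

The magnitude of the gradient of a surface normal vector is

$$ \|\nabla \vec{N}\| = \left(\frac{\partial N_x}{\partial x}\right)^2 + \left(\frac{\partial N_x}{\partial y}\right)^2 + \left(\frac{\partial N_y}{\partial x}\right)^2 + \left(\frac{\partial N_y}{\partial y}\right)^2 + \left(\frac{\partial N_z}{\partial x}\right)^2 + \left(\frac{\partial N_z}{\partial y}\right)^2 \tag{3.16} $$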

This is our crease-edge operator. Application of this operator for an image at scale s

creates a fuzzy edge map hs. Figure 3.4 shows hs for a simple synthetic image.
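The surface-normal step of the crease-edge operator (equations 3.9 to 3.15) is a normalized cross product of the two directional derivative vectors of the position. The sketch below assumes these derivatives are already computed for one pixel; the function name `unit_normal` is hypothetical, and a zero-length cross product (degenerate data) is not handled.

```python
import math


def unit_normal(dp_di, dp_dj):
    """Unit surface normal from the derivatives of the position vector.

    dp_di and dp_dj are (x, y, z) derivative triples along i and j,
    e.g. the wavelet transforms of the x, y and z images at one pixel.
    """
    (xi, yi, zi), (xj, yj, zj) = dp_di, dp_dj
    nx = yi * zj - zi * yj  # equation 3.13
    ny = zi * xj - xi * zj  # equation 3.14
    nz = xi * yj - yi * xj  # equation 3.15
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / norm, ny / norm, nz / norm)


# A flat patch whose tangents lie along the x and y axes.
normal = unit_normal((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Applying the difference operator to each component image of the normal and summing the squares, as in equation 3.16, would then give the crease-edge response.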

3.3 Segmentation from Feature Extraction

Our goal is image segmentation, and up to this point we have discussed only feature

detection. We need a method to create a segmentation from our edge detection results.

The step-edge and crease-edge operators both perform fuzzy feature extraction: the

operators do not give a crisp (or binary) response; they render a response proportional to the

magnitude of a feature (i.e. discontinuity of surface geometry). We refer to any image

consisting of fuzzy edge information as a fuzzy edge map.

Figure 3.4: A synthetic image containing crease edges: (a) a synthetic image containing crease edges, (b) $h_1$ of (a).

Conventional thresholding creates a binary edge map from a fuzzy edge map by thresh-

olding the fuzzy values. Segmentation can be derived from the binary edge map by link-

ing edges in this binary edge map and then labeling areas enclosed by edges as regions.

Thresholding a fuzzy feature map eliminates responses to a feature detector that are below

a certain value. In the case of edge detection, edges of low discontinuity are eliminated.

Edges of low discontinuity can represent important features in an image and should not

necessarily be eliminated in a final segmentation. We choose a segmentation algorithm

that finds regions bounded by local maxima in a fuzzy edge map.

We apply a morphological watershed algorithm to a fuzzy edge map to create a

segmentation [63]. We use the watershed algorithm described in [62]; here we give a short

synopsis of this algorithm. If a symbolic drop of water is placed at a local maximum in a fuzzy

edge map it will drain down to a regional minimum. All the pixels in the path of this drop

can be associated with the respective local minimum. The watershed algorithm performs

this analysis for every regional maximum in the image, thus forming catchment basins that

identify distinct regions having different labels (figure 3.5). So when applied to a fuzzy

edge map the watershed will form distinct regions bounded by areas of relatively high

[Figure 3.5 image: a catchment basin (labeled region) bounded by high feature values.]

Figure 3.5: Region formed from catchment basin.

feature values, i.e. a segmentation.
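The drop-of-water description can be illustrated with a toy one-dimensional version. This is not the algorithm of [62]: it simply lets every sample follow the steepest descent to a local minimum and labels samples by the minimum they reach. Plateaus are treated as minima, and the function names and the example edge map are hypothetical.

```python
def drain(edge_map, start):
    """Follow the steepest descent from `start`; return the index of the
    local minimum where the drop comes to rest."""
    pos = start
    while True:
        neighbors = [p for p in (pos - 1, pos + 1) if 0 <= p < len(edge_map)]
        if not neighbors:
            return pos
        lowest = min(neighbors, key=lambda p: edge_map[p])
        if edge_map[lowest] >= edge_map[pos]:
            return pos  # no strictly lower neighbor: this is a minimum
        pos = lowest


def label_basins(edge_map):
    """Give every sample the label of the minimum it drains to."""
    minima = {}
    labels = []
    for i in range(len(edge_map)):
        m = drain(edge_map, i)
        labels.append(minima.setdefault(m, len(minima)))
    return labels


# Hypothetical 1-D fuzzy edge map with peaks at indices 2 and 6.
fuzzy_edges = [0.1, 0.3, 0.9, 0.4, 0.2, 0.5, 0.8, 0.3]
basins = label_basins(fuzzy_edges)
```

The region boundaries fall next to the high values of the edge map, which is the behavior the text describes for the two-dimensional case.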

3.4 Fuzzy Reasoning Approach to Scale-Space Fusion

This section begins with a discussion of scale space and edge-detection operators. Using

a synthetic image containing only step edges we look at the scale space of the step-edge

detector. One possible method for combining the multi-scale data is fuzzy fusion. This

section elaborates on this idea, gives a simple experiment, and then discusses attributes of

the fuzzy fusion scheme.

To analyze the step-edge operator we create the step image, and we show which points

in this image should be detected as edges (figure 3.6). This image contains 196 blocks

created using a random number generator (uniform distribution). The minimum block

magnitude is 1 and the maximum magnitude is 255. The blocks placed adjacent to one

another create different step magnitudes in the image. We add zero mean Gaussian noise

(standard deviation=10) to the step image to create the step10 image (figure 3.6c). In

all synthetic images that have no noise and are piecewise flat, the gradient magnitude is

a "perfect" step-edge detector. This could be implemented using finite differences or in

our case the smallest scale of the wavelet transform. The gradient magnitude of the step

image shows the points that should be detected as edges (figure 3.6b).

Figure 3.6: The step and step10 images: (a) the step image (size 256x256), (b) image indicating edge points used to create scale-space projections from the step image (black points represent edge points and white points represent non-edge points), (c) the step10 image containing added Gaussian noise ($\sigma$ = 10), and (d) $g_1$ of (c).

Figure 3.7 shows the scale space of the step-edge operator for the step10 image. In

this figure black represents the highest response to the step-edge operator and white values

represent the lowest response to the step-edge operator.

Figure 3.8 shows the result of application of the watershed algorithm to the scale space

of the step10 image. Figures 3.7 and 3.8 reveal three attributes of the scale space for the

step-edge operator:

• Edges are less accurately detected as scale increases.

• Noise is reduced as scale increases.

• Scales greater than some value seem to present no new useful information.

Scale one in this case contains too much noise to derive a good segmentation, but

examine the second scale: the noise is reduced, yet most edges are present. The largest

scales render bad segmentations because of very low accuracy in the detection of edges.

One researcher has proposed that the more salient an edge, the longer it survives

in scale space [52]. For example, a significant edge will produce a response at a greater

number of scales than an edge created from noise. Noise survives within a small region in

the scale space (at low scales), whereas significant edge features are usually present at a

great number of scales. If a scheme could be employed to choose edges that survive well in

the scale space an improved edge detection could be achieved.

Because the edge maps contain fuzzy values, fuzzy set theory can be applied [64]. Fuzzy

set theory is a logical paradigm within which to develop a scale-space combination scheme.

A fuzzy set union could create the desired fused-scale edge map. We use Bernoulli's rule

of combination [65] to fuse scale space because it has no free parameters and it allows

the combination of more than two values (by recursive application). Bernoulli's rule of

combination gives the union of two values $a \in [0, 1]$ and $b \in [0, 1]$ (equation 3.17).

$$ 1 - (1 - a)(1 - b) \tag{3.17} $$
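As a short worked step, recursive application of equation 3.17 to three values collapses to a single product, and the same holds for any number of values. The symbols $c$ and $a_1, \dots, a_n$ are introduced here only for illustration.

```latex
% Fuse a and b first, then fuse the result with c:
1 - \bigl(1 - [\,1 - (1 - a)(1 - b)\,]\bigr)(1 - c)
  = 1 - (1 - a)(1 - b)(1 - c)

% By induction, for n values a_1, ..., a_n in [0, 1]:
1 - \prod_{k=1}^{n} (1 - a_k)
```

The result does not depend on the order in which the scales are fused, and it stays in the interval [0, 1].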

Figure 3.7: Scale space of the step-edge operator applied to the step10 image. The scale values are (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, (f) 32, (g) 64, and (h) 128. In these figures black represents the highest and white the lowest response to the step-edge operator. Gray values between white and black represent values that are between the lowest and highest responses respectively.

Figure 3.8: Watershed algorithm applied to step-edge operator images from the step10 image. The scale values are (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, (f) 32, (g) 64, and (h) 128.

We can recursively fuse the scale space in figure 3.7 using Bernoulli's rule of combination.

In order to perform Bernoulli's rule of combination we need edge maps that are fuzzy

and contain values on the interval [0,1]. Our edge maps at this point are fuzzy, but are

not bounded. So to perform Bernoulli's rule of combination we must first remap the edge

maps to the interval [0,1]. A simple way to map the data to the [0,1] interval is to perform

a linear remapping. Consider $f_i$ to be the initial fuzzy edge map, $f_{min}$ the minimum of

all the values in $f_i$, and $f_{max}$ the maximum of all the values in $f_i$. We apply equation

3.18 to obtain an edge map of values on the interval [0,1], $f_o$.

$$ f_o = \frac{f_i - f_{min}}{f_{max} - f_{min}} \tag{3.18} $$
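The remapping of equation 3.18 followed by recursive fusion with equation 3.17 can be sketched as below. Edge maps are represented as flat lists of pixel values to keep the example short; the function names are hypothetical, and a constant edge map (where the maximum equals the minimum) is not handled.

```python
def remap(edge_map):
    """Linearly remap values to the interval [0, 1] (equation 3.18)."""
    lo, hi = min(edge_map), max(edge_map)
    return [(v - lo) / (hi - lo) for v in edge_map]


def bernoulli_union(a, b):
    """Bernoulli's rule of combination for a, b in [0, 1] (equation 3.17)."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def fuse_scales(edge_maps):
    """Remap each scale, then fuse the scales recursively, pixel by pixel."""
    fused = remap(edge_maps[0])
    for edge_map in edge_maps[1:]:
        fused = [bernoulli_union(a, b) for a, b in zip(fused, remap(edge_map))]
    return fused


# Two hypothetical three-pixel edge maps at different scales.
fused = fuse_scales([[0.0, 1.0, 2.0], [0.0, 2.0, 4.0]])
```

Because each map is remapped independently, every scale contributes on the same [0, 1] footing regardless of its original range.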

We examine the results of scale-space fusion using Bernoulli's rule of combination for

the step10 image. After remapping each scale we apply Bernoulli's rule of combination

to the scale space of the step10 image. Figure 3.9 shows the fused edge maps. We apply

the watershed algorithm to the edge maps to create segmentations (figure 3.10).

The results in figure 3.10 show that the fused scale space has similar attributes to the

non-fused scale space:

• Edges are less accurately detected as scale increases.

• Noise is reduced as scale increases.

• Scales greater than some value seem to present no new useful information.

The difference between the fused scale space and the non-fused scale space is that the fused

scale space appears to retain more small-scale information as scale increases. This seems

reasonable because one scale in the fused scale space represents the fusion of that particular

scale and all lower scales of the non-fused space. The fused scale space demonstrates more

accurate detection of edges as scale increases than the non-fused space. The fused space

also seems to retain more noise as scale increases as compared to the non-fused scale space.

Figure 3.9: Fuzzy fusion (Bernoulli's Rule of Combination) applied to scale space of step10 image. Result for fusion of scales: (a) 1 and 2, (b) 1 to 4, (c) 1 to 8, (d) 1 to 16, (e) 1 to 32, (f) 1 to 64, and (g) 1 to 128.

Figure 3.10: Watershed algorithm applied to fuzzy fusion (Bernoulli's Rule of Combination) result for step10 image. The scales fused are (a) 1 and 2, (b) 1 to 4, (c) 1 to 8, (d) 1 to 16, (e) 1 to 32, (f) 1 to 64, and (g) 1 to 128.

For example we compare scale eight of the non-fused space (figure 3.7d) to the fused result

of scales one to eight (figure 3.9c). Scales above scale two of the non-fused scale space

(figure 3.8b) appear to present no new useful information, whereas scales above scale

eight of the fused scale space (figure 3.10c) appear to present no new useful information.

This also indicates that the fused scale space retains more small-scale information than

the non-fused scale space.

This scale-space fusion scheme can be applied to real data. We process and examine

the scale space of the real hydro3 image in the same way we processed and examined

the step10 image. The hydro3 data is acquired using a Perceptron laser range scanner

[66, 67]. Figure 3.11 shows the range image. Figure 3.12 shows the scale space created by

applying the step-edge operator to this image. We apply the watershed algorithm to each

edge map in figure 3.12 to yield a segmentation for each scale (figure 3.13).

The results of the step-edge operator followed by the watershed are clearly different

for real data. The real data presents a more challenging case for edge detection. The

noise present in this real data is greater than the noise of the step10 image and does

not diminish so readily with increasing scale. Smaller features are present in this image.

These small features are smoothed away at higher scales, so preservation of these features

while reducing noise makes edge detection challenging for this image.

When applying the fusion scheme to the scale space of the hydro3 image (figures

3.14 and 3.15), we find that, as with the step10 image, the fused scale space has similar

attributes to the non-fused scale space. These attributes are that edges are less accurately

detected as scale increases, noise is reduced as scale increases, and scales greater than some

value seem to present no new useful information. In addition, as with the step10 image,

the main difference between the non-fused scale space and the fused scale space is that the fused

scale space appears to retain more small-scale information as scale increases. Because the

hydro3 image contains noise of greater magnitude, noise (a small-scale feature) diminishes

Figure 3.11: The hydro3 image (size 256x256).

Figure 3.12: Scale space of the step-edge operator applied to the hydro3 image. The scale values are (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, (f) 32, (g) 64, and (h) 128.

Figure 3.13: Watershed algorithm applied to step-edge operator images from the hydro3 image. The scale values are (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, (f) 32, (g) 64, and (h) 128.

Figure 3.14: Fuzzy fusion (Bernoulli's Rule of Combination) applied to scale space of hydro3 image. Result for fusion of scales: (a) 1 and 2, (b) 1 to 4, (c) 1 to 8, (d) 1 to 16, (e) 1 to 32, (f) 1 to 64, and (g) 1 to 128.

Figure 3.15: Watershed algorithm applied to fuzzy fusion (Bernoulli's Rule of Combination) result for hydro3 image. The scales fused are (a) 1 and 2, (b) 1 to 4, (c) 1 to 8, (d) 1 to 16, (e) 1 to 32, (f) 1 to 64, and (g) 1 to 128.

less quickly as scale increases in the fused scale space. Figure 3.13e contains fewer segments

created by noise than does figure 3.15d.

The fused scale space for the hydro3 image shows some improvement over the non-

fused scale space, but the fusion scheme weights all scales equally. This means that a

feature that is detected at only one or a few scales may not be present in the fused edge

map. If scale space can be combined in a way that gives preference to scales containing

significant edge information, this combination may have advantages over a simple

fuzzy-fusion method.

In the next chapter we further explore the scale spaces for the step-edge and crease-edge

detectors that result from application of the wavelet transform. This exploration results

in a segmentation method that combines scales in a way that places greater emphasis on

scales that contain significant edge information. We develop these ideas into a somewhat

more sophisticated scale-space combination scheme.


CHAPTER 4

Scale-Space Combination Algorithm

In this chapter we analyze scale-space data from both the step-edge and crease-edge

operators. We present our analysis in a way that motivates the strategy behind the

development of a segmentation system. The result is the Pattern Analysis of Scale Space

for Extraction of Features (PASSEF) system. PASSEF is centered around the following

three ideas:

• We combine scale-space information derived from the wavelet transform to segment

range images.

• We train a statistical pattern-recognition system with points from a training image;

this system can then determine the degree of edgeness (or non-edgeness) for points

in a range image, thus creating a fuzzy edge map.

• Two fuzzy edge maps, representing step edges and crease edges, are combined to

create a comprehensive edge detection.

This chapter describes in detail the development of these three ideas. We first examine

and analyze scale-space information for two edge-detection operators, g_s and h_s, derived

from the wavelet transform. Through analysis we show the motivation for choosing

statistical pattern recognition to derive edge detection from scale-space data. We then

discuss the overall PASSEF system architecture and the specifics of the

system. The fusion of the crease-edge and step-edge maps created by the PASSEF system,

P_h and P_g, is explained at the end of the chapter.


4.1 Analysis of the Scale Space of the Step-Edge Operator

We want to detect step edges from scale-space data of the step-edge operator. We

define a scale-space signature as the vector of measurements at different scales taken at a

single point, (i, j), in the image (figure 4.1). This is similar to the work of [68].

Figure 4.1: Creation of a scale-space signature from scale-space data. (Diagram: the block image passes through the wavelet transform to give a scale space; the scale signature at one point is plotted as magnitude versus scale.)
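The construction of a scale-space signature can be sketched in a few lines. This is only an illustration under stated assumptions: a Gaussian-smoothed gradient magnitude stands in for the wavelet-based step-edge operator of chapter 3, and the function names (`scale_space`, `signature`) are hypothetical, not part of the PASSEF implementation.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at three standard deviations."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
                 for k in range(-radius, radius + 1))
             for x in range(width)]
            for row in image]

def transpose(image):
    return [list(column) for column in zip(*image)]

def smooth(image, sigma):
    """Separable Gaussian smoothing: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(smooth_rows(transpose(smooth_rows(image, kernel)), kernel))

def gradient_magnitude(image):
    """Central-difference gradient magnitude with clamped borders."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            dx = image[i][min(j + 1, width - 1)] - image[i][max(j - 1, 0)]
            dy = image[min(i + 1, height - 1)][j] - image[max(i - 1, 0)][j]
            out[i][j] = math.hypot(dx, dy) / 2.0
    return out

def scale_space(image, scales):
    """One edge-response image per scale (stand-in for the wavelet transform)."""
    return [gradient_magnitude(smooth(image, float(s))) for s in scales]

def signature(space, i, j):
    """Scale-space signature: the responses at point (i, j) across all scales."""
    return [level[i][j] for level in space]
```

For a step image, the signature of a point on the step is large at the fine scales, while the signature of a point far from the step starts near zero.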

First we examine these scale-space signatures for step edges. A synthetic image is

created that contains a square area of magnitude 1 and a background area of magnitude

0. Gaussian noise of standard deviation equal to 0.1 is added to create the block image

(figure 4.2a).

In order to examine the scale-space signatures of the edge points, we would like to

identify the edge points of the block image (figure 4.2b) in the same way we identified

the edge points of the step10 image (gradient magnitude of the image without noise).

Ideally we would like to distinguish light and dark points in this image based on their

respective scale-space signatures. Figure 4.3 shows that the signatures of the edge points

appear self-similar and distinguishable from the signatures of non-edge points.

Figure 4.2: The block image: (a) The block image, size 25x25, with added Gaussian noise (σ = 0.1) and (b) first scale of wavelet transform of simple block image without noise.

Figure 4.3: Scale-space signatures of each pixel in the block image with added Gaussian noise (σ = 0.1).


This suggests that scale-space signatures can be used to determine edgeness; how to

separate edge point signatures and non-edge point signatures remains an issue. Examining

the collective scale-space of this operator will help us determine what type of method

should be used to find edge points in this scale space. Examining the collective scale space

means considering the scale-space signatures as vectors in an n dimensional space where

n is the number of scales used to create a particular scale-space signature. Therefore we

want to examine or get an idea of how points (edge and non-edge) are arranged in this

space. Figure 4.4 shows how a collective scale-space is created for only edge points in an

image.

Figure 4.4: Creation of a collective scale space from edge points in the step image. (Diagram: the step image passes through the wavelet transform to give a scale space; scale-signature creation at the points marked in the edge truth image yields one vector per edge point, where E^s_m denotes edge point number m at scale s.)

We examine the collective scale space of the synthetic step10 image (figure 3.6c with

edge points indicated in figure 3.6b). Figure 4.5a shows a 2-D projection of scale-space

data for edge points in the step10 image. Figure 4.5b shows a 2-D projection of the edge

points and non-edge points in the step image. These scale-space projections are created

by projecting the data onto the two dimensional subspace spanned by the eigenvectors

associated with the largest eigenvalues.
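The projection described above is a principal-component projection. The following sketch, which is only an illustration, computes it with the standard library: the covariance matrix of the signatures is formed, its two leading eigenvectors are estimated by power iteration with deflation (a technique chosen here for brevity, not one named in the thesis), and each centered signature is projected onto them. The function names are hypothetical.

```python
import math

def mean_vector(vectors):
    count = len(vectors)
    return [sum(v[d] for v in vectors) / count for d in range(len(vectors[0]))]

def covariance_matrix(vectors, mean):
    """Sample covariance (divides by count - 1)."""
    dim, count = len(mean), len(vectors)
    cov = [[0.0] * dim for _ in range(dim)]
    for v in vectors:
        centered = [v[d] - mean[d] for d in range(dim)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += centered[a] * centered[b] / (count - 1)
    return cov

def dominant_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    dim = len(matrix)
    vec = [1.0 / (d + 1) for d in range(dim)]  # arbitrary non-symmetric start
    for _ in range(iterations):
        product = [sum(matrix[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break  # no variance left along this direction
        vec = [x / norm for x in product]
    product = [sum(matrix[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
    value = sum(x * y for x, y in zip(vec, product))  # Rayleigh quotient
    return value, vec

def project_2d(vectors):
    """Project signatures onto the two eigenvectors with the largest eigenvalues."""
    mean = mean_vector(vectors)
    cov = covariance_matrix(vectors, mean)
    dim = len(mean)
    value1, axis1 = dominant_eigenpair(cov)
    # Deflation: remove the first component so the second one becomes dominant.
    deflated = [[cov[a][b] - value1 * axis1[a] * axis1[b] for b in range(dim)]
                for a in range(dim)]
    value2, axis2 = dominant_eigenpair(deflated)
    projected = []
    for v in vectors:
        centered = [x - m for x, m in zip(v, mean)]
        projected.append((sum(c * a for c, a in zip(centered, axis1)),
                          sum(c * a for c, a in zip(centered, axis2))))
    return projected, (value1, value2)
```

Plotting the returned 2-D points for edge and non-edge signatures gives a view comparable to a scatter plot of the collective scale space, with the usual caveat that two dimensions cannot show all of the structure.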

Figure 4.5: 2-D scale-space projections from the step10 image: (a) 2-D scale-space projection of the space of edge points from the step10 image and (b) 2-D scale-space projection of the space of edge points (gray) and non-edge points (black) from the step10 image.

4.2 Analysis of the Scale Space of the Crease-Edge Operator

We can examine the scale-space of the GSN operator in the same way that we examined

the scale-space of the step-edge operator. For this analysis we use a simple pyramid image

(figure 4.6a). This simple image contains only crease edges. The first scale of the GSN

operator of the pyramid without noise (figure 4.6b) indicates the edge points of the

image. Most of the edge points in this image appear to have scale-space signatures that

are distinguishable from the signatures of non-edge points (figure 4.7).

We also examine the collective scale space of the GSN operator. We create a synthetic

8-sided-cone image (figure 4.8a) which contains only crease edges. Figure 4.9 shows

two 2-D projections for scale-space data from the 8-sided-cone image. These scale-space

projections are created by projecting the data onto the two dimensional subspace spanned

by the eigenvectors associated with the largest eigenvalues.

Figure 4.6: The pyramid image: (a) The pyramid image, size 25x25, with added Gaussian noise (σ = 0.2) and (b) gradient of the surface normal of the simple pyramid image without noise.

Figure 4.7: Scale-space signatures of each pixel in the simple pyramid image with added Gaussian noise (σ = 0.1).

Figure 4.8: Synthetic 8-sided-cone image: (a) Synthetic 8-sided-cone image with added Gaussian noise (σ = 2) and (b) image indicating edge points used to create scale-space projections from the 8-sided-cone image.

Figure 4.9: 2-D scale-space projections from the 8-sided-cone image: (a) 2-D scale-space projection of the space of edge points from the 8-sided-cone image with added noise and (b) 2-D scale-space projection of the space of edge points (gray) and non-edge points (black) from the 8-sided-cone image with added noise.


4.3 Motivation for Using Pattern-Recognition

The different locations of feature and non-feature points indicated by figures 4.5b and

4.9b suggest that features could be detected in this space by a pattern-classification system.

We propose to analyze the 1-D scale-space signatures with a supervised pattern-recognition

system (figure 4.10). We choose a pattern-recognition approach because the system

• Can be trained to detect objects at all scales in the presence of noise.

• Has few free parameters.

• Can be trained to detect features in images from different acquisition devices.

In this section we examine the criteria affecting the potential success of a pattern-recognition

system for this application. We discuss the likelihood of these criteria being met.

Finally we propose the PASSEF system.

Figure 4.10: Pattern recognition to determine edgeness of scale-space signatures. (Diagram: scale-space signatures enter the pattern-recognition system, which outputs a fuzzy edgeness value for each signature; the example values shown are 95%, 20%, 75%, and 15%.)

There are two primary criteria that affect the potential success of a pattern-recognition

system in this application:

1. The class of edge points and the class of points that are not edges in the image

must have scale-space signature differences that allow them to be separated by the

pattern-recognition system.

2. Signatures from the training data set must closely resemble data from the image to

be segmented.

We address the potential of these two criteria being met by examining synthetic data.

The class of edge points and class of non-edge points for the step10 and 8-sided-cone

images (figures 4.5b and 4.9b respectively) appear mostly separated. However we cannot

know for sure if criterion one will be met because it is difficult to visualize the

high-dimensional space from 2-D projections. The second criterion is also difficult to assess.

We do know of one aspect of the scale space that could prevent the second criterion

from being met. In order for the training data to match the data being analyzed, the

magnitudes of the scale-space signatures for the two data sets must match. This is a

problem because there are an infinite number of magnitudes for the case of the step-edge

operator and quite a large range of values for the GSN operator. We can alleviate this

problem by normalizing the scale-space signatures so that the pattern-recognition system

is based solely on the shape of the scale-space signature and therefore independent of the

contrast of an edge.

We normalize each scale signature by dividing each value in the signature by the

signature's total magnitude. Equation 4.1 shows the normalization of a feature vector

where g_s(i, j) is the response of a feature detector at scale s at point (i, j) in an image and

n is the total number of scales in the feature vector. After normalization all the points in

the space lie on one hyperplane. Transforming the coordinate system of the space to this

hyperplane reduces the dimensionality of the space by one.

\[
\left[ \frac{g_1(i,j)}{\sum_{y=0}^{n-1} g_{2^y}(i,j)},\;
\frac{g_2(i,j)}{\sum_{y=0}^{n-1} g_{2^y}(i,j)},\; \ldots,\;
\frac{g_n(i,j)}{\sum_{y=0}^{n-1} g_{2^y}(i,j)} \right]
\tag{4.1}
\]
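A minimal sketch of this normalization follows. The function name and the handling of an all-zero signature are assumptions made for illustration; the thesis does not say how a point with no response at any scale is treated.

```python
def normalize_signature(signature):
    """Divide each response by the sum of the responses over all scales."""
    total = sum(signature)
    if total == 0:
        # Assumed convention: a point with no response at any scale stays at zero.
        return [0.0] * len(signature)
    return [value / total for value in signature]
```

Because numerator and denominator scale together, multiplying every response by the same contrast factor leaves the normalized signature unchanged, and the components of a non-zero normalized signature sum to one, which is the hyperplane mentioned above.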

Normalizing the scale space changes the entire scale-space. We must examine the

normalized space to determine what type of pattern classifier to use. We want to examine

the normalized scale space of the step10 and 8-sided-cone images. Figures 4.11a and

4.11b show collective scale-space projections for the step10 and 8-sided-cone images

respectively. These projections show that feature vectors representing edges appear to

form one hyperblob within the space. We could model this edge space with a simple

statistical pattern-recognition system.

Figure 4.11: 2-D scale-space projections from the step10 and 8-sided-cone images: (a) 2-D scale-space projection of the normalized space of edge points from the step10 image and (b) 2-D scale-space projection of the normalized space of edge points from the 8-sided-cone image with added noise.

We propose to train a pattern-recognition system with scale-space signatures of edge

points. Once trained, the system is applied to each pixel of an image to be segmented.

The edgeness of each pixel in the new image is determined, therefore creating a fuzzy

edge map. We develop the segmentation process shown in figure 4.12 based on the idea

of modeling the edge space with Gaussian blobs.

4.4 Modeling Scale Space Using Gaussian Blobs

The process in figure 4.12 consists of two phases: a training phase and a feature

detection phase. In the training phase we first perform the wavelet transform on a training

image (usually a synthetic image with noise). Next we normalize each scale signature. We

then calculate the covariance matrix and mean vector of the edge points in the normalized

and transformed space. This statistical information is used to create a Gaussian model of

the space.

Figure 4.12: Flow of entire Pattern Analysis of Scale Space for Extraction of Features (PASSEF) system. (Flow diagram. Training phase: range image and truth image, scale space over scales 1 to N, normalized scale feature vectors, Gaussian approximation given by a mean vector and covariance matrix. Feature detection phase: range image, scale space over scales 1 to N, pattern-recognition system, fuzzy fusion with the magnitude, fuzzy feature map.)

We use a Gaussian function because it is smooth and convenient to implement. In

addition, its parameters adjust the shape of the function (the standard deviation, or the

covariance matrix in several dimensions) and the placement of the function (the mean).

The Gaussian is used as a model for the training data to indicate

where concentrations of feature points occur rather than as a precise model of feature

points in the space. The parameters of the multi-dimensional Gaussian function allow it

to successfully indicate concentrations of feature points in the feature space.
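The training and evaluation of a single Gaussian blob can be sketched as follows. This is an illustration under assumptions, not the thesis implementation: the response is an unnormalized Gaussian with peak value 1 at the mean, the feature vectors are assumed to be already expressed in the reduced (hyperplane) coordinates so that the covariance matrix is non-singular, and all function names are hypothetical.

```python
import math

def fit_gaussian(vectors):
    """Mean vector and sample covariance matrix of the training signatures."""
    count, dim = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / count for d in range(dim)]
    cov = [[sum((v[a] - mean[a]) * (v[b] - mean[b]) for v in vectors) / (count - 1)
            for b in range(dim)] for a in range(dim)]
    return mean, cov

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [list(matrix[r]) + [rhs[r]] for r in range(size)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (aug[r][size] - tail) / aug[r][r]
    return x

def edgeness(vector, mean, cov):
    """Unnormalized Gaussian response exp(-0.5 * d^T cov^{-1} d), in (0, 1]."""
    diff = [v - m for v, m in zip(vector, mean)]
    weighted = solve(cov, diff)  # cov^{-1} d without forming the inverse
    return math.exp(-0.5 * sum(d * w for d, w in zip(diff, weighted)))
```

Applying `edgeness` to the signature of every pixel of a new image would give one fuzzy edge value per pixel, high where the signature falls in a region dense with training edge points.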

After deriving the proper model for the training data the feature detection phase

begins. In this phase the wavelet transform yields a feature vector for each point in the

input image. The pattern-recognition system creates an edge map using the feature vectors

of the input image and the Gaussian approximation from the training phase. This edge

map and the magnitude values of the scale space are fused to create a final edge map.

The final edge map is segmented using a morphological watershed algorithm [62].

Figure 4.13: Synthetic highbay image: (a) Synthetic highbay image and (b) image indicating edge points (step-edge points only) used to create scale space from real highbay image.

Simple training data such as the step10 and 8-sided-cone images can be modeled

with a single n-dimensional Gaussian, but for a more complicated space this single blob

model breaks down. Multiple Gaussian blobs, if positioned correctly, might be able to

better represent this data. To show this we examine the collective scale space of a more

complex scene. We use a synthetic data set referred to as the highbay image (figure

4.13a). This image contains step as well as crease edges, but we examine only the edge

space for the step-edge operator. Figure 4.13b shows the points used to create the edge

space. The edge space for the highbay image is much more dispersed than the edge space

of the simple step10 image (figure 4.14).

A classification system with the ability to divide the areas of concentrated points into

separate regions could allow the space of edges to be modeled by a number of n-dimensional

Gaussian blobs. One n-dimensional Gaussian could be fitted to each piece of the divided

space in the same way it is fitted to the entire space. The number of vectors in a particular

area of the edge space would determine the scale of the Gaussian approximation used to

model that piece. This is necessary because the magnitude of the result from the statistical

classifier must be related to the concentration of points. Areas in the scale space that are

highly concentrated with edge points from the training data should render high responses

when a feature vector from a new image is examined.

Figure 4.14: 2-D projection of scale space for edge points of the highbay image.

We use a k-means algorithm to divide the space into a specified number of clusters.

We then fit an n-dimensional Gaussian to each of these clusters. Figure 4.15 shows the

results of applying the k-means clustering algorithm to the edge scale-space of the highbay

image to create ten clusters.
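A basic k-means pass over the training signatures can be sketched as below. The initialization (the first k vectors as starting centers) and the stopping rule are assumptions chosen to keep the example deterministic; the thesis does not state which variant of k-means is used.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(vectors, k, iterations=100):
    """Return (labels, centers) after Lloyd-style k-means iterations."""
    centers = [list(v) for v in vectors[:k]]  # assumed deterministic initialization
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest center.
        new_labels = [min(range(k), key=lambda c: squared_distance(v, centers[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Each resulting cluster can then be given its own Gaussian, fitted to the member vectors in the same way as for the whole space.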

4.5 Fusion of the Step-Edge and Crease-Edge Detection Maps

In order to detect all the edges in an image (step and crease) we apply the PASSEF

system twice. We apply the PASSEF system using the step-edge operator with only step-

edge training data, and then we apply the PASSEF system using the crease-edge operator

with only crease-edge training data. This creates two edge maps: one for step edges and

the other for crease edges. These two edge maps must somehow be combined to create a

single comprehensive edge map.

Figure 4.15: K-means applied to scale space of step-edge points of highbay image.

We use a fuzzy union operator to fuse these two maps [64]. We choose Bernoulli's

rule of combination to perform the fusion [65]. Bernoulli's rule of combination performs a

union of two values a ∈ [0, 1] and b ∈ [0, 1] using equation 4.2.

\[
1 - (1 - a)(1 - b) \tag{4.2}
\]

Edge maps created by the PASSEF system are fuzzy but not bounded. In order to apply

Bernoulli's rule of combination, we need to transform the values of the edge maps to

the [0,1] interval. A simple way to map the data to the [0,1] interval is to perform a

linear remapping (equation 3.18). After the fuzzy union is performed on the images the

morphological watershed described in [62] is applied to the final edge map to create a

segmentation.
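The fusion step can be sketched as follows. Equation 3.18 is not reproduced in this chapter, so the linear remapping below is assumed to be a min-max rescaling; the function names are hypothetical.

```python
def remap_unit_interval(edge_map):
    """Linearly rescale a 2-D map so its minimum is 0 and its maximum is 1."""
    flat = [value for row in edge_map for value in row]
    low, high = min(flat), max(flat)
    if high == low:
        return [[0.0] * len(row) for row in edge_map]  # constant map: no edges
    return [[(value - low) / (high - low) for value in row] for row in edge_map]

def bernoulli_union(a, b):
    """Bernoulli's rule of combination for a, b in [0, 1] (equation 4.2)."""
    return 1.0 - (1.0 - a) * (1.0 - b)

def fuse_edge_maps(step_map, crease_map):
    """Remap both edge maps to [0, 1] and take their pixel-wise fuzzy union."""
    step = remap_unit_interval(step_map)
    crease = remap_unit_interval(crease_map)
    return [[bernoulli_union(a, b) for a, b in zip(step_row, crease_row)]
            for step_row, crease_row in zip(step, crease)]
```

The union is never smaller than either input, so an edge that is strong in only one of the two maps remains strong in the fused map.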


CHAPTER 5

Results and Analysis of System

In this chapter we examine the capabilities of the Pattern Analysis of Scale Space for

Feature Extraction (PASSEF) system. We analyze the PASSEF system with both synthetic

and real data. This analysis addresses each of the three qualities of a segmentation

algorithm set forth in chapter one:




We begin this chapter with an analysis of training data used with the PASSEF system.

Next we discuss the detection of step edges and end with an analysis and discussion of

crease edges.

5.1 Training Data

The PASSEF system requires training data to extract features in an image. In this

chapter all the training-data sets are formed from feature points in synthetic images.

Different synthetic images create training-data sets for the results. The step10, 8-sided-

cone, and highbay images are used. Different edge points are used to create three

training-data sets from the highbay image. Table 5.1 exhibits the aspects of all training-

data sets used to obtain the results for this chapter.

The step10 image contains only step edges so the training data created from this

image contains feature points of only step edges. To find the step-edge points in the

step10 image we first find g_1 for the step10 image with no noise. We threshold g_1 to

yield a binary map of edge points used to create the training-data set (figure 5.1b).

Table 5.1: Training Data used with the PASSEF System

Variable Name   Image Source    Type of Features   Edge Point Figure
S               step10          step               5.1b
C               8-sided-cone    crease             5.2b
H_{s,c}         highbay         step and crease    5.3b
H_s             highbay         step               5.3c
H_c             highbay         crease             5.3d

Figure 5.1: Training data from the step10 image: (a) The step10 image (size 256x256) containing noise σ = 10, (b) image indicating edge points (black) used to create training-data set S.
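Thresholding a noise-free response image to obtain training edge points can be sketched as below. The threshold value is a free choice that the thesis does not state, and the function names are hypothetical.

```python
def threshold_edges(response, threshold):
    """Binary edge map: 1 where the response exceeds the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in response]

def edge_coordinates(binary_map):
    """List the (i, j) positions marked as edges in a binary map."""
    return [(i, j)
            for i, row in enumerate(binary_map)
            for j, value in enumerate(row) if value]
```

The coordinates returned by `edge_coordinates` identify the pixels whose scale-space signatures, taken from the noisy version of the same image, would form a training-data set.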

The 8-sided-cone image contains only crease edges. This means that the training

data created from this image contains feature points of only crease edges. We determine

the crease-edge points in the 8-sided-cone image by examining g_1 for the 8-sided-cone

image with no noise. We threshold g_1 to yield a binary map of edge points used to create

the training-data set (figure 5.2b).

The highbay image contains both crease and step edges. We create three training-

data sets from the highbay image. One set, H_{s,c}, contains all edges in the image. A

second set, H_s, contains only step-edge points in the image, and a third, H_c, incorporates only

crease-edge points. To obtain the edge points used to create H_{s,c} we derive g_1 for

the highbay image. We threshold g_1 to obtain a binary map of all the edge points in the

highbay image (figure 5.3b). To acquire the edge points used to make H_s we find g_1 for

the highbay image. We threshold and manipulate g_1 to yield a binary map of edge points

used to create H_s (figure 5.3c). Crease-edge points of the highbay image are found by

thresholding and manipulating h_1 of the highbay image (figure 5.3d). These crease-edge points

are used to create the training-data set H_c.

Figure 5.2: Training data from the 8-sided-cone image: (a) The 8-sided-cone image (size 256x256) with noise σ = 0.1, (b) image indicating edge points (black) used to create training-data set C.

5.2 Detection of Objects at All Scales in the Presence of Noise

First we consider the detection of blocks in the synthetic step10 image and an analogous

image that contains more noise (the step30 image containing Gaussian noise σ = 30).

These images provide a way to see the effects of different noise levels on the PASSEF system.

Second we consider the detection of objects in a real image. This real image provides

objects of different sizes (i.e. features at different scales). We compare results from the

PASSEF method to results obtained by applying the step-edge detector at a single scale.

Figure 5.3: Training data from the highbay image. (a) The synthetic highbay image with added noise σ = 0.1 and images indicating points of the highbay used to create training set (black) (b) H_{s,c}, (c) H_s, and (d) H_c.

Recall the step10 image (figure 5.1a). This image consists of 196 blocks of various

magnitudes. The image is piecewise flat and thus contains only step edges. Also recall

that a "perfect" edge detector for an ideal image (no noise and piecewise flat) is the

gradient magnitude. Figure 5.1b shows this for the step10 image.

We use S to train the PASSEF system and then apply the PASSEF system to the

step10 image. The PASSEF system utilizes only the step-edge operator g. We denote

this as P^S_g[step10]. We use eight scales for the training and application of the PASSEF

system. Because the scale-space data appears to form one Gaussian blob (figure 4.11a)

we use only one blob to model the space. Figure 5.4 shows P^S_g[step10] and g_2[step10]

for subjective comparison.

Next we apply the PASSEF system to the step30 image. This image is analogous

to the step10 image except the standard deviation of the added noise is thirty rather

than ten (figure 5.5). The "perfect" edge detection for this image with no noise is again

the gradient magnitude (figure 5.1b). The proper noise level for a training set is not

necessarily equal to the noise level of the image being processed; therefore, we train the

PASSEF system using the step10 image. (This is fully explored in the next section.) We

again use eight scales and one blob to model the data. Figure 5.6 displays P^S_g[step30]

and g_4[step30] for comparison.

The gradient magnitude, g_s, is an excellent edge detector for the step10 and step30

images. These images contain blocks of only one size; therefore, an optimal global scale

exists for the g_s operator. The optimal scale is the minimum scale that sufficiently smooths

noise in the image.

Figure 5.7 shows each dyadic scale of the step-edge detector, g_s[step10] for s ∈

{1, 2, 4, 8, 16, 32}, and figure 5.8 shows the result of applying the watershed algorithm to

this scale space. The second scale of this scale space (figure 5.8b) seems to yield the best

segmentation. At the second scale noise is reduced and most of the blocks are properly

segmented. Figure 5.9 shows each scale of the step-edge detector for the step30 image,

g_s[step30] for s ∈ {1, 2, 4, 8, 16, 32}, and figure 5.10 shows the result of applying the

watershed algorithm to this space. In the gradient-magnitude scale space for the step30

image, scale four seems to yield the best segmentation (figure 5.10c). Indeed a higher

noise level requires more smoothing to create a proper segmentation, and the minimum

scale that provides this adequate smoothing gives the optimal segmentation.

Figure 5.4: Results of applying PASSEF and the step-edge operator to the step10 image: (a) P^S_g[step10], (b) the watershed algorithm applied to P^S_g[step10], (c) g_2[step10], and (d) the watershed algorithm applied to g_2[step10]. In these figures black is the highest and white the lowest response to the edge operator. Gray values between white and black represent values that are between the lowest and highest responses respectively.

Figure 5.5: The step30 image (size 256x256). This image contains added Gaussian noise σ = 30.

Figure 5.6: Results of applying PASSEF and the step-edge operator to the step30 image: (a) P^S_g[step30], (b) the watershed algorithm applied to P^S_g[step30], (c) g_4[step30], and (d) the watershed algorithm applied to g_4[step30].

Figure 5.7: Application of the step-edge operator to the step10 image at various scales: g_s[step10] for s = (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, and (f) 32.

Figure 5.8: Watershed algorithm applied to step-edge operator results of the step10 image: Watershed algorithm applied to g_s[step10] for s = (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, and (f) 32.

Figure 5.9: Application of the step-edge operator to the step30 image at various scales: g_s[step30] for s = (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, and (f) 32.

An image that contains objects of varying sizes will not have an optimal global scale

and may therefore prove more challenging for the single-scale paradigm. Next we examine

the hydro3 image (figure 5.11). Figure 3.12 shows g_s[hydro3] for s ∈ {1, 2, 4, ..., 128}.

Unlike for the step10 and step30 images, it is difficult to determine which scale of

g_s produces an optimal segmentation for the hydro3 image. This is because the scene

contains small objects (such as valve wheels) as well as larger objects (such as pipes).

Figure 5.10: Watershed algorithm applied to step-edge operator results of the step30 image: Watershed algorithm applied to g_s[step30] for s = (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, and (f) 32.

To apply the PASSEF system to the hydro3 image we use the H_{s,c} training data

because this data more closely models the hydro3 image. We use four scales and twenty

blobs to model the scale space. Because g_s has some ability to detect both types of

edges we want to train the PASSEF system with both step and crease edges, so we use H_{s,c}

rather than H_s as training data.

Figure 5.12 shows P^{H_{s,c}}_g[hydro3] and g_4[hydro3] (one of the better segmentations

from g_s) for comparison. The PASSEF method seems to create a better segmentation

than any one scale of the gradient magnitude operator (refer to figure 3.13 if needed).

This analysis suggests that the PASSEF method succeeds in properly detecting features

at multiple scales in the presence of noise.

5.3 Free Parameters in the PASSEF System

In this section we examine the free parameters of the PASSEF system. We consider

each free parameter individually. We discuss proper values for each free parameter, how

proper values are determined, and the PASSEF system's sensitivity to each parameter.

The PASSEF system has few free parameters and the proper values for these parameters

can be found with a minimum of human interaction. Indeed, the PASSEF system's greatest

strength is its small number of free parameters and its robustness to their adjustment.

Figure 5.12: Results of applying PASSEF and the step-edge operator to the hydro3 image: (a) P^{H_{s,c}}_g[hydro3], (b) the watershed algorithm applied to P^S_g[hydro3], (c) g_4[hydro3], and (d) the watershed algorithm applied to g_4[hydro3].

Table 5.2: Free parameters in the PASSEF system and segmentation algorithm.

Parameter Name                 Variable   Range of Values      Sensitivity
Number of Scales               n          4 to 8               low
Number of Blobs                N          0 to 20              low
Noise Level in Training Data   σ          0.1                  medium
Minimum Watershed Depth        d          0.0001 to 0.00001    medium

Table 5.2 shows the free parameters of the PASSEF system. This table states the name

of the parameter, the variable name associated with the parameter, the range of values

for the parameter used in this thesis, and a subjective measure of the PASSEF system's

sensitivity to this parameter. There are three grades of sensitivity measure: low, medium,

and high. Low sensitivity indicates that changing this parameter from image to image is

seldom necessary. High sensitivity indicates that the parameter may need to be changed

from image to image.

5.3.1 Number of Scales - n

The PASSEF system requires multi-scale information to extract features. We ensure that

the PASSEF system has ample information by adjusting the parameter n. If the value

of n is too low the scale space will not contain enough information to extract features.

For example, if n = 1 the PASSEF system can only extract features at that single

scale. Therefore, n should be high enough to create a scale space that contains all the

information needed to extract features in a particular image.

Theoretically very large values of n will always yield a good feature extraction because

all the needed information is available to the system. But using large values of n forces the

system to analyze much redundant information. When n is very large redundant information

is present because after a certain degree of smoothing no new useful information

can be extracted from an image (examine figure 3.12). Theoretically the PASSEF system

has the ability to extract features in the midst of this redundant information, and this

is borne out, to some extent, in our experiments. Because the system determines which

scales contain useful information it will use only these scales for classification. Although

the system can detect features regardless of redundant information, we attempt to use values

of n that avoid redundant information in the system in order to reduce processing

time.

To demonstrate the effect of n being too low we examine detection of features in the

step10 image using two values of n, four and eight. Figure 5.13 shows the result of

applying the PASSEF system to the step10 image with n equal to four and eight. The

training data set is S. The four scale and eight scale results are somewhat similar, but

the four scale result appears to have more noise. Better detection using eight scales with

the PASSEF system indicates that scales beyond the fourth dyadic scale for g_s[step10]

contribute some useful information (examine figure 3.7).

5.3.2 Number of Blobs - N

We create a statistical model of a training-data set for the purpose of classifying points

in an image. Finding the proper training-data model is essential for the PASSEF system

to properly extract features. To create a model for the training data we first apply a

k-means algorithm to the feature space. This divides the feature vectors into N groups.

Each group of vectors is then fitted with an n-dimensional Gaussian blob. These Gaussian

blobs collectively create a model of the training data.
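How the N blobs might be combined into one response can be sketched as follows. Several simplifications are assumed and are not taken from the thesis: each blob uses only per-dimension variances (an axis-aligned Gaussian) rather than a full covariance matrix, each blob is weighted by the fraction of training vectors it contains, and the cluster labels are taken as given. The function names are hypothetical.

```python
import math

def fit_blobs(vectors, labels, min_variance=1e-6):
    """Fit one axis-aligned Gaussian per cluster, weighted by member count."""
    blobs = []
    for label in sorted(set(labels)):
        members = [v for v, l in zip(vectors, labels) if l == label]
        dim = len(members[0])
        mean = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        # Population variance with a floor so single-point clusters stay usable.
        variance = [max(sum((m[d] - mean[d]) ** 2 for m in members) / len(members),
                        min_variance)
                    for d in range(dim)]
        blobs.append({"weight": len(members) / len(vectors),
                      "mean": mean,
                      "variance": variance})
    return blobs

def model_response(vector, blobs):
    """Weighted sum of the blob responses for one feature vector."""
    total = 0.0
    for blob in blobs:
        exponent = sum((x - m) ** 2 / s
                       for x, m, s in zip(vector, blob["mean"], blob["variance"]))
        total += blob["weight"] * math.exp(-0.5 * exponent)
    return total
```

With this weighting, a feature vector that falls in a densely populated cluster receives a higher response than one that falls in a sparsely populated cluster, which matches the requirement that the response be related to the concentration of training points.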

The feature vectors of various training-data sets can be distributed very differently

within their respective feature spaces. For example edge points in the step10 image,

training-data set S, create a feature space that is somewhat compact. The feature vectors

in this space seem to form one group (figure 5.14a). In contrast, the edge points in

the highbay image, training-data set H_{s,c}, form a convoluted feature space (figure 5.14b).

Figure 5.13: Application of the PASSEF system to the step10 image for varying values of n: P^S_g[step10] with n = (a) 4 and (c) 8. The watershed algorithm applied to P^S_g[step10] with n = (b) 4 and (d) 8.

Figure 5.14: 2-D scale-space projections of training data sets: (a) 2-D scale-space projection of the space of the training-data set S and (b) 2-D scale-space projection of the space of the training-data set H_{s,c}.

Creating acceptable models for these training-data sets, which have varying feature

distributions, requires knowledge of the proper model for training data. The proper model

should generalize the training data to the data being analyzed. This implies that the model

must be accurate but not too exact. The training-data model must be accurate enough

to discriminate feature points from non-feature points in the data being analyzed. If the

model is too general non-feature points will be classified as feature points. So the proper

model is one that is able to generalize yet accurately describes the training data.

Adjustment of N provides a way to obtain a proper model for various training-data sets. Examining the distribution of feature points in a training-data set aids in determining a proper range of values for N. We visualize this distribution by projecting the data onto the two-dimensional subspace spanned by the eigenvectors associated with the two largest eigenvalues; figure 5.14 displays two examples of these projections. We examine these projections and other projections to ascertain the distribution of feature points in the training space. A more distributed space requires a higher value for N and a more compact space requires a lower value for N.
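As a rough illustration of the two-dimensional projection described above, the following sketch projects feature vectors onto the two leading principal directions. It assumes that the eigenvectors in question are those of the sample covariance matrix of the training vectors, which the text does not state explicitly. The function names and the toy four-dimensional data are hypothetical, and power iteration with deflation is used only to keep the example free of external libraries.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean_center(points):
    dim = len(points[0])
    means = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    return [[p[k] - means[k] for k in range(dim)] for p in points]

def covariance(centered):
    # Sample covariance matrix of mean-centered vectors (assumed choice of matrix).
    dim = len(centered[0])
    count = len(centered)
    return [[sum(p[i] * p[j] for p in centered) / (count - 1)
             for j in range(dim)] for i in range(dim)]

def dominant_eigenpair(matrix, iterations=500):
    # Power iteration: converges to the eigenvector of the largest eigenvalue.
    vector = [1.0 / (k + 1) for k in range(len(matrix))]
    for _ in range(iterations):
        product = [dot(row, vector) for row in matrix]
        norm = dot(product, product) ** 0.5
        vector = [x / norm for x in product]
    eigenvalue = dot(vector, [dot(row, vector) for row in matrix])
    return eigenvalue, vector

def deflate(matrix, eigenvalue, vector):
    # Remove the component already found so the next eigenpair becomes dominant.
    dim = len(matrix)
    return [[matrix[i][j] - eigenvalue * vector[i] * vector[j]
             for j in range(dim)] for i in range(dim)]

def project_to_2d(points):
    centered = mean_center(points)
    cov = covariance(centered)
    lam1, v1 = dominant_eigenpair(cov)
    lam2, v2 = dominant_eigenpair(deflate(cov, lam1, v1))
    projected = [(dot(p, v1), dot(p, v2)) for p in centered]
    return projected, (lam1, lam2)

# Hypothetical 4-D feature vectors with most of their spread along the first axis.
signatures = [[4.0 * (i - 9.5), 2.0 * ((i % 4) - 1.5), 0.1 * ((i * 7) % 5 - 2), 0.0]
              for i in range(20)]
projected, (lam1, lam2) = project_to_2d(signatures)
```

Plotting the pairs in `projected` gives a scatter plot of the kind shown in figure 5.14; a tight single cluster would suggest a small N, a spread-out or multi-lobed cloud a larger one.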

Proper training-data models are obtained by knowing only an acceptable range for N rather than an exact value, because a precise choice of N does not appear to be critical for obtaining good results. For example, the feature distribution in figure 5.14a suggests a proper range for N of 1 to 3. We apply the PASSEF system to the step10 image using the training-data set S with N = 1, 5, and 20 (figure 5.15). As expected, a large number of blobs (twenty) creates a result that is slightly worse because the model becomes too exact, but all the results are very similar. This example demonstrates that the PASSEF system is quite robust to changes in the parameter N. Experimentation shows this is true for other training-data sets.

5.3.3 Noise Level in Training Data - σ

Next we examine the parameter σ, the standard deviation of the noise added to the training data. Usually the PASSEF system is trained with synthetic images. Noise is added to these images to simulate real data, so that the training data more accurately describes the real data.

We examine a complex synthetic image to determine the proper range of values for σ. We add Gaussian noise of different standard deviations to this synthetic image, which has all types of edges present. To test the PASSEF system we train it with the synthetic highbay image with Gaussian noise of σ = 0, 0.1, and 1.0. We apply the PASSEF system to the highbay synthetic image with σ = 1.0 to obtain the three edge maps shown in figure 5.16. Note that we have adjusted the contrast in the images in figure 5.16 to better visualize the results.
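The preparation of noisy training images can be sketched as follows. This is only a minimal illustration: the image size, the step height, and the helper name `add_gaussian_noise` are hypothetical, and σ is assumed to be expressed in the same units as the range values.

```python
import random

def add_gaussian_noise(image, sigma, seed=0):
    """Return a copy of a 2-D list `image` with zero-mean Gaussian noise of std `sigma` added."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in image]

# A tiny synthetic range image containing a single vertical step edge of height 10.
synthetic = [[0.0] * 8 + [10.0] * 8 for _ in range(16)]

# One training image per noise level examined in the text.
training_sets = {sigma: add_gaussian_noise(synthetic, sigma) for sigma in (0.0, 0.1, 1.0)}
```

Each entry of `training_sets` would then be used to train the classifier separately, so that the resulting edge maps can be compared across noise levels.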

It appears that a small amount of noise in the training data improves the result as compared with no noise. This is because without noise the training data does not accurately model the data being analyzed. In addition, a small amount of noise in the training

Figure 5.15: Application of the PASSEF system to the step10 image for varying values of N: P_g^{S}[step10] with N = (a) 1, (b) 5, and (c) 20. Application of watershed to (a), (b), and (c) is shown in (d), (e), and (f) respectively.

Figure 5.16: The PASSEF system applied to the highbay synthetic image for varying values of σ: the PASSEF system applied to the highbay synthetic image with added Gaussian noise of σ = 1, with training data of Gaussian noise levels σ = (a) 0, (b) 0.1, and (c) 1.0. We gamma adjust the images in this figure to reveal the noise in the results.

data appears to yield better results than training data with a level of noise equal to that

in the image being analyzed. Too much noise in a training-data set causes the PASSEF

system to model noise rather than features, which causes noise in the result. This analysis

indicates that the best training set is one that has a small amount of noise but probably

less noise than that present in the image being analyzed.

5.3.4 Minimum Watershed Depth - d

The last free parameter is d. The parameter d is the minimum depth of the individual

catchment basins created by the watershed algorithm. The depth of a catchment basin

refers to the difference between the maximum and minimum values of that catchment

basin. Catchment basins that are too shallow (i.e. have a depth less than d) are merged

with other regions.

The value of d must be neither too high nor too low. If the value of d is too high, too many regions are merged together, which causes undersegmentation in the final result. On the other hand, if the value of d is too low, oversegmentation results because not enough regions are merged together. The proper value for d is found heuristically.

Before applying the watershed algorithm to results from the PASSEF system, we transform the values of the fuzzy edge map to the interval [0,1]. On this normalized map the values of d range from 0.0001 to 0.001, and the value used most often is 0.001. The results are moderately sensitive to this parameter.
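The role of d can be illustrated with a short sketch that rescales an edge map to [0,1], measures the depth of each catchment basin as the difference between its maximum and minimum values, and lists the basins shallower than d. The label image is assumed to come from a watershed implementation that is not shown here, the helper names are hypothetical, and the choice of which neighbor a shallow basin is merged with is left out because the text does not specify it.

```python
def normalize_to_unit_interval(edge_map):
    """Linearly rescale a 2-D list of values to the interval [0, 1]."""
    flat = [value for row in edge_map for value in row]
    lowest, highest = min(flat), max(flat)
    if highest == lowest:
        return [[0.0 for _ in row] for row in edge_map]
    return [[(value - lowest) / (highest - lowest) for value in row] for row in edge_map]

def basin_depths(edge_map, labels):
    """Depth of each labeled basin: maximum minus minimum edge-map value inside it."""
    extremes = {}
    for row_values, row_labels in zip(edge_map, labels):
        for value, label in zip(row_values, row_labels):
            lowest, highest = extremes.get(label, (value, value))
            extremes[label] = (min(lowest, value), max(highest, value))
    return {label: highest - lowest for label, (lowest, highest) in extremes.items()}

def shallow_basins(depths, d):
    """Labels of basins whose depth is less than d (candidates for merging)."""
    return sorted(label for label, depth in depths.items() if depth < d)

# Toy edge map and a hypothetical watershed labeling with three basins.
raw_edge_map = [[0.0, 2.0, 10.0, 4.0],
                [0.0, 2.0, 10.0, 4.001]]
labels = [[1, 1, 2, 3],
          [1, 1, 2, 3]]

edge_map = normalize_to_unit_interval(raw_edge_map)
to_merge = shallow_basins(basin_depths(edge_map, labels), d=0.001)
```

Raising d enlarges the list of basins to merge and so moves the result toward undersegmentation; lowering it does the opposite.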

5.4 Images from Different Acquiring Devices

Our final goal is to develop a system that extracts features from images created by different acquiring devices. We apply the PASSEF system to images from two different Perceptron laser range finders and an image from a Coleman Coherent Laser Radar scanner. We use the training-data set H_{s,c} to process all the images from different acquiring devices. Twenty Gaussian blobs model the four-dimensional scale space of edge points (i.e. N = 20 and n = 4). The parameter d ranged from 0.001 to 0.0015. The PASSEF system gives similar results from all scanners.

First we analyze two images from a P5000 Perceptron laser range finder. The foreground of the hydro3 image (figure 5.11) is made up mainly of pipes and valves. There is a wall to the right of the image that also appears in the foreground. The hydro6 image is a different view of the same scene taken with the same range scanner (figure 5.17). The hydro6 image also contains pipes and valves, as well as a light fixture.

Figures 5.12a and b show P_g^{H_{s,c}}[hydro3]. We compare the result obtained with the PASSEF system to single-scale results. Scale four presented the best segmentation for g_s (figures 5.12c and d). We apply the PASSEF system to the hydro6 image to obtain P_g^{H_{s,c}}[hydro6] (figures 5.18a and b). The edge map g_2[hydro6] seemed to provide the best single-scale segmentation for this image (figures 5.18c and d).

Both the hydro3 and hydro6 images contain spike noise. The PASSEF algorithm avoids improper classification of spike noise as edges. For example, on the wall at the

Figure 5.18: Results of applying PASSEF and the step-edge operator to the hydro6 image: (a) P_g^{H_{s,c}}[hydro6], (b) the watershed algorithm applied to P_g^{H_{s,c}}[hydro6], (c) g_4[hydro6], and (d) the watershed algorithm applied to g_4[hydro6].

Figure 5.19: The cone image (size 256x256).

right of the hydro3 image some spike noise is present. The first, second, third and fourth dyadic scales of g_s[hydro3] detect these noise points as edges (refer to figures 3.12a to 3.12e). The ceiling of the hydro6 image contains a narrow, horizontal region of noise in approximately the middle of the image. The PASSEF algorithm does not detect this noise, but the single-scale paradigm does (figure 5.18). The PASSEF system properly detects small-scale features while misclassifying substantially fewer spike-noise points.

Next we apply the PASSEF system to an image that we label as the cone image (figure 5.19). This image is from a Coleman Coherent Laser Radar scanner (see footnote 1), which provides images with very little noise. The largest objects in the image are two brick blocks and a traffic cone placed in front of a barrel. This image has very little spike noise and a low amount of additive noise.

Figure 5.20 shows g_1[cone] and g_2[cone]. The edge map g_1[cone] demonstrates more accurate detection of edges for some objects (e.g. edges of the brick blocks) but noise corrupts proper detection of other features (e.g. the cone and barrel). In addition

Footnote 1: The laser range data files were provided by the Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, managed by Lockheed Martin Energy Research Corp. for the U.S. Department of Energy under contract DE-AC05-96OR22464.

Figure 5.20: Application of the step-edge operator to the cone image: (a) g_1[cone], (b) g_2[cone], and the watershed algorithm applied to (c) g_1[cone] and (d) g_2[cone].

g_1[cone] fails to smooth the noise present in the right side of the image. The g_2[cone] result improves detection of edges for the cone and barrel yet corrupts edges of the brick blocks. The noise patch at the right side of the image is virtually non-existent in g_2[cone].

Figure 5.21: Results of applying PASSEF to the cone image: (a) P_g^{H_{s,c}}[cone] and (b) watershed applied to P_g^{H_{s,c}}[cone].

The PASSEF system shows excellent detection of step edges and avoids detection of noise as features in the cone image (figure 5.21 shows P_g^{H_{s,c}}[cone]). The PASSEF system does, however, detect some crease edges improperly, which causes the image to be undersegmented. For example, P_g^{H_{s,c}}[cone] does not properly detect the crease edge, or change in surface normal, that occurs between the barrel and the floor. Figure 5.22 shows that part of the crease edge between the barrel and floor is detected, but a small piece, at the intersection with the cone's edge, is not detected. The missing piece of this crease edge causes the watershed algorithm to merge the floor region with the region representing the barrel. Notice that edges dividing the floor and the wall also have this characteristic where they intersect edges of the brick blocks.

Figure 5.22: Histogram equalization of the result of applying PASSEF to the cone image: histogram equalization applied to P_g^{H_{s,c}}[cone].

Finally we apply the PASSEF system to two images acquired from another Perceptron laser range finder (see footnote 2). Each of these images contains one polyhedral object. Figure 5.23a shows the polyhedral1 image and figure 5.23b shows the polyhedral2 image. We apply the PASSEF system to both of these images (figures 5.23c and 5.23d) to obtain P_g^{H_{s,c}}[polyhedral1] and P_g^{H_{s,c}}[polyhedral2]. In previous examples the PASSEF system, using only the step-edge operator, demonstrated partial detection of crease edges. The PASSEF system does not detect crease edges in the polyhedral1 or polyhedral2 image. Analysis of g_s[polyhedral1] shows why: the edge maps g_s[polyhedral1] for s ∈ {1, 2, 4, 8} (figure 5.24) show that the step-edge operator has quite limited ability to extract crease edges in the polyhedral1 image.

Improved crease-edge detection is essential to creating a reasonable segmentation for

the polyhedral1 and polyhedral2 images. Results for the cone image show that

Footnote 2: The laser range data files were provided by The Computer Vision / Image Analysis Research Laboratory at the University of South Florida, Department of Computer Science & Engineering, University of South Florida, Tampa, Florida 33620-5399, http://marathon.csee.usf.edu/range/seg-comp/SegComp.html

Figure 5.23: The polyhedral1 and polyhedral2 images: (a) the polyhedral1 image, (b) the polyhedral2 image, (c) P_g^{H_{s,c}}[polyhedral1], and (d) P_g^{H_{s,c}}[polyhedral2].

Figure 5.24: Application of the step-edge operator to the polyhedral1 image: g_s[polyhedral1] for s = (a) 1, (b) 2, (c) 4, and (d) 8.

improvement of crease-edge detection would greatly improve the segmentation produced

by the PASSEF system. The next section discusses our attempts to generalize the PASSEF

system to the extraction of crease edges in order to improve segmentation results.

5.5 Crease-Edge Detection

This section describes the behavior of the PASSEF system using the crease-edge operator, h_s, as opposed to the step-edge operator, g_s, that we examined in the previous sections. The crease-edge operator is given by equation 3.16 in chapter three. We use this operator to extract crease edges with the PASSEF system. This section shows that crease-edge detection is possible when the PASSEF system is generalized to crease edges using h_s. It also shows that more work is necessary for the reliable extraction of crease edges.

The strategy is to use data from h_s with the PASSEF system to create a fuzzy crease-edge map denoted as P_h[f], where f is the image being analyzed. To find P_h[f] the PASSEF system is trained with a training-data set that contains only crease-edge points. For an initial analysis we examine P_h for the 8-sided-cone image (figure 5.2a), a synthetic image containing only crease edges. We use the training-data set C. Figure 5.25 shows P_h^{C}[8-sided-cone]. This initial experiment demonstrates the potential for crease-edge detection using the PASSEF system.

Our second experiment involves detecting crease edges in a more complex synthetic image, the highbay image (figure 5.3a). We use the training-data set H_c to derive P_h^{H_c}[highbay] (figure 5.26b). Some crease edges appear to be detected properly in P_h^{H_c}[highbay], but the results are not as good as for the 8-sided-cone experiment.

Because the highbay image contains step edges and crease edges, we would like to derive a feature map that contains both step and crease edges. For this we find P_g[highbay]

Figure 5.25: Results of applying PASSEF to the 8-sided-cone image: (a) P_h^{C}[8-sided-cone] and (b) the watershed algorithm applied to P_h^{C}[8-sided-cone].

Figure 5.26: Results of applying PASSEF to the highbay image: (a) P_g^{H_s}[highbay] and (b) P_h^{H_c}[highbay].

and fuse this result with P_h[highbay]. Because we are now detecting crease edges explicitly, we do not want P_g[highbay] to detect crease edges. We find P_g[highbay] using a training-data set that contains only step edges (H_s) as opposed to one that contains both step and crease edges (H_{s,c}), which we used previously in this chapter. Figure 5.26a shows P_g^{H_s}[highbay].

We fuse P_g^{H_s}[highbay] and P_h^{H_c}[highbay] using Bernoulli's rule of combination (equation 4.2) to create a complete feature map for the highbay image. Figures 5.27a and 5.27b show the results from this fusion. We compare this result with the result obtained from P_g only to determine whether crease-edge detection is improved by using h_s. Figure 5.27c shows the segmentation obtained from P_g^{H_{s,c}}[highbay]. The fused edge map shows improved detection of some crease edges, but some noise is present.
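The fusion step can be sketched as below. Equation 4.2 is not reproduced in this chapter, so the sketch assumes that it takes the usual form of Bernoulli's rule of combination, p = p1 + p2 - p1 p2, applied pixel by pixel to the two fuzzy edge maps; the function names and sample values are hypothetical.

```python
def bernoulli_combine(p_step, p_crease):
    """Assumed form of Bernoulli's rule of combination for two degrees of support in [0, 1]."""
    return p_step + p_crease - p_step * p_crease

def fuse_edge_maps(step_map, crease_map):
    """Combine two equally sized fuzzy edge maps (2-D lists) pixel by pixel."""
    return [[bernoulli_combine(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(step_map, crease_map)]

# Toy fuzzy edge maps with values in [0, 1].
step_map = [[0.0, 0.5],
            [1.0, 0.2]]
crease_map = [[0.0, 0.5],
              [0.3, 0.0]]
fused = fuse_edge_maps(step_map, crease_map)
```

Under this rule the fused value is never smaller than either input, which is consistent with the observation that noise present in either edge map carries over into the fused result.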

Our goal is to extend this method to real data. We find P_h^{H_c} for the hydro3, hydro6, and polyhedral1 images. The PASSEF system yields noisy results for all of these images. We attempt to fuse P_g^{H_s} with P_h^{H_c}, but the noise prevails in the fused result and causes unacceptable oversegmentation. Additional work could improve crease-edge detection using the PASSEF system.

Figure 5.27: Fusion of step-edge and crease-edge operator results using PASSEF: (a) the fusion of P_g^{H_s}[highbay] and P_h^{H_c}[highbay] using Bernoulli's rule of combination, (b) the watershed algorithm applied to the fusion of P_g^{H_s}[highbay] and P_h^{H_c}[highbay], and (c) a segmentation from applying the watershed to P_g^{H_{s,c}}[highbay].

CHAPTER 6

Conclusions

We have developed a segmentation system for range images that utilizes multi-scale

analysis. We developed the Pattern Analysis of Scale Space for Extraction of Features (PASSEF) method, which largely adheres to the goals set forth in chapter one: to develop an algorithm that possesses the three qualities stated there.




Chapter five of this thesis presents results that demonstrate the degree to which the

PASSEF system achieves the stated goals. The PASSEF system demonstrates excellent

detection of step-edges at all scales in the presence of noise, but shows less promise in

detecting crease-edges. The generalization of the PASSEF system to crease-edges shows

some of the strengths and weaknesses of the system. The PASSEF system requires only

four free parameters. In addition, the system is very robust to adjustment of these parame-

ters. We apply the PASSEF system to several images from various range acquiring devices

to find that the system can successfully detect features in all these images. Moreover, no

adjustment of parameters is needed to obtain feature extractions from these images.

One of the strengths of the PASSEF system is that the system can be extended to

detect other features by using different feature operators. We attempted to extend the

system to the detection of crease edges using the operator in equation 3.16. This attempt

revealed that extending the system to detection of other features is possible, but more

work is needed to improve the results.


In conclusion, we have developed a system that can segment range images that contain

multi-scale objects and are from various acquiring devices with very little free parameter

adjustment. We have presented a thorough analysis of this system that includes assorted

results.

BIBLIOGRAPHY

[1] D. Marr, Vision, Freeman, 1982.

[2] A. Hoover, G. Jean-Baptiste, X. Jiang, P. Flynn, H. Bunke, D. Goldgof, and K. Bowyer, "A comparison of range segmentation algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence 18(7), pp. 673-689, 1996.

[3] A. P. Witkin, "Scale-space filtering," in Proceedings of the 8th International Joint Conference on Artificial Intelligence, pp. 1019-1022, 1983.

[4] F. Bergholm, "Edge focusing," IEEE Transactions on Pattern Analysis and Machine Intelligence 9(6), pp. 726-741, 1987.

[5] J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence 8(6), pp. 679-698, 1986.

[6] F. Truchete, O. Laligant, E. Bourcnanne, and J. Miteran, "Frame of wavelets for edge detection," in Proceedings of the SPIE - Wavelet Applications in Signal Processing, vol. 2303, pp. 141-152, 1994.

[7] S. Haring, M. A. Viergever, and J. N. Kok, "Kohonen networks for multiscale image segmentation," Image and Vision Computing 12(6), pp. 339-344, 1994.

[8] D. Marr and E. Hildreth, "Theory of edge detection," Proceedings of the Royal Society of London B 207, pp. 187-217, 1980.

[9] T. Lindeberg, "Scale-space for discrete signals," IEEE Transactions on Pattern Analysis and Machine Intelligence 12(3), pp. 234-254, 1990.

[10] S. Mallat and S. Zhong, "Characterization of signals from multiscale edges," IEEE Transactions on Pattern Analysis and Machine Intelligence 14(7), pp. 710-732, 1992.

[11] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, third ed., 1992.

[12] S. Livens, P. Scheunders, G. V. de Wouwer, and D. V. Dyck, "Wavelets for texture analysis, an overview," in 6th Int. Conf. on Image Processing and its Applications, vol. 2, pp. 581-585, 1997.

[13] P. Vautrot, N. Bonnet, and M. Herbin, "Comparative study of different spatial/spatial-frequency methods (Gabor filters, wavelets, wavelets packets) for texture segmentation/classification," in IEEE International Conference on Image Processing, vol. 3, pp. 145-148, 1996.

[14] A. Laine and J. Fan, "Frame representations for texture segmentation," IEEE Transactions on Image Processing 5(5), pp. 771-779, 1996.

[15] R. A. Kiltie, J. Fan, and A. F. Laine, "A wavelet-based metric for visual texture discrimination with applications in evolutionary ecology," Mathematical Biosciences 126(1), pp. 21-39, 1995.

[16] A. Laine and J. Fan, "Texture classification by wavelet packet signatures," IEEE Transactions on Pattern Analysis and Machine Intelligence 15(11), pp. 1186-1190, 1993.

[17] R. Kiltie and A. Laine, "Visual texture, machine vision and animal camouflage," Trends in Ecology and Evolution 7(5), pp. 163-166, 1992.

[18] O. Pichler, A. Teuner, and B. Hosticka, "An unsupervised texture segmentation algorithm with feature space reduction and knowledge feedback," IEEE Transactions on Image Processing 7(1), pp. 53-61, 1998.

[19] W. Wu and S. Wei, "Rotation and gray-scale transform-invariant texture classification using spiral resampling, subband decomposition, and hidden Markov model," IEEE Transactions on Image Processing 5(10), 1996.

[20] C. Lu, P. Chung, and C. Chen, "Unsupervised texture segmentation via wavelet transform," Pattern Recognition 30(5), pp. 729-742, 1997.

[21] A. Pikaz and A. Averbuch, "An efficient topological characterization of gray-levels textures, using a multiresolution representation," Graphical Models and Image Processing 59(1), pp. 1-17, 1997.

[22] R. Porter and N. Canagarajah, "A robust automatic clustering scheme for image segmentation using wavelets," IEEE Transactions on Image Processing 5(4), pp. 662-665, 1996.

[23] K. Kim, I. Jung, and Y. Yang, "High resolution image classification with features from wavelet frames," in Proceedings of the 1997 IEEE International Geoscience and Remote Sensing Symposium, vol. 1, pp. 584-587, 1997.

[24] B. Wang, Y. Motomura, and A. Ono, "Texture segmentation algorithm using multichannel wavelet frames," in IEEE International Conference on Systems, Man, and Cybernetics, vol. 3, pp. 2527-2532, 1997.

[25] X. Zong, A. Meyer, and A. Laine, "Multiscale segmentation through a radial basis neural network," in IEEE International Conference on Image Processing, vol. 3, pp. 400-403, 1997.

[26] S. Liu and E. Delp, "Multiresolution detection of stellate lesions in mammograms," in IEEE International Conference on Image Processing, vol. 2, pp. 109-112, 1997.

[27] C. Busch, "Wavelet based texture segmentation of multi-modal tomographic images," Computers and Graphics 21(3), pp. 347-358, 1997.

[28] S. Pemmaraju, S. Mitra, Y.-Y. Shieh, and G. Roberson, "Multiresolution wavelet decomposition and neuro-fuzzy clustering for segmentation of radiographic images," in IEEE Symposium on Computer-Based Medical Systems, pp. 142-149, 1995.

[29] A. Betti, M. Barni, and A. Mecocci, "Using a wavelet-based fractal feature to improve texture discrimination on SAR images," in IEEE International Conference on Image Processing, vol. 1, pp. 251-254, 1997.

[30] L. Alparone, M. Barni, M. Betti, and A. Garzelli, "Fuzzy clustering of textured SAR images based on a fractal dimension feature," in Proceedings of the 1997 IEEE International Geoscience and Remote Sensing Symposium, vol. 3, pp. 1184-1186, 1997.

[31] J. Boucher and S. Pleihers, "Unsupervised segmentation of radar images using wavelet decomposition and cumulants," in IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 1-4, 1994.

[32] M. Ramos, S. Hemami, and M. Tamburro, "Psychovisually-based multiresolution image segmentation," in IEEE International Conference on Image Processing, vol. 3, pp. 66-69, 1997.

[33] J. Maeda, V. Anh, T. Ishizaka, and Y. Suzuki, "Integration of local fractal dimension and boundary edge in segmenting natural images," in IEEE International Conference on Image Processing, vol. 1, pp. 845-848, 1996.

[34] S. Sheng and P. Chevrette, "Three-dimensional object recognition from two-dimensional images using wavelet transforms and neural networks," in Optical Engineering, vol. 37, pp. 763-770, 1998.

[35] G. Fernandez and T. Huntsberger, "Wavelet-based system for recognition and labeling of polyhedral junctions," Optical Engineering 37(1), pp. 158-165, 1998.

[36] J. Beltran, L. Garcia, and J. Navarro, "Edge detection and classification using Mallat's wavelet," in IEEE International Conference on Image Processing, vol. 1, pp. 293-297, 1994.

[37] J. Zan, B. Zheng, and W. Zhu, "Image compression scheme using wavelets-based edge extraction and low-frequence component expansion," in Proceedings of the International Conference on Signal Processing, vol. 2, pp. 974-977, 1996.

[38] A. Laine and X. Zong, "Border identification of echocardiograms via multiscale edge detection and shape modeling," in IEEE International Conference on Image Processing, vol. 3, pp. 287-290, 1996.

[39] J. Fayolle, C. Ducottet, T. Fournel, and J.-P. Schon, "Motion characterization of unrigid objects by detecting and tracking feature points," in IEEE International Conference on Image Processing, vol. 3, pp. 803-806, 1996.

[40] S. Chang and M. Vetterli, "Spatial adaptive wavelet thresholding for image denoising," in IEEE International Conference on Image Processing, vol. 2, pp. 374-377, 1997.

[41] O. Neiroukh, "Range image segmentation through multiresolution analysis using wavelets," Master's thesis, The University of Tennessee, Knoxville, Tennessee, May 1995.

[42] T. Aydin, Y. Yemez, E. Anarim, and B. Sankur, "Multidirectional and multiscale edge detection via M-band wavelet transform," IEEE Transactions on Image Processing 5(9), pp. 1370-1377, 1996.

[43] T. Aydin, Y. Yemez, B. Sankur, E. Anarim, and O. Alkin, "Use of M-band wavelet transform for multidirectional and multiscale edge detection," in IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. v-17-20, 1994.

[44] M. Venkatraman and V. Govindaraju, "Zero crossings of a non-orthogonal wavelet transform for object location," in IEEE International Conference on Image Processing, vol. 3, pp. 57-60, 1995.

[45] Z. Xiong, M. Orchard, and K. Ramchandran, "Inverse halftoning using wavelets," in IEEE International Conference on Image Processing, vol. 1, pp. 569-572, 1996.

[46] K. Cinkler and A. Mertins, "Coding of digital video with the edge-sensitive discrete wavelet transform," in IEEE International Conference on Image Processing, vol. 1, pp. 961-964, 1996.

[47] A. Khashman and K. M. Curtis, "Neural networks arbitration for automation for automatic edge detection of 3-dimensional objects," in IEEE International Conference on Electronics, Circuits, and Systems, vol. 1, pp. 49-52, October 1996.

[48] D. Ziou and S. Tabbone, "A multi-scale edge detector," Pattern Recognition 26(9), pp. 1305-1314, 1993.

[49] T. Lindeberg, "Edge detection and ridge detection with automatic scale selection," in IEEE International Conference on Computer Vision and Pattern Recognition, pp. 465-470, 1996.

[50] J. Elder, The Visual Computation of Bounding Contours. PhD thesis, McGill University, Canada, August 1995.

[51] S. Mahmoodi, B. Sharif, and E. Chester, "Contour detection using multi-scale active shape models,"

[52] T. Lindeberg, Discrete Scale-Space Theory and the Scale-Space Primal Sketch. PhD thesis, Royal Institute of Technology, Stockholm, Sweden, May 1991.

[53] D. J. Williams and M. Shah, "Edge contours using multiple scales," Computer Vision Graphics Image Processing 51(3), pp. 256-274, 1990.

[54] R. Qian and T. Huang, "A two-dimensional edge detection scheme for general visual processing," in International Conference on Pattern Recognition, vol. 1, pp. 595-598, 1994.

[55] R. Qian and T. Huang, "Optimal edge detection in two-dimensional images," IEEE Transactions on Image Processing 5(7), pp. 1215-1220, 1996.

[56] R. T. Whitaker and S. M. Pizer, "A multi-scale approach to nonuniform diffusion," CVGIP: Image Understanding 57(1), pp. 99-110, 1993.

[57] D. Wang, "A multiscale gradient algorithm for image segmentation using watersheds," Pattern Recognition 30(12), pp. 2043-2052, 1997.

[58] L. Baxter and J. Coggins, "Supervised pixel classification using a feature space derived from an artificial visual system," in Proceedings of the SPIE - Intelligent Robots and Computer Vision IX: Algorithms and Techniques, pp. 495-469, November 1990.

[59] S. Haring and M. Viergever, "A multiscale approach to image segmentation using Kohonen networks," in Lecture Notes in Computer Science: Information Processing in Medical Imaging, 1993.

[60] S. Mallat, "A theory for multiresolution signal decomposition: The wavelet representation," IEEE Transactions on Pattern Analysis and Machine Intelligence 11(7), pp. 674-693, 1989.

[61] P. J. Besl, Surfaces in Range Image Understanding, Springer-Verlag, first ed., 1988.

[62] E. D. Lester, "Feature extraction, image segmentation, and surface fitting: The development of a 3D scene reconstruction system," Master's thesis, The University of Tennessee, Knoxville, 1998.

[63] M. Baccar, "Surface characterization using a Gaussian weighted least squares technique towards segmentation of range images," Master's thesis, The University of Tennessee, Knoxville, 1994.

[64] G. J. Klir and T. A. Folger, Fuzzy Sets, Uncertainty, and Information, Prentice Hall, 1988.

[65] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, 1976.

[66] R. Pito, "Characterization, calibration and use of the Perceptron laser range finder in a controlled environment," Tech. Rep. MS-CIS-95-05, Department of Computer and Information Science, University of Pennsylvania, 1995.

[67] I. S. Kweon, R. Hoffman, and E. Krotkov, "Experimental characterization of the Perceptron laser rangefinder," Tech. Rep. CMU-RI-TR-91-1, The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, 1991.

[68] K. Low and J. Coggins, "Multiscale vector fields for image pattern recognition," in Proceedings of the SPIE Symposium on Advances in Intelligent Robotics Systems, vol. 1192, 1989.

[69] D. Colella and C. Heil, "The characterization of continuous, four-coefficient scaling functions and wavelets," IEEE Transactions on Information Theory 38(2), pp. 876-881, 1992.

APPENDICES

APPENDIX A

Background

A.1 Wavelet Theory

In image processing the term scale refers to the idea that objects in the world exist

or are relevant over a limited range of sizes or distances [9]. Thus, objects or features

associated with these objects occur at a particular scale within the image. The wavelet

transform provides a tool for extracting information at a particular scale in an image.

The wavelet transform is applied to a signal using a particular wavelet function or

basis function. The basis function determines how the wavelet transform will respond to

a signal. The ability to change this basis function makes the wavelet transform a flexible

tool. The scaling of this basis function determines the size of features or objects to which

the transform is most sensitive.

A.1.1 General

The wavelet transform breaks a signal into frequency components much like Fourier analy-

sis. In contrast to the sine and cosine functions used in Fourier analysis, which are perfectly

local in frequency but global in space, wavelet functions are typically local in both space

and frequency. Because wavelet functions are local in both space and frequency, wavelet

analysis is capable of representing local features of a signal such as sharp peaks or edges.

A wavelet is any function ψ(x) that satisfies the condition stated in equation A.1 (i.e. the function has equal area above and below the horizontal axis).

$$\int_{-\infty}^{\infty} \psi(x)\,dx = 0 \qquad \text{(A.1)}$$

In addition, wavelet functions usually have compact support, that is, they are nonzero only over a closed set of points. According to this definition there are infinitely many valid wavelet functions.
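The zero-area condition of equation A.1 can be checked numerically. The sketch below is an illustration only: it uses a midpoint-rule integral, a Haar wavelet supported on [0, 1), and an unnormalized derivative of a Gaussian, all of which are choices made for this example rather than definitions taken from the thesis.

```python
import math

def haar(x):
    """Haar wavelet on [0, 1): +1 on the first half, -1 on the second half."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def gaussian(x):
    return math.exp(-0.5 * x * x)

def gaussian_derivative(x):
    """Derivative of the (unnormalized) Gaussian above."""
    return -x * math.exp(-0.5 * x * x)

def integrate(func, lower, upper, steps=3000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(func(lower + (k + 0.5) * width) for k in range(steps)) * width

haar_area = integrate(haar, -1.0, 2.0)
dog_area = integrate(gaussian_derivative, -8.0, 8.0)
gaussian_area = integrate(gaussian, -8.0, 8.0)
```

The first two areas vanish to within rounding error, so both functions qualify as wavelets under equation A.1, whereas the Gaussian itself has a clearly nonzero area and does not.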

Typically, multi-scale analysis using the wavelet transform is achieved by scaling the wavelet basis function. The different scales of this basis function are also called dilations. When the wavelet transform is applied to a signal using a large dilation of the basis function, large features of the signal are analyzed. A smaller dilation of the basis function analyzes the details of the signal. Equation A.2 defines the scaling or dilation of a function ξ(x), where ξ_s(x) is the resulting function scaled by the factor s.

$$\xi_s(x) = \frac{1}{s}\,\xi\!\left(\frac{x}{s}\right) \qquad \text{(A.2)}$$
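A small sketch of the dilation in equation A.2 follows. It applies the rule to a Haar wavelet on [0, 1), which is a choice made only for this example, and the helper names are hypothetical. The example shows that dilating by s stretches the support by s while the 1/s factor keeps the integral of the absolute value unchanged.

```python
def haar(x):
    """Haar wavelet on [0, 1): +1 on the first half, -1 on the second half."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def dilate(func, s):
    """Return the dilation of func by the factor s: x -> (1/s) * func(x / s)."""
    return lambda x: func(x / s) / s

def l1_norm(func, lower, upper, steps=6000):
    """Midpoint-rule approximation of the integral of |func| over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(abs(func(lower + (k + 0.5) * width)) for k in range(steps)) * width

haar_4 = dilate(haar, 4.0)          # supported on [0, 4), amplitude 1/4
norm_original = l1_norm(haar, -1.0, 5.0)
norm_dilated = l1_norm(haar_4, -1.0, 5.0)
```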

Most applications of wavelet analysis involve the wavelet transform. Finding the

wavelet transform of a signal consists of determining the inner product of the signal with

the wavelet basis. There are two basic schemes for representing the wavelet transform of

a signal: the pyramidal scheme [60] and the convolution scheme [10].

The pyramidal implementation of the wavelet transform uses the inner product of the signal with the wavelet basis translated to different positions in the image. For each scale the wavelet becomes larger and is therefore translated to fewer positions in the signal. Typically pyramid algorithms are dyadic, meaning the size of the signal at successive levels of the pyramid is reduced by a factor of two. A lossless representation of the data being transformed is achieved with this implementation by using orthogonal wavelets. This representation may also be created by convolving the dilated wavelet with the signal and then uniformly sampling the result to obtain half the number of samples of the original signal.
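The dyadic pyramid can be illustrated with the orthogonal Haar wavelet. The following is a minimal sketch, not the implementation of [60]: the function names are hypothetical and the signal length is assumed to be a power of two. Each level halves the number of samples, and because the Haar basis is orthogonal the original signal can be recovered exactly (up to rounding).

```python
import math

SCALE = 1.0 / math.sqrt(2.0)

def haar_analysis_step(signal):
    """One pyramid level: returns (approximation, detail), each half as long as the input."""
    pairs = range(len(signal) // 2)
    approx = [(signal[2 * k] + signal[2 * k + 1]) * SCALE for k in pairs]
    detail = [(signal[2 * k] - signal[2 * k + 1]) * SCALE for k in pairs]
    return approx, detail

def haar_synthesis_step(approx, detail):
    """Invert one pyramid level."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) * SCALE, (a - d) * SCALE])
    return signal

def haar_pyramid(signal, levels):
    """Return the coarsest approximation and the detail signals from fine to coarse."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_analysis_step(approx)
        details.append(detail)
    return approx, details

def reconstruct(approx, details):
    for detail in reversed(details):
        approx = haar_synthesis_step(approx, detail)
    return approx

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coarse, details = haar_pyramid(signal, levels=3)
restored = reconstruct(coarse, details)
```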

The convolution implementation creates a group of signals (one signal representing

each scale) that are all the same size. The data structures created by the pyramidal and

convolution representations are shown in figure A.1. The convolution representation of the

wavelet transform of a signal $f(x)$ is defined by equation A.3, where $*$ is the convolution

operator.

Figure A.1: Representations of the Wavelet Transform (a stack and a pyramid, ordered by increasing scale)

$$W_s f(x) = f(x) * \psi_s(x) \qquad \text{(A.3)}$$

Using this equation, the wavelet function $\psi(x)$ can be chosen to have certain properties

desired for a particular application.
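The convolution scheme of equation A.3 can be sketched as follows. The derivative-of-Gaussian wavelet, its truncation at four times the scale, the zero padding at the signal ends, and all function names are assumptions made for this illustration.

```python
import math

def dog_kernel(s: float):
    """Samples of psi_s(x) = (1/s) * psi(x/s) for a derivative-of-Gaussian psi,
    truncated to |x| <= 4s (an assumed truncation width)."""
    half = int(math.ceil(4 * s))
    return [-(k / s) * math.exp(-0.5 * (k / s) ** 2) / s
            for k in range(-half, half + 1)]

def convolve_same(f, kernel):
    """Discrete convolution of f with kernel; the output has the length of f.
    Samples outside f are treated as zero (an assumption)."""
    half = len(kernel) // 2
    out = []
    for n in range(len(f)):
        acc = 0.0
        for j, w in enumerate(kernel):
            i = n - (j - half)          # index into f for kernel offset j - half
            if 0 <= i < len(f):
                acc += f[i] * w
        out.append(acc)
    return out

def wavelet_stack(f, scales):
    """One output signal per scale, all the same size as f."""
    return [convolve_same(f, dog_kernel(s)) for s in scales]

step = [0.0] * 32 + [1.0] * 32          # a single discontinuity
stack = wavelet_stack(step, [1.0, 2.0, 4.0])
```

Every row of `stack` has as many samples as the input, which is the stack structure of figure A.1. With zero padding the far end of the signal also looks like a discontinuity, so a real implementation would need a deliberate boundary policy.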

A.1.2 Properties of Wavelets

There are several important properties of wavelet functions. Five of these properties are

integral to wavelet analysis for image segmentation. These are: orthogonality, compact

support, symmetry, differentiability, and number of zero crossings.

Orthogonality

An orthogonal wavelet is one in which the basis wavelet is orthogonal to its translations

and dilations [41]. Two orthogonal functions must satisfy equation A.4.

$$\langle \psi_n, \psi_m \rangle = \delta_{n,m} \qquad \text{(A.4)}$$

Examples of orthogonal wavelet functions are: the Haar (Figure A.2) and the Daubechies

(Figure A.3) wavelets.
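Equation A.4 can be checked numerically for the Haar wavelet. The sketch below assumes dyadic dilations and integer translations indexed by a pair (j, k), with the usual 2^(j/2) normalization; this indexing and the function names are assumptions, since the thesis writes the indices only as n and m.

```python
import math

def haar(x: float) -> float:
    """Haar mother wavelet on [0, 1)."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def haar_jk(j: int, k: int):
    """Dilated and translated wavelet 2^(j/2) * haar(2^j * x - k)."""
    scale = 2.0 ** j
    return lambda x: math.sqrt(scale) * haar(scale * x - k)

def inner(f, g, lo: float = 0.0, hi: float = 4.0, n: int = 4096) -> float:
    """Midpoint-rule approximation of the integral of f(x) * g(x) over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) * g(lo + (i + 0.5) * dx)
               for i in range(n)) * dx

same = inner(haar_jk(0, 0), haar_jk(0, 0))        # a wavelet with itself
shifted = inner(haar_jk(0, 0), haar_jk(0, 1))     # with a translation
dilated = inner(haar_jk(0, 0), haar_jk(1, 0))     # with a dilation
```

The inner product of a wavelet with itself is close to 1, and the inner products with its translation and dilation are close to 0, as equation A.4 requires.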

Figure A.2: Haar wavelet

Figure A.3: Daubechies wavelet

Figure A.4: Derivative of a quadratic spline Wavelet

Compact Support

A function has compact support if its nonzero values are contained over a closed set of

points [41] such that:

$$\exists\, \epsilon > 0 : f(x) = 0, \;\; \forall\, |x| > \epsilon. \qquad \text{(A.5)}$$

A wavelet function that does not have compact support requires truncation for implementation

on a computer. For example, the derivative of a Gaussian (DOG), a valid wavelet

function (figure A.5), must be truncated. Truncation of a function causes several problems

for implementation. One problem is that inaccurate values occur when taking the

derivative of the function. These inaccurate values exist at the truncation points of the

function. Some examples of compactly supported wavelets are the Haar, the Daubechies,

and the derivative of a quadratic spline (figure A.4) wavelets.

Symmetry

In this work functions that have an axis of either symmetry or antisymmetry are considered

symmetric. If f(x) = f(-x), a function has an axis of symmetry. If f(x) = - f(-x), a function

is considered to have an axis of antisymmetry. Some examples of symmetric wavelets

are the Haar, the derivative of a quadratic spline, and DOG functions. The Daubechies

wavelet (Figure A.3), for example, is not symmetric.

Figure A.5: DOG Wavelet

Differentiability

Differentiability, in this work, refers to a function that has a derivative at each point on

the interval $(-\infty, \infty)$. In addition to the existence of a first derivative of the wavelet

function, there is also interest in how many derivatives can be taken beyond this first

derivative. For example, the DOG function can be differentiated infinitely many times, whereas

the derivative of a quadratic spline function cannot.

Number of Zero Crossings

A zero crossing occurs in the function f(x) at the point a if

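$$\exists\, \epsilon > 0 : \quad f(x) > 0 \;\; \forall\, a < x < a + \epsilon \;\text{ and }\; f(x) < 0 \;\; \forall\, a - \epsilon < x < a \;\text{ and }\; f(a) = 0$$

or if

$$\exists\, \epsilon > 0 : \quad f(x) < 0 \;\; \forall\, a < x < a + \epsilon \;\text{ and }\; f(x) > 0 \;\; \forall\, a - \epsilon < x < a \;\text{ and }\; f(a) = 0.$$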

Applying the wavelet transform to a signal that contains discontinuities yields a signal


containing local extrema. The number of zero crossings in the wavelet function corresponds

directly to the number of extrema in the result of applying the wavelet transform to a

signal that contains a single discontinuity. The derivative of a quadratic spline function

and the DOG function both have one zero crossing.
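For sampled functions, the zero-crossing definition above reduces to counting sign changes. The following sketch assumes uniformly sampled derivatives of a Gaussian and a small tolerance for treating a sample as zero; the helper name and the sampling grid are illustrative choices.

```python
import math

def count_zero_crossings(samples, tol: float = 1e-12) -> int:
    """Count sign changes in a sampled function, skipping samples that are
    zero to within tol, so a zero sample between opposite signs counts once."""
    signs = [1 if v > 0 else -1 for v in samples if abs(v) > tol]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

xs = [-5.0 + 0.25 * i for i in range(41)]                          # grid on [-5, 5]
dog = [-x * math.exp(-0.5 * x * x) for x in xs]                    # first derivative of a Gaussian
second = [(x * x - 1.0) * math.exp(-0.5 * x * x) for x in xs]      # second derivative of a Gaussian

dog_crossings = count_zero_crossings(dog)
second_crossings = count_zero_crossings(second)
```

A point where the function touches zero without changing sign is not counted, which matches the definition: both one-sided conditions must hold with opposite signs.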

A.1.3 Examples of Wavelet Functions

Derivative of Gaussian

The DOG is a valid wavelet function because it integrates to zero. The DOG function is

infinitely differentiable and symmetric, but it is not orthogonal and does not have compact

support. The DOG function was used in image processing before the establishment

of wavelet theory. As explained in [10], the DOG function plays an important role in

connecting the field of wavelets to conventional edge detection.

Four-Coefficient Wavelets

One common way to create wavelets is to use a recursive algorithm that relies on a

four-coefficient weighting function [69]. Each set of coefficient values creates a particular

wavelet. Applying rules to the values of the coefficients allows the creation of wavelet

functions with particular properties.

A scaling function, $\phi(x)$, is used to produce the wavelet function. This scaling function

must have an area equal to 1, that is,

$$\int_{-\infty}^{\infty} \phi(x)\,dx = 1. \qquad \text{(A.6)}$$

The scaling function is produced recursively using

$$\phi(x) = c_0\,\phi(2x) + c_1\,\phi(2x - 1) + c_2\,\phi(2x - 2) + c_3\,\phi(2x - 3), \qquad \text{(A.7)}$$

where $c_0$, $c_1$, $c_2$, and $c_3$ are chosen coefficients. The values of the coefficients ultimately determine

the characteristics of the wavelet function generated from this recursively defined

scaling function.


Given four initial values for $\phi(x)$, the recursive application of equation A.7 generates

another discrete function. Successive applications of equation A.7 produce scaling functions

with progressively higher resolutions. For instance, the second resolution has twice

as many samples (8) as the initial resolution (4). Equation A.7 can then be applied to

the second resolution to yield a third resolution, which contains four times the number of

samples (16) as the initial resolution.
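One way to realize this recursion on sampled data is sketched below. The storage layout (samples on [0, 4) at spacing 1/2^level, so that the sample count doubles from 4 to 8 to 16), the function name, and the quadratic-spline example values are assumptions made here for illustration, not values taken from the thesis.

```python
def refine(phi, c, level: int):
    """Apply equation A.7 once.
    phi holds 4 * 2**level samples of the scaling function on [0, 4) at
    spacing 1 / 2**level; the result has twice as many samples at half
    that spacing."""
    step = 2 ** level                   # a shift of 1 in x, measured in samples of phi
    out = []
    for n in range(2 * len(phi)):
        acc = 0.0
        for k, ck in enumerate(c):
            m = n - k * step            # sample of phi standing for phi(2x - k)
            if 0 <= m < len(phi):
                acc += ck * phi[m]
        out.append(acc)
    return out

# Assumed example: quadratic B-spline coefficients and its values at x = 0, 1, 2, 3.
c = [0.25, 0.75, 0.75, 0.25]
phi0 = [0.0, 0.5, 0.5, 0.0]
phi1 = refine(phi0, c, 0)               # 8 samples at spacing 1/2
phi2 = refine(phi1, c, 1)               # 16 samples at spacing 1/4
```

Samples that would fall outside the stored range are taken as zero, which is consistent with a scaling function that vanishes outside [0, 3].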

In some implementations of this recursive algorithm $\phi(x)$ is bounded to the interval

[0,3] and the finer resolutions represent samples that lie in between those at the coarser

resolutions. Nonetheless, $\phi(x)$ must have area 1 on the interval [0,3]. For computer implementations

of this algorithm, values of the continuous signal are stored as discrete samples.

In this case the sum of all these samples, weighted by the sample spacing, must be 1. Equation A.8 shows the criteria of

the scaling function (area equal to 1) expressed for the case of the recursive algorithm

implemented for the continuous and discrete cases. With each increase in resolution the

number of samples of $\phi(x)$ is doubled. This means that in the discrete case $\Delta x$ is halved

with each recursion.

$$\int_{0}^{3} \phi(x)\,dx = \sum_{x=-\infty}^{\infty} \phi(x)\,\Delta x = 1 \qquad \text{(A.8)}$$

where $\Delta x = 1/2^{j}$ for the $j$th level of recursion. This leads to the constraint that

$$\sum_{i=0}^{3} c_i = 2. \qquad \text{(A.9)}$$
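The constraint in equation A.9 can be made explicit by integrating both sides of equation A.7. The following is a short sketch of that step; the substitution $u = 2x - i$ is the only ingredient added here.

```latex
\begin{align*}
\int_{-\infty}^{\infty} \phi(x)\,dx
  &= \sum_{i=0}^{3} c_i \int_{-\infty}^{\infty} \phi(2x - i)\,dx \\
  &= \sum_{i=0}^{3} c_i \cdot \frac{1}{2} \int_{-\infty}^{\infty} \phi(u)\,du
     \qquad (u = 2x - i) \\
  &= \frac{1}{2} \sum_{i=0}^{3} c_i \int_{-\infty}^{\infty} \phi(u)\,du .
\end{align*}
```

Because the area of the scaling function is 1 (equation A.6), the two sides agree only if the coefficients sum to 2.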

The wavelet function is produced from the scaling function by application of equation

A.10.

$$\psi(x) = c_3\,\phi(2x) - c_2\,\phi(2x - 1) + c_1\,\phi(2x - 2) - c_0\,\phi(2x - 3) \qquad \text{(A.10)}$$

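The wavelet function must sum to 0. Because the scaling function sums to 1, equation A.10

gives $c_3 - c_2 + c_1 - c_0 = 0$. Combined with equation A.9, this leads to

the following constraint for the values of the coefficients:

$$c_0 + c_2 = c_1 + c_3 = 1. \qquad \text{(A.11)}$$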
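As a hedged illustration of equation A.10, the sketch below builds wavelet samples from scaling-function samples and confirms that they sum to zero when the coefficients satisfy equation A.11. The sample layout, the function name, and the quadratic-spline example values are assumptions made for this example.

```python
def wavelet_samples(phi, c, level: int = 0):
    """Apply equation A.10 to samples of the scaling function.
    phi holds samples of phi(x) on [0, 4) at spacing 1 / 2**level; the
    result holds samples of the wavelet at half that spacing."""
    step = 2 ** level
    signed = [c[3], -c[2], c[1], -c[0]]     # coefficients of A.10, ordered by shift k
    out = []
    for n in range(2 * len(phi)):
        acc = 0.0
        for k, dk in enumerate(signed):
            m = n - k * step                # sample of phi standing for phi(2x - k)
            if 0 <= m < len(phi):
                acc += dk * phi[m]
        out.append(acc)
    return out

# Assumed example: quadratic B-spline coefficients and its values at x = 0, 1, 2, 3.
c = [0.25, 0.75, 0.75, 0.25]
phi = [0.0, 0.5, 0.5, 0.0]
psi = wavelet_samples(phi, c)
```

For these coefficients c0 + c2 = c1 + c3 = 1, so the positive and negative contributions in `psi` cancel.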

Figure A.6: 4 Coefficient Wavelet Space (labels in the figure: Odd, Even, Orthogonal, Symmetric, Differentiable; marked wavelets: Quadratic Spline, Daubechies, Haar)

This allows the four coefficients $c_0$, $c_1$, $c_2$, and $c_3$ to be reduced to two coefficients, $c_o$ and

$c_e$, which are referred to as odd and even respectively (equation A.12).

$$c_0 = c_e, \quad c_1 = 1 - c_o, \quad c_2 = 1 - c_e, \quad c_3 = c_o \qquad \text{(A.12)}$$

Figure A.6 shows the space of odd and even coefficient values and labels the properties

of coefficients in that space. If $c_o = c_e$, then the wavelet function will be symmetric. If

$c_o + c_e = 1/2$, then the wavelet function is differentiable. Values of the odd and even coefficients

that lie on the circle labeled "Orthogonal" create orthogonal wavelets. These

four-coefficient wavelets allow us to choose desirable properties and easily create a wavelet

with these properties. Notice that a four-coefficient wavelet cannot be simultaneously

symmetric, differentiable, and orthogonal.
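The odd/even parametrization can be explored with a small sketch. The symmetry and differentiability tests follow the conditions stated above. The orthogonality test is an assumption: the thesis does not give the equation of the circle, so the sketch uses the standard requirement that the coefficient sequence be orthogonal to itself shifted by two samples. The example coefficient pairs are standard values for the quadratic spline and the four-coefficient Daubechies wavelet, not values quoted from the thesis.

```python
import math

def coefficients(c_o: float, c_e: float):
    """Expand the odd/even pair into (c0, c1, c2, c3) following equation A.12."""
    return (c_e, 1.0 - c_o, 1.0 - c_e, c_o)

def properties(c_o: float, c_e: float, tol: float = 1e-12):
    """Classify a point of the odd/even coefficient space."""
    c0, c1, c2, c3 = coefficients(c_o, c_e)
    return {
        "symmetric": abs(c_o - c_e) < tol,
        "differentiable": abs(c_o + c_e - 0.5) < tol,
        # Assumed condition for the "Orthogonal" circle:
        # c0*c2 + c1*c3 = 0 (orthogonality to the double shift).
        "orthogonal": abs(c0 * c2 + c1 * c3) < tol,
    }

quadratic_spline = properties(0.25, 0.25)
daubechies = properties((1 - math.sqrt(3)) / 4, (1 + math.sqrt(3)) / 4)
```

Under the assumed condition, substituting equation A.12 gives $(c_o - 1/2)^2 + (c_e - 1/2)^2 = 1/2$, which is a circle in the odd/even plane. The only point that is both symmetric and differentiable is $c_o = c_e = 1/4$, and it does not lie on that circle, in agreement with the closing remark above.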


VITA

Samuel Burgiss was born in Raleigh, North Carolina in 1972. He lived in Raleigh until

1975 when he moved with his parents to eastern Florida. After a short time he moved to

Knoxville, TN. Samuel graduated from Farragut High School in 1990. After high school he

left Knoxville to attend North Carolina State University. He received his BS in Computer

Engineering in May of 1994. Upon graduation Samuel worked as a research assistant at the

Image Processing Lab at the University of Tennessee Medical Center. After working there

for five months Samuel began work as a Systems Analyst at Phillips Consumer Electronics.

Samuel enrolled in the MSEE program at the University of Tennessee Knoxville in 1996

where he is currently working as a Graduate Research Assistant in the Imaging, Robotics

and Intelligent Systems Laboratory under the supervision of Dr. R. T. Whitaker and

Dr. M. A. Abidi. He expects to graduate in August 1998 with a specialization in image

processing and robotic vision.
