RANGE IMAGE SEGMENTATION THROUGH
PATTERN ANALYSIS OF THE MULTI-SCALE
WAVELET TRANSFORM
A Thesis
Presented for the
Master of Science
Degree
The University of Tennessee, Knoxville
Samuel G. Burgiss Jr.
August 1998
ACKNOWLEDGEMENTS
I would like to thank my parents, Samuel and Janet Burgiss, for their support. Much
gratitude goes to my fiancee, Heather, for her encouragement and patience especially
during the writing of this document. Thanks to Dr. M. A. Abidi for selecting me for the
Graduate Research Assistant position in the Imaging, Robotics and Intelligent Systems
laboratory that made obtaining my master's degree financially possible. I also wish to
thank my advisors, Dr. R. T. Whitaker and Dr. M. A. Abidi, for their guidance throughout
my program. Thanks also to the members of my committee, Dr. R. T. Whitaker, Dr. M.
A. Abidi and Dr. J. Gregor for their help and constructive criticism.
The work in this thesis was supported by the DOE's University Research Program in
Robotics (Universities of Florida, Michigan, New Mexico, Tennessee, and Texas) under
grant DOE-DE-FG02-86NE37968.
Thanks to R. E. Barry and the Oak Ridge National Laboratory, Oak Ridge, Tennessee
37831, Managed by Lockheed Martin Energy Research Corp. for the U.S. Department
of Energy under contract DE-AC05-96OR22464. They provided the Coleman laser range
data.
ABSTRACT
This work presents an image segmentation method for range data that uses multi-scale
wavelet analysis in combination with pattern recognition. To segment range images we
develop PASSEF (pattern analysis of scale space for the detection of features). PASSEF
creates a fuzzy edge map and we then apply a morphological watershed algorithm to this
map to create a segmentation.
The PASSEF system uses pattern recognition to classify points in an image based
on response to a feature detector over scale. A scale-space signature is the vector of
measurements at different scales taken at a single point in an image. We train PASSEF
with scale-space signatures from the edge points of a training image. Once trained, the
system can determine the degree of edgeness of points in a new image.
A feature-detection framework based on multi-scale analysis and pattern-recognition
has several potential advantages over other feature-detection systems. Our goal is to create
a system that exploits the advantages of a multi-scale, pattern-recognition framework.
These advantages are detection of features at different scales (i.e. features of all sizes),
robustness to noise, and few or no free parameters. We discuss these advantages in relation
to the development of the PASSEF system and provide a critical analysis of the system
based on these three goals. The PASSEF system achieves the stated goals for the detection
of step-edge features. Our results also show that this technique might be useful in the
detection of other features such as crease edges. We suggest future work for extending the
capabilities of the system.
Contents
1. Introduction
    1.1 Problem Statement
    1.2 Strategy
    1.3 Overview of Thesis
2. Related Work
    2.1 Segmentation Using Wavelets
        2.1.1 Region-Based Segmentation
        2.1.2 Hybrid Segmentation Using Wavelets
        2.1.3 Edge-based Segmentation Using Wavelets
    2.2 Multi-Scale Feature Extraction
        2.2.1 Scale Choice
        2.2.2 Scale Traversal
        2.2.3 Collective Analysis
3. Multi-Scale Feature Extraction
    3.1 Step-Edge Detection
    3.2 Crease-Edge Detection
    3.3 Segmentation from Feature Extraction
    3.4 Fuzzy Reasoning Approach to Scale-Space Fusion
4. Scale-Space Combination Algorithm
    4.1 Analysis of the Scale Space of the Step-Edge Operator
    4.2 Analysis of the Scale Space of the Crease-Edge Operator
    4.3 Motivation for Using Pattern-Recognition
    4.4 Modeling Scale Space Using Gaussian Blobs
    4.5 Fusion of the Step-Edge and Crease-Edge Detection Maps
5. Results and Analysis of System
    5.1 Training Data
    5.2 Detection of Objects at All Scales in the Presence of Noise
    5.3 Free Parameters in the PASSEF System
        5.3.1 Number of Scales - n
        5.3.2 Number of Blobs - N
        5.3.3 Noise Level in Training Data - σ
        5.3.4 Minimum Watershed Depth - d
    5.4 Images from Different Acquiring Devices
    5.5 Crease-Edge Detection
6. Conclusions
BIBLIOGRAPHY
APPENDICES
A. Background
    A.1 Wavelet Theory
        A.1.1 General
        A.1.2 Properties of Wavelets
        A.1.3 Examples of Wavelet Functions
VITA
List of Tables
5.1 Training Data used with the PASSEF System
5.2 Free parameters in the PASSEF system and segmentation algorithm.
List of Figures
1.1 First and second derivatives of 1-D signal
1.2 Multi-scale derivative of Gaussian operator applied to 1-D signal containing step edges.
3.1 Representations of the Wavelet Transform.
3.2 Implementation of the Wavelet Transform.
3.3 A synthetic image containing step edges.
3.4 A synthetic image containing crease edges.
3.5 Region formed from catchment basin.
3.6 The step and step10 images.
3.7 Scale space of the step-edge operator applied to the step10 image.
3.8 Watershed algorithm applied to step-edge operator images from the step10 image.
3.9 Fuzzy fusion (Bernoulli's Rule of Combination) applied to scale space of step10 image.
3.10 Watershed algorithm applied to fuzzy fusion (Bernoulli's Rule of Combination) result for step10 image.
3.11 The hydro3 image (size 256x256).
3.12 Scale space of the step-edge operator applied to the hydro3 image.
3.13 Watershed algorithm applied to step-edge operator images from the hydro3 image.
3.14 Fuzzy fusion (Bernoulli's Rule of Combination) applied to scale space of hydro3 image.
3.15 Watershed algorithm applied to fuzzy fusion (Bernoulli's Rule of Combination) result for hydro3 image.
4.1 Creation of a scale-space signature from scale-space data.
4.2 The block image.
4.3 Scale-space signatures of each pixel in the block image with added Gaussian noise (σ = 0.1).
4.4 Creation of a collective scale space from edge points in the step image.
4.5 2-D scale-space projections from the step10 image.
4.6 The pyramid image.
4.7 Scale-space signatures of each pixel in the simple pyramid image with added Gaussian noise (σ = 0.1).
4.8 Synthetic 8-sided-cone image.
4.9 2-D scale-space projections from the 8-sided-cone image.
4.10 Pattern recognition to determine edgeness of scale-space signatures.
4.11 2-D scale-space projections from the step10 and 8-sided-cone images.
4.12 Flow of entire Pattern Analysis of Scale Space for Extraction of Features (PASSEF) system.
4.13 Synthetic highbay image.
4.14 2-D projection of scale space for edge points of the highbay image.
4.15 K-means applied to scale space of step-edge points of highbay image.
5.1 Training data from the step10 image.
5.2 Training data from the 8-sided-cone image.
5.3 Training data from the highbay image.
5.4 Results of applying PASSEF and the step-edge operator to the step10 image.
5.5 The step30 image (size 256x256).
5.6 Results of applying PASSEF and the step-edge operator to the step30 image.
5.7 Application of the step-edge operator to the step10 image at various scales.
5.8 Watershed algorithm applied to step-edge operator results of the step10 image.
5.9 Application of the step-edge operator to the step30 image at various scales.
5.10 Watershed algorithm applied to step-edge operator results of the step30 image.
5.11 The hydro3 image (size 256x256).
5.12 Results of applying PASSEF and the step-edge operator to the hydro3 image.
5.13 Application of the PASSEF system to the step10 image for varying values of n.
5.14 2-D scale-space projections of training data sets.
5.15 Application of the PASSEF system to the step10 image for varying values of N.
5.16 The PASSEF system applied to highbay synthetic image for varying values of σ.
5.17 The hydro6 image (size 256x256).
5.18 Results of applying PASSEF and the step-edge operator to the hydro6 image.
5.19 The cone image (size 256x256).
5.20 Application of the step-edge operator to the cone image.
5.21 Results of applying PASSEF to the cone image.
5.22 Histogram equalization of the result of applying PASSEF to the cone image.
5.23 The polyhedral1 and polyhedral2 images.
5.24 Application of the step-edge operator to the polyhedral1 image.
5.25 Results of applying PASSEF to the 8-sided-cone image.
5.26 Results of applying PASSEF to the hybay image.
5.27 Fusion of step-edge and crease-edge operator results using PASSEF.
A.1 Representations of the Wavelet Transform
A.2 Haar wavelet
A.3 Daubechies wavelet
A.4 Derivative of a quadratic spline Wavelet
A.5 DOG Wavelet
A.6 4 Coefficient Wavelet Space
CHAPTER 1
Introduction
1.1 Problem Statement
Image segmentation is a difficult problem, but it is also an essential pre-processing
step in many computer vision systems [1]. In the case of range images, segmentation is
performed in order to locate objects, fit surfaces, and render volumes. Segmentation of a
range image is the process of dividing the image into areas that are associated with specific
objects or object components in a scene. Often, these objects or object components can
be identified in a range image as patches that are relatively uniform in range value or
surface shape.
The literature is replete with novel segmentation algorithms. Traditionally these
algorithms must be tuned in order to segment images for a particular system (i.e. camera
or scanner) or segment images for other varying conditions [2]. Segmentation occurs at
a particular scale. Some segmentation algorithms leave scale as an adjustable parameter
whereas other algorithms attempt to find the best scale(s) for segmentation. Some methods
create segmentations based heavily on one scale and refined by other scales [3, 4, 5].
More advanced techniques extract relevant information by intelligently examining all scales
[6, 7]. Noise occurs in an image at a particular scale or set of scales; therefore, in order
to produce a proper segmentation, algorithms that choose scale must avoid scales that
contain relatively large amounts of noise.
Our goal is to develop an algorithm that possesses three qualities:
• Detection of objects at all scales in the presence of noise.
• Few free parameters.
• Applicability to images from different acquiring devices.
This thesis presents the results of our attempt to develop a system that adheres to
these criteria. We develop a system that can segment some different types of images
without parameter adjustment. Additionally, this system segments objects in images at
a range of scales in the presence of sensor noise. Attempts to generalize the system to
crease edges demonstrate some of the strengths and weaknesses of this approach.
1.2 Strategy
We propose to segment range images using multi-scale feature extraction. We detect
edges as features in a range image, and from this edge map we create a segmentation.
There are two concepts that are integral to our strategy: the first is image segmentation
through feature extraction and the second is the extraction of features at multiple scales.
In this section we describe how these two ideas lead to the development of our strategy.
Several strategies for segmenting images exist. Segmentation techniques are commonly
divided into three broad categories: edge-based, region-based, and hybrid methods.
Edge-based methods are derived from feature extraction ideas; these methods detect edges as
features in an image. Filters that find discontinuities in pixel value (e.g. intensity or range)
are commonly used to find edges. After edges are detected, regions that are outlined by
these edges are labeled as objects or regions. Region-based methods work by combining
areas of similar value. These methods usually grow or somehow group pixels of similar
value. For example, initial seeds can be established and these seeds grown to create desired
regions. Hybrid methods use both region-based and edge-based tactics to create a final
image.
Figure 1.1: First and second derivatives of 1-D signal (panels: 1-D signal, 1st derivative
of signal, 2nd derivative of signal).
The Canny technique [5] and Marr-Hildreth technique [8] are two of the more popular
methods of edge-based segmentation. Both of these techniques estimate the derivative of
the image at each point by using a convolution mask and create an edge map based on the
result. Figure 1.1 shows the first and second derivatives of a 1-D signal containing edge
models. Notice that the maxima and minima of the first derivative of the signal indicate
the location of a step edge. The zero crossings of the second derivative also show where
the step edges occur.
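The relationship between step edges and the two derivatives can be checked numerically. The following is a minimal sketch in Python with NumPy, not code from this work; the test pulse, the binomial smoothing mask, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 1-D signal: a unit pulse with a rising step edge between
# samples 49 and 50 and a falling step edge between samples 149 and 150.
signal = np.zeros(200)
signal[50:150] = 1.0

# Light smoothing with a small binomial mask so that the derivatives are
# spread over a few samples, as in a real image.
kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
smooth = np.convolve(signal, kernel, mode="same")

d1 = np.diff(smooth)   # first difference, d1[n] = smooth[n + 1] - smooth[n]
d2 = np.diff(d1)       # second difference

# Step edges appear as a maximum and a minimum of the first derivative ...
rising_edge = int(np.argmax(d1))
falling_edge = int(np.argmin(d1))

# ... and as sign changes (zero crossings) of the second derivative.
zero_crossings = np.where(d2[:-1] * d2[1:] < 0)[0]

print(rising_edge, falling_edge, zero_crossings)
```

Both tests locate the same two edges; the second-difference indices are offset by one sample only because each call to `np.diff` shortens the array by one element.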
An important issue that figure 1.1 does not address is scale. Scale refers to the idea
that objects in the world exist or are relevant over a limited range of sizes or distances
[9]. For example, the branch of a tree is relevant only at the scale from a foot to a few
yards. At the scale of an inch the bark or other features are examined. At the distance
of fifty yards the entire tree is relevant. Scale is relevant to feature extraction because
feature extraction must occur at a particular scale, whether or not that scale is explicitly
represented as a free parameter in the algorithm.
The idea of scale space refers to a family of derived signals where the fine-scale
information is successively suppressed as scale increases [3]. Figure 1.2 shows the multi-scale
derivative of a 1-D signal containing step edges. This derivative is created by convolving
the derivative of a Gaussian (DOG) function with the sample signal. Each scale of the
result is created using a different standard deviation for the DOG function.
Figure 1.2: Multi-scale derivative of Gaussian operator applied to 1-D signal containing
step edges (panels: 1-D signal containing edges, multiscale derivative).
This figure demonstrates two concepts of scale that are vital to our work. First, the
figure shows that as scale increases, derivatives computed at these scales become more
smoothed. This means that in this scale space of derivatives there is a trade-off between
the level of noise reduction and the accuracy of the result. Second, notice that in figure
1.2 the last scales appear to present no new information. They are so smoothed that they
are almost flat. This observation suggests that only scales up to a certain finite size are
needed to analyze particular sets of data.
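The two observations above can be reproduced with a small derivative-of-Gaussian scale space. This is an illustrative sketch only: the test pulse, the set of standard deviations, the 4-sigma kernel truncation, and the helper name `dog_kernel` are assumptions, not values used in this work.

```python
import numpy as np

def dog_kernel(sigma):
    """Sampled first derivative of a Gaussian with standard deviation sigma."""
    radius = int(np.ceil(4 * sigma))          # truncate at 4 sigma (assumption)
    x = np.arange(-radius, radius + 1, dtype=float)
    gauss = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return -x / sigma**2 * gauss

# Hypothetical test signal: a unit pulse with step edges at samples 100 and 300.
signal = np.zeros(400)
signal[100:300] = 1.0

sigmas = [1.0, 2.0, 4.0, 8.0, 16.0]
scale_space = np.array(
    [np.convolve(signal, dog_kernel(s), mode="same") for s in sigmas]
)

# Height of the response at the rising edge: it shrinks as the scale grows,
# so the coarsest rows of the scale space are nearly flat.
peak = scale_space.max(axis=1)

# Width of the response (samples above half the peak height): it grows with
# scale, so the edge is located less precisely.
width = (scale_space > 0.5 * peak[:, None]).sum(axis=1)

# Factor by which white noise of unit standard deviation is scaled in the
# output at each scale (the L2 norm of the kernel): it falls with scale.
noise_gain = np.array([np.sqrt(np.sum(dog_kernel(s) ** 2)) for s in sigmas])

for s, p, w, g in zip(sigmas, peak, width, noise_gain):
    print(f"sigma={s:5.1f}  peak={p:.4f}  width={w:3d}  noise gain={g:.4f}")
```

The noise gain falls faster than the peak height, which is the noise-reduction side of the trade-off, while the growing width is the loss of accuracy.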
In image processing, multi-scale analysis provides a representation of an image that
allows information from each scale to be analyzed separately. The wavelet transform
provides a tool for creating such multi-scale data. In this work we use the derivative of a
cubic-spline wavelet [10], which has the following desirable properties: compact support,
symmetry, differentiability (to a finite degree), and one zero crossing. This wavelet creates
a scale space of first-order difference information. Edge-detection operators are derived
from this first-order difference information.
We want to segment objects of all scales in an image; therefore, using edge detection
as a means to segmentation requires that we extract edges at all scales. Conventional edge
detectors extract edges at only one scale [5, 8, 11]. Thus, combining information from
these edge detectors at multiple scales allows the detection of edges at all scales in an
image. In addition we want to segment these objects in the presence of noise. The noise
versus accuracy trade-off in a derivative scale space leads us to suppose that if a system
could locate noise in the scale space, the scale-space information could be weighted to
achieve proper multi-scale edge detection in the presence of this noise. Thus, multi-scale
analysis not only allows for the detection of multi-scale edges but also provides a way
to avoid improper detection of noise.
1.3 Overview of Thesis
Chapter 2 of this thesis gives an overview of work related to range image segmentation.
Wavelet segmentation and multi-scale segmentation methods are discussed. In chapter 3
we derive two edge operators from the wavelet transform. We examine the scale-space
properties of these edge operators. In chapter 4 we present a detailed description of the
range-image segmentation system that we have developed. We also give our motivation
for using a pattern-recognition system to analyze scale space. Issues that determine the
system's success are discussed. Chapter 5 is a presentation of results of the system. Results
for synthetic and real data are shown as well as an evaluation of the performance of the
system. We conclude with chapter 6, which summarizes our work and discusses ideas for
future investigations in these areas.
CHAPTER 2
Related Work
We are proposing the use of multi-scale wavelet information for the purpose of
segmenting range images. There are two areas in the literature that are relevant. The first
area of related work is the use of wavelets for segmentation. We describe region-based,
edge-based, and hybrid wavelet segmentation methods. The second area of related work
is multi-scale feature extraction. In our discussion of multi-scale feature extraction we
give an account of how multi-scale feature extraction ideas have evolved. We begin with
methods that utilize intelligent choice of scale, then proceed to edge focusing, and finally
describe schemes that extract information by examining the entire scale space.
2.1 Segmentation Using Wavelets
In this section we present three types of wavelet-based segmentation algorithms. These
are region-based, edge-based, and hybrid wavelet segmentation methods. Most region-
based segmentation methods that use wavelets segment images on the basis of texture.
We discuss a variety of edge-based algorithms, most of which are based on the work of
Mallat. We present only a short discussion of hybrid methods because these methods are
not very prevalent in the literature.
2.1.1 Region-Based Segmentation
Virtually all region-based wavelet segmentation methods are based on texture analysis.
In our discussion of texture-based methods we examine the motivation for texture-based
segmentation. A natural tool for texture analysis is the wavelet transform. We describe
how wavelet analysis is utilized to achieve segmentations based on texture and indicate
areas of application for texture-based segmentation using the wavelet transform.
The main idea of texture-based segmentation is that an image is made up of different
textures, "the set of local neighborhood properties of the gray levels of an image region"
[12], and each of these different textures can be described by a small number of
characteristic frequencies. The wavelet transform provides a tool that can extract these frequencies
locally within the regions of the image [13]. An overview of the use of wavelets for texture
analysis is provided in [12]. In that work the authors discuss the application of texture
analysis to feature extraction and the extension of texture-based feature extraction to
segmentation. Pixels in an image can be classified based on local texture characteristics.
Application of this classification to each pixel of an entire image produces regions in the
image that are invariant in local texture properties.
One example of fundamental work in this area comes from Laine et al. [14, 15, 16,
17]. Laine and Fan compare conventional texture analysis with wavelet-based texture
analysis [14, 16]. They develop a segmentation that clusters pixels of like texture by
grouping feature vectors in a wavelet space. They group feature vectors using a K-Means
algorithm. Many other authors use this approach with different wavelet transforms or
different clustering algorithms (e.g. [18, 19, 20, 21, 22, 23, 24, 20]).
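To make the idea of clustering wavelet feature vectors concrete, the sketch below groups the samples of a two-texture 1-D signal with K-means. It is a toy illustration under stated assumptions and not the method of Laine and Fan: the Haar-like detail filters stand in for a wavelet transform, and the test signal, the 16-sample energy window, the deterministic initial centers, and the function name `kmeans` are all hypothetical.

```python
import numpy as np

# Hypothetical 1-D "image": a slow sinusoid followed by a fast alternating texture.
n = np.arange(128)
x = np.concatenate([np.sin(2 * np.pi * n / 32), np.where(n % 2 == 0, 1.0, -1.0)])

# Haar-like detail coefficients at two scales (an undecimated stand-in for a
# wavelet transform, chosen only for brevity).
d1 = np.zeros_like(x)
d2 = np.zeros_like(x)
d1[:-1] = (x[:-1] - x[1:]) / 2
d2[:-3] = (x[:-3] + x[1:-2] - x[2:-1] - x[3:]) / 4

# Feature vector per sample: local energy of each detail band.
window = np.ones(16) / 16
features = np.stack(
    [np.convolve(d ** 2, window, mode="same") for d in (d1, d2)], axis=1
)

def kmeans(points, centers, iterations=20):
    """Plain Lloyd iterations: assign to the nearest center, then update centers."""
    for _ in range(iterations):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        centers = np.array([points[labels == k].mean(axis=0)
                            for k in range(len(centers))])
    return labels, centers

# Deterministic initialization from one sample in each half (an assumption).
labels, centers = kmeans(features, features[[0, 200]])

print(labels[:100].tolist().count(0), labels[150:240].tolist().count(1))
```

Samples well inside each texture fall into separate clusters; samples near the texture boundary, where the energy window straddles both textures, may go either way.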
Texture analysis has been applied to a myriad of image types. For example, due
to the nature of some medical images (e.g. each tissue type in medical images might
exhibit a certain texture), this segmentation method has become popular in that field.
Texture analysis with wavelets has been used to segment various types of medical images
[25, 26, 27, 28]. In other fields this scheme has been applied to various radar images
[29, 30, 31].
2.1.2 Hybrid Segmentation Using Wavelets
As mentioned, wavelets are utilized in only a few hybrid segmentation methods. We
discuss only two examples here. The first is by Ramos, Hemami, and Tamburro [32]. These
authors attempt to model the psycho-visual system of humans. They assert that there
are three image components of distinct perceptual significance to humans: strong edges,
smooth regions, and textured regions. They divide an image into 8x8 or 16x16 blocks, then
classify these blocks as strong edge, smooth, or textured regions. They extract multi-scale
information using the wavelet transform discussed in [10]. Several rules are applied to the
wavelet transform response of all blocks in an image to classify these blocks into one of
the previously mentioned categories.
Another hybrid algorithm integrates local fractal dimension (derived using the wavelet
transform) and edge information into a region growing method [33]. The authors state
that the fractal dimension is a measure of surface roughness (i.e. texture). They use the
Sobel operator to detect edges in the image. The fractal dimension information and edge
information control region growing in the image.
2.1.3 Edge-based Segmentation Using Wavelets
Mallat is the major contributor to edge-based segmentation using wavelets [10]. Most edge-
based segmentation algorithms that utilize wavelets use the wavelet transform developed
by Mallat in [10] to detect discontinuities (i.e. edges) in an image. This section focuses on
his work in [10], and at the end of this section we present a few examples of other related
work.
Mallat's goal is actually to create an image compression scheme. For this he proposes
to use an edge map in conjunction with information that describes the regions created
by edge contours in the edge map [10]. He shows that using the wavelet transform to
detect edges is similar to the Canny edge-detection method [5]. If the basis function of
the wavelet transform is the derivative of a Gaussian (DOG), then this method of edge
detection is equivalent to Canny edge detection.
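The connection to Canny's detector can be made explicit in one dimension. The lines below are a sketch in notation introduced here for illustration (the smoothing function θ, the wavelet ψ, the scale s, and the transform W_s f are our symbols, and the normalization is an assumption rather than a quotation from [10]). If the wavelet is the derivative of a smoothing function, the wavelet transform at scale s is proportional to the derivative of the signal smoothed at that scale:

```latex
\psi(x) = \frac{d\theta(x)}{dx}, \qquad
\psi_s(x) = \frac{1}{s}\,\psi\!\left(\frac{x}{s}\right), \qquad
\theta_s(x) = \frac{1}{s}\,\theta\!\left(\frac{x}{s}\right)

\frac{d\theta_s(x)}{dx} = \frac{1}{s^{2}}\,\theta'\!\left(\frac{x}{s}\right)
                        = \frac{1}{s}\,\psi_s(x)

W_s f(x) = (f * \psi_s)(x) = s\,\frac{d}{dx}\,(f * \theta_s)(x)
```

When θ is a Gaussian, the local extrema of |W_s f| are therefore the extrema of the derivative of the Gaussian-smoothed signal, which are the points a Canny-style detector marks as edges.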
Mallat's first step in creating an edge map using his wavelet-based edge detection is to
determine which wavelet function to use. In choosing the wavelet function Mallat shows
that the desired properties are compact support, symmetry, and one vanishing moment.
He uses the derivative of a cubic spline, which is a quadratic spline, as the basis function
of his wavelet transform.
Mallat applies his quadratic-spline wavelet transform to several test images. He maps
the modulus maxima of the transform as the edges of an image. He uses three scales of the
transform to construct the characterization of the image. His characterizations of images
are suitable for only those applications in which accuracy is not critical; detailed areas of
the image become blurred.
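Mapping modulus maxima amounts to keeping the points where the magnitude of the transform is a local maximum. The following 1-D sketch shows the idea for a single scale; the function name `modulus_maxima`, the threshold argument, and the sample response values are hypothetical and are not taken from [10].

```python
import numpy as np

def modulus_maxima(response, threshold=0.0):
    """Indices where |response| is a local maximum above the threshold."""
    mod = np.abs(response)
    interior = mod[1:-1]
    # Larger than the left neighbor and at least as large as the right one, so
    # a two-sample plateau is reported once, at its first sample.
    is_peak = (interior > mod[:-2]) & (interior >= mod[2:]) & (interior > threshold)
    return np.where(is_peak)[0] + 1

# Hypothetical single-scale transform of a signal with one rising edge
# (positive lobe) and one falling edge (negative lobe).
response = np.array([0.0, 0.1, 0.6, 0.2, 0.0, -0.1, -0.7, -0.3, 0.0])
edges = modulus_maxima(response, threshold=0.05)
print(edges)
```

Because the modulus is used, rising and falling edges are both reported; the sign of the response at each maximum still tells them apart.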
Many authors use Mallat's wavelet transform to perform edge detection. For example,
Sheng and Chevrette use Mallat's wavelet transform for object recognition [34]. They
train a neural network with contour information for several objects. They apply the
neural network to contours extracted from an image via Mallat's wavelet transform. The
neural network classifies the contours as one of the training objects. Other examples are
found in [35, 36, 37, 38, 39, 40].
In his Master's thesis Neiroukh [41] examines the use of the wavelet transform for
segmentation of range images. His method of obtaining the edge map is similar to Mallat's
in that he maps the modulus maxima of the quadratic-spline wavelet transform as the
edges of the image. He also uses only one scale of the wavelet transform as Mallat did.
After obtaining an edge map, Neiroukh links edges, labels regions, and merges regions
below a certain size with larger regions. He concludes that his method creates an accurate
segmentation of synthetic and real images even in the presence of noise.
Some authors perform edge detection using wavelet transforms that are not discussed
by Mallat. Aydin et al. use an M-band wavelet transform because they wish to detect
edges using second-order information [42, 43]. Applying this wavelet transform to an
image yields a map where zero crossings indicate edge points in the image (as opposed to
Mallat's transform in which edges are indicated by maxima). Their approach is similar
to the Marr and Hildreth technique [8]. They apply Teager's energy operator to reduce the
number of unwanted zero-crossings. Other examples of non-Mallat edge detection using
wavelets can be found in [44, 45, 46].
2.2 Multi-Scale Feature Extraction
Scale refers to the idea that objects in the world exist or are relevant over a limited
range of sizes or distances [9]. Scale is relevant to feature extraction because feature
extraction must occur at a particular scale. In this section we discuss in detail literature
that utilizes this multi-scale concept to achieve feature extraction for segmentation. The
breadth of ideas in the area of multi-scale feature extraction is substantial. We divide
multi-scale feature extraction methods into three broad categories: scale choice, scale traversal,
and collective analysis.
These three categories differ in the number of scales used to achieve a feature detection
and/or the manner in which the scale space is analyzed. We say that algorithms that utilize
one scale or a few scales are scale-choice methods. These algorithms may, for example,
determine the quality of edge detection at each scale, and then perform edge detection
at the best scale. We say that methods that perform an initial feature extraction at a
particular scale and then refine that feature extraction with many other scales are scale
traversal algorithms. One common scheme in this area is edge focusing. Finally, we identify
algorithms that extract features by examining the entire scale space at once as collective
analysis methods.
2.2.1 Scale Choice
First we examine algorithms that use one or two scales. This category has the smallest
representation in the literature. We present three algorithms that all use quite different
techniques to choose scale, but one characteristic divides these scale choice methods into
two groups. There are two types of scale decision approaches: global scale decisions and
local scale decisions. Techniques using global scale decisions determine a best scale for
the entire image. Methods using local scale decisions determine a best scale for a pixel or
(although no examples of this were found) a region.
Khashman and Curtis [47] develop a system that chooses the best scale for edge
detection of a particular range image. They apply a Laplacian of a Gaussian (LOG) operator
at seven different scales to training images. They manually determine the best scale for
these training images. A multi-layer perceptron neural network is trained with the
training images and the best scales for each image. After training, the network is applied to
new range images to determine the best scale for edge detection of these images. Another
scheme that globally chooses scale is described in [48].
Lindeberg uses local scale decision for edge detection. In other words, a best scale is
determined for pixels on an individual basis [49]. Then edge detection is performed on each
pixel at that pixel's chosen scale. He states that the spatial extent of corresponding image
structures can be indicated by local maxima over scale of normalized derivatives. He uses
these maxima as a guide to locally choose the best scale for edge and ridge detection.
He does this by determining the scale at which the normalized derivative of a point in
an image gives a maximum response. Then he applies the derivative operator at this scale.
Performing this process at each point in the image creates an edge map. He determines
a significance measure for each edge in this edge map. He shows final edge maps that
contain a finite number of the most significant contours.
Elder also uses local scale decision [50]. He asserts that a minimum reliable scale can
be determined if sensor noise statistics are known a priori. He states that at this minimum
reliable scale and all larger scales the likelihood of error in edge detection due to sensor
noise is below a certain tolerance. So, the idea is to find the minimum scale that blurs the
sensor noise (a small-scale feature).
Elder states that because smoothing delocalizes edges, the smallest scale that avoids
improper detection of noise as edges is the best scale for edge detection. Therefore, the
minimum reliable scale is the best scale for edge detection. Elder finds the minimum
reliable scale for both first-order and second-order edge detection operators at each pixel.
He then applies these edge detection operators at the chosen scales to create an edge
map. He refines this edge map using various techniques (e.g. contour closure). Another
example of local scale choice is [51].
2.2.2 Scale Traversal
Among the large variety of scale combining edge detection schemes proposed in the lit-
erature there are two basic schools of thought. One school believes that an initial edge
map is created at one scale and then the map is altered according to information from
other scales. The other proposes that information can be better extracted by examining
the entire scale space to create an edge map. This section discusses the idea of creating
an edge map at one scale and then traversing the scale space in order to refine the map.
Bergholm [4] proposed fundamental concepts under the label edge focusing in 1987.
This algorithm combines edge information by traversing from a coarse to a fine scale (and
is therefore referred to as a coarse-to-fine algorithm). He applies the Canny edge detector
at a larger scale and then applies the detector at a smaller scale in the vicinity of edges
found at the larger scale. This process is performed iteratively through decreasing scale.
After Bergholm's work several authors have used and expounded on this same basic
strategy. Lindeberg uses Bergholm's idea in conjunction with his concept of scale blobs
[52]. Lindeberg considers a scale space created by convolving a two-dimensional image with
Laplacian of a Gaussian (LOG) functions at different standard deviations. He considers the
scale parameter as continuous. Zero crossings occur at edge points in an image when the
image is convolved with a LOG function. These zero crossings can be traced through the
scale space. The zero crossings create closed regions in the image plane and in the scale
space, thus creating volumes or blobs in the three-dimensional space (two-dimensional
images over scale). Lindeberg creates significance values for the scale blobs in his scale
space. He traverses the scale space in a coarse-to-fine manner. He examines edges of the
most significant scale blobs. He traces or focuses these edges through the scale space to
create an edge detection.
One group, Williams and Shah [53], create contours at the largest scale by combining
what they determine to be the strongest edge points. They use four measures to determine
the strength of an edge point. One of these measures quantifies the level of noise around
the edge point. After determining the contours at the largest scale, they search lower
scales near each contour end point, in a direction similar to that of the contour at its
end point. If edge points exist in the correct position at lower scales, the small-scale edges are
added to the edge map.
Another group, Qian and Huang [54, 55], create an initial map from the smallest
scale and then add salient edges from larger scales. This method is referred to as a
fine-to-coarse algorithm. Qian and Huang begin with a small-scale edge map and add
only salient edges from each larger scale to the final edge map. They find edge points
at multiple scales using LOG filters. A multi-step process determines the salience of an
edge contour. The process determines edge strength based on gradient magnitude of edge
points in the Gaussian blurred image at a particular scale. Edge strengths are normalized
based on edge contour length. Thresholding these strength values based on a global noise
approximation determines edge salience.
Another approach traverses scale, but uses a different blurring technique. Whitaker
and Pizer assert that the use of Gaussian functions for blurring, which are used in a
significant number of multi-scale edge extraction techniques, has difficulties [56]. Gaussian
blurring, especially at large scales, causes inaccurate edge detection. They use edge-affected
diffusion in order to detect edges more accurately. This diffusion technique limits
blurring according to the presence of edges. They apply this technique at successively
smaller scales. Results for synthetic images show that edge-affected diffusion is an
appealing alternative to Gaussian blurring.
2.2.3 Collective Analysis
An alternative to methods that traverse scale space is to examine scale space collectively.
We found varied approaches to collective scale-space analysis. These different methods
apply techniques such as data fusion and pattern recognition to scale-space information
in order to extract features.
Wang [57] presents an algorithm which fuses scale-space information from a morphological
gradient operator in an additive manner. Wang's work concentrates on the advantages
of using a morphological gradient operator and the application of a morphological
watershed algorithm to create a segmentation from his edge map. His gradient operator
is the difference of a signal eroded by a structuring element and the signal dilated by the
same structuring element of scale i. Before summing the scale-space information for this
operator, the result of this operator is eroded by the structuring element of scale i - 1.
Because all scales are weighted equally in this method, the maximum scale used greatly
affects the results.
Baxter and Coggins present a different approach [58]. They assert that pixels can be
classified into different regions based on properties in space and scale. They represent
pixels in an image by patterns or vectors in an n-dimensional feature space. The feature
vectors are the outputs of a series of n spatial filters at a particular spatial location. They
choose filters for which the pixels they wish to classify form clusters in the corresponding
feature spaces.
They apply their system to two different image types. They analyze objective prism
images, which are obtained by inserting a prism in a telescope. These images record
stellar spectra. The authors wish to segment the stellar spectra from the background.
(Conventional thresholding methods are not applicable because of the magnitude
of the noise.) They apply two filters to the astronomical images to create a two-dimensional
feature space. With this feature space they are able to differentiate spectral information
from noise. They also analyze cell images. They create a ten-dimensional space using
isotropic Gaussian filters. They attempt to segment nucleolar organizer regions (NORs)
and nuclei from the background of the cell images. (Conventional thresholding is not
applicable due to the presence of objects of similar value.) They use supervised pixel
classification to classify all the pixels in a cell image. A cell image is manually labeled
to serve as a training image. Training feature vectors form two classes (one representing
NORs and the other representing nuclei) in the ten-dimensional space. If a pixel in an
image being analyzed is close to a training class in the feature space, the pixel is labeled
as belonging to that class. If a pixel is not close to either class, it is labeled as
background. They conclude that classification for the cell images is not perfect.
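The labeling rule just described (assign a pixel to a nearby training class, otherwise call it background) can be sketched as a nearest-class-mean classifier with a reject option. This is only an illustration: the summary above does not say how Baxter and Coggins measure closeness, so the Euclidean distance, the use of class means, and the names `classify_with_reject` and `max_distance` are all assumptions.

```python
import numpy as np

def classify_with_reject(features, class_means, max_distance):
    """Label each feature vector with the index of the nearest class mean,
    or -1 (background) when no class mean lies within max_distance.

    Assumed, not from the source: Euclidean distance and one mean per class.
    """
    features = np.asarray(features, dtype=float)   # shape (n_pixels, n_filters)
    means = np.asarray(class_means, dtype=float)   # shape (n_classes, n_filters)
    # Distance from every pixel's feature vector to every class mean.
    dist = np.linalg.norm(features[:, None, :] - means[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    return np.where(dist.min(axis=1) <= max_distance, nearest, -1)
```

For example, with class means at (0, 0) and (10, 10) and `max_distance=2`, a pixel at (5, 5) is far from both classes and is labeled background.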
Truchetet, Laligant, Bourennane, and Miteran [6] present an approach that uses a
statistical method to determine the class (edge vs. non-edge) of points in an image.
Directional multi-scale gradient information of known edge points in an image creates a
scale space which is characterized by a statistical classifier. Each training edge point is a
feature vector in an N-dimensional space (N being the number of scales used). Truchetet et al.
fit hyperrectangles to the edge points in the scale space. After training, points in a new
image can be classified by the system as edge or non-edge.
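To make the hyperrectangle idea concrete, here is a deliberately simplified sketch that fits a single axis-aligned box to the training edge vectors and labels a new point as edge when it falls inside that box. Truchetet et al. fit several hyperrectangles with their own statistical procedure; the single bounding box and the names `fit_box` and `is_edge` are assumptions made for illustration only.

```python
import numpy as np

def fit_box(train_edge_vectors):
    """Fit one axis-aligned hyperrectangle (per-scale min and max) to the
    training edge vectors. A single box is a simplification, not the
    published method."""
    v = np.asarray(train_edge_vectors, dtype=float)  # shape (n_points, n_scales)
    return v.min(axis=0), v.max(axis=0)

def is_edge(vectors, box):
    """Return True for each vector that lies inside the box on every scale."""
    low, high = box
    v = np.asarray(vectors, dtype=float)
    return np.all((v >= low) & (v <= high), axis=1)
```

A crisp decision of this kind is what distinguishes such classifiers from a fuzzy classifier, which would instead return a degree of membership.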
Pereira and Manolakos use a similar approach in that they classify pixels in an image
as edge or non-edge using scale-space information, but their method uses both a different
wavelet transform and a different classification method. They begin by applying the
wavelet transform with one of Daubechies' wavelets to create a scale space for a simple
synthetic training image. A hierarchical feed-forward neural network is trained with the
scale-space information of only the edge points of this training image. After training, the
network is ready to classify points in a new image as edge or non-edge points. Analysis of
their system includes application to two real images and an investigation of the effects of
noise in images being analyzed.
Haring, Viergever, and Kok [59] propose a method similar to that of Truchetet et al., but they use
a classifier to create a segmentation rather than an edge detection. They analyze multi-scale
differential geometrical invariant information using a Kohonen network. They apply
several differential geometrical invariants to Gaussian smoothed images to create various
scale spaces. These scale spaces are used to create feature vectors for each pixel in an
image. Such feature vectors from a simple synthetic image are used to train the Kohonen
network. After training the network, they segment synthetic and real images.
These examples demonstrate the breadth of methods that researchers have developed
based on the idea of multi-scale fusion for segmentation. Our work falls into the second
school of thought and is most closely related to the work of Truchetet et al. as well as
Haring et al. One main difference between these works and ours is that we use a
fuzzy classifier rather than a crisp classifier to determine the degree of membership in our
feature class. Haring et al. classify pixels into particular segment classes and Truchetet
et al. classify points as edges or non-edges.
CHAPTER 3
Multi-Scale Feature Extraction
The wavelet transform creates a set of multi-scale derivatives for an image. We com-
bine this derivative information to create two feature-detection operators: the step-edge
detector and the crease-edge detector. In this chapter we describe how these two operators
are derived from the wavelet transform. We introduce the watershed algorithm as a tool
for obtaining segmentation from these two edge-detection operators. We describe some
observations of the multi-scale properties for these operators. We also present a simple
method for combining multi-scale edge information. This method is based on fuzzy fu-
sion. We analyze the behavior of this multi-scale fusion method and give motivations for
developing a more sophisticated method to combine multi-scale information.
There are two basic schemes for representing the wavelet transform of a signal: the
pyramidal scheme [60] and the convolution scheme [10]. Most wavelet applications use the
pyramidal implementation of the wavelet transform, which involves subsampling at each
scale. We use the convolution implementation, which involves no subsampling of data.
The convolution implementation creates a group of signals (one signal representing each
scale) that are all the same size (figure 3.1).
Figure 3.1: Representations of the Wavelet Transform. (Panels: Pyramidal Implementation; Convolution Implementation. Axis: Increasing Scale.)
The convolution representation of the wavelet transform of a signal f(x) is defined by
equation 3.1, where $*$ is the convolution operator and $\psi_s(x)$ is the chosen wavelet function
at a particular scale s. We use the quadratic-spline wavelet [10] with equation 3.1 to create
a scale space of first-order difference information. We choose the quadratic-spline wavelet
because it has the following desirable properties: compact support, symmetry, differentiability
(to a finite degree), and one zero crossing (see appendix A for more details). We
combine the scale spaces from this wavelet transform to create edge-detection operators.
\[ W_s[f(x)] = f(x) * \psi_s(x) \tag{3.1} \]
Equation 3.1 can be implemented by convolving a signal with a scaling function and
then applying a derivative operator [10]. A scaling function, $\phi_s(x)$, is a smoothing function
that meets the criteria
\[ \int_{-\infty}^{\infty} \phi(x)\,dx = 1 \tag{3.2} \]
and
\[ \exists\, \epsilon > 0 : \phi(x) = 0, \quad \forall\, |x| > \epsilon. \tag{3.3} \]
For the quadratic-spline wavelet, the smoothing function is the cubic-spline function [10].
The quadratic-spline function is the derivative of the cubic-spline function. A signal f(x) is
smoothed to scale s by convolution with $\phi_s(x)$. The wavelet transform of a one-dimensional
signal f then becomes
\[ W_s[f(x)] = \frac{\partial (f * \phi_s)(x)}{\partial x} \tag{3.4} \]
We extend this implementation of the wavelet transform to two-dimensional signals.
Consider the signal f(i,j) to be an image. The indices of the image are i, the vertical
index, and j, the horizontal index. A two-dimensional scaling function $\phi(i,j)$ is employed
to find the wavelet transform. To create first-order derivative approximations, the scaling
function is applied to the image at scale s and then the derivative operator is applied to the
signal in the desired direction. Equation 3.5 shows application of the wavelet transform
for the two-dimensional case.
\[
\begin{aligned}
W^{i}_s[f(i,j)] &= \frac{\partial (f * \phi_s)(i,j)}{\partial i} \\
W^{j}_s[f(i,j)] &= \frac{\partial (f * \phi_s)(i,j)}{\partial j}
\end{aligned}
\tag{3.5}
\]
Applying the wavelet transform can also create second-order derivative approximations.
We apply the same cubic-spline scaling function at scale s and then apply the
proper partial derivative operator. Equation 3.6 shows second-order partial derivative
approximations obtained using the wavelet transform. Equations 3.5 and 3.6 are used to
create step-edge and crease-edge operators.
\[
\begin{aligned}
W^{ii}_s[f(i,j)] &= \frac{\partial^2 (f * \phi_s)(i,j)}{\partial i^2} \\
W^{jj}_s[f(i,j)] &= \frac{\partial^2 (f * \phi_s)(i,j)}{\partial j^2} \\
W^{ij}_s[f(i,j)] = W^{ji}_s[f(i,j)] &= \frac{\partial^2 (f * \phi_s)(i,j)}{\partial i\,\partial j}
\end{aligned}
\tag{3.6}
\]
We want to create a scale space of wavelet-transform information. Instead of creating
smoothing functions for each scale and convolving the smoothing functions with an image,
we create a scale space using a recursive method [60]. The u filter is a smoothing kernel
and the v filter is a difference operator.
Applying these filters yields a scale space where the scale parameter is discrete. We
choose not to sample the scale space uniformly. A typical sampling that reduces the
information at larger scales in a natural manner is dyadic sampling [10, 60]. In dyadic
sampling, scale varies along the dyadic sequence $s = (2^y)_{y \in \mathbb{N}}$, where s is scale [10]. We
define scale one as application of v directly to the original image.
Figure 3.2 illustrates the application of these two filters to create a dyadic sampling
of the wavelet-transform scale space. This figure shows the creation of the scale space of
$W^{i}_s[f(i,j)]$. Applying v(j) in place of v(i) yields $W^{j}_s[f(i,j)]$. Applying v(i) twice rather
than once yields $W^{ii}_s[f(i,j)]$. All the wavelet transforms in equation 3.6 can be created
by changing the application of v.
Figure 3.2: Implementation of the Wavelet Transform. (Diagram labels: input f(i,j); smoothing stages 2 u(i,j), 2 u(i,j), 4 u(i,j), 8 u(i,j); difference filter v(i) applied at each stage; outputs at scales 1, 2, 4, 8, 16. Legend: a box X means convolve with filter X; a box N X means N convolutions with filter X.)
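The recursive construction can be sketched in a few lines of NumPy. The sketch rests on several assumptions that are not taken from the thesis: the smoothing kernel `U` is a 3-tap binomial filter, the difference kernel `V` is a central difference, the per-stage smoothing counts (2, 2, 4, 8) are one reading of the labels in figure 3.2, and borders are zero padded. The actual u and v filters come from the quadratic-spline wavelet of [10, 60].

```python
import numpy as np

U = np.array([1.0, 2.0, 1.0]) / 4.0   # assumed smoothing kernel, stands in for u
V = np.array([1.0, 0.0, -1.0]) / 2.0  # assumed central difference, stands in for v

def convolve_axis(image, kernel, axis):
    """Convolve every 1-D slice along `axis`; output has the input's shape
    (no subsampling, zero padding at the borders)."""
    return np.apply_along_axis(
        lambda line: np.convolve(line, kernel, mode="same"), axis, image)

def smooth(image, times):
    """Apply the separable smoothing kernel `times` times along both axes."""
    for _ in range(times):
        image = convolve_axis(convolve_axis(image, U, 0), U, 1)
    return image

def scale_space_i(image, stage_counts=(2, 2, 4, 8)):
    """Return {scale: approximation of W^i_s[f]} on a dyadic set of scales.

    Scale one applies the difference filter directly to the image; each later
    scale first applies more smoothing to the running smoothed image.
    """
    smoothed = np.asarray(image, dtype=float)
    scale = 1
    result = {scale: convolve_axis(smoothed, V, 0)}
    for count in stage_counts:
        smoothed = smooth(smoothed, count)
        scale *= 2
        result[scale] = convolve_axis(smoothed, V, 0)
    return result
```

Passing axis 1 instead of axis 0 to the final `convolve_axis` call would give the horizontal transform, mirroring the substitution of v(j) for v(i) described above. Every output has the same size as the input, which is the defining property of the convolution implementation.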
3.1 Step-Edge Detection
We define step edges in a range image as discontinuities in range. As shown in figure
1.1, a step edge in a 1-D signal can be detected using the described wavelet transform.
The results of applying the wavelet transform in both the vertical and horizontal
directions are combined to create a step-edge detector (equation 3.7). Application of the
step-edge operator renders an image in which the extrema represent step-edge contours
(figure 3.3). We say that the image resulting from application of the step-edge operator
at scale s is the fuzzy edge map $g_s$:
\[ |\nabla f| = \left( \left( W^{i}_s[f] \right)^2 + \left( W^{j}_s[f] \right)^2 \right)^{1/2}. \tag{3.7} \]
Figure 3.3: A synthetic image containing step edges: (a) a synthetic 2-D signal (i.e. image) containing step edges, (b) $g_2$ of (a).
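As a small illustration of equation 3.7, the sketch here combines a vertical and a horizontal derivative image into a gradient-magnitude edge map. The function name `step_edge_map` is hypothetical, and in the usage that follows `np.gradient` (plain finite differences) is only a stand-in for the wavelet transforms $W^{i}_s$ and $W^{j}_s$.

```python
import numpy as np

def step_edge_map(w_i, w_j):
    """Fuzzy step-edge map: pointwise magnitude of the two directional
    derivative images, as in equation 3.7."""
    return np.hypot(w_i, w_j)

# Usage with an 8x8 image whose right half is one unit higher than the left.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
w_i, w_j = np.gradient(image)   # stand-in derivatives along rows and columns
g = step_edge_map(w_i, w_j)     # nonzero only next to the vertical step
```

The response is proportional to the size of the step, which is why the result is treated as a fuzzy rather than a binary edge map.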
3.2 Crease-Edge Detection
We define crease edges in a range image as discontinuities in surface normal of range
[61]. The 1-D wavelet transform is applied to the image several times to find the gradient of
the surface normal, GSN. Assuming we have 3-D data as vectors in a Cartesian coordinate
system, we can calculate the GSN.
A three-dimensional position vector is
\[ \vec{p} = [\; x(i,j) \;\; y(i,j) \;\; z(i,j) \;] \tag{3.8} \]
Each point in a range image represents a vector in 3-D space. Some processing may be
required to extract these vectors from a range image [62]. We can think of all the vectors
for a single range image as three images, one image for each component of the vector.
From these three images we can calculate the GSN.
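An approximation to a surface normal vector at a point (i,j) in an image f is
\[ \vec{N}(i,j) = [\; N_x(i,j) \;\; N_y(i,j) \;\; N_z(i,j) \;] \tag{3.9} \]
where
\[ N_x(i,j) = \frac{n_x(i,j)}{\sqrt{n_x(i,j)^2 + n_y(i,j)^2 + n_z(i,j)^2}} \tag{3.10} \]
\[ N_y(i,j) = \frac{n_y(i,j)}{\sqrt{n_x(i,j)^2 + n_y(i,j)^2 + n_z(i,j)^2}} \tag{3.11} \]
\[ N_z(i,j) = \frac{n_z(i,j)}{\sqrt{n_x(i,j)^2 + n_y(i,j)^2 + n_z(i,j)^2}} \tag{3.12} \]
and
\[ n_x(i,j) = W^{i}_s[y(i,j)]\, W^{j}_s[z(i,j)] - W^{i}_s[z(i,j)]\, W^{j}_s[y(i,j)] \tag{3.13} \]
\[ n_y(i,j) = W^{i}_s[z(i,j)]\, W^{j}_s[x(i,j)] - W^{i}_s[x(i,j)]\, W^{j}_s[z(i,j)] \tag{3.14} \]
\[ n_z(i,j) = W^{i}_s[x(i,j)]\, W^{j}_s[y(i,j)] - W^{i}_s[y(i,j)]\, W^{j}_s[x(i,j)] \tag{3.15} \]
The magnitude of the gradient of a surface normal vector is
\[
\|\nabla \vec{N}\| =
\left(\frac{\partial N_x}{\partial x}\right)^2 + \left(\frac{\partial N_x}{\partial y}\right)^2 +
\left(\frac{\partial N_y}{\partial x}\right)^2 + \left(\frac{\partial N_y}{\partial y}\right)^2 +
\left(\frac{\partial N_z}{\partial x}\right)^2 + \left(\frac{\partial N_z}{\partial y}\right)^2
\tag{3.16}
\]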
This is our crease-edge operator. Application of this operator for an image at scale s
creates a fuzzy edge map hs. Figure 3.4 shows hs for a simple synthetic image.
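Equations 3.9 through 3.16 can be sketched as follows. This is an illustration under stated assumptions: `np.gradient` replaces the wavelet transforms, the derivatives in equation 3.16 are taken along the image indices i and j rather than along x and y, and the function names are hypothetical. The unnormalized normal is the cross product of the two directional derivatives of the position vector, exactly as in equations 3.13 to 3.15.

```python
import numpy as np

def surface_normals(x, y, z):
    """Unit surface normals from the three coordinate images (eqs. 3.9-3.15),
    with np.gradient standing in for W^i_s and W^j_s."""
    x_i, x_j = np.gradient(x)
    y_i, y_j = np.gradient(y)
    z_i, z_j = np.gradient(z)
    n_x = y_i * z_j - z_i * y_j
    n_y = z_i * x_j - x_i * z_j
    n_z = x_i * y_j - y_i * x_j
    norm = np.sqrt(n_x**2 + n_y**2 + n_z**2)
    norm[norm == 0] = 1.0          # avoid division by zero on degenerate points
    return n_x / norm, n_y / norm, n_z / norm

def crease_edge_map(x, y, z):
    """Sum of squared derivatives of the unit normal components (eq. 3.16),
    differentiating along the image axes (an assumption of this sketch)."""
    h = np.zeros_like(z, dtype=float)
    for component in surface_normals(x, y, z):
        d_i, d_j = np.gradient(component)
        h += d_i**2 + d_j**2
    return h
```

On a roof-shaped surface made of two planes, the map is zero on each plane, where the normal is constant, and peaks along the ridge where the normal changes direction.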
3.3 Segmentation from Feature Extraction
Our goal is image segmentation, and up to this point we have discussed only feature
detection. We need a method to create a segmentation from our edge detection results.
The step-edge and crease-edge operators both perform fuzzy feature extraction: the operators
do not give a crisp (or binary) response; they render a response proportional to the
magnitude of a feature (i.e. discontinuity of surface geometry). We refer to any image
consisting of fuzzy edge information as a fuzzy edge map.
Figure 3.4: A synthetic image containing crease edges: (a) a synthetic image containing crease edges, (b) $h_1$ of (a).
Conventional thresholding creates a binary edge map from a fuzzy edge map by thresh-
olding the fuzzy values. Segmentation can be derived from the binary edge map by link-
ing edges in this binary edge map and then labeling areas enclosed by edges as regions.
Thresholding a fuzzy feature map eliminates responses to a feature detector that are below
a certain value. In the case of edge detection, edges of low discontinuity are eliminated.
Edges of low discontinuity can represent important features in an image and should not
necessarily be eliminated in a final segmentation. We choose a segmentation algorithm
that finds regions bounded by local maxima in a fuzzy edge map.
We apply a morphological watershed algorithm to a fuzzy edge map to create a segmentation
[63]. We use the watershed algorithm described in [62]; here we give a short
synopsis of this algorithm. If a symbolic drop of water is placed at a local maximum in a fuzzy
edge map it will drain down to a regional minimum. All the pixels in the path of this drop
can be associated with the respective regional minimum. The watershed algorithm performs
this analysis for every regional maximum in the image, thus forming catchment basins that
identify distinct regions having different labels (figure 3.5). So when applied to a fuzzy
edge map the watershed will form distinct regions bounded by areas of relatively high
feature values, i.e. a segmentation.
Figure 3.5: Region formed from catchment basin. (Diagram labels: high feature values; catchment basin: labeled region.)
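The drop-of-water description can be sketched as a steepest-descent labeling. This is not the algorithm of [62]; it is a simplified stand-in in which every pixel follows its lowest 8-connected neighbor until it reaches a pixel with no lower neighbor, and all pixels draining to the same minimum share a label. The function name is hypothetical, and flat plateaus are not merged, so each plateau pixel becomes its own basin.

```python
import numpy as np

def watershed_by_descent(edge_map):
    """Label catchment basins by following the steepest descent from each pixel."""
    rows, cols = edge_map.shape
    labels = np.zeros((rows, cols), dtype=int)   # 0 means not yet labeled

    def lowest_neighbor(i, j):
        # Return the lowest pixel in the 3x3 neighborhood, or (i, j) itself
        # when no neighbor is strictly lower.
        best = (i, j)
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (0 <= ni < rows and 0 <= nj < cols
                        and edge_map[ni, nj] < edge_map[best]):
                    best = (ni, nj)
        return best

    next_label = 1
    for i in range(rows):
        for j in range(cols):
            path = []
            pos = (i, j)
            while labels[pos] == 0:
                path.append(pos)
                nxt = lowest_neighbor(*pos)
                if nxt == pos:                   # a minimum: start a new basin
                    labels[pos] = next_label
                    next_label += 1
                    break
                pos = nxt
            for p in path:                       # label the whole drainage path
                labels[p] = labels[pos]
    return labels
```

Applied to a fuzzy edge map, the high edge responses act as ridges that separate the basins, so no threshold is needed to obtain closed regions.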
3.4 Fuzzy Reasoning Approach to Scale-Space Fusion
This section begins with a discussion of scale space and edge-detection operators. Using
a synthetic image containing only step edges we look at the scale space of the step-edge
detector. One possible method for combining the multi-scale data is fuzzy fusion. This
section elaborates on this idea, gives a simple experiment, and then discusses attributes of
the fuzzy fusion scheme.
To analyze the step-edge operator we create the step image, and we show which points
in this image should be detected as edges (figure 3.6). This image contains 196 blocks
created using a random number generator (uniform distribution). The minimum block
magnitude is 1 and the maximum magnitude is 255. The blocks placed adjacent to one
another create different step magnitudes in the image. We add zero-mean Gaussian noise
(standard deviation = 10) to the step image to create the step10 image (figure 3.6c). In
all synthetic images that have no noise and are piece-wise flat, the gradient magnitude is
a "perfect" step-edge detector. This could be implemented using finite differences or, in
our case, the smallest scale of the wavelet transform. The gradient magnitude of the step
image shows the points that should be detected as edges (figure 3.6b).
Figure 3.6: The step and step10 images: (a) the step image (size 256x256), (b) image indicating edge points used to create scale-space projections from the step image (black points represent edge points and white points represent non-edge points), (c) the step10 image containing added Gaussian noise ($\sigma = 10$), and (d) $g_1$ of (c).
Figure 3.7 shows the scale space of the step-edge operator for the step10 image. In
this �gure black represents the highest response to the step-edge operator and white values
represent the lowest response to the step-edge operator.
Figure 3.8 shows the result of applying the watershed algorithm to the scale space
of the step10 image. Figures 3.7 and 3.8 reveal three attributes of the scale space for the
step-edge operator:
• Edges are less accurately detected as scale increases.
• Noise is reduced as scale increases.
• Scales greater than some value seem to present no new useful information.
Scale one in this case contains too much noise to derive a good segmentation, but
examine the second scale: the noise is reduced, yet most edges are present. The largest
scales render bad segmentations because of very low accuracy in the detection of edges.
One researcher has proposed that the more salient an edge, the longer it survives
in scale space [52]. For example, a significant edge will produce a response at a greater
number of scales than an edge created from noise. Noise survives within a small region in
the scale space (at low scales), whereas significant edge features are usually present at a
great number of scales. If a scheme could be employed to choose edges that survive well in
the scale space, an improved edge detection could be achieved.
Because the edge maps contain fuzzy values, fuzzy set theory can be applied [64]. Fuzzy
set theory is a logical paradigm within which to develop a scale-space combination scheme.
A fuzzy set union could create the desired fused-scale edge map. We use Bernoulli's rule
of combination [65] to fuse scale space because it has no free parameters and it allows
the combination of more than two values (by recursive application). Bernoulli's rule of
combination gives the union of two values $a \in [0,1]$ and $b \in [0,1]$ (equation 3.17).
\[ 1 - (1 - a)(1 - b) \tag{3.17} \]
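A short sketch of equation 3.17 and its recursive application to more than two values follows. The function names `bernoulli_union` and `fuse` are hypothetical, and starting the recursion from 0 (the identity of this union) is a convenience of the sketch.

```python
from functools import reduce

def bernoulli_union(a, b):
    """Fuzzy union of two membership values in [0, 1] (equation 3.17)."""
    return 1.0 - (1.0 - a) * (1.0 - b)

def fuse(values):
    """Combine any number of membership values by recursive application."""
    return reduce(bernoulli_union, values, 0.0)

fused = fuse([0.5, 0.5, 0.5])   # 1 - 0.5 * 0.5 * 0.5 = 0.875
```

Because the union of two values in [0, 1] is never smaller than either of them, each additional scale can only raise the fused response at a point. This is consistent with the observation later in this section that the fused scale space retains small-scale information, including noise.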
Figure 3.7: Scale space of the step-edge operator applied to the step10 image. The scale values are (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, (f) 32, (g) 64, and (h) 128. In these figures black represents the highest and white the lowest response to the step-edge operator. Gray values between white and black represent values that are between the lowest and highest responses respectively.
Figure 3.8: Watershed algorithm applied to step-edge operator images from the step10 image. The scale values are (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, (f) 32, (g) 64, and (h) 128.
We can recursively fuse the scale space in figure 3.7 using Bernoulli's rule of combination.
In order to apply Bernoulli's rule of combination we need edge maps that are fuzzy
and contain values on the interval [0,1]. Our edge maps at this point are fuzzy, but are
not bounded. So to apply Bernoulli's rule of combination we must first remap the edge
maps to the interval [0,1]. A simple way to map the data to the [0,1] interval is to perform
a linear remapping. Consider $f_i$ to be the initial fuzzy edge map, $f_{min}$ the minimum of
all the values in $f_i$, and $f_{max}$ the maximum of all the values in $f_i$. We apply equation
3.18 to obtain an edge map of values on the interval [0,1], $f_o$.
\[ f_o = \frac{f_i - f_{min}}{f_{max} - f_{min}} \tag{3.18} \]
We examine the results of scale-space fusion using Bernoulli's rule of combination for
the step10 image. After remapping each scale we apply Bernoulli's rule of combination
to the scale space of the step10 image. Figure 3.9 shows the fused edge maps. We apply
the watershed algorithm to the edge maps to create segmentations (figure 3.10).
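The remapping and fusion steps of this experiment can be sketched with NumPy as follows. The function names are hypothetical, the scale space is assumed to be a list of equally sized edge maps ordered from the smallest scale upward, and mapping a constant edge map to all zeros is a choice made here to avoid division by zero, not something the text specifies.

```python
import numpy as np

def remap_unit_interval(edge_map):
    """Linear remapping of an edge map onto [0, 1] (equation 3.18)."""
    f_min, f_max = edge_map.min(), edge_map.max()
    if f_max == f_min:                       # constant map: assumed to carry no edges
        return np.zeros_like(edge_map, dtype=float)
    return (edge_map - f_min) / (f_max - f_min)

def cumulative_fusion(scale_space):
    """Return one fused map per scale: entry k fuses scales 0..k with
    Bernoulli's rule of combination, applied pointwise."""
    fused = []
    running = np.zeros_like(scale_space[0], dtype=float)
    for edge_map in scale_space:
        remapped = remap_unit_interval(edge_map)
        running = 1.0 - (1.0 - running) * (1.0 - remapped)
        fused.append(running)
    return fused
```

Each entry of the returned list after the first corresponds to one fused edge map of the kind the experiment produces (scales 1 and 2, scales 1 to 4, and so on).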
The results in figure 3.10 show that the fused scale space has similar attributes to the
non-fused scale space:
• Edges are less accurately detected as scale increases.
• Noise is reduced as scale increases.
• Scales greater than some value seem to present no new useful information.
The difference between the fused scale space and the non-fused scale space is that the fused
scale space appears to retain more small-scale information as scale increases. This seems
reasonable because one scale in the fused scale space represents the fusion of that particular
scale and all lower scales of the non-fused space. The fused scale space demonstrates more
accurate detection of edges as scale increases than the non-fused space. The fused space
also seems to retain more noise as scale increases as compared to the non-fused scale space.
Figure 3.9: Fuzzy fusion (Bernoulli's Rule of Combination) applied to scale space of step10 image. Result for fusion of scales: (a) 1 and 2, (b) 1 to 4, (c) 1 to 8, (d) 1 to 16, (e) 1 to 32, (f) 1 to 64, and (g) 1 to 128.
Figure 3.10: Watershed algorithm applied to fuzzy fusion (Bernoulli's Rule of Combination) result for step10 image. The scales fused are (a) 1 and 2, (b) 1 to 4, (c) 1 to 8, (d) 1 to 16, (e) 1 to 32, (f) 1 to 64, and (g) 1 to 128.
For example, we compare scale eight of the non-fused space (figure 3.7d) to the fused result
of scales one to eight (figure 3.9c). Scales above scale two of the non-fused scale space
(figure 3.8b) appear to present no new useful information, whereas scales above scale
eight of the fused scale space (figure 3.10c) appear to present no new useful information.
This also indicates that the fused scale space retains more small-scale information than
the non-fused scale space.
This scale-space fusion scheme can be applied to real data. We process and examine
the scale space of the real hydro3 image in the same way we processed and examined
the step10 image. The hydro3 data is acquired using a Perceptron laser range scanner
[66, 67]. Figure 3.11 shows the range image. Figure 3.12 shows the scale space created by
applying the step-edge operator to this image. We apply the watershed algorithm to each
edge map in figure 3.12 to yield a segmentation for each scale (figure 3.13).
The results of the step-edge operator followed by the watershed are clearly different
for real data. The real data presents a more challenging case for edge detection. The
noise present in this real data is greater than the noise of the step10 image and does
not diminish so readily with increasing scale. Smaller features are present in this image.
These small features are smoothed away at higher scales, so preservation of these features
while reducing noise makes edge detection challenging for this image.
When applying the fusion scheme to the scale space of the hydro3 image (figures
3.14 and 3.15), we find that, as with the step10 image, the fused scale space has similar
attributes to the non-fused scale space. These attributes are that edges are less accurately
detected as scale increases, noise is reduced as scale increases, and scales greater than some
value seem to present no new useful information. In addition, as with the step10 image,
the main difference between the non-fused scale space and the fused scale space is that the fused
scale space appears to retain more small-scale information as scale increases. Because the
hydro3 image contains noise of greater magnitude, noise (a small-scale feature) diminishes
less quickly as scale increases in the fused scale space. Figure 3.13e contains fewer segments
created by noise than does figure 3.15d.
Figure 3.11: The hydro3 image (size 256x256).
Figure 3.12: Scale space of the step-edge operator applied to the hydro3 image. The scale values are (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, (f) 32, (g) 64, and (h) 128.
Figure 3.13: Watershed algorithm applied to step-edge operator images from the hydro3 image. The scale values are (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, (f) 32, (g) 64, and (h) 128.
Figure 3.14: Fuzzy fusion (Bernoulli's Rule of Combination) applied to scale space of hydro3 image. Result for fusion of scales: (a) 1 and 2, (b) 1 to 4, (c) 1 to 8, (d) 1 to 16, (e) 1 to 32, (f) 1 to 64, and (g) 1 to 128.
Figure 3.15: Watershed algorithm applied to fuzzy fusion (Bernoulli's Rule of Combination) result for hydro3 image. The scales fused are (a) 1 and 2, (b) 1 to 4, (c) 1 to 8, (d) 1 to 16, (e) 1 to 32, (f) 1 to 64, and (g) 1 to 128.
The fused scale space for the hydro3 image shows some improvement over the non-
fused scale space, but the fusion scheme weights all scales equally. This means that a
feature that is detected at only one or a few scales may not be present in the fused edge
map. If scale space can be combined in a way that gives preference to scales containing
significant edge information, this combination may have advantages over a simple fuzzy-
fusion method.
In the next chapter we further explore the scale spaces for the step-edge and crease-edge
detectors that result from application of the wavelet transform. This exploration results
in a segmentation method that combines scales in a way that places greater emphasis on
scales that contain significant edge information. We develop these ideas into a somewhat
more sophisticated scale-space combination scheme.
CHAPTER 4
Scale-Space Combination Algorithm
In this chapter we analyze scale-space data from both the step-edge and crease-edge
operators. We present our analysis in a way that motivates the strategy behind the
development of a segmentation system. The result is the Pattern Analysis of Scale Space
for Extraction of Features (PASSEF) system. PASSEF is centered around the following
three ideas:
• We combine scale-space information derived from the wavelet transform to segment
range images.
• We train a statistical pattern-recognition system with points from a training image;
this system can then determine the degree of edgeness (or non-edgeness) for points
in a range image, thus creating a fuzzy edge map.
• Two fuzzy edge maps, representing step edges and crease edges, are combined to
create a comprehensive edge detection.
This chapter describes in detail the development of these three ideas. We first examine
and analyze scale-space information for two edge-detection operators, $g_s$ and $h_s$, derived
from the wavelet transform. Through analysis we show the motivation for choosing statistical
pattern recognition to derive edge detection from scale-space data. We discuss
the overall PASSEF system architecture, and a detailed discussion reveals specifics of the
system. The fusion of the crease-edge and step-edge maps created by the PASSEF system,
$P_h$ and $P_g$, is explained at the end of the chapter.
4.1 Analysis of the Scale Space of the Step-Edge Operator
We want to detect step edges from scale-space data of the step-edge operator. We
define a scale-space signature as the vector of measurements at different scales taken at a
single point, (i, j), in the image (figure 4.1). This is similar to the work of [68].
[Figure 4.1 diagram; labels: Block Image, Wavelet Transform, Scale Space, Scale Signature (axes: Scale, Magnitude).]
Figure 4.1: Creation of a scale-space signature from scale-space data.
First we examine these scale-space signatures for step edges. A synthetic image is
created that contains a square area of magnitude 1 and a background area of magnitude
0. Gaussian noise of standard deviation equal to 0.1 is added to create the block image
(figure 4.2a).
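A test image of this kind is easy to reproduce. The following is a minimal sketch in Python with NumPy, not the code used for the thesis. The function name `make_block_image`, the position and extent of the square, and the random seed are assumptions made for illustration; only the image size (25x25), the two magnitudes, and the noise level come from the text.

```python
import numpy as np

def make_block_image(size=25, square=(8, 17), sigma=0.1, seed=0):
    """Return (clean, noisy) versions of a square-on-background test image.

    The square's placement is hypothetical; the text only specifies a square
    of magnitude 1 on a background of magnitude 0.
    """
    rng = np.random.default_rng(seed)
    clean = np.zeros((size, size))
    lo, hi = square
    clean[lo:hi, lo:hi] = 1.0                      # square area of magnitude 1
    # Additive Gaussian noise with standard deviation sigma.
    noisy = clean + rng.normal(0.0, sigma, size=clean.shape)
    return clean, noisy

clean, noisy = make_block_image()
```

Keeping the noise-free image alongside the noisy one mirrors the way edge points are identified from the image without noise.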
In order to examine the scale space signatures of the edge points, we would like to
identify the edge points of the block image (figure 4.2b) in the same way we identified
the edge points of the step10 image (gradient magnitude of the image without noise).
Ideally we would like to distinguish light and dark points in this image based on their
respective scale-space signatures. Figure 4.3 shows that the signatures of the edge points
appear self similar and distinguishable from the signatures of non-edge points.
Figure 4.2: The block image: (a) The block image, size 25x25, with added Gaussian noise (σ = 0.1) and (b) first scale of wavelet transform of simple block image without noise.
Figure 4.3: Scale-space signatures of each pixel in the block image with added Gaussian noise (σ = 0.1).
This suggests that scale-space signatures can be used to determine edgeness; how to
separate edge-point signatures from non-edge-point signatures remains an issue. Examining
the collective scale space of this operator will help us determine what type of method
should be used to find edge points in this scale space. Examining the collective scale space
means considering the scale-space signatures as vectors in an n-dimensional space, where
n is the number of scales used to create a particular scale-space signature. We therefore
want to examine how points (edge and non-edge) are arranged in this
space. Figure 4.4 shows how a collective scale space is created for only edge points in an
image.
[Figure 4.4 diagram; labels: Step Image, Edge Truth Image, Wavelet Transform, Scale Space, Scale Signature Creation. Legend: E_m^s is edge point number m at scale s, for edge points 1 to M.]
Figure 4.4: Creation of a collective scale space from edge points in the step image.
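The construction of a collective scale space can be sketched in a few lines of Python with NumPy. This is an illustrative sketch rather than the thesis implementation: the array layout (scale axis first), the function name `collective_scale_space`, and the toy data are assumptions.

```python
import numpy as np

def collective_scale_space(scale_space, edge_mask):
    """Stack the scale-space signatures of the selected pixels.

    scale_space: array of shape (n, H, W), one operator response per scale
                 (layout assumed for this sketch).
    edge_mask:   boolean array of shape (H, W), True at edge points.
    Returns an (M, n) array holding one signature per selected pixel.
    """
    n = scale_space.shape[0]
    # Move the scale axis last so that each pixel holds its n-vector signature.
    signatures = np.moveaxis(scale_space, 0, -1)   # shape (H, W, n)
    return signatures[edge_mask].reshape(-1, n)    # shape (M, n)

# Toy example: 3 scales of a 4x4 image with two pixels marked as edge points.
rng = np.random.default_rng(0)
space = rng.random((3, 4, 4))
mask = np.zeros((4, 4), dtype=bool)
mask[1, 2] = mask[3, 0] = True
edges = collective_scale_space(space, mask)
```

Each row of `edges` is the signature of one edge point, so the rows can be treated directly as vectors in the n-dimensional space discussed above.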
We examine the collective scale space of the synthetic step10 image (figure 3.6c with
edge points indicated in figure 3.6b). Figure 4.5a shows a 2-D projection of scale-space
data for edge points in the step10 image. Figure 4.5b shows a 2-D projection of the edge
points and non-edge points in the step10 image. These scale-space projections are created
by projecting the data onto the two-dimensional subspace spanned by the two eigenvectors
associated with the largest eigenvalues.
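A projection of this kind can be sketched as follows. The text does not state which matrix supplies the eigenvectors; this sketch assumes they are those of the sample covariance matrix of the mean-centered signatures (a principal-component projection). The function name `project_2d` and the toy data are likewise assumptions.

```python
import numpy as np

def project_2d(signatures):
    """Project (M, n) signatures onto the two leading covariance eigenvectors."""
    centered = signatures - signatures.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    top2 = eigvecs[:, ::-1][:, :2]     # eigenvectors of the two largest eigenvalues
    return centered @ top2             # shape (M, 2)

# Toy data: 200 signatures over 8 scales with unequal spread per scale.
rng = np.random.default_rng(1)
data = rng.normal(size=(200, 8)) * np.array([5, 3, 1, 1, 1, 1, 1, 1])
points = project_2d(data)
```

A scatter plot of the two columns of `points` would give a view comparable in kind to figure 4.5, with the caveat that separation seen in two dimensions does not guarantee separation in the full space.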
Figure 4.5: 2-D scale-space projections from the step10 image: (a) 2-D scale-space projection of the space of edge points from the step10 image and (b) 2-D scale-space projection of the space of edge points (gray) and non-edge points (black) from the step10 image.
4.2 Analysis of the Scale Space of the Crease-Edge Operator
We can examine the scale-space of the GSN operator in the same way that we examined
the scale-space of the step-edge operator. For this analysis we use a simple pyramid image
(figure 4.6a). This simple image contains only crease edges. The first scale of the GSN
operator of the pyramid without noise (figure 4.6b) indicates the edge points of the
image. Most of the edge points in this image appear to have scale-space signatures that
are distinguishable from the signatures of non-edge points (figure 4.7).
We also examine the collective scale space of the GSN operator. We create a synthetic
8-sided-cone image (figure 4.8a) which contains only crease edges. Figure 4.9 shows
two 2-D projections for scale-space data from the 8-sided-cone image. These scale-space
projections are created by projecting the data onto the two-dimensional subspace spanned
by the eigenvectors associated with the largest eigenvalues.
Figure 4.6: The pyramid image: (a) The pyramid image, size 25x25, with added Gaussian noise (σ = 0.2) and (b) gradient of the surface normal of the simple pyramid image without noise.
Figure 4.7: Scale-space signatures of each pixel in the simple pyramid image with added Gaussian noise (σ = 0.1).
Figure 4.8: Synthetic 8-sided-cone image: (a) Synthetic 8-sided-cone image with added Gaussian noise (σ = 2) and (b) image indicating edge points used to create scale-space projections from the 8-sided-cone image.
Figure 4.9: 2-D scale-space projections from the 8-sided-cone image: (a) 2-D scale-space projection of the space of edge points from the 8-sided-cone image with added noise and (b) 2-D scale-space projection of the space of edge points (gray) and non-edge points (black) from the 8-sided-cone image with added noise.
4.3 Motivation for Using Pattern-Recognition
The different locations of feature and non-feature points indicated by figures 4.5b and
4.9b suggest that features could be detected in this space by a pattern-classification system.
We propose to analyze the 1-D scale-space signatures with a supervised pattern-recognition
system (figure 4.10). We choose a pattern-recognition approach because the system
• Can be trained to detect objects at all scales in the presence of noise.
• Has few free parameters.
• Can be trained to detect features in images from different acquisition devices.
In this section we examine the criteria affecting the potential success of a pattern-recognition
system for this application. We discuss the likelihood of these criteria being met.
Finally we propose the PASSEF system.
[Figure 4.10 diagram; labels: Scale-Space Signatures, Pattern-Recognition System, Fuzzy Edgeness Value (example outputs: 95%, 20%, 75%, 15%).]
Figure 4.10: Pattern recognition to determine edgeness of scale-space signatures.
There are two primary criteria that affect the potential success of a pattern-recognition
system in this application:
1. The class of edge points and the class of points that are not edges in the image
must have scale-space signature differences that allow them to be separated by the
pattern-recognition system.
2. Signatures from the training data set must closely resemble data from the image to
be segmented.
We address the potential of these two criteria being met by examining synthetic data.
The class of edge points and class of non-edge points for the step10 and 8-sided-cone
images (figures 4.5b and 4.9b respectively) appear mostly separated. However, we cannot
know for sure whether criterion one will be met because it is difficult to visualize the
high-dimensional space from 2-D projections. The second criterion is also difficult to assess.
We do know of one aspect of the scale space that could prevent the second criterion
from being met. In order for the training data to match the data being analyzed, the
magnitudes of the scale-space signatures for the two data sets must match. This is a
problem because there are an infinite number of magnitudes for the case of the step-edge
operator and quite a large range of values for the GSN operator. We can alleviate this
problem by normalizing the scale-space signatures so that the pattern-recognition system
is based solely on the shape of the scale-space signature and is therefore independent of the
contrast of an edge.
We normalize each scale signature by dividing each value in the signature by the
signature's total magnitude. Equation 4.1 shows the normalization of a feature vector
where g_s(i, j) is the response of a feature detector at scale s at point (i, j) in an image and
n is the total number of scales in the feature vector. After normalization all the points in
the space lie on one hyperplane. Transforming the coordinate system of the space to this
hyperplane reduces the dimensionality of the space by one.
\[
\left[ \frac{g_1(i,j)}{\sum_{y=0}^{n-1} g_{2^y}(i,j)},\;
\frac{g_2(i,j)}{\sum_{y=0}^{n-1} g_{2^y}(i,j)},\;
\ldots,\;
\frac{g_n(i,j)}{\sum_{y=0}^{n-1} g_{2^y}(i,j)} \right]
\tag{4.1}
\]
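The normalization of equation 4.1 can be sketched as below. This is a minimal illustration under stated assumptions: each signature is taken to contain exactly the responses that appear in the sum, the responses are assumed non-negative (as for a gradient magnitude), and the function name `normalize_signatures` and the `eps` guard are introduced here and are not part of the thesis.

```python
import numpy as np

def normalize_signatures(signatures, eps=1e-12):
    """Divide each (M, n) signature by the sum of its components."""
    totals = signatures.sum(axis=1, keepdims=True)
    # eps guards against division by zero for an all-zero signature
    # (a case the text does not discuss).
    return signatures / np.maximum(totals, eps)

sigs = np.array([[1.0, 2.0, 1.0],
                 [10.0, 20.0, 10.0]])   # same shape, ten times the contrast
normed = normalize_signatures(sigs)
```

Under these assumptions both rows normalize to the same vector, which is the contrast independence the text asks for, and every normalized signature sums to one, which is why the points fall on a single hyperplane.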
Normalizing the scale space changes the entire scale space. We must examine the
normalized space to determine what type of pattern classifier to use. We want to examine
Figure 4.11: 2-D scale-space projections from the step10 and 8-sided-cone images: (a) 2-D scale-space projection of the normalized space of edge points from the step10 image and (b) 2-D scale-space projection of the normalized space of edge points from the 8-sided-cone image with added noise.
the normalized scale space of the step10 and 8-sided-cone images. Figures 4.11a and
4.11b show collective scale-space projections for the step10 and 8-sided-cone images
respectively. These projections show that feature vectors representing edges appear to
form one hyperblob within the space. We could model this edge space with a simple
statistical pattern-recognition system.
We propose to train a pattern-recognition system with scale-space signatures of edge
points. Once trained, the system is applied to each pixel of an image to be segmented.
The edgeness of each pixel in the new image is determined, thereby creating a fuzzy
edge map. We develop the segmentation process shown in figure 4.12 based on the idea
of modeling the edge space with Gaussian blobs.
4.4 Modeling Scale Space Using Gaussian Blobs
The process in figure 4.12 consists of two phases: a training phase and a feature
detection phase. In the training phase we first perform the wavelet transform on a training
[Figure 4.12 diagram; labels: Training Phase, Feature Detection Phase, Range Image, Truth Image, Scale Space (1 to N), Normalized Scale Feature Vectors, Gaussian Approximation, Mean Vector, Covariance Matrix, Pattern-Recognition System, Magnitude, Fuzzy Fusion, Fuzzy Feature Map.]
Figure 4.12: Flow of entire Pattern Analysis of Scale Space for Extraction of Features (PASSEF) system.
image (usually a synthetic image with noise). Next we normalize each scale signature. We
then calculate the covariance matrix and mean vector of the edge points in the normalized
and transformed space. This statistical information is used to create a Gaussian model of
the space.
We use a Gaussian function because it is smooth and convenient to implement. In
addition, its parameters allow adjustment of the shape of the function (the standard
deviation) and the placement of the function (the mean). The Gaussian is used as a model
for the training data to indicate where concentrations of feature points occur rather than
as a precise model of feature points in the space. The parameters of the multi-dimensional
Gaussian function allow it to indicate concentrations of feature points in the feature space.
After deriving the proper model for the training data the feature detection phase
begins. In this phase the wavelet transform yields a feature vector for each point in the
input image. The pattern-recognition system creates an edge map using the feature vectors
of the input image and the Gaussian approximation from the training phase. This edge
Figure 4.13: Synthetic highbay image: (a) Synthetic highbay image and (b) image indicating edge points (step-edge points only) used to create scale space from real highbay image.
map and the magnitude values of the scale space are fused to create a final edge map.
The final edge map is segmented using a morphological watershed algorithm [62].
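The training and feature-detection phases described above can be sketched with a single Gaussian as follows. This is a hedged illustration, not the PASSEF code. It assumes signatures normalized to unit sum, drops the last coordinate as a simple stand-in for the change of coordinates to the hyperplane (otherwise the covariance matrix would be singular), and uses an unnormalized Gaussian response exp(-d²/2), where d is the Mahalanobis distance. The function names and toy data are hypothetical, and the later fusion with magnitude values and the watershed step are not shown.

```python
import numpy as np

def fit_gaussian(train_signatures):
    """Training phase: mean vector and covariance matrix of (M, n) signatures."""
    reduced = train_signatures[:, :-1]   # drop the redundant last coordinate
    return reduced.mean(axis=0), np.cov(reduced, rowvar=False)

def edgeness(signatures, mean, cov):
    """Detection phase: Gaussian response exp(-d^2 / 2) for (K, n) signatures."""
    diff = signatures[:, :-1] - mean
    # Squared Mahalanobis distance of every signature from the training mean.
    d2 = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    return np.exp(-0.5 * d2)

# Toy training set: noisy copies of one signature shape, normalized to unit sum.
rng = np.random.default_rng(2)
raw = np.abs(rng.normal(loc=[4.0, 3.0, 2.0, 1.0], scale=0.2, size=(500, 4)))
train = raw / raw.sum(axis=1, keepdims=True)
mean, cov = fit_gaussian(train)

query = np.array([[0.4, 0.3, 0.2, 0.1],    # shaped like the training signatures
                  [0.1, 0.2, 0.3, 0.4]])   # reversed shape
response = edgeness(query, mean, cov)
```

Applied to the signature of every pixel in an input image, `edgeness` would yield one value per pixel, that is, a fuzzy edge map of the kind the text describes.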
Simple training data such as the step10 and 8-sided-cone images can be modeled
with a single n-dimensional Gaussian, but for a more complicated space this single-blob
model breaks down. Multiple Gaussian blobs, if positioned correctly, might be able to
better represent this data. To show this we examine the collective scale space of a more
complex scene. We use a synthetic data set referred to as the highbay image (figure
4.13a). This image contains step as well as crease edges, but we examine only the edge
space for the step-edge operator. Figure 4.13b shows the points used to create the edge
space. The edge space for the highbay image is much more dispersed than the edge space
of the simple step10 image (figure 4.14).
A classification system with the ability to divide the areas of concentrated points into
separate regions could allow the space of edges to be modeled by a number of n-dimensional
Gaussian blobs. One n-dimensional Gaussian could be fitted to each piece of the divided
Figure 4.14: 2-D projection of scale space for edge points of the highbay image.
space in the same way it is fitted to the entire space. The number of vectors in a particular
area of the edge space would determine the scale of the Gaussian approximation used to
model that piece. This is necessary because the magnitude of the result from the statistical
classifier must be related to the concentration of points. Areas in the scale space that are
highly concentrated with edge points from the training data should render high responses
when a feature vector from a new image is examined.
We use a k-means algorithm to divide the space into a specified number of clusters.
We then fit an n-dimensional Gaussian to each of these clusters. Figure 4.15 shows the
results of applying the k-means clustering algorithm to the edge scale space of the highbay
image to create ten clusters.
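The clustering and per-cluster fitting can be sketched as below. Several choices here are assumptions made only for illustration: a plain Lloyd's k-means with random initialization, a blob weight equal to the fraction of training vectors in the cluster (one way to tie the response to the concentration of points), and a combined response formed as the weighted sum of the blob responses. The text does not specify these details, and all function names are hypothetical.

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on (M, n) points; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster becomes empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

def fit_blobs(points, k):
    """Fit one Gaussian (mean, covariance, weight) to each k-means cluster."""
    _, labels = kmeans(points, k)
    blobs = []
    for j in range(k):
        members = points[labels == j]
        if len(members) <= points.shape[1]:
            continue  # too few points for a usable covariance estimate
        weight = len(members) / len(points)   # scale by point concentration
        blobs.append((members.mean(axis=0), np.cov(members, rowvar=False), weight))
    return blobs

def blob_response(x, blobs):
    """Weighted sum of the Gaussian blob responses at feature vector x."""
    total = 0.0
    for mean, cov, weight in blobs:
        diff = x - mean
        total += weight * np.exp(-0.5 * diff @ np.linalg.solve(cov, diff))
    return total

# Toy feature space: a dense cluster of 300 points and a sparse one of 100.
rng = np.random.default_rng(3)
cloud = np.vstack([rng.normal([0, 0], 0.1, size=(300, 2)),
                   rng.normal([5, 5], 0.1, size=(100, 2))])
blobs = fit_blobs(cloud, k=2)
```

In this toy case the response near the dense cluster exceeds the response near the sparse one, which reflects the requirement that highly concentrated areas of the training space render high responses.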
4.5 Fusion of the Step-Edge and Crease-Edge Detection Maps
In order to detect all the edges in an image (step and crease) we apply the PASSEF
system twice. We apply the PASSEF system using the step-edge operator with only step-
edge training data, and then we apply the PASSEF system using the crease-edge operator
Figure 4.15: K-means applied to scale space of step-edge points of highbay image.
with only crease-edge training data. This creates two edge maps: one for step edges and
the other for crease edges. These two edge maps must somehow be combined to create a
single comprehensive edge map.
We use a fuzzy union operator to fuse these two maps [64]. We choose Bernoulli's
rule of combination to perform the fusion [65]. Bernoulli's rule of combination performs a
union of two values a ∈ [0, 1] and b ∈ [0, 1] using equation 4.2.
\[
1 - (1 - a)(1 - b) \tag{4.2}
\]
Edge maps created by the PASSEF system are fuzzy but not bounded. In order to apply
Bernoulli's rule of combination, we need to transform the values of the edge maps to
the [0,1] interval. A simple way to map the data to the [0,1] interval is to perform a
linear remapping (equation 3.18). After the fuzzy union is performed on the images the
morphological watershed described in [62] is applied to the final edge map to create a
segmentation.
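The fusion step can be sketched as follows. Equation 3.18 is not reproduced in this chapter, so the min-max form of the linear remapping used here is an assumption, as are the function names and the toy edge maps. The union itself follows equation 4.2.

```python
import numpy as np

def remap_unit(edge_map):
    """Linearly rescale values to [0, 1] (min-max form, assumed here)."""
    lo, hi = edge_map.min(), edge_map.max()
    if hi == lo:
        return np.zeros_like(edge_map, dtype=float)   # constant map: no edges
    return (edge_map - lo) / (hi - lo)

def bernoulli_union(a, b):
    """Fuzzy union of two maps with values in [0, 1] (equation 4.2)."""
    return 1.0 - (1.0 - a) * (1.0 - b)

step_map = np.array([[0.0, 2.0], [4.0, 1.0]])     # toy step-edge response
crease_map = np.array([[3.0, 0.0], [3.0, 6.0]])   # toy crease-edge response
fused = bernoulli_union(remap_unit(step_map), remap_unit(crease_map))
# fused is [[0.5, 0.5], [1.0, 1.0]]
```

The fused value at a pixel is never smaller than either input value, so an edge that is strong in only one of the two maps is still strong in the combined map.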
CHAPTER 5
Results and Analysis of System
In this chapter we examine the capabilities of the Pattern Analysis of Scale Space for
Feature Extraction (PASSEF) system. We analyze the PASSEF system with both syn-
thetic and real data. This analysis addresses each of the three qualities of a segmentation
algorithm set forth in chapter one:
• Detection of objects at all scales in the presence of noise.
• Few free parameters.
• Applicability to images from different acquisition devices.
We begin this chapter with an analysis of training data used with the PASSEF system.
Next we discuss the detection of step edges and end with an analysis and discussion of
crease edges.
5.1 Training Data
The PASSEF system requires training data to extract features in an image. In this
chapter all the training-data sets are formed from feature points in synthetic images.
Different synthetic images create training-data sets for the results. The step10,
8-sided-cone, and highbay images are used. Different edge points are used to create three
training-data sets from the highbay image. Table 5.1 lists the properties of all
training-data sets used to obtain the results for this chapter.
The step10 image contains only step edges so the training data created from this
image contains feature points of only step edges. To find the step-edge points in the
Table 5.1: Training Data used with the PASSEF System
Variable Name   Image Source    Type of Features   Edge Point Figure
S               step10          step               5.1b
C               8-sided-cone    crease             5.2b
H_{s,c}         highbay         step and crease    5.3b
H_s             highbay         step               5.3c
H_c             highbay         crease             5.3d
Figure 5.1: Training data from the step10 image: (a) The step10 image (size 256x256) containing noise σ = 10, (b) image indicating edge points (black) used to create training-data set S.
step10 image we first find g_1 for the step10 image with no noise. We threshold g_1 to
yield a binary map of edge points used to create the training-data set (figure 5.1b).
The 8-sided-cone image contains only crease edges. This means that the training
data created from this image contains feature points of only crease edges. We determine
the crease-edge points in the 8-sided-cone image by examining g_1 for the 8-sided-cone
image with no noise. We threshold g_1 to yield a binary map of edge points used to create
the training-data set (figure 5.2b).
The highbay image contains both crease and step edges. We create three training-
data sets from the highbay image. One set contains all edges in the image, H_{s,c}. A
second set contains only step edge points in the image, H_s, and a third incorporates only
Figure 5.2: Training data from the 8-sided-cone image: (a) The 8-sided-cone image (size 256x256) with noise σ = 0.1, (b) image indicating edge points (black) used to create training-data set C.
crease edge points, H_c. To obtain the edge points used to create H_{s,c} we derive g_1 for
the highbay image. We threshold g_1 to obtain a binary map of all the edge points in the
highbay image (figure 5.3b). To acquire the edge points used to make H_s we find g_1 for
the highbay image. We threshold and manipulate g_1 to yield a binary map of edge points
used to create H_s (figure 5.3c). Crease edge points of the highbay image are found by
thresholding and manipulating h_1 of the highbay image (figure 5.3d). These crease-edge points
are used to create the training-data set H_c.
5.2 Detection of Objects at All Scales in the Presence of Noise
First we consider the detection of blocks in the synthetic step10 image and an analogous
image that contains more noise (the step30 image containing Gaussian noise σ = 30).
These images provide a way to see the effects of different noise levels on the PASSEF system.
Second we consider the detection of objects in a real image. This real image provides
objects of different sizes (i.e., features at different scales). We compare results from the
Figure 5.3: Training data from the highbay image. (a) The synthetic highbay image with added noise σ = 0.1 and images indicating points of the highbay used to create training set (black) (b) H_{s,c}, (c) H_s, and (d) H_c.
PASSEF method to results obtained by applying the step-edge detector at a single scale.
Recall the step10 image (figure 5.1a). This image consists of 196 blocks of various
magnitudes. The image is piecewise flat and thus contains only step edges. Also recall
that a "perfect" edge detector for an ideal image (no noise and piecewise flat) is the
gradient magnitude. Figure 5.1b shows this for the step10 image.
We use S to train the PASSEF system and then apply the PASSEF system to the
step10 image. The PASSEF system utilizes only the step-edge operator g. We denote
this as P_g^S[step10]. We use eight scales for the training and application of the PASSEF
system. Because the scale-space data appears to form one Gaussian blob (figure 4.11a)
we use only one blob to model the space. Figure 5.4 shows P_g^S[step10] and g_2[step10]
for subjective comparison.
Next we apply the PASSEF system to the step30 image. This image is analogous
to the step10 image except the standard deviation of the added noise is thirty rather
than ten (figure 5.5). The "perfect" edge detection for this image with no noise is again
the gradient magnitude (figure 5.1b). The proper noise level for a training set is not
necessarily equal to the noise level of the image being processed; therefore, we train the
PASSEF system using the step10 image. (This is fully explored in the next section.) We
again use eight scales and one blob to model the data. Figure 5.6 displays P_g^S[step30]
and g_4[step30] for comparison.
The gradient magnitude, g_s, is an excellent edge detector for the step10 and step30
images. These images contain blocks of only one size; therefore, an optimal global scale
exists for the g_s operator. The optimal scale is the minimum scale that sufficiently smooths
noise in the image.
Figure 5.7 shows each dyadic scale of the step-edge detector, g_s[step10] for s ∈
{1, 2, 4, 8, 16, 32}, and figure 5.8 shows the result of applying the watershed algorithm to
this scale space. The second scale of this scale space (figure 5.8b) seems to yield the best
Figure 5.4: Results of applying PASSEF and the step-edge operator to the step10 image: (a) P_g^S[step10], (b) the watershed algorithm applied to P_g^S[step10], (c) g_2[step10], and (d) the watershed algorithm applied to g_2[step10]. In these figures black is the highest and white the lowest response to the edge operator. Gray values between white and black represent values that are between the lowest and highest responses respectively.
Figure 5.5: The step30 image (size 256x256). This image contains added Gaussian noise σ = 30.
Figure 5.6: Results of applying PASSEF and the step-edge operator to the step30 image: (a) P_g^S[step30], (b) the watershed algorithm applied to P_g^S[step30], (c) g_4[step30], and (d) the watershed algorithm applied to g_4[step30].
Figure 5.7: Application of the step-edge operator to the step10 image at various scales: g_s[step10] for s = (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, and (f) 32.
Figure 5.8: Watershed algorithm applied to step-edge operator results of the step10 image: Watershed algorithm applied to g_s[step10] for s = (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, and (f) 32.
Figure 5.9: Application of the step-edge operator to the step30 image at various scales: g_s[step30] for s = (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, and (f) 32.
segmentation. At the second scale noise is reduced and most of the blocks are properly
segmented. Figure 5.9 shows each scale of the step-edge detector for the step30 image,
g_s[step30] for s ∈ {1, 2, 4, 8, 16, 32}, and figure 5.10 shows the result of applying the
watershed algorithm to this space. In the gradient-magnitude scale space for the step30
image, scale four seems to yield the best segmentation (figure 5.10c). Indeed a higher
noise level requires more smoothing to create a proper segmentation, and the minimum
scale that provides this adequate smoothing gives the optimal segmentation.
An image that contains objects of varying sizes will not have an optimal global scale
and may therefore prove more challenging for the single-scale paradigm. Next we examine
the hydro3 image (figure 5.11). Figure 3.12 shows g_s[hydro3] for s ∈ {1, 2, 4, ..., 128}.
Unlike for the step10 and step30 images, it is difficult to determine which scale of
Figure 5.10: Watershed algorithm applied to step-edge operator results of the step30 image: Watershed algorithm applied to g_s[step30] for s = (a) 1, (b) 2, (c) 4, (d) 8, (e) 16, and (f) 32.
Figure 5.11: The hydro3 image (size 256x256).
g_s produces an optimal segmentation for the hydro3 image. This is because the scene
contains small objects (such as valve wheels) as well as larger objects (such as pipes).
To apply the PASSEF system to the hydro3 image we use the H_{s,c} training data
because this data more closely models the hydro3 image. We use four scales and twenty
blobs to model the scale space. Because g_s has some ability in detecting both types of
edges we want to train the PASSEF system with both step and crease edges. We use H_{s,c}
rather than H_s as training data.
Figure 5.12 shows P_g^{H_{s,c}}[hydro3] and g_4[hydro3] (one of the better segmentations
from g_s) for comparison. The PASSEF method seems to create a better segmentation
than any one scale of the gradient magnitude operator (refer to figure 3.13 if needed).
This analysis suggests that the PASSEF method succeeds in properly detecting features
at multiple scales in the presence of noise.
5.3 Free Parameters in the PASSEF System
In this section we examine the free parameters of the PASSEF system. We consider
each free parameter individually. We discuss proper values for each free parameter, how
Figure 5.12: Results of applying PASSEF and the step-edge operator to the hydro3 image: (a) P_g^{H_{s,c}}[hydro3], (b) the watershed algorithm applied to P_g^S[hydro3], (c) g_4[hydro3], and (d) the watershed algorithm applied to g_4[hydro3].
Table 5.2: Free parameters in the PASSEF system and segmentation algorithm.
Parameter Name                 Variable   Range of Values      Sensitivity
Number of Scales               n          4 to 8               low
Number of Blobs                N          0 to 20              low
Noise Level in Training Data   σ          0.1                  medium
Minimum Watershed Depth        d          0.0001 to 0.00001    medium
proper values are determined, and the PASSEF system's sensitivity to each parameter.
The PASSEF system has few free parameters and the proper values for these parameters
can be found with a minimum of human interaction. Indeed, the PASSEF system's greatest
strength is its lack of free parameters and robustness to the adjustment of these parameters.
Table 5.2 shows the free parameters of the PASSEF system. This table states the name
of the parameter, the variable name associated with the parameter, the range of values
for the parameter used in this thesis, and a subjective measure of the PASSEF system's
sensitivity to this parameter. There are three grades of sensitivity measure: low, medium,
and high. Low sensitivity indicates that changing this parameter from image to image is
seldom necessary. High sensitivity indicates that the parameter may need to be changed
from image to image.
5.3.1 Number of Scales - n
The PASSEF system requires multi-scale information to extract features. We ensure that
the PASSEF system has ample information by adjusting the parameter n. If the value
of n is too low the scale space will not contain enough information to extract features.
For example, if n = 1 the PASSEF system can only extract features at that single
scale. Therefore, n should be high enough to create a scale space that contains all the
information needed to extract features in a particular image.
Theoretically very large values of n will always yield a good feature extraction because
all the needed information is available to the system. But using large values of n forces the
system to analyze much redundant information. When n is very large redundant information
is present because after a certain degree of smoothing no new useful information
can be extracted from an image (examine figure 3.12). Theoretically the PASSEF system
has the ability to extract features in the midst of this redundant information, and this
is borne out, to some extent, in our experiments. Because the system determines which
scales contain useful information it will use only these scales for classification. Although
the system can detect features regardless of redundant information, we attempt to use values
of n that avoid redundant information in the system in order to reduce processing
time.
To demonstrate the effect of n being too low we examine detection of features in the
step10 image using two values of n, four and eight. Figure 5.13 shows the result of
applying the PASSEF system to the step10 image with n equal to four and eight. The
training data set is S. The four-scale and eight-scale results are somewhat similar, but
the four-scale result appears to have more noise. Better detection using eight scales with
the PASSEF system indicates that scales beyond the fourth dyadic scale for g_s[step10]
contribute some useful information (examine figure 3.7).
5.3.2 Number of Blobs - N
We create a statistical model of a training-data set for the purpose of classifying points
in an image. Finding the proper training-data model is essential for the PASSEF system
to properly extract features. To create a model for the training data we first apply a
k-means algorithm to the feature space. This divides the feature vectors into N groups.
Each group of vectors is then fitted with an n-dimensional Gaussian blob. These Gaussian
blobs collectively create a model of the training data.
The feature vectors of various training-data sets can be distributed very differently
within their respective feature spaces. For example edge points in the step10 image,
training-data set S, create a feature space that is somewhat compact. The feature vectors
Figure 5.13: Application of the PASSEF system to the step10 image for varying values of n: P_g^S[step10] with n = (a) 4 and (c) 8. The watershed algorithm applied to P_g^S[step10] with n = (b) 4 and (d) 8.
Figure 5.14: 2-D scale-space projections of training data sets: (a) 2-D scale-space projection of the space of the training-data set S and (b) 2-D scale-space projection of the space of the training-data set H_{s,c}.
in this space seem to form one group (figure 5.14a). In contrast, the edge points in
the highbay image, training-data set H_{s,c}, form a convoluted feature space (figure 5.14b).
Creating acceptable models for these training-data sets, which have varying feature
distributions, requires knowledge of the proper model for training data. The proper model
should generalize the training data to the data being analyzed. This implies that the model
must be accurate but not too exact. The training-data model must be accurate enough
to discriminate feature points from non-feature points in the data being analyzed. If the
model is too general, non-feature points will be classified as feature points. So the proper
model is one that is able to generalize yet accurately describes the training data.
Adjustment of N provides a way to obtain a proper model for various training-data sets.
Examining the distributions of feature points in a training-data set aids in determining a
proper range of values for N. Figure 5.14 displays two examples of these projections. These
two projections are created by projecting the data onto the two-dimensional subspace
spanned by the eigenvectors associated with the largest eigenvalues. We examine these
projections and other projections to ascertain the distribution of feature points in the
training space. A more distributed space requires a higher value for N and a more compact
space requires a lower value for N .
Proper training-data models are obtained by knowing only an acceptable range for N
rather than an exact proper value for N because a precise choice of N does not appear
to be critical for obtaining good results. For example the proper range of values for N
from the feature distribution (figure 5.14a) seems to be 1 to 3. We apply the PASSEF
system to the step10 image using the training-data set S with N = 1, 5, and 20 (figure
5.15). As expected a large number of blobs, twenty, creates a result that is slightly worse
because the model becomes too exact, but all the results are very similar. This example
demonstrates that the PASSEF system is quite robust to changes in the parameter N.
Experimentation shows this is true for other training-data sets.
5.3.3 Noise Level in Training Data - σ
Next we examine the parameter σ. Usually the PASSEF system is trained with synthetic
images. Noise is added to these images to simulate real data. The added noise creates
training data that more accurately describes the real data.
We examine a complex synthetic image to determine the proper value range for �. We
add Gaussian noise of di�erent standard deviations to this synthetic image that has all
types of edges present. To test the PASSEF system we train it with the synthetic highbay
image with Gaussian noise of �= 0, 0.1, and 1.0. We apply the PASSEF system to the
highbay synthetic image with �=1.0 to obtain the three edge maps shown in �gure 5.16.
Note that we have adjusted the contrast in the images in �gure 5.16 to better visualize
the results.
It appears that a small amount of noise in the training data improves the result as
compared with no noise. This is because without noise the training data does not
accurately model the data being analyzed. In addition, a small amount of noise in the training
data appears to yield better results than training data with a level of noise equal to that
in the image being analyzed. Too much noise in a training-data set causes the PASSEF
system to model noise rather than features, thus causing noise in the result. This analysis
indicates that the best training set is one that has a small amount of noise but probably
less noise than that present in the image being analyzed.
Figure 5.15: Application of the PASSEF system to the step10 image for varying values of N: $P_g^{S}$[step10] with N = (a) 1, (b) 5, and (c) 20. Application of watershed to (a), (b), and (c) is shown in (d), (e), and (f) respectively.
Figure 5.16: The PASSEF system applied to the highbay synthetic image for varying values of σ: the PASSEF system applied to the highbay synthetic image with added Gaussian noise of σ = 1, with training data of Gaussian noise levels σ = (a) 0, (b) 0.1, and (c) 1.0. We gamma adjust the images in this figure to reveal the noise in the results.
5.3.4 Minimum Watershed Depth | d
The last free parameter is d. The parameter d is the minimum depth of the individual
catchment basins created by the watershed algorithm. The depth of a catchment basin
refers to the difference between the maximum and minimum values of that catchment
basin. Catchment basins that are too shallow (i.e. have a depth less than d) are merged
with other regions.
The correct value for d is a value that is not too high yet not too low. If the value
of d is too high, too many regions are merged together. This causes undersegmentation
in the final result. On the other hand, if the value of d is too low, oversegmentation
results because not enough regions are merged together. The proper value for d is found
heuristically.
Before applying the watershed algorithm to results from the PASSEF system, we
transform the values of the fuzzy edge map to the interval [0,1]. The range of values for
d is 0.0001 to 0.001. The value used most is 0.001. The sensitivity for this parameter is
medium.
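The role of d can be illustrated with a simplified one-dimensional sketch. It assumes the basins have already been labeled as contiguous runs by some watershed step, rescales the edge map to [0,1], and absorbs any basin whose depth (maximum minus minimum) is below d into a neighbor. The function `merge_shallow_basins`, the sample values, and the choice of merging into the left neighbor are all hypothetical; the thesis does not specify which neighboring region receives a merged basin.

```python
import numpy as np

def merge_shallow_basins(edge_map, labels, d):
    """Relabel 1-D catchment basins whose depth (max - min) is below d."""
    edge_map = np.asarray(edge_map, dtype=float)
    labels = np.asarray(labels)
    # Rescale the fuzzy edge map to [0, 1] so d is comparable across images
    span = edge_map.max() - edge_map.min()
    scaled = (edge_map - edge_map.min()) / span if span > 0 else np.zeros_like(edge_map)
    merged = labels.copy()
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        depth = scaled[idx].max() - scaled[idx].min()
        if depth < d:
            # Absorb into the left neighbour, or the right one at the border
            if idx[0] > 0:
                merged[idx] = merged[idx[0] - 1]
            elif idx[-1] + 1 < len(merged):
                merged[idx] = merged[idx[-1] + 1]
    return merged

edge_map = [0.0, 0.2, 1.0, 0.5, 0.5004, 0.5, 0.9, 0.1]
labels = [0, 0, 0, 1, 1, 1, 2, 2]
merged = merge_shallow_basins(edge_map, labels, d=0.001)
```

In this example the middle basin has a depth of about 0.0004, which is below d = 0.001, so it is merged with the basin to its left while the two deeper basins are kept.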
5.4 Images from Different Acquiring Devices
Our final goal is to develop a system that extracts features from images created by
different acquiring devices. We apply the PASSEF system to images from two different
Perceptron laser range finders and an image from a Coleman Coherent Laser Radar
scanner. We use the training-data set $H_{s,c}$ to process all the images from different acquiring
devices. Twenty Gaussian blobs model the four-dimensional scale space of edge points (i.e.
N=20 and n=4). The parameter d ranged from 0.001 to 0.0015. The PASSEF system
gives similar results from all scanners.
Figure 5.17: The hydro6 image (size 256x256).
First we analyze two images from a P5000 Perceptron laser range finder. The
foreground of the hydro3 image (figure 5.11) is made up of mainly pipes and valves. There is
a wall to the right of the image that also appears in the foreground. The hydro6 image
is a different view of the same scene taken with the same range scanner (figure 5.17). The
hydro6 image also contains pipes and valves, as well as a light fixture.
Figures 5.12a and b show $P_g^{H_{s,c}}$[hydro3]. We compare the result obtained with the
PASSEF system to single-scale results. Scale four presented the best segmentation for
$g_s$ (figures 5.12c and d). We apply the PASSEF system to the hydro6 image to obtain
$P_g^{H_{s,c}}$[hydro6] (figures 5.18a and b). The edge map $g_2$[hydro6] seemed to provide the
best single-scale segmentation for this image (figures 5.18c and d).
Both the hydro3 and hydro6 images contain spike noise. The PASSEF algorithm
avoids improper classification of spike noise as edges. For example, on the wall at the
right of the hydro3 image some spike noise is present. The first, second, third and fourth
dyadic scales of $g_s$[hydro3] detect these noise points as edges (refer to figures 3.12a to
3.12e). The ceiling of the hydro6 image contains a narrow, horizontal region of noise in
approximately the middle of the image. The PASSEF algorithm does not detect this noise,
but the single-scale paradigm does (figure 5.18). The PASSEF system properly detects
small scale features while mis-classifying substantially fewer spike noise points.
Figure 5.18: Results of applying PASSEF and the step-edge operator to the hydro6
image: (a) $P_g^{H_{s,c}}$[hydro6], (b) the watershed algorithm applied to $P_g^{H_{s,c}}$[hydro6], (c)
$g_4$[hydro6], and (d) the watershed algorithm applied to $g_4$[hydro6].
Figure 5.19: The cone image (size 256x256).
Next we apply the PASSEF system to an image that we label as the cone image
(figure 5.19). This image is from a Coleman Coherent Laser Radar scanner¹. This range
scanner provides images with very little noise. The largest objects in the image are two
brick blocks and a traffic cone placed in front of a barrel. This image has very little spike
noise and a low amount of additive noise.
Figure 5.20 shows $g_1$[cone] and $g_2$[cone]. The edge map $g_1$[cone] demonstrates
more accurate detection of edges for some objects (e.g. edges of the brick blocks) but
noise corrupts proper detection of other features (e.g. the cone and barrel). In addition,
$g_1$[cone] fails to smooth the noise present in the right side of the image. The $g_2$[cone]
result improves detection of edges for the cone and barrel yet corrupts edges of the brick
blocks. The noise patch at the right side of the image is virtually non-existent in $g_2$[cone].
¹ The laser range data files were provided by the Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, Managed by Lockheed Martin Energy Research Corp. for the U.S. Department of Energy under contract DE-AC05-96OR22464.
Figure 5.20: Application of the step-edge operator to the cone image: (a) $g_1$[cone], (b) $g_2$[cone], and the watershed applied to (c) $g_1$[cone] and (d) $g_2$[cone].
Figure 5.21: Results of applying PASSEF to the cone image: (a) $P_g^{H_{s,c}}$[cone] and (b)
watershed applied to $P_g^{H_{s,c}}$[cone].
The PASSEF system shows excellent detection of step edges and avoids detection of
noise as features in the cone image (figure 5.21 shows $P_g^{H_{s,c}}$[cone]). The PASSEF system
does present improper detection of some crease edges. Improper detection of crease edges
causes the image to be undersegmented. For example, $P_g^{H_{s,c}}$[cone] does not properly
detect the crease edge or change in surface normal that occurs between the barrel and
the floor. Figure 5.22 shows that part of the crease edge between the barrel and floor is
detected, but a small piece, at the intersection with the cone's edge, is not detected. The
missing piece of this crease edge causes the watershed algorithm to merge the floor region
with the region representing the barrel. Notice that edges dividing the floor and the wall
also have this characteristic where they intersect edges of the brick blocks.
Figure 5.22: Histogram equalization of the result of applying PASSEF to the cone image: histogram equalization applied to $P_g^{H_{s,c}}$[cone].
Finally we apply the PASSEF system to two images acquired from another Perceptron
laser range finder². Each of these images contains one polyhedral object. Figure 5.23a
shows the polyhedral1 image and figure 5.23b shows the polyhedral2 image. We
apply the PASSEF system to both of these images (figures 5.23c and 5.23d) to obtain
$P_g^{H_{s,c}}$[polyhedral1] and $P_g^{H_{s,c}}$[polyhedral2]. In previous examples the PASSEF
system, using only the step-edge operator, demonstrated partial detection of crease edges.
The PASSEF system does not detect crease edges in the polyhedral1 or polyhedral2
image. Analysis of $g_s$[polyhedral1] shows why the PASSEF system does not detect
crease edges in the image. Edge maps $g_s$[polyhedral1] for $s \in \{1, 2, 4, 8\}$ (figure 5.24)
show that the step-edge operator has quite limited ability in extracting crease edges in
the polyhedral1 image.
Improved crease-edge detection is essential to creating a reasonable segmentation for
the polyhedral1 and polyhedral2 images. Results for the cone image show that
improvement of crease-edge detection would greatly improve the segmentation produced
by the PASSEF system. The next section discusses our attempts to generalize the PASSEF
system to the extraction of crease edges in order to improve segmentation results.
² The laser range data files were provided by The Computer Vision / Image Analysis Research Laboratory at the University of South Florida, Department of Computer Science & Engineering, University of South Florida, Tampa, Florida 33620-5399, http://marathon.csee.usf.edu/range/seg-comp/SegComp.html
Figure 5.23: The polyhedral1 and polyhedral2 images: (a) the polyhedral1 image,
(b) the polyhedral2 image, (c) $P_g^{H_{s,c}}$[polyhedral1], and (d) $P_g^{H_{s,c}}$[polyhedral2].
Figure 5.24: Application of the step-edge operator to the polyhedral1 image: $g_s$[polyhedral1] for s = (a) 1, (b) 2, (c) 4, and (d) 8.
5.5 Crease-Edge Detection
This section describes the behavior of the PASSEF system using the crease-edge
operator, $h_s$, as opposed to the step-edge operator, $g_s$, that we examined in the previous
sections. The crease-edge operator is given by equation 3.16 in chapter three. We use this
operator to extract crease edges with the PASSEF system. This section shows that
generalizing the PASSEF system to crease edges using $h_s$ makes crease-edge detection
possible, and also that more work is
necessary for the reliable extraction of crease edges.
The strategy is to use data from $h_s$ with the PASSEF system to create a fuzzy
crease-edge map denoted as $P_h$[f], where f is the image being analyzed. To find $P_h$[f] the PASSEF
system is trained with a training-data set that contains only crease-edge points. For an
initial analysis we examine $P_h$ for the 8-sided-cone image (figure 5.2a), a synthetic image
containing only crease edges. We use the training-data set C. Figure 5.25 shows
$P_h^{C}$[8-sided-cone]. This initial experiment demonstrates the potential for crease-edge detection using
the PASSEF system.
Our second experiment involves detecting crease edges in a more complex synthetic
image, the highbay image (figure 5.3a). We use the training-data set $H_c$ to derive
$P_h^{H_c}$[highbay] (figure 5.26b). Some crease edges appear to be detected properly in
$P_h^{H_c}$[highbay], but the results are not as good as for the 8-sided-cone experiment.
Because the highbay image contains step edges and crease edges, we would like to
derive a feature map that contains both step and crease edges. For this we find $P_g$[highbay]
and fuse this result with $P_h$[highbay]. Because we are now detecting crease edges
explicitly, we do not want $P_g$[highbay] to detect crease edges. We find $P_g$[highbay] using
a training-data set that contains only step edges ($H_s$) as opposed to one that contains
both crease and step edges ($H_{s,c}$), which we used previously in this chapter. Figure 5.26a
shows $P_g^{H_s}$[highbay].
Figure 5.25: Results of applying PASSEF to the 8-sided-cone image: (a) $P_h^{C}$[8-sided-cone] and (b) the watershed algorithm applied to $P_h^{C}$[8-sided-cone].
Figure 5.26: Results of applying PASSEF to the highbay image: (a) $P_g^{H_s}$[highbay] and
(b) $P_h^{H_c}$[highbay].
We fuse $P_g^{H_s}$[highbay] and $P_h^{H_c}$[highbay] using Bernoulli's rule of combination
(equation 4.2) to create a complete feature map for the highbay image. Figures 5.27a and 5.27b
show the results from this fusion. We compare this result with the result obtained from
$P_g$ only to determine if crease-edge detection is improved by using $h_s$. Figure 5.27c shows
$P_g^{H_{s,c}}$[highbay]. The fused edge map shows improved detection of some crease edges but
some noise is present.
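As a rough illustration of the fusion step, the sketch below combines two fuzzy edge maps pixel by pixel. Equation 4.2 is not reproduced in this section, so the code assumes the usual form of Bernoulli's rule of combination for two degrees of support, p = p1 + p2 - p1 p2, applied to values in [0,1]. The function name `bernoulli_combine` and the sample maps are hypothetical.

```python
import numpy as np

def bernoulli_combine(p_step, p_crease):
    """Fuse two fuzzy edge maps with p = p1 + p2 - p1*p2 = 1 - (1-p1)(1-p2).
    Assumed form of the combination rule; both inputs must lie in [0, 1]."""
    p_step = np.asarray(p_step, dtype=float)
    p_crease = np.asarray(p_crease, dtype=float)
    return p_step + p_crease - p_step * p_crease

# Hypothetical 2x2 step-edge and crease-edge maps
step_map = np.array([[0.0, 0.9], [0.2, 0.5]])
crease_map = np.array([[0.0, 0.1], [0.7, 0.5]])
fused = bernoulli_combine(step_map, crease_map)
```

Under this rule the fused value is never smaller than either input, which is consistent with the observation that noise present in either map carries over into the fused result.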
Our goal is to extend this method to real data. We find $P_h^{H_c}$ for the hydro3, hydro6,
and polyhedral1 images. The PASSEF system yields noisy results for all images. We
attempt to fuse $P_g^{H_s}$ with $P_h^{H_c}$. The noise prevails in the fused result. Noise in
the fused result causes unacceptable oversegmentation. Additional work could improve
crease-edge detection using the PASSEF system.
Figure 5.27: Fusion of step-edge and crease-edge operator results using PASSEF: (a) the
fusion of $P_g^{H_s}$[highbay] and $P_h^{H_c}$[highbay] using Bernoulli's rule of combination, (b) the
watershed algorithm applied to the fusion of $P_g^{H_s}$[highbay] and $P_h^{H_c}$[highbay], and (c) a
segmentation from applying the watershed to $P_g^{H_{s,c}}$[highbay].
CHAPTER 6
Conclusions
We have developed a segmentation system for range images that utilizes multi-scale
analysis. We developed the Pattern Analysis of Scale Space for Extraction of Features
(PASSEF) method, which largely adheres to the goals set forth in chapter one. The goal
was to develop an algorithm that possesses three qualities:
• Detection of objects at all scales in the presence of noise.
• Few free parameters.
• Applicability to images from different acquiring devices.
Chapter five of this thesis presents results that demonstrate the degree to which the
PASSEF system achieves the stated goals. The PASSEF system demonstrates excellent
detection of step edges at all scales in the presence of noise, but shows less promise in
detecting crease edges. The generalization of the PASSEF system to crease edges shows
some of the strengths and weaknesses of the system. The PASSEF system requires only
four free parameters. In addition, the system is very robust to adjustment of these
parameters. We apply the PASSEF system to several images from various range acquiring devices
to find that the system can successfully detect features in all these images. Moreover, no
adjustment of parameters is needed to obtain feature extractions from these images.
One of the strengths of the PASSEF system is that the system can be extended to
detect other features by using different feature operators. We attempted to extend the
system to the detection of crease edges using the operator in equation 3.16. This attempt
revealed that extending the system to detection of other features is possible, but more
work is needed to improve the results.
In conclusion, we have developed a system that can segment range images that contain
multi-scale objects and are from various acquiring devices with very little free parameter
adjustment. We have presented a thorough analysis of this system that includes assorted
results.
BIBLIOGRAPHY
[1] D. Marr, Vision, Freeman, 1982.
[2] A. Hoover, G. Jean-Baptiste, X. Jiang, P. Flynn, H. Bunke, D. Goldgof, and K. Bowyer, "A comparison of range segmentation algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence 18(7), pp. 673-689, 1996.
[3] A. P. Witkin, "Scale-space filtering," in Proceedings of the 8th International Joint Conference on Artificial Intelligence, pp. 1019-1022, 1983.
[4] F. Bergholm, "Edge focusing," IEEE Transactions on Pattern Analysis and Machine Intelligence 9(6), pp. 726-741, 1987.
[5] J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence 8(6), pp. 679-698, 1986.
[6] F. Truchetet, O. Laligant, E. Bourennane, and J. Miteran, "Frame of wavelets for edge detection," in Proceedings of the SPIE - Wavelet Applications in Signal Processing, vol. 2303, pp. 141-152, 1994.
[7] S. Haring, M. A. Viergever, and J. N. Kok, "Kohonen networks for multiscale image segmentation," Image and Vision Computing 12(6), pp. 339-344, 1994.
[8] D. Marr and E. Hildreth, "Theory of edge detection," Proceedings of the Royal Society of London B 207, pp. 187-217, 1980.
[9] T. Lindeberg, "Scale-space for discrete signals," IEEE Transactions on Pattern Analysis and Machine Intelligence 12(3), pp. 234-254, 1990.
[10] S. Mallat and S. Zhong, "Characterization of signals from multiscale edges," IEEE Transactions on Pattern Analysis and Machine Intelligence 14(7), pp. 710-732, 1992.
[11] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, third ed., 1992.
[12] S. Livens, P. Scheunders, G. V. de Wouwer, and D. V. Dyck, "Wavelets for texture analysis, an overview," in 6th Int. Conf. on Image Processing and its Applications, vol. 2, pp. 581-585, 1997.
[13] P. Vautrot, N. Bonnet, and M. Herbin, "Comparative study of different spatial/spatial-frequency methods (Gabor filters, wavelets, wavelets packets) for texture segmentation/classification," in IEEE International Conference on Image Processing, vol. 3, pp. 145-148, 1996.
[14] A. Laine and J. Fan, "Frame representations for texture segmentation," IEEE Transactions on Image Processing 5(5), pp. 771-779, 1996.
[15] R. A. Kiltie, J. Fan, and A. F. Laine, "A wavelet-based metric for visual texture discrimination with applications in evolutionary ecology," Mathematical Biosciences 126(1), pp. 21-39, 1995.
[16] A. Laine and J. Fan, "Texture classification by wavelet packet signatures," IEEE Transactions on Pattern Analysis and Machine Intelligence 15(11), pp. 1186-1190, 1993.
[17] R. Kiltie and A. Laine, "Visual texture, machine vision and animal camouflage," Trends in Ecology and Evolution 7(5), pp. 163-166, 1992.
[18] O. Pichler, A. Teuner, and B. Hosticka, "An unsupervised texture segmentation algorithm with feature space reduction and knowledge feedback," IEEE Transactions on Image Processing 7(1), pp. 53-61, 1998.
[19] W. Wu and S. Wei, "Rotation and gray-scale transform-invariant texture classification using spiral resampling, subband decomposition, and hidden Markov model," IEEE Transactions on Image Processing 5(10), 1996.
[20] C. Lu, P. Chung, and C. Chen, "Unsupervised texture segmentation via wavelet transform," Pattern Recognition 30(5), pp. 729-742, 1997.
[21] A. Pikaz and A. Averbuch, "An efficient topological characterization of gray-levels textures, using a multiresolution representation," Graphical Models and Image Processing 59(1), pp. 1-17, 1997.
[22] R. Porter and N. Canagarajah, "A robust automatic clustering scheme for image segmentation using wavelets," IEEE Transactions on Image Processing 5(4), pp. 662-665, 1996.
[23] K. Kim, I. Jung, and Y. Yang, "High resolution image classification with features from wavelet frames," in Proceedings of the 1997 IEEE International Geoscience and Remote Sensing Symposium, vol. 1, pp. 584-587, 1997.
[24] B. Wang, Y. Motomura, and A. Ono, "Texture segmentation algorithm using multichannel wavelet frames," in IEEE International Conference on Systems, Man, and Cybernetics, vol. 3, pp. 2527-2532, 1997.
[25] X. Zong, A. Meyer, and A. Laine, "Multiscale segmentation through a radial basis neural network," in IEEE International Conference on Image Processing, vol. 3, pp. 400-403, 1997.
[26] S. Liu and E. Delp, "Multiresolution detection of stellate lesions in mammograms," in IEEE International Conference on Image Processing, vol. 2, pp. 109-112, 1997.
[27] C. Busch, "Wavelet based texture segmentation of multi-modal tomographic images," Computers and Graphics 21(3), pp. 347-358, 1997.
[28] S. Pemmaraju, S. Mitra, Y.-Y. Shieh, and G. Roberson, "Multiresolution wavelet decomposition and neuro-fuzzy clustering for segmentation of radiographic images," in IEEE Symposium on Computer-Based Medical Systems, pp. 142-149, 1995.
[29] A. Betti, M. Barni, and A. Mecocci, "Using a wavelet-based fractal feature to improve texture discrimination on SAR images," in IEEE International Conference on Image Processing, vol. 1, pp. 251-254, 1997.
[30] L. Alparone, M. Barni, M. Betti, and A. Garzelli, "Fuzzy clustering of textured SAR images based on a fractal dimension feature," in Proceedings of the 1997 IEEE International Geoscience and Remote Sensing Symposium, vol. 3, pp. 1184-1186, 1997.
[31] J. Boucher and S. Pleihers, "Unsupervised segmentation of radar images using wavelet decomposition and cumulants," in IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 1-4, 1994.
[32] M. Ramos, S. Hemami, and M. Tamburro, "Psychovisually-based multiresolution image segmentation," in IEEE International Conference on Image Processing, vol. 3, pp. 66-69, 1997.
[33] J. Maeda, V. Anh, T. Ishizaka, and Y. Suzuki, "Integration of local fractal dimension and boundary edge in segmenting natural images," in IEEE International Conference on Image Processing, vol. 1, pp. 845-848, 1996.
[34] S. Sheng and P. Chevrette, "Three-dimensional object recognition from two-dimensional images using wavelet transforms and neural networks," in Optical Engineering, vol. 37, pp. 763-770, 1998.
[35] G. Fernandez and T. Huntsberger, "Wavelet-based system for recognition and labeling of polyhedral junctions," Optical Engineering 37(1), pp. 158-165, 1998.
[36] J. Beltran, L. Garcia, and J. Navarro, "Edge detection and classification using Mallat's wavelet," in IEEE International Conference on Image Processing, vol. 1, pp. 293-297, 1994.
[37] J. Zan, B. Zheng, and W. Zhu, "Image compression scheme using wavelets-based edge extraction and low-frequence component expansion," in Proceedings of the International Conference on Signal Processing, vol. 2, pp. 974-977, 1996.
[38] A. Laine and X. Zong, "Border identification of echocardiograms via multiscale edge detection and shape modeling," in IEEE International Conference on Image Processing, vol. 3, pp. 287-290, 1996.
[39] J. Fayolle, C. Ducottet, T. Fournel, and J.-P. Schon, "Motion characterization of unrigid objects by detecting and tracking feature points," in IEEE International Conference on Image Processing, vol. 3, pp. 803-806, 1996.
[40] S. Chang and M. Vetterli, "Spatial adaptive wavelet thresholding for image denoising," in IEEE International Conference on Image Processing, vol. 2, pp. 374-377, 1997.
[41] O. Neiroukh, "Range image segmentation through multiresolution analysis using wavelets," master's thesis, The University of Tennessee, Knoxville, Tennessee, May 1995.
[42] T. Aydin, Y. Yemez, E. Anarim, and B. Sankur, "Multidirectional and multiscale edge detection via M-band wavelet transform," IEEE Transactions on Image Processing 5(9), pp. 1370-1377, 1996.
[43] T. Aydin, Y. Yemez, B. Sankur, E. Anarim, and O. Alkin, "Use of M-band wavelet transform for multidirectional and multiscale edge detection," in IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. v-17-20, 1994.
[44] M. Venkatraman and V. Govindaraju, "Zero crossings of a non-orthogonal wavelet transform for object location," in IEEE International Conference on Image Processing, vol. 3, pp. 57-60, 1995.
[45] Z. Xiong, M. Orchard, and K. Ramchandran, "Inverse halftoning using wavelets," in IEEE International Conference on Image Processing, vol. 1, pp. 569-572, 1996.
[46] K. Cinkler and A. Mertins, "Coding of digital video with the edge-sensitive discrete wavelet transform," in IEEE International Conference on Image Processing, vol. 1, pp. 961-964, 1996.
[47] A. Khashman and K. M. Curtis, "Neural networks arbitration for automatic edge detection of 3-dimensional objects," in IEEE International Conference on Electronics, Circuits, and Systems, vol. 1, pp. 49-52, October 1996.
[48] D. Ziou and S. Tabbone, "A multi-scale edge detector," Pattern Recognition 26(9), pp. 1305-1314, 1993.
[49] T. Lindeberg, "Edge detection and ridge detection with automatic scale selection," in IEEE International Conference on Computer Vision and Pattern Recognition, pp. 465-470, 1996.
[50] J. Elder, The Visual Computation of Bounding Contours. PhD thesis, McGill University, Canada, August 1995.
[51] S. Mahmoodi, B. Sharif, and E. Chester, "Contour detection using multi-scale active shape models,"
[52] T. Lindeberg, Discrete Scale-Space Theory and the Scale-Space Primal Sketch. PhD thesis, Royal Institute of Technology, Stockholm, Sweden, May 1991.
[53] D. J. Williams and M. Shah, "Edge contours using multiple scales," Computer Vision, Graphics, and Image Processing 51(3), pp. 256-274, 1990.
[54] R. Qian and T. Huang, "A two-dimensional edge detection scheme for general visual processing," in International Conference on Pattern Recognition, vol. 1, pp. 595-598, 1994.
[55] R. Qian and T. Huang, "Optimal edge detection in two-dimensional images," IEEE Transactions on Image Processing 5(7), pp. 1215-1220, 1996.
[56] R. T. Whitaker and S. M. Pizer, "A multi-scale approach to nonuniform diffusion," CVGIP: Image Understanding 57(1), pp. 99-110, 1993.
[57] D. Wang, "A multiscale gradient algorithm for image segmentation using watersheds," Pattern Recognition 30(12), pp. 2043-2052, 1997.
[58] L. Baxter and J. Coggins, "Supervised pixel classification using a feature space derived from an artificial visual system," in Proceedings of the SPIE - Intelligent Robots and Computer Vision IX: Algorithms and Techniques, pp. 495-469, November 1990.
[59] S. Haring and M. Viergever, "A multiscale approach to image segmentation using Kohonen networks," in Lecture Notes in Computer Science: Information Processing in Medical Imaging, 1993.
[60] S. Mallat, "A theory for multiresolution signal decomposition: The wavelet representation," IEEE Transactions on Pattern Analysis and Machine Intelligence 11(7), pp. 674-693, 1989.
[61] P. J. Besl, Surfaces in Range Image Understanding, Springer-Verlag, first ed., 1988.
[62] E. D. Lester, "Feature extraction, image segmentation, and surface fitting: The development of a 3D scene reconstruction system," Master's thesis, The University of Tennessee, Knoxville, 1998.
[63] M. Baccar, "Surface characterization using a Gaussian weighted least squares technique towards segmentation of range images," Master's thesis, The University of Tennessee, Knoxville, 1994.
[64] G. J. Klir and T. A. Folger, Fuzzy Sets, Uncertainty, and Information, Prentice Hall, 1988.
[65] G. Shafer, A Mathematical Theory of Evidence, Princeton University, 1976.
[66] R. Pito, "Characterization, calibration and use of the Perceptron laser range finder in a controlled environment," Tech. Rep. MS-CIS-95-05, Department of Computer and Information Science, University of Pennsylvania, 1995.
[67] I. S. Kweon, R. Hoffman, and E. Krotkov, "Experimental characterization of the Perceptron laser rangefinder," Tech. Rep. CMU-RI-TR-91-1, The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, 1991.
[68] K. Low and J. Coggins, "Multiscale vector fields for image pattern recognition," in Proceedings of the SPIE Symposium on Advances in Intelligent Robotics Systems, vol. 1192, 1989.
[69] D. Colella and C. Heil, "The characterization of continuous, four-coefficient scaling functions and wavelets," IEEE Transactions on Information Theory 38(2), pp. 876-881, 1992.
APPENDICES
APPENDIX A
Background
A.1 Wavelet Theory
In image processing the term scale refers to the idea that objects in the world exist
or are relevant over a limited range of sizes or distances [9]. Thus, objects or features
associated with these objects occur at a particular scale within the image. The wavelet
transform provides a tool for extracting information at a particular scale in an image.
The wavelet transform is applied to a signal using a particular wavelet function or
basis function. The basis function determines how the wavelet transform will respond to
a signal. The ability to change this basis function makes the wavelet transform a flexible
tool. The scaling of this basis function determines the size of features or objects to which
the transform is most sensitive.
A.1.1 General
The wavelet transform breaks a signal into frequency components much like Fourier analy-
sis. In contrast to the sine and cosine functions used in Fourier analysis, which are perfectly
local in frequency but global in space, wavelet functions are typically local in both space
and frequency. Because wavelet functions are local in both space and frequency, wavelet
analysis is capable of representing local features of a signal such as sharp peaks or edges.
A wavelet is any function $\psi(x)$ that satisfies the condition stated in equation A.1 (i.e. the
function has equal area above and below the horizontal axis).
$$\int_{-\infty}^{\infty} \psi(x)\,dx = 0 \qquad \text{(A.1)}$$
In addition, wavelet functions usually have compact support, that is, they are nonzero only over a closed and bounded
set of points. According to this definition there are infinitely many valid wavelet functions.
Typically, multi-scale analysis using the wavelet transform is achieved by scaling the
wavelet basis function. The different scales of this basis function are also called dilations.
When the wavelet transform is applied to a signal using a large dilation of the basis
function, large features of the signal are analyzed. A smaller dilation of the basis function
analyzes the details of a signal. Equation A.2 defines the scaling or dilation of a function,
where $\xi_s(x)$ is the function $\xi(x)$ scaled by the factor s.
$$\xi_s(x) = \frac{1}{s}\,\xi\!\left(\frac{x}{s}\right) \qquad \text{(A.2)}$$
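The following sketch applies the dilation of equation A.2 to an example wavelet and checks numerically that each dilated copy still satisfies the zero-area condition of equation A.1. The unnormalized derivative of a Gaussian is used only as a convenient example; the helper names `dog` and `dilate` and the sampling grid are hypothetical choices for illustration.

```python
import numpy as np

def dog(x):
    """Unnormalized first derivative of a unit-variance Gaussian (example wavelet)."""
    return -x * np.exp(-x**2 / 2.0)

def dilate(func, s):
    """Return the dilated function func_s(x) = (1/s) * func(x/s) of equation A.2."""
    return lambda x: func(x / s) / s

# Symmetric grid, wide enough that the tails are negligible at every scale used
x = np.arange(-40000, 40001) / 1000.0
dx = 0.001
results = {}
for s in (1, 2, 4):
    psi_s = dilate(dog, s)(x)
    area = np.sum(psi_s) * dx             # equation A.1: approximately zero
    l1_norm = np.sum(np.abs(psi_s)) * dx  # unchanged by the 1/s normalization
    results[s] = (area, l1_norm)
```

The 1/s factor keeps the integral of the absolute value the same at every scale, so larger dilations are wider but proportionally lower in amplitude.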
Most applications of wavelet analysis involve the wavelet transform. Finding the
wavelet transform of a signal consists of determining the inner product of the signal with
the wavelet basis. There are two basic schemes for representing the wavelet transform of
a signal: the pyramidal scheme [60] and the convolution scheme [10].
The pyramidal implementation of the wavelet transform uses the inner product of the
signal with the wavelet basis translated to different positions in the image. For each scale the wavelet
becomes larger and is therefore translated to fewer positions in the signal. Typically
pyramid algorithms are dyadic, meaning the size of the signal at successive levels of the
pyramid is reduced by a factor of two. A lossless representation of the data being
transformed is achieved with this implementation by using orthogonal wavelets. This
representation may also be created by convolving the dilated wavelet with the signal and
then uniformly sampling the result to obtain half the number of samples of the original
signal.
The convolution implementation creates a group of signals (one signal representing
each scale) that are all the same size. The data structures created by the pyramidal and
convolution representations are shown in figure A.1. The convolution representation of the
wavelet transform of a signal f(x) is defined by equation A.3, where $*$ is the convolution
operator.
$$W_s f(x) = f * \psi_s(x) \qquad \text{(A.3)}$$
Using this equation, the wavelet function $\psi(x)$ can be chosen to have certain properties
desired for a particular application.
Figure A.1: Representations of the Wavelet Transform (pyramid and stack, ordered by increasing scale)
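A minimal sketch of the convolution (stack) representation of equation A.3 is given below for a one-dimensional signal. It assumes a sampled, truncated derivative of a Gaussian as the wavelet and the dyadic scales 1, 2, 4, and 8; the names `dog_kernel` and `wavelet_stack`, the truncation width, and the test signal are hypothetical.

```python
import numpy as np

def dog_kernel(s, half_width=4):
    """Sampled derivative of a Gaussian dilated by s, truncated at +/- half_width*s."""
    x = np.arange(-half_width * s, half_width * s + 1, dtype=float)
    return (1.0 / s) * (-(x / s)) * np.exp(-(x / s) ** 2 / 2.0)

def wavelet_stack(signal, scales=(1, 2, 4, 8)):
    """Convolve the signal with the dilated wavelet at each scale.
    mode="same" keeps every row the same length as the input signal."""
    return np.stack([np.convolve(signal, dog_kernel(s), mode="same") for s in scales])

# Hypothetical test signal: a single step edge in the middle
signal = np.zeros(256)
signal[128:] = 1.0
stack = wavelet_stack(signal)  # one row per scale
```

Each row of `stack` has the same length as the input, and the step edge produces a local extremum at roughly the same position in every row, which is what allows a response vector over scale to be read off at each point.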
A.1.2 Properties of Wavelets
There are several important properties of wavelet functions. Five of these properties are
integral to wavelet analysis for image segmentation. These are: orthogonality, compact
support, symmetry, differentiability, and number of zero crossings.
Orthogonality
An orthogonal wavelet is one in which the basis wavelet is orthogonal to its translations
and dilations [41]. Two orthogonal functions must satisfy equation A.4.
$$\langle \psi_n, \psi_m \rangle = \delta_{n,m} \qquad \text{(A.4)}$$
Examples of orthogonal wavelet functions are the Haar (Figure A.2) and the Daubechies
(Figure A.3) wavelets.
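Equation A.4 can be checked numerically for the Haar wavelet. The sketch below builds a few dyadic dilations and translations of the Haar function on [0, 1), normalized to unit energy, and computes their pairwise inner products. The indexing by a pair (j, k) and the helper names are assumptions made for this illustration.

```python
import numpy as np

def haar(x):
    """Haar mother wavelet on [0, 1): +1 on the first half, -1 on the second."""
    x = np.asarray(x, dtype=float)
    first = np.where((x >= 0.0) & (x < 0.5), 1.0, 0.0)
    second = np.where((x >= 0.5) & (x < 1.0), 1.0, 0.0)
    return first - second

def haar_jk(x, j, k):
    """Dilation by 2**j and translation by k, normalized to unit energy."""
    return 2.0 ** (j / 2.0) * haar(2.0 ** j * x - k)

# Midpoints of a fine grid on [0, 1); dyadic breakpoints fall on cell edges
x = (np.arange(4096) + 0.5) / 4096
dx = 1.0 / 4096
members = [(0, 0), (1, 0), (1, 1), (2, 3)]
gram = np.array([[np.sum(haar_jk(x, j1, k1) * haar_jk(x, j2, k2)) * dx
                  for (j2, k2) in members] for (j1, k1) in members])
```

The resulting matrix of inner products is numerically the identity: each member has unit energy and distinct members are orthogonal.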
Figure A.2: Haar wavelet
Figure A.3: Daubechies wavelet
Figure A.4: Derivative of a quadratic spline wavelet
Compact Support
A function has compact support if its nonzero values are contained in a closed, bounded set
of points [41], such that:
$$\exists\, \epsilon > 0 : f(x) = 0, \; \forall\, |x| > \epsilon. \tag{A.5}$$
A wavelet function that does not have compact support requires truncation for
implementation on a computer. For example, the derivative of a Gaussian (DOG), a valid wavelet
function (figure A.5), must be truncated. Truncation of a function causes several
problems for implementation. One problem is that inaccurate values occur when taking the
derivative of the function. These inaccurate values exist at the truncation points of the
function. Some examples of compactly supported wavelets are the Haar, the Daubechies,
and the derivative of a quadratic spline (figure A.4) wavelets.
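The truncation problem can be seen numerically. The sketch below evaluates a DOG at several
cutoff points. The unnormalized form of the DOG, the choice of sigma = 1, and the cutoff values
are assumptions made for this example.

```python
import numpy as np

def dog(x, sigma=1.0):
    """First derivative of a Gaussian (unnormalized; the scaling is an assumption)."""
    return -x / sigma**2 * np.exp(-x**2 / (2 * sigma**2))

# Magnitude of the jump to zero introduced by cutting the DOG off at |x| = cutoff.
edge_values = {cutoff: abs(dog(cutoff)) for cutoff in (2.0, 3.0, 4.0)}
for cutoff, jump in edge_values.items():
    print(f"cutoff = {cutoff}: |DOG| at the truncation point = {jump:.2e}")
```

The value at the cutoff is never exactly zero, so the truncated function has a small
discontinuity there. A wider cutoff reduces the jump but lengthens the filter.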
Symmetry
In this work, functions that have an axis of either symmetry or antisymmetry are considered
symmetric. A function has an axis of symmetry if f(x) = f(-x) and an axis of antisymmetry
if f(x) = -f(-x). Some examples of symmetric wavelets are the Haar, the derivative of a
quadratic spline, and DOG functions. The Daubechies wavelet (Figure A.3), for example, is
not symmetric.
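The two conditions can be tested on sampled functions, as in the sketch below. It assumes the
samples lie on a grid centered on the candidate axis. The helper name `symmetry_type`, the
sampled shapes, and the use of the standard four Daubechies filter values as an asymmetric
example are choices made here.

```python
import numpy as np

def symmetry_type(samples, tol=1e-12):
    """Classify samples taken on a grid centered on the candidate axis."""
    flipped = samples[::-1]
    if np.allclose(samples, flipped, atol=tol):
        return "symmetric"        # f(x) = f(-x)
    if np.allclose(samples, -flipped, atol=tol):
        return "antisymmetric"    # f(x) = -f(-x)
    return "neither"

x = np.arange(-200, 201) / 50.0                  # exactly symmetric grid on [-4, 4]
gaussian = np.exp(-x**2 / 2)
dog = -x * gaussian                              # derivative of the Gaussian
haar = np.sign(-x) * (np.abs(x) <= 1)            # Haar shape centered at x = 0
daub4 = np.array([1 + np.sqrt(3), 3 + np.sqrt(3), 3 - np.sqrt(3), 1 - np.sqrt(3)]) / 4

for name, s in [("gaussian", gaussian), ("dog", dog), ("haar", haar), ("daub4", daub4)]:
    print(name, symmetry_type(s))
```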
Figure A.5: DOG Wavelet
Differentiability
Differentiability, in this work, refers to a function that has a derivative at each point on
the interval $(-\infty, \infty)$. In addition to the existence of a first derivative of the wavelet
function, there is also interest in how many derivatives can be taken beyond this first
derivative. For example, the DOG function can be differentiated infinitely many times, whereas
the derivative of a quadratic spline function cannot.
Number of Zero Crossings
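A zero crossing occurs in the function f(x) at the point a if
$$\exists\, \epsilon > 0 : \quad f(x) > 0 \;\; \forall\, a < x < a + \epsilon, \quad f(x) < 0 \;\; \forall\, a - \epsilon < x < a, \quad \text{and} \quad f(a) = 0,$$
or if
$$\exists\, \epsilon > 0 : \quad f(x) < 0 \;\; \forall\, a < x < a + \epsilon, \quad f(x) > 0 \;\; \forall\, a - \epsilon < x < a, \quad \text{and} \quad f(a) = 0.$$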
Applying the wavelet transform to a signal that contains discontinuities yields a signal
containing local extrema. The number of zero crossings in the wavelet function corresponds
directly to the number of extrema in the result of applying the wavelet transform to a
signal that contains a single discontinuity. The derivative of a quadratic spline function
and the DOG function both have one zero crossing.
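The correspondence between zero crossings and extrema can be checked numerically. The sketch
below convolves a sampled DOG with a single step and counts sign changes. The filter width,
signal length, tolerance, and the helper name `count_sign_changes` are assumptions made for
this example.

```python
import numpy as np

def count_sign_changes(values, tol=1e-9):
    """Count sign changes, ignoring samples whose magnitude is below tol."""
    signs = np.sign(values[np.abs(values) > tol])
    return int(np.sum(signs[1:] != signs[:-1]))

n = np.arange(-40, 41)
sigma = 8.0
dog = -n / sigma**2 * np.exp(-n**2 / (2 * sigma**2))   # zero only at n = 0

step = np.concatenate([np.zeros(200), np.ones(200)])   # a single discontinuity
response = np.convolve(step, dog, mode="valid")        # 'valid' avoids border effects

zero_crossings = count_sign_changes(dog)
extrema = count_sign_changes(np.diff(response))        # slope sign changes mark extrema
print(zero_crossings, extrema)
```

The slope of the response to a step reproduces the wavelet samples, which is why the two
counts agree.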
A.1.3 Examples of Wavelet Functions
Derivative of Gaussian
The DOG is a valid wavelet function because it integrates to zero. The DOG function is
infinitely differentiable and symmetric, but it is not orthogonal and does not have
compact support. The DOG function was used in image processing before the establishment
of wavelet theory. As explained in [10], the DOG function plays an important role in
connecting the field of wavelets to conventional edge detection.
Four-Coefficient Wavelets
One common way to create wavelets is to use a recursive algorithm that relies on a
four-coefficient weighting function [69]. Each set of coefficient values creates a
particular wavelet. Applying rules to the values of the coefficients allows the creation of
wavelet functions with particular properties.
A scaling function, $\phi(x)$, is used to produce the wavelet function. This scaling function
must have an area equal to 1, that is,
$$\int_{-\infty}^{\infty} \phi(x)\, dx = 1. \tag{A.6}$$
The scaling function is produced recursively using
$$\phi(x) = c_0 \phi(2x) + c_1 \phi(2x - 1) + c_2 \phi(2x - 2) + c_3 \phi(2x - 3), \tag{A.7}$$
where $c_0$, $c_1$, $c_2$, and $c_3$ are chosen coefficients. The values of the coefficients
ultimately determine the characteristics of the wavelet function generated from this
recursively-defined scaling function.
Given four initial values for $\phi(x)$, the recursive application of equation A.7 generates
another discrete function. Successive applications of equation A.7 produce scaling
functions with progressively higher resolutions. For instance, the second resolution has twice
as many samples (8) as the initial resolution (4). Equation A.7 can then be applied to
the second resolution to yield a third resolution, which contains four times as many
samples (16) as the initial resolution.
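The recursion can be sketched as follows. This is a minimal illustration under stated
assumptions: the grid starts at x = 0, samples beyond x = 3 are zero padding so that the sample
count doubles exactly (4, 8, 16, ...), and the coefficient values and initial samples are the
quadratic-spline ones, chosen only as an example. The helper name `refine` is hypothetical.

```python
import numpy as np

def refine(phi, coeffs, level):
    """Apply equation A.7 once: samples at spacing 2**-level -> spacing 2**-(level + 1)."""
    step = 2 ** level                  # a shift of k in 2x is k * step samples on the new grid
    out = np.zeros(2 * len(phi))       # the number of samples doubles at each level
    for k, c in enumerate(coeffs):
        start = k * step
        out[start : start + len(phi)] += c * phi
    return out

coeffs = np.array([0.25, 0.75, 0.75, 0.25])   # assumed example weights (sum to 2)
phi = np.array([0.0, 0.5, 0.5, 0.0])          # four initial values at x = 0, 1, 2, 3

levels = 6
for level in range(levels):
    phi = refine(phi, coeffs, level)

dx = 1.0 / 2 ** levels
print(len(phi), phi.sum() * dx)               # sample count and discrete area
```

Because the coefficients sum to 2 and the spacing is halved at each step, the discrete area
of the samples is unchanged by each application of `refine`.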
In some implementations of this recursive algorithm $\phi(x)$ is bounded to the interval
[0,3] and the finer resolutions represent samples that lie in between those at the coarser
resolutions. Nonetheless, $\phi(x)$ must have area 1 on the interval [0,3]. For computer
implementations of this algorithm, values of the continuous signal are stored as discrete
samples. In this case the sum of all these samples, each weighted by the sample spacing
$\Delta x$, must be 1. Equation A.8 shows the criterion for the scaling function (area equal
to 1) expressed for the continuous and discrete cases of the recursive algorithm. With each
increase in resolution the number of samples of $\phi(x)$ is doubled. This means that in the
discrete case $\Delta x$ is halved with each recursion.
$$\int_{0}^{3} \phi(x)\, dx = \sum_{x=-\infty}^{\infty} \phi(x)\, \Delta x = 1 \tag{A.8}$$
where $\Delta x = 1/2^{j}$ for the $j$th level of recursion. This leads to the constraint that
$$\sum_{i=0}^{3} c_i = 2. \tag{A.9}$$
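The step from the unit-area requirement to equation A.9 can be made explicit by integrating
both sides of the recursion in equation A.7 and substituting $u = 2x - i$ in each term. This
is a sketch of the reasoning for the continuous case:
$$\int \phi(x)\, dx = \sum_{i=0}^{3} c_i \int \phi(2x - i)\, dx = \frac{1}{2} \sum_{i=0}^{3} c_i \int \phi(u)\, du .$$
Since both integrals of $\phi$ equal 1, the coefficients must satisfy $\sum_{i=0}^{3} c_i = 2$.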
The wavelet function is produced from the scaling function by application of equation
A.10.
$$\psi(x) = c_3 \phi(2x) - c_2 \phi(2x - 1) + c_1 \phi(2x - 2) - c_0 \phi(2x - 3) \tag{A.10}$$
Because the scaling function sums to 1 the wavelet function will sum to 0. This leads to
the following constraint for the values of the coefficients:
$$c_0 + c_2 = c_1 + c_3 = 1. \tag{A.11}$$
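The constraint can be traced by integrating the wavelet equation (A.10) term by term, using
the same substitution $u = 2x - i$ and the unit area of the scaling function. This is a sketch
of the reasoning for the continuous case:
$$\int \psi(x)\, dx = \frac{1}{2} \left( c_3 - c_2 + c_1 - c_0 \right) \int \phi(u)\, du = \frac{1}{2} \left( c_1 + c_3 - c_0 - c_2 \right) .$$
Requiring this to be 0 gives $c_0 + c_2 = c_1 + c_3$. Because the four coefficients sum to 2
(equation A.9), each of the two sums must equal 1.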
Figure A.6: 4 Coefficient Wavelet Space (axes: Odd and Even; labeled regions: Orthogonal, Symmetric, Differentiable; marked wavelets: Quadratic Spline, Daubechies, Haar)
This allows the four coefficients $c_0$, $c_1$, $c_2$, and $c_3$ to be reduced to two
coefficients, $c_o$ and $c_e$, which are referred to as odd and even respectively (equation A.12).
$$c_0 = c_e, \quad c_1 = 1 - c_o, \quad c_2 = 1 - c_e, \quad c_3 = c_o \tag{A.12}$$
Figure A.6 shows the space of odd and even coefficient values and labels the properties
of coefficients in that space. If $c_o = c_e$, then the wavelet function will be symmetric. If
$c_o + c_e = 1/2$, then the wavelet function is differentiable. Values of the odd and even
coefficients that lie on the circle labeled "Orthogonal" create orthogonal wavelets. These
four-coefficient wavelets allow us to choose desirable properties and easily create a wavelet
with these properties. Notice that a four-coefficient wavelet cannot be simultaneously
symmetric, differentiable, and orthogonal.
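The conditions above can be evaluated for any pair of odd and even coefficients, as in the
sketch below. The text does not give the equation of the "Orthogonal" circle, so this example
assumes it is the standard double-shift condition $c_0 c_2 + c_1 c_3 = 0$, which in this
parameterization is a circle of radius $1/\sqrt{2}$ centered at (1/2, 1/2). The coefficient
values used for the three named wavelets are the standard ones and are also assumptions, as
are the helper names.

```python
import math

def four_coefficients(c_odd, c_even):
    """Expand the odd/even pair into (c0, c1, c2, c3) using equation A.12."""
    return (c_even, 1.0 - c_odd, 1.0 - c_even, c_odd)

def properties(c_odd, c_even, tol=1e-9):
    """Evaluate the three conditions described for Figure A.6."""
    c0, c1, c2, c3 = four_coefficients(c_odd, c_even)
    return {
        "symmetric": abs(c_odd - c_even) < tol,
        "differentiable": abs(c_odd + c_even - 0.5) < tol,
        # Assumed circle: c0*c2 + c1*c3 = 0, equivalent to
        # (c_odd - 1/2)**2 + (c_even - 1/2)**2 = 1/2.
        "orthogonal": abs(c0 * c2 + c1 * c3) < tol,
    }

examples = {
    "Haar": (0.0, 0.0),
    "Quadratic spline": (0.25, 0.25),
    "Daubechies": ((1 - math.sqrt(3)) / 4, (1 + math.sqrt(3)) / 4),
}
for name, (c_odd, c_even) in examples.items():
    print(name, properties(c_odd, c_even))
```

Under these assumptions each example satisfies exactly two of the three conditions, which is
consistent with the observation that no four-coefficient wavelet satisfies all three.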
VITA
Samuel Burgiss was born in Raleigh, North Carolina in 1972. He lived in Raleigh until
1975 when he moved with his parents to eastern Florida. After a short time he moved to
Knoxville, TN. Samuel graduated from Farragut High School in 1990. After high school he
left Knoxville to attend North Carolina State University. He received his BS in Computer
Engineering in May of 1994. Upon graduation Samuel worked as a research assistant at the
Image Processing Lab at the University of Tennessee Medical Center. After working there
for five months Samuel began work as a Systems Analyst at Phillips Consumer Electronics.
Samuel enrolled in the MSEE program at the University of Tennessee Knoxville in 1996
where he is currently working as a Graduate Research Assistant in the Imaging, Robotics
and Intelligent Systems Laboratory under the supervision of Dr. R. T. Whitaker and
Dr. M. A. Abidi. He expects to graduate in August 1998 with a specialization in image
processing and robotic vision.