
A survey and evaluation of promising approaches for automatic image-based defect detection of bridge structures

Mohammad R. Jahanshahi^a, Jonathan S. Kelly^b, Sami F. Masri^a,* and Gaurav S. Sukhatme^b

^a Department of Civil and Environmental Engineering, University of Southern California, Los Angeles, California, USA; ^b Computer Science Department, University of Southern California, Los Angeles, California, USA

(Received 4 September 2007; final version received 28 January 2008)

Automatic health monitoring and maintenance of civil infrastructure systems is a challenging area of research. Nondestructive evaluation techniques, such as digital image processing, are innovative approaches for structural health monitoring. Current structure inspection standards require an inspector to travel to the structure site and visually assess the structure's condition. A less time-consuming and inexpensive alternative to current monitoring methods is to use a robotic system that could inspect structures more frequently. Among several possible techniques is the use of optical instrumentation (e.g. digital cameras) that relies on image processing. The feasibility of using image processing techniques to detect deterioration in structures has been acknowledged by leading experts in the field. A survey and evaluation of relevant studies that appear promising and practical for this purpose is presented in this study. Several image processing techniques, including enhancement, noise removal, registration, edge detection, line detection, morphological functions, colour analysis, texture detection, wavelet transform, segmentation, clustering and pattern recognition, are key pieces that could be merged to solve this problem. Missing or deformed structural members, cracks and corrosion are the main types of deterioration found in structures, and they are the main examples of structural deterioration considered here. This paper provides a survey and an evaluation of some of the promising vision-based approaches for automatic detection of missing (deformed) structural members, cracks and corrosion in civil infrastructure systems. Several examples (based on laboratory studies by the authors) are presented in the paper to illustrate the utility, as well as the limitations, of the leading approaches.

Keywords: image processing; pattern recognition; crack; corrosion; bridge inspection; defect detection

1. Introduction

Change detection by means of digital image processing has been used in several fields, including homeland security and safety (Shinozuka 2003), product quality control (Garcia-Alegre et al. 2000), system identification (Shinozuka et al. 2001, Chung et al. 2004), aircraft skin inspections (Siegel and Gunatilake 1998), video surveillance (Collins et al. 2000), aerial sensing (Watanabe et al. 1998, Huertas and Nevatia 2000), remote sensing (Goldin and Rudahl 1986, Bruzzone and Serpico 1997, Deer and Eklund 2002, Peng et al. 2004), medical applications (Dumskyj et al. 1996, Lemieux et al. 1998, Thirion and Calmon 1999, Rey et al. 2002, Bosc et al. 2003), underwater inspections (Lebart et al. 2000, Edgington et al. 2003), transportation systems (Achler and Trivedi 2004) and nondestructive structural health monitoring (Dudziak et al. 1999, Abdel-Qader et al. 2003, Sinha et al. 2003, Poudel et al. 2005). These applications of digital image processing share common steps that have been reviewed by Singh (1989), Coppin and Bauer (1996), Lu et al. (2004) and, most recently, by Radke et al. (2005). As change detection techniques are problem-oriented, most image processing approaches are limited to detecting only one type of defect at a time. The aim of this review is to present and evaluate the steps and algorithms that are necessary for detecting various changes simultaneously in three-dimensional (3D) truss structures by digital image processing. In this study, the changes of main concern are missing (deformed) parts, cracks and rust; however, other changes (e.g. missing bolts) can also be detected using nondestructive techniques such as infrared imaging (Shubinsky 1994).

1.1. Motivation

Traditional bridge inspection is time consuming and expensive because it requires an expert to visually inspect the structure site for changes. Yet, visual inspection remains the most commonly used technique to detect damage, since many of the bridges in the United States are old and not instrumented with sensor systems. In the case of special structures, such as long-span bridges, access to critical locations for visual inspection can be difficult (Pines and Aktan 2002). A robotic system that could inspect structures more frequently, accompanied by other nondestructive techniques, would be a great advance in infrastructure maintenance. The use of digital cameras, image processing and pattern recognition techniques is an appropriate approach to reach this goal.

*Corresponding author. Email: [email protected]

Structure and Infrastructure Engineering, Vol. 5, No. 6, December 2009, 455–486. ISSN 1573-2479 print / ISSN 1744-8980 online. © 2009 Taylor & Francis. DOI: 10.1080/15732470801945930. http://www.informaworld.com



1.2. Background

An automatic crack detection procedure for welds based on magnetic particle testing (Coffey 1988) was introduced by Ho et al. (1990). This method can only be used on ferromagnetic materials. First, the testing surface is sprayed with white paint to reduce the initial noise of subsequently captured images. Next, a magnetic field is applied to the weld. Then, magnetic ink made of small magnetic particles suspended in oil is sprayed over the testing surface. The change of flux density at the crack causes the magnetic particles to trace out the shape of the crack on the weld surface. Lastly, an image of the prepared surface is captured and cracks are detected by means of the Sobel edge detection operator (Duda and Hart 1973, Gonzalez et al. 2004) and a boundary tracing algorithm. The results were satisfactory, as reported by Ho et al. (1990), but the technique has drawbacks since a preprocessing step is required.

Tsao et al. (1994) combined image analysis with an expert system module to detect spalling and transverse cracks in pavements. The overall accuracy of the system for detecting spalling and transverse cracks was reported to be 85% and 90%, respectively (Chae 2001). Kaseko et al. (1994) and Wang et al. (1998) used image processing and neural network techniques to detect defects in pavements.

Siegel and Gunatilake (1998) developed a remote visual inspection system for aircraft surfaces. To detect cracks, their proposed algorithm detects rivets, since cracks propagate from rivet edges. Multi-scale edge detection is used to detect the edges of small defects at small scales and the edges of large defects at large scales. By tracing edges from high scale to low scale, it is possible to define the propagation depth of edges. Using other features based on wavelet transformation (Prasad and Iyenger 1997, Abtoine et al. 2004) and a trained back-propagation neural network (Duda et al. 2001), cracks can be distinguished from other defects such as scratches. Corroded regions can also be detected by defining features based on two-dimensional (2D) discrete wavelet transformation of the captured images and using a neural network classifier (Siegel and Gunatilake 1998).

Nieniewski et al. (1999) developed a visual system that can detect cracks in ferrites. A morphological detector based on a top-hat transform (Salembier 1990) detects irregular changes of brightness, which can lead to crack detection. A k-nearest neighbours classifier (Duda et al. 2001) is used to distinguish cracks from grooves. The outcome of this study is very promising, and the technique remains robust in the presence of noise, unlike other edge detection operators used for crack extraction.

Moselhi and Shehab-Eldeen (2000) used image analysis techniques and neural networks to automatically detect and classify defects in sewer pipes. The accuracy rate of the proposed algorithm is 98.2%, as reported by the authors.

Chae (2001) proposed a system consisting of image processing techniques, along with neural networks and fuzzy logic systems, for automatic detection of defects (including cracks) in sewer pipelines.

Benning et al. (2003) used photogrammetry to measure the deformations of reinforced concrete structures. A grid of circular targets is established on the testing surface, and up to three cameras capture images of the surface simultaneously. The relative distances between the centres of adjacent targets make it possible to monitor the evolution of cracks.

Abdel-Qader et al. (2003) analysed the efficacy of different edge detection techniques in identifying cracks in the concrete pavements of bridges. They concluded that the fast Haar transform (FHT), which is a wavelet transform with a Haar mother wavelet, has the most accurate crack detection capability in comparison with fast Fourier transforms (FFTs) and the Sobel and Canny edge detection operators (Bachmann et al. 2000, Alageel and Abdel-Qader 2002).

A study on using computer vision techniques for automatic structural assessment of underground pipes has been carried out by Sinha et al. (2003). The algorithm proposed by Sinha et al. (2003) consists of image processing, segmentation, feature extraction, pattern recognition and a proposed neuro-fuzzy network for classification.

Choi and Kim (2005) applied machine vision techniques to evaluate and classify different types of surface corrosion damage. They proposed a set of morphological attributes (colour, texture and shape) as appropriate features to be used for classification. The hue-saturation-intensity (HSI) colour space (Gonzalez and Wintz 1987, Pratt 2001) and co-occurrence matrix methods are used for colour and texture features, respectively. Principal component analysis (PCA) (Jolliffe 2002) is used to optimise the selection of the features.

Giakoumis et al. (2006) detected cracks in digitised paintings by thresholding the output of the morphological top-hat transform. Sinha and Fieguth (2006b) detected defects in underground pipe images by thresholding the morphological opening of the pipe images using different structuring elements.

1.3. Scope

As noted above, the aim of this study is to present and evaluate the feasibility of the steps that are essential for vision-based automatic health monitoring of structures. The focus of this study is mainly on visual imaging and image processing techniques that can detect cracks and corrosion. The ultimate objective is to develop a robotic system that can independently navigate underneath bridges and capture images. Different image acquisition systems, including digital cameras and infrared imaging devices, can be used for this purpose. The system should detect and classify cracks, corrosion, missing parts and deformed members. Apart from detecting defects, the system should also be able to localise the position of the deterioration. Clearly, detecting a defect in a large bridge that contains repeats of the 3D truss system pattern is difficult, even for a human being; therefore, localisation is crucial to resolving the problem. A set of steps and algorithms that are most promising for reaching this goal is presented in this study.

A quick review of preprocessing techniques is presented in §2, since preprocessing of the captured images might be a prerequisite for the application of other algorithms. Image registration and its role in the defined problem are discussed in §3; this section is useful for detecting missing or deformed structural members. In §4, a brief review of pattern recognition concepts is presented, and supervised and unsupervised classification algorithms are introduced. Classification plays an important role in differentiating defects from non-defective changes, and some important and useful classification techniques are discussed in that section. For corrosion detection, the wavelet filter bank approach, which has applications in both crack extraction and texture segmentation, is used. Section 5 briefly reviews wavelet decomposition and reconstruction of images. Section 6 focuses on common edge detection techniques and on some morphological techniques useful for crack extraction; the introduction to morphological techniques is included in §6.2. Section 7 discusses promising texture segmentation techniques and colour characteristics of corrosion, which could be a key tool in separating defective parts from non-defective ones. Finally, conclusions are discussed in §8.

2. Preprocessing

Preprocessing consists of a series of steps that prepare the image for further processing. These enhancement techniques, including image smoothing, image sharpening, contrast modification and histogram modification, can be found in almost any digital image processing book (e.g. Gonzalez and Wintz 1987, Gonzalez and Woods 1992, Pratt 2001, Gonzalez et al. 2004).

The purpose of image smoothing is to reduce noise in an image. Some practical and useful image smoothing techniques are described below.

2.1. Neighbourhood averaging (mean filter)

The average grey-level value of a neighbourhood replaces the original value as the new value in the smoothed image. Although this technique is very simple, it blurs any sharp edges. To overcome this shortcoming, it is necessary to average the brightness values of only those pixels in the neighbourhood that have a brightness similar to that of the pixel being processed. The most important factors in this technique are the neighbourhood and the weights assigned for averaging the values within the neighbourhood. In fact, neighbourhood averaging can be written as a 2D convolution obtained by sliding a kernel over the grey-scale image. This convolution can be written mathematically as:

Q(i, j) = \sum_{k=-m}^{m'} \sum_{l=-n}^{n'} I(i + k - 1,\; j + l - 1)\, K(k + m + 1,\; l + n + 1)    (1)

where I is the grey-scale (M + m + m') × (N + n + n') image derived from the initial M × N image by mirroring the border elements to create a larger matrix, i and j are the pixel coordinates in the convolved image, and k and l are parameters used in the summation operator to specify the coordinates of each kernel value in the kernel window. In this way, Q will be M × N as well. Each value in the image matrix is the brightness value of the relevant pixel, and K is the kernel with m + m' + 1 rows and n + n' + 1 columns.
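As a concrete illustration of Equation (1), the minimal sketch below applies a uniform averaging kernel to a grey-scale image with mirrored borders. NumPy and SciPy are assumed tools (they are not named in the paper), and the image array is a synthetic placeholder.

```python
import numpy as np
from scipy.ndimage import convolve

def mean_filter(img: np.ndarray, size: int = 3) -> np.ndarray:
    """Neighbourhood averaging: convolve a grey-scale image with a uniform
    kernel; border pixels are mirrored, as assumed by Equation (1)."""
    kernel = np.full((size, size), 1.0 / size**2)
    return convolve(img.astype(float), kernel, mode='mirror')

# Example on a synthetic noisy image (placeholder data).
img = np.random.rand(64, 64)
smoothed = mean_filter(img, size=5)
```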

A Gaussian smoothing kernel can be used as a suitable neighbourhood averaging kernel. This kernel can be estimated from a 2D isotropic Gaussian distribution, as shown in Equation (2):

G(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)    (2)

where σ is the standard deviation. Since the digital image is a set of discrete pixels, a discrete approximation of the Gaussian distribution is needed. One can assume that the value of the Gaussian distribution for points further than three standard deviations from the mean is zero. Horn (1986) has introduced a way to approximate continuous Gaussian functions with discrete filters.
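A minimal sketch of the discrete approximation mentioned above: Equation (2) is sampled on a grid truncated at three standard deviations and normalised so that the weights sum to one. This is an illustrative construction (NumPy assumed), not the specific discrete filter of Horn (1986).

```python
import numpy as np

def gaussian_kernel(sigma: float) -> np.ndarray:
    """Sample Equation (2) on a (2r+1) x (2r+1) grid, r = ceil(3*sigma),
    then normalise so the discrete weights sum to one."""
    r = int(np.ceil(3 * sigma))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return g / g.sum()

kernel = gaussian_kernel(1.0)   # 7 x 7 kernel for sigma = 1
```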

2.2. Median filter

It is possible to overcome the image blurring of the neighbourhood averaging method by choosing a threshold; however, the threshold is usually based on extensive trial and error. Instead, the grey level of each pixel can be replaced by the median value of the grey levels of the neighbouring pixels. This simple nonlinear filter is referred to as a median filter. The technique is suitable when crack detection is of interest.

2.3. Averaging of multiple images

Suppose that a noisy image J_k(i, j) is formed as follows:

J_k(i, j) = I(i, j) + Z_k(i, j)    (3)

where I(i, j) is the original image and Z_k(i, j) is noise. Assuming that the noise is uncorrelated across pixels and has zero mean value, the average of multiple images can be written as:

\bar{J}(i, j) = \frac{1}{N} \sum_{k=1}^{N} J_k(i, j)    (4)

It then follows that:

E\{\bar{J}(i, j)\} = I(i, j)    (5)

and

\sigma_{\bar{J}(i, j)} = \frac{1}{\sqrt{N}}\, \sigma_{Z(i, j)}    (6)

where E is the mathematical expectation (mean) and σ is the standard deviation. Equations (5) and (6) show that, as the number of images increases, the averaged image converges to the original image and the standard deviation of the pixel values decreases. This is a suitable way to remove noise; however, the noisy images should be properly registered prior to averaging (Gonzalez and Wintz 1987, Gonzalez and Woods 1992).
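The following sketch, with purely synthetic data, demonstrates Equations (3)–(6): averaging N registered noisy frames of the same scene reduces the noise standard deviation by roughly a factor of 1/sqrt(N).

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.zeros((64, 64))                             # I(i, j), a flat test image
frames = [clean + rng.normal(0.0, 0.1, clean.shape)    # J_k = I + Z_k, Equation (3)
          for _ in range(25)]

averaged = np.mean(frames, axis=0)                     # Equation (4)
print(frames[0].std(), averaged.std())                 # ~0.1 versus ~0.1/sqrt(25) = 0.02
```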

3. Image registration

Image registration is the process of matching two or more images of the same scene. These images can be captured at different times, from different orientations, or even by different types of sensors. Image registration is a fundamental task in many change detection procedures. A simple example is to register two images taken at different times and from different orientations, and to identify the differences between the latter image and the former reference image as a measure of possible changes. Subtracting the two images easily detects changes; however, the registration of images is not always an easy task, especially if the images include 3D scenes with occlusions. Two comprehensive survey papers on image registration have been published by Brown (1992) and Zitova and Flusser (2003).

Image registration could be key to solving problems such as the visual detection of missing or deformed structural members. It could also help localise detected changes through 3D reconstruction of truss systems. The crack and corrosion detection techniques discussed in §6 and §7 are independent of image registration, because current image registration algorithms are insufficient for 3D reconstruction of a truss in the presence of occlusions. The authors believe that image registration is a key piece of the puzzle for solving the problem, and for this reason a quick review is presented here. Most image registration techniques consist of feature detection, feature matching, transformation and image resampling, as described below.

3.1. Feature detection

In feature detection, control points such as distinctive objects, edges, topographies, points, line intersections and corners are detected. From a robotic standpoint, it is desirable that these control points be detected automatically rather than manually, and there are many well-developed algorithms that can detect them. The Moravec corner detection algorithm is a simple and fast point feature algorithm (Moravec 1977, 1979); however, it is sensitive to rotation, such that the points extracted from one image differ from those extracted from the same image after rotation (an anisotropic response) (Parks and Gravel 2005). The Moravec operator is also highly sensitive to edges, which means that anything that looks like an edge (i.e. noise) may cause the intensity variation to become significant; the intensity variation is the main criterion for detecting a corner in this method (Parks and Gravel 2005).

The Harris (Plessey) operator (Harris and Stephens 1988) is another algorithm, which has a higher detection rate than the Moravec operator. The former operator is more robust than the latter in terms of repeatability (Parks and Gravel 2005). On the other hand, the Harris detector is computationally more costly than the Moravec detector. Since the Harris technique is based on gradient variations, it is also sensitive to noise. Recent modifications have made this method capable of responding isotropically (Parks and Gravel 2005). Figure 1 shows 563 detected features in a real 3D truss model using the Harris operator.

Recently, the scale-invariant feature transform (SIFT) (Lowe 2004) has become a popular choice for feature detection. SIFT features are invariant to changes in scale and rotation, and partially invariant to changes in 3D viewpoint and illumination. The SIFT operator is also highly discriminative and robust to significant amounts of image noise. Features are identified by finding local extrema in a scale-space representation of the image. For each extremum point, SIFT then computes a gradient orientation histogram over a region around the point, producing a 128-element descriptor vector. Although SIFT offers better repeatability than the Moravec or Harris operators, the algorithm is computationally more expensive. Figure 2 shows 600 detected features of the same truss system shown in Figure 1 using the SIFT detector.
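For readers who want to reproduce Figures 1 and 2 qualitatively, the sketch below detects Harris corners and SIFT keypoints with OpenCV. The library, file name and parameter values are assumptions for illustration, not choices reported by the authors.

```python
import cv2
import numpy as np

img = cv2.imread('truss.jpg')                     # placeholder image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Harris corner response; keep local responses above a fraction of the peak.
harris = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
corners = np.argwhere(harris > 0.01 * harris.max())

# SIFT keypoints with 128-element descriptors (scale- and rotation-invariant).
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)
print(len(corners), 'Harris corners,', len(keypoints), 'SIFT keypoints')
```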

3.2. Feature matching

After extracting appropriate features, the next step is to find the correct matching features in the reference image and the image to be registered. Feature matching can be carried out manually by a human operator; however, it is more desirable to match the extracted features automatically. Automatic feature matching is a critical stage in image registration: matching inappropriate features will lead to a registration failure.

An initial estimate is necessary to identify the correspondences between the extracted features in the target image and their correct matches in the reference image. The common technique for making this estimate is to calculate a quantitative descriptor for each feature. Next, a distance matrix A is computed, where each element A_ij indicates the closeness of the ith feature descriptor in the target image and the jth feature descriptor in the reference image (Ringer and Morris 2001). The initial matching features are selected as the smallest elements of A, with no row or column selected more than once. Ringer and Morris (2001) and Scott and Longuet-Higgins (1991) provide further details about different descriptor distance matrix calculation methods.

Figures 3 and 4 show examples of 75 putative matched features in two different images of a single truss system using the Harris and SIFT detectors, respectively. There are false matches in these images. In order to improve the correspondence estimation, outliers (defined as incorrectly matched features) must be identified. A very useful technique in this regard is random sample consensus (RANSAC) (Fischler and Bolles 1981).

When there are two images of a single scene taken from different viewpoints, any point in one image lies along a line in the other image (Faugeras 1993). This condition can be imposed as the following constraint:

u^{T} F v = 0    (7)

where u and v are corresponding points in the two images, expressed in the form [x, y, 1]^T, and T is the transpose operator. The fundamental matrix F in Equation (7) relates any two corresponding points of two images that represent the same point of a scene. More details and estimation techniques for this matrix are provided by Torr and Murray (1997), Zhang (1998) and Hartley and Zisserman (2000).

Figure 1. Harris detector – feature points (indicated by the '+' symbol).

Figure 2. SIFT detector – feature points (indicated by the '+' symbol).


In the RANSAC algorithm, a number of features are randomly chosen to calculate the matrix F. These features are initially estimated to be appropriate matches using the descriptor distance matrix. For each pair of corresponding features, the error is defined as the result of evaluating the left-hand side of Equation (7) for the estimated F. Those pairs with errors greater than a threshold are detected as outliers. This procedure is repeated several times until the least total error is obtained and the minimum number of outliers is detected. Figures 5 and 6 show 75 matched features detected by the Harris and SIFT detectors using the RANSAC algorithm. These two figures contain fewer mismatched features compared with Figures 3 and 4, where the RANSAC algorithm is not used.

This technique can be improved by minimising the median error instead of the total error, a variant known as least median of squares (LMedS) (Ringer and Morris 2001, Torr and Murray 1997); however, the latter technique is not efficient in the presence of Gaussian noise (Rousseeuw 1987).

Figure 3. Harris detector – 75 putative matched features in two different images of a single truss system (matched features are connected by matching lines).

Figure 4. SIFT detector – 75 putative matched features in two different images of a single truss system (matched features are connected by matching lines).
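A sketch of the matching-plus-RANSAC stage described above, using OpenCV's brute-force descriptor matcher and fundamental-matrix estimation. The file names, the matcher, and the RANSAC tolerance are illustrative assumptions, not the implementation used for Figures 3–6.

```python
import cv2
import numpy as np

g1 = cv2.imread('truss_view1.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder paths
g2 = cv2.imread('truss_view2.jpg', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(g1, None)
kp2, des2 = sift.detectAndCompute(g2, None)

# Putative matches from descriptor distances (the distance matrix A above).
matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)
u = np.float32([kp1[m.queryIdx].pt for m in matches])
v = np.float32([kp2[m.trainIdx].pt for m in matches])

# RANSAC estimate of the fundamental matrix F of Equation (7);
# the returned mask flags the inlier correspondences.
F, mask = cv2.findFundamentalMat(u, v, cv2.FM_RANSAC, 3.0, 0.99)
inliers = mask.ravel() == 1
print(inliers.sum(), 'of', len(matches), 'putative matches kept as inliers')
```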


3.3. Transformation and image resampling

Provided that well-matched features are detected in two images, it is easy to estimate the transformation matrix, which transforms any pixel in the target image into the corresponding pixel in the reference image. Since pixel coordinates have to be integer numbers, appropriate interpolation techniques are required to calculate the values of the transformed pixels with non-integer coordinates. Figure 7 shows a registration example, where Figures 7(a) and 7(b) show two images taken of a 3D truss model from different orientations, while Figure 7(c) represents the registration of Figure 7(b) onto Figure 7(a). It is worth mentioning that there are generally two types of image registration algorithms, and these are described below.
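Continuing the previous sketch, the example below estimates a planar projective transformation from the matched points and resamples the target image onto the reference frame (bilinear interpolation handles the non-integer coordinates). This is a simplified, assumed pipeline; as the text notes, a full 3D truss with occlusions requires more than a single planar transformation.

```python
import cv2
import numpy as np

# u, v: matched pixel coordinates from the previous sketch (reference, target).
H, mask = cv2.findHomography(v, u, cv2.RANSAC, 5.0)

reference = cv2.imread('truss_view1.jpg')         # placeholder paths
target = cv2.imread('truss_view2.jpg')

# Resample the target image into the reference frame; warpPerspective uses
# bilinear interpolation for transformed pixels with non-integer coordinates.
registered = cv2.warpPerspective(target, H, (reference.shape[1], reference.shape[0]))

difference = cv2.absdiff(reference, registered)   # a simple change map
```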

3.3.1. Area-based methods

Area-based methods emphasise feature matching and are less concerned with feature detection. The structure of the image is analysed through the use of correlation matrices, Fourier properties, etc. (Zitova and Flusser 2003).

Figure 5. Harris detector – matches consistent with the fundamental matrix.

Figure 6. SIFT detector – matches consistent with the fundamental matrix.


3.3.2. Feature-based methods

Feature-based methods focus on detecting features (points, lines, corners, line intersections, edges, boundaries, etc.) regardless of image structure (Zitova and Flusser 2003). The feature detection and feature matching algorithms described in §3.1 and §3.2 fall into this type of registration method.

Registration of convex 3D objects is well developed in previous studies, but further improvements are needed to register a 3D truss that has occlusions; this is not possible with current registration techniques.

If an image is registered to an existing computer-aided design model, it is possible to recover the position of the camera at the time the image was captured. This positional information can be used to determine how the camera should be repositioned to capture more detailed images of certain areas.

4. Pattern recognition

The aim of pattern recognition is to classify the objects or patterns that have similar attributes into the same class. A scheme of a complete pattern recognition system is shown in Figure 8. In this scheme, the first step is data collection; data sensing can be carried out by a digital camera in this case.

Figure 7. (a) Reference image, (b) image to be registered, and (c) registration of image (b) on image (a).



4.1. Segmentation

Segmentation is a set of steps that isolates the patterns that can potentially be classified as the defined defect; however, it sometimes mistakenly picks out patterns that do not belong to the class of potential defects. The aim of segmentation is to reduce the extraneous data about patterns whose classes are not of interest. A good segmentation algorithm can help the classifier correctly classify patterns, and it can also affect the type of classifier used. Figure 9(a) shows a corroded column and its background, and Figure 9(b) shows the corroded area segmented from the rest of the image. Some segmented portions of the image in Figure 9(b) do not belong to the corroded area. Figure 9(c) shows the classification of pixels into three classes using the k-means classifier based on the red-green-blue (RGB) colour vector of each pixel. Pixels with the same grey levels (black, white or grey) belong to the same class.

Figure 8. A pattern recognition system scheme.

Figure 9. (a) Corroded column and its background, (b) segmentation of the corroded-like area, and (c) classification of pixels using the k-means classifier into three classes based on the RGB colour vector of each pixel. (Available in colour online.)



4.2. Feature extraction

After segmenting the patterns of interest, the next step is to assign them a set of finite values representing quantitative attributes or properties called features. These features should represent the important characteristics that help identify similar patterns. The process of selecting these suitable attributes is called feature extraction.

According to Hogg (1993), there are five main factors in visual image inspection as performed by experienced human operators: intensity (a spectral feature), texture (a local spatial feature), size, shape and organisation. In automatic classification of patterns or objects in an image, the spectral and textural attributes are used as features (Sinha et al. 2003).

It is possible to assign a feature vector to each pattern, in which the elements of the vector are quantitative values representing the extracted features of the pattern. This means that an M-dimensional feature space can be defined where each axis represents a feature and each pattern is one point. It is preferable for the coordinates of the feature space to be orthogonal (i.e. for the features to be independent). In order to lower the feature vector dimension, it is possible to map the principal features of a pattern from a higher-dimensional space to a lower-dimensional space by means of a mapping transformation, such as the discrete cosine transformation, the Fourier transformation or PCA (the Karhunen–Loeve transform) (Sinha et al. 2003).

PCA is a linear transformation that keeps the subspace with the largest variance, and it requires more computation than the other mapping transformations mentioned above. The goal of this algorithm is to reduce the dimensionality of a given data set X from M to L, where L < M. More information regarding PCA can be found in Jolliffe (2002).
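A minimal sketch of this dimensionality reduction step, assuming scikit-learn and a synthetic feature matrix; the paper cites PCA (Jolliffe 2002) without prescribing an implementation.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))        # 500 patterns, M = 12 features (synthetic)

pca = PCA(n_components=3)             # keep the L = 3 directions of largest variance
X_reduced = pca.fit_transform(X)      # 500 x 3 reduced feature matrix
print(pca.explained_variance_ratio_)  # variance captured by each kept direction
```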

4.3. Classification

The last step in a pattern recognition system is decision making or classification. The feature vectors previously extracted for each pattern are input to the appropriate classifier, which then outputs the classified patterns. Figure 10 shows the clustering of 7776 pixels of Figure 9(c) plotted in a three-dimensional feature space, the RGB colour space. The two types of classifiers are described below.

Figure 10. Clustering of pixels of Figure 9(c) in the RGB feature space. (Available in colour online.)



4.3.1. Supervised classification

In this type of classification, a set of feature vectors belonging to known classes is used to train the classifier. The goal of using a training set is to find a relation between the extracted features of same-class patterns and to predict the class of a valid feature vector when its class is unknown. Choosing an appropriate training set is essential to obtaining reasonable and accurate results from supervised classification.

The k-nearest neighbour classifier is an example of a supervised classifier. In this technique, an unclassified pattern is classified based on the majority of its k nearest neighbours in the feature space, where the neighbours are the patterns in the training set. The distance between patterns in the feature space is usually the Euclidean distance. This classifier is very sensitive to noise; however, increasing k decreases its sensitivity to noise. The appropriate value of k depends on the type of data. If the population of the training set grows large enough, the nearest neighbour in the training set represents the class of the unknown pattern. A support vector machine or an artificial neural network can also be used as a supervised classification tool in many classification problems. Further details are given by Duda et al. (2001).
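A sketch of a k-nearest-neighbour classifier on colour features, assuming scikit-learn; the tiny training set, its labels and the choice k = 3 are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Tiny training set: RGB feature vectors with known classes
# (1 = corroded, 0 = not corroded); purely illustrative values.
X_train = np.array([[200, 60, 30], [180, 70, 40], [170, 80, 50],
                    [120, 120, 120], [90, 95, 100], [60, 60, 65]])
y_train = np.array([1, 1, 1, 0, 0, 0])

# Each unknown pixel is assigned the majority class of its k nearest
# neighbours in the feature space (Euclidean distance).
knn = KNeighborsClassifier(n_neighbors=3, metric='euclidean').fit(X_train, y_train)
print(knn.predict(np.array([[190, 65, 35], [100, 100, 105]])))   # -> [1 0]
```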

Neuro-fuzzy systems are promising approaches used by Chae (2001) and Sinha and Fieguth (2006a) to detect defects, including cracks, in sewer pipe systems. The performance of the neuro-fuzzy systems proposed in these two studies is better than that of regular neural networks and other classical classification algorithms (Chae 2001, Sinha and Fieguth 2006a). Kumar and Taheri (2007) used neuro-fuzzy expert systems in their automated interpretation system for pipeline condition assessment. Neuro-fuzzy systems simultaneously benefit from the tolerance of fuzzy logic systems to data imprecision (vague definitions) and the tolerance of neural networks to noisy data. The easily comprehensible linguistic terms and if-then rules of fuzzy systems and the learning capabilities of neural networks are fused into a neuro-fuzzy system (Lee 2005). Different fusions of neural networks and fuzzy systems, which lead to neuro-fuzzy expert systems, are described by Lee (2005).

4.3.2. Unsupervised classification

There is no training set in an unsupervised classification system. Instead, there is a set of non-classified patterns, and the goal is to classify or cluster the different patterns of a given data set. This technique is very useful in cases where obtaining an appropriate training set is time consuming or costly. In some cases, a large amount of data can be clustered by an unsupervised classifier, and the class of each cluster can then be determined using supervised classification (Duda and Hart 1973, Duda et al. 2001).

A very common unsupervised classifier is the k-means classifier. The goal of this classifier is to cluster patterns into k classes (k is known). To achieve this, k feature vectors of the given patterns are selected randomly as the initial means of the k classes. Each remaining pattern is assigned to the class with the nearest mean vector; the distance is usually the Euclidean distance. After clustering the data, the mean vector of each class is recomputed, and the patterns are clustered again based on the nearest mean vector. These steps are repeated until the mean vectors no longer change or a specified number of iterations is reached. The mean value can be calculated from Equation (8):

m_i = \frac{1}{N_i} \sum_{X_j \in C_i} X_j    (8)

where C_i is the ith cluster, N_i is the number of members belonging to the ith cluster and X_j is the feature vector of the jth pattern.

The negative aspect of this classifier is the predefined value of k. For automatic clustering of the patterns of an image, it is important to find the optimum value of k for that image. Porter and Canagarajah (1996) proposed a way to automatically detect the true cluster number when the number of clusters is unknown. The within-cluster distance is defined as the sum of all distances between feature vectors and their corresponding cluster mean vectors, and it can be used as a criterion to select the true cluster number. The within-cluster distance D_k can be defined as indicated by Equation (9), where d(m_i, X_j) represents the distance between the mean vector of the ith cluster and the feature vector X_j:

D_k = \frac{1}{\sum_{m=1}^{k} N_m} \sum_{i=1}^{k} \sum_{X_j \in C_i} d(m_i, X_j)    (9)

The maximum value of the within-cluster distance occurs at k = 1; as k increases, D_k decreases rapidly until it reaches the true cluster number, after which D_k decreases very slightly or converges to a constant value. When the first difference of the within-cluster distances is small enough, the true cluster number is found; however, this method requires a threshold. On the other hand, the rapid decrease of D_k before the true cluster number and its gradual decrease afterwards mean that the gradient of the 'within-cluster distance' versus 'cluster number' graph changes significantly at the true cluster number. Based on this, the true cluster number is the one that has the maximum value of the second difference of the within-cluster distance (Porter and Canagarajah 1996).
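The sketch below clusters pixel colour vectors with k-means and applies the second-difference rule of Porter and Canagarajah (1996) to pick the cluster number. It is an illustrative reading of Equations (8) and (9), with scikit-learn assumed and placeholder data standing in for an image.

```python
import numpy as np
from sklearn.cluster import KMeans

def within_cluster_distance(X: np.ndarray, k: int) -> float:
    """Average distance of each feature vector to its cluster mean, Equation (9)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    return np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1).mean()

# X: N x 3 matrix of RGB vectors, e.g. image.reshape(-1, 3) (placeholder data here).
X = np.random.default_rng(0).random((2000, 3))

D = np.array([within_cluster_distance(X, k) for k in range(1, 9)])
second_diff = D[:-2] - 2 * D[1:-1] + D[2:]     # second difference of D_k
k_true = int(np.argmax(second_diff)) + 2       # offset: second_diff[0] corresponds to k = 2
print('estimated cluster number:', k_true)
```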

In §6 and §7, applicable segmentation techniques and useful features for the detection of cracks and corrosion are reviewed.

5. Wavelet filter bank

The 2D discrete wavelet transform (DWT) of images (Mallat 1989) is a useful technique in many image processing problems, and many papers have been published on this subject. The wavelet transform provides a remarkable understanding of the spatial and frequency characteristics of an image (Gonzalez et al. 2004). Since low frequencies dominate most images, the ability of the wavelet transform to repeatedly decompose the low-frequency band makes it popular for many image analysis tasks (Porter and Canagarajah 1996). In this section, the decomposition and reconstruction of images using wavelet transforms are introduced.

Figure 11 represents a schematic decomposition procedure of an image by the 2D DWT. The input to this system is the initial image, the ith approximation; h_φ and h_ψ are the low-pass and high-pass decomposition filters, respectively. The words 'Columns' and 'Rows' underneath these filters indicate whether the columns or the rows of the input should be convolved with the decomposition filter. Since one-step decomposition of the input with a low-pass and a high-pass filter yields almost double the amount of data, a down-sampling step (indicated by 2↓) keeps the amount of data almost the same size as the input. The words 'Columns' and 'Rows' beneath the down-sampling boxes indicate whether the down-sampling should take place over columns or rows (which can be done simply by keeping the even-indexed columns or rows).

The result is the (i + 1)th approximation, which contains the low-frequency characteristics of the input and is the output most similar to the input image. There are also three detail outputs (horizontal, vertical and diagonal) that contain the details of the input in the specified directions. These outputs are the wavelet transform coefficients. Since the (i + 1)th approximation retains most of the characteristics of the input image, it can be fed to the decomposition system as an input, and decomposition can take place repeatedly (Misiti et al. 2006). Figure 12 shows a 2D wavelet tree for a three-stage decomposition of an image.

As the order of decomposition increases, more details are decomposed from the image. A two-stage decomposition of a truss model is shown in Figure 13. One can see the horizontal, vertical and diagonal details in the two-level decomposition of the truss model in this figure, while the second approximation contains most of the information about the original image. These approximation coefficients and detail coefficients can be used as features for textural analysis of an image; this will be discussed in §7.

Figure 11. Two-dimensional discrete wavelet transform decomposition scheme.

Figure 12. Two-dimensional discrete wavelet transform decomposition scheme. The approximation component of the ith decomposition stage can be decomposed into the (i + 1)th approximation, horizontal, vertical and diagonal details.



The 2D inverse discrete wavelet transform (IDWT) can be used to reconstruct the initial image from the approximation and detail coefficients, as shown in Figure 14. The notation 2↑ indicates up-sampling over rows or columns, which can be done by inserting zeros at odd-indexed rows or columns. The low-pass and high-pass reconstruction filters are denoted by h′_φ and h′_ψ.
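A short sketch of the decomposition and reconstruction pipeline of Figures 11–14 using the PyWavelets package, which is an assumed tool rather than one named by the authors; the 'db4' wavelet below should correspond, up to ordering conventions, to the eight-coefficient Daubechies filters of Equation (10).

```python
import numpy as np
import pywt

img = np.random.rand(256, 256)          # placeholder grey-scale image

# One decomposition stage: approximation plus horizontal, vertical
# and diagonal detail coefficients (Figure 11).
approx, (horiz, vert, diag) = pywt.dwt2(img, 'db4')

# Multi-stage decomposition (Figure 12) and reconstruction (Figure 14).
coeffs = pywt.wavedec2(img, 'db4', level=3)
reconstructed = pywt.waverec2(coeffs, 'db4')
print(np.allclose(img, reconstructed[:256, :256]))   # perfect reconstruction
```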

The decomposition and reconstruction filters are derived from the scaling function φ and the mother wavelet ψ of a specific wavelet family. The decomposition filters in Figure 13 are based on a Daubechies (1992) wavelet family of order eight. The coefficients of this wavelet family, in the form of the column vectors shown in Equation (10), and their transposes can be used for column and row convolutions, respectively:

h_\varphi = \begin{bmatrix} 0.23038 \\ 0.71485 \\ 0.63088 \\ -0.02798 \\ -0.18703 \\ 0.03084 \\ 0.03288 \\ -0.01060 \end{bmatrix} \quad \text{and} \quad h_\psi = \begin{bmatrix} -0.01060 \\ -0.03288 \\ 0.03084 \\ 0.18703 \\ -0.02798 \\ -0.63088 \\ 0.71485 \\ -0.23038 \end{bmatrix}    (10)

Figure 13. Two-stage DWT decomposition of a truss image.


Repetitive wavelet decomposition of an image, followed by elimination of its details using a threshold and reconstruction of the edited data, leads to image smoothing, while elimination of the approximations leads to edge detection; the latter characteristic can be used for crack extraction. The wavelet transform is also used as an image compression tool (DeVore et al. 1992, Villasenor et al. 1995), and, with an appropriate threshold, its performance as a noise removal technique has been confirmed (Chang et al. 2000).

6. Crack detection

In this section, two different categories of crack extraction techniques are considered, and the performances of different methods are discussed. The main objective of these techniques is to appropriately segment the regions of interest (crack-like defects) from the rest of the image, followed by feature extraction and decision making to detect the actual cracks. These two categories are based on edge detection and on morphological operators. Since edges are often intermixed with cracks, it is hard to distinguish cracks from edges solely on the basis of edge detection methods. Edge detection techniques are more effective when only the defective region is included in the image, or when the region of interest is completely segmented from its background. This is not an easy task; morphological operations, however, are capable of detecting cracks in images that include many non-crack edges. An ideal crack extraction procedure is to derive the initial information by a morphological operation; based on that, the camera can zoom in to capture a closer image of the crack. Finally, edge detection techniques can be used to extract more information, such as the length, thickness or orientation of the crack.

6.1. Edge-based techniques

Edge detection techniques can be used to extract crack-like edges in a region of interest where usual edges, such as element boundaries, do not exist. Two review papers on edge detection techniques are provided by Davis (1975) and Ziou and Tabbone (1998), and a comprehensive description of several edge detection techniques is given by Pratt (2001). Any edge detection technique should consist of smoothing, differentiation and labelling. Smoothing is a preprocessing step that reduces noise and may cause the loss of some edges. An edge can be defined as a discontinuity or sudden change in the image intensity; equivalently, the derivative of an image has local maximum values at the edges. For this purpose, the gradient of an image is an appropriate tool for identifying edges. The gradient vector of a given image f(x, y) is defined as in Equation (11):

\nabla f = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \partial f / \partial x \\ \partial f / \partial y \end{bmatrix}    (11)

The magnitude of the gradient at (i, j) can be calculated as indicated by Equation (12):

|\nabla f(i, j)| = \left(G_x^2(i, j) + G_y^2(i, j)\right)^{1/2}    (12)

For computational simplicity, one can approximate the magnitude of the gradient using Equation (13) or (14):

|\nabla f(i, j)| = G_x^2(i, j) + G_y^2(i, j)    (13)

and

|\nabla f(i, j)| = |G_x(i, j)| + |G_y(i, j)|    (14)

Figure 14. Two-dimensional discrete wavelet transform reconstruction scheme.

The gradient magnitude is zero in areas of constant intensity, whereas in the presence of edges the magnitude is a local maximum. Gradient edge detection can also be used to compute the direction of changes, as defined in Equation (15):

\theta(i, j) = \tan^{-1}\left(\frac{G_y(i, j)}{G_x(i, j)}\right)    (15)

Common convolution masks (kernels) for digital estimation of G_x and G_y are the Sobel (Duda and Hart 1973, Gonzalez et al. 2004), Roberts (1965), Prewitt (1970), Frei and Chen (1977) and Canny (1986) edge detection operators. Convolving the initial image with one of the first-order derivative edge detection masks, both vertically and horizontally, generates the approximate gradient magnitude of the pixels. The pixels with values greater than a specified threshold are determined to be edges (labelling). Lower threshold values lead to the detection of more edges, while higher values cause some edges to be missed. Different techniques have been proposed to select an appropriate threshold (Abdou 1973, Abdou and Pratt 1979, Henstock and Chelberg 1996). Gonzalez and Woods (1992) proposed an automatic way to compute a global threshold by selecting an initial random threshold T on the histogram of the image. Then m1 and m2 are computed as the average intensity values of the pixels with intensities greater than and less than T, respectively. A new T is then computed as the average of m1 and m2, and this iteration continues until T converges to a constant value.
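The sketch below combines a Sobel gradient magnitude (Equation (12)) with the iterative global threshold of Gonzalez and Woods (1992) described above. OpenCV and NumPy are assumed tools, the file path is a placeholder, the threshold is initialised here with the overall mean rather than a random value, and the convergence tolerance is an illustrative choice.

```python
import cv2
import numpy as np

def iterative_threshold(values: np.ndarray, tol: float = 0.5) -> float:
    """Iterative global threshold: T is updated as the mean of the average
    values above and below T until it stops changing."""
    T = values.mean()
    while True:
        m1 = values[values > T].mean()
        m2 = values[values <= T].mean()
        T_new = 0.5 * (m1 + m2)
        if abs(T_new - T) < tol:
            return T_new
        T = T_new

gray = cv2.imread('concrete_surface.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
magnitude = np.sqrt(gx**2 + gy**2)            # Equation (12)

T = iterative_threshold(magnitude)
edges = magnitude > T                         # labelled edge pixels
```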

Another category of first-order derivative edge detection techniques is based on computing the gradient in more than two orthogonal directions by convolving the initial image with several gradient impulse response arrays and then selecting the maximum value of the images convolved with the different templates, as shown in Equation (16):

G(i, j) = \max\left\{|G_1(i, j)|,\; |G_2(i, j)|,\; \ldots,\; |G_N(i, j)|\right\}    (16)

where G_k(i, j) is the result of convolving the initial image f(x, y) with the kth gradient response array.

Since the intensity of pixels in an image changes rapidly at the edges (the first derivative has a local maximum), the second derivative has a zero crossing there. The second-order derivative of f(x, y) can be computed by the Laplacian operator, as defined in Equation (17):

\nabla^2 f(x, y) = \frac{\partial^2 f(x, y)}{\partial x^2} + \frac{\partial^2 f(x, y)}{\partial y^2}    (17)

In order to compute the second derivative of an image, a window mask is convolved with the image:

L(x, y) = f(x, y) \ast H(x, y)    (18)

A simple four-neighbour Laplacian mask is:

H = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix}    (19)

The Laplacian is rarely used for edge detection alone because it is very sensitive to noise and cannot detect the direction of edges. Laplacian convolution operators lead to double-edge detection, which is inappropriate for direct edge detection; however, they can complement other edge detection techniques. Applying a Gaussian smoothing filter to an image and then using the Laplacian of the smoothed image for edge detection yields the Laplacian of Gaussian operator. Because of its linearity, this detector can be applied directly as the convolution of the initial image with the Laplacian of the Gaussian function:

\nabla^2 h(x, y) = -\left[\frac{(x^2 + y^2) - \sigma^2}{\sigma^4}\right] \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)    (20)

Increasing the window size of an edge detection operator decreases its sensitivity to noise. Since the Roberts window is the smallest, it is very noise-sensitive, and many spots are detected as edges by this operator. The Prewitt operator is weak at detecting diagonal edges (Pal and Pal 1993). The Sobel operator is less sensitive to noise because it gives more weight to the pixels closer to the pixel of interest, which is located in the middle of the convolution window. Among the operators mentioned above, the Canny operator has the best performance. This technique considers edges to be the local maxima of the derivative of a Gaussian filter; in other words, the smoothing step is embedded within the operator. Subsequently, weak edges and strong edges are extracted by setting two different thresholds. Finally, the strong edges, together with the weak edges that are connected to strong edges, are retained as real edges. Consequently, fewer weak edges are falsely detected.

Based on experiments on bridge pavements, Abdel-Qader et al. (2003) concluded that the Canny edge detection technique is more successful in detecting cracks than the Sobel and FFT techniques. This result (that the FFT approach has the worst performance) is also confirmed by the authors of the current paper. The FFT approach considers the frequency properties of the image in the frequency domain. The mathematical FFT formulation is shown in Equation (21):

F(u, v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \exp\left[-2\pi j\left(\frac{xu}{M} + \frac{yv}{N}\right)\right]    (21)


where f(x, y) is the M × N image, x and y are the spatial coordinates and u and v are the transformation coordinates in the frequency domain. Since the FFT is highly sensitive to noise, it is not recommended for the problem in question.

Abdel-Qader et al. (2003) have demonstrated that the fast Haar transform performs even better than the Canny detector for detecting cracks in concrete bridge pavements. Haar is the simplest wavelet; its mother wavelet ψ and scaling function φ are shown below:

\psi(t) = \begin{cases} 1 & 0 \le t < \tfrac{1}{2} \\ -1 & \tfrac{1}{2} \le t < 1 \\ 0 & \text{elsewhere} \end{cases} (22)

and

\varphi(t) = \begin{cases} 1 & 0 \le t < 1 \\ 0 & \text{elsewhere} \end{cases}. (23)

The decomposition and reconstruction filters for this wavelet family are:

h_{\varphi} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{and} \quad h_{\psi} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}. (24)

Abdel-Qader et al. (2003) used the Haar wavelet to obtain the one-level decomposition of an image, as described in §5. The three detail images are then combined to generate the magnitude image. The threshold is defined as the average intensity value of all pixels in the captured images. The overall accuracy of this technique is 86%, as reported by Abdel-Qader et al. (2003). Because the effect of light and the contrast of each image are not considered independently when choosing the threshold, this thresholding is inappropriate; by selecting an independent threshold for each image, a better detection rate is expected. In this approach, no other classification is used to separate cracks from non-crack edges.
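The following minimal sketch (not the authors' implementation) illustrates the idea: a one-level Haar decomposition, a combined detail-magnitude image, and a threshold chosen independently for each image (here an Otsu threshold, in line with the improvement suggested above). PyWavelets and scikit-image are assumed to be available.

```python
# Minimal sketch (not the authors' code) of Haar-based crack detection:
# one-level decomposition, detail-magnitude image and a per-image threshold.
import numpy as np
import pywt
from skimage.filters import threshold_otsu

def haar_crack_map(image):
    # One-level 2-D Haar decomposition: approximation plus three detail images.
    cA, (cH, cV, cD) = pywt.dwt2(np.asarray(image, dtype=float), "haar")
    # Combine the three details into a single magnitude image.
    magnitude = np.sqrt(cH ** 2 + cV ** 2 + cD ** 2)
    # Threshold chosen independently for each image (Otsu, assumed here).
    return magnitude > threshold_otsu(magnitude)
```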

Mallat and Zhong (1992) have demonstrated that the local maxima of an image wavelet transform can be used to extract and analyse multi-scale edges. Siegel and Gunatilake (1998) used a wavelet filter bank for detecting cracks on aircraft surfaces, with a cubic spline and its first derivative as the scaling and wavelet functions. This wavelet transform is equivalent to applying a smoothing filter to the image and then taking the derivative of the smoothed image, which is identical to a classical edge detection procedure.

Siegel and Gunatilake (1998) applied a three-level decomposition to the region of interest; however, the decomposition algorithm is slightly different from the one described in §5. This decomposition is defined as applying the high-pass decomposition filter g_i (mother wavelet function) once to the rows and once to the columns separately, which leads to W_{yi} and W_{xi}, respectively, where i is the decomposition level. The low-pass decomposition filter h_i (scaling function) is applied to both the rows and the columns. The wavelet and scaling decomposition filters for the specified wavelet at the three levels are shown below:

g_1 = [2, -2], (25)

g_2 = [2, 0, -2], (26)

g_3 = [2, 0, 0, 0, -2], (27)

h_1 = [0.125, 0.375, 0.375, 0.125], (28)

h_2 = [0.125, 0, 0.375, 0, 0.375, 0, 0.125], (29)

and

h_3 = [0.125, 0, 0, 0, 0.375, 0, 0, 0, 0, 0.375, 0, 0, 0, 0.125]. (30)

The schematic procedure described above is presented in Figure 15. In order to extract crack-like edges, the magnitude image M_i is computed for each level as:

M_i = \sqrt{W_{xi}^2 + W_{yi}^2}. (31)

By choosing a dynamic threshold based on the histogram of each magnitude image, pixel values above the threshold are detected as edge points. Since the direction of a crack varies smoothly, edge points are linked, based on their eight neighbours, if the difference of the corresponding angles is less than a specific angle (Siegel and Gunatilake 1998).

Figure 15. Multi-resolution decomposition of the image used by Siegel and Gunatilake (1998).


For this purpose, the angle image of each level, A_i, is defined as:

A_i = \arctan\!\left(\frac{W_{yi}}{W_{xi}}\right). (32)
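A brief sketch (not the original implementation) of one decomposition level using the high-pass filters of Equations (25)–(27), together with the magnitude and angle images of Equations (31) and (32); the inter-level low-pass smoothing with h_i is omitted for brevity, and SciPy is assumed.

```python
# Sketch (not the original implementation) of one decomposition level with the
# high-pass filters of Equations (25)-(27) and the magnitude/angle images of
# Equations (31)-(32); the inter-level low-pass smoothing (h_i) is omitted.
import numpy as np
from scipy.ndimage import convolve1d

g = {1: [2, -2], 2: [2, 0, -2], 3: [2, 0, 0, 0, -2]}   # wavelet (high-pass) filters

def decompose_level(image, k):
    image = np.asarray(image, dtype=float)
    Wy = convolve1d(image, g[k], axis=1, mode="nearest")  # filter applied to rows
    Wx = convolve1d(image, g[k], axis=0, mode="nearest")  # filter applied to columns
    M = np.hypot(Wx, Wy)        # magnitude image of Equation (31)
    A = np.arctan2(Wy, Wx)      # angle image (robust form of Equation (32))
    return Wx, Wy, M, A
```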

A neural network with one hidden layer, consisting of four hidden neural units, six input units and one output unit, is used to separate cracks from non-crack edges. On aircraft surfaces, cracks are smaller than non-crack edges (e.g. scratches). Moreover, an edge that only appears in the first decomposition level is smaller than an edge that appears in the first two levels; similarly, the latter is smaller than an edge that appears in all levels of the decomposition. This important characteristic, introduced as 'propagation depth' by Siegel and Gunatilake (1998), is one of the selected features. The propagation depth represents the number of decomposition levels in which a specific edge appears, and thus conveys the size of the edge. A propagation depth is assigned to each edge that appears in the first decomposition level. For this reason, a 'coarse-to-fine edge linking process' is used, which tracks an edge from a coarse resolution to a fine resolution. The features assigned to each edge in the first decomposition level, as defined by Siegel and Gunatilake (1998), are:

(1) Computed propagation depth number;
(2) Number of pixels constituting the edge;
(3) Average wavelet magnitude of the edge pixels;
(4) Direction of pixels constituting the edge in level one;
(5) The signs of \sum W_{x1} and \sum W_{y1} for all pixels belonging to the edge; and
(6) Average wavelet magnitude of linked edges in levels two and three during the coarse-to-fine edge linking process.

The accuracy of this technique in crack detection is 71.5% (Siegel and Gunatilake 1998). The lower accuracy of this study, compared with that of Abdel-Qader et al. (2003), is due to the greater complexity of the problem. The technique is highly dependent on the direction of the light during image acquisition. The steps necessary to obtain better performance of the classification process are: selecting additional features, capturing more images of a surface with different camera orientations (in order to enrich the data with different light directions) and increasing the size of the training set (Siegel and Gunatilake 1998).

None of the techniques discussed above deals with the problem of major non-defect edges, such as structural member edges or crack-like objects in the background. Another set of techniques that can overcome this shortcoming is discussed in the following section.

6.2. Morphological techniques

Morphological image processing extracts useful information about the objects in an image based on mathematical morphology. The foundation of morphological image processing is based on the previous studies of Minkowski (1903) and Matheron (1975) on set algebra and topology, respectively (Pratt 2001). Morphological techniques can be applied to binary or grey-scale images. Although morphological operations are also discussed in the context of colour image processing (Comer and Delp 1999, Al-Otum and Munawer 2003, Yu et al. 2004), the grey-scale operations that are useful for segmenting cracks from the rest of an image are introduced here. Figure 16(a) shows a vertical crack on a steel strip caused by a tensile rupture, and Figure 16(b) shows a horizontal crack on a rebar caused by a torsional rupture. The results of performing different morphological operations on these two images are presented later in this paper to give the reader a better understanding of the applications of the described operations.

Morphological image processing can generally be used in image filtering, image sharpening or smoothing, noise removal, image segmentation, edge detection, feature detection, defect detection, and preprocessing and postprocessing tasks. A brief discussion of some definitions used in morphological approaches follows.

6.2.1. Dilation

The grey-scale dilation of image I by the structuring element S is defined as:

(I \oplus S)(x, y) = \max\left[\, I(x - x', y - y') + S(x', y') \mid (x', y') \in D_S \,\right], (33)

where D_S, a binary matrix, is the domain of S (the structuring element) and defines which neighbouring pixels are included in the maximum function. In the case of non-flat structuring elements, D_S indicates the pixels included in the maximum function as well as their weights (D_S is not binary in this case). During the dilation process, which is similar to the convolution process, I(x, y) is assumed to be -\infty for (x, y) \notin D_S. In the case of flat structuring elements, S(x', y') = 0 for (x', y') \in D_S. Flat structuring elements are usually used for grey-scale morphological operations. Visually, dilation expands the bright portions of the image (Salembier 1990). The results of applying this operation to the images in Figure 16 are shown in


Figure 18. Figures 17(a) and (b) show the domains of the two flat structuring elements used in Figures 18, 19, 20, 21, 22, 24 and 25; the entries equal to 1 in Figures 17(a) and (b) indicate the pixels to be included in the morphological operation.

6.2.2. Erosion

Similar to the above, the grey-scale erosion is defined as:

(I \ominus S)(x, y) = \min\left[\, I(x + x', y + y') - S(x', y') \mid (x', y') \in D_S \,\right], (34)

where I(x, y) is assumed to be +\infty for (x, y) \notin D_S. In the case of flat structuring elements, S(x', y') = 0 for (x', y') \in D_S. In effect, erosion shrinks the bright portions of a given image (Salembier 1990). The results of applying this operation to the images in Figure 16 are shown in Figure 19.
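A minimal sketch of grey-scale dilation and erosion with flat structuring elements, assuming SciPy; the structuring-element sizes simply mirror those of Figure 17 and the input image is a placeholder.

```python
# Minimal sketch of grey-scale dilation and erosion (Equations (33)-(34)) with
# flat structuring elements, assuming SciPy; the sizes mirror Figure 17 and the
# input image is a random placeholder.
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

image = np.random.randint(0, 256, (128, 128)).astype(float)

dilated = grey_dilation(image, size=(1, 5))   # expands bright regions (1 x 5 element)
eroded = grey_erosion(image, size=(9, 1))     # shrinks bright regions (9 x 1 element)

# The morphological gradient of the next subsection is simply their difference:
gradient = grey_dilation(image, size=(3, 3)) - grey_erosion(image, size=(3, 3))
```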

6.2.3. Morphological gradient

The morphological gradient is defined as the dilated image minus the eroded version of the image, and it can be used to detect edges as it represents the local variations of an image (Gonzalez et al. 2004). Figure 20 shows the results of applying this operation to the images in Figure 16.

6.2.4. Opening

The grey-scale opening of an image I by structuring element S can be written as:

I \circ S = (I \ominus S) \oplus S. (35)

Figure 21 shows the results of applying the opening operation to the images in Figure 16. Sinha and Fieguth (2006b) detected defects in underground pipe images by thresholding the morphological opening of the pipe images using different structuring elements.

6.2.5. Closing

Similarly, the closing in grey-scale is defined as:

I \bullet S = (I \oplus S) \ominus S. (36)

Figure 16. (a) Vertical crack caused by a tensile rupture of a steel strip, and (b) horizontal crack caused by a torsional rupture of a steel rebar.

Figure 17. (a) 1 × 5 flat structuring element domain used in Figures 18–22, 24 and 25, and (b) 9 × 1 flat structuring element domain used in Figures 18–22, 24 and 25.


Figure 18. Dilation performed by: (a) a 1 × 5 structuring element, and (b) a 9 × 1 structuring element.

Figure 19. Erosion performed by: (a) a 1 × 5 structuring element, and (b) a 9 × 1 structuring element.

Figure 20. Morphological gradient performed by: (a) a 1 × 5 structuring element, and (b) a 9 × 1 structuring element.


Closing is applied to the images in Figure 16, and the results are shown in Figure 22.

While opening is usually used to eliminate sharp bright details, closing is used to remove dark details, provided that the structuring element is larger than the details. These properties make the combination of opening and closing very suitable for noise removal and image blurring (Gonzalez et al. 2004). Figure 23 shows the one-dimensional opening and closing operations. While the curve represents the grey-scale level, the circles (structuring elements) beneath the curve pushing it up illustrate the opening operation, and the circles above the curve pushing it down represent the closing operation.

In Figure 23, higher values of the curve correspond to brighter pixels.

Figure 21. Opening performed by: (a) a 1 × 5 structuring element, and (b) a 9 × 1 structuring element.

Figure 22. Closing performed by: (a) a 1 × 5 structuring element, and (b) a 9 × 1 structuring element.

Figure 23. One-dimensional scheme of the opening and closing operations. The curve represents the grey-scale level, the circles (structuring elements) beneath the curve pushing it up illustrate the opening operation, and the circles above the curve pushing it down represent the closing operation.


One can conclude from Figure 23 that subtracting the opened image from its original version will result in the detection of bright defects. This operation is called the top-hat transform (Serra 1982, Meyer 1986) and its formula is:

T = I - (I \circ S). (37)

The dual form of the above equation is called the bottom-hat transform, which is appropriate for detecting dark defects (see Figure 23). Its formulation is:

T = (I \bullet S) - I. (38)

An example of the bottom-hat operation applied to the images in Figure 16 is shown in Figure 24.

These two transformations can also be used for contrast enhancement. Giakoumis et al. (2006) detected the cracks in digitised paintings by thresholding the output of the top-hat transform. The shape and size of the structuring element depend on the defect of interest, and the structuring element plays an important role in defect detection using the above morphological image processing techniques. For example, to detect small defects such as edges, small structuring elements, preferably flat ones, should be used (Salembier 1990).

Salembier (1990) has proposed and compared different algorithms to improve morphological defect detection based on the top-hat and bottom-hat transformations. It was concluded that the algorithms shown in Equations (39) and (40) can be used to detect bright and dark defects, respectively:

T = I - \min\left[(I \circ S) \bullet S,\; I\right], (39)

and

T = \max\left[(I \bullet S) \circ S,\; I\right] - I. (40)

Applying Equation (40) to the images in Figure 16 leads to the segmentation of dark cracks, as shown in Figure 25. After postprocessing, the final segmented cracks are shown in Figure 26.
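The following sketch applies the dark-defect operator of Equation (40) and thresholds the result; it is not the authors' code, and the 1 × 5 flat structuring element and Otsu threshold are illustrative choices (SciPy and scikit-image assumed).

```python
# Sketch of dark-defect (crack) segmentation based on Equation (40); not the
# authors' code. The 1 x 5 flat structuring element and the Otsu threshold are
# illustrative choices (SciPy and scikit-image assumed).
import numpy as np
from scipy.ndimage import grey_closing, grey_opening
from skimage.filters import threshold_otsu

def dark_defect_map(image, size=(1, 5)):
    image = np.asarray(image, dtype=float)
    closed_then_opened = grey_opening(grey_closing(image, size=size), size=size)
    # T = max[(I . S) o S, I] - I (Equation (40)): responds to dark details.
    T = np.maximum(closed_then_opened, image) - image
    return T > threshold_otsu(T)
```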

Nieniewski et al. (1999) used the above equations to extract cracks in ferrites. They used two flat structuring elements: a 1 × 5 row element for the vertical cracks and a 5 × 1 column element for the horizontal ones. The authors of the current study applied a square structuring element to truss model images to detect cracks; the results after noise removal were excellent. Even in the presence of occlusions and different backgrounds, the algorithm is highly successful, provided that the camera captures high-resolution images and focuses its lens on the specific structural member of interest. Nowadays, almost all commercial digital cameras have active or passive auto-focusing capabilities that can assist in the acquisition of suitable images (Schlag et al. 1983).

Figure 27 presents examples of applying the above technique to actual images. Figures 27(a), (c), (e), (g) and (i) are the original images of real steel structural members; Figure 27(i) is a magnification of Figure 27(g) to give a better view of the structure's crack. The true cracks segmented in Figures 27(b), (d), (f), (h) and (j) are shown in white and the other segmented objects are non-white (after postprocessing and noise removal).

Figure 24. Bottom-hat operation performed by: (a) a 1 × 5 structuring element, and (b) a 9 × 1 structuring element.


The domains of the structuring elements used to segment the cracks are as follows: a 70 × 70 matrix with both main and minor diagonal members equal to 1 and non-diagonal members equal to 0 (Figure 27(b)), a 10 × 10 matrix with ones on the minor diagonal and zeros everywhere else (Figure 27(d)), a 7 × 7 unit matrix (Figure 27(f)) and a 1 × 5 structuring element (Figures 27(h) and (j)).

The challenge is to find the appropriate size and format of the structuring element. When the structuring element has a line format, it can segment cracks that are perpendicular to it (see the structuring elements used in Figures 26 and 27). A good example of such a study is presented by Sinha and Fieguth (2006b), who tried to find the optimal size of structuring elements to segment and classify cracks, holes, laterals and joints in underground pipe images.

The morphological operations discussed above segment the cracks more efficiently than the edge detection operators reviewed in §6.1. Edge-based techniques extract all the edges in an image, which makes the classification task harder. Basically, edge-based techniques will generate more noise than morphological techniques (compare Figures 25 and 28).

After extracting a reliable, independent and discriminating set of features from the segmented objects, it is the classifier's task to label each of the segmented objects as crack or non-crack (§4).

Figure 26. Crack segmentation. (a) Vertical crack segmentation, and (b) horizontal crack segmentation.

Figure 25. (a) Vertical dark crack segmented using morphological techniques, and (b) horizontal dark crack segmented using morphological techniques.


Figure 27. Examples of crack segmentation in real structures: (a) original structural member 1, (b) crack segmentation of image (a) using a 70 × 70 matrix with both main and minor diagonal members of 1 and non-diagonal members of 0 as the structuring element, (c) original structural member 2, (d) crack segmentation of image (c) using a 10 × 10 matrix with ones on the minor diagonal and zeros everywhere else as the structuring element, (e) original structural member 3, (f) crack segmentation of image (e) using a 7 × 7 unit matrix as the structuring element, (g) original structural member 4, (h) crack segmentation of image (g) using a 1 × 5 structuring element, (i) magnification of image (g), and (j) crack segmentation of image (i) using a 1 × 5 structuring element.


Sinha (2000) used the area, number of objects, major axis length, minor axis length, and the mean and variance of pixels projected in four directions (0°, 45°, 90° and 135°) as features for crack detection in underground pipeline systems.
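As an illustration of such shape features, the following sketch (not Sinha's implementation) extracts the area and axis lengths of each connected object in a binary segmentation map, assuming scikit-image is available.

```python
# Sketch (not Sinha's implementation) of simple shape features extracted from a
# binary crack-segmentation map, assuming scikit-image is available.
from skimage.measure import label, regionprops

def region_features(binary_map):
    features = []
    for region in regionprops(label(binary_map)):
        features.append({
            "area": region.area,
            "major_axis_length": region.major_axis_length,
            "minor_axis_length": region.minor_axis_length,
        })
    return features
```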

7. Corrosion detection

Few papers have been published on corrosion detection based on image processing techniques alone; however, the capability of image-understanding algorithms is of great interest since they are contactless, nondestructive methods. Because the corrosion process deteriorates the surface of metals, the corroded surface has a different texture from the rest of the image. Texture can be regarded as a measure of smoothness, coarseness and regularity (Gonzalez and Woods 1992). Most texture segmentation techniques are based on the pattern recognition concepts described in §4. Pal and Pal (1993) and Reed and Dubuf (1993) provide comprehensive reviews of different segmentation techniques. Although the computational cost is high for large windows (Pratt 2001), discrete wavelet transform coefficients are powerful tools for characterising appropriate features for texture classification, as they localise the spatial and frequency characteristics very well (Gunatilake et al. 1997).

For subsurface corrosion, a defective area can be recognised based on changes in its surface shape rather than its texture. Stereo cameras are appropriate tools to detect changes in surface shape, which is useful for detecting subsurface corrosion. Hanji et al. (2003) used this approach to measure the 3D shape of the corroded surface of steel plates. They used a stereo adapter on a regular camera to obtain a stereo view of the corroded surface. The corresponding regions in the stereo images are detected as highly correlated areas in both images. The 3D model of the surface is then reconstructed for measurement purposes. The results of the technique are in good agreement with the measurements of a laser displacement meter (Hanji et al. 2003).

Colour is another important attribute for digital image-based corrosion detection. Colour image segmentation surveys are provided by Skarbek and Koschen (1994) and Cheng et al. (2001).

Gunatilake et al. (1997) used Daubechies (1992) wavelets of order six to detect corrosion areas on aircraft skins. The outcome of their algorithm is a binary image indicating the corroded and non-corroded regions. A three-level wavelet filter bank is used to decompose the image, as described in §5. The low-pass and high-pass filters for this wavelet are shown in Equation (41):

h_{\varphi} = [0.33267,\; 0.80689,\; 0.45988,\; -0.13501,\; -0.08544,\; 0.03523]^T \quad \text{and} \quad h_{\psi} = [0.03523,\; 0.08544,\; -0.13501,\; -0.45988,\; 0.80689,\; -0.33267]^T. (41)

The image is divided into 8 × 8 pixel blocks. Block-based feature elements lead to a high signal-to-noise ratio and decrease the false detection of corrosion on a surface. Finally, 10 features are assigned to each of these non-overlapping blocks. Each feature is the energy of the block computed from the wavelet coefficients of one of the 10 decomposed frames.

Figure 28. (a) Extracted edges using the Sobel edge detection technique for the steel strip in Figure 16(a); (b) extracted edges using the Sobel edge detection technique for the steel rebar in Figure 16(b).


The energy of each decomposed frame is defined as the sum of the squares of all pixel values belonging to that frame divided by the sum of the squares of all pixel values belonging to all decomposed frames of the block. For a better understanding of this definition, a schematic three-level decomposition of an image is shown in Figure 29, where 'L' and 'H' stand for low-pass and high-pass filters, respectively.

Each feature can be written mathematically as:

f_j(i) = \frac{\sum_{(m,n) \in B(i)} \left[ W_j(m, n) \right]^2}{\sum_{j=1}^{10} \sum_{(m,n) \in B(i)} \left[ W_j(m, n) \right]^2}, (42)

where f_j(i) is the jth feature of the ith block, W_j(m, n) is the wavelet decomposition coefficient of the jth sub-band (as in Figure 29) at (m, n), and B(i) is the ith block.
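A minimal sketch of the energy-ratio features of Equation (42) follows, assuming PyWavelets; as a simplification, each 8 × 8 block is decomposed on its own with the Haar wavelet, whereas Gunatilake et al. (1997) decompose the full image with a Daubechies filter of order six.

```python
# Minimal sketch of the energy-ratio features of Equation (42), assuming
# PyWavelets. Simplification: each 8 x 8 block is decomposed on its own with
# the Haar wavelet, rather than taking blocks of a full-image decomposition.
import numpy as np
import pywt

def block_features(image, block=8, levels=3):
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    features = {}
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            coeffs = pywt.wavedec2(image[r:r + block, c:c + block], "haar", level=levels)
            # 10 sub-bands: the approximation plus (LH, HL, HH) at each level.
            subbands = [coeffs[0]] + [d for detail in coeffs[1:] for d in detail]
            energy = np.array([np.sum(np.square(s)) for s in subbands])
            features[(r, c)] = energy / (energy.sum() + 1e-12)   # f_j(i), j = 1..10
    return features
```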

Gunatilake et al. (1997) used a nearest neighbour classifier (as described in §4.3.1) to distinguish corroded regions from corrosion-free areas. The algorithm has a 95% accuracy in detecting corroded regions, as reported by the authors. The authors of the current study have used the same features to automatically segment different textures of an image with an unsupervised classifier. The results are acceptable; however, colour, a very important attribute, is not included in the feature vector described above.

Since corrosion does not always exhibit a repeated texture, and since the lighting under which an image is captured is not constant, the classical form of texture segmentation described above is incapable of performing reliable defect detection. This problem, along with the aim of integrating several images captured from a single scene to create a corrosion map of a surface, led to the development of a more sophisticated algorithm that also incorporates the colour characteristics of an image (Siegel and Gunatilake 1998).

In their new algorithm, Siegel and Gunatilake (1998) converted the RGB image into the less correlated YIQ colour space, where Y represents luminance, and I and Q represent chrominance information. Equation (43) shows how the RGB and YIQ colour components are related (Buchsbaum 1968, Pratt 2001):

\begin{pmatrix} Y \\ I \\ Q \end{pmatrix} = \begin{pmatrix} 0.29890 & 0.58660 & 0.11448 \\ 0.59598 & -0.27418 & -0.32180 \\ 0.21147 & -0.52260 & 0.31110 \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix}. (43)
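A minimal sketch of Equation (43) follows, assuming an (H, W, 3) floating-point array with channels in R, G, B order.

```python
# Minimal sketch of Equation (43), assuming an (H, W, 3) floating-point array
# with channels in R, G, B order.
import numpy as np

RGB_TO_YIQ = np.array([[0.29890,  0.58660,  0.11448],
                       [0.59598, -0.27418, -0.32180],
                       [0.21147, -0.52260,  0.31110]])

def rgb_to_yiq(rgb):
    return rgb @ RGB_TO_YIQ.T   # Y, I and Q stacked along the last axis
```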

The Battle–Lemarié (BL) (Strang and Nguyen 1996) wavelet transform filter is used to obtain a three-level decomposition of the Y component without down-sampling at each stage. Consequently, 10 equal-sized images are obtained, as shown in Figure 30. The two-level decompositions of the I and Q components are computed using the normal wavelet transform with down-sampling at each stage. The Y image is divided into 32 × 32 non-overlapping pixel blocks; for each of these blocks, 10 features are extracted from the Y decomposition images and four features from the I and Q decomposition images. Each of the first nine features derived from the Y image is calculated as:

f_{SB_k}(x, y) = \frac{\sum_{i=-b/2}^{b/2} \sum_{j=-b/2}^{b/2} w\, SB_k^2}{\sum_{i=-b/2}^{b/2} \sum_{j=-b/2}^{b/2} w\, [LH_k^2 + HH_k^2 + HL_k^2]}, (44)

where

w = w(x + i, y + j), (45)

SB_k = SB_k(x + i, y + j), (46)

LH_k = LH_k(x + i, y + j), (47)

HH_k = HH_k(x + i, y + j), (48)

and

HL_k = HL_k(x + i, y + j). (49)

Equation (44) is the ratio of the energy of one detail image resulting from the decomposition to the total energy of all details at that level of decomposition. In this equation, HH, HL and LH are the decomposed images, as shown in Figure 30. The coordinate (x, y) is the centre of the block in the Y decomposed images, b is the size of the block, SB_k is either HH, HL or LH at level k of the decomposition, and w is a Gaussian weighted mask as described in §2. In fact, the computation of these nine features is similar to traditional texture segmentation techniques using the wavelet transform, where a Gaussian mask is used as the weighting factor to calculate the energy of each block.

Figure 29. Three-level wavelet decomposition notation used by Gunatilake et al. (1997).


The 10th feature that is extracted from image Y is:

f_{LL_3}(x, y) = \frac{\sum_{k=1}^{3} \sum_{i=-b/2}^{b/2} \sum_{j=-b/2}^{b/2} w\, [LH_k^2 + HH_k^2 + HL_k^2]}{\sum_{i=-b/2}^{b/2} \sum_{j=-b/2}^{b/2} w\, LL_3^2}, (50)

where b is the size of the block, SB_k is either HH, HL or LH at level k of the decomposition and w is a Gaussian weighted mask. The parameters w, SB_k, LH_k, HH_k and HL_k are defined by Equations (45), (46), (47), (48) and (49), respectively. Equation (50) represents the ratio of the total energy of all the details to the approximation energy of the Y image after a three-level wavelet decomposition.

Four more features are extracted from the two-level wavelet decompositions of the I and Q image components, as shown in Equations (51) and (52):

f_{I_k}(x'_k, y'_k) = \frac{\sum_{i=-b/2}^{b/2} \sum_{j=-b/2}^{b/2} w\, [LH_k^2 + HH_k^2 + HL_k^2]}{\sum_{i=-b/2}^{b/2} \sum_{j=-b/2}^{b/2} w\, LL_2^2}, (51)

and

f_{Q_k}(x'_k, y'_k) = \frac{\sum_{i=-b/2}^{b/2} \sum_{j=-b/2}^{b/2} w\, [LH_k^2 + HH_k^2 + HL_k^2]}{\sum_{i=-b/2}^{b/2} \sum_{j=-b/2}^{b/2} w\, LL_2^2}, (52)

where

w = w(x'_k + i, y'_k + j), (53)

SB_k = SB_k(x'_k + i, y'_k + j), (54)

LH_k = LH_k(x'_k + i, y'_k + j), (55)

HH_k = HH_k(x'_k + i, y'_k + j), (56)

and

HL_k = HL_k(x'_k + i, y'_k + j), (57)


Figure 30. (a) Two-level wavelet decomposition of an image, and (b) three-level wavelet decomposition of an image (notation used by Siegel and Gunatilake (1998)).

Figure 31. HSI colour space.


Figure 32. The result of binarising the saturation component of original images (a), (c), (e), (g) and (i), in order to extract the corroded area of each image, is presented in images (b), (d), (f), (h) and (j), respectively.


where b is the size of the block, SB_k is either HH, HL or LH at level k of the decomposition, w is a Gaussian weighted mask, k is the level of decomposition (here k = 1, 2) and (x'_k, y'_k) in I and Q is the coordinate corresponding to the block centre (x, y) in Y at the kth level of decomposition. This can be computed as:

(x'_k, y'_k) = \left(\frac{x}{2^k}, \frac{y}{2^k}\right). (58)

After computing (x'_k, y'_k), a 32 × 32 window centred at (x'_k, y'_k) is selected to calculate the feature values, as shown in Equations (51) and (52).

The 32 × 32 windows in I and in Q will overlap as the level of decomposition increases. For the higher levels of decomposition, this process results in a better estimation of the low-frequency signals in the chrominance characteristics of the image. Equations (51) and (52) are ratios of the total energy of the details in the kth level of decomposition to the energy of the second-level approximation for the I and Q components. After extracting the described features, Siegel and Gunatilake (1998) used a feed-forward neural network consisting of 14 input, 40 hidden and 2 output neurons. The possible outputs of the algorithm are: corrosion with high confidence, corrosion with low confidence, or a corrosion-free region. The decision making function is based on the two outputs of the neural network and a threshold T that is experimentally selected as 0.65. The confidence is defined as the absolute value of the difference between the two outputs:

\text{output}(1) > \text{output}(2) \text{ and confidence} \ge T \;\Rightarrow\; \text{corrosion (high confidence)}

\text{output}(1) > \text{output}(2) \text{ and confidence} < T \;\Rightarrow\; \text{corrosion (low confidence)}

\text{output}(1) < \text{output}(2) \;\Rightarrow\; \text{corrosion-free}. (59)
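A small sketch of the decision rule in Equation (59) follows; out1 and out2 stand for the two network outputs of a block, and the threshold defaults to the experimentally selected value of 0.65.

```python
# Sketch of the block-level decision rule of Equation (59); out1 and out2 are
# the two neural-network outputs for a block and T = 0.65 as quoted in the text.
def classify_block(out1, out2, T=0.65):
    confidence = abs(out1 - out2)
    if out1 > out2:
        return "corrosion (high confidence)" if confidence >= T else "corrosion (low confidence)"
    return "corrosion-free"
```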

Because of the above decision making procedure, it is possible to process multiple images and perform information fusion over the captured data. In addition, a single corrosion map can be generated where each region has the largest confidence value extracted from the different images. The probability of correct detection for this algorithm is 94% (Siegel and Gunatilake 1998).

Another way to evaluate the colour characteristics of a corroded region is to convert the RGB colour image into the HSI colour space. In the HSI colour space it is possible to express colour characteristics independently of brightness; for this reason, the HSI colour space is a suitable choice for identifying corroded areas quantitatively (Choi and Kim 2005). The following equations show how the RGB components of an image can be converted to the HSI components:

I = \frac{1}{3}(R + G + B), (60)

S = 1 - \frac{3}{R + G + B}\,\min(R, G, B), (61)

and

H = \cos^{-1}\!\left[ \frac{\tfrac{1}{2}\left[(R - G) + (R - B)\right]}{\sqrt{(R - G)^2 + (R - B)(G - B)}} \right]. (62)

Figure 31 shows the HSI colour space, in which the saturation varies from 0 to 1, the hue varies from 0 to 360°, and the intensity varies from 0 (black) to 1 (white).

Binarising the saturation component of an image segments the pixels that have higher saturation values than the rest of the image, which, in many cases, results in segmenting the corroded area. This indicates the significance of the saturation component. Figure 32 shows the result of binarising the saturation component of actual corrosion images using the threshold technique described in §6.1, where the white areas in Figures 32(b), (d), (f), (h) and (j) are the potential corroded areas.
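A minimal sketch of this saturation-based segmentation follows: the saturation component of Equation (61) is computed from an RGB image and binarised; the Otsu threshold is an assumed stand-in for the thresholding technique referenced above (scikit-image assumed).

```python
# Minimal sketch of the saturation-based segmentation of Figure 32: compute the
# saturation component of Equation (61) and binarise it. The Otsu threshold is
# an assumed stand-in for the thresholding technique referenced above.
import numpy as np
from skimage.filters import threshold_otsu

def saturation_map(rgb):
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1) + 1e-9            # avoid division by zero
    S = 1.0 - 3.0 * rgb.min(axis=-1) / total   # Equation (61)
    return S > threshold_otsu(S)
```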

Choi and Kim (2005) classified several types of corrosion defects. They proposed dividing each H and S component into 10 × 10 pixel blocks and then treating the histogram of each block as a distribution of random variables. After applying the PCA and varimax approach, it was concluded that the mean H value, the mean S value, the median S value, the skewness of the S distribution and the skewness of the I distribution are appropriate features to assign to each block for classification (Choi and Kim 2005). The co-occurrence matrix is used for texture feature extraction based on the azimuth difference of points on a surface. This approach may not be useful for the problem in question since it requires microscopes to capture the images; the magnification factor of the images tested by Choi and Kim (2005) is between 50 and 500, which is far beyond the magnification factor of regular digital cameras.

8. Summary and conclusions

Among the possible techniques for inspecting civil infrastructure, the use of optical instrumentation that relies on image processing is a less time-consuming and inexpensive alternative to current monitoring methods.


This paper provides a survey and an evaluation of some of the promising vision-based approaches for the automatic detection of missing (deformed) members, cracks and corrosion in civil infrastructure systems. Several examples based on laboratory studies are presented to illustrate the utility, as well as the limitations, of the leading approaches.

Image registration can be the key to resolving issues such as the visual detection of missing or deformed structural members and the 3D reconstruction of truss systems, which can lead to localising the detected changes. Different feature detection and feature matching techniques for image registration are presented in this paper, including the Moravec operator, the Harris (Plessey) operator, the SIFT detector and the RANSAC algorithm; however, current registration techniques need to be improved in order to reconstruct a 3D model of truss-like structures for localisation purposes.

Morphological image processing techniques are promising approaches for segmenting probable crack-like patterns from the rest of the image. Two morphological algorithms based on modifying the top-hat and bottom-hat transformations are the best algorithms for this purpose, as verified by the authors. Several edge-based techniques are discussed, including the Sobel, Roberts, Prewitt, Canny, Laplacian, Laplacian of Gaussian, FFT and wavelet transform approaches. The crack detection techniques discussed in this paper are highly promising, but the problem of irregular backgrounds requires further study.

Discrete wavelet filter banks are introduced, and several applications of wavelet transform coefficients in edge detection, crack detection, image enhancement and compression, as well as corrosion detection, are described.

In order to segment corrosion-like regions from the rest of the image, both texture and colour analysis should be applied. Multi-resolution wavelet analysis is a powerful tool for characterising the appropriate features for texture classification. Decision making is a vital step in detecting cracks and corrosion regions, as well as in texture classification. The concept of pattern recognition, including supervised and unsupervised classification, and the performance of classification techniques such as k-nearest neighbour, k-means and neural networks are discussed. Neural network classification has an acceptable performance in most cases. For colour analysis, the YIQ and HSI colour spaces appear to provide suitable features for the classification process. Corrosion detection requires more research to correctly segment and classify the defective regions.

Neuro-fuzzy systems simultaneously benefit from the tolerance of fuzzy logic systems for imprecise data (vague definitions) and the tolerance of neural networks for noisy data. The easily comprehensible linguistic terms and if-then rules of fuzzy systems and the learning capabilities of neural networks are fused into a neuro-fuzzy system. These capabilities make neuro-fuzzy expert systems appropriate for pattern recognition and defect classification.

Data fusion and image acquisition of a scene using several cameras or different lighting conditions are very interesting aspects of the class of problems under discussion, and they require more research. The results discussed within this paper show that image processing and pattern recognition techniques are promising approaches for the contactless, nondestructive health monitoring of many civil infrastructure systems.

Acknowledgements

This study was supported in part by grants from the National Science Foundation.

References

Abdel-Qader, I., Abudayyeh, O., and Kelly, M.E., 2003. Analysis of edge-detection techniques for crack identification in bridges. Journal of Computing in Civil Engineering, 17 (4), 255–263.

Abdou, I., 1973. Quantitative methods of edge detection. Los Angeles, CA: Image Processing Institute, University of Southern California. Technical Report USCIPI Report 830.

Abdou, I.E. and Pratt, W.K., 1979. Quantitative design and evaluation of enhancement/thresholding edge detectors. Proceedings of the IEEE, 67 (5), 753–763.

Antoine, J.-P., Murenzi, R., Vandergheynst, P., and Ali, S.T., 2004. Two-dimensional wavelets and their relatives. Cambridge, UK: Cambridge University Press.

Achler, O. and Trivedi, M.M., 2004. Camera based vehicle detection, tracking, and wheel baseline estimation approach. In: Proceedings of IEEE conference on intelligent transportation systems. ITSC, 743–748.

Al-Otum, H.M., 2003. Morphological operators for color image processing based on Mahalanobis distance measure. Optical Engineering, 42 (9), 2595–2606.

Alageel, K. and Abdel-Qader, I., 2002. Haar transform use in image processing. Kalamazoo, MI: Department of Electrical and Computer Engineering, Western Michigan University, Technical Report.

Bachmann, G., Narici, L., and Beckenstein, E., 2000. Fourier and wavelet analysis. New York, NY: Springer.

Benning, W., Gortz, S., Lange, J., Schwermann, R., and Chudoba, R., 2003. Development of an algorithm for automatic analysis of deformation of reinforced concrete structures using photogrammetry. VDI Berichte, 1757, 411–418.

Bosc, M., Heitz, F., Armspach, J.-P., Namer, I., Gounot, D., and Rumbach, L., 2003. Automatic change detection in multimodal serial MRI: application to multiple sclerosis lesion evolution. Neuroimage, 20, 643–656.

Brown, L.G., 1992. A survey of image registration techniques. ACM Computing Surveys, 24 (4), 325–376.


Bruzzone, L. and Serpico, S.B., 1997. An iterative technique for the detection of land-cover transitions in multitemporal remote-sensing images. IEEE Transactions on Geoscience and Remote Sensing, 35 (4), 858–867.

Buchsbaum, W.H., 1968. Color TV servicing. 2nd ed. Englewood Cliffs: Prentice Hall Press.

Canny, J., 1986. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8 (6), 679–698.

Chae, M.J., 2001. Automated interpretation and assessment of sewer pipeline. Thesis (PhD). Purdue University.

Chang, S.G., Yu, B., and Vetterli, M., 2000. Spatially adaptive wavelet thresholding with context modeling for image denoising. IEEE Transactions on Image Processing, 9 (9), 1522–1531.

Cheng, H.D., Jiang, X.H., Sun, Y., and Wang, J., 2001. Color image segmentation: advances and prospects. Pattern Recognition, 34 (12), 2259–2281.

Choi, K. and Kim, S., 2005. Morphological analysis and classification of types of surface corrosion damage by digital image processing. Corrosion Science, 47 (1), 1–15.

Chung, H.-C., Liang, J., Kushiyama, S., and Shinozuka, M., 2004. Digital image processing for non-linear system identification. International Journal of Non-linear Mechanics, 39, 691–707.

Coffey, J.M., 1988. Non-destructive testing – the technology of measuring defects. CEGB Research, 21, 36–47.

Collins, R.T., Lipton, A.J., and Kanade, T., 2000. Introduction to the special section on video surveillance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22 (8), 745–746.

Comer, M.L. and Delp, E.J., 1999. Morphological operations for color image processing. Journal of Electronic Imaging, 8 (3), 279–289.

Coppin, P.R. and Bauer, M.E., 1996. Digital change detection in forest ecosystems with remote sensing imagery. Remote Sensing Reviews, 13 (3–4), 207–234.

Daubechies, I., 1992. Ten lectures on wavelets. Philadelphia, USA: SIAM.

Davis, L.S., 1975. A survey of edge detection techniques. Computer Graphics and Image Processing, 4, 248–270.

Deer, P. and Eklund, P., 2002. Values for the fuzzy c-means classifier in change detection for remote sensing. In: Proceedings of 9th international conference on information processing and management of uncertainty, 187–194.

DeVore, R.A., Jawerth, B., and Lucier, B.J., 1992. Image compression through wavelet transform coding. IEEE Transactions on Information Theory, 38 (2), 719–746.

Duda, R.O. and Hart, P.E., 1973. Pattern classification and scene analysis. New York, NY: John Wiley & Sons.

Duda, R.O., Hart, P.E., and Stork, D.G., 2001. Pattern classification. 2nd ed. New York, NY: Wiley.

Dudziak, M.J., Chervonenkis, A.Y., and Chinarov, V., 1999. Nondestructive evaluation for crack, corrosion, and stress detection for metal assemblies and structures. In: Proceedings of SPIE – nondestructive evaluation of aging aircraft, airports and aerospace hardware III, 3586, 20–31.

Dumskyj, M.J., Aldington, S.J., Dore, C.J., and Kohner, E.M., 1996. The accurate assessment of changes in retinal vessel diameter using multiple frame electrocardiograph synchronised fundus photography. Current Eye Research, 15 (6), 625–632.

Edgington, D.R., Salamy, K.A., Risi, M., Sherlock, R.E., Walther, D., and Koch, C., 2003. Automated event detection in underwater video. Oceans Conference Record (IEEE), 5, 2749–2753.

Faugeras, O., 1993. Three-dimensional computer vision: a geometric viewpoint. Cambridge, MA: MIT Press.

Fischler, M.A. and Bolles, R.C., 1981. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24 (6), 381–395.

Frei, W. and Chen, C., 1977. Fast boundary detection: a generalization and a new algorithm. IEEE Transactions on Computers, C-26 (10), 988–998.

Garcia-Alegre, M.C., Ribeiro, A., Guinea, D., and Cristobal, G., 2000. Eggshell defects detection based on color processing. Machine Vision Applications in Industrial Inspection VIII, 3966, 280–287.

Giakoumis, I., Nikolaidis, N., and Pitas, I., 2006. Digital image processing techniques for the detection and removal of cracks in digitized paintings. IEEE Transactions on Image Processing, 15 (1), 178–188.

Goldin, S.E. and Rudahl, K.T., 1986. Tutorial image processing system: a tool for remote sensing training. International Journal of Remote Sensing, 7 (10), 1341–1348.

Gonzalez, R.C. and Wintz, P., 1987. Digital image processing. 2nd ed. Boston, MA: Addison-Wesley.

Gonzalez, R.C. and Woods, R.E., 1992. Digital image processing. Boston, MA: Addison-Wesley.

Gonzalez, R.C., Woods, R.E., and Eddins, S.L., 2004. Digital image processing using MATLAB. Upper Saddle River, NJ: Prentice Hall.

Gunatilake, P., Siegel, M.W., Jordan, A.G., and Podnar, G.W., 1997. Image understanding algorithms for remote visual inspection of aircraft surfaces. In: Proceedings of SPIE – international society for optical engineering, 3027, 2–13.

Hanji, T., Tateishi, K., and Kitagawa, K., 2003. 3-D shape measurement of corroded surface by using digital stereography. In: Proceedings of 1st international conference on structural health monitoring and intelligent infrastructure, 1, 699–704.

Harris, C. and Stephens, M., 1988. A combined corner and edge detector. In: Alvey vision conference, Plessey Research Roke Manor, UK, 147–152.

Hartley, R. and Zisserman, A., 2000. Multiple view geometry in computer vision. Cambridge, UK: Cambridge University Press.

Henstock, P.V. and Chelberg, D.M., 1996. Automatic gradient threshold determination for edge detection. IEEE Transactions on Image Processing, 5 (5), 784–787.

Ho, S.K., White, R.M., and Lucas, J., 1990. Vision system for automated crack detection in welds. Measurement Science and Technology, 1 (3), 287–294.

Hogg, D.C., 1993. Shape in machine vision. Image and Vision Computing, 11 (6), 309–316.

Horn, B.K.P., 1986. Robot vision. Cambridge, MA: MIT Press, McGraw-Hill Book Co.

Huertas, A. and Nevatia, R., 2000. Detecting changes in aerial views of man-made structures. Image and Vision Computing, 18, 583–596.

Jolliffe, I.T., 2002. Principal component analysis. 2nd ed. New York, NY: Springer.

Kaseko, M.S., Lo, Z.-P., and Ritchie, S.G., 1994. Comparison of traditional and neural classifiers for pavement-crack detection. Journal of Transportation Engineering, 120 (4), 552–569.


Kumar, S. and Taheri, F., 2007. Neuro-fuzzy approaches for pipeline condition assessment. Nondestructive Testing and Evaluation, 22 (1), 35–60.

Lebart, K., Trucco, E., and Lane, D., 2000. Real-time automatic sea-floor change detection from video. Oceans Conference Record (IEEE), 2, 1337–1343.

Lee, K.H., 2005. First course on fuzzy theory and applications. Berlin: Springer-Verlag.

Lemieux, L., Wieshmann, U.C., Moran, N.F., Fish, D.R., and Shorvon, S.D., 1998. The detection and significance of subtle changes in mixed-signal brain lesions by serial MRI scan matching and spatial normalization. Medical Image Analysis, 2 (3), 227–242.

Lowe, D.G., 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60 (2), 91–110.

Lu, D., Mausel, P., Brondizio, E., and Moran, E., 2004. Change detection techniques. International Journal of Remote Sensing, 25 (12), 2365–2407.

Mallat, S.G., 1989. Multifrequency channel decompositions of images and wavelet models. IEEE Transactions on Acoustics, Speech and Signal Processing, 37 (12), 2091–2110.

Mallat, S. and Zhong, S., 1992. Characterization of signals from multiscale edges. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14 (7), 710–732.

Matheron, G., 1975. Random sets and integral geometry. New York, NY: Wiley.

Meyer, F., 1986. Automatic screening of cytological specimens. Computer Vision, Graphics and Image Processing, 35 (3), 356–369.

Minkowski, H., 1903. Volumen und Oberfläche. Mathematische Annalen, 57 (4), 447–495.

Misiti, M., Misiti, Y., Oppenheim, G., and Poggi, J.-M., 2006. Wavelet toolbox users guide, version 3 [online]. Natick, MA: MathWorks, Inc. Available from: http://www.mathworks.com/access/helpdesk/help/pdf_doc/wavelet/wavelet_ug.pdf

Moravec, H.P., 1977. Towards automatic visual obstacle avoidance. In: Proceedings of 5th international joint conference on artificial intelligence, 584.

Moravec, H.P., 1979. Visual mapping by a robot rover. In: Proceedings of international joint conference on artificial intelligence, 598–600.

Moselhi, O. and Shehab-Eldeen, T., 2000. Classification of defects in sewer pipes using neural networks. Journal of Infrastructure Systems, 6 (3), 97–104.

Nieniewski, M., Chmielewski, L., Jozwik, A., and Sklodowski, M., 1999. Morphological detection and feature-based classification of cracked regions in ferrites. Machine Graphics and Vision, 8 (4), 699–712.

Pal, N.R. and Pal, S.K., 1993. A review of image segmentation techniques. Pattern Recognition, 26 (9), 1277–1294.

Parks, D. and Gravel, J.-P., 2005. Corner detectors [online]. Center for Intelligent Machines, McGill University. Available from: http://www.cim.mcgill.ca/~dparks/CornerDetector/index.htm [Accessed 1 December 2005].

Peng, L., Zhao, Z., Cui, L., and Wang, L., 2004. Remote sensing study based on IRSA remote sensing image processing system. In: Proceedings of IEEE international geoscience and remote sensing symposium: science for society: exploring and managing a changing planet (IGARSS), 7, 4829–4832.

Pines, D. and Aktan, A.E., 2002. Status of structural health monitoring of long-span bridges in the United States. Progress in Structural Engineering and Materials, 4, 372–380.

Porter, R. and Canagarajah, N., 1996. A robust automatic clustering scheme for image segmentation with wavelets. IEEE Transactions on Image Processing, 5 (4), 662–665.

Poudel, U.P., Fu, G., and Ye, J., 2005. Structural damage detection using digital video imaging technique and wavelet transformation. Journal of Sound and Vibration, 286 (4–5), 869–895.

Prasad, L. and Iyenger, S., 1997. Wavelet analysis with applications to image processing. Boca Raton, FL: CRC Press.

Pratt, W.K., 2001. Digital image processing. 3rd ed. New York, NY: Wiley.

Prewitt, J.M.S., 1970. Object enhancement and extraction. New York, NY: Academic Press.

Radke, R.J., Andra, S., Al-Kofahi, O., and Roysam, B., 2005. Image change detection algorithms: a systematic survey. IEEE Transactions on Image Processing, 14 (3), 294–307.

Reed, T.R. and Dubuf, J.M.H., 1993. A review of recent texture segmentation and feature extraction techniques. CVGIP: Image Understanding, 57 (3), 359–372.

Rey, D., Subsol, G., Delingette, H., and Ayache, N., 2002. Automatic detection and segmentation of evolving processes in 3D medical images: application to multiple sclerosis. Medical Image Analysis, 6 (2), 163–179.

Ringer, M. and Morris, R.D., 2001. Robust automatic feature detection and matching between multiple images. Research Institute for Advanced Computer Science, RIACS Technical Report 01.27.

Roberts, L.G., 1965. Machine perception of three-dimensional solids. Cambridge, MA: MIT Press, 159–197.

Rousseeuw, P.J., 1987. Robust regression and outlier detection. New York, NY: John Wiley & Sons.

Salembier, P., 1990. Comparison of some morphological segmentation algorithms based on contrast enhancement. Application to automatic defect detection. In: Proceedings of 5th European signal processing conference, 833–836.

Schlag, J.F., Sanderson, C., Neuman, C.P. and Wimberly, F.C., 1983. Implementation of automatic focusing algorithms for a computer vision system with camera. Carnegie-Mellon University, Pittsburgh, PA 15213, Technical Report.

Scott, G. and Longuet-Higgins, H., 1991. An algorithm for associating the features of two patterns. In: Proceedings Royal Society London B, 244, 21–26.

Serra, J., 1982. Image analysis and mathematical morphology. London, UK: Academic Press.

Shinozuka, M., 2003. Homeland security and safety. In: Proceedings of structural health monitoring and intelligent infrastructure, 2, 1139–1145.

Shinozuka, M., Chung, H.-C., Ichitsubo, M., and Liang, J., 2001. System identification by video image processing. In: Proceedings of SPIE – international society for optical engineering, 4330, 97–107.

Shubinsky, G., 1994. Visual and infrared imaging for bridge inspection. Northwestern University Basic Industrial Research Laboratory.

Siegel, M. and Gunatilake, P., 1998. Remote enhanced visual inspection of aircraft by a mobile robot. IEEE workshop on emerging technologies, intelligent measurement and virtual systems for instrumentation and measurement. St. Paul, MN, USA.

Singh, A., 1989. Digital change detection techniques using remotely-sensed data. International Journal of Remote Sensing, 10 (6), 989–1003.


Sinha, S.K., 2000. Automated underground pipe inspection using a unified image processing and artificial intelligence methodology. Thesis (PhD). University of Waterloo, Waterloo, Ontario, Canada.

Sinha, S.K. and Fieguth, P.W., 2006a. Classification of underground pipe scanned images using feature extraction and neuro-fuzzy algorithm. Automation in Construction, 15 (1), 58–72.

Sinha, S.K. and Fieguth, P.W., 2006b. Morphological segmentation and classification of underground pipe images. Machine Vision and Applications, 17 (1), 21–31.

Sinha, S.K., Fieguth, P.W., and Polak, M.A., 2003. Computer vision techniques for automatic structural assessment of underground pipes. Computer Aided Civil and Infrastructure Engineering, 18 (2), 95–112.

Skarbek, W. and Koschen, A., 1994. Colour image segmentation: a survey. Institute for Technical Informatics, Technical University of Berlin, Technical Report.

Strang, G. and Nguyen, T., 1996. Wavelets and filter banks. Wellesley, MA: Wellesley–Cambridge Press.

Thirion, J.-P. and Calmon, G., 1999. Deformation analysis to detect and quantify active lesions in three-dimensional medical image sequences. IEEE Transactions on Medical Imaging, 18 (5), 429–441.

Torr, P.H.S. and Murray, D.W., 1997. The development and comparison of robust methods for estimating the fundamental matrix. International Journal of Computer Vision, 24 (3), 271–300.

Tsao, S., Kehtarnavaz, N., Chan, P., and Lytton, R., 1994. Image-based expert-system approach to distress detection on CRC pavement. Journal of Transportation Engineering, 120 (1), 62–64.

Villasenor, J.D., Belzer, B., and Liao, J., 1995. Wavelet filter evaluation for image compression. IEEE Transactions on Image Processing, 4 (8), 1053–1060.

Wang, K.C., Nallamothu, S., and Elliott, R.P., 1998. Classification of pavement surface distress with an embedded neural net chip. In: Artificial neural networks for civil engineers: advanced features and applications, 131–161.

Watanabe, S., Miyajima, K., and Mukawa, N., 1998. Detecting changes of buildings from aerial images using shadow and shading model. In: Proceedings of 14th international conference on pattern recognition, 2 (2), 1408–1412.

Yu, M., Wang, R., Jiang, G., Liu, X., and Cho, T.-Y., 2004. New morphological operators for color image processing. In: Proceedings of IEEE region 10 annual international conference, A, 443–446.

Zhang, Z., 1998. Determining the epipolar geometry and its uncertainty: a review. International Journal of Computer Vision, 27 (2), 161–195.

Ziou, D. and Tabbone, S., 1998. Edge detection techniques – an overview. Pattern Recognition and Image Analysis, 8 (4), 537–559.

Zitova, B. and Flusser, J., 2003. Image registration methods: a survey. Image and Vision Computing, 21 (11), 977–1000.
