
Automated Crack Detection on Concrete Bridges
Prateek Prasanna, Kristin J. Dana, Nenad Gucunski, Basily Basily, Hung La, Ronny Lim, and Hooman Parvardeh

Abstract—Detection of cracks on bridge decks is a vital task for maintaining the structural health and reliability of concrete bridges. Robotic imaging can be used to obtain bridge surface image sets for automated on-site analysis. We present a novel automated crack detection algorithm, the STRUM (spatially-tuned robust multi-feature) classifier, and demonstrate results on real bridge data using a state-of-the-art robotic bridge scanning system. By using machine learning classification, we eliminate the need for manually tuning threshold parameters. The algorithm uses robust curve fitting to spatially localize potential crack regions even in the presence of noise. Multiple visual features that are spatially tuned to these regions are computed. Feature computation includes examining the scale space of the local feature in order to represent the information at the unknown salient scale of the crack. The classification results are obtained with real bridge data from hundreds of crack regions over two bridges. This comprehensive analysis shows a peak STRUM classifier performance of 95%, compared with 69% accuracy from a more typical image-based approach. In order to create a composite global view of a large bridge span, an image sequence from the robot is aligned computationally to create a continuous mosaic. A crack density map for the bridge mosaic provides a computational description as well as a global view of the spatial patterns of bridge deck cracking. The bridges surveyed for data collection and testing include the Long Term Bridge Performance (LTBP) program's pilot project bridges at Gainesville, Virginia and Sacramento, California.

Note to Practitioners—The automated crack detection algorithm can analyze an image sequence with full video coverage of the region of interest at high resolution (approximately 0.6 mm pixel size). The image sequence can be acquired with a robotic measurement device with attached cameras or with a mobile cart equipped with surface imaging cameras. The automated algorithm can provide a crack map from this video sequence that creates a seamless photographic panorama with annotated crack regions. Crack density (the number of cracks per region) is illustrated in the crack map because individual cracks are difficult to see at the magnification required to view large regions of the bridge deck.

I. INTRODUCTION

Surface health monitoring (SHM) of bridge decks plays a vital role in maintaining the structural health and reliability of concrete bridges. Early detection of small cracks on bridge decks is an important maintenance task. More than 100,000 bridges across the United States have exhibited early age bridge-deck cracking [3]. Many bridges exhibit defects in early

P. Prasanna and K. Dana are with the Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ 08854, USA (e-mail: [email protected] and [email protected]).

N. Gucunski is with the Department of Civil and Environmental Engineering, Rutgers University. H. La, R. Lim and H. Parvardeh are with the Center for Advanced Infrastructure and Transportation, Rutgers University, Piscataway, NJ 08854, USA (email: [email protected], [email protected] and [email protected]).

B. Basily is with the Department of Industrial and Systems Engineering, Rutgers University, Piscataway, NJ 08854, USA (email: [email protected]).

Fig. 1. (Left) Bridge inspection at the Gainesville, Virginia testing site. The white dots seen in the image are the bridge deck grid-markings. (Right) RABIT (Robotic Assessment Bridge Inspection Tool): the Long Term Bridge Performance (LTBP) program of the Federal Highway Administration (FHWA) [1] has developed a robotic scanning system for imaging bridge decks using non-destructive evaluation techniques (ground-penetrating radar, impact-echo measurements) and high resolution images. Using this system, we collect image data from bridge decks for developing and testing an automated crack detection algorithm. Path-planning, sensor geometry and collision detection for the robot scanning are discussed in [2].

stages immediately after construction. As cracks appear on the deck, paths are created for water and corrosive agents to reach the subsurface rods and steel reinforcements, requiring costly maintenance and repair.

The current practice of human visual inspection is a time-consuming process for long-span bridges. Typically, square grid coordinates are marked on the bridge deck. Skilled inspectors go to the site and assess the deck condition, grid by grid, marking the corrosions and cracks on a chart, all under strict traffic control (Figure 1). Automated and accurate condition assessment that requires minimal lane closure is highly desirable for fast large-area evaluation. Robotic bridge scanning is revolutionizing the process of bridge inspection [4]. However, a key challenge is automated interpretation of the large image dataset in order to infer bridge condition. We present a novel automated crack detection algorithm and demonstrate results using data from a state-of-the-art robotic bridge scanning system illustrated in Figure 1. Many prior crack detection methods use simple edge detection or image thresholding, but these methods are not robust to noise and require manual parameter setting and adjustment. When the cracks are high contrast regions against a near uniform background (see Figure 2), the simple methods may perform well. However, real world concrete images have cracks of variable appearance and competing visual clutter. Additionally, manual parameter adjustment is an impediment to automatically analyzing large datasets over multiple bridges. Our approach uses machine learning and optimization in order to successfully detect cracks while eliminating the need for tuning threshold parameters. We develop the STRUM (spatially-tuned robust multi-feature) classifier to


Fig. 2. (Top Row) Prior work uses images that are high contrast and low clutter, similar to the images shown. Simple methods such as edge detection and thresholding for binarization can be applied to these images. (Bottom Row) Real bridge deck images from our work show significantly more distractors and pose a more challenging crack detection problem.

obtain high-performance accuracy on images of real concrete. The STRUM algorithm starts with robust curve fitting to spatially localize potential crack regions even in the presence of noise and distractors. Visual features that are spatially tuned to these regions are computed. The computation of these visual features includes scale space saliency to account for the unknown scale of the crack. A suite of possible combinations of visual features is evaluated experimentally with three classifier methods: support vector machines [5], adaboost [6] and random forests [7]. The tests are done with real bridge data and hundreds of crack images from the Long Term Bridge Performance (LTBP) program's [1] pilot project bridges at Gainesville, Virginia and Sacramento, California. In order to create a composite global view of a large bridge span, an image sequence from the robot is aligned computationally to create a continuous mosaic. A crack density map for the bridge mosaic provides a computational description as well as an at-a-glance view of the spatial patterns of bridge deck cracking.

A. Related Work

In prior work, many automated crack detection algorithms

for bridge decks emphasize high-contrast, visually distinct cracks. Standard image processing methods, including edge following, image thresholding and morphology operations, are applicable for these cracks. Numerous successful approaches have been demonstrated on high-contrast, low-clutter crack regions as described in [8]–[15]. However, crack images from real concrete bridge decks are often immersed in significant visual clutter, as illustrated in Figure 2, and are more difficult to detect in an automated manner. To illustrate the point, Figure 3 shows the output of a recent crack detection algorithm [15] compared with our STRUM classifier on a sample image from our dataset.

Machine learning has been applied to visual recognition and classification, and generally performs better than methods with hand-tuned parameters [16]–[21]. Example-based machine learning is referred to as supervised learning and enables statistical inference based on the relevant data without the

Fig. 3. (Left) Original image showing a surface with cracks. (Center) Output of a prior crack detection algorithm [15]. (Right) Output of our algorithm.

need for manual parameter adjustment as in prior methods such as the percolation algorithm [9], [10] and binarization methods [8], [12], [14]. For the task of automated bridge crack detection, the trend toward using machine learning algorithms is just beginning to take hold. Machine learning is a large field, and there is no single best algorithm for all classification tasks. Constructing a suitable algorithm requires developing the right representation of the data. For example, neural nets are used to determine crack orientation [22]; however, the representation relies on standard image binarization. Automated classifiers are used in [23], including support vector machines, nearest neighbor and neural nets, with input features such as crack eccentricity, solidity and compactness. A drawback of this method is that the input features depend on first segmenting the crack with standard manually tuned methods. If this segmentation fails, the features for the classifier are not directly meaningful. In our approach, robust line-fitting is used to find crack segments; the method is robust because it is known to work well in the presence of noise and clutter. The features are computed relative to the local line fit, and they are relevant for classification even if the line does not fall on a crack segment. Our approach of robust spatial tuning combined with a localized multi-feature appearance vector is a key contribution of the STRUM algorithm.

The percolation method [9]–[11] also provides spatial tuning by following the local image gradient to arrive at a minimum intensity pixel. Shape evaluation is performed on the set of such pixels that descend into a thin-shaped crack region. As with any non-convex gradient-descent algorithm, the objective function can have local minima, especially for noisy or cluttered concrete images. For high-contrast, very distinct edges, the problem is relatively easy and a percolation method is sufficient. However, the percolation process requires hand-tuned parameters and does not have the benefits of a machine learning approach described above.

A robust automatic crack detection method for noisy concrete surfaces is proposed in [24]; however, the "noise" in this method is image shadows. Shadows have soft edges, and therefore the multiscale processing outlined in that paper is very effective for their removal. For our data, the visual distractors are not due to shadowing and cannot be removed with the same soft-edge assumption. The experimental results in [24] evaluate 60 crack images. In fact, several crack detection algorithms presented in the literature are tested using a very small image test set. For example, a genetic programming algorithm [8] shows results with 6 training and 17 test images


Fig. 4. Method overview. The STRUM (Spatially Tuned RobUst Multi-feature) classifier consists of three components: (1) a robust line segment detector, (2) spatially-tuned multiple feature computation and (3) a machine learning classifier. The resulting local crack maps over a bridge span are combined by forming an image mosaic where individual frames are aligned to a single coordinate frame and stitched for a composite image. A crack density map is computed providing a global view of the crack densities across the bridge.

(with run times of 10 hours); a wavelet transform method [13] uses only three images for testing; a Haar transform method in [25] uses a single image for testing; a fast Haar transform method [26] uses 50 test images.

Finding maxima in scale space is a well-established method to find the best scale for a visual feature [27]–[29]. Few algorithms in the concrete crack detection literature use multi-scale signal processing methods to analyze the images at the best scale [13], [26]. Wavelets in the form of a fast Haar transform are used in [26], but the approach is outdated in its use of manually tuned parameters and comparison to edge detection. Our approach to multiscale analysis is to find the maximum response over scale space using a Laplacian pyramid [30]–[32]. Other methods use multiscale analysis, reporting the evolution of cracks as a function of scale [33], but do not seek scale-space maxima.

For the task of bridge deck inspection, each surface image gives only a snapshot of the bridge deck, and a full view at high resolution requires stitching together the entire video scan. Few crack detection approaches integrate this image stitching issue. The need for stitching is discussed in [14] but only applied to two adjacent images. Large scale stitching is much more challenging than two-image stitches, and constraints on the homography parameters are needed for accurate stitching. Stitching multi-angle camera views and multiple camera distances with respect to the bridge surface is addressed in [23], [33], [34]. By estimating the camera position, the method can undo crack distortion from perspective projection. Our robotic scanner views the surface frontally, so this distortion is not an issue for our system. Also, the camera motion in our system is for scanning the bridge surface sequentially instead of getting multiple views of the same crack region. We apply image stitching to a large bridge span by accurately stitching an image sequence covering a 35 ft. bridge section. The robotic measurement system provides a

high resolution image-scan of the bridge, spatially covering the entire surface and sampling densely enough so that adjacent frames have sufficient overlap for image-based alignment. Alignment is not a trivial application of the robotic translation parameters because subpixel alignment is needed for stitching high resolution images. We perform computational auto-alignment by matching visual keypoints and minimizing the variation of the homography from a translation. In doing so, we obtain a stitched mosaic which unifies all images into a global map. Annotated thin cracks are not visible on a global map when viewed at a "zoomed out" magnification, so we compute a crack density map in order to readily see regions with cracks at a glance of the global map. This unification of local crack detection with a global large bridge deck view is not addressed elsewhere in the literature for bridge decks.

An important aspect of using a supervised learning approach is having access to high quality relevant training data. Related work [23] uses machine learning classification (support vector machines, neural nets and nearest neighbor), but because of limited access to real data, they include synthetic training data. Their results on real data, for 200 training and test images, have an accuracy of 79%. We obtain a large real-data training set comprising hundreds of images with thousands of crack instances, and demonstrate accuracies over 90%. Furthermore, we demonstrate training on a bridge geographically distinct from the test bridge, showing that prior training can extend to multiple bridge decks.

In summary, much of the prior work suffers from one or more of the following issues: (1) results presented on only a few images, (2) manual parameter tuning instead of machine learning, (3) long run times, (4) applicability only to very high contrast, noise-free images, (5) use of features that are dependent on an initial segmentation from simple binarization.

Our contribution is the first algorithm combining several key desirable features. Machine learning is used to avoid


manual parameter tweaking, and we have developed a novel multi-feature representation that uses a set of features spatially tuned to a local robust line fit. Unlike many prior methods, we develop and test comprehensively using hundreds of real bridge images. A fast run-time lends itself to on-site analysis. The feature set is chosen by extensively evaluating a suite of possible features and feature combinations and choosing the best performing multi-feature vector. Our features use a scale space approach to achieve a measure of scale invariance, which is a necessary property of computer vision methods where the scale of the object to be detected is unknown a priori. We also evaluate three possible classifier techniques as a component of the final STRUM classifier. Furthermore, ours is the first work to show a bridge crack classifier where the test site is geographically distinct from the training site. This concept of training on one bridge but testing on another demonstrates the framework of training only once, yet reusing the classifier for multiple bridge decks. Finally, we provide the only work that stitches images across a large span of the bridge, with local crack regions aggregated into a first-of-its-kind global crack map that supports zooming to see fine scale cracks.

II. METHODS

The STRUM (Spatially Tuned RobUst Multi-feature) classifier consists of three components: (1) a robust line segment detector, (2) spatially-tuned multiple feature computation and (3) a machine learning classifier. The robust curve fit is done using RANdom SAmple Consensus (RANSAC) [35], where line segments are fit to pixels below a fixed percentage of the average intensity in pixel neighborhoods called blocks. The robustness of RANSAC is well known, and Figure 7 provides a visual comparison to least-squares estimation for line fits. Small blocks are used so that curved cracks can be reasonably approximated with line segments within the small region. This approximation is analogous to a Taylor series approximation, where curved cracks are represented locally with line segments. Line fits can also be replaced with a higher order curve fit, but lower order fits are typically more stable and reliable for small neighborhoods. The line segments are obtained for each block in the image, and a machine learning classification is done to classify these segments into two classes: crack or not-crack. Training examples are provided that are manually labelled with the correct class, as shown in Figure 6.
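To make the robust line-fitting step concrete, the following is a minimal NumPy sketch of a RANSAC line fit over candidate dark pixels in a block. This is an illustrative sketch, not the authors' implementation; the function name, iteration count and inlier tolerance are our assumptions.

```python
import numpy as np

def ransac_line_fit(points, n_iters=500, inlier_tol=1.5, seed=0):
    """Fit a 2-D line to noisy points with RANSAC.

    points: (N, 2) array of (x, y) candidate crack pixels.
    Returns (a, b, c) for the line ax + by + c = 0 (unit normal),
    plus a boolean inlier mask.
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_line = None
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        norm = np.hypot(d[0], d[1])
        if norm < 1e-9:
            continue  # degenerate sample: same pixel twice
        # Unit normal to the segment pq defines the candidate line
        n = np.array([-d[1], d[0]]) / norm
        c = -n @ p
        dist = np.abs(points @ n + c)       # point-to-line distances
        inliers = dist < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_line = inliers, (n[0], n[1], c)
    return best_line, best_inliers
```

Because the consensus set, not a squared-error sum, selects the model, a few bright distractor pixels cannot pull the fit off the crack, which is the behavior contrasted with least squares in Figure 7.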

The key contribution of our method is the input to the standard classifier, i.e. the crack appearance vector computed as a spatially-tuned multi-feature vector. The appearance vector is constructed using components that each contribute a partial cue to the classification decision. We evaluate the performance of classification using combinations of the features and demonstrate that the multi-feature appearance vector, which integrates several weaker cues, provides the best classifier. We investigate a suite of (1) intensity-based features, (2) gradient-based features and (3) scale-space features. Our experimental results show that combining these multiple features into one appearance vector provides optimal performance.

Fig. 5. Seekur robot mounted with cameras used for image collection. The robot is remotely operated and data is sent to the base station located in a van at the end of the bridge-testing area.

Fig. 6. Positive and negative training samples. (a)-(d) show 15×15 pixel image regions (blocks) with cracks. (e)-(h) show 15×15 pixel image regions without cracks. For our training and validation purposes, we construct a dataset of 2000 samples from two bridges, with an equal number from each bridge, and an equal number of positive and negative instances.

A. Intensity-based and Gradient-based features

Spatially-tuned features provide a quantitative description of appearance along the local line segment. As an example, the mean μ_i of the pixel intensity along the line segment is chosen as a feature component because crack pixels are typically low intensity. However, that feature alone is not sufficient for high accuracy classification, as illustrated in Figure 8, which shows the lower intensity trend for crack regions along with many violators of this trend. For very thin cracks, the darkness of the pixel intensities is not reliable. Also, a few dark pixels along the line segment in a non-crack region cause a low mean intensity. Multiple features that provide weak cues are combined in our method. Specifically, we use the following set of intensity-based features that are computed with pixels along the robustly detected line segment:

• Mean of intensity histogram (μ_i)
• Standard deviation of intensity (σ_i)
• Mean of gradient magnitudes (μ_g)
• Standard deviation of gradient magnitudes (σ_g)
• Ratio of the mean of intensity along the local line to the mean intensity in the local region (r_i)


Fig. 7. Comparison between RANSAC and least-squares fit of curves. Red lines in (a) and (c) show the curves fit to the minimum intensity points using RANSAC. Blue lines in (b) and (d) show the curves fit to the minimum intensity points using the least squares method, which misses the cracks due to the presence of outliers.

Fig. 8. An individual component of the crack appearance vector, μ_i, the mean pixel intensity along the line segments, is lower for crack regions (blocks 1-50) than non-crack regions (blocks 51-100). However, this feature alone would be insufficient to classify the crack.

The intensity standard deviation σ_i indicates that crack segments have approximately uniform intensity compared to non-crack segments. The component μ_g is used because the gradient magnitudes will be larger in crack regions, and σ_g indicates that the gradient magnitude along a crack is expected to be more uniform than for line segments in non-crack regions. The gradient vector for image I along the x and y directions is computed as

∇I = (∂I/∂x) i + (∂I/∂y) j.  (1)

Here, ∂I/∂x and ∂I/∂y are the gradient magnitudes along the x and y axes, respectively, denoted by I_x and I_y. The gradient magnitude is |∇I| = sqrt(I_x² + I_y²). In all cases a discrete approximation of the derivative is used, computed by a standard central difference filter. Finally, the feature component r_i provides a relative intensity measure of the line segment pixels compared to the background pixels; these photometric ratios have the advantage of being independent of global illumination. Each of these components provides a weak classification cue, and the concatenation of all the components comprises the multi-feature vector.
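The intensity and gradient components above can be sketched in NumPy as follows. This is a hedged illustration under our own naming: `line_segment_features` and its calling convention are assumptions, and `np.gradient` (central differences in the interior) stands in for the paper's central-difference filter.

```python
import numpy as np

def line_segment_features(block, line_pixels):
    """Intensity/gradient components of the appearance vector for one block.

    block: 2-D grayscale array (e.g. a 15x15 region).
    line_pixels: (K, 2) integer (row, col) coordinates on the fitted segment.
    Returns [mu_i, sigma_i, mu_g, sigma_g, r_i].
    """
    # Central-difference gradients, as in Eq. (1)
    gy, gx = np.gradient(block.astype(float))
    gmag = np.hypot(gx, gy)                   # |grad I| = sqrt(Ix^2 + Iy^2)
    r, c = line_pixels[:, 0], line_pixels[:, 1]
    vals = block[r, c].astype(float)
    mu_i, sigma_i = vals.mean(), vals.std()   # intensity along the segment
    mu_g, sigma_g = gmag[r, c].mean(), gmag[r, c].std()
    # Photometric ratio: insensitive to global illumination changes
    r_i = mu_i / (block.mean() + 1e-9)
    return np.array([mu_i, sigma_i, mu_g, sigma_g, r_i])
```

For a dark crack on a bright block, μ_i is low and r_i falls below 1, while a uniformly dim block (e.g. in shadow) keeps r_i near 1, which is why the ratio is the illumination-insensitive cue.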

B. Scale space features

The bridge deck images of interest have cracks of varying

sizes, as thin as a millimeter to over a centimeter. Uniformity in the representation and processing of visual information over

Fig. 9. (a) Original image. (b) Level-2 pyramid image upsampled to the same size as the original image. The Laplacian pyramid enhances edge features, which are most prominent at a characteristic scale.

Fig. 10. Patches in the second row show the line fit to the points which have the lowest Laplacian pyramid values. Patches in the first row show the curves fit to the points of minimum intensity. These patches have very fine cracks. Curves fit to pixels having the lowest Laplacian pyramid values perform better than curves fit to minimum intensity points.

multiple scales is an inherent property offered by visual systems. Laplacian pyramids are a classic coarse-to-fine strategy that assists in the search over scale space. Image structures, such as cracks, tend to have a particular salient scale; that is, they are most prominent at one level of the Laplacian pyramid, as illustrated in Figure 9. Laplacian pyramid images represent different spatial frequency bands, so that level 0 contains image information at the highest spatial frequencies and the subsequent levels correspond to lower spatial frequency bands. The spatial tuning for these features is obtained by robust line fits to the minimum tenth percentile of Laplacian pyramid values in the block, instead of pixel intensity values. The Laplacian pyramid value is the mean over pyramid levels 1, 2 and 3. As demonstrated in Figure 10, thinner cracks can be detected when curves are fit to such points (scale-space extrema).

The scale space features include the following spatially-tuned Laplacian pyramid features (computed along the local line segment):

• Maximum of Laplacian pyramid values across three levels (L_max)
• Minimum of Laplacian pyramid values across three levels (L_min)
• Mean of Laplacian pyramid level 1 values (μ_L1)
• Mean of Laplacian pyramid level 2 values (μ_L2)
• Mean of Laplacian pyramid level 3 values (μ_L3)
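The band-pass responses underlying these features can be sketched as below. Note the simplifying assumption: we build a blur-and-subtract "stack" at full resolution rather than a true downsampled Laplacian pyramid, so every level stays pixel-aligned with the block; all function names are ours, not the paper's.

```python
import numpy as np

def _blur(img):
    # Separable 5-tap binomial filter (approximates a Gaussian), reflect-padded
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    pad = np.pad(img, 2, mode="reflect")
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, tmp)

def scale_space_stack(img, levels=3):
    """Band-pass (blurred-minus-unblurred) responses at successive scales,
    all kept at the input resolution."""
    out, cur = [], img.astype(float)
    for _ in range(levels):
        low = _blur(cur)
        out.append(cur - low)   # band-pass response at this scale
        cur = low               # coarser base for the next level
    return out

def salient_scale_features(stack, line_pixels):
    """L_max, L_min and per-level means sampled along the fitted segment."""
    r, c = line_pixels[:, 0], line_pixels[:, 1]
    per_level = np.array([lvl[r, c] for lvl in stack])
    return {"L_max": per_level.max(), "L_min": per_level.min(),
            "mu_levels": per_level.mean(axis=1)}
```

A dark crack produces strongly negative band-pass values along the segment at its salient scale, which is why minima of these responses localize thin cracks better than raw intensity (Figure 10).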


III. CLASSIFIER TRAINING AND TESTING

The crack appearance vector is used as input to machine learning classification. For statistical inference, classifiers are chosen based on empirical performance, and we investigate the following classifiers: support vector machines (SVM) [5], adaboost [6] and random forests [7]. Features are used in different combinations to evaluate the performance of each of the three classifiers and yield the highest possible accuracy [36]. The performance on the training set is analyzed using 10-fold cross validation. Furthermore, a geographically mutually exclusive classification test is provided, where the training data is obtained from one bridge and the test set is obtained from another bridge.
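The 10-fold cross validation protocol can be sketched generically in NumPy. A toy nearest-centroid classifier stands in here for the SVM/adaboost/random-forest models (whose training code the paper does not give); the function names and fold-assignment scheme are our assumptions.

```python
import numpy as np

def nearest_centroid(Xtr, ytr, Xte):
    """Toy stand-in classifier: assign each test point to the nearest class mean."""
    cents = np.array([Xtr[ytr == c].mean(axis=0) for c in np.unique(ytr)])
    d = np.linalg.norm(Xte[:, None, :] - cents[None, :, :], axis=2)
    return d.argmin(axis=1)

def kfold_accuracy(X, y, train_and_predict, k=10, seed=0):
    """Mean accuracy of train_and_predict(Xtr, ytr, Xte) over k held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)  # shuffled fold indices
    accs = []
    for i in range(k):
        te = folds[i]
        tr = np.concatenate([folds[j] for j in range(k) if j != i])
        accs.append(np.mean(train_and_predict(X[tr], y[tr], X[te]) == y[te]))
    return float(np.mean(accs))
```

The geographically exclusive test described above is the stricter variant: instead of random folds, all samples from one bridge form the training split and all samples from the other bridge form the test split.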

IV. GLOBAL VIEW: CRACK DENSITY MAPS

Crack location and characteristics are important to detect, evaluate and archive. Additionally, presentation of the crack detection results may be made to a human inspector. There is a challenge in communicating spatial patterns of cracks over the entire bridge deck. Subpixel image registration methods are used to create a large scale mosaic from the robot scanner images (as described in Section V). However, examining the full resolution mosaic annotated with cracks requires either wall-size monitors or lengthy scrolling across a large scale digital image. Downsampling the image to fit in a smaller window renders thin cracks invisible. We compute a simple crack density map where the pixel value is the crack density in the region. This computation can be slow, since a spatial average must be computed for every pixel in the large mosaic. We employ integral images, a computationally efficient technique useful when computing averages over windows for every pixel in the image [37]. The resultant crack density maps give a detailed overview of the surface degradation of the bridge deck. Algorithm IV.1 gives the pseudocode for computing a density map from a crack-labeled image, where the crack density is computed over a 60 × 60 patch centered around any pixel (x, y). The integral image at (x, y) is computed by calculating the sum of the pixel values above and to the left of (x, y). The integral image I(x, y) is given by

    I(x, y) = \sum_{x' \le x,\, y' \le y} i(x', y'),    (2)

where i(x, y) is the input image. From the integral image I(x, y), the crack density in a particular region centered at (x, y) is given by a simple linear combination of the integral image values at the four region corners, as illustrated in Figure 11.

Fig. 11. The crack density in region R is computed using the integral image values at the four corners as T4 + T1 - (T2 + T3).

Algorithm IV.1: DENSITY(Crack_image)

    convert Crack_image to Crack_binary
    I_filt = Crack_binary * 2D Gaussian filter
        comment: appropriate zero-padding required
    Crack_integral = IntegralImage(I_filt)
    for i = 1 to num_rows
        for j = 1 to num_cols
            Crack_map(i, j) = Crack_integral(i - 30, j - 30)
                            + Crack_integral(i + 30, j + 30)
                            - Crack_integral(i - 30, j + 30)
                            - Crack_integral(i + 30, j - 30)
    return (Crack_map)
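Algorithm IV.1 can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the paper's C++/Matlab code; the Gaussian pre-filtering step is omitted for brevity, and `half = 30` reproduces the 60 × 60 window:

```python
import numpy as np

def integral_image(img):
    """I(x, y) = sum of i(x', y') for all x' <= x, y' <= y (Equation 2)."""
    return img.cumsum(axis=0).cumsum(axis=1)

def crack_density_map(crack_binary, half=30):
    """Per-pixel crack density over a (2*half) x (2*half) window, using the
    four-corner lookup of Algorithm IV.1 (T4 + T1 - (T2 + T3), Figure 11).
    Zero-padding keeps the window corners in bounds near the image border."""
    pad = np.pad(crack_binary.astype(float), half + 1)
    I = integral_image(pad)
    rows, cols = crack_binary.shape
    rr, cc = np.meshgrid(np.arange(rows) + half + 1,
                         np.arange(cols) + half + 1, indexing="ij")
    return (I[rr + half, cc + half] + I[rr - half, cc - half]
            - I[rr - half, cc + half] - I[rr + half, cc - half])
```

Each pixel costs four array lookups regardless of the window size, which is the point of the integral-image technique [37].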

V. BRIDGE DECK MOSAIC

The high-resolution surface images are stitched to form a coherent spatial mosaic of the bridge deck. Image stitching has been well studied in the computer vision community [38]. This image mosaic, or panorama, is a comprehensive tool, allowing the bridge inspector to analyze the surface condition visually without being physically present on-site.

The stitcher has three major components: (1) detecting point matches between neighboring images, (2) estimating the inter-frame homography H, and (3) warping to align the images.

A. Detecting point matches

Two successive images obtained as the robot traverses the bridge have overlap between them. Extraction and matching of features between the images is done using the scale-invariant feature transform (SIFT) [39]. The SIFT algorithm computes keypoints that are useful for matching, and the method also computes matches between keypoints. For stitching, we use keypoint matches between images j - 1, j and j + 1, where j is the frame number of each image within the sequence. Figure 12 shows the SIFT matches between two successive surface images. We use the common approach of removing outliers using a RANSAC estimation of the homography to obtain high-quality matching points.

B. Calculating homographies

For general camera motion, a 3 × 3 matrix H, called a homography, relates the pixel coordinates P_0 and P_1 in two images as

    P_0 = [ h_11  h_12  h_13
            h_21  h_22  h_23
            h_31  h_32  h_33 ] P_1.    (3)

As illustrated in Figure 13, homographies can be concatenated to relate points in the reference frame to points in the current frame. The ith keypoint in the jth frame transformed to the 0th frame, ^0P_{j,i}, is related to the ith keypoint in the jth frame by

    ^0P_{j,i} = ^0H_j × ^jP_{j,i},    (4)


Fig. 12. (Top Row) SIFT matches between two successive bridge deck images captured by the robot. Lines indicate point matches between two frames. (Bottom Row) SIFT matches after RANSAC homography estimation for outlier removal.

Fig. 13. Registration of bridge deck images is obtained by computing the homography that relates the pixel coordinates of adjacent frames. Additionally, a homography relates the pixel coordinates of the current frame to the reference frame. Estimation of homographies enables image registration, or alignment, for arbitrary 3D camera motion.

where j (= 0, 1, 2, ..., M) is the frame number, i is the point-pair index, and ^0H_j is the concatenation of intermediate homographies between frame j and frame 0, given by

    ^0H_j = ^0H_1 × ^1H_2 × ... × ^{j-1}H_j.    (5)

The problem with concatenation is an accumulation of errors over local frames. The best way to avoid this is a procedure called bundle adjustment [40]–[42], which computes all of the inter-frame homographies in a global optimization over all relevant images. However, this approach is very computationally intensive. Frame-to-frame alignment works well if the homography estimation is constrained so that the rotational component is small. We follow the approach of [43] to ensure undistorted homographies by encouraging the original rectangular shape of the image. This alignment is an optimization problem with an objective function that has two terms: a reprojection error E_r to match points and a distortion error E_d. The reprojection error is

    E_r = \sum_{j=0}^{M} \sum_{i=1}^{N_j} (^0P_{j-1,i} - ^0P_{j,i})^2,    (6)

where N_j is the number of keypoints in the jth frame and M is the number of frames. The distortion term E_d can be

Fig. 14. (a) Stitched images obtained by minimizing the misalignment of matched points. (b) Stitched images with homography distortion removed by adding a distortion term to the optimization problem. There are eight images stitched together in this figure.

formulated as

    E_d = ||H[1, 0, 0]^T - [1, 0, 0]^T||^2 + ||H[0, 1, 0]^T - [0, 1, 0]^T||^2.    (7)

The total error E_t is the sum of both terms, given by

    E_t = E_r + \alpha E_d.    (8)

The Levenberg-Marquardt algorithm [44] is used for this nonlinear optimization step. Figure 14 shows the image stitching results with and without the distortion term.
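The two error terms can be evaluated directly once the matched points are expressed in the reference frame. A NumPy sketch of the objective (illustrative only; the actual minimization over the homography parameters is done with Levenberg-Marquardt, and `alpha` is the weighting of Equation 8):

```python
import numpy as np

def distortion_error(H):
    """E_d of Equation 7: penalize H for distorting the image axes."""
    e1 = np.array([1.0, 0.0, 0.0])
    e2 = np.array([0.0, 1.0, 0.0])
    return np.sum((H @ e1 - e1) ** 2) + np.sum((H @ e2 - e2) ** 2)

def total_error(H, pts_prev, pts_cur, alpha=1.0):
    """E_t = E_r + alpha * E_d (Equation 8). pts_prev and pts_cur are Nx2
    arrays of matched keypoints already mapped into the reference frame,
    so E_r is their summed squared distance (Equation 6)."""
    Er = np.sum((pts_prev - pts_cur) ** 2)
    return Er + alpha * distortion_error(H)
```

For the identity homography the distortion term vanishes, and the objective reduces to the pure reprojection error.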

VI. RESULTS

The experimental results use an image database collected by robotic scans of two bridge decks, consisting of 100 images per bridge of 3.5 ft. × 2.6 ft. concrete surface segments corresponding to 1920 × 1280 pixel image regions. The high-resolution images were captured using two Canon Rebel T3i DSLR cameras (with Canon EF-S 18-55mm f/3.5-5.6 lenses) mounted on a Seekur robot as shown in Figure 1. Mechatronics and navigation of the robot system are described in [4].

For the experimental results, combinations of candidate features (intensity-based, gradient-based and scale space) are used. The validation and training sets are chosen from 1000 samples per bridge in the labelled database, with equal numbers of crack and non-crack samples. To quantify classifier performance, a standard 10-fold cross validation is performed, which varies the training and validation sets over the entire dataset.

A. Classification with intensity and gradient-based features

Using the spatially tuned intensity-based features along the local line segments, as described in Section II, the blocks are classified and the performance metrics of the support vector machine classifier are shown in Table I. The ranking of features in order of their individual performance is r_i, μ_i, σ_g and μ_g (in decreasing order of accuracy). F1, the combined feature vector using r_i, μ_i, σ_g and μ_g, performs best. We use the 4×1 feature vector F1 and evaluate the classifier


performance on the validation set data as shown in Table II. The combined intensity-based feature vector led to an increase in accuracy and precision of the classifiers. The random forest classifier has the highest accuracy and the adaboost classifier has the highest precision.

TABLE I
COMPARISON OF SVM CLASSIFIER PERFORMANCE WITH DIFFERENT INTENSITY-BASED FEATURE VECTORS.

Feature Vectors             Accuracy (%)   Sensitivity   Precision
μ_i, σ_i                    87.2           0.78          0.844
μ_g                         69             0.64          0.658
σ_g                         68.4           0.628         0.646
μ_g, σ_g                    75.6           0.624         0.69
r_i                         84.7           0.788         0.796
μ_i, μ_g, σ_g, r_i (F1)     89.5           0.796         0.869

TABLE II
CLASSIFIER PERFORMANCE WITH THE F1 FEATURE VECTOR.

Classifier      Accuracy (%)   Sensitivity   Precision
SVM             89             0.832         0.849
Adaboost        91.2           0.74          0.974
Random Forest   92             0.784         0.916

B. Classification using scale space features

The SVM classifier performance for different combinations of Laplacian pyramid feature vectors described in Section II is shown in Table III. The performance of the three classifiers is compared in Table IV. Random forest has the highest accuracy, while SVM and adaboost also perform well. A combination of all the features in a 9×1 vector results in an increase in accuracy for all three classification algorithms. The STRUM classifier is therefore defined with this multi-feature vector as input.

TABLE III
COMPARISON OF SVM PERFORMANCE WITH DIFFERENT LAPLACIAN PYRAMID FEATURES AND THE COMBINED F1, F2 FEATURE VECTOR.

Feature Vectors                        Accuracy (%)   Sensitivity   Precision
L_max                                  53.2           0.38          0.55
L_min                                  58.2           0.404         0.63
L_max, L_min                           60             0.488         0.629
μ_L1, μ_L2, μ_L3                       61             0.448         0.663
L_max, L_min, μ_L1, μ_L2, μ_L3 (F2)    64.6           0.44          0.748
F1, F2 (9×1 feature vector)            91.6           0.9           0.93

TABLE IV
CLASSIFIER PERFORMANCE USING THE MULTI-FEATURE VECTOR (F1, F2), A 9×1 FEATURE VECTOR AS DESCRIBED IN TABLES I AND III.

Classifier      Accuracy (%)   Sensitivity   Precision
SVM             91.6           0.9           0.93
Adaboost        93.7           0.904         0.89
Random Forest   94.7           0.9           0.911

C. Geographically Distinct Test Set

The classifier performance is evaluated using geographically distinct test data, i.e., from two different bridge datasets. For this purpose, we constructed training and test sets of images

Fig. 15. Crack detection results. (a) Raw image from the Virginia bridge deck corresponding to a 2.6 ft × 3.5 ft section on the bridge. (b) Image showing the detected cracks. (c) Morphological operations (closing and hole filling) remove small cracks.

collected from bridges in both Gainesville, Virginia and Sacramento, California. The data from the Virginia bridge was collected under natural light conditions, while the data from the California bridge was collected at night under artificial lighting. For each of the two bridges, 1000 samples were labelled with equal positive and negative instances. The classifiers were evaluated with the multi-feature vector F1, F2, combining the intensity-based and the Laplacian pyramid features. The classifier performance when trained on the California bridge-deck images and tested on the Virginia bridge-deck images is shown in Table V. The best performance was with the adaboost classifier. The STRUM classifier approach, as shown in Figure 4, does not fix the choice of the machine learning classifier. This result suggests that the adaboost classifier provides a useful method for generalizing the trained model to novel bridge surfaces.

TABLE V
CLASSIFIER PERFORMANCE WITH THE CALIFORNIA BRIDGE DATASET AS THE TRAINING SET AND THE VIRGINIA BRIDGE DATASET AS THE TEST SET, WITH A 9×1 FEATURE VECTOR CONSISTING OF FEATURES F1 AND F2.

Classifier      Accuracy (%)   Sensitivity   Precision
SVM             79.6           0.96          0.723
Adaboost        90.8           0.952         0.723
Random Forest   76.3           0.676         0.685

D. Crack Density Maps

Figure 16 shows a bridge mosaic of a 35 ft. × 3.5 ft. section of the Virginia bridge deck comprised of 12 individual images from the robot scan. The corresponding density map is obtained by running the STRUM classifier on the bridge mosaic. This color map shows various levels of degradation indicated by the different colors, where dark blue corresponds


Fig. 16. Bridge deck mosaic and crack density map. (a) Mosaic of a 35 ft. × 3.5 ft. section of the Virginia bridge deck. (b) Crack density map computed by averaging the number of cracks in a region. Notice how the cracks are not apparent in (a) because their width is subpixel in the scaled mosaic, but the crack density map gives a clear visual assessment of the pattern of fine cracks over the imaged span of the bridge. (Row 3) The crack density map in (b) is superimposed on the mosaic in (a). Three of the high crack-density regions are shown zoomed for a clear depiction of the underlying cracks.

to regions of low crack density and light blue corresponds to high crack density. The superposition of the crack map and the image mosaic is also shown in Figure 16. The global crack map can be used for quantitative analysis as well as for visually assessing global crack patterns. For example, notice that an approximately periodic global pattern of cracking is detectable in Figure 16.

E. Timing Discussion

On-site analysis enables knowledge inference from the large dataset collected by the robotic bridge scanner. The computation and measurement speed of our methods supports fast analysis in its current form. With a robot speed of 3 ft/second, the 35 ft. section shown in Figure 16 is imaged in approximately 12 seconds. The individual image size is 2.6 ft. in the direction of motion and 3.5 ft. in the perpendicular direction. The robot takes images with approximately 40% overlap, i.e., the sample distance along the bridge span is approximately 1.5 ft. With a processor speed of 2.3 GHz, the total time for image collection, STRUM classification, and annotating the cracks on the 35 ft. section comprised of 12 images is approximately 33 minutes, in a completely automated manner. The current implementation uses C++ and Matlab and can easily be optimized for significant speed gains. Time efficiency can be further improved with GPU parallelization and embedded vision hardware.

VII. CONCLUSION

The STRUM classifier for crack detection on bridge decks provides a method of inspection for use in on-site robotic scanning. Since automated scanning generates large image datasets, automated analysis has clear utility in rapidly assessing bridge condition. Moreover, the results can be quantified, archived and compared over time. This work shows the first application of automated crack detection to robotic bridge scanning. The new algorithm uses a feature set that shows 90% accuracy on thousands of test cracks. Additionally, geographically separate datasets have been provided for testing and training. A thorough evaluation of multiple features and multiple classifiers on real-world data validates the methodology. A coherent spatial mosaic, along with the crack density map, serves as a new tool for inspectors to analyze bridge decks.

ACKNOWLEDGEMENT

The authors would like to thank the Federal Highway Administration, U.S. Department of Transportation, for supporting this research as part of the Long Term Bridge Performance program. The authors also express their gratitude to Kun Zhao, graduate student at Rutgers University, for his valuable contribution towards the deck-mosaicing application.

REFERENCES

[1] S. Kruschwitz, C. Rascoe, R. Feldmann, and N. Gucunski, Meeting LTBP Program Objectives through Periodical Bridge Condition Monitoring by Nondestructive Evaluation, 2009.

[2] H.M. La, R.S. Lim, B. Basily, N. Gucunski, J. Yi, A. Maher, F.A. Romero, and H. Parvardeh, "Autonomous robotic system for high-efficiency non-destructive bridge deck inspection and evaluation," in Proceedings of the IEEE International Conference on Automation Science and Engineering (CASE), August 2013.

[3] B. Wan, T. McDaniel, and C. Foley, "What's causing cracking in new bridge decks?," Wisconsin Highway Research Program.

[4] H.M. La, R.S. Lim, B.B. Basily, N. Gucunski, J. Yi, A. Maher, F.A. Romero, and H. Parvardeh, "Mechatronic systems design for an autonomous robotic system for high-efficiency bridge deck inspection and evaluation," IEEE/ASME Transactions on Mechatronics, vol. 18, no. 6, pp. 1655–1664, Dec. 2013.

[5] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[6] Y. Freund and R.E. Schapire, "Experiments with a new boosting algorithm," in International Workshop on Machine Learning, 1996, pp. 148–156.

[7] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[8] T. Nishikawa, J. Yoshida, T. Sugiyama, and Y. Fujino, "Concrete crack detection by multiple sequential image filtering," Computer-Aided Civil and Infrastructure Engineering, vol. 27, no. 1, pp. 29–47, 2012.

[9] Z. Zhu, S. German, and I. Brilakis, "Visual retrieval of concrete crack properties for automated post-earthquake structural safety evaluation," Automation in Construction, vol. 20, no. 7, pp. 874–883, 2011.

[10] T. Yamaguchi and S. Hashimoto, "Fast crack detection method for large-size concrete surface images using percolation-based image processing," Machine Vision and Applications, vol. 21, no. 5, pp. 797–809, 2010.

[11] T. Yamaguchi, S. Nakamura, and S. Hashimoto, "An efficient crack detection method using percolation-based image processing," in 3rd IEEE Conference on Industrial Electronics and Applications, June 2008, pp. 1875–1880.

[12] X. Tong, J. Guo, Y. Ling, and Z. Yin, "A new image-based method for concrete bridge bottom crack detection," in International Conference on Image Analysis and Signal Processing (IASP), Oct. 2011, pp. 568–571.

[13] H.-N. Nguyen, T.-Y. Kam, and P.-Y. Cheng, "A novel automatic concrete surface crack identification using isotropic undecimated wavelet transform," in International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS), Nov. 2012, pp. 766–771.

[14] R.S. Adhikari, O. Moselhi, and A. Bagchi, "Image-based retrieval of concrete crack properties for bridge inspection," Automation in Construction, vol. 39, pp. 180–194, 2014.

[15] R.S. Lim, H.M. La, Z. Shan, and W. Sheng, "Developing a crack inspection robot for bridge maintenance," in IEEE International Conference on Robotics and Automation (ICRA), May 2011, pp. 6288–6293.

[16] P. Domingos, "A few useful things to know about machine learning," Communications of the ACM, vol. 55, no. 10, pp. 78–87, October 2012.

[17] T.M. Mitchell, "The discipline of machine learning," Carnegie Mellon University, School of Computer Science, Machine Learning Department, CMU-ML-06-108, July 2006.

[18] C.M. Bishop, Pattern Recognition and Machine Learning, vol. 1, Springer, New York, 2006.

[19] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification (2nd Edition), Wiley-Interscience, 2001.

[20] T.M. Mitchell, Machine Learning, McGraw-Hill Higher Education, 1997.

[21] T.M. Mitchell, "Does machine learning really work?," Artificial Intelligence Magazine, Association for the Advancement of Artificial Intelligence, vol. 18, no. 3, pp. 11–20, 1997.

[22] B.Y. Lee, Y.Y. Kim, S.T. Yi, and J.K. Kim, "Automated image processing technique for detecting and analysing concrete surface cracks," Structure and Infrastructure Engineering, vol. 9, no. 6, pp. 567–577, 2013.

[23] M. Jahanshahi, S. Masri, C. Padgett, and G. Sukhatme, "An innovative methodology for detection and quantification of cracks through incorporation of depth perception," Machine Vision and Applications, vol. 24, no. 2, pp. 227–241, 2013.

[24] Y. Fujita and Y. Hamamoto, "A robust automatic crack detection method from noisy concrete surfaces," Machine Vision and Applications, vol. 22, pp. 245–254, 2011.

[25] T.C. Hutchinson and Z. Chen, "Improved image analysis for evaluating concrete damage," Journal of Computing in Civil Engineering, vol. 20, no. 3, pp. 210–216, 2006.

[26] I. Abdel-Qader, O. Abudayyeh, and M. Kelly, "Analysis of edge-detection techniques for crack identification in bridges," Journal of Computing in Civil Engineering, vol. 17, no. 4, pp. 255–263, 2003.

[27] T. Lindeberg, "Scale selection properties of generalized scale-space interest point detectors," Journal of Mathematical Imaging and Vision, vol. 46, no. 2, pp. 177–210, 2013.

[28] K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615–1630, 2005.

[29] D.G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.

[30] S. Paris, S.W. Hasinoff, and J. Kautz, "Local Laplacian filters: edge-aware image processing with a Laplacian pyramid," ACM Transactions on Graphics, vol. 30, no. 4, p. 68, 2011.

[31] T. Stathaki, Image Fusion: Algorithms and Applications, Academic Press, 2011.

[32] P.J. Burt and E.H. Adelson, "The Laplacian pyramid as a compact image code," IEEE Transactions on Communications, vol. 31, no. 4, pp. 532–540, Apr. 1983.

[33] M.R. Jahanshahi and S.F. Masri, "Adaptive vision-based crack detection using 3D scene reconstruction for condition assessment of structures," Automation in Construction, vol. 22, pp. 567–576, 2012.

[34] M.R. Jahanshahi and S.F. Masri, "A new methodology for non-contact accurate crack width measurement through photogrammetry for automated structural safety evaluation," Smart Materials and Structures, vol. 22, no. 3, 2013.

[35] M.A. Fischler and R.C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, June 1981.

[36] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I.H. Witten, "The WEKA data mining software: An update," SIGKDD Explorations, 2009.

[37] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2001, vol. 1, pp. 511–518.

[38] R. Szeliski, "Image alignment and stitching: a tutorial," Foundations and Trends in Computer Graphics and Vision, vol. 2, no. 1, pp. 1–104, Jan. 2006.

[39] D.G. Lowe, "Object recognition from local scale-invariant features," in Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999, vol. 2, pp. 1150–1157.

[40] B. Triggs, P. McLauchlan, R. Hartley, and A. Fitzgibbon, "Bundle adjustment: a modern synthesis," in Vision Algorithms: Theory and Practice, LNCS, Springer Verlag, 2000, pp. 298–375.

[41] C. Wu, S. Agarwal, B. Curless, and S.M. Seitz, "Multicore bundle adjustment," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 3057–3064.

[42] S. Agarwal, N. Snavely, S.M. Seitz, and R. Szeliski, "Bundle adjustment in the large," in Computer Vision–ECCV 2010, Springer, 2010, pp. 29–42.

[43] R. Garg and S.M. Seitz, "Dynamic mosaics," in Second International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT), IEEE, 2012, pp. 65–72.

[44] D.W. Marquardt, "An algorithm for least-squares estimation of nonlinear parameters," Journal of the Society for Industrial and Applied Mathematics, vol. 11, no. 2, 1963.