AUTOMATIC PAVEMENT CRACK DETECTION AND CLASSIFICATION SYSTEM

Pedro Rosa

Instituto de Telecomunicações - Instituto Superior Técnico

Av. Rovisco Pais, 1, Lisbon, Portugal

E-mail: [email protected]

ABSTRACT

The identification of cracks on pavement surfaces is crucial for good road maintenance, which in turn supports higher quality transportation services and, subsequently, the population’s development. This thesis proposes an automatic crack detection and classification system capable of identifying and retrieving pavement surface images containing cracks from a road pavement survey image database. To this end, learning algorithms based on boosting, a general method for improving the accuracy of learning algorithms, are used. Crack regions are detected in the acquired images using carefully chosen statistical features and boosting techniques. Crack detection results are evaluated by comparison with a human labelling of the test images, obtained using a graphical user interface (GUI) that divides each image into a set of blocks, each of which is labelled as containing a crack or not. Next, a classification step distinguishes cracks according to their type: longitudinal, transversal, diagonal or miscellaneous. The experimental results, obtained using images from Portuguese roads, are encouraging for the development of automatic pavement crack detection systems.

Index Terms - road pavement cracks, pattern recognition,

boosting techniques, automated pavement crack analysis,

crack detection, crack classification.

1. INTRODUCTION

Roads are crucial to a population’s development, enabling the trading of goods in markets and the easy mobility of people. Within a community, improvements in the quality of transportation services raise the quality of all dependent services, such as access to jobs, health, education and resources, reducing poverty through a better quality of life. Because of this, the economic development of countries is highly dependent on the quality of road transportation services. To guarantee that quality, road maintenance must be assured, since a poor maintenance policy would degrade the service quality or even cut off access to some remote areas.

The parameters of this policy must be well specified in order to implement a satisfactory maintenance management information system, capable of dealing with several types of data such as pavement condition, climate, and traffic volumes and loads, among others. The pavement data is typically obtained from a careful visual inspection. Maintenance policies vary among the companies that apply them, because multiple techniques and types of equipment can be used to collect the data needed for an adequate structural and functional quality evaluation.

In this paper, distress detection and classification

methodologies are proposed, using digital image

processing techniques. In the Portuguese case, the distress

classification is based on the “Portuguese Distress Catalog

for Flexible Pavements” [1], produced by the official

Portuguese agency for management, planning,

development and execution of the road infrastructure

policy guidelines.

Few papers are available on the subject of automatic pavement crack detection and classification. The literature reviewed refers to detection and classification algorithms based on artificial life [22], which excludes non-crack regions in order to detect pavement cracks; on Markov Random Fields [23], [24], based on the construction of an irregular lattice derived from the original image; or on skeleton analysis [25]. Another important method, initially proposed for face detection and recognition, is a visual object detection algorithm [20] that achieves fast image processing and a high detection rate by using an AdaBoost algorithm.

The data collection is supported by video or photographic equipment, and the data can afterwards be further processed in the office.

Figure 1 shows an example of a vehicle used for

distress image collection.

Figure 1 - Example of a car with an image collection system.

The most common visual inspection methodology consists of a human operator traveling in a specially equipped car, in which the data is collected on a personal computer. This data contains pavement images and their geographic location. The operator typically has a dedicated keyboard on which each button is used to indicate the presence of a given distress class. An example of a commercial system of this type, used by some road management companies, is DESY [2].

When a system like the one shown in Figure 1 is used, the cameras installed on top of the car produce a video scan of the pavement surface that is later evaluated in the office by a human operator. This process always involves some degree of subjective judgment, in addition to the cost of having an expert perform the task manually. To reduce both the subjectivity and the required analysis time, the human operator can be assisted by a computerized analysis system that automatically evaluates the video and provides an initial set of results, which the operator then validates.

The objectives of this paper are the development and evaluation of an automatic pavement crack detection and classification system. Automatic learning methods are used, capable of learning the statistical image features that capture texture variations of the background and of the cracks. The features studied are the dynamic range, the standard deviation and the minimum pixel intensity value, and the learning method is trained using a set of crack images. The image database used was collected on two different Portuguese roads. The automatic learning system uses boosting techniques, nowadays considered standard in solving pattern recognition problems, with the purpose of increasing processing performance and accuracy. The implemented classification system mostly follows the one in the Portuguese Distress Catalog [1], which groups pavement distresses into five main classes, notably cracking, break up and loss of materials, movement of materials and deformations and repairs.

This paper is structured as follows. After the

introduction, Section 2 presents the system architecture.

Section 3 describes some of the choices made during the

development phase including the features choice and the

algorithm used by AdaBoost. In Section 4 the overall

results provided by the proposed solution are presented

and the system performance is evaluated. Conclusions

and future work are discussed in Section 5.

2. SYSTEM ARCHITECTURE

This section explains the methodology used to develop an automatic road pavement crack detection and classification system. The system architecture and the algorithms used to implement it are described.

The proposed automatic crack detection and classification solution employs techniques from the pattern recognition and image processing areas of research. Automatic learning methods are used, capable of learning the image background (i.e., road pavement) and crack features by means of selected training examples. Figure 2 shows the overall system architecture, which is divided into three main steps: Training, Detection and Classification. These are explained in detail in the next subsections.

Figure 2 – System Architecture.

2.1. Training Step

In this section an explanation of the algorithms

used to implement the classifier training is given. This step

is divided into three different tasks:

1. Human labeling is carried out using the Distress

Selection Program.

2. Statistical feature extraction.

3. Classifier training produced by the AdaBoost

algorithm using the statistical features.

A detailed diagram illustrating the implementation

of this part of the proposed algorithm is shown in Figure 3.

Figure 3 - Diagram describing the Training step.

First, a program developed by Henrique Oliveira [15] helps the user to label the blocks in the image. This program divides an image into non-overlapping blocks of pixels, with a size chosen by the user, so that the blocks containing cracks can be marked with the help of a graphical user interface (GUI). If the user chooses a large block size, the results of the classification step may differ from the expected ones, as the crack would correspond to only a small portion of the selected pixels. Smaller block sizes allow the algorithm to identify the image areas containing cracks more accurately and to characterize them better. Nevertheless, if the block size is too small, noise may become relevant and decrease the algorithm’s accuracy. The choice of block size is therefore very important for the algorithm’s accuracy. An example of the GUI of this application can be seen in Figure 4. In this thesis, both training and testing use blocks of 64x64 pixels. This block size was chosen because it leads to good accuracy for the classification algorithm, allowing crack areas to be distinguished from non-crack areas. A block is labeled ‘1’ if the user indicates that it contains a crack and ‘-1’ otherwise.

Figure 4 - GUI of the distress selection program, used to label each

image block as either containing a crack or not.

Next, the statistical feature extraction module

extracts selected features for each image block. Three

different features are studied in this Thesis: dynamic

range, standard deviation and minimum intensity of the

block’s pixel values. Before the algorithm proceeds with

the feature extraction, the images are converted to gray

scale, with values ranging from 0 to 255. The 0 value

corresponds to black pixels and 255 to white pixels. The

dynamic range is calculated according to

$$ DR = I_{\max} - I_{\min} \qquad (1) $$

where $DR$ is the dynamic range of each block, and $I_{\max}$ and $I_{\min}$ are the maximum and the minimum intensities of the pixels in the block, respectively. The calculation of the standard deviation and of the minimum pixel intensity value for each block is straightforward.
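The three block features can be sketched in a few lines of NumPy. This is a minimal illustration, not the thesis implementation: the function name `block_features` is hypothetical, the input is assumed to be an 8-bit gray scale image stored as a 2D array, and any rows or columns that do not fill a whole 64x64 block are assumed to be discarded.

```python
import numpy as np

def block_features(gray, block=64):
    """Return per-block (dynamic range, standard deviation, minimum intensity)."""
    rows, cols = gray.shape[0] // block, gray.shape[1] // block
    # Crop to a whole number of blocks, then flatten each block's pixels.
    tiles = (gray[:rows * block, :cols * block]
             .reshape(rows, block, cols, block)
             .swapaxes(1, 2)
             .reshape(rows, cols, -1)
             .astype(np.float64))      # float avoids uint8 wrap-around
    i_max, i_min = tiles.max(axis=2), tiles.min(axis=2)
    return i_max - i_min, tiles.std(axis=2), i_min
```

Each returned array has one entry per block, so the values for the four feature configurations can be taken directly from these three arrays.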

Subsequently, the data samples are stored into

matrices containing both the statistical features

information and the labelling performed by the user, which

are concatenated and used as an input for training the

classifier using the AdaBoost algorithm. The application is

trained and then the classifiers and the weights are saved

in a file to be used in the image detection step. The

AdaBoost algorithm is explained in the next subsection.

In order to study the influence of the three features

listed above, four algorithm configurations were tested:

1. Dynamic range.

2. Dynamic range + Standard deviation.

3. Minimum intensity pixel value.

4. Minimum intensity pixel value + Standard

deviation.

2.1.1. Boosting Classifier

In this Thesis, a detection algorithm based on

boosting techniques is used. AdaBoost is an algorithm

introduced in 1996 by Freund and Schapire [16] that

solved many of the practical difficulties of the earlier

boosting algorithms. Its name came from the expression

Adaptive Boosting.

The algorithm iteratively fits a classification function to a training set. In each iteration it chooses the weak classification function with the lowest error. The term “Adaptive” comes from the weight update, which increases the weight of the badly classified elements for the next iteration. In this way, subsequent classifiers focus on the training elements that were poorly classified in the previous rounds. Boosting classifiers thus base their decision on a group of simple classifiers called weak classifiers. The name comes from the weak performance they have if used alone, since a single such classifier is unlikely to classify the whole training set correctly. The weak classifier $h_j(x)$ is defined by a parity $p_j$, a feature $f_j$ and a threshold $\theta_j$, where $x$, the input, is the image or image block to be classified, as the next equation shows,

$$ h_j(x) = \begin{cases} 1 & \text{if } p_j f_j(x) < p_j \theta_j \\ -1 & \text{otherwise} \end{cases} \qquad (2) $$

In summary, weights are first assigned to every training element. The boosting algorithm then constructs the final classifier iteratively, selecting in each iteration the weak classifier with the lowest classification error, as shown in

$$ F(x) = \sum_{t=1}^{T} \alpha_t h_t(x) \qquad (3) $$

in which the “strong” classifier is a linear combination of simple “weak” classifiers $h_t(x)$, where $\alpha_t$ is the parameter that minimizes the error. In every iteration, each weak classifier chooses the threshold that gives the lowest possible error.

After each iteration a weight is given to every training pattern. The weight of the misclassified patterns is increased and the weight of the correctly classified patterns is reduced. In the next iteration, a weak classifier is chosen that concentrates on correctly classifying the elements with greater weight (those misclassified in the previous iterations). Each selected weak classifier $j$ is given a weight $\alpha_j$. Thus, after T iterations, a classifier with a maximum of T weak classifiers is obtained. T is a maximum because the same weak classifier can be chosen more than once with different thresholds.
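The iterative procedure described above can be sketched as follows. This is a minimal illustration of plain discrete AdaBoost with threshold ("stump") weak classifiers, not the Modest, Gentle or Real AdaBoost variants evaluated in this work. The function names, the exhaustive threshold search and the numerical clipping of the error are assumptions made for the example. Labels are assumed to be +1 (crack) and -1 (non-crack), and `X` is a 2D array with one row per block and one column per feature.

```python
import numpy as np

def best_stump(X, y, w):
    """Search every feature, threshold and parity for the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)  # (error, feature index, threshold, parity)
    for j in range(X.shape[1]):
        for theta in np.unique(X[:, j]):
            for p in (1, -1):
                pred = np.where(p * X[:, j] < p * theta, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, theta, p)
    return best

def train_adaboost(X, y, rounds=10):
    """Discrete AdaBoost; returns a list of (alpha, feature, threshold, parity)."""
    w = np.full(len(y), 1.0 / len(y))          # uniform initial weights
    model = []
    for _ in range(rounds):
        err, j, theta, p = best_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # weight of this weak classifier
        pred = np.where(p * X[:, j] < p * theta, 1, -1)
        w *= np.exp(-alpha * y * pred)         # raise weights of mistakes
        w /= w.sum()
        model.append((alpha, j, theta, p))
    return model

def predict(model, X):
    """Sign of the weighted vote of all selected weak classifiers."""
    score = sum(a * np.where(p * X[:, j] < p * t, 1, -1) for a, j, t, p in model)
    return np.where(score >= 0, 1, -1)
```

With a single feature such as the minimum pixel intensity, the same feature is necessarily selected in every round, possibly with different thresholds, which is the situation referred to above.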

In this thesis, three different variants of the AdaBoost algorithm are initially used: “Modest AdaBoost”, “Gentle AdaBoost” and “Real AdaBoost”, which are similar [17]. The only difference is the value of $\alpha_t$, which is related to the error. Intuitively, $\alpha_t$ measures the importance that is assigned to $h_t$. The process of selecting $\alpha_t$ and $h_t(x)$ can be interpreted as a single optimization step minimising the upper bound on the empirical error. Improvement of the bound is guaranteed, provided that $\epsilon_t < 1/2$, where $\epsilon_t$ denotes the weighted error of $h_t$.

A threshold is introduced before the output of the final hypothesis, given by

(4)

A threshold is defined for this function and subtracted from the value of (4). Normally, the threshold value is 0.5. This threshold is used to increase the algorithm’s accuracy by decreasing the “false positive” detection rate.

2.2. Detection Step

In this section an explanation of the algorithms

used to implement the Detection step is given. A detailed

diagram of the Detection step can be seen in Figure 5.

Figure 5 - Diagram that describes the Detection phase.

This step starts by loading two files. The first contains all the information regarding the classifier, i.e., the “weak learners” and their respective weights, previously calculated in the classifier training step. The second contains the human labeling information for the test image, previously saved using the GUI. This file contains the coordinates of the crack blocks in the images chosen by the user for testing.

Next, in the statistical extraction module, the

statistical features are calculated using the information

regarding the coordinates of the crack blocks. The method

of how the algorithm extracts the features was explained in

the previous subsection and therefore will not be

presented.

Subsequently, the output of the statistical features

extraction together with the classifier information (“weak

learners” and respective weights) is loaded into the

AdaBoost algorithm in order to proceed to the detection

results. An example of the detection made by AdaBoost

with the indication of the “false positive” and the “false

negative” blocks is shown in Figure 6.

Figure 6 - Example of the detection result using the AdaBoost algorithm

- white blocks correspond to the “false positives”, blue blocks to the

“false negative” and the union of the red and blue blocks corresponds to

the ideal detection.

2.3. Classification Step

In this section the algorithms used to perform the

Classification step are described. The two main tasks in

this phase are the elimination of the image crack blocks

that have no 8-neighbouring blocks containing cracks; and

the classification of the resulting cracks as Longitudinal,

Transversal, Miscellaneous, Top Diagonal or Down

Diagonal. A detailed diagram describing the Classification

phase is shown in the block diagram of Figure 7.

Figure 7 - Diagram that describes the Classification phase.

The algorithm uses the Detection step output to find the blocks containing cracks. It starts by eliminating isolated crack blocks. This step mainly removes “false positive” blocks, which often correspond to oil stains. However, it may occasionally also remove blocks that were correctly detected but happen to be isolated. This situation affects only a small fraction of the total number of crack blocks and is therefore not considered a problem.
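The elimination of isolated blocks can be sketched as below. This is a minimal illustration under the assumption that the detection output is available as a 2D array with one entry per block, non-zero where a crack was detected; the function name `remove_isolated` is hypothetical.

```python
import numpy as np

def remove_isolated(mask):
    """Clear crack blocks with no crack block among their 8 neighbours."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1).astype(int)   # zero border simplifies edge handling
    h, w = mask.shape
    # Sum the 3x3 neighbourhood around every block, then subtract the centre.
    neighbours = sum(padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)) - mask
    return mask & (neighbours > 0)
```

Blocks on the image border are handled by the zero padding, so a border block only survives if one of its existing neighbours is a crack block.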

After that, the algorithm proceeds to the crack

classification using a pattern classification system

exploiting a 2D feature space [15]. The 2D feature space

is composed of the standard deviations of the column and

row coordinates of the detected crack regions, i.e., image

windows labeled with ‘1’ in the detection step results. An

example of the 2D feature space is shown in Figure 8.

Figure 8 - 2D feature space used for crack classification. P1 represents a single crack, which is classified as a transversal crack.

The bisectrix divides the 2D feature space into two areas, and the points that lie on the bisectrix are associated with perfect miscellaneous cracks, as the standard deviations for rows and columns are equal. Points lying on the horizontal and vertical axes correspond to perfect transversal and longitudinal cracks, respectively. Note that this method distinguishes only three types of cracks: Longitudinal, Transversal and Miscellaneous. To distinguish Diagonal cracks from Miscellaneous ones another method is used, as explained below. Diagonal cracks are important to detect because they can be dangerous for passing vehicles: if the crack is severe, the vehicle’s wheels can follow the crack, turning a simple crossing into a severe accident.

According to Figure 8, the crack classification is performed by computing two distances, $d_1$ and $d_2$, where $d_1$ corresponds to the distance from the point P1 to the bisectrix and $d_2$ is the distance to the nearest axis (horizontal or vertical). In Figure 8, point P1 represents a transversal crack. The classification rules are:

• Longitudinal cracks have $d_2 < d_1$ and the nearest axis is the horizontal one;

• Transversal cracks have $d_2 < d_1$ and the nearest axis is the vertical one;

• Miscellaneous cracks have $d_1 < d_2$ independently of the nearest axis.
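The two distances used by these rules can be sketched as follows. This is a minimal illustration, not the implementation of [15]: the function names are hypothetical, the standard deviation of the column coordinates is assumed to be plotted on the horizontal axis and that of the row coordinates on the vertical axis, and ties are arbitrarily assigned to the bisectrix. The sketch only reports which reference line is nearest; mapping that line to a crack type follows the rules above.

```python
import math
import numpy as np

def feature_point(mask):
    """Standard deviations of the column and row coordinates of crack blocks."""
    rows, cols = np.nonzero(mask)
    return float(np.std(cols)), float(np.std(rows))

def nearest_region(sx, sy):
    """Compare the distance to the bisectrix y = x with the distance to the nearest axis."""
    d_bisectrix = abs(sx - sy) / math.sqrt(2)
    d_axis = min(sx, sy)                      # distance to the closer axis
    if d_bisectrix <= d_axis:
        return "bisectrix"
    return "horizontal axis" if sy < sx else "vertical axis"
```

For a point $(s_x, s_y)$ in the first quadrant, the distance to the horizontal axis is $s_y$ and the distance to the vertical axis is $s_x$, which is why the smaller coordinate gives the distance to the nearest axis.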

To distinguish between Miscellaneous and Top or Down Diagonal cracks, the detected crack blocks are examined only in predefined locations, as shown in Figure 9. For these cases a threshold is defined and the rules are:

• Classified as Down Diagonal if the number of

detected crack blocks in the top left and down

right areas is higher than 8 and if the number of

detected crack blocks in the top right and down

left area is less than 8.

• Classified as Top Diagonal if the number of

detected crack blocks in the top right and down

left areas is higher than 8 and if the number of

detected crack blocks in the top left and down

right area is less than 8.

• Classified as Miscellaneous if any detected crack

block exists in any selected areas at the same time.

Note that this classification is applied after the previous one and, for that reason, it does not conflict with the Longitudinal and Transversal classifications.
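The threshold rule can be sketched as below. This is a minimal illustration under two assumptions that are not confirmed by the text: the predefined areas are taken to be the four image quadrants, and every case that matches neither diagonal rule is treated as Miscellaneous. The function name `diagonal_type` is hypothetical; the threshold of 8 blocks is the value given above.

```python
import numpy as np

def diagonal_type(mask, threshold=8):
    """Count crack blocks per image quadrant and apply the threshold rule."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    top, bottom = mask[:h // 2], mask[h // 2:]
    tl, tr = top[:, :w // 2].sum(), top[:, w // 2:].sum()
    bl, br = bottom[:, :w // 2].sum(), bottom[:, w // 2:].sum()
    main, anti = tl + br, tr + bl   # top-left/down-right vs top-right/down-left
    if main > threshold and anti < threshold:
        return "down diagonal"
    if anti > threshold and main < threshold:
        return "top diagonal"
    return "miscellaneous"
```

The function is meant to be applied only to cracks already labeled Miscellaneous by the 2D feature space rules, in line with the order of the two classification stages.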

The threshold choice was made after several

experiments, with all types of cracks, to choose which

threshold distinguishes best Diagonal from Miscellaneous

cracks.

Figure 9 - Example of the distinction between a Diagonal crack and a Transversal crack.

3. PROPOSED CRACK DETECTION METHOD

One of the most important decisions in this thesis was choosing the features that give the best results and, consequently, the lowest error. After an exhaustive experimental analysis, three features led to the best results: dynamic range, standard deviation and minimum pixel intensity value. In addition, various boosting algorithms can be used in this particular case. To achieve the best detection rates, three boosting algorithms were tested: “Gentle AdaBoost”, “Real AdaBoost” and “Modest AdaBoost”.

Thousands of images can be used in the training and detection steps, and defining the image set is fundamental to evaluating the algorithm’s performance. The best images to use in the classifier training step are those with well-delimited cracks, i.e., wide cracks mainly made up of dark pixels.

3.1. Selection of Features for AdaBoost

The AdaBoost algorithm is a method for combining a set of weak classifiers to make a strong classifier. Typically, a large set of features is specified in advance and the algorithm selects which ones to use and how to combine them. In this case only one or two features are used as input to the algorithm, so the choice of features is critical to the success of the application. The set of features used for face detection by Viola and Jones [23] consists of a subset of Haar basis functions. This type of feature is mainly used for heterogeneous environments, i.e., images with a large texture variation. The experimental work in this paper uses images with low texture variation, converted to gray scale. For that reason, three different features are studied: dynamic range, standard deviation and minimum pixel intensity value.

The dynamic range is considered because crack regions consist mainly of black (or at least dark gray) pixels, in contrast to the rest of the pavement, which is composed of pixels of various gray levels. Since black pixels have the value 0 and white pixels the value 255, the dynamic range of blocks containing cracks differs from that of blocks showing the rest of the pavement. The experiments described in the next subsection show, however, that this feature alone is not enough to characterize the crack areas.

The standard deviation is important for expressing the slight variations between blocks with cracks and blocks without them. It is assumed that the standard deviation should be larger for crack blocks than for non-crack blocks, because a significant number of darker pixels only appears in blocks containing cracks. Because this feature only expresses slight variations in the image texture, it must be used together with one of the other two features, dynamic range or minimum pixel intensity value.

Finally, the last feature studied is the minimum pixel intensity value within a block which, despite being the simplest feature, produces good results in terms of detection error with very low processing time. This feature is simply the minimum pixel intensity of the image block, which is expected to correspond to the darkest pixels, typical of cracks. Using this feature, the algorithm can distinguish crack from non-crack blocks and establish a recognition pattern. Some experiments were carried out to substantiate the choices previously mentioned. The algorithm is thus tested with four different sets of features:

1. Dynamic Range.

2. Dynamic Range + Standard Deviation.

3. Minimum intensity pixel value.

4. Minimum intensity pixel value + Standard

Deviation.

To select the features leading to the best

performance in crack detection, the number of false

negatives and false positives detected by the algorithm are

evaluated. The best performing combinations of features

were:

1. Minimum intensity pixel value.

2. Minimum intensity pixel value + standard

deviation.

At this stage the two other possible feature sets are

discarded, as not having the desired performance.

Another conclusion is that for classifier training it

is important to choose images that have well defined

cracks, as this directly influences the Detection algorithm

performance.

3.2. Selection of the AdaBoost Algorithm

In this section, 3 different boosting schemes (“Real

AdaBoost”, “Gentle AdaBoost” and “Modest AdaBoost”)

are evaluated, to choose the one producing the best results

for this problem. “Real AdaBoost” (see [20] for full

description) is the generalization of a basic AdaBoost

algorithm, first introduced by Freund and Schapire [18],

and should be treated as a basic “hardcore” boosting

algorithm. “Gentle AdaBoost” is a more robust and stable

version of “Real AdaBoost” (see [19] for full description).

So far, it has been the most practically efficient boosting

algorithm, used, for example, in Viola-Jones object

detector [21]. “Gentle AdaBoost” performs slightly better

than “Real AdaBoost” on regular data, but is considerably

better on noisy data, and much more resistant to outliers

[17]. “Modest AdaBoost” is mostly aimed at better generalization capability and resistance to overfitting. In terms of test error and overfitting, this algorithm outperforms both “Real” and “Gentle AdaBoost”. To choose which one best fits the data studied in this paper, experimental results are shown for one of the two cases previously referenced, the “worst” case.

Figure 10 compares the performance of the “Real AdaBoost”, “Modest AdaBoost” and “Gentle AdaBoost” algorithms. As can be seen, “Modest AdaBoost” performs better than “Gentle AdaBoost” on this subset. The difference between the two is about 1 percent, which may appear to be a low value but can be very significant in this type of application. In fact, an error of 1 percent means that on average 10 blocks are misclassified; since a crack can occupy roughly 20 blocks, the consequences for crack detection can be catastrophic if the misclassified blocks are crack blocks. The experiments reported in [22] and the ones performed in this paper conclude that “Modest AdaBoost” outperforms both “Real AdaBoost” and “Gentle AdaBoost”, which is assumed to be one of the best boosting algorithms used in practice.

Figure 10 - Gentle AdaBoost, Modest AdaBoost and Real AdaBoost

performance on test subset in the “worst” case.
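For readers unfamiliar with the boosting loop being compared, the sketch below shows a minimal “Gentle AdaBoost” on a single feature, following the usual formulation in which each round fits a weighted least-squares regression stump and multiplies each sample weight by exp(-y f(x)). It is an illustrative sketch only: it is not the toolbox implementation used for the experiments, “Modest AdaBoost” is not shown, and the function names and toy data (a hypothetical minimum-intensity value per block, with +1 for crack blocks) are assumptions.

```python
import math


def weighted_mean(pairs):
    """Weighted mean of labels; `pairs` holds (label, weight) tuples."""
    total = sum(w for _, w in pairs)
    return sum(y * w for y, w in pairs) / total if total > 0 else 0.0


def fit_stump(xs, ys, ws):
    """Weighted least-squares regression stump on a 1-D feature.

    Returns (threshold, left_value, right_value); the stump outputs
    left_value when x <= threshold and right_value otherwise.
    """
    best = None
    for t in sorted(set(xs)):
        left = [(y, w) for x, y, w in zip(xs, ys, ws) if x <= t]
        right = [(y, w) for x, y, w in zip(xs, ys, ws) if x > t]
        a, b = weighted_mean(left), weighted_mean(right)
        err = sum(w * (y - (a if x <= t else b)) ** 2
                  for x, y, w in zip(xs, ys, ws))
        if best is None or err < best[0]:
            best = (err, t, a, b)
    return best[1], best[2], best[3]


def gentle_adaboost(xs, ys, rounds):
    """Train an additive model of regression stumps (labels are +1/-1)."""
    n = len(xs)
    ws = [1.0 / n] * n
    stumps = []
    for _ in range(rounds):
        t, a, b = fit_stump(xs, ys, ws)
        stumps.append((t, a, b))
        # Gentle AdaBoost update: w <- w * exp(-y * f(x)), then renormalise.
        ws = [w * math.exp(-y * (a if x <= t else b))
              for x, y, w in zip(xs, ys, ws)]
        total = sum(ws)
        ws = [w / total for w in ws]
    return stumps


def predict(stumps, x):
    """Sign of the summed stump outputs."""
    score = sum(a if x <= t else b for t, a, b in stumps)
    return 1 if score >= 0 else -1


# Hypothetical toy data: dark blocks (low minimum intensity) are cracks.
xs = [30, 45, 55, 60, 150, 160, 170, 190]
ys = [1, 1, 1, 1, -1, -1, -1, -1]
model = gentle_adaboost(xs, ys, rounds=5)
```

Because each stump output is a bounded weighted mean of the labels rather than an unbounded log-ratio, the weight updates stay moderate, which is the usual explanation for the better behaviour of this scheme on noisy data.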

4. PERFORMANCE EVALUATION

This section starts by presenting a set of metrics used to evaluate the performance of road pavement crack detection algorithms. Two sets of test images, captured on two different roads, are tested, and the overall results of the boosting classifiers, trained with images from the two sets, are presented. Variables such as the percentage of images used for training, the decision threshold and the type of features used are tested, in order to evaluate the differences produced in the final crack detection and classification results. Finally, a set of images with and without cracks is used to evaluate the crack detection performance.

4.1. Detection Rate and False Detection Rate

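For the performance evaluation of the crack detection results, two types of metrics are used for each classifier: precision (5) and recall (6),

$$\text{precision} = \frac{\text{number of correct detections}}{\text{total number of detections}} \qquad (5)$$

$$\text{recall} = \frac{\text{number of correct detections}}{\text{number of relevant detections}} \qquad (6)$$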

The precision metric relates the correct detections to the total number of detections obtained; as a complement, the recall metric relates the number of correct detections to the number of relevant detections. In order to proceed with this evaluation, two indicators are calculated: the TDR (True Detection Rate) and the FDR (False Detection Rate). TDR is a recall metric, while FDR is the complement of precision. In the context of this paper, the TDR (7) and FDR (8) indicators are calculated according to,

$$TDR = \frac{TD}{GT} \qquad (7)$$

$$FDR = \frac{FD}{TD + FD} \qquad (8)$$

with GT (Ground Truth) being the total number of crack blocks present in the image, TD (True Detection) the number of crack blocks correctly detected and FD (False Detection) the number of blocks falsely detected as crack blocks.

These metrics provide a simple way to evaluate the detection performance of the algorithm. To make the evaluation easier to interpret, another quality indicator (Q) [29] is computed, integrating the true and false detection results into a single metric. The quality indicator is given by:

$$Q = TDR - FDR \qquad (9)$$

The objective is for the classifier to achieve a high true detection rate and a low false detection rate; the difference between these two measures should therefore be as high as possible, i.e., the classifier should achieve the highest possible value of Q.
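The indicators of this section can be sketched in a few lines. The snippet assumes TDR = TD/GT, FDR = FD/(TD + FD) and Q = TDR - FDR, which is consistent with the values reported in the result tables; the function names and the block counts in the example are hypothetical, chosen only because they reproduce the rounded values of one table row (the paper does not list per-image counts).

```python
def count_blocks(truth, predicted):
    """Count GT, TD and FD from per-block boolean labels."""
    gt = sum(truth)
    td = sum(1 for t, p in zip(truth, predicted) if t and p)
    fd = sum(1 for t, p in zip(truth, predicted) if p and not t)
    return gt, td, fd


def detection_metrics(gt, td, fd):
    """Return (TDR, FDR, Q) for one image.

    Empty denominators yield 0.0 so that images without cracks or
    without detections do not raise an error (an assumption made here).
    """
    tdr = td / gt if gt else 0.0
    fdr = fd / (td + fd) if (td + fd) else 0.0
    return tdr, fdr, tdr - fdr


# Hypothetical counts: 33 crack blocks, 32 found, 2 false detections.
tdr, fdr, q = detection_metrics(gt=33, td=32, fd=2)
print(f"TDR={tdr:.4f} FDR={fdr:.4f} Q={q:.4f}")
```

With these counts the rounded values are TDR = 0.9697, FDR = 0.0588 and Q = 0.9109.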

4.2. Test Results for “Road 1”

In this section the results for crack block detection and classification are presented. Training of the boosting classifiers is run for 100 iterations on each of the two sets of images, collected from two different roads. This value was chosen because the algorithm converges to a minimum error within 100 iterations.

The detection and classification performance is then evaluated by applying the selected classifiers to the respective test images. Only the best performance results are presented here.

Table 1 shows the average detection quality and

classification results for “road 1”, using the minimum

intensity pixel value feature, no threshold and 25% of

images for training.

Table 1 - Average detection quality and classification results for “road

1”, using the minimum intensity pixel value feature, no threshold and

25% of images for training.

Image Number TDR FDR Q Classification

5 1.0000 0.1842 0.8158 Transversal

6 1.0000 0.1579 0.8421 Down Diagonal

7 0.8667 0.1613 0.7054 Transversal

8 1.0000 0.3429 0.6571 Longitudinal

9 1.0000 0.2162 0.7838 Longitudinal

10 0.9697 0.0588 0.9109 Longitudinal

11 1.0000 0.3590 0.6410 Longitudinal

12 1.0000 0.2121 0.7879 Longitudinal

13 0.9333 0.4286 0.5047 Longitudinal

14 0.8286 0.3556 0.4730 Longitudinal

15 0.9000 0.2703 0.6297 Longitudinal

Results[%] 95.44 24.97 70.47 100.00

For this case, the average TDR is 95.44%, which can be considered good detection performance. As stated before, of the three indicators shown in the table the most important is TDR, because for road maintenance it is important to detect every pavement crack. The rate of “false positive” block detections (FDR) is, in this case, 24.97%, which corresponds to an average of 8 blocks out of the 700 existing in the images. The crack type classification precision was 100%, which is considered very good.
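The “Results[%]” row of Table 1 can be reproduced as the plain average of the per-image values, as the short check below shows. The TDR and FDR values are copied from Table 1; the variable names are illustrative, and Q is assumed to be the difference TDR - FDR.

```python
# (image number, TDR, FDR) copied from Table 1.
table1 = [
    (5, 1.0000, 0.1842), (6, 1.0000, 0.1579), (7, 0.8667, 0.1613),
    (8, 1.0000, 0.3429), (9, 1.0000, 0.2162), (10, 0.9697, 0.0588),
    (11, 1.0000, 0.3590), (12, 1.0000, 0.2121), (13, 0.9333, 0.4286),
    (14, 0.8286, 0.3556), (15, 0.9000, 0.2703),
]

mean_tdr = sum(row[1] for row in table1) / len(table1)
mean_fdr = sum(row[2] for row in table1) / len(table1)
mean_q = mean_tdr - mean_fdr  # averaging is linear, so mean(Q) = mean(TDR) - mean(FDR)

print(f"TDR={100 * mean_tdr:.2f}% FDR={100 * mean_fdr:.2f}% Q={100 * mean_q:.2f}%")
```

The averages are per image, so an image with few crack blocks weighs as much as one with many.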

Next, Table 2 shows the detection and

classification results of the boosting classifiers using the

minimum intensity pixel and the standard deviation

features. In this case a 0.5 value threshold is used and 50%

of the images are used for training.

Table 2 - Average detection quality and classification results for “road

1”, using the minimum intensity pixel value and standard deviation

features, no threshold and 50% of images for training.

Image Number TDR FDR Q Classification

8 1.0000 0.2778 0.7222 Longitudinal

9 0.9697 0.1351 0.8346 Longitudinal

10 0.9697 0.0588 0.9109 Longitudinal

11 1.0000 0.2821 0.7179 Longitudinal

12 1.0000 0.2121 0.7879 Longitudinal

13 0.9333 0.4286 0.5048 Longitudinal

14 0.8286 0.3556 0.4730 Longitudinal

15 0.9000 0.2703 0.6297 Longitudinal

Results[%] 95.02 25.26 69.76 100.00

In this case the average TDR is 95.02%, marginally lower than in the previous case (95.44%). The classification results are again good: all the test images are classified correctly (100%). Overall the results are similar to those of the previous case, since the FDR and Q values are also similar.

According to these two cases, the best results are obtained using the minimum intensity pixel feature alone or together with the standard deviation.

4.3. Test Results for “Road 2”

“Road 2” images are harder to analyze, as the cracks are not so clearly distinguishable in terms of gray levels, even for a human observer. As in the previous section, only the best results are presented. Table 3 shows the average detection quality and classification results for the set of images corresponding to “road 2”, using the minimum intensity pixel feature, no threshold and 25% of the images for training.

Table 3 - Average detection quality and classification results for “road

2”, using the minimum intensity pixel value feature, no threshold and

25% of images for training.

Image Number TDR FDR Q Classification

4 0.8444 0.1461 0.6984 Miscellaneous

5 0.9737 0.1190 0.8546 Transversal

6 0.7667 0.3235 0.4431 Transversal

7 0.8929 0.2126 0.6803 Miscellaneous

8 0.5938 0.0500 0.5437 Miscellaneous

9 0.9091 0.3548 0.5543 Longitudinal

10 1.0000 0.2308 0.7692 Longitudinal

Results[%] 85.44 20.53 64.91 100.00

Table 3 above presents one of the best results, obtained using the minimum intensity pixel value feature, no threshold and 25% of the images for training. In this case the average TDR is 85.44%, which can be considered good detection performance. Observing the table, it can be seen that image 6 has the lowest quality indicator (Q = 0.4431). This occurs because it is a very “difficult” image, whose crack is not as clear as those in the other images. In terms of classification performance, the classifier classifies all cracks correctly (100%).

Table 4 shows one of the results obtained using the minimum intensity pixel value and standard deviation features, no threshold and 25% of the images for training. The average TDR is now 84.82%, which is lower than the one observed for “road 1” but can still be considered good detection performance. The classification accuracy in this case is 85.71%, with the only misclassification being the crack in image 6.

Table 4 - Average detection quality and classification results for “road

2”, using the minimum intensity pixel value and standard deviation

features, no threshold and 25% of images for training.

Image Number TDR FDR Q Classification

4 0.9222 0.1443 0.7779 Miscellaneous

5 0.9737 0.1190 0.8546 Transversal

6 0.6333 0.3667 0.2667 Miscellaneous

7 0.8482 0.2213 0.6269 Miscellaneous

8 0.6510 0.0530 0.5980 Miscellaneous

9 0.9091 0.4286 0.4805 Longitudinal

10 1.0000 0.2308 0.7692 Longitudinal

Results[%] 84.82 22.34 62.48 85.71

4.4. Test Results Discussion

After the preceding presentation of results, in which parameters such as the threshold, the percentage of images used for training and the features used were studied, a final analysis is presented.

The first two parameters are not considered crucial for increasing the detector and classifier performance. The best results were achieved using either the minimum intensity pixel value feature alone or both the minimum intensity and standard deviation features. Both cases give similar TDR and classification rates.

Next, these two feature sets are tested on a test set composed of images with and without cracks, in order to conclude whether the algorithm can be used under real-environment conditions.

4.5. Test Results in a Real Environment

During a pavement crack analysis of a well

maintained road, most of the captured images should not

present any cracks - crack areas are expected to appear

only in a low percentage of the captured images. Due to

this fact it is crucial to test the algorithm in a real

environment, where images with and without cracks

appear in the test set.

In this section tests are performed with a test set consisting of 21 non-crack images and 15 crack images. Note that oil stains are present in some of the non-crack images, which can lead to misclassifications by the classifier. The objective here is to see whether the detection algorithm can accomplish a good detection of the crack images. Table 5 and Table 6 show the corresponding detection and classification results, obtained with no threshold and 25% of the images used for training (corresponding to the first 4 images of the set), using both the minimum intensity pixel and the standard deviation features (Table 5) or only the minimum intensity pixel feature (Table 6).

Table 5 and Table 6 show satisfactory crack detection results: all images, with or without cracks, were correctly identified, and the crack type classification for crack images was always correct except for image 32 in both cases (35 of the 36 test images, i.e. 97.22%). This is because image 32 has blocks of pixels with dark regions near the crack, so the number of false positives is high (FDR = 0.5).

These results show that this boosting algorithm can be used in a real environment. The fact that not all crack blocks within the images were identified (average TDR in the 91-92% range) has not prevented the correct overall behaviour of the algorithm.

Table 5 - Detection and classification results of the images containing

cracks for the set of images simulating a real environment, with the

minimum intensity pixel and the standard deviation features, no

threshold and a percentage for training of 25% of the images.

Image Number TDR FDR Q Classification

5 No crack

6 0.8482 0.1521 0.6961 Transversal

7 0.8814 0.0869 0.7945 Down Diagonal

8 No crack

9 0.6854 0.0000 0.6854 Transversal

10 0.9490 0.3072 0.6418 Longitudinal

11 0.9691 0.1982 0.7709 Longitudinal

12 No crack

13 No crack

14 No crack

15 No crack

16 0.8557 0.0778 0.7779 Miscellaneous

17 No crack

18 No crack

19 No crack

20 0.8947 0.0685 0.8262 Miscellaneous

21 No crack

22 No crack

23 No crack

24 No crack

25 1.0000 0.1351 0.8649 Longitudinal

26 0.9677 0.2857 0.6820 Longitudinal

27 0.9286 0.1333 0.7952 Longitudinal

28 No crack

29 No crack

30 0.9286 0.0000 0.9286 Longitudinal

31 0.8621 0.1935 0.6685 Longitudinal

32 0.9048 0.5000 0.4048 Miscellaneous

33 1.0000 0.1667 0.8333 Longitudinal

34 1.0000 0.1081 0.8919 Longitudinal

35 No crack

36 No crack

37 No crack

38 No crack

39 No crack

40 No crack

Results[%] 91.17 16.08 75.08 97.22

Table 6 - Detection and classification results of the images containing

cracks for the set of images simulating a real environment, with the

minimum intensity pixel feature, no threshold and a percentage for

training of 25% of the images.

Image Number TDR FDR Q Classification

5 No crack

6 0.8902 0.1082 0.7820 Transversal

7 0.9164 0.1287 0.7877 Down Diagonal

8 No crack

9 0.7264 0.0263 0.7001 Transversal

10 0.9712 0.2762 0.6950 Longitudinal

11 0.9691 0.1982 0.7709 Longitudinal

12 No crack

13 No crack

14 No crack

15 No crack

16 0.8557 0.0568 0.7989 Miscellaneous

17 No crack

18 No crack

19 No crack

20 0.8947 0.0423 0.8525 Miscellaneous

21 No crack

22 No crack

23 No crack

24 No crack

25 1.0000 0.0303 0.9697 Longitudinal

26 0.9677 0.2857 0.6820 Longitudinal

27 0.9286 0.1333 0.7952 Longitudinal

28 No crack

29 No crack

30 0.9286 0.0000 0.9286 Longitudinal

31 0.8621 0.1935 0.6685 Longitudinal

32 0.9048 0.5000 0.4048 Miscellaneous

33 1.0000 0.1667 0.8333 Longitudinal

34 1.0000 0.1081 0.8919 Longitudinal

35 No crack

36 No crack

37 No crack

38 No crack

39 No crack

40 No crack

Results[%] 92.10 15.03 77.07 97.22

5. CONCLUSIONS AND FUTURE WORK

In this paper a method based on boosting was developed for the automatic detection and classification of cracks in road pavement. To demonstrate the effectiveness of the approach, two different training sets (one for each road) are used to obtain the final classifiers. A simple boosting method is used to train the classifier, and the two sets make it possible to obtain results in two different scenarios, which demonstrates the robustness of the implemented method. The key element of this thesis was the choice of the features that produce the best results. Among three possible choices, the minimum intensity pixel of each image block was the one that provided the best detection and classification results. Another very important element was the choice of the boosting algorithm. The “Modest AdaBoost” algorithm was chosen because it provides better generalization results and reaches the final error much faster than the other options considered, the “Gentle AdaBoost” and “Real AdaBoost” algorithms.

Good results were achieved, but with large variations between image sets. For the first image set, overall detection rates of around 95% and false positive detection rates of around 25% were achieved on the test images. For the second image set, which contains more “difficult” images, overall detection rates of around 85% and false positive detection rates between 21% and 23% were achieved. Finally, for the set simulating a real environment (4 training images; 21 non-crack images and 15 crack images), the detection rate was around 92% and the false positive detection rate between 15% and 16%. It can be concluded that the algorithm developed in this thesis can be used in a real environment. These results also indicate that the images used for training should contain clearly delimited cracks in order to achieve the best algorithm performance. The classification algorithm correctly classified all the images in the first two sets, except for one “road 2” image when both features were used (85.71%). In the test set simulating the real environment, the classification accuracy was 97.22%, which is very good.

Although the automated collection and processing

of pavement distress data have progressed greatly in the

last decade, there are still barriers to overcome before the

technologies involved can come to fruition as real-time,

reliable, and generally applicable tools. First there is the

need for development of systems capable of consistently

producing high-quality digital images under most data

collection conditions (lighting, angle of the sun,

shadowing, etc.). Although there is evidence that the

technologies have progressed to the needed capability,

they are not generally applied within the industry.

Future developments will target the classification of additional crack types, such as alligator cracking, and of other types of distress. Achieving better detection rates while lowering the false positive detection rates will be the principal objective of future work in this pattern recognition area.

6. REFERENCES

[1] Ex-JAE, “Catálogo de Degradações dos Pavimentos Rodoviários

Flexíveis – 2ª Versão”, Ex-Junta Autónoma das Estradas, Portugal,

November 1997.

[2] System developed by the Laboratoire Central des Ponts et

Chaussées, http://www.lcpc.fr/en/home.dml, accessed on November

2007.

[3] Ahmad Ardani, Shamshad Hussain, and Robert LaForce,

“Evaluation of Premature PCC Pavement Longitudinal Cracking in

Colorado”, Proceedings of the 2003 Mid-Continent Transportation

Research Symposium, Iowa 2003.

[4] Prithvi S. Kandhal, Timothy L. Ramirez, Paul M. Ingram,

“Evaluation of eight longitudinal joint construction techniques for

asphalt pavements in Pennsylvania”, July 2001.

[5] Image acquired from a site of the U.S. Department of Transportation,

http://www.tfhrc.gov/pavement/ltpp/reports/03031/01.htm#transverse,

accessed on July 2008.

[6] Image acquired from a site of the SemMaterials company,

http://www.semmaterials.com/repairing/fatiguecracking.aspx, accessed

in July 2008.

[7] Image acquired from a site, http://www.winpave.com/tarmac-surfacing.html, accessed in July 2008.

[8] Gramling, W.L., NCHRP Synthesis of Highway Practice 203:

Current Practices in Determining Pavement Condition, Transportation

Research Board, National Research Council, Washington, D.C., 1994,

57 pp.

[9] Wang, K.C.P, Transportation Research Circular: Automated Imaging

Technologies for Pavement Distress Survey, Committee A2B06,

Transportation Research Board, National Research Council,

Washington, D.C., draft, June 22, 2003.

[10] Wang, K.C.P. and X. Li, “Use of Digital Camera for Pavement

Surface Distress Survey,” Transportation Research Record 1675,

Transportation Research Board, National Research Council,

Washington, D.C., 1999, pp. 91–97.

[11] Digital Imaging and Linescan Imaging, International Cybernetics

Corporation, Largo, Fla., 2002 [Online]. Available:

http://www.internationalcybernetics.com.

[12] The Source for Infrastructure Information, Roadware Group, Inc.,

Paris, Ont., Canada, 2002 [Online]. Available:

http://www.roadware.com.

[13] Mandli’s Pavement System, Mandli Communications, Inc., Oregon, Wis., 2002 [Online]. Available: http://www.mandli.com.

[14] Henrique Oliveira, Paulo Lobato Correia, “Identifying and

retrieving distress images from road pavement surveys”, Workshop on

Multimedia Information Retrieval: New Trends and Challenges, IEEE

International Conference on Image Processing (ICIP 2008), San Diego,

USA, Oct. 2008.

[15] Yoav Freund, Robert E. Schapire, “A decision-theoretic

generalization of on-line learning and an application to boosting”,

Journal of Computer and System Sciences, 1996.

[16] Alexander Vezhnevets, “GML AdaBoost Matlab Toolbox Manual”,

July 2006.

[17] Y. Freund and R. E. Schapire, “Game theory, on-line prediction and

boosting”, in Proceedings of the Ninth Annual Conference on

Computational Learning Theory, pages 325-332, 1996.

[18] Jerome Friedman, Trevor Hastie, and Robert Tibshirani, “Additive logistic regression: A statistical view of boosting”, The Annals of Statistics, 28(2):337-407, April 2000.

[19] R. E. Schapire and Y. Singer, “Improved boosting algorithms using

confidence-rated predictions”, Machine Learning, 37(3):297-336,

December 1999.

[20] P. Viola and M. Jones, “Robust Real-Time Object Detection”, In

Proc. 2nd Int’l Workshop on Statistical and Computational Theories of

Vision – Modeling, Learning, Computing and Sampling, Vancouver,

Canada, July 2001.

[21] A. Vezhnevets and V. Vezhnevets, “Modest AdaBoost – teaching

AdaBoost to generalize better”, Graphicon 2005.

[22] H. G. Zhang, Q. Wang, “Use of Artificial Living System for Pavement

Distress Survey”, The 30th Annual Conference of the IEEE Industrial

Electronics Society, November 2-6, 2004, Busan, Korea.

[23] Philippe Delagnes, Dominique Barba, “A Markov Random Field for

rectilinear structure extraction in pavement distress image analysis”.

[24] Philippe Delagnes, Dominique Barba, “Rectilinear Structure

Extraction in Textured Images with an Irregular, Graph-Based Markov

Random Field Model”, Proceedings of ICPR ’96.

[25] H. D. Cheng, M. Miyojim, “Automatic pavement distress detection

system”, Journal of Information Sciences 108 (1998) 219-240.

[26] Ricardo Martins, “Reconhecimento automático de crateras na

superfície de Marte baseado em técnicas de boosting”, Master Thesis,

October 2007.