
Ditch detection using refined LiDAR data
A bachelor’s thesis at Jonkoping University

PAPER WITHIN Computer Science
AUTHORS: Filip Andersson, Jonatan Flyckt
TUTOR: Niklas Lavesson
JONKOPING 2019 June

This exam work has been carried out at the School of Engineering in Jonkoping in the subject area Computer Science. The work is a part of the three-year Bachelor of Science in Engineering programme. The authors take full responsibility for opinions, conclusions and findings presented.

Examiner: Ragnar Nohre
Supervisor: Niklas Lavesson
Scope: 15 credits
Date: 2019-06-18

Mailing address: Box 1026, 551 11 Jonkoping
Visiting address: Gjuterigatan 5
Phone: 036-10 10 00

DITCH DETECTION USING REFINED LIDAR DATA I

Abstract

In this thesis, a method for detecting ditches using digital elevation data derived from LiDAR scans was developed in collaboration with the Swedish Forest Agency.

The objective was to compare a machine learning based method with a state-of-the-art automated method, and to determine which LiDAR-based features represent the strongest ditch predictors.

This was done by using the digital elevation data to develop several new features, which were used as inputs in a random forest machine learning classifier. The output from this classifier was processed to remove noise, before a binarisation process produced the final ditch prediction. Several metrics including Cohen’s Kappa index were calculated to evaluate the performance of the method. These metrics were then compared with the metrics from the results of a reproduced state-of-the-art automated method.

The 95 % confidence interval for the Cohen’s Kappa metric for the population was calculated to be [0.567, 0.645]. Features based on the Impoundment attribute derived from the digital elevation data overall represented the strongest ditch predictors.

Our method outperformed the state-of-the-art automated method by a high margin. This thesis shows that it is possible to use AI and machine learning with digital elevation data to detect ditches to a substantial extent.

Keywords

Machine learning - Geographic information systems - Classification and regression trees - Supervised learning by classification


Acknowledgements

We thank Professor Niklas Lavesson for guidance and instructions during the course of this thesis. We also wish to thank Liselott Nilsson and everyone else at the Swedish Forest Agency for providing the data and the thesis idea, as well as a lot of help along the way. Thank you also to Dr Anneli Agren at the Swedish University of Agricultural Sciences for input and advice, as well as the labelled data for this project. Lastly, thank you to Sigurd Israelsson at the School of Engineering at Jonkoping University for lending hardware resources for the experiments in this study.


Contents

1 Introduction

2 Context
  2.1 Available data
  2.2 Current situation

3 Aim and scope

4 Background
  4.1 Supervised learning
  4.2 Related work using machine learning
  4.3 Random forests and decision trees
  4.4 Gini importance

5 Evaluation

6 Approach
  6.1 Experiment design
  6.2 Experimental methodology
  6.3 Data preparation
    6.3.1 Training and validation data
    6.3.2 Defining ditches in raster format
  6.4 Reproducing the Whitebox method
  6.5 Feature engineering
    6.5.1 General features
    6.5.2 Custom features
  6.6 Model configuration
  6.7 Post-processing
    6.7.1 Noise reduction and gap filling
    6.7.2 Binarisation with zones
    6.7.3 Cluster removal

7 Results and analysis
  7.1 Experimental results
  7.2 Analysis

8 Discussion
  8.1 Strengths
  8.2 Weaknesses and limitations
  8.3 Comparison to state-of-the-art
  8.4 General discussions

9 Conclusions and future work

References

Appendices
  Appendix A Source code
  Appendix B All feature importances
  Appendix C Individual zone results
  Appendix D Prediction result graphics for each zone


1 Introduction

The Swedish Forest Agency (SFA) is a government agency charged with various forest issues, as well as with enforcing forest policy goals set by the Swedish Parliament (Skogsstyrelsen, 2018). The SFA wishes to perform a mapping of ditches in Sweden, where ditches are identified on maps. Ditch mapping is important as a foundation in issues such as forest production, nature consideration and infrastructure (L. Nilsson, personal communication, 9 January, 2019). It is also important in water quality assessment, as ditches can be contaminated by chemicals in their surface runoff and shallow subsurface flow, which can cause groundwater contamination and lead to changes in the nature of water flow (Carluer & Marsily, 2004; Dages et al., 2009).

Methods for manual mapping of ditches have high precision, but they incur significant effort and costs (L. Nilsson, personal communication, 9 January, 2019). A high precision manual mapping has been performed at an area called Krycklan in Vasterbotten, Sweden (L. Nilsson, personal communication, 9 January, 2019). The automated methods previously applied, however, have been too imprecise for ditch mapping (Svensk, 2016) because the Light Detection and Ranging (LiDAR) measurements are sometimes obstructed by trees, bushes, or grass. These automated methods can therefore leave gaps in the digital elevation model, causing ditches to be incorrectly mapped (Cho, Slatton, Cheung, & Hwang, 2011; Roelens, Hofle, Dondeyne, Van Orshoven, & Diels, 2018). There are studies that have achieved good results using automated algorithms configured to fill in cavities in the model without utilising machine learning. However, these algorithms can be hard to tune correctly, and they can still suffer from the disrupted LiDAR scans (Cho et al., 2011).

In the light of this, the SFA wishes to examine the possibilities of using artificial intelligence (AI) and machine learning to identify ditches. As this is an open problem with many challenges and no apparent solution, this has given us a unique opportunity to expand the knowledge of machine learning for the SFA. The aim of this thesis has not been to give a full solution to the SFA’s machine learning goals, but to form a basis on which future work can build. The focus has been on investigating whether a machine learning method can perform better than an established method, as well as determining which data attributes are most relevant for a correct ditch prediction. To explore this, experiments have been conducted with a supervised learning algorithm. Supervised learning has been used in previous studies of geographical data with good results (Gislason, Benediktsson, & Sveinsson, 2006; Roelens et al., 2018; Stanislawski, Brockmeyer, & Shavers, 2018; Sverdrup-Thygeson, Ørka, Gobakken, & Næsset, 2016).


2 Context

A geographic information system (GIS) is a system for storing, analysing, retrieving, and displaying geographical data (Mulik, 1999). This data can be either spatial or attribute data represented on maps or in tables in software systems (Mulik, 1999). One type of GIS data is digital elevation data, which represents elevation differences of geographic surfaces with many different data attributes in a raster format (Esri, 2017).

2.1 Available data

The data available for the work in this study are all attributes originally derived from LiDAR data. This LiDAR data has first been refined into a digital elevation model (DEM). The DEM has then been processed with the Whitebox software (Lindsay, J. B., 2016). All the data attributes are in a raster format with a resolution of 0.25 m² per pixel (L. Nilsson, personal communication, 9 January, 2019). The following data attributes have been extracted from this process:

• Digital Elevation Model
The DEM attribute represents the elevation in metres above sea level.

• Sky View Factor
The Sky View Factor attribute represents, with a value ranging from 0 to 1, how much of the sky is visible from a certain point on the ground (Zaksek, Ostir, & Kokalj, 2011).

The Sky-view factor is defined by part of the visible sky (Ω) above a certain observation point as seen from a two-dimensional representation (a). The algorithm computes the vertical elevation angle of the horizon γi in n directions to the specified radius R (b). (Zaksek et al., 2011)

See Figure 1 for a visual representation.

• Impoundment index
An attribute representing the size of the impoundment that would be created by inserting a dam at a location in the DEM. The grid cells of the DEM, together with specified length and height of the dam, produce the Impoundment index. This index is mainly used for mapping drained wetlands with LiDAR data. (Lindsay, J. B., 2016)

The Impoundment index used in this thesis was developed specifically for the Krycklan area (A. Agren, personal communication, 7 May, 2019).

• High Pass Median Filter - smoothed
High Pass Median Filters (HPMF) are used to help humans and algorithms to focus attention on important small scale features in images. For LiDAR data, the filter can eliminate random variations in laser pulse energy, as well as the effects of attenuation. The filter uses the median value of a neighbourhood of values and then replaces all the neighbourhood values with this median value to help smooth out the data. (Shane D. Mayor, 2007)

• Slope
The Slope attribute represents the degree of slope at a certain point, with a value ranging from 0 to 90. This attribute contains no information about the direction of the slope.

Figure 1. Sky View Factor illustration (Zaksek et al., 2011).

2.2 Current situation

The SFA has provided refined digital elevation data of the Krycklan area. This area covers approximately 10,000 hectares, and the data is refined from LiDAR data with a very high resolution, averaging 20 laser pulses per m² (Erixon, 2015). The data comes from a scan ordered by the Swedish University of Agricultural Sciences (SLU). This detailed data is currently only available for Krycklan and one other area, while the LiDAR data for the rest of Sweden comes from the Swedish national laser scan and has a lower resolution (Lantmateriet, 2018). Of the 10,000 hectares in Krycklan, 4,100 hectares were used in our study. The current manual mapping was developed at SLU with the use of a height model and orthophoto by manually placing ditches as vectors on a two-dimensional map (A. Agren, personal communication, 25 January, 2019). This detailed mapping has been provided to us by the SFA for the Krycklan area. Although this manual mapping is precise, it requires vast amounts of labour and time resources (L. Nilsson, personal communication, 9 January, 2019). Because of this, it was relevant to explore the possibility of a more automated solution. The current automated solutions can detect parts of ditches, but fail where the LiDAR data has been disrupted by various blocking factors (Svensk, 2016). These disruptions may leave gaps in the ditch model, which need to be manually studied and mapped by a human. The automated mappings also incorrectly classify non-ditch areas as ditches too frequently. One automated study was performed in Delineation of Ditches in Wetlands by Remote Sensing by Gustavsson and Selberg (2018). In their study, the Whitebox software (Lindsay, J. B., 2016) was used to produce a ditch mapping from the data described in 2.1. Their method was used as a comparison with the method produced in our study.

3 Aim and scope

With the manually mapped ditches in Krycklan as training and validation data, we have investigated whether it is possible to use a supervised learning algorithm to classify ditches in this area. The SFA has previously never attempted to classify ditches using machine learning (L. Nilsson, personal communication, 9 January, 2019). A comparison was also made between our classifier and the method used by Gustavsson and Selberg (2018). In addition to this, information about what data features contribute the most to ditch predictions was collected.

Two research questions were formed to answer these queries:

• Does the proposed approach detect ditches more accurately than that of Gustavsson and Selberg (2018)?

• Which LiDAR-based features represent the strongest ditch predictors?

We captured various aspects of detection performance through the following measures: accuracy, precision, recall, Cohen’s Kappa index, and area under the precision/recall curve score.

We did not make use of the original LiDAR data, but instead only the refined data attributes described in 2.1. We also only focused on one supervised learning algorithm, and did not make any comparisons between several different machine learning methods.

The experiments were performed on a 4,100 hectare area of northern Sweden, which is only roughly 0.009 % of the total area of Sweden. The available data has a resolution of 0.25 m² per raster point (pixel). The data derived from the Lantmateriet national laser scan has a resolution of 4 m² (Lantmateriet, 2018). The higher resolution and relatively small study area may mean that our results are not generalisable to the entirety of Sweden.

The study aimed to find all the ditches in the study area, but we did not seek out any information about the shape or form of individual ditches. The ditches all have a generalised width based on a perceived average of ditch width in the area. No information was sought out about the depth or water flow of ditches. Both forest and road ditches exist, but we did not separate these two classes of ditches in any way. No focus was put on the computational efficiency of the method.

4 Background

According to Flach (2012), machine learning is the science of systems that improve their performance by using previous experience. Flach defines two phases of a given machine learning implementation: the learning problem and the task. Both phases require inputs and these inputs are defined by features. Features consist of refined or raw data from a set of data attributes. A learning algorithm uses these features to produce a model by mapping how the features correlate. The task is the so-called black box method where feature data is sent to a model to produce an output (Flach, 2012). A black box method is a method where the user enters an input and receives an output, but has no insight into why the algorithm makes its prediction (Guidotti et al., 2018). Flach (2012) summarises the machine learning process in the following way: “Machine learning is concerned with using the right features to build the right models that achieve the right tasks.” There are different categories of tasks that can be solved with machine learning, e.g. classification, regression, or clustering. In this study, we focused on modelling ditch detection in a way that enabled us to apply classification to solve the detection problem.

4.1 Supervised learning

Supervised learning is a branch of machine learning where labelled data exists and can be used to map features and train the model (Kotsiantis, 2007). Furthermore, supervised learning works by using algorithms that can produce general hypotheses from externally supplied occurrences (Kotsiantis, 2007). These hypotheses can make predictions about future occurrences by building a model where the distribution of classification labels is mapped to correlating features. The model can then be used to label and classify data where the features are known beforehand, but the classification label is unknown.


Kotsiantis (2007) defines several steps in the process of developing a classifier with supervised learning, see Figure 2. The first steps involve identifying and pre-processing the input data, as well as defining the labels in the training set. When this is done, a suitable algorithm is selected. The last step involves using the algorithm on the data to train the classifying model. This step can consist of many iterations where the input parameters are tweaked to produce the best possible model. Because the SFA provided a detailed mapping of ditches over Krycklan, and the data is labelled, it was possible to apply supervised learning in this study.

[Figure 2: flowchart with the steps Problem, Identification of required data, Data pre-processing, Definition of training set, Algorithm selection, Training, Evaluation with test set, and the decision OK? (Yes: Classifier; No: Parameter tuning).]

Figure 2. The steps involved in developing a supervised learning classifying model (Kotsiantis, 2007).

4.2 Related work using machine learning

There are several studies where geographical data has been used to structure models of behaviour and patterns in geographical areas. Sverdrup-Thygeson et al. (2016) used digital elevation data in their study of classifying old near-natural forests versus old managed forests. Their data consisted of digital elevation data refined from LiDAR scans. Three algorithms were compared: generalised linear model, boosted regression trees, and random forest. They concluded that the difference in performance between the algorithms was not statistically significant, indicating that many approaches can be taken when classifying this type of data. Another study using supervised learning with digital elevation data is Random Forest for land cover classification (Gislason et al., 2006). They concluded that random forest is very desirable for multisource classification of remote sensing data where no statistical models are available.

Stanislawski et al. (2018) used a deep learning convolutional neural network method in their study to extract road and drainage valley features from digital elevation data in Iowa, USA. They used labelled data of roads and stream valleys and a DEM derived from LiDAR data to train their model. Stanislawski et al. (2018) used a three metre widening process to produce a raster model from the vectors of the road and stream valley networks. Their study indicates that deep neural networks can be used to detect drainage features from digital elevation data. However, deep neural networks have the downside of requiring vast amounts of processing power (Jaderberg, Vedaldi, & Zisserman, 2014).

Roelens et al. (2018) used random forest to detect ditches in Belgium. They used raw LiDAR data with a resolution of at least 16 laser pulses per m². They managed to attain an overall classification accuracy between 91 and 99 %, and a ditch classification true positive rate (recall) of between 67 and 89 %. Many different features were used where neighbouring points of a specific LiDAR point were examined. Values of all LiDAR attributes were represented for the neighbouring area at radii of 0.5, 1, 1.5, 2, 3 and 4 metres. Reproducing these neighbouring area features was of interest when building the model in our study. Roelens et al. (2018) used a Gini importance evaluation to determine that the most important of these neighbouring areas were those with radii of two and four metres.

There was a plethora of possible algorithm choices for the task that this thesis deals with. Several of them (generalised linear model, boosted regression trees, and random forest) can also produce variable weights, highlighting what features are the most important for a given prediction (Sverdrup-Thygeson et al., 2016). Because random forest proved to be suitable for Roelens et al. (2018), it was chosen for our ditch detection problem.

4.3 Random forests and decision trees

Model ensembles are a category of machine learning where a set of models can be combined to increase diversity and robustness of predictions (Flach, 2012). Random forest is a supervised learning algorithm, first developed by Breiman (2001), which makes use of the model ensemble technique by using a set of decision trees to build a forest of trees. The outputs from these trees are then examined, and a majority vote can be used to produce the final classification output for a specific input (Breiman, 2001). Random forest can also produce a class probability estimate ranging from 0 to 1 for each sample by dividing the number of trees that output a class by the total number of trees.
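The majority vote and the class probability estimate described above can be illustrated with a minimal sketch. The function name `forest_vote` and the five tree outputs are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
from collections import Counter


def forest_vote(tree_outputs):
    """Return (majority_class, class_probabilities) for one sample.

    tree_outputs holds the class predicted by each tree in the forest.
    """
    counts = Counter(tree_outputs)
    n_trees = len(tree_outputs)
    # Probability of a class = trees voting for it / total number of trees.
    probabilities = {label: count / n_trees for label, count in counts.items()}
    # On a tie, Counter.most_common keeps the class that was seen first.
    majority_class = counts.most_common(1)[0][0]
    return majority_class, probabilities


# Hypothetical outputs of a five-tree forest for a single pixel.
label, proba = forest_vote(["ditch", "ditch", "non-ditch", "ditch", "non-ditch"])
print(label, proba)
```

Keeping the probability rather than only the majority label is what makes a later thresholding step possible.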

A decision tree is a data structure where each step down in the tree aims to minimise the entropy of the outputs (Kotsiantis, 2007). Each node in a decision tree represents a feature or an abstraction of features from the inputs, and each split is made by using this feature to separate the inputs. The end node of each tree is called a leaf, and each leaf produces an output value. Figure 3 and Table 1 show a generic decision tree and its data.

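[Figure 3: decision tree diagram with root node at1 and internal nodes at2, at3 and at4; branches are labelled with attribute values (a1, b1, c1, a2, b2, c2, a3, b3, a4, b4) and leaves with Yes or No.]

Figure 3. An example of a decision tree. Each node splits the data set based on certain feature attributes (Kotsiantis, 2007).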

Table 1
The mock data used in Figure 3

at1   at2   at3   at4   Class
a1    a2    a3    a4    Yes
a1    a2    a3    b4    Yes
a1    b2    a3    a4    Yes
a1    b2    b3    b4    No
a1    c2    a3    a4    Yes
a1    c2    a3    b4    No
b1    b2    b3    b4    No
c1    b2    b3    b4    No
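As a small illustration of how such a tree classifies the rows of Table 1, the sketch below encodes one tree as nested tuples and dictionaries. The branch layout is an assumption inferred from the node labels of Figure 3 and checked only against the eight rows of Table 1; it is not guaranteed to match the original figure exactly.

```python
# Internal node: (attribute, {attribute value: subtree}); leaf: class label.
# The layout is an assumed reading of Figure 3.
TREE = ("at1", {
    "a1": ("at2", {
        "a2": "Yes",
        "b2": ("at3", {"a3": "Yes", "b3": "No"}),
        "c2": ("at4", {"a4": "Yes", "b4": "No"}),
    }),
    "b1": "No",
    "c1": "No",
})

ATTRIBUTES = ("at1", "at2", "at3", "at4")

# Rows of Table 1: at1, at2, at3, at4, Class.
ROWS = [
    ("a1", "a2", "a3", "a4", "Yes"),
    ("a1", "a2", "a3", "b4", "Yes"),
    ("a1", "b2", "a3", "a4", "Yes"),
    ("a1", "b2", "b3", "b4", "No"),
    ("a1", "c2", "a3", "a4", "Yes"),
    ("a1", "c2", "a3", "b4", "No"),
    ("b1", "b2", "b3", "b4", "No"),
    ("c1", "b2", "b3", "b4", "No"),
]


def classify(tree, sample):
    """Walk from the root to a leaf, following the sample's attribute values."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[sample[attribute]]
    return tree


predictions = [classify(TREE, dict(zip(ATTRIBUTES, row[:4]))) for row in ROWS]
print(predictions)
```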

One way to increase the diversity of a model ensemble is to use a technique called bootstrap aggregation (bagging), which was developed by Breiman (2001). Bagging draws a different random sample, with replacement, for constructing each tree (Flach, 2012). Bagging is particularly useful in constructing decision trees, due to their inherent sensitivity to variations in the data.


Random forest makes use of bagging, as well as subspace sampling. Subspace sampling, developed by Ho (1995), is the process of drawing a random subset of features from the feature set. Using this subspace sampling of features in tandem with bagging helps to produce an even more robust and diverse ensemble, and this is the method referred to as random forest (Flach, 2012). According to Breiman (2001), this reduces the generalisation error and the overfitting that other classification algorithms can give rise to.
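The two sampling steps can be sketched with the standard library alone. The helper names and the feature names below are hypothetical; whether the feature subset is drawn once per tree or once per split depends on the implementation and is left open here.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Bagging: draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def feature_subset(feature_names, n_features, rng):
    """Subspace sampling: draw n_features distinct features at random."""
    return rng.sample(feature_names, n_features)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical feature names, loosely based on the attributes in 2.1.
features = ["dem", "sky_view_factor", "impoundment", "hpmf", "slope"]

rows = bootstrap_indices(8, rng)
subset = feature_subset(features, 2, rng)
print(rows, subset)
```

Because rows are drawn with replacement, some training rows typically appear several times in one bootstrap sample while others are left out, which is what gives each tree a slightly different view of the data.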

To determine which split is the best at each node in a tree, random forest can prioritise finding either the best reduction of entropy in the training dataset, or using a Gini criterion, which tries to produce pure nodes by sending all data in the largest class to the next node (Breiman, 1996). There are other hyperparameters that can be tuned to achieve better results from the model. For instance, the number of decision trees can be adjusted, as can the maximum number of features to use for each subspace sampling (Pedregosa et al., 2011).
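To make the two split criteria concrete, the sketch below computes the Gini impurity and the entropy of a set of labels, and the weighted impurity after a split. It uses the common textbook definitions, which is an assumption, since the thesis does not spell out the formulas. The example splits the eight class labels of Table 1 on attribute at1.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def weighted_impurity(groups, impurity):
    """Impurity after a split: child impurities weighted by child size."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * impurity(group) for group in groups if group)


# Class labels of Table 1 before the split, and grouped by at1 (a1, b1, c1).
parent = ["Yes", "Yes", "Yes", "No", "Yes", "No", "No", "No"]
split_on_at1 = [["Yes", "Yes", "Yes", "No", "Yes", "No"], ["No"], ["No"]]

print(gini(parent), entropy(parent))
print(weighted_impurity(split_on_at1, gini))
```

A split is considered better the more it lowers the weighted impurity relative to the parent node, whichever of the two criteria is used.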

4.4 Gini importance

The random forest algorithm can produce variable weights using Gini importance. Gini importance shows how often a feature is selected for a split for a certain classification (Menze et al., 2009). Menze et al. (2009) explain that the Gini importance can be obtained without much extra processing power, as it is a by-product of the actual training of the classifier. Each node in every randomised tree in the forest is examined, and each feature is given a Gini impurity rating. This impurity rating measures the frequency of incorrect labelling when using a randomly selected feature. As the Gini impurity rating for a feature increases, its Gini importance decreases (Menze et al., 2009). In the final product from the algorithm, the Gini importance for each feature is summarised from the trees, and presented as a continuous value between 0 and 1. A high value indicates that a feature is important when making a prediction (Menze et al., 2009). The Gini importance can help when developing a model by indirectly giving advice on new features to introduce. The Gini importance was also relevant for answering our second research question, stated in Section 3.


5 Evaluation

The following data was gathered to evaluate the prediction, where positive represents a ditch and negative represents a non-ditch:

• Number of true positives (TP)

• Number of false positives (FP)

• Number of true negatives (TN)

• Number of false negatives (FN)

In An introduction to ROC analysis, Fawcett (2006) explains that the outcome of the prediction can be presented in a confusion matrix to represent the disposition of the set of instances.

The following metrics were extracted from the confusion matrix:

• Recall (true positive rate)
Positives correctly classified in relation to all actual positives.

$$\text{Recall} = \frac{TP}{TP + FN}$$

• Precision
Positives correctly classified in relation to all classified positives.

$$\text{Precision} = \frac{TP}{TP + FP}$$

• Accuracy
Correct classifications in relation to all attempted classifications.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP}$$
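The three metrics follow directly from the four confusion matrix counts. The sketch below is a minimal illustration; the counts are hypothetical and not results from the thesis.

```python
def recall(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Share of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)


def accuracy(tp, fp, tn, fn):
    """Share of all classifications that are correct."""
    return (tp + tn) / (tp + tn + fn + fp)


# Hypothetical counts for 1,000 grid cells, of which 20 are ditch cells.
tp, fp, tn, fn = 15, 5, 975, 5
print(recall(tp, fn), precision(tp, fp), accuracy(tp, fp, tn, fn))
```

Note that recall and precision are undefined when their denominators are zero, for example when no cell is predicted as a ditch; the sketch does not guard against that case.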

Using only accuracy as an evaluation metric when dealing with an imbalanced dataset (roughly 98 % of all occurrences are non-ditch) would produce a poor performance assessment (Spelmen & Porkodi, 2018). By simply classifying all pixels as non-ditches, one would by default attain 98 % accuracy. For this reason, the results were also evaluated using Cohen’s Kappa (Cohen’s κ) index. Cohen’s κ index measures how much better a prediction is compared to a prediction based purely on chance, where chance would yield a value of zero (Sim & Wright, 2005). With our data, a κ value of roughly zero would be attained by predicting 2 % of the occurrences as ditch pixels completely at random.

The chance rating $P_c$ of a prediction of $n$ occurrences is calculated with:

$$P_c = \frac{\dfrac{(TP + FN) \cdot (TP + FP)}{n} + \dfrac{(FN + TN) \cdot (FP + TN)}{n}}{n}$$

Cohen’s κ is then calculated as a value between -1 and 1 with:

$$\kappa = \frac{\text{Accuracy} - P_c}{1 - P_c}$$
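The two formulas can be checked numerically with a short sketch. The function names and counts are hypothetical. The first example reproduces the situation described earlier, where every occurrence is classified as non-ditch in a dataset with 98 % non-ditch occurrences: accuracy is 0.98, but κ is zero.

```python
def chance_agreement(tp, fp, tn, fn):
    """Chance rating Pc computed from the confusion matrix counts."""
    n = tp + fp + tn + fn
    return ((tp + fn) * (tp + fp) / n + (fn + tn) * (fp + tn) / n) / n


def cohens_kappa(tp, fp, tn, fn):
    """Cohen's kappa: (Accuracy - Pc) / (1 - Pc)."""
    n = tp + fp + tn + fn
    acc = (tp + tn) / n
    pc = chance_agreement(tp, fp, tn, fn)
    # Undefined (division by zero) if Pc equals 1; not handled in this sketch.
    return (acc - pc) / (1 - pc)


# Everything classified as non-ditch: 98 % accuracy, but no better than chance.
kappa_all_negative = cohens_kappa(tp=0, fp=0, tn=98, fn=2)

# Hypothetical prediction that finds 15 of 20 ditch cells among 1,000 cells.
kappa_example = cohens_kappa(tp=15, fp=5, tn=975, fn=5)
print(kappa_all_negative, kappa_example)
```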

Values above zero are better than chance and values below zero are worse than chance. Landis and Koch (1977) suggest the benchmarks seen in Table 2 as a general measurement of how good a prediction’s κ rating is.

Table 2
κ analysis

κ value        Performance strength
< 0.00         Poor
0.00 – 0.20    Slight
0.21 – 0.40    Fair
0.41 – 0.60    Moderate
0.61 – 0.80    Substantial
0.81 – 1.00    Almost perfect

Note. Landis and Koch (1977) suggested these benchmarks to judge the performance strength of a classifier using Cohen’s κ rating.
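The benchmarks can be turned into a small lookup, sketched below. The function name is hypothetical, and the handling of values that fall between two table rows (for example 0.205) is an assumption: each band is treated as extending up to and including its upper bound.

```python
def kappa_strength(kappa):
    """Map a Cohen's kappa value to the Landis and Koch (1977) label."""
    if kappa < 0.0:
        return "Poor"
    bands = (
        (0.20, "Slight"),
        (0.40, "Fair"),
        (0.60, "Moderate"),
        (0.80, "Substantial"),
    )
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    return "Almost perfect"


# The two bounds of the confidence interval reported in the abstract.
print(kappa_strength(0.567), kappa_strength(0.645))
```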

The precision-recall curve and the area under the precision-recall curve (AUPRC) are additional metrics that can be used when evaluating datasets with a largely imbalanced class distribution (Fu, Yi, & Pan, 2019). The precision-recall curve has the recall value on the x-axis and the precision value on the y-axis, and the area under this curve gives the AUPRC value. The area is given as a value between zero and one, where a value closer to one is better. Because neither precision nor recall takes true negatives into account, the precision-recall curve does not place an equal value on true negatives and true positives (Fu et al., 2019). For our ditch detection problem, this means that the AUPRC evaluation metric favours accurately classifying ditch pixels over accurately classifying non-ditch pixels.
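A minimal sketch of how a precision-recall curve and the area under it can be computed from predicted ditch probabilities is given below. It uses the step-wise "average precision" estimate of the area, which is an assumption; the thesis does not state which estimator was used. Labels, scores and function names are hypothetical, and tied scores are not treated specially.

```python
def precision_recall_points(labels, scores):
    """Return (recall, precision) pairs, lowering the threshold one sample at a time.

    labels: 1 for ditch, 0 for non-ditch; scores: predicted ditch probability.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    points = []
    tp = fp = 0
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / total_positives, tp / (tp + fp)))
    return points


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve."""
    area = 0.0
    previous_recall = 0.0
    for recall, precision in precision_recall_points(labels, scores):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


perfect = average_precision([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
mixed = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(perfect, mixed)
```

True negatives never enter the calculation, which is why the measure is insensitive to the large number of correctly classified non-ditch pixels.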

To obtain a clearer view of how well a model performs on a classifying problem, confidence intervals can be calculated from several different predictions. This can be preferable to the more traditionally used significance levels, because confidence intervals do more than merely reject or accept a hypothesis (Gardner & Altman, 1986). A confidence interval is a range between two values within which the true mean of the population lies with a stated confidence level. Commonly in research, intervals with a confidence level of 90, 95 or 99 % are used (Gardner & Altman, 1986). The confidence interval is calculated with:

$$\left[\bar{x} - \lambda_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + \lambda_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\right]$$

where $\bar{x}$ is the mean of all the samples and $n$ is the number of samples, and $\lambda_{\alpha/2}$ can be obtained from a z-value table where $\alpha$ is one minus the confidence level. $\sigma$ is the standard deviation of the samples given by:

$$\sigma = \sqrt{\sum_{i=1}^{N} (x_i - \mu)^2 \cdot P(\xi = x_i)}$$

where $P(\xi = x_i)$ is the probability of occurrence of a value from your set of values, and $\mu$ is the expectation given by:

$$\mu = \sum_{i=1}^{N} x_i \cdot P(\xi = x_i)$$

(Vannman, 2002)
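The interval can be computed with the Python standard library, as in the sketch below. It assumes that every sample is equally likely, so that P(ξ = x_i) = 1/N and σ becomes the population standard deviation; this is an assumption about how the formula was applied. The three sample values are hypothetical and are not results from the thesis.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev


def confidence_interval(samples, confidence=0.95):
    """Return (lower, upper) bounds of the confidence interval for the mean."""
    n = len(samples)
    mean = fmean(samples)
    sigma = pstdev(samples)  # population standard deviation, P = 1/N per sample
    alpha = 1.0 - confidence
    # Lambda_(alpha/2): the z-value that leaves alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical kappa values from three sub-experiments.
low, high = confidence_interval([0.5, 0.6, 0.7])
print(low, high)
```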

6 Approach

The study was initiated by a review of previous research in the areas of machine learning and GIS. The information was then used to determine which algorithms and methods were the most suitable for the following phase of the study. The last phase consisted of an experiment designed to answer the research questions of the thesis. Based on the information study, a hypothesis was also formulated for the experiment.

6.1 Experiment design

The first phase of the experiment involved reproducing an automated method for ditch detection not using machine learning. The method chosen for this comparison is from Delineation of Ditches in Wetlands by Remote Sensing by Gustavsson and Selberg (2018). In this study, the Whitebox software (Lindsay, J. B., 2016) was used on a DEM to determine how well different data attributes could be used to detect ditches in raster and polyline formats. Since two of the data attributes (Sky View Factor and Impoundment index) were also available in our dataset, reproducing this method on the Krycklan area produced a good comparison with our model. The second experiment phase involved feature engineering and developing post-processing for the random forest model. The third phase involved evaluating the output from the model and determining the importances of the features used. Lastly, the results from the different methods were compared and analysed to determine how they differed.

6.2 Experimental methodology

The random forest algorithm from the Python library scikit-learn was used in the experiment. This library allows trees to be split with either a Gini or an entropy criterion. The probability predictions in this random forest implementation are calculated as the number of occurrences of a class output divided by the total number of trees in the forest. (Pedregosa et al., 2011)

The independent variables of the experiment were the feature inputs of the learning algorithm, as well as the number of trees and other configurations of the random forest algorithm. The dependent variable was the raster output classified either as ditch or non-ditch. (Zobel, 2015)

To answer the research question “Does the proposed approach detect ditches more accurately than that of Gustavsson and Selberg (2018)?”, the following hypothesis for the experiment was formulated:

The method proposed in this study outperforms the method by Gustavsson and Selberg (2018) in ditch detection, with respect to Cohen’s Kappa index.

6.3 Data preparation

6.3.1 Training and validation data

To develop and evaluate our model, the raster and ditch label data of Krycklan were manually divided into 21 smaller subsections (zones). From this division, 11 of the zones were put aside as hold-out data to evaluate the performance of the predictions, and 10 zones were used in the development of the model. This allowed the model to be evaluated on unseen data to strengthen the validity of the experiment. Each zone represents an area of roughly 196 hectares. Figure 4 shows which zones were used for development and evaluation respectively.

Figure 4. Krycklan’s location in Sweden. Red zones were used for developing the model and green zones were used for evaluation. Each zone represents 2997 × 2620 pixels (roughly 196 hectares).


With the 11 zones in the hold-out data for the final random forest experiment, a process called leave-one-out cross validation was used. Leave-one-out cross validation is a method where a model is trained on all but one of the occurrences, and the remaining occurrence is used to evaluate the results (Wong, 2015). Using this technique allowed us to train 11 different random forest classifiers with a large amount of data, and to evaluate each model once on a single zone, producing 11 sub-experiments to evaluate the method on.
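The splitting scheme described above can be sketched as follows. The zone identifiers 1 to 11 and the function name are hypothetical; the sketch only shows how each zone is held out exactly once while the remaining zones are used for training.

```python
def leave_one_zone_out(zone_ids):
    """Yield (training_zones, held_out_zone) pairs, holding out each zone once."""
    for held_out in zone_ids:
        training = [zone for zone in zone_ids if zone != held_out]
        yield training, held_out


# Hypothetical identifiers for the 11 hold-out zones.
hold_out_zones = list(range(1, 12))
splits = list(leave_one_zone_out(hold_out_zones))

for training, held_out in splits:
    print(f"train on zones {training}, evaluate on zone {held_out}")
```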

6.3.2 Defining ditches in raster format

The digital elevation data from the SFA was represented in a raster format, whereas the ditches from SLU were represented as vectors. These vectors contain no information about the width of ditches. To label each individual pixel as either ditch or non-ditch, a conversion from vector to raster format was performed. Because the observed average width of ditches is larger than 0.5 metres, all pixels within a radius of three pixels (1.5 metres) of the vectors were labelled as ditch pixels. Figure 5 A shows the ditches rasterised from vectors and B shows the ditches after widening. The data in Figure 5 B is the labelled data that was used to train the random forest model. A similar approach was taken by Stanislawski et al. (2018) in their study of roads and stream valleys. Because ditches vary in width, it was not possible to produce a perfect representation of each ditch. However, this made for a good compromise for the average ditch.
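The widening step can be approximated with a binary dilation using a disk of radius three pixels. This is a minimal sketch that assumes the vectors have already been rasterised to a one-pixel-wide centreline; the helper names and the use of scipy.ndimage are choices made for this example, not necessarily the tools used in the study.

```python
import numpy as np
from scipy import ndimage


def disk(radius):
    """Boolean mask of all pixel offsets within `radius` pixels of the centre."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x ** 2 + y ** 2 <= radius ** 2


def widen_labels(centreline, radius_px=3):
    """Label every pixel within `radius_px` pixels of a centreline pixel as ditch."""
    return ndimage.binary_dilation(centreline, structure=disk(radius_px))


# A horizontal one-pixel-wide ditch centreline in a small raster.
centreline = np.zeros((15, 15), dtype=bool)
centreline[7, :] = True

widened = widen_labels(centreline)
print(widened[:, 7].sum())  # ditch width in pixels across the centreline
```

With a radius of three pixels on each side of the centreline, the widened ditch is seven pixels (3.5 metres at 0.5 metre resolution) wide.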

Since our aim was to detect ditches, and not each pixel labelled as a ditch, some adjustments were made when evaluating the prediction results. The dataset was divided into a lower-resolution grid in which each cell covers six by six pixels (9 m²). Each grid cell that contained at least 25% ditch pixels was then labelled as a ditch. A similar method was used by Stanislawski et al. (2018). See Figure 5 C for a visual representation of these grid zones.
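The grid conversion can be sketched with a reshape that groups the raster into six by six pixel blocks. The function name is hypothetical, and trimming rows and columns that do not fill a whole cell is an assumption made for this example.

```python
import numpy as np


def to_grid_labels(pixel_labels, cell=6, min_fraction=0.25):
    """Label a grid cell as ditch if at least `min_fraction` of its pixels are ditch."""
    rows, cols = pixel_labels.shape
    n_rows, n_cols = rows // cell, cols // cell
    # Assumption: pixels that do not fill a whole cell at the edges are dropped.
    trimmed = pixel_labels[:n_rows * cell, :n_cols * cell].astype(float)
    blocks = trimmed.reshape(n_rows, cell, n_cols, cell)
    return blocks.mean(axis=(1, 3)) >= min_fraction


pixel_labels = np.zeros((12, 12), dtype=bool)
pixel_labels[0:3, 0:3] = True    # 9 of 36 pixels (25 %) in the top-left cell
pixel_labels[6:8, 6:10] = True   # 8 of 36 pixels (about 22 %) in the bottom-right cell

grid_labels = to_grid_labels(pixel_labels)
print(grid_labels)
```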


Figure 5. Processing of ditch labels.
A: Rasterised ditches with a width of one pixel.
B: Ditches after a widening process, seven pixels (3.5 metres) wide. Used as input when training the model.
C: Ditches after a zone conversion. Used for evaluating the results from the model.

6.4 Reproducing the Whitebox method

In Delineation of Ditches in Wetlands by Remote Sensing (Gustavsson & Selberg, 2018), the workflow for ditch detection consisted of a reclassification to remove noise and to define the limits of what to classify as a ditch. The raster data was then imported into ArcMap (Esri, 2017) to convert the raster to vectors (Gustavsson & Selberg, 2018). We only reproduced the reclassification step, as the results needed to be in raster format in order to be compared with our model. The workflow that Gustavsson and Selberg (2018) used for ditch detection is presented below.

• Sky View Factor
The Sky View Factor data has a value between zero and one. The data was binarised to only include values below 0.989. To remove large waterbodies, Gustavsson and Selberg (2018) created a buffer of six metres around polygons of waterbodies. These were converted to pixels and excluded in the result (Gustavsson & Selberg, 2018). Since we had no available data on waterbodies, we could not remove them from the prediction.

• Impoundment Index
The dams constructed in Whitebox (Lindsay, J. B., 2016) were four by four metres in size. After running the impoundment tool, the data was binarised to remove values with a water accumulation below 30 m³. This was done to remove flat areas, but still maintain the pixels with a large water accumulation (Gustavsson & Selberg, 2018).
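The two reclassification steps listed above amount to simple thresholds on the attribute rasters. The sketch below uses the thresholds quoted in the text; the function names and the tiny example arrays are hypothetical, and the waterbody buffer is left out because no waterbody data was available in the reproduction.

```python
import numpy as np

SVF_MAX = 0.989             # keep Sky View Factor values below this threshold
IMPOUNDMENT_MIN_M3 = 30.0   # remove water accumulations below this volume


def reclassify_sky_view_factor(svf):
    """Binarise the Sky View Factor raster: True where the value is below the threshold."""
    return svf < SVF_MAX


def reclassify_impoundment(impoundment):
    """Binarise the Impoundment raster: True where the accumulation is at least 30 m³."""
    return impoundment >= IMPOUNDMENT_MIN_M3


svf_example = np.array([0.950, 0.989, 0.995])
impoundment_example = np.array([10.0, 30.0, 120.0])

svf_ditch = reclassify_sky_view_factor(svf_example)
impoundment_ditch = reclassify_impoundment(impoundment_example)
print(svf_ditch, impoundment_ditch)
```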

6.5 Feature engineering

Developing the random forest model involved examining how different kinds of features affected the prediction. Several possible data manipulation methods could theoretically produce a better prediction. An issue with the previously used automated methods is that they do not correctly detect ditches where the LiDAR has been interrupted by bushes or trees. To combat this, steps were taken where neighbouring pixels were included to give a representation of the area surrounding a specific pixel. A similar approach was taken by Roelens et al. (2018), and this approach produced positive results in their study.

6.5.1 General features

The features used for training the model (Sky View Factor, Impoundment index, Slope, High Pass Median Filter) are all derivatives of the digital elevation data. These raw features provided a satisfactory foundation for the model, but lacked in the generalisability of their predictions. More diverse features were extracted using simple statistical aggregates such as mean, median, min, max, and standard deviation. This facilitated finding obscurities in the neighbouring areas around pixels. These features were calculated by gathering all data points in different circular radii around the studied pixel, before performing one of the statistical aggregations. See Figure 6: B, C, H, and J for graphical representation of some of these features.
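A minimal sketch of such circular aggregates is shown below, assuming scipy.ndimage.generic_filter with a disk-shaped footprint. The helper names and the edge handling (repeating the nearest edge value) are assumptions for this example; the study may have implemented the aggregation differently.

```python
import numpy as np
from scipy import ndimage


def disk(radius):
    """Boolean mask of all pixel offsets within `radius` pixels of the centre."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x ** 2 + y ** 2 <= radius ** 2


def circular_aggregate(raster, radius, statistic):
    """Apply `statistic` (e.g. np.mean, np.median, np.std) over a circular neighbourhood."""
    return ndimage.generic_filter(raster, statistic, footprint=disk(radius), mode="nearest")


# A flat raster with a single high value in the centre.
raster = np.zeros((9, 9))
raster[4, 4] = 1.0

local_mean = circular_aggregate(raster, radius=2, statistic=np.mean)
local_max = circular_aggregate(raster, radius=2, statistic=np.max)
print(local_mean[4, 4], local_max[4, 6], local_max[4, 7])
```

A disk of radius two covers 13 pixels, so the mean at the centre pixel is 1/13, and the maximum feature spreads the peak to every pixel within two pixels of it.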

6.5.2 Custom features

Several custom features were also developed in addition to the general features, attempting to specifically target and enhance ditches as well as non-ditches. These are presented below.

The Sky View Factor Conic filter uses the Sky View Factor attribute to detect and fill gaps in ditches. This was done by taking the mean of all the pixels covered by a cone-shaped mask, which expands outwards from each examined pixel point. The mean was calculated in eight directions from each pixel in a radius outward of 10 pixels. If the mean values from two opposing masks were both below a threshold, the pixel was updated with the lowest of these values. This meant that only pixels with strong ditch indicative values in two opposing directions were updated. This allowed the filter to avoid updating pixels that lay close to cavities or hollows, and only focus on linear geographical properties. This however also meant that geographical properties such as streams were amplified as well.
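The conic filter can be sketched as below. This is an interpretation under several assumptions that are not stated in the text: each cone spans 45 degrees, the threshold value is hypothetical, and a pixel is only ever lowered, never raised. The cone means are computed by correlating the raster with normalised cone-shaped masks.

```python
import numpy as np
from scipy import ndimage


def cone_masks(radius=10, n_directions=8):
    """Normalised cone-shaped masks pointing in `n_directions` evenly spaced directions."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    distance = np.hypot(x, y)
    angle = np.arctan2(y, x)
    half_width = np.pi / n_directions  # assumption: the cones tile the full circle
    masks = []
    for k in range(n_directions):
        centre = 2 * np.pi * k / n_directions
        # Angular difference wrapped to the interval (-pi, pi].
        difference = np.angle(np.exp(1j * (angle - centre)))
        mask = (distance > 0) & (distance <= radius) & (np.abs(difference) <= half_width)
        masks.append(mask.astype(float) / mask.sum())
    return masks


def conic_filter(svf, threshold, radius=10):
    """Lower pixels whose two opposing cone means are both below `threshold`."""
    masks = cone_masks(radius)
    means = [ndimage.correlate(svf, mask, mode="nearest") for mask in masks]
    result = svf.copy()
    half = len(masks) // 2
    for k in range(half):
        first, opposite = means[k], means[k + half]
        both_low = (first < threshold) & (opposite < threshold)
        lowest = np.minimum(first, opposite)
        # Assumption: a pixel is only updated if this lowers its value.
        result = np.where(both_low, np.minimum(result, lowest), result)
    return result


# Synthetic Sky View Factor raster: a three-pixel-wide ditch with a short gap in it.
svf = np.ones((41, 41))
svf[19:22, :] = 0.5
svf[19:22, 19:22] = 1.0  # the gap

filled = conic_filter(svf, threshold=0.95)
print(svf[20, 20], filled[20, 20], filled[5, 5])
```

In this synthetic example the gap pixel sees low cone means both to the east and to the west and is therefore lowered, while a pixel far from the ditch is left unchanged.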

The Impoundment Ditch Amplification feature uses the Impoundment attribute to amplify ditches by using thresholds and classifying pixels that usually indicate ditches with increasing values. Means and medians were used to eliminate noise, and to produce a smoother ditch representation. See Figure 6: K for a graphical representation of this feature.

Similar to the aforementioned Impoundment Ditch Amplification, the HPMF ditch amplification feature classifies pixels based on their likelihood of lying in a ditch. Values were smoothed with medians and means of different radii before receiving another reclassification based on ditch likelihood. A mean was taken one more time to smooth out the reclassified data. See Figure 6: E for a graphical representation of this feature.

The Sky View Factor non-ditch amplification feature amplifies pixels which are not ditches. This aims to help the model exclude hills and streams, which generally have a deeper impression on the landscape than ditches do with the Sky View Factor attribute. This observation was used to help amplify pixels that exceeded a certain threshold. This feature still misses many stream pixels, and sometimes also picks up pixels from particularly deep ditches. See Figure 6: I for a graphical representation of this feature.

The Slope non-ditch amplification has the same goal as the Sky View Factor-based filter, but uses different thresholds and is based on the Slope attribute instead. This more aggressive filter will pick up a much higher percentage of hills and streams, with the downside of sometimes covering ditches as well.

The DEM ditch amplification feature was extracted from the DEM, where differences in elevation of local areas were calculated. Pixels that lay at a lower altitude than the average of a 15 metre radius circle around the examined pixel were marked out before a morphological grey closing was performed to remove noise from the feature.
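A possible reading of the DEM ditch amplification is sketched below. The text does not say how the marked pixels were valued, so this example assumes that each pixel keeps its depth below the local mean elevation and that all other pixels are set to zero; the closing size is also hypothetical. At 0.5 metre resolution a 15 metre radius corresponds to 30 pixels; a smaller radius is used in the toy example.

```python
import numpy as np
from scipy import ndimage


def disk(radius):
    """Boolean mask of all pixel offsets within `radius` pixels of the centre."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x ** 2 + y ** 2 <= radius ** 2


def dem_ditch_amplification(dem, radius_px=30, closing_size=3):
    """Mark pixels lying below the local mean elevation, then apply a grey closing."""
    kernel = disk(radius_px).astype(float)
    kernel /= kernel.sum()
    local_mean = ndimage.correlate(dem, kernel, mode="nearest")
    # Assumption: the feature value is the depth below the local mean (zero if above).
    depth = np.clip(local_mean - dem, 0.0, None)
    return ndimage.grey_closing(depth, size=(closing_size, closing_size))


# Flat terrain at 10 m elevation with a one-pixel-wide, one-metre-deep trench.
dem = np.full((21, 21), 10.0)
dem[10, :] = 9.0

feature = dem_ditch_amplification(dem, radius_px=5)
print(feature[10, 10], feature[2, 2])
```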

A Gabor filter is an image processing filter that can be used to detect lines of a certain orientation in an image (Hong, Wan, & Jain, 1998). A set of 30 Gabor filters, which were rotated in different angles and with different frequencies, was used to detect lines in all directions. The filters from this set of filters were then combined to amplify ditches. These filters were used to create features from both the HPMF and Sky View Factor attributes. See Figure 6: D and G for graphical representations of these features.
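A bank of 30 Gabor filters can be sketched as follows. The text only states that 30 filters with different angles and frequencies were used; the split into 10 orientations and 3 frequencies, the kernel size, the Gaussian width and the combination by taking the maximum response are all assumptions made for this example.

```python
import numpy as np
from scipy import ndimage


def gabor_kernel(frequency, theta, sigma=3.0, size=15):
    """Real-valued Gabor kernel: a cosine wave along direction `theta` under a Gaussian."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_rot ** 2 + y_rot ** 2) / (2 * sigma ** 2))
    kernel = envelope * np.cos(2 * np.pi * frequency * x_rot)
    return kernel - kernel.mean()  # zero mean, so flat areas give no response


def gabor_bank(n_orientations=10, frequencies=(0.1, 0.2, 0.3)):
    """Hypothetical bank: every combination of orientation and frequency."""
    return [gabor_kernel(f, np.pi * k / n_orientations)
            for f in frequencies for k in range(n_orientations)]


def gabor_feature(image, bank):
    """Combine the bank by taking the strongest response at each pixel (an assumption)."""
    responses = [ndimage.convolve(image, kernel, mode="nearest") for kernel in bank]
    return np.max(responses, axis=0)


bank = gabor_bank()

# Toy image with a single horizontal line.
image = np.zeros((41, 41))
image[20, :] = 1.0

line_feature = gabor_feature(image, bank)
print(len(bank), line_feature[20, 20], line_feature[5, 20])
```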

The raw Impoundment feature was used to create a mask, attempting to retain ditches, but mark out streams. This was done by using a threshold on the Impoundment index that only marked out areas with a relatively large impoundment, which would indicate that these areas contained streams. After widening the resulting area, this mask was used to remove streams from all the aforementioned custom features, generating one new feature from each. See Figure 6: F and L for graphical representations of features that make use of this mask.


Figure 6. Example of 11 of the 81 features for a small sample area, in addition to ditch labels, used by the model:
A: Labelled ditches, B: Slope standard deviation, radius 6, C: HPMF mean, radius 4, D: HPMF Gabor, E: HPMF ditch amplification, F: HPMF ditch amplification - streams removed, G: Sky View Factor Gabor, H: Sky View Factor max, radius 6, I: Sky View Factor non-ditch amplification, J: Impoundment mean, radius 3, K: Impoundment ditch amplification, L: Impoundment ditch amplification - streams removed


6.6 Model configuration

The random forest classifier was trained on all the features seen in Table 3. The testing phase showed that the classifier produced poor results when the training data was heavily dominated by non-ditch pixels. Such an imbalance led to the model not being punished for mislabelling ditches as non-ditches, causing it to prioritise a high accuracy over a high recall. According to Spelmen and Porkodi (2018), an imbalanced dataset causes a minority class to receive a reduced accuracy. As the ditch class is much less common than the non-ditch class, this needed to be addressed when training our model. To balance the model, we attempted to train it with a roughly equal number of ditch pixels and non-ditch pixels.

The first step to create a more balanced dataset was to extract all pixels labelled as ditches, as well as pixels within close proximity of ditches. Secondly, random pixel samples from the entire area were extracted. This allowed the training dataset to be fairly balanced while still containing most of the geographical features of each zone, see Figure 7.
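The two sampling steps can be sketched as a boolean training mask. The proximity distance, the number of random samples (here chosen equal to the number of near-ditch pixels) and the function name are assumptions for this example; the text does not state the values that were used.

```python
import numpy as np
from scipy import ndimage


def balanced_training_mask(ditch_labels, proximity_px=5, seed=0):
    """Select ditch pixels, pixels close to ditches, and random pixels from the whole zone."""
    rng = np.random.default_rng(seed)
    # Step 1: ditch pixels plus everything within `proximity_px` dilation steps of them.
    near_ditch = ndimage.binary_dilation(ditch_labels, iterations=proximity_px)
    # Step 2: random samples from the entire area (assumed count: as many as step 1 found).
    n_random = min(int(near_ditch.sum()), ditch_labels.size)
    random_indices = rng.choice(ditch_labels.size, size=n_random, replace=False)
    mask = near_ditch.copy()
    mask.flat[random_indices] = True
    return mask


# Toy zone with one horizontal ditch.
ditch_labels = np.zeros((40, 40), dtype=bool)
ditch_labels[20, :] = True

training_mask = balanced_training_mask(ditch_labels)
print(training_mask.sum(), "of", training_mask.size, "pixels selected for training")
```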

Figure 7. Yellow pixels here indicate the balanced masks used to determine what pixels are used when training the model. A and B represent two of the zones from the training dataset.


Table 3

The 81 features used when training the model.

Feature/Algorithm (a)                                 Circular radii (b)
Sky View Factor raw                                   -
Sky View Factor mean                                  2, 3, 4, 6
Sky View Factor median                                2, 4, 6
Sky View Factor standard deviation                    2, 4, 6
Sky View Factor min                                   2, 4, 6
Sky View Factor max                                   2, 4, 6
Sky View Factor non-ditch amplification               -
Sky View Factor conic filter - streams removed        -
Sky View Factor Gabor                                 -
Sky View Factor Gabor - streams removed               -
Impoundment raw                                       -
Impoundment mean                                      2, 3, 4, 6
Impoundment median                                    2, 4, 6
Impoundment standard deviation                        2, 4, 6
Impoundment min                                       2, 4, 6
Impoundment max                                       2, 4, 6
Impoundment ditch amplification                       -
Impoundment ditch amplification - streams removed     -
HPMF raw                                              -
HPMF mean                                             2, 3, 4, 6
HPMF median                                           2, 4, 6
HPMF standard deviation                               2, 4, 6
HPMF min                                              2, 4, 6
HPMF max                                              2, 4, 6
HPMF ditch amplification                              -
HPMF ditch amplification - streams removed            -
HPMF Gabor                                            -
HPMF Gabor - streams removed                          -
Slope raw                                             -
Slope mean                                            2, 3, 4, 6
Slope median                                          2, 4, 6
Slope standard deviation                              2, 4, 6
Slope min                                             2, 4, 6
Slope max                                             2, 4, 6
Slope non-ditch amplification                         -

(a) Displays algorithm used to produce the feature.
(b) Represents the radius of the circular mask (if one was used) to determine what neighbouring pixels to use in an algorithm.


Hyperparameter tuning was performed to determine which parameter values for the random forest algorithm would yield the best results. Evaluating a maximum of 25 features at each node and using 200 trees showed the best results. Setting the class weight to balanced also improved the performance of the classifier. A probability prediction was used instead of a majority-vote binary prediction to allow further post-processing of the prediction.
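In scikit-learn, the configuration described above could look roughly as follows. The parameter values are taken from the text; the synthetic 81-feature data, the variable names and the fixed random seed are assumptions for this example, and any parameter not mentioned in the text is left at its library default.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the training pixels (hypothetical): 400 pixels, 81 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(400, 81))
is_ditch = (features[:, 0] > 1.0).astype(int)  # deliberately imbalanced labels

classifier = RandomForestClassifier(
    n_estimators=200,          # 200 trees
    max_features=25,           # at most 25 features evaluated at each node
    class_weight="balanced",   # weight classes inversely to their frequency
    random_state=0,
)
classifier.fit(features, is_ditch)

# Probability of the ditch class for each pixel, kept for later post-processing.
ditch_probability_map = classifier.predict_proba(features)[:, 1]
print(ditch_probability_map[:5])
```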

6.7 Post-processing

The model outputs a ditch class probability prediction for each pixel. These probabilities have continuous values between zero and one. Zero indicates a very low probability of a pixel lying in a ditch, while one indicates a very high ditch probability. See Figure 8: A for a graphical representation of a raw prediction for one of the 21 zones of Krycklan.

6.7.1 Noise reduction and gap filling

The probability predictions contained a lot of noise in places far away from ditches, which needed to be excluded. The first step for removing noise was to use a bilateral de-noising filter on the entire prediction image. This left linear properties and pixels with a very high value intact, while lowering the value of pixels that did not contribute to an accurate prediction. See Figure 8: B for a graphical representation.

The second step for removing noise was to use a custom function to remove pixels with a semi-high probability, but that lay far away from any other high probability pixels. A threshold value was used to avoid removing pixels that had a high enough probability, helping to retain pixels that lay in or close to a ditch. The max probability value in a circular radius of 10 pixels was then calculated. If this max value was not high enough, the probability of the examined pixel was lowered. See Figure 8: C for a graphical representation.
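The second de-noising step can be sketched with a circular maximum filter. The two thresholds and the factor by which isolated pixels are lowered are hypothetical values chosen for this example; the text does not report the values that were used.

```python
import numpy as np
from scipy import ndimage


def disk(radius):
    """Boolean mask of all pixel offsets within `radius` pixels of the centre."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x ** 2 + y ** 2 <= radius ** 2


def suppress_isolated(probability, keep_above=0.6, neighbour_min=0.7,
                      radius=10, factor=0.25):
    """Lower pixels that are not confident themselves and have no confident neighbour."""
    local_max = ndimage.maximum_filter(probability, footprint=disk(radius), mode="nearest")
    isolated = (probability < keep_above) & (local_max < neighbour_min)
    return np.where(isolated, probability * factor, probability)


probability = np.zeros((31, 31))
probability[5, 5] = 0.5     # semi-high pixel with no strong neighbour
probability[20, 20] = 0.5   # semi-high pixel close to a strong pixel
probability[20, 24] = 0.9   # strong pixel

denoised = suppress_isolated(probability)
print(denoised[5, 5], denoised[20, 20], denoised[20, 24])
```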

The third step involved taking measures to try to fill gaps in ditches that the model failed to correctly predict. A similar method to the one described in 6.5.2 was employed to calculate the mean of cone masks expanding outwards in different directions from the examined pixel. This step also amplified some of the noise that was left, but filling the gaps in the ditches was judged to be more important to help make the next step more effective. See Figure 8: D for a graphical representation.


6.7.2 Binarisation with zones

The model’s ditch prediction is given in the original raster resolution. Therefore, the same grid conversion was performed on the prediction raster as on the evaluation data. A mean probability was calculated for each six by six pixel grid zone, and the entire zone was classified as ditch if this mean exceeded 35 %. This helped to fill in gaps where lone pixels in ditches had been incorrectly classified, and also helped in the next step of the post-processing. See Figure 8: E for a graphical representation.
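The zone binarisation can be sketched as a block mean followed by a threshold, with the result expanded back to pixel resolution. The function name and the trimming of incomplete edge cells are assumptions for this example.

```python
import numpy as np


def binarise_zones(probabilities, cell=6, threshold=0.35):
    """Classify each `cell` x `cell` zone as ditch if its mean probability exceeds `threshold`."""
    n_rows, n_cols = probabilities.shape[0] // cell, probabilities.shape[1] // cell
    # Assumption: pixels that do not fill a whole zone at the edges are dropped.
    blocks = probabilities[:n_rows * cell, :n_cols * cell].reshape(n_rows, cell, n_cols, cell)
    zone_is_ditch = blocks.mean(axis=(1, 3)) > threshold
    # Expand the zone decision back to one value per pixel.
    return np.repeat(np.repeat(zone_is_ditch, cell, axis=0), cell, axis=1)


probabilities = np.zeros((12, 12))
probabilities[0:6, 0:6] = 0.4   # zone mean 0.40, above the threshold
probabilities[8, 8] = 1.0       # lone confident pixel, zone mean about 0.03

binary_zones = binarise_zones(probabilities)
print(binary_zones[:6, :6].all(), binary_zones[6:, 6:].any())
```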

6.7.3 Cluster removal

To remove noise from the binary prediction, a custom cluster detection algorithm was used. By finding the number of connected pixels with a true value and removing the clusters whose size was below a given threshold, minor noise in the prediction could be removed while still retaining most of the ditch pixels. Ditches that have a low probability, and therefore create small clusters, may still be excluded by this method, but the noise removal advantages outweigh the loss in recall. This algorithm operates similarly to a paint fill function in image processing software, with the difference of counting the pixels instead of colouring them. A distance calculation was also performed in tandem with this method to find the largest distance between pixels inside each given cluster. This helped to remove cavities and hollows that were not removed by the initial small cluster removal, but that had a shape indicating that they did not represent a ditch. See Figure 8: F for a graphical representation.
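A comparable cluster removal can be sketched with connected-component labelling from scipy.ndimage, used here in place of the custom flood-fill style algorithm described above. The size and extent thresholds are hypothetical, and the largest distance inside a cluster is approximated by the diagonal of its bounding box, which is a simplification made for this example.

```python
import numpy as np
from scipy import ndimage


def remove_clusters(binary, min_size=10, min_extent=8.0):
    """Keep only connected clusters that are large enough and elongated enough."""
    # 8-connectivity: diagonal neighbours belong to the same cluster.
    labelled, _ = ndimage.label(binary, structure=np.ones((3, 3)))
    kept = np.zeros_like(binary, dtype=bool)
    for index, box in enumerate(ndimage.find_objects(labelled), start=1):
        cluster = labelled[box] == index
        height, width = cluster.shape
        # Simplification: bounding-box diagonal as a stand-in for the largest distance.
        extent = np.hypot(height - 1, width - 1)
        if cluster.sum() >= min_size and extent >= min_extent:
            kept[box] |= cluster
    return kept


prediction = np.zeros((20, 20), dtype=bool)
prediction[10, 2:18] = True     # ditch-like line of 16 pixels
prediction[2:5, 2:5] = True     # small 3 x 3 noise cluster
prediction[14:18, 1:5] = True   # compact 4 x 4 hollow: large enough, but not elongated

cleaned = remove_clusters(prediction)
print(cleaned.sum())
```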


Figure 8. The different steps taken after a prediction by the model. A more yellow pixel indicates a high probability of a ditch, whereas a purple pixel indicates a low ditch probability. A: Raw probability prediction, B: Prediction after bilateral de-noising, C: Prediction after custom de-noising, D: Prediction after gap filling, E: Prediction after binarisation with zone probability, F: Final binary prediction after cluster removal

7 Results and analysis

7.1 Experimental results

The results from the experiment with our method for the 11 different zones are summarised in a confusion matrix in Table 4. The results for the reproduced method of Gustavsson and Selberg (2018) can be seen in Table 5 for the Impoundment method, and in Table 6 for the Sky View Factor method. The reproduced experiments have a lower total percentage of actual ditch pixels because the grid classification was used in our method, but not in the reproduced methods.


Table 4

Confusion matrix from the results of the random forest model prediction. Displays percentages of predicted ditch and non-ditch occurrences as well as actual occurrences.

                         Actual
Prediction      Ditch %      Non-ditch %    Sum %
Ditch %         1.32 (a)     0.87 (b)       2.19
Non-ditch %     0.69 (c)     97.12 (d)      97.81
Sum %           2.01         97.99          100.00

(a) True positives
(b) False positives
(c) False negatives
(d) True negatives

Table 5

Confusion matrix from the reproduced Impoundment method results. Displays percentages of predicted ditch and non-ditch occurrences as well as actual occurrences.

PredictionActual



Table 6

Confusion matrix from the reproduced Sky View Factor method results. Displays percentages of predicted ditch and non-ditch occurrences as well as actual occurrences.

PredictionActual




7.2 Analysis

Table 7 displays all the evaluation metrics for the prediction of our method. The confidence intervals were calculated from the 11 different zones that the experiment was performed on. Since the zones contain different amounts of ditches, the confidence intervals will not be completely accurate, but will produce a close estimation. From Table 7, it can be seen that averaging the results from the 11 zones yields a very similar value to the total results, indicating that the model generally performs equally well on zones with a low amount of ditches as on zones with a high amount of ditches. For the results of each individual zone, refer to Appendix C.

Table 7

Metrics for the prediction performance of our model.

Metric         Total (a)   Zone average (b)   CI 90% (c)          CI 95% (c)          CI 99% (c)
Accuracy %     98.43       98.43              [98.04 , 98.82]     [97.97 , 98.90]     [97.82 , 99.04]
Recall %       65.47       65.25              [59.30 , 71.20]     [58.16 , 72.33]     [55.93 , 74.56]
Precision %    60.11       58.98              [56.54 , 61.43]     [56.06 , 61.90]     [55.15 , 62.82]
κ rating       0.619       0.606              [0.573 , 0.639]     [0.567 , 0.645]     [0.552 , 0.628]
AUPRC          0.631       0.625              [0.590 , 0.659]     [0.584 , 0.665]     [0.571 , 0.678]

(a) The result of all 11 zone experiments when combined.
(b) An average score from the 11 different zones that the experiment was performed on.
(c) Confidence intervals at different confidence levels.
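As a worked check, the accuracy, recall, precision and Cohen's κ can be recomputed from the rounded percentages in Table 4. The sketch below uses the standard definitions of these metrics; because it starts from rounded table values, the results land close to, but not exactly on, the Total column of Table 7.

```python
# Percentages from Table 4: true positives, false positives, false negatives, true negatives.
tp, fp, fn, tn = 1.32, 0.87, 0.69, 97.12
total = tp + fp + fn + tn

accuracy = (tp + tn) / total
recall = tp / (tp + fn)        # share of actual ditch cells that were found
precision = tp / (tp + fp)     # share of predicted ditch cells that were correct

# Cohen's kappa: observed agreement corrected for the agreement expected by chance.
p_observed = accuracy
p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
kappa = (p_observed - p_expected) / (1 - p_expected)

print(f"accuracy={accuracy:.4f} recall={recall:.4f} "
      f"precision={precision:.4f} kappa={kappa:.3f}")
```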

Table 8 shows the evaluation metrics from the experiments with the recreated methods of Gustavsson and Selberg (2018). Of the two methods, the Impoundment method outperformed the Sky View Factor method for all metrics except recall.

Table 8

Metrics from the total results of the two recreated experiments using the thresholds from the thesis by Gustavsson and Selberg (2018).

Metric         Sky View Factor   Impoundment
Accuracy %     93.08             98.32
Recall %       13.00             6.10
Precision %    3.23              22.17
κ rating       0.029             0.090
AUPRC          0.087             0.148


The κ rating for our method can be seen in the Total column in Table 7 (κ = 0.619). The κ ratings from the two reproduced experiments can be seen in Table 8 (κ = 0.090 and κ = 0.029). Because the κ rating of our method is higher than that of both reproduced methods, the experiments support our hypothesis.

In Table 9, the top 20 features from the random forest model are presented with their importance percentages for the model making a successful prediction. The feature importance was obtained using the Gini impurity for each feature (Menze et al., 2009). For a full list of feature importances for the model, refer to Appendix B.
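In scikit-learn, impurity-based importances of this kind are exposed through the feature_importances_ attribute of a fitted forest. The sketch below ranks features on synthetic data; the feature names and data are hypothetical, and whether the study read the importances from this exact attribute is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (hypothetical): only the first feature carries information.
rng = np.random.default_rng(1)
feature_names = ["informative", "noise_a", "noise_b", "noise_c"]
samples = rng.normal(size=(500, len(feature_names)))
target = (samples[:, 0] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=1).fit(samples, target)

# Impurity-based importances are normalised to sum to one; express them as percentages.
importance_percent = 100 * model.feature_importances_
ranking = sorted(zip(feature_names, importance_percent),
                 key=lambda item: item[1], reverse=True)

for position, (name, importance) in enumerate(ranking, start=1):
    print(f"{position}. {name} {importance:.2f}")
```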

Table 9

The top 20 features by importance when the model makes a prediction.

Position   Feature (a)                                            Importance (%)
1.         Impoundment mean 3                                     11.08
2.         Impoundment mean 4                                     7.44
3.         HPMF mean 4                                            7.03
4.         HPMF mean 3                                            3.97
5.         Impoundment median 4                                   2.97
6.         Impoundment mean 2                                     2.53
7.         Sky View Factor Gabor - streams removed                2.03
8.         Impoundment ditch amplification                        1.91
9.         HPMF median 4                                          1.90
10.        Impoundment ditch amplification - streams removed      1.76
11.        Slope standard deviation 6                             1.62
12.        Sky View Factor non-ditch amplification                1.59
13.        HPMF Gabor - streams removed                           1.59
14.        Sky View Factor Gabor                                  1.48
15.        HPMF Gabor                                             1.46
16.        Sky View Factor max 6                                  1.39
17.        HPMF mean 6                                            1.39
18.        Impoundment standard deviation 4                       1.39
19.        Slope min 6                                            1.31
20.        Slope non-ditch amplification                          1.25

(a) The number next to some of the features indicates the circular radius used to select what neighbouring pixels to use in the statistical aggregation method.


8 Discussion

8.1 Strengths

The results from our method seen in Table 7 show that we managed to attain a Cohen’s κ rating in the substantial range, according to the thresholds proposed by Landis and Koch (1977) seen in Table 2. This means that our model performed substantially better than one based purely on chance. The total recall value of 65.47 % shows that we managed to find most of the ditch pixels that exist. It is hard to determine exactly how many of the actual ditches we managed to detect, because pixel classification is not an entirely accurate measurement for assessing ditch detection. The precision of 60.11 % shows that we incorrectly classified a large number of pixels as ditch pixels. However, a lot of these incorrect classifications lay in very close proximity to a ditch, which has to be taken into consideration when assessing the performance of the method. Considering that pixel classification was used, our results have shown great promise.

We managed to bridge a lot of gaps and exclude streams to a large extent in our predictions. This was in part due to the features developed, such as the impoundment stream removal features, and in part due to the post-processing of the prediction. The post-processing was successful both in removing noise from the random forest prediction and in bridging some of the gaps that the model was unsuccessful at bridging on its own. The homogeneous six by six pixel grids that helped form more continuous ditches, and our custom cluster removal of both small clusters and clusters of a non-ditch shape, both helped in this.

Several of the custom features were of use. For example, fingerprint enhancement techniques for gap filling such as Gabor filters (Hong et al., 1998) proved to work well on ditches due to similarities in the structures of gaps in fingerprint ridges and ditches in images. Another example is the Sky View Factor conic filter, and the gap filling in the post-processing, which filled gaps by looking at the means of pixels in opposite directions of an examined pixel. Applying different statistical aggregations using circular radii of different sizes for neighbouring areas of pixels, similar to Roelens et al. (2018), also proved to produce very useful features.

To assure the validity of our results, more than half of the data (11 out of 21 zones) were excluded early in the process, only for use in the final experiment. This allowed for the model to be tested on areas that had not been seen during development, and therefore did not have features or post-processing tweaked to their composition. Another strength in our method was the use of leave-one-out cross validation with our 11 hold-out zones. This allowed for us to use more training data for each model in the experiment, instead of dividing the data into training and testing zones.

8.2 Weaknesses and limitations

Generally, the ditch pixels that our method failed to detect were the ones where the ditches were more shallow, and made less of an imprint on the landscape than other ditches, causing the data to have weaker values. Sometimes pixels in these ditches were classified as ditches, but not enough pixels in the surrounding area were, causing our post-processing to identify them as noise and remove them. The pixels incorrectly classified as ditches, which caused the precision to drop, generally lay in streams. However, small cavities or hilly areas classified as ditches by the model were generally removed in the post-processing.

A potential issue with using our model in other geographical areas is that all the post-processing steps and features were developed based on occurrences in the Krycklan area. Different geographical compositions in other areas could cause the feature algorithms developed in this study to be less effective. Thresholds used to fill gaps in ditches and to remove noise could also prove to be ineffective in other areas. In addition to this, the raw Impoundment feature used in this study was created with specific thresholds for the Krycklan area, and may therefore not work as well in other parts of Sweden.

Because all ditches were widened from a vector format to 3.5 metres in a raster format, some ditches were made too wide, and some too thin. This resulted in the model sometimes learning from pixels with faulty labels, which potentially hindered the classification’s performance. Had we been able to use labels that were correctly classified on a pixel basis, the model could have learned with a lot higher precision, and the classification results would most likely also have benefited from this. This raises the question of whether pixel classification is the right way to tackle this ditch detection problem.

While many of the features used in the experiment proved to be of value, the project suffered from our lack of experience in feature engineering. Time had to be spent on learning what features are, and how to make use of them. With more experience, better features could have been developed more efficiently. In some cases the data may also have simply been too weak to be able to detect all ditches, and to distinguish them from streams. It is possible that use of the raw LiDAR data in addition to the DEM could have provided even better opportunities to detect ditches.


Due to time constraints, no extensive hyperparameter tuning for our random forest model was performed. Since hyperparameter tuning is often more important than choice of algorithm (Lavesson & Davidsson, 2006), this probably led to a lower performance than what was possible with our features. For hyperparameters, we only examined the number of trees, the maximum number of features evaluated per node in a tree, and whether or not to set the class weight to balanced. Several combinations of hyperparameter settings were not tested.

8.3 Comparison to state-of-the-art

The results of our method showed a significant improvement over the previously used automated methods for ditch detection. Several issues with previous methods were solved at least to some extent, such as bridging gaps and removing some of the streams from the ditch predictions.

Concerning the comparison with the state-of-the-art method from Gustavsson and Selberg (2018), the threshold values used in their geographical area were not generalisable to the Krycklan area. This probably caused the comparison of the methods to be biased in our favour. Gustavsson and Selberg (2018) also had data of water bodies in their study area, which they could exclude from the ditch prediction. This data did not exist in our reproduction, causing their method to potentially perform worse. The results from the prediction highlight one of the bigger problems with automated methods that are tuned to a specific geographical area.

8.4 General discussions

Looking at what features had the highest Gini importance as seen in Table 9, the features based on the Impoundment attribute generally outperformed features based on the other attributes. As Impoundment measures the volume that a dam would occupy on the landscape, it is logical that this feature clearly marks out streams and ditches, and therefore has a high importance for our model. The features in Table 9 marked with streams removed also used the Impoundment attribute to attempt to remove streams from the feature, further highlighting the effectiveness of this attribute. Of the custom features, the features based on Gabor filters performed well. This makes sense since the nature of Gabor filters is to mark out linear properties in images, and ditches almost always form straight lines as they are man-made structures as opposed to, for instance, streams.

There were a lot of challenges that we had to overcome during the development of this thesis. Time had to be spent on learning about features and other machine learning terms, as we had never before used machine learning methods. Researching what algorithm to use and what potential features to develop also took up a lot of time, which with more experience could have been spent on developing a better model instead. A lot of the development was simply trial and error, where different features and post-processing were developed to determine what worked best.

9 Conclusions and future work

This thesis investigates how to use real-world digital elevation data together with manually identified ditches to identify patterns and relationships that can be used for automatic ditch detection. The proposed method significantly outperforms state-of-the-art methods. While the method still has room for improvement, it performs well on the available data. The thesis contributes to an increased understanding of how machine learning can be applied to ditch detection and, more generally, to real-world problems in forestry. An implication of this work is that it may become both faster and easier to map ditches, which can support decision making on which environmental actions need to be taken, and where. From an economic perspective, automation of ditch detection significantly reduces manual labour and cost. In conclusion, this thesis has shown that it is possible to use machine learning with digital elevation data to learn patterns that enable robust detection of ditches.

As mentioned previously, pixel classification may not be the best approach to ditch detection. One could examine larger grids of pixels and perform object segmentation on the images formed by these grids to potentially produce a better performance. Using deep learning or other techniques that handle these types of features well could be a good next step to take for the SFA. The prediction performance could then be assessed e.g. by measuring distances from detected objects to the vector labels, thereby avoiding artificially labelling and widening ditches. Being limited by processing power, memory, and the time allotted for this thesis, however, we ruled out the possibility of using deep learning techniques early on.

The work in this thesis can help to form a base for the SFA’s goals of using machine learning to classify ditches. It gives an insight into what features work well, and what features do not. Some of the custom features could be reused in the future, with random forest or other machine learning algorithms. Several of the post-processing algorithms, such as the probability prediction noise removal or the cluster removal from the binary prediction, could be generalised for use in any type of pixel classification of ditches.


References

Breiman, L. (1996). Technical note: Some properties of splitting criteria. Machine Learning, 24(1), 41-47. doi: 10.1023/A:1018094028462

Breiman, L. (2001, October). Random forests. Machine Learning, 45(1), 5-32. doi: 10.1023/A:1010933404324

Carluer, N., & Marsily, G. D. (2004). Assessment and modelling of the influence of man-made networks on the hydrology of a small watershed: implications for fast flow components, water quality and landscape management. Journal of Hydrology, 285(1), 76-95. doi: 10.1016/j.jhydrol.2003.08.008

Cho, H.-C., Slatton, K. C., Cheung, S., & Hwang, S. (2011). Stream detection for lidar digital elevation models from a forested area. International Journal of Remote Sensing, 32(16), 4695-4721. doi: 10.1080/01431161.2010.484822

Dages, C., Voltz, M., Bsaibes, A., Prevot, L., Huttel, O., Louchart, X., . . . Negro, S. (2009). Estimating the role of a ditch network in groundwater recharge in a mediterranean catchment using a water balance approach. Journal of Hydrology, 375(3-4), 498-512. doi: 10.1016/j.jhydrol.2009.07.002

Erixon, A. (2015). Kombinerad flygfotografering/laserskanning hassleby/krycklan. TerraTec Sweden AB, Johanneshov. (Unpublished document)

Esri. (2017). Digital elevation models. Retrieved 2018-01-20, from https://learn.arcgis.com/en/related-concepts/digital-elevation-models.htm

Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861-874. doi: 10.1016/j.patrec.2005.10.010

Flach, P. (2012). Machine learning: The art and science of algorithms that make sense of data. Cambridge, United Kingdom: Cambridge University Press. doi: 10.1017/CBO9780511973000

Fu, G., Yi, L., & Pan, J. (2019). Tuning model parameters in class-imbalanced learning with precision-recall curve. Biometrical Journal, 61(3), 652-664. doi: 10.1002/bimj.201800148

Gardner, M. J., & Altman, D. G. (1986). Confidence intervals rather than p values: Estimation rather than hypothesis testing. British medical journal (Clinical research ed.), 292(6522), 746-750.

Gislason, P., Benediktsson, J. A., & Sveinsson, J. R. (2006, October). Random forests for land cover classification. Pattern Recognition Letters, 27(4), 294-300. doi: 10.1016/j.patrec.2005.08.011

Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., & Pedreschi, D. (2018). A survey of methods for explaining black box models. ACM Computing Surveys, 51(5). doi: 10.1145/3236009

Gustavsson, A., & Selberg, M. (2018). Delineation of ditches in wetlands by remote sensing (Unpublished bachelor’s thesis, Uppsala University, Uppsala, Sweden). Retrieved from http://www.diva-portal.org/smash/get/diva2:1221962/FULLTEXT01.pdf

Ho, T. K. (1995). Random decision forests. In Proceedings of the international conference on document analysis and recognition, icdar (Vol. 1, p. 278-282). doi: 10.1109/ICDAR.1995.598994

Hong, L., Wan, Y., & Jain, A. (1998). Fingerprint image enhancement: Algorithm and performance evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 777-789.

Jaderberg, M., Vedaldi, A., & Zisserman, A. (2014). Speeding up convolutional neural networks with low rank expansions. Nottingham, United Kingdom: Proceedings of the British Machine Vision Conference. doi: 10.5244/C.28.88

Kotsiantis, S. B. (2007). Supervised machine learning: A review of classification techniques. Informatica (Ljubljana), 31(3), 249-268. doi: 10.1007/s10462-007-9052-3

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174. doi: 10.2307/2529310

Lantmateriet. (2018). Laser data - laserdata skog. Retrieved from https://www.lantmateriet.se/contentassets/d85c20e0e23846538330674fbfe8c8ac/lidar_data_skog.pdf

Lavesson, N., & Davidsson, P. (2006). Quantifying the impact of learning algorithm parameter tuning. In Proceedings of the national conference on artificial intelligence (Vol. 1, p. 395-400).

Lindsay, J. B. (2016). Whitebox GAT: A case study in geomorphometric analysis. Computers & Geosciences, 95, 75-84. doi: 10.1016/j.cageo.2016.07.003

Menze, B. H., Kelm, B. M., Masuch, R., Himmelreich, U., Bachert, P., Petrich, W., & Hamprecht, F. A. (2009). A comparison of random forest and its gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinformatics, 10. doi: 10.1186/1471-2105-10-213

Mulik, S. N. (1999). An introduction to geographical information systems. IETE Technical Review (Institution of Electronics and Telecommunication Engineers, India), 16(5-6), 419-424. doi: 10.1080/02564602.1999.11416861

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., . . . Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830. Retrieved from http://www.jmlr.org/papers/volume12/pedregosa11a/pedregosa11a.pdf

Roelens, J., Hofle, B., Dondeyne, S., Van Orshoven, J., & Diels, J. (2018). Drainage ditch extraction from airborne lidar point clouds. ISPRS Journal of Photogrammetry and Remote Sensing, 146, 409-420. doi: 10.1016/j.isprsjprs.2018.10.014

Shane D. Mayor, C. S. U. (2007). Unfiltered vs. filtered - what is the difference? Retrieved 2019-02-11, from http://lidar.csuchico.edu/filtering/

Sim, J., & Wright, C. C. (2005). The kappa statistic in reliability studies: Use, interpretation, and sample size requirements. Physical Therapy, 85(3), 257-268. doi: 10.1093/ptj/85.3.257

Skogsstyrelsen. (2018). Om oss. Retrieved 2019-01-16, from https://www.skogsstyrelsen.se/om-oss/

Spelmen, V. S., & Porkodi, R. (2018). A review on handling imbalanced data. In Proceedings of the 2018 International Conference on Current Trends towards Converging Technologies, ICCTCT 2018. doi: 10.1109/ICCTCT.2018.8551020

Stanislawski, L., Brockmeyer, T., & Shavers, E. (2018). Automated road breaching to enhance extraction of natural drainage networks from elevation models through deep learning. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives, 42(4), 671-678. doi: 10.5194/isprs-archives-XLII-4-597-2018

Svensk, J. (2016). Berakning av diken fran lantmateriets nationella laserdata. Foran Sverige AB, Linkoping. (Unpublished document)

Sverdrup-Thygeson, A., Ørka, H. O., Gobakken, T., & Næsset, E. (2016). Can airborne laser scanning assist in mapping and monitoring natural forests? Forest Ecology and Management, 369, 116-125. doi: 10.1016/j.foreco.2016.03.035

Vannman, K. (2002). Matematisk statistik. Lund, Sweden: Studentlitteratur AB.

Wong, T. (2015). Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognition, 48(9), 2839-2846.

Zaksek, K., Ostir, K., & Kokalj, Z. (2011, 12). Sky-view factor as a relief visualization technique. Remote Sensing, 3. doi: 10.3390/rs3020398

Zobel, J. (2015). Writing for computer science (3rd ed.). London, United Kingdom: Springer Publishing Company, Incorporated. doi: 10.1007/978-1-4471-6639-9



Appendices

Appendix A Source code

All the source code used to develop features, models, and post-processing, and to perform the experiment, can be found in our GitHub repository:

https://github.com/jonatan-flyckt/random_forest_lidar_ditch_detection


Appendix B All feature importances

Table 10

The importance of the top 38 of the 81 features when the model makes a prediction.

Position  Feature (a)                                        Importance (%)
1.        Impoundment mean 3                                 11.08
2.        Impoundment mean 4                                 7.44
3.        HPMF mean 4                                        7.03
4.        HPMF mean 3                                        3.97
5.        Impoundment median 4                               2.97
6.        Impoundment mean 2                                 2.53
7.        Sky View Factor Gabor - streams removed            2.03
8.        Impoundment ditch amplification                    1.91
9.        HPMF median 4                                      1.90
10.       Impoundment ditch amplification - streams removed  1.76
11.       Slope standard deviation 6                         1.62
12.       Sky View Factor non-ditch amplification            1.59
13.       HPMF Gabor - streams removed                       1.59
14.       Sky View Factor Gabor                              1.48
15.       HPMF Gabor                                         1.46
16.       Sky View Factor max 6                              1.39
17.       HPMF mean 6                                        1.39
18.       Impoundment standard deviation 4                   1.39
19.       Slope min 6                                        1.31
20.       Slope non-ditch amplification                      1.25
21.       HPMF min 4                                         1.14
22.       Sky View Factor max 4                              1.13
23.       Impoundment max 6                                  1.12
24.       Impoundment standard deviation 6                   1.09
25.       Impoundment median 6                               1.04
26.       Impoundment mean 6                                 1.03
27.       Slope min 4                                        1.01
28.       Impoundment median 2                               1.01
29.       HPMF min 6                                         1.01
30.       HPMF standard deviation 6                          1.01
31.       Slope median 6                                     1.00
32.       HPMF min 2                                         0.99
33.       HPMF mean 2                                        0.98
34.       Sky View Factor min 6                              0.96
35.       Impoundment max 4                                  0.96
36.       HPMF ditch amplification - streams removed         0.96
37.       Slope mean 6                                       0.95
38.       HPMF ditch amplification                           0.94

(a) The number next to some of the features indicates the circular radius used to select which neighbouring pixels to use in the statistical aggregation method.
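To make the footnote concrete, the sketch below shows one way a circular radius could select the neighbouring pixels that feed a statistical aggregation, here the mean. It is only an illustration under stated assumptions: the radius is taken to be measured in pixels, neighbours outside the grid are simply ignored, and the function names are hypothetical rather than taken from the thesis source code.

```python
def circular_offsets(radius):
    """Row/column offsets of all pixels whose centre lies within `radius`
    pixels of the central pixel (the central pixel itself is included)."""
    return [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]


def circular_mean(grid, radius):
    """Mean of each pixel's circular neighbourhood.
    Assumption: neighbours that fall outside the grid are ignored."""
    rows, cols = len(grid), len(grid[0])
    offsets = circular_offsets(radius)
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [
                grid[r + dr][c + dc]
                for dr, dc in offsets
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            out[r][c] = sum(values) / len(values)
    return out


# Hypothetical 3 x 3 raster with a single high value in the centre.
grid = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
smoothed = circular_mean(grid, radius=1)
for row in smoothed:
    print(row)
```

Replacing the mean with the median, minimum, maximum or standard deviation of `values` would give the other aggregation types named in the table.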


Table 11

The importance of the bottom 43 of the 81 features when the model makes a prediction.

Position  Feature (a)                                        Importance (%)
39.       Sky View Factor standard deviation 6               0.93
40.       Slope standard deviation 4                         0.91
41.       Slope min 2                                        0.87
42.       Sky View Factor max 2                              0.86
43.       Impoundment max 2                                  0.80
44.       HPMF standard deviation 4                          0.79
45.       HPMF max 6                                         0.79
46.       Slope max 4                                        0.78
47.       Sky View Factor median 6                           0.77
48.       Sky View Factor mean 6                             0.76
49.       Sky View Factor standard deviation 4               0.74
50.       Sky View Factor min 4                              0.73
51.       Slope median 4                                     0.67
52.       HPMF standard deviation 2                          0.67
53.       Impoundment standard deviation 2                   0.65
54.       HPMF median 2                                      0.63
55.       Sky View Factor median 4                           0.62
56.       Slope standard deviation 2                         0.62
57.       Sky View Factor standard deviation 2               0.60
58.       Slope mean 4                                       0.60
59.       Slope median 2                                     0.59
60.       Slope raw                                          0.58
61.       Sky View Factor conic filter - streams removed     0.56
62.       HPMF median 6                                      0.55
63.       Slope max 2                                        0.55
64.       Sky View Factor median 2                           0.55
65.       Sky View Factor min 2                              0.55
66.       HPMF max 4                                         0.55
67.       Slope mean 2                                       0.53
68.       Slope mean 3                                       0.52
69.       Sky View Factor mean 4                             0.52
70.       Sky View Factor conic filter                       0.49
71.       Sky View Factor raw                                0.49
72.       Sky View Factor mean 3                             0.48
73.       Impoundment raw                                    0.48
74.       Sky View Factor mean 2                             0.47
75.       HPMF raw                                           0.41
76.       HPMF max 2                                         0.38
77.       DEM ditch amplification - streams removed          0.22
78.       DEM ditch amplification                            0.20
79.       Impoundment min 2                                  0.16
80.       Impoundment min 4                                  0.01
81.       Impoundment min 6                                  0.00

(a) The number next to some of the features indicates the circular radius used to select which neighbouring pixels to use in the statistical aggregation method.


Appendix C Individual zone results

Table 12

Prediction performance for each evaluation zone of the experiment.

Zone  Accuracy %  Recall %  Precision %  Cohen’s Kappa  AUPRC  Ditch label %
1     98.34       59.12     58.11        0.578          0.590  1.99
2     96.65       60.67     66.75        0.618          0.647  4.81
3     98.15       59.83     61.25        0.596          0.610  2.37
4     97.52       75.24     52.39        0.605          0.641  2.66
5     97.90       73.31     65.41        0.681          0.698  3.20
6     98.76       39.08     50.81        0.436          0.453  1.25
7     98.70       76.30     59.25        0.661          0.680  1.70
8     99.24       86.76     58.07        0.692          0.725  1.00
9     99.15       65.40     53.71        0.586          0.597  0.94
10    99.27       57.66     59.55        0.582          0.588  0.90
11    99.06       64.33     63.56        0.635          0.642  1.29
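The per-zone Cohen’s Kappa values in Table 12 can be summarised with a confidence interval for the mean. The sketch below is one possible way to do so and is not necessarily the procedure used in the thesis: it assumes a normal approximation with z = 1.96 and the population standard deviation of the 11 zones. Under these assumptions the result comes out close to the interval reported in the abstract; a t-based interval with the sample standard deviation would be somewhat wider.

```python
import math
import statistics

# Cohen's Kappa for zones 1-11, copied from Table 12.
kappas = [0.578, 0.618, 0.596, 0.605, 0.681, 0.436,
          0.661, 0.692, 0.586, 0.582, 0.635]


def normal_confidence_interval(values, z=1.96):
    """Confidence interval for the mean under a normal approximation.
    Assumption: the population standard deviation (pstdev) is used."""
    mean = statistics.fmean(values)
    std_error = statistics.pstdev(values) / math.sqrt(len(values))
    return mean - z * std_error, mean + z * std_error


lower, upper = normal_confidence_interval(kappas)
print(f"mean = {statistics.fmean(kappas):.3f}, "
      f"95 % CI = [{lower:.3f}, {upper:.3f}]")
```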


Appendix D Prediction result graphics for each zone

Graphical representations of the results from each zone of the random forest experiment. Green indicates a true positive, red indicates a false positive, blue indicates a false negative, and white indicates a true negative.

Figure 9. Graphical representation of the results from zone 1.
